* [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
@ 2018-06-29 22:19 ` Venu Busireddy
  0 siblings, 0 replies; 106+ messages in thread
From: Venu Busireddy @ 2018-06-29 22:19 UTC (permalink / raw)
  To: venu.busireddy, Michael S . Tsirkin, Marcel Apfelbaum
  Cc: virtio-dev, qemu-devel

The patch set "Enable virtio_net to act as a standby for a passthru
device" [1] deals with live migration of guests that use passthrough
devices. However, that scheme uses the MAC address for pairing
the virtio device and the passthrough device. The thread "netvsc:
refactor notifier/event handling code to use the failover framework"
[2] discusses an alternate mechanism, such as using a UUID, for pairing
the devices. Based on that discussion, proposals "Add "Group Identifier"
to virtio PCI capabilities." [3] and "RFC: Use of bridge devices to
store pairing information..." [4] were made.

The current patch set incorporates all the feedback received on proposals
[3] and [4]. For the sake of completeness, the patch for the virtio
specification is also included here. Following is the updated proposal.

1. Extend the virtio specification to include a new virtio PCI capability
   "VIRTIO_PCI_CAP_GROUP_ID_CFG".

2. Enhance the QEMU CLI to include a "failover-group-id" option to the
   virtio device. The "failover-group-id" is a 64-bit value.

3. Enhance the QEMU CLI to include a "failover-group-id" option to the
   Red Hat PCI bridge device (support for the i440FX model).

4. Add a new "pcie-downstream" device with the option
   "failover-group-id" (support for the Q35 model).

5. The operator creates a unique 64-bit identifier, the failover-group-id.

6. When the virtio device is created, the operator uses the
   "failover-group-id" option (for example, '-device
   virtio-net-pci,failover-group-id=<identifier>') and specifies the
   failover-group-id created in step 5.

   QEMU stores the failover-group-id in the virtio device's configuration
   space in the capability "VIRTIO_PCI_CAP_GROUP_ID_CFG".

7. When assigning a PCI device to the guest in passthrough mode, the
   operator first creates a bridge using the "failover-group-id" option
   (for example, '-device pcie-downstream,failover-group-id=<identifier>')
   to specify the failover-group-id created in step 5, and then attaches
   the passthrough device to the bridge.

   QEMU stores the failover-group-id in the configuration space of the
   bridge as a Vendor-Specific capability (capability ID 0x09). The
   "Vendor" here does not refer to any specific hardware organization;
   the vendor of the bridge is QEMU itself.

8. Patch 4 in the patch series "Enable virtio_net to act as a standby
   for a passthru device" [1] needs to be modified to use the group
   identifier values present in the bridge's configuration space and the
   virtio device's configuration space, instead of the MAC address, for
   pairing the devices.

Thanks!

Venu

[1] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00156.html
[2] https://www.spinics.net/lists/netdev/msg499011.html
[3] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00118.html
[4] https://lists.oasis-open.org/archives/virtio-dev/201805/msg00204.html

Changes in v3:
  - As Michael Tsirkin suggested:
    * Changed the failover group identifier from a UUID to a 64-bit value.
    * Changed the command line option from "uuid" to "failover-group-id".
    * Tweaked the "pci-bridge" device's Device ID to
      PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER.
    * Added a new device "pcie-downstream" with Device ID
      PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER (to support the PCIe case).
  - Changed the patch for virtio specification to reflect the above
    changes.

Changes in v2:
  - As Michael Tsirkin suggested, changed the virtio specification
    to restrict the group identifier to be a 16-byte field, presented
    entirely in the virtio device's configuration space.
  - As Michael Tsirkin suggested, instead of tweaking the ioh3420
    device to use the Red Hat Vendor ID, created a new PCIe bridge
    device named "pcie-downstream" with the Red Hat Vendor ID, and
    included the group identifier in that device.
  - Added a new patch to enhance the "pci-bridge" device to support
    the group identifier (for the i440FX model).

Venu Busireddy (3):
  Add "Group Identifier" support to virtio devices.
  Add "Group Identifier" support to Red Hat PCI bridge.
  Add "Group Identifier" support to Red Hat PCI Express bridge.

 default-configs/arm-softmmu.mak             |   1 +
 default-configs/i386-softmmu.mak            |   1 +
 default-configs/x86_64-softmmu.mak          |   1 +
 hw/pci-bridge/Makefile.objs                 |   1 +
 hw/pci-bridge/pci_bridge_dev.c              |  10 +
 hw/pci-bridge/pcie_downstream.c             | 220 ++++++++++++++++++++
 hw/pci-bridge/pcie_downstream.h             |  10 +
 hw/pci/pci_bridge.c                         |  34 +++
 hw/virtio/virtio-pci.c                      |  15 ++
 hw/virtio/virtio-pci.h                      |   3 +-
 include/hw/pci/pci.h                        |  37 ++--
 include/hw/pci/pci_bridge.h                 |   2 +
 include/hw/pci/pcie.h                       |   1 +
 include/standard-headers/linux/virtio_pci.h |   8 +
 14 files changed, 326 insertions(+), 18 deletions(-)
 create mode 100644 hw/pci-bridge/pcie_downstream.c
 create mode 100644 hw/pci-bridge/pcie_downstream.h


* [Qemu-devel] [PATCH v3 1/3] Add "Group Identifier" support to virtio devices.
  2018-06-29 22:19 ` [virtio-dev] " Venu Busireddy
@ 2018-06-29 22:19   ` Venu Busireddy
  -1 siblings, 0 replies; 106+ messages in thread
From: Venu Busireddy @ 2018-06-29 22:19 UTC (permalink / raw)
  To: venu.busireddy, Michael S . Tsirkin, Marcel Apfelbaum
  Cc: virtio-dev, qemu-devel

Use the virtio PCI capability "VIRTIO_PCI_CAP_GROUP_ID_CFG" to
store the "Group Identifier" specified via the command line option
"failover-group-id" for the virtio device. The capability will be
present in the virtio device's configuration space iff the
"failover-group-id" option is specified.

The Group Identifier is used to pair a virtio device with a passthrough
device.

Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
---
 hw/virtio/virtio-pci.c                      | 15 +++++++++++++++
 hw/virtio/virtio-pci.h                      |  3 ++-
 include/hw/pci/pci.h                        |  1 +
 include/hw/pci/pcie.h                       |  1 +
 include/standard-headers/linux/virtio_pci.h |  8 ++++++++
 5 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 3a01fe90f0..cdf907e9c5 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -1638,6 +1638,10 @@ static void virtio_pci_device_plugged(DeviceState *d, Error **errp)
             .cap.cap_len = sizeof cfg,
             .cap.cfg_type = VIRTIO_PCI_CAP_PCI_CFG,
         };
+        struct virtio_pci_group_id_cap group = {
+            .cap.cap_len = sizeof group,
+            .cap.cfg_type = VIRTIO_PCI_CAP_GROUP_ID_CFG,
+        };
         struct virtio_pci_notify_cap notify_pio = {
             .cap.cap_len = sizeof notify,
             .notify_off_multiplier = cpu_to_le32(0x0),
@@ -1647,6 +1651,11 @@ static void virtio_pci_device_plugged(DeviceState *d, Error **errp)
 
         virtio_pci_modern_regions_init(proxy);
 
+        if (proxy->pci_dev.failover_group_id != ULLONG_MAX) {
+            group.failover_group_id = proxy->pci_dev.failover_group_id;
+            virtio_pci_modern_mem_region_map(proxy, &proxy->group, &group.cap);
+        }
+
         virtio_pci_modern_mem_region_map(proxy, &proxy->common, &cap);
         virtio_pci_modern_mem_region_map(proxy, &proxy->isr, &cap);
         virtio_pci_modern_mem_region_map(proxy, &proxy->device, &cap);
@@ -1763,6 +1772,10 @@ static void virtio_pci_realize(PCIDevice *pci_dev, Error **errp)
     proxy->device.size = 0x1000;
     proxy->device.type = VIRTIO_PCI_CAP_DEVICE_CFG;
 
+    proxy->group.offset = 0;
+    proxy->group.size = 0;
+    proxy->group.type = VIRTIO_PCI_CAP_GROUP_ID_CFG;
+
     proxy->notify.offset = 0x3000;
     proxy->notify.size = virtio_pci_queue_mem_mult(proxy) * VIRTIO_QUEUE_MAX;
     proxy->notify.type = VIRTIO_PCI_CAP_NOTIFY_CFG;
@@ -1898,6 +1911,8 @@ static Property virtio_pci_properties[] = {
                     VIRTIO_PCI_FLAG_INIT_LNKCTL_BIT, true),
     DEFINE_PROP_BIT("x-pcie-pm-init", VirtIOPCIProxy, flags,
                     VIRTIO_PCI_FLAG_INIT_PM_BIT, true),
+    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
+                       PCIDevice, failover_group_id, ULLONG_MAX),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
index 813082b0d7..e4592e90bf 100644
--- a/hw/virtio/virtio-pci.h
+++ b/hw/virtio/virtio-pci.h
@@ -164,10 +164,11 @@ struct VirtIOPCIProxy {
             VirtIOPCIRegion common;
             VirtIOPCIRegion isr;
             VirtIOPCIRegion device;
+            VirtIOPCIRegion group;
             VirtIOPCIRegion notify;
             VirtIOPCIRegion notify_pio;
         };
-        VirtIOPCIRegion regs[5];
+        VirtIOPCIRegion regs[6];
     };
     MemoryRegion modern_bar;
     MemoryRegion io_bar;
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 990d6fcbde..b59c3e7e38 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -343,6 +343,7 @@ struct PCIDevice {
     bool has_rom;
     MemoryRegion rom;
     uint32_t rom_bar;
+    uint64_t failover_group_id;
 
     /* INTx routing notifier */
     PCIINTxRoutingNotifier intx_routing_notifier;
diff --git a/include/hw/pci/pcie.h b/include/hw/pci/pcie.h
index b71e369703..71cd143ee4 100644
--- a/include/hw/pci/pcie.h
+++ b/include/hw/pci/pcie.h
@@ -82,6 +82,7 @@ struct PCIExpressDevice {
 };
 
 #define COMPAT_PROP_PCP "power_controller_present"
+#define COMPAT_PROP_FAILOVER_GROUP_ID "failover-group-id"
 
 /* PCI express capability helper functions */
 int pcie_cap_init(PCIDevice *dev, uint8_t offset, uint8_t type,
diff --git a/include/standard-headers/linux/virtio_pci.h b/include/standard-headers/linux/virtio_pci.h
index 9262acd130..e46df63e52 100644
--- a/include/standard-headers/linux/virtio_pci.h
+++ b/include/standard-headers/linux/virtio_pci.h
@@ -113,6 +113,8 @@
 #define VIRTIO_PCI_CAP_DEVICE_CFG	4
 /* PCI configuration access */
 #define VIRTIO_PCI_CAP_PCI_CFG		5
+/* Group Identifier */
+#define VIRTIO_PCI_CAP_GROUP_ID_CFG	6
 
 /* This is the PCI capability header: */
 struct virtio_pci_cap {
@@ -163,6 +165,12 @@ struct virtio_pci_cfg_cap {
 	uint8_t pci_cfg_data[4]; /* Data for BAR access. */
 };
 
+/* Fields in VIRTIO_PCI_CAP_GROUP_ID_CFG: */
+struct virtio_pci_group_id_cap {
+	struct virtio_pci_cap cap;
+	uint64_t failover_group_id;
+};
+
 /* Macro versions of offsets for the Old Timers! */
 #define VIRTIO_PCI_CAP_VNDR		0
 #define VIRTIO_PCI_CAP_NEXT		1


* [Qemu-devel] [PATCH v3 2/3] Add "Group Identifier" support to Red Hat PCI bridge.
  2018-06-29 22:19 ` [virtio-dev] " Venu Busireddy
@ 2018-06-29 22:19   ` Venu Busireddy
  -1 siblings, 0 replies; 106+ messages in thread
From: Venu Busireddy @ 2018-06-29 22:19 UTC (permalink / raw)
  To: venu.busireddy, Michael S . Tsirkin, Marcel Apfelbaum
  Cc: virtio-dev, qemu-devel

Add the "Vendor-Specific" capability to the Red Hat PCI bridge device
"pci-bridge", to contain the "Group Identifier" that will be used to pair
a virtio device with the passthrough device attached to that bridge. Also,
change the Device ID of the bridge to PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER
to avoid confusion with bridges that don't have this capability.

This capability is added to the bridge iff the "failover-group-id"
option is specified for the bridge.

Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
---
 hw/pci-bridge/pci_bridge_dev.c | 10 ++++++++++
 hw/pci/pci_bridge.c            | 34 +++++++++++++++++++++++++++++++++
 include/hw/pci/pci.h           | 35 +++++++++++++++++-----------------
 include/hw/pci/pci_bridge.h    |  2 ++
 4 files changed, 64 insertions(+), 17 deletions(-)

diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
index b2d861d216..d3879071a8 100644
--- a/hw/pci-bridge/pci_bridge_dev.c
+++ b/hw/pci-bridge/pci_bridge_dev.c
@@ -71,6 +71,13 @@ static void pci_bridge_dev_realize(PCIDevice *dev, Error **errp)
         bridge_dev->msi = ON_OFF_AUTO_OFF;
     }
 
+    err = pci_bridge_vendor_init(dev, 0, PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER,
+            errp);
+    if (err < 0) {
+        error_append_hint(errp, "Can't init group ID, error %d\n", err);
+        goto vendor_cap_err;
+    }
+
     err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0, errp);
     if (err) {
         goto slotid_error;
@@ -109,6 +116,7 @@ slotid_error:
     if (shpc_present(dev)) {
         shpc_cleanup(dev, &bridge_dev->bar);
     }
+vendor_cap_err:
 shpc_error:
     pci_bridge_exitfn(dev);
 }
@@ -162,6 +170,8 @@ static Property pci_bridge_dev_properties[] = {
                             ON_OFF_AUTO_AUTO),
     DEFINE_PROP_BIT(PCI_BRIDGE_DEV_PROP_SHPC, PCIBridgeDev, flags,
                     PCI_BRIDGE_DEV_F_SHPC_REQ, true),
+    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
+                       PCIDevice, failover_group_id, ULLONG_MAX),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index 40a39f57cb..68cc619c20 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -40,6 +40,10 @@
 #define PCI_SSVID_SVID          4
 #define PCI_SSVID_SSID          6
 
+#define PCI_VENDOR_SIZEOF             12
+#define PCI_VENDOR_CAP_LEN_OFFSET      2
+#define PCI_VENDOR_GROUP_ID_OFFSET     4
+
 int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
                           uint16_t svid, uint16_t ssid,
                           Error **errp)
@@ -57,6 +61,36 @@ int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
     return pos;
 }
 
+int pci_bridge_vendor_init(PCIDevice *d, uint8_t offset,
+                           uint16_t device_id, Error **errp)
+{
+    int pos;
+    PCIDeviceClass *dc = PCI_DEVICE_GET_CLASS(d);
+
+    if (d->failover_group_id == ULLONG_MAX) {
+        return 0;
+    }
+
+    pos = pci_add_capability(d, PCI_CAP_ID_VNDR, offset, PCI_VENDOR_SIZEOF,
+            errp);
+    if (pos < 0) {
+        return pos;
+    }
+
+    /*
+     * Tweak the Device ID to avoid confusion
+     * with bridges that don't have the group id capability.
+     */
+    dc->device_id = device_id;
+    pci_set_word(d->config + PCI_DEVICE_ID, device_id);
+
+    pci_set_word(d->config + pos + PCI_VENDOR_CAP_LEN_OFFSET,
+            PCI_VENDOR_SIZEOF);
+    memcpy(d->config + pos + PCI_VENDOR_GROUP_ID_OFFSET,
+            &d->failover_group_id, sizeof(uint64_t));
+    return pos;
+}
+
 /* Accessor function to get parent bridge device from pci bus. */
 PCIDevice *pci_bridge_get_device(PCIBus *bus)
 {
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index b59c3e7e38..bc38032761 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -86,23 +86,24 @@ extern bool pci_available;
 #define PCI_DEVICE_ID_VIRTIO_9P          0x1009
 #define PCI_DEVICE_ID_VIRTIO_VSOCK       0x1012
 
-#define PCI_VENDOR_ID_REDHAT             0x1b36
-#define PCI_DEVICE_ID_REDHAT_BRIDGE      0x0001
-#define PCI_DEVICE_ID_REDHAT_SERIAL      0x0002
-#define PCI_DEVICE_ID_REDHAT_SERIAL2     0x0003
-#define PCI_DEVICE_ID_REDHAT_SERIAL4     0x0004
-#define PCI_DEVICE_ID_REDHAT_TEST        0x0005
-#define PCI_DEVICE_ID_REDHAT_ROCKER      0x0006
-#define PCI_DEVICE_ID_REDHAT_SDHCI       0x0007
-#define PCI_DEVICE_ID_REDHAT_PCIE_HOST   0x0008
-#define PCI_DEVICE_ID_REDHAT_PXB         0x0009
-#define PCI_DEVICE_ID_REDHAT_BRIDGE_SEAT 0x000a
-#define PCI_DEVICE_ID_REDHAT_PXB_PCIE    0x000b
-#define PCI_DEVICE_ID_REDHAT_PCIE_RP     0x000c
-#define PCI_DEVICE_ID_REDHAT_XHCI        0x000d
-#define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE 0x000e
-#define PCI_DEVICE_ID_REDHAT_MDPY        0x000f
-#define PCI_DEVICE_ID_REDHAT_QXL         0x0100
+#define PCI_VENDOR_ID_REDHAT                    0x1b36
+#define PCI_DEVICE_ID_REDHAT_BRIDGE             0x0001
+#define PCI_DEVICE_ID_REDHAT_SERIAL             0x0002
+#define PCI_DEVICE_ID_REDHAT_SERIAL2            0x0003
+#define PCI_DEVICE_ID_REDHAT_SERIAL4            0x0004
+#define PCI_DEVICE_ID_REDHAT_TEST               0x0005
+#define PCI_DEVICE_ID_REDHAT_ROCKER             0x0006
+#define PCI_DEVICE_ID_REDHAT_SDHCI              0x0007
+#define PCI_DEVICE_ID_REDHAT_PCIE_HOST          0x0008
+#define PCI_DEVICE_ID_REDHAT_PXB                0x0009
+#define PCI_DEVICE_ID_REDHAT_BRIDGE_SEAT        0x000a
+#define PCI_DEVICE_ID_REDHAT_PXB_PCIE           0x000b
+#define PCI_DEVICE_ID_REDHAT_PCIE_RP            0x000c
+#define PCI_DEVICE_ID_REDHAT_XHCI               0x000d
+#define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
+#define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
+#define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
+#define PCI_DEVICE_ID_REDHAT_QXL                0x0100
 
 #define FMT_PCIBUS                      PRIx64
 
diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index 0347da52d2..aa471ec5a4 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -83,6 +83,8 @@ struct PCIBridge {
 int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
                           uint16_t svid, uint16_t ssid,
                           Error **errp);
+int pci_bridge_vendor_init(PCIDevice *dev, uint8_t offset,
+                           uint16_t device_id, Error **errp);
 
 PCIDevice *pci_bridge_get_device(PCIBus *bus);
 PCIBus *pci_bridge_get_sec_bus(PCIBridge *br);

^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [virtio-dev] [PATCH v3 2/3] Add "Group Identifier" support to Red Hat PCI bridge.
@ 2018-06-29 22:19   ` Venu Busireddy
  0 siblings, 0 replies; 106+ messages in thread
From: Venu Busireddy @ 2018-06-29 22:19 UTC (permalink / raw)
  To: venu.busireddy, Michael S . Tsirkin, Marcel Apfelbaum
  Cc: virtio-dev, qemu-devel

Add the "Vendor-Specific" capability to the Red Hat PCI bridge device
"pci-bridge", to contain the "Group Identifier" that will be used to pair
a virtio device with the passthrough device attached to that bridge. Also,
change the Device ID of the bridge to PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER
to avoid confusion with bridges that don't have this capability.

This capability is added to the bridge iff the "failover-group-id"
option is specified for the bridge.

Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
---
 hw/pci-bridge/pci_bridge_dev.c | 10 ++++++++++
 hw/pci/pci_bridge.c            | 34 +++++++++++++++++++++++++++++++++
 include/hw/pci/pci.h           | 35 +++++++++++++++++-----------------
 include/hw/pci/pci_bridge.h    |  2 ++
 4 files changed, 64 insertions(+), 17 deletions(-)

diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
index b2d861d216..d3879071a8 100644
--- a/hw/pci-bridge/pci_bridge_dev.c
+++ b/hw/pci-bridge/pci_bridge_dev.c
@@ -71,6 +71,13 @@ static void pci_bridge_dev_realize(PCIDevice *dev, Error **errp)
         bridge_dev->msi = ON_OFF_AUTO_OFF;
     }
 
+    err = pci_bridge_vendor_init(dev, 0, PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER,
+            errp);
+    if (err < 0) {
+        error_append_hint(errp, "Can't init group ID, error %d\n", err);
+        goto vendor_cap_err;
+    }
+
     err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0, errp);
     if (err) {
         goto slotid_error;
@@ -109,6 +116,7 @@ slotid_error:
     if (shpc_present(dev)) {
         shpc_cleanup(dev, &bridge_dev->bar);
     }
+vendor_cap_err:
 shpc_error:
     pci_bridge_exitfn(dev);
 }
@@ -162,6 +170,8 @@ static Property pci_bridge_dev_properties[] = {
                             ON_OFF_AUTO_AUTO),
     DEFINE_PROP_BIT(PCI_BRIDGE_DEV_PROP_SHPC, PCIBridgeDev, flags,
                     PCI_BRIDGE_DEV_F_SHPC_REQ, true),
+    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
+                       PCIDevice, failover_group_id, ULLONG_MAX),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
index 40a39f57cb..68cc619c20 100644
--- a/hw/pci/pci_bridge.c
+++ b/hw/pci/pci_bridge.c
@@ -40,6 +40,10 @@
 #define PCI_SSVID_SVID          4
 #define PCI_SSVID_SSID          6
 
+#define PCI_VENDOR_SIZEOF             12
+#define PCI_VENDOR_CAP_LEN_OFFSET      2
+#define PCI_VENDOR_GROUP_ID_OFFSET     4
+
 int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
                           uint16_t svid, uint16_t ssid,
                           Error **errp)
@@ -57,6 +61,36 @@ int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
     return pos;
 }
 
+int pci_bridge_vendor_init(PCIDevice *d, uint8_t offset,
+                           uint16_t device_id, Error **errp)
+{
+    int pos;
+    PCIDeviceClass *dc = PCI_DEVICE_GET_CLASS(d);
+
+    if (d->failover_group_id == ULLONG_MAX) {
+        return 0;
+    }
+
+    pos = pci_add_capability(d, PCI_CAP_ID_VNDR, offset, PCI_VENDOR_SIZEOF,
+            errp);
+    if (pos < 0) {
+        return pos;
+    }
+
+    /*
+     * Tweak the Device ID to avoid confusion
+     * with bridges that don't have the group id capability.
+     */
+    dc->device_id = device_id;
+    pci_set_word(d->config + PCI_DEVICE_ID, device_id);
+
+    pci_set_word(d->config + pos + PCI_VENDOR_CAP_LEN_OFFSET,
+            PCI_VENDOR_SIZEOF);
+    pci_set_quad(d->config + pos + PCI_VENDOR_GROUP_ID_OFFSET,
+            d->failover_group_id);
+    return pos;
+}
+
 /* Accessor function to get parent bridge device from pci bus. */
 PCIDevice *pci_bridge_get_device(PCIBus *bus)
 {
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index b59c3e7e38..bc38032761 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -86,23 +86,24 @@ extern bool pci_available;
 #define PCI_DEVICE_ID_VIRTIO_9P          0x1009
 #define PCI_DEVICE_ID_VIRTIO_VSOCK       0x1012
 
-#define PCI_VENDOR_ID_REDHAT             0x1b36
-#define PCI_DEVICE_ID_REDHAT_BRIDGE      0x0001
-#define PCI_DEVICE_ID_REDHAT_SERIAL      0x0002
-#define PCI_DEVICE_ID_REDHAT_SERIAL2     0x0003
-#define PCI_DEVICE_ID_REDHAT_SERIAL4     0x0004
-#define PCI_DEVICE_ID_REDHAT_TEST        0x0005
-#define PCI_DEVICE_ID_REDHAT_ROCKER      0x0006
-#define PCI_DEVICE_ID_REDHAT_SDHCI       0x0007
-#define PCI_DEVICE_ID_REDHAT_PCIE_HOST   0x0008
-#define PCI_DEVICE_ID_REDHAT_PXB         0x0009
-#define PCI_DEVICE_ID_REDHAT_BRIDGE_SEAT 0x000a
-#define PCI_DEVICE_ID_REDHAT_PXB_PCIE    0x000b
-#define PCI_DEVICE_ID_REDHAT_PCIE_RP     0x000c
-#define PCI_DEVICE_ID_REDHAT_XHCI        0x000d
-#define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE 0x000e
-#define PCI_DEVICE_ID_REDHAT_MDPY        0x000f
-#define PCI_DEVICE_ID_REDHAT_QXL         0x0100
+#define PCI_VENDOR_ID_REDHAT                    0x1b36
+#define PCI_DEVICE_ID_REDHAT_BRIDGE             0x0001
+#define PCI_DEVICE_ID_REDHAT_SERIAL             0x0002
+#define PCI_DEVICE_ID_REDHAT_SERIAL2            0x0003
+#define PCI_DEVICE_ID_REDHAT_SERIAL4            0x0004
+#define PCI_DEVICE_ID_REDHAT_TEST               0x0005
+#define PCI_DEVICE_ID_REDHAT_ROCKER             0x0006
+#define PCI_DEVICE_ID_REDHAT_SDHCI              0x0007
+#define PCI_DEVICE_ID_REDHAT_PCIE_HOST          0x0008
+#define PCI_DEVICE_ID_REDHAT_PXB                0x0009
+#define PCI_DEVICE_ID_REDHAT_BRIDGE_SEAT        0x000a
+#define PCI_DEVICE_ID_REDHAT_PXB_PCIE           0x000b
+#define PCI_DEVICE_ID_REDHAT_PCIE_RP            0x000c
+#define PCI_DEVICE_ID_REDHAT_XHCI               0x000d
+#define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
+#define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
+#define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
+#define PCI_DEVICE_ID_REDHAT_QXL                0x0100
 
 #define FMT_PCIBUS                      PRIx64
 
diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
index 0347da52d2..aa471ec5a4 100644
--- a/include/hw/pci/pci_bridge.h
+++ b/include/hw/pci/pci_bridge.h
@@ -83,6 +83,8 @@ struct PCIBridge {
 int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
                           uint16_t svid, uint16_t ssid,
                           Error **errp);
+int pci_bridge_vendor_init(PCIDevice *dev, uint8_t offset,
+                           uint16_t device_id, Error **errp);
 
 PCIDevice *pci_bridge_get_device(PCIBus *bus);
 PCIBus *pci_bridge_get_sec_bus(PCIBridge *br);

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [Qemu-devel] [PATCH v3 3/3] Add "Group Identifier" support to Red Hat PCI Express bridge.
  2018-06-29 22:19 ` [virtio-dev] " Venu Busireddy
@ 2018-06-29 22:19   ` Venu Busireddy
  -1 siblings, 0 replies; 106+ messages in thread
From: Venu Busireddy @ 2018-06-29 22:19 UTC (permalink / raw)
  To: venu.busireddy, Michael S . Tsirkin, Marcel Apfelbaum
  Cc: virtio-dev, qemu-devel

Add a new bridge device "pcie-downstream" with a
Vendor ID of PCI_VENDOR_ID_REDHAT and a Device ID of
PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER. Also, add the "Vendor-Specific"
capability to the bridge to contain the "Group Identifier" that will be
used to pair a virtio device with the passthrough device attached to
that bridge.

This capability is added to the bridge iff the "failover-group-id"
option is specified for the bridge.

Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
---
 default-configs/arm-softmmu.mak    |   1 +
 default-configs/i386-softmmu.mak   |   1 +
 default-configs/x86_64-softmmu.mak |   1 +
 hw/pci-bridge/Makefile.objs        |   1 +
 hw/pci-bridge/pcie_downstream.c    | 220 +++++++++++++++++++++++++++++
 hw/pci-bridge/pcie_downstream.h    |  10 ++
 include/hw/pci/pci.h               |   1 +
 7 files changed, 235 insertions(+)
 create mode 100644 hw/pci-bridge/pcie_downstream.c
 create mode 100644 hw/pci-bridge/pcie_downstream.h

diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
index 834d45cfaf..b86c6fb122 100644
--- a/default-configs/arm-softmmu.mak
+++ b/default-configs/arm-softmmu.mak
@@ -139,6 +139,7 @@ CONFIG_IMX_I2C=y
 CONFIG_PCIE_PORT=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
+CONFIG_PCIE_DOWNSTREAM=y
 CONFIG_I82801B11=y
 CONFIG_ACPI=y
 CONFIG_SMBIOS=y
diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 8c7d4a0fa0..a900c8f052 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
 CONFIG_PCIE_PORT=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
+CONFIG_PCIE_DOWNSTREAM=y
 CONFIG_I82801B11=y
 CONFIG_SMBIOS=y
 CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
index 0390b4303c..481e4764be 100644
--- a/default-configs/x86_64-softmmu.mak
+++ b/default-configs/x86_64-softmmu.mak
@@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
 CONFIG_PCIE_PORT=y
 CONFIG_XIO3130=y
 CONFIG_IOH3420=y
+CONFIG_PCIE_DOWNSTREAM=y
 CONFIG_I82801B11=y
 CONFIG_SMBIOS=y
 CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs
index 47065f87d9..5b42212edc 100644
--- a/hw/pci-bridge/Makefile.objs
+++ b/hw/pci-bridge/Makefile.objs
@@ -3,6 +3,7 @@ common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o pcie_pci
 common-obj-$(CONFIG_PXB) += pci_expander_bridge.o
 common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o
 common-obj-$(CONFIG_IOH3420) += ioh3420.o
+common-obj-$(CONFIG_PCIE_DOWNSTREAM) += pcie_downstream.o
 common-obj-$(CONFIG_I82801B11) += i82801b11.o
 # NewWorld PowerMac
 common-obj-$(CONFIG_DEC_PCI) += dec.o
diff --git a/hw/pci-bridge/pcie_downstream.c b/hw/pci-bridge/pcie_downstream.c
new file mode 100644
index 0000000000..49b0d1e933
--- /dev/null
+++ b/hw/pci-bridge/pcie_downstream.c
@@ -0,0 +1,220 @@
+/*
+ * Red Hat PCI Express downstream port.
+ *
+ * pcie_downstream.c
+ * Most of this code is copied from xio3130_downstream.c
+ *
+ * Copyright (c) 2018 Oracle and/or its affiliates.
+ *         Author: Venu Busireddy <venu.busireddy@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/pci/pci_ids.h"
+#include "hw/pci/msi.h"
+#include "hw/pci/pcie.h"
+#include "pcie_downstream.h"
+#include "qapi/error.h"
+
+#define REDHAT_PCIE_DOWNSTREAM_REVISION                 0x1
+#define REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET               0x70
+#define REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS      PCI_MSI_FLAGS_64BIT
+#define REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR            1
+#define REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET             0x80
+#define REDHAT_PCIE_DOWNSTREAM_SSVID_SVID               0
+#define REDHAT_PCIE_DOWNSTREAM_SSVID_SSID               0
+#define REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET               0x90
+#define REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET            0xCC
+#define REDHAT_PCIE_DOWNSTREAM_AER_OFFSET               0x100
+
+static void pcie_downstream_write_config(PCIDevice *d, uint32_t address,
+                                         uint32_t val, int len)
+{
+    pci_bridge_write_config(d, address, val, len);
+    pcie_cap_flr_write_config(d, address, val, len);
+    pcie_cap_slot_write_config(d, address, val, len);
+    pcie_aer_write_config(d, address, val, len);
+}
+
+static void pcie_downstream_reset(DeviceState *qdev)
+{
+    PCIDevice *d = PCI_DEVICE(qdev);
+
+    pcie_cap_deverr_reset(d);
+    pcie_cap_slot_reset(d);
+    pcie_cap_arifwd_reset(d);
+    pci_bridge_reset(qdev);
+}
+
+static void pcie_downstream_realize(PCIDevice *d, Error **errp)
+{
+    PCIEPort *p = PCIE_PORT(d);
+    PCIESlot *s = PCIE_SLOT(d);
+    int rc;
+
+    pci_bridge_initfn(d, TYPE_PCIE_BUS);
+    pcie_port_init_reg(d);
+
+    rc = pci_bridge_vendor_init(d, REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET,
+            PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER, errp);
+    if (rc < 0) {
+        error_append_hint(errp, "Can't init group ID, error %d\n", rc);
+        goto err_bridge;
+    }
+
+    rc = msi_init(d, REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET,
+            REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR,
+            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
+            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
+            errp);
+    if (rc < 0) {
+        assert(rc == -ENOTSUP);
+        goto err_bridge;
+    }
+
+    rc = pci_bridge_ssvid_init(d, REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET,
+            REDHAT_PCIE_DOWNSTREAM_SSVID_SVID,
+            REDHAT_PCIE_DOWNSTREAM_SSVID_SSID, errp);
+    if (rc < 0) {
+        goto err_bridge;
+    }
+
+    rc = pcie_cap_init(d, REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET,
+            PCI_EXP_TYPE_DOWNSTREAM, p->port, errp);
+    if (rc < 0) {
+        goto err_msi;
+    }
+    pcie_cap_flr_init(d);
+    pcie_cap_deverr_init(d);
+    pcie_cap_slot_init(d, s->slot);
+    pcie_cap_arifwd_init(d);
+
+    pcie_chassis_create(s->chassis);
+    rc = pcie_chassis_add_slot(s);
+    if (rc < 0) {
+        error_setg(errp, "Can't add chassis slot, error %d", rc);
+        goto err_pcie_cap;
+    }
+
+    rc = pcie_aer_init(d, PCI_ERR_VER, REDHAT_PCIE_DOWNSTREAM_AER_OFFSET,
+            PCI_ERR_SIZEOF, errp);
+    if (rc < 0) {
+        goto err;
+    }
+
+    return;
+
+err:
+    pcie_chassis_del_slot(s);
+err_pcie_cap:
+    pcie_cap_exit(d);
+err_msi:
+    msi_uninit(d);
+err_bridge:
+    pci_bridge_exitfn(d);
+}
+
+static void pcie_downstream_exitfn(PCIDevice *d)
+{
+    PCIESlot *s = PCIE_SLOT(d);
+
+    pcie_aer_exit(d);
+    pcie_chassis_del_slot(s);
+    pcie_cap_exit(d);
+    msi_uninit(d);
+    pci_bridge_exitfn(d);
+}
+
+PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
+                               const char *bus_name, pci_map_irq_fn map_irq,
+                               uint8_t port, uint8_t chassis, uint16_t slot)
+{
+    PCIDevice *d;
+    PCIBridge *br;
+    DeviceState *qdev;
+
+    d = pci_create_multifunction(bus, devfn, multifunction, "pcie-downstream");
+    if (!d) {
+        return NULL;
+    }
+    br = PCI_BRIDGE(d);
+
+    qdev = DEVICE(d);
+    pci_bridge_map_irq(br, bus_name, map_irq);
+    qdev_prop_set_uint8(qdev, "port", port);
+    qdev_prop_set_uint8(qdev, "chassis", chassis);
+    qdev_prop_set_uint16(qdev, "slot", slot);
+    qdev_init_nofail(qdev);
+
+    return PCIE_SLOT(d);
+}
+
+static Property pcie_downstream_props[] = {
+    DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present,
+            QEMU_PCIE_SLTCAP_PCP_BITNR, true),
+    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
+            PCIDevice, failover_group_id, ULLONG_MAX),
+    DEFINE_PROP_END_OF_LIST()
+};
+
+static const VMStateDescription vmstate_pcie_ds = {
+    .name = "pcie-downstream-port",
+    .priority = MIG_PRI_PCI_BUS,
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .post_load = pcie_cap_slot_post_load,
+    .fields = (VMStateField[]) {
+        VMSTATE_PCI_DEVICE(parent_obj.parent_obj.parent_obj, PCIESlot),
+        VMSTATE_STRUCT(parent_obj.parent_obj.parent_obj.exp.aer_log,
+                       PCIESlot, 0, vmstate_pcie_aer_log, PCIEAERLog),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
+static void pcie_downstream_class_init(ObjectClass *klass, void *data)
+{
+    DeviceClass *dc = DEVICE_CLASS(klass);
+    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+    k->is_bridge = 1;
+    k->config_write = pcie_downstream_write_config;
+    k->realize = pcie_downstream_realize;
+    k->exit = pcie_downstream_exitfn;
+    k->vendor_id = PCI_VENDOR_ID_REDHAT;
+    k->device_id = PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER;
+    k->revision = REDHAT_PCIE_DOWNSTREAM_REVISION;
+    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+    dc->desc = "Red Hat PCIe Downstream Port";
+    dc->reset = pcie_downstream_reset;
+    dc->vmsd = &vmstate_pcie_ds;
+    dc->props = pcie_downstream_props;
+}
+
+static const TypeInfo pcie_downstream_info = {
+    .name          = "pcie-downstream",
+    .parent        = TYPE_PCIE_SLOT,
+    .class_init    = pcie_downstream_class_init,
+    .interfaces = (InterfaceInfo[]) {
+        { INTERFACE_PCIE_DEVICE },
+        { }
+    },
+};
+
+static void pcie_downstream_register_types(void)
+{
+    type_register_static(&pcie_downstream_info);
+}
+
+type_init(pcie_downstream_register_types)
diff --git a/hw/pci-bridge/pcie_downstream.h b/hw/pci-bridge/pcie_downstream.h
new file mode 100644
index 0000000000..a7f1aae4bb
--- /dev/null
+++ b/hw/pci-bridge/pcie_downstream.h
@@ -0,0 +1,10 @@
+#ifndef QEMU_PCIE_DOWNSTREAM_H
+#define QEMU_PCIE_DOWNSTREAM_H
+
+#include "hw/pci/pcie_port.h"
+
+PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
+        const char *bus_name, pci_map_irq_fn map_irq, uint8_t port,
+        uint8_t chassis, uint16_t slot);
+
+#endif /* QEMU_PCIE_DOWNSTREAM_H */
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index bc38032761..bf04ef59f3 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -103,6 +103,7 @@ extern bool pci_available;
 #define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
 #define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
 #define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
+#define PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER  0x0011
 #define PCI_DEVICE_ID_REDHAT_QXL                0x0100
 
 #define FMT_PCIBUS                      PRIx64

^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [Qemu-devel] [PATCH v3 virtio 1/1] Add "Group Identifier" to virtio PCI capabilities.
  2018-06-29 22:19 ` [virtio-dev] " Venu Busireddy
@ 2018-06-29 22:19   ` Venu Busireddy
  -1 siblings, 0 replies; 106+ messages in thread
From: Venu Busireddy @ 2018-06-29 22:19 UTC (permalink / raw)
  To: venu.busireddy, Michael S . Tsirkin, Marcel Apfelbaum
  Cc: virtio-dev, qemu-devel

Add VIRTIO_PCI_CAP_GROUP_ID_CFG (Group Identifier) capability to the
virtio PCI capabilities to allow for the grouping of devices.

Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
---
 content.tex | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/content.tex b/content.tex
index be18234..34627e5 100644
--- a/content.tex
+++ b/content.tex
@@ -599,6 +599,8 @@ The fields are interpreted as follows:
 #define VIRTIO_PCI_CAP_DEVICE_CFG        4
 /* PCI configuration access */
 #define VIRTIO_PCI_CAP_PCI_CFG           5
+/* Group Identifier */
+#define VIRTIO_PCI_CAP_GROUP_ID_CFG      6
 \end{lstlisting}
 
         Any other value is reserved for future use.
@@ -997,6 +999,34 @@ address \field{cap.length} bytes within a BAR range
 specified by some other Virtio Structure PCI Capability
 of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
 
+\subsubsection{Group Identifier capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Group Identifier capability}
+
+The VIRTIO_PCI_CAP_GROUP_ID_CFG capability provides a means of grouping
+devices together.
+
+The capability is immediately followed by an identifier of type u64,
+as shown below:
+
+\begin{lstlisting}
+struct virtio_pci_group_id_cap {
+        struct virtio_pci_cap cap;
+        u64 group_id; /* Group Identifier */
+};
+\end{lstlisting}
+
+\devicenormative{\paragraph}{Group Identifier capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Group Identifier capability}
+
+The device MAY present the VIRTIO_PCI_CAP_GROUP_ID_CFG capability.
+
+When this capability is present, the device MUST set the fields
+\field{cap.bar}, \field{cap.offset} and \field{cap.length} to 0, and
+set \field{group_id} to a unique identifier.
+
+\drivernormative{\paragraph}{Group Identifier capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Group Identifier capability}
+
+The fields \field{cap.bar}, \field{cap.offset}, \field{cap.length}
+and \field{group_id} are read-only for the driver.
+
 \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
 
 Transitional devices MUST present part of configuration

^ permalink raw reply related	[flat|nested] 106+ messages in thread

* [virtio-dev] [PATCH v3 virtio 1/1] Add "Group Identifier" to virtio PCI capabilities.
@ 2018-06-29 22:19   ` Venu Busireddy
  0 siblings, 0 replies; 106+ messages in thread
From: Venu Busireddy @ 2018-06-29 22:19 UTC (permalink / raw)
  To: venu.busireddy, Michael S . Tsirkin, Marcel Apfelbaum
  Cc: virtio-dev, qemu-devel

Add VIRTIO_PCI_CAP_GROUP_ID_CFG (Group Identifier) capability to the
virtio PCI capabilities to allow for the grouping of devices.

Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
---
 content.tex | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/content.tex b/content.tex
index be18234..34627e5 100644
--- a/content.tex
+++ b/content.tex
@@ -599,6 +599,8 @@ The fields are interpreted as follows:
 #define VIRTIO_PCI_CAP_DEVICE_CFG        4
 /* PCI configuration access */
 #define VIRTIO_PCI_CAP_PCI_CFG           5
+/* Group Identifier */
+#define VIRTIO_PCI_CAP_GROUP_ID_CFG      6
 \end{lstlisting}
 
         Any other value is reserved for future use.
@@ -997,6 +999,34 @@ address \field{cap.length} bytes within a BAR range
 specified by some other Virtio Structure PCI Capability
 of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}.
 
+\subsubsection{Group Identifier capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Group Identifier capability}
+
+The VIRTIO_PCI_CAP_GROUP_ID_CFG capability provides means for grouping
+devices together.
+
+The capability is immediately followed by an identifier of type u64
+as below:
+
+\begin{lstlisting}
+struct virtio_pci_group_id_cap {
+        struct virtio_pci_cap cap;
+        u64 group_id; /* Group Identifier */
+};
+\end{lstlisting}
+
+\devicenormative{\paragraph}{Group Identifier capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Group Identifier capability}
+
+The device MAY present the VIRTIO_PCI_CAP_GROUP_ID_CFG capability.
+
+When this capability is present, the device MUST set the fields
+\field{cap.bar}, \field{cap.offset} and \field{cap.length} to 0, and
+MUST set \field{group_id} to a unique identifier.
+
+\drivernormative{\paragraph}{Group Identifier capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Group Identifier capability}
+
+The fields \field{cap.bar}, \field{cap.offset}, \field{cap.length}
+and \field{group_id} are read-only for the driver.
+
 \subsubsection{Legacy Interfaces: A Note on PCI Device Layout}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Legacy Interfaces: A Note on PCI Device Layout}
 
 Transitional devices MUST present part of configuration

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-06-29 22:19 ` [virtio-dev] " Venu Busireddy
                   ` (4 preceding siblings ...)
  (?)
@ 2018-07-02 16:14 ` Roman Kagan
  2018-07-02 21:14     ` [virtio-dev] " si-wei liu
  -1 siblings, 1 reply; 106+ messages in thread
From: Roman Kagan @ 2018-07-02 16:14 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Michael S . Tsirkin, Marcel Apfelbaum, virtio-dev, qemu-devel

On Fri, Jun 29, 2018 at 05:19:03PM -0500, Venu Busireddy wrote:
> The patch set "Enable virtio_net to act as a standby for a passthru
> device" [1] deals with live migration of guests that use passthrough
> devices. However, that scheme uses the MAC address for pairing
> the virtio device and the passthrough device. The thread "netvsc:
> refactor notifier/event handling code to use the failover framework"
> [2] discusses an alternate mechanism, such as using an UUID, for pairing
> the devices. Based on that discussion, proposals "Add "Group Identifier"
> to virtio PCI capabilities." [3] and "RFC: Use of bridge devices to
> store pairing information..." [4] were made.
> 
> The current patch set includes all the feedback received for proposals [3]
> and [4]. For the sake of completeness, patch for the virtio specification
> is also included here. Following is the updated proposal.
> 
> 1. Extend the virtio specification to include a new virtio PCI capability
>    "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> 
> 2. Enhance the QEMU CLI to include a "failover-group-id" option to the
>    virtio device. The "failover-group-id" is a 64 bit value.
> 
> 3. Enhance the QEMU CLI to include a "failover-group-id" option to the
>    Red Hat PCI bridge device (support for the i440FX model).
> 
> 4. Add a new "pcie-downstream" device, with the option
>    "failover-group-id" (support for the Q35 model).
> 
> 5. The operator creates a 64 bit unique identifier, failover-group-id.
> 
> 6. When the virtio device is created, the operator uses the
>    "failover-group-id" option (for example, '-device
>    virtio-net-pci,failover-group-id=<identifier>') and specifies the
>    failover-group-id created in step 4.
> 
>    QEMU stores the failover-group-id in the virtio device's configuration
>    space in the capability "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> 
> 7. When assigning a PCI device to the guest in passthrough mode, the
>    operator first creates a bridge using the "failover-group-id" option
>    (for example, '-device pcie-downstream,failover-group-id=<identifier>')
>    to specify the failover-group-id created in step 4, and then attaches
>    the passthrough device to the bridge.
> 
>    QEMU stores the failover-group-id in the configuration space of the
>    bridge as Vendor-Specific capability (0x09). The "Vendor" here is
>    not to be confused with a specific organization. Instead, the vendor
>    of the bridge is QEMU.
> 
> 8. Patch 4 in patch series "Enable virtio_net to act as a standby for
>    a passthru device" [1] needs to be modified to use the UUID values
>    present in the bridge's configuration space and the virtio device's
>    configuration space instead of the MAC address for pairing the devices.

I'm still missing a few bits in the overall scheme.

Is the guest supposed to acknowledge the support for PT-PV failover?

Should the PT device be visible to the guest before it acknowledges the
support for failover?  How is this supposed to work with legacy guests
that don't support it?

Is the guest supposed to signal the datapath switch to the host?

Is the scheme going to be applied/extended to other transports (vmbus,
virtio-ccw, etc.)?

Is the failover group concept going to be used beyond PT-PV network
device failover?

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-02 16:14 ` [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices Roman Kagan
@ 2018-07-02 21:14     ` si-wei liu
  0 siblings, 0 replies; 106+ messages in thread
From: si-wei liu @ 2018-07-02 21:14 UTC (permalink / raw)
  To: Roman Kagan, Venu Busireddy, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel



On 7/2/2018 9:14 AM, Roman Kagan wrote:
> On Fri, Jun 29, 2018 at 05:19:03PM -0500, Venu Busireddy wrote:
>> The patch set "Enable virtio_net to act as a standby for a passthru
>> device" [1] deals with live migration of guests that use passthrough
>> devices. However, that scheme uses the MAC address for pairing
>> the virtio device and the passthrough device. The thread "netvsc:
>> refactor notifier/event handling code to use the failover framework"
>> [2] discusses an alternate mechanism, such as using an UUID, for pairing
>> the devices. Based on that discussion, proposals "Add "Group Identifier"
>> to virtio PCI capabilities." [3] and "RFC: Use of bridge devices to
>> store pairing information..." [4] were made.
>>
>> The current patch set includes all the feedback received for proposals [3]
>> and [4]. For the sake of completeness, patch for the virtio specification
>> is also included here. Following is the updated proposal.
>>
>> 1. Extend the virtio specification to include a new virtio PCI capability
>>     "VIRTIO_PCI_CAP_GROUP_ID_CFG".
>>
>> 2. Enhance the QEMU CLI to include a "failover-group-id" option to the
>>     virtio device. The "failover-group-id" is a 64 bit value.
>>
>> 3. Enhance the QEMU CLI to include a "failover-group-id" option to the
>>     Red Hat PCI bridge device (support for the i440FX model).
>>
>> 4. Add a new "pcie-downstream" device, with the option
>>     "failover-group-id" (support for the Q35 model).
>>
>> 5. The operator creates a 64 bit unique identifier, failover-group-id.
>>
>> 6. When the virtio device is created, the operator uses the
>>     "failover-group-id" option (for example, '-device
>>     virtio-net-pci,failover-group-id=<identifier>') and specifies the
>>     failover-group-id created in step 4.
>>
>>     QEMU stores the failover-group-id in the virtio device's configuration
>>     space in the capability "VIRTIO_PCI_CAP_GROUP_ID_CFG".
>>
>> 7. When assigning a PCI device to the guest in passthrough mode, the
>>     operator first creates a bridge using the "failover-group-id" option
>>     (for example, '-device pcie-downstream,failover-group-id=<identifier>')
>>     to specify the failover-group-id created in step 4, and then attaches
>>     the passthrough device to the bridge.
>>
>>     QEMU stores the failover-group-id in the configuration space of the
>>     bridge as Vendor-Specific capability (0x09). The "Vendor" here is
>>     not to be confused with a specific organization. Instead, the vendor
>>     of the bridge is QEMU.
>>
>> 8. Patch 4 in patch series "Enable virtio_net to act as a standby for
>>     a passthru device" [1] needs to be modified to use the UUID values
>>     present in the bridge's configuration space and the virtio device's
>>     configuration space instead of the MAC address for pairing the devices.
> I'm still missing a few bits in the overall scheme.
>
> Is the guest supposed to acknowledge the support for PT-PV failover?

Yes. We are leveraging virtio's feature negotiation mechanism for that. 
A guest that does not acknowledge the support will not have the PT device plugged in.

> Should the PT device be visible to the guest before it acknowledges the
> support for failover?
No. QEMU will only expose the PT device after the guest acknowledges the 
support through virtio's feature negotiation.

>    How is this supposed to work with legacy guests that don't support it?
Only the PV device will be exposed to a legacy guest.

>
> Is the guest supposed to signal the datapath switch to the host?
No, the guest doesn't need to initiate the datapath switch at all. However, 
QMP events may be generated when the PT device is exposed or hidden 
through hot plug/unplug, to help the host switch the datapath.

>
> Is the scheme going to be applied/extended to other transports (vmbus,
> virtio-ccw, etc.)?
Well, it depends on the use case, and on how feasibly the scheme can be 
extended to other transports, given their constraints and transport specifics.

> Is the failover group concept going to be used beyond PT-PV network
> device failover?
Although the concept of a failover group is generic, the implementation 
itself may vary.

-Siwei


>
> Thanks,
> Roman.
>
>

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: [Qemu-devel] [virtio-dev] [PATCH v3 2/3] Add "Group Identifier" support to Red Hat PCI bridge.
  2018-06-29 22:19   ` [virtio-dev] " Venu Busireddy
@ 2018-07-03  3:13     ` Siwei Liu
  -1 siblings, 0 replies; 106+ messages in thread
From: Siwei Liu @ 2018-07-03  3:13 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Michael S . Tsirkin, Marcel Apfelbaum, virtio-dev, qemu-devel

On Fri, Jun 29, 2018 at 3:19 PM, Venu Busireddy
<venu.busireddy@oracle.com> wrote:
> Add the "Vendor-Specific" capability to the Red Hat PCI bridge device
> "pci-bridge", to contain the "Group Identifier" that will be used to pair
> a virtio device with the passthrough device attached to that bridge. Also,
> change the Device ID of the bridge to PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER
> to avoid confusion with bridges that don't have this capability.
>
> This capability is added to the bridge iff the "failover-group-id"
> option is specified for the bridge.
>
> Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
> ---
>  hw/pci-bridge/pci_bridge_dev.c | 10 ++++++++++
>  hw/pci/pci_bridge.c            | 34 +++++++++++++++++++++++++++++++++
>  include/hw/pci/pci.h           | 35 +++++++++++++++++-----------------
>  include/hw/pci/pci_bridge.h    |  2 ++
>  4 files changed, 64 insertions(+), 17 deletions(-)
>
> diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
> index b2d861d216..d3879071a8 100644
> --- a/hw/pci-bridge/pci_bridge_dev.c
> +++ b/hw/pci-bridge/pci_bridge_dev.c
> @@ -71,6 +71,13 @@ static void pci_bridge_dev_realize(PCIDevice *dev, Error **errp)
>          bridge_dev->msi = ON_OFF_AUTO_OFF;
>      }
>
> +    err = pci_bridge_vendor_init(dev, 0, PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER,
> +            errp);
> +    if (err < 0) {
> +        error_append_hint(errp, "Can't init group ID, error %d\n", err);
> +        goto vendor_cap_err;
> +    }
> +
>      err = slotid_cap_init(dev, 0, bridge_dev->chassis_nr, 0, errp);
>      if (err) {
>          goto slotid_error;
> @@ -109,6 +116,7 @@ slotid_error:
>      if (shpc_present(dev)) {
>          shpc_cleanup(dev, &bridge_dev->bar);
>      }
> +vendor_cap_err:
>  shpc_error:
>      pci_bridge_exitfn(dev);
>  }
> @@ -162,6 +170,8 @@ static Property pci_bridge_dev_properties[] = {
>                              ON_OFF_AUTO_AUTO),
>      DEFINE_PROP_BIT(PCI_BRIDGE_DEV_PROP_SHPC, PCIBridgeDev, flags,
>                      PCI_BRIDGE_DEV_F_SHPC_REQ, true),
> +    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
> +                       PCIDevice, failover_group_id, ULLONG_MAX),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>
> diff --git a/hw/pci/pci_bridge.c b/hw/pci/pci_bridge.c
> index 40a39f57cb..68cc619c20 100644
> --- a/hw/pci/pci_bridge.c
> +++ b/hw/pci/pci_bridge.c
> @@ -40,6 +40,10 @@
>  #define PCI_SSVID_SVID          4
>  #define PCI_SSVID_SSID          6
>
> +#define PCI_VENDOR_SIZEOF             12
> +#define PCI_VENDOR_CAP_LEN_OFFSET      2
> +#define PCI_VENDOR_GROUP_ID_OFFSET     4
> +
>  int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
>                            uint16_t svid, uint16_t ssid,
>                            Error **errp)
> @@ -57,6 +61,36 @@ int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
>      return pos;
>  }
>
> +int pci_bridge_vendor_init(PCIDevice *d, uint8_t offset,
> +                           uint16_t device_id, Error **errp)
> +{
> +    int pos;
> +    PCIDeviceClass *dc = PCI_DEVICE_GET_CLASS(d);
> +
> +    if (d->failover_group_id == ULLONG_MAX) {
> +        return 0;
> +    }
> +
> +    pos = pci_add_capability(d, PCI_CAP_ID_VNDR, offset, PCI_VENDOR_SIZEOF,
> +            errp);
> +    if (pos < 0) {
> +        return pos;
> +    }
> +
> +    /*
> +     * Tweak the Device ID to avoid confusion
> +     * with bridges that don't have the group id capability.
> +     */
> +    dc->device_id = device_id;
> +    pci_set_word(d->config + PCI_DEVICE_ID, device_id);
> +
> +    pci_set_word(d->config + pos + PCI_VENDOR_CAP_LEN_OFFSET,
> +            PCI_VENDOR_SIZEOF);
> +    memcpy(d->config + pos + PCI_VENDOR_GROUP_ID_OFFSET,
> +            &d->failover_group_id, sizeof(uint64_t));

You should use pci_set_quad() on the 64-bit value instead, to avoid
potential endianness issues.

-Siwei



> +    return pos;
> +}
> +
>  /* Accessor function to get parent bridge device from pci bus. */
>  PCIDevice *pci_bridge_get_device(PCIBus *bus)
>  {
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index b59c3e7e38..bc38032761 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -86,23 +86,24 @@ extern bool pci_available;
>  #define PCI_DEVICE_ID_VIRTIO_9P          0x1009
>  #define PCI_DEVICE_ID_VIRTIO_VSOCK       0x1012
>
> -#define PCI_VENDOR_ID_REDHAT             0x1b36
> -#define PCI_DEVICE_ID_REDHAT_BRIDGE      0x0001
> -#define PCI_DEVICE_ID_REDHAT_SERIAL      0x0002
> -#define PCI_DEVICE_ID_REDHAT_SERIAL2     0x0003
> -#define PCI_DEVICE_ID_REDHAT_SERIAL4     0x0004
> -#define PCI_DEVICE_ID_REDHAT_TEST        0x0005
> -#define PCI_DEVICE_ID_REDHAT_ROCKER      0x0006
> -#define PCI_DEVICE_ID_REDHAT_SDHCI       0x0007
> -#define PCI_DEVICE_ID_REDHAT_PCIE_HOST   0x0008
> -#define PCI_DEVICE_ID_REDHAT_PXB         0x0009
> -#define PCI_DEVICE_ID_REDHAT_BRIDGE_SEAT 0x000a
> -#define PCI_DEVICE_ID_REDHAT_PXB_PCIE    0x000b
> -#define PCI_DEVICE_ID_REDHAT_PCIE_RP     0x000c
> -#define PCI_DEVICE_ID_REDHAT_XHCI        0x000d
> -#define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE 0x000e
> -#define PCI_DEVICE_ID_REDHAT_MDPY        0x000f
> -#define PCI_DEVICE_ID_REDHAT_QXL         0x0100
> +#define PCI_VENDOR_ID_REDHAT                    0x1b36
> +#define PCI_DEVICE_ID_REDHAT_BRIDGE             0x0001
> +#define PCI_DEVICE_ID_REDHAT_SERIAL             0x0002
> +#define PCI_DEVICE_ID_REDHAT_SERIAL2            0x0003
> +#define PCI_DEVICE_ID_REDHAT_SERIAL4            0x0004
> +#define PCI_DEVICE_ID_REDHAT_TEST               0x0005
> +#define PCI_DEVICE_ID_REDHAT_ROCKER             0x0006
> +#define PCI_DEVICE_ID_REDHAT_SDHCI              0x0007
> +#define PCI_DEVICE_ID_REDHAT_PCIE_HOST          0x0008
> +#define PCI_DEVICE_ID_REDHAT_PXB                0x0009
> +#define PCI_DEVICE_ID_REDHAT_BRIDGE_SEAT        0x000a
> +#define PCI_DEVICE_ID_REDHAT_PXB_PCIE           0x000b
> +#define PCI_DEVICE_ID_REDHAT_PCIE_RP            0x000c
> +#define PCI_DEVICE_ID_REDHAT_XHCI               0x000d
> +#define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
> +#define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
> +#define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
> +#define PCI_DEVICE_ID_REDHAT_QXL                0x0100
>
>  #define FMT_PCIBUS                      PRIx64
>
> diff --git a/include/hw/pci/pci_bridge.h b/include/hw/pci/pci_bridge.h
> index 0347da52d2..aa471ec5a4 100644
> --- a/include/hw/pci/pci_bridge.h
> +++ b/include/hw/pci/pci_bridge.h
> @@ -83,6 +83,8 @@ struct PCIBridge {
>  int pci_bridge_ssvid_init(PCIDevice *dev, uint8_t offset,
>                            uint16_t svid, uint16_t ssid,
>                            Error **errp);
> +int pci_bridge_vendor_init(PCIDevice *dev, uint8_t offset,
> +                           uint16_t device_id, Error **errp);
>
>  PCIDevice *pci_bridge_get_device(PCIBus *bus);
>  PCIBus *pci_bridge_get_sec_bus(PCIBridge *br);
>

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-02 21:14     ` [virtio-dev] " si-wei liu
  (?)
@ 2018-07-03  9:58     ` Roman Kagan
  2018-07-03 14:28         ` [virtio-dev] " Venu Busireddy
                         ` (2 more replies)
  -1 siblings, 3 replies; 106+ messages in thread
From: Roman Kagan @ 2018-07-03  9:58 UTC (permalink / raw)
  To: si-wei liu
  Cc: Venu Busireddy, Michael S . Tsirkin, Marcel Apfelbaum,
	virtio-dev, qemu-devel

On Mon, Jul 02, 2018 at 02:14:52PM -0700, si-wei liu wrote:
> On 7/2/2018 9:14 AM, Roman Kagan wrote:
> > On Fri, Jun 29, 2018 at 05:19:03PM -0500, Venu Busireddy wrote:
> > > The patch set "Enable virtio_net to act as a standby for a passthru
> > > device" [1] deals with live migration of guests that use passthrough
> > > devices. However, that scheme uses the MAC address for pairing
> > > the virtio device and the passthrough device. The thread "netvsc:
> > > refactor notifier/event handling code to use the failover framework"
> > > [2] discusses an alternate mechanism, such as using an UUID, for pairing
> > > the devices. Based on that discussion, proposals "Add "Group Identifier"
> > > to virtio PCI capabilities." [3] and "RFC: Use of bridge devices to
> > > store pairing information..." [4] were made.
> > > 
> > > The current patch set includes all the feedback received for proposals [3]
> > > and [4]. For the sake of completeness, patch for the virtio specification
> > > is also included here. Following is the updated proposal.
> > > 
> > > 1. Extend the virtio specification to include a new virtio PCI capability
> > >     "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> > > 
> > > 2. Enhance the QEMU CLI to include a "failover-group-id" option to the
> > >     virtio device. The "failover-group-id" is a 64 bit value.
> > > 
> > > 3. Enhance the QEMU CLI to include a "failover-group-id" option to the
> > >     Red Hat PCI bridge device (support for the i440FX model).
> > > 
> > > 4. Add a new "pcie-downstream" device, with the option
> > >     "failover-group-id" (support for the Q35 model).
> > > 
> > > 5. The operator creates a 64 bit unique identifier, failover-group-id.
> > > 
> > > 6. When the virtio device is created, the operator uses the
> > >     "failover-group-id" option (for example, '-device
> > >     virtio-net-pci,failover-group-id=<identifier>') and specifies the
> > >     failover-group-id created in step 5.
> > > 
> > >     QEMU stores the failover-group-id in the virtio device's configuration
> > >     space in the capability "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> > > 
> > > 7. When assigning a PCI device to the guest in passthrough mode, the
> > >     operator first creates a bridge using the "failover-group-id" option
> > >     (for example, '-device pcie-downstream,failover-group-id=<identifier>')
> > >     to specify the failover-group-id created in step 5, and then attaches
> > >     the passthrough device to the bridge.
> > > 
> > >     QEMU stores the failover-group-id in the configuration space of the
> > >     bridge as Vendor-Specific capability (0x09). The "Vendor" here is
> > >     not to be confused with a specific organization. Instead, the vendor
> > >     of the bridge is QEMU.
> > > 
> > > 8. Patch 4 in patch series "Enable virtio_net to act as a standby for
> > >     a passthru device" [1] needs to be modified to use the failover-group-id values
> > >     present in the bridge's configuration space and the virtio device's
> > >     configuration space instead of the MAC address for pairing the devices.
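[Editorial note: the CLI usage proposed in steps 6 and 7 above can be sketched as below. The "failover-group-id" option and the "pcie-downstream" device are part of this proposal, not existing QEMU features, and the vfio-pci host address is a made-up example.]

```python
# Sketch of the proposed CLI usage (steps 6 and 7). The
# "failover-group-id" option and the "pcie-downstream" device are
# proposed in this patch set; they do not exist in released QEMU.

def device_arg(driver, props):
    """Compose one QEMU -device argument from a driver name and properties."""
    opts = ",".join(f"{k}={v}" for k, v in props.items())
    return f"{driver},{opts}" if opts else driver

group_id = 0xDEADBEEF12345678          # operator-chosen 64-bit id (step 5)
gid = f"{group_id:#x}"

pv = device_arg("virtio-net-pci", {"failover-group-id": gid})
bridge = device_arg("pcie-downstream", {"id": "br0", "failover-group-id": gid})
pt = device_arg("vfio-pci", {"host": "0000:3b:00.1", "bus": "br0"})

cmdline = " ".join(["-device", pv, "-device", bridge, "-device", pt])
print(cmdline)
```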
> > I'm still missing a few bits in the overall scheme.
> > 
> > Is the guest supposed to acknowledge the support for PT-PV failover?
> 
> Yes. We are leveraging virtio's feature negotiation mechanism for that.
> A guest which does not acknowledge the support will not have the PT device plugged in.
> 
> > Should the PT device be visible to the guest before it acknowledges the
> > support for failover?
> No. QEMU will only expose the PT device after the guest acknowledges the
> support through virtio's feature negotiation.
> 
> >    How is this supposed to work with legacy guests that don't support it?
> Only the PV device will be exposed to a legacy guest.

So how is this coordination going to work?  One possibility is that the
PV device emits a QMP event upon the guest driver confirming the support
for failover, the management layer intercepts the event and performs
device_add of the PT device.  Another is that the PT device is added
from the very beginning (e.g. on the QEMU command line) but its parent
PCI bridge subscribes a callback with the PV device to "activate" the PT
device upon negotiating the failover feature.

I think this needs to be decided within the scope of this patchset.
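[Editorial note: a minimal sketch of the first possibility follows, with the QMP socket handling left out. The "FAILOVER_NEGOTIATED" event name is hypothetical; no such event existed at the time of this discussion.]

```python
# Sketch: the management layer watches the QMP event stream and hot-adds
# the PT device once the guest negotiates failover. The event name
# "FAILOVER_NEGOTIATED" is a placeholder, not an existing QEMU event.

import json

def on_qmp_event(event, pt_device):
    """Return the QMP command to issue in response to an event, or None."""
    if event.get("event") == "FAILOVER_NEGOTIATED":
        # Guest PV driver acknowledged the failover feature: plug in the
        # PT device under the bridge carrying the matching group id.
        return {"execute": "device_add", "arguments": pt_device}
    return None

pt = {"driver": "vfio-pci", "host": "0000:3b:00.1", "bus": "br0"}
cmd = on_qmp_event({"event": "FAILOVER_NEGOTIATED",
                    "data": {"device-id": "net0"}}, pt)
print(json.dumps(cmd))
```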

> > Is the guest supposed to signal the datapath switch to the host?
> No, the guest doesn't need to initiate the datapath switch at all.

What happens if the guest supports failover in its PV driver, but lacks
the driver for the PT device?

> However, QMP
> events may be generated when exposing or hiding the PT device through hot
> plug/unplug, to help the host switch the datapath.

The PT device hot plug/unplug are initiated by the host, aren't they?  Why
would it also need QMP events for them?

> > Is the scheme going to be applied/extended to other transports (vmbus,
> > virtio-ccw, etc.)?
> Well, it depends on the use case, and on how feasibly it can be extended to
> other transports, given their constraints and specifics.
> 
> > Is the failover group concept going to be used beyond PT-PV network
> > device failover?
> Although the concept of failover group is generic, the implementation itself
> may vary.

My point with these two questions is that since this patchset is
defining external interfaces -- with guest OS, with management layer --
which are not easy to change later, it might make sense to try and see
if the interfaces map to other use cases.  E.g. I think we can get enough
information on how Hyper-V handles PT-PV network device failover from
the current Linux implementation; it may be a good idea to share some
concepts and workflows with virtio-pci.

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-03  9:58     ` Roman Kagan
@ 2018-07-03 14:28         ` Venu Busireddy
  2018-07-03 22:27         ` [virtio-dev] " si-wei liu
  2018-07-04  5:43         ` [virtio-dev] " Michael S. Tsirkin
  2 siblings, 0 replies; 106+ messages in thread
From: Venu Busireddy @ 2018-07-03 14:28 UTC (permalink / raw)
  To: Roman Kagan, si-wei liu, Michael S . Tsirkin, Marcel Apfelbaum,
	virtio-dev, qemu-devel

On 2018-07-03 12:58:25 +0300, Roman Kagan wrote:
> On Mon, Jul 02, 2018 at 02:14:52PM -0700, si-wei liu wrote:
> > On 7/2/2018 9:14 AM, Roman Kagan wrote:
> > > On Fri, Jun 29, 2018 at 05:19:03PM -0500, Venu Busireddy wrote:
> > > > The patch set "Enable virtio_net to act as a standby for a passthru
> > > > device" [1] deals with live migration of guests that use passthrough
> > > > devices. However, that scheme uses the MAC address for pairing
> > > > the virtio device and the passthrough device. The thread "netvsc:
> > > > refactor notifier/event handling code to use the failover framework"
> > > > [2] discusses an alternate mechanism, such as using a UUID, for pairing
> > > > the devices. Based on that discussion, proposals "Add "Group Identifier"
> > > > to virtio PCI capabilities." [3] and "RFC: Use of bridge devices to
> > > > store pairing information..." [4] were made.
> > > > 
> > > > The current patch set includes all the feedback received for proposals [3]
> > > > and [4]. For the sake of completeness, the patch for the virtio specification
> > > > is also included here. The following is the updated proposal.
> > > > 
> > > > 1. Extend the virtio specification to include a new virtio PCI capability
> > > >     "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> > > > 
> > > > 2. Enhance the QEMU CLI to include a "failover-group-id" option to the
> > > >     virtio device. The "failover-group-id" is a 64 bit value.
> > > > 
> > > > 3. Enhance the QEMU CLI to include a "failover-group-id" option to the
> > > >     Red Hat PCI bridge device (support for the i440FX model).
> > > > 
> > > > 4. Add a new "pcie-downstream" device, with the option
> > > >     "failover-group-id" (support for the Q35 model).
> > > > 
> > > > 5. The operator creates a 64 bit unique identifier, failover-group-id.
> > > > 
> > > > 6. When the virtio device is created, the operator uses the
> > > >     "failover-group-id" option (for example, '-device
> > > >     virtio-net-pci,failover-group-id=<identifier>') and specifies the
> > > >     failover-group-id created in step 5.
> > > > 
> > > >     QEMU stores the failover-group-id in the virtio device's configuration
> > > >     space in the capability "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> > > > 
> > > > 7. When assigning a PCI device to the guest in passthrough mode, the
> > > >     operator first creates a bridge using the "failover-group-id" option
> > > >     (for example, '-device pcie-downstream,failover-group-id=<identifier>')
> > > >     to specify the failover-group-id created in step 5, and then attaches
> > > >     the passthrough device to the bridge.
> > > > 
> > > >     QEMU stores the failover-group-id in the configuration space of the
> > > >     bridge as Vendor-Specific capability (0x09). The "Vendor" here is
> > > >     not to be confused with a specific organization. Instead, the vendor
> > > >     of the bridge is QEMU.
> > > > 
> > > > 8. Patch 4 in patch series "Enable virtio_net to act as a standby for
> > > >     a passthru device" [1] needs to be modified to use the failover-group-id values
> > > >     present in the bridge's configuration space and the virtio device's
> > > >     configuration space instead of the MAC address for pairing the devices.
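[Editorial note: for illustration, the vendor-specific capability from step 7 above might be packed as follows. The PCI spec fixes only the first three bytes (capability ID 0x09, next pointer, length); placing the 64-bit identifier directly in the vendor-defined body is an assumption of this sketch, not something the patch set specifies.]

```python
# Rough sketch of the bridge's Vendor-Specific capability (ID 0x09)
# carrying the 64-bit failover-group-id. Only the 3-byte header layout
# comes from the PCI spec; the payload placement is illustrative.

import struct

PCI_CAP_ID_VNDR = 0x09

def build_group_id_cap(next_ptr, group_id):
    """Pack a vendor-specific capability carrying a 64-bit group id."""
    body = struct.pack("<Q", group_id)      # vendor-defined payload
    length = 3 + len(body)                  # ID + next + len + payload
    return struct.pack("<BBB", PCI_CAP_ID_VNDR, next_ptr, length) + body

cap = build_group_id_cap(0x00, 0xDEADBEEF12345678)
print(cap.hex())
```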
> > > I'm still missing a few bits in the overall scheme.
> > > 
> > > Is the guest supposed to acknowledge the support for PT-PV failover?
> > 
> > Yes. We are leveraging virtio's feature negotiation mechanism for that.
> > A guest which does not acknowledge the support will not have the PT device plugged in.
> > 
> > > Should the PT device be visible to the guest before it acknowledges the
> > > support for failover?
> > No. QEMU will only expose the PT device after the guest acknowledges the
> > support through virtio's feature negotiation.
> > 
> > >    How is this supposed to work with legacy guests that don't support it?
> > Only the PV device will be exposed to a legacy guest.
> 
> So how is this coordination going to work?  One possibility is that the
> PV device emits a QMP event upon the guest driver confirming the support
> for failover, the management layer intercepts the event and performs
> device_add of the PT device.  Another is that the PT device is added
> from the very beginning (e.g. on the QEMU command line) but its parent
> PCI bridge subscribes a callback with the PV device to "activate" the PT
> device upon negotiating the failover feature.
> 
> I think this needs to be decided within the scope of this patchset.
> 
> > > Is the guest supposed to signal the datapath switch to the host?
> > No, the guest doesn't need to initiate the datapath switch at all.
> 
> What happens if the guest supports failover in its PV driver, but lacks
> the driver for the PT device?
> 
> > However, QMP
> > events may be generated when exposing or hiding the PT device through hot
> > plug/unplug, to help the host switch the datapath.
> 
> The PT device hot plug/unplug are initiated by the host, aren't they?  Why
> would it also need QMP events for them?
> 
> > > Is the scheme going to be applied/extended to other transports (vmbus,
> > > virtio-ccw, etc.)?
> > Well, it depends on the use case, and on how feasibly it can be extended to
> > other transports, given their constraints and specifics.
> > 
> > > Is the failover group concept going to be used beyond PT-PV network
> > > device failover?
> > Although the concept of failover group is generic, the implementation itself
> > may vary.
> 
> My point with these two questions is that since this patchset is
> defining external interfaces -- with guest OS, with management layer --

This patch set is not defining any external interfaces. All it does is
provide the means and locations to store the "group identifier". How
that info will be used should, I thought, be another patch set.

Venu

> which are not easy to change later, it might make sense to try and see
> if the interfaces map to other use cases.  E.g. I think we can get enough
> information on how Hyper-V handles PT-PV network device failover from
> the current Linux implementation; it may be a good idea to share some
> concepts and workflows with virtio-pci.
> 
> Thanks,
> Roman.

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-03 14:28         ` [virtio-dev] " Venu Busireddy
@ 2018-07-03 14:52           ` Cornelia Huck
  -1 siblings, 0 replies; 106+ messages in thread
From: Cornelia Huck @ 2018-07-03 14:52 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: Roman Kagan, si-wei liu, Michael S . Tsirkin, Marcel Apfelbaum,
	virtio-dev, qemu-devel

On Tue, 3 Jul 2018 09:28:17 -0500
Venu Busireddy <venu.busireddy@oracle.com> wrote:

> On 2018-07-03 12:58:25 +0300, Roman Kagan wrote:
> > On Mon, Jul 02, 2018 at 02:14:52PM -0700, si-wei liu wrote:  
> > > On 7/2/2018 9:14 AM, Roman Kagan wrote:  
> > > > Is the scheme going to be applied/extended to other transports (vmbus,
> > > > virtio-ccw, etc.)?  
> > > Well, it depends on the use case, and how feasible it can be extended to
> > > other transport due to constraints and transport specifics.
> > >   
> > > > Is the failover group concept going to be used beyond PT-PV network
> > > > device failover?  
> > > Although the concept of failover group is generic, the implementation itself
> > > may vary.  
> > 
> > My point with these two questions is that since this patchset is
> > defining external interfaces -- with guest OS, with management layer --  
> 
> This patch set is not defining any external interfaces. All it does is
> provide the means and locations to store the "group identifier". How
> that info will be used should, I thought, be another patch set.
> 
> Venu
> 
> > which are not easy to change later, it might make sense to try and see
> > if the interfaces map to other use cases.  E.g. I think we can get enough
> > information on how Hyper-V handles PT-PV network device failover from
> > the current Linux implementation; it may be a good idea to share some
> > concepts and workflows with virtio-pci.

But this patch set defines a host<->guest interface to make pairing
information on the host available to the guest, no?

From my point of view, there are several concerns:
- This approach assumes a homogeneous pairing (same transport), and
  even more, it assumes that this transport is PCI.
- It won't work for zPCI (although zPCI is really strange) -- this
  means it will be completely unusable on s390x.
- It is too focused on a narrow use case. How is it supposed to be
  extended?

What I would prefer:
- Implement a pairing id support that does not rely on a certain
  transport, but leverages virtio (which is in the game anyway). We'd
  get at least the "virtio-net device paired with vfio" use case, which
  is what is currently implemented in the Linux kernel.
- Think about a more generic way to relay configuration metadata to the
  host.

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-03 14:28         ` [virtio-dev] " Venu Busireddy
  (?)
  (?)
@ 2018-07-03 15:34         ` Roman Kagan
  -1 siblings, 0 replies; 106+ messages in thread
From: Roman Kagan @ 2018-07-03 15:34 UTC (permalink / raw)
  To: Venu Busireddy
  Cc: si-wei liu, Michael S . Tsirkin, Marcel Apfelbaum, virtio-dev,
	qemu-devel

On Tue, Jul 03, 2018 at 09:28:17AM -0500, Venu Busireddy wrote:
> On 2018-07-03 12:58:25 +0300, Roman Kagan wrote:
> > My point with these two questions is that since this patchset is
> > defining external interfaces -- with guest OS, with management layer --
> 
> This patch set is not defining any external interfaces. All it does is
> provide the means and locations to store the "group identifier". How
> that info will be used should, I thought, be another patch set.

Device properties are a part of the interface to the management layer.
PCI vendor capabilities are a part of the interface to the guest OS.

So yes, I think it makes sense to have at least a rough plan on how it's
going to be used.

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-03  9:58     ` Roman Kagan
@ 2018-07-03 22:27         ` si-wei liu
  2018-07-03 22:27         ` [virtio-dev] " si-wei liu
  2018-07-04  5:43         ` [virtio-dev] " Michael S. Tsirkin
  2 siblings, 0 replies; 106+ messages in thread
From: si-wei liu @ 2018-07-03 22:27 UTC (permalink / raw)
  To: Roman Kagan, Venu Busireddy, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel



On 7/3/2018 2:58 AM, Roman Kagan wrote:
> On Mon, Jul 02, 2018 at 02:14:52PM -0700, si-wei liu wrote:
>> On 7/2/2018 9:14 AM, Roman Kagan wrote:
>>> On Fri, Jun 29, 2018 at 05:19:03PM -0500, Venu Busireddy wrote:
>>>> The patch set "Enable virtio_net to act as a standby for a passthru
>>>> device" [1] deals with live migration of guests that use passthrough
>>>> devices. However, that scheme uses the MAC address for pairing
>>>> the virtio device and the passthrough device. The thread "netvsc:
>>>> refactor notifier/event handling code to use the failover framework"
>>>> [2] discusses an alternate mechanism, such as using a UUID, for pairing
>>>> the devices. Based on that discussion, proposals "Add "Group Identifier"
>>>> to virtio PCI capabilities." [3] and "RFC: Use of bridge devices to
>>>> store pairing information..." [4] were made.
>>>>
>>>> The current patch set includes all the feedback received for proposals [3]
>>>> and [4]. For the sake of completeness, the patch for the virtio specification
>>>> is also included here. The following is the updated proposal.
>>>>
>>>> 1. Extend the virtio specification to include a new virtio PCI capability
>>>>      "VIRTIO_PCI_CAP_GROUP_ID_CFG".
>>>>
>>>> 2. Enhance the QEMU CLI to include a "failover-group-id" option to the
>>>>      virtio device. The "failover-group-id" is a 64 bit value.
>>>>
>>>> 3. Enhance the QEMU CLI to include a "failover-group-id" option to the
>>>>      Red Hat PCI bridge device (support for the i440FX model).
>>>>
>>>> 4. Add a new "pcie-downstream" device, with the option
>>>>      "failover-group-id" (support for the Q35 model).
>>>>
>>>> 5. The operator creates a 64 bit unique identifier, failover-group-id.
>>>>
>>>> 6. When the virtio device is created, the operator uses the
>>>>      "failover-group-id" option (for example, '-device
>>>>      virtio-net-pci,failover-group-id=<identifier>') and specifies the
>>>>      failover-group-id created in step 5.
>>>>
>>>>      QEMU stores the failover-group-id in the virtio device's configuration
>>>>      space in the capability "VIRTIO_PCI_CAP_GROUP_ID_CFG".
>>>>
>>>> 7. When assigning a PCI device to the guest in passthrough mode, the
>>>>      operator first creates a bridge using the "failover-group-id" option
>>>>      (for example, '-device pcie-downstream,failover-group-id=<identifier>')
>>>>      to specify the failover-group-id created in step 5, and then attaches
>>>>      the passthrough device to the bridge.
>>>>
>>>>      QEMU stores the failover-group-id in the configuration space of the
>>>>      bridge as Vendor-Specific capability (0x09). The "Vendor" here is
>>>>      not to be confused with a specific organization. Instead, the vendor
>>>>      of the bridge is QEMU.
>>>>
>>>> 8. Patch 4 in patch series "Enable virtio_net to act as a standby for
>>>>      a passthru device" [1] needs to be modified to use the failover-group-id values
>>>>      present in the bridge's configuration space and the virtio device's
>>>>      configuration space instead of the MAC address for pairing the devices.
>>> I'm still missing a few bits in the overall scheme.
>>>
>>> Is the guest supposed to acknowledge the support for PT-PV failover?
>> Yes. We are leveraging virtio's feature negotiation mechanism for that.
>> A guest which does not acknowledge the support will not have the PT device plugged in.
>>
>>> Should the PT device be visible to the guest before it acknowledges the
>>> support for failover?
>> No. QEMU will only expose the PT device after the guest acknowledges the
>> support through virtio's feature negotiation.
>>
>>>     How is this supposed to work with legacy guests that don't support it?
>> Only the PV device will be exposed to a legacy guest.
> So how is this coordination going to work?  One possibility is that the
> PV device emits a QMP event upon the guest driver confirming the support
> for failover, the management layer intercepts the event and performs
> device_add of the PT device.  Another is that the PT device is added
> from the very beginning (e.g. on the QEMU command line) but its parent
> PCI bridge subscribes a callback with the PV device to "activate" the PT
> device upon negotiating the failover feature.
>
> I think this needs to be decided within the scope of this patchset.
As discussed in the earlier thread below, we would go with the approach
that QEMU manages the visibility of the PT device automatically. The
management layer supplies the PT device to QEMU from the very beginning.
The PT device won't be exposed to the guest until the guest virtio
driver acknowledges the backup feature. Once the virtio driver in the
guest initiates a device reset, the corresponding PT device must be
taken away from the guest, and then added back later on after the guest
virtio driver completes negotiation of the backup feature.

https://patchwork.ozlabs.org/patch/909976/
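[Editorial note: the visibility policy described above can be sketched as a small state machine. The feature-bit value matches the standby proposal (bit 62), but the class and method names here are illustrative only.]

```python
# Minimal sketch: QEMU keeps the PT device hidden until the guest
# acknowledges the backup/standby feature, and hides it again on reset.

VIRTIO_NET_F_STANDBY = 1 << 62   # standby ("backup") feature bit

class PtVisibility:
    def __init__(self):
        self.exposed = False

    def on_features_ok(self, guest_features):
        # Expose the PT device only once the guest driver has
        # acknowledged the standby feature.
        if guest_features & VIRTIO_NET_F_STANDBY:
            self.exposed = True

    def on_device_reset(self):
        # A reset (e.g. reboot, driver unload) takes the PT device away;
        # it is re-added after the next successful negotiation.
        self.exposed = False

vis = PtVisibility()
vis.on_features_ok(0)                       # legacy guest: stays hidden
legacy = vis.exposed
vis.on_features_ok(VIRTIO_NET_F_STANDBY)    # feature acked: exposed
acked = vis.exposed
vis.on_device_reset()                       # reset: hidden again
```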

>
>>> Is the guest supposed to signal the datapath switch to the host?
>> No, the guest doesn't need to initiate the datapath switch at all.
> What happens if the guest supports failover in its PV driver, but lacks
> the driver for the PT device?
The failover driver assumes that the primary (the PT device) will be
able to provide a datapath once it shows up in the guest. If a PT
device is added to an unsupported guest, the result is the same as with
no standby PV driver: basically no networking, since there is no
working driver.

Then perhaps don't add the PT device in the first place if the guest
lacks driver support?

>
>> However, QMP
>> events may be generated when exposing or hiding the PT device through hot
>> plug/unplug, to help the host switch the datapath.
> The PT device hot plug/unplug are initiated by the host, aren't they?  Why
> would it also need QMP events for them?
As indicated above, the hot plug/unplug are initiated by QEMU, not the
management layer. Hence the QMP hot plug event is used as an indicator
to switch the host datapath. Unlike the Windows Hyper-V SR-IOV driver
model, the Linux host network stack does not offer a fine-grained PF
driver API to move the MAC/VLAN filter, and the VF driver has to start
with some initial MAC address filter programmed in when present in the
guest. The QMP event serves as a checkpoint at which to switch the MAC
and/or VLAN filter between the PV device and the VF.
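[Editorial note: the checkpoint use of QMP events described above can be sketched as a dispatcher. DEVICE_DELETED is a real QMP event; the plug-side event name and the action strings are placeholders for whatever the host's datapath-switch logic actually does.]

```python
# Sketch: map QMP hot plug/unplug events on the PT (VF) device to host
# datapath actions, i.e. moving the MAC/VLAN filter between PV and VF.

def datapath_action(event, pt_id):
    """Return the host datapath action for a QMP event, or None."""
    data = event.get("data", {})
    if data.get("device") != pt_id:
        return None                   # event concerns some other device
    if event["event"] == "DEVICE_DELETED":
        return "move-filters-to-pv"   # VF gone: steer traffic back to PV
    if event["event"] == "DEVICE_ADDED":   # hypothetical plug-side event
        return "move-filters-to-vf"   # VF visible: program its MAC filter
    return None

act = datapath_action({"event": "DEVICE_DELETED",
                       "data": {"device": "vf0"}}, "vf0")
```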

>
>>> Is the scheme going to be applied/extended to other transports (vmbus,
>>> virtio-ccw, etc.)?
>> Well, it depends on the use case, and on how feasibly it can be extended to
>> other transports, given their constraints and specifics.
>>
>>> Is the failover group concept going to be used beyond PT-PV network
>>> device failover?
>> Although the concept of failover group is generic, the implementation itself
>> may vary.
> My point with these two questions is that since this patchset is
> defining external interfaces -- with guest OS, with management layer --
> which are not easy to change later, it might make sense to try and see
> if the interfaces map to other use cases.  E.g. I think we can get enough
> information on how Hyper-V handles PT-PV network device failover from
> the current Linux implementation; it may be a good idea to share some
> concepts and workflows with virtio-pci.
As you may see from above, the handshake of virtio failover depends on 
hot plug (PCI or ACPI) and virtio specifics (feature negotiation). So 
far as I see the Hyper-V uses a completely different handshake protocol 
of its own (guest initiated datapath switch, Serial number in VMBus PCI 
bridge) than that of virtio. I can barely imagine how code could be 
implemented in a shared manner, although I agree conceptually failover 
group between these two is similar or the same.

-Siwei

>
> Thanks,
> Roman.
>
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* [virtio-dev] Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
@ 2018-07-03 22:27         ` si-wei liu
  0 siblings, 0 replies; 106+ messages in thread
From: si-wei liu @ 2018-07-03 22:27 UTC (permalink / raw)
  To: Roman Kagan, Venu Busireddy, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel



On 7/3/2018 2:58 AM, Roman Kagan wrote:
> On Mon, Jul 02, 2018 at 02:14:52PM -0700, si-wei liu wrote:
>> On 7/2/2018 9:14 AM, Roman Kagan wrote:
>>> On Fri, Jun 29, 2018 at 05:19:03PM -0500, Venu Busireddy wrote:
>>>> The patch set "Enable virtio_net to act as a standby for a passthru
>>>> device" [1] deals with live migration of guests that use passthrough
>>>> devices. However, that scheme uses the MAC address for pairing
>>>> the virtio device and the passthrough device. The thread "netvsc:
>>>> refactor notifier/event handling code to use the failover framework"
>>>> [2] discusses an alternate mechanism, such as using an UUID, for pairing
>>>> the devices. Based on that discussion, proposals "Add "Group Identifier"
>>>> to virtio PCI capabilities." [3] and "RFC: Use of bridge devices to
>>>> store pairing information..." [4] were made.
>>>>
>>>> The current patch set includes all the feedback received for proposals [3]
>>>> and [4]. For the sake of completeness, patch for the virtio specification
>>>> is also included here. Following is the updated proposal.
>>>>
>>>> 1. Extend the virtio specification to include a new virtio PCI capability
>>>>      "VIRTIO_PCI_CAP_GROUP_ID_CFG".
>>>>
>>>> 2. Enhance the QEMU CLI to include a "failover-group-id" option to the
>>>>      virtio device. The "failover-group-id" is a 64 bit value.
>>>>
>>>> 3. Enhance the QEMU CLI to include a "failover-group-id" option to the
>>>>      Red Hat PCI bridge device (support for the i440FX model).
>>>>
>>>> 4. Add a new "pcie-downstream" device, with the option
>>>>      "failover-group-id" (support for the Q35 model).
>>>>
>>>> 5. The operator creates a 64 bit unique identifier, failover-group-id.
>>>>
>>>> 6. When the virtio device is created, the operator uses the
>>>>      "failover-group-id" option (for example, '-device
>>>>      virtio-net-pci,failover-group-id=<identifier>') and specifies the
>>>>      failover-group-id created in step 4.
>>>>
>>>>      QEMU stores the failover-group-id in the virtio device's configuration
>>>>      space in the capability "VIRTIO_PCI_CAP_GROUP_ID_CFG".
>>>>
>>>> 7. When assigning a PCI device to the guest in passthrough mode, the
>>>>      operator first creates a bridge using the "failover-group-id" option
>>>>      (for example, '-device pcie-downstream,failover-group-id=<identifier>')
>>>>      to specify the failover-group-id created in step 4, and then attaches
>>>>      the passthrough device to the bridge.
>>>>
>>>>      QEMU stores the failover-group-id in the configuration space of the
>>>>      bridge as Vendor-Specific capability (0x09). The "Vendor" here is
>>>>      not to be confused with a specific organization. Instead, the vendor
>>>>      of the bridge is QEMU.
>>>>
>>>> 8. Patch 4 in patch series "Enable virtio_net to act as a standby for
>>>>      a passthru device" [1] needs to be modified to use the UUID values
>>>>      present in the bridge's configuration space and the virtio device's
>>>>      configuration space instead of the MAC address for pairing the devices.
>>> I'm still missing a few bits in the overall scheme.
>>>
>>> Is the guest supposed to acknowledge the support for PT-PV failover?
>> Yes. We are leveraging virtio's feature negotiation mechanism for that.
>> Guest which does not acknowledge the support will not have PT plugged in.
>>
>>> Should the PT device be visibile to the guest before it acknowledges the
>>> support for failover?
>> No. QEMU will only expose PT device after guest acknowledges the support
>> through virtio's feature negotiation.
>>
>>>     How is this supposed to work with legacy guests that don't support it?
>> Only PV device will be exposed on legacy guest.
> So how is this coordination going to work?  One possibility is that the
> PV device emits a QMP event upon the guest driver confirming the support
> for failover, the management layer intercepts the event and performs
> device_add of the PT device.  Another is that the PT device is added
> from the very beginning (e.g. on the QEMU command line) but its parent
> PCI bridge subscribes a callback with the PV device to "activate" the PT
> device upon negotiating the failover feature.
>
> I think this needs to be decided within the scope of this patchset.
As discussed in the thread below, we would go with the approach that
QEMU manages the visibility of the PT device automatically. The
management layer supplies the PT device to QEMU from the very
beginning. The PT device is not exposed to the guest until the guest
virtio driver acknowledges the backup feature. Whenever the virtio
driver in the guest initiates a device reset, the corresponding PT
device must be removed from the guest, and added back later once the
guest virtio driver completes negotiation of the backup feature.

https://patchwork.ozlabs.org/patch/909976/
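For reference, a minimal sketch of how a management layer might assemble
the relevant QEMU arguments under this flow. The option names
(failover-group-id, pcie-downstream) come from the cover letter's
proposal, not from merged QEMU, and the netdev id and VF host address
are made up:

```python
# Sketch only: option names follow the v3 proposal, not merged QEMU.
GROUP_ID = 0x1122334455667788  # operator-chosen 64-bit failover group id

def build_qemu_args(group_id, vf_host_addr="0000:3b:10.0"):
    """Build the device arguments pairing a virtio-net standby with a VF."""
    gid = "0x%x" % group_id
    return [
        # Standby PV device carries the group id in its virtio config space.
        "-device", "virtio-net-pci,netdev=net0,failover-group-id=%s" % gid,
        # Bridge stores the same id in a vendor-specific PCI capability;
        # the passthrough VF is attached under it from the very beginning.
        "-device", "pcie-downstream,id=br0,failover-group-id=%s" % gid,
        "-device", "vfio-pci,host=%s,bus=br0" % vf_host_addr,
    ]

args = build_qemu_args(GROUP_ID)
```

QEMU would then defer exposing the vfio-pci device to the guest until
the backup feature is negotiated, as described above.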

>
>>> Is the guest supposed to signal the datapath switch to the host?
>> No, guest doesn't need to be initiating datapath switch at all.
> What happens if the guest supports failover in its PV driver, but lacks
> the driver for the PT device?
The assumption of the failover driver is that the primary (PT device)
will be able to get a datapath once it shows up in the guest. Adding a
PT device to an unsupported guest yields the same result as running
without a standby PV driver: no networking, because there is no
working driver.

Then perhaps don't add the PT device in the first place if the guest
lacks driver support?

>
>> However, QMP
>> events may be generated when exposing or hiding the PT device through hot
>> plug/unplug to facilitate host to switch datapath.
> The PT device hot plug/unplug are initiated by the host, aren't they?  Why
> would it also need QMP events for them?
As indicated above, the hot plug/unplug is initiated by QEMU, not the
management layer. Hence the QMP hot plug event is used as an indicator
to switch the host datapath. Unlike the Windows Hyper-V SR-IOV driver
model, the Linux host network stack does not offer a fine-grained PF
driver API to move the MAC/VLAN filter, and the VF driver has to start
with some initial MAC address filter programmed in when present in the
guest. The QMP event serves as a checkpoint for switching the MAC
filter and/or VLAN filter between the PV device and the VF.
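To illustrate the checkpoint idea, a toy sketch of a host-side agent
dispatching QMP events to filter-move actions. DEVICE_DELETED is an
existing QMP event; the plug-side event name "DEVICE_PLUGGED" and the
filter-moving callbacks are placeholders, since the exact notification
for this flow is what is under discussion:

```python
# Toy host-side agent: switch MAC/VLAN filters on QMP hotplug checkpoints.
# DEVICE_DELETED is a real QMP event; "DEVICE_PLUGGED" is a placeholder.

def move_filters_to_pv(mac):
    # Placeholder for reprogramming the MAC/VLAN filter onto the PV backend.
    print("programming MAC %s onto the PV backend" % mac)

def move_filters_to_vf(mac):
    # Placeholder for reprogramming the MAC/VLAN filter onto the VF.
    print("programming MAC %s onto the VF" % mac)

def handle_qmp_event(event, mac="52:54:00:12:34:56"):
    """Dispatch one parsed QMP event dict to a datapath switch action."""
    name = event.get("event")
    if name == "DEVICE_DELETED":        # VF unplugged: fall back to PV
        move_filters_to_pv(mac)
        return "pv"
    if name == "DEVICE_PLUGGED":        # placeholder plug notification
        move_filters_to_vf(mac)
        return "vf"
    return None                         # unrelated event, ignore
```

A real agent would read these events off the QMP socket; here the
dispatcher is isolated so the checkpoint logic is visible.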

>
>>> Is the scheme going to be applied/extended to other transports (vmbus,
>>> virtio-ccw, etc.)?
>> Well, it depends on the use case, and how feasible it can be extended to
>> other transport due to constraints and transport specifics.
>>
>>> Is the failover group concept going to be used beyond PT-PV network
>>> device failover?
>> Although the concept of failover group is generic, the implementation itself
>> may vary.
> My point with these two questions is that since this patchset is
> defining external interfaces -- with guest OS, with management layer --
> which are not easy to change later, it might make sense to try and see
> if the interfaces map to other usecases.  E.g. I think we can get enough
> information on how Hyper-V handles PT-PV network device failover from
> the current Linux implementation; it may be a good idea to share some
> concepts and workflows with virtio-pci.
As you may see from the above, the handshake of virtio failover
depends on hot plug (PCI or ACPI) and virtio specifics (feature
negotiation). As far as I can see, Hyper-V uses a completely different
handshake protocol of its own (guest-initiated datapath switch, serial
number in the VMBus PCI bridge) than virtio does. I can barely imagine
how the code could be implemented in a shared manner, although I agree
the failover group concept is conceptually similar, or the same,
between the two.

-Siwei

>
> Thanks,
> Roman.
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-03 14:52           ` [virtio-dev] Re: [Qemu-devel] " Cornelia Huck
@ 2018-07-03 23:31             ` Siwei Liu
  -1 siblings, 0 replies; 106+ messages in thread
From: Siwei Liu @ 2018-07-03 23:31 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Venu Busireddy, Roman Kagan, si-wei liu, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel

On Tue, Jul 3, 2018 at 7:52 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> On Tue, 3 Jul 2018 09:28:17 -0500
> Venu Busireddy <venu.busireddy@oracle.com> wrote:
>
>> On 2018-07-03 12:58:25 +0300, Roman Kagan wrote:
>> > On Mon, Jul 02, 2018 at 02:14:52PM -0700, si-wei liu wrote:
>> > > On 7/2/2018 9:14 AM, Roman Kagan wrote:
>> > > > Is the scheme going to be applied/extended to other transports (vmbus,
>> > > > virtio-ccw, etc.)?
>> > > Well, it depends on the use case, and how feasible it can be extended to
>> > > other transport due to constraints and transport specifics.
>> > >
>> > > > Is the failover group concept going to be used beyond PT-PV network
>> > > > device failover?
>> > > Although the concept of failover group is generic, the implementation itself
>> > > may vary.
>> >
>> > My point with these two questions is that since this patchset is
>> > defining external interfaces -- with guest OS, with management layer --
>>
>> This patch set is not defining any external interfaces. All it is doing
>> is providing the means and locations to store the "group identifier". How
>> that info will be used, I thought, should be another patch set.
>>
>> Venu
>>
>> > which are not easy to change later, it might make sense to try and see
>> > if the interfaces map to other usecases.  E.g. I think we can get enough
>> > information on how Hyper-V handles PT-PV network device failover from
>> > the current Linux implementation; it may be a good idea to share some
>> > concepts and workflows with virtio-pci.
>
> But this patch set defines a host<->guest interface to make pairing
> information on the host available to the guest, no?
>
> From my point of view, there are several concerns:
> - This approach assumes a homogeneous pairing (same transport), and
>   even more, it assumes that this transport is pci.

Not really.

There could be some other place to define a generic
(transport-independent) virtio feature, while the data (group ID) can
be stored in a transport-specific way. That generic virtio feature,
and the way to specify the target transport to group with, are yet to
be defined. I don't see this patch being in conflict with that
direction.


> - It won't work for zPCI (although zPCI is really strange) -- this
>   means it will be completely unusable on s390x.
I still need more information about this use case. First off, does
zPCI support all the hot plug semantics or functionality the same way
as PCI? Or does there have to be some platform or firmware support,
like ACPI hotplug? Does QEMU have all the pieces ready for s390 zPCI
hotplug?

Does the s390x use case you have in mind concern VFIO migration, or
replacement of a PT device with a backup virtio-ccw path? Or something
else?

The assumption of SR-IOV migration is that hotplug is used as an
indicator for the datapath switch, which maps to moving the MAC
address or VLAN filter between the PV device and the VF. I am not sure
how that maps to s390x and zPCI with regard to host coordination.
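A small sketch of what group-id pairing (rather than MAC pairing)
might look like in the failover matching logic; the device records
here are invented for illustration:

```python
# Illustration only: pair a standby PV device with a primary VF by the
# 64-bit failover group id instead of by MAC address.

def pair_by_group_id(pv_devices, vf_devices):
    """Return {pv_name: vf_name} for devices sharing a failover group id."""
    by_gid = {vf["group_id"]: vf["name"] for vf in vf_devices}
    return {
        pv["name"]: by_gid[pv["group_id"]]
        for pv in pv_devices
        if pv["group_id"] in by_gid
    }

pairs = pair_by_group_id(
    [{"name": "virtio0", "group_id": 0xABCD}],
    [{"name": "vf0", "group_id": 0xABCD}, {"name": "vf1", "group_id": 0x1}],
)
```

Unlike MAC pairing, this matching does not depend on the VF's initial
filter programming, which is the coordination problem described above.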

-Siwei

> - It is too focused on a narrow use case. How is it supposed to be
>   extended?
>
> What I would prefer:
> - Implement a pairing id support that does not rely on a certain
>   transport, but leverages virtio (which is in the game anyway). We'd
>   get at least the "virtio-net device paired with vfio" use case, which
>   is what is currently implemented in the Linux kernel.
> - Think about a more generic way to relay configuration metadata to the
>   host.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-03  9:58     ` Roman Kagan
@ 2018-07-04  5:43         ` Michael S. Tsirkin
  2018-07-03 22:27         ` [virtio-dev] " si-wei liu
  2018-07-04  5:43         ` [virtio-dev] " Michael S. Tsirkin
  2 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-04  5:43 UTC (permalink / raw)
  To: Roman Kagan, si-wei liu, Venu Busireddy, Marcel Apfelbaum,
	virtio-dev, qemu-devel

On Tue, Jul 03, 2018 at 12:58:25PM +0300, Roman Kagan wrote:
> > >    How is this supposed to work with legacy guests that don't support it?
> > Only PV device will be exposed on legacy guest.
> 
> So how is this coordination going to work?

Here's a description:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg538867.html

-- 
MST

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-03 23:31             ` [virtio-dev] Re: [Qemu-devel] " Siwei Liu
@ 2018-07-04 12:15               ` Cornelia Huck
  -1 siblings, 0 replies; 106+ messages in thread
From: Cornelia Huck @ 2018-07-04 12:15 UTC (permalink / raw)
  To: Siwei Liu
  Cc: Venu Busireddy, Roman Kagan, si-wei liu, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel

On Tue, 3 Jul 2018 16:31:03 -0700
Siwei Liu <loseweigh@gmail.com> wrote:

> On Tue, Jul 3, 2018 at 7:52 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> > From my point of view, there are several concerns:
> > - This approach assumes a homogeneous pairing (same transport), and
> >   even more, it assumes that this transport is pci.  
> 
> Not really.
> 
> There could be some other place to define a generic (transport
> independent) virtio feature, whereas the data (group ID) can be stored
> in transport specific way. That generic virtio feature and the way to
> specify target transport to group with is yet to be defined. I don't
> see this patch is in conflict with that direction.

Sorry, but I really do not see how this is not pci-specific.

- One of your components is a bridge. A transport does not necessarily
  have that concept, at least not in a way meaningful for this approach
  to work.
- Even if we can introduce transport-specific ways for other
  transports, the bridge concept still means that the pairing cannot be
  cross-transport.

I think we should be clear what we want from a generic
(transport-agnostic) virtio feature first. Probably some way to relay
an identifier of the to-be-paired device (a transport-specific id plus
information about what the transport is?)
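One way to picture such an identifier, purely as a sketch of the idea
being floated here and not a spec proposal: a transport tag plus a
transport-specific id, with the field layout invented for illustration:

```python
# Sketch of a transport-agnostic pairing identifier: the generic part
# names the transport the to-be-paired device lives on; the id value is
# interpreted per transport. Layout is illustrative, not from any spec.
from collections import namedtuple

PairingId = namedtuple("PairingId", ["transport", "ident"])

def describe(pid):
    """Render a pairing id according to its transport."""
    if pid.transport == "pci":
        return "PCI device with 64-bit group id 0x%x" % pid.ident
    if pid.transport == "ccw":
        # Assume the id packs a CCW subchannel set and device number.
        return "CCW device 0.%x.%04x" % (pid.ident >> 16, pid.ident & 0xFFFF)
    return "unknown transport %r" % pid.transport

pci_pair = PairingId("pci", 0x1122334455667788)
```

The point is that only the interpretation, not the relay mechanism,
needs to be transport-specific.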

> > - It won't work for zPCI (although zPCI is really strange) -- this
> >   means it will be completely unusable on s390x.  
> I still need more information about this use case. First off, does
> zPCI support all the hot plug semantics or functionality the same way
> as PCI? Or there has to be some platform or firmeware support like
> ACPI hotplug? Does QEMU have all the pieces ready for s390 zPCI
> hotplug?

zPCI is a strange beast, so first a pointer to a writeup I did:
https://virtualpenguins.blogspot.de/2018/02/notes-on-pci-on-s390x.html

It does support hotplug, within the s390 architectural context, but
that should be fine for our needs here.

My concern comes from the 'no topology' issue. We build a fake topology
in QEMU (to use the generic pci infrastructure), but the guest does not
see any of this. It issues an instruction and gets a list of functions.
This means any bridge information is not accessible to the guest.

> 
> Does the s390x use case in your mind concerns with VFIO migraition, or
> replacement of a PT device with backup virtio-ccw path? Or something
> else?

Migration with vfio is the case most likely to be relevant. I'm mostly
concerned with not needlessly closing doors, though.

> 
> As the assumption of SR-IOV migration is that, hotplug is used as an
> inidicator for datapath switch - which maps to moving MAC address or
> VLAN filter around between PV and VF. I am not sure how that maps to
> s390x and zPCI with regard to host coordination.

If we can move MAC addresses or VLAN filters on Linux/QEMU/libvirt in
general, there's no inherent reason why we shouldn't be able to do so
on s390 as well. What matters more is probably which pci network cards
are supported (currently Mellanox AFAIK, not sure if there are others).

> 
> -Siwei
> 
> > - It is too focused on a narrow use case. How is it supposed to be
> >   extended?
> >
> > What I would prefer:
> > - Implement a pairing id support that does not rely on a certain
> >   transport, but leverages virtio (which is in the game anyway). We'd
> >   get at least the "virtio-net device paired with vfio" use case, which
> >   is what is currently implemented in the Linux kernel.
> > - Think about a more generic way to relay configuration metadata to the
> >   host.
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> >  

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-04 12:15               ` [virtio-dev] Re: [Qemu-devel] " Cornelia Huck
@ 2018-07-06  0:49                 ` Siwei Liu
  -1 siblings, 0 replies; 106+ messages in thread
From: Siwei Liu @ 2018-07-06  0:49 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Venu Busireddy, Roman Kagan, si-wei liu, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel

On Wed, Jul 4, 2018 at 5:15 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> On Tue, 3 Jul 2018 16:31:03 -0700
> Siwei Liu <loseweigh@gmail.com> wrote:
>
>> On Tue, Jul 3, 2018 at 7:52 AM, Cornelia Huck <cohuck@redhat.com> wrote:
>> > From my point of view, there are several concerns:
>> > - This approach assumes a homogeneous pairing (same transport), and
>> >   even more, it assumes that this transport is pci.
>>
>> Not really.
>>
>> There could be some other place to define a generic (transport
>> independent) virtio feature, whereas the data (group ID) can be stored
>> in transport specific way. That generic virtio feature and the way to
>> specify target transport to group with is yet to be defined. I don't
>> see this patch is in conflict with that direction.
>
> Sorry, but I really do not see how this is not pci-specific.
>
> - One of your components is a bridge. A transport does not necessarily
>   have that concept, at least not in a way meaningful for this approach
>   to work.

Assuming we have a transport-agnostic high-level virtio feature to
indicate that the target type for the to-be-paired device is PCI, we
then use the data stored in a PCI bridge to pair the device. It does
not associate the transport of virtio itself with the type of the
target device, right? The introduction of the PCI bridge is just for
storing the group ID data in the PCI passthrough device case, while
one may define and introduce other means to retrieve the info if the
target device type is not PCI, e.g. a special instruction to get the
group ID on s390x zPCI.

> - Even if we can introduce transport-specific ways for other
>   transports, the bridge concept still means that the pairing cannot be
>   cross-transport.

Sorry, I did not get this. A bridge is indeed a PCI-specific concept,
but the virtio device itself can be on another transport. Why does
that matter? I guess a higher-level feature is needed to define the
target transport for the to-be-paired device. The group ID info can
reside in the transport-specific area on the virtio device itself,
though. E.g. for virtio-pci, get it from the vendor-specific
capability in the PCI configuration space. For virtio-ccw, you may
come up with a CCW-specific way to get the group ID data. Is this not
what you had in mind?
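As a concrete illustration of reading the group ID from a
vendor-specific capability on virtio-pci, a sketch that walks the PCI
capability list in a config-space byte buffer. The capability list
walk (pointer at 0x34, cap id 0x09 for vendor-specific) is standard
PCI; the placement of the 64-bit group id after a 4-byte header is an
assumption for illustration, not taken from the spec patch:

```python
import struct

VENDOR_CAP_ID = 0x09  # PCI vendor-specific capability id

def find_group_id(cfg):
    """Walk the capability list in a 256-byte config-space buffer and
    return the 64-bit group id from the first vendor-specific
    capability. The id is assumed to sit, little-endian, right after a
    4-byte header (cap id, next pointer, length, pad); that payload
    layout is illustrative only."""
    pos = cfg[0x34]                      # capabilities pointer
    while pos:
        cap_id, nxt = cfg[pos], cfg[pos + 1]
        if cap_id == VENDOR_CAP_ID:
            return struct.unpack_from("<Q", cfg, pos + 4)[0]
        pos = nxt
    return None

# Build a fake config space with one vendor-specific capability at 0x40.
cfg = bytearray(256)
cfg[0x34] = 0x40                         # cap list starts at 0x40
cfg[0x40] = VENDOR_CAP_ID
cfg[0x41] = 0x00                         # next pointer: end of list
struct.pack_into("<Q", cfg, 0x44, 0x1122334455667788)
gid = find_group_id(cfg)
```

A CCW transport would substitute its own retrieval for this walk while
keeping the same 64-bit group id semantics.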

>
> I think we should be clear what we want from a generic
> (transport-agnostic) virtio feature first. Probably some way to relay
> an identifier of the to-be-paired device (transport-specific +
> information what the transport is?)
>
Are we talking about the transport of the virtio device itself, or of
the target device? Note that virtio features must always be retrieved
in a transport-specific way, although the semantics of a feature can
be transport-independent/agnostic. Once a virtio driver instance gets
to the point of reading feature bits from the device, the transport of
that virtio device is determined and cannot be changed later.


>> > - It won't work for zPCI (although zPCI is really strange) -- this
>> >   means it will be completely unusable on s390x.
>> I still need more information about this use case. First off, does
>> zPCI support all the hot plug semantics or functionality the same way
>> as PCI? Or there has to be some platform or firmeware support like
>> ACPI hotplug? Does QEMU have all the pieces ready for s390 zPCI
>> hotplug?
>
> zPCI is a strange beast, so first a pointer to a writeup I did:
> https://virtualpenguins.blogspot.de/2018/02/notes-on-pci-on-s390x.html
>
> It does support hotplug, within the s390 architectural context, but
> that should be fine for our needs here.
>
> My concern comes from the 'no topology' issue. We build a fake topology
> in QEMU (to use the generic pci infrastructure), but the guest does not
> see any of this. It issues an instruction and gets a list of functions.
> This means any bridge information is not accessible to the guest.

That is the reason why I prefer leaving it to the specific transport
(zPCI) to define its own means to present and retrieve the group ID
information. The high-level feature bit just provides the indirection
to the target transport (for the to-be-paired device), while a separate
transport can have a way of its own to present/retrieve the group ID
data.

I don't know where to present that group ID data on s390 zPCI though
to be honest. But I doubt we can come up with a very general
transport-agnostic way to present the data for both (and potentially
ALL) bus architectures.

-Siwei

>
>>
>> Does the s390x use case you have in mind concern VFIO migration, or
>> replacement of a PT device with backup virtio-ccw path? Or something
>> else?
>
> Migration with vfio is the case most likely to be relevant. I'm mostly
> concerned with not needlessly closing doors, though.
>
>>
>> The assumption of SR-IOV migration is that hotplug is used as an
>> indicator for the datapath switch - which maps to moving the MAC address or
>> VLAN filter around between PV and VF. I am not sure how that maps to
>> s390x and zPCI with regard to host coordination.
>
> If we can move MAC addresses or VLAN filters on Linux/QEMU/libvirt in
> general, there's no inherent reason why we shouldn't be able to do so
> on s390 as well. What matters more is probably which pci network cards
> are supported (currently Mellanox AFAIK, not sure if there are others).
>
>>
>> -Siwei
>>
>> > - It is too focused on a narrow use case. How is it supposed to be
>> >   extended?
>> >
>> > What I would prefer:
>> > - Implement a pairing id support that does not rely on a certain
>> >   transport, but leverages virtio (which is in the game anyway). We'd
>> >   get at least the "virtio-net device paired with vfio" use case, which
>> >   is what is currently implemented in the Linux kernel.
>> > - Think about a more generic way to relay configuration metadata to the
>> >   host.
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>> >
>

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-06  0:49                 ` [virtio-dev] Re: [Qemu-devel] " Siwei Liu
@ 2018-07-06 13:54                   ` Cornelia Huck
  -1 siblings, 0 replies; 106+ messages in thread
From: Cornelia Huck @ 2018-07-06 13:54 UTC (permalink / raw)
  To: Siwei Liu
  Cc: Venu Busireddy, Roman Kagan, si-wei liu, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel

On Thu, 5 Jul 2018 17:49:10 -0700
Siwei Liu <loseweigh@gmail.com> wrote:

> On Wed, Jul 4, 2018 at 5:15 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> > On Tue, 3 Jul 2018 16:31:03 -0700
> > Siwei Liu <loseweigh@gmail.com> wrote:
> >  
> >> On Tue, Jul 3, 2018 at 7:52 AM, Cornelia Huck <cohuck@redhat.com> wrote:  
> >> > From my point of view, there are several concerns:
> >> > - This approach assumes a homogeneous pairing (same transport), and
> >> >   even more, it assumes that this transport is pci.  
> >>
> >> Not really.
> >>
> >> There could be some other place to define a generic (transport
> >> independent) virtio feature, whereas the data (group ID) can be stored
> >> in transport specific way. That generic virtio feature and the way to
> >> specify target transport to group with is yet to be defined. I don't
> >> see this patch is in conflict with that direction.  
> >
> > Sorry, but I really do not see how this is not pci-specific.
> >
> > - One of your components is a bridge. A transport does not necessarily
> >   have that concept, at least not in a way meaningful for this approach
> >   to work.  
> 
> Assuming we have a transport agnostic high-level virtio feature to
> indicate that the target type for the to-be-paired device is PCI, then
> we have to use the data stored in a PCI bridge to pair the device. It
> does not associate transport of virtio itself with the type of target
> device, right. The introduction of PCI bridge is just for storing the
> group ID data for the PCI passthrough device case, while one may
> define and introduce other means to retrieve the info if target device
> type is not PCI e.g. a special instruction to get group ID on s390x
> zPCI.

Well, zPCI *is* PCI, just a very quirky one. I'm not sure how upper
layers are supposed to distinguish that.

Is there really no existing PCI identifier that (a) the host admin
already knows and (b) the guest can retrieve easily?

> 
> > - Even if we can introduce transport-specific ways for other
> >   transports, the bridge concept still means that the pairing cannot be
> >   cross-transport.  
> 
> Sorry, I did not get this. A bridge is a PCI-specific concept indeed,
> but the virtio itself can be of another transport. Why does it matter?
> I guess a higher-level feature is needed to define the target transport
> for the to-be-paired device. The group ID info can reside in the
> transport-specific area on virtio itself though. E.g. for virtio-pci,
> read it from a vendor-specific capability in the PCI configuration
> space.
> For virtio-ccw, you may come up with a CCW-specific way to get the group
> ID data. Is this not what you had in mind?

No, my idea was to add it as a field in the virtio-net config space,
making it transport-agnostic. (And making this a typed value, depending
on the vfio device we want to pair the virtio device with.)

If we want to extend that to other device types, we can add the field
in their config space; but I'd rather prefer a more generic "host
relays config metainformation" approach.

> 
> >
> > I think we should be clear what we want from a generic
> > (transport-agnostic) virtio feature first. Probably some way to relay
> > an identifier of the to-be-paired device (transport-specific +
> > information what the transport is?)
> >  
> Are we talking about the transport of the virtio itself or the target
> device? Note the virtio feature must always be retrieved in a
> transport-specific way, although the semantics of the feature can be
> transport independent/agnostic. Once a virtio driver instance gets to
> the point to read feature bits from the device, the transport of that
> virtio device is determined and cannot be changed later.

No, I don't want to introduce a new, per-transport mechanism, but
rather put it into the config space.

> 
> 
> >> > - It won't work for zPCI (although zPCI is really strange) -- this
> >> >   means it will be completely unusable on s390x.  
> >> I still need more information about this use case. First off, does
> >> zPCI support all the hot plug semantics or functionality the same way
> >> as PCI? Or does there have to be some platform or firmware support like
> >> ACPI hotplug? Does QEMU have all the pieces ready for s390 zPCI
> >> hotplug?  
> >
> > zPCI is a strange beast, so first a pointer to a writeup I did:
> > https://virtualpenguins.blogspot.de/2018/02/notes-on-pci-on-s390x.html
> >
> > It does support hotplug, within the s390 architectural context, but
> > that should be fine for our needs here.
> >
> > My concern comes from the 'no topology' issue. We build a fake topology
> > in QEMU (to use the generic pci infrastructure), but the guest does not
> > see any of this. It issues an instruction and gets a list of functions.
> > This means any bridge information is not accessible to the guest.  
> 
> That is the reason why I prefer leaving it to specific transport
> (zPCI) to define its own means to present and retrieve the group ID
> information. The high-level feature bit just provides the indirection
> to the target transport (for the to-be-paired device), while a separate
> transport can have a way of its own to present/retrieve the group ID
> data.
> 
> I don't know where to present that group ID data on s390 zPCI though
> to be honest. But I doubt we can come up with a very general
> transport-agnostic way to present the data for both (and potentially
> ALL) bus architectures.

I don't want to establish zPCI as a different transport than 'normal'
PCI; that would just make life difficult for everyone. The 'put it into
virtio config space' approach would mean a second feature bit, but
should just work.


* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-06 13:54                   ` [virtio-dev] Re: [Qemu-devel] " Cornelia Huck
@ 2018-07-06 15:07                     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-06 15:07 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Siwei Liu, Venu Busireddy, Roman Kagan, si-wei liu,
	Marcel Apfelbaum, virtio-dev, qemu-devel

On Fri, Jul 06, 2018 at 03:54:06PM +0200, Cornelia Huck wrote:
> On Thu, 5 Jul 2018 17:49:10 -0700
> Siwei Liu <loseweigh@gmail.com> wrote:
> 
> > On Wed, Jul 4, 2018 at 5:15 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> > > On Tue, 3 Jul 2018 16:31:03 -0700
> > > Siwei Liu <loseweigh@gmail.com> wrote:
> > >  
> > >> On Tue, Jul 3, 2018 at 7:52 AM, Cornelia Huck <cohuck@redhat.com> wrote:  
> > >> > From my point of view, there are several concerns:
> > >> > - This approach assumes a homogeneous pairing (same transport), and
> > >> >   even more, it assumes that this transport is pci.  
> > >>
> > >> Not really.
> > >>
> > >> There could be some other place to define a generic (transport
> > >> independent) virtio feature, whereas the data (group ID) can be stored
> > >> in transport specific way. That generic virtio feature and the way to
> > >> specify target transport to group with is yet to be defined. I don't
> > >> see this patch is in conflict with that direction.  
> > >
> > > Sorry, but I really do not see how this is not pci-specific.
> > >
> > > - One of your components is a bridge. A transport does not necessarily
> > >   have that concept, at least not in a way meaningful for this approach
> > >   to work.  
> > 
> > Assuming we have a transport agnostic high-level virtio feature to
> > indicate that the target type for the to-be-paired device is PCI, then
> > we have to use the data stored in a PCI bridge to pair the device. It
> > does not associate transport of virtio itself with the type of target
> > device, right. The introduction of PCI bridge is just for storing the
> > group ID data for the PCI passthrough device case, while one may
> > define and introduce other means to retrieve the info if target device
> > type is not PCI e.g. a special instruction to get group ID on s390x
> > zPCI.
> 
> Well, zPCI *is* PCI, just a very quirky one. I'm not sure how upper
> layers are supposed to distinguish that.
> 
> Is there really no existing PCI identifier that (a) the host admin
> already knows and (b) the guest can retrieve easily?

There is, e.g., the PCI Express serial number. However, given that
one use of the feature is live migration, we really
shouldn't use any ID from a real card.

> > 
> > > - Even if we can introduce transport-specific ways for other
> > >   transports, the bridge concept still means that the pairing cannot be
> > >   cross-transport.  
> > 
> > Sorry, I did not get this. A bridge is a PCI-specific concept indeed,
> > but the virtio itself can be of another transport. Why does it matter?
> > I guess a higher-level feature is needed to define the target transport
> > for the to-be-paired device. The group ID info can reside in the
> > transport-specific area on virtio itself though. E.g. for virtio-pci,
> > read it from a vendor-specific capability in the PCI configuration
> > space.
> > For virtio-ccw, you may come up with a CCW-specific way to get the group
> > ID data. Is this not what you had in mind?
> 
> No, my idea was it to add it as a field in the virtio-net config space,
> making it transport-agnostic. (And making this a typed value, depending
> on the vfio device we want to pair the virtio device with.)
> 
> If we want to extend that to other device types, we can add the field
> in their config space; but I'd rather prefer a more generic "host
> relays config metainformation" approach.
> 
> > 
> > >
> > > I think we should be clear what we want from a generic
> > > (transport-agnostic) virtio feature first. Probably some way to relay
> > > an identifier of the to-be-paired device (transport-specific +
> > > information what the transport is?)
> > >  
> > Are we talking about the transport of the virtio itself or the target
> > device? Note the virtio feature must always be retrieved in a
> > transport-specific way, although the semantics of the feature can be
> > transport independent/agnostic. Once a virtio driver instance gets to
> > the point to read feature bits from the device, the transport of that
> > virtio device is determined and cannot be changed later.
> 
> No, I don't want to introduce a new, per-transport mechanism, but
> rather put it into the config space.
> > 
> > 
> > >> > - It won't work for zPCI (although zPCI is really strange) -- this
> > >> >   means it will be completely unusable on s390x.  
> > >> I still need more information about this use case. First off, does
> > >> zPCI support all the hot plug semantics or functionality the same way
> > >> as PCI? Or does there have to be some platform or firmware support like
> > >> ACPI hotplug? Does QEMU have all the pieces ready for s390 zPCI
> > >> hotplug?  
> > >
> > > zPCI is a strange beast, so first a pointer to a writeup I did:
> > > https://virtualpenguins.blogspot.de/2018/02/notes-on-pci-on-s390x.html
> > >
> > > It does support hotplug, within the s390 architectural context, but
> > > that should be fine for our needs here.
> > >
> > > My concern comes from the 'no topology' issue. We build a fake topology
> > > in QEMU (to use the generic pci infrastructure), but the guest does not
> > > see any of this. It issues an instruction and gets a list of functions.
> > > This means any bridge information is not accessible to the guest.  
> > 
> > That is the reason why I prefer leaving it to specific transport
> > (zPCI) to define its own means to present and retrieve the group ID
> > information. The high-level feature bit just provides the indirection
> > to the target transport (for the to-be-paired device), while a separate
> > transport can have a way of its own to present/retrieve the group ID
> > data.
> > 
> > I don't know where to present that group ID data on s390 zPCI though
> > to be honest. But I doubt we can come up with a very general
> > transport-agnostic way to present the data for both (and potentially
> > ALL) bus architectures.
> 
> I don't want to establish zPCI as a different transport than 'normal'
> PCI; that would just make life difficult for everyone. The 'put it into
> virtio config space' approach would mean a second feature bit, but
> should just work.

So how about this: each transport will gain a new field
which is the offset within config space of a generic config area.
Each field within that area will be guarded by a feature bit.

E.g. add CCW_CMD_READ_CONF_OFFSET.

Is this better than a separate new generic config space?

-- 
MST

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
@ 2018-07-06 15:07                     ` Michael S. Tsirkin
  0 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-06 15:07 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Siwei Liu, Venu Busireddy, Roman Kagan, si-wei liu,
	Marcel Apfelbaum, virtio-dev, qemu-devel

On Fri, Jul 06, 2018 at 03:54:06PM +0200, Cornelia Huck wrote:
> On Thu, 5 Jul 2018 17:49:10 -0700
> Siwei Liu <loseweigh@gmail.com> wrote:
> 
> > On Wed, Jul 4, 2018 at 5:15 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> > > On Tue, 3 Jul 2018 16:31:03 -0700
> > > Siwei Liu <loseweigh@gmail.com> wrote:
> > >  
> > >> On Tue, Jul 3, 2018 at 7:52 AM, Cornelia Huck <cohuck@redhat.com> wrote:  
> > >> > From my point of view, there are several concerns:
> > >> > - This approach assumes a homogeneous pairing (same transport), and
> > >> >   even more, it assumes that this transport is pci.  
> > >>
> > >> Not really.
> > >>
> > >> There could be some other place to define a generic (transport
> > >> independent) virtio feature, whereas the data (group ID) can be stored
> > >> in transport specific way. That generic virtio feature and the way to
> > >> specify target transport to group with is yet to be defined. I don't
> > >> see this patch is in conflict with that direction.  
> > >
> > > Sorry, but I really do not see how this is not pci-specific.
> > >
> > > - One of your components is a bridge. A transport does not necessarily
> > >   have that concept, at least not in a way meaningful for this approach
> > >   to work.  
> > 
> > Assuming we have a transport agnostic high-level virtio feature to
> > indicate that the target type for the to-be-paired device is PCI, then
> > we have to use the data stored in a PC bridge to pair the device. It
> > does not associate transport of virtio itself with the type of target
> > device, right. The introduction of PCI bridge is just for storing the
> > group ID data for the PCI passthrough device case, while one may
> > define and introduce other means to retrieve the info if target device
> > type is not PCI e.g. a special instruction to get group ID on s390x
> > zPCI.
> 
> Well, zPCI *is* PCI, just a very quirky one. I'm not sure how upper
> layers are supposed to distinguish that.
> 
> Is there really no existing PCI identifier that (a) the host admin
> already knows and (b) the guest can retrieve easily?

There is e.g. PCI Express serial number. However given that
one use of the feature is for live migration, we really
shouldn't use any ID from a real card.

> > 
> > > - Even if we can introduce transport-specific ways for other
> > >   transports, the bridge concept still means that the pairing cannot be
> > >   cross-transport.  
> > 
> > Sorry, I did not get this. A bridge is PCI specific concept indeed,
> > but the virtio iteself can be of other transport. Why it matters? I
> > guess a higher-level feature is needed to define the target transport
> > for the to-be-paired device. The group ID info can reside on the
> > transport specific area on virtio itself though. E.g. for virtio-pci,
> > get it on the vendor specific capability in the PCI configuration
> > space.
> > For virtio-ccw, you may come up with CCW specific way to get the group
> > ID data. Is this not what you had in mind?
> 
> No, my idea was to add it as a field in the virtio-net config space,
> making it transport-agnostic. (And making this a typed value, depending
> on the vfio device we want to pair the virtio device with.)
> 
> If we want to extend that to other device types, we can add the field
> in their config space; but I'd rather prefer a more generic "host
> relays config metainformation" approach.
> 
> > 
> > >
> > > I think we should be clear what we want from a generic
> > > (transport-agnostic) virtio feature first. Probably some way to relay
> > > an identifier of the to-be-paired device (transport-specific +
> > >   information about what the transport is?)
> > >  
> > Are we talking about the transport of the virtio itself or the target
> > device? Note the virtio feature must always be retrieved in a
> > transport-specific way, although the semantics of the feature can be
> > transport-independent/agnostic. Once a virtio driver instance gets to
> > the point to read feature bits from the device, the transport of that
> > virtio device is determined and cannot be changed later.
> 
> No, I don't want to introduce a new, per-transport mechanism, but
> rather put it into the config space.
> > 
> > 
> > >> > - It won't work for zPCI (although zPCI is really strange) -- this
> > >> >   means it will be completely unusable on s390x.  
> > >> I still need more information about this use case. First off, does
> > >> zPCI support all the hot plug semantics or functionality the same way
> > >> as PCI? Or does there have to be some platform or firmware support like
> > >> ACPI hotplug? Does QEMU have all the pieces ready for s390 zPCI
> > >> hotplug?  
> > >
> > > zPCI is a strange beast, so first a pointer to a writeup I did:
> > > https://virtualpenguins.blogspot.de/2018/02/notes-on-pci-on-s390x.html
> > >
> > > It does support hotplug, within the s390 architectural context, but
> > > that should be fine for our needs here.
> > >
> > > My concern comes from the 'no topology' issue. We build a fake topology
> > > in QEMU (to use the generic pci infrastructure), but the guest does not
> > > see any of this. It issues an instruction and gets a list of functions.
> > > This means any bridge information is not accessible to the guest.  
> > 
> > That is the reason why I prefer leaving it to specific transport
> > (zPCI) to define its own means to present and retrieve the group ID
> > information. The high-level feature bit just provides the indirection
> > to the target transport (for the to-be-paired device), while a separate
> > transport can have a way of its own to present/retrieve the group ID
> > data.
> > 
> > I don't know where to present that group ID data on s390 zPCI though
> > to be honest. But I doubt we can come up with a very general
> > transport-agnostic way to present the data for both (and potentially
> > ALL) bus architectures.
> 
> I don't want to establish zPCI as a different transport than 'normal'
> PCI; that would just make life difficult for everyone. The 'put it into
> virtio config space' approach would mean a second feature bit, but
> should just work.

So how about this: each transport gains a new field
which is the offset, within config space, of a generic config area.
Each field within that area will be guarded by a feature bit.

E.g. add CCW_CMD_READ_CONF_OFFSET.

Is this better than a separate new generic config space?

-- 
MST

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-06 13:54                   ` [virtio-dev] Re: [Qemu-devel] " Cornelia Huck
@ 2018-07-06 23:37                     ` Siwei Liu
  -1 siblings, 0 replies; 106+ messages in thread
From: Siwei Liu @ 2018-07-06 23:37 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Venu Busireddy, Roman Kagan, si-wei liu, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel

On Fri, Jul 6, 2018 at 6:54 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> On Thu, 5 Jul 2018 17:49:10 -0700
> Siwei Liu <loseweigh@gmail.com> wrote:
>
>> On Wed, Jul 4, 2018 at 5:15 AM, Cornelia Huck <cohuck@redhat.com> wrote:
>> > On Tue, 3 Jul 2018 16:31:03 -0700
>> > Siwei Liu <loseweigh@gmail.com> wrote:
>> >
>> >> On Tue, Jul 3, 2018 at 7:52 AM, Cornelia Huck <cohuck@redhat.com> wrote:
>> >> > From my point of view, there are several concerns:
>> >> > - This approach assumes a homogeneous pairing (same transport), and
>> >> >   even more, it assumes that this transport is pci.
>> >>
>> >> Not really.
>> >>
>> >> There could be some other place to define a generic (transport
>> >> independent) virtio feature, whereas the data (group ID) can be stored
>> >> in transport specific way. That generic virtio feature and the way to
>> >> specify target transport to group with is yet to be defined. I don't
>> >> see this patch is in conflict with that direction.
>> >
>> > Sorry, but I really do not see how this is not pci-specific.
>> >
>> > - One of your components is a bridge. A transport does not necessarily
>> >   have that concept, at least not in a way meaningful for this approach
>> >   to work.
>>
>> Assuming we have a transport agnostic high-level virtio feature to
>> indicate that the target type for the to-be-paired device is PCI, then
>> we have to use the data stored in a PCI bridge to pair the device. It
>> does not associate transport of virtio itself with the type of target
>> device, right. The introduction of PCI bridge is just for storing the
>> group ID data for the PCI passthrough device case, while one may
>> define and introduce other means to retrieve the info if target device
>> type is not PCI e.g. a special instruction to get group ID on s390x
>> zPCI.
>
> Well, zPCI *is* PCI, just a very quirky one. I'm not sure how upper
> layers are supposed to distinguish that.
>
> Is there really no existing PCI identifier that (a) the host admin
> already knows and (b) the guest can retrieve easily?

Not sure if zPCI should be categorized as real PCI, due to its lack of
topology. Some other virtual or para-virtual PCI buses behave like
PCI without being full-blown implementations; those are not regarded
as (native) PCI emulation either.

Can an s390x guest use the UID and FID described in the blog post you
wrote to identify a zPCI device? I am wondering whether we can define a
grouping mechanism other than the 64-bit group ID (formerly a 128-bit
UUID). The high-level indirection would then specify not just the
target transport, but maybe the grouping mechanism as well.

>
>>
>> > - Even if we can introduce transport-specific ways for other
>> >   transports, the bridge concept still means that the pairing cannot be
>> >   cross-transport.
>>
>> Sorry, I did not get this. A bridge is a PCI-specific concept indeed,
>> but virtio itself can be of another transport. Why does it matter? I
>> guess a higher-level feature is needed to define the target transport
>> for the to-be-paired device. The group ID info can reside on the
>> transport specific area on virtio itself though. E.g. for virtio-pci,
>> get it on the vendor specific capability in the PCI configuration
>> space.
>> For virtio-ccw, you may come up with CCW specific way to get the group
>> ID data. Is this not what you had in mind?
>
> No, my idea was to add it as a field in the virtio-net config space,
> making it transport-agnostic. (And making this a typed value, depending
> on the vfio device we want to pair the virtio device with.)
>
> If we want to extend that to other device types, we can add the field
> in their config space; but I'd rather prefer a more generic "host
> relays config metainformation" approach.
>
>>
>> >
>> > I think we should be clear what we want from a generic
>> > (transport-agnostic) virtio feature first. Probably some way to relay
>> > an identifier of the to-be-paired device (transport-specific +
>> > information about what the transport is?)
>> >
>> Are we talking about the transport of the virtio itself or the target
>> device? Note the virtio feature must always be retrieved in a
>> transport-specific way, although the semantics of the feature can be
>> transport-independent/agnostic. Once a virtio driver instance gets to
>> the point to read feature bits from the device, the transport of that
>> virtio device is determined and cannot be changed later.
>
> No, I don't want to introduce a new, per-transport mechanism, but
> rather put it into the config space.

OK.

I think the common understanding is that config space shouldn't be per
device type. MST's suggestion in the other mail to carve some space
from the device config space for generic use seems viable to me.

>
>>
>>
>> >> > - It won't work for zPCI (although zPCI is really strange) -- this
>> >> >   means it will be completely unusable on s390x.
>> >> I still need more information about this use case. First off, does
>> >> zPCI support all the hot plug semantics or functionality the same way
>> >> as PCI? Or does there have to be some platform or firmware support like
>> >> ACPI hotplug? Does QEMU have all the pieces ready for s390 zPCI
>> >> hotplug?
>> >
>> > zPCI is a strange beast, so first a pointer to a writeup I did:
>> > https://virtualpenguins.blogspot.de/2018/02/notes-on-pci-on-s390x.html
>> >
>> > It does support hotplug, within the s390 architectural context, but
>> > that should be fine for our needs here.
>> >
>> > My concern comes from the 'no topology' issue. We build a fake topology
>> > in QEMU (to use the generic pci infrastructure), but the guest does not
>> > see any of this. It issues an instruction and gets a list of functions.
>> > This means any bridge information is not accessible to the guest.
>>
>> That is the reason why I prefer leaving it to specific transport
>> (zPCI) to define its own means to present and retrieve the group ID
>> information. The high-level feature bit just provides the indirection
>> to the target transport (for the to-be-paired device), while a separate
>> transport can have a way of its own to present/retrieve the group ID
>> data.
>>
>> I don't know where to present that group ID data on s390 zPCI though
>> to be honest. But I doubt we can come up with a very general
>> transport-agnostic way to present the data for both (and potentially
>> ALL) bus architectures.
>
> I don't want to establish zPCI as a different transport than 'normal'
> PCI; that would just make life difficult for everyone. The 'put it into
> virtio config space' approach would mean a second feature bit, but
> should just work.

Sorry, but we still need an ID to match the VFIO passthrough device. I
don't see how putting the info into virtio config space can address
the problem.

-Siwei

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [virtio-dev] [PATCH v3 3/3] Add "Group Identifier" support to Red Hat PCI Express bridge.
  2018-06-29 22:19   ` [virtio-dev] " Venu Busireddy
@ 2018-07-07 12:14     ` Marcel Apfelbaum
  -1 siblings, 0 replies; 106+ messages in thread
From: Marcel Apfelbaum @ 2018-07-07 12:14 UTC (permalink / raw)
  To: Venu Busireddy, Michael S . Tsirkin, Marcel Apfelbaum
  Cc: virtio-dev, qemu-devel

Hi Venu,

On 06/30/2018 01:19 AM, Venu Busireddy wrote:
> Add a new bridge device "pcie-downstream" with a
> Vendor ID of PCI_VENDOR_ID_REDHAT and a Device ID of
> PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER.

Can't we use the existing pcie-pci-bridge device (see
hw/pci-bridge/pcie_pci_bridge.c) and add the new vendor capability to
it, so we will not need to support a new bridge?

Thanks,
Marcel

>   Also, add the "Vendor-Specific"
> capability to the bridge to contain the "Group Identifier" that will be
> used to pair a virtio device with the passthrough device attached to
> that bridge.
>
> This capability is added to the bridge iff the "failover-group-id"
> option is specified for the bridge.
>
> Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
> ---
>   default-configs/arm-softmmu.mak    |   1 +
>   default-configs/i386-softmmu.mak   |   1 +
>   default-configs/x86_64-softmmu.mak |   1 +
>   hw/pci-bridge/Makefile.objs        |   1 +
>   hw/pci-bridge/pcie_downstream.c    | 220 +++++++++++++++++++++++++++++
>   hw/pci-bridge/pcie_downstream.h    |  10 ++
>   include/hw/pci/pci.h               |   1 +
>   7 files changed, 235 insertions(+)
>   create mode 100644 hw/pci-bridge/pcie_downstream.c
>   create mode 100644 hw/pci-bridge/pcie_downstream.h
>
> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> index 834d45cfaf..b86c6fb122 100644
> --- a/default-configs/arm-softmmu.mak
> +++ b/default-configs/arm-softmmu.mak
> @@ -139,6 +139,7 @@ CONFIG_IMX_I2C=y
>   CONFIG_PCIE_PORT=y
>   CONFIG_XIO3130=y
>   CONFIG_IOH3420=y
> +CONFIG_PCIE_DOWNSTREAM=y
>   CONFIG_I82801B11=y
>   CONFIG_ACPI=y
>   CONFIG_SMBIOS=y
> diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
> index 8c7d4a0fa0..a900c8f052 100644
> --- a/default-configs/i386-softmmu.mak
> +++ b/default-configs/i386-softmmu.mak
> @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
>   CONFIG_PCIE_PORT=y
>   CONFIG_XIO3130=y
>   CONFIG_IOH3420=y
> +CONFIG_PCIE_DOWNSTREAM=y
>   CONFIG_I82801B11=y
>   CONFIG_SMBIOS=y
>   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
> index 0390b4303c..481e4764be 100644
> --- a/default-configs/x86_64-softmmu.mak
> +++ b/default-configs/x86_64-softmmu.mak
> @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
>   CONFIG_PCIE_PORT=y
>   CONFIG_XIO3130=y
>   CONFIG_IOH3420=y
> +CONFIG_PCIE_DOWNSTREAM=y
>   CONFIG_I82801B11=y
>   CONFIG_SMBIOS=y
>   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs
> index 47065f87d9..5b42212edc 100644
> --- a/hw/pci-bridge/Makefile.objs
> +++ b/hw/pci-bridge/Makefile.objs
> @@ -3,6 +3,7 @@ common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o pcie_pci
>   common-obj-$(CONFIG_PXB) += pci_expander_bridge.o
>   common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o
>   common-obj-$(CONFIG_IOH3420) += ioh3420.o
> +common-obj-$(CONFIG_PCIE_DOWNSTREAM) += pcie_downstream.o
>   common-obj-$(CONFIG_I82801B11) += i82801b11.o
>   # NewWorld PowerMac
>   common-obj-$(CONFIG_DEC_PCI) += dec.o
> diff --git a/hw/pci-bridge/pcie_downstream.c b/hw/pci-bridge/pcie_downstream.c
> new file mode 100644
> index 0000000000..49b0d1e933
> --- /dev/null
> +++ b/hw/pci-bridge/pcie_downstream.c
> @@ -0,0 +1,220 @@
> +/*
> + * Red Hat PCI Express downstream port.
> + *
> + * pcie_downstream.c
> + * Most of this code is copied from xio3130_downstream.c
> + *
> + * Copyright (c) 2018 Oracle and/or its affiliates.
> + *         Author: Venu Busireddy <venu.busireddy@oracle.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/pci/pci_ids.h"
> +#include "hw/pci/msi.h"
> +#include "hw/pci/pcie.h"
> +#include "pcie_downstream.h"
> +#include "qapi/error.h"
> +
> +#define REDHAT_PCIE_DOWNSTREAM_REVISION                 0x1
> +#define REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET               0x70
> +#define REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS      PCI_MSI_FLAGS_64BIT
> +#define REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR            1
> +#define REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET             0x80
> +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SVID               0
> +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SSID               0
> +#define REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET               0x90
> +#define REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET            0xCC
> +#define REDHAT_PCIE_DOWNSTREAM_AER_OFFSET               0x100
> +
> +static void pcie_downstream_write_config(PCIDevice *d, uint32_t address,
> +                                         uint32_t val, int len)
> +{
> +    pci_bridge_write_config(d, address, val, len);
> +    pcie_cap_flr_write_config(d, address, val, len);
> +    pcie_cap_slot_write_config(d, address, val, len);
> +    pcie_aer_write_config(d, address, val, len);
> +}
> +
> +static void pcie_downstream_reset(DeviceState *qdev)
> +{
> +    PCIDevice *d = PCI_DEVICE(qdev);
> +
> +    pcie_cap_deverr_reset(d);
> +    pcie_cap_slot_reset(d);
> +    pcie_cap_arifwd_reset(d);
> +    pci_bridge_reset(qdev);
> +}
> +
> +static void pcie_downstream_realize(PCIDevice *d, Error **errp)
> +{
> +    PCIEPort *p = PCIE_PORT(d);
> +    PCIESlot *s = PCIE_SLOT(d);
> +    int rc;
> +
> +    pci_bridge_initfn(d, TYPE_PCIE_BUS);
> +    pcie_port_init_reg(d);
> +
> +    rc = pci_bridge_vendor_init(d, REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET,
> +            PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER, errp);
> +    if (rc < 0) {
> +        error_append_hint(errp, "Can't init group ID, error %d\n", rc);
> +        goto err_bridge;
> +    }
> +
> +    rc = msi_init(d, REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET,
> +            REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR,
> +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
> +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
> +            errp);
> +    if (rc < 0) {
> +        assert(rc == -ENOTSUP);
> +        goto err_bridge;
> +    }
> +
> +    rc = pci_bridge_ssvid_init(d, REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET,
> +            REDHAT_PCIE_DOWNSTREAM_SSVID_SVID,
> +            REDHAT_PCIE_DOWNSTREAM_SSVID_SSID, errp);
> +    if (rc < 0) {
> +        goto err_bridge;
> +    }
> +
> +    rc = pcie_cap_init(d, REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET,
> +            PCI_EXP_TYPE_DOWNSTREAM, p->port, errp);
> +    if (rc < 0) {
> +        goto err_msi;
> +    }
> +    pcie_cap_flr_init(d);
> +    pcie_cap_deverr_init(d);
> +    pcie_cap_slot_init(d, s->slot);
> +    pcie_cap_arifwd_init(d);
> +
> +    pcie_chassis_create(s->chassis);
> +    rc = pcie_chassis_add_slot(s);
> +    if (rc < 0) {
> +        error_setg(errp, "Can't add chassis slot, error %d", rc);
> +        goto err_pcie_cap;
> +    }
> +
> +    rc = pcie_aer_init(d, PCI_ERR_VER, REDHAT_PCIE_DOWNSTREAM_AER_OFFSET,
> +            PCI_ERR_SIZEOF, errp);
> +    if (rc < 0) {
> +        goto err;
> +    }
> +
> +    return;
> +
> +err:
> +    pcie_chassis_del_slot(s);
> +err_pcie_cap:
> +    pcie_cap_exit(d);
> +err_msi:
> +    msi_uninit(d);
> +err_bridge:
> +    pci_bridge_exitfn(d);
> +}
> +
> +static void pcie_downstream_exitfn(PCIDevice *d)
> +{
> +    PCIESlot *s = PCIE_SLOT(d);
> +
> +    pcie_aer_exit(d);
> +    pcie_chassis_del_slot(s);
> +    pcie_cap_exit(d);
> +    msi_uninit(d);
> +    pci_bridge_exitfn(d);
> +}
> +
> +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> +                               const char *bus_name, pci_map_irq_fn map_irq,
> +                               uint8_t port, uint8_t chassis, uint16_t slot)
> +{
> +    PCIDevice *d;
> +    PCIBridge *br;
> +    DeviceState *qdev;
> +
> +    d = pci_create_multifunction(bus, devfn, multifunction, "pcie-downstream");
> +    if (!d) {
> +        return NULL;
> +    }
> +    br = PCI_BRIDGE(d);
> +
> +    qdev = DEVICE(d);
> +    pci_bridge_map_irq(br, bus_name, map_irq);
> +    qdev_prop_set_uint8(qdev, "port", port);
> +    qdev_prop_set_uint8(qdev, "chassis", chassis);
> +    qdev_prop_set_uint16(qdev, "slot", slot);
> +    qdev_init_nofail(qdev);
> +
> +    return PCIE_SLOT(d);
> +}
> +
> +static Property pcie_downstream_props[] = {
> +    DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present,
> +            QEMU_PCIE_SLTCAP_PCP_BITNR, true),
> +    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
> +            PCIDevice, failover_group_id, ULLONG_MAX),
> +    DEFINE_PROP_END_OF_LIST()
> +};
> +
> +static const VMStateDescription vmstate_pcie_ds = {
> +    .name = "pcie-downstream-port",
> +    .priority = MIG_PRI_PCI_BUS,
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .post_load = pcie_cap_slot_post_load,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_PCI_DEVICE(parent_obj.parent_obj.parent_obj, PCIESlot),
> +        VMSTATE_STRUCT(parent_obj.parent_obj.parent_obj.exp.aer_log,
> +                       PCIESlot, 0, vmstate_pcie_aer_log, PCIEAERLog),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
> +static void pcie_downstream_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> +
> +    k->is_bridge = 1;
> +    k->config_write = pcie_downstream_write_config;
> +    k->realize = pcie_downstream_realize;
> +    k->exit = pcie_downstream_exitfn;
> +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
> +    k->device_id = PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER;
> +    k->revision = REDHAT_PCIE_DOWNSTREAM_REVISION;
> +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> +    dc->desc = "Red Hat PCIe Downstream Port";
> +    dc->reset = pcie_downstream_reset;
> +    dc->vmsd = &vmstate_pcie_ds;
> +    dc->props = pcie_downstream_props;
> +}
> +
> +static const TypeInfo pcie_downstream_info = {
> +    .name          = "pcie-downstream",
> +    .parent        = TYPE_PCIE_SLOT,
> +    .class_init    = pcie_downstream_class_init,
> +    .interfaces = (InterfaceInfo[]) {
> +        { INTERFACE_PCIE_DEVICE },
> +        { }
> +    },
> +};
> +
> +static void pcie_downstream_register_types(void)
> +{
> +    type_register_static(&pcie_downstream_info);
> +}
> +
> +type_init(pcie_downstream_register_types)
> diff --git a/hw/pci-bridge/pcie_downstream.h b/hw/pci-bridge/pcie_downstream.h
> new file mode 100644
> index 0000000000..a7f1aae4bb
> --- /dev/null
> +++ b/hw/pci-bridge/pcie_downstream.h
> @@ -0,0 +1,10 @@
> +#ifndef QEMU_PCIE_DOWNSTREAM_H
> +#define QEMU_PCIE_DOWNSTREAM_H
> +
> +#include "hw/pci/pcie_port.h"
> +
> +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> +        const char *bus_name, pci_map_irq_fn map_irq, uint8_t port,
> +        uint8_t chassis, uint16_t slot);
> +
> +#endif /* QEMU_PCIE_DOWNSTREAM_H */
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index bc38032761..bf04ef59f3 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -103,6 +103,7 @@ extern bool pci_available;
>   #define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
>   #define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
>   #define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
> +#define PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER  0x0011
>   #define PCI_DEVICE_ID_REDHAT_QXL                0x0100
>   
>   #define FMT_PCIBUS                      PRIx64
>
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] [PATCH v3 3/3] Add "Group Identifier" support to Red Hat PCI Express bridge.
@ 2018-07-07 12:14     ` Marcel Apfelbaum
  0 siblings, 0 replies; 106+ messages in thread
From: Marcel Apfelbaum @ 2018-07-07 12:14 UTC (permalink / raw)
  To: Venu Busireddy, Michael S . Tsirkin, Marcel Apfelbaum
  Cc: virtio-dev, qemu-devel

Hi Venu,

On 06/30/2018 01:19 AM, Venu Busireddy wrote:
> Add a new bridge device "pcie-downstream" with a
> Vendor ID of PCI_VENDOR_ID_REDHAT and a Device ID of
> PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER.

Can't we use the current pcie-pci-bridge device (see
hw/pci-bridge/pcie_pci_bridge.c) by adding the new vendor capability to
it, so we will not need to support a new bridge?

Thanks,
Marcel

>   Also, add the "Vendor-Specific"
> capability to the bridge to contain the "Group Identifier" that will be
> used to pair a virtio device with the passthrough device attached to
> that bridge.
>
> This capability is added to the bridge iff the "failover-group-id"
> option is specified for the bridge.
>
> Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
> ---
>   default-configs/arm-softmmu.mak    |   1 +
>   default-configs/i386-softmmu.mak   |   1 +
>   default-configs/x86_64-softmmu.mak |   1 +
>   hw/pci-bridge/Makefile.objs        |   1 +
>   hw/pci-bridge/pcie_downstream.c    | 220 +++++++++++++++++++++++++++++
>   hw/pci-bridge/pcie_downstream.h    |  10 ++
>   include/hw/pci/pci.h               |   1 +
>   7 files changed, 235 insertions(+)
>   create mode 100644 hw/pci-bridge/pcie_downstream.c
>   create mode 100644 hw/pci-bridge/pcie_downstream.h
>
> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> index 834d45cfaf..b86c6fb122 100644
> --- a/default-configs/arm-softmmu.mak
> +++ b/default-configs/arm-softmmu.mak
> @@ -139,6 +139,7 @@ CONFIG_IMX_I2C=y
>   CONFIG_PCIE_PORT=y
>   CONFIG_XIO3130=y
>   CONFIG_IOH3420=y
> +CONFIG_PCIE_DOWNSTREAM=y
>   CONFIG_I82801B11=y
>   CONFIG_ACPI=y
>   CONFIG_SMBIOS=y
> diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
> index 8c7d4a0fa0..a900c8f052 100644
> --- a/default-configs/i386-softmmu.mak
> +++ b/default-configs/i386-softmmu.mak
> @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
>   CONFIG_PCIE_PORT=y
>   CONFIG_XIO3130=y
>   CONFIG_IOH3420=y
> +CONFIG_PCIE_DOWNSTREAM=y
>   CONFIG_I82801B11=y
>   CONFIG_SMBIOS=y
>   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
> index 0390b4303c..481e4764be 100644
> --- a/default-configs/x86_64-softmmu.mak
> +++ b/default-configs/x86_64-softmmu.mak
> @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
>   CONFIG_PCIE_PORT=y
>   CONFIG_XIO3130=y
>   CONFIG_IOH3420=y
> +CONFIG_PCIE_DOWNSTREAM=y
>   CONFIG_I82801B11=y
>   CONFIG_SMBIOS=y
>   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs
> index 47065f87d9..5b42212edc 100644
> --- a/hw/pci-bridge/Makefile.objs
> +++ b/hw/pci-bridge/Makefile.objs
> @@ -3,6 +3,7 @@ common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o pcie_pci
>   common-obj-$(CONFIG_PXB) += pci_expander_bridge.o
>   common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o
>   common-obj-$(CONFIG_IOH3420) += ioh3420.o
> +common-obj-$(CONFIG_PCIE_DOWNSTREAM) += pcie_downstream.o
>   common-obj-$(CONFIG_I82801B11) += i82801b11.o
>   # NewWorld PowerMac
>   common-obj-$(CONFIG_DEC_PCI) += dec.o
> diff --git a/hw/pci-bridge/pcie_downstream.c b/hw/pci-bridge/pcie_downstream.c
> new file mode 100644
> index 0000000000..49b0d1e933
> --- /dev/null
> +++ b/hw/pci-bridge/pcie_downstream.c
> @@ -0,0 +1,220 @@
> +/*
> + * Red Hat PCI Express downstream port.
> + *
> + * pcie_downstream.c
> + * Most of this code is copied from xio3130_downstream.c
> + *
> + * Copyright (c) 2018 Oracle and/or its affiliates.
> + *         Author: Venu Busireddy <venu.busireddy@oracle.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "qemu/osdep.h"
> +#include "hw/pci/pci_ids.h"
> +#include "hw/pci/msi.h"
> +#include "hw/pci/pcie.h"
> +#include "pcie_downstream.h"
> +#include "qapi/error.h"
> +
> +#define REDHAT_PCIE_DOWNSTREAM_REVISION                 0x1
> +#define REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET               0x70
> +#define REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS      PCI_MSI_FLAGS_64BIT
> +#define REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR            1
> +#define REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET             0x80
> +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SVID               0
> +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SSID               0
> +#define REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET               0x90
> +#define REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET            0xCC
> +#define REDHAT_PCIE_DOWNSTREAM_AER_OFFSET               0x100
> +
> +static void pcie_downstream_write_config(PCIDevice *d, uint32_t address,
> +                                         uint32_t val, int len)
> +{
> +    pci_bridge_write_config(d, address, val, len);
> +    pcie_cap_flr_write_config(d, address, val, len);
> +    pcie_cap_slot_write_config(d, address, val, len);
> +    pcie_aer_write_config(d, address, val, len);
> +}
> +
> +static void pcie_downstream_reset(DeviceState *qdev)
> +{
> +    PCIDevice *d = PCI_DEVICE(qdev);
> +
> +    pcie_cap_deverr_reset(d);
> +    pcie_cap_slot_reset(d);
> +    pcie_cap_arifwd_reset(d);
> +    pci_bridge_reset(qdev);
> +}
> +
> +static void pcie_downstream_realize(PCIDevice *d, Error **errp)
> +{
> +    PCIEPort *p = PCIE_PORT(d);
> +    PCIESlot *s = PCIE_SLOT(d);
> +    int rc;
> +
> +    pci_bridge_initfn(d, TYPE_PCIE_BUS);
> +    pcie_port_init_reg(d);
> +
> +    rc = pci_bridge_vendor_init(d, REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET,
> +            PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER, errp);
> +    if (rc < 0) {
> +        error_append_hint(errp, "Can't init group ID, error %d\n", rc);
> +        goto err_bridge;
> +    }
> +
> +    rc = msi_init(d, REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET,
> +            REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR,
> +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
> +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
> +            errp);
> +    if (rc < 0) {
> +        assert(rc == -ENOTSUP);
> +        goto err_bridge;
> +    }
> +
> +    rc = pci_bridge_ssvid_init(d, REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET,
> +            REDHAT_PCIE_DOWNSTREAM_SSVID_SVID,
> +            REDHAT_PCIE_DOWNSTREAM_SSVID_SSID, errp);
> +    if (rc < 0) {
> +        goto err_bridge;
> +    }
> +
> +    rc = pcie_cap_init(d, REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET,
> +            PCI_EXP_TYPE_DOWNSTREAM, p->port, errp);
> +    if (rc < 0) {
> +        goto err_msi;
> +    }
> +    pcie_cap_flr_init(d);
> +    pcie_cap_deverr_init(d);
> +    pcie_cap_slot_init(d, s->slot);
> +    pcie_cap_arifwd_init(d);
> +
> +    pcie_chassis_create(s->chassis);
> +    rc = pcie_chassis_add_slot(s);
> +    if (rc < 0) {
> +        error_setg(errp, "Can't add chassis slot, error %d", rc);
> +        goto err_pcie_cap;
> +    }
> +
> +    rc = pcie_aer_init(d, PCI_ERR_VER, REDHAT_PCIE_DOWNSTREAM_AER_OFFSET,
> +            PCI_ERR_SIZEOF, errp);
> +    if (rc < 0) {
> +        goto err;
> +    }
> +
> +    return;
> +
> +err:
> +    pcie_chassis_del_slot(s);
> +err_pcie_cap:
> +    pcie_cap_exit(d);
> +err_msi:
> +    msi_uninit(d);
> +err_bridge:
> +    pci_bridge_exitfn(d);
> +}
> +
> +static void pcie_downstream_exitfn(PCIDevice *d)
> +{
> +    PCIESlot *s = PCIE_SLOT(d);
> +
> +    pcie_aer_exit(d);
> +    pcie_chassis_del_slot(s);
> +    pcie_cap_exit(d);
> +    msi_uninit(d);
> +    pci_bridge_exitfn(d);
> +}
> +
> +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> +                               const char *bus_name, pci_map_irq_fn map_irq,
> +                               uint8_t port, uint8_t chassis, uint16_t slot)
> +{
> +    PCIDevice *d;
> +    PCIBridge *br;
> +    DeviceState *qdev;
> +
> +    d = pci_create_multifunction(bus, devfn, multifunction, "pcie-downstream");
> +    if (!d) {
> +        return NULL;
> +    }
> +    br = PCI_BRIDGE(d);
> +
> +    qdev = DEVICE(d);
> +    pci_bridge_map_irq(br, bus_name, map_irq);
> +    qdev_prop_set_uint8(qdev, "port", port);
> +    qdev_prop_set_uint8(qdev, "chassis", chassis);
> +    qdev_prop_set_uint16(qdev, "slot", slot);
> +    qdev_init_nofail(qdev);
> +
> +    return PCIE_SLOT(d);
> +}
> +
> +static Property pcie_downstream_props[] = {
> +    DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present,
> +            QEMU_PCIE_SLTCAP_PCP_BITNR, true),
> +    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
> +            PCIDevice, failover_group_id, ULLONG_MAX),
> +    DEFINE_PROP_END_OF_LIST()
> +};
> +
> +static const VMStateDescription vmstate_pcie_ds = {
> +    .name = "pcie-downstream-port",
> +    .priority = MIG_PRI_PCI_BUS,
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .post_load = pcie_cap_slot_post_load,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_PCI_DEVICE(parent_obj.parent_obj.parent_obj, PCIESlot),
> +        VMSTATE_STRUCT(parent_obj.parent_obj.parent_obj.exp.aer_log,
> +                       PCIESlot, 0, vmstate_pcie_aer_log, PCIEAERLog),
> +        VMSTATE_END_OF_LIST()
> +    }
> +};
> +
> +static void pcie_downstream_class_init(ObjectClass *klass, void *data)
> +{
> +    DeviceClass *dc = DEVICE_CLASS(klass);
> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> +
> +    k->is_bridge = 1;
> +    k->config_write = pcie_downstream_write_config;
> +    k->realize = pcie_downstream_realize;
> +    k->exit = pcie_downstream_exitfn;
> +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
> +    k->device_id = PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER;
> +    k->revision = REDHAT_PCIE_DOWNSTREAM_REVISION;
> +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> +    dc->desc = "Red Hat PCIe Downstream Port";
> +    dc->reset = pcie_downstream_reset;
> +    dc->vmsd = &vmstate_pcie_ds;
> +    dc->props = pcie_downstream_props;
> +}
> +
> +static const TypeInfo pcie_downstream_info = {
> +    .name          = "pcie-downstream",
> +    .parent        = TYPE_PCIE_SLOT,
> +    .class_init    = pcie_downstream_class_init,
> +    .interfaces = (InterfaceInfo[]) {
> +        { INTERFACE_PCIE_DEVICE },
> +        { }
> +    },
> +};
> +
> +static void pcie_downstream_register_types(void)
> +{
> +    type_register_static(&pcie_downstream_info);
> +}
> +
> +type_init(pcie_downstream_register_types)
> diff --git a/hw/pci-bridge/pcie_downstream.h b/hw/pci-bridge/pcie_downstream.h
> new file mode 100644
> index 0000000000..a7f1aae4bb
> --- /dev/null
> +++ b/hw/pci-bridge/pcie_downstream.h
> @@ -0,0 +1,10 @@
> +#ifndef QEMU_PCIE_DOWNSTREAM_H
> +#define QEMU_PCIE_DOWNSTREAM_H
> +
> +#include "hw/pci/pcie_port.h"
> +
> +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> +        const char *bus_name, pci_map_irq_fn map_irq, uint8_t port,
> +        uint8_t chassis, uint16_t slot);
> +
> +#endif /* QEMU_PCIE_DOWNSTREAM_H */
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index bc38032761..bf04ef59f3 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -103,6 +103,7 @@ extern bool pci_available;
>   #define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
>   #define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
>   #define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
> +#define PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER  0x0011
>   #define PCI_DEVICE_ID_REDHAT_QXL                0x0100
>   
>   #define FMT_PCIBUS                      PRIx64
>
>




^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-03 22:27         ` [virtio-dev] " si-wei liu
  (?)
@ 2018-07-09 13:00         ` Roman Kagan
  2018-07-09 18:35             ` [virtio-dev] " Michael S. Tsirkin
                             ` (2 more replies)
  -1 siblings, 3 replies; 106+ messages in thread
From: Roman Kagan @ 2018-07-09 13:00 UTC (permalink / raw)
  To: si-wei liu
  Cc: Venu Busireddy, Michael S . Tsirkin, Marcel Apfelbaum,
	virtio-dev, qemu-devel

On Tue, Jul 03, 2018 at 03:27:23PM -0700, si-wei liu wrote:
> On 7/3/2018 2:58 AM, Roman Kagan wrote:
> > So how is this coordination going to work?  One possibility is that the
> > PV device emits a QMP event upon the guest driver confirming the support
> > for failover, the management layer intercepts the event and performs
> > device_add of the PT device.  Another is that the PT device is added
> > from the very beginning (e.g. on the QEMU command line) but its parent
> > PCI bridge subscribes a callback with the PV device to "activate" the PT
> > device upon negotiating the failover feature.
> > 
> > I think this needs to be decided within the scope of this patchset.
> As discussed in the previous thread below, we would go with the
> approach that QEMU manages the visibility of the PT device automatically.
> The management layer supplies the PT device to QEMU from the very
> beginning. This PT device won't be exposed to the guest immediately,
> unless and until the guest virtio driver has acknowledged the backup
> feature. Once the virtio driver in the guest initiates a device reset,
> the corresponding PT device must be taken out of the guest, then added
> back later, after the guest virtio driver completes negotiation of the
> backup feature.

This means that the parent bridge of the PT device (or whatever else can
control the visibility of the PT device to the guest) will need to
cooperate with the PV device *within* QEMU.  The most natural way to
specify this connection is a property on one device that refers to the
other by device id.

Another benefit of this approach is that it makes it possible to hide
the (possibly transport-specific) device-matching identifiers from the
QEMU caller, as they won't need to be persistent or visible to the
management layer.  In particular, this will allow the implementation of
this PT-PV cooperation to move forward while the discussion of the
matching scheme is still ongoing, because matching by MAC will certainly
work as a first approximation.

> > > > Is the guest supposed to signal the datapath switch to the host?
> > > No, guest doesn't need to be initiating datapath switch at all.
> > What happens if the guest supports failover in its PV driver, but lacks
> > the driver for the PT device?
> The assumption of the failover driver is that the primary (PT device) will be
> able to get a datapath once it shows up in the guest.

I wonder how universal this assumption is, given the variety of possible
network configurations, including filters, VLANs, etc.  For whatever
reason Hyper-V defines a control message over the PV device from guest
to host for that.

> If adding a PT device
> to an unsupported guest, the result will be the same as without a standby
> PV driver - basically no networking, as you don't get a working driver.
> 
> Then perhaps don't add the PT device in the first place if the guest lacks
> driver support?

You don't know this in advance.

> > > However, QMP
> > > events may be generated when exposing or hiding the PT device through hot
> > > plug/unplug, to help the host switch the datapath.
> > The PT device hot plug/unplug are initiated by the host, aren't they?  Why
> > would it also need QMP events for them?
> As indicated above, the hot plug/unplug is initiated by QEMU, not the
> management layer. Hence the QMP hot plug event is used as an indicator to
> switch the host datapath. Unlike the Windows Hyper-V SR-IOV driver model,
> the Linux host network stack does not offer a fine-grained PF driver API
> to move the MAC/VLAN filter, and the VF driver has to start with some
> initial MAC address filter programmed in when present in the guest. The
> QMP event serves as a checkpoint to switch the MAC filter and/or VLAN
> filter between the PV device and the VF.
> 

I'd appreciate something like a sequence diagram to better understand
the whole picture...

> > > > Is the scheme going to be applied/extended to other transports (vmbus,
> > > > virtio-ccw, etc.)?
> > > Well, it depends on the use case, and how feasible it can be extended to
> > > other transport due to constraints and transport specifics.
> > > 
> > > > Is the failover group concept going to be used beyond PT-PV network
> > > > device failover?
> > > Although the concept of failover group is generic, the implementation itself
> > > may vary.
> > My point with these two questions is that since this patchset is
> > defining external interfaces -- with guest OS, with management layer --
> > which are not easy to change later, it might make sense to try and see
> > if the interfaces map to other usecases.  E.g. I think we can get enough
> > information on how Hyper-V handles PT-PV network device failover from
> > the current Linux implementation; it may be a good idea to share some
> > concepts and workflows with virtio-pci.
> As you may see from the above, the handshake of virtio failover depends on
> hot plug (PCI or ACPI) and virtio specifics (feature negotiation). So far
> as I can see, Hyper-V uses a completely different handshake protocol of its
> own (guest-initiated datapath switch, serial number in the VMBus PCI
> bridge) than that of virtio. I can hardly imagine how the code could be
> implemented in a shared manner, although I agree the failover group concept
> is conceptually similar or the same between the two.

I actually think there must be a lot in common: the way for the
management layer to specify the binding between the PT and PV devices;
the overall sequence of state transitions of every component, the QMP
events and the points in time when they are emitted, the way to adjust
host-side network configuration and the time when to do it, and so on.
It's unfortunate that the implementation of PV-PT failover in guest
Linux happens to have diverged between virtio and hyperv, but I don't
see any fundamental difference and I wouldn't be surprised if they
eventually converged sooner rather than later.

There are a few things that need to be specific for PT and/or PV
transport, the matching identifier among them, but I guess a lot can
still be in common.

Roman.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-06 13:54                   ` [virtio-dev] Re: [Qemu-devel] " Cornelia Huck
@ 2018-07-09 13:14                     ` Roman Kagan
  -1 siblings, 0 replies; 106+ messages in thread
From: Roman Kagan @ 2018-07-09 13:14 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Siwei Liu, Venu Busireddy, si-wei liu, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel

On Fri, Jul 06, 2018 at 03:54:06PM +0200, Cornelia Huck wrote:
> If we want to extend that to other device types, we can add the field
> in their config space; but I'd rather prefer a more generic "host
> relays config metainformation" approach.

The problem with this approach is that it's too generic and poorly
scoped.  As a result I guess it's likely to take ages to converge.

See for instance the discussion on including VM config metadata in qcow2
images: I think it's similar in scope (or the absence thereof).

Roman.

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-09 13:14                     ` [virtio-dev] Re: [Qemu-devel] " Roman Kagan
@ 2018-07-09 16:10                       ` Cornelia Huck
  -1 siblings, 0 replies; 106+ messages in thread
From: Cornelia Huck @ 2018-07-09 16:10 UTC (permalink / raw)
  To: Roman Kagan
  Cc: Siwei Liu, Venu Busireddy, si-wei liu, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel

On Mon, 9 Jul 2018 16:14:19 +0300
Roman Kagan <rkagan@virtuozzo.com> wrote:

> On Fri, Jul 06, 2018 at 03:54:06PM +0200, Cornelia Huck wrote:
> > If we want to extend that to other device types, we can add the field
> > in their config space; but I'd rather prefer a more generic "host
> > relays config metainformation" approach.  
> 
> The problem with this approach is that it's too generic and poorly
> scoped.  As a result I guess it's likely to take ages to converge.

I'd certainly not want to wait for it for this particular issue;
however, I think there is value in having this (seeing that at least
OpenStack and s390x DPM already make use of such a mechanism). But we
can still revisit this later on.

(I'm not too sold on branching out to other device types in general,
but there seems to be demand for it.)

> 
> See for instance the discussion on including vm config metadata in qcow2
> images: I think it's similar in scope (or the absense thereof).
> 
> Roman.

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-06 15:07                     ` [virtio-dev] Re: [Qemu-devel] " Michael S. Tsirkin
@ 2018-07-09 16:20                       ` Cornelia Huck
  -1 siblings, 0 replies; 106+ messages in thread
From: Cornelia Huck @ 2018-07-09 16:20 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Siwei Liu, Venu Busireddy, Roman Kagan, si-wei liu,
	Marcel Apfelbaum, virtio-dev, qemu-devel

On Fri, 6 Jul 2018 18:07:00 +0300
"Michael S. Tsirkin" <mst@redhat.com> wrote:

> On Fri, Jul 06, 2018 at 03:54:06PM +0200, Cornelia Huck wrote:
> > On Thu, 5 Jul 2018 17:49:10 -0700
> > Siwei Liu <loseweigh@gmail.com> wrote:
> >   
> > > On Wed, Jul 4, 2018 at 5:15 AM, Cornelia Huck <cohuck@redhat.com> wrote:  
> > > > On Tue, 3 Jul 2018 16:31:03 -0700
> > > > Siwei Liu <loseweigh@gmail.com> wrote:
> > > >    
> > > >> On Tue, Jul 3, 2018 at 7:52 AM, Cornelia Huck <cohuck@redhat.com> wrote:    
> > > >> > From my point of view, there are several concerns:
> > > >> > - This approach assumes a homogeneous pairing (same transport), and
> > > >> >   even more, it assumes that this transport is pci.    
> > > >>
> > > >> Not really.
> > > >>
> > > >> There could be some other place to define a generic (transport
> > > >> independent) virtio feature, whereas the data (group ID) can be stored
> > > >> in transport specific way. That generic virtio feature and the way to
> > > >> specify target transport to group with is yet to be defined. I don't
> > > >> see this patch is in conflict with that direction.    
> > > >
> > > > Sorry, but I really do not see how this is not pci-specific.
> > > >
> > > > - One of your components is a bridge. A transport does not necessarily
> > > >   have that concept, at least not in a way meaningful for this approach
> > > >   to work.    
> > > 
> > > Assuming we have a transport agnostic high-level virtio feature to
> > > indicate that the target type for the to-be-paired device is PCI, then
> > > we have to use the data stored in a PCI bridge to pair the device. It
> > > does not associate the transport of virtio itself with the type of target
> > > device, right. The introduction of PCI bridge is just for storing the
> > > group ID data for the PCI passthrough device case, while one may
> > > define and introduce other means to retrieve the info if target device
> > > type is not PCI e.g. a special instruction to get group ID on s390x
> > > zPCI.  
> > 
> > Well, zPCI *is* PCI, just a very quirky one. I'm not sure how upper
> > layers are supposed to distinguish that.
> > 
> > Is there really no existing PCI identifier that (a) the host admin
> > already knows and (b) the guest can retrieve easily?  
> 
> There is e.g. PCI Express serial number. However given that
> one use of the feature is for live migration, we really
> shouldn't use any ID from a real card.

With any "vfio vs. live migration" setup, we basically have two
different cases:
- Both source and target use the same physical device (device shared
  between logical partitions, device accessible from two different
  machines).
- We are dealing with two different devices that can be configured to
  look/act the same.

I have been mostly thinking about the first case, but the serial number
probably won't work with the second case.

> 
> > >   
> > > > - Even if we can introduce transport-specific ways for other
> > > >   transports, the bridge concept still means that the pairing cannot be
> > > >   cross-transport.    
> > > 
> > > Sorry, I did not get this. A bridge is a PCI-specific concept indeed,
> > > but the virtio itself can be of another transport. Why does it matter? I
> > > guess a higher-level feature is needed to define the target transport
> > > for the to-be-paired device. The group ID info can reside on the
> > > transport specific area on virtio itself though. E.g. for virtio-pci,
> > > get it on the vendor specific capability in the PCI configuration
> > > space.
> > > For virtio-ccw, you may come up with CCW specific way to get the group
> > > ID data. Is this not what you had in mind?  
> > 
> > No, my idea was it to add it as a field in the virtio-net config space,
> > making it transport-agnostic. (And making this a typed value, depending
> > on the vfio device we want to pair the virtio device with.)
> > 
> > If we want to extend that to other device types, we can add the field
> > in their config space; but I'd rather prefer a more generic "host
> > relays config metainformation" approach.
> >   
> > >   
> > > >
> > > > I think we should be clear what we want from a generic
> > > > (transport-agnostic) virtio feature first. Probably some way to relay
> > > > an identifier of the to-be-paired device (transport-specific +
> > > > information about what the transport is?)
> > > >    
> > > Are we talking about the transport of the virtio itself or the target
> > > device? Note the virtio feature must always be retrieved in a
> > > transport-specific way, although the semantics of the feature can be
> > > transport independent/agnostic. Once a virtio driver instance gets to
> > > the point to read feature bits from the device, the transport of that
> > > virtio device is determined and cannot be changed later.  
> > 
> > No, I don't want to introduce a new, per-transport mechanism, but
> > rather put it into the config space.  
> > > 
> > >   
> > > >> > - It won't work for zPCI (although zPCI is really strange) -- this
> > > >> >   means it will be completely unusable on s390x.    
> > > >> I still need more information about this use case. First off, does
> > > >> zPCI support all the hot plug semantics or functionality the same way
> > > >> as PCI? Or there has to be some platform or firmware support like
> > > >> ACPI hotplug? Does QEMU have all the pieces ready for s390 zPCI
> > > >> hotplug?    
> > > >
> > > > zPCI is a strange beast, so first a pointer to a writeup I did:
> > > > https://virtualpenguins.blogspot.de/2018/02/notes-on-pci-on-s390x.html
> > > >
> > > > It does support hotplug, within the s390 architectural context, but
> > > > that should be fine for our needs here.
> > > >
> > > > My concern comes from the 'no topology' issue. We build a fake topology
> > > > in QEMU (to use the generic pci infrastructure), but the guest does not
> > > > see any of this. It issues an instruction and gets a list of functions.
> > > > This means any bridge information is not accessible to the guest.    
> > > 
> > > That is the reason why I prefer leaving it to specific transport
> > > (zPCI) to define its own means to present and retrieve the group ID
> > > information. The high-level feature bit just provides the indirection
> > > to the target transport (for the to-be-paired device), while a separate
> > > transport can have a way of its own to present/retrieve the group ID
> > > data.
> > > 
> > > I don't know where to present that group ID data on s390 zPCI though
> > > to be honest. But I doubt we can come up with a very general
> > > transport-agnostic way to present the data for both (and potentially
> > > ALL) bus architectures.  
> > 
> > I don't want to establish zPCI as a different transport than 'normal'
> > PCI; that would just make life difficult for everyone. The 'put it into
> > virtio config space' approach would mean a second feature bit, but
> > should just work.  
> 
> So how about that. Each transport will gain a new field
> which is offset within config space of a generic config.
> Each field within that will be guarded by a feature bit.
> 
> E.g. add CCW_CMD_READ_CONF_OFFSET.
> 
> Is this better than a separate new generic config space?

From a ccw view, it would amount to the same (i.e., a new command is
needed in any case). Currently, I'm actually not quite sure how much
more designing of this makes sense (I'll reply elsewhere in this
thread).



* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-06 23:37                     ` [virtio-dev] Re: [Qemu-devel] " Siwei Liu
@ 2018-07-09 16:27                       ` Cornelia Huck
  -1 siblings, 0 replies; 106+ messages in thread
From: Cornelia Huck @ 2018-07-09 16:27 UTC (permalink / raw)
  To: Siwei Liu
  Cc: Venu Busireddy, Roman Kagan, si-wei liu, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel

On Fri, 6 Jul 2018 16:37:10 -0700
Siwei Liu <loseweigh@gmail.com> wrote:

> On Fri, Jul 6, 2018 at 6:54 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> > On Thu, 5 Jul 2018 17:49:10 -0700
> > Siwei Liu <loseweigh@gmail.com> wrote:
> >  
> >> On Wed, Jul 4, 2018 at 5:15 AM, Cornelia Huck <cohuck@redhat.com> wrote:  
> >> > On Tue, 3 Jul 2018 16:31:03 -0700
> >> > Siwei Liu <loseweigh@gmail.com> wrote:
> >> >  
> >> >> On Tue, Jul 3, 2018 at 7:52 AM, Cornelia Huck <cohuck@redhat.com> wrote:  
> >> >> > From my point of view, there are several concerns:
> >> >> > - This approach assumes a homogeneous pairing (same transport), and
> >> >> >   even more, it assumes that this transport is pci.  
> >> >>
> >> >> Not really.
> >> >>
> >> >> There could be some other place to define a generic (transport
> >> >> independent) virtio feature, whereas the data (group ID) can be stored
> >> >> in transport specific way. That generic virtio feature and the way to
> >> >> specify target transport to group with is yet to be defined. I don't
> >> >> see this patch is in conflict with that direction.  
> >> >
> >> > Sorry, but I really do not see how this is not pci-specific.
> >> >
> >> > - One of your components is a bridge. A transport does not necessarily
> >> >   have that concept, at least not in a way meaningful for this approach
> >> >   to work.  
> >>
> >> Assuming we have a transport agnostic high-level virtio feature to
> >> indicate that the target type for the to-be-paired device is PCI, then
> >> we have to use the data stored in a PC bridge to pair the device. It
> >> does not associate transport of virtio itself with the type of target
> >> device, right. The introduction of PCI bridge is just for storing the
> >> group ID data for the PCI passthrough device case, while one may
> >> define and introduce other means to retrieve the info if target device
> >> type is not PCI e.g. a special instruction to get group ID on s390x
> >> zPCI.  
> >
> > Well, zPCI *is* PCI, just a very quirky one. I'm not sure how upper
> > layers are supposed to distinguish that.
> >
> > Is there really no existing PCI identifier that (a) the host admin
> > already knows and (b) the guest can retrieve easily?  
> 
> Not sure if zPCI should be categorized as real PCI, due to its lack of
> topology. Some other virtual or para-virtual PCI buses behave like
> PCI without being full-blown implementations, and those are not
> regarded as (native) PCI emulation either.

I'm actually close to giving up on zPCI... it is very odd, not
publicly documented, and I see disk passthrough via vfio-ccw as the
more interesting case anyway.

> 
> Can s390x guest use UID and FID described in the blog post you wrote
> to identify zPCI device? I am thinking if we can define other grouping
> mechanism than the 64bit long group ID (formerly 128 bit UUID). The
> high-level indirection not just specifies target transport, but maybe
> grouping mechanism also.

Not sure if this could work with how they are architected
(semantics-wise)... no public documentation is available, so I'll
assume we can't use them for another purpose.

> 
> >  
> >>  
> >> > - Even if we can introduce transport-specific ways for other
> >> >   transports, the bridge concept still means that the pairing cannot be
> >> >   cross-transport.  
> >>
> >> Sorry, I did not get this. A bridge is a PCI-specific concept indeed,
> >> but the virtio itself can be of another transport. Why does it matter? I
> >> guess a higher-level feature is needed to define the target transport
> >> for the to-be-paired device. The group ID info can reside on the
> >> transport specific area on virtio itself though. E.g. for virtio-pci,
> >> get it on the vendor specific capability in the PCI configuration
> >> space.
> >> For virtio-ccw, you may come up with CCW specific way to get the group
> >> ID data. Is this not what you had in mind?  
> >
> > No, my idea was it to add it as a field in the virtio-net config space,
> > making it transport-agnostic. (And making this a typed value, depending
> > on the vfio device we want to pair the virtio device with.)
> >
> > If we want to extend that to other device types, we can add the field
> > in their config space; but I'd rather prefer a more generic "host
> > relays config metainformation" approach.
> >  
> >>  
> >> >
> >> > I think we should be clear what we want from a generic
> >> > (transport-agnostic) virtio feature first. Probably some way to relay
> >> > an identifier of the to-be-paired device (transport-specific +
> >> > information what the transport is?)
> >> >  
> >> Are we talking about the transport of the virtio itself or the target
> >> device? Note the virtio feature must always be retrieved using
> >> a transport-specific way, although the semantics of the feature can be
> >> transport independent/agnostic. Once a virtio driver instance gets to
> >> the point to read feature bits from the device, the transport of that
> >> virtio device is determined and cannot be changed later.  
> >
> > No, I don't want to introduce a new, per-transport mechanism, but
> > rather put it into the config space.  
> 
> OK.
> 
> I think the common understanding is that config space shouldn't be per
> device type. MST's suggestion in the other mail to carve some space
> from the device config space for generic use seems viable to me.
> 
> >  
> >>
> >>  
> >> >> > - It won't work for zPCI (although zPCI is really strange) -- this
> >> >> >   means it will be completely unusable on s390x.  
> >> >> I still need more information about this use case. First off, does
> >> >> zPCI support all the hot plug semantics or functionality the same way
> >> as PCI? Or does there have to be some platform or firmware support like
> >> >> ACPI hotplug? Does QEMU have all the pieces ready for s390 zPCI
> >> >> hotplug?  
> >> >
> >> > zPCI is a strange beast, so first a pointer to a writeup I did:
> >> > https://virtualpenguins.blogspot.de/2018/02/notes-on-pci-on-s390x.html
> >> >
> >> > It does support hotplug, within the s390 architectural context, but
> >> > that should be fine for our needs here.
> >> >
> >> > My concern comes from the 'no topology' issue. We build a fake topology
> >> > in QEMU (to use the generic pci infrastructure), but the guest does not
> >> > see any of this. It issues an instruction and gets a list of functions.
> >> > This means any bridge information is not accessible to the guest.  
> >>
> >> That is the reason why I prefer leaving it to specific transport
> >> (zPCI) to define its own means to present and retrieve the group ID
> >> information. The high-level feature bit just provides the indirection
> >> to the target transport (for the to-be-paired device), while separate
> >> transport can have a way of its own to present/retrieve the group ID
> >> data.
> >>
> >> I don't know where to present that group ID data on s390 zPCI though
> >> to be honest. But I doubt we can come up with a very general
> >> transport-agnostic way to present the data for both (and potentially
> >> ALL) bus architectures.  
> >
> > I don't want to establish zPCI as a different transport than 'normal'
> > PCI; that would just make life difficult for everyone. The 'put it into
> > virtio config space' approach would mean a second feature bit, but
> > should just work.  
> 
> Sorry, but we still need an ID to match the VFIO passthrough device. I
> don't see how putting the info into virtio config space can address
> the problem.

As said, it is probably better to just give up on zPCI and focus on
'normal' PCI setups. I think we could handle dasd-via-vfio-ccw without
a paired device, and PCI is probably not the prime use case on s390x.
If any IBM folks (with access to documentation etc.) have an opinion,
they should speak up.



* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-09 13:00         ` Roman Kagan
@ 2018-07-09 18:35             ` Michael S. Tsirkin
  2018-07-10  1:11             ` [virtio-dev] " si-wei liu
  2018-07-10  2:05             ` [virtio-dev] " Michael S. Tsirkin
  2 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-09 18:35 UTC (permalink / raw)
  To: Roman Kagan, si-wei liu, Venu Busireddy, Marcel Apfelbaum,
	virtio-dev, qemu-devel

On Mon, Jul 09, 2018 at 04:00:36PM +0300, Roman Kagan wrote:
> > The assumption of failover driver is that the primary (PT device) will be
> > able to get a datapath once it shows up in the guest .
> 
> I wonder how universal this assumption is

Not universal at all. It is, however, the simplest
use-case we could think of, and as a result it is already
supported by Linux.

I for one will be happy if we spec out a fuller support
for more use-cases but it's probably orthogonal to
grouping.

-- 
MST



* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-09 13:00         ` Roman Kagan
@ 2018-07-10  1:11             ` si-wei liu
  2018-07-10  1:11             ` [virtio-dev] " si-wei liu
  2018-07-10  2:05             ` [virtio-dev] " Michael S. Tsirkin
  2 siblings, 0 replies; 106+ messages in thread
From: si-wei liu @ 2018-07-10  1:11 UTC (permalink / raw)
  To: Roman Kagan, Venu Busireddy, Michael S . Tsirkin,
	Marcel Apfelbaum, virtio-dev, qemu-devel
  Cc: sridhar.samudrala, alexander.h.duyck, netdev



On 7/9/2018 6:00 AM, Roman Kagan wrote:
> On Tue, Jul 03, 2018 at 03:27:23PM -0700, si-wei liu wrote:
>> On 7/3/2018 2:58 AM, Roman Kagan wrote:
>>> So how is this coordination going to work?  One possibility is that the
>>> PV device emits a QMP event upon the guest driver confirming the support
>>> for failover, the management layer intercepts the event and performs
>>> device_add of the PT device.  Another is that the PT device is added
>>> from the very beginning (e.g. on the QEMU command line) but its parent
>>> PCI bridge subscribes a callback with the PV device to "activate" the PT
>>> device upon negotiating the failover feature.
>>>
>>> I think this needs to be decided within the scope of this patchset.
>> As what had been discussed in previous thread below, we would go with the
>> approach that QEMU manages the visibility of the PT device automatically.
>> Management layer supplies PT device to QEMU from the very beginning. This PT
>> device won't be exposed to guest immediately, unless or until the guest
>> virtio driver acknowledges the backup feature already. Once virtio driver in
>> the guest initiates a device reset, the corresponding PT device must be
>> taken out from guest. Then add it back later on after guest virtio completes
>> negotiation for the backup feature.
> This means that the parent bridge of the PT device (or whatever else can
> control the visibility of the PT device to the guest) will need to
> cooperate with the PV device *within* QEMU.  The most natural way to
> specify this connection is to have a property of one device to refer to
> the other by device-id.
This scheme has the problem that one device has to depend on the 
presence of the other - QEMU becomes sensitive to enumeration 
order if the two are not placed in the same bridge or PCI hierarchy. You 
can't get it working reliably if the bridge is realized before the 
dependent PV device has been, or vice versa.

> Another benefit of this approach is that it will allow to hide the
> (possibly transport-specific) device matching identifiers from the QEMU
> caller, as it won't need to be persistent nor visible to the management
> layer.  In particular, this will allow to move forward with the
> implementation of this PT-PV cooperation while the discussion of the
> matching scheme is still ongoing, because matching by MAC will certainly
> work as a first approximation.
The plan is to enable group-ID-based matching in the first place rather 
than matching by MAC, the latter of which is fragile and problematic. I 
have made the Linux side changes and will post them once the QEMU 
discussion about grouping is finalized.

>
>>>>> Is the guest supposed to signal the datapath switch to the host?
>>>> No, guest doesn't need to be initiating datapath switch at all.
>>> What happens if the guest supports failover in its PV driver, but lacks
>>> the driver for the PT device?
>> The assumption of failover driver is that the primary (PT device) will be
>> able to get a datapath once it shows up in the guest .
> I wonder how universal this assumption is, given the variety of possible
> network configurations, including filters, VLANs, etc.  For whatever
> reason Hyper-V defines a control message over the PV device from guest
> to host for that.
These scenarios are different from the case with no driver support at 
all within the guest. I think I raised this specifically as part of doing 
proper error handling when reviewing the original virtio failover patch 
- if the failover module fails to enslave the VF due to guest network 
configuration, it has to signal virtio-net to propagate the error back 
to the host. One way to handle that is to have virtio-net initiate a device 
reset and clear the feature bit upon re-negotiation, such that the VF will 
be unplugged and won't get plugged in again. I don't know why the patch 
submitter did not incorporate that change, but it's in our 
plan to enhance that part, no worries.

>> If adding a PT device
>> to an unsupported guest, the result will be same as that without a standby
>> PV driver - basically got no networking as you don't get a working driver.
>>
>> Then perhaps don't add the PT device in the first place if guest lacks
>> driver support?
> You don't know this in advance.
From a migration point of view, it does not matter whether the guest
lacks driver support for the VF. I'd like to avoid duplicating the
Hyper-V concept if at all possible. What makes sense for Hyper-V's
accelerated networking doesn't have to work for KVM/QEMU SR-IOV live
migration. Are you sure that the Hyper-V control message was added for
this sole purpose? It seems to me overkill for such an edge scenario.

>
>>>> However, QMP
>>>> events may be generated when exposing or hiding the PT device through hot
>>>> plug/unplug to facilitate host to switch datapath.
>>> The PT device hot plug/unplug are initiated by the host, aren't they?  Why
>>> would it also need QMP events for them?
>> As indicated above, the hot plug/unplug are initiated by QEMU not the
>> management layer. Hence the QMP hot plug event is used as an indicator to
>> switch host datapath. Unlike Windows Hyper-V SR-IOV driver model, the Linux
>> host network stack does not offer a fine grained PF driver API to move
>> MAC/VLAN filter, and the VF driver has to start with some initial MAC
>> address filter programmed in when present in the guest. The QMP event is
>> served as a checkpoint to switch MAC filter and/or VLAN filter between the
>> PV and the VF.
>>
> I'd appreciate something like a sequence diagram to better understand
> the whole picture...
>
>>>>> Is the scheme going to be applied/extended to other transports (vmbus,
>>>>> virtio-ccw, etc.)?
>>>> Well, it depends on the use case, and how feasible it can be extended to
>>>> other transport due to constraints and transport specifics.
>>>>
>>>>> Is the failover group concept going to be used beyond PT-PV network
>>>>> device failover?
>>>> Although the concept of failover group is generic, the implementation itself
>>>> may vary.
>>> My point with these two questions is that since this patchset is
>>> defining external interfaces -- with guest OS, with management layer --
>>> which are not easy to change later, it might make sense to try and see
>>> if the interfaces map to other usecases.  E.g. I think we can get enough
>>> information on how Hyper-V handles PT-PV network device failover from
>>> the current Linux implementation; it may be a good idea to share some
>>> concepts and workflows with virtio-pci.
>> As you may see from above, the handshake of virtio failover depends on hot
>> plug (PCI or ACPI) and virtio specifics (feature negotiation). So far as I
>> see the Hyper-V uses a completely different handshake protocol of its own
>> (guest initiated datapath switch, Serial number in VMBus PCI bridge) than
>> that of virtio. I can barely imagine how code could be implemented in a
>> shared manner, although I agree conceptually failover group between these
>> two is similar or the same.
> I actually think there must be a lot in common: the way for the
> management layer to specify the binding between the PT and PV devices;
> the overall sequence of state transitions of every component, the QMP
> events and the points in time when they are emitted, the way to adjust
> host-side network configuration and the time when to do it, and so on.
> It's unfortunate that the implementation of PV-PT failover in guest
> Linux happens to have diverged between virtio and hyperv, but I don't
> see any fundamental difference and I wouldn't be surprised if they
> eventually converged sooner rather than later.
(loop in Intel folks and Linux netdev)

Actually it's not without reason that Linux/virtio has to diverge from
Hyper-V. The Hyper-V SR-IOV driver model allows the VF to be plugged
in and registered with the stack without a MAC address filter
programmed into the NIC, while Windows Hyper-V on the host side is
able to move a MAC filter from the NIC PV backend to the VF upon
receiving the DATAPATH_SWITCH control message initiated by the guest.
Windows NDIS has the OID_RECEIVE_FILTER_MOVE_FILTER API to have the PF
update the MAC filter for the VF without involving the guest or the
VF. None of these has an equivalent in the Linux SR-IOV driver model.
How do you propose to have the guest initiate the datapath switch at
an arbitrary point in time when you're dealing with the Linux host
network stack?

One may say we can plug in a VF with a random MAC filter programmed in
beforehand, and initially use that random MAC within the guest. This
would require:

a) not relying on the permanent MAC address for pairing during the
initial discovery, e.g. using the failover group ID as in this
discussion
b) the host toggling the MAC address filter: this includes taking down
the tap device to return the MAC back to the PF, followed by assigning
that MAC to the VF using "ip link ... set vf ..."
c) notifying the guest to reload/reset the VF driver for the change of
hardware MAC address
d) until the VF reloads its driver it won't be able to use the
datapath, so a very short period of network outage is (still) expected
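
Steps b) and c) above amount to a short host-side command sequence. A
minimal sketch follows; all names here (the PF, the tap device, the VF
index, and the MAC) are hypothetical placeholders, and the commands
are printed rather than executed so the sequence can be inspected
without a real SR-IOV host:

```shell
#!/bin/sh
# Sketch of the host-side MAC filter move from the PV backend to the VF.
# All device names are placeholder assumptions; emit_cmd prints each
# command instead of running it.
PF=eth0                    # physical function netdev (assumption)
TAP=tap0                   # tap backing the virtio-net PV device (assumption)
VF_INDEX=0                 # VF that should take over the MAC (assumption)
MAC=52:54:00:12:34:56      # the guest-visible MAC address (assumption)

emit_cmd() { printf '%s\n' "$*"; }

move_mac_to_vf() {
    # b) take the tap down so its MAC filter is returned to the PF ...
    emit_cmd ip link set "$TAP" down
    # ... then program the same MAC into the VF through the PF driver
    emit_cmd ip link set "$PF" vf "$VF_INDEX" mac "$MAC"
    # c) the guest must then reload/reset the VF driver to pick up the
    #    new hardware MAC (notified out of band, e.g. via the QMP event)
}

move_mac_to_vf
```

On a real host one would replace emit_cmd with direct execution and
drive the guest notification from the management layer.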

But as you can see, this still diverges from the Hyper-V model. What
do we buy by using a random address during the initial discovery and
requiring the VF to complete the handshake? Less network downtime
during datapath switching? Sorry, but that's not a key factor at all
for our main goal - live migration.

Regards,
-Siwei

>
> There are a few things that need to be specific for PT and/or PV
> transport, the matching identifier among them, but I guess a lot can
> still be in common.
>
> Roman.
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-10  1:11             ` [virtio-dev] " si-wei liu
@ 2018-07-10  1:54               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-10  1:54 UTC (permalink / raw)
  To: si-wei liu
  Cc: Roman Kagan, Venu Busireddy, Marcel Apfelbaum, virtio-dev,
	qemu-devel, sridhar.samudrala, alexander.h.duyck, netdev

On Mon, Jul 09, 2018 at 06:11:53PM -0700, si-wei liu wrote:
> The plan is to enable group ID based matching in the first place rather than
> match by MAC, the latter of which is fragile and problematic.

It isn't all that fragile - Hyper-V used the same for a while, so if
someone posts working patches with QEMU support before this grouping
stuff, I'll happily apply them.

> I have made
> the Linux side changes and will get it posted once the QEMU discussion for
> grouping is finalized.

Go ahead but at this point I think we need to see actual QEMU support at
least for the basic functionality before we merge more code Linux-side.

-- 
MST

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-10  1:11             ` [virtio-dev] " si-wei liu
@ 2018-07-10  1:58               ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-10  1:58 UTC (permalink / raw)
  To: si-wei liu
  Cc: Roman Kagan, Venu Busireddy, Marcel Apfelbaum, virtio-dev,
	qemu-devel, sridhar.samudrala, alexander.h.duyck, netdev

On Mon, Jul 09, 2018 at 06:11:53PM -0700, si-wei liu wrote:
> What do we buy
> for using a random address during initial discovery and requiring VF to
> complete the handshake?

I don't see advantages to using a random address that is then
changed either: changing a MAC causes network downtime for most users.

> Less network downtime during datapath switching?
> Sorry but that's not a key factor at all for our main goal - live migration.

Isn't avoiding downtime what makes the migration "live"?
If you don't care about it at all just remove the device
and migrate without all these tricks.


-- 
MST

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-09 13:00         ` Roman Kagan
@ 2018-07-10  2:05             ` Michael S. Tsirkin
  2018-07-10  1:11             ` [virtio-dev] " si-wei liu
  2018-07-10  2:05             ` [virtio-dev] " Michael S. Tsirkin
  2 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-10  2:05 UTC (permalink / raw)
  To: Roman Kagan, si-wei liu, Venu Busireddy, Marcel Apfelbaum,
	virtio-dev, qemu-devel

On Mon, Jul 09, 2018 at 04:00:36PM +0300, Roman Kagan wrote:
> There are a few things that need to be specific for PT and/or PV
> transport, the matching identifier among them, but I guess a lot can
> still be in common.

Maybe, but frankly, unless someone actually plans to bother
implementing all this in QEMU's limited emulation of Hyper-V on top of
KVM, I don't see why we would care.

What I saw gives me the impression Hyper-V wants to just be
left alone as much as possible and maybe we should respect that.

-- 
MST

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-06-29 22:19 ` [virtio-dev] " Venu Busireddy
@ 2018-07-10  2:11   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-10  2:11 UTC (permalink / raw)
  To: Venu Busireddy; +Cc: Marcel Apfelbaum, virtio-dev, qemu-devel

On Fri, Jun 29, 2018 at 05:19:03PM -0500, Venu Busireddy wrote:
> The current patch set includes all the feedback received for proposals [3]
> and [4]. For the sake of completeness, patch for the virtio specification
> is also included here. Following is the updated proposal.
> 
> 1. Extend the virtio specification to include a new virtio PCI capability
>    "VIRTIO_PCI_CAP_GROUP_ID_CFG".

....

>  default-configs/arm-softmmu.mak             |   1 +
>  default-configs/i386-softmmu.mak            |   1 +
>  default-configs/x86_64-softmmu.mak          |   1 +
>  hw/pci-bridge/Makefile.objs                 |   1 +
>  hw/pci-bridge/pci_bridge_dev.c              |  10 +
>  hw/pci-bridge/pcie_downstream.c             | 220 ++++++++++++++++++++
>  hw/pci-bridge/pcie_downstream.h             |  10 +
>  hw/pci/pci_bridge.c                         |  34 +++
>  hw/virtio/virtio-pci.c                      |  15 ++
>  hw/virtio/virtio-pci.h                      |   3 +-
>  include/hw/pci/pci.h                        |  37 ++--
>  include/hw/pci/pci_bridge.h                 |   2 +
>  include/hw/pci/pcie.h                       |   1 +
>  include/standard-headers/linux/virtio_pci.h |   8 +
>  14 files changed, 326 insertions(+), 18 deletions(-)
>  create mode 100644 hw/pci-bridge/pcie_downstream.c
>  create mode 100644 hw/pci-bridge/pcie_downstream.h

I don't see the spec patch here.

-- 
MST

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-10  2:11   ` [virtio-dev] " Michael S. Tsirkin
@ 2018-07-10 14:28     ` Venu Busireddy
  -1 siblings, 0 replies; 106+ messages in thread
From: Venu Busireddy @ 2018-07-10 14:28 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Marcel Apfelbaum, virtio-dev, qemu-devel

On 2018-07-10 05:11:18 +0300, Michael S. Tsirkin wrote:
> On Fri, Jun 29, 2018 at 05:19:03PM -0500, Venu Busireddy wrote:
> > The current patch set includes all the feedback received for proposals [3]
> > and [4]. For the sake of completeness, patch for the virtio specification
> > is also included here. Following is the updated proposal.
> > 
> > 1. Extend the virtio specification to include a new virtio PCI capability
> >    "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> 
> ....
> 
> >  default-configs/arm-softmmu.mak             |   1 +
> >  default-configs/i386-softmmu.mak            |   1 +
> >  default-configs/x86_64-softmmu.mak          |   1 +
> >  hw/pci-bridge/Makefile.objs                 |   1 +
> >  hw/pci-bridge/pci_bridge_dev.c              |  10 +
> >  hw/pci-bridge/pcie_downstream.c             | 220 ++++++++++++++++++++
> >  hw/pci-bridge/pcie_downstream.h             |  10 +
> >  hw/pci/pci_bridge.c                         |  34 +++
> >  hw/virtio/virtio-pci.c                      |  15 ++
> >  hw/virtio/virtio-pci.h                      |   3 +-
> >  include/hw/pci/pci.h                        |  37 ++--
> >  include/hw/pci/pci_bridge.h                 |   2 +
> >  include/hw/pci/pcie.h                       |   1 +
> >  include/standard-headers/linux/virtio_pci.h |   8 +
> >  14 files changed, 326 insertions(+), 18 deletions(-)
> >  create mode 100644 hw/pci-bridge/pcie_downstream.c
> >  create mode 100644 hw/pci-bridge/pcie_downstream.h
> 
> I don't see the spec patch here.

Since the spec patch is in a different repo, I could not get 'git
format-patch' to combine them. As a result, I added the spec patch
itself to the patches, but it is not in the cover letter.

I can combine them manually and repost. Do you want me to do that?
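
For reference, one way to build a single consecutively numbered series
out of two repos is to run `git format-patch --start-number` from each
tree into a shared output directory. The sketch below uses throwaway
repos created on the fly purely for illustration; the real spec and
QEMU trees (and real commit ranges) would be substituted:

```shell
#!/bin/sh
# Sketch: number patches from two separate repos into one series
# directory. The repos and commit subjects below are throwaway
# stand-ins, not the actual trees from this thread.
set -e
work=$(mktemp -d)
out=$work/series
mkdir -p "$out"

mkrepo() {                          # repo path in $1, commit subject in $2
    git init -q "$1"
    echo change > "$1/file"
    git -C "$1" add file
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "$2"
}

mkrepo "$work/spec" "content: add group ID PCI capability"
mkrepo "$work/qemu" "virtio-pci: add failover-group-id option"

# The spec patch becomes 1/2 and the QEMU patch 2/2, in one directory.
git -C "$work/spec" format-patch -1 -o "$out" --start-number 1 >/dev/null
git -C "$work/qemu" format-patch -1 -o "$out" --start-number 2 >/dev/null

ls "$out"   # 0001-*.patch and 0002-*.patch, ready for git send-email
```

A cover letter added as 0000-cover-letter.patch in the same directory
would then let `git send-email` post the whole set as one thread.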

Thanks,

Venu

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-10  1:58               ` [virtio-dev] " Michael S. Tsirkin
  (?)
@ 2018-07-10 18:56                 ` Siwei Liu
  -1 siblings, 0 replies; 106+ messages in thread
From: Siwei Liu @ 2018-07-10 18:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: si-wei liu, Roman Kagan, Venu Busireddy, Marcel Apfelbaum,
	virtio-dev, qemu-devel, Samudrala, Sridhar, Alexander Duyck,
	Netdev

On Mon, Jul 9, 2018 at 6:58 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Mon, Jul 09, 2018 at 06:11:53PM -0700, si-wei liu wrote:
>> What do we buy
>> for using a random address during initial discovery and requiring VF to
>> complete the handshake?
>
> I don't see advantages to using a random address that is then
> changed either: changing a MAC causes network downtime for most users.
>
Definitely.

I see the Linux host stack as fundamentally different from Windows;
it makes no sense to duplicate what Hyper-V is doing, especially if
there's no extra benefit.

>> Less network downtime during datapath switching?
>> Sorry but that's not a key factor at all for our main goal - live migration.
>
> Isn't avoiding downtime what makes the migration "live"?
> If you don't care about it at all just remove the device
> and migrate without all these tricks.

Apparently this downtime is not avoidable on the Linux host stack even
if the guest initiates the switch-over, unless the NIC supports adding
duplicate MAC filters where one has higher priority than the other
when both are present. I feel there's very little, perhaps zero,
improvement in downtime from moving to a guest-initiated datapath
switching model.

However, since this downtime is intermittent and generally
unnoticeable beyond a few packet drops, the network should be
resilient in recovering from them. My point is that unless we can
theoretically move to a datapath switching model with zero downtime,
this kind of minor optimization offers very little help in general.

Regards,
-Siwei



>
>
> --
> MST
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-10  1:54               ` [virtio-dev] " Michael S. Tsirkin
  (?)
@ 2018-07-11  0:07                 ` Siwei Liu
  -1 siblings, 0 replies; 106+ messages in thread
From: Siwei Liu @ 2018-07-11  0:07 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: si-wei liu, Roman Kagan, Venu Busireddy, Marcel Apfelbaum,
	virtio-dev, qemu-devel, Samudrala, Sridhar, Alexander Duyck,
	Netdev

On Mon, Jul 9, 2018 at 6:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Mon, Jul 09, 2018 at 06:11:53PM -0700, si-wei liu wrote:
>> The plan is to enable group ID based matching in the first place rather than
>> match by MAC, the latter of which is fragile and problematic.
>
> It isn't all that fragile - hyperv used same for a while, so if someone
> posts working patches with QEMU support but before this grouping stuff,
> I'll happily apply them.

I wouldn't box the solution into a very limited scenario just because
of matching by MAC. The benefit of having a generic group ID in the
first place is that we save the effort of maintaining legacy MAC-based
pairing, which just adds complexity anyway. The fact that the VF's MAC
address currently cannot be changed by either the PF or the guest user
is a severe limitation due to this. The other use case is that a
passthrough device other than a VF would generally have a different
MAC than the standby virtio device. We shouldn't limit ourselves to
the VF-specific scenario from the very beginning.

It's just a small piece of QEMU code to add, so why not?
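As a concrete illustration, pairing by group ID rather than MAC could
look like the sketch below, in line with the cover letter's proposal.
The bridge-side option spellings, bus wiring, and identifier value here
are assumptions for illustration, not final syntax:

```
# Operator picks a 64-bit failover group identifier (value arbitrary here).
#
# Standby virtio-net device, tagged with the group ID:
#   -device virtio-net-pci,netdev=net0,failover-group-id=0x1122334455667788
#
# Proposed pcie-downstream port (Q35 model) carrying the passthrough
# device, tagged with the same group ID so the guest can pair the two:
#   -device pcie-downstream,id=ds1,failover-group-id=0x1122334455667788
#   -device vfio-pci,host=5e:10.0,bus=ds1
```

The guest would then pair the two devices purely by the matching
64-bit identifier, regardless of whether their MAC addresses agree.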

>
>> I have made
>> the Linux side changes and will get it posted once the QEMU discussion for
>> grouping is finalized.
>
> Go ahead but at this point I think we need to see actual QEMU support at
> least for the basic functionality before we merge more code Linux-side.

The VFIO visibility work is in progress. We'll post patches when they're ready.

Thanks,
-Siwei


>
> --
> MST
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-11  0:07                 ` [Qemu-devel] [virtio-dev] " Siwei Liu
  (?)
@ 2018-07-11  9:53                   ` Cornelia Huck
  -1 siblings, 0 replies; 106+ messages in thread
From: Cornelia Huck @ 2018-07-11  9:53 UTC (permalink / raw)
  To: Siwei Liu
  Cc: Michael S. Tsirkin, si-wei liu, Roman Kagan, Venu Busireddy,
	Marcel Apfelbaum, virtio-dev, qemu-devel, Samudrala, Sridhar,
	Alexander Duyck, Netdev

On Tue, 10 Jul 2018 17:07:37 -0700
Siwei Liu <loseweigh@gmail.com> wrote:

> On Mon, Jul 9, 2018 at 6:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Mon, Jul 09, 2018 at 06:11:53PM -0700, si-wei liu wrote:  
> >> The plan is to enable group ID based matching in the first place rather than
> >> match by MAC, the latter of which is fragile and problematic.  
> >
> > It isn't all that fragile - hyperv used same for a while, so if someone
> > posts working patches with QEMU support but before this grouping stuff,
> > I'll happily apply them.  
> 
> I wouldn't box the solution to very limited scenario just because of
> matching by MAC, the benefit of having generic group ID in the first
> place is that we save the effort of maintaining legacy MAC based
> pairing that just adds complexity anyway. Currently the VF's MAC
> address cannot be changed by either PF or by the guest user is a
> severe limitation due to this. The other use case is that PT device
> than VF would generally have different MAC than the standby virtio. We
> shouldn't limit itself to VF specific scenario from the very
> beginning.

So, this brings me to a different concern: the semantics of
VIRTIO_NET_F_STANDBY.

* The currently sole user seems to be the virtio-net Linux driver.
* The commit messages, code comments and Documentation/ all talk about
  matching by MAC.
* I could not find any proposed update to the virtio spec. (If there
  had been an older proposal with a different feature name, it is not
  discoverable.)

VIRTIO_NET_F_STANDBY is a host <-> guest interface. As there's no
official spec, you can only go by the Linux implementation, and by that
its semantics seem to be 'match by MAC', not 'match by other criteria'.

How is this supposed to work in the long run?

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-11  9:53                   ` [Qemu-devel] [virtio-dev] " Cornelia Huck
  (?)
@ 2018-07-12  9:37                     ` Siwei Liu
  -1 siblings, 0 replies; 106+ messages in thread
From: Siwei Liu @ 2018-07-12  9:37 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Michael S. Tsirkin, si-wei liu, Roman Kagan, Venu Busireddy,
	Marcel Apfelbaum, virtio-dev, qemu-devel, Samudrala, Sridhar,
	Alexander Duyck, Netdev

On Wed, Jul 11, 2018 at 2:53 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> On Tue, 10 Jul 2018 17:07:37 -0700
> Siwei Liu <loseweigh@gmail.com> wrote:
>
>> On Mon, Jul 9, 2018 at 6:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > On Mon, Jul 09, 2018 at 06:11:53PM -0700, si-wei liu wrote:
>> >> The plan is to enable group ID based matching in the first place rather than
>> >> match by MAC, the latter of which is fragile and problematic.
>> >
>> > It isn't all that fragile - hyperv used same for a while, so if someone
>> > posts working patches with QEMU support but before this grouping stuff,
>> > I'll happily apply them.
>>
>> I wouldn't box the solution to very limited scenario just because of
>> matching by MAC, the benefit of having generic group ID in the first
>> place is that we save the effort of maintaining legacy MAC based
>> pairing that just adds complexity anyway. Currently the VF's MAC
>> address cannot be changed by either PF or by the guest user is a
>> severe limitation due to this. The other use case is that PT device
>> than VF would generally have different MAC than the standby virtio. We
>> shouldn't limit itself to VF specific scenario from the very
>> beginning.
>
> So, this brings me to a different concern: the semantics of
> VIRTIO_NET_F_STANDBY.
>
> * The currently sole user seems to be the virtio-net Linux driver.
> * The commit messages, code comments and Documentation/ all talk about
>   matching by MAC.
> * I could not find any proposed update to the virtio spec. (If there
>   had been an older proposal with a different feature name, it is not
>   discoverable.)

No, there was no such spec patch at all when the Linux patch was
submitted, hence matching by MAC was the only means of pairing devices,
due to the lack of QEMU support.

Q: Does it work?
A: Well, it works for some.
Q: Does it work well enough to support all scenarios?
A: No, not as it claims to.
Q: Can it do a better job of supporting all scenarios?
A: Yes, by pairing with the failover group ID instead.
Q: Does pairing still need to be MAC based if using the failover group ID?
A: It depends; it's up to the implementation to verify the MAC address
depending on the need (e.g. VF failover versus PT device replacement),
though MAC matching is no longer positioned as a requirement for
pairing or grouping.

There's no such stickiness for matching by MAC defined anywhere. The
semantics of the VIRTIO_NET_F_STANDBY feature are mostly a failover
concept: the standby device should be used when the primary is not
present. We have now added the group ID on the QEMU side. I don't see
why getting rid of the limitation should be a problem: it has never
been exposed, there are no existing users, and no API/ABI is defined
at all.
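To make the intended semantics concrete, here is a minimal,
hypothetical sketch of group-ID-first pairing, with MAC matching kept
only as an optional verification step and as a legacy fallback. The
class and function names are illustrative and do not come from any
actual driver or QEMU code:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Device:
    name: str
    mac: str
    group_id: Optional[int]  # 64-bit failover group ID, if exposed

def find_primary(standby: Device, candidates: list,
                 require_mac_match: bool = False) -> Optional[Device]:
    """Pair a standby virtio device with its primary passthrough device.

    Prefer the failover group ID; fall back to MAC matching only for
    candidates that do not expose a group ID (legacy behaviour).
    """
    if standby.group_id is not None:
        for dev in candidates:
            if dev.group_id == standby.group_id:
                # Group IDs match; verify MAC only if the scenario needs it
                # (e.g. VF failover, as opposed to PT device replacement).
                if require_mac_match and dev.mac != standby.mac:
                    continue
                return dev
    # Legacy fallback: match by MAC alone.
    for dev in candidates:
        if dev.group_id is None and dev.mac == standby.mac:
            return dev
    return None

standby = Device("virtio0", "52:54:00:12:34:56", group_id=0xABCD)
vf = Device("vf0", "52:54:00:aa:bb:cc", group_id=0xABCD)  # same group, different MAC
legacy = Device("vf1", "52:54:00:12:34:56", group_id=None)

print(find_primary(standby, [legacy, vf]).name)  # prints "vf0"
```

Note that with the group ID present, the passthrough device pairs even
though its MAC differs from the standby's, which is exactly the PT (non-VF)
use case described earlier in the thread.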

>
> VIRTIO_NET_F_STANDBY is a host <-> guest interface. As there's no
> official spec, you can only go by the Linux implementation, and by that
> its semantics seem to be 'match by MAC', not 'match by other criteria'.
>
> How is this supposed to work in the long run?

The group ID approach should work for all OSes, not just Linux.

I will change all the references to MAC matching in my upcoming patch.
Thank you for the note.

-Siwei

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-12  9:37                     ` [Qemu-devel] [virtio-dev] " Siwei Liu
  (?)
@ 2018-07-12 11:31                       ` Cornelia Huck
  -1 siblings, 0 replies; 106+ messages in thread
From: Cornelia Huck @ 2018-07-12 11:31 UTC (permalink / raw)
  To: Siwei Liu
  Cc: Michael S. Tsirkin, si-wei liu, Roman Kagan, Venu Busireddy,
	Marcel Apfelbaum, virtio-dev, qemu-devel, Samudrala, Sridhar,
	Alexander Duyck, Netdev

On Thu, 12 Jul 2018 02:37:03 -0700
Siwei Liu <loseweigh@gmail.com> wrote:

> On Wed, Jul 11, 2018 at 2:53 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> > On Tue, 10 Jul 2018 17:07:37 -0700
> > Siwei Liu <loseweigh@gmail.com> wrote:
> >  
> >> On Mon, Jul 9, 2018 at 6:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote:  
> >> > On Mon, Jul 09, 2018 at 06:11:53PM -0700, si-wei liu wrote:  
> >> >> The plan is to enable group ID based matching in the first place rather than
> >> >> match by MAC, the latter of which is fragile and problematic.  
> >> >
> >> > It isn't all that fragile - hyperv used same for a while, so if someone
> >> > posts working patches with QEMU support but before this grouping stuff,
> >> > I'll happily apply them.  
> >>
> >> I wouldn't box the solution to very limited scenario just because of
> >> matching by MAC, the benefit of having generic group ID in the first
> >> place is that we save the effort of maintaining legacy MAC based
> >> pairing that just adds complexity anyway. Currently the VF's MAC
> >> address cannot be changed by either PF or by the guest user is a
> >> severe limitation due to this. The other use case is that PT device
> >> than VF would generally have different MAC than the standby virtio. We
> >> shouldn't limit itself to VF specific scenario from the very
> >> beginning.  
> >
> > So, this brings me to a different concern: the semantics of
> > VIRTIO_NET_F_STANDBY.
> >
> > * The currently sole user seems to be the virtio-net Linux driver.
> > * The commit messages, code comments and Documentation/ all talk about
> >   matching by MAC.
> > * I could not find any proposed update to the virtio spec. (If there
> >   had been an older proposal with a different feature name, it is not
> >   discoverable.)  
> 
> No, there was no such spec patch at all when the Linux patch was
> submitted, hence match by MAC is the only means to pair device due to
> lack of QEMU support.

We need to know what the device offers if it offers the feature bit,
and what it is supposed to do when the driver negotiates it. Currently,
we can only go by what the Linux driver does and what it expects. IOW,
we need a spec update proposal. Obviously, this should be discussed in
conjunction with the rest.

> 
> Q: Does it work?
> A: Well, it works for some.
> Q: Does it work well to support all scenarios?
> A: No, not as it claims to.
> Q: Can it do better job to support all scenarios?
> A: Yes, do pairing with the failover group ID instead.
> Q: Does pairing still need to be MAC based if using failover group ID?
> A: It depends, it's up to the implementation to verify MAC address
> depending on the need (e.g. VF failover versus PT device replacement),
> though MAC matching is no longer positioned as a requirement for
> pairing or grouping.

Whether matching by MAC is good or sufficient is a different
discussion. It is, however, what the code currently *does*, and in
absence of a spec update, it is the only reference for this feature.

> 
> There's no such stickiness for matching by MAC defined anywhere. The
> semantics of VIRTIO_NET_F_STANDBY feature are mostly a failover
> concept that the standby device should be used when the primary is not
> present. We now have added the group ID on QEMU. I don't see why
> bothering to get rid of the limitation: it's never been exposed. No
> existing users. No API/ABI defined at all.

This is scheduled to be released with the next Linux version, which is
right now in the -rc phase. It *is* API (a guest <-> host API).

No corresponding code is present in QEMU 3.0, which is in freeze right
now. Anything that goes into QEMU 3.1 or later needs to accommodate
Linux 4.18 as a guest.

> 
> >
> > VIRTIO_NET_F_STANDBY is a host <-> guest interface. As there's no
> > official spec, you can only go by the Linux implementation, and by that
> > its semantics seem to be 'match by MAC', not 'match by other criteria'.
> >
> > How is this supposed to work in the long run?  
> 
> That group ID thing should work for all OS. Not just Linux.

That's exactly my point: We need to care about not-Linux. And about
not-QEMU as well. A virtio feature bit should not be defined by what
Linux and QEMU do, but needs a real spec.

So, currently we have a Linux driver implementation that matches by
MAC. If a Linux version with this is released, every device that offers
VIRTIO_NET_F_STANDBY needs to support matching by MAC so that this
Linux driver will not break. Adding further matching methods should be
fine, but might need additional features (needs to be discussed).
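The backward-compatibility constraint in the last paragraph follows
directly from virtio feature negotiation: the driver may only accept
bits the device offered, and the device must keep working with whatever
subset the driver acks. A sketch (VIRTIO_NET_F_STANDBY is bit 62 in the
Linux uapi headers; the group-ID bit here is purely hypothetical):

```python
VIRTIO_NET_F_STANDBY = 1 << 62   # feature bit 62, as in Linux's uapi headers
VIRTIO_NET_F_GROUP_ID = 1 << 55  # hypothetical bit for group-ID matching

def negotiate(device_features: int, driver_features: int) -> int:
    """Virtio feature negotiation: the driver may only accept feature
    bits that the device actually offered."""
    return device_features & driver_features

# A Linux 4.18 guest knows only STANDBY, so a newer device that also
# offers a group-ID feature must still support MAC matching for it.
```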

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-11  9:53                   ` [Qemu-devel] [virtio-dev] " Cornelia Huck
  (?)
@ 2018-07-12 19:18                     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-12 19:18 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Siwei Liu, si-wei liu, Roman Kagan, Venu Busireddy,
	Marcel Apfelbaum, virtio-dev, qemu-devel, Samudrala, Sridhar,
	Alexander Duyck, Netdev

On Wed, Jul 11, 2018 at 11:53:44AM +0200, Cornelia Huck wrote:
> On Tue, 10 Jul 2018 17:07:37 -0700
> Siwei Liu <loseweigh@gmail.com> wrote:
> 
> > On Mon, Jul 9, 2018 at 6:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > On Mon, Jul 09, 2018 at 06:11:53PM -0700, si-wei liu wrote:  
> > >> The plan is to enable group ID based matching in the first place rather than
> > >> match by MAC, the latter of which is fragile and problematic.  
> > >
> > > It isn't all that fragile - hyperv used same for a while, so if someone
> > > posts working patches with QEMU support but before this grouping stuff,
> > > I'll happily apply them.  
> > 
> > I wouldn't box the solution to very limited scenario just because of
> > matching by MAC, the benefit of having generic group ID in the first
> > place is that we save the effort of maintaining legacy MAC based
> > pairing that just adds complexity anyway. Currently the VF's MAC
> > address cannot be changed by either PF or by the guest user is a
> > severe limitation due to this. The other use case is that PT device
> > than VF would generally have different MAC than the standby virtio. We
> > shouldn't limit itself to VF specific scenario from the very
> > beginning.
> 
> So, this brings me to a different concern: the semantics of
> VIRTIO_NET_F_STANDBY.
> 
> * The currently sole user seems to be the virtio-net Linux driver.
> * The commit messages, code comments and Documentation/ all talk about
>   matching by MAC.
> * I could not find any proposed update to the virtio spec. (If there
>   had been an older proposal with a different feature name, it is not
>   discoverable.)
> 
> VIRTIO_NET_F_STANDBY is a host <-> guest interface. As there's no
> official spec, you can only go by the Linux implementation, and by that
> its semantics seem to be 'match by MAC', not 'match by other criteria'.
> 
> How is this supposed to work in the long run?

We definitely need a spec patch for VIRTIO_NET_F_STANDBY documenting existing
semantics.  Sridhar, do you plan to take a look?

-- 
MST

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-12 11:31                       ` [Qemu-devel] [virtio-dev] " Cornelia Huck
  (?)
@ 2018-07-12 20:52                         ` Siwei Liu
  -1 siblings, 0 replies; 106+ messages in thread
From: Siwei Liu @ 2018-07-12 20:52 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Michael S. Tsirkin, si-wei liu, Roman Kagan, Venu Busireddy,
	Marcel Apfelbaum, virtio-dev, qemu-devel, Samudrala, Sridhar,
	Alexander Duyck, Netdev


On Thu, Jul 12, 2018 at 4:31 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> On Thu, 12 Jul 2018 02:37:03 -0700
> Siwei Liu <loseweigh@gmail.com> wrote:
>
>> On Wed, Jul 11, 2018 at 2:53 AM, Cornelia Huck <cohuck@redhat.com> wrote:
>> > On Tue, 10 Jul 2018 17:07:37 -0700
>> > Siwei Liu <loseweigh@gmail.com> wrote:
>> >
>> >> On Mon, Jul 9, 2018 at 6:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> >> > On Mon, Jul 09, 2018 at 06:11:53PM -0700, si-wei liu wrote:
>> >> >> The plan is to enable group ID based matching in the first place rather than
>> >> >> match by MAC, the latter of which is fragile and problematic.
>> >> >
>> >> > It isn't all that fragile - hyperv used same for a while, so if someone
>> >> > posts working patches with QEMU support but before this grouping stuff,
>> >> > I'll happily apply them.
>> >>
>> >> I wouldn't box the solution to very limited scenario just because of
>> >> matching by MAC, the benefit of having generic group ID in the first
>> >> place is that we save the effort of maintaining legacy MAC based
>> >> pairing that just adds complexity anyway. Currently the VF's MAC
>> >> address cannot be changed by either PF or by the guest user is a
>> >> severe limitation due to this. The other use case is that PT device
>> >> than VF would generally have different MAC than the standby virtio. We
>> >> shouldn't limit itself to VF specific scenario from the very
>> >> beginning.
>> >
>> > So, this brings me to a different concern: the semantics of
>> > VIRTIO_NET_F_STANDBY.
>> >
>> > * The currently sole user seems to be the virtio-net Linux driver.
>> > * The commit messages, code comments and Documentation/ all talk about
>> >   matching by MAC.
>> > * I could not find any proposed update to the virtio spec. (If there
>> >   had been an older proposal with a different feature name, it is not
>> >   discoverable.)
>>
>> No, there was no such spec patch at all when the Linux patch was
>> submitted, hence match by MAC is the only means to pair device due to
>> lack of QEMU support.
>
> We need to know what the device offers if it offers the feature bit,
> and what it is supposed to do when the driver negotiates it. Currently,
> we can only go by what the Linux driver does and what it expects. IOW,
> we need a spec update proposal. Obviously, this should be discussed in
> conjunction with the rest.

The definition is incomplete due to the lack of a spec. There's no "host"
part defined yet in the host-guest interface. If match by MAC is an
interface, the same must be defined on the host (device) side as well,
which has been agreed is not the way to go. However, judging by his
QEMU patch, I don't think that's what the author intends to do - it
missed the other parts as well, such as the feature negotiation and
how it interacts with the paired device.

What I said is that match by MAC is just a guest implementation detail
that one can change at any time. We now have the group ID in QEMU, so
why keep sticking to matching by MAC? It shouldn't be a host-guest
interface in the first place anyway.
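For the group ID to become a real host<->guest interface, as argued
above, both sides have to agree on where and how it is encoded, e.g. as
an 8-byte field behind the VIRTIO_PCI_CAP_GROUP_ID_CFG capability
proposed in this series. A sketch under that assumption (the field
layout is hypothetical; virtio config-space fields are little-endian):

```python
import struct

def host_encode_group_id(group_id: int) -> bytes:
    """Device/host side: store the 64-bit failover group ID in the
    config area backing the proposed VIRTIO_PCI_CAP_GROUP_ID_CFG
    capability, in little-endian byte order."""
    return struct.pack("<Q", group_id)

def guest_read_group_id(cfg: bytes) -> int:
    """Driver/guest side: read the same 64-bit field back."""
    (group_id,) = struct.unpack("<Q", cfg[:8])
    return group_id
```

Only with both halves defined does the identifier become an interface
rather than a guest implementation detail.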

>
>>
>> Q: Does it work?
>> A: Well, it works for some.
>> Q: Does it work well to support all scenarios?
>> A: No, not as it claims to.
>> Q: Can it do better job to support all scenarios?
>> A: Yes, do pairing with the failover group ID instead.
>> Q: Does pairing still need to be MAC based if using failover group ID?
>> A: It depends, it's up to the implementation to verify MAC address
>> depending on the need (e.g. VF failover versus PT device replacement),
>> though MAC matching is no longer positioned as a requirement for
>> pairing or grouping.
>
> Whether matching by MAC is good or sufficient is a different
> discussion. It is, however, what the code currently *does*, and in
> absence of a spec update, it is the only reference for this feature.

Not really; it's the same discussion. Before the group ID discussion, I
explicitly asked for a spec for the semantics of VIRTIO_NET_F_STANDBY.
Since no one came up with a spec, I assumed it was still being worked
on and that it was all right for the initial Linux implementation to go
in. I assumed it was an accepted process that, while working on a
complicated feature, the spec can come later, once everything becomes
clear through the discussions.

https://www.spinics.net/lists/netdev/msg499011.html

I made the same claim at the time that we shouldn't go with what's
been implemented in Linux, but instead should look at the entire
picture to decide what the right semantics for VIRTIO_NET_F_STANDBY
should be. I agree with you that we need a spec. I can work on a
spec, but I'd like to clarify that nothing had been defined yet for
VIRTIO_NET_F_STANDBY, so it shouldn't be called a spec update. It's
still a work-in-progress feature, and IMHO this is different from the
situation where a pre-spec virtio implementation was already working
on both the host and the guest side. The current VIRTIO_NET_F_STANDBY
implementation in Linux is pretty much dead code without a spec
followed by a host/device side implementation.

-Siwei


>
>>
>> There's no such stickiness for matching by MAC defined anywhere. The
>> semantics of VIRTIO_NET_F_STANDBY feature are mostly a failover
>> concept that the standby device should be used when the primary is not
>> present. We now have added the group ID on QEMU. I don't see why
>> bothering to get rid of the limitation: it's never been exposed. No
>> existing users. No API/ABI defined at all.
>
> This is scheduled to be released with the next Linux version, which is
> right now in the -rc phase. It *is* API (a guest <-> host API).
>
> No corresponding code is present in QEMU 3.0, which is in freeze right
> now. Anything that goes into QEMU 3.1 or later needs to accommodate
> Linux 4.18 as a guest.
>
>>
>> >
>> > VIRTIO_NET_F_STANDBY is a host <-> guest interface. As there's no
>> > official spec, you can only go by the Linux implementation, and by that
>> > its semantics seem to be 'match by MAC', not 'match by other criteria'.
>> >
>> > How is this supposed to work in the long run?
>>
>> That group ID thing should work for all OS. Not just Linux.
>
> That's exactly my point: We need to care about not-Linux. And about
> not-QEMU as well. A virtio feature bit should not be defined by what
> Linux and QEMU do, but needs a real spec.
>
> So, currently we have a Linux driver implementation that matches by
> MAC. If a Linux version with this is released, every device that offers
> VIRTIO_NET_F_STANDBY needs to support matching by MAC so that this
> Linux driver will not break. Adding further matching methods should be
> fine, but might need additional features (needs to be discussed).

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
@ 2018-07-12 20:52                         ` Siwei Liu
  0 siblings, 0 replies; 106+ messages in thread
From: Siwei Liu @ 2018-07-12 20:52 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Michael S. Tsirkin, si-wei liu, Roman Kagan, Venu Busireddy,
	Marcel Apfelbaum, virtio-dev, qemu-devel, Samudrala, Sridhar,
	Alexander Duyck, Netdev

_

On Thu, Jul 12, 2018 at 4:31 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> On Thu, 12 Jul 2018 02:37:03 -0700
> Siwei Liu <loseweigh@gmail.com> wrote:
>
>> On Wed, Jul 11, 2018 at 2:53 AM, Cornelia Huck <cohuck@redhat.com> wrote:
>> > On Tue, 10 Jul 2018 17:07:37 -0700
>> > Siwei Liu <loseweigh@gmail.com> wrote:
>> >
>> >> On Mon, Jul 9, 2018 at 6:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> >> > On Mon, Jul 09, 2018 at 06:11:53PM -0700, si-wei liu wrote:
>> >> >> The plan is to enable group ID based matching in the first place rather than
>> >> >> match by MAC, the latter of which is fragile and problematic.
>> >> >
>> >> > It isn't all that fragile - hyperv used same for a while, so if someone
>> >> > posts working patches with QEMU support but before this grouping stuff,
>> >> > I'll happily apply them.
>> >>
>> >> I wouldn't box the solution into a very limited scenario just because
>> >> of matching by MAC; the benefit of having a generic group ID in the
>> >> first place is that we save the effort of maintaining legacy MAC-based
>> >> pairing, which just adds complexity anyway. Currently, the fact that
>> >> the VF's MAC address cannot be changed by either the PF or the guest
>> >> user is a severe limitation due to this. The other use case is that a
>> >> PT device other than a VF would generally have a different MAC than
>> >> the standby virtio. We shouldn't limit ourselves to the VF-specific
>> >> scenario from the very beginning.
>> >
>> > So, this brings me to a different concern: the semantics of
>> > VIRTIO_NET_F_STANDBY.
>> >
>> > * The currently sole user seems to be the virtio-net Linux driver.
>> > * The commit messages, code comments and Documentation/ all talk about
>> >   matching by MAC.
>> > * I could not find any proposed update to the virtio spec. (If there
>> >   had been an older proposal with a different feature name, it is not
>> >   discoverable.)
>>
>> No, there was no such spec patch at all when the Linux patch was
>> submitted, hence match by MAC is the only means to pair device due to
>> lack of QEMU support.
>
> We need to know what the device offers if it offers the feature bit,
> and what it is supposed to do when the driver negotiates it. Currently,
> we can only go by what the Linux driver does and what it expects. IOW,
> we need a spec update proposal. Obviously, this should be discussed in
> conjunction with the rest.

The definition is incomplete due to the lack of a spec. There's no "host"
part defined yet in the host-guest interface. If match by MAC were an
interface, the same would have to be implemented on the host (device)
side as well, which has already been agreed is not the way to go.
However, judging by his QEMU patch, I don't think that's what the author
intends to do - the patch also misses the other parts, such as the
feature negotiation and how it interacts with the paired device.

What I said is that match by MAC is just a guest implementation that
one can change at any time. We now have the group ID in QEMU, so why
still stick to matching by MAC? It shouldn't be a host-guest
interface in the first place anyway.
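To illustrate the distinction being argued here, a guest-side pairing
policy keyed on the group ID, with MAC verification left as an optional
per-use-case check, could be sketched roughly as follows (all names are
hypothetical, not from any posted patch):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DeviceInfo:
    group_id: Optional[int]  # 64-bit failover group id, None if not exposed
    mac: str                 # MAC address string

def is_failover_pair(standby: DeviceInfo, primary: DeviceInfo,
                     verify_mac: bool = False) -> bool:
    """Pair on group id; the MAC is only checked when the use case
    calls for it (e.g. VF failover), not for PT device replacement."""
    if standby.group_id is None or primary.group_id is None:
        return False
    if standby.group_id != primary.group_id:
        return False
    if verify_mac and standby.mac.lower() != primary.mac.lower():
        return False
    return True
```

In this sketch the group ID, not the MAC, is the pairing key; the MAC
check is an implementation choice layered on top.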

>
>>
>> Q: Does it work?
>> A: Well, it works for some.
>> Q: Does it work well to support all scenarios?
>> A: No, not as it claims to.
>> Q: Can it do a better job to support all scenarios?
>> A: Yes, do pairing with the failover group ID instead.
>> Q: Does pairing still need to be MAC based if using failover group ID?
>> A: It depends, it's up to the implementation to verify MAC address
>> depending on the need (e.g. VF failover versus PT device replacement),
>> though MAC matching is no longer positioned as a requirement for
>> pairing or grouping.
>
> Whether matching by MAC is good or sufficient is a different
> discussion. It is, however, what the code currently *does*, and in
> absence of a spec update, it is the only reference for this feature.

Not really, it's the same discussion. Before the group ID discussion I
explicitly asked for a spec for VIRTIO_NET_F_STANDBY covering its
semantics. Since no one came up with a spec, I assumed it was still
being worked on and that it was all right to merge the initial Linux
implementation. I assumed that is the normal process when working on a
complicated feature: the spec can come later, once everything becomes
clear through the discussions.

https://www.spinics.net/lists/netdev/msg499011.html

I made the same claim at the time that we shouldn't go with what's
been implemented in Linux, but should instead look at the entire
picture to decide the right semantics for VIRTIO_NET_F_STANDBY. I
agree with you that we need a spec. I can work on a spec, but I'd like
to clarify that nothing has been defined yet for VIRTIO_NET_F_STANDBY,
so it shouldn't be called a spec update. It's still a work-in-progress
feature, which IMHO is different from the situation where a pre-spec
virtio implementation was already working on both the host and guest
side. The current VIRTIO_NET_F_STANDBY implementation in Linux is
pretty much dead code without a spec followed by a host/device side
implementation.

-Siwei


>
>>
>> There's no such stickiness for matching by MAC defined anywhere. The
>> semantics of the VIRTIO_NET_F_STANDBY feature are mostly a failover
>> concept: the standby device should be used when the primary is not
>> present. We have now added the group ID in QEMU. I don't see why
>> getting rid of the limitation would be a bother: it's never been
>> exposed. No existing users. No API/ABI defined at all.
>
> This is scheduled to be released with the next Linux version, which is
> right now in the -rc phase. It *is* API (a guest <-> host API).
>
> No corresponding code is present in QEMU 3.0, which is in freeze right
> now. Anything that goes into QEMU 3.1 or later needs to accommodate
> Linux 4.18 as a guest.
>
>>
>> >
>> > VIRTIO_NET_F_STANDBY is a host <-> guest interface. As there's no
>> > official spec, you can only go by the Linux implementation, and by that
>> > its semantics seem to be 'match by MAC', not 'match by other criteria'.
>> >
>> > How is this supposed to work in the long run?
>>
>> That group ID thing should work for all OS. Not just Linux.
>
> That's exactly my point: We need to care about not-Linux. And about
> not-QEMU as well. A virtio feature bit should not be defined by what
> Linux and QEMU do, but needs a real spec.
>
> So, currently we have a Linux driver implementation that matches by
> MAC. If a Linux version with this is released, every device that offers
> VIRTIO_NET_F_STANDBY needs to support matching by MAC so that this
> Linux driver will not break. Adding further matching methods should be
> fine, but might need additional features (needs to be discussed).

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-12 20:52                         ` [Qemu-devel] [virtio-dev] " Siwei Liu
  (?)
@ 2018-07-12 21:00                           ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-12 21:00 UTC (permalink / raw)
  To: Siwei Liu
  Cc: Cornelia Huck, si-wei liu, Roman Kagan, Venu Busireddy,
	Marcel Apfelbaum, virtio-dev, qemu-devel, Samudrala, Sridhar,
	Alexander Duyck, Netdev

On Thu, Jul 12, 2018 at 01:52:53PM -0700, Siwei Liu wrote:
> The definition is incomplete due to lack of spec. There's no "host"
> part defined yet in the host-guest interface. If match by MAC is an
> interface, the same must be done on the host(device) side as well,
> which has been agreed not the way to go. However, I don't think that's
> what the author intends to do by interpreting his QEMU patch - it
> missed the other parts as well, such as the feature negotiation and
> how it interacts with the paired device.
> 
> What I said is that match by MAC is just a guest implementation that
> one can change at any time. We now have the group ID on QEMU, why
> still sticking to matching by MAC? It shouldn't be a host-guest
> interface in the first place anyway.

I think that match by MAC is a simple, portable way to match devices.
E.g. it will work seamlessly with niche things like zPCI. However,
there are other niche use cases that aren't addressed by match by MAC,
such as PF pass-through as a primary, and the PCI bridge trick addresses
that at the cost of some portability.

So I see no issues supporting both mechanisms, but others on the TC
might feel differently.
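Supporting both mechanisms could be sketched as follows (a hypothetical
policy, not from any posted patch): prefer the group ID when the device
exposes one, and fall back to MAC matching otherwise:

```python
def find_primary(standby, candidates):
    """Pick the passthrough primary for a standby virtio device.
    Devices are dicts with optional "group_id" and "mac" keys."""
    gid = standby.get("group_id")
    if gid is not None:
        # Group-id match takes precedence, covering cases such as a
        # PF primary whose MAC differs from the standby's.
        for dev in candidates:
            if dev.get("group_id") == gid:
                return dev
    mac = standby.get("mac")
    if mac:
        # Portable fallback: match by MAC, as the Linux driver does today.
        for dev in candidates:
            if dev.get("mac", "").lower() == mac.lower():
                return dev
    return None
```

With this ordering, a device offering only MAC matching keeps working
unchanged, while group-id-capable setups get the more general behavior.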

-- 
MST

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-10 14:28     ` [virtio-dev] " Venu Busireddy
@ 2018-07-12 21:01       ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-12 21:01 UTC (permalink / raw)
  To: Venu Busireddy; +Cc: Marcel Apfelbaum, virtio-dev, qemu-devel

On Tue, Jul 10, 2018 at 09:28:57AM -0500, Venu Busireddy wrote:
> On 2018-07-10 05:11:18 +0300, Michael S. Tsirkin wrote:
> > On Fri, Jun 29, 2018 at 05:19:03PM -0500, Venu Busireddy wrote:
> > > The current patch set includes all the feedback received for proposals [3]
> > > and [4]. For the sake of completeness, patch for the virtio specification
> > > is also included here. Following is the updated proposal.
> > > 
> > > 1. Extend the virtio specification to include a new virtio PCI capability
> > >    "VIRTIO_PCI_CAP_GROUP_ID_CFG".
> > 
> > ....
> > 
> > >  default-configs/arm-softmmu.mak             |   1 +
> > >  default-configs/i386-softmmu.mak            |   1 +
> > >  default-configs/x86_64-softmmu.mak          |   1 +
> > >  hw/pci-bridge/Makefile.objs                 |   1 +
> > >  hw/pci-bridge/pci_bridge_dev.c              |  10 +
> > >  hw/pci-bridge/pcie_downstream.c             | 220 ++++++++++++++++++++
> > >  hw/pci-bridge/pcie_downstream.h             |  10 +
> > >  hw/pci/pci_bridge.c                         |  34 +++
> > >  hw/virtio/virtio-pci.c                      |  15 ++
> > >  hw/virtio/virtio-pci.h                      |   3 +-
> > >  include/hw/pci/pci.h                        |  37 ++--
> > >  include/hw/pci/pci_bridge.h                 |   2 +
> > >  include/hw/pci/pcie.h                       |   1 +
> > >  include/standard-headers/linux/virtio_pci.h |   8 +
> > >  14 files changed, 326 insertions(+), 18 deletions(-)
> > >  create mode 100644 hw/pci-bridge/pcie_downstream.c
> > >  create mode 100644 hw/pci-bridge/pcie_downstream.h
> > 
> > I don't see the spec patch here.
> 
> Since the spec patch is in a different repo, I could not get 'git
> format-patch' to combine them together. As a result, I added the spec
> patch itself to the patches, but it is not in the cover letter.
> 
> I can combine them manually and repost. Do you want me to do that?
> 
> Thanks,
> 
> Venu

It's ok, I found it now.

-- 
MST

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-12 21:00                           ` [Qemu-devel] [virtio-dev] " Michael S. Tsirkin
  (?)
@ 2018-07-12 22:19                             ` Siwei Liu
  -1 siblings, 0 replies; 106+ messages in thread
From: Siwei Liu @ 2018-07-12 22:19 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Cornelia Huck, si-wei liu, Roman Kagan, Venu Busireddy,
	Marcel Apfelbaum, virtio-dev, qemu-devel, Samudrala, Sridhar,
	Alexander Duyck, Netdev

On Thu, Jul 12, 2018 at 2:00 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> On Thu, Jul 12, 2018 at 01:52:53PM -0700, Siwei Liu wrote:
>> The definition is incomplete due to lack of spec. There's no "host"
>> part defined yet in the host-guest interface. If match by MAC is an
>> interface, the same must be done on the host(device) side as well,
>> which has been agreed not the way to go. However, I don't think that's
>> what the author intends to do by interpreting his QEMU patch - it
>> missed the other parts as well, such as the feature negotiation and
>> how it interacts with the paired device.
>>
>> What I said is that match by MAC is just a guest implementation that
>> one can change at any time. We now have the group ID on QEMU, why
>> still sticking to matching by MAC? It shouldn't be a host-guest
>> interface in the first place anyway.
>
> I think that match by MAC is a simple portable way to match devices.
> E.g. it will work seamlessly with niche things like zPCI. However

That's a good point. I'm not sure it's a valid assumption that zPCI
should always use the same MAC address as that of the virtio device.
Someone more familiar with the use case may decide and work on that. It
means the VFIO device would have to take the MAC address as an identifier
in the "-device vfio-pci,.." QEMU option. I think there's no point in
matching devices by group ID in QEMU while using MAC in the guest.
Based on that assumption, I'd go with making VIRTIO_NET_F_STANDBY
match devices based on group ID, while someone may come up with another
feature bit later, say VIRTIO_NET_F_STANDBY_BY_MAC, when its QEMU
support is available. Would it make sense?
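In virtio terms, that split would surface as two independently
negotiable feature bits. A toy model of the negotiation
(VIRTIO_NET_F_STANDBY is bit 62 in the Linux uapi header; the _BY_MAC
bit is purely hypothetical here, and its value is illustrative only):

```python
VIRTIO_NET_F_STANDBY = 1 << 62         # bit 62, as in the Linux uapi header
VIRTIO_NET_F_STANDBY_BY_MAC = 1 << 55  # hypothetical bit, value illustrative

def negotiate(device_features: int, driver_features: int) -> int:
    """Virtio feature negotiation: the driver can only accept
    feature bits that the device actually offers."""
    return device_features & driver_features

# A device that only implements group-id based standby:
offered = VIRTIO_NET_F_STANDBY
# A driver prepared to use either matching method:
wanted = VIRTIO_NET_F_STANDBY | VIRTIO_NET_F_STANDBY_BY_MAC
negotiated = negotiate(offered, wanted)
```

The point of the sketch is that an older device simply never offers the
new bit, so the driver falls back without breaking compatibility.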

-Siwei

> there are other niche use-cases that aren't addressed by match by MAC
> such as PF pass-through as a primary, and the pci bridge trick addresses
> that at cost of some portability.
>
> So I see no issues supporting both mechanisms, but others on the TC
> might feel differently.
>
> --
> MST

^ permalink raw reply	[flat|nested] 106+ messages in thread


* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-12 22:19                             ` [Qemu-devel] [virtio-dev] " Siwei Liu
  (?)
@ 2018-07-13  1:20                               ` Samudrala, Sridhar
  -1 siblings, 0 replies; 106+ messages in thread
From: Samudrala, Sridhar @ 2018-07-13  1:20 UTC (permalink / raw)
  To: Siwei Liu, Michael S. Tsirkin
  Cc: Cornelia Huck, si-wei liu, Roman Kagan, Venu Busireddy,
	Marcel Apfelbaum, virtio-dev, qemu-devel, Alexander Duyck,
	Netdev

On 7/12/2018 6:19 PM, Siwei Liu wrote:
> On Thu, Jul 12, 2018 at 2:00 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Thu, Jul 12, 2018 at 01:52:53PM -0700, Siwei Liu wrote:
>>> The definition is incomplete due to lack of spec. There's no "host"
>>> part defined yet in the host-guest interface. If match by MAC is an
>>> interface, the same must be done on the host(device) side as well,
>>> which has been agreed not the way to go. However, I don't think that's
>>> what the author intends to do by interpreting his QEMU patch - it
>>> missed the other parts as well, such as the feature negotiation and
>>> how it interacts with the paired device.
>>>
>>> What I said is that match by MAC is just a guest implementation that
>>> one can change at any time. We now have the group ID on QEMU, why
> still stick to matching by MAC? It shouldn't be a host-guest
>>> interface in the first place anyway.
>> I think that match by MAC is a simple portable way to match devices.
>> E.g. it will work seamlessly with niche things like zPCI. However
> That's a good point. I'm not sure if it's a valid assumption that zPCI
> should always use the same MAC address as that of virtio. Someone
> who's more familiar with the use case may decide and work on that. It
> means VFIO device has to take in the MAC address as an identifier to
> the "-device vfio-pci,.." QEMU option. I think there's no point to
> match device using group ID in QEMU while using MAC in the guest.
> Based on that assumption, I'd go with making VIRTIO_NET_F_STANDBY to
> match device based on group ID, while someone may come up with another
> feature bit later, say VIRTIO_NET_F_STANDBY_BY_MAC when its QEMU
> support is available. Would it make sense?

VIRTIO_NET_F_STANDBY as defined in the guest virtio_net driver supports matching
by MAC address. I think we should add support for this feature bit in QEMU.
If submitting a patch to update the spec is a prerequisite to adding this
feature bit to QEMU, I can do that.

As far as I understand, the group ID patches to QEMU are still under review.
Matching by group ID can be introduced under another feature bit, which could
support matching by group ID as well as by MAC.
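The MAC-based pairing described above can be sketched roughly as follows; this is loosely modeled on the guest's failover pairing behavior, and all structure and function names here are illustrative, not the real kernel API.

```c
/* Rough sketch of MAC-based pairing: the standby (virtio) device
 * claims a hot-plugged primary (VF) when their permanent MAC
 * addresses are identical. Names are illustrative only. */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

struct netdev {
    uint8_t perm_addr[ETH_ALEN]; /* permanent MAC address */
};

static bool standby_claims_primary(const struct netdev *standby,
                                   const struct netdev *primary)
{
    /* Match purely by MAC: no group ID is consulted. */
    return memcmp(standby->perm_addr, primary->perm_addr, ETH_ALEN) == 0;
}
```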


> -Siwei
>
>> there are other niche use-cases that aren't addressed by match by MAC
>> such as PF pass-through as a primary, and the pci bridge trick addresses
>> that at the cost of some portability.
>>
>> So I see no issues supporting both mechanisms, but others on the TC
>> might feel differently.
>>
>> --
>> MST

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-13  1:20                               ` [Qemu-devel] [virtio-dev] " Samudrala, Sridhar
@ 2018-07-13  3:28                                 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-13  3:28 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: Siwei Liu, Cornelia Huck, si-wei liu, Roman Kagan,
	Venu Busireddy, Marcel Apfelbaum, virtio-dev, qemu-devel,
	Alexander Duyck, Netdev

On Thu, Jul 12, 2018 at 09:20:41PM -0400, Samudrala, Sridhar wrote:
> On 7/12/2018 6:19 PM, Siwei Liu wrote:
> > On Thu, Jul 12, 2018 at 2:00 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > On Thu, Jul 12, 2018 at 01:52:53PM -0700, Siwei Liu wrote:
> > > > The definition is incomplete due to lack of spec. There's no "host"
> > > > part defined yet in the host-guest interface. If match by MAC is an
> > > > interface, the same must be done on the host(device) side as well,
> > > > which has been agreed not the way to go. However, I don't think that's
> > > > what the author intends to do by interpreting his QEMU patch - it
> > > > missed the other parts as well, such as the feature negotiation and
> > > > how it interacts with the paired device.
> > > > 
> > > > What I said is that match by MAC is just a guest implementation that
> > > > one can change at any time. We now have the group ID on QEMU, why
> > still stick to matching by MAC? It shouldn't be a host-guest
> > > > interface in the first place anyway.
> > > I think that match by MAC is a simple portable way to match devices.
> > > E.g. it will work seamlessly with niche things like zPCI. However
> > That's a good point. I'm not sure if it's a valid assumption that zPCI
> > should always use the same MAC address as that of virtio. Someone
> > who's more familiar with the use case may decide and work on that. It
> > means VFIO device has to take in the MAC address as an identifier to
> > the "-device vfio-pci,.." QEMU option. I think there's no point to
> > match device using group ID in QEMU while using MAC in the guest.
> > Based on that assumption, I'd go with making VIRTIO_NET_F_STANDBY to
> > match device based on group ID, while someone may come up with another
> > feature bit later, say VIRTIO_NET_F_STANDBY_BY_MAC when its QEMU
> > support is available. Would it make sense?
> 
> VIRTIO_NET_F_STANDBY as defined in the guest virtio_net driver supports match
> by MAC address. I think we should add support for this feature bit in QEMU.
> If submitting a patch to update the spec is a prerequisite to adding this
> feature bit to QEMU, I can do that.

It's not strictly a prerequisite, but we need it in the spec all the same, so
please do that.

> As far as I understand, group id patches to QEMU are still under review.
> Matching by group ID can be another feature bit that could support matching
> by group id as well as MAC.
> 
> 
> > -Siwei
> > 
> > > there are other niche use-cases that aren't addressed by match by MAC
> > > such as PF pass-through as a primary, and the pci bridge trick addresses
> > > that at the cost of some portability.
> > > 
> > > So I see no issues supporting both mechanisms, but others on the TC
> > > might feel differently.
> > > 
> > > --
> > > MST

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
  2018-07-13  1:20                               ` [Qemu-devel] [virtio-dev] " Samudrala, Sridhar
@ 2018-07-13  9:15                                 ` Cornelia Huck
  -1 siblings, 0 replies; 106+ messages in thread
From: Cornelia Huck @ 2018-07-13  9:15 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: Siwei Liu, Michael S. Tsirkin, si-wei liu, Roman Kagan,
	Venu Busireddy, Marcel Apfelbaum, virtio-dev, qemu-devel,
	Alexander Duyck, Netdev

On Thu, 12 Jul 2018 21:20:41 -0400
"Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:

> On 7/12/2018 6:19 PM, Siwei Liu wrote:
> > On Thu, Jul 12, 2018 at 2:00 PM, Michael S. Tsirkin <mst@redhat.com> wrote:  
> >> On Thu, Jul 12, 2018 at 01:52:53PM -0700, Siwei Liu wrote:  
> >>> The definition is incomplete due to lack of spec. There's no "host"
> >>> part defined yet in the host-guest interface. If match by MAC is an
> >>> interface, the same must be done on the host(device) side as well,
> >>> which has been agreed not the way to go. However, I don't think that's
> >>> what the author intends to do by interpreting his QEMU patch - it
> >>> missed the other parts as well, such as the feature negotiation and
> >>> how it interacts with the paired device.
> >>>
> >>> What I said is that match by MAC is just a guest implementation that
> >>> one can change at any time. We now have the group ID on QEMU, why
> >>> still stick to matching by MAC? It shouldn't be a host-guest  
> >>> interface in the first place anyway.  
> >> I think that match by MAC is a simple portable way to match devices.
> >> E.g. it will work seamlessly with niche things like zPCI. However  
> > That's a good point. I'm not sure if it's a valid assumption that zPCI
> > should always use the same MAC address as that of virtio. Someone

I think we can mostly disregard the weirdness that is zPCI right now.
There should be no fundamental reason why matching by MAC would not
work, though.

> > who's more familiar with the use case may decide and work on that. It
> > means VFIO device has to take in the MAC address as an identifier to
> > the "-device vfio-pci,.." QEMU option. I think there's no point to
> > match device using group ID in QEMU while using MAC in the guest.
> > Based on that assumption, I'd go with making VIRTIO_NET_F_STANDBY to
> > match device based on group ID, while someone may come up with another
> > feature bit later, say VIRTIO_NET_F_STANDBY_BY_MAC when its QEMU
> > support is available. Would it make sense?  
> 
> VIRTIO_NET_F_STANDBY as defined in the guest virtio_net driver supports match
> by MAC address. I think we should add support for this feature bit in QEMU.
> If submitting a patch to update the spec is a prerequisite to adding this
> feature bit to QEMU, I can do that.

Doing a spec patch and implementing matching by MAC in QEMU sounds like
a good plan to me.

> 
> As far as I understand, group id patches to QEMU are still under review.
> Matching by group ID can be another feature bit that could support matching
> by group id as well as MAC.

That plan sounds good to me as well.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [virtio-dev] Re: [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
@ 2018-07-13  9:15                                 ` Cornelia Huck
  0 siblings, 0 replies; 106+ messages in thread
From: Cornelia Huck @ 2018-07-13  9:15 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: Siwei Liu, Michael S. Tsirkin, si-wei liu, Roman Kagan,
	Venu Busireddy, Marcel Apfelbaum, virtio-dev, qemu-devel,
	Alexander Duyck, Netdev

On Thu, 12 Jul 2018 21:20:41 -0400
"Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:

> On 7/12/2018 6:19 PM, Siwei Liu wrote:
> > On Thu, Jul 12, 2018 at 2:00 PM, Michael S. Tsirkin <mst@redhat.com> wrote:  
> >> On Thu, Jul 12, 2018 at 01:52:53PM -0700, Siwei Liu wrote:  
> >>> The definition is incomplete due to lack of spec. There's no "host"
> >>> part defined yet in the host-guest interface. If match by MAC is an
> >>> interface, the same must be done on the host(device) side as well,
> >>> which has been agreed not the way to go. However, I don't think that's
> >>> what the author intends to do by interpreting his QEMU patch - it
> >>> missed the other parts as well, such as the feature negotiation and
> >>> how it interacts with the paired device.
> >>>
> >>> What I said is that match by MAC is just a guest implementation that
> >>> one can change at any time. We now have the group ID on QEMU, why
> >>> still sticking to matching by MAC? It shoulnd't be a host-guest
> >>> interface in the first place anyway.  
> >> I think that match by MAC is a simple portable way to match devices.
> >> E.g. it will work seamlessly with niche things like zPCI. However  
> > That's a good point. I'm not sure if it's a valid assumption that zPCI
> > should always use the same MAC address as that of virtio. Someone

I think we can mostly disregard the weirdness that is zPCI right now.
There should be no fundamental reasons that matching by MAC would not
work, though.

> > who's more familiar with the use case may decide and work on that. It
> > means VFIO device has to take in the MAC address as an identifier to
> > the "-device vfio-pci,.." QEMU option. I think there's no point to
> > match device using group ID in QEMU while using MAC in the guest.
> > Based on that assumption, I'd go with making VIRTIO_NET_F_STANDBY to
> > match device based on group ID, while someone may come up with another
> > feature bit later, say VIRTIO_NET_F_STANDBY_BY_MAC when its QEMU
> > support is available. Would it make sense?  
> 
> VIRTIO_NET_F_STANDBY as defined in the guest virtio_net driver supports match
> by MAC address. I think we should add support for this feature bit in QEMU.
> If submitting a patch to update the spec is a pre-requisite to add this
> feature bit to QEMU, i can do that.

Doing a spec patch and implementing matching by MAC in QEMU sounds like
a good plan to me.

> 
> As far as i understand, group id patches to QEMU are still under review.
> Matching by group ID can be another feature bit that could support matching
> by group id as well as MAC.

That plan sounds good to me as well.

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] Re: [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices...
@ 2018-07-13  9:15                                 ` Cornelia Huck
  0 siblings, 0 replies; 106+ messages in thread
From: Cornelia Huck @ 2018-07-13  9:15 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: Siwei Liu, Michael S. Tsirkin, si-wei liu, Roman Kagan,
	Venu Busireddy, Marcel Apfelbaum, virtio-dev, qemu-devel,
	Alexander Duyck, Netdev

On Thu, 12 Jul 2018 21:20:41 -0400
"Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:

> On 7/12/2018 6:19 PM, Siwei Liu wrote:
> > On Thu, Jul 12, 2018 at 2:00 PM, Michael S. Tsirkin <mst@redhat.com> wrote:  
> >> On Thu, Jul 12, 2018 at 01:52:53PM -0700, Siwei Liu wrote:  
> >>> The definition is incomplete due to lack of spec. There's no "host"
> >>> part defined yet in the host-guest interface. If match by MAC is an
> >>> interface, the same must be done on the host(device) side as well,
> >>> which has been agreed not the way to go. However, I don't think that's
> >>> what the author intends to do by interpreting his QEMU patch - it
> >>> missed the other parts as well, such as the feature negotiation and
> >>> how it interacts with the paired device.
> >>>
> >>> What I said is that match by MAC is just a guest implementation that
> >>> one can change at any time. We now have the group ID on QEMU, why
> >>> still sticking to matching by MAC? It shoulnd't be a host-guest
> >>> interface in the first place anyway.  
> >> I think that match by MAC is a simple portable way to match devices.
> >> E.g. it will work seamlessly with niche things like zPCI. However  
> > That's a good point. I'm not sure if it's a valid assumption that zPCI
> > should always use the same MAC address as that of virtio. Someone

I think we can mostly disregard the weirdness that is zPCI right now.
There should be no fundamental reasons that matching by MAC would not
work, though.

> > who's more familiar with the use case may decide and work on that. It
> > means VFIO device has to take in the MAC address as an identifier to
> > the "-device vfio-pci,.." QEMU option. I think there's no point to
> > match device using group ID in QEMU while using MAC in the guest.
> > Based on that assumption, I'd go with making VIRTIO_NET_F_STANDBY to
> > match device based on group ID, while someone may come up with another
> > feature bit later, say VIRTIO_NET_F_STANDBY_BY_MAC when its QEMU
> > support is available. Would it make sense?  
> 
> VIRTIO_NET_F_STANDBY as defined in the guest virtio_net driver supports match
> by MAC address. I think we should add support for this feature bit in QEMU.
> If submitting a patch to update the spec is a pre-requisite to add this
> feature bit to QEMU, i can do that.

Doing a spec patch and implementing matching by MAC in QEMU sounds like
a good plan to me.

> 
> As far as i understand, group id patches to QEMU are still under review.
> Matching by group ID can be another feature bit that could support matching
> by group id as well as MAC.

That plan sounds good to me as well.

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [virtio-dev] [PATCH v3 3/3] Add "Group Identifier" support to Red Hat PCI Express bridge.
  2018-07-07 12:14     ` Marcel Apfelbaum
@ 2018-07-31 15:58       ` Venu Busireddy
  -1 siblings, 0 replies; 106+ messages in thread
From: Venu Busireddy @ 2018-07-31 15:58 UTC (permalink / raw)
  To: Marcel Apfelbaum
  Cc: Michael S . Tsirkin, Marcel Apfelbaum, virtio-dev, qemu-devel

On 2018-07-07 15:14:11 +0300, Marcel Apfelbaum wrote:
> Hi Venu,
> 
> On 06/30/2018 01:19 AM, Venu Busireddy wrote:
> > Add a new bridge device "pcie-downstream" with a
> > Vendor ID of PCI_VENDOR_ID_REDHAT and a Device ID of
> > PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER.
> 
> Can't we use the current pcie-pci-bridge device (see
> hw/pci-bridge/pcie_pci_bridge.c)
> by adding the new vendor capability to it so we will not need to support a
> new bridge?

Sorry for the delayed response. I was away on vacation.

This question is probably better answered by Michael, but let me try.
Michael suggested adding a new device to avoid confusion with existing
bridge devices.

Venu
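As a rough illustration of the pairing data in the patch quoted below, the bridge's "Vendor-Specific" capability carrying the 64-bit group identifier might be modeled as follows; the struct layout and field names are assumptions for illustration, not the patch's actual encoding.

```c
/* Illustrative model of a vendor-specific PCI capability carrying the
 * 64-bit failover group identifier. Layout and names are assumptions,
 * not the encoding used by the actual patch. */
#include <stdint.h>
#include <string.h>

struct redhat_failover_cap {
    uint8_t  cap_vndr;  /* PCI_CAP_ID_VNDR (0x09) */
    uint8_t  cap_next;  /* offset of next capability, 0 if last */
    uint8_t  cap_len;   /* length of this capability in bytes */
    uint8_t  pad[5];    /* align group_id to an 8-byte boundary */
    uint64_t group_id;  /* the "failover-group-id" CLI value */
};

/* Fill in the capability for a given group identifier. */
static void failover_cap_init(struct redhat_failover_cap *cap, uint64_t id)
{
    memset(cap, 0, sizeof(*cap));
    cap->cap_vndr = 0x09; /* PCI_CAP_ID_VNDR */
    cap->cap_len  = sizeof(*cap);
    cap->group_id = id;
}
```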


> 
> Thanks,
> Marcel
> 
> >   Also, add the "Vendor-Specific"
> > capability to the bridge to contain the "Group Identifier" that will be
> > used to pair a virtio device with the passthrough device attached to
> > that bridge.
> > 
> > This capability is added to the bridge iff the "failover-group-id"
> > option is specified for the bridge.
> > 
> > Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
> > ---
> >   default-configs/arm-softmmu.mak    |   1 +
> >   default-configs/i386-softmmu.mak   |   1 +
> >   default-configs/x86_64-softmmu.mak |   1 +
> >   hw/pci-bridge/Makefile.objs        |   1 +
> >   hw/pci-bridge/pcie_downstream.c    | 220 +++++++++++++++++++++++++++++
> >   hw/pci-bridge/pcie_downstream.h    |  10 ++
> >   include/hw/pci/pci.h               |   1 +
> >   7 files changed, 235 insertions(+)
> >   create mode 100644 hw/pci-bridge/pcie_downstream.c
> >   create mode 100644 hw/pci-bridge/pcie_downstream.h
> > 
> > diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> > index 834d45cfaf..b86c6fb122 100644
> > --- a/default-configs/arm-softmmu.mak
> > +++ b/default-configs/arm-softmmu.mak
> > @@ -139,6 +139,7 @@ CONFIG_IMX_I2C=y
> >   CONFIG_PCIE_PORT=y
> >   CONFIG_XIO3130=y
> >   CONFIG_IOH3420=y
> > +CONFIG_PCIE_DOWNSTREAM=y
> >   CONFIG_I82801B11=y
> >   CONFIG_ACPI=y
> >   CONFIG_SMBIOS=y
> > diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
> > index 8c7d4a0fa0..a900c8f052 100644
> > --- a/default-configs/i386-softmmu.mak
> > +++ b/default-configs/i386-softmmu.mak
> > @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
> >   CONFIG_PCIE_PORT=y
> >   CONFIG_XIO3130=y
> >   CONFIG_IOH3420=y
> > +CONFIG_PCIE_DOWNSTREAM=y
> >   CONFIG_I82801B11=y
> >   CONFIG_SMBIOS=y
> >   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> > diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
> > index 0390b4303c..481e4764be 100644
> > --- a/default-configs/x86_64-softmmu.mak
> > +++ b/default-configs/x86_64-softmmu.mak
> > @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
> >   CONFIG_PCIE_PORT=y
> >   CONFIG_XIO3130=y
> >   CONFIG_IOH3420=y
> > +CONFIG_PCIE_DOWNSTREAM=y
> >   CONFIG_I82801B11=y
> >   CONFIG_SMBIOS=y
> >   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> > diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs
> > index 47065f87d9..5b42212edc 100644
> > --- a/hw/pci-bridge/Makefile.objs
> > +++ b/hw/pci-bridge/Makefile.objs
> > @@ -3,6 +3,7 @@ common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o pcie_pci
> >   common-obj-$(CONFIG_PXB) += pci_expander_bridge.o
> >   common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o
> >   common-obj-$(CONFIG_IOH3420) += ioh3420.o
> > +common-obj-$(CONFIG_PCIE_DOWNSTREAM) += pcie_downstream.o
> >   common-obj-$(CONFIG_I82801B11) += i82801b11.o
> >   # NewWorld PowerMac
> >   common-obj-$(CONFIG_DEC_PCI) += dec.o
> > diff --git a/hw/pci-bridge/pcie_downstream.c b/hw/pci-bridge/pcie_downstream.c
> > new file mode 100644
> > index 0000000000..49b0d1e933
> > --- /dev/null
> > +++ b/hw/pci-bridge/pcie_downstream.c
> > @@ -0,0 +1,220 @@
> > +/*
> > + * Red Hat PCI Express downstream port.
> > + *
> > + * pcie_downstream.c
> > + * Most of this code is copied from xio3130_downstream.c
> > + *
> > + * Copyright (c) 2018 Oracle and/or its affiliates.
> > + *         Author: Venu Busireddy <venu.busireddy@oracle.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along
> > + * with this program; if not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "hw/pci/pci_ids.h"
> > +#include "hw/pci/msi.h"
> > +#include "hw/pci/pcie.h"
> > +#include "pcie_downstream.h"
> > +#include "qapi/error.h"
> > +
> > +#define REDHAT_PCIE_DOWNSTREAM_REVISION                 0x1
> > +#define REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET               0x70
> > +#define REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS      PCI_MSI_FLAGS_64BIT
> > +#define REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR            1
> > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET             0x80
> > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SVID               0
> > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SSID               0
> > +#define REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET               0x90
> > +#define REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET            0xCC
> > +#define REDHAT_PCIE_DOWNSTREAM_AER_OFFSET               0x100
> > +
> > +static void pcie_downstream_write_config(PCIDevice *d, uint32_t address,
> > +                                         uint32_t val, int len)
> > +{
> > +    pci_bridge_write_config(d, address, val, len);
> > +    pcie_cap_flr_write_config(d, address, val, len);
> > +    pcie_cap_slot_write_config(d, address, val, len);
> > +    pcie_aer_write_config(d, address, val, len);
> > +}
> > +
> > +static void pcie_downstream_reset(DeviceState *qdev)
> > +{
> > +    PCIDevice *d = PCI_DEVICE(qdev);
> > +
> > +    pcie_cap_deverr_reset(d);
> > +    pcie_cap_slot_reset(d);
> > +    pcie_cap_arifwd_reset(d);
> > +    pci_bridge_reset(qdev);
> > +}
> > +
> > +static void pcie_downstream_realize(PCIDevice *d, Error **errp)
> > +{
> > +    PCIEPort *p = PCIE_PORT(d);
> > +    PCIESlot *s = PCIE_SLOT(d);
> > +    int rc;
> > +
> > +    pci_bridge_initfn(d, TYPE_PCIE_BUS);
> > +    pcie_port_init_reg(d);
> > +
> > +    rc = pci_bridge_vendor_init(d, REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET,
> > +            PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER, errp);
> > +    if (rc < 0) {
> > +        error_append_hint(errp, "Can't init group ID, error %d\n", rc);
> > +        goto err_bridge;
> > +    }
> > +
> > +    rc = msi_init(d, REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET,
> > +            REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR,
> > +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
> > +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
> > +            errp);
> > +    if (rc < 0) {
> > +        assert(rc == -ENOTSUP);
> > +        goto err_bridge;
> > +    }
> > +
> > +    rc = pci_bridge_ssvid_init(d, REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET,
> > +            REDHAT_PCIE_DOWNSTREAM_SSVID_SVID,
> > +            REDHAT_PCIE_DOWNSTREAM_SSVID_SSID, errp);
> > +    if (rc < 0) {
> > +        goto err_bridge;
> > +    }
> > +
> > +    rc = pcie_cap_init(d, REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET,
> > +            PCI_EXP_TYPE_DOWNSTREAM, p->port, errp);
> > +    if (rc < 0) {
> > +        goto err_msi;
> > +    }
> > +    pcie_cap_flr_init(d);
> > +    pcie_cap_deverr_init(d);
> > +    pcie_cap_slot_init(d, s->slot);
> > +    pcie_cap_arifwd_init(d);
> > +
> > +    pcie_chassis_create(s->chassis);
> > +    rc = pcie_chassis_add_slot(s);
> > +    if (rc < 0) {
> > +        error_setg(errp, "Can't add chassis slot, error %d", rc);
> > +        goto err_pcie_cap;
> > +    }
> > +
> > +    rc = pcie_aer_init(d, PCI_ERR_VER, REDHAT_PCIE_DOWNSTREAM_AER_OFFSET,
> > +            PCI_ERR_SIZEOF, errp);
> > +    if (rc < 0) {
> > +        goto err;
> > +    }
> > +
> > +    return;
> > +
> > +err:
> > +    pcie_chassis_del_slot(s);
> > +err_pcie_cap:
> > +    pcie_cap_exit(d);
> > +err_msi:
> > +    msi_uninit(d);
> > +err_bridge:
> > +    pci_bridge_exitfn(d);
> > +}
> > +
> > +static void pcie_downstream_exitfn(PCIDevice *d)
> > +{
> > +    PCIESlot *s = PCIE_SLOT(d);
> > +
> > +    pcie_aer_exit(d);
> > +    pcie_chassis_del_slot(s);
> > +    pcie_cap_exit(d);
> > +    msi_uninit(d);
> > +    pci_bridge_exitfn(d);
> > +}
> > +
> > +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> > +                               const char *bus_name, pci_map_irq_fn map_irq,
> > +                               uint8_t port, uint8_t chassis, uint16_t slot)
> > +{
> > +    PCIDevice *d;
> > +    PCIBridge *br;
> > +    DeviceState *qdev;
> > +
> > +    d = pci_create_multifunction(bus, devfn, multifunction, "pcie-downstream");
> > +    if (!d) {
> > +        return NULL;
> > +    }
> > +    br = PCI_BRIDGE(d);
> > +
> > +    qdev = DEVICE(d);
> > +    pci_bridge_map_irq(br, bus_name, map_irq);
> > +    qdev_prop_set_uint8(qdev, "port", port);
> > +    qdev_prop_set_uint8(qdev, "chassis", chassis);
> > +    qdev_prop_set_uint16(qdev, "slot", slot);
> > +    qdev_init_nofail(qdev);
> > +
> > +    return PCIE_SLOT(d);
> > +}
> > +
> > +static Property pcie_downstream_props[] = {
> > +    DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present,
> > +            QEMU_PCIE_SLTCAP_PCP_BITNR, true),
> > +    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
> > +            PCIDevice, failover_group_id, ULLONG_MAX),
> > +    DEFINE_PROP_END_OF_LIST()
> > +};
> > +
> > +static const VMStateDescription vmstate_pcie_ds = {
> > +    .name = "pcie-downstream-port",
> > +    .priority = MIG_PRI_PCI_BUS,
> > +    .version_id = 1,
> > +    .minimum_version_id = 1,
> > +    .post_load = pcie_cap_slot_post_load,
> > +    .fields = (VMStateField[]) {
> > +        VMSTATE_PCI_DEVICE(parent_obj.parent_obj.parent_obj, PCIESlot),
> > +        VMSTATE_STRUCT(parent_obj.parent_obj.parent_obj.exp.aer_log,
> > +                       PCIESlot, 0, vmstate_pcie_aer_log, PCIEAERLog),
> > +        VMSTATE_END_OF_LIST()
> > +    }
> > +};
> > +
> > +static void pcie_downstream_class_init(ObjectClass *klass, void *data)
> > +{
> > +    DeviceClass *dc = DEVICE_CLASS(klass);
> > +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> > +
> > +    k->is_bridge = 1;
> > +    k->config_write = pcie_downstream_write_config;
> > +    k->realize = pcie_downstream_realize;
> > +    k->exit = pcie_downstream_exitfn;
> > +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
> > +    k->device_id = PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER;
> > +    k->revision = REDHAT_PCIE_DOWNSTREAM_REVISION;
> > +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> > +    dc->desc = "Red Hat PCIe Downstream Port";
> > +    dc->reset = pcie_downstream_reset;
> > +    dc->vmsd = &vmstate_pcie_ds;
> > +    dc->props = pcie_downstream_props;
> > +}
> > +
> > +static const TypeInfo pcie_downstream_info = {
> > +    .name          = "pcie-downstream",
> > +    .parent        = TYPE_PCIE_SLOT,
> > +    .class_init    = pcie_downstream_class_init,
> > +    .interfaces = (InterfaceInfo[]) {
> > +        { INTERFACE_PCIE_DEVICE },
> > +        { }
> > +    },
> > +};
> > +
> > +static void pcie_downstream_register_types(void)
> > +{
> > +    type_register_static(&pcie_downstream_info);
> > +}
> > +
> > +type_init(pcie_downstream_register_types)
> > diff --git a/hw/pci-bridge/pcie_downstream.h b/hw/pci-bridge/pcie_downstream.h
> > new file mode 100644
> > index 0000000000..a7f1aae4bb
> > --- /dev/null
> > +++ b/hw/pci-bridge/pcie_downstream.h
> > @@ -0,0 +1,10 @@
> > +#ifndef QEMU_PCIE_DOWNSTREAM_H
> > +#define QEMU_PCIE_DOWNSTREAM_H
> > +
> > +#include "hw/pci/pcie_port.h"
> > +
> > +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> > +        const char *bus_name, pci_map_irq_fn map_irq, uint8_t port,
> > +        uint8_t chassis, uint16_t slot);
> > +
> > +#endif /* QEMU_PCIE_DOWNSTREAM_H */
> > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > index bc38032761..bf04ef59f3 100644
> > --- a/include/hw/pci/pci.h
> > +++ b/include/hw/pci/pci.h
> > @@ -103,6 +103,7 @@ extern bool pci_available;
> >   #define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
> >   #define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
> >   #define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
> > +#define PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER  0x0011
> >   #define PCI_DEVICE_ID_REDHAT_QXL                0x0100
> >   #define FMT_PCIBUS                      PRIx64
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > 
> 

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] [PATCH v3 3/3] Add "Group Identifier" support to Red Hat PCI Express bridge.
@ 2018-07-31 15:58       ` Venu Busireddy
  0 siblings, 0 replies; 106+ messages in thread
From: Venu Busireddy @ 2018-07-31 15:58 UTC (permalink / raw)
  To: Marcel Apfelbaum
  Cc: Michael S . Tsirkin, Marcel Apfelbaum, virtio-dev, qemu-devel

On 2018-07-07 15:14:11 +0300, Marcel Apfelbaum wrote:
> Hi Venu,
> 
> On 06/30/2018 01:19 AM, Venu Busireddy wrote:
> > Add a new bridge device "pcie-downstream" with a
> > Vendor ID of PCI_VENDOR_ID_REDHAT and a Device ID of
> > PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER.
> 
> Can't we use the current pcie-pci-bridge device (see
> hw/pci-bridge/pcie_pci_bridge.c)
> by adding the new vendor capability to it so we will not need to support a
> new bridge?
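Either way the capability is exposed — on the existing pcie-pci-bridge or on a new device — the guest side has to discover the group id by walking the conventional capability list to the vendor-specific capability (ID 0x09). A minimal, self-contained sketch of that walk; the mock config layout and the placement of the 64-bit id right after the 3-byte cap header are assumptions for illustration (loosely mirroring the 0x70/0xCC offsets this patch uses):

```c
#include <stdint.h>
#include <stddef.h>

#define PCI_CAPABILITY_LIST 0x34  /* offset of the first-capability pointer */
#define PCI_CAP_ID_VNDR     0x09  /* vendor-specific capability ID */

/* Walk the config-space capability list and return the offset of the
 * first vendor-specific capability, or 0 if none.  'cfg' is a 256-byte
 * snapshot of the function's configuration space. */
static uint8_t find_vendor_cap(const uint8_t *cfg)
{
    uint8_t pos = cfg[PCI_CAPABILITY_LIST];
    int guard = 48;                      /* bound the walk against loops */

    while (pos >= 0x40 && guard--) {
        if (cfg[pos] == PCI_CAP_ID_VNDR)
            return pos;
        pos = cfg[pos + 1];              /* next-capability pointer */
    }
    return 0;
}

/* Hypothetical layout: the 64-bit group id follows the 3-byte vendor-cap
 * header (id, next, length), little-endian. */
static uint64_t read_group_id(const uint8_t *cfg, uint8_t cap)
{
    uint64_t id = 0;
    for (int i = 7; i >= 0; i--)
        id = (id << 8) | cfg[cap + 3 + i];
    return id;
}
```

The walk itself is identical for either bridge type; the question is only which device carries the capability.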

Sorry for the delayed response. I was away on vacation.

This question is probably better answered by Michael, but let me try.
Michael suggested adding a new device to avoid confusion with existing
bridge devices.

Venu


> 
> Thanks,
> Marcel
> 
> >   Also, add the "Vendor-Specific"
> > capability to the bridge to contain the "Group Identifier" that will be
> > used to pair a virtio device with the passthrough device attached to
> > that bridge.
> > 
> > This capability is added to the bridge iff the "failover-group-id"
> > option is specified for the bridge.
> > 
> > Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
> > ---
> >   default-configs/arm-softmmu.mak    |   1 +
> >   default-configs/i386-softmmu.mak   |   1 +
> >   default-configs/x86_64-softmmu.mak |   1 +
> >   hw/pci-bridge/Makefile.objs        |   1 +
> >   hw/pci-bridge/pcie_downstream.c    | 220 +++++++++++++++++++++++++++++
> >   hw/pci-bridge/pcie_downstream.h    |  10 ++
> >   include/hw/pci/pci.h               |   1 +
> >   7 files changed, 235 insertions(+)
> >   create mode 100644 hw/pci-bridge/pcie_downstream.c
> >   create mode 100644 hw/pci-bridge/pcie_downstream.h
> > 
> > diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> > index 834d45cfaf..b86c6fb122 100644
> > --- a/default-configs/arm-softmmu.mak
> > +++ b/default-configs/arm-softmmu.mak
> > @@ -139,6 +139,7 @@ CONFIG_IMX_I2C=y
> >   CONFIG_PCIE_PORT=y
> >   CONFIG_XIO3130=y
> >   CONFIG_IOH3420=y
> > +CONFIG_PCIE_DOWNSTREAM=y
> >   CONFIG_I82801B11=y
> >   CONFIG_ACPI=y
> >   CONFIG_SMBIOS=y
> > diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
> > index 8c7d4a0fa0..a900c8f052 100644
> > --- a/default-configs/i386-softmmu.mak
> > +++ b/default-configs/i386-softmmu.mak
> > @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
> >   CONFIG_PCIE_PORT=y
> >   CONFIG_XIO3130=y
> >   CONFIG_IOH3420=y
> > +CONFIG_PCIE_DOWNSTREAM=y
> >   CONFIG_I82801B11=y
> >   CONFIG_SMBIOS=y
> >   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> > diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
> > index 0390b4303c..481e4764be 100644
> > --- a/default-configs/x86_64-softmmu.mak
> > +++ b/default-configs/x86_64-softmmu.mak
> > @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
> >   CONFIG_PCIE_PORT=y
> >   CONFIG_XIO3130=y
> >   CONFIG_IOH3420=y
> > +CONFIG_PCIE_DOWNSTREAM=y
> >   CONFIG_I82801B11=y
> >   CONFIG_SMBIOS=y
> >   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> > diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs
> > index 47065f87d9..5b42212edc 100644
> > --- a/hw/pci-bridge/Makefile.objs
> > +++ b/hw/pci-bridge/Makefile.objs
> > @@ -3,6 +3,7 @@ common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o pcie_pci
> >   common-obj-$(CONFIG_PXB) += pci_expander_bridge.o
> >   common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o
> >   common-obj-$(CONFIG_IOH3420) += ioh3420.o
> > +common-obj-$(CONFIG_PCIE_DOWNSTREAM) += pcie_downstream.o
> >   common-obj-$(CONFIG_I82801B11) += i82801b11.o
> >   # NewWorld PowerMac
> >   common-obj-$(CONFIG_DEC_PCI) += dec.o
> > diff --git a/hw/pci-bridge/pcie_downstream.c b/hw/pci-bridge/pcie_downstream.c
> > new file mode 100644
> > index 0000000000..49b0d1e933
> > --- /dev/null
> > +++ b/hw/pci-bridge/pcie_downstream.c
> > @@ -0,0 +1,220 @@
> > +/*
> > + * Red Hat PCI Express downstream port.
> > + *
> > + * pcie_downstream.c
> > + * Most of this code is copied from xio3130_downstream.c
> > + *
> > + * Copyright (c) 2018 Oracle and/or its affiliates.
> > + *         Author: Venu Busireddy <venu.busireddy@oracle.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License as published by
> > + * the Free Software Foundation; either version 2 of the License, or
> > + * (at your option) any later version.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License along
> > + * with this program; if not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#include "qemu/osdep.h"
> > +#include "hw/pci/pci_ids.h"
> > +#include "hw/pci/msi.h"
> > +#include "hw/pci/pcie.h"
> > +#include "pcie_downstream.h"
> > +#include "qapi/error.h"
> > +
> > +#define REDHAT_PCIE_DOWNSTREAM_REVISION                 0x1
> > +#define REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET               0x70
> > +#define REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS      PCI_MSI_FLAGS_64BIT
> > +#define REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR            1
> > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET             0x80
> > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SVID               0
> > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SSID               0
> > +#define REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET               0x90
> > +#define REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET            0xCC
> > +#define REDHAT_PCIE_DOWNSTREAM_AER_OFFSET               0x100
> > +
> > +static void pcie_downstream_write_config(PCIDevice *d, uint32_t address,
> > +                                         uint32_t val, int len)
> > +{
> > +    pci_bridge_write_config(d, address, val, len);
> > +    pcie_cap_flr_write_config(d, address, val, len);
> > +    pcie_cap_slot_write_config(d, address, val, len);
> > +    pcie_aer_write_config(d, address, val, len);
> > +}
> > +
> > +static void pcie_downstream_reset(DeviceState *qdev)
> > +{
> > +    PCIDevice *d = PCI_DEVICE(qdev);
> > +
> > +    pcie_cap_deverr_reset(d);
> > +    pcie_cap_slot_reset(d);
> > +    pcie_cap_arifwd_reset(d);
> > +    pci_bridge_reset(qdev);
> > +}
> > +
> > +static void pcie_downstream_realize(PCIDevice *d, Error **errp)
> > +{
> > +    PCIEPort *p = PCIE_PORT(d);
> > +    PCIESlot *s = PCIE_SLOT(d);
> > +    int rc;
> > +
> > +    pci_bridge_initfn(d, TYPE_PCIE_BUS);
> > +    pcie_port_init_reg(d);
> > +
> > +    rc = pci_bridge_vendor_init(d, REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET,
> > +            PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER, errp);
> > +    if (rc < 0) {
> > +        error_append_hint(errp, "Can't init group ID, error %d\n", rc);
> > +        goto err_bridge;
> > +    }
> > +
> > +    rc = msi_init(d, REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET,
> > +            REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR,
> > +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
> > +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
> > +            errp);
> > +    if (rc < 0) {
> > +        assert(rc == -ENOTSUP);
> > +        goto err_bridge;
> > +    }
> > +
> > +    rc = pci_bridge_ssvid_init(d, REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET,
> > +            REDHAT_PCIE_DOWNSTREAM_SSVID_SVID,
> > +            REDHAT_PCIE_DOWNSTREAM_SSVID_SSID, errp);
> > +    if (rc < 0) {
> > +        goto err_bridge;
> > +    }
> > +
> > +    rc = pcie_cap_init(d, REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET,
> > +            PCI_EXP_TYPE_DOWNSTREAM, p->port, errp);
> > +    if (rc < 0) {
> > +        goto err_msi;
> > +    }
> > +    pcie_cap_flr_init(d);
> > +    pcie_cap_deverr_init(d);
> > +    pcie_cap_slot_init(d, s->slot);
> > +    pcie_cap_arifwd_init(d);
> > +
> > +    pcie_chassis_create(s->chassis);
> > +    rc = pcie_chassis_add_slot(s);
> > +    if (rc < 0) {
> > +        error_setg(errp, "Can't add chassis slot, error %d", rc);
> > +        goto err_pcie_cap;
> > +    }
> > +
> > +    rc = pcie_aer_init(d, PCI_ERR_VER, REDHAT_PCIE_DOWNSTREAM_AER_OFFSET,
> > +            PCI_ERR_SIZEOF, errp);
> > +    if (rc < 0) {
> > +        goto err;
> > +    }
> > +
> > +    return;
> > +
> > +err:
> > +    pcie_chassis_del_slot(s);
> > +err_pcie_cap:
> > +    pcie_cap_exit(d);
> > +err_msi:
> > +    msi_uninit(d);
> > +err_bridge:
> > +    pci_bridge_exitfn(d);
> > +}
> > +
> > +static void pcie_downstream_exitfn(PCIDevice *d)
> > +{
> > +    PCIESlot *s = PCIE_SLOT(d);
> > +
> > +    pcie_aer_exit(d);
> > +    pcie_chassis_del_slot(s);
> > +    pcie_cap_exit(d);
> > +    msi_uninit(d);
> > +    pci_bridge_exitfn(d);
> > +}
> > +
> > +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> > +                               const char *bus_name, pci_map_irq_fn map_irq,
> > +                               uint8_t port, uint8_t chassis, uint16_t slot)
> > +{
> > +    PCIDevice *d;
> > +    PCIBridge *br;
> > +    DeviceState *qdev;
> > +
> > +    d = pci_create_multifunction(bus, devfn, multifunction, "pcie-downstream");
> > +    if (!d) {
> > +        return NULL;
> > +    }
> > +    br = PCI_BRIDGE(d);
> > +
> > +    qdev = DEVICE(d);
> > +    pci_bridge_map_irq(br, bus_name, map_irq);
> > +    qdev_prop_set_uint8(qdev, "port", port);
> > +    qdev_prop_set_uint8(qdev, "chassis", chassis);
> > +    qdev_prop_set_uint16(qdev, "slot", slot);
> > +    qdev_init_nofail(qdev);
> > +
> > +    return PCIE_SLOT(d);
> > +}
> > +
> > +static Property pcie_downstream_props[] = {
> > +    DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present,
> > +            QEMU_PCIE_SLTCAP_PCP_BITNR, true),
> > +    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
> > +            PCIDevice, failover_group_id, ULLONG_MAX),
> > +    DEFINE_PROP_END_OF_LIST()
> > +};
> > +
> > +static const VMStateDescription vmstate_pcie_ds = {
> > +    .name = "pcie-downstream-port",
> > +    .priority = MIG_PRI_PCI_BUS,
> > +    .version_id = 1,
> > +    .minimum_version_id = 1,
> > +    .post_load = pcie_cap_slot_post_load,
> > +    .fields = (VMStateField[]) {
> > +        VMSTATE_PCI_DEVICE(parent_obj.parent_obj.parent_obj, PCIESlot),
> > +        VMSTATE_STRUCT(parent_obj.parent_obj.parent_obj.exp.aer_log,
> > +                       PCIESlot, 0, vmstate_pcie_aer_log, PCIEAERLog),
> > +        VMSTATE_END_OF_LIST()
> > +    }
> > +};
> > +
> > +static void pcie_downstream_class_init(ObjectClass *klass, void *data)
> > +{
> > +    DeviceClass *dc = DEVICE_CLASS(klass);
> > +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> > +
> > +    k->is_bridge = 1;
> > +    k->config_write = pcie_downstream_write_config;
> > +    k->realize = pcie_downstream_realize;
> > +    k->exit = pcie_downstream_exitfn;
> > +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
> > +    k->device_id = PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER;
> > +    k->revision = REDHAT_PCIE_DOWNSTREAM_REVISION;
> > +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> > +    dc->desc = "Red Hat PCIe Downstream Port";
> > +    dc->reset = pcie_downstream_reset;
> > +    dc->vmsd = &vmstate_pcie_ds;
> > +    dc->props = pcie_downstream_props;
> > +}
> > +
> > +static const TypeInfo pcie_downstream_info = {
> > +    .name          = "pcie-downstream",
> > +    .parent        = TYPE_PCIE_SLOT,
> > +    .class_init    = pcie_downstream_class_init,
> > +    .interfaces = (InterfaceInfo[]) {
> > +        { INTERFACE_PCIE_DEVICE },
> > +        { }
> > +    },
> > +};
> > +
> > +static void pcie_downstream_register_types(void)
> > +{
> > +    type_register_static(&pcie_downstream_info);
> > +}
> > +
> > +type_init(pcie_downstream_register_types)
> > diff --git a/hw/pci-bridge/pcie_downstream.h b/hw/pci-bridge/pcie_downstream.h
> > new file mode 100644
> > index 0000000000..a7f1aae4bb
> > --- /dev/null
> > +++ b/hw/pci-bridge/pcie_downstream.h
> > @@ -0,0 +1,10 @@
> > +#ifndef QEMU_PCIE_DOWNSTREAM_H
> > +#define QEMU_PCIE_DOWNSTREAM_H
> > +
> > +#include "hw/pci/pcie_port.h"
> > +
> > +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> > +        const char *bus_name, pci_map_irq_fn map_irq, uint8_t port,
> > +        uint8_t chassis, uint16_t slot);
> > +
> > +#endif /* QEMU_PCIE_DOWNSTREAM_H */
> > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > index bc38032761..bf04ef59f3 100644
> > --- a/include/hw/pci/pci.h
> > +++ b/include/hw/pci/pci.h
> > @@ -103,6 +103,7 @@ extern bool pci_available;
> >   #define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
> >   #define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
> >   #define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
> > +#define PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER  0x0011
> >   #define PCI_DEVICE_ID_REDHAT_QXL                0x0100
> >   #define FMT_PCIBUS                      PRIx64
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > 
> 



^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [virtio-dev] [PATCH v3 3/3] Add "Group Identifier" support to Red Hat PCI Express bridge.
  2018-07-31 15:58       ` Venu Busireddy
@ 2018-07-31 16:03         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-31 16:03 UTC (permalink / raw)
  To: Venu Busireddy; +Cc: Marcel Apfelbaum, Marcel Apfelbaum, virtio-dev, qemu-devel

On Tue, Jul 31, 2018 at 10:58:37AM -0500, Venu Busireddy wrote:
> On 2018-07-07 15:14:11 +0300, Marcel Apfelbaum wrote:
> > Hi Venu,
> > 
> > On 06/30/2018 01:19 AM, Venu Busireddy wrote:
> > > Add a new bridge device "pcie-downstream" with a
> > > Vendor ID of PCI_VENDOR_ID_REDHAT and a Device ID of
> > > PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER.
> > 
> > Can't we use the current pcie-pci-bridge device (see
> > hw/pci-bridge/pcie_pci_bridge.c)
> > by adding the new vendor capability to it so we will not need to support a
> > new bridge?
> 
> Sorry for the delayed response. I was away on vacation.
> 
> This question is probably better answered by Michael, but let me try.
> Michael suggested adding a new device to avoid confusion with existing
> bridge devices.
> 
> Venu

It's similar to the hotseat trick - dev id is easier to match against.
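Put concretely, a dedicated device ID lets a guest pairing agent match on two fixed config-space words instead of probing capabilities on a general-purpose bridge. A self-contained sketch; the Red Hat vendor ID (0x1b36) is the real one, and the device ID is the value this patch proposes:

```c
#include <stdint.h>

#define PCI_VENDOR_ID_REDHAT                   0x1b36
#define PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER 0x0011  /* from this series */

/* Matching by device ID: one compare against the vendor/device words at
 * fixed offsets 0x00 and 0x02, no capability walk needed.  'cfg' is a
 * snapshot of the function's configuration space. */
static int is_failover_downport(const uint8_t *cfg)
{
    uint16_t vendor = cfg[0] | (uint16_t)(cfg[1] << 8);  /* offset 0x00 */
    uint16_t device = cfg[2] | (uint16_t)(cfg[3] << 8);  /* offset 0x02 */
    return vendor == PCI_VENDOR_ID_REDHAT &&
           device == PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER;
}
```

This is also the shape of an id_table match in a guest driver: the (vendor, device) pair alone identifies the port, whereas reusing pcie-pci-bridge would force every consumer to distinguish bridges by capability contents.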

> 
> > 
> > Thanks,
> > Marcel
> > 
> > >   Also, add the "Vendor-Specific"
> > > capability to the bridge to contain the "Group Identifier" that will be
> > > used to pair a virtio device with the passthrough device attached to
> > > that bridge.
> > > 
> > > This capability is added to the bridge iff the "failover-group-id"
> > > option is specified for the bridge.
> > > 
> > > Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
> > > ---
> > >   default-configs/arm-softmmu.mak    |   1 +
> > >   default-configs/i386-softmmu.mak   |   1 +
> > >   default-configs/x86_64-softmmu.mak |   1 +
> > >   hw/pci-bridge/Makefile.objs        |   1 +
> > >   hw/pci-bridge/pcie_downstream.c    | 220 +++++++++++++++++++++++++++++
> > >   hw/pci-bridge/pcie_downstream.h    |  10 ++
> > >   include/hw/pci/pci.h               |   1 +
> > >   7 files changed, 235 insertions(+)
> > >   create mode 100644 hw/pci-bridge/pcie_downstream.c
> > >   create mode 100644 hw/pci-bridge/pcie_downstream.h
> > > 
> > > diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> > > index 834d45cfaf..b86c6fb122 100644
> > > --- a/default-configs/arm-softmmu.mak
> > > +++ b/default-configs/arm-softmmu.mak
> > > @@ -139,6 +139,7 @@ CONFIG_IMX_I2C=y
> > >   CONFIG_PCIE_PORT=y
> > >   CONFIG_XIO3130=y
> > >   CONFIG_IOH3420=y
> > > +CONFIG_PCIE_DOWNSTREAM=y
> > >   CONFIG_I82801B11=y
> > >   CONFIG_ACPI=y
> > >   CONFIG_SMBIOS=y
> > > diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
> > > index 8c7d4a0fa0..a900c8f052 100644
> > > --- a/default-configs/i386-softmmu.mak
> > > +++ b/default-configs/i386-softmmu.mak
> > > @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
> > >   CONFIG_PCIE_PORT=y
> > >   CONFIG_XIO3130=y
> > >   CONFIG_IOH3420=y
> > > +CONFIG_PCIE_DOWNSTREAM=y
> > >   CONFIG_I82801B11=y
> > >   CONFIG_SMBIOS=y
> > >   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> > > diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
> > > index 0390b4303c..481e4764be 100644
> > > --- a/default-configs/x86_64-softmmu.mak
> > > +++ b/default-configs/x86_64-softmmu.mak
> > > @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
> > >   CONFIG_PCIE_PORT=y
> > >   CONFIG_XIO3130=y
> > >   CONFIG_IOH3420=y
> > > +CONFIG_PCIE_DOWNSTREAM=y
> > >   CONFIG_I82801B11=y
> > >   CONFIG_SMBIOS=y
> > >   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> > > diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs
> > > index 47065f87d9..5b42212edc 100644
> > > --- a/hw/pci-bridge/Makefile.objs
> > > +++ b/hw/pci-bridge/Makefile.objs
> > > @@ -3,6 +3,7 @@ common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o pcie_pci
> > >   common-obj-$(CONFIG_PXB) += pci_expander_bridge.o
> > >   common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o
> > >   common-obj-$(CONFIG_IOH3420) += ioh3420.o
> > > +common-obj-$(CONFIG_PCIE_DOWNSTREAM) += pcie_downstream.o
> > >   common-obj-$(CONFIG_I82801B11) += i82801b11.o
> > >   # NewWorld PowerMac
> > >   common-obj-$(CONFIG_DEC_PCI) += dec.o
> > > diff --git a/hw/pci-bridge/pcie_downstream.c b/hw/pci-bridge/pcie_downstream.c
> > > new file mode 100644
> > > index 0000000000..49b0d1e933
> > > --- /dev/null
> > > +++ b/hw/pci-bridge/pcie_downstream.c
> > > @@ -0,0 +1,220 @@
> > > +/*
> > > + * Red Hat PCI Express downstream port.
> > > + *
> > > + * pcie_downstream.c
> > > + * Most of this code is copied from xio3130_downstream.c
> > > + *
> > > + * Copyright (c) 2018 Oracle and/or its affiliates.
> > > + *         Author: Venu Busireddy <venu.busireddy@oracle.com>
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License as published by
> > > + * the Free Software Foundation; either version 2 of the License, or
> > > + * (at your option) any later version.
> > > + *
> > > + * This program is distributed in the hope that it will be useful,
> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > + * GNU General Public License for more details.
> > > + *
> > > + * You should have received a copy of the GNU General Public License along
> > > + * with this program; if not, see <http://www.gnu.org/licenses/>.
> > > + */
> > > +
> > > +#include "qemu/osdep.h"
> > > +#include "hw/pci/pci_ids.h"
> > > +#include "hw/pci/msi.h"
> > > +#include "hw/pci/pcie.h"
> > > +#include "pcie_downstream.h"
> > > +#include "qapi/error.h"
> > > +
> > > +#define REDHAT_PCIE_DOWNSTREAM_REVISION                 0x1
> > > +#define REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET               0x70
> > > +#define REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS      PCI_MSI_FLAGS_64BIT
> > > +#define REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR            1
> > > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET             0x80
> > > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SVID               0
> > > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SSID               0
> > > +#define REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET               0x90
> > > +#define REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET            0xCC
> > > +#define REDHAT_PCIE_DOWNSTREAM_AER_OFFSET               0x100
> > > +
> > > +static void pcie_downstream_write_config(PCIDevice *d, uint32_t address,
> > > +                                         uint32_t val, int len)
> > > +{
> > > +    pci_bridge_write_config(d, address, val, len);
> > > +    pcie_cap_flr_write_config(d, address, val, len);
> > > +    pcie_cap_slot_write_config(d, address, val, len);
> > > +    pcie_aer_write_config(d, address, val, len);
> > > +}
> > > +
> > > +static void pcie_downstream_reset(DeviceState *qdev)
> > > +{
> > > +    PCIDevice *d = PCI_DEVICE(qdev);
> > > +
> > > +    pcie_cap_deverr_reset(d);
> > > +    pcie_cap_slot_reset(d);
> > > +    pcie_cap_arifwd_reset(d);
> > > +    pci_bridge_reset(qdev);
> > > +}
> > > +
> > > +static void pcie_downstream_realize(PCIDevice *d, Error **errp)
> > > +{
> > > +    PCIEPort *p = PCIE_PORT(d);
> > > +    PCIESlot *s = PCIE_SLOT(d);
> > > +    int rc;
> > > +
> > > +    pci_bridge_initfn(d, TYPE_PCIE_BUS);
> > > +    pcie_port_init_reg(d);
> > > +
> > > +    rc = pci_bridge_vendor_init(d, REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET,
> > > +            PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER, errp);
> > > +    if (rc < 0) {
> > > +        error_append_hint(errp, "Can't init group ID, error %d\n", rc);
> > > +        goto err_bridge;
> > > +    }
> > > +
> > > +    rc = msi_init(d, REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET,
> > > +            REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR,
> > > +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
> > > +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
> > > +            errp);
> > > +    if (rc < 0) {
> > > +        assert(rc == -ENOTSUP);
> > > +        goto err_bridge;
> > > +    }
> > > +
> > > +    rc = pci_bridge_ssvid_init(d, REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET,
> > > +            REDHAT_PCIE_DOWNSTREAM_SSVID_SVID,
> > > +            REDHAT_PCIE_DOWNSTREAM_SSVID_SSID, errp);
> > > +    if (rc < 0) {
> > > +        goto err_bridge;
> > > +    }
> > > +
> > > +    rc = pcie_cap_init(d, REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET,
> > > +            PCI_EXP_TYPE_DOWNSTREAM, p->port, errp);
> > > +    if (rc < 0) {
> > > +        goto err_msi;
> > > +    }
> > > +    pcie_cap_flr_init(d);
> > > +    pcie_cap_deverr_init(d);
> > > +    pcie_cap_slot_init(d, s->slot);
> > > +    pcie_cap_arifwd_init(d);
> > > +
> > > +    pcie_chassis_create(s->chassis);
> > > +    rc = pcie_chassis_add_slot(s);
> > > +    if (rc < 0) {
> > > +        error_setg(errp, "Can't add chassis slot, error %d", rc);
> > > +        goto err_pcie_cap;
> > > +    }
> > > +
> > > +    rc = pcie_aer_init(d, PCI_ERR_VER, REDHAT_PCIE_DOWNSTREAM_AER_OFFSET,
> > > +            PCI_ERR_SIZEOF, errp);
> > > +    if (rc < 0) {
> > > +        goto err;
> > > +    }
> > > +
> > > +    return;
> > > +
> > > +err:
> > > +    pcie_chassis_del_slot(s);
> > > +err_pcie_cap:
> > > +    pcie_cap_exit(d);
> > > +err_msi:
> > > +    msi_uninit(d);
> > > +err_bridge:
> > > +    pci_bridge_exitfn(d);
> > > +}
> > > +
> > > +static void pcie_downstream_exitfn(PCIDevice *d)
> > > +{
> > > +    PCIESlot *s = PCIE_SLOT(d);
> > > +
> > > +    pcie_aer_exit(d);
> > > +    pcie_chassis_del_slot(s);
> > > +    pcie_cap_exit(d);
> > > +    msi_uninit(d);
> > > +    pci_bridge_exitfn(d);
> > > +}
> > > +
> > > +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> > > +                               const char *bus_name, pci_map_irq_fn map_irq,
> > > +                               uint8_t port, uint8_t chassis, uint16_t slot)
> > > +{
> > > +    PCIDevice *d;
> > > +    PCIBridge *br;
> > > +    DeviceState *qdev;
> > > +
> > > +    d = pci_create_multifunction(bus, devfn, multifunction, "pcie-downstream");
> > > +    if (!d) {
> > > +        return NULL;
> > > +    }
> > > +    br = PCI_BRIDGE(d);
> > > +
> > > +    qdev = DEVICE(d);
> > > +    pci_bridge_map_irq(br, bus_name, map_irq);
> > > +    qdev_prop_set_uint8(qdev, "port", port);
> > > +    qdev_prop_set_uint8(qdev, "chassis", chassis);
> > > +    qdev_prop_set_uint16(qdev, "slot", slot);
> > > +    qdev_init_nofail(qdev);
> > > +
> > > +    return PCIE_SLOT(d);
> > > +}
> > > +
> > > +static Property pcie_downstream_props[] = {
> > > +    DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present,
> > > +            QEMU_PCIE_SLTCAP_PCP_BITNR, true),
> > > +    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
> > > +            PCIDevice, failover_group_id, ULLONG_MAX),
> > > +    DEFINE_PROP_END_OF_LIST()
> > > +};
> > > +
> > > +static const VMStateDescription vmstate_pcie_ds = {
> > > +    .name = "pcie-downstream-port",
> > > +    .priority = MIG_PRI_PCI_BUS,
> > > +    .version_id = 1,
> > > +    .minimum_version_id = 1,
> > > +    .post_load = pcie_cap_slot_post_load,
> > > +    .fields = (VMStateField[]) {
> > > +        VMSTATE_PCI_DEVICE(parent_obj.parent_obj.parent_obj, PCIESlot),
> > > +        VMSTATE_STRUCT(parent_obj.parent_obj.parent_obj.exp.aer_log,
> > > +                       PCIESlot, 0, vmstate_pcie_aer_log, PCIEAERLog),
> > > +        VMSTATE_END_OF_LIST()
> > > +    }
> > > +};
> > > +
> > > +static void pcie_downstream_class_init(ObjectClass *klass, void *data)
> > > +{
> > > +    DeviceClass *dc = DEVICE_CLASS(klass);
> > > +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> > > +
> > > +    k->is_bridge = 1;
> > > +    k->config_write = pcie_downstream_write_config;
> > > +    k->realize = pcie_downstream_realize;
> > > +    k->exit = pcie_downstream_exitfn;
> > > +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
> > > +    k->device_id = PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER;
> > > +    k->revision = REDHAT_PCIE_DOWNSTREAM_REVISION;
> > > +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> > > +    dc->desc = "Red Hat PCIe Downstream Port";
> > > +    dc->reset = pcie_downstream_reset;
> > > +    dc->vmsd = &vmstate_pcie_ds;
> > > +    dc->props = pcie_downstream_props;
> > > +}
> > > +
> > > +static const TypeInfo pcie_downstream_info = {
> > > +    .name          = "pcie-downstream",
> > > +    .parent        = TYPE_PCIE_SLOT,
> > > +    .class_init    = pcie_downstream_class_init,
> > > +    .interfaces = (InterfaceInfo[]) {
> > > +        { INTERFACE_PCIE_DEVICE },
> > > +        { }
> > > +    },
> > > +};
> > > +
> > > +static void pcie_downstream_register_types(void)
> > > +{
> > > +    type_register_static(&pcie_downstream_info);
> > > +}
> > > +
> > > +type_init(pcie_downstream_register_types)
> > > diff --git a/hw/pci-bridge/pcie_downstream.h b/hw/pci-bridge/pcie_downstream.h
> > > new file mode 100644
> > > index 0000000000..a7f1aae4bb
> > > --- /dev/null
> > > +++ b/hw/pci-bridge/pcie_downstream.h
> > > @@ -0,0 +1,10 @@
> > > +#ifndef QEMU_PCIE_DOWNSTREAM_H
> > > +#define QEMU_PCIE_DOWNSTREAM_H
> > > +
> > > +#include "hw/pci/pcie_port.h"
> > > +
> > > +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> > > +        const char *bus_name, pci_map_irq_fn map_irq, uint8_t port,
> > > +        uint8_t chassis, uint16_t slot);
> > > +
> > > +#endif /* QEMU_PCIE_DOWNSTREAM_H */
> > > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > > index bc38032761..bf04ef59f3 100644
> > > --- a/include/hw/pci/pci.h
> > > +++ b/include/hw/pci/pci.h
> > > @@ -103,6 +103,7 @@ extern bool pci_available;
> > >   #define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
> > >   #define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
> > >   #define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
> > > +#define PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER  0x0011
> > >   #define PCI_DEVICE_ID_REDHAT_QXL                0x0100
> > >   #define FMT_PCIBUS                      PRIx64
> > > 
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > > 
> > 

^ permalink raw reply	[flat|nested] 106+ messages in thread
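For reference, the pairing flow described in the cover letter (one operator-chosen failover-group-id passed to both the virtio device and the bridge) might look like the following on the QEMU command line. This is an illustrative sketch only: the device names and options are the ones proposed in this series, while the host BDF, netdev backend, and group value are made-up examples.

```shell
# Illustrative only: pair a virtio-net device with a passthrough NIC by
# giving the NIC's bridge and the virtio device the same failover-group-id.
GROUP_ID=0x1122334455667788          # operator-chosen 64-bit identifier

qemu-system-x86_64 -M q35 -m 4G \
    -device pcie-downstream,id=ds1,chassis=1,slot=1,failover-group-id=$GROUP_ID \
    -device vfio-pci,host=03:00.0,bus=ds1 \
    -netdev tap,id=nd0 \
    -device virtio-net-pci,netdev=nd0,failover-group-id=$GROUP_ID
```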

* Re: [virtio-dev] [PATCH v3 3/3] Add "Group Identifier" support to Red Hat PCI Express bridge.
@ 2018-07-31 16:03         ` Michael S. Tsirkin
  0 siblings, 0 replies; 106+ messages in thread
From: Michael S. Tsirkin @ 2018-07-31 16:03 UTC (permalink / raw)
  To: Venu Busireddy; +Cc: Marcel Apfelbaum, Marcel Apfelbaum, virtio-dev, qemu-devel

On Tue, Jul 31, 2018 at 10:58:37AM -0500, Venu Busireddy wrote:
> On 2018-07-07 15:14:11 +0300, Marcel Apfelbaum wrote:
> > Hi Venu,
> > 
> > On 06/30/2018 01:19 AM, Venu Busireddy wrote:
> > > Add a new bridge device "pcie-downstream" with a
> > > Vendor ID of PCI_VENDOR_ID_REDHAT and a Device ID of
> > > PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER.
> > 
> > Can't we use the current pcie-pci-bridge device (see
> > hw/pci-bridge/pcie_pci_bridge.c)
> > by adding the new vendor capability to it so we will not need to support a
> > new bridge?
> 
> Sorry for the delayed response. I was away on vacation.
> 
> This question is probably better answered by Michael, but let me try.
> Michael suggested adding a new device to avoid confusion with existing
> bridge devices.
> 
> Venu

It's similar to the hotseat trick - dev id is easier to match against.

> 
> > 
> > Thanks,
> > Marcel
> > 
> > >   Also, add the "Vendor-Specific"
> > > capability to the bridge to contain the "Group Identifier" that will be
> > > used to pair a virtio device with the passthrough device attached to
> > > that bridge.
> > > 
> > > This capability is added to the bridge iff the "failover-group-id"
> > > option is specified for the bridge.
> > > 
> > > Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
> > > ---
> > >   default-configs/arm-softmmu.mak    |   1 +
> > >   default-configs/i386-softmmu.mak   |   1 +
> > >   default-configs/x86_64-softmmu.mak |   1 +
> > >   hw/pci-bridge/Makefile.objs        |   1 +
> > >   hw/pci-bridge/pcie_downstream.c    | 220 +++++++++++++++++++++++++++++
> > >   hw/pci-bridge/pcie_downstream.h    |  10 ++
> > >   include/hw/pci/pci.h               |   1 +
> > >   7 files changed, 235 insertions(+)
> > >   create mode 100644 hw/pci-bridge/pcie_downstream.c
> > >   create mode 100644 hw/pci-bridge/pcie_downstream.h
> > > 
> > > diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
> > > index 834d45cfaf..b86c6fb122 100644
> > > --- a/default-configs/arm-softmmu.mak
> > > +++ b/default-configs/arm-softmmu.mak
> > > @@ -139,6 +139,7 @@ CONFIG_IMX_I2C=y
> > >   CONFIG_PCIE_PORT=y
> > >   CONFIG_XIO3130=y
> > >   CONFIG_IOH3420=y
> > > +CONFIG_PCIE_DOWNSTREAM=y
> > >   CONFIG_I82801B11=y
> > >   CONFIG_ACPI=y
> > >   CONFIG_SMBIOS=y
> > > diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
> > > index 8c7d4a0fa0..a900c8f052 100644
> > > --- a/default-configs/i386-softmmu.mak
> > > +++ b/default-configs/i386-softmmu.mak
> > > @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
> > >   CONFIG_PCIE_PORT=y
> > >   CONFIG_XIO3130=y
> > >   CONFIG_IOH3420=y
> > > +CONFIG_PCIE_DOWNSTREAM=y
> > >   CONFIG_I82801B11=y
> > >   CONFIG_SMBIOS=y
> > >   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> > > diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
> > > index 0390b4303c..481e4764be 100644
> > > --- a/default-configs/x86_64-softmmu.mak
> > > +++ b/default-configs/x86_64-softmmu.mak
> > > @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
> > >   CONFIG_PCIE_PORT=y
> > >   CONFIG_XIO3130=y
> > >   CONFIG_IOH3420=y
> > > +CONFIG_PCIE_DOWNSTREAM=y
> > >   CONFIG_I82801B11=y
> > >   CONFIG_SMBIOS=y
> > >   CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
> > > diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs
> > > index 47065f87d9..5b42212edc 100644
> > > --- a/hw/pci-bridge/Makefile.objs
> > > +++ b/hw/pci-bridge/Makefile.objs
> > > @@ -3,6 +3,7 @@ common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o pcie_pci
> > >   common-obj-$(CONFIG_PXB) += pci_expander_bridge.o
> > >   common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o
> > >   common-obj-$(CONFIG_IOH3420) += ioh3420.o
> > > +common-obj-$(CONFIG_PCIE_DOWNSTREAM) += pcie_downstream.o
> > >   common-obj-$(CONFIG_I82801B11) += i82801b11.o
> > >   # NewWorld PowerMac
> > >   common-obj-$(CONFIG_DEC_PCI) += dec.o
> > > diff --git a/hw/pci-bridge/pcie_downstream.c b/hw/pci-bridge/pcie_downstream.c
> > > new file mode 100644
> > > index 0000000000..49b0d1e933
> > > --- /dev/null
> > > +++ b/hw/pci-bridge/pcie_downstream.c
> > > @@ -0,0 +1,220 @@
> > > +/*
> > > + * Red Hat PCI Express downstream port.
> > > + *
> > > + * pcie_downstream.c
> > > + * Most of this code is copied from xio3130_downstream.c
> > > + *
> > > + * Copyright (c) 2018 Oracle and/or its affiliates.
> > > + *         Author: Venu Busireddy <venu.busireddy@oracle.com>
> > > + *
> > > + * This program is free software; you can redistribute it and/or modify
> > > + * it under the terms of the GNU General Public License as published by
> > > + * the Free Software Foundation; either version 2 of the License, or
> > > + * (at your option) any later version.
> > > + *
> > > + * This program is distributed in the hope that it will be useful,
> > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > > + * GNU General Public License for more details.
> > > + *
> > > + * You should have received a copy of the GNU General Public License along
> > > + * with this program; if not, see <http://www.gnu.org/licenses/>.
> > > + */
> > > +
> > > +#include "qemu/osdep.h"
> > > +#include "hw/pci/pci_ids.h"
> > > +#include "hw/pci/msi.h"
> > > +#include "hw/pci/pcie.h"
> > > +#include "pcie_downstream.h"
> > > +#include "qapi/error.h"
> > > +
> > > +#define REDHAT_PCIE_DOWNSTREAM_REVISION                 0x1
> > > +#define REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET               0x70
> > > +#define REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS      PCI_MSI_FLAGS_64BIT
> > > +#define REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR            1
> > > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET             0x80
> > > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SVID               0
> > > +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SSID               0
> > > +#define REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET               0x90
> > > +#define REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET            0xCC
> > > +#define REDHAT_PCIE_DOWNSTREAM_AER_OFFSET               0x100
> > > +
> > > +static void pcie_downstream_write_config(PCIDevice *d, uint32_t address,
> > > +                                         uint32_t val, int len)
> > > +{
> > > +    pci_bridge_write_config(d, address, val, len);
> > > +    pcie_cap_flr_write_config(d, address, val, len);
> > > +    pcie_cap_slot_write_config(d, address, val, len);
> > > +    pcie_aer_write_config(d, address, val, len);
> > > +}
> > > +
> > > +static void pcie_downstream_reset(DeviceState *qdev)
> > > +{
> > > +    PCIDevice *d = PCI_DEVICE(qdev);
> > > +
> > > +    pcie_cap_deverr_reset(d);
> > > +    pcie_cap_slot_reset(d);
> > > +    pcie_cap_arifwd_reset(d);
> > > +    pci_bridge_reset(qdev);
> > > +}
> > > +
> > > +static void pcie_downstream_realize(PCIDevice *d, Error **errp)
> > > +{
> > > +    PCIEPort *p = PCIE_PORT(d);
> > > +    PCIESlot *s = PCIE_SLOT(d);
> > > +    int rc;
> > > +
> > > +    pci_bridge_initfn(d, TYPE_PCIE_BUS);
> > > +    pcie_port_init_reg(d);
> > > +
> > > +    rc = pci_bridge_vendor_init(d, REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET,
> > > +            PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER, errp);
> > > +    if (rc < 0) {
> > > +        error_append_hint(errp, "Can't init group ID, error %d\n", rc);
> > > +        goto err_bridge;
> > > +    }
> > > +
> > > +    rc = msi_init(d, REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET,
> > > +            REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR,
> > > +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
> > > +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
> > > +            errp);
> > > +    if (rc < 0) {
> > > +        assert(rc == -ENOTSUP);
> > > +        goto err_bridge;
> > > +    }
> > > +
> > > +    rc = pci_bridge_ssvid_init(d, REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET,
> > > +            REDHAT_PCIE_DOWNSTREAM_SSVID_SVID,
> > > +            REDHAT_PCIE_DOWNSTREAM_SSVID_SSID, errp);
> > > +    if (rc < 0) {
> > > +        goto err_bridge;
> > > +    }
> > > +
> > > +    rc = pcie_cap_init(d, REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET,
> > > +            PCI_EXP_TYPE_DOWNSTREAM, p->port, errp);
> > > +    if (rc < 0) {
> > > +        goto err_msi;
> > > +    }
> > > +    pcie_cap_flr_init(d);
> > > +    pcie_cap_deverr_init(d);
> > > +    pcie_cap_slot_init(d, s->slot);
> > > +    pcie_cap_arifwd_init(d);
> > > +
> > > +    pcie_chassis_create(s->chassis);
> > > +    rc = pcie_chassis_add_slot(s);
> > > +    if (rc < 0) {
> > > +        error_setg(errp, "Can't add chassis slot, error %d", rc);
> > > +        goto err_pcie_cap;
> > > +    }
> > > +
> > > +    rc = pcie_aer_init(d, PCI_ERR_VER, REDHAT_PCIE_DOWNSTREAM_AER_OFFSET,
> > > +            PCI_ERR_SIZEOF, errp);
> > > +    if (rc < 0) {
> > > +        goto err;
> > > +    }
> > > +
> > > +    return;
> > > +
> > > +err:
> > > +    pcie_chassis_del_slot(s);
> > > +err_pcie_cap:
> > > +    pcie_cap_exit(d);
> > > +err_msi:
> > > +    msi_uninit(d);
> > > +err_bridge:
> > > +    pci_bridge_exitfn(d);
> > > +}
> > > +
> > > +static void pcie_downstream_exitfn(PCIDevice *d)
> > > +{
> > > +    PCIESlot *s = PCIE_SLOT(d);
> > > +
> > > +    pcie_aer_exit(d);
> > > +    pcie_chassis_del_slot(s);
> > > +    pcie_cap_exit(d);
> > > +    msi_uninit(d);
> > > +    pci_bridge_exitfn(d);
> > > +}
> > > +
> > > +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> > > +                               const char *bus_name, pci_map_irq_fn map_irq,
> > > +                               uint8_t port, uint8_t chassis, uint16_t slot)
> > > +{
> > > +    PCIDevice *d;
> > > +    PCIBridge *br;
> > > +    DeviceState *qdev;
> > > +
> > > +    d = pci_create_multifunction(bus, devfn, multifunction, "pcie-downstream");
> > > +    if (!d) {
> > > +        return NULL;
> > > +    }
> > > +    br = PCI_BRIDGE(d);
> > > +
> > > +    qdev = DEVICE(d);
> > > +    pci_bridge_map_irq(br, bus_name, map_irq);
> > > +    qdev_prop_set_uint8(qdev, "port", port);
> > > +    qdev_prop_set_uint8(qdev, "chassis", chassis);
> > > +    qdev_prop_set_uint16(qdev, "slot", slot);
> > > +    qdev_init_nofail(qdev);
> > > +
> > > +    return PCIE_SLOT(d);
> > > +}
> > > +
> > > +static Property pcie_downstream_props[] = {
> > > +    DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present,
> > > +            QEMU_PCIE_SLTCAP_PCP_BITNR, true),
> > > +    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
> > > +            PCIDevice, failover_group_id, ULLONG_MAX),
> > > +    DEFINE_PROP_END_OF_LIST()
> > > +};
> > > +
> > > +static const VMStateDescription vmstate_pcie_ds = {
> > > +    .name = "pcie-downstream-port",
> > > +    .priority = MIG_PRI_PCI_BUS,
> > > +    .version_id = 1,
> > > +    .minimum_version_id = 1,
> > > +    .post_load = pcie_cap_slot_post_load,
> > > +    .fields = (VMStateField[]) {
> > > +        VMSTATE_PCI_DEVICE(parent_obj.parent_obj.parent_obj, PCIESlot),
> > > +        VMSTATE_STRUCT(parent_obj.parent_obj.parent_obj.exp.aer_log,
> > > +                       PCIESlot, 0, vmstate_pcie_aer_log, PCIEAERLog),
> > > +        VMSTATE_END_OF_LIST()
> > > +    }
> > > +};
> > > +
> > > +static void pcie_downstream_class_init(ObjectClass *klass, void *data)
> > > +{
> > > +    DeviceClass *dc = DEVICE_CLASS(klass);
> > > +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
> > > +
> > > +    k->is_bridge = 1;
> > > +    k->config_write = pcie_downstream_write_config;
> > > +    k->realize = pcie_downstream_realize;
> > > +    k->exit = pcie_downstream_exitfn;
> > > +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
> > > +    k->device_id = PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER;
> > > +    k->revision = REDHAT_PCIE_DOWNSTREAM_REVISION;
> > > +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
> > > +    dc->desc = "Red Hat PCIe Downstream Port";
> > > +    dc->reset = pcie_downstream_reset;
> > > +    dc->vmsd = &vmstate_pcie_ds;
> > > +    dc->props = pcie_downstream_props;
> > > +}
> > > +
> > > +static const TypeInfo pcie_downstream_info = {
> > > +    .name          = "pcie-downstream",
> > > +    .parent        = TYPE_PCIE_SLOT,
> > > +    .class_init    = pcie_downstream_class_init,
> > > +    .interfaces = (InterfaceInfo[]) {
> > > +        { INTERFACE_PCIE_DEVICE },
> > > +        { }
> > > +    },
> > > +};
> > > +
> > > +static void pcie_downstream_register_types(void)
> > > +{
> > > +    type_register_static(&pcie_downstream_info);
> > > +}
> > > +
> > > +type_init(pcie_downstream_register_types)
> > > diff --git a/hw/pci-bridge/pcie_downstream.h b/hw/pci-bridge/pcie_downstream.h
> > > new file mode 100644
> > > index 0000000000..a7f1aae4bb
> > > --- /dev/null
> > > +++ b/hw/pci-bridge/pcie_downstream.h
> > > @@ -0,0 +1,10 @@
> > > +#ifndef QEMU_PCIE_DOWNSTREAM_H
> > > +#define QEMU_PCIE_DOWNSTREAM_H
> > > +
> > > +#include "hw/pci/pcie_port.h"
> > > +
> > > +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
> > > +        const char *bus_name, pci_map_irq_fn map_irq, uint8_t port,
> > > +        uint8_t chassis, uint16_t slot);
> > > +
> > > +#endif /* QEMU_PCIE_DOWNSTREAM_H */
> > > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > > index bc38032761..bf04ef59f3 100644
> > > --- a/include/hw/pci/pci.h
> > > +++ b/include/hw/pci/pci.h
> > > @@ -103,6 +103,7 @@ extern bool pci_available;
> > >   #define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
> > >   #define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
> > >   #define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
> > > +#define PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER  0x0011
> > >   #define PCI_DEVICE_ID_REDHAT_QXL                0x0100
> > >   #define FMT_PCIBUS                      PRIx64
> > > 
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > > 
> > 

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [Qemu-devel] [virtio-dev] [PATCH v3 3/3] Add "Group Identifier" support to Red Hat PCI Express bridge.
  2018-07-31 16:03         ` Michael S. Tsirkin
@ 2018-07-31 19:11           ` Marcel Apfelbaum
  -1 siblings, 0 replies; 106+ messages in thread
From: Marcel Apfelbaum @ 2018-07-31 19:11 UTC (permalink / raw)
  To: Michael S. Tsirkin, Venu Busireddy
  Cc: Marcel Apfelbaum, virtio-dev, qemu-devel



On 07/31/2018 07:03 PM, Michael S. Tsirkin wrote:
> On Tue, Jul 31, 2018 at 10:58:37AM -0500, Venu Busireddy wrote:
>> On 2018-07-07 15:14:11 +0300, Marcel Apfelbaum wrote:
>>> Hi Venu,
>>>
>>> On 06/30/2018 01:19 AM, Venu Busireddy wrote:
>>>> Add a new bridge device "pcie-downstream" with a
>>>> Vendor ID of PCI_VENDOR_ID_REDHAT and a Device ID of
>>>> PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER.
>>> Can't we use the current pcie-pci-bridge device (see
>>> hw/pci-bridge/pcie_pci_bridge.c)
>>> by adding the new vendor capability to it so we will not need to support a
>>> new bridge?
>> Sorry for the delayed response. I was away on vacation.
>>
>> This question is probably better answered by Michael, but let me try.
>> Michael suggested adding a new device to avoid confusion with existing
>> bridge devices.
>>
>> Venu
> It's similar to the hotseat trick - dev id is easier to match against.

I understand, but even so, we could use the current pcie-pci-bridge
as a base class and derive from it, changing only the vendor/device ID
and adding the new properties, instead of writing a full implementation.

Only a suggestion, of course.

Thanks,
Marcel



>>>> +        { }
>>>> +    },
>>>> +};
>>>> +
>>>> +static void pcie_downstream_register_types(void)
>>>> +{
>>>> +    type_register_static(&pcie_downstream_info);
>>>> +}
>>>> +
>>>> +type_init(pcie_downstream_register_types)
>>>> diff --git a/hw/pci-bridge/pcie_downstream.h b/hw/pci-bridge/pcie_downstream.h
>>>> new file mode 100644
>>>> index 0000000000..a7f1aae4bb
>>>> --- /dev/null
>>>> +++ b/hw/pci-bridge/pcie_downstream.h
>>>> @@ -0,0 +1,10 @@
>>>> +#ifndef QEMU_PCIE_DOWNSTREAM_H
>>>> +#define QEMU_PCIE_DOWNSTREAM_H
>>>> +
>>>> +#include "hw/pci/pcie_port.h"
>>>> +
>>>> +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
>>>> +        const char *bus_name, pci_map_irq_fn map_irq, uint8_t port,
>>>> +        uint8_t chassis, uint16_t slot);
>>>> +
>>>> +#endif /* QEMU_PCIE_DOWNSTREAM_H */
>>>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>>>> index bc38032761..bf04ef59f3 100644
>>>> --- a/include/hw/pci/pci.h
>>>> +++ b/include/hw/pci/pci.h
>>>> @@ -103,6 +103,7 @@ extern bool pci_available;
>>>>    #define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
>>>>    #define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
>>>>    #define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
>>>> +#define PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER  0x0011
>>>>    #define PCI_DEVICE_ID_REDHAT_QXL                0x0100
>>>>    #define FMT_PCIBUS                      PRIx64
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
>>>> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
>>>>

^ permalink raw reply	[flat|nested] 106+ messages in thread

* Re: [virtio-dev] [PATCH v3 3/3] Add "Group Identifier" support to Red Hat PCI Express bridge.
@ 2018-07-31 19:11           ` Marcel Apfelbaum
  0 siblings, 0 replies; 106+ messages in thread
From: Marcel Apfelbaum @ 2018-07-31 19:11 UTC (permalink / raw)
  To: Michael S. Tsirkin, Venu Busireddy
  Cc: Marcel Apfelbaum, virtio-dev, qemu-devel



On 07/31/2018 07:03 PM, Michael S. Tsirkin wrote:
> On Tue, Jul 31, 2018 at 10:58:37AM -0500, Venu Busireddy wrote:
>> On 2018-07-07 15:14:11 +0300, Marcel Apfelbaum wrote:
>>> Hi Venu,
>>>
>>> On 06/30/2018 01:19 AM, Venu Busireddy wrote:
>>>> Add a new bridge device "pcie-downstream" with a
>>>> Vendor ID of PCI_VENDOR_ID_REDHAT and a Device ID of
>>>> PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER.
>>> Can't we use the current pcie-pci-bridge device (see
>>> hw/pci-bridge/pcie_pci_bridge.c)
>>> by adding the new vendor capability to it so we will not need to support a
>>> new bridge?
>> Sorry for the delayed response. I was away on vacation.
>>
>> This question is probably better answered by Michael, but let me try.
>> Michael suggested adding a new device to avoid confusion with existing
>> bridge devices.
>>
>> Venu
> It's similar to the hotseat trick - dev id is easier to match against.

I understand, but even so, maybe we can use the current pcie-pci-bridge
as a base class and derive from it, changing only the VENDOR/DEVICE ID
and adding the new properties, instead of a full implementation.

Only a suggestion, of course.

Thanks,
Marcel



>>> Thanks,
>>> Marcel
>>>
>>>>    Also, add the "Vendor-Specific"
>>>> capability to the bridge to contain the "Group Identifier" that will be
>>>> used to pair a virtio device with the passthrough device attached to
>>>> that bridge.
>>>>
>>>> This capability is added to the bridge iff the "failover-group-id"
>>>> option is specified for the bridge.
>>>>
>>>> Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com>
>>>> ---
>>>>    default-configs/arm-softmmu.mak    |   1 +
>>>>    default-configs/i386-softmmu.mak   |   1 +
>>>>    default-configs/x86_64-softmmu.mak |   1 +
>>>>    hw/pci-bridge/Makefile.objs        |   1 +
>>>>    hw/pci-bridge/pcie_downstream.c    | 220 +++++++++++++++++++++++++++++
>>>>    hw/pci-bridge/pcie_downstream.h    |  10 ++
>>>>    include/hw/pci/pci.h               |   1 +
>>>>    7 files changed, 235 insertions(+)
>>>>    create mode 100644 hw/pci-bridge/pcie_downstream.c
>>>>    create mode 100644 hw/pci-bridge/pcie_downstream.h
>>>>
>>>> diff --git a/default-configs/arm-softmmu.mak b/default-configs/arm-softmmu.mak
>>>> index 834d45cfaf..b86c6fb122 100644
>>>> --- a/default-configs/arm-softmmu.mak
>>>> +++ b/default-configs/arm-softmmu.mak
>>>> @@ -139,6 +139,7 @@ CONFIG_IMX_I2C=y
>>>>    CONFIG_PCIE_PORT=y
>>>>    CONFIG_XIO3130=y
>>>>    CONFIG_IOH3420=y
>>>> +CONFIG_PCIE_DOWNSTREAM=y
>>>>    CONFIG_I82801B11=y
>>>>    CONFIG_ACPI=y
>>>>    CONFIG_SMBIOS=y
>>>> diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
>>>> index 8c7d4a0fa0..a900c8f052 100644
>>>> --- a/default-configs/i386-softmmu.mak
>>>> +++ b/default-configs/i386-softmmu.mak
>>>> @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
>>>>    CONFIG_PCIE_PORT=y
>>>>    CONFIG_XIO3130=y
>>>>    CONFIG_IOH3420=y
>>>> +CONFIG_PCIE_DOWNSTREAM=y
>>>>    CONFIG_I82801B11=y
>>>>    CONFIG_SMBIOS=y
>>>>    CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
>>>> diff --git a/default-configs/x86_64-softmmu.mak b/default-configs/x86_64-softmmu.mak
>>>> index 0390b4303c..481e4764be 100644
>>>> --- a/default-configs/x86_64-softmmu.mak
>>>> +++ b/default-configs/x86_64-softmmu.mak
>>>> @@ -56,6 +56,7 @@ CONFIG_ACPI_NVDIMM=y
>>>>    CONFIG_PCIE_PORT=y
>>>>    CONFIG_XIO3130=y
>>>>    CONFIG_IOH3420=y
>>>> +CONFIG_PCIE_DOWNSTREAM=y
>>>>    CONFIG_I82801B11=y
>>>>    CONFIG_SMBIOS=y
>>>>    CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
>>>> diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs
>>>> index 47065f87d9..5b42212edc 100644
>>>> --- a/hw/pci-bridge/Makefile.objs
>>>> +++ b/hw/pci-bridge/Makefile.objs
>>>> @@ -3,6 +3,7 @@ common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o pcie_pci
>>>>    common-obj-$(CONFIG_PXB) += pci_expander_bridge.o
>>>>    common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o
>>>>    common-obj-$(CONFIG_IOH3420) += ioh3420.o
>>>> +common-obj-$(CONFIG_PCIE_DOWNSTREAM) += pcie_downstream.o
>>>>    common-obj-$(CONFIG_I82801B11) += i82801b11.o
>>>>    # NewWorld PowerMac
>>>>    common-obj-$(CONFIG_DEC_PCI) += dec.o
>>>> diff --git a/hw/pci-bridge/pcie_downstream.c b/hw/pci-bridge/pcie_downstream.c
>>>> new file mode 100644
>>>> index 0000000000..49b0d1e933
>>>> --- /dev/null
>>>> +++ b/hw/pci-bridge/pcie_downstream.c
>>>> @@ -0,0 +1,220 @@
>>>> +/*
>>>> + * Red Hat PCI Express downstream port.
>>>> + *
>>>> + * pcie_downstream.c
>>>> + * Most of this code is copied from xio3130_downstream.c
>>>> + *
>>>> + * Copyright (c) 2018 Oracle and/or its affiliates.
>>>> + *         Author: Venu Busireddy <venu.busireddy@oracle.com>
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or modify
>>>> + * it under the terms of the GNU General Public License as published by
>>>> + * the Free Software Foundation; either version 2 of the License, or
>>>> + * (at your option) any later version.
>>>> + *
>>>> + * This program is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>>> + * GNU General Public License for more details.
>>>> + *
>>>> + * You should have received a copy of the GNU General Public License along
>>>> + * with this program; if not, see <http://www.gnu.org/licenses/>.
>>>> + */
>>>> +
>>>> +#include "qemu/osdep.h"
>>>> +#include "hw/pci/pci_ids.h"
>>>> +#include "hw/pci/msi.h"
>>>> +#include "hw/pci/pcie.h"
>>>> +#include "pcie_downstream.h"
>>>> +#include "qapi/error.h"
>>>> +
>>>> +#define REDHAT_PCIE_DOWNSTREAM_REVISION                 0x1
>>>> +#define REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET               0x70
>>>> +#define REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS      PCI_MSI_FLAGS_64BIT
>>>> +#define REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR            1
>>>> +#define REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET             0x80
>>>> +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SVID               0
>>>> +#define REDHAT_PCIE_DOWNSTREAM_SSVID_SSID               0
>>>> +#define REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET               0x90
>>>> +#define REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET            0xCC
>>>> +#define REDHAT_PCIE_DOWNSTREAM_AER_OFFSET               0x100
>>>> +
>>>> +static void pcie_downstream_write_config(PCIDevice *d, uint32_t address,
>>>> +                                         uint32_t val, int len)
>>>> +{
>>>> +    pci_bridge_write_config(d, address, val, len);
>>>> +    pcie_cap_flr_write_config(d, address, val, len);
>>>> +    pcie_cap_slot_write_config(d, address, val, len);
>>>> +    pcie_aer_write_config(d, address, val, len);
>>>> +}
>>>> +
>>>> +static void pcie_downstream_reset(DeviceState *qdev)
>>>> +{
>>>> +    PCIDevice *d = PCI_DEVICE(qdev);
>>>> +
>>>> +    pcie_cap_deverr_reset(d);
>>>> +    pcie_cap_slot_reset(d);
>>>> +    pcie_cap_arifwd_reset(d);
>>>> +    pci_bridge_reset(qdev);
>>>> +}
>>>> +
>>>> +static void pcie_downstream_realize(PCIDevice *d, Error **errp)
>>>> +{
>>>> +    PCIEPort *p = PCIE_PORT(d);
>>>> +    PCIESlot *s = PCIE_SLOT(d);
>>>> +    int rc;
>>>> +
>>>> +    pci_bridge_initfn(d, TYPE_PCIE_BUS);
>>>> +    pcie_port_init_reg(d);
>>>> +
>>>> +    rc = pci_bridge_vendor_init(d, REDHAT_PCIE_DOWNSTREAM_VENDOR_OFFSET,
>>>> +            PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER, errp);
>>>> +    if (rc < 0) {
>>>> +        error_append_hint(errp, "Can't init group ID, error %d\n", rc);
>>>> +        goto err_bridge;
>>>> +    }
>>>> +
>>>> +    rc = msi_init(d, REDHAT_PCIE_DOWNSTREAM_MSI_OFFSET,
>>>> +            REDHAT_PCIE_DOWNSTREAM_MSI_NR_VECTOR,
>>>> +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_64BIT,
>>>> +            REDHAT_PCIE_DOWNSTREAM_MSI_SUPPORTED_FLAGS & PCI_MSI_FLAGS_MASKBIT,
>>>> +            errp);
>>>> +    if (rc < 0) {
>>>> +        assert(rc == -ENOTSUP);
>>>> +        goto err_bridge;
>>>> +    }
>>>> +
>>>> +    rc = pci_bridge_ssvid_init(d, REDHAT_PCIE_DOWNSTREAM_SSVID_OFFSET,
>>>> +            REDHAT_PCIE_DOWNSTREAM_SSVID_SVID,
>>>> +            REDHAT_PCIE_DOWNSTREAM_SSVID_SSID, errp);
>>>> +    if (rc < 0) {
>>>> +        goto err_bridge;
>>>> +    }
>>>> +
>>>> +    rc = pcie_cap_init(d, REDHAT_PCIE_DOWNSTREAM_EXP_OFFSET,
>>>> +            PCI_EXP_TYPE_DOWNSTREAM, p->port, errp);
>>>> +    if (rc < 0) {
>>>> +        goto err_msi;
>>>> +    }
>>>> +    pcie_cap_flr_init(d);
>>>> +    pcie_cap_deverr_init(d);
>>>> +    pcie_cap_slot_init(d, s->slot);
>>>> +    pcie_cap_arifwd_init(d);
>>>> +
>>>> +    pcie_chassis_create(s->chassis);
>>>> +    rc = pcie_chassis_add_slot(s);
>>>> +    if (rc < 0) {
>>>> +        error_setg(errp, "Can't add chassis slot, error %d", rc);
>>>> +        goto err_pcie_cap;
>>>> +    }
>>>> +
>>>> +    rc = pcie_aer_init(d, PCI_ERR_VER, REDHAT_PCIE_DOWNSTREAM_AER_OFFSET,
>>>> +            PCI_ERR_SIZEOF, errp);
>>>> +    if (rc < 0) {
>>>> +        goto err;
>>>> +    }
>>>> +
>>>> +    return;
>>>> +
>>>> +err:
>>>> +    pcie_chassis_del_slot(s);
>>>> +err_pcie_cap:
>>>> +    pcie_cap_exit(d);
>>>> +err_msi:
>>>> +    msi_uninit(d);
>>>> +err_bridge:
>>>> +    pci_bridge_exitfn(d);
>>>> +}
>>>> +
>>>> +static void pcie_downstream_exitfn(PCIDevice *d)
>>>> +{
>>>> +    PCIESlot *s = PCIE_SLOT(d);
>>>> +
>>>> +    pcie_aer_exit(d);
>>>> +    pcie_chassis_del_slot(s);
>>>> +    pcie_cap_exit(d);
>>>> +    msi_uninit(d);
>>>> +    pci_bridge_exitfn(d);
>>>> +}
>>>> +
>>>> +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
>>>> +                               const char *bus_name, pci_map_irq_fn map_irq,
>>>> +                               uint8_t port, uint8_t chassis, uint16_t slot)
>>>> +{
>>>> +    PCIDevice *d;
>>>> +    PCIBridge *br;
>>>> +    DeviceState *qdev;
>>>> +
>>>> +    d = pci_create_multifunction(bus, devfn, multifunction, "pcie-downstream");
>>>> +    if (!d) {
>>>> +        return NULL;
>>>> +    }
>>>> +    br = PCI_BRIDGE(d);
>>>> +
>>>> +    qdev = DEVICE(d);
>>>> +    pci_bridge_map_irq(br, bus_name, map_irq);
>>>> +    qdev_prop_set_uint8(qdev, "port", port);
>>>> +    qdev_prop_set_uint8(qdev, "chassis", chassis);
>>>> +    qdev_prop_set_uint16(qdev, "slot", slot);
>>>> +    qdev_init_nofail(qdev);
>>>> +
>>>> +    return PCIE_SLOT(d);
>>>> +}
>>>> +
>>>> +static Property pcie_downstream_props[] = {
>>>> +    DEFINE_PROP_BIT(COMPAT_PROP_PCP, PCIDevice, cap_present,
>>>> +            QEMU_PCIE_SLTCAP_PCP_BITNR, true),
>>>> +    DEFINE_PROP_UINT64(COMPAT_PROP_FAILOVER_GROUP_ID,
>>>> +            PCIDevice, failover_group_id, ULLONG_MAX),
>>>> +    DEFINE_PROP_END_OF_LIST()
>>>> +};
>>>> +
>>>> +static const VMStateDescription vmstate_pcie_ds = {
>>>> +    .name = "pcie-downstream-port",
>>>> +    .priority = MIG_PRI_PCI_BUS,
>>>> +    .version_id = 1,
>>>> +    .minimum_version_id = 1,
>>>> +    .post_load = pcie_cap_slot_post_load,
>>>> +    .fields = (VMStateField[]) {
>>>> +        VMSTATE_PCI_DEVICE(parent_obj.parent_obj.parent_obj, PCIESlot),
>>>> +        VMSTATE_STRUCT(parent_obj.parent_obj.parent_obj.exp.aer_log,
>>>> +                       PCIESlot, 0, vmstate_pcie_aer_log, PCIEAERLog),
>>>> +        VMSTATE_END_OF_LIST()
>>>> +    }
>>>> +};
>>>> +
>>>> +static void pcie_downstream_class_init(ObjectClass *klass, void *data)
>>>> +{
>>>> +    DeviceClass *dc = DEVICE_CLASS(klass);
>>>> +    PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
>>>> +
>>>> +    k->is_bridge = 1;
>>>> +    k->config_write = pcie_downstream_write_config;
>>>> +    k->realize = pcie_downstream_realize;
>>>> +    k->exit = pcie_downstream_exitfn;
>>>> +    k->vendor_id = PCI_VENDOR_ID_REDHAT;
>>>> +    k->device_id = PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER;
>>>> +    k->revision = REDHAT_PCIE_DOWNSTREAM_REVISION;
>>>> +    set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
>>>> +    dc->desc = "Red Hat PCIe Downstream Port";
>>>> +    dc->reset = pcie_downstream_reset;
>>>> +    dc->vmsd = &vmstate_pcie_ds;
>>>> +    dc->props = pcie_downstream_props;
>>>> +}
>>>> +
>>>> +static const TypeInfo pcie_downstream_info = {
>>>> +    .name          = "pcie-downstream",
>>>> +    .parent        = TYPE_PCIE_SLOT,
>>>> +    .class_init    = pcie_downstream_class_init,
>>>> +    .interfaces = (InterfaceInfo[]) {
>>>> +        { INTERFACE_PCIE_DEVICE },
>>>> +        { }
>>>> +    },
>>>> +};
>>>> +
>>>> +static void pcie_downstream_register_types(void)
>>>> +{
>>>> +    type_register_static(&pcie_downstream_info);
>>>> +}
>>>> +
>>>> +type_init(pcie_downstream_register_types)
>>>> diff --git a/hw/pci-bridge/pcie_downstream.h b/hw/pci-bridge/pcie_downstream.h
>>>> new file mode 100644
>>>> index 0000000000..a7f1aae4bb
>>>> --- /dev/null
>>>> +++ b/hw/pci-bridge/pcie_downstream.h
>>>> @@ -0,0 +1,10 @@
>>>> +#ifndef QEMU_PCIE_DOWNSTREAM_H
>>>> +#define QEMU_PCIE_DOWNSTREAM_H
>>>> +
>>>> +#include "hw/pci/pcie_port.h"
>>>> +
>>>> +PCIESlot *pcie_downstream_init(PCIBus *bus, int devfn, bool multifunction,
>>>> +        const char *bus_name, pci_map_irq_fn map_irq, uint8_t port,
>>>> +        uint8_t chassis, uint16_t slot);
>>>> +
>>>> +#endif /* QEMU_PCIE_DOWNSTREAM_H */
>>>> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
>>>> index bc38032761..bf04ef59f3 100644
>>>> --- a/include/hw/pci/pci.h
>>>> +++ b/include/hw/pci/pci.h
>>>> @@ -103,6 +103,7 @@ extern bool pci_available;
>>>>    #define PCI_DEVICE_ID_REDHAT_PCIE_BRIDGE        0x000e
>>>>    #define PCI_DEVICE_ID_REDHAT_MDPY               0x000f
>>>>    #define PCI_DEVICE_ID_REDHAT_BRIDGE_FAILOVER    0x0010
>>>> +#define PCI_DEVICE_ID_REDHAT_DOWNPORT_FAILOVER  0x0011
>>>>    #define PCI_DEVICE_ID_REDHAT_QXL                0x0100
>>>>    #define FMT_PCIBUS                      PRIx64
>>>>




^ permalink raw reply	[flat|nested] 106+ messages in thread

end of thread, other threads:[~2018-07-31 19:12 UTC | newest]

Thread overview: 106+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-29 22:19 [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices Venu Busireddy
2018-06-29 22:19 ` [virtio-dev] " Venu Busireddy
2018-06-29 22:19 ` [Qemu-devel] [PATCH v3 1/3] Add "Group Identifier" support to virtio devices Venu Busireddy
2018-06-29 22:19   ` [virtio-dev] " Venu Busireddy
2018-06-29 22:19 ` [Qemu-devel] [PATCH v3 2/3] Add "Group Identifier" support to Red Hat PCI bridge Venu Busireddy
2018-06-29 22:19   ` [virtio-dev] " Venu Busireddy
2018-07-03  3:13   ` [Qemu-devel] " Siwei Liu
2018-07-03  3:13     ` Siwei Liu
2018-06-29 22:19 ` [Qemu-devel] [PATCH v3 3/3] Add "Group Identifier" support to Red Hat PCI Express bridge Venu Busireddy
2018-06-29 22:19   ` [virtio-dev] " Venu Busireddy
2018-07-07 12:14   ` [Qemu-devel] " Marcel Apfelbaum
2018-07-07 12:14     ` Marcel Apfelbaum
2018-07-31 15:58     ` [Qemu-devel] " Venu Busireddy
2018-07-31 15:58       ` Venu Busireddy
2018-07-31 16:03       ` [Qemu-devel] " Michael S. Tsirkin
2018-07-31 16:03         ` Michael S. Tsirkin
2018-07-31 19:11         ` [Qemu-devel] " Marcel Apfelbaum
2018-07-31 19:11           ` Marcel Apfelbaum
2018-06-29 22:19 ` [Qemu-devel] [PATCH v3 virtio 1/1] Add "Group Identifier" to virtio PCI capabilities Venu Busireddy
2018-06-29 22:19   ` [virtio-dev] " Venu Busireddy
2018-07-02 16:14 ` [Qemu-devel] [PATCH v3 0/3] Use of unique identifier for pairing virtio and passthrough devices Roman Kagan
2018-07-02 21:14   ` si-wei liu
2018-07-02 21:14     ` [virtio-dev] " si-wei liu
2018-07-03  9:58     ` Roman Kagan
2018-07-03 14:28       ` Venu Busireddy
2018-07-03 14:28         ` [virtio-dev] " Venu Busireddy
2018-07-03 14:52         ` [Qemu-devel] [virtio-dev] " Cornelia Huck
2018-07-03 14:52           ` [virtio-dev] Re: [Qemu-devel] " Cornelia Huck
2018-07-03 23:31           ` [Qemu-devel] [virtio-dev] " Siwei Liu
2018-07-03 23:31             ` [virtio-dev] Re: [Qemu-devel] " Siwei Liu
2018-07-04 12:15             ` [Qemu-devel] [virtio-dev] " Cornelia Huck
2018-07-04 12:15               ` [virtio-dev] Re: [Qemu-devel] " Cornelia Huck
2018-07-06  0:49               ` [Qemu-devel] [virtio-dev] " Siwei Liu
2018-07-06  0:49                 ` [virtio-dev] Re: [Qemu-devel] " Siwei Liu
2018-07-06 13:54                 ` [Qemu-devel] [virtio-dev] " Cornelia Huck
2018-07-06 13:54                   ` [virtio-dev] Re: [Qemu-devel] " Cornelia Huck
2018-07-06 15:07                   ` [Qemu-devel] [virtio-dev] " Michael S. Tsirkin
2018-07-06 15:07                     ` [virtio-dev] Re: [Qemu-devel] " Michael S. Tsirkin
2018-07-09 16:20                     ` [Qemu-devel] [virtio-dev] " Cornelia Huck
2018-07-09 16:20                       ` [virtio-dev] Re: [Qemu-devel] " Cornelia Huck
2018-07-06 23:37                   ` [Qemu-devel] [virtio-dev] " Siwei Liu
2018-07-06 23:37                     ` [virtio-dev] Re: [Qemu-devel] " Siwei Liu
2018-07-09 16:27                     ` [Qemu-devel] [virtio-dev] " Cornelia Huck
2018-07-09 16:27                       ` [virtio-dev] Re: [Qemu-devel] " Cornelia Huck
2018-07-09 13:14                   ` [Qemu-devel] [virtio-dev] " Roman Kagan
2018-07-09 13:14                     ` [virtio-dev] Re: [Qemu-devel] " Roman Kagan
2018-07-09 16:10                     ` [Qemu-devel] [virtio-dev] " Cornelia Huck
2018-07-09 16:10                       ` [virtio-dev] Re: [Qemu-devel] " Cornelia Huck
2018-07-03 15:34         ` Roman Kagan
2018-07-03 22:27       ` si-wei liu
2018-07-03 22:27         ` [virtio-dev] " si-wei liu
2018-07-09 13:00         ` Roman Kagan
2018-07-09 18:35           ` Michael S. Tsirkin
2018-07-09 18:35             ` [virtio-dev] " Michael S. Tsirkin
2018-07-10  1:11           ` si-wei liu
2018-07-10  1:11             ` [virtio-dev] " si-wei liu
2018-07-10  1:54             ` Michael S. Tsirkin
2018-07-10  1:54               ` [virtio-dev] " Michael S. Tsirkin
2018-07-11  0:07               ` Siwei Liu
2018-07-11  0:07                 ` [virtio-dev] " Siwei Liu
2018-07-11  0:07                 ` [Qemu-devel] [virtio-dev] " Siwei Liu
2018-07-11  9:53                 ` Re: [Qemu-devel] " Cornelia Huck
2018-07-11  9:53                   ` [virtio-dev] " Cornelia Huck
2018-07-11  9:53                   ` [Qemu-devel] [virtio-dev] " Cornelia Huck
2018-07-12  9:37                   ` Re: [Qemu-devel] " Siwei Liu
2018-07-12  9:37                     ` [virtio-dev] " Siwei Liu
2018-07-12  9:37                     ` [Qemu-devel] [virtio-dev] " Siwei Liu
2018-07-12 11:31                     ` Re: [Qemu-devel] " Cornelia Huck
2018-07-12 11:31                       ` [virtio-dev] " Cornelia Huck
2018-07-12 11:31                       ` [Qemu-devel] [virtio-dev] " Cornelia Huck
2018-07-12 20:52                       ` Re: [Qemu-devel] " Siwei Liu
2018-07-12 20:52                         ` [virtio-dev] " Siwei Liu
2018-07-12 20:52                         ` [Qemu-devel] [virtio-dev] " Siwei Liu
2018-07-12 21:00                         ` Re: [Qemu-devel] " Michael S. Tsirkin
2018-07-12 21:00                           ` [virtio-dev] " Michael S. Tsirkin
2018-07-12 21:00                           ` [Qemu-devel] [virtio-dev] " Michael S. Tsirkin
2018-07-12 22:19                           ` Re: [Qemu-devel] " Siwei Liu
2018-07-12 22:19                             ` [virtio-dev] " Siwei Liu
2018-07-12 22:19                             ` [Qemu-devel] [virtio-dev] " Siwei Liu
2018-07-13  1:20                             ` Re: [Qemu-devel] " Samudrala, Sridhar
2018-07-13  1:20                               ` [virtio-dev] " Samudrala, Sridhar
2018-07-13  1:20                               ` [Qemu-devel] [virtio-dev] " Samudrala, Sridhar
2018-07-13  3:28                               ` Re: [Qemu-devel] " Michael S. Tsirkin
2018-07-13  3:28                                 ` [virtio-dev] " Michael S. Tsirkin
2018-07-13  3:28                                 ` [Qemu-devel] [virtio-dev] " Michael S. Tsirkin
2018-07-13  9:15                               ` Re: [Qemu-devel] " Cornelia Huck
2018-07-13  9:15                                 ` [virtio-dev] " Cornelia Huck
2018-07-13  9:15                                 ` [Qemu-devel] [virtio-dev] " Cornelia Huck
2018-07-12 19:18                   ` Re: [Qemu-devel] " Michael S. Tsirkin
2018-07-12 19:18                     ` [virtio-dev] " Michael S. Tsirkin
2018-07-12 19:18                     ` [Qemu-devel] [virtio-dev] " Michael S. Tsirkin
2018-07-10  1:58             ` [Qemu-devel] " Michael S. Tsirkin
2018-07-10  1:58               ` [virtio-dev] " Michael S. Tsirkin
2018-07-10 18:56               ` Siwei Liu
2018-07-10 18:56                 ` [virtio-dev] " Siwei Liu
2018-07-10 18:56                 ` [Qemu-devel] [virtio-dev] " Siwei Liu
2018-07-10  2:05           ` [Qemu-devel] " Michael S. Tsirkin
2018-07-10  2:05             ` [virtio-dev] " Michael S. Tsirkin
2018-07-04  5:43       ` Michael S. Tsirkin
2018-07-04  5:43         ` [virtio-dev] " Michael S. Tsirkin
2018-07-10  2:11 ` Michael S. Tsirkin
2018-07-10  2:11   ` [virtio-dev] " Michael S. Tsirkin
2018-07-10 14:28   ` [Qemu-devel] " Venu Busireddy
2018-07-10 14:28     ` [virtio-dev] " Venu Busireddy
2018-07-12 21:01     ` [Qemu-devel] " Michael S. Tsirkin
2018-07-12 21:01       ` [virtio-dev] " Michael S. Tsirkin
