All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest
@ 2016-12-31  9:13 Cao jin
  2016-12-31  9:13 ` [Qemu-devel] [PATCH v11 1/4] pcie_aer: support configurable AER capa version Cao jin
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Cao jin @ 2016-12-31  9:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.williamson, mst, izumi.taku

As previous discussion suggest, we could take a step back to handle non-fatal
error first, this will make this patchset much more thinner, because we could
drop all the configuration restriction related patches.

FYI: patch 1 has been cherry picked into another series, and wait to be merged
first, so this patchset can't compile in your host.

v11 changelog:
1. drop a bunch of code which check the configuration.
2. modify patch 3 to handle non-fatal error only, fatal error still
   results in vm stop.
   Doesn't modify as suggestion "add another eventfd do distinguish fatal &
   non-fatal error", because 1st, user has the ability to distinguish them
   just from the uncorrectable error status; 2nd, for back compatible, e.g.
   an old user handle both error, rely on the current error eventfd.

Test:
Test it with intel 82576 NIC, which has 2 functions, function 1 has cable
connected to gateway, function 0 has no link. Test in 4 scenario.
1. just assign function 1 to one vm, function 0 has no user
2. assign 2 function to one vm, totally comply previous configuraton restrction
3. assign 2 function to one vm, under different virtual bus
4, assign functions to 2 different vm

The test steps are the same as v10: assign ip to function 1, add route info,
and ping the gateway. The results meet expectation. But the unsteady hardware
often emit fatal error, still don't know why. And igb driver in guest seems
has bug: ping gateway for a while, even if I don't do anything, it will show
"Destination Host Unreachable" after many successful ping. But obviously,
neither of these has relationship with this patchset.


Chen Fan (3):
  vfio: new function to init aer cap for vfio device
  vfio-pci: pass the aer error to guest
  vfio: add 'aer' property to expose aercap

Dou Liyang (1):
  pcie_aer: support configurable AER capa version

 hw/net/e1000e.c                    |   3 +-
 hw/pci-bridge/ioh3420.c            |   2 +-
 hw/pci-bridge/xio3130_downstream.c |   2 +-
 hw/pci-bridge/xio3130_upstream.c   |   2 +-
 hw/pci/pcie_aer.c                  |   5 +-
 hw/vfio/pci.c                      | 139 +++++++++++++++++++++++++++++++++----
 hw/vfio/pci.h                      |   3 +
 include/hw/pci/pcie_aer.h          |   3 +-
 8 files changed, 139 insertions(+), 20 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH v11 1/4] pcie_aer: support configurable AER capa version
  2016-12-31  9:13 [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest Cao jin
@ 2016-12-31  9:13 ` Cao jin
  2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 2/4] vfio: new function to init aer cap for vfio device Cao jin
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Cao jin @ 2016-12-31  9:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.williamson, mst, izumi.taku, Dou Liyang

From: Dou Liyang <douly.fnst@cn.fujitsu.com>

Now, AER capa version is fixed to v2, if assigned device is actually
v1, this value will be inconsistent between guest and host

Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
---
 hw/net/e1000e.c                    | 3 ++-
 hw/pci-bridge/ioh3420.c            | 2 +-
 hw/pci-bridge/xio3130_downstream.c | 2 +-
 hw/pci-bridge/xio3130_upstream.c   | 2 +-
 hw/pci/pcie_aer.c                  | 5 +++--
 include/hw/pci/pcie_aer.h          | 3 ++-
 6 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/hw/net/e1000e.c b/hw/net/e1000e.c
index 4994e1c..66de849 100644
--- a/hw/net/e1000e.c
+++ b/hw/net/e1000e.c
@@ -472,7 +472,8 @@ static void e1000e_pci_realize(PCIDevice *pci_dev, Error **errp)
         hw_error("Failed to initialize PM capability");
     }
 
-    if (pcie_aer_init(pci_dev, e1000e_aer_offset, PCI_ERR_SIZEOF) < 0) {
+    if (pcie_aer_init(pci_dev, PCI_ERR_VER, e1000e_aer_offset,
+                              PCI_ERR_SIZEOF) < 0) {
         hw_error("Failed to initialize AER capability");
     }
 
diff --git a/hw/pci-bridge/ioh3420.c b/hw/pci-bridge/ioh3420.c
index c8b5ac4..d70784c 100644
--- a/hw/pci-bridge/ioh3420.c
+++ b/hw/pci-bridge/ioh3420.c
@@ -135,7 +135,7 @@ static int ioh3420_initfn(PCIDevice *d)
         goto err_pcie_cap;
     }
 
-    rc = pcie_aer_init(d, IOH_EP_AER_OFFSET, PCI_ERR_SIZEOF);
+    rc = pcie_aer_init(d, PCI_ERR_VER, IOH_EP_AER_OFFSET, PCI_ERR_SIZEOF);
     if (rc < 0) {
         goto err;
     }
diff --git a/hw/pci-bridge/xio3130_downstream.c b/hw/pci-bridge/xio3130_downstream.c
index cef6e13..5d1ce01 100644
--- a/hw/pci-bridge/xio3130_downstream.c
+++ b/hw/pci-bridge/xio3130_downstream.c
@@ -97,7 +97,7 @@ static int xio3130_downstream_initfn(PCIDevice *d)
         goto err_pcie_cap;
     }
 
-    rc = pcie_aer_init(d, XIO3130_AER_OFFSET, PCI_ERR_SIZEOF);
+    rc = pcie_aer_init(d, PCI_ERR_VER, XIO3130_AER_OFFSET, PCI_ERR_SIZEOF);
     if (rc < 0) {
         goto err;
     }
diff --git a/hw/pci-bridge/xio3130_upstream.c b/hw/pci-bridge/xio3130_upstream.c
index 4ad0440..3819964 100644
--- a/hw/pci-bridge/xio3130_upstream.c
+++ b/hw/pci-bridge/xio3130_upstream.c
@@ -85,7 +85,7 @@ static int xio3130_upstream_initfn(PCIDevice *d)
     pcie_cap_flr_init(d);
     pcie_cap_deverr_init(d);
 
-    rc = pcie_aer_init(d, XIO3130_AER_OFFSET, PCI_ERR_SIZEOF);
+    rc = pcie_aer_init(d, PCI_ERR_VER, XIO3130_AER_OFFSET, PCI_ERR_SIZEOF);
     if (rc < 0) {
         goto err;
     }
diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 048ce6a..ac47f34 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -96,11 +96,12 @@ static void aer_log_clear_all_err(PCIEAERLog *aer_log)
     aer_log->log_num = 0;
 }
 
-int pcie_aer_init(PCIDevice *dev, uint16_t offset, uint16_t size)
+int pcie_aer_init(PCIDevice *dev, uint8_t cap_ver, uint16_t offset,
+                  uint16_t size)
 {
     PCIExpressDevice *exp;
 
-    pcie_add_capability(dev, PCI_EXT_CAP_ID_ERR, PCI_ERR_VER,
+    pcie_add_capability(dev, PCI_EXT_CAP_ID_ERR, cap_ver,
                         offset, size);
     exp = &dev->exp;
     exp->aer_cap = offset;
diff --git a/include/hw/pci/pcie_aer.h b/include/hw/pci/pcie_aer.h
index c2ee4e2..c373591 100644
--- a/include/hw/pci/pcie_aer.h
+++ b/include/hw/pci/pcie_aer.h
@@ -87,7 +87,8 @@ struct PCIEAERErr {
 
 extern const VMStateDescription vmstate_pcie_aer_log;
 
-int pcie_aer_init(PCIDevice *dev, uint16_t offset, uint16_t size);
+int pcie_aer_init(PCIDevice *dev, uint8_t cap_ver, uint16_t offset,
+                  uint16_t size);
 void pcie_aer_exit(PCIDevice *dev);
 void pcie_aer_write_config(PCIDevice *dev,
                            uint32_t addr, uint32_t val, int len);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH RFC v11 2/4] vfio: new function to init aer cap for vfio device
  2016-12-31  9:13 [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest Cao jin
  2016-12-31  9:13 ` [Qemu-devel] [PATCH v11 1/4] pcie_aer: support configurable AER capa version Cao jin
@ 2016-12-31  9:13 ` Cao jin
  2017-01-18 22:09   ` Alex Williamson
  2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest Cao jin
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Cao jin @ 2016-12-31  9:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.williamson, mst, izumi.taku, Chen Fan, Dou Liyang

From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>

Introduce new function to initilize AER capability registers
for vfio-pci device.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
---
 hw/vfio/pci.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 hw/vfio/pci.h |  3 +++
 2 files changed, 85 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index d7dbe0e..76a8ac3 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -1851,18 +1851,81 @@ out:
     return 0;
 }
 
-static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
+static int vfio_setup_aer(VFIOPCIDevice *vdev, uint8_t cap_ver,
+                          int pos, uint16_t size, Error **errp)
+{
+    PCIDevice *pdev = &vdev->pdev;
+    PCIDevice *dev_iter;
+    uint8_t type;
+    uint32_t errcap;
+
+    /* In case the physical device has AER cap while user doesn't enable AER,
+     * still allocate the config space in the emulated device for AER */
+    if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
+        pcie_add_capability(pdev, PCI_EXT_CAP_ID_ERR,
+                            cap_ver, pos, size);
+        return 0;
+    }
+
+    dev_iter = pci_bridge_get_device(pdev->bus);
+    if (!dev_iter) {
+        goto error;
+    }
+
+    while (dev_iter) {
+        if (!pci_is_express(dev_iter)) {
+            goto error;
+        }
+
+        type = pcie_cap_get_type(dev_iter);
+        if ((type != PCI_EXP_TYPE_ROOT_PORT &&
+             type != PCI_EXP_TYPE_UPSTREAM &&
+             type != PCI_EXP_TYPE_DOWNSTREAM)) {
+            goto error;
+        }
+
+        if (!dev_iter->exp.aer_cap) {
+            goto error;
+        }
+
+        dev_iter = pci_bridge_get_device(dev_iter->bus);
+    }
+
+    errcap = vfio_pci_read_config(pdev, pos + PCI_ERR_CAP, 4);
+    /*
+     * The ability to record multiple headers is depending on
+     * the state of the Multiple Header Recording Capable bit and
+     * enabled by the Multiple Header Recording Enable bit.
+     */
+    if ((errcap & PCI_ERR_CAP_MHRC) &&
+        (errcap & PCI_ERR_CAP_MHRE)) {
+        pdev->exp.aer_log.log_max = PCIE_AER_LOG_MAX_DEFAULT;
+    } else {
+        pdev->exp.aer_log.log_max = 0;
+    }
+
+    pcie_cap_deverr_init(pdev);
+    return pcie_aer_init(pdev, cap_ver, pos, size);
+
+error:
+    error_setg(errp, "vfio: Unable to enable AER for device %s, parent bus "
+               "does not support AER signaling", vdev->vbasedev.name);
+    return -1;
+}
+
+static int vfio_add_ext_cap(VFIOPCIDevice *vdev, Error **errp)
 {
     PCIDevice *pdev = &vdev->pdev;
     uint32_t header;
     uint16_t cap_id, next, size;
     uint8_t cap_ver;
     uint8_t *config;
+    int ret = 0;
 
     /* Only add extended caps if we have them and the guest can see them */
     if (!pci_is_express(pdev) || !pci_bus_is_express(pdev->bus) ||
         !pci_get_long(pdev->config + PCI_CONFIG_SPACE_SIZE)) {
-        return;
+        return 0;
     }
 
     /*
@@ -1911,6 +1974,9 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
                                    PCI_EXT_CAP_NEXT_MASK);
 
         switch (cap_id) {
+        case PCI_EXT_CAP_ID_ERR:
+            ret = vfio_setup_aer(vdev, cap_ver, next, size, errp);
+            break;
         case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuse OVMF */
         case PCI_EXT_CAP_ID_ARI: /* XXX Needs next function virtualization */
             trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, next);
@@ -1919,6 +1985,9 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
             pcie_add_capability(pdev, cap_id, cap_ver, next, size);
         }
 
+        if (ret) {
+            goto out;
+        }
     }
 
     /* Cleanup chain head ID if necessary */
@@ -1926,8 +1995,9 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
         pci_set_word(pdev->config + PCI_CONFIG_SPACE_SIZE, 0);
     }
 
+out:
     g_free(config);
-    return;
+    return ret;
 }
 
 static int vfio_add_capabilities(VFIOPCIDevice *vdev, Error **errp)
@@ -1945,8 +2015,8 @@ static int vfio_add_capabilities(VFIOPCIDevice *vdev, Error **errp)
         return ret;
     }
 
-    vfio_add_ext_cap(vdev);
-    return 0;
+    ret = vfio_add_ext_cap(vdev, errp);
+    return ret;
 }
 
 static void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
@@ -2769,6 +2839,13 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
         goto out_teardown;
     }
 
+    if ((vdev->features & VFIO_FEATURE_ENABLE_AER) &&
+        !pdev->exp.aer_cap) {
+        error_setg(errp, "vfio: Unable to enable AER for device %s, device "
+                   "does not support AER signaling", vdev->vbasedev.name);
+        return;
+    }
+
     if (vdev->vga) {
         vfio_vga_quirk_setup(vdev);
     }
diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
index a8366bb..64701c4 100644
--- a/hw/vfio/pci.h
+++ b/hw/vfio/pci.h
@@ -15,6 +15,7 @@
 #include "qemu-common.h"
 #include "exec/memory.h"
 #include "hw/pci/pci.h"
+#include "hw/pci/pci_bridge.h"
 #include "hw/vfio/vfio-common.h"
 #include "qemu/event_notifier.h"
 #include "qemu/queue.h"
@@ -132,6 +133,8 @@ typedef struct VFIOPCIDevice {
 #define VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT 2
 #define VFIO_FEATURE_ENABLE_IGD_OPREGION \
                                 (1 << VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT)
+#define VFIO_FEATURE_ENABLE_AER_BIT 3
+#define VFIO_FEATURE_ENABLE_AER (1 << VFIO_FEATURE_ENABLE_AER_BIT)
     int32_t bootindex;
     uint32_t igd_gms;
     uint8_t pm_cap;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest
  2016-12-31  9:13 [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest Cao jin
  2016-12-31  9:13 ` [Qemu-devel] [PATCH v11 1/4] pcie_aer: support configurable AER capa version Cao jin
  2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 2/4] vfio: new function to init aer cap for vfio device Cao jin
@ 2016-12-31  9:13 ` Cao jin
  2017-01-09 22:55   ` Michael S. Tsirkin
  2017-01-18 22:31   ` Alex Williamson
  2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 4/4] vfio: add 'aer' property to expose aercap Cao jin
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 21+ messages in thread
From: Cao jin @ 2016-12-31  9:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.williamson, mst, izumi.taku, Chen Fan, Dou Liyang

From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>

When physical device has uncorrectable error hanppened, the vfio_pci
driver will signal the uncorrectable error status register value to
corresponding QEMU's vfio-pci device via the eventfd registered by this
device, then, the vfio-pci's error eventfd handler will be invoked in
event loop.

Construct and pass the aer message to root port, root port will trigger an
interrupt to signal guest, then, the guest driver will do the recovery.

Note: Now only support non-fatal error's recovery, fatal error will
still result in vm stop.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
---
 hw/vfio/pci.c | 50 ++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 42 insertions(+), 8 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 76a8ac3..9861f72 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2470,21 +2470,55 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
 static void vfio_err_notifier_handler(void *opaque)
 {
     VFIOPCIDevice *vdev = opaque;
+    PCIDevice *dev = &vdev->pdev;
+    PCIEAERMsg msg = {
+        .severity = 0,
+        .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn,
+    };
+    int len;
+    uint64_t uncor_status;
+
+    /* Read uncorrectable error status from driver */
+    len = read(vdev->err_notifier.rfd, &uncor_status, sizeof(uncor_status));
+    if (len != sizeof(uncor_status)) {
+        error_report("vfio-pci: uncor error status reading returns"
+                     " invalid number of bytes: %d", len);
+        return; //Or goto stop?
+    }
+
+    if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
+        goto stop;
+    }
+
+    /* Populate the aer msg and send it to root port */
+    if (dev->exp.aer_cap) {
+        uint8_t *aer_cap = dev->config + dev->exp.aer_cap;
+        bool isfatal = uncor_status &
+                       pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER);
+
+	if (isfatal) {
+	    goto stop;
+	}
+
+        msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN :
+                                 PCI_ERR_ROOT_CMD_NONFATAL_EN;
 
-    if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
+        error_report("vfio-pci device %d sending AER to root port. uncor"
+                     " status = 0x%"PRIx64, dev->devfn, uncor_status);
+        pcie_aer_msg(dev, &msg);
         return;
     }
 
+stop:
     /*
-     * TBD. Retrieve the error details and decide what action
-     * needs to be taken. One of the actions could be to pass
-     * the error to the guest and have the guest driver recover
-     * from the error. This requires that PCIe capabilities be
-     * exposed to the guest. For now, we just terminate the
-     * guest to contain the error.
+     * Terminate the guest in case of
+     * 1. AER capability is not exposed to guest.
+     * 2. AER capability is exposed, but error is fatal, only non-fatal
+     * error is handled now.
      */
 
-    error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
+    error_report("%s(%s) fatal error detected. Please collect any data"
+            " possible and then kill the guest", __func__, vdev->vbasedev.name);
 
     vm_stop(RUN_STATE_INTERNAL_ERROR);
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [Qemu-devel] [PATCH RFC v11 4/4] vfio: add 'aer' property to expose aercap
  2016-12-31  9:13 [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest Cao jin
                   ` (2 preceding siblings ...)
  2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest Cao jin
@ 2016-12-31  9:13 ` Cao jin
  2017-01-18 22:36   ` Alex Williamson
  2016-12-31  9:17 ` [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest no-reply
  2017-01-18 21:43 ` Alex Williamson
  5 siblings, 1 reply; 21+ messages in thread
From: Cao jin @ 2016-12-31  9:13 UTC (permalink / raw)
  To: qemu-devel; +Cc: alex.williamson, mst, izumi.taku, Chen Fan, Dou Liyang

From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>

Add 'aer' property, let user choose whether expose the aer capability
or not. Should disable aer feature by default, because only non-fatal
error is supported now.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
---
 hw/vfio/pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 9861f72..fc9db66 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3057,6 +3057,8 @@ static Property vfio_pci_dev_properties[] = {
     DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice,
                        sub_device_id, PCI_ANY_ID),
     DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0),
+    DEFINE_PROP_BIT("aer", VFIOPCIDevice, features,
+                    VFIO_FEATURE_ENABLE_AER_BIT, false),
     /*
      * TODO - support passed fds... is this necessary?
      * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest
  2016-12-31  9:13 [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest Cao jin
                   ` (3 preceding siblings ...)
  2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 4/4] vfio: add 'aer' property to expose aercap Cao jin
@ 2016-12-31  9:17 ` no-reply
  2017-01-18 21:43 ` Alex Williamson
  5 siblings, 0 replies; 21+ messages in thread
From: no-reply @ 2016-12-31  9:17 UTC (permalink / raw)
  To: caoj.fnst; +Cc: famz, qemu-devel, izumi.taku, alex.williamson, mst

Hi,

Your series seems to have some coding style problems. See output below for
more information:

Message-id: 1483175588-17006-1-git-send-email-caoj.fnst@cn.fujitsu.com
Type: series
Subject: [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

# Useful git options
git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/1483175588-17006-1-git-send-email-caoj.fnst@cn.fujitsu.com -> patchew/1483175588-17006-1-git-send-email-caoj.fnst@cn.fujitsu.com
Switched to a new branch 'test'
7f7bb75 vfio: add 'aer' property to expose aercap
a62d474 vfio-pci: pass the aer error to guest
6678558 vfio: new function to init aer cap for vfio device
6e3e08c pcie_aer: support configurable AER capa version

=== OUTPUT BEGIN ===
Checking PATCH 1/4: pcie_aer: support configurable AER capa version...
Checking PATCH 2/4: vfio: new function to init aer cap for vfio device...
Checking PATCH 3/4: vfio-pci: pass the aer error to guest...
ERROR: do not use C99 // comments
#44: FILE: hw/vfio/pci.c:2486:
+        return; //Or goto stop?

ERROR: code indent should never use tabs
#57: FILE: hw/vfio/pci.c:2499:
+^Iif (isfatal) {$

ERROR: code indent should never use tabs
#58: FILE: hw/vfio/pci.c:2500:
+^I    goto stop;$

ERROR: code indent should never use tabs
#59: FILE: hw/vfio/pci.c:2501:
+^I}$

total: 4 errors, 0 warnings, 63 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 4/4: vfio: add 'aer' property to expose aercap...
=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@freelists.org

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest
  2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest Cao jin
@ 2017-01-09 22:55   ` Michael S. Tsirkin
  2017-01-18 22:31   ` Alex Williamson
  1 sibling, 0 replies; 21+ messages in thread
From: Michael S. Tsirkin @ 2017-01-09 22:55 UTC (permalink / raw)
  To: Cao jin; +Cc: qemu-devel, alex.williamson, izumi.taku, Chen Fan, Dou Liyang

On Sat, Dec 31, 2016 at 05:13:07PM +0800, Cao jin wrote:
> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> 
> When physical device has uncorrectable error hanppened, the vfio_pci
> driver will signal the uncorrectable error status register value to
> corresponding QEMU's vfio-pci device via the eventfd registered by this
> device, then, the vfio-pci's error eventfd handler will be invoked in
> event loop.
> 
> Construct and pass the aer message to root port, root port will trigger an
> interrupt to signal guest, then, the guest driver will do the recovery.
> 
> Note: Now only support non-fatal error's recovery, fatal error will
> still result in vm stop.
> 
> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
> ---
>  hw/vfio/pci.c | 50 ++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 42 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 76a8ac3..9861f72 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2470,21 +2470,55 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
>  static void vfio_err_notifier_handler(void *opaque)
>  {
>      VFIOPCIDevice *vdev = opaque;
> +    PCIDevice *dev = &vdev->pdev;
> +    PCIEAERMsg msg = {
> +        .severity = 0,
> +        .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn,
> +    };
> +    int len;
> +    uint64_t uncor_status;
> +
> +    /* Read uncorrectable error status from driver */
> +    len = read(vdev->err_notifier.rfd, &uncor_status, sizeof(uncor_status));
> +    if (len != sizeof(uncor_status)) {
> +        error_report("vfio-pci: uncor error status reading returns"
> +                     " invalid number of bytes: %d", len);
> +        return; //Or goto stop?

I would definitely suggest this to make sure we don't regress.

> +    }
> +
> +    if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
> +        goto stop;
> +    }
> +
> +    /* Populate the aer msg and send it to root port */
> +    if (dev->exp.aer_cap) {
> +        uint8_t *aer_cap = dev->config + dev->exp.aer_cap;
> +        bool isfatal = uncor_status &
> +                       pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER);
> +
> +	if (isfatal) {
> +	    goto stop;
> +	}
> +
> +        msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN :
> +                                 PCI_ERR_ROOT_CMD_NONFATAL_EN;
>  
> -    if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
> +        error_report("vfio-pci device %d sending AER to root port. uncor"
> +                     " status = 0x%"PRIx64, dev->devfn, uncor_status);
> +        pcie_aer_msg(dev, &msg);
>          return;
>      }
>  
> +stop:
>      /*
> -     * TBD. Retrieve the error details and decide what action
> -     * needs to be taken. One of the actions could be to pass
> -     * the error to the guest and have the guest driver recover
> -     * from the error. This requires that PCIe capabilities be
> -     * exposed to the guest. For now, we just terminate the
> -     * guest to contain the error.
> +     * Terminate the guest in case of
> +     * 1. AER capability is not exposed to guest.
> +     * 2. AER capability is exposed, but error is fatal, only non-fatal
> +     * error is handled now.
>       */
>  
> -    error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
> +    error_report("%s(%s) fatal error detected. Please collect any data"
> +            " possible and then kill the guest", __func__, vdev->vbasedev.name);
>  
>      vm_stop(RUN_STATE_INTERNAL_ERROR);
>  }
> -- 
> 1.8.3.1
> 
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest
  2016-12-31  9:13 [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest Cao jin
                   ` (4 preceding siblings ...)
  2016-12-31  9:17 ` [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest no-reply
@ 2017-01-18 21:43 ` Alex Williamson
  2017-01-19  6:15   ` Cao jin
  2017-01-19  6:25   ` Cao jin
  5 siblings, 2 replies; 21+ messages in thread
From: Alex Williamson @ 2017-01-18 21:43 UTC (permalink / raw)
  To: Cao jin; +Cc: qemu-devel, mst, izumi.taku

On Sat, 31 Dec 2016 17:13:04 +0800
Cao jin <caoj.fnst@cn.fujitsu.com> wrote:

> As previous discussion suggest, we could take a step back to handle non-fatal
> error first, this will make this patchset much more thinner, because we could
> drop all the configuration restriction related patches.
> 
> FYI: patch 1 has been cherry picked into another series, and wait to be merged
> first, so this patchset can't compile in your host.
> 
> v11 changelog:
> 1. drop a bunch of code which check the configuration.
> 2. modify patch 3 to handle non-fatal error only, fatal error still
>    results in vm stop.
>    Doesn't modify as suggestion "add another eventfd do distinguish fatal &
>    non-fatal error", because 1st, user has the ability to distinguish them
>    just from the uncorrectable error status; 2nd, for back compatible, e.g.
>    an old user handle both error, rely on the current error eventfd.
> 
> Test:
> Test it with intel 82576 NIC, which has 2 functions, function 1 has cable
> connected to gateway, function 0 has no link. Test in 4 scenario.
> 1. just assign function 1 to one vm, function 0 has no user
> 2. assign 2 function to one vm, totally comply previous configuraton restrction
> 3. assign 2 function to one vm, under different virtual bus
> 4, assign functions to 2 different vm
> 
> The test steps are the same as v10: assign ip to function 1, add route info,
> and ping the gateway. The results meet expectation. But the unsteady hardware
> often emit fatal error, still don't know why. And igb driver in guest seems
> has bug: ping gateway for a while, even if I don't do anything, it will show
> "Destination Host Unreachable" after many successful ping. But obviously,
> neither of these has relationship with this patchset.

So something doesn't work right regardless and this doesn't describe
what testing was done of the functionality added here.  Were AER events
injected?  Did fatal ones cause a vmstop, did non-fatal ones continue?
How can you know if he other function was affected if you don't even
have a cable connected?  How is testing on something that doesn't seem
to work correctly already valid?  Thanks,

Alex

> Chen Fan (3):
>   vfio: new function to init aer cap for vfio device
>   vfio-pci: pass the aer error to guest
>   vfio: add 'aer' property to expose aercap
> 
> Dou Liyang (1):
>   pcie_aer: support configurable AER capa version
> 
>  hw/net/e1000e.c                    |   3 +-
>  hw/pci-bridge/ioh3420.c            |   2 +-
>  hw/pci-bridge/xio3130_downstream.c |   2 +-
>  hw/pci-bridge/xio3130_upstream.c   |   2 +-
>  hw/pci/pcie_aer.c                  |   5 +-
>  hw/vfio/pci.c                      | 139 +++++++++++++++++++++++++++++++++----
>  hw/vfio/pci.h                      |   3 +
>  include/hw/pci/pcie_aer.h          |   3 +-
>  8 files changed, 139 insertions(+), 20 deletions(-)
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 2/4] vfio: new function to init aer cap for vfio device
  2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 2/4] vfio: new function to init aer cap for vfio device Cao jin
@ 2017-01-18 22:09   ` Alex Williamson
  2017-01-20  6:03     ` Cao jin
  0 siblings, 1 reply; 21+ messages in thread
From: Alex Williamson @ 2017-01-18 22:09 UTC (permalink / raw)
  To: Cao jin; +Cc: qemu-devel, mst, izumi.taku, Chen Fan, Dou Liyang

On Sat, 31 Dec 2016 17:13:06 +0800
Cao jin <caoj.fnst@cn.fujitsu.com> wrote:

> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> 
> Introduce new function to initilize AER capability registers
> for vfio-pci device.
> 
> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
> ---
>  hw/vfio/pci.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++----
>  hw/vfio/pci.h |  3 +++
>  2 files changed, 85 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index d7dbe0e..76a8ac3 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -1851,18 +1851,81 @@ out:
>      return 0;
>  }
>  
> -static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> +static int vfio_setup_aer(VFIOPCIDevice *vdev, uint8_t cap_ver,
> +                          int pos, uint16_t size, Error **errp)
> +{
> +    PCIDevice *pdev = &vdev->pdev;
> +    PCIDevice *dev_iter;
> +    uint8_t type;
> +    uint32_t errcap;
> +
> +    /* In case the physical device has AER cap while user doesn't enable AER,
> +     * still allocate the config space in the emulated device for AER */

Bad comment style

/*
 * Multi-line comments should
 * look like this.
 */

/* Single line comments may look like this */

/* Muli-line comments may
 * absolutely not look like this */

> +    if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
> +        pcie_add_capability(pdev, PCI_EXT_CAP_ID_ERR,
> +                            cap_ver, pos, size);
> +        return 0;
> +    }
> +
> +    dev_iter = pci_bridge_get_device(pdev->bus);
> +    if (!dev_iter) {
> +        goto error;
> +    }
> +
> +    while (dev_iter) {
> +        if (!pci_is_express(dev_iter)) {
> +            goto error;
> +        }
> +
> +        type = pcie_cap_get_type(dev_iter);
> +        if ((type != PCI_EXP_TYPE_ROOT_PORT &&
> +             type != PCI_EXP_TYPE_UPSTREAM &&
> +             type != PCI_EXP_TYPE_DOWNSTREAM)) {
> +            goto error;
> +        }
> +
> +        if (!dev_iter->exp.aer_cap) {
> +            goto error;
> +        }
> +
> +        dev_iter = pci_bridge_get_device(dev_iter->bus);
> +    }
> +
> +    errcap = vfio_pci_read_config(pdev, pos + PCI_ERR_CAP, 4);
> +    /*
> +     * The ability to record multiple headers is depending on
> +     * the state of the Multiple Header Recording Capable bit and
> +     * enabled by the Multiple Header Recording Enable bit.
> +     */
> +    if ((errcap & PCI_ERR_CAP_MHRC) &&
> +        (errcap & PCI_ERR_CAP_MHRE)) {
> +        pdev->exp.aer_log.log_max = PCIE_AER_LOG_MAX_DEFAULT;
> +    } else {
> +        pdev->exp.aer_log.log_max = 0;
> +    }
> +
> +    pcie_cap_deverr_init(pdev);
> +    return pcie_aer_init(pdev, cap_ver, pos, size);
> +
> +error:
> +    error_setg(errp, "vfio: Unable to enable AER for device %s, parent bus "
> +               "does not support AER signaling", vdev->vbasedev.name);
> +    return -1;

If we're only handling non-fatal errors, should we place the burden on
the user to know when to add the aer flag for the device or should we
just enable it opportunistically as available?  Should pcie_aer_init()
be the one testing the topology?  It doesn't seem vfio specific anymore.

> +}
> +
> +static int vfio_add_ext_cap(VFIOPCIDevice *vdev, Error **errp)
>  {
>      PCIDevice *pdev = &vdev->pdev;
>      uint32_t header;
>      uint16_t cap_id, next, size;
>      uint8_t cap_ver;
>      uint8_t *config;
> +    int ret = 0;
>  
>      /* Only add extended caps if we have them and the guest can see them */
>      if (!pci_is_express(pdev) || !pci_bus_is_express(pdev->bus) ||
>          !pci_get_long(pdev->config + PCI_CONFIG_SPACE_SIZE)) {
> -        return;
> +        return 0;
>      }
>  
>      /*
> @@ -1911,6 +1974,9 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
>                                     PCI_EXT_CAP_NEXT_MASK);
>  
>          switch (cap_id) {
> +        case PCI_EXT_CAP_ID_ERR:
> +            ret = vfio_setup_aer(vdev, cap_ver, next, size, errp);
> +            break;
>          case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuse OVMF */
>          case PCI_EXT_CAP_ID_ARI: /* XXX Needs next function virtualization */
>              trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, next);
> @@ -1919,6 +1985,9 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
>              pcie_add_capability(pdev, cap_id, cap_ver, next, size);
>          }
>  
> +        if (ret) {
> +            goto out;
> +        }
>      }
>  
>      /* Cleanup chain head ID if necessary */
> @@ -1926,8 +1995,9 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
>          pci_set_word(pdev->config + PCI_CONFIG_SPACE_SIZE, 0);
>      }
>  
> +out:
>      g_free(config);
> -    return;
> +    return ret;
>  }
>  
>  static int vfio_add_capabilities(VFIOPCIDevice *vdev, Error **errp)
> @@ -1945,8 +2015,8 @@ static int vfio_add_capabilities(VFIOPCIDevice *vdev, Error **errp)
>          return ret;
>      }
>  
> -    vfio_add_ext_cap(vdev);
> -    return 0;
> +    ret = vfio_add_ext_cap(vdev, errp);
> +    return ret;
>  }
>  
>  static void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
> @@ -2769,6 +2839,13 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>          goto out_teardown;
>      }
>  
> +    if ((vdev->features & VFIO_FEATURE_ENABLE_AER) &&
> +        !pdev->exp.aer_cap) {
> +        error_setg(errp, "vfio: Unable to enable AER for device %s, device "
> +                   "does not support AER signaling", vdev->vbasedev.name);
> +        return;
> +    }
> +
>      if (vdev->vga) {
>          vfio_vga_quirk_setup(vdev);
>      }
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index a8366bb..64701c4 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -15,6 +15,7 @@
>  #include "qemu-common.h"
>  #include "exec/memory.h"
>  #include "hw/pci/pci.h"
> +#include "hw/pci/pci_bridge.h"
>  #include "hw/vfio/vfio-common.h"
>  #include "qemu/event_notifier.h"
>  #include "qemu/queue.h"
> @@ -132,6 +133,8 @@ typedef struct VFIOPCIDevice {
>  #define VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT 2
>  #define VFIO_FEATURE_ENABLE_IGD_OPREGION \
>                                  (1 << VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT)
> +#define VFIO_FEATURE_ENABLE_AER_BIT 3
> +#define VFIO_FEATURE_ENABLE_AER (1 << VFIO_FEATURE_ENABLE_AER_BIT)
>      int32_t bootindex;
>      uint32_t igd_gms;
>      uint8_t pm_cap;

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest
  2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest Cao jin
  2017-01-09 22:55   ` Michael S. Tsirkin
@ 2017-01-18 22:31   ` Alex Williamson
  2017-01-20  6:50     ` Cao jin
  2017-01-20  6:57     ` Tian, Kevin
  1 sibling, 2 replies; 21+ messages in thread
From: Alex Williamson @ 2017-01-18 22:31 UTC (permalink / raw)
  To: Cao jin; +Cc: qemu-devel, mst, izumi.taku, Chen Fan, Dou Liyang

On Sat, 31 Dec 2016 17:13:07 +0800
Cao jin <caoj.fnst@cn.fujitsu.com> wrote:

> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> 
> When physical device has uncorrectable error hanppened, the vfio_pci
> driver will signal the uncorrectable error status register value to
> corresponding QEMU's vfio-pci device via the eventfd registered by this
> device, then, the vfio-pci's error eventfd handler will be invoked in
> event loop.
> 
> Construct and pass the aer message to root port, root port will trigger an
> interrupt to signal guest, then, the guest driver will do the recovery.
> 
> Note: Now only support non-fatal error's recovery, fatal error will
> still result in vm stop.

Please update the entire commit log, don't just add a note that this
now only covers non-fatal errors.

> 
> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
> ---
>  hw/vfio/pci.c | 50 ++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 42 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 76a8ac3..9861f72 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2470,21 +2470,55 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
>  static void vfio_err_notifier_handler(void *opaque)
>  {
>      VFIOPCIDevice *vdev = opaque;
> +    PCIDevice *dev = &vdev->pdev;
> +    PCIEAERMsg msg = {
> +        .severity = 0,
> +        .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn,
> +    };
> +    int len;
> +    uint64_t uncor_status;
> +
> +    /* Read uncorrectable error status from driver */
> +    len = read(vdev->err_notifier.rfd, &uncor_status, sizeof(uncor_status));

Whoa this seems bogus.  In the kernel eventfd_signal() adds the value
to the internal counter.  We can't guarantee that the user is going to
immediately respond to every signal, multiple signals might occur, each
incrementing the counter.  I don't think we can use the eventfd like
this.

> +    if (len != sizeof(uncor_status)) {
> +        error_report("vfio-pci: uncor error status reading returns"
> +                     " invalid number of bytes: %d", len);
> +        return; //Or goto stop?

It's bogus use of the eventfd anyway afaict, but
event_notifier_test_and_clear() certainly handles at least EINTR.

> +    }
> +
> +    if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
> +        goto stop;
> +    }

I'm not sure this should be user selected anymore.

> +
> +    /* Populate the aer msg and send it to root port */
> +    if (dev->exp.aer_cap) {
> +        uint8_t *aer_cap = dev->config + dev->exp.aer_cap;
> +        bool isfatal = uncor_status &
> +                       pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER);
> +
> +	if (isfatal) {
> +	    goto stop;
> +	}

QEMU uses spaces not tabs.

> +
> +        msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN :
> +                                 PCI_ERR_ROOT_CMD_NONFATAL_EN;

We don't get here if isfatal.

>  
> -    if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
> +        error_report("vfio-pci device %d sending AER to root port. uncor"
> +                     " status = 0x%"PRIx64, dev->devfn, uncor_status);
> +        pcie_aer_msg(dev, &msg);
>          return;
>      }
>  
> +stop:
>      /*
> -     * TBD. Retrieve the error details and decide what action
> -     * needs to be taken. One of the actions could be to pass
> -     * the error to the guest and have the guest driver recover
> -     * from the error. This requires that PCIe capabilities be
> -     * exposed to the guest. For now, we just terminate the
> -     * guest to contain the error.
> +     * Terminate the guest in case of
> +     * 1. AER capability is not exposed to guest.
> +     * 2. AER capability is exposed, but error is fatal, only non-fatal
> +     * error is handled now.

You're also currently requiring the user to enable aer per device.

>       */
>  
> -    error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
> +    error_report("%s(%s) fatal error detected. Please collect any data"
> +            " possible and then kill the guest", __func__, vdev->vbasedev.name);
>  
>      vm_stop(RUN_STATE_INTERNAL_ERROR);
>  }

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 4/4] vfio: add 'aer' property to expose aercap
  2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 4/4] vfio: add 'aer' property to expose aercap Cao jin
@ 2017-01-18 22:36   ` Alex Williamson
  2017-01-20  6:04     ` Cao jin
  0 siblings, 1 reply; 21+ messages in thread
From: Alex Williamson @ 2017-01-18 22:36 UTC (permalink / raw)
  To: Cao jin; +Cc: qemu-devel, mst, izumi.taku, Chen Fan, Dou Liyang

On Sat, 31 Dec 2016 17:13:08 +0800
Cao jin <caoj.fnst@cn.fujitsu.com> wrote:

> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> 
> Add 'aer' property, let user choose whether expose the aer capability
> or not.

But that's not what it does, it only controls the behavior in response
to non-fatal errors, the capability is exposed regardless.

> Should disable aer feature by default, because only non-fatal
> error is supported now. 

Why does that mean it should be disabled by default?  What bad thing
happens if we enable this opportunistically?

> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
> ---
>  hw/vfio/pci.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 9861f72..fc9db66 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -3057,6 +3057,8 @@ static Property vfio_pci_dev_properties[] = {
>      DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice,
>                         sub_device_id, PCI_ANY_ID),
>      DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0),
> +    DEFINE_PROP_BIT("aer", VFIOPCIDevice, features,
> +                    VFIO_FEATURE_ENABLE_AER_BIT, false),
>      /*
>       * TODO - support passed fds... is this necessary?
>       * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest
  2017-01-18 21:43 ` Alex Williamson
@ 2017-01-19  6:15   ` Cao jin
  2017-01-19  6:25   ` Cao jin
  1 sibling, 0 replies; 21+ messages in thread
From: Cao jin @ 2017-01-19  6:15 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, mst, izumi.taku



On 01/19/2017 05:43 AM, Alex Williamson wrote:
> On Sat, 31 Dec 2016 17:13:04 +0800
> Cao jin <caoj.fnst@cn.fujitsu.com> wrote:
> 
>> As previous discussion suggest, we could take a step back to handle non-fatal
>> error first, this will make this patchset much more thinner, because we could
>> drop all the configuration restriction related patches.
>>
>> FYI: patch 1 has been cherry picked into another series, and wait to be merged
>> first, so this patchset can't compile in your host.
>>
>> v11 changelog:
>> 1. drop a bunch of code which check the configuration.
>> 2. modify patch 3 to handle non-fatal error only, fatal error still
>>    results in vm stop.
>>    Doesn't modify as suggestion "add another eventfd do distinguish fatal &
>>    non-fatal error", because 1st, user has the ability to distinguish them
>>    just from the uncorrectable error status; 2nd, for back compatible, e.g.
>>    an old user handle both error, rely on the current error eventfd.
>>
>> Test:
>> Test it with intel 82576 NIC, which has 2 functions, function 1 has cable
>> connected to gateway, function 0 has no link. Test in 4 scenario.
>> 1. just assign function 1 to one vm, function 0 has no user
>> 2. assign 2 function to one vm, totally comply previous configuraton restrction
>> 3. assign 2 function to one vm, under different virtual bus
>> 4, assign functions to 2 different vm
>>
>> The test steps are the same as v10: assign ip to function 1, add route info,
>> and ping the gateway. The results meet expectation. But the unsteady hardware
>> often emit fatal error, still don't know why. And igb driver in guest seems
>> has bug: ping gateway for a while, even if I don't do anything, it will show
>> "Destination Host Unreachable" after many successful ping. But obviously,
>> neither of these has relationship with this patchset.
> 
> So something doesn't work right regardless and this doesn't describe
> what testing was done of the functionality added here.  Were AER events
> injected?  Did fatal ones cause a vmstop, did non-fatal ones continue?
> How can you know if he other function was affected if you don't even
> have a cable connected?  How is testing on something that doesn't seem
> to work correctly already valid?  Thanks,
> 
> Alex
> 

Well, of course I test the functionality added here via aer-inject tool
& driver, the same test as I did in v10. The unsteady hardware slows my
test, make me did the test again & again & again, until see what I
expected, maybe I should share some details.

General test steps(linux only)
  * assign ip to function 1 in guest, add route info, and ping the gateway
  * on host, injecting various non fatal error to both devices.
    Robustness test is: fast repeat injecting manually.

Expectation: in all scenario, function 1 who is pinging still works
after non fatal error recovery; function 0(in host, same guest, another
guest) doesn't have any abnormal phenomenon(crash guest, or any abnormal
log in dmesg); fatal error cause a vmstop(surely)

I did the test in each scenario, and I got what I expected: fatal error
definitely cause a vmstop; even bothered by unsteady hardware, non fatal
error recovery is ok

IIRC, in test of v10, I even find that the NIC would emit the fatal
error after I reboot the host, before I start vm.  This is what I called
unsteady hardware(I will try to figure out on which scenario the
hardware is easy to be unsteady). So I think, something doesn't work
right, but in my opinion, none of this patchset's business.

-- 
Sincerely,
Cao jin

>> Chen Fan (3):
>>   vfio: new function to init aer cap for vfio device
>>   vfio-pci: pass the aer error to guest
>>   vfio: add 'aer' property to expose aercap
>>
>> Dou Liyang (1):
>>   pcie_aer: support configurable AER capa version
>>
>>  hw/net/e1000e.c                    |   3 +-
>>  hw/pci-bridge/ioh3420.c            |   2 +-
>>  hw/pci-bridge/xio3130_downstream.c |   2 +-
>>  hw/pci-bridge/xio3130_upstream.c   |   2 +-
>>  hw/pci/pcie_aer.c                  |   5 +-
>>  hw/vfio/pci.c                      | 139 +++++++++++++++++++++++++++++++++----
>>  hw/vfio/pci.h                      |   3 +
>>  include/hw/pci/pcie_aer.h          |   3 +-
>>  8 files changed, 139 insertions(+), 20 deletions(-)
>>
> 
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest
  2017-01-18 21:43 ` Alex Williamson
  2017-01-19  6:15   ` Cao jin
@ 2017-01-19  6:25   ` Cao jin
  1 sibling, 0 replies; 21+ messages in thread
From: Cao jin @ 2017-01-19  6:25 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, mst, izumi.taku



On 01/19/2017 05:43 AM, Alex Williamson wrote:
> On Sat, 31 Dec 2016 17:13:04 +0800

> How can you know if he other function was affected if you don't even
> have a cable connected?

Will try ask for another cable for the other function.

>  How is testing on something that doesn't seem
> to work correctly already valid?  Thanks,

Not sure I understand you right. Even if I find those abnormal
phenomenon in test, I finally got the very exact result I expected in
each scenario, does it enough to prove this new functionality works? and
especially these abnormal phenomenon exists even without this patchset

-- 
Sincerely,
Cao jin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 2/4] vfio: new function to init aer cap for vfio device
  2017-01-18 22:09   ` Alex Williamson
@ 2017-01-20  6:03     ` Cao jin
  2017-01-20 18:12       ` Alex Williamson
  0 siblings, 1 reply; 21+ messages in thread
From: Cao jin @ 2017-01-20  6:03 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, mst, izumi.taku, Dou Liyang



On 01/19/2017 06:09 AM, Alex Williamson wrote:
> On Sat, 31 Dec 2016 17:13:06 +0800
> Cao jin <caoj.fnst@cn.fujitsu.com> wrote:
> 
>> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>
>> Introduce new function to initilize AER capability registers
>> for vfio-pci device.
>>
>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
>> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
>> ---
>>  hw/vfio/pci.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++----
>>  hw/vfio/pci.h |  3 +++
>>  2 files changed, 85 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index d7dbe0e..76a8ac3 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -1851,18 +1851,81 @@ out:
>>      return 0;
>>  }
>>  
>> -static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
>> +static int vfio_setup_aer(VFIOPCIDevice *vdev, uint8_t cap_ver,
>> +                          int pos, uint16_t size, Error **errp)
>> +{
>> +    PCIDevice *pdev = &vdev->pdev;
>> +    PCIDevice *dev_iter;
>> +    uint8_t type;
>> +    uint32_t errcap;
>> +
>> +    /* In case the physical device has AER cap while user doesn't enable AER,
>> +     * still allocate the config space in the emulated device for AER */
> 
> Bad comment style
> 
> /*
>  * Multi-line comments should
>  * look like this.
>  */
> 
> /* Single line comments may look like this */
> 
> /* Muli-line comments may
>  * absolutely not look like this */
> 
>> +    if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
>> +        pcie_add_capability(pdev, PCI_EXT_CAP_ID_ERR,
>> +                            cap_ver, pos, size);
>> +        return 0;
>> +    }
>> +
>> +    dev_iter = pci_bridge_get_device(pdev->bus);
>> +    if (!dev_iter) {
>> +        goto error;
>> +    }
>> +
>> +    while (dev_iter) {
>> +        if (!pci_is_express(dev_iter)) {
>> +            goto error;
>> +        }
>> +
>> +        type = pcie_cap_get_type(dev_iter);
>> +        if ((type != PCI_EXP_TYPE_ROOT_PORT &&
>> +             type != PCI_EXP_TYPE_UPSTREAM &&
>> +             type != PCI_EXP_TYPE_DOWNSTREAM)) {
>> +            goto error;
>> +        }
>> +
>> +        if (!dev_iter->exp.aer_cap) {
>> +            goto error;
>> +        }
>> +
>> +        dev_iter = pci_bridge_get_device(dev_iter->bus);
>> +    }
>> +
>> +    errcap = vfio_pci_read_config(pdev, pos + PCI_ERR_CAP, 4);
>> +    /*
>> +     * The ability to record multiple headers is depending on
>> +     * the state of the Multiple Header Recording Capable bit and
>> +     * enabled by the Multiple Header Recording Enable bit.
>> +     */
>> +    if ((errcap & PCI_ERR_CAP_MHRC) &&
>> +        (errcap & PCI_ERR_CAP_MHRE)) {
>> +        pdev->exp.aer_log.log_max = PCIE_AER_LOG_MAX_DEFAULT;
>> +    } else {
>> +        pdev->exp.aer_log.log_max = 0;
>> +    }
>> +
>> +    pcie_cap_deverr_init(pdev);
>> +    return pcie_aer_init(pdev, cap_ver, pos, size);
>> +
>> +error:
>> +    error_setg(errp, "vfio: Unable to enable AER for device %s, parent bus "
>> +               "does not support AER signaling", vdev->vbasedev.name);
>> +    return -1;
> 
> If we're only handling non-fatal errors, should we place the burden on
> the user to know when to add the aer flag for the device or should we
> just enable it opportunistically as available?

If we only handle non-fatal error, It make sense to me that we don't
need aer property as a switch and enable aer opportunistically as
available, it is no harm.

But considering that:
1. non-fatal error support is incomplete AER functionality, so I think
make it defaults to off is reasonable.

2. we may support fatal error too in the future, that will bring the
configuration restriction we used before. In this condition, make it
defaults to off is your discussion result(I guess). If this is the
finally choice, adopt it a little earlier is acceptable?

3. from another perspective, if 'aer' property shows in the future once
we support fatal-error too, would it seems that the 'aer' property is
dedicated to fatal-error only?(of course we could make it as the switch
of both error at that time)

> Should pcie_aer_init() be the one testing the topology?  It doesn't
> seem vfio specific anymore.
>

It does is not vfio specific, but I guess not. Question: could a
aer-capable device plugged under certain (root/downstream)port that is
not aer-capable? the answer I think is YES. If topology testing is done
in pcie_aer_init(), that means the answer is NO.

-- 
Sincerely,
Cao jin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 4/4] vfio: add 'aer' property to expose aercap
  2017-01-18 22:36   ` Alex Williamson
@ 2017-01-20  6:04     ` Cao jin
  2017-01-20 18:01       ` Alex Williamson
  0 siblings, 1 reply; 21+ messages in thread
From: Cao jin @ 2017-01-20  6:04 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, mst, izumi.taku, Dou Liyang



On 01/19/2017 06:36 AM, Alex Williamson wrote:
> On Sat, 31 Dec 2016 17:13:08 +0800
> Cao jin <caoj.fnst@cn.fujitsu.com> wrote:
> 
>> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>
>> Add 'aer' property, let user choose whether expose the aer capability
>> or not.
> 
> But that's not what it does, it only controls the behavior in response
> to non-fatal errors, the capability is exposed regardless.
> 

This commit log is legacy, and defaults to off is a result of the
configuration restriction & your previous discussion, right?

In current version, if 'aer' property is off, we just allocate the
config space via pcie_add_capability(), we don't init the AER
capability, the value is all 0s there, so does that still mean
"capability is exposed regardless"?

>> Should disable aer feature by default, because only non-fatal
>> error is supported now. 
> 
> Why does that mean it should be disabled by default?  What bad thing
> happens if we enable this opportunistically?
> 
>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
>> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
>> ---
>>  hw/vfio/pci.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 9861f72..fc9db66 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -3057,6 +3057,8 @@ static Property vfio_pci_dev_properties[] = {
>>      DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice,
>>                         sub_device_id, PCI_ANY_ID),
>>      DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0),
>> +    DEFINE_PROP_BIT("aer", VFIOPCIDevice, features,
>> +                    VFIO_FEATURE_ENABLE_AER_BIT, false),
>>      /*
>>       * TODO - support passed fds... is this necessary?
>>       * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
> 
> 
> 
> .
> 

-- 
Sincerely,
Cao jin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest
  2017-01-18 22:31   ` Alex Williamson
@ 2017-01-20  6:50     ` Cao jin
  2017-01-20  6:57     ` Tian, Kevin
  1 sibling, 0 replies; 21+ messages in thread
From: Cao jin @ 2017-01-20  6:50 UTC (permalink / raw)
  To: Alex Williamson; +Cc: qemu-devel, mst, izumi.taku, Chen Fan, Dou Liyang



On 01/19/2017 06:31 AM, Alex Williamson wrote:
> On Sat, 31 Dec 2016 17:13:07 +0800
> Cao jin <caoj.fnst@cn.fujitsu.com> wrote:
> 
>> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>>

>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 76a8ac3..9861f72 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -2470,21 +2470,55 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
>>  static void vfio_err_notifier_handler(void *opaque)
>>  {
>>      VFIOPCIDevice *vdev = opaque;
>> +    PCIDevice *dev = &vdev->pdev;
>> +    PCIEAERMsg msg = {
>> +        .severity = 0,
>> +        .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn,
>> +    };
>> +    int len;
>> +    uint64_t uncor_status;
>> +
>> +    /* Read uncorrectable error status from driver */
>> +    len = read(vdev->err_notifier.rfd, &uncor_status, sizeof(uncor_status));
> 
> Whoa this seems bogus.  In the kernel eventfd_signal() adds the value
> to the internal counter.  We can't guarantee that the user is going to
> immediately respond to every signal, multiple signals might occur, each
> incrementing the counter.  I don't think we can use the eventfd like
> this.
> 

Ah, got your concern, make sense. Michael had the same comments as you,
I didn't got him at that time...

I explained the reason in v10 cover letter(5 of changelog): I find that
error status register reading in error handler sometime returns ALL F's.
 I may want to take a look at v10, incorporate your comments, and test
to see if the issue still exists.  Currently if we only handle non-fatal
error, we can still use eventfd like before.

-- 
Sincerely,
Cao jin
>> +    if (len != sizeof(uncor_status)) {
>> +        error_report("vfio-pci: uncor error status reading returns"
>> +                     " invalid number of bytes: %d", len);
>> +        return; //Or goto stop?
> 
> It's bogus use of the eventfd anyway afaict, but
> event_notifier_test_and_clear() certainly handles at least EINTR.
> 
>> +    }
>> +
>> +    if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
>> +        goto stop;
>> +    }
> 
> I'm not sure this should be user selected anymore.
> 
>> +
>> +    /* Populate the aer msg and send it to root port */
>> +    if (dev->exp.aer_cap) {
>> +        uint8_t *aer_cap = dev->config + dev->exp.aer_cap;
>> +        bool isfatal = uncor_status &
>> +                       pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER);
>> +
>> +	if (isfatal) {
>> +	    goto stop;
>> +	}
> 
> QEMU uses spaces not tabs.
> 
>> +
>> +        msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN :
>> +                                 PCI_ERR_ROOT_CMD_NONFATAL_EN;
> 
> We don't get here if isfatal.
> 
>>  
>> -    if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>> +        error_report("vfio-pci device %d sending AER to root port. uncor"
>> +                     " status = 0x%"PRIx64, dev->devfn, uncor_status);
>> +        pcie_aer_msg(dev, &msg);
>>          return;
>>      }
>>  
>> +stop:
>>      /*
>> -     * TBD. Retrieve the error details and decide what action
>> -     * needs to be taken. One of the actions could be to pass
>> -     * the error to the guest and have the guest driver recover
>> -     * from the error. This requires that PCIe capabilities be
>> -     * exposed to the guest. For now, we just terminate the
>> -     * guest to contain the error.
>> +     * Terminate the guest in case of
>> +     * 1. AER capability is not exposed to guest.
>> +     * 2. AER capability is exposed, but error is fatal, only non-fatal
>> +     * error is handled now.
> 
> You're also currently requiring the user to enable aer per device.
> 
>>       */
>>  
>> -    error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name);
>> +    error_report("%s(%s) fatal error detected. Please collect any data"
>> +            " possible and then kill the guest", __func__, vdev->vbasedev.name);
>>  
>>      vm_stop(RUN_STATE_INTERNAL_ERROR);
>>  }
> 
> 
> 
> .
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest
  2017-01-18 22:31   ` Alex Williamson
  2017-01-20  6:50     ` Cao jin
@ 2017-01-20  6:57     ` Tian, Kevin
  2017-01-20 18:21       ` Alex Williamson
  1 sibling, 1 reply; 21+ messages in thread
From: Tian, Kevin @ 2017-01-20  6:57 UTC (permalink / raw)
  To: Alex Williamson, Cao jin
  Cc: Chen Fan, izumi.taku, Dou Liyang, qemu-devel, mst, Peter Xu, Lan,
	Tianyu, Liu, Yi L

> From: Alex Williamson
> Sent: Thursday, January 19, 2017 6:32 AM
> 
> On Sat, 31 Dec 2016 17:13:07 +0800
> Cao jin <caoj.fnst@cn.fujitsu.com> wrote:
> 
> > From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> >
> > When physical device has uncorrectable error hanppened, the vfio_pci
> > driver will signal the uncorrectable error status register value to
> > corresponding QEMU's vfio-pci device via the eventfd registered by this
> > device, then, the vfio-pci's error eventfd handler will be invoked in
> > event loop.
> >
> > Construct and pass the aer message to root port, root port will trigger an
> > interrupt to signal guest, then, the guest driver will do the recovery.
> >
> > Note: Now only support non-fatal error's recovery, fatal error will
> > still result in vm stop.
> 
> Please update the entire commit log, don't just add a note that this
> now only covers non-fatal errors.
> 

One thing relate to vIOMMU. There is still a TODO task about forwarding
IOMMU fault thru VFIO to Qemu, so Qemu vIOMMU has the chance to
walk guest remapping structure to emulate virtual IOMMU fault. Likely
it also requires eventfd mechanism.

Wondering whether making sense to reuse same eventfd for both AER
and vIOMMU or using separate eventfd is also fine? Even go with the
former option, I don't expect substantial change to this series. Major
change is on interface definition - extensible to multiple types of 
fault/error conditions instead of assuming AER only.

Thoughts?

Thanks
Kevin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 4/4] vfio: add 'aer' property to expose aercap
  2017-01-20  6:04     ` Cao jin
@ 2017-01-20 18:01       ` Alex Williamson
  0 siblings, 0 replies; 21+ messages in thread
From: Alex Williamson @ 2017-01-20 18:01 UTC (permalink / raw)
  To: Cao jin; +Cc: qemu-devel, mst, izumi.taku, Dou Liyang

On Fri, 20 Jan 2017 14:04:08 +0800
Cao jin <caoj.fnst@cn.fujitsu.com> wrote:

> On 01/19/2017 06:36 AM, Alex Williamson wrote:
> > On Sat, 31 Dec 2016 17:13:08 +0800
> > Cao jin <caoj.fnst@cn.fujitsu.com> wrote:
> >   
> >> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> >>
> >> Add 'aer' property, let user choose whether expose the aer capability
> >> or not.  
> > 
> > But that's not what it does, it only controls the behavior in response
> > to non-fatal errors, the capability is exposed regardless.
> >   
> 
> This commit log is legacy, and defaults to off is a result of the
> configuration restriction & your previous discussion, right?
> 
> In current version, if 'aer' property is off, we just allocate the
> config space via pcie_add_capability(), we don't init the AER
> capability, the value is all 0s there, so does that still mean
> "capability is exposed regardless"?

The design has changed, we no longer require a matching host and guest
topology, we can more easily transparently enable non-fatal error
correction.  We need to reevaluate whether previous decisions are still
valid, we cannot blindly assume that a requirement for a previous
design still applies.  Thanks,

Alex
 
> >> Should disable aer feature by default, because only non-fatal
> >> error is supported now.   
> > 
> > Why does that mean it should be disabled by default?  What bad thing
> > happens if we enable this opportunistically?
> >   
> >> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> >> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
> >> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
> >> ---
> >>  hw/vfio/pci.c | 2 ++
> >>  1 file changed, 2 insertions(+)
> >>
> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> >> index 9861f72..fc9db66 100644
> >> --- a/hw/vfio/pci.c
> >> +++ b/hw/vfio/pci.c
> >> @@ -3057,6 +3057,8 @@ static Property vfio_pci_dev_properties[] = {
> >>      DEFINE_PROP_UINT32("x-pci-sub-device-id", VFIOPCIDevice,
> >>                         sub_device_id, PCI_ANY_ID),
> >>      DEFINE_PROP_UINT32("x-igd-gms", VFIOPCIDevice, igd_gms, 0),
> >> +    DEFINE_PROP_BIT("aer", VFIOPCIDevice, features,
> >> +                    VFIO_FEATURE_ENABLE_AER_BIT, false),
> >>      /*
> >>       * TODO - support passed fds... is this necessary?
> >>       * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),  
> > 
> > 
> > 
> > .
> >   
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 2/4] vfio: new function to init aer cap for vfio device
  2017-01-20  6:03     ` Cao jin
@ 2017-01-20 18:12       ` Alex Williamson
  0 siblings, 0 replies; 21+ messages in thread
From: Alex Williamson @ 2017-01-20 18:12 UTC (permalink / raw)
  To: Cao jin; +Cc: qemu-devel, mst, izumi.taku, Dou Liyang

On Fri, 20 Jan 2017 14:03:42 +0800
Cao jin <caoj.fnst@cn.fujitsu.com> wrote:

> On 01/19/2017 06:09 AM, Alex Williamson wrote:
> > On Sat, 31 Dec 2016 17:13:06 +0800
> > Cao jin <caoj.fnst@cn.fujitsu.com> wrote:
> >   
> >> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> >>
> >> Introduce new function to initilize AER capability registers
> >> for vfio-pci device.
> >>
> >> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> >> Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
> >> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
> >> ---
> >>  hw/vfio/pci.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++++++----
> >>  hw/vfio/pci.h |  3 +++
> >>  2 files changed, 85 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> >> index d7dbe0e..76a8ac3 100644
> >> --- a/hw/vfio/pci.c
> >> +++ b/hw/vfio/pci.c
> >> @@ -1851,18 +1851,81 @@ out:
> >>      return 0;
> >>  }
> >>  
> >> -static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> >> +static int vfio_setup_aer(VFIOPCIDevice *vdev, uint8_t cap_ver,
> >> +                          int pos, uint16_t size, Error **errp)
> >> +{
> >> +    PCIDevice *pdev = &vdev->pdev;
> >> +    PCIDevice *dev_iter;
> >> +    uint8_t type;
> >> +    uint32_t errcap;
> >> +
> >> +    /* In case the physical device has AER cap while user doesn't enable AER,
> >> +     * still allocate the config space in the emulated device for AER */  
> > 
> > Bad comment style
> > 
> > /*
> >  * Multi-line comments should
> >  * look like this.
> >  */
> > 
> > /* Single line comments may look like this */
> > 
> > /* Muli-line comments may
> >  * absolutely not look like this */
> >   
> >> +    if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) {
> >> +        pcie_add_capability(pdev, PCI_EXT_CAP_ID_ERR,
> >> +                            cap_ver, pos, size);
> >> +        return 0;
> >> +    }
> >> +
> >> +    dev_iter = pci_bridge_get_device(pdev->bus);
> >> +    if (!dev_iter) {
> >> +        goto error;
> >> +    }
> >> +
> >> +    while (dev_iter) {
> >> +        if (!pci_is_express(dev_iter)) {
> >> +            goto error;
> >> +        }
> >> +
> >> +        type = pcie_cap_get_type(dev_iter);
> >> +        if ((type != PCI_EXP_TYPE_ROOT_PORT &&
> >> +             type != PCI_EXP_TYPE_UPSTREAM &&
> >> +             type != PCI_EXP_TYPE_DOWNSTREAM)) {
> >> +            goto error;
> >> +        }
> >> +
> >> +        if (!dev_iter->exp.aer_cap) {
> >> +            goto error;
> >> +        }
> >> +
> >> +        dev_iter = pci_bridge_get_device(dev_iter->bus);
> >> +    }

Does AER really require PCIe and AER support all the way to the root
bus?  Shouldn't it be sufficient for the device to be downstream of a
downstream port supporting AER?  If we inject an error, it's injected
via the direct immediate upstream parent, right?

> >> +
> >> +    errcap = vfio_pci_read_config(pdev, pos + PCI_ERR_CAP, 4);
> >> +    /*
> >> +     * The ability to record multiple headers is depending on
> >> +     * the state of the Multiple Header Recording Capable bit and
> >> +     * enabled by the Multiple Header Recording Enable bit.
> >> +     */
> >> +    if ((errcap & PCI_ERR_CAP_MHRC) &&
> >> +        (errcap & PCI_ERR_CAP_MHRE)) {
> >> +        pdev->exp.aer_log.log_max = PCIE_AER_LOG_MAX_DEFAULT;
> >> +    } else {
> >> +        pdev->exp.aer_log.log_max = 0;
> >> +    }
> >> +
> >> +    pcie_cap_deverr_init(pdev);
> >> +    return pcie_aer_init(pdev, cap_ver, pos, size);
> >> +
> >> +error:
> >> +    error_setg(errp, "vfio: Unable to enable AER for device %s, parent bus "
> >> +               "does not support AER signaling", vdev->vbasedev.name);
> >> +    return -1;  
> > 
> > If we're only handling non-fatal errors, should we place the burden on
> > the user to know when to add the aer flag for the device or should we
> > just enable it opportunistically as available?  
> 
> If we only handle non-fatal error, It make sense to me that we don't
> need aer property as a switch and enable aer opportunistically as
> available, it is no harm.
> 
> But considering that:
> 1. non-fatal error support is incomplete AER functionality, so I think
> make it defaults to off is reasonable.
> 
> 2. we may support fatal error too in the future, that will bring the
> configuration restriction we used before. In this condition, make it
> defaults to off is your discussion result(I guess). If this is the
> finally choice, adopt it a little earlier is acceptable?
> 
> 3. from another perspective, if 'aer' property shows in the future once
> we support fatal-error too, would it seems that the 'aer' property is
> dedicated to fatal-error only?(of course we could make it as the switch
> of both error at that time)

If handling of fatal errors will impose configuration restrictions then
it needs to have its own option and it needs to default to off.  We
cannot reuse "aer" since that would make old VMs incompatible with new
QEMU.  Therefore the question is whether there's value in having an
option for non-fatal errors vs enabling it opportunistically.  We could
certainly default to enabling it when available and adding an
experimental option (x-disable-guest-aer-forward) to allow users to
disable it for debug purposes.
 
> > Should pcie_aer_init() be the one testing the topology?  It doesn't
> > seem vfio specific anymore.
> >  
> 
> It does is not vfio specific, but I guess not. Question: could a
> aer-capable device plugged under certain (root/downstream)port that is
> not aer-capable? the answer I think is YES. If topology testing is done
> in pcie_aer_init(), that means the answer is NO.

Do we have any requirement to expose a device's AER capability in such
a configuration?  We could also argue that there's little value in
doing so, or simply expose the capability w/o any QEMU AER backing for
those cases.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest
  2017-01-20  6:57     ` Tian, Kevin
@ 2017-01-20 18:21       ` Alex Williamson
  2017-01-22  4:43         ` Tian, Kevin
  0 siblings, 1 reply; 21+ messages in thread
From: Alex Williamson @ 2017-01-20 18:21 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Cao jin, Chen Fan, izumi.taku, Dou Liyang, qemu-devel, mst,
	Peter Xu, Lan, Tianyu, Liu, Yi L

On Fri, 20 Jan 2017 06:57:22 +0000
"Tian, Kevin" <kevin.tian@intel.com> wrote:

> > From: Alex Williamson
> > Sent: Thursday, January 19, 2017 6:32 AM
> > 
> > On Sat, 31 Dec 2016 17:13:07 +0800
> > Cao jin <caoj.fnst@cn.fujitsu.com> wrote:
> >   
> > > From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> > >
> > > When physical device has uncorrectable error hanppened, the vfio_pci
> > > driver will signal the uncorrectable error status register value to
> > > corresponding QEMU's vfio-pci device via the eventfd registered by this
> > > device, then, the vfio-pci's error eventfd handler will be invoked in
> > > event loop.
> > >
> > > Construct and pass the aer message to root port, root port will trigger an
> > > interrupt to signal guest, then, the guest driver will do the recovery.
> > >
> > > Note: Now only support non-fatal error's recovery, fatal error will
> > > still result in vm stop.  
> > 
> > Please update the entire commit log, don't just add a note that this
> > now only covers non-fatal errors.
> >   
> 
> One thing relate to vIOMMU. There is still a TODO task about forwarding
> IOMMU fault thru VFIO to Qemu, so Qemu vIOMMU has the chance to
> walk guest remapping structure to emulate virtual IOMMU fault. Likely
> it also requires eventfd mechanism.
> 
> Wondering whether making sense to reuse same eventfd for both AER
> and vIOMMU or using separate eventfd is also fine? Even go with the
> former option, I don't expect substantial change to this series. Major
> change is on interface definition - extensible to multiple types of 
> fault/error conditions instead of assuming AER only.
> 
> Thoughts?

We can't really convey any information through an eventfd, it's just a
signal, so I don't think we can use the same eventfd for both types of
errors.  Already here we're discussing the idea of using separate
eventfds for fatal vs non-fatal AERs.  IOMMU error processing seems
like yet another eventfd and likely some region or ioctl mechanism for
retrieving the error details since the IOMMU hardware is not directly
accessible.  Furthermore, such an event might logically be connected to
the vfio container rather than the device, so it might not even use the
same file descriptor.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest
  2017-01-20 18:21       ` Alex Williamson
@ 2017-01-22  4:43         ` Tian, Kevin
  0 siblings, 0 replies; 21+ messages in thread
From: Tian, Kevin @ 2017-01-22  4:43 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Cao jin, Chen Fan, izumi.taku, Dou Liyang, qemu-devel, mst,
	Peter Xu, Lan, Tianyu, Liu, Yi L

> From: Alex Williamson [mailto:alex.williamson@redhat.com]
> Sent: Saturday, January 21, 2017 2:21 AM
> 
> On Fri, 20 Jan 2017 06:57:22 +0000
> "Tian, Kevin" <kevin.tian@intel.com> wrote:
> 
> > > From: Alex Williamson
> > > Sent: Thursday, January 19, 2017 6:32 AM
> > >
> > > On Sat, 31 Dec 2016 17:13:07 +0800
> > > Cao jin <caoj.fnst@cn.fujitsu.com> wrote:
> > >
> > > > From: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> > > >
> > > > When physical device has uncorrectable error hanppened, the vfio_pci
> > > > driver will signal the uncorrectable error status register value to
> > > > corresponding QEMU's vfio-pci device via the eventfd registered by this
> > > > device, then, the vfio-pci's error eventfd handler will be invoked in
> > > > event loop.
> > > >
> > > > Construct and pass the aer message to root port, root port will trigger an
> > > > interrupt to signal guest, then, the guest driver will do the recovery.
> > > >
> > > > Note: Now only support non-fatal error's recovery, fatal error will
> > > > still result in vm stop.
> > >
> > > Please update the entire commit log, don't just add a note that this
> > > now only covers non-fatal errors.
> > >
> >
> > One thing relate to vIOMMU. There is still a TODO task about forwarding
> > IOMMU fault thru VFIO to Qemu, so Qemu vIOMMU has the chance to
> > walk guest remapping structure to emulate virtual IOMMU fault. Likely
> > it also requires eventfd mechanism.
> >
> > Wondering whether making sense to reuse same eventfd for both AER
> > and vIOMMU or using separate eventfd is also fine? Even go with the
> > former option, I don't expect substantial change to this series. Major
> > change is on interface definition - extensible to multiple types of
> > fault/error conditions instead of assuming AER only.
> >
> > Thoughts?
> 
> We can't really convey any information through an eventfd, it's just a
> signal, so I don't think we can use the same eventfd for both types of
> errors.  Already here we're discussing the idea of using separate
> eventfds for fatal vs non-fatal AERs.  IOMMU error processing seems
> like yet another eventfd and likely some region or ioctl mechanism for
> retrieving the error details since the IOMMU hardware is not directly
> accessible.  Furthermore, such an event might logically be connected to
> the vfio container rather than the device, so it might not even use the
> same file descriptor.  Thanks,
> 

Clear enough. Thanks,
Kevin

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2017-01-22  4:43 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-31  9:13 [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest Cao jin
2016-12-31  9:13 ` [Qemu-devel] [PATCH v11 1/4] pcie_aer: support configurable AER capa version Cao jin
2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 2/4] vfio: new function to init aer cap for vfio device Cao jin
2017-01-18 22:09   ` Alex Williamson
2017-01-20  6:03     ` Cao jin
2017-01-20 18:12       ` Alex Williamson
2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 3/4] vfio-pci: pass the aer error to guest Cao jin
2017-01-09 22:55   ` Michael S. Tsirkin
2017-01-18 22:31   ` Alex Williamson
2017-01-20  6:50     ` Cao jin
2017-01-20  6:57     ` Tian, Kevin
2017-01-20 18:21       ` Alex Williamson
2017-01-22  4:43         ` Tian, Kevin
2016-12-31  9:13 ` [Qemu-devel] [PATCH RFC v11 4/4] vfio: add 'aer' property to expose aercap Cao jin
2017-01-18 22:36   ` Alex Williamson
2017-01-20  6:04     ` Cao jin
2017-01-20 18:01       ` Alex Williamson
2016-12-31  9:17 ` [Qemu-devel] [PATCH RFC v11 0/4] vfio-pci: pass non-fatal error to guest no-reply
2017-01-18 21:43 ` Alex Williamson
2017-01-19  6:15   ` Cao jin
2017-01-19  6:25   ` Cao jin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.