All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC v2 0/8] pass aer error to guest for vfio device
@ 2015-01-28  8:37 Chen Fan
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 1/8] pcie_aer: fix typos in pcie_aer_inject_error comment Chen Fan
                   ` (7 more replies)
  0 siblings, 8 replies; 19+ messages in thread
From: Chen Fan @ 2015-01-28  8:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcel, izumi.taku, Alex Williamson

For now, for vfio pci passthough devices when qemu receives
an error from host aer report, there just terminate the guest,
but usually user want to know what error occurred but stop the
guest, so this patch add aer capability support for vfio device,
and pass the error to guest, and have guest driver to recover
from the error.

for turning on SERR# patch in seabios list, pls see:
http://www.seabios.org/pipermail/seabios/2015-January/008591.html

v1-v2:
   1. turn on SERR# for bridge control register in firmware.
   2. initilize aer capability for vfio device.
   3. fix some trivial bug.

Chen Fan (8):
  pcie_aer: fix typos in pcie_aer_inject_error comment
  vfio-pci: add aer capability support
  pcie_aer: expose pcie_aer_msg() interface
  vfio-pci: pass the aer error to guest
  pcie_aer: fix a trivial typo in PCIEAERMsg comments
  vfio_pci: fix a wrong check in vfio_pci_reset
  vfio_pci: change vfio device features bit macro to enum definition
  vfio-pci: add VFIO_FEATURE_ENABLE_AER_CAP feature

 hw/pci/pcie_aer.c         |   8 +--
 hw/vfio/pci.c             | 123 ++++++++++++++++++++++++++++++++++++++++++----
 include/hw/compat.h       |   4 ++
 include/hw/pci/pcie_aer.h |   3 +-
 4 files changed, 124 insertions(+), 14 deletions(-)

-- 
1.9.3

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Qemu-devel] [RFC v2 1/8] pcie_aer: fix typos in pcie_aer_inject_error comment
  2015-01-28  8:37 [Qemu-devel] [RFC v2 0/8] pass aer error to guest for vfio device Chen Fan
@ 2015-01-28  8:37 ` Chen Fan
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 2/8] vfio-pci: add aer capability support Chen Fan
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Chen Fan @ 2015-01-28  8:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcel, izumi.taku, Alex Williamson

Refer to "PCI Express Base Spec3.0", this comments can't
fit the description in spec, so we should fix them.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
---
 hw/pci/pcie_aer.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 1f4be16..7ca077a 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -618,11 +618,11 @@ static bool pcie_aer_inject_uncor_error(PCIEAERInject *inj, bool is_fatal)
  * non-Function specific error must be recorded in all functions.
  * It is the responsibility of the caller of this function.
  * It is also caller's responsibility to determine which function should
- * report the rerror.
+ * report the error.
  *
  * 6.2.4 Error Logging
- * 6.2.5 Sqeunce of Device Error Signaling and Logging Operations
- * table 6-2: Flowchard Showing Sequence of Device Error Signaling and Logging
+ * 6.2.5 Sequence of Device Error Signaling and Logging Operations
+ * table 6-2: Flowchart Showing Sequence of Device Error Signaling and Logging
  *            Operations
  */
 int pcie_aer_inject_error(PCIDevice *dev, const PCIEAERErr *err)
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [RFC v2 2/8] vfio-pci: add aer capability support
  2015-01-28  8:37 [Qemu-devel] [RFC v2 0/8] pass aer error to guest for vfio device Chen Fan
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 1/8] pcie_aer: fix typos in pcie_aer_inject_error comment Chen Fan
@ 2015-01-28  8:37 ` Chen Fan
  2015-02-02 20:15   ` Alex Williamson
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 3/8] pcie_aer: expose pcie_aer_msg() interface Chen Fan
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Chen Fan @ 2015-01-28  8:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcel, izumi.taku, Alex Williamson

if we detect the aer capability in vfio device, then
we should initialize the vfio device aer rigister bits.
so guest OS can set this bits as needed.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
---
 hw/vfio/pci.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 72 insertions(+)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 014a92c..2072261 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2670,6 +2670,73 @@ static int vfio_add_capabilities(VFIOPCIDevice *vdev)
     return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]);
 }
 
+static void vfio_pci_aer_init(VFIOPCIDevice *vdev)
+{
+    PCIDevice *pdev = &vdev->pdev;
+    PCIExpressDevice *exp = &pdev->exp;
+    uint16_t offset = exp->aer_cap;
+
+    if (!offset) {
+        return;
+    }
+
+    memset(pdev->wmask + offset, 0, PCI_ERR_SIZEOF);
+    memset(pdev->w1cmask + offset, 0, PCI_ERR_SIZEOF);
+    memset(pdev->cmask + offset, 0xFF, PCI_ERR_SIZEOF);
+
+    pci_long_test_and_set_mask(pdev->wmask + exp->exp_cap + PCI_EXP_DEVCTL,
+                               PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE |
+                               PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE);
+    pci_long_test_and_set_mask(pdev->w1cmask + exp->exp_cap + PCI_EXP_DEVSTA,
+                               PCI_EXP_DEVSTA_CED | PCI_EXP_DEVSTA_NFED |
+                               PCI_EXP_DEVSTA_FED | PCI_EXP_DEVSTA_URD);
+
+    pci_set_long(pdev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
+                 PCI_ERR_UNC_SUPPORTED);
+    pci_set_long(pdev->wmask + offset + PCI_ERR_UNCOR_SEVER,
+                 PCI_ERR_UNC_SUPPORTED);
+    pci_long_test_and_set_mask(pdev->w1cmask + offset + PCI_ERR_COR_STATUS,
+                               PCI_ERR_COR_STATUS);
+    pci_set_long(pdev->wmask + offset + PCI_ERR_COR_MASK,
+                 PCI_ERR_COR_SUPPORTED);
+
+    pci_set_long(pdev->wmask + offset + PCI_ERR_CAP,
+                 PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE |
+                 PCI_ERR_CAP_MHRE);
+}
+
+static int vfio_add_ext_capabilities(VFIOPCIDevice *vdev)
+{
+    PCIDevice *pdev = &vdev->pdev;
+    PCIExpressDevice *exp;
+    uint32_t header;
+    uint16_t next = PCI_CONFIG_SPACE_SIZE;
+
+    if (pci_config_size(pdev) <= PCI_CONFIG_SPACE_SIZE) {
+        return 0;
+    }
+
+    header = pci_get_long(pdev->config + next);
+    while (header) {
+        switch (PCI_EXT_CAP_ID(header)) {
+        case PCI_EXT_CAP_ID_ERR:
+             exp = &pdev->exp;
+             exp->aer_cap = next;
+
+             vfio_pci_aer_init(vdev);
+             break;
+        };
+
+        next = PCI_EXT_CAP_NEXT(header);
+        if (!next) {
+            return 0;
+        }
+        header = pci_get_long(pdev->config + next);
+    }
+
+    return 0;
+}
+
 static void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
 {
     PCIDevice *pdev = &vdev->pdev;
@@ -3296,6 +3363,11 @@ static int vfio_initfn(PCIDevice *pdev)
         goto out_teardown;
     }
 
+    ret = vfio_add_ext_capabilities(vdev);
+    if (ret) {
+        goto out_teardown;
+    }
+
     /* QEMU emulates all of MSI & MSIX */
     if (pdev->cap_present & QEMU_PCI_CAP_MSIX) {
         memset(vdev->emulated_config_bits + pdev->msix_cap, 0xff,
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [RFC v2 3/8] pcie_aer: expose pcie_aer_msg() interface
  2015-01-28  8:37 [Qemu-devel] [RFC v2 0/8] pass aer error to guest for vfio device Chen Fan
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 1/8] pcie_aer: fix typos in pcie_aer_inject_error comment Chen Fan
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 2/8] vfio-pci: add aer capability support Chen Fan
@ 2015-01-28  8:37 ` Chen Fan
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 4/8] vfio-pci: pass the aer error to guest Chen Fan
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Chen Fan @ 2015-01-28  8:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcel, izumi.taku, Alex Williamson

For vfio device, we need to propagate the aer error to
Guest OS. we use the pcie_aer_msg() to send aer error
to guest.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
---
 hw/pci/pcie_aer.c         | 2 +-
 include/hw/pci/pcie_aer.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
index 7ca077a..00906b9 100644
--- a/hw/pci/pcie_aer.c
+++ b/hw/pci/pcie_aer.c
@@ -370,7 +370,7 @@ static void pcie_aer_msg_root_port(PCIDevice *dev, const PCIEAERMsg *msg)
  *
  * Walk up the bus tree from the device, propagate the error message.
  */
-static void pcie_aer_msg(PCIDevice *dev, const PCIEAERMsg *msg)
+void pcie_aer_msg(PCIDevice *dev, const PCIEAERMsg *msg)
 {
     uint8_t type;
 
diff --git a/include/hw/pci/pcie_aer.h b/include/hw/pci/pcie_aer.h
index bcac80a..648df73 100644
--- a/include/hw/pci/pcie_aer.h
+++ b/include/hw/pci/pcie_aer.h
@@ -102,5 +102,6 @@ void pcie_aer_root_write_config(PCIDevice *dev,
 
 /* error injection */
 int pcie_aer_inject_error(PCIDevice *dev, const PCIEAERErr *err);
+void pcie_aer_msg(PCIDevice *dev, const PCIEAERMsg *msg);
 
 #endif /* QEMU_PCIE_AER_H */
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [RFC v2 4/8] vfio-pci: pass the aer error to guest
  2015-01-28  8:37 [Qemu-devel] [RFC v2 0/8] pass aer error to guest for vfio device Chen Fan
                   ` (2 preceding siblings ...)
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 3/8] pcie_aer: expose pcie_aer_msg() interface Chen Fan
@ 2015-01-28  8:37 ` Chen Fan
  2015-02-02 20:16   ` Alex Williamson
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 5/8] pcie_aer: fix a trivial typo in PCIEAERMsg comments Chen Fan
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 19+ messages in thread
From: Chen Fan @ 2015-01-28  8:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcel, izumi.taku, Alex Williamson

when the vfio device encounters an uncorrectable error in host,
the vfio_pci driver will signal the eventfd registered by this
vfio device, the results in the qemu eventfd handler getting
invoked.

this patch is to pass the error to guest and have the guest driver
recover from the error.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
---
 hw/vfio/pci.c | 34 ++++++++++++++++++++++++++++------
 1 file changed, 28 insertions(+), 6 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 2072261..8c81bb3 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3141,18 +3141,40 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
 static void vfio_err_notifier_handler(void *opaque)
 {
     VFIOPCIDevice *vdev = opaque;
+    PCIDevice *dev = &vdev->pdev;
+    PCIEAERMsg msg = {
+        .severity = 0,
+        .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn,
+    };
 
     if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
         return;
     }
 
+    /* we should read the error details from the real hardware
+     * configuration spaces, here we only need to do is signaling
+     * to guest an uncorrectable error has occurred.
+     */
+    if (dev->exp.aer_cap) {
+        uint8_t *aer_cap = dev->config + dev->exp.aer_cap;
+        uint32_t uncor_status;
+        bool isfatal;
+
+        uncor_status = vfio_pci_read_config(dev,
+                           dev->exp.aer_cap + PCI_ERR_UNCOR_STATUS, 4);
+
+        isfatal = uncor_status & pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER);
+
+        msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN :
+                                 PCI_ERR_ROOT_CMD_NONFATAL_EN;
+
+        pcie_aer_msg(dev, &msg);
+        return;
+    }
+
     /*
-     * TBD. Retrieve the error details and decide what action
-     * needs to be taken. One of the actions could be to pass
-     * the error to the guest and have the guest driver recover
-     * from the error. This requires that PCIe capabilities be
-     * exposed to the guest. For now, we just terminate the
-     * guest to contain the error.
+     * If the aer capability is not exposed to the guest. we just
+     * terminate the guest to contain the error.
      */
 
     error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected.  "
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [RFC v2 5/8] pcie_aer: fix a trivial typo in PCIEAERMsg comments
  2015-01-28  8:37 [Qemu-devel] [RFC v2 0/8] pass aer error to guest for vfio device Chen Fan
                   ` (3 preceding siblings ...)
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 4/8] vfio-pci: pass the aer error to guest Chen Fan
@ 2015-01-28  8:37 ` Chen Fan
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 6/8] vfio_pci: fix a wrong check in vfio_pci_reset Chen Fan
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 19+ messages in thread
From: Chen Fan @ 2015-01-28  8:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcel, izumi.taku, Alex Williamson

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
---
 include/hw/pci/pcie_aer.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/hw/pci/pcie_aer.h b/include/hw/pci/pcie_aer.h
index 648df73..6c4bf3b 100644
--- a/include/hw/pci/pcie_aer.h
+++ b/include/hw/pci/pcie_aer.h
@@ -51,7 +51,7 @@ struct PCIEAERLog {
     PCIEAERErr *log;
 };
 
-/* aer error message: error signaling message has only error sevirity and
+/* aer error message: error signaling message has only error severity and
    source id. See 2.2.8.3 error signaling messages */
 struct PCIEAERMsg {
     /*
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [RFC v2 6/8] vfio_pci: fix a wrong check in vfio_pci_reset
  2015-01-28  8:37 [Qemu-devel] [RFC v2 0/8] pass aer error to guest for vfio device Chen Fan
                   ` (4 preceding siblings ...)
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 5/8] pcie_aer: fix a trivial typo in PCIEAERMsg comments Chen Fan
@ 2015-01-28  8:37 ` Chen Fan
  2015-02-02 20:16   ` Alex Williamson
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 7/8] vfio_pci: change vfio device features bit macro to enum definition Chen Fan
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 8/8] vfio-pci: add VFIO_FEATURE_ENABLE_AER_CAP feature Chen Fan
  7 siblings, 1 reply; 19+ messages in thread
From: Chen Fan @ 2015-01-28  8:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcel, izumi.taku, Alex Williamson

when vfio device support FLR, then when device reset,
we call VFIO_DEVICE_RESET ioctl to reset the device first,
at kernel side, we also can see the order of reset:
3330         rc = pcie_flr(dev, probe);
3331         if (rc != -ENOTTY)
3332                 goto done;
3333
3334         rc = pci_af_flr(dev, probe);
3335         if (rc != -ENOTTY)
3336                 goto done;
3337
3338         rc = pci_pm_reset(dev, probe);
3339         if (rc != -ENOTTY)
3340                 goto done;

so when vfio has FLR, reset it directly.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
---
 hw/vfio/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 8c81bb3..54eb6b4 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -3455,7 +3455,7 @@ static void vfio_pci_reset(DeviceState *dev)
     vfio_pci_pre_reset(vdev);
 
     if (vdev->vbasedev.reset_works &&
-        (vdev->has_flr || !vdev->has_pm_reset) &&
+        vdev->has_flr &&
         !ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
         trace_vfio_pci_reset_flr(vdev->vbasedev.name);
         goto post_reset;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [RFC v2 7/8] vfio_pci: change vfio device features bit macro to enum definition
  2015-01-28  8:37 [Qemu-devel] [RFC v2 0/8] pass aer error to guest for vfio device Chen Fan
                   ` (5 preceding siblings ...)
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 6/8] vfio_pci: fix a wrong check in vfio_pci_reset Chen Fan
@ 2015-01-28  8:37 ` Chen Fan
  2015-02-02 20:15   ` Alex Williamson
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 8/8] vfio-pci: add VFIO_FEATURE_ENABLE_AER_CAP feature Chen Fan
  7 siblings, 1 reply; 19+ messages in thread
From: Chen Fan @ 2015-01-28  8:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcel, izumi.taku, Alex Williamson

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
---
 hw/vfio/pci.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 54eb6b4..65247ee 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -134,6 +134,12 @@ typedef struct VFIOMSIXInfo {
     void *mmap;
 } VFIOMSIXInfo;
 
+/* Bits in VFIOPCIDevice features field. */
+enum {
+#define VFIO_FEATURE_ENABLE_VGA_BIT 0
+    VFIO_FEATURE_ENABLE_VGA = (1 << VFIO_FEATURE_ENABLE_VGA_BIT),
+};
+
 typedef struct VFIOPCIDevice {
     PCIDevice pdev;
     VFIODevice vbasedev;
@@ -154,8 +160,6 @@ typedef struct VFIOPCIDevice {
     PCIHostDeviceAddress host;
     EventNotifier err_notifier;
     uint32_t features;
-#define VFIO_FEATURE_ENABLE_VGA_BIT 0
-#define VFIO_FEATURE_ENABLE_VGA (1 << VFIO_FEATURE_ENABLE_VGA_BIT)
     int32_t bootindex;
     uint8_t pm_cap;
     bool has_vga;
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [RFC v2 8/8] vfio-pci: add VFIO_FEATURE_ENABLE_AER_CAP feature
  2015-01-28  8:37 [Qemu-devel] [RFC v2 0/8] pass aer error to guest for vfio device Chen Fan
                   ` (6 preceding siblings ...)
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 7/8] vfio_pci: change vfio device features bit macro to enum definition Chen Fan
@ 2015-01-28  8:37 ` Chen Fan
  2015-02-02 20:16   ` Alex Williamson
  7 siblings, 1 reply; 19+ messages in thread
From: Chen Fan @ 2015-01-28  8:37 UTC (permalink / raw)
  To: qemu-devel; +Cc: marcel, izumi.taku, Alex Williamson

for old machine types, we should disable aercap feature.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
---
 hw/vfio/pci.c       | 13 ++++++++++---
 include/hw/compat.h |  4 ++++
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 65247ee..0d830e6 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -138,6 +138,8 @@ typedef struct VFIOMSIXInfo {
 enum {
 #define VFIO_FEATURE_ENABLE_VGA_BIT 0
     VFIO_FEATURE_ENABLE_VGA = (1 << VFIO_FEATURE_ENABLE_VGA_BIT),
+#define VFIO_FEATURE_ENABLE_AER_CAP_BIT 1
+    VFIO_FEATURE_ENABLE_AER_CAP = (1 << VFIO_FEATURE_ENABLE_AER_CAP_BIT),
 };
 
 typedef struct VFIOPCIDevice {
@@ -2724,10 +2726,12 @@ static int vfio_add_ext_capabilities(VFIOPCIDevice *vdev)
     while (header) {
         switch (PCI_EXT_CAP_ID(header)) {
         case PCI_EXT_CAP_ID_ERR:
-             exp = &pdev->exp;
-             exp->aer_cap = next;
+             if (vdev->features & VFIO_FEATURE_ENABLE_AER_CAP) {
+                 exp = &pdev->exp;
+                 exp->aer_cap = next;
 
-             vfio_pci_aer_init(vdev);
+                 vfio_pci_aer_init(vdev);
+	     }
              break;
         };
 
@@ -3498,6 +3502,9 @@ static Property vfio_pci_dev_properties[] = {
     DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features,
                     VFIO_FEATURE_ENABLE_VGA_BIT, false),
     DEFINE_PROP_INT32("bootindex", VFIOPCIDevice, bootindex, -1),
+    DEFINE_PROP_BIT("aercap", VFIOPCIDevice, features,
+                    VFIO_FEATURE_ENABLE_AER_CAP_BIT, true),
+
     /*
      * TODO - support passed fds... is this necessary?
      * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
diff --git a/include/hw/compat.h b/include/hw/compat.h
index 313682a..72a2cdb 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -30,6 +30,10 @@
             .driver   = "virtio-pci",\
             .property = "virtio-pci-bus-master-bug-migration",\
             .value    = "on",\
+        },{\
+            .driver   = "vfio-pci",\
+            .property = "aercap",\
+            .value    = "off",\
         }
 
 #endif /* HW_COMPAT_H */
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [RFC v2 2/8] vfio-pci: add aer capability support
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 2/8] vfio-pci: add aer capability support Chen Fan
@ 2015-02-02 20:15   ` Alex Williamson
  2015-02-06  7:03     ` Chen Fan
  0 siblings, 1 reply; 19+ messages in thread
From: Alex Williamson @ 2015-02-02 20:15 UTC (permalink / raw)
  To: Chen Fan; +Cc: marcel, izumi.taku, qemu-devel

On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:
> if we detect the aer capability in vfio device, then
> we should initialize the vfio device aer rigister bits.
> so guest OS can set this bits as needed.

s/rigister/register/

> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> ---
>  hw/vfio/pci.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 72 insertions(+)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 014a92c..2072261 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2670,6 +2670,73 @@ static int vfio_add_capabilities(VFIOPCIDevice *vdev)
>      return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]);
>  }
>  
> +static void vfio_pci_aer_init(VFIOPCIDevice *vdev)
> +{
> +    PCIDevice *pdev = &vdev->pdev;
> +    PCIExpressDevice *exp = &pdev->exp;
> +    uint16_t offset = exp->aer_cap;
> +
> +    if (!offset) {
> +        return;
> +    }
> +

All of these need to be documented with comments.

> +    memset(pdev->wmask + offset, 0, PCI_ERR_SIZEOF);
> +    memset(pdev->w1cmask + offset, 0, PCI_ERR_SIZEOF);
> +    memset(pdev->cmask + offset, 0xFF, PCI_ERR_SIZEOF);
> +
> +    pci_long_test_and_set_mask(pdev->wmask + exp->exp_cap + PCI_EXP_DEVCTL,
> +                               PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE |
> +                               PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE);
> +    pci_long_test_and_set_mask(pdev->w1cmask + exp->exp_cap + PCI_EXP_DEVSTA,
> +                               PCI_EXP_DEVSTA_CED | PCI_EXP_DEVSTA_NFED |
> +                               PCI_EXP_DEVSTA_FED | PCI_EXP_DEVSTA_URD);
> +
> +    pci_set_long(pdev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
> +                 PCI_ERR_UNC_SUPPORTED);
> +    pci_set_long(pdev->wmask + offset + PCI_ERR_UNCOR_SEVER,
> +                 PCI_ERR_UNC_SUPPORTED);
> +    pci_long_test_and_set_mask(pdev->w1cmask + offset + PCI_ERR_COR_STATUS,
> +                               PCI_ERR_COR_STATUS);
> +    pci_set_long(pdev->wmask + offset + PCI_ERR_COR_MASK,
> +                 PCI_ERR_COR_SUPPORTED);
> +
> +    pci_set_long(pdev->wmask + offset + PCI_ERR_CAP,
> +                 PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE |
> +                 PCI_ERR_CAP_MHRE);
> +}
> +
> +static int vfio_add_ext_capabilities(VFIOPCIDevice *vdev)
> +{
> +    PCIDevice *pdev = &vdev->pdev;
> +    PCIExpressDevice *exp;
> +    uint32_t header;
> +    uint16_t next = PCI_CONFIG_SPACE_SIZE;
> +
> +    if (pci_config_size(pdev) <= PCI_CONFIG_SPACE_SIZE) {
> +        return 0;
> +    }
> +
> +    header = pci_get_long(pdev->config + next);
> +    while (header) {
> +        switch (PCI_EXT_CAP_ID(header)) {
> +        case PCI_EXT_CAP_ID_ERR:
> +             exp = &pdev->exp;
> +             exp->aer_cap = next;

Shouldn't we call pcie_aer_init() here?

> +
> +             vfio_pci_aer_init(vdev);
> +             break;
> +        };
> +
> +        next = PCI_EXT_CAP_NEXT(header);
> +        if (!next) {
> +            return 0;
> +        }
> +        header = pci_get_long(pdev->config + next);

I'd like to see this look more like vfio_add_std_cap(), registering
every capability with the QEMU PCIe-core and setting up emulation to
allow QEMU to skip capabilities that it doesn't want to expose.

> +    }
> +
> +    return 0;
> +}
> +
>  static void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
>  {
>      PCIDevice *pdev = &vdev->pdev;
> @@ -3296,6 +3363,11 @@ static int vfio_initfn(PCIDevice *pdev)
>          goto out_teardown;
>      }
>  
> +    ret = vfio_add_ext_capabilities(vdev);
> +    if (ret) {
> +        goto out_teardown;
> +    }
> +

Why not extend vfio_add_capabilities()?  It specifically calls
vfio_add_std_cap() in order that there could be a vfio_add_ext_cap().

>      /* QEMU emulates all of MSI & MSIX */
>      if (pdev->cap_present & QEMU_PCI_CAP_MSIX) {
>          memset(vdev->emulated_config_bits + pdev->msix_cap, 0xff,

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [RFC v2 7/8] vfio_pci: change vfio device features bit macro to enum definition
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 7/8] vfio_pci: change vfio device features bit macro to enum definition Chen Fan
@ 2015-02-02 20:15   ` Alex Williamson
  0 siblings, 0 replies; 19+ messages in thread
From: Alex Williamson @ 2015-02-02 20:15 UTC (permalink / raw)
  To: Chen Fan; +Cc: marcel, izumi.taku, qemu-devel

On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:


The "why" should go here

> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> ---
>  hw/vfio/pci.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 54eb6b4..65247ee 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -134,6 +134,12 @@ typedef struct VFIOMSIXInfo {
>      void *mmap;
>  } VFIOMSIXInfo;
>  
> +/* Bits in VFIOPCIDevice features field. */
> +enum {
> +#define VFIO_FEATURE_ENABLE_VGA_BIT 0
> +    VFIO_FEATURE_ENABLE_VGA = (1 << VFIO_FEATURE_ENABLE_VGA_BIT),
> +};

Rather strange to use a bitmap within an enum, isn't it?

> +
>  typedef struct VFIOPCIDevice {
>      PCIDevice pdev;
>      VFIODevice vbasedev;
> @@ -154,8 +160,6 @@ typedef struct VFIOPCIDevice {
>      PCIHostDeviceAddress host;
>      EventNotifier err_notifier;
>      uint32_t features;
> -#define VFIO_FEATURE_ENABLE_VGA_BIT 0
> -#define VFIO_FEATURE_ENABLE_VGA (1 << VFIO_FEATURE_ENABLE_VGA_BIT)
>      int32_t bootindex;
>      uint8_t pm_cap;
>      bool has_vga;

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [RFC v2 8/8] vfio-pci: add VFIO_FEATURE_ENABLE_AER_CAP feature
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 8/8] vfio-pci: add VFIO_FEATURE_ENABLE_AER_CAP feature Chen Fan
@ 2015-02-02 20:16   ` Alex Williamson
  0 siblings, 0 replies; 19+ messages in thread
From: Alex Williamson @ 2015-02-02 20:16 UTC (permalink / raw)
  To: Chen Fan; +Cc: marcel, izumi.taku, qemu-devel

On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:
> for old machine types, we should disable aercap feature.
> 
> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> ---
>  hw/vfio/pci.c       | 13 ++++++++++---
>  include/hw/compat.h |  4 ++++
>  2 files changed, 14 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 65247ee..0d830e6 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -138,6 +138,8 @@ typedef struct VFIOMSIXInfo {
>  enum {
>  #define VFIO_FEATURE_ENABLE_VGA_BIT 0
>      VFIO_FEATURE_ENABLE_VGA = (1 << VFIO_FEATURE_ENABLE_VGA_BIT),
> +#define VFIO_FEATURE_ENABLE_AER_CAP_BIT 1
> +    VFIO_FEATURE_ENABLE_AER_CAP = (1 << VFIO_FEATURE_ENABLE_AER_CAP_BIT),
>  };
>  
>  typedef struct VFIOPCIDevice {
> @@ -2724,10 +2726,12 @@ static int vfio_add_ext_capabilities(VFIOPCIDevice *vdev)
>      while (header) {
>          switch (PCI_EXT_CAP_ID(header)) {
>          case PCI_EXT_CAP_ID_ERR:
> -             exp = &pdev->exp;
> -             exp->aer_cap = next;
> +             if (vdev->features & VFIO_FEATURE_ENABLE_AER_CAP) {
> +                 exp = &pdev->exp;
> +                 exp->aer_cap = next;
>  
> -             vfio_pci_aer_init(vdev);
> +                 vfio_pci_aer_init(vdev);
> +	     }
>               break;
>          };
>  
> @@ -3498,6 +3502,9 @@ static Property vfio_pci_dev_properties[] = {
>      DEFINE_PROP_BIT("x-vga", VFIOPCIDevice, features,
>                      VFIO_FEATURE_ENABLE_VGA_BIT, false),
>      DEFINE_PROP_INT32("bootindex", VFIOPCIDevice, bootindex, -1),
> +    DEFINE_PROP_BIT("aercap", VFIOPCIDevice, features,
> +                    VFIO_FEATURE_ENABLE_AER_CAP_BIT, true),
> +
>      /*
>       * TODO - support passed fds... is this necessary?
>       * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name),
> diff --git a/include/hw/compat.h b/include/hw/compat.h
> index 313682a..72a2cdb 100644
> --- a/include/hw/compat.h
> +++ b/include/hw/compat.h
> @@ -30,6 +30,10 @@
>              .driver   = "virtio-pci",\
>              .property = "virtio-pci-bus-master-bug-migration",\
>              .value    = "on",\
> +        },{\
> +            .driver   = "vfio-pci",\
> +            .property = "aercap",\
> +            .value    = "off",\

This will leave it enabled on both Q35 and 440FX afaict, so I'm not sure
what this fixes.  We don't care about migration compatibility with
vfio-pci.

>          }
>  
>  #endif /* HW_COMPAT_H */

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [RFC v2 6/8] vfio_pci: fix a wrong check in vfio_pci_reset
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 6/8] vfio_pci: fix a wrong check in vfio_pci_reset Chen Fan
@ 2015-02-02 20:16   ` Alex Williamson
  2015-02-04  9:54     ` Chen Fan
  0 siblings, 1 reply; 19+ messages in thread
From: Alex Williamson @ 2015-02-02 20:16 UTC (permalink / raw)
  To: Chen Fan; +Cc: marcel, izumi.taku, qemu-devel

On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:
> when vfio device support FLR, then when device reset,
> we call VFIO_DEVICE_RESET ioctl to reset the device first,
> at kernel side, we also can see the order of reset:
> 3330         rc = pcie_flr(dev, probe);
> 3331         if (rc != -ENOTTY)
> 3332                 goto done;
> 3333
> 3334         rc = pci_af_flr(dev, probe);
> 3335         if (rc != -ENOTTY)
> 3336                 goto done;
> 3337
> 3338         rc = pci_pm_reset(dev, probe);
> 3339         if (rc != -ENOTTY)
> 3340                 goto done;
> 
> so when vfio has FLR, reset it directly.
> 
> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> ---
>  hw/vfio/pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 8c81bb3..54eb6b4 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -3455,7 +3455,7 @@ static void vfio_pci_reset(DeviceState *dev)
>      vfio_pci_pre_reset(vdev);
>  
>      if (vdev->vbasedev.reset_works &&
> -        (vdev->has_flr || !vdev->has_pm_reset) &&
> +        vdev->has_flr &&
>          !ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
>          trace_vfio_pci_reset_flr(vdev->vbasedev.name);
>          goto post_reset;

Does this actually fix anything?  QEMU shouldn't rely on a specific
behavior of the kernel.  This test is de-prioritizing a PM reset because
they're often non-effective.  If the device supports FLR, the second
part of the OR is unreached, so what's the point of this change?
Thanks,

Alex

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [RFC v2 4/8] vfio-pci: pass the aer error to guest
  2015-01-28  8:37 ` [Qemu-devel] [RFC v2 4/8] vfio-pci: pass the aer error to guest Chen Fan
@ 2015-02-02 20:16   ` Alex Williamson
  2015-02-06  7:06     ` Chen Fan
  0 siblings, 1 reply; 19+ messages in thread
From: Alex Williamson @ 2015-02-02 20:16 UTC (permalink / raw)
  To: Chen Fan; +Cc: marcel, izumi.taku, qemu-devel

On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:
> when the vfio device encounters an uncorrectable error in host,
> the vfio_pci driver will signal the eventfd registered by this
> vfio device, the results in the qemu eventfd handler getting
> invoked.
> 
> this patch is to pass the error to guest and have the guest driver
> recover from the error.
> 
> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> ---
>  hw/vfio/pci.c | 34 ++++++++++++++++++++++++++++------
>  1 file changed, 28 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 2072261..8c81bb3 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -3141,18 +3141,40 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
>  static void vfio_err_notifier_handler(void *opaque)
>  {
>      VFIOPCIDevice *vdev = opaque;
> +    PCIDevice *dev = &vdev->pdev;
> +    PCIEAERMsg msg = {
> +        .severity = 0,
> +        .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn,
> +    };
>  
>      if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>          return;
>      }
>  
> +    /* we should read the error details from the real hardware
> +     * configuration spaces, here we only need to do is signaling
> +     * to guest an uncorrectable error has occurred.
> +     */
> +    if (dev->exp.aer_cap) {
> +        uint8_t *aer_cap = dev->config + dev->exp.aer_cap;
> +        uint32_t uncor_status;
> +        bool isfatal;
> +
> +        uncor_status = vfio_pci_read_config(dev,
> +                           dev->exp.aer_cap + PCI_ERR_UNCOR_STATUS, 4);
> +
> +        isfatal = uncor_status & pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER);
> +
> +        msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN :
> +                                 PCI_ERR_ROOT_CMD_NONFATAL_EN;
> +
> +        pcie_aer_msg(dev, &msg);
> +        return;
> +    }

What if the guest machine type is 440FX?  We've just killed the existing
vm_stop functionality for the majority of users.

> +
>      /*
> -     * TBD. Retrieve the error details and decide what action
> -     * needs to be taken. One of the actions could be to pass
> -     * the error to the guest and have the guest driver recover
> -     * from the error. This requires that PCIe capabilities be
> -     * exposed to the guest. For now, we just terminate the
> -     * guest to contain the error.
> +     * If the aer capability is not exposed to the guest. we just
> +     * terminate the guest to contain the error.

Just because it's exposed doesn't mean the guest chipset allows access
to it, right?

>       */
>  
>      error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected.  "

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [RFC v2 6/8] vfio_pci: fix a wrong check in vfio_pci_reset
  2015-02-02 20:16   ` Alex Williamson
@ 2015-02-04  9:54     ` Chen Fan
  2015-02-04 13:53       ` Alex Williamson
  0 siblings, 1 reply; 19+ messages in thread
From: Chen Fan @ 2015-02-04  9:54 UTC (permalink / raw)
  To: Alex Williamson; +Cc: marcel, izumi.taku, qemu-devel


On 02/03/2015 04:16 AM, Alex Williamson wrote:
> On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:
>> when vfio device support FLR, then when device reset,
>> we call VFIO_DEVICE_RESET ioctl to reset the device first,
>> at kernel side, we also can see the order of reset:
>> 3330         rc = pcie_flr(dev, probe);
>> 3331         if (rc != -ENOTTY)
>> 3332                 goto done;
>> 3333
>> 3334         rc = pci_af_flr(dev, probe);
>> 3335         if (rc != -ENOTTY)
>> 3336                 goto done;
>> 3337
>> 3338         rc = pci_pm_reset(dev, probe);
>> 3339         if (rc != -ENOTTY)
>> 3340                 goto done;
>>
>> so when vfio has FLR, reset it directly.
>>
>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>> ---
>>   hw/vfio/pci.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 8c81bb3..54eb6b4 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -3455,7 +3455,7 @@ static void vfio_pci_reset(DeviceState *dev)
>>       vfio_pci_pre_reset(vdev);
>>   
>>       if (vdev->vbasedev.reset_works &&
>> -        (vdev->has_flr || !vdev->has_pm_reset) &&
>> +        vdev->has_flr &&
>>           !ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
>>           trace_vfio_pci_reset_flr(vdev->vbasedev.name);
>>           goto post_reset;
> Does this actually fix anything?  QEMU shouldn't rely on a specific
> behavior of the kernel.  This test is de-prioritizing a PM reset because
> they're often non-effective.  If the device supports FLR, the second
> part of the OR is unreached, so what's the point of this change?
For this change, when I tested the code on my own machine.
I found the vfio device has neither flr nor pm reset (e.g. NoSoftRst+).
this also trigger ioctl VFIO_DEVICE_RESET, is it right?

Thanks,
Chen


> Thanks,
>
> Alex
>
> .
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [RFC v2 6/8] vfio_pci: fix a wrong check in vfio_pci_reset
  2015-02-04  9:54     ` Chen Fan
@ 2015-02-04 13:53       ` Alex Williamson
  0 siblings, 0 replies; 19+ messages in thread
From: Alex Williamson @ 2015-02-04 13:53 UTC (permalink / raw)
  To: Chen Fan; +Cc: marcel, izumi.taku, qemu-devel

On Wed, 2015-02-04 at 17:54 +0800, Chen Fan wrote:
> On 02/03/2015 04:16 AM, Alex Williamson wrote:
> > On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:
> >> when vfio device support FLR, then when device reset,
> >> we call VFIO_DEVICE_RESET ioctl to reset the device first,
> >> at kernel side, we also can see the order of reset:
> >> 3330         rc = pcie_flr(dev, probe);
> >> 3331         if (rc != -ENOTTY)
> >> 3332                 goto done;
> >> 3333
> >> 3334         rc = pci_af_flr(dev, probe);
> >> 3335         if (rc != -ENOTTY)
> >> 3336                 goto done;
> >> 3337
> >> 3338         rc = pci_pm_reset(dev, probe);
> >> 3339         if (rc != -ENOTTY)
> >> 3340                 goto done;
> >>
> >> so when vfio has FLR, reset it directly.
> >>
> >> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> >> ---
> >>   hw/vfio/pci.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> >> index 8c81bb3..54eb6b4 100644
> >> --- a/hw/vfio/pci.c
> >> +++ b/hw/vfio/pci.c
> >> @@ -3455,7 +3455,7 @@ static void vfio_pci_reset(DeviceState *dev)
> >>       vfio_pci_pre_reset(vdev);
> >>   
> >>       if (vdev->vbasedev.reset_works &&
> >> -        (vdev->has_flr || !vdev->has_pm_reset) &&
> >> +        vdev->has_flr &&
> >>           !ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)) {
> >>           trace_vfio_pci_reset_flr(vdev->vbasedev.name);
> >>           goto post_reset;
> > Does this actually fix anything?  QEMU shouldn't rely on a specific
> > behavior of the kernel.  This test is de-prioritizing a PM reset because
> > they're often non-effective.  If the device supports FLR, the second
> > part of the OR is unreached, so what's the point of this change?
> For this change, when I tested the code on my own machine.
> I found the vfio device has neither flr nor pm reset (e.g. NoSoftRst+).
> this also trigger ioctl VFIO_DEVICE_RESET, is it right?

Yes, that means that the device has a device specific reset or that it's
a singleton device on the bus and we can use the simpler path of
VFIO_DEVICE_RESET.  Thanks,

Alex

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [RFC v2 2/8] vfio-pci: add aer capability support
  2015-02-02 20:15   ` Alex Williamson
@ 2015-02-06  7:03     ` Chen Fan
  2015-02-06 13:43       ` Alex Williamson
  0 siblings, 1 reply; 19+ messages in thread
From: Chen Fan @ 2015-02-06  7:03 UTC (permalink / raw)
  To: Alex Williamson; +Cc: marcel, izumi.taku, qemu-devel


On 02/03/2015 04:15 AM, Alex Williamson wrote:
> On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:
>> if we detect the aer capability in vfio device, then
>> we should initialize the vfio device aer rigister bits.
>> so guest OS can set this bits as needed.
> s/rigister/register/
>
>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>> ---
>>   hw/vfio/pci.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 72 insertions(+)
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 014a92c..2072261 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -2670,6 +2670,73 @@ static int vfio_add_capabilities(VFIOPCIDevice *vdev)
>>       return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]);
>>   }
>>   
>> +static void vfio_pci_aer_init(VFIOPCIDevice *vdev)
>> +{
>> +    PCIDevice *pdev = &vdev->pdev;
>> +    PCIExpressDevice *exp = &pdev->exp;
>> +    uint16_t offset = exp->aer_cap;
>> +
>> +    if (!offset) {
>> +        return;
>> +    }
>> +
> All of these need to be documented with comments.
>
>> +    memset(pdev->wmask + offset, 0, PCI_ERR_SIZEOF);
>> +    memset(pdev->w1cmask + offset, 0, PCI_ERR_SIZEOF);
>> +    memset(pdev->cmask + offset, 0xFF, PCI_ERR_SIZEOF);
>> +
>> +    pci_long_test_and_set_mask(pdev->wmask + exp->exp_cap + PCI_EXP_DEVCTL,
>> +                               PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE |
>> +                               PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE);
>> +    pci_long_test_and_set_mask(pdev->w1cmask + exp->exp_cap + PCI_EXP_DEVSTA,
>> +                               PCI_EXP_DEVSTA_CED | PCI_EXP_DEVSTA_NFED |
>> +                               PCI_EXP_DEVSTA_FED | PCI_EXP_DEVSTA_URD);
>> +
>> +    pci_set_long(pdev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
>> +                 PCI_ERR_UNC_SUPPORTED);
>> +    pci_set_long(pdev->wmask + offset + PCI_ERR_UNCOR_SEVER,
>> +                 PCI_ERR_UNC_SUPPORTED);
>> +    pci_long_test_and_set_mask(pdev->w1cmask + offset + PCI_ERR_COR_STATUS,
>> +                               PCI_ERR_COR_STATUS);
>> +    pci_set_long(pdev->wmask + offset + PCI_ERR_COR_MASK,
>> +                 PCI_ERR_COR_SUPPORTED);
>> +
>> +    pci_set_long(pdev->wmask + offset + PCI_ERR_CAP,
>> +                 PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE |
>> +                 PCI_ERR_CAP_MHRE);
>> +}
>> +
>> +static int vfio_add_ext_capabilities(VFIOPCIDevice *vdev)
>> +{
>> +    PCIDevice *pdev = &vdev->pdev;
>> +    PCIExpressDevice *exp;
>> +    uint32_t header;
>> +    uint16_t next = PCI_CONFIG_SPACE_SIZE;
>> +
>> +    if (pci_config_size(pdev) <= PCI_CONFIG_SPACE_SIZE) {
>> +        return 0;
>> +    }
>> +
>> +    header = pci_get_long(pdev->config + next);
>> +    while (header) {
>> +        switch (PCI_EXT_CAP_ID(header)) {
>> +        case PCI_EXT_CAP_ID_ERR:
>> +             exp = &pdev->exp;
>> +             exp->aer_cap = next;
> Shouldn't we call pcie_aer_init() here?
I am afraid pcie_aer_init() maybe impact the corresponding values
in pdev->config.

>
>> +
>> +             vfio_pci_aer_init(vdev);
>> +             break;
>> +        };
>> +
>> +        next = PCI_EXT_CAP_NEXT(header);
>> +        if (!next) {
>> +            return 0;
>> +        }
>> +        header = pci_get_long(pdev->config + next);
> I'd like to see this look more like vfio_add_std_cap(), registering
> every capability with the QEMU PCIe-core and setting up emulation to
> allow QEMU to skip capabilities that it doesn't want to expose.
>
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   static void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
>>   {
>>       PCIDevice *pdev = &vdev->pdev;
>> @@ -3296,6 +3363,11 @@ static int vfio_initfn(PCIDevice *pdev)
>>           goto out_teardown;
>>       }
>>   
>> +    ret = vfio_add_ext_capabilities(vdev);
>> +    if (ret) {
>> +        goto out_teardown;
>> +    }
>> +
> Why not extend vfio_add_capabilities()?  It specifically calls
> vfio_add_std_cap() in order that there could be a vfio_add_ext_cap().
it is a good idea.

Thanks,
Chen


>
>>       /* QEMU emulates all of MSI & MSIX */
>>       if (pdev->cap_present & QEMU_PCI_CAP_MSIX) {
>>           memset(vdev->emulated_config_bits + pdev->msix_cap, 0xff,
>
>
> .
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [RFC v2 4/8] vfio-pci: pass the aer error to guest
  2015-02-02 20:16   ` Alex Williamson
@ 2015-02-06  7:06     ` Chen Fan
  0 siblings, 0 replies; 19+ messages in thread
From: Chen Fan @ 2015-02-06  7:06 UTC (permalink / raw)
  To: Alex Williamson; +Cc: marcel, izumi.taku, qemu-devel


On 02/03/2015 04:16 AM, Alex Williamson wrote:
> On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:
>> when the vfio device encounters an uncorrectable error in host,
>> the vfio_pci driver will signal the eventfd registered by this
>> vfio device, the results in the qemu eventfd handler getting
>> invoked.
>>
>> this patch is to pass the error to guest and have the guest driver
>> recover from the error.
>>
>> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
>> ---
>>   hw/vfio/pci.c | 34 ++++++++++++++++++++++++++++------
>>   1 file changed, 28 insertions(+), 6 deletions(-)
>>
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index 2072261..8c81bb3 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -3141,18 +3141,40 @@ static void vfio_put_device(VFIOPCIDevice *vdev)
>>   static void vfio_err_notifier_handler(void *opaque)
>>   {
>>       VFIOPCIDevice *vdev = opaque;
>> +    PCIDevice *dev = &vdev->pdev;
>> +    PCIEAERMsg msg = {
>> +        .severity = 0,
>> +        .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn,
>> +    };
>>   
>>       if (!event_notifier_test_and_clear(&vdev->err_notifier)) {
>>           return;
>>       }
>>   
>> +    /* we should read the error details from the real hardware
>> +     * configuration spaces, here we only need to do is signaling
>> +     * to guest an uncorrectable error has occurred.
>> +     */
>> +    if (dev->exp.aer_cap) {
>> +        uint8_t *aer_cap = dev->config + dev->exp.aer_cap;
>> +        uint32_t uncor_status;
>> +        bool isfatal;
>> +
>> +        uncor_status = vfio_pci_read_config(dev,
>> +                           dev->exp.aer_cap + PCI_ERR_UNCOR_STATUS, 4);
>> +
>> +        isfatal = uncor_status & pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER);
>> +
>> +        msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN :
>> +                                 PCI_ERR_ROOT_CMD_NONFATAL_EN;
>> +
>> +        pcie_aer_msg(dev, &msg);
>> +        return;
>> +    }
> What if the guest machine type is 440FX?  We've just killed the existing
> vm_stop functionality for the majority of users.
>
>> +
>>       /*
>> -     * TBD. Retrieve the error details and decide what action
>> -     * needs to be taken. One of the actions could be to pass
>> -     * the error to the guest and have the guest driver recover
>> -     * from the error. This requires that PCIe capabilities be
>> -     * exposed to the guest. For now, we just terminate the
>> -     * guest to contain the error.
>> +     * If the aer capability is not exposed to the guest. we just
>> +     * terminate the guest to contain the error.
> Just because it's exposed doesn't mean the guest chipset allows access
> to it, right?
you are right. I will think more about this interface.

Thanks,
Chen


>
>>        */
>>   
>>       error_report("%s(%04x:%02x:%02x.%x) Unrecoverable error detected.  "
>
>
> .
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [RFC v2 2/8] vfio-pci: add aer capability support
  2015-02-06  7:03     ` Chen Fan
@ 2015-02-06 13:43       ` Alex Williamson
  0 siblings, 0 replies; 19+ messages in thread
From: Alex Williamson @ 2015-02-06 13:43 UTC (permalink / raw)
  To: Chen Fan; +Cc: marcel, izumi.taku, qemu-devel

On Fri, 2015-02-06 at 15:03 +0800, Chen Fan wrote:
> On 02/03/2015 04:15 AM, Alex Williamson wrote:
> > On Wed, 2015-01-28 at 16:37 +0800, Chen Fan wrote:
> >> if we detect the aer capability in vfio device, then
> >> we should initialize the vfio device aer rigister bits.
> >> so guest OS can set this bits as needed.
> > s/rigister/register/
> >
> >> Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
> >> ---
> >>   hw/vfio/pci.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>   1 file changed, 72 insertions(+)
> >>
> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> >> index 014a92c..2072261 100644
> >> --- a/hw/vfio/pci.c
> >> +++ b/hw/vfio/pci.c
> >> @@ -2670,6 +2670,73 @@ static int vfio_add_capabilities(VFIOPCIDevice *vdev)
> >>       return vfio_add_std_cap(vdev, pdev->config[PCI_CAPABILITY_LIST]);
> >>   }
> >>   
> >> +static void vfio_pci_aer_init(VFIOPCIDevice *vdev)
> >> +{
> >> +    PCIDevice *pdev = &vdev->pdev;
> >> +    PCIExpressDevice *exp = &pdev->exp;
> >> +    uint16_t offset = exp->aer_cap;
> >> +
> >> +    if (!offset) {
> >> +        return;
> >> +    }
> >> +
> > All of these need to be documented with comments.
> >
> >> +    memset(pdev->wmask + offset, 0, PCI_ERR_SIZEOF);
> >> +    memset(pdev->w1cmask + offset, 0, PCI_ERR_SIZEOF);
> >> +    memset(pdev->cmask + offset, 0xFF, PCI_ERR_SIZEOF);
> >> +
> >> +    pci_long_test_and_set_mask(pdev->wmask + exp->exp_cap + PCI_EXP_DEVCTL,
> >> +                               PCI_EXP_DEVCTL_CERE | PCI_EXP_DEVCTL_NFERE |
> >> +                               PCI_EXP_DEVCTL_FERE | PCI_EXP_DEVCTL_URRE);
> >> +    pci_long_test_and_set_mask(pdev->w1cmask + exp->exp_cap + PCI_EXP_DEVSTA,
> >> +                               PCI_EXP_DEVSTA_CED | PCI_EXP_DEVSTA_NFED |
> >> +                               PCI_EXP_DEVSTA_FED | PCI_EXP_DEVSTA_URD);
> >> +
> >> +    pci_set_long(pdev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
> >> +                 PCI_ERR_UNC_SUPPORTED);
> >> +    pci_set_long(pdev->wmask + offset + PCI_ERR_UNCOR_SEVER,
> >> +                 PCI_ERR_UNC_SUPPORTED);
> >> +    pci_long_test_and_set_mask(pdev->w1cmask + offset + PCI_ERR_COR_STATUS,
> >> +                               PCI_ERR_COR_STATUS);
> >> +    pci_set_long(pdev->wmask + offset + PCI_ERR_COR_MASK,
> >> +                 PCI_ERR_COR_SUPPORTED);
> >> +
> >> +    pci_set_long(pdev->wmask + offset + PCI_ERR_CAP,
> >> +                 PCI_ERR_CAP_ECRC_GENE | PCI_ERR_CAP_ECRC_CHKE |
> >> +                 PCI_ERR_CAP_MHRE);
> >> +}
> >> +
> >> +static int vfio_add_ext_capabilities(VFIOPCIDevice *vdev)
> >> +{
> >> +    PCIDevice *pdev = &vdev->pdev;
> >> +    PCIExpressDevice *exp;
> >> +    uint32_t header;
> >> +    uint16_t next = PCI_CONFIG_SPACE_SIZE;
> >> +
> >> +    if (pci_config_size(pdev) <= PCI_CONFIG_SPACE_SIZE) {
> >> +        return 0;
> >> +    }
> >> +
> >> +    header = pci_get_long(pdev->config + next);
> >> +    while (header) {
> >> +        switch (PCI_EXT_CAP_ID(header)) {
> >> +        case PCI_EXT_CAP_ID_ERR:
> >> +             exp = &pdev->exp;
> >> +             exp->aer_cap = next;
> > Shouldn't we call pcie_aer_init() here?
> I am afraid pcie_aer_init() maybe impact the corresponding values
> in pdev->config.

Then we should do cleanup after calling pcie_aer_init() or modify
pcie_aer_init() to do what we need.  Doing raw aer manipulation outside
of core support just causes maintenance issues down the road.  Thanks,

Alex

> >> +
> >> +             vfio_pci_aer_init(vdev);
> >> +             break;
> >> +        };
> >> +
> >> +        next = PCI_EXT_CAP_NEXT(header);
> >> +        if (!next) {
> >> +            return 0;
> >> +        }
> >> +        header = pci_get_long(pdev->config + next);
> > I'd like to see this look more like vfio_add_std_cap(), registering
> > every capability with the QEMU PCIe-core and setting up emulation to
> > allow QEMU to skip capabilities that it doesn't want to expose.
> >
> >> +    }
> >> +
> >> +    return 0;
> >> +}
> >> +
> >>   static void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
> >>   {
> >>       PCIDevice *pdev = &vdev->pdev;
> >> @@ -3296,6 +3363,11 @@ static int vfio_initfn(PCIDevice *pdev)
> >>           goto out_teardown;
> >>       }
> >>   
> >> +    ret = vfio_add_ext_capabilities(vdev);
> >> +    if (ret) {
> >> +        goto out_teardown;
> >> +    }
> >> +
> > Why not extend vfio_add_capabilities()?  It specifically calls
> > vfio_add_std_cap() in order that there could be a vfio_add_ext_cap().
> it is a good idea.
> 
> Thanks,
> Chen
> 
> 
> >
> >>       /* QEMU emulates all of MSI & MSIX */
> >>       if (pdev->cap_present & QEMU_PCI_CAP_MSIX) {
> >>           memset(vdev->emulated_config_bits + pdev->msix_cap, 0xff,
> >
> >
> > .
> >
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-02-06 13:43 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-28  8:37 [Qemu-devel] [RFC v2 0/8] pass aer error to guest for vfio device Chen Fan
2015-01-28  8:37 ` [Qemu-devel] [RFC v2 1/8] pcie_aer: fix typos in pcie_aer_inject_error comment Chen Fan
2015-01-28  8:37 ` [Qemu-devel] [RFC v2 2/8] vfio-pci: add aer capability support Chen Fan
2015-02-02 20:15   ` Alex Williamson
2015-02-06  7:03     ` Chen Fan
2015-02-06 13:43       ` Alex Williamson
2015-01-28  8:37 ` [Qemu-devel] [RFC v2 3/8] pcie_aer: expose pcie_aer_msg() interface Chen Fan
2015-01-28  8:37 ` [Qemu-devel] [RFC v2 4/8] vfio-pci: pass the aer error to guest Chen Fan
2015-02-02 20:16   ` Alex Williamson
2015-02-06  7:06     ` Chen Fan
2015-01-28  8:37 ` [Qemu-devel] [RFC v2 5/8] pcie_aer: fix a trivial typo in PCIEAERMsg comments Chen Fan
2015-01-28  8:37 ` [Qemu-devel] [RFC v2 6/8] vfio_pci: fix a wrong check in vfio_pci_reset Chen Fan
2015-02-02 20:16   ` Alex Williamson
2015-02-04  9:54     ` Chen Fan
2015-02-04 13:53       ` Alex Williamson
2015-01-28  8:37 ` [Qemu-devel] [RFC v2 7/8] vfio_pci: change vfio device features bit macro to enum definition Chen Fan
2015-02-02 20:15   ` Alex Williamson
2015-01-28  8:37 ` [Qemu-devel] [RFC v2 8/8] vfio-pci: add VFIO_FEATURE_ENABLE_AER_CAP feature Chen Fan
2015-02-02 20:16   ` Alex Williamson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.