All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/7] s390x/vfio_ccw: Channel Path Handling [QEMU]
@ 2020-02-06 21:45 Eric Farman
  2020-02-06 21:45 ` [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO Eric Farman
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: Eric Farman @ 2020-02-06 21:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason Herne, Eric Farman, qemu-devel, Halil Pasic, qemu-s390x,
	Jared Rossi

Here is a new pass at the channel-path handling code for vfio-ccw,
to take advantage of the corresponding kernel patches posted here:

https://lore.kernel.org/kvm/20200206213825.11444-1-farman@linux.ibm.com/

I did leave a couple things FIXMEs from v1 comments in here.
Thought it'd be best to just get this out with the kernel code, to
make sure things aren't too far in the weeds.

v1: https://lore.kernel.org/qemu-devel/20191115033437.37926-1-farman@linux.ibm.com/

Eric Farman (2):
  vfio-ccw: Refactor cleanup of regions
  vfio-ccw: Refactor ccw irq handler

Farhan Ali (5):
  vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO
  linux-headers: update
  vfio-ccw: Add support for the schib region
  vfio-ccw: Add support for the crw region
  vfio-ccw: Add support for the CRW irq

 hw/s390x/css.c                         |  13 +-
 hw/s390x/s390-ccw.c                    |  28 ++++
 hw/vfio/ccw.c                          | 204 ++++++++++++++++++++++---
 include/hw/s390x/css.h                 |   3 +-
 include/hw/s390x/s390-ccw.h            |   1 +
 include/standard-headers/linux/input.h |   1 +
 linux-headers/asm-arm64/unistd.h       |   1 +
 linux-headers/linux/vfio.h             |   3 +
 linux-headers/linux/vfio_ccw.h         |  19 +++
 target/s390x/ioinst.c                  |   3 +-
 10 files changed, 250 insertions(+), 26 deletions(-)

-- 
2.17.1



^ permalink raw reply	[flat|nested] 18+ messages in thread

* [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO
  2020-02-06 21:45 [RFC PATCH v2 0/7] s390x/vfio_ccw: Channel Path Handling [QEMU] Eric Farman
@ 2020-02-06 21:45 ` Eric Farman
  2020-03-24 17:04   ` Cornelia Huck
  2020-02-06 21:45 ` [RFC PATCH v2 2/7] linux-headers: update Eric Farman
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Eric Farman @ 2020-02-06 21:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason Herne, Eric Farman, qemu-devel, Halil Pasic, qemu-s390x,
	Jared Rossi

From: Farhan Ali <alifm@linux.ibm.com>

EIO is returned by vfio-ccw mediated device when the backing
host subchannel is not operational anymore. So return cc=3
back to the guest, rather than returning a unit check.
This way the guest can take appropriate action such as
issue an 'stsch'.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---

Notes:
    v1->v2: [EF]
     - Add s-o-b
     - [Seems the discussion on v1 centered on the return code
       set in the kernel, rather than anything that needs to
       change here, unless I've missed something.]

 hw/vfio/ccw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 50cc2ec75c..19144ecfc7 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -114,6 +114,7 @@ again:
         return IOINST_CC_BUSY;
     case -ENODEV:
     case -EACCES:
+    case -EIO:
         return IOINST_CC_NOT_OPERATIONAL;
     case -EFAULT:
     default:
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH v2 2/7] linux-headers: update
  2020-02-06 21:45 [RFC PATCH v2 0/7] s390x/vfio_ccw: Channel Path Handling [QEMU] Eric Farman
  2020-02-06 21:45 ` [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO Eric Farman
@ 2020-02-06 21:45 ` Eric Farman
  2020-02-06 21:45 ` [RFC PATCH v2 3/7] vfio-ccw: Refactor cleanup of regions Eric Farman
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Eric Farman @ 2020-02-06 21:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason Herne, Eric Farman, qemu-devel, Halil Pasic, qemu-s390x,
	Jared Rossi

From: Farhan Ali <alifm@linux.ibm.com>

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---

Notes:
    v1->v2: [EF]
     - Re-ran 3 February 2020 (based on kernel tag v5.5)
    v0->v1: [EF]
     - Run scripts/update-linux-headers.sh properly, but do not
       add resulting changes to linux-headers/asm-mips/

 include/standard-headers/linux/input.h |  1 +
 linux-headers/asm-arm64/unistd.h       |  1 +
 linux-headers/linux/vfio.h             |  3 +++
 linux-headers/linux/vfio_ccw.h         | 19 +++++++++++++++++++
 4 files changed, 24 insertions(+)

diff --git a/include/standard-headers/linux/input.h b/include/standard-headers/linux/input.h
index d8914f25a5..f89c986190 100644
--- a/include/standard-headers/linux/input.h
+++ b/include/standard-headers/linux/input.h
@@ -31,6 +31,7 @@ struct input_event {
 	unsigned long __sec;
 #if defined(__sparc__) && defined(__arch64__)
 	unsigned int __usec;
+	unsigned int __pad;
 #else
 	unsigned long __usec;
 #endif
diff --git a/linux-headers/asm-arm64/unistd.h b/linux-headers/asm-arm64/unistd.h
index 4703d21866..f83a70e07d 100644
--- a/linux-headers/asm-arm64/unistd.h
+++ b/linux-headers/asm-arm64/unistd.h
@@ -19,5 +19,6 @@
 #define __ARCH_WANT_NEW_STAT
 #define __ARCH_WANT_SET_GET_RLIMIT
 #define __ARCH_WANT_TIME32_SYSCALLS
+#define __ARCH_WANT_SYS_CLONE3
 
 #include <asm-generic/unistd.h>
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index fb10370d29..9e227348b3 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -378,6 +378,8 @@ struct vfio_region_gfx_edid {
 
 /* sub-types for VFIO_REGION_TYPE_CCW */
 #define VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD	(1)
+#define VFIO_REGION_SUBTYPE_CCW_SCHIB		(2)
+#define VFIO_REGION_SUBTYPE_CCW_CRW		(3)
 
 /*
  * The MSIX mappable capability informs that MSIX data of a BAR can be mmapped
@@ -577,6 +579,7 @@ enum {
 
 enum {
 	VFIO_CCW_IO_IRQ_INDEX,
+	VFIO_CCW_CRW_IRQ_INDEX,
 	VFIO_CCW_NUM_IRQS
 };
 
diff --git a/linux-headers/linux/vfio_ccw.h b/linux-headers/linux/vfio_ccw.h
index fcc3e69ef5..c56c1a621c 100644
--- a/linux-headers/linux/vfio_ccw.h
+++ b/linux-headers/linux/vfio_ccw.h
@@ -34,4 +34,23 @@ struct ccw_cmd_region {
 	__u32 ret_code;
 } __attribute__((packed));
 
+/*
+ * Used for processing commands that read the subchannel-information block
+ * Reading this region triggers a stsch() to hardware
+ * Note: this is controlled by a capability
+ */
+struct ccw_schib_region {
+#define SCHIB_AREA_SIZE 52
+	__u8 schib_area[SCHIB_AREA_SIZE];
+} __attribute__((packed));
+
+/*
+ * Used for returning Channel Report Word(s) to userspace.
+ * Note: this is controlled by a capability
+ */
+struct ccw_crw_region {
+	__u32 crw0;
+	__u32 crw1;
+} __attribute__((packed));
+
 #endif
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH v2 3/7] vfio-ccw: Refactor cleanup of regions
  2020-02-06 21:45 [RFC PATCH v2 0/7] s390x/vfio_ccw: Channel Path Handling [QEMU] Eric Farman
  2020-02-06 21:45 ` [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO Eric Farman
  2020-02-06 21:45 ` [RFC PATCH v2 2/7] linux-headers: update Eric Farman
@ 2020-02-06 21:45 ` Eric Farman
  2020-02-06 21:45 ` [RFC PATCH v2 4/7] vfio-ccw: Add support for the schib region Eric Farman
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Eric Farman @ 2020-02-06 21:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason Herne, Eric Farman, qemu-devel, Halil Pasic, qemu-s390x,
	Jared Rossi

While we're at it, add a g_free() for the async_cmd_region that
is the last thing currently created.  g_free() knows how to handle
NULL pointers, so this makes it easier to remember what cleanups
need to be performed when new regions are added.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
---

Notes:
    v1-v2:
     - Add Conny's r-b

 hw/vfio/ccw.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 19144ecfc7..26e479c53f 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -371,8 +371,7 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
     vcdev->io_region_size = info->size;
     if (sizeof(*vcdev->io_region) != vcdev->io_region_size) {
         error_setg(errp, "vfio: Unexpected size of the I/O region");
-        g_free(info);
-        return;
+        goto out_err;
     }
 
     vcdev->io_region_offset = info->offset;
@@ -385,15 +384,20 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
         vcdev->async_cmd_region_size = info->size;
         if (sizeof(*vcdev->async_cmd_region) != vcdev->async_cmd_region_size) {
             error_setg(errp, "vfio: Unexpected size of the async cmd region");
-            g_free(vcdev->io_region);
-            g_free(info);
-            return;
+            goto out_err;
         }
         vcdev->async_cmd_region_offset = info->offset;
         vcdev->async_cmd_region = g_malloc0(info->size);
     }
 
     g_free(info);
+    return;
+
+out_err:
+    g_free(vcdev->async_cmd_region);
+    g_free(vcdev->io_region);
+    g_free(info);
+    return;
 }
 
 static void vfio_ccw_put_region(VFIOCCWDevice *vcdev)
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH v2 4/7] vfio-ccw: Add support for the schib region
  2020-02-06 21:45 [RFC PATCH v2 0/7] s390x/vfio_ccw: Channel Path Handling [QEMU] Eric Farman
                   ` (2 preceding siblings ...)
  2020-02-06 21:45 ` [RFC PATCH v2 3/7] vfio-ccw: Refactor cleanup of regions Eric Farman
@ 2020-02-06 21:45 ` Eric Farman
  2020-02-06 21:45 ` [RFC PATCH v2 5/7] vfio-ccw: Add support for the crw region Eric Farman
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Eric Farman @ 2020-02-06 21:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason Herne, Eric Farman, qemu-devel, Halil Pasic, qemu-s390x,
	Jared Rossi

From: Farhan Ali <alifm@linux.ibm.com>

The schib region can be used to obtain the latest SCHIB from the host
passthrough subchannel. Since the guest SCHIB is virtualized,
we currently only update the path related information so that the
guest is aware of any path related changes when it issues the
'stsch' instruction.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---

Notes:
    v1->v2:
     - Remove silly variable intialization, and add a block comment,
       to css_do_stsch() [CH]
     - Add a TODO statement to s390_ccw_store(), for myself to sort
       out while we go over kernel code more closely [CH/EF]
     - In vfio_ccw_handle_store(),
        - Set schib pointer once region is determined to be non-NULL [CH]
        - Return cc=0 if pread() fails, and log an error [CH]
    v0->v1: [EF]
     - Change various incarnations of "update chp status" to
       "handle_store", to reflect the STSCH instruction that will
       drive this code
     - Remove temporary variable for casting/testing purposes in
       s390_ccw_store(), and add a block comment of WHY its there.
     - Add a few comments to vfio_ccw_handle_store()

 hw/s390x/css.c              | 13 ++++++--
 hw/s390x/s390-ccw.c         | 28 +++++++++++++++++
 hw/vfio/ccw.c               | 63 +++++++++++++++++++++++++++++++++++++
 include/hw/s390x/css.h      |  3 +-
 include/hw/s390x/s390-ccw.h |  1 +
 target/s390x/ioinst.c       |  3 +-
 6 files changed, 106 insertions(+), 5 deletions(-)

diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index 844caab408..71fd3f9a00 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -1335,11 +1335,20 @@ static void copy_schib_to_guest(SCHIB *dest, const SCHIB *src)
     }
 }
 
-int css_do_stsch(SubchDev *sch, SCHIB *schib)
+IOInstEnding css_do_stsch(SubchDev *sch, SCHIB *schib)
 {
+    int ret;
+
+    /*
+     * For some subchannels, we may want to update parts of
+     * the schib (e.g., update path masks from the host device
+     * for passthrough subchannels).
+     */
+    ret = s390_ccw_store(sch);
+
     /* Use current status. */
     copy_schib_to_guest(schib, &sch->curr_status);
-    return 0;
+    return ret;
 }
 
 static void copy_pmcw_from_guest(PMCW *dest, const PMCW *src)
diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
index 0c5a5b60bd..0c619706a1 100644
--- a/hw/s390x/s390-ccw.c
+++ b/hw/s390x/s390-ccw.c
@@ -51,6 +51,34 @@ int s390_ccw_clear(SubchDev *sch)
     return cdc->handle_clear(sch);
 }
 
+IOInstEnding s390_ccw_store(SubchDev *sch)
+{
+    S390CCWDeviceClass *cdc = NULL;
+    int ret = IOINST_CC_EXPECTED;
+
+    /*
+     * This only applies to passthrough devices, so we can't unconditionally
+     * set this variable like we would for halt/clear.
+     *
+     * TODO from Conny on v1:
+     *   "We have a generic ccw_cb in the subchannel structure for ccw
+     *    interpretation; would it make sense to add a generic callback
+     *    for stsch there as well?
+     *
+     *   "(This works fine, though. Might want to add the check for
+     *    halt/clear as well, but that might be a bit overkill.)"
+     */
+    if (object_dynamic_cast(OBJECT(sch->driver_data), TYPE_S390_CCW)) {
+        cdc = S390_CCW_DEVICE_GET_CLASS(sch->driver_data);
+    }
+
+    if (cdc && cdc->handle_store) {
+        ret = cdc->handle_store(sch);
+    }
+
+    return ret;
+}
+
 static void s390_ccw_get_dev_info(S390CCWDevice *cdev,
                                   char *sysfsdev,
                                   Error **errp)
diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 26e479c53f..e31de3ffd7 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -41,6 +41,9 @@ struct VFIOCCWDevice {
     uint64_t async_cmd_region_size;
     uint64_t async_cmd_region_offset;
     struct ccw_cmd_region *async_cmd_region;
+    uint64_t schib_region_size;
+    uint64_t schib_region_offset;
+    struct ccw_schib_region *schib_region;
     EventNotifier io_notifier;
     bool force_orb_pfch;
     bool warned_orb_pfch;
@@ -124,6 +127,51 @@ again:
     }
 }
 
+static IOInstEnding vfio_ccw_handle_store(SubchDev *sch)
+{
+    S390CCWDevice *cdev = sch->driver_data;
+    VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev);
+    SCHIB *schib = &sch->curr_status;
+    struct ccw_schib_region *region = vcdev->schib_region;
+    SCHIB *s;
+    int ret;
+
+    /* schib region not available so nothing else to do */
+    if (!region) {
+        return IOINST_CC_EXPECTED;
+    }
+
+    memset(region, 0, sizeof(*region));
+    ret = pread(vcdev->vdev.fd, region, vcdev->schib_region_size,
+                vcdev->schib_region_offset);
+
+    if (ret == -1) {
+        /*
+         * Device is probably damaged, but store subchannel does not
+         * have a nonzero cc defined for this scenario.  Log an error,
+         * and presume things are otherwise fine.
+         */
+        error_report("vfio-ccw: store region read failed with errno=%d", errno);
+        return IOINST_CC_EXPECTED;
+    }
+
+    /*
+     * Selectively copy path-related bits of the SCHIB,
+     * rather than copying the entire struct.
+     */
+    s = (SCHIB *)region->schib_area;
+    schib->pmcw.pnom = s->pmcw.pnom;
+    schib->pmcw.lpum = s->pmcw.lpum;
+    schib->pmcw.pam = s->pmcw.pam;
+    schib->pmcw.pom = s->pmcw.pom;
+
+    if (s->scsw.flags & SCSW_FLAGS_MASK_PNO) {
+        schib->scsw.flags |= SCSW_FLAGS_MASK_PNO;
+    }
+
+    return IOINST_CC_EXPECTED;
+}
+
 static int vfio_ccw_handle_clear(SubchDev *sch)
 {
     S390CCWDevice *cdev = sch->driver_data;
@@ -390,10 +438,23 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
         vcdev->async_cmd_region = g_malloc0(info->size);
     }
 
+    ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW,
+                                   VFIO_REGION_SUBTYPE_CCW_SCHIB, &info);
+    if (!ret) {
+        vcdev->schib_region_size = info->size;
+        if (sizeof(*vcdev->schib_region) != vcdev->schib_region_size) {
+            error_setg(errp, "vfio: Unexpected size of the schib region");
+            goto out_err;
+        }
+        vcdev->schib_region_offset = info->offset;
+        vcdev->schib_region = g_malloc(info->size);
+    }
+
     g_free(info);
     return;
 
 out_err:
+    g_free(vcdev->schib_region);
     g_free(vcdev->async_cmd_region);
     g_free(vcdev->io_region);
     g_free(info);
@@ -402,6 +463,7 @@ out_err:
 
 static void vfio_ccw_put_region(VFIOCCWDevice *vcdev)
 {
+    g_free(vcdev->schib_region);
     g_free(vcdev->async_cmd_region);
     g_free(vcdev->io_region);
 }
@@ -577,6 +639,7 @@ static void vfio_ccw_class_init(ObjectClass *klass, void *data)
     cdc->handle_request = vfio_ccw_handle_request;
     cdc->handle_halt = vfio_ccw_handle_halt;
     cdc->handle_clear = vfio_ccw_handle_clear;
+    cdc->handle_store = vfio_ccw_handle_store;
 }
 
 static const TypeInfo vfio_ccw_info = {
diff --git a/include/hw/s390x/css.h b/include/hw/s390x/css.h
index f46bcafb16..7e3a5e7433 100644
--- a/include/hw/s390x/css.h
+++ b/include/hw/s390x/css.h
@@ -218,6 +218,7 @@ IOInstEnding do_subchannel_work_passthrough(SubchDev *sub);
 
 int s390_ccw_halt(SubchDev *sch);
 int s390_ccw_clear(SubchDev *sch);
+IOInstEnding s390_ccw_store(SubchDev *sch);
 
 typedef enum {
     CSS_IO_ADAPTER_VIRTIO = 0,
@@ -242,7 +243,7 @@ SubchDev *css_find_subch(uint8_t m, uint8_t cssid, uint8_t ssid,
                          uint16_t schid);
 bool css_subch_visible(SubchDev *sch);
 void css_conditional_io_interrupt(SubchDev *sch);
-int css_do_stsch(SubchDev *sch, SCHIB *schib);
+IOInstEnding css_do_stsch(SubchDev *sch, SCHIB *schib);
 bool css_schid_final(int m, uint8_t cssid, uint8_t ssid, uint16_t schid);
 IOInstEnding css_do_msch(SubchDev *sch, const SCHIB *schib);
 IOInstEnding css_do_xsch(SubchDev *sch);
diff --git a/include/hw/s390x/s390-ccw.h b/include/hw/s390x/s390-ccw.h
index fffb54562f..4a43803ef2 100644
--- a/include/hw/s390x/s390-ccw.h
+++ b/include/hw/s390x/s390-ccw.h
@@ -37,6 +37,7 @@ typedef struct S390CCWDeviceClass {
     IOInstEnding (*handle_request) (SubchDev *sch);
     int (*handle_halt) (SubchDev *sch);
     int (*handle_clear) (SubchDev *sch);
+    IOInstEnding (*handle_store) (SubchDev *sch);
 } S390CCWDeviceClass;
 
 #endif
diff --git a/target/s390x/ioinst.c b/target/s390x/ioinst.c
index c437a1d8c6..63b4c84215 100644
--- a/target/s390x/ioinst.c
+++ b/target/s390x/ioinst.c
@@ -257,8 +257,7 @@ void ioinst_handle_stsch(S390CPU *cpu, uint64_t reg1, uint32_t ipb,
     sch = css_find_subch(m, cssid, ssid, schid);
     if (sch) {
         if (css_subch_visible(sch)) {
-            css_do_stsch(sch, &schib);
-            cc = 0;
+            cc = css_do_stsch(sch, &schib);
         } else {
             /* Indicate no more subchannels in this css/ss */
             cc = 3;
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH v2 5/7] vfio-ccw: Add support for the crw region
  2020-02-06 21:45 [RFC PATCH v2 0/7] s390x/vfio_ccw: Channel Path Handling [QEMU] Eric Farman
                   ` (3 preceding siblings ...)
  2020-02-06 21:45 ` [RFC PATCH v2 4/7] vfio-ccw: Add support for the schib region Eric Farman
@ 2020-02-06 21:45 ` Eric Farman
  2020-02-06 21:45 ` [RFC PATCH v2 6/7] vfio-ccw: Refactor ccw irq handler Eric Farman
  2020-02-06 21:45 ` [RFC PATCH v2 7/7] vfio-ccw: Add support for the CRW irq Eric Farman
  6 siblings, 0 replies; 18+ messages in thread
From: Eric Farman @ 2020-02-06 21:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason Herne, Eric Farman, qemu-devel, Halil Pasic, qemu-s390x,
	Jared Rossi

From: Farhan Ali <alifm@linux.ibm.com>

The crw region can be used to obtain information about
Channel Report Words (CRW) from vfio-ccw driver.

Currently only channel path related CRWs are passed to
QEMU from vfio-ccw driver.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---

Notes:
    v0->v1: [EF]
     - Fixed copy/paste error in error message (s/schib/CRW)

 hw/vfio/ccw.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index e31de3ffd7..d2408f3357 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -44,6 +44,9 @@ struct VFIOCCWDevice {
     uint64_t schib_region_size;
     uint64_t schib_region_offset;
     struct ccw_schib_region *schib_region;
+    uint64_t crw_region_size;
+    uint64_t crw_region_offset;
+    struct ccw_crw_region *crw_region;
     EventNotifier io_notifier;
     bool force_orb_pfch;
     bool warned_orb_pfch;
@@ -450,10 +453,24 @@ static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
         vcdev->schib_region = g_malloc(info->size);
     }
 
+    ret = vfio_get_dev_region_info(vdev, VFIO_REGION_TYPE_CCW,
+                                   VFIO_REGION_SUBTYPE_CCW_CRW, &info);
+
+    if (!ret) {
+        vcdev->crw_region_size = info->size;
+        if (sizeof(*vcdev->crw_region) != vcdev->crw_region_size) {
+            error_setg(errp, "vfio: Unexpected size of the CRW region");
+            goto out_err;
+        }
+        vcdev->crw_region_offset = info->offset;
+        vcdev->crw_region = g_malloc(info->size);
+    }
+
     g_free(info);
     return;
 
 out_err:
+    g_free(vcdev->crw_region);
     g_free(vcdev->schib_region);
     g_free(vcdev->async_cmd_region);
     g_free(vcdev->io_region);
@@ -463,6 +480,7 @@ out_err:
 
 static void vfio_ccw_put_region(VFIOCCWDevice *vcdev)
 {
+    g_free(vcdev->crw_region);
     g_free(vcdev->schib_region);
     g_free(vcdev->async_cmd_region);
     g_free(vcdev->io_region);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH v2 6/7] vfio-ccw: Refactor ccw irq handler
  2020-02-06 21:45 [RFC PATCH v2 0/7] s390x/vfio_ccw: Channel Path Handling [QEMU] Eric Farman
                   ` (4 preceding siblings ...)
  2020-02-06 21:45 ` [RFC PATCH v2 5/7] vfio-ccw: Add support for the crw region Eric Farman
@ 2020-02-06 21:45 ` Eric Farman
  2020-03-24 17:32   ` Cornelia Huck
  2020-02-06 21:45 ` [RFC PATCH v2 7/7] vfio-ccw: Add support for the CRW irq Eric Farman
  6 siblings, 1 reply; 18+ messages in thread
From: Eric Farman @ 2020-02-06 21:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason Herne, Eric Farman, qemu-devel, Halil Pasic, qemu-s390x,
	Jared Rossi

Make it easier to add new ones in the future.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
---

Notes:
    v1->v2:
     - Make irq parameter unsigned [CH]
     - Remove extraneous %m from error_report calls [CH]

 hw/vfio/ccw.c | 57 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 16 deletions(-)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index d2408f3357..044441a277 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -335,22 +335,36 @@ read_err:
     css_inject_io_interrupt(sch);
 }
 
-static void vfio_ccw_register_io_notifier(VFIOCCWDevice *vcdev, Error **errp)
+static void vfio_ccw_register_irq_notifier(VFIOCCWDevice *vcdev,
+                                           unsigned int irq,
+                                           Error **errp)
 {
     VFIODevice *vdev = &vcdev->vdev;
     struct vfio_irq_info *irq_info;
     size_t argsz;
     int fd;
+    EventNotifier *notifier;
+    IOHandler *fd_read;
+
+    switch (irq) {
+    case VFIO_CCW_IO_IRQ_INDEX:
+        notifier = &vcdev->io_notifier;
+        fd_read = vfio_ccw_io_notifier_handler;
+        break;
+    default:
+        error_setg(errp, "vfio: Unsupported device irq(%d)", irq);
+        return;
+    }
 
-    if (vdev->num_irqs < VFIO_CCW_IO_IRQ_INDEX + 1) {
-        error_setg(errp, "vfio: unexpected number of io irqs %u",
+    if (vdev->num_irqs < irq + 1) {
+        error_setg(errp, "vfio: unexpected number of irqs %u",
                    vdev->num_irqs);
         return;
     }
 
     argsz = sizeof(*irq_info);
     irq_info = g_malloc0(argsz);
-    irq_info->index = VFIO_CCW_IO_IRQ_INDEX;
+    irq_info->index = irq;
     irq_info->argsz = argsz;
     if (ioctl(vdev->fd, VFIO_DEVICE_GET_IRQ_INFO,
               irq_info) < 0 || irq_info->count < 1) {
@@ -358,37 +372,48 @@ static void vfio_ccw_register_io_notifier(VFIOCCWDevice *vcdev, Error **errp)
         goto out_free_info;
     }
 
-    if (event_notifier_init(&vcdev->io_notifier, 0)) {
+    if (event_notifier_init(notifier, 0)) {
         error_setg_errno(errp, errno,
-                         "vfio: Unable to init event notifier for IO");
+                         "vfio: Unable to init event notifier for irq (%d)", irq);
         goto out_free_info;
     }
 
-    fd = event_notifier_get_fd(&vcdev->io_notifier);
-    qemu_set_fd_handler(fd, vfio_ccw_io_notifier_handler, NULL, vcdev);
+    fd = event_notifier_get_fd(notifier);
+    qemu_set_fd_handler(fd, fd_read, NULL, vcdev);
 
-    if (vfio_set_irq_signaling(vdev, VFIO_CCW_IO_IRQ_INDEX, 0,
+    if (vfio_set_irq_signaling(vdev, irq, 0,
                                VFIO_IRQ_SET_ACTION_TRIGGER, fd, errp)) {
         qemu_set_fd_handler(fd, NULL, NULL, vcdev);
-        event_notifier_cleanup(&vcdev->io_notifier);
+        event_notifier_cleanup(notifier);
     }
 
 out_free_info:
     g_free(irq_info);
 }
 
-static void vfio_ccw_unregister_io_notifier(VFIOCCWDevice *vcdev)
+static void vfio_ccw_unregister_irq_notifier(VFIOCCWDevice *vcdev,
+                                             unsigned int irq)
 {
     Error *err = NULL;
+    EventNotifier *notifier;
+
+    switch (irq) {
+    case VFIO_CCW_IO_IRQ_INDEX:
+        notifier = &vcdev->io_notifier;
+        break;
+    default:
+        error_report("vfio: Unsupported device irq(%d)", irq);
+        return;
+    }
 
-    if (vfio_set_irq_signaling(&vcdev->vdev, VFIO_CCW_IO_IRQ_INDEX, 0,
+    if (vfio_set_irq_signaling(&vcdev->vdev, irq, 0,
                                VFIO_IRQ_SET_ACTION_TRIGGER, -1, &err)) {
         error_reportf_err(err, VFIO_MSG_PREFIX, vcdev->vdev.name);
     }
 
-    qemu_set_fd_handler(event_notifier_get_fd(&vcdev->io_notifier),
+    qemu_set_fd_handler(event_notifier_get_fd(notifier),
                         NULL, NULL, vcdev);
-    event_notifier_cleanup(&vcdev->io_notifier);
+    event_notifier_cleanup(notifier);
 }
 
 static void vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
@@ -591,7 +616,7 @@ static void vfio_ccw_realize(DeviceState *dev, Error **errp)
         goto out_region_err;
     }
 
-    vfio_ccw_register_io_notifier(vcdev, &err);
+    vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_IO_IRQ_INDEX, &err);
     if (err) {
         goto out_notifier_err;
     }
@@ -620,7 +645,7 @@ static void vfio_ccw_unrealize(DeviceState *dev, Error **errp)
     S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(cdev);
     VFIOGroup *group = vcdev->vdev.group;
 
-    vfio_ccw_unregister_io_notifier(vcdev);
+    vfio_ccw_unregister_irq_notifier(vcdev, VFIO_CCW_IO_IRQ_INDEX);
     vfio_ccw_put_region(vcdev);
     vfio_ccw_put_device(vcdev);
     vfio_put_group(group);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [RFC PATCH v2 7/7] vfio-ccw: Add support for the CRW irq
  2020-02-06 21:45 [RFC PATCH v2 0/7] s390x/vfio_ccw: Channel Path Handling [QEMU] Eric Farman
                   ` (5 preceding siblings ...)
  2020-02-06 21:45 ` [RFC PATCH v2 6/7] vfio-ccw: Refactor ccw irq handler Eric Farman
@ 2020-02-06 21:45 ` Eric Farman
  2020-04-06 16:22   ` Cornelia Huck
  6 siblings, 1 reply; 18+ messages in thread
From: Eric Farman @ 2020-02-06 21:45 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason Herne, Eric Farman, qemu-devel, Halil Pasic, qemu-s390x,
	Jared Rossi

From: Farhan Ali <alifm@linux.ibm.com>

The CRW irq will be used by vfio-ccw to notify the userspace
about any CRWs the userspace needs to handle. Let's add support
for it.

Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Eric Farman <farman@linux.ibm.com>
---

Notes:
    v1->v2:
     - Add a loop to continually read region while data is
       present, queueing CRWs as found [CH]
    v0->v1: [EF]
     - Check vcdev->crw_region before registering the irq,
       in case host kernel does not have matching support
     - Split the refactoring changes to an earlier (new) patch
       (and don't remove the "num_irqs" check in the register
       routine, but adjust it to the check the input variable)
     - Don't revert the cool vfio_set_irq_signaling() stuff
     - Unregister CRW IRQ before IO IRQ in unrealize
     - s/crw1/crw0/

 hw/vfio/ccw.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
index 044441a277..5e3d446213 100644
--- a/hw/vfio/ccw.c
+++ b/hw/vfio/ccw.c
@@ -48,6 +48,7 @@ struct VFIOCCWDevice {
     uint64_t crw_region_offset;
     struct ccw_crw_region *crw_region;
     EventNotifier io_notifier;
+    EventNotifier crw_notifier;
     bool force_orb_pfch;
     bool warned_orb_pfch;
 };
@@ -265,6 +266,40 @@ static void vfio_ccw_reset(DeviceState *dev)
     ioctl(vcdev->vdev.fd, VFIO_DEVICE_RESET);
 }
 
+static void vfio_ccw_crw_notifier_handler(void *opaque)
+{
+    VFIOCCWDevice *vcdev = opaque;
+    struct ccw_crw_region *region = vcdev->crw_region;
+    CRW crw;
+    int size;
+    uint8_t rsc, erc;
+
+    if (!event_notifier_test_and_clear(&vcdev->crw_notifier)) {
+        return;
+    }
+
+    do {
+        memset(region, 0, sizeof(*region));
+        size = pread(vcdev->vdev.fd, region, vcdev->crw_region_size,
+                     vcdev->crw_region_offset);
+
+        if (size == -1) {
+            error_report("vfio-ccw: Read crw region failed with errno=%d", errno);
+            break;
+        }
+
+        if (size == 0 || region->crw0 == 0) {
+            /* No more CRWs to queue */
+            break;
+        }
+
+        memcpy(&crw, &region->crw0, sizeof(CRW));
+        rsc = (crw.flags & 0x0f00) >> 8;
+        erc = crw.flags & 0x003f;
+        css_queue_crw(rsc, erc, 0, 0, crw.rsid);
+    } while (1);
+}
+
 static void vfio_ccw_io_notifier_handler(void *opaque)
 {
     VFIOCCWDevice *vcdev = opaque;
@@ -351,6 +386,10 @@ static void vfio_ccw_register_irq_notifier(VFIOCCWDevice *vcdev,
         notifier = &vcdev->io_notifier;
         fd_read = vfio_ccw_io_notifier_handler;
         break;
+    case VFIO_CCW_CRW_IRQ_INDEX:
+        notifier = &vcdev->crw_notifier;
+        fd_read = vfio_ccw_crw_notifier_handler;
+        break;
     default:
         error_setg(errp, "vfio: Unsupported device irq(%d)", irq);
         return;
@@ -401,6 +440,9 @@ static void vfio_ccw_unregister_irq_notifier(VFIOCCWDevice *vcdev,
     case VFIO_CCW_IO_IRQ_INDEX:
         notifier = &vcdev->io_notifier;
         break;
+    case VFIO_CCW_CRW_IRQ_INDEX:
+        notifier = &vcdev->crw_notifier;
+        break;
     default:
         error_report("vfio: Unsupported device irq(%d)", irq);
         return;
@@ -621,6 +663,14 @@ static void vfio_ccw_realize(DeviceState *dev, Error **errp)
         goto out_notifier_err;
     }
 
+    if (vcdev->crw_region) {
+        vfio_ccw_register_irq_notifier(vcdev, VFIO_CCW_CRW_IRQ_INDEX, &err);
+        if (err) {
+            vfio_ccw_unregister_irq_notifier(vcdev, VFIO_CCW_IO_IRQ_INDEX);
+            goto out_notifier_err;
+        }
+    }
+
     return;
 
 out_notifier_err:
@@ -645,6 +695,7 @@ static void vfio_ccw_unrealize(DeviceState *dev, Error **errp)
     S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(cdev);
     VFIOGroup *group = vcdev->vdev.group;
 
+    vfio_ccw_unregister_irq_notifier(vcdev, VFIO_CCW_CRW_IRQ_INDEX);
     vfio_ccw_unregister_irq_notifier(vcdev, VFIO_CCW_IO_IRQ_INDEX);
     vfio_ccw_put_region(vcdev);
     vfio_ccw_put_device(vcdev);
-- 
2.17.1



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO
  2020-02-06 21:45 ` [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO Eric Farman
@ 2020-03-24 17:04   ` Cornelia Huck
  2020-03-25  2:24     ` Halil Pasic
  0 siblings, 1 reply; 18+ messages in thread
From: Cornelia Huck @ 2020-03-24 17:04 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Jason Herne, qemu-s390x, Jared Rossi, qemu-devel

On Thu,  6 Feb 2020 22:45:03 +0100
Eric Farman <farman@linux.ibm.com> wrote:

> From: Farhan Ali <alifm@linux.ibm.com>
> 
> EIO is returned by vfio-ccw mediated device when the backing
> host subchannel is not operational anymore. So return cc=3
> back to the guest, rather than returning a unit check.
> This way the guest can take appropriate action such as
> issue an 'stsch'.
> 
> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
> 
> Notes:
>     v1->v2: [EF]
>      - Add s-o-b
>      - [Seems the discussion on v1 centered on the return code
>        set in the kernel, rather than anything that needs to
>        change here, unless I've missed something.]

I've stared at this and at the kernel code for some time again; and I'm
not sure if "return -EIO == not operational" is even true. That said,
I'm not sure a unit check is the right response, either. The only thing
I'm sure about is that the kernel code needs some review of return
codes and some documentation...

> 
>  hw/vfio/ccw.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> index 50cc2ec75c..19144ecfc7 100644
> --- a/hw/vfio/ccw.c
> +++ b/hw/vfio/ccw.c
> @@ -114,6 +114,7 @@ again:
>          return IOINST_CC_BUSY;
>      case -ENODEV:
>      case -EACCES:
> +    case -EIO:
>          return IOINST_CC_NOT_OPERATIONAL;
>      case -EFAULT:
>      default:



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 6/7] vfio-ccw: Refactor ccw irq handler
  2020-02-06 21:45 ` [RFC PATCH v2 6/7] vfio-ccw: Refactor ccw irq handler Eric Farman
@ 2020-03-24 17:32   ` Cornelia Huck
  0 siblings, 0 replies; 18+ messages in thread
From: Cornelia Huck @ 2020-03-24 17:32 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Jason Herne, qemu-s390x, Jared Rossi, qemu-devel

On Thu,  6 Feb 2020 22:45:08 +0100
Eric Farman <farman@linux.ibm.com> wrote:

> Make it easier to add new ones in the future.
> 
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
> 
> Notes:
>     v1->v2:
>      - Make irq parameter unsigned [CH]
>      - Remove extraneous %m from error_report calls [CH]
> 
>  hw/vfio/ccw.c | 57 ++++++++++++++++++++++++++++++++++++---------------
>  1 file changed, 41 insertions(+), 16 deletions(-)

Reviewed-by: Cornelia Huck <cohuck@redhat.com>



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO
  2020-03-24 17:04   ` Cornelia Huck
@ 2020-03-25  2:24     ` Halil Pasic
  2020-04-01  8:52       ` Cornelia Huck
  0 siblings, 1 reply; 18+ messages in thread
From: Halil Pasic @ 2020-03-25  2:24 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Jason Herne, Eric Farman, qemu-devel, qemu-s390x, Jared Rossi

On Tue, 24 Mar 2020 18:04:30 +0100
Cornelia Huck <cohuck@redhat.com> wrote:

> On Thu,  6 Feb 2020 22:45:03 +0100
> Eric Farman <farman@linux.ibm.com> wrote:
> 
> > From: Farhan Ali <alifm@linux.ibm.com>
> > 
> > EIO is returned by vfio-ccw mediated device when the backing
> > host subchannel is not operational anymore. So return cc=3
> > back to the guest, rather than returning a unit check.
> > This way the guest can take appropriate action such as
> > issue an 'stsch'.

I believe this is not the only situation when vfio-ccw returns
EIO, or?

> > 
> > Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > ---
> > 
> > Notes:
> >     v1->v2: [EF]
> >      - Add s-o-b
> >      - [Seems the discussion on v1 centered on the return code
> >        set in the kernel, rather than anything that needs to
> >        change here, unless I've missed something.]

Does this need to change here? If the kernel is supposed to return ENODEV
then this does not need to change.

> 
> I've stared at this and at the kernel code for some time again; and I'm
> not sure if "return -EIO == not operational" is even true. That said,
> I'm not sure a unit check is the right response, either. The only thing
> I'm sure about is that the kernel code needs some review of return
> codes and some documentation...

I could not agree more, this is semantically uapi and needs to be
properly documented.

With regards to "linux error codes: vs "ionist cc's" an where
the mapping is different example:

"""
/**                                                                             
 * cio_cancel_halt_clear - Cancel running I/O by performing cancel, halt        
 * and clear ordinally if subchannel is valid.                                  
 * @sch: subchannel on which to perform the cancel_halt_clear operation         
 * @iretry: the number of the times remained to retry the next operation        
 *                                                                              
 * This should be called repeatedly since halt/clear are asynchronous           
 * operations. We do one try with cio_cancel, three tries with cio_halt,        
 * 255 tries with cio_clear. The caller should initialize @iretry with          
 * the value 255 for its first call to this, and keep using the same            
 * @iretry in the subsequent calls until it gets a non -EBUSY return.           
 *                                                                              
 * Returns 0 if device now idle, -ENODEV for device not operational,            
 * -EBUSY if an interrupt is expected (either from halt/clear or from a         
 * status pending), and -EIO if out of retries.                                 
 */                                                                             
int cio_cancel_halt_clear(struct subchannel *sch, int *iretry)   

"""
Here -ENODEV is not operational.

Regards,
Halil

> 
> > 
> >  hw/vfio/ccw.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> > index 50cc2ec75c..19144ecfc7 100644
> > --- a/hw/vfio/ccw.c
> > +++ b/hw/vfio/ccw.c
> > @@ -114,6 +114,7 @@ again:
> >          return IOINST_CC_BUSY;
> >      case -ENODEV:
> >      case -EACCES:
> > +    case -EIO:
> >          return IOINST_CC_NOT_OPERATIONAL;
> >      case -EFAULT:
> >      default:
> 
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO
  2020-03-25  2:24     ` Halil Pasic
@ 2020-04-01  8:52       ` Cornelia Huck
  2020-04-06 18:21         ` Eric Farman
  0 siblings, 1 reply; 18+ messages in thread
From: Cornelia Huck @ 2020-04-01  8:52 UTC (permalink / raw)
  To: Halil Pasic; +Cc: Jason Herne, Eric Farman, qemu-devel, qemu-s390x, Jared Rossi

On Wed, 25 Mar 2020 03:24:28 +0100
Halil Pasic <pasic@linux.ibm.com> wrote:

> On Tue, 24 Mar 2020 18:04:30 +0100
> Cornelia Huck <cohuck@redhat.com> wrote:
> 
> > On Thu,  6 Feb 2020 22:45:03 +0100
> > Eric Farman <farman@linux.ibm.com> wrote:
> >   
> > > From: Farhan Ali <alifm@linux.ibm.com>
> > > 
> > > EIO is returned by vfio-ccw mediated device when the backing
> > > host subchannel is not operational anymore. So return cc=3
> > > back to the guest, rather than returning a unit check.
> > > This way the guest can take appropriate action such as
> > > issue an 'stsch'.  
> 
> I believe this is not the only situation when vfio-ccw returns
> EIO, or?
> 
> > > 
> > > Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> > > Signed-off-by: Eric Farman <farman@linux.ibm.com>
> > > ---
> > > 
> > > Notes:
> > >     v1->v2: [EF]
> > >      - Add s-o-b
> > >      - [Seems the discussion on v1 centered on the return code
> > >        set in the kernel, rather than anything that needs to
> > >        change here, unless I've missed something.]  
> 
> Does this need to change here? If the kernel is supposed to return ENODEV
> then this does not need to change.
> 
> > 
> > I've stared at this and at the kernel code for some time again; and I'm
> > not sure if "return -EIO == not operational" is even true. That said,
> > I'm not sure a unit check is the right response, either. The only thing
> > I'm sure about is that the kernel code needs some review of return
> > codes and some documentation...  
> 
> I could not agree more, this is semantically uapi and needs to be
> properly documented.
> 
> With regards to "linux error codes: vs "ionist cc's" an where
> the mapping is different example:
> 
> """
> /**                                                                             
>  * cio_cancel_halt_clear - Cancel running I/O by performing cancel, halt        
>  * and clear ordinally if subchannel is valid.                                  
>  * @sch: subchannel on which to perform the cancel_halt_clear operation         
>  * @iretry: the number of the times remained to retry the next operation        
>  *                                                                              
>  * This should be called repeatedly since halt/clear are asynchronous           
>  * operations. We do one try with cio_cancel, three tries with cio_halt,        
>  * 255 tries with cio_clear. The caller should initialize @iretry with          
>  * the value 255 for its first call to this, and keep using the same            
>  * @iretry in the subsequent calls until it gets a non -EBUSY return.           
>  *                                                                              
>  * Returns 0 if device now idle, -ENODEV for device not operational,            
>  * -EBUSY if an interrupt is expected (either from halt/clear or from a         
>  * status pending), and -EIO if out of retries.                                 
>  */                                                                             
> int cio_cancel_halt_clear(struct subchannel *sch, int *iretry)   
> 
> """
> Here -ENODEV is not operational.

Ok, I went through the kernel code and checked which error may be
returned in which case (hope I caught all of them). Here's what I
currently have:

diff --git a/Documentation/s390/vfio-ccw.rst b/Documentation/s390/vfio-ccw.rst
index fca9c4f5bd9c..43f375a03cce 100644
--- a/Documentation/s390/vfio-ccw.rst
+++ b/Documentation/s390/vfio-ccw.rst
@@ -210,7 +210,36 @@ Subchannel.
 
 irb_area stores the I/O result.
 
-ret_code stores a return code for each access of the region.
+ret_code stores a return code for each access of the region. The following
+values may occur:
+
+0
+  The operation was successful.
+
+-EOPNOTSUPP
+  The orb specified transport mode or an unidentified IDAW format, did not
+  specify prefetch mode, or the scsw specified a function other than the
+  start function.
+
+-EIO
+  A request was issued while the device was not in a state ready to accept
+  requests, or an internal error occurred.
+
+-EBUSY
+  The subchannel was status pending or busy, or a request is already active.
+
+-EAGAIN
+  A request was being processed, and the caller should retry.
+
+-EACCES
+  The channel path(s) used for the I/O were found to be not operational.
+
+-ENODEV
+  The device was found to be not operational.
+
+-EINVAL
+  The orb specified a chain longer than 255 ccws, or an internal error
+  occurred.
 
 This region is always available.
 
@@ -231,6 +260,29 @@ This region is exposed via region type VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD.
 
 Currently, CLEAR SUBCHANNEL and HALT SUBCHANNEL use this region.
 
+command specifies the command to be issued; ret_code stores a return code
+for each access of the region. The following values may occur:
+
+0
+  The operation was successful.
+
+-ENODEV
+  The device was found to be not operational.
+
+-EINVAL
+  A command other than halt or clear was specified.
+
+-EIO
+  A request was issued while the device was not in a state ready to accept
+  requests.
+
+-EAGAIN
+  A request was being processed, and the caller should retry.
+
+-EBUSY
+  The subchannel was status pending or busy while processing a halt request.
+
+
 vfio-ccw operation details
 --------------------------
 
Unless we interpret "device in wrong state" as "not operational",
mapping -EIO to cc 3 is probably not the right thing to do; but
generating a unit exception probably isn't either. Honestly, I'm unsure
what the right thing to do here would be...

Another problem is that -EIO might signal "something went wrong in the
kernel code" - should not happen, but would certainly not be the
caller's fault, and they can't do anything about it. That "internal
error" thing might also be signaled by -EINVAL (which is odd), but
-EINVAL could also mean "the ccw chain is too long", for which
-EOPNOTSUPP would probably be a better return code, as it's a
limitation in the code, not the architecture, IIRC. Not sure if we can
still change that, though (and QEMU handles both in the same way,
anyway.)

The other return codes look sane, and the return codes for the async
region also seem sane (but have the same issue with -EIO == "device in
wrong state").

Looking at the QEMU handling, I think the -EIO handling is a bit
questionable (without an obvious solution), while mapping -EBUSY to cc
2 is the only reasonable thing to do given that the interface does not
differentiate between "busy" and "status pending". The rest seems sane.



^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 7/7] vfio-ccw: Add support for the CRW irq
  2020-02-06 21:45 ` [RFC PATCH v2 7/7] vfio-ccw: Add support for the CRW irq Eric Farman
@ 2020-04-06 16:22   ` Cornelia Huck
  2020-04-06 21:37     ` Eric Farman
  0 siblings, 1 reply; 18+ messages in thread
From: Cornelia Huck @ 2020-04-06 16:22 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Jason Herne, qemu-s390x, Jared Rossi, qemu-devel

On Thu,  6 Feb 2020 22:45:09 +0100
Eric Farman <farman@linux.ibm.com> wrote:

> From: Farhan Ali <alifm@linux.ibm.com>
> 
> The CRW irq will be used by vfio-ccw to notify the userspace
> about any CRWs the userspace needs to handle. Let's add support
> for it.
> 
> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> ---
> 
> Notes:
>     v1->v2:
>      - Add a loop to continually read region while data is
>        present, queueing CRWs as found [CH]
>     v0->v1: [EF]
>      - Check vcdev->crw_region before registering the irq,
>        in case host kernel does not have matching support
>      - Split the refactoring changes to an earlier (new) patch
>        (and don't remove the "num_irqs" check in the register
>        routine, but adjust it to the check the input variable)
>      - Don't revert the cool vfio_set_irq_signaling() stuff
>      - Unregister CRW IRQ before IO IRQ in unrealize
>      - s/crw1/crw0/
> 
>  hw/vfio/ccw.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 51 insertions(+)
> 

> @@ -265,6 +266,40 @@ static void vfio_ccw_reset(DeviceState *dev)
>      ioctl(vcdev->vdev.fd, VFIO_DEVICE_RESET);
>  }
>  
> +static void vfio_ccw_crw_notifier_handler(void *opaque)
> +{
> +    VFIOCCWDevice *vcdev = opaque;
> +    struct ccw_crw_region *region = vcdev->crw_region;
> +    CRW crw;
> +    int size;
> +    uint8_t rsc, erc;
> +
> +    if (!event_notifier_test_and_clear(&vcdev->crw_notifier)) {
> +        return;
> +    }
> +
> +    do {
> +        memset(region, 0, sizeof(*region));
> +        size = pread(vcdev->vdev.fd, region, vcdev->crw_region_size,
> +                     vcdev->crw_region_offset);
> +
> +        if (size == -1) {
> +            error_report("vfio-ccw: Read crw region failed with errno=%d", errno);
> +            break;
> +        }
> +
> +        if (size == 0 || region->crw0 == 0) {

Does it make any sense to expect both of them as an indication that
there are no more crws at the moment? Grabbing a zeroed crw makes the
most sense as a stop condition, I think.

Also, I'm not sure anymore whether having space for two crws makes too
much sense. If we have a case in the future where we get two chained
crws, the code will retry anyway and just fetch the chained crw and
queue it, wouldn't it?

> +            /* No more CRWs to queue */
> +            break;
> +        }
> +
> +        memcpy(&crw, &region->crw0, sizeof(CRW));
> +        rsc = (crw.flags & 0x0f00) >> 8;
> +        erc = crw.flags & 0x003f;

I think we already have something for that... ah yes,
CRW_FLAGS_MASK_RSC and CRW_FLAGS_MASK_ERC.

> +        css_queue_crw(rsc, erc, 0, 0, crw.rsid);

...or maybe an alternative interface that allows us to queue a
ready-made crw?

> +    } while (1);
> +}
> +
>  static void vfio_ccw_io_notifier_handler(void *opaque)
>  {
>      VFIOCCWDevice *vcdev = opaque;



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO
  2020-04-01  8:52       ` Cornelia Huck
@ 2020-04-06 18:21         ` Eric Farman
  2020-04-07  6:28           ` Cornelia Huck
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Farman @ 2020-04-06 18:21 UTC (permalink / raw)
  To: Cornelia Huck, Halil Pasic
  Cc: Jason Herne, qemu-s390x, Jared Rossi, qemu-devel



On 4/1/20 4:52 AM, Cornelia Huck wrote:
> On Wed, 25 Mar 2020 03:24:28 +0100
> Halil Pasic <pasic@linux.ibm.com> wrote:
> 
>> On Tue, 24 Mar 2020 18:04:30 +0100
>> Cornelia Huck <cohuck@redhat.com> wrote:
>>
>>> On Thu,  6 Feb 2020 22:45:03 +0100
>>> Eric Farman <farman@linux.ibm.com> wrote:
>>>   
>>>> From: Farhan Ali <alifm@linux.ibm.com>
>>>>
>>>> EIO is returned by vfio-ccw mediated device when the backing
>>>> host subchannel is not operational anymore. So return cc=3
>>>> back to the guest, rather than returning a unit check.
>>>> This way the guest can take appropriate action such as
>>>> issue an 'stsch'.  
>>
>> I believe this is not the only situation when vfio-ccw returns
>> EIO, or?
>>
>>>>
>>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>>>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
>>>> ---
>>>>
>>>> Notes:
>>>>     v1->v2: [EF]
>>>>      - Add s-o-b
>>>>      - [Seems the discussion on v1 centered on the return code
>>>>        set in the kernel, rather than anything that needs to
>>>>        change here, unless I've missed something.]  
>>
>> Does this need to change here? If the kernel is supposed to return ENODEV
>> then this does not need to change.
>>
>>>
>>> I've stared at this and at the kernel code for some time again; and I'm
>>> not sure if "return -EIO == not operational" is even true. That said,
>>> I'm not sure a unit check is the right response, either. The only thing
>>> I'm sure about is that the kernel code needs some review of return
>>> codes and some documentation...  
>>
>> I could not agree more, this is semantically uapi and needs to be
>> properly documented.
>>
>> With regards to "linux error codes: vs "ionist cc's" an where
>> the mapping is different example:
>>
>> """
>> /**                                                                             
>>  * cio_cancel_halt_clear - Cancel running I/O by performing cancel, halt        
>>  * and clear ordinally if subchannel is valid.                                  
>>  * @sch: subchannel on which to perform the cancel_halt_clear operation         
>>  * @iretry: the number of the times remained to retry the next operation        
>>  *                                                                              
>>  * This should be called repeatedly since halt/clear are asynchronous           
>>  * operations. We do one try with cio_cancel, three tries with cio_halt,        
>>  * 255 tries with cio_clear. The caller should initialize @iretry with          
>>  * the value 255 for its first call to this, and keep using the same            
>>  * @iretry in the subsequent calls until it gets a non -EBUSY return.           
>>  *                                                                              
>>  * Returns 0 if device now idle, -ENODEV for device not operational,            
>>  * -EBUSY if an interrupt is expected (either from halt/clear or from a         
>>  * status pending), and -EIO if out of retries.                                 
>>  */                                                                             
>> int cio_cancel_halt_clear(struct subchannel *sch, int *iretry)   
>>
>> """
>> Here -ENODEV is not operational.
> 
> Ok, I went through the kernel code and checked which error may be
> returned in which case (hope I caught all of them). Here's what I
> currently have:

Thanks for doing all this mapping!

> 
> diff --git a/Documentation/s390/vfio-ccw.rst b/Documentation/s390/vfio-ccw.rst
> index fca9c4f5bd9c..43f375a03cce 100644
> --- a/Documentation/s390/vfio-ccw.rst
> +++ b/Documentation/s390/vfio-ccw.rst
> @@ -210,7 +210,36 @@ Subchannel.
>  
>  irb_area stores the I/O result.
>  
> -ret_code stores a return code for each access of the region.
> +ret_code stores a return code for each access of the region. The following
> +values may occur:
> +
> +0
> +  The operation was successful.
> +
> +-EOPNOTSUPP
> +  The orb specified transport mode or an unidentified IDAW format, did not
> +  specify prefetch mode, or the scsw specified a function other than the
> +  start function.
> +
> +-EIO
> +  A request was issued while the device was not in a state ready to accept
> +  requests, or an internal error occurred.
> +
> +-EBUSY
> +  The subchannel was status pending or busy, or a request is already active.
> +
> +-EAGAIN
> +  A request was being processed, and the caller should retry.
> +
> +-EACCES
> +  The channel path(s) used for the I/O were found to be not operational.
> +
> +-ENODEV
> +  The device was found to be not operational.
> +
> +-EINVAL
> +  The orb specified a chain longer than 255 ccws, or an internal error
> +  occurred.
>  
>  This region is always available.
>  
> @@ -231,6 +260,29 @@ This region is exposed via region type VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD.
>  
>  Currently, CLEAR SUBCHANNEL and HALT SUBCHANNEL use this region.
>  
> +command specifies the command to be issued; ret_code stores a return code
> +for each access of the region. The following values may occur:
> +
> +0
> +  The operation was successful.
> +
> +-ENODEV
> +  The device was found to be not operational.
> +
> +-EINVAL
> +  A command other than halt or clear was specified.
> +
> +-EIO
> +  A request was issued while the device was not in a state ready to accept
> +  requests.
> +
> +-EAGAIN
> +  A request was being processed, and the caller should retry.
> +
> +-EBUSY
> +  The subchannel was status pending or busy while processing a halt request.
> +
> +
>  vfio-ccw operation details
>  --------------------------
>  
> Unless we interpret "device in wrong state" as "not operational",
> mapping -EIO to cc 3 is probably not the right thing to do; but
> generating a unit exception probably isn't either. Honestly, I'm unsure
> what the right thing to do here would be...

Me neither. I grepped my qemu logs for the past, ugh, too long for the
error report that would precede these failures ("vfio-ccw: [wirte/write]
I/O region failed with errno=%d"). Excluding the ones that were
obviously the result of half-baked code, all instances were either
-EBUSY or -ENODEV.  Could I trigger one?  Maybe.  Is it likely?  Doesn't
seem to be.

It seems that getting -EIO would indicate we got ourselves out of sync,
and started buttoning up the device again (or never having opened it in
the first place), so a unit exception might be valid as a "hey
something's screwy here" ?

> 
> Another problem is that -EIO might signal "something went wrong in the
> kernel code" - should not happen, but would certainly not be the
> caller's fault, and they can't do anything about it. That "internal
> error" thing might also be signaled by -EINVAL (which is odd), but
> -EINVAL could also mean "the ccw chain is too long", for which
> -EOPNOTSUPP would probably be a better return code, as it's a
> limitation in the code, not the architecture, IIRC. Not sure if we can
> still change that, though (and QEMU handles both in the same way,
> anyway.)
> 
> The other return codes look sane, and the return codes for the async
> region also seem sane (but have the same issue with -EIO == "device in
> wrong state").
> 
> Looking at the QEMU handling, I think the -EIO handling is a bit
> questionable (without an obvious solution), while mapping -EBUSY to cc
> 2 is the only reasonable thing to do given that the interface does not
> differentiate between "busy" and "status pending". The rest seems sane.
> 

So maybe with all this data, and absent a better solution, it's best to
just drop this patch from v3?


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 7/7] vfio-ccw: Add support for the CRW irq
  2020-04-06 16:22   ` Cornelia Huck
@ 2020-04-06 21:37     ` Eric Farman
  2020-04-07  6:35       ` Cornelia Huck
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Farman @ 2020-04-06 21:37 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Halil Pasic, Jason Herne, qemu-s390x, Jared Rossi, qemu-devel



On 4/6/20 12:22 PM, Cornelia Huck wrote:
> On Thu,  6 Feb 2020 22:45:09 +0100
> Eric Farman <farman@linux.ibm.com> wrote:
> 
>> From: Farhan Ali <alifm@linux.ibm.com>
>>
>> The CRW irq will be used by vfio-ccw to notify the userspace
>> about any CRWs the userspace needs to handle. Let's add support
>> for it.
>>
>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
>> ---
>>
>> Notes:
>>     v1->v2:
>>      - Add a loop to continually read region while data is
>>        present, queueing CRWs as found [CH]
>>     v0->v1: [EF]
>>      - Check vcdev->crw_region before registering the irq,
>>        in case host kernel does not have matching support
>>      - Split the refactoring changes to an earlier (new) patch
>>        (and don't remove the "num_irqs" check in the register
>>        routine, but adjust it to the check the input variable)
>>      - Don't revert the cool vfio_set_irq_signaling() stuff
>>      - Unregister CRW IRQ before IO IRQ in unrealize
>>      - s/crw1/crw0/
>>
>>  hw/vfio/ccw.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 51 insertions(+)
>>
> 
>> @@ -265,6 +266,40 @@ static void vfio_ccw_reset(DeviceState *dev)
>>      ioctl(vcdev->vdev.fd, VFIO_DEVICE_RESET);
>>  }
>>  
>> +static void vfio_ccw_crw_notifier_handler(void *opaque)
>> +{
>> +    VFIOCCWDevice *vcdev = opaque;
>> +    struct ccw_crw_region *region = vcdev->crw_region;
>> +    CRW crw;
>> +    int size;
>> +    uint8_t rsc, erc;
>> +
>> +    if (!event_notifier_test_and_clear(&vcdev->crw_notifier)) {
>> +        return;
>> +    }
>> +
>> +    do {
>> +        memset(region, 0, sizeof(*region));
>> +        size = pread(vcdev->vdev.fd, region, vcdev->crw_region_size,
>> +                     vcdev->crw_region_offset);
>> +
>> +        if (size == -1) {
>> +            error_report("vfio-ccw: Read crw region failed with errno=%d", errno);
>> +            break;
>> +        }
>> +
>> +        if (size == 0 || region->crw0 == 0) {
> 
> Does it make any sense to expect both of them as an indication that
> there are no more crws at the moment? Grabbing a zeroed crw makes the
> most sense as a stop condition, I think.

I think it was overkill on my part.  Though it appears I am missing the
"zeroing" of the region once it got read.  Whoops.  Okay, those are easy
fixups.

> 
> Also, I'm not sure anymore whether having space for two crws makes too
> much sense. If we have a case in the future where we get two chained
> crws, the code will retry anyway and just fetch the chained crw and
> queue it, wouldn't it?

I suppose.

I thought the reason for including them now was to avoid "if region size
== 4 vs 8 vs XX" logic at some mysterious time in the future.  But
certainly ripping it out so we only pass a single CRW at a time would
simplify this quite a bit.

> 
>> +            /* No more CRWs to queue */
>> +            break;
>> +        }
>> +
>> +        memcpy(&crw, &region->crw0, sizeof(CRW));
>> +        rsc = (crw.flags & 0x0f00) >> 8;
>> +        erc = crw.flags & 0x003f;
> 
> I think we already have something for that... ah yes,
> CRW_FLAGS_MASK_RSC and CRW_FLAGS_MASK_ERC.

Huh, look at that.  :)

> 
>> +        css_queue_crw(rsc, erc, 0, 0, crw.rsid);
> 
> ...or maybe an alternative interface that allows us to queue a
> ready-made crw?

Sure, that would be nice.  I'll add that as an additional patch to this
series, prior to this one.

> 
>> +    } while (1);
>> +}
>> +
>>  static void vfio_ccw_io_notifier_handler(void *opaque)
>>  {
>>      VFIOCCWDevice *vcdev = opaque;
> 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO
  2020-04-06 18:21         ` Eric Farman
@ 2020-04-07  6:28           ` Cornelia Huck
  2020-04-07 10:18             ` Eric Farman
  0 siblings, 1 reply; 18+ messages in thread
From: Cornelia Huck @ 2020-04-07  6:28 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Jason Herne, qemu-s390x, Jared Rossi, qemu-devel

On Mon, 6 Apr 2020 14:21:17 -0400
Eric Farman <farman@linux.ibm.com> wrote:

> On 4/1/20 4:52 AM, Cornelia Huck wrote:
> > On Wed, 25 Mar 2020 03:24:28 +0100
> > Halil Pasic <pasic@linux.ibm.com> wrote:
> >   
> >> On Tue, 24 Mar 2020 18:04:30 +0100
> >> Cornelia Huck <cohuck@redhat.com> wrote:
> >>  
> >>> On Thu,  6 Feb 2020 22:45:03 +0100
> >>> Eric Farman <farman@linux.ibm.com> wrote:
> >>>     
> >>>> From: Farhan Ali <alifm@linux.ibm.com>
> >>>>
> >>>> EIO is returned by vfio-ccw mediated device when the backing
> >>>> host subchannel is not operational anymore. So return cc=3
> >>>> back to the guest, rather than returning a unit check.
> >>>> This way the guest can take appropriate action such as
> >>>> issue an 'stsch'.    
> >>
> >> I believe this is not the only situation when vfio-ccw returns
> >> EIO, or?
> >>  
> >>>>
> >>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> >>>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> >>>> ---
> >>>>
> >>>> Notes:
> >>>>     v1->v2: [EF]
> >>>>      - Add s-o-b
> >>>>      - [Seems the discussion on v1 centered on the return code
> >>>>        set in the kernel, rather than anything that needs to
> >>>>        change here, unless I've missed something.]    
> >>
> >> Does this need to change here? If the kernel is supposed to return ENODEV
> >> then this does not need to change.
> >>  
> >>>
> >>> I've stared at this and at the kernel code for some time again; and I'm
> >>> not sure if "return -EIO == not operational" is even true. That said,
> >>> I'm not sure a unit check is the right response, either. The only thing
> >>> I'm sure about is that the kernel code needs some review of return
> >>> codes and some documentation...    
> >>
> >> I could not agree more, this is semantically uapi and needs to be
> >> properly documented.
> >>
> >> With regards to "linux error codes: vs "ionist cc's" an where
> >> the mapping is different example:
> >>
> >> """
> >> /**                                                                             
> >>  * cio_cancel_halt_clear - Cancel running I/O by performing cancel, halt        
> >>  * and clear ordinally if subchannel is valid.                                  
> >>  * @sch: subchannel on which to perform the cancel_halt_clear operation         
> >>  * @iretry: the number of the times remained to retry the next operation        
> >>  *                                                                              
> >>  * This should be called repeatedly since halt/clear are asynchronous           
> >>  * operations. We do one try with cio_cancel, three tries with cio_halt,        
> >>  * 255 tries with cio_clear. The caller should initialize @iretry with          
> >>  * the value 255 for its first call to this, and keep using the same            
> >>  * @iretry in the subsequent calls until it gets a non -EBUSY return.           
> >>  *                                                                              
> >>  * Returns 0 if device now idle, -ENODEV for device not operational,            
> >>  * -EBUSY if an interrupt is expected (either from halt/clear or from a         
> >>  * status pending), and -EIO if out of retries.                                 
> >>  */                                                                             
> >> int cio_cancel_halt_clear(struct subchannel *sch, int *iretry)   
> >>
> >> """
> >> Here -ENODEV is not operational.  
> > 
> > Ok, I went through the kernel code and checked which error may be
> > returned in which case (hope I caught all of them). Here's what I
> > currently have:  
> 
> Thanks for doing all this mapping!
> 
> > 
> > diff --git a/Documentation/s390/vfio-ccw.rst b/Documentation/s390/vfio-ccw.rst
> > index fca9c4f5bd9c..43f375a03cce 100644
> > --- a/Documentation/s390/vfio-ccw.rst
> > +++ b/Documentation/s390/vfio-ccw.rst
> > @@ -210,7 +210,36 @@ Subchannel.
> >  
> >  irb_area stores the I/O result.
> >  
> > -ret_code stores a return code for each access of the region.
> > +ret_code stores a return code for each access of the region. The following
> > +values may occur:
> > +
> > +0
> > +  The operation was successful.
> > +
> > +-EOPNOTSUPP
> > +  The orb specified transport mode or an unidentified IDAW format, did not
> > +  specify prefetch mode, or the scsw specified a function other than the
> > +  start function.
> > +
> > +-EIO
> > +  A request was issued while the device was not in a state ready to accept
> > +  requests, or an internal error occurred.
> > +
> > +-EBUSY
> > +  The subchannel was status pending or busy, or a request is already active.
> > +
> > +-EAGAIN
> > +  A request was being processed, and the caller should retry.
> > +
> > +-EACCES
> > +  The channel path(s) used for the I/O were found to be not operational.
> > +
> > +-ENODEV
> > +  The device was found to be not operational.
> > +
> > +-EINVAL
> > +  The orb specified a chain longer than 255 ccws, or an internal error
> > +  occurred.
> >  
> >  This region is always available.
> >  
> > @@ -231,6 +260,29 @@ This region is exposed via region type VFIO_REGION_SUBTYPE_CCW_ASYNC_CMD.
> >  
> >  Currently, CLEAR SUBCHANNEL and HALT SUBCHANNEL use this region.
> >  
> > +command specifies the command to be issued; ret_code stores a return code
> > +for each access of the region. The following values may occur:
> > +
> > +0
> > +  The operation was successful.
> > +
> > +-ENODEV
> > +  The device was found to be not operational.
> > +
> > +-EINVAL
> > +  A command other than halt or clear was specified.
> > +
> > +-EIO
> > +  A request was issued while the device was not in a state ready to accept
> > +  requests.
> > +
> > +-EAGAIN
> > +  A request was being processed, and the caller should retry.
> > +
> > +-EBUSY
> > +  The subchannel was status pending or busy while processing a halt request.
> > +
> > +
> >  vfio-ccw operation details
> >  --------------------------
> >  
> > Unless we interpret "device in wrong state" as "not operational",
> > mapping -EIO to cc 3 is probably not the right thing to do; but
> > generating a unit exception probably isn't either. Honestly, I'm unsure
> > what the right thing to do here would be...  
> 
> Me neither. I grepped my qemu logs for the past, ugh, too long for the
> error report that would precede these failures ("vfio-ccw: [wirte/write]
> I/O region failed with errno=%d"). Excluding the ones that were
> obviously the result of half-baked code, all instances were either
> -EBUSY or -ENODEV.  Could I trigger one?  Maybe.  Is it likely?  Doesn't
> seem to be.
> 
> It seems that getting -EIO would indicate we got ourselves out of sync,
> and started buttoning up the device again (or never having opened it in
> the first place), so a unit exception might be valid as a "hey
> something's screwy here" ?

I guess we have to report *some* kind of failure, and an unit exception
is perhaps the least weird way to do so. So probably be best to just
leave it alone, as it seems near impossible to trigger anyway.

> 
> > 
> > Another problem is that -EIO might signal "something went wrong in the
> > kernel code" - should not happen, but would certainly not be the
> > caller's fault, and they can't do anything about it. That "internal
> > error" thing might also be signaled by -EINVAL (which is odd), but
> > -EINVAL could also mean "the ccw chain is too long", for which
> > -EOPNOTSUPP would probably be a better return code, as it's a
> > limitation in the code, not the architecture, IIRC. Not sure if we can
> > still change that, though (and QEMU handles both in the same way,
> > anyway.)
> > 
> > The other return codes look sane, and the return codes for the async
> > region also seem sane (but have the same issue with -EIO == "device in
> > wrong state").
> > 
> > Looking at the QEMU handling, I think the -EIO handling is a bit
> > questionable (without an obvious solution), while mapping -EBUSY to cc
> > 2 is the only reasonable thing to do given that the interface does not
> > differentiate between "busy" and "status pending". The rest seems sane.
> >   
> 
> So maybe with all this data, and absent a better solution, it's best to
> just drop this patch from v3?

Yes, I agree. I'll post the documentation update as a separate patch.



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 7/7] vfio-ccw: Add support for the CRW irq
  2020-04-06 21:37     ` Eric Farman
@ 2020-04-07  6:35       ` Cornelia Huck
  0 siblings, 0 replies; 18+ messages in thread
From: Cornelia Huck @ 2020-04-07  6:35 UTC (permalink / raw)
  To: Eric Farman; +Cc: Halil Pasic, Jason Herne, qemu-s390x, Jared Rossi, qemu-devel

On Mon, 6 Apr 2020 17:37:18 -0400
Eric Farman <farman@linux.ibm.com> wrote:

> On 4/6/20 12:22 PM, Cornelia Huck wrote:
> > On Thu,  6 Feb 2020 22:45:09 +0100
> > Eric Farman <farman@linux.ibm.com> wrote:
> >   
> >> From: Farhan Ali <alifm@linux.ibm.com>
> >>
> >> The CRW irq will be used by vfio-ccw to notify the userspace
> >> about any CRWs the userspace needs to handle. Let's add support
> >> for it.
> >>
> >> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> >> Signed-off-by: Eric Farman <farman@linux.ibm.com>
> >> ---
> >>
> >> Notes:
> >>     v1->v2:
> >>      - Add a loop to continually read region while data is
> >>        present, queueing CRWs as found [CH]
> >>     v0->v1: [EF]
> >>      - Check vcdev->crw_region before registering the irq,
> >>        in case host kernel does not have matching support
> >>      - Split the refactoring changes to an earlier (new) patch
> >>        (and don't remove the "num_irqs" check in the register
> >>        routine, but adjust it to the check the input variable)
> >>      - Don't revert the cool vfio_set_irq_signaling() stuff
> >>      - Unregister CRW IRQ before IO IRQ in unrealize
> >>      - s/crw1/crw0/
> >>
> >>  hw/vfio/ccw.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 51 insertions(+)
> >>  
> >   
> >> @@ -265,6 +266,40 @@ static void vfio_ccw_reset(DeviceState *dev)
> >>      ioctl(vcdev->vdev.fd, VFIO_DEVICE_RESET);
> >>  }
> >>  
> >> +static void vfio_ccw_crw_notifier_handler(void *opaque)
> >> +{
> >> +    VFIOCCWDevice *vcdev = opaque;
> >> +    struct ccw_crw_region *region = vcdev->crw_region;
> >> +    CRW crw;
> >> +    int size;
> >> +    uint8_t rsc, erc;
> >> +
> >> +    if (!event_notifier_test_and_clear(&vcdev->crw_notifier)) {
> >> +        return;
> >> +    }
> >> +
> >> +    do {
> >> +        memset(region, 0, sizeof(*region));
> >> +        size = pread(vcdev->vdev.fd, region, vcdev->crw_region_size,
> >> +                     vcdev->crw_region_offset);
> >> +
> >> +        if (size == -1) {
> >> +            error_report("vfio-ccw: Read crw region failed with errno=%d", errno);
> >> +            break;
> >> +        }
> >> +
> >> +        if (size == 0 || region->crw0 == 0) {  
> > 
> > Does it make any sense to expect both of them as an indication that
> > there are no more crws at the moment? Grabbing a zeroed crw makes the
> > most sense as a stop condition, I think.  
> 
> I think it was overkill on my part.  Though it appears I am missing the
> "zeroing" of the region once it got read.  Whoops.  Okay, those are easy
> fixups.

Yes, just looking at the zeroed region (after changing the kernel part)
seems like the right thing here.

> 
> > 
> > Also, I'm not sure anymore whether having space for two crws makes too
> > much sense. If we have a case in the future where we get two chained
> > crws, the code will retry anyway and just fetch the chained crw and
> > queue it, wouldn't it?  
> 
> I suppose.
> 
> I thought the reason for including them now was to avoid "if region size
> == 4 vs 8 vs XX" logic at some mysterious time in the future.  But
> certainly ripping it out so we only pass a single CRW at a time would
> simplify this quite a bit.

Yes, injecting in a loop is easy anyway.

> 
> >   
> >> +            /* No more CRWs to queue */
> >> +            break;
> >> +        }
> >> +
> >> +        memcpy(&crw, &region->crw0, sizeof(CRW));
> >> +        rsc = (crw.flags & 0x0f00) >> 8;
> >> +        erc = crw.flags & 0x003f;  
> > 
> > I think we already have something for that... ah yes,
> > CRW_FLAGS_MASK_RSC and CRW_FLAGS_MASK_ERC.  
> 
> Huh, look at that.  :)
> 
> >   
> >> +        css_queue_crw(rsc, erc, 0, 0, crw.rsid);  
> > 
> > ...or maybe an alternative interface that allows us to queue a
> > ready-made crw?  
> 
> Sure, that would be nice.  I'll add that as an additional patch to this
> series, prior to this one.

Agreed, makes sense.

> 
> >   
> >> +    } while (1);
> >> +}
> >> +
> >>  static void vfio_ccw_io_notifier_handler(void *opaque)
> >>  {
> >>      VFIOCCWDevice *vcdev = opaque;  
> >   
> 



^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO
  2020-04-07  6:28           ` Cornelia Huck
@ 2020-04-07 10:18             ` Eric Farman
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Farman @ 2020-04-07 10:18 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Halil Pasic, Jason Herne, qemu-s390x, Jared Rossi, qemu-devel



On 4/7/20 2:28 AM, Cornelia Huck wrote:
> On Mon, 6 Apr 2020 14:21:17 -0400
> Eric Farman <farman@linux.ibm.com> wrote:
> 
>> On 4/1/20 4:52 AM, Cornelia Huck wrote:
>>> On Wed, 25 Mar 2020 03:24:28 +0100
>>> Halil Pasic <pasic@linux.ibm.com> wrote:
>>>   
>>>> On Tue, 24 Mar 2020 18:04:30 +0100
>>>> Cornelia Huck <cohuck@redhat.com> wrote:
>>>>  
>>>>> On Thu,  6 Feb 2020 22:45:03 +0100
>>>>> Eric Farman <farman@linux.ibm.com> wrote:
>>>>>     
>>>>>> From: Farhan Ali <alifm@linux.ibm.com>
>>>>>>
>>>>>> EIO is returned by vfio-ccw mediated device when the backing
>>>>>> host subchannel is not operational anymore. So return cc=3
>>>>>> back to the guest, rather than returning a unit check.
>>>>>> This way the guest can take appropriate action such as
>>>>>> issue an 'stsch'.    
>>>>
>>>> I believe this is not the only situation when vfio-ccw returns
>>>> EIO, or?
>>>>  
>>>>>>
>>>>>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>>>>>> Signed-off-by: Eric Farman <farman@linux.ibm.com>
>>>>>> ---
>>>>>>
>>>>>> Notes:
>>>>>>     v1->v2: [EF]
>>>>>>      - Add s-o-b
>>>>>>      - [Seems the discussion on v1 centered on the return code
>>>>>>        set in the kernel, rather than anything that needs to
>>>>>>        change here, unless I've missed something.]    
>>>>
>>>> Does this need to change here? If the kernel is supposed to return ENODEV
>>>> then this does not need to change.
>>>>  
>>>>>
>>>>> I've stared at this and at the kernel code for some time again; and I'm
>>>>> not sure if "return -EIO == not operational" is even true. That said,
>>>>> I'm not sure a unit check is the right response, either. The only thing
>>>>> I'm sure about is that the kernel code needs some review of return
>>>>> codes and some documentation...    
>>>>
>>>> I could not agree more, this is semantically uapi and needs to be
>>>> properly documented.
>>>>
>>>> With regards to "linux error codes: vs "ionist cc's" an where
>>>> the mapping is different example:
>>>>
>>>> """
>>>> /**                                                                             
>>>>  * cio_cancel_halt_clear - Cancel running I/O by performing cancel, halt        
>>>>  * and clear ordinally if subchannel is valid.                                  
>>>>  * @sch: subchannel on which to perform the cancel_halt_clear operation         
>>>>  * @iretry: the number of the times remained to retry the next operation        
>>>>  *                                                                              
>>>>  * This should be called repeatedly since halt/clear are asynchronous           
>>>>  * operations. We do one try with cio_cancel, three tries with cio_halt,        
>>>>  * 255 tries with cio_clear. The caller should initialize @iretry with          
>>>>  * the value 255 for its first call to this, and keep using the same            
>>>>  * @iretry in the subsequent calls until it gets a non -EBUSY return.           
>>>>  *                                                                              
>>>>  * Returns 0 if device now idle, -ENODEV for device not operational,            
>>>>  * -EBUSY if an interrupt is expected (either from halt/clear or from a         
>>>>  * status pending), and -EIO if out of retries.                                 
>>>>  */                                                                             
>>>> int cio_cancel_halt_clear(struct subchannel *sch, int *iretry)   
>>>>
>>>> """
>>>> Here -ENODEV is not operational.  
>>>
>>> Ok, I went through the kernel code and checked which error may be
>>> returned in which case (hope I caught all of them). Here's what I
>>> currently have:  
>>
>> Thanks for doing all this mapping!
>>
>>>
>>> diff --git a/Documentation/s390/vfio-ccw.rst b/Documentation/s390/vfio-ccw.rst
>>> index fca9c4f5bd9c..43f375a03cce 100644
>>> --- a/Documentation/s390/vfio-ccw.rst
>>> +++ b/Documentation/s390/vfio-ccw.rst

...snip...

>>>
>>> The other return codes look sane, and the return codes for the async
>>> region also seem sane (but have the same issue with -EIO == "device in
>>> wrong state").
>>>
>>> Looking at the QEMU handling, I think the -EIO handling is a bit
>>> questionable (without an obvious solution), while mapping -EBUSY to cc
>>> 2 is the only reasonable thing to do given that the interface does not
>>> differentiate between "busy" and "status pending". The rest seems sane.
>>>   
>>
>> So maybe with all this data, and absent a better solution, it's best to
>> just drop this patch from v3?
> 
> Yes, I agree. I'll post the documentation update as a separate patch.
> 

Sounds good; thanks!


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-04-07 10:25 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-06 21:45 [RFC PATCH v2 0/7] s390x/vfio_ccw: Channel Path Handling [QEMU] Eric Farman
2020-02-06 21:45 ` [RFC PATCH v2 1/7] vfio-ccw: Return IOINST_CC_NOT_OPERATIONAL for EIO Eric Farman
2020-03-24 17:04   ` Cornelia Huck
2020-03-25  2:24     ` Halil Pasic
2020-04-01  8:52       ` Cornelia Huck
2020-04-06 18:21         ` Eric Farman
2020-04-07  6:28           ` Cornelia Huck
2020-04-07 10:18             ` Eric Farman
2020-02-06 21:45 ` [RFC PATCH v2 2/7] linux-headers: update Eric Farman
2020-02-06 21:45 ` [RFC PATCH v2 3/7] vfio-ccw: Refactor cleanup of regions Eric Farman
2020-02-06 21:45 ` [RFC PATCH v2 4/7] vfio-ccw: Add support for the schib region Eric Farman
2020-02-06 21:45 ` [RFC PATCH v2 5/7] vfio-ccw: Add support for the crw region Eric Farman
2020-02-06 21:45 ` [RFC PATCH v2 6/7] vfio-ccw: Refactor ccw irq handler Eric Farman
2020-03-24 17:32   ` Cornelia Huck
2020-02-06 21:45 ` [RFC PATCH v2 7/7] vfio-ccw: Add support for the CRW irq Eric Farman
2020-04-06 16:22   ` Cornelia Huck
2020-04-06 21:37     ` Eric Farman
2020-04-07  6:35       ` Cornelia Huck

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.