qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/8] s390x/pci: Fixing s390 vfio-pci ISM support
@ 2020-12-09 20:34 Matthew Rosato
  2020-12-09 20:34 ` [RFC 1/8] linux-headers: update against 5.10-rc7 Matthew Rosato
                   ` (7 more replies)
  0 siblings, 8 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-12-09 20:34 UTC (permalink / raw)
  To: cohuck, thuth
  Cc: pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

Today, ISM devices are completely disallowed for vfio-pci passthrough as
QEMU rejects the device due to an (inappropriate) MSI-X check.  Removing
this fence, however, reveals additional deficiencies in the s390x PCI
interception layer that prevent ISM devices from working correctly.
Namely, ISM block write operations have particular requirements in regards
to the alignment, size and order of writes performed that cannot be
guaranteed when breaking up write operations through the typical
vfio_pci_bar_rw paths. Furthermore, ISM requires that legacy/non-MIO
s390 PCI instructions are used, which is also not guaranteed when the I/O
is passed through the typical userspace channels.

This patchset provides a set of fixes related to enabling ISM device
passthrough and includes patches to enable use of a new vfio region that
will allow s390x PCI pass-through devices to perform s390 PCI instructions
in such a way that the same instruction issued on the guest is re-issued
on the host.

The region is intended for use with ISM devices specifically, as they do
not implement MSI-X and thus transferring I/O in this manner will not
interfere with vfio-pci MSI-X masking.  If someone can think of a good way
to also make this MSI-X compatible (and thus usable by other device types
besides ISM), I'm open to suggestions...

Associated kernel patchset:
https://lkml.org/lkml/2020/12/9/1059

Matthew Rosato (8):
  linux-headers: update against 5.10-rc7
  s390x/pci: MSI-X isn't strictly required for passthrough
  s390x/pci: fix pcistb length
  s390x/pci: Introduce the ZpciOps structure
  s390x/pci: Fix memory_region_access_valid call
  s390x/pci: Handle devices that support relaxed alignment
  s390x/pci: PCISTB via the vfio zPCI I/O region
  s390x/pci: PCILG via the vfio zPCI I/O region

 hw/s390x/s390-pci-bus.c                            |  32 ++-
 hw/s390x/s390-pci-inst.c                           | 289 ++++++++++++++-------
 hw/s390x/s390-pci-vfio.c                           | 164 ++++++++++++
 include/hw/s390x/s390-pci-bus.h                    |  24 ++
 include/hw/s390x/s390-pci-clp.h                    |   1 +
 include/hw/s390x/s390-pci-inst.h                   |   3 +
 include/hw/s390x/s390-pci-vfio.h                   |  23 ++
 include/standard-headers/asm-x86/kvm_para.h        |   1 +
 .../infiniband/hw/vmw_pvrdma/pvrdma_verbs.h        |   2 +-
 include/standard-headers/linux/vhost_types.h       |   9 +
 linux-headers/linux/vfio.h                         |   4 +
 linux-headers/linux/vfio_zdev.h                    |  33 +++
 linux-headers/linux/vhost.h                        |   4 +
 13 files changed, 486 insertions(+), 103 deletions(-)

-- 
1.8.3.1



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [RFC 1/8] linux-headers: update against 5.10-rc7
  2020-12-09 20:34 [RFC 0/8] s390x/pci: Fixing s390 vfio-pci ISM support Matthew Rosato
@ 2020-12-09 20:34 ` Matthew Rosato
  2020-12-09 20:34 ` [RFC 2/8] s390x/pci: MSI-X isn't strictly required for passthrough Matthew Rosato
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-12-09 20:34 UTC (permalink / raw)
  To: cohuck, thuth
  Cc: pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

Placeholder commit to pull in changes from "vfio-pci/zdev: Pass the relaxed
alignment flag" and "vfio-pci/zdev: Introduce the PCISTB vfio region"

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 include/standard-headers/asm-x86/kvm_para.h        |  1 +
 .../infiniband/hw/vmw_pvrdma/pvrdma_verbs.h        |  2 +-
 include/standard-headers/linux/vhost_types.h       |  9 ++++++
 linux-headers/linux/vfio.h                         |  4 +++
 linux-headers/linux/vfio_zdev.h                    | 33 ++++++++++++++++++++++
 linux-headers/linux/vhost.h                        |  4 +++
 6 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/include/standard-headers/asm-x86/kvm_para.h b/include/standard-headers/asm-x86/kvm_para.h
index 07877d3..215d01b 100644
--- a/include/standard-headers/asm-x86/kvm_para.h
+++ b/include/standard-headers/asm-x86/kvm_para.h
@@ -32,6 +32,7 @@
 #define KVM_FEATURE_POLL_CONTROL	12
 #define KVM_FEATURE_PV_SCHED_YIELD	13
 #define KVM_FEATURE_ASYNC_PF_INT	14
+#define KVM_FEATURE_MSI_EXT_DEST_ID	15
 
 #define KVM_HINTS_REALTIME      0
 
diff --git a/include/standard-headers/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h b/include/standard-headers/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
index 0a8c7c9..1677208 100644
--- a/include/standard-headers/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
+++ b/include/standard-headers/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
@@ -176,7 +176,7 @@ struct pvrdma_port_attr {
 	uint8_t			subnet_timeout;
 	uint8_t			init_type_reply;
 	uint8_t			active_width;
-	uint16_t			active_speed;
+	uint8_t			active_speed;
 	uint8_t			phys_state;
 	uint8_t			reserved[2];
 };
diff --git a/include/standard-headers/linux/vhost_types.h b/include/standard-headers/linux/vhost_types.h
index 486630b..0bd2684 100644
--- a/include/standard-headers/linux/vhost_types.h
+++ b/include/standard-headers/linux/vhost_types.h
@@ -138,6 +138,15 @@ struct vhost_vdpa_config {
 	uint8_t buf[0];
 };
 
+/* vhost vdpa IOVA range
+ * @first: First address that can be mapped by vhost-vDPA
+ * @last: Last address that can be mapped by vhost-vDPA
+ */
+struct vhost_vdpa_iova_range {
+	uint64_t first;
+	uint64_t last;
+};
+
 /* Feature bits */
 /* Log all write descriptors. Can be changed while device is active. */
 #define VHOST_F_LOG_ALL 26
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index b92dcc4..703c20c 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -338,6 +338,10 @@ struct vfio_region_info_cap_type {
  * to do TLB invalidation on a GPU.
  */
 #define VFIO_REGION_SUBTYPE_IBM_NVLINK2_ATSD	(1)
+/*
+ * IBM zPCI I/O region
+ */
+#define VFIO_REGION_SUBTYPE_IBM_ZPCI_IO		(2)
 
 /* sub-types for VFIO_REGION_TYPE_GFX */
 #define VFIO_REGION_SUBTYPE_GFX_EDID            (1)
diff --git a/linux-headers/linux/vfio_zdev.h b/linux-headers/linux/vfio_zdev.h
index b430939..22d3408 100644
--- a/linux-headers/linux/vfio_zdev.h
+++ b/linux-headers/linux/vfio_zdev.h
@@ -43,6 +43,7 @@ struct vfio_device_info_cap_zpci_group {
 	__u64 msi_addr;		/* MSI address */
 	__u64 flags;
 #define VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH 1 /* Program-specified TLB refresh */
+#define VFIO_DEVICE_INFO_ZPCI_FLAG_RELAXED 2 /* Relaxed Length and Alignment */
 	__u16 mui;		/* Measurement Block Update Interval */
 	__u16 noi;		/* Maximum number of MSIs */
 	__u16 maxstbl;		/* Maximum Store Block Length */
@@ -75,4 +76,36 @@ struct vfio_device_info_cap_zpci_pfip {
 	__u8 pfip[];
 };
 
+/**
+ * VFIO_REGION_SUBTYPE_IBM_ZPCI_IO - VFIO zPCI PCI Direct I/O Region
+ *
+ * This region is used to transfer I/O operations from the guest directly
+ * to the host zPCI I/O layer.
+ *
+ * The _hdr area is user-readable and is used to provide setup information.
+ * The _req area is user-writable and is used to provide the I/O operation.
+ */
+struct vfio_zpci_io_hdr {
+	__u64 flags;
+	__u16 max;		/* Max block operation size allowed */
+	__u16 reserved;
+	__u32 reserved2;
+};
+
+struct vfio_zpci_io_req {
+	__u64 flags;
+#define VFIO_ZPCI_IO_FLAG_READ (1 << 0) /* Read Operation Specified */
+#define VFIO_ZPCI_IO_FLAG_BLOCKW (1 << 1) /* Block Write Operation Specified */
+	__u64 gaddr;		/* Address of guest data */
+	__u64 offset;		/* Offset into target PCI Address Space */
+	__u32 reserved;
+	__u16 len;		/* Length of guest operation */
+	__u8 pcias;		/* Target PCI Address Space */
+	__u8 reserved2;
+};
+
+struct vfio_region_zpci_io {
+	struct vfio_zpci_io_hdr hdr;
+	struct vfio_zpci_io_req req;
+};
 #endif
diff --git a/linux-headers/linux/vhost.h b/linux-headers/linux/vhost.h
index 7523218..c998860 100644
--- a/linux-headers/linux/vhost.h
+++ b/linux-headers/linux/vhost.h
@@ -146,4 +146,8 @@
 
 /* Set event fd for config interrupt*/
 #define VHOST_VDPA_SET_CONFIG_CALL	_IOW(VHOST_VIRTIO, 0x77, int)
+
+/* Get the valid iova range */
+#define VHOST_VDPA_GET_IOVA_RANGE	_IOR(VHOST_VIRTIO, 0x78, \
+					     struct vhost_vdpa_iova_range)
 #endif
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC 2/8] s390x/pci: MSI-X isn't strictly required for passthrough
  2020-12-09 20:34 [RFC 0/8] s390x/pci: Fixing s390 vfio-pci ISM support Matthew Rosato
  2020-12-09 20:34 ` [RFC 1/8] linux-headers: update against 5.10-rc7 Matthew Rosato
@ 2020-12-09 20:34 ` Matthew Rosato
  2020-12-10 10:28   ` Cornelia Huck
  2020-12-09 20:34 ` [RFC 3/8] s390x/pci: fix pcistb length Matthew Rosato
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Matthew Rosato @ 2020-12-09 20:34 UTC (permalink / raw)
  To: cohuck, thuth
  Cc: pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

s390 PCI currently disallows PCI devices without the MSI-X capability.
However, this fence doesn't make sense for passthrough devices.  Move
the check to only fence emulated devices (e.g., virtio).

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 05f7460..afad048 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -1028,12 +1028,12 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             s390_pci_get_clp_info(pbdev);
         } else {
             pbdev->fh |= FH_SHM_EMUL;
-        }
 
-        if (s390_pci_msix_init(pbdev)) {
-            error_setg(errp, "MSI-X support is mandatory "
-                       "in the S390 architecture");
-            return;
+            if (s390_pci_msix_init(pbdev)) {
+                error_setg(errp, "MSI-X support is mandatory "
+                           "in the S390 architecture");
+                return;
+            }
         }
 
         if (dev->hotplugged) {
@@ -1073,7 +1073,9 @@ static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
         devfn = pci_dev->devfn;
         qdev_unrealize(dev);
 
-        s390_pci_msix_free(pbdev);
+        if (pbdev->fh & FH_SHM_EMUL) {
+            s390_pci_msix_free(pbdev);
+        }
         s390_pci_iommu_free(s, bus, devfn);
         pbdev->pdev = NULL;
         pbdev->state = ZPCI_FS_RESERVED;
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC 3/8] s390x/pci: fix pcistb length
  2020-12-09 20:34 [RFC 0/8] s390x/pci: Fixing s390 vfio-pci ISM support Matthew Rosato
  2020-12-09 20:34 ` [RFC 1/8] linux-headers: update against 5.10-rc7 Matthew Rosato
  2020-12-09 20:34 ` [RFC 2/8] s390x/pci: MSI-X isn't strictly required for passthrough Matthew Rosato
@ 2020-12-09 20:34 ` Matthew Rosato
  2020-12-10 10:30   ` Cornelia Huck
  2020-12-09 20:34 ` [RFC 4/8] s390x/pci: Introduce the ZpciOps structure Matthew Rosato
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Matthew Rosato @ 2020-12-09 20:34 UTC (permalink / raw)
  To: cohuck, thuth
  Cc: pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

In pcistb_service_call, we are grabbing 8 bits from a guest register to
indicate the length of the store operation -- but per the architecture
the length is actually defined by 13 bits of the guest register.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
---
 hw/s390x/s390-pci-inst.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 70bfd91..db86f12 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -750,7 +750,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
     int i;
     uint32_t fh;
     uint8_t pcias;
-    uint8_t len;
+    uint16_t len;
     uint8_t buffer[128];
 
     if (env->psw.mask & PSW_MASK_PSTATE) {
@@ -760,7 +760,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
 
     fh = env->regs[r1] >> 32;
     pcias = (env->regs[r1] >> 16) & 0xf;
-    len = env->regs[r1] & 0xff;
+    len = env->regs[r1] & 0x1fff;
     offset = env->regs[r3];
 
     if (!(fh & FH_MASK_ENABLE)) {
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC 4/8] s390x/pci: Introduce the ZpciOps structure
  2020-12-09 20:34 [RFC 0/8] s390x/pci: Fixing s390 vfio-pci ISM support Matthew Rosato
                   ` (2 preceding siblings ...)
  2020-12-09 20:34 ` [RFC 3/8] s390x/pci: fix pcistb length Matthew Rosato
@ 2020-12-09 20:34 ` Matthew Rosato
  2020-12-09 20:34 ` [RFC 5/8] s390x/pci: Fix memory_region_access_valid call Matthew Rosato
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-12-09 20:34 UTC (permalink / raw)
  To: cohuck, thuth
  Cc: pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

As inftrastructure to introduce different PCI instruction handlers,
introduce the ZpciOps structure to contain function pointers for the
handlers.  Add default handlers for the PCISTG, PCILG and PCISTB
instructions.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c          |   3 +
 hw/s390x/s390-pci-inst.c         | 238 ++++++++++++++++++++++++++-------------
 include/hw/s390x/s390-pci-bus.h  |  22 ++++
 include/hw/s390x/s390-pci-inst.h |   1 +
 4 files changed, 185 insertions(+), 79 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index afad048..7d31ded 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -1020,6 +1020,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         pbdev->iommu->pbdev = pbdev;
         pbdev->state = ZPCI_FS_DISABLED;
         set_pbdev_info(pbdev);
+        zpci_assign_default_ops(pbdev);
 
         if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
             pbdev->fh |= FH_SHM_VFIO;
@@ -1079,6 +1080,8 @@ static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
         s390_pci_iommu_free(s, bus, devfn);
         pbdev->pdev = NULL;
         pbdev->state = ZPCI_FS_RESERVED;
+        if (pbdev->pcistb_buf)
+            qemu_vfree(pbdev->pcistb_buf);
     } else if (object_dynamic_cast(OBJECT(dev), TYPE_S390_PCI_DEVICE)) {
         pbdev = S390_PCI_DEVICE(dev);
         pbdev->fid = 0;
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index db86f12..b07ef2a 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -404,16 +404,49 @@ static MemTxResult zpci_read_bar(S390PCIBusDevice *pbdev, uint8_t pcias,
                                        MEMTXATTRS_UNSPECIFIED);
 }
 
+static int pcilg_default(S390PCIBusDevice *pbdev, uint64_t *data, uint8_t pcias,
+                         uint16_t len, uint64_t offset)
+{
+    MemTxResult result;
+
+    switch (pcias) {
+    case ZPCI_IO_BAR_MIN...ZPCI_IO_BAR_MAX:
+        if (!len || (len > (8 - (offset & 0x7)))) {
+            return -EINVAL;
+        }
+        result = zpci_read_bar(pbdev, pcias, offset, data, len);
+        if (result != MEMTX_OK) {
+            return -EINVAL;
+        }
+        break;
+    case ZPCI_CONFIG_BAR:
+        if (!len || (len > (4 - (offset & 0x3))) || len == 3) {
+            return -EINVAL;
+        }
+        *data =  pci_host_config_read_common(
+                   pbdev->pdev, offset, pci_config_size(pbdev->pdev), len);
+
+        if (zpci_endian_swap(data, len)) {
+            return -EINVAL;
+        }
+        break;
+    default:
+        return -EFAULT;
+    }
+
+    return 0;
+}
+
 int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
 {
     CPUS390XState *env = &cpu->env;
     S390PCIBusDevice *pbdev;
     uint64_t offset;
     uint64_t data;
-    MemTxResult result;
     uint8_t len;
     uint32_t fh;
     uint8_t pcias;
+    int ret;
 
     if (env->psw.mask & PSW_MASK_PSTATE) {
         s390_program_interrupt(env, PGM_PRIVILEGED, ra);
@@ -452,35 +485,21 @@ int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
         break;
     }
 
-    switch (pcias) {
-    case ZPCI_IO_BAR_MIN...ZPCI_IO_BAR_MAX:
-        if (!len || (len > (8 - (offset & 0x7)))) {
-            s390_program_interrupt(env, PGM_OPERAND, ra);
-            return 0;
-        }
-        result = zpci_read_bar(pbdev, pcias, offset, &data, len);
-        if (result != MEMTX_OK) {
-            s390_program_interrupt(env, PGM_OPERAND, ra);
-            return 0;
-        }
-        break;
-    case ZPCI_CONFIG_BAR:
-        if (!len || (len > (4 - (offset & 0x3))) || len == 3) {
-            s390_program_interrupt(env, PGM_OPERAND, ra);
-            return 0;
-        }
-        data =  pci_host_config_read_common(
-                   pbdev->pdev, offset, pci_config_size(pbdev->pdev), len);
+    ret = pbdev->ops.pcilg(pbdev, &data, pcias, len, offset);
 
-        if (zpci_endian_swap(&data, len)) {
-            s390_program_interrupt(env, PGM_OPERAND, ra);
-            return 0;
-        }
-        break;
-    default:
+    switch (ret) {
+    case -EINVAL:
+        s390_program_interrupt(env, PGM_OPERAND, ra);
+        return 0;
+    case -EFAULT:
         DPRINTF("pcilg invalid space\n");
         setcc(cpu, ZPCI_PCI_LS_ERR);
         s390_set_status_code(env, r2, ZPCI_PCI_ST_INVAL_AS);
+    case 0:
+        break;
+    default:
+        DPRINTF("pcilg unexpected return %d from op\n", ret);
+        s390_program_interrupt(env, PGM_OPERAND, ra);
         return 0;
     }
 
@@ -504,15 +523,55 @@ static MemTxResult zpci_write_bar(S390PCIBusDevice *pbdev, uint8_t pcias,
                                         MEMTXATTRS_UNSPECIFIED);
 }
 
+static int pcistg_default(S390PCIBusDevice *pbdev, uint64_t data, uint8_t pcias,
+                          uint16_t len, uint64_t offset)
+{
+    MemTxResult result;
+
+    switch (pcias) {
+        /* A ZPCI PCI card may use any BAR from BAR 0 to BAR 5 */
+    case ZPCI_IO_BAR_MIN...ZPCI_IO_BAR_MAX:
+        /*
+         * Check length:
+         * A length of 0 is invalid and length should not cross a double word
+         */
+        if (!len || (len > (8 - (offset & 0x7)))) {
+            return -EINVAL;
+        }
+
+        result = zpci_write_bar(pbdev, pcias, offset, data, len);
+        if (result != MEMTX_OK) {
+            return -EINVAL;
+        }
+        break;
+    case ZPCI_CONFIG_BAR:
+        /* ZPCI uses the pseudo BAR number 15 as configuration space */
+        /* possible access lengths are 1,2,4 and must not cross a word */
+        if (!len || (len > (4 - (offset & 0x3))) || len == 3) {
+            return -EINVAL;
+        }
+        /* len = 1,2,4 so we do not need to test */
+        zpci_endian_swap(&data, len);
+        pci_host_config_write_common(pbdev->pdev, offset,
+                                     pci_config_size(pbdev->pdev),
+                                     data, len);
+        break;
+    default:
+        return -EFAULT;
+    }
+
+    return 0;
+}
+
 int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
 {
     CPUS390XState *env = &cpu->env;
     uint64_t offset, data;
     S390PCIBusDevice *pbdev;
-    MemTxResult result;
     uint8_t len;
     uint32_t fh;
     uint8_t pcias;
+    int ret;
 
     if (env->psw.mask & PSW_MASK_PSTATE) {
         s390_program_interrupt(env, PGM_PRIVILEGED, ra);
@@ -555,40 +614,21 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra)
         break;
     }
 
-    switch (pcias) {
-        /* A ZPCI PCI card may use any BAR from BAR 0 to BAR 5 */
-    case ZPCI_IO_BAR_MIN...ZPCI_IO_BAR_MAX:
-        /* Check length:
-         * A length of 0 is invalid and length should not cross a double word
-         */
-        if (!len || (len > (8 - (offset & 0x7)))) {
-            s390_program_interrupt(env, PGM_OPERAND, ra);
-            return 0;
-        }
+    ret = pbdev->ops.pcistg(pbdev, data, pcias, len, offset);
 
-        result = zpci_write_bar(pbdev, pcias, offset, data, len);
-        if (result != MEMTX_OK) {
-            s390_program_interrupt(env, PGM_OPERAND, ra);
-            return 0;
-        }
-        break;
-    case ZPCI_CONFIG_BAR:
-        /* ZPCI uses the pseudo BAR number 15 as configuration space */
-        /* possible access lengths are 1,2,4 and must not cross a word */
-        if (!len || (len > (4 - (offset & 0x3))) || len == 3) {
-            s390_program_interrupt(env, PGM_OPERAND, ra);
-            return 0;
-        }
-        /* len = 1,2,4 so we do not need to test */
-        zpci_endian_swap(&data, len);
-        pci_host_config_write_common(pbdev->pdev, offset,
-                                     pci_config_size(pbdev->pdev),
-                                     data, len);
-        break;
-    default:
+    switch (ret) {
+    case -EINVAL:
+        s390_program_interrupt(env, PGM_OPERAND, ra);
+        return 0;
+    case -EFAULT:
         DPRINTF("pcistg invalid space\n");
         setcc(cpu, ZPCI_PCI_LS_ERR);
         s390_set_status_code(env, r2, ZPCI_PCI_ST_INVAL_AS);
+    case 0:
+        break;
+    default:
+        DPRINTF("pcistg unexpected return %d from op\n", ret);
+        s390_program_interrupt(env, PGM_OPERAND, ra);
         return 0;
     }
 
@@ -739,19 +779,53 @@ err:
     return 0;
 }
 
+/*
+ * The default PCISTB handler will break PCISTB instructions into a series of
+ * 8B memory operations.
+ */
+static int pcistb_default(S390PCIBusDevice *pbdev, S390CPU *cpu,
+                           uint64_t gaddr, uint8_t ar, uint8_t pcias,
+                           uint16_t len, uint64_t offset)
+{
+    MemTxResult result;
+    MemoryRegion *mr;
+    int i;
+
+    mr = pbdev->pdev->io_regions[pcias].memory;
+    mr = s390_get_subregion(mr, offset, len);
+    offset -= mr->addr;
+
+    if (!memory_region_access_valid(mr, offset, len, true,
+                                    MEMTXATTRS_UNSPECIFIED)) {
+        return -EINVAL;
+    }
+
+    if (s390_cpu_virt_mem_read(cpu, gaddr, ar, pbdev->pcistb_buf, len)) {
+        return -EACCES;
+    }
+
+    for (i = 0; i < len / 8; i++) {
+        result = memory_region_dispatch_write(mr, offset + i * 8,
+                                              ldq_p(pbdev->pcistb_buf + i * 8),
+                                              MO_64, MEMTXATTRS_UNSPECIFIED);
+        if (result != MEMTX_OK) {
+            return -EINVAL;
+        }
+    }
+
+    return 0;
+}
+
 int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
                         uint8_t ar, uintptr_t ra)
 {
     CPUS390XState *env = &cpu->env;
     S390PCIBusDevice *pbdev;
-    MemoryRegion *mr;
-    MemTxResult result;
     uint64_t offset;
-    int i;
     uint32_t fh;
     uint8_t pcias;
     uint16_t len;
-    uint8_t buffer[128];
+    int ret;
 
     if (env->psw.mask & PSW_MASK_PSTATE) {
         s390_program_interrupt(env, PGM_PRIVILEGED, ra);
@@ -812,29 +886,21 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
         goto specification_error;
     }
 
-    mr = pbdev->pdev->io_regions[pcias].memory;
-    mr = s390_get_subregion(mr, offset, len);
-    offset -= mr->addr;
+    ret = pbdev->ops.pcistb(pbdev, cpu, gaddr, ar, pcias, len, offset);
 
-    if (!memory_region_access_valid(mr, offset, len, true,
-                                    MEMTXATTRS_UNSPECIFIED)) {
+    switch (ret) {
+    case -EINVAL:
         s390_program_interrupt(env, PGM_OPERAND, ra);
         return 0;
-    }
-
-    if (s390_cpu_virt_mem_read(cpu, gaddr, ar, buffer, len)) {
+    case -EACCES:
         s390_cpu_virt_mem_handle_exc(cpu, ra);
         return 0;
-    }
-
-    for (i = 0; i < len / 8; i++) {
-        result = memory_region_dispatch_write(mr, offset + i * 8,
-                                              ldq_p(buffer + i * 8),
-                                              MO_64, MEMTXATTRS_UNSPECIFIED);
-        if (result != MEMTX_OK) {
-            s390_program_interrupt(env, PGM_OPERAND, ra);
-            return 0;
-        }
+    case 0:
+        break;
+    default:
+        DPRINTF("pcistb unexpected return %d from op\n", ret);
+        s390_program_interrupt(env, PGM_OPERAND, ra);
+        return 0;
     }
 
     pbdev->fmb.counter[ZPCI_FMB_CNT_STB]++;
@@ -1298,3 +1364,17 @@ out:
     setcc(cpu, cc);
     return 0;
 }
+
+void zpci_assign_default_ops(S390PCIBusDevice *pbdev)
+{
+    /*
+     * As PCISTB operations are not allowed to cross a page boundary, use
+     * qemu_memalign to get a single page for all subseqent PCISTB
+     * operations.
+     */
+    pbdev->pcistb_buf = qemu_memalign(PAGE_SIZE, PAGE_SIZE);
+
+    pbdev->ops.pcistg = pcistg_default;
+    pbdev->ops.pcilg = pcilg_default;
+    pbdev->ops.pcistb = pcistb_default;
+}
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index 49ae9f0..bf0034f 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -325,6 +325,26 @@ typedef struct S390PCIGroup {
 S390PCIGroup *s390_group_create(int id);
 S390PCIGroup *s390_group_find(int id);
 
+typedef struct ZpciOps {
+    int (*pcistg)(S390PCIBusDevice *pbdev,
+                  uint64_t data,
+                  uint8_t pcias,
+                  uint16_t len,
+                  uint64_t offset);
+    int (*pcilg)(S390PCIBusDevice *pbdev,
+                 uint64_t *data,
+                 uint8_t pcias,
+                 uint16_t len,
+                 uint64_t offset);
+    int (*pcistb)(S390PCIBusDevice *pbdev,
+                  S390CPU *cpu,
+                  uint64_t gaddr,
+                  uint8_t ar,
+                  uint8_t pcias,
+                  uint16_t len,
+                  uint64_t offset);
+} ZpciOps;
+
 struct S390PCIBusDevice {
     DeviceState qdev;
     PCIDevice *pdev;
@@ -350,6 +370,8 @@ struct S390PCIBusDevice {
     MemoryRegion msix_notify_mr;
     IndAddr *summary_ind;
     IndAddr *indicator;
+    ZpciOps ops;
+    uint8_t *pcistb_buf;
     bool pci_unplug_request_processed;
     bool unplug_requested;
     QTAILQ_ENTRY(S390PCIBusDevice) link;
diff --git a/include/hw/s390x/s390-pci-inst.h b/include/hw/s390x/s390-pci-inst.h
index a55c448..c9fe3f1 100644
--- a/include/hw/s390x/s390-pci-inst.h
+++ b/include/hw/s390x/s390-pci-inst.h
@@ -111,6 +111,7 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
 int stpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
                          uintptr_t ra);
 void fmb_timer_free(S390PCIBusDevice *pbdev);
+void zpci_assign_default_ops(S390PCIBusDevice *pbdev);
 
 #define ZPCI_IO_BAR_MIN 0
 #define ZPCI_IO_BAR_MAX 5
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC 5/8] s390x/pci: Fix memory_region_access_valid call
  2020-12-09 20:34 [RFC 0/8] s390x/pci: Fixing s390 vfio-pci ISM support Matthew Rosato
                   ` (3 preceding siblings ...)
  2020-12-09 20:34 ` [RFC 4/8] s390x/pci: Introduce the ZpciOps structure Matthew Rosato
@ 2020-12-09 20:34 ` Matthew Rosato
  2020-12-10 12:15   ` Cornelia Huck
  2020-12-09 20:34 ` [RFC 6/8] s390x/pci: Handle devices that support relaxed alignment Matthew Rosato
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Matthew Rosato @ 2020-12-09 20:34 UTC (permalink / raw)
  To: cohuck, thuth
  Cc: pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

In pcistb_service_handler, a call is made to validate that the memory
region can be accessed.  However, the call is made using the entire length
of the pcistb operation, which can be larger than the allowed memory
access size (8).  Since we already know that the provided buffer is a
multiple of 8, fix the call to memory_region_access_valid to iterate
over the memory region in the same way as the subsequent call to
memory_region_dispatch_write.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-inst.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index b07ef2a..a5270d0 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -795,9 +795,11 @@ static int pcistb_default(S390PCIBusDevice *pbdev, S390CPU *cpu,
     mr = s390_get_subregion(mr, offset, len);
     offset -= mr->addr;
 
-    if (!memory_region_access_valid(mr, offset, len, true,
-                                    MEMTXATTRS_UNSPECIFIED)) {
-        return -EINVAL;
+    for (i = 0; i < len / 8; i++) {
+        if (!memory_region_access_valid(mr, offset + i * 8, 8, true,
+                                        MEMTXATTRS_UNSPECIFIED)) {
+            return -EINVAL;
+        }
     }
 
     if (s390_cpu_virt_mem_read(cpu, gaddr, ar, pbdev->pcistb_buf, len)) {
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC 6/8] s390x/pci: Handle devices that support relaxed alignment
  2020-12-09 20:34 [RFC 0/8] s390x/pci: Fixing s390 vfio-pci ISM support Matthew Rosato
                   ` (4 preceding siblings ...)
  2020-12-09 20:34 ` [RFC 5/8] s390x/pci: Fix memory_region_access_valid call Matthew Rosato
@ 2020-12-09 20:34 ` Matthew Rosato
  2020-12-09 20:34 ` [RFC 7/8] s390x/pci: PCISTB via the vfio zPCI I/O region Matthew Rosato
  2020-12-09 20:34 ` [RFC 8/8] s390x/pci: PCILG " Matthew Rosato
  7 siblings, 0 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-12-09 20:34 UTC (permalink / raw)
  To: cohuck, thuth
  Cc: pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

Certain zPCI device types (e.g. ISM) allow for a different set of address
alignment rules for PCISTB instructions.  Recognize this distinction and
perform only a subset of alignment checks for intercepted PCISTB
instructions.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-inst.c        | 34 ++++++++++++++++++++--------------
 hw/s390x/s390-pci-vfio.c        |  3 +++
 include/hw/s390x/s390-pci-clp.h |  1 +
 3 files changed, 24 insertions(+), 14 deletions(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index a5270d0..30698e5 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -868,25 +868,31 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
         return 0;
     }
 
-    /* Verify the address, offset and length */
-    /* offset must be a multiple of 8 */
-    if (offset % 8) {
-        goto specification_error;
-    }
-    /* Length must be greater than 8, a multiple of 8 */
-    /* and not greater than maxstbl */
-    if ((len <= 8) || (len % 8) ||
-        (len > pbdev->pci_group->zpci_group.maxstbl)) {
-        goto specification_error;
+    /*
+     * If the specified device supports relaxed alignment, some checks can
+     * be skipped.
+     */
+    if (!(pbdev->pci_group->zpci_group.fr & CLP_RSP_QPCIG_MASK_RELAXED)) {
+        /* Verify the address, offset and length */
+        /* offset must be a multiple of 8 */
+        if (offset % 8) {
+            goto specification_error;
+        }
+        /* Length must be greater than 8, a multiple of 8 */
+        /* and not greater than maxstbl */
+        if ((len <= 8) || (len % 8) ||
+            (len > pbdev->pci_group->zpci_group.maxstbl)) {
+            goto specification_error;
+        }
+        /* Guest address must be double word aligned */
+        if (gaddr & 0x07UL) {
+            goto specification_error;
+        }
     }
     /* Do not cross a 4K-byte boundary */
     if (((offset & 0xfff) + len) > 0x1000) {
         goto specification_error;
     }
-    /* Guest address must be double word aligned */
-    if (gaddr & 0x07UL) {
-        goto specification_error;
-    }
 
     ret = pbdev->ops.pcistb(pbdev, cpu, gaddr, ar, pcias, len, offset);
 
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 9296e1b..9439fe1 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -156,6 +156,9 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
         if (cap->flags & VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH) {
             resgrp->fr = 1;
         }
+        if (cap->flags & VFIO_DEVICE_INFO_ZPCI_FLAG_RELAXED) {
+            resgrp->fr |= CLP_RSP_QPCIG_MASK_RELAXED;
+        }
         resgrp->dasm = cap->dasm;
         resgrp->msia = cap->msi_addr;
         resgrp->mui = cap->mui;
diff --git a/include/hw/s390x/s390-pci-clp.h b/include/hw/s390x/s390-pci-clp.h
index 96b8e3f..73a28a0 100644
--- a/include/hw/s390x/s390-pci-clp.h
+++ b/include/hw/s390x/s390-pci-clp.h
@@ -158,6 +158,7 @@ typedef struct ClpRspQueryPciGrp {
 #define CLP_RSP_QPCIG_MASK_NOI 0xfff
     uint16_t i;
     uint8_t version;
+#define CLP_RSP_QPCIG_MASK_RELAXED 0x8
 #define CLP_RSP_QPCIG_MASK_FRAME   0x2
 #define CLP_RSP_QPCIG_MASK_REFRESH 0x1
     uint8_t fr;
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC 7/8] s390x/pci: PCISTB via the vfio zPCI I/O region
  2020-12-09 20:34 [RFC 0/8] s390x/pci: Fixing s390 vfio-pci ISM support Matthew Rosato
                   ` (5 preceding siblings ...)
  2020-12-09 20:34 ` [RFC 6/8] s390x/pci: Handle devices that support relaxed alignment Matthew Rosato
@ 2020-12-09 20:34 ` Matthew Rosato
  2020-12-09 20:34 ` [RFC 8/8] s390x/pci: PCILG " Matthew Rosato
  7 siblings, 0 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-12-09 20:34 UTC (permalink / raw)
  To: cohuck, thuth
  Cc: pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

For ISM devices, use the vfio region to handle intercepted PCISTB
instructions.  This region will allow large block I/O instructions
intercepted from the guest to be performed as a single I/O instruction on
the host.  This ensure that proper write patterns that are expected by the
underlying device are respected and ensures that a non-MIO instruction is
used to perform the operation (as ISM devices do not support the MIO
instruction set).
Furthermore, add a requirement that the I/O region must be available in
order to pass the device through to the guest.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c          |  15 ++++++
 hw/s390x/s390-pci-inst.c         |   8 +++
 hw/s390x/s390-pci-vfio.c         | 108 +++++++++++++++++++++++++++++++++++++++
 include/hw/s390x/s390-pci-bus.h  |   2 +
 include/hw/s390x/s390-pci-inst.h |   1 +
 include/hw/s390x/s390-pci-vfio.h |  15 ++++++
 6 files changed, 149 insertions(+)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 7d31ded..e77c448 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -977,6 +977,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
     S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
     PCIDevice *pdev = NULL;
     S390PCIBusDevice *pbdev = NULL;
+    int ret;
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
         PCIBridge *pb = PCI_BRIDGE(dev);
@@ -1027,6 +1028,20 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
             /* Fill in CLP information passed via the vfio region */
             s390_pci_get_clp_info(pbdev);
+
+            /*
+             * For a relaxed-alignment device, setup the special I/O region
+             * if available.  Otherwise, the device cannot be passed through.
+             */
+            ret = 0;
+            if (pbdev->pci_group->zpci_group.fr & CLP_RSP_QPCIG_MASK_RELAXED) {
+                ret = s390_pci_get_zpci_io_region(pbdev);
+            }
+            if (ret) {
+                error_setg(errp, "vfio zPCI I/O region support is mandatory "
+                           "for %02x:%02x.%01x", pci_dev_bus_num(pdev),
+                           PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
+            }
         } else {
             pbdev->fh |= FH_SHM_EMUL;
 
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 30698e5..d4c79f6 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -19,6 +19,7 @@
 #include "sysemu/hw_accel.h"
 #include "hw/s390x/s390-pci-inst.h"
 #include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/s390-pci-vfio.h"
 #include "hw/s390x/tod.h"
 
 #ifndef DEBUG_S390PCI_INST
@@ -897,6 +898,8 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
     ret = pbdev->ops.pcistb(pbdev, cpu, gaddr, ar, pcias, len, offset);
 
     switch (ret) {
+    case -EIO:
+        /* fall through */
     case -EINVAL:
         s390_program_interrupt(env, PGM_OPERAND, ra);
         return 0;
@@ -1386,3 +1389,8 @@ void zpci_assign_default_ops(S390PCIBusDevice *pbdev)
     pbdev->ops.pcilg = pcilg_default;
     pbdev->ops.pcistb = pcistb_default;
 }
+
+void zpci_assign_ops_vfio_io_region(S390PCIBusDevice *pbdev)
+{
+    pbdev->ops.pcistb = s390_pci_vfio_pcistb;
+}
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 9439fe1..ad50a62 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -17,6 +17,7 @@
 #include "trace.h"
 #include "hw/s390x/s390-pci-bus.h"
 #include "hw/s390x/s390-pci-clp.h"
+#include "hw/s390x/s390-pci-inst.h"
 #include "hw/s390x/s390-pci-vfio.h"
 #include "hw/vfio/pci.h"
 #include "hw/vfio/vfio-common.h"
@@ -277,3 +278,110 @@ retry:
 
     return;
 }
+
+/*
+ * This function will look for the VFIO_REGION_SUBTYPE_IBM_ZPCI_IO vfio
+ * device region, which is used for performing block I/O operations.
+ */
+int s390_pci_get_zpci_io_region(S390PCIBusDevice *pbdev)
+{
+    VFIOPCIDevice *vfio_pci;
+    VFIODevice *vdev;
+    struct vfio_region_info *info;
+    int ret;
+
+    vfio_pci = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+    vdev = &vfio_pci->vbasedev;
+
+    if (vdev->num_regions < VFIO_PCI_NUM_REGIONS + 1) {
+        return -ENOENT;
+    }
+
+    /* Get the I/O region if it's available */
+    if (vfio_get_dev_region_info(vdev,
+                                 PCI_VENDOR_ID_IBM |
+                                 VFIO_REGION_TYPE_PCI_VENDOR_TYPE,
+                                 VFIO_REGION_SUBTYPE_IBM_ZPCI_IO, &info)) {
+        return -ENOENT;
+    }
+
+    /* If the size is unexpectedly small, don't use the region */
+    if (sizeof(*pbdev->io_region) > info->size) {
+        return -EINVAL;
+    }
+
+    pbdev->io_region = g_malloc0(info->size);
+
+    /* Check the header for setup information */
+    ret = pread(vfio_pci->vbasedev.fd, &pbdev->io_region->hdr,
+                sizeof(struct vfio_zpci_io_hdr), info->offset);
+    if (ret != sizeof(struct vfio_zpci_io_hdr)) {
+        g_free(pbdev->io_region);
+        pbdev->io_region = 0;
+        ret = -EINVAL;
+    } else {
+        pbdev->io_region_op_offset = info->offset +
+                                     offsetof(struct vfio_region_zpci_io, req);
+        /* All devices in this group will use the I/O region for PCISTB */
+        pbdev->pci_group->zpci_group.maxstbl = MIN(PAGE_SIZE,
+                                               pbdev->io_region->hdr.max);
+        ret = 0;
+    }
+    g_free(info);
+
+    /* Register the new handlers for the device if region available */
+    if (pbdev->io_region) {
+        zpci_assign_ops_vfio_io_region(pbdev);
+    }
+
+    return ret;
+}
+
+int s390_pci_vfio_pcistb(S390PCIBusDevice *pbdev, S390CPU *cpu, uint64_t gaddr,
+                         uint8_t ar, uint8_t pcias, uint16_t len,
+                         uint64_t offset)
+{
+    struct vfio_region_zpci_io *region = pbdev->io_region;
+    VFIOPCIDevice *vfio_pci;
+    uint8_t *buffer;
+    int ret;
+
+    if (region == NULL) {
+        return -EIO;
+    }
+
+    vfio_pci = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+
+    /*
+     * We've already ensured the input can be no larger than a page.  PCISTB
+     * requires that the operation payload does not cross a page boundary,
+     * otherwise the operation will be rejected.  Therefore, just get a single
+     * page for the write.
+     */
+    buffer = qemu_memalign(PAGE_SIZE, PAGE_SIZE);
+
+    if (s390_cpu_virt_mem_read(cpu, gaddr, ar, buffer, len)) {
+        ret = -EACCES;
+        goto out;
+    }
+
+    region->req.gaddr = (uint64_t)buffer;
+    region->req.offset = offset;
+    region->req.len = len;
+    region->req.pcias = pcias;
+    region->req.flags = VFIO_ZPCI_IO_FLAG_BLOCKW;
+
+    ret = pwrite(vfio_pci->vbasedev.fd, &region->req,
+                 sizeof(struct vfio_zpci_io_req),
+                 pbdev->io_region_op_offset);
+    if (ret != sizeof(struct vfio_zpci_io_req)) {
+        ret = -EIO;
+    } else {
+        ret = 0;
+    }
+
+out:
+    qemu_vfree(buffer);
+
+    return ret;
+}
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index bf0034f..8986a5f 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -355,6 +355,8 @@ struct S390PCIBusDevice {
     uint32_t fh;
     uint32_t fid;
     bool fid_defined;
+    uint64_t io_region_op_offset;
+    struct vfio_region_zpci_io *io_region;
     uint64_t fmb_addr;
     ZpciFmb fmb;
     QEMUTimer *fmb_timer;
diff --git a/include/hw/s390x/s390-pci-inst.h b/include/hw/s390x/s390-pci-inst.h
index c9fe3f1..7ed6175 100644
--- a/include/hw/s390x/s390-pci-inst.h
+++ b/include/hw/s390x/s390-pci-inst.h
@@ -112,6 +112,7 @@ int stpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
                          uintptr_t ra);
 void fmb_timer_free(S390PCIBusDevice *pbdev);
 void zpci_assign_default_ops(S390PCIBusDevice *pbdev);
+void zpci_assign_ops_vfio_io_region(S390PCIBusDevice *pbdev);
 
 #define ZPCI_IO_BAR_MIN 0
 #define ZPCI_IO_BAR_MAX 5
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index ff708ae..f0a994f 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -21,6 +21,10 @@ S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
                                           S390PCIBusDevice *pbdev);
 void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
 void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
+int s390_pci_get_zpci_io_region(S390PCIBusDevice *pbdev);
+int s390_pci_vfio_pcistb(S390PCIBusDevice *pbdev, S390CPU *cpu, uint64_t gaddr,
+                         uint8_t ar, uint8_t pcias, uint16_t len,
+                         uint64_t offset);
 #else
 static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
 {
@@ -34,6 +38,17 @@ static inline S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
 static inline void s390_pci_end_dma_count(S390pciState *s,
                                           S390PCIDMACount *cnt) { }
 static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
+static inline int s390_pci_get_zpci_io_region(S390PCIBusDevice *pbdev)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_vfio_pcistb(S390PCIBusDevice *pbdev, S390CPU *cpu,
+                                       uint64_t gaddr, uint8_t ar,
+                                       uint8_t pcias, uint16_t len,
+                                       uint64_t offset)
+{
+    return -EIO;
+}
 #endif
 
 #endif
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [RFC 8/8] s390x/pci: PCILG via the vfio zPCI I/O region
  2020-12-09 20:34 [RFC 0/8] s390x/pci: Fixing s390 vfio-pci ISM support Matthew Rosato
                   ` (6 preceding siblings ...)
  2020-12-09 20:34 ` [RFC 7/8] s390x/pci: PCISTB via the vfio zPCI I/O region Matthew Rosato
@ 2020-12-09 20:34 ` Matthew Rosato
  7 siblings, 0 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-12-09 20:34 UTC (permalink / raw)
  To: cohuck, thuth
  Cc: pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

For ISM devices, use the vfio region to handle intercepted PCILG
instructions.  This will allow read I/Os intercepted from the guest to be
performed as single operations that ensure the same non-MIO PCI instruction
is used on the host as specified in the guest.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-inst.c         |  3 ++-
 hw/s390x/s390-pci-vfio.c         | 53 ++++++++++++++++++++++++++++++++++++++++
 include/hw/s390x/s390-pci-inst.h |  1 +
 include/hw/s390x/s390-pci-vfio.h |  8 ++++++
 4 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index d4c79f6..33186dc 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -352,7 +352,7 @@ out:
  * @ptr: a pointer to a uint64_t data field
  * @len: the length of the valid data, must be 1,2,4 or 8
  */
-static int zpci_endian_swap(uint64_t *ptr, uint8_t len)
+int zpci_endian_swap(uint64_t *ptr, uint8_t len)
 {
     uint64_t data = *ptr;
 
@@ -1392,5 +1392,6 @@ void zpci_assign_default_ops(S390PCIBusDevice *pbdev)
 
 void zpci_assign_ops_vfio_io_region(S390PCIBusDevice *pbdev)
 {
+    pbdev->ops.pcilg = s390_pci_vfio_pcilg;
     pbdev->ops.pcistb = s390_pci_vfio_pcistb;
 }
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index ad50a62..baba6b0 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -337,6 +337,59 @@ int s390_pci_get_zpci_io_region(S390PCIBusDevice *pbdev)
     return ret;
 }
 
+int s390_pci_vfio_pcilg(S390PCIBusDevice *pbdev, uint64_t *data, uint8_t pcias,
+                        uint16_t len, uint64_t offset)
+{
+    struct vfio_region_zpci_io *region = pbdev->io_region;
+    VFIOPCIDevice *vfio_pci;
+    int ret;
+
+    if (region == NULL) {
+        return -EIO;
+    }
+
+    vfio_pci = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+
+    /* Perform Length/Alignment checks */
+    switch (pcias) {
+    case ZPCI_IO_BAR_MIN...ZPCI_IO_BAR_MAX:
+        if (!len || (len > (8 - (offset & 0x7)))) {
+            return -EINVAL;
+        }
+        region->req.gaddr = (uint64_t)data;
+        region->req.offset = offset;
+        region->req.len = len;
+        region->req.pcias = pcias;
+        region->req.flags = VFIO_ZPCI_IO_FLAG_READ;
+
+        ret = pwrite(vfio_pci->vbasedev.fd, &region->req,
+                     sizeof(struct vfio_zpci_io_req),
+                     pbdev->io_region_op_offset);
+        if (ret != sizeof(struct vfio_zpci_io_req)) {
+            ret = -EIO;
+        } else {
+            ret = 0;
+        }
+        break;
+    case ZPCI_CONFIG_BAR:
+        if (!len || (len > (4 - (offset & 0x3))) || len == 3) {
+            return -EINVAL;
+        }
+        *data = pci_host_config_read_common(
+                       pbdev->pdev, offset, pci_config_size(pbdev->pdev), len);
+
+        if (zpci_endian_swap(data, len)) {
+            ret = -EINVAL;
+        }
+        ret = 0;
+        break;
+    default:
+        return -EFAULT;
+    }
+
+    return ret;
+}
+
 int s390_pci_vfio_pcistb(S390PCIBusDevice *pbdev, S390CPU *cpu, uint64_t gaddr,
                          uint8_t ar, uint8_t pcias, uint16_t len,
                          uint64_t offset)
diff --git a/include/hw/s390x/s390-pci-inst.h b/include/hw/s390x/s390-pci-inst.h
index 7ed6175..fe368fb 100644
--- a/include/hw/s390x/s390-pci-inst.h
+++ b/include/hw/s390x/s390-pci-inst.h
@@ -101,6 +101,7 @@ typedef struct ZpciFib {
 int pci_dereg_irqs(S390PCIBusDevice *pbdev);
 void pci_dereg_ioat(S390PCIIOMMU *iommu);
 int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra);
+int zpci_endian_swap(uint64_t *ptr, uint8_t len);
 int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra);
 int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra);
 int rpcit_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra);
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index f0a994f..d9fb3a4 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -22,6 +22,8 @@ S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
 void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
 void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
 int s390_pci_get_zpci_io_region(S390PCIBusDevice *pbdev);
+int s390_pci_vfio_pcilg(S390PCIBusDevice *pbdev, uint64_t *data, uint8_t pcias,
+                        uint16_t len, uint64_t offset);
 int s390_pci_vfio_pcistb(S390PCIBusDevice *pbdev, S390CPU *cpu, uint64_t gaddr,
                          uint8_t ar, uint8_t pcias, uint16_t len,
                          uint64_t offset);
@@ -42,6 +44,12 @@ static inline int s390_pci_get_zpci_io_region(S390PCIBusDevice *pbdev)
 {
     return -EINVAL;
 }
+static inline int s390_pci_vfio_pcilg(S390PCIBusDevice *pbdev, uint64_t *data,
+                                      uint8_t pcias, uint16_t len,
+                                      uint64_t offset)
+{
+    return -EIO;
+}
 static inline int s390_pci_vfio_pcistb(S390PCIBusDevice *pbdev, S390CPU *cpu,
                                        uint64_t gaddr, uint8_t ar,
                                        uint8_t pcias, uint16_t len,
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [RFC 2/8] s390x/pci: MSI-X isn't strictly required for passthrough
  2020-12-09 20:34 ` [RFC 2/8] s390x/pci: MSI-X isn't strictly required for passthrough Matthew Rosato
@ 2020-12-10 10:28   ` Cornelia Huck
  2020-12-10 15:13     ` Matthew Rosato
  0 siblings, 1 reply; 17+ messages in thread
From: Cornelia Huck @ 2020-12-10 10:28 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: thuth, pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

On Wed,  9 Dec 2020 15:34:20 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> s390 PCI currently disallows PCI devices without the MSI-X capability.
> However, this fence doesn't make sense for passthrough devices.  Move
> the check to only fence emulated devices (e.g., virtio).
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>  hw/s390x/s390-pci-bus.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 05f7460..afad048 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -1028,12 +1028,12 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>              s390_pci_get_clp_info(pbdev);
>          } else {
>              pbdev->fh |= FH_SHM_EMUL;
> -        }
>  
> -        if (s390_pci_msix_init(pbdev)) {
> -            error_setg(errp, "MSI-X support is mandatory "
> -                       "in the S390 architecture");
> -            return;
> +            if (s390_pci_msix_init(pbdev)) {
> +                error_setg(errp, "MSI-X support is mandatory "
> +                           "in the S390 architecture");
> +                return;
> +            }
>          }
>  
>          if (dev->hotplugged) {
> @@ -1073,7 +1073,9 @@ static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>          devfn = pci_dev->devfn;
>          qdev_unrealize(dev);
>  
> -        s390_pci_msix_free(pbdev);
> +        if (pbdev->fh & FH_SHM_EMUL) {
> +            s390_pci_msix_free(pbdev);
> +        }
>          s390_pci_iommu_free(s, bus, devfn);
>          pbdev->pdev = NULL;
>          pbdev->state = ZPCI_FS_RESERVED;

Remind me: Wasn't it only msi that was strictly required (i.e., not msi-x?)

Can we generally relax this requirement, possibly with some changes in
the adapter interrupt mapping? I might misremember, though.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 3/8] s390x/pci: fix pcistb length
  2020-12-09 20:34 ` [RFC 3/8] s390x/pci: fix pcistb length Matthew Rosato
@ 2020-12-10 10:30   ` Cornelia Huck
  2020-12-10 15:15     ` Matthew Rosato
  0 siblings, 1 reply; 17+ messages in thread
From: Cornelia Huck @ 2020-12-10 10:30 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: thuth, pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

On Wed,  9 Dec 2020 15:34:21 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> In pcistb_service_call, we are grabbing 8 bits from a guest register to
> indicate the length of the store operation -- but per the architecture
> the length is actually defined by 13 bits of the guest register.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>  hw/s390x/s390-pci-inst.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index 70bfd91..db86f12 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -750,7 +750,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
>      int i;
>      uint32_t fh;
>      uint8_t pcias;
> -    uint8_t len;
> +    uint16_t len;
>      uint8_t buffer[128];
>  
>      if (env->psw.mask & PSW_MASK_PSTATE) {
> @@ -760,7 +760,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
>  
>      fh = env->regs[r1] >> 32;
>      pcias = (env->regs[r1] >> 16) & 0xf;
> -    len = env->regs[r1] & 0xff;
> +    len = env->regs[r1] & 0x1fff;
>      offset = env->regs[r3];
>  
>      if (!(fh & FH_MASK_ENABLE)) {

Is that a general problem that we just did not notice before?

If yes, this probably deserves a Fixes: tag and can be queued
independently of the rest of the series.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 5/8] s390x/pci: Fix memory_region_access_valid call
  2020-12-09 20:34 ` [RFC 5/8] s390x/pci: Fix memory_region_access_valid call Matthew Rosato
@ 2020-12-10 12:15   ` Cornelia Huck
  0 siblings, 0 replies; 17+ messages in thread
From: Cornelia Huck @ 2020-12-10 12:15 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: thuth, pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

On Wed,  9 Dec 2020 15:34:23 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> In pcistb_service_handler, a call is made to validate that the memory
> region can be accessed.  However, the call is made using the entire length
> of the pcistb operation, which can be larger than the allowed memory
> access size (8).  Since we already know that the provided buffer is a
> multiple of 8, fix the call to memory_region_access_valid to iterate
> over the memory region in the same way as the subsequent call to
> memory_region_dispatch_write.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  hw/s390x/s390-pci-inst.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index b07ef2a..a5270d0 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -795,9 +795,11 @@ static int pcistb_default(S390PCIBusDevice *pbdev, S390CPU *cpu,
>      mr = s390_get_subregion(mr, offset, len);
>      offset -= mr->addr;
>  
> -    if (!memory_region_access_valid(mr, offset, len, true,
> -                                    MEMTXATTRS_UNSPECIFIED)) {
> -        return -EINVAL;
> +    for (i = 0; i < len / 8; i++) {
> +        if (!memory_region_access_valid(mr, offset + i * 8, 8, true,
> +                                        MEMTXATTRS_UNSPECIFIED)) {
> +            return -EINVAL;
> +        }
>      }
>  
>      if (s390_cpu_virt_mem_read(cpu, gaddr, ar, pbdev->pcistb_buf, len)) {

Hm, that looks like a fix that's applicable for the current code base
as well... do you want to split this out?



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 2/8] s390x/pci: MSI-X isn't strictly required for passthrough
  2020-12-10 10:28   ` Cornelia Huck
@ 2020-12-10 15:13     ` Matthew Rosato
  2020-12-17 13:08       ` Cornelia Huck
  0 siblings, 1 reply; 17+ messages in thread
From: Matthew Rosato @ 2020-12-10 15:13 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: thuth, pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

On 12/10/20 5:28 AM, Cornelia Huck wrote:
> On Wed,  9 Dec 2020 15:34:20 -0500
> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> 
>> s390 PCI currently disallows PCI devices without the MSI-X capability.
>> However, this fence doesn't make sense for passthrough devices.  Move
>> the check to only fence emulated devices (e.g., virtio).
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   hw/s390x/s390-pci-bus.c | 14 ++++++++------
>>   1 file changed, 8 insertions(+), 6 deletions(-)
>>
>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>> index 05f7460..afad048 100644
>> --- a/hw/s390x/s390-pci-bus.c
>> +++ b/hw/s390x/s390-pci-bus.c
>> @@ -1028,12 +1028,12 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>               s390_pci_get_clp_info(pbdev);
>>           } else {
>>               pbdev->fh |= FH_SHM_EMUL;
>> -        }
>>   
>> -        if (s390_pci_msix_init(pbdev)) {
>> -            error_setg(errp, "MSI-X support is mandatory "
>> -                       "in the S390 architecture");
>> -            return;
>> +            if (s390_pci_msix_init(pbdev)) {
>> +                error_setg(errp, "MSI-X support is mandatory "
>> +                           "in the S390 architecture");
>> +                return;
>> +            }
>>           }
>>   
>>           if (dev->hotplugged) {
>> @@ -1073,7 +1073,9 @@ static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>           devfn = pci_dev->devfn;
>>           qdev_unrealize(dev);
>>   
>> -        s390_pci_msix_free(pbdev);
>> +        if (pbdev->fh & FH_SHM_EMUL) {
>> +            s390_pci_msix_free(pbdev);
>> +        }
>>           s390_pci_iommu_free(s, bus, devfn);
>>           pbdev->pdev = NULL;
>>           pbdev->state = ZPCI_FS_RESERVED;
> 
> Remind me: Wasn't it only msi that was strictly required (i.e., not msi-x?)
> 
> Can we generally relax this requirement, possibly with some changes in
> the adapter interrupt mapping? I might misremember, though.
> 

Yes, but even so our current emulation support only sets up for MSI-X, 
it does not have an msi_init() equivalent.  I do believe that this 
requirement can be relaxed at some point for the emulation support as 
well, but the focus on this set was to at least stop fencing passthrough 
for no reason.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 3/8] s390x/pci: fix pcistb length
  2020-12-10 10:30   ` Cornelia Huck
@ 2020-12-10 15:15     ` Matthew Rosato
  2020-12-17 13:09       ` Cornelia Huck
  0 siblings, 1 reply; 17+ messages in thread
From: Matthew Rosato @ 2020-12-10 15:15 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: thuth, pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

On 12/10/20 5:30 AM, Cornelia Huck wrote:
> On Wed,  9 Dec 2020 15:34:21 -0500
> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> 
>> In pcistb_service_call, we are grabbing 8 bits from a guest register to
>> indicate the length of the store operation -- but per the architecture
>> the length is actually defined by 13 bits of the guest register.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
>> ---
>>   hw/s390x/s390-pci-inst.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>> index 70bfd91..db86f12 100644
>> --- a/hw/s390x/s390-pci-inst.c
>> +++ b/hw/s390x/s390-pci-inst.c
>> @@ -750,7 +750,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
>>       int i;
>>       uint32_t fh;
>>       uint8_t pcias;
>> -    uint8_t len;
>> +    uint16_t len;
>>       uint8_t buffer[128];
>>   
>>       if (env->psw.mask & PSW_MASK_PSTATE) {
>> @@ -760,7 +760,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
>>   
>>       fh = env->regs[r1] >> 32;
>>       pcias = (env->regs[r1] >> 16) & 0xf;
>> -    len = env->regs[r1] & 0xff;
>> +    len = env->regs[r1] & 0x1fff;
>>       offset = env->regs[r3];
>>   
>>       if (!(fh & FH_MASK_ENABLE)) {
> 
> Is that a general problem that we just did not notice before?
> 
> If yes, this probably deserves a Fixes: tag and can be queued
> independently of the rest of the series.
> 

Good point.  I can split this out, and same for "s390x/pci: Fix 
memory_region_access_valid call"


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 2/8] s390x/pci: MSI-X isn't strictly required for passthrough
  2020-12-10 15:13     ` Matthew Rosato
@ 2020-12-17 13:08       ` Cornelia Huck
  2020-12-17 15:12         ` Matthew Rosato
  0 siblings, 1 reply; 17+ messages in thread
From: Cornelia Huck @ 2020-12-17 13:08 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: thuth, pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

On Thu, 10 Dec 2020 10:13:29 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> On 12/10/20 5:28 AM, Cornelia Huck wrote:
> > On Wed,  9 Dec 2020 15:34:20 -0500
> > Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> >   
> >> s390 PCI currently disallows PCI devices without the MSI-X capability.
> >> However, this fence doesn't make sense for passthrough devices.  Move
> >> the check to only fence emulated devices (e.g., virtio).
> >>
> >> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> >> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
> >> ---
> >>   hw/s390x/s390-pci-bus.c | 14 ++++++++------
> >>   1 file changed, 8 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> >> index 05f7460..afad048 100644
> >> --- a/hw/s390x/s390-pci-bus.c
> >> +++ b/hw/s390x/s390-pci-bus.c
> >> @@ -1028,12 +1028,12 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >>               s390_pci_get_clp_info(pbdev);
> >>           } else {
> >>               pbdev->fh |= FH_SHM_EMUL;
> >> -        }
> >>   
> >> -        if (s390_pci_msix_init(pbdev)) {
> >> -            error_setg(errp, "MSI-X support is mandatory "
> >> -                       "in the S390 architecture");
> >> -            return;
> >> +            if (s390_pci_msix_init(pbdev)) {
> >> +                error_setg(errp, "MSI-X support is mandatory "
> >> +                           "in the S390 architecture");
> >> +                return;
> >> +            }
> >>           }
> >>   
> >>           if (dev->hotplugged) {
> >> @@ -1073,7 +1073,9 @@ static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> >>           devfn = pci_dev->devfn;
> >>           qdev_unrealize(dev);
> >>   
> >> -        s390_pci_msix_free(pbdev);
> >> +        if (pbdev->fh & FH_SHM_EMUL) {
> >> +            s390_pci_msix_free(pbdev);
> >> +        }
> >>           s390_pci_iommu_free(s, bus, devfn);
> >>           pbdev->pdev = NULL;
> >>           pbdev->state = ZPCI_FS_RESERVED;  
> > 
> > Remind me: Wasn't it only msi that was strictly required (i.e., not msi-x?)
> > 
> > Can we generally relax this requirement, possibly with some changes in
> > the adapter interrupt mapping? I might misremember, though.
> >   
> 
> Yes, but even so our current emulation support only sets up for MSI-X, 
> it does not have an msi_init() equivalent.  I do believe that this 
> requirement can be relaxed at some point for the emulation support as 
> well, but the focus on this set was to at least stop fencing passthrough 
> for no reason.

Looking back to when this was introduced, I see that 857cc71985dc
("s390x/pci: merge msix init functions") actually makes this mandatory
and states that nothing changes for passthrough. Has anything changed
regarding msi-x in the architecture in the meantime?



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 3/8] s390x/pci: fix pcistb length
  2020-12-10 15:15     ` Matthew Rosato
@ 2020-12-17 13:09       ` Cornelia Huck
  0 siblings, 0 replies; 17+ messages in thread
From: Cornelia Huck @ 2020-12-17 13:09 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: thuth, pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

On Thu, 10 Dec 2020 10:15:06 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> On 12/10/20 5:30 AM, Cornelia Huck wrote:
> > On Wed,  9 Dec 2020 15:34:21 -0500
> > Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> >   
> >> In pcistb_service_call, we are grabbing 8 bits from a guest register to
> >> indicate the length of the store operation -- but per the architecture
> >> the length is actually defined by 13 bits of the guest register.
> >>
> >> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> >> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
> >> ---
> >>   hw/s390x/s390-pci-inst.c | 4 ++--
> >>   1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> >> index 70bfd91..db86f12 100644
> >> --- a/hw/s390x/s390-pci-inst.c
> >> +++ b/hw/s390x/s390-pci-inst.c
> >> @@ -750,7 +750,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
> >>       int i;
> >>       uint32_t fh;
> >>       uint8_t pcias;
> >> -    uint8_t len;
> >> +    uint16_t len;
> >>       uint8_t buffer[128];
> >>   
> >>       if (env->psw.mask & PSW_MASK_PSTATE) {
> >> @@ -760,7 +760,7 @@ int pcistb_service_call(S390CPU *cpu, uint8_t r1, uint8_t r3, uint64_t gaddr,
> >>   
> >>       fh = env->regs[r1] >> 32;
> >>       pcias = (env->regs[r1] >> 16) & 0xf;
> >> -    len = env->regs[r1] & 0xff;
> >> +    len = env->regs[r1] & 0x1fff;
> >>       offset = env->regs[r3];
> >>   
> >>       if (!(fh & FH_MASK_ENABLE)) {  
> > 
> > Is that a general problem that we just did not notice before?
> > 
> > If yes, this probably deserves a Fixes: tag and can be queued
> > independently of the rest of the series.
> >   
> 
> Good point.  I can split this out, and same for "s390x/pci: Fix 
> memory_region_access_valid call"
> 

Any plans for sending this? I'm currently collecting patches for a last
pull request before the holidays; but I'm sure there will be more pull
requests next year :)



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [RFC 2/8] s390x/pci: MSI-X isn't strictly required for passthrough
  2020-12-17 13:08       ` Cornelia Huck
@ 2020-12-17 15:12         ` Matthew Rosato
  0 siblings, 0 replies; 17+ messages in thread
From: Matthew Rosato @ 2020-12-17 15:12 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: thuth, pmorel, david, schnelle, richard.henderson, qemu-s390x,
	qemu-devel, pasic, borntraeger, alex.williamson, mst, pbonzini

On 12/17/20 8:08 AM, Cornelia Huck wrote:
> On Thu, 10 Dec 2020 10:13:29 -0500
> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> 
>> On 12/10/20 5:28 AM, Cornelia Huck wrote:
>>> On Wed,  9 Dec 2020 15:34:20 -0500
>>> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
>>>    
>>>> s390 PCI currently disallows PCI devices without the MSI-X capability.
>>>> However, this fence doesn't make sense for passthrough devices.  Move
>>>> the check to only fence emulated devices (e.g., virtio).
>>>>
>>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>>> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
>>>> ---
>>>>    hw/s390x/s390-pci-bus.c | 14 ++++++++------
>>>>    1 file changed, 8 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>>>> index 05f7460..afad048 100644
>>>> --- a/hw/s390x/s390-pci-bus.c
>>>> +++ b/hw/s390x/s390-pci-bus.c
>>>> @@ -1028,12 +1028,12 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>>>                s390_pci_get_clp_info(pbdev);
>>>>            } else {
>>>>                pbdev->fh |= FH_SHM_EMUL;
>>>> -        }
>>>>    
>>>> -        if (s390_pci_msix_init(pbdev)) {
>>>> -            error_setg(errp, "MSI-X support is mandatory "
>>>> -                       "in the S390 architecture");
>>>> -            return;
>>>> +            if (s390_pci_msix_init(pbdev)) {
>>>> +                error_setg(errp, "MSI-X support is mandatory "
>>>> +                           "in the S390 architecture");
>>>> +                return;
>>>> +            }
>>>>            }
>>>>    
>>>>            if (dev->hotplugged) {
>>>> @@ -1073,7 +1073,9 @@ static void s390_pcihost_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
>>>>            devfn = pci_dev->devfn;
>>>>            qdev_unrealize(dev);
>>>>    
>>>> -        s390_pci_msix_free(pbdev);
>>>> +        if (pbdev->fh & FH_SHM_EMUL) {
>>>> +            s390_pci_msix_free(pbdev);
>>>> +        }
>>>>            s390_pci_iommu_free(s, bus, devfn);
>>>>            pbdev->pdev = NULL;
>>>>            pbdev->state = ZPCI_FS_RESERVED;
>>>
>>> Remind me: Wasn't it only msi that was strictly required (i.e., not msi-x?)
>>>
>>> Can we generally relax this requirement, possibly with some changes in
>>> the adapter interrupt mapping? I might misremember, though.
>>>    
>>
>> Yes, but even so our current emulation support only sets up for MSI-X,
>> it does not have an msi_init() equivalent.  I do believe that this
>> requirement can be relaxed at some point for the emulation support as
>> well, but the focus on this set was to at least stop fencing passthrough
>> for no reason.
> 
> Looking back to when this was introduced, I see that 857cc71985dc
> ("s390x/pci: merge msix init functions") actually makes this mandatory
> and states that nothing changes for passthrough. Has anything changed
> regarding msi-x in the architecture in the meantime?
> 

I don't believe anything has changed from the architecture or kernel 
support perspective (Maybe I'm wrong about the latter; Niklas feel free 
to correct me in that case) -- Otherwise, as far as I can, tell the 
statement from 857cc71985dc was just incorrect and the restriction was 
placed unnecessarily for passthrough.  vfio-pci sets up for both, and 
our base s390 kernel support allows for both (and I can drive ISM 
traffic and see interrupts being routed to the guest with these patches 
applied of course); it's just our qemu emulation support that only sets 
up for MSI-X.


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-12-17 15:19 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-09 20:34 [RFC 0/8] s390x/pci: Fixing s390 vfio-pci ISM support Matthew Rosato
2020-12-09 20:34 ` [RFC 1/8] linux-headers: update against 5.10-rc7 Matthew Rosato
2020-12-09 20:34 ` [RFC 2/8] s390x/pci: MSI-X isn't strictly required for passthrough Matthew Rosato
2020-12-10 10:28   ` Cornelia Huck
2020-12-10 15:13     ` Matthew Rosato
2020-12-17 13:08       ` Cornelia Huck
2020-12-17 15:12         ` Matthew Rosato
2020-12-09 20:34 ` [RFC 3/8] s390x/pci: fix pcistb length Matthew Rosato
2020-12-10 10:30   ` Cornelia Huck
2020-12-10 15:15     ` Matthew Rosato
2020-12-17 13:09       ` Cornelia Huck
2020-12-09 20:34 ` [RFC 4/8] s390x/pci: Introduce the ZpciOps structure Matthew Rosato
2020-12-09 20:34 ` [RFC 5/8] s390x/pci: Fix memory_region_access_valid call Matthew Rosato
2020-12-10 12:15   ` Cornelia Huck
2020-12-09 20:34 ` [RFC 6/8] s390x/pci: Handle devices that support relaxed alignment Matthew Rosato
2020-12-09 20:34 ` [RFC 7/8] s390x/pci: PCISTB via the vfio zPCI I/O region Matthew Rosato
2020-12-09 20:34 ` [RFC 8/8] s390x/pci: PCILG " Matthew Rosato

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).