All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 0/8] s390x/pci: zPCI interpretation support
@ 2022-05-24 19:02 Matthew Rosato
  2022-05-24 19:02 ` [PATCH v6 1/8] Update linux headers Matthew Rosato
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Matthew Rosato @ 2022-05-24 19:02 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

For QEMU, the majority of the work in enabling instruction interpretation       
is handled via SHM bit settings (to indicate to firmware whether or not
interpretive execution facilities are to be used) + a new KVM ioctl is
used to setup firmware-interpreted forwarding of Adapter Event
Notifications.                                        
                                                                                
This series also adds a new, optional 'interpret' parameter to zpci which       
can be used to disable interpretation support (interpret=off) as well as        
an 'forwarding_assist' parameter to determine whether or not the firmware       
assist will be used for adapter event delivery (default when
interpretation is in use) or whether the host will be responsible for
delivering all adapter event notifications (forwarding_assist=off).
                                                                                
The ZPCI_INTERP CPU feature is added beginning with the z14 model to            
enable this support.                                                            
                                                                                
As a consequence of implementing zPCI interpretation, ISM devices now           
become eligible for passthrough (but only when zPCI interpretation is           
available).                                                                     
                                                                                
From the perspective of guest configuration, you passthrough zPCI devices       
in the same manner as before, with intepretation support being used by          
default if available in kernel+qemu.                                            
                                                                                
Associated kernel series:                                                       
https://lore.kernel.org/kvm/20220524185907.140285-1-mjrosato@linux.ibm.com/                             
                       
Changelog v5->v6:
- Update linux headers (KVM_CAP_S390_ZPCI_OP changed)
- Move featoff to ccw_machine_7_0_instance_options() (Thomas)
- s390_pci_get_host_fh: s/unsigned int/uint32_t/ (Thomas)
- s390_pci_kvm_interp_allowed: add !s390_is_pv() check (Pierre)
- Fail guest SET PCI FN (enable) if we cannot get the host fh
  or if the retrieved host FH is not enabled (Pierre)
- bugfix: don't free msix if we never initialized it

Matthew Rosato (8):
  Update linux headers
  target/s390x: add zpci-interp to cpu models
  s390x/pci: add routine to get host function handle from CLP info
  s390x/pci: enable for load/store intepretation
  s390x/pci: don't fence interpreted devices without MSI-X
  s390x/pci: enable adapter event notification for interpreted devices
  s390x/pci: let intercept devices have separate PCI groups
  s390x/pci: reflect proper maxstbl for groups of interpreted devices

 hw/s390x/meson.build                |   1 +
 hw/s390x/s390-pci-bus.c             | 111 ++++++++++++++++++++++--
 hw/s390x/s390-pci-inst.c            |  56 +++++++++++-
 hw/s390x/s390-pci-kvm.c             |  53 ++++++++++++
 hw/s390x/s390-pci-vfio.c            | 129 +++++++++++++++++++++++-----
 hw/s390x/s390-virtio-ccw.c          |   1 +
 include/hw/s390x/s390-pci-bus.h     |   8 +-
 include/hw/s390x/s390-pci-kvm.h     |  38 ++++++++
 include/hw/s390x/s390-pci-vfio.h    |   5 ++
 linux-headers/asm-s390/kvm.h        |   1 +
 linux-headers/linux/kvm.h           |  32 +++++++
 linux-headers/linux/vfio.h          |   4 +-
 linux-headers/linux/vfio_zdev.h     |   7 ++
 target/s390x/cpu_features_def.h.inc |   1 +
 target/s390x/gen-features.c         |   2 +
 target/s390x/kvm/kvm.c              |   8 ++
 target/s390x/kvm/kvm_s390x.h        |   1 +
 17 files changed, 426 insertions(+), 32 deletions(-)
 create mode 100644 hw/s390x/s390-pci-kvm.c
 create mode 100644 include/hw/s390x/s390-pci-kvm.h

-- 
2.27.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v6 1/8] Update linux headers
  2022-05-24 19:02 [PATCH v6 0/8] s390x/pci: zPCI interpretation support Matthew Rosato
@ 2022-05-24 19:02 ` Matthew Rosato
  2022-05-24 19:02 ` [PATCH v6 2/8] target/s390x: add zpci-interp to cpu models Matthew Rosato
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Matthew Rosato @ 2022-05-24 19:02 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

This is a placeholder that pulls in unmerged kernel changes
required by this item.  A proper header sync can be done once the
associated kernel code merges.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 linux-headers/asm-s390/kvm.h    |  1 +
 linux-headers/linux/kvm.h       | 32 ++++++++++++++++++++++++++++++++
 linux-headers/linux/vfio.h      |  4 ++--
 linux-headers/linux/vfio_zdev.h |  7 +++++++
 4 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
index f053b8304a..d8259ff9a1 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -130,6 +130,7 @@ struct kvm_s390_vm_cpu_machine {
 #define KVM_S390_VM_CPU_FEAT_PFMFI	11
 #define KVM_S390_VM_CPU_FEAT_SIGPIF	12
 #define KVM_S390_VM_CPU_FEAT_KSS	13
+#define KVM_S390_VM_CPU_FEAT_ZPCI_INTERP 14
 struct kvm_s390_vm_cpu_feat {
 	__u64 feat[16];
 };
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 0d05d02ee4..3013371078 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1150,6 +1150,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_DISABLE_QUIRKS2 213
 /* #define KVM_CAP_VM_TSC_CONTROL 214 */
 #define KVM_CAP_SYSTEM_EVENT_DATA 215
+#define KVM_CAP_S390_ZPCI_OP 216
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -2066,4 +2067,35 @@ struct kvm_stats_desc {
 /* Available with KVM_CAP_XSAVE2 */
 #define KVM_GET_XSAVE2		  _IOR(KVMIO,  0xcf, struct kvm_xsave)
 
+/* Available with KVM_CAP_S390_ZPCI_OP */
+#define KVM_S390_ZPCI_OP         _IOW(KVMIO,  0xd0, struct kvm_s390_zpci_op)
+
+struct kvm_s390_zpci_op {
+	/* in */
+	__u32 fh;               /* target device */
+	__u8  op;               /* operation to perform */
+	__u8  pad[3];
+	union {
+		/* for KVM_S390_ZPCIOP_REG_AEN */
+		struct {
+			__u64 ibv;      /* Guest addr of interrupt bit vector */
+			__u64 sb;       /* Guest addr of summary bit */
+			__u32 flags;
+			__u32 noi;      /* Number of interrupts */
+			__u8 isc;       /* Guest interrupt subclass */
+			__u8 sbo;       /* Offset of guest summary bit vector */
+			__u16 pad;
+		} reg_aen;
+		__u64 reserved[8];
+	} u;
+};
+
+/* types for kvm_s390_zpci_op->op */
+#define KVM_S390_ZPCIOP_REG_AEN                0
+#define KVM_S390_ZPCIOP_DEREG_AEN      1
+
+/* flags for kvm_s390_zpci_op->u.reg_aen.flags */
+#define KVM_S390_ZPCIOP_REGAEN_HOST    (1 << 0)
+
+
 #endif /* __LINUX_KVM_H */
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index e9f7795c39..ede44b5572 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -643,7 +643,7 @@ enum {
 };
 
 /**
- * VFIO_DEVICE_GET_PCI_HOT_RESET_INFO - _IORW(VFIO_TYPE, VFIO_BASE + 12,
+ * VFIO_DEVICE_GET_PCI_HOT_RESET_INFO - _IOWR(VFIO_TYPE, VFIO_BASE + 12,
  *					      struct vfio_pci_hot_reset_info)
  *
  * Return: 0 on success, -errno on failure:
@@ -770,7 +770,7 @@ struct vfio_device_ioeventfd {
 #define VFIO_DEVICE_IOEVENTFD		_IO(VFIO_TYPE, VFIO_BASE + 16)
 
 /**
- * VFIO_DEVICE_FEATURE - _IORW(VFIO_TYPE, VFIO_BASE + 17,
+ * VFIO_DEVICE_FEATURE - _IOWR(VFIO_TYPE, VFIO_BASE + 17,
  *			       struct vfio_device_feature)
  *
  * Get, set, or probe feature data of the device.  The feature is selected
diff --git a/linux-headers/linux/vfio_zdev.h b/linux-headers/linux/vfio_zdev.h
index b4309397b6..77f2aff1f2 100644
--- a/linux-headers/linux/vfio_zdev.h
+++ b/linux-headers/linux/vfio_zdev.h
@@ -29,6 +29,9 @@ struct vfio_device_info_cap_zpci_base {
 	__u16 fmb_length;	/* Measurement Block Length (in bytes) */
 	__u8 pft;		/* PCI Function Type */
 	__u8 gid;		/* PCI function group ID */
+	/* End of version 1 */
+	__u32 fh;		/* PCI function handle */
+	/* End of version 2 */
 };
 
 /**
@@ -47,6 +50,10 @@ struct vfio_device_info_cap_zpci_group {
 	__u16 noi;		/* Maximum number of MSIs */
 	__u16 maxstbl;		/* Maximum Store Block Length */
 	__u8 version;		/* Supported PCI Version */
+	/* End of version 1 */
+	__u8 reserved;
+	__u16 imaxstbl;		/* Maximum Interpreted Store Block Length */
+	/* End of version 2 */
 };
 
 /**
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 2/8] target/s390x: add zpci-interp to cpu models
  2022-05-24 19:02 [PATCH v6 0/8] s390x/pci: zPCI interpretation support Matthew Rosato
  2022-05-24 19:02 ` [PATCH v6 1/8] Update linux headers Matthew Rosato
@ 2022-05-24 19:02 ` Matthew Rosato
  2022-06-01  9:52   ` David Hildenbrand
  2022-05-24 19:03 ` [PATCH v6 3/8] s390x/pci: add routine to get host function handle from CLP info Matthew Rosato
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 14+ messages in thread
From: Matthew Rosato @ 2022-05-24 19:02 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

The zpci-interp feature is used to specify whether zPCI interpretation is
to be used for this guest.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-virtio-ccw.c          | 1 +
 target/s390x/cpu_features_def.h.inc | 1 +
 target/s390x/gen-features.c         | 2 ++
 target/s390x/kvm/kvm.c              | 1 +
 4 files changed, 5 insertions(+)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 047cca0487..b33310a135 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -806,6 +806,7 @@ static void ccw_machine_7_0_instance_options(MachineState *machine)
     static const S390FeatInit qemu_cpu_feat = { S390_FEAT_LIST_QEMU_V7_0 };
 
     ccw_machine_7_1_instance_options(machine);
+    s390_cpudef_featoff_greater(14, 1, S390_FEAT_ZPCI_INTERP);
     s390_set_qemu_cpu_model(0x8561, 15, 1, qemu_cpu_feat);
 }
 
diff --git a/target/s390x/cpu_features_def.h.inc b/target/s390x/cpu_features_def.h.inc
index e86662bb3b..4ade3182aa 100644
--- a/target/s390x/cpu_features_def.h.inc
+++ b/target/s390x/cpu_features_def.h.inc
@@ -146,6 +146,7 @@ DEF_FEAT(SIE_CEI, "cei", SCLP_CPU, 43, "SIE: Conditional-external-interception f
 DEF_FEAT(DAT_ENH_2, "dateh2", MISC, 0, "DAT-enhancement facility 2")
 DEF_FEAT(CMM, "cmm", MISC, 0, "Collaborative-memory-management facility")
 DEF_FEAT(AP, "ap", MISC, 0, "AP instructions installed")
+DEF_FEAT(ZPCI_INTERP, "zpci-interp", MISC, 0, "zPCI interpretation")
 
 /* Features exposed via the PLO instruction. */
 DEF_FEAT(PLO_CL, "plo-cl", PLO, 0, "PLO Compare and load (32 bit in general registers)")
diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
index c03ec2c9a9..f991646c01 100644
--- a/target/s390x/gen-features.c
+++ b/target/s390x/gen-features.c
@@ -554,6 +554,7 @@ static uint16_t full_GEN14_GA1[] = {
     S390_FEAT_HPMA2,
     S390_FEAT_SIE_KSS,
     S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+    S390_FEAT_ZPCI_INTERP,
 };
 
 #define full_GEN14_GA2 EmptyFeat
@@ -650,6 +651,7 @@ static uint16_t default_GEN14_GA1[] = {
     S390_FEAT_GROUP_MSA_EXT_8,
     S390_FEAT_MULTIPLE_EPOCH,
     S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+    S390_FEAT_ZPCI_INTERP,
 };
 
 #define default_GEN14_GA2 EmptyFeat
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 53098bf541..314b0a9039 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2293,6 +2293,7 @@ static int kvm_to_feat[][2] = {
     { KVM_S390_VM_CPU_FEAT_PFMFI, S390_FEAT_SIE_PFMFI},
     { KVM_S390_VM_CPU_FEAT_SIGPIF, S390_FEAT_SIE_SIGPIF},
     { KVM_S390_VM_CPU_FEAT_KSS, S390_FEAT_SIE_KSS},
+    { KVM_S390_VM_CPU_FEAT_ZPCI_INTERP, S390_FEAT_ZPCI_INTERP },
 };
 
 static int query_cpu_feat(S390FeatBitmap features)
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 3/8] s390x/pci: add routine to get host function handle from CLP info
  2022-05-24 19:02 [PATCH v6 0/8] s390x/pci: zPCI interpretation support Matthew Rosato
  2022-05-24 19:02 ` [PATCH v6 1/8] Update linux headers Matthew Rosato
  2022-05-24 19:02 ` [PATCH v6 2/8] target/s390x: add zpci-interp to cpu models Matthew Rosato
@ 2022-05-24 19:03 ` Matthew Rosato
  2022-05-24 19:03 ` [PATCH v6 4/8] s390x/pci: enable for load/store intepretation Matthew Rosato
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Matthew Rosato @ 2022-05-24 19:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

In order to interface with the underlying host zPCI device, we need
to know it's function handle.  Add a routine to grab this from the
vfio CLP capabilities chain.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-vfio.c         | 83 ++++++++++++++++++++++++++------
 include/hw/s390x/s390-pci-vfio.h |  5 ++
 2 files changed, 72 insertions(+), 16 deletions(-)

diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 6f80a47e29..4bf0a7e22d 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -124,6 +124,27 @@ static void s390_pci_read_base(S390PCIBusDevice *pbdev,
     pbdev->zpci_fn.pft = 0;
 }
 
+static bool get_host_fh(S390PCIBusDevice *pbdev, struct vfio_device_info *info,
+                        uint32_t *fh)
+{
+    struct vfio_info_cap_header *hdr;
+    struct vfio_device_info_cap_zpci_base *cap;
+    VFIOPCIDevice *vpci =  container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+
+    hdr = vfio_get_device_info_cap(info, VFIO_DEVICE_INFO_CAP_ZPCI_BASE);
+
+    /* Can only get the host fh with version 2 or greater */
+    if (hdr == NULL || hdr->version < 2) {
+        trace_s390_pci_clp_cap(vpci->vbasedev.name,
+                               VFIO_DEVICE_INFO_CAP_ZPCI_BASE);
+        return false;
+    }
+    cap = (void *) hdr;
+
+    *fh = cap->fh;
+    return true;
+}
+
 static void s390_pci_read_group(S390PCIBusDevice *pbdev,
                                 struct vfio_device_info *info)
 {
@@ -217,25 +238,13 @@ static void s390_pci_read_pfip(S390PCIBusDevice *pbdev,
     memcpy(pbdev->zpci_fn.pfip, cap->pfip, CLP_PFIP_NR_SEGMENTS);
 }
 
-/*
- * This function will issue the VFIO_DEVICE_GET_INFO ioctl and look for
- * capabilities that contain information about CLP features provided by the
- * underlying host.
- * On entry, defaults have already been placed into the guest CLP response
- * buffers.  On exit, defaults will have been overwritten for any CLP features
- * found in the capability chain; defaults will remain for any CLP features not
- * found in the chain.
- */
-void s390_pci_get_clp_info(S390PCIBusDevice *pbdev)
+static struct vfio_device_info *get_device_info(S390PCIBusDevice *pbdev,
+                                                uint32_t argsz)
 {
-    g_autofree struct vfio_device_info *info = NULL;
+    struct vfio_device_info *info = g_malloc0(argsz);
     VFIOPCIDevice *vfio_pci;
-    uint32_t argsz;
     int fd;
 
-    argsz = sizeof(*info);
-    info = g_malloc0(argsz);
-
     vfio_pci = container_of(pbdev->pdev, VFIOPCIDevice, pdev);
     fd = vfio_pci->vbasedev.fd;
 
@@ -250,7 +259,8 @@ retry:
 
     if (ioctl(fd, VFIO_DEVICE_GET_INFO, info)) {
         trace_s390_pci_clp_dev_info(vfio_pci->vbasedev.name);
-        return;
+        free(info);
+        return NULL;
     }
 
     if (info->argsz > argsz) {
@@ -259,6 +269,47 @@ retry:
         goto retry;
     }
 
+    return info;
+}
+
+/*
+ * Get the host function handle from the vfio CLP capabilities chain.  Returns
+ * true if a fh value was placed into the provided buffer.  Returns false
+ * if a fh could not be obtained (ioctl failed or capabilitiy version does
+ * not include the fh)
+ */
+bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev, uint32_t *fh)
+{
+    g_autofree struct vfio_device_info *info = NULL;
+
+    assert(fh);
+
+    info = get_device_info(pbdev, sizeof(*info));
+    if (!info) {
+        return false;
+    }
+
+    return get_host_fh(pbdev, info, fh);
+}
+
+/*
+ * This function will issue the VFIO_DEVICE_GET_INFO ioctl and look for
+ * capabilities that contain information about CLP features provided by the
+ * underlying host.
+ * On entry, defaults have already been placed into the guest CLP response
+ * buffers.  On exit, defaults will have been overwritten for any CLP features
+ * found in the capability chain; defaults will remain for any CLP features not
+ * found in the chain.
+ */
+void s390_pci_get_clp_info(S390PCIBusDevice *pbdev)
+{
+    g_autofree struct vfio_device_info *info = NULL;
+
+    info = get_device_info(pbdev, sizeof(*info));
+    if (!info) {
+        return;
+    }
+
     /*
      * Find the CLP features provided and fill in the guest CLP responses.
      * Always call s390_pci_read_base first as information from this could
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index ff708aef50..ae1b126ff7 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -20,6 +20,7 @@ bool s390_pci_update_dma_avail(int fd, unsigned int *avail);
 S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
                                           S390PCIBusDevice *pbdev);
 void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
+bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev, uint32_t *fh);
 void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
 #else
 static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
@@ -33,6 +34,10 @@ static inline S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
 }
 static inline void s390_pci_end_dma_count(S390pciState *s,
                                           S390PCIDMACount *cnt) { }
+static inline bool s390_pci_get_host_fh(S390PCIBusDevice *pbdev, uint32_t *fh)
+{
+    return false;
+}
 static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
 #endif
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 4/8] s390x/pci: enable for load/store intepretation
  2022-05-24 19:02 [PATCH v6 0/8] s390x/pci: zPCI interpretation support Matthew Rosato
                   ` (2 preceding siblings ...)
  2022-05-24 19:03 ` [PATCH v6 3/8] s390x/pci: add routine to get host function handle from CLP info Matthew Rosato
@ 2022-05-24 19:03 ` Matthew Rosato
  2022-05-24 19:03 ` [PATCH v6 5/8] s390x/pci: don't fence interpreted devices without MSI-X Matthew Rosato
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Matthew Rosato @ 2022-05-24 19:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

If the appropriate CPU facilty is available as well as the necessary
ZPCI_OP ioctl, then the underlying KVM host will enable load/store
intepretation for any guest device without a SHM bit in the guest
function handle.  For a device that will be using interpretation
support, ensure the guest function handle matches the host function
handle; this value is re-checked every time the guest issues a SET PCI FN
to enable the guest device as it is the only opportunity to reflect
function handle changes.

By default, unless interpret=off is specified, interpretation support will
always be assumed and exploited if the necessary ioctl and features are
available on the host kernel.  When these are unavailable, we will silently
revert to the interception model; this allows existing guest configurations
to work unmodified on hosts with and without zPCI interpretation support,
allowing QEMU to choose the best support model available.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/meson.build            |  1 +
 hw/s390x/s390-pci-bus.c         | 66 ++++++++++++++++++++++++++++++++-
 hw/s390x/s390-pci-inst.c        | 16 ++++++++
 hw/s390x/s390-pci-kvm.c         | 23 ++++++++++++
 include/hw/s390x/s390-pci-bus.h |  1 +
 include/hw/s390x/s390-pci-kvm.h | 24 ++++++++++++
 target/s390x/kvm/kvm.c          |  7 ++++
 target/s390x/kvm/kvm_s390x.h    |  1 +
 8 files changed, 138 insertions(+), 1 deletion(-)
 create mode 100644 hw/s390x/s390-pci-kvm.c
 create mode 100644 include/hw/s390x/s390-pci-kvm.h

diff --git a/hw/s390x/meson.build b/hw/s390x/meson.build
index feefe0717e..f291016fee 100644
--- a/hw/s390x/meson.build
+++ b/hw/s390x/meson.build
@@ -23,6 +23,7 @@ s390x_ss.add(when: 'CONFIG_KVM', if_true: files(
   's390-skeys-kvm.c',
   's390-stattrib-kvm.c',
   'pv.c',
+  's390-pci-kvm.c',
 ))
 s390x_ss.add(when: 'CONFIG_TCG', if_true: files(
   'tod-tcg.c',
diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 4b2bdd94b3..156051e6e9 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -16,6 +16,7 @@
 #include "qapi/visitor.h"
 #include "hw/s390x/s390-pci-bus.h"
 #include "hw/s390x/s390-pci-inst.h"
+#include "hw/s390x/s390-pci-kvm.h"
 #include "hw/s390x/s390-pci-vfio.h"
 #include "hw/pci/pci_bus.h"
 #include "hw/qdev-properties.h"
@@ -971,12 +972,51 @@ static void s390_pci_update_subordinate(PCIDevice *dev, uint32_t nr)
     }
 }
 
+static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)
+{
+    uint32_t idx, fh;
+
+    if (!s390_pci_get_host_fh(pbdev, &fh)) {
+        return -EPERM;
+    }
+
+    /*
+     * The host device is already in an enabled state, but we always present
+     * the initial device state to the guest as disabled (ZPCI_FS_DISABLED).
+     * Therefore, mask off the enable bit from the passthrough handle until
+     * the guest issues a CLP SET PCI FN later to enable the device.
+     */
+    pbdev->fh = fh & ~FH_MASK_ENABLE;
+
+    /* Next, see if the idx is already in-use */
+    idx = pbdev->fh & FH_MASK_INDEX;
+    if (pbdev->idx != idx) {
+        if (s390_pci_find_dev_by_idx(s, idx)) {
+            return -EINVAL;
+        }
+        /*
+         * Update the idx entry with the passed through idx
+         * If the relinquished idx is lower than next_idx, use it
+         * to replace next_idx
+         */
+        g_hash_table_remove(s->zpci_table, &pbdev->idx);
+        if (idx < s->next_idx) {
+            s->next_idx = idx;
+        }
+        pbdev->idx = idx;
+        g_hash_table_insert(s->zpci_table, &pbdev->idx, pbdev);
+    }
+
+    return 0;
+}
+
 static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                               Error **errp)
 {
     S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
     PCIDevice *pdev = NULL;
     S390PCIBusDevice *pbdev = NULL;
+    int rc;
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
         PCIBridge *pb = PCI_BRIDGE(dev);
@@ -1022,12 +1062,35 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         set_pbdev_info(pbdev);
 
         if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
-            pbdev->fh |= FH_SHM_VFIO;
+            /*
+             * By default, interpretation is always requested; if the available
+             * facilities indicate it is not available, fallback to the
+             * interception model.
+             */
+            if (pbdev->interp) {
+                if (s390_pci_kvm_interp_allowed()) {
+                    rc = s390_pci_interp_plug(s, pbdev);
+                    if (rc) {
+                        error_setg(errp, "Plug failed for zPCI device in "
+                                   "interpretation mode: %d", rc);
+                        return;
+                    }
+                } else {
+                    DPRINTF("zPCI interpretation facilities missing.\n");
+                    pbdev->interp = false;
+                }
+            }
             pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
             /* Fill in CLP information passed via the vfio region */
             s390_pci_get_clp_info(pbdev);
+            if (!pbdev->interp) {
+                /* Do vfio passthrough but intercept for I/O */
+                pbdev->fh |= FH_SHM_VFIO;
+            }
         } else {
             pbdev->fh |= FH_SHM_EMUL;
+            /* Always intercept emulated devices */
+            pbdev->interp = false;
         }
 
         if (s390_pci_msix_init(pbdev)) {
@@ -1360,6 +1423,7 @@ static Property s390_pci_device_properties[] = {
     DEFINE_PROP_UINT16("uid", S390PCIBusDevice, uid, UID_UNDEFINED),
     DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
     DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
+    DEFINE_PROP_BOOL("interpret", S390PCIBusDevice, interp, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 6d400d4147..651ec38635 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -18,6 +18,8 @@
 #include "sysemu/hw_accel.h"
 #include "hw/s390x/s390-pci-inst.h"
 #include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/s390-pci-kvm.h"
+#include "hw/s390x/s390-pci-vfio.h"
 #include "hw/s390x/tod.h"
 
 #ifndef DEBUG_S390PCI_INST
@@ -246,6 +248,20 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
                 goto out;
             }
 
+            /*
+             * Take this opportunity to make sure we still have an accurate
+             * host fh.  It's possible part of the handle changed while the
+             * device was disabled to the guest (e.g. vfio hot reset for
+             * ISM during plug)
+             */
+            if (pbdev->interp) {
+                /* Take this opportunity to make sure we are sync'd with host */
+                if (!s390_pci_get_host_fh(pbdev, &pbdev->fh) ||
+                    !(pbdev->fh & FH_MASK_ENABLE)) {
+                    stw_p(&ressetpci->hdr.rsp, CLP_RC_SETPCIFN_FH);
+                    goto out;
+                }
+            }
             pbdev->fh |= FH_MASK_ENABLE;
             pbdev->state = ZPCI_FS_ENABLED;
             stl_p(&ressetpci->fh, pbdev->fh);
diff --git a/hw/s390x/s390-pci-kvm.c b/hw/s390x/s390-pci-kvm.c
new file mode 100644
index 0000000000..e503aaf1ca
--- /dev/null
+++ b/hw/s390x/s390-pci-kvm.c
@@ -0,0 +1,23 @@
+/*
+ * s390 zPCI KVM interfaces
+ *
+ * Copyright 2022 IBM Corp.
+ * Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "kvm/kvm_s390x.h"
+#include "hw/s390x/pv.h"
+#include "hw/s390x/s390-pci-kvm.h"
+#include "cpu_models.h"
+
+bool s390_pci_kvm_interp_allowed(void)
+{
+    return (s390_has_feat(S390_FEAT_ZPCI_INTERP) && kvm_s390_get_zpci_op() &&
+            !s390_is_pv());
+}
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index da3cde2bb4..a9843dfe97 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -350,6 +350,7 @@ struct S390PCIBusDevice {
     IndAddr *indicator;
     bool pci_unplug_request_processed;
     bool unplug_requested;
+    bool interp;
     QTAILQ_ENTRY(S390PCIBusDevice) link;
 };
 
diff --git a/include/hw/s390x/s390-pci-kvm.h b/include/hw/s390x/s390-pci-kvm.h
new file mode 100644
index 0000000000..80a2e7d0ca
--- /dev/null
+++ b/include/hw/s390x/s390-pci-kvm.h
@@ -0,0 +1,24 @@
+/*
+ * s390 PCI KVM interfaces
+ *
+ * Copyright 2022 IBM Corp.
+ * Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+
+#ifndef HW_S390_PCI_KVM_H
+#define HW_S390_PCI_KVM_H
+
+#ifdef CONFIG_KVM
+bool s390_pci_kvm_interp_allowed(void);
+#else
+static inline bool s390_pci_kvm_interp_allowed(void)
+{
+    return false;
+}
+#endif
+
+#endif
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 314b0a9039..b6717216b3 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -156,6 +156,7 @@ static int cap_ri;
 static int cap_hpage_1m;
 static int cap_vcpu_resets;
 static int cap_protected;
+static int cap_zpci_op;
 
 static int active_cmma;
 
@@ -357,6 +358,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     cap_s390_irq = kvm_check_extension(s, KVM_CAP_S390_INJECT_IRQ);
     cap_vcpu_resets = kvm_check_extension(s, KVM_CAP_S390_VCPU_RESETS);
     cap_protected = kvm_check_extension(s, KVM_CAP_S390_PROTECTED);
+    cap_zpci_op = kvm_check_extension(s, KVM_CAP_S390_ZPCI_OP);
 
     kvm_vm_enable_cap(s, KVM_CAP_S390_USER_SIGP, 0);
     kvm_vm_enable_cap(s, KVM_CAP_S390_VECTOR_REGISTERS, 0);
@@ -2566,3 +2568,8 @@ bool kvm_arch_cpu_check_are_resettable(void)
 {
     return true;
 }
+
+int kvm_s390_get_zpci_op(void)
+{
+    return cap_zpci_op;
+}
diff --git a/target/s390x/kvm/kvm_s390x.h b/target/s390x/kvm/kvm_s390x.h
index 05a5e1e6f4..aaae8570de 100644
--- a/target/s390x/kvm/kvm_s390x.h
+++ b/target/s390x/kvm/kvm_s390x.h
@@ -27,6 +27,7 @@ void kvm_s390_vcpu_interrupt_pre_save(S390CPU *cpu);
 int kvm_s390_vcpu_interrupt_post_load(S390CPU *cpu);
 int kvm_s390_get_hpage_1m(void);
 int kvm_s390_get_ri(void);
+int kvm_s390_get_zpci_op(void);
 int kvm_s390_get_clock(uint8_t *tod_high, uint64_t *tod_clock);
 int kvm_s390_get_clock_ext(uint8_t *tod_high, uint64_t *tod_clock);
 int kvm_s390_set_clock(uint8_t tod_high, uint64_t tod_clock);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 5/8] s390x/pci: don't fence interpreted devices without MSI-X
  2022-05-24 19:02 [PATCH v6 0/8] s390x/pci: zPCI interpretation support Matthew Rosato
                   ` (3 preceding siblings ...)
  2022-05-24 19:03 ` [PATCH v6 4/8] s390x/pci: enable for load/store intepretation Matthew Rosato
@ 2022-05-24 19:03 ` Matthew Rosato
  2022-05-24 19:03 ` [PATCH v6 6/8] s390x/pci: enable adapter event notification for interpreted devices Matthew Rosato
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Matthew Rosato @ 2022-05-24 19:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

Lack of MSI-X support is not an issue for interpreted passthrough
devices, so let's let these in.  This will allow, for example, ISM
devices to be passed through -- but only when interpretation is
available and being used.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 156051e6e9..816d17af99 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -881,6 +881,10 @@ static int s390_pci_msix_init(S390PCIBusDevice *pbdev)
 
 static void s390_pci_msix_free(S390PCIBusDevice *pbdev)
 {
+    if (pbdev->msix.entries == 0) {
+        return;
+    }
+
     memory_region_del_subregion(&pbdev->iommu->mr, &pbdev->msix_notify_mr);
     object_unparent(OBJECT(&pbdev->msix_notify_mr));
 }
@@ -1093,7 +1097,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             pbdev->interp = false;
         }
 
-        if (s390_pci_msix_init(pbdev)) {
+        if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
             error_setg(errp, "MSI-X support is mandatory "
                        "in the S390 architecture");
             return;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 6/8] s390x/pci: enable adapter event notification for interpreted devices
  2022-05-24 19:02 [PATCH v6 0/8] s390x/pci: zPCI interpretation support Matthew Rosato
                   ` (4 preceding siblings ...)
  2022-05-24 19:03 ` [PATCH v6 5/8] s390x/pci: don't fence interpreted devices without MSI-X Matthew Rosato
@ 2022-05-24 19:03 ` Matthew Rosato
  2022-05-24 19:03 ` [PATCH v6 7/8] s390x/pci: let intercept devices have separate PCI groups Matthew Rosato
  2022-05-24 19:03 ` [PATCH v6 8/8] s390x/pci: reflect proper maxstbl for groups of interpreted devices Matthew Rosato
  7 siblings, 0 replies; 14+ messages in thread
From: Matthew Rosato @ 2022-05-24 19:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

Use the associated kvm ioctl operation to enable adapter event notification
and forwarding for devices when requested.  This feature will be set up
with or without firmware assist based upon the 'forwarding_assist' setting.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c         | 20 ++++++++++++++---
 hw/s390x/s390-pci-inst.c        | 40 +++++++++++++++++++++++++++++++--
 hw/s390x/s390-pci-kvm.c         | 30 +++++++++++++++++++++++++
 include/hw/s390x/s390-pci-bus.h |  1 +
 include/hw/s390x/s390-pci-kvm.h | 14 ++++++++++++
 5 files changed, 100 insertions(+), 5 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 816d17af99..e66a0dfbef 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -190,7 +190,10 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
         rc = SCLP_RC_NO_ACTION_REQUIRED;
         break;
     default:
-        if (pbdev->summary_ind) {
+        if (pbdev->interp && (pbdev->fh & FH_MASK_ENABLE)) {
+            /* Interpreted devices were using interrupt forwarding */
+            s390_pci_kvm_aif_disable(pbdev);
+        } else if (pbdev->summary_ind) {
             pci_dereg_irqs(pbdev);
         }
         if (pbdev->iommu->enabled) {
@@ -1082,6 +1085,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                 } else {
                     DPRINTF("zPCI interpretation facilities missing.\n");
                     pbdev->interp = false;
+                    pbdev->forwarding_assist = false;
                 }
             }
             pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
@@ -1090,11 +1094,13 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             if (!pbdev->interp) {
                 /* Do vfio passthrough but intercept for I/O */
                 pbdev->fh |= FH_SHM_VFIO;
+                pbdev->forwarding_assist = false;
             }
         } else {
             pbdev->fh |= FH_SHM_EMUL;
             /* Always intercept emulated devices */
             pbdev->interp = false;
+            pbdev->forwarding_assist = false;
         }
 
         if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
@@ -1244,7 +1250,10 @@ static void s390_pcihost_reset(DeviceState *dev)
     /* Process all pending unplug requests */
     QTAILQ_FOREACH_SAFE(pbdev, &s->zpci_devs, link, next) {
         if (pbdev->unplug_requested) {
-            if (pbdev->summary_ind) {
+            if (pbdev->interp && (pbdev->fh & FH_MASK_ENABLE)) {
+                /* Interpreted devices were using interrupt forwarding */
+                s390_pci_kvm_aif_disable(pbdev);
+            } else if (pbdev->summary_ind) {
                 pci_dereg_irqs(pbdev);
             }
             if (pbdev->iommu->enabled) {
@@ -1382,7 +1391,10 @@ static void s390_pci_device_reset(DeviceState *dev)
         break;
     }
 
-    if (pbdev->summary_ind) {
+    if (pbdev->interp && (pbdev->fh & FH_MASK_ENABLE)) {
+        /* Interpreted devices were using interrupt forwarding */
+        s390_pci_kvm_aif_disable(pbdev);
+    } else if (pbdev->summary_ind) {
         pci_dereg_irqs(pbdev);
     }
     if (pbdev->iommu->enabled) {
@@ -1428,6 +1440,8 @@ static Property s390_pci_device_properties[] = {
     DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
     DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
     DEFINE_PROP_BOOL("interpret", S390PCIBusDevice, interp, true),
+    DEFINE_PROP_BOOL("forwarding_assist", S390PCIBusDevice, forwarding_assist,
+                     true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 651ec38635..20a9bcc7af 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -1066,6 +1066,32 @@ static void fmb_update(void *opaque)
     timer_mod(pbdev->fmb_timer, t + pbdev->pci_group->zpci_group.mui);
 }
 
+static int mpcifc_reg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
+{
+    int rc;
+
+    rc = s390_pci_kvm_aif_enable(pbdev, fib, pbdev->forwarding_assist);
+    if (rc) {
+        DPRINTF("Failed to enable interrupt forwarding\n");
+        return rc;
+    }
+
+    return 0;
+}
+
+static int mpcifc_dereg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
+{
+    int rc;
+
+    rc = s390_pci_kvm_aif_disable(pbdev);
+    if (rc) {
+        DPRINTF("Failed to disable interrupt forwarding\n");
+        return rc;
+    }
+
+    return 0;
+}
+
 int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
                         uintptr_t ra)
 {
@@ -1120,7 +1146,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
 
     switch (oc) {
     case ZPCI_MOD_FC_REG_INT:
-        if (pbdev->summary_ind) {
+        if (pbdev->interp) {
+            if (mpcifc_reg_int_interp(pbdev, &fib)) {
+                cc = ZPCI_PCI_LS_ERR;
+                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
+            }
+        } else if (pbdev->summary_ind) {
             cc = ZPCI_PCI_LS_ERR;
             s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
         } else if (reg_irqs(env, pbdev, fib)) {
@@ -1129,7 +1160,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
         }
         break;
     case ZPCI_MOD_FC_DEREG_INT:
-        if (!pbdev->summary_ind) {
+        if (pbdev->interp) {
+            if (mpcifc_dereg_int_interp(pbdev, &fib)) {
+                cc = ZPCI_PCI_LS_ERR;
+                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
+            }
+        } else if (!pbdev->summary_ind) {
             cc = ZPCI_PCI_LS_ERR;
             s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
         } else {
diff --git a/hw/s390x/s390-pci-kvm.c b/hw/s390x/s390-pci-kvm.c
index e503aaf1ca..69b04a36ea 100644
--- a/hw/s390x/s390-pci-kvm.c
+++ b/hw/s390x/s390-pci-kvm.c
@@ -11,9 +11,13 @@
 
 #include "qemu/osdep.h"
 
+#include <linux/kvm.h>
+
 #include "kvm/kvm_s390x.h"
 #include "hw/s390x/pv.h"
+#include "hw/s390x/s390-pci-bus.h"
 #include "hw/s390x/s390-pci-kvm.h"
+#include "hw/s390x/s390-pci-inst.h"
 #include "cpu_models.h"
 
 bool s390_pci_kvm_interp_allowed(void)
@@ -21,3 +25,29 @@ bool s390_pci_kvm_interp_allowed(void)
     return (s390_has_feat(S390_FEAT_ZPCI_INTERP) && kvm_s390_get_zpci_op() &&
             !s390_is_pv());
 }
+
+int s390_pci_kvm_aif_enable(S390PCIBusDevice *pbdev, ZpciFib *fib, bool assist)
+{
+    struct kvm_s390_zpci_op args = {
+        .fh = pbdev->fh,
+        .op = KVM_S390_ZPCIOP_REG_AEN,
+        .u.reg_aen.ibv = fib->aibv,
+        .u.reg_aen.sb = fib->aisb,
+        .u.reg_aen.noi = FIB_DATA_NOI(fib->data),
+        .u.reg_aen.isc = FIB_DATA_ISC(fib->data),
+        .u.reg_aen.sbo = FIB_DATA_AISBO(fib->data),
+        .u.reg_aen.flags = (assist) ? 0 : KVM_S390_ZPCIOP_REGAEN_HOST
+    };
+
+    return kvm_vm_ioctl(kvm_state, KVM_S390_ZPCI_OP, &args);
+}
+
+int s390_pci_kvm_aif_disable(S390PCIBusDevice *pbdev)
+{
+    struct kvm_s390_zpci_op args = {
+        .fh = pbdev->fh,
+        .op = KVM_S390_ZPCIOP_DEREG_AEN
+    };
+
+    return kvm_vm_ioctl(kvm_state, KVM_S390_ZPCI_OP, &args);
+}
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index a9843dfe97..5b09f0cf2f 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -351,6 +351,7 @@ struct S390PCIBusDevice {
     bool pci_unplug_request_processed;
     bool unplug_requested;
     bool interp;
+    bool forwarding_assist;
     QTAILQ_ENTRY(S390PCIBusDevice) link;
 };
 
diff --git a/include/hw/s390x/s390-pci-kvm.h b/include/hw/s390x/s390-pci-kvm.h
index 80a2e7d0ca..933814a402 100644
--- a/include/hw/s390x/s390-pci-kvm.h
+++ b/include/hw/s390x/s390-pci-kvm.h
@@ -12,13 +12,27 @@
 #ifndef HW_S390_PCI_KVM_H
 #define HW_S390_PCI_KVM_H
 
+#include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/s390-pci-inst.h"
+
 #ifdef CONFIG_KVM
 bool s390_pci_kvm_interp_allowed(void);
+int s390_pci_kvm_aif_enable(S390PCIBusDevice *pbdev, ZpciFib *fib, bool assist);
+int s390_pci_kvm_aif_disable(S390PCIBusDevice *pbdev);
 #else
 static inline bool s390_pci_kvm_interp_allowed(void)
 {
     return false;
 }
+static inline int s390_pci_kvm_aif_enable(S390PCIBusDevice *pbdev, ZpciFib *fib,
+                                          bool assist)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_kvm_aif_disable(S390PCIBusDevice *pbdev)
+{
+    return -EINVAL;
+}
 #endif
 
 #endif
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 7/8] s390x/pci: let intercept devices have separate PCI groups
  2022-05-24 19:02 [PATCH v6 0/8] s390x/pci: zPCI interpretation support Matthew Rosato
                   ` (5 preceding siblings ...)
  2022-05-24 19:03 ` [PATCH v6 6/8] s390x/pci: enable adapter event notification for interpreted devices Matthew Rosato
@ 2022-05-24 19:03 ` Matthew Rosato
  2022-05-24 19:03 ` [PATCH v6 8/8] s390x/pci: reflect proper maxstbl for groups of interpreted devices Matthew Rosato
  7 siblings, 0 replies; 14+ messages in thread
From: Matthew Rosato @ 2022-05-24 19:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

Let's use the reserved pool of simulated PCI groups to allow intercept
devices to have separate groups from interpreted devices as some group
values may be different. If we run out of simulated PCI groups, subsequent
intercept devices just get the default group.
Furthermore, if we encounter any PCI groups from hostdevs that are marked
as simulated, let's just assign them to the default group to avoid
conflicts between host simulated groups and our own simulated groups.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c         | 19 ++++++++++++++--
 hw/s390x/s390-pci-vfio.c        | 40 ++++++++++++++++++++++++++++++---
 include/hw/s390x/s390-pci-bus.h |  6 ++++-
 3 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index e66a0dfbef..5342f7899f 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -748,13 +748,14 @@ static void s390_pci_iommu_free(S390pciState *s, PCIBus *bus, int32_t devfn)
     object_unref(OBJECT(iommu));
 }
 
-S390PCIGroup *s390_group_create(int id)
+S390PCIGroup *s390_group_create(int id, int host_id)
 {
     S390PCIGroup *group;
     S390pciState *s = s390_get_phb();
 
     group = g_new0(S390PCIGroup, 1);
     group->id = id;
+    group->host_id = host_id;
     QTAILQ_INSERT_TAIL(&s->zpci_groups, group, link);
     return group;
 }
@@ -772,12 +773,25 @@ S390PCIGroup *s390_group_find(int id)
     return NULL;
 }
 
+S390PCIGroup *s390_group_find_host_sim(int host_id)
+{
+    S390PCIGroup *group;
+    S390pciState *s = s390_get_phb();
+
+    QTAILQ_FOREACH(group, &s->zpci_groups, link) {
+        if (group->id >= ZPCI_SIM_GRP_START && group->host_id == host_id) {
+            return group;
+        }
+    }
+    return NULL;
+}
+
 static void s390_pci_init_default_group(void)
 {
     S390PCIGroup *group;
     ClpRspQueryPciGrp *resgrp;
 
-    group = s390_group_create(ZPCI_DEFAULT_FN_GRP);
+    group = s390_group_create(ZPCI_DEFAULT_FN_GRP, ZPCI_DEFAULT_FN_GRP);
     resgrp = &group->zpci_group;
     resgrp->fr = 1;
     resgrp->dasm = 0;
@@ -825,6 +839,7 @@ static void s390_pcihost_realize(DeviceState *dev, Error **errp)
                                            NULL, g_free);
     s->zpci_table = g_hash_table_new_full(g_int_hash, g_int_equal, NULL, NULL);
     s->bus_no = 0;
+    s->next_sim_grp = ZPCI_SIM_GRP_START;
     QTAILQ_INIT(&s->pending_sei);
     QTAILQ_INIT(&s->zpci_devs);
     QTAILQ_INIT(&s->zpci_dma_limit);
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 4bf0a7e22d..985980f021 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -150,13 +150,18 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
 {
     struct vfio_info_cap_header *hdr;
     struct vfio_device_info_cap_zpci_group *cap;
+    S390pciState *s = s390_get_phb();
     ClpRspQueryPciGrp *resgrp;
     VFIOPCIDevice *vpci =  container_of(pbdev->pdev, VFIOPCIDevice, pdev);
+    uint8_t start_gid = pbdev->zpci_fn.pfgid;
 
     hdr = vfio_get_device_info_cap(info, VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
 
-    /* If capability not provided, just use the default group */
-    if (hdr == NULL) {
+    /*
+     * If capability not provided or the underlying hostdev is simulated, just
+     * use the default group.
+     */
+    if (hdr == NULL || pbdev->zpci_fn.pfgid >= ZPCI_SIM_GRP_START) {
         trace_s390_pci_clp_cap(vpci->vbasedev.name,
                                VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
         pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
@@ -165,11 +170,40 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
     }
     cap = (void *) hdr;
 
+    /*
+     * For an intercept device, let's use an existing simulated group if one
+     * one was already created for other intercept devices in this group.
+     * If not, create a new simulated group if any are still available.
+     * If all else fails, just fall back on the default group.
+     */
+    if (!pbdev->interp) {
+        pbdev->pci_group = s390_group_find_host_sim(pbdev->zpci_fn.pfgid);
+        if (pbdev->pci_group) {
+            /* Use existing simulated group */
+            pbdev->zpci_fn.pfgid = pbdev->pci_group->id;
+            return;
+        } else {
+            if (s->next_sim_grp == ZPCI_DEFAULT_FN_GRP) {
+                /* All out of simulated groups, use default */
+                trace_s390_pci_clp_cap(vpci->vbasedev.name,
+                                       VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
+                pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
+                pbdev->pci_group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
+                return;
+            } else {
+                /* We can assign a new simulated group */
+                pbdev->zpci_fn.pfgid = s->next_sim_grp;
+                s->next_sim_grp++;
+                /* Fall through to create the new sim group using CLP info */
+            }
+        }
+    }
+
     /* See if the PCI group is already defined, create if not */
     pbdev->pci_group = s390_group_find(pbdev->zpci_fn.pfgid);
 
     if (!pbdev->pci_group) {
-        pbdev->pci_group = s390_group_create(pbdev->zpci_fn.pfgid);
+        pbdev->pci_group = s390_group_create(pbdev->zpci_fn.pfgid, start_gid);
 
         resgrp = &pbdev->pci_group->zpci_group;
         if (cap->flags & VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH) {
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index 5b09f0cf2f..0605fcea24 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -315,13 +315,16 @@ typedef struct ZpciFmb {
 QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
 
 #define ZPCI_DEFAULT_FN_GRP 0xFF
+#define ZPCI_SIM_GRP_START 0xF0
 typedef struct S390PCIGroup {
     ClpRspQueryPciGrp zpci_group;
     int id;
+    int host_id;
     QTAILQ_ENTRY(S390PCIGroup) link;
 } S390PCIGroup;
-S390PCIGroup *s390_group_create(int id);
+S390PCIGroup *s390_group_create(int id, int host_id);
 S390PCIGroup *s390_group_find(int id);
+S390PCIGroup *s390_group_find_host_sim(int host_id);
 
 struct S390PCIBusDevice {
     DeviceState qdev;
@@ -370,6 +373,7 @@ struct S390pciState {
     QTAILQ_HEAD(, S390PCIBusDevice) zpci_devs;
     QTAILQ_HEAD(, S390PCIDMACount) zpci_dma_limit;
     QTAILQ_HEAD(, S390PCIGroup) zpci_groups;
+    uint8_t next_sim_grp;
 };
 
 S390pciState *s390_get_phb(void);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 8/8] s390x/pci: reflect proper maxstbl for groups of interpreted devices
  2022-05-24 19:02 [PATCH v6 0/8] s390x/pci: zPCI interpretation support Matthew Rosato
                   ` (6 preceding siblings ...)
  2022-05-24 19:03 ` [PATCH v6 7/8] s390x/pci: let intercept devices have separate PCI groups Matthew Rosato
@ 2022-05-24 19:03 ` Matthew Rosato
  7 siblings, 0 replies; 14+ messages in thread
From: Matthew Rosato @ 2022-05-24 19:03 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

The maximum supported store block length might be different depending
on whether the instruction is interpretively executed (firmware-reported
maximum) or handled via userspace intercept (host kernel API maximum).
Choose the best available value during group creation.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-vfio.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 985980f021..212dd053f7 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -213,7 +213,11 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
         resgrp->msia = cap->msi_addr;
         resgrp->mui = cap->mui;
         resgrp->i = cap->noi;
-        resgrp->maxstbl = cap->maxstbl;
+        if (pbdev->interp && hdr->version >= 2) {
+            resgrp->maxstbl = cap->imaxstbl;
+        } else {
+            resgrp->maxstbl = cap->maxstbl;
+        }
         resgrp->version = cap->version;
         resgrp->dtsm = ZPCI_DTSM;
     }
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/8] target/s390x: add zpci-interp to cpu models
  2022-05-24 19:02 ` [PATCH v6 2/8] target/s390x: add zpci-interp to cpu models Matthew Rosato
@ 2022-06-01  9:52   ` David Hildenbrand
  2022-06-01 13:48     ` Matthew Rosato
  0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2022-06-01  9:52 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, pasic, borntraeger, mst, pbonzini, qemu-devel,
	kvm

On 24.05.22 21:02, Matthew Rosato wrote:
> The zpci-interp feature is used to specify whether zPCI interpretation is
> to be used for this guest.

We have

DEF_FEAT(SIE_PFMFI, "pfmfi", SCLP_CONF_CHAR_EXT, 9, "SIE: PFMF
interpretation facility")

and

DEF_FEAT(SIE_SIGPIF, "sigpif", SCLP_CPU, 12, "SIE: SIGP interpretation
facility")


Should we call this simply "zpcii" or "zpciif" (if the official name
includes "Facility")

> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  hw/s390x/s390-virtio-ccw.c          | 1 +
>  target/s390x/cpu_features_def.h.inc | 1 +
>  target/s390x/gen-features.c         | 2 ++
>  target/s390x/kvm/kvm.c              | 1 +
>  4 files changed, 5 insertions(+)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index 047cca0487..b33310a135 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -806,6 +806,7 @@ static void ccw_machine_7_0_instance_options(MachineState *machine)
>      static const S390FeatInit qemu_cpu_feat = { S390_FEAT_LIST_QEMU_V7_0 };
>  
>      ccw_machine_7_1_instance_options(machine);
> +    s390_cpudef_featoff_greater(14, 1, S390_FEAT_ZPCI_INTERP);
>      s390_set_qemu_cpu_model(0x8561, 15, 1, qemu_cpu_feat);
>  }
>  
> diff --git a/target/s390x/cpu_features_def.h.inc b/target/s390x/cpu_features_def.h.inc
> index e86662bb3b..4ade3182aa 100644
> --- a/target/s390x/cpu_features_def.h.inc
> +++ b/target/s390x/cpu_features_def.h.inc
> @@ -146,6 +146,7 @@ DEF_FEAT(SIE_CEI, "cei", SCLP_CPU, 43, "SIE: Conditional-external-interception f
>  DEF_FEAT(DAT_ENH_2, "dateh2", MISC, 0, "DAT-enhancement facility 2")
>  DEF_FEAT(CMM, "cmm", MISC, 0, "Collaborative-memory-management facility")
>  DEF_FEAT(AP, "ap", MISC, 0, "AP instructions installed")
> +DEF_FEAT(ZPCI_INTERP, "zpci-interp", MISC, 0, "zPCI interpretation")

How is this feature exposed to the guest, meaning, how can the guest
sense support?

Just a gut feeling: does this toggle enable the host to use
interpretation and the guest cannot really determine the difference
whether it's enabled or not? Then, it's not a guest CPU feature. But
let's hear first what this actually enables :)

>  
>  /* Features exposed via the PLO instruction. */
>  DEF_FEAT(PLO_CL, "plo-cl", PLO, 0, "PLO Compare and load (32 bit in general registers)")
> diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
> index c03ec2c9a9..f991646c01 100644
> --- a/target/s390x/gen-features.c
> +++ b/target/s390x/gen-features.c
> @@ -554,6 +554,7 @@ static uint16_t full_GEN14_GA1[] = {
>      S390_FEAT_HPMA2,
>      S390_FEAT_SIE_KSS,
>      S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
> +    S390_FEAT_ZPCI_INTERP,
>  };
>  
>  #define full_GEN14_GA2 EmptyFeat
> @@ -650,6 +651,7 @@ static uint16_t default_GEN14_GA1[] = {
>      S390_FEAT_GROUP_MSA_EXT_8,
>      S390_FEAT_MULTIPLE_EPOCH,
>      S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
> +    S390_FEAT_ZPCI_INTERP,

I'm curious, should we really add this to the default model?

This implies that on any setup where we don't have zpci interpretation
support (including missing kernel support), that a basic "-cpu z14" will
no longer work with the new machine type.

If, OTOH, we expect this feature to be around in any sane installation,
then it's good to include it in the

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/8] target/s390x: add zpci-interp to cpu models
  2022-06-01  9:52   ` David Hildenbrand
@ 2022-06-01 13:48     ` Matthew Rosato
  2022-06-01 14:10       ` David Hildenbrand
  0 siblings, 1 reply; 14+ messages in thread
From: Matthew Rosato @ 2022-06-01 13:48 UTC (permalink / raw)
  To: David Hildenbrand, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, pasic, borntraeger, mst, pbonzini, qemu-devel,
	kvm

On 6/1/22 5:52 AM, David Hildenbrand wrote:
> On 24.05.22 21:02, Matthew Rosato wrote:
>> The zpci-interp feature is used to specify whether zPCI interpretation is
>> to be used for this guest.
> 
> We have
> 
> DEF_FEAT(SIE_PFMFI, "pfmfi", SCLP_CONF_CHAR_EXT, 9, "SIE: PFMF
> interpretation facility")
> 
> and
> 
> DEF_FEAT(SIE_SIGPIF, "sigpif", SCLP_CPU, 12, "SIE: SIGP interpretation
> facility")
> 
> 
> Should we call this simply "zpcii" or "zpciif" (if the official name
> includes "Facility")
> 

This actually controls the use of 2 facilities which really only make 
sense together - Maybe just zpcii

>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   hw/s390x/s390-virtio-ccw.c          | 1 +
>>   target/s390x/cpu_features_def.h.inc | 1 +
>>   target/s390x/gen-features.c         | 2 ++
>>   target/s390x/kvm/kvm.c              | 1 +
>>   4 files changed, 5 insertions(+)
>>
>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>> index 047cca0487..b33310a135 100644
>> --- a/hw/s390x/s390-virtio-ccw.c
>> +++ b/hw/s390x/s390-virtio-ccw.c
>> @@ -806,6 +806,7 @@ static void ccw_machine_7_0_instance_options(MachineState *machine)
>>       static const S390FeatInit qemu_cpu_feat = { S390_FEAT_LIST_QEMU_V7_0 };
>>   
>>       ccw_machine_7_1_instance_options(machine);
>> +    s390_cpudef_featoff_greater(14, 1, S390_FEAT_ZPCI_INTERP);
>>       s390_set_qemu_cpu_model(0x8561, 15, 1, qemu_cpu_feat);
>>   }
>>   
>> diff --git a/target/s390x/cpu_features_def.h.inc b/target/s390x/cpu_features_def.h.inc
>> index e86662bb3b..4ade3182aa 100644
>> --- a/target/s390x/cpu_features_def.h.inc
>> +++ b/target/s390x/cpu_features_def.h.inc
>> @@ -146,6 +146,7 @@ DEF_FEAT(SIE_CEI, "cei", SCLP_CPU, 43, "SIE: Conditional-external-interception f
>>   DEF_FEAT(DAT_ENH_2, "dateh2", MISC, 0, "DAT-enhancement facility 2")
>>   DEF_FEAT(CMM, "cmm", MISC, 0, "Collaborative-memory-management facility")
>>   DEF_FEAT(AP, "ap", MISC, 0, "AP instructions installed")
>> +DEF_FEAT(ZPCI_INTERP, "zpci-interp", MISC, 0, "zPCI interpretation")
> 
> How is this feature exposed to the guest, meaning, how can the guest
> sense support?
> 
> Just a gut feeling: does this toggle enable the host to use
> interpretation and the guest cannot really determine the difference
> whether it's enabled or not? Then, it's not a guest CPU feature. But
> let's hear first what this actually enables :)

This has changed a few times, but collectively we can determine on the 
host kernel if it is allowable based upon the availability of certain 
facility/sclp bits + the availability of an ioctl interface.

If all of these are available, the host kernel allows zPCI 
interpretation, with userspace able to toggle it on/off for the guest 
via this feature.  When allowed and enabled, 2 ECB bits then get set for 
each guest vcpu that enable the associated facilities.  The guest 
continues to use zPCI instructions in the same manner as before; the 
function handles it receives from CLP instructions will look different 
but are still used in the same manner.

We don't yet add vsie support of the facilities with this series, so the 
corresponding facility and sclp bits aren't forwarded to the guest.

> 
>>   
>>   /* Features exposed via the PLO instruction. */
>>   DEF_FEAT(PLO_CL, "plo-cl", PLO, 0, "PLO Compare and load (32 bit in general registers)")
>> diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
>> index c03ec2c9a9..f991646c01 100644
>> --- a/target/s390x/gen-features.c
>> +++ b/target/s390x/gen-features.c
>> @@ -554,6 +554,7 @@ static uint16_t full_GEN14_GA1[] = {
>>       S390_FEAT_HPMA2,
>>       S390_FEAT_SIE_KSS,
>>       S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
>> +    S390_FEAT_ZPCI_INTERP,
>>   };
>>   
>>   #define full_GEN14_GA2 EmptyFeat
>> @@ -650,6 +651,7 @@ static uint16_t default_GEN14_GA1[] = {
>>       S390_FEAT_GROUP_MSA_EXT_8,
>>       S390_FEAT_MULTIPLE_EPOCH,
>>       S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
>> +    S390_FEAT_ZPCI_INTERP,
> 
> I'm curious, should we really add this to the default model?
> 
> This implies that on any setup where we don't have zpci interpretation
> support (including missing kernel support), that a basic "-cpu z14" will
> no longer work with the new machine type.
> 
> If, OTOH, we expect this feature to be around in any sane installation,
> then it's good to include it in the
> 

 From a hardware perspective, everything will be available on z14 and 
later so it's only a question of missing host kernel support (or, you 
aren't running in a z14 LPAR).  As far as host kernel support, the 
expectation is that for a distro release where this QEMU support lands 
the associated kernel support would also be backported.  I guess that 
leaves some awkwardness if one upgrades their distro qemu to a new 
release version without picking up the kernel upgrade for some reason.. 
In that case, you're not totally stuck, you could still use -cpu 
z14,zpcii=off (or better yet pick up the associated kernel upgrade...) 
The intent is for exploitation of interpretation facilities to become 
the default on z14 and later, with the ability to turn it off offered as 
a fall-back / backwards compatibility.

If there's a better way to accomplish that, I'm open to suggestion.





^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/8] target/s390x: add zpci-interp to cpu models
  2022-06-01 13:48     ` Matthew Rosato
@ 2022-06-01 14:10       ` David Hildenbrand
  2022-06-01 15:11         ` Matthew Rosato
  0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2022-06-01 14:10 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, pasic, borntraeger, mst, pbonzini, qemu-devel,
	kvm

On 01.06.22 15:48, Matthew Rosato wrote:
> On 6/1/22 5:52 AM, David Hildenbrand wrote:
>> On 24.05.22 21:02, Matthew Rosato wrote:
>>> The zpci-interp feature is used to specify whether zPCI interpretation is
>>> to be used for this guest.
>>
>> We have
>>
>> DEF_FEAT(SIE_PFMFI, "pfmfi", SCLP_CONF_CHAR_EXT, 9, "SIE: PFMF
>> interpretation facility")
>>
>> and
>>
>> DEF_FEAT(SIE_SIGPIF, "sigpif", SCLP_CPU, 12, "SIE: SIGP interpretation
>> facility")
>>
>>
>> Should we call this simply "zpcii" or "zpciif" (if the official name
>> includes "Facility")
>>
> 
> This actually controls the use of 2 facilities which really only make 
> sense together - Maybe just zpcii
> 
>>>
>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>> ---
>>>   hw/s390x/s390-virtio-ccw.c          | 1 +
>>>   target/s390x/cpu_features_def.h.inc | 1 +
>>>   target/s390x/gen-features.c         | 2 ++
>>>   target/s390x/kvm/kvm.c              | 1 +
>>>   4 files changed, 5 insertions(+)
>>>
>>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>>> index 047cca0487..b33310a135 100644
>>> --- a/hw/s390x/s390-virtio-ccw.c
>>> +++ b/hw/s390x/s390-virtio-ccw.c
>>> @@ -806,6 +806,7 @@ static void ccw_machine_7_0_instance_options(MachineState *machine)
>>>       static const S390FeatInit qemu_cpu_feat = { S390_FEAT_LIST_QEMU_V7_0 };
>>>   
>>>       ccw_machine_7_1_instance_options(machine);
>>> +    s390_cpudef_featoff_greater(14, 1, S390_FEAT_ZPCI_INTERP);
>>>       s390_set_qemu_cpu_model(0x8561, 15, 1, qemu_cpu_feat);
>>>   }
>>>   
>>> diff --git a/target/s390x/cpu_features_def.h.inc b/target/s390x/cpu_features_def.h.inc
>>> index e86662bb3b..4ade3182aa 100644
>>> --- a/target/s390x/cpu_features_def.h.inc
>>> +++ b/target/s390x/cpu_features_def.h.inc
>>> @@ -146,6 +146,7 @@ DEF_FEAT(SIE_CEI, "cei", SCLP_CPU, 43, "SIE: Conditional-external-interception f
>>>   DEF_FEAT(DAT_ENH_2, "dateh2", MISC, 0, "DAT-enhancement facility 2")
>>>   DEF_FEAT(CMM, "cmm", MISC, 0, "Collaborative-memory-management facility")
>>>   DEF_FEAT(AP, "ap", MISC, 0, "AP instructions installed")
>>> +DEF_FEAT(ZPCI_INTERP, "zpci-interp", MISC, 0, "zPCI interpretation")
>>
>> How is this feature exposed to the guest, meaning, how can the guest
>> sense support?
>>
>> Just a gut feeling: does this toggle enable the host to use
>> interpretation and the guest cannot really determine the difference
>> whether it's enabled or not? Then, it's not a guest CPU feature. But
>> let's hear first what this actually enables :)
> 
> This has changed a few times, but collectively we can determine on the 
> host kernel if it is allowable based upon the availability of certain 
> facility/sclp bits + the availability of an ioctl interface.
> 
> If all of these are available, the host kernel allows zPCI 
> interpretation, with userspace able to toggle it on/off for the guest 
> via this feature.  When allowed and enabled, 2 ECB bits then get set for 
> each guest vcpu that enable the associated facilities.  The guest 
> continues to use zPCI instructions in the same manner as before; the 
> function handles it receives from CLP instructions will look different 
> but are still used in the same manner.
> 
> We don't yet add vsie support of the facilities with this series, so the 
> corresponding facility and sclp bits aren't forwarded to the guest.

That's exactly my point:

sigpif and pfmfi are actually vsie features. I'd have expected that
zpcii would be a vsie feature as well.

If interpretation is really more an implementation detail in the
hypervisor to implement zpci, than an actual guest feature (meaning, the
guest is able to observe it as if it were a real CPU feature), then we
most probably want some other way to toggle it (maybe via the machine?).

Example: KVM uses SIGP interpretation based on availability. However, we
don't toggle it via sigpif. sigpif actually tells the guest that it can
use the SIGP interpretation facility along with vsie.

You mention "CLP instructions will look different", I'm not sure if that
should actually be handled via the CPU model. From my gut feeling, zpcii
should actually be the vsie zpcii support to be implemented in the future.


So I wonder if we could simply always enable zPCI interpretation if
HW+kernel support is around and we're on a new compat machine? I there
is a way that migration could break (from old kernel to new kernel),
we'd have to think about alternatives.


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/8] target/s390x: add zpci-interp to cpu models
  2022-06-01 14:10       ` David Hildenbrand
@ 2022-06-01 15:11         ` Matthew Rosato
  2022-06-02  8:58           ` David Hildenbrand
  0 siblings, 1 reply; 14+ messages in thread
From: Matthew Rosato @ 2022-06-01 15:11 UTC (permalink / raw)
  To: David Hildenbrand, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, pasic, borntraeger, mst, pbonzini, qemu-devel,
	kvm

On 6/1/22 10:10 AM, David Hildenbrand wrote:
> On 01.06.22 15:48, Matthew Rosato wrote:
>> On 6/1/22 5:52 AM, David Hildenbrand wrote:
>>> On 24.05.22 21:02, Matthew Rosato wrote:
>>>> The zpci-interp feature is used to specify whether zPCI interpretation is
>>>> to be used for this guest.
>>>
>>> We have
>>>
>>> DEF_FEAT(SIE_PFMFI, "pfmfi", SCLP_CONF_CHAR_EXT, 9, "SIE: PFMF
>>> interpretation facility")
>>>
>>> and
>>>
>>> DEF_FEAT(SIE_SIGPIF, "sigpif", SCLP_CPU, 12, "SIE: SIGP interpretation
>>> facility")
>>>
>>>
>>> Should we call this simply "zpcii" or "zpciif" (if the official name
>>> includes "Facility")
>>>
>>
>> This actually controls the use of 2 facilities which really only make
>> sense together - Maybe just zpcii
>>
>>>>
>>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>>> ---
>>>>    hw/s390x/s390-virtio-ccw.c          | 1 +
>>>>    target/s390x/cpu_features_def.h.inc | 1 +
>>>>    target/s390x/gen-features.c         | 2 ++
>>>>    target/s390x/kvm/kvm.c              | 1 +
>>>>    4 files changed, 5 insertions(+)
>>>>
>>>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>>>> index 047cca0487..b33310a135 100644
>>>> --- a/hw/s390x/s390-virtio-ccw.c
>>>> +++ b/hw/s390x/s390-virtio-ccw.c
>>>> @@ -806,6 +806,7 @@ static void ccw_machine_7_0_instance_options(MachineState *machine)
>>>>        static const S390FeatInit qemu_cpu_feat = { S390_FEAT_LIST_QEMU_V7_0 };
>>>>    
>>>>        ccw_machine_7_1_instance_options(machine);
>>>> +    s390_cpudef_featoff_greater(14, 1, S390_FEAT_ZPCI_INTERP);
>>>>        s390_set_qemu_cpu_model(0x8561, 15, 1, qemu_cpu_feat);
>>>>    }
>>>>    
>>>> diff --git a/target/s390x/cpu_features_def.h.inc b/target/s390x/cpu_features_def.h.inc
>>>> index e86662bb3b..4ade3182aa 100644
>>>> --- a/target/s390x/cpu_features_def.h.inc
>>>> +++ b/target/s390x/cpu_features_def.h.inc
>>>> @@ -146,6 +146,7 @@ DEF_FEAT(SIE_CEI, "cei", SCLP_CPU, 43, "SIE: Conditional-external-interception f
>>>>    DEF_FEAT(DAT_ENH_2, "dateh2", MISC, 0, "DAT-enhancement facility 2")
>>>>    DEF_FEAT(CMM, "cmm", MISC, 0, "Collaborative-memory-management facility")
>>>>    DEF_FEAT(AP, "ap", MISC, 0, "AP instructions installed")
>>>> +DEF_FEAT(ZPCI_INTERP, "zpci-interp", MISC, 0, "zPCI interpretation")
>>>
>>> How is this feature exposed to the guest, meaning, how can the guest
>>> sense support?
>>>
>>> Just a gut feeling: does this toggle enable the host to use
>>> interpretation and the guest cannot really determine the difference
>>> whether it's enabled or not? Then, it's not a guest CPU feature. But
>>> let's hear first what this actually enables :)
>>
>> This has changed a few times, but collectively we can determine on the
>> host kernel if it is allowable based upon the availability of certain
>> facility/sclp bits + the availability of an ioctl interface.
>>
>> If all of these are available, the host kernel allows zPCI
>> interpretation, with userspace able to toggle it on/off for the guest
>> via this feature.  When allowed and enabled, 2 ECB bits then get set for
>> each guest vcpu that enable the associated facilities.  The guest
>> continues to use zPCI instructions in the same manner as before; the
>> function handles it receives from CLP instructions will look different
>> but are still used in the same manner.
>>
>> We don't yet add vsie support of the facilities with this series, so the
>> corresponding facility and sclp bits aren't forwarded to the guest.
> 
> That's exactly my point:
> 
> sigpif and pfmfi are actually vsie features. I'd have expected that
> zpcii would be a vsie feature as well.
> 
> If interpretation is really more an implementation detail in the
> hypervisor to implement zpci, than an actual guest feature (meaning, the
> guest is able to observe it as if it were a real CPU feature), then we
> most probably want some other way to toggle it (maybe via the machine?).
> 
> Example: KVM uses SIGP interpretation based on availability. However, we
> don't toggle it via sigpif. sigpif actually tells the guest that it can
> use the SIGP interpretation facility along with vsie.
> 
> You mention "CLP instructions will look different", I'm not sure if that
> should actually be handled via the CPU model. From my gut feeling, zpcii
> should actually be the vsie zpcii support to be implemented in the future.
> 

Well, what I meant was that the CLP response data looks different, 
primarily because when interpretation is enabled the guest would get 
passthrough of the function handle (which in turn has bits turned off 
that force hypervisor intercepts) rather than one that QEMU fabricated.

As far as a machine option, well we still need a mechanism by which 
userspace can decide whether it's OK to enable interpretation in the 
first place.  I guess we can take advantage of the fact that the 
capability associated with the ioctl interface can indicate both that 
the kernel interface is available + all of the necessary hardware 
facilities are available to that host kernel.

So I guess we could use that to make a decision to default a machine 
setting based upon that (yes if everything is available, no if not).

> 
> So I wonder if we could simply always enable zPCI interpretation if
> HW+kernel support is around and we're on a new compat machine? I there
> is a way that migration could break (from old kernel to new kernel),
> we'd have to think about alternatives.

zpci devices are currently marked unmigratable, so if you want to 
migrate you need to detach all of them first anyway today.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 2/8] target/s390x: add zpci-interp to cpu models
  2022-06-01 15:11         ` Matthew Rosato
@ 2022-06-02  8:58           ` David Hildenbrand
  0 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2022-06-02  8:58 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, pasic, borntraeger, mst, pbonzini, qemu-devel,
	kvm

>> That's exactly my point:
>>
>> sigpif and pfmfi are actually vsie features. I'd have expected that
>> zpcii would be a vsie feature as well.
>>
>> If interpretation is really more an implementation detail in the
>> hypervisor to implement zpci, than an actual guest feature (meaning, the
>> guest is able to observe it as if it were a real CPU feature), then we
>> most probably want some other way to toggle it (maybe via the machine?).
>>
>> Example: KVM uses SIGP interpretation based on availability. However, we
>> don't toggle it via sigpif. sigpif actually tells the guest that it can
>> use the SIGP interpretation facility along with vsie.
>>
>> You mention "CLP instructions will look different", I'm not sure if that
>> should actually be handled via the CPU model. From my gut feeling, zpcii
>> should actually be the vsie zpcii support to be implemented in the future.
>>
> 
> Well, what I meant was that the CLP response data looks different, 
> primarily because when interpretation is enabled the guest would get 
> passthrough of the function handle (which in turn has bits turned off 
> that force hypervisor intercepts) rather than one that QEMU fabricated.

Okay, so more some kind of "the device behaves seems to behave slightly
different".

> 
> As far as a machine option, well we still need a mechanism by which 
> userspace can decide whether it's OK to enable interpretation in the 
> first place.  I guess we can take advantage of the fact that the 
> capability associated with the ioctl interface can indicate both that 
> the kernel interface is available + all of the necessary hardware 
> facilities are available to that host kernel.

Yes.

> 
> So I guess we could use that to make a decision to default a machine 
> setting based upon that (yes if everything is available, no if not).

Right, in theory we could enable interpretation whenever possible
(kernel indicates support, including HW support).

In practice, it would be nice to be able to disable zpci interpretation
for debugging purposes.

One option is to simply glue it to compat machines. So selecting an
older compat machine will disable it.

Another option is a e.g., machine property, which can be used to
force-disable it (e.g., zpcii-disabled) and let the property always
default to false.

Third option would simply combine both, making compat machines make
zpcii-disable result in "true".

> 
>>
>> So I wonder if we could simply always enable zPCI interpretation if
>> HW+kernel support is around and we're on a new compat machine? I there
>> is a way that migration could break (from old kernel to new kernel),
>> we'd have to think about alternatives.
> 
> zpci devices are currently marked unmigratable, so if you want to 
> migrate you need to detach all of them first anyway today.

Okay. So it might be reasonable in the future to simply check on source
and migration if zpcii is in the same state if zpci devices are attached
to the VM. If not, simply fail migration -- in sane enironments, we'd
never get a mismatch.

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2022-06-02  8:58 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-24 19:02 [PATCH v6 0/8] s390x/pci: zPCI interpretation support Matthew Rosato
2022-05-24 19:02 ` [PATCH v6 1/8] Update linux headers Matthew Rosato
2022-05-24 19:02 ` [PATCH v6 2/8] target/s390x: add zpci-interp to cpu models Matthew Rosato
2022-06-01  9:52   ` David Hildenbrand
2022-06-01 13:48     ` Matthew Rosato
2022-06-01 14:10       ` David Hildenbrand
2022-06-01 15:11         ` Matthew Rosato
2022-06-02  8:58           ` David Hildenbrand
2022-05-24 19:03 ` [PATCH v6 3/8] s390x/pci: add routine to get host function handle from CLP info Matthew Rosato
2022-05-24 19:03 ` [PATCH v6 4/8] s390x/pci: enable for load/store intepretation Matthew Rosato
2022-05-24 19:03 ` [PATCH v6 5/8] s390x/pci: don't fence interpreted devices without MSI-X Matthew Rosato
2022-05-24 19:03 ` [PATCH v6 6/8] s390x/pci: enable adapter event notification for interpreted devices Matthew Rosato
2022-05-24 19:03 ` [PATCH v6 7/8] s390x/pci: let intercept devices have separate PCI groups Matthew Rosato
2022-05-24 19:03 ` [PATCH v6 8/8] s390x/pci: reflect proper maxstbl for groups of interpreted devices Matthew Rosato

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.