All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/9] s390x/pci: zPCI interpretation support
@ 2022-01-14 20:38 ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

For QEMU, the majority of the work in enabling instruction interpretation
is handled via new VFIO ioctls to SET the appropriate interpretation and
interrupt forwarding modes, and to GET the function handle to use for
interpretive execution.  

This series implements these new ioctls, as well as adding a new, optional
'intercept' parameter to zpci to request interpretation support not be used
as well as an 'intassist' parameter to determine whether or not the
firmware assist will be used for interrupt delivery or whether the host
will be responsible for delivering all interrupts.

The ZPCI_INTERP CPU feature is added beginning with the z14 model to
enable this support.

As a consequence of implementing zPCI interpretation, ISM devices now
become eligible for passthrough (but only when zPCI interpretation is
available).

From the perspective of guest configuration, you passthrough zPCI devices
in the same manner as before, with intepretation support being used by
default if available in kernel+qemu.

Associated kernel series:
https://lore.kernel.org/kvm/20220114203145.242984-1-mjrosato@linux.ibm.com/

Changes v1->v2:

- Update kernel headers sync                                                    
- Drop some pre-req patches that are now merged                                 
- Add some R-bs (Thanks!)                                                       
- fence FEAT_ZPCI_INTERP for QEMU 6.2 and older (Christian)                     
- switch from container_of to VFIO_PCI and drop asserts (Thomas)                
- re-arrange g_autofree so we malloc at time of declaration (Thomas) 

Matthew Rosato (9):
  Update linux headers
  target/s390x: add zpci-interp to cpu models
  fixup: force interp off for QEMU machine 6.2 and older
  s390x/pci: enable for load/store intepretation
  s390x/pci: don't fence interpreted devices without MSI-X
  s390x/pci: enable adapter event notification for interpreted devices
  s390x/pci: use I/O Address Translation assist when interpreting
  s390x/pci: use dtsm provided from vfio capabilities for interpreted
    devices
  s390x/pci: let intercept devices have separate PCI groups

 hw/s390x/s390-pci-bus.c                       | 121 ++++++++++-
 hw/s390x/s390-pci-inst.c                      | 168 ++++++++++++++-
 hw/s390x/s390-pci-vfio.c                      | 204 +++++++++++++++++-
 hw/s390x/s390-virtio-ccw.c                    |   1 +
 include/hw/s390x/s390-pci-bus.h               |   8 +-
 include/hw/s390x/s390-pci-inst.h              |   2 +-
 include/hw/s390x/s390-pci-vfio.h              |  45 ++++
 include/standard-headers/asm-x86/kvm_para.h   |   1 +
 include/standard-headers/drm/drm_fourcc.h     |  11 +
 include/standard-headers/linux/ethtool.h      |   1 +
 include/standard-headers/linux/fuse.h         |  60 +++++-
 include/standard-headers/linux/pci_regs.h     |   4 +
 include/standard-headers/linux/virtio_iommu.h |   8 +-
 linux-headers/asm-mips/unistd_n32.h           |   1 +
 linux-headers/asm-mips/unistd_n64.h           |   1 +
 linux-headers/asm-mips/unistd_o32.h           |   1 +
 linux-headers/asm-powerpc/unistd_32.h         |   1 +
 linux-headers/asm-powerpc/unistd_64.h         |   1 +
 linux-headers/asm-s390/kvm.h                  |   1 +
 linux-headers/asm-s390/unistd_32.h            |   1 +
 linux-headers/asm-s390/unistd_64.h            |   1 +
 linux-headers/linux/kvm.h                     |   1 +
 linux-headers/linux/vfio.h                    |  22 ++
 linux-headers/linux/vfio_zdev.h               |  51 +++++
 target/s390x/cpu_features_def.h.inc           |   1 +
 target/s390x/gen-features.c                   |   2 +
 target/s390x/kvm/kvm.c                        |   1 +
 27 files changed, 693 insertions(+), 27 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v2 0/9] s390x/pci: zPCI interpretation support
@ 2022-01-14 20:38 ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

For QEMU, the majority of the work in enabling instruction interpretation
is handled via new VFIO ioctls to SET the appropriate interpretation and
interrupt forwarding modes, and to GET the function handle to use for
interpretive execution.  

This series implements these new ioctls, as well as adding a new, optional
'intercept' parameter to zpci to request interpretation support not be used
as well as an 'intassist' parameter to determine whether or not the
firmware assist will be used for interrupt delivery or whether the host
will be responsible for delivering all interrupts.

The ZPCI_INTERP CPU feature is added beginning with the z14 model to
enable this support.

As a consequence of implementing zPCI interpretation, ISM devices now
become eligible for passthrough (but only when zPCI interpretation is
available).

From the perspective of guest configuration, you passthrough zPCI devices
in the same manner as before, with intepretation support being used by
default if available in kernel+qemu.

Associated kernel series:
https://lore.kernel.org/kvm/20220114203145.242984-1-mjrosato@linux.ibm.com/

Changes v1->v2:

- Update kernel headers sync                                                    
- Drop some pre-req patches that are now merged                                 
- Add some R-bs (Thanks!)                                                       
- fence FEAT_ZPCI_INTERP for QEMU 6.2 and older (Christian)                     
- switch from container_of to VFIO_PCI and drop asserts (Thomas)                
- re-arrange g_autofree so we malloc at time of declaration (Thomas) 

Matthew Rosato (9):
  Update linux headers
  target/s390x: add zpci-interp to cpu models
  fixup: force interp off for QEMU machine 6.2 and older
  s390x/pci: enable for load/store intepretation
  s390x/pci: don't fence interpreted devices without MSI-X
  s390x/pci: enable adapter event notification for interpreted devices
  s390x/pci: use I/O Address Translation assist when interpreting
  s390x/pci: use dtsm provided from vfio capabilities for interpreted
    devices
  s390x/pci: let intercept devices have separate PCI groups

 hw/s390x/s390-pci-bus.c                       | 121 ++++++++++-
 hw/s390x/s390-pci-inst.c                      | 168 ++++++++++++++-
 hw/s390x/s390-pci-vfio.c                      | 204 +++++++++++++++++-
 hw/s390x/s390-virtio-ccw.c                    |   1 +
 include/hw/s390x/s390-pci-bus.h               |   8 +-
 include/hw/s390x/s390-pci-inst.h              |   2 +-
 include/hw/s390x/s390-pci-vfio.h              |  45 ++++
 include/standard-headers/asm-x86/kvm_para.h   |   1 +
 include/standard-headers/drm/drm_fourcc.h     |  11 +
 include/standard-headers/linux/ethtool.h      |   1 +
 include/standard-headers/linux/fuse.h         |  60 +++++-
 include/standard-headers/linux/pci_regs.h     |   4 +
 include/standard-headers/linux/virtio_iommu.h |   8 +-
 linux-headers/asm-mips/unistd_n32.h           |   1 +
 linux-headers/asm-mips/unistd_n64.h           |   1 +
 linux-headers/asm-mips/unistd_o32.h           |   1 +
 linux-headers/asm-powerpc/unistd_32.h         |   1 +
 linux-headers/asm-powerpc/unistd_64.h         |   1 +
 linux-headers/asm-s390/kvm.h                  |   1 +
 linux-headers/asm-s390/unistd_32.h            |   1 +
 linux-headers/asm-s390/unistd_64.h            |   1 +
 linux-headers/linux/kvm.h                     |   1 +
 linux-headers/linux/vfio.h                    |  22 ++
 linux-headers/linux/vfio_zdev.h               |  51 +++++
 target/s390x/cpu_features_def.h.inc           |   1 +
 target/s390x/gen-features.c                   |   2 +
 target/s390x/kvm/kvm.c                        |   1 +
 27 files changed, 693 insertions(+), 27 deletions(-)

-- 
2.27.0



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v2 1/9] Update linux headers
  2022-01-14 20:38 ` Matthew Rosato
@ 2022-01-14 20:38   ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

This is a placeholder that pulls in 5.17 + unmerged kernel changes
required by this item.  A proper header sync can be done once the
associated kernel code merges.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 include/standard-headers/asm-x86/kvm_para.h   |  1 +
 include/standard-headers/drm/drm_fourcc.h     | 11 ++++
 include/standard-headers/linux/ethtool.h      |  1 +
 include/standard-headers/linux/fuse.h         | 60 +++++++++++++++++--
 include/standard-headers/linux/pci_regs.h     |  4 ++
 include/standard-headers/linux/virtio_iommu.h |  8 ++-
 linux-headers/asm-mips/unistd_n32.h           |  1 +
 linux-headers/asm-mips/unistd_n64.h           |  1 +
 linux-headers/asm-mips/unistd_o32.h           |  1 +
 linux-headers/asm-powerpc/unistd_32.h         |  1 +
 linux-headers/asm-powerpc/unistd_64.h         |  1 +
 linux-headers/asm-s390/kvm.h                  |  1 +
 linux-headers/asm-s390/unistd_32.h            |  1 +
 linux-headers/asm-s390/unistd_64.h            |  1 +
 linux-headers/linux/kvm.h                     |  1 +
 linux-headers/linux/vfio.h                    | 22 +++++++
 linux-headers/linux/vfio_zdev.h               | 51 ++++++++++++++++
 17 files changed, 162 insertions(+), 5 deletions(-)

diff --git a/include/standard-headers/asm-x86/kvm_para.h b/include/standard-headers/asm-x86/kvm_para.h
index 204cfb8640..f0235e58a1 100644
--- a/include/standard-headers/asm-x86/kvm_para.h
+++ b/include/standard-headers/asm-x86/kvm_para.h
@@ -8,6 +8,7 @@
  * should be used to determine that a VM is running under KVM.
  */
 #define KVM_CPUID_SIGNATURE	0x40000000
+#define KVM_SIGNATURE "KVMKVMKVM\0\0\0"
 
 /* This CPUID returns two feature bitmaps in eax, edx. Before enabling
  * a particular paravirtualization, the appropriate feature bit should
diff --git a/include/standard-headers/drm/drm_fourcc.h b/include/standard-headers/drm/drm_fourcc.h
index 2c025cb4fe..4888f85f69 100644
--- a/include/standard-headers/drm/drm_fourcc.h
+++ b/include/standard-headers/drm/drm_fourcc.h
@@ -313,6 +313,13 @@ extern "C" {
  */
 #define DRM_FORMAT_P016		fourcc_code('P', '0', '1', '6') /* 2x2 subsampled Cr:Cb plane 16 bits per channel */
 
+/* 2 plane YCbCr420.
+ * 3 10 bit components and 2 padding bits packed into 4 bytes.
+ * index 0 = Y plane, [31:0] x:Y2:Y1:Y0 2:10:10:10 little endian
+ * index 1 = Cr:Cb plane, [63:0] x:Cr2:Cb2:Cr1:x:Cb1:Cr0:Cb0 [2:10:10:10:2:10:10:10] little endian
+ */
+#define DRM_FORMAT_P030		fourcc_code('P', '0', '3', '0') /* 2x2 subsampled Cr:Cb plane 10 bits per channel packed */
+
 /* 3 plane non-subsampled (444) YCbCr
  * 16 bits per component, but only 10 bits are used and 6 bits are padded
  * index 0: Y plane, [15:0] Y:x [10:6] little endian
@@ -853,6 +860,10 @@ drm_fourcc_canonicalize_nvidia_format_mod(uint64_t modifier)
  * and UV.  Some SAND-using hardware stores UV in a separate tiled
  * image from Y to reduce the column height, which is not supported
  * with these modifiers.
+ *
+ * The DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT modifier is also
+ * supported for DRM_FORMAT_P030 where the columns remain as 128 bytes
+ * wide, but as this is a 10 bpp format that translates to 96 pixels.
  */
 
 #define DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(v) \
diff --git a/include/standard-headers/linux/ethtool.h b/include/standard-headers/linux/ethtool.h
index 688eb8dc39..38d5a4cd6e 100644
--- a/include/standard-headers/linux/ethtool.h
+++ b/include/standard-headers/linux/ethtool.h
@@ -231,6 +231,7 @@ enum tunable_id {
 	ETHTOOL_RX_COPYBREAK,
 	ETHTOOL_TX_COPYBREAK,
 	ETHTOOL_PFC_PREVENTION_TOUT, /* timeout in msecs */
+	ETHTOOL_TX_COPYBREAK_BUF_SIZE,
 	/*
 	 * Add your fresh new tunable attribute above and remember to update
 	 * tunable_strings[] in net/ethtool/common.c
diff --git a/include/standard-headers/linux/fuse.h b/include/standard-headers/linux/fuse.h
index 23ea31708b..bda06258be 100644
--- a/include/standard-headers/linux/fuse.h
+++ b/include/standard-headers/linux/fuse.h
@@ -184,6 +184,16 @@
  *
  *  7.34
  *  - add FUSE_SYNCFS
+ *
+ *  7.35
+ *  - add FOPEN_NOFLUSH
+ *
+ *  7.36
+ *  - extend fuse_init_in with reserved fields, add FUSE_INIT_EXT init flag
+ *  - add flags2 to fuse_init_in and fuse_init_out
+ *  - add FUSE_SECURITY_CTX init flag
+ *  - add security context to create, mkdir, symlink, and mknod requests
+ *  - add FUSE_HAS_INODE_DAX, FUSE_ATTR_DAX
  */
 
 #ifndef _LINUX_FUSE_H
@@ -215,7 +225,7 @@
 #define FUSE_KERNEL_VERSION 7
 
 /** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 34
+#define FUSE_KERNEL_MINOR_VERSION 36
 
 /** The node ID of the root inode */
 #define FUSE_ROOT_ID 1
@@ -286,12 +296,14 @@ struct fuse_file_lock {
  * FOPEN_NONSEEKABLE: the file is not seekable
  * FOPEN_CACHE_DIR: allow caching this directory
  * FOPEN_STREAM: the file is stream-like (no file position at all)
+ * FOPEN_NOFLUSH: don't flush data cache on close (unless FUSE_WRITEBACK_CACHE)
  */
 #define FOPEN_DIRECT_IO		(1 << 0)
 #define FOPEN_KEEP_CACHE	(1 << 1)
 #define FOPEN_NONSEEKABLE	(1 << 2)
 #define FOPEN_CACHE_DIR		(1 << 3)
 #define FOPEN_STREAM		(1 << 4)
+#define FOPEN_NOFLUSH		(1 << 5)
 
 /**
  * INIT request/reply flags
@@ -332,6 +344,11 @@ struct fuse_file_lock {
  *			write/truncate sgid is killed only if file has group
  *			execute permission. (Same as Linux VFS behavior).
  * FUSE_SETXATTR_EXT:	Server supports extended struct fuse_setxattr_in
+ * FUSE_INIT_EXT: extended fuse_init_in request
+ * FUSE_INIT_RESERVED: reserved, do not use
+ * FUSE_SECURITY_CTX:	add security context to create, mkdir, symlink, and
+ *			mknod
+ * FUSE_HAS_INODE_DAX:  use per inode DAX
  */
 #define FUSE_ASYNC_READ		(1 << 0)
 #define FUSE_POSIX_LOCKS	(1 << 1)
@@ -363,6 +380,11 @@ struct fuse_file_lock {
 #define FUSE_SUBMOUNTS		(1 << 27)
 #define FUSE_HANDLE_KILLPRIV_V2	(1 << 28)
 #define FUSE_SETXATTR_EXT	(1 << 29)
+#define FUSE_INIT_EXT		(1 << 30)
+#define FUSE_INIT_RESERVED	(1 << 31)
+/* bits 32..63 get shifted down 32 bits into the flags2 field */
+#define FUSE_SECURITY_CTX	(1ULL << 32)
+#define FUSE_HAS_INODE_DAX	(1ULL << 33)
 
 /**
  * CUSE INIT request/reply flags
@@ -445,8 +467,10 @@ struct fuse_file_lock {
  * fuse_attr flags
  *
  * FUSE_ATTR_SUBMOUNT: Object is a submount root
+ * FUSE_ATTR_DAX: Enable DAX for this file in per inode DAX mode
  */
 #define FUSE_ATTR_SUBMOUNT      (1 << 0)
+#define FUSE_ATTR_DAX		(1 << 1)
 
 /**
  * Open flags
@@ -732,6 +756,8 @@ struct fuse_init_in {
 	uint32_t	minor;
 	uint32_t	max_readahead;
 	uint32_t	flags;
+	uint32_t	flags2;
+	uint32_t	unused[11];
 };
 
 #define FUSE_COMPAT_INIT_OUT_SIZE 8
@@ -748,7 +774,8 @@ struct fuse_init_out {
 	uint32_t	time_gran;
 	uint16_t	max_pages;
 	uint16_t	map_alignment;
-	uint32_t	unused[8];
+	uint32_t	flags2;
+	uint32_t	unused[7];
 };
 
 #define CUSE_INIT_INFO_MAX 4096
@@ -856,9 +883,12 @@ struct fuse_dirent {
 	char name[];
 };
 
-#define FUSE_NAME_OFFSET offsetof(struct fuse_dirent, name)
-#define FUSE_DIRENT_ALIGN(x) \
+/* Align variable length records to 64bit boundary */
+#define FUSE_REC_ALIGN(x) \
 	(((x) + sizeof(uint64_t) - 1) & ~(sizeof(uint64_t) - 1))
+
+#define FUSE_NAME_OFFSET offsetof(struct fuse_dirent, name)
+#define FUSE_DIRENT_ALIGN(x) FUSE_REC_ALIGN(x)
 #define FUSE_DIRENT_SIZE(d) \
 	FUSE_DIRENT_ALIGN(FUSE_NAME_OFFSET + (d)->namelen)
 
@@ -975,4 +1005,26 @@ struct fuse_syncfs_in {
 	uint64_t	padding;
 };
 
+/*
+ * For each security context, send fuse_secctx with size of security context
+ * fuse_secctx will be followed by security context name and this in turn
+ * will be followed by actual context label.
+ * fuse_secctx, name, context
+ */
+struct fuse_secctx {
+	uint32_t	size;
+	uint32_t	padding;
+};
+
+/*
+ * Contains the information about how many fuse_secctx structures are being
+ * sent and what's the total size of all security contexts (including
+ * size of fuse_secctx_header).
+ *
+ */
+struct fuse_secctx_header {
+	uint32_t	size;
+	uint32_t	nr_secctx;
+};
+
 #endif /* _LINUX_FUSE_H */
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index ff6ccbc6ef..318f3f1f9e 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -1086,7 +1086,11 @@
 
 /* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
 #define PCI_DVSEC_HEADER1		0x4 /* Designated Vendor-Specific Header1 */
+#define  PCI_DVSEC_HEADER1_VID(x)	((x) & 0xffff)
+#define  PCI_DVSEC_HEADER1_REV(x)	(((x) >> 16) & 0xf)
+#define  PCI_DVSEC_HEADER1_LEN(x)	(((x) >> 20) & 0xfff)
 #define PCI_DVSEC_HEADER2		0x8 /* Designated Vendor-Specific Header2 */
+#define  PCI_DVSEC_HEADER2_ID(x)		((x) & 0xffff)
 
 /* Data Link Feature */
 #define PCI_DLF_CAP		0x04	/* Capabilities Register */
diff --git a/include/standard-headers/linux/virtio_iommu.h b/include/standard-headers/linux/virtio_iommu.h
index b9443b83a1..366379c2f0 100644
--- a/include/standard-headers/linux/virtio_iommu.h
+++ b/include/standard-headers/linux/virtio_iommu.h
@@ -16,6 +16,7 @@
 #define VIRTIO_IOMMU_F_BYPASS			3
 #define VIRTIO_IOMMU_F_PROBE			4
 #define VIRTIO_IOMMU_F_MMIO			5
+#define VIRTIO_IOMMU_F_BYPASS_CONFIG		6
 
 struct virtio_iommu_range_64 {
 	uint64_t					start;
@@ -36,6 +37,8 @@ struct virtio_iommu_config {
 	struct virtio_iommu_range_32		domain_range;
 	/* Probe buffer size */
 	uint32_t					probe_size;
+	uint8_t					bypass;
+	uint8_t					reserved[3];
 };
 
 /* Request types */
@@ -66,11 +69,14 @@ struct virtio_iommu_req_tail {
 	uint8_t					reserved[3];
 };
 
+#define VIRTIO_IOMMU_ATTACH_F_BYPASS		(1 << 0)
+
 struct virtio_iommu_req_attach {
 	struct virtio_iommu_req_head		head;
 	uint32_t					domain;
 	uint32_t					endpoint;
-	uint8_t					reserved[8];
+	uint32_t					flags;
+	uint8_t					reserved[4];
 	struct virtio_iommu_req_tail		tail;
 };
 
diff --git a/linux-headers/asm-mips/unistd_n32.h b/linux-headers/asm-mips/unistd_n32.h
index 4b3e7ad1ec..790309211a 100644
--- a/linux-headers/asm-mips/unistd_n32.h
+++ b/linux-headers/asm-mips/unistd_n32.h
@@ -377,5 +377,6 @@
 #define __NR_landlock_add_rule (__NR_Linux + 445)
 #define __NR_landlock_restrict_self (__NR_Linux + 446)
 #define __NR_process_mrelease (__NR_Linux + 448)
+#define __NR_futex_waitv (__NR_Linux + 449)
 
 #endif /* _ASM_UNISTD_N32_H */
diff --git a/linux-headers/asm-mips/unistd_n64.h b/linux-headers/asm-mips/unistd_n64.h
index 488d9298d9..d68a3cc573 100644
--- a/linux-headers/asm-mips/unistd_n64.h
+++ b/linux-headers/asm-mips/unistd_n64.h
@@ -353,5 +353,6 @@
 #define __NR_landlock_add_rule (__NR_Linux + 445)
 #define __NR_landlock_restrict_self (__NR_Linux + 446)
 #define __NR_process_mrelease (__NR_Linux + 448)
+#define __NR_futex_waitv (__NR_Linux + 449)
 
 #endif /* _ASM_UNISTD_N64_H */
diff --git a/linux-headers/asm-mips/unistd_o32.h b/linux-headers/asm-mips/unistd_o32.h
index f47399870a..18cfa981c7 100644
--- a/linux-headers/asm-mips/unistd_o32.h
+++ b/linux-headers/asm-mips/unistd_o32.h
@@ -423,5 +423,6 @@
 #define __NR_landlock_add_rule (__NR_Linux + 445)
 #define __NR_landlock_restrict_self (__NR_Linux + 446)
 #define __NR_process_mrelease (__NR_Linux + 448)
+#define __NR_futex_waitv (__NR_Linux + 449)
 
 #endif /* _ASM_UNISTD_O32_H */
diff --git a/linux-headers/asm-powerpc/unistd_32.h b/linux-headers/asm-powerpc/unistd_32.h
index 11d54696dc..c8a97e6f51 100644
--- a/linux-headers/asm-powerpc/unistd_32.h
+++ b/linux-headers/asm-powerpc/unistd_32.h
@@ -430,6 +430,7 @@
 #define __NR_landlock_add_rule 445
 #define __NR_landlock_restrict_self 446
 #define __NR_process_mrelease 448
+#define __NR_futex_waitv 449
 
 
 #endif /* _ASM_UNISTD_32_H */
diff --git a/linux-headers/asm-powerpc/unistd_64.h b/linux-headers/asm-powerpc/unistd_64.h
index cf740bab13..5dd37251b8 100644
--- a/linux-headers/asm-powerpc/unistd_64.h
+++ b/linux-headers/asm-powerpc/unistd_64.h
@@ -402,6 +402,7 @@
 #define __NR_landlock_add_rule 445
 #define __NR_landlock_restrict_self 446
 #define __NR_process_mrelease 448
+#define __NR_futex_waitv 449
 
 
 #endif /* _ASM_UNISTD_64_H */
diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
index f053b8304a..d8259ff9a1 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -130,6 +130,7 @@ struct kvm_s390_vm_cpu_machine {
 #define KVM_S390_VM_CPU_FEAT_PFMFI	11
 #define KVM_S390_VM_CPU_FEAT_SIGPIF	12
 #define KVM_S390_VM_CPU_FEAT_KSS	13
+#define KVM_S390_VM_CPU_FEAT_ZPCI_INTERP 14
 struct kvm_s390_vm_cpu_feat {
 	__u64 feat[16];
 };
diff --git a/linux-headers/asm-s390/unistd_32.h b/linux-headers/asm-s390/unistd_32.h
index 8f97d98128..8c60e40ab1 100644
--- a/linux-headers/asm-s390/unistd_32.h
+++ b/linux-headers/asm-s390/unistd_32.h
@@ -420,5 +420,6 @@
 #define __NR_landlock_add_rule 445
 #define __NR_landlock_restrict_self 446
 #define __NR_process_mrelease 448
+#define __NR_futex_waitv 449
 
 #endif /* _ASM_S390_UNISTD_32_H */
diff --git a/linux-headers/asm-s390/unistd_64.h b/linux-headers/asm-s390/unistd_64.h
index 021ffc30e6..5793aa2a83 100644
--- a/linux-headers/asm-s390/unistd_64.h
+++ b/linux-headers/asm-s390/unistd_64.h
@@ -368,5 +368,6 @@
 #define __NR_landlock_add_rule 445
 #define __NR_landlock_restrict_self 446
 #define __NR_process_mrelease 448
+#define __NR_futex_waitv 449
 
 #endif /* _ASM_S390_UNISTD_64_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 02c5e7b7bb..f61f357899 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1130,6 +1130,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_BINARY_STATS_FD 203
 #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
 #define KVM_CAP_ARM_MTE 205
+#define KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM 206
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index e680594f27..96b18b872a 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -1002,6 +1002,28 @@ struct vfio_device_feature {
  */
 #define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN	(0)
 
+/*
+ * Provide support for enabling interpretation of zPCI instructions.  This
+ * feature is only valid for s390x PCI devices.  Data provided when setting
+ * and getting this feature is futher described in vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_INTERP		(1)
+
+/*
+ * Provide support for enbaling adapter interruption forwarding for zPCI
+ * devices.  This feature is only valid for s390x PCI devices.  Data provided
+ * when setting and getting this feature is further described in vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_AIF		(2)
+
+/*
+ * Provide support for enabling guest I/O address translation assistance for
+ * zPCI devices.  This feature is only valid for s390x PCI devices.  Data
+ * provided when setting and getting this feature is further described in
+ * vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_IOAT		(3)
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**
diff --git a/linux-headers/linux/vfio_zdev.h b/linux-headers/linux/vfio_zdev.h
index b4309397b6..b4c2ba8e71 100644
--- a/linux-headers/linux/vfio_zdev.h
+++ b/linux-headers/linux/vfio_zdev.h
@@ -47,6 +47,9 @@ struct vfio_device_info_cap_zpci_group {
 	__u16 noi;		/* Maximum number of MSIs */
 	__u16 maxstbl;		/* Maximum Store Block Length */
 	__u8 version;		/* Supported PCI Version */
+	/* End of version 1 */
+	__u8 dtsm;		/* Supported IOAT Designations */
+	/* End of version 2 */
 };
 
 /**
@@ -75,4 +78,52 @@ struct vfio_device_info_cap_zpci_pfip {
 	__u8 pfip[];
 };
 
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_INTERP
+ *
+ * This feature is used for enabling zPCI instruction interpretation for a
+ * device.  No data is provided when setting this feature.  When getting
+ * this feature, the following structure is provided which details whether
+ * or not interpretation is active and provides the guest with host device
+ * information necessary to enable interpretation.
+ */
+struct vfio_device_zpci_interp {
+	__u64 flags;
+#define VFIO_DEVICE_ZPCI_FLAG_INTERP 1
+	__u32 fh;		/* Host device function handle */
+};
+
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_AIF
+ *
+ * This feature is used for enabling forwarding of adapter interrupts directly
+ * from firmware to the guest.  When setting this feature, the flags indicate
+ * whether to enable/disable the feature and the structure defined below is
+ * used to setup the forwarding structures.  When getting this feature, only
+ * the flags are used to indicate the current state.
+ */
+struct vfio_device_zpci_aif {
+	__u64 flags;
+#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
+#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST 2
+	__u64 ibv;		/* Address of guest interrupt bit vector */
+	__u64 sb;		/* Address of guest summary bit */
+	__u32 noi;		/* Number of interrupts */
+	__u8 isc;		/* Guest interrupt subclass */
+	__u8 sbo;		/* Offset of guest summary bit vector */
+};
+
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_IOAT
+ *
+ * This feature is used for enabling guest I/O translation assistance for
+ * passthrough zPCI devices using instruction interpretation.  When setting
+ * this feature, the iota specifies a KVM guest I/O translation anchor.  When
+ * getting this feature, the most recently set anchor (or 0) is returned in
+ * iota.
+ */
+struct vfio_device_zpci_ioat {
+	__u64 iota;
+};
+
 #endif
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 1/9] Update linux headers
@ 2022-01-14 20:38   ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

This is a placeholder that pulls in 5.17 + unmerged kernel changes
required by this item.  A proper header sync can be done once the
associated kernel code merges.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 include/standard-headers/asm-x86/kvm_para.h   |  1 +
 include/standard-headers/drm/drm_fourcc.h     | 11 ++++
 include/standard-headers/linux/ethtool.h      |  1 +
 include/standard-headers/linux/fuse.h         | 60 +++++++++++++++++--
 include/standard-headers/linux/pci_regs.h     |  4 ++
 include/standard-headers/linux/virtio_iommu.h |  8 ++-
 linux-headers/asm-mips/unistd_n32.h           |  1 +
 linux-headers/asm-mips/unistd_n64.h           |  1 +
 linux-headers/asm-mips/unistd_o32.h           |  1 +
 linux-headers/asm-powerpc/unistd_32.h         |  1 +
 linux-headers/asm-powerpc/unistd_64.h         |  1 +
 linux-headers/asm-s390/kvm.h                  |  1 +
 linux-headers/asm-s390/unistd_32.h            |  1 +
 linux-headers/asm-s390/unistd_64.h            |  1 +
 linux-headers/linux/kvm.h                     |  1 +
 linux-headers/linux/vfio.h                    | 22 +++++++
 linux-headers/linux/vfio_zdev.h               | 51 ++++++++++++++++
 17 files changed, 162 insertions(+), 5 deletions(-)

diff --git a/include/standard-headers/asm-x86/kvm_para.h b/include/standard-headers/asm-x86/kvm_para.h
index 204cfb8640..f0235e58a1 100644
--- a/include/standard-headers/asm-x86/kvm_para.h
+++ b/include/standard-headers/asm-x86/kvm_para.h
@@ -8,6 +8,7 @@
  * should be used to determine that a VM is running under KVM.
  */
 #define KVM_CPUID_SIGNATURE	0x40000000
+#define KVM_SIGNATURE "KVMKVMKVM\0\0\0"
 
 /* This CPUID returns two feature bitmaps in eax, edx. Before enabling
  * a particular paravirtualization, the appropriate feature bit should
diff --git a/include/standard-headers/drm/drm_fourcc.h b/include/standard-headers/drm/drm_fourcc.h
index 2c025cb4fe..4888f85f69 100644
--- a/include/standard-headers/drm/drm_fourcc.h
+++ b/include/standard-headers/drm/drm_fourcc.h
@@ -313,6 +313,13 @@ extern "C" {
  */
 #define DRM_FORMAT_P016		fourcc_code('P', '0', '1', '6') /* 2x2 subsampled Cr:Cb plane 16 bits per channel */
 
+/* 2 plane YCbCr420.
+ * 3 10 bit components and 2 padding bits packed into 4 bytes.
+ * index 0 = Y plane, [31:0] x:Y2:Y1:Y0 2:10:10:10 little endian
+ * index 1 = Cr:Cb plane, [63:0] x:Cr2:Cb2:Cr1:x:Cb1:Cr0:Cb0 [2:10:10:10:2:10:10:10] little endian
+ */
+#define DRM_FORMAT_P030		fourcc_code('P', '0', '3', '0') /* 2x2 subsampled Cr:Cb plane 10 bits per channel packed */
+
 /* 3 plane non-subsampled (444) YCbCr
  * 16 bits per component, but only 10 bits are used and 6 bits are padded
  * index 0: Y plane, [15:0] Y:x [10:6] little endian
@@ -853,6 +860,10 @@ drm_fourcc_canonicalize_nvidia_format_mod(uint64_t modifier)
  * and UV.  Some SAND-using hardware stores UV in a separate tiled
  * image from Y to reduce the column height, which is not supported
  * with these modifiers.
+ *
+ * The DRM_FORMAT_MOD_BROADCOM_SAND128_COL_HEIGHT modifier is also
+ * supported for DRM_FORMAT_P030 where the columns remain as 128 bytes
+ * wide, but as this is a 10 bpp format that translates to 96 pixels.
  */
 
 #define DRM_FORMAT_MOD_BROADCOM_SAND32_COL_HEIGHT(v) \
diff --git a/include/standard-headers/linux/ethtool.h b/include/standard-headers/linux/ethtool.h
index 688eb8dc39..38d5a4cd6e 100644
--- a/include/standard-headers/linux/ethtool.h
+++ b/include/standard-headers/linux/ethtool.h
@@ -231,6 +231,7 @@ enum tunable_id {
 	ETHTOOL_RX_COPYBREAK,
 	ETHTOOL_TX_COPYBREAK,
 	ETHTOOL_PFC_PREVENTION_TOUT, /* timeout in msecs */
+	ETHTOOL_TX_COPYBREAK_BUF_SIZE,
 	/*
 	 * Add your fresh new tunable attribute above and remember to update
 	 * tunable_strings[] in net/ethtool/common.c
diff --git a/include/standard-headers/linux/fuse.h b/include/standard-headers/linux/fuse.h
index 23ea31708b..bda06258be 100644
--- a/include/standard-headers/linux/fuse.h
+++ b/include/standard-headers/linux/fuse.h
@@ -184,6 +184,16 @@
  *
  *  7.34
  *  - add FUSE_SYNCFS
+ *
+ *  7.35
+ *  - add FOPEN_NOFLUSH
+ *
+ *  7.36
+ *  - extend fuse_init_in with reserved fields, add FUSE_INIT_EXT init flag
+ *  - add flags2 to fuse_init_in and fuse_init_out
+ *  - add FUSE_SECURITY_CTX init flag
+ *  - add security context to create, mkdir, symlink, and mknod requests
+ *  - add FUSE_HAS_INODE_DAX, FUSE_ATTR_DAX
  */
 
 #ifndef _LINUX_FUSE_H
@@ -215,7 +225,7 @@
 #define FUSE_KERNEL_VERSION 7
 
 /** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 34
+#define FUSE_KERNEL_MINOR_VERSION 36
 
 /** The node ID of the root inode */
 #define FUSE_ROOT_ID 1
@@ -286,12 +296,14 @@ struct fuse_file_lock {
  * FOPEN_NONSEEKABLE: the file is not seekable
  * FOPEN_CACHE_DIR: allow caching this directory
  * FOPEN_STREAM: the file is stream-like (no file position at all)
+ * FOPEN_NOFLUSH: don't flush data cache on close (unless FUSE_WRITEBACK_CACHE)
  */
 #define FOPEN_DIRECT_IO		(1 << 0)
 #define FOPEN_KEEP_CACHE	(1 << 1)
 #define FOPEN_NONSEEKABLE	(1 << 2)
 #define FOPEN_CACHE_DIR		(1 << 3)
 #define FOPEN_STREAM		(1 << 4)
+#define FOPEN_NOFLUSH		(1 << 5)
 
 /**
  * INIT request/reply flags
@@ -332,6 +344,11 @@ struct fuse_file_lock {
  *			write/truncate sgid is killed only if file has group
  *			execute permission. (Same as Linux VFS behavior).
  * FUSE_SETXATTR_EXT:	Server supports extended struct fuse_setxattr_in
+ * FUSE_INIT_EXT: extended fuse_init_in request
+ * FUSE_INIT_RESERVED: reserved, do not use
+ * FUSE_SECURITY_CTX:	add security context to create, mkdir, symlink, and
+ *			mknod
+ * FUSE_HAS_INODE_DAX:  use per inode DAX
  */
 #define FUSE_ASYNC_READ		(1 << 0)
 #define FUSE_POSIX_LOCKS	(1 << 1)
@@ -363,6 +380,11 @@ struct fuse_file_lock {
 #define FUSE_SUBMOUNTS		(1 << 27)
 #define FUSE_HANDLE_KILLPRIV_V2	(1 << 28)
 #define FUSE_SETXATTR_EXT	(1 << 29)
+#define FUSE_INIT_EXT		(1 << 30)
+#define FUSE_INIT_RESERVED	(1 << 31)
+/* bits 32..63 get shifted down 32 bits into the flags2 field */
+#define FUSE_SECURITY_CTX	(1ULL << 32)
+#define FUSE_HAS_INODE_DAX	(1ULL << 33)
 
 /**
  * CUSE INIT request/reply flags
@@ -445,8 +467,10 @@ struct fuse_file_lock {
  * fuse_attr flags
  *
  * FUSE_ATTR_SUBMOUNT: Object is a submount root
+ * FUSE_ATTR_DAX: Enable DAX for this file in per inode DAX mode
  */
 #define FUSE_ATTR_SUBMOUNT      (1 << 0)
+#define FUSE_ATTR_DAX		(1 << 1)
 
 /**
  * Open flags
@@ -732,6 +756,8 @@ struct fuse_init_in {
 	uint32_t	minor;
 	uint32_t	max_readahead;
 	uint32_t	flags;
+	uint32_t	flags2;
+	uint32_t	unused[11];
 };
 
 #define FUSE_COMPAT_INIT_OUT_SIZE 8
@@ -748,7 +774,8 @@ struct fuse_init_out {
 	uint32_t	time_gran;
 	uint16_t	max_pages;
 	uint16_t	map_alignment;
-	uint32_t	unused[8];
+	uint32_t	flags2;
+	uint32_t	unused[7];
 };
 
 #define CUSE_INIT_INFO_MAX 4096
@@ -856,9 +883,12 @@ struct fuse_dirent {
 	char name[];
 };
 
-#define FUSE_NAME_OFFSET offsetof(struct fuse_dirent, name)
-#define FUSE_DIRENT_ALIGN(x) \
+/* Align variable length records to 64bit boundary */
+#define FUSE_REC_ALIGN(x) \
 	(((x) + sizeof(uint64_t) - 1) & ~(sizeof(uint64_t) - 1))
+
+#define FUSE_NAME_OFFSET offsetof(struct fuse_dirent, name)
+#define FUSE_DIRENT_ALIGN(x) FUSE_REC_ALIGN(x)
 #define FUSE_DIRENT_SIZE(d) \
 	FUSE_DIRENT_ALIGN(FUSE_NAME_OFFSET + (d)->namelen)
 
@@ -975,4 +1005,26 @@ struct fuse_syncfs_in {
 	uint64_t	padding;
 };
 
+/*
+ * For each security context, send fuse_secctx with size of security context
+ * fuse_secctx will be followed by security context name and this in turn
+ * will be followed by actual context label.
+ * fuse_secctx, name, context
+ */
+struct fuse_secctx {
+	uint32_t	size;
+	uint32_t	padding;
+};
+
+/*
+ * Contains the information about how many fuse_secctx structures are being
+ * sent and what's the total size of all security contexts (including
+ * size of fuse_secctx_header).
+ *
+ */
+struct fuse_secctx_header {
+	uint32_t	size;
+	uint32_t	nr_secctx;
+};
+
 #endif /* _LINUX_FUSE_H */
diff --git a/include/standard-headers/linux/pci_regs.h b/include/standard-headers/linux/pci_regs.h
index ff6ccbc6ef..318f3f1f9e 100644
--- a/include/standard-headers/linux/pci_regs.h
+++ b/include/standard-headers/linux/pci_regs.h
@@ -1086,7 +1086,11 @@
 
 /* Designated Vendor-Specific (DVSEC, PCI_EXT_CAP_ID_DVSEC) */
 #define PCI_DVSEC_HEADER1		0x4 /* Designated Vendor-Specific Header1 */
+#define  PCI_DVSEC_HEADER1_VID(x)	((x) & 0xffff)
+#define  PCI_DVSEC_HEADER1_REV(x)	(((x) >> 16) & 0xf)
+#define  PCI_DVSEC_HEADER1_LEN(x)	(((x) >> 20) & 0xfff)
 #define PCI_DVSEC_HEADER2		0x8 /* Designated Vendor-Specific Header2 */
+#define  PCI_DVSEC_HEADER2_ID(x)		((x) & 0xffff)
 
 /* Data Link Feature */
 #define PCI_DLF_CAP		0x04	/* Capabilities Register */
diff --git a/include/standard-headers/linux/virtio_iommu.h b/include/standard-headers/linux/virtio_iommu.h
index b9443b83a1..366379c2f0 100644
--- a/include/standard-headers/linux/virtio_iommu.h
+++ b/include/standard-headers/linux/virtio_iommu.h
@@ -16,6 +16,7 @@
 #define VIRTIO_IOMMU_F_BYPASS			3
 #define VIRTIO_IOMMU_F_PROBE			4
 #define VIRTIO_IOMMU_F_MMIO			5
+#define VIRTIO_IOMMU_F_BYPASS_CONFIG		6
 
 struct virtio_iommu_range_64 {
 	uint64_t					start;
@@ -36,6 +37,8 @@ struct virtio_iommu_config {
 	struct virtio_iommu_range_32		domain_range;
 	/* Probe buffer size */
 	uint32_t					probe_size;
+	uint8_t					bypass;
+	uint8_t					reserved[3];
 };
 
 /* Request types */
@@ -66,11 +69,14 @@ struct virtio_iommu_req_tail {
 	uint8_t					reserved[3];
 };
 
+#define VIRTIO_IOMMU_ATTACH_F_BYPASS		(1 << 0)
+
 struct virtio_iommu_req_attach {
 	struct virtio_iommu_req_head		head;
 	uint32_t					domain;
 	uint32_t					endpoint;
-	uint8_t					reserved[8];
+	uint32_t					flags;
+	uint8_t					reserved[4];
 	struct virtio_iommu_req_tail		tail;
 };
 
diff --git a/linux-headers/asm-mips/unistd_n32.h b/linux-headers/asm-mips/unistd_n32.h
index 4b3e7ad1ec..790309211a 100644
--- a/linux-headers/asm-mips/unistd_n32.h
+++ b/linux-headers/asm-mips/unistd_n32.h
@@ -377,5 +377,6 @@
 #define __NR_landlock_add_rule (__NR_Linux + 445)
 #define __NR_landlock_restrict_self (__NR_Linux + 446)
 #define __NR_process_mrelease (__NR_Linux + 448)
+#define __NR_futex_waitv (__NR_Linux + 449)
 
 #endif /* _ASM_UNISTD_N32_H */
diff --git a/linux-headers/asm-mips/unistd_n64.h b/linux-headers/asm-mips/unistd_n64.h
index 488d9298d9..d68a3cc573 100644
--- a/linux-headers/asm-mips/unistd_n64.h
+++ b/linux-headers/asm-mips/unistd_n64.h
@@ -353,5 +353,6 @@
 #define __NR_landlock_add_rule (__NR_Linux + 445)
 #define __NR_landlock_restrict_self (__NR_Linux + 446)
 #define __NR_process_mrelease (__NR_Linux + 448)
+#define __NR_futex_waitv (__NR_Linux + 449)
 
 #endif /* _ASM_UNISTD_N64_H */
diff --git a/linux-headers/asm-mips/unistd_o32.h b/linux-headers/asm-mips/unistd_o32.h
index f47399870a..18cfa981c7 100644
--- a/linux-headers/asm-mips/unistd_o32.h
+++ b/linux-headers/asm-mips/unistd_o32.h
@@ -423,5 +423,6 @@
 #define __NR_landlock_add_rule (__NR_Linux + 445)
 #define __NR_landlock_restrict_self (__NR_Linux + 446)
 #define __NR_process_mrelease (__NR_Linux + 448)
+#define __NR_futex_waitv (__NR_Linux + 449)
 
 #endif /* _ASM_UNISTD_O32_H */
diff --git a/linux-headers/asm-powerpc/unistd_32.h b/linux-headers/asm-powerpc/unistd_32.h
index 11d54696dc..c8a97e6f51 100644
--- a/linux-headers/asm-powerpc/unistd_32.h
+++ b/linux-headers/asm-powerpc/unistd_32.h
@@ -430,6 +430,7 @@
 #define __NR_landlock_add_rule 445
 #define __NR_landlock_restrict_self 446
 #define __NR_process_mrelease 448
+#define __NR_futex_waitv 449
 
 
 #endif /* _ASM_UNISTD_32_H */
diff --git a/linux-headers/asm-powerpc/unistd_64.h b/linux-headers/asm-powerpc/unistd_64.h
index cf740bab13..5dd37251b8 100644
--- a/linux-headers/asm-powerpc/unistd_64.h
+++ b/linux-headers/asm-powerpc/unistd_64.h
@@ -402,6 +402,7 @@
 #define __NR_landlock_add_rule 445
 #define __NR_landlock_restrict_self 446
 #define __NR_process_mrelease 448
+#define __NR_futex_waitv 449
 
 
 #endif /* _ASM_UNISTD_64_H */
diff --git a/linux-headers/asm-s390/kvm.h b/linux-headers/asm-s390/kvm.h
index f053b8304a..d8259ff9a1 100644
--- a/linux-headers/asm-s390/kvm.h
+++ b/linux-headers/asm-s390/kvm.h
@@ -130,6 +130,7 @@ struct kvm_s390_vm_cpu_machine {
 #define KVM_S390_VM_CPU_FEAT_PFMFI	11
 #define KVM_S390_VM_CPU_FEAT_SIGPIF	12
 #define KVM_S390_VM_CPU_FEAT_KSS	13
+#define KVM_S390_VM_CPU_FEAT_ZPCI_INTERP 14
 struct kvm_s390_vm_cpu_feat {
 	__u64 feat[16];
 };
diff --git a/linux-headers/asm-s390/unistd_32.h b/linux-headers/asm-s390/unistd_32.h
index 8f97d98128..8c60e40ab1 100644
--- a/linux-headers/asm-s390/unistd_32.h
+++ b/linux-headers/asm-s390/unistd_32.h
@@ -420,5 +420,6 @@
 #define __NR_landlock_add_rule 445
 #define __NR_landlock_restrict_self 446
 #define __NR_process_mrelease 448
+#define __NR_futex_waitv 449
 
 #endif /* _ASM_S390_UNISTD_32_H */
diff --git a/linux-headers/asm-s390/unistd_64.h b/linux-headers/asm-s390/unistd_64.h
index 021ffc30e6..5793aa2a83 100644
--- a/linux-headers/asm-s390/unistd_64.h
+++ b/linux-headers/asm-s390/unistd_64.h
@@ -368,5 +368,6 @@
 #define __NR_landlock_add_rule 445
 #define __NR_landlock_restrict_self 446
 #define __NR_process_mrelease 448
+#define __NR_futex_waitv 449
 
 #endif /* _ASM_S390_UNISTD_64_H */
diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index 02c5e7b7bb..f61f357899 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -1130,6 +1130,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_BINARY_STATS_FD 203
 #define KVM_CAP_EXIT_ON_EMULATION_FAILURE 204
 #define KVM_CAP_ARM_MTE 205
+#define KVM_CAP_VM_MOVE_ENC_CONTEXT_FROM 206
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index e680594f27..96b18b872a 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -1002,6 +1002,28 @@ struct vfio_device_feature {
  */
 #define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN	(0)
 
+/*
+ * Provide support for enabling interpretation of zPCI instructions.  This
+ * feature is only valid for s390x PCI devices.  Data provided when setting
+ * and getting this feature is futher described in vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_INTERP		(1)
+
+/*
+ * Provide support for enbaling adapter interruption forwarding for zPCI
+ * devices.  This feature is only valid for s390x PCI devices.  Data provided
+ * when setting and getting this feature is further described in vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_AIF		(2)
+
+/*
+ * Provide support for enabling guest I/O address translation assistance for
+ * zPCI devices.  This feature is only valid for s390x PCI devices.  Data
+ * provided when setting and getting this feature is further described in
+ * vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_IOAT		(3)
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**
diff --git a/linux-headers/linux/vfio_zdev.h b/linux-headers/linux/vfio_zdev.h
index b4309397b6..b4c2ba8e71 100644
--- a/linux-headers/linux/vfio_zdev.h
+++ b/linux-headers/linux/vfio_zdev.h
@@ -47,6 +47,9 @@ struct vfio_device_info_cap_zpci_group {
 	__u16 noi;		/* Maximum number of MSIs */
 	__u16 maxstbl;		/* Maximum Store Block Length */
 	__u8 version;		/* Supported PCI Version */
+	/* End of version 1 */
+	__u8 dtsm;		/* Supported IOAT Designations */
+	/* End of version 2 */
 };
 
 /**
@@ -75,4 +78,52 @@ struct vfio_device_info_cap_zpci_pfip {
 	__u8 pfip[];
 };
 
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_INTERP
+ *
+ * This feature is used for enabling zPCI instruction interpretation for a
+ * device.  No data is provided when setting this feature.  When getting
+ * this feature, the following structure is provided which details whether
+ * or not interpretation is active and provides the guest with host device
+ * information necessary to enable interpretation.
+ */
+struct vfio_device_zpci_interp {
+	__u64 flags;
+#define VFIO_DEVICE_ZPCI_FLAG_INTERP 1
+	__u32 fh;		/* Host device function handle */
+};
+
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_AIF
+ *
+ * This feature is used for enabling forwarding of adapter interrupts directly
+ * from firmware to the guest.  When setting this feature, the flags indicate
+ * whether to enable/disable the feature and the structure defined below is
+ * used to setup the forwarding structures.  When getting this feature, only
+ * the flags are used to indicate the current state.
+ */
+struct vfio_device_zpci_aif {
+	__u64 flags;
+#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
+#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST 2
+	__u64 ibv;		/* Address of guest interrupt bit vector */
+	__u64 sb;		/* Address of guest summary bit */
+	__u32 noi;		/* Number of interrupts */
+	__u8 isc;		/* Guest interrupt subclass */
+	__u8 sbo;		/* Offset of guest summary bit vector */
+};
+
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_IOAT
+ *
+ * This feature is used for enabling guest I/O translation assistance for
+ * passthrough zPCI devices using instruction interpretation.  When setting
+ * this feature, the iota specifies a KVM guest I/O translation anchor.  When
+ * getting this feature, the most recently set anchor (or 0) is returned in
+ * iota.
+ */
+struct vfio_device_zpci_ioat {
+	__u64 iota;
+};
+
 #endif
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 2/9] target/s390x: add zpci-interp to cpu models
  2022-01-14 20:38 ` Matthew Rosato
@ 2022-01-14 20:38   ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

The zpci-interp feature is used to specify whether zPCI interpretation is
to be used for this guest.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 target/s390x/cpu_features_def.h.inc | 1 +
 target/s390x/gen-features.c         | 2 ++
 target/s390x/kvm/kvm.c              | 1 +
 3 files changed, 4 insertions(+)

diff --git a/target/s390x/cpu_features_def.h.inc b/target/s390x/cpu_features_def.h.inc
index e86662bb3b..4ade3182aa 100644
--- a/target/s390x/cpu_features_def.h.inc
+++ b/target/s390x/cpu_features_def.h.inc
@@ -146,6 +146,7 @@ DEF_FEAT(SIE_CEI, "cei", SCLP_CPU, 43, "SIE: Conditional-external-interception f
 DEF_FEAT(DAT_ENH_2, "dateh2", MISC, 0, "DAT-enhancement facility 2")
 DEF_FEAT(CMM, "cmm", MISC, 0, "Collaborative-memory-management facility")
 DEF_FEAT(AP, "ap", MISC, 0, "AP instructions installed")
+DEF_FEAT(ZPCI_INTERP, "zpci-interp", MISC, 0, "zPCI interpretation")
 
 /* Features exposed via the PLO instruction. */
 DEF_FEAT(PLO_CL, "plo-cl", PLO, 0, "PLO Compare and load (32 bit in general registers)")
diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
index 7cb1a6ec10..7005d22415 100644
--- a/target/s390x/gen-features.c
+++ b/target/s390x/gen-features.c
@@ -554,6 +554,7 @@ static uint16_t full_GEN14_GA1[] = {
     S390_FEAT_HPMA2,
     S390_FEAT_SIE_KSS,
     S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+    S390_FEAT_ZPCI_INTERP,
 };
 
 #define full_GEN14_GA2 EmptyFeat
@@ -650,6 +651,7 @@ static uint16_t default_GEN14_GA1[] = {
     S390_FEAT_GROUP_MSA_EXT_8,
     S390_FEAT_MULTIPLE_EPOCH,
     S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+    S390_FEAT_ZPCI_INTERP,
 };
 
 #define default_GEN14_GA2 EmptyFeat
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 6acf14d5ec..0357bfda89 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2294,6 +2294,7 @@ static int kvm_to_feat[][2] = {
     { KVM_S390_VM_CPU_FEAT_PFMFI, S390_FEAT_SIE_PFMFI},
     { KVM_S390_VM_CPU_FEAT_SIGPIF, S390_FEAT_SIE_SIGPIF},
     { KVM_S390_VM_CPU_FEAT_KSS, S390_FEAT_SIE_KSS},
+    { KVM_S390_VM_CPU_FEAT_ZPCI_INTERP, S390_FEAT_ZPCI_INTERP },
 };
 
 static int query_cpu_feat(S390FeatBitmap features)
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 2/9] target/s390x: add zpci-interp to cpu models
@ 2022-01-14 20:38   ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

The zpci-interp feature is used to specify whether zPCI interpretation is
to be used for this guest.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 target/s390x/cpu_features_def.h.inc | 1 +
 target/s390x/gen-features.c         | 2 ++
 target/s390x/kvm/kvm.c              | 1 +
 3 files changed, 4 insertions(+)

diff --git a/target/s390x/cpu_features_def.h.inc b/target/s390x/cpu_features_def.h.inc
index e86662bb3b..4ade3182aa 100644
--- a/target/s390x/cpu_features_def.h.inc
+++ b/target/s390x/cpu_features_def.h.inc
@@ -146,6 +146,7 @@ DEF_FEAT(SIE_CEI, "cei", SCLP_CPU, 43, "SIE: Conditional-external-interception f
 DEF_FEAT(DAT_ENH_2, "dateh2", MISC, 0, "DAT-enhancement facility 2")
 DEF_FEAT(CMM, "cmm", MISC, 0, "Collaborative-memory-management facility")
 DEF_FEAT(AP, "ap", MISC, 0, "AP instructions installed")
+DEF_FEAT(ZPCI_INTERP, "zpci-interp", MISC, 0, "zPCI interpretation")
 
 /* Features exposed via the PLO instruction. */
 DEF_FEAT(PLO_CL, "plo-cl", PLO, 0, "PLO Compare and load (32 bit in general registers)")
diff --git a/target/s390x/gen-features.c b/target/s390x/gen-features.c
index 7cb1a6ec10..7005d22415 100644
--- a/target/s390x/gen-features.c
+++ b/target/s390x/gen-features.c
@@ -554,6 +554,7 @@ static uint16_t full_GEN14_GA1[] = {
     S390_FEAT_HPMA2,
     S390_FEAT_SIE_KSS,
     S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+    S390_FEAT_ZPCI_INTERP,
 };
 
 #define full_GEN14_GA2 EmptyFeat
@@ -650,6 +651,7 @@ static uint16_t default_GEN14_GA1[] = {
     S390_FEAT_GROUP_MSA_EXT_8,
     S390_FEAT_MULTIPLE_EPOCH,
     S390_FEAT_GROUP_MULTIPLE_EPOCH_PTFF,
+    S390_FEAT_ZPCI_INTERP,
 };
 
 #define default_GEN14_GA2 EmptyFeat
diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index 6acf14d5ec..0357bfda89 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -2294,6 +2294,7 @@ static int kvm_to_feat[][2] = {
     { KVM_S390_VM_CPU_FEAT_PFMFI, S390_FEAT_SIE_PFMFI},
     { KVM_S390_VM_CPU_FEAT_SIGPIF, S390_FEAT_SIE_SIGPIF},
     { KVM_S390_VM_CPU_FEAT_KSS, S390_FEAT_SIE_KSS},
+    { KVM_S390_VM_CPU_FEAT_ZPCI_INTERP, S390_FEAT_ZPCI_INTERP },
 };
 
 static int query_cpu_feat(S390FeatBitmap features)
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 3/9] fixup: force interp off for QEMU machine 6.2 and older
  2022-01-14 20:38 ` Matthew Rosato
@ 2022-01-14 20:38   ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

Double-check I'm doing this right + test.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-virtio-ccw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 84e3e63c43..e02fe11b07 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -803,6 +803,7 @@ DEFINE_CCW_MACHINE(7_0, "7.0", true);
 static void ccw_machine_6_2_instance_options(MachineState *machine)
 {
     ccw_machine_7_0_instance_options(machine);
+    s390_cpudef_featoff_greater(14, 1, S390_FEAT_ZPCI_INTERP);
 }
 
 static void ccw_machine_6_2_class_options(MachineClass *mc)
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 3/9] fixup: force interp off for QEMU machine 6.2 and older
@ 2022-01-14 20:38   ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

Double-check I'm doing this right + test.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-virtio-ccw.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 84e3e63c43..e02fe11b07 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -803,6 +803,7 @@ DEFINE_CCW_MACHINE(7_0, "7.0", true);
 static void ccw_machine_6_2_instance_options(MachineState *machine)
 {
     ccw_machine_7_0_instance_options(machine);
+    s390_cpudef_featoff_greater(14, 1, S390_FEAT_ZPCI_INTERP);
 }
 
 static void ccw_machine_6_2_class_options(MachineClass *mc)
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
  2022-01-14 20:38 ` Matthew Rosato
@ 2022-01-14 20:38   ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

Use the associated vfio feature ioctl to enable interpretation for devices
when requested.  As part of this process, we must use the host function
handle rather than a QEMU-generated one -- this is provided as part of the
ioctl payload.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c          | 70 +++++++++++++++++++++++++++++++-
 hw/s390x/s390-pci-inst.c         | 63 +++++++++++++++++++++++++++-
 hw/s390x/s390-pci-vfio.c         | 52 ++++++++++++++++++++++++
 include/hw/s390x/s390-pci-bus.h  |  1 +
 include/hw/s390x/s390-pci-vfio.h | 15 +++++++
 5 files changed, 199 insertions(+), 2 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 01b58ebc70..a39ccfee05 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -971,12 +971,58 @@ static void s390_pci_update_subordinate(PCIDevice *dev, uint32_t nr)
     }
 }
 
+static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)
+{
+    uint32_t idx;
+    int rc;
+
+    rc = s390_pci_probe_interp(pbdev);
+    if (rc) {
+        return rc;
+    }
+
+    rc = s390_pci_update_passthrough_fh(pbdev);
+    if (rc) {
+        return rc;
+    }
+
+    /*
+     * The host device is already in an enabled state, but we always present
+     * the initial device state to the guest as disabled (ZPCI_FS_DISABLED).
+     * Therefore, mask off the enable bit from the passthrough handle until
+     * the guest issues a CLP SET PCI FN later to enable the device.
+     */
+    pbdev->fh &= ~FH_MASK_ENABLE;
+
+    /* Next, see if the idx is already in-use */
+    idx = pbdev->fh & FH_MASK_INDEX;
+    if (pbdev->idx != idx) {
+        if (s390_pci_find_dev_by_idx(s, idx)) {
+            return -EINVAL;
+        }
+        /*
+         * Update the idx entry with the passed through idx
+         * If the relinquished idx is lower than next_idx, use it
+         * to replace next_idx
+         */
+        g_hash_table_remove(s->zpci_table, &pbdev->idx);
+        if (idx < s->next_idx) {
+            s->next_idx = idx;
+        }
+        pbdev->idx = idx;
+        g_hash_table_insert(s->zpci_table, &pbdev->idx, pbdev);
+    }
+
+    return 0;
+}
+
 static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                               Error **errp)
 {
     S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
     PCIDevice *pdev = NULL;
     S390PCIBusDevice *pbdev = NULL;
+    int rc;
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
         PCIBridge *pb = PCI_BRIDGE(dev);
@@ -1022,12 +1068,33 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         set_pbdev_info(pbdev);
 
         if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
-            pbdev->fh |= FH_SHM_VFIO;
+            /*
+             * By default, interpretation is always requested; if the available
+             * facilities indicate it is not available, fallback to the
+             * intercept model.
+             */
+            if (pbdev->interp && !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
+                    DPRINTF("zPCI interpretation facilities missing.\n");
+                    pbdev->interp = false;
+                }
+            if (pbdev->interp) {
+                rc = s390_pci_interp_plug(s, pbdev);
+                if (rc) {
+                    error_setg(errp, "zpci interp plug failed: %d", rc);
+                    return;
+                }
+            }
             pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
             /* Fill in CLP information passed via the vfio region */
             s390_pci_get_clp_info(pbdev);
+            if (!pbdev->interp) {
+                /* Do vfio passthrough but intercept for I/O */
+                pbdev->fh |= FH_SHM_VFIO;
+            }
         } else {
             pbdev->fh |= FH_SHM_EMUL;
+            /* Always intercept emulated devices */
+            pbdev->interp = false;
         }
 
         if (s390_pci_msix_init(pbdev)) {
@@ -1360,6 +1427,7 @@ static Property s390_pci_device_properties[] = {
     DEFINE_PROP_UINT16("uid", S390PCIBusDevice, uid, UID_UNDEFINED),
     DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
     DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
+    DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 6d400d4147..e9a0dc12e4 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -18,6 +18,7 @@
 #include "sysemu/hw_accel.h"
 #include "hw/s390x/s390-pci-inst.h"
 #include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/s390-pci-vfio.h"
 #include "hw/s390x/tod.h"
 
 #ifndef DEBUG_S390PCI_INST
@@ -156,6 +157,47 @@ out:
     return rc;
 }
 
+static int clp_enable_interp(S390PCIBusDevice *pbdev)
+{
+    int rc;
+
+    rc = s390_pci_set_interp(pbdev, true);
+    if (rc) {
+        DPRINTF("Failed to enable interpretation\n");
+        return rc;
+    }
+    rc = s390_pci_update_passthrough_fh(pbdev);
+    if (rc) {
+        DPRINTF("Failed to update passthrough fh\n");
+        return rc;
+    }
+    if (!(pbdev->fh & FH_MASK_ENABLE)) {
+        DPRINTF("Passthrough handle is not enabled\n");
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static int clp_disable_interp(S390PCIBusDevice *pbdev)
+{
+    int rc;
+
+    rc = s390_pci_set_interp(pbdev, false);
+    if (rc) {
+        DPRINTF("Failed to disable interpretation\n");
+        return rc;
+    }
+
+    rc = s390_pci_update_passthrough_fh(pbdev);
+    if (rc) {
+        DPRINTF("Failed to update passthrough fh\n");
+        return rc;
+    }
+
+    return 0;
+}
+
 int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
 {
     ClpReqHdr *reqh;
@@ -246,7 +288,19 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
                 goto out;
             }
 
-            pbdev->fh |= FH_MASK_ENABLE;
+            /*
+             * If interpretation is specified, attempt to enable this now and
+             * update with the host fh
+             */
+            if (pbdev->interp) {
+                if (clp_enable_interp(pbdev)) {
+                    stw_p(&ressetpci->hdr.rsp, CLP_RC_SETPCIFN_ERR);
+                    goto out;
+                }
+            } else {
+                pbdev->fh |= FH_MASK_ENABLE;
+            }
+
             pbdev->state = ZPCI_FS_ENABLED;
             stl_p(&ressetpci->fh, pbdev->fh);
             stw_p(&ressetpci->hdr.rsp, CLP_RC_OK);
@@ -257,6 +311,13 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
                 goto out;
             }
             device_legacy_reset(DEVICE(pbdev));
+            if (pbdev->interp) {
+                if (clp_disable_interp(pbdev)) {
+                    stw_p(&ressetpci->hdr.rsp, CLP_RC_SETPCIFN_ERR);
+                    goto out;
+                }
+            }
+            /* Mask off the enabled bit for interpreted devices too */
             pbdev->fh &= ~FH_MASK_ENABLE;
             pbdev->state = ZPCI_FS_DISABLED;
             stl_p(&ressetpci->fh, pbdev->fh);
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 6f80a47e29..2cab3a9e89 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -97,6 +97,58 @@ void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt)
     }
 }
 
+int s390_pci_probe_interp(S390PCIBusDevice *pbdev)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_feature feat = {
+        .argsz = sizeof(struct vfio_device_feature),
+        .flags = VFIO_DEVICE_FEATURE_PROBE | VFIO_DEVICE_FEATURE_ZPCI_INTERP
+    };
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat);
+}
+
+int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_zpci_interp *data;
+    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
+    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
+
+    feat->argsz = size;
+    feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_INTERP;
+
+    data = (struct vfio_device_zpci_interp *)&feat->data;
+    if (enable) {
+        data->flags = VFIO_DEVICE_ZPCI_FLAG_INTERP;
+    } else {
+        data->flags = 0;
+    }
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+}
+
+int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_zpci_interp *data;
+    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
+    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
+    int rc;
+
+    feat->argsz = size;
+    feat->flags = VFIO_DEVICE_FEATURE_GET + VFIO_DEVICE_FEATURE_ZPCI_INTERP;
+
+    rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+    if (rc) {
+        return rc;
+    }
+
+    data = (struct vfio_device_zpci_interp *)&feat->data;
+    pbdev->fh = data->fh;
+    return 0;
+}
+
 static void s390_pci_read_base(S390PCIBusDevice *pbdev,
                                struct vfio_device_info *info)
 {
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index da3cde2bb4..a9843dfe97 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -350,6 +350,7 @@ struct S390PCIBusDevice {
     IndAddr *indicator;
     bool pci_unplug_request_processed;
     bool unplug_requested;
+    bool interp;
     QTAILQ_ENTRY(S390PCIBusDevice) link;
 };
 
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index ff708aef50..42533e38f7 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -20,6 +20,9 @@ bool s390_pci_update_dma_avail(int fd, unsigned int *avail);
 S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
                                           S390PCIBusDevice *pbdev);
 void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
+int s390_pci_probe_interp(S390PCIBusDevice *pbdev);
+int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable);
+int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev);
 void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
 #else
 static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
@@ -33,6 +36,18 @@ static inline S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
 }
 static inline void s390_pci_end_dma_count(S390pciState *s,
                                           S390PCIDMACount *cnt) { }
+int s390_pci_probe_interp(S390PCIBusDevice *pbdev)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
+{
+    return -EINVAL;
+}
 static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
 #endif
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
@ 2022-01-14 20:38   ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

Use the associated vfio feature ioctl to enable interpretation for devices
when requested.  As part of this process, we must use the host function
handle rather than a QEMU-generated one -- this is provided as part of the
ioctl payload.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c          | 70 +++++++++++++++++++++++++++++++-
 hw/s390x/s390-pci-inst.c         | 63 +++++++++++++++++++++++++++-
 hw/s390x/s390-pci-vfio.c         | 52 ++++++++++++++++++++++++
 include/hw/s390x/s390-pci-bus.h  |  1 +
 include/hw/s390x/s390-pci-vfio.h | 15 +++++++
 5 files changed, 199 insertions(+), 2 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 01b58ebc70..a39ccfee05 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -971,12 +971,58 @@ static void s390_pci_update_subordinate(PCIDevice *dev, uint32_t nr)
     }
 }
 
+static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)
+{
+    uint32_t idx;
+    int rc;
+
+    rc = s390_pci_probe_interp(pbdev);
+    if (rc) {
+        return rc;
+    }
+
+    rc = s390_pci_update_passthrough_fh(pbdev);
+    if (rc) {
+        return rc;
+    }
+
+    /*
+     * The host device is already in an enabled state, but we always present
+     * the initial device state to the guest as disabled (ZPCI_FS_DISABLED).
+     * Therefore, mask off the enable bit from the passthrough handle until
+     * the guest issues a CLP SET PCI FN later to enable the device.
+     */
+    pbdev->fh &= ~FH_MASK_ENABLE;
+
+    /* Next, see if the idx is already in-use */
+    idx = pbdev->fh & FH_MASK_INDEX;
+    if (pbdev->idx != idx) {
+        if (s390_pci_find_dev_by_idx(s, idx)) {
+            return -EINVAL;
+        }
+        /*
+         * Update the idx entry with the passed through idx
+         * If the relinquished idx is lower than next_idx, use it
+         * to replace next_idx
+         */
+        g_hash_table_remove(s->zpci_table, &pbdev->idx);
+        if (idx < s->next_idx) {
+            s->next_idx = idx;
+        }
+        pbdev->idx = idx;
+        g_hash_table_insert(s->zpci_table, &pbdev->idx, pbdev);
+    }
+
+    return 0;
+}
+
 static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                               Error **errp)
 {
     S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
     PCIDevice *pdev = NULL;
     S390PCIBusDevice *pbdev = NULL;
+    int rc;
 
     if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
         PCIBridge *pb = PCI_BRIDGE(dev);
@@ -1022,12 +1068,33 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
         set_pbdev_info(pbdev);
 
         if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
-            pbdev->fh |= FH_SHM_VFIO;
+            /*
+             * By default, interpretation is always requested; if the available
+             * facilities indicate it is not available, fallback to the
+             * intercept model.
+             */
+            if (pbdev->interp && !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
+                    DPRINTF("zPCI interpretation facilities missing.\n");
+                    pbdev->interp = false;
+                }
+            if (pbdev->interp) {
+                rc = s390_pci_interp_plug(s, pbdev);
+                if (rc) {
+                    error_setg(errp, "zpci interp plug failed: %d", rc);
+                    return;
+                }
+            }
             pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
             /* Fill in CLP information passed via the vfio region */
             s390_pci_get_clp_info(pbdev);
+            if (!pbdev->interp) {
+                /* Do vfio passthrough but intercept for I/O */
+                pbdev->fh |= FH_SHM_VFIO;
+            }
         } else {
             pbdev->fh |= FH_SHM_EMUL;
+            /* Always intercept emulated devices */
+            pbdev->interp = false;
         }
 
         if (s390_pci_msix_init(pbdev)) {
@@ -1360,6 +1427,7 @@ static Property s390_pci_device_properties[] = {
     DEFINE_PROP_UINT16("uid", S390PCIBusDevice, uid, UID_UNDEFINED),
     DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
     DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
+    DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 6d400d4147..e9a0dc12e4 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -18,6 +18,7 @@
 #include "sysemu/hw_accel.h"
 #include "hw/s390x/s390-pci-inst.h"
 #include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/s390-pci-vfio.h"
 #include "hw/s390x/tod.h"
 
 #ifndef DEBUG_S390PCI_INST
@@ -156,6 +157,47 @@ out:
     return rc;
 }
 
+static int clp_enable_interp(S390PCIBusDevice *pbdev)
+{
+    int rc;
+
+    rc = s390_pci_set_interp(pbdev, true);
+    if (rc) {
+        DPRINTF("Failed to enable interpretation\n");
+        return rc;
+    }
+    rc = s390_pci_update_passthrough_fh(pbdev);
+    if (rc) {
+        DPRINTF("Failed to update passthrough fh\n");
+        return rc;
+    }
+    if (!(pbdev->fh & FH_MASK_ENABLE)) {
+        DPRINTF("Passthrough handle is not enabled\n");
+        return -EINVAL;
+    }
+
+    return 0;
+}
+
+static int clp_disable_interp(S390PCIBusDevice *pbdev)
+{
+    int rc;
+
+    rc = s390_pci_set_interp(pbdev, false);
+    if (rc) {
+        DPRINTF("Failed to disable interpretation\n");
+        return rc;
+    }
+
+    rc = s390_pci_update_passthrough_fh(pbdev);
+    if (rc) {
+        DPRINTF("Failed to update passthrough fh\n");
+        return rc;
+    }
+
+    return 0;
+}
+
 int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
 {
     ClpReqHdr *reqh;
@@ -246,7 +288,19 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
                 goto out;
             }
 
-            pbdev->fh |= FH_MASK_ENABLE;
+            /*
+             * If interpretation is specified, attempt to enable this now and
+             * update with the host fh
+             */
+            if (pbdev->interp) {
+                if (clp_enable_interp(pbdev)) {
+                    stw_p(&ressetpci->hdr.rsp, CLP_RC_SETPCIFN_ERR);
+                    goto out;
+                }
+            } else {
+                pbdev->fh |= FH_MASK_ENABLE;
+            }
+
             pbdev->state = ZPCI_FS_ENABLED;
             stl_p(&ressetpci->fh, pbdev->fh);
             stw_p(&ressetpci->hdr.rsp, CLP_RC_OK);
@@ -257,6 +311,13 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
                 goto out;
             }
             device_legacy_reset(DEVICE(pbdev));
+            if (pbdev->interp) {
+                if (clp_disable_interp(pbdev)) {
+                    stw_p(&ressetpci->hdr.rsp, CLP_RC_SETPCIFN_ERR);
+                    goto out;
+                }
+            }
+            /* Mask off the enabled bit for interpreted devices too */
             pbdev->fh &= ~FH_MASK_ENABLE;
             pbdev->state = ZPCI_FS_DISABLED;
             stl_p(&ressetpci->fh, pbdev->fh);
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 6f80a47e29..2cab3a9e89 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -97,6 +97,58 @@ void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt)
     }
 }
 
+int s390_pci_probe_interp(S390PCIBusDevice *pbdev)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_feature feat = {
+        .argsz = sizeof(struct vfio_device_feature),
+        .flags = VFIO_DEVICE_FEATURE_PROBE | VFIO_DEVICE_FEATURE_ZPCI_INTERP
+    };
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat);
+}
+
+int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_zpci_interp *data;
+    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
+    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
+
+    feat->argsz = size;
+    feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_INTERP;
+
+    data = (struct vfio_device_zpci_interp *)&feat->data;
+    if (enable) {
+        data->flags = VFIO_DEVICE_ZPCI_FLAG_INTERP;
+    } else {
+        data->flags = 0;
+    }
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+}
+
+int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_zpci_interp *data;
+    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
+    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
+    int rc;
+
+    feat->argsz = size;
+    feat->flags = VFIO_DEVICE_FEATURE_GET + VFIO_DEVICE_FEATURE_ZPCI_INTERP;
+
+    rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+    if (rc) {
+        return rc;
+    }
+
+    data = (struct vfio_device_zpci_interp *)&feat->data;
+    pbdev->fh = data->fh;
+    return 0;
+}
+
 static void s390_pci_read_base(S390PCIBusDevice *pbdev,
                                struct vfio_device_info *info)
 {
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index da3cde2bb4..a9843dfe97 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -350,6 +350,7 @@ struct S390PCIBusDevice {
     IndAddr *indicator;
     bool pci_unplug_request_processed;
     bool unplug_requested;
+    bool interp;
     QTAILQ_ENTRY(S390PCIBusDevice) link;
 };
 
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index ff708aef50..42533e38f7 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -20,6 +20,9 @@ bool s390_pci_update_dma_avail(int fd, unsigned int *avail);
 S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
                                           S390PCIBusDevice *pbdev);
 void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
+int s390_pci_probe_interp(S390PCIBusDevice *pbdev);
+int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable);
+int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev);
 void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
 #else
 static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
@@ -33,6 +36,18 @@ static inline S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
 }
 static inline void s390_pci_end_dma_count(S390pciState *s,
                                           S390PCIDMACount *cnt) { }
+int s390_pci_probe_interp(S390PCIBusDevice *pbdev)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
+{
+    return -EINVAL;
+}
 static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
 #endif
 
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 5/9] s390x/pci: don't fence interpreted devices without MSI-X
  2022-01-14 20:38 ` Matthew Rosato
@ 2022-01-14 20:38   ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

Lack of MSI-X support is not an issue for interpreted passthrough
devices, so let's let these in.  This will allow, for example, ISM
devices to be passed through -- but only when interpretation is
available and being used.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index a39ccfee05..66649af6e0 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -1097,7 +1097,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             pbdev->interp = false;
         }
 
-        if (s390_pci_msix_init(pbdev)) {
+        if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
             error_setg(errp, "MSI-X support is mandatory "
                        "in the S390 architecture");
             return;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 5/9] s390x/pci: don't fence interpreted devices without MSI-X
@ 2022-01-14 20:38   ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

Lack of MSI-X support is not an issue for interpreted passthrough
devices, so let's let these in.  This will allow, for example, ISM
devices to be passed through -- but only when interpretation is
available and being used.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index a39ccfee05..66649af6e0 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -1097,7 +1097,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             pbdev->interp = false;
         }
 
-        if (s390_pci_msix_init(pbdev)) {
+        if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
             error_setg(errp, "MSI-X support is mandatory "
                        "in the S390 architecture");
             return;
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 6/9] s390x/pci: enable adapter event notification for interpreted devices
  2022-01-14 20:38 ` Matthew Rosato
@ 2022-01-14 20:38   ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

Use the associated vfio feature ioctl to enable adapter event notification
and forwarding for devices when requested.  This feature will be set up
with or without firmware assist based upon the 'intassist' setting.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c          | 24 ++++++++--
 hw/s390x/s390-pci-inst.c         | 54 +++++++++++++++++++++-
 hw/s390x/s390-pci-vfio.c         | 79 ++++++++++++++++++++++++++++++++
 include/hw/s390x/s390-pci-bus.h  |  1 +
 include/hw/s390x/s390-pci-vfio.h | 20 ++++++++
 5 files changed, 173 insertions(+), 5 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 66649af6e0..6ee70446ca 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -189,7 +189,10 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
         rc = SCLP_RC_NO_ACTION_REQUIRED;
         break;
     default:
-        if (pbdev->summary_ind) {
+        if (pbdev->interp) {
+            /* Interpreted devices were using interrupt forwarding */
+            s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
+        } else if (pbdev->summary_ind) {
             pci_dereg_irqs(pbdev);
         }
         if (pbdev->iommu->enabled) {
@@ -981,6 +984,11 @@ static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)
         return rc;
     }
 
+    rc = s390_pci_probe_aif(pbdev);
+    if (rc) {
+        return rc;
+    }
+
     rc = s390_pci_update_passthrough_fh(pbdev);
     if (rc) {
         return rc;
@@ -1076,6 +1084,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             if (pbdev->interp && !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
                     DPRINTF("zPCI interpretation facilities missing.\n");
                     pbdev->interp = false;
+                    pbdev->intassist = false;
                 }
             if (pbdev->interp) {
                 rc = s390_pci_interp_plug(s, pbdev);
@@ -1090,11 +1099,13 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             if (!pbdev->interp) {
                 /* Do vfio passthrough but intercept for I/O */
                 pbdev->fh |= FH_SHM_VFIO;
+                pbdev->intassist = false;
             }
         } else {
             pbdev->fh |= FH_SHM_EMUL;
             /* Always intercept emulated devices */
             pbdev->interp = false;
+            pbdev->intassist = false;
         }
 
         if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
@@ -1244,7 +1255,10 @@ static void s390_pcihost_reset(DeviceState *dev)
     /* Process all pending unplug requests */
     QTAILQ_FOREACH_SAFE(pbdev, &s->zpci_devs, link, next) {
         if (pbdev->unplug_requested) {
-            if (pbdev->summary_ind) {
+            if (pbdev->interp) {
+                /* Interpreted devices were using interrupt forwarding */
+                s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
+            } else if (pbdev->summary_ind) {
                 pci_dereg_irqs(pbdev);
             }
             if (pbdev->iommu->enabled) {
@@ -1382,7 +1396,10 @@ static void s390_pci_device_reset(DeviceState *dev)
         break;
     }
 
-    if (pbdev->summary_ind) {
+    if (pbdev->interp) {
+        /* Interpreted devices were using interrupt forwarding */
+        s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
+    } else if (pbdev->summary_ind) {
         pci_dereg_irqs(pbdev);
     }
     if (pbdev->iommu->enabled) {
@@ -1428,6 +1445,7 @@ static Property s390_pci_device_properties[] = {
     DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
     DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
     DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
+    DEFINE_PROP_BOOL("intassist", S390PCIBusDevice, intassist, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index e9a0dc12e4..121e07cc41 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -1111,6 +1111,46 @@ static void fmb_update(void *opaque)
     timer_mod(pbdev->fmb_timer, t + pbdev->pci_group->zpci_group.mui);
 }
 
+static int mpcifc_reg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
+{
+    int rc;
+
+    /* Interpreted devices must also use interrupt forwarding */
+    rc = s390_pci_get_aif(pbdev, false, pbdev->intassist);
+    if (rc) {
+        DPRINTF("Bad interrupt forwarding state\n");
+        return rc;
+    }
+
+    rc = s390_pci_set_aif(pbdev, fib, true, pbdev->intassist);
+    if (rc) {
+        DPRINTF("Failed to enable interrupt forwarding\n");
+        return rc;
+    }
+
+    return 0;
+}
+
+static int mpcifc_dereg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
+{
+    int rc;
+
+    /* Interpreted devices were using interrupt forwarding */
+    rc = s390_pci_get_aif(pbdev, true, pbdev->intassist);
+    if (rc) {
+        DPRINTF("Bad interrupt forwarding state\n");
+        return rc;
+    }
+
+    rc = s390_pci_set_aif(pbdev, fib, false, pbdev->intassist);
+    if (rc) {
+        DPRINTF("Failed to disable interrupt forwarding\n");
+        return rc;
+    }
+
+    return 0;
+}
+
 int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
                         uintptr_t ra)
 {
@@ -1165,7 +1205,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
 
     switch (oc) {
     case ZPCI_MOD_FC_REG_INT:
-        if (pbdev->summary_ind) {
+        if (pbdev->interp) {
+            if (mpcifc_reg_int_interp(pbdev, &fib)) {
+                cc = ZPCI_PCI_LS_ERR;
+                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
+            }
+        } else if (pbdev->summary_ind) {
             cc = ZPCI_PCI_LS_ERR;
             s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
         } else if (reg_irqs(env, pbdev, fib)) {
@@ -1174,7 +1219,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
         }
         break;
     case ZPCI_MOD_FC_DEREG_INT:
-        if (!pbdev->summary_ind) {
+        if (pbdev->interp) {
+            if (mpcifc_dereg_int_interp(pbdev, &fib)) {
+                cc = ZPCI_PCI_LS_ERR;
+                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
+            }
+        } else if (!pbdev->summary_ind) {
             cc = ZPCI_PCI_LS_ERR;
             s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
         } else {
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 2cab3a9e89..73f3b3ed19 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -149,6 +149,85 @@ int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
     return 0;
 }
 
+int s390_pci_probe_aif(S390PCIBusDevice *pbdev)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_feature feat = {
+        .argsz = sizeof(struct vfio_device_feature),
+        .flags = VFIO_DEVICE_FEATURE_PROBE + VFIO_DEVICE_FEATURE_ZPCI_AIF
+    };
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat);
+}
+
+int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
+                     bool assist)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_zpci_aif *data;
+    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
+    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
+
+    feat->argsz = size;
+    feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_AIF;
+
+    data = (struct vfio_device_zpci_aif *)&feat->data;
+    if (enable) {
+        data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
+        if (!pbdev->intassist) {
+            data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
+        }
+        /* Fill in the guest fib info */
+        data->ibv = fib->aibv;
+        data->sb = fib->aisb;
+        data->noi = FIB_DATA_NOI(fib->data);
+        data->isc = FIB_DATA_ISC(fib->data);
+        data->sbo = FIB_DATA_AISBO(fib->data);
+    } else {
+        data->flags = 0;
+    }
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+}
+
+int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_zpci_aif *data;
+    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
+    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
+    int rc;
+
+    feat->argsz = size;
+    feat->flags = VFIO_DEVICE_FEATURE_GET + VFIO_DEVICE_FEATURE_ZPCI_AIF;
+
+    rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+    if (rc) {
+        return rc;
+    }
+
+    /* Determine if current interrupt settings match the host */
+    data = (struct vfio_device_zpci_aif *)&feat->data;
+    if (enable && (!(data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT))) {
+        rc = -EINVAL;
+    } else if (!enable && (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT)) {
+        rc = -EINVAL;
+    }
+
+    /*
+     * When enabled for interrupts, the assist and forced host-delivery are
+     * mututally exclusive
+     */
+    if (enable && assist && (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST)) {
+        rc = -EINVAL;
+    } else if (enable && (!assist) && (!(data->flags &
+                                         VFIO_DEVICE_ZPCI_FLAG_AIF_HOST))) {
+        rc = -EINVAL;
+    }
+
+    return rc;
+}
+
 static void s390_pci_read_base(S390PCIBusDevice *pbdev,
                                struct vfio_device_info *info)
 {
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index a9843dfe97..9941ca0084 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -351,6 +351,7 @@ struct S390PCIBusDevice {
     bool pci_unplug_request_processed;
     bool unplug_requested;
     bool interp;
+    bool intassist;
     QTAILQ_ENTRY(S390PCIBusDevice) link;
 };
 
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index 42533e38f7..6cec38a863 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -13,6 +13,7 @@
 #define HW_S390_PCI_VFIO_H
 
 #include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/s390-pci-inst.h"
 #include CONFIG_DEVICES
 
 #ifdef CONFIG_VFIO
@@ -23,6 +24,11 @@ void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
 int s390_pci_probe_interp(S390PCIBusDevice *pbdev);
 int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable);
 int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev);
+int s390_pci_probe_aif(S390PCIBusDevice *pbdev);
+int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
+                     bool assist);
+int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist);
+
 void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
 #else
 static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
@@ -48,6 +54,20 @@ static inline int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
 {
     return -EINVAL;
 }
+static inline int s390_pci_probe_aif(S390PCIBusDevice *pbdev)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib,
+                                   bool enable, bool assist)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable,
+                                   bool assist)
+{
+    return -EINVAL;
+}
 static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
 #endif
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 6/9] s390x/pci: enable adapter event notification for interpreted devices
@ 2022-01-14 20:38   ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

Use the associated vfio feature ioctl to enable adapter event notification
and forwarding for devices when requested.  This feature will be set up
with or without firmware assist based upon the 'intassist' setting.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c          | 24 ++++++++--
 hw/s390x/s390-pci-inst.c         | 54 +++++++++++++++++++++-
 hw/s390x/s390-pci-vfio.c         | 79 ++++++++++++++++++++++++++++++++
 include/hw/s390x/s390-pci-bus.h  |  1 +
 include/hw/s390x/s390-pci-vfio.h | 20 ++++++++
 5 files changed, 173 insertions(+), 5 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 66649af6e0..6ee70446ca 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -189,7 +189,10 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
         rc = SCLP_RC_NO_ACTION_REQUIRED;
         break;
     default:
-        if (pbdev->summary_ind) {
+        if (pbdev->interp) {
+            /* Interpreted devices were using interrupt forwarding */
+            s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
+        } else if (pbdev->summary_ind) {
             pci_dereg_irqs(pbdev);
         }
         if (pbdev->iommu->enabled) {
@@ -981,6 +984,11 @@ static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)
         return rc;
     }
 
+    rc = s390_pci_probe_aif(pbdev);
+    if (rc) {
+        return rc;
+    }
+
     rc = s390_pci_update_passthrough_fh(pbdev);
     if (rc) {
         return rc;
@@ -1076,6 +1084,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             if (pbdev->interp && !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
                     DPRINTF("zPCI interpretation facilities missing.\n");
                     pbdev->interp = false;
+                    pbdev->intassist = false;
                 }
             if (pbdev->interp) {
                 rc = s390_pci_interp_plug(s, pbdev);
@@ -1090,11 +1099,13 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
             if (!pbdev->interp) {
                 /* Do vfio passthrough but intercept for I/O */
                 pbdev->fh |= FH_SHM_VFIO;
+                pbdev->intassist = false;
             }
         } else {
             pbdev->fh |= FH_SHM_EMUL;
             /* Always intercept emulated devices */
             pbdev->interp = false;
+            pbdev->intassist = false;
         }
 
         if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
@@ -1244,7 +1255,10 @@ static void s390_pcihost_reset(DeviceState *dev)
     /* Process all pending unplug requests */
     QTAILQ_FOREACH_SAFE(pbdev, &s->zpci_devs, link, next) {
         if (pbdev->unplug_requested) {
-            if (pbdev->summary_ind) {
+            if (pbdev->interp) {
+                /* Interpreted devices were using interrupt forwarding */
+                s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
+            } else if (pbdev->summary_ind) {
                 pci_dereg_irqs(pbdev);
             }
             if (pbdev->iommu->enabled) {
@@ -1382,7 +1396,10 @@ static void s390_pci_device_reset(DeviceState *dev)
         break;
     }
 
-    if (pbdev->summary_ind) {
+    if (pbdev->interp) {
+        /* Interpreted devices were using interrupt forwarding */
+        s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
+    } else if (pbdev->summary_ind) {
         pci_dereg_irqs(pbdev);
     }
     if (pbdev->iommu->enabled) {
@@ -1428,6 +1445,7 @@ static Property s390_pci_device_properties[] = {
     DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
     DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
     DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
+    DEFINE_PROP_BOOL("intassist", S390PCIBusDevice, intassist, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index e9a0dc12e4..121e07cc41 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -1111,6 +1111,46 @@ static void fmb_update(void *opaque)
     timer_mod(pbdev->fmb_timer, t + pbdev->pci_group->zpci_group.mui);
 }
 
+static int mpcifc_reg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
+{
+    int rc;
+
+    /* Interpreted devices must also use interrupt forwarding */
+    rc = s390_pci_get_aif(pbdev, false, pbdev->intassist);
+    if (rc) {
+        DPRINTF("Bad interrupt forwarding state\n");
+        return rc;
+    }
+
+    rc = s390_pci_set_aif(pbdev, fib, true, pbdev->intassist);
+    if (rc) {
+        DPRINTF("Failed to enable interrupt forwarding\n");
+        return rc;
+    }
+
+    return 0;
+}
+
+static int mpcifc_dereg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
+{
+    int rc;
+
+    /* Interpreted devices were using interrupt forwarding */
+    rc = s390_pci_get_aif(pbdev, true, pbdev->intassist);
+    if (rc) {
+        DPRINTF("Bad interrupt forwarding state\n");
+        return rc;
+    }
+
+    rc = s390_pci_set_aif(pbdev, fib, false, pbdev->intassist);
+    if (rc) {
+        DPRINTF("Failed to disable interrupt forwarding\n");
+        return rc;
+    }
+
+    return 0;
+}
+
 int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
                         uintptr_t ra)
 {
@@ -1165,7 +1205,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
 
     switch (oc) {
     case ZPCI_MOD_FC_REG_INT:
-        if (pbdev->summary_ind) {
+        if (pbdev->interp) {
+            if (mpcifc_reg_int_interp(pbdev, &fib)) {
+                cc = ZPCI_PCI_LS_ERR;
+                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
+            }
+        } else if (pbdev->summary_ind) {
             cc = ZPCI_PCI_LS_ERR;
             s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
         } else if (reg_irqs(env, pbdev, fib)) {
@@ -1174,7 +1219,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
         }
         break;
     case ZPCI_MOD_FC_DEREG_INT:
-        if (!pbdev->summary_ind) {
+        if (pbdev->interp) {
+            if (mpcifc_dereg_int_interp(pbdev, &fib)) {
+                cc = ZPCI_PCI_LS_ERR;
+                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
+            }
+        } else if (!pbdev->summary_ind) {
             cc = ZPCI_PCI_LS_ERR;
             s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
         } else {
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 2cab3a9e89..73f3b3ed19 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -149,6 +149,85 @@ int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
     return 0;
 }
 
+int s390_pci_probe_aif(S390PCIBusDevice *pbdev)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_feature feat = {
+        .argsz = sizeof(struct vfio_device_feature),
+        .flags = VFIO_DEVICE_FEATURE_PROBE + VFIO_DEVICE_FEATURE_ZPCI_AIF
+    };
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat);
+}
+
+int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
+                     bool assist)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_zpci_aif *data;
+    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
+    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
+
+    feat->argsz = size;
+    feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_AIF;
+
+    data = (struct vfio_device_zpci_aif *)&feat->data;
+    if (enable) {
+        data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
+        if (!pbdev->intassist) {
+            data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
+        }
+        /* Fill in the guest fib info */
+        data->ibv = fib->aibv;
+        data->sb = fib->aisb;
+        data->noi = FIB_DATA_NOI(fib->data);
+        data->isc = FIB_DATA_ISC(fib->data);
+        data->sbo = FIB_DATA_AISBO(fib->data);
+    } else {
+        data->flags = 0;
+    }
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+}
+
+int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_zpci_aif *data;
+    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
+    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
+    int rc;
+
+    feat->argsz = size;
+    feat->flags = VFIO_DEVICE_FEATURE_GET + VFIO_DEVICE_FEATURE_ZPCI_AIF;
+
+    rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+    if (rc) {
+        return rc;
+    }
+
+    /* Determine if current interrupt settings match the host */
+    data = (struct vfio_device_zpci_aif *)&feat->data;
+    if (enable && (!(data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT))) {
+        rc = -EINVAL;
+    } else if (!enable && (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT)) {
+        rc = -EINVAL;
+    }
+
+    /*
+     * When enabled for interrupts, the assist and forced host-delivery are
+     * mututally exclusive
+     */
+    if (enable && assist && (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST)) {
+        rc = -EINVAL;
+    } else if (enable && (!assist) && (!(data->flags &
+                                         VFIO_DEVICE_ZPCI_FLAG_AIF_HOST))) {
+        rc = -EINVAL;
+    }
+
+    return rc;
+}
+
 static void s390_pci_read_base(S390PCIBusDevice *pbdev,
                                struct vfio_device_info *info)
 {
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index a9843dfe97..9941ca0084 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -351,6 +351,7 @@ struct S390PCIBusDevice {
     bool pci_unplug_request_processed;
     bool unplug_requested;
     bool interp;
+    bool intassist;
     QTAILQ_ENTRY(S390PCIBusDevice) link;
 };
 
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index 42533e38f7..6cec38a863 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -13,6 +13,7 @@
 #define HW_S390_PCI_VFIO_H
 
 #include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/s390-pci-inst.h"
 #include CONFIG_DEVICES
 
 #ifdef CONFIG_VFIO
@@ -23,6 +24,11 @@ void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
 int s390_pci_probe_interp(S390PCIBusDevice *pbdev);
 int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable);
 int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev);
+int s390_pci_probe_aif(S390PCIBusDevice *pbdev);
+int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
+                     bool assist);
+int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist);
+
 void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
 #else
 static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
@@ -48,6 +54,20 @@ static inline int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
 {
     return -EINVAL;
 }
+static inline int s390_pci_probe_aif(S390PCIBusDevice *pbdev)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib,
+                                   bool enable, bool assist)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable,
+                                   bool assist)
+{
+    return -EINVAL;
+}
 static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
 #endif
 
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 7/9] s390x/pci: use I/O Address Translation assist when interpreting
  2022-01-14 20:38 ` Matthew Rosato
@ 2022-01-14 20:38   ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

Allow the underlying kvm host to handle the Refresh PCI Translation
instruction intercepts.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c          |  6 ++--
 hw/s390x/s390-pci-inst.c         | 51 ++++++++++++++++++++++++++++++--
 hw/s390x/s390-pci-vfio.c         | 27 +++++++++++++++++
 include/hw/s390x/s390-pci-inst.h |  2 +-
 include/hw/s390x/s390-pci-vfio.h | 10 +++++++
 5 files changed, 89 insertions(+), 7 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 6ee70446ca..49ae2fd0ea 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -196,7 +196,7 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
             pci_dereg_irqs(pbdev);
         }
         if (pbdev->iommu->enabled) {
-            pci_dereg_ioat(pbdev->iommu);
+            pci_dereg_ioat(pbdev);
         }
         pbdev->state = ZPCI_FS_STANDBY;
         rc = SCLP_RC_NORMAL_COMPLETION;
@@ -1262,7 +1262,7 @@ static void s390_pcihost_reset(DeviceState *dev)
                 pci_dereg_irqs(pbdev);
             }
             if (pbdev->iommu->enabled) {
-                pci_dereg_ioat(pbdev->iommu);
+                pci_dereg_ioat(pbdev);
             }
             pbdev->state = ZPCI_FS_STANDBY;
             s390_pci_perform_unplug(pbdev);
@@ -1403,7 +1403,7 @@ static void s390_pci_device_reset(DeviceState *dev)
         pci_dereg_irqs(pbdev);
     }
     if (pbdev->iommu->enabled) {
-        pci_dereg_ioat(pbdev->iommu);
+        pci_dereg_ioat(pbdev);
     }
 
     fmb_timer_free(pbdev);
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 121e07cc41..0fa18866c0 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -978,6 +978,24 @@ int pci_dereg_irqs(S390PCIBusDevice *pbdev)
     return 0;
 }
 
+static int reg_ioat_interp(S390PCIBusDevice *pbdev, uint64_t iota)
+{
+    int rc;
+
+    rc = s390_pci_probe_ioat(pbdev);
+    if (rc) {
+        return rc;
+    }
+
+    rc = s390_pci_set_ioat(pbdev, iota);
+    if (rc) {
+        return rc;
+    }
+
+    pbdev->iommu->enabled = true;
+    return 0;
+}
+
 static int reg_ioat(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib,
                     uintptr_t ra)
 {
@@ -995,6 +1013,16 @@ static int reg_ioat(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib,
         return -EINVAL;
     }
 
+    /* If this is an interpreted device, we must use the IOAT assist */
+    if (pbdev->interp) {
+        if (reg_ioat_interp(pbdev, g_iota)) {
+            error_report("failure starting ioat assist");
+            s390_program_interrupt(env, PGM_OPERAND, ra);
+            return -EINVAL;
+        }
+        return 0;
+    }
+
     /* currently we only support designation type 1 with translation */
     if (!(dt == ZPCI_IOTA_RTTO && t)) {
         error_report("unsupported ioat dt %d t %d", dt, t);
@@ -1011,8 +1039,25 @@ static int reg_ioat(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib,
     return 0;
 }
 
-void pci_dereg_ioat(S390PCIIOMMU *iommu)
+static void dereg_ioat_interp(S390PCIBusDevice *pbdev)
 {
+    if (s390_pci_probe_ioat(pbdev) != 0) {
+        return;
+    }
+
+    s390_pci_set_ioat(pbdev, 0);
+    pbdev->iommu->enabled = false;
+}
+
+void pci_dereg_ioat(S390PCIBusDevice *pbdev)
+{
+    S390PCIIOMMU *iommu = pbdev->iommu;
+
+    if (pbdev->interp) {
+        dereg_ioat_interp(pbdev);
+        return;
+    }
+
     s390_pci_iommu_disable(iommu);
     iommu->pba = 0;
     iommu->pal = 0;
@@ -1251,7 +1296,7 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
             cc = ZPCI_PCI_LS_ERR;
             s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
         } else {
-            pci_dereg_ioat(pbdev->iommu);
+            pci_dereg_ioat(pbdev);
         }
         break;
     case ZPCI_MOD_FC_REREG_IOAT:
@@ -1262,7 +1307,7 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
             cc = ZPCI_PCI_LS_ERR;
             s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
         } else {
-            pci_dereg_ioat(pbdev->iommu);
+            pci_dereg_ioat(pbdev);
             if (reg_ioat(env, pbdev, fib, ra)) {
                 cc = ZPCI_PCI_LS_ERR;
                 s390_set_status_code(env, r1, ZPCI_MOD_ST_INSUF_RES);
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 73f3b3ed19..4e3165bff5 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -228,6 +228,33 @@ int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist)
     return rc;
 }
 
+int s390_pci_probe_ioat(S390PCIBusDevice *pbdev)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_feature feat = {
+        .argsz = sizeof(struct vfio_device_feature),
+        .flags = VFIO_DEVICE_FEATURE_PROBE + VFIO_DEVICE_FEATURE_ZPCI_IOAT
+    };
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat);
+}
+
+int s390_pci_set_ioat(S390PCIBusDevice *pbdev, uint64_t iota)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_zpci_ioat *data;
+    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
+    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
+
+    feat->argsz = size;
+    feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_IOAT;
+
+    data = (struct vfio_device_zpci_ioat *)&feat->data;
+    data->iota = iota;
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+}
+
 static void s390_pci_read_base(S390PCIBusDevice *pbdev,
                                struct vfio_device_info *info)
 {
diff --git a/include/hw/s390x/s390-pci-inst.h b/include/hw/s390x/s390-pci-inst.h
index a55c448aad..13566fb7b4 100644
--- a/include/hw/s390x/s390-pci-inst.h
+++ b/include/hw/s390x/s390-pci-inst.h
@@ -99,7 +99,7 @@ typedef struct ZpciFib {
 } QEMU_PACKED ZpciFib;
 
 int pci_dereg_irqs(S390PCIBusDevice *pbdev);
-void pci_dereg_ioat(S390PCIIOMMU *iommu);
+void pci_dereg_ioat(S390PCIBusDevice *pbdev);
 int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra);
 int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra);
 int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra);
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index 6cec38a863..e7a2d8ff77 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -28,6 +28,8 @@ int s390_pci_probe_aif(S390PCIBusDevice *pbdev);
 int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
                      bool assist);
 int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist);
+int s390_pci_probe_ioat(S390PCIBusDevice *pbdev);
+int s390_pci_set_ioat(S390PCIBusDevice *pbdev, uint64_t iota);
 
 void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
 #else
@@ -68,6 +70,14 @@ static inline int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable,
 {
     return -EINVAL;
 }
+static inline int s390_pci_probe_ioat(S390PCIBusDevice *pbdev)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_set_ioat(S390PCIBusDevice *pbdev, uint64_t iota)
+{
+    return -EINVAL;
+}
 static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
 #endif
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 7/9] s390x/pci: use I/O Address Translation assist when interpreting
@ 2022-01-14 20:38   ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

Allow the underlying kvm host to handle the Refresh PCI Translation
instruction intercepts.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c          |  6 ++--
 hw/s390x/s390-pci-inst.c         | 51 ++++++++++++++++++++++++++++++--
 hw/s390x/s390-pci-vfio.c         | 27 +++++++++++++++++
 include/hw/s390x/s390-pci-inst.h |  2 +-
 include/hw/s390x/s390-pci-vfio.h | 10 +++++++
 5 files changed, 89 insertions(+), 7 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 6ee70446ca..49ae2fd0ea 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -196,7 +196,7 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
             pci_dereg_irqs(pbdev);
         }
         if (pbdev->iommu->enabled) {
-            pci_dereg_ioat(pbdev->iommu);
+            pci_dereg_ioat(pbdev);
         }
         pbdev->state = ZPCI_FS_STANDBY;
         rc = SCLP_RC_NORMAL_COMPLETION;
@@ -1262,7 +1262,7 @@ static void s390_pcihost_reset(DeviceState *dev)
                 pci_dereg_irqs(pbdev);
             }
             if (pbdev->iommu->enabled) {
-                pci_dereg_ioat(pbdev->iommu);
+                pci_dereg_ioat(pbdev);
             }
             pbdev->state = ZPCI_FS_STANDBY;
             s390_pci_perform_unplug(pbdev);
@@ -1403,7 +1403,7 @@ static void s390_pci_device_reset(DeviceState *dev)
         pci_dereg_irqs(pbdev);
     }
     if (pbdev->iommu->enabled) {
-        pci_dereg_ioat(pbdev->iommu);
+        pci_dereg_ioat(pbdev);
     }
 
     fmb_timer_free(pbdev);
diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
index 121e07cc41..0fa18866c0 100644
--- a/hw/s390x/s390-pci-inst.c
+++ b/hw/s390x/s390-pci-inst.c
@@ -978,6 +978,24 @@ int pci_dereg_irqs(S390PCIBusDevice *pbdev)
     return 0;
 }
 
+static int reg_ioat_interp(S390PCIBusDevice *pbdev, uint64_t iota)
+{
+    int rc;
+
+    rc = s390_pci_probe_ioat(pbdev);
+    if (rc) {
+        return rc;
+    }
+
+    rc = s390_pci_set_ioat(pbdev, iota);
+    if (rc) {
+        return rc;
+    }
+
+    pbdev->iommu->enabled = true;
+    return 0;
+}
+
 static int reg_ioat(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib,
                     uintptr_t ra)
 {
@@ -995,6 +1013,16 @@ static int reg_ioat(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib,
         return -EINVAL;
     }
 
+    /* If this is an interpreted device, we must use the IOAT assist */
+    if (pbdev->interp) {
+        if (reg_ioat_interp(pbdev, g_iota)) {
+            error_report("failure starting ioat assist");
+            s390_program_interrupt(env, PGM_OPERAND, ra);
+            return -EINVAL;
+        }
+        return 0;
+    }
+
     /* currently we only support designation type 1 with translation */
     if (!(dt == ZPCI_IOTA_RTTO && t)) {
         error_report("unsupported ioat dt %d t %d", dt, t);
@@ -1011,8 +1039,25 @@ static int reg_ioat(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib,
     return 0;
 }
 
-void pci_dereg_ioat(S390PCIIOMMU *iommu)
+static void dereg_ioat_interp(S390PCIBusDevice *pbdev)
 {
+    if (s390_pci_probe_ioat(pbdev) != 0) {
+        return;
+    }
+
+    s390_pci_set_ioat(pbdev, 0);
+    pbdev->iommu->enabled = false;
+}
+
+void pci_dereg_ioat(S390PCIBusDevice *pbdev)
+{
+    S390PCIIOMMU *iommu = pbdev->iommu;
+
+    if (pbdev->interp) {
+        dereg_ioat_interp(pbdev);
+        return;
+    }
+
     s390_pci_iommu_disable(iommu);
     iommu->pba = 0;
     iommu->pal = 0;
@@ -1251,7 +1296,7 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
             cc = ZPCI_PCI_LS_ERR;
             s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
         } else {
-            pci_dereg_ioat(pbdev->iommu);
+            pci_dereg_ioat(pbdev);
         }
         break;
     case ZPCI_MOD_FC_REREG_IOAT:
@@ -1262,7 +1307,7 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
             cc = ZPCI_PCI_LS_ERR;
             s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
         } else {
-            pci_dereg_ioat(pbdev->iommu);
+            pci_dereg_ioat(pbdev);
             if (reg_ioat(env, pbdev, fib, ra)) {
                 cc = ZPCI_PCI_LS_ERR;
                 s390_set_status_code(env, r1, ZPCI_MOD_ST_INSUF_RES);
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 73f3b3ed19..4e3165bff5 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -228,6 +228,33 @@ int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist)
     return rc;
 }
 
+int s390_pci_probe_ioat(S390PCIBusDevice *pbdev)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_feature feat = {
+        .argsz = sizeof(struct vfio_device_feature),
+        .flags = VFIO_DEVICE_FEATURE_PROBE + VFIO_DEVICE_FEATURE_ZPCI_IOAT
+    };
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat);
+}
+
+int s390_pci_set_ioat(S390PCIBusDevice *pbdev, uint64_t iota)
+{
+    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
+    struct vfio_device_zpci_ioat *data;
+    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
+    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
+
+    feat->argsz = size;
+    feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_IOAT;
+
+    data = (struct vfio_device_zpci_ioat *)&feat->data;
+    data->iota = iota;
+
+    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
+}
+
 static void s390_pci_read_base(S390PCIBusDevice *pbdev,
                                struct vfio_device_info *info)
 {
diff --git a/include/hw/s390x/s390-pci-inst.h b/include/hw/s390x/s390-pci-inst.h
index a55c448aad..13566fb7b4 100644
--- a/include/hw/s390x/s390-pci-inst.h
+++ b/include/hw/s390x/s390-pci-inst.h
@@ -99,7 +99,7 @@ typedef struct ZpciFib {
 } QEMU_PACKED ZpciFib;
 
 int pci_dereg_irqs(S390PCIBusDevice *pbdev);
-void pci_dereg_ioat(S390PCIIOMMU *iommu);
+void pci_dereg_ioat(S390PCIBusDevice *pbdev);
 int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra);
 int pcilg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra);
 int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2, uintptr_t ra);
diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
index 6cec38a863..e7a2d8ff77 100644
--- a/include/hw/s390x/s390-pci-vfio.h
+++ b/include/hw/s390x/s390-pci-vfio.h
@@ -28,6 +28,8 @@ int s390_pci_probe_aif(S390PCIBusDevice *pbdev);
 int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
                      bool assist);
 int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist);
+int s390_pci_probe_ioat(S390PCIBusDevice *pbdev);
+int s390_pci_set_ioat(S390PCIBusDevice *pbdev, uint64_t iota);
 
 void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
 #else
@@ -68,6 +70,14 @@ static inline int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable,
 {
     return -EINVAL;
 }
+static inline int s390_pci_probe_ioat(S390PCIBusDevice *pbdev)
+{
+    return -EINVAL;
+}
+static inline int s390_pci_set_ioat(S390PCIBusDevice *pbdev, uint64_t iota)
+{
+    return -EINVAL;
+}
 static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
 #endif
 
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 8/9] s390x/pci: use dtsm provided from vfio capabilities for interpreted devices
  2022-01-14 20:38 ` Matthew Rosato
@ 2022-01-14 20:38   ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

When using the IOAT assist via interpretation, we should advertise what
the host driver supports, not QEMU.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-vfio.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 4e3165bff5..347cbdfdf8 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -318,7 +318,11 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
         resgrp->i = cap->noi;
         resgrp->maxstbl = cap->maxstbl;
         resgrp->version = cap->version;
-        resgrp->dtsm = ZPCI_DTSM;
+        if (hdr->version >= 2 && pbdev->interp) {
+            resgrp->dtsm = cap->dtsm;
+        } else {
+            resgrp->dtsm = ZPCI_DTSM;
+        }
     }
 }
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 8/9] s390x/pci: use dtsm provided from vfio capabilities for interpreted devices
@ 2022-01-14 20:38   ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

When using the IOAT assist via interpretation, we should advertise what
the host driver supports, not QEMU.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-vfio.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 4e3165bff5..347cbdfdf8 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -318,7 +318,11 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
         resgrp->i = cap->noi;
         resgrp->maxstbl = cap->maxstbl;
         resgrp->version = cap->version;
-        resgrp->dtsm = ZPCI_DTSM;
+        if (hdr->version >= 2 && pbdev->interp) {
+            resgrp->dtsm = cap->dtsm;
+        } else {
+            resgrp->dtsm = ZPCI_DTSM;
+        }
     }
 }
 
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 9/9] s390x/pci: let intercept devices have separate PCI groups
  2022-01-14 20:38 ` Matthew Rosato
@ 2022-01-14 20:38   ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

Let's use the reserved pool of simulated PCI groups to allow intercept
devices to have separate groups from interpreted devices as some group
values may be different. If we run out of simulated PCI groups, subsequent
intercept devices just get the default group.
Furthermore, if we encounter any PCI groups from hostdevs that are marked
as simulated, let's just assign them to the default group to avoid
conflicts between host simulated groups and our own simulated groups.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c         | 19 ++++++++++++++--
 hw/s390x/s390-pci-vfio.c        | 40 ++++++++++++++++++++++++++++++---
 include/hw/s390x/s390-pci-bus.h |  6 ++++-
 3 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 49ae2fd0ea..705a19ddb9 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -747,13 +747,14 @@ static void s390_pci_iommu_free(S390pciState *s, PCIBus *bus, int32_t devfn)
     object_unref(OBJECT(iommu));
 }
 
-S390PCIGroup *s390_group_create(int id)
+S390PCIGroup *s390_group_create(int id, int host_id)
 {
     S390PCIGroup *group;
     S390pciState *s = s390_get_phb();
 
     group = g_new0(S390PCIGroup, 1);
     group->id = id;
+    group->host_id = host_id;
     QTAILQ_INSERT_TAIL(&s->zpci_groups, group, link);
     return group;
 }
@@ -771,12 +772,25 @@ S390PCIGroup *s390_group_find(int id)
     return NULL;
 }
 
+S390PCIGroup *s390_group_find_host_sim(int host_id)
+{
+    S390PCIGroup *group;
+    S390pciState *s = s390_get_phb();
+
+    QTAILQ_FOREACH(group, &s->zpci_groups, link) {
+        if (group->id >= ZPCI_SIM_GRP_START && group->host_id == host_id) {
+            return group;
+        }
+    }
+    return NULL;
+}
+
 static void s390_pci_init_default_group(void)
 {
     S390PCIGroup *group;
     ClpRspQueryPciGrp *resgrp;
 
-    group = s390_group_create(ZPCI_DEFAULT_FN_GRP);
+    group = s390_group_create(ZPCI_DEFAULT_FN_GRP, ZPCI_DEFAULT_FN_GRP);
     resgrp = &group->zpci_group;
     resgrp->fr = 1;
     resgrp->dasm = 0;
@@ -824,6 +838,7 @@ static void s390_pcihost_realize(DeviceState *dev, Error **errp)
                                            NULL, g_free);
     s->zpci_table = g_hash_table_new_full(g_int_hash, g_int_equal, NULL, NULL);
     s->bus_no = 0;
+    s->next_sim_grp = ZPCI_SIM_GRP_START;
     QTAILQ_INIT(&s->pending_sei);
     QTAILQ_INIT(&s->zpci_devs);
     QTAILQ_INIT(&s->zpci_dma_limit);
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 347cbdfdf8..e7e6eca60c 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -287,13 +287,17 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
 {
     struct vfio_info_cap_header *hdr;
     struct vfio_device_info_cap_zpci_group *cap;
+    S390pciState *s = s390_get_phb();
     ClpRspQueryPciGrp *resgrp;
     VFIOPCIDevice *vpci =  container_of(pbdev->pdev, VFIOPCIDevice, pdev);
 
     hdr = vfio_get_device_info_cap(info, VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
 
-    /* If capability not provided, just use the default group */
-    if (hdr == NULL) {
+    /*
+     * If capability not provided or the underlying hostdev is simulated, just
+     * use the default group.
+     */
+    if (hdr == NULL || pbdev->zpci_fn.pfgid >= ZPCI_SIM_GRP_START) {
         trace_s390_pci_clp_cap(vpci->vbasedev.name,
                                VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
         pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
@@ -302,11 +306,41 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
     }
     cap = (void *) hdr;
 
+    /*
+     * For an intercept device, let's use an existing simulated group if one
+     * one was already created for other intercept devices in this group.
+     * If not, create a new simulated group if any are still available.
+     * If all else fails, just fall back on the default group.
+     */
+    if (!pbdev->interp) {
+        pbdev->pci_group = s390_group_find_host_sim(pbdev->zpci_fn.pfgid);
+        if (pbdev->pci_group) {
+            /* Use existing simulated group */
+            pbdev->zpci_fn.pfgid = pbdev->pci_group->id;
+            return;
+        } else {
+            if (s->next_sim_grp == ZPCI_DEFAULT_FN_GRP) {
+                /* All out of simulated groups, use default */
+                trace_s390_pci_clp_cap(vpci->vbasedev.name,
+                                       VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
+                pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
+                pbdev->pci_group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
+                return;
+            } else {
+                /* We can assign a new simulated group */
+                pbdev->zpci_fn.pfgid = s->next_sim_grp;
+                s->next_sim_grp++;
+                /* Fall through to create the new sim group using CLP info */
+            }
+        }
+    }
+
     /* See if the PCI group is already defined, create if not */
     pbdev->pci_group = s390_group_find(pbdev->zpci_fn.pfgid);
 
     if (!pbdev->pci_group) {
-        pbdev->pci_group = s390_group_create(pbdev->zpci_fn.pfgid);
+        pbdev->pci_group = s390_group_create(pbdev->zpci_fn.pfgid,
+                                             pbdev->zpci_fn.pfgid);
 
         resgrp = &pbdev->pci_group->zpci_group;
         if (cap->flags & VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH) {
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index 9941ca0084..8664023d5d 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -315,13 +315,16 @@ typedef struct ZpciFmb {
 QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
 
 #define ZPCI_DEFAULT_FN_GRP 0xFF
+#define ZPCI_SIM_GRP_START 0xF0
 typedef struct S390PCIGroup {
     ClpRspQueryPciGrp zpci_group;
     int id;
+    int host_id;
     QTAILQ_ENTRY(S390PCIGroup) link;
 } S390PCIGroup;
-S390PCIGroup *s390_group_create(int id);
+S390PCIGroup *s390_group_create(int id, int host_id);
 S390PCIGroup *s390_group_find(int id);
+S390PCIGroup *s390_group_find_host_sim(int host_id);
 
 struct S390PCIBusDevice {
     DeviceState qdev;
@@ -370,6 +373,7 @@ struct S390pciState {
     QTAILQ_HEAD(, S390PCIBusDevice) zpci_devs;
     QTAILQ_HEAD(, S390PCIDMACount) zpci_dma_limit;
     QTAILQ_HEAD(, S390PCIGroup) zpci_groups;
+    uint8_t next_sim_grp;
 };
 
 S390pciState *s390_get_phb(void);
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v2 9/9] s390x/pci: let intercept devices have separate PCI groups
@ 2022-01-14 20:38   ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:38 UTC (permalink / raw)
  To: qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

Let's use the reserved pool of simulated PCI groups to allow intercept
devices to have separate groups from interpreted devices as some group
values may be different. If we run out of simulated PCI groups, subsequent
intercept devices just get the default group.
Furthermore, if we encounter any PCI groups from hostdevs that are marked
as simulated, let's just assign them to the default group to avoid
conflicts between host simulated groups and our own simulated groups.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 hw/s390x/s390-pci-bus.c         | 19 ++++++++++++++--
 hw/s390x/s390-pci-vfio.c        | 40 ++++++++++++++++++++++++++++++---
 include/hw/s390x/s390-pci-bus.h |  6 ++++-
 3 files changed, 59 insertions(+), 6 deletions(-)

diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
index 49ae2fd0ea..705a19ddb9 100644
--- a/hw/s390x/s390-pci-bus.c
+++ b/hw/s390x/s390-pci-bus.c
@@ -747,13 +747,14 @@ static void s390_pci_iommu_free(S390pciState *s, PCIBus *bus, int32_t devfn)
     object_unref(OBJECT(iommu));
 }
 
-S390PCIGroup *s390_group_create(int id)
+S390PCIGroup *s390_group_create(int id, int host_id)
 {
     S390PCIGroup *group;
     S390pciState *s = s390_get_phb();
 
     group = g_new0(S390PCIGroup, 1);
     group->id = id;
+    group->host_id = host_id;
     QTAILQ_INSERT_TAIL(&s->zpci_groups, group, link);
     return group;
 }
@@ -771,12 +772,25 @@ S390PCIGroup *s390_group_find(int id)
     return NULL;
 }
 
+S390PCIGroup *s390_group_find_host_sim(int host_id)
+{
+    S390PCIGroup *group;
+    S390pciState *s = s390_get_phb();
+
+    QTAILQ_FOREACH(group, &s->zpci_groups, link) {
+        if (group->id >= ZPCI_SIM_GRP_START && group->host_id == host_id) {
+            return group;
+        }
+    }
+    return NULL;
+}
+
 static void s390_pci_init_default_group(void)
 {
     S390PCIGroup *group;
     ClpRspQueryPciGrp *resgrp;
 
-    group = s390_group_create(ZPCI_DEFAULT_FN_GRP);
+    group = s390_group_create(ZPCI_DEFAULT_FN_GRP, ZPCI_DEFAULT_FN_GRP);
     resgrp = &group->zpci_group;
     resgrp->fr = 1;
     resgrp->dasm = 0;
@@ -824,6 +838,7 @@ static void s390_pcihost_realize(DeviceState *dev, Error **errp)
                                            NULL, g_free);
     s->zpci_table = g_hash_table_new_full(g_int_hash, g_int_equal, NULL, NULL);
     s->bus_no = 0;
+    s->next_sim_grp = ZPCI_SIM_GRP_START;
     QTAILQ_INIT(&s->pending_sei);
     QTAILQ_INIT(&s->zpci_devs);
     QTAILQ_INIT(&s->zpci_dma_limit);
diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
index 347cbdfdf8..e7e6eca60c 100644
--- a/hw/s390x/s390-pci-vfio.c
+++ b/hw/s390x/s390-pci-vfio.c
@@ -287,13 +287,17 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
 {
     struct vfio_info_cap_header *hdr;
     struct vfio_device_info_cap_zpci_group *cap;
+    S390pciState *s = s390_get_phb();
     ClpRspQueryPciGrp *resgrp;
     VFIOPCIDevice *vpci =  container_of(pbdev->pdev, VFIOPCIDevice, pdev);
 
     hdr = vfio_get_device_info_cap(info, VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
 
-    /* If capability not provided, just use the default group */
-    if (hdr == NULL) {
+    /*
+     * If capability not provided or the underlying hostdev is simulated, just
+     * use the default group.
+     */
+    if (hdr == NULL || pbdev->zpci_fn.pfgid >= ZPCI_SIM_GRP_START) {
         trace_s390_pci_clp_cap(vpci->vbasedev.name,
                                VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
         pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
@@ -302,11 +306,41 @@ static void s390_pci_read_group(S390PCIBusDevice *pbdev,
     }
     cap = (void *) hdr;
 
+    /*
+     * For an intercept device, let's use an existing simulated group if one
+     * one was already created for other intercept devices in this group.
+     * If not, create a new simulated group if any are still available.
+     * If all else fails, just fall back on the default group.
+     */
+    if (!pbdev->interp) {
+        pbdev->pci_group = s390_group_find_host_sim(pbdev->zpci_fn.pfgid);
+        if (pbdev->pci_group) {
+            /* Use existing simulated group */
+            pbdev->zpci_fn.pfgid = pbdev->pci_group->id;
+            return;
+        } else {
+            if (s->next_sim_grp == ZPCI_DEFAULT_FN_GRP) {
+                /* All out of simulated groups, use default */
+                trace_s390_pci_clp_cap(vpci->vbasedev.name,
+                                       VFIO_DEVICE_INFO_CAP_ZPCI_GROUP);
+                pbdev->zpci_fn.pfgid = ZPCI_DEFAULT_FN_GRP;
+                pbdev->pci_group = s390_group_find(ZPCI_DEFAULT_FN_GRP);
+                return;
+            } else {
+                /* We can assign a new simulated group */
+                pbdev->zpci_fn.pfgid = s->next_sim_grp;
+                s->next_sim_grp++;
+                /* Fall through to create the new sim group using CLP info */
+            }
+        }
+    }
+
     /* See if the PCI group is already defined, create if not */
     pbdev->pci_group = s390_group_find(pbdev->zpci_fn.pfgid);
 
     if (!pbdev->pci_group) {
-        pbdev->pci_group = s390_group_create(pbdev->zpci_fn.pfgid);
+        pbdev->pci_group = s390_group_create(pbdev->zpci_fn.pfgid,
+                                             pbdev->zpci_fn.pfgid);
 
         resgrp = &pbdev->pci_group->zpci_group;
         if (cap->flags & VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH) {
diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
index 9941ca0084..8664023d5d 100644
--- a/include/hw/s390x/s390-pci-bus.h
+++ b/include/hw/s390x/s390-pci-bus.h
@@ -315,13 +315,16 @@ typedef struct ZpciFmb {
 QEMU_BUILD_BUG_MSG(offsetof(ZpciFmb, fmt0) != 48, "padding in ZpciFmb");
 
 #define ZPCI_DEFAULT_FN_GRP 0xFF
+#define ZPCI_SIM_GRP_START 0xF0
 typedef struct S390PCIGroup {
     ClpRspQueryPciGrp zpci_group;
     int id;
+    int host_id;
     QTAILQ_ENTRY(S390PCIGroup) link;
 } S390PCIGroup;
-S390PCIGroup *s390_group_create(int id);
+S390PCIGroup *s390_group_create(int id, int host_id);
 S390PCIGroup *s390_group_find(int id);
+S390PCIGroup *s390_group_find_host_sim(int host_id);
 
 struct S390PCIBusDevice {
     DeviceState qdev;
@@ -370,6 +373,7 @@ struct S390pciState {
     QTAILQ_HEAD(, S390PCIBusDevice) zpci_devs;
     QTAILQ_HEAD(, S390PCIDMACount) zpci_dma_limit;
     QTAILQ_HEAD(, S390PCIGroup) zpci_groups;
+    uint8_t next_sim_grp;
 };
 
 S390pciState *s390_get_phb(void);
-- 
2.27.0



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 3/9] fixup: force interp off for QEMU machine 6.2 and older
  2022-01-14 20:38   ` Matthew Rosato
@ 2022-01-14 21:04     ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 21:04 UTC (permalink / raw)
  To: qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

On 1/14/22 3:38 PM, Matthew Rosato wrote:
> Double-check I'm doing this right + test.
> 

Argh...  This should have been squashed into the preceding patch 
'target/s390x: add zpci-interp to cpu models'

> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   hw/s390x/s390-virtio-ccw.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index 84e3e63c43..e02fe11b07 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -803,6 +803,7 @@ DEFINE_CCW_MACHINE(7_0, "7.0", true);
>   static void ccw_machine_6_2_instance_options(MachineState *machine)
>   {
>       ccw_machine_7_0_instance_options(machine);
> +    s390_cpudef_featoff_greater(14, 1, S390_FEAT_ZPCI_INTERP);
>   }
>   
>   static void ccw_machine_6_2_class_options(MachineClass *mc)


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 3/9] fixup: force interp off for QEMU machine 6.2 and older
@ 2022-01-14 21:04     ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-14 21:04 UTC (permalink / raw)
  To: qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

On 1/14/22 3:38 PM, Matthew Rosato wrote:
> Double-check I'm doing this right + test.
> 

Argh...  This should have been squashed into the preceding patch 
'target/s390x: add zpci-interp to cpu models'

> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   hw/s390x/s390-virtio-ccw.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index 84e3e63c43..e02fe11b07 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -803,6 +803,7 @@ DEFINE_CCW_MACHINE(7_0, "7.0", true);
>   static void ccw_machine_6_2_instance_options(MachineState *machine)
>   {
>       ccw_machine_7_0_instance_options(machine);
> +    s390_cpudef_featoff_greater(14, 1, S390_FEAT_ZPCI_INTERP);
>   }
>   
>   static void ccw_machine_6_2_class_options(MachineClass *mc)



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
  2022-01-14 20:38   ` Matthew Rosato
@ 2022-01-17 14:51     ` Thomas Huth
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Huth @ 2022-01-17 14:51 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

On 14/01/2022 21.38, Matthew Rosato wrote:
> Use the associated vfio feature ioctl to enable interpretation for devices
> when requested.  As part of this process, we must use the host function
> handle rather than a QEMU-generated one -- this is provided as part of the
> ioctl payload.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   hw/s390x/s390-pci-bus.c          | 70 +++++++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-inst.c         | 63 +++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-vfio.c         | 52 ++++++++++++++++++++++++
>   include/hw/s390x/s390-pci-bus.h  |  1 +
>   include/hw/s390x/s390-pci-vfio.h | 15 +++++++
>   5 files changed, 199 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 01b58ebc70..a39ccfee05 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -971,12 +971,58 @@ static void s390_pci_update_subordinate(PCIDevice *dev, uint32_t nr)
>       }
>   }
>   
> +static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)
> +{
> +    uint32_t idx;
> +    int rc;
> +
> +    rc = s390_pci_probe_interp(pbdev);
> +    if (rc) {
> +        return rc;
> +    }
> +
> +    rc = s390_pci_update_passthrough_fh(pbdev);
> +    if (rc) {
> +        return rc;
> +    }
> +
> +    /*
> +     * The host device is already in an enabled state, but we always present
> +     * the initial device state to the guest as disabled (ZPCI_FS_DISABLED).
> +     * Therefore, mask off the enable bit from the passthrough handle until
> +     * the guest issues a CLP SET PCI FN later to enable the device.
> +     */
> +    pbdev->fh &= ~FH_MASK_ENABLE;
> +
> +    /* Next, see if the idx is already in-use */
> +    idx = pbdev->fh & FH_MASK_INDEX;
> +    if (pbdev->idx != idx) {
> +        if (s390_pci_find_dev_by_idx(s, idx)) {
> +            return -EINVAL;
> +        }
> +        /*
> +         * Update the idx entry with the passed through idx
> +         * If the relinquished idx is lower than next_idx, use it
> +         * to replace next_idx
> +         */
> +        g_hash_table_remove(s->zpci_table, &pbdev->idx);
> +        if (idx < s->next_idx) {
> +            s->next_idx = idx;
> +        }
> +        pbdev->idx = idx;
> +        g_hash_table_insert(s->zpci_table, &pbdev->idx, pbdev);
> +    }
> +
> +    return 0;
> +}
> +
>   static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                 Error **errp)
>   {
>       S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
>       PCIDevice *pdev = NULL;
>       S390PCIBusDevice *pbdev = NULL;
> +    int rc;
>   
>       if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
>           PCIBridge *pb = PCI_BRIDGE(dev);
> @@ -1022,12 +1068,33 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>           set_pbdev_info(pbdev);
>   
>           if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
> -            pbdev->fh |= FH_SHM_VFIO;
> +            /*
> +             * By default, interpretation is always requested; if the available
> +             * facilities indicate it is not available, fallback to the
> +             * intercept model.
> +             */
> +            if (pbdev->interp && !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
> +                    DPRINTF("zPCI interpretation facilities missing.\n");
> +                    pbdev->interp = false;
> +                }

Wrong indentation in the above three lines.

> +            if (pbdev->interp) {
> +                rc = s390_pci_interp_plug(s, pbdev);
> +                if (rc) {
> +                    error_setg(errp, "zpci interp plug failed: %d", rc);

The error message is a little bit scarce for something that might be 
presented to the user - maybe write at least "interpretation" instead of 
"interp" ?

> +                    return;
> +                }
> +            }
>               pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
>               /* Fill in CLP information passed via the vfio region */
>               s390_pci_get_clp_info(pbdev);
> +            if (!pbdev->interp) {
> +                /* Do vfio passthrough but intercept for I/O */
> +                pbdev->fh |= FH_SHM_VFIO;
> +            }
>           } else {
>               pbdev->fh |= FH_SHM_EMUL;
> +            /* Always intercept emulated devices */
> +            pbdev->interp = false;
>           }

  Thomas


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
@ 2022-01-17 14:51     ` Thomas Huth
  0 siblings, 0 replies; 44+ messages in thread
From: Thomas Huth @ 2022-01-17 14:51 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

On 14/01/2022 21.38, Matthew Rosato wrote:
> Use the associated vfio feature ioctl to enable interpretation for devices
> when requested.  As part of this process, we must use the host function
> handle rather than a QEMU-generated one -- this is provided as part of the
> ioctl payload.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   hw/s390x/s390-pci-bus.c          | 70 +++++++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-inst.c         | 63 +++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-vfio.c         | 52 ++++++++++++++++++++++++
>   include/hw/s390x/s390-pci-bus.h  |  1 +
>   include/hw/s390x/s390-pci-vfio.h | 15 +++++++
>   5 files changed, 199 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 01b58ebc70..a39ccfee05 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -971,12 +971,58 @@ static void s390_pci_update_subordinate(PCIDevice *dev, uint32_t nr)
>       }
>   }
>   
> +static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)
> +{
> +    uint32_t idx;
> +    int rc;
> +
> +    rc = s390_pci_probe_interp(pbdev);
> +    if (rc) {
> +        return rc;
> +    }
> +
> +    rc = s390_pci_update_passthrough_fh(pbdev);
> +    if (rc) {
> +        return rc;
> +    }
> +
> +    /*
> +     * The host device is already in an enabled state, but we always present
> +     * the initial device state to the guest as disabled (ZPCI_FS_DISABLED).
> +     * Therefore, mask off the enable bit from the passthrough handle until
> +     * the guest issues a CLP SET PCI FN later to enable the device.
> +     */
> +    pbdev->fh &= ~FH_MASK_ENABLE;
> +
> +    /* Next, see if the idx is already in-use */
> +    idx = pbdev->fh & FH_MASK_INDEX;
> +    if (pbdev->idx != idx) {
> +        if (s390_pci_find_dev_by_idx(s, idx)) {
> +            return -EINVAL;
> +        }
> +        /*
> +         * Update the idx entry with the passed through idx
> +         * If the relinquished idx is lower than next_idx, use it
> +         * to replace next_idx
> +         */
> +        g_hash_table_remove(s->zpci_table, &pbdev->idx);
> +        if (idx < s->next_idx) {
> +            s->next_idx = idx;
> +        }
> +        pbdev->idx = idx;
> +        g_hash_table_insert(s->zpci_table, &pbdev->idx, pbdev);
> +    }
> +
> +    return 0;
> +}
> +
>   static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                 Error **errp)
>   {
>       S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
>       PCIDevice *pdev = NULL;
>       S390PCIBusDevice *pbdev = NULL;
> +    int rc;
>   
>       if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
>           PCIBridge *pb = PCI_BRIDGE(dev);
> @@ -1022,12 +1068,33 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>           set_pbdev_info(pbdev);
>   
>           if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
> -            pbdev->fh |= FH_SHM_VFIO;
> +            /*
> +             * By default, interpretation is always requested; if the available
> +             * facilities indicate it is not available, fallback to the
> +             * intercept model.
> +             */
> +            if (pbdev->interp && !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
> +                    DPRINTF("zPCI interpretation facilities missing.\n");
> +                    pbdev->interp = false;
> +                }

Wrong indentation in the above three lines.

> +            if (pbdev->interp) {
> +                rc = s390_pci_interp_plug(s, pbdev);
> +                if (rc) {
> +                    error_setg(errp, "zpci interp plug failed: %d", rc);

The error message is a little bit scarce for something that might be 
presented to the user - maybe write at least "interpretation" instead of 
"interp" ?

> +                    return;
> +                }
> +            }
>               pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
>               /* Fill in CLP information passed via the vfio region */
>               s390_pci_get_clp_info(pbdev);
> +            if (!pbdev->interp) {
> +                /* Do vfio passthrough but intercept for I/O */
> +                pbdev->fh |= FH_SHM_VFIO;
> +            }
>           } else {
>               pbdev->fh |= FH_SHM_EMUL;
> +            /* Always intercept emulated devices */
> +            pbdev->interp = false;
>           }

  Thomas



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 0/9] s390x/pci: zPCI interpretation support
  2022-01-14 20:38 ` Matthew Rosato
@ 2022-01-17 15:23   ` Thomas Huth
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Huth @ 2022-01-17 15:23 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

On 14/01/2022 21.38, Matthew Rosato wrote:
> For QEMU, the majority of the work in enabling instruction interpretation
> is handled via new VFIO ioctls to SET the appropriate interpretation and
> interrupt forwarding modes, and to GET the function handle to use for
> interpretive execution.
> 
> This series implements these new ioctls, as well as adding a new, optional
> 'intercept' parameter to zpci to request interpretation support not be used
> as well as an 'intassist' parameter to determine whether or not the
> firmware assist will be used for interrupt delivery or whether the host
> will be responsible for delivering all interrupts.

  Hi Matthew,

would it make sense to create a docs/system/s390x/zpci.rst doc file, too, 
where you could describe such new parameters like 'intassist' and 
'intercept' (or is it 'interp') ? ... otherwise hardly anybody except you 
will know how to use these parameters later...

  Thomas


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 0/9] s390x/pci: zPCI interpretation support
@ 2022-01-17 15:23   ` Thomas Huth
  0 siblings, 0 replies; 44+ messages in thread
From: Thomas Huth @ 2022-01-17 15:23 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

On 14/01/2022 21.38, Matthew Rosato wrote:
> For QEMU, the majority of the work in enabling instruction interpretation
> is handled via new VFIO ioctls to SET the appropriate interpretation and
> interrupt forwarding modes, and to GET the function handle to use for
> interpretive execution.
> 
> This series implements these new ioctls, as well as adding a new, optional
> 'intercept' parameter to zpci to request interpretation support not be used
> as well as an 'intassist' parameter to determine whether or not the
> firmware assist will be used for interrupt delivery or whether the host
> will be responsible for delivering all interrupts.

  Hi Matthew,

would it make sense to create a docs/system/s390x/zpci.rst doc file, too, 
where you could describe such new parameters like 'intassist' and 
'intercept' (or is it 'interp') ? ... otherwise hardly anybody except you 
will know how to use these parameters later...

  Thomas



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
  2022-01-14 20:38   ` Matthew Rosato
@ 2022-01-17 15:38     ` Thomas Huth
  -1 siblings, 0 replies; 44+ messages in thread
From: Thomas Huth @ 2022-01-17 15:38 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

On 14/01/2022 21.38, Matthew Rosato wrote:
> Use the associated vfio feature ioctl to enable interpretation for devices
> when requested.  As part of this process, we must use the host function
> handle rather than a QEMU-generated one -- this is provided as part of the
> ioctl payload.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   hw/s390x/s390-pci-bus.c          | 70 +++++++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-inst.c         | 63 +++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-vfio.c         | 52 ++++++++++++++++++++++++
>   include/hw/s390x/s390-pci-bus.h  |  1 +
>   include/hw/s390x/s390-pci-vfio.h | 15 +++++++
>   5 files changed, 199 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 01b58ebc70..a39ccfee05 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
...
> @@ -1360,6 +1427,7 @@ static Property s390_pci_device_properties[] = {
>       DEFINE_PROP_UINT16("uid", S390PCIBusDevice, uid, UID_UNDEFINED),
>       DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
>       DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
> +    DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
>       DEFINE_PROP_END_OF_LIST(),
>   };

Since this is something that the user can see, would it maybe make sense to 
provide a full word instead of an abbreviation here? I.e. "interpret" or 
"interpretation" instead of "interp" ?

  Thomas


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
@ 2022-01-17 15:38     ` Thomas Huth
  0 siblings, 0 replies; 44+ messages in thread
From: Thomas Huth @ 2022-01-17 15:38 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

On 14/01/2022 21.38, Matthew Rosato wrote:
> Use the associated vfio feature ioctl to enable interpretation for devices
> when requested.  As part of this process, we must use the host function
> handle rather than a QEMU-generated one -- this is provided as part of the
> ioctl payload.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   hw/s390x/s390-pci-bus.c          | 70 +++++++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-inst.c         | 63 +++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-vfio.c         | 52 ++++++++++++++++++++++++
>   include/hw/s390x/s390-pci-bus.h  |  1 +
>   include/hw/s390x/s390-pci-vfio.h | 15 +++++++
>   5 files changed, 199 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 01b58ebc70..a39ccfee05 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
...
> @@ -1360,6 +1427,7 @@ static Property s390_pci_device_properties[] = {
>       DEFINE_PROP_UINT16("uid", S390PCIBusDevice, uid, UID_UNDEFINED),
>       DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
>       DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
> +    DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
>       DEFINE_PROP_END_OF_LIST(),
>   };

Since this is something that the user can see, would it maybe make sense to 
provide a full word instead of an abbreviation here? I.e. "interpret" or 
"interpretation" instead of "interp" ?

  Thomas



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
  2022-01-17 14:51     ` Thomas Huth
@ 2022-01-18 18:08       ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-18 18:08 UTC (permalink / raw)
  To: Thomas Huth, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

On 1/17/22 9:51 AM, Thomas Huth wrote:
> On 14/01/2022 21.38, Matthew Rosato wrote:
...
>>   static void s390_pcihost_plug(HotplugHandler *hotplug_dev, 
>> DeviceState *dev,
>>                                 Error **errp)
>>   {
>>       S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
>>       PCIDevice *pdev = NULL;
>>       S390PCIBusDevice *pbdev = NULL;
>> +    int rc;
>>       if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
>>           PCIBridge *pb = PCI_BRIDGE(dev);
>> @@ -1022,12 +1068,33 @@ static void s390_pcihost_plug(HotplugHandler 
>> *hotplug_dev, DeviceState *dev,
>>           set_pbdev_info(pbdev);
>>           if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
>> -            pbdev->fh |= FH_SHM_VFIO;
>> +            /*
>> +             * By default, interpretation is always requested; if the 
>> available
>> +             * facilities indicate it is not available, fallback to the
>> +             * intercept model.
>> +             */
>> +            if (pbdev->interp && 
>> !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
>> +                    DPRINTF("zPCI interpretation facilities 
>> missing.\n");
>> +                    pbdev->interp = false;
>> +                }
> 
> Wrong indentation in the above three lines.

Thanks

> 
>> +            if (pbdev->interp) {
>> +                rc = s390_pci_interp_plug(s, pbdev);
>> +                if (rc) {
>> +                    error_setg(errp, "zpci interp plug failed: %d", rc);
> 
> The error message is a little bit scarce for something that might be 
> presented to the user - maybe write at least "interpretation" instead of 
> "interp" ?
> 
Good point, I'll re-word to something like "Plug failed for zPCI device 
in interpretation mode: %d"

>> +                    return;
>> +                }
>> +            }
>>               pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, 
>> pbdev);
>>               /* Fill in CLP information passed via the vfio region */
>>               s390_pci_get_clp_info(pbdev);
>> +            if (!pbdev->interp) {
>> +                /* Do vfio passthrough but intercept for I/O */
>> +                pbdev->fh |= FH_SHM_VFIO;
>> +            }
>>           } else {
>>               pbdev->fh |= FH_SHM_EMUL;
>> +            /* Always intercept emulated devices */
>> +            pbdev->interp = false;
>>           }
> 
>   Thomas
> 


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
@ 2022-01-18 18:08       ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-18 18:08 UTC (permalink / raw)
  To: Thomas Huth, qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

On 1/17/22 9:51 AM, Thomas Huth wrote:
> On 14/01/2022 21.38, Matthew Rosato wrote:
...
>>   static void s390_pcihost_plug(HotplugHandler *hotplug_dev, 
>> DeviceState *dev,
>>                                 Error **errp)
>>   {
>>       S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
>>       PCIDevice *pdev = NULL;
>>       S390PCIBusDevice *pbdev = NULL;
>> +    int rc;
>>       if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
>>           PCIBridge *pb = PCI_BRIDGE(dev);
>> @@ -1022,12 +1068,33 @@ static void s390_pcihost_plug(HotplugHandler 
>> *hotplug_dev, DeviceState *dev,
>>           set_pbdev_info(pbdev);
>>           if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
>> -            pbdev->fh |= FH_SHM_VFIO;
>> +            /*
>> +             * By default, interpretation is always requested; if the 
>> available
>> +             * facilities indicate it is not available, fallback to the
>> +             * intercept model.
>> +             */
>> +            if (pbdev->interp && 
>> !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
>> +                    DPRINTF("zPCI interpretation facilities 
>> missing.\n");
>> +                    pbdev->interp = false;
>> +                }
> 
> Wrong indentation in the above three lines.

Thanks

> 
>> +            if (pbdev->interp) {
>> +                rc = s390_pci_interp_plug(s, pbdev);
>> +                if (rc) {
>> +                    error_setg(errp, "zpci interp plug failed: %d", rc);
> 
> The error message is a little bit scarce for something that might be 
> presented to the user - maybe write at least "interpretation" instead of 
> "interp" ?
> 
Good point, I'll re-word to something like "Plug failed for zPCI device 
in interpretation mode: %d"

>> +                    return;
>> +                }
>> +            }
>>               pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, 
>> pbdev);
>>               /* Fill in CLP information passed via the vfio region */
>>               s390_pci_get_clp_info(pbdev);
>> +            if (!pbdev->interp) {
>> +                /* Do vfio passthrough but intercept for I/O */
>> +                pbdev->fh |= FH_SHM_VFIO;
>> +            }
>>           } else {
>>               pbdev->fh |= FH_SHM_EMUL;
>> +            /* Always intercept emulated devices */
>> +            pbdev->interp = false;
>>           }
> 
>   Thomas
> 



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
  2022-01-17 15:38     ` Thomas Huth
@ 2022-01-18 18:14       ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-18 18:14 UTC (permalink / raw)
  To: Thomas Huth, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

On 1/17/22 10:38 AM, Thomas Huth wrote:
> On 14/01/2022 21.38, Matthew Rosato wrote:
>> Use the associated vfio feature ioctl to enable interpretation for 
>> devices
>> when requested.  As part of this process, we must use the host function
>> handle rather than a QEMU-generated one -- this is provided as part of 
>> the
>> ioctl payload.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   hw/s390x/s390-pci-bus.c          | 70 +++++++++++++++++++++++++++++++-
>>   hw/s390x/s390-pci-inst.c         | 63 +++++++++++++++++++++++++++-
>>   hw/s390x/s390-pci-vfio.c         | 52 ++++++++++++++++++++++++
>>   include/hw/s390x/s390-pci-bus.h  |  1 +
>>   include/hw/s390x/s390-pci-vfio.h | 15 +++++++
>>   5 files changed, 199 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>> index 01b58ebc70..a39ccfee05 100644
>> --- a/hw/s390x/s390-pci-bus.c
>> +++ b/hw/s390x/s390-pci-bus.c
> ...
>> @@ -1360,6 +1427,7 @@ static Property s390_pci_device_properties[] = {
>>       DEFINE_PROP_UINT16("uid", S390PCIBusDevice, uid, UID_UNDEFINED),
>>       DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
>>       DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
>> +    DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
>>       DEFINE_PROP_END_OF_LIST(),
>>   };
> 
> Since this is something that the user can see, would it maybe make sense 
> to provide a full word instead of an abbreviation here? I.e. "interpret" 
> or "interpretation" instead of "interp" ?

I'll go with "interpret" unless someone else has a strong opinion on it.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
@ 2022-01-18 18:14       ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-18 18:14 UTC (permalink / raw)
  To: Thomas Huth, qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

On 1/17/22 10:38 AM, Thomas Huth wrote:
> On 14/01/2022 21.38, Matthew Rosato wrote:
>> Use the associated vfio feature ioctl to enable interpretation for 
>> devices
>> when requested.  As part of this process, we must use the host function
>> handle rather than a QEMU-generated one -- this is provided as part of 
>> the
>> ioctl payload.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   hw/s390x/s390-pci-bus.c          | 70 +++++++++++++++++++++++++++++++-
>>   hw/s390x/s390-pci-inst.c         | 63 +++++++++++++++++++++++++++-
>>   hw/s390x/s390-pci-vfio.c         | 52 ++++++++++++++++++++++++
>>   include/hw/s390x/s390-pci-bus.h  |  1 +
>>   include/hw/s390x/s390-pci-vfio.h | 15 +++++++
>>   5 files changed, 199 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>> index 01b58ebc70..a39ccfee05 100644
>> --- a/hw/s390x/s390-pci-bus.c
>> +++ b/hw/s390x/s390-pci-bus.c
> ...
>> @@ -1360,6 +1427,7 @@ static Property s390_pci_device_properties[] = {
>>       DEFINE_PROP_UINT16("uid", S390PCIBusDevice, uid, UID_UNDEFINED),
>>       DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
>>       DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
>> +    DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
>>       DEFINE_PROP_END_OF_LIST(),
>>   };
> 
> Since this is something that the user can see, would it maybe make sense 
> to provide a full word instead of an abbreviation here? I.e. "interpret" 
> or "interpretation" instead of "interp" ?

I'll go with "interpret" unless someone else has a strong opinion on it.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 0/9] s390x/pci: zPCI interpretation support
  2022-01-17 15:23   ` Thomas Huth
@ 2022-01-18 18:32     ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-18 18:32 UTC (permalink / raw)
  To: Thomas Huth, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, farman, pmorel,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

On 1/17/22 10:23 AM, Thomas Huth wrote:
> On 14/01/2022 21.38, Matthew Rosato wrote:
>> For QEMU, the majority of the work in enabling instruction interpretation
>> is handled via new VFIO ioctls to SET the appropriate interpretation and
>> interrupt forwarding modes, and to GET the function handle to use for
>> interpretive execution.
>>
>> This series implements these new ioctls, as well as adding a new, 
>> optional
>> 'intercept' parameter to zpci to request interpretation support not be 
>> used
>> as well as an 'intassist' parameter to determine whether or not the
>> firmware assist will be used for interrupt delivery or whether the host
>> will be responsible for delivering all interrupts.
> 
>   Hi Matthew,
> 
> would it make sense to create a docs/system/s390x/zpci.rst doc file, 

This is a good idea and probably something that was due for zpci before 
this series even.

> too, where you could describe such new parameters like 'intassist' and 
> 'intercept' (or is it 'interp') ? ... otherwise hardly anybody except 

Oops, 'intercept' was a holdover from a previous version and effectively 
had an inverted meaning.  It is indeed 'interp' with this current series 
(and subject to change again per other thread)



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 0/9] s390x/pci: zPCI interpretation support
@ 2022-01-18 18:32     ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-18 18:32 UTC (permalink / raw)
  To: Thomas Huth, qemu-s390x
  Cc: farman, kvm, pmorel, schnelle, cohuck, richard.henderson,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

On 1/17/22 10:23 AM, Thomas Huth wrote:
> On 14/01/2022 21.38, Matthew Rosato wrote:
>> For QEMU, the majority of the work in enabling instruction interpretation
>> is handled via new VFIO ioctls to SET the appropriate interpretation and
>> interrupt forwarding modes, and to GET the function handle to use for
>> interpretive execution.
>>
>> This series implements these new ioctls, as well as adding a new, 
>> optional
>> 'intercept' parameter to zpci to request interpretation support not be 
>> used
>> as well as an 'intassist' parameter to determine whether or not the
>> firmware assist will be used for interrupt delivery or whether the host
>> will be responsible for delivering all interrupts.
> 
>   Hi Matthew,
> 
> would it make sense to create a docs/system/s390x/zpci.rst doc file, 

This is a good idea and probably something that was due for zpci before 
this series even.

> too, where you could describe such new parameters like 'intassist' and 
> 'intercept' (or is it 'interp') ? ... otherwise hardly anybody except 

Oops, 'intercept' was a holdover from a previous version and effectively 
had an inverted meaning.  It is indeed 'interp' with this current series 
(and subject to change again per other thread)




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
  2022-01-14 20:38   ` Matthew Rosato
@ 2022-01-31  9:37     ` Pierre Morel
  -1 siblings, 0 replies; 44+ messages in thread
From: Pierre Morel @ 2022-01-31  9:37 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm



On 1/14/22 21:38, Matthew Rosato wrote:
> Use the associated vfio feature ioctl to enable interpretation for devices
> when requested.  As part of this process, we must use the host function
> handle rather than a QEMU-generated one -- this is provided as part of the
> ioctl payload.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   hw/s390x/s390-pci-bus.c          | 70 +++++++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-inst.c         | 63 +++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-vfio.c         | 52 ++++++++++++++++++++++++
>   include/hw/s390x/s390-pci-bus.h  |  1 +
>   include/hw/s390x/s390-pci-vfio.h | 15 +++++++
>   5 files changed, 199 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 01b58ebc70..a39ccfee05 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -971,12 +971,58 @@ static void s390_pci_update_subordinate(PCIDevice *dev, uint32_t nr)
>       }
>   }
>   

...snip...

> diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
> index ff708aef50..42533e38f7 100644
> --- a/include/hw/s390x/s390-pci-vfio.h
> +++ b/include/hw/s390x/s390-pci-vfio.h
> @@ -20,6 +20,9 @@ bool s390_pci_update_dma_avail(int fd, unsigned int *avail);
>   S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
>                                             S390PCIBusDevice *pbdev);
>   void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
> +int s390_pci_probe_interp(S390PCIBusDevice *pbdev);
> +int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable);
> +int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev);
>   void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
>   #else
>   static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
> @@ -33,6 +36,18 @@ static inline S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
>   }
>   static inline void s390_pci_end_dma_count(S390pciState *s,
>                                             S390PCIDMACount *cnt) { }
> +int s390_pci_probe_interp(S390PCIBusDevice *pbdev)

static inline ?

> +{
> +    return -EINVAL;
> +}
> +static inline int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable)
> +{
> +    return -EINVAL;
> +}
> +static inline int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
> +{
> +    return -EINVAL;
> +}
>   static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
>   #endif
>   
> 

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
@ 2022-01-31  9:37     ` Pierre Morel
  0 siblings, 0 replies; 44+ messages in thread
From: Pierre Morel @ 2022-01-31  9:37 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: farman, kvm, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger



On 1/14/22 21:38, Matthew Rosato wrote:
> Use the associated vfio feature ioctl to enable interpretation for devices
> when requested.  As part of this process, we must use the host function
> handle rather than a QEMU-generated one -- this is provided as part of the
> ioctl payload.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   hw/s390x/s390-pci-bus.c          | 70 +++++++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-inst.c         | 63 +++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-vfio.c         | 52 ++++++++++++++++++++++++
>   include/hw/s390x/s390-pci-bus.h  |  1 +
>   include/hw/s390x/s390-pci-vfio.h | 15 +++++++
>   5 files changed, 199 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 01b58ebc70..a39ccfee05 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -971,12 +971,58 @@ static void s390_pci_update_subordinate(PCIDevice *dev, uint32_t nr)
>       }
>   }
>   

...snip...

> diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
> index ff708aef50..42533e38f7 100644
> --- a/include/hw/s390x/s390-pci-vfio.h
> +++ b/include/hw/s390x/s390-pci-vfio.h
> @@ -20,6 +20,9 @@ bool s390_pci_update_dma_avail(int fd, unsigned int *avail);
>   S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
>                                             S390PCIBusDevice *pbdev);
>   void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
> +int s390_pci_probe_interp(S390PCIBusDevice *pbdev);
> +int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable);
> +int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev);
>   void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
>   #else
>   static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
> @@ -33,6 +36,18 @@ static inline S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
>   }
>   static inline void s390_pci_end_dma_count(S390pciState *s,
>                                             S390PCIDMACount *cnt) { }
> +int s390_pci_probe_interp(S390PCIBusDevice *pbdev)

static inline ?

> +{
> +    return -EINVAL;
> +}
> +static inline int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable)
> +{
> +    return -EINVAL;
> +}
> +static inline int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
> +{
> +    return -EINVAL;
> +}
>   static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
>   #endif
>   
> 

-- 
Pierre Morel
IBM Lab Boeblingen


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
  2022-01-14 20:38   ` Matthew Rosato
                     ` (3 preceding siblings ...)
  (?)
@ 2022-01-31 14:46   ` Pierre Morel
  2022-01-31 17:11     ` Matthew Rosato
  -1 siblings, 1 reply; 44+ messages in thread
From: Pierre Morel @ 2022-01-31 14:46 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: farman, kvm, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger



On 1/14/22 21:38, Matthew Rosato wrote:
> Use the associated vfio feature ioctl to enable interpretation for devices
> when requested.  As part of this process, we must use the host function
> handle rather than a QEMU-generated one -- this is provided as part of the
> ioctl payload.

I wonder if we should not explain here that having interpretation as a 
default and silently fall back to interception allows backward 
compatibility while allowing performence be chosing by default.
(You can say it better as I do :) )


> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   hw/s390x/s390-pci-bus.c          | 70 +++++++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-inst.c         | 63 +++++++++++++++++++++++++++-
>   hw/s390x/s390-pci-vfio.c         | 52 ++++++++++++++++++++++++
>   include/hw/s390x/s390-pci-bus.h  |  1 +
>   include/hw/s390x/s390-pci-vfio.h | 15 +++++++
>   5 files changed, 199 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 01b58ebc70..a39ccfee05 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -971,12 +971,58 @@ static void s390_pci_update_subordinate(PCIDevice *dev, uint32_t nr)
>       }
>   }
>   
> +static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)
> +{
> +    uint32_t idx;
> +    int rc;
> +
> +    rc = s390_pci_probe_interp(pbdev);
> +    if (rc) {
> +        return rc;
> +    }
> +
> +    rc = s390_pci_update_passthrough_fh(pbdev);
> +    if (rc) {
> +        return rc;
> +    }
> +
> +    /*
> +     * The host device is already in an enabled state, but we always present
> +     * the initial device state to the guest as disabled (ZPCI_FS_DISABLED).
> +     * Therefore, mask off the enable bit from the passthrough handle until
> +     * the guest issues a CLP SET PCI FN later to enable the device.
> +     */
> +    pbdev->fh &= ~FH_MASK_ENABLE;
> +
> +    /* Next, see if the idx is already in-use */
> +    idx = pbdev->fh & FH_MASK_INDEX;
> +    if (pbdev->idx != idx) {
> +        if (s390_pci_find_dev_by_idx(s, idx)) {
> +            return -EINVAL;
> +        }
> +        /*
> +         * Update the idx entry with the passed through idx
> +         * If the relinquished idx is lower than next_idx, use it
> +         * to replace next_idx
> +         */
> +        g_hash_table_remove(s->zpci_table, &pbdev->idx);
> +        if (idx < s->next_idx) {
> +            s->next_idx = idx;
> +        }
> +        pbdev->idx = idx;
> +        g_hash_table_insert(s->zpci_table, &pbdev->idx, pbdev);
> +    }
> +
> +    return 0;
> +}
> +
>   static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>                                 Error **errp)
>   {
>       S390pciState *s = S390_PCI_HOST_BRIDGE(hotplug_dev);
>       PCIDevice *pdev = NULL;
>       S390PCIBusDevice *pbdev = NULL;
> +    int rc;
>   
>       if (object_dynamic_cast(OBJECT(dev), TYPE_PCI_BRIDGE)) {
>           PCIBridge *pb = PCI_BRIDGE(dev);
> @@ -1022,12 +1068,33 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>           set_pbdev_info(pbdev);
>   
>           if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
> -            pbdev->fh |= FH_SHM_VFIO;
> +            /*
> +             * By default, interpretation is always requested; if the available
> +             * facilities indicate it is not available, fallback to the
> +             * intercept model.

s/intercept/interception/ ?

> +             */
> +            if (pbdev->interp && !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
> +                    DPRINTF("zPCI interpretation facilities missing.\n");
> +                    pbdev->interp = false;
> +                }
> +            if (pbdev->interp) {
> +                rc = s390_pci_interp_plug(s, pbdev);
> +                if (rc) {
> +                    error_setg(errp, "zpci interp plug failed: %d", rc);
> +                    return;
> +                }
> +            }


Can't we rearrange that as
if (pbdev->interp) {
     if (s390_has_feat) {
     } else {
     }
}


>               pbdev->iommu->dma_limit = s390_pci_start_dma_count(s, pbdev);
>               /* Fill in CLP information passed via the vfio region */
>               s390_pci_get_clp_info(pbdev);
> +            if (!pbdev->interp) {
> +                /* Do vfio passthrough but intercept for I/O */
> +                pbdev->fh |= FH_SHM_VFIO;
> +            }
>           } else {
>               pbdev->fh |= FH_SHM_EMUL;
> +            /* Always intercept emulated devices */
> +            pbdev->interp = false;
>           }
>   
>           if (s390_pci_msix_init(pbdev)) {
> @@ -1360,6 +1427,7 @@ static Property s390_pci_device_properties[] = {
>       DEFINE_PROP_UINT16("uid", S390PCIBusDevice, uid, UID_UNDEFINED),
>       DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
>       DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
> +    DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
>       DEFINE_PROP_END_OF_LIST(),
>   };
>   
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index 6d400d4147..e9a0dc12e4 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -18,6 +18,7 @@
>   #include "sysemu/hw_accel.h"
>   #include "hw/s390x/s390-pci-inst.h"
>   #include "hw/s390x/s390-pci-bus.h"
> +#include "hw/s390x/s390-pci-vfio.h"
>   #include "hw/s390x/tod.h"
>   
>   #ifndef DEBUG_S390PCI_INST
> @@ -156,6 +157,47 @@ out:
>       return rc;
>   }
>   
> +static int clp_enable_interp(S390PCIBusDevice *pbdev)
> +{
> +    int rc;
> +
> +    rc = s390_pci_set_interp(pbdev, true);
> +    if (rc) {
> +        DPRINTF("Failed to enable interpretation\n");
> +        return rc;
> +    }
> +    rc = s390_pci_update_passthrough_fh(pbdev);
> +    if (rc) {
> +        DPRINTF("Failed to update passthrough fh\n");
> +        return rc;
> +    }
> +    if (!(pbdev->fh & FH_MASK_ENABLE)) {
> +        DPRINTF("Passthrough handle is not enabled\n");
> +        return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +
> +static int clp_disable_interp(S390PCIBusDevice *pbdev)
> +{
> +    int rc;
> +
> +    rc = s390_pci_set_interp(pbdev, false);
> +    if (rc) {
> +        DPRINTF("Failed to disable interpretation\n");
> +        return rc;
> +    }
> +
> +    rc = s390_pci_update_passthrough_fh(pbdev);
> +    if (rc) {
> +        DPRINTF("Failed to update passthrough fh\n");
> +        return rc;
> +    }
> +
> +    return 0;
> +}
> +
>   int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
>   {
>       ClpReqHdr *reqh;
> @@ -246,7 +288,19 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
>                   goto out;
>               }
>   
> -            pbdev->fh |= FH_MASK_ENABLE;
> +            /*
> +             * If interpretation is specified, attempt to enable this now and
> +             * update with the host fh
> +             */
> +            if (pbdev->interp) {
> +                if (clp_enable_interp(pbdev)) {
> +                    stw_p(&ressetpci->hdr.rsp, CLP_RC_SETPCIFN_ERR);
> +                    goto out;
> +                }
> +            } else {
> +                pbdev->fh |= FH_MASK_ENABLE;
> +            }
> +
>               pbdev->state = ZPCI_FS_ENABLED;
>               stl_p(&ressetpci->fh, pbdev->fh);
>               stw_p(&ressetpci->hdr.rsp, CLP_RC_OK);
> @@ -257,6 +311,13 @@ int clp_service_call(S390CPU *cpu, uint8_t r2, uintptr_t ra)
>                   goto out;
>               }
>               device_legacy_reset(DEVICE(pbdev));
> +            if (pbdev->interp) {
> +                if (clp_disable_interp(pbdev)) {
> +                    stw_p(&ressetpci->hdr.rsp, CLP_RC_SETPCIFN_ERR);
> +                    goto out;
> +                }
> +            }
> +            /* Mask off the enabled bit for interpreted devices too */
>               pbdev->fh &= ~FH_MASK_ENABLE;
>               pbdev->state = ZPCI_FS_DISABLED;
>               stl_p(&ressetpci->fh, pbdev->fh);
> diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
> index 6f80a47e29..2cab3a9e89 100644
> --- a/hw/s390x/s390-pci-vfio.c
> +++ b/hw/s390x/s390-pci-vfio.c
> @@ -97,6 +97,58 @@ void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt)
>       }
>   }
>   
> +int s390_pci_probe_interp(S390PCIBusDevice *pbdev)
> +{
> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
> +    struct vfio_device_feature feat = {
> +        .argsz = sizeof(struct vfio_device_feature),
> +        .flags = VFIO_DEVICE_FEATURE_PROBE | VFIO_DEVICE_FEATURE_ZPCI_INTERP
> +    };
> +
> +    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat);
> +}
> +
> +int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable)
> +{
> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
> +    struct vfio_device_zpci_interp *data;
> +    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
> +    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
> +
> +    feat->argsz = size;
> +    feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_INTERP;
> +
> +    data = (struct vfio_device_zpci_interp *)&feat->data;
> +    if (enable) {
> +        data->flags = VFIO_DEVICE_ZPCI_FLAG_INTERP;
> +    } else {
> +        data->flags = 0;
> +    }
> +
> +    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
> +}
> +
> +int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
> +{
> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
> +    struct vfio_device_zpci_interp *data;
> +    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
> +    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
> +    int rc;
> +
> +    feat->argsz = size;
> +    feat->flags = VFIO_DEVICE_FEATURE_GET + VFIO_DEVICE_FEATURE_ZPCI_INTERP;
> +
> +    rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
> +    if (rc) {
> +        return rc;
> +    }
> +
> +    data = (struct vfio_device_zpci_interp *)&feat->data;
> +    pbdev->fh = data->fh;
> +    return 0;
> +}
> +
>   static void s390_pci_read_base(S390PCIBusDevice *pbdev,
>                                  struct vfio_device_info *info)
>   {
> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
> index da3cde2bb4..a9843dfe97 100644
> --- a/include/hw/s390x/s390-pci-bus.h
> +++ b/include/hw/s390x/s390-pci-bus.h
> @@ -350,6 +350,7 @@ struct S390PCIBusDevice {
>       IndAddr *indicator;
>       bool pci_unplug_request_processed;
>       bool unplug_requested;
> +    bool interp;
>       QTAILQ_ENTRY(S390PCIBusDevice) link;
>   };
>   
> diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
> index ff708aef50..42533e38f7 100644
> --- a/include/hw/s390x/s390-pci-vfio.h
> +++ b/include/hw/s390x/s390-pci-vfio.h
> @@ -20,6 +20,9 @@ bool s390_pci_update_dma_avail(int fd, unsigned int *avail);
>   S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
>                                             S390PCIBusDevice *pbdev);
>   void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
> +int s390_pci_probe_interp(S390PCIBusDevice *pbdev);
> +int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable);
> +int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev);
>   void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
>   #else
>   static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
> @@ -33,6 +36,18 @@ static inline S390PCIDMACount *s390_pci_start_dma_count(S390pciState *s,
>   }
>   static inline void s390_pci_end_dma_count(S390pciState *s,
>                                             S390PCIDMACount *cnt) { }
> +int s390_pci_probe_interp(S390PCIBusDevice *pbdev)
> +{
> +    return -EINVAL;
> +}
> +static inline int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable)
> +{
> +    return -EINVAL;
> +}
> +static inline int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
> +{
> +    return -EINVAL;
> +}
>   static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
>   #endif
>   
> 

LGTM
With the corrections proposed by Thomas.
Mine... you see what you prefer.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>


-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 6/9] s390x/pci: enable adapter event notification for interpreted devices
  2022-01-14 20:38   ` Matthew Rosato
@ 2022-01-31 15:10     ` Pierre Morel
  -1 siblings, 0 replies; 44+ messages in thread
From: Pierre Morel @ 2022-01-31 15:10 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm



On 1/14/22 21:38, Matthew Rosato wrote:
> Use the associated vfio feature ioctl to enable adapter event notification
> and forwarding for devices when requested.  This feature will be set up
> with or without firmware assist based upon the 'intassist' setting.

It is a personal opinion but I do not like the term 'intassist'.
Why not using 'gisa' ?
I would find it more relevant.
If something for the profane admin then interrupt_assist or irq_assist 
or even int_assist.

Or is it forwarding_assist ?

> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   hw/s390x/s390-pci-bus.c          | 24 ++++++++--
>   hw/s390x/s390-pci-inst.c         | 54 +++++++++++++++++++++-
>   hw/s390x/s390-pci-vfio.c         | 79 ++++++++++++++++++++++++++++++++
>   include/hw/s390x/s390-pci-bus.h  |  1 +
>   include/hw/s390x/s390-pci-vfio.h | 20 ++++++++
>   5 files changed, 173 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 66649af6e0..6ee70446ca 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -189,7 +189,10 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
>           rc = SCLP_RC_NO_ACTION_REQUIRED;
>           break;
>       default:
> -        if (pbdev->summary_ind) {
> +        if (pbdev->interp) {
> +            /* Interpreted devices were using interrupt forwarding */
> +            s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
> +        } else if (pbdev->summary_ind) {
>               pci_dereg_irqs(pbdev);
>           }
>           if (pbdev->iommu->enabled) {
> @@ -981,6 +984,11 @@ static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)
>           return rc;
>       }
>   
> +    rc = s390_pci_probe_aif(pbdev);
> +    if (rc) {
> +        return rc;
> +    }
> +
>       rc = s390_pci_update_passthrough_fh(pbdev);
>       if (rc) {
>           return rc;
> @@ -1076,6 +1084,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>               if (pbdev->interp && !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
>                       DPRINTF("zPCI interpretation facilities missing.\n");
>                       pbdev->interp = false;
> +                    pbdev->intassist = false;
>                   }
>               if (pbdev->interp) {
>                   rc = s390_pci_interp_plug(s, pbdev);
> @@ -1090,11 +1099,13 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>               if (!pbdev->interp) {
>                   /* Do vfio passthrough but intercept for I/O */
>                   pbdev->fh |= FH_SHM_VFIO;
> +                pbdev->intassist = false;
>               }
>           } else {
>               pbdev->fh |= FH_SHM_EMUL;
>               /* Always intercept emulated devices */
>               pbdev->interp = false;
> +            pbdev->intassist = false;
>           }
>   
>           if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
> @@ -1244,7 +1255,10 @@ static void s390_pcihost_reset(DeviceState *dev)
>       /* Process all pending unplug requests */
>       QTAILQ_FOREACH_SAFE(pbdev, &s->zpci_devs, link, next) {
>           if (pbdev->unplug_requested) {
> -            if (pbdev->summary_ind) {
> +            if (pbdev->interp) {
> +                /* Interpreted devices were using interrupt forwarding */
> +                s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
> +            } else if (pbdev->summary_ind) {
>                   pci_dereg_irqs(pbdev);
>               }
>               if (pbdev->iommu->enabled) {
> @@ -1382,7 +1396,10 @@ static void s390_pci_device_reset(DeviceState *dev)
>           break;
>       }
>   
> -    if (pbdev->summary_ind) {
> +    if (pbdev->interp) {
> +        /* Interpreted devices were using interrupt forwarding */
> +        s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
> +    } else if (pbdev->summary_ind) {
>           pci_dereg_irqs(pbdev);
>       }
>       if (pbdev->iommu->enabled) {
> @@ -1428,6 +1445,7 @@ static Property s390_pci_device_properties[] = {
>       DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
>       DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
>       DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
> +    DEFINE_PROP_BOOL("intassist", S390PCIBusDevice, intassist, true),

We allow to disable IRQ forwarding only for test or debug purpose don't we?
Then shouldn't we make it explicit ?

>       DEFINE_PROP_END_OF_LIST(),
>   };
>   
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index e9a0dc12e4..121e07cc41 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -1111,6 +1111,46 @@ static void fmb_update(void *opaque)
>       timer_mod(pbdev->fmb_timer, t + pbdev->pci_group->zpci_group.mui);
>   }
>   
> +static int mpcifc_reg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
> +{
> +    int rc;
> +
> +    /* Interpreted devices must also use interrupt forwarding */
> +    rc = s390_pci_get_aif(pbdev, false, pbdev->intassist);
> +    if (rc) {
> +        DPRINTF("Bad interrupt forwarding state\n");
> +        return rc;
> +    }
> +
> +    rc = s390_pci_set_aif(pbdev, fib, true, pbdev->intassist);
> +    if (rc) {
> +        DPRINTF("Failed to enable interrupt forwarding\n");
> +        return rc;
> +    }
> +
> +    return 0;
> +}
> +
> +static int mpcifc_dereg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
> +{
> +    int rc;
> +
> +    /* Interpreted devices were using interrupt forwarding */
> +    rc = s390_pci_get_aif(pbdev, true, pbdev->intassist);
> +    if (rc) {
> +        DPRINTF("Bad interrupt forwarding state\n");
> +        return rc;
> +    }
> +
> +    rc = s390_pci_set_aif(pbdev, fib, false, pbdev->intassist);
> +    if (rc) {
> +        DPRINTF("Failed to disable interrupt forwarding\n");
> +        return rc;
> +    }
> +
> +    return 0;
> +}
> +
>   int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
>                           uintptr_t ra)
>   {
> @@ -1165,7 +1205,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
>   
>       switch (oc) {
>       case ZPCI_MOD_FC_REG_INT:
> -        if (pbdev->summary_ind) {
> +        if (pbdev->interp) {
> +            if (mpcifc_reg_int_interp(pbdev, &fib)) {
> +                cc = ZPCI_PCI_LS_ERR;
> +                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
> +            }
> +        } else if (pbdev->summary_ind) {
>               cc = ZPCI_PCI_LS_ERR;
>               s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>           } else if (reg_irqs(env, pbdev, fib)) {
> @@ -1174,7 +1219,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
>           }
>           break;
>       case ZPCI_MOD_FC_DEREG_INT:
> -        if (!pbdev->summary_ind) {
> +        if (pbdev->interp) {
> +            if (mpcifc_dereg_int_interp(pbdev, &fib)) {
> +                cc = ZPCI_PCI_LS_ERR;
> +                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
> +            }
> +        } else if (!pbdev->summary_ind) {
>               cc = ZPCI_PCI_LS_ERR;
>               s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>           } else {
> diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
> index 2cab3a9e89..73f3b3ed19 100644
> --- a/hw/s390x/s390-pci-vfio.c
> +++ b/hw/s390x/s390-pci-vfio.c
> @@ -149,6 +149,85 @@ int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
>       return 0;
>   }
>   
> +int s390_pci_probe_aif(S390PCIBusDevice *pbdev)
> +{
> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
> +    struct vfio_device_feature feat = {
> +        .argsz = sizeof(struct vfio_device_feature),
> +        .flags = VFIO_DEVICE_FEATURE_PROBE + VFIO_DEVICE_FEATURE_ZPCI_AIF
> +    };
> +
> +    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat);
> +}
> +
> +int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
> +                     bool assist)
> +{
> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
> +    struct vfio_device_zpci_aif *data;
> +    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
> +    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
> +
> +    feat->argsz = size;
> +    feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_AIF;
> +
> +    data = (struct vfio_device_zpci_aif *)&feat->data;
> +    if (enable) {
> +        data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
> +        if (!pbdev->intassist) {
> +            data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
> +        }
> +        /* Fill in the guest fib info */
> +        data->ibv = fib->aibv;
> +        data->sb = fib->aisb;
> +        data->noi = FIB_DATA_NOI(fib->data);
> +        data->isc = FIB_DATA_ISC(fib->data);
> +        data->sbo = FIB_DATA_AISBO(fib->data);
> +    } else {
> +        data->flags = 0;
> +    }
> +
> +    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
> +}
> +
> +int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist)
> +{
> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
> +    struct vfio_device_zpci_aif *data;
> +    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
> +    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
> +    int rc;
> +
> +    feat->argsz = size;
> +    feat->flags = VFIO_DEVICE_FEATURE_GET + VFIO_DEVICE_FEATURE_ZPCI_AIF;
> +
> +    rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
> +    if (rc) {
> +        return rc;
> +    }
> +
> +    /* Determine if current interrupt settings match the host */
> +    data = (struct vfio_device_zpci_aif *)&feat->data;
> +    if (enable && (!(data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT))) {
> +        rc = -EINVAL;
> +    } else if (!enable && (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT)) {
> +        rc = -EINVAL;
> +    }
> +
> +    /*
> +     * When enabled for interrupts, the assist and forced host-delivery are
> +     * mututally exclusive
> +     */

assist is unclear for me int_assist , forwarding_assist ?

> +    if (enable && assist && (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST)) {
> +        rc = -EINVAL;
> +    } else if (enable && (!assist) && (!(data->flags &
> +                                         VFIO_DEVICE_ZPCI_FLAG_AIF_HOST))) {
> +        rc = -EINVAL;
> +    }
> +
> +    return rc;
> +}
> +
>   static void s390_pci_read_base(S390PCIBusDevice *pbdev,
>                                  struct vfio_device_info *info)
>   {
> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
> index a9843dfe97..9941ca0084 100644
> --- a/include/hw/s390x/s390-pci-bus.h
> +++ b/include/hw/s390x/s390-pci-bus.h
> @@ -351,6 +351,7 @@ struct S390PCIBusDevice {
>       bool pci_unplug_request_processed;
>       bool unplug_requested;
>       bool interp;
> +    bool intassist;
>       QTAILQ_ENTRY(S390PCIBusDevice) link;
>   };
>   
> diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
> index 42533e38f7..6cec38a863 100644
> --- a/include/hw/s390x/s390-pci-vfio.h
> +++ b/include/hw/s390x/s390-pci-vfio.h
> @@ -13,6 +13,7 @@
>   #define HW_S390_PCI_VFIO_H
>   
>   #include "hw/s390x/s390-pci-bus.h"
> +#include "hw/s390x/s390-pci-inst.h"
>   #include CONFIG_DEVICES
>   
>   #ifdef CONFIG_VFIO
> @@ -23,6 +24,11 @@ void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
>   int s390_pci_probe_interp(S390PCIBusDevice *pbdev);
>   int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable);
>   int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev);
> +int s390_pci_probe_aif(S390PCIBusDevice *pbdev);
> +int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
> +                     bool assist);
> +int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist);
> +
>   void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
>   #else
>   static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
> @@ -48,6 +54,20 @@ static inline int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
>   {
>       return -EINVAL;
>   }
> +static inline int s390_pci_probe_aif(S390PCIBusDevice *pbdev)
> +{
> +    return -EINVAL;
> +}
> +static inline int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib,
> +                                   bool enable, bool assist)
> +{
> +    return -EINVAL;
> +}
> +static inline int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable,
> +                                   bool assist)
> +{
> +    return -EINVAL;
> +}
>   static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
>   #endif
>   
> 


apart of the naming (which may be a personal taste) LGTM



-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 6/9] s390x/pci: enable adapter event notification for interpreted devices
@ 2022-01-31 15:10     ` Pierre Morel
  0 siblings, 0 replies; 44+ messages in thread
From: Pierre Morel @ 2022-01-31 15:10 UTC (permalink / raw)
  To: Matthew Rosato, qemu-s390x
  Cc: farman, kvm, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger



On 1/14/22 21:38, Matthew Rosato wrote:
> Use the associated vfio feature ioctl to enable adapter event notification
> and forwarding for devices when requested.  This feature will be set up
> with or without firmware assist based upon the 'intassist' setting.

It is a personal opinion but I do not like the term 'intassist'.
Why not using 'gisa' ?
I would find it more relevant.
If something for the profane admin then interrupt_assist or irq_assist 
or even int_assist.

Or is it forwarding_assist ?

> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   hw/s390x/s390-pci-bus.c          | 24 ++++++++--
>   hw/s390x/s390-pci-inst.c         | 54 +++++++++++++++++++++-
>   hw/s390x/s390-pci-vfio.c         | 79 ++++++++++++++++++++++++++++++++
>   include/hw/s390x/s390-pci-bus.h  |  1 +
>   include/hw/s390x/s390-pci-vfio.h | 20 ++++++++
>   5 files changed, 173 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> index 66649af6e0..6ee70446ca 100644
> --- a/hw/s390x/s390-pci-bus.c
> +++ b/hw/s390x/s390-pci-bus.c
> @@ -189,7 +189,10 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
>           rc = SCLP_RC_NO_ACTION_REQUIRED;
>           break;
>       default:
> -        if (pbdev->summary_ind) {
> +        if (pbdev->interp) {
> +            /* Interpreted devices were using interrupt forwarding */
> +            s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
> +        } else if (pbdev->summary_ind) {
>               pci_dereg_irqs(pbdev);
>           }
>           if (pbdev->iommu->enabled) {
> @@ -981,6 +984,11 @@ static int s390_pci_interp_plug(S390pciState *s, S390PCIBusDevice *pbdev)
>           return rc;
>       }
>   
> +    rc = s390_pci_probe_aif(pbdev);
> +    if (rc) {
> +        return rc;
> +    }
> +
>       rc = s390_pci_update_passthrough_fh(pbdev);
>       if (rc) {
>           return rc;
> @@ -1076,6 +1084,7 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>               if (pbdev->interp && !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
>                       DPRINTF("zPCI interpretation facilities missing.\n");
>                       pbdev->interp = false;
> +                    pbdev->intassist = false;
>                   }
>               if (pbdev->interp) {
>                   rc = s390_pci_interp_plug(s, pbdev);
> @@ -1090,11 +1099,13 @@ static void s390_pcihost_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
>               if (!pbdev->interp) {
>                   /* Do vfio passthrough but intercept for I/O */
>                   pbdev->fh |= FH_SHM_VFIO;
> +                pbdev->intassist = false;
>               }
>           } else {
>               pbdev->fh |= FH_SHM_EMUL;
>               /* Always intercept emulated devices */
>               pbdev->interp = false;
> +            pbdev->intassist = false;
>           }
>   
>           if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
> @@ -1244,7 +1255,10 @@ static void s390_pcihost_reset(DeviceState *dev)
>       /* Process all pending unplug requests */
>       QTAILQ_FOREACH_SAFE(pbdev, &s->zpci_devs, link, next) {
>           if (pbdev->unplug_requested) {
> -            if (pbdev->summary_ind) {
> +            if (pbdev->interp) {
> +                /* Interpreted devices were using interrupt forwarding */
> +                s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
> +            } else if (pbdev->summary_ind) {
>                   pci_dereg_irqs(pbdev);
>               }
>               if (pbdev->iommu->enabled) {
> @@ -1382,7 +1396,10 @@ static void s390_pci_device_reset(DeviceState *dev)
>           break;
>       }
>   
> -    if (pbdev->summary_ind) {
> +    if (pbdev->interp) {
> +        /* Interpreted devices were using interrupt forwarding */
> +        s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
> +    } else if (pbdev->summary_ind) {
>           pci_dereg_irqs(pbdev);
>       }
>       if (pbdev->iommu->enabled) {
> @@ -1428,6 +1445,7 @@ static Property s390_pci_device_properties[] = {
>       DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
>       DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
>       DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
> +    DEFINE_PROP_BOOL("intassist", S390PCIBusDevice, intassist, true),

We allow to disable IRQ forwarding only for test or debug purpose don't we?
Then shouldn't we make it explicit ?

>       DEFINE_PROP_END_OF_LIST(),
>   };
>   
> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
> index e9a0dc12e4..121e07cc41 100644
> --- a/hw/s390x/s390-pci-inst.c
> +++ b/hw/s390x/s390-pci-inst.c
> @@ -1111,6 +1111,46 @@ static void fmb_update(void *opaque)
>       timer_mod(pbdev->fmb_timer, t + pbdev->pci_group->zpci_group.mui);
>   }
>   
> +static int mpcifc_reg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
> +{
> +    int rc;
> +
> +    /* Interpreted devices must also use interrupt forwarding */
> +    rc = s390_pci_get_aif(pbdev, false, pbdev->intassist);
> +    if (rc) {
> +        DPRINTF("Bad interrupt forwarding state\n");
> +        return rc;
> +    }
> +
> +    rc = s390_pci_set_aif(pbdev, fib, true, pbdev->intassist);
> +    if (rc) {
> +        DPRINTF("Failed to enable interrupt forwarding\n");
> +        return rc;
> +    }
> +
> +    return 0;
> +}
> +
> +static int mpcifc_dereg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
> +{
> +    int rc;
> +
> +    /* Interpreted devices were using interrupt forwarding */
> +    rc = s390_pci_get_aif(pbdev, true, pbdev->intassist);
> +    if (rc) {
> +        DPRINTF("Bad interrupt forwarding state\n");
> +        return rc;
> +    }
> +
> +    rc = s390_pci_set_aif(pbdev, fib, false, pbdev->intassist);
> +    if (rc) {
> +        DPRINTF("Failed to disable interrupt forwarding\n");
> +        return rc;
> +    }
> +
> +    return 0;
> +}
> +
>   int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
>                           uintptr_t ra)
>   {
> @@ -1165,7 +1205,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
>   
>       switch (oc) {
>       case ZPCI_MOD_FC_REG_INT:
> -        if (pbdev->summary_ind) {
> +        if (pbdev->interp) {
> +            if (mpcifc_reg_int_interp(pbdev, &fib)) {
> +                cc = ZPCI_PCI_LS_ERR;
> +                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
> +            }
> +        } else if (pbdev->summary_ind) {
>               cc = ZPCI_PCI_LS_ERR;
>               s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>           } else if (reg_irqs(env, pbdev, fib)) {
> @@ -1174,7 +1219,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, uint8_t ar,
>           }
>           break;
>       case ZPCI_MOD_FC_DEREG_INT:
> -        if (!pbdev->summary_ind) {
> +        if (pbdev->interp) {
> +            if (mpcifc_dereg_int_interp(pbdev, &fib)) {
> +                cc = ZPCI_PCI_LS_ERR;
> +                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
> +            }
> +        } else if (!pbdev->summary_ind) {
>               cc = ZPCI_PCI_LS_ERR;
>               s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>           } else {
> diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
> index 2cab3a9e89..73f3b3ed19 100644
> --- a/hw/s390x/s390-pci-vfio.c
> +++ b/hw/s390x/s390-pci-vfio.c
> @@ -149,6 +149,85 @@ int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
>       return 0;
>   }
>   
> +int s390_pci_probe_aif(S390PCIBusDevice *pbdev)
> +{
> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
> +    struct vfio_device_feature feat = {
> +        .argsz = sizeof(struct vfio_device_feature),
> +        .flags = VFIO_DEVICE_FEATURE_PROBE + VFIO_DEVICE_FEATURE_ZPCI_AIF
> +    };
> +
> +    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat);
> +}
> +
> +int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
> +                     bool assist)
> +{
> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
> +    struct vfio_device_zpci_aif *data;
> +    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
> +    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
> +
> +    feat->argsz = size;
> +    feat->flags = VFIO_DEVICE_FEATURE_SET + VFIO_DEVICE_FEATURE_ZPCI_AIF;
> +
> +    data = (struct vfio_device_zpci_aif *)&feat->data;
> +    if (enable) {
> +        data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
> +        if (!pbdev->intassist) {
> +            data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
> +        }
> +        /* Fill in the guest fib info */
> +        data->ibv = fib->aibv;
> +        data->sb = fib->aisb;
> +        data->noi = FIB_DATA_NOI(fib->data);
> +        data->isc = FIB_DATA_ISC(fib->data);
> +        data->sbo = FIB_DATA_AISBO(fib->data);
> +    } else {
> +        data->flags = 0;
> +    }
> +
> +    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
> +}
> +
> +int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist)
> +{
> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
> +    struct vfio_device_zpci_aif *data;
> +    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
> +    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
> +    int rc;
> +
> +    feat->argsz = size;
> +    feat->flags = VFIO_DEVICE_FEATURE_GET + VFIO_DEVICE_FEATURE_ZPCI_AIF;
> +
> +    rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
> +    if (rc) {
> +        return rc;
> +    }
> +
> +    /* Determine if current interrupt settings match the host */
> +    data = (struct vfio_device_zpci_aif *)&feat->data;
> +    if (enable && (!(data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT))) {
> +        rc = -EINVAL;
> +    } else if (!enable && (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT)) {
> +        rc = -EINVAL;
> +    }
> +
> +    /*
> +     * When enabled for interrupts, the assist and forced host-delivery are
> +     * mututally exclusive
> +     */

assist is unclear for me int_assist , forwarding_assist ?

> +    if (enable && assist && (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST)) {
> +        rc = -EINVAL;
> +    } else if (enable && (!assist) && (!(data->flags &
> +                                         VFIO_DEVICE_ZPCI_FLAG_AIF_HOST))) {
> +        rc = -EINVAL;
> +    }
> +
> +    return rc;
> +}
> +
>   static void s390_pci_read_base(S390PCIBusDevice *pbdev,
>                                  struct vfio_device_info *info)
>   {
> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
> index a9843dfe97..9941ca0084 100644
> --- a/include/hw/s390x/s390-pci-bus.h
> +++ b/include/hw/s390x/s390-pci-bus.h
> @@ -351,6 +351,7 @@ struct S390PCIBusDevice {
>       bool pci_unplug_request_processed;
>       bool unplug_requested;
>       bool interp;
> +    bool intassist;
>       QTAILQ_ENTRY(S390PCIBusDevice) link;
>   };
>   
> diff --git a/include/hw/s390x/s390-pci-vfio.h b/include/hw/s390x/s390-pci-vfio.h
> index 42533e38f7..6cec38a863 100644
> --- a/include/hw/s390x/s390-pci-vfio.h
> +++ b/include/hw/s390x/s390-pci-vfio.h
> @@ -13,6 +13,7 @@
>   #define HW_S390_PCI_VFIO_H
>   
>   #include "hw/s390x/s390-pci-bus.h"
> +#include "hw/s390x/s390-pci-inst.h"
>   #include CONFIG_DEVICES
>   
>   #ifdef CONFIG_VFIO
> @@ -23,6 +24,11 @@ void s390_pci_end_dma_count(S390pciState *s, S390PCIDMACount *cnt);
>   int s390_pci_probe_interp(S390PCIBusDevice *pbdev);
>   int s390_pci_set_interp(S390PCIBusDevice *pbdev, bool enable);
>   int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev);
> +int s390_pci_probe_aif(S390PCIBusDevice *pbdev);
> +int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
> +                     bool assist);
> +int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist);
> +
>   void s390_pci_get_clp_info(S390PCIBusDevice *pbdev);
>   #else
>   static inline bool s390_pci_update_dma_avail(int fd, unsigned int *avail)
> @@ -48,6 +54,20 @@ static inline int s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
>   {
>       return -EINVAL;
>   }
> +static inline int s390_pci_probe_aif(S390PCIBusDevice *pbdev)
> +{
> +    return -EINVAL;
> +}
> +static inline int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib,
> +                                   bool enable, bool assist)
> +{
> +    return -EINVAL;
> +}
> +static inline int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable,
> +                                   bool assist)
> +{
> +    return -EINVAL;
> +}
>   static inline void s390_pci_get_clp_info(S390PCIBusDevice *pbdev) { }
>   #endif
>   
> 


apart of the naming (which may be a personal taste) LGTM



-- 
Pierre Morel
IBM Lab Boeblingen


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 6/9] s390x/pci: enable adapter event notification for interpreted devices
  2022-01-31 15:10     ` Pierre Morel
@ 2022-01-31 17:08       ` Matthew Rosato
  -1 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-31 17:08 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: alex.williamson, schnelle, cohuck, thuth, farman,
	richard.henderson, david, pasic, borntraeger, mst, pbonzini,
	qemu-devel, kvm

On 1/31/22 10:10 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:38, Matthew Rosato wrote:
>> Use the associated vfio feature ioctl to enable adapter event 
>> notification
>> and forwarding for devices when requested.  This feature will be set up
>> with or without firmware assist based upon the 'intassist' setting.
> 
> It is a personal opinion but I do not like the term 'intassist'.
> Why not using 'gisa' ?

Well, I don't think 'gisa' is quite accurate, we're not toggling the 
delivery mechanism here, we're toggling who is responsible for doing the 
delivery (forwarding).  The default (intassist=on) says firmware will do 
it if it can and if not, KVM will be tasked with delivery. 
intassist=off means KVM will always be told to do the delivery.

> I would find it more relevant.
> If something for the profane admin then interrupt_assist or irq_assist 
> or even int_assist.
> 
> Or is it forwarding_assist ?

Yeah, perahps 'forwarding_assist' is the right phrase, as we are 
disabling firmware's ability to assist in the forwarding of guest 
interrupts.

> 
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   hw/s390x/s390-pci-bus.c          | 24 ++++++++--
>>   hw/s390x/s390-pci-inst.c         | 54 +++++++++++++++++++++-
>>   hw/s390x/s390-pci-vfio.c         | 79 ++++++++++++++++++++++++++++++++
>>   include/hw/s390x/s390-pci-bus.h  |  1 +
>>   include/hw/s390x/s390-pci-vfio.h | 20 ++++++++
>>   5 files changed, 173 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>> index 66649af6e0..6ee70446ca 100644
>> --- a/hw/s390x/s390-pci-bus.c
>> +++ b/hw/s390x/s390-pci-bus.c
>> @@ -189,7 +189,10 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
>>           rc = SCLP_RC_NO_ACTION_REQUIRED;
>>           break;
>>       default:
>> -        if (pbdev->summary_ind) {
>> +        if (pbdev->interp) {
>> +            /* Interpreted devices were using interrupt forwarding */
>> +            s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
>> +        } else if (pbdev->summary_ind) {
>>               pci_dereg_irqs(pbdev);
>>           }
>>           if (pbdev->iommu->enabled) {
>> @@ -981,6 +984,11 @@ static int s390_pci_interp_plug(S390pciState *s, 
>> S390PCIBusDevice *pbdev)
>>           return rc;
>>       }
>> +    rc = s390_pci_probe_aif(pbdev);
>> +    if (rc) {
>> +        return rc;
>> +    }
>> +
>>       rc = s390_pci_update_passthrough_fh(pbdev);
>>       if (rc) {
>>           return rc;
>> @@ -1076,6 +1084,7 @@ static void s390_pcihost_plug(HotplugHandler 
>> *hotplug_dev, DeviceState *dev,
>>               if (pbdev->interp && 
>> !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
>>                       DPRINTF("zPCI interpretation facilities 
>> missing.\n");
>>                       pbdev->interp = false;
>> +                    pbdev->intassist = false;
>>                   }
>>               if (pbdev->interp) {
>>                   rc = s390_pci_interp_plug(s, pbdev);
>> @@ -1090,11 +1099,13 @@ static void s390_pcihost_plug(HotplugHandler 
>> *hotplug_dev, DeviceState *dev,
>>               if (!pbdev->interp) {
>>                   /* Do vfio passthrough but intercept for I/O */
>>                   pbdev->fh |= FH_SHM_VFIO;
>> +                pbdev->intassist = false;
>>               }
>>           } else {
>>               pbdev->fh |= FH_SHM_EMUL;
>>               /* Always intercept emulated devices */
>>               pbdev->interp = false;
>> +            pbdev->intassist = false;
>>           }
>>           if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
>> @@ -1244,7 +1255,10 @@ static void s390_pcihost_reset(DeviceState *dev)
>>       /* Process all pending unplug requests */
>>       QTAILQ_FOREACH_SAFE(pbdev, &s->zpci_devs, link, next) {
>>           if (pbdev->unplug_requested) {
>> -            if (pbdev->summary_ind) {
>> +            if (pbdev->interp) {
>> +                /* Interpreted devices were using interrupt 
>> forwarding */
>> +                s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
>> +            } else if (pbdev->summary_ind) {
>>                   pci_dereg_irqs(pbdev);
>>               }
>>               if (pbdev->iommu->enabled) {
>> @@ -1382,7 +1396,10 @@ static void s390_pci_device_reset(DeviceState 
>> *dev)
>>           break;
>>       }
>> -    if (pbdev->summary_ind) {
>> +    if (pbdev->interp) {
>> +        /* Interpreted devices were using interrupt forwarding */
>> +        s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
>> +    } else if (pbdev->summary_ind) {
>>           pci_dereg_irqs(pbdev);
>>       }
>>       if (pbdev->iommu->enabled) {
>> @@ -1428,6 +1445,7 @@ static Property s390_pci_device_properties[] = {
>>       DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
>>       DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
>>       DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
>> +    DEFINE_PROP_BOOL("intassist", S390PCIBusDevice, intassist, true),
> 
> We allow to disable IRQ forwarding only for test or debug purpose don't we?
> Then shouldn't we make it explicit ?

Not sure I follow.

This setting makes forwarding assist (intassist=on) an implicit default 
which can be subsequently overruled by an explicit 'intassist=off'.

So, the idea is that by default if interpretation is available and 
enabled we will also use the firmware assist to do interrupt forwarding 
-- This is the preferred setup (performance) and nothing needs to be 
explicitly specified.  Explicitly specifying 'intassist=on' does nothing 
additional.

Conversely, if you want to disable the firmware forwarding (for as you 
say a test/debug scenario) only then must you explicitly specify 
'intassist=off'.

Or were you instead suggesting to make the bool a default of false that 
must be explicity set to true?  In that case, it would probably need a 
different name (e.g. implicit default is host_forwarding=off, override 
with host_forwarding=on and only when host_forwarding=on will we force 
host delivery of interrupts)

FWIW, I chose a default of 'true' because I preferred the idea of having 
to specify something negative like 'intassist=off' when forcing host 
delivery as this implies you are turning a feature off (because you are).

> 
>>       DEFINE_PROP_END_OF_LIST(),
>>   };
>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>> index e9a0dc12e4..121e07cc41 100644
>> --- a/hw/s390x/s390-pci-inst.c
>> +++ b/hw/s390x/s390-pci-inst.c
>> @@ -1111,6 +1111,46 @@ static void fmb_update(void *opaque)
>>       timer_mod(pbdev->fmb_timer, t + pbdev->pci_group->zpci_group.mui);
>>   }
>> +static int mpcifc_reg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
>> +{
>> +    int rc;
>> +
>> +    /* Interpreted devices must also use interrupt forwarding */
>> +    rc = s390_pci_get_aif(pbdev, false, pbdev->intassist);
>> +    if (rc) {
>> +        DPRINTF("Bad interrupt forwarding state\n");
>> +        return rc;
>> +    }
>> +
>> +    rc = s390_pci_set_aif(pbdev, fib, true, pbdev->intassist);
>> +    if (rc) {
>> +        DPRINTF("Failed to enable interrupt forwarding\n");
>> +        return rc;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int mpcifc_dereg_int_interp(S390PCIBusDevice *pbdev, ZpciFib 
>> *fib)
>> +{
>> +    int rc;
>> +
>> +    /* Interpreted devices were using interrupt forwarding */
>> +    rc = s390_pci_get_aif(pbdev, true, pbdev->intassist);
>> +    if (rc) {
>> +        DPRINTF("Bad interrupt forwarding state\n");
>> +        return rc;
>> +    }
>> +
>> +    rc = s390_pci_set_aif(pbdev, fib, false, pbdev->intassist);
>> +    if (rc) {
>> +        DPRINTF("Failed to disable interrupt forwarding\n");
>> +        return rc;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, 
>> uint8_t ar,
>>                           uintptr_t ra)
>>   {
>> @@ -1165,7 +1205,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t 
>> r1, uint64_t fiba, uint8_t ar,
>>       switch (oc) {
>>       case ZPCI_MOD_FC_REG_INT:
>> -        if (pbdev->summary_ind) {
>> +        if (pbdev->interp) {
>> +            if (mpcifc_reg_int_interp(pbdev, &fib)) {
>> +                cc = ZPCI_PCI_LS_ERR;
>> +                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>> +            }
>> +        } else if (pbdev->summary_ind) {
>>               cc = ZPCI_PCI_LS_ERR;
>>               s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>>           } else if (reg_irqs(env, pbdev, fib)) {
>> @@ -1174,7 +1219,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t 
>> r1, uint64_t fiba, uint8_t ar,
>>           }
>>           break;
>>       case ZPCI_MOD_FC_DEREG_INT:
>> -        if (!pbdev->summary_ind) {
>> +        if (pbdev->interp) {
>> +            if (mpcifc_dereg_int_interp(pbdev, &fib)) {
>> +                cc = ZPCI_PCI_LS_ERR;
>> +                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>> +            }
>> +        } else if (!pbdev->summary_ind) {
>>               cc = ZPCI_PCI_LS_ERR;
>>               s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>>           } else {
>> diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
>> index 2cab3a9e89..73f3b3ed19 100644
>> --- a/hw/s390x/s390-pci-vfio.c
>> +++ b/hw/s390x/s390-pci-vfio.c
>> @@ -149,6 +149,85 @@ int 
>> s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
>>       return 0;
>>   }
>> +int s390_pci_probe_aif(S390PCIBusDevice *pbdev)
>> +{
>> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
>> +    struct vfio_device_feature feat = {
>> +        .argsz = sizeof(struct vfio_device_feature),
>> +        .flags = VFIO_DEVICE_FEATURE_PROBE + 
>> VFIO_DEVICE_FEATURE_ZPCI_AIF
>> +    };
>> +
>> +    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat);
>> +}
>> +
>> +int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
>> +                     bool assist)
>> +{
>> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
>> +    struct vfio_device_zpci_aif *data;
>> +    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
>> +    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
>> +
>> +    feat->argsz = size;
>> +    feat->flags = VFIO_DEVICE_FEATURE_SET + 
>> VFIO_DEVICE_FEATURE_ZPCI_AIF;
>> +
>> +    data = (struct vfio_device_zpci_aif *)&feat->data;
>> +    if (enable) {
>> +        data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
>> +        if (!pbdev->intassist) {
>> +            data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
>> +        }
>> +        /* Fill in the guest fib info */
>> +        data->ibv = fib->aibv;
>> +        data->sb = fib->aisb;
>> +        data->noi = FIB_DATA_NOI(fib->data);
>> +        data->isc = FIB_DATA_ISC(fib->data);
>> +        data->sbo = FIB_DATA_AISBO(fib->data);
>> +    } else {
>> +        data->flags = 0;
>> +    }
>> +
>> +    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
>> +}
>> +
>> +int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist)
>> +{
>> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
>> +    struct vfio_device_zpci_aif *data;
>> +    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
>> +    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
>> +    int rc;
>> +
>> +    feat->argsz = size;
>> +    feat->flags = VFIO_DEVICE_FEATURE_GET + 
>> VFIO_DEVICE_FEATURE_ZPCI_AIF;
>> +
>> +    rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
>> +    if (rc) {
>> +        return rc;
>> +    }
>> +
>> +    /* Determine if current interrupt settings match the host */
>> +    data = (struct vfio_device_zpci_aif *)&feat->data;
>> +    if (enable && (!(data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT))) {
>> +        rc = -EINVAL;
>> +    } else if (!enable && (data->flags & 
>> VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT)) {
>> +        rc = -EINVAL;
>> +    }
>> +
>> +    /*
>> +     * When enabled for interrupts, the assist and forced 
>> host-delivery are
>> +     * mututally exclusive
>> +     */
> 
> assist is unclear for me int_assist , forwarding_assist ?
> 

Yes, my wording here is confusing.  Here, 'enable' is meant to represent 
whether forwarding (AIF) is currently enabled.  And then 'assist' is the 
intassist (forwarding_assist) setting.  And my use of 'mutually 
exclusive' is just plain confusing and should be removed.

I think that otherwise the code is logically correct -- I'll re-visit 
this and change the variable names and/or improve the comments.  To be 
clear, what we are trying to verify here is:

1) if qemu believes AIF is currently disabled, then kernel should also 
show the same (AIF_FLOAT is off)

2) if qemu believes AIF is already enabled, kernel should also show the 
same (AIF_FLOAT is on).  Additionally, if qemu believes host delivery is 
forced (instassist=off) then kernel should also show this (AIF_HOST is on).
This is where my really confusing use of 'mutally exclusive' comes in to 
play -- the AIF_HOST kernel state is an inversion of the QEMU state, so 
the values should never match -- if intassist=off then AIF_HOST should 
be on.  If intassist=on then AIF_HOST should be off.  But really the 
comment should just read that we are ensuring that kernel space 
indicates what we expect for the current state.



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 6/9] s390x/pci: enable adapter event notification for interpreted devices
@ 2022-01-31 17:08       ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-31 17:08 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: farman, kvm, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

On 1/31/22 10:10 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:38, Matthew Rosato wrote:
>> Use the associated vfio feature ioctl to enable adapter event 
>> notification
>> and forwarding for devices when requested.  This feature will be set up
>> with or without firmware assist based upon the 'intassist' setting.
> 
> It is a personal opinion but I do not like the term 'intassist'.
> Why not using 'gisa' ?

Well, I don't think 'gisa' is quite accurate, we're not toggling the 
delivery mechanism here, we're toggling who is responsible for doing the 
delivery (forwarding).  The default (intassist=on) says firmware will do 
it if it can and if not, KVM will be tasked with delivery. 
intassist=off means KVM will always be told to do the delivery.

> I would find it more relevant.
> If something for the profane admin then interrupt_assist or irq_assist 
> or even int_assist.
> 
> Or is it forwarding_assist ?

Yeah, perahps 'forwarding_assist' is the right phrase, as we are 
disabling firmware's ability to assist in the forwarding of guest 
interrupts.

> 
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   hw/s390x/s390-pci-bus.c          | 24 ++++++++--
>>   hw/s390x/s390-pci-inst.c         | 54 +++++++++++++++++++++-
>>   hw/s390x/s390-pci-vfio.c         | 79 ++++++++++++++++++++++++++++++++
>>   include/hw/s390x/s390-pci-bus.h  |  1 +
>>   include/hw/s390x/s390-pci-vfio.h | 20 ++++++++
>>   5 files changed, 173 insertions(+), 5 deletions(-)
>>
>> diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
>> index 66649af6e0..6ee70446ca 100644
>> --- a/hw/s390x/s390-pci-bus.c
>> +++ b/hw/s390x/s390-pci-bus.c
>> @@ -189,7 +189,10 @@ void s390_pci_sclp_deconfigure(SCCB *sccb)
>>           rc = SCLP_RC_NO_ACTION_REQUIRED;
>>           break;
>>       default:
>> -        if (pbdev->summary_ind) {
>> +        if (pbdev->interp) {
>> +            /* Interpreted devices were using interrupt forwarding */
>> +            s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
>> +        } else if (pbdev->summary_ind) {
>>               pci_dereg_irqs(pbdev);
>>           }
>>           if (pbdev->iommu->enabled) {
>> @@ -981,6 +984,11 @@ static int s390_pci_interp_plug(S390pciState *s, 
>> S390PCIBusDevice *pbdev)
>>           return rc;
>>       }
>> +    rc = s390_pci_probe_aif(pbdev);
>> +    if (rc) {
>> +        return rc;
>> +    }
>> +
>>       rc = s390_pci_update_passthrough_fh(pbdev);
>>       if (rc) {
>>           return rc;
>> @@ -1076,6 +1084,7 @@ static void s390_pcihost_plug(HotplugHandler 
>> *hotplug_dev, DeviceState *dev,
>>               if (pbdev->interp && 
>> !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
>>                       DPRINTF("zPCI interpretation facilities 
>> missing.\n");
>>                       pbdev->interp = false;
>> +                    pbdev->intassist = false;
>>                   }
>>               if (pbdev->interp) {
>>                   rc = s390_pci_interp_plug(s, pbdev);
>> @@ -1090,11 +1099,13 @@ static void s390_pcihost_plug(HotplugHandler 
>> *hotplug_dev, DeviceState *dev,
>>               if (!pbdev->interp) {
>>                   /* Do vfio passthrough but intercept for I/O */
>>                   pbdev->fh |= FH_SHM_VFIO;
>> +                pbdev->intassist = false;
>>               }
>>           } else {
>>               pbdev->fh |= FH_SHM_EMUL;
>>               /* Always intercept emulated devices */
>>               pbdev->interp = false;
>> +            pbdev->intassist = false;
>>           }
>>           if (s390_pci_msix_init(pbdev) && !pbdev->interp) {
>> @@ -1244,7 +1255,10 @@ static void s390_pcihost_reset(DeviceState *dev)
>>       /* Process all pending unplug requests */
>>       QTAILQ_FOREACH_SAFE(pbdev, &s->zpci_devs, link, next) {
>>           if (pbdev->unplug_requested) {
>> -            if (pbdev->summary_ind) {
>> +            if (pbdev->interp) {
>> +                /* Interpreted devices were using interrupt 
>> forwarding */
>> +                s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
>> +            } else if (pbdev->summary_ind) {
>>                   pci_dereg_irqs(pbdev);
>>               }
>>               if (pbdev->iommu->enabled) {
>> @@ -1382,7 +1396,10 @@ static void s390_pci_device_reset(DeviceState 
>> *dev)
>>           break;
>>       }
>> -    if (pbdev->summary_ind) {
>> +    if (pbdev->interp) {
>> +        /* Interpreted devices were using interrupt forwarding */
>> +        s390_pci_set_aif(pbdev, NULL, false, pbdev->intassist);
>> +    } else if (pbdev->summary_ind) {
>>           pci_dereg_irqs(pbdev);
>>       }
>>       if (pbdev->iommu->enabled) {
>> @@ -1428,6 +1445,7 @@ static Property s390_pci_device_properties[] = {
>>       DEFINE_PROP_S390_PCI_FID("fid", S390PCIBusDevice, fid),
>>       DEFINE_PROP_STRING("target", S390PCIBusDevice, target),
>>       DEFINE_PROP_BOOL("interp", S390PCIBusDevice, interp, true),
>> +    DEFINE_PROP_BOOL("intassist", S390PCIBusDevice, intassist, true),
> 
> We allow to disable IRQ forwarding only for test or debug purpose don't we?
> Then shouldn't we make it explicit ?

Not sure I follow.

This setting makes forwarding assist (intassist=on) an implicit default 
which can be subsequently overruled by an explicit 'intassist=off'.

So, the idea is that by default if interpretation is available and 
enabled we will also use the firmware assist to do interrupt forwarding 
-- This is the preferred setup (performance) and nothing needs to be 
explicitly specified.  Explicitly specifying 'intassist=on' does nothing 
additional.

Conversely, if you want to disable the firmware forwarding (for as you 
say a test/debug scenario) only then must you explicitly specify 
'intassist=off'.

Or were you instead suggesting to make the bool a default of false that 
must be explicity set to true?  In that case, it would probably need a 
different name (e.g. implicit default is host_forwarding=off, override 
with host_forwarding=on and only when host_forwarding=on will we force 
host delivery of interrupts)

FWIW, I chose a default of 'true' because I preferred the idea of having 
to specify something negative like 'intassist=off' when forcing host 
delivery as this implies you are turning a feature off (because you are).

> 
>>       DEFINE_PROP_END_OF_LIST(),
>>   };
>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>> index e9a0dc12e4..121e07cc41 100644
>> --- a/hw/s390x/s390-pci-inst.c
>> +++ b/hw/s390x/s390-pci-inst.c
>> @@ -1111,6 +1111,46 @@ static void fmb_update(void *opaque)
>>       timer_mod(pbdev->fmb_timer, t + pbdev->pci_group->zpci_group.mui);
>>   }
>> +static int mpcifc_reg_int_interp(S390PCIBusDevice *pbdev, ZpciFib *fib)
>> +{
>> +    int rc;
>> +
>> +    /* Interpreted devices must also use interrupt forwarding */
>> +    rc = s390_pci_get_aif(pbdev, false, pbdev->intassist);
>> +    if (rc) {
>> +        DPRINTF("Bad interrupt forwarding state\n");
>> +        return rc;
>> +    }
>> +
>> +    rc = s390_pci_set_aif(pbdev, fib, true, pbdev->intassist);
>> +    if (rc) {
>> +        DPRINTF("Failed to enable interrupt forwarding\n");
>> +        return rc;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int mpcifc_dereg_int_interp(S390PCIBusDevice *pbdev, ZpciFib 
>> *fib)
>> +{
>> +    int rc;
>> +
>> +    /* Interpreted devices were using interrupt forwarding */
>> +    rc = s390_pci_get_aif(pbdev, true, pbdev->intassist);
>> +    if (rc) {
>> +        DPRINTF("Bad interrupt forwarding state\n");
>> +        return rc;
>> +    }
>> +
>> +    rc = s390_pci_set_aif(pbdev, fib, false, pbdev->intassist);
>> +    if (rc) {
>> +        DPRINTF("Failed to disable interrupt forwarding\n");
>> +        return rc;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   int mpcifc_service_call(S390CPU *cpu, uint8_t r1, uint64_t fiba, 
>> uint8_t ar,
>>                           uintptr_t ra)
>>   {
>> @@ -1165,7 +1205,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t 
>> r1, uint64_t fiba, uint8_t ar,
>>       switch (oc) {
>>       case ZPCI_MOD_FC_REG_INT:
>> -        if (pbdev->summary_ind) {
>> +        if (pbdev->interp) {
>> +            if (mpcifc_reg_int_interp(pbdev, &fib)) {
>> +                cc = ZPCI_PCI_LS_ERR;
>> +                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>> +            }
>> +        } else if (pbdev->summary_ind) {
>>               cc = ZPCI_PCI_LS_ERR;
>>               s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>>           } else if (reg_irqs(env, pbdev, fib)) {
>> @@ -1174,7 +1219,12 @@ int mpcifc_service_call(S390CPU *cpu, uint8_t 
>> r1, uint64_t fiba, uint8_t ar,
>>           }
>>           break;
>>       case ZPCI_MOD_FC_DEREG_INT:
>> -        if (!pbdev->summary_ind) {
>> +        if (pbdev->interp) {
>> +            if (mpcifc_dereg_int_interp(pbdev, &fib)) {
>> +                cc = ZPCI_PCI_LS_ERR;
>> +                s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>> +            }
>> +        } else if (!pbdev->summary_ind) {
>>               cc = ZPCI_PCI_LS_ERR;
>>               s390_set_status_code(env, r1, ZPCI_MOD_ST_SEQUENCE);
>>           } else {
>> diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
>> index 2cab3a9e89..73f3b3ed19 100644
>> --- a/hw/s390x/s390-pci-vfio.c
>> +++ b/hw/s390x/s390-pci-vfio.c
>> @@ -149,6 +149,85 @@ int 
>> s390_pci_update_passthrough_fh(S390PCIBusDevice *pbdev)
>>       return 0;
>>   }
>> +int s390_pci_probe_aif(S390PCIBusDevice *pbdev)
>> +{
>> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
>> +    struct vfio_device_feature feat = {
>> +        .argsz = sizeof(struct vfio_device_feature),
>> +        .flags = VFIO_DEVICE_FEATURE_PROBE + 
>> VFIO_DEVICE_FEATURE_ZPCI_AIF
>> +    };
>> +
>> +    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, &feat);
>> +}
>> +
>> +int s390_pci_set_aif(S390PCIBusDevice *pbdev, ZpciFib *fib, bool enable,
>> +                     bool assist)
>> +{
>> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
>> +    struct vfio_device_zpci_aif *data;
>> +    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
>> +    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
>> +
>> +    feat->argsz = size;
>> +    feat->flags = VFIO_DEVICE_FEATURE_SET + 
>> VFIO_DEVICE_FEATURE_ZPCI_AIF;
>> +
>> +    data = (struct vfio_device_zpci_aif *)&feat->data;
>> +    if (enable) {
>> +        data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
>> +        if (!pbdev->intassist) {
>> +            data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
>> +        }
>> +        /* Fill in the guest fib info */
>> +        data->ibv = fib->aibv;
>> +        data->sb = fib->aisb;
>> +        data->noi = FIB_DATA_NOI(fib->data);
>> +        data->isc = FIB_DATA_ISC(fib->data);
>> +        data->sbo = FIB_DATA_AISBO(fib->data);
>> +    } else {
>> +        data->flags = 0;
>> +    }
>> +
>> +    return ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
>> +}
>> +
>> +int s390_pci_get_aif(S390PCIBusDevice *pbdev, bool enable, bool assist)
>> +{
>> +    VFIOPCIDevice *vdev = VFIO_PCI(pbdev->pdev);
>> +    struct vfio_device_zpci_aif *data;
>> +    int size = sizeof(struct vfio_device_feature) + sizeof(*data);
>> +    g_autofree struct vfio_device_feature *feat = g_malloc0(size);
>> +    int rc;
>> +
>> +    feat->argsz = size;
>> +    feat->flags = VFIO_DEVICE_FEATURE_GET + 
>> VFIO_DEVICE_FEATURE_ZPCI_AIF;
>> +
>> +    rc = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_FEATURE, feat);
>> +    if (rc) {
>> +        return rc;
>> +    }
>> +
>> +    /* Determine if current interrupt settings match the host */
>> +    data = (struct vfio_device_zpci_aif *)&feat->data;
>> +    if (enable && (!(data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT))) {
>> +        rc = -EINVAL;
>> +    } else if (!enable && (data->flags & 
>> VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT)) {
>> +        rc = -EINVAL;
>> +    }
>> +
>> +    /*
>> +     * When enabled for interrupts, the assist and forced 
>> host-delivery are
>> +     * mututally exclusive
>> +     */
> 
> assist is unclear for me int_assist , forwarding_assist ?
> 

Yes, my wording here is confusing.  Here, 'enable' is meant to represent 
whether forwarding (AIF) is currently enabled.  And then 'assist' is the 
intassist (forwarding_assist) setting.  And my use of 'mutually 
exclusive' is just plain confusing and should be removed.

I think that otherwise the code is logically correct -- I'll re-visit 
this and change the variable names and/or improve the comments.  To be 
clear, what we are trying to verify here is:

1) if qemu believes AIF is currently disabled, then kernel should also 
show the same (AIF_FLOAT is off)

2) if qemu believes AIF is already enabled, kernel should also show the 
same (AIF_FLOAT is on).  Additionally, if qemu believes host delivery is 
forced (instassist=off) then kernel should also show this (AIF_HOST is on).
This is where my really confusing use of 'mutally exclusive' comes in to 
play -- the AIF_HOST kernel state is an inversion of the QEMU state, so 
the values should never match -- if intassist=off then AIF_HOST should 
be on.  If intassist=on then AIF_HOST should be off.  But really the 
comment should just read that we are ensuring that kernel space 
indicates what we expect for the current state.




^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 4/9] s390x/pci: enable for load/store intepretation
  2022-01-31 14:46   ` Pierre Morel
@ 2022-01-31 17:11     ` Matthew Rosato
  0 siblings, 0 replies; 44+ messages in thread
From: Matthew Rosato @ 2022-01-31 17:11 UTC (permalink / raw)
  To: Pierre Morel, qemu-s390x
  Cc: farman, kvm, schnelle, cohuck, richard.henderson, thuth,
	qemu-devel, pasic, alex.williamson, mst, pbonzini, david,
	borntraeger

On 1/31/22 9:46 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:38, Matthew Rosato wrote:
>> Use the associated vfio feature ioctl to enable interpretation for 
>> devices
>> when requested.  As part of this process, we must use the host function
>> handle rather than a QEMU-generated one -- this is provided as part of 
>> the
>> ioctl payload.
> 
> I wonder if we should not explain here that having interpretation as a 
> default and silently fall back to interception allows backward 
> compatibility while allowing performence be chosing by default.
> (You can say it better as I do :) )

Good suggestion, I'll think of something to add to the commit message.


>> @@ -1022,12 +1068,33 @@ static void s390_pcihost_plug(HotplugHandler 
>> *hotplug_dev, DeviceState *dev,
>>           set_pbdev_info(pbdev);
>>           if (object_dynamic_cast(OBJECT(dev), "vfio-pci")) {
>> -            pbdev->fh |= FH_SHM_VFIO;
>> +            /*
>> +             * By default, interpretation is always requested; if the 
>> available
>> +             * facilities indicate it is not available, fallback to the
>> +             * intercept model.
> 
> s/intercept/interception/ ?
> 

OK

>> +             */
>> +            if (pbdev->interp && 
>> !s390_has_feat(S390_FEAT_ZPCI_INTERP)) {
>> +                    DPRINTF("zPCI interpretation facilities 
>> missing.\n");
>> +                    pbdev->interp = false;
>> +                }
>> +            if (pbdev->interp) {
>> +                rc = s390_pci_interp_plug(s, pbdev);
>> +                if (rc) {
>> +                    error_setg(errp, "zpci interp plug failed: %d", rc);
>> +                    return;
>> +                }
>> +            }
> 
> 
> Can't we rearrange that as
> if (pbdev->interp) {
>      if (s390_has_feat) {
>      } else {
>      }
> }

Yep, sure

...

> 
> LGTM
> With the corrections proposed by Thomas.
> Mine... you see what you prefer.
> 
> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>

Thanks!


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 0/9] s390x/pci: zPCI interpretation support
  2022-01-14 20:38 ` Matthew Rosato
@ 2022-02-04 12:39   ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2022-02-04 12:39 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: qemu-s390x, alex.williamson, schnelle, cohuck, thuth, farman,
	pmorel, richard.henderson, david, pasic, borntraeger, pbonzini,
	qemu-devel, kvm

On Fri, Jan 14, 2022 at 03:38:40PM -0500, Matthew Rosato wrote:
> For QEMU, the majority of the work in enabling instruction interpretation
> is handled via new VFIO ioctls to SET the appropriate interpretation and
> interrupt forwarding modes, and to GET the function handle to use for
> interpretive execution.  
> 
> This series implements these new ioctls, as well as adding a new, optional
> 'intercept' parameter to zpci to request interpretation support not be used
> as well as an 'intassist' parameter to determine whether or not the
> firmware assist will be used for interrupt delivery or whether the host
> will be responsible for delivering all interrupts.
> 
> The ZPCI_INTERP CPU feature is added beginning with the z14 model to
> enable this support.
> 
> As a consequence of implementing zPCI interpretation, ISM devices now
> become eligible for passthrough (but only when zPCI interpretation is
> available).
> 
> >From the perspective of guest configuration, you passthrough zPCI devices
> in the same manner as before, with intepretation support being used by
> default if available in kernel+qemu.
> 
> Associated kernel series:
> https://lore.kernel.org/kvm/20220114203145.242984-1-mjrosato@linux.ibm.com/

I took a quick look and don't see much to object to, but
I don't know much about the topic either.

Acked-by: Michael S. Tsirkin <mst@redhat.com>

Pls use the s390 tree for this.

> Changes v1->v2:
> 
> - Update kernel headers sync                                                    
> - Drop some pre-req patches that are now merged                                 
> - Add some R-bs (Thanks!)                                                       
> - fence FEAT_ZPCI_INTERP for QEMU 6.2 and older (Christian)                     
> - switch from container_of to VFIO_PCI and drop asserts (Thomas)                
> - re-arrange g_autofree so we malloc at time of declaration (Thomas) 
> 
> Matthew Rosato (9):
>   Update linux headers
>   target/s390x: add zpci-interp to cpu models
>   fixup: force interp off for QEMU machine 6.2 and older
>   s390x/pci: enable for load/store intepretation
>   s390x/pci: don't fence interpreted devices without MSI-X
>   s390x/pci: enable adapter event notification for interpreted devices
>   s390x/pci: use I/O Address Translation assist when interpreting
>   s390x/pci: use dtsm provided from vfio capabilities for interpreted
>     devices
>   s390x/pci: let intercept devices have separate PCI groups
> 
>  hw/s390x/s390-pci-bus.c                       | 121 ++++++++++-
>  hw/s390x/s390-pci-inst.c                      | 168 ++++++++++++++-
>  hw/s390x/s390-pci-vfio.c                      | 204 +++++++++++++++++-
>  hw/s390x/s390-virtio-ccw.c                    |   1 +
>  include/hw/s390x/s390-pci-bus.h               |   8 +-
>  include/hw/s390x/s390-pci-inst.h              |   2 +-
>  include/hw/s390x/s390-pci-vfio.h              |  45 ++++
>  include/standard-headers/asm-x86/kvm_para.h   |   1 +
>  include/standard-headers/drm/drm_fourcc.h     |  11 +
>  include/standard-headers/linux/ethtool.h      |   1 +
>  include/standard-headers/linux/fuse.h         |  60 +++++-
>  include/standard-headers/linux/pci_regs.h     |   4 +
>  include/standard-headers/linux/virtio_iommu.h |   8 +-
>  linux-headers/asm-mips/unistd_n32.h           |   1 +
>  linux-headers/asm-mips/unistd_n64.h           |   1 +
>  linux-headers/asm-mips/unistd_o32.h           |   1 +
>  linux-headers/asm-powerpc/unistd_32.h         |   1 +
>  linux-headers/asm-powerpc/unistd_64.h         |   1 +
>  linux-headers/asm-s390/kvm.h                  |   1 +
>  linux-headers/asm-s390/unistd_32.h            |   1 +
>  linux-headers/asm-s390/unistd_64.h            |   1 +
>  linux-headers/linux/kvm.h                     |   1 +
>  linux-headers/linux/vfio.h                    |  22 ++
>  linux-headers/linux/vfio_zdev.h               |  51 +++++
>  target/s390x/cpu_features_def.h.inc           |   1 +
>  target/s390x/gen-features.c                   |   2 +
>  target/s390x/kvm/kvm.c                        |   1 +
>  27 files changed, 693 insertions(+), 27 deletions(-)
> 
> -- 
> 2.27.0


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v2 0/9] s390x/pci: zPCI interpretation support
@ 2022-02-04 12:39   ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2022-02-04 12:39 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: thuth, alex.williamson, kvm, pmorel, schnelle, cohuck,
	richard.henderson, farman, qemu-devel, pasic, qemu-s390x,
	pbonzini, david, borntraeger

On Fri, Jan 14, 2022 at 03:38:40PM -0500, Matthew Rosato wrote:
> For QEMU, the majority of the work in enabling instruction interpretation
> is handled via new VFIO ioctls to SET the appropriate interpretation and
> interrupt forwarding modes, and to GET the function handle to use for
> interpretive execution.  
> 
> This series implements these new ioctls, as well as adding a new, optional
> 'intercept' parameter to zpci to request interpretation support not be used
> as well as an 'intassist' parameter to determine whether or not the
> firmware assist will be used for interrupt delivery or whether the host
> will be responsible for delivering all interrupts.
> 
> The ZPCI_INTERP CPU feature is added beginning with the z14 model to
> enable this support.
> 
> As a consequence of implementing zPCI interpretation, ISM devices now
> become eligible for passthrough (but only when zPCI interpretation is
> available).
> 
> >From the perspective of guest configuration, you passthrough zPCI devices
> in the same manner as before, with intepretation support being used by
> default if available in kernel+qemu.
> 
> Associated kernel series:
> https://lore.kernel.org/kvm/20220114203145.242984-1-mjrosato@linux.ibm.com/

I took a quick look and don't see much to object to, but
I don't know much about the topic either.

Acked-by: Michael S. Tsirkin <mst@redhat.com>

Pls use the s390 tree for this.

> Changes v1->v2:
> 
> - Update kernel headers sync                                                    
> - Drop some pre-req patches that are now merged                                 
> - Add some R-bs (Thanks!)                                                       
> - fence FEAT_ZPCI_INTERP for QEMU 6.2 and older (Christian)                     
> - switch from container_of to VFIO_PCI and drop asserts (Thomas)                
> - re-arrange g_autofree so we malloc at time of declaration (Thomas) 
> 
> Matthew Rosato (9):
>   Update linux headers
>   target/s390x: add zpci-interp to cpu models
>   fixup: force interp off for QEMU machine 6.2 and older
>   s390x/pci: enable for load/store intepretation
>   s390x/pci: don't fence interpreted devices without MSI-X
>   s390x/pci: enable adapter event notification for interpreted devices
>   s390x/pci: use I/O Address Translation assist when interpreting
>   s390x/pci: use dtsm provided from vfio capabilities for interpreted
>     devices
>   s390x/pci: let intercept devices have separate PCI groups
> 
>  hw/s390x/s390-pci-bus.c                       | 121 ++++++++++-
>  hw/s390x/s390-pci-inst.c                      | 168 ++++++++++++++-
>  hw/s390x/s390-pci-vfio.c                      | 204 +++++++++++++++++-
>  hw/s390x/s390-virtio-ccw.c                    |   1 +
>  include/hw/s390x/s390-pci-bus.h               |   8 +-
>  include/hw/s390x/s390-pci-inst.h              |   2 +-
>  include/hw/s390x/s390-pci-vfio.h              |  45 ++++
>  include/standard-headers/asm-x86/kvm_para.h   |   1 +
>  include/standard-headers/drm/drm_fourcc.h     |  11 +
>  include/standard-headers/linux/ethtool.h      |   1 +
>  include/standard-headers/linux/fuse.h         |  60 +++++-
>  include/standard-headers/linux/pci_regs.h     |   4 +
>  include/standard-headers/linux/virtio_iommu.h |   8 +-
>  linux-headers/asm-mips/unistd_n32.h           |   1 +
>  linux-headers/asm-mips/unistd_n64.h           |   1 +
>  linux-headers/asm-mips/unistd_o32.h           |   1 +
>  linux-headers/asm-powerpc/unistd_32.h         |   1 +
>  linux-headers/asm-powerpc/unistd_64.h         |   1 +
>  linux-headers/asm-s390/kvm.h                  |   1 +
>  linux-headers/asm-s390/unistd_32.h            |   1 +
>  linux-headers/asm-s390/unistd_64.h            |   1 +
>  linux-headers/linux/kvm.h                     |   1 +
>  linux-headers/linux/vfio.h                    |  22 ++
>  linux-headers/linux/vfio_zdev.h               |  51 +++++
>  target/s390x/cpu_features_def.h.inc           |   1 +
>  target/s390x/gen-features.c                   |   2 +
>  target/s390x/kvm/kvm.c                        |   1 +
>  27 files changed, 693 insertions(+), 27 deletions(-)
> 
> -- 
> 2.27.0



^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2022-02-04 12:53 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-14 20:38 [PATCH v2 0/9] s390x/pci: zPCI interpretation support Matthew Rosato
2022-01-14 20:38 ` Matthew Rosato
2022-01-14 20:38 ` [PATCH v2 1/9] Update linux headers Matthew Rosato
2022-01-14 20:38   ` Matthew Rosato
2022-01-14 20:38 ` [PATCH v2 2/9] target/s390x: add zpci-interp to cpu models Matthew Rosato
2022-01-14 20:38   ` Matthew Rosato
2022-01-14 20:38 ` [PATCH v2 3/9] fixup: force interp off for QEMU machine 6.2 and older Matthew Rosato
2022-01-14 20:38   ` Matthew Rosato
2022-01-14 21:04   ` Matthew Rosato
2022-01-14 21:04     ` Matthew Rosato
2022-01-14 20:38 ` [PATCH v2 4/9] s390x/pci: enable for load/store intepretation Matthew Rosato
2022-01-14 20:38   ` Matthew Rosato
2022-01-17 14:51   ` Thomas Huth
2022-01-17 14:51     ` Thomas Huth
2022-01-18 18:08     ` Matthew Rosato
2022-01-18 18:08       ` Matthew Rosato
2022-01-17 15:38   ` Thomas Huth
2022-01-17 15:38     ` Thomas Huth
2022-01-18 18:14     ` Matthew Rosato
2022-01-18 18:14       ` Matthew Rosato
2022-01-31  9:37   ` Pierre Morel
2022-01-31  9:37     ` Pierre Morel
2022-01-31 14:46   ` Pierre Morel
2022-01-31 17:11     ` Matthew Rosato
2022-01-14 20:38 ` [PATCH v2 5/9] s390x/pci: don't fence interpreted devices without MSI-X Matthew Rosato
2022-01-14 20:38   ` Matthew Rosato
2022-01-14 20:38 ` [PATCH v2 6/9] s390x/pci: enable adapter event notification for interpreted devices Matthew Rosato
2022-01-14 20:38   ` Matthew Rosato
2022-01-31 15:10   ` Pierre Morel
2022-01-31 15:10     ` Pierre Morel
2022-01-31 17:08     ` Matthew Rosato
2022-01-31 17:08       ` Matthew Rosato
2022-01-14 20:38 ` [PATCH v2 7/9] s390x/pci: use I/O Address Translation assist when interpreting Matthew Rosato
2022-01-14 20:38   ` Matthew Rosato
2022-01-14 20:38 ` [PATCH v2 8/9] s390x/pci: use dtsm provided from vfio capabilities for interpreted devices Matthew Rosato
2022-01-14 20:38   ` Matthew Rosato
2022-01-14 20:38 ` [PATCH v2 9/9] s390x/pci: let intercept devices have separate PCI groups Matthew Rosato
2022-01-14 20:38   ` Matthew Rosato
2022-01-17 15:23 ` [PATCH v2 0/9] s390x/pci: zPCI interpretation support Thomas Huth
2022-01-17 15:23   ` Thomas Huth
2022-01-18 18:32   ` Matthew Rosato
2022-01-18 18:32     ` Matthew Rosato
2022-02-04 12:39 ` Michael S. Tsirkin
2022-02-04 12:39   ` Michael S. Tsirkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.