* [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution
@ 2022-01-14 20:31 Matthew Rosato
  2022-01-14 20:31 ` [PATCH v2 01/30] s390/sclp: detect the zPCI load/store interpretation facility Matthew Rosato
                   ` (31 more replies)
  0 siblings, 32 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

Enable interpretive execution of zPCI instructions + adapter interruption
forwarding for s390x KVM vfio-pci.  This is done by introducing a series
of new vfio-pci feature ioctls that are unique to vfio-pci-zdev (s390x) and
are used to negotiate the various aspects of zPCI interpretation setup.
By allowing interpretation of zPCI instructions and firmware delivery of
interrupts to guests, we can significantly reduce the frequency of guest
SIE exits for zPCI.  We then see additional gains by handling a hot-path
instruction that can still intercept to the hypervisor (RPCIT) directly
in KVM.

From the perspective of guest configuration, you pass through zPCI devices
in the same manner as before, with interpretation support being used by
default if available in kernel+QEMU.

Will reply with a link to the associated QEMU series.

Changes v1->v2:
- s/has_zpci_interp/has_zpci_lsi/ (Christian)
- Added many R-bs / ACKs (Thanks!)
- Re-work zpci_set_irq_ctrl (Niklas)
- Simplify changes made for zpci_get_mdd (Niklas, Christian)
- 'KVM: s390: pci: add basic kvm_zdev structure' changes (Pierre)
- only build s390/kvm/pci.o when CONFIG_PCI
- Related to the above, add some more checks for
  IS_ENABLED(CONFIG_PCI) (Pierre)
- Drop set_kvm_facility until VSIE support (Christian)
- Use sclp check instead of stfle when setting ECB (Christian)
- remove unnecessary externs from header (Pierre)
- macro for checking if shadow ioat is initialized (Pierre)
- Fix interrupt case where we have both AEN and alert list (Christian)
- Re-work AEN setup to satisfy firmware requirements on all supported
  platforms
- V!=R changes (Niklas)
- vfio_pci_zdev_feat_* - check argsz against data size (Pierre, Alex)
- vfio_pci_zdev_{open,release} switch to return void (Alex)
- Related to the above, make kvm_s390_pci_*_probe functions return error
  if KVM is not registered for the device (Alex)
- Fix my probe implementation to ignore GET|SET, as these don't change
  the result of the probe.  I was erroneously performing the GET or SET
  operation if specified along with PROBE.
- A few additional fixes regarding ioctl implementation.  Return EINVAL
  if none of PROBE|GET|SET are specified.  Return EINVAL if both GET
  and SET are specified (without PROBE).
- New patch to return status from zpci_refresh_trans (Pierre, Niklas)
- And use that status when possible for KVM rpcit intercept (Pierre)
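
The PROBE|GET|SET rules listed above can be sketched in C as follows.  This
is only an illustration of the described behaviour, not the code from the
series: the FEAT_* flag values, enum feat_action and feat_check_flags() are
hypothetical names, and the exact form of the argsz check (and the choice to
skip it for PROBE) is an assumption.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the vfio feature ioctl operation flags. */
#define FEAT_GET	(1u << 16)
#define FEAT_SET	(1u << 17)
#define FEAT_PROBE	(1u << 18)

/* Hypothetical result type: which operation the ioctl should perform. */
enum feat_action { FEAT_DO_PROBE, FEAT_DO_GET, FEAT_DO_SET };

/*
 * Hypothetical helper.  Returns 0 and stores the action to take, or
 * -EINVAL for a flag combination the changelog describes as invalid.
 */
static int feat_check_flags(uint32_t flags, uint32_t argsz,
			    uint32_t data_size, enum feat_action *action)
{
	uint32_t op = flags & (FEAT_GET | FEAT_SET | FEAT_PROBE);

	/* None of PROBE|GET|SET specified. */
	if (!op)
		return -EINVAL;

	/* PROBE wins: GET/SET do not change the result and are not run. */
	if (op & FEAT_PROBE) {
		*action = FEAT_DO_PROBE;
		return 0;
	}

	/* Both GET and SET without PROBE. */
	if (op == (FEAT_GET | FEAT_SET))
		return -EINVAL;

	/* Assumed form of the size check: the buffer must hold the data. */
	if (argsz < data_size)
		return -EINVAL;

	*action = (op & FEAT_GET) ? FEAT_DO_GET : FEAT_DO_SET;
	return 0;
}
```

With this sketch, PROBE|SET reports FEAT_DO_PROBE without touching the
device, while GET|SET and an empty operation mask both yield -EINVAL.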

Matthew Rosato (30):
  s390/sclp: detect the zPCI load/store interpretation facility
  s390/sclp: detect the AISII facility
  s390/sclp: detect the AENI facility
  s390/sclp: detect the AISI facility
  s390/airq: pass more TPI info to airq handlers
  s390/airq: allow for airq structure that uses an input vector
  s390/pci: externalize the SIC operation controls and routine
  s390/pci: stash associated GISA designation
  s390/pci: export some routines related to RPCIT processing
  s390/pci: stash dtsm and maxstbl
  s390/pci: add helper function to find device by handle
  s390/pci: get SHM information from list pci
  s390/pci: return status from zpci_refresh_trans
  KVM: s390: pci: add basic kvm_zdev structure
  KVM: s390: pci: do initial setup for AEN interpretation
  KVM: s390: pci: enable host forwarding of Adapter Event Notifications
  KVM: s390: mechanism to enable guest zPCI Interpretation
  KVM: s390: pci: provide routines for enabling/disabling interpretation
  KVM: s390: pci: provide routines for enabling/disabling interrupt
    forwarding
  KVM: s390: pci: provide routines for enabling/disabling IOAT assist
  KVM: s390: pci: handle refresh of PCI translations
  KVM: s390: intercept the rpcit instruction
  vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV
  vfio-pci/zdev: wire up group notifier
  vfio-pci/zdev: wire up zPCI interpretive execution support
  vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support
  vfio-pci/zdev: wire up zPCI IOAT assist support
  vfio-pci/zdev: add DTSM to clp group capability
  KVM: s390: introduce CPU feature for zPCI Interpretation
  MAINTAINERS: additional files related kvm s390 pci passthrough

 MAINTAINERS                      |   2 +
 arch/s390/include/asm/airq.h     |   7 +-
 arch/s390/include/asm/kvm_host.h |   5 +
 arch/s390/include/asm/kvm_pci.h  |  62 +++
 arch/s390/include/asm/pci.h      |  12 +
 arch/s390/include/asm/pci_clp.h  |  11 +-
 arch/s390/include/asm/pci_dma.h  |   3 +
 arch/s390/include/asm/pci_insn.h |  31 +-
 arch/s390/include/asm/sclp.h     |   4 +
 arch/s390/include/asm/tpi.h      |  13 +
 arch/s390/include/uapi/asm/kvm.h |   1 +
 arch/s390/kvm/Makefile           |   2 +-
 arch/s390/kvm/interrupt.c        |  94 +++-
 arch/s390/kvm/kvm-s390.c         |  56 ++-
 arch/s390/kvm/kvm-s390.h         |  10 +
 arch/s390/kvm/pci.c              | 837 +++++++++++++++++++++++++++++++
 arch/s390/kvm/pci.h              |  59 +++
 arch/s390/kvm/priv.c             |  46 ++
 arch/s390/pci/pci.c              |  31 ++
 arch/s390/pci/pci_clp.c          |  31 +-
 arch/s390/pci/pci_dma.c          |   7 +-
 arch/s390/pci/pci_insn.c         |  15 +-
 arch/s390/pci/pci_irq.c          |  48 +-
 drivers/iommu/s390-iommu.c       |   4 +-
 drivers/s390/char/sclp_early.c   |   4 +
 drivers/s390/cio/airq.c          |  12 +-
 drivers/s390/cio/qdio_thinint.c  |   6 +-
 drivers/s390/crypto/ap_bus.c     |   9 +-
 drivers/s390/virtio/virtio_ccw.c |   6 +-
 drivers/vfio/pci/Kconfig         |  11 +
 drivers/vfio/pci/Makefile        |   2 +-
 drivers/vfio/pci/vfio_pci_core.c |   8 +
 drivers/vfio/pci/vfio_pci_zdev.c | 290 ++++++++++-
 include/linux/vfio_pci_core.h    |  42 +-
 include/uapi/linux/vfio.h        |  22 +
 include/uapi/linux/vfio_zdev.h   |  51 ++
 36 files changed, 1788 insertions(+), 66 deletions(-)
 create mode 100644 arch/s390/include/asm/kvm_pci.h
 create mode 100644 arch/s390/kvm/pci.c
 create mode 100644 arch/s390/kvm/pci.h

-- 
2.27.0



* [PATCH v2 01/30] s390/sclp: detect the zPCI load/store interpretation facility
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-14 20:31 ` [PATCH v2 02/30] s390/sclp: detect the AISII facility Matthew Rosato
                   ` (30 subsequent siblings)
  31 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel,
	Christian Borntraeger

Detect the zPCI Load/Store Interpretation facility.

Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/sclp.h   | 1 +
 drivers/s390/char/sclp_early.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
index c68ea35de498..58a4d3d354b7 100644
--- a/arch/s390/include/asm/sclp.h
+++ b/arch/s390/include/asm/sclp.h
@@ -88,6 +88,7 @@ struct sclp_info {
 	unsigned char has_diag318 : 1;
 	unsigned char has_sipl : 1;
 	unsigned char has_dirq : 1;
+	unsigned char has_zpci_lsi : 1;
 	unsigned int ibc;
 	unsigned int mtid;
 	unsigned int mtid_cp;
diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
index e9943a86c361..b88dd0da1231 100644
--- a/drivers/s390/char/sclp_early.c
+++ b/drivers/s390/char/sclp_early.c
@@ -45,6 +45,7 @@ static void __init sclp_early_facilities_detect(void)
 	sclp.has_gisaf = !!(sccb->fac118 & 0x08);
 	sclp.has_hvs = !!(sccb->fac119 & 0x80);
 	sclp.has_kss = !!(sccb->fac98 & 0x01);
+	sclp.has_zpci_lsi = !!(sccb->fac118 & 0x01);
 	if (sccb->fac85 & 0x02)
 		S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
 	if (sccb->fac91 & 0x40)
-- 
2.27.0



* [PATCH v2 02/30] s390/sclp: detect the AISII facility
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
  2022-01-14 20:31 ` [PATCH v2 01/30] s390/sclp: detect the zPCI load/store interpretation facility Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-14 20:31 ` [PATCH v2 03/30] s390/sclp: detect the AENI facility Matthew Rosato
                   ` (29 subsequent siblings)
  31 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel,
	Christian Borntraeger

Detect the Adapter Interruption Source ID Interpretation facility.

Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/sclp.h   | 1 +
 drivers/s390/char/sclp_early.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
index 58a4d3d354b7..8b56ac5ae496 100644
--- a/arch/s390/include/asm/sclp.h
+++ b/arch/s390/include/asm/sclp.h
@@ -89,6 +89,7 @@ struct sclp_info {
 	unsigned char has_sipl : 1;
 	unsigned char has_dirq : 1;
 	unsigned char has_zpci_lsi : 1;
+	unsigned char has_aisii : 1;
 	unsigned int ibc;
 	unsigned int mtid;
 	unsigned int mtid_cp;
diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
index b88dd0da1231..29fee179e197 100644
--- a/drivers/s390/char/sclp_early.c
+++ b/drivers/s390/char/sclp_early.c
@@ -45,6 +45,7 @@ static void __init sclp_early_facilities_detect(void)
 	sclp.has_gisaf = !!(sccb->fac118 & 0x08);
 	sclp.has_hvs = !!(sccb->fac119 & 0x80);
 	sclp.has_kss = !!(sccb->fac98 & 0x01);
+	sclp.has_aisii = !!(sccb->fac118 & 0x40);
 	sclp.has_zpci_lsi = !!(sccb->fac118 & 0x01);
 	if (sccb->fac85 & 0x02)
 		S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
-- 
2.27.0



* [PATCH v2 03/30] s390/sclp: detect the AENI facility
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
  2022-01-14 20:31 ` [PATCH v2 01/30] s390/sclp: detect the zPCI load/store interpretation facility Matthew Rosato
  2022-01-14 20:31 ` [PATCH v2 02/30] s390/sclp: detect the AISII facility Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-14 20:31 ` [PATCH v2 04/30] s390/sclp: detect the AISI facility Matthew Rosato
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

Detect the Adapter Event Notification Interpretation facility.

Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/sclp.h   | 1 +
 drivers/s390/char/sclp_early.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
index 8b56ac5ae496..8c2e142000d4 100644
--- a/arch/s390/include/asm/sclp.h
+++ b/arch/s390/include/asm/sclp.h
@@ -90,6 +90,7 @@ struct sclp_info {
 	unsigned char has_dirq : 1;
 	unsigned char has_zpci_lsi : 1;
 	unsigned char has_aisii : 1;
+	unsigned char has_aeni : 1;
 	unsigned int ibc;
 	unsigned int mtid;
 	unsigned int mtid_cp;
diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
index 29fee179e197..e9af01b4c97a 100644
--- a/drivers/s390/char/sclp_early.c
+++ b/drivers/s390/char/sclp_early.c
@@ -46,6 +46,7 @@ static void __init sclp_early_facilities_detect(void)
 	sclp.has_hvs = !!(sccb->fac119 & 0x80);
 	sclp.has_kss = !!(sccb->fac98 & 0x01);
 	sclp.has_aisii = !!(sccb->fac118 & 0x40);
+	sclp.has_aeni = !!(sccb->fac118 & 0x20);
 	sclp.has_zpci_lsi = !!(sccb->fac118 & 0x01);
 	if (sccb->fac85 & 0x02)
 		S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
-- 
2.27.0



* [PATCH v2 04/30] s390/sclp: detect the AISI facility
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (2 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 03/30] s390/sclp: detect the AENI facility Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-17  7:57   ` Thomas Huth
  2022-01-14 20:31 ` [PATCH v2 05/30] s390/airq: pass more TPI info to airq handlers Matthew Rosato
                   ` (27 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

Detect the Adapter Interruption Suppression Interpretation facility.
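
For reference, the four facilities detected by patches 01-04 are all tested
in sccb->fac118.  A minimal standalone sketch of that decoding, using the
masks from the sclp_early.c hunks (0x40, 0x20, 0x10, 0x01); the FAC118_*
macros, struct zpci_interp_facs and decode_fac118() are hypothetical names
used only for illustration, and fac118 is assumed to be a single byte:

```c
#include <stdbool.h>
#include <stdint.h>

/* Masks as used in the sclp_early.c hunks; the macro names are made up. */
#define FAC118_AISII	0x40	/* Adapter Interruption Source ID Interp. */
#define FAC118_AENI	0x20	/* Adapter Event Notification Interp. */
#define FAC118_AISI	0x10	/* Adapter Interruption Suppression Interp. */
#define FAC118_ZPCI_LSI	0x01	/* zPCI Load/Store Interpretation */

/* Hypothetical container for the four flags added to struct sclp_info. */
struct zpci_interp_facs {
	bool has_aisii;
	bool has_aeni;
	bool has_aisi;
	bool has_zpci_lsi;
};

/* Decode one facility byte the same way the patches do (!!(x & mask)). */
static struct zpci_interp_facs decode_fac118(uint8_t fac118)
{
	struct zpci_interp_facs f = {
		.has_aisii    = !!(fac118 & FAC118_AISII),
		.has_aeni     = !!(fac118 & FAC118_AENI),
		.has_aisi     = !!(fac118 & FAC118_AISI),
		.has_zpci_lsi = !!(fac118 & FAC118_ZPCI_LSI),
	};

	return f;
}
```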

Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/sclp.h   | 1 +
 drivers/s390/char/sclp_early.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
index 8c2e142000d4..33b174007848 100644
--- a/arch/s390/include/asm/sclp.h
+++ b/arch/s390/include/asm/sclp.h
@@ -91,6 +91,7 @@ struct sclp_info {
 	unsigned char has_zpci_lsi : 1;
 	unsigned char has_aisii : 1;
 	unsigned char has_aeni : 1;
+	unsigned char has_aisi : 1;
 	unsigned int ibc;
 	unsigned int mtid;
 	unsigned int mtid_cp;
diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
index e9af01b4c97a..c13e55cc4a5d 100644
--- a/drivers/s390/char/sclp_early.c
+++ b/drivers/s390/char/sclp_early.c
@@ -47,6 +47,7 @@ static void __init sclp_early_facilities_detect(void)
 	sclp.has_kss = !!(sccb->fac98 & 0x01);
 	sclp.has_aisii = !!(sccb->fac118 & 0x40);
 	sclp.has_aeni = !!(sccb->fac118 & 0x20);
+	sclp.has_aisi = !!(sccb->fac118 & 0x10);
 	sclp.has_zpci_lsi = !!(sccb->fac118 & 0x01);
 	if (sccb->fac85 & 0x02)
 		S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;
-- 
2.27.0



* [PATCH v2 05/30] s390/airq: pass more TPI info to airq handlers
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (3 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 04/30] s390/sclp: detect the AISI facility Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-17  8:27   ` Thomas Huth
  2022-01-14 20:31 ` [PATCH v2 06/30] s390/airq: allow for airq structure that uses an input vector Matthew Rosato
                   ` (26 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

A subsequent patch will introduce an airq handler that requires additional
TPI information beyond directed vs floating, so pass the entire tpi_info
structure via the handler.  Only pci actually uses this information today;
for the other airq handlers this is effectively a no-op.
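
The calling convention change can be modelled outside the kernel roughly as
below.  This is a simplified sketch: struct tpi_info is reduced to the one
bit this patch uses, and the counters, demo_handler() and dispatch() are
hypothetical names that exist only for the example.

```c
#include <stdbool.h>

/* Reduced stand-in; the real struct tpi_info carries more fields. */
struct tpi_info {
	unsigned int directed_irq : 1;
};

/* Reduced stand-in for struct airq_struct with demo-only counters. */
struct airq_struct {
	void (*handler)(struct airq_struct *airq, struct tpi_info *tpi_info);
	int floating_seen;
	int directed_seen;
};

/*
 * A handler that cares about the delivery mode now derives it from
 * tpi_info itself, as zpci_directed_irq_handler does in this patch.
 */
static void demo_handler(struct airq_struct *airq, struct tpi_info *tpi_info)
{
	bool floating = !tpi_info->directed_irq;

	if (floating)
		airq->floating_seen++;
	else
		airq->directed_seen++;
}

/* The dispatcher passes the whole structure instead of a single bool. */
static void dispatch(struct airq_struct *airq, struct tpi_info *tpi_info)
{
	airq->handler(airq, tpi_info);
}
```

Handlers that never looked at the old bool simply ignore the new pointer,
which is why the change is a no-op for them.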

Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/airq.h     | 3 ++-
 arch/s390/kvm/interrupt.c        | 4 +++-
 arch/s390/pci/pci_irq.c          | 9 +++++++--
 drivers/s390/cio/airq.c          | 2 +-
 drivers/s390/cio/qdio_thinint.c  | 6 ++++--
 drivers/s390/crypto/ap_bus.c     | 9 ++++++---
 drivers/s390/virtio/virtio_ccw.c | 4 +++-
 7 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index 01936fdfaddb..7918a7d09028 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -12,10 +12,11 @@
 
 #include <linux/bit_spinlock.h>
 #include <linux/dma-mapping.h>
+#include <asm/tpi.h>
 
 struct airq_struct {
 	struct hlist_node list;		/* Handler queueing. */
-	void (*handler)(struct airq_struct *airq, bool floating);
+	void (*handler)(struct airq_struct *airq, struct tpi_info *tpi_info);
 	u8 *lsi_ptr;			/* Local-Summary-Indicator pointer */
 	u8 lsi_mask;			/* Local-Summary-Indicator mask */
 	u8 isc;				/* Interrupt-subclass */
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index c3bd993fdd0c..f9b872e358c6 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -28,6 +28,7 @@
 #include <asm/switch_to.h>
 #include <asm/nmi.h>
 #include <asm/airq.h>
+#include <asm/tpi.h>
 #include "kvm-s390.h"
 #include "gaccess.h"
 #include "trace-s390.h"
@@ -3261,7 +3262,8 @@ int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc)
 }
 EXPORT_SYMBOL_GPL(kvm_s390_gisc_unregister);
 
-static void gib_alert_irq_handler(struct airq_struct *airq, bool floating)
+static void gib_alert_irq_handler(struct airq_struct *airq,
+				  struct tpi_info *tpi_info)
 {
 	inc_irq_stat(IRQIO_GAL);
 	process_gib_alert_list();
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index 2b6062c486f5..cc4c8d7c8f5c 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -11,6 +11,7 @@
 
 #include <asm/isc.h>
 #include <asm/airq.h>
+#include <asm/tpi.h>
 
 static enum {FLOATING, DIRECTED} irq_delivery;
 
@@ -216,8 +217,11 @@ static void zpci_handle_fallback_irq(void)
 	}
 }
 
-static void zpci_directed_irq_handler(struct airq_struct *airq, bool floating)
+static void zpci_directed_irq_handler(struct airq_struct *airq,
+				      struct tpi_info *tpi_info)
 {
+	bool floating = !tpi_info->directed_irq;
+
 	if (floating) {
 		inc_irq_stat(IRQIO_PCF);
 		zpci_handle_fallback_irq();
@@ -227,7 +231,8 @@ static void zpci_directed_irq_handler(struct airq_struct *airq, bool floating)
 	}
 }
 
-static void zpci_floating_irq_handler(struct airq_struct *airq, bool floating)
+static void zpci_floating_irq_handler(struct airq_struct *airq,
+				      struct tpi_info *tpi_info)
 {
 	unsigned long si, ai;
 	struct airq_iv *aibv;
diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index e56535c99888..2f2226786319 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -99,7 +99,7 @@ static irqreturn_t do_airq_interrupt(int irq, void *dummy)
 	rcu_read_lock();
 	hlist_for_each_entry_rcu(airq, head, list)
 		if ((*airq->lsi_ptr & airq->lsi_mask) != 0)
-			airq->handler(airq, !tpi_info->directed_irq);
+			airq->handler(airq, tpi_info);
 	rcu_read_unlock();
 
 	return IRQ_HANDLED;
diff --git a/drivers/s390/cio/qdio_thinint.c b/drivers/s390/cio/qdio_thinint.c
index 8e09bf3a2fcd..9b9335dd06db 100644
--- a/drivers/s390/cio/qdio_thinint.c
+++ b/drivers/s390/cio/qdio_thinint.c
@@ -15,6 +15,7 @@
 #include <asm/qdio.h>
 #include <asm/airq.h>
 #include <asm/isc.h>
+#include <asm/tpi.h>
 
 #include "cio.h"
 #include "ioasm.h"
@@ -93,9 +94,10 @@ static inline u32 clear_shared_ind(void)
 /**
  * tiqdio_thinint_handler - thin interrupt handler for qdio
  * @airq: pointer to adapter interrupt descriptor
- * @floating: flag to recognize floating vs. directed interrupts (unused)
+ * @tpi_info: interrupt information (e.g. floating vs directed -- unused)
  */
-static void tiqdio_thinint_handler(struct airq_struct *airq, bool floating)
+static void tiqdio_thinint_handler(struct airq_struct *airq,
+				   struct tpi_info *tpi_info)
 {
 	u64 irq_time = S390_lowcore.int_clock;
 	u32 si_used = clear_shared_ind();
diff --git a/drivers/s390/crypto/ap_bus.c b/drivers/s390/crypto/ap_bus.c
index 1986243f9cd3..df1a038442db 100644
--- a/drivers/s390/crypto/ap_bus.c
+++ b/drivers/s390/crypto/ap_bus.c
@@ -27,6 +27,7 @@
 #include <linux/kthread.h>
 #include <linux/mutex.h>
 #include <asm/airq.h>
+#include <asm/tpi.h>
 #include <linux/atomic.h>
 #include <asm/isc.h>
 #include <linux/hrtimer.h>
@@ -129,7 +130,8 @@ static int ap_max_adapter_id = 63;
 static struct bus_type ap_bus_type;
 
 /* Adapter interrupt definitions */
-static void ap_interrupt_handler(struct airq_struct *airq, bool floating);
+static void ap_interrupt_handler(struct airq_struct *airq,
+				 struct tpi_info *tpi_info);
 
 static bool ap_irq_flag;
 
@@ -442,9 +444,10 @@ static enum hrtimer_restart ap_poll_timeout(struct hrtimer *unused)
 /**
  * ap_interrupt_handler() - Schedule ap_tasklet on interrupt
  * @airq: pointer to adapter interrupt descriptor
- * @floating: ignored
+ * @tpi_info: ignored
  */
-static void ap_interrupt_handler(struct airq_struct *airq, bool floating)
+static void ap_interrupt_handler(struct airq_struct *airq,
+				 struct tpi_info *tpi_info)
 {
 	inc_irq_stat(IRQIO_APB);
 	tasklet_schedule(&ap_tasklet);
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index d35e7a3f7067..52c376d15978 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -33,6 +33,7 @@
 #include <asm/virtio-ccw.h>
 #include <asm/isc.h>
 #include <asm/airq.h>
+#include <asm/tpi.h>
 
 /*
  * virtio related functions
@@ -203,7 +204,8 @@ static void drop_airq_indicator(struct virtqueue *vq, struct airq_info *info)
 	write_unlock_irqrestore(&info->lock, flags);
 }
 
-static void virtio_airq_handler(struct airq_struct *airq, bool floating)
+static void virtio_airq_handler(struct airq_struct *airq,
+				struct tpi_info *tpi_info)
 {
 	struct airq_info *info = container_of(airq, struct airq_info, airq);
 	unsigned long ai;
-- 
2.27.0



* [PATCH v2 06/30] s390/airq: allow for airq structure that uses an input vector
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (4 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 05/30] s390/airq: pass more TPI info to airq handlers Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-17 12:29   ` Claudio Imbrenda
  2022-01-18  9:50   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 07/30] s390/pci: externalize the SIC operation controls and routine Matthew Rosato
                   ` (25 subsequent siblings)
  31 siblings, 2 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

When doing device passthrough where interrupts are being forwarded
from host to guest, we wish to use a pinned section of guest memory
as the vector (the same memory used by the guest as the vector).
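
The ownership rule this implies (a caller-supplied vector is borrowed and
must not be freed on release) can be sketched in userspace C as follows.
This is a simplified model, not the kernel code: struct iv, iv_create() and
iv_release() are hypothetical, calloc()/free() stand in for the cio DMA
allocators, and IV_GUESTVEC only mirrors the value of AIRQ_IV_GUESTVEC.

```c
#include <stdlib.h>

#define IV_GUESTVEC	32	/* same value as AIRQ_IV_GUESTVEC */

/* Hypothetical, heavily reduced model of struct airq_iv. */
struct iv {
	unsigned long *vector;
	unsigned long flags;
};

/* Create a vector; with IV_GUESTVEC the caller's memory is used as-is. */
static struct iv *iv_create(size_t words, unsigned long flags,
			    unsigned long *vec)
{
	struct iv *iv = calloc(1, sizeof(*iv));

	if (!iv)
		return NULL;
	iv->flags = flags;
	if (flags & IV_GUESTVEC) {
		iv->vector = vec;	/* borrowed, never freed here */
	} else {
		iv->vector = calloc(words, sizeof(*iv->vector));
		if (!iv->vector) {
			free(iv);
			return NULL;
		}
	}
	return iv;
}

/* Release the structure; only free a vector that iv_create() allocated. */
static void iv_release(struct iv *iv)
{
	if (!(iv->flags & IV_GUESTVEC))
		free(iv->vector);
	free(iv);
}
```

Existing callers pass NULL for the new argument and keep the old behaviour;
only a caller that sets the guest-vector flag hands in its own memory.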

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/airq.h     |  4 +++-
 arch/s390/pci/pci_irq.c          |  8 ++++----
 drivers/s390/cio/airq.c          | 10 +++++++---
 drivers/s390/virtio/virtio_ccw.c |  2 +-
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
index 7918a7d09028..e82e5626e139 100644
--- a/arch/s390/include/asm/airq.h
+++ b/arch/s390/include/asm/airq.h
@@ -47,8 +47,10 @@ struct airq_iv {
 #define AIRQ_IV_PTR		4	/* Allocate the ptr array */
 #define AIRQ_IV_DATA		8	/* Allocate the data array */
 #define AIRQ_IV_CACHELINE	16	/* Cacheline alignment for the vector */
+#define AIRQ_IV_GUESTVEC	32	/* Vector is a pinned guest page */
 
-struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags);
+struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags,
+			       unsigned long *vec);
 void airq_iv_release(struct airq_iv *iv);
 unsigned long airq_iv_alloc(struct airq_iv *iv, unsigned long num);
 void airq_iv_free(struct airq_iv *iv, unsigned long bit, unsigned long num);
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index cc4c8d7c8f5c..0d0a02a9fbbf 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -296,7 +296,7 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
 		zdev->aisb = bit;
 
 		/* Create adapter interrupt vector */
-		zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK);
+		zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK, NULL);
 		if (!zdev->aibv)
 			return -ENOMEM;
 
@@ -419,7 +419,7 @@ static int __init zpci_directed_irq_init(void)
 	union zpci_sic_iib iib = {{0}};
 	unsigned int cpu;
 
-	zpci_sbv = airq_iv_create(num_possible_cpus(), 0);
+	zpci_sbv = airq_iv_create(num_possible_cpus(), 0, NULL);
 	if (!zpci_sbv)
 		return -ENOMEM;
 
@@ -441,7 +441,7 @@ static int __init zpci_directed_irq_init(void)
 		zpci_ibv[cpu] = airq_iv_create(cache_line_size() * BITS_PER_BYTE,
 					       AIRQ_IV_DATA |
 					       AIRQ_IV_CACHELINE |
-					       (!cpu ? AIRQ_IV_ALLOC : 0));
+					       (!cpu ? AIRQ_IV_ALLOC : 0), NULL);
 		if (!zpci_ibv[cpu])
 			return -ENOMEM;
 	}
@@ -458,7 +458,7 @@ static int __init zpci_floating_irq_init(void)
 	if (!zpci_ibv)
 		return -ENOMEM;
 
-	zpci_sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC);
+	zpci_sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, NULL);
 	if (!zpci_sbv)
 		goto out_free;
 
diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
index 2f2226786319..375a58b1c838 100644
--- a/drivers/s390/cio/airq.c
+++ b/drivers/s390/cio/airq.c
@@ -122,10 +122,12 @@ static inline unsigned long iv_size(unsigned long bits)
  * airq_iv_create - create an interrupt vector
  * @bits: number of bits in the interrupt vector
  * @flags: allocation flags
+ * @vec: pointer to pinned guest memory if AIRQ_IV_GUESTVEC
  *
  * Returns a pointer to an interrupt vector structure
  */
-struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
+struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags,
+			       unsigned long *vec)
 {
 	struct airq_iv *iv;
 	unsigned long size;
@@ -146,6 +148,8 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
 					     &iv->vector_dma);
 		if (!iv->vector)
 			goto out_free;
+	} else if (flags & AIRQ_IV_GUESTVEC) {
+		iv->vector = vec;
 	} else {
 		iv->vector = cio_dma_zalloc(size);
 		if (!iv->vector)
@@ -185,7 +189,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
 	kfree(iv->avail);
 	if (iv->flags & AIRQ_IV_CACHELINE && iv->vector)
 		dma_pool_free(airq_iv_cache, iv->vector, iv->vector_dma);
-	else
+	else if (!(iv->flags & AIRQ_IV_GUESTVEC))
 		cio_dma_free(iv->vector, size);
 	kfree(iv);
 out:
@@ -204,7 +208,7 @@ void airq_iv_release(struct airq_iv *iv)
 	kfree(iv->bitlock);
 	if (iv->flags & AIRQ_IV_CACHELINE)
 		dma_pool_free(airq_iv_cache, iv->vector, iv->vector_dma);
-	else
+	else if (!(iv->flags & AIRQ_IV_GUESTVEC))
 		cio_dma_free(iv->vector, iv_size(iv->bits));
 	kfree(iv->avail);
 	kfree(iv);
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 52c376d15978..410498d693f8 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -241,7 +241,7 @@ static struct airq_info *new_airq_info(int index)
 		return NULL;
 	rwlock_init(&info->lock);
 	info->aiv = airq_iv_create(VIRTIO_IV_BITS, AIRQ_IV_ALLOC | AIRQ_IV_PTR
-				   | AIRQ_IV_CACHELINE);
+				   | AIRQ_IV_CACHELINE, NULL);
 	if (!info->aiv) {
 		kfree(info);
 		return NULL;
-- 
2.27.0



* [PATCH v2 07/30] s390/pci: externalize the SIC operation controls and routine
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (5 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 06/30] s390/airq: allow for airq structure that uses an input vector Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-17 16:19   ` Niklas Schnelle
                     ` (2 more replies)
  2022-01-14 20:31 ` [PATCH v2 08/30] s390/pci: stash associated GISA designation Matthew Rosato
                   ` (24 subsequent siblings)
  31 siblings, 3 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

A subsequent patch will be issuing SIC from KVM -- export the necessary
routine and make the operation control definitions available from a header.
Because the routine will now be exported, let's rename __zpci_set_irq_ctrl
to zpci_set_irq_ctrl and get rid of the zero'd iib wrapper function of
the same name.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
 arch/s390/pci/pci_insn.c         |  3 ++-
 arch/s390/pci/pci_irq.c          | 26 ++++++++++++--------------
 3 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
index 61cf9531f68f..5331082fa516 100644
--- a/arch/s390/include/asm/pci_insn.h
+++ b/arch/s390/include/asm/pci_insn.h
@@ -98,6 +98,14 @@ struct zpci_fib {
 	u32 gd;
 } __packed __aligned(8);
 
+/* Set Interruption Controls Operation Controls  */
+#define	SIC_IRQ_MODE_ALL		0
+#define	SIC_IRQ_MODE_SINGLE		1
+#define	SIC_IRQ_MODE_DIRECT		4
+#define	SIC_IRQ_MODE_D_ALL		16
+#define	SIC_IRQ_MODE_D_SINGLE		17
+#define	SIC_IRQ_MODE_SET_CPU		18
+
 /* directed interruption information block */
 struct zpci_diib {
 	u32 : 1;
@@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
 int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
 int __zpci_store_block(const u64 *data, u64 req, u64 offset);
 void zpci_barrier(void);
-int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
-
-static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
-{
-	union zpci_sic_iib iib = {{0}};
-
-	return __zpci_set_irq_ctrl(ctl, isc, &iib);
-}
+int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
 
 #endif
diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
index 4dd58b196cea..2a47b3936e44 100644
--- a/arch/s390/pci/pci_insn.c
+++ b/arch/s390/pci/pci_insn.c
@@ -97,7 +97,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
 }
 
 /* Set Interruption Controls */
-int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
+int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
 {
 	if (!test_facility(72))
 		return -EIO;
@@ -108,6 +108,7 @@ int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(zpci_set_irq_ctrl);
 
 /* PCI Load */
 static inline int ____pcilg(u64 *data, u64 req, u64 offset, u8 *status)
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index 0d0a02a9fbbf..2f675355fd0c 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -15,13 +15,6 @@
 
 static enum {FLOATING, DIRECTED} irq_delivery;
 
-#define	SIC_IRQ_MODE_ALL		0
-#define	SIC_IRQ_MODE_SINGLE		1
-#define	SIC_IRQ_MODE_DIRECT		4
-#define	SIC_IRQ_MODE_D_ALL		16
-#define	SIC_IRQ_MODE_D_SINGLE		17
-#define	SIC_IRQ_MODE_SET_CPU		18
-
 /*
  * summary bit vector
  * FLOATING - summary bit per function
@@ -154,6 +147,7 @@ static struct irq_chip zpci_irq_chip = {
 static void zpci_handle_cpu_local_irq(bool rescan)
 {
 	struct airq_iv *dibv = zpci_ibv[smp_processor_id()];
+	union zpci_sic_iib iib = {{0}};
 	unsigned long bit;
 	int irqs_on = 0;
 
@@ -165,7 +159,7 @@ static void zpci_handle_cpu_local_irq(bool rescan)
 				/* End of second scan with interrupts on. */
 				break;
 			/* First scan complete, reenable interrupts. */
-			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC))
+			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC, &iib))
 				break;
 			bit = 0;
 			continue;
@@ -193,6 +187,7 @@ static void zpci_handle_remote_irq(void *data)
 static void zpci_handle_fallback_irq(void)
 {
 	struct cpu_irq_data *cpu_data;
+	union zpci_sic_iib iib = {{0}};
 	unsigned long cpu;
 	int irqs_on = 0;
 
@@ -203,7 +198,7 @@ static void zpci_handle_fallback_irq(void)
 				/* End of second scan with interrupts on. */
 				break;
 			/* First scan complete, reenable interrupts. */
-			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
+			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib))
 				break;
 			cpu = 0;
 			continue;
@@ -234,6 +229,7 @@ static void zpci_directed_irq_handler(struct airq_struct *airq,
 static void zpci_floating_irq_handler(struct airq_struct *airq,
 				      struct tpi_info *tpi_info)
 {
+	union zpci_sic_iib iib = {{0}};
 	unsigned long si, ai;
 	struct airq_iv *aibv;
 	int irqs_on = 0;
@@ -247,7 +243,7 @@ static void zpci_floating_irq_handler(struct airq_struct *airq,
 				/* End of second scan with interrupts on. */
 				break;
 			/* First scan complete, reenable interrupts. */
-			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
+			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib))
 				break;
 			si = 0;
 			continue;
@@ -407,11 +403,12 @@ static struct airq_struct zpci_airq = {
 static void __init cpu_enable_directed_irq(void *unused)
 {
 	union zpci_sic_iib iib = {{0}};
+	union zpci_sic_iib ziib = {{0}};
 
 	iib.cdiib.dibv_addr = (u64) zpci_ibv[smp_processor_id()]->vector;
 
-	__zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
-	zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC);
+	zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
+	zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC, &ziib);
 }
 
 static int __init zpci_directed_irq_init(void)
@@ -426,7 +423,7 @@ static int __init zpci_directed_irq_init(void)
 	iib.diib.isc = PCI_ISC;
 	iib.diib.nr_cpus = num_possible_cpus();
 	iib.diib.disb_addr = virt_to_phys(zpci_sbv->vector);
-	__zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);
+	zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);
 
 	zpci_ibv = kcalloc(num_possible_cpus(), sizeof(*zpci_ibv),
 			   GFP_KERNEL);
@@ -471,6 +468,7 @@ static int __init zpci_floating_irq_init(void)
 
 int __init zpci_irq_init(void)
 {
+	union zpci_sic_iib iib = {{0}};
 	int rc;
 
 	irq_delivery = sclp.has_dirq ? DIRECTED : FLOATING;
@@ -502,7 +500,7 @@ int __init zpci_irq_init(void)
 	 * Enable floating IRQs (with suppression after one IRQ). When using
 	 * directed IRQs this enables the fallback path.
 	 */
-	zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC);
+	zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib);
 
 	return 0;
 out_airq:
-- 
2.27.0



* [PATCH v2 08/30] s390/pci: stash associated GISA designation
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (6 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 07/30] s390/pci: externalize the SIC operation controls and routine Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-24 14:08   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 09/30] s390/pci: export some routines related to RPCIT processing Matthew Rosato
                   ` (23 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel,
	Christian Borntraeger

For passthrough devices, we will need to know the GISA designation of the
guest if interpretation facilities are to be used.  Stash this in the zdev
and set a default of 0 (no GISA designation) for now; a subsequent patch
will set a valid GISA designation for passthrough devices.
Also, extend the mpcifc routines to specify this stashed designation as
part of the mpcifc command.

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
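Note (illustration only, not part of the patch): carving gd out of the
former 64-bit reserved field should leave the request layout unchanged.
A minimal userspace check, assuming a packed layout like the tail of
struct clp_req_set_pci shown in the diff; struct req_tail_old,
struct req_tail_new and gd_offset_in_tail() are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the request tail before the patch. */
struct req_tail_old {
	uint16_t reserved2;
	uint8_t oc;		/* operation controls */
	uint8_t ndas;		/* number of dma spaces */
	uint64_t reserved3;
} __attribute__((packed));

/* The same tail after the patch: reserved3 shrinks, gd takes the rest. */
struct req_tail_new {
	uint16_t reserved2;
	uint8_t oc;		/* operation controls */
	uint8_t ndas;		/* number of dma spaces */
	uint32_t reserved3;
	uint32_t gd;		/* GISA designation */
} __attribute__((packed));

/* The overall size must not change. */
_Static_assert(sizeof(struct req_tail_old) == sizeof(struct req_tail_new),
	       "request tail size changed");

/* gd occupies the second word of what used to be reserved3. */
_Static_assert(offsetof(struct req_tail_new, gd) ==
	       offsetof(struct req_tail_old, reserved3) + 4,
	       "gd is not in the second half of the old reserved field");

/* Helper so the offset can also be inspected at run time. */
static size_t gd_offset_in_tail(void)
{
	return offsetof(struct req_tail_new, gd);
}
```

Since zdev->gd defaults to 0, the bytes sent for non-passthrough devices
are the same zeroes the reserved field carried before.
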
 arch/s390/include/asm/pci.h     | 1 +
 arch/s390/include/asm/pci_clp.h | 3 ++-
 arch/s390/pci/pci.c             | 6 ++++++
 arch/s390/pci/pci_clp.c         | 1 +
 arch/s390/pci/pci_irq.c         | 5 +++++
 5 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 90824be5ce9a..2474b8d30f2a 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -123,6 +123,7 @@ struct zpci_dev {
 	enum zpci_state state;
 	u32		fid;		/* function ID, used by sclp */
 	u32		fh;		/* function handle, used by insn's */
+	u32		gd;		/* GISA designation for passthrough */
 	u16		vfn;		/* virtual function number */
 	u16		pchid;		/* physical channel ID */
 	u8		pfgid;		/* function group ID */
diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
index 1f4b666e85ee..3af8d196da74 100644
--- a/arch/s390/include/asm/pci_clp.h
+++ b/arch/s390/include/asm/pci_clp.h
@@ -173,7 +173,8 @@ struct clp_req_set_pci {
 	u16 reserved2;
 	u8 oc;				/* operation controls */
 	u8 ndas;			/* number of dma spaces */
-	u64 reserved3;
+	u32 reserved3;
+	u32 gd;				/* GISA designation */
 } __packed;
 
 /* Set PCI function response */
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 792f8e0f2178..0c9879dae752 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -119,6 +119,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
 	fib.pba = base;
 	fib.pal = limit;
 	fib.iota = iota | ZPCI_IOTA_RTTO_FLAG;
+	fib.gd = zdev->gd;
 	cc = zpci_mod_fc(req, &fib, &status);
 	if (cc)
 		zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
@@ -132,6 +133,8 @@ int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
 	struct zpci_fib fib = {0};
 	u8 cc, status;
 
+	fib.gd = zdev->gd;
+
 	cc = zpci_mod_fc(req, &fib, &status);
 	if (cc)
 		zpci_dbg(3, "unreg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
@@ -159,6 +162,7 @@ int zpci_fmb_enable_device(struct zpci_dev *zdev)
 	atomic64_set(&zdev->unmapped_pages, 0);
 
 	fib.fmb_addr = virt_to_phys(zdev->fmb);
+	fib.gd = zdev->gd;
 	cc = zpci_mod_fc(req, &fib, &status);
 	if (cc) {
 		kmem_cache_free(zdev_fmb_cache, zdev->fmb);
@@ -177,6 +181,8 @@ int zpci_fmb_disable_device(struct zpci_dev *zdev)
 	if (!zdev->fmb)
 		return -EINVAL;
 
+	fib.gd = zdev->gd;
+
 	/* Function measurement is disabled if fmb address is zero */
 	cc = zpci_mod_fc(req, &fib, &status);
 	if (cc == 3) /* Function already gone. */
diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
index be077b39da33..e9ed0e4a5cf0 100644
--- a/arch/s390/pci/pci_clp.c
+++ b/arch/s390/pci/pci_clp.c
@@ -240,6 +240,7 @@ static int clp_set_pci_fn(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as, u8 comma
 		rrb->request.fh = zdev->fh;
 		rrb->request.oc = command;
 		rrb->request.ndas = nr_dma_as;
+		rrb->request.gd = zdev->gd;
 
 		rc = clp_req(rrb, CLP_LPS_PCI);
 		if (rrb->response.hdr.rsp == CLP_RC_SETPCIFN_BUSY) {
diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
index 2f675355fd0c..17e5adfe1273 100644
--- a/arch/s390/pci/pci_irq.c
+++ b/arch/s390/pci/pci_irq.c
@@ -43,6 +43,7 @@ static int zpci_set_airq(struct zpci_dev *zdev)
 	fib.fmt0.aibvo = 0;	/* each zdev has its own interrupt vector */
 	fib.fmt0.aisb = virt_to_phys(zpci_sbv->vector) + (zdev->aisb / 64) * 8;
 	fib.fmt0.aisbo = zdev->aisb & 63;
+	fib.gd = zdev->gd;
 
 	return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
 }
@@ -54,6 +55,8 @@ static int zpci_clear_airq(struct zpci_dev *zdev)
 	struct zpci_fib fib = {0};
 	u8 cc, status;
 
+	fib.gd = zdev->gd;
+
 	cc = zpci_mod_fc(req, &fib, &status);
 	if (cc == 3 || (cc == 1 && status == 24))
 		/* Function already gone or IRQs already deregistered. */
@@ -72,6 +75,7 @@ static int zpci_set_directed_irq(struct zpci_dev *zdev)
 	fib.fmt = 1;
 	fib.fmt1.noi = zdev->msi_nr_irqs;
 	fib.fmt1.dibvo = zdev->msi_first_bit;
+	fib.gd = zdev->gd;
 
 	return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
 }
@@ -84,6 +88,7 @@ static int zpci_clear_directed_irq(struct zpci_dev *zdev)
 	u8 cc, status;
 
 	fib.fmt = 1;
+	fib.gd = zdev->gd;
 	cc = zpci_mod_fc(req, &fib, &status);
 	if (cc == 3 || (cc == 1 && status == 24))
 		/* Function already gone or IRQs already deregistered. */
-- 
2.27.0



* [PATCH v2 09/30] s390/pci: export some routines related to RPCIT processing
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (7 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 08/30] s390/pci: stash associated GISA designation Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-18  9:51   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 10/30] s390/pci: stash dtsm and maxstbl Matthew Rosato
                   ` (22 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel,
	Christian Borntraeger

KVM will re-use dma_walk_cpu_trans to walk the host shadow table and will
also need to call zpci_refresh_trans to re-issue an RPCIT.  Export both
routines.

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/pci/pci_dma.c  | 1 +
 arch/s390/pci/pci_insn.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index f46833a25526..a81de48d5ea7 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -116,6 +116,7 @@ unsigned long *dma_walk_cpu_trans(unsigned long *rto, dma_addr_t dma_addr)
 	px = calc_px(dma_addr);
 	return &pto[px];
 }
+EXPORT_SYMBOL_GPL(dma_walk_cpu_trans);
 
 void dma_update_cpu_trans(unsigned long *entry, phys_addr_t page_addr, int flags)
 {
diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
index 2a47b3936e44..0509554301c7 100644
--- a/arch/s390/pci/pci_insn.c
+++ b/arch/s390/pci/pci_insn.c
@@ -95,6 +95,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
 
 	return (cc) ? -EIO : 0;
 }
+EXPORT_SYMBOL_GPL(zpci_refresh_trans);
 
 /* Set Interruption Controls */
 int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
-- 
2.27.0



* [PATCH v2 10/30] s390/pci: stash dtsm and maxstbl
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (8 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 09/30] s390/pci: export some routines related to RPCIT processing Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-14 20:31 ` [PATCH v2 11/30] s390/pci: add helper function to find device by handle Matthew Rosato
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel,
	Christian Borntraeger

Store information about which IOAT designation types are supported by the
underlying hardware, as well as the largest store block size allowed.
These values will be needed by passthrough.

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
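Note (illustration only, not part of the patch): the new bitfield pair
(3 unnamed bits followed by a 13-bit maxstbl) can be decoded portably
from the raw response bytes.  This sketch assumes the big-endian,
most-significant-bit-first bitfield allocation used on s390, so maxstbl
is the low 13 bits of the halfword; be16() and decode_maxstbl() are
hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian halfword from two raw response bytes. */
static uint16_t be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Extract the 13-bit maxstbl value.  Under the stated assumption the
 * three unnamed bits are the top bits of the halfword, so masking them
 * off leaves maxstbl.
 */
static uint16_t decode_maxstbl(const uint8_t *halfword)
{
	assert(halfword != NULL);
	return be16(halfword) & 0x1fff;
}
```

A 13-bit field tops out at 8191, so the u16 maxstbl member added to
struct zpci_dev holds any value the response can carry.
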
 arch/s390/include/asm/pci.h     | 2 ++
 arch/s390/include/asm/pci_clp.h | 6 ++++--
 arch/s390/pci/pci_clp.c         | 2 ++
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 2474b8d30f2a..1a8f9f42da3a 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -126,9 +126,11 @@ struct zpci_dev {
 	u32		gd;		/* GISA designation for passthrough */
 	u16		vfn;		/* virtual function number */
 	u16		pchid;		/* physical channel ID */
+	u16		maxstbl;	/* Maximum store block size */
 	u8		pfgid;		/* function group ID */
 	u8		pft;		/* pci function type */
 	u8		port;
+	u8		dtsm;		/* Supported DT mask */
 	u8		rid_available	: 1;
 	u8		has_hp_slot	: 1;
 	u8		has_resources	: 1;
diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
index 3af8d196da74..124fadfb74b9 100644
--- a/arch/s390/include/asm/pci_clp.h
+++ b/arch/s390/include/asm/pci_clp.h
@@ -153,9 +153,11 @@ struct clp_rsp_query_pci_grp {
 	u8			:  6;
 	u8 frame		:  1;
 	u8 refresh		:  1;	/* TLB refresh mode */
-	u16 reserved2;
+	u16			:  3;
+	u16 maxstbl		: 13;	/* Maximum store block size */
 	u16 mui;
-	u16			: 16;
+	u8 dtsm;			/* Supported DT mask */
+	u8 reserved3;
 	u16 maxfaal;
 	u16			:  4;
 	u16 dnoi		: 12;
diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
index e9ed0e4a5cf0..bc7446566cbc 100644
--- a/arch/s390/pci/pci_clp.c
+++ b/arch/s390/pci/pci_clp.c
@@ -103,6 +103,8 @@ static void clp_store_query_pci_fngrp(struct zpci_dev *zdev,
 	zdev->max_msi = response->noi;
 	zdev->fmb_update = response->mui;
 	zdev->version = response->version;
+	zdev->maxstbl = response->maxstbl;
+	zdev->dtsm = response->dtsm;
 
 	switch (response->version) {
 	case 1:
-- 
2.27.0



* [PATCH v2 11/30] s390/pci: add helper function to find device by handle
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (9 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 10/30] s390/pci: stash dtsm and maxstbl Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-18  9:53   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 12/30] s390/pci: get SHM information from list pci Matthew Rosato
                   ` (20 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel,
	Christian Borntraeger

Intercepted zPCI instructions will specify the desired function via a
function handle.  Add a routine to find the device with the specified
handle.

Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
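Note (illustration only, not part of the patch): the lookup is a linear
scan of the device list under the list lock.  A userspace sketch of the
same pattern, with a C11 atomic_flag standing in for the spinlock and a
singly linked list standing in for zpci_list; all mock_* names are
hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct zpci_dev. */
struct mock_zdev {
	uint32_t fid;		/* function ID */
	uint32_t fh;		/* function handle */
	struct mock_zdev *next;
};

static atomic_flag mock_list_lock = ATOMIC_FLAG_INIT;
static struct mock_zdev *mock_list;

/* Minimal spinlock, standing in for spin_lock()/spin_unlock(). */
static void mock_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&mock_list_lock,
						 memory_order_acquire))
		;
}

static void mock_unlock(void)
{
	atomic_flag_clear_explicit(&mock_list_lock, memory_order_release);
}

/* Add a device at the head of the list. */
static void mock_add(struct mock_zdev *zdev)
{
	assert(zdev != NULL);
	mock_lock();
	zdev->next = mock_list;
	mock_list = zdev;
	mock_unlock();
}

/* Return the device with a matching handle, or NULL if none exists. */
static struct mock_zdev *mock_get_by_fh(uint32_t fh)
{
	struct mock_zdev *tmp, *zdev = NULL;

	mock_lock();
	for (tmp = mock_list; tmp; tmp = tmp->next) {
		if (tmp->fh == fh) {
			zdev = tmp;
			break;
		}
	}
	mock_unlock();
	return zdev;
}
```

As in the patch, the sketch hands back the pointer after dropping the
lock without taking a reference, so the caller has to make sure by other
means that the device stays around while it is in use.
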
 arch/s390/include/asm/pci.h |  1 +
 arch/s390/pci/pci.c         | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 1a8f9f42da3a..00a2c24d6d2b 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -275,6 +275,7 @@ static inline struct zpci_dev *to_zpci_dev(struct device *dev)
 }
 
 struct zpci_dev *get_zdev_by_fid(u32);
+struct zpci_dev *get_zdev_by_fh(u32 fh);
 
 /* DMA */
 int zpci_dma_init(void);
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 0c9879dae752..1e939b4cf25e 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -76,6 +76,22 @@ struct zpci_dev *get_zdev_by_fid(u32 fid)
 	return zdev;
 }
 
+struct zpci_dev *get_zdev_by_fh(u32 fh)
+{
+	struct zpci_dev *tmp, *zdev = NULL;
+
+	spin_lock(&zpci_list_lock);
+	list_for_each_entry(tmp, &zpci_list, entry) {
+		if (tmp->fh == fh) {
+			zdev = tmp;
+			break;
+		}
+	}
+	spin_unlock(&zpci_list_lock);
+	return zdev;
+}
+EXPORT_SYMBOL_GPL(get_zdev_by_fh);
+
 void zpci_remove_reserved_devices(void)
 {
 	struct zpci_dev *tmp, *zdev;
-- 
2.27.0



* [PATCH v2 12/30] s390/pci: get SHM information from list pci
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (10 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 11/30] s390/pci: add helper function to find device by handle Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-18 10:36   ` Pierre Morel
  2022-01-27 10:29   ` Niklas Schnelle
  2022-01-14 20:31 ` [PATCH v2 13/30] s390/pci: return status from zpci_refresh_trans Matthew Rosato
                   ` (19 subsequent siblings)
  31 siblings, 2 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

KVM will need information on the special handle mask used to indicate
emulated devices.  In order to obtain this, a new type of list pci call
must be made to gather the information.  Extend clp_list_pci_req to
also fetch the model-dependent-data field that holds this mask.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
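Note (illustration only, not part of the patch): mdd is added as an
optional out-parameter, so existing list/find callers pass NULL and only
the new helper asks for the value.  A userspace sketch of that pattern;
struct mock_list_rsp, mock_list_req() and mock_get_mdd() are
hypothetical, and the mock reads a prepared response instead of issuing
a CLP request.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of the list pci response used. */
struct mock_list_rsp {
	uint64_t resume_token;
	uint32_t mdd;		/* model-dependent data */
	int nentries;
};

/* Copy out the response fields; mdd is only written when requested. */
static int mock_list_req(const struct mock_list_rsp *rsp,
			 uint64_t *resume_token, int *nentries,
			 uint32_t *mdd)
{
	assert(rsp != NULL);
	*nentries = rsp->nentries;
	*resume_token = rsp->resume_token;
	if (mdd)
		*mdd = rsp->mdd;
	return 0;
}

/* The exported helper rejects NULL, then issues a single request. */
static int mock_get_mdd(const struct mock_list_rsp *rsp, uint32_t *mdd)
{
	uint64_t resume_token = 0;
	int nentries;

	if (!mdd)
		return -EINVAL;

	return mock_list_req(rsp, &resume_token, &nentries, mdd);
}
```

One request is enough here because the helper only wants the header
field and does not walk the function handle list.
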
 arch/s390/include/asm/pci.h     |  1 +
 arch/s390/include/asm/pci_clp.h |  2 +-
 arch/s390/pci/pci_clp.c         | 28 +++++++++++++++++++++++++---
 3 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 00a2c24d6d2b..f3cd2da8128c 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -227,6 +227,7 @@ int clp_enable_fh(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as);
 int clp_disable_fh(struct zpci_dev *zdev, u32 *fh);
 int clp_get_state(u32 fid, enum zpci_state *state);
 int clp_refresh_fh(u32 fid, u32 *fh);
+int zpci_get_mdd(u32 *mdd);
 
 /* UID */
 void update_uid_checking(bool new);
diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
index 124fadfb74b9..d6bc324763f3 100644
--- a/arch/s390/include/asm/pci_clp.h
+++ b/arch/s390/include/asm/pci_clp.h
@@ -76,7 +76,7 @@ struct clp_req_list_pci {
 struct clp_rsp_list_pci {
 	struct clp_rsp_hdr hdr;
 	u64 resume_token;
-	u32 reserved2;
+	u32 mdd;
 	u16 max_fn;
 	u8			: 7;
 	u8 uid_checking		: 1;
diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
index bc7446566cbc..308ffb93413f 100644
--- a/arch/s390/pci/pci_clp.c
+++ b/arch/s390/pci/pci_clp.c
@@ -328,7 +328,7 @@ int clp_disable_fh(struct zpci_dev *zdev, u32 *fh)
 }
 
 static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
-			    u64 *resume_token, int *nentries)
+			    u64 *resume_token, int *nentries, u32 *mdd)
 {
 	int rc;
 
@@ -354,6 +354,8 @@ static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
 	*nentries = (rrb->response.hdr.len - LIST_PCI_HDR_LEN) /
 		rrb->response.entry_size;
 	*resume_token = rrb->response.resume_token;
+	if (mdd)
+		*mdd = rrb->response.mdd;
 
 	return rc;
 }
@@ -365,7 +367,7 @@ static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
 	int nentries, i, rc;
 
 	do {
-		rc = clp_list_pci_req(rrb, &resume_token, &nentries);
+		rc = clp_list_pci_req(rrb, &resume_token, &nentries, NULL);
 		if (rc)
 			return rc;
 		for (i = 0; i < nentries; i++)
@@ -383,7 +385,7 @@ static int clp_find_pci(struct clp_req_rsp_list_pci *rrb, u32 fid,
 	int nentries, i, rc;
 
 	do {
-		rc = clp_list_pci_req(rrb, &resume_token, &nentries);
+		rc = clp_list_pci_req(rrb, &resume_token, &nentries, NULL);
 		if (rc)
 			return rc;
 		fh_list = rrb->response.fh_list;
@@ -468,6 +470,26 @@ int clp_get_state(u32 fid, enum zpci_state *state)
 	return rc;
 }
 
+int zpci_get_mdd(u32 *mdd)
+{
+	struct clp_req_rsp_list_pci *rrb;
+	u64 resume_token = 0;
+	int nentries, rc;
+
+	if (!mdd)
+		return -EINVAL;
+
+	rrb = clp_alloc_block(GFP_KERNEL);
+	if (!rrb)
+		return -ENOMEM;
+
+	rc = clp_list_pci_req(rrb, &resume_token, &nentries, mdd);
+
+	clp_free_block(rrb);
+	return rc;
+}
+EXPORT_SYMBOL_GPL(zpci_get_mdd);
+
 static int clp_base_slpc(struct clp_req *req, struct clp_req_rsp_slpc *lpcb)
 {
 	unsigned long limit = PAGE_SIZE - sizeof(lpcb->request);
-- 
2.27.0



* [PATCH v2 13/30] s390/pci: return status from zpci_refresh_trans
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (11 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 12/30] s390/pci: get SHM information from list pci Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-19 18:13   ` Pierre Morel
                     ` (2 more replies)
  2022-01-14 20:31 ` [PATCH v2 14/30] KVM: s390: pci: add basic kvm_zdev structure Matthew Rosato
                   ` (18 subsequent siblings)
  31 siblings, 3 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

Current callers of zpci_refresh_trans don't need to interrogate the status
returned from the underlying instructions.  However, a subsequent patch
will add a KVM caller that needs this information.  Add a new argument to
zpci_refresh_trans to pass the address of a status byte and update
existing call sites to provide it.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
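Note (illustration only, not part of the patch): the resulting control
flow is a retry loop on cc 2 (busy) followed by a mapping of cc/status
to an errno, with the raw status left in the caller's byte.  A userspace
sketch with a scripted mock in place of the RPCIT instruction; all
mock_* names are hypothetical, and the busy delay is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* One scripted instruction result for the mock. */
struct mock_result {
	uint8_t cc;
	uint8_t status;
};

static const struct mock_result *mock_script;
static int mock_pos;

/* Stand-in for __rpcit(): return the next scripted cc and status. */
static uint8_t mock_rpcit(uint8_t *status)
{
	const struct mock_result *r = &mock_script[mock_pos++];

	*status = r->status;
	return r->cc;
}

/* Same shape as zpci_refresh_trans() after the patch. */
static int mock_refresh_trans(uint8_t *status)
{
	uint8_t cc;

	assert(status != NULL);
	do {
		cc = mock_rpcit(status);
	} while (cc == 2);	/* busy: retry (the kernel also delays) */

	/* cc 1 with status 4 or 16 is reported as -ENOMEM. */
	if (cc == 1 && (*status == 4 || *status == 16))
		return -ENOMEM;

	return cc ? -EIO : 0;
}
```

The patch dereferences status unconditionally, so NULL is not accepted;
callers that do not care pass a throwaway local, as the updated call
sites do.
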
 arch/s390/include/asm/pci_insn.h |  2 +-
 arch/s390/pci/pci_dma.c          |  6 ++++--
 arch/s390/pci/pci_insn.c         | 10 +++++-----
 drivers/iommu/s390-iommu.c       |  4 +++-
 4 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
index 5331082fa516..32759c407b8f 100644
--- a/arch/s390/include/asm/pci_insn.h
+++ b/arch/s390/include/asm/pci_insn.h
@@ -135,7 +135,7 @@ union zpci_sic_iib {
 DECLARE_STATIC_KEY_FALSE(have_mio);
 
 u8 zpci_mod_fc(u64 req, struct zpci_fib *fib, u8 *status);
-int zpci_refresh_trans(u64 fn, u64 addr, u64 range);
+int zpci_refresh_trans(u64 fn, u64 addr, u64 range, u8 *status);
 int __zpci_load(u64 *data, u64 req, u64 offset);
 int zpci_load(u64 *data, const volatile void __iomem *addr, unsigned long len);
 int __zpci_store(u64 data, u64 req, u64 offset);
diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index a81de48d5ea7..b0a2380bcad8 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -23,8 +23,9 @@ static u32 s390_iommu_aperture_factor = 1;
 
 static int zpci_refresh_global(struct zpci_dev *zdev)
 {
+	u8 status;
 	return zpci_refresh_trans((u64) zdev->fh << 32, zdev->start_dma,
-				  zdev->iommu_pages * PAGE_SIZE);
+				  zdev->iommu_pages * PAGE_SIZE, &status);
 }
 
 unsigned long *dma_alloc_cpu_table(void)
@@ -183,6 +184,7 @@ static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr,
 			   size_t size, int flags)
 {
 	unsigned long irqflags;
+	u8 status;
 	int ret;
 
 	/*
@@ -201,7 +203,7 @@ static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr,
 	}
 
 	ret = zpci_refresh_trans((u64) zdev->fh << 32, dma_addr,
-				 PAGE_ALIGN(size));
+				 PAGE_ALIGN(size), &status);
 	if (ret == -ENOMEM && !s390_iommu_strict) {
 		/* enable the hypervisor to free some resources */
 		if (zpci_refresh_global(zdev))
diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
index 0509554301c7..ca6399d52767 100644
--- a/arch/s390/pci/pci_insn.c
+++ b/arch/s390/pci/pci_insn.c
@@ -77,20 +77,20 @@ static inline u8 __rpcit(u64 fn, u64 addr, u64 range, u8 *status)
 	return cc;
 }
 
-int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
+int zpci_refresh_trans(u64 fn, u64 addr, u64 range, u8 *status)
 {
-	u8 cc, status;
+	u8 cc;
 
 	do {
-		cc = __rpcit(fn, addr, range, &status);
+		cc = __rpcit(fn, addr, range, status);
 		if (cc == 2)
 			udelay(ZPCI_INSN_BUSY_DELAY);
 	} while (cc == 2);
 
 	if (cc)
-		zpci_err_insn(cc, status, addr, range);
+		zpci_err_insn(cc, *status, addr, range);
 
-	if (cc == 1 && (status == 4 || status == 16))
+	if (cc == 1 && (*status == 4 || *status == 16))
 		return -ENOMEM;
 
 	return (cc) ? -EIO : 0;
diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
index 50860ebdd087..845bb99c183e 100644
--- a/drivers/iommu/s390-iommu.c
+++ b/drivers/iommu/s390-iommu.c
@@ -214,6 +214,7 @@ static int s390_iommu_update_trans(struct s390_domain *s390_domain,
 	unsigned long irq_flags, nr_pages, i;
 	unsigned long *entry;
 	int rc = 0;
+	u8 status;
 
 	if (dma_addr < s390_domain->domain.geometry.aperture_start ||
 	    dma_addr + size > s390_domain->domain.geometry.aperture_end)
@@ -238,7 +239,8 @@ static int s390_iommu_update_trans(struct s390_domain *s390_domain,
 	spin_lock(&s390_domain->list_lock);
 	list_for_each_entry(domain_device, &s390_domain->devices, list) {
 		rc = zpci_refresh_trans((u64) domain_device->zdev->fh << 32,
-					start_dma_addr, nr_pages * PAGE_SIZE);
+					start_dma_addr, nr_pages * PAGE_SIZE,
+					&status);
 		if (rc)
 			break;
 	}
-- 
2.27.0



* [PATCH v2 14/30] KVM: s390: pci: add basic kvm_zdev structure
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (12 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 13/30] s390/pci: return status from zpci_refresh_trans Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-17 16:25   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 15/30] KVM: s390: pci: do initial setup for AEN interpretation Matthew Rosato
                   ` (17 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

This structure will be used to carry KVM passthrough information related
to zPCI devices.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
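Note (illustration only, not part of the patch): the intended lifecycle
appears to be open, then attach, then release.  A userspace sketch with
calloc()/free() in place of kzalloc()/kfree(); the mock_* types and
functions are hypothetical, and the assertions make explicit the
ordering that the kernel code assumes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

struct mock_kvm { int id; };
struct mock_kzdev;

/* Hypothetical stand-in for struct zpci_dev. */
struct mock_zdev {
	struct mock_kzdev *kzdev;	/* passthrough data */
};

/* Hypothetical stand-in for struct kvm_zdev. */
struct mock_kzdev {
	struct mock_zdev *zdev;
	struct mock_kvm *kvm;
};

/* Allocate the passthrough data and link it with the device. */
static int mock_dev_open(struct mock_zdev *zdev)
{
	struct mock_kzdev *kzdev = calloc(1, sizeof(*kzdev));

	if (!kzdev)
		return -ENOMEM;

	kzdev->zdev = zdev;
	zdev->kzdev = kzdev;
	return 0;
}

/* Record the owning VM; only valid after a successful open. */
static void mock_attach_kvm(struct mock_zdev *zdev, struct mock_kvm *kvm)
{
	assert(zdev->kzdev != NULL);
	zdev->kzdev->kvm = kvm;
}

/* Unlink and free the passthrough data. */
static void mock_dev_release(struct mock_zdev *zdev)
{
	struct mock_kzdev *kzdev = zdev->kzdev;

	assert(kzdev != NULL && kzdev->zdev == zdev);
	zdev->kzdev = NULL;
	free(kzdev);
}
```

The release and attach routines in the patch dereference zdev->kzdev
without a check, so they rely on callers having opened the device first.
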
 arch/s390/include/asm/kvm_pci.h | 29 +++++++++++++++++++++
 arch/s390/include/asm/pci.h     |  3 +++
 arch/s390/kvm/Makefile          |  2 +-
 arch/s390/kvm/pci.c             | 46 +++++++++++++++++++++++++++++++++
 4 files changed, 79 insertions(+), 1 deletion(-)
 create mode 100644 arch/s390/include/asm/kvm_pci.h
 create mode 100644 arch/s390/kvm/pci.c

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
new file mode 100644
index 000000000000..aafee2976929
--- /dev/null
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * KVM PCI Passthrough for virtual machines on s390
+ *
+ * Copyright IBM Corp. 2021
+ *
+ *    Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
+ */
+
+
+#ifndef ASM_KVM_PCI_H
+#define ASM_KVM_PCI_H
+
+#include <linux/types.h>
+#include <linux/kvm_types.h>
+#include <linux/kvm_host.h>
+#include <linux/kvm.h>
+#include <linux/pci.h>
+
+struct kvm_zdev {
+	struct zpci_dev *zdev;
+	struct kvm *kvm;
+};
+
+int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
+void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
+void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
+
+#endif /* ASM_KVM_PCI_H */
diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index f3cd2da8128c..9b6c657d8d31 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -97,6 +97,7 @@ struct zpci_bar_struct {
 };
 
 struct s390_domain;
+struct kvm_zdev;
 
 #define ZPCI_FUNCTIONS_PER_BUS 256
 struct zpci_bus {
@@ -190,6 +191,8 @@ struct zpci_dev {
 	struct dentry	*debugfs_dev;
 
 	struct s390_domain *s390_domain; /* s390 IOMMU domain data */
+
+	struct kvm_zdev *kzdev; /* passthrough data */
 };
 
 static inline bool zdev_enabled(struct zpci_dev *zdev)
diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
index b3aaadc60ead..a26f4fe7b680 100644
--- a/arch/s390/kvm/Makefile
+++ b/arch/s390/kvm/Makefile
@@ -11,5 +11,5 @@ ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
 
 kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
 kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
-
+kvm-$(CONFIG_PCI) += pci.o
 obj-$(CONFIG_KVM) += kvm.o
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
new file mode 100644
index 000000000000..1c33bc7bf2bd
--- /dev/null
+++ b/arch/s390/kvm/pci.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * s390 kvm PCI passthrough support
+ *
+ * Copyright IBM Corp. 2021
+ *
+ *    Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/pci.h>
+#include <asm/kvm_pci.h>
+
+int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
+{
+	struct kvm_zdev *kzdev;
+
+	kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
+	if (!kzdev)
+		return -ENOMEM;
+
+	kzdev->zdev = zdev;
+	zdev->kzdev = kzdev;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
+
+void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
+{
+	struct kvm_zdev *kzdev;
+
+	kzdev = zdev->kzdev;
+	WARN_ON(kzdev->zdev != zdev);
+	zdev->kzdev = 0;
+	kfree(kzdev);
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
+
+void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
+{
+	struct kvm_zdev *kzdev = zdev->kzdev;
+
+	kzdev->kvm = kvm;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
-- 
2.27.0



* [PATCH v2 15/30] KVM: s390: pci: do initial setup for AEN interpretation
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (13 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 14/30] KVM: s390: pci: add basic kvm_zdev structure Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-19 18:06   ` Pierre Morel
  2022-01-25 12:23   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 16/30] KVM: s390: pci: enable host forwarding of Adapter Event Notifications Matthew Rosato
                   ` (16 subsequent siblings)
  31 siblings, 2 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

Initial setup for Adapter Event Notification Interpretation for zPCI
passthrough devices.  Specifically, allocate a structure for forwarding of
adapter events and pass the address of this structure to firmware.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/pci.h      |   4 +
 arch/s390/include/asm/pci_insn.h |  12 +++
 arch/s390/kvm/interrupt.c        |  14 +++
 arch/s390/kvm/kvm-s390.c         |   9 ++
 arch/s390/kvm/pci.c              | 144 +++++++++++++++++++++++++++++++
 arch/s390/kvm/pci.h              |  42 +++++++++
 arch/s390/pci/pci.c              |   6 ++
 7 files changed, 231 insertions(+)
 create mode 100644 arch/s390/kvm/pci.h
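
Note on the GAIT sizing used in kvm_s390_pci_aen_init(): the table holds
one 16-byte entry per possible device, and the page order is derived from
the total size.  The following userspace sketch only illustrates that
arithmetic.  gaite_sketch, order_for() and gait_order() are hypothetical
names, order_for() is a stand-in for the kernel's get_order(), and the
device counts used with it are assumptions, since the value of
ZPCI_NR_DEVICES is not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the field layout of struct zpci_gaite from this patch. */
struct gaite_sketch {
	uint32_t gisa;
	uint8_t gisc;
	uint8_t count;
	uint8_t reserved;
	uint8_t aisbo;
	uint64_t aisb;
};

static_assert(sizeof(struct gaite_sketch) == 16, "GAIT entry is 16 bytes");

/* Userspace stand-in for get_order(): smallest order that fits 'bytes'. */
static unsigned int order_for(size_t bytes, size_t page_size)
{
	unsigned int order = 0;

	while ((page_size << order) < bytes)
		order++;
	return order;
}

/* Allocation order for a GAIT with nr_devices entries (assumed count). */
static unsigned int gait_order(size_t nr_devices, size_t page_size)
{
	return order_for(nr_devices * sizeof(struct gaite_sketch), page_size);
}
```

With an assumed 512 devices and 4 KiB pages the table needs 8 KiB, so
gait_order(512, 4096) yields order 1.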

diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index 9b6c657d8d31..9ff8dc19975e 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -9,6 +9,7 @@
 #include <asm-generic/pci.h>
 #include <asm/pci_clp.h>
 #include <asm/pci_debug.h>
+#include <asm/pci_insn.h>
 #include <asm/sclp.h>
 
 #define PCIBIOS_MIN_IO		0x1000
@@ -204,6 +205,9 @@ extern const struct attribute_group *zpci_attr_groups[];
 extern unsigned int s390_pci_force_floating __initdata;
 extern unsigned int s390_pci_no_rid;
 
+extern union zpci_sic_iib *zpci_aipb;
+extern struct airq_iv *zpci_aif_sbv;
+
 /* -----------------------------------------------------------------------------
   Prototypes
 ----------------------------------------------------------------------------- */
diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
index 32759c407b8f..ad9000295c82 100644
--- a/arch/s390/include/asm/pci_insn.h
+++ b/arch/s390/include/asm/pci_insn.h
@@ -101,6 +101,7 @@ struct zpci_fib {
 /* Set Interruption Controls Operation Controls  */
 #define	SIC_IRQ_MODE_ALL		0
 #define	SIC_IRQ_MODE_SINGLE		1
+#define	SIC_SET_AENI_CONTROLS		2
 #define	SIC_IRQ_MODE_DIRECT		4
 #define	SIC_IRQ_MODE_D_ALL		16
 #define	SIC_IRQ_MODE_D_SINGLE		17
@@ -127,9 +128,20 @@ struct zpci_cdiib {
 	u64 : 64;
 } __packed __aligned(8);
 
+/* adapter interruption parameters block */
+struct zpci_aipb {
+	u64 faisb;
+	u64 gait;
+	u16 : 13;
+	u16 afi : 3;
+	u32 : 32;
+	u16 faal;
+} __packed __aligned(8);
+
 union zpci_sic_iib {
 	struct zpci_diib diib;
 	struct zpci_cdiib cdiib;
+	struct zpci_aipb aipb;
 };
 
 DECLARE_STATIC_KEY_FALSE(have_mio);
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index f9b872e358c6..a591b8cd662f 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -32,6 +32,7 @@
 #include "kvm-s390.h"
 #include "gaccess.h"
 #include "trace-s390.h"
+#include "pci.h"
 
 #define PFAULT_INIT 0x0600
 #define PFAULT_DONE 0x0680
@@ -3278,6 +3279,11 @@ void kvm_s390_gib_destroy(void)
 {
 	if (!gib)
 		return;
+	if (IS_ENABLED(CONFIG_PCI) && sclp.has_aeni && aift) {
+		mutex_lock(&aift->lock);
+		kvm_s390_pci_aen_exit();
+		mutex_unlock(&aift->lock);
+	}
 	chsc_sgib(0);
 	unregister_adapter_interrupt(&gib_alert_irq);
 	free_page((unsigned long)gib);
@@ -3315,6 +3321,14 @@ int kvm_s390_gib_init(u8 nisc)
 		goto out_unreg_gal;
 	}
 
+	if (IS_ENABLED(CONFIG_PCI) && sclp.has_aeni) {
+		if (kvm_s390_pci_aen_init(nisc)) {
+			pr_err("Initializing AEN for PCI failed\n");
+			rc = -EIO;
+			goto out_unreg_gal;
+		}
+	}
+
 	KVM_EVENT(3, "gib 0x%pK (nisc=%d) initialized", gib, gib->nisc);
 	goto out;
 
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 14a18ba5ff2c..01dc3f6883d0 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -48,6 +48,7 @@
 #include <asm/fpu/api.h>
 #include "kvm-s390.h"
 #include "gaccess.h"
+#include "pci.h"
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
@@ -503,6 +504,14 @@ int kvm_arch_init(void *opaque)
 		goto out;
 	}
 
+	if (IS_ENABLED(CONFIG_PCI)) {
+		rc = kvm_s390_pci_init();
+		if (rc) {
+			pr_err("Unable to allocate AIFT for PCI\n");
+			goto out;
+		}
+	}
+
 	rc = kvm_s390_gib_init(GAL_ISC);
 	if (rc)
 		goto out;
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index 1c33bc7bf2bd..dae853da6df1 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -10,6 +10,138 @@
 #include <linux/kvm_host.h>
 #include <linux/pci.h>
 #include <asm/kvm_pci.h>
+#include <asm/pci.h>
+#include <asm/pci_insn.h>
+#include "pci.h"
+
+struct zpci_aift *aift;
+
+static inline int __set_irq_noiib(u16 ctl, u8 isc)
+{
+	union zpci_sic_iib iib = {{0}};
+
+	return zpci_set_irq_ctrl(ctl, isc, &iib);
+}
+
+/* Caller must hold the aift lock before calling this function */
+void kvm_s390_pci_aen_exit(void)
+{
+	unsigned long flags;
+	struct kvm_zdev **gait_kzdev;
+
+	/*
+	 * The contents of the aipb remain registered for the life of the
+	 * host kernel; the information is preserved in zpci_aipb and
+	 * zpci_aif_sbv in case the KVM module is inserted again later.
+	 * Clear the AIFT information and free anything not registered with
+	 * the underlying firmware.
+	 */
+	spin_lock_irqsave(&aift->gait_lock, flags);
+	gait_kzdev = aift->kzdev;
+	aift->gait = NULL;
+	aift->sbv = NULL;
+	aift->kzdev = NULL;
+	spin_unlock_irqrestore(&aift->gait_lock, flags);
+
+	kfree(gait_kzdev);
+}
+
+int kvm_s390_pci_aen_init(u8 nisc)
+{
+	struct page *page;
+	int rc = 0, size;
+	bool first = false;
+
+	/* If already enabled for AEN, bail out now */
+	if (aift->gait || aift->sbv)
+		return -EPERM;
+
+	mutex_lock(&aift->lock);
+	aift->kzdev = kcalloc(ZPCI_NR_DEVICES, sizeof(struct kvm_zdev *),
+			      GFP_KERNEL);
+	if (!aift->kzdev) {
+		rc = -ENOMEM;
+		goto unlock;
+	}
+
+	if (!zpci_aipb) {
+		zpci_aipb = kzalloc(sizeof(union zpci_sic_iib), GFP_KERNEL);
+		if (!zpci_aipb) {
+			rc = -ENOMEM;
+			goto free_zdev;
+		}
+		first = true;
+		aift->sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, 0);
+		if (!aift->sbv) {
+			rc = -ENOMEM;
+			goto free_aipb;
+		}
+		zpci_aif_sbv = aift->sbv;
+		size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
+					    sizeof(struct zpci_gaite)));
+		page = alloc_pages(GFP_KERNEL | __GFP_ZERO, size);
+		if (!page) {
+			rc = -ENOMEM;
+			goto free_sbv;
+		}
+		aift->gait = (struct zpci_gaite *)page_to_phys(page);
+
+		zpci_aipb->aipb.faisb = virt_to_phys(aift->sbv->vector);
+		zpci_aipb->aipb.gait = virt_to_phys(aift->gait);
+		zpci_aipb->aipb.afi = nisc;
+		zpci_aipb->aipb.faal = ZPCI_NR_DEVICES;
+
+		/* Setup Adapter Event Notification Interpretation */
+		if (zpci_set_irq_ctrl(SIC_SET_AENI_CONTROLS, 0, zpci_aipb)) {
+			rc = -EIO;
+			goto free_gait;
+		}
+	} else {
+		/*
+		 * AEN registration can only happen once per system boot.  If
+		 * an aipb already exists then AEN was already registered and
+		 * we can re-use the aipb contents.  This can only happen if
+		 * the KVM module was removed and re-inserted.
+		 */
+		if (zpci_aipb->aipb.afi != nisc ||
+		    zpci_aipb->aipb.faal != ZPCI_NR_DEVICES) {
+			rc = -EINVAL;
+			goto free_zdev;
+		}
+		aift->sbv = zpci_aif_sbv;
+		aift->gait = (struct zpci_gaite *)zpci_aipb->aipb.gait;
+	}
+
+	/* Enable floating IRQs */
+	if (__set_irq_noiib(SIC_IRQ_MODE_SINGLE, nisc)) {
+		rc = -EIO;
+		kvm_s390_pci_aen_exit();
+	}
+
+	goto unlock;
+
+free_gait:
+	size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
+				    sizeof(struct zpci_gaite)));
+	free_pages((unsigned long)aift->gait, size);
+free_sbv:
+	if (first) {
+		/* If AEN setup failed, only free a newly-allocated sbv */
+		airq_iv_release(aift->sbv);
+		zpci_aif_sbv = NULL;
+	}
+free_aipb:
+	if (first) {
+		/* If AEN setup failed, only free a newly-allocated aipb */
+		kfree(zpci_aipb);
+		zpci_aipb = NULL;
+	}
+free_zdev:
+	kfree(aift->kzdev);
+unlock:
+	mutex_unlock(&aift->lock);
+	return rc;
+}
 
 int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
 {
@@ -44,3 +176,15 @@ void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
 	kzdev->kvm = kvm;
 }
 EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
+
+int kvm_s390_pci_init(void)
+{
+	aift = kzalloc(sizeof(struct zpci_aift), GFP_KERNEL);
+	if (!aift)
+		return -ENOMEM;
+
+	spin_lock_init(&aift->gait_lock);
+	mutex_init(&aift->lock);
+
+	return 0;
+}
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
new file mode 100644
index 000000000000..b2000ed7b8c3
--- /dev/null
+++ b/arch/s390/kvm/pci.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * s390 kvm PCI passthrough support
+ *
+ * Copyright IBM Corp. 2021
+ *
+ *    Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
+ */
+
+#ifndef __KVM_S390_PCI_H
+#define __KVM_S390_PCI_H
+
+#include <linux/pci.h>
+#include <linux/mutex.h>
+#include <asm/airq.h>
+#include <asm/kvm_pci.h>
+
+struct zpci_gaite {
+	u32 gisa;
+	u8 gisc;
+	u8 count;
+	u8 reserved;
+	u8 aisbo;
+	u64 aisb;
+};
+
+struct zpci_aift {
+	struct zpci_gaite *gait;
+	struct airq_iv *sbv;
+	struct kvm_zdev **kzdev;
+	spinlock_t gait_lock; /* Protects the gait, used during AEN forward */
+	struct mutex lock; /* Protects the other structures in aift */
+};
+
+extern struct zpci_aift *aift;
+
+int kvm_s390_pci_aen_init(u8 nisc);
+void kvm_s390_pci_aen_exit(void);
+
+int kvm_s390_pci_init(void);
+
+#endif /* __KVM_S390_PCI_H */
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 1e939b4cf25e..2a19becbc14c 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -61,6 +61,12 @@ DEFINE_STATIC_KEY_FALSE(have_mio);
 
 static struct kmem_cache *zdev_fmb_cache;
 
+/* AEN structures that must be preserved over KVM module re-insertion */
+union zpci_sic_iib *zpci_aipb;
+EXPORT_SYMBOL_GPL(zpci_aipb);
+struct airq_iv *zpci_aif_sbv;
+EXPORT_SYMBOL_GPL(zpci_aif_sbv);
+
 struct zpci_dev *get_zdev_by_fid(u32 fid)
 {
 	struct zpci_dev *tmp, *zdev = NULL;
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 97+ messages in thread

* [PATCH v2 16/30] KVM: s390: pci: enable host forwarding of Adapter Event Notifications
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (14 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 15/30] KVM: s390: pci: do initial setup for AEN interpretation Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-17 17:38   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 17/30] KVM: s390: mechanism to enable guest zPCI Interpretation Matthew Rosato
                   ` (15 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

In cases where interrupts are not forwarded to the guest via firmware,
KVM is responsible for ensuring delivery.  When an interrupt presents
with the forwarding bit set, we must process the forwarding tables until
all interrupts are delivered.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/kvm_host.h |  1 +
 arch/s390/include/asm/tpi.h      | 13 ++++++
 arch/s390/kvm/interrupt.c        | 76 +++++++++++++++++++++++++++++++-
 arch/s390/kvm/kvm-s390.c         |  3 +-
 arch/s390/kvm/pci.h              |  9 ++++
 5 files changed, 100 insertions(+), 2 deletions(-)
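
Note on the loop in aen_process_gait(): it scans the summary bit vector,
forwards each hit, and once a scan comes up empty it re-enables
interrupts and scans one more time to catch anything that arrived in
between.  The userspace sketch below models only that control flow.
sketch_state, scan_and_clear() and process_summary() are hypothetical
names, a single 64-bit word stands in for the summary vector, and the
sketch assumes that the scan clears each bit it returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NBITS 64

struct sketch_state {
	uint64_t summary;       /* one bit per forwarding-table entry */
	bool irq_enabled;       /* models the interrupt re-enable step */
	unsigned int forwarded; /* number of entries handed to the guest */
};

/* Return the first set bit at or after 'start' and clear it, or -1. */
static int scan_and_clear(struct sketch_state *s, int start)
{
	for (int i = start; i < NBITS; i++) {
		if (s->summary & (UINT64_C(1) << i)) {
			s->summary &= ~(UINT64_C(1) << i);
			return i;
		}
	}
	return -1;
}

static void process_summary(struct sketch_state *s)
{
	bool found = false, first = true;
	int si = 0;

	for (;;) {
		si = scan_and_clear(s, si);
		if (si < 0) {
			if (!first && !found)
				break;          /* interrupts on, clean pass */
			s->irq_enabled = true;  /* re-enable, then rescan */
			first = found = false;
			si = 0;
			continue;
		}
		found = true;
		s->forwarded++;  /* stand-in for aen_host_forward(si) */
	}
}
```

The extra pass after re-enabling is the point of the first/found pair:
the loop only exits after a scan that found nothing while interrupts
were already back on.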

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index a604d51acfc8..3f147b8d050b 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -757,6 +757,7 @@ struct kvm_vm_stat {
 	u64 inject_pfault_done;
 	u64 inject_service_signal;
 	u64 inject_virtio;
+	u64 aen_forward;
 };
 
 struct kvm_arch_memory_slot {
diff --git a/arch/s390/include/asm/tpi.h b/arch/s390/include/asm/tpi.h
index 1ac538b8cbf5..f76e5fdff23a 100644
--- a/arch/s390/include/asm/tpi.h
+++ b/arch/s390/include/asm/tpi.h
@@ -19,6 +19,19 @@ struct tpi_info {
 	u32 :12;
 } __packed __aligned(4);
 
+/* I/O-Interruption Code as stored by TPI for an Adapter I/O */
+struct tpi_adapter_info {
+	u32 aism:8;
+	u32 :22;
+	u32 error:1;
+	u32 forward:1;
+	u32 reserved;
+	u32 adapter_IO:1;
+	u32 directed_irq:1;
+	u32 isc:3;
+	u32 :27;
+} __packed __aligned(4);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_S390_TPI_H */
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index a591b8cd662f..07743c6a67c4 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -3263,11 +3263,85 @@ int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc)
 }
 EXPORT_SYMBOL_GPL(kvm_s390_gisc_unregister);
 
+static void aen_host_forward(unsigned long si)
+{
+	struct kvm_s390_gisa_interrupt *gi;
+	struct zpci_gaite *gaite;
+	struct kvm *kvm;
+
+	gaite = (struct zpci_gaite *)aift->gait +
+		(si * sizeof(struct zpci_gaite));
+	if (gaite->count == 0)
+		return;
+	if (gaite->aisb != 0)
+		set_bit_inv(gaite->aisbo, (unsigned long *)gaite->aisb);
+
+	kvm = kvm_s390_pci_si_to_kvm(aift, si);
+	if (!kvm)
+		return;
+	gi = &kvm->arch.gisa_int;
+
+	if (!(gi->origin->g1.simm & AIS_MODE_MASK(gaite->gisc)) ||
+	    !(gi->origin->g1.nimm & AIS_MODE_MASK(gaite->gisc))) {
+		gisa_set_ipm_gisc(gi->origin, gaite->gisc);
+		if (hrtimer_active(&gi->timer))
+			hrtimer_cancel(&gi->timer);
+		hrtimer_start(&gi->timer, 0, HRTIMER_MODE_REL);
+		kvm->stat.aen_forward++;
+	}
+}
+
+static void aen_process_gait(u8 isc)
+{
+	bool found = false, first = true;
+	union zpci_sic_iib iib = {{0}};
+	unsigned long si, flags;
+
+	spin_lock_irqsave(&aift->gait_lock, flags);
+
+	if (!aift->gait) {
+		spin_unlock_irqrestore(&aift->gait_lock, flags);
+		return;
+	}
+
+	for (si = 0;;) {
+		/* Scan adapter summary indicator bit vector */
+		si = airq_iv_scan(aift->sbv, si, airq_iv_end(aift->sbv));
+		if (si == -1UL) {
+			if (first || found) {
+				/* Reenable interrupts. */
+				if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, isc,
+						      &iib))
+					break;
+				first = found = false;
+			} else {
+				/* Interrupts on and all bits processed */
+				break;
+			}
+			found = false;
+			si = 0;
+			continue;
+		}
+		found = true;
+		aen_host_forward(si);
+	}
+
+	spin_unlock_irqrestore(&aift->gait_lock, flags);
+}
+
 static void gib_alert_irq_handler(struct airq_struct *airq,
 				  struct tpi_info *tpi_info)
 {
+	struct tpi_adapter_info *info = (struct tpi_adapter_info *)tpi_info;
+
 	inc_irq_stat(IRQIO_GAL);
-	process_gib_alert_list();
+
+	if (IS_ENABLED(CONFIG_PCI) && (info->forward || info->error)) {
+		aen_process_gait(info->isc);
+		if (info->aism != 0)
+			process_gib_alert_list();
+	} else
+		process_gib_alert_list();
 }
 
 static struct airq_struct gib_alert_irq = {
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 01dc3f6883d0..ab8b56deed11 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -65,7 +65,8 @@ const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
 	STATS_DESC_COUNTER(VM, inject_float_mchk),
 	STATS_DESC_COUNTER(VM, inject_pfault_done),
 	STATS_DESC_COUNTER(VM, inject_service_signal),
-	STATS_DESC_COUNTER(VM, inject_virtio)
+	STATS_DESC_COUNTER(VM, inject_virtio),
+	STATS_DESC_COUNTER(VM, aen_forward)
 };
 
 const struct kvm_stats_header kvm_vm_stats_header = {
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index b2000ed7b8c3..387b637863c9 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -12,6 +12,7 @@
 
 #include <linux/pci.h>
 #include <linux/mutex.h>
+#include <linux/kvm_host.h>
 #include <asm/airq.h>
 #include <asm/kvm_pci.h>
 
@@ -34,6 +35,14 @@ struct zpci_aift {
 
 extern struct zpci_aift *aift;
 
+static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
+						 unsigned long si)
+{
+	if (!IS_ENABLED(CONFIG_PCI) || !aift->kzdev || !aift->kzdev[si])
+		return NULL;
+	return aift->kzdev[si]->kvm;
+}
+
 int kvm_s390_pci_aen_init(u8 nisc);
 void kvm_s390_pci_aen_exit(void);
 
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 97+ messages in thread

* [PATCH v2 17/30] KVM: s390: mechanism to enable guest zPCI Interpretation
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (15 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 16/30] KVM: s390: pci: enable host forwarding of Adapter Event Notifications Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-24 14:24   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 18/30] KVM: s390: pci: provide routines for enabling/disabling interpretation Matthew Rosato
                   ` (14 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

The guest must have access to certain facilities in order to allow
interpretive execution of zPCI instructions and adapter event
notifications.  However, there are some cases where a guest might
disable interpretation -- provide a mechanism via which we can defer
enabling the associated zPCI interpretation facilities until the guest
indicates it wishes to use them.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/kvm_host.h |  4 ++++
 arch/s390/kvm/kvm-s390.c         | 40 ++++++++++++++++++++++++++++++++
 arch/s390/kvm/kvm-s390.h         | 10 ++++++++
 3 files changed, 54 insertions(+)
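
Note on the deferral described above: the per-vcpu setup is a no-op
until the VM-wide flag is set, and setting the flag re-runs the setup for
every existing vcpu.  The userspace sketch below shows just that
pattern.  The ECB bit values are taken from this patch; vcpu_sketch,
vm_sketch, vcpu_pci_setup() and vm_enable_interp() are hypothetical
names, and the locking, vcpu blocking and facility checks done by the
real code are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ECB2_ZPCI_LSI	0x02
#define ECB3_AISI	0x20
#define ECB3_AISII	0x10

struct vcpu_sketch {
	uint8_t ecb2;
	uint8_t ecb3;
};

struct vm_sketch {
	bool use_zpci_interp;
	unsigned int nr_vcpus;
	struct vcpu_sketch vcpus[4];
};

/* Only set the ECB bits once the guest has asked for interpretation. */
static void vcpu_pci_setup(const struct vm_sketch *vm,
			   struct vcpu_sketch *vcpu)
{
	if (!vm->use_zpci_interp)
		return;

	vcpu->ecb2 |= ECB2_ZPCI_LSI;
	vcpu->ecb3 |= ECB3_AISII | ECB3_AISI;
}

/* Flip the VM-wide flag and apply the setup to every existing vcpu. */
static void vm_enable_interp(struct vm_sketch *vm)
{
	vm->use_zpci_interp = true;
	for (unsigned int i = 0; i < vm->nr_vcpus; i++)
		vcpu_pci_setup(vm, &vm->vcpus[i]);
}
```

A vcpu created after the flag is set picks up the bits through the same
vcpu_pci_setup() call, which is why the setup routine is shared.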

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 3f147b8d050b..38982c1de413 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -252,7 +252,10 @@ struct kvm_s390_sie_block {
 #define ECB2_IEP	0x20
 #define ECB2_PFMFI	0x08
 #define ECB2_ESCA	0x04
+#define ECB2_ZPCI_LSI	0x02
 	__u8    ecb2;                   /* 0x0062 */
+#define ECB3_AISI	0x20
+#define ECB3_AISII	0x10
 #define ECB3_DEA 0x08
 #define ECB3_AES 0x04
 #define ECB3_RI  0x01
@@ -938,6 +941,7 @@ struct kvm_arch{
 	int use_cmma;
 	int use_pfmfi;
 	int use_skf;
+	int use_zpci_interp;
 	int user_cpu_state_ctrl;
 	int user_sigp;
 	int user_stsi;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index ab8b56deed11..b6c32fc3b272 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -1029,6 +1029,44 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
 	return 0;
 }
 
+static void kvm_s390_vcpu_pci_setup(struct kvm_vcpu *vcpu)
+{
+	/* Only set the ECB bits after guest requests zPCI interpretation */
+	if (!vcpu->kvm->arch.use_zpci_interp)
+		return;
+
+	vcpu->arch.sie_block->ecb2 |= ECB2_ZPCI_LSI;
+	vcpu->arch.sie_block->ecb3 |= ECB3_AISII | ECB3_AISI;
+}
+
+void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm)
+{
+	struct kvm_vcpu *vcpu;
+	int i;
+
+	/*
+	 * If host is configured for PCI and the necessary facilities are
+	 * available, turn on interpretation for the life of this guest
+	 */
+	if (!IS_ENABLED(CONFIG_PCI) || !sclp.has_zpci_lsi || !sclp.has_aisii ||
+	    !sclp.has_aeni || !sclp.has_aisi)
+		return;
+
+	mutex_lock(&kvm->lock);
+
+	kvm->arch.use_zpci_interp = 1;
+
+	kvm_s390_vcpu_block_all(kvm);
+
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		kvm_s390_vcpu_pci_setup(vcpu);
+		kvm_s390_sync_request(KVM_REQ_VSIE_RESTART, vcpu);
+	}
+
+	kvm_s390_vcpu_unblock_all(kvm);
+	mutex_unlock(&kvm->lock);
+}
+
 static void kvm_s390_sync_request_broadcast(struct kvm *kvm, int req)
 {
 	int cx;
@@ -3282,6 +3320,8 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
 
 	kvm_s390_vcpu_crypto_setup(vcpu);
 
+	kvm_s390_vcpu_pci_setup(vcpu);
+
 	mutex_lock(&vcpu->kvm->lock);
 	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
 		rc = kvm_s390_pv_create_cpu(vcpu, &uvrc, &uvrrc);
diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
index c07a050d757d..a2eccb8b977e 100644
--- a/arch/s390/kvm/kvm-s390.h
+++ b/arch/s390/kvm/kvm-s390.h
@@ -481,6 +481,16 @@ void kvm_s390_reinject_machine_check(struct kvm_vcpu *vcpu,
  */
 void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm);
 
+/**
+ * kvm_s390_vcpu_pci_enable_interp
+ *
+ * Set the associated PCI attributes for each vcpu to allow for zPCI Load/Store
+ * interpretation as well as adapter interruption forwarding.
+ *
+ * @kvm: the KVM guest
+ */
+void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm);
+
 /**
  * diag9c_forwarding_hz
  *
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 97+ messages in thread

* [PATCH v2 18/30] KVM: s390: pci: provide routines for enabling/disabling interpretation
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (16 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 17/30] KVM: s390: mechanism to enable guest zPCI Interpretation Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-24 14:36   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 19/30] KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding Matthew Rosato
                   ` (13 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

These routines will be wired into the vfio_pci_zdev ioctl handlers to
respond to requests to enable / disable a device for zPCI Load/Store
interpretation.

The first time such a request is received, enable the necessary facilities
for the guest.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/kvm_pci.h |  4 ++
 arch/s390/kvm/pci.c             | 99 +++++++++++++++++++++++++++++++++
 arch/s390/pci/pci.c             |  3 +
 3 files changed, 106 insertions(+)

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index aafee2976929..072401aa7922 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -26,4 +26,8 @@ int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
 void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
 void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
 
+int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
+int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
+int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
+
 #endif /* ASM_KVM_PCI_H */
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index dae853da6df1..122d0992b521 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -12,7 +12,9 @@
 #include <asm/kvm_pci.h>
 #include <asm/pci.h>
 #include <asm/pci_insn.h>
+#include <asm/sclp.h>
 #include "pci.h"
+#include "kvm-s390.h"
 
 struct zpci_aift *aift;
 
@@ -143,6 +145,103 @@ int kvm_s390_pci_aen_init(u8 nisc)
 	return rc;
 }
 
+int kvm_s390_pci_interp_probe(struct zpci_dev *zdev)
+{
+	/* Must have appropriate hardware facilities */
+	if (!(sclp.has_zpci_lsi && test_facility(69)))
+		return -EINVAL;
+
+	/* Must have a KVM association registered */
+	if (!zdev->kzdev || !zdev->kzdev->kvm)
+		return -EINVAL;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_probe);
+
+int kvm_s390_pci_interp_enable(struct zpci_dev *zdev)
+{
+	u32 gd;
+	int rc;
+
+	if (!zdev->kzdev || !zdev->kzdev->kvm)
+		return -EINVAL;
+
+	/*
+	 * If this is the first request to use an interpreted device, make the
+	 * necessary vcpu changes
+	 */
+	if (!zdev->kzdev->kvm->arch.use_zpci_interp)
+		kvm_s390_vcpu_pci_enable_interp(zdev->kzdev->kvm);
+
+	/*
+	 * In the event of a system reset in userspace, the GISA designation
+	 * may still be assigned because the device is still enabled.
+	 * Verify it's the same guest before proceeding.
+	 */
+	gd = (u32)(u64)&zdev->kzdev->kvm->arch.sie_page2->gisa;
+	if (zdev->gd != 0 && zdev->gd != gd)
+		return -EPERM;
+
+	if (zdev_enabled(zdev)) {
+		zdev->gd = 0;
+		rc = zpci_disable_device(zdev);
+		if (rc)
+			return rc;
+	}
+
+	/*
+	 * Store information about the identity of the kvm guest allowed to
+	 * access this device via interpretation to be used by host CLP
+	 */
+	zdev->gd = gd;
+
+	rc = zpci_enable_device(zdev);
+	if (rc)
+		goto err;
+
+	/* Re-register the IOMMU that was already created */
+	rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
+				virt_to_phys(zdev->dma_table));
+	if (rc)
+		goto err;
+
+	return rc;
+
+err:
+	zdev->gd = 0;
+	return rc;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_enable);
+
+int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
+{
+	int rc;
+
+	if (zdev->gd == 0)
+		return -EINVAL;
+
+	/* Remove the host CLP guest designation */
+	zdev->gd = 0;
+
+	if (zdev_enabled(zdev)) {
+		rc = zpci_disable_device(zdev);
+		if (rc)
+			return rc;
+	}
+
+	rc = zpci_enable_device(zdev);
+	if (rc)
+		return rc;
+
+	/* Re-register the IOMMU that was already created */
+	rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
+				virt_to_phys(zdev->dma_table));
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_disable);
+
 int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
 {
 	struct kvm_zdev *kzdev;
diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
index 2a19becbc14c..58673f633869 100644
--- a/arch/s390/pci/pci.c
+++ b/arch/s390/pci/pci.c
@@ -147,6 +147,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
 		zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
 	return cc;
 }
+EXPORT_SYMBOL_GPL(zpci_register_ioat);
 
 /* Modify PCI: Unregister I/O address translation parameters */
 int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
@@ -727,6 +728,7 @@ int zpci_enable_device(struct zpci_dev *zdev)
 		zpci_update_fh(zdev, fh);
 	return rc;
 }
+EXPORT_SYMBOL_GPL(zpci_enable_device);
 
 int zpci_disable_device(struct zpci_dev *zdev)
 {
@@ -750,6 +752,7 @@ int zpci_disable_device(struct zpci_dev *zdev)
 	}
 	return rc;
 }
+EXPORT_SYMBOL_GPL(zpci_disable_device);
 
 /**
  * zpci_hot_reset_device - perform a reset of the given zPCI function
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 97+ messages in thread

* [PATCH v2 19/30] KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (17 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 18/30] KVM: s390: pci: provide routines for enabling/disabling interpretation Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-25 12:41   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 20/30] KVM: s390: pci: provide routines for enabling/disabling IOAT assist Matthew Rosato
                   ` (12 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

These routines will be wired into the vfio_pci_zdev ioctl handlers to
respond to requests to enable / disable a device for Adapter Event
Notifications / Adapter Interruption Forwarding.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/kvm_pci.h |   7 ++
 arch/s390/kvm/pci.c             | 203 ++++++++++++++++++++++++++++++++
 arch/s390/pci/pci_insn.c        |   1 +
 3 files changed, 211 insertions(+)
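
Note on the AISB addressing used when filling in the FIB: a summary bit
index is split into the byte offset of the 64-bit word that contains it
and the bit offset within that word.  The userspace sketch below shows
the intended split.  sbv_pos and sbv_locate() are hypothetical names,
and the byte offset is meant to be added to the address of the vector as
a plain integer, not via pointer arithmetic on an unsigned long pointer.

```c
#include <assert.h>
#include <stddef.h>

struct sbv_pos {
	size_t byte_offset;	/* offset of the containing 64-bit word */
	unsigned int bit;	/* bit offset within that word */
};

/* Split a summary bit index into word byte offset and bit offset. */
static struct sbv_pos sbv_locate(unsigned long aisb)
{
	struct sbv_pos pos = {
		.byte_offset = (aisb / 64) * 8,
		.bit = (unsigned int)(aisb & 63),
	};

	return pos;
}
```

For example, summary bit 70 lives in the second 64-bit word, so
sbv_locate(70) gives byte offset 8 and bit offset 6.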

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 072401aa7922..01fe14fffd7a 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -16,16 +16,23 @@
 #include <linux/kvm_host.h>
 #include <linux/kvm.h>
 #include <linux/pci.h>
+#include <asm/pci_insn.h>
 
 struct kvm_zdev {
 	struct zpci_dev *zdev;
 	struct kvm *kvm;
+	struct zpci_fib fib;
 };
 
 int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
 void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
 void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
 
+int kvm_s390_pci_aif_probe(struct zpci_dev *zdev);
+int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
+			    bool assist);
+int kvm_s390_pci_aif_disable(struct zpci_dev *zdev);
+
 int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
 int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
 int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index 122d0992b521..7ed9abc476b6 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -12,6 +12,7 @@
 #include <asm/kvm_pci.h>
 #include <asm/pci.h>
 #include <asm/pci_insn.h>
+#include <asm/pci_io.h>
 #include <asm/sclp.h>
 #include "pci.h"
 #include "kvm-s390.h"
@@ -145,6 +146,204 @@ int kvm_s390_pci_aen_init(u8 nisc)
 	return rc;
 }
 
+/* Modify PCI: Register floating adapter interruption forwarding */
+static int kvm_zpci_set_airq(struct zpci_dev *zdev)
+{
+	u64 req = ZPCI_CREATE_REQ(zdev->fh, 0, ZPCI_MOD_FC_REG_INT);
+	struct zpci_fib fib = {0};
+	u8 status;
+
+	fib.fmt0.isc = zdev->kzdev->fib.fmt0.isc;
+	fib.fmt0.sum = 1;       /* enable summary notifications */
+	fib.fmt0.noi = airq_iv_end(zdev->aibv);
+	fib.fmt0.aibv = virt_to_phys(zdev->aibv->vector);
+	fib.fmt0.aibvo = 0;
+	fib.fmt0.aisb = virt_to_phys(aift->sbv->vector) + (zdev->aisb / 64) * 8;
+	fib.fmt0.aisbo = zdev->aisb & 63;
+	fib.gd = zdev->gd;
+
+	return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
+}
+
+/* Modify PCI: Unregister floating adapter interruption forwarding */
+static int kvm_zpci_clear_airq(struct zpci_dev *zdev)
+{
+	u64 req = ZPCI_CREATE_REQ(zdev->fh, 0, ZPCI_MOD_FC_DEREG_INT);
+	struct zpci_fib fib = {0};
+	u8 cc, status;
+
+	fib.gd = zdev->gd;
+
+	cc = zpci_mod_fc(req, &fib, &status);
+	if (cc == 3 || (cc == 1 && status == 24))
+		/* Function already gone or IRQs already deregistered. */
+		cc = 0;
+
+	return cc ? -EIO : 0;
+}
+
+int kvm_s390_pci_aif_probe(struct zpci_dev *zdev)
+{
+	/* Must have appropriate hardware facilities */
+	if (!(sclp.has_aeni && test_facility(71)))
+		return -EINVAL;
+
+	/* Must have a KVM association registered */
+	if (!zdev->kzdev || !zdev->kzdev->kvm)
+		return -EINVAL;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_probe);
+
+int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
+			    bool assist)
+{
+	struct page *aibv_page, *aisb_page = NULL;
+	unsigned int msi_vecs, idx;
+	struct zpci_gaite *gaite;
+	unsigned long bit;
+	struct kvm *kvm;
+	phys_addr_t gaddr;
+	int rc = 0;
+
+	/*
+	 * Interrupt forwarding is only applicable if the device is already
+	 * enabled for interpretation
+	 */
+	if (zdev->gd == 0)
+		return -EINVAL;
+
+	kvm = zdev->kzdev->kvm;
+	msi_vecs = min_t(unsigned int, fib->fmt0.noi, zdev->max_msi);
+
+	/* Replace AIBV address */
+	idx = srcu_read_lock(&kvm->srcu);
+	aibv_page = gfn_to_page(kvm, gpa_to_gfn((gpa_t)fib->fmt0.aibv));
+	srcu_read_unlock(&kvm->srcu, idx);
+	if (is_error_page(aibv_page)) {
+		rc = -EIO;
+		goto out;
+	}
+	gaddr = page_to_phys(aibv_page) + (fib->fmt0.aibv & ~PAGE_MASK);
+	fib->fmt0.aibv = gaddr;
+
+	/* Pin the guest AISB if one was specified */
+	if (fib->fmt0.sum == 1) {
+		idx = srcu_read_lock(&kvm->srcu);
+		aisb_page = gfn_to_page(kvm, gpa_to_gfn((gpa_t)fib->fmt0.aisb));
+		srcu_read_unlock(&kvm->srcu, idx);
+		if (is_error_page(aisb_page)) {
+			rc = -EIO;
+			goto unpin1;
+		}
+	}
+
+	/* AISB must be allocated before we can fill in GAITE */
+	mutex_lock(&aift->lock);
+	bit = airq_iv_alloc_bit(aift->sbv);
+	if (bit == -1UL)
+		goto unpin2;
+	zdev->aisb = bit;
+	zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA |
+					      AIRQ_IV_BITLOCK |
+					      AIRQ_IV_GUESTVEC,
+				    (unsigned long *)fib->fmt0.aibv);
+
+	spin_lock_irq(&aift->gait_lock);
+	gaite = (struct zpci_gaite *)aift->gait + (zdev->aisb *
+						   sizeof(struct zpci_gaite));
+
+	/* If assist not requested, host will get all alerts */
+	if (assist)
+		gaite->gisa = (u32)(u64)&kvm->arch.sie_page2->gisa;
+	else
+		gaite->gisa = 0;
+
+	gaite->gisc = fib->fmt0.isc;
+	gaite->count++;
+	gaite->aisbo = fib->fmt0.aisbo;
+	gaite->aisb = virt_to_phys(page_address(aisb_page) + (fib->fmt0.aisb &
+							      ~PAGE_MASK));
+	aift->kzdev[zdev->aisb] = zdev->kzdev;
+	spin_unlock_irq(&aift->gait_lock);
+
+	/* Update guest FIB for re-issue */
+	fib->fmt0.aisbo = zdev->aisb & 63;
+	fib->fmt0.aisb = virt_to_phys(aift->sbv->vector + (zdev->aisb / 64) * 8);
+	fib->fmt0.isc = kvm_s390_gisc_register(kvm, gaite->gisc);
+
+	/* Save some guest fib values in the host for later use */
+	zdev->kzdev->fib.fmt0.isc = fib->fmt0.isc;
+	zdev->kzdev->fib.fmt0.aibv = fib->fmt0.aibv;
+	mutex_unlock(&aift->lock);
+
+	/* Issue the clp to setup the irq now */
+	rc = kvm_zpci_set_airq(zdev);
+	return rc;
+
+unpin2:
+	mutex_unlock(&aift->lock);
+	if (fib->fmt0.sum == 1) {
+		gaddr = page_to_phys(aisb_page);
+		kvm_release_pfn_dirty(gaddr >> PAGE_SHIFT);
+	}
+unpin1:
+	kvm_release_pfn_dirty(fib->fmt0.aibv >> PAGE_SHIFT);
+out:
+	return rc;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_enable);
+
+int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
+{
+	struct kvm_zdev *kzdev = zdev->kzdev;
+	struct zpci_gaite *gaite;
+	int rc;
+	u8 isc;
+
+	if (zdev->gd == 0)
+		return -EINVAL;
+
+	/* Even if the clear fails due to an error, clear the GAITE */
+	rc = kvm_zpci_clear_airq(zdev);
+
+	mutex_lock(&aift->lock);
+	if (zdev->kzdev->fib.fmt0.aibv == 0)
+		goto out;
+	spin_lock_irq(&aift->gait_lock);
+	gaite = (struct zpci_gaite *)aift->gait + (zdev->aisb *
+						   sizeof(struct zpci_gaite));
+	isc = gaite->gisc;
+	gaite->count--;
+	if (gaite->count == 0) {
+		/* Release guest AIBV and AISB */
+		kvm_release_pfn_dirty(kzdev->fib.fmt0.aibv >> PAGE_SHIFT);
+		if (gaite->aisb != 0)
+			kvm_release_pfn_dirty(gaite->aisb >> PAGE_SHIFT);
+		/* Clear the GAIT entry */
+		gaite->aisb = 0;
+		gaite->gisc = 0;
+		gaite->aisbo = 0;
+		gaite->gisa = 0;
+		aift->kzdev[zdev->aisb] = 0;
+		/* Clear zdev info */
+		airq_iv_free_bit(aift->sbv, zdev->aisb);
+		airq_iv_release(zdev->aibv);
+		zdev->aisb = 0;
+		zdev->aibv = NULL;
+	}
+	spin_unlock_irq(&aift->gait_lock);
+	kvm_s390_gisc_unregister(kzdev->kvm, isc);
+	kzdev->fib.fmt0.isc = 0;
+	kzdev->fib.fmt0.aibv = 0;
+out:
+	mutex_unlock(&aift->lock);
+
+	return rc;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_disable);
+
 int kvm_s390_pci_interp_probe(struct zpci_dev *zdev)
 {
 	/* Must have appropriate hardware facilities */
@@ -221,6 +420,10 @@ int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
 	if (zdev->gd == 0)
 		return -EINVAL;
 
+	/* Forwarding must be turned off before interpretation */
+	if (zdev->kzdev->fib.fmt0.aibv != 0)
+		kvm_s390_pci_aif_disable(zdev);
+
 	/* Remove the host CLP guest designation */
 	zdev->gd = 0;
 
diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
index ca6399d52767..f7d0e29bbf0b 100644
--- a/arch/s390/pci/pci_insn.c
+++ b/arch/s390/pci/pci_insn.c
@@ -59,6 +59,7 @@ u8 zpci_mod_fc(u64 req, struct zpci_fib *fib, u8 *status)
 
 	return cc;
 }
+EXPORT_SYMBOL_GPL(zpci_mod_fc);
 
 /* Refresh PCI Translations */
 static inline u8 __rpcit(u64 fn, u64 addr, u64 range, u8 *status)
-- 
2.27.0


* [PATCH v2 20/30] KVM: s390: pci: provide routines for enabling/disabling IOAT assist
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (18 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 19/30] KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-25 13:29   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 21/30] KVM: s390: pci: handle refresh of PCI translations Matthew Rosato
                   ` (11 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

These routines will be wired into the vfio_pci_zdev ioctl handlers to
respond to requests to enable / disable a device for PCI I/O Address
Translation assistance.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/kvm_pci.h |  15 ++++
 arch/s390/include/asm/pci_dma.h |   2 +
 arch/s390/kvm/pci.c             | 139 ++++++++++++++++++++++++++++++++
 arch/s390/kvm/pci.h             |   2 +
 4 files changed, 158 insertions(+)
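
The sizing of the shadow IOAT bookkeeping can be sketched in user space as
follows.  This is only an illustration: the SKETCH_* names are hypothetical,
and the 16 KiB table size, 4 KiB page size and 8-byte entry size are
assumptions rather than values taken from this patch.  Only the derived
relationships (entries, pages per table, flat seg[] slots) mirror the macros
added to pci_dma.h.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed base values; they are not part of this patch. */
#define SKETCH_PAGE_SHIFT	12		/* 4 KiB pages (assumption) */
#define SKETCH_TABLE_SIZE	0x4000UL	/* 16 KiB per table (assumption) */
#define SKETCH_ENTRY_SIZE	8UL		/* 64-bit unsigned long (assumption) */

/* Derived the same way as ZPCI_TABLE_ENTRIES / _PAGES / _ENTRIES_PAGES. */
#define SKETCH_TABLE_ENTRIES	(SKETCH_TABLE_SIZE / SKETCH_ENTRY_SIZE)
#define SKETCH_TABLE_PAGES	(SKETCH_TABLE_SIZE >> SKETCH_PAGE_SHIFT)
#define SKETCH_ENTRIES_PAGES	(SKETCH_TABLE_ENTRIES * SKETCH_TABLE_PAGES)

/*
 * First slot in the flat seg[] array that belongs to region-table index
 * rtx.  Each region-table entry owns SKETCH_TABLE_PAGES consecutive slots,
 * one per pinned page of the guest segment table.
 */
static inline size_t sketch_seg_base(size_t rtx)
{
	assert(rtx < SKETCH_TABLE_ENTRIES);
	return rtx * SKETCH_TABLE_PAGES;
}
```

Under these assumptions head[] holds one pointer per page of the guest
region table, and seg[] holds SKETCH_ENTRIES_PAGES pointers in total, which
matches the stride of ZPCI_TABLE_PAGES used when freeing segment entries.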

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 01fe14fffd7a..770849f13a70 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -16,11 +16,21 @@
 #include <linux/kvm_host.h>
 #include <linux/kvm.h>
 #include <linux/pci.h>
+#include <linux/mutex.h>
 #include <asm/pci_insn.h>
+#include <asm/pci_dma.h>
+
+struct kvm_zdev_ioat {
+	unsigned long *head[ZPCI_TABLE_PAGES];
+	unsigned long **seg;
+	unsigned long ***pt;
+	struct mutex lock;
+};
 
 struct kvm_zdev {
 	struct zpci_dev *zdev;
 	struct kvm *kvm;
+	struct kvm_zdev_ioat ioat;
 	struct zpci_fib fib;
 };
 
@@ -33,6 +43,11 @@ int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
 			    bool assist);
 int kvm_s390_pci_aif_disable(struct zpci_dev *zdev);
 
+int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev);
+int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota);
+int kvm_s390_pci_ioat_disable(struct zpci_dev *zdev);
+u8 kvm_s390_pci_get_dtsm(struct zpci_dev *zdev);
+
 int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
 int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
 int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
index 91e63426bdc5..69e616d0712c 100644
--- a/arch/s390/include/asm/pci_dma.h
+++ b/arch/s390/include/asm/pci_dma.h
@@ -50,6 +50,8 @@ enum zpci_ioat_dtype {
 #define ZPCI_TABLE_ALIGN		ZPCI_TABLE_SIZE
 #define ZPCI_TABLE_ENTRY_SIZE		(sizeof(unsigned long))
 #define ZPCI_TABLE_ENTRIES		(ZPCI_TABLE_SIZE / ZPCI_TABLE_ENTRY_SIZE)
+#define ZPCI_TABLE_PAGES		(ZPCI_TABLE_SIZE >> PAGE_SHIFT)
+#define ZPCI_TABLE_ENTRIES_PAGES	(ZPCI_TABLE_ENTRIES * ZPCI_TABLE_PAGES)
 
 #define ZPCI_TABLE_BITS			11
 #define ZPCI_PT_BITS			8
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index 7ed9abc476b6..39c13c25a700 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -13,12 +13,15 @@
 #include <asm/pci.h>
 #include <asm/pci_insn.h>
 #include <asm/pci_io.h>
+#include <asm/pci_dma.h>
 #include <asm/sclp.h>
 #include "pci.h"
 #include "kvm-s390.h"
 
 struct zpci_aift *aift;
 
+#define shadow_ioat_init zdev->kzdev->ioat.head[0]
+
 static inline int __set_irq_noiib(u16 ctl, u8 isc)
 {
 	union zpci_sic_iib iib = {{0}};
@@ -344,6 +347,135 @@ int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
 }
 EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_disable);
 
+int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev)
+{
+	/* Must have a KVM association registered */
+	if (!zdev->kzdev || !zdev->kzdev->kvm)
+		return -EINVAL;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_probe);
+
+int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota)
+{
+	gpa_t gpa = (gpa_t)(iota & ZPCI_RTE_ADDR_MASK);
+	struct kvm_zdev_ioat *ioat;
+	struct page *page;
+	struct kvm *kvm;
+	unsigned int idx;
+	void *iaddr;
+	int i, rc = 0;
+
+	if (shadow_ioat_init)
+		return -EINVAL;
+
+	/* Ensure supported type specified */
+	if ((iota & ZPCI_IOTA_RTTO_FLAG) != ZPCI_IOTA_RTTO_FLAG)
+		return -EINVAL;
+
+	kvm = zdev->kzdev->kvm;
+	ioat = &zdev->kzdev->ioat;
+	mutex_lock(&ioat->lock);
+	idx = srcu_read_lock(&kvm->srcu);
+	for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
+		page = gfn_to_page(kvm, gpa_to_gfn(gpa));
+		if (is_error_page(page)) {
+			srcu_read_unlock(&kvm->srcu, idx);
+			rc = -EIO;
+			goto out;
+		}
+		iaddr = page_to_virt(page) + (gpa & ~PAGE_MASK);
+		ioat->head[i] = (unsigned long *)iaddr;
+		gpa += PAGE_SIZE;
+	}
+	srcu_read_unlock(&kvm->srcu, idx);
+
+	zdev->kzdev->ioat.seg = kcalloc(ZPCI_TABLE_ENTRIES_PAGES,
+					sizeof(unsigned long *), GFP_KERNEL);
+	if (!zdev->kzdev->ioat.seg)
+		goto unpin;
+	zdev->kzdev->ioat.pt = kcalloc(ZPCI_TABLE_ENTRIES,
+				       sizeof(unsigned long **), GFP_KERNEL);
+	if (!zdev->kzdev->ioat.pt)
+		goto free_seg;
+
+out:
+	mutex_unlock(&ioat->lock);
+	return rc;
+
+free_seg:
+	kfree(zdev->kzdev->ioat.seg);
+unpin:
+	for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
+		kvm_release_pfn_dirty((u64)ioat->head[i] >> PAGE_SHIFT);
+		ioat->head[i] = 0;
+	}
+	mutex_unlock(&ioat->lock);
+	return -ENOMEM;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_enable);
+
+static void free_pt_entry(struct kvm_zdev_ioat *ioat, int st, int pt)
+{
+	if (!ioat->pt[st][pt])
+		return;
+
+	kvm_release_pfn_dirty((u64)ioat->pt[st][pt]);
+}
+
+static void free_seg_entry(struct kvm_zdev_ioat *ioat, int entry)
+{
+	int i, st, count = 0;
+
+	for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
+		if (ioat->seg[entry + i]) {
+			kvm_release_pfn_dirty((u64)ioat->seg[entry + i]);
+			count++;
+		}
+	}
+
+	if (count == 0)
+		return;
+
+	st = entry / ZPCI_TABLE_PAGES;
+	for (i = 0; i < ZPCI_TABLE_ENTRIES; i++)
+		free_pt_entry(ioat, st, i);
+	kfree(ioat->pt[st]);
+}
+
+int kvm_s390_pci_ioat_disable(struct zpci_dev *zdev)
+{
+	struct kvm_zdev_ioat *ioat;
+	int i;
+
+	if (!shadow_ioat_init)
+		return -EINVAL;
+
+	ioat = &zdev->kzdev->ioat;
+	mutex_lock(&ioat->lock);
+	for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
+		kvm_release_pfn_dirty((u64)ioat->head[i] >> PAGE_SHIFT);
+		ioat->head[i] = 0;
+	}
+
+	for (i = 0; i < ZPCI_TABLE_ENTRIES_PAGES; i += ZPCI_TABLE_PAGES)
+		free_seg_entry(ioat, i);
+
+	kfree(ioat->seg);
+	kfree(ioat->pt);
+	mutex_unlock(&ioat->lock);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_disable);
+
+u8 kvm_s390_pci_get_dtsm(struct zpci_dev *zdev)
+{
+	return (zdev->dtsm & KVM_S390_PCI_DTSM_MASK);
+}
+EXPORT_SYMBOL_GPL(kvm_s390_pci_get_dtsm);
+
 int kvm_s390_pci_interp_probe(struct zpci_dev *zdev)
 {
 	/* Must have appropriate hardware facilities */
@@ -424,6 +556,10 @@ int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
 	if (zdev->kzdev->fib.fmt0.aibv != 0)
 		kvm_s390_pci_aif_disable(zdev);
 
+	/* If we are using the IOAT assist, disable it now */
+	if (zdev->kzdev->ioat.head[0])
+		kvm_s390_pci_ioat_disable(zdev);
+
 	/* Remove the host CLP guest designation */
 	zdev->gd = 0;
 
@@ -453,6 +589,8 @@ int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
 	if (!kzdev)
 		return -ENOMEM;
 
+	mutex_init(&kzdev->ioat.lock);
+
 	kzdev->zdev = zdev;
 	zdev->kzdev = kzdev;
 
@@ -467,6 +605,7 @@ void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
 	kzdev = zdev->kzdev;
 	WARN_ON(kzdev->zdev != zdev);
 	zdev->kzdev = 0;
+	mutex_destroy(&kzdev->ioat.lock);
 	kfree(kzdev);
 }
 EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index 387b637863c9..54355634df82 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -16,6 +16,8 @@
 #include <asm/airq.h>
 #include <asm/kvm_pci.h>
 
+#define KVM_S390_PCI_DTSM_MASK 0x40
+
 struct zpci_gaite {
 	u32 gisa;
 	u8 gisc;
-- 
2.27.0


* [PATCH v2 21/30] KVM: s390: pci: handle refresh of PCI translations
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (19 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 20/30] KVM: s390: pci: provide routines for enabling/disabling IOAT assist Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-19  9:29   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 22/30] KVM: s390: intercept the rpcit instruction Matthew Rosato
                   ` (10 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

Add a routine that shadows guest IOAT entries into the host IOAT.  A
subsequent patch will invoke it in response to an RPCIT instruction
intercept (intercept code 04).

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/kvm_pci.h |   1 +
 arch/s390/include/asm/pci_dma.h |   1 +
 arch/s390/kvm/pci.c             | 208 +++++++++++++++++++++++++++++++-
 arch/s390/kvm/pci.h             |   8 +-
 4 files changed, 216 insertions(+), 2 deletions(-)
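
The per-entry decision made while shadowing can be sketched as a pure
function.  This is a simplified user-space illustration, not the kernel
code: the sketch_* names are hypothetical, page pinning and release are
left out, and guest validity is passed as a flag instead of being derived
from the entry itself.  It only mirrors which cases ask for a host RPCIT
afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_action {
	SKETCH_NOOP,		/* neither host nor guest entry is valid */
	SKETCH_NEW,		/* host entry invalid, guest entry valid */
	SKETCH_DUPLICATE,	/* host entry already maps the same address */
	SKETCH_REPLACE,		/* host entry valid, guest maps another address */
	SKETCH_INVALIDATE,	/* host entry valid, guest entry no longer valid */
};

/* Classify one host/guest entry pair (addresses already translated). */
static enum sketch_action sketch_classify(bool host_valid, uint64_t host_addr,
					  bool guest_valid, uint64_t guest_addr)
{
	if (!host_valid)
		return guest_valid ? SKETCH_NEW : SKETCH_NOOP;
	if (!guest_valid)
		return SKETCH_INVALIDATE;
	return host_addr == guest_addr ? SKETCH_DUPLICATE : SKETCH_REPLACE;
}

/* Only changes to a previously valid host entry need a host refresh. */
static bool sketch_needs_host_rpcit(enum sketch_action action)
{
	return action == SKETCH_REPLACE || action == SKETCH_INVALIDATE;
}
```

In the patch the same distinction is carried by the return value of
dma_shadow_cpu_trans(): the replace and invalidate cases return 1, and a
positive sum over the range triggers zpci_refresh_trans() on the host.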

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 770849f13a70..fa90729a35cf 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -30,6 +30,7 @@ struct kvm_zdev_ioat {
 struct kvm_zdev {
 	struct zpci_dev *zdev;
 	struct kvm *kvm;
+	u64 rpcit_count;
 	struct kvm_zdev_ioat ioat;
 	struct zpci_fib fib;
 };
diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
index 69e616d0712c..38004e0a4383 100644
--- a/arch/s390/include/asm/pci_dma.h
+++ b/arch/s390/include/asm/pci_dma.h
@@ -52,6 +52,7 @@ enum zpci_ioat_dtype {
 #define ZPCI_TABLE_ENTRIES		(ZPCI_TABLE_SIZE / ZPCI_TABLE_ENTRY_SIZE)
 #define ZPCI_TABLE_PAGES		(ZPCI_TABLE_SIZE >> PAGE_SHIFT)
 #define ZPCI_TABLE_ENTRIES_PAGES	(ZPCI_TABLE_ENTRIES * ZPCI_TABLE_PAGES)
+#define ZPCI_TABLE_ENTRIES_PER_PAGE	(ZPCI_TABLE_ENTRIES / ZPCI_TABLE_PAGES)
 
 #define ZPCI_TABLE_BITS			11
 #define ZPCI_PT_BITS			8
diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
index 39c13c25a700..38d2b77ec565 100644
--- a/arch/s390/kvm/pci.c
+++ b/arch/s390/kvm/pci.c
@@ -149,6 +149,208 @@ int kvm_s390_pci_aen_init(u8 nisc)
 	return rc;
 }
 
+static int dma_shadow_cpu_trans(struct kvm_vcpu *vcpu, unsigned long *entry,
+				unsigned long *gentry)
+{
+	phys_addr_t gaddr = 0;
+	unsigned long idx;
+	struct page *page;
+	kvm_pfn_t pfn;
+	gpa_t addr;
+	int rc = 0;
+
+	if (pt_entry_isvalid(*gentry)) {
+		/* pin and validate */
+		addr = *gentry & ZPCI_PTE_ADDR_MASK;
+		idx = srcu_read_lock(&vcpu->kvm->srcu);
+		page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
+		srcu_read_unlock(&vcpu->kvm->srcu, idx);
+		if (is_error_page(page))
+			return -EIO;
+		gaddr = page_to_phys(page) + (addr & ~PAGE_MASK);
+	}
+
+	if (pt_entry_isvalid(*entry)) {
+		/* Either we are invalidating, replacing or no-op */
+		if (gaddr != 0) {
+			if ((*entry & ZPCI_PTE_ADDR_MASK) == gaddr) {
+				/* Duplicate */
+				kvm_release_pfn_dirty(*entry >> PAGE_SHIFT);
+			} else {
+				/* Replace */
+				pfn = (*entry >> PAGE_SHIFT);
+				invalidate_pt_entry(entry);
+				set_pt_pfaa(entry, gaddr);
+				validate_pt_entry(entry);
+				kvm_release_pfn_dirty(pfn);
+				rc = 1;
+			}
+		} else {
+			/* Invalidate */
+			pfn = (*entry >> PAGE_SHIFT);
+			invalidate_pt_entry(entry);
+			kvm_release_pfn_dirty(pfn);
+			rc = 1;
+		}
+	} else if (gaddr != 0) {
+		/* New Entry */
+		set_pt_pfaa(entry, gaddr);
+		validate_pt_entry(entry);
+	}
+
+	return rc;
+}
+
+static unsigned long *dma_walk_guest_cpu_trans(struct kvm_vcpu *vcpu,
+					       struct kvm_zdev_ioat *ioat,
+					       dma_addr_t dma_addr)
+{
+	unsigned long *rto, *sto, *pto;
+	unsigned int rtx, rts, sx, px, idx;
+	struct page *page;
+	gpa_t addr;
+	int i;
+
+	/* Pin guest segment table if needed */
+	rtx = calc_rtx(dma_addr);
+	rto = ioat->head[(rtx / ZPCI_TABLE_ENTRIES_PER_PAGE)];
+	rts = rtx * ZPCI_TABLE_PAGES;
+	if (!ioat->seg[rts]) {
+		if (!reg_entry_isvalid(rto[rtx % ZPCI_TABLE_ENTRIES_PER_PAGE]))
+			return NULL;
+		sto = get_rt_sto(rto[rtx % ZPCI_TABLE_ENTRIES_PER_PAGE]);
+		addr = ((u64)sto & ZPCI_RTE_ADDR_MASK);
+		idx = srcu_read_lock(&vcpu->kvm->srcu);
+		for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
+			page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
+			if (is_error_page(page)) {
+				srcu_read_unlock(&vcpu->kvm->srcu, idx);
+				return NULL;
+			}
+			ioat->seg[rts + i] = page_to_virt(page) +
+					     (addr & ~PAGE_MASK);
+			addr += PAGE_SIZE;
+		}
+		srcu_read_unlock(&vcpu->kvm->srcu, idx);
+	}
+
+	/* Allocate pin pointers for another segment table if needed */
+	if (!ioat->pt[rtx]) {
+		ioat->pt[rtx] = kcalloc(ZPCI_TABLE_ENTRIES,
+					(sizeof(unsigned long *)), GFP_KERNEL);
+		if (!ioat->pt[rtx])
+			return NULL;
+	}
+	/* Pin guest page table if needed */
+	sx = calc_sx(dma_addr);
+	sto = ioat->seg[(rts + (sx / ZPCI_TABLE_ENTRIES_PER_PAGE))];
+	if (!ioat->pt[rtx][sx]) {
+		if (!reg_entry_isvalid(sto[sx % ZPCI_TABLE_ENTRIES_PER_PAGE]))
+			return NULL;
+		pto = get_st_pto(sto[sx % ZPCI_TABLE_ENTRIES_PER_PAGE]);
+		if (!pto)
+			return NULL;
+		addr = ((u64)pto & ZPCI_STE_ADDR_MASK);
+		idx = srcu_read_lock(&vcpu->kvm->srcu);
+		page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
+		srcu_read_unlock(&vcpu->kvm->srcu, idx);
+		if (is_error_page(page))
+			return NULL;
+		ioat->pt[rtx][sx] = page_to_virt(page) + (addr & ~PAGE_MASK);
+	}
+	pto = ioat->pt[rtx][sx];
+
+	/* Return guest PTE */
+	px = calc_px(dma_addr);
+	return &pto[px];
+}
+
+
+static int dma_table_shadow(struct kvm_vcpu *vcpu, struct zpci_dev *zdev,
+			    dma_addr_t dma_addr, size_t size)
+{
+	unsigned int nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
+	struct kvm_zdev *kzdev = zdev->kzdev;
+	unsigned long *entry, *gentry;
+	int i, rc = 0, rc2;
+
+	if (!nr_pages || !kzdev)
+		return -EINVAL;
+
+	mutex_lock(&kzdev->ioat.lock);
+	if (!zdev->dma_table || !kzdev->ioat.head[0]) {
+		rc = -EINVAL;
+		goto out_unlock;
+	}
+
+	for (i = 0; i < nr_pages; i++) {
+		gentry = dma_walk_guest_cpu_trans(vcpu, &kzdev->ioat, dma_addr);
+		if (!gentry)
+			continue;
+		entry = dma_walk_cpu_trans(zdev->dma_table, dma_addr);
+
+		if (!entry) {
+			rc = -ENOMEM;
+			goto out_unlock;
+		}
+
+		rc2 = dma_shadow_cpu_trans(vcpu, entry, gentry);
+		if (rc2 < 0) {
+			rc = -EIO;
+			goto out_unlock;
+		}
+		dma_addr += PAGE_SIZE;
+		rc += rc2;
+	}
+
+out_unlock:
+	mutex_unlock(&kzdev->ioat.lock);
+	return rc;
+}
+
+int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
+			       unsigned long start, unsigned long size,
+			       u8 *status)
+{
+	struct zpci_dev *zdev;
+	u32 fh = req >> 32;
+	int rc;
+
+	/* Make sure this is a valid device associated with this guest */
+	zdev = get_zdev_by_fh(fh);
+	if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm) {
+		*status = 0;
+		return -EINVAL;
+	}
+
+	/* Only proceed if the device is using the assist */
+	if (zdev->kzdev->ioat.head[0] == 0)
+		return -EOPNOTSUPP;
+
+	rc = dma_table_shadow(vcpu, zdev, start, size);
+	if (rc < 0) {
+		/*
+		 * If errors encountered during shadow operations, we must
+		 * fabricate status to present to the guest
+		 */
+		switch (rc) {
+		case -ENOMEM:
+			*status = KVM_S390_RPCIT_INS_RES;
+			break;
+		default:
+			*status = KVM_S390_RPCIT_ERR;
+			break;
+		}
+	} else if (rc > 0) {
+		/* Host RPCIT must be issued */
+		rc = zpci_refresh_trans((u64) zdev->fh << 32, start, size,
+					status);
+	}
+	zdev->kzdev->rpcit_count++;
+
+	return rc;
+}
+
 /* Modify PCI: Register floating adapter interruption forwarding */
 static int kvm_zpci_set_airq(struct zpci_dev *zdev)
 {
@@ -620,6 +822,8 @@ EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
 
 int kvm_s390_pci_init(void)
 {
+	int rc;
+
 	aift = kzalloc(sizeof(struct zpci_aift), GFP_KERNEL);
 	if (!aift)
 		return -ENOMEM;
@@ -627,5 +831,7 @@ int kvm_s390_pci_init(void)
 	spin_lock_init(&aift->gait_lock);
 	mutex_init(&aift->lock);
 
-	return 0;
+	rc = zpci_get_mdd(&aift->mdd);
+
+	return rc;
 }
diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
index 54355634df82..bb2be7fc3934 100644
--- a/arch/s390/kvm/pci.h
+++ b/arch/s390/kvm/pci.h
@@ -18,6 +18,9 @@
 
 #define KVM_S390_PCI_DTSM_MASK 0x40
 
+#define KVM_S390_RPCIT_INS_RES 0x10
+#define KVM_S390_RPCIT_ERR 0x28
+
 struct zpci_gaite {
 	u32 gisa;
 	u8 gisc;
@@ -33,6 +36,7 @@ struct zpci_aift {
 	struct kvm_zdev **kzdev;
 	spinlock_t gait_lock; /* Protects the gait, used during AEN forward */
 	struct mutex lock; /* Protects the other structures in aift */
+	u32 mdd;
 };
 
 extern struct zpci_aift *aift;
@@ -47,7 +51,9 @@ static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
 
 int kvm_s390_pci_aen_init(u8 nisc);
 void kvm_s390_pci_aen_exit(void);
-
+int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
+			       unsigned long start, unsigned long end,
+			       u8 *status);
 int kvm_s390_pci_init(void);
 
 #endif /* __KVM_S390_PCI_H */
-- 
2.27.0


* [PATCH v2 22/30] KVM: s390: intercept the rpcit instruction
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (20 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 21/30] KVM: s390: pci: handle refresh of PCI translations Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-18 11:05   ` Pierre Morel
  2022-01-19 14:06   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 23/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV Matthew Rosato
                   ` (9 subsequent siblings)
  31 siblings, 2 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

For faster handling of PCI translation refreshes, intercept in KVM
and call the associated handler.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/kvm/priv.c | 46 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)
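
How a failed refresh is reported back to the guest can be sketched in user
space as below.  The struct and function names are hypothetical; the mask,
the shift and the condition-code choice are copied from the default branch
of handle_rpcit() in this patch.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_rpcit_result {
	uint64_t reg1;	/* guest register with the status byte folded in */
	int cc;		/* condition code presented to the guest */
};

/* Fold a status byte into bits 24-31 of reg1 and pick the condition code. */
static struct sketch_rpcit_result sketch_rpcit_fail(uint64_t reg1,
						    uint8_t status)
{
	struct sketch_rpcit_result res;

	res.reg1 = (reg1 & 0xffffffff00ffffffULL) | ((uint64_t)status << 24);
	res.cc = status ? 1 : 3;
	return res;
}
```

For example, a status of 0x28 (KVM_S390_RPCIT_ERR) replaces the previous
contents of that byte and yields cc 1, while a zero status yields cc 3.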

diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 417154b314a6..5b65c1830de2 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -29,6 +29,7 @@
 #include <asm/ap.h>
 #include "gaccess.h"
 #include "kvm-s390.h"
+#include "pci.h"
 #include "trace.h"
 
 static int handle_ri(struct kvm_vcpu *vcpu)
@@ -335,6 +336,49 @@ static int handle_rrbe(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static int handle_rpcit(struct kvm_vcpu *vcpu)
+{
+	int reg1, reg2;
+	u8 status;
+	int rc;
+
+	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
+		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
+
+	/* If the host doesn't support PCI, it must be an emulated device */
+	if (!IS_ENABLED(CONFIG_PCI))
+		return -EOPNOTSUPP;
+
+	kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
+
+	/* If the device has a SHM bit on, let userspace take care of this */
+	if (((vcpu->run->s.regs.gprs[reg1] >> 32) & aift->mdd) != 0)
+		return -EOPNOTSUPP;
+
+	rc = kvm_s390_pci_refresh_trans(vcpu, vcpu->run->s.regs.gprs[reg1],
+					vcpu->run->s.regs.gprs[reg2],
+					vcpu->run->s.regs.gprs[reg2+1],
+					&status);
+
+	switch (rc) {
+	case 0:
+		kvm_s390_set_psw_cc(vcpu, 0);
+		break;
+	case -EOPNOTSUPP:
+		return -EOPNOTSUPP;
+	default:
+		vcpu->run->s.regs.gprs[reg1] &= 0xffffffff00ffffffUL;
+		vcpu->run->s.regs.gprs[reg1] |= (u64) status << 24;
+		if (status != 0)
+			kvm_s390_set_psw_cc(vcpu, 1);
+		else
+			kvm_s390_set_psw_cc(vcpu, 3);
+		break;
+	}
+
+	return 0;
+}
+
 #define SSKE_NQ 0x8
 #define SSKE_MR 0x4
 #define SSKE_MC 0x2
@@ -1275,6 +1319,8 @@ int kvm_s390_handle_b9(struct kvm_vcpu *vcpu)
 		return handle_essa(vcpu);
 	case 0xaf:
 		return handle_pfmf(vcpu);
+	case 0xd3:
+		return handle_rpcit(vcpu);
 	default:
 		return -EOPNOTSUPP;
 	}
-- 
2.27.0


* [PATCH v2 23/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (21 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 22/30] KVM: s390: intercept the rpcit instruction Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-18 17:20   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 24/30] vfio-pci/zdev: wire up group notifier Matthew Rosato
                   ` (8 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

This was previously removed as unnecessary; while that was true, subsequent
changes will make KVM an additional required component for vfio-pci-zdev.
Let's re-introduce CONFIG_VFIO_PCI_ZDEV as now there is actually a reason
to say 'n' for it (when not planning to enable CONFIG_KVM).

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 drivers/vfio/pci/Kconfig      | 11 +++++++++++
 drivers/vfio/pci/Makefile     |  2 +-
 include/linux/vfio_pci_core.h |  2 +-
 3 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 860424ccda1b..fedd1d4cb592 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -42,5 +42,16 @@ config VFIO_PCI_IGD
 	  and LPC bridge config space.
 
 	  To enable Intel IGD assignment through vfio-pci, say Y.
+
+config VFIO_PCI_ZDEV
+	bool "VFIO PCI extensions for s390x KVM passthrough"
+	depends on S390 && KVM
+	default y
+	help
+	  Support s390x-specific extensions to enable support for enhancements
+	  to KVM passthrough capabilities, such as interpretive execution of
+	  zPCI instructions.
+
+	  To enable s390x KVM vfio-pci extensions, say Y.
 endif
 endif
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index 349d68d242b4..01b1f83d83d7 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -1,7 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 vfio-pci-core-y := vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
-vfio-pci-core-$(CONFIG_S390) += vfio_pci_zdev.o
+vfio-pci-core-$(CONFIG_VFIO_PCI_ZDEV) += vfio_pci_zdev.o
 obj-$(CONFIG_VFIO_PCI_CORE) += vfio-pci-core.o
 
 vfio-pci-y := vfio_pci.o
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index ef9a44b6cf5d..5e2bca3b89db 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -195,7 +195,7 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
 }
 #endif
 
-#ifdef CONFIG_S390
+#ifdef CONFIG_VFIO_PCI_ZDEV
 extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 				       struct vfio_info_cap *caps);
 #else
-- 
2.27.0


* [PATCH v2 24/30] vfio-pci/zdev: wire up group notifier
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (22 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 23/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-18 17:34   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 25/30] vfio-pci/zdev: wire up zPCI interpretive execution support Matthew Rosato
                   ` (7 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

KVM zPCI passthrough device logic will need a reference to the associated
kvm guest that has access to the device.  Let's register a group notifier
for VFIO_GROUP_NOTIFY_SET_KVM to catch this information in order to create
an association between a kvm guest and the host zdev.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/kvm_pci.h  |  2 ++
 drivers/vfio/pci/vfio_pci_core.c |  2 ++
 drivers/vfio/pci/vfio_pci_zdev.c | 46 ++++++++++++++++++++++++++++++++
 include/linux/vfio_pci_core.h    | 10 +++++++
 4 files changed, 60 insertions(+)
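
The notifier pattern used here, an embedded notifier block from which the
callback recovers its enclosing structure, can be sketched in user space as
below.  All sketch_* and SKETCH_* names are hypothetical stand-ins for the
kernel types and constants, and the plain pointer store stands in for the
call to kvm_s390_pci_attach_kvm().

```c
#include <assert.h>
#include <stddef.h>

/* User-space equivalent of the kernel's container_of() helper. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define SKETCH_NOTIFY_SET_KVM	1UL	/* hypothetical action value */
#define SKETCH_NOTIFY_DONE	0
#define SKETCH_NOTIFY_OK	1

struct sketch_notifier {
	int (*call)(struct sketch_notifier *nb, unsigned long action,
		    void *data);
};

struct sketch_kzdev {
	void *kvm;			/* guest association, NULL if none */
	struct sketch_notifier nb;	/* embedded, like kvm_zdev.nb */
};

static int sketch_group_notifier(struct sketch_notifier *nb,
				 unsigned long action, void *data)
{
	/* Recover the enclosing structure from the embedded notifier. */
	struct sketch_kzdev *kzdev =
		sketch_container_of(nb, struct sketch_kzdev, nb);

	if (action == SKETCH_NOTIFY_SET_KVM) {
		if (!data)
			return SKETCH_NOTIFY_DONE;
		kzdev->kvm = data;	/* stands in for the attach call */
	}

	return SKETCH_NOTIFY_OK;
}
```

Embedding the notifier block in the per-device structure avoids a separate
lookup: the callback receives only the notifier pointer and derives the
device state from it.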

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index fa90729a35cf..97a90b37c87d 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -17,6 +17,7 @@
 #include <linux/kvm.h>
 #include <linux/pci.h>
 #include <linux/mutex.h>
+#include <linux/notifier.h>
 #include <asm/pci_insn.h>
 #include <asm/pci_dma.h>
 
@@ -33,6 +34,7 @@ struct kvm_zdev {
 	u64 rpcit_count;
 	struct kvm_zdev_ioat ioat;
 	struct zpci_fib fib;
+	struct notifier_block nb;
 };
 
 int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index f948e6cd2993..fc57d4d0abbe 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -452,6 +452,7 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
 
 	vfio_pci_vf_token_user_add(vdev, -1);
 	vfio_spapr_pci_eeh_release(vdev->pdev);
+	vfio_pci_zdev_release(vdev);
 	vfio_pci_core_disable(vdev);
 
 	mutex_lock(&vdev->igate);
@@ -470,6 +471,7 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_close_device);
 void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev)
 {
 	vfio_pci_probe_mmaps(vdev);
+	vfio_pci_zdev_open(vdev);
 	vfio_spapr_pci_eeh_open(vdev->pdev);
 	vfio_pci_vf_token_user_add(vdev, 1);
 }
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index ea4c0d2b0663..5c2bddc57b39 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -13,6 +13,7 @@
 #include <linux/vfio_zdev.h>
 #include <asm/pci_clp.h>
 #include <asm/pci_io.h>
+#include <asm/kvm_pci.h>
 
 #include <linux/vfio_pci_core.h>
 
@@ -136,3 +137,48 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 
 	return ret;
 }
+
+static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
+					unsigned long action, void *data)
+{
+	struct kvm_zdev *kzdev = container_of(nb, struct kvm_zdev, nb);
+
+	if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
+		if (!data || !kzdev->zdev)
+			return NOTIFY_DONE;
+		kvm_s390_pci_attach_kvm(kzdev->zdev, data);
+	}
+
+	return NOTIFY_OK;
+}
+
+void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
+{
+	unsigned long events = VFIO_GROUP_NOTIFY_SET_KVM;
+	struct zpci_dev *zdev = to_zpci(vdev->pdev);
+
+	if (!zdev)
+		return;
+
+	if (kvm_s390_pci_dev_open(zdev))
+		return;
+
+	zdev->kzdev->nb.notifier_call = vfio_pci_zdev_group_notifier;
+
+	if (vfio_register_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
+				   &events, &zdev->kzdev->nb))
+		kvm_s390_pci_dev_release(zdev);
+}
+
+void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
+{
+	struct zpci_dev *zdev = to_zpci(vdev->pdev);
+
+	if (!zdev || !zdev->kzdev)
+		return;
+
+	vfio_unregister_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
+				 &zdev->kzdev->nb);
+
+	kvm_s390_pci_dev_release(zdev);
+}
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 5e2bca3b89db..05287f8ac855 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -198,12 +198,22 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
 #ifdef CONFIG_VFIO_PCI_ZDEV
 extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 				       struct vfio_info_cap *caps);
+void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
+void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
 #else
 static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 					      struct vfio_info_cap *caps)
 {
 	return -ENODEV;
 }
+
+static inline void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
+{
+}
+
+static inline void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
+{
+}
 #endif
 
 /* Will be exported for vfio pci drivers usage */
-- 
2.27.0


* [PATCH v2 25/30] vfio-pci/zdev: wire up zPCI interpretive execution support
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (23 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 24/30] vfio-pci/zdev: wire up group notifier Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-25 13:01   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support Matthew Rosato
                   ` (6 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

Introduce support for VFIO_DEVICE_FEATURE_ZPCI_INTERP, which is a new
VFIO_DEVICE_FEATURE ioctl.  This interface is used to indicate that an
s390x vfio-pci device wishes to enable/disable zPCI interpretive
execution, which allows zPCI instructions to be executed directly by
underlying firmware without KVM involvement.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/kvm_pci.h  |  1 +
 drivers/vfio/pci/vfio_pci_core.c |  2 +
 drivers/vfio/pci/vfio_pci_zdev.c | 78 ++++++++++++++++++++++++++++++++
 include/linux/vfio_pci_core.h    | 10 ++++
 include/uapi/linux/vfio.h        |  7 +++
 include/uapi/linux/vfio_zdev.h   | 15 ++++++
 6 files changed, 113 insertions(+)
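
For readers who want to see how userspace might drive this feature, here is
a minimal sketch of building the argument buffer.  It assumes the layout in
this patch: the fixed argsz/flags header of struct vfio_device_feature
followed directly by struct vfio_device_zpci_interp.  Every sk_ name is a
hypothetical stand-in defined locally so the sketch builds without the
patched headers; it is not code from this series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for the uAPI names, so the sketch builds without the patch */
#define SK_FEATURE_GET		(1u << 16)	/* VFIO_DEVICE_FEATURE_GET */
#define SK_FEATURE_SET		(1u << 17)	/* VFIO_DEVICE_FEATURE_SET */
#define SK_FEATURE_ZPCI_INTERP	1u		/* index proposed in this patch */
#define SK_ZPCI_FLAG_INTERP	1u		/* VFIO_DEVICE_ZPCI_FLAG_INTERP */

/* Fixed part of struct vfio_device_feature (argsz and flags) */
struct sk_feature_hdr {
	uint32_t argsz;
	uint32_t flags;
};

/* Same fields as struct vfio_device_zpci_interp */
struct sk_zpci_interp {
	uint64_t flags;
	uint32_t fh;
};

/* Header immediately followed by the payload, as the handler reads it */
struct sk_interp_req {
	struct sk_feature_hdr hdr;
	struct sk_zpci_interp data;
};

/* Hypothetical helper: prepare a GET request (the kernel fills flags and fh) */
static void sk_interp_req_get(struct sk_interp_req *req)
{
	assert(req);
	memset(req, 0, sizeof(*req));
	req->hdr.argsz = sizeof(*req);
	req->hdr.flags = SK_FEATURE_GET | SK_FEATURE_ZPCI_INTERP;
}

/* Hypothetical helper: prepare a SET request that enables or disables */
static void sk_interp_req_set(struct sk_interp_req *req, int enable)
{
	assert(req);
	memset(req, 0, sizeof(*req));
	req->hdr.argsz = sizeof(*req);
	req->hdr.flags = SK_FEATURE_SET | SK_FEATURE_ZPCI_INTERP;
	req->data.flags = enable ? SK_ZPCI_FLAG_INTERP : 0;
}
```

The filled buffer would then be handed to the VFIO_DEVICE_FEATURE ioctl on
the device fd.  GET and SET must not be combined in one call, since the
handler rejects that with -EINVAL.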

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index 97a90b37c87d..dc00c3f27a00 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -35,6 +35,7 @@ struct kvm_zdev {
 	struct kvm_zdev_ioat ioat;
 	struct zpci_fib fib;
 	struct notifier_block nb;
+	bool interp;
 };
 
 int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index fc57d4d0abbe..2b2d64a2190c 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1172,6 +1172,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 			mutex_unlock(&vdev->vf_token->lock);
 
 			return 0;
+		case VFIO_DEVICE_FEATURE_ZPCI_INTERP:
+			return vfio_pci_zdev_feat_interp(vdev, feature, arg);
 		default:
 			return -ENOTTY;
 		}
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index 5c2bddc57b39..4339f48b98bc 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -54,6 +54,10 @@ static int zpci_group_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
 		.version = zdev->version
 	};
 
+	/* Some values are different for interpreted devices */
+	if (zdev->kzdev && zdev->kzdev->interp)
+		cap.maxstbl = zdev->maxstbl;
+
 	return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
 }
 
@@ -138,6 +142,72 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 	return ret;
 }
 
+int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
+			      struct vfio_device_feature feature,
+			      unsigned long arg)
+{
+	struct zpci_dev *zdev = to_zpci(vdev->pdev);
+	struct vfio_device_zpci_interp *data;
+	struct vfio_device_feature *feat;
+	unsigned long minsz;
+	int size, rc;
+
+	if (!zdev || !zdev->kzdev)
+		return -EINVAL;
+
+	/* If PROBE specified, return probe results immediately */
+	if (feature.flags & VFIO_DEVICE_FEATURE_PROBE)
+		return kvm_s390_pci_interp_probe(zdev);
+
+	/* GET and SET are mutually exclusive */
+	if ((feature.flags & VFIO_DEVICE_FEATURE_GET) &&
+	    (feature.flags & VFIO_DEVICE_FEATURE_SET))
+		return -EINVAL;
+
+	/* Validate argsz before allocating so an error cannot leak feat */
+	minsz = offsetofend(struct vfio_device_feature, flags);
+	if (feature.argsz < minsz + sizeof(*data))
+		return -EINVAL;
+
+	size = sizeof(*feat) + sizeof(*data);
+	feat = kzalloc(size, GFP_KERNEL);
+	if (!feat)
+		return -ENOMEM;
+
+	data = (struct vfio_device_zpci_interp *)&feat->data;
+
+	/* Get the rest of the payload for GET/SET */
+	rc = copy_from_user(data, (void __user *)(arg + minsz),
+			    sizeof(*data));
+	if (rc) {
+		rc = -EFAULT;
+	} else if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
+		if (zdev->gd != 0)
+			data->flags = VFIO_DEVICE_ZPCI_FLAG_INTERP;
+		else
+			data->flags = 0;
+		data->fh = zdev->fh;
+		/* userspace is using host fh, give interpreted clp values */
+		zdev->kzdev->interp = true;
+
+		if (copy_to_user((void __user *)arg, feat, size))
+			rc = -EFAULT;
+	} else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
+		if (data->flags == VFIO_DEVICE_ZPCI_FLAG_INTERP)
+			rc = kvm_s390_pci_interp_enable(zdev);
+		else if (data->flags == 0)
+			rc = kvm_s390_pci_interp_disable(zdev);
+		else
+			rc = -EINVAL;
+	} else {
+		/* Neither GET nor SET was specified */
+		rc = -EINVAL;
+	}
+
+	kfree(feat);
+	return rc;
+}
+
 static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
 					unsigned long action, void *data)
 {
@@ -164,6 +234,7 @@ void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
 		return;
 
 	zdev->kzdev->nb.notifier_call = vfio_pci_zdev_group_notifier;
+	zdev->kzdev->interp = false;
 
 	if (vfio_register_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
 				   &events, &zdev->kzdev->nb))
@@ -180,5 +251,12 @@ void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
 	vfio_unregister_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
 				 &zdev->kzdev->nb);
 
+	/*
+	 * If the device was using interpretation, don't trust that userspace
+	 * did the appropriate cleanup
+	 */
+	if (zdev->gd != 0)
+		kvm_s390_pci_interp_disable(zdev);
+
 	kvm_s390_pci_dev_release(zdev);
 }
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 05287f8ac855..0db2b1051931 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -198,6 +198,9 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
 #ifdef CONFIG_VFIO_PCI_ZDEV
 extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 				       struct vfio_info_cap *caps);
+int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
+			      struct vfio_device_feature feature,
+			      unsigned long arg);
 void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
 void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
 #else
@@ -207,6 +210,13 @@ static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 	return -ENODEV;
 }
 
+static inline int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
+					    struct vfio_device_feature feature,
+					    unsigned long arg)
+{
+	return -ENOTTY;
+}
+
 static inline void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
 {
 }
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index ef33ea002b0b..b9a75485b8e7 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1002,6 +1002,13 @@ struct vfio_device_feature {
  */
 #define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN	(0)
 
+/*
+ * Provide support for enabling interpretation of zPCI instructions.  This
+ * feature is only valid for s390x PCI devices.  Data provided when setting
+ * and getting this feature is further described in vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_INTERP		(1)
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**
diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
index b4309397b6b2..575f0410dc66 100644
--- a/include/uapi/linux/vfio_zdev.h
+++ b/include/uapi/linux/vfio_zdev.h
@@ -75,4 +75,19 @@ struct vfio_device_info_cap_zpci_pfip {
 	__u8 pfip[];
 };
 
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_INTERP
+ *
+ * This feature is used for enabling zPCI instruction interpretation for a
+ * device.  When setting this feature, the flags indicate whether to enable
+ * or disable interpretation.  When getting this feature, the flags indicate
+ * whether or not interpretation is active and fh provides the host function
+ * handle that userspace needs in order to enable interpretation.
+ */
+struct vfio_device_zpci_interp {
+	__u64 flags;
+#define VFIO_DEVICE_ZPCI_FLAG_INTERP 1
+	__u32 fh;		/* Host device function handle */
+};
+
 #endif
-- 
2.27.0


* [PATCH v2 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (24 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 25/30] vfio-pci/zdev: wire up zPCI interpretive execution support Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-19 17:10   ` Pierre Morel
  2022-01-25 12:36   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 27/30] vfio-pci/zdev: wire up zPCI IOAT assist support Matthew Rosato
                   ` (5 subsequent siblings)
  31 siblings, 2 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

Introduce support for VFIO_DEVICE_FEATURE_ZPCI_AIF, which is a new
VFIO_DEVICE_FEATURE ioctl.  This interface is used to indicate that an
s390x vfio-pci device wishes to enable/disable zPCI adapter interrupt
forwarding, which allows underlying firmware to deliver interrupts
directly to the associated kvm guest.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/kvm_pci.h  |  2 +
 drivers/vfio/pci/vfio_pci_core.c |  2 +
 drivers/vfio/pci/vfio_pci_zdev.c | 98 +++++++++++++++++++++++++++++++-
 include/linux/vfio_pci_core.h    | 10 ++++
 include/uapi/linux/vfio.h        |  7 +++
 include/uapi/linux/vfio_zdev.h   | 20 +++++++
 6 files changed, 138 insertions(+), 1 deletion(-)
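
As a reading aid, the way the SET path in this patch interprets the flags
can be sketched as a small pure function.  The sk_ names and the action
enum are hypothetical and only label the three outcomes; the real handler
also builds a guest FIB and calls into KVM, which the sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>

#define SK_AIF_FLOAT	1u	/* VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT */
#define SK_AIF_HOST	2u	/* VFIO_DEVICE_ZPCI_FLAG_AIF_HOST */

/* Reduced stand-in for struct vfio_device_zpci_aif: only flags matter here */
struct sk_zpci_aif {
	uint64_t flags;
};

enum sk_aif_action {
	SK_AIF_DISABLE,		/* flags == 0 */
	SK_AIF_ENABLE,		/* FLOAT set, HOST clear */
	SK_AIF_ENABLE_HOST,	/* FLOAT and HOST both set */
	SK_AIF_INVALID,		/* anything else, e.g. HOST without FLOAT */
};

/* Classify a SET request the same way the handler's if/else chain does */
static enum sk_aif_action sk_aif_set_action(const struct sk_zpci_aif *req)
{
	assert(req);
	if (req->flags & SK_AIF_FLOAT)
		return (req->flags & SK_AIF_HOST) ? SK_AIF_ENABLE_HOST
						  : SK_AIF_ENABLE;
	if (req->flags == 0)
		return SK_AIF_DISABLE;
	return SK_AIF_INVALID;
}
```

Note that HOST on its own is rejected: it only qualifies an enable request
that also carries FLOAT.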

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index dc00c3f27a00..dbab349a4a75 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -36,6 +36,8 @@ struct kvm_zdev {
 	struct zpci_fib fib;
 	struct notifier_block nb;
 	bool interp;
+	bool aif;
+	bool fhost;
 };
 
 int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 2b2d64a2190c..01658de660bd 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1174,6 +1174,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 			return 0;
 		case VFIO_DEVICE_FEATURE_ZPCI_INTERP:
 			return vfio_pci_zdev_feat_interp(vdev, feature, arg);
+		case VFIO_DEVICE_FEATURE_ZPCI_AIF:
+			return vfio_pci_zdev_feat_aif(vdev, feature, arg);
 		default:
 			return -ENOTTY;
 		}
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index 4339f48b98bc..891cfa016d63 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -13,6 +13,7 @@
 #include <linux/vfio_zdev.h>
 #include <asm/pci_clp.h>
 #include <asm/pci_io.h>
+#include <asm/pci_insn.h>
 #include <asm/kvm_pci.h>
 
 #include <linux/vfio_pci_core.h>
@@ -208,6 +209,99 @@ int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
 	return rc;
 }
 
+int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
+			   struct vfio_device_feature feature,
+			   unsigned long arg)
+{
+	struct zpci_dev *zdev = to_zpci(vdev->pdev);
+	struct vfio_device_zpci_aif *data;
+	struct vfio_device_feature *feat;
+	unsigned long minsz;
+	int size, rc = 0;
+
+	if (!zdev || !zdev->kzdev)
+		return -EINVAL;
+
+	/* If PROBE specified, return probe results immediately */
+	if (feature.flags & VFIO_DEVICE_FEATURE_PROBE)
+		return kvm_s390_pci_aif_probe(zdev);
+
+	/* GET and SET are mutually exclusive */
+	if ((feature.flags & VFIO_DEVICE_FEATURE_GET) &&
+	    (feature.flags & VFIO_DEVICE_FEATURE_SET))
+		return -EINVAL;
+
+	/* Validate argsz before allocating so an error cannot leak feat */
+	minsz = offsetofend(struct vfio_device_feature, flags);
+	if (feature.argsz < minsz + sizeof(*data))
+		return -EINVAL;
+
+	size = sizeof(*feat) + sizeof(*data);
+	feat = kzalloc(size, GFP_KERNEL);
+	if (!feat)
+		return -ENOMEM;
+
+	data = (struct vfio_device_zpci_aif *)&feat->data;
+
+	/* Get the rest of the payload for GET/SET */
+	rc = copy_from_user(data, (void __user *)(arg + minsz),
+			    sizeof(*data));
+	if (rc) {
+		rc = -EFAULT;
+	} else if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
+		data->flags = zdev->kzdev->aif ?
+			      VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT : 0;
+		if (zdev->kzdev->fhost)
+			data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
+
+		if (copy_to_user((void __user *)arg, feat, size))
+			rc = -EFAULT;
+	} else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
+		if (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT) {
+			/* create a guest fib */
+			struct zpci_fib fib;
+
+			fib.fmt0.aibv = data->ibv;
+			fib.fmt0.isc = data->isc;
+			fib.fmt0.noi = data->noi;
+			if (data->sb != 0) {
+				fib.fmt0.aisb = data->sb;
+				fib.fmt0.aisbo = data->sbo;
+				fib.fmt0.sum = 1;
+			} else {
+				fib.fmt0.aisb = 0;
+				fib.fmt0.aisbo = 0;
+				fib.fmt0.sum = 0;
+			}
+			if (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST) {
+				rc = kvm_s390_pci_aif_enable(zdev, &fib, false);
+				if (!rc) {
+					zdev->kzdev->aif = true;
+					zdev->kzdev->fhost = true;
+				}
+			} else {
+				rc = kvm_s390_pci_aif_enable(zdev, &fib, true);
+				if (!rc)
+					zdev->kzdev->aif = true;
+			}
+		} else if (data->flags == 0) {
+			rc = kvm_s390_pci_aif_disable(zdev);
+			if (!rc) {
+				zdev->kzdev->aif = false;
+				zdev->kzdev->fhost = false;
+			}
+		} else {
+			rc = -EINVAL;
+		}
+	} else {
+		/* Neither GET nor SET was specified */
+		rc = -EINVAL;
+	}
+
+	kfree(feat);
+	return rc;
+}
+
 static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
 					unsigned long action, void *data)
 {
@@ -255,8 +349,10 @@ void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
 	 * If the device was using interpretation, don't trust that userspace
 	 * did the appropriate cleanup
 	 */
-	if (zdev->gd != 0)
+	if (zdev->gd != 0) {
+		kvm_s390_pci_aif_disable(zdev);
 		kvm_s390_pci_interp_disable(zdev);
+	}
 
 	kvm_s390_pci_dev_release(zdev);
 }
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 0db2b1051931..7ec5e82e7933 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -201,6 +201,9 @@ extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
 int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
 			      struct vfio_device_feature feature,
 			      unsigned long arg);
+int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
+			   struct vfio_device_feature feature,
+			   unsigned long arg);
 void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
 void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
 #else
@@ -217,6 +220,13 @@ static inline int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
 	return -ENOTTY;
 }
 
+static inline int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
+					 struct vfio_device_feature feature,
+					 unsigned long arg)
+{
+	return -ENOTTY;
+}
+
 static inline void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
 {
 }
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index b9a75485b8e7..fe3bfd99bf50 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1009,6 +1009,13 @@ struct vfio_device_feature {
  */
 #define VFIO_DEVICE_FEATURE_ZPCI_INTERP		(1)
 
+/*
+ * Provide support for enabling adapter interruption forwarding for zPCI
+ * devices.  This feature is only valid for s390x PCI devices.  Data provided
+ * when setting and getting this feature is further described in vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_AIF		(2)
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**
diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
index 575f0410dc66..c574e23f9385 100644
--- a/include/uapi/linux/vfio_zdev.h
+++ b/include/uapi/linux/vfio_zdev.h
@@ -90,4 +90,24 @@ struct vfio_device_zpci_interp {
 	__u32 fh;		/* Host device function handle */
 };
 
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_AIF
+ *
+ * This feature is used for enabling forwarding of adapter interrupts directly
+ * from firmware to the guest.  When setting this feature, the flags indicate
+ * whether to enable/disable the feature and the structure defined below is
+ * used to set up the forwarding structures.  When getting this feature, only
+ * the flags are used to indicate the current state.
+ */
+struct vfio_device_zpci_aif {
+	__u64 flags;
+#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
+#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST 2
+	__u64 ibv;		/* Address of guest interrupt bit vector */
+	__u64 sb;		/* Address of guest summary bit */
+	__u32 noi;		/* Number of interrupts */
+	__u8 isc;		/* Guest interrupt subclass */
+	__u8 sbo;		/* Offset of guest summary bit vector */
+};
+
 #endif
-- 
2.27.0


* [PATCH v2 27/30] vfio-pci/zdev: wire up zPCI IOAT assist support
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (25 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-19 14:03   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 28/30] vfio-pci/zdev: add DTSM to clp group capability Matthew Rosato
                   ` (4 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

Introduce support for VFIO_DEVICE_FEATURE_ZPCI_IOAT, which is a new
VFIO_DEVICE_FEATURE ioctl.  This interface is used to indicate that an
s390x vfio-pci device wishes to enable/disable zPCI I/O Address
Translation assistance, allowing the host to perform address translation
and shadowing.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/asm/kvm_pci.h  |  1 +
 drivers/vfio/pci/vfio_pci_core.c |  2 +
 drivers/vfio/pci/vfio_pci_zdev.c | 63 ++++++++++++++++++++++++++++++++
 include/linux/vfio_pci_core.h    | 10 +++++
 include/uapi/linux/vfio.h        |  8 ++++
 include/uapi/linux/vfio_zdev.h   | 13 +++++++
 6 files changed, 97 insertions(+)
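
The SET semantics in this patch amount to a small state machine keyed on
the stored anchor, sketched below.  The sk_ names are hypothetical, and the
sketch assumes the underlying enable/disable call always succeeds (the
handler only updates the stored iota when it does).

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the iota field this patch adds to struct kvm_zdev */
struct sk_ioat_state {
	uint64_t iota;
};

enum sk_ioat_action {
	SK_IOAT_NOOP,		/* iota == 0 and nothing was registered */
	SK_IOAT_ENABLE,		/* non-zero iota: register the guest anchor */
	SK_IOAT_DISABLE,	/* iota == 0 while an anchor is registered */
};

/* Apply a SET request with the given iota and report what would happen */
static enum sk_ioat_action sk_ioat_set(struct sk_ioat_state *s, uint64_t iota)
{
	assert(s);
	if (iota != 0) {
		s->iota = iota;		/* remembered so GET can report it */
		return SK_IOAT_ENABLE;
	}
	if (s->iota != 0) {
		s->iota = 0;
		return SK_IOAT_DISABLE;
	}
	return SK_IOAT_NOOP;
}
```

A GET simply returns the stored value, so userspace reads back 0 until a
non-zero anchor has been set successfully.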

diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
index dbab349a4a75..7b6b6d771026 100644
--- a/arch/s390/include/asm/kvm_pci.h
+++ b/arch/s390/include/asm/kvm_pci.h
@@ -32,6 +32,7 @@ struct kvm_zdev {
 	struct zpci_dev *zdev;
 	struct kvm *kvm;
 	u64 rpcit_count;
+	u64 iota;
 	struct kvm_zdev_ioat ioat;
 	struct zpci_fib fib;
 	struct notifier_block nb;
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 01658de660bd..709d9ba22a60 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -1176,6 +1176,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 			return vfio_pci_zdev_feat_interp(vdev, feature, arg);
 		case VFIO_DEVICE_FEATURE_ZPCI_AIF:
 			return vfio_pci_zdev_feat_aif(vdev, feature, arg);
+		case VFIO_DEVICE_FEATURE_ZPCI_IOAT:
+			return vfio_pci_zdev_feat_ioat(vdev, feature, arg);
 		default:
 			return -ENOTTY;
 		}
diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index 891cfa016d63..2b169d688937 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -302,6 +302,68 @@ int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
 	return rc;
 }
 
+int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
+			    struct vfio_device_feature feature,
+			    unsigned long arg)
+{
+	struct zpci_dev *zdev = to_zpci(vdev->pdev);
+	struct vfio_device_zpci_ioat *data;
+	struct vfio_device_feature *feat;
+	unsigned long minsz;
+	int size, rc = 0;
+
+	if (!zdev || !zdev->kzdev)
+		return -EINVAL;
+
+	/* If PROBE specified, return probe results immediately */
+	if (feature.flags & VFIO_DEVICE_FEATURE_PROBE)
+		return kvm_s390_pci_ioat_probe(zdev);
+
+	/* GET and SET are mutually exclusive */
+	if ((feature.flags & VFIO_DEVICE_FEATURE_GET) &&
+	    (feature.flags & VFIO_DEVICE_FEATURE_SET))
+		return -EINVAL;
+
+	/* Validate argsz before allocating so an error cannot leak feat */
+	minsz = offsetofend(struct vfio_device_feature, flags);
+	if (feature.argsz < minsz + sizeof(*data))
+		return -EINVAL;
+
+	size = sizeof(*feat) + sizeof(*data);
+	feat = kzalloc(size, GFP_KERNEL);
+	if (!feat)
+		return -ENOMEM;
+
+	data = (struct vfio_device_zpci_ioat *)&feat->data;
+
+	/* Get the rest of the payload for GET/SET */
+	rc = copy_from_user(data, (void __user *)(arg + minsz),
+			    sizeof(*data));
+	if (rc) {
+		rc = -EFAULT;
+	} else if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
+		data->iota = (u64)zdev->kzdev->iota;
+		if (copy_to_user((void __user *)arg, feat, size))
+			rc = -EFAULT;
+	} else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
+		if (data->iota != 0) {
+			rc = kvm_s390_pci_ioat_enable(zdev, data->iota);
+			if (!rc)
+				zdev->kzdev->iota = data->iota;
+		} else if (zdev->kzdev->iota != 0) {
+			rc = kvm_s390_pci_ioat_disable(zdev);
+			if (!rc)
+				zdev->kzdev->iota = 0;
+		}
+	} else {
+		/* Neither GET nor SET was specified */
+		rc = -EINVAL;
+	}
+
+	kfree(feat);
+	return rc;
+}
+
 static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
 					unsigned long action, void *data)
 {
@@ -351,6 +413,7 @@ void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
 	 */
 	if (zdev->gd != 0) {
 		kvm_s390_pci_aif_disable(zdev);
+		kvm_s390_pci_ioat_disable(zdev);
 		kvm_s390_pci_interp_disable(zdev);
 	}
 
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 7ec5e82e7933..f17d761ae14e 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -204,6 +204,9 @@ int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
 int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
 			   struct vfio_device_feature feature,
 			   unsigned long arg);
+int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
+			    struct vfio_device_feature feature,
+			    unsigned long arg);
 void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
 void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
 #else
@@ -227,6 +230,13 @@ static inline int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
 	return -ENOTTY;
 }
 
+static inline int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
+					  struct vfio_device_feature feature,
+					  unsigned long arg)
+{
+	return -ENOTTY;
+}
+
 static inline void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
 {
 }
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index fe3bfd99bf50..32c687388f48 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1016,6 +1016,14 @@ struct vfio_device_feature {
  */
 #define VFIO_DEVICE_FEATURE_ZPCI_AIF		(2)
 
+/*
+ * Provide support for enabling guest I/O address translation assistance for
+ * zPCI devices.  This feature is only valid for s390x PCI devices.  Data
+ * provided when setting and getting this feature is further described in
+ * vfio_zdev.h
+ */
+#define VFIO_DEVICE_FEATURE_ZPCI_IOAT		(3)
+
 /* -------- API for Type1 VFIO IOMMU -------- */
 
 /**
diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
index c574e23f9385..1a5229b7bb18 100644
--- a/include/uapi/linux/vfio_zdev.h
+++ b/include/uapi/linux/vfio_zdev.h
@@ -110,4 +110,17 @@ struct vfio_device_zpci_aif {
 	__u8 sbo;		/* Offset of guest summary bit vector */
 };
 
+/**
+ * VFIO_DEVICE_FEATURE_ZPCI_IOAT
+ *
+ * This feature is used for enabling guest I/O translation assistance for
+ * passthrough zPCI devices using instruction interpretation.  When setting
+ * this feature, the iota specifies a KVM guest I/O translation anchor.  When
+ * getting this feature, the most recently set anchor (or 0) is returned in
+ * iota.
+ */
+struct vfio_device_zpci_ioat {
+	__u64 iota;
+};
+
 #endif
-- 
2.27.0


* [PATCH v2 28/30] vfio-pci/zdev: add DTSM to clp group capability
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (26 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 27/30] vfio-pci/zdev: wire up zPCI IOAT assist support Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-19 13:48   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 29/30] KVM: s390: introduce CPU feature for zPCI Interpretation Matthew Rosato
                   ` (3 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

The DTSM, or designation type supported mask, indicates what IOAT formats
are available to the guest.  For an interpreted device, userspace will not
know what format(s) the IOAT assist supports, so pass it via the
capability chain.  Since the value belongs to the Query PCI Function Group
clp, let's extend the existing capability with a new version.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 drivers/vfio/pci/vfio_pci_zdev.c | 9 ++++++---
 include/uapi/linux/vfio_zdev.h   | 3 +++
 2 files changed, 9 insertions(+), 3 deletions(-)
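
Because the field only exists from capability version 2 onward, a consumer
presumably has to check header.version before reading it.  A minimal
sketch of that check follows; struct sk_group_cap is a hypothetical,
reduced stand-in for struct vfio_device_info_cap_zpci_group, not the real
layout.

```c
#include <assert.h>
#include <stdint.h>

/* Reduced stand-in for struct vfio_device_info_cap_zpci_group */
struct sk_group_cap {
	uint16_t header_version;	/* header.version in the real structure */
	uint8_t version;		/* last field of capability version 1 */
	uint8_t dtsm;			/* added by capability version 2 */
};

/* Returns 0 and stores the mask, or -1 if the capability predates version 2 */
static int sk_group_cap_dtsm(const struct sk_group_cap *cap, uint8_t *dtsm)
{
	assert(cap && dtsm);
	if (cap->header_version < 2)
		return -1;
	*dtsm = cap->dtsm;
	return 0;
}
```

With this patch a version 2 capability still reports dtsm as 0 for a
device that is not interpreted, so a zero mask is a valid answer and not
an error.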

diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
index 2b169d688937..aa2ef9067c7d 100644
--- a/drivers/vfio/pci/vfio_pci_zdev.c
+++ b/drivers/vfio/pci/vfio_pci_zdev.c
@@ -45,19 +45,22 @@ static int zpci_group_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
 {
 	struct vfio_device_info_cap_zpci_group cap = {
 		.header.id = VFIO_DEVICE_INFO_CAP_ZPCI_GROUP,
-		.header.version = 1,
+		.header.version = 2,
 		.dasm = zdev->dma_mask,
 		.msi_addr = zdev->msi_addr,
 		.flags = VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH,
 		.mui = zdev->fmb_update,
 		.noi = zdev->max_msi,
 		.maxstbl = ZPCI_MAX_WRITE_SIZE,
-		.version = zdev->version
+		.version = zdev->version,
+		.dtsm = 0
 	};
 
 	/* Some values are different for interpreted devices */
-	if (zdev->kzdev && zdev->kzdev->interp)
+	if (zdev->kzdev && zdev->kzdev->interp) {
 		cap.maxstbl = zdev->maxstbl;
+		cap.dtsm = kvm_s390_pci_get_dtsm(zdev);
+	}
 
 	return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
 }
diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
index 1a5229b7bb18..b4c2ba8e71f0 100644
--- a/include/uapi/linux/vfio_zdev.h
+++ b/include/uapi/linux/vfio_zdev.h
@@ -47,6 +47,9 @@ struct vfio_device_info_cap_zpci_group {
 	__u16 noi;		/* Maximum number of MSIs */
 	__u16 maxstbl;		/* Maximum Store Block Length */
 	__u8 version;		/* Supported PCI Version */
+	/* End of version 1 */
+	__u8 dtsm;		/* Supported IOAT Designations */
+	/* End of version 2 */
 };
 
 /**
-- 
2.27.0


* [PATCH v2 29/30] KVM: s390: introduce CPU feature for zPCI Interpretation
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (27 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 28/30] vfio-pci/zdev: add DTSM to clp group capability Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-19 13:39   ` Pierre Morel
  2022-01-14 20:31 ` [PATCH v2 30/30] MAINTAINERS: additional files related kvm s390 pci passthrough Matthew Rosato
                   ` (2 subsequent siblings)
  31 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

KVM_S390_VM_CPU_FEAT_ZPCI_INTERP relays whether zPCI interpretive
execution is possible based on the available hardware facilities.

Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 arch/s390/include/uapi/asm/kvm.h | 1 +
 arch/s390/kvm/kvm-s390.c         | 4 ++++
 2 files changed, 5 insertions(+)
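
For illustration, testing the new bit in the feat[] array might look like
the sketch below.  It assumes the bitmap is numbered MSB-first within each
64-bit word (the "inverted" numbering used by the kernel's set_bit_inv()
helper); that numbering is an assumption here and should be checked
against the KVM s390 CPU model code.  The sk_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SK_CPU_FEAT_WORDS	16	/* feat[16] in struct kvm_s390_vm_cpu_feat */
#define SK_CPU_FEAT_ZPCI_INTERP	14	/* KVM_S390_VM_CPU_FEAT_ZPCI_INTERP */

/* Test feature bit nr, assuming bit 0 is the MSB of feat[0] */
static int sk_cpu_feat_test(const uint64_t *feat, unsigned int nr)
{
	assert(feat && nr < SK_CPU_FEAT_WORDS * 64);
	return (int)((feat[nr / 64] >> (63 - nr % 64)) & 1);
}
```

The feat[] contents would come from the KVM CPU model machine-feature
attribute; how they are fetched is outside the scope of this sketch.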

diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index 7a6b14874d65..ed06458a871f 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -130,6 +130,7 @@ struct kvm_s390_vm_cpu_machine {
 #define KVM_S390_VM_CPU_FEAT_PFMFI	11
 #define KVM_S390_VM_CPU_FEAT_SIGPIF	12
 #define KVM_S390_VM_CPU_FEAT_KSS	13
+#define KVM_S390_VM_CPU_FEAT_ZPCI_INTERP 14
 struct kvm_s390_vm_cpu_feat {
 	__u64 feat[16];
 };
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index b6c32fc3b272..3ed59fe512dd 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -434,6 +434,10 @@ static void kvm_s390_cpu_feat_init(void)
 	if (test_facility(151)) /* DFLTCC */
 		__insn32_query(INSN_DFLTCC, kvm_s390_available_subfunc.dfltcc);
 
+	if (test_facility(69) && test_facility(70) && test_facility(71) &&
+	    test_facility(72)) /* zPCI Interpretation */
+		allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ZPCI_INTERP);
+
 	if (MACHINE_HAS_ESOP)
 		allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ESOP);
 	/*
-- 
2.27.0


* [PATCH v2 30/30] MAINTAINERS: additional files related kvm s390 pci passthrough
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (28 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 29/30] KVM: s390: introduce CPU feature for zPCI Interpretation Matthew Rosato
@ 2022-01-14 20:31 ` Matthew Rosato
  2022-01-14 20:49 ` [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
  2022-01-19 18:10 ` Pierre Morel
  31 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:31 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel,
	Christian Borntraeger

Add entries from the s390 kvm subdirectory related to pci passthrough.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
---
 MAINTAINERS | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5d0cd537803a..1b52acd74cfd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16874,6 +16874,8 @@ M:	Eric Farman <farman@linux.ibm.com>
 L:	linux-s390@vger.kernel.org
 L:	kvm@vger.kernel.org
 S:	Supported
+F:	arch/s390/include/asm/kvm_pci.h
+F:	arch/s390/kvm/pci*
 F:	drivers/vfio/pci/vfio_pci_zdev.c
 F:	include/uapi/linux/vfio_zdev.h
 
-- 
2.27.0


* Re: [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (29 preceding siblings ...)
  2022-01-14 20:31 ` [PATCH v2 30/30] MAINTAINERS: additional files related kvm s390 pci passthrough Matthew Rosato
@ 2022-01-14 20:49 ` Matthew Rosato
  2022-01-19 18:10 ` Pierre Morel
  31 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-14 20:49 UTC (permalink / raw)
  To: linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/14/22 3:31 PM, Matthew Rosato wrote:
> Enable interpretive execution of zPCI instructions + adapter interruption
> forwarding for s390x KVM vfio-pci.  This is done by introducing a series
> of new vfio-pci feature ioctls that are unique to vfio-pci-zdev (s390x) and
> are used to negotiate the various aspects of zPCI interpretation setup.
> By allowing interpretation of zPCI instructions and firmware delivery of
> interrupts to guests, we can significantly reduce the frequency of guest
> SIE exits for zPCI.  We then see additional gains by handling a hot-path
> instruction that can still intercept to the hypervisor (RPCIT) directly
> in kvm.
> 
>  From the perspective of guest configuration, you pass through zPCI devices
> in the same manner as before, with interpretation support being used by
> default if available in kernel+qemu.
> 
> Will reply with a link to the associated QEMU series.

https://lore.kernel.org/qemu-devel/20220114203849.243657-1-mjrosato@linux.ibm.com/


* Re: [PATCH v2 04/30] s390/sclp: detect the AISI facility
  2022-01-14 20:31 ` [PATCH v2 04/30] s390/sclp: detect the AISI facility Matthew Rosato
@ 2022-01-17  7:57   ` Thomas Huth
  0 siblings, 0 replies; 97+ messages in thread
From: Thomas Huth @ 2022-01-17  7:57 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, pasic, kvm, linux-kernel

On 14/01/2022 21.31, Matthew Rosato wrote:
> Detect the Adapter Interruption Suppression Interpretation facility.
> 
> Reviewed-by: Eric Farman <farman@linux.ibm.com>
> Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/sclp.h   | 1 +
>   drivers/s390/char/sclp_early.c | 1 +
>   2 files changed, 2 insertions(+)
> 
> diff --git a/arch/s390/include/asm/sclp.h b/arch/s390/include/asm/sclp.h
> index 8c2e142000d4..33b174007848 100644
> --- a/arch/s390/include/asm/sclp.h
> +++ b/arch/s390/include/asm/sclp.h
> @@ -91,6 +91,7 @@ struct sclp_info {
>   	unsigned char has_zpci_lsi : 1;
>   	unsigned char has_aisii : 1;
>   	unsigned char has_aeni : 1;
> +	unsigned char has_aisi : 1;
>   	unsigned int ibc;
>   	unsigned int mtid;
>   	unsigned int mtid_cp;
> diff --git a/drivers/s390/char/sclp_early.c b/drivers/s390/char/sclp_early.c
> index e9af01b4c97a..c13e55cc4a5d 100644
> --- a/drivers/s390/char/sclp_early.c
> +++ b/drivers/s390/char/sclp_early.c
> @@ -47,6 +47,7 @@ static void __init sclp_early_facilities_detect(void)
>   	sclp.has_kss = !!(sccb->fac98 & 0x01);
>   	sclp.has_aisii = !!(sccb->fac118 & 0x40);
>   	sclp.has_aeni = !!(sccb->fac118 & 0x20);
> +	sclp.has_aisi = !!(sccb->fac118 & 0x10);
>   	sclp.has_zpci_lsi = !!(sccb->fac118 & 0x01);
>   	if (sccb->fac85 & 0x02)
>   		S390_lowcore.machine_flags |= MACHINE_FLAG_ESOP;

Just a matter of taste, but I'd maybe squash patches 1 - 4 into one patch.

  Thomas



* Re: [PATCH v2 05/30] s390/airq: pass more TPI info to airq handlers
  2022-01-14 20:31 ` [PATCH v2 05/30] s390/airq: pass more TPI info to airq handlers Matthew Rosato
@ 2022-01-17  8:27   ` Thomas Huth
  0 siblings, 0 replies; 97+ messages in thread
From: Thomas Huth @ 2022-01-17  8:27 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, pmorel, borntraeger,
	hca, gor, gerald.schaefer, agordeev, frankja, david, imbrenda,
	vneethv, oberpar, freude, pasic, kvm, linux-kernel

On 14/01/2022 21.31, Matthew Rosato wrote:
> A subsequent patch will introduce an airq handler that requires additional
> TPI information beyond directed vs floating, so pass the entire tpi_info
> structure via the handler.  Only pci actually uses this information today,
> for the other airq handlers this is effectively a no-op.
> 
> Reviewed-by: Eric Farman <farman@linux.ibm.com>
> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com>
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/airq.h     | 3 ++-
>   arch/s390/kvm/interrupt.c        | 4 +++-
>   arch/s390/pci/pci_irq.c          | 9 +++++++--
>   drivers/s390/cio/airq.c          | 2 +-
>   drivers/s390/cio/qdio_thinint.c  | 6 ++++--
>   drivers/s390/crypto/ap_bus.c     | 9 ++++++---
>   drivers/s390/virtio/virtio_ccw.c | 4 +++-
>   7 files changed, 26 insertions(+), 11 deletions(-)

Reviewed-by: Thomas Huth <thuth@redhat.com>



* Re: [PATCH v2 06/30] s390/airq: allow for airq structure that uses an input vector
  2022-01-14 20:31 ` [PATCH v2 06/30] s390/airq: allow for airq structure that uses an input vector Matthew Rosato
@ 2022-01-17 12:29   ` Claudio Imbrenda
  2022-01-18 18:52     ` Matthew Rosato
  2022-01-18  9:50   ` Pierre Morel
  1 sibling, 1 reply; 97+ messages in thread
From: Claudio Imbrenda @ 2022-01-17 12:29 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: linux-s390, alex.williamson, cohuck, schnelle, farman, pmorel,
	borntraeger, hca, gor, gerald.schaefer, agordeev, frankja, david,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

On Fri, 14 Jan 2022 15:31:21 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> When doing device passthrough where interrupts are being forwarded
> from host to guest, we wish to use a pinned section of guest memory
> as the vector (the same memory used by the guest as the vector).

Maybe expand the description of the patch to explain exactly what is
being done in this patch. Namely: you add a parameter to a function
(and some logic in the function to use the new parameter), but the
function is not being used yet. Pinning is also done somewhere else.

Maybe you can add something like:

	This patch adds a new parameter for airq_iv_create to pass the
	existing vector pinned in guest memory and to use it when
	needed instead of allocating a new one.

Apart from that, the patch looks good.
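
For illustration, the ownership rule that such a description implies can be modelled in user space. This is only a sketch under stated assumptions, not the kernel code: `iv_model`, `iv_model_create`, `iv_model_release` and `IV_MODEL_GUESTVEC` are hypothetical names, and `calloc`/`free` stand in for the DMA allocators used by the real `airq_iv_create`. The point it shows is that a caller-supplied vector is borrowed, never allocated or freed by the vector code.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define IV_MODEL_GUESTVEC 32UL	/* hypothetical stand-in for AIRQ_IV_GUESTVEC */

struct iv_model {
	unsigned long *vector;	/* bit vector, either owned or borrowed */
	unsigned long bits;
	unsigned long flags;
};

/* Create a vector; with IV_MODEL_GUESTVEC, reuse the caller's memory. */
static struct iv_model *iv_model_create(unsigned long bits, unsigned long flags,
					unsigned long *vec)
{
	const unsigned long bpl = sizeof(unsigned long) * CHAR_BIT;
	struct iv_model *iv = calloc(1, sizeof(*iv));

	if (!iv)
		return NULL;
	iv->bits = bits;
	iv->flags = flags;
	if (flags & IV_MODEL_GUESTVEC) {
		/* Borrowed: the caller pinned this memory and keeps ownership. */
		iv->vector = vec;
	} else {
		/* Owned: allocate a zeroed vector large enough for all bits. */
		iv->vector = calloc((bits + bpl - 1) / bpl, sizeof(unsigned long));
		if (!iv->vector) {
			free(iv);
			return NULL;
		}
	}
	return iv;
}

/* Release the structure; free the vector only if this code allocated it. */
static void iv_model_release(struct iv_model *iv)
{
	assert(iv != NULL);
	if (!(iv->flags & IV_MODEL_GUESTVEC))
		free(iv->vector);
	free(iv);
}
```

Like the quoted patch, the sketch does not validate `vec` when the guest-vector flag is set; whether that check belongs here or in the caller is a separate question.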

> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  arch/s390/include/asm/airq.h     |  4 +++-
>  arch/s390/pci/pci_irq.c          |  8 ++++----
>  drivers/s390/cio/airq.c          | 10 +++++++---
>  drivers/s390/virtio/virtio_ccw.c |  2 +-
>  4 files changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
> index 7918a7d09028..e82e5626e139 100644
> --- a/arch/s390/include/asm/airq.h
> +++ b/arch/s390/include/asm/airq.h
> @@ -47,8 +47,10 @@ struct airq_iv {
>  #define AIRQ_IV_PTR		4	/* Allocate the ptr array */
>  #define AIRQ_IV_DATA		8	/* Allocate the data array */
>  #define AIRQ_IV_CACHELINE	16	/* Cacheline alignment for the vector */
> +#define AIRQ_IV_GUESTVEC	32	/* Vector is a pinned guest page */
>  
> -struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags);
> +struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags,
> +			       unsigned long *vec);
>  void airq_iv_release(struct airq_iv *iv);
>  unsigned long airq_iv_alloc(struct airq_iv *iv, unsigned long num);
>  void airq_iv_free(struct airq_iv *iv, unsigned long bit, unsigned long num);
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index cc4c8d7c8f5c..0d0a02a9fbbf 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -296,7 +296,7 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
>  		zdev->aisb = bit;
>  
>  		/* Create adapter interrupt vector */
> -		zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK);
> +		zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK, NULL);
>  		if (!zdev->aibv)
>  			return -ENOMEM;
>  
> @@ -419,7 +419,7 @@ static int __init zpci_directed_irq_init(void)
>  	union zpci_sic_iib iib = {{0}};
>  	unsigned int cpu;
>  
> -	zpci_sbv = airq_iv_create(num_possible_cpus(), 0);
> +	zpci_sbv = airq_iv_create(num_possible_cpus(), 0, NULL);
>  	if (!zpci_sbv)
>  		return -ENOMEM;
>  
> @@ -441,7 +441,7 @@ static int __init zpci_directed_irq_init(void)
>  		zpci_ibv[cpu] = airq_iv_create(cache_line_size() * BITS_PER_BYTE,
>  					       AIRQ_IV_DATA |
>  					       AIRQ_IV_CACHELINE |
> -					       (!cpu ? AIRQ_IV_ALLOC : 0));
> +					       (!cpu ? AIRQ_IV_ALLOC : 0), NULL);
>  		if (!zpci_ibv[cpu])
>  			return -ENOMEM;
>  	}
> @@ -458,7 +458,7 @@ static int __init zpci_floating_irq_init(void)
>  	if (!zpci_ibv)
>  		return -ENOMEM;
>  
> -	zpci_sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC);
> +	zpci_sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, NULL);
>  	if (!zpci_sbv)
>  		goto out_free;
>  
> diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
> index 2f2226786319..375a58b1c838 100644
> --- a/drivers/s390/cio/airq.c
> +++ b/drivers/s390/cio/airq.c
> @@ -122,10 +122,12 @@ static inline unsigned long iv_size(unsigned long bits)
>   * airq_iv_create - create an interrupt vector
>   * @bits: number of bits in the interrupt vector
>   * @flags: allocation flags
> + * @vec: pointer to pinned guest memory if AIRQ_IV_GUESTVEC
>   *
>   * Returns a pointer to an interrupt vector structure
>   */
> -struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
> +struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags,
> +			       unsigned long *vec)
>  {
>  	struct airq_iv *iv;
>  	unsigned long size;
> @@ -146,6 +148,8 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
>  					     &iv->vector_dma);
>  		if (!iv->vector)
>  			goto out_free;
> +	} else if (flags & AIRQ_IV_GUESTVEC) {
> +		iv->vector = vec;
>  	} else {
>  		iv->vector = cio_dma_zalloc(size);
>  		if (!iv->vector)
> @@ -185,7 +189,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
>  	kfree(iv->avail);
>  	if (iv->flags & AIRQ_IV_CACHELINE && iv->vector)
>  		dma_pool_free(airq_iv_cache, iv->vector, iv->vector_dma);
> -	else
> +	else if (!(iv->flags & AIRQ_IV_GUESTVEC))
>  		cio_dma_free(iv->vector, size);
>  	kfree(iv);
>  out:
> @@ -204,7 +208,7 @@ void airq_iv_release(struct airq_iv *iv)
>  	kfree(iv->bitlock);
>  	if (iv->flags & AIRQ_IV_CACHELINE)
>  		dma_pool_free(airq_iv_cache, iv->vector, iv->vector_dma);
> -	else
> +	else if (!(iv->flags & AIRQ_IV_GUESTVEC))
>  		cio_dma_free(iv->vector, iv_size(iv->bits));
>  	kfree(iv->avail);
>  	kfree(iv);
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 52c376d15978..410498d693f8 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -241,7 +241,7 @@ static struct airq_info *new_airq_info(int index)
>  		return NULL;
>  	rwlock_init(&info->lock);
>  	info->aiv = airq_iv_create(VIRTIO_IV_BITS, AIRQ_IV_ALLOC | AIRQ_IV_PTR
> -				   | AIRQ_IV_CACHELINE);
> +				   | AIRQ_IV_CACHELINE, NULL);
>  	if (!info->aiv) {
>  		kfree(info);
>  		return NULL;



* Re: [PATCH v2 07/30] s390/pci: externalize the SIC operation controls and routine
  2022-01-14 20:31 ` [PATCH v2 07/30] s390/pci: externalize the SIC operation controls and routine Matthew Rosato
@ 2022-01-17 16:19   ` Niklas Schnelle
  2022-01-26 10:07   ` Claudio Imbrenda
  2022-01-27  9:57   ` Pierre Morel
  2 siblings, 0 replies; 97+ messages in thread
From: Niklas Schnelle @ 2022-01-17 16:19 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, farman, pmorel, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On Fri, 2022-01-14 at 15:31 -0500, Matthew Rosato wrote:
> A subsequent patch will be issuing SIC from KVM -- export the necessary
> routine and make the operation control definitions available from a header.
> Because the routine will now be exported, let's rename __zpci_set_irq_ctrl
> to zpci_set_irq_ctrl and get rid of the zero'd iib wrapper function of
> the same name.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>

Looks good, thank you!

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>

> ---
>  arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
>  arch/s390/pci/pci_insn.c         |  3 ++-
>  arch/s390/pci/pci_irq.c          | 26 ++++++++++++--------------
>  3 files changed, 23 insertions(+), 23 deletions(-)
> 
> 
---8<---



* Re: [PATCH v2 14/30] KVM: s390: pci: add basic kvm_zdev structure
  2022-01-14 20:31 ` [PATCH v2 14/30] KVM: s390: pci: add basic kvm_zdev structure Matthew Rosato
@ 2022-01-17 16:25   ` Pierre Morel
  2022-01-18 17:32     ` Pierre Morel
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-17 16:25 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> This structure will be used to carry kvm passthrough information related to
> zPCI devices.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_pci.h | 29 +++++++++++++++++++++
>   arch/s390/include/asm/pci.h     |  3 +++
>   arch/s390/kvm/Makefile          |  2 +-
>   arch/s390/kvm/pci.c             | 46 +++++++++++++++++++++++++++++++++
>   4 files changed, 79 insertions(+), 1 deletion(-)
>   create mode 100644 arch/s390/include/asm/kvm_pci.h
>   create mode 100644 arch/s390/kvm/pci.c
> 
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> new file mode 100644
> index 000000000000..aafee2976929
> --- /dev/null
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * KVM PCI Passthrough for virtual machines on s390
> + *
> + * Copyright IBM Corp. 2021
> + *
> + *    Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
> + */
> +
> +

One blank line too many.

Otherwise, looks good to me.

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>

> +#ifndef ASM_KVM_PCI_H
> +#define ASM_KVM_PCI_H
> +
> +#include <linux/types.h>
> +#include <linux/kvm_types.h>
> +#include <linux/kvm_host.h>
> +#include <linux/kvm.h>
> +#include <linux/pci.h>
> +
> +struct kvm_zdev {
> +	struct zpci_dev *zdev;
> +	struct kvm *kvm;
> +};
> +
> +int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> +void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
> +void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
> +
> +#endif /* ASM_KVM_PCI_H */
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index f3cd2da8128c..9b6c657d8d31 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -97,6 +97,7 @@ struct zpci_bar_struct {
>   };
>   
>   struct s390_domain;
> +struct kvm_zdev;
>   
>   #define ZPCI_FUNCTIONS_PER_BUS 256
>   struct zpci_bus {
> @@ -190,6 +191,8 @@ struct zpci_dev {
>   	struct dentry	*debugfs_dev;
>   
>   	struct s390_domain *s390_domain; /* s390 IOMMU domain data */
> +
> +	struct kvm_zdev *kzdev; /* passthrough data */
>   };
>   
>   static inline bool zdev_enabled(struct zpci_dev *zdev)
> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
> index b3aaadc60ead..a26f4fe7b680 100644
> --- a/arch/s390/kvm/Makefile
> +++ b/arch/s390/kvm/Makefile
> @@ -11,5 +11,5 @@ ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>   
>   kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
>   kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
> -
> +kvm-$(CONFIG_PCI) += pci.o
>   obj-$(CONFIG_KVM) += kvm.o
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> new file mode 100644
> index 000000000000..1c33bc7bf2bd
> --- /dev/null
> +++ b/arch/s390/kvm/pci.c
> @@ -0,0 +1,46 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * s390 kvm PCI passthrough support
> + *
> + * Copyright IBM Corp. 2021
> + *
> + *    Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
> + */
> +
> +#include <linux/kvm_host.h>
> +#include <linux/pci.h>
> +#include <asm/kvm_pci.h>
> +
> +int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
> +{
> +	struct kvm_zdev *kzdev;
> +
> +	kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
> +	if (!kzdev)
> +		return -ENOMEM;
> +
> +	kzdev->zdev = zdev;
> +	zdev->kzdev = kzdev;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
> +
> +void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
> +{
> +	struct kvm_zdev *kzdev;
> +
> +	kzdev = zdev->kzdev;
> +	WARN_ON(kzdev->zdev != zdev);
> +	zdev->kzdev = 0;
> +	kfree(kzdev);
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
> +
> +void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
> +{
> +	struct kvm_zdev *kzdev = zdev->kzdev;
> +
> +	kzdev->kvm = kvm;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 16/30] KVM: s390: pci: enable host forwarding of Adapter Event Notifications
  2022-01-14 20:31 ` [PATCH v2 16/30] KVM: s390: pci: enable host forwarding of Adapter Event Notifications Matthew Rosato
@ 2022-01-17 17:38   ` Pierre Morel
  2022-01-18 17:25     ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-17 17:38 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> In cases where interrupts are not forwarded to the guest via firmware,
> KVM is responsible for ensuring delivery.  When an interrupt presents
> with the forwarding bit, we must process the forwarding tables until
> all interrupts are delivered.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_host.h |  1 +
>   arch/s390/include/asm/tpi.h      | 13 ++++++
>   arch/s390/kvm/interrupt.c        | 76 +++++++++++++++++++++++++++++++-
>   arch/s390/kvm/kvm-s390.c         |  3 +-
>   arch/s390/kvm/pci.h              |  9 ++++
>   5 files changed, 100 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index a604d51acfc8..3f147b8d050b 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -757,6 +757,7 @@ struct kvm_vm_stat {
>   	u64 inject_pfault_done;
>   	u64 inject_service_signal;
>   	u64 inject_virtio;
> +	u64 aen_forward;
>   };
>   
>   struct kvm_arch_memory_slot {
> diff --git a/arch/s390/include/asm/tpi.h b/arch/s390/include/asm/tpi.h
> index 1ac538b8cbf5..f76e5fdff23a 100644
> --- a/arch/s390/include/asm/tpi.h
> +++ b/arch/s390/include/asm/tpi.h
> @@ -19,6 +19,19 @@ struct tpi_info {
>   	u32 :12;
>   } __packed __aligned(4);
>   
> +/* I/O-Interruption Code as stored by TPI for an Adapter I/O */
> +struct tpi_adapter_info {
> +	u32 aism:8;
> +	u32 :22;
> +	u32 error:1;
> +	u32 forward:1;
> +	u32 reserved;
> +	u32 adapter_IO:1;
> +	u32 directed_irq:1;
> +	u32 isc:3;
> +	u32 :27;
> +} __packed __aligned(4);
> +
>   #endif /* __ASSEMBLY__ */
>   
>   #endif /* _ASM_S390_TPI_H */
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index a591b8cd662f..07743c6a67c4 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -3263,11 +3263,85 @@ int kvm_s390_gisc_unregister(struct kvm *kvm, u32 gisc)
>   }
>   EXPORT_SYMBOL_GPL(kvm_s390_gisc_unregister);
>   
> +static void aen_host_forward(unsigned long si)
> +{
> +	struct kvm_s390_gisa_interrupt *gi;
> +	struct zpci_gaite *gaite;
> +	struct kvm *kvm;
> +
> +	gaite = (struct zpci_gaite *)aift->gait +
> +		(si * sizeof(struct zpci_gaite));
> +	if (gaite->count == 0)
> +		return;
> +	if (gaite->aisb != 0)
> +		set_bit_inv(gaite->aisbo, (unsigned long *)gaite->aisb);
> +
> +	kvm = kvm_s390_pci_si_to_kvm(aift, si);
> +	if (kvm == 0)
> +		return;
> +	gi = &kvm->arch.gisa_int;
> +
> +	if (!(gi->origin->g1.simm & AIS_MODE_MASK(gaite->gisc)) ||
> +	    !(gi->origin->g1.nimm & AIS_MODE_MASK(gaite->gisc))) {
> +		gisa_set_ipm_gisc(gi->origin, gaite->gisc);
> +		if (hrtimer_active(&gi->timer))
> +			hrtimer_cancel(&gi->timer);
> +		hrtimer_start(&gi->timer, 0, HRTIMER_MODE_REL);
> +		kvm->stat.aen_forward++;
> +	}
> +}
> +
> +static void aen_process_gait(u8 isc)
> +{
> +	bool found = false, first = true;
> +	union zpci_sic_iib iib = {{0}};
> +	unsigned long si, flags;
> +
> +	spin_lock_irqsave(&aift->gait_lock, flags);
> +
> +	if (!aift->gait) {
> +		spin_unlock_irqrestore(&aift->gait_lock, flags);
> +		return;
> +	}
> +
> +	for (si = 0;;) {
> +		/* Scan adapter summary indicator bit vector */
> +		si = airq_iv_scan(aift->sbv, si, airq_iv_end(aift->sbv));
> +		if (si == -1UL) {
> +			if (first || found) {
> +				/* Reenable interrupts. */
> +				if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, isc,
> +						      &iib))
> +					break;

AFAIU this code is specific to VFIO interpretation, and facility 12 is
a precondition for it, so I think this break will never occur.
If I am right, we should not test the return value, which will make the
code clearer.

> +				first = found = false;
> +			} else {
> +				/* Interrupts on and all bits processed */
> +				break;
> +			}

Maybe add a comment: "rescan after re-enabling interrupts"
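
To show why the rescan matters, here is a minimal user-space sketch of the loop's control flow, with that comment in place. It is a model under assumptions, not the patch itself: `scan_model`, `scan_and_clear`, `reenable_irqs` and `process_summary` are hypothetical names, a 64-bit word stands in for the summary bit vector, the scan is assumed to clear the bit it returns, and the failure path of the re-enable call is left out. The `late` field simulates an indicator that gets set while interrupts are being re-enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct scan_model {
	uint64_t pending;	/* summary bits, one per device */
	uint64_t late;		/* bits that arrive during re-enablement */
	unsigned int forwarded;	/* number of indicators handled */
	unsigned int reenabled;	/* number of re-enable calls */
};

/* Find the lowest set bit at or after start, clear it, return its index. */
static int scan_and_clear(struct scan_model *m, int start)
{
	for (int i = start; i < 64; i++) {
		if (m->pending & (UINT64_C(1) << i)) {
			m->pending &= ~(UINT64_C(1) << i);
			return i;
		}
	}
	return -1;
}

/* Model re-enabling interrupts; late bits become visible at this point. */
static void reenable_irqs(struct scan_model *m)
{
	m->reenabled++;
	m->pending |= m->late;
	m->late = 0;
}

static void process_summary(struct scan_model *m)
{
	bool found = false, first = true;
	int si = 0;

	for (;;) {
		si = scan_and_clear(m, si);
		if (si < 0) {
			/* Interrupts on and a full pass found nothing */
			if (!first && !found)
				break;
			reenable_irqs(m);
			first = found = false;
			/* Rescan after re-enabling interrupts */
			si = 0;
			continue;
		}
		found = true;
		m->forwarded++;
	}
}
```

In this model the loop only terminates after one complete pass that finds nothing while interrupts are already enabled, so an indicator set during re-enablement is still picked up.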

> +			found = false;
> +			si = 0;
> +			continue;
> +		}
> +		found = true;
> +		aen_host_forward(si);
> +	}
> +
> +	spin_unlock_irqrestore(&aift->gait_lock, flags);
> +}
> +
>   static void gib_alert_irq_handler(struct airq_struct *airq,
>   				  struct tpi_info *tpi_info)
>   {
> +	struct tpi_adapter_info *info = (struct tpi_adapter_info *)tpi_info;
> +
>   	inc_irq_stat(IRQIO_GAL);
> -	process_gib_alert_list();
> +
> +	if (IS_ENABLED(CONFIG_PCI) && (info->forward || info->error)) {
> +		aen_process_gait(info->isc);
> +		if (info->aism != 0)
> +			process_gib_alert_list();
> +	} else
> +		process_gib_alert_list();

NIT: I think we need braces around this statement
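
As a sketch of the shape I mean (kernel coding style asks for braces on both branches when one of them needs braces), here is a user-space model of the dispatch. The names `alert_model`, `alert_counts` and `handle_alert` are hypothetical, counters replace the real calls to aen_process_gait() and process_gib_alert_list(), and the IS_ENABLED(CONFIG_PCI) check is omitted.

```c
#include <assert.h>
#include <stdbool.h>

struct alert_model {
	bool forward;		/* models info->forward */
	bool error;		/* models info->error */
	unsigned int aism;	/* models info->aism */
};

struct alert_counts {
	unsigned int gait_runs;	/* stands in for aen_process_gait() */
	unsigned int gib_runs;	/* stands in for process_gib_alert_list() */
};

static void handle_alert(const struct alert_model *info,
			 struct alert_counts *c)
{
	if (info->forward || info->error) {
		c->gait_runs++;
		if (info->aism != 0)
			c->gib_runs++;
	} else {
		c->gib_runs++;
	}
}
```

The behavior is unchanged; only the else branch gains braces to match the if branch.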

>   }
>   
>   static struct airq_struct gib_alert_irq = {
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 01dc3f6883d0..ab8b56deed11 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -65,7 +65,8 @@ const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
>   	STATS_DESC_COUNTER(VM, inject_float_mchk),
>   	STATS_DESC_COUNTER(VM, inject_pfault_done),
>   	STATS_DESC_COUNTER(VM, inject_service_signal),
> -	STATS_DESC_COUNTER(VM, inject_virtio)
> +	STATS_DESC_COUNTER(VM, inject_virtio),
> +	STATS_DESC_COUNTER(VM, aen_forward)
>   };
>   
>   const struct kvm_stats_header kvm_vm_stats_header = {
> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
> index b2000ed7b8c3..387b637863c9 100644
> --- a/arch/s390/kvm/pci.h
> +++ b/arch/s390/kvm/pci.h
> @@ -12,6 +12,7 @@
>   
>   #include <linux/pci.h>
>   #include <linux/mutex.h>
> +#include <linux/kvm_host.h>
>   #include <asm/airq.h>
>   #include <asm/kvm_pci.h>
>   
> @@ -34,6 +35,14 @@ struct zpci_aift {
>   
>   extern struct zpci_aift *aift;
>   
> +static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
> +						 unsigned long si)
> +{
> +	if (!IS_ENABLED(CONFIG_PCI) || aift->kzdev == 0 || aift->kzdev[si] == 0)

Wouldn't CONFIG_VFIO_PCI be better here?

> +		return 0;
> +	return aift->kzdev[si]->kvm;
> +};
> +
>   int kvm_s390_pci_aen_init(u8 nisc);
>   void kvm_s390_pci_aen_exit(void);
>   
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 06/30] s390/airq: allow for airq structure that uses an input vector
  2022-01-14 20:31 ` [PATCH v2 06/30] s390/airq: allow for airq structure that uses an input vector Matthew Rosato
  2022-01-17 12:29   ` Claudio Imbrenda
@ 2022-01-18  9:50   ` Pierre Morel
  1 sibling, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-18  9:50 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> When doing device passthrough where interrupts are being forwarded
> from host to guest, we wish to use a pinned section of guest memory
> as the vector (the same memory used by the guest as the vector).
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>


Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>


> ---
>   arch/s390/include/asm/airq.h     |  4 +++-
>   arch/s390/pci/pci_irq.c          |  8 ++++----
>   drivers/s390/cio/airq.c          | 10 +++++++---
>   drivers/s390/virtio/virtio_ccw.c |  2 +-
>   4 files changed, 15 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/s390/include/asm/airq.h b/arch/s390/include/asm/airq.h
> index 7918a7d09028..e82e5626e139 100644
> --- a/arch/s390/include/asm/airq.h
> +++ b/arch/s390/include/asm/airq.h
> @@ -47,8 +47,10 @@ struct airq_iv {
>   #define AIRQ_IV_PTR		4	/* Allocate the ptr array */
>   #define AIRQ_IV_DATA		8	/* Allocate the data array */
>   #define AIRQ_IV_CACHELINE	16	/* Cacheline alignment for the vector */
> +#define AIRQ_IV_GUESTVEC	32	/* Vector is a pinned guest page */
>   
> -struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags);
> +struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags,
> +			       unsigned long *vec);
>   void airq_iv_release(struct airq_iv *iv);
>   unsigned long airq_iv_alloc(struct airq_iv *iv, unsigned long num);
>   void airq_iv_free(struct airq_iv *iv, unsigned long bit, unsigned long num);
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index cc4c8d7c8f5c..0d0a02a9fbbf 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -296,7 +296,7 @@ int arch_setup_msi_irqs(struct pci_dev *pdev, int nvec, int type)
>   		zdev->aisb = bit;
>   
>   		/* Create adapter interrupt vector */
> -		zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK);
> +		zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA | AIRQ_IV_BITLOCK, NULL);
>   		if (!zdev->aibv)
>   			return -ENOMEM;
>   
> @@ -419,7 +419,7 @@ static int __init zpci_directed_irq_init(void)
>   	union zpci_sic_iib iib = {{0}};
>   	unsigned int cpu;
>   
> -	zpci_sbv = airq_iv_create(num_possible_cpus(), 0);
> +	zpci_sbv = airq_iv_create(num_possible_cpus(), 0, NULL);
>   	if (!zpci_sbv)
>   		return -ENOMEM;
>   
> @@ -441,7 +441,7 @@ static int __init zpci_directed_irq_init(void)
>   		zpci_ibv[cpu] = airq_iv_create(cache_line_size() * BITS_PER_BYTE,
>   					       AIRQ_IV_DATA |
>   					       AIRQ_IV_CACHELINE |
> -					       (!cpu ? AIRQ_IV_ALLOC : 0));
> +					       (!cpu ? AIRQ_IV_ALLOC : 0), NULL);
>   		if (!zpci_ibv[cpu])
>   			return -ENOMEM;
>   	}
> @@ -458,7 +458,7 @@ static int __init zpci_floating_irq_init(void)
>   	if (!zpci_ibv)
>   		return -ENOMEM;
>   
> -	zpci_sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC);
> +	zpci_sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, NULL);
>   	if (!zpci_sbv)
>   		goto out_free;
>   
> diff --git a/drivers/s390/cio/airq.c b/drivers/s390/cio/airq.c
> index 2f2226786319..375a58b1c838 100644
> --- a/drivers/s390/cio/airq.c
> +++ b/drivers/s390/cio/airq.c
> @@ -122,10 +122,12 @@ static inline unsigned long iv_size(unsigned long bits)
>    * airq_iv_create - create an interrupt vector
>    * @bits: number of bits in the interrupt vector
>    * @flags: allocation flags
> + * @vec: pointer to pinned guest memory if AIRQ_IV_GUESTVEC
>    *
>    * Returns a pointer to an interrupt vector structure
>    */
> -struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
> +struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags,
> +			       unsigned long *vec)
>   {
>   	struct airq_iv *iv;
>   	unsigned long size;
> @@ -146,6 +148,8 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
>   					     &iv->vector_dma);
>   		if (!iv->vector)
>   			goto out_free;
> +	} else if (flags & AIRQ_IV_GUESTVEC) {
> +		iv->vector = vec;
>   	} else {
>   		iv->vector = cio_dma_zalloc(size);
>   		if (!iv->vector)
> @@ -185,7 +189,7 @@ struct airq_iv *airq_iv_create(unsigned long bits, unsigned long flags)
>   	kfree(iv->avail);
>   	if (iv->flags & AIRQ_IV_CACHELINE && iv->vector)
>   		dma_pool_free(airq_iv_cache, iv->vector, iv->vector_dma);
> -	else
> +	else if (!(iv->flags & AIRQ_IV_GUESTVEC))
>   		cio_dma_free(iv->vector, size);
>   	kfree(iv);
>   out:
> @@ -204,7 +208,7 @@ void airq_iv_release(struct airq_iv *iv)
>   	kfree(iv->bitlock);
>   	if (iv->flags & AIRQ_IV_CACHELINE)
>   		dma_pool_free(airq_iv_cache, iv->vector, iv->vector_dma);
> -	else
> +	else if (!(iv->flags & AIRQ_IV_GUESTVEC))
>   		cio_dma_free(iv->vector, iv_size(iv->bits));
>   	kfree(iv->avail);
>   	kfree(iv);
> diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
> index 52c376d15978..410498d693f8 100644
> --- a/drivers/s390/virtio/virtio_ccw.c
> +++ b/drivers/s390/virtio/virtio_ccw.c
> @@ -241,7 +241,7 @@ static struct airq_info *new_airq_info(int index)
>   		return NULL;
>   	rwlock_init(&info->lock);
>   	info->aiv = airq_iv_create(VIRTIO_IV_BITS, AIRQ_IV_ALLOC | AIRQ_IV_PTR
> -				   | AIRQ_IV_CACHELINE);
> +				   | AIRQ_IV_CACHELINE, NULL);
>   	if (!info->aiv) {
>   		kfree(info);
>   		return NULL;
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 09/30] s390/pci: export some routines related to RPCIT processing
  2022-01-14 20:31 ` [PATCH v2 09/30] s390/pci: export some routines related to RPCIT processing Matthew Rosato
@ 2022-01-18  9:51   ` Pierre Morel
  0 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-18  9:51 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel,
	Christian Borntraeger



On 1/14/22 21:31, Matthew Rosato wrote:
> KVM will re-use dma_walk_cpu_trans to walk the host shadow table and
> will also need to be able to call zpci_refresh_trans to re-issue a RPCIT.
> 
> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>


Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>

> ---
>   arch/s390/pci/pci_dma.c  | 1 +
>   arch/s390/pci/pci_insn.c | 1 +
>   2 files changed, 2 insertions(+)
> 
> diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
> index f46833a25526..a81de48d5ea7 100644
> --- a/arch/s390/pci/pci_dma.c
> +++ b/arch/s390/pci/pci_dma.c
> @@ -116,6 +116,7 @@ unsigned long *dma_walk_cpu_trans(unsigned long *rto, dma_addr_t dma_addr)
>   	px = calc_px(dma_addr);
>   	return &pto[px];
>   }
> +EXPORT_SYMBOL_GPL(dma_walk_cpu_trans);
>   
>   void dma_update_cpu_trans(unsigned long *entry, phys_addr_t page_addr, int flags)
>   {
> diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
> index 2a47b3936e44..0509554301c7 100644
> --- a/arch/s390/pci/pci_insn.c
> +++ b/arch/s390/pci/pci_insn.c
> @@ -95,6 +95,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
>   
>   	return (cc) ? -EIO : 0;
>   }
> +EXPORT_SYMBOL_GPL(zpci_refresh_trans);
>   
>   /* Set Interruption Controls */
>   int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 11/30] s390/pci: add helper function to find device by handle
  2022-01-14 20:31 ` [PATCH v2 11/30] s390/pci: add helper function to find device by handle Matthew Rosato
@ 2022-01-18  9:53   ` Pierre Morel
  0 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-18  9:53 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel,
	Christian Borntraeger



On 1/14/22 21:31, Matthew Rosato wrote:
> Intercepted zPCI instructions will specify the desired function via a
> function handle.  Add a routine to find the device with the specified
> handle.
> 
> Acked-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Reviewed-by: Eric Farman <farman@linux.ibm.com>
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>


> ---
>   arch/s390/include/asm/pci.h |  1 +
>   arch/s390/pci/pci.c         | 16 ++++++++++++++++
>   2 files changed, 17 insertions(+)
> 
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 1a8f9f42da3a..00a2c24d6d2b 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -275,6 +275,7 @@ static inline struct zpci_dev *to_zpci_dev(struct device *dev)
>   }
>   
>   struct zpci_dev *get_zdev_by_fid(u32);
> +struct zpci_dev *get_zdev_by_fh(u32 fh);
>   
>   /* DMA */
>   int zpci_dma_init(void);
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index 0c9879dae752..1e939b4cf25e 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -76,6 +76,22 @@ struct zpci_dev *get_zdev_by_fid(u32 fid)
>   	return zdev;
>   }
>   
> +struct zpci_dev *get_zdev_by_fh(u32 fh)
> +{
> +	struct zpci_dev *tmp, *zdev = NULL;
> +
> +	spin_lock(&zpci_list_lock);
> +	list_for_each_entry(tmp, &zpci_list, entry) {
> +		if (tmp->fh == fh) {
> +			zdev = tmp;
> +			break;
> +		}
> +	}
> +	spin_unlock(&zpci_list_lock);
> +	return zdev;
> +}
> +EXPORT_SYMBOL_GPL(get_zdev_by_fh);
> +
>   void zpci_remove_reserved_devices(void)
>   {
>   	struct zpci_dev *tmp, *zdev;
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 12/30] s390/pci: get SHM information from list pci
  2022-01-14 20:31 ` [PATCH v2 12/30] s390/pci: get SHM information from list pci Matthew Rosato
@ 2022-01-18 10:36   ` Pierre Morel
  2022-01-26 10:13     ` Claudio Imbrenda
  2022-01-27 10:29   ` Niklas Schnelle
  1 sibling, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-18 10:36 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> KVM will need information on the special handle mask used to indicate
> emulated devices.  In order to obtain this, a new type of list pci call
> must be made to gather the information.  Extend clp_list_pci_req to
> also fetch the model-dependent-data field that holds this mask.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/pci.h     |  1 +
>   arch/s390/include/asm/pci_clp.h |  2 +-
>   arch/s390/pci/pci_clp.c         | 28 +++++++++++++++++++++++++---
>   3 files changed, 27 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 00a2c24d6d2b..f3cd2da8128c 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -227,6 +227,7 @@ int clp_enable_fh(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as);
>   int clp_disable_fh(struct zpci_dev *zdev, u32 *fh);
>   int clp_get_state(u32 fid, enum zpci_state *state);
>   int clp_refresh_fh(u32 fid, u32 *fh);
> +int zpci_get_mdd(u32 *mdd);
>   
>   /* UID */
>   void update_uid_checking(bool new);
> diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
> index 124fadfb74b9..d6bc324763f3 100644
> --- a/arch/s390/include/asm/pci_clp.h
> +++ b/arch/s390/include/asm/pci_clp.h
> @@ -76,7 +76,7 @@ struct clp_req_list_pci {
>   struct clp_rsp_list_pci {
>   	struct clp_rsp_hdr hdr;
>   	u64 resume_token;
> -	u32 reserved2;
> +	u32 mdd;
>   	u16 max_fn;
>   	u8			: 7;
>   	u8 uid_checking		: 1;
> diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
> index bc7446566cbc..308ffb93413f 100644
> --- a/arch/s390/pci/pci_clp.c
> +++ b/arch/s390/pci/pci_clp.c
> @@ -328,7 +328,7 @@ int clp_disable_fh(struct zpci_dev *zdev, u32 *fh)
>   }
>   
>   static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
> -			    u64 *resume_token, int *nentries)
> +			    u64 *resume_token, int *nentries, u32 *mdd)
>   {
>   	int rc;
>   
> @@ -354,6 +354,8 @@ static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
>   	*nentries = (rrb->response.hdr.len - LIST_PCI_HDR_LEN) /
>   		rrb->response.entry_size;
>   	*resume_token = rrb->response.resume_token;
> +	if (mdd)
> +		*mdd = rrb->response.mdd;
>   
>   	return rc;
>   }
> @@ -365,7 +367,7 @@ static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
>   	int nentries, i, rc;
>   
>   	do {
> -		rc = clp_list_pci_req(rrb, &resume_token, &nentries);
> +		rc = clp_list_pci_req(rrb, &resume_token, &nentries, NULL);
>   		if (rc)
>   			return rc;
>   		for (i = 0; i < nentries; i++)
> @@ -383,7 +385,7 @@ static int clp_find_pci(struct clp_req_rsp_list_pci *rrb, u32 fid,
>   	int nentries, i, rc;
>   
>   	do {
> -		rc = clp_list_pci_req(rrb, &resume_token, &nentries);
> +		rc = clp_list_pci_req(rrb, &resume_token, &nentries, NULL);
>   		if (rc)
>   			return rc;
>   		fh_list = rrb->response.fh_list;
> @@ -468,6 +470,26 @@ int clp_get_state(u32 fid, enum zpci_state *state)
>   	return rc;
>   }
>   
> +int zpci_get_mdd(u32 *mdd)
> +{
> +	struct clp_req_rsp_list_pci *rrb;
> +	u64 resume_token = 0;
> +	int nentries, rc;
> +
> +	if (!mdd)
> +		return -EINVAL;

I think this test is not useful.
The caller must take care not to pass a NULL pointer,
which the only caller today already ensures.
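
To illustrate the point, here is a minimal userspace sketch (a model under stated assumptions, not the kernel code). It mirrors the shape of clp_list_pci_req() from the hunks above, where the mdd out-parameter is already optional, so a NULL handed to the wrapper would not be dereferenced even without the extra check. All sketch_ names and the fake response struct are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the list-pci response; not the real CLP layout. */
struct sketch_rsp {
	uint64_t resume_token;
	uint32_t mdd;
	int nentries;
};

/* Models clp_list_pci_req(): the mdd out-parameter is optional. */
static int sketch_list_pci_req(const struct sketch_rsp *rsp,
			       uint64_t *resume_token, int *nentries,
			       uint32_t *mdd)
{
	*nentries = rsp->nentries;
	*resume_token = rsp->resume_token;
	if (mdd)
		*mdd = rsp->mdd;
	return 0;
}

/* Models zpci_get_mdd() without the NULL test on mdd. */
static int sketch_get_mdd(const struct sketch_rsp *rsp, uint32_t *mdd)
{
	uint64_t resume_token = 0;
	int nentries;

	return sketch_list_pci_req(rsp, &resume_token, &nentries, mdd);
}
```

With a NULL argument the sketch simply performs the request and stores nothing, which is harmless but pointless; that is why leaving the responsibility with the caller seems sufficient.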


> +
> +	rrb = clp_alloc_block(GFP_KERNEL);
> +	if (!rrb)
> +		return -ENOMEM;
> +
> +	rc = clp_list_pci_req(rrb, &resume_token, &nentries, mdd);
> +
> +	clp_free_block(rrb);
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(zpci_get_mdd);
> +
>   static int clp_base_slpc(struct clp_req *req, struct clp_req_rsp_slpc *lpcb)
>   {
>   	unsigned long limit = PAGE_SIZE - sizeof(lpcb->request);
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 22/30] KVM: s390: intercept the rpcit instruction
  2022-01-14 20:31 ` [PATCH v2 22/30] KVM: s390: intercept the rpcit instruction Matthew Rosato
@ 2022-01-18 11:05   ` Pierre Morel
  2022-01-18 17:27     ` Matthew Rosato
  2022-01-19 14:06   ` Pierre Morel
  1 sibling, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-18 11:05 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> For faster handling of PCI translation refreshes, intercept in KVM
> and call the associated handler.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/kvm/priv.c | 46 ++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 46 insertions(+)
> 
> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
> index 417154b314a6..5b65c1830de2 100644
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -29,6 +29,7 @@
>   #include <asm/ap.h>
>   #include "gaccess.h"
>   #include "kvm-s390.h"
> +#include "pci.h"
>   #include "trace.h"
>   
>   static int handle_ri(struct kvm_vcpu *vcpu)
> @@ -335,6 +336,49 @@ static int handle_rrbe(struct kvm_vcpu *vcpu)
>   	return 0;
>   }
>   
> +static int handle_rpcit(struct kvm_vcpu *vcpu)
> +{
> +	int reg1, reg2;
> +	u8 status;
> +	int rc;
> +
> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
> +
> +	/* If the host doesn't support PCI, it must be an emulated device */
> +	if (!IS_ENABLED(CONFIG_PCI))
> +		return -EOPNOTSUPP;

AFAIU this also makes sure that the following code is not compiled in
case PCI is not supported.

I am not very familiar with compilation options; is this true with all our
compilers and options?
Or do we have to specify a compiler version?

Another concern: shouldn't we use IS_ENABLED(CONFIG_VFIO_PCI)?



> +
> +	kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
> +
> +	/* If the device has a SHM bit on, let userspace take care of this */
> +	if (((vcpu->run->s.regs.gprs[reg1] >> 32) & aift->mdd) != 0)
> +		return -EOPNOTSUPP;
> +
> +	rc = kvm_s390_pci_refresh_trans(vcpu, vcpu->run->s.regs.gprs[reg1],
> +					vcpu->run->s.regs.gprs[reg2],
> +					vcpu->run->s.regs.gprs[reg2+1],
> +					&status);
> +
> +	switch (rc) {
> +	case 0:
> +		kvm_s390_set_psw_cc(vcpu, 0);
> +		break;
> +	case -EOPNOTSUPP:
> +		return -EOPNOTSUPP;
> +	default:
> +		vcpu->run->s.regs.gprs[reg1] &= 0xffffffff00ffffffUL;
> +		vcpu->run->s.regs.gprs[reg1] |= (u64) status << 24;
> +		if (status != 0)
> +			kvm_s390_set_psw_cc(vcpu, 1);
> +		else
> +			kvm_s390_set_psw_cc(vcpu, 3);
> +		break;
> +	}
> +
> +	return 0;
> +}
> +
>   #define SSKE_NQ 0x8
>   #define SSKE_MR 0x4
>   #define SSKE_MC 0x2
> @@ -1275,6 +1319,8 @@ int kvm_s390_handle_b9(struct kvm_vcpu *vcpu)
>   		return handle_essa(vcpu);
>   	case 0xaf:
>   		return handle_pfmf(vcpu);
> +	case 0xd3:
> +		return handle_rpcit(vcpu);
>   	default:
>   		return -EOPNOTSUPP;
>   	}
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 23/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV
  2022-01-14 20:31 ` [PATCH v2 23/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV Matthew Rosato
@ 2022-01-18 17:20   ` Pierre Morel
  2022-01-18 17:32     ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-18 17:20 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> This was previously removed as unnecessary; while that was true, subsequent
> changes will make KVM an additional required component for vfio-pci-zdev.
> Let's re-introduce CONFIG_VFIO_PCI_ZDEV as now there is actually a reason
> to say 'n' for it (when not planning to CONFIG_KVM).
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   drivers/vfio/pci/Kconfig      | 11 +++++++++++
>   drivers/vfio/pci/Makefile     |  2 +-
>   include/linux/vfio_pci_core.h |  2 +-
>   3 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
> index 860424ccda1b..fedd1d4cb592 100644
> --- a/drivers/vfio/pci/Kconfig
> +++ b/drivers/vfio/pci/Kconfig
> @@ -42,5 +42,16 @@ config VFIO_PCI_IGD
>   	  and LPC bridge config space.
>   
>   	  To enable Intel IGD assignment through vfio-pci, say Y.
> +
> +config VFIO_PCI_ZDEV
> +	bool "VFIO PCI extensions for s390x KVM passthrough"
> +	depends on S390 && KVM
> +	default y
> +	help
> +	  Support s390x-specific extensions to enable support for enhancements
> +	  to KVM passthrough capabilities, such as interpretive execution of
> +	  zPCI instructions.
> +
> +	  To enable s390x KVM vfio-pci extensions, say Y.

In several patches we check on CONFIG_PCI (14, 15, 16, 17 and 22), but we
may have PCI without VFIO_PCI; wouldn't that be a problem?

Here we define a new CONFIG entry and I have two questions:

1- There is no dependency on VFIO_PCI, although the functionality is
obviously based on VFIO_PCI.

2- Wouldn't it be possible to use this item as the single condition for
the different checks we need throughout the new VFIO interpretation
functionality?




>   endif
>   endif
> diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
> index 349d68d242b4..01b1f83d83d7 100644
> --- a/drivers/vfio/pci/Makefile
> +++ b/drivers/vfio/pci/Makefile
> @@ -1,7 +1,7 @@
>   # SPDX-License-Identifier: GPL-2.0-only
>   
>   vfio-pci-core-y := vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o
> -vfio-pci-core-$(CONFIG_S390) += vfio_pci_zdev.o
> +vfio-pci-core-$(CONFIG_VFIO_PCI_ZDEV) += vfio_pci_zdev.o
>   obj-$(CONFIG_VFIO_PCI_CORE) += vfio-pci-core.o
>   
>   vfio-pci-y := vfio_pci.o
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index ef9a44b6cf5d..5e2bca3b89db 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -195,7 +195,7 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
>   }
>   #endif
>   
> -#ifdef CONFIG_S390
> +#ifdef CONFIG_VFIO_PCI_ZDEV
>   extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
>   				       struct vfio_info_cap *caps);
>   #else
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 16/30] KVM: s390: pci: enable host forwarding of Adapter Event Notifications
  2022-01-17 17:38   ` Pierre Morel
@ 2022-01-18 17:25     ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-18 17:25 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/17/22 12:38 PM, Pierre Morel wrote:
> 
...
>> +static void aen_process_gait(u8 isc)
>> +{
>> +    bool found = false, first = true;
>> +    union zpci_sic_iib iib = {{0}};
>> +    unsigned long si, flags;
>> +
>> +    spin_lock_irqsave(&aift->gait_lock, flags);
>> +
>> +    if (!aift->gait) {
>> +        spin_unlock_irqrestore(&aift->gait_lock, flags);
>> +        return;
>> +    }
>> +
>> +    for (si = 0;;) {
>> +        /* Scan adapter summary indicator bit vector */
>> +        si = airq_iv_scan(aift->sbv, si, airq_iv_end(aift->sbv));
>> +        if (si == -1UL) {
>> +            if (first || found) {
>> +                /* Reenable interrupts. */
>> +                if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, isc,
>> +                              &iib))
>> +                    break;
> 
> AFAIU this code is VFIO interpretation specific code and facility 12 is 
> a precondition for it, so I think this break will never occur.
> If I am right we should not test the return value which will make the 
> code clearer.

Yep, you are correct; we can just ignore the return value here.

> 
>> +                first = found = false;
>> +            } else {
>> +                /* Interrupts on and all bits processed */
>> +                break;
>> +            }
> 
> Maybe add a comment: "rescan after re-enabling interrupts"

OK

> 
>> +            found = false;
>> +            si = 0;
>> +            continue;
>> +        }
>> +        found = true;
>> +        aen_host_forward(si);
>> +    }
>> +
>> +    spin_unlock_irqrestore(&aift->gait_lock, flags);
>> +}
>> +
>>   static void gib_alert_irq_handler(struct airq_struct *airq,
>>                     struct tpi_info *tpi_info)
>>   {
>> +    struct tpi_adapter_info *info = (struct tpi_adapter_info *)tpi_info;
>> +
>>       inc_irq_stat(IRQIO_GAL);
>> -    process_gib_alert_list();
>> +
>> +    if (IS_ENABLED(CONFIG_PCI) && (info->forward || info->error)) {
>> +        aen_process_gait(info->isc);
>> +        if (info->aism != 0)
>> +            process_gib_alert_list();
>> +    } else
>> +        process_gib_alert_list();
> 
> NIT: I think we need braces around this statement

OK

> 
>>   }
>>   static struct airq_struct gib_alert_irq = {
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 01dc3f6883d0..ab8b56deed11 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -65,7 +65,8 @@ const struct _kvm_stats_desc kvm_vm_stats_desc[] = {
>>       STATS_DESC_COUNTER(VM, inject_float_mchk),
>>       STATS_DESC_COUNTER(VM, inject_pfault_done),
>>       STATS_DESC_COUNTER(VM, inject_service_signal),
>> -    STATS_DESC_COUNTER(VM, inject_virtio)
>> +    STATS_DESC_COUNTER(VM, inject_virtio),
>> +    STATS_DESC_COUNTER(VM, aen_forward)
>>   };
>>   const struct kvm_stats_header kvm_vm_stats_header = {
>> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
>> index b2000ed7b8c3..387b637863c9 100644
>> --- a/arch/s390/kvm/pci.h
>> +++ b/arch/s390/kvm/pci.h
>> @@ -12,6 +12,7 @@
>>   #include <linux/pci.h>
>>   #include <linux/mutex.h>
>> +#include <linux/kvm_host.h>
>>   #include <asm/airq.h>
>>   #include <asm/kvm_pci.h>
>> @@ -34,6 +35,14 @@ struct zpci_aift {
>>   extern struct zpci_aift *aift;
>> +static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
>> +                         unsigned long si)
>> +{
>> +    if (!IS_ENABLED(CONFIG_PCI) || aift->kzdev == 0 || aift->kzdev[si] == 0)
> 
> Wouldn't CONFIG_VFIO_PCI be better here?

While it's true that we can't be doing interpretation without 
CONFIG_VFIO_PCI=y|m, the reason I'm using CONFIG_PCI here and elsewhere 
in the code is because CONFIG_PCI is what is being used to determine 
whether or not we build arch/s390/kvm/pci.o in patch 14 (and thus 
whether or not the aift exists) -- And the reason we use this is because 
this is where the code dependencies exist (examples include 
ZPCI_NR_DEVICES, the AEN pieces that must be preserved over KVM module 
remove/insert in patch 15)

If we for some reason have a case where CONFIG_KVM=y|m && CONFIG_PCI=y|m 
&& CONFIG_VFIO_PCI=n, this will still work:  aift and aift->kzdev will 
exist (kvm/pci.o is linked) but we will never actually drive this 
routine anyway because we'll never register a device for AEN forwarding 
without CONFIG_VFIO_PCI.
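
For the scan loop in aen_process_gait() discussed earlier in this mail, a minimal userspace model may make the "rescan after re-enabling interrupts" step easier to follow. This is only a sketch under assumptions: the bool array stands in for the summary bit vector, the scan helper clears the bit it returns (as airq_iv_scan() does, to my knowledge), and late_bit is a made-up hook that simulates an indicator arriving while interrupts are being re-enabled. All sketch_ names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NBITS 8
#define SKETCH_NONE ((size_t)-1)

struct sketch_aift {
	bool sbv[SKETCH_NBITS];	/* stand-in for the summary bit vector */
	size_t late_bit;	/* bit set during the first re-enable, or SKETCH_NONE */
	int forwarded;		/* number of indicators forwarded */
	int reenabled;		/* number of times interrupts were re-enabled */
};

/* Find the next set bit at or after start, clear it and return its index. */
static size_t sketch_scan(struct sketch_aift *a, size_t start)
{
	for (size_t bit = start; bit < SKETCH_NBITS; bit++) {
		if (a->sbv[bit]) {
			a->sbv[bit] = false;
			return bit;
		}
	}
	return SKETCH_NONE;
}

/* Re-enable interrupts; optionally simulate an indicator racing in. */
static void sketch_reenable(struct sketch_aift *a)
{
	a->reenabled++;
	if (a->late_bit != SKETCH_NONE) {
		a->sbv[a->late_bit] = true;
		a->late_bit = SKETCH_NONE;
	}
}

static void sketch_process(struct sketch_aift *a)
{
	bool found = false, first = true;
	size_t si = 0;

	for (;;) {
		si = sketch_scan(a, si);
		if (si == SKETCH_NONE) {
			if (!first && !found)
				break;	/* interrupts on and all bits processed */
			sketch_reenable(a);
			/* rescan after re-enabling interrupts */
			first = found = false;
			si = 0;
			continue;
		}
		found = true;
		a->forwarded++;	/* stands in for aen_host_forward(si) */
	}
}
```

The loop only terminates after a full pass that finds nothing while interrupts are already on, so an indicator that shows up during the re-enable is picked up by the extra pass instead of being lost.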





* Re: [PATCH v2 22/30] KVM: s390: intercept the rpcit instruction
  2022-01-18 11:05   ` Pierre Morel
@ 2022-01-18 17:27     ` Matthew Rosato
  2022-01-18 17:54       ` Pierre Morel
  0 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-18 17:27 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/18/22 6:05 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> For faster handling of PCI translation refreshes, intercept in KVM
>> and call the associated handler.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   arch/s390/kvm/priv.c | 46 ++++++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 46 insertions(+)
>>
>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>> index 417154b314a6..5b65c1830de2 100644
>> --- a/arch/s390/kvm/priv.c
>> +++ b/arch/s390/kvm/priv.c
>> @@ -29,6 +29,7 @@
>>   #include <asm/ap.h>
>>   #include "gaccess.h"
>>   #include "kvm-s390.h"
>> +#include "pci.h"
>>   #include "trace.h"
>>   static int handle_ri(struct kvm_vcpu *vcpu)
>> @@ -335,6 +336,49 @@ static int handle_rrbe(struct kvm_vcpu *vcpu)
>>       return 0;
>>   }
>> +static int handle_rpcit(struct kvm_vcpu *vcpu)
>> +{
>> +    int reg1, reg2;
>> +    u8 status;
>> +    int rc;
>> +
>> +    if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>> +        return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>> +
>> +    /* If the host doesn't support PCI, it must be an emulated device */
>> +    if (!IS_ENABLED(CONFIG_PCI))
>> +        return -EOPNOTSUPP;
> 
> AFAIU this also makes sure that the following code is not compiled in
> case PCI is not supported.
> 
> I am not very familiar with compilation options; is this true with all our
> compilers and options?
> Or do we have to specify a compiler version?
> 
> Another concern: shouldn't we use IS_ENABLED(CONFIG_VFIO_PCI)?

Same idea as in the other thread -- What we are trying to protect 
against here is referencing symbols that won't be linked (like 
zpci_refresh_trans, or the aift->mdd a few lines below)
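
On the compiler question, a rough userspace sketch of the mechanism may help. As far as I understand it, IS_ENABLED() (include/linux/kconfig.h) expands to a constant 1 or 0 during preprocessing, the compiler still parses the guarded code (so declarations must be visible), and the optimizer then drops the unreachable part so that no reference to the unlinked symbols is emitted. The macro below is a simplified stand-in that only handles "=y" style options; the SKETCH_/sketch_ names and the CONFIG_SKETCH_* options are made up for illustration.

```c
#include <errno.h>

/* Simplified stand-in for the kernel's IS_ENABLED(); "=y" options only. */
#define SKETCH_ARG_PLACEHOLDER_1 0,
#define sketch_take_second_arg(ignored, val, ...) val
#define sketch_is_defined(x) sketch_is_defined_1(x)
#define sketch_is_defined_1(val) \
	sketch_is_defined_2(SKETCH_ARG_PLACEHOLDER_##val)
#define sketch_is_defined_2(arg1_or_junk) \
	sketch_take_second_arg(arg1_or_junk 1, 0, 0)
#define SKETCH_IS_ENABLED(option) sketch_is_defined(option)

/* Hypothetical options: the first is "=y", the second is not set. */
#define CONFIG_SKETCH_PCI 1
/* CONFIG_SKETCH_VFIO_PCI intentionally left undefined */

static int sketch_handle_rpcit(void)
{
	/* Expands to "if (!1)": the early return is dead code. */
	if (!SKETCH_IS_ENABLED(CONFIG_SKETCH_PCI))
		return -EOPNOTSUPP;
	return 0;
}

static int sketch_handle_vfio_only(void)
{
	/* Expands to "if (!0)": everything after the return is dead code. */
	if (!SKETCH_IS_ENABLED(CONFIG_SKETCH_VFIO_PCI))
		return -EOPNOTSUPP;
	return 0;
}
```

Unlike an #ifdef block, both branches are still compiled and type-checked in either configuration; only the emitted code differs. The sketch defines everything it references, so it does not itself depend on dead-code elimination.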

It is indeed true that we should never need to handle the rpcit 
intercept in KVM if CONFIG_VFIO_PCI=n -- but the necessary symbols/code 
are linked at least, so we can just let the SHM logic sort this out. 
When CONFIG_PCI=y|m, arch/s390/kvm/pci.o will be linked and so we can 
compare the function handle against aift->mdd (check to see if the 
device is emulated) and use this to determine whether or not to 
immediately send to userspace -- And if CONFIG_VFIO_PCI=n, a SHM bit 
will always be on and so we'll always go to userspace via this check.
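
The two bit manipulations in handle_rpcit() can be sketched in isolation as follows. This is a userspace model under assumptions: the function handle is taken from the upper 32 bits of the first register and the status byte is placed at bits 24-31 from the right, as in the quoted patch, while the mask value SKETCH_MDD is invented purely for the example and is not the real model-dependent data.

```c
#include <stdbool.h>
#include <stdint.h>

/* Made-up mask for illustration; the real value comes from zpci_get_mdd(). */
#define SKETCH_MDD 0x40000000u

/* True if any SHM bit is set in the function handle held in gpr1. */
static bool sketch_fh_is_emulated(uint64_t gpr1, uint32_t mdd)
{
	uint32_t fh = (uint32_t)(gpr1 >> 32);

	return (fh & mdd) != 0;
}

/* Replace the status byte in gpr1, leaving all other bits untouched. */
static uint64_t sketch_set_status(uint64_t gpr1, uint8_t status)
{
	gpr1 &= 0xffffffff00ffffffULL;
	gpr1 |= (uint64_t)status << 24;
	return gpr1;
}
```

In the patch, the first check decides whether the intercept is sent to userspace (-EOPNOTSUPP), and the second is used on the error path before the condition code is set.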

> 
> 
> 
>> +
>> +    kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
>> +
>> +    /* If the device has a SHM bit on, let userspace take care of this */
>> +    if (((vcpu->run->s.regs.gprs[reg1] >> 32) & aift->mdd) != 0)
>> +        return -EOPNOTSUPP;
>> +
>> +    rc = kvm_s390_pci_refresh_trans(vcpu, vcpu->run->s.regs.gprs[reg1],
>> +                    vcpu->run->s.regs.gprs[reg2],
>> +                    vcpu->run->s.regs.gprs[reg2+1],
>> +                    &status);
>> +
>> +    switch (rc) {
>> +    case 0:
>> +        kvm_s390_set_psw_cc(vcpu, 0);
>> +        break;
>> +    case -EOPNOTSUPP:
>> +        return -EOPNOTSUPP;
>> +    default:
>> +        vcpu->run->s.regs.gprs[reg1] &= 0xffffffff00ffffffUL;
>> +        vcpu->run->s.regs.gprs[reg1] |= (u64) status << 24;
>> +        if (status != 0)
>> +            kvm_s390_set_psw_cc(vcpu, 1);
>> +        else
>> +            kvm_s390_set_psw_cc(vcpu, 3);
>> +        break;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>>   #define SSKE_NQ 0x8
>>   #define SSKE_MR 0x4
>>   #define SSKE_MC 0x2
>> @@ -1275,6 +1319,8 @@ int kvm_s390_handle_b9(struct kvm_vcpu *vcpu)
>>           return handle_essa(vcpu);
>>       case 0xaf:
>>           return handle_pfmf(vcpu);
>> +    case 0xd3:
>> +        return handle_rpcit(vcpu);
>>       default:
>>           return -EOPNOTSUPP;
>>       }
>>
> 



* Re: [PATCH v2 14/30] KVM: s390: pci: add basic kvm_zdev structure
  2022-01-17 16:25   ` Pierre Morel
@ 2022-01-18 17:32     ` Pierre Morel
  2022-01-18 18:39       ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-18 17:32 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/17/22 17:25, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> This structure will be used to carry kvm passthrough information related to
>> zPCI devices.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/kvm_pci.h | 29 +++++++++++++++++++++
>>   arch/s390/include/asm/pci.h     |  3 +++
>>   arch/s390/kvm/Makefile          |  2 +-
>>   arch/s390/kvm/pci.c             | 46 +++++++++++++++++++++++++++++++++
>>   4 files changed, 79 insertions(+), 1 deletion(-)
>>   create mode 100644 arch/s390/include/asm/kvm_pci.h
>>   create mode 100644 arch/s390/kvm/pci.c
>>
>> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
>> new file mode 100644
>> index 000000000000..aafee2976929
>> --- /dev/null
>> +++ b/arch/s390/include/asm/kvm_pci.h
>> @@ -0,0 +1,29 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * KVM PCI Passthrough for virtual machines on s390
>> + *
>> + * Copyright IBM Corp. 2021
>> + *
>> + *    Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
>> + */
>> +
>> +
> 
> One blank line too much.
> 
> Otherwise, look good to me.
> 
> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
> 
>> +#ifndef ASM_KVM_PCI_H
>> +#define ASM_KVM_PCI_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/kvm_types.h>
>> +#include <linux/kvm_host.h>
>> +#include <linux/kvm.h>
>> +#include <linux/pci.h>
>> +
>> +struct kvm_zdev {
>> +    struct zpci_dev *zdev;
>> +    struct kvm *kvm;
>> +};
>> +
>> +int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
>> +void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
>> +void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
>> +
>> +#endif /* ASM_KVM_PCI_H */
>> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
>> index f3cd2da8128c..9b6c657d8d31 100644
>> --- a/arch/s390/include/asm/pci.h
>> +++ b/arch/s390/include/asm/pci.h
>> @@ -97,6 +97,7 @@ struct zpci_bar_struct {
>>   };
>>   struct s390_domain;
>> +struct kvm_zdev;
>>   #define ZPCI_FUNCTIONS_PER_BUS 256
>>   struct zpci_bus {
>> @@ -190,6 +191,8 @@ struct zpci_dev {
>>       struct dentry    *debugfs_dev;
>>       struct s390_domain *s390_domain; /* s390 IOMMU domain data */
>> +
>> +    struct kvm_zdev *kzdev; /* passthrough data */
>>   };
>>   static inline bool zdev_enabled(struct zpci_dev *zdev)
>> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
>> index b3aaadc60ead..a26f4fe7b680 100644
>> --- a/arch/s390/kvm/Makefile
>> +++ b/arch/s390/kvm/Makefile
>> @@ -11,5 +11,5 @@ ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>>   kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o priv.o sigp.o
>>   kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
>> -
>> +kvm-$(CONFIG_PCI) += pci.o
>>   obj-$(CONFIG_KVM) += kvm.o
>> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
>> new file mode 100644
>> index 000000000000..1c33bc7bf2bd
>> --- /dev/null
>> +++ b/arch/s390/kvm/pci.c
>> @@ -0,0 +1,46 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * s390 kvm PCI passthrough support
>> + *
>> + * Copyright IBM Corp. 2021
>> + *
>> + *    Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
>> + */
>> +
>> +#include <linux/kvm_host.h>
>> +#include <linux/pci.h>
>> +#include <asm/kvm_pci.h>
>> +
>> +int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
>> +{
>> +    struct kvm_zdev *kzdev;
>> +
>> +    kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
>> +    if (!kzdev)
>> +        return -ENOMEM;
>> +
>> +    kzdev->zdev = zdev;
>> +    zdev->kzdev = kzdev;
>> +
>> +    return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
>> +
>> +void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
>> +{
>> +    struct kvm_zdev *kzdev;
>> +
>> +    kzdev = zdev->kzdev;
>> +    WARN_ON(kzdev->zdev != zdev);
>> +    zdev->kzdev = 0;
>> +    kfree(kzdev);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
>> +
>> +void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
>> +{
>> +    struct kvm_zdev *kzdev = zdev->kzdev;
>> +
>> +    kzdev->kvm = kvm;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
>>
> 

Working now on patch 24, I am not sure that kvm_s390_pci_attach_kvm() is
necessary. Its only purpose seems to be setting kzdev->kvm = kvm, while we
already know kzdev in the caller.
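
A small userspace sketch of what I mean, assuming the notifier in patch 24 recovers kzdev from the embedded notifier block via container_of() as its hunk suggests. The sketch_ types and the simplified container_of macro are stand-ins, not the kernel definitions.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct sketch_kvm {
	int id;
};

struct sketch_notifier_block {
	int priority;
};

struct sketch_kvm_zdev {
	struct sketch_kvm *kvm;
	struct sketch_notifier_block nb;
};

/* The notifier already has kzdev in hand, so it can set the field directly. */
static void sketch_group_notifier(struct sketch_notifier_block *nb,
				  struct sketch_kvm *kvm)
{
	struct sketch_kvm_zdev *kzdev =
		sketch_container_of(nb, struct sketch_kvm_zdev, nb);

	kzdev->kvm = kvm;
}
```

Going through zdev and back to zdev->kzdev just to perform this one assignment looks like an unneeded detour, unless the helper is expected to grow later.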



-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 23/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV
  2022-01-18 17:20   ` Pierre Morel
@ 2022-01-18 17:32     ` Matthew Rosato
  2022-01-18 17:45       ` Pierre Morel
  0 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-18 17:32 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/18/22 12:20 PM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> This was previously removed as unnecessary; while that was true, subsequent
>> changes will make KVM an additional required component for vfio-pci-zdev.
>> Let's re-introduce CONFIG_VFIO_PCI_ZDEV as now there is actually a reason
>> to say 'n' for it (when not planning to CONFIG_KVM).
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   drivers/vfio/pci/Kconfig      | 11 +++++++++++
>>   drivers/vfio/pci/Makefile     |  2 +-
>>   include/linux/vfio_pci_core.h |  2 +-
>>   3 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
>> index 860424ccda1b..fedd1d4cb592 100644
>> --- a/drivers/vfio/pci/Kconfig
>> +++ b/drivers/vfio/pci/Kconfig
>> @@ -42,5 +42,16 @@ config VFIO_PCI_IGD
>>         and LPC bridge config space.
>>         To enable Intel IGD assignment through vfio-pci, say Y.
>> +
>> +config VFIO_PCI_ZDEV
>> +    bool "VFIO PCI extensions for s390x KVM passthrough"
>> +    depends on S390 && KVM
>> +    default y
>> +    help
>> +      Support s390x-specific extensions to enable support for enhancements
>> +      to KVM passthrough capabilities, such as interpretive execution of
>> +      zPCI instructions.
>> +
>> +      To enable s390x KVM vfio-pci extensions, say Y.
> 
> In several patches we check on CONFIG_PCI (14, 15, 16, 17 and 22), but we
> may have PCI without VFIO_PCI; wouldn't that be a problem?
> 
> Here we define a new CONFIG entry and I have two questions:
> 
> 1- There is no dependency on VFIO_PCI, although the functionality is
> obviously based on VFIO_PCI.

It's not obvious from this diff, but this 'config VFIO_PCI_ZDEV' 
statement is within an 'if VFIO_PCI' statement, just like VFIO_PCI_IGD 
above -- so the dependency is there.

> 
> 2- Wouldn't it be possible to use this item as the single condition for
> the different checks we need throughout the new VFIO interpretation
> functionality?

Possibly, but 1) we'd have to make linking arch/s390/kvm/pci.o dependent 
on CONFIG_VFIO_PCI instead of CONFIG_PCI in patch 14 and 2) if the 
relationship between CONFIG_VFIO_PCI and CONFIG_PCI were to ever change 
(though I don't see why it would..), we would be broken because the 
symbols we are referencing really require CONFIG_PCI (as they are 
located in s390 PCI).



* Re: [PATCH v2 24/30] vfio-pci/zdev: wire up group notifier
  2022-01-14 20:31 ` [PATCH v2 24/30] vfio-pci/zdev: wire up group notifier Matthew Rosato
@ 2022-01-18 17:34   ` Pierre Morel
  2022-01-18 18:37     ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-18 17:34 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> KVM zPCI passthrough device logic will need a reference to the associated
> kvm guest that has access to the device.  Let's register a group notifier
> for VFIO_GROUP_NOTIFY_SET_KVM to catch this information in order to create
> an association between a kvm guest and the host zdev.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_pci.h  |  2 ++
>   drivers/vfio/pci/vfio_pci_core.c |  2 ++
>   drivers/vfio/pci/vfio_pci_zdev.c | 46 ++++++++++++++++++++++++++++++++
>   include/linux/vfio_pci_core.h    | 10 +++++++
>   4 files changed, 60 insertions(+)
> 
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index fa90729a35cf..97a90b37c87d 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -17,6 +17,7 @@
>   #include <linux/kvm.h>
>   #include <linux/pci.h>
>   #include <linux/mutex.h>
> +#include <linux/notifier.h>
>   #include <asm/pci_insn.h>
>   #include <asm/pci_dma.h>
>   
> @@ -33,6 +34,7 @@ struct kvm_zdev {
>   	u64 rpcit_count;
>   	struct kvm_zdev_ioat ioat;
>   	struct zpci_fib fib;
> +	struct notifier_block nb;
>   };
>   
>   int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index f948e6cd2993..fc57d4d0abbe 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -452,6 +452,7 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
>   
>   	vfio_pci_vf_token_user_add(vdev, -1);
>   	vfio_spapr_pci_eeh_release(vdev->pdev);
> +	vfio_pci_zdev_release(vdev);
>   	vfio_pci_core_disable(vdev);
>   
>   	mutex_lock(&vdev->igate);
> @@ -470,6 +471,7 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_close_device);
>   void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev)
>   {
>   	vfio_pci_probe_mmaps(vdev);
> +	vfio_pci_zdev_open(vdev);
>   	vfio_spapr_pci_eeh_open(vdev->pdev);
>   	vfio_pci_vf_token_user_add(vdev, 1);
>   }
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index ea4c0d2b0663..5c2bddc57b39 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -13,6 +13,7 @@
>   #include <linux/vfio_zdev.h>
>   #include <asm/pci_clp.h>
>   #include <asm/pci_io.h>
> +#include <asm/kvm_pci.h>
>   
>   #include <linux/vfio_pci_core.h>
>   
> @@ -136,3 +137,48 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
>   
>   	return ret;
>   }
> +
> +static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
> +					unsigned long action, void *data)
> +{
> +	struct kvm_zdev *kzdev = container_of(nb, struct kvm_zdev, nb);
> +
> +	if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
> +		if (!data || !kzdev->zdev)
> +			return NOTIFY_DONE;
> +		kvm_s390_pci_attach_kvm(kzdev->zdev, data);

Why not just set kzdev->kvm = data ?

alternatively, define kvm_s390_pci_attach_kvm() as an inline instead of 
a global function.

otherwise LGTM

> +	}
> +
> +	return NOTIFY_OK;
> +}
> +
> +void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
> +{
> +	unsigned long events = VFIO_GROUP_NOTIFY_SET_KVM;
> +	struct zpci_dev *zdev = to_zpci(vdev->pdev);
> +
> +	if (!zdev)
> +		return;
> +
> +	if (kvm_s390_pci_dev_open(zdev))
> +		return;
> +
> +	zdev->kzdev->nb.notifier_call = vfio_pci_zdev_group_notifier;
> +
> +	if (vfio_register_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
> +				   &events, &zdev->kzdev->nb))
> +		kvm_s390_pci_dev_release(zdev);
> +}
> +
> +void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
> +{
> +	struct zpci_dev *zdev = to_zpci(vdev->pdev);
> +
> +	if (!zdev || !zdev->kzdev)
> +		return;
> +
> +	vfio_unregister_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
> +				 &zdev->kzdev->nb);
> +
> +	kvm_s390_pci_dev_release(zdev);
> +}
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 5e2bca3b89db..05287f8ac855 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -198,12 +198,22 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
>   #ifdef CONFIG_VFIO_PCI_ZDEV
>   extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
>   				       struct vfio_info_cap *caps);
> +void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
> +void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
>   #else
>   static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
>   					      struct vfio_info_cap *caps)
>   {
>   	return -ENODEV;
>   }
> +
> +static inline void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
> +{
> +}
> +
> +static inline void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
> +{
> +}
>   #endif
>   
>   /* Will be exported for vfio pci drivers usage */
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 23/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV
  2022-01-18 17:32     ` Matthew Rosato
@ 2022-01-18 17:45       ` Pierre Morel
  2022-01-18 18:05         ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-18 17:45 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/18/22 18:32, Matthew Rosato wrote:
> On 1/18/22 12:20 PM, Pierre Morel wrote:
>>
>>
>> On 1/14/22 21:31, Matthew Rosato wrote:
>>> This was previously removed as unnecessary; while that was true, 
>>> subsequent
>>> changes will make KVM an additional required component for 
>>> vfio-pci-zdev.
>>> Let's re-introduce CONFIG_VFIO_PCI_ZDEV as now there is actually a 
>>> reason
>>> to say 'n' for it (when not planning to CONFIG_KVM).
>>>
>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>> ---
>>>   drivers/vfio/pci/Kconfig      | 11 +++++++++++
>>>   drivers/vfio/pci/Makefile     |  2 +-
>>>   include/linux/vfio_pci_core.h |  2 +-
>>>   3 files changed, 13 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
>>> index 860424ccda1b..fedd1d4cb592 100644
>>> --- a/drivers/vfio/pci/Kconfig
>>> +++ b/drivers/vfio/pci/Kconfig
>>> @@ -42,5 +42,16 @@ config VFIO_PCI_IGD
>>>         and LPC bridge config space.
>>>         To enable Intel IGD assignment through vfio-pci, say Y.
>>> +
>>> +config VFIO_PCI_ZDEV
>>> +    bool "VFIO PCI extensions for s390x KVM passthrough"
>>> +    depends on S390 && KVM
>>> +    default y
>>> +    help
>>> +      Support s390x-specific extensions to enable support for 
>>> enhancements
>>> +      to KVM passthrough capabilities, such as interpretive 
>>> execution of
>>> +      zPCI instructions.
>>> +
>>> +      To enable s390x KVM vfio-pci extensions, say Y.
>>
>> In several patches we check on CONFIG_PCI (14,15,16,17 and 22) but we 
>> may have PCI without VFIO_PCI, wouldn't it be a problem?
>>
>> Here we define a new CONFIG entry and I have two questions:
>>
>> 1- there is no dependency on VFIO_PCI while the functionality is 
>> obviously based on VFIO_PCI
> 
> It's not obvious from this diff, but this 'config VFIO_PCI_ZDEV' 
> statement is within an 'if VFIO_PCI' statement, just like VFIO_PCI_IGD 
> above -- so the dependency is there.

sorry, I remember now you already answered this to Christian last time.

> 
>>
>> 2- Wouldn't it be possible to use this item and the single condition 
>> for the different checks we need through the new VFIO interpretation 
>> functionality.
> 
> Possibly, but 1) we'd have to make linking arch/s390/kvm/pci.o dependent 
> on CONFIG_VFIO_PCI instead of CONFIG_PCI in patch 14 and 2) if the 
> relationship between CONFIG_VFIO_PCI and CONFIG_PCI were to ever change 
> (though I don't see why it would..), we would be broken because the 
> symbols we are referencing really require CONFIG_PCI (as they are 
> located in s390 PCI).
> 

Yes but VFIO_PCI_ZDEV depends on KVM, PCI and on VFIO_PCI
Wouldn't a single config item for this new code be easier to manage and 
understand?

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 22/30] KVM: s390: intercept the rpcit instruction
  2022-01-18 17:27     ` Matthew Rosato
@ 2022-01-18 17:54       ` Pierre Morel
  0 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-18 17:54 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/18/22 18:27, Matthew Rosato wrote:
> On 1/18/22 6:05 AM, Pierre Morel wrote:
>>
>>
>> On 1/14/22 21:31, Matthew Rosato wrote:
>>> For faster handling of PCI translation refreshes, intercept in KVM
>>> and call the associated handler.
>>>
>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>> ---
>>>   arch/s390/kvm/priv.c | 46 ++++++++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 46 insertions(+)
>>>
>>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>>> index 417154b314a6..5b65c1830de2 100644
>>> --- a/arch/s390/kvm/priv.c
>>> +++ b/arch/s390/kvm/priv.c
>>> @@ -29,6 +29,7 @@
>>>   #include <asm/ap.h>
>>>   #include "gaccess.h"
>>>   #include "kvm-s390.h"
>>> +#include "pci.h"
>>>   #include "trace.h"
>>>   static int handle_ri(struct kvm_vcpu *vcpu)
>>> @@ -335,6 +336,49 @@ static int handle_rrbe(struct kvm_vcpu *vcpu)
>>>       return 0;
>>>   }
>>> +static int handle_rpcit(struct kvm_vcpu *vcpu)
>>> +{
>>> +    int reg1, reg2;
>>> +    u8 status;
>>> +    int rc;
>>> +
>>> +    if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
>>> +        return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
>>> +
>>> +    /* If the host doesn't support PCI, it must be an emulated 
>>> device */
>>> +    if (!IS_ENABLED(CONFIG_PCI))
>>> +        return -EOPNOTSUPP;
>>
>> AFAIU this makes also sure that the following code is not compiled in 
>> case PCI is not supported.
>>
>> I am not very used to compilation options, is it true with all our 
>> compilers and options?
>> Or do we have to specify a compiler version?
>>
>> Another concern is, shouldn't we use IS_ENABLED(CONFIG_VFIO_PCI) ?
> 
> Same idea as in the other thread -- What we are trying to protect 
> against here is referencing symbols that won't be linked (like 
> zpci_refresh_trans, or the aift->mdd a few lines below)
> 
> It is indeed true that we should never need to handle the rpcit 
> intercept in KVM if CONFIG_VFIO_PCI=n -- but the necessary symbols/code 
> are linked at least, so we can just let the SHM logic sort this out. 
> When CONFIG_PCI=y|m, arch/s390/kvm/pci.o will be linked and so we can 
> compare the function handle against aift->mdd (check to see if the 
> device is emulated) and use this to determine whether or not to 
> immediately send to userspace -- And if CONFIG_VFIO_PCI=n, a SHM bit 
> will always be on and so we'll always go to userspace via this check.

So we agree.
But as I said elsewhere, I wonder whether CONFIG_VFIO_PCI_ZDEV would 
not be even better here.
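
For reference, a minimal userspace sketch of how such a guard evaluates. 
It re-creates the preprocessor trick that IS_ENABLED() is built on, 
simplified to built-in options only (the =m case is left out). 
DEMO_IS_ENABLED, CONFIG_DEMO_A, CONFIG_DEMO_B and handle_demo are 
hypothetical names for this illustration, not kernel symbols.

```c
#include <errno.h>

/*
 * Simplified stand-in for the kernel's IS_ENABLED(): expands to the
 * constant 1 when the option is #defined to 1, and to 0 otherwise.
 * Assumption: only built-in (=y) options are modelled.
 */
#define DEMO_ARG_PLACEHOLDER_1 0,
#define DEMO_TAKE_SECOND_ARG(ignored, val, ...) val
#define DEMO_IS_DEFINED3(arg1_or_junk) \
	DEMO_TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define DEMO_IS_DEFINED2(val) DEMO_IS_DEFINED3(DEMO_ARG_PLACEHOLDER_##val)
#define DEMO_IS_DEFINED(x) DEMO_IS_DEFINED2(x)
#define DEMO_IS_ENABLED(option) DEMO_IS_DEFINED(option)

/* Hypothetical configuration: option A is set, option B is not. */
#define CONFIG_DEMO_A 1
/* CONFIG_DEMO_B is intentionally left undefined. */

int demo_a_enabled(void)
{
	return DEMO_IS_ENABLED(CONFIG_DEMO_A);	/* expands to 1 */
}

int demo_b_enabled(void)
{
	return DEMO_IS_ENABLED(CONFIG_DEMO_B);	/* expands to 0 */
}

/* Same shape as the early return in handle_rpcit(). */
int handle_demo(void)
{
	/* Constant condition: everything below is dead when B is off. */
	if (!DEMO_IS_ENABLED(CONFIG_DEMO_B))
		return -EOPNOTSUPP;

	return 0;
}
```

Because the condition is a compile-time constant, the body after the 
check is dead code when the option is off. As far as I understand, 
the kernel relies on the optimizer discarding such branches so that 
symbols referenced only there do not need to be linked.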

> 
>>
>>
>>
>>> +
>>> +    kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
>>> +
>>> +    /* If the device has a SHM bit on, let userspace take care of 
>>> this */
>>> +    if (((vcpu->run->s.regs.gprs[reg1] >> 32) & aift->mdd) != 0)
>>> +        return -EOPNOTSUPP;
>>> +
>>> +    rc = kvm_s390_pci_refresh_trans(vcpu, vcpu->run->s.regs.gprs[reg1],
>>> +                    vcpu->run->s.regs.gprs[reg2],
>>> +                    vcpu->run->s.regs.gprs[reg2+1],
>>> +                    &status);
>>> +
>>> +    switch (rc) {
>>> +    case 0:
>>> +        kvm_s390_set_psw_cc(vcpu, 0);
>>> +        break;
>>> +    case -EOPNOTSUPP:
>>> +        return -EOPNOTSUPP;
>>> +    default:
>>> +        vcpu->run->s.regs.gprs[reg1] &= 0xffffffff00ffffffUL;
>>> +        vcpu->run->s.regs.gprs[reg1] |= (u64) status << 24;
>>> +        if (status != 0)
>>> +            kvm_s390_set_psw_cc(vcpu, 1);
>>> +        else
>>> +            kvm_s390_set_psw_cc(vcpu, 3);
>>> +        break;
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>>   #define SSKE_NQ 0x8
>>>   #define SSKE_MR 0x4
>>>   #define SSKE_MC 0x2
>>> @@ -1275,6 +1319,8 @@ int kvm_s390_handle_b9(struct kvm_vcpu *vcpu)
>>>           return handle_essa(vcpu);
>>>       case 0xaf:
>>>           return handle_pfmf(vcpu);
>>> +    case 0xd3:
>>> +        return handle_rpcit(vcpu);
>>>       default:
>>>           return -EOPNOTSUPP;
>>>       }
>>>
>>
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 23/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV
  2022-01-18 17:45       ` Pierre Morel
@ 2022-01-18 18:05         ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-18 18:05 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/18/22 12:45 PM, Pierre Morel wrote:
> 
> 
> On 1/18/22 18:32, Matthew Rosato wrote:
>> On 1/18/22 12:20 PM, Pierre Morel wrote:
>>>
>>>
>>> On 1/14/22 21:31, Matthew Rosato wrote:
>>>> This was previously removed as unnecessary; while that was true, 
>>>> subsequent
>>>> changes will make KVM an additional required component for 
>>>> vfio-pci-zdev.
>>>> Let's re-introduce CONFIG_VFIO_PCI_ZDEV as now there is actually a 
>>>> reason
>>>> to say 'n' for it (when not planning to CONFIG_KVM).
>>>>
>>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>>> ---
>>>>   drivers/vfio/pci/Kconfig      | 11 +++++++++++
>>>>   drivers/vfio/pci/Makefile     |  2 +-
>>>>   include/linux/vfio_pci_core.h |  2 +-
>>>>   3 files changed, 13 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
>>>> index 860424ccda1b..fedd1d4cb592 100644
>>>> --- a/drivers/vfio/pci/Kconfig
>>>> +++ b/drivers/vfio/pci/Kconfig
>>>> @@ -42,5 +42,16 @@ config VFIO_PCI_IGD
>>>>         and LPC bridge config space.
>>>>         To enable Intel IGD assignment through vfio-pci, say Y.
>>>> +
>>>> +config VFIO_PCI_ZDEV
>>>> +    bool "VFIO PCI extensions for s390x KVM passthrough"
>>>> +    depends on S390 && KVM
>>>> +    default y
>>>> +    help
>>>> +      Support s390x-specific extensions to enable support for 
>>>> enhancements
>>>> +      to KVM passthrough capabilities, such as interpretive 
>>>> execution of
>>>> +      zPCI instructions.
>>>> +
>>>> +      To enable s390x KVM vfio-pci extensions, say Y.
>>>
>>> In several patches we check on CONFIG_PCI (14,15,16,17 and 22) but we 
>>> may have PCI without VFIO_PCI, wouldn't it be a problem?
>>>
>>> Here we define a new CONFIG entry and I have two questions:
>>>
>>> 1- there is no dependency on VFIO_PCI while the functionality is 
>>> obviously based on VFIO_PCI
>>
>> It's not obvious from this diff, but this 'config VFIO_PCI_ZDEV' 
>> statement is within an 'if VFIO_PCI' statement, just like VFIO_PCI_IGD 
>> above -- so the dependency is there.
> 
> sorry, I remember now you already answered this to Christian last time.
> 
>>
>>>
>>> 2- Wouldn't it be possible to use this item and the single condition 
>>> for the different checks we need through the new VFIO interpretation 
>>> functionality.
>>
>> Possibly, but 1) we'd have to make linking arch/s390/kvm/pci.o 
>> dependent on CONFIG_VFIO_PCI instead of CONFIG_PCI in patch 14 and 2) 
>> if the relationship between CONFIG_VFIO_PCI and CONFIG_PCI were to 
>> ever change (though I don't see why it would..), we would be broken 
>> because the symbols we are referencing really require CONFIG_PCI (as 
>> they are located in s390 PCI).
>>
> 
> Yes but VFIO_PCI_ZDEV depends on KVM, PCI and on VFIO_PCI
> Wouldn't a single config item for this new code be easier to manage and 
> understand?
> 

I guess my primary resistance is to abstracting/hiding the dependency. 
Yes, userspace will never set up zPCI interpretation without 
CONFIG_VFIO_PCI{_ZDEV}, but that's not where the compilation dependency 
is -- it's on CONFIG_PCI specifically.

But I guess on the other hand you could argue why even bother building 
pci.o into kvm without CONFIG_VFIO_PCI_ZDEV as it will never be used.

OK, I will have a look at making this change.  It will require a little 
reorganization, at least moving this patch up before patch 14.


* Re: [PATCH v2 24/30] vfio-pci/zdev: wire up group notifier
  2022-01-18 17:34   ` Pierre Morel
@ 2022-01-18 18:37     ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-18 18:37 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/18/22 12:34 PM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> KVM zPCI passthrough device logic will need a reference to the associated
>> kvm guest that has access to the device.  Let's register a group notifier
>> for VFIO_GROUP_NOTIFY_SET_KVM to catch this information in order to 
>> create
>> an association between a kvm guest and the host zdev.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/kvm_pci.h  |  2 ++
>>   drivers/vfio/pci/vfio_pci_core.c |  2 ++
>>   drivers/vfio/pci/vfio_pci_zdev.c | 46 ++++++++++++++++++++++++++++++++
>>   include/linux/vfio_pci_core.h    | 10 +++++++
>>   4 files changed, 60 insertions(+)
>>
>> diff --git a/arch/s390/include/asm/kvm_pci.h 
>> b/arch/s390/include/asm/kvm_pci.h
>> index fa90729a35cf..97a90b37c87d 100644
>> --- a/arch/s390/include/asm/kvm_pci.h
>> +++ b/arch/s390/include/asm/kvm_pci.h
>> @@ -17,6 +17,7 @@
>>   #include <linux/kvm.h>
>>   #include <linux/pci.h>
>>   #include <linux/mutex.h>
>> +#include <linux/notifier.h>
>>   #include <asm/pci_insn.h>
>>   #include <asm/pci_dma.h>
>> @@ -33,6 +34,7 @@ struct kvm_zdev {
>>       u64 rpcit_count;
>>       struct kvm_zdev_ioat ioat;
>>       struct zpci_fib fib;
>> +    struct notifier_block nb;
>>   };
>>   int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
>> diff --git a/drivers/vfio/pci/vfio_pci_core.c 
>> b/drivers/vfio/pci/vfio_pci_core.c
>> index f948e6cd2993..fc57d4d0abbe 100644
>> --- a/drivers/vfio/pci/vfio_pci_core.c
>> +++ b/drivers/vfio/pci/vfio_pci_core.c
>> @@ -452,6 +452,7 @@ void vfio_pci_core_close_device(struct vfio_device 
>> *core_vdev)
>>       vfio_pci_vf_token_user_add(vdev, -1);
>>       vfio_spapr_pci_eeh_release(vdev->pdev);
>> +    vfio_pci_zdev_release(vdev);
>>       vfio_pci_core_disable(vdev);
>>       mutex_lock(&vdev->igate);
>> @@ -470,6 +471,7 @@ EXPORT_SYMBOL_GPL(vfio_pci_core_close_device);
>>   void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev)
>>   {
>>       vfio_pci_probe_mmaps(vdev);
>> +    vfio_pci_zdev_open(vdev);
>>       vfio_spapr_pci_eeh_open(vdev->pdev);
>>       vfio_pci_vf_token_user_add(vdev, 1);
>>   }
>> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c 
>> b/drivers/vfio/pci/vfio_pci_zdev.c
>> index ea4c0d2b0663..5c2bddc57b39 100644
>> --- a/drivers/vfio/pci/vfio_pci_zdev.c
>> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
>> @@ -13,6 +13,7 @@
>>   #include <linux/vfio_zdev.h>
>>   #include <asm/pci_clp.h>
>>   #include <asm/pci_io.h>
>> +#include <asm/kvm_pci.h>
>>   #include <linux/vfio_pci_core.h>
>> @@ -136,3 +137,48 @@ int vfio_pci_info_zdev_add_caps(struct 
>> vfio_pci_core_device *vdev,
>>       return ret;
>>   }
>> +
>> +static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
>> +                    unsigned long action, void *data)
>> +{
>> +    struct kvm_zdev *kzdev = container_of(nb, struct kvm_zdev, nb);
>> +
>> +    if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
>> +        if (!data || !kzdev->zdev)
>> +            return NOTIFY_DONE;
>> +        kvm_s390_pci_attach_kvm(kzdev->zdev, data);
> 
> Why not just set kzdev->kvm = data ?
> 
> alternatively, define kvm_s390_pci_attach_kvm() as an inline instead of 
> a global function.
> 
> otherwise LGTM

At some point in the past this function did more than just set a 
pointer...  You are correct there's no need for this abstraction now, 
let's just set kzdev->kvm = data directly here and drop the 
kvm_s390_pci_attach_kvm function.
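
Roughly what the callback would look like with the direct assignment, 
as a userspace sketch only. The struct definitions and constants below 
are cut-down stand-ins so the snippet compiles on its own (their values 
are placeholders, not copied from kernel headers), and 
demo_group_notifier is a hypothetical name for 
vfio_pci_zdev_group_notifier.

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel types; only the fields used here. */
struct kvm { int id; };
struct zpci_dev { int fid; };

struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
};

struct kvm_zdev {
	struct zpci_dev *zdev;
	struct kvm *kvm;
	struct notifier_block nb;
};

/* Placeholder values for this sketch. */
#define NOTIFY_DONE			0
#define NOTIFY_OK			1
#define VFIO_GROUP_NOTIFY_SET_KVM	0

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

int demo_group_notifier(struct notifier_block *nb,
			unsigned long action, void *data)
{
	struct kvm_zdev *kzdev = container_of(nb, struct kvm_zdev, nb);

	if (action == VFIO_GROUP_NOTIFY_SET_KVM) {
		if (!data || !kzdev->zdev)
			return NOTIFY_DONE;
		/* Direct assignment replaces kvm_s390_pci_attach_kvm(). */
		kzdev->kvm = data;
	}

	return NOTIFY_OK;
}
```

The callback already has kzdev in hand via container_of(), so the 
helper only added an extra hop through zdev->kzdev.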



* Re: [PATCH v2 14/30] KVM: s390: pci: add basic kvm_zdev structure
  2022-01-18 17:32     ` Pierre Morel
@ 2022-01-18 18:39       ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-18 18:39 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/18/22 12:32 PM, Pierre Morel wrote:
> 
> 
> On 1/17/22 17:25, Pierre Morel wrote:
>>
>>
>> On 1/14/22 21:31, Matthew Rosato wrote:
>>> This structure will be used to carry kvm passthrough information 
>>> related to
>>> zPCI devices.
>>>
>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>> ---
>>>   arch/s390/include/asm/kvm_pci.h | 29 +++++++++++++++++++++
>>>   arch/s390/include/asm/pci.h     |  3 +++
>>>   arch/s390/kvm/Makefile          |  2 +-
>>>   arch/s390/kvm/pci.c             | 46 +++++++++++++++++++++++++++++++++
>>>   4 files changed, 79 insertions(+), 1 deletion(-)
>>>   create mode 100644 arch/s390/include/asm/kvm_pci.h
>>>   create mode 100644 arch/s390/kvm/pci.c
>>>
>>> diff --git a/arch/s390/include/asm/kvm_pci.h 
>>> b/arch/s390/include/asm/kvm_pci.h
>>> new file mode 100644
>>> index 000000000000..aafee2976929
>>> --- /dev/null
>>> +++ b/arch/s390/include/asm/kvm_pci.h
>>> @@ -0,0 +1,29 @@
>>> +/* SPDX-License-Identifier: GPL-2.0 */
>>> +/*
>>> + * KVM PCI Passthrough for virtual machines on s390
>>> + *
>>> + * Copyright IBM Corp. 2021
>>> + *
>>> + *    Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
>>> + */
>>> +
>>> +
>>
>> One blank line too much.
>>
>> Otherwise, look good to me.
>>
>> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
>>
>>> +#ifndef ASM_KVM_PCI_H
>>> +#define ASM_KVM_PCI_H
>>> +
>>> +#include <linux/types.h>
>>> +#include <linux/kvm_types.h>
>>> +#include <linux/kvm_host.h>
>>> +#include <linux/kvm.h>
>>> +#include <linux/pci.h>
>>> +
>>> +struct kvm_zdev {
>>> +    struct zpci_dev *zdev;
>>> +    struct kvm *kvm;
>>> +};
>>> +
>>> +int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
>>> +void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
>>> +void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
>>> +
>>> +#endif /* ASM_KVM_PCI_H */
>>> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
>>> index f3cd2da8128c..9b6c657d8d31 100644
>>> --- a/arch/s390/include/asm/pci.h
>>> +++ b/arch/s390/include/asm/pci.h
>>> @@ -97,6 +97,7 @@ struct zpci_bar_struct {
>>>   };
>>>   struct s390_domain;
>>> +struct kvm_zdev;
>>>   #define ZPCI_FUNCTIONS_PER_BUS 256
>>>   struct zpci_bus {
>>> @@ -190,6 +191,8 @@ struct zpci_dev {
>>>       struct dentry    *debugfs_dev;
>>>       struct s390_domain *s390_domain; /* s390 IOMMU domain data */
>>> +
>>> +    struct kvm_zdev *kzdev; /* passthrough data */
>>>   };
>>>   static inline bool zdev_enabled(struct zpci_dev *zdev)
>>> diff --git a/arch/s390/kvm/Makefile b/arch/s390/kvm/Makefile
>>> index b3aaadc60ead..a26f4fe7b680 100644
>>> --- a/arch/s390/kvm/Makefile
>>> +++ b/arch/s390/kvm/Makefile
>>> @@ -11,5 +11,5 @@ ccflags-y := -Ivirt/kvm -Iarch/s390/kvm
>>>   kvm-objs := $(common-objs) kvm-s390.o intercept.o interrupt.o 
>>> priv.o sigp.o
>>>   kvm-objs += diag.o gaccess.o guestdbg.o vsie.o pv.o
>>> -
>>> +kvm-$(CONFIG_PCI) += pci.o

As discussed in other threads, I will look at changing this to
kvm-$(CONFIG_VFIO_PCI_ZDEV) += pci.o
Along with other IS_ENABLED(CONFIG_PCI) -> 
IS_ENABLED(CONFIG_VFIO_PCI_ZDEV) changes

>>>   obj-$(CONFIG_KVM) += kvm.o
>>> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
>>> new file mode 100644
>>> index 000000000000..1c33bc7bf2bd
>>> --- /dev/null
>>> +++ b/arch/s390/kvm/pci.c
>>> @@ -0,0 +1,46 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/*
>>> + * s390 kvm PCI passthrough support
>>> + *
>>> + * Copyright IBM Corp. 2021
>>> + *
>>> + *    Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
>>> + */
>>> +
>>> +#include <linux/kvm_host.h>
>>> +#include <linux/pci.h>
>>> +#include <asm/kvm_pci.h>
>>> +
>>> +int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
>>> +{
>>> +    struct kvm_zdev *kzdev;
>>> +
>>> +    kzdev = kzalloc(sizeof(struct kvm_zdev), GFP_KERNEL);
>>> +    if (!kzdev)
>>> +        return -ENOMEM;
>>> +
>>> +    kzdev->zdev = zdev;
>>> +    zdev->kzdev = kzdev;
>>> +
>>> +    return 0;
>>> +}
>>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_open);
>>> +
>>> +void kvm_s390_pci_dev_release(struct zpci_dev *zdev)
>>> +{
>>> +    struct kvm_zdev *kzdev;
>>> +
>>> +    kzdev = zdev->kzdev;
>>> +    WARN_ON(kzdev->zdev != zdev);
>>> +    zdev->kzdev = 0;
>>> +    kfree(kzdev);
>>> +}
>>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_dev_release);
>>> +
>>> +void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm)
>>> +{
>>> +    struct kvm_zdev *kzdev = zdev->kzdev;
>>> +
>>> +    kzdev->kvm = kvm;
>>> +}
>>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
>>>
>>
> 
> Working now on patch 24, I am not sure that this function is necessary.
> the only purpose seems to set kzdev->kvm = kvm while we already know 
> kzdev in the caller.
> 

Yep, as mentioned in the patch 24 thread I will drop this function and 
set kzdev->kvm = kvm directly in patch 24.



* Re: [PATCH v2 06/30] s390/airq: allow for airq structure that uses an input vector
  2022-01-17 12:29   ` Claudio Imbrenda
@ 2022-01-18 18:52     ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-18 18:52 UTC (permalink / raw)
  To: Claudio Imbrenda
  Cc: linux-s390, alex.williamson, cohuck, schnelle, farman, pmorel,
	borntraeger, hca, gor, gerald.schaefer, agordeev, frankja, david,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/17/22 7:29 AM, Claudio Imbrenda wrote:
> On Fri, 14 Jan 2022 15:31:21 -0500
> Matthew Rosato <mjrosato@linux.ibm.com> wrote:
> 
>> When doing device passthrough where interrupts are being forwarded
>> from host to guest, we wish to use a pinned section of guest memory
>> as the vector (the same memory used by the guest as the vector).
> 
> maybe expand the description of the patch to explain what exactly is
> being done in this patch. Namely: you add a parameter to a function
> (and some logic in the function to use the new parameter), but the
> function is not being used yet. And pinning is also done somewhere else.
> 
> maybe you can add something like
> 
> 	This patch adds a new parameter for airq_iv_create to pass the
> 	existing vector pinned in guest memory and to use it when
> 	needed instead of allocating a new one.
> 
> Apart from that, the patch looks good.
> 

Thanks, will re-work to:

When doing device passthrough where interrupts are being forwarded from 
host to guest, we wish to use a pinned section of guest memory as the 
vector (the same memory used by the guest as the vector).
To accomplish this, add a new parameter for airq_iv_create which allows 
passing an existing vector to be used instead of allocating a new one. 
The caller is responsible for ensuring the vector is pinned in memory as 
well as for unpinning the memory when the vector is no longer needed.
A subsequent patch will use this new parameter for zPCI interpretation.
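
To illustrate the ownership rule described above, here is a small 
userspace sketch. It is not the airq code: demo_iv, demo_iv_create and 
demo_iv_release are hypothetical names, and calloc()/free() stand in for 
the kernel allocators. The point is only that a caller-supplied vector 
is borrowed and never freed by the release path.

```c
#include <limits.h>
#include <stdlib.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

struct demo_iv {
	unsigned long *vector;	/* bit vector, one bit per interrupt */
	unsigned long bits;	/* number of valid bits in the vector */
	int vector_is_borrowed;	/* non-zero if the caller supplied it */
};

/*
 * Create an interrupt vector with room for 'bits' bits.  If 'vec' is
 * non-NULL it is used as-is and stays owned by the caller (who must
 * keep it pinned and unpin it later); otherwise a zeroed vector is
 * allocated here.
 */
struct demo_iv *demo_iv_create(unsigned long bits, unsigned long *vec)
{
	size_t longs = (bits + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG;
	struct demo_iv *iv = calloc(1, sizeof(*iv));

	if (!iv)
		return NULL;

	iv->bits = bits;
	if (vec) {
		iv->vector = vec;
		iv->vector_is_borrowed = 1;
	} else {
		iv->vector = calloc(longs, sizeof(unsigned long));
		if (!iv->vector) {
			free(iv);
			return NULL;
		}
	}

	return iv;
}

/* Release the structure; only free a vector that was allocated here. */
void demo_iv_release(struct demo_iv *iv)
{
	if (!iv)
		return;
	if (!iv->vector_is_borrowed)
		free(iv->vector);
	free(iv);
}
```

Tracking ownership in the structure keeps the release path symmetric 
with creation regardless of which variant the caller chose.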


* Re: [PATCH v2 21/30] KVM: s390: pci: handle refresh of PCI translations
  2022-01-14 20:31 ` [PATCH v2 21/30] KVM: s390: pci: handle refresh of PCI translations Matthew Rosato
@ 2022-01-19  9:29   ` Pierre Morel
  2022-01-19 16:39     ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-19  9:29 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> Add a routine that will perform a shadow operation between a guest
> and host IOAT.  A subsequent patch will invoke this in response to
> an 04 RPCIT instruction intercept.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_pci.h |   1 +
>   arch/s390/include/asm/pci_dma.h |   1 +
>   arch/s390/kvm/pci.c             | 208 +++++++++++++++++++++++++++++++-
>   arch/s390/kvm/pci.h             |   8 +-
>   4 files changed, 216 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 770849f13a70..fa90729a35cf 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -30,6 +30,7 @@ struct kvm_zdev_ioat {
>   struct kvm_zdev {
>   	struct zpci_dev *zdev;
>   	struct kvm *kvm;
> +	u64 rpcit_count;
>   	struct kvm_zdev_ioat ioat;
>   	struct zpci_fib fib;
>   };
> diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
> index 69e616d0712c..38004e0a4383 100644
> --- a/arch/s390/include/asm/pci_dma.h
> +++ b/arch/s390/include/asm/pci_dma.h
> @@ -52,6 +52,7 @@ enum zpci_ioat_dtype {
>   #define ZPCI_TABLE_ENTRIES		(ZPCI_TABLE_SIZE / ZPCI_TABLE_ENTRY_SIZE)
>   #define ZPCI_TABLE_PAGES		(ZPCI_TABLE_SIZE >> PAGE_SHIFT)
>   #define ZPCI_TABLE_ENTRIES_PAGES	(ZPCI_TABLE_ENTRIES * ZPCI_TABLE_PAGES)
> +#define ZPCI_TABLE_ENTRIES_PER_PAGE	(ZPCI_TABLE_ENTRIES / ZPCI_TABLE_PAGES)
>   
>   #define ZPCI_TABLE_BITS			11
>   #define ZPCI_PT_BITS			8
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index 39c13c25a700..38d2b77ec565 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -149,6 +149,208 @@ int kvm_s390_pci_aen_init(u8 nisc)
>   	return rc;
>   }
>   
> +static int dma_shadow_cpu_trans(struct kvm_vcpu *vcpu, unsigned long *entry,
> +				unsigned long *gentry)
> +{
> +	phys_addr_t gaddr = 0;
> +	unsigned long idx;
> +	struct page *page;
> +	kvm_pfn_t pfn;
> +	gpa_t addr;
> +	int rc = 0;
> +
> +	if (pt_entry_isvalid(*gentry)) {
> +		/* pin and validate */
> +		addr = *gentry & ZPCI_PTE_ADDR_MASK;
> +		idx = srcu_read_lock(&vcpu->kvm->srcu);
> +		page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
> +		srcu_read_unlock(&vcpu->kvm->srcu, idx);
> +		if (is_error_page(page))
> +			return -EIO;
> +		gaddr = page_to_phys(page) + (addr & ~PAGE_MASK);
> +	}
> +
> +	if (pt_entry_isvalid(*entry)) {
> +		/* Either we are invalidating, replacing or no-op */
> +		if (gaddr != 0) {
> +			if ((*entry & ZPCI_PTE_ADDR_MASK) == gaddr) {
> +				/* Duplicate */
> +				kvm_release_pfn_dirty(*entry >> PAGE_SHIFT);
> +			} else {
> +				/* Replace */
> +				pfn = (*entry >> PAGE_SHIFT);
> +				invalidate_pt_entry(entry);
> +				set_pt_pfaa(entry, gaddr);
> +				validate_pt_entry(entry);
> +				kvm_release_pfn_dirty(pfn);
> +				rc = 1;
> +			}
> +		} else {
> +			/* Invalidate */
> +			pfn = (*entry >> PAGE_SHIFT);
> +			invalidate_pt_entry(entry);
> +			kvm_release_pfn_dirty(pfn);
> +			rc = 1;
> +		}
> +	} else if (gaddr != 0) {
> +		/* New Entry */
> +		set_pt_pfaa(entry, gaddr);
> +		validate_pt_entry(entry);
> +	}
> +
> +	return rc;
> +}
> +
> +static unsigned long *dma_walk_guest_cpu_trans(struct kvm_vcpu *vcpu,
> +					       struct kvm_zdev_ioat *ioat,
> +					       dma_addr_t dma_addr)
> +{
> +	unsigned long *rto, *sto, *pto;
> +	unsigned int rtx, rts, sx, px, idx;
> +	struct page *page;
> +	gpa_t addr;
> +	int i;
> +
> +	/* Pin guest segment table if needed */
> +	rtx = calc_rtx(dma_addr);
> +	rto = ioat->head[(rtx / ZPCI_TABLE_ENTRIES_PER_PAGE)];
> +	rts = rtx * ZPCI_TABLE_PAGES;
> +	if (!ioat->seg[rts]) {
> +		if (!reg_entry_isvalid(rto[rtx % ZPCI_TABLE_ENTRIES_PER_PAGE]))
> +			return NULL;
> +		sto = get_rt_sto(rto[rtx % ZPCI_TABLE_ENTRIES_PER_PAGE]);
> +		addr = ((u64)sto & ZPCI_RTE_ADDR_MASK);
> +		idx = srcu_read_lock(&vcpu->kvm->srcu);
> +		for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
> +			page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
> +			if (is_error_page(page)) {
> +				srcu_read_unlock(&vcpu->kvm->srcu, idx);
> +				return NULL;
> +			}
> +			ioat->seg[rts + i] = page_to_virt(page) +
> +					     (addr & ~PAGE_MASK);
> +			addr += PAGE_SIZE;
> +		}
> +		srcu_read_unlock(&vcpu->kvm->srcu, idx);
> +	}
> +
> +	/* Allocate pin pointers for another segment table if needed */
> +	if (!ioat->pt[rtx]) {
> +		ioat->pt[rtx] = kcalloc(ZPCI_TABLE_ENTRIES,
> +					(sizeof(unsigned long *)), GFP_KERNEL);
> +		if (!ioat->pt[rtx])
> +			return NULL;
> +	}
> +	/* Pin guest page table if needed */
> +	sx = calc_sx(dma_addr);
> +	sto = ioat->seg[(rts + (sx / ZPCI_TABLE_ENTRIES_PER_PAGE))];
> +	if (!ioat->pt[rtx][sx]) {
> +		if (!reg_entry_isvalid(sto[sx % ZPCI_TABLE_ENTRIES_PER_PAGE]))
> +			return NULL;
> +		pto = get_st_pto(sto[sx % ZPCI_TABLE_ENTRIES_PER_PAGE]);
> +		if (!pto)
> +			return NULL;
> +		addr = ((u64)pto & ZPCI_STE_ADDR_MASK);
> +		idx = srcu_read_lock(&vcpu->kvm->srcu);
> +		page = gfn_to_page(vcpu->kvm, gpa_to_gfn(addr));
> +		srcu_read_unlock(&vcpu->kvm->srcu, idx);
> +		if (is_error_page(page))
> +			return NULL;
> +		ioat->pt[rtx][sx] = page_to_virt(page) + (addr & ~PAGE_MASK);
> +	}
> +	pto = ioat->pt[rtx][sx];
> +
> +	/* Return guest PTE */
> +	px = calc_px(dma_addr);
> +	return &pto[px];
> +}
> +
> +
> +static int dma_table_shadow(struct kvm_vcpu *vcpu, struct zpci_dev *zdev,
> +			    dma_addr_t dma_addr, size_t size)
> +{
> +	unsigned int nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
> +	struct kvm_zdev *kzdev = zdev->kzdev;
> +	unsigned long *entry, *gentry;
> +	int i, rc = 0, rc2;
> +
> +	if (!nr_pages || !kzdev)
> +		return -EINVAL;
> +
> +	mutex_lock(&kzdev->ioat.lock);
> +	if (!zdev->dma_table || !kzdev->ioat.head[0]) {
> +		rc = -EINVAL;
> +		goto out_unlock;
> +	}
> +
> +	for (i = 0; i < nr_pages; i++) {
> +		gentry = dma_walk_guest_cpu_trans(vcpu, &kzdev->ioat, dma_addr);
> +		if (!gentry)
> +			continue;
> +		entry = dma_walk_cpu_trans(zdev->dma_table, dma_addr);
> +
> +		if (!entry) {
> +			rc = -ENOMEM;
> +			goto out_unlock;
> +		}
> +
> +		rc2 = dma_shadow_cpu_trans(vcpu, entry, gentry);
> +		if (rc2 < 0) {
> +			rc = -EIO;
> +			goto out_unlock;
> +		}
> +		dma_addr += PAGE_SIZE;
> +		rc += rc2;
> +	}
> +

In case of error, shouldn't we invalidate the shadow table entries that
we validated before the error?

> +out_unlock:
> +	mutex_unlock(&kzdev->ioat.lock);
> +	return rc;
> +}
> +
> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
> +			       unsigned long start, unsigned long size,
> +			       u8 *status)
> +{
> +	struct zpci_dev *zdev;
> +	u32 fh = req >> 32;
> +	int rc;
> +
> +	/* Make sure this is a valid device associated with this guest */
> +	zdev = get_zdev_by_fh(fh);
> +	if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm) {
> +		*status = 0;

Wouldn't it be useful to add some debug information here?
When would this happen?

Also, if we hit this error, it looks like we have a VM problem;
shouldn't we handle this in QEMU and return -EOPNOTSUPP?

> +		return -EINVAL;
> +	}
> +
> +	/* Only proceed if the device is using the assist */
> +	if (zdev->kzdev->ioat.head[0] == 0)
> +		return -EOPNOTSUPP;
> +
> +	rc = dma_table_shadow(vcpu, zdev, start, size);
> +	if (rc < 0) {
> +		/*
> +		 * If errors encountered during shadow operations, we must
> +		 * fabricate status to present to the guest
> +		 */
> +		switch (rc) {
> +		case -ENOMEM:
> +			*status = KVM_S390_RPCIT_INS_RES;
> +			break;
> +		default:
> +			*status = KVM_S390_RPCIT_ERR;
> +			break;
> +		}
> +	} else if (rc > 0) {
> +		/* Host RPCIT must be issued */
> +		rc = zpci_refresh_trans((u64) zdev->fh << 32, start, size,
> +					status);
> +	}
> +	zdev->kzdev->rpcit_count++;
> +
> +	return rc;
> +}
> +
>   /* Modify PCI: Register floating adapter interruption forwarding */
>   static int kvm_zpci_set_airq(struct zpci_dev *zdev)
>   {
> @@ -620,6 +822,8 @@ EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
>   
>   int kvm_s390_pci_init(void)
>   {
> +	int rc;
> +
>   	aift = kzalloc(sizeof(struct zpci_aift), GFP_KERNEL);
>   	if (!aift)
>   		return -ENOMEM;
> @@ -627,5 +831,7 @@ int kvm_s390_pci_init(void)
>   	spin_lock_init(&aift->gait_lock);
>   	mutex_init(&aift->lock);
>   
> -	return 0;
> +	rc = zpci_get_mdd(&aift->mdd);
> +
> +	return rc;
>   }
> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
> index 54355634df82..bb2be7fc3934 100644
> --- a/arch/s390/kvm/pci.h
> +++ b/arch/s390/kvm/pci.h
> @@ -18,6 +18,9 @@
>   
>   #define KVM_S390_PCI_DTSM_MASK 0x40
>   
> +#define KVM_S390_RPCIT_INS_RES 0x10
> +#define KVM_S390_RPCIT_ERR 0x28
> +
>   struct zpci_gaite {
>   	u32 gisa;
>   	u8 gisc;
> @@ -33,6 +36,7 @@ struct zpci_aift {
>   	struct kvm_zdev **kzdev;
>   	spinlock_t gait_lock; /* Protects the gait, used during AEN forward */
>   	struct mutex lock; /* Protects the other structures in aift */
> +	u32 mdd;
>   };
>   
>   extern struct zpci_aift *aift;
> @@ -47,7 +51,9 @@ static inline struct kvm *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
>   
>   int kvm_s390_pci_aen_init(u8 nisc);
>   void kvm_s390_pci_aen_exit(void);
> -
> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
> +			       unsigned long start, unsigned long end,
> +			       u8 *status);
>   int kvm_s390_pci_init(void);
>   
>   #endif /* __KVM_S390_PCI_H */
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 29/30] KVM: s390: introduce CPU feature for zPCI Interpretation
  2022-01-14 20:31 ` [PATCH v2 29/30] KVM: s390: introduce CPU feature for zPCI Interpretation Matthew Rosato
@ 2022-01-19 13:39   ` Pierre Morel
  0 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-19 13:39 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> KVM_S390_VM_CPU_FEAT_ZPCI_INTERP relays whether zPCI interpretive
> execution is possible based on the available hardware facilities.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/uapi/asm/kvm.h | 1 +
>   arch/s390/kvm/kvm-s390.c         | 4 ++++
>   2 files changed, 5 insertions(+)
> 
> diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
> index 7a6b14874d65..ed06458a871f 100644
> --- a/arch/s390/include/uapi/asm/kvm.h
> +++ b/arch/s390/include/uapi/asm/kvm.h
> @@ -130,6 +130,7 @@ struct kvm_s390_vm_cpu_machine {
>   #define KVM_S390_VM_CPU_FEAT_PFMFI	11
>   #define KVM_S390_VM_CPU_FEAT_SIGPIF	12
>   #define KVM_S390_VM_CPU_FEAT_KSS	13
> +#define KVM_S390_VM_CPU_FEAT_ZPCI_INTERP 14
>   struct kvm_s390_vm_cpu_feat {
>   	__u64 feat[16];
>   };
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index b6c32fc3b272..3ed59fe512dd 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -434,6 +434,10 @@ static void kvm_s390_cpu_feat_init(void)
>   	if (test_facility(151)) /* DFLTCC */
>   		__insn32_query(INSN_DFLTCC, kvm_s390_available_subfunc.dfltcc);
>   
> +	if (test_facility(69) && test_facility(70) && test_facility(71) &&
> +	    test_facility(72)) /* zPCI Interpretation */
> +		allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ZPCI_INTERP);
> +

Don't we want zPCI interpretation support to start with z14?

>   	if (MACHINE_HAS_ESOP)
>   		allow_cpu_feat(KVM_S390_VM_CPU_FEAT_ESOP);
>   	/*
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 28/30] vfio-pci/zdev: add DTSM to clp group capability
  2022-01-14 20:31 ` [PATCH v2 28/30] vfio-pci/zdev: add DTSM to clp group capability Matthew Rosato
@ 2022-01-19 13:48   ` Pierre Morel
  0 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-19 13:48 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> The DTSM, or designation type supported mask, indicates what IOAT formats
> are available to the guest.  For an interpreted device, userspace will not
> know what format(s) the IOAT assist supports, so pass it via the
> capability chain.  Since the value belongs to the Query PCI Function Group
> clp, let's extend the existing capability with a new version.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
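
For what it's worth, here is a minimal sketch of how userspace might consume the versioned capability: read dtsm only when the reported capability version is at least 2. The struct below is a trimmed-down, hypothetical mirror of the real layout (only the fields needed here), and group_cap_dtsm() is an illustrative helper name, not something from the series.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down mirror of the capability header. */
struct cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;
};

/* Hypothetical subset of the zPCI group capability. */
struct zpci_group_cap {
	struct cap_header header;
	uint8_t version;	/* last field of capability version 1 */
	uint8_t dtsm;		/* only meaningful from capability version 2 */
};

/* Return the DTSM if the kernel reported it, else 0 (nothing advertised). */
static uint8_t group_cap_dtsm(const struct zpci_group_cap *cap)
{
	if (cap->header.version < 2)
		return 0;
	return cap->dtsm;
}
```

An older kernel that still reports version 1 is then treated as advertising no IOAT designation types.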



> ---
>   drivers/vfio/pci/vfio_pci_zdev.c | 9 ++++++---
>   include/uapi/linux/vfio_zdev.h   | 3 +++
>   2 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index 2b169d688937..aa2ef9067c7d 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -45,19 +45,22 @@ static int zpci_group_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
>   {
>   	struct vfio_device_info_cap_zpci_group cap = {
>   		.header.id = VFIO_DEVICE_INFO_CAP_ZPCI_GROUP,
> -		.header.version = 1,
> +		.header.version = 2,
>   		.dasm = zdev->dma_mask,
>   		.msi_addr = zdev->msi_addr,
>   		.flags = VFIO_DEVICE_INFO_ZPCI_FLAG_REFRESH,
>   		.mui = zdev->fmb_update,
>   		.noi = zdev->max_msi,
>   		.maxstbl = ZPCI_MAX_WRITE_SIZE,
> -		.version = zdev->version
> +		.version = zdev->version,
> +		.dtsm = 0
>   	};
>   
>   	/* Some values are different for interpreted devices */
> -	if (zdev->kzdev && zdev->kzdev->interp)
> +	if (zdev->kzdev && zdev->kzdev->interp) {
>   		cap.maxstbl = zdev->maxstbl;
> +		cap.dtsm = kvm_s390_pci_get_dtsm(zdev);
> +	}
>   
>   	return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
>   }
> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
> index 1a5229b7bb18..b4c2ba8e71f0 100644
> --- a/include/uapi/linux/vfio_zdev.h
> +++ b/include/uapi/linux/vfio_zdev.h
> @@ -47,6 +47,9 @@ struct vfio_device_info_cap_zpci_group {
>   	__u16 noi;		/* Maximum number of MSIs */
>   	__u16 maxstbl;		/* Maximum Store Block Length */
>   	__u8 version;		/* Supported PCI Version */
> +	/* End of version 1 */
> +	__u8 dtsm;		/* Supported IOAT Designations */
> +	/* End of version 2 */
>   };
>   
>   /**
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 27/30] vfio-pci/zdev: wire up zPCI IOAT assist support
  2022-01-14 20:31 ` [PATCH v2 27/30] vfio-pci/zdev: wire up zPCI IOAT assist support Matthew Rosato
@ 2022-01-19 14:03   ` Pierre Morel
  0 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-19 14:03 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> Introduce support for VFIO_DEVICE_FEATURE_ZPCI_IOAT, which is a new
> VFIO_DEVICE_FEATURE ioctl.  This interface is used to indicate that an
> s390x vfio-pci device wishes to enable/disable zPCI I/O Address
> Translation assistance, allowing the host to perform address translation
> and shadowing.
> 

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
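
As a reading aid, here is a minimal sketch of the flag handling order used by the handler in this patch: PROBE wins, GET and SET are mutually exclusive, and a request with neither is rejected. The FEAT_* names are local stand-ins whose values are assumed to match the VFIO_DEVICE_FEATURE_* bits in linux/vfio.h, and classify_feature() is a hypothetical helper.

```c
#include <errno.h>
#include <stdint.h>

/* Local copies of the flag bits; values assumed to match linux/vfio.h. */
#define FEAT_GET	(1u << 16)
#define FEAT_SET	(1u << 17)
#define FEAT_PROBE	(1u << 18)

enum feat_op { FEAT_OP_PROBE, FEAT_OP_GET, FEAT_OP_SET };

/* Classify the request: returns 0 and fills *op, or -EINVAL. */
static int classify_feature(uint32_t flags, enum feat_op *op)
{
	/* PROBE is answered immediately, whatever else is set */
	if (flags & FEAT_PROBE) {
		*op = FEAT_OP_PROBE;
		return 0;
	}
	/* GET and SET are mutually exclusive */
	if ((flags & FEAT_GET) && (flags & FEAT_SET))
		return -EINVAL;
	if (flags & FEAT_GET) {
		*op = FEAT_OP_GET;
		return 0;
	}
	if (flags & FEAT_SET) {
		*op = FEAT_OP_SET;
		return 0;
	}
	/* Neither GET nor SET was specified */
	return -EINVAL;
}
```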


> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_pci.h  |  1 +
>   drivers/vfio/pci/vfio_pci_core.c |  2 +
>   drivers/vfio/pci/vfio_pci_zdev.c | 63 ++++++++++++++++++++++++++++++++
>   include/linux/vfio_pci_core.h    | 10 +++++
>   include/uapi/linux/vfio.h        |  8 ++++
>   include/uapi/linux/vfio_zdev.h   | 13 +++++++
>   6 files changed, 97 insertions(+)
> 
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index dbab349a4a75..7b6b6d771026 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -32,6 +32,7 @@ struct kvm_zdev {
>   	struct zpci_dev *zdev;
>   	struct kvm *kvm;
>   	u64 rpcit_count;
> +	u64 iota;
>   	struct kvm_zdev_ioat ioat;
>   	struct zpci_fib fib;
>   	struct notifier_block nb;
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 01658de660bd..709d9ba22a60 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1176,6 +1176,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
>   			return vfio_pci_zdev_feat_interp(vdev, feature, arg);
>   		case VFIO_DEVICE_FEATURE_ZPCI_AIF:
>   			return vfio_pci_zdev_feat_aif(vdev, feature, arg);
> +		case VFIO_DEVICE_FEATURE_ZPCI_IOAT:
> +			return vfio_pci_zdev_feat_ioat(vdev, feature, arg);
>   		default:
>   			return -ENOTTY;
>   		}
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index 891cfa016d63..2b169d688937 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -302,6 +302,68 @@ int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
>   	return rc;
>   }
>   
> +int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
> +			    struct vfio_device_feature feature,
> +			    unsigned long arg)
> +{
> +	struct zpci_dev *zdev = to_zpci(vdev->pdev);
> +	struct vfio_device_zpci_ioat *data;

NIT: something more explicit than "data", like "ioat_feature"?

> +	struct vfio_device_feature *feat;
> +	unsigned long minsz;
> +	int size, rc = 0;
> +
> +	if (!zdev || !zdev->kzdev)
> +		return -EINVAL;
> +
> +	/* If PROBE specified, return probe results immediately */
> +	if (feature.flags & VFIO_DEVICE_FEATURE_PROBE)
> +		return kvm_s390_pci_ioat_probe(zdev);
> +
> +	/* GET and SET are mutually exclusive */
> +	if ((feature.flags & VFIO_DEVICE_FEATURE_GET) &&
> +	    (feature.flags & VFIO_DEVICE_FEATURE_SET))
> +		return -EINVAL;
> +
> +	size = sizeof(*feat) + sizeof(*data);
> +	feat = kzalloc(size, GFP_KERNEL);
> +	if (!feat)
> +		return -ENOMEM;
> +
> +	data = (struct vfio_device_zpci_ioat *)&feat->data;
> +	minsz = offsetofend(struct vfio_device_feature, flags);
> +
> +	if (feature.argsz < minsz + sizeof(*data))
> +		return -EINVAL;
> +
> +	/* Get the rest of the payload for GET/SET */
> +	rc = copy_from_user(data, (void __user *)(arg + minsz),
> +			    sizeof(*data));
> +	if (rc)
> +		rc = -EINVAL;
> +
> +	if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
> +		data->iota = (u64)zdev->kzdev->iota;
> +		if (copy_to_user((void __user *)arg, feat, size))
> +			rc = -EFAULT;
> +	} else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
> +		if (data->iota != 0) {
> +			rc = kvm_s390_pci_ioat_enable(zdev, data->iota);
> +			if (!rc)
> +				zdev->kzdev->iota = data->iota;
> +		} else if (zdev->kzdev->iota != 0) {
> +			rc = kvm_s390_pci_ioat_disable(zdev);
> +			if (!rc)
> +				zdev->kzdev->iota = 0;
> +		}
> +	} else {
> +		/* Neither GET nor SET were specified */
> +		rc = -EINVAL;
> +	}
> +
> +	kfree(feat);
> +	return rc;
> +}
> +
>   static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
>   					unsigned long action, void *data)
>   {
> @@ -351,6 +413,7 @@ void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
>   	 */
>   	if (zdev->gd != 0) {
>   		kvm_s390_pci_aif_disable(zdev);
> +		kvm_s390_pci_ioat_disable(zdev);
>   		kvm_s390_pci_interp_disable(zdev);
>   	}
>   
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 7ec5e82e7933..f17d761ae14e 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -204,6 +204,9 @@ int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
>   int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
>   			   struct vfio_device_feature feature,
>   			   unsigned long arg);
> +int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
> +			    struct vfio_device_feature feature,
> +			    unsigned long arg);
>   void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
>   void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
>   #else
> @@ -227,6 +230,13 @@ static inline int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
>   	return -ENOTTY;
>   }
>   
> +static inline int vfio_pci_zdev_feat_ioat(struct vfio_pci_core_device *vdev,
> +					  struct vfio_device_feature feature,
> +					  unsigned long arg)
> +{
> +	return -ENOTTY;
> +}
> +
>   static inline void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
>   {
>   }
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index fe3bfd99bf50..32c687388f48 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1016,6 +1016,14 @@ struct vfio_device_feature {
>    */
>   #define VFIO_DEVICE_FEATURE_ZPCI_AIF		(2)
>   
> +/*
> + * Provide support for enabling guest I/O address translation assistance for
> + * zPCI devices.  This feature is only valid for s390x PCI devices.  Data
> + * provided when setting and getting this feature is further described in
> + * vfio_zdev.h
> + */
> +#define VFIO_DEVICE_FEATURE_ZPCI_IOAT		(3)
> +
>   /* -------- API for Type1 VFIO IOMMU -------- */
>   
>   /**
> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
> index c574e23f9385..1a5229b7bb18 100644
> --- a/include/uapi/linux/vfio_zdev.h
> +++ b/include/uapi/linux/vfio_zdev.h
> @@ -110,4 +110,17 @@ struct vfio_device_zpci_aif {
>   	__u8 sbo;		/* Offset of guest summary bit vector */
>   };
>   
> +/**
> + * VFIO_DEVICE_FEATURE_ZPCI_IOAT
> + *
> + * This feature is used for enabling guest I/O translation assistance for
> + * passthrough zPCI devices using instruction interpretation.  When setting
> + * this feature, the iota specifies a KVM guest I/O translation anchor.  When
> + * getting this feature, the most recently set anchor (or 0) is returned in
> + * iota.
> + */
> +struct vfio_device_zpci_ioat {
> +	__u64 iota;
> +};
> +
>   #endif
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 22/30] KVM: s390: intercept the rpcit instruction
  2022-01-14 20:31 ` [PATCH v2 22/30] KVM: s390: intercept the rpcit instruction Matthew Rosato
  2022-01-18 11:05   ` Pierre Morel
@ 2022-01-19 14:06   ` Pierre Morel
  1 sibling, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-19 14:06 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> For faster handling of PCI translation refreshes, intercept in KVM
> and call the associated handler.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>


Aside from our previous discussion, two small coding style issues to fix:
> ---
>   arch/s390/kvm/priv.c | 46 ++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 46 insertions(+)
> 
> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
> index 417154b314a6..5b65c1830de2 100644
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -29,6 +29,7 @@
>   #include <asm/ap.h>
>   #include "gaccess.h"
>   #include "kvm-s390.h"
> +#include "pci.h"
>   #include "trace.h"
>   
>   static int handle_ri(struct kvm_vcpu *vcpu)
> @@ -335,6 +336,49 @@ static int handle_rrbe(struct kvm_vcpu *vcpu)
>   	return 0;
>   }
>   
> +static int handle_rpcit(struct kvm_vcpu *vcpu)
> +{
> +	int reg1, reg2;
> +	u8 status;
> +	int rc;
> +
> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
> +
> +	/* If the host doesn't support PCI, it must be an emulated device */
> +	if (!IS_ENABLED(CONFIG_PCI))
> +		return -EOPNOTSUPP;
> +
> +	kvm_s390_get_regs_rre(vcpu, &reg1, &reg2);
> +
> +	/* If the device has a SHM bit on, let userspace take care of this */
> +	if (((vcpu->run->s.regs.gprs[reg1] >> 32) & aift->mdd) != 0)
> +		return -EOPNOTSUPP;
> +
> +	rc = kvm_s390_pci_refresh_trans(vcpu, vcpu->run->s.regs.gprs[reg1],
> +					vcpu->run->s.regs.gprs[reg2],
> +					vcpu->run->s.regs.gprs[reg2+1],

Here, spaces around "+"

> +					&status);
> +
> +	switch (rc) {
> +	case 0:
> +		kvm_s390_set_psw_cc(vcpu, 0);
> +		break;
> +	case -EOPNOTSUPP:
> +		return -EOPNOTSUPP;
> +	default:
> +		vcpu->run->s.regs.gprs[reg1] &= 0xffffffff00ffffffUL;
> +		vcpu->run->s.regs.gprs[reg1] |= (u64) status << 24;

Here, no blank after the cast.
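
To illustrate both the style and the effect of those two statements, here is a minimal standalone sketch that stores a status byte into bits 24-31 of a 64-bit register value. rpcit_store_status() is a hypothetical helper name used only for this example.

```c
#include <stdint.h>

/* Replace bits 24-31 of gpr with the status byte; other bits are kept. */
static uint64_t rpcit_store_status(uint64_t gpr, uint8_t status)
{
	gpr &= 0xffffffff00ffffffULL;	/* clear the old status byte */
	gpr |= (uint64_t)status << 24;	/* no blank between cast and operand */
	return gpr;
}
```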

> +		if (status != 0)
> +			kvm_s390_set_psw_cc(vcpu, 1);
> +		else
> +			kvm_s390_set_psw_cc(vcpu, 3);
> +		break;
> +	}
> +
> +	return 0;
> +}
> +
>   #define SSKE_NQ 0x8
>   #define SSKE_MR 0x4
>   #define SSKE_MC 0x2
> @@ -1275,6 +1319,8 @@ int kvm_s390_handle_b9(struct kvm_vcpu *vcpu)
>   		return handle_essa(vcpu);
>   	case 0xaf:
>   		return handle_pfmf(vcpu);
> +	case 0xd3:
> +		return handle_rpcit(vcpu);
>   	default:
>   		return -EOPNOTSUPP;
>   	}
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 21/30] KVM: s390: pci: handle refresh of PCI translations
  2022-01-19  9:29   ` Pierre Morel
@ 2022-01-19 16:39     ` Matthew Rosato
  2022-01-19 18:25       ` Pierre Morel
  0 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-19 16:39 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/19/22 4:29 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
...
>> +static int dma_table_shadow(struct kvm_vcpu *vcpu, struct zpci_dev 
>> *zdev,
>> +                dma_addr_t dma_addr, size_t size)
>> +{
>> +    unsigned int nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>> +    struct kvm_zdev *kzdev = zdev->kzdev;
>> +    unsigned long *entry, *gentry;
>> +    int i, rc = 0, rc2;
>> +
>> +    if (!nr_pages || !kzdev)
>> +        return -EINVAL;
>> +
>> +    mutex_lock(&kzdev->ioat.lock);
>> +    if (!zdev->dma_table || !kzdev->ioat.head[0]) {
>> +        rc = -EINVAL;
>> +        goto out_unlock;
>> +    }
>> +
>> +    for (i = 0; i < nr_pages; i++) {
>> +        gentry = dma_walk_guest_cpu_trans(vcpu, &kzdev->ioat, dma_addr);
>> +        if (!gentry)
>> +            continue;
>> +        entry = dma_walk_cpu_trans(zdev->dma_table, dma_addr);
>> +
>> +        if (!entry) {
>> +            rc = -ENOMEM;
>> +            goto out_unlock;
>> +        }
>> +
>> +        rc2 = dma_shadow_cpu_trans(vcpu, entry, gentry);
>> +        if (rc2 < 0) {
>> +            rc = -EIO;
>> +            goto out_unlock;
>> +        }
>> +        dma_addr += PAGE_SIZE;
>> +        rc += rc2;
>> +    }
>> +
> 
> In case of error, shouldn't we invalidate the shadow table entries that
> we validated before the error?

Hmm, I don't think this is strictly necessary - the status returned 
should indicate the specified DMA range is now in an indeterminate state 
(putting the onus on the guest to take corrective action via a global 
refresh).

In fact, I think I got that wrong below in kvm_s390_pci_refresh_trans():
the fabricated status should always be KVM_S390_RPCIT_INS_RES.
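
Roughly what I have in mind, as a standalone sketch rather than the actual follow-up patch: any negative return from the shadow operation fabricates the same status. rpcit_fabricate_status() is a hypothetical helper name; the constant value is the one defined in pci.h in this patch.

```c
#include <errno.h>
#include <stdint.h>

#define KVM_S390_RPCIT_INS_RES 0x10

/*
 * Fabricate a status for the guest when the shadow operation failed.
 * Returns 1 if *status was written, 0 if rc did not indicate an error.
 */
static int rpcit_fabricate_status(int rc, uint8_t *status)
{
	if (rc >= 0)
		return 0;
	/* Same status for every error: no per-errno switch needed */
	*status = KVM_S390_RPCIT_INS_RES;
	return 1;
}
```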

> 
>> +out_unlock:
>> +    mutex_unlock(&kzdev->ioat.lock);
>> +    return rc;
>> +}
>> +
>> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
>> +                   unsigned long start, unsigned long size,
>> +                   u8 *status)
>> +{
>> +    struct zpci_dev *zdev;
>> +    u32 fh = req >> 32;
>> +    int rc;
>> +
>> +    /* Make sure this is a valid device associated with this guest */
>> +    zdev = get_zdev_by_fh(fh);
>> +    if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm) {
>> +        *status = 0;
> 
> Wouldn't it be useful to add some debug information here?
> When would this happen?

Yes, I agree -- One of the follow-ons I'd like to add after this series 
is s390dbf entries; this seems like a good spot for one.

As to when this could happen: it should not under normal circumstances,
but consider something like arbitrary function handles coming from the
intercepted guest instruction.  We need to ensure that the specified
function 1) exists and 2) is associated with the guest issuing the refresh.
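
A simplified model of that two-part check, outside of any kernel context: look the handle up, then confirm the owner. The struct and function names here are hypothetical and exist only for this sketch; the patch itself does this via get_zdev_by_fh() and the kzdev->kvm comparison.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the guest and the host device. */
struct vm { int id; };
struct host_dev {
	uint32_t fh;			/* function handle */
	const struct vm *owner;		/* NULL if not attached to a guest */
};

/*
 * Return the device only if 1) a device with this handle exists and
 * 2) it is associated with the guest issuing the refresh.
 */
static const struct host_dev *find_dev_for_guest(const struct host_dev *devs,
						 size_t n, uint32_t fh,
						 const struct vm *guest)
{
	for (size_t i = 0; i < n; i++) {
		if (devs[i].fh != fh)
			continue;
		return devs[i].owner == guest ? &devs[i] : NULL;
	}
	return NULL;
}
```

A guest passing a handle that belongs to another guest, or to no device at all, gets NULL in both cases, which corresponds to the -EINVAL path.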

> 
> Also, if we hit this error, it looks like we have a VM problem;
> shouldn't we handle this in QEMU and return -EOPNOTSUPP?
> 

Well, I'm not sure if we can really tell where the problem is (it could 
for example indicate a misbehaving guest, or a bug in our KVM tracking 
of hostdevs).

The guest chose the function handle, and if we got here then that means 
it doesn't indicate that it's an emulated device, which means either we 
are using the assist and KVM should handle the intercept or we are not 
and userspace should handle it.  But in both of those cases, there 
should be a host device and it should be associated with the guest.

I think if we decide to throw this to userspace in this event, QEMU
needs some extra code to handle it (basically, if QEMU receives the
intercept and the device is neither emulated nor using intercept mode,
then we must treat it as an invalid handle, as this intercept should
have been handled by KVM).
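
To make that concrete, the userspace decision could look something like the sketch below. This is purely illustrative and is not existing QEMU code: the enum, the struct and userspace_rpcit_action() are all hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

enum rpcit_action {
	RPCIT_EMULATE,		/* emulated device: handle fully in userspace */
	RPCIT_PASSTHROUGH,	/* intercept mode: userspace drives the refresh */
	RPCIT_INVALID_HANDLE,	/* KVM should have handled it; reject */
};

/* Hypothetical per-device state as seen by userspace. */
struct dev_state {
	bool emulated;
	bool intercept_mode;
};

/* Decide what to do with an RPCIT intercept that reached userspace. */
static enum rpcit_action userspace_rpcit_action(const struct dev_state *dev)
{
	if (!dev)
		return RPCIT_INVALID_HANDLE;
	if (dev->emulated)
		return RPCIT_EMULATE;
	if (dev->intercept_mode)
		return RPCIT_PASSTHROUGH;
	/* Neither emulated nor intercept mode: the assist owns this device */
	return RPCIT_INVALID_HANDLE;
}
```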


>> +        return -EINVAL;
>> +    }
>> +
>> +    /* Only proceed if the device is using the assist */
>> +    if (zdev->kzdev->ioat.head[0] == 0)
>> +        return -EOPNOTSUPP;
>> +
>> +    rc = dma_table_shadow(vcpu, zdev, start, size);
>> +    if (rc < 0) {
>> +        /*
>> +         * If errors encountered during shadow operations, we must
>> +         * fabricate status to present to the guest
>> +         */
>> +        switch (rc) {
>> +        case -ENOMEM:
>> +            *status = KVM_S390_RPCIT_INS_RES;
>> +            break;
>> +        default:
>> +            *status = KVM_S390_RPCIT_ERR;
>> +            break;

As mentioned above I think this switch statement should go away and 
instead always set KVM_S390_RPCIT_INS_RES when rc < 0.

>> +        }
>> +    } else if (rc > 0) {
>> +        /* Host RPCIT must be issued */
>> +        rc = zpci_refresh_trans((u64) zdev->fh << 32, start, size,
>> +                    status);
>> +    }
>> +    zdev->kzdev->rpcit_count++;
>> +
>> +    return rc;
>> +}
>> +
>>   /* Modify PCI: Register floating adapter interruption forwarding */
>>   static int kvm_zpci_set_airq(struct zpci_dev *zdev)
>>   {
>> @@ -620,6 +822,8 @@ EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
>>   int kvm_s390_pci_init(void)
>>   {
>> +    int rc;
>> +
>>       aift = kzalloc(sizeof(struct zpci_aift), GFP_KERNEL);
>>       if (!aift)
>>           return -ENOMEM;
>> @@ -627,5 +831,7 @@ int kvm_s390_pci_init(void)
>>       spin_lock_init(&aift->gait_lock);
>>       mutex_init(&aift->lock);
>> -    return 0;
>> +    rc = zpci_get_mdd(&aift->mdd);
>> +
>> +    return rc;
>>   }
>> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
>> index 54355634df82..bb2be7fc3934 100644
>> --- a/arch/s390/kvm/pci.h
>> +++ b/arch/s390/kvm/pci.h
>> @@ -18,6 +18,9 @@
>>   #define KVM_S390_PCI_DTSM_MASK 0x40
>> +#define KVM_S390_RPCIT_INS_RES 0x10
>> +#define KVM_S390_RPCIT_ERR 0x28
>> +
>>   struct zpci_gaite {
>>       u32 gisa;
>>       u8 gisc;
>> @@ -33,6 +36,7 @@ struct zpci_aift {
>>       struct kvm_zdev **kzdev;
>>       spinlock_t gait_lock; /* Protects the gait, used during AEN 
>> forward */
>>       struct mutex lock; /* Protects the other structures in aift */
>> +    u32 mdd;
>>   };
>>   extern struct zpci_aift *aift;
>> @@ -47,7 +51,9 @@ static inline struct kvm 
>> *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
>>   int kvm_s390_pci_aen_init(u8 nisc);
>>   void kvm_s390_pci_aen_exit(void);
>> -
>> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long req,
>> +                   unsigned long start, unsigned long end,
>> +                   u8 *status);
>>   int kvm_s390_pci_init(void);
>>   #endif /* __KVM_S390_PCI_H */
>>
> 



* Re: [PATCH v2 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support
  2022-01-14 20:31 ` [PATCH v2 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support Matthew Rosato
@ 2022-01-19 17:10   ` Pierre Morel
  2022-01-19 17:20     ` Matthew Rosato
  2022-01-25 12:36   ` Pierre Morel
  1 sibling, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-19 17:10 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> Introduce support for VFIO_DEVICE_FEATURE_ZPCI_AIF, which is a new
> VFIO_DEVICE_FEATURE ioctl.  This interface is used to indicate that an
> s390x vfio-pci device wishes to enable/disable zPCI adapter interrupt
> forwarding, which allows underlying firmware to deliver interrupts
> directly to the associated kvm guest.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_pci.h  |  2 +
>   drivers/vfio/pci/vfio_pci_core.c |  2 +
>   drivers/vfio/pci/vfio_pci_zdev.c | 98 +++++++++++++++++++++++++++++++-
>   include/linux/vfio_pci_core.h    | 10 ++++
>   include/uapi/linux/vfio.h        |  7 +++
>   include/uapi/linux/vfio_zdev.h   | 20 +++++++
>   6 files changed, 138 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index dc00c3f27a00..dbab349a4a75 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -36,6 +36,8 @@ struct kvm_zdev {
>   	struct zpci_fib fib;
>   	struct notifier_block nb;
>   	bool interp;
> +	bool aif;
> +	bool fhost;
>   };
>   
>   int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 2b2d64a2190c..01658de660bd 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1174,6 +1174,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
>   			return 0;
>   		case VFIO_DEVICE_FEATURE_ZPCI_INTERP:
>   			return vfio_pci_zdev_feat_interp(vdev, feature, arg);
> +		case VFIO_DEVICE_FEATURE_ZPCI_AIF:
> +			return vfio_pci_zdev_feat_aif(vdev, feature, arg);
>   		default:
>   			return -ENOTTY;
>   		}
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index 4339f48b98bc..891cfa016d63 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -13,6 +13,7 @@
>   #include <linux/vfio_zdev.h>
>   #include <asm/pci_clp.h>
>   #include <asm/pci_io.h>
> +#include <asm/pci_insn.h>
>   #include <asm/kvm_pci.h>
>   
>   #include <linux/vfio_pci_core.h>
> @@ -208,6 +209,99 @@ int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
>   	return rc;
>   }
>   
> +int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> +			   struct vfio_device_feature feature,
> +			   unsigned long arg)
> +{
> +	struct zpci_dev *zdev = to_zpci(vdev->pdev);
> +	struct vfio_device_zpci_aif *data;
> +	struct vfio_device_feature *feat;
> +	unsigned long minsz;
> +	int size, rc = 0;
> +
> +	if (!zdev || !zdev->kzdev)
> +		return -EINVAL;
> +
> +	/* If PROBE specified, return probe results immediately */
> +	if (feature.flags & VFIO_DEVICE_FEATURE_PROBE)
> +		return kvm_s390_pci_aif_probe(zdev);
> +
> +	/* GET and SET are mutually exclusive */
> +	if ((feature.flags & VFIO_DEVICE_FEATURE_GET) &&
> +	    (feature.flags & VFIO_DEVICE_FEATURE_SET))
> +		return -EINVAL;
> +
> +	size = sizeof(*feat) + sizeof(*data);
> +	feat = kzalloc(size, GFP_KERNEL);
> +	if (!feat)
> +		return -ENOMEM;
> +
> +	data = (struct vfio_device_zpci_aif *)&feat->data;
> +	minsz = offsetofend(struct vfio_device_feature, flags);
> +
> +	if (feature.argsz < minsz + sizeof(*data))
> +		return -EINVAL;
> +
> +	/* Get the rest of the payload for GET/SET */
> +	rc = copy_from_user(data, (void __user *)(arg + minsz),
> +			    sizeof(*data));
> +	if (rc)
> +		rc = -EINVAL;
> +
> +	if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
> +		if (zdev->kzdev->aif)
> +			data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
> +		if (zdev->kzdev->fhost)
> +			data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
> +
> +		if (copy_to_user((void __user *)arg, feat, size))
> +			rc = -EFAULT;
> +	} else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
> +		if (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT) {
> +			/* create a guest fib */
> +			struct zpci_fib fib;
> +
> +			fib.fmt0.aibv = data->ibv;
> +			fib.fmt0.isc = data->isc;
> +			fib.fmt0.noi = data->noi;
> +			if (data->sb != 0) {
> +				fib.fmt0.aisb = data->sb;
> +				fib.fmt0.aisbo = data->sbo;
> +				fib.fmt0.sum = 1;
> +			} else {
> +				fib.fmt0.aisb = 0;
> +				fib.fmt0.aisbo = 0;
> +				fib.fmt0.sum = 0;
> +			}
> +			if (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST) {
> +				rc = kvm_s390_pci_aif_enable(zdev, &fib, false);
> +				if (!rc) {
> +					zdev->kzdev->aif = true;
> +					zdev->kzdev->fhost = true;
> +				}
> +			} else {
> +				rc = kvm_s390_pci_aif_enable(zdev, &fib, true);
> +				if (!rc)
> +					zdev->kzdev->aif = true;
> +			}
> +		} else if (data->flags == 0) {
> +			rc = kvm_s390_pci_aif_disable(zdev);
> +			if (!rc) {
> +				zdev->kzdev->aif = false;
> +				zdev->kzdev->fhost = false;
> +			}
> +		} else {
> +			rc = -EINVAL;
> +		}
> +	} else {
> +		/* Neither GET nor SET were specified */
> +		rc = -EINVAL;
> +	}
> +
> +	kfree(feat);
> +	return rc;
> +}
> +
>   static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
>   					unsigned long action, void *data)
>   {
> @@ -255,8 +349,10 @@ void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
>   	 * If the device was using interpretation, don't trust that userspace
>   	 * did the appropriate cleanup
>   	 */
> -	if (zdev->gd != 0)
> +	if (zdev->gd != 0) {
> +		kvm_s390_pci_aif_disable(zdev);
>   		kvm_s390_pci_interp_disable(zdev);
> +	}
>   
>   	kvm_s390_pci_dev_release(zdev);
>   }
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 0db2b1051931..7ec5e82e7933 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -201,6 +201,9 @@ extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
>   int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
>   			      struct vfio_device_feature feature,
>   			      unsigned long arg);
> +int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> +			   struct vfio_device_feature feature,
> +			   unsigned long arg);
>   void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
>   void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
>   #else
> @@ -217,6 +220,13 @@ static inline int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
>   	return -ENOTTY;
>   }
>   
> +static inline int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> +					 struct vfio_device_feature feature,
> +					 unsigned long arg)
> +{
> +	return -ENOTTY;
> +}
> +
>   static inline void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
>   {
>   }
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index b9a75485b8e7..fe3bfd99bf50 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1009,6 +1009,13 @@ struct vfio_device_feature {
>    */
>   #define VFIO_DEVICE_FEATURE_ZPCI_INTERP		(1)
>   
> +/*
> + * Provide support for enabling adapter interruption forwarding for zPCI
> + * devices.  This feature is only valid for s390x PCI devices.  Data provided
> + * when setting and getting this feature is further described in vfio_zdev.h
> + */
> +#define VFIO_DEVICE_FEATURE_ZPCI_AIF		(2)
> +
>   /* -------- API for Type1 VFIO IOMMU -------- */
>   
>   /**
> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
> index 575f0410dc66..c574e23f9385 100644
> --- a/include/uapi/linux/vfio_zdev.h
> +++ b/include/uapi/linux/vfio_zdev.h
> @@ -90,4 +90,24 @@ struct vfio_device_zpci_interp {
>   	__u32 fh;		/* Host device function handle */
>   };
>   
> +/**
> + * VFIO_DEVICE_FEATURE_ZPCI_AIF
> + *
> + * This feature is used for enabling forwarding of adapter interrupts directly
> + * from firmware to the guest.  When setting this feature, the flags indicate
> + * whether to enable/disable the feature and the structure defined below is
> + * used to setup the forwarding structures.  When getting this feature, only
> + * the flags are used to indicate the current state.
> + */
> +struct vfio_device_zpci_aif {
> +	__u64 flags;
> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST 2

Generally it looks good to me, but I am missing some explanation of these flags.

Which makes me realize that more complete documentation under 
Documentation/s390 for VFIO zPCI, like we have for VFIO AP and VFIO CCW, 
would be of great interest.


> +	__u64 ibv;		/* Address of guest interrupt bit vector */
> +	__u64 sb;		/* Address of guest summary bit */
> +	__u32 noi;		/* Number of interrupts */
> +	__u8 isc;		/* Guest interrupt subclass */
> +	__u8 sbo;		/* Offset of guest summary bit vector */
> +};
> +
>   #endif
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support
  2022-01-19 17:10   ` Pierre Morel
@ 2022-01-19 17:20     ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-19 17:20 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/19/22 12:10 PM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
...
>> diff --git a/include/uapi/linux/vfio_zdev.h 
>> b/include/uapi/linux/vfio_zdev.h
>> index 575f0410dc66..c574e23f9385 100644
>> --- a/include/uapi/linux/vfio_zdev.h
>> +++ b/include/uapi/linux/vfio_zdev.h
>> @@ -90,4 +90,24 @@ struct vfio_device_zpci_interp {
>>       __u32 fh;        /* Host device function handle */
>>   };
>> +/**
>> + * VFIO_DEVICE_FEATURE_ZPCI_AIF
>> + *
>> + * This feature is used for enabling forwarding of adapter interrupts 
>> directly
>> + * from firmware to the guest.  When setting this feature, the flags 
>> indicate
>> + * whether to enable/disable the feature and the structure defined 
>> below is
>> + * used to setup the forwarding structures.  When getting this 
>> feature, only
>> + * the flags are used to indicate the current state.
>> + */
>> +struct vfio_device_zpci_aif {
>> +    __u64 flags;
>> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
>> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST 2
> 
> Generally it looks good to me, but I am missing some explanation of these flags.

I can add a small line comment for each, like:

  AIF_FLOAT 1 /* Floating interrupts enabled */
  AIF_HOST 2  /* Host delivery forced */

But here's a bit more detail:

On SET:
AIF_FLOAT = 1 means enable the interrupt forwarding assist for floating 
interrupt delivery
AIF_FLOAT = 0 means to disable it.
AIF_HOST = 1 means the assist will always deliver the interrupt to the 
host and let the host inject it
AIF_HOST = 0 host only gets interrupts when firmware can't deliver

On GET, we just indicate the current settings from the most recent SET, 
meaning:
AIF_FLOAT = 1 interrupt forwarding assist is currently active
AIF_FLOAT = 0 interrupt forwarding assist is not currently active
AIF_HOST = 1 interrupt forwarding will always go through host
AIF_HOST = 0 interrupt forwarding will only go through the host when 
necessary
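
To make the combinations concrete, here is a minimal userspace sketch of how the two flags might be interpreted. The flag values are taken from the proposed vfio_zdev.h hunk; the helper names aif_set_flags_valid() and aif_describe() are hypothetical and only mirror the SET handling in the posted vfio_pci_zdev_feat_aif(), where AIF_FLOAT enables, a flags value of 0 disables, and AIF_HOST on its own is rejected.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as defined in the proposed vfio_zdev.h hunk. */
#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST  2

/* Hypothetical helper: would the posted SET handler accept these flags? */
static bool aif_set_flags_valid(uint64_t flags)
{
	/* FLOAT set means enable; exactly 0 means disable; HOST alone is rejected. */
	return (flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT) || flags == 0;
}

/* Hypothetical helper: describe the state that a GET would report. */
static const char *aif_describe(uint64_t flags)
{
	if (!(flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT))
		return "assist inactive";
	if (flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST)
		return "assist active, delivery always via host";
	return "assist active, host only when firmware cannot deliver";
}
```

So userspace would pass AIF_FLOAT | AIF_HOST to force host delivery, AIF_FLOAT alone for firmware delivery with host fallback, and 0 to turn the assist off.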

> 
> Which makes me realize that more complete documentation under 
> Documentation/s390 for VFIO zPCI, like we have for VFIO AP and VFIO CCW, 
> would be of great interest.

You're not wrong -- a similar comment came up for QEMU.  I will add this 
to my todo list as a follow-on.





* Re: [PATCH v2 15/30] KVM: s390: pci: do initial setup for AEN interpretation
  2022-01-14 20:31 ` [PATCH v2 15/30] KVM: s390: pci: do initial setup for AEN interpretation Matthew Rosato
@ 2022-01-19 18:06   ` Pierre Morel
  2022-01-19 20:19     ` Matthew Rosato
  2022-01-25 12:23   ` Pierre Morel
  1 sibling, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-19 18:06 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> Initial setup for Adapter Event Notification Interpretation for zPCI
> passthrough devices.  Specifically, allocate a structure for forwarding of
> adapter events and pass the address of this structure to firmware.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/pci.h      |   4 +
>   arch/s390/include/asm/pci_insn.h |  12 +++
>   arch/s390/kvm/interrupt.c        |  14 +++
>   arch/s390/kvm/kvm-s390.c         |   9 ++
>   arch/s390/kvm/pci.c              | 144 +++++++++++++++++++++++++++++++
>   arch/s390/kvm/pci.h              |  42 +++++++++
>   arch/s390/pci/pci.c              |   6 ++
>   7 files changed, 231 insertions(+)
>   create mode 100644 arch/s390/kvm/pci.h
> 
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 9b6c657d8d31..9ff8dc19975e 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -9,6 +9,7 @@
>   #include <asm-generic/pci.h>
>   #include <asm/pci_clp.h>
>   #include <asm/pci_debug.h>
> +#include <asm/pci_insn.h>
>   #include <asm/sclp.h>
>   
>   #define PCIBIOS_MIN_IO		0x1000
> @@ -204,6 +205,9 @@ extern const struct attribute_group *zpci_attr_groups[];
>   extern unsigned int s390_pci_force_floating __initdata;
>   extern unsigned int s390_pci_no_rid;
>   
> +extern union zpci_sic_iib *zpci_aipb;
> +extern struct airq_iv *zpci_aif_sbv;
> +
>   /* -----------------------------------------------------------------------------
>     Prototypes
>   ----------------------------------------------------------------------------- */
> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> index 32759c407b8f..ad9000295c82 100644
> --- a/arch/s390/include/asm/pci_insn.h
> +++ b/arch/s390/include/asm/pci_insn.h
> @@ -101,6 +101,7 @@ struct zpci_fib {
>   /* Set Interruption Controls Operation Controls  */
>   #define	SIC_IRQ_MODE_ALL		0
>   #define	SIC_IRQ_MODE_SINGLE		1
> +#define	SIC_SET_AENI_CONTROLS		2
>   #define	SIC_IRQ_MODE_DIRECT		4
>   #define	SIC_IRQ_MODE_D_ALL		16
>   #define	SIC_IRQ_MODE_D_SINGLE		17
> @@ -127,9 +128,20 @@ struct zpci_cdiib {
>   	u64 : 64;
>   } __packed __aligned(8);
>   
> +/* adapter interruption parameters block */
> +struct zpci_aipb {
> +	u64 faisb;
> +	u64 gait;
> +	u16 : 13;
> +	u16 afi : 3;
> +	u32 : 32;
> +	u16 faal;
> +} __packed __aligned(8);
> +
>   union zpci_sic_iib {
>   	struct zpci_diib diib;
>   	struct zpci_cdiib cdiib;
> +	struct zpci_aipb aipb;
>   };
>   
>   DECLARE_STATIC_KEY_FALSE(have_mio);
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index f9b872e358c6..a591b8cd662f 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -32,6 +32,7 @@
>   #include "kvm-s390.h"
>   #include "gaccess.h"
>   #include "trace-s390.h"
> +#include "pci.h"
>   
>   #define PFAULT_INIT 0x0600
>   #define PFAULT_DONE 0x0680
> @@ -3278,6 +3279,11 @@ void kvm_s390_gib_destroy(void)
>   {
>   	if (!gib)
>   		return;
> +	if (IS_ENABLED(CONFIG_PCI) && sclp.has_aeni && aift) {
> +		mutex_lock(&aift->lock);
> +		kvm_s390_pci_aen_exit();
> +		mutex_unlock(&aift->lock);
> +	}
>   	chsc_sgib(0);
>   	unregister_adapter_interrupt(&gib_alert_irq);
>   	free_page((unsigned long)gib);
> @@ -3315,6 +3321,14 @@ int kvm_s390_gib_init(u8 nisc)
>   		goto out_unreg_gal;
>   	}
>   
> +	if (IS_ENABLED(CONFIG_PCI) && sclp.has_aeni) {
> +		if (kvm_s390_pci_aen_init(nisc)) {
> +			pr_err("Initializing AEN for PCI failed\n");
> +			rc = -EIO;
> +			goto out_unreg_gal;
> +		}
> +	}
> +
>   	KVM_EVENT(3, "gib 0x%pK (nisc=%d) initialized", gib, gib->nisc);
>   	goto out;
>   
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 14a18ba5ff2c..01dc3f6883d0 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -48,6 +48,7 @@
>   #include <asm/fpu/api.h>
>   #include "kvm-s390.h"
>   #include "gaccess.h"
> +#include "pci.h"
>   
>   #define CREATE_TRACE_POINTS
>   #include "trace.h"
> @@ -503,6 +504,14 @@ int kvm_arch_init(void *opaque)
>   		goto out;
>   	}
>   
> +	if (IS_ENABLED(CONFIG_PCI)) {
> +		rc = kvm_s390_pci_init();
> +		if (rc) {
> +			pr_err("Unable to allocate AIFT for PCI\n");
> +			goto out;
> +		}
> +	}
> +
>   	rc = kvm_s390_gib_init(GAL_ISC);
>   	if (rc)
>   		goto out;
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index 1c33bc7bf2bd..dae853da6df1 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -10,6 +10,138 @@
>   #include <linux/kvm_host.h>
>   #include <linux/pci.h>
>   #include <asm/kvm_pci.h>
> +#include <asm/pci.h>
> +#include <asm/pci_insn.h>
> +#include "pci.h"
> +
> +struct zpci_aift *aift;
> +
> +static inline int __set_irq_noiib(u16 ctl, u8 isc)
> +{
> +	union zpci_sic_iib iib = {{0}};
> +
> +	return zpci_set_irq_ctrl(ctl, isc, &iib);
> +}
> +
> +/* Caller must hold the aift lock before calling this function */
> +void kvm_s390_pci_aen_exit(void)
> +{
> +	unsigned long flags;
> +	struct kvm_zdev **gait_kzdev;
> +
> +	/*
> +	 * Contents of the aipb remain registered for the life of the host
> +	 * kernel, the information preserved in zpci_aipb and zpci_aif_sbv
> +	 * in case we insert the KVM module again later.  Clear the AIFT
> +	 * information and free anything not registered with underlying
> +	 * firmware.
> +	 */
> +	spin_lock_irqsave(&aift->gait_lock, flags);
> +	gait_kzdev = aift->kzdev;
> +	aift->gait = 0;
> +	aift->sbv = 0;
> +	aift->kzdev = 0;
> +	spin_unlock_irqrestore(&aift->gait_lock, flags);
> +
> +	kfree(gait_kzdev);
> +}
> +
> +int kvm_s390_pci_aen_init(u8 nisc)
> +{
> +	struct page *page;
> +	int rc = 0, size;
> +	bool first = false;
> +
> +	/* If already enabled for AEN, bail out now */
> +	if (aift->gait || aift->sbv)
> +		return -EPERM;
> +
> +	mutex_lock(&aift->lock);
> +	aift->kzdev = kcalloc(ZPCI_NR_DEVICES, sizeof(struct kvm_zdev),
> +			      GFP_KERNEL);
> +	if (!aift->kzdev) {
> +		rc = -ENOMEM;
> +		goto unlock;
> +	}
> +
> +	if (!zpci_aipb) {

I think you should move all of this allocation and setup of the aipb
into a dedicated function, zpci_setup_aipb(),
from here ----->

> +		zpci_aipb = kzalloc(sizeof(union zpci_sic_iib), GFP_KERNEL);
> +		if (!zpci_aipb) {
> +			rc = -ENOMEM;
> +			goto free_zdev;
> +		}
> +		first = true;
> +		aift->sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, 0);
> +		if (!aift->sbv) {
> +			rc = -ENOMEM;
> +			goto free_aipb;
> +		}
> +		zpci_aif_sbv = aift->sbv;
> +		size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
> +					    sizeof(struct zpci_gaite)));
> +		page = alloc_pages(GFP_KERNEL | __GFP_ZERO, size);
> +		if (!page) {
> +			rc = -ENOMEM;
> +			goto free_sbv;
> +		}
> +		aift->gait = (struct zpci_gaite *)page_to_phys(page);
> +
> +		zpci_aipb->aipb.faisb = virt_to_phys(aift->sbv->vector);
> +		zpci_aipb->aipb.gait = virt_to_phys(aift->gait);
> +		zpci_aipb->aipb.afi = nisc;
> +		zpci_aipb->aipb.faal = ZPCI_NR_DEVICES;
> +
> +		/* Setup Adapter Event Notification Interpretation */
> +		if (zpci_set_irq_ctrl(SIC_SET_AENI_CONTROLS, 0, zpci_aipb)) {
> +			rc = -EIO;
> +			goto free_gait;

to here---->

> +		}
> +	} else {
> +		/*
> +		 * AEN registration can only happen once per system boot.  If
> +		 * an aipb already exists then AEN was already registered and
> +		 * we can re-use the aipb contents.  This can only happen if
> +		 * the KVM module was removed and re-inserted.
> +		 */
> +		if (zpci_aipb->aipb.afi != nisc ||
> +		    zpci_aipb->aipb.faal != ZPCI_NR_DEVICES) {
> +			rc = -EINVAL;
> +			goto free_zdev;
> +		}
> +		aift->sbv = zpci_aif_sbv;
> +		aift->gait = (struct zpci_gaite *)zpci_aipb->aipb.gait;
> +	}
> +
> +	/* Enable floating IRQs */
> +	if (__set_irq_noiib(SIC_IRQ_MODE_SINGLE, nisc)) {
> +		rc = -EIO;
> +		kvm_s390_pci_aen_exit();
> +	}
> +
> +	goto unlock;

and the corresponding error handling

from here ---->
> +
> +free_gait:
> +	size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
> +				    sizeof(struct zpci_gaite)));
> +	free_pages((unsigned long)aift->gait, size);
> +free_sbv:
> +	if (first) {
> +		/* If AEN setup failed, only free a newly-allocated sbv */
> +		airq_iv_release(aift->sbv);
> +		zpci_aif_sbv = 0;
> +	}
> +free_aipb:
> +	if (first) {
> +		/* If AEN setup failed, only free a newly-allocated aipb */
> +		kfree(zpci_aipb);
> +		zpci_aipb = 0;
> +	}

to here ---->

That would make the code easier to understand.

> +free_zdev:
> +	kfree(aift->kzdev);
> +unlock:
> +	mutex_unlock(&aift->lock);
> +	return rc;
> +}
>  

... snip...

The second part of the if (!aipb) / else
could also be moved out into zpci_reset_aipb(),

which leads to

    if (!aipb)
        ret = zpci_setup_aipb();
    else
        ret = zpci_reset_aipb();

    if (ret)
        goto cleanup;

    enable_irq();
    goto unlock;

I think that if we can do that it would be much clearer.
What do you think?
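
A minimal userspace sketch of that control flow, under stated assumptions: calloc()/free() stand in for the kernel allocators, the fail_register parameter stands in for zpci_set_irq_ctrl() failing, and struct aipb_model, kzdev_table, aen_init() and aen_exit() are illustrative names, not the real kernel objects. The point is only that each helper undoes its own allocations, so the caller needs a single cleanup label.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Illustrative stand-in for the registered aipb contents. */
struct aipb_model {
	unsigned int afi;
	unsigned int faal;
};

static struct aipb_model *aipb;	/* survives a modelled module reload */
static void **kzdev_table;	/* per-init allocation */

/* First-time path: allocate and register; undo own work on failure. */
static int zpci_setup_aipb(unsigned int nisc, unsigned int nr_devices,
			   bool fail_register)
{
	aipb = calloc(1, sizeof(*aipb));
	if (!aipb)
		return -ENOMEM;
	aipb->afi = nisc;
	aipb->faal = nr_devices;
	if (fail_register) {	/* models the registration call failing */
		free(aipb);
		aipb = NULL;
		return -EIO;
	}
	return 0;
}

/* Reload path: only check that the registered values still match. */
static int zpci_reset_aipb(unsigned int nisc, unsigned int nr_devices)
{
	if (aipb->afi != nisc || aipb->faal != nr_devices)
		return -EINVAL;
	return 0;
}

static int aen_init(unsigned int nisc, unsigned int nr_devices,
		    bool fail_register)
{
	int rc;

	kzdev_table = calloc(nr_devices, sizeof(*kzdev_table));
	if (!kzdev_table)
		return -ENOMEM;

	if (!aipb)
		rc = zpci_setup_aipb(nisc, nr_devices, fail_register);
	else
		rc = zpci_reset_aipb(nisc, nr_devices);
	if (rc)
		goto free_kzdev;

	/* Enabling floating IRQs would follow here in the real code. */
	return 0;

free_kzdev:
	free(kzdev_table);
	kzdev_table = NULL;
	return rc;
}

/* Models module removal: the aipb is deliberately left registered. */
static void aen_exit(void)
{
	free(kzdev_table);
	kzdev_table = NULL;
}
```

With the first-time error unwinding contained in zpci_setup_aipb(), the "first" flag from the patch would no longer be needed in the caller.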

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution
  2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
                   ` (30 preceding siblings ...)
  2022-01-14 20:49 ` [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
@ 2022-01-19 18:10 ` Pierre Morel
  31 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-19 18:10 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> Enable interpretive execution of zPCI instructions + adapter interruption
> forwarding for s390x KVM vfio-pci.  This is done by introducing a series
> of new vfio-pci feature ioctls that are unique to vfio-pci-zdev (s390x) and
> are used to negotiate the various aspects of zPCI interpretation setup.
> By allowing interpretation of zPCI instructions and firmware delivery of
> interrupts to guests, we can significantly reduce the frequency of guest
> SIE exits for zPCI.  We then see additional gains by handling a hot-path
> instruction that can still intercept to the hypervisor (RPCIT) directly
> in kvm.
> 
> From the perspective of guest configuration, you pass through zPCI devices
> in the same manner as before, with interpretation support being used by
> default if available in kernel+qemu.
> 
> Will reply with a link to the associated QEMU series.

I made this comment on one of the patches, but I think raising it here is 
clearer: having documentation under Documentation/s390, as we already 
have for VFIO AP and VFIO CCW, would be a good thing.

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 13/30] s390/pci: return status from zpci_refresh_trans
  2022-01-14 20:31 ` [PATCH v2 13/30] s390/pci: return status from zpci_refresh_trans Matthew Rosato
@ 2022-01-19 18:13   ` Pierre Morel
  2022-01-26 10:45   ` Claudio Imbrenda
  2022-01-27 10:30   ` Niklas Schnelle
  2 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-19 18:13 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> Current callers of zpci_refresh_trans don't need to interrogate the status
> returned from the underlying instructions.  However, a subsequent patch
> will add a KVM caller that needs this information.  Add a new argument to
> zpci_refresh_trans to pass the address of a status byte and update
> existing call sites to provide it.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>


> ---
>   arch/s390/include/asm/pci_insn.h |  2 +-
>   arch/s390/pci/pci_dma.c          |  6 ++++--
>   arch/s390/pci/pci_insn.c         | 10 +++++-----
>   drivers/iommu/s390-iommu.c       |  4 +++-
>   4 files changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> index 5331082fa516..32759c407b8f 100644
> --- a/arch/s390/include/asm/pci_insn.h
> +++ b/arch/s390/include/asm/pci_insn.h
> @@ -135,7 +135,7 @@ union zpci_sic_iib {
>   DECLARE_STATIC_KEY_FALSE(have_mio);
>   
>   u8 zpci_mod_fc(u64 req, struct zpci_fib *fib, u8 *status);
> -int zpci_refresh_trans(u64 fn, u64 addr, u64 range);
> +int zpci_refresh_trans(u64 fn, u64 addr, u64 range, u8 *status);
>   int __zpci_load(u64 *data, u64 req, u64 offset);
>   int zpci_load(u64 *data, const volatile void __iomem *addr, unsigned long len);
>   int __zpci_store(u64 data, u64 req, u64 offset);
> diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
> index a81de48d5ea7..b0a2380bcad8 100644
> --- a/arch/s390/pci/pci_dma.c
> +++ b/arch/s390/pci/pci_dma.c
> @@ -23,8 +23,9 @@ static u32 s390_iommu_aperture_factor = 1;
>   
>   static int zpci_refresh_global(struct zpci_dev *zdev)
>   {
> +	u8 status;
>   	return zpci_refresh_trans((u64) zdev->fh << 32, zdev->start_dma,
> -				  zdev->iommu_pages * PAGE_SIZE);
> +				  zdev->iommu_pages * PAGE_SIZE, &status);
>   }
>   
>   unsigned long *dma_alloc_cpu_table(void)
> @@ -183,6 +184,7 @@ static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr,
>   			   size_t size, int flags)
>   {
>   	unsigned long irqflags;
> +	u8 status;
>   	int ret;
>   
>   	/*
> @@ -201,7 +203,7 @@ static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr,
>   	}
>   
>   	ret = zpci_refresh_trans((u64) zdev->fh << 32, dma_addr,
> -				 PAGE_ALIGN(size));
> +				 PAGE_ALIGN(size), &status);
>   	if (ret == -ENOMEM && !s390_iommu_strict) {
>   		/* enable the hypervisor to free some resources */
>   		if (zpci_refresh_global(zdev))
> diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
> index 0509554301c7..ca6399d52767 100644
> --- a/arch/s390/pci/pci_insn.c
> +++ b/arch/s390/pci/pci_insn.c
> @@ -77,20 +77,20 @@ static inline u8 __rpcit(u64 fn, u64 addr, u64 range, u8 *status)
>   	return cc;
>   }
>   
> -int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
> +int zpci_refresh_trans(u64 fn, u64 addr, u64 range, u8 *status)
>   {
> -	u8 cc, status;
> +	u8 cc;
>   
>   	do {
> -		cc = __rpcit(fn, addr, range, &status);
> +		cc = __rpcit(fn, addr, range, status);
>   		if (cc == 2)
>   			udelay(ZPCI_INSN_BUSY_DELAY);
>   	} while (cc == 2);
>   
>   	if (cc)
> -		zpci_err_insn(cc, status, addr, range);
> +		zpci_err_insn(cc, *status, addr, range);
>   
> -	if (cc == 1 && (status == 4 || status == 16))
> +	if (cc == 1 && (*status == 4 || *status == 16))
>   		return -ENOMEM;
>   
>   	return (cc) ? -EIO : 0;
> diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
> index 50860ebdd087..845bb99c183e 100644
> --- a/drivers/iommu/s390-iommu.c
> +++ b/drivers/iommu/s390-iommu.c
> @@ -214,6 +214,7 @@ static int s390_iommu_update_trans(struct s390_domain *s390_domain,
>   	unsigned long irq_flags, nr_pages, i;
>   	unsigned long *entry;
>   	int rc = 0;
> +	u8 status;
>   
>   	if (dma_addr < s390_domain->domain.geometry.aperture_start ||
>   	    dma_addr + size > s390_domain->domain.geometry.aperture_end)
> @@ -238,7 +239,8 @@ static int s390_iommu_update_trans(struct s390_domain *s390_domain,
>   	spin_lock(&s390_domain->list_lock);
>   	list_for_each_entry(domain_device, &s390_domain->devices, list) {
>   		rc = zpci_refresh_trans((u64) domain_device->zdev->fh << 32,
> -					start_dma_addr, nr_pages * PAGE_SIZE);
> +					start_dma_addr, nr_pages * PAGE_SIZE,
> +					&status);
>   		if (rc)
>   			break;
>   	}
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 21/30] KVM: s390: pci: handle refresh of PCI translations
  2022-01-19 16:39     ` Matthew Rosato
@ 2022-01-19 18:25       ` Pierre Morel
  2022-01-19 20:02         ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-19 18:25 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/19/22 17:39, Matthew Rosato wrote:
> On 1/19/22 4:29 AM, Pierre Morel wrote:
>>
>>
>> On 1/14/22 21:31, Matthew Rosato wrote:
> ...
>>> +static int dma_table_shadow(struct kvm_vcpu *vcpu, struct zpci_dev 
>>> *zdev,
>>> +                dma_addr_t dma_addr, size_t size)
>>> +{
>>> +    unsigned int nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>>> +    struct kvm_zdev *kzdev = zdev->kzdev;
>>> +    unsigned long *entry, *gentry;
>>> +    int i, rc = 0, rc2;
>>> +
>>> +    if (!nr_pages || !kzdev)
>>> +        return -EINVAL;
>>> +
>>> +    mutex_lock(&kzdev->ioat.lock);
>>> +    if (!zdev->dma_table || !kzdev->ioat.head[0]) {
>>> +        rc = -EINVAL;
>>> +        goto out_unlock;
>>> +    }
>>> +
>>> +    for (i = 0; i < nr_pages; i++) {
>>> +        gentry = dma_walk_guest_cpu_trans(vcpu, &kzdev->ioat, 
>>> dma_addr);
>>> +        if (!gentry)
>>> +            continue;
>>> +        entry = dma_walk_cpu_trans(zdev->dma_table, dma_addr);
>>> +
>>> +        if (!entry) {
>>> +            rc = -ENOMEM;
>>> +            goto out_unlock;
>>> +        }
>>> +
>>> +        rc2 = dma_shadow_cpu_trans(vcpu, entry, gentry);
>>> +        if (rc2 < 0) {
>>> +            rc = -EIO;
>>> +            goto out_unlock;
>>> +        }
>>> +        dma_addr += PAGE_SIZE;
>>> +        rc += rc2;
>>> +    }
>>> +
>>
>> In case of error, shouldn't we invalidate the shadow tables entries we 
>> did validate until the error?
> 
> Hmm, I don't think this is strictly necessary - the status returned 
> should indicate the specified DMA range is now in an indeterminate state 
> (putting the onus on the guest to take corrective action via a global 
> refresh).
> 
> In fact I think I screwed that up below in kvm_s390_pci_refresh_trans, 
> the fabricated status should always be KVM_S390_RPCIT_INS_RES.

OK
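
A minimal sketch of what that simplification might look like, assuming the switch statement is dropped as described: any negative return from the shadow operation fabricates KVM_S390_RPCIT_INS_RES, a positive return means a host RPCIT must still be issued, and zero means nothing further is needed. The constant comes from the patch's pci.h hunk; the enum and the helper rpcit_after_shadow() are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Status value from the pci.h hunk of the patch under review. */
#define KVM_S390_RPCIT_INS_RES 0x10

/* Hypothetical outcome codes, for illustration only. */
enum rpcit_action {
	RPCIT_DONE,		/* nothing was shadowed, no host RPCIT needed */
	RPCIT_ISSUE_HOST,	/* entries were shadowed, host RPCIT must follow */
	RPCIT_FAILED,		/* shadowing failed, status was fabricated */
};

/* Hypothetical helper: decide the next step from the shadow return code. */
static enum rpcit_action rpcit_after_shadow(int shadow_rc, uint8_t *status)
{
	if (shadow_rc < 0) {
		/* Range is indeterminate; guest is expected to do a global refresh. */
		*status = KVM_S390_RPCIT_INS_RES;
		return RPCIT_FAILED;
	}
	return shadow_rc > 0 ? RPCIT_ISSUE_HOST : RPCIT_DONE;
}
```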

> 
>>
>>> +out_unlock:
>>> +    mutex_unlock(&kzdev->ioat.lock);
>>> +    return rc;
>>> +}
>>> +
>>> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long 
>>> req,
>>> +                   unsigned long start, unsigned long size,
>>> +                   u8 *status)
>>> +{
>>> +    struct zpci_dev *zdev;
>>> +    u32 fh = req >> 32;
>>> +    int rc;
>>> +
>>> +    /* Make sure this is a valid device associated with this guest */
>>> +    zdev = get_zdev_by_fh(fh);
>>> +    if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm) {
>>> +        *status = 0;
>>
>> Wouldn't it be interesting to add some debug information here.
>> When would this appear?
> 
> Yes, I agree -- One of the follow-ons I'd like to add after this series 
> is s390dbf entries; this seems like a good spot for one.
> 
> As to when this could happen; it should not under normal circumstances, 
> but consider something like arbitrary function handles coming from the 
> intercepted guest instruction.  We need to ensure that the specified 
> function 1) exists and 2) is associated with the guest issuing the refresh.
> 
>>
>> Also if we have this error this looks like we have a VM problem, 
>> shouldn't we treat this in QEMU and return -EOPNOTSUPP ?
>>
> 
> Well, I'm not sure if we can really tell where the problem is (it could 
> for example indicate a misbehaving guest, or a bug in our KVM tracking 
> of hostdevs).
> 
> The guest chose the function handle, and if we got here then that means 
> it doesn't indicate that it's an emulated device, which means either we 
> are using the assist and KVM should handle the intercept or we are not 
> and userspace should handle it.  But in both of those cases, there 
> should be a host device and it should be associated with the guest.

That is right if we cannot find an associated zdev = F(fh),
but the other two errors are KVM or QEMU errors AFAIU.

> 
> I think if we decide to throw this to userspace in this event, QEMU 
> needs some extra code to handle it (basically, if QEMU receives the 
> intercept and the device is neither emulated nor using intercept mode 
> then we must treat as an invalid handle as this intercept should have 
> been handled by KVM)

I do not want to start a discussion on this; I think we can leave it like 
this for now and come back to it when we have a good idea of how to 
handle it.
Maybe just add a /* TODO */.


> 
> 
>>> +        return -EINVAL;
>>> +    }
>>> +
>>> +    /* Only proceed if the device is using the assist */
>>> +    if (zdev->kzdev->ioat.head[0] == 0)
>>> +        return -EOPNOTSUPP;
>>> +
>>> +    rc = dma_table_shadow(vcpu, zdev, start, size);
>>> +    if (rc < 0) {
>>> +        /*
>>> +         * If errors encountered during shadow operations, we must
>>> +         * fabricate status to present to the guest
>>> +         */
>>> +        switch (rc) {
>>> +        case -ENOMEM:
>>> +            *status = KVM_S390_RPCIT_INS_RES;
>>> +            break;
>>> +        default:
>>> +            *status = KVM_S390_RPCIT_ERR;
>>> +            break;
> 
> As mentioned above I think this switch statement should go away and 
> instead always set KVM_S390_RPCIT_INS_RES when rc < 0.
> 
>>> +        }
>>> +    } else if (rc > 0) {
>>> +        /* Host RPCIT must be issued */
>>> +        rc = zpci_refresh_trans((u64) zdev->fh << 32, start, size,
>>> +                    status);
>>> +    }
>>> +    zdev->kzdev->rpcit_count++;
>>> +
>>> +    return rc;
>>> +}
>>> +
>>>   /* Modify PCI: Register floating adapter interruption forwarding */
>>>   static int kvm_zpci_set_airq(struct zpci_dev *zdev)
>>>   {
>>> @@ -620,6 +822,8 @@ EXPORT_SYMBOL_GPL(kvm_s390_pci_attach_kvm);
>>>   int kvm_s390_pci_init(void)
>>>   {
>>> +    int rc;
>>> +
>>>       aift = kzalloc(sizeof(struct zpci_aift), GFP_KERNEL);
>>>       if (!aift)
>>>           return -ENOMEM;
>>> @@ -627,5 +831,7 @@ int kvm_s390_pci_init(void)
>>>       spin_lock_init(&aift->gait_lock);
>>>       mutex_init(&aift->lock);
>>> -    return 0;
>>> +    rc = zpci_get_mdd(&aift->mdd);
>>> +
>>> +    return rc;
>>>   }
>>> diff --git a/arch/s390/kvm/pci.h b/arch/s390/kvm/pci.h
>>> index 54355634df82..bb2be7fc3934 100644
>>> --- a/arch/s390/kvm/pci.h
>>> +++ b/arch/s390/kvm/pci.h
>>> @@ -18,6 +18,9 @@
>>>   #define KVM_S390_PCI_DTSM_MASK 0x40
>>> +#define KVM_S390_RPCIT_INS_RES 0x10
>>> +#define KVM_S390_RPCIT_ERR 0x28
>>> +
>>>   struct zpci_gaite {
>>>       u32 gisa;
>>>       u8 gisc;
>>> @@ -33,6 +36,7 @@ struct zpci_aift {
>>>       struct kvm_zdev **kzdev;
>>>       spinlock_t gait_lock; /* Protects the gait, used during AEN 
>>> forward */
>>>       struct mutex lock; /* Protects the other structures in aift */
>>> +    u32 mdd;
>>>   };
>>>   extern struct zpci_aift *aift;
>>> @@ -47,7 +51,9 @@ static inline struct kvm 
>>> *kvm_s390_pci_si_to_kvm(struct zpci_aift *aift,
>>>   int kvm_s390_pci_aen_init(u8 nisc);
>>>   void kvm_s390_pci_aen_exit(void);
>>> -
>>> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long 
>>> req,
>>> +                   unsigned long start, unsigned long end,
>>> +                   u8 *status);
>>>   int kvm_s390_pci_init(void);
>>>   #endif /* __KVM_S390_PCI_H */
>>>
>>
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 21/30] KVM: s390: pci: handle refresh of PCI translations
  2022-01-19 18:25       ` Pierre Morel
@ 2022-01-19 20:02         ` Matthew Rosato
  2022-01-20  9:47           ` Pierre Morel
  0 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-19 20:02 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/19/22 1:25 PM, Pierre Morel wrote:
> 
> 
> On 1/19/22 17:39, Matthew Rosato wrote:
>> On 1/19/22 4:29 AM, Pierre Morel wrote:
>>>
>>>
>>> On 1/14/22 21:31, Matthew Rosato wrote:
>> ...
>>>> +static int dma_table_shadow(struct kvm_vcpu *vcpu, struct zpci_dev 
>>>> *zdev,
>>>> +                dma_addr_t dma_addr, size_t size)
>>>> +{
>>>> +    unsigned int nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>>>> +    struct kvm_zdev *kzdev = zdev->kzdev;
>>>> +    unsigned long *entry, *gentry;
>>>> +    int i, rc = 0, rc2;
>>>> +
>>>> +    if (!nr_pages || !kzdev)
>>>> +        return -EINVAL;
>>>> +
>>>> +    mutex_lock(&kzdev->ioat.lock);
>>>> +    if (!zdev->dma_table || !kzdev->ioat.head[0]) {
>>>> +        rc = -EINVAL;
>>>> +        goto out_unlock;
>>>> +    }
>>>> +
>>>> +    for (i = 0; i < nr_pages; i++) {
>>>> +        gentry = dma_walk_guest_cpu_trans(vcpu, &kzdev->ioat, 
>>>> dma_addr);
>>>> +        if (!gentry)
>>>> +            continue;
>>>> +        entry = dma_walk_cpu_trans(zdev->dma_table, dma_addr);
>>>> +
>>>> +        if (!entry) {
>>>> +            rc = -ENOMEM;
>>>> +            goto out_unlock;
>>>> +        }
>>>> +
>>>> +        rc2 = dma_shadow_cpu_trans(vcpu, entry, gentry);
>>>> +        if (rc2 < 0) {
>>>> +            rc = -EIO;
>>>> +            goto out_unlock;
>>>> +        }
>>>> +        dma_addr += PAGE_SIZE;
>>>> +        rc += rc2;
>>>> +    }
>>>> +
>>>
>>> In case of error, shouldn't we invalidate the shadow tables entries 
>>> we did validate until the error?
>>
>> Hmm, I don't think this is strictly necessary - the status returned 
>> should indicate the specified DMA range is now in an indeterminate 
>> state (putting the onus on the guest to take corrective action via a 
>> global refresh).
>>
>> In fact I think I screwed that up below in kvm_s390_pci_refresh_trans, 
>> the fabricated status should always be KVM_S390_RPCIT_INS_RES.
> 
> OK
> 
>>
>>>
>>>> +out_unlock:
>>>> +    mutex_unlock(&kzdev->ioat.lock);
>>>> +    return rc;
>>>> +}
>>>> +
>>>> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned long 
>>>> req,
>>>> +                   unsigned long start, unsigned long size,
>>>> +                   u8 *status)
>>>> +{
>>>> +    struct zpci_dev *zdev;
>>>> +    u32 fh = req >> 32;
>>>> +    int rc;
>>>> +
>>>> +    /* Make sure this is a valid device associated with this guest */
>>>> +    zdev = get_zdev_by_fh(fh);
>>>> +    if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm) {
>>>> +        *status = 0;
>>>
>>> Wouldn't it be interesting to add some debug information here.
>>> When would this appear?
>>
>> Yes, I agree -- One of the follow-ons I'd like to add after this 
>> series is s390dbf entries; this seems like a good spot for one.
>>
>> As to when this could happen; it should not under normal 
>> circumstances, but consider something like arbitrary function handles 
>> coming from the intercepted guest instruction.  We need to ensure that 
>> the specified function 1) exists and 2) is associated with the guest 
>> issuing the refresh.
>>
>>>
>>> Also if we have this error this looks like we have a VM problem, 
>>> shouldn't we treat this in QEMU and return -EOPNOTSUPP ?
>>>
>>
>> Well, I'm not sure if we can really tell where the problem is (it 
>> could for example indicate a misbehaving guest, or a bug in our KVM 
>> tracking of hostdevs).
>>
>> The guest chose the function handle, and if we got here then that 
>> means it doesn't indicate that it's an emulated device, which means 
>> either we are using the assist and KVM should handle the intercept or 
>> we are not and userspace should handle it.  But in both of those 
>> cases, there should be a host device and it should be associated with 
>> the guest.
> 
> That is right if we can not find an associated zdev = F(fh)
> but the two other errors are KVM or QEMU errors AFAIU.

I don't think we know for sure for any of the cases...  For a 
well-behaved guest I agree with your assessment.  However, the guest 
decides what fh to put into its refresh instruction and so a misbehaving 
guest could just pick arbitrary numbers for fh and circumstantially 
match some other host device.  What if the guest just decided to try 
every single possible fh number in a loop with a refresh instruction? 
That's neither KVM nor QEMU's fault but can trip each of these cases.

Consider the different cases:

!zdev - Either the guest provided a bogus fh, KVM provided a bad fh via 
the VFIO ioctl which then QEMU fed into CLP or KVM provided the right fh 
via ioctl but QEMU clobbered it when providing it to the guest via CLP.

!zdev->kzdev - Either the guest provided a bogus fh that just so
happened to match a host fh that has no KVM association, or KVM or QEMU
screwed up somewhere (as above or because we failed to make the KVM
association somehow).

kzdev->kvm != vcpu->kvm - Pretty much the same as above, but the
matching device is actually in use by some other guest.  Again it's
possible that a misbehaving guest 'got lucky' with an arbitrary fh that
happened to match a host fh with an existing KVM association -- or more
likely that KVM or QEMU screwed up somewhere.
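
The three cases above can be told apart with a check order like the one
in the following minimal sketch.  It is not the patch code: the structs
are cut-down stand-ins, and lookup_fh(), classify_fh() and enum fh_check
are hypothetical names used only for illustration (lookup_fh() stands in
for get_zdev_by_fh()).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; only the fields used here. */
struct kvm { int id; };
struct kvm_zdev { struct kvm *kvm; };
struct zpci_dev { uint32_t fh; struct kvm_zdev *kzdev; };

/* Hypothetical result codes, one per case discussed above. */
enum fh_check {
	FH_OK,			/* host device exists and belongs to this guest */
	FH_NO_HOST_DEV,		/* !zdev */
	FH_NO_KVM_ASSOC,	/* !zdev->kzdev */
	FH_OTHER_GUEST,		/* kzdev->kvm != vcpu->kvm */
};

/* Hypothetical lookup over a fixed table, standing in for get_zdev_by_fh(). */
static struct zpci_dev *lookup_fh(struct zpci_dev *tbl, size_t n, uint32_t fh)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].fh == fh)
			return &tbl[i];
	return NULL;
}

/* The fh is a host handle chosen by the guest, so every step is checked. */
static enum fh_check classify_fh(struct zpci_dev *tbl, size_t n,
				 uint32_t fh, const struct kvm *guest)
{
	struct zpci_dev *zdev = lookup_fh(tbl, n, fh);

	if (!zdev)
		return FH_NO_HOST_DEV;
	if (!zdev->kzdev)
		return FH_NO_KVM_ASSOC;
	if (zdev->kzdev->kvm != guest)
		return FH_OTHER_GUEST;
	return FH_OK;
}
```

A guest looping over arbitrary fh values can reach any of the three
failure results without KVM or QEMU being at fault, which is why none of
them can be attributed to one component with certainty.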

> 
>>
>> I think if we decide to throw this to userspace in this event, QEMU 
>> needs some extra code to handle it (basically, if QEMU receives the 
>> intercept and the device is neither emulated nor using intercept mode 
>> then we must treat as an invalid handle as this intercept should have 
>> been handled by KVM)
> 
> I do not want to start a discussion on this, I think we can let it like 
> this at first and come back to it when we have a good idea on how to 
> handle this.
> May be just add a /* TODO */

OK, sure.  In any of the above cases, we are certainly done in KVM 
anyway.  Whether there's value in passing it onto userspace vs 
immediately giving an error, let's think about it.


* Re: [PATCH v2 15/30] KVM: s390: pci: do initial setup for AEN interpretation
  2022-01-19 18:06   ` Pierre Morel
@ 2022-01-19 20:19     ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-19 20:19 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/19/22 1:06 PM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> Initial setup for Adapter Event Notification Interpretation for zPCI
>> passthrough devices.  Specifically, allocate a structure for 
>> forwarding of
>> adapter events and pass the address of this structure to firmware.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/pci.h      |   4 +
>>   arch/s390/include/asm/pci_insn.h |  12 +++
>>   arch/s390/kvm/interrupt.c        |  14 +++
>>   arch/s390/kvm/kvm-s390.c         |   9 ++
>>   arch/s390/kvm/pci.c              | 144 +++++++++++++++++++++++++++++++
>>   arch/s390/kvm/pci.h              |  42 +++++++++
>>   arch/s390/pci/pci.c              |   6 ++
>>   7 files changed, 231 insertions(+)
>>   create mode 100644 arch/s390/kvm/pci.h
>>
>> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
>> index 9b6c657d8d31..9ff8dc19975e 100644
>> --- a/arch/s390/include/asm/pci.h
>> +++ b/arch/s390/include/asm/pci.h
>> @@ -9,6 +9,7 @@
>>   #include <asm-generic/pci.h>
>>   #include <asm/pci_clp.h>
>>   #include <asm/pci_debug.h>
>> +#include <asm/pci_insn.h>
>>   #include <asm/sclp.h>
>>   #define PCIBIOS_MIN_IO        0x1000
>> @@ -204,6 +205,9 @@ extern const struct attribute_group 
>> *zpci_attr_groups[];
>>   extern unsigned int s390_pci_force_floating __initdata;
>>   extern unsigned int s390_pci_no_rid;
>> +extern union zpci_sic_iib *zpci_aipb;
>> +extern struct airq_iv *zpci_aif_sbv;
>> +
>>   /* 
>> ----------------------------------------------------------------------------- 
>>
>>     Prototypes
>>   
>> ----------------------------------------------------------------------------- 
>> */
>> diff --git a/arch/s390/include/asm/pci_insn.h 
>> b/arch/s390/include/asm/pci_insn.h
>> index 32759c407b8f..ad9000295c82 100644
>> --- a/arch/s390/include/asm/pci_insn.h
>> +++ b/arch/s390/include/asm/pci_insn.h
>> @@ -101,6 +101,7 @@ struct zpci_fib {
>>   /* Set Interruption Controls Operation Controls  */
>>   #define    SIC_IRQ_MODE_ALL        0
>>   #define    SIC_IRQ_MODE_SINGLE        1
>> +#define    SIC_SET_AENI_CONTROLS        2
>>   #define    SIC_IRQ_MODE_DIRECT        4
>>   #define    SIC_IRQ_MODE_D_ALL        16
>>   #define    SIC_IRQ_MODE_D_SINGLE        17
>> @@ -127,9 +128,20 @@ struct zpci_cdiib {
>>       u64 : 64;
>>   } __packed __aligned(8);
>> +/* adapter interruption parameters block */
>> +struct zpci_aipb {
>> +    u64 faisb;
>> +    u64 gait;
>> +    u16 : 13;
>> +    u16 afi : 3;
>> +    u32 : 32;
>> +    u16 faal;
>> +} __packed __aligned(8);
>> +
>>   union zpci_sic_iib {
>>       struct zpci_diib diib;
>>       struct zpci_cdiib cdiib;
>> +    struct zpci_aipb aipb;
>>   };
>>   DECLARE_STATIC_KEY_FALSE(have_mio);
>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>> index f9b872e358c6..a591b8cd662f 100644
>> --- a/arch/s390/kvm/interrupt.c
>> +++ b/arch/s390/kvm/interrupt.c
>> @@ -32,6 +32,7 @@
>>   #include "kvm-s390.h"
>>   #include "gaccess.h"
>>   #include "trace-s390.h"
>> +#include "pci.h"
>>   #define PFAULT_INIT 0x0600
>>   #define PFAULT_DONE 0x0680
>> @@ -3278,6 +3279,11 @@ void kvm_s390_gib_destroy(void)
>>   {
>>       if (!gib)
>>           return;
>> +    if (IS_ENABLED(CONFIG_PCI) && sclp.has_aeni && aift) {
>> +        mutex_lock(&aift->lock);
>> +        kvm_s390_pci_aen_exit();
>> +        mutex_unlock(&aift->lock);
>> +    }
>>       chsc_sgib(0);
>>       unregister_adapter_interrupt(&gib_alert_irq);
>>       free_page((unsigned long)gib);
>> @@ -3315,6 +3321,14 @@ int kvm_s390_gib_init(u8 nisc)
>>           goto out_unreg_gal;
>>       }
>> +    if (IS_ENABLED(CONFIG_PCI) && sclp.has_aeni) {
>> +        if (kvm_s390_pci_aen_init(nisc)) {
>> +            pr_err("Initializing AEN for PCI failed\n");
>> +            rc = -EIO;
>> +            goto out_unreg_gal;
>> +        }
>> +    }
>> +
>>       KVM_EVENT(3, "gib 0x%pK (nisc=%d) initialized", gib, gib->nisc);
>>       goto out;
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 14a18ba5ff2c..01dc3f6883d0 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -48,6 +48,7 @@
>>   #include <asm/fpu/api.h>
>>   #include "kvm-s390.h"
>>   #include "gaccess.h"
>> +#include "pci.h"
>>   #define CREATE_TRACE_POINTS
>>   #include "trace.h"
>> @@ -503,6 +504,14 @@ int kvm_arch_init(void *opaque)
>>           goto out;
>>       }
>> +    if (IS_ENABLED(CONFIG_PCI)) {
>> +        rc = kvm_s390_pci_init();
>> +        if (rc) {
>> +            pr_err("Unable to allocate AIFT for PCI\n");
>> +            goto out;
>> +        }
>> +    }
>> +
>>       rc = kvm_s390_gib_init(GAL_ISC);
>>       if (rc)
>>           goto out;
>> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
>> index 1c33bc7bf2bd..dae853da6df1 100644
>> --- a/arch/s390/kvm/pci.c
>> +++ b/arch/s390/kvm/pci.c
>> @@ -10,6 +10,138 @@
>>   #include <linux/kvm_host.h>
>>   #include <linux/pci.h>
>>   #include <asm/kvm_pci.h>
>> +#include <asm/pci.h>
>> +#include <asm/pci_insn.h>
>> +#include "pci.h"
>> +
>> +struct zpci_aift *aift;
>> +
>> +static inline int __set_irq_noiib(u16 ctl, u8 isc)
>> +{
>> +    union zpci_sic_iib iib = {{0}};
>> +
>> +    return zpci_set_irq_ctrl(ctl, isc, &iib);
>> +}
>> +
>> +/* Caller must hold the aift lock before calling this function */
>> +void kvm_s390_pci_aen_exit(void)
>> +{
>> +    unsigned long flags;
>> +    struct kvm_zdev **gait_kzdev;
>> +
>> +    /*
>> +     * Contents of the aipb remain registered for the life of the host
>> +     * kernel, the information preserved in zpci_aipb and zpci_aif_sbv
>> +     * in case we insert the KVM module again later.  Clear the AIFT
>> +     * information and free anything not registered with underlying
>> +     * firmware.
>> +     */
>> +    spin_lock_irqsave(&aift->gait_lock, flags);
>> +    gait_kzdev = aift->kzdev;
>> +    aift->gait = 0;
>> +    aift->sbv = 0;
>> +    aift->kzdev = 0;
>> +    spin_unlock_irqrestore(&aift->gait_lock, flags);
>> +
>> +    kfree(gait_kzdev);
>> +}
>> +
>> +int kvm_s390_pci_aen_init(u8 nisc)
>> +{
>> +    struct page *page;
>> +    int rc = 0, size;
>> +    bool first = false;
>> +
>> +    /* If already enabled for AEN, bail out now */
>> +    if (aift->gait || aift->sbv)
>> +        return -EPERM;
>> +
>> +    mutex_lock(&aift->lock);
>> +    aift->kzdev = kcalloc(ZPCI_NR_DEVICES, sizeof(struct kvm_zdev),
>> +                  GFP_KERNEL);
>> +    if (!aift->kzdev) {
>> +        rc = -ENOMEM;
>> +        goto unlock;
>> +    }
>> +
>> +    if (!zpci_aipb) {
> 
> I think you should externalize all this allocation and setup of aipb
> in a dedicated function zpci_setup_aipb()
> from here ----->
> 
>> +        zpci_aipb = kzalloc(sizeof(union zpci_sic_iib), GFP_KERNEL);
>> +        if (!zpci_aipb) {
>> +            rc = -ENOMEM;
>> +            goto free_zdev;
>> +        }
>> +        first = true;
>> +        aift->sbv = airq_iv_create(ZPCI_NR_DEVICES, AIRQ_IV_ALLOC, 0);
>> +        if (!aift->sbv) {
>> +            rc = -ENOMEM;
>> +            goto free_aipb;
>> +        }
>> +        zpci_aif_sbv = aift->sbv;
>> +        size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
>> +                        sizeof(struct zpci_gaite)));
>> +        page = alloc_pages(GFP_KERNEL | __GFP_ZERO, size);
>> +        if (!page) {
>> +            rc = -ENOMEM;
>> +            goto free_sbv;
>> +        }
>> +        aift->gait = (struct zpci_gaite *)page_to_phys(page);
>> +
>> +        zpci_aipb->aipb.faisb = virt_to_phys(aift->sbv->vector);
>> +        zpci_aipb->aipb.gait = virt_to_phys(aift->gait);
>> +        zpci_aipb->aipb.afi = nisc;
>> +        zpci_aipb->aipb.faal = ZPCI_NR_DEVICES;
>> +
>> +        /* Setup Adapter Event Notification Interpretation */
>> +        if (zpci_set_irq_ctrl(SIC_SET_AENI_CONTROLS, 0, zpci_aipb)) {
>> +            rc = -EIO;
>> +            goto free_gait;
> 
> to here---->
> 
>> +        }
>> +    } else {
>> +        /*
>> +         * AEN registration can only happen once per system boot.  If
>> +         * an aipb already exists then AEN was already registered and
>> +         * we can re-use the aipb contents.  This can only happen if
>> +         * the KVM module was removed and re-inserted.
>> +         */
>> +        if (zpci_aipb->aipb.afi != nisc ||
>> +            zpci_aipb->aipb.faal != ZPCI_NR_DEVICES) {
>> +            rc = -EINVAL;
>> +            goto free_zdev;
>> +        }
>> +        aift->sbv = zpci_aif_sbv;
>> +        aift->gait = (struct zpci_gaite *)zpci_aipb->aipb.gait;
>> +    }
>> +
>> +    /* Enable floating IRQs */
>> +    if (__set_irq_noiib(SIC_IRQ_MODE_SINGLE, nisc)) {
>> +        rc = -EIO;
>> +        kvm_s390_pci_aen_exit();
>> +    }
>> +
>> +    goto unlock;
> 
> and the according errors
> 
> here ---->
>> +
>> +free_gait:
>> +    size = get_order(PAGE_ALIGN(ZPCI_NR_DEVICES *
>> +                    sizeof(struct zpci_gaite)));
>> +    free_pages((unsigned long)aift->gait, size);
>> +free_sbv:
>> +    if (first) {
>> +        /* If AEN setup failed, only free a newly-allocated sbv */
>> +        airq_iv_release(aift->sbv);
>> +        zpci_aif_sbv = 0;
>> +    }
>> +free_aipb:
>> +    if (first) {
>> +        /* If AEN setup failed, only free a newly-allocated aipb */
>> +        kfree(zpci_aipb);
>> +        zpci_aipb = 0;
>> +    }
> 
> to here ---->
> 
> To simplify the understanding.
> 
>> +free_zdev:
>> +    kfree(aift->kzdev);
>> +unlock:
>> +    mutex_unlock(&aift->lock);
>> +    return rc;
>> +}
>>
> 
> ... snip...
> 
> The second part of the if(aipb) else
> could also be externalise in zpci_reset_aipb()
> 
> which leads to
> 
>     if(!aipb)
>      ret = zpci_setup_aipb()
>     else
>      ret = zpci_reset_aipb()
> 
>     if (ret)
>      goto cleanup;
> 
>      enable_irq()
>      goto unlock;
> 
> I think that if we can do that it would be much clearer.
> what do you think?
> 

Yup, that sounds good.  I will re-organize with 2 new static functions,
zpci_setup_aipb() and zpci_reset_aipb().
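
For reference, the control flow proposed above could look roughly like
the sketch below.  This is only an illustration under assumptions: the
bodies of zpci_setup_aipb() and zpci_reset_aipb() are stubs, and
enable_floating_irqs(), aen_cleanup(), aen_init_sketch() and the fail_*
test knobs are hypothetical names.  Locking and the kzdev allocation
from the original function are left out.

```c
#include <stdbool.h>
#include <errno.h>

/* Hypothetical stand-ins: stubs replace the real allocation and firmware calls. */
static bool aipb_registered;			/* models "zpci_aipb != NULL" */
static bool fail_setup, fail_reset, fail_irq;	/* knobs to force each error path */
static int cleanups;				/* counts cleanup invocations */

/* First registration: would allocate aipb, sbv and gait and register them. */
static int zpci_setup_aipb(void)
{
	if (fail_setup)
		return -ENOMEM;
	aipb_registered = true;
	return 0;
}

/* Re-insertion of the module: would validate and re-use the existing aipb. */
static int zpci_reset_aipb(void)
{
	return fail_reset ? -EINVAL : 0;
}

static int enable_floating_irqs(void)
{
	return fail_irq ? -EIO : 0;
}

static void aen_cleanup(void)
{
	cleanups++;
}

static int aen_init_sketch(void)
{
	int rc;

	/* One helper per branch keeps allocation and unwinding together. */
	rc = aipb_registered ? zpci_reset_aipb() : zpci_setup_aipb();
	if (rc)
		goto cleanup;

	rc = enable_floating_irqs();
	if (rc)
		goto cleanup;

	return 0;

cleanup:
	aen_cleanup();
	return rc;
}
```

With this shape each helper owns its own unwinding, so the caller no
longer needs the "first" flag or the chain of free_* labels.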


* Re: [PATCH v2 21/30] KVM: s390: pci: handle refresh of PCI translations
  2022-01-19 20:02         ` Matthew Rosato
@ 2022-01-20  9:47           ` Pierre Morel
  0 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-20  9:47 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/19/22 21:02, Matthew Rosato wrote:
> On 1/19/22 1:25 PM, Pierre Morel wrote:
>>
>>
>> On 1/19/22 17:39, Matthew Rosato wrote:
>>> On 1/19/22 4:29 AM, Pierre Morel wrote:
>>>>
>>>>
>>>> On 1/14/22 21:31, Matthew Rosato wrote:
>>> ...
>>>>> +static int dma_table_shadow(struct kvm_vcpu *vcpu, struct zpci_dev 
>>>>> *zdev,
>>>>> +                dma_addr_t dma_addr, size_t size)
>>>>> +{
>>>>> +    unsigned int nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
>>>>> +    struct kvm_zdev *kzdev = zdev->kzdev;
>>>>> +    unsigned long *entry, *gentry;
>>>>> +    int i, rc = 0, rc2;
>>>>> +
>>>>> +    if (!nr_pages || !kzdev)
>>>>> +        return -EINVAL;
>>>>> +
>>>>> +    mutex_lock(&kzdev->ioat.lock);
>>>>> +    if (!zdev->dma_table || !kzdev->ioat.head[0]) {
>>>>> +        rc = -EINVAL;
>>>>> +        goto out_unlock;
>>>>> +    }
>>>>> +
>>>>> +    for (i = 0; i < nr_pages; i++) {
>>>>> +        gentry = dma_walk_guest_cpu_trans(vcpu, &kzdev->ioat, 
>>>>> dma_addr);
>>>>> +        if (!gentry)
>>>>> +            continue;
>>>>> +        entry = dma_walk_cpu_trans(zdev->dma_table, dma_addr);
>>>>> +
>>>>> +        if (!entry) {
>>>>> +            rc = -ENOMEM;
>>>>> +            goto out_unlock;
>>>>> +        }
>>>>> +
>>>>> +        rc2 = dma_shadow_cpu_trans(vcpu, entry, gentry);
>>>>> +        if (rc2 < 0) {
>>>>> +            rc = -EIO;
>>>>> +            goto out_unlock;
>>>>> +        }
>>>>> +        dma_addr += PAGE_SIZE;
>>>>> +        rc += rc2;
>>>>> +    }
>>>>> +
>>>>
>>>> In case of error, shouldn't we invalidate the shadow tables entries 
>>>> we did validate until the error?
>>>
>>> Hmm, I don't think this is strictly necessary - the status returned 
>>> should indicate the specified DMA range is now in an indeterminate 
>>> state (putting the onus on the guest to take corrective action via a 
>>> global refresh).
>>>
>>> In fact I think I screwed that up below in 
>>> kvm_s390_pci_refresh_trans, the fabricated status should always be 
>>> KVM_S390_RPCIT_INS_RES.
>>
>> OK
>>
>>>
>>>>
>>>>> +out_unlock:
>>>>> +    mutex_unlock(&kzdev->ioat.lock);
>>>>> +    return rc;
>>>>> +}
>>>>> +
>>>>> +int kvm_s390_pci_refresh_trans(struct kvm_vcpu *vcpu, unsigned 
>>>>> long req,
>>>>> +                   unsigned long start, unsigned long size,
>>>>> +                   u8 *status)
>>>>> +{
>>>>> +    struct zpci_dev *zdev;
>>>>> +    u32 fh = req >> 32;
>>>>> +    int rc;
>>>>> +
>>>>> +    /* Make sure this is a valid device associated with this guest */
>>>>> +    zdev = get_zdev_by_fh(fh);
>>>>> +    if (!zdev || !zdev->kzdev || zdev->kzdev->kvm != vcpu->kvm) {
>>>>> +        *status = 0;
>>>>
>>>> Wouldn't it be interesting to add some debug information here.
>>>> When would this appear?
>>>
>>> Yes, I agree -- One of the follow-ons I'd like to add after this 
>>> series is s390dbf entries; this seems like a good spot for one.
>>>
>>> As to when this could happen; it should not under normal 
>>> circumstances, but consider something like arbitrary function handles 
>>> coming from the intercepted guest instruction.  We need to ensure 
>>> that the specified function 1) exists and 2) is associated with the 
>>> guest issuing the refresh.
>>>
>>>>
>>>> Also if we have this error this looks like we have a VM problem, 
>>>> shouldn't we treat this in QEMU and return -EOPNOTSUPP ?
>>>>
>>>
>>> Well, I'm not sure if we can really tell where the problem is (it 
>>> could for example indicate a misbehaving guest, or a bug in our KVM 
>>> tracking of hostdevs).
>>>
>>> The guest chose the function handle, and if we got here then that 
>>> means it doesn't indicate that it's an emulated device, which means 
>>> either we are using the assist and KVM should handle the intercept or 
>>> we are not and userspace should handle it.  But in both of those 
>>> cases, there should be a host device and it should be associated with 
>>> the guest.
>>
>> That is right if we can not find an associated zdev = F(fh)
>> but the two other errors are KVM or QEMU errors AFAIU.
> 
> I don't think we know for sure for any of the cases...  For a 
> well-behaved guest I agree with your assessment.  However, the guest 
> decides what fh to put into its refresh instruction and so a misbehaving 
> guest could just pick arbitrary numbers for fh and circumstantially 
> match some other host device.  What if the guest just decided to try 
> every single possible fh number in a loop with a refresh instruction? 
> That's neither KVM nor QEMU's fault but can trip each of these cases.
> 
> Consider the different cases:
> 
> !zdev - Either the guest provided a bogus fh, KVM provided a bad fh via 
> the VFIO ioctl which then QEMU fed into CLP or KVM provided the right fh 
> via ioctl but QEMU clobbered it when providing it to the guest via CLP.
> 
> !zdev->kzdev - Either the guest provided a bogus fh that just so
> happened to match a host fh that has no KVM association, or KVM or QEMU
> screwed up somewhere (as above or because we failed to make the KVM
> association somehow).
> 
> kzdev->kvm != vcpu->kvm - Pretty much the same as above, but the
> matching device is actually in use by some other guest.  Again it's
> possible that a misbehaving guest 'got lucky' with an arbitrary fh that
> happened to match a host fh with an existing KVM association -- or more
> likely that KVM or QEMU screwed up somewhere.

OK, I understand and you are right.  My error was to assume that
get_zdev_by_fh() returns a zdev associated with a valid FH for the guest,
while it actually returns a zdev associated with a valid FH for the host.

If the comment had been placed after the get_zdev_by_fh() call and before
the test, I probably would not have made this mistake.

> 
>>
>>>
>>> I think if we decide to throw this to userspace in this event, QEMU 
>>> needs some extra code to handle it (basically, if QEMU receives the 
>>> intercept and the device is neither emulated nor using intercept mode 
>>> then we must treat as an invalid handle as this intercept should have 
>>> been handled by KVM)
>>
>> I do not want to start a discussion on this, I think we can let it 
>> like this at first and come back to it when we have a good idea on how 
>> to handle this.
>> May be just add a /* TODO */
> 
> OK, sure.  In any of the above cases, we are certainly done in KVM 
> anyway.  Whether there's value in passing it onto userspace vs 
> immediately giving an error, let's think about it.

No, I do not think we should anymore.
Sorry for this wrong idea.

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 08/30] s390/pci: stash associated GISA designation
  2022-01-14 20:31 ` [PATCH v2 08/30] s390/pci: stash associated GISA designation Matthew Rosato
@ 2022-01-24 14:08   ` Pierre Morel
  2022-01-24 15:12     ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-24 14:08 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel,
	Christian Borntraeger



On 1/14/22 21:31, Matthew Rosato wrote:
> For passthrough devices, we will need to know the GISA designation of the
> guest if interpretation facilities are to be used.  Setup to stash this in
> the zdev and set a default of 0 (no GISA designation) for now; a subsequent
> patch will set a valid GISA designation for passthrough devices.
> Also, extend mpcific routines to specify this stashed designation as part
> of the mpcific command.
> 
> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
> Reviewed-by: Eric Farman <farman@linux.ibm.com>
> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/pci.h     | 1 +
>   arch/s390/include/asm/pci_clp.h | 3 ++-
>   arch/s390/pci/pci.c             | 6 ++++++
>   arch/s390/pci/pci_clp.c         | 1 +
>   arch/s390/pci/pci_irq.c         | 5 +++++
>   5 files changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> index 90824be5ce9a..2474b8d30f2a 100644
> --- a/arch/s390/include/asm/pci.h
> +++ b/arch/s390/include/asm/pci.h
> @@ -123,6 +123,7 @@ struct zpci_dev {
>   	enum zpci_state state;
>   	u32		fid;		/* function ID, used by sclp */
>   	u32		fh;		/* function handle, used by insn's */
> +	u32		gd;		/* GISA designation for passthrough */

I already gave my R-B and do not want to remove it, but wouldn't it be
possible to use a more explicit name like gisa_designation instead of
just gd?
It would not change the functionality but would make maintenance easier.

>   	u16		vfn;		/* virtual function number */
>   	u16		pchid;		/* physical channel ID */
>   	u8		pfgid;		/* function group ID */
> diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
> index 1f4b666e85ee..3af8d196da74 100644
> --- a/arch/s390/include/asm/pci_clp.h
> +++ b/arch/s390/include/asm/pci_clp.h
> @@ -173,7 +173,8 @@ struct clp_req_set_pci {
>   	u16 reserved2;
>   	u8 oc;				/* operation controls */
>   	u8 ndas;			/* number of dma spaces */
> -	u64 reserved3;
> +	u32 reserved3;
> +	u32 gd;				/* GISA designation */

here too.


>   } __packed;
>   
>   /* Set PCI function response */
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index 792f8e0f2178..0c9879dae752 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -119,6 +119,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
>   	fib.pba = base;
>   	fib.pal = limit;
>   	fib.iota = iota | ZPCI_IOTA_RTTO_FLAG;
> +	fib.gd = zdev->gd;
>   	cc = zpci_mod_fc(req, &fib, &status);
>   	if (cc)
>   		zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
> @@ -132,6 +133,8 @@ int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
>   	struct zpci_fib fib = {0};
>   	u8 cc, status;
>   
> +	fib.gd = zdev->gd;
> +
>   	cc = zpci_mod_fc(req, &fib, &status);
>   	if (cc)
>   		zpci_dbg(3, "unreg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
> @@ -159,6 +162,7 @@ int zpci_fmb_enable_device(struct zpci_dev *zdev)
>   	atomic64_set(&zdev->unmapped_pages, 0);
>   
>   	fib.fmb_addr = virt_to_phys(zdev->fmb);
> +	fib.gd = zdev->gd;
>   	cc = zpci_mod_fc(req, &fib, &status);
>   	if (cc) {
>   		kmem_cache_free(zdev_fmb_cache, zdev->fmb);
> @@ -177,6 +181,8 @@ int zpci_fmb_disable_device(struct zpci_dev *zdev)
>   	if (!zdev->fmb)
>   		return -EINVAL;
>   
> +	fib.gd = zdev->gd;
> +
>   	/* Function measurement is disabled if fmb address is zero */
>   	cc = zpci_mod_fc(req, &fib, &status);
>   	if (cc == 3) /* Function already gone. */
> diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
> index be077b39da33..e9ed0e4a5cf0 100644
> --- a/arch/s390/pci/pci_clp.c
> +++ b/arch/s390/pci/pci_clp.c
> @@ -240,6 +240,7 @@ static int clp_set_pci_fn(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as, u8 comma
>   		rrb->request.fh = zdev->fh;
>   		rrb->request.oc = command;
>   		rrb->request.ndas = nr_dma_as;
> +		rrb->request.gd = zdev->gd;
>   
>   		rc = clp_req(rrb, CLP_LPS_PCI);
>   		if (rrb->response.hdr.rsp == CLP_RC_SETPCIFN_BUSY) {
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 2f675355fd0c..17e5adfe1273 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -43,6 +43,7 @@ static int zpci_set_airq(struct zpci_dev *zdev)
>   	fib.fmt0.aibvo = 0;	/* each zdev has its own interrupt vector */
>   	fib.fmt0.aisb = virt_to_phys(zpci_sbv->vector) + (zdev->aisb / 64) * 8;
>   	fib.fmt0.aisbo = zdev->aisb & 63;
> +	fib.gd = zdev->gd;
>   
>   	return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
>   }
> @@ -54,6 +55,8 @@ static int zpci_clear_airq(struct zpci_dev *zdev)
>   	struct zpci_fib fib = {0};
>   	u8 cc, status;
>   
> +	fib.gd = zdev->gd;
> +
>   	cc = zpci_mod_fc(req, &fib, &status);
>   	if (cc == 3 || (cc == 1 && status == 24))
>   		/* Function already gone or IRQs already deregistered. */
> @@ -72,6 +75,7 @@ static int zpci_set_directed_irq(struct zpci_dev *zdev)
>   	fib.fmt = 1;
>   	fib.fmt1.noi = zdev->msi_nr_irqs;
>   	fib.fmt1.dibvo = zdev->msi_first_bit;
> +	fib.gd = zdev->gd;
>   
>   	return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
>   }
> @@ -84,6 +88,7 @@ static int zpci_clear_directed_irq(struct zpci_dev *zdev)
>   	u8 cc, status;
>   
>   	fib.fmt = 1;
> +	fib.gd = zdev->gd;
>   	cc = zpci_mod_fc(req, &fib, &status);
>   	if (cc == 3 || (cc == 1 && status == 24))
>   		/* Function already gone or IRQs already deregistered. */
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 17/30] KVM: s390: mechanism to enable guest zPCI Interpretation
  2022-01-14 20:31 ` [PATCH v2 17/30] KVM: s390: mechanism to enable guest zPCI Interpretation Matthew Rosato
@ 2022-01-24 14:24   ` Pierre Morel
  2022-01-24 15:28     ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-24 14:24 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> The guest must have access to certain facilities in order to allow
> interpretive execution of zPCI instructions and adapter event
> notifications.  However, there are some cases where a guest might
> disable interpretation -- provide a mechanism via which we can defer
> enabling the associated zPCI interpretation facilities until the guest
> indicates it wishes to use them.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_host.h |  4 ++++
>   arch/s390/kvm/kvm-s390.c         | 40 ++++++++++++++++++++++++++++++++
>   arch/s390/kvm/kvm-s390.h         | 10 ++++++++
>   3 files changed, 54 insertions(+)
> 
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index 3f147b8d050b..38982c1de413 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -252,7 +252,10 @@ struct kvm_s390_sie_block {
>   #define ECB2_IEP	0x20
>   #define ECB2_PFMFI	0x08
>   #define ECB2_ESCA	0x04
> +#define ECB2_ZPCI_LSI	0x02
>   	__u8    ecb2;                   /* 0x0062 */
> +#define ECB3_AISI	0x20
> +#define ECB3_AISII	0x10
>   #define ECB3_DEA 0x08
>   #define ECB3_AES 0x04
>   #define ECB3_RI  0x01
> @@ -938,6 +941,7 @@ struct kvm_arch{
>   	int use_cmma;
>   	int use_pfmfi;
>   	int use_skf;
> +	int use_zpci_interp;
>   	int user_cpu_state_ctrl;
>   	int user_sigp;
>   	int user_stsi;
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index ab8b56deed11..b6c32fc3b272 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -1029,6 +1029,44 @@ static int kvm_s390_vm_set_crypto(struct kvm *kvm, struct kvm_device_attr *attr)
>   	return 0;
>   }
>   
> +static void kvm_s390_vcpu_pci_setup(struct kvm_vcpu *vcpu)
> +{
> +	/* Only set the ECB bits after guest requests zPCI interpretation */
> +	if (!vcpu->kvm->arch.use_zpci_interp)
> +		return;
> +
> +	vcpu->arch.sie_block->ecb2 |= ECB2_ZPCI_LSI;
> +	vcpu->arch.sie_block->ecb3 |= ECB3_AISII + ECB3_AISI;

As far as I understood, the interpretation is only possible if a gisa 
designation is associated with the PCI function via CLP enable.

Why do we set up the SIE ECB only when the guest requests 
interpretation, and not systematically in vcpu_setup?

If ECB2_ZPCI_LSI, ECB3_AISII or ECB3_AISI have an effect when the gisa 
designation is not specified shouldn't we have a way to clear these bits?

> +}
> +
> +void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm)
> +{
> +	struct kvm_vcpu *vcpu;
> +	int i;
> +
> +	/*
> +	 * If host is configured for PCI and the necessary facilities are
> +	 * available, turn on interpretation for the life of this guest
> +	 */
> +	if (!IS_ENABLED(CONFIG_PCI) || !sclp.has_zpci_lsi || !sclp.has_aisii ||
> +	    !sclp.has_aeni || !sclp.has_aisi)
> +		return;
> +
> +	mutex_lock(&kvm->lock);
> +
> +	kvm->arch.use_zpci_interp = 1;
> +
> +	kvm_s390_vcpu_block_all(kvm);
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		kvm_s390_vcpu_pci_setup(vcpu);
> +		kvm_s390_sync_request(KVM_REQ_VSIE_RESTART, vcpu);
> +	}
> +
> +	kvm_s390_vcpu_unblock_all(kvm);
> +	mutex_unlock(&kvm->lock);
> +}
> +
>   static void kvm_s390_sync_request_broadcast(struct kvm *kvm, int req)
>   {
>   	int cx;
> @@ -3282,6 +3320,8 @@ static int kvm_s390_vcpu_setup(struct kvm_vcpu *vcpu)
>   
>   	kvm_s390_vcpu_crypto_setup(vcpu);
>   
> +	kvm_s390_vcpu_pci_setup(vcpu);
> +
>   	mutex_lock(&vcpu->kvm->lock);
>   	if (kvm_s390_pv_is_protected(vcpu->kvm)) {
>   		rc = kvm_s390_pv_create_cpu(vcpu, &uvrc, &uvrrc);
> diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h
> index c07a050d757d..a2eccb8b977e 100644
> --- a/arch/s390/kvm/kvm-s390.h
> +++ b/arch/s390/kvm/kvm-s390.h
> @@ -481,6 +481,16 @@ void kvm_s390_reinject_machine_check(struct kvm_vcpu *vcpu,
>    */
>   void kvm_s390_vcpu_crypto_reset_all(struct kvm *kvm);
>   
> +/**
> + * kvm_s390_vcpu_pci_enable_interp
> + *
> + * Set the associated PCI attributes for each vcpu to allow for zPCI Load/Store
> + * interpretation as well as adapter interruption forwarding.
> + *
> + * @kvm: the KVM guest
> + */
> +void kvm_s390_vcpu_pci_enable_interp(struct kvm *kvm);
> +
>   /**
>    * diag9c_forwarding_hz
>    *
> 

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 18/30] KVM: s390: pci: provide routines for enabling/disabling interpretation
  2022-01-14 20:31 ` [PATCH v2 18/30] KVM: s390: pci: provide routines for enabling/disabling interpretation Matthew Rosato
@ 2022-01-24 14:36   ` Pierre Morel
  2022-01-24 15:14     ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-24 14:36 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> These routines will be wired into the vfio_pci_zdev ioctl handlers to
> respond to requests to enable / disable a device for zPCI Load/Store
> interpretation.
> 
> The first time such a request is received, enable the necessary facilities
> for the guest.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_pci.h |  4 ++
>   arch/s390/kvm/pci.c             | 99 +++++++++++++++++++++++++++++++++
>   arch/s390/pci/pci.c             |  3 +
>   3 files changed, 106 insertions(+)
> 
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index aafee2976929..072401aa7922 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -26,4 +26,8 @@ int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
>   void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
>   void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
>   
> +int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
> +int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
> +int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
> +
>   #endif /* ASM_KVM_PCI_H */
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index dae853da6df1..122d0992b521 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -12,7 +12,9 @@
>   #include <asm/kvm_pci.h>
>   #include <asm/pci.h>
>   #include <asm/pci_insn.h>
> +#include <asm/sclp.h>
>   #include "pci.h"
> +#include "kvm-s390.h"
>   
>   struct zpci_aift *aift;
>   
> @@ -143,6 +145,103 @@ int kvm_s390_pci_aen_init(u8 nisc)
>   	return rc;
>   }
>   
> +int kvm_s390_pci_interp_probe(struct zpci_dev *zdev)
> +{
> +	/* Must have appropriate hardware facilities */
> +	if (!(sclp.has_zpci_lsi && test_facility(69)))

Shouldn't we also test the other facilities we need for 
interpretation, like AENI, AISII, AISI and GISA?

Or are we sure they are always there when zPCI load/store interpretation 
is available?


> +		return -EINVAL;
> +
> +	/* Must have a KVM association registered */
> +	if (!zdev->kzdev || !zdev->kzdev->kvm)
> +		return -EINVAL;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_probe);
> +
> +int kvm_s390_pci_interp_enable(struct zpci_dev *zdev)
> +{
> +	u32 gd;
> +	int rc;
> +
> +	if (!zdev->kzdev || !zdev->kzdev->kvm)
> +		return -EINVAL;
> +
> +	/*
> +	 * If this is the first request to use an interpreted device, make the
> +	 * necessary vcpu changes
> +	 */
> +	if (!zdev->kzdev->kvm->arch.use_zpci_interp)
> +		kvm_s390_vcpu_pci_enable_interp(zdev->kzdev->kvm);
> +
> +	/*
> +	 * In the event of a system reset in userspace, the GISA designation
> +	 * may still be assigned because the device is still enabled.
> +	 * Verify it's the same guest before proceeding.
> +	 */
> +	gd = (u32)(u64)&zdev->kzdev->kvm->arch.sie_page2->gisa;

Shouldn't this use the virt_to_phys() transformation?

> +	if (zdev->gd != 0 && zdev->gd != gd)
> +		return -EPERM;
> +
> +	if (zdev_enabled(zdev)) {
> +		zdev->gd = 0;
> +		rc = zpci_disable_device(zdev);
> +		if (rc)
> +			return rc;
> +	}
> +
> +	/*
> +	 * Store information about the identity of the kvm guest allowed to
> +	 * access this device via interpretation to be used by host CLP
> +	 */
> +	zdev->gd = gd;
> +
> +	rc = zpci_enable_device(zdev);
> +	if (rc)
> +		goto err;
> +
> +	/* Re-register the IOMMU that was already created */
> +	rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
> +				virt_to_phys(zdev->dma_table));
> +	if (rc)
> +		goto err;
> +
> +	return rc;
> +
> +err:
> +	zdev->gd = 0;
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_enable);
> +
> +int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
> +{
> +	int rc;
> +
> +	if (zdev->gd == 0)
> +		return -EINVAL;
> +
> +	/* Remove the host CLP guest designation */
> +	zdev->gd = 0;
> +
> +	if (zdev_enabled(zdev)) {
> +		rc = zpci_disable_device(zdev);
> +		if (rc)
> +			return rc;
> +	}
> +
> +	rc = zpci_enable_device(zdev);
> +	if (rc)
> +		return rc;
> +
> +	/* Re-register the IOMMU that was already created */
> +	rc = zpci_register_ioat(zdev, 0, zdev->start_dma, zdev->end_dma,
> +				virt_to_phys(zdev->dma_table));
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_disable);
> +
>   int kvm_s390_pci_dev_open(struct zpci_dev *zdev)
>   {
>   	struct kvm_zdev *kzdev;
> diff --git a/arch/s390/pci/pci.c b/arch/s390/pci/pci.c
> index 2a19becbc14c..58673f633869 100644
> --- a/arch/s390/pci/pci.c
> +++ b/arch/s390/pci/pci.c
> @@ -147,6 +147,7 @@ int zpci_register_ioat(struct zpci_dev *zdev, u8 dmaas,
>   		zpci_dbg(3, "reg ioat fid:%x, cc:%d, status:%d\n", zdev->fid, cc, status);
>   	return cc;
>   }
> +EXPORT_SYMBOL_GPL(zpci_register_ioat);
>   
>   /* Modify PCI: Unregister I/O address translation parameters */
>   int zpci_unregister_ioat(struct zpci_dev *zdev, u8 dmaas)
> @@ -727,6 +728,7 @@ int zpci_enable_device(struct zpci_dev *zdev)
>   		zpci_update_fh(zdev, fh);
>   	return rc;
>   }
> +EXPORT_SYMBOL_GPL(zpci_enable_device);
>   
>   int zpci_disable_device(struct zpci_dev *zdev)
>   {
> @@ -750,6 +752,7 @@ int zpci_disable_device(struct zpci_dev *zdev)
>   	}
>   	return rc;
>   }
> +EXPORT_SYMBOL_GPL(zpci_disable_device);
>   
>   /**
>    * zpci_hot_reset_device - perform a reset of the given zPCI function
> 

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 08/30] s390/pci: stash associated GISA designation
  2022-01-24 14:08   ` Pierre Morel
@ 2022-01-24 15:12     ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-24 15:12 UTC (permalink / raw)
  To: Pierre Morel, linux-s390, schnelle
  Cc: alex.williamson, cohuck, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel,
	Christian Borntraeger

On 1/24/22 9:08 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> For passthrough devices, we will need to know the GISA designation of the
>> guest if interpretation facilities are to be used.  Setup to stash 
>> this in
>> the zdev and set a default of 0 (no GISA designation) for now; a 
>> subsequent
>> patch will set a valid GISA designation for passthrough devices.
>> Also, extend mpcific routines to specify this stashed designation as part
>> of the mpcific command.
>>
>> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
>> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> Reviewed-by: Eric Farman <farman@linux.ibm.com>
>> Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/pci.h     | 1 +
>>   arch/s390/include/asm/pci_clp.h | 3 ++-
>>   arch/s390/pci/pci.c             | 6 ++++++
>>   arch/s390/pci/pci_clp.c         | 1 +
>>   arch/s390/pci/pci_irq.c         | 5 +++++
>>   5 files changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
>> index 90824be5ce9a..2474b8d30f2a 100644
>> --- a/arch/s390/include/asm/pci.h
>> +++ b/arch/s390/include/asm/pci.h
>> @@ -123,6 +123,7 @@ struct zpci_dev {
>>       enum zpci_state state;
>>       u32        fid;        /* function ID, used by sclp */
>>       u32        fh;        /* function handle, used by insn's */
>> +    u32        gd;        /* GISA designation for passthrough */
> 
> I already gave my R-B, and do not want to remove it, but wouldn't it be 
> possible to use more explicit names like gisa_designation instead of 
> just gd.
> It would not change anything to the functionality but would facilitate 
> the maintenance?
> 

Honestly, I don't have a strong opinion on this one -- AFAICT struct 
zpci_dev has a fair mix of short names (fh) and explicit names 
(max_bus_speed).

It does require changes to this patch and various subsequent patches -- 
The changes are, as you say, not functional, so I think it's not a big deal?

I do think 'gisa_designation' is too verbose though -- How about just 
'gisa'?  This is the same name used in the structure where we get this 
value from (gisa in struct sie_page2).

As long as nobody objects I will s/gd/gisa/ here and in struct 
clp_req_set_pci, retaining review tags.

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 18/30] KVM: s390: pci: provide routines for enabling/disabling interpretation
  2022-01-24 14:36   ` Pierre Morel
@ 2022-01-24 15:14     ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-24 15:14 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/24/22 9:36 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> These routines will be wired into the vfio_pci_zdev ioctl handlers to
>> respond to requests to enable / disable a device for zPCI Load/Store
>> interpretation.
>>
>> The first time such a request is received, enable the necessary 
>> facilities
>> for the guest.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/kvm_pci.h |  4 ++
>>   arch/s390/kvm/pci.c             | 99 +++++++++++++++++++++++++++++++++
>>   arch/s390/pci/pci.c             |  3 +
>>   3 files changed, 106 insertions(+)
>>
>> diff --git a/arch/s390/include/asm/kvm_pci.h 
>> b/arch/s390/include/asm/kvm_pci.h
>> index aafee2976929..072401aa7922 100644
>> --- a/arch/s390/include/asm/kvm_pci.h
>> +++ b/arch/s390/include/asm/kvm_pci.h
>> @@ -26,4 +26,8 @@ int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
>>   void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
>>   void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
>> +int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
>> +int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
>> +int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
>> +
>>   #endif /* ASM_KVM_PCI_H */
>> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
>> index dae853da6df1..122d0992b521 100644
>> --- a/arch/s390/kvm/pci.c
>> +++ b/arch/s390/kvm/pci.c
>> @@ -12,7 +12,9 @@
>>   #include <asm/kvm_pci.h>
>>   #include <asm/pci.h>
>>   #include <asm/pci_insn.h>
>> +#include <asm/sclp.h>
>>   #include "pci.h"
>> +#include "kvm-s390.h"
>>   struct zpci_aift *aift;
>> @@ -143,6 +145,103 @@ int kvm_s390_pci_aen_init(u8 nisc)
>>       return rc;
>>   }
>> +int kvm_s390_pci_interp_probe(struct zpci_dev *zdev)
>> +{
>> +    /* Must have appropriate hardware facilities */
>> +    if (!(sclp.has_zpci_lsi && test_facility(69)))
> 
> Should'nt we also test the other facilities we need for the 
> interpretation like ARNI, AISII, ASI and GISA ?
> 
> Or are we sure they are always there when ZPCI load/store interpretation 
> is available?

I think some of these are implicit based on others but I think you're 
right that we should be testing for more than this to be safe.  I think 
additionally test for AENI, AISII, AISI -- basically we should match 
what we test for in patch 17.
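
A minimal sketch of what such a combined check could look like, written as
a standalone userspace model rather than kernel code.  The struct and
function names below are hypothetical stand-ins for the sclp facility bits
tested in patch 17; only the facility names themselves come from this
thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the sclp facility bits named in this thread. */
struct interp_facilities {
	bool has_zpci_lsi;	/* zPCI load/store interpretation */
	bool has_aeni;		/* adapter event notification interpretation */
	bool has_aisii;
	bool has_aisi;
};

/* Returns true only when every facility tested in patch 17 is present. */
bool interp_facilities_present(const struct interp_facilities *f)
{
	return f->has_zpci_lsi && f->has_aeni && f->has_aisii && f->has_aisi;
}
```

Keeping the test in one helper would let the probe routine and the vcpu
enable path share it, so the two cannot drift apart.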

> 
> 
>> +        return -EINVAL;
>> +
>> +    /* Must have a KVM association registered */
>> +    if (!zdev->kzdev || !zdev->kzdev->kvm)
>> +        return -EINVAL;
>> +
>> +    return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_interp_probe);
>> +
>> +int kvm_s390_pci_interp_enable(struct zpci_dev *zdev)
>> +{
>> +    u32 gd;
>> +    int rc;
>> +
>> +    if (!zdev->kzdev || !zdev->kzdev->kvm)
>> +        return -EINVAL;
>> +
>> +    /*
>> +     * If this is the first request to use an interpreted device, 
>> make the
>> +     * necessary vcpu changes
>> +     */
>> +    if (!zdev->kzdev->kvm->arch.use_zpci_interp)
>> +        kvm_s390_vcpu_pci_enable_interp(zdev->kzdev->kvm);
>> +
>> +    /*
>> +     * In the event of a system reset in userspace, the GISA designation
>> +     * may still be assigned because the device is still enabled.
>> +     * Verify it's the same guest before proceeding.
>> +     */
>> +    gd = (u32)(u64)&zdev->kzdev->kvm->arch.sie_page2->gisa;
> 
> should use the virt_to_phys transformation ?

Yes
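
To illustrate the point, here is a rough userspace model of deriving the
32-bit designation from a physical rather than a virtual address.
Everything in it is an assumption made for the sketch: the fixed offset,
model_virt_to_phys() (standing in for the kernel's virt_to_phys()), and the
range check, which the patch itself does not contain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Made-up linear-map offset, for illustration only. */
#define MODEL_PAGE_OFFSET 0x0000100000000000ULL

/* Hypothetical stand-in for the kernel's virt_to_phys(). */
uint64_t model_virt_to_phys(uint64_t virt)
{
	return virt - MODEL_PAGE_OFFSET;
}

/*
 * Derive a 32-bit designation from the physical address of the GISA.
 * Returns false when the physical address does not fit in 32 bits.
 */
bool gisa_designation(uint64_t gisa_virt, uint32_t *gd)
{
	uint64_t phys = model_virt_to_phys(gisa_virt);

	if (phys > UINT32_MAX)
		return false;
	*gd = (uint32_t)phys;
	return true;
}
```

Casting the virtual address directly only gives the right value when
virtual and physical addresses happen to be identical.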

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 17/30] KVM: s390: mechanism to enable guest zPCI Interpretation
  2022-01-24 14:24   ` Pierre Morel
@ 2022-01-24 15:28     ` Matthew Rosato
  2022-01-24 17:15       ` Pierre Morel
  0 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-24 15:28 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/24/22 9:24 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> The guest must have access to certain facilities in order to allow
>> interpretive execution of zPCI instructions and adapter event
>> notifications.  However, there are some cases where a guest might
>> disable interpretation -- provide a mechanism via which we can defer
>> enabling the associated zPCI interpretation facilities until the guest
>> indicates it wishes to use them.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/kvm_host.h |  4 ++++
>>   arch/s390/kvm/kvm-s390.c         | 40 ++++++++++++++++++++++++++++++++
>>   arch/s390/kvm/kvm-s390.h         | 10 ++++++++
>>   3 files changed, 54 insertions(+)
>>
>> diff --git a/arch/s390/include/asm/kvm_host.h 
>> b/arch/s390/include/asm/kvm_host.h
>> index 3f147b8d050b..38982c1de413 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -252,7 +252,10 @@ struct kvm_s390_sie_block {
>>   #define ECB2_IEP    0x20
>>   #define ECB2_PFMFI    0x08
>>   #define ECB2_ESCA    0x04
>> +#define ECB2_ZPCI_LSI    0x02
>>       __u8    ecb2;                   /* 0x0062 */
>> +#define ECB3_AISI    0x20
>> +#define ECB3_AISII    0x10
>>   #define ECB3_DEA 0x08
>>   #define ECB3_AES 0x04
>>   #define ECB3_RI  0x01
>> @@ -938,6 +941,7 @@ struct kvm_arch{
>>       int use_cmma;
>>       int use_pfmfi;
>>       int use_skf;
>> +    int use_zpci_interp;
>>       int user_cpu_state_ctrl;
>>       int user_sigp;
>>       int user_stsi;
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index ab8b56deed11..b6c32fc3b272 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -1029,6 +1029,44 @@ static int kvm_s390_vm_set_crypto(struct kvm 
>> *kvm, struct kvm_device_attr *attr)
>>       return 0;
>>   }
>> +static void kvm_s390_vcpu_pci_setup(struct kvm_vcpu *vcpu)
>> +{
>> +    /* Only set the ECB bits after guest requests zPCI interpretation */
>> +    if (!vcpu->kvm->arch.use_zpci_interp)
>> +        return;
>> +
>> +    vcpu->arch.sie_block->ecb2 |= ECB2_ZPCI_LSI;
>> +    vcpu->arch.sie_block->ecb3 |= ECB3_AISII + ECB3_AISI;
> 
> As far as I understood, the interpretation is only possible if a gisa 
> designation is associated with the PCI function via CLP enable.
> 

This is true.  Once ECB is enabled, you must have either a SHM bit on 
for emulated device support or SHM bits off + a GISA designation 
registered for interpretation.  Otherwise, PCI instructions will fail.

> Why do we setup the SIE ECB only when the guest requests for 
> interpretation and not systematically in vcpu_setup?

Once the ECB is enabled for a guest, emulated device FHs must have a SHM 
bit in order to continue working properly (so do passthrough devices 
that don't setup interpretation).  This was not a requirement before 
this series -- simply having the ECB bit off would ensure intercepts for 
all devices regardless of SHM bit settings, so by doing an opt-in once 
the guest indicates it will be doing interpretation we can preserve 
backwards-compatibility with an initial mode where SHM bits are not 
necessarily required.  However, once userspace indicates it understands 
interpretation, we can assume it will also use SHM bits properly.

> 
> If ECB2_ZPCI_LSI, ECB3_AISII or ECB3_AISI have an effect when the gisa 
> designation is not specified shouldn't we have a way to clear these bits?
> 

I'm not sure that's necessary -- The idea here was for the userspace to 
indicate 1) that it knows how to set up for interpreted devices and 2) 
that it has a guest that wants to use at least 1 interpreted device.
Once we know that userspace understands how to manage interpreted 
devices (implied by its use of these new vfio feature ioctls) I think it 
should be OK to leave these bits on and expect userspace to always do 
the appropriate steps (SHM bits for emulated devices / forced intercept 
passthrough devices, GISA designation for interpreted devices).
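
For reference, the opt-in behaviour described above can be modelled in a
few lines of userspace C.  This is only a sketch: the model_* types and
functions are hypothetical, and the locking and vcpu blocking done by the
real patch are left out.  The bit values are the ones from the kvm_host.h
hunk.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit values as defined in the quoted kvm_host.h hunk. */
#define ECB2_ZPCI_LSI	0x02
#define ECB3_AISI	0x20
#define ECB3_AISII	0x10

/* Simplified stand-ins for the per-vcpu and per-VM state. */
struct model_vcpu {
	uint8_t ecb2;
	uint8_t ecb3;
};

struct model_vm {
	int use_zpci_interp;
	struct model_vcpu *vcpus;
	size_t nr_vcpus;
};

/* Called for every vcpu; a no-op until the VM has opted in. */
void model_vcpu_pci_setup(const struct model_vm *vm, struct model_vcpu *vcpu)
{
	if (!vm->use_zpci_interp)
		return;
	vcpu->ecb2 |= ECB2_ZPCI_LSI;
	vcpu->ecb3 |= ECB3_AISII | ECB3_AISI;
}

/* One-way opt-in: set the flag, then update all existing vcpus. */
void model_vm_enable_interp(struct model_vm *vm)
{
	size_t i;

	vm->use_zpci_interp = 1;
	for (i = 0; i < vm->nr_vcpus; i++)
		model_vcpu_pci_setup(vm, &vm->vcpus[i]);
}
```

Before the opt-in the setup routine leaves the ECB bytes untouched, which
is what preserves the old intercept-everything behaviour.  There is
deliberately no path that clears the bits again.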

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 17/30] KVM: s390: mechanism to enable guest zPCI Interpretation
  2022-01-24 15:28     ` Matthew Rosato
@ 2022-01-24 17:15       ` Pierre Morel
  0 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-24 17:15 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/24/22 16:28, Matthew Rosato wrote:
> On 1/24/22 9:24 AM, Pierre Morel wrote:
>>
>>
>> On 1/14/22 21:31, Matthew Rosato wrote:
>>> The guest must have access to certain facilities in order to allow
>>> interpretive execution of zPCI instructions and adapter event
>>> notifications.  However, there are some cases where a guest might
>>> disable interpretation -- provide a mechanism via which we can defer
>>> enabling the associated zPCI interpretation facilities until the guest
>>> indicates it wishes to use them.
>>>
>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>> ---
>>>   arch/s390/include/asm/kvm_host.h |  4 ++++
>>>   arch/s390/kvm/kvm-s390.c         | 40 ++++++++++++++++++++++++++++++++
>>>   arch/s390/kvm/kvm-s390.h         | 10 ++++++++
>>>   3 files changed, 54 insertions(+)
>>>
>>> diff --git a/arch/s390/include/asm/kvm_host.h 
>>> b/arch/s390/include/asm/kvm_host.h
>>> index 3f147b8d050b..38982c1de413 100644
>>> --- a/arch/s390/include/asm/kvm_host.h
>>> +++ b/arch/s390/include/asm/kvm_host.h
>>> @@ -252,7 +252,10 @@ struct kvm_s390_sie_block {
>>>   #define ECB2_IEP    0x20
>>>   #define ECB2_PFMFI    0x08
>>>   #define ECB2_ESCA    0x04
>>> +#define ECB2_ZPCI_LSI    0x02
>>>       __u8    ecb2;                   /* 0x0062 */
>>> +#define ECB3_AISI    0x20
>>> +#define ECB3_AISII    0x10
>>>   #define ECB3_DEA 0x08
>>>   #define ECB3_AES 0x04
>>>   #define ECB3_RI  0x01
>>> @@ -938,6 +941,7 @@ struct kvm_arch{
>>>       int use_cmma;
>>>       int use_pfmfi;
>>>       int use_skf;
>>> +    int use_zpci_interp;
>>>       int user_cpu_state_ctrl;
>>>       int user_sigp;
>>>       int user_stsi;
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index ab8b56deed11..b6c32fc3b272 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -1029,6 +1029,44 @@ static int kvm_s390_vm_set_crypto(struct kvm 
>>> *kvm, struct kvm_device_attr *attr)
>>>       return 0;
>>>   }
>>> +static void kvm_s390_vcpu_pci_setup(struct kvm_vcpu *vcpu)
>>> +{
>>> +    /* Only set the ECB bits after guest requests zPCI 
>>> interpretation */
>>> +    if (!vcpu->kvm->arch.use_zpci_interp)
>>> +        return;
>>> +
>>> +    vcpu->arch.sie_block->ecb2 |= ECB2_ZPCI_LSI;
>>> +    vcpu->arch.sie_block->ecb3 |= ECB3_AISII + ECB3_AISI;
>>
>> As far as I understood, the interpretation is only possible if a gisa 
>> designation is associated with the PCI function via CLP enable.
>>
> 
> This is true.  Once ECB is enabled, you must have either a SHM bit on 
> for emulated device support or SHM bits off + a GISA designation 
> registered for interpretation.  Otherwise, PCI instructions will fail.

AFAIU the PCI instruction should not fail but trigger an interception if 
the GISA designation field of the CLP enable is not set.

However, what you do is not wrong.
So I think we had better keep what you propose.

> 
>> Why do we setup the SIE ECB only when the guest requests for 
>> interpretation and not systematically in vcpu_setup?
> 
> Once the ECB is enabled for a guest, emulated device FHs must have a SHM 
> bit in order to continue working properly (so do passthrough devices 
> that don't setup interpretation).  This was not a requirement before 
> this series -- simply having the ECB bit off would ensure intercepts for 
> all devices regardless of SHM bit settings, so by doing an opt-in once 
> the guest indicates it will be doing interpretation we can preserve 
> backwards-compatibility with an initial mode where SHM bits are not 
> necessarily required.  However once userspace indicates it understands 
> interpretation, we can assume it is will also use SHM bits properly.

If not setting the GD in the CLP enable triggers interception on later PCI 
instructions, preparing everything early would allow choosing between 
interpretation and interception on a per-function basis during the CLP Set 
PCI Function with the enable command, and would make the initialization simpler.

However, what you propose is tested and works, so we can have this 
discussion later as a possible enhancement, if it really is one.

> 
>>
>> If ECB2_ZPCI_LSI, ECB3_AISII or ECB3_AISI have an effect when the gisa 
>> designation is not specified shouldn't we have a way to clear these bits?
>>
> 
> I'm not sure that's necessary -- The idea here was for the userspace to 
> indicate 1) that it knows how to setup for interpreted devices and 2) 
> that it has a guest that wants to use at least 1 interpreted device.
> Once we know that userspace understands how to manage interpreted 
> devices (implied by its use of these new vfio feature ioctls) I think it 
> should be OK to leave these bits on and expect userspace to always do 
> the appropriate steps (SHM bits for emulated devices / forced intercept 
> passthrough devices, GISA designation for interpreted devices).

Seems reasonable.

Acked-by: Pierre Morel <pmorel@linux.ibm.com>

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 15/30] KVM: s390: pci: do initial setup for AEN interpretation
  2022-01-14 20:31 ` [PATCH v2 15/30] KVM: s390: pci: do initial setup for AEN interpretation Matthew Rosato
  2022-01-19 18:06   ` Pierre Morel
@ 2022-01-25 12:23   ` Pierre Morel
  2022-01-25 14:57     ` Matthew Rosato
  1 sibling, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-25 12:23 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> Initial setup for Adapter Event Notification Interpretation for zPCI
> passthrough devices.  Specifically, allocate a structure for forwarding of
> adapter events and pass the address of this structure to firmware.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/pci.h      |   4 +
>   arch/s390/include/asm/pci_insn.h |  12 +++
>   arch/s390/kvm/interrupt.c        |  14 +++
>   arch/s390/kvm/kvm-s390.c         |   9 ++
>   arch/s390/kvm/pci.c              | 144 +++++++++++++++++++++++++++++++
>   arch/s390/kvm/pci.h              |  42 +++++++++
>   arch/s390/pci/pci.c              |   6 ++
>   7 files changed, 231 insertions(+)
>   create mode 100644 arch/s390/kvm/pci.h
> 
...snip...

> new file mode 100644
> index 000000000000..b2000ed7b8c3
> --- /dev/null
> +++ b/arch/s390/kvm/pci.h
> @@ -0,0 +1,42 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * s390 kvm PCI passthrough support
> + *
> + * Copyright IBM Corp. 2021
> + *
> + *    Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
> + */
> +
> +#ifndef __KVM_S390_PCI_H
> +#define __KVM_S390_PCI_H
> +
> +#include <linux/pci.h>
> +#include <linux/mutex.h>
> +#include <asm/airq.h>
> +#include <asm/kvm_pci.h>
> +
> +struct zpci_gaite {
> +	u32 gisa;
> +	u8 gisc;
> +	u8 count;
> +	u8 reserved;
> +	u8 aisbo;
> +	u64 aisb;
> +};
> +
> +struct zpci_aift {
> +	struct zpci_gaite *gait;
> +	struct airq_iv *sbv;
> +	struct kvm_zdev **kzdev;
> +	spinlock_t gait_lock; /* Protects the gait, used during AEN forward */
> +	struct mutex lock; /* Protects the other structures in aift */

To facilitate review and debug, can we please rename the lock aift_lock?


> +};
> +
> +extern struct zpci_aift *aift;
> +
...snip...


-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support
  2022-01-14 20:31 ` [PATCH v2 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support Matthew Rosato
  2022-01-19 17:10   ` Pierre Morel
@ 2022-01-25 12:36   ` Pierre Morel
  2022-01-25 14:16     ` Matthew Rosato
  1 sibling, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-25 12:36 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> Introduce support for VFIO_DEVICE_FEATURE_ZPCI_AIF, which is a new
> VFIO_DEVICE_FEATURE ioctl.  This interface is used to indicate that an
> s390x vfio-pci device wishes to enable/disable zPCI adapter interrupt
> forwarding, which allows underlying firmware to deliver interrupts
> directly to the associated kvm guest.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_pci.h  |  2 +
>   drivers/vfio/pci/vfio_pci_core.c |  2 +
>   drivers/vfio/pci/vfio_pci_zdev.c | 98 +++++++++++++++++++++++++++++++-
>   include/linux/vfio_pci_core.h    | 10 ++++
>   include/uapi/linux/vfio.h        |  7 +++
>   include/uapi/linux/vfio_zdev.h   | 20 +++++++
>   6 files changed, 138 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index dc00c3f27a00..dbab349a4a75 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -36,6 +36,8 @@ struct kvm_zdev {
>   	struct zpci_fib fib;
>   	struct notifier_block nb;
>   	bool interp;
> +	bool aif;
> +	bool fhost;

Can we please have a comment on these booleans?
Can we have explicit naming to be able to follow their usage more easily?
Maybe aif_float and aif_host, to match the VFIO feature flags?
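
Something along these lines, as a standalone sketch only.  The struct is a
cut-down hypothetical model of kvm_zdev, and the MODEL_FLAG_* values are
placeholders, since the real VFIO_DEVICE_ZPCI_FLAG_AIF_* definitions from
vfio_zdev.h are not quoted here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder values; the real flags are defined in vfio_zdev.h. */
#define MODEL_FLAG_AIF_FLOAT	(1u << 0)
#define MODEL_FLAG_AIF_HOST	(1u << 1)

/* Cut-down, hypothetical model of struct kvm_zdev with the renamed fields. */
struct model_kzdev {
	bool aif_float;	/* reported to userspace as the AIF_FLOAT flag */
	bool aif_host;	/* reported to userspace as the AIF_HOST flag */
};

/* Build the flags word that the GET path would hand back to userspace. */
uint64_t model_aif_flags(const struct model_kzdev *kzdev)
{
	uint64_t flags = 0;

	if (kzdev->aif_float)
		flags |= MODEL_FLAG_AIF_FLOAT;
	if (kzdev->aif_host)
		flags |= MODEL_FLAG_AIF_HOST;
	return flags;
}
```

With names like these, the mapping between each boolean and its flag in the
GET handler can be read off directly.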

>   };
>   
>   int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index 2b2d64a2190c..01658de660bd 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1174,6 +1174,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
>   			return 0;
>   		case VFIO_DEVICE_FEATURE_ZPCI_INTERP:
>   			return vfio_pci_zdev_feat_interp(vdev, feature, arg);
> +		case VFIO_DEVICE_FEATURE_ZPCI_AIF:
> +			return vfio_pci_zdev_feat_aif(vdev, feature, arg);
>   		default:
>   			return -ENOTTY;
>   		}
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index 4339f48b98bc..891cfa016d63 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -13,6 +13,7 @@
>   #include <linux/vfio_zdev.h>
>   #include <asm/pci_clp.h>
>   #include <asm/pci_io.h>
> +#include <asm/pci_insn.h>
>   #include <asm/kvm_pci.h>
>   
>   #include <linux/vfio_pci_core.h>
> @@ -208,6 +209,99 @@ int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
>   	return rc;
>   }
>   
> +int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> +			   struct vfio_device_feature feature,
> +			   unsigned long arg)
> +{
> +	struct zpci_dev *zdev = to_zpci(vdev->pdev);
> +	struct vfio_device_zpci_aif *data;
> +	struct vfio_device_feature *feat;
> +	unsigned long minsz;
> +	int size, rc = 0;
> +
> +	if (!zdev || !zdev->kzdev)
> +		return -EINVAL;
> +
> +	/* If PROBE specified, return probe results immediately */
> +	if (feature.flags & VFIO_DEVICE_FEATURE_PROBE)
> +		return kvm_s390_pci_aif_probe(zdev);
> +
> +	/* GET and SET are mutually exclusive */
> +	if ((feature.flags & VFIO_DEVICE_FEATURE_GET) &&
> +	    (feature.flags & VFIO_DEVICE_FEATURE_SET))
> +		return -EINVAL;
> +
> +	size = sizeof(*feat) + sizeof(*data);
> +	feat = kzalloc(size, GFP_KERNEL);
> +	if (!feat)
> +		return -ENOMEM;
> +
> +	data = (struct vfio_device_zpci_aif *)&feat->data;
> +	minsz = offsetofend(struct vfio_device_feature, flags);
> +
> +	if (feature.argsz < minsz + sizeof(*data))
> +		return -EINVAL;
> +
> +	/* Get the rest of the payload for GET/SET */
> +	rc = copy_from_user(data, (void __user *)(arg + minsz),
> +			    sizeof(*data));
> +	if (rc)
> +		rc = -EINVAL;
> +
> +	if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
> +		if (zdev->kzdev->aif)
> +			data->flags = VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
> +		if (zdev->kzdev->fhost)
> +			data->flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
> +
> +		if (copy_to_user((void __user *)arg, feat, size))
> +			rc = -EFAULT;
> +	} else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
> +		if (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT) {
> +			/* create a guest fib */
> +			struct zpci_fib fib;
> +
> +			fib.fmt0.aibv = data->ibv;
> +			fib.fmt0.isc = data->isc;
> +			fib.fmt0.noi = data->noi;
> +			if (data->sb != 0) {
> +				fib.fmt0.aisb = data->sb;
> +				fib.fmt0.aisbo = data->sbo;
> +				fib.fmt0.sum = 1;
> +			} else {
> +				fib.fmt0.aisb = 0;
> +				fib.fmt0.aisbo = 0;
> +				fib.fmt0.sum = 0;
> +			}
> +			if (data->flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST) {
> +				rc = kvm_s390_pci_aif_enable(zdev, &fib, false);
> +				if (!rc) {
> +					zdev->kzdev->aif = true;
> +					zdev->kzdev->fhost = true;
> +				}
> +			} else {
> +				rc = kvm_s390_pci_aif_enable(zdev, &fib, true);
> +				if (!rc)
> +					zdev->kzdev->aif = true;
> +			}
> +		} else if (data->flags == 0) {
> +			rc = kvm_s390_pci_aif_disable(zdev);
> +			if (!rc) {
> +				zdev->kzdev->aif = false;
> +				zdev->kzdev->fhost = false;
> +			}
> +		} else {
> +			rc = -EINVAL;
> +		}
> +	} else {
> +		/* Neither GET nor SET were specified */
> +		rc = -EINVAL;
> +	}
> +
> +	kfree(feat);
> +	return rc;
> +}
> +
>   static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
>   					unsigned long action, void *data)
>   {
> @@ -255,8 +349,10 @@ void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
>   	 * If the device was using interpretation, don't trust that userspace
>   	 * did the appropriate cleanup
>   	 */
> -	if (zdev->gd != 0)
> +	if (zdev->gd != 0) {
> +		kvm_s390_pci_aif_disable(zdev);
>   		kvm_s390_pci_interp_disable(zdev);
> +	}
>   
>   	kvm_s390_pci_dev_release(zdev);
>   }
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 0db2b1051931..7ec5e82e7933 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -201,6 +201,9 @@ extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
>   int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
>   			      struct vfio_device_feature feature,
>   			      unsigned long arg);
> +int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> +			   struct vfio_device_feature feature,
> +			   unsigned long arg);
>   void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
>   void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
>   #else
> @@ -217,6 +220,13 @@ static inline int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
>   	return -ENOTTY;
>   }
>   
> +static inline int vfio_pci_zdev_feat_aif(struct vfio_pci_core_device *vdev,
> +					 struct vfio_device_feature feature,
> +					 unsigned long arg)
> +{
> +	return -ENOTTY;
> +}
> +
>   static inline void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
>   {
>   }
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index b9a75485b8e7..fe3bfd99bf50 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1009,6 +1009,13 @@ struct vfio_device_feature {
>    */
>   #define VFIO_DEVICE_FEATURE_ZPCI_INTERP		(1)
>   
> +/*
> + * Provide support for enabling adapter interruption forwarding for zPCI
> + * devices.  This feature is only valid for s390x PCI devices.  Data provided
> + * when setting and getting this feature is further described in vfio_zdev.h
> + */
> +#define VFIO_DEVICE_FEATURE_ZPCI_AIF		(2)
> +
>   /* -------- API for Type1 VFIO IOMMU -------- */
>   
>   /**
> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
> index 575f0410dc66..c574e23f9385 100644
> --- a/include/uapi/linux/vfio_zdev.h
> +++ b/include/uapi/linux/vfio_zdev.h
> @@ -90,4 +90,24 @@ struct vfio_device_zpci_interp {
>   	__u32 fh;		/* Host device function handle */
>   };
>   
> +/**
> + * VFIO_DEVICE_FEATURE_ZPCI_AIF
> + *
> + * This feature is used for enabling forwarding of adapter interrupts directly
> + * from firmware to the guest.  When setting this feature, the flags indicate
> + * whether to enable/disable the feature and the structure defined below is
> + * used to setup the forwarding structures.  When getting this feature, only
> + * the flags are used to indicate the current state.
> + */
> +struct vfio_device_zpci_aif {
> +	__u64 flags;
> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST 2

I think we need more information on these flags.
What does AIF_FLOAT mean, and what does AIF_HOST mean?
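
As a reading aid, here is a small sketch of how the SET path of vfio_pci_zdev_feat_aif() in this patch appears to interpret the flags. The enum and the function name are hypothetical and exist only for this illustration; the decision logic is taken from the quoted handler, where AIF_HOST selects kvm_s390_pci_aif_enable() with assist == false.

```c
#include <errno.h>

/* Flag values copied from the proposed vfio_zdev.h in this patch. */
#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST  2

/* Hypothetical names for the three outcomes of a SET request. */
enum aif_action {
	AIF_DISABLE,       /* flags == 0 */
	AIF_ENABLE_ASSIST, /* FLOAT only: enable with assist == true */
	AIF_ENABLE_HOST,   /* FLOAT | HOST: enable with assist == false */
};

/* Returns 0 and fills *action, or -EINVAL for a combination the handler rejects. */
static int aif_decode_set_flags(unsigned long long flags, enum aif_action *action)
{
	if (flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT) {
		*action = (flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST) ?
			  AIF_ENABLE_HOST : AIF_ENABLE_ASSIST;
		return 0;
	}
	if (flags == 0) {
		*action = AIF_DISABLE;
		return 0;
	}
	/* For example AIF_HOST without AIF_FLOAT. */
	return -EINVAL;
}
```

If this reading is right, AIF_HOST only has meaning together with AIF_FLOAT, which is the kind of detail the uapi comment could spell out.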

> +	__u64 ibv;		/* Address of guest interrupt bit vector */
> +	__u64 sb;		/* Address of guest summary bit */
> +	__u32 noi;		/* Number of interrupts */
> +	__u8 isc;		/* Guest interrupt subclass */
> +	__u8 sbo;		/* Offset of guest summary bit vector */
> +};
> +
>   #endif
> 

-- 
Pierre Morel
IBM Lab Boeblingen

* Re: [PATCH v2 19/30] KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding
  2022-01-14 20:31 ` [PATCH v2 19/30] KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding Matthew Rosato
@ 2022-01-25 12:41   ` Pierre Morel
  2022-01-25 15:44     ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-25 12:41 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> These routines will be wired into the vfio_pci_zdev ioctl handlers to
> respond to requests to enable / disable a device for Adapter Event
> Notifications / Adapter Interruption Forwarding.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_pci.h |   7 ++
>   arch/s390/kvm/pci.c             | 203 ++++++++++++++++++++++++++++++++
>   arch/s390/pci/pci_insn.c        |   1 +
>   3 files changed, 211 insertions(+)
> 
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 072401aa7922..01fe14fffd7a 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -16,16 +16,23 @@
>   #include <linux/kvm_host.h>
>   #include <linux/kvm.h>
>   #include <linux/pci.h>
> +#include <asm/pci_insn.h>
>   
>   struct kvm_zdev {
>   	struct zpci_dev *zdev;
>   	struct kvm *kvm;
> +	struct zpci_fib fib;
>   };
>   
>   int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
>   void kvm_s390_pci_dev_release(struct zpci_dev *zdev);
>   void kvm_s390_pci_attach_kvm(struct zpci_dev *zdev, struct kvm *kvm);
>   
> +int kvm_s390_pci_aif_probe(struct zpci_dev *zdev);
> +int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
> +			    bool assist);
> +int kvm_s390_pci_aif_disable(struct zpci_dev *zdev);
> +
>   int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
>   int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
>   int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index 122d0992b521..7ed9abc476b6 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -12,6 +12,7 @@
>   #include <asm/kvm_pci.h>
>   #include <asm/pci.h>
>   #include <asm/pci_insn.h>
> +#include <asm/pci_io.h>
>   #include <asm/sclp.h>
>   #include "pci.h"
>   #include "kvm-s390.h"
> @@ -145,6 +146,204 @@ int kvm_s390_pci_aen_init(u8 nisc)
>   	return rc;
>   }
>   
> +/* Modify PCI: Register floating adapter interruption forwarding */
> +static int kvm_zpci_set_airq(struct zpci_dev *zdev)
> +{
> +	u64 req = ZPCI_CREATE_REQ(zdev->fh, 0, ZPCI_MOD_FC_REG_INT);
> +	struct zpci_fib fib = {0};

I prefer {} to {0}. Even though it does the same thing, {0} looks wrong to me.

> +	u8 status;
> +
> +	fib.fmt0.isc = zdev->kzdev->fib.fmt0.isc;
> +	fib.fmt0.sum = 1;       /* enable summary notifications */
> +	fib.fmt0.noi = airq_iv_end(zdev->aibv);
> +	fib.fmt0.aibv = virt_to_phys(zdev->aibv->vector);
> +	fib.fmt0.aibvo = 0;
> +	fib.fmt0.aisb = virt_to_phys(aift->sbv->vector + (zdev->aisb / 64) * 8);
> +	fib.fmt0.aisbo = zdev->aisb & 63;
> +	fib.gd = zdev->gd;
> +
> +	return zpci_mod_fc(req, &fib, &status) ? -EIO : 0;
> +}
> +
> +/* Modify PCI: Unregister floating adapter interruption forwarding */
> +static int kvm_zpci_clear_airq(struct zpci_dev *zdev)
> +{
> +	u64 req = ZPCI_CREATE_REQ(zdev->fh, 0, ZPCI_MOD_FC_DEREG_INT);
> +	struct zpci_fib fib = {0};

same here

> +	u8 cc, status;
> +
> +	fib.gd = zdev->gd;
> +
> +	cc = zpci_mod_fc(req, &fib, &status);
> +	if (cc == 3 || (cc == 1 && status == 24))
> +		/* Function already gone or IRQs already deregistered. */
> +		cc = 0;
> +
> +	return cc ? -EIO : 0;
> +}
> +
> +int kvm_s390_pci_aif_probe(struct zpci_dev *zdev)
> +{
> +	/* Must have appropriate hardware facilities */
> +	if (!(sclp.has_aeni && test_facility(71)))
> +		return -EINVAL;
> +
> +	/* Must have a KVM association registered */
> +	if (!zdev->kzdev || !zdev->kzdev->kvm)
> +		return -EINVAL;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_probe);
> +
> +int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
> +			    bool assist)
> +{
> +	struct page *aibv_page, *aisb_page = NULL;
> +	unsigned int msi_vecs, idx;
> +	struct zpci_gaite *gaite;
> +	unsigned long bit;
> +	struct kvm *kvm;
> +	phys_addr_t gaddr;
> +	int rc = 0;
> +
> +	/*
> +	 * Interrupt forwarding is only applicable if the device is already
> +	 * enabled for interpretation
> +	 */
> +	if (zdev->gd == 0)
> +		return -EINVAL;
> +
> +	kvm = zdev->kzdev->kvm;
> +	msi_vecs = min_t(unsigned int, fib->fmt0.noi, zdev->max_msi);
> +
> +	/* Replace AIBV address */
> +	idx = srcu_read_lock(&kvm->srcu);
> +	aibv_page = gfn_to_page(kvm, gpa_to_gfn((gpa_t)fib->fmt0.aibv));
> +	srcu_read_unlock(&kvm->srcu, idx);
> +	if (is_error_page(aibv_page)) {
> +		rc = -EIO;
> +		goto out;
> +	}
> +	gaddr = page_to_phys(aibv_page) + (fib->fmt0.aibv & ~PAGE_MASK);
> +	fib->fmt0.aibv = gaddr;
> +
> +	/* Pin the guest AISB if one was specified */
> +	if (fib->fmt0.sum == 1) {
> +		idx = srcu_read_lock(&kvm->srcu);
> +		aisb_page = gfn_to_page(kvm, gpa_to_gfn((gpa_t)fib->fmt0.aisb));
> +		srcu_read_unlock(&kvm->srcu, idx);
> +		if (is_error_page(aisb_page)) {
> +			rc = -EIO;
> +			goto unpin1;
> +		}
> +	}
> +
> +	/* AISB must be allocated before we can fill in GAITE */
> +	mutex_lock(&aift->lock);
> +	bit = airq_iv_alloc_bit(aift->sbv);
> +	if (bit == -1UL)
> +		goto unpin2;
> +	zdev->aisb = bit;

aisb here is the aisb offset, right?
Then maybe add a comment, since in the GAIT and in fmt0 aisb is an address.
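
To make the distinction concrete, a minimal sketch of how a summary bit index such as zdev->aisb maps onto a location in a vector of 64-bit words. It assumes the expressions (zdev->aisb / 64) * 8 and zdev->aisb & 63 used elsewhere in this patch are meant as a byte offset to the containing word and a bit offset within it; the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical description of where one summary bit lives. */
struct aisb_loc {
	size_t byte_offset; /* offset of the containing 64-bit word from the vector start */
	unsigned int bit;   /* bit offset within that word, 0..63 */
};

/* Split a bit index (what zdev->aisb holds) into word offset and bit offset. */
static struct aisb_loc aisb_bit_to_loc(unsigned long bit)
{
	struct aisb_loc loc = {
		.byte_offset = (bit / 64) * 8,
		.bit = (unsigned int)(bit & 63),
	};

	return loc;
}
```

So zdev->aisb is an index, while fmt0.aisb and gaite->aisb are addresses and fmt0.aisbo is the bit offset; a one-line comment at the assignment would avoid the confusion.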

> +	zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA |
> +					      AIRQ_IV_BITLOCK |
> +					      AIRQ_IV_GUESTVEC,
> +				    (unsigned long *)fib->fmt0.aibv);

phys_to_virt ?

> +
> +	spin_lock_irq(&aift->gait_lock);
> +	gaite = (struct zpci_gaite *)aift->gait + (zdev->aisb *
> +						   sizeof(struct zpci_gaite));
> +
> +	/* If assist not requested, host will get all alerts */
> +	if (assist)
> +		gaite->gisa = (u32)(u64)&kvm->arch.sie_page2->gisa;

virt_to_phys ?

> +	else
> +		gaite->gisa = 0;
> +
> +	gaite->gisc = fib->fmt0.isc;
> +	gaite->count++;
> +	gaite->aisbo = fib->fmt0.aisbo;
> +	gaite->aisb = virt_to_phys(page_address(aisb_page) + (fib->fmt0.aisb &
> +							      ~PAGE_MASK));
> +	aift->kzdev[zdev->aisb] = zdev->kzdev;
> +	spin_unlock_irq(&aift->gait_lock);
> +
> +	/* Update guest FIB for re-issue */
> +	fib->fmt0.aisbo = zdev->aisb & 63;
> +	fib->fmt0.aisb = virt_to_phys(aift->sbv->vector + (zdev->aisb / 64) * 8);
> +	fib->fmt0.isc = kvm_s390_gisc_register(kvm, gaite->gisc);
> +
> +	/* Save some guest fib values in the host for later use */
> +	zdev->kzdev->fib.fmt0.isc = fib->fmt0.isc;
> +	zdev->kzdev->fib.fmt0.aibv = fib->fmt0.aibv;
> +	mutex_unlock(&aift->lock);
> +
> +	/* Issue the clp to setup the irq now */
> +	rc = kvm_zpci_set_airq(zdev);
> +	return rc;
> +
> +unpin2:
> +	mutex_unlock(&aift->lock);
> +	if (fib->fmt0.sum == 1) {
> +		gaddr = page_to_phys(aisb_page);
> +		kvm_release_pfn_dirty(gaddr >> PAGE_SHIFT);
> +	}
> +unpin1:
> +	kvm_release_pfn_dirty(fib->fmt0.aibv >> PAGE_SHIFT);
> +out:
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_enable);
> +
> +int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
> +{
> +	struct kvm_zdev *kzdev = zdev->kzdev;
> +	struct zpci_gaite *gaite;
> +	int rc;
> +	u8 isc;
> +
> +	if (zdev->gd == 0)
> +		return -EINVAL;
> +
> +	/* Even if the clear fails due to an error, clear the GAITE */
> +	rc = kvm_zpci_clear_airq(zdev);

Having a look at kvm_zpci_clear_airq(), the only possible error seems to 
be when an error recovery is in progress.
The other cases (a wrong FH, a function that no longer exists, or 
interrupt vectors that are already deregistered) are returned as 
success by the function.

How can we be sure that we have no conflict with a recovery in progress?
Shouldn't we in this case let the recovery process handle the function 
and stop here?

Doesn't taking the aift->lock mutex after, and not before, the call to 
kvm_zpci_clear_airq() open the door to a race condition with the recovery?

> +
> +	mutex_lock(&aift->lock);
> +	if (zdev->kzdev->fib.fmt0.aibv == 0)
> +		goto out;
> +	spin_lock_irq(&aift->gait_lock);
> +	gaite = (struct zpci_gaite *)aift->gait + (zdev->aisb *
> +						   sizeof(struct zpci_gaite));
> +	isc = gaite->gisc;
> +	gaite->count--;
> +	if (gaite->count == 0) {
> +		/* Release guest AIBV and AISB */
> +		kvm_release_pfn_dirty(kzdev->fib.fmt0.aibv >> PAGE_SHIFT);
> +		if (gaite->aisb != 0)
> +			kvm_release_pfn_dirty(gaite->aisb >> PAGE_SHIFT);
> +		/* Clear the GAIT entry */
> +		gaite->aisb = 0;
> +		gaite->gisc = 0;
> +		gaite->aisbo = 0;
> +		gaite->gisa = 0;
> +		aift->kzdev[zdev->aisb] = 0;
> +		/* Clear zdev info */
> +		airq_iv_free_bit(aift->sbv, zdev->aisb);
> +		airq_iv_release(zdev->aibv);
> +		zdev->aisb = 0;
> +		zdev->aibv = NULL;
> +	}
> +	spin_unlock_irq(&aift->gait_lock);
> +	kvm_s390_gisc_unregister(kzdev->kvm, isc);

Don't we need to check the return value?
And maybe to report it to the caller?
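
One possible shape for this, as a sketch only: run both teardown steps unconditionally so the cleanup always completes, then report the first failure. The function-pointer type and the step_* stubs are hypothetical stand-ins for kvm_zpci_clear_airq() and kvm_s390_gisc_unregister(); the error values are chosen only for the example.

```c
#include <errno.h>

/* Hypothetical stand-in for one teardown step returning 0 or a negative errno. */
typedef int (*teardown_step)(void);

/* Stubs used only to exercise the sketch. */
static int step_ok(void)     { return 0; }
static int step_eio(void)    { return -EIO; }
static int step_erange(void) { return -ERANGE; }

/*
 * Run both steps regardless of the first result, then return the
 * first error seen (or 0 if both succeeded).
 */
static int aif_disable_sketch(teardown_step clear_airq,
			      teardown_step gisc_unregister)
{
	int rc = clear_airq();
	int rc2 = gisc_unregister();

	return rc ? rc : rc2;
}
```

Whether the first or the last error should win is a design choice; the sketch keeps the first so that a failing clear is not masked by a later step.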

> +	kzdev->fib.fmt0.isc = 0;
> +	kzdev->fib.fmt0.aibv = 0;
> +out:
> +	mutex_unlock(&aift->lock);
> +
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_disable);
> +
>   int kvm_s390_pci_interp_probe(struct zpci_dev *zdev)
>   {
>   	/* Must have appropriate hardware facilities */
> @@ -221,6 +420,10 @@ int kvm_s390_pci_interp_disable(struct zpci_dev *zdev)
>   	if (zdev->gd == 0)
>   		return -EINVAL;
>   
> +	/* Forwarding must be turned off before interpretation */
> +	if (zdev->kzdev->fib.fmt0.aibv != 0)
> +		kvm_s390_pci_aif_disable(zdev);
> +
>   	/* Remove the host CLP guest designation */
>   	zdev->gd = 0;
>   
> diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
> index ca6399d52767..f7d0e29bbf0b 100644
> --- a/arch/s390/pci/pci_insn.c
> +++ b/arch/s390/pci/pci_insn.c
> @@ -59,6 +59,7 @@ u8 zpci_mod_fc(u64 req, struct zpci_fib *fib, u8 *status)
>   
>   	return cc;
>   }
> +EXPORT_SYMBOL_GPL(zpci_mod_fc);
>   
>   /* Refresh PCI Translations */
>   static inline u8 __rpcit(u64 fn, u64 addr, u64 range, u8 *status)
> 

-- 
Pierre Morel
IBM Lab Boeblingen

* Re: [PATCH v2 25/30] vfio-pci/zdev: wire up zPCI interpretive execution support
  2022-01-14 20:31 ` [PATCH v2 25/30] vfio-pci/zdev: wire up zPCI interpretive execution support Matthew Rosato
@ 2022-01-25 13:01   ` Pierre Morel
  2022-01-25 14:21     ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-25 13:01 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> Introduce support for VFIO_DEVICE_FEATURE_ZPCI_INTERP, which is a new
> VFIO_DEVICE_FEATURE ioctl.  This interface is used to indicate that an
> s390x vfio-pci device wishes to enable/disable zPCI interpretive
> execution, which allows zPCI instructions to be executed directly by
> underlying firmware without KVM involvement.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_pci.h  |  1 +
>   drivers/vfio/pci/vfio_pci_core.c |  2 +
>   drivers/vfio/pci/vfio_pci_zdev.c | 78 ++++++++++++++++++++++++++++++++
>   include/linux/vfio_pci_core.h    | 10 ++++
>   include/uapi/linux/vfio.h        |  7 +++
>   include/uapi/linux/vfio_zdev.h   | 15 ++++++
>   6 files changed, 113 insertions(+)
> 
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 97a90b37c87d..dc00c3f27a00 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -35,6 +35,7 @@ struct kvm_zdev {
>   	struct kvm_zdev_ioat ioat;
>   	struct zpci_fib fib;
>   	struct notifier_block nb;
> +	bool interp;

NIT: s/interp/interpretation/ ?

>   };
>   
>   int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index fc57d4d0abbe..2b2d64a2190c 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -1172,6 +1172,8 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
>   			mutex_unlock(&vdev->vf_token->lock);
>   
>   			return 0;
> +		case VFIO_DEVICE_FEATURE_ZPCI_INTERP:
> +			return vfio_pci_zdev_feat_interp(vdev, feature, arg);
>   		default:
>   			return -ENOTTY;
>   		}
> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c b/drivers/vfio/pci/vfio_pci_zdev.c
> index 5c2bddc57b39..4339f48b98bc 100644
> --- a/drivers/vfio/pci/vfio_pci_zdev.c
> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
> @@ -54,6 +54,10 @@ static int zpci_group_cap(struct zpci_dev *zdev, struct vfio_info_cap *caps)
>   		.version = zdev->version
>   	};
>   
> +	/* Some values are different for interpreted devices */
> +	if (zdev->kzdev && zdev->kzdev->interp)
> +		cap.maxstbl = zdev->maxstbl;
> +
>   	return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
>   }
>   
> @@ -138,6 +142,72 @@ int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
>   	return ret;
>   }
>   
> +int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> +			      struct vfio_device_feature feature,
> +			      unsigned long arg)
> +{
> +	struct zpci_dev *zdev = to_zpci(vdev->pdev);
> +	struct vfio_device_zpci_interp *data;
> +	struct vfio_device_feature *feat;
> +	unsigned long minsz;
> +	int size, rc;
> +
> +	if (!zdev || !zdev->kzdev)
> +		return -EINVAL;
> +
> +	/* If PROBE specified, return probe results immediately */
> +	if (feature.flags & VFIO_DEVICE_FEATURE_PROBE)
> +		return kvm_s390_pci_interp_probe(zdev);
> +
> +	/* GET and SET are mutually exclusive */
> +	if ((feature.flags & VFIO_DEVICE_FEATURE_GET) &&
> +	    (feature.flags & VFIO_DEVICE_FEATURE_SET))
> +		return -EINVAL;

Isn't the check already done in VFIO core?

> +
> +	size = sizeof(*feat) + sizeof(*data);
> +	feat = kzalloc(size, GFP_KERNEL);
> +	if (!feat)
> +		return -ENOMEM;
> +
> +	data = (struct vfio_device_zpci_interp *)&feat->data;
> +	minsz = offsetofend(struct vfio_device_feature, flags);
> +
> +	if (feature.argsz < minsz + sizeof(*data))
> +		return -EINVAL;
> +
> +	/* Get the rest of the payload for GET/SET */
> +	rc = copy_from_user(data, (void __user *)(arg + minsz),
> +			    sizeof(*data));
> +	if (rc)
> +		rc = -EINVAL;
> +
> +	if (feature.flags & VFIO_DEVICE_FEATURE_GET) {
> +		if (zdev->gd != 0)
> +			data->flags = VFIO_DEVICE_ZPCI_FLAG_INTERP;
> +		else
> +			data->flags = 0;
> +		data->fh = zdev->fh;
> +		/* userspace is using host fh, give interpreted clp values */
> +		zdev->kzdev->interp = true;
> +
> +		if (copy_to_user((void __user *)arg, feat, size))
> +			rc = -EFAULT;
> +	} else if (feature.flags & VFIO_DEVICE_FEATURE_SET) {
> +		if (data->flags == VFIO_DEVICE_ZPCI_FLAG_INTERP)
> +			rc = kvm_s390_pci_interp_enable(zdev);
> +		else if (data->flags == 0)
> +			rc = kvm_s390_pci_interp_disable(zdev);
> +		else
> +			rc = -EINVAL;
> +	} else {
> +		/* Neither GET nor SET were specified */
> +		rc = -EINVAL;
> +	}
> +
> +	kfree(feat);
> +	return rc;
> +}
> +
>   static int vfio_pci_zdev_group_notifier(struct notifier_block *nb,
>   					unsigned long action, void *data)
>   {
> @@ -164,6 +234,7 @@ void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
>   		return;
>   
>   	zdev->kzdev->nb.notifier_call = vfio_pci_zdev_group_notifier;
> +	zdev->kzdev->interp = false;
>   
>   	if (vfio_register_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
>   				   &events, &zdev->kzdev->nb))
> @@ -180,5 +251,12 @@ void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev)
>   	vfio_unregister_notifier(vdev->vdev.dev, VFIO_GROUP_NOTIFY,
>   				 &zdev->kzdev->nb);
>   
> +	/*
> +	 * If the device was using interpretation, don't trust that userspace
> +	 * did the appropriate cleanup
> +	 */
> +	if (zdev->gd != 0)
> +		kvm_s390_pci_interp_disable(zdev);
> +
>   	kvm_s390_pci_dev_release(zdev);
>   }
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 05287f8ac855..0db2b1051931 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -198,6 +198,9 @@ static inline int vfio_pci_igd_init(struct vfio_pci_core_device *vdev)
>   #ifdef CONFIG_VFIO_PCI_ZDEV
>   extern int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
>   				       struct vfio_info_cap *caps);
> +int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> +			      struct vfio_device_feature feature,
> +			      unsigned long arg);
>   void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev);
>   void vfio_pci_zdev_release(struct vfio_pci_core_device *vdev);
>   #else
> @@ -207,6 +210,13 @@ static inline int vfio_pci_info_zdev_add_caps(struct vfio_pci_core_device *vdev,
>   	return -ENODEV;
>   }
>   
> +static inline int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
> +					    struct vfio_device_feature feature,
> +					    unsigned long arg)
> +{
> +	return -ENOTTY;
> +}
> +
>   static inline void vfio_pci_zdev_open(struct vfio_pci_core_device *vdev)
>   {
>   }
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index ef33ea002b0b..b9a75485b8e7 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1002,6 +1002,13 @@ struct vfio_device_feature {
>    */
>   #define VFIO_DEVICE_FEATURE_PCI_VF_TOKEN	(0)
>   
> +/*
> + * Provide support for enabling interpretation of zPCI instructions.  This
> + * feature is only valid for s390x PCI devices.  Data provided when setting
> + * and getting this feature is further described in vfio_zdev.h
> + */
> +#define VFIO_DEVICE_FEATURE_ZPCI_INTERP		(1)
> +
>   /* -------- API for Type1 VFIO IOMMU -------- */
>   
>   /**
> diff --git a/include/uapi/linux/vfio_zdev.h b/include/uapi/linux/vfio_zdev.h
> index b4309397b6b2..575f0410dc66 100644
> --- a/include/uapi/linux/vfio_zdev.h
> +++ b/include/uapi/linux/vfio_zdev.h
> @@ -75,4 +75,19 @@ struct vfio_device_info_cap_zpci_pfip {
>   	__u8 pfip[];
>   };
>   
> +/**
> + * VFIO_DEVICE_FEATURE_ZPCI_INTERP
> + *
> + * This feature is used for enabling zPCI instruction interpretation for a
> + * device.  No data is provided when setting this feature.  When getting
> + * this feature, the following structure is provided which details whether
> + * or not interpretation is active and provides the guest with host device
> + * information necessary to enable interpretation.
> + */
> +struct vfio_device_zpci_interp {
> +	__u64 flags;
> +#define VFIO_DEVICE_ZPCI_FLAG_INTERP 1
> +	__u32 fh;		/* Host device function handle */
> +};
> +
>   #endif
> 

Otherwise LGTM

-- 
Pierre Morel
IBM Lab Boeblingen

* Re: [PATCH v2 20/30] KVM: s390: pci: provide routines for enabling/disabling IOAT assist
  2022-01-14 20:31 ` [PATCH v2 20/30] KVM: s390: pci: provide routines for enabling/disabling IOAT assist Matthew Rosato
@ 2022-01-25 13:29   ` Pierre Morel
  2022-01-25 14:47     ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-25 13:29 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> These routines will be wired into the vfio_pci_zdev ioctl handlers to
> respond to requests to enable / disable a device for PCI I/O Address
> Translation assistance.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>   arch/s390/include/asm/kvm_pci.h |  15 ++++
>   arch/s390/include/asm/pci_dma.h |   2 +
>   arch/s390/kvm/pci.c             | 139 ++++++++++++++++++++++++++++++++
>   arch/s390/kvm/pci.h             |   2 +
>   4 files changed, 158 insertions(+)
> 
> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
> index 01fe14fffd7a..770849f13a70 100644
> --- a/arch/s390/include/asm/kvm_pci.h
> +++ b/arch/s390/include/asm/kvm_pci.h
> @@ -16,11 +16,21 @@
>   #include <linux/kvm_host.h>
>   #include <linux/kvm.h>
>   #include <linux/pci.h>
> +#include <linux/mutex.h>
>   #include <asm/pci_insn.h>
> +#include <asm/pci_dma.h>
> +
> +struct kvm_zdev_ioat {
> +	unsigned long *head[ZPCI_TABLE_PAGES];
> +	unsigned long **seg;
> +	unsigned long ***pt;
> +	struct mutex lock;

Can we please rename the mutex to ioat_lock, so that it has a unique name 
that is easy to follow for maintenance?
Can you please also add a description of when the lock should be used?

> +};
>   
>   struct kvm_zdev {
>   	struct zpci_dev *zdev;
>   	struct kvm *kvm;
> +	struct kvm_zdev_ioat ioat;
>   	struct zpci_fib fib;
>   };
>   
> @@ -33,6 +43,11 @@ int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
>   			    bool assist);
>   int kvm_s390_pci_aif_disable(struct zpci_dev *zdev);
>   
> +int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev);
> +int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota);
> +int kvm_s390_pci_ioat_disable(struct zpci_dev *zdev);
> +u8 kvm_s390_pci_get_dtsm(struct zpci_dev *zdev);
> +
>   int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
>   int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
>   int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
> diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
> index 91e63426bdc5..69e616d0712c 100644
> --- a/arch/s390/include/asm/pci_dma.h
> +++ b/arch/s390/include/asm/pci_dma.h
> @@ -50,6 +50,8 @@ enum zpci_ioat_dtype {
>   #define ZPCI_TABLE_ALIGN		ZPCI_TABLE_SIZE
>   #define ZPCI_TABLE_ENTRY_SIZE		(sizeof(unsigned long))
>   #define ZPCI_TABLE_ENTRIES		(ZPCI_TABLE_SIZE / ZPCI_TABLE_ENTRY_SIZE)
> +#define ZPCI_TABLE_PAGES		(ZPCI_TABLE_SIZE >> PAGE_SHIFT)
> +#define ZPCI_TABLE_ENTRIES_PAGES	(ZPCI_TABLE_ENTRIES * ZPCI_TABLE_PAGES)
>   
>   #define ZPCI_TABLE_BITS			11
>   #define ZPCI_PT_BITS			8
> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
> index 7ed9abc476b6..39c13c25a700 100644
> --- a/arch/s390/kvm/pci.c
> +++ b/arch/s390/kvm/pci.c
> @@ -13,12 +13,15 @@
>   #include <asm/pci.h>
>   #include <asm/pci_insn.h>
>   #include <asm/pci_io.h>
> +#include <asm/pci_dma.h>
>   #include <asm/sclp.h>
>   #include "pci.h"
>   #include "kvm-s390.h"
>   
>   struct zpci_aift *aift;
>   
> +#define shadow_ioat_init zdev->kzdev->ioat.head[0]
> +
>   static inline int __set_irq_noiib(u16 ctl, u8 isc)
>   {
>   	union zpci_sic_iib iib = {{0}};
> @@ -344,6 +347,135 @@ int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
>   }
>   EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_disable);
>   
> +int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev)
> +{
> +	/* Must have a KVM association registered */

Maybe add something like: "The ioat structure is embedded in kzdev"

> +	if (!zdev->kzdev || !zdev->kzdev->kvm)

Why do we need to check for kvm ?
The presence of kzdev is already tested by the only caller.

> +		return -EINVAL;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_probe);
> +
> +int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota)
> +{
> +	gpa_t gpa = (gpa_t)(iota & ZPCI_RTE_ADDR_MASK);
> +	struct kvm_zdev_ioat *ioat;
> +	struct page *page;
> +	struct kvm *kvm;
> +	unsigned int idx;
> +	void *iaddr;
> +	int i, rc = 0;

no need to initialize rc

> +
> +	if (shadow_ioat_init)
> +		return -EINVAL;
> +
> +	/* Ensure supported type specified */
> +	if ((iota & ZPCI_IOTA_RTTO_FLAG) != ZPCI_IOTA_RTTO_FLAG)
> +		return -EINVAL;
> +
> +	kvm = zdev->kzdev->kvm;
> +	ioat = &zdev->kzdev->ioat;
> +	mutex_lock(&ioat->lock);
> +	idx = srcu_read_lock(&kvm->srcu);
> +	for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
> +		page = gfn_to_page(kvm, gpa_to_gfn(gpa));
> +		if (is_error_page(page)) {
> +			srcu_read_unlock(&kvm->srcu, idx);
> +			rc = -EIO;
> +			goto out;

			goto unpin ?

> +		}
> +		iaddr = page_to_virt(page) + (gpa & ~PAGE_MASK);
> +		ioat->head[i] = (unsigned long *)iaddr;
> +		gpa += PAGE_SIZE;
> +	}
> +	srcu_read_unlock(&kvm->srcu, idx);
> +
> +	zdev->kzdev->ioat.seg = kcalloc(ZPCI_TABLE_ENTRIES_PAGES,
> +					sizeof(unsigned long *), GFP_KERNEL);

What about:

         ioat->seg = kcalloc(ZPCI_TABLE_ENTRIES_PAGES,
                             sizeof(*ioat->seg), GFP_KERNEL);
	if (!ioat->seg)
...
	ioat->pt = ...
?

> +	if (!zdev->kzdev->ioat.seg)
> +		goto unpin;
> +	zdev->kzdev->ioat.pt = kcalloc(ZPCI_TABLE_ENTRIES,
> +				       sizeof(unsigned long **), GFP_KERNEL);
> +	if (!zdev->kzdev->ioat.pt)
> +		goto free_seg;
> +
> +out:
> +	mutex_unlock(&ioat->lock);
> +	return rc;

	return 0 ?

> +
> +free_seg:
> +	kfree(zdev->kzdev->ioat.seg);

kfree(ioat->seg) ?
rc = -ENOMEM;

> +unpin:
> +	for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
> +		kvm_release_pfn_dirty((u64)ioat->head[i] >> PAGE_SHIFT);
> +		ioat->head[i] = 0;
> +	}
> +	mutex_unlock(&ioat->lock);
> +	return -ENOMEM;

	return rc;

> +}
...snip...

-- 
Pierre Morel
IBM Lab Boeblingen

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support
  2022-01-25 12:36   ` Pierre Morel
@ 2022-01-25 14:16     ` Matthew Rosato
  2022-01-26  8:24       ` Pierre Morel
  0 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-25 14:16 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/25/22 7:36 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> Introduce support for VFIO_DEVICE_FEATURE_ZPCI_AIF, which is a new
>> VFIO_DEVICE_FEATURE ioctl.  This interface is used to indicate that an
>> s390x vfio-pci device wishes to enable/disable zPCI adapter interrupt
>> forwarding, which allows underlying firmware to deliver interrupts
>> directly to the associated kvm guest.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/kvm_pci.h  |  2 +
>>   drivers/vfio/pci/vfio_pci_core.c |  2 +
>>   drivers/vfio/pci/vfio_pci_zdev.c | 98 +++++++++++++++++++++++++++++++-
>>   include/linux/vfio_pci_core.h    | 10 ++++
>>   include/uapi/linux/vfio.h        |  7 +++
>>   include/uapi/linux/vfio_zdev.h   | 20 +++++++
>>   6 files changed, 138 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/s390/include/asm/kvm_pci.h 
>> b/arch/s390/include/asm/kvm_pci.h
>> index dc00c3f27a00..dbab349a4a75 100644
>> --- a/arch/s390/include/asm/kvm_pci.h
>> +++ b/arch/s390/include/asm/kvm_pci.h
>> @@ -36,6 +36,8 @@ struct kvm_zdev {
>>       struct zpci_fib fib;
>>       struct notifier_block nb;
>>       bool interp;
>> +    bool aif;
>> +    bool fhost;
> 
> Can we please have a comment on these booleans?
> Can we have explicit naming to be able to follow their usage more easily?
> Maybe aif_float and aif_host to match with the VFIO feature?

Sure, rename would be fine.

As for a comment, maybe something like

bool aif_float; /* Enabled for floating interrupt assist */
bool aif_host;  /* Require host delivery */

...

>> diff --git a/include/uapi/linux/vfio_zdev.h 
>> b/include/uapi/linux/vfio_zdev.h
>> index 575f0410dc66..c574e23f9385 100644
>> --- a/include/uapi/linux/vfio_zdev.h
>> +++ b/include/uapi/linux/vfio_zdev.h
>> @@ -90,4 +90,24 @@ struct vfio_device_zpci_interp {
>>       __u32 fh;        /* Host device function handle */
>>   };
>> +/**
>> + * VFIO_DEVICE_FEATURE_ZPCI_AIF
>> + *
>> + * This feature is used for enabling forwarding of adapter interrupts 
>> directly
>> + * from firmware to the guest.  When setting this feature, the flags 
>> indicate
>> + * whether to enable/disable the feature and the structure defined 
>> below is
>> + * used to setup the forwarding structures.  When getting this 
>> feature, only
>> + * the flags are used to indicate the current state.
>> + */
>> +struct vfio_device_zpci_aif {
>> +    __u64 flags;
>> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
>> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST 2
> 
> I think we need more information on these flags.
> What does AIF_FLOAT do and what does AIF_HOST do?
> 

You actually asked for this already on Jan 19 :), here's a copy of that 
response inline here:

I can add a small line comment for each, like:

  AIF_FLOAT 1 /* Floating interrupts enabled */
  AIF_HOST 2  /* Host delivery forced */

But here's a bit more detail:

On SET:
AIF_FLOAT = 1 means enable the interrupt forwarding assist for floating
interrupt delivery
AIF_FLOAT = 0 means disable it
AIF_HOST = 1 means the assist will always deliver the interrupt to the
host and let the host inject it
AIF_HOST = 0 means the host only gets interrupts when firmware can't
deliver them

On GET, we just indicate the current settings from the most recent SET,
meaning:
AIF_FLOAT = 1 means the interrupt forwarding assist is currently active
AIF_FLOAT = 0 means the interrupt forwarding assist is not currently active
AIF_HOST = 1 means interrupt forwarding will always go through the host
AIF_HOST = 0 means interrupt forwarding will only go through the host when
necessary

My thought would be to add the line comments in this patch and then the
additional detail in a follow-on patch that adds vfio zPCI to
Documentation/S390
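
For illustration only, the flag semantics described above can be sketched as a small userspace decoder. The two flag values are the ones proposed in this patch; struct aif_settings and aif_decode_flags() are hypothetical names invented for this sketch and are not part of the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag bits as proposed for struct vfio_device_zpci_aif in this patch. */
#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1 /* Floating interrupts enabled */
#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST  2 /* Host delivery forced */

/* Hypothetical decoded view of the flags word; not part of the patch. */
struct aif_settings {
	bool assist_enabled; /* forwarding assist requested (SET) or active (GET) */
	bool always_host;    /* every interrupt is routed through the host */
};

/* Split the flags word into the two independent settings. */
void aif_decode_flags(uint64_t flags, struct aif_settings *out)
{
	assert(out != NULL);
	out->assist_enabled = (flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT) != 0;
	out->always_host = (flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST) != 0;
}
```

The two bits are decoded independently here; whether AIF_HOST is meaningful without AIF_FLOAT is not stated in this thread, so the sketch takes no position on it.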


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 25/30] vfio-pci/zdev: wire up zPCI interpretive execution support
  2022-01-25 13:01   ` Pierre Morel
@ 2022-01-25 14:21     ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-25 14:21 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/25/22 8:01 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> Introduce support for VFIO_DEVICE_FEATURE_ZPCI_INTERP, which is a new
>> VFIO_DEVICE_FEATURE ioctl.  This interface is used to indicate that an
>> s390x vfio-pci device wishes to enable/disable zPCI interpretive
>> execution, which allows zPCI instructions to be executed directly by
>> underlying firmware without KVM involvement.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/kvm_pci.h  |  1 +
>>   drivers/vfio/pci/vfio_pci_core.c |  2 +
>>   drivers/vfio/pci/vfio_pci_zdev.c | 78 ++++++++++++++++++++++++++++++++
>>   include/linux/vfio_pci_core.h    | 10 ++++
>>   include/uapi/linux/vfio.h        |  7 +++
>>   include/uapi/linux/vfio_zdev.h   | 15 ++++++
>>   6 files changed, 113 insertions(+)
>>
>> diff --git a/arch/s390/include/asm/kvm_pci.h 
>> b/arch/s390/include/asm/kvm_pci.h
>> index 97a90b37c87d..dc00c3f27a00 100644
>> --- a/arch/s390/include/asm/kvm_pci.h
>> +++ b/arch/s390/include/asm/kvm_pci.h
>> @@ -35,6 +35,7 @@ struct kvm_zdev {
>>       struct kvm_zdev_ioat ioat;
>>       struct zpci_fib fib;
>>       struct notifier_block nb;
>> +    bool interp;
> 
> NIT: s/interp/interpretation/ ?

OK

> 
>>   };
>>   int kvm_s390_pci_dev_open(struct zpci_dev *zdev);
>> diff --git a/drivers/vfio/pci/vfio_pci_core.c 
>> b/drivers/vfio/pci/vfio_pci_core.c
>> index fc57d4d0abbe..2b2d64a2190c 100644
>> --- a/drivers/vfio/pci/vfio_pci_core.c
>> +++ b/drivers/vfio/pci/vfio_pci_core.c
>> @@ -1172,6 +1172,8 @@ long vfio_pci_core_ioctl(struct vfio_device 
>> *core_vdev, unsigned int cmd,
>>               mutex_unlock(&vdev->vf_token->lock);
>>               return 0;
>> +        case VFIO_DEVICE_FEATURE_ZPCI_INTERP:
>> +            return vfio_pci_zdev_feat_interp(vdev, feature, arg);
>>           default:
>>               return -ENOTTY;
>>           }
>> diff --git a/drivers/vfio/pci/vfio_pci_zdev.c 
>> b/drivers/vfio/pci/vfio_pci_zdev.c
>> index 5c2bddc57b39..4339f48b98bc 100644
>> --- a/drivers/vfio/pci/vfio_pci_zdev.c
>> +++ b/drivers/vfio/pci/vfio_pci_zdev.c
>> @@ -54,6 +54,10 @@ static int zpci_group_cap(struct zpci_dev *zdev, 
>> struct vfio_info_cap *caps)
>>           .version = zdev->version
>>       };
>> +    /* Some values are different for interpreted devices */
>> +    if (zdev->kzdev && zdev->kzdev->interp)
>> +        cap.maxstbl = zdev->maxstbl;
>> +
>>       return vfio_info_add_capability(caps, &cap.header, sizeof(cap));
>>   }
>> @@ -138,6 +142,72 @@ int vfio_pci_info_zdev_add_caps(struct 
>> vfio_pci_core_device *vdev,
>>       return ret;
>>   }
>> +int vfio_pci_zdev_feat_interp(struct vfio_pci_core_device *vdev,
>> +                  struct vfio_device_feature feature,
>> +                  unsigned long arg)
>> +{
>> +    struct zpci_dev *zdev = to_zpci(vdev->pdev);
>> +    struct vfio_device_zpci_interp *data;
>> +    struct vfio_device_feature *feat;
>> +    unsigned long minsz;
>> +    int size, rc;
>> +
>> +    if (!zdev || !zdev->kzdev)
>> +        return -EINVAL;
>> +
>> +    /* If PROBE specified, return probe results immediately */
>> +    if (feature.flags & VFIO_DEVICE_FEATURE_PROBE)
>> +        return kvm_s390_pci_interp_probe(zdev);
>> +
>> +    /* GET and SET are mutually exclusive */
>> +    if ((feature.flags & VFIO_DEVICE_FEATURE_GET) &&
>> +        (feature.flags & VFIO_DEVICE_FEATURE_SET))
>> +        return -EINVAL;
> 
> Isn't the check already done in VFIO core?

Oh, yes you are correct.  Then this can be removed for this patch as 
well as the next 2 patches.



^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 20/30] KVM: s390: pci: provide routines for enabling/disabling IOAT assist
  2022-01-25 13:29   ` Pierre Morel
@ 2022-01-25 14:47     ` Matthew Rosato
  2022-01-26  8:30       ` Pierre Morel
  0 siblings, 1 reply; 97+ messages in thread
From: Matthew Rosato @ 2022-01-25 14:47 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/25/22 8:29 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> These routines will be wired into the vfio_pci_zdev ioctl handlers to
>> respond to requests to enable / disable a device for PCI I/O Address
>> Translation assistance.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/kvm_pci.h |  15 ++++
>>   arch/s390/include/asm/pci_dma.h |   2 +
>>   arch/s390/kvm/pci.c             | 139 ++++++++++++++++++++++++++++++++
>>   arch/s390/kvm/pci.h             |   2 +
>>   4 files changed, 158 insertions(+)
>>
>> diff --git a/arch/s390/include/asm/kvm_pci.h 
>> b/arch/s390/include/asm/kvm_pci.h
>> index 01fe14fffd7a..770849f13a70 100644
>> --- a/arch/s390/include/asm/kvm_pci.h
>> +++ b/arch/s390/include/asm/kvm_pci.h
>> @@ -16,11 +16,21 @@
>>   #include <linux/kvm_host.h>
>>   #include <linux/kvm.h>
>>   #include <linux/pci.h>
>> +#include <linux/mutex.h>
>>   #include <asm/pci_insn.h>
>> +#include <asm/pci_dma.h>
>> +
>> +struct kvm_zdev_ioat {
>> +    unsigned long *head[ZPCI_TABLE_PAGES];
>> +    unsigned long **seg;
>> +    unsigned long ***pt;
>> +    struct mutex lock;
> 
> Can we please rename the mutex ioat_lock to have a unique name easy to 
> follow for maintenance.
> Can you please add a description about when the lock should be used?
> 

OK.  The lock is meant to protect the contents of kvm_zdev_ioat -- I'll 
think of something to describe it.

>> +};
>>   struct kvm_zdev {
>>       struct zpci_dev *zdev;
>>       struct kvm *kvm;
>> +    struct kvm_zdev_ioat ioat;
>>       struct zpci_fib fib;
>>   };
>> @@ -33,6 +43,11 @@ int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, 
>> struct zpci_fib *fib,
>>                   bool assist);
>>   int kvm_s390_pci_aif_disable(struct zpci_dev *zdev);
>> +int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev);
>> +int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota);
>> +int kvm_s390_pci_ioat_disable(struct zpci_dev *zdev);
>> +u8 kvm_s390_pci_get_dtsm(struct zpci_dev *zdev);
>> +
>>   int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
>>   int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
>>   int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
>> diff --git a/arch/s390/include/asm/pci_dma.h 
>> b/arch/s390/include/asm/pci_dma.h
>> index 91e63426bdc5..69e616d0712c 100644
>> --- a/arch/s390/include/asm/pci_dma.h
>> +++ b/arch/s390/include/asm/pci_dma.h
>> @@ -50,6 +50,8 @@ enum zpci_ioat_dtype {
>>   #define ZPCI_TABLE_ALIGN        ZPCI_TABLE_SIZE
>>   #define ZPCI_TABLE_ENTRY_SIZE        (sizeof(unsigned long))
>>   #define ZPCI_TABLE_ENTRIES        (ZPCI_TABLE_SIZE / 
>> ZPCI_TABLE_ENTRY_SIZE)
>> +#define ZPCI_TABLE_PAGES        (ZPCI_TABLE_SIZE >> PAGE_SHIFT)
>> +#define ZPCI_TABLE_ENTRIES_PAGES    (ZPCI_TABLE_ENTRIES * 
>> ZPCI_TABLE_PAGES)
>>   #define ZPCI_TABLE_BITS            11
>>   #define ZPCI_PT_BITS            8
>> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
>> index 7ed9abc476b6..39c13c25a700 100644
>> --- a/arch/s390/kvm/pci.c
>> +++ b/arch/s390/kvm/pci.c
>> @@ -13,12 +13,15 @@
>>   #include <asm/pci.h>
>>   #include <asm/pci_insn.h>
>>   #include <asm/pci_io.h>
>> +#include <asm/pci_dma.h>
>>   #include <asm/sclp.h>
>>   #include "pci.h"
>>   #include "kvm-s390.h"
>>   struct zpci_aift *aift;
>> +#define shadow_ioat_init zdev->kzdev->ioat.head[0]
>> +
>>   static inline int __set_irq_noiib(u16 ctl, u8 isc)
>>   {
>>       union zpci_sic_iib iib = {{0}};
>> @@ -344,6 +347,135 @@ int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
>>   }
>>   EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_disable);
>> +int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev)
>> +{
>> +    /* Must have a KVM association registered */
> 
> Maybe add something like: "The ioat structure is embedded in kzdev"
> 
>> +    if (!zdev->kzdev || !zdev->kzdev->kvm)
> 
> Why do we need to check for kvm ?
> Having kzdev is already tested by the unique caller.
> 

We probably don't need to check for the kzdev because the caller already
did this, agreed there.

But as for checking the kvm association, Alex asked for this in a
comment on v1 (the comment was against one of the vfio patches that call
these routines).  The reason is that the probe comes from a userspace
request and can be issued against any vfio-pci(-zdev) device at any time,
and there's no point in proceeding if the device is not associated with a
KVM guest.  It's also possible for the KVM notifier to pass a null KVM
address, so I think it's better to just be sure here.  In a well-behaved
environment we would never see this (so, another case for an s390dbf
entry).

>> +        return -EINVAL;
>> +
>> +    return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_ioat_probe);
>> +
>> +int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota)
>> +{
>> +    gpa_t gpa = (gpa_t)(iota & ZPCI_RTE_ADDR_MASK);
>> +    struct kvm_zdev_ioat *ioat;
>> +    struct page *page;
>> +    struct kvm *kvm;
>> +    unsigned int idx;
>> +    void *iaddr;
>> +    int i, rc = 0;
> 
> no need to initialize rc

Agree based on the changes below

> 
>> +
>> +    if (shadow_ioat_init)
>> +        return -EINVAL;
>> +
>> +    /* Ensure supported type specified */
>> +    if ((iota & ZPCI_IOTA_RTTO_FLAG) != ZPCI_IOTA_RTTO_FLAG)
>> +        return -EINVAL;
>> +
>> +    kvm = zdev->kzdev->kvm;
>> +    ioat = &zdev->kzdev->ioat;
>> +    mutex_lock(&ioat->lock);
>> +    idx = srcu_read_lock(&kvm->srcu);
>> +    for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
>> +        page = gfn_to_page(kvm, gpa_to_gfn(gpa));
>> +        if (is_error_page(page)) {
>> +            srcu_read_unlock(&kvm->srcu, idx);
>> +            rc = -EIO;
>> +            goto out;
> 
>              goto unpin ?

Ah, right, in case we hit this error somewhere in the middle of the loop.

> 
>> +        }
>> +        iaddr = page_to_virt(page) + (gpa & ~PAGE_MASK);
>> +        ioat->head[i] = (unsigned long *)iaddr;
>> +        gpa += PAGE_SIZE;
>> +    }
>> +    srcu_read_unlock(&kvm->srcu, idx);
>> +
>> +    zdev->kzdev->ioat.seg = kcalloc(ZPCI_TABLE_ENTRIES_PAGES,
>> +                    sizeof(unsigned long *), GFP_KERNEL);
> 
> What about:
> 
>          ioat->seg = kcalloc(ZPCI_TABLE_ENTRIES_PAGES,
>                              sizeof(*ioat->seg), GFP_KERNEL);
>      if (!ioat->seg)
> ...
>      ioat->pt = ...
> ?

Yep, would be fine (seems I forgot about the local *ioat here)

> 
>> +    if (!zdev->kzdev->ioat.seg)
>> +        goto unpin;
>> +    zdev->kzdev->ioat.pt = kcalloc(ZPCI_TABLE_ENTRIES,
>> +                       sizeof(unsigned long **), GFP_KERNEL);
>> +    if (!zdev->kzdev->ioat.pt)
>> +        goto free_seg;
>> +
>> +out:
>> +    mutex_unlock(&ioat->lock);
>> +    return rc;
> 
>      return 0 ?

Yes, we can do that now that we don't goto out: after is_error_page

> 
>> +
>> +free_seg:
>> +    kfree(zdev->kzdev->ioat.seg);
> 
> kfree(ioat->seg) ?
> rc = -ENOMEM;
> 
>> +unpin:
>> +    for (i = 0; i < ZPCI_TABLE_PAGES; i++) {
>> +        kvm_release_pfn_dirty((u64)ioat->head[i] >> PAGE_SHIFT);
>> +        ioat->head[i] = 0;
>> +    }
>> +    mutex_unlock(&ioat->lock);
>> +    return -ENOMEM;
> 
>      return rc;

And yes, agreed, now that we come here for other reasons (-EIO) we must 
return rc here and also set rc=-ENOMEM as you say for the kfree case above.
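
As a rough userspace sketch of the error handling agreed on above (not the actual rework): every failure sets rc and jumps to a label that undoes only what came before it, and the success path returns 0 directly. Everything here is hypothetical: malloc()/free() stand in for pinning and releasing guest pages, fail_at is a test hook to force a failure at a given step, and the structure is assumed to start out zeroed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_TABLE_PAGES 4 /* hypothetical stand-in for ZPCI_TABLE_PAGES */

struct sketch_ioat {
	void *head[SKETCH_TABLE_PAGES]; /* "pinned" pages */
	void *seg;
	void *pt;
};

/*
 * fail_at: 0..SKETCH_TABLE_PAGES-1 fails that page, SKETCH_TABLE_PAGES
 * fails the seg allocation, SKETCH_TABLE_PAGES + 1 fails the pt
 * allocation, -1 forces no failure.
 */
int sketch_ioat_enable(struct sketch_ioat *ioat, int fail_at)
{
	int i, rc;

	assert(ioat != NULL);
	if (ioat->head[0])
		return -EINVAL; /* already enabled */

	for (i = 0; i < SKETCH_TABLE_PAGES; i++) {
		ioat->head[i] = (fail_at == i) ? NULL : malloc(16);
		if (!ioat->head[i]) {
			rc = -EIO;
			goto unpin; /* release the pages pinned so far */
		}
	}

	ioat->seg = (fail_at == SKETCH_TABLE_PAGES) ?
		    NULL : calloc(8, sizeof(void *));
	if (!ioat->seg) {
		rc = -ENOMEM;
		goto unpin;
	}
	ioat->pt = (fail_at == SKETCH_TABLE_PAGES + 1) ?
		   NULL : calloc(8, sizeof(void *));
	if (!ioat->pt) {
		rc = -ENOMEM;
		goto free_seg;
	}
	return 0;

free_seg:
	free(ioat->seg);
	ioat->seg = NULL;
unpin:
	for (i = 0; i < SKETCH_TABLE_PAGES; i++) {
		free(ioat->head[i]); /* free(NULL) is a no-op */
		ioat->head[i] = NULL;
	}
	return rc;
}
```

In this sketch the unpin loop is safe for slots that were never filled because free(NULL) does nothing; the kernel version would need its own guard for head[] entries that were never pinned.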


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 15/30] KVM: s390: pci: do initial setup for AEN interpretation
  2022-01-25 12:23   ` Pierre Morel
@ 2022-01-25 14:57     ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-25 14:57 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/25/22 7:23 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
>> Initial setup for Adapter Event Notification Interpretation for zPCI
>> passthrough devices.  Specifically, allocate a structure for 
>> forwarding of
>> adapter events and pass the address of this structure to firmware.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   arch/s390/include/asm/pci.h      |   4 +
>>   arch/s390/include/asm/pci_insn.h |  12 +++
>>   arch/s390/kvm/interrupt.c        |  14 +++
>>   arch/s390/kvm/kvm-s390.c         |   9 ++
>>   arch/s390/kvm/pci.c              | 144 +++++++++++++++++++++++++++++++
>>   arch/s390/kvm/pci.h              |  42 +++++++++
>>   arch/s390/pci/pci.c              |   6 ++
>>   7 files changed, 231 insertions(+)
>>   create mode 100644 arch/s390/kvm/pci.h
>>
> ...snip...
> 
>> new file mode 100644
>> index 000000000000..b2000ed7b8c3
>> --- /dev/null
>> +++ b/arch/s390/kvm/pci.h
>> @@ -0,0 +1,42 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * s390 kvm PCI passthrough support
>> + *
>> + * Copyright IBM Corp. 2021
>> + *
>> + *    Author(s): Matthew Rosato <mjrosato@linux.ibm.com>
>> + */
>> +
>> +#ifndef __KVM_S390_PCI_H
>> +#define __KVM_S390_PCI_H
>> +
>> +#include <linux/pci.h>
>> +#include <linux/mutex.h>
>> +#include <asm/airq.h>
>> +#include <asm/kvm_pci.h>
>> +
>> +struct zpci_gaite {
>> +    u32 gisa;
>> +    u8 gisc;
>> +    u8 count;
>> +    u8 reserved;
>> +    u8 aisbo;
>> +    u64 aisb;
>> +};
>> +
>> +struct zpci_aift {
>> +    struct zpci_gaite *gait;
>> +    struct airq_iv *sbv;
>> +    struct kvm_zdev **kzdev;
>> +    spinlock_t gait_lock; /* Protects the gait, used during AEN 
>> forward */
>> +    struct mutex lock; /* Protects the other structures in aift */
> 
> To facilitate review and debug, can we please rename the lock aift_lock?
> 
> 

OK, sure


^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 19/30] KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding
  2022-01-25 12:41   ` Pierre Morel
@ 2022-01-25 15:44     ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-25 15:44 UTC (permalink / raw)
  To: Pierre Morel, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On 1/25/22 7:41 AM, Pierre Morel wrote:
> 
> 
> On 1/14/22 21:31, Matthew Rosato wrote:
...
>> +/* Modify PCI: Register floating adapter interruption forwarding */
>> +static int kvm_zpci_set_airq(struct zpci_dev *zdev)
>> +{
>> +    u64 req = ZPCI_CREATE_REQ(zdev->fh, 0, ZPCI_MOD_FC_REG_INT);
>> +    struct zpci_fib fib = {0};
> 
> I prefer {} instead of {0}; even if it does the same, {0} looks wrong to me.
>

OK

...

>> +int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
>> +                bool assist)
>> +{
>> +    struct page *aibv_page, *aisb_page = NULL;
>> +    unsigned int msi_vecs, idx;
>> +    struct zpci_gaite *gaite;
>> +    unsigned long bit;
>> +    struct kvm *kvm;
>> +    phys_addr_t gaddr;
>> +    int rc = 0;
>> +
>> +    /*
>> +     * Interrupt forwarding is only applicable if the device is already
>> +     * enabled for interpretation
>> +     */
>> +    if (zdev->gd == 0)
>> +        return -EINVAL;
>> +
>> +    kvm = zdev->kzdev->kvm;
>> +    msi_vecs = min_t(unsigned int, fib->fmt0.noi, zdev->max_msi);
>> +
>> +    /* Replace AIBV address */
>> +    idx = srcu_read_lock(&kvm->srcu);
>> +    aibv_page = gfn_to_page(kvm, gpa_to_gfn((gpa_t)fib->fmt0.aibv));
>> +    srcu_read_unlock(&kvm->srcu, idx);
>> +    if (is_error_page(aibv_page)) {
>> +        rc = -EIO;
>> +        goto out;
>> +    }
>> +    gaddr = page_to_phys(aibv_page) + (fib->fmt0.aibv & ~PAGE_MASK);
>> +    fib->fmt0.aibv = gaddr;
>> +
>> +    /* Pin the guest AISB if one was specified */
>> +    if (fib->fmt0.sum == 1) {
>> +        idx = srcu_read_lock(&kvm->srcu);
>> +        aisb_page = gfn_to_page(kvm, gpa_to_gfn((gpa_t)fib->fmt0.aisb));
>> +        srcu_read_unlock(&kvm->srcu, idx);
>> +        if (is_error_page(aisb_page)) {
>> +            rc = -EIO;
>> +            goto unpin1;
>> +        }
>> +    }
>> +
>> +    /* AISB must be allocated before we can fill in GAITE */
>> +    mutex_lock(&aift->lock);
>> +    bit = airq_iv_alloc_bit(aift->sbv);
>> +    if (bit == -1UL)
>> +        goto unpin2;
>> +    zdev->aisb = bit;
> 
> aisb here is the aisb offset right?

Yes

> Then maybe add a comment, since in the gait and in fmt0 aisb is an address.

Sure, good point
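
To make the offset-versus-address distinction concrete, here is a minimal sketch of the arithmetic the patch uses when it updates the guest FIB (the "& 63" and "/ 64 * 8" expressions further down): zdev->aisb is a bit index into the summary bit vector, and from it the patch derives the 64-bit word that holds the bit plus the bit position inside that word. struct aisb_location and aisb_locate() are hypothetical names for this sketch only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type; the patch open-codes these two values. */
struct aisb_location {
	uint64_t byte_offset; /* byte offset of the 64-bit word in the vector */
	uint8_t bit_in_word;  /* bit position within that word (fmt0.aisbo) */
};

/* Split a summary-bit index into word offset and bit-within-word. */
struct aisb_location aisb_locate(uint64_t bit)
{
	struct aisb_location loc;

	loc.byte_offset = (bit / 64) * 8;
	loc.bit_in_word = (uint8_t)(bit & 63);
	return loc;
}
```

For example, bit index 130 lands in the third 64-bit word (byte offset 16) at bit position 2.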

> 
>> +    zdev->aibv = airq_iv_create(msi_vecs, AIRQ_IV_DATA |
>> +                          AIRQ_IV_BITLOCK |
>> +                          AIRQ_IV_GUESTVEC,
>> +                    (unsigned long *)fib->fmt0.aibv);
> 
> phys_to_virt ?

Ugh, yep -- we just put the physical address in fib->fmt0.aibv a few 
lines earlier via page_to_phys

> 
>> +
>> +    spin_lock_irq(&aift->gait_lock);
>> +    gaite = (struct zpci_gaite *)aift->gait + (zdev->aisb *
>> +                           sizeof(struct zpci_gaite));
>> +
>> +    /* If assist not requested, host will get all alerts */
>> +    if (assist)
>> +        gaite->gisa = (u32)(u64)&kvm->arch.sie_page2->gisa;
> 
> virt_to_phys ?

Yes

> 
>> +    else
>> +        gaite->gisa = 0;
>> +
>> +    gaite->gisc = fib->fmt0.isc;
>> +    gaite->count++;
>> +    gaite->aisbo = fib->fmt0.aisbo;
>> +    gaite->aisb = virt_to_phys(page_address(aisb_page) + 
>> (fib->fmt0.aisb &
>> +                                  ~PAGE_MASK));
>> +    aift->kzdev[zdev->aisb] = zdev->kzdev;
>> +    spin_unlock_irq(&aift->gait_lock);
>> +
>> +    /* Update guest FIB for re-issue */
>> +    fib->fmt0.aisbo = zdev->aisb & 63;
>> +    fib->fmt0.aisb = virt_to_phys(aift->sbv->vector + (zdev->aisb / 
>> 64) * 8);
>> +    fib->fmt0.isc = kvm_s390_gisc_register(kvm, gaite->gisc);
>> +
>> +    /* Save some guest fib values in the host for later use */
>> +    zdev->kzdev->fib.fmt0.isc = fib->fmt0.isc;
>> +    zdev->kzdev->fib.fmt0.aibv = fib->fmt0.aibv;
>> +    mutex_unlock(&aift->lock);
>> +
>> +    /* Issue the clp to setup the irq now */
>> +    rc = kvm_zpci_set_airq(zdev);
>> +    return rc;
>> +
>> +unpin2:
>> +    mutex_unlock(&aift->lock);
>> +    if (fib->fmt0.sum == 1) {
>> +        gaddr = page_to_phys(aisb_page);
>> +        kvm_release_pfn_dirty(gaddr >> PAGE_SHIFT);
>> +    }
>> +unpin1:
>> +    kvm_release_pfn_dirty(fib->fmt0.aibv >> PAGE_SHIFT);
>> +out:
>> +    return rc;
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_enable);
>> +
>> +int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
>> +{
>> +    struct kvm_zdev *kzdev = zdev->kzdev;
>> +    struct zpci_gaite *gaite;
>> +    int rc;
>> +    u8 isc;
>> +
>> +    if (zdev->gd == 0)
>> +        return -EINVAL;
>> +
>> +    /* Even if the clear fails due to an error, clear the GAITE */
>> +    rc = kvm_zpci_clear_airq(zdev);
> 
> Having a look at kvm_zpci_clear_airq(), the only possible error seems to
> be when an error recovery is in progress.
> The errors the instruction returns for a wrong FH, for a function that no
> longer exists, or for interrupt vectors that are already deregistered are
> all reported as success by the function.
> 
> How can we be sure that we have no conflict with a recovery in progress?
> Shouldn't we in this case let the recovery process handle the function 
> and stop here?

Hmm -- So I think for a userspace-initiated call to this routine, yes. 
We could then assume recovery takes care of things.  However, we also 
call this routine from vfio-pci core when closing the device...

So then let's look at how this would work -- the current recovery action
for passthrough is always PCI_ERS_RESULT_DISCONNECT.  The process of
disconnecting the device will trigger vfio-pci to close its device,
which in turn will trigger vfio_pci_zdev_release(), which will in turn
also call kvm_s390_pci_aif_disable() as part of cleanup.  In this case,
however, we want to clear the GAITE even if kvm_zpci_clear_airq(zdev)
fails, because we know the device is for sure going away.

I think I need some sort of input to this routine that indicates we must 
cleanup (bool force or something) which would only be specified by the 
call from vfio_pci_zdev_release().
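
A minimal sketch of what such a force input could look like, purely to illustrate the idea. All names are hypothetical, the failing clear step is simulated with a flag, and -EIO is an arbitrary stand-in for whatever kvm_zpci_clear_airq() would really return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state touched by aif_disable. */
struct sketch_zdev {
	bool recovery_in_progress; /* makes the clear step fail */
	bool gaite_valid;          /* GAIT entry still populated */
};

/* Simulated clear step; the error value is an assumption. */
static int sketch_clear_airq(const struct sketch_zdev *zdev)
{
	return zdev->recovery_in_progress ? -EIO : 0;
}

/*
 * Without force, a failed clear leaves the GAITE alone so recovery can
 * deal with the function.  With force (the release path), the GAITE is
 * cleared regardless because the device is going away.
 */
int sketch_aif_disable(struct sketch_zdev *zdev, bool force)
{
	int rc;

	assert(zdev != NULL);
	rc = sketch_clear_airq(zdev);
	if (rc && !force)
		return rc;

	zdev->gaite_valid = false;
	return rc;
}
```

Whether the forced path should still report the clear failure to its caller, as this sketch does, is an open design choice rather than something settled in this thread.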

> 
> Doesn't taking the aift lock mutex after, and not before, the clear_irq
> open the door to a race condition with the recovery?

Good point.

> 
>> +
>> +    mutex_lock(&aift->lock);
>> +    if (zdev->kzdev->fib.fmt0.aibv == 0)
>> +        goto out;
>> +    spin_lock_irq(&aift->gait_lock);
>> +    gaite = (struct zpci_gaite *)aift->gait + (zdev->aisb *
>> +                           sizeof(struct zpci_gaite));
>> +    isc = gaite->gisc;
>> +    gaite->count--;
>> +    if (gaite->count == 0) {
>> +        /* Release guest AIBV and AISB */
>> +        kvm_release_pfn_dirty(kzdev->fib.fmt0.aibv >> PAGE_SHIFT);
>> +        if (gaite->aisb != 0)
>> +            kvm_release_pfn_dirty(gaite->aisb >> PAGE_SHIFT);
>> +        /* Clear the GAIT entry */
>> +        gaite->aisb = 0;
>> +        gaite->gisc = 0;
>> +        gaite->aisbo = 0;
>> +        gaite->gisa = 0;
>> +        aift->kzdev[zdev->aisb] = 0;
>> +        /* Clear zdev info */
>> +        airq_iv_free_bit(aift->sbv, zdev->aisb);
>> +        airq_iv_release(zdev->aibv);
>> +        zdev->aisb = 0;
>> +        zdev->aibv = NULL;
>> +    }
>> +    spin_unlock_irq(&aift->gait_lock);
>> +    kvm_s390_gisc_unregister(kzdev->kvm, isc);
> 
> Don't we need to check the return value?
> And maybe to report it to the caller?

Well, actually, I think we really need to look at the 
kvm_s390_gisc_register() call during aif_enable -- I unconditionally 
assigned it to the fib when in fact it can also return a negative error 
value (which I never check for) -- so I will re-arrange the code in 
aif_enable() to do that earlier using a local variable and leave on 
error in aif_enable if this fails.

kvm_s390_gisc_register() returns 2 possible errors, which are shared 
with gisc_unregister -- So with that change we will detect these errors 
(not using GISA, bad guest ISC) at aif_enable time.

So then for gisc_unregister we should really only possibly hit the 3rd 
error (guest ISC is not registered).  And if for some reason we hit that 
error at disable time, well, that's weird and unexpected (s390dbf?) but 
as far as userspace is concerned the GAITE is cleared and the gisc is 
unregistered, so I think we want to return success still to userspace. 
But we must do the checking at gisc_register() time and fail for the 
other cases there.
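
Roughly, the re-arrangement described above would look like the following userspace sketch: capture the register result in a local variable, bail out on a negative value, and only then store it in the FIB. sketch_gisc_register() is a hypothetical stand-in for kvm_s390_gisc_register(); the bound, the returned host ISC value and the single error case are assumptions made for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ISC   7 /* assumed upper bound for a guest ISC */
#define SKETCH_HOST_NISC 5 /* arbitrary host ISC handed back on success */

/* Hypothetical stand-in: host ISC on success, negative errno on failure. */
int sketch_gisc_register(unsigned int guest_isc)
{
	if (guest_isc > SKETCH_MAX_ISC)
		return -ERANGE;
	return SKETCH_HOST_NISC;
}

/* Only the FIB field relevant to this sketch. */
struct sketch_fib {
	uint8_t isc;
};

/* Check the result before it can overwrite the guest-provided ISC. */
int sketch_aif_enable(struct sketch_fib *fib)
{
	int hisc;

	assert(fib != NULL);
	hisc = sketch_gisc_register(fib->isc);
	if (hisc < 0)
		return hisc; /* fib is left untouched on error */

	fib->isc = (uint8_t)hisc;
	return 0;
}
```

With the check done at enable time, a later unregister should only be able to fail for the "not registered" case, which matches the reasoning above.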

^ permalink raw reply	[flat|nested] 97+ messages in thread

* Re: [PATCH v2 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support
  2022-01-25 14:16     ` Matthew Rosato
@ 2022-01-26  8:24       ` Pierre Morel
  0 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-26  8:24 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/25/22 15:16, Matthew Rosato wrote:
> On 1/25/22 7:36 AM, Pierre Morel wrote:
>>
>>
>> On 1/14/22 21:31, Matthew Rosato wrote:
>>> Introduce support for VFIO_DEVICE_FEATURE_ZPCI_AIF, which is a new
>>> VFIO_DEVICE_FEATURE ioctl.  This interface is used to indicate that an
>>> s390x vfio-pci device wishes to enable/disable zPCI adapter interrupt
>>> forwarding, which allows underlying firmware to deliver interrupts
>>> directly to the associated kvm guest.
>>>
>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>> ---
>>>   arch/s390/include/asm/kvm_pci.h  |  2 +
>>>   drivers/vfio/pci/vfio_pci_core.c |  2 +
>>>   drivers/vfio/pci/vfio_pci_zdev.c | 98 +++++++++++++++++++++++++++++++-
>>>   include/linux/vfio_pci_core.h    | 10 ++++
>>>   include/uapi/linux/vfio.h        |  7 +++
>>>   include/uapi/linux/vfio_zdev.h   | 20 +++++++
>>>   6 files changed, 138 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/s390/include/asm/kvm_pci.h 
>>> b/arch/s390/include/asm/kvm_pci.h
>>> index dc00c3f27a00..dbab349a4a75 100644
>>> --- a/arch/s390/include/asm/kvm_pci.h
>>> +++ b/arch/s390/include/asm/kvm_pci.h
>>> @@ -36,6 +36,8 @@ struct kvm_zdev {
>>>       struct zpci_fib fib;
>>>       struct notifier_block nb;
>>>       bool interp;
>>> +    bool aif;
>>> +    bool fhost;
>>
>> Can we please have a comment on these booleans?
>> Can we have explicit naming to be able to follow their usage more easily?
>> Maybe aif_float and aif_host to match with the VFIO feature?
> 
> Sure, rename would be fine.
> 
> As for a comment, maybe something like
> 
> bool aif_float; /* Enabled for floating interrupt assist */
> bool aif_host;  /* Require host delivery */

good for me.


> 
> ...
> 
>>> diff --git a/include/uapi/linux/vfio_zdev.h 
>>> b/include/uapi/linux/vfio_zdev.h
>>> index 575f0410dc66..c574e23f9385 100644
>>> --- a/include/uapi/linux/vfio_zdev.h
>>> +++ b/include/uapi/linux/vfio_zdev.h
>>> @@ -90,4 +90,24 @@ struct vfio_device_zpci_interp {
>>>       __u32 fh;        /* Host device function handle */
>>>   };
>>> +/**
>>> + * VFIO_DEVICE_FEATURE_ZPCI_AIF
>>> + *
>>> + * This feature is used for enabling forwarding of adapter 
>>> interrupts directly
>>> + * from firmware to the guest.  When setting this feature, the flags 
>>> indicate
>>> + * whether to enable/disable the feature and the structure defined 
>>> below is
>>> + * used to setup the forwarding structures.  When getting this 
>>> feature, only
>>> + * the flags are used to indicate the current state.
>>> + */
>>> +struct vfio_device_zpci_aif {
>>> +    __u64 flags;
>>> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1
>>> +#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST 2
>>
>> I think we need more information on these flags.
>> What does AIF_FLOAT do and what does AIF_HOST do?
>>
> 
> You actually asked for this already on Jan 19 :), here's a copy of that 
> response inline here:

:) I forgot

> 
> I can add a small line comment for each, like:
> 
>   AIF_FLOAT 1 /* Floating interrupts enabled */
>   AIF_HOST 2  /* Host delivery forced */
> 
> But here's a bit more detail:
> 
> On SET:
> AIF_FLOAT = 1 means enable the interrupt forwarding assist for floating 
> interrupt delivery
> AIF_FLOAT = 0 means to disable it.
> AIF_HOST = 1 means the assist will always deliver the interrupt to the 
> host and let the host inject it
> AIF_HOST = 0 host only gets interrupts when firmware can't deliver
> 
> on GET, we just indicate the current settings from the most recent SET, 
> meaning:
> AIF_FLOAT = 1 interrupt forwarding assist is currently active
> AIF_FLOAT = 0 interrupt forwarding assist is not currently active
> AIF_HOST = 1 interrupt forwarding will always go through host
> AIF_HOST = 0 interrupt forwarding will only go through the host when 
> necessary
> 
> My thought would be to add the line comments in this patch and then the 
> additional detail in a follow-on patch that adds vfio zPCI to 
> Documentation/s390
> 

good for me.
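
For readers following along, the SET/GET semantics quoted above could be sketched as a small user-space C helper. This is only an illustration under stated assumptions: the two flag values are taken from the quoted patch, while struct aif_state and the aif_decode()/aif_encode() helpers are hypothetical names that do not appear in the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag values as quoted in the patch under review */
#define VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT 1 /* Floating interrupts enabled */
#define VFIO_DEVICE_ZPCI_FLAG_AIF_HOST  2 /* Host delivery forced */

/* Hypothetical decoded view of the flags word (not part of the series) */
struct aif_state {
	bool assist_active;   /* AIF_FLOAT: forwarding assist is enabled */
	bool always_via_host; /* AIF_HOST: every interrupt goes through the host */
};

/* Decode a flags word as it would be reported on GET */
static struct aif_state aif_decode(uint64_t flags)
{
	struct aif_state s = {
		.assist_active = (flags & VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT) != 0,
		.always_via_host = (flags & VFIO_DEVICE_ZPCI_FLAG_AIF_HOST) != 0,
	};

	return s;
}

/* Build a flags word as it would be passed on SET */
static uint64_t aif_encode(struct aif_state s)
{
	uint64_t flags = 0;

	if (s.assist_active)
		flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_FLOAT;
	if (s.always_via_host)
		flags |= VFIO_DEVICE_ZPCI_FLAG_AIF_HOST;
	return flags;
}
```

With both bits clear the assist is off; with only AIF_FLOAT set, firmware delivers directly and the host only sees interrupts that firmware cannot deliver.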

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 20/30] KVM: s390: pci: provide routines for enabling/disabling IOAT assist
  2022-01-25 14:47     ` Matthew Rosato
@ 2022-01-26  8:30       ` Pierre Morel
  0 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-26  8:30 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/25/22 15:47, Matthew Rosato wrote:
> On 1/25/22 8:29 AM, Pierre Morel wrote:
>>
>>
>> On 1/14/22 21:31, Matthew Rosato wrote:
>>> These routines will be wired into the vfio_pci_zdev ioctl handlers to
>>> respond to requests to enable / disable a device for PCI I/O Address
>>> Translation assistance.
>>>
>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>> ---
>>>   arch/s390/include/asm/kvm_pci.h |  15 ++++
>>>   arch/s390/include/asm/pci_dma.h |   2 +
>>>   arch/s390/kvm/pci.c             | 139 ++++++++++++++++++++++++++++++++
>>>   arch/s390/kvm/pci.h             |   2 +
>>>   4 files changed, 158 insertions(+)
>>>
>>> diff --git a/arch/s390/include/asm/kvm_pci.h b/arch/s390/include/asm/kvm_pci.h
>>> index 01fe14fffd7a..770849f13a70 100644
>>> --- a/arch/s390/include/asm/kvm_pci.h
>>> +++ b/arch/s390/include/asm/kvm_pci.h
>>> @@ -16,11 +16,21 @@
>>>   #include <linux/kvm_host.h>
>>>   #include <linux/kvm.h>
>>>   #include <linux/pci.h>
>>> +#include <linux/mutex.h>
>>>   #include <asm/pci_insn.h>
>>> +#include <asm/pci_dma.h>
>>> +
>>> +struct kvm_zdev_ioat {
>>> +    unsigned long *head[ZPCI_TABLE_PAGES];
>>> +    unsigned long **seg;
>>> +    unsigned long ***pt;
>>> +    struct mutex lock;
>>
>> Can we please rename the mutex to ioat_lock, so that it has a unique name 
>> that is easy to follow for maintenance?
>> Can you please also add a description of when the lock should be used?
>>
> 
> OK.  The lock is meant to protect the contents of kvm_zdev_ioat -- I'll 
> think of something to describe it.
> 
>>> +};
>>>   struct kvm_zdev {
>>>       struct zpci_dev *zdev;
>>>       struct kvm *kvm;
>>> +    struct kvm_zdev_ioat ioat;
>>>       struct zpci_fib fib;
>>>   };
>>> @@ -33,6 +43,11 @@ int kvm_s390_pci_aif_enable(struct zpci_dev *zdev, struct zpci_fib *fib,
>>>                   bool assist);
>>>   int kvm_s390_pci_aif_disable(struct zpci_dev *zdev);
>>> +int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev);
>>> +int kvm_s390_pci_ioat_enable(struct zpci_dev *zdev, u64 iota);
>>> +int kvm_s390_pci_ioat_disable(struct zpci_dev *zdev);
>>> +u8 kvm_s390_pci_get_dtsm(struct zpci_dev *zdev);
>>> +
>>>   int kvm_s390_pci_interp_probe(struct zpci_dev *zdev);
>>>   int kvm_s390_pci_interp_enable(struct zpci_dev *zdev);
>>>   int kvm_s390_pci_interp_disable(struct zpci_dev *zdev);
>>> diff --git a/arch/s390/include/asm/pci_dma.h b/arch/s390/include/asm/pci_dma.h
>>> index 91e63426bdc5..69e616d0712c 100644
>>> --- a/arch/s390/include/asm/pci_dma.h
>>> +++ b/arch/s390/include/asm/pci_dma.h
>>> @@ -50,6 +50,8 @@ enum zpci_ioat_dtype {
>>>   #define ZPCI_TABLE_ALIGN        ZPCI_TABLE_SIZE
>>>   #define ZPCI_TABLE_ENTRY_SIZE        (sizeof(unsigned long))
>>>   #define ZPCI_TABLE_ENTRIES        (ZPCI_TABLE_SIZE / ZPCI_TABLE_ENTRY_SIZE)
>>> +#define ZPCI_TABLE_PAGES        (ZPCI_TABLE_SIZE >> PAGE_SHIFT)
>>> +#define ZPCI_TABLE_ENTRIES_PAGES    (ZPCI_TABLE_ENTRIES * ZPCI_TABLE_PAGES)
>>>   #define ZPCI_TABLE_BITS            11
>>>   #define ZPCI_PT_BITS            8
>>> diff --git a/arch/s390/kvm/pci.c b/arch/s390/kvm/pci.c
>>> index 7ed9abc476b6..39c13c25a700 100644
>>> --- a/arch/s390/kvm/pci.c
>>> +++ b/arch/s390/kvm/pci.c
>>> @@ -13,12 +13,15 @@
>>>   #include <asm/pci.h>
>>>   #include <asm/pci_insn.h>
>>>   #include <asm/pci_io.h>
>>> +#include <asm/pci_dma.h>
>>>   #include <asm/sclp.h>
>>>   #include "pci.h"
>>>   #include "kvm-s390.h"
>>>   struct zpci_aift *aift;
>>> +#define shadow_ioat_init zdev->kzdev->ioat.head[0]
>>> +
>>>   static inline int __set_irq_noiib(u16 ctl, u8 isc)
>>>   {
>>>       union zpci_sic_iib iib = {{0}};
>>> @@ -344,6 +347,135 @@ int kvm_s390_pci_aif_disable(struct zpci_dev *zdev)
>>>   }
>>>   EXPORT_SYMBOL_GPL(kvm_s390_pci_aif_disable);
>>> +int kvm_s390_pci_ioat_probe(struct zpci_dev *zdev)
>>> +{
>>> +    /* Must have a KVM association registered */
>>
>> maybe add something like: "The ioat structure is embedded in kzdev"
>>
>>> +    if (!zdev->kzdev || !zdev->kzdev->kvm)
>>
>> Why do we need to check for kvm?
>> The presence of kzdev is already tested by the only caller.
>>
> 
> We probably don't need to check for the kzdev because the caller already 
> did this, agreed there.
> 
> But as for checking the kvm association, Alex asked for this in a 
> comment to v1 (comment was against one of the vfio patches that call 
> these routines) -- The reason being the probe comes from a userspace 
> request and can be against any vfio-pci(-zdev) device at any time, and 
> there's no point in proceeding if this device is not associated with a 
> KVM guest -- It's possible for the KVM notifier to also pass a null KVM 
> address -- so I think it's better to just be sure here.  In a 
> well-behaved environment we would never see this (so, another case for 
> an s390dbf entry)

I thought the check could be done even if the userspace is not 
associated with KVM. But of course it is OK if Alex asked for it; I 
must have missed some point.
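
As a rough illustration of the guard being discussed, the check at the top of the probe routine could look like the user-space sketch below. The struct definitions are minimal stand-ins for the kernel types, ioat_probe_check() is a hypothetical name, and the -EINVAL return value is an assumption because the quoted hunk does not show what the routine returns.

```c
#include <errno.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel structures (assumed layout) */
struct kvm {
	int unused;
};

struct kvm_zdev {
	struct kvm *kvm;        /* may be NULL if the KVM notifier passed NULL */
};

struct zpci_dev {
	struct kvm_zdev *kzdev; /* NULL when no kzdev was ever created */
};

/*
 * Hypothetical helper: refuse to proceed unless the device has a kzdev
 * and that kzdev is associated with a KVM guest.
 * The -EINVAL error code is an assumption made for this sketch.
 */
static int ioat_probe_check(const struct zpci_dev *zdev)
{
	/* Must have a KVM association registered */
	if (!zdev->kzdev || !zdev->kzdev->kvm)
		return -EINVAL;

	return 0;
}
```

The second half of the condition is the part under debate: it covers a device that has a kzdev but whose kvm pointer was never set or was cleared again.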



-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 07/30] s390/pci: externalize the SIC operation controls and routine
  2022-01-14 20:31 ` [PATCH v2 07/30] s390/pci: externalize the SIC operation controls and routine Matthew Rosato
  2022-01-17 16:19   ` Niklas Schnelle
@ 2022-01-26 10:07   ` Claudio Imbrenda
  2022-01-27  9:57   ` Pierre Morel
  2 siblings, 0 replies; 97+ messages in thread
From: Claudio Imbrenda @ 2022-01-26 10:07 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: linux-s390, alex.williamson, cohuck, schnelle, farman, pmorel,
	borntraeger, hca, gor, gerald.schaefer, agordeev, frankja, david,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

On Fri, 14 Jan 2022 15:31:22 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> A subsequent patch will be issuing SIC from KVM -- export the necessary
> routine and make the operation control definitions available from a header.
> Because the routine will now be exported, let's rename __zpci_set_irq_ctrl
> to zpci_set_irq_ctrl and get rid of the zero'd iib wrapper function of
> the same name.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
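
To make the effect of dropping the wrapper concrete, here is a small user-space sketch of the new calling convention, in which every caller supplies its own zeroed iib. The union below is a simplified stand-in and not the real layout of union zpci_sic_iib, set_irq_ctrl() is a recording stub in place of the real instruction wrapper, and reenable_single() is a hypothetical caller.

```c
#include <stdint.h>

/* Operation controls as quoted in the patch */
#define SIC_IRQ_MODE_SINGLE	1
#define SIC_IRQ_MODE_D_SINGLE	17

/* Simplified stand-in; the real union zpci_sic_iib is laid out differently */
union sic_iib {
	struct {
		uint64_t dibv_addr;
		uint64_t reserved;
	} cdiib;
	uint64_t raw[2];
};

/* State recorded by the stub so the call can be inspected afterwards */
static uint16_t last_ctl;
static uint8_t last_isc;
static int last_iib_zero;

/* Stub with the three-argument signature; it issues no instruction */
static int set_irq_ctrl(uint16_t ctl, uint8_t isc, union sic_iib *iib)
{
	last_ctl = ctl;
	last_isc = isc;
	last_iib_zero = (iib->raw[0] == 0 && iib->raw[1] == 0);
	return 0;
}

/*
 * Hypothetical caller: what used to be a two-argument call through the
 * inline wrapper now declares and passes its own zeroed iib.
 */
static int reenable_single(uint8_t isc)
{
	union sic_iib iib = {{0}};

	return set_irq_ctrl(SIC_IRQ_MODE_SINGLE, isc, &iib);
}
```

Callers that already filled in an iib, such as the SET_CPU path in the quoted diff, are unaffected apart from the rename.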

> ---
>  arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
>  arch/s390/pci/pci_insn.c         |  3 ++-
>  arch/s390/pci/pci_irq.c          | 26 ++++++++++++--------------
>  3 files changed, 23 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> index 61cf9531f68f..5331082fa516 100644
> --- a/arch/s390/include/asm/pci_insn.h
> +++ b/arch/s390/include/asm/pci_insn.h
> @@ -98,6 +98,14 @@ struct zpci_fib {
>  	u32 gd;
>  } __packed __aligned(8);
>  
> +/* Set Interruption Controls Operation Controls  */
> +#define	SIC_IRQ_MODE_ALL		0
> +#define	SIC_IRQ_MODE_SINGLE		1
> +#define	SIC_IRQ_MODE_DIRECT		4
> +#define	SIC_IRQ_MODE_D_ALL		16
> +#define	SIC_IRQ_MODE_D_SINGLE		17
> +#define	SIC_IRQ_MODE_SET_CPU		18
> +
>  /* directed interruption information block */
>  struct zpci_diib {
>  	u32 : 1;
> @@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
>  int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
>  int __zpci_store_block(const u64 *data, u64 req, u64 offset);
>  void zpci_barrier(void);
> -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
> -
> -static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
> -{
> -	union zpci_sic_iib iib = {{0}};
> -
> -	return __zpci_set_irq_ctrl(ctl, isc, &iib);
> -}
> +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
>  
>  #endif
> diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
> index 4dd58b196cea..2a47b3936e44 100644
> --- a/arch/s390/pci/pci_insn.c
> +++ b/arch/s390/pci/pci_insn.c
> @@ -97,7 +97,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
>  }
>  
>  /* Set Interruption Controls */
> -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
> +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
>  {
>  	if (!test_facility(72))
>  		return -EIO;
> @@ -108,6 +108,7 @@ int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
>  
>  	return 0;
>  }
> +EXPORT_SYMBOL_GPL(zpci_set_irq_ctrl);
>  
>  /* PCI Load */
>  static inline int ____pcilg(u64 *data, u64 req, u64 offset, u8 *status)
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 0d0a02a9fbbf..2f675355fd0c 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -15,13 +15,6 @@
>  
>  static enum {FLOATING, DIRECTED} irq_delivery;
>  
> -#define	SIC_IRQ_MODE_ALL		0
> -#define	SIC_IRQ_MODE_SINGLE		1
> -#define	SIC_IRQ_MODE_DIRECT		4
> -#define	SIC_IRQ_MODE_D_ALL		16
> -#define	SIC_IRQ_MODE_D_SINGLE		17
> -#define	SIC_IRQ_MODE_SET_CPU		18
> -
>  /*
>   * summary bit vector
>   * FLOATING - summary bit per function
> @@ -154,6 +147,7 @@ static struct irq_chip zpci_irq_chip = {
>  static void zpci_handle_cpu_local_irq(bool rescan)
>  {
>  	struct airq_iv *dibv = zpci_ibv[smp_processor_id()];
> +	union zpci_sic_iib iib = {{0}};
>  	unsigned long bit;
>  	int irqs_on = 0;
>  
> @@ -165,7 +159,7 @@ static void zpci_handle_cpu_local_irq(bool rescan)
>  				/* End of second scan with interrupts on. */
>  				break;
>  			/* First scan complete, reenable interrupts. */
> -			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC))
> +			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC, &iib))
>  				break;
>  			bit = 0;
>  			continue;
> @@ -193,6 +187,7 @@ static void zpci_handle_remote_irq(void *data)
>  static void zpci_handle_fallback_irq(void)
>  {
>  	struct cpu_irq_data *cpu_data;
> +	union zpci_sic_iib iib = {{0}};
>  	unsigned long cpu;
>  	int irqs_on = 0;
>  
> @@ -203,7 +198,7 @@ static void zpci_handle_fallback_irq(void)
>  				/* End of second scan with interrupts on. */
>  				break;
>  			/* First scan complete, reenable interrupts. */
> -			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
> +			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib))
>  				break;
>  			cpu = 0;
>  			continue;
> @@ -234,6 +229,7 @@ static void zpci_directed_irq_handler(struct airq_struct *airq,
>  static void zpci_floating_irq_handler(struct airq_struct *airq,
>  				      struct tpi_info *tpi_info)
>  {
> +	union zpci_sic_iib iib = {{0}};
>  	unsigned long si, ai;
>  	struct airq_iv *aibv;
>  	int irqs_on = 0;
> @@ -247,7 +243,7 @@ static void zpci_floating_irq_handler(struct airq_struct *airq,
>  				/* End of second scan with interrupts on. */
>  				break;
>  			/* First scan complete, reenable interrupts. */
> -			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
> +			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib))
>  				break;
>  			si = 0;
>  			continue;
> @@ -407,11 +403,12 @@ static struct airq_struct zpci_airq = {
>  static void __init cpu_enable_directed_irq(void *unused)
>  {
>  	union zpci_sic_iib iib = {{0}};
> +	union zpci_sic_iib ziib = {{0}};
>  
>  	iib.cdiib.dibv_addr = (u64) zpci_ibv[smp_processor_id()]->vector;
>  
> -	__zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
> -	zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC);
> +	zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
> +	zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC, &ziib);
>  }
>  
>  static int __init zpci_directed_irq_init(void)
> @@ -426,7 +423,7 @@ static int __init zpci_directed_irq_init(void)
>  	iib.diib.isc = PCI_ISC;
>  	iib.diib.nr_cpus = num_possible_cpus();
>  	iib.diib.disb_addr = virt_to_phys(zpci_sbv->vector);
> -	__zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);
> +	zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);
>  
>  	zpci_ibv = kcalloc(num_possible_cpus(), sizeof(*zpci_ibv),
>  			   GFP_KERNEL);
> @@ -471,6 +468,7 @@ static int __init zpci_floating_irq_init(void)
>  
>  int __init zpci_irq_init(void)
>  {
> +	union zpci_sic_iib iib = {{0}};
>  	int rc;
>  
>  	irq_delivery = sclp.has_dirq ? DIRECTED : FLOATING;
> @@ -502,7 +500,7 @@ int __init zpci_irq_init(void)
>  	 * Enable floating IRQs (with suppression after one IRQ). When using
>  	 * directed IRQs this enables the fallback path.
>  	 */
> -	zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC);
> +	zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib);
>  
>  	return 0;
>  out_airq:



* Re: [PATCH v2 12/30] s390/pci: get SHM information from list pci
  2022-01-18 10:36   ` Pierre Morel
@ 2022-01-26 10:13     ` Claudio Imbrenda
  2022-01-27 13:41       ` Pierre Morel
  0 siblings, 1 reply; 97+ messages in thread
From: Claudio Imbrenda @ 2022-01-26 10:13 UTC (permalink / raw)
  To: Pierre Morel
  Cc: Matthew Rosato, linux-s390, alex.williamson, cohuck, schnelle,
	farman, borntraeger, hca, gor, gerald.schaefer, agordeev,
	frankja, david, vneethv, oberpar, freude, thuth, pasic, kvm,
	linux-kernel

On Tue, 18 Jan 2022 11:36:06 +0100
Pierre Morel <pmorel@linux.ibm.com> wrote:

> On 1/14/22 21:31, Matthew Rosato wrote:
> > KVM will need information on the special handle mask used to indicate
> > emulated devices.  In order to obtain this, a new type of list pci call
> > must be made to gather the information.  Extend clp_list_pci_req to
> > also fetch the model-dependent-data field that holds this mask.
> > 
> > Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> > ---
> >   arch/s390/include/asm/pci.h     |  1 +
> >   arch/s390/include/asm/pci_clp.h |  2 +-
> >   arch/s390/pci/pci_clp.c         | 28 +++++++++++++++++++++++++---
> >   3 files changed, 27 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
> > index 00a2c24d6d2b..f3cd2da8128c 100644
> > --- a/arch/s390/include/asm/pci.h
> > +++ b/arch/s390/include/asm/pci.h
> > @@ -227,6 +227,7 @@ int clp_enable_fh(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as);
> >   int clp_disable_fh(struct zpci_dev *zdev, u32 *fh);
> >   int clp_get_state(u32 fid, enum zpci_state *state);
> >   int clp_refresh_fh(u32 fid, u32 *fh);
> > +int zpci_get_mdd(u32 *mdd);
> >   
> >   /* UID */
> >   void update_uid_checking(bool new);
> > diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
> > index 124fadfb74b9..d6bc324763f3 100644
> > --- a/arch/s390/include/asm/pci_clp.h
> > +++ b/arch/s390/include/asm/pci_clp.h
> > @@ -76,7 +76,7 @@ struct clp_req_list_pci {
> >   struct clp_rsp_list_pci {
> >   	struct clp_rsp_hdr hdr;
> >   	u64 resume_token;
> > -	u32 reserved2;
> > +	u32 mdd;
> >   	u16 max_fn;
> >   	u8			: 7;
> >   	u8 uid_checking		: 1;
> > diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
> > index bc7446566cbc..308ffb93413f 100644
> > --- a/arch/s390/pci/pci_clp.c
> > +++ b/arch/s390/pci/pci_clp.c
> > @@ -328,7 +328,7 @@ int clp_disable_fh(struct zpci_dev *zdev, u32 *fh)
> >   }
> >   
> >   static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
> > -			    u64 *resume_token, int *nentries)
> > +			    u64 *resume_token, int *nentries, u32 *mdd)
> >   {
> >   	int rc;
> >   
> > @@ -354,6 +354,8 @@ static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
> >   	*nentries = (rrb->response.hdr.len - LIST_PCI_HDR_LEN) /
> >   		rrb->response.entry_size;
> >   	*resume_token = rrb->response.resume_token;
> > +	if (mdd)
> > +		*mdd = rrb->response.mdd;
> >   
> >   	return rc;
> >   }
> > @@ -365,7 +367,7 @@ static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
> >   	int nentries, i, rc;
> >   
> >   	do {
> > -		rc = clp_list_pci_req(rrb, &resume_token, &nentries);
> > +		rc = clp_list_pci_req(rrb, &resume_token, &nentries, NULL);
> >   		if (rc)
> >   			return rc;
> >   		for (i = 0; i < nentries; i++)
> > @@ -383,7 +385,7 @@ static int clp_find_pci(struct clp_req_rsp_list_pci *rrb, u32 fid,
> >   	int nentries, i, rc;
> >   
> >   	do {
> > -		rc = clp_list_pci_req(rrb, &resume_token, &nentries);
> > +		rc = clp_list_pci_req(rrb, &resume_token, &nentries, NULL);
> >   		if (rc)
> >   			return rc;
> >   		fh_list = rrb->response.fh_list;
> > @@ -468,6 +470,26 @@ int clp_get_state(u32 fid, enum zpci_state *state)
> >   	return rc;
> >   }
> >   
> > +int zpci_get_mdd(u32 *mdd)
> > +{
> > +	struct clp_req_rsp_list_pci *rrb;
> > +	u64 resume_token = 0;
> > +	int nentries, rc;
> > +
> > +	if (!mdd)
> > +		return -EINVAL;  
> 
> I think this test is not useful.
> The caller must take care not to call with a NULL pointer,
> which the only caller today makes sure of.

What if the caller does it anyway?

I think the test is useful. If passing NULL is a bug, then maybe
consider using BUG_ON or WARN_ONCE.
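
A user-space sketch of the "warn once, then fail" variant might look like the following. It is only an analogy for WARN_ONCE: get_mdd(), the warned flag, and the EXAMPLE_MDD value are all hypothetical, and the real routine would obtain the value from the list pci response rather than from a constant.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder value for this sketch; not a real model-dependent-data value */
#define EXAMPLE_MDD 0x12345678u

/* Set after the first warning so the message is printed only once */
static int warned;

/*
 * Hypothetical analogue of zpci_get_mdd(): a NULL output pointer is
 * treated as a caller bug, reported once, and rejected with -EINVAL.
 */
static int get_mdd(uint32_t *mdd)
{
	if (!mdd) {
		if (!warned) {
			fprintf(stderr, "get_mdd: called with NULL pointer\n");
			warned = 1;
		}
		return -EINVAL;
	}

	*mdd = EXAMPLE_MDD;
	return 0;
}
```

Compared with silently returning -EINVAL, this keeps the function safe while still making a misbehaving caller visible.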

> 
> 
> > +
> > +	rrb = clp_alloc_block(GFP_KERNEL);
> > +	if (!rrb)
> > +		return -ENOMEM;
> > +
> > +	rc = clp_list_pci_req(rrb, &resume_token, &nentries, mdd);
> > +
> > +	clp_free_block(rrb);
> > +	return rc;
> > +}
> > +EXPORT_SYMBOL_GPL(zpci_get_mdd);
> > +
> >   static int clp_base_slpc(struct clp_req *req, struct clp_req_rsp_slpc *lpcb)
> >   {
> >   	unsigned long limit = PAGE_SIZE - sizeof(lpcb->request);
> >   
> 



* Re: [PATCH v2 13/30] s390/pci: return status from zpci_refresh_trans
  2022-01-14 20:31 ` [PATCH v2 13/30] s390/pci: return status from zpci_refresh_trans Matthew Rosato
  2022-01-19 18:13   ` Pierre Morel
@ 2022-01-26 10:45   ` Claudio Imbrenda
  2022-01-27 10:30   ` Niklas Schnelle
  2 siblings, 0 replies; 97+ messages in thread
From: Claudio Imbrenda @ 2022-01-26 10:45 UTC (permalink / raw)
  To: Matthew Rosato
  Cc: linux-s390, alex.williamson, cohuck, schnelle, farman, pmorel,
	borntraeger, hca, gor, gerald.schaefer, agordeev, frankja, david,
	vneethv, oberpar, freude, thuth, pasic, kvm, linux-kernel

On Fri, 14 Jan 2022 15:31:28 -0500
Matthew Rosato <mjrosato@linux.ibm.com> wrote:

> Current callers of zpci_refresh_trans don't need to interrogate the status
> returned from the underlying instructions.  However, a subsequent patch
> will add a KVM caller that needs this information.  Add a new argument to
> zpci_refresh_trans to pass the address of a status byte and update
> existing call sites to provide it.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
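
The control flow after this change can be sketched in user space as follows. The retry-on-busy loop and the cc/status to errno mapping mirror the quoted diff; mock_rpcit(), the scripted results, and run_script() are hypothetical test scaffolding that replaces the real RPCIT instruction, and the busy delay and error logging of the kernel code are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* One scripted outcome of the mocked instruction */
struct rpcit_result {
	uint8_t cc;
	uint8_t status;
};

static const struct rpcit_result *script;
static size_t script_pos;

/* Hypothetical stand-in for __rpcit(): replays the scripted results */
static uint8_t mock_rpcit(uint64_t fn, uint64_t addr, uint64_t range,
			  uint8_t *status)
{
	const struct rpcit_result *r = &script[script_pos++];

	(void)fn;
	(void)addr;
	(void)range;
	*status = r->status;
	return r->cc;
}

/* Same shape as the patched routine: status is written for the caller */
static int refresh_trans(uint64_t fn, uint64_t addr, uint64_t range,
			 uint8_t *status)
{
	uint8_t cc;

	do {
		/* cc 2 means busy; the kernel code delays before retrying */
		cc = mock_rpcit(fn, addr, range, status);
	} while (cc == 2);

	if (cc == 1 && (*status == 4 || *status == 16))
		return -ENOMEM;

	return cc ? -EIO : 0;
}

/* Helper for the sketch: run refresh_trans() against a given script */
static int run_script(const struct rpcit_result *s, uint8_t *status)
{
	script = s;
	script_pos = 0;
	return refresh_trans(0, 0, 0, status);
}
```

Existing callers can pass the address of a throwaway local, as the quoted hunks do, while a caller that needs the status byte can inspect it after the call returns.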

> ---
>  arch/s390/include/asm/pci_insn.h |  2 +-
>  arch/s390/pci/pci_dma.c          |  6 ++++--
>  arch/s390/pci/pci_insn.c         | 10 +++++-----
>  drivers/iommu/s390-iommu.c       |  4 +++-
>  4 files changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> index 5331082fa516..32759c407b8f 100644
> --- a/arch/s390/include/asm/pci_insn.h
> +++ b/arch/s390/include/asm/pci_insn.h
> @@ -135,7 +135,7 @@ union zpci_sic_iib {
>  DECLARE_STATIC_KEY_FALSE(have_mio);
>  
>  u8 zpci_mod_fc(u64 req, struct zpci_fib *fib, u8 *status);
> -int zpci_refresh_trans(u64 fn, u64 addr, u64 range);
> +int zpci_refresh_trans(u64 fn, u64 addr, u64 range, u8 *status);
>  int __zpci_load(u64 *data, u64 req, u64 offset);
>  int zpci_load(u64 *data, const volatile void __iomem *addr, unsigned long len);
>  int __zpci_store(u64 data, u64 req, u64 offset);
> diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
> index a81de48d5ea7..b0a2380bcad8 100644
> --- a/arch/s390/pci/pci_dma.c
> +++ b/arch/s390/pci/pci_dma.c
> @@ -23,8 +23,9 @@ static u32 s390_iommu_aperture_factor = 1;
>  
>  static int zpci_refresh_global(struct zpci_dev *zdev)
>  {
> +	u8 status;
>  	return zpci_refresh_trans((u64) zdev->fh << 32, zdev->start_dma,
> -				  zdev->iommu_pages * PAGE_SIZE);
> +				  zdev->iommu_pages * PAGE_SIZE, &status);
>  }
>  
>  unsigned long *dma_alloc_cpu_table(void)
> @@ -183,6 +184,7 @@ static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr,
>  			   size_t size, int flags)
>  {
>  	unsigned long irqflags;
> +	u8 status;
>  	int ret;
>  
>  	/*
> @@ -201,7 +203,7 @@ static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr,
>  	}
>  
>  	ret = zpci_refresh_trans((u64) zdev->fh << 32, dma_addr,
> -				 PAGE_ALIGN(size));
> +				 PAGE_ALIGN(size), &status);
>  	if (ret == -ENOMEM && !s390_iommu_strict) {
>  		/* enable the hypervisor to free some resources */
>  		if (zpci_refresh_global(zdev))
> diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
> index 0509554301c7..ca6399d52767 100644
> --- a/arch/s390/pci/pci_insn.c
> +++ b/arch/s390/pci/pci_insn.c
> @@ -77,20 +77,20 @@ static inline u8 __rpcit(u64 fn, u64 addr, u64 range, u8 *status)
>  	return cc;
>  }
>  
> -int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
> +int zpci_refresh_trans(u64 fn, u64 addr, u64 range, u8 *status)
>  {
> -	u8 cc, status;
> +	u8 cc;
>  
>  	do {
> -		cc = __rpcit(fn, addr, range, &status);
> +		cc = __rpcit(fn, addr, range, status);
>  		if (cc == 2)
>  			udelay(ZPCI_INSN_BUSY_DELAY);
>  	} while (cc == 2);
>  
>  	if (cc)
> -		zpci_err_insn(cc, status, addr, range);
> +		zpci_err_insn(cc, *status, addr, range);
>  
> -	if (cc == 1 && (status == 4 || status == 16))
> +	if (cc == 1 && (*status == 4 || *status == 16))
>  		return -ENOMEM;
>  
>  	return (cc) ? -EIO : 0;
> diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
> index 50860ebdd087..845bb99c183e 100644
> --- a/drivers/iommu/s390-iommu.c
> +++ b/drivers/iommu/s390-iommu.c
> @@ -214,6 +214,7 @@ static int s390_iommu_update_trans(struct s390_domain *s390_domain,
>  	unsigned long irq_flags, nr_pages, i;
>  	unsigned long *entry;
>  	int rc = 0;
> +	u8 status;
>  
>  	if (dma_addr < s390_domain->domain.geometry.aperture_start ||
>  	    dma_addr + size > s390_domain->domain.geometry.aperture_end)
> @@ -238,7 +239,8 @@ static int s390_iommu_update_trans(struct s390_domain *s390_domain,
>  	spin_lock(&s390_domain->list_lock);
>  	list_for_each_entry(domain_device, &s390_domain->devices, list) {
>  		rc = zpci_refresh_trans((u64) domain_device->zdev->fh << 32,
> -					start_dma_addr, nr_pages * PAGE_SIZE);
> +					start_dma_addr, nr_pages * PAGE_SIZE,
> +					&status);
>  		if (rc)
>  			break;
>  	}



* Re: [PATCH v2 07/30] s390/pci: externalize the SIC operation controls and routine
  2022-01-14 20:31 ` [PATCH v2 07/30] s390/pci: externalize the SIC operation controls and routine Matthew Rosato
  2022-01-17 16:19   ` Niklas Schnelle
  2022-01-26 10:07   ` Claudio Imbrenda
@ 2022-01-27  9:57   ` Pierre Morel
  2 siblings, 0 replies; 97+ messages in thread
From: Pierre Morel @ 2022-01-27  9:57 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, schnelle, farman, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel



On 1/14/22 21:31, Matthew Rosato wrote:
> A subsequent patch will be issuing SIC from KVM -- export the necessary
> routine and make the operation control definitions available from a header.
> Because the routine will now be exported, let's rename __zpci_set_irq_ctrl
> to zpci_set_irq_ctrl and get rid of the zero'd iib wrapper function of
> the same name.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>

Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>


> ---
>   arch/s390/include/asm/pci_insn.h | 17 +++++++++--------
>   arch/s390/pci/pci_insn.c         |  3 ++-
>   arch/s390/pci/pci_irq.c          | 26 ++++++++++++--------------
>   3 files changed, 23 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> index 61cf9531f68f..5331082fa516 100644
> --- a/arch/s390/include/asm/pci_insn.h
> +++ b/arch/s390/include/asm/pci_insn.h
> @@ -98,6 +98,14 @@ struct zpci_fib {
>   	u32 gd;
>   } __packed __aligned(8);
>   
> +/* Set Interruption Controls Operation Controls  */
> +#define	SIC_IRQ_MODE_ALL		0
> +#define	SIC_IRQ_MODE_SINGLE		1
> +#define	SIC_IRQ_MODE_DIRECT		4
> +#define	SIC_IRQ_MODE_D_ALL		16
> +#define	SIC_IRQ_MODE_D_SINGLE		17
> +#define	SIC_IRQ_MODE_SET_CPU		18
> +
>   /* directed interruption information block */
>   struct zpci_diib {
>   	u32 : 1;
> @@ -134,13 +142,6 @@ int __zpci_store(u64 data, u64 req, u64 offset);
>   int zpci_store(const volatile void __iomem *addr, u64 data, unsigned long len);
>   int __zpci_store_block(const u64 *data, u64 req, u64 offset);
>   void zpci_barrier(void);
> -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
> -
> -static inline int zpci_set_irq_ctrl(u16 ctl, u8 isc)
> -{
> -	union zpci_sic_iib iib = {{0}};
> -
> -	return __zpci_set_irq_ctrl(ctl, isc, &iib);
> -}
> +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib);
>   
>   #endif
> diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
> index 4dd58b196cea..2a47b3936e44 100644
> --- a/arch/s390/pci/pci_insn.c
> +++ b/arch/s390/pci/pci_insn.c
> @@ -97,7 +97,7 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
>   }
>   
>   /* Set Interruption Controls */
> -int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
> +int zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
>   {
>   	if (!test_facility(72))
>   		return -EIO;
> @@ -108,6 +108,7 @@ int __zpci_set_irq_ctrl(u16 ctl, u8 isc, union zpci_sic_iib *iib)
>   
>   	return 0;
>   }
> +EXPORT_SYMBOL_GPL(zpci_set_irq_ctrl);
>   
>   /* PCI Load */
>   static inline int ____pcilg(u64 *data, u64 req, u64 offset, u8 *status)
> diff --git a/arch/s390/pci/pci_irq.c b/arch/s390/pci/pci_irq.c
> index 0d0a02a9fbbf..2f675355fd0c 100644
> --- a/arch/s390/pci/pci_irq.c
> +++ b/arch/s390/pci/pci_irq.c
> @@ -15,13 +15,6 @@
>   
>   static enum {FLOATING, DIRECTED} irq_delivery;
>   
> -#define	SIC_IRQ_MODE_ALL		0
> -#define	SIC_IRQ_MODE_SINGLE		1
> -#define	SIC_IRQ_MODE_DIRECT		4
> -#define	SIC_IRQ_MODE_D_ALL		16
> -#define	SIC_IRQ_MODE_D_SINGLE		17
> -#define	SIC_IRQ_MODE_SET_CPU		18
> -
>   /*
>    * summary bit vector
>    * FLOATING - summary bit per function
> @@ -154,6 +147,7 @@ static struct irq_chip zpci_irq_chip = {
>   static void zpci_handle_cpu_local_irq(bool rescan)
>   {
>   	struct airq_iv *dibv = zpci_ibv[smp_processor_id()];
> +	union zpci_sic_iib iib = {{0}};



>   	unsigned long bit;
>   	int irqs_on = 0;
>   
> @@ -165,7 +159,7 @@ static void zpci_handle_cpu_local_irq(bool rescan)
>   				/* End of second scan with interrupts on. */
>   				break;
>   			/* First scan complete, reenable interrupts. */
> -			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC))
> +			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC, &iib))
>   				break;
>   			bit = 0;
>   			continue;
> @@ -193,6 +187,7 @@ static void zpci_handle_remote_irq(void *data)
>   static void zpci_handle_fallback_irq(void)
>   {
>   	struct cpu_irq_data *cpu_data;
> +	union zpci_sic_iib iib = {{0}};
>   	unsigned long cpu;
>   	int irqs_on = 0;
>   
> @@ -203,7 +198,7 @@ static void zpci_handle_fallback_irq(void)
>   				/* End of second scan with interrupts on. */
>   				break;
>   			/* First scan complete, reenable interrupts. */
> -			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
> +			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib))
>   				break;
>   			cpu = 0;
>   			continue;
> @@ -234,6 +229,7 @@ static void zpci_directed_irq_handler(struct airq_struct *airq,
>   static void zpci_floating_irq_handler(struct airq_struct *airq,
>   				      struct tpi_info *tpi_info)
>   {
> +	union zpci_sic_iib iib = {{0}};
>   	unsigned long si, ai;
>   	struct airq_iv *aibv;
>   	int irqs_on = 0;
> @@ -247,7 +243,7 @@ static void zpci_floating_irq_handler(struct airq_struct *airq,
>   				/* End of second scan with interrupts on. */
>   				break;
>   			/* First scan complete, reenable interrupts. */
> -			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC))
> +			if (zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib))
>   				break;
>   			si = 0;
>   			continue;
> @@ -407,11 +403,12 @@ static struct airq_struct zpci_airq = {
>   static void __init cpu_enable_directed_irq(void *unused)
>   {
>   	union zpci_sic_iib iib = {{0}};
> +	union zpci_sic_iib ziib = {{0}};
>   
>   	iib.cdiib.dibv_addr = (u64) zpci_ibv[smp_processor_id()]->vector;
>   
> -	__zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
> -	zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC);
> +	zpci_set_irq_ctrl(SIC_IRQ_MODE_SET_CPU, 0, &iib);
> +	zpci_set_irq_ctrl(SIC_IRQ_MODE_D_SINGLE, PCI_ISC, &ziib);
>   }
>   
>   static int __init zpci_directed_irq_init(void)
> @@ -426,7 +423,7 @@ static int __init zpci_directed_irq_init(void)
>   	iib.diib.isc = PCI_ISC;
>   	iib.diib.nr_cpus = num_possible_cpus();
>   	iib.diib.disb_addr = virt_to_phys(zpci_sbv->vector);
> -	__zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);
> +	zpci_set_irq_ctrl(SIC_IRQ_MODE_DIRECT, 0, &iib);
>   
>   	zpci_ibv = kcalloc(num_possible_cpus(), sizeof(*zpci_ibv),
>   			   GFP_KERNEL);
> @@ -471,6 +468,7 @@ static int __init zpci_floating_irq_init(void)
>   
>   int __init zpci_irq_init(void)
>   {
> +	union zpci_sic_iib iib = {{0}};
>   	int rc;
>   
>   	irq_delivery = sclp.has_dirq ? DIRECTED : FLOATING;
> @@ -502,7 +500,7 @@ int __init zpci_irq_init(void)
>   	 * Enable floating IRQs (with suppression after one IRQ). When using
>   	 * directed IRQs this enables the fallback path.
>   	 */
> -	zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC);
> +	zpci_set_irq_ctrl(SIC_IRQ_MODE_SINGLE, PCI_ISC, &iib);
>   
>   	return 0;
>   out_airq:
> 

-- 
Pierre Morel
IBM Lab Boeblingen


* Re: [PATCH v2 12/30] s390/pci: get SHM information from list pci
  2022-01-14 20:31 ` [PATCH v2 12/30] s390/pci: get SHM information from list pci Matthew Rosato
  2022-01-18 10:36   ` Pierre Morel
@ 2022-01-27 10:29   ` Niklas Schnelle
  1 sibling, 0 replies; 97+ messages in thread
From: Niklas Schnelle @ 2022-01-27 10:29 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, farman, pmorel, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On Fri, 2022-01-14 at 15:31 -0500, Matthew Rosato wrote:
> KVM will need information on the special handle mask used to indicate
> emulated devices.  In order to obtain this, a new type of list pci call
> must be made to gather the information.  Extend clp_list_pci_req to
> also fetch the model-dependent-data field that holds this mask.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  arch/s390/include/asm/pci.h     |  1 +
>  arch/s390/include/asm/pci_clp.h |  2 +-
>  arch/s390/pci/pci_clp.c         | 28 +++++++++++++++++++++++++---
>  3 files changed, 27 insertions(+), 4 deletions(-)
> 
---8<---
>  
> +int zpci_get_mdd(u32 *mdd)
> +{
> +	struct clp_req_rsp_list_pci *rrb;
> +	u64 resume_token = 0;
> +	int nentries, rc;
> +
> +	if (!mdd)
> +		return -EINVAL;
> +
> +	rrb = clp_alloc_block(GFP_KERNEL);
> +	if (!rrb)
> +		return -ENOMEM;
> +
> +	rc = clp_list_pci_req(rrb, &resume_token, &nentries, mdd);
> +
> +	clp_free_block(rrb);
> +	return rc;
> +}
> +EXPORT_SYMBOL_GPL(zpci_get_mdd);
> +
>  static int clp_base_slpc(struct clp_req *req, struct clp_req_rsp_slpc *lpcb)
>  {
>  	unsigned long limit = PAGE_SIZE - sizeof(lpcb->request);

Looks good.

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>



* Re: [PATCH v2 13/30] s390/pci: return status from zpci_refresh_trans
  2022-01-14 20:31 ` [PATCH v2 13/30] s390/pci: return status from zpci_refresh_trans Matthew Rosato
  2022-01-19 18:13   ` Pierre Morel
  2022-01-26 10:45   ` Claudio Imbrenda
@ 2022-01-27 10:30   ` Niklas Schnelle
  2 siblings, 0 replies; 97+ messages in thread
From: Niklas Schnelle @ 2022-01-27 10:30 UTC (permalink / raw)
  To: Matthew Rosato, linux-s390
  Cc: alex.williamson, cohuck, farman, pmorel, borntraeger, hca, gor,
	gerald.schaefer, agordeev, frankja, david, imbrenda, vneethv,
	oberpar, freude, thuth, pasic, kvm, linux-kernel

On Fri, 2022-01-14 at 15:31 -0500, Matthew Rosato wrote:
> Current callers of zpci_refresh_trans don't need to interrogate the status
> returned from the underlying instructions.  However, a subsequent patch
> will add a KVM caller that needs this information.  Add a new argument to
> zpci_refresh_trans to pass the address of a status byte and update
> existing call sites to provide it.
> 
> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
> ---
>  arch/s390/include/asm/pci_insn.h |  2 +-
>  arch/s390/pci/pci_dma.c          |  6 ++++--
>  arch/s390/pci/pci_insn.c         | 10 +++++-----
>  drivers/iommu/s390-iommu.c       |  4 +++-
>  4 files changed, 13 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/s390/include/asm/pci_insn.h b/arch/s390/include/asm/pci_insn.h
> index 5331082fa516..32759c407b8f 100644
> --- a/arch/s390/include/asm/pci_insn.h
> +++ b/arch/s390/include/asm/pci_insn.h
> @@ -135,7 +135,7 @@ union zpci_sic_iib {
>  DECLARE_STATIC_KEY_FALSE(have_mio);
>  
>  u8 zpci_mod_fc(u64 req, struct zpci_fib *fib, u8 *status);
> -int zpci_refresh_trans(u64 fn, u64 addr, u64 range);
> +int zpci_refresh_trans(u64 fn, u64 addr, u64 range, u8 *status);
> 
---8<---
>  
>  	return (cc) ? -EIO : 0;
> diff --git a/drivers/iommu/s390-iommu.c b/drivers/iommu/s390-iommu.c
> index 50860ebdd087..845bb99c183e 100644
> --- a/drivers/iommu/s390-iommu.c
> +++ b/drivers/iommu/s390-iommu.c
> @@ -214,6 +214,7 @@ static int s390_iommu_update_trans(struct s390_domain *s390_domain,
>  	unsigned long irq_flags, nr_pages, i;
>  	unsigned long *entry;
>  	int rc = 0;
> +	u8 status;
>  
>  	if (dma_addr < s390_domain->domain.geometry.aperture_start ||
>  	    dma_addr + size > s390_domain->domain.geometry.aperture_end)
> @@ -238,7 +239,8 @@ static int s390_iommu_update_trans(struct s390_domain *s390_domain,
>  	spin_lock(&s390_domain->list_lock);
>  	list_for_each_entry(domain_device, &s390_domain->devices, list) {
>  		rc = zpci_refresh_trans((u64) domain_device->zdev->fh << 32,
> -					start_dma_addr, nr_pages * PAGE_SIZE);
> +					start_dma_addr, nr_pages * PAGE_SIZE,
> +					&status);
>  		if (rc)
>  			break;
>  	}

Looks good.

Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
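
For reference, the status out-parameter pattern this patch introduces can be
modelled in userspace roughly as below. This is only a sketch: the model_
names, the fake condition code and the status value 4 are assumptions made
for illustration and are not taken from the kernel sources.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the underlying instruction: it yields a
 * condition code and stores a status byte through the out-parameter. */
static uint8_t model_rpcit(uint64_t range, uint8_t *status)
{
	assert(status);		/* every call site must now supply one */
	if (range == 0) {
		*status = 4;	/* made-up failure status */
		return 1;	/* nonzero condition code */
	}
	*status = 0;
	return 0;
}

/* Same return convention as the quoted hunk: -EIO on a nonzero cc, else 0.
 * The status byte is passed back for callers that want the detail. */
static int model_refresh_trans(uint64_t fn, uint64_t addr, uint64_t range,
			       uint8_t *status)
{
	uint8_t cc;

	(void)fn;	/* unused in this model */
	(void)addr;	/* unused in this model */
	cc = model_rpcit(range, status);
	return cc ? -EIO : 0;
}
```

Existing callers can pass the address of a local u8 and ignore it, as the
s390-iommu hunk does, while a caller that needs the detail can inspect the
byte after a nonzero return.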


* Re: [PATCH v2 12/30] s390/pci: get SHM information from list pci
  2022-01-26 10:13     ` Claudio Imbrenda
@ 2022-01-27 13:41       ` Pierre Morel
  2022-01-27 15:14         ` Matthew Rosato
  0 siblings, 1 reply; 97+ messages in thread
From: Pierre Morel @ 2022-01-27 13:41 UTC (permalink / raw)
  To: Claudio Imbrenda
  Cc: Matthew Rosato, linux-s390, alex.williamson, cohuck, schnelle,
	farman, borntraeger, hca, gor, gerald.schaefer, agordeev,
	frankja, david, vneethv, oberpar, freude, thuth, pasic, kvm,
	linux-kernel



On 1/26/22 11:13, Claudio Imbrenda wrote:
> On Tue, 18 Jan 2022 11:36:06 +0100
> Pierre Morel <pmorel@linux.ibm.com> wrote:
> 
>> On 1/14/22 21:31, Matthew Rosato wrote:
>>> KVM will need information on the special handle mask used to indicate
>>> emulated devices.  In order to obtain this, a new type of list pci call
>>> must be made to gather the information.  Extend clp_list_pci_req to
>>> also fetch the model-dependent-data field that holds this mask.
>>>
>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>> ---
>>>    arch/s390/include/asm/pci.h     |  1 +
>>>    arch/s390/include/asm/pci_clp.h |  2 +-
>>>    arch/s390/pci/pci_clp.c         | 28 +++++++++++++++++++++++++---
>>>    3 files changed, 27 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
>>> index 00a2c24d6d2b..f3cd2da8128c 100644
>>> --- a/arch/s390/include/asm/pci.h
>>> +++ b/arch/s390/include/asm/pci.h
>>> @@ -227,6 +227,7 @@ int clp_enable_fh(struct zpci_dev *zdev, u32 *fh, u8 nr_dma_as);
>>>    int clp_disable_fh(struct zpci_dev *zdev, u32 *fh);
>>>    int clp_get_state(u32 fid, enum zpci_state *state);
>>>    int clp_refresh_fh(u32 fid, u32 *fh);
>>> +int zpci_get_mdd(u32 *mdd);
>>>    
>>>    /* UID */
>>>    void update_uid_checking(bool new);
>>> diff --git a/arch/s390/include/asm/pci_clp.h b/arch/s390/include/asm/pci_clp.h
>>> index 124fadfb74b9..d6bc324763f3 100644
>>> --- a/arch/s390/include/asm/pci_clp.h
>>> +++ b/arch/s390/include/asm/pci_clp.h
>>> @@ -76,7 +76,7 @@ struct clp_req_list_pci {
>>>    struct clp_rsp_list_pci {
>>>    	struct clp_rsp_hdr hdr;
>>>    	u64 resume_token;
>>> -	u32 reserved2;
>>> +	u32 mdd;
>>>    	u16 max_fn;
>>>    	u8			: 7;
>>>    	u8 uid_checking		: 1;
>>> diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
>>> index bc7446566cbc..308ffb93413f 100644
>>> --- a/arch/s390/pci/pci_clp.c
>>> +++ b/arch/s390/pci/pci_clp.c
>>> @@ -328,7 +328,7 @@ int clp_disable_fh(struct zpci_dev *zdev, u32 *fh)
>>>    }
>>>    
>>>    static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
>>> -			    u64 *resume_token, int *nentries)
>>> +			    u64 *resume_token, int *nentries, u32 *mdd)
>>>    {
>>>    	int rc;
>>>    
>>> @@ -354,6 +354,8 @@ static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
>>>    	*nentries = (rrb->response.hdr.len - LIST_PCI_HDR_LEN) /
>>>    		rrb->response.entry_size;
>>>    	*resume_token = rrb->response.resume_token;
>>> +	if (mdd)
>>> +		*mdd = rrb->response.mdd;
>>>    
>>>    	return rc;
>>>    }
>>> @@ -365,7 +367,7 @@ static int clp_list_pci(struct clp_req_rsp_list_pci *rrb, void *data,
>>>    	int nentries, i, rc;
>>>    
>>>    	do {
>>> -		rc = clp_list_pci_req(rrb, &resume_token, &nentries);
>>> +		rc = clp_list_pci_req(rrb, &resume_token, &nentries, NULL);
>>>    		if (rc)
>>>    			return rc;
>>>    		for (i = 0; i < nentries; i++)
>>> @@ -383,7 +385,7 @@ static int clp_find_pci(struct clp_req_rsp_list_pci *rrb, u32 fid,
>>>    	int nentries, i, rc;
>>>    
>>>    	do {
>>> -		rc = clp_list_pci_req(rrb, &resume_token, &nentries);
>>> +		rc = clp_list_pci_req(rrb, &resume_token, &nentries, NULL);
>>>    		if (rc)
>>>    			return rc;
>>>    		fh_list = rrb->response.fh_list;
>>> @@ -468,6 +470,26 @@ int clp_get_state(u32 fid, enum zpci_state *state)
>>>    	return rc;
>>>    }
>>>    
>>> +int zpci_get_mdd(u32 *mdd)
>>> +{
>>> +	struct clp_req_rsp_list_pci *rrb;
>>> +	u64 resume_token = 0;
>>> +	int nentries, rc;
>>> +
>>> +	if (!mdd)
>>> +		return -EINVAL;
>>
>> I think this test is not useful.
>> The caller must take care not to call with a NULL pointer,
>> which the only caller today makes sure of.
> 
> what if the caller does it anyway?
> 
> I think the test is useful. if passing NULL is a bug, then maybe
> consider using BUG_ON, or WARN_ONCE

I think generally the caller is responsible for the test.
In our case, for example, the caller can directly use the address
of a u32 allocated on the stack or globally, and he knows whether a test
is useful or not.

Of course we can systematically check all pointer parameters against
NULL in every kernel function.
But this is not userland, not even an inter-architecture core function,
and we can expect the kernel programmer to program correctly.

In our special case, neither zpci_get_mdd() nor clp_list_pci_req()
accesses *mdd if mdd is NULL, so passing a NULL mdd pointer will not
trigger a fault.
Also, the function is named zpci_get_mdd(u32 *mdd): if the caller does
not give a pointer to mdd, which would be a quite stupid way to call a
function named zpci_get_mdd(), he will just not get mdd, with no side
effect.
So the only purpose of having this test here is to tell the caller that
he forgot to check his mdd allocation.
My opinion is that he should have check.
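
To illustrate the "no fault, no side effect" point, here is a minimal
userspace sketch. It models only the control flow of the quoted patch; the
model_ names, the reduced response struct and the mask value are assumptions
for illustration, not the kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the list pci response; only mdd is modelled. */
struct model_rsp {
	uint32_t mdd;
};

/* Mirrors the guard added to clp_list_pci_req(): mdd is optional and is
 * written only when the caller supplied a pointer. */
static int model_list_pci_req(const struct model_rsp *rsp, uint32_t *mdd)
{
	assert(rsp);		/* the response block itself is mandatory */
	if (mdd)
		*mdd = rsp->mdd;
	return 0;
}

/* zpci_get_mdd() without the "if (!mdd) return -EINVAL" check. */
static int model_get_mdd(uint32_t *mdd)
{
	struct model_rsp rsp = { .mdd = 0x00800000 };	/* made-up value */

	return model_list_pci_req(&rsp, mdd);
}
```

In this model a call with NULL returns 0 and writes nothing. The cost is
that the caller is never told about its mistake.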

> 
>>
>>
>>> +
>>> +	rrb = clp_alloc_block(GFP_KERNEL);
>>> +	if (!rrb)
>>> +		return -ENOMEM;
>>> +
>>> +	rc = clp_list_pci_req(rrb, &resume_token, &nentries, mdd);
>>> +
>>> +	clp_free_block(rrb);
>>> +	return rc;
>>> +}
>>> +EXPORT_SYMBOL_GPL(zpci_get_mdd);
>>> +
>>>    static int clp_base_slpc(struct clp_req *req, struct clp_req_rsp_slpc *lpcb)
>>>    {
>>>    	unsigned long limit = PAGE_SIZE - sizeof(lpcb->request);
>>>    
>>
> 

-- 
Pierre Morel
IBM Lab Boeblingen

* Re: [PATCH v2 12/30] s390/pci: get SHM information from list pci
  2022-01-27 13:41       ` Pierre Morel
@ 2022-01-27 15:14         ` Matthew Rosato
  0 siblings, 0 replies; 97+ messages in thread
From: Matthew Rosato @ 2022-01-27 15:14 UTC (permalink / raw)
  To: Pierre Morel, Claudio Imbrenda, schnelle
  Cc: linux-s390, alex.williamson, cohuck, farman, borntraeger, hca,
	gor, gerald.schaefer, agordeev, frankja, david, vneethv, oberpar,
	freude, thuth, pasic, kvm, linux-kernel

On 1/27/22 8:41 AM, Pierre Morel wrote:
> 
> 
> On 1/26/22 11:13, Claudio Imbrenda wrote:
>> On Tue, 18 Jan 2022 11:36:06 +0100
>> Pierre Morel <pmorel@linux.ibm.com> wrote:
>>
>>> On 1/14/22 21:31, Matthew Rosato wrote:
>>>> KVM will need information on the special handle mask used to indicate
>>>> emulated devices.  In order to obtain this, a new type of list pci call
>>>> must be made to gather the information.  Extend clp_list_pci_req to
>>>> also fetch the model-dependent-data field that holds this mask.
>>>>
>>>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>>>> ---
>>>>    arch/s390/include/asm/pci.h     |  1 +
>>>>    arch/s390/include/asm/pci_clp.h |  2 +-
>>>>    arch/s390/pci/pci_clp.c         | 28 +++++++++++++++++++++++++---
>>>>    3 files changed, 27 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
>>>> index 00a2c24d6d2b..f3cd2da8128c 100644
>>>> --- a/arch/s390/include/asm/pci.h
>>>> +++ b/arch/s390/include/asm/pci.h
>>>> @@ -227,6 +227,7 @@ int clp_enable_fh(struct zpci_dev *zdev, u32 
>>>> *fh, u8 nr_dma_as);
>>>>    int clp_disable_fh(struct zpci_dev *zdev, u32 *fh);
>>>>    int clp_get_state(u32 fid, enum zpci_state *state);
>>>>    int clp_refresh_fh(u32 fid, u32 *fh);
>>>> +int zpci_get_mdd(u32 *mdd);
>>>>    /* UID */
>>>>    void update_uid_checking(bool new);
>>>> diff --git a/arch/s390/include/asm/pci_clp.h 
>>>> b/arch/s390/include/asm/pci_clp.h
>>>> index 124fadfb74b9..d6bc324763f3 100644
>>>> --- a/arch/s390/include/asm/pci_clp.h
>>>> +++ b/arch/s390/include/asm/pci_clp.h
>>>> @@ -76,7 +76,7 @@ struct clp_req_list_pci {
>>>>    struct clp_rsp_list_pci {
>>>>        struct clp_rsp_hdr hdr;
>>>>        u64 resume_token;
>>>> -    u32 reserved2;
>>>> +    u32 mdd;
>>>>        u16 max_fn;
>>>>        u8            : 7;
>>>>        u8 uid_checking        : 1;
>>>> diff --git a/arch/s390/pci/pci_clp.c b/arch/s390/pci/pci_clp.c
>>>> index bc7446566cbc..308ffb93413f 100644
>>>> --- a/arch/s390/pci/pci_clp.c
>>>> +++ b/arch/s390/pci/pci_clp.c
>>>> @@ -328,7 +328,7 @@ int clp_disable_fh(struct zpci_dev *zdev, u32 *fh)
>>>>    }
>>>>    static int clp_list_pci_req(struct clp_req_rsp_list_pci *rrb,
>>>> -                u64 *resume_token, int *nentries)
>>>> +                u64 *resume_token, int *nentries, u32 *mdd)
>>>>    {
>>>>        int rc;
>>>> @@ -354,6 +354,8 @@ static int clp_list_pci_req(struct 
>>>> clp_req_rsp_list_pci *rrb,
>>>>        *nentries = (rrb->response.hdr.len - LIST_PCI_HDR_LEN) /
>>>>            rrb->response.entry_size;
>>>>        *resume_token = rrb->response.resume_token;
>>>> +    if (mdd)
>>>> +        *mdd = rrb->response.mdd;
>>>>        return rc;
>>>>    }
>>>> @@ -365,7 +367,7 @@ static int clp_list_pci(struct 
>>>> clp_req_rsp_list_pci *rrb, void *data,
>>>>        int nentries, i, rc;
>>>>        do {
>>>> -        rc = clp_list_pci_req(rrb, &resume_token, &nentries);
>>>> +        rc = clp_list_pci_req(rrb, &resume_token, &nentries, NULL);
>>>>            if (rc)
>>>>                return rc;
>>>>            for (i = 0; i < nentries; i++)
>>>> @@ -383,7 +385,7 @@ static int clp_find_pci(struct 
>>>> clp_req_rsp_list_pci *rrb, u32 fid,
>>>>        int nentries, i, rc;
>>>>        do {
>>>> -        rc = clp_list_pci_req(rrb, &resume_token, &nentries);
>>>> +        rc = clp_list_pci_req(rrb, &resume_token, &nentries, NULL);
>>>>            if (rc)
>>>>                return rc;
>>>>            fh_list = rrb->response.fh_list;
>>>> @@ -468,6 +470,26 @@ int clp_get_state(u32 fid, enum zpci_state *state)
>>>>        return rc;
>>>>    }
>>>> +int zpci_get_mdd(u32 *mdd)
>>>> +{
>>>> +    struct clp_req_rsp_list_pci *rrb;
>>>> +    u64 resume_token = 0;
>>>> +    int nentries, rc;
>>>> +
>>>> +    if (!mdd)
>>>> +        return -EINVAL;
>>>
>>> I think this test is not useful.
>>> The caller must take care not to call with a NULL pointer,
>>> which the only caller today makes sure of.
>>
>> what if the caller does it anyway?
>>
>> I think the test is useful. if passing NULL is a bug, then maybe
>> consider using BUG_ON, or WARN_ONCE
> 
> I think generally the caller is responsible for the test.
> In our case, for example, the caller can directly use the address
> of a u32 allocated on the stack or globally, and he knows whether a test
> is useful or not.
> 
> Of course we can systematically check all pointer parameters against
> NULL in every kernel function.
> But this is not userland, not even an inter-architecture core function,
> and we can expect the kernel programmer to program correctly.

I appreciate your optimism :)

> 
> In our special case, neither zpci_get_mdd() nor clp_list_pci_req()
> accesses *mdd if mdd is NULL, so passing a NULL mdd pointer will not
> trigger a fault.
> Also, the function is named zpci_get_mdd(u32 *mdd): if the caller does
> not give a pointer to mdd, which would be a quite stupid way to call a
> function named zpci_get_mdd(), he will just not get mdd, with no side
> effect.
> So the only purpose of having this test here is to tell the caller that
> he forgot to check his mdd allocation.
> My opinion is that he should have check.

Based on the thread of conversation, I'm going to assume you meant 'My 
opinion that he should -not- have the check here'

So, I'm generally a fan of being a bit defensive in paths that are not 
performance-intensive and would therefore typically tend to agree with 
Claudio.  But in this particular case, I'm willing to just drop this 
check and move on, primarily because 1) we only have a single caller 
already checking this case and 2) I don't really anticipate any 
additional callers later.

@Niklas do you care / is it OK to keep your R-b if I remove this 
if(!mdd) check?
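
For comparison, the defensive option raised earlier in this subthread (warn
once, then fail) might look roughly like the userspace sketch below. The
model_ names, the single global one-shot flag and the mask value are
assumptions for illustration; in the kernel, WARN_ON_ONCE() keeps its
one-shot state per call site rather than in one shared flag.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical one-shot flag, a userspace stand-in for WARN_ON_ONCE(). */
static int warned;

/* Returns cond unchanged, printing the message only the first time it holds. */
static int model_warn_on_once(int cond, const char *what)
{
	assert(what);		/* a message is mandatory */
	if (cond && !warned) {
		warned = 1;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Defensive variant: reject a NULL mdd loudly instead of silently. */
static int model_get_mdd_checked(uint32_t *mdd)
{
	if (model_warn_on_once(mdd == NULL, "get_mdd called with NULL"))
		return -EINVAL;
	*mdd = 0x00800000;	/* made-up mask value */
	return 0;
}
```

With a single caller that already passes a valid pointer, the warning branch
is never taken, which is the argument made above for dropping the check.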


Thread overview: 97+ messages (download: mbox.gz / follow: Atom feed)
2022-01-14 20:31 [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 01/30] s390/sclp: detect the zPCI load/store interpretation facility Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 02/30] s390/sclp: detect the AISII facility Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 03/30] s390/sclp: detect the AENI facility Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 04/30] s390/sclp: detect the AISI facility Matthew Rosato
2022-01-17  7:57   ` Thomas Huth
2022-01-14 20:31 ` [PATCH v2 05/30] s390/airq: pass more TPI info to airq handlers Matthew Rosato
2022-01-17  8:27   ` Thomas Huth
2022-01-14 20:31 ` [PATCH v2 06/30] s390/airq: allow for airq structure that uses an input vector Matthew Rosato
2022-01-17 12:29   ` Claudio Imbrenda
2022-01-18 18:52     ` Matthew Rosato
2022-01-18  9:50   ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 07/30] s390/pci: externalize the SIC operation controls and routine Matthew Rosato
2022-01-17 16:19   ` Niklas Schnelle
2022-01-26 10:07   ` Claudio Imbrenda
2022-01-27  9:57   ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 08/30] s390/pci: stash associated GISA designation Matthew Rosato
2022-01-24 14:08   ` Pierre Morel
2022-01-24 15:12     ` Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 09/30] s390/pci: export some routines related to RPCIT processing Matthew Rosato
2022-01-18  9:51   ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 10/30] s390/pci: stash dtsm and maxstbl Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 11/30] s390/pci: add helper function to find device by handle Matthew Rosato
2022-01-18  9:53   ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 12/30] s390/pci: get SHM information from list pci Matthew Rosato
2022-01-18 10:36   ` Pierre Morel
2022-01-26 10:13     ` Claudio Imbrenda
2022-01-27 13:41       ` Pierre Morel
2022-01-27 15:14         ` Matthew Rosato
2022-01-27 10:29   ` Niklas Schnelle
2022-01-14 20:31 ` [PATCH v2 13/30] s390/pci: return status from zpci_refresh_trans Matthew Rosato
2022-01-19 18:13   ` Pierre Morel
2022-01-26 10:45   ` Claudio Imbrenda
2022-01-27 10:30   ` Niklas Schnelle
2022-01-14 20:31 ` [PATCH v2 14/30] KVM: s390: pci: add basic kvm_zdev structure Matthew Rosato
2022-01-17 16:25   ` Pierre Morel
2022-01-18 17:32     ` Pierre Morel
2022-01-18 18:39       ` Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 15/30] KVM: s390: pci: do initial setup for AEN interpretation Matthew Rosato
2022-01-19 18:06   ` Pierre Morel
2022-01-19 20:19     ` Matthew Rosato
2022-01-25 12:23   ` Pierre Morel
2022-01-25 14:57     ` Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 16/30] KVM: s390: pci: enable host forwarding of Adapter Event Notifications Matthew Rosato
2022-01-17 17:38   ` Pierre Morel
2022-01-18 17:25     ` Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 17/30] KVM: s390: mechanism to enable guest zPCI Interpretation Matthew Rosato
2022-01-24 14:24   ` Pierre Morel
2022-01-24 15:28     ` Matthew Rosato
2022-01-24 17:15       ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 18/30] KVM: s390: pci: provide routines for enabling/disabling interpretation Matthew Rosato
2022-01-24 14:36   ` Pierre Morel
2022-01-24 15:14     ` Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 19/30] KVM: s390: pci: provide routines for enabling/disabling interrupt forwarding Matthew Rosato
2022-01-25 12:41   ` Pierre Morel
2022-01-25 15:44     ` Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 20/30] KVM: s390: pci: provide routines for enabling/disabling IOAT assist Matthew Rosato
2022-01-25 13:29   ` Pierre Morel
2022-01-25 14:47     ` Matthew Rosato
2022-01-26  8:30       ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 21/30] KVM: s390: pci: handle refresh of PCI translations Matthew Rosato
2022-01-19  9:29   ` Pierre Morel
2022-01-19 16:39     ` Matthew Rosato
2022-01-19 18:25       ` Pierre Morel
2022-01-19 20:02         ` Matthew Rosato
2022-01-20  9:47           ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 22/30] KVM: s390: intercept the rpcit instruction Matthew Rosato
2022-01-18 11:05   ` Pierre Morel
2022-01-18 17:27     ` Matthew Rosato
2022-01-18 17:54       ` Pierre Morel
2022-01-19 14:06   ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 23/30] vfio/pci: re-introduce CONFIG_VFIO_PCI_ZDEV Matthew Rosato
2022-01-18 17:20   ` Pierre Morel
2022-01-18 17:32     ` Matthew Rosato
2022-01-18 17:45       ` Pierre Morel
2022-01-18 18:05         ` Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 24/30] vfio-pci/zdev: wire up group notifier Matthew Rosato
2022-01-18 17:34   ` Pierre Morel
2022-01-18 18:37     ` Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 25/30] vfio-pci/zdev: wire up zPCI interpretive execution support Matthew Rosato
2022-01-25 13:01   ` Pierre Morel
2022-01-25 14:21     ` Matthew Rosato
2022-01-14 20:31 ` [PATCH v2 26/30] vfio-pci/zdev: wire up zPCI adapter interrupt forwarding support Matthew Rosato
2022-01-19 17:10   ` Pierre Morel
2022-01-19 17:20     ` Matthew Rosato
2022-01-25 12:36   ` Pierre Morel
2022-01-25 14:16     ` Matthew Rosato
2022-01-26  8:24       ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 27/30] vfio-pci/zdev: wire up zPCI IOAT assist support Matthew Rosato
2022-01-19 14:03   ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 28/30] vfio-pci/zdev: add DTSM to clp group capability Matthew Rosato
2022-01-19 13:48   ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 29/30] KVM: s390: introduce CPU feature for zPCI Interpretation Matthew Rosato
2022-01-19 13:39   ` Pierre Morel
2022-01-14 20:31 ` [PATCH v2 30/30] MAINTAINERS: additional files related kvm s390 pci passthrough Matthew Rosato
2022-01-14 20:49 ` [PATCH v2 00/30] KVM: s390: enable zPCI for interpretive execution Matthew Rosato
2022-01-19 18:10 ` Pierre Morel
