kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control
@ 2019-05-21 15:34 Pierre Morel
  2019-05-21 15:34 ` [PATCH v9 1/4] s390: ap: kvm: add PQAP interception for AQIC Pierre Morel
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Pierre Morel @ 2019-05-21 15:34 UTC (permalink / raw)
  To: borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	akrowiak, pasic, david, schwidefsky, heiko.carstens, freude,
	mimu

This patch series implements PQAP/AQIC interception in KVM.

1) Data to handle GISA interrupt for AQIC

To implement this we need to add a new structure, vfio_ap_queue,
to be able to retrieve the mediated device associated with a queue
and specific values needed to register/unregister the interrupt
structures:
  - APQN: to be able to issue the commands and search for queue
    structures
  - saved NIB : to keep track of the pin page for unpining it
  - saved ISC : to unregister with the GIB interface
  - matrix_mdev: to retrieve the associate matrix, the mediated device
    and KVM

Specific handling bei keeping old values when re-registering is
needed because the guest could unregister interrupt in a invisble
manner bei issuing an un-interceptible RESET command.

Reset commands issued directly by the guest and indirectly when
removing the guest unpin the memory and deregister the ISC.

The vfio_ap_queue is associated to the ap_device during the probe
of the device and dissociated during the remove of the ap_device.

The vfio_ap_queue is associated to the matrix mediated device during
each interception of the AQIC command, so it does not need to be
dissociated until the guest is terminated.

The life of the vfio_ap_queue will be protected by the matrix_dev lock
to guaranty that no change can occur to the CRYCB or that devices can
not be removed when a vfio_ap_queue is in use.

2) KVM destroy race conditions

To make sure that KVM do not vanish and GISA is still available
when the VFIO_AP driver is in used we take a reference to KVM
during the opening of the mediated device and release it on
releasing the mediated device.

3) Interception of PQAP

The driver registers a hook structure to KVM providing:
- a pointer to a function implementing PQAP(AQIC) handling
- the reference to the module owner of the hook

On interception by KVM we do not change the behavior, returning
 -EOPNOTSUPP to the user in the case AP instructions are not
supported by the host or by the guest.
Otherwise we verify the exceptions cases before trying to call 
the vfio_ap hook.

In the case we do not find a hook we assume that the CRYCB has not
been setup for the guest and is empty.

4) Enabling and disabling the IRQ

When enabling the IRQ care is taken to unping the saved NIB.
When disabling IRQ care is taken to wait until the IRQ bit
of the queue status is cleared before unpining the NIB.

On RESET and before unpinning the NIB and unregistering the ISC
the IRQ is disabled using PQAP/AQIC even when a PQAP/APZQ have
been done.

5) Removing the AP device

Removing the AP device without having unassign it is clearly
discourage by the documentation.
The patch series does not check if the queue is used by a
guest. It only de-register the IRQ, unregister ISC and unpin
the NIB, then free the vfio_ap_queue.

6) Associated QEMU patch

There is a QEMU patch which is needed to enable the PQAP/AQIC
facility in the guest.

Posted in qemu-devel@nongnu.org as:
Message-Id: <1550146494-21085-1-git-send-email-pmorel@linux.ibm.com>

7) Compatibility with Dynamic configuration patches

Tony, I did not rebase this series above the dynamic configuration
patches because:
- This series do the work it needs to do without having to take
  care on the dynamic configuration.
- It is guarantied that interrupt will be shut off after removing
  the APQueue device
- The dynamic configuration series is not converging.

However Tony, the choice is your's, I won't be able to help
in a near future.


Pierre Morel (4):
  s390: ap: kvm: add PQAP interception for AQIC
  vfio: ap: register IOMMU VFIO notifier
  s390: ap: implement PAPQ AQIC interception in kernel
  s390: ap: kvm: Enable PQAP/AQIC facility for the guest

 arch/s390/include/asm/kvm_host.h      |   7 +
 arch/s390/kvm/priv.c                  |  86 ++++++++
 arch/s390/tools/gen_facilities.c      |   1 +
 drivers/s390/crypto/vfio_ap_drv.c     |  34 ++-
 drivers/s390/crypto/vfio_ap_ops.c     | 379 +++++++++++++++++++++++++++++++++-
 drivers/s390/crypto/vfio_ap_private.h |  15 ++
 6 files changed, 514 insertions(+), 8 deletions(-)

-- 
2.7.4

Changelog:

Changelog from v8:
- mask the reserved bits when testing the FC in pqap interception
  (Tony)

Changelog from v7:
- Modification of the IRQ disable routine to call ap_aqic
  even a ap_zapq has been done
  (to answer a question from Christian)
- use GISA origin instead of gisa next_alert field to
  initialize ap_aqic register 2
  (Halil)
- Corection of the testing of the vfio_pin_pages return value.
  (Halil)
- Only keep track of saved_isc and saved_pfn for a later 
  interrupt disabling.
  (Halil)
- renaming the routine to enable/disable the interruptions

Changelog from v6:
- Not taking care if the AP queue is associated with a guest
  admin is warn in the odcumentation
 (Tony, Halil)
- Using WARN_ON_ONCE, direct call to report specification errors
  (Christian)
- Wait until the IRQ bit is cleared when clearing interrupts
 (Tony, Halil)
- Some minor changes and add some comments before
  vfio_ap_free_irq_data
  (Pierre)
- initializing the pointer to matrix_mdev in vfio_ap_queue 
  during the interception and suppress the association during
  assignment and the usage of lists.
  (Tony, Halil)
- Merging patches for creation of vfio_ap_queue, initialization
  and use of vfio_ap_queue during interception of PQAP/AQIC
  (Conny)

Changelog from v5:
- Refactoring of the PQAP interception after all discussions
  (Conny, Halil (offline))
- take a big lock around open to avoid parallel changes through
  assignment
- verify that at least one queue has a APID or APQI when
  first assignment is done to not accept unavailable APID/APQI
  (myself)
- Adding comment for locks on free_list
  (Conny)
- Modified comment for 
  "s390: ap: setup relation betwen KVM and mediated device"
  (Halil)

Changelog from v4:
- Add forgotten locking for vfio_get_queue() in pqap callback
  (Conny / Halil)
- Add KVM reference counting to make sure GISA is free after IRQ
  (Christian / Halil)
- Take care that ISC = 0 is a valid ISC
  (Halil)
- Integrate the PQAP call back in a structure with module owner
  reference counting to make sure the callback does not disappear.
- Restrict functionality to always open KVM before opening the
  VFIO device.
- Search all devices in the vfio_ap driver list when associating
  a queue to a mediated device
  (Halil / Tony)
- Get vfio_ap_free_irq() out of vfio_ap_mdev_reset_queue() to call
  it always, whatever the result of the reset.
  (Tony)

Changelog from v3:
- Associating the vfio_queues during APID/APQI assign
  (Tony)
- Dissociating the vfio_queues during APID/APQI unassign
  (Tony)
- Taking care that the guest can directly disable the interrupt
  by using a RESET
  (Halil)
- Remove the patch creating the matrix bus to accelerate its
  integration in Linux stable
  (Christian)

Changelog from v1:
- Refactoring to handle interception in kernel instead of in
  QEMU
  (Halil)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v9 1/4] s390: ap: kvm: add PQAP interception for AQIC
  2019-05-21 15:34 [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control Pierre Morel
@ 2019-05-21 15:34 ` Pierre Morel
  2019-06-12 13:55   ` Harald Freudenberger
  2019-05-21 15:34 ` [PATCH v9 2/4] vfio: ap: register IOMMU VFIO notifier Pierre Morel
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Pierre Morel @ 2019-05-21 15:34 UTC (permalink / raw)
  To: borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	akrowiak, pasic, david, schwidefsky, heiko.carstens, freude,
	mimu

We prepare the interception of the PQAP/AQIC instruction for
the case the AQIC facility is enabled in the guest.

First of all we do not want to change existing behavior when
intercepting AP instructions without the SIE allowing the guest
to use AP instructions.

In this patch we only handle the AQIC interception allowed by
facility 65 which will be enabled when the complete interception
infrastructure will be present.

We add a callback inside the KVM arch structure for s390 for
a VFIO driver to handle a specific response to the PQAP
instruction with the AQIC command and only this command.

But we want to be able to return a correct answer to the guest
even there is no VFIO AP driver in the kernel.
Therefor, we inject the correct exceptions from inside KVM for the
case the callback is not initialized, which happens when the vfio_ap
driver is not loaded.

We do consider the responsability of the driver to always initialize
the PQAP callback if it defines queues by initializing the CRYCB for
a guest.
If the callback has been setup we call it.
If not we setup an answer considering that no queue is available
for the guest when no callback has been setup.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 arch/s390/include/asm/kvm_host.h      |  7 +++
 arch/s390/kvm/priv.c                  | 86 +++++++++++++++++++++++++++++++++++
 drivers/s390/crypto/vfio_ap_private.h |  2 +
 3 files changed, 95 insertions(+)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index faa7ebf..a0fc2f1 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -18,6 +18,7 @@
 #include <linux/kvm_host.h>
 #include <linux/kvm.h>
 #include <linux/seqlock.h>
+#include <linux/module.h>
 #include <asm/debug.h>
 #include <asm/cpu.h>
 #include <asm/fpu/api.h>
@@ -723,8 +724,14 @@ struct kvm_s390_cpu_model {
 	unsigned short ibc;
 };
 
+struct kvm_s390_module_hook {
+	int (*hook)(struct kvm_vcpu *vcpu);
+	struct module *owner;
+};
+
 struct kvm_s390_crypto {
 	struct kvm_s390_crypto_cb *crycb;
+	struct kvm_s390_module_hook *pqap_hook;
 	__u32 crycbd;
 	__u8 aes_kw;
 	__u8 dea_kw;
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 8679bd7..ed52ffa 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -27,6 +27,7 @@
 #include <asm/io.h>
 #include <asm/ptrace.h>
 #include <asm/sclp.h>
+#include <asm/ap.h>
 #include "gaccess.h"
 #include "kvm-s390.h"
 #include "trace.h"
@@ -592,6 +593,89 @@ static int handle_io_inst(struct kvm_vcpu *vcpu)
 	}
 }
 
+/*
+ * handle_pqap: Handling pqap interception
+ * @vcpu: the vcpu having issue the pqap instruction
+ *
+ * We now support PQAP/AQIC instructions and we need to correctly
+ * answer the guest even if no dedicated driver's hook is available.
+ *
+ * The intercepting code calls a dedicated callback for this instruction
+ * if a driver did register one in the CRYPTO satellite of the
+ * SIE block.
+ *
+ * If no callback is available, the queues are not available, return this
+ * response code to the caller and set CC to 3.
+ * Else return the response code returned by the callback.
+ */
+static int handle_pqap(struct kvm_vcpu *vcpu)
+{
+	struct ap_queue_status status = {};
+	unsigned long reg0;
+	int ret;
+	uint8_t fc;
+
+	/* Verify that the AP instruction are available */
+	if (!ap_instructions_available())
+		return -EOPNOTSUPP;
+	/* Verify that the guest is allowed to use AP instructions */
+	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
+		return -EOPNOTSUPP;
+	/*
+	 * The only possibly intercepted functions when AP instructions are
+	 * available for the guest are AQIC and TAPQ with the t bit set
+	 * since we do not set IC.3 (FIII) we currently will only intercept
+	 * the AQIC function code.
+	 */
+	reg0 = vcpu->run->s.regs.gprs[0];
+	fc = (reg0 >> 24) & 0xff;
+	if (WARN_ON_ONCE(fc != 0x03))
+		return -EOPNOTSUPP;
+
+	/* PQAP instruction is allowed for guest kernel only */
+	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
+		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
+
+	/* Common PQAP instruction specification exceptions */
+	/* bits 41-47 must all be zeros */
+	if (reg0 & 0x007f0000UL)
+		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
+	/* APFT not install and T bit set */
+	if (!test_kvm_facility(vcpu->kvm, 15) && (reg0 & 0x00800000UL))
+		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
+	/* APXA not installed and APID greater 64 or APQI greater 16 */
+	if (!(vcpu->kvm->arch.crypto.crycbd & 0x02) && (reg0 & 0x0000c0f0UL))
+		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
+
+	/* AQIC function code specific exception */
+	/* facility 65 not present for AQIC function code */
+	if (!test_kvm_facility(vcpu->kvm, 65))
+		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
+
+	/*
+	 * Verify that the hook callback is registered, lock the owner
+	 * and call the hook.
+	 */
+	if (vcpu->kvm->arch.crypto.pqap_hook) {
+		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
+			return -EOPNOTSUPP;
+		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
+		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
+		if (!ret && vcpu->run->s.regs.gprs[1] & 0x00ff0000)
+			kvm_s390_set_psw_cc(vcpu, 3);
+		return ret;
+	}
+	/*
+	 * A vfio_driver must register a hook.
+	 * No hook means no driver to enable the SIE CRYCB and no queues.
+	 * We send this response to the guest.
+	 */
+	status.response_code = 0x01;
+	memcpy(&vcpu->run->s.regs.gprs[1], &status, sizeof(status));
+	kvm_s390_set_psw_cc(vcpu, 3);
+	return 0;
+}
+
 static int handle_stfl(struct kvm_vcpu *vcpu)
 {
 	int rc;
@@ -878,6 +962,8 @@ int kvm_s390_handle_b2(struct kvm_vcpu *vcpu)
 		return handle_sthyi(vcpu);
 	case 0x7d:
 		return handle_stsi(vcpu);
+	case 0xaf:
+		return handle_pqap(vcpu);
 	case 0xb1:
 		return handle_stfl(vcpu);
 	case 0xb2:
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 76b7f98..a910be1 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -16,6 +16,7 @@
 #include <linux/mdev.h>
 #include <linux/delay.h>
 #include <linux/mutex.h>
+#include <linux/kvm_host.h>
 
 #include "ap_bus.h"
 
@@ -81,6 +82,7 @@ struct ap_matrix_mdev {
 	struct ap_matrix matrix;
 	struct notifier_block group_notifier;
 	struct kvm *kvm;
+	struct kvm_s390_module_hook pqap_hook;
 };
 
 extern int vfio_ap_mdev_register(void);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v9 2/4] vfio: ap: register IOMMU VFIO notifier
  2019-05-21 15:34 [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control Pierre Morel
  2019-05-21 15:34 ` [PATCH v9 1/4] s390: ap: kvm: add PQAP interception for AQIC Pierre Morel
@ 2019-05-21 15:34 ` Pierre Morel
  2019-06-12 13:56   ` Harald Freudenberger
  2019-05-21 15:34 ` [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel Pierre Morel
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Pierre Morel @ 2019-05-21 15:34 UTC (permalink / raw)
  To: borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	akrowiak, pasic, david, schwidefsky, heiko.carstens, freude,
	mimu

To be able to use the VFIO interface to facilitate the
mediated device memory pinning/unpinning we need to register
a notifier for IOMMU.

While we will start to pin one guest page for the interrupt indicator
byte, this is still ok with ballooning as this page will never be
used by the guest virtio-balloon driver.
So the pinned page will never be freed. And even a broken guest does
so, that would not impact the host as the original page is still
in control by vfio.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_ops.c     | 43 ++++++++++++++++++++++++++++++++++-
 drivers/s390/crypto/vfio_ap_private.h |  2 ++
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 900b9cf..e8e87bf 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -759,6 +759,35 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
 	return 0;
 }
 
+/*
+ * vfio_ap_mdev_iommu_notifier: IOMMU notifier callback
+ *
+ * @nb: The notifier block
+ * @action: Action to be taken
+ * @data: data associated with the request
+ *
+ * For an UNMAP request, unpin the guest IOVA (the NIB guest address we
+ * pinned before). Other requests are ignored.
+ *
+ */
+static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
+				       unsigned long action, void *data)
+{
+	struct ap_matrix_mdev *matrix_mdev;
+
+	matrix_mdev = container_of(nb, struct ap_matrix_mdev, iommu_notifier);
+
+	if (action == VFIO_IOMMU_NOTIFY_DMA_UNMAP) {
+		struct vfio_iommu_type1_dma_unmap *unmap = data;
+		unsigned long g_pfn = unmap->iova >> PAGE_SHIFT;
+
+		vfio_unpin_pages(mdev_dev(matrix_mdev->mdev), &g_pfn, 1);
+		return NOTIFY_OK;
+	}
+
+	return NOTIFY_DONE;
+}
+
 static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
 				       unsigned long action, void *data)
 {
@@ -858,7 +887,17 @@ static int vfio_ap_mdev_open(struct mdev_device *mdev)
 		return ret;
 	}
 
-	return 0;
+	matrix_mdev->iommu_notifier.notifier_call = vfio_ap_mdev_iommu_notifier;
+	events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
+	ret = vfio_register_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
+				     &events, &matrix_mdev->iommu_notifier);
+	if (!ret)
+		return ret;
+
+	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
+				 &matrix_mdev->group_notifier);
+	module_put(THIS_MODULE);
+	return ret;
 }
 
 static void vfio_ap_mdev_release(struct mdev_device *mdev)
@@ -869,6 +908,8 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
 		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
 
 	vfio_ap_mdev_reset_queues(mdev);
+	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
+				 &matrix_mdev->iommu_notifier);
 	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
 				 &matrix_mdev->group_notifier);
 	matrix_mdev->kvm = NULL;
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index a910be1..18dcc4d 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -81,8 +81,10 @@ struct ap_matrix_mdev {
 	struct list_head node;
 	struct ap_matrix matrix;
 	struct notifier_block group_notifier;
+	struct notifier_block iommu_notifier;
 	struct kvm *kvm;
 	struct kvm_s390_module_hook pqap_hook;
+	struct mdev_device *mdev;
 };
 
 extern int vfio_ap_mdev_register(void);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel
  2019-05-21 15:34 [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control Pierre Morel
  2019-05-21 15:34 ` [PATCH v9 1/4] s390: ap: kvm: add PQAP interception for AQIC Pierre Morel
  2019-05-21 15:34 ` [PATCH v9 2/4] vfio: ap: register IOMMU VFIO notifier Pierre Morel
@ 2019-05-21 15:34 ` Pierre Morel
  2019-05-23 15:43   ` Tony Krowiak
                     ` (2 more replies)
  2019-05-21 15:34 ` [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest Pierre Morel
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 22+ messages in thread
From: Pierre Morel @ 2019-05-21 15:34 UTC (permalink / raw)
  To: borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	akrowiak, pasic, david, schwidefsky, heiko.carstens, freude,
	mimu

We register a AP PQAP instruction hook during the open
of the mediated device. And unregister it on release.

During the probe of the AP device, we allocate a vfio_ap_queue
structure to keep track of the information we need for the
PQAP/AQIC instruction interception.

In the AP PQAP instruction hook, if we receive a demand to
enable IRQs,
- we retrieve the vfio_ap_queue based on the APQN we receive
  in REG1,
- we retrieve the page of the guest address, (NIB), from
  register REG2
- we retrieve the mediated device to use the VFIO pinning
  infrastructure to pin the page of the guest address,
- we retrieve the pointer to KVM to register the guest ISC
  and retrieve the host ISC
- finaly we activate GISA

If we receive a demand to disable IRQs,
- we deactivate GISA
- unregister from the GIB
- unpin the NIB

When removing the AP device from the driver the device is
reseted and this process unregisters the GISA from the GIB,
and unpins the NIB address then we free the vfio_ap_queue
structure.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_drv.c     |  34 +++-
 drivers/s390/crypto/vfio_ap_ops.c     | 336 +++++++++++++++++++++++++++++++++-
 drivers/s390/crypto/vfio_ap_private.h |  11 ++
 3 files changed, 374 insertions(+), 7 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
index e9824c3..003662a 100644
--- a/drivers/s390/crypto/vfio_ap_drv.c
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -5,6 +5,7 @@
  * Copyright IBM Corp. 2018
  *
  * Author(s): Tony Krowiak <akrowiak@linux.ibm.com>
+ *	      Pierre Morel <pmorel@linux.ibm.com>
  */
 
 #include <linux/module.h>
@@ -40,14 +41,45 @@ static struct ap_device_id ap_queue_ids[] = {
 
 MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
 
+/**
+ * vfio_ap_queue_dev_probe:
+ *
+ * Allocate a vfio_ap_queue structure and associate it
+ * with the device as driver_data.
+ */
 static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
 {
+	struct vfio_ap_queue *q;
+
+	q = kzalloc(sizeof(*q), GFP_KERNEL);
+	if (!q)
+		return -ENOMEM;
+	dev_set_drvdata(&apdev->device, q);
+	q->apqn = to_ap_queue(&apdev->device)->qid;
+	q->saved_isc = VFIO_AP_ISC_INVALID;
 	return 0;
 }
 
+/**
+ * vfio_ap_queue_dev_remove:
+ *
+ * Takes the matrix lock to avoid actions on this device while removing
+ * Free the associated vfio_ap_queue structure
+ */
 static void vfio_ap_queue_dev_remove(struct ap_device *apdev)
 {
-	/* Nothing to do yet */
+	struct vfio_ap_queue *q;
+	int apid, apqi;
+
+	mutex_lock(&matrix_dev->lock);
+	q = dev_get_drvdata(&apdev->device);
+	dev_set_drvdata(&apdev->device, NULL);
+	apid = AP_QID_CARD(q->apqn);
+	apqi = AP_QID_QUEUE(q->apqn);
+	vfio_ap_mdev_reset_queue(apid, apqi, 1);
+	vfio_ap_irq_disable(q);
+	kfree(q);
+	mutex_unlock(&matrix_dev->lock);
 }
 
 static void vfio_ap_matrix_dev_release(struct device *dev)
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index e8e87bf..015174f 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -24,6 +24,295 @@
 #define VFIO_AP_MDEV_TYPE_HWVIRT "passthrough"
 #define VFIO_AP_MDEV_NAME_HWVIRT "VFIO AP Passthrough Device"
 
+static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev);
+
+static int match_apqn(struct device *dev, void *data)
+{
+	struct vfio_ap_queue *q = dev_get_drvdata(dev);
+
+	return (q->apqn == *(int *)(data)) ? 1 : 0;
+}
+
+/**
+ * vfio_ap_get_queue: Retrieve a queue with a specific APQN from a list
+ * @matrix_mdev: the associated mediated matrix
+ * @apqn: The queue APQN
+ *
+ * Retrieve a queue with a specific APQN from the list of the
+ * devices of the vfio_ap_drv.
+ * Verify that the APID and the APQI are set in the matrix.
+ *
+ * Returns the pointer to the associated vfio_ap_queue
+ */
+struct vfio_ap_queue *vfio_ap_get_queue(struct ap_matrix_mdev *matrix_mdev,
+					int apqn)
+{
+	struct vfio_ap_queue *q;
+	struct device *dev;
+
+	if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm))
+		return NULL;
+	if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm))
+		return NULL;
+
+	dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
+				 &apqn, match_apqn);
+	if (!dev)
+		return NULL;
+	q = dev_get_drvdata(dev);
+	q->matrix_mdev = matrix_mdev;
+	put_device(dev);
+
+	return q;
+}
+
+/**
+ * vfio_ap_wait_for_irqclear
+ * @apqn: The AP Queue number
+ *
+ * Checks the IRQ bit for the status of this APQN using ap_tapq.
+ * Returns if the ap_tapq function succedded and the bit is clear.
+ * Returns if ap_tapq function failed with invalid, deconfigured or
+ * checkstopped AP.
+ * Otherwise retries up to 5 times after waiting 20ms.
+ *
+ */
+static void vfio_ap_wait_for_irqclear(int apqn)
+{
+	struct ap_queue_status status;
+	int retry = 5;
+
+	do {
+		status = ap_tapq(apqn, NULL);
+		switch (status.response_code) {
+		case AP_RESPONSE_NORMAL:
+		case AP_RESPONSE_RESET_IN_PROGRESS:
+			if (!status.irq_enabled)
+				return;
+			/* Fall through */
+		case AP_RESPONSE_BUSY:
+			msleep(20);
+			break;
+		case AP_RESPONSE_Q_NOT_AVAIL:
+		case AP_RESPONSE_DECONFIGURED:
+		case AP_RESPONSE_CHECKSTOPPED:
+		default:
+			WARN_ONCE(1, "%s: tapq rc %02x: %04x\n", __func__,
+				  status.response_code, apqn);
+			return;
+		}
+	} while (--retry);
+
+	WARN_ONCE(1, "%s: tapq rc %02x: %04x could not clear IR bit\n",
+		  __func__, status.response_code, apqn);
+}
+
+/**
+ * vfio_ap_free_aqic_resources
+ * @q: The vfio_ap_queue
+ *
+ * Unregisters the ISC in the GIB when the saved ISC not invalid.
+ * Unpin the guest's page holding the NIB when it exist.
+ * Reset the saved_pfn and saved_isc to invalid values.
+ * Clear the pointer to the matrix mediated device.
+ *
+ */
+static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q)
+{
+	if (q->saved_isc != VFIO_AP_ISC_INVALID && q->matrix_mdev)
+		kvm_s390_gisc_unregister(q->matrix_mdev->kvm, q->saved_isc);
+	if (q->saved_pfn && q->matrix_mdev)
+		vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
+				 &q->saved_pfn, 1);
+	q->saved_pfn = 0;
+	q->saved_isc = VFIO_AP_ISC_INVALID;
+	q->matrix_mdev = NULL;
+}
+
+/**
+ * vfio_ap_irq_disable
+ * @q: The vfio_ap_queue
+ *
+ * Uses ap_aqic to disable the interruption and in case of success, reset
+ * in progress or IRQ disable command already proceeded: calls
+ * vfio_ap_wait_for_irqclear() to check for the IRQ bit to be clear
+ * and calls vfio_ap_free_aqic_resources() to free the resources associated
+ * with the AP interrupt handling.
+ *
+ * In the case the AP is busy, or a reset is in progress,
+ * retries after 20ms, up to 5 times.
+ *
+ * Returns if ap_aqic function failed with invalid, deconfigured or
+ * checkstopped AP.
+ */
+struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q)
+{
+	struct ap_qirq_ctrl aqic_gisa = {};
+	struct ap_queue_status status;
+	int retries = 5;
+
+	do {
+		status = ap_aqic(q->apqn, aqic_gisa, NULL);
+		switch (status.response_code) {
+		case AP_RESPONSE_OTHERWISE_CHANGED:
+		case AP_RESPONSE_NORMAL:
+			vfio_ap_wait_for_irqclear(q->apqn);
+			goto end_free;
+		case AP_RESPONSE_RESET_IN_PROGRESS:
+		case AP_RESPONSE_BUSY:
+			msleep(20);
+			break;
+		case AP_RESPONSE_Q_NOT_AVAIL:
+		case AP_RESPONSE_DECONFIGURED:
+		case AP_RESPONSE_CHECKSTOPPED:
+		case AP_RESPONSE_INVALID_ADDRESS:
+		default:
+			/* All cases in default means AP not operational */
+			WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__,
+				  status.response_code);
+			goto end_free;
+		}
+	} while (retries--);
+
+	WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__,
+		  status.response_code);
+end_free:
+	vfio_ap_free_aqic_resources(q);
+	return status;
+}
+
+/**
+ * vfio_ap_setirq: Enable Interruption for a APQN
+ *
+ * @dev: the device associated with the ap_queue
+ * @q:   the vfio_ap_queue holding AQIC parameters
+ *
+ * Pin the NIB saved in *q
+ * Register the guest ISC to GIB interface and retrieve the
+ * host ISC to issue the host side PQAP/AQIC
+ *
+ * Response.status may be set to AP_RESPONSE_INVALID_ADDRESS in case the
+ * vfio_pin_pages failed.
+ *
+ * Otherwise return the ap_queue_status returned by the ap_aqic(),
+ * all retry handling will be done by the guest.
+ */
+static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q,
+						 int isc,
+						 unsigned long nib)
+{
+	struct ap_qirq_ctrl aqic_gisa = {};
+	struct ap_queue_status status = {};
+	struct kvm_s390_gisa *gisa;
+	struct kvm *kvm;
+	unsigned long h_nib, g_pfn, h_pfn;
+	int ret;
+
+	g_pfn = nib >> PAGE_SHIFT;
+	ret = vfio_pin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1,
+			     IOMMU_READ | IOMMU_WRITE, &h_pfn);
+	switch (ret) {
+	case 1:
+		break;
+	default:
+		status.response_code = AP_RESPONSE_INVALID_ADDRESS;
+		return status;
+	}
+
+	kvm = q->matrix_mdev->kvm;
+	gisa = kvm->arch.gisa_int.origin;
+
+	h_nib = (h_pfn << PAGE_SHIFT) | (nib & ~PAGE_MASK);
+	aqic_gisa.gisc = isc;
+	aqic_gisa.isc = kvm_s390_gisc_register(kvm, isc);
+	aqic_gisa.ir = 1;
+	aqic_gisa.gisa = (uint64_t)gisa >> 4;
+
+	status = ap_aqic(q->apqn, aqic_gisa, (void *)h_nib);
+	switch (status.response_code) {
+	case AP_RESPONSE_NORMAL:
+		/* See if we did clear older IRQ configuration */
+		vfio_ap_free_aqic_resources(q);
+		q->saved_pfn = g_pfn;
+		q->saved_isc = isc;
+		break;
+	case AP_RESPONSE_OTHERWISE_CHANGED:
+		/* We could not modify IRQ setings: clear new configuration */
+		vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1);
+		kvm_s390_gisc_unregister(kvm, isc);
+		break;
+	default:
+		pr_warn("%s: apqn %04x: response: %02x\n", __func__, q->apqn,
+			status.response_code);
+		vfio_ap_irq_disable(q);
+		break;
+	}
+
+	return status;
+}
+
+/**
+ * handle_pqap: PQAP instruction callback
+ *
+ * @vcpu: The vcpu on which we received the PQAP instruction
+ *
+ * Get the general register contents to initialize internal variables.
+ * REG[0]: APQN
+ * REG[1]: IR and ISC
+ * REG[2]: NIB
+ *
+ * Response.status may be set to following Response Code:
+ * - AP_RESPONSE_Q_NOT_AVAIL: if the queue is not available
+ * - AP_RESPONSE_DECONFIGURED: if the queue is not configured
+ * - AP_RESPONSE_NORMAL (0) : in case of successs
+ *   Check vfio_ap_setirq() and vfio_ap_clrirq() for other possible RC.
+ * We take the matrix_dev lock to ensure serialization on queues and
+ * mediated device access.
+ *
+ * Return 0 if we could handle the request inside KVM.
+ * otherwise, returns -EOPNOTSUPP to let QEMU handle the fault.
+ */
+static int handle_pqap(struct kvm_vcpu *vcpu)
+{
+	uint64_t status;
+	uint16_t apqn;
+	struct vfio_ap_queue *q;
+	struct ap_queue_status qstatus = {
+			       .response_code = AP_RESPONSE_Q_NOT_AVAIL, };
+	struct ap_matrix_mdev *matrix_mdev;
+
+	/* If we do not use the AIV facility just go to userland */
+	if (!(vcpu->arch.sie_block->eca & ECA_AIV))
+		return -EOPNOTSUPP;
+
+	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
+	mutex_lock(&matrix_dev->lock);
+
+	if (!vcpu->kvm->arch.crypto.pqap_hook)
+		goto out_unlock;
+	matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
+				   struct ap_matrix_mdev, pqap_hook);
+
+	q = vfio_ap_get_queue(matrix_mdev, apqn);
+	if (!q)
+		goto out_unlock;
+
+	status = vcpu->run->s.regs.gprs[1];
+
+	/* If IR bit(16) is set we enable the interrupt */
+	if ((status >> (63 - 16)) & 0x01)
+		qstatus = vfio_ap_irq_enable(q, status & 0x07,
+					     vcpu->run->s.regs.gprs[2]);
+	else
+		qstatus = vfio_ap_irq_disable(q);
+
+out_unlock:
+	memcpy(&vcpu->run->s.regs.gprs[1], &qstatus, sizeof(qstatus));
+	vcpu->run->s.regs.gprs[1] >>= 32;
+	mutex_unlock(&matrix_dev->lock);
+	return 0;
+}
+
 static void vfio_ap_matrix_init(struct ap_config_info *info,
 				struct ap_matrix *matrix)
 {
@@ -45,8 +334,11 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
 		return -ENOMEM;
 	}
 
+	matrix_mdev->mdev = mdev;
 	vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
 	mdev_set_drvdata(mdev, matrix_mdev);
+	matrix_mdev->pqap_hook.hook = handle_pqap;
+	matrix_mdev->pqap_hook.owner = THIS_MODULE;
 	mutex_lock(&matrix_dev->lock);
 	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
 	mutex_unlock(&matrix_dev->lock);
@@ -62,6 +354,7 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
 		return -EBUSY;
 
 	mutex_lock(&matrix_dev->lock);
+	vfio_ap_mdev_reset_queues(mdev);
 	list_del(&matrix_mdev->node);
 	mutex_unlock(&matrix_dev->lock);
 
@@ -754,6 +1047,8 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
 	}
 
 	matrix_mdev->kvm = kvm;
+	kvm_get_kvm(kvm);
+	kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
 	mutex_unlock(&matrix_dev->lock);
 
 	return 0;
@@ -819,15 +1114,36 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
 	return NOTIFY_OK;
 }
 
-static int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
-				    unsigned int retry)
+static void vfio_ap_irq_disable_apqn(int apqn)
+{
+	struct device *dev;
+	struct vfio_ap_queue *q;
+
+	dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
+				 &apqn, match_apqn);
+	if (dev) {
+		q = dev_get_drvdata(dev);
+		vfio_ap_irq_disable(q);
+		put_device(dev);
+	}
+}
+
+int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
+			     unsigned int retry)
 {
 	struct ap_queue_status status;
+	int retry2 = 2;
+	int apqn = AP_MKQID(apid, apqi);
 
 	do {
-		status = ap_zapq(AP_MKQID(apid, apqi));
+		status = ap_zapq(apqn);
 		switch (status.response_code) {
 		case AP_RESPONSE_NORMAL:
+			while (!status.queue_empty && retry2--) {
+				msleep(20);
+				status = ap_tapq(apqn, NULL);
+			}
+			WARN_ON_ONCE(retry <= 0);
 			return 0;
 		case AP_RESPONSE_RESET_IN_PROGRESS:
 		case AP_RESPONSE_BUSY:
@@ -861,6 +1177,7 @@ static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev)
 			 */
 			if (ret)
 				rc = ret;
+			vfio_ap_irq_disable_apqn(AP_MKQID(apid, apqi));
 		}
 	}
 
@@ -904,15 +1221,20 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
 {
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 
-	if (matrix_mdev->kvm)
+	mutex_lock(&matrix_dev->lock);
+	if (matrix_mdev->kvm) {
 		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
+		matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
+		vfio_ap_mdev_reset_queues(mdev);
+		kvm_put_kvm(matrix_mdev->kvm);
+		matrix_mdev->kvm = NULL;
+	}
+	mutex_unlock(&matrix_dev->lock);
 
-	vfio_ap_mdev_reset_queues(mdev);
 	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
 				 &matrix_mdev->iommu_notifier);
 	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
 				 &matrix_mdev->group_notifier);
-	matrix_mdev->kvm = NULL;
 	module_put(THIS_MODULE);
 }
 
@@ -941,6 +1263,7 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
 {
 	int ret;
 
+	mutex_lock(&matrix_dev->lock);
 	switch (cmd) {
 	case VFIO_DEVICE_GET_INFO:
 		ret = vfio_ap_mdev_get_device_info(arg);
@@ -952,6 +1275,7 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
 		ret = -EOPNOTSUPP;
 		break;
 	}
+	mutex_unlock(&matrix_dev->lock);
 
 	return ret;
 }
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index 18dcc4d..f46dde5 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -4,6 +4,7 @@
  *
  * Author(s): Tony Krowiak <akrowiak@linux.ibm.com>
  *	      Halil Pasic <pasic@linux.ibm.com>
+ *	      Pierre Morel <pmorel@linux.ibm.com>
  *
  * Copyright IBM Corp. 2018
  */
@@ -89,5 +90,15 @@ struct ap_matrix_mdev {
 
 extern int vfio_ap_mdev_register(void);
 extern void vfio_ap_mdev_unregister(void);
+int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
+			     unsigned int retry);
 
+struct vfio_ap_queue {
+	struct ap_matrix_mdev *matrix_mdev;
+	unsigned long saved_pfn;
+	int	apqn;
+#define VFIO_AP_ISC_INVALID 0xff
+	unsigned char saved_isc;
+};
+struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q);
 #endif /* _VFIO_AP_PRIVATE_H_ */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest
  2019-05-21 15:34 [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control Pierre Morel
                   ` (2 preceding siblings ...)
  2019-05-21 15:34 ` [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel Pierre Morel
@ 2019-05-21 15:34 ` Pierre Morel
  2019-06-12 13:57   ` Harald Freudenberger
  2019-06-25 20:13   ` Christian Borntraeger
  2019-05-23 15:36 ` [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control Tony Krowiak
  2019-07-01 12:00 ` Halil Pasic
  5 siblings, 2 replies; 22+ messages in thread
From: Pierre Morel @ 2019-05-21 15:34 UTC (permalink / raw)
  To: borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	akrowiak, pasic, david, schwidefsky, heiko.carstens, freude,
	mimu

AP Queue Interruption Control (AQIC) facility gives
the guest the possibility to control interruption for
the Cryptographic Adjunct Processor queues.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 arch/s390/tools/gen_facilities.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/s390/tools/gen_facilities.c b/arch/s390/tools/gen_facilities.c
index 61ce5b5..aed14fc 100644
--- a/arch/s390/tools/gen_facilities.c
+++ b/arch/s390/tools/gen_facilities.c
@@ -114,6 +114,7 @@ static struct facility_def facility_defs[] = {
 		.bits = (int[]){
 			12, /* AP Query Configuration Information */
 			15, /* AP Facilities Test */
+			65, /* AP Queue Interruption Control */
 			156, /* etoken facility */
 			-1  /* END */
 		}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control
  2019-05-21 15:34 [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control Pierre Morel
                   ` (3 preceding siblings ...)
  2019-05-21 15:34 ` [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest Pierre Morel
@ 2019-05-23 15:36 ` Tony Krowiak
  2019-06-04 14:56   ` Halil Pasic
  2019-07-01 12:00 ` Halil Pasic
  5 siblings, 1 reply; 22+ messages in thread
From: Tony Krowiak @ 2019-05-23 15:36 UTC (permalink / raw)
  To: Pierre Morel, borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	pasic, david, schwidefsky, heiko.carstens, freude, mimu

On 5/21/19 11:34 AM, Pierre Morel wrote:
> This patch series implements PQAP/AQIC interception in KVM.
> 
> 1) Data to handle GISA interrupt for AQIC
> 
> To implement this we need to add a new structure, vfio_ap_queue,
> to be able to retrieve the mediated device associated with a queue
> and specific values needed to register/unregister the interrupt
> structures:
>    - APQN: to be able to issue the commands and search for queue
>      structures
>    - saved NIB : to keep track of the pin page for unpining it
>    - saved ISC : to unregister with the GIB interface
>    - matrix_mdev: to retrieve the associate matrix, the mediated device
>      and KVM
> 
> Specific handling bei keeping old values when re-registering is
> needed because the guest could unregister interrupt in a invisble
> manner bei issuing an un-interceptible RESET command.
> 
> Reset commands issued directly by the guest and indirectly when
> removing the guest unpin the memory and deregister the ISC.
> 
> The vfio_ap_queue is associated to the ap_device during the probe
> of the device and dissociated during the remove of the ap_device.
> 
> The vfio_ap_queue is associated to the matrix mediated device during
> each interception of the AQIC command, so it does not need to be
> dissociated until the guest is terminated.
> 
> The life of the vfio_ap_queue will be protected by the matrix_dev lock
> to guaranty that no change can occur to the CRYCB or that devices can
> not be removed when a vfio_ap_queue is in use.
> 
> 2) KVM destroy race conditions
> 
> To make sure that KVM do not vanish and GISA is still available
> when the VFIO_AP driver is in used we take a reference to KVM
> during the opening of the mediated device and release it on
> releasing the mediated device.
> 
> 3) Interception of PQAP
> 
> The driver registers a hook structure to KVM providing:
> - a pointer to a function implementing PQAP(AQIC) handling
> - the reference to the module owner of the hook
> 
> On interception by KVM we do not change the behavior, returning
>   -EOPNOTSUPP to the user in the case AP instructions are not
> supported by the host or by the guest.
> Otherwise we verify the exceptions cases before trying to call
> the vfio_ap hook.
> 
> In the case we do not find a hook we assume that the CRYCB has not
> been setup for the guest and is empty.
> 
> 4) Enabling and disabling the IRQ
> 
> When enabling the IRQ care is taken to unping the saved NIB.
> When disabling IRQ care is taken to wait until the IRQ bit
> of the queue status is cleared before unpining the NIB.
> 
> On RESET and before unpinning the NIB and unregistering the ISC
> the IRQ is disabled using PQAP/AQIC even when a PQAP/APZQ have
> been done.
> 
> 5) Removing the AP device
> 
> Removing the AP device without having unassign it is clearly
> discourage by the documentation.
> The patch series does not check if the queue is used by a
> guest. It only de-register the IRQ, unregister ISC and unpin
> the NIB, then free the vfio_ap_queue.
> 
> 6) Associated QEMU patch
> 
> There is a QEMU patch which is needed to enable the PQAP/AQIC
> facility in the guest.
> 
> Posted in qemu-devel@nongnu.org as:
> Message-Id: <1550146494-21085-1-git-send-email-pmorel@linux.ibm.com>
> 
> 7) Compatibility with Dynamic configuration patches
> 
> Tony, I did not rebase this series above the dynamic configuration
> patches because:
> - This series do the work it needs to do without having to take
>    care on the dynamic configuration.
> - It is guarantied that interrupt will be shut off after removing
>    the APQueue device
> - The dynamic configuration series is not converging.

Would you consider the following?

* Take dynconfig patch "s390: vfio-ap: wait for queue empty on queue
   reset" and include it in your series. This patch modifies the
   reset function to wait for queue empty.

* In dynconfig patch "s390: vfio-ap: handle bind and unbind of AP queue
   device" the following functions are introduced:

      void vfio_ap_mdev_probe_queue(struct ap_queue *queue)
      void vfio_ap_mdev_remove_queue(struct ap_queue *queue)

   The vfio_ap_mdev_probe_queue function is called from the vfio_ap
   driver probe callback. You could embed the code you've introduced in
   the probe callback there. Of course, you would need to return int
   from the function for the -ENOMEM error.

   The vfio_ap_mdev_remove_queue function is called from the vfio_ap
   driver remove callback. You could embed the code you've introduced in
   the remove callback there.

* Move your vfio_ap_irq_disable function to vfio_ap_ops.c and make it a
   static function.

* Leave the vfio_ap_mdev_reset_queue function as a static function in
   vfio_ap_ops.c

If you do the things above, then I can base the dynconfig series on
the IRQ series without much of a merge issue. What say you?

Note: I've included review comments for patch 3/4 to match the
       suggestions above.

> 
> However Tony, the choice is your's, I won't be able to help
> in a near future.
> 
> 
> Pierre Morel (4):
>    s390: ap: kvm: add PQAP interception for AQIC
>    vfio: ap: register IOMMU VFIO notifier
>    s390: ap: implement PAPQ AQIC interception in kernel
>    s390: ap: kvm: Enable PQAP/AQIC facility for the guest
> 
>   arch/s390/include/asm/kvm_host.h      |   7 +
>   arch/s390/kvm/priv.c                  |  86 ++++++++
>   arch/s390/tools/gen_facilities.c      |   1 +
>   drivers/s390/crypto/vfio_ap_drv.c     |  34 ++-
>   drivers/s390/crypto/vfio_ap_ops.c     | 379 +++++++++++++++++++++++++++++++++-
>   drivers/s390/crypto/vfio_ap_private.h |  15 ++
>   6 files changed, 514 insertions(+), 8 deletions(-)
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel
  2019-05-21 15:34 ` [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel Pierre Morel
@ 2019-05-23 15:43   ` Tony Krowiak
  2019-06-04 19:38   ` Tony Krowiak
  2019-06-12 13:56   ` Harald Freudenberger
  2 siblings, 0 replies; 22+ messages in thread
From: Tony Krowiak @ 2019-05-23 15:43 UTC (permalink / raw)
  To: Pierre Morel, borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	pasic, david, schwidefsky, heiko.carstens, freude, mimu

On 5/21/19 11:34 AM, Pierre Morel wrote:
> We register a AP PQAP instruction hook during the open
> of the mediated device. And unregister it on release.
> 
> During the probe of the AP device, we allocate a vfio_ap_queue
> structure to keep track of the information we need for the
> PQAP/AQIC instruction interception.
> 
> In the AP PQAP instruction hook, if we receive a demand to
> enable IRQs,
> - we retrieve the vfio_ap_queue based on the APQN we receive
>    in REG1,
> - we retrieve the page of the guest address, (NIB), from
>    register REG2
> - we retrieve the mediated device to use the VFIO pinning
>    infrastructure to pin the page of the guest address,
> - we retrieve the pointer to KVM to register the guest ISC
>    and retrieve the host ISC
> - finaly we activate GISA
> 
> If we receive a demand to disable IRQs,
> - we deactivate GISA
> - unregister from the GIB
> - unpin the NIB
> 
> When removing the AP device from the driver the device is
> reseted and this process unregisters the GISA from the GIB,
> and unpins the NIB address then we free the vfio_ap_queue
> structure.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>   drivers/s390/crypto/vfio_ap_drv.c     |  34 +++-
>   drivers/s390/crypto/vfio_ap_ops.c     | 336 +++++++++++++++++++++++++++++++++-
>   drivers/s390/crypto/vfio_ap_private.h |  11 ++
>   3 files changed, 374 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
> index e9824c3..003662a 100644
> --- a/drivers/s390/crypto/vfio_ap_drv.c
> +++ b/drivers/s390/crypto/vfio_ap_drv.c
> @@ -5,6 +5,7 @@
>    * Copyright IBM Corp. 2018
>    *
>    * Author(s): Tony Krowiak <akrowiak@linux.ibm.com>
> + *	      Pierre Morel <pmorel@linux.ibm.com>
>    */
>   
>   #include <linux/module.h>
> @@ -40,14 +41,45 @@ static struct ap_device_id ap_queue_ids[] = {
>   
>   MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
>   
> +/**
> + * vfio_ap_queue_dev_probe:
> + *
> + * Allocate a vfio_ap_queue structure and associate it
> + * with the device as driver_data.
> + */
>   static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
>   {
> +	struct vfio_ap_queue *q;
> +
> +	q = kzalloc(sizeof(*q), GFP_KERNEL);
> +	if (!q)
> +		return -ENOMEM;
> +	dev_set_drvdata(&apdev->device, q);
> +	q->apqn = to_ap_queue(&apdev->device)->qid;
> +	q->saved_isc = VFIO_AP_ISC_INVALID;

As per my comments in the cover letter, you would embed the code
above in the vfio_ap_mdev_probe_queue function and call it here.


>   	return 0;
>   }
>   
> +/**
> + * vfio_ap_queue_dev_remove:
> + *
> + * Takes the matrix lock to avoid actions on this device while removing
> + * Free the associated vfio_ap_queue structure
> + */
>   static void vfio_ap_queue_dev_remove(struct ap_device *apdev)
>   {
> -	/* Nothing to do yet */
> +	struct vfio_ap_queue *q;
> +	int apid, apqi;
> +
> +	mutex_lock(&matrix_dev->lock);
> +	q = dev_get_drvdata(&apdev->device);
> +	dev_set_drvdata(&apdev->device, NULL);
> +	apid = AP_QID_CARD(q->apqn);
> +	apqi = AP_QID_QUEUE(q->apqn);
> +	vfio_ap_mdev_reset_queue(apid, apqi, 1);
> +	vfio_ap_irq_disable(q);
> +	kfree(q);

As per my comments in the cover letter, you would embed the code
above in the vfio_ap_mdev_remove_queue function and call it here.

> +	mutex_unlock(&matrix_dev->lock);
>   }
>   
>   static void vfio_ap_matrix_dev_release(struct device *dev)
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index e8e87bf..015174f 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -24,6 +24,295 @@
>   #define VFIO_AP_MDEV_TYPE_HWVIRT "passthrough"
>   #define VFIO_AP_MDEV_NAME_HWVIRT "VFIO AP Passthrough Device"
>   
> +static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev);
> +
> +static int match_apqn(struct device *dev, void *data)
> +{
> +	struct vfio_ap_queue *q = dev_get_drvdata(dev);
> +
> +	return (q->apqn == *(int *)(data)) ? 1 : 0;
> +}
> +
> +/**
> + * vfio_ap_get_queue: Retrieve a queue with a specific APQN from a list
> + * @matrix_mdev: the associated mediated matrix
> + * @apqn: The queue APQN
> + *
> + * Retrieve a queue with a specific APQN from the list of the
> + * devices of the vfio_ap_drv.
> + * Verify that the APID and the APQI are set in the matrix.
> + *
> + * Returns the pointer to the associated vfio_ap_queue
> + */
> +struct vfio_ap_queue *vfio_ap_get_queue(struct ap_matrix_mdev *matrix_mdev,
> +					int apqn)
> +{
> +	struct vfio_ap_queue *q;
> +	struct device *dev;
> +
> +	if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm))
> +		return NULL;
> +	if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm))
> +		return NULL;
> +
> +	dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
> +				 &apqn, match_apqn);
> +	if (!dev)
> +		return NULL;
> +	q = dev_get_drvdata(dev);
> +	q->matrix_mdev = matrix_mdev;
> +	put_device(dev);
> +
> +	return q;
> +}
> +
> +/**
> + * vfio_ap_wait_for_irqclear
> + * @apqn: The AP Queue number
> + *
> + * Checks the IRQ bit for the status of this APQN using ap_tapq.
> + * Returns if the ap_tapq function succedded and the bit is clear.
> + * Returns if ap_tapq function failed with invalid, deconfigured or
> + * checkstopped AP.
> + * Otherwise retries up to 5 times after waiting 20ms.
> + *
> + */
> +static void vfio_ap_wait_for_irqclear(int apqn)
> +{
> +	struct ap_queue_status status;
> +	int retry = 5;
> +
> +	do {
> +		status = ap_tapq(apqn, NULL);
> +		switch (status.response_code) {
> +		case AP_RESPONSE_NORMAL:
> +		case AP_RESPONSE_RESET_IN_PROGRESS:
> +			if (!status.irq_enabled)
> +				return;
> +			/* Fall through */
> +		case AP_RESPONSE_BUSY:
> +			msleep(20);
> +			break;
> +		case AP_RESPONSE_Q_NOT_AVAIL:
> +		case AP_RESPONSE_DECONFIGURED:
> +		case AP_RESPONSE_CHECKSTOPPED:
> +		default:
> +			WARN_ONCE(1, "%s: tapq rc %02x: %04x\n", __func__,
> +				  status.response_code, apqn);
> +			return;
> +		}
> +	} while (--retry);
> +
> +	WARN_ONCE(1, "%s: tapq rc %02x: %04x could not clear IR bit\n",
> +		  __func__, status.response_code, apqn);
> +}
> +
> +/**
> + * vfio_ap_free_aqic_resources
> + * @q: The vfio_ap_queue
> + *
> + * Unregisters the ISC in the GIB when the saved ISC not invalid.
> + * Unpin the guest's page holding the NIB when it exist.
> + * Reset the saved_pfn and saved_isc to invalid values.
> + * Clear the pointer to the matrix mediated device.
> + *
> + */
> +static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q)
> +{
> +	if (q->saved_isc != VFIO_AP_ISC_INVALID && q->matrix_mdev)
> +		kvm_s390_gisc_unregister(q->matrix_mdev->kvm, q->saved_isc);
> +	if (q->saved_pfn && q->matrix_mdev)
> +		vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
> +				 &q->saved_pfn, 1);
> +	q->saved_pfn = 0;
> +	q->saved_isc = VFIO_AP_ISC_INVALID;
> +	q->matrix_mdev = NULL;
> +}
> +
> +/**
> + * vfio_ap_irq_disable
> + * @q: The vfio_ap_queue
> + *
> + * Uses ap_aqic to disable the interruption and in case of success, reset
> + * in progress or IRQ disable command already proceeded: calls
> + * vfio_ap_wait_for_irqclear() to check for the IRQ bit to be clear
> + * and calls vfio_ap_free_aqic_resources() to free the resources associated
> + * with the AP interrupt handling.
> + *
> + * In the case the AP is busy, or a reset is in progress,
> + * retries after 20ms, up to 5 times.
> + *
> + * Returns if ap_aqic function failed with invalid, deconfigured or
> + * checkstopped AP.
> + */
> +struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q)
> +{
> +	struct ap_qirq_ctrl aqic_gisa = {};
> +	struct ap_queue_status status;
> +	int retries = 5;
> +
> +	do {
> +		status = ap_aqic(q->apqn, aqic_gisa, NULL);
> +		switch (status.response_code) {
> +		case AP_RESPONSE_OTHERWISE_CHANGED:
> +		case AP_RESPONSE_NORMAL:
> +			vfio_ap_wait_for_irqclear(q->apqn);
> +			goto end_free;
> +		case AP_RESPONSE_RESET_IN_PROGRESS:
> +		case AP_RESPONSE_BUSY:
> +			msleep(20);
> +			break;
> +		case AP_RESPONSE_Q_NOT_AVAIL:
> +		case AP_RESPONSE_DECONFIGURED:
> +		case AP_RESPONSE_CHECKSTOPPED:
> +		case AP_RESPONSE_INVALID_ADDRESS:
> +		default:
> +			/* All cases in default means AP not operational */
> +			WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__,
> +				  status.response_code);
> +			goto end_free;
> +		}
> +	} while (retries--);
> +
> +	WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__,
> +		  status.response_code);
> +end_free:
> +	vfio_ap_free_aqic_resources(q);
> +	return status;
> +}
> +
> +/**
> + * vfio_ap_setirq: Enable Interruption for a APQN
> + *
> + * @dev: the device associated with the ap_queue
> + * @q:   the vfio_ap_queue holding AQIC parameters
> + *
> + * Pin the NIB saved in *q
> + * Register the guest ISC to GIB interface and retrieve the
> + * host ISC to issue the host side PQAP/AQIC
> + *
> + * Response.status may be set to AP_RESPONSE_INVALID_ADDRESS in case the
> + * vfio_pin_pages failed.
> + *
> + * Otherwise return the ap_queue_status returned by the ap_aqic(),
> + * all retry handling will be done by the guest.
> + */
> +static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q,
> +						 int isc,
> +						 unsigned long nib)
> +{
> +	struct ap_qirq_ctrl aqic_gisa = {};
> +	struct ap_queue_status status = {};
> +	struct kvm_s390_gisa *gisa;
> +	struct kvm *kvm;
> +	unsigned long h_nib, g_pfn, h_pfn;
> +	int ret;
> +
> +	g_pfn = nib >> PAGE_SHIFT;
> +	ret = vfio_pin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1,
> +			     IOMMU_READ | IOMMU_WRITE, &h_pfn);
> +	switch (ret) {
> +	case 1:
> +		break;
> +	default:
> +		status.response_code = AP_RESPONSE_INVALID_ADDRESS;
> +		return status;
> +	}
> +
> +	kvm = q->matrix_mdev->kvm;
> +	gisa = kvm->arch.gisa_int.origin;
> +
> +	h_nib = (h_pfn << PAGE_SHIFT) | (nib & ~PAGE_MASK);
> +	aqic_gisa.gisc = isc;
> +	aqic_gisa.isc = kvm_s390_gisc_register(kvm, isc);
> +	aqic_gisa.ir = 1;
> +	aqic_gisa.gisa = (uint64_t)gisa >> 4;
> +
> +	status = ap_aqic(q->apqn, aqic_gisa, (void *)h_nib);
> +	switch (status.response_code) {
> +	case AP_RESPONSE_NORMAL:
> +		/* See if we did clear older IRQ configuration */
> +		vfio_ap_free_aqic_resources(q);
> +		q->saved_pfn = g_pfn;
> +		q->saved_isc = isc;
> +		break;
> +	case AP_RESPONSE_OTHERWISE_CHANGED:
> +		/* We could not modify IRQ setings: clear new configuration */
> +		vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1);
> +		kvm_s390_gisc_unregister(kvm, isc);
> +		break;
> +	default:
> +		pr_warn("%s: apqn %04x: response: %02x\n", __func__, q->apqn,
> +			status.response_code);
> +		vfio_ap_irq_disable(q);
> +		break;
> +	}
> +
> +	return status;
> +}
> +
> +/**
> + * handle_pqap: PQAP instruction callback
> + *
> + * @vcpu: The vcpu on which we received the PQAP instruction
> + *
> + * Get the general register contents to initialize internal variables.
> + * REG[0]: APQN
> + * REG[1]: IR and ISC
> + * REG[2]: NIB
> + *
> + * Response.status may be set to following Response Code:
> + * - AP_RESPONSE_Q_NOT_AVAIL: if the queue is not available
> + * - AP_RESPONSE_DECONFIGURED: if the queue is not configured
> + * - AP_RESPONSE_NORMAL (0) : in case of successs
> + *   Check vfio_ap_setirq() and vfio_ap_clrirq() for other possible RC.
> + * We take the matrix_dev lock to ensure serialization on queues and
> + * mediated device access.
> + *
> + * Return 0 if we could handle the request inside KVM.
> + * otherwise, returns -EOPNOTSUPP to let QEMU handle the fault.
> + */
> +static int handle_pqap(struct kvm_vcpu *vcpu)
> +{
> +	uint64_t status;
> +	uint16_t apqn;
> +	struct vfio_ap_queue *q;
> +	struct ap_queue_status qstatus = {
> +			       .response_code = AP_RESPONSE_Q_NOT_AVAIL, };
> +	struct ap_matrix_mdev *matrix_mdev;
> +
> +	/* If we do not use the AIV facility just go to userland */
> +	if (!(vcpu->arch.sie_block->eca & ECA_AIV))
> +		return -EOPNOTSUPP;
> +
> +	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
> +	mutex_lock(&matrix_dev->lock);
> +
> +	if (!vcpu->kvm->arch.crypto.pqap_hook)
> +		goto out_unlock;
> +	matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
> +				   struct ap_matrix_mdev, pqap_hook);
> +
> +	q = vfio_ap_get_queue(matrix_mdev, apqn);
> +	if (!q)
> +		goto out_unlock;
> +
> +	status = vcpu->run->s.regs.gprs[1];
> +
> +	/* If IR bit(16) is set we enable the interrupt */
> +	if ((status >> (63 - 16)) & 0x01)
> +		qstatus = vfio_ap_irq_enable(q, status & 0x07,
> +					     vcpu->run->s.regs.gprs[2]);
> +	else
> +		qstatus = vfio_ap_irq_disable(q);
> +
> +out_unlock:
> +	memcpy(&vcpu->run->s.regs.gprs[1], &qstatus, sizeof(qstatus));
> +	vcpu->run->s.regs.gprs[1] >>= 32;
> +	mutex_unlock(&matrix_dev->lock);
> +	return 0;
> +}
> +
>   static void vfio_ap_matrix_init(struct ap_config_info *info,
>   				struct ap_matrix *matrix)
>   {
> @@ -45,8 +334,11 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
>   		return -ENOMEM;
>   	}
>   
> +	matrix_mdev->mdev = mdev;
>   	vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
>   	mdev_set_drvdata(mdev, matrix_mdev);
> +	matrix_mdev->pqap_hook.hook = handle_pqap;
> +	matrix_mdev->pqap_hook.owner = THIS_MODULE;
>   	mutex_lock(&matrix_dev->lock);
>   	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
>   	mutex_unlock(&matrix_dev->lock);
> @@ -62,6 +354,7 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
>   		return -EBUSY;
>   
>   	mutex_lock(&matrix_dev->lock);
> +	vfio_ap_mdev_reset_queues(mdev);
>   	list_del(&matrix_mdev->node);
>   	mutex_unlock(&matrix_dev->lock);
>   
> @@ -754,6 +1047,8 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
>   	}
>   
>   	matrix_mdev->kvm = kvm;
> +	kvm_get_kvm(kvm);
> +	kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
>   	mutex_unlock(&matrix_dev->lock);
>   
>   	return 0;
> @@ -819,15 +1114,36 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>   	return NOTIFY_OK;
>   }
>   
> -static int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
> -				    unsigned int retry)
> +static void vfio_ap_irq_disable_apqn(int apqn)
> +{
> +	struct device *dev;
> +	struct vfio_ap_queue *q;
> +
> +	dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
> +				 &apqn, match_apqn);
> +	if (dev) {
> +		q = dev_get_drvdata(dev);
> +		vfio_ap_irq_disable(q);
> +		put_device(dev);
> +	}
> +}
> +
> +int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
> +			     unsigned int retry)
>   {
>   	struct ap_queue_status status;
> +	int retry2 = 2;
> +	int apqn = AP_MKQID(apid, apqi);
>   
>   	do {
> -		status = ap_zapq(AP_MKQID(apid, apqi));
> +		status = ap_zapq(apqn);
>   		switch (status.response_code) {
>   		case AP_RESPONSE_NORMAL:
> +			while (!status.queue_empty && retry2--) {
> +				msleep(20);
> +				status = ap_tapq(apqn, NULL);
> +			}
> +			WARN_ON_ONCE(retry <= 0);
>   			return 0;
>   		case AP_RESPONSE_RESET_IN_PROGRESS:
>   		case AP_RESPONSE_BUSY:
> @@ -861,6 +1177,7 @@ static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev)
>   			 */
>   			if (ret)
>   				rc = ret;
> +			vfio_ap_irq_disable_apqn(AP_MKQID(apid, apqi));
>   		}
>   	}
>   
> @@ -904,15 +1221,20 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
>   {
>   	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>   
> -	if (matrix_mdev->kvm)
> +	mutex_lock(&matrix_dev->lock);
> +	if (matrix_mdev->kvm) {
>   		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> +		matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> +		vfio_ap_mdev_reset_queues(mdev);
> +		kvm_put_kvm(matrix_mdev->kvm);
> +		matrix_mdev->kvm = NULL;
> +	}
> +	mutex_unlock(&matrix_dev->lock);
>   
> -	vfio_ap_mdev_reset_queues(mdev);
>   	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
>   				 &matrix_mdev->iommu_notifier);
>   	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>   				 &matrix_mdev->group_notifier);
> -	matrix_mdev->kvm = NULL;
>   	module_put(THIS_MODULE);
>   }
>   
> @@ -941,6 +1263,7 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
>   {
>   	int ret;
>   
> +	mutex_lock(&matrix_dev->lock);
>   	switch (cmd) {
>   	case VFIO_DEVICE_GET_INFO:
>   		ret = vfio_ap_mdev_get_device_info(arg);
> @@ -952,6 +1275,7 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
>   		ret = -EOPNOTSUPP;
>   		break;
>   	}
> +	mutex_unlock(&matrix_dev->lock);
>   
>   	return ret;
>   }
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index 18dcc4d..f46dde5 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -4,6 +4,7 @@
>    *
>    * Author(s): Tony Krowiak <akrowiak@linux.ibm.com>
>    *	      Halil Pasic <pasic@linux.ibm.com>
> + *	      Pierre Morel <pmorel@linux.ibm.com>
>    *
>    * Copyright IBM Corp. 2018
>    */
> @@ -89,5 +90,15 @@ struct ap_matrix_mdev {
>   
>   extern int vfio_ap_mdev_register(void);
>   extern void vfio_ap_mdev_unregister(void);
> +int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
> +			     unsigned int retry);

As per my comments in the cover letter, get rid of this.

>   
> +struct vfio_ap_queue {
> +	struct ap_matrix_mdev *matrix_mdev;
> +	unsigned long saved_pfn;
> +	int	apqn;
> +#define VFIO_AP_ISC_INVALID 0xff
> +	unsigned char saved_isc;
> +};
> +struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q);
>   #endif /* _VFIO_AP_PRIVATE_H_ */

As per my comments in the cover letter, get rid of this and make it a
static function in vfio_ap_ops.c.

You'll have to add the following function defs instead:

int vfio_ap_mdev_probe_queue(struct ap_queue *queue);
void vfio_ap_mdev_remove_queue(struct ap_queue *queue);

> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control
  2019-05-23 15:36 ` [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control Tony Krowiak
@ 2019-06-04 14:56   ` Halil Pasic
  0 siblings, 0 replies; 22+ messages in thread
From: Halil Pasic @ 2019-06-04 14:56 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: Pierre Morel, borntraeger, alex.williamson, cohuck, linux-kernel,
	linux-s390, kvm, frankja, david, schwidefsky, heiko.carstens,
	freude, mimu

On Thu, 23 May 2019 11:36:12 -0400
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> On 5/21/19 11:34 AM, Pierre Morel wrote:
> > This patch series implements PQAP/AQIC interception in KVM.
> > 

[..]

> > 
> > Tony, I did not rebase this series above the dynamic configuration
> > patches because:
> > - This series do the work it needs to do without having to take
> >    care on the dynamic configuration.
> > - It is guarantied that interrupt will be shut off after removing
> >    the APQueue device
> > - The dynamic configuration series is not converging.  
> 
> Would you consider the following?
> 
> * Take dynconfig patch "s390: vfio-ap: wait for queue empty on queue

[..]

> 
> If you do the things above, then I can base the dynconfig series on
> the IRQ series without much of a merge issue. What say you?
> 
> Note: I've included review comments for patch 3/4 to match the
>        suggestions above.
> 
> > 
> > However Tony, the choice is your's, I won't be able to help
> > in a near future.

Since Pierre won't available for a while, and the patches look good
enough to me, I would like to pick these if nobody objects.

Any objection?

@Tony: Could you please have another look at patch 3? That is the only
one you haven't r-b yet...

Regards,
Halil


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel
  2019-05-21 15:34 ` [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel Pierre Morel
  2019-05-23 15:43   ` Tony Krowiak
@ 2019-06-04 19:38   ` Tony Krowiak
  2019-06-07 14:29     ` Halil Pasic
  2019-06-12 13:56   ` Harald Freudenberger
  2 siblings, 1 reply; 22+ messages in thread
From: Tony Krowiak @ 2019-06-04 19:38 UTC (permalink / raw)
  To: Pierre Morel, borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	pasic, david, schwidefsky, heiko.carstens, freude, mimu

On 5/21/19 11:34 AM, Pierre Morel wrote:
> We register a AP PQAP instruction hook during the open
> of the mediated device. And unregister it on release.
> 
> During the probe of the AP device, we allocate a vfio_ap_queue
> structure to keep track of the information we need for the
> PQAP/AQIC instruction interception.
> 
> In the AP PQAP instruction hook, if we receive a demand to
> enable IRQs,
> - we retrieve the vfio_ap_queue based on the APQN we receive
>    in REG1,
> - we retrieve the page of the guest address, (NIB), from
>    register REG2
> - we retrieve the mediated device to use the VFIO pinning
>    infrastructure to pin the page of the guest address,
> - we retrieve the pointer to KVM to register the guest ISC
>    and retrieve the host ISC
> - finaly we activate GISA
> 
> If we receive a demand to disable IRQs,
> - we deactivate GISA
> - unregister from the GIB
> - unpin the NIB
> 
> When removing the AP device from the driver the device is
> reseted and this process unregisters the GISA from the GIB,
> and unpins the NIB address then we free the vfio_ap_queue
> structure.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>

I made a few comments, but it looks sane to me:
Acked-by: Tony Krowiak <akrowiak@linux.ibm.com>

> ---
>   drivers/s390/crypto/vfio_ap_drv.c     |  34 +++-
>   drivers/s390/crypto/vfio_ap_ops.c     | 336 +++++++++++++++++++++++++++++++++-
>   drivers/s390/crypto/vfio_ap_private.h |  11 ++
>   3 files changed, 374 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
> index e9824c3..003662a 100644
> --- a/drivers/s390/crypto/vfio_ap_drv.c
> +++ b/drivers/s390/crypto/vfio_ap_drv.c
> @@ -5,6 +5,7 @@
>    * Copyright IBM Corp. 2018
>    *
>    * Author(s): Tony Krowiak <akrowiak@linux.ibm.com>
> + *	      Pierre Morel <pmorel@linux.ibm.com>
>    */
>   
>   #include <linux/module.h>
> @@ -40,14 +41,45 @@ static struct ap_device_id ap_queue_ids[] = {
>   
>   MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
>   
> +/**
> + * vfio_ap_queue_dev_probe:
> + *
> + * Allocate a vfio_ap_queue structure and associate it
> + * with the device as driver_data.
> + */
>   static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
>   {
> +	struct vfio_ap_queue *q;
> +
> +	q = kzalloc(sizeof(*q), GFP_KERNEL);
> +	if (!q)
> +		return -ENOMEM;
> +	dev_set_drvdata(&apdev->device, q);
> +	q->apqn = to_ap_queue(&apdev->device)->qid;
> +	q->saved_isc = VFIO_AP_ISC_INVALID;
>   	return 0;
>   }
>   
> +/**
> + * vfio_ap_queue_dev_remove:
> + *
> + * Takes the matrix lock to avoid actions on this device while removing
> + * Free the associated vfio_ap_queue structure
> + */
>   static void vfio_ap_queue_dev_remove(struct ap_device *apdev)
>   {
> -	/* Nothing to do yet */
> +	struct vfio_ap_queue *q;
> +	int apid, apqi;
> +
> +	mutex_lock(&matrix_dev->lock);
> +	q = dev_get_drvdata(&apdev->device);
> +	dev_set_drvdata(&apdev->device, NULL);
> +	apid = AP_QID_CARD(q->apqn);
> +	apqi = AP_QID_QUEUE(q->apqn);
> +	vfio_ap_mdev_reset_queue(apid, apqi, 1);
> +	vfio_ap_irq_disable(q);
> +	kfree(q);
> +	mutex_unlock(&matrix_dev->lock);
>   }
>   
>   static void vfio_ap_matrix_dev_release(struct device *dev)
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index e8e87bf..015174f 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -24,6 +24,295 @@
>   #define VFIO_AP_MDEV_TYPE_HWVIRT "passthrough"
>   #define VFIO_AP_MDEV_NAME_HWVIRT "VFIO AP Passthrough Device"
>   
> +static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev);
> +
> +static int match_apqn(struct device *dev, void *data)
> +{
> +	struct vfio_ap_queue *q = dev_get_drvdata(dev);
> +
> +	return (q->apqn == *(int *)(data)) ? 1 : 0;
> +}
> +
> +/**
> + * vfio_ap_get_queue: Retrieve a queue with a specific APQN from a list
> + * @matrix_mdev: the associated mediated matrix
> + * @apqn: The queue APQN
> + *
> + * Retrieve a queue with a specific APQN from the list of the
> + * devices of the vfio_ap_drv.
> + * Verify that the APID and the APQI are set in the matrix.
> + *
> + * Returns the pointer to the associated vfio_ap_queue
> + */
> +struct vfio_ap_queue *vfio_ap_get_queue(struct ap_matrix_mdev *matrix_mdev,
> +					int apqn)
> +{
> +	struct vfio_ap_queue *q;
> +	struct device *dev;
> +
> +	if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm))
> +		return NULL;
> +	if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm))
> +		return NULL;
> +
> +	dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
> +				 &apqn, match_apqn);
> +	if (!dev)
> +		return NULL;
> +	q = dev_get_drvdata(dev);
> +	q->matrix_mdev = matrix_mdev;
> +	put_device(dev);
> +
> +	return q;
> +}
> +
> +/**
> + * vfio_ap_wait_for_irqclear
> + * @apqn: The AP Queue number
> + *
> + * Checks the IRQ bit for the status of this APQN using ap_tapq.
> + * Returns if the ap_tapq function succedded and the bit is clear.

s/succedded/succeeded/

> + * Returns if ap_tapq function failed with invalid, deconfigured or
> + * checkstopped AP.
> + * Otherwise retries up to 5 times after waiting 20ms.
> + *
> + */
> +static void vfio_ap_wait_for_irqclear(int apqn)
> +{
> +	struct ap_queue_status status;
> +	int retry = 5;
> +
> +	do {
> +		status = ap_tapq(apqn, NULL);
> +		switch (status.response_code) {
> +		case AP_RESPONSE_NORMAL:
> +		case AP_RESPONSE_RESET_IN_PROGRESS:
> +			if (!status.irq_enabled)
> +				return;
> +			/* Fall through */
> +		case AP_RESPONSE_BUSY:
> +			msleep(20);
> +			break;
> +		case AP_RESPONSE_Q_NOT_AVAIL:
> +		case AP_RESPONSE_DECONFIGURED:
> +		case AP_RESPONSE_CHECKSTOPPED:
> +		default:
> +			WARN_ONCE(1, "%s: tapq rc %02x: %04x\n", __func__,
> +				  status.response_code, apqn);
> +			return;

Why not just break out of the loop and just use the WARN_ONCE
outside of the loop?

> +		}
> +	} while (--retry);
> +
> +	WARN_ONCE(1, "%s: tapq rc %02x: %04x could not clear IR bit\n",
> +		  __func__, status.response_code, apqn);
> +}
> +
> +/**
> + * vfio_ap_free_aqic_resources
> + * @q: The vfio_ap_queue
> + *
> + * Unregisters the ISC in the GIB when the saved ISC not invalid.
> + * Unpin the guest's page holding the NIB when it exist.
> + * Reset the saved_pfn and saved_isc to invalid values.
> + * Clear the pointer to the matrix mediated device.
> + *
> + */
> +static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q)
> +{
> +	if (q->saved_isc != VFIO_AP_ISC_INVALID && q->matrix_mdev)
> +		kvm_s390_gisc_unregister(q->matrix_mdev->kvm, q->saved_isc);
> +	if (q->saved_pfn && q->matrix_mdev)
> +		vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
> +				 &q->saved_pfn, 1);
> +	q->saved_pfn = 0;
> +	q->saved_isc = VFIO_AP_ISC_INVALID;
> +	q->matrix_mdev = NULL;
> +}
> +
> +/**
> + * vfio_ap_irq_disable
> + * @q: The vfio_ap_queue
> + *
> + * Uses ap_aqic to disable the interruption and in case of success, reset
> + * in progress or IRQ disable command already proceeded: calls
> + * vfio_ap_wait_for_irqclear() to check for the IRQ bit to be clear
> + * and calls vfio_ap_free_aqic_resources() to free the resources associated
> + * with the AP interrupt handling.
> + *
> + * In the case the AP is busy, or a reset is in progress,
> + * retries after 20ms, up to 5 times.
> + *
> + * Returns if ap_aqic function failed with invalid, deconfigured or
> + * checkstopped AP.
> + */
> +struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q)
> +{
> +	struct ap_qirq_ctrl aqic_gisa = {};
> +	struct ap_queue_status status;
> +	int retries = 5;
> +
> +	do {
> +		status = ap_aqic(q->apqn, aqic_gisa, NULL);
> +		switch (status.response_code) {
> +		case AP_RESPONSE_OTHERWISE_CHANGED:
> +		case AP_RESPONSE_NORMAL:
> +			vfio_ap_wait_for_irqclear(q->apqn);
> +			goto end_free;
> +		case AP_RESPONSE_RESET_IN_PROGRESS:
> +		case AP_RESPONSE_BUSY:
> +			msleep(20);
> +			break;
> +		case AP_RESPONSE_Q_NOT_AVAIL:
> +		case AP_RESPONSE_DECONFIGURED:
> +		case AP_RESPONSE_CHECKSTOPPED:
> +		case AP_RESPONSE_INVALID_ADDRESS:
> +		default:
> +			/* All cases in default means AP not operational */
> +			WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__,
> +				  status.response_code);
> +			goto end_free;

Why not just break out of the loop instead of repeating the WARN_ONCE
message?

> +		}
> +	} while (retries--);
> +
> +	WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__,
> +		  status.response_code);
> +end_free:
> +	vfio_ap_free_aqic_resources(q);
> +	return status;
> +}
> +
> +/**
> + * vfio_ap_setirq: Enable Interruption for a APQN
> + *
> + * @dev: the device associated with the ap_queue
> + * @q:   the vfio_ap_queue holding AQIC parameters
> + *
> + * Pin the NIB saved in *q
> + * Register the guest ISC to GIB interface and retrieve the
> + * host ISC to issue the host side PQAP/AQIC
> + *
> + * Response.status may be set to AP_RESPONSE_INVALID_ADDRESS in case the
> + * vfio_pin_pages failed.
> + *
> + * Otherwise return the ap_queue_status returned by the ap_aqic(),
> + * all retry handling will be done by the guest.
> + */
> +static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q,
> +						 int isc,
> +						 unsigned long nib)
> +{
> +	struct ap_qirq_ctrl aqic_gisa = {};
> +	struct ap_queue_status status = {};
> +	struct kvm_s390_gisa *gisa;
> +	struct kvm *kvm;
> +	unsigned long h_nib, g_pfn, h_pfn;
> +	int ret;
> +
> +	g_pfn = nib >> PAGE_SHIFT;
> +	ret = vfio_pin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1,
> +			     IOMMU_READ | IOMMU_WRITE, &h_pfn);
> +	switch (ret) {
> +	case 1:
> +		break;
> +	default:
> +		status.response_code = AP_RESPONSE_INVALID_ADDRESS;
> +		return status;
> +	}
> +
> +	kvm = q->matrix_mdev->kvm;
> +	gisa = kvm->arch.gisa_int.origin;
> +
> +	h_nib = (h_pfn << PAGE_SHIFT) | (nib & ~PAGE_MASK);
> +	aqic_gisa.gisc = isc;
> +	aqic_gisa.isc = kvm_s390_gisc_register(kvm, isc);
> +	aqic_gisa.ir = 1;
> +	aqic_gisa.gisa = (uint64_t)gisa >> 4;
> +
> +	status = ap_aqic(q->apqn, aqic_gisa, (void *)h_nib);
> +	switch (status.response_code) {
> +	case AP_RESPONSE_NORMAL:
> +		/* See if we did clear older IRQ configuration */
> +		vfio_ap_free_aqic_resources(q);
> +		q->saved_pfn = g_pfn;
> +		q->saved_isc = isc;
> +		break;
> +	case AP_RESPONSE_OTHERWISE_CHANGED:
> +		/* We could not modify IRQ setings: clear new configuration */
> +		vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1);
> +		kvm_s390_gisc_unregister(kvm, isc);
> +		break;
> +	default:
> +		pr_warn("%s: apqn %04x: response: %02x\n", __func__, q->apqn,
> +			status.response_code);
> +		vfio_ap_irq_disable(q);
> +		break;
> +	}
> +
> +	return status;
> +}
> +
> +/**
> + * handle_pqap: PQAP instruction callback
> + *
> + * @vcpu: The vcpu on which we received the PQAP instruction
> + *
> + * Get the general register contents to initialize internal variables.
> + * REG[0]: APQN
> + * REG[1]: IR and ISC
> + * REG[2]: NIB
> + *
> + * Response.status may be set to following Response Code:
> + * - AP_RESPONSE_Q_NOT_AVAIL: if the queue is not available
> + * - AP_RESPONSE_DECONFIGURED: if the queue is not configured
> + * - AP_RESPONSE_NORMAL (0) : in case of successs
> + *   Check vfio_ap_setirq() and vfio_ap_clrirq() for other possible RC.
> + * We take the matrix_dev lock to ensure serialization on queues and
> + * mediated device access.
> + *
> + * Return 0 if we could handle the request inside KVM.
> + * otherwise, returns -EOPNOTSUPP to let QEMU handle the fault.
> + */
> +static int handle_pqap(struct kvm_vcpu *vcpu)
> +{
> +	uint64_t status;
> +	uint16_t apqn;
> +	struct vfio_ap_queue *q;
> +	struct ap_queue_status qstatus = {
> +			       .response_code = AP_RESPONSE_Q_NOT_AVAIL, };
> +	struct ap_matrix_mdev *matrix_mdev;
> +
> +	/* If we do not use the AIV facility just go to userland */
> +	if (!(vcpu->arch.sie_block->eca & ECA_AIV))
> +		return -EOPNOTSUPP;
> +
> +	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
> +	mutex_lock(&matrix_dev->lock);
> +
> +	if (!vcpu->kvm->arch.crypto.pqap_hook)

Wasn't this already checked in patch 2 prior to calling this
function? In fact, doesn't the hook point to this function?

> +		goto out_unlock;
> +	matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
> +				   struct ap_matrix_mdev, pqap_hook);
> +
> +	q = vfio_ap_get_queue(matrix_mdev, apqn);
> +	if (!q)
> +		goto out_unlock;
> +
> +	status = vcpu->run->s.regs.gprs[1];
> +
> +	/* If IR bit(16) is set we enable the interrupt */
> +	if ((status >> (63 - 16)) & 0x01)
> +		qstatus = vfio_ap_irq_enable(q, status & 0x07,
> +					     vcpu->run->s.regs.gprs[2]);
> +	else
> +		qstatus = vfio_ap_irq_disable(q);
> +
> +out_unlock:
> +	memcpy(&vcpu->run->s.regs.gprs[1], &qstatus, sizeof(qstatus));
> +	vcpu->run->s.regs.gprs[1] >>= 32;
> +	mutex_unlock(&matrix_dev->lock);
> +	return 0;
> +}
> +
>   static void vfio_ap_matrix_init(struct ap_config_info *info,
>   				struct ap_matrix *matrix)
>   {
> @@ -45,8 +334,11 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
>   		return -ENOMEM;
>   	}
>   
> +	matrix_mdev->mdev = mdev;
>   	vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
>   	mdev_set_drvdata(mdev, matrix_mdev);
> +	matrix_mdev->pqap_hook.hook = handle_pqap;
> +	matrix_mdev->pqap_hook.owner = THIS_MODULE;
>   	mutex_lock(&matrix_dev->lock);
>   	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
>   	mutex_unlock(&matrix_dev->lock);
> @@ -62,6 +354,7 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
>   		return -EBUSY;
>   
>   	mutex_lock(&matrix_dev->lock);
> +	vfio_ap_mdev_reset_queues(mdev);
>   	list_del(&matrix_mdev->node);
>   	mutex_unlock(&matrix_dev->lock);
>   
> @@ -754,6 +1047,8 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
>   	}
>   
>   	matrix_mdev->kvm = kvm;
> +	kvm_get_kvm(kvm);
> +	kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
>   	mutex_unlock(&matrix_dev->lock);
>   
>   	return 0;
> @@ -819,15 +1114,36 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>   	return NOTIFY_OK;
>   }
>   
> -static int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
> -				    unsigned int retry)
> +static void vfio_ap_irq_disable_apqn(int apqn)
> +{
> +	struct device *dev;
> +	struct vfio_ap_queue *q;
> +
> +	dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
> +				 &apqn, match_apqn);
> +	if (dev) {
> +		q = dev_get_drvdata(dev);
> +		vfio_ap_irq_disable(q);
> +		put_device(dev);
> +	}
> +}
> +
> +int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
> +			     unsigned int retry)
>   {
>   	struct ap_queue_status status;
> +	int retry2 = 2;
> +	int apqn = AP_MKQID(apid, apqi);
>   
>   	do {
> -		status = ap_zapq(AP_MKQID(apid, apqi));
> +		status = ap_zapq(apqn);
>   		switch (status.response_code) {
>   		case AP_RESPONSE_NORMAL:
> +			while (!status.queue_empty && retry2--) {
> +				msleep(20);
> +				status = ap_tapq(apqn, NULL);
> +			}
> +			WARN_ON_ONCE(retry <= 0);
>   			return 0;
>   		case AP_RESPONSE_RESET_IN_PROGRESS:
>   		case AP_RESPONSE_BUSY:
> @@ -861,6 +1177,7 @@ static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev)
>   			 */
>   			if (ret)
>   				rc = ret;
> +			vfio_ap_irq_disable_apqn(AP_MKQID(apid, apqi));
>   		}
>   	}
>   
> @@ -904,15 +1221,20 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
>   {
>   	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>   
> -	if (matrix_mdev->kvm)
> +	mutex_lock(&matrix_dev->lock);
> +	if (matrix_mdev->kvm) {
>   		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> +		matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> +		vfio_ap_mdev_reset_queues(mdev);
> +		kvm_put_kvm(matrix_mdev->kvm);
> +		matrix_mdev->kvm = NULL;
> +	}
> +	mutex_unlock(&matrix_dev->lock);
>   
> -	vfio_ap_mdev_reset_queues(mdev);
>   	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
>   				 &matrix_mdev->iommu_notifier);
>   	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>   				 &matrix_mdev->group_notifier);
> -	matrix_mdev->kvm = NULL;
>   	module_put(THIS_MODULE);
>   }
>   
> @@ -941,6 +1263,7 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
>   {
>   	int ret;
>   
> +	mutex_lock(&matrix_dev->lock);
>   	switch (cmd) {
>   	case VFIO_DEVICE_GET_INFO:
>   		ret = vfio_ap_mdev_get_device_info(arg);
> @@ -952,6 +1275,7 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
>   		ret = -EOPNOTSUPP;
>   		break;
>   	}
> +	mutex_unlock(&matrix_dev->lock);
>   
>   	return ret;
>   }
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index 18dcc4d..f46dde5 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -4,6 +4,7 @@
>    *
>    * Author(s): Tony Krowiak <akrowiak@linux.ibm.com>
>    *	      Halil Pasic <pasic@linux.ibm.com>
> + *	      Pierre Morel <pmorel@linux.ibm.com>
>    *
>    * Copyright IBM Corp. 2018
>    */
> @@ -89,5 +90,15 @@ struct ap_matrix_mdev {
>   
>   extern int vfio_ap_mdev_register(void);
>   extern void vfio_ap_mdev_unregister(void);
> +int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
> +			     unsigned int retry);
>   
> +struct vfio_ap_queue {
> +	struct ap_matrix_mdev *matrix_mdev;
> +	unsigned long saved_pfn;
> +	int	apqn;
> +#define VFIO_AP_ISC_INVALID 0xff
> +	unsigned char saved_isc;
> +};
> +struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q);
>   #endif /* _VFIO_AP_PRIVATE_H_ */
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel
  2019-06-04 19:38   ` Tony Krowiak
@ 2019-06-07 14:29     ` Halil Pasic
  2019-06-11 14:37       ` Tony Krowiak
  0 siblings, 1 reply; 22+ messages in thread
From: Halil Pasic @ 2019-06-07 14:29 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: Pierre Morel, borntraeger, alex.williamson, cohuck, linux-kernel,
	linux-s390, kvm, frankja, david, schwidefsky, heiko.carstens,
	freude, mimu

On Tue, 4 Jun 2019 15:38:51 -0400
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> On 5/21/19 11:34 AM, Pierre Morel wrote:
> > We register a AP PQAP instruction hook during the open
> > of the mediated device. And unregister it on release.

[..]

> > +/**
> > + * vfio_ap_wait_for_irqclear
> > + * @apqn: The AP Queue number
> > + *
> > + * Checks the IRQ bit for the status of this APQN using ap_tapq.
> > + * Returns if the ap_tapq function succedded and the bit is clear.
> 
> s/succedded/succeeded/
> 

I'm gonna fix this up when picking.

> > + * Returns if ap_tapq function failed with invalid, deconfigured or
> > + * checkstopped AP.
> > + * Otherwise retries up to 5 times after waiting 20ms.
> > + *
> > + */
> > +static void vfio_ap_wait_for_irqclear(int apqn)
> > +{
> > +	struct ap_queue_status status;
> > +	int retry = 5;
> > +
> > +	do {
> > +		status = ap_tapq(apqn, NULL);
> > +		switch (status.response_code) {
> > +		case AP_RESPONSE_NORMAL:
> > +		case AP_RESPONSE_RESET_IN_PROGRESS:
> > +			if (!status.irq_enabled)
> > +				return;
> > +			/* Fall through */
> > +		case AP_RESPONSE_BUSY:
> > +			msleep(20);
> > +			break;
> > +		case AP_RESPONSE_Q_NOT_AVAIL:
> > +		case AP_RESPONSE_DECONFIGURED:
> > +		case AP_RESPONSE_CHECKSTOPPED:
> > +		default:
> > +			WARN_ONCE(1, "%s: tapq rc %02x: %04x\n", __func__,
> > +				  status.response_code, apqn);
> > +			return;
> 
> Why not just break out of the loop and just use the WARN_ONCE
> outside of the loop?
> 

AFAIU the idea was to differentiate between got a strange response_code
and ran out of retires.

Actually I suspect that we are fine in case of AP_RESPONSE_Q_NOT_AVAIL,
 AP_RESPONSE_DECONFIGURED and AP_RESPONSE_CHECKSTOPPED in a sense that
what should be the post-condition of this function is guaranteed to be
reached. What do you think?

While I think that we can do better here, I see this as something that
should be done on top.

> > +		}
> > +	} while (--retry);
> > +
> > +	WARN_ONCE(1, "%s: tapq rc %02x: %04x could not clear IR bit\n",
> > +		  __func__, status.response_code, apqn);
> > +}
> > +
> > +/**
> > + * vfio_ap_free_aqic_resources
> > + * @q: The vfio_ap_queue
> > + *
> > + * Unregisters the ISC in the GIB when the saved ISC not invalid.
> > + * Unpin the guest's page holding the NIB when it exist.
> > + * Reset the saved_pfn and saved_isc to invalid values.
> > + * Clear the pointer to the matrix mediated device.
> > + *
> > + */

[..]

> > +struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q)
> > +{
> > +	struct ap_qirq_ctrl aqic_gisa = {};
> > +	struct ap_queue_status status;
> > +	int retries = 5;
> > +
> > +	do {
> > +		status = ap_aqic(q->apqn, aqic_gisa, NULL);
> > +		switch (status.response_code) {
> > +		case AP_RESPONSE_OTHERWISE_CHANGED:
> > +		case AP_RESPONSE_NORMAL:
> > +			vfio_ap_wait_for_irqclear(q->apqn);
> > +			goto end_free;
> > +		case AP_RESPONSE_RESET_IN_PROGRESS:
> > +		case AP_RESPONSE_BUSY:
> > +			msleep(20);
> > +			break;
> > +		case AP_RESPONSE_Q_NOT_AVAIL:
> > +		case AP_RESPONSE_DECONFIGURED:
> > +		case AP_RESPONSE_CHECKSTOPPED:
> > +		case AP_RESPONSE_INVALID_ADDRESS:
> > +		default:
> > +			/* All cases in default means AP not operational */
> > +			WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__,
> > +				  status.response_code);
> > +			goto end_free;
> 
> Why not just break out of the loop instead of repeating the WARN_ONCE
> message?
> 

I suppose the reason is same as above. I'm not entirely happy with this
code myself. E.g. why do we do retries here -- shouldn't we just fail the
aqic by the guest?

[..]

> > +static int handle_pqap(struct kvm_vcpu *vcpu)
> > +{
> > +	uint64_t status;
> > +	uint16_t apqn;
> > +	struct vfio_ap_queue *q;
> > +	struct ap_queue_status qstatus = {
> > +			       .response_code = AP_RESPONSE_Q_NOT_AVAIL, };
> > +	struct ap_matrix_mdev *matrix_mdev;
> > +
> > +	/* If we do not use the AIV facility just go to userland */
> > +	if (!(vcpu->arch.sie_block->eca & ECA_AIV))
> > +		return -EOPNOTSUPP;
> > +
> > +	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
> > +	mutex_lock(&matrix_dev->lock);
> > +
> > +	if (!vcpu->kvm->arch.crypto.pqap_hook)
> 
> Wasn't this already checked in patch 2 prior to calling this
> function? In fact, doesn't the hook point to this function?
> 

Let us benevolently call this defensive programming. We are actually
in that callback AFAICT, so it sure was set a moment ago, and I guess
the client code still holds the kvm.lock so it is guaranteed to stay
so unless somebody is playing foul.

We can address this with a patch on top.

Regards,
Halil


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel
  2019-06-07 14:29     ` Halil Pasic
@ 2019-06-11 14:37       ` Tony Krowiak
  2019-06-11 15:00         ` Halil Pasic
  0 siblings, 1 reply; 22+ messages in thread
From: Tony Krowiak @ 2019-06-11 14:37 UTC (permalink / raw)
  To: Halil Pasic
  Cc: Pierre Morel, borntraeger, alex.williamson, cohuck, linux-kernel,
	linux-s390, kvm, frankja, david, schwidefsky, heiko.carstens,
	freude, mimu

On 6/7/19 10:29 AM, Halil Pasic wrote:
> On Tue, 4 Jun 2019 15:38:51 -0400
> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
> 
>> On 5/21/19 11:34 AM, Pierre Morel wrote:
>>> We register a AP PQAP instruction hook during the open
>>> of the mediated device. And unregister it on release.
> 
> [..]
> 
>>> +/**
>>> + * vfio_ap_wait_for_irqclear
>>> + * @apqn: The AP Queue number
>>> + *
>>> + * Checks the IRQ bit for the status of this APQN using ap_tapq.
>>> + * Returns if the ap_tapq function succedded and the bit is clear.
>>
>> s/succedded/succeeded/
>>
> 
> I'm gonna fix this up when picking.
> 
>>> + * Returns if ap_tapq function failed with invalid, deconfigured or
>>> + * checkstopped AP.
>>> + * Otherwise retries up to 5 times after waiting 20ms.
>>> + *
>>> + */
>>> +static void vfio_ap_wait_for_irqclear(int apqn)
>>> +{
>>> +	struct ap_queue_status status;
>>> +	int retry = 5;
>>> +
>>> +	do {
>>> +		status = ap_tapq(apqn, NULL);
>>> +		switch (status.response_code) {
>>> +		case AP_RESPONSE_NORMAL:
>>> +		case AP_RESPONSE_RESET_IN_PROGRESS:
>>> +			if (!status.irq_enabled)
>>> +				return;
>>> +			/* Fall through */
>>> +		case AP_RESPONSE_BUSY:
>>> +			msleep(20);
>>> +			break;
>>> +		case AP_RESPONSE_Q_NOT_AVAIL:
>>> +		case AP_RESPONSE_DECONFIGURED:
>>> +		case AP_RESPONSE_CHECKSTOPPED:
>>> +		default:
>>> +			WARN_ONCE(1, "%s: tapq rc %02x: %04x\n", __func__,
>>> +				  status.response_code, apqn);
>>> +			return;
>>
>> Why not just break out of the loop and just use the WARN_ONCE
>> outside of the loop?
>>
> 
> AFAIU the idea was to differentiate between got a strange response_code
> and ran out of retires.

In both cases, the response code is placed into the message, so one
should be able to discern the reason in either case. This is not
critical, just an observation.

> 
> Actually I suspect that we are fine in case of AP_RESPONSE_Q_NOT_AVAIL,
>   AP_RESPONSE_DECONFIGURED and AP_RESPONSE_CHECKSTOPPED in a sense that
> what should be the post-condition of this function is guaranteed to be
> reached. What do you think?

That would seem to be the case given those response codes indicate the
queue is not accessible.

> 
> While I think that we can do better here, I see this as something that
> should be done on top.

Are you talking about a patch on top? What do you think needs to be
addressed?

> 
>>> +		}
>>> +	} while (--retry);
>>> +
>>> +	WARN_ONCE(1, "%s: tapq rc %02x: %04x could not clear IR bit\n",
>>> +		  __func__, status.response_code, apqn);
>>> +}
>>> +
>>> +/**
>>> + * vfio_ap_free_aqic_resources
>>> + * @q: The vfio_ap_queue
>>> + *
>>> + * Unregisters the ISC in the GIB when the saved ISC not invalid.
>>> + * Unpin the guest's page holding the NIB when it exist.
>>> + * Reset the saved_pfn and saved_isc to invalid values.
>>> + * Clear the pointer to the matrix mediated device.
>>> + *
>>> + */
> 
> [..]
> 
>>> +struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q)
>>> +{
>>> +	struct ap_qirq_ctrl aqic_gisa = {};
>>> +	struct ap_queue_status status;
>>> +	int retries = 5;
>>> +
>>> +	do {
>>> +		status = ap_aqic(q->apqn, aqic_gisa, NULL);
>>> +		switch (status.response_code) {
>>> +		case AP_RESPONSE_OTHERWISE_CHANGED:
>>> +		case AP_RESPONSE_NORMAL:
>>> +			vfio_ap_wait_for_irqclear(q->apqn);
>>> +			goto end_free;
>>> +		case AP_RESPONSE_RESET_IN_PROGRESS:
>>> +		case AP_RESPONSE_BUSY:
>>> +			msleep(20);
>>> +			break;
>>> +		case AP_RESPONSE_Q_NOT_AVAIL:
>>> +		case AP_RESPONSE_DECONFIGURED:
>>> +		case AP_RESPONSE_CHECKSTOPPED:
>>> +		case AP_RESPONSE_INVALID_ADDRESS:
>>> +		default:
>>> +			/* All cases in default means AP not operational */
>>> +			WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__,
>>> +				  status.response_code);
>>> +			goto end_free;
>>
>> Why not just break out of the loop instead of repeating the WARN_ONCE
>> message?
>>
> 
> I suppose the reason is same as above. I'm not entirely happy with this
> code myself. E.g. why do we do retries here -- shouldn't we just fail the
> aqic by the guest?

According to my reading of the code, it looks like the retries are for
response code AP_RESPONSE_BUSY. Why wouldn't we want to wait until the
queue was not busy anymore?

> 
> [..]
> 
>>> +static int handle_pqap(struct kvm_vcpu *vcpu)
>>> +{
>>> +	uint64_t status;
>>> +	uint16_t apqn;
>>> +	struct vfio_ap_queue *q;
>>> +	struct ap_queue_status qstatus = {
>>> +			       .response_code = AP_RESPONSE_Q_NOT_AVAIL, };
>>> +	struct ap_matrix_mdev *matrix_mdev;
>>> +
>>> +	/* If we do not use the AIV facility just go to userland */
>>> +	if (!(vcpu->arch.sie_block->eca & ECA_AIV))
>>> +		return -EOPNOTSUPP;
>>> +
>>> +	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
>>> +	mutex_lock(&matrix_dev->lock);
>>> +
>>> +	if (!vcpu->kvm->arch.crypto.pqap_hook)
>>
>> Wasn't this already checked in patch 2 prior to calling this
>> function? In fact, doesn't the hook point to this function?
>>
> 
> Let us benevolently call this defensive programming. We are actually
> in that callback AFAICT, so it sure was set a moment ago, and I guess
> the client code still holds the kvm.lock so it is guaranteed to stay
> so unless somebody is playing foul.

Defensive, but completely unnecessary; however, it doesn't negatively
affect the logic in the least.

> 
> We can address this with a patch on top.
> 
> Regards,
> Halil
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel
  2019-06-11 14:37       ` Tony Krowiak
@ 2019-06-11 15:00         ` Halil Pasic
  0 siblings, 0 replies; 22+ messages in thread
From: Halil Pasic @ 2019-06-11 15:00 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: Pierre Morel, borntraeger, alex.williamson, cohuck, linux-kernel,
	linux-s390, kvm, frankja, david, schwidefsky, heiko.carstens,
	freude, mimu

On Tue, 11 Jun 2019 10:37:55 -0400
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> On 6/7/19 10:29 AM, Halil Pasic wrote:
> > On Tue, 4 Jun 2019 15:38:51 -0400
> > Tony Krowiak <akrowiak@linux.ibm.com> wrote:

[..]

> >>> +static void vfio_ap_wait_for_irqclear(int apqn)
> >>> +{
> >>> +	struct ap_queue_status status;
> >>> +	int retry = 5;
> >>> +
> >>> +	do {
> >>> +		status = ap_tapq(apqn, NULL);
> >>> +		switch (status.response_code) {
> >>> +		case AP_RESPONSE_NORMAL:
> >>> +		case AP_RESPONSE_RESET_IN_PROGRESS:
> >>> +			if (!status.irq_enabled)
> >>> +				return;
> >>> +			/* Fall through */
> >>> +		case AP_RESPONSE_BUSY:
> >>> +			msleep(20);
> >>> +			break;
> >>> +		case AP_RESPONSE_Q_NOT_AVAIL:
> >>> +		case AP_RESPONSE_DECONFIGURED:
> >>> +		case AP_RESPONSE_CHECKSTOPPED:
> >>> +		default:
> >>> +			WARN_ONCE(1, "%s: tapq rc %02x: %04x\n", __func__,
> >>> +				  status.response_code, apqn);
> >>> +			return;
> >>
> >> Why not just break out of the loop and just use the WARN_ONCE
> >> outside of the loop?
> >>
> > 
> > AFAIU the idea was to differentiate between got a strange response_code
> > and ran out of retires.
> 
> In both cases, the response code is placed into the message, so one
> should be able to discern the reason in either case. This is not
> critical, just an observation.
> 

I understand, but the message below does say 'could not clear' while
the message above does not. One could infer that information, but I
could not do it without digging. So I think keeping these separate does
have a certain merit to it.

Let's keep it for now. We can change this later if we want.

> > 
> > Actually I suspect that we are fine in case of AP_RESPONSE_Q_NOT_AVAIL,
> >   AP_RESPONSE_DECONFIGURED and AP_RESPONSE_CHECKSTOPPED in a sense that
> > what should be the post-condition of this function is guaranteed to be
> > reached. What do you think?
> 
> That would seem to be the case given those response codes indicate the
> queue is not accessible.
> 
> > 
> > While I think that we can do better here, I see this as something that
> > should be done on top.
> 
> Are you talking about a patch on top? What do you think needs to be
> addressed?
> 

For starters, I'm not sure if the first warning is necessary or even
appropriate. See the paragraph starting with 'Actually I suspect that we
are fine in case ...'.

> > 
> >>> +		}
> >>> +	} while (--retry);
> >>> +
> >>> +	WARN_ONCE(1, "%s: tapq rc %02x: %04x could not clear IR bit\n",
> >>> +		  __func__, status.response_code, apqn);
> >>> +}
> >>> +
> >>> +/**
> >>> + * vfio_ap_free_aqic_resources
> >>> + * @q: The vfio_ap_queue
> >>> + *
> >>> + * Unregisters the ISC in the GIB when the saved ISC not invalid.
> >>> + * Unpin the guest's page holding the NIB when it exist.
> >>> + * Reset the saved_pfn and saved_isc to invalid values.
> >>> + * Clear the pointer to the matrix mediated device.
> >>> + *
> >>> + */
> > 
> > [..]
> > 
> >>> +struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q)
> >>> +{
> >>> +	struct ap_qirq_ctrl aqic_gisa = {};
> >>> +	struct ap_queue_status status;
> >>> +	int retries = 5;
> >>> +
> >>> +	do {
> >>> +		status = ap_aqic(q->apqn, aqic_gisa, NULL);
> >>> +		switch (status.response_code) {
> >>> +		case AP_RESPONSE_OTHERWISE_CHANGED:
> >>> +		case AP_RESPONSE_NORMAL:
> >>> +			vfio_ap_wait_for_irqclear(q->apqn);
> >>> +			goto end_free;
> >>> +		case AP_RESPONSE_RESET_IN_PROGRESS:
> >>> +		case AP_RESPONSE_BUSY:
> >>> +			msleep(20);
> >>> +			break;
> >>> +		case AP_RESPONSE_Q_NOT_AVAIL:
> >>> +		case AP_RESPONSE_DECONFIGURED:
> >>> +		case AP_RESPONSE_CHECKSTOPPED:
> >>> +		case AP_RESPONSE_INVALID_ADDRESS:
> >>> +		default:
> >>> +			/* All cases in default means AP not operational */
> >>> +			WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__,
> >>> +				  status.response_code);
> >>> +			goto end_free;
> >>
> >> Why not just break out of the loop instead of repeating the WARN_ONCE
> >> message?
> >>
> > 
> > I suppose the reason is same as above. I'm not entirely happy with this
> > code myself. E.g. why do we do retries here -- shouldn't we just fail the
> > aqic by the guest?
> 
> According to my reading of the code, it looks like the retries are for
> response code AP_RESPONSE_BUSY. Why wouldn't we want to wait until the
> queue was not busy anymore?
> 

Does HW/FW wait or does it present AP_RESPONSE_BUSY? (Rhetoric
question.)  It is for the guest to decide if and how does it wish to
wait or otherwise react to AP_RESPONSE_BUSY. Or am I missing something?

> > 
> > [..]
> > 
> >>> +static int handle_pqap(struct kvm_vcpu *vcpu)
> >>> +{
> >>> +	uint64_t status;
> >>> +	uint16_t apqn;
> >>> +	struct vfio_ap_queue *q;
> >>> +	struct ap_queue_status qstatus = {
> >>> +			       .response_code = AP_RESPONSE_Q_NOT_AVAIL, };
> >>> +	struct ap_matrix_mdev *matrix_mdev;
> >>> +
> >>> +	/* If we do not use the AIV facility just go to userland */
> >>> +	if (!(vcpu->arch.sie_block->eca & ECA_AIV))
> >>> +		return -EOPNOTSUPP;
> >>> +
> >>> +	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
> >>> +	mutex_lock(&matrix_dev->lock);
> >>> +
> >>> +	if (!vcpu->kvm->arch.crypto.pqap_hook)
> >>
> >> Wasn't this already checked in patch 2 prior to calling this
> >> function? In fact, doesn't the hook point to this function?
> >>
> > 
> > Let us benevolently call this defensive programming. We are actually
> > in that callback AFAICT, so it sure was set a moment ago, and I guess
> > the client code still holds the kvm.lock so it is guaranteed to stay
> > so unless somebody is playing foul.
> 
> Defensive, but completely unnecessary; however, it doesn't negatively
> affect the logic in the least.
>

I agree it is unnecessary. We can get rid of it later. I'm not too keen
of altering somebody's patch without a really strong reason.
 
> > 
> > We can address this with a patch on top.
> > 

[..]

Regards,
Halil


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 1/4] s390: ap: kvm: add PQAP interception for AQIC
  2019-05-21 15:34 ` [PATCH v9 1/4] s390: ap: kvm: add PQAP interception for AQIC Pierre Morel
@ 2019-06-12 13:55   ` Harald Freudenberger
  0 siblings, 0 replies; 22+ messages in thread
From: Harald Freudenberger @ 2019-06-12 13:55 UTC (permalink / raw)
  To: Pierre Morel, borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	akrowiak, pasic, david, schwidefsky, heiko.carstens, mimu

On 21.05.19 17:34, Pierre Morel wrote:
> We prepare the interception of the PQAP/AQIC instruction for
> the case the AQIC facility is enabled in the guest.
>
> First of all we do not want to change existing behavior when
> intercepting AP instructions without the SIE allowing the guest
> to use AP instructions.
>
> In this patch we only handle the AQIC interception allowed by
> facility 65 which will be enabled when the complete interception
> infrastructure will be present.
>
> We add a callback inside the KVM arch structure for s390 for
> a VFIO driver to handle a specific response to the PQAP
> instruction with the AQIC command and only this command.
>
> But we want to be able to return a correct answer to the guest
> even there is no VFIO AP driver in the kernel.
> Therefor, we inject the correct exceptions from inside KVM for the
> case the callback is not initialized, which happens when the vfio_ap
> driver is not loaded.
>
> We do consider the responsability of the driver to always initialize
> the PQAP callback if it defines queues by initializing the CRYCB for
> a guest.
> If the callback has been setup we call it.
> If not we setup an answer considering that no queue is available
> for the guest when no callback has been setup.
>
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  arch/s390/include/asm/kvm_host.h      |  7 +++
>  arch/s390/kvm/priv.c                  | 86 +++++++++++++++++++++++++++++++++++
>  drivers/s390/crypto/vfio_ap_private.h |  2 +
>  3 files changed, 95 insertions(+)
>
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index faa7ebf..a0fc2f1 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -18,6 +18,7 @@
>  #include <linux/kvm_host.h>
>  #include <linux/kvm.h>
>  #include <linux/seqlock.h>
> +#include <linux/module.h>
>  #include <asm/debug.h>
>  #include <asm/cpu.h>
>  #include <asm/fpu/api.h>
> @@ -723,8 +724,14 @@ struct kvm_s390_cpu_model {
>  	unsigned short ibc;
>  };
>  
> +struct kvm_s390_module_hook {
> +	int (*hook)(struct kvm_vcpu *vcpu);
> +	struct module *owner;
> +};
> +
>  struct kvm_s390_crypto {
>  	struct kvm_s390_crypto_cb *crycb;
> +	struct kvm_s390_module_hook *pqap_hook;
>  	__u32 crycbd;
>  	__u8 aes_kw;
>  	__u8 dea_kw;
> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
> index 8679bd7..ed52ffa 100644
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -27,6 +27,7 @@
>  #include <asm/io.h>
>  #include <asm/ptrace.h>
>  #include <asm/sclp.h>
> +#include <asm/ap.h>
>  #include "gaccess.h"
>  #include "kvm-s390.h"
>  #include "trace.h"
> @@ -592,6 +593,89 @@ static int handle_io_inst(struct kvm_vcpu *vcpu)
>  	}
>  }
>  
> +/*
> + * handle_pqap: Handling pqap interception
> + * @vcpu: the vcpu having issue the pqap instruction
> + *
> + * We now support PQAP/AQIC instructions and we need to correctly
> + * answer the guest even if no dedicated driver's hook is available.
> + *
> + * The intercepting code calls a dedicated callback for this instruction
> + * if a driver did register one in the CRYPTO satellite of the
> + * SIE block.
> + *
> + * If no callback is available, the queues are not available, return this
> + * response code to the caller and set CC to 3.
> + * Else return the response code returned by the callback.
> + */
> +static int handle_pqap(struct kvm_vcpu *vcpu)
> +{
> +	struct ap_queue_status status = {};
> +	unsigned long reg0;
> +	int ret;
> +	uint8_t fc;
> +
> +	/* Verify that the AP instruction are available */
> +	if (!ap_instructions_available())
> +		return -EOPNOTSUPP;
> +	/* Verify that the guest is allowed to use AP instructions */
> +	if (!(vcpu->arch.sie_block->eca & ECA_APIE))
> +		return -EOPNOTSUPP;
> +	/*
> +	 * The only possibly intercepted functions when AP instructions are
> +	 * available for the guest are AQIC and TAPQ with the t bit set
> +	 * since we do not set IC.3 (FIII) we currently will only intercept
> +	 * the AQIC function code.
> +	 */
> +	reg0 = vcpu->run->s.regs.gprs[0];
> +	fc = (reg0 >> 24) & 0xff;
> +	if (WARN_ON_ONCE(fc != 0x03))
> +		return -EOPNOTSUPP;
> +
> +	/* PQAP instruction is allowed for guest kernel only */
> +	if (vcpu->arch.sie_block->gpsw.mask & PSW_MASK_PSTATE)
> +		return kvm_s390_inject_program_int(vcpu, PGM_PRIVILEGED_OP);
> +
> +	/* Common PQAP instruction specification exceptions */
> +	/* bits 41-47 must all be zeros */
> +	if (reg0 & 0x007f0000UL)
> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> +	/* APFT not install and T bit set */
> +	if (!test_kvm_facility(vcpu->kvm, 15) && (reg0 & 0x00800000UL))
> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> +	/* APXA not installed and APID greater 64 or APQI greater 16 */
> +	if (!(vcpu->kvm->arch.crypto.crycbd & 0x02) && (reg0 & 0x0000c0f0UL))
> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> +
> +	/* AQIC function code specific exception */
> +	/* facility 65 not present for AQIC function code */
> +	if (!test_kvm_facility(vcpu->kvm, 65))
> +		return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
> +
> +	/*
> +	 * Verify that the hook callback is registered, lock the owner
> +	 * and call the hook.
> +	 */
> +	if (vcpu->kvm->arch.crypto.pqap_hook) {
> +		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
> +			return -EOPNOTSUPP;
> +		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
> +		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
> +		if (!ret && vcpu->run->s.regs.gprs[1] & 0x00ff0000)
> +			kvm_s390_set_psw_cc(vcpu, 3);
> +		return ret;
> +	}
> +	/*
> +	 * A vfio_driver must register a hook.
> +	 * No hook means no driver to enable the SIE CRYCB and no queues.
> +	 * We send this response to the guest.
> +	 */
> +	status.response_code = 0x01;
> +	memcpy(&vcpu->run->s.regs.gprs[1], &status, sizeof(status));
> +	kvm_s390_set_psw_cc(vcpu, 3);
> +	return 0;
> +}
> +
>  static int handle_stfl(struct kvm_vcpu *vcpu)
>  {
>  	int rc;
> @@ -878,6 +962,8 @@ int kvm_s390_handle_b2(struct kvm_vcpu *vcpu)
>  		return handle_sthyi(vcpu);
>  	case 0x7d:
>  		return handle_stsi(vcpu);
> +	case 0xaf:
> +		return handle_pqap(vcpu);
>  	case 0xb1:
>  		return handle_stfl(vcpu);
>  	case 0xb2:
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index 76b7f98..a910be1 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -16,6 +16,7 @@
>  #include <linux/mdev.h>
>  #include <linux/delay.h>
>  #include <linux/mutex.h>
> +#include <linux/kvm_host.h>
>  
>  #include "ap_bus.h"
>  
> @@ -81,6 +82,7 @@ struct ap_matrix_mdev {
>  	struct ap_matrix matrix;
>  	struct notifier_block group_notifier;
>  	struct kvm *kvm;
> +	struct kvm_s390_module_hook pqap_hook;
>  };
>  
>  extern int vfio_ap_mdev_register(void);
acked-by: Harald Freudenberger <freude@linux.ibm.com>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 2/4] vfio: ap: register IOMMU VFIO notifier
  2019-05-21 15:34 ` [PATCH v9 2/4] vfio: ap: register IOMMU VFIO notifier Pierre Morel
@ 2019-06-12 13:56   ` Harald Freudenberger
  0 siblings, 0 replies; 22+ messages in thread
From: Harald Freudenberger @ 2019-06-12 13:56 UTC (permalink / raw)
  To: Pierre Morel, borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	akrowiak, pasic, david, schwidefsky, heiko.carstens, mimu

On 21.05.19 17:34, Pierre Morel wrote:
> To be able to use the VFIO interface to facilitate the
> mediated device memory pinning/unpinning we need to register
> a notifier for IOMMU.
>
> While we will start to pin one guest page for the interrupt indicator
> byte, this is still ok with ballooning as this page will never be
> used by the guest virtio-balloon driver.
> So the pinned page will never be freed. And even a broken guest does
> so, that would not impact the host as the original page is still
> in control by vfio.
>
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> Reviewed-by: Cornelia Huck <cohuck@redhat.com>
> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c     | 43 ++++++++++++++++++++++++++++++++++-
>  drivers/s390/crypto/vfio_ap_private.h |  2 ++
>  2 files changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index 900b9cf..e8e87bf 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -759,6 +759,35 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
>  	return 0;
>  }
>  
> +/*
> + * vfio_ap_mdev_iommu_notifier: IOMMU notifier callback
> + *
> + * @nb: The notifier block
> + * @action: Action to be taken
> + * @data: data associated with the request
> + *
> + * For an UNMAP request, unpin the guest IOVA (the NIB guest address we
> + * pinned before). Other requests are ignored.
> + *
> + */
> +static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
> +				       unsigned long action, void *data)
> +{
> +	struct ap_matrix_mdev *matrix_mdev;
> +
> +	matrix_mdev = container_of(nb, struct ap_matrix_mdev, iommu_notifier);
> +
> +	if (action == VFIO_IOMMU_NOTIFY_DMA_UNMAP) {
> +		struct vfio_iommu_type1_dma_unmap *unmap = data;
> +		unsigned long g_pfn = unmap->iova >> PAGE_SHIFT;
> +
> +		vfio_unpin_pages(mdev_dev(matrix_mdev->mdev), &g_pfn, 1);
> +		return NOTIFY_OK;
> +	}
> +
> +	return NOTIFY_DONE;
> +}
> +
>  static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>  				       unsigned long action, void *data)
>  {
> @@ -858,7 +887,17 @@ static int vfio_ap_mdev_open(struct mdev_device *mdev)
>  		return ret;
>  	}
>  
> -	return 0;
> +	matrix_mdev->iommu_notifier.notifier_call = vfio_ap_mdev_iommu_notifier;
> +	events = VFIO_IOMMU_NOTIFY_DMA_UNMAP;
> +	ret = vfio_register_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
> +				     &events, &matrix_mdev->iommu_notifier);
> +	if (!ret)
> +		return ret;
> +
> +	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
> +				 &matrix_mdev->group_notifier);
> +	module_put(THIS_MODULE);
> +	return ret;
>  }
>  
>  static void vfio_ap_mdev_release(struct mdev_device *mdev)
> @@ -869,6 +908,8 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
>  		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>  
>  	vfio_ap_mdev_reset_queues(mdev);
> +	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
> +				 &matrix_mdev->iommu_notifier);
>  	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>  				 &matrix_mdev->group_notifier);
>  	matrix_mdev->kvm = NULL;
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index a910be1..18dcc4d 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -81,8 +81,10 @@ struct ap_matrix_mdev {
>  	struct list_head node;
>  	struct ap_matrix matrix;
>  	struct notifier_block group_notifier;
> +	struct notifier_block iommu_notifier;
>  	struct kvm *kvm;
>  	struct kvm_s390_module_hook pqap_hook;
> +	struct mdev_device *mdev;
>  };
>  
>  extern int vfio_ap_mdev_register(void);
acked-by: Harald Freudenberger <freude@linux.ibm.com>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel
  2019-05-21 15:34 ` [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel Pierre Morel
  2019-05-23 15:43   ` Tony Krowiak
  2019-06-04 19:38   ` Tony Krowiak
@ 2019-06-12 13:56   ` Harald Freudenberger
  2 siblings, 0 replies; 22+ messages in thread
From: Harald Freudenberger @ 2019-06-12 13:56 UTC (permalink / raw)
  To: Pierre Morel, borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	akrowiak, pasic, david, schwidefsky, heiko.carstens, mimu

On 21.05.19 17:34, Pierre Morel wrote:
> We register a AP PQAP instruction hook during the open
> of the mediated device. And unregister it on release.
>
> During the probe of the AP device, we allocate a vfio_ap_queue
> structure to keep track of the information we need for the
> PQAP/AQIC instruction interception.
>
> In the AP PQAP instruction hook, if we receive a demand to
> enable IRQs,
> - we retrieve the vfio_ap_queue based on the APQN we receive
>   in REG1,
> - we retrieve the page of the guest address, (NIB), from
>   register REG2
> - we retrieve the mediated device to use the VFIO pinning
>   infrastructure to pin the page of the guest address,
> - we retrieve the pointer to KVM to register the guest ISC
>   and retrieve the host ISC
> - finaly we activate GISA
>
> If we receive a demand to disable IRQs,
> - we deactivate GISA
> - unregister from the GIB
> - unpin the NIB
>
> When removing the AP device from the driver the device is
> reseted and this process unregisters the GISA from the GIB,
> and unpins the NIB address then we free the vfio_ap_queue
> structure.
>
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_drv.c     |  34 +++-
>  drivers/s390/crypto/vfio_ap_ops.c     | 336 +++++++++++++++++++++++++++++++++-
>  drivers/s390/crypto/vfio_ap_private.h |  11 ++
>  3 files changed, 374 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
> index e9824c3..003662a 100644
> --- a/drivers/s390/crypto/vfio_ap_drv.c
> +++ b/drivers/s390/crypto/vfio_ap_drv.c
> @@ -5,6 +5,7 @@
>   * Copyright IBM Corp. 2018
>   *
>   * Author(s): Tony Krowiak <akrowiak@linux.ibm.com>
> + *	      Pierre Morel <pmorel@linux.ibm.com>
>   */
>  
>  #include <linux/module.h>
> @@ -40,14 +41,45 @@ static struct ap_device_id ap_queue_ids[] = {
>  
>  MODULE_DEVICE_TABLE(vfio_ap, ap_queue_ids);
>  
> +/**
> + * vfio_ap_queue_dev_probe:
> + *
> + * Allocate a vfio_ap_queue structure and associate it
> + * with the device as driver_data.
> + */
>  static int vfio_ap_queue_dev_probe(struct ap_device *apdev)
>  {
> +	struct vfio_ap_queue *q;
> +
> +	q = kzalloc(sizeof(*q), GFP_KERNEL);
> +	if (!q)
> +		return -ENOMEM;
> +	dev_set_drvdata(&apdev->device, q);
> +	q->apqn = to_ap_queue(&apdev->device)->qid;
> +	q->saved_isc = VFIO_AP_ISC_INVALID;
>  	return 0;
>  }
>  
> +/**
> + * vfio_ap_queue_dev_remove:
> + *
> + * Takes the matrix lock to avoid actions on this device while removing
> + * Free the associated vfio_ap_queue structure
> + */
>  static void vfio_ap_queue_dev_remove(struct ap_device *apdev)
>  {
> -	/* Nothing to do yet */
> +	struct vfio_ap_queue *q;
> +	int apid, apqi;
> +
> +	mutex_lock(&matrix_dev->lock);
> +	q = dev_get_drvdata(&apdev->device);
> +	dev_set_drvdata(&apdev->device, NULL);
> +	apid = AP_QID_CARD(q->apqn);
> +	apqi = AP_QID_QUEUE(q->apqn);
> +	vfio_ap_mdev_reset_queue(apid, apqi, 1);
> +	vfio_ap_irq_disable(q);
> +	kfree(q);
> +	mutex_unlock(&matrix_dev->lock);
>  }
>  
>  static void vfio_ap_matrix_dev_release(struct device *dev)
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index e8e87bf..015174f 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -24,6 +24,295 @@
>  #define VFIO_AP_MDEV_TYPE_HWVIRT "passthrough"
>  #define VFIO_AP_MDEV_NAME_HWVIRT "VFIO AP Passthrough Device"
>  
> +static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev);
> +
> +static int match_apqn(struct device *dev, void *data)
> +{
> +	struct vfio_ap_queue *q = dev_get_drvdata(dev);
> +
> +	return (q->apqn == *(int *)(data)) ? 1 : 0;
> +}
> +
> +/**
> + * vfio_ap_get_queue: Retrieve a queue with a specific APQN from a list
> + * @matrix_mdev: the associated mediated matrix
> + * @apqn: The queue APQN
> + *
> + * Retrieve a queue with a specific APQN from the list of the
> + * devices of the vfio_ap_drv.
> + * Verify that the APID and the APQI are set in the matrix.
> + *
> + * Returns the pointer to the associated vfio_ap_queue
> + */
> +struct vfio_ap_queue *vfio_ap_get_queue(struct ap_matrix_mdev *matrix_mdev,
> +					int apqn)
> +{
> +	struct vfio_ap_queue *q;
> +	struct device *dev;
> +
> +	if (!test_bit_inv(AP_QID_CARD(apqn), matrix_mdev->matrix.apm))
> +		return NULL;
> +	if (!test_bit_inv(AP_QID_QUEUE(apqn), matrix_mdev->matrix.aqm))
> +		return NULL;
> +
> +	dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
> +				 &apqn, match_apqn);
> +	if (!dev)
> +		return NULL;
> +	q = dev_get_drvdata(dev);
> +	q->matrix_mdev = matrix_mdev;
> +	put_device(dev);
> +
> +	return q;
> +}
> +
> +/**
> + * vfio_ap_wait_for_irqclear
> + * @apqn: The AP Queue number
> + *
> + * Checks the IRQ bit for the status of this APQN using ap_tapq.
> + * Returns if the ap_tapq function succedded and the bit is clear.
> + * Returns if ap_tapq function failed with invalid, deconfigured or
> + * checkstopped AP.
> + * Otherwise retries up to 5 times after waiting 20ms.
> + *
> + */
> +static void vfio_ap_wait_for_irqclear(int apqn)
> +{
> +	struct ap_queue_status status;
> +	int retry = 5;
> +
> +	do {
> +		status = ap_tapq(apqn, NULL);
> +		switch (status.response_code) {
> +		case AP_RESPONSE_NORMAL:
> +		case AP_RESPONSE_RESET_IN_PROGRESS:
> +			if (!status.irq_enabled)
> +				return;
> +			/* Fall through */
> +		case AP_RESPONSE_BUSY:
> +			msleep(20);
> +			break;
> +		case AP_RESPONSE_Q_NOT_AVAIL:
> +		case AP_RESPONSE_DECONFIGURED:
> +		case AP_RESPONSE_CHECKSTOPPED:
> +		default:
> +			WARN_ONCE(1, "%s: tapq rc %02x: %04x\n", __func__,
> +				  status.response_code, apqn);
> +			return;
> +		}
> +	} while (--retry);
> +
> +	WARN_ONCE(1, "%s: tapq rc %02x: %04x could not clear IR bit\n",
> +		  __func__, status.response_code, apqn);
> +}
> +
> +/**
> + * vfio_ap_free_aqic_resources
> + * @q: The vfio_ap_queue
> + *
> + * Unregisters the ISC in the GIB when the saved ISC not invalid.
> + * Unpin the guest's page holding the NIB when it exist.
> + * Reset the saved_pfn and saved_isc to invalid values.
> + * Clear the pointer to the matrix mediated device.
> + *
> + */
> +static void vfio_ap_free_aqic_resources(struct vfio_ap_queue *q)
> +{
> +	if (q->saved_isc != VFIO_AP_ISC_INVALID && q->matrix_mdev)
> +		kvm_s390_gisc_unregister(q->matrix_mdev->kvm, q->saved_isc);
> +	if (q->saved_pfn && q->matrix_mdev)
> +		vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev),
> +				 &q->saved_pfn, 1);
> +	q->saved_pfn = 0;
> +	q->saved_isc = VFIO_AP_ISC_INVALID;
> +	q->matrix_mdev = NULL;
> +}
> +
> +/**
> + * vfio_ap_irq_disable
> + * @q: The vfio_ap_queue
> + *
> + * Uses ap_aqic to disable the interruption and in case of success, reset
> + * in progress or IRQ disable command already proceeded: calls
> + * vfio_ap_wait_for_irqclear() to check for the IRQ bit to be clear
> + * and calls vfio_ap_free_aqic_resources() to free the resources associated
> + * with the AP interrupt handling.
> + *
> + * In the case the AP is busy, or a reset is in progress,
> + * retries after 20ms, up to 5 times.
> + *
> + * Returns if ap_aqic function failed with invalid, deconfigured or
> + * checkstopped AP.
> + */
> +struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q)
> +{
> +	struct ap_qirq_ctrl aqic_gisa = {};
> +	struct ap_queue_status status;
> +	int retries = 5;
> +
> +	do {
> +		status = ap_aqic(q->apqn, aqic_gisa, NULL);
> +		switch (status.response_code) {
> +		case AP_RESPONSE_OTHERWISE_CHANGED:
> +		case AP_RESPONSE_NORMAL:
> +			vfio_ap_wait_for_irqclear(q->apqn);
> +			goto end_free;
> +		case AP_RESPONSE_RESET_IN_PROGRESS:
> +		case AP_RESPONSE_BUSY:
> +			msleep(20);
> +			break;
> +		case AP_RESPONSE_Q_NOT_AVAIL:
> +		case AP_RESPONSE_DECONFIGURED:
> +		case AP_RESPONSE_CHECKSTOPPED:
> +		case AP_RESPONSE_INVALID_ADDRESS:
> +		default:
> +			/* All cases in default means AP not operational */
> +			WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__,
> +				  status.response_code);
> +			goto end_free;
> +		}
> +	} while (retries--);
> +
> +	WARN_ONCE(1, "%s: ap_aqic status %d\n", __func__,
> +		  status.response_code);
> +end_free:
> +	vfio_ap_free_aqic_resources(q);
> +	return status;
> +}
> +
> +/**
> + * vfio_ap_setirq: Enable Interruption for a APQN
> + *
> + * @dev: the device associated with the ap_queue
> + * @q:   the vfio_ap_queue holding AQIC parameters
> + *
> + * Pin the NIB saved in *q
> + * Register the guest ISC to GIB interface and retrieve the
> + * host ISC to issue the host side PQAP/AQIC
> + *
> + * Response.status may be set to AP_RESPONSE_INVALID_ADDRESS in case the
> + * vfio_pin_pages failed.
> + *
> + * Otherwise return the ap_queue_status returned by the ap_aqic(),
> + * all retry handling will be done by the guest.
> + */
> +static struct ap_queue_status vfio_ap_irq_enable(struct vfio_ap_queue *q,
> +						 int isc,
> +						 unsigned long nib)
> +{
> +	struct ap_qirq_ctrl aqic_gisa = {};
> +	struct ap_queue_status status = {};
> +	struct kvm_s390_gisa *gisa;
> +	struct kvm *kvm;
> +	unsigned long h_nib, g_pfn, h_pfn;
> +	int ret;
> +
> +	g_pfn = nib >> PAGE_SHIFT;
> +	ret = vfio_pin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1,
> +			     IOMMU_READ | IOMMU_WRITE, &h_pfn);
> +	switch (ret) {
> +	case 1:
> +		break;
> +	default:
> +		status.response_code = AP_RESPONSE_INVALID_ADDRESS;
> +		return status;
> +	}
> +
> +	kvm = q->matrix_mdev->kvm;
> +	gisa = kvm->arch.gisa_int.origin;
> +
> +	h_nib = (h_pfn << PAGE_SHIFT) | (nib & ~PAGE_MASK);
> +	aqic_gisa.gisc = isc;
> +	aqic_gisa.isc = kvm_s390_gisc_register(kvm, isc);
> +	aqic_gisa.ir = 1;
> +	aqic_gisa.gisa = (uint64_t)gisa >> 4;
> +
> +	status = ap_aqic(q->apqn, aqic_gisa, (void *)h_nib);
> +	switch (status.response_code) {
> +	case AP_RESPONSE_NORMAL:
> +		/* See if we did clear older IRQ configuration */
> +		vfio_ap_free_aqic_resources(q);
> +		q->saved_pfn = g_pfn;
> +		q->saved_isc = isc;
> +		break;
> +	case AP_RESPONSE_OTHERWISE_CHANGED:
> +		/* We could not modify IRQ setings: clear new configuration */
> +		vfio_unpin_pages(mdev_dev(q->matrix_mdev->mdev), &g_pfn, 1);
> +		kvm_s390_gisc_unregister(kvm, isc);
> +		break;
> +	default:
> +		pr_warn("%s: apqn %04x: response: %02x\n", __func__, q->apqn,
> +			status.response_code);
> +		vfio_ap_irq_disable(q);
> +		break;
> +	}
> +
> +	return status;
> +}
> +
> +/**
> + * handle_pqap: PQAP instruction callback
> + *
> + * @vcpu: The vcpu on which we received the PQAP instruction
> + *
> + * Get the general register contents to initialize internal variables.
> + * REG[0]: APQN
> + * REG[1]: IR and ISC
> + * REG[2]: NIB
> + *
> + * Response.status may be set to following Response Code:
> + * - AP_RESPONSE_Q_NOT_AVAIL: if the queue is not available
> + * - AP_RESPONSE_DECONFIGURED: if the queue is not configured
> + * - AP_RESPONSE_NORMAL (0) : in case of successs
> + *   Check vfio_ap_setirq() and vfio_ap_clrirq() for other possible RC.
> + * We take the matrix_dev lock to ensure serialization on queues and
> + * mediated device access.
> + *
> + * Return 0 if we could handle the request inside KVM.
> + * otherwise, returns -EOPNOTSUPP to let QEMU handle the fault.
> + */
> +static int handle_pqap(struct kvm_vcpu *vcpu)
> +{
> +	uint64_t status;
> +	uint16_t apqn;
> +	struct vfio_ap_queue *q;
> +	struct ap_queue_status qstatus = {
> +			       .response_code = AP_RESPONSE_Q_NOT_AVAIL, };
> +	struct ap_matrix_mdev *matrix_mdev;
> +
> +	/* If we do not use the AIV facility just go to userland */
> +	if (!(vcpu->arch.sie_block->eca & ECA_AIV))
> +		return -EOPNOTSUPP;
> +
> +	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
> +	mutex_lock(&matrix_dev->lock);
> +
> +	if (!vcpu->kvm->arch.crypto.pqap_hook)
> +		goto out_unlock;
> +	matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
> +				   struct ap_matrix_mdev, pqap_hook);
> +
> +	q = vfio_ap_get_queue(matrix_mdev, apqn);
> +	if (!q)
> +		goto out_unlock;
> +
> +	status = vcpu->run->s.regs.gprs[1];
> +
> +	/* If IR bit(16) is set we enable the interrupt */
> +	if ((status >> (63 - 16)) & 0x01)
> +		qstatus = vfio_ap_irq_enable(q, status & 0x07,
> +					     vcpu->run->s.regs.gprs[2]);
> +	else
> +		qstatus = vfio_ap_irq_disable(q);
> +
> +out_unlock:
> +	memcpy(&vcpu->run->s.regs.gprs[1], &qstatus, sizeof(qstatus));
> +	vcpu->run->s.regs.gprs[1] >>= 32;
> +	mutex_unlock(&matrix_dev->lock);
> +	return 0;
> +}
> +
>  static void vfio_ap_matrix_init(struct ap_config_info *info,
>  				struct ap_matrix *matrix)
>  {
> @@ -45,8 +334,11 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev)
>  		return -ENOMEM;
>  	}
>  
> +	matrix_mdev->mdev = mdev;
>  	vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
>  	mdev_set_drvdata(mdev, matrix_mdev);
> +	matrix_mdev->pqap_hook.hook = handle_pqap;
> +	matrix_mdev->pqap_hook.owner = THIS_MODULE;
>  	mutex_lock(&matrix_dev->lock);
>  	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
>  	mutex_unlock(&matrix_dev->lock);
> @@ -62,6 +354,7 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
>  		return -EBUSY;
>  
>  	mutex_lock(&matrix_dev->lock);
> +	vfio_ap_mdev_reset_queues(mdev);
>  	list_del(&matrix_mdev->node);
>  	mutex_unlock(&matrix_dev->lock);
>  
> @@ -754,6 +1047,8 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
>  	}
>  
>  	matrix_mdev->kvm = kvm;
> +	kvm_get_kvm(kvm);
> +	kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
>  	mutex_unlock(&matrix_dev->lock);
>  
>  	return 0;
> @@ -819,15 +1114,36 @@ static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>  	return NOTIFY_OK;
>  }
>  
> -static int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
> -				    unsigned int retry)
> +static void vfio_ap_irq_disable_apqn(int apqn)
> +{
> +	struct device *dev;
> +	struct vfio_ap_queue *q;
> +
> +	dev = driver_find_device(&matrix_dev->vfio_ap_drv->driver, NULL,
> +				 &apqn, match_apqn);
> +	if (dev) {
> +		q = dev_get_drvdata(dev);
> +		vfio_ap_irq_disable(q);
> +		put_device(dev);
> +	}
> +}
> +
> +int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
> +			     unsigned int retry)
>  {
>  	struct ap_queue_status status;
> +	int retry2 = 2;
> +	int apqn = AP_MKQID(apid, apqi);
>  
>  	do {
> -		status = ap_zapq(AP_MKQID(apid, apqi));
> +		status = ap_zapq(apqn);
>  		switch (status.response_code) {
>  		case AP_RESPONSE_NORMAL:
> +			while (!status.queue_empty && retry2--) {
> +				msleep(20);
> +				status = ap_tapq(apqn, NULL);
> +			}
> +			WARN_ON_ONCE(retry <= 0);
>  			return 0;
>  		case AP_RESPONSE_RESET_IN_PROGRESS:
>  		case AP_RESPONSE_BUSY:
> @@ -861,6 +1177,7 @@ static int vfio_ap_mdev_reset_queues(struct mdev_device *mdev)
>  			 */
>  			if (ret)
>  				rc = ret;
> +			vfio_ap_irq_disable_apqn(AP_MKQID(apid, apqi));
>  		}
>  	}
>  
> @@ -904,15 +1221,20 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
>  {
>  	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>  
> -	if (matrix_mdev->kvm)
> +	mutex_lock(&matrix_dev->lock);
> +	if (matrix_mdev->kvm) {
>  		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> +		matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> +		vfio_ap_mdev_reset_queues(mdev);
> +		kvm_put_kvm(matrix_mdev->kvm);
> +		matrix_mdev->kvm = NULL;
> +	}
> +	mutex_unlock(&matrix_dev->lock);
>  
> -	vfio_ap_mdev_reset_queues(mdev);
>  	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
>  				 &matrix_mdev->iommu_notifier);
>  	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
>  				 &matrix_mdev->group_notifier);
> -	matrix_mdev->kvm = NULL;
>  	module_put(THIS_MODULE);
>  }
>  
> @@ -941,6 +1263,7 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
>  {
>  	int ret;
>  
> +	mutex_lock(&matrix_dev->lock);
>  	switch (cmd) {
>  	case VFIO_DEVICE_GET_INFO:
>  		ret = vfio_ap_mdev_get_device_info(arg);
> @@ -952,6 +1275,7 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
>  		ret = -EOPNOTSUPP;
>  		break;
>  	}
> +	mutex_unlock(&matrix_dev->lock);
>  
>  	return ret;
>  }
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index 18dcc4d..f46dde5 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -4,6 +4,7 @@
>   *
>   * Author(s): Tony Krowiak <akrowiak@linux.ibm.com>
>   *	      Halil Pasic <pasic@linux.ibm.com>
> + *	      Pierre Morel <pmorel@linux.ibm.com>
>   *
>   * Copyright IBM Corp. 2018
>   */
> @@ -89,5 +90,15 @@ struct ap_matrix_mdev {
>  
>  extern int vfio_ap_mdev_register(void);
>  extern void vfio_ap_mdev_unregister(void);
> +int vfio_ap_mdev_reset_queue(unsigned int apid, unsigned int apqi,
> +			     unsigned int retry);
>  
> +struct vfio_ap_queue {
> +	struct ap_matrix_mdev *matrix_mdev;
> +	unsigned long saved_pfn;
> +	int	apqn;
> +#define VFIO_AP_ISC_INVALID 0xff
> +	unsigned char saved_isc;
> +};
> +struct ap_queue_status vfio_ap_irq_disable(struct vfio_ap_queue *q);
>  #endif /* _VFIO_AP_PRIVATE_H_ */
acked-by: Harald Freudenberger <freude@linux.ibm.com>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest
  2019-05-21 15:34 ` [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest Pierre Morel
@ 2019-06-12 13:57   ` Harald Freudenberger
  2019-06-25 20:13   ` Christian Borntraeger
  1 sibling, 0 replies; 22+ messages in thread
From: Harald Freudenberger @ 2019-06-12 13:57 UTC (permalink / raw)
  To: Pierre Morel, borntraeger
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	akrowiak, pasic, david, schwidefsky, heiko.carstens, mimu

On 21.05.19 17:34, Pierre Morel wrote:
> AP Queue Interruption Control (AQIC) facility gives
> the guest the possibility to control interruption for
> the Cryptographic Adjunct Processor queues.
>
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  arch/s390/tools/gen_facilities.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/s390/tools/gen_facilities.c b/arch/s390/tools/gen_facilities.c
> index 61ce5b5..aed14fc 100644
> --- a/arch/s390/tools/gen_facilities.c
> +++ b/arch/s390/tools/gen_facilities.c
> @@ -114,6 +114,7 @@ static struct facility_def facility_defs[] = {
>  		.bits = (int[]){
>  			12, /* AP Query Configuration Information */
>  			15, /* AP Facilities Test */
> +			65, /* AP Queue Interruption Control */
>  			156, /* etoken facility */
>  			-1  /* END */
>  		}
acked-by: Harald Freudenberger <freude@linux.ibm.com>


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest
  2019-05-21 15:34 ` [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest Pierre Morel
  2019-06-12 13:57   ` Harald Freudenberger
@ 2019-06-25 20:13   ` Christian Borntraeger
  2019-06-25 20:15     ` Christian Borntraeger
  2019-06-27 12:04     ` Halil Pasic
  1 sibling, 2 replies; 22+ messages in thread
From: Christian Borntraeger @ 2019-06-25 20:13 UTC (permalink / raw)
  To: Pierre Morel
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	akrowiak, pasic, david, heiko.carstens, freude, mimu



On 21.05.19 17:34, Pierre Morel wrote:
> AP Queue Interruption Control (AQIC) facility gives
> the guest the possibility to control interruption for
> the Cryptographic Adjunct Processor queues.
> 
> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  arch/s390/tools/gen_facilities.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/s390/tools/gen_facilities.c b/arch/s390/tools/gen_facilities.c
> index 61ce5b5..aed14fc 100644
> --- a/arch/s390/tools/gen_facilities.c
> +++ b/arch/s390/tools/gen_facilities.c
> @@ -114,6 +114,7 @@ static struct facility_def facility_defs[] = {
>  		.bits = (int[]){
>  			12, /* AP Query Configuration Information */
>  			15, /* AP Facilities Test */
> +			65, /* AP Queue Interruption Control */
>  			156, /* etoken facility */
>  			-1  /* END */
>  		}
> 

I think we should only set stfle.65 if we have the aiv facility (Because we do not
have a GISA otherwise)

So something like this instead?

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 28ebd64..1501cd6 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2461,6 +2461,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
                set_kvm_facility(kvm->arch.model.fac_list, 147);
        }
 
+       if (css_general_characteristics.aiv)
+               set_kvm_facility(kvm->arch.model.fac_mask, 65);
+       
        kvm->arch.model.cpuid = kvm_s390_get_initial_cpuid();
        kvm->arch.model.ibc = sclp.ibc & 0x0fff;
 


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest
  2019-06-25 20:13   ` Christian Borntraeger
@ 2019-06-25 20:15     ` Christian Borntraeger
  2019-06-26 21:12       ` Tony Krowiak
  2019-06-27 12:04     ` Halil Pasic
  1 sibling, 1 reply; 22+ messages in thread
From: Christian Borntraeger @ 2019-06-25 20:15 UTC (permalink / raw)
  To: Pierre Morel
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	akrowiak, pasic, david, heiko.carstens, freude, mimu



On 25.06.19 22:13, Christian Borntraeger wrote:
> 
> 
> On 21.05.19 17:34, Pierre Morel wrote:
>> AP Queue Interruption Control (AQIC) facility gives
>> the guest the possibility to control interruption for
>> the Cryptographic Adjunct Processor queues.
>>
>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
>> ---
>>  arch/s390/tools/gen_facilities.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/s390/tools/gen_facilities.c b/arch/s390/tools/gen_facilities.c
>> index 61ce5b5..aed14fc 100644
>> --- a/arch/s390/tools/gen_facilities.c
>> +++ b/arch/s390/tools/gen_facilities.c
>> @@ -114,6 +114,7 @@ static struct facility_def facility_defs[] = {
>>  		.bits = (int[]){
>>  			12, /* AP Query Configuration Information */
>>  			15, /* AP Facilities Test */
>> +			65, /* AP Queue Interruption Control */
>>  			156, /* etoken facility */
>>  			-1  /* END */
>>  		}
>>
> 
> I think we should only set stfle.65 if we have the aiv facility (Because we do not
> have a GISA otherwise)
> 
> So something like this instead?
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 28ebd64..1501cd6 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2461,6 +2461,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>                 set_kvm_facility(kvm->arch.model.fac_list, 147);
>         }
>  
> +       if (css_general_characteristics.aiv)
> +               set_kvm_facility(kvm->arch.model.fac_mask, 65);
> +       
>         kvm->arch.model.cpuid = kvm_s390_get_initial_cpuid();
>         kvm->arch.model.ibc = sclp.ibc & 0x0fff;
>  
> 

Maybe even just piggyback on gisa init (it will bail out early).

diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
index 9dde4d7..9182a04 100644
--- a/arch/s390/kvm/interrupt.c
+++ b/arch/s390/kvm/interrupt.c
@@ -3100,6 +3100,7 @@ void kvm_s390_gisa_init(struct kvm *kvm)
        gi->timer.function = gisa_vcpu_kicker;
        memset(gi->origin, 0, sizeof(struct kvm_s390_gisa));
        gi->origin->next_alert = (u32)(u64)gi->origin;
+       set_kvm_facility(kvm->arch.model.fac_mask, 65);
        VM_EVENT(kvm, 3, "gisa 0x%pK initialized", gi->origin);
 }
 


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest
  2019-06-25 20:15     ` Christian Borntraeger
@ 2019-06-26 21:12       ` Tony Krowiak
  2019-06-27  6:54         ` Christian Borntraeger
  0 siblings, 1 reply; 22+ messages in thread
From: Tony Krowiak @ 2019-06-26 21:12 UTC (permalink / raw)
  To: Christian Borntraeger, Pierre Morel
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	pasic, david, heiko.carstens, freude, mimu

On 6/25/19 4:15 PM, Christian Borntraeger wrote:
> 
> 
> On 25.06.19 22:13, Christian Borntraeger wrote:
>>
>>
>> On 21.05.19 17:34, Pierre Morel wrote:
>>> AP Queue Interruption Control (AQIC) facility gives
>>> the guest the possibility to control interruption for
>>> the Cryptographic Adjunct Processor queues.
>>>
>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
>>> ---
>>>   arch/s390/tools/gen_facilities.c | 1 +
>>>   1 file changed, 1 insertion(+)
>>>
>>> diff --git a/arch/s390/tools/gen_facilities.c b/arch/s390/tools/gen_facilities.c
>>> index 61ce5b5..aed14fc 100644
>>> --- a/arch/s390/tools/gen_facilities.c
>>> +++ b/arch/s390/tools/gen_facilities.c
>>> @@ -114,6 +114,7 @@ static struct facility_def facility_defs[] = {
>>>   		.bits = (int[]){
>>>   			12, /* AP Query Configuration Information */
>>>   			15, /* AP Facilities Test */
>>> +			65, /* AP Queue Interruption Control */
>>>   			156, /* etoken facility */
>>>   			-1  /* END */
>>>   		}
>>>
>>
>> I think we should only set stfle.65 if we have the aiv facility (Because we do not
>> have a GISA otherwise)

My assumption here is that you are taking the line added above
(STFLE.65) out and replacing with one of the two suggestions
below. I am quite fuzzy on how all of this CPU model stuff works,
but I am thinking that the above makes STFLE.65 available to be
set via the CPU model (i.e., aqic=on on the QEMU command line) as
long as it is supported by the host. By taking that line out, we
are relying on one of the suggestions below to make STFLE.65
available to the guest only if AIV facility is available. Does that
sound about right?

If that is the case, then wouldn't we also have to add a check to make
sure that STFLE.65 is available on the host (i.e., test_facility(65))?




>>
>> So something like this instead?
>>
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 28ebd64..1501cd6 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -2461,6 +2461,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>                  set_kvm_facility(kvm->arch.model.fac_list, 147);
>>          }
>>   
>> +       if (css_general_characteristics.aiv)
>> +               set_kvm_facility(kvm->arch.model.fac_mask, 65);
>> +
>>          kvm->arch.model.cpuid = kvm_s390_get_initial_cpuid();
>>          kvm->arch.model.ibc = sclp.ibc & 0x0fff;
>>   
>>
> 
> Maybe even just piggyback on gisa init (it will bail out early).

It could also go in the kvm_s390_crypto_init() function since it
is related to crypto.

> 
> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
> index 9dde4d7..9182a04 100644
> --- a/arch/s390/kvm/interrupt.c
> +++ b/arch/s390/kvm/interrupt.c
> @@ -3100,6 +3100,7 @@ void kvm_s390_gisa_init(struct kvm *kvm)
>          gi->timer.function = gisa_vcpu_kicker;
>          memset(gi->origin, 0, sizeof(struct kvm_s390_gisa));
>          gi->origin->next_alert = (u32)(u64)gi->origin;
> +       set_kvm_facility(kvm->arch.model.fac_mask, 65);
>          VM_EVENT(kvm, 3, "gisa 0x%pK initialized", gi->origin);
>   }
>   
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest
  2019-06-26 21:12       ` Tony Krowiak
@ 2019-06-27  6:54         ` Christian Borntraeger
  0 siblings, 0 replies; 22+ messages in thread
From: Christian Borntraeger @ 2019-06-27  6:54 UTC (permalink / raw)
  To: Tony Krowiak, Pierre Morel
  Cc: alex.williamson, cohuck, linux-kernel, linux-s390, kvm, frankja,
	pasic, david, heiko.carstens, freude, mimu



On 26.06.19 23:12, Tony Krowiak wrote:
> On 6/25/19 4:15 PM, Christian Borntraeger wrote:
>>
>>
>> On 25.06.19 22:13, Christian Borntraeger wrote:
>>>
>>>
>>> On 21.05.19 17:34, Pierre Morel wrote:
>>>> AP Queue Interruption Control (AQIC) facility gives
>>>> the guest the possibility to control interruption for
>>>> the Cryptographic Adjunct Processor queues.
>>>>
>>>> Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
>>>> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
>>>> ---
>>>>   arch/s390/tools/gen_facilities.c | 1 +
>>>>   1 file changed, 1 insertion(+)
>>>>
>>>> diff --git a/arch/s390/tools/gen_facilities.c b/arch/s390/tools/gen_facilities.c
>>>> index 61ce5b5..aed14fc 100644
>>>> --- a/arch/s390/tools/gen_facilities.c
>>>> +++ b/arch/s390/tools/gen_facilities.c
>>>> @@ -114,6 +114,7 @@ static struct facility_def facility_defs[] = {
>>>>           .bits = (int[]){
>>>>               12, /* AP Query Configuration Information */
>>>>               15, /* AP Facilities Test */
>>>> +            65, /* AP Queue Interruption Control */
>>>>               156, /* etoken facility */
>>>>               -1  /* END */
>>>>           }
>>>>
>>>
>>> I think we should only set stfle.65 if we have the aiv facility (Because we do not
>>> have a GISA otherwise)
> 
> My assumption here is that you are taking the line added above
> (STFLE.65) out and replacing with one of the two suggestions
> below.

Yes, I want to replace this hunk.

 I am quite fuzzy on how all of this CPU model stuff works,
> but I am thinking that the above makes STFLE.65 available to be
> set via the CPU model (i.e., aqic=on on the QEMU command line) as
> long as it is supported by the host.

Yes, it makes it available when the host has stfle.65. But at the same
time it does not look if the adapter interruption virtualization facility
is available. For example for vsie the guest2 will enable stfle.65 for its
guests, but we do not support AIV.

 By taking that line out, we
> are relying on one of the suggestions below to make STFLE.65
> available to the guest only if AIV facility is available. Does that
> sound about right?
> 
> If that is the case, then wouldn't we also have to add a check to make
> sure that STFLE.65 is available on the host (i.e., test_facility(65))?

I think AIV in level n is enough to provide STFLE.65 in level n+1. 
On the other hand also checking for stfle.65 does not hurt.


> 
> 
>>>
>>> So something like this instead?
>>>
>>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>>> index 28ebd64..1501cd6 100644
>>> --- a/arch/s390/kvm/kvm-s390.c
>>> +++ b/arch/s390/kvm/kvm-s390.c
>>> @@ -2461,6 +2461,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>>>                  set_kvm_facility(kvm->arch.model.fac_list, 147);
>>>          }
>>>   +       if (css_general_characteristics.aiv)
>>> +               set_kvm_facility(kvm->arch.model.fac_mask, 65);
>>> +
>>>          kvm->arch.model.cpuid = kvm_s390_get_initial_cpuid();
>>>          kvm->arch.model.ibc = sclp.ibc & 0x0fff;
>>>  
>>
>> Maybe even just piggyback on gisa init (it will bail out early).
> 
> It could also go in the kvm_s390_crypto_init() function since it
> is related to crypto.
> 
>>
>> diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c
>> index 9dde4d7..9182a04 100644
>> --- a/arch/s390/kvm/interrupt.c
>> +++ b/arch/s390/kvm/interrupt.c
>> @@ -3100,6 +3100,7 @@ void kvm_s390_gisa_init(struct kvm *kvm)
>>          gi->timer.function = gisa_vcpu_kicker;
>>          memset(gi->origin, 0, sizeof(struct kvm_s390_gisa));
>>          gi->origin->next_alert = (u32)(u64)gi->origin;
>> +       set_kvm_facility(kvm->arch.model.fac_mask, 65);
>>          VM_EVENT(kvm, 3, "gisa 0x%pK initialized", gi->origin);
>>   }
>>  
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest
  2019-06-25 20:13   ` Christian Borntraeger
  2019-06-25 20:15     ` Christian Borntraeger
@ 2019-06-27 12:04     ` Halil Pasic
  1 sibling, 0 replies; 22+ messages in thread
From: Halil Pasic @ 2019-06-27 12:04 UTC (permalink / raw)
  To: Christian Borntraeger
  Cc: Pierre Morel, alex.williamson, cohuck, linux-kernel, linux-s390,
	kvm, frankja, akrowiak, david, heiko.carstens, freude, mimu

On Tue, 25 Jun 2019 22:13:12 +0200
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> 
> 
> On 21.05.19 17:34, Pierre Morel wrote:
> > AP Queue Interruption Control (AQIC) facility gives
> > the guest the possibility to control interruption for
> > the Cryptographic Adjunct Processor queues.
> > 
> > Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
> > Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
> > ---
> >  arch/s390/tools/gen_facilities.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/arch/s390/tools/gen_facilities.c b/arch/s390/tools/gen_facilities.c
> > index 61ce5b5..aed14fc 100644
> > --- a/arch/s390/tools/gen_facilities.c
> > +++ b/arch/s390/tools/gen_facilities.c
> > @@ -114,6 +114,7 @@ static struct facility_def facility_defs[] = {
> >  		.bits = (int[]){
> >  			12, /* AP Query Configuration Information */
> >  			15, /* AP Facilities Test */
> > +			65, /* AP Queue Interruption Control */
> >  			156, /* etoken facility */
> >  			-1  /* END */
> >  		}
> > 
> 
> I think we should only set stfle.65 if we have the aiv facility (Because we do not
> have a GISA otherwise)
> 
> So something like this instead?
> 
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 28ebd64..1501cd6 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2461,6 +2461,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
>                 set_kvm_facility(kvm->arch.model.fac_list, 147);
>         }
>  
> +       if (css_general_characteristics.aiv)
> +               set_kvm_facility(kvm->arch.model.fac_mask, 65);
> +       
>         kvm->arch.model.cpuid = kvm_s390_get_initial_cpuid();
>         kvm->arch.model.ibc = sclp.ibc & 0x0fff;

I will go with this option because it is more readable (easier to find)
IMHO. Will also add a chech for host sltfle.65. So I end up with:

diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 28ebd647784c..1c4113f0f2a8 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2461,6 +2461,9 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
                set_kvm_facility(kvm->arch.model.fac_list, 147);
        }
 
+       if (css_general_characteristics.aiv && test_facility(65))
+               set_kvm_facility(kvm->arch.model.fac_mask, 65);
+
        kvm->arch.model.cpuid = kvm_s390_get_initial_cpuid();
        kvm->arch.model.ibc = sclp.ibc & 0x0fff;
 
diff --git a/arch/s390/tools/gen_facilities.c b/arch/s390/tools/gen_facilities.c
index d52f537b7169..cead9e0dcffb 100644
--- a/arch/s390/tools/gen_facilities.c
+++ b/arch/s390/tools/gen_facilities.c
@@ -111,7 +111,6 @@ static struct facility_def facility_defs[] = {
                .bits = (int[]){
                        12, /* AP Query Configuration Information */
                        15, /* AP Facilities Test */
-                       65, /* AP Queue Interruption Control */
                        156, /* etoken facility */
                        -1  /* END */



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control
  2019-05-21 15:34 [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control Pierre Morel
                   ` (4 preceding siblings ...)
  2019-05-23 15:36 ` [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control Tony Krowiak
@ 2019-07-01 12:00 ` Halil Pasic
  5 siblings, 0 replies; 22+ messages in thread
From: Halil Pasic @ 2019-07-01 12:00 UTC (permalink / raw)
  To: Pierre Morel
  Cc: borntraeger, alex.williamson, cohuck, linux-kernel, linux-s390,
	kvm, frankja, akrowiak, david, schwidefsky, heiko.carstens,
	freude, mimu

On Tue, 21 May 2019 17:34:33 +0200
Pierre Morel <pmorel@linux.ibm.com> wrote:

> This patch series implements PQAP/AQIC interception in KVM.

Thanks, applied


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2019-07-01 12:00 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-21 15:34 [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control Pierre Morel
2019-05-21 15:34 ` [PATCH v9 1/4] s390: ap: kvm: add PQAP interception for AQIC Pierre Morel
2019-06-12 13:55   ` Harald Freudenberger
2019-05-21 15:34 ` [PATCH v9 2/4] vfio: ap: register IOMMU VFIO notifier Pierre Morel
2019-06-12 13:56   ` Harald Freudenberger
2019-05-21 15:34 ` [PATCH v9 3/4] s390: ap: implement PAPQ AQIC interception in kernel Pierre Morel
2019-05-23 15:43   ` Tony Krowiak
2019-06-04 19:38   ` Tony Krowiak
2019-06-07 14:29     ` Halil Pasic
2019-06-11 14:37       ` Tony Krowiak
2019-06-11 15:00         ` Halil Pasic
2019-06-12 13:56   ` Harald Freudenberger
2019-05-21 15:34 ` [PATCH v9 4/4] s390: ap: kvm: Enable PQAP/AQIC facility for the guest Pierre Morel
2019-06-12 13:57   ` Harald Freudenberger
2019-06-25 20:13   ` Christian Borntraeger
2019-06-25 20:15     ` Christian Borntraeger
2019-06-26 21:12       ` Tony Krowiak
2019-06-27  6:54         ` Christian Borntraeger
2019-06-27 12:04     ` Halil Pasic
2019-05-23 15:36 ` [PATCH v9 0/4] vfio: ap: AP Queue Interrupt Control Tony Krowiak
2019-06-04 14:56   ` Halil Pasic
2019-07-01 12:00 ` Halil Pasic

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).