linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] s390/vfio-ap: refactor mdev remove callback and locks
@ 2021-06-09 22:46 Tony Krowiak
  2021-06-09 22:46 ` [PATCH 1/3] s390/vfio-ap: clean up mdev resources when remove callback invoked Tony Krowiak
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Tony Krowiak @ 2021-06-09 22:46 UTC (permalink / raw)
  To: linux-s390, linux-kernel
  Cc: borntraeger, cohuck, pasic, jjherne, jgg, alex.williamson,
	kwankhede, frankja, david, imbrenda, hca

This series is an expansion of a previous patch series entitled 
's390/vfio-ap: fix memory leak in mdev remove callback'. It has been
renamed because there never really was a memory leak to fix. In addition,
reviews of the previous series gave rise to additional changes to the
locking mechanisms used to control access to various data elements managed
by the vfio_ap device driver.

This patch series:

1. Refactors the mdev remove callback to always clean up mdev resources
2. Adds two semaphores to replace wait queue during KVM pointer set/unset
3. Adds a semaphore to control r/w access to PQAP instruction interception
   handler.

Tony Krowiak (3):
  s390/vfio-ap: clean up mdev resources when remove callback invoked
  s390/vfio-ap: introduce two new r/w locks to replace wait_queue_head_t
  s390/vfio-ap: r/w lock for PQAP interception handler function pointer

 arch/s390/include/asm/kvm_host.h      |   6 +-
 arch/s390/kvm/kvm-s390.c              |   1 +
 arch/s390/kvm/priv.c                  |   6 +-
 drivers/s390/crypto/vfio_ap_ops.c     | 374 ++++++++++----------------
 drivers/s390/crypto/vfio_ap_private.h |  10 +-
 5 files changed, 163 insertions(+), 234 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/3] s390/vfio-ap: clean up mdev resources when remove callback invoked
  2021-06-09 22:46 [PATCH 0/3] s390/vfio-ap: refactor mdev remove callback and locks Tony Krowiak
@ 2021-06-09 22:46 ` Tony Krowiak
  2021-06-11 16:48   ` Jason Gunthorpe
  2021-06-09 22:46 ` [PATCH 2/3] s390/vfio-ap: introduce two new r/w locks to replace wait_queue_head_t Tony Krowiak
  2021-06-09 22:46 ` [PATCH 3/3] s390/vfio-ap: r/w lock for PQAP interception handler function pointer Tony Krowiak
  2 siblings, 1 reply; 14+ messages in thread
From: Tony Krowiak @ 2021-06-09 22:46 UTC (permalink / raw)
  To: linux-s390, linux-kernel
  Cc: borntraeger, cohuck, pasic, jjherne, jgg, alex.williamson,
	kwankhede, frankja, david, imbrenda, hca

The mdev remove callback for the vfio_ap device driver bails out with
-EBUSY if the mdev is in use by a KVM guest (i.e., the KVM pointer in the
struct ap_matrix_mdev is not NULL). The intended purpose was
to prevent the mdev from being removed while in use. There are two
problems with this scenario:

1. Returning a non-zero return code from the remove callback does not
   prevent the removal of the mdev.

2. The KVM pointer in the struct ap_matrix_mdev will always be NULL because
   the remove callback will not get invoked until the mdev fd is closed.
   When the mdev fd is closed, the mdev release callback is invoked and
   clears the KVM pointer from the struct ap_matrix_mdev.

Let's go ahead and remove the check for KVM in the remove callback and
allow the cleanup of mdev resources to proceed.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_ops.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index b2c7e10dfdcd..122c85c22469 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -366,16 +366,6 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 
 	mutex_lock(&matrix_dev->lock);
-
-	/*
-	 * If the KVM pointer is in flux or the guest is running, disallow
-	 * un-assignment of control domain.
-	 */
-	if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
-		mutex_unlock(&matrix_dev->lock);
-		return -EBUSY;
-	}
-
 	vfio_ap_mdev_reset_queues(mdev);
 	list_del(&matrix_mdev->node);
 	kfree(matrix_mdev);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/3] s390/vfio-ap: introduce two new r/w locks to replace wait_queue_head_t
  2021-06-09 22:46 [PATCH 0/3] s390/vfio-ap: refactor mdev remove callback and locks Tony Krowiak
  2021-06-09 22:46 ` [PATCH 1/3] s390/vfio-ap: clean up mdev resources when remove callback invoked Tony Krowiak
@ 2021-06-09 22:46 ` Tony Krowiak
  2021-06-11 17:05   ` Jason Gunthorpe
  2021-06-09 22:46 ` [PATCH 3/3] s390/vfio-ap: r/w lock for PQAP interception handler function pointer Tony Krowiak
  2 siblings, 1 reply; 14+ messages in thread
From: Tony Krowiak @ 2021-06-09 22:46 UTC (permalink / raw)
  To: linux-s390, linux-kernel
  Cc: borntraeger, cohuck, pasic, jjherne, jgg, alex.williamson,
	kwankhede, frankja, david, imbrenda, hca

This patch introduces two new r/w locks to replace the wait_queue_head_t
that was introduced to fix a lockdep splat reported when testing
pass-through of AP queues to a Secure Execution guest. This was the
abbreviated dependency chain reported by lockdep that was fixed using
a wait queue:

kvm_arch_crypto_set_masks+0x4a/0x2b8 [kvm]        kvm->lock
vfio_ap_mdev_group_notifier+0x154/0x170 [vfio_ap] matrix_dev->lock

handle_pqap+0x56/0x1d0 [vfio_ap]    matrix_dev->lock
kvm_vcpu_ioctl+0x2cc/0x898 [kvm]    vcpu->mutex

kvm_s390_cpus_to_pv+0x4e/0xf8 [kvm]   vcpu->mutex
kvm_arch_vm_ioctl+0x3ec/0x550 [kvm]   kvm->lock

The handle_pqap function handles interception of the PQAP instruction
issued on the guest to enable interrupts for each queue assigned to the
KVM guest. The kvm_arch_crypto_set_masks function sets the bits in the
KVM guest's AP Control Block (APCB) that supplies the AP configuration to
the guest. It is called by the group notifier that responds to the event
indicating that the KVM pointer has been set (i.e., the guest has been
started).

The idea behind the use of the wait_queue_head_t is to set a flag
indicating whether the KVM pointer is being set or not. The flag is set
when the KVM pointer is in the process of being set. All functions
that require access to the KVM pointer must wait until the pointer is set.
This allows the vfio_ap device driver to give up the matrix_dev->lock
while the kvm_arch_crypto_set_masks function is executing which circumvents
the lockdep splat.

It was pointed out by Jason Gunthorpe that this merely defeats lockdep
analysis and ought to be replaced by a proper rwsem. This patch attempts
to do just that; however, a single rw_semaphore will not do the trick by
itself. Consequently, two semaphores are introduced to control access to
the data contained within struct ap_matrix_mdev:

* An rw_semaphore is added to struct ap_matrix_mdev. The purpose of this
  lock is to control access to all data contained within
  struct ap_matrix_mdev, including the bitmaps specifying the
  AP configuration for a KVM guest.

* An rw_semaphore is added to struct ap_matrix to control access to the
  bitmaps specifying the AP configuration for a KVM guest. The primary
  reason for this lock is to prevent write access while the bitmaps are
  being read to update the KVM guest's AP control block (APCB) that
  supplies the AP configuration for the guest. If we provide a lock only
  for struct ap_matrix_mdev and hold that lock while the KVM guest's
  APCB is being updated, we'll end up with a lockdep splat similar to that
  resolved by the wait queue because the kvm->lock is held for the APCB
  update. This lock enables the driver to give up the
  rw_semaphore in struct ap_matrix_mdev while the KVM guest's AP config
  is being updated and therefore avoid the lockdep splat. In order to edit
  struct ap_matrix - as is done when adapters and domains are assigned to
  or unassigned from the mdev - both the semaphores must be locked: A read
  lock on the semaphore in struct ap_matrix_mdev and a write lock on
  ap_matrix.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_ops.c     | 356 ++++++++++----------------
 drivers/s390/crypto/vfio_ap_private.h |   8 +-
 2 files changed, 148 insertions(+), 216 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 122c85c22469..d65a5728153b 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -286,27 +286,14 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
 	if (!(vcpu->arch.sie_block->eca & ECA_AIV))
 		return -EOPNOTSUPP;
 
-	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
-	mutex_lock(&matrix_dev->lock);
-
 	if (!vcpu->kvm->arch.crypto.pqap_hook)
-		goto out_unlock;
+		goto done;
 	matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
 				   struct ap_matrix_mdev, pqap_hook);
 
-	/*
-	 * If the KVM pointer is in the process of being set, wait until the
-	 * process has completed.
-	 */
-	wait_event_cmd(matrix_mdev->wait_for_kvm,
-		       !matrix_mdev->kvm_busy,
-		       mutex_unlock(&matrix_dev->lock),
-		       mutex_lock(&matrix_dev->lock));
-
-	/* If the there is no guest using the mdev, there is nothing to do */
-	if (!matrix_mdev->kvm)
-		goto out_unlock;
+	down_read(&matrix_mdev->rwsem);
 
+	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
 	q = vfio_ap_get_queue(matrix_mdev, apqn);
 	if (!q)
 		goto out_unlock;
@@ -321,9 +308,10 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
 		qstatus = vfio_ap_irq_disable(q);
 
 out_unlock:
+	up_read(&matrix_mdev->rwsem);
+done:
 	memcpy(&vcpu->run->s.regs.gprs[1], &qstatus, sizeof(qstatus));
 	vcpu->run->s.regs.gprs[1] >>= 32;
-	mutex_unlock(&matrix_dev->lock);
 	return 0;
 }
 
@@ -333,6 +321,7 @@ static void vfio_ap_matrix_init(struct ap_config_info *info,
 	matrix->apm_max = info->apxa ? info->Na : 63;
 	matrix->aqm_max = info->apxa ? info->Nd : 15;
 	matrix->adm_max = info->apxa ? info->Nd : 15;
+	init_rwsem(&matrix->rwsem);
 }
 
 static int vfio_ap_mdev_create(struct mdev_device *mdev)
@@ -350,10 +339,11 @@ static int vfio_ap_mdev_create(struct mdev_device *mdev)
 
 	matrix_mdev->mdev = mdev;
 	vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
-	init_waitqueue_head(&matrix_mdev->wait_for_kvm);
+	init_rwsem(&matrix_mdev->rwsem);
 	mdev_set_drvdata(mdev, matrix_mdev);
 	matrix_mdev->pqap_hook.hook = handle_pqap;
 	matrix_mdev->pqap_hook.owner = THIS_MODULE;
+
 	mutex_lock(&matrix_dev->lock);
 	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
 	mutex_unlock(&matrix_dev->lock);
@@ -366,13 +356,14 @@ static int vfio_ap_mdev_remove(struct mdev_device *mdev)
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 
 	mutex_lock(&matrix_dev->lock);
-	vfio_ap_mdev_reset_queues(mdev);
 	list_del(&matrix_mdev->node);
-	kfree(matrix_mdev);
-	mdev_set_drvdata(mdev, NULL);
 	atomic_inc(&matrix_dev->available_instances);
 	mutex_unlock(&matrix_dev->lock);
 
+	vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
+	kfree(matrix_mdev);
+	mdev_set_drvdata(mdev, NULL);
+
 	return 0;
 }
 
@@ -577,6 +568,30 @@ static int vfio_ap_mdev_verify_no_sharing(struct ap_matrix_mdev *matrix_mdev)
 	return 0;
 }
 
+static int vfio_ap_mdev_matrix_store_lock(struct ap_matrix_mdev *matrix_mdev)
+{
+	if (!down_write_trylock(&matrix_mdev->rwsem))
+		return -EBUSY;
+
+	if (matrix_mdev->kvm) {
+		up_write(&matrix_mdev->rwsem);
+		return -EBUSY;
+	}
+
+	if (!down_write_trylock(&matrix_mdev->matrix.rwsem)) {
+		up_write(&matrix_mdev->rwsem);
+		return -EBUSY;
+	}
+
+	return 0;
+}
+
+static void vfio_ap_mdev_matrix_store_unlock(struct ap_matrix_mdev *matrix_mdev)
+{
+	up_write(&matrix_mdev->rwsem);
+	up_write(&matrix_mdev->matrix.rwsem);
+}
+
 /**
  * assign_adapter_store
  *
@@ -618,31 +633,17 @@ static ssize_t assign_adapter_store(struct device *dev,
 	struct mdev_device *mdev = mdev_from_dev(dev);
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 
-	mutex_lock(&matrix_dev->lock);
-
-	/*
-	 * If the KVM pointer is in flux or the guest is running, disallow
-	 * un-assignment of adapter
-	 */
-	if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
-		ret = -EBUSY;
-		goto done;
-	}
-
 	ret = kstrtoul(buf, 0, &apid);
 	if (ret)
-		goto done;
+		return ret;
 
-	if (apid > matrix_mdev->matrix.apm_max) {
-		ret = -ENODEV;
-		goto done;
-	}
+	if (apid > matrix_mdev->matrix.apm_max)
+		return -ENODEV;
+
+	ret = vfio_ap_mdev_matrix_store_lock(matrix_mdev);
+	if (ret)
+		return ret;
 
-	/*
-	 * Set the bit in the AP mask (APM) corresponding to the AP adapter
-	 * number (APID). The bits in the mask, from most significant to least
-	 * significant bit, correspond to APIDs 0-255.
-	 */
 	ret = vfio_ap_mdev_verify_queues_reserved_for_apid(matrix_mdev, apid);
 	if (ret)
 		goto done;
@@ -659,7 +660,7 @@ static ssize_t assign_adapter_store(struct device *dev,
 share_err:
 	clear_bit_inv(apid, matrix_mdev->matrix.apm);
 done:
-	mutex_unlock(&matrix_dev->lock);
+	vfio_ap_mdev_matrix_store_unlock(matrix_mdev);
 
 	return ret;
 }
@@ -691,31 +692,21 @@ static ssize_t unassign_adapter_store(struct device *dev,
 	struct mdev_device *mdev = mdev_from_dev(dev);
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 
-	mutex_lock(&matrix_dev->lock);
-
-	/*
-	 * If the KVM pointer is in flux or the guest is running, disallow
-	 * un-assignment of adapter
-	 */
-	if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
-		ret = -EBUSY;
-		goto done;
-	}
-
 	ret = kstrtoul(buf, 0, &apid);
 	if (ret)
-		goto done;
+		return ret;
 
-	if (apid > matrix_mdev->matrix.apm_max) {
-		ret = -ENODEV;
-		goto done;
-	}
+	if (apid > matrix_mdev->matrix.apm_max)
+		return -ENODEV;
+
+	ret = vfio_ap_mdev_matrix_store_lock(matrix_mdev);
+	if (ret)
+		return ret;
 
 	clear_bit_inv((unsigned long)apid, matrix_mdev->matrix.apm);
-	ret = count;
-done:
-	mutex_unlock(&matrix_dev->lock);
-	return ret;
+	vfio_ap_mdev_matrix_store_unlock(matrix_mdev);
+
+	return count;
 }
 static DEVICE_ATTR_WO(unassign_adapter);
 
@@ -781,24 +772,15 @@ static ssize_t assign_domain_store(struct device *dev,
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 	unsigned long max_apqi = matrix_mdev->matrix.aqm_max;
 
-	mutex_lock(&matrix_dev->lock);
-
-	/*
-	 * If the KVM pointer is in flux or the guest is running, disallow
-	 * assignment of domain
-	 */
-	if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
-		ret = -EBUSY;
-		goto done;
-	}
-
 	ret = kstrtoul(buf, 0, &apqi);
 	if (ret)
-		goto done;
-	if (apqi > max_apqi) {
-		ret = -ENODEV;
-		goto done;
-	}
+		return ret;
+	if (apqi > max_apqi)
+		return -ENODEV;
+
+	ret = vfio_ap_mdev_matrix_store_lock(matrix_mdev);
+	if (ret)
+		return ret;
 
 	ret = vfio_ap_mdev_verify_queues_reserved_for_apqi(matrix_mdev, apqi);
 	if (ret)
@@ -816,7 +798,7 @@ static ssize_t assign_domain_store(struct device *dev,
 share_err:
 	clear_bit_inv(apqi, matrix_mdev->matrix.aqm);
 done:
-	mutex_unlock(&matrix_dev->lock);
+	vfio_ap_mdev_matrix_store_unlock(matrix_mdev);
 
 	return ret;
 }
@@ -849,32 +831,21 @@ static ssize_t unassign_domain_store(struct device *dev,
 	struct mdev_device *mdev = mdev_from_dev(dev);
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 
-	mutex_lock(&matrix_dev->lock);
-
-	/*
-	 * If the KVM pointer is in flux or the guest is running, disallow
-	 * un-assignment of domain
-	 */
-	if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
-		ret = -EBUSY;
-		goto done;
-	}
-
 	ret = kstrtoul(buf, 0, &apqi);
 	if (ret)
-		goto done;
+		return ret;
 
-	if (apqi > matrix_mdev->matrix.aqm_max) {
-		ret = -ENODEV;
-		goto done;
-	}
+	if (apqi > matrix_mdev->matrix.aqm_max)
+		return -ENODEV;
+
+	ret = vfio_ap_mdev_matrix_store_lock(matrix_mdev);
+	if (ret)
+		return ret;
 
 	clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm);
-	ret = count;
+	vfio_ap_mdev_matrix_store_unlock(matrix_mdev);
 
-done:
-	mutex_unlock(&matrix_dev->lock);
-	return ret;
+	return count;
 }
 static DEVICE_ATTR_WO(unassign_domain);
 
@@ -903,25 +874,16 @@ static ssize_t assign_control_domain_store(struct device *dev,
 	struct mdev_device *mdev = mdev_from_dev(dev);
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 
-	mutex_lock(&matrix_dev->lock);
-
-	/*
-	 * If the KVM pointer is in flux or the guest is running, disallow
-	 * assignment of control domain.
-	 */
-	if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
-		ret = -EBUSY;
-		goto done;
-	}
-
 	ret = kstrtoul(buf, 0, &id);
 	if (ret)
-		goto done;
+		return ret;
 
-	if (id > matrix_mdev->matrix.adm_max) {
-		ret = -ENODEV;
-		goto done;
-	}
+	if (id > matrix_mdev->matrix.adm_max)
+		return -ENODEV;
+
+	ret = vfio_ap_mdev_matrix_store_lock(matrix_mdev);
+	if (ret)
+		return ret;
 
 	/* Set the bit in the ADM (bitmask) corresponding to the AP control
 	 * domain number (id). The bits in the mask, from most significant to
@@ -929,10 +891,9 @@ static ssize_t assign_control_domain_store(struct device *dev,
 	 * number of control domains that can be assigned.
 	 */
 	set_bit_inv(id, matrix_mdev->matrix.adm);
-	ret = count;
-done:
-	mutex_unlock(&matrix_dev->lock);
-	return ret;
+	vfio_ap_mdev_matrix_store_unlock(matrix_mdev);
+
+	return count;
 }
 static DEVICE_ATTR_WO(assign_control_domain);
 
@@ -962,30 +923,20 @@ static ssize_t unassign_control_domain_store(struct device *dev,
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 	unsigned long max_domid =  matrix_mdev->matrix.adm_max;
 
-	mutex_lock(&matrix_dev->lock);
-
-	/*
-	 * If the KVM pointer is in flux or the guest is running, disallow
-	 * un-assignment of control domain.
-	 */
-	if (matrix_mdev->kvm_busy || matrix_mdev->kvm) {
-		ret = -EBUSY;
-		goto done;
-	}
-
 	ret = kstrtoul(buf, 0, &domid);
 	if (ret)
-		goto done;
-	if (domid > max_domid) {
-		ret = -ENODEV;
-		goto done;
-	}
+		return ret;
+	if (domid > max_domid)
+		return -ENODEV;
+
+	ret = vfio_ap_mdev_matrix_store_lock(matrix_mdev);
+	if (ret)
+		return ret;
 
 	clear_bit_inv(domid, matrix_mdev->matrix.adm);
-	ret = count;
-done:
-	mutex_unlock(&matrix_dev->lock);
-	return ret;
+	vfio_ap_mdev_matrix_store_unlock(matrix_mdev);
+
+	return count;
 }
 static DEVICE_ATTR_WO(unassign_control_domain);
 
@@ -1001,13 +952,13 @@ static ssize_t control_domains_show(struct device *dev,
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 	unsigned long max_domid = matrix_mdev->matrix.adm_max;
 
-	mutex_lock(&matrix_dev->lock);
+	down_read(&matrix_mdev->rwsem);
 	for_each_set_bit_inv(id, matrix_mdev->matrix.adm, max_domid + 1) {
 		n = sprintf(bufpos, "%04lx\n", id);
 		bufpos += n;
 		nchars += n;
 	}
-	mutex_unlock(&matrix_dev->lock);
+	up_read(&matrix_mdev->rwsem);
 
 	return nchars;
 }
@@ -1028,11 +979,10 @@ static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
 	int nchars = 0;
 	int n;
 
+	down_read(&matrix_mdev->rwsem);
 	apid1 = find_first_bit_inv(matrix_mdev->matrix.apm, napm_bits);
 	apqi1 = find_first_bit_inv(matrix_mdev->matrix.aqm, naqm_bits);
 
-	mutex_lock(&matrix_dev->lock);
-
 	if ((apid1 < napm_bits) && (apqi1 < naqm_bits)) {
 		for_each_set_bit_inv(apid, matrix_mdev->matrix.apm, napm_bits) {
 			for_each_set_bit_inv(apqi, matrix_mdev->matrix.aqm,
@@ -1057,7 +1007,7 @@ static ssize_t matrix_show(struct device *dev, struct device_attribute *attr,
 		}
 	}
 
-	mutex_unlock(&matrix_dev->lock);
+	up_read(&matrix_mdev->rwsem);
 
 	return nchars;
 }
@@ -1090,15 +1040,8 @@ static const struct attribute_group *vfio_ap_mdev_attr_groups[] = {
  * @matrix_mdev: a mediated matrix device
  * @kvm: reference to KVM instance
  *
- * Sets all data for @matrix_mdev that are needed to manage AP resources
- * for the guest whose state is represented by @kvm.
- *
- * Note: The matrix_dev->lock must be taken prior to calling
- * this function; however, the lock will be temporarily released while the
- * guest's AP configuration is set to avoid a potential lockdep splat.
- * The kvm->lock is taken to set the guest's AP configuration which, under
- * certain circumstances, will result in a circular lock dependency if this is
- * done under the @matrix_mdev->lock.
+ * Verifies no other mediated matrix device has @kvm and sets a reference to
+ * it in @matrix_mdev->kvm.
  *
  * Return 0 if no other mediated matrix device has a reference to @kvm;
  * otherwise, returns an -EPERM.
@@ -1108,24 +1051,32 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
 {
 	struct ap_matrix_mdev *m;
 
-	if (kvm->arch.crypto.crycbd) {
-		list_for_each_entry(m, &matrix_dev->mdev_list, node) {
-			if (m != matrix_mdev && m->kvm == kvm)
-				return -EPERM;
+	mutex_lock(&matrix_dev->lock);
+	list_for_each_entry(m, &matrix_dev->mdev_list, node) {
+		if ((m != matrix_mdev) && (m->kvm == kvm)) {
+			mutex_unlock(&matrix_dev->lock);
+			return -EPERM;
 		}
+	}
+	mutex_unlock(&matrix_dev->lock);
+
+	down_write(&matrix_mdev->rwsem);
+	matrix_mdev->kvm = kvm;
+	kvm_get_kvm(kvm);
+	kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
+	up_write(&matrix_mdev->rwsem);
 
-		kvm_get_kvm(kvm);
-		matrix_mdev->kvm_busy = true;
-		mutex_unlock(&matrix_dev->lock);
+	/*
+	 * If there is no CRYCB pointer, then there is no need to set the
+	 * masks for the KVM guest
+	 */
+	if (kvm->arch.crypto.crycbd) {
+		down_read(&matrix_mdev->matrix.rwsem);
 		kvm_arch_crypto_set_masks(kvm,
 					  matrix_mdev->matrix.apm,
 					  matrix_mdev->matrix.aqm,
 					  matrix_mdev->matrix.adm);
-		mutex_lock(&matrix_dev->lock);
-		kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
-		matrix_mdev->kvm = kvm;
-		matrix_mdev->kvm_busy = false;
-		wake_up_all(&matrix_mdev->wait_for_kvm);
+		up_read(&matrix_mdev->matrix.rwsem);
 	}
 
 	return 0;
@@ -1165,60 +1116,48 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
  *
  * @matrix_mdev: a matrix mediated device
  *
- * Performs clean-up of resources no longer needed by @matrix_mdev.
- *
- * Note: The matrix_dev->lock must be taken prior to calling
- * this function; however, the lock will be temporarily released while the
- * guest's AP configuration is cleared to avoid a potential lockdep splat.
- * The kvm->lock is taken to clear the guest's AP configuration which, under
- * certain circumstances, will result in a circular lock dependency if this is
- * done under the @matrix_mdev->lock.
- *
+ * Unplugs the adapters, domains and control domains from the guest, clears
+ * the KVM pointer from @matrix_mdev and nullifies the pqap_hook pointer.
  */
 static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
 {
-	/*
-	 * If the KVM pointer is in the process of being set, wait until the
-	 * process has completed.
-	 */
-	wait_event_cmd(matrix_mdev->wait_for_kvm,
-		       !matrix_mdev->kvm_busy,
-		       mutex_unlock(&matrix_dev->lock),
-		       mutex_lock(&matrix_dev->lock));
-
 	if (matrix_mdev->kvm) {
-		matrix_mdev->kvm_busy = true;
-		mutex_unlock(&matrix_dev->lock);
-		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
-		mutex_lock(&matrix_dev->lock);
-		vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
+		if (matrix_mdev->kvm->arch.crypto.crycbd)
+			kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
+
+		down_write(&matrix_mdev->rwsem);
 		matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
+		vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
 		kvm_put_kvm(matrix_mdev->kvm);
 		matrix_mdev->kvm = NULL;
-		matrix_mdev->kvm_busy = false;
-		wake_up_all(&matrix_mdev->wait_for_kvm);
+		up_write(&matrix_mdev->rwsem);
 	}
 }
 
 static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
 				       unsigned long action, void *data)
 {
-	int notify_rc = NOTIFY_OK;
+	int ret, notify_rc = NOTIFY_OK;
 	struct ap_matrix_mdev *matrix_mdev;
 
 	if (action != VFIO_GROUP_NOTIFY_SET_KVM)
 		return NOTIFY_OK;
 
-	mutex_lock(&matrix_dev->lock);
 	matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier);
 
-	if (!data)
-		vfio_ap_mdev_unset_kvm(matrix_mdev);
-	else if (vfio_ap_mdev_set_kvm(matrix_mdev, data))
-		notify_rc = NOTIFY_DONE;
+	if (!data) {
+		if (matrix_mdev->kvm)
+			vfio_ap_mdev_unset_kvm(matrix_mdev);
+		goto notify_done;
+	}
 
-	mutex_unlock(&matrix_dev->lock);
+	ret = vfio_ap_mdev_set_kvm(matrix_mdev, data);
+	if (ret) {
+		notify_rc = NOTIFY_DONE;
+		goto notify_done;
+	}
 
+notify_done:
 	return notify_rc;
 }
 
@@ -1321,7 +1260,6 @@ static int vfio_ap_mdev_open(struct mdev_device *mdev)
 	unsigned long events;
 	int ret;
 
-
 	if (!try_module_get(THIS_MODULE))
 		return -ENODEV;
 
@@ -1352,14 +1290,13 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
 {
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 
-	mutex_lock(&matrix_dev->lock);
-	vfio_ap_mdev_unset_kvm(matrix_mdev);
-	mutex_unlock(&matrix_dev->lock);
-
 	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
 				 &matrix_mdev->iommu_notifier);
 	vfio_unregister_notifier(mdev_dev(mdev), VFIO_GROUP_NOTIFY,
 				 &matrix_mdev->group_notifier);
+
+	vfio_ap_mdev_unset_kvm(matrix_mdev);
+
 	module_put(THIS_MODULE);
 }
 
@@ -1389,7 +1326,6 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
 	int ret;
 	struct ap_matrix_mdev *matrix_mdev;
 
-	mutex_lock(&matrix_dev->lock);
 	switch (cmd) {
 	case VFIO_DEVICE_GET_INFO:
 		ret = vfio_ap_mdev_get_device_info(arg);
@@ -1401,22 +1337,14 @@ static ssize_t vfio_ap_mdev_ioctl(struct mdev_device *mdev,
 			break;
 		}
 
-		/*
-		 * If the KVM pointer is in the process of being set, wait until
-		 * the process has completed.
-		 */
-		wait_event_cmd(matrix_mdev->wait_for_kvm,
-			       !matrix_mdev->kvm_busy,
-			       mutex_unlock(&matrix_dev->lock),
-			       mutex_lock(&matrix_dev->lock));
-
+		down_read(&matrix_mdev->rwsem);
 		ret = vfio_ap_mdev_reset_queues(mdev);
+		up_read(&matrix_mdev->rwsem);
 		break;
 	default:
 		ret = -EOPNOTSUPP;
 		break;
 	}
-	mutex_unlock(&matrix_dev->lock);
 
 	return ret;
 }
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index f82a6396acae..a163ac04ff8a 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -59,6 +59,8 @@ extern struct ap_matrix_dev *matrix_dev;
  * @aqm identifies the AP queues (domains) in the matrix
  * @adm_max: max domain number in @adm
  * @adm identifies the AP control domains in the matrix
+ * @rwsem: semaphore to ensure integrity when making changes to the @apm, @aqm
+ *	   and @adm while the guest's AP matrix is being updated.
  */
 struct ap_matrix {
 	unsigned long apm_max;
@@ -67,6 +69,7 @@ struct ap_matrix {
 	DECLARE_BITMAP(aqm, 256);
 	unsigned long adm_max;
 	DECLARE_BITMAP(adm, 256);
+	struct rw_semaphore rwsem;
 };
 
 /**
@@ -76,6 +79,8 @@ struct ap_matrix {
  *		mediated matrix device.
  * @group_notifier: notifier block used for specifying callback function for
  *		    handling the VFIO_GROUP_NOTIFY_SET_KVM event
+ * @rwsem	a semaphore to ensure integrity when making changes to the
+ *		structure's attribute values.
  * @kvm:	the struct holding guest's state
  */
 struct ap_matrix_mdev {
@@ -83,8 +88,7 @@ struct ap_matrix_mdev {
 	struct ap_matrix matrix;
 	struct notifier_block group_notifier;
 	struct notifier_block iommu_notifier;
-	bool kvm_busy;
-	wait_queue_head_t wait_for_kvm;
+	struct rw_semaphore rwsem;
 	struct kvm *kvm;
 	struct kvm_s390_module_hook pqap_hook;
 	struct mdev_device *mdev;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 3/3] s390/vfio-ap: r/w lock for PQAP interception handler function pointer
  2021-06-09 22:46 [PATCH 0/3] s390/vfio-ap: refactor mdev remove callback and locks Tony Krowiak
  2021-06-09 22:46 ` [PATCH 1/3] s390/vfio-ap: clean up mdev resources when remove callback invoked Tony Krowiak
  2021-06-09 22:46 ` [PATCH 2/3] s390/vfio-ap: introduce two new r/w locks to replace wait_queue_head_t Tony Krowiak
@ 2021-06-09 22:46 ` Tony Krowiak
  2021-06-11 17:06   ` Jason Gunthorpe
  2021-06-15  8:55   ` Christian Borntraeger
  2 siblings, 2 replies; 14+ messages in thread
From: Tony Krowiak @ 2021-06-09 22:46 UTC (permalink / raw)
  To: linux-s390, linux-kernel
  Cc: borntraeger, cohuck, pasic, jjherne, jgg, alex.williamson,
	kwankhede, frankja, david, imbrenda, hca

The function pointer to the interception handler for the PQAP instruction
can get changed during the interception process. Let's add a
semaphore to struct kvm_s390_crypto to control read/write access to the
function pointer contained therein.

The semaphore must be locked for write access by the vfio_ap device driver
when notified that the KVM pointer has been set or cleared. It must be
locked for read access by the interception framework when the PQAP
instruction is intercepted.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 arch/s390/include/asm/kvm_host.h      |  6 +++---
 arch/s390/kvm/kvm-s390.c              |  1 +
 arch/s390/kvm/priv.c                  |  6 +++---
 drivers/s390/crypto/vfio_ap_ops.c     | 14 ++++++++++----
 drivers/s390/crypto/vfio_ap_private.h |  2 +-
 5 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 8925f3969478..58edaa3f9602 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -803,14 +803,14 @@ struct kvm_s390_cpu_model {
 	unsigned short ibc;
 };
 
-struct kvm_s390_module_hook {
+struct kvm_s390_crypto_hook {
 	int (*hook)(struct kvm_vcpu *vcpu);
-	struct module *owner;
 };
 
 struct kvm_s390_crypto {
 	struct kvm_s390_crypto_cb *crycb;
-	struct kvm_s390_module_hook *pqap_hook;
+	struct rw_semaphore pqap_hook_rwsem;
+	struct kvm_s390_crypto_hook *pqap_hook;
 	__u32 crycbd;
 	__u8 aes_kw;
 	__u8 dea_kw;
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index 1296fc10f80c..418d910df569 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -2606,6 +2606,7 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
 {
 	kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
 	kvm_s390_set_crycb_format(kvm);
+	init_rwsem(&kvm->arch.crypto.pqap_hook_rwsem);
 
 	if (!test_kvm_facility(kvm, 76))
 		return;
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 9928f785c677..bbbd84ffe239 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -657,15 +657,15 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
 	 * Verify that the hook callback is registered, lock the owner
 	 * and call the hook.
 	 */
+	down_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
 	if (vcpu->kvm->arch.crypto.pqap_hook) {
-		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
-			return -EOPNOTSUPP;
 		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
-		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
 		if (!ret && vcpu->run->s.regs.gprs[1] & 0x00ff0000)
 			kvm_s390_set_psw_cc(vcpu, 3);
+		up_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
 		return ret;
 	}
+	up_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
 	/*
 	 * A vfio_driver must register a hook.
 	 * No hook means no driver to enable the SIE CRYCB and no queues.
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index d65a5728153b..2998c1b53ab9 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -342,7 +342,6 @@ static int vfio_ap_mdev_create(struct mdev_device *mdev)
 	init_rwsem(&matrix_mdev->rwsem);
 	mdev_set_drvdata(mdev, matrix_mdev);
 	matrix_mdev->pqap_hook.hook = handle_pqap;
-	matrix_mdev->pqap_hook.owner = THIS_MODULE;
 
 	mutex_lock(&matrix_dev->lock);
 	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
@@ -1063,7 +1062,6 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
 	down_write(&matrix_mdev->rwsem);
 	matrix_mdev->kvm = kvm;
 	kvm_get_kvm(kvm);
-	kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
 	up_write(&matrix_mdev->rwsem);
 
 	/*
@@ -1071,6 +1069,10 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
 	 * masks for the KVM guest
 	 */
 	if (kvm->arch.crypto.crycbd) {
+		down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
+		kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
+		up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
+
 		down_read(&matrix_mdev->matrix.rwsem);
 		kvm_arch_crypto_set_masks(kvm,
 					  matrix_mdev->matrix.apm,
@@ -1122,11 +1124,15 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
 static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
 {
 	if (matrix_mdev->kvm) {
-		if (matrix_mdev->kvm->arch.crypto.crycbd)
+		if (matrix_mdev->kvm->arch.crypto.crycbd) {
+			down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
+			matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
+			up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
+
 			kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
+		}
 
 		down_write(&matrix_mdev->rwsem);
-		matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
 		vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
 		kvm_put_kvm(matrix_mdev->kvm);
 		matrix_mdev->kvm = NULL;
diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
index a163ac04ff8a..3d6afd0faaaf 100644
--- a/drivers/s390/crypto/vfio_ap_private.h
+++ b/drivers/s390/crypto/vfio_ap_private.h
@@ -90,7 +90,7 @@ struct ap_matrix_mdev {
 	struct notifier_block iommu_notifier;
 	struct rw_semaphore rwsem;
 	struct kvm *kvm;
-	struct kvm_s390_module_hook pqap_hook;
+	struct kvm_s390_crypto_hook pqap_hook;
 	struct mdev_device *mdev;
 };
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] s390/vfio-ap: clean up mdev resources when remove callback invoked
  2021-06-09 22:46 ` [PATCH 1/3] s390/vfio-ap: clean up mdev resources when remove callback invoked Tony Krowiak
@ 2021-06-11 16:48   ` Jason Gunthorpe
  2021-06-14 17:29     ` Tony Krowiak
  2021-06-15  7:43     ` Christian Borntraeger
  0 siblings, 2 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2021-06-11 16:48 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, borntraeger, cohuck, pasic, jjherne,
	alex.williamson, kwankhede, frankja, david, imbrenda, hca

On Wed, Jun 09, 2021 at 06:46:32PM -0400, Tony Krowiak wrote:
> The mdev remove callback for the vfio_ap device driver bails out with
> -EBUSY if the mdev is in use by a KVM guest (i.e., the KVM pointer in the
> struct ap_matrix_mdev is not NULL). The intended purpose was
> to prevent the mdev from being removed while in use. There are two
> problems with this scenario:
> 
> 1. Returning a non-zero return code from the remove callback does not
>    prevent the removal of the mdev.
> 
> 2. The KVM pointer in the struct ap_matrix_mdev will always be NULL because
>    the remove callback will not get invoked until the mdev fd is closed.
>    When the mdev fd is closed, the mdev release callback is invoked and
>    clears the KVM pointer from the struct ap_matrix_mdev.
> 
> Let's go ahead and remove the check for KVM in the remove callback and
> allow the cleanup of mdev resources to proceed.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 10 ----------
>  1 file changed, 10 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] s390/vfio-ap: introduce two new r/w locks to replace wait_queue_head_t
  2021-06-09 22:46 ` [PATCH 2/3] s390/vfio-ap: introduce two new r/w locks to replace wait_queue_head_t Tony Krowiak
@ 2021-06-11 17:05   ` Jason Gunthorpe
  2021-06-11 17:11     ` David Hildenbrand
  0 siblings, 1 reply; 14+ messages in thread
From: Jason Gunthorpe @ 2021-06-11 17:05 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, borntraeger, cohuck, pasic, jjherne,
	alex.williamson, kwankhede, frankja, david, imbrenda, hca

On Wed, Jun 09, 2021 at 06:46:33PM -0400, Tony Krowiak wrote:
> This patch introduces two new r/w locks to replace the wait_queue_head_t
> that was introduced to fix a lockdep splat reported when testing
> pass-through of AP queues to a Secure Execution guest. This was the
> abbreviated dependency chain reported by lockdep that was fixed using
> a wait queue:
> 
> kvm_arch_crypto_set_masks+0x4a/0x2b8 [kvm]        kvm->lock
> vfio_ap_mdev_group_notifier+0x154/0x170 [vfio_ap] matrix_dev->lock
> 
> handle_pqap+0x56/0x1d0 [vfio_ap]    matrix_dev->lock
> kvm_vcpu_ioctl+0x2cc/0x898 [kvm]    vcpu->mutex
> 
> kvm_s390_cpus_to_pv+0x4e/0xf8 [kvm]   vcpu->mutex
> kvm_arch_vm_ioctl+0x3ec/0x550 [kvm]   kvm->lock

Is the problem larger than kvm_arch_crypto_set_masks()? If not it
looks easy enough to fix, just pull the kvm->lock out of
kvm_arch_crypto_set_masks() and obtain it in vfio_ap_mdev_set_kvm()
before the rwsem. Now your locks are in the right order and all should
be well?

> +static int vfio_ap_mdev_matrix_store_lock(struct ap_matrix_mdev *matrix_mdev)
> +{
> +	if (!down_write_trylock(&matrix_mdev->rwsem))
> +		return -EBUSY;
> +
> +	if (matrix_mdev->kvm) {
> +		up_write(&matrix_mdev->rwsem);
> +		return -EBUSY;
> +	}
> +
> +	if (!down_write_trylock(&matrix_mdev->matrix.rwsem)) {
> +		up_write(&matrix_mdev->rwsem);
> +		return -EBUSY;
> +	}
> +
> +	return 0;
> +}

This double locking is quite strange, at least it deserves a detailed
comment? The comments suggest these locks protect distinct data so..

> +
> +	ret = vfio_ap_mdev_matrix_store_lock(matrix_mdev);
> +	if (ret)
> +		return ret;
>  
>  	clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm);

here it obtained both locks but only touched matrix.aqm which is only
protected by the inner lock - what was the point of obtaining the
outer lock?

Also, not convinced down_write_trylock() is appropriate from a sysfs
callback, it should block and wait, surely? Otherwise userspace gets
random racy failures depending on what the kernel is doing??

Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] s390/vfio-ap: r/w lock for PQAP interception handler function pointer
  2021-06-09 22:46 ` [PATCH 3/3] s390/vfio-ap: r/w lock for PQAP interception handler function pointer Tony Krowiak
@ 2021-06-11 17:06   ` Jason Gunthorpe
  2021-06-15  8:55   ` Christian Borntraeger
  1 sibling, 0 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2021-06-11 17:06 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, borntraeger, cohuck, pasic, jjherne,
	alex.williamson, kwankhede, frankja, david, imbrenda, hca

On Wed, Jun 09, 2021 at 06:46:34PM -0400, Tony Krowiak wrote:
> The function pointer to the interception handler for the PQAP instruction
> can get changed during the interception process. Let's add a
> semaphore to struct kvm_s390_crypto to control read/write access to the
> function pointer contained therein.
> 
> The semaphore must be locked for write access by the vfio_ap device driver
> when notified that the KVM pointer has been set or cleared. It must be
> locked for read access by the interception framework when the PQAP
> instruction is intercepted.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  arch/s390/include/asm/kvm_host.h      |  6 +++---
>  arch/s390/kvm/kvm-s390.c              |  1 +
>  arch/s390/kvm/priv.c                  |  6 +++---
>  drivers/s390/crypto/vfio_ap_ops.c     | 14 ++++++++++----
>  drivers/s390/crypto/vfio_ap_private.h |  2 +-
>  5 files changed, 18 insertions(+), 11 deletions(-)

This is alot better than the try_module_get!

Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] s390/vfio-ap: introduce two new r/w locks to replace wait_queue_head_t
  2021-06-11 17:05   ` Jason Gunthorpe
@ 2021-06-11 17:11     ` David Hildenbrand
  2021-06-11 17:41       ` Jason Gunthorpe
  0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2021-06-11 17:11 UTC (permalink / raw)
  To: Jason Gunthorpe, Tony Krowiak
  Cc: linux-s390, linux-kernel, borntraeger, cohuck, pasic, jjherne,
	alex.williamson, kwankhede, frankja, imbrenda, hca

On 11.06.21 19:05, Jason Gunthorpe wrote:
> On Wed, Jun 09, 2021 at 06:46:33PM -0400, Tony Krowiak wrote:
>> This patch introduces two new r/w locks to replace the wait_queue_head_t
>> that was introduced to fix a lockdep splat reported when testing
>> pass-through of AP queues to a Secure Execution guest. This was the
>> abbreviated dependency chain reported by lockdep that was fixed using
>> a wait queue:
>>
>> kvm_arch_crypto_set_masks+0x4a/0x2b8 [kvm]        kvm->lock
>> vfio_ap_mdev_group_notifier+0x154/0x170 [vfio_ap] matrix_dev->lock
>>
>> handle_pqap+0x56/0x1d0 [vfio_ap]    matrix_dev->lock
>> kvm_vcpu_ioctl+0x2cc/0x898 [kvm]    vcpu->mutex
>>
>> kvm_s390_cpus_to_pv+0x4e/0xf8 [kvm]   vcpu->mutex
>> kvm_arch_vm_ioctl+0x3ec/0x550 [kvm]   kvm->lock
> 
> Is the problem larger than kvm_arch_crypto_set_masks()? If not it
> looks easy enough to fix, just pull the kvm->lock out of
> kvm_arch_crypto_set_masks() and obtain it in vfio_ap_mdev_set_kvm()
> before the rwsem. Now your locks are in the right order and all should
> be well?
> 
>> +static int vfio_ap_mdev_matrix_store_lock(struct ap_matrix_mdev *matrix_mdev)
>> +{
>> +	if (!down_write_trylock(&matrix_mdev->rwsem))
>> +		return -EBUSY;
>> +
>> +	if (matrix_mdev->kvm) {
>> +		up_write(&matrix_mdev->rwsem);
>> +		return -EBUSY;
>> +	}
>> +
>> +	if (!down_write_trylock(&matrix_mdev->matrix.rwsem)) {
>> +		up_write(&matrix_mdev->rwsem);
>> +		return -EBUSY;
>> +	}
>> +
>> +	return 0;
>> +}
> 
> This double locking is quite strange, at least it deserves a detailed
> comment? The comments suggest these locks protect distinct data so..
> 
>> +
>> +	ret = vfio_ap_mdev_matrix_store_lock(matrix_mdev);
>> +	if (ret)
>> +		return ret;
>>   
>>   	clear_bit_inv((unsigned long)apqi, matrix_mdev->matrix.aqm);
> 
> here it obtained both locks but only touched matrix.aqm which is only
> protected by the inner lock - what was the point of obtaining the
> outer lock?
> 
> Also, not convinced down_write_trylock() is appropriate from a sysfs
> callback, it should block and wait, surely? Otherwise userspace gets
> random racy failures depending on what the kernel is doing??

It might we worth exploring lock_device_hotplug_sysfs() which does a

"return restart_syscall()" with some delay.


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/3] s390/vfio-ap: introduce two new r/w locks to replace wait_queue_head_t
  2021-06-11 17:11     ` David Hildenbrand
@ 2021-06-11 17:41       ` Jason Gunthorpe
  0 siblings, 0 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2021-06-11 17:41 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Tony Krowiak, linux-s390, linux-kernel, borntraeger, cohuck,
	pasic, jjherne, alex.williamson, kwankhede, frankja, imbrenda,
	hca

On Fri, Jun 11, 2021 at 07:11:50PM +0200, David Hildenbrand wrote:

> > Also, not convinced down_write_trylock() is appropriate from a sysfs
> > callback, it should block and wait, surely? Otherwise userspace gets
> > random racy failures depending on what the kernel is doing??
> 
> It might we worth exploring lock_device_hotplug_sysfs() which does a
> 
> "return restart_syscall()" with some delay.

The ideal design from a sysfs should be a single
down_write_killable().

restart_syscall will just create a weird spinlock that is hopefully
unlikely to spin :\

Jason

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] s390/vfio-ap: clean up mdev resources when remove callback invoked
  2021-06-11 16:48   ` Jason Gunthorpe
@ 2021-06-14 17:29     ` Tony Krowiak
  2021-06-15  7:43     ` Christian Borntraeger
  1 sibling, 0 replies; 14+ messages in thread
From: Tony Krowiak @ 2021-06-14 17:29 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: linux-s390, linux-kernel, borntraeger, cohuck, pasic, jjherne,
	alex.williamson, kwankhede, frankja, david, imbrenda, hca



On 6/11/21 12:48 PM, Jason Gunthorpe wrote:
> Reviewed-by: Jason Gunthorpe<jgg@nvidia.com>

Thanks for the review.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/3] s390/vfio-ap: clean up mdev resources when remove callback invoked
  2021-06-11 16:48   ` Jason Gunthorpe
  2021-06-14 17:29     ` Tony Krowiak
@ 2021-06-15  7:43     ` Christian Borntraeger
  1 sibling, 0 replies; 14+ messages in thread
From: Christian Borntraeger @ 2021-06-15  7:43 UTC (permalink / raw)
  To: Jason Gunthorpe, Tony Krowiak
  Cc: linux-s390, linux-kernel, cohuck, pasic, jjherne,
	alex.williamson, kwankhede, frankja, david, imbrenda, hca



On 11.06.21 18:48, Jason Gunthorpe wrote:
> On Wed, Jun 09, 2021 at 06:46:32PM -0400, Tony Krowiak wrote:
>> The mdev remove callback for the vfio_ap device driver bails out with
>> -EBUSY if the mdev is in use by a KVM guest (i.e., the KVM pointer in the
>> struct ap_matrix_mdev is not NULL). The intended purpose was
>> to prevent the mdev from being removed while in use. There are two
>> problems with this scenario:
>>
>> 1. Returning a non-zero return code from the remove callback does not
>>     prevent the removal of the mdev.
>>
>> 2. The KVM pointer in the struct ap_matrix_mdev will always be NULL because
>>     the remove callback will not get invoked until the mdev fd is closed.
>>     When the mdev fd is closed, the mdev release callback is invoked and
>>     clears the KVM pointer from the struct ap_matrix_mdev.
>>
>> Let's go ahead and remove the check for KVM in the remove callback and
>> allow the cleanup of mdev resources to proceed.
>>
>> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
>> ---
>>   drivers/s390/crypto/vfio_ap_ops.c | 10 ----------
>>   1 file changed, 10 deletions(-)
> 
> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>

Jason, I guess you want this patch still in 5.13, the other 2 can be 5.14?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] s390/vfio-ap: r/w lock for PQAP interception handler function pointer
  2021-06-09 22:46 ` [PATCH 3/3] s390/vfio-ap: r/w lock for PQAP interception handler function pointer Tony Krowiak
  2021-06-11 17:06   ` Jason Gunthorpe
@ 2021-06-15  8:55   ` Christian Borntraeger
  2021-06-15 18:08     ` Tony Krowiak
  2021-06-15 18:55     ` Tony Krowiak
  1 sibling, 2 replies; 14+ messages in thread
From: Christian Borntraeger @ 2021-06-15  8:55 UTC (permalink / raw)
  To: Tony Krowiak, linux-s390, linux-kernel
  Cc: cohuck, pasic, jjherne, jgg, alex.williamson, kwankhede, frankja,
	david, imbrenda, hca

On 10.06.21 00:46, Tony Krowiak wrote:
> The function pointer to the interception handler for the PQAP instruction
> can get changed during the interception process. Let's add a
> semaphore to struct kvm_s390_crypto to control read/write access to the
> function pointer contained therein.
> 
> The semaphore must be locked for write access by the vfio_ap device driver
> when notified that the KVM pointer has been set or cleared. It must be
> locked for read access by the interception framework when the PQAP
> instruction is intercepted.
> 
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>

Given that patch 2 is still  under discussion. Can this patch go without patch 2?
> ---
>   arch/s390/include/asm/kvm_host.h      |  6 +++---
>   arch/s390/kvm/kvm-s390.c              |  1 +
>   arch/s390/kvm/priv.c                  |  6 +++---
>   drivers/s390/crypto/vfio_ap_ops.c     | 14 ++++++++++----
>   drivers/s390/crypto/vfio_ap_private.h |  2 +-
>   5 files changed, 18 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
> index 8925f3969478..58edaa3f9602 100644
> --- a/arch/s390/include/asm/kvm_host.h
> +++ b/arch/s390/include/asm/kvm_host.h
> @@ -803,14 +803,14 @@ struct kvm_s390_cpu_model {
>   	unsigned short ibc;
>   };
>   
> -struct kvm_s390_module_hook {
> +struct kvm_s390_crypto_hook {
>   	int (*hook)(struct kvm_vcpu *vcpu);
> -	struct module *owner;
>   };
>   
>   struct kvm_s390_crypto {
>   	struct kvm_s390_crypto_cb *crycb;
> -	struct kvm_s390_module_hook *pqap_hook;
> +	struct rw_semaphore pqap_hook_rwsem;
> +	struct kvm_s390_crypto_hook *pqap_hook;
>   	__u32 crycbd;
>   	__u8 aes_kw;
>   	__u8 dea_kw;
> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
> index 1296fc10f80c..418d910df569 100644
> --- a/arch/s390/kvm/kvm-s390.c
> +++ b/arch/s390/kvm/kvm-s390.c
> @@ -2606,6 +2606,7 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
>   {
>   	kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>   	kvm_s390_set_crycb_format(kvm);
> +	init_rwsem(&kvm->arch.crypto.pqap_hook_rwsem);
>   
>   	if (!test_kvm_facility(kvm, 76))
>   		return;
> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
> index 9928f785c677..bbbd84ffe239 100644
> --- a/arch/s390/kvm/priv.c
> +++ b/arch/s390/kvm/priv.c
> @@ -657,15 +657,15 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
>   	 * Verify that the hook callback is registered, lock the owner
>   	 * and call the hook.
>   	 */
> +	down_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
>   	if (vcpu->kvm->arch.crypto.pqap_hook) {
> -		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
> -			return -EOPNOTSUPP;
>   		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
> -		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>   		if (!ret && vcpu->run->s.regs.gprs[1] & 0x00ff0000)
>   			kvm_s390_set_psw_cc(vcpu, 3);
> +		up_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
>   		return ret;
>   	}
> +	up_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
>   	/*
>   	 * A vfio_driver must register a hook.
>   	 * No hook means no driver to enable the SIE CRYCB and no queues.
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index d65a5728153b..2998c1b53ab9 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -342,7 +342,6 @@ static int vfio_ap_mdev_create(struct mdev_device *mdev)
>   	init_rwsem(&matrix_mdev->rwsem);
>   	mdev_set_drvdata(mdev, matrix_mdev);
>   	matrix_mdev->pqap_hook.hook = handle_pqap;
> -	matrix_mdev->pqap_hook.owner = THIS_MODULE;
>   
>   	mutex_lock(&matrix_dev->lock);
>   	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
> @@ -1063,7 +1062,6 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
>   	down_write(&matrix_mdev->rwsem);
>   	matrix_mdev->kvm = kvm;
>   	kvm_get_kvm(kvm);
> -	kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
>   	up_write(&matrix_mdev->rwsem);
>   
>   	/*
> @@ -1071,6 +1069,10 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
>   	 * masks for the KVM guest
>   	 */
>   	if (kvm->arch.crypto.crycbd) {
> +		down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
> +		kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
> +		up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
> +
>   		down_read(&matrix_mdev->matrix.rwsem);
>   		kvm_arch_crypto_set_masks(kvm,
>   					  matrix_mdev->matrix.apm,
> @@ -1122,11 +1124,15 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
>   static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
>   {
>   	if (matrix_mdev->kvm) {
> -		if (matrix_mdev->kvm->arch.crypto.crycbd)
> +		if (matrix_mdev->kvm->arch.crypto.crycbd) {
> +			down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
> +			matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> +			up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
> +
>   			kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> +		}
>   
>   		down_write(&matrix_mdev->rwsem);
> -		matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
>   		vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
>   		kvm_put_kvm(matrix_mdev->kvm);
>   		matrix_mdev->kvm = NULL;
> diff --git a/drivers/s390/crypto/vfio_ap_private.h b/drivers/s390/crypto/vfio_ap_private.h
> index a163ac04ff8a..3d6afd0faaaf 100644
> --- a/drivers/s390/crypto/vfio_ap_private.h
> +++ b/drivers/s390/crypto/vfio_ap_private.h
> @@ -90,7 +90,7 @@ struct ap_matrix_mdev {
>   	struct notifier_block iommu_notifier;
>   	struct rw_semaphore rwsem;
>   	struct kvm *kvm;
> -	struct kvm_s390_module_hook pqap_hook;
> +	struct kvm_s390_crypto_hook pqap_hook;
>   	struct mdev_device *mdev;
>   };
>   
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] s390/vfio-ap: r/w lock for PQAP interception handler function pointer
  2021-06-15  8:55   ` Christian Borntraeger
@ 2021-06-15 18:08     ` Tony Krowiak
  2021-06-15 18:55     ` Tony Krowiak
  1 sibling, 0 replies; 14+ messages in thread
From: Tony Krowiak @ 2021-06-15 18:08 UTC (permalink / raw)
  To: Christian Borntraeger, linux-s390, linux-kernel
  Cc: cohuck, pasic, jjherne, jgg, alex.williamson, kwankhede, frankja,
	david, imbrenda, hca



On 6/15/21 4:55 AM, Christian Borntraeger wrote:
> On 10.06.21 00:46, Tony Krowiak wrote:
>> The function pointer to the interception handler for the PQAP 
>> instruction
>> can get changed during the interception process. Let's add a
>> semaphore to struct kvm_s390_crypto to control read/write access to the
>> function pointer contained therein.
>>
>> The semaphore must be locked for write access by the vfio_ap device 
>> driver
>> when notified that the KVM pointer has been set or cleared. It must be
>> locked for read access by the interception framework when the PQAP
>> instruction is intercepted.
>>
>> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
>
> Given that patch 2 is still  under discussion. Can this patch go 
> without patch 2?

Two things: I don't know if this patch would go on cleanly since patch 2 
intervenes;
This patch has not been tested without patch 2.

>> ---
>>   arch/s390/include/asm/kvm_host.h      |  6 +++---
>>   arch/s390/kvm/kvm-s390.c              |  1 +
>>   arch/s390/kvm/priv.c                  |  6 +++---
>>   drivers/s390/crypto/vfio_ap_ops.c     | 14 ++++++++++----
>>   drivers/s390/crypto/vfio_ap_private.h |  2 +-
>>   5 files changed, 18 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/s390/include/asm/kvm_host.h 
>> b/arch/s390/include/asm/kvm_host.h
>> index 8925f3969478..58edaa3f9602 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -803,14 +803,14 @@ struct kvm_s390_cpu_model {
>>       unsigned short ibc;
>>   };
>>   -struct kvm_s390_module_hook {
>> +struct kvm_s390_crypto_hook {
>>       int (*hook)(struct kvm_vcpu *vcpu);
>> -    struct module *owner;
>>   };
>>     struct kvm_s390_crypto {
>>       struct kvm_s390_crypto_cb *crycb;
>> -    struct kvm_s390_module_hook *pqap_hook;
>> +    struct rw_semaphore pqap_hook_rwsem;
>> +    struct kvm_s390_crypto_hook *pqap_hook;
>>       __u32 crycbd;
>>       __u8 aes_kw;
>>       __u8 dea_kw;
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 1296fc10f80c..418d910df569 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -2606,6 +2606,7 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
>>   {
>>       kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>       kvm_s390_set_crycb_format(kvm);
>> +    init_rwsem(&kvm->arch.crypto.pqap_hook_rwsem);
>>         if (!test_kvm_facility(kvm, 76))
>>           return;
>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>> index 9928f785c677..bbbd84ffe239 100644
>> --- a/arch/s390/kvm/priv.c
>> +++ b/arch/s390/kvm/priv.c
>> @@ -657,15 +657,15 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
>>        * Verify that the hook callback is registered, lock the owner
>>        * and call the hook.
>>        */
>> + down_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
>>       if (vcpu->kvm->arch.crypto.pqap_hook) {
>> -        if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
>> -            return -EOPNOTSUPP;
>>           ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
>> - module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>>           if (!ret && vcpu->run->s.regs.gprs[1] & 0x00ff0000)
>>               kvm_s390_set_psw_cc(vcpu, 3);
>> + up_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
>>           return ret;
>>       }
>> +    up_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
>>       /*
>>        * A vfio_driver must register a hook.
>>        * No hook means no driver to enable the SIE CRYCB and no queues.
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c 
>> b/drivers/s390/crypto/vfio_ap_ops.c
>> index d65a5728153b..2998c1b53ab9 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -342,7 +342,6 @@ static int vfio_ap_mdev_create(struct mdev_device 
>> *mdev)
>>       init_rwsem(&matrix_mdev->rwsem);
>>       mdev_set_drvdata(mdev, matrix_mdev);
>>       matrix_mdev->pqap_hook.hook = handle_pqap;
>> -    matrix_mdev->pqap_hook.owner = THIS_MODULE;
>>         mutex_lock(&matrix_dev->lock);
>>       list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
>> @@ -1063,7 +1062,6 @@ static int vfio_ap_mdev_set_kvm(struct 
>> ap_matrix_mdev *matrix_mdev,
>>       down_write(&matrix_mdev->rwsem);
>>       matrix_mdev->kvm = kvm;
>>       kvm_get_kvm(kvm);
>> -    kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
>>       up_write(&matrix_mdev->rwsem);
>>         /*
>> @@ -1071,6 +1069,10 @@ static int vfio_ap_mdev_set_kvm(struct 
>> ap_matrix_mdev *matrix_mdev,
>>        * masks for the KVM guest
>>        */
>>       if (kvm->arch.crypto.crycbd) {
>> + down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>> +        kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
>> + up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>> +
>>           down_read(&matrix_mdev->matrix.rwsem);
>>           kvm_arch_crypto_set_masks(kvm,
>>                         matrix_mdev->matrix.apm,
>> @@ -1122,11 +1124,15 @@ static int vfio_ap_mdev_iommu_notifier(struct 
>> notifier_block *nb,
>>   static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
>>   {
>>       if (matrix_mdev->kvm) {
>> -        if (matrix_mdev->kvm->arch.crypto.crycbd)
>> +        if (matrix_mdev->kvm->arch.crypto.crycbd) {
>> + down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>> +            matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
>> + up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>> +
>>               kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>> +        }
>>             down_write(&matrix_mdev->rwsem);
>> -        matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
>>           vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
>>           kvm_put_kvm(matrix_mdev->kvm);
>>           matrix_mdev->kvm = NULL;
>> diff --git a/drivers/s390/crypto/vfio_ap_private.h 
>> b/drivers/s390/crypto/vfio_ap_private.h
>> index a163ac04ff8a..3d6afd0faaaf 100644
>> --- a/drivers/s390/crypto/vfio_ap_private.h
>> +++ b/drivers/s390/crypto/vfio_ap_private.h
>> @@ -90,7 +90,7 @@ struct ap_matrix_mdev {
>>       struct notifier_block iommu_notifier;
>>       struct rw_semaphore rwsem;
>>       struct kvm *kvm;
>> -    struct kvm_s390_module_hook pqap_hook;
>> +    struct kvm_s390_crypto_hook pqap_hook;
>>       struct mdev_device *mdev;
>>   };
>>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/3] s390/vfio-ap: r/w lock for PQAP interception handler function pointer
  2021-06-15  8:55   ` Christian Borntraeger
  2021-06-15 18:08     ` Tony Krowiak
@ 2021-06-15 18:55     ` Tony Krowiak
  1 sibling, 0 replies; 14+ messages in thread
From: Tony Krowiak @ 2021-06-15 18:55 UTC (permalink / raw)
  To: Christian Borntraeger, linux-s390, linux-kernel
  Cc: cohuck, pasic, jjherne, jgg, alex.williamson, kwankhede, frankja,
	david, imbrenda, hca



On 6/15/21 4:55 AM, Christian Borntraeger wrote:
> On 10.06.21 00:46, Tony Krowiak wrote:
>> The function pointer to the interception handler for the PQAP 
>> instruction
>> can get changed during the interception process. Let's add a
>> semaphore to struct kvm_s390_crypto to control read/write access to the
>> function pointer contained therein.
>>
>> The semaphore must be locked for write access by the vfio_ap device 
>> driver
>> when notified that the KVM pointer has been set or cleared. It must be
>> locked for read access by the interception framework when the PQAP
>> instruction is intercepted.
>>
>> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
>
> Given that patch 2 is still  under discussion. Can this patch go 
> without patch 2?

Here's what I think I'm going to do. Patches 1 and 3 are related to the 
patch
I posted to resolve the FIXME in the vfio_ap_mdev_remove() callback in a 
patch
posted by Jason Gunthorpe (Message ID: 
<6-v1-d88406ed308e+418-vfio3_jgg@nvidia.com>).
Patch 2 in this series was precipitated by a comment Jason made that was
not directly related to that fix, so I will put a two patch series
with patch 1 and 3, then test them and post them here before proceeding
to resolving the issues in patch 2 which I will post separately.

>> ---
>>   arch/s390/include/asm/kvm_host.h      |  6 +++---
>>   arch/s390/kvm/kvm-s390.c              |  1 +
>>   arch/s390/kvm/priv.c                  |  6 +++---
>>   drivers/s390/crypto/vfio_ap_ops.c     | 14 ++++++++++----
>>   drivers/s390/crypto/vfio_ap_private.h |  2 +-
>>   5 files changed, 18 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/s390/include/asm/kvm_host.h 
>> b/arch/s390/include/asm/kvm_host.h
>> index 8925f3969478..58edaa3f9602 100644
>> --- a/arch/s390/include/asm/kvm_host.h
>> +++ b/arch/s390/include/asm/kvm_host.h
>> @@ -803,14 +803,14 @@ struct kvm_s390_cpu_model {
>>       unsigned short ibc;
>>   };
>>   -struct kvm_s390_module_hook {
>> +struct kvm_s390_crypto_hook {
>>       int (*hook)(struct kvm_vcpu *vcpu);
>> -    struct module *owner;
>>   };
>>     struct kvm_s390_crypto {
>>       struct kvm_s390_crypto_cb *crycb;
>> -    struct kvm_s390_module_hook *pqap_hook;
>> +    struct rw_semaphore pqap_hook_rwsem;
>> +    struct kvm_s390_crypto_hook *pqap_hook;
>>       __u32 crycbd;
>>       __u8 aes_kw;
>>       __u8 dea_kw;
>> diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
>> index 1296fc10f80c..418d910df569 100644
>> --- a/arch/s390/kvm/kvm-s390.c
>> +++ b/arch/s390/kvm/kvm-s390.c
>> @@ -2606,6 +2606,7 @@ static void kvm_s390_crypto_init(struct kvm *kvm)
>>   {
>>       kvm->arch.crypto.crycb = &kvm->arch.sie_page2->crycb;
>>       kvm_s390_set_crycb_format(kvm);
>> +    init_rwsem(&kvm->arch.crypto.pqap_hook_rwsem);
>>         if (!test_kvm_facility(kvm, 76))
>>           return;
>> diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
>> index 9928f785c677..bbbd84ffe239 100644
>> --- a/arch/s390/kvm/priv.c
>> +++ b/arch/s390/kvm/priv.c
>> @@ -657,15 +657,15 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
>>        * Verify that the hook callback is registered, lock the owner
>>        * and call the hook.
>>        */
>> + down_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
>>       if (vcpu->kvm->arch.crypto.pqap_hook) {
>> -        if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
>> -            return -EOPNOTSUPP;
>>           ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
>> - module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
>>           if (!ret && vcpu->run->s.regs.gprs[1] & 0x00ff0000)
>>               kvm_s390_set_psw_cc(vcpu, 3);
>> + up_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
>>           return ret;
>>       }
>> +    up_read(&vcpu->kvm->arch.crypto.pqap_hook_rwsem);
>>       /*
>>        * A vfio_driver must register a hook.
>>        * No hook means no driver to enable the SIE CRYCB and no queues.
>> diff --git a/drivers/s390/crypto/vfio_ap_ops.c 
>> b/drivers/s390/crypto/vfio_ap_ops.c
>> index d65a5728153b..2998c1b53ab9 100644
>> --- a/drivers/s390/crypto/vfio_ap_ops.c
>> +++ b/drivers/s390/crypto/vfio_ap_ops.c
>> @@ -342,7 +342,6 @@ static int vfio_ap_mdev_create(struct mdev_device 
>> *mdev)
>>       init_rwsem(&matrix_mdev->rwsem);
>>       mdev_set_drvdata(mdev, matrix_mdev);
>>       matrix_mdev->pqap_hook.hook = handle_pqap;
>> -    matrix_mdev->pqap_hook.owner = THIS_MODULE;
>>         mutex_lock(&matrix_dev->lock);
>>       list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
>> @@ -1063,7 +1062,6 @@ static int vfio_ap_mdev_set_kvm(struct 
>> ap_matrix_mdev *matrix_mdev,
>>       down_write(&matrix_mdev->rwsem);
>>       matrix_mdev->kvm = kvm;
>>       kvm_get_kvm(kvm);
>> -    kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
>>       up_write(&matrix_mdev->rwsem);
>>         /*
>> @@ -1071,6 +1069,10 @@ static int vfio_ap_mdev_set_kvm(struct 
>> ap_matrix_mdev *matrix_mdev,
>>        * masks for the KVM guest
>>        */
>>       if (kvm->arch.crypto.crycbd) {
>> + down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>> +        kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
>> + up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>> +
>>           down_read(&matrix_mdev->matrix.rwsem);
>>           kvm_arch_crypto_set_masks(kvm,
>>                         matrix_mdev->matrix.apm,
>> @@ -1122,11 +1124,15 @@ static int vfio_ap_mdev_iommu_notifier(struct 
>> notifier_block *nb,
>>   static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
>>   {
>>       if (matrix_mdev->kvm) {
>> -        if (matrix_mdev->kvm->arch.crypto.crycbd)
>> +        if (matrix_mdev->kvm->arch.crypto.crycbd) {
>> + down_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>> +            matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
>> + up_write(&matrix_mdev->kvm->arch.crypto.pqap_hook_rwsem);
>> +
>>               kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>> +        }
>>             down_write(&matrix_mdev->rwsem);
>> -        matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
>>           vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
>>           kvm_put_kvm(matrix_mdev->kvm);
>>           matrix_mdev->kvm = NULL;
>> diff --git a/drivers/s390/crypto/vfio_ap_private.h 
>> b/drivers/s390/crypto/vfio_ap_private.h
>> index a163ac04ff8a..3d6afd0faaaf 100644
>> --- a/drivers/s390/crypto/vfio_ap_private.h
>> +++ b/drivers/s390/crypto/vfio_ap_private.h
>> @@ -90,7 +90,7 @@ struct ap_matrix_mdev {
>>       struct notifier_block iommu_notifier;
>>       struct rw_semaphore rwsem;
>>       struct kvm *kvm;
>> -    struct kvm_s390_module_hook pqap_hook;
>> +    struct kvm_s390_crypto_hook pqap_hook;
>>       struct mdev_device *mdev;
>>   };
>>


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2021-06-15 18:55 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-09 22:46 [PATCH 0/3] s390/vfio-ap: refactor mdev remove callback and locks Tony Krowiak
2021-06-09 22:46 ` [PATCH 1/3] s390/vfio-ap: clean up mdev resources when remove callback invoked Tony Krowiak
2021-06-11 16:48   ` Jason Gunthorpe
2021-06-14 17:29     ` Tony Krowiak
2021-06-15  7:43     ` Christian Borntraeger
2021-06-09 22:46 ` [PATCH 2/3] s390/vfio-ap: introduce two new r/w locks to replace wait_queue_head_t Tony Krowiak
2021-06-11 17:05   ` Jason Gunthorpe
2021-06-11 17:11     ` David Hildenbrand
2021-06-11 17:41       ` Jason Gunthorpe
2021-06-09 22:46 ` [PATCH 3/3] s390/vfio-ap: r/w lock for PQAP interception handler function pointer Tony Krowiak
2021-06-11 17:06   ` Jason Gunthorpe
2021-06-15  8:55   ` Christian Borntraeger
2021-06-15 18:08     ` Tony Krowiak
2021-06-15 18:55     ` Tony Krowiak

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).