linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/1] s390/vfio-ap: fix circular lockdep when staring SE guest
@ 2021-02-16  1:15 Tony Krowiak
  2021-02-16  1:15 ` [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks Tony Krowiak
  0 siblings, 1 reply; 12+ messages in thread
From: Tony Krowiak @ 2021-02-16  1:15 UTC (permalink / raw)
  To: linux-s390, linux-kernel, kvm
  Cc: stable, borntraeger, cohuck, kwankhede, pbonzini,
	alex.williamson, pasic, Tony Krowiak

Commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
pointer invalidated") introduced a change that results in a circular
lockdep when a Secure Execution guest that is configured with
crypto devices is started. The problem resulted due to the fact that the
patch moved the setting of the guest's AP masks within the protection of
the matrix_dev->lock when the vfio_ap driver is notified that the KVM 
pointer has been set. Since it is not critical that setting/clearing of
the guest's AP masks be done under the matrix_dev->lock when the driver is
notified, the masks will not be updated under the matrix_dev->lock. The
lock is necessary for the setting/unsetting of the KVM pointer, however,
so that will remain in place. 

The dependency chain for the circular lockdep resolved by this patch 
is (in reverse order):

2:	vfio_ap_mdev_group_notifier:	kvm->lock
					matrix_dev->lock

1:	handle_pqap:			matrix_dev->lock
	kvm_vcpu_ioctl:			vcpu->mutex

0:	kvm_s390_cpus_to_pv:		vcpu->mutex
	kvm_vm_ioctl:  			kvm->lock

Please note that if checkpatch is run against this patch series, you may
get a "WARNING: Unknown commit id 'f21916ec4826', maybe rebased or not 
pulled?" message. The commit 'f21916ec4826', however, is definitely
in the master branch on top of which this patch series was built, so I'm
not sure why this message is being output by checkpatch. 

Change log v1=> v2:
------------------
* No longer holding the matrix_dev->lock prior to setting/clearing the
  masks supplying the AP configuration to a KVM guest.
* Make all updates to the data in the matrix mdev that is used to manage
  AP resources used by the KVM guest in the vfio_ap_mdev_set_kvm() function
  instead of the group notifier callback.
* Check for the matrix mdev's KVM pointer in the vfio_ap_mdev_unset_kvm()
  function instead of the vfio_ap_mdev_release() function.

Tony Krowiak (1):
  s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks

 drivers/s390/crypto/vfio_ap_ops.c | 119 +++++++++++++++++++++---------
 1 file changed, 84 insertions(+), 35 deletions(-)

-- 
2.21.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
  2021-02-16  1:15 [PATCH v2 0/1] s390/vfio-ap: fix circular lockdep when staring SE guest Tony Krowiak
@ 2021-02-16  1:15 ` Tony Krowiak
  2021-02-19 13:45   ` Cornelia Huck
  2021-02-23  9:48   ` Halil Pasic
  0 siblings, 2 replies; 12+ messages in thread
From: Tony Krowiak @ 2021-02-16  1:15 UTC (permalink / raw)
  To: linux-s390, linux-kernel, kvm
  Cc: stable, borntraeger, cohuck, kwankhede, pbonzini,
	alex.williamson, pasic, Tony Krowiak

This patch fixes a circular locking dependency in the CI introduced by
commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
pointer invalidated"). The lockdep only occurs when starting a Secure
Execution guest. Crypto virtualization (vfio_ap) is not yet supported for
SE guests; however, in order to avoid CI errors, this fix is being
provided.

The circular lockdep was introduced when the masks in the guest's APCB
were taken under the matrix_dev->lock. While the lock is definitely
needed to protect the setting/unsetting of the KVM pointer, it is not
necessarily critical for setting the masks, so this will not be done under
protection of the matrix_dev->lock.

Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
Cc: stable@vger.kernel.org
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
---
 drivers/s390/crypto/vfio_ap_ops.c | 119 +++++++++++++++++++++---------
 1 file changed, 84 insertions(+), 35 deletions(-)

diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 41fc2e4135fe..8574b6ecc9c5 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -1027,8 +1027,21 @@ static const struct attribute_group *vfio_ap_mdev_attr_groups[] = {
  * @matrix_mdev: a mediated matrix device
  * @kvm: reference to KVM instance
  *
- * Verifies no other mediated matrix device has @kvm and sets a reference to
- * it in @matrix_mdev->kvm.
+ * Sets all data for @matrix_mdev that are needed to manage AP resources
+ * for the guest whose state is represented by @kvm:
+ * 1. Verifies no other mediated device has a reference to @kvm.
+ * 2. Increments the ref count for @kvm so it doesn't disappear until the
+ *    vfio_ap driver is notified the pointer is being nullified.
+ * 3. Sets a reference to the PQAP hook (i.e., handle_pqap() function) into
+ *    @kvm to handle interception of the PQAP(AQIC) instruction.
+ * 4. Sets the masks supplying the AP configuration to the KVM guest.
+ * 5. Sets the KVM pointer into @kvm so the vfio_ap driver can access it.
+ *
+ * Note: The matrix_dev->lock must be taken prior to calling
+ * this function; however, the lock will be temporarily released to avoid a
+ * potential circular lock dependency with other asynchronous processes that
+ * lock the kvm->lock mutex which is also needed to supply the guest's AP
+ * configuration.
  *
  * Return 0 if no other mediated matrix device has a reference to @kvm;
  * otherwise, returns an -EPERM.
@@ -1043,9 +1056,17 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
 			return -EPERM;
 	}
 
-	matrix_mdev->kvm = kvm;
-	kvm_get_kvm(kvm);
-	kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
+	if (kvm->arch.crypto.crycbd) {
+		kvm_get_kvm(kvm);
+		kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
+		mutex_unlock(&matrix_dev->lock);
+		kvm_arch_crypto_set_masks(kvm,
+					  matrix_mdev->matrix.apm,
+					  matrix_mdev->matrix.aqm,
+					  matrix_mdev->matrix.adm);
+		mutex_lock(&matrix_dev->lock);
+		matrix_mdev->kvm = kvm;
+	}
 
 	return 0;
 }
@@ -1079,51 +1100,80 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
 	return NOTIFY_DONE;
 }
 
+/**
+ * vfio_ap_mdev_unset_kvm
+ *
+ * @matrix_mdev: a matrix mediated device
+ *
+ * Performs clean-up of resources no longer needed by @matrix_mdev.
+ *
+ * Note: The matrix_dev->lock must be taken prior to calling this
+ * function; however,  the lock will be temporarily released to avoid a
+ * potential circular lock dependency with other asynchronous processes that
+ * lock the kvm->lock mutex which is also needed to update the guest's AP
+ * configuration as follows:
+ *	1.  Grab a reference to the KVM pointer stored in @matrix_mdev.
+ *	2.  Set the KVM pointer in @matrix_mdev to NULL so no other asynchronous
+ *	    process uses it (e.g., assign_adapter store function) after
+ *	    unlocking the matrix_dev->lock mutex.
+ *	3.  Set the PQAP hook to NULL so it will not be invoked after unlocking
+ *	    the matrix_dev->lock mutex.
+ *	4.  Unlock the matrix_dev->lock mutex to avoid circular lock
+ *	    dependencies.
+ *	5.  Clear the masks in the guest's APCB to remove guest access to AP
+ *	    resources assigned to @matrix_mdev.
+ *	6.  Lock the matrix_dev->lock mutex to prevent access to resources
+ *	    assigned to @matrix_mdev while the remainder of the cleanup
+ *	    operations take place.
+ *	7.  Decrement the reference counter incremented in #1.
+ *	8.  Set the reference to the KVM pointer grabbed in #1 into @matrix_mdev
+ *	    (set to NULL in #2) because it will be needed when the queues are
+ *	    reset to clean up any IRQ resources being held.
+ *	9.  Decrement the reference count that was incremented when the KVM
+ *	    pointer was originally set by the group notifier.
+ *	10. Set the KVM pointer @matrix_mdev to NULL to prevent its usage from
+ *	    here on out.
+ *
+ */
 static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
 {
-	kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
-	matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
-	vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
-	kvm_put_kvm(matrix_mdev->kvm);
-	matrix_mdev->kvm = NULL;
+	struct kvm *kvm;
+
+	if (matrix_mdev->kvm) {
+		kvm = matrix_mdev->kvm;
+		kvm_get_kvm(kvm);
+		matrix_mdev->kvm = NULL;
+		kvm->arch.crypto.pqap_hook = NULL;
+		mutex_unlock(&matrix_dev->lock);
+		kvm_arch_crypto_clear_masks(kvm);
+		mutex_lock(&matrix_dev->lock);
+		kvm_put_kvm(kvm);
+		matrix_mdev->kvm = kvm;
+		vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
+		kvm_put_kvm(matrix_mdev->kvm);
+		matrix_mdev->kvm = NULL;
+	}
 }
 
 static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
 				       unsigned long action, void *data)
 {
-	int ret, notify_rc = NOTIFY_OK;
+	int notify_rc = NOTIFY_OK;
 	struct ap_matrix_mdev *matrix_mdev;
 
 	if (action != VFIO_GROUP_NOTIFY_SET_KVM)
 		return NOTIFY_OK;
 
-	matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier);
 	mutex_lock(&matrix_dev->lock);
+	matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier);
 
-	if (!data) {
-		if (matrix_mdev->kvm)
-			vfio_ap_mdev_unset_kvm(matrix_mdev);
-		goto notify_done;
-	}
-
-	ret = vfio_ap_mdev_set_kvm(matrix_mdev, data);
-	if (ret) {
-		notify_rc = NOTIFY_DONE;
-		goto notify_done;
-	}
-
-	/* If there is no CRYCB pointer, then we can't copy the masks */
-	if (!matrix_mdev->kvm->arch.crypto.crycbd) {
+	if (!data)
+		vfio_ap_mdev_unset_kvm(matrix_mdev);
+	else if (vfio_ap_mdev_set_kvm(matrix_mdev, data))
 		notify_rc = NOTIFY_DONE;
-		goto notify_done;
-	}
-
-	kvm_arch_crypto_set_masks(matrix_mdev->kvm, matrix_mdev->matrix.apm,
-				  matrix_mdev->matrix.aqm,
-				  matrix_mdev->matrix.adm);
 
-notify_done:
 	mutex_unlock(&matrix_dev->lock);
+
 	return notify_rc;
 }
 
@@ -1258,8 +1308,7 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
 	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
 
 	mutex_lock(&matrix_dev->lock);
-	if (matrix_mdev->kvm)
-		vfio_ap_mdev_unset_kvm(matrix_mdev);
+	vfio_ap_mdev_unset_kvm(matrix_mdev);
 	mutex_unlock(&matrix_dev->lock);
 
 	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,
-- 
2.21.1


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
  2021-02-16  1:15 ` [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks Tony Krowiak
@ 2021-02-19 13:45   ` Cornelia Huck
  2021-02-19 20:49     ` Tony Krowiak
  2021-02-23  9:48   ` Halil Pasic
  1 sibling, 1 reply; 12+ messages in thread
From: Cornelia Huck @ 2021-02-19 13:45 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, stable, borntraeger, kwankhede,
	pbonzini, alex.williamson, pasic

On Mon, 15 Feb 2021 20:15:47 -0500
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> This patch fixes a circular locking dependency in the CI introduced by
> commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
> pointer invalidated"). The lockdep only occurs when starting a Secure
> Execution guest. Crypto virtualization (vfio_ap) is not yet supported for
> SE guests; however, in order to avoid CI errors, this fix is being
> provided.
> 
> The circular lockdep was introduced when the masks in the guest's APCB
> were taken under the matrix_dev->lock. While the lock is definitely
> needed to protect the setting/unsetting of the KVM pointer, it is not
> necessarily critical for setting the masks, so this will not be done under
> protection of the matrix_dev->lock.
> 
> Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
> Cc: stable@vger.kernel.org
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 119 +++++++++++++++++++++---------
>  1 file changed, 84 insertions(+), 35 deletions(-)

I've been looking at the patch for a bit now and tried to follow down
the various paths; and while I think it's ok, I do not really have
enough confidence about that for a R-b. But have an

Acked-by: Cornelia Huck <cohuck@redhat.com>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
  2021-02-19 13:45   ` Cornelia Huck
@ 2021-02-19 20:49     ` Tony Krowiak
  0 siblings, 0 replies; 12+ messages in thread
From: Tony Krowiak @ 2021-02-19 20:49 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: linux-s390, linux-kernel, kvm, stable, borntraeger, kwankhede,
	pbonzini, alex.williamson, pasic



On 2/19/21 8:45 AM, Cornelia Huck wrote:
> On Mon, 15 Feb 2021 20:15:47 -0500
> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
>
>> This patch fixes a circular locking dependency in the CI introduced by
>> commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
>> pointer invalidated"). The lockdep only occurs when starting a Secure
>> Execution guest. Crypto virtualization (vfio_ap) is not yet supported for
>> SE guests; however, in order to avoid CI errors, this fix is being
>> provided.
>>
>> The circular lockdep was introduced when the masks in the guest's APCB
>> were taken under the matrix_dev->lock. While the lock is definitely
>> needed to protect the setting/unsetting of the KVM pointer, it is not
>> necessarily critical for setting the masks, so this will not be done under
>> protection of the matrix_dev->lock.
>>
>> Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
>> ---
>>   drivers/s390/crypto/vfio_ap_ops.c | 119 +++++++++++++++++++++---------
>>   1 file changed, 84 insertions(+), 35 deletions(-)
> I've been looking at the patch for a bit now and tried to follow down
> the various paths; and while I think it's ok, I do not really have
> enough confidence about that for a R-b. But have an
>
> Acked-by: Cornelia Huck <cohuck@redhat.com>

Thanks for the review.

>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
  2021-02-16  1:15 ` [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks Tony Krowiak
  2021-02-19 13:45   ` Cornelia Huck
@ 2021-02-23  9:48   ` Halil Pasic
  2021-02-24 16:10     ` Christian Borntraeger
       [not found]     ` <63bb0d61-efcd-315b-5a1a-0ef4d99600f4@linux.ibm.com>
  1 sibling, 2 replies; 12+ messages in thread
From: Halil Pasic @ 2021-02-23  9:48 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, stable, borntraeger, cohuck,
	kwankhede, pbonzini, alex.williamson, pasic

On Mon, 15 Feb 2021 20:15:47 -0500
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> This patch fixes a circular locking dependency in the CI introduced by
> commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
> pointer invalidated"). The lockdep only occurs when starting a Secure
> Execution guest. Crypto virtualization (vfio_ap) is not yet supported for
> SE guests; however, in order to avoid CI errors, this fix is being
> provided.
> 
> The circular lockdep was introduced when the masks in the guest's APCB
> were taken under the matrix_dev->lock. While the lock is definitely
> needed to protect the setting/unsetting of the KVM pointer, it is not
> necessarily critical for setting the masks, so this will not be done under
> protection of the matrix_dev->lock.



With the one little thing I commented on below addressed: 
Acked-by: Halil Pasic <pasic@linux.ibm.com>  

This solution probably ain't a perfect one, but can't say I see a simple
way to get around this problem. For instance I played with the thought of
taking locks in a different order and keeping the critical sections
intact, but that has problems of its own. Tony should have the best
understanding of vfio_ap anyway.

In theory the execution of vfio_ap_mdev_group_notifier() and
vfio_ap_mdev_release() could interleave, and we could loose a clear because
in theory some permutations of the critical sections need to be
considered. In practice I hope that won't happen with QEMU.

Tony, you gave this a decent amount of testing or? 

I think we should move forward with this. Any objections? 
> 
> Fixes: f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM pointer invalidated")
> Cc: stable@vger.kernel.org
> Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
> ---
>  drivers/s390/crypto/vfio_ap_ops.c | 119 +++++++++++++++++++++---------
>  1 file changed, 84 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index 41fc2e4135fe..8574b6ecc9c5 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -1027,8 +1027,21 @@ static const struct attribute_group *vfio_ap_mdev_attr_groups[] = {
>   * @matrix_mdev: a mediated matrix device
>   * @kvm: reference to KVM instance
>   *
> - * Verifies no other mediated matrix device has @kvm and sets a reference to
> - * it in @matrix_mdev->kvm.
> + * Sets all data for @matrix_mdev that are needed to manage AP resources
> + * for the guest whose state is represented by @kvm:
> + * 1. Verifies no other mediated device has a reference to @kvm.
> + * 2. Increments the ref count for @kvm so it doesn't disappear until the
> + *    vfio_ap driver is notified the pointer is being nullified.
> + * 3. Sets a reference to the PQAP hook (i.e., handle_pqap() function) into
> + *    @kvm to handle interception of the PQAP(AQIC) instruction.
> + * 4. Sets the masks supplying the AP configuration to the KVM guest.
> + * 5. Sets the KVM pointer into @kvm so the vfio_ap driver can access it.
> + *

Could for example a PQAP AQIC run across an unset matrix_mdev->kvm like
this, in theory? I don't think it's likely to happen in the wild though.
Why not set it up before setting the mask?

> + * Note: The matrix_dev->lock must be taken prior to calling
> + * this function; however, the lock will be temporarily released to avoid a
> + * potential circular lock dependency with other asynchronous processes that
> + * lock the kvm->lock mutex which is also needed to supply the guest's AP
> + * configuration.
>   *
>   * Return 0 if no other mediated matrix device has a reference to @kvm;
>   * otherwise, returns an -EPERM.
> @@ -1043,9 +1056,17 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
>  			return -EPERM;
>  	}
>  
> -	matrix_mdev->kvm = kvm;
> -	kvm_get_kvm(kvm);
> -	kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
> +	if (kvm->arch.crypto.crycbd) {
> +		kvm_get_kvm(kvm);
> +		kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
> +		mutex_unlock(&matrix_dev->lock);
> +		kvm_arch_crypto_set_masks(kvm,
> +					  matrix_mdev->matrix.apm,
> +					  matrix_mdev->matrix.aqm,
> +					  matrix_mdev->matrix.adm);
> +		mutex_lock(&matrix_dev->lock);
> +		matrix_mdev->kvm = kvm;
> +	}
>  
>  	return 0;
>  }
> @@ -1079,51 +1100,80 @@ static int vfio_ap_mdev_iommu_notifier(struct notifier_block *nb,
>  	return NOTIFY_DONE;
>  }
>  
> +/**
> + * vfio_ap_mdev_unset_kvm
> + *
> + * @matrix_mdev: a matrix mediated device
> + *
> + * Performs clean-up of resources no longer needed by @matrix_mdev.
> + *
> + * Note: The matrix_dev->lock must be taken prior to calling this
> + * function; however,  the lock will be temporarily released to avoid a
> + * potential circular lock dependency with other asynchronous processes that
> + * lock the kvm->lock mutex which is also needed to update the guest's AP
> + * configuration as follows:
> + *	1.  Grab a reference to the KVM pointer stored in @matrix_mdev.
> + *	2.  Set the KVM pointer in @matrix_mdev to NULL so no other asynchronous
> + *	    process uses it (e.g., assign_adapter store function) after
> + *	    unlocking the matrix_dev->lock mutex.
> + *	3.  Set the PQAP hook to NULL so it will not be invoked after unlocking
> + *	    the matrix_dev->lock mutex.
> + *	4.  Unlock the matrix_dev->lock mutex to avoid circular lock
> + *	    dependencies.
> + *	5.  Clear the masks in the guest's APCB to remove guest access to AP
> + *	    resources assigned to @matrix_mdev.
> + *	6.  Lock the matrix_dev->lock mutex to prevent access to resources
> + *	    assigned to @matrix_mdev while the remainder of the cleanup
> + *	    operations take place.
> + *	7.  Decrement the reference counter incremented in #1.
> + *	8.  Set the reference to the KVM pointer grabbed in #1 into @matrix_mdev
> + *	    (set to NULL in #2) because it will be needed when the queues are
> + *	    reset to clean up any IRQ resources being held.
> + *	9.  Decrement the reference count that was incremented when the KVM
> + *	    pointer was originally set by the group notifier.
> + *	10. Set the KVM pointer @matrix_mdev to NULL to prevent its usage from
> + *	    here on out.
> + *
> + */
>  static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
>  {
> -	kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> -	matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> -	vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> -	kvm_put_kvm(matrix_mdev->kvm);
> -	matrix_mdev->kvm = NULL;
> +	struct kvm *kvm;
> +
> +	if (matrix_mdev->kvm) {
> +		kvm = matrix_mdev->kvm;
> +		kvm_get_kvm(kvm);
> +		matrix_mdev->kvm = NULL;

I think if there were two threads dong the unset in parallel, one
of them could bail out and carry on before the cleanup is done. But
since nothing much happens in release after that, I don't see an
immediate problem.

Another thing to consider is, that setting ->kvm to NULL arms
vfio_ap_mdev_remove()...

> +		kvm->arch.crypto.pqap_hook = NULL;
> +		mutex_unlock(&matrix_dev->lock);
> +		kvm_arch_crypto_clear_masks(kvm);
> +		mutex_lock(&matrix_dev->lock);
> +		kvm_put_kvm(kvm);
> +		matrix_mdev->kvm = kvm;
> +		vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> +		kvm_put_kvm(matrix_mdev->kvm);
> +		matrix_mdev->kvm = NULL;
> +	}
>  }
>  
>  static int vfio_ap_mdev_group_notifier(struct notifier_block *nb,
>  				       unsigned long action, void *data)
>  {
> -	int ret, notify_rc = NOTIFY_OK;
> +	int notify_rc = NOTIFY_OK;
>  	struct ap_matrix_mdev *matrix_mdev;
>  
>  	if (action != VFIO_GROUP_NOTIFY_SET_KVM)
>  		return NOTIFY_OK;
>  
> -	matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier);
>  	mutex_lock(&matrix_dev->lock);
> +	matrix_mdev = container_of(nb, struct ap_matrix_mdev, group_notifier);
>  
> -	if (!data) {
> -		if (matrix_mdev->kvm)
> -			vfio_ap_mdev_unset_kvm(matrix_mdev);
> -		goto notify_done;
> -	}
> -
> -	ret = vfio_ap_mdev_set_kvm(matrix_mdev, data);
> -	if (ret) {
> -		notify_rc = NOTIFY_DONE;
> -		goto notify_done;
> -	}
> -
> -	/* If there is no CRYCB pointer, then we can't copy the masks */
> -	if (!matrix_mdev->kvm->arch.crypto.crycbd) {
> +	if (!data)
> +		vfio_ap_mdev_unset_kvm(matrix_mdev);
> +	else if (vfio_ap_mdev_set_kvm(matrix_mdev, data))
>  		notify_rc = NOTIFY_DONE;
> -		goto notify_done;
> -	}
> -
> -	kvm_arch_crypto_set_masks(matrix_mdev->kvm, matrix_mdev->matrix.apm,
> -				  matrix_mdev->matrix.aqm,
> -				  matrix_mdev->matrix.adm);
>  
> -notify_done:
>  	mutex_unlock(&matrix_dev->lock);
> +
>  	return notify_rc;
>  }
>  
> @@ -1258,8 +1308,7 @@ static void vfio_ap_mdev_release(struct mdev_device *mdev)
>  	struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>  
>  	mutex_lock(&matrix_dev->lock);
> -	if (matrix_mdev->kvm)
> -		vfio_ap_mdev_unset_kvm(matrix_mdev);
> +	vfio_ap_mdev_unset_kvm(matrix_mdev);
>  	mutex_unlock(&matrix_dev->lock);
>  
>  	vfio_unregister_notifier(mdev_dev(mdev), VFIO_IOMMU_NOTIFY,


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
  2021-02-23  9:48   ` Halil Pasic
@ 2021-02-24 16:10     ` Christian Borntraeger
  2021-02-24 23:44       ` Tony Krowiak
       [not found]     ` <63bb0d61-efcd-315b-5a1a-0ef4d99600f4@linux.ibm.com>
  1 sibling, 1 reply; 12+ messages in thread
From: Christian Borntraeger @ 2021-02-24 16:10 UTC (permalink / raw)
  To: Halil Pasic, Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, stable, cohuck, kwankhede,
	pbonzini, alex.williamson, pasic



On 23.02.21 10:48, Halil Pasic wrote:
> On Mon, 15 Feb 2021 20:15:47 -0500
> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
> 
>> This patch fixes a circular locking dependency in the CI introduced by
>> commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
>> pointer invalidated"). The lockdep only occurs when starting a Secure
>> Execution guest. Crypto virtualization (vfio_ap) is not yet supported for
>> SE guests; however, in order to avoid CI errors, this fix is being
>> provided.
>>
>> The circular lockdep was introduced when the masks in the guest's APCB
>> were taken under the matrix_dev->lock. While the lock is definitely
>> needed to protect the setting/unsetting of the KVM pointer, it is not
>> necessarily critical for setting the masks, so this will not be done under
>> protection of the matrix_dev->lock.
> 
> 
> 
> With the one little thing I commented on below addressed: 
> Acked-by: Halil Pasic <pasic@linux.ibm.com>  

Tony, can you comment on Halils comment or send a v3 right away?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
  2021-02-24 16:10     ` Christian Borntraeger
@ 2021-02-24 23:44       ` Tony Krowiak
  0 siblings, 0 replies; 12+ messages in thread
From: Tony Krowiak @ 2021-02-24 23:44 UTC (permalink / raw)
  To: Christian Borntraeger, Halil Pasic
  Cc: linux-s390, linux-kernel, kvm, stable, cohuck, kwankhede,
	pbonzini, alex.williamson, pasic



On 2/24/21 11:10 AM, Christian Borntraeger wrote:
>
> On 23.02.21 10:48, Halil Pasic wrote:
>> On Mon, 15 Feb 2021 20:15:47 -0500
>> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
>>
>>> This patch fixes a circular locking dependency in the CI introduced by
>>> commit f21916ec4826 ("s390/vfio-ap: clean up vfio_ap resources when KVM
>>> pointer invalidated"). The lockdep only occurs when starting a Secure
>>> Execution guest. Crypto virtualization (vfio_ap) is not yet supported for
>>> SE guests; however, in order to avoid CI errors, this fix is being
>>> provided.
>>>
>>> The circular lockdep was introduced when the masks in the guest's APCB
>>> were taken under the matrix_dev->lock. While the lock is definitely
>>> needed to protect the setting/unsetting of the KVM pointer, it is not
>>> necessarily critical for setting the masks, so this will not be done under
>>> protection of the matrix_dev->lock.
>>
>>
>> With the one little thing I commented on below addressed:
>> Acked-by: Halil Pasic <pasic@linux.ibm.com>
> Tony, can you comment on Halils comment or send a v3 right away?

I was locked out of email due to expiration of my w3 password.
I am working on the response now.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
       [not found]     ` <63bb0d61-efcd-315b-5a1a-0ef4d99600f4@linux.ibm.com>
@ 2021-02-25 11:28       ` Halil Pasic
       [not found]         ` <f5d5cbab-2181-2a95-8a87-b21d05405936@linux.ibm.com>
  0 siblings, 1 reply; 12+ messages in thread
From: Halil Pasic @ 2021-02-25 11:28 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, stable, borntraeger, cohuck,
	kwankhede, pbonzini, alex.williamson, pasic

On Wed, 24 Feb 2021 22:28:50 -0500
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> >>   static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
> >>   {
> >> -	kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> >> -	matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> >> -	vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> >> -	kvm_put_kvm(matrix_mdev->kvm);
> >> -	matrix_mdev->kvm = NULL;
> >> +	struct kvm *kvm;
> >> +
> >> +	if (matrix_mdev->kvm) {
> >> +		kvm = matrix_mdev->kvm;
> >> +		kvm_get_kvm(kvm);
> >> +		matrix_mdev->kvm = NULL;  
> > I think if there were two threads dong the unset in parallel, one
> > of them could bail out and carry on before the cleanup is done. But
> > since nothing much happens in release after that, I don't see an
> > immediate problem.
> >
> > Another thing to consider is, that setting ->kvm to NULL arms
> > vfio_ap_mdev_remove()...  
> 
> I'm not entirely sure what you mean by this, but my
> assumption is that you are talking about the check
> for matrix_mdev->kvm != NULL at the start of
> that function. 

Yes I was talking about the check

static int vfio_ap_mdev_remove(struct mdev_device *mdev)                        
{                                                                               
        struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);            
                                                                                
        if (matrix_mdev->kvm)                                                   
                return -EBUSY;
...
        kfree(matrix_mdev);                                                     
...                                                               
} 

As you see, we bail out if kvm is still set, otherwise we clean up the
matrix_mdev which includes kfree-ing it. And vfio_ap_mdev_remove() is
initiated via the sysfs, i.e. can be initiated at any time. If we were
to free matrix_mdev in mdev_remove() and then carry on with kvm_unset()
with mutex_lock(&matrix_dev->lock); that would be bad.



> The reason
> matrix_mdev->kvm is set to NULL before giving up
> the matrix_dev->lock is so that functions that check
> for the presence of the matrix_mdev->kvm pointer,
> such as assign_adapter_store() - will exit if they get
> control while the masks are being cleared. 

I disagree!

static ssize_t assign_adapter_store(struct device *dev,                         
                                    struct device_attribute *attr,              
                                    const char *buf, size_t count)              
{                                                                               
        int ret;                                                                
        unsigned long apid;                                                     
        struct mdev_device *mdev = mdev_from_dev(dev);                          
        struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);            
                                                                                
        /* If the guest is running, disallow assignment of adapter */           
        if (matrix_mdev->kvm)                                                   
                return -EBUSY;

We bail out when kvm != NULL, so having it set to NULL while the
mask are being cleared will make these not bail out.

> So what we have
> here is a catch-22; in other words, we have the case
> you pointed out above and the cases related to
> assigning/unassigning adapters, domains and
> control domains which should exit when a guest
> is running.


See above.

> 
> I may have an idea to resolve this. Suppose we add:
> 
> struct ap_matrix_mdev {
>      ...
>      bool kvm_busy;
>      ...
> }
> 
> This flag will be set to true at the start of both the
> vfio_ap_mdev_set_kvm() and vfio_ap_mdev_unset_kvm()
> and set to false at the end. The assignment/unassignment
> and remove callback functions can test this flag and
> return -EBUSY if the flag is true. That will preclude assigning
> or unassigning adapters, domains and control domains when
> the KVM pointer is being set/unset. Likewise, removal of the
> mediated device will also be prevented while the KVM pointer
> is being set/unset.
> 
> In the case of the PQAP handler function, it can wait for the
> set/unset of the KVM pointer as follows:
> 
> /while (matrix_mdev->kvm_busy) {//
> //        mutex_unlock(&matrix_dev->lock);//
> //        msleep(100);//
> //        mutex_lock(&matrix_dev->lock);//
> //}//
> //
> //if (!matrix_mdev->kvm)//
> //        goto out_unlock;
> 
> /What say you?
> //

I'm not sure. Since I disagree with your analysis above it is difficult
to deal with the conclusion. I'm not against decoupling the tracking of
the state of the mdev_matrix device from the value of the kvm pointer. I
think we should first get a common understanding of the problem, before
we proceed to the solution.

Regards,
Halil

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
       [not found]         ` <f5d5cbab-2181-2a95-8a87-b21d05405936@linux.ibm.com>
@ 2021-02-25 15:25           ` Tony Krowiak
  2021-02-25 15:35             ` Halil Pasic
  2021-02-25 15:36           ` Halil Pasic
  1 sibling, 1 reply; 12+ messages in thread
From: Tony Krowiak @ 2021-02-25 15:25 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, linux-kernel, kvm, stable, borntraeger, cohuck,
	kwankhede, pbonzini, alex.williamson, pasic



On 2/25/21 8:53 AM, Tony Krowiak wrote:
>
>
> On 2/25/21 6:28 AM, Halil Pasic wrote:
>> On Wed, 24 Feb 2021 22:28:50 -0500
>> Tony Krowiak<akrowiak@linux.ibm.com>  wrote:
>>
>>>>>    static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
>>>>>    {
>>>>> -	kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>>>>> -	matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
>>>>> -	vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
>>>>> -	kvm_put_kvm(matrix_mdev->kvm);
>>>>> -	matrix_mdev->kvm = NULL;
>>>>> +	struct kvm *kvm;
>>>>> +
>>>>> +	if (matrix_mdev->kvm) {
>>>>> +		kvm = matrix_mdev->kvm;
>>>>> +		kvm_get_kvm(kvm);
>>>>> +		matrix_mdev->kvm = NULL;
>>>> I think if there were two threads dong the unset in parallel, one
>>>> of them could bail out and carry on before the cleanup is done. But
>>>> since nothing much happens in release after that, I don't see an
>>>> immediate problem.
>>>>
>>>> Another thing to consider is, that setting ->kvm to NULL arms
>>>> vfio_ap_mdev_remove()...
>>> I'm not entirely sure what you mean by this, but my
>>> assumption is that you are talking about the check
>>> for matrix_mdev->kvm != NULL at the start of
>>> that function.
>> Yes I was talking about the check
>>
>> static int vfio_ap_mdev_remove(struct mdev_device *mdev)
>> {
>>          struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>                                                                                  
>>          if (matrix_mdev->kvm)
>>                  return -EBUSY;
>> ...
>>          kfree(matrix_mdev);
>> ...
>> }
>>
>> As you see, we bail out if kvm is still set, otherwise we clean up the
>> matrix_mdev which includes kfree-ing it. And vfio_ap_mdev_remove() is
>> initiated via the sysfs, i.e. can be initiated at any time. If we were
>> to free matrix_mdev in mdev_remove() and then carry on with kvm_unset()
>> with mutex_lock(&matrix_dev->lock); that would be bad.
>
> I agree.
>
>>
>>> The reason
>>> matrix_mdev->kvm is set to NULL before giving up
>>> the matrix_dev->lock is so that functions that check
>>> for the presence of the matrix_mdev->kvm pointer,
>>> such as assign_adapter_store() - will exit if they get
>>> control while the masks are being cleared.
>> I disagree!
>>
>> static ssize_t assign_adapter_store(struct device *dev,
>>                                      struct device_attribute *attr,
>>                                      const char *buf, size_t count)
>> {
>>          int ret;
>>          unsigned long apid;
>>          struct mdev_device *mdev = mdev_from_dev(dev);
>>          struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>                                                                                  
>>          /* If the guest is running, disallow assignment of adapter */
>>          if (matrix_mdev->kvm)
>>                  return -EBUSY;
>>
>> We bail out when kvm != NULL, so having it set to NULL while the
>> mask are being cleared will make these not bail out.
>
> You are correct, I am an idiot.
>
>>> So what we have
>>> here is a catch-22; in other words, we have the case
>>> you pointed out above and the cases related to
>>> assigning/unassigning adapters, domains and
>>> control domains which should exit when a guest
>>> is running.
>> See above.
>
> Ditto.
>
>>> I may have an idea to resolve this. Suppose we add:
>>>
>>> struct ap_matrix_mdev {
>>>       ...
>>>       bool kvm_busy;
>>>       ...
>>> }
>>>
>>> This flag will be set to true at the start of both the
>>> vfio_ap_mdev_set_kvm() and vfio_ap_mdev_unset_kvm()
>>> and set to false at the end. The assignment/unassignment
>>> and remove callback functions can test this flag and
>>> return -EBUSY if the flag is true. That will preclude assigning
>>> or unassigning adapters, domains and control domains when
>>> the KVM pointer is being set/unset. Likewise, removal of the
>>> mediated device will also be prevented while the KVM pointer
>>> is being set/unset.
>>>
>>> In the case of the PQAP handler function, it can wait for the
>>> set/unset of the KVM pointer as follows:
>>>
>>> /while (matrix_mdev->kvm_busy) {//
>>> //        mutex_unlock(&matrix_dev->lock);//
>>> //        msleep(100);//
>>> //        mutex_lock(&matrix_dev->lock);//
>>> //}//
>>> //
>>> //if (!matrix_mdev->kvm)//
>>> //        goto out_unlock;
>>>
>>> /What say you?
>>> //
>> I'm not sure. Since I disagree with your analysis above it is difficult
>> to deal with the conclusion. I'm not against decoupling the tracking of
>> the state of the mdev_matrix device from the value of the kvm pointer. I
>> think we should first get a common understanding of the problem, before
>> we proceed to the solution.
>
> Regardless of my brain fog regarding the testing of the
> matrix_mdev->kvm pointer, I stand by what I stated
> in the paragraphs just before the code snippet.
>
> The problem is there are 10 functions that depend upon
> the value of the matrix_mdev->kvm pointer that can get
> control while the pointer is being set/unset and the
> matrix_dev->lock is given up to set/clear the masks:

* vfio_ap_irq_enable: called by handle_pqap() when AQIC is intercepted
* vfio_ap_irq_disable: called by handle_pqap() when AQIC is intercepted
* assign_adapter_store: sysfs
* unassign_adapter_store: sysfs
* assign_domain_store: sysfs
* unassign_domain_store: sysfs
* assign__control_domain_store: sysfs
* unassign_control_domain_store: sysfs
* vfio_ap_mdev_remove: sysfs
* vfio_ap_mdev_release: mdev fd closed by userspace (i.e., qemu)If we 
add the proposed flag to indicate when the matrix_mdev->kvm
> pointer is in flux, then we can check that before allowing the functions
> in the list above to proceed.
>
>> Regards,
>> Halil
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
  2021-02-25 15:25           ` Tony Krowiak
@ 2021-02-25 15:35             ` Halil Pasic
  2021-02-25 20:02               ` Tony Krowiak
  0 siblings, 1 reply; 12+ messages in thread
From: Halil Pasic @ 2021-02-25 15:35 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, stable, borntraeger, cohuck,
	kwankhede, pbonzini, alex.williamson, pasic

On Thu, 25 Feb 2021 10:25:24 -0500
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> On 2/25/21 8:53 AM, Tony Krowiak wrote:
> >
> >
> > On 2/25/21 6:28 AM, Halil Pasic wrote:  
> >> On Wed, 24 Feb 2021 22:28:50 -0500
> >> Tony Krowiak<akrowiak@linux.ibm.com>  wrote:
> >>  
> >>>>>    static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
> >>>>>    {
> >>>>> -	kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
> >>>>> -	matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
> >>>>> -	vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
> >>>>> -	kvm_put_kvm(matrix_mdev->kvm);
> >>>>> -	matrix_mdev->kvm = NULL;
> >>>>> +	struct kvm *kvm;
> >>>>> +
> >>>>> +	if (matrix_mdev->kvm) {
> >>>>> +		kvm = matrix_mdev->kvm;
> >>>>> +		kvm_get_kvm(kvm);
> >>>>> +		matrix_mdev->kvm = NULL;  
> >>>> I think if there were two threads dong the unset in parallel, one
> >>>> of them could bail out and carry on before the cleanup is done. But
> >>>> since nothing much happens in release after that, I don't see an
> >>>> immediate problem.
> >>>>
> >>>> Another thing to consider is, that setting ->kvm to NULL arms
> >>>> vfio_ap_mdev_remove()...  
> >>> I'm not entirely sure what you mean by this, but my
> >>> assumption is that you are talking about the check
> >>> for matrix_mdev->kvm != NULL at the start of
> >>> that function.  
> >> Yes I was talking about the check
> >>
> >> static int vfio_ap_mdev_remove(struct mdev_device *mdev)
> >> {
> >>          struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
> >>                                                                                  
> >>          if (matrix_mdev->kvm)
> >>                  return -EBUSY;
> >> ...
> >>          kfree(matrix_mdev);
> >> ...
> >> }
> >>
> >> As you see, we bail out if kvm is still set, otherwise we clean up the
> >> matrix_mdev which includes kfree-ing it. And vfio_ap_mdev_remove() is
> >> initiated via the sysfs, i.e. can be initiated at any time. If we were
> >> to free matrix_mdev in mdev_remove() and then carry on with kvm_unset()
> >> with mutex_lock(&matrix_dev->lock); that would be bad.  
> >
> > I agree.
> >  
> >>  
> >>> The reason
> >>> matrix_mdev->kvm is set to NULL before giving up
> >>> the matrix_dev->lock is so that functions that check
> >>> for the presence of the matrix_mdev->kvm pointer,
> >>> such as assign_adapter_store() - will exit if they get
> >>> control while the masks are being cleared.  
> >> I disagree!
> >>
> >> static ssize_t assign_adapter_store(struct device *dev,
> >>                                      struct device_attribute *attr,
> >>                                      const char *buf, size_t count)
> >> {
> >>          int ret;
> >>          unsigned long apid;
> >>          struct mdev_device *mdev = mdev_from_dev(dev);
> >>          struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
> >>                                                                                  
> >>          /* If the guest is running, disallow assignment of adapter */
> >>          if (matrix_mdev->kvm)
> >>                  return -EBUSY;
> >>
> >> We bail out when kvm != NULL, so having it set to NULL while the
> >> mask are being cleared will make these not bail out.  
> >
> > You are correct, I am an idiot.
> >  
> >>> So what we have
> >>> here is a catch-22; in other words, we have the case
> >>> you pointed out above and the cases related to
> >>> assigning/unassigning adapters, domains and
> >>> control domains which should exit when a guest
> >>> is running.  
> >> See above.  
> >
> > Ditto.
> >  
> >>> I may have an idea to resolve this. Suppose we add:
> >>>
> >>> struct ap_matrix_mdev {
> >>>       ...
> >>>       bool kvm_busy;
> >>>       ...
> >>> }
> >>>
> >>> This flag will be set to true at the start of both the
> >>> vfio_ap_mdev_set_kvm() and vfio_ap_mdev_unset_kvm()
> >>> and set to false at the end. The assignment/unassignment
> >>> and remove callback functions can test this flag and
> >>> return -EBUSY if the flag is true. That will preclude assigning
> >>> or unassigning adapters, domains and control domains when
> >>> the KVM pointer is being set/unset. Likewise, removal of the
> >>> mediated device will also be prevented while the KVM pointer
> >>> is being set/unset.
> >>>
> >>> In the case of the PQAP handler function, it can wait for the
> >>> set/unset of the KVM pointer as follows:
> >>>
> >>> /while (matrix_mdev->kvm_busy) {//
> >>> //        mutex_unlock(&matrix_dev->lock);//
> >>> //        msleep(100);//
> >>> //        mutex_lock(&matrix_dev->lock);//
> >>> //}//
> >>> //
> >>> //if (!matrix_mdev->kvm)//
> >>> //        goto out_unlock;
> >>>
> >>> /What say you?
> >>> //  
> >> I'm not sure. Since I disagree with your analysis above it is difficult
> >> to deal with the conclusion. I'm not against decoupling the tracking of
> >> the state of the mdev_matrix device from the value of the kvm pointer. I
> >> think we should first get a common understanding of the problem, before
> >> we proceed to the solution.  
> >
> > Regardless of my brain fog regarding the testing of the
> > matrix_mdev->kvm pointer, I stand by what I stated
> > in the paragraphs just before the code snippet.
> >
> > The problem is there are 10 functions that depend upon
> > the value of the matrix_mdev->kvm pointer that can get
> > control while the pointer is being set/unset and the
> > matrix_dev->lock is given up to set/clear the masks:  
> 
> * vfio_ap_irq_enable: called by handle_pqap() when AQIC is intercepted
> * vfio_ap_irq_disable: called by handle_pqap() when AQIC is intercepted
> * assign_adapter_store: sysfs
> * unassign_adapter_store: sysfs
> * assign_domain_store: sysfs
> * unassign_domain_store: sysfs
> * assign__control_domain_store: sysfs
> * unassign_control_domain_store: sysfs
> * vfio_ap_mdev_remove: sysfs
> * vfio_ap_mdev_release: mdev fd closed by userspace (i.e., qemu)If we 
> add the proposed flag to indicate when the matrix_mdev->kvm

Something is strange with this email. It is basically the same email
as the previous one, just broken, or?

> > pointer is in flux, then we can check that before allowing the functions
> > in the list above to proceed.
> >  
> >> Regards,
> >> Halil  
> >  
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
       [not found]         ` <f5d5cbab-2181-2a95-8a87-b21d05405936@linux.ibm.com>
  2021-02-25 15:25           ` Tony Krowiak
@ 2021-02-25 15:36           ` Halil Pasic
  1 sibling, 0 replies; 12+ messages in thread
From: Halil Pasic @ 2021-02-25 15:36 UTC (permalink / raw)
  To: Tony Krowiak
  Cc: linux-s390, linux-kernel, kvm, stable, borntraeger, cohuck,
	kwankhede, pbonzini, alex.williamson, pasic

On Thu, 25 Feb 2021 08:53:50 -0500
Tony Krowiak <akrowiak@linux.ibm.com> wrote:

> If we add the proposed flag to indicate when the matrix_mdev->kvm
> pointer is in flux, then we can check that before allowing the functions
> in the list above to proceed.

I'm not against that. Go ahead!

Regards,
Halil

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks
  2021-02-25 15:35             ` Halil Pasic
@ 2021-02-25 20:02               ` Tony Krowiak
  0 siblings, 0 replies; 12+ messages in thread
From: Tony Krowiak @ 2021-02-25 20:02 UTC (permalink / raw)
  To: Halil Pasic
  Cc: linux-s390, linux-kernel, kvm, stable, borntraeger, cohuck,
	kwankhede, pbonzini, alex.williamson, pasic



On 2/25/21 10:35 AM, Halil Pasic wrote:
> On Thu, 25 Feb 2021 10:25:24 -0500
> Tony Krowiak <akrowiak@linux.ibm.com> wrote:
>
>> On 2/25/21 8:53 AM, Tony Krowiak wrote:
>>>
>>> On 2/25/21 6:28 AM, Halil Pasic wrote:
>>>> On Wed, 24 Feb 2021 22:28:50 -0500
>>>> Tony Krowiak<akrowiak@linux.ibm.com>  wrote:
>>>>   
>>>>>>>     static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
>>>>>>>     {
>>>>>>> -	kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
>>>>>>> -	matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
>>>>>>> -	vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
>>>>>>> -	kvm_put_kvm(matrix_mdev->kvm);
>>>>>>> -	matrix_mdev->kvm = NULL;
>>>>>>> +	struct kvm *kvm;
>>>>>>> +
>>>>>>> +	if (matrix_mdev->kvm) {
>>>>>>> +		kvm = matrix_mdev->kvm;
>>>>>>> +		kvm_get_kvm(kvm);
>>>>>>> +		matrix_mdev->kvm = NULL;
>>>>>> I think if there were two threads dong the unset in parallel, one
>>>>>> of them could bail out and carry on before the cleanup is done. But
>>>>>> since nothing much happens in release after that, I don't see an
>>>>>> immediate problem.
>>>>>>
>>>>>> Another thing to consider is, that setting ->kvm to NULL arms
>>>>>> vfio_ap_mdev_remove()...
>>>>> I'm not entirely sure what you mean by this, but my
>>>>> assumption is that you are talking about the check
>>>>> for matrix_mdev->kvm != NULL at the start of
>>>>> that function.
>>>> Yes I was talking about the check
>>>>
>>>> static int vfio_ap_mdev_remove(struct mdev_device *mdev)
>>>> {
>>>>           struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>>>                                                                                   
>>>>           if (matrix_mdev->kvm)
>>>>                   return -EBUSY;
>>>> ...
>>>>           kfree(matrix_mdev);
>>>> ...
>>>> }
>>>>
>>>> As you see, we bail out if kvm is still set, otherwise we clean up the
>>>> matrix_mdev which includes kfree-ing it. And vfio_ap_mdev_remove() is
>>>> initiated via the sysfs, i.e. can be initiated at any time. If we were
>>>> to free matrix_mdev in mdev_remove() and then carry on with kvm_unset()
>>>> with mutex_lock(&matrix_dev->lock); that would be bad.
>>> I agree.
>>>   
>>>>   
>>>>> The reason
>>>>> matrix_mdev->kvm is set to NULL before giving up
>>>>> the matrix_dev->lock is so that functions that check
>>>>> for the presence of the matrix_mdev->kvm pointer,
>>>>> such as assign_adapter_store() - will exit if they get
>>>>> control while the masks are being cleared.
>>>> I disagree!
>>>>
>>>> static ssize_t assign_adapter_store(struct device *dev,
>>>>                                       struct device_attribute *attr,
>>>>                                       const char *buf, size_t count)
>>>> {
>>>>           int ret;
>>>>           unsigned long apid;
>>>>           struct mdev_device *mdev = mdev_from_dev(dev);
>>>>           struct ap_matrix_mdev *matrix_mdev = mdev_get_drvdata(mdev);
>>>>                                                                                   
>>>>           /* If the guest is running, disallow assignment of adapter */
>>>>           if (matrix_mdev->kvm)
>>>>                   return -EBUSY;
>>>>
>>>> We bail out when kvm != NULL, so having it set to NULL while the
>>>> mask are being cleared will make these not bail out.
>>> You are correct, I am an idiot.
>>>   
>>>>> So what we have
>>>>> here is a catch-22; in other words, we have the case
>>>>> you pointed out above and the cases related to
>>>>> assigning/unassigning adapters, domains and
>>>>> control domains which should exit when a guest
>>>>> is running.
>>>> See above.
>>> Ditto.
>>>   
>>>>> I may have an idea to resolve this. Suppose we add:
>>>>>
>>>>> struct ap_matrix_mdev {
>>>>>        ...
>>>>>        bool kvm_busy;
>>>>>        ...
>>>>> }
>>>>>
>>>>> This flag will be set to true at the start of both the
>>>>> vfio_ap_mdev_set_kvm() and vfio_ap_mdev_unset_kvm()
>>>>> and set to false at the end. The assignment/unassignment
>>>>> and remove callback functions can test this flag and
>>>>> return -EBUSY if the flag is true. That will preclude assigning
>>>>> or unassigning adapters, domains and control domains when
>>>>> the KVM pointer is being set/unset. Likewise, removal of the
>>>>> mediated device will also be prevented while the KVM pointer
>>>>> is being set/unset.
>>>>>
>>>>> In the case of the PQAP handler function, it can wait for the
>>>>> set/unset of the KVM pointer as follows:
>>>>>
>>>>> /while (matrix_mdev->kvm_busy) {//
>>>>> //        mutex_unlock(&matrix_dev->lock);//
>>>>> //        msleep(100);//
>>>>> //        mutex_lock(&matrix_dev->lock);//
>>>>> //}//
>>>>> //
>>>>> //if (!matrix_mdev->kvm)//
>>>>> //        goto out_unlock;
>>>>>
>>>>> /What say you?
>>>>> //
>>>> I'm not sure. Since I disagree with your analysis above it is difficult
>>>> to deal with the conclusion. I'm not against decoupling the tracking of
>>>> the state of the mdev_matrix device from the value of the kvm pointer. I
>>>> think we should first get a common understanding of the problem, before
>>>> we proceed to the solution.
>>> Regardless of my brain fog regarding the testing of the
>>> matrix_mdev->kvm pointer, I stand by what I stated
>>> in the paragraphs just before the code snippet.
>>>
>>> The problem is there are 10 functions that depend upon
>>> the value of the matrix_mdev->kvm pointer that can get
>>> control while the pointer is being set/unset and the
>>> matrix_dev->lock is given up to set/clear the masks:
>> * vfio_ap_irq_enable: called by handle_pqap() when AQIC is intercepted
>> * vfio_ap_irq_disable: called by handle_pqap() when AQIC is intercepted
>> * assign_adapter_store: sysfs
>> * unassign_adapter_store: sysfs
>> * assign_domain_store: sysfs
>> * unassign_domain_store: sysfs
>> * assign__control_domain_store: sysfs
>> * unassign_control_domain_store: sysfs
>> * vfio_ap_mdev_remove: sysfs
>> * vfio_ap_mdev_release: mdev fd closed by userspace (i.e., qemu)If we
>> add the proposed flag to indicate when the matrix_mdev->kvm
> Something is strange with this email. It is basically the same email
> as the previous one, just broken, or?

the previous email was rejected for the kernel addresses because
I used bulleted lists which aren't acceptable. The kernel email addresses
accept text-only, so I replaced the bulleted list with the above.

>
>>> pointer is in flux, then we can check that before allowing the functions
>>> in the list above to proceed.
>>>   
>>>> Regards,
>>>> Halil
>>>   


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2021-02-25 20:04 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-16  1:15 [PATCH v2 0/1] s390/vfio-ap: fix circular lockdep when staring SE guest Tony Krowiak
2021-02-16  1:15 ` [PATCH v2 1/1] s390/vfio-ap: fix circular lockdep when setting/clearing crypto masks Tony Krowiak
2021-02-19 13:45   ` Cornelia Huck
2021-02-19 20:49     ` Tony Krowiak
2021-02-23  9:48   ` Halil Pasic
2021-02-24 16:10     ` Christian Borntraeger
2021-02-24 23:44       ` Tony Krowiak
     [not found]     ` <63bb0d61-efcd-315b-5a1a-0ef4d99600f4@linux.ibm.com>
2021-02-25 11:28       ` Halil Pasic
     [not found]         ` <f5d5cbab-2181-2a95-8a87-b21d05405936@linux.ibm.com>
2021-02-25 15:25           ` Tony Krowiak
2021-02-25 15:35             ` Halil Pasic
2021-02-25 20:02               ` Tony Krowiak
2021-02-25 15:36           ` Halil Pasic

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).