All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jason Gunthorpe <jgg@nvidia.com>
To: Tony Krowiak <akrowiak@linux.ibm.com>
Cc: linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	borntraeger@de.ibm.com, cohuck@redhat.com,
	pasic@linux.vnet.ibm.com, jjherne@linux.ibm.com,
	alex.williamson@redhat.com, kwankhede@nvidia.com
Subject: Re: [PATCH v3 2/2] s390/vfio-ap: control access to PQAP(AQIC) interception handler
Date: Wed, 19 May 2021 20:22:02 -0300	[thread overview]
Message-ID: <20210519232202.GV1002214@nvidia.com> (raw)
In-Reply-To: <8c93c29a-e223-ac9a-5b54-7329587084c9@linux.ibm.com>

On Wed, May 19, 2021 at 07:04:46PM -0400, Tony Krowiak wrote:
> 
> 
> On 5/19/21 12:16 PM, Jason Gunthorpe wrote:
> > On Wed, May 19, 2021 at 11:39:21AM -0400, Tony Krowiak wrote:
> > 
> > > @@ -287,13 +289,17 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
> > >   	if (!(vcpu->arch.sie_block->eca & ECA_AIV))
> > >   		return -EOPNOTSUPP;
> > > -	apqn = vcpu->run->s.regs.gprs[0] & 0xffff;
> > > -	mutex_lock(&matrix_dev->lock);
> > > +	rcu_read_lock();
> > > +	pqap_module_hook = rcu_dereference(vcpu->kvm->arch.crypto.pqap_hook);
> > > +	if (!pqap_module_hook) {
> > > +		rcu_read_unlock();
> > > +		goto set_status;
> > > +	}
> > > -	if (!vcpu->kvm->arch.crypto.pqap_hook)
> > > -		goto out_unlock;
> > > -	matrix_mdev = container_of(vcpu->kvm->arch.crypto.pqap_hook,
> > > -				   struct ap_matrix_mdev, pqap_hook);
> > > +	matrix_mdev = pqap_module_hook->data;
> > > +	rcu_read_unlock();
> > > +	mutex_lock(&matrix_dev->lock);
> > The matrix_mdev pointer was extracted from the pqap_module_hook,
> > but now there is nothing protecting it since the rcu was dropped and
> > it gets freed in vfio_ap_mdev_remove.
> 
> Therein lies the rub. We can't hold the rcu_read_lock across the
> entire time that the interception is being processed because of
> wait conditions in the interception handler. Regardless of whether
> the pointer to the matrix_mdev is retrieved as the container of
> or extracted from the pqap_hook, there is nothing protecting it
> and there appears to be no way to do so using RCU.

RCU is a lock that should only be used for highly performance
sensitive read work loads. It eliminates one atomic from a lock, but
if you go on to immediately do something like try_module_get, which
has an atomic inside, the whole thing is pointless (assuming
try_module_get was even the right thing to do)

Use a simple sleepable rwsem around the whole thing and forget about
the module_get. Hold the write side when NULL'ing the pointer.

> > And, again, module locking doesn't prevent vfio_ap_mdev_remove() from
> > being called. None of these patches should be combining module locking
> > with RCU.
> 
> Is there any other way besides user interaction with the mdev's
> sysfs remove interface for the remove callback to get invoked?

There are more options after my re-organizing series.

But I'm not sure why you ask in response to my point about module
locking? Module locking is not really related to sysfs.

> If I try to remove the mdev using the sysfs interface while the
> mdev fd is still open by the guest, the remove hangs until the
> fd is closed.

Yes, it will wait when the vfio_device is being destroyed.

> That being the case, the mdev release callback will get invoked
> prior to the remove callback being invoked which renders this whole
> debate moot. What am I missing here?

AFAICT the control flow is such that release can immediately move on
to remove on the same CPU without an additional synchronization. So
the kfree can still race with the above.

But the more I look at this the wonkier it is.. The real issue is not
the matrix_mdev, it is the whole vcpu->kvm->arch.crypto.pqap_hook

This is nonesense too:

	if (vcpu->kvm->arch.crypto.pqap_hook) {
		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
			return -EOPNOTSUPP;
		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);

It should have a lock around it of some kind, not a
try_module_get. module_get is not la lock.

And that lock should interact with loading the hook in the first
place:
		kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;

And of course NULLin'g the hooke should be synchronous with the lock.

There should be no owner for something like this. unregistering the
function pointer should fully fence all access.

The simple solution in sketch is just this:

diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 9928f785c6773a..f70386452367dd 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -657,13 +657,12 @@ static int handle_pqap(struct kvm_vcpu *vcpu)
 	 * Verify that the hook callback is registered, lock the owner
 	 * and call the hook.
 	 */
-	if (vcpu->kvm->arch.crypto.pqap_hook) {
-		if (!try_module_get(vcpu->kvm->arch.crypto.pqap_hook->owner))
-			return -EOPNOTSUPP;
+	if (down_read_trylock(&vcpu->kvm->arch.crypto.rwsem) &&
+	    vcpu->kvm->arch.crypto.pqap_hook) {
 		ret = vcpu->kvm->arch.crypto.pqap_hook->hook(vcpu);
-		module_put(vcpu->kvm->arch.crypto.pqap_hook->owner);
 		if (!ret && vcpu->run->s.regs.gprs[1] & 0x00ff0000)
 			kvm_s390_set_psw_cc(vcpu, 3);
+		up_read(&vcpu->kv->arch.crypto.rwsem);
 		return ret;
 	}
 	/*
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index b2c7e10dfdcdcf..64c89f6a711e94 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -352,8 +352,7 @@ static int vfio_ap_mdev_create(struct mdev_device *mdev)
 	vfio_ap_matrix_init(&matrix_dev->info, &matrix_mdev->matrix);
 	init_waitqueue_head(&matrix_mdev->wait_for_kvm);
 	mdev_set_drvdata(mdev, matrix_mdev);
-	matrix_mdev->pqap_hook.hook = handle_pqap;
-	matrix_mdev->pqap_hook.owner = THIS_MODULE;
+	down_write(&&vcpu->kvm->arch.crypto.rwsem);
 	mutex_lock(&matrix_dev->lock);
 	list_add(&matrix_mdev->node, &matrix_dev->mdev_list);
 	mutex_unlock(&matrix_dev->lock);
@@ -1132,7 +1131,9 @@ static int vfio_ap_mdev_set_kvm(struct ap_matrix_mdev *matrix_mdev,
 					  matrix_mdev->matrix.aqm,
 					  matrix_mdev->matrix.adm);
 		mutex_lock(&matrix_dev->lock);
+		down_write(&kvm->arch.crypto.rwsem);
 		kvm->arch.crypto.pqap_hook = &matrix_mdev->pqap_hook;
+		up_write(&kvm->arch.crypto.rwsem);
 		matrix_mdev->kvm = kvm;
 		matrix_mdev->kvm_busy = false;
 		wake_up_all(&matrix_mdev->wait_for_kvm);
@@ -1202,7 +1203,9 @@ static void vfio_ap_mdev_unset_kvm(struct ap_matrix_mdev *matrix_mdev)
 		kvm_arch_crypto_clear_masks(matrix_mdev->kvm);
 		mutex_lock(&matrix_dev->lock);
 		vfio_ap_mdev_reset_queues(matrix_mdev->mdev);
+		down_write(&matrix_mdev->kvm->arch.crypto.rwsem);
 		matrix_mdev->kvm->arch.crypto.pqap_hook = NULL;
+		up_write(&matrix_mdev->kvm->arch.crypto.rwsem);
 		kvm_put_kvm(matrix_mdev->kvm);
 		matrix_mdev->kvm = NULL;
 		matrix_mdev->kvm_busy = false;


Jason

  reply	other threads:[~2021-05-19 23:22 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-19 15:39 [PATCH v3 0/2] s390/vfio-ap: fix memory leak in mdev remove callback Tony Krowiak
2021-05-19 15:39 ` [PATCH v3 1/2] " Tony Krowiak
2021-05-19 15:39 ` [PATCH v3 2/2] s390/vfio-ap: control access to PQAP(AQIC) interception handler Tony Krowiak
2021-05-19 16:16   ` Jason Gunthorpe
2021-05-19 23:04     ` Tony Krowiak
2021-05-19 23:22       ` Jason Gunthorpe [this message]
2021-05-20  1:08         ` Tony Krowiak
2021-05-20  8:48           ` Halil Pasic
2021-05-20 12:26             ` Jason Gunthorpe
2021-05-20  8:38         ` Halil Pasic
2021-05-20 12:51           ` Jason Gunthorpe
2021-05-21 18:24         ` Tony Krowiak
2021-05-21 18:30           ` Jason Gunthorpe
2021-05-19 17:21   ` Halil Pasic
2021-05-19 23:14     ` Tony Krowiak

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210519232202.GV1002214@nvidia.com \
    --to=jgg@nvidia.com \
    --cc=akrowiak@linux.ibm.com \
    --cc=alex.williamson@redhat.com \
    --cc=borntraeger@de.ibm.com \
    --cc=cohuck@redhat.com \
    --cc=jjherne@linux.ibm.com \
    --cc=kwankhede@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=pasic@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.