* [PATCH v2] vfio/ap_ops: Add missed vfio_uninit_group_dev()
@ 2021-09-10 23:06 Jason Gunthorpe
2021-09-16 18:51 ` Alex Williamson
0 siblings, 1 reply; 5+ messages in thread
From: Jason Gunthorpe @ 2021-09-10 23:06 UTC (permalink / raw)
To: Christian Borntraeger, Harald Freudenberger, Vasily Gorbik,
Heiko Carstens, Jason Herne, linux-s390, Halil Pasic
Cc: Tony Krowiak, Alex Williamson, Cornelia Huck, Christoph Hellwig, kvm
Without this call an xarray entry is leaked when the vfio_ap device is
unprobed. It was missed when the below patch was rebased across the
dev_set patch.
Fixes: eb0feefd4c02 ("vfio/ap_ops: Convert to use vfio_register_group_dev()")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
drivers/s390/crypto/vfio_ap_ops.c | 2 ++
1 file changed, 2 insertions(+)
v2: Fix corrupted diff
v1: https://lore.kernel.org/all/0-v1-3a05c6000668+2ce62-ap_uninit_jgg@nvidia.com/
diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
index 2347808fa3e427..54bb0c22e8020e 100644
--- a/drivers/s390/crypto/vfio_ap_ops.c
+++ b/drivers/s390/crypto/vfio_ap_ops.c
@@ -360,6 +360,7 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev)
mutex_lock(&matrix_dev->lock);
list_del(&matrix_mdev->node);
mutex_unlock(&matrix_dev->lock);
+ vfio_uninit_group_dev(&matrix_mdev->vdev);
kfree(matrix_mdev);
err_dec_available:
atomic_inc(&matrix_dev->available_instances);
@@ -375,6 +376,7 @@ static void vfio_ap_mdev_remove(struct mdev_device *mdev)
mutex_lock(&matrix_dev->lock);
vfio_ap_mdev_reset_queues(matrix_mdev);
list_del(&matrix_mdev->node);
+ vfio_uninit_group_dev(&matrix_mdev->vdev);
kfree(matrix_mdev);
atomic_inc(&matrix_dev->available_instances);
mutex_unlock(&matrix_dev->lock);
base-commit: ea870730d83fc13a5fa2bd0e175176d7ac8a400a
--
2.33.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v2] vfio/ap_ops: Add missed vfio_uninit_group_dev()
2021-09-10 23:06 [PATCH v2] vfio/ap_ops: Add missed vfio_uninit_group_dev() Jason Gunthorpe
@ 2021-09-16 18:51 ` Alex Williamson
2021-09-20 21:26 ` Tony Krowiak
0 siblings, 1 reply; 5+ messages in thread
From: Alex Williamson @ 2021-09-16 18:51 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Christian Borntraeger, Harald Freudenberger, Vasily Gorbik,
Heiko Carstens, Jason Herne, linux-s390, Halil Pasic,
Tony Krowiak, Cornelia Huck, Christoph Hellwig, kvm
On Fri, 10 Sep 2021 20:06:30 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:
> Without this call an xarray entry is leaked when the vfio_ap device is
> unprobed. It was missed when the below patch was rebased across the
> dev_set patch.
>
> Fixes: eb0feefd4c02 ("vfio/ap_ops: Convert to use vfio_register_group_dev()")
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
> drivers/s390/crypto/vfio_ap_ops.c | 2 ++
> 1 file changed, 2 insertions(+)
Hi Tony, Halil, Jason (H),
Any acks for this one? Thanks,
Alex
> v2: Fix corrupted diff
> v1: https://lore.kernel.org/all/0-v1-3a05c6000668+2ce62-ap_uninit_jgg@nvidia.com/
>
> diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c
> index 2347808fa3e427..54bb0c22e8020e 100644
> --- a/drivers/s390/crypto/vfio_ap_ops.c
> +++ b/drivers/s390/crypto/vfio_ap_ops.c
> @@ -360,6 +360,7 @@ static int vfio_ap_mdev_probe(struct mdev_device *mdev)
> mutex_lock(&matrix_dev->lock);
> list_del(&matrix_mdev->node);
> mutex_unlock(&matrix_dev->lock);
> + vfio_uninit_group_dev(&matrix_mdev->vdev);
> kfree(matrix_mdev);
> err_dec_available:
> atomic_inc(&matrix_dev->available_instances);
> @@ -375,6 +376,7 @@ static void vfio_ap_mdev_remove(struct mdev_device *mdev)
> mutex_lock(&matrix_dev->lock);
> vfio_ap_mdev_reset_queues(matrix_mdev);
> list_del(&matrix_mdev->node);
> + vfio_uninit_group_dev(&matrix_mdev->vdev);
> kfree(matrix_mdev);
> atomic_inc(&matrix_dev->available_instances);
> mutex_unlock(&matrix_dev->lock);
>
> base-commit: ea870730d83fc13a5fa2bd0e175176d7ac8a400a
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] vfio/ap_ops: Add missed vfio_uninit_group_dev()
2021-09-16 18:51 ` Alex Williamson
@ 2021-09-20 21:26 ` Tony Krowiak
2021-09-20 23:19 ` Jason Gunthorpe
0 siblings, 1 reply; 5+ messages in thread
From: Tony Krowiak @ 2021-09-20 21:26 UTC (permalink / raw)
To: Alex Williamson, Jason Gunthorpe
Cc: Christian Borntraeger, Harald Freudenberger, Vasily Gorbik,
Heiko Carstens, Jason Herne, linux-s390, Halil Pasic,
Cornelia Huck, Christoph Hellwig, kvm
On 9/16/21 2:51 PM, Alex Williamson wrote:
> On Fri, 10 Sep 2021 20:06:30 -0300
> Jason Gunthorpe <jgg@nvidia.com> wrote:
>
>> Without this call an xarray entry is leaked when the vfio_ap device is
>> unprobed. It was missed when the below patch was rebased across the
>> dev_set patch.
>>
>> Fixes: eb0feefd4c02 ("vfio/ap_ops: Convert to use vfio_register_group_dev()")
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>> ---
>> drivers/s390/crypto/vfio_ap_ops.c | 2 ++
>> 1 file changed, 2 insertions(+)
> Hi Tony, Halil, Jason (H),
>
> Any acks for this one? Thanks,
>
> Alex
I installed this on a test system running the latest linux
code from our library and ran our test suite. I got the
following running a simple test case that assigns some
adapters and domains to a mediated device then
starts a guest using the mdev.
[Sep20 23:19] ======================================================
[ +0.000001] WARNING: possible circular locking dependency detected
[ +0.000002] 5.15.0-rc1-10318-gac08b1c68d1b-dirty #15 Not tainted
[ +0.000002] ------------------------------------------------------
[ +0.000001] nose2/1993 is trying to acquire lock:
[ +0.000002] 00000000cf2a5520 (&new_dev_set->lock){+.+.}-{3:3}, at:
vfio_uninit_group_dev+0x3c/0xf0 [vfio]
[ +0.000012]
but task is already holding lock:
[ +0.000001] 00000000c84424d8 (&matrix_dev->lock){+.+.}-{3:3}, at:
vfio_ap_mdev_remove+0x48/0xc8 [vfio_ap]
[ +0.000006]
which lock already depends on the new lock.
[ +0.000002]
the existing dependency chain (in reverse order) is:
[ +0.000001]
-> #1 (&matrix_dev->lock){+.+.}-{3:3}:
[ +0.000004] __lock_acquire+0x64c/0xc40
[ +0.000005] reacquire_held_locks+0x134/0x228
[ +0.000002] __lock_release+0xc6/0x338
[ +0.000002] lock_release+0x1b0/0x298
[ +0.000001] __mutex_unlock_slowpath+0x3a/0x2d8
[ +0.000005] vfio_ap_mdev_unset_kvm.part.0+0xa6/0xc0 [vfio_ap]
[ +0.000003] vfio_device_fops_release+0x64/0xe8 [vfio]
[ +0.000003] __fput+0xae/0x288
[ +0.000005] task_work_run+0x76/0xc0
[ +0.000005] do_exit+0x268/0x5c0
[ +0.000005] do_group_exit+0x48/0xc0
[ +0.000050] __s390x_sys_exit_group+0x2e/0x30
[ +0.000002] __do_syscall+0x1c2/0x1f0
[ +0.000004] system_call+0x78/0xa0
[ +0.000003]
-> #0 (&new_dev_set->lock){+.+.}-{3:3}:
[ +0.000004] check_prev_add+0xe0/0xed8
[ +0.000002] validate_chain+0x9ca/0xde8
[ +0.000001] __lock_acquire+0x64c/0xc40
[ +0.000002] lock_acquire.part.0+0xec/0x258
[ +0.000002] lock_acquire+0xb0/0x200
[ +0.000001] __mutex_lock+0xa2/0x8f8
[ +0.000002] mutex_lock_nested+0x32/0x40
[ +0.000002] vfio_uninit_group_dev+0x3c/0xf0 [vfio]
[ +0.000004] vfio_ap_mdev_remove+0x90/0xc8 [vfio_ap]
[ +0.000002] mdev_remove+0x32/0x58 [mdev]
[ +0.000004] __device_release_driver+0x18a/0x238
[ +0.000005] device_release_driver+0x40/0x50
[ +0.000002] bus_remove_device+0x110/0x1a0
[ +0.000002] device_del+0x19c/0x3e8
[ +0.000002] mdev_device_remove_common+0x3c/0xb0 [mdev]
[ +0.000003] mdev_device_remove+0xac/0x100 [mdev]
[ +0.000002] remove_store+0x78/0x90 [mdev]
[ +0.000003] kernfs_fop_write_iter+0x13e/0x1e0
[ +0.000004] new_sync_write+0x100/0x190
[ +0.000003] vfs_write+0x230/0x2e0
[ +0.000002] ksys_write+0x6c/0xf8
[ +0.000002] __do_syscall+0x1c2/0x1f0
[ +0.000002] system_call+0x78/0xa0
[ +0.000002]
other info that might help us debug this:
[ +0.000002] Possible unsafe locking scenario:
[ +0.000001] CPU0 CPU1
[ +0.000001] ---- ----
[ +0.000001] lock(&matrix_dev->lock);
[ +0.000002] lock(&new_dev_set->lock);
[ +0.000002] lock(&matrix_dev->lock);
[ +0.000002] lock(&new_dev_set->lock);
[ +0.000002]
*** DEADLOCK ***
[ +0.000001] 7 locks held by nose2/1993:
[ +0.000002] #0: 00000000b373af00 (&f->f_pos_lock){+.+.}-{3:3}, at:
__fdget_pos+0x6a/0x80
[ +0.000007] #1: 00000000978e4498 (sb_writers#4){.+.+}-{0:0}, at:
ksys_write+0x6c/0xf8
[ +0.000005] #2: 00000000b4b89490 (&of->mutex){+.+.}-{3:3}, at:
kernfs_fop_write_iter+0x102/0x1e0
[ +0.000006] #3: 00000000ca215210 (kn->active#215){++++}-{0:0}, at:
sysfs_remove_file_self+0x42/0x78
[ +0.000006] #4: 00000000cb7d85b8 (&parent->unreg_sem){++++}-{3:3},
at: mdev_device_remove+0x9c/0x100 [mdev]
[ +0.000005] #5: 00000000cb188990 (&dev->mutex){....}-{3:3}, at:
device_release_driver+0x32/0x50
[ +0.000006] #6: 00000000c84424d8 (&matrix_dev->lock){+.+.}-{3:3}, at:
vfio_ap_mdev_remove+0x48/0xc8 [vfio_ap]
[ +0.000005]
stack backtrace:
[ +0.000002] CPU: 6 PID: 1993 Comm: nose2 Not tainted
5.15.0-rc1-10318-gac08b1c68d1b-dirty #15
[ +0.000003] Hardware name: IBM 8561 T01 701 (LPAR)
[ +0.000002] Call Trace:
[ +0.000001] [<0000000328775056>] dump_stack_lvl+0x8e/0xc8
[ +0.000003] [<0000000327b4b648>] check_noncircular+0x168/0x188
[ +0.000003] [<0000000327b4c708>] check_prev_add+0xe0/0xed8
[ +0.000002] [<0000000327b4deca>] validate_chain+0x9ca/0xde8
[ +0.000002] [<0000000327b50ec4>] __lock_acquire+0x64c/0xc40
[ +0.000002] [<0000000327b4facc>] lock_acquire.part.0+0xec/0x258
[ +0.000002] [<0000000327b4fce8>] lock_acquire+0xb0/0x200
[ +0.000002] [<00000003287839d2>] __mutex_lock+0xa2/0x8f8
[ +0.000002] [<000000032878425a>] mutex_lock_nested+0x32/0x40
[ +0.000002] [<000003ff801c3ce4>] vfio_uninit_group_dev+0x3c/0xf0 [vfio]
[ +0.000004] [<000003ff805e0000>] vfio_ap_mdev_remove+0x90/0xc8 [vfio_ap]
[ +0.000003] [<000003ff802339fa>] mdev_remove+0x32/0x58 [mdev]
[ +0.000003] [<00000003283e9c92>] __device_release_driver+0x18a/0x238
[ +0.000003] [<00000003283e9d80>] device_release_driver+0x40/0x50
[ +0.000003] [<00000003283e8e28>] bus_remove_device+0x110/0x1a0
[ +0.000002] [<00000003283e2cb4>] device_del+0x19c/0x3e8
[ +0.000002] [<000003ff8023270c>] mdev_device_remove_common+0x3c/0xb0
[mdev]
[ +0.000003] [<000003ff80232fcc>] mdev_device_remove+0xac/0x100 [mdev]
[ +0.000003] [<000003ff80233220>] remove_store+0x78/0x90 [mdev]
[ +0.000003] [<0000000327eefaae>] kernfs_fop_write_iter+0x13e/0x1e0
[ +0.000003] [<0000000327dfa160>] new_sync_write+0x100/0x190
[ +0.000002] [<0000000327dfd038>] vfs_write+0x230/0x2e0
[ +0.000003] [<0000000327dfd354>] ksys_write+0x6c/0xf8
[ +0.000002] [<00000003287787b2>] __do_syscall+0x1c2/0x1f0
[ +0.000002] [<000000032878b178>] system_call+0x78/0xa0
[ +0.000003] INFO: lockdep is turned off.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] vfio/ap_ops: Add missed vfio_uninit_group_dev()
2021-09-20 21:26 ` Tony Krowiak
@ 2021-09-20 23:19 ` Jason Gunthorpe
2021-09-21 0:40 ` Tony Krowiak
0 siblings, 1 reply; 5+ messages in thread
From: Jason Gunthorpe @ 2021-09-20 23:19 UTC (permalink / raw)
To: Tony Krowiak
Cc: Alex Williamson, Christian Borntraeger, Harald Freudenberger,
Vasily Gorbik, Heiko Carstens, Jason Herne, linux-s390,
Halil Pasic, Cornelia Huck, Christoph Hellwig, kvm
On Mon, Sep 20, 2021 at 05:26:25PM -0400, Tony Krowiak wrote:
>
>
> On 9/16/21 2:51 PM, Alex Williamson wrote:
> > On Fri, 10 Sep 2021 20:06:30 -0300
> > Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > > Without this call an xarray entry is leaked when the vfio_ap device is
> > > unprobed. It was missed when the below patch was rebased across the
> > > dev_set patch.
> > >
> > > Fixes: eb0feefd4c02 ("vfio/ap_ops: Convert to use vfio_register_group_dev()")
> > > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > drivers/s390/crypto/vfio_ap_ops.c | 2 ++
> > > 1 file changed, 2 insertions(+)
> > Hi Tony, Halil, Jason (H),
> >
> > Any acks for this one? Thanks,
> >
> > Alex
>
> I installed this on a test system running the latest linux
> code from our library and ran our test suite. I got the
> following running a simple test case that assigns some
> adapters and domains to a mediated device then
> starts a guest using the mdev.
Oh, neat. There is no reason for this stuff to be in the
matrix_dev->lock, it should be symmetrical with the error unwind in
probe.
I'll resend it.
Thanks,
Jason
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH v2] vfio/ap_ops: Add missed vfio_uninit_group_dev()
2021-09-20 23:19 ` Jason Gunthorpe
@ 2021-09-21 0:40 ` Tony Krowiak
0 siblings, 0 replies; 5+ messages in thread
From: Tony Krowiak @ 2021-09-21 0:40 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Alex Williamson, Christian Borntraeger, Harald Freudenberger,
Vasily Gorbik, Heiko Carstens, Jason Herne, linux-s390,
Halil Pasic, Cornelia Huck, Christoph Hellwig, kvm
On 9/20/21 7:19 PM, Jason Gunthorpe wrote:
> On Mon, Sep 20, 2021 at 05:26:25PM -0400, Tony Krowiak wrote:
>>
>> On 9/16/21 2:51 PM, Alex Williamson wrote:
>>> On Fri, 10 Sep 2021 20:06:30 -0300
>>> Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>
>>>> Without this call an xarray entry is leaked when the vfio_ap device is
>>>> unprobed. It was missed when the below patch was rebased across the
>>>> dev_set patch.
>>>>
>>>> Fixes: eb0feefd4c02 ("vfio/ap_ops: Convert to use vfio_register_group_dev()")
>>>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>>>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>>>> drivers/s390/crypto/vfio_ap_ops.c | 2 ++
>>>> 1 file changed, 2 insertions(+)
>>> Hi Tony, Halil, Jason (H),
>>>
>>> Any acks for this one? Thanks,
>>>
>>> Alex
>> I installed this on a test system running the latest linux
>> code from our library and ran our test suite. I got the
>> following running a simple test case that assigns some
>> adapters and domains to a mediated device then
>> starts a guest using the mdev.
> Oh, neat. There is no reason for this stuff to be in the
> matrix_dev->lock, it should be symmetrical with the error unwind in
> probe.
I moved the vfio_uninit_group_dev outside of the matrix_dev->lock
and it fixed the problem:
static void vfio_ap_mdev_remove(struct mdev_device *mdev)
mutex_lock(&matrix_dev->lock);
vfio_ap_mdev_reset_queues(matrix_mdev);
list_del(&matrix_mdev->node);
atomic_inc(&matrix_dev->available_instances);
mutex_unlock(&matrix_dev->lock);
vfio_uninit_group_dev(&matrix_mdev->vdev);
kfree(matrix_mdev);
}
>
> I'll resend it.
>
> Thanks,
> Jason
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2021-09-21 0:42 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-10 23:06 [PATCH v2] vfio/ap_ops: Add missed vfio_uninit_group_dev() Jason Gunthorpe
2021-09-16 18:51 ` Alex Williamson
2021-09-20 21:26 ` Tony Krowiak
2021-09-20 23:19 ` Jason Gunthorpe
2021-09-21 0:40 ` Tony Krowiak
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.