All of lore.kernel.org
 help / color / mirror / Atom feed
* [Patch 0/1] A bug fix to the error handling path in vfio_device_open()
@ 2022-06-20  8:54 Yi Liu
  2022-06-20  8:54 ` [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem " Yi Liu
  0 siblings, 1 reply; 10+ messages in thread
From: Yi Liu @ 2022-06-20  8:54 UTC (permalink / raw)
  To: alex.williamson; +Cc: yi.l.liu, kevin.tian, kvm, jgg

The open_count is unnecessarily placed in the group_rwsem in the error
handling path.

Yi Liu (1):
  vfio: Move "device->open_count--" out of group_rwsem in
    vfio_device_open()

 drivers/vfio/vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem in vfio_device_open()
  2022-06-20  8:54 [Patch 0/1] A bug fix to the error handling path in vfio_device_open() Yi Liu
@ 2022-06-20  8:54 ` Yi Liu
  2022-06-20 20:13   ` Matthew Rosato
  2022-06-24 14:05   ` Jason Gunthorpe
  0 siblings, 2 replies; 10+ messages in thread
From: Yi Liu @ 2022-06-20  8:54 UTC (permalink / raw)
  To: alex.williamson; +Cc: yi.l.liu, kevin.tian, kvm, jgg, Matthew Rosato

No need to protect open_count with group_rwsem

Fixes: 421cfe6596f6 ("vfio: remove VFIO_GROUP_NOTIFY_SET_KVM")

cc: Matthew Rosato <mjrosato@linux.ibm.com>
cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/vfio/vfio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 61e71c1154be..44c3bf8023ac 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -1146,10 +1146,10 @@ static struct file *vfio_device_open(struct vfio_device *device)
 	if (device->open_count == 1 && device->ops->close_device)
 		device->ops->close_device(device);
 err_undo_count:
+	up_read(&device->group->group_rwsem);
 	device->open_count--;
 	if (device->open_count == 0 && device->kvm)
 		device->kvm = NULL;
-	up_read(&device->group->group_rwsem);
 	mutex_unlock(&device->dev_set->lock);
 	module_put(device->dev->driver->owner);
 err_unassign_container:
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem in vfio_device_open()
  2022-06-20  8:54 ` [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem " Yi Liu
@ 2022-06-20 20:13   ` Matthew Rosato
  2022-06-21  1:31     ` Yi Liu
  2022-06-24 14:05   ` Jason Gunthorpe
  1 sibling, 1 reply; 10+ messages in thread
From: Matthew Rosato @ 2022-06-20 20:13 UTC (permalink / raw)
  To: Yi Liu, alex.williamson; +Cc: kevin.tian, kvm, jgg

On 6/20/22 4:54 AM, Yi Liu wrote:
> No need to protect open_count with group_rwsem
> 
> Fixes: 421cfe6596f6 ("vfio: remove VFIO_GROUP_NOTIFY_SET_KVM")
> 
> cc: Matthew Rosato <mjrosato@linux.ibm.com>
> cc: Jason Gunthorpe <jgg@nvidia.com>
> Signed-off-by: Yi Liu <yi.l.liu@intel.com>

Seems pretty harmless as-is, but you are correct group_rwsem can be 
dropped earlier; we do not protect the count with group_rwsem elsewhere 
(see vfio_device_fops_release as a comparison, where we already drop 
group_rwsem before open_count--)

FWIW, this change now also drops group_rswem before setting device->kvm 
= NULL, but that's also OK (again, just like vfio_device_fops_release) 
-- While the setting of device->kvm before open_device is technically 
done while holding the group_rwsem, this is done to protect the group 
kvm value we are copying from, and we should not be relying on that to 
protect the contents of device->kvm; instead we assume this value will 
not change until after the device is closed and while under the 
dev_set->lock.

Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>

> ---
>   drivers/vfio/vfio.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 61e71c1154be..44c3bf8023ac 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -1146,10 +1146,10 @@ static struct file *vfio_device_open(struct vfio_device *device)
>   	if (device->open_count == 1 && device->ops->close_device)
>   		device->ops->close_device(device);
>   err_undo_count:
> +	up_read(&device->group->group_rwsem);
>   	device->open_count--;
>   	if (device->open_count == 0 && device->kvm)
>   		device->kvm = NULL;
> -	up_read(&device->group->group_rwsem);
>   	mutex_unlock(&device->dev_set->lock);
>   	module_put(device->dev->driver->owner);
>   err_unassign_container:


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem in vfio_device_open()
  2022-06-20 20:13   ` Matthew Rosato
@ 2022-06-21  1:31     ` Yi Liu
  2022-06-21  2:49       ` Tian, Kevin
  0 siblings, 1 reply; 10+ messages in thread
From: Yi Liu @ 2022-06-21  1:31 UTC (permalink / raw)
  To: Matthew Rosato, alex.williamson; +Cc: kevin.tian, kvm, jgg

On 2022/6/21 04:13, Matthew Rosato wrote:
> On 6/20/22 4:54 AM, Yi Liu wrote:
>> No need to protect open_count with group_rwsem
>>
>> Fixes: 421cfe6596f6 ("vfio: remove VFIO_GROUP_NOTIFY_SET_KVM")
>>
>> cc: Matthew Rosato <mjrosato@linux.ibm.com>
>> cc: Jason Gunthorpe <jgg@nvidia.com>
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> 
> Seems pretty harmless as-is, but you are correct group_rwsem can be dropped 
> earlier; we do not protect the count with group_rwsem elsewhere (see 
> vfio_device_fops_release as a comparison, where we already drop group_rwsem 
> before open_count--)

yes. this is exactly how I found it. Normally, I compare the err handling 
path with the release function to see if they are aligned. :-)

> FWIW, this change now also drops group_rswem before setting device->kvm = 
> NULL, but that's also OK (again, just like vfio_device_fops_release) -- 
> While the setting of device->kvm before open_device is technically done 
> while holding the group_rwsem, this is done to protect the group kvm value 
> we are copying from, and we should not be relying on that to protect the 
> contents of device->kvm; instead we assume this value will not change until 
> after the device is closed and while under the dev_set->lock.

yes. set device->kvm to be NULL has no need to hold group_rwsem. BTW. I
also doubt whether the device->ops->open_device(device) and 
device->ops->close_device(device) should be protected by group_rwsem or
not. seems not, right? group_rwsem protects the fields under vfio_group.
For the open_device/close_device() device->dev_set->lock is enough. Maybe
another nit fix.

> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
> 
>> ---
>>   drivers/vfio/vfio.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
>> index 61e71c1154be..44c3bf8023ac 100644
>> --- a/drivers/vfio/vfio.c
>> +++ b/drivers/vfio/vfio.c
>> @@ -1146,10 +1146,10 @@ static struct file *vfio_device_open(struct 
>> vfio_device *device)
>>       if (device->open_count == 1 && device->ops->close_device)
>>           device->ops->close_device(device);
>>   err_undo_count:
>> +    up_read(&device->group->group_rwsem);
>>       device->open_count--;
>>       if (device->open_count == 0 && device->kvm)
>>           device->kvm = NULL;
>> -    up_read(&device->group->group_rwsem);
>>       mutex_unlock(&device->dev_set->lock);
>>       module_put(device->dev->driver->owner);
>>   err_unassign_container:
> 

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem in vfio_device_open()
  2022-06-21  1:31     ` Yi Liu
@ 2022-06-21  2:49       ` Tian, Kevin
  2022-06-21  2:59         ` Yi Liu
  0 siblings, 1 reply; 10+ messages in thread
From: Tian, Kevin @ 2022-06-21  2:49 UTC (permalink / raw)
  To: Liu, Yi L, Matthew Rosato, alex.williamson; +Cc: kvm, jgg

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Tuesday, June 21, 2022 9:32 AM
> 
> On 2022/6/21 04:13, Matthew Rosato wrote:
> > On 6/20/22 4:54 AM, Yi Liu wrote:
> >> No need to protect open_count with group_rwsem
> >>
> >> Fixes: 421cfe6596f6 ("vfio: remove VFIO_GROUP_NOTIFY_SET_KVM")
> >>
> >> cc: Matthew Rosato <mjrosato@linux.ibm.com>
> >> cc: Jason Gunthorpe <jgg@nvidia.com>
> >> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> >
> > Seems pretty harmless as-is, but you are correct group_rwsem can be
> dropped
> > earlier; we do not protect the count with group_rwsem elsewhere (see
> > vfio_device_fops_release as a comparison, where we already drop
> group_rwsem
> > before open_count--)
> 
> yes. this is exactly how I found it. Normally, I compare the err handling
> path with the release function to see if they are aligned. :-)

In this case we don't need a FIX tag. It's kind of optimization.

> 
> > FWIW, this change now also drops group_rswem before setting device-
> >kvm =
> > NULL, but that's also OK (again, just like vfio_device_fops_release) --
> > While the setting of device->kvm before open_device is technically done
> > while holding the group_rwsem, this is done to protect the group kvm
> value
> > we are copying from, and we should not be relying on that to protect the
> > contents of device->kvm; instead we assume this value will not change until
> > after the device is closed and while under the dev_set->lock.
> 
> yes. set device->kvm to be NULL has no need to hold group_rwsem. BTW. I
> also doubt whether the device->ops->open_device(device) and
> device->ops->close_device(device) should be protected by group_rwsem or
> not. seems not, right? group_rwsem protects the fields under vfio_group.
> For the open_device/close_device() device->dev_set->lock is enough. Maybe
> another nit fix.
> 

group->rwsem is to protect device->group->kvm from being changed
by vfio_file_set_kvm() before it is copied to device->kvm.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem in vfio_device_open()
  2022-06-21  2:49       ` Tian, Kevin
@ 2022-06-21  2:59         ` Yi Liu
  2022-06-21  3:26           ` Tian, Kevin
  0 siblings, 1 reply; 10+ messages in thread
From: Yi Liu @ 2022-06-21  2:59 UTC (permalink / raw)
  To: Tian, Kevin, Matthew Rosato, alex.williamson; +Cc: kvm, jgg

On 2022/6/21 10:49, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Tuesday, June 21, 2022 9:32 AM
>>
>> On 2022/6/21 04:13, Matthew Rosato wrote:
>>> On 6/20/22 4:54 AM, Yi Liu wrote:
>>>> No need to protect open_count with group_rwsem
>>>>
>>>> Fixes: 421cfe6596f6 ("vfio: remove VFIO_GROUP_NOTIFY_SET_KVM")
>>>>
>>>> cc: Matthew Rosato <mjrosato@linux.ibm.com>
>>>> cc: Jason Gunthorpe <jgg@nvidia.com>
>>>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>>>
>>> Seems pretty harmless as-is, but you are correct group_rwsem can be
>> dropped
>>> earlier; we do not protect the count with group_rwsem elsewhere (see
>>> vfio_device_fops_release as a comparison, where we already drop
>> group_rwsem
>>> before open_count--)
>>
>> yes. this is exactly how I found it. Normally, I compare the err handling
>> path with the release function to see if they are aligned. :-)
> 
> In this case we don't need a FIX tag. It's kind of optimization.

ok.

>>
>>> FWIW, this change now also drops group_rswem before setting device-
>>> kvm =
>>> NULL, but that's also OK (again, just like vfio_device_fops_release) --
>>> While the setting of device->kvm before open_device is technically done
>>> while holding the group_rwsem, this is done to protect the group kvm
>> value
>>> we are copying from, and we should not be relying on that to protect the
>>> contents of device->kvm; instead we assume this value will not change until
>>> after the device is closed and while under the dev_set->lock.
>>
>> yes. set device->kvm to be NULL has no need to hold group_rwsem. BTW. I
>> also doubt whether the device->ops->open_device(device) and
>> device->ops->close_device(device) should be protected by group_rwsem or
>> not. seems not, right? group_rwsem protects the fields under vfio_group.
>> For the open_device/close_device() device->dev_set->lock is enough. Maybe
>> another nit fix.
>>
> 
> group->rwsem is to protect device->group->kvm from being changed
> by vfio_file_set_kvm() before it is copied to device->kvm.

yes. this is why vfio_device_open() holds the read lock of group_rwsem 
around the device->group->kvm copy. However, for the open_device(), 
callback, I don't think it is necessary to be protected by the group_rwsem
lock.

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* RE: [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem in vfio_device_open()
  2022-06-21  2:59         ` Yi Liu
@ 2022-06-21  3:26           ` Tian, Kevin
  2022-06-21  3:35             ` Yi Liu
  0 siblings, 1 reply; 10+ messages in thread
From: Tian, Kevin @ 2022-06-21  3:26 UTC (permalink / raw)
  To: Liu, Yi L, Matthew Rosato, alex.williamson; +Cc: kvm, jgg

> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Tuesday, June 21, 2022 10:59 AM
> >>
> >>> FWIW, this change now also drops group_rswem before setting device-
> >>> kvm =
> >>> NULL, but that's also OK (again, just like vfio_device_fops_release) --
> >>> While the setting of device->kvm before open_device is technically done
> >>> while holding the group_rwsem, this is done to protect the group kvm
> >> value
> >>> we are copying from, and we should not be relying on that to protect the
> >>> contents of device->kvm; instead we assume this value will not change
> until
> >>> after the device is closed and while under the dev_set->lock.
> >>
> >> yes. set device->kvm to be NULL has no need to hold group_rwsem. BTW.
> I
> >> also doubt whether the device->ops->open_device(device) and
> >> device->ops->close_device(device) should be protected by group_rwsem
> or
> >> not. seems not, right? group_rwsem protects the fields under vfio_group.
> >> For the open_device/close_device() device->dev_set->lock is enough.
> Maybe
> >> another nit fix.
> >>
> >
> > group->rwsem is to protect device->group->kvm from being changed
> > by vfio_file_set_kvm() before it is copied to device->kvm.
> 
> yes. this is why vfio_device_open() holds the read lock of group_rwsem
> around the device->group->kvm copy. However, for the open_device(),
> callback, I don't think it is necessary to be protected by the group_rwsem
> lock.
> 

The kvm object could be freed after device->kvm is set, if
group_rwsem is not held when calling open_device(). Then you'll
hit another use-after-free bug when mdev driver tries to obtain
a reference on kvm.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem in vfio_device_open()
  2022-06-21  3:26           ` Tian, Kevin
@ 2022-06-21  3:35             ` Yi Liu
  0 siblings, 0 replies; 10+ messages in thread
From: Yi Liu @ 2022-06-21  3:35 UTC (permalink / raw)
  To: Tian, Kevin, Matthew Rosato, alex.williamson; +Cc: kvm, jgg

On 2022/6/21 11:26, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Tuesday, June 21, 2022 10:59 AM
>>>>
>>>>> FWIW, this change now also drops group_rswem before setting device-
>>>>> kvm =
>>>>> NULL, but that's also OK (again, just like vfio_device_fops_release) --
>>>>> While the setting of device->kvm before open_device is technically done
>>>>> while holding the group_rwsem, this is done to protect the group kvm
>>>> value
>>>>> we are copying from, and we should not be relying on that to protect the
>>>>> contents of device->kvm; instead we assume this value will not change
>> until
>>>>> after the device is closed and while under the dev_set->lock.
>>>>
>>>> yes. set device->kvm to be NULL has no need to hold group_rwsem. BTW.
>> I
>>>> also doubt whether the device->ops->open_device(device) and
>>>> device->ops->close_device(device) should be protected by group_rwsem
>> or
>>>> not. seems not, right? group_rwsem protects the fields under vfio_group.
>>>> For the open_device/close_device() device->dev_set->lock is enough.
>> Maybe
>>>> another nit fix.
>>>>
>>>
>>> group->rwsem is to protect device->group->kvm from being changed
>>> by vfio_file_set_kvm() before it is copied to device->kvm.
>>
>> yes. this is why vfio_device_open() holds the read lock of group_rwsem
>> around the device->group->kvm copy. However, for the open_device(),
>> callback, I don't think it is necessary to be protected by the group_rwsem
>> lock.
>>
> 
> The kvm object could be freed after device->kvm is set, if
> group_rwsem is not held when calling open_device(). Then you'll
> hit another use-after-free bug when mdev driver tries to obtain
> a reference on kvm.

aha. I see. so group_rwsem prevents the kvm object free as such a thread
should be blocked in the vfio_file_set_kvm() if the vfio_device_open() has
held the group_rwsem. Hence kvm object is safe to be used. thanks!

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem in vfio_device_open()
  2022-06-20  8:54 ` [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem " Yi Liu
  2022-06-20 20:13   ` Matthew Rosato
@ 2022-06-24 14:05   ` Jason Gunthorpe
  2022-06-27  7:45     ` Yi Liu
  1 sibling, 1 reply; 10+ messages in thread
From: Jason Gunthorpe @ 2022-06-24 14:05 UTC (permalink / raw)
  To: Yi Liu; +Cc: alex.williamson, kevin.tian, kvm, Matthew Rosato

On Mon, Jun 20, 2022 at 01:54:59AM -0700, Yi Liu wrote:
> No need to protect open_count with group_rwsem

You should explain why this is a good change as you did in the emails
here, and no cover letter required for single patches
 
> Fixes: 421cfe6596f6 ("vfio: remove VFIO_GROUP_NOTIFY_SET_KVM")
> 

Extra space after fixes line.
No Fixes line for things that are not bugs.

> cc: Matthew Rosato <mjrosato@linux.ibm.com>
> cc: Jason Gunthorpe <jgg@nvidia.com>

Cc:

> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
> ---
>  drivers/vfio/vfio.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

It seems OK but a bit unnecessary.

Jason

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem in vfio_device_open()
  2022-06-24 14:05   ` Jason Gunthorpe
@ 2022-06-27  7:45     ` Yi Liu
  0 siblings, 0 replies; 10+ messages in thread
From: Yi Liu @ 2022-06-27  7:45 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: alex.williamson, kevin.tian, kvm, Matthew Rosato

On 2022/6/24 22:05, Jason Gunthorpe wrote:
> On Mon, Jun 20, 2022 at 01:54:59AM -0700, Yi Liu wrote:
>> No need to protect open_count with group_rwsem
> 
> You should explain why this is a good change as you did in the emails
> here, and no cover letter required for single patches
>   
>> Fixes: 421cfe6596f6 ("vfio: remove VFIO_GROUP_NOTIFY_SET_KVM")
>>
> 
> Extra space after fixes line.
> No Fixes line for things that are not bugs.
> 
>> cc: Matthew Rosato <mjrosato@linux.ibm.com>
>> cc: Jason Gunthorpe <jgg@nvidia.com>
> 
> Cc:
> 
>> Signed-off-by: Yi Liu <yi.l.liu@intel.com>
>> ---
>>   drivers/vfio/vfio.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> It seems OK but a bit unnecessary.
> 

thanks, Jason. all good suggestions.I'd just want to see the
device->dev_set->lock and the group_rwsem hold/release is
symmetric in the case of err_undo_count is hit in the vfio_device_open().
Just like the order in vfio_device_fops_release().

-- 
Regards,
Yi Liu

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-06-27  7:45 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-20  8:54 [Patch 0/1] A bug fix to the error handling path in vfio_device_open() Yi Liu
2022-06-20  8:54 ` [Patch 1/1] vfio: Move "device->open_count--" out of group_rwsem " Yi Liu
2022-06-20 20:13   ` Matthew Rosato
2022-06-21  1:31     ` Yi Liu
2022-06-21  2:49       ` Tian, Kevin
2022-06-21  2:59         ` Yi Liu
2022-06-21  3:26           ` Tian, Kevin
2022-06-21  3:35             ` Yi Liu
2022-06-24 14:05   ` Jason Gunthorpe
2022-06-27  7:45     ` Yi Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.