All of lore.kernel.org
 help / color / mirror / Atom feed
* vfio/dev-assignment: potential pci_block_user_cfg_access nesting
@ 2011-08-23 13:31 Jan Kiszka
  2011-08-23 22:05 ` Alex Williamson
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kiszka @ 2011-08-23 13:31 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm

Hi Alex,

just ran into some corner case with my reanimated IRQ sharing patches
that may affect vfio as well:

How are vfio_enable/disable_intx synchronized against all other possible
spots that call pci_block_user_cfg_access?

I hit the recursion bug check in pci_block_user_cfg_access with my code
which takes the user_cfg lock like vfio does. It likely races with
pci_reset_function here - and should do so in vfio as well.

Just taking some lock would mean having to run pci_reset_function with
IRQs disabled to synchronize with the IRQ handler (not sure if that is
possible at all). Alternatively, we would have to disable the interrupt
line or deregister the IRQ while resetting. Or we perform INTx mask
manipulation in an unsynchronized fashion, resolving races with user
space differently (still need to think about this option).

Any other thoughts?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: vfio/dev-assignment: potential pci_block_user_cfg_access nesting
  2011-08-23 13:31 vfio/dev-assignment: potential pci_block_user_cfg_access nesting Jan Kiszka
@ 2011-08-23 22:05 ` Alex Williamson
  2011-08-24  9:09   ` Jan Kiszka
  0 siblings, 1 reply; 4+ messages in thread
From: Alex Williamson @ 2011-08-23 22:05 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: kvm

On Tue, 2011-08-23 at 15:31 +0200, Jan Kiszka wrote:
> Hi Alex,
> 
> just ran into some corner case with my reanimated IRQ sharing patches
> that may affect vfio as well:
> 
> How are vfio_enable/disable_intx synchronized against all other possible
> spots that call pci_block_user_cfg_access?
> 
> I hit the recursion bug check in pci_block_user_cfg_access with my code
> which takes the user_cfg lock like vfio does. It likely races with
> pci_reset_function here - and should do so in vfio as well.

So the race is that we're doing a pci_reset_function and while we've got
pci_block_user_cfg_access set, an interrupt comes in (maybe from a
device sharing the interrupt line), and we hit the BUG_ON when trying to
nest pci_block_user_cfg_access?

> Just taking some lock would mean having to run pci_reset_function with
> IRQs disabled to synchronize with the IRQ handler (not sure if that is
> possible at all). Alternatively, we would have to disable the interrupt
> line or deregister the IRQ while resetting. Or we perform INTx mask
> manipulation in an unsynchronized fashion, resolving races with user
> space differently (still need to think about this option).
> 
> Any other thoughts?

I think this is a bit easier for vfio since the reset is already routed
through a vfio ioctl.  We can just use a mutex between the two, reset
would wait on the mutex while the interrupt handler would skip masking
of a shared interrupt if it can't get the mutex (hopefully the interrupt
is really for a shared device or we squelch it via the reset before we
trigger the spurious interrupt counter).

I think the only path for kvm assignment that doesn't involve also
rerouting the reset through a kvm ioctl would have to be avoiding the
problem in userspace.  We'd have to unregister the interrupt handler,
reset, then re-register.  That sounds pretty heavy, but the reset is
already a slow process.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: vfio/dev-assignment: potential pci_block_user_cfg_access nesting
  2011-08-23 22:05 ` Alex Williamson
@ 2011-08-24  9:09   ` Jan Kiszka
  2011-08-24 15:10     ` Alex Williamson
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kiszka @ 2011-08-24  9:09 UTC (permalink / raw)
  To: Alex Williamson; +Cc: kvm

On 2011-08-24 00:05, Alex Williamson wrote:
> On Tue, 2011-08-23 at 15:31 +0200, Jan Kiszka wrote:
>> Hi Alex,
>>
>> just ran into some corner case with my reanimated IRQ sharing patches
>> that may affect vfio as well:
>>
>> How are vfio_enable/disable_intx synchronized against all other possible
>> spots that call pci_block_user_cfg_access?
>>
>> I hit the recursion bug check in pci_block_user_cfg_access with my code
>> which takes the user_cfg lock like vfio does. It likely races with
>> pci_reset_function here - and should do so in vfio as well.
> 
> So the race is that we're doing a pci_reset_function and while we've got
> pci_block_user_cfg_access set, an interrupt comes in (maybe from a
> device sharing the interrupt line), and we hit the BUG_ON when trying to
> nest pci_block_user_cfg_access?

Most probably the scenario I was seeing, but I didn't debugged it in all
details as it already locked up my notebook twice.

> 
>> Just taking some lock would mean having to run pci_reset_function with
>> IRQs disabled to synchronize with the IRQ handler (not sure if that is
>> possible at all). Alternatively, we would have to disable the interrupt
>> line or deregister the IRQ while resetting. Or we perform INTx mask
>> manipulation in an unsynchronized fashion, resolving races with user
>> space differently (still need to think about this option).
>>
>> Any other thoughts?
> 
> I think this is a bit easier for vfio since the reset is already routed
> through a vfio ioctl.  We can just use a mutex between the two, reset
> would wait on the mutex while the interrupt handler would skip masking
> of a shared interrupt if it can't get the mutex (hopefully the interrupt
> is really for a shared device or we squelch it via the reset before we
> trigger the spurious interrupt counter).
> 
> I think the only path for kvm assignment that doesn't involve also
> rerouting the reset through a kvm ioctl would have to be avoiding the
> problem in userspace.  We'd have to unregister the interrupt handler,
> reset, then re-register.  That sounds pretty heavy, but the reset is
> already a slow process.  Thanks,

I don't think we can allow userspace to potentially crash the host.

Anyway, this problem turns out to be way more generic. Just run two
"echo 1 > /sys/bus/pci/.../reset" loops on the same device in parallel.
But be warned, you will have to reboot that box afterward.

Maybe this very creative interface of pci_block_user_cfg_access was once
OK when only the IPR SCSI driver used it. But by the time it grew beyond
that use case, it became hopelessly broken (well, open-coded
locking...). We need to redesign it, synchronizing users that can sleep
via a simple mutex and addressing access to the status/command word
separately via an IRQ-save spinlock (as we need that service in hard IRQ
handlers).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: vfio/dev-assignment: potential pci_block_user_cfg_access nesting
  2011-08-24  9:09   ` Jan Kiszka
@ 2011-08-24 15:10     ` Alex Williamson
  0 siblings, 0 replies; 4+ messages in thread
From: Alex Williamson @ 2011-08-24 15:10 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: kvm

On Wed, 2011-08-24 at 11:09 +0200, Jan Kiszka wrote:
> On 2011-08-24 00:05, Alex Williamson wrote:
> > On Tue, 2011-08-23 at 15:31 +0200, Jan Kiszka wrote:
> >> Hi Alex,
> >>
> >> just ran into some corner case with my reanimated IRQ sharing patches
> >> that may affect vfio as well:
> >>
> >> How are vfio_enable/disable_intx synchronized against all other possible
> >> spots that call pci_block_user_cfg_access?
> >>
> >> I hit the recursion bug check in pci_block_user_cfg_access with my code
> >> which takes the user_cfg lock like vfio does. It likely races with
> >> pci_reset_function here - and should do so in vfio as well.
> > 
> > So the race is that we're doing a pci_reset_function and while we've got
> > pci_block_user_cfg_access set, an interrupt comes in (maybe from a
> > device sharing the interrupt line), and we hit the BUG_ON when trying to
> > nest pci_block_user_cfg_access?
> 
> Most probably the scenario I was seeing, but I didn't debugged it in all
> details as it already locked up my notebook twice.
> 
> > 
> >> Just taking some lock would mean having to run pci_reset_function with
> >> IRQs disabled to synchronize with the IRQ handler (not sure if that is
> >> possible at all). Alternatively, we would have to disable the interrupt
> >> line or deregister the IRQ while resetting. Or we perform INTx mask
> >> manipulation in an unsynchronized fashion, resolving races with user
> >> space differently (still need to think about this option).
> >>
> >> Any other thoughts?
> > 
> > I think this is a bit easier for vfio since the reset is already routed
> > through a vfio ioctl.  We can just use a mutex between the two, reset
> > would wait on the mutex while the interrupt handler would skip masking
> > of a shared interrupt if it can't get the mutex (hopefully the interrupt
> > is really for a shared device or we squelch it via the reset before we
> > trigger the spurious interrupt counter).
> > 
> > I think the only path for kvm assignment that doesn't involve also
> > rerouting the reset through a kvm ioctl would have to be avoiding the
> > problem in userspace.  We'd have to unregister the interrupt handler,
> > reset, then re-register.  That sounds pretty heavy, but the reset is
> > already a slow process.  Thanks,
> 
> I don't think we can allow userspace to potentially crash the host.
> 
> Anyway, this problem turns out to be way more generic. Just run two
> "echo 1 > /sys/bus/pci/.../reset" loops on the same device in parallel.
> But be warned, you will have to reboot that box afterward.
> 
> Maybe this very creative interface of pci_block_user_cfg_access was once
> OK when only the IPR SCSI driver used it. But by the time it grew beyond
> that use case, it became hopelessly broken (well, open-coded
> locking...). We need to redesign it, synchronizing users that can sleep
> via a simple mutex and addressing access to the status/command word
> separately via an IRQ-save spinlock (as we need that service in hard IRQ
> handlers).

Yep, that sounds like the best path.  pci_block_user_cfg_access is at
best "fragile" in it's current implementation.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-08-24 15:10 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-23 13:31 vfio/dev-assignment: potential pci_block_user_cfg_access nesting Jan Kiszka
2011-08-23 22:05 ` Alex Williamson
2011-08-24  9:09   ` Jan Kiszka
2011-08-24 15:10     ` Alex Williamson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.