* Hit a deadlock: between AER and pcieport/pciehp
From: Rajat Jain @ 2015-03-12  1:48 UTC
  To: linux-pci, linux-kernel; +Cc: Bjorn Helgaas, Guenter Roeck, Rajat Jain

Hello,


I have hit a kernel deadlock on my system, which has hierarchical
hot-plug (i.e. we can hot-plug a card that itself may have a hot-plug
slot for another level of hot-pluggable add-on cards). In summary, I
see two threads, each waiting on a lock that the other has already
acquired. The locks are the (global) "pci_bus_sem" and the per-device
"device->mutex" respectively.


Thread1
 =======
 This is the pciehp worker thread. It scans a new card and, on finding
that there is a hotplug slot downstream, tries to pci_create_slot():
 pciehp_power_thread()
   -> pciehp_enable_slot()
     -> pciehp_configure_device()
       -> pci_bus_add_devices() discovers all devices including a new
hotplug slot.
         -> ....(etc)...
         -> device_attach(dev) (for the newly discovered HP slot /
downstream port)
           -> device_lock(dev) SUCCESSFULLY ACQUIRES dev->mutex for
the new slot.
         -> ....(etc)...
         -> ... (goes on)
         -> pciehp_probe(dev)
             -> __pci_hp_register()
                -> pci_create_slot()
                     -> down_write(pci_bus_sem); /* Deadlocked */

 This is what the stack looks like:
  [<ffffffff814e9923>] call_rwsem_down_write_failed+0x13/0x20
 [<ffffffff81522d4f>] pci_create_slot+0x3f/0x280
 [<ffffffff8152c030>] __pci_hp_register+0x70/0x400
 [<ffffffff8152cf49>] pciehp_probe+0x1a9/0x450
 [<ffffffff8152865d>] pcie_port_probe_service+0x3d/0x90
 [<ffffffff815c45b9>] driver_probe_device+0xf9/0x350
 [<ffffffff815c490b>] __device_attach+0x4b/0x60
 [<ffffffff815c25a6>] bus_for_each_drv+0x56/0xa0
 [<ffffffff815c4468>] device_attach+0xa8/0xc0
 [<ffffffff815c38d0>] bus_probe_device+0xb0/0xe0
 [<ffffffff815c16ce>] device_add+0x3de/0x560
 [<ffffffff815c1a2e>] device_register+0x1e/0x30
 [<ffffffff81528aef>] pcie_port_device_register+0x32f/0x510
 [<ffffffff81528eb8>] pcie_portdrv_probe+0x48/0x80
 [<ffffffff8151b17c>] pci_device_probe+0x9c/0xf0
 [<ffffffff815c45b9>] driver_probe_device+0xf9/0x350
 [<ffffffff815c490b>] __device_attach+0x4b/0x60
 [<ffffffff815c25a6>] bus_for_each_drv+0x56/0xa0
 [<ffffffff815c4468>] device_attach+0xa8/0xc0
 [<ffffffff815116c1>] pci_bus_add_device+0x41/0x70
 [<ffffffff81511a41>] pci_bus_add_devices+0x41/0x90
 [<ffffffff81511a6f>] pci_bus_add_devices+0x6f/0x90
 [<ffffffff8152e7e2>] pciehp_configure_device+0xa2/0x140
 [<ffffffff8152df08>] pciehp_enable_slot+0x188/0x2d0
 [<ffffffff8152e3d1>] pciehp_power_thread+0x2b1/0x3c0
 [<ffffffff810d92a0>] process_one_work+0x1d0/0x510
 [<ffffffff810d9cc1>] worker_thread+0x121/0x440
 [<ffffffff810df0bf>] kthread+0xef/0x110
 [<ffffffff81a4d8ac>] ret_from_fork+0x7c/0xb0
 [<ffffffffffffffff>] 0xffffffffffffffff
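
To make the ordering explicit, the locking in this path boils down to
something like the following condensed sketch (illustrative only, not
the literal kernel code; pci_bus_sem is internal to drivers/pci):

#include <linux/device.h>
#include <linux/rwsem.h>

extern struct rw_semaphore pci_bus_sem; /* declared in drivers/pci/pci.h */

/* Thread1: dev->mutex first, then pci_bus_sem (write). */
static void thread1_lock_order(struct device *dev)
{
        device_lock(dev);               /* taken in __device_attach()          */

        /* ... pcie_portdrv_probe() -> pciehp_probe() -> pci_create_slot() ... */

        down_write(&pci_bus_sem);       /* blocks if Thread2 holds it for read */
        /* slot creation */
        up_write(&pci_bus_sem);

        device_unlock(dev);
}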


 Thread2
 =======
 While the above thread is doing its work, the root port sees a
completion timeout, so the AER error-recovery worker thread kicks in to
handle that error. Since the completion timeout was detected at the
root port, the recovery code checks ALL the devices downstream to see
whether they have an error handler that needs to be called. Here is
what happens:


aer_isr()
   -> aer_isr_one_error()
     -> aer_process_err_device()
        -> ... (etc)...
          -> do_recovery()
            -> broadcast_error_message()
              -> pci_walk_bus( ..., report_error_detected,...) /*
effectively for all buses below root port */
                    -> down_read(&pci_bus_sem);  /* SUCCESSFULLY
ACQUIRES the semaphore */
                    -> report_error_detected(dev) /* for the newly
detected slot */
                         -> device_lock(dev) /* Deadlocked */

 This is what the stack looks like:
 [<ffffffff81529e7e>] report_error_detected+0x4e/0x170 <--- Waiting on
device_lock()
 [<ffffffff8151162e>] pci_walk_bus+0x4e/0xa0
 [<ffffffff81529b84>] broadcast_error_message+0xc4/0xf0
 [<ffffffff81529bed>] do_recovery+0x3d/0x280
 [<ffffffff8152a5d0>] aer_isr+0x300/0x3e0
 [<ffffffff810d92a0>] process_one_work+0x1d0/0x510
 [<ffffffff810d9cc1>] worker_thread+0x121/0x440
 [<ffffffff810df0bf>] kthread+0xef/0x110
 [<ffffffff81a4d8ac>] ret_from_fork+0x7c/0xb0
 [<ffffffffffffffff>] 0xffffffffffffffff
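
The lock order here is the mirror image of Thread1's; again, this is a
condensed, illustrative sketch rather than the literal aerdrv code:

#include <linux/device.h>
#include <linux/pci.h>

/* Thread2: pci_bus_sem (read, inside pci_walk_bus()) first, then dev->mutex. */
static int thread2_callback(struct pci_dev *dev, void *data)
{
        device_lock(&dev->dev);         /* blocks: Thread1 already owns this */
        /* ... call the driver's ->error_detected() handler ... */
        device_unlock(&dev->dev);
        return 0;
}

static void thread2_lock_order(struct pci_bus *bus)
{
        pci_walk_bus(bus, thread2_callback, NULL); /* takes down_read(&pci_bus_sem) */
}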


As a temporary workaround to let me proceed, I was thinking I could
change report_error_detected() so that completion timeout errors are
not broadcast (do we really have any drivers with AER handlers that
handle such an error? What would a handler do to fix such an error
anyway?).
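
Roughly along these lines (only a sketch of the idea, not a tested
patch; is_completion_timeout() is a placeholder for however the error
type would actually be identified, not an existing helper):

#include <linux/pci.h>

/* Sketch of the workaround inside the AER broadcast callback. */
static int report_error_detected(struct pci_dev *dev, void *data)
{
        if (is_completion_timeout(data))        /* placeholder check */
                return 0;       /* skip the handler: dev->mutex is never taken */

        device_lock(&dev->dev);
        /* ... existing ->error_detected() callback invocation ... */
        device_unlock(&dev->dev);
        return 0;
}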


But I'm not sure what the right solution might look like. I thought
about whether these locks could be taken in a consistent order to
avoid this problem, but looking at the stacks there seems to be no
other way. What do you think is the best way to fix this deadlock?

Any help or suggestions in this regard are greatly appreciated.

 Thanks,

Rajat


* Re: Hit a deadlock: between AER and pcieport/pciehp
From: Rajat Jain @ 2015-03-17 19:11 UTC
  To: linux-pci, linux-kernel
  Cc: Bjorn Helgaas, Guenter Roeck, Rajat Jain, rthirumal, sanjayj

Hello,

I was wondering if anyone has any suggestions here. I believe this is
a pretty serious deadlock, and I'm looking for ideas on what the right
way to fix it would be.

Thanks,

Rajat



* Re: Hit a deadlock: between AER and pcieport/pciehp
From: Bjorn Helgaas @ 2015-03-18  2:28 UTC
  To: Rajat Jain
  Cc: linux-pci, linux-kernel, Guenter Roeck, Rajat Jain, rthirumal,
	sanjayj, Rafael Wysocki

[+cc Rafael]

Hi Rajat,

On Tue, Mar 17, 2015 at 2:11 PM, Rajat Jain <rajatxjain@gmail.com> wrote:
> Hello,
>
> I was wondering if anyone has any suggestions here. I believe this is
> a pretty serious deadlock, and I'm looking for ideas on what the right
> way to fix it would be.

I agree, this definitely sounds like a real problem.  I'm not ignoring
it; I just haven't had time to look into it :)

After ten seconds of thought, my suggestion is to try to make this
work in a way that doesn't require taking the mutexes in two different
orders.  It might be *possible* to write code that is smart enough to
take them in different orders, but I'm pretty sure our automated lock
checking tools wouldn't be that smart.

I added Rafael because he recently did some work on PCI bus locking
and might have better ideas than I do.


* Re: Hit a deadlock: between AER and pcieport/pciehp
From: Guenter Roeck @ 2015-03-18 10:04 UTC
  To: Bjorn Helgaas
  Cc: Rajat Jain, linux-pci, linux-kernel, Rajat Jain, rthirumal,
	sanjayj, Rafael Wysocki

On Tue, Mar 17, 2015 at 09:28:24PM -0500, Bjorn Helgaas wrote:
> [+cc Rafael]
> 
> Hi Rajat,
> 
> On Tue, Mar 17, 2015 at 2:11 PM, Rajat Jain <rajatxjain@gmail.com> wrote:
> > Hello,
> >
> > I was wondering if anyone has any suggestions here. I believe this is
> > a pretty serious deadlock, and I'm looking for ideas on what the right
> > way to fix it would be.
> 
> I agree, this definitely sounds like a real problem.  I'm not ignoring
> it; I just haven't had time to look into it :)
> 
> After ten seconds of thought, my suggestion is to try to make this
> work in a way that doesn't require taking the mutexes in two different
> orders.  It might be *possible* to write code that is smart enough to
> take them in different orders, but I'm pretty sure our automated lock
> checking tools wouldn't be that smart.
> 

Assuming that pci_walk_bus() needs to hold pci_bus_sem, and that
pci_create_slot() needs to hold it as well, that may be tricky. More
likely, one simply must not call device_lock() from any function called
through pci_walk_bus().

Does report_error_detected() really need to call device_lock(),
or would get_device() be sufficient? Or even nothing?
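
For illustration, the difference would be roughly this (only a sketch
of the idea, not a tested change):

#include <linux/pci.h>

/* Sketch: pin the device with a refcount instead of taking dev->mutex. */
static int report_error_detected(struct pci_dev *dev, void *data)
{
        get_device(&dev->dev);          /* refcount only, never blocks */

        /*
         * Invoke dev->driver->err_handler->error_detected() here without
         * holding dev->mutex; whether that is safe against a concurrent
         * probe or remove is exactly the open question.
         */

        put_device(&dev->dev);
        return 0;
}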

Guenter

> I added Rafael because he recently did some work on PCI bus locking
> and might have better ideas than I do.
> 
