* hitting lockdep warning as of too early VF probe with 3.9-rc1
@ 2013-03-05 15:21 Or Gerlitz
  2013-03-06  2:43 ` Ming Lei
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Or Gerlitz @ 2013-03-05 15:21 UTC (permalink / raw)
  To: Ming Lei, Greg Kroah-Hartman, David Miller, Roland Dreier
  Cc: netdev, Yan Burman, Jack Morgenstein

Hi Ming, Greg, Roland, Dave, all..

With 3.9-rc1, we are hitting the lockdep warning below when probing virtual
functions over the mlx4 driver, where it seems that the probing of the
VF starts before the PF initialization is done.

Yan Burman from our team bisected that to be introduced by commit 
190888ac01d059e38ffe77a2291d44cafa9016fb
"driver core: fix possible missing of device probe".

Basically what happens is that the VF probe fails, and once the PF 
probing/initialization is done, the VF
is probed again and this time it succeeds.

Does anyone see anything possibly wrong with how the mlx4_core
(drivers/net/ethernet/mellanox/mlx4) driver interacts with the PCI
subsystem?

Or.


mlx4_core: Initializing 0000:04:00.0
mlx4_core 0000:04:00.0: Enabling SR-IOV with 1 VFs
pci 0000:04:00.1: [15b3:1004] type 00 class 0x028000

=============================================
[ INFO: possible recursive locking detected ]
3.9.0-rc1 #96 Not tainted
---------------------------------------------
kworker/0:1/734 is trying to acquire lock:
  ((&wfc.work)){+.+.+.}, at: [<ffffffff81066cb0>] flush_work+0x0/0x250

but task is already holding lock:
  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>] 
process_one_work+0x162/0x4c0

other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock((&wfc.work));
   lock((&wfc.work));

  *** DEADLOCK ***

  May be due to missing lock nesting notation

3 locks held by kworker/0:1/734:
  #0:  (events){.+.+.+}, at: [<ffffffff81064352>] 
process_one_work+0x162/0x4c0
  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>] 
process_one_work+0x162/0x4c0
  #2:  (&__lockdep_no_validate__){......}, at: [<ffffffff812db225>] 
device_attach+0x25/0xb0

stack backtrace:
Pid: 734, comm: kworker/0:1 Not tainted 3.9.0-rc1 #96
Call Trace:
  [<ffffffff810948ec>] validate_chain+0xdcc/0x11f0
  [<ffffffff81095150>] __lock_acquire+0x440/0xc70
  [<ffffffff81095150>] ? __lock_acquire+0x440/0xc70
  [<ffffffff810959da>] lock_acquire+0x5a/0x70
  [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
  [<ffffffff81066cf5>] flush_work+0x45/0x250
  [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
  [<ffffffff810922be>] ? mark_held_locks+0x9e/0x130
  [<ffffffff81066a96>] ? queue_work_on+0x46/0x90
  [<ffffffff810925dd>] ? trace_hardirqs_on_caller+0xfd/0x190
  [<ffffffff8109267d>] ? trace_hardirqs_on+0xd/0x10
  [<ffffffff81066f74>] work_on_cpu+0x74/0x90
  [<ffffffff81063820>] ? keventd_up+0x20/0x20
  [<ffffffff8121fd30>] ? pci_pm_prepare+0x60/0x60
  [<ffffffff811f9293>] ? cpumask_next_and+0x23/0x40
  [<ffffffff81220a1a>] pci_device_probe+0xba/0x110
  [<ffffffff812dadca>] ? driver_sysfs_add+0x7a/0xb0
  [<ffffffff812daf1f>] driver_probe_device+0x8f/0x230
  [<ffffffff812db170>] ? __driver_attach+0xb0/0xb0
  [<ffffffff812db1bb>] __device_attach+0x4b/0x60
  [<ffffffff812d9314>] bus_for_each_drv+0x64/0x90
  [<ffffffff812db298>] device_attach+0x98/0xb0
  [<ffffffff81218474>] pci_bus_add_device+0x24/0x50
  [<ffffffff81232e80>] virtfn_add+0x240/0x3e0
  [<ffffffff8146ce3d>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
  [<ffffffff812333be>] pci_enable_sriov+0x23e/0x500
  [<ffffffffa011fa1a>] __mlx4_init_one+0x5da/0xce0 [mlx4_core]
  [<ffffffffa012016d>] mlx4_init_one+0x2d/0x60 [mlx4_core]
  [<ffffffff8121fd79>] local_pci_probe+0x49/0x80
  [<ffffffff81063833>] work_for_cpu_fn+0x13/0x20
  [<ffffffff810643b8>] process_one_work+0x1c8/0x4c0
  [<ffffffff81064352>] ? process_one_work+0x162/0x4c0
  [<ffffffff81064cfb>] worker_thread+0x30b/0x430
  [<ffffffff810649f0>] ? manage_workers+0x340/0x340
  [<ffffffff8106cea6>] kthread+0xd6/0xe0
  [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
  [<ffffffff8146daac>] ret_from_fork+0x7c/0xb0
  [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
mlx4_core: Initializing 0000:04:00.1
mlx4_core 0000:04:00.1: enabling device (0000 -> 0002)
mlx4_core 0000:04:00.1: Detected virtual function - running in slave mode
mlx4_core 0000:04:00.1: Sending reset
mlx4_core 0000:04:00.1: Got slave FLRed from Communication channel (ret:0x1)
mlx4_core 0000:04:00.1: slave is currently in themiddle of FLR. 
retrying...(try num:1)
mlx4_core 0000:04:00.1: Communication channel is not idle.my toggle is 1 
(cmd:0x0)
mlx4_core 0000:04:00.1: slave is currently in themiddle of FLR. 
retrying...(try num:2)
[... repeated the same ...]
mlx4_core 0000:04:00.1: slave is currently in themiddle of FLR. 
retrying...(try num:10)
mlx4_core 0000:04:00.1: Communication channel is not idle.my toggle is 1 
(cmd:0x0)
mlx4_core 0000:04:00.1: slave driver version is not supported by the master
mlx4_core 0000:04:00.1: Communication channel is not idle.my toggle is 1 
(cmd:0x0)
mlx4_core 0000:04:00.1: Failed to initialize slave
mlx4_core: probe of 0000:04:00.1 failed with error -5
mlx4_core 0000:04:00.0: Running in master mode
mlx4_core 0000:04:00.0: FW version 2.11.500 (cmd intf rev 3), max 
commands 16
mlx4_core 0000:04:00.0: Catastrophic error buffer at 0x1f020, size 0x10, 
BAR 0
mlx4_core 0000:04:00.0: Communication vector bar:2 offset:0x800
[... probing of PF continues ...]
mlx4_core 0000:04:00.0: Started init_resource_tracker: 80 slaves
mlx4_core 0000:04:00.0: irq 83 for MSI/MSI-X
mlx4_core 0000:04:00.0: irq 84 for MSI/MSI-X
mlx4_core 0000:04:00.0: irq 85 for MSI/MSI-X
mlx4_core 0000:04:00.0: irq 86 for MSI/MSI-X
mlx4_core 0000:04:00.0: NOP command IRQ test passed
[... probing of PF ends ...]
[... probing of VF done again ...]
mlx4_core: Initializing 0000:04:00.1
mlx4_core 0000:04:00.1: enabling device (0000 -> 0002)
mlx4_core 0000:04:00.1: Detected virtual function - running in slave mode
mlx4_core 0000:04:00.1: Sending reset
mlx4_core 0000:04:00.0: Received reset from slave:1
mlx4_core 0000:04:00.1: Sending vhcr0
[... probing of VF succeeds ...]
mlx4_core 0000:04:00.1: NOP command IRQ test passed


* Re: hitting lockdep warning as of too early VF probe with 3.9-rc1
  2013-03-05 15:21 hitting lockdep warning as of too early VF probe with 3.9-rc1 Or Gerlitz
@ 2013-03-06  2:43 ` Ming Lei
  2013-03-06 20:54   ` Or Gerlitz
  2013-04-11 15:25 ` [PATCH for-3.9] pci: avoid work_on_cpu for nested SRIOV probes Michael S. Tsirkin
  2013-04-18 20:08 ` [PATCHv2 " Michael S. Tsirkin
  2 siblings, 1 reply; 16+ messages in thread
From: Ming Lei @ 2013-03-06  2:43 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Greg Kroah-Hartman, David Miller, Roland Dreier, netdev,
	Yan Burman, Jack Morgenstein

On Tue, Mar 5, 2013 at 11:21 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> Hi Ming, Greg, Roland, Dave, all..
>
> With 3.9-rc1, we are hitting the below lockdep with probing of virtual
> functions over the mlx4 driver, where it seems that the probing of the VF
> starts before the PF initialization is done.
>
> Yan Burman from our team bisected that to be introduced by commit
> 190888ac01d059e38ffe77a2291d44cafa9016fb
> "driver core: fix possible missing of device probe".

I guess the lockdep warning can still be triggered even without the commit.

> Basically what happens is that the VF probe fails, and once the PF
> probing/initialization is done, the VF
> is probed again and this time it succeeds.
>
> Anything here which people see to be possibly wrong with the mlx4_core
> (drivers/net/ethernet/mellanox/mlx4) driver interaction with the PCI
> subsystem?
>
> Or.
>
>
> mlx4_core: Initializing 0000:04:00.0
> mlx4_core 0000:04:00.0: Enabling SR-IOV with 1 VFs
> pci 0000:04:00.1: [15b3:1004] type 00 class 0x028000
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 3.9.0-rc1 #96 Not tainted
> ---------------------------------------------
> kworker/0:1/734 is trying to acquire lock:
>  ((&wfc.work)){+.+.+.}, at: [<ffffffff81066cb0>] flush_work+0x0/0x250
>
> but task is already holding lock:
>  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
> process_one_work+0x162/0x4c0
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock((&wfc.work));
>   lock((&wfc.work));
>
>  *** DEADLOCK ***
>
>  May be due to missing lock nesting notation
>
> 3 locks held by kworker/0:1/734:
>  #0:  (events){.+.+.+}, at: [<ffffffff81064352>]
> process_one_work+0x162/0x4c0
>  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
> process_one_work+0x162/0x4c0
>  #2:  (&__lockdep_no_validate__){......}, at: [<ffffffff812db225>]
> device_attach+0x25/0xb0
>
> stack backtrace:
> Pid: 734, comm: kworker/0:1 Not tainted 3.9.0-rc1 #96
> Call Trace:
>  [<ffffffff810948ec>] validate_chain+0xdcc/0x11f0
>  [<ffffffff81095150>] __lock_acquire+0x440/0xc70
>  [<ffffffff81095150>] ? __lock_acquire+0x440/0xc70
>  [<ffffffff810959da>] lock_acquire+0x5a/0x70
>  [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
>  [<ffffffff81066cf5>] flush_work+0x45/0x250
>  [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
>  [<ffffffff810922be>] ? mark_held_locks+0x9e/0x130
>  [<ffffffff81066a96>] ? queue_work_on+0x46/0x90
>  [<ffffffff810925dd>] ? trace_hardirqs_on_caller+0xfd/0x190
>  [<ffffffff8109267d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff81066f74>] work_on_cpu+0x74/0x90
>  [<ffffffff81063820>] ? keventd_up+0x20/0x20
>  [<ffffffff8121fd30>] ? pci_pm_prepare+0x60/0x60
>  [<ffffffff811f9293>] ? cpumask_next_and+0x23/0x40
>  [<ffffffff81220a1a>] pci_device_probe+0xba/0x110
>  [<ffffffff812dadca>] ? driver_sysfs_add+0x7a/0xb0
>  [<ffffffff812daf1f>] driver_probe_device+0x8f/0x230
>  [<ffffffff812db170>] ? __driver_attach+0xb0/0xb0
>  [<ffffffff812db1bb>] __device_attach+0x4b/0x60
>  [<ffffffff812d9314>] bus_for_each_drv+0x64/0x90
>  [<ffffffff812db298>] device_attach+0x98/0xb0
>  [<ffffffff81218474>] pci_bus_add_device+0x24/0x50
>  [<ffffffff81232e80>] virtfn_add+0x240/0x3e0

You are adding a new PCI device inside another PCI device's probe(),
so the new device will be probed as well. Since a PCI probe() is scheduled
through work_on_cpu(), this causes flush_work() to be called inside the
worker function, which might be a real deadlock.
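
For reference, work_on_cpu() in this era looks roughly like the sketch
below (paraphrased from kernel/workqueue.c; details may differ):

	/* paraphrased sketch of kernel/workqueue.c:work_on_cpu() around v3.9 */
	long work_on_cpu(unsigned int cpu, long (*fn)(void *), void *arg)
	{
		struct work_for_cpu wfc = { .fn = fn, .arg = arg };

		INIT_WORK_ONSTACK(&wfc.work, work_for_cpu_fn);
		schedule_work_on(cpu, &wfc.work);
		flush_work(&wfc.work);	/* waits for the probe work item */
		return wfc.ret;
	}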

I am wondering why this commit can cause the problem, since the PCI
device will be probed by its driver if there is one for it. There is no
limit on when the driver should be loaded into the system, either before
the device is added or after.

From the driver core's point of view, nothing looks wrong.

>  [<ffffffff8146ce3d>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
>  [<ffffffff812333be>] pci_enable_sriov+0x23e/0x500
>  [<ffffffffa011fa1a>] __mlx4_init_one+0x5da/0xce0 [mlx4_core]
>  [<ffffffffa012016d>] mlx4_init_one+0x2d/0x60 [mlx4_core]
>  [<ffffffff8121fd79>] local_pci_probe+0x49/0x80
>  [<ffffffff81063833>] work_for_cpu_fn+0x13/0x20
>  [<ffffffff810643b8>] process_one_work+0x1c8/0x4c0
>  [<ffffffff81064352>] ? process_one_work+0x162/0x4c0
>  [<ffffffff81064cfb>] worker_thread+0x30b/0x430
>  [<ffffffff810649f0>] ? manage_workers+0x340/0x340
>  [<ffffffff8106cea6>] kthread+0xd6/0xe0
>  [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
>  [<ffffffff8146daac>] ret_from_fork+0x7c/0xb0
>  [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
> mlx4_core: Initializing 0000:04:00.1
> mlx4_core 0000:04:00.1: enabling device (0000 -> 0002)
> mlx4_core 0000:04:00.1: Detected virtual function - running in slave mode
> mlx4_core 0000:04:00.1: Sending reset
> mlx4_core 0000:04:00.1: Got slave FLRed from Communication channel (ret:0x1)
> mlx4_core 0000:04:00.1: slave is currently in themiddle of FLR.
> retrying...(try num:1)
> mlx4_core 0000:04:00.1: Communication channel is not idle.my toggle is 1
> (cmd:0x0)
> mlx4_core 0000:04:00.1: slave is currently in themiddle of FLR.
> retrying...(try num:2)
> [... repeated the same ...]
> mlx4_core 0000:04:00.1: slave is currently in themiddle of FLR.
> retrying...(try num:10)
> mlx4_core 0000:04:00.1: Communication channel is not idle.my toggle is 1
> (cmd:0x0)
> mlx4_core 0000:04:00.1: slave driver version is not supported by the master
> mlx4_core 0000:04:00.1: Communication channel is not idle.my toggle is 1
> (cmd:0x0)
> mlx4_core 0000:04:00.1: Failed to initialize slave
> mlx4_core: probe of 0000:04:00.1 failed with error -5
> mlx4_core 0000:04:00.0: Running in master mode
> mlx4_core 0000:04:00.0: FW version 2.11.500 (cmd intf rev 3), max commands
> 16
> mlx4_core 0000:04:00.0: Catastrophic error buffer at 0x1f020, size 0x10, BAR
> 0
> mlx4_core 0000:04:00.0: Communication vector bar:2 offset:0x800
> [... probing of PF continues ...]
> mlx4_core 0000:04:00.0: Started init_resource_tracker: 80 slaves
> mlx4_core 0000:04:00.0: irq 83 for MSI/MSI-X
> mlx4_core 0000:04:00.0: irq 84 for MSI/MSI-X
> mlx4_core 0000:04:00.0: irq 85 for MSI/MSI-X
> mlx4_core 0000:04:00.0: irq 86 for MSI/MSI-X
> mlx4_core 0000:04:00.0: NOP command IRQ test passed
> [... probing of PF ends ...]
> [... probing of VF done again ...]
> mlx4_core: Initializing 0000:04:00.1
> mlx4_core 0000:04:00.1: enabling device (0000 -> 0002)
> mlx4_core 0000:04:00.1: Detected virtual function - running in slave mode
> mlx4_core 0000:04:00.1: Sending reset
> mlx4_core 0000:04:00.0: Received reset from slave:1
> mlx4_core 0000:04:00.1: Sending vhcr0
> [... probing of VF succeeds ...]
> mlx4_core 0000:04:00.1: NOP command IRQ test passed
>
>
>
>

Thanks,
--
Ming Lei


* Re: hitting lockdep warning as of too early VF probe with 3.9-rc1
  2013-03-06  2:43 ` Ming Lei
@ 2013-03-06 20:54   ` Or Gerlitz
  2013-03-07  2:03     ` Ming Lei
  0 siblings, 1 reply; 16+ messages in thread
From: Or Gerlitz @ 2013-03-06 20:54 UTC (permalink / raw)
  To: Ming Lei
  Cc: Or Gerlitz, Greg Kroah-Hartman, David Miller, Roland Dreier,
	netdev, Yan Burman, Jack Morgenstein, Liran Liss

On Wed, Mar 6, 2013 at 4:43 AM, Ming Lei <ming.lei@canonical.com> wrote:
> You are adding one new PCI device inside another PCI device's probe(),
> so the new device will be probed, since PCI probe() is scheduled by
> work_on_cpu, then cause flush_work() called inside worker function,
> which might be a real deadlock.

So if I understand correctly, you recommend somehow avoiding this nested probing?

> I am wondering why this commit can cause the problem, since the PCI
> device will be probed with its driver if there is one driver for it. There is no
> any limit on when the driver should be loaded into system, either before
> device is added or after.

FWIW, to help understand the issue - the same driver (mlx4_core) is used
by both the PF and the VF, so the VF driver is already loaded at the time
the VF is added as a new PCI device.

> From driver core view, looks no wrong things are found.

So this got me confused: you pointed at a possible deadlock, so are you
saying the deadlock would not be the result of how the driver code works,
nor of the commit we bisected?

Or.


* Re: hitting lockdep warning as of too early VF probe with 3.9-rc1
  2013-03-06 20:54   ` Or Gerlitz
@ 2013-03-07  2:03     ` Ming Lei
  2013-03-10 15:28       ` Jack Morgenstein
  2013-04-17 15:14       ` Or Gerlitz
  0 siblings, 2 replies; 16+ messages in thread
From: Ming Lei @ 2013-03-07  2:03 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Or Gerlitz, Greg Kroah-Hartman, David Miller, Roland Dreier,
	netdev, Yan Burman, Jack Morgenstein, Liran Liss

On Thu, Mar 7, 2013 at 4:54 AM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
> On Wed, Mar 6, 2013 at 4:43 AM, Ming Lei <ming.lei@canonical.com> wrote:
>> You are adding one new PCI device inside another PCI device's probe(),
>> so the new device will be probed, since PCI probe() is scheduled by
>> work_on_cpu, then cause flush_work() called inside worker function,
>> which might be a real deadlock.
>
> So if I understand correct, you recommend to somehow avoid this nested probing?

Yes, you might need to avoid the nested probing in your driver.

>
>> I am wondering why this commit can cause the problem, since the PCI
>> device will be probed with its driver if there is one driver for it. There is no
>> any limit on when the driver should be loaded into system, either before
>> device is added or after.
>
> FWIW to undertstanding the issue - the same driver (mlx4_core) is used
> by the PF and VF, so the VF driver is already loaded at the time its
> been added as new PCI device.
>
>> From driver core view, looks no wrong things are found.
>
> So this got me confused, you pointed on possible deadlock, are you
> saying the deadlock wouldn't be the result of how the driver code is
> going nor the commited we bisected?

My commit only affects the driver loading path, but your warning
is hit in the driver probe path triggered by device addition, so the lockdep
warning should still be triggered without my commit since the two paths
are totally independent, right?

Thanks,
--
Ming Lei


* Re: hitting lockdep warning as of too early VF probe with 3.9-rc1
  2013-03-07  2:03     ` Ming Lei
@ 2013-03-10 15:28       ` Jack Morgenstein
  2013-03-10 16:37         ` Greg Kroah-Hartman
                           ` (2 more replies)
  2013-04-17 15:14       ` Or Gerlitz
  1 sibling, 3 replies; 16+ messages in thread
From: Jack Morgenstein @ 2013-03-10 15:28 UTC (permalink / raw)
  To: Ming Lei
  Cc: Or Gerlitz, Or Gerlitz, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Liran Liss

Hello, Ming, Greg, Roland, Dave, all...

From a quick scan of ethernet drivers in Dave Miller's net-next git, I
notice that the following drivers (apart from the Mellanox mlx4 driver)
enable SRIOV during the PF probe:
  cisco enic (function "enic_probe")
  neterion vxge driver (function "vxge_probe")
  Solarflare efx driver (function "efx_pci_probe", which invokes "efx_sriov_init")
  emulex driver (function "be_probe" --> be_setup --> be_vf_setup)

It would seem that these drivers are susceptible to the nested probe/deadlock
race condition as well.

I believe that it is healthiest for everyone if the probe code in the kernel itself
would avoid such nested probe calls (rather than forcing vendors to deal
with this issue).  The kernel code is certainly aware
(or could easily track) that it is invoking a driver's probe function
while that same probe function has already been invoked and has not yet returned!

-Jack

On Thursday 07 March 2013 04:03, Ming Lei wrote:
> On Thu, Mar 7, 2013 at 4:54 AM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
> > On Wed, Mar 6, 2013 at 4:43 AM, Ming Lei <ming.lei@canonical.com> wrote:
> >> You are adding one new PCI device inside another PCI device's probe(),
> >> so the new device will be probed, since PCI probe() is scheduled by
> >> work_on_cpu, then cause flush_work() called inside worker function,
> >> which might be a real deadlock.
> >
> > So if I understand correct, you recommend to somehow avoid this nested probing?
> 
> Yes, you might need to avoid the nested probing in your driver.
> 
> >
> >> I am wondering why this commit can cause the problem, since the PCI
> >> device will be probed with its driver if there is one driver for it. There is no
> >> any limit on when the driver should be loaded into system, either before
> >> device is added or after.
> >
> > FWIW to undertstanding the issue - the same driver (mlx4_core) is used
> > by the PF and VF, so the VF driver is already loaded at the time its
> > been added as new PCI device.
> >
> >> From driver core view, looks no wrong things are found.
> >
> > So this got me confused, you pointed on possible deadlock, are you
> > saying the deadlock wouldn't be the result of how the driver code is
> > going nor the commited we bisected?
> 
> My commit only affects the driver loading path, but your warning
> is hit in driver probe path triggered by device addition, so the lockdep
> warning should still be triggered without my commit since the two paths
> are totally independent, right?
> 
> Thanks,
> --
> Ming Lei
> 


* Re: hitting lockdep warning as of too early VF probe with 3.9-rc1
  2013-03-10 15:28       ` Jack Morgenstein
@ 2013-03-10 16:37         ` Greg Kroah-Hartman
  2013-03-11  1:26         ` Ming Lei
  2013-03-11 20:24         ` Ben Hutchings
  2 siblings, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartman @ 2013-03-10 16:37 UTC (permalink / raw)
  To: Jack Morgenstein
  Cc: Ming Lei, Or Gerlitz, Or Gerlitz, David Miller, Roland Dreier,
	netdev, Yan Burman, Liran Liss

On Sun, Mar 10, 2013 at 05:28:50PM +0200, Jack Morgenstein wrote:
> Hello, Ming, Greg, Roland, Dave, all...
> 
> From a quick scan of ethernet drivers in Dave Miller's net-next git, I
> notice that the following drivers (apart from the Mellanox mlx4 driver)
> enable SRIOV during the PF probe:
>   cisco enic (function "enic_probe")
>   neterion vxge driver(function "vxge_probe")
>   Solarflare efx driver (function "efx_pci_probe", which invokes "efx_sriov_init")
>   emulex driver (function "be_probe" --> be_setup --> be_vf_setup)
> 
> It would seem that these drivers are susceptible to the nested probe/deadlock
> race condition as well.
> 
> I believe that it is healthiest for everyone if the probe code in the kernel itself
> would avoid such nested probe calls (rather than forcing vendors to deal
> with this issue).  The kernel code is certainly aware
> (or could easily track) that it is invoking the a driver's probe function
> while that same probe function has already been invoked and has not yet returned!

Patches to handle this are always gladly accepted.

thanks,

greg k-h


* Re: hitting lockdep warning as of too early VF probe with 3.9-rc1
  2013-03-10 15:28       ` Jack Morgenstein
  2013-03-10 16:37         ` Greg Kroah-Hartman
@ 2013-03-11  1:26         ` Ming Lei
  2013-03-11 20:24         ` Ben Hutchings
  2 siblings, 0 replies; 16+ messages in thread
From: Ming Lei @ 2013-03-11  1:26 UTC (permalink / raw)
  To: Jack Morgenstein
  Cc: Or Gerlitz, Or Gerlitz, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Liran Liss

On Sun, Mar 10, 2013 at 11:28 PM, Jack Morgenstein
<jackm@dev.mellanox.co.il> wrote:
> Hello, Ming, Greg, Roland, Dave, all...
>
> From a quick scan of ethernet drivers in Dave Miller's net-next git, I
> notice that the following drivers (apart from the Mellanox mlx4 driver)
> enable SRIOV during the PF probe:
>   cisco enic (function "enic_probe")
>   neterion vxge driver(function "vxge_probe")
>   Solarflare efx driver (function "efx_pci_probe", which invokes "efx_sriov_init")
>   emulex driver (function "be_probe" --> be_setup --> be_vf_setup)
>
> It would seem that these drivers are susceptible to the nested probe/deadlock
> race condition as well.
>
> I believe that it is healthiest for everyone if the probe code in the kernel itself
> would avoid such nested probe calls (rather than forcing vendors to deal
> with this issue).  The kernel code is certainly aware
> (or could easily track) that it is invoking the a driver's probe function
> while that same probe function has already been invoked and has not yet returned!

Generally, a nested probe() should be fine if your driver knows how to end
the nesting; USB drivers use nested probes too, and that is healthy. This
report is only about PCI driver probe, which is scheduled as a work
function, so flush_work() around a nested probe() may cause a deadlock.

IMO, fixing the problem generally might need to touch the PCI driver probe
code (pci_call_probe).
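
For reference, the relevant part of pci_call_probe() looks roughly like the
sketch below in 3.9 (paraphrased from drivers/pci/pci-driver.c; details may
differ slightly):

	/* paraphrased sketch of drivers/pci/pci-driver.c:pci_call_probe() */
	static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
				  const struct pci_device_id *id)
	{
		int error, node;
		struct drv_dev_and_id ddi = { drv, dev, id };

		/* Run the probe on a CPU local to the device's NUMA node so
		 * that memory the driver allocates is close to the device. */
		node = dev_to_node(&dev->dev);
		if (node >= 0) {
			int cpu;

			get_online_cpus();
			cpu = cpumask_any_and(cpumask_of_node(node),
					      cpu_online_mask);
			if (cpu < nr_cpu_ids)
				error = work_on_cpu(cpu, local_pci_probe, &ddi);
			else
				error = local_pci_probe(&ddi);
			put_online_cpus();
		} else
			error = local_pci_probe(&ddi);
		return error;
	}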

Thanks,
--
Ming Lei


* Re: hitting lockdep warning as of too early VF probe with 3.9-rc1
  2013-03-10 15:28       ` Jack Morgenstein
  2013-03-10 16:37         ` Greg Kroah-Hartman
  2013-03-11  1:26         ` Ming Lei
@ 2013-03-11 20:24         ` Ben Hutchings
  2 siblings, 0 replies; 16+ messages in thread
From: Ben Hutchings @ 2013-03-11 20:24 UTC (permalink / raw)
  To: Jack Morgenstein
  Cc: Ming Lei, Or Gerlitz, Or Gerlitz, Greg Kroah-Hartman,
	David Miller, Roland Dreier, netdev, Yan Burman, Liran Liss

On Sun, 2013-03-10 at 17:28 +0200, Jack Morgenstein wrote:
> Hello, Ming, Greg, Roland, Dave, all...
> 
> From a quick scan of ethernet drivers in Dave Miller's net-next git, I
> notice that the following drivers (apart from the Mellanox mlx4 driver)
> enable SRIOV during the PF probe:
>   cisco enic (function "enic_probe")
>   neterion vxge driver(function "vxge_probe")
>   Solarflare efx driver (function "efx_pci_probe", which invokes "efx_sriov_init")

It's actually called 'sfc', despite using the 'efx' prefix for
hysterical raisins.

>   emulex driver (function "be_probe" --> be_setup --> be_vf_setup)
> 
> It would seem that these drivers are susceptible to the nested probe/deadlock
> race condition as well.
> 
> I believe that it is healthiest for everyone if the probe code in the kernel itself
> would avoid such nested probe calls (rather than forcing vendors to deal
> with this issue).  The kernel code is certainly aware
> (or could easily track) that it is invoking the a driver's probe function
> while that same probe function has already been invoked and has not yet returned!
[...]

Does it make a difference whether the VF driver is the same code as the
PF driver?  I suspect not.

Since Donald Dutile's changes to allow setting the number of VFs through
sysfs, PF drivers should not need to enable VFs during their own probe
function any more.  (Though I think that it will take a non-trivial
amount of work to change that in sfc, anyway.)  This should avoid the
apparent problems with nested probes.
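
As a rough illustration only (a sketch with hypothetical names, not taken
from any particular driver), a PF driver using that sysfs path would wire up
the sriov_configure() hook instead of calling pci_enable_sriov() from its
probe function:

	#include <linux/pci.h>

	/* sketch: ordinary PF init, with no pci_enable_sriov() call here */
	static int foo_probe(struct pci_dev *pdev, const struct pci_device_id *id)
	{
		return 0;
	}

	/* invoked from a write to the sriov_numvfs sysfs attribute, well after
	 * probe() has returned, so the VF probes no longer nest inside it */
	static int foo_sriov_configure(struct pci_dev *pdev, int num_vfs)
	{
		if (num_vfs == 0) {
			pci_disable_sriov(pdev);
			return 0;
		}
		return pci_enable_sriov(pdev, num_vfs) ? : num_vfs;
	}

	static struct pci_driver foo_driver = {
		.name		 = "foo",
		.probe		 = foo_probe,
		.sriov_configure = foo_sriov_configure,
	};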

By the way, since VF net drivers will acquire the RTNL lock during their
probe functions, the PF net driver must not hold the RTNL lock while
calling pci_enable_sriov().  This allows rtnl_vfinfo_size() and
rtnl_vf_ports_fill() to race with the change to number of VFs enabled,
if the PF's net device is already registered - which will definitely be
the case if the change is being made through the new sysfs method.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


* [PATCH for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-03-05 15:21 hitting lockdep warning as of too early VF probe with 3.9-rc1 Or Gerlitz
  2013-03-06  2:43 ` Ming Lei
@ 2013-04-11 15:25 ` Michael S. Tsirkin
  2013-04-18 20:08 ` [PATCHv2 " Michael S. Tsirkin
  2 siblings, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2013-04-11 15:25 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Ming Lei, Greg Kroah-Hartman, David Miller, Roland Dreier,
	netdev, Yan Burman, Jack Morgenstein

The following lockdep report triggers since 3.9-rc1:

=============================================
[ INFO: possible recursive locking detected ]
3.9.0-rc1 #96 Not tainted
---------------------------------------------
kworker/0:1/734 is trying to acquire lock:
 ((&wfc.work)){+.+.+.}, at: [<ffffffff81066cb0>] flush_work+0x0/0x250

but task is already holding lock:
 ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
process_one_work+0x162/0x4c0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock((&wfc.work));
  lock((&wfc.work));

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by kworker/0:1/734:
 #0:  (events){.+.+.+}, at: [<ffffffff81064352>]
process_one_work+0x162/0x4c0
 #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
process_one_work+0x162/0x4c0
 #2:  (&__lockdep_no_validate__){......}, at: [<ffffffff812db225>]
device_attach+0x25/0xb0

stack backtrace:
Pid: 734, comm: kworker/0:1 Not tainted 3.9.0-rc1 #96
Call Trace:
 [<ffffffff810948ec>] validate_chain+0xdcc/0x11f0
 [<ffffffff81095150>] __lock_acquire+0x440/0xc70
 [<ffffffff81095150>] ? __lock_acquire+0x440/0xc70
 [<ffffffff810959da>] lock_acquire+0x5a/0x70
 [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
 [<ffffffff81066cf5>] flush_work+0x45/0x250
 [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
 [<ffffffff810922be>] ? mark_held_locks+0x9e/0x130
 [<ffffffff81066a96>] ? queue_work_on+0x46/0x90
 [<ffffffff810925dd>] ? trace_hardirqs_on_caller+0xfd/0x190
 [<ffffffff8109267d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81066f74>] work_on_cpu+0x74/0x90
 [<ffffffff81063820>] ? keventd_up+0x20/0x20
 [<ffffffff8121fd30>] ? pci_pm_prepare+0x60/0x60
 [<ffffffff811f9293>] ? cpumask_next_and+0x23/0x40
 [<ffffffff81220a1a>] pci_device_probe+0xba/0x110
 [<ffffffff812dadca>] ? driver_sysfs_add+0x7a/0xb0
 [<ffffffff812daf1f>] driver_probe_device+0x8f/0x230
 [<ffffffff812db170>] ? __driver_attach+0xb0/0xb0
 [<ffffffff812db1bb>] __device_attach+0x4b/0x60
 [<ffffffff812d9314>] bus_for_each_drv+0x64/0x90
 [<ffffffff812db298>] device_attach+0x98/0xb0
 [<ffffffff81218474>] pci_bus_add_device+0x24/0x50
 [<ffffffff81232e80>] virtfn_add+0x240/0x3e0
 [<ffffffff8146ce3d>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
 [<ffffffff812333be>] pci_enable_sriov+0x23e/0x500
 [<ffffffffa011fa1a>] __mlx4_init_one+0x5da/0xce0 [mlx4_core]
 [<ffffffffa012016d>] mlx4_init_one+0x2d/0x60 [mlx4_core]
 [<ffffffff8121fd79>] local_pci_probe+0x49/0x80
 [<ffffffff81063833>] work_for_cpu_fn+0x13/0x20
 [<ffffffff810643b8>] process_one_work+0x1c8/0x4c0
 [<ffffffff81064352>] ? process_one_work+0x162/0x4c0
 [<ffffffff81064cfb>] worker_thread+0x30b/0x430
 [<ffffffff810649f0>] ? manage_workers+0x340/0x340
 [<ffffffff8106cea6>] kthread+0xd6/0xe0
 [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff8146daac>] ret_from_fork+0x7c/0xb0
 [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70

The issue is that a driver, in its probe function, calls
pci_enable_sriov(), so a PF device probe causes a VF probe (AKA nested
probe).  Each probe, in pci_device_probe, is (normally) run through
work_on_cpu (this is to get the right NUMA node for memory allocated by
the driver).  In turn work_on_cpu does this internally:

        schedule_work_on(cpu, &wfc.work);
        flush_work(&wfc.work);

So if you are running a probe on CPU1 and cause another
probe on the same CPU, this will try to flush the
workqueue from inside the same workqueue, which of course
deadlocks.

Nested probing might be tricky to get right generally.

But for pci_enable_sriov, the situation is actually very simple: all VFs
naturally have the same affinity as the PF, and cpumask_any_and is actually
the same as cpumask_first_and, so it always gives us the same CPU.
So let's just detect that, and run the probing for VFs locally without a
workqueue.

This is hardly elegant, but looks to me like an appropriate quick fix
for 3.9.

Tested-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 1fa1e48..6eeb5ec 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -286,8 +286,8 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 		int cpu;
 
 		get_online_cpus();
 		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
-		if (cpu < nr_cpu_ids)
+		if (cpu != raw_smp_processor_id() && cpu < nr_cpu_ids)
 			error = work_on_cpu(cpu, local_pci_probe, &ddi);
 		else
 			error = local_pci_probe(&ddi);


* Re: hitting lockdep warning as of too early VF probe with 3.9-rc1
  2013-03-07  2:03     ` Ming Lei
  2013-03-10 15:28       ` Jack Morgenstein
@ 2013-04-17 15:14       ` Or Gerlitz
  1 sibling, 0 replies; 16+ messages in thread
From: Or Gerlitz @ 2013-04-17 15:14 UTC (permalink / raw)
  To: Ming Lei
  Cc: Or Gerlitz, Greg Kroah-Hartman, David Miller, Roland Dreier,
	netdev, Yan Burman, Jack Morgenstein, Michael S. Tsirkin,
	Don Dutile, Tejun Heo

On 07/03/2013 04:03, Ming Lei wrote:
> On Thu, Mar 7, 2013 at 4:54 AM, Or Gerlitz <or.gerlitz@gmail.com> wrote:
>> On Wed, Mar 6, 2013 at 4:43 AM, Ming Lei <ming.lei@canonical.com> wrote:
>>> You are adding one new PCI device inside another PCI device's probe(),
>>> so the new device will be probed, since PCI probe() is scheduled by
>>> work_on_cpu, then cause flush_work() called inside worker function,
>>> which might be a real deadlock.
>> So if I understand correct, you recommend to somehow avoid this nested probing?
> Yes, you might need to avoid the nested probing in your driver.
>
>>> I am wondering why this commit can cause the problem, since the PCI
>>> device will be probed with its driver if there is one driver for it. There is no
>>> any limit on when the driver should be loaded into system, either before
>>> device is added or after.
>> FWIW to undertstanding the issue - the same driver (mlx4_core) is used
>> by the PF and VF, so the VF driver is already loaded at the time its
>> been added as new PCI device.
>>
>>>  From driver core view, looks no wrong things are found.
>> So this got me confused, you pointed on possible deadlock, are you
>> saying the deadlock wouldn't be the result of how the driver code is
>> going nor the commited we bisected?
> My commit only affects the driver loading path, but your warning
> is hit in driver probe path triggered by device addition, so the lockdep
> warning should still be triggered without my commit since the two paths
> are totally independent, right?

If this is the case, I am not sure why we never saw this warning
before 3.9-rc1, which introduced your commit
190888ac01d059e38ffe77a2291d44cafa9016fb "driver core: fix possible
missing of device probe". Any rough guess?


* [PATCHv2 for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-03-05 15:21 hitting lockdep warning as of too early VF probe with 3.9-rc1 Or Gerlitz
  2013-03-06  2:43 ` Ming Lei
  2013-04-11 15:25 ` [PATCH for-3.9] pci: avoid work_on_cpu for nested SRIOV probes Michael S. Tsirkin
@ 2013-04-18 20:08 ` Michael S. Tsirkin
  2013-04-18 21:40   ` Bjorn Helgaas
  2 siblings, 1 reply; 16+ messages in thread
From: Michael S. Tsirkin @ 2013-04-18 20:08 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Ming Lei, Greg Kroah-Hartman, David Miller, Roland Dreier,
	netdev, Yan Burman, Jack Morgenstein, Tejun Heo, Bjorn Helgaas,
	linux-pci

The following lockdep report triggers since 3.9-rc1:

=============================================
[ INFO: possible recursive locking detected ]
3.9.0-rc1 #96 Not tainted
---------------------------------------------
kworker/0:1/734 is trying to acquire lock:
 ((&wfc.work)){+.+.+.}, at: [<ffffffff81066cb0>] flush_work+0x0/0x250

but task is already holding lock:
 ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
process_one_work+0x162/0x4c0

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock((&wfc.work));
  lock((&wfc.work));

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by kworker/0:1/734:
 #0:  (events){.+.+.+}, at: [<ffffffff81064352>]
process_one_work+0x162/0x4c0
 #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
process_one_work+0x162/0x4c0
 #2:  (&__lockdep_no_validate__){......}, at: [<ffffffff812db225>]
device_attach+0x25/0xb0

stack backtrace:
Pid: 734, comm: kworker/0:1 Not tainted 3.9.0-rc1 #96
Call Trace:
 [<ffffffff810948ec>] validate_chain+0xdcc/0x11f0
 [<ffffffff81095150>] __lock_acquire+0x440/0xc70
 [<ffffffff81095150>] ? __lock_acquire+0x440/0xc70
 [<ffffffff810959da>] lock_acquire+0x5a/0x70
 [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
 [<ffffffff81066cf5>] flush_work+0x45/0x250
 [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
 [<ffffffff810922be>] ? mark_held_locks+0x9e/0x130
 [<ffffffff81066a96>] ? queue_work_on+0x46/0x90
 [<ffffffff810925dd>] ? trace_hardirqs_on_caller+0xfd/0x190
 [<ffffffff8109267d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff81066f74>] work_on_cpu+0x74/0x90
 [<ffffffff81063820>] ? keventd_up+0x20/0x20
 [<ffffffff8121fd30>] ? pci_pm_prepare+0x60/0x60
 [<ffffffff811f9293>] ? cpumask_next_and+0x23/0x40
 [<ffffffff81220a1a>] pci_device_probe+0xba/0x110
 [<ffffffff812dadca>] ? driver_sysfs_add+0x7a/0xb0
 [<ffffffff812daf1f>] driver_probe_device+0x8f/0x230
 [<ffffffff812db170>] ? __driver_attach+0xb0/0xb0
 [<ffffffff812db1bb>] __device_attach+0x4b/0x60
 [<ffffffff812d9314>] bus_for_each_drv+0x64/0x90
 [<ffffffff812db298>] device_attach+0x98/0xb0
 [<ffffffff81218474>] pci_bus_add_device+0x24/0x50
 [<ffffffff81232e80>] virtfn_add+0x240/0x3e0
 [<ffffffff8146ce3d>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
 [<ffffffff812333be>] pci_enable_sriov+0x23e/0x500
 [<ffffffffa011fa1a>] __mlx4_init_one+0x5da/0xce0 [mlx4_core]
 [<ffffffffa012016d>] mlx4_init_one+0x2d/0x60 [mlx4_core]
 [<ffffffff8121fd79>] local_pci_probe+0x49/0x80
 [<ffffffff81063833>] work_for_cpu_fn+0x13/0x20
 [<ffffffff810643b8>] process_one_work+0x1c8/0x4c0
 [<ffffffff81064352>] ? process_one_work+0x162/0x4c0
 [<ffffffff81064cfb>] worker_thread+0x30b/0x430
 [<ffffffff810649f0>] ? manage_workers+0x340/0x340
 [<ffffffff8106cea6>] kthread+0xd6/0xe0
 [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
 [<ffffffff8146daac>] ret_from_fork+0x7c/0xb0
 [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70

The issue is that a driver, in its probe function, calls
pci_enable_sriov(), so a PF device probe causes a VF probe (AKA nested
probe).  Each probe, in pci_device_probe, is (normally) run through
work_on_cpu (this is to get the right NUMA node for memory allocated by
the driver).  In turn work_on_cpu does this internally:

        schedule_work_on(cpu, &wfc.work);
        flush_work(&wfc.work);

So if you are running a probe on CPU1 and cause another
probe on the same CPU, this will try to flush the
workqueue from inside the same workqueue, which causes
a lockdep warning.

Nested probing might be tricky to get right generally.

But for pci_enable_sriov, the situation is actually very simple: all VFs
naturally have the same affinity as the PF, and cpumask_any_and is actually
the same as cpumask_first_and, so it always gives us the same CPU.
So let's just detect that, and run the probing for VFs locally without a
workqueue.

This is hardly elegant, but looks to me like an appropriate quick fix
for 3.9.

Tested-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>

---

Changes from v1:
    - clarified commit log and added Ack by Tejun Heo
      patch is unchanged.

diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 1fa1e48..6eeb5ec 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -286,8 +286,8 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
 		int cpu;
 
 		get_online_cpus();
 		cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
-		if (cpu < nr_cpu_ids)
+		if (cpu != raw_smp_processor_id() && cpu < nr_cpu_ids)
 			error = work_on_cpu(cpu, local_pci_probe, &ddi);
 		else
 			error = local_pci_probe(&ddi);


* Re: [PATCHv2 for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18 20:08 ` [PATCHv2 " Michael S. Tsirkin
@ 2013-04-18 21:40   ` Bjorn Helgaas
  2013-04-18 21:57     ` Bjorn Helgaas
  0 siblings, 1 reply; 16+ messages in thread
From: Bjorn Helgaas @ 2013-04-18 21:40 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein, Tejun Heo,
	linux-pci

On Thu, Apr 18, 2013 at 2:08 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> The following lockdep report triggers since 3.9-rc1:
>
> =============================================
> [ INFO: possible recursive locking detected ]
> 3.9.0-rc1 #96 Not tainted
> ---------------------------------------------
> kworker/0:1/734 is trying to acquire lock:
>  ((&wfc.work)){+.+.+.}, at: [<ffffffff81066cb0>] flush_work+0x0/0x250
>
> but task is already holding lock:
>  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
> process_one_work+0x162/0x4c0
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock((&wfc.work));
>   lock((&wfc.work));
>
>  *** DEADLOCK ***
>
>  May be due to missing lock nesting notation
>
> 3 locks held by kworker/0:1/734:
>  #0:  (events){.+.+.+}, at: [<ffffffff81064352>]
> process_one_work+0x162/0x4c0
>  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
> process_one_work+0x162/0x4c0
>  #2:  (&__lockdep_no_validate__){......}, at: [<ffffffff812db225>]
> device_attach+0x25/0xb0
>
> stack backtrace:
> Pid: 734, comm: kworker/0:1 Not tainted 3.9.0-rc1 #96
> Call Trace:
>  [<ffffffff810948ec>] validate_chain+0xdcc/0x11f0
>  [<ffffffff81095150>] __lock_acquire+0x440/0xc70
>  [<ffffffff81095150>] ? __lock_acquire+0x440/0xc70
>  [<ffffffff810959da>] lock_acquire+0x5a/0x70
>  [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
>  [<ffffffff81066cf5>] flush_work+0x45/0x250
>  [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
>  [<ffffffff810922be>] ? mark_held_locks+0x9e/0x130
>  [<ffffffff81066a96>] ? queue_work_on+0x46/0x90
>  [<ffffffff810925dd>] ? trace_hardirqs_on_caller+0xfd/0x190
>  [<ffffffff8109267d>] ? trace_hardirqs_on+0xd/0x10
>  [<ffffffff81066f74>] work_on_cpu+0x74/0x90
>  [<ffffffff81063820>] ? keventd_up+0x20/0x20
>  [<ffffffff8121fd30>] ? pci_pm_prepare+0x60/0x60
>  [<ffffffff811f9293>] ? cpumask_next_and+0x23/0x40
>  [<ffffffff81220a1a>] pci_device_probe+0xba/0x110
>  [<ffffffff812dadca>] ? driver_sysfs_add+0x7a/0xb0
>  [<ffffffff812daf1f>] driver_probe_device+0x8f/0x230
>  [<ffffffff812db170>] ? __driver_attach+0xb0/0xb0
>  [<ffffffff812db1bb>] __device_attach+0x4b/0x60
>  [<ffffffff812d9314>] bus_for_each_drv+0x64/0x90
>  [<ffffffff812db298>] device_attach+0x98/0xb0
>  [<ffffffff81218474>] pci_bus_add_device+0x24/0x50
>  [<ffffffff81232e80>] virtfn_add+0x240/0x3e0
>  [<ffffffff8146ce3d>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
>  [<ffffffff812333be>] pci_enable_sriov+0x23e/0x500
>  [<ffffffffa011fa1a>] __mlx4_init_one+0x5da/0xce0 [mlx4_core]
>  [<ffffffffa012016d>] mlx4_init_one+0x2d/0x60 [mlx4_core]
>  [<ffffffff8121fd79>] local_pci_probe+0x49/0x80
>  [<ffffffff81063833>] work_for_cpu_fn+0x13/0x20
>  [<ffffffff810643b8>] process_one_work+0x1c8/0x4c0
>  [<ffffffff81064352>] ? process_one_work+0x162/0x4c0
>  [<ffffffff81064cfb>] worker_thread+0x30b/0x430
>  [<ffffffff810649f0>] ? manage_workers+0x340/0x340
>  [<ffffffff8106cea6>] kthread+0xd6/0xe0
>  [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
>  [<ffffffff8146daac>] ret_from_fork+0x7c/0xb0
>  [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
>
> The issue is that a driver, in it's probe function, calls
> pci_sriov_enable so a PF device probe causes VF probe (AKA nested
> probe).  Each probe in pci_device_probe which is (normally) run through
> work_on_cpu (this is to get the right numa node for memory allocated by
> the driver).  In turn work_on_cpu does this internally:
>
>         schedule_work_on(cpu, &wfc.work);
>         flush_work(&wfc.work);
>
> So if you are running probe on CPU1, and cause another
> probe on the same CPU, this will try to flush
> workqueue from inside same workqueue which causes
> a lockep warning.
>
> Nested probing might be tricky to get right generally.
>
> But for pci_sriov_enable, the situation is actually very simple: all VFs
> naturally have same affinity as the PF, and cpumask_any_and is actually
> same as cpumask_first_and, so it always gives us the same CPU.
> So let's just detect that, and run the probing for VFs locally without a
> workqueue.
>
> This is hardly elegant, but looks to me like an appropriate quick fix
> for 3.9.
>
> Tested-by: Or Gerlitz <ogerlitz@mellanox.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Acked-by: Tejun Heo <tj@kernel.org>

Thanks, Michael.  I put this in my for-linus branch:

http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=for-linus

I'll send a pull request to Linus today.

Bjorn

> ---
>
> Changes from v1:
>     - clarified commit log and added Ack by Tejun Heo
>       patch is unchanged.
>
> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> index 1fa1e48..6eeb5ec 100644
> --- a/drivers/pci/pci-driver.c
> +++ b/drivers/pci/pci-driver.c
> @@ -286,8 +286,8 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>                 int cpu;
>
>                 get_online_cpus();
>                 cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
> -               if (cpu < nr_cpu_ids)
> +               if (cpu != raw_smp_processor_id() && cpu < nr_cpu_ids)
>                         error = work_on_cpu(cpu, local_pci_probe, &ddi);
>                 else
>                         error = local_pci_probe(&ddi);


* Re: [PATCHv2 for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18 21:40   ` Bjorn Helgaas
@ 2013-04-18 21:57     ` Bjorn Helgaas
  2013-04-19 14:36       ` Michael S. Tsirkin
       [not found]       ` <CAOS58YO+uV5KkS=sTP9Y3BPPh1nVnQ06yRyNU8GvEbym7R+X+Q@mail.gmail.com>
  0 siblings, 2 replies; 16+ messages in thread
From: Bjorn Helgaas @ 2013-04-18 21:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein, Tejun Heo,
	linux-pci

On Thu, Apr 18, 2013 at 3:40 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Thu, Apr 18, 2013 at 2:08 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> The following lockdep report triggers since 3.9-rc1:
>>
>> =============================================
>> [ INFO: possible recursive locking detected ]
>> 3.9.0-rc1 #96 Not tainted
>> ---------------------------------------------
>> kworker/0:1/734 is trying to acquire lock:
>>  ((&wfc.work)){+.+.+.}, at: [<ffffffff81066cb0>] flush_work+0x0/0x250
>>
>> but task is already holding lock:
>>  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
>> process_one_work+0x162/0x4c0
>>
>> other info that might help us debug this:
>>  Possible unsafe locking scenario:
>>
>>        CPU0
>>        ----
>>   lock((&wfc.work));
>>   lock((&wfc.work));
>>
>>  *** DEADLOCK ***
>>
>>  May be due to missing lock nesting notation
>>
>> 3 locks held by kworker/0:1/734:
>>  #0:  (events){.+.+.+}, at: [<ffffffff81064352>]
>> process_one_work+0x162/0x4c0
>>  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
>> process_one_work+0x162/0x4c0
>>  #2:  (&__lockdep_no_validate__){......}, at: [<ffffffff812db225>]
>> device_attach+0x25/0xb0
>>
>> stack backtrace:
>> Pid: 734, comm: kworker/0:1 Not tainted 3.9.0-rc1 #96
>> Call Trace:
>>  [<ffffffff810948ec>] validate_chain+0xdcc/0x11f0
>>  [<ffffffff81095150>] __lock_acquire+0x440/0xc70
>>  [<ffffffff81095150>] ? __lock_acquire+0x440/0xc70
>>  [<ffffffff810959da>] lock_acquire+0x5a/0x70
>>  [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
>>  [<ffffffff81066cf5>] flush_work+0x45/0x250
>>  [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
>>  [<ffffffff810922be>] ? mark_held_locks+0x9e/0x130
>>  [<ffffffff81066a96>] ? queue_work_on+0x46/0x90
>>  [<ffffffff810925dd>] ? trace_hardirqs_on_caller+0xfd/0x190
>>  [<ffffffff8109267d>] ? trace_hardirqs_on+0xd/0x10
>>  [<ffffffff81066f74>] work_on_cpu+0x74/0x90
>>  [<ffffffff81063820>] ? keventd_up+0x20/0x20
>>  [<ffffffff8121fd30>] ? pci_pm_prepare+0x60/0x60
>>  [<ffffffff811f9293>] ? cpumask_next_and+0x23/0x40
>>  [<ffffffff81220a1a>] pci_device_probe+0xba/0x110
>>  [<ffffffff812dadca>] ? driver_sysfs_add+0x7a/0xb0
>>  [<ffffffff812daf1f>] driver_probe_device+0x8f/0x230
>>  [<ffffffff812db170>] ? __driver_attach+0xb0/0xb0
>>  [<ffffffff812db1bb>] __device_attach+0x4b/0x60
>>  [<ffffffff812d9314>] bus_for_each_drv+0x64/0x90
>>  [<ffffffff812db298>] device_attach+0x98/0xb0
>>  [<ffffffff81218474>] pci_bus_add_device+0x24/0x50
>>  [<ffffffff81232e80>] virtfn_add+0x240/0x3e0
>>  [<ffffffff8146ce3d>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
>>  [<ffffffff812333be>] pci_enable_sriov+0x23e/0x500
>>  [<ffffffffa011fa1a>] __mlx4_init_one+0x5da/0xce0 [mlx4_core]
>>  [<ffffffffa012016d>] mlx4_init_one+0x2d/0x60 [mlx4_core]
>>  [<ffffffff8121fd79>] local_pci_probe+0x49/0x80
>>  [<ffffffff81063833>] work_for_cpu_fn+0x13/0x20
>>  [<ffffffff810643b8>] process_one_work+0x1c8/0x4c0
>>  [<ffffffff81064352>] ? process_one_work+0x162/0x4c0
>>  [<ffffffff81064cfb>] worker_thread+0x30b/0x430
>>  [<ffffffff810649f0>] ? manage_workers+0x340/0x340
>>  [<ffffffff8106cea6>] kthread+0xd6/0xe0
>>  [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
>>  [<ffffffff8146daac>] ret_from_fork+0x7c/0xb0
>>  [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
>>
>> The issue is that a driver, in it's probe function, calls
>> pci_sriov_enable so a PF device probe causes VF probe (AKA nested
>> probe).  Each probe in pci_device_probe which is (normally) run through
>> work_on_cpu (this is to get the right numa node for memory allocated by
>> the driver).  In turn work_on_cpu does this internally:
>>
>>         schedule_work_on(cpu, &wfc.work);
>>         flush_work(&wfc.work);
>>
>> So if you are running probe on CPU1, and cause another
>> probe on the same CPU, this will try to flush
>> workqueue from inside same workqueue which causes
>> a lockep warning.
>>
>> Nested probing might be tricky to get right generally.
>>
>> But for pci_sriov_enable, the situation is actually very simple: all VFs
>> naturally have same affinity as the PF, and cpumask_any_and is actually
>> same as cpumask_first_and, so it always gives us the same CPU.
>> So let's just detect that, and run the probing for VFs locally without a
>> workqueue.
>>
>> This is hardly elegant, but looks to me like an appropriate quick fix
>> for 3.9.
>>
>> Tested-by: Or Gerlitz <ogerlitz@mellanox.com>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> Acked-by: Tejun Heo <tj@kernel.org>
>
> Thanks, Michael.  I put this in my for-linus branch:
>
> http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=for-linus
>
> I'll send a pull request to Linus today.

Actually, let me make sure I understand this correctly:  This patch
fixes the lockdep warning, but it does not fix an actual deadlock or
make any functional change.  Is that right?

If that's true, how much pain would it cause to just hold this for
v3.9.1?  I'm nervous about doing a warning fix when we're only a day
or two before releasing v3.9.

Bjorn

>> ---
>>
>> Changes from v1:
>>     - clarified commit log and added Ack by Tejun Heo
>>       patch is unchanged.
>>
>> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
>> index 1fa1e48..6eeb5ec 100644
>> --- a/drivers/pci/pci-driver.c
>> +++ b/drivers/pci/pci-driver.c
>> @@ -286,8 +286,8 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
>>                 int cpu;
>>
>>                 get_online_cpus();
>>                 cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
>> -               if (cpu < nr_cpu_ids)
>> +               if (cpu != raw_smp_processor_id() && cpu < nr_cpu_ids)
>>                         error = work_on_cpu(cpu, local_pci_probe, &ddi);
>>                 else
>>                         error = local_pci_probe(&ddi);


* Re: [PATCHv2 for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
  2013-04-18 21:57     ` Bjorn Helgaas
@ 2013-04-19 14:36       ` Michael S. Tsirkin
       [not found]       ` <CAOS58YO+uV5KkS=sTP9Y3BPPh1nVnQ06yRyNU8GvEbym7R+X+Q@mail.gmail.com>
  1 sibling, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2013-04-19 14:36 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Or Gerlitz, Ming Lei, Greg Kroah-Hartman, David Miller,
	Roland Dreier, netdev, Yan Burman, Jack Morgenstein, Tejun Heo,
	linux-pci

On Thu, Apr 18, 2013 at 03:57:36PM -0600, Bjorn Helgaas wrote:
> On Thu, Apr 18, 2013 at 3:40 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> > On Thu, Apr 18, 2013 at 2:08 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> >> The following lockdep report triggers since 3.9-rc1:
> >>
> >> =============================================
> >> [ INFO: possible recursive locking detected ]
> >> 3.9.0-rc1 #96 Not tainted
> >> ---------------------------------------------
> >> kworker/0:1/734 is trying to acquire lock:
> >>  ((&wfc.work)){+.+.+.}, at: [<ffffffff81066cb0>] flush_work+0x0/0x250
> >>
> >> but task is already holding lock:
> >>  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
> >> process_one_work+0x162/0x4c0
> >>
> >> other info that might help us debug this:
> >>  Possible unsafe locking scenario:
> >>
> >>        CPU0
> >>        ----
> >>   lock((&wfc.work));
> >>   lock((&wfc.work));
> >>
> >>  *** DEADLOCK ***
> >>
> >>  May be due to missing lock nesting notation
> >>
> >> 3 locks held by kworker/0:1/734:
> >>  #0:  (events){.+.+.+}, at: [<ffffffff81064352>]
> >> process_one_work+0x162/0x4c0
> >>  #1:  ((&wfc.work)){+.+.+.}, at: [<ffffffff81064352>]
> >> process_one_work+0x162/0x4c0
> >>  #2:  (&__lockdep_no_validate__){......}, at: [<ffffffff812db225>]
> >> device_attach+0x25/0xb0
> >>
> >> stack backtrace:
> >> Pid: 734, comm: kworker/0:1 Not tainted 3.9.0-rc1 #96
> >> Call Trace:
> >>  [<ffffffff810948ec>] validate_chain+0xdcc/0x11f0
> >>  [<ffffffff81095150>] __lock_acquire+0x440/0xc70
> >>  [<ffffffff81095150>] ? __lock_acquire+0x440/0xc70
> >>  [<ffffffff810959da>] lock_acquire+0x5a/0x70
> >>  [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
> >>  [<ffffffff81066cf5>] flush_work+0x45/0x250
> >>  [<ffffffff81066cb0>] ? wq_worker_waking_up+0x60/0x60
> >>  [<ffffffff810922be>] ? mark_held_locks+0x9e/0x130
> >>  [<ffffffff81066a96>] ? queue_work_on+0x46/0x90
> >>  [<ffffffff810925dd>] ? trace_hardirqs_on_caller+0xfd/0x190
> >>  [<ffffffff8109267d>] ? trace_hardirqs_on+0xd/0x10
> >>  [<ffffffff81066f74>] work_on_cpu+0x74/0x90
> >>  [<ffffffff81063820>] ? keventd_up+0x20/0x20
> >>  [<ffffffff8121fd30>] ? pci_pm_prepare+0x60/0x60
> >>  [<ffffffff811f9293>] ? cpumask_next_and+0x23/0x40
> >>  [<ffffffff81220a1a>] pci_device_probe+0xba/0x110
> >>  [<ffffffff812dadca>] ? driver_sysfs_add+0x7a/0xb0
> >>  [<ffffffff812daf1f>] driver_probe_device+0x8f/0x230
> >>  [<ffffffff812db170>] ? __driver_attach+0xb0/0xb0
> >>  [<ffffffff812db1bb>] __device_attach+0x4b/0x60
> >>  [<ffffffff812d9314>] bus_for_each_drv+0x64/0x90
> >>  [<ffffffff812db298>] device_attach+0x98/0xb0
> >>  [<ffffffff81218474>] pci_bus_add_device+0x24/0x50
> >>  [<ffffffff81232e80>] virtfn_add+0x240/0x3e0
> >>  [<ffffffff8146ce3d>] ? _raw_spin_unlock_irqrestore+0x3d/0x80
> >>  [<ffffffff812333be>] pci_enable_sriov+0x23e/0x500
> >>  [<ffffffffa011fa1a>] __mlx4_init_one+0x5da/0xce0 [mlx4_core]
> >>  [<ffffffffa012016d>] mlx4_init_one+0x2d/0x60 [mlx4_core]
> >>  [<ffffffff8121fd79>] local_pci_probe+0x49/0x80
> >>  [<ffffffff81063833>] work_for_cpu_fn+0x13/0x20
> >>  [<ffffffff810643b8>] process_one_work+0x1c8/0x4c0
> >>  [<ffffffff81064352>] ? process_one_work+0x162/0x4c0
> >>  [<ffffffff81064cfb>] worker_thread+0x30b/0x430
> >>  [<ffffffff810649f0>] ? manage_workers+0x340/0x340
> >>  [<ffffffff8106cea6>] kthread+0xd6/0xe0
> >>  [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
> >>  [<ffffffff8146daac>] ret_from_fork+0x7c/0xb0
> >>  [<ffffffff8106cdd0>] ? __init_kthread_worker+0x70/0x70
> >>
> >> The issue is that a driver, in it's probe function, calls
> >> pci_sriov_enable so a PF device probe causes VF probe (AKA nested
> >> probe).  Each probe in pci_device_probe which is (normally) run through
> >> work_on_cpu (this is to get the right numa node for memory allocated by
> >> the driver).  In turn work_on_cpu does this internally:
> >>
> >>         schedule_work_on(cpu, &wfc.work);
> >>         flush_work(&wfc.work);
> >>
> >> So if you are running probe on CPU1, and cause another
> >> probe on the same CPU, this will try to flush
> >> workqueue from inside same workqueue which causes
> >> a lockep warning.
> >>
> >> Nested probing might be tricky to get right generally.
> >>
> >> But for pci_sriov_enable, the situation is actually very simple: all VFs
> >> naturally have same affinity as the PF, and cpumask_any_and is actually
> >> same as cpumask_first_and, so it always gives us the same CPU.
> >> So let's just detect that, and run the probing for VFs locally without a
> >> workqueue.
> >>
> >> This is hardly elegant, but looks to me like an appropriate quick fix
> >> for 3.9.
> >>
> >> Tested-by: Or Gerlitz <ogerlitz@mellanox.com>
> >> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >> Acked-by: Tejun Heo <tj@kernel.org>
> >
> > Thanks, Michael.  I put this in my for-linus branch:
> >
> > http://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=for-linus
> >
> > I'll send a pull request to Linus today.
> 
> Actually, let me make sure I understand this correctly:  This patch
> fixes the lockdep warning, but it does not fix an actual deadlock or
> make any functional change.  Is that right?

Tejun said that this warning is a false positive, so yes.

> If that's true, how much pain would it cause to just hold this for
> v3.9.1?  I'm nervous about doing a warning fix when we're only a day
> or two before releasing v3.9.
> 
> Bjorn

I don't have this hardware, so I don't know. It was apparently reported
by real users ...

> >> ---
> >>
> >> Changes from v1:
> >>     - clarified commit log and added Ack by Tejun Heo
> >>       patch is unchanged.
> >>
> >> diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
> >> index 1fa1e48..6eeb5ec 100644
> >> --- a/drivers/pci/pci-driver.c
> >> +++ b/drivers/pci/pci-driver.c
> >> @@ -286,8 +286,8 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
> >>                 int cpu;
> >>
> >>                 get_online_cpus();
> >>                 cpu = cpumask_any_and(cpumask_of_node(node), cpu_online_mask);
> >> -               if (cpu < nr_cpu_ids)
> >> +               if (cpu != raw_smp_processor_id() && cpu < nr_cpu_ids)
> >>                         error = work_on_cpu(cpu, local_pci_probe, &ddi);
> >>                 else
> >>                         error = local_pci_probe(&ddi);


* Re: [PATCHv2 for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
       [not found]       ` <CAOS58YO+uV5KkS=sTP9Y3BPPh1nVnQ06yRyNU8GvEbym7R+X+Q@mail.gmail.com>
@ 2013-04-19 16:39         ` Bjorn Helgaas
  2013-04-20 19:05         ` Michael S. Tsirkin
  1 sibling, 0 replies; 16+ messages in thread
From: Bjorn Helgaas @ 2013-04-19 16:39 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Michael S. Tsirkin, Or Gerlitz, Ming Lei, Greg Kroah-Hartman,
	David Miller, Roland Dreier, netdev, Yan Burman,
	Jack Morgenstein, linux-pci

On Thu, Apr 18, 2013 at 9:04 PM, Tejun Heo <tj@kernel.org> wrote:
> So, the thing is there is no actual deadlock. If you're okay with releasing
> w/ spurious lockdep warning, leaving things alone should be fine. An issue
> with mst's patch is that it actually changes the behavior to avoid a
> spurious warning. An alternative course would be leaving it alone now and
> remove the spurious warning during the coming devel cycle and mark it w/
> -stable.

If I understand correctly, you need v3.9-rc1 or later,
CONFIG_PROVE_LOCKING=y, and an SR-IOV device to see the warning.

I like the idea of fixing the spurious warning for v3.10 and
backporting to -stable.  It sounds like there's a cleaner fix in the
works that needs a bit more polishing.  If we need a quick fix sooner,
we'll still have this one in our back pocket.

Bjorn


* Re: [PATCHv2 for-3.9] pci: avoid work_on_cpu for nested SRIOV probes
       [not found]       ` <CAOS58YO+uV5KkS=sTP9Y3BPPh1nVnQ06yRyNU8GvEbym7R+X+Q@mail.gmail.com>
  2013-04-19 16:39         ` Bjorn Helgaas
@ 2013-04-20 19:05         ` Michael S. Tsirkin
  1 sibling, 0 replies; 16+ messages in thread
From: Michael S. Tsirkin @ 2013-04-20 19:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Bjorn Helgaas, Or Gerlitz, Ming Lei, Greg Kroah-Hartman,
	David Miller, Roland Dreier, netdev, Yan Burman,
	Jack Morgenstein, linux-pci

On Thu, Apr 18, 2013 at 08:04:47PM -0700, Tejun Heo wrote:
> So, the thing is there is no actual deadlock. If you're okay with releasing w/
> spurious lockdep warning, leaving things alone should be fine. An issue with
> mst's patch is that it actually changes the behavior to avoid a spurious
> warning.

Yes but in fact, isn't it cleaner to avoid work_on if we are going to run
on the local CPU, anyway?

> An alternative course would be leaving it alone now and remove the
> spurious warning during the coming devel cycle and mark it w/ -stable.
> 
> Thanks.

Okay. Could you send a tested version of work_on_nested?

> --
> tejun


end of thread, other threads:[~2013-04-20 19:05 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-05 15:21 hitting lockdep warning as of too early VF probe with 3.9-rc1 Or Gerlitz
2013-03-06  2:43 ` Ming Lei
2013-03-06 20:54   ` Or Gerlitz
2013-03-07  2:03     ` Ming Lei
2013-03-10 15:28       ` Jack Morgenstein
2013-03-10 16:37         ` Greg Kroah-Hartman
2013-03-11  1:26         ` Ming Lei
2013-03-11 20:24         ` Ben Hutchings
2013-04-17 15:14       ` Or Gerlitz
2013-04-11 15:25 ` [PATCH for-3.9] pci: avoid work_on_cpu for nested SRIOV probes Michael S. Tsirkin
2013-04-18 20:08 ` [PATCHv2 " Michael S. Tsirkin
2013-04-18 21:40   ` Bjorn Helgaas
2013-04-18 21:57     ` Bjorn Helgaas
2013-04-19 14:36       ` Michael S. Tsirkin
     [not found]       ` <CAOS58YO+uV5KkS=sTP9Y3BPPh1nVnQ06yRyNU8GvEbym7R+X+Q@mail.gmail.com>
2013-04-19 16:39         ` Bjorn Helgaas
2013-04-20 19:05         ` Michael S. Tsirkin
