linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* pci: kernel crash in bus_find_device
@ 2014-05-20 19:17 Francesco Ruggeri
  2014-05-20 19:50 ` Guenter Roeck
  0 siblings, 1 reply; 16+ messages in thread
From: Francesco Ruggeri @ 2014-05-20 19:17 UTC (permalink / raw)
  To: linux-pci, linux-kernel; +Cc: Francesco Ruggeri

I posted this about a week ago but I did not get any replies.
Re-trying.

While traversing devices on pci_bus_type I ran into the crash below.
The immediate cause of the crash is that bus_find_device is trying to resume
a scan starting from a device that has been unregistered (and whose knode_bus
has already been klist_del' ed).
The main issue seems to be that when resuming a scan the caller should
be holding a
reference to the klist_node, but instead it relies on holding a
reference to the device.
I played with a couple of narrow fixes, but a clean solution would
affect quite a bit of code.

Has anybody run into this before?

Thanks,
Francesco Ruggeri


------------[ cut here ]------------
WARNING: at /bld/EosKernel/Artools-rpmbuild/linux-3.4/include/linux/kref.h:41
klist_iter_init_node+0x30/0x38()
Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO)
macvlan ip6table_mangle iptable_mangle msr nf_conntrack_ipv6
nf_defrag_ipv6 ip6t_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_LOG
xt_limit ipt_REJECT xt_hl xt_state xt_multiport xt_tcpudp kbfd(O)
8021q garp stp llc tun scd_em_driver(O) nf_conntrack_tftp iptable_raw
iptable_filter ip_tables xt_NOTRACK nf_conntrack xt_mark ip6table_raw
ip6table_filter ip6_tables x_tables scd(O) k8temp amd64_edac_mod hwmon
kvm_amd kvm
Pid: 6861, comm: pci_scan_0 Tainted: P           O
3.4.43.Ar-1797671.flbocafruggeri #1
Call Trace:
 [<ffffffff81029dc4>] warn_slowpath_common+0x80/0x98
 [<ffffffff811b57f1>] ? pci_do_find_bus+0x49/0x49
 [<ffffffff81029df1>] warn_slowpath_null+0x15/0x17
 [<ffffffff813a43ce>] klist_iter_init_node+0x30/0x38
 [<ffffffff8120e57e>] bus_find_device+0x48/0x90
 [<ffffffff811b5908>] pci_get_dev_by_id+0x5e/0x81
 [<ffffffff811b5a6a>] pci_get_subsys+0x5c/0x7f
 [<ffffffff811b5a9e>] pci_get_device+0x11/0x13
 [<ffffffffa00b2087>] pci_scan+0x39/0x8a [pci_scan]
 [<ffffffffa00b204e>] ? init_module+0x3c/0x3c [pci_scan]
 [<ffffffff81040e6e>] kthread+0x84/0x8c
 [<ffffffff813c8b14>] kernel_thread_helper+0x4/0x10
 [<ffffffff81040dea>] ? __init_kthread_worker+0x37/0x37
 [<ffffffff813c8b10>] ? gs_change+0xb/0xb
---[ end trace 79cea1ec476672fe ]---
------------[ cut here ]------------
WARNING: at /bld/EosKernel/Artools-rpmbuild/linux-3.4/lib/klist.c:189
klist_release+0x2b/0xeb()
Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO)
macvlan ip6table_mangle iptable_mangle msr nf_conntrack_ipv6
nf_defrag_ipv6 ip6t_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_LOG
xt_limit ipt_REJECT xt_hl xt_state xt_multiport xt_tcpudp kbfd(O)
8021q garp stp llc tun scd_em_driver(O) nf_conntrack_tftp iptable_raw
iptable_filter ip_tables xt_NOTRACK nf_conntrack xt_mark ip6table_raw
ip6table_filter ip6_tables x_tables scd(O) k8temp amd64_edac_mod hwmon
kvm_amd kvm
Pid: 6861, comm: pci_scan_0 Tainted: P        W  O
3.4.43.Ar-1797671.flbocafruggeri #1
Call Trace:
 [<ffffffff81029dc4>] warn_slowpath_common+0x80/0x98
 [<ffffffff8120de13>] ? bus_get_device_klist+0x10/0x10
 [<ffffffff81029df1>] warn_slowpath_null+0x15/0x17
 [<ffffffff813a440e>] klist_release+0x2b/0xeb
 [<ffffffff813a44ec>] klist_dec_and_del+0x1e/0x25
 [<ffffffff813a4528>] klist_next+0x35/0xc9
 [<ffffffff811b57f1>] ? pci_do_find_bus+0x49/0x49
 [<ffffffff8120deb3>] next_device+0x9/0x19
 [<ffffffff8120e5a2>] bus_find_device+0x6c/0x90
 [<ffffffff811b5908>] pci_get_dev_by_id+0x5e/0x81
 [<ffffffff811b5a6a>] pci_get_subsys+0x5c/0x7f
 [<ffffffff811b5a9e>] pci_get_device+0x11/0x13
 [<ffffffffa00b2087>] pci_scan+0x39/0x8a [pci_scan]
 [<ffffffffa00b204e>] ? init_module+0x3c/0x3c [pci_scan]
 [<ffffffff81040e6e>] kthread+0x84/0x8c
 [<ffffffff813c8b14>] kernel_thread_helper+0x4/0x10
 [<ffffffff81040dea>] ? __init_kthread_worker+0x37/0x37
 [<ffffffff813c8b10>] ? gs_change+0xb/0xb
---[ end trace 79cea1ec476672ff ]---
general protection fault: 0000 [#1] PREEMPT SMP
CPU 1
Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO)
macvlan ip6table_mangle iptable_mangle msr nf_conntrack_ipv6
nf_defrag_ipv6 ip6t_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_LOG
xt_limit ipt_REJECT xt_hl xt_state xt_multiport xt_tcpudp kbfd(O)
8021q garp stp llc tun scd_em_driver(O) nf_conntrack_tftp iptable_raw
iptable_filter ip_tables xt_NOTRACK nf_conntrack xt_mark ip6table_raw
ip6table_filter ip6_tables x_tables scd(O) k8temp amd64_edac_mod hwmon
kvm_amd kvm

Pid: 6861, comm: pci_scan_0 Tainted: P        W  O
3.4.43.Ar-1797671.flbocafruggeri #1
RIP: 0010:[<ffffffff813a442c>]  [<ffffffff813a442c>] klist_release+0x49/0xeb
RSP: 0018:ffff88001c55bd50  EFLAGS: 00010293
RAX: dead000000200200 RBX: ffff880030949e78 RCX: ffff880000000010
RDX: dead000000100100 RSI: 0000000000000000 RDI: dead000000200200
RBP: ffff88001c55bd70 R08: dead000000100100 R09: 000000000000000a
R10: 0000000000000000 R11: ffffffff81619920 R12: ffff880030949e90
R13: ffff880030949e78 R14: ffffffff8120de13 R15: ffff880027e717e0
FS:  0000000000000000(0000) GS:ffff88013fb00000(0000) knlGS:00000000f73bc6d0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000009012644 CR3: 0000000069f9e000 CR4: 00000000000007e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process pci_scan_0 (pid: 6861, threadinfo ffff88001c55a000, task
ffff880032ffd340)
Stack:
 ffff880030949e78 ffff88001c55bde0 dead000000100100 ffff880030949e78
 ffff88001c55bd80 ffffffff813a44ec ffff88001c55bdc0 ffffffff813a4528
 ffff88001c55bde0 ffff880027e717e0 ffffffff811b57f1 ffff88001c55bde0
Call Trace:
 [<ffffffff813a44ec>] klist_dec_and_del+0x1e/0x25
 [<ffffffff813a4528>] klist_next+0x35/0xc9
 [<ffffffff811b57f1>] ? pci_do_find_bus+0x49/0x49
 [<ffffffff8120deb3>] next_device+0x9/0x19
 [<ffffffff8120e5a2>] bus_find_device+0x6c/0x90
 [<ffffffff811b5908>] pci_get_dev_by_id+0x5e/0x81
 [<ffffffff811b5a6a>] pci_get_subsys+0x5c/0x7f
 [<ffffffff811b5a9e>] pci_get_device+0x11/0x13
 [<ffffffffa00b2087>] pci_scan+0x39/0x8a [pci_scan]
 [<ffffffffa00b204e>] ? init_module+0x3c/0x3c [pci_scan]
 [<ffffffff81040e6e>] kthread+0x84/0x8c
 [<ffffffff813c8b14>] kernel_thread_helper+0x4/0x10
 [<ffffffff81040dea>] ? __init_kthread_worker+0x37/0x37
 [<ffffffff813c8b10>] ? gs_change+0xb/0xb
Code: 00 48 c7 c7 a1 01 51 81 e8 ce 59 c8 ff 49 8b 54 24 f0 49 8b 44
24 f8 49 b8 00 01 10 00 00 00 ad de 48 bf 00 02 20 00 00 00 ad de <48>
89 42 08 48 89 10 49 89 7c 24 f8 4d 89 44 24 f0 48 c7 c7 30
RIP  [<ffffffff813a442c>] klist_release+0x49/0xeb
 RSP <ffff88001c55bd50>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-05-20 19:17 pci: kernel crash in bus_find_device Francesco Ruggeri
@ 2014-05-20 19:50 ` Guenter Roeck
  2014-05-20 22:35   ` Francesco Ruggeri
  0 siblings, 1 reply; 16+ messages in thread
From: Guenter Roeck @ 2014-05-20 19:50 UTC (permalink / raw)
  To: Francesco Ruggeri; +Cc: linux-pci, linux-kernel, Francesco Ruggeri

On Tue, May 20, 2014 at 12:17:57PM -0700, Francesco Ruggeri wrote:
> I posted this about a week ago but I did not get any replies.
> Re-trying.
> 
> While traversing devices on pci_bus_type I ran into the crash below.
> The immediate cause of the crash is that bus_find_device is trying to resume
> a scan starting from a device that has been unregistered (and whose knode_bus
> has already been klist_del' ed).
> The main issue seems to be that when resuming a scan the caller should
> be holding a
> reference to the klist_node, but instead it relies on holding a
> reference to the device.
> I played with a couple of narrow fixes, but a clean solution would
> affect quite a bit of code.
> 
> Has anybody run into this before?
> 

Hi Francesco,

I may be missing something, but I don't find a pci_scan symbol in the 3.4
kernel. Also, the process name suggests that you may possibly trigger pci
rescans from user space. Both suggest that you may possibly run third party
code in your kernel.

Either case, I ran into similar problems myself with pci rescans triggered
from user space. The 3.4 kernel has no synchronization for rescans triggered
from user space with those triggered from the kernel. In a nutshell, when
triggering rescans and removals from user space you must ensure that only
one such rescan/removal is active at any given time. Under no circumstances
trigger rescans from user space if a rescan can also be triggered from the
kernel. Obviously that also applies if rescans can be triggered multiple times
in parallel by some third party kernel module. Maybe that explains your
problem ?

The problem has been addressed recently with commit 9d16947 (PCI: Add
global pci_lock_rescan_remove) and several subsequent patches.

Guenter

> Thanks,
> Francesco Ruggeri
> 
> 
> ------------[ cut here ]------------
> WARNING: at /bld/EosKernel/Artools-rpmbuild/linux-3.4/include/linux/kref.h:41
> klist_iter_init_node+0x30/0x38()
> Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO)
> macvlan ip6table_mangle iptable_mangle msr nf_conntrack_ipv6
> nf_defrag_ipv6 ip6t_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_LOG
> xt_limit ipt_REJECT xt_hl xt_state xt_multiport xt_tcpudp kbfd(O)
> 8021q garp stp llc tun scd_em_driver(O) nf_conntrack_tftp iptable_raw
> iptable_filter ip_tables xt_NOTRACK nf_conntrack xt_mark ip6table_raw
> ip6table_filter ip6_tables x_tables scd(O) k8temp amd64_edac_mod hwmon
> kvm_amd kvm
> Pid: 6861, comm: pci_scan_0 Tainted: P           O
> 3.4.43.Ar-1797671.flbocafruggeri #1
> Call Trace:
>  [<ffffffff81029dc4>] warn_slowpath_common+0x80/0x98
>  [<ffffffff811b57f1>] ? pci_do_find_bus+0x49/0x49
>  [<ffffffff81029df1>] warn_slowpath_null+0x15/0x17
>  [<ffffffff813a43ce>] klist_iter_init_node+0x30/0x38
>  [<ffffffff8120e57e>] bus_find_device+0x48/0x90
>  [<ffffffff811b5908>] pci_get_dev_by_id+0x5e/0x81
>  [<ffffffff811b5a6a>] pci_get_subsys+0x5c/0x7f
>  [<ffffffff811b5a9e>] pci_get_device+0x11/0x13
>  [<ffffffffa00b2087>] pci_scan+0x39/0x8a [pci_scan]
>  [<ffffffffa00b204e>] ? init_module+0x3c/0x3c [pci_scan]
>  [<ffffffff81040e6e>] kthread+0x84/0x8c
>  [<ffffffff813c8b14>] kernel_thread_helper+0x4/0x10
>  [<ffffffff81040dea>] ? __init_kthread_worker+0x37/0x37
>  [<ffffffff813c8b10>] ? gs_change+0xb/0xb
> ---[ end trace 79cea1ec476672fe ]---
> ------------[ cut here ]------------
> WARNING: at /bld/EosKernel/Artools-rpmbuild/linux-3.4/lib/klist.c:189
> klist_release+0x2b/0xeb()
> Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO)
> macvlan ip6table_mangle iptable_mangle msr nf_conntrack_ipv6
> nf_defrag_ipv6 ip6t_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_LOG
> xt_limit ipt_REJECT xt_hl xt_state xt_multiport xt_tcpudp kbfd(O)
> 8021q garp stp llc tun scd_em_driver(O) nf_conntrack_tftp iptable_raw
> iptable_filter ip_tables xt_NOTRACK nf_conntrack xt_mark ip6table_raw
> ip6table_filter ip6_tables x_tables scd(O) k8temp amd64_edac_mod hwmon
> kvm_amd kvm
> Pid: 6861, comm: pci_scan_0 Tainted: P        W  O
> 3.4.43.Ar-1797671.flbocafruggeri #1
> Call Trace:
>  [<ffffffff81029dc4>] warn_slowpath_common+0x80/0x98
>  [<ffffffff8120de13>] ? bus_get_device_klist+0x10/0x10
>  [<ffffffff81029df1>] warn_slowpath_null+0x15/0x17
>  [<ffffffff813a440e>] klist_release+0x2b/0xeb
>  [<ffffffff813a44ec>] klist_dec_and_del+0x1e/0x25
>  [<ffffffff813a4528>] klist_next+0x35/0xc9
>  [<ffffffff811b57f1>] ? pci_do_find_bus+0x49/0x49
>  [<ffffffff8120deb3>] next_device+0x9/0x19
>  [<ffffffff8120e5a2>] bus_find_device+0x6c/0x90
>  [<ffffffff811b5908>] pci_get_dev_by_id+0x5e/0x81
>  [<ffffffff811b5a6a>] pci_get_subsys+0x5c/0x7f
>  [<ffffffff811b5a9e>] pci_get_device+0x11/0x13
>  [<ffffffffa00b2087>] pci_scan+0x39/0x8a [pci_scan]
>  [<ffffffffa00b204e>] ? init_module+0x3c/0x3c [pci_scan]
>  [<ffffffff81040e6e>] kthread+0x84/0x8c
>  [<ffffffff813c8b14>] kernel_thread_helper+0x4/0x10
>  [<ffffffff81040dea>] ? __init_kthread_worker+0x37/0x37
>  [<ffffffff813c8b10>] ? gs_change+0xb/0xb
> ---[ end trace 79cea1ec476672ff ]---
> general protection fault: 0000 [#1] PREEMPT SMP
> CPU 1
> Modules linked in: pci_scan(O) sch_prio sand_dma(PO) arista_bde(PO)
> macvlan ip6table_mangle iptable_mangle msr nf_conntrack_ipv6
> nf_defrag_ipv6 ip6t_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_LOG
> xt_limit ipt_REJECT xt_hl xt_state xt_multiport xt_tcpudp kbfd(O)
> 8021q garp stp llc tun scd_em_driver(O) nf_conntrack_tftp iptable_raw
> iptable_filter ip_tables xt_NOTRACK nf_conntrack xt_mark ip6table_raw
> ip6table_filter ip6_tables x_tables scd(O) k8temp amd64_edac_mod hwmon
> kvm_amd kvm
> 
> Pid: 6861, comm: pci_scan_0 Tainted: P        W  O
> 3.4.43.Ar-1797671.flbocafruggeri #1
> RIP: 0010:[<ffffffff813a442c>]  [<ffffffff813a442c>] klist_release+0x49/0xeb
> RSP: 0018:ffff88001c55bd50  EFLAGS: 00010293
> RAX: dead000000200200 RBX: ffff880030949e78 RCX: ffff880000000010
> RDX: dead000000100100 RSI: 0000000000000000 RDI: dead000000200200
> RBP: ffff88001c55bd70 R08: dead000000100100 R09: 000000000000000a
> R10: 0000000000000000 R11: ffffffff81619920 R12: ffff880030949e90
> R13: ffff880030949e78 R14: ffffffff8120de13 R15: ffff880027e717e0
> FS:  0000000000000000(0000) GS:ffff88013fb00000(0000) knlGS:00000000f73bc6d0
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000009012644 CR3: 0000000069f9e000 CR4: 00000000000007e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process pci_scan_0 (pid: 6861, threadinfo ffff88001c55a000, task
> ffff880032ffd340)
> Stack:
>  ffff880030949e78 ffff88001c55bde0 dead000000100100 ffff880030949e78
>  ffff88001c55bd80 ffffffff813a44ec ffff88001c55bdc0 ffffffff813a4528
>  ffff88001c55bde0 ffff880027e717e0 ffffffff811b57f1 ffff88001c55bde0
> Call Trace:
>  [<ffffffff813a44ec>] klist_dec_and_del+0x1e/0x25
>  [<ffffffff813a4528>] klist_next+0x35/0xc9
>  [<ffffffff811b57f1>] ? pci_do_find_bus+0x49/0x49
>  [<ffffffff8120deb3>] next_device+0x9/0x19
>  [<ffffffff8120e5a2>] bus_find_device+0x6c/0x90
>  [<ffffffff811b5908>] pci_get_dev_by_id+0x5e/0x81
>  [<ffffffff811b5a6a>] pci_get_subsys+0x5c/0x7f
>  [<ffffffff811b5a9e>] pci_get_device+0x11/0x13
>  [<ffffffffa00b2087>] pci_scan+0x39/0x8a [pci_scan]
>  [<ffffffffa00b204e>] ? init_module+0x3c/0x3c [pci_scan]
>  [<ffffffff81040e6e>] kthread+0x84/0x8c
>  [<ffffffff813c8b14>] kernel_thread_helper+0x4/0x10
>  [<ffffffff81040dea>] ? __init_kthread_worker+0x37/0x37
>  [<ffffffff813c8b10>] ? gs_change+0xb/0xb
> Code: 00 48 c7 c7 a1 01 51 81 e8 ce 59 c8 ff 49 8b 54 24 f0 49 8b 44
> 24 f8 49 b8 00 01 10 00 00 00 ad de 48 bf 00 02 20 00 00 00 ad de <48>
> 89 42 08 48 89 10 49 89 7c 24 f8 4d 89 44 24 f0 48 c7 c7 30
> RIP  [<ffffffff813a442c>] klist_release+0x49/0xeb
>  RSP <ffff88001c55bd50>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-05-20 19:50 ` Guenter Roeck
@ 2014-05-20 22:35   ` Francesco Ruggeri
  2014-05-20 23:38     ` Guenter Roeck
  2014-05-21 17:39     ` Guenter Roeck
  0 siblings, 2 replies; 16+ messages in thread
From: Francesco Ruggeri @ 2014-05-20 22:35 UTC (permalink / raw)
  To: Guenter Roeck, linux-pci; +Cc: linux-kernel, Francesco Ruggeri

Hi Guenter,
thank you for your reply. I will check out the changes that you pointed to.
The problem we are seeing is a race condition between for_each_pci_dev
(or similar) and device_unregisters. I am not sure if use of the new
lock should be extended to all code using for_each_pci_dev as well.

pci_scan is a kernel thread that I used for testing purposes, to
mimick the dynamics that we saw in our crashes in
edac_pci_clear_parity_errors:

        for (;;) {
                pci_dev = NULL;
                while ((pci_dev = pci_get_device(PCI_ANY_ID,
PCI_ANY_ID, pci_dev)) != NULL)
                        ;
        }

It keeps traversing klist_devices in pci_bus_type using
bus_find_device, costantly resuming its search for the next element
starting from the one it got in the previous round.
There are several loops of this kind in linux. In case of this thread
no action is taken on the elements as they are "found".

The race condition occurs when bus_find_device resumes its search from
a device that has been unregistered. Because device_unregister resets
klist_bus in the device, bus_find device cannot resume from where it
left off in the klist.
The sequence is device_unregister, device_del, bus_remove_device,
klist_del(&dev->p->knode_bus.).

Francesco

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-05-20 22:35   ` Francesco Ruggeri
@ 2014-05-20 23:38     ` Guenter Roeck
       [not found]       ` <CA+HUmGge7AEpAnwAG_VJD2CKTtRBoC2bCGVU_t4qm-x6+OCr-g@mail.gmail.com>
  2014-05-21 17:39     ` Guenter Roeck
  1 sibling, 1 reply; 16+ messages in thread
From: Guenter Roeck @ 2014-05-20 23:38 UTC (permalink / raw)
  To: Francesco Ruggeri; +Cc: linux-pci, linux-kernel, Francesco Ruggeri

On Tue, May 20, 2014 at 03:35:15PM -0700, Francesco Ruggeri wrote:
> Hi Guenter,
> thank you for your reply. I will check out the changes that you pointed to.
> The problem we are seeing is a race condition between for_each_pci_dev
> (or similar) and device_unregisters. I am not sure if use of the new
> lock should be extended to all code using for_each_pci_dev as well.
> 
> pci_scan is a kernel thread that I used for testing purposes, to
> mimick the dynamics that we saw in our crashes in
> edac_pci_clear_parity_errors:
> 
>         for (;;) {
>                 pci_dev = NULL;
>                 while ((pci_dev = pci_get_device(PCI_ANY_ID,
> PCI_ANY_ID, pci_dev)) != NULL)
>                         ;
>         }
> 
> It keeps traversing klist_devices in pci_bus_type using
> bus_find_device, costantly resuming its search for the next element
> starting from the one it got in the previous round.
> There are several loops of this kind in linux. In case of this thread
> no action is taken on the elements as they are "found".
> 
> The race condition occurs when bus_find_device resumes its search from
> a device that has been unregistered. Because device_unregister resets
> klist_bus in the device, bus_find device cannot resume from where it
> left off in the klist.
> The sequence is device_unregister, device_del, bus_remove_device,
> klist_del(&dev->p->knode_bus.).
> 
Hmmm ... sounds more like a generic problem, not specifically related to pci.
Essentially everything calling bus_find_device() with a starting device which
has been removed (though only pci and scsi seem to be doing that in practice).

Can you reproduce the problem with the latest kernel ? Also, can you
send me the entire file with the kernel thread you mentioned above ?
Maybe I can reproduce the problem here.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-05-20 22:35   ` Francesco Ruggeri
  2014-05-20 23:38     ` Guenter Roeck
@ 2014-05-21 17:39     ` Guenter Roeck
  1 sibling, 0 replies; 16+ messages in thread
From: Guenter Roeck @ 2014-05-21 17:39 UTC (permalink / raw)
  To: Francesco Ruggeri; +Cc: linux-pci, linux-kernel, Francesco Ruggeri

On Tue, May 20, 2014 at 03:35:15PM -0700, Francesco Ruggeri wrote:
> Hi Guenter,
> thank you for your reply. I will check out the changes that you pointed to.
> The problem we are seeing is a race condition between for_each_pci_dev
> (or similar) and device_unregisters. I am not sure if use of the new
> lock should be extended to all code using for_each_pci_dev as well.
> 
> pci_scan is a kernel thread that I used for testing purposes, to
> mimick the dynamics that we saw in our crashes in
> edac_pci_clear_parity_errors:
> 
>         for (;;) {
>                 pci_dev = NULL;
>                 while ((pci_dev = pci_get_device(PCI_ANY_ID,
> PCI_ANY_ID, pci_dev)) != NULL)
>                         ;
>         }
> 
> It keeps traversing klist_devices in pci_bus_type using
> bus_find_device, costantly resuming its search for the next element
> starting from the one it got in the previous round.
> There are several loops of this kind in linux. In case of this thread
> no action is taken on the elements as they are "found".
> 
> The race condition occurs when bus_find_device resumes its search from
> a device that has been unregistered. Because device_unregister resets
> klist_bus in the device, bus_find device cannot resume from where it
> left off in the klist.
> The sequence is device_unregister, device_del, bus_remove_device,
> klist_del(&dev->p->knode_bus.).
> 

Problem is confirmed to exist in 3.14, and can be reproduced easily
with the following dummy driver, courtesy to Francesco. I added
usleep_range() to make it easier to reproduce. It took only about
half a dozen hot insertion/removal events to make it happen.

Here are the tracebacks:

------------[ cut here ]------------
WARNING: at /home/p2020/linux-freescale/include/linux/kref.h:47
Modules linked in: jnx_connector leds_gpio sam_flash gpio_sam i2c_sam sam_core uio_pci_hostif pci_scan [last unloaded: sam_core]
CPU: 0 PID: 2641 Comm: pci_scan Not tainted 3.14.4-juniper-00422-gf428c34 #47
task: e7ce8ea0 ti: e73e6000 task.ti: e73e6000
NIP: c04e0988 LR: c02baa28 CTR: c0268ca4
REGS: e73e7da0 TRAP: 0700   Not tainted (3.14.4-juniper-00422-gf428c34)
MSR: 00029000 <CE,EE,ME>  CR: 24038382  XER: 00000000
GPR00: c0268b38 e73e7e50 e7ce8ea0 e7c96f94 e73e7e58 e725a264 c0268a38 eedaa2c0 
GPR08: 00000002 00000001 00000000 00021000 2403d382 00000000 c00576f8 e7377750 
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
GPR24: 00000000 00000000 f170f000 00000000 c056c33c c0268a38 e73e7ea8 eedaa2c0 
NIP [c04e0988] klist_iter_init_node.part.0+0xc/0x684
LR [c02baa28] bus_find_device+0x48/0xac
Call Trace:
[e73e7e80] [c0268b38] pci_get_dev_by_id+0x5c/0x94
[e73e7ea0] [c0268c94] pci_get_subsys+0x38/0x48
[e73e7ed0] [f170f02c] pci_scan+0x2c/0x64 [pci_scan]
[e73e7ee0] [c00577bc] kthread+0xc4/0xd8
[e73e7f40] [c000f004] ret_from_kernel_thread+0x5c/0x64

and:

------------[ cut here ]------------
WARNING: at /home/p2020/linux-freescale/lib/klist.c:189
Modules linked in: jnx_connector leds_gpio sam_flash gpio_sam i2c_sam sam_core uio_pci_hostif pci_scan [last unloaded: sam_core]
CPU: 0 PID: 2641 Comm: pci_scan Tainted: G        W 3.14.4-juniper-00422-gf428c34 #47
task: e7ce8ea0 ti: e73e6000 task.ti: e73e6000
NIP: c04d7ad0 LR: c04d7be4 CTR: c0268ca4
REGS: e73e7d30 TRAP: 0700   Tainted: G        W (3.14.4-juniper-00422-gf428c34)
MSR: 00029000 <CE,EE,ME>  CR: 24038382  XER: 00000000
GPR00: c04d7be4 e73e7de0 e7ce8ea0 e725a264 e73e7e58 e725a264 c0268a38 eedaa2c0 
GPR08: 00000002 00000001 00000001 00021000 24038384 00000000 c00576f8 e7377750 
GPR16: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 
GPR24: 00000000 00000000 f170f000 00000000 c02ba364 e725a258 e725a258 e73e7e58 
NIP [c04d7ad0] klist_release+0x20/0xec
LR [c04d7be4] klist_dec_and_del+0x48/0x5c
Call Trace:
[e73e7e10] [c04d7be4] klist_dec_and_del+0x48/0x5c
[e73e7e20] [c04d7c3c] klist_next+0x44/0x138
[e73e7e40] [c02ba444] next_device+0x10/0x34
[e73e7e50] [c02baa30] bus_find_device+0x50/0xac
[e73e7e80] [c0268b38] pci_get_dev_by_id+0x5c/0x94
[e73e7ea0] [c0268c94] pci_get_subsys+0x38/0x48
[e73e7ed0] [f170f02c] pci_scan+0x2c/0x64 [pci_scan]
[e73e7ee0] [c00577bc] kthread+0xc4/0xd8
[e73e7f40] [c000f004] ret_from_kernel_thread+0x5c/0x64

Francesco, I'll test the patches you sent me next.

Guenter

---

/*
 * PCI scan test driver
 */

#include <linux/delay.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/kprobes.h>
#include <linux/kallsyms.h>
#include <linux/kthread.h>
#include <linux/pci.h>
#include <linux/pcieport_if.h>

static struct task_struct *pci_scan_task = NULL;

static int pci_scan(void *unused)
{
	for (;;) {
		struct pci_dev *dev = NULL;

		while ((dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev)) != NULL)
			usleep_range(1000, 2000);
		schedule();
		if (kthread_should_stop())
			break;
	}
	return 0;
}

static int __init pci_scan_init(void)
{
	pci_scan_task = kthread_create(pci_scan, NULL, "pci_scan");
	if (!pci_scan_task)
		return -ENODEV;

	wake_up_process(pci_scan_task);
	return 0;
}

static void __exit pci_scan_exit(void)
{
	if (pci_scan_task)
		kthread_stop(pci_scan_task);
}

module_init(pci_scan_init);
module_exit(pci_scan_exit);

MODULE_LICENSE("GPL");


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
       [not found]           ` <CA+HUmGhm1VLTvMKW1TUUPqStUhD11M5u0VyTZyXyWz_ZS8uSVw@mail.gmail.com>
@ 2014-05-21 22:59             ` Guenter Roeck
  2014-05-22  7:14               ` Greg Kroah-Hartmann
  0 siblings, 1 reply; 16+ messages in thread
From: Guenter Roeck @ 2014-05-21 22:59 UTC (permalink / raw)
  To: Francesco Ruggeri; +Cc: Hannes Reinecke, Greg Kroah-Hartmann, linux-kernel

On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote:
> I have been using an x86 platform.
> When I started working on it I got early crashes until I added the
> check for p not NULL in
> 
> +void bus_release_device(struct device *dev)
> +{
> + struct device_private *p = dev->p;
> +
> + if (p && klist_node_attached(&p->knode_bus))
> + klist_put_last(&p->knode_bus);
> +}
> +
> 
> Maybe on powerpc *p is overriden between device_del and device_release?
> 
> Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are
> treated as WARN_ONs in the current klist code.
> The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I
> ran into it without the second patch (but only when I ran my module
> and tests).
> 
Hi Francesco,

I replaced the BUG_ON with WARN_ON; still crashes.

Anyway, the problem seems to be known. I found two related exchanges.

[1] describes pretty much the same problem. I don't see if/where it was
ever fixed, though.

[2] is a patch to fix the problem. It did not apply cleanly to 3.14,
so I had to make some adjustments in klist_iter_init_node. Resulting
patch is below. With this patch, the problem is gone. It is not perfect,
as it aborts the loop if it encounters a deleted kobject, but it is better
than nothing. Unfortunately, the patch never made it upstream; no idea why.
Copying the author and Greg to get additional feedback.

Guenter

[1] https://lkml.org/lkml/2008/10/26/79
[2] https://lkml.org/lkml/2012/4/16/218

----

>From 2bf95c4a05a91dbe7168b53d4405dee3a0912a98 Mon Sep 17 00:00:00 2001
From: Hannes Reinecke <hare@suse.de>
Date: Mon, 16 Apr 2012 15:06:25 +0200
Subject: [PATCH] driver core: check start node in klist_iter_init_node

klist_iter_init_node() takes a node as a start argument.
However, this node might not be valid anymore.
This patch updates the klist_iter_init_node() and
dependent functions to return an error if so.
All calling functions have been audited to check
for a return code here.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: Greg Kroah-Hartmann <gregkh@linuxfoundation.org>
Cc: Kay Sievers <kay@vrfy.org>
[groeck: ported to 3.14]
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
 drivers/base/bus.c     |   46 +++++++++++++++++++++++++++++-----------------
 drivers/base/class.c   |   32 ++++++++++++++++++++------------
 drivers/base/driver.c  |   18 +++++++++++-------
 include/linux/device.h |   10 +++++-----
 include/linux/klist.h  |    2 +-
 lib/klist.c            |   15 +++++++++++----
 6 files changed, 77 insertions(+), 46 deletions(-)

diff --git a/drivers/base/bus.c b/drivers/base/bus.c
index fbad61b..9d6af80 100644
--- a/drivers/base/bus.c
+++ b/drivers/base/bus.c
@@ -305,11 +305,13 @@ int bus_for_each_dev(struct bus_type *bus, struct device *start,
 	if (!bus || !bus->p || (start && !start->p))
 		return -EINVAL;
 
-	klist_iter_init_node(&bus->p->klist_devices, &i,
-			     (start ? &start->p->knode_bus : NULL));
-	while ((dev = next_device(&i)) && !error)
-		error = fn(dev, data);
-	klist_iter_exit(&i);
+	error = klist_iter_init_node(&bus->p->klist_devices, &i,
+				     (start ? &start->p->knode_bus : NULL));
+	if (!error) {
+		while ((dev = next_device(&i)) && !error)
+			error = fn(dev, data);
+		klist_iter_exit(&i);
+	}
 	return error;
 }
 EXPORT_SYMBOL_GPL(bus_for_each_dev);
@@ -339,8 +341,10 @@ struct device *bus_find_device(struct bus_type *bus,
 	if (!bus || !bus->p)
 		return NULL;
 
-	klist_iter_init_node(&bus->p->klist_devices, &i,
-			     (start ? &start->p->knode_bus : NULL));
+	if (klist_iter_init_node(&bus->p->klist_devices, &i,
+				 (start ? &start->p->knode_bus : NULL)) < 0)
+		return NULL;
+
 	while ((dev = next_device(&i)))
 		if (match(dev, data) && get_device(dev))
 			break;
@@ -393,7 +397,9 @@ struct device *subsys_find_device_by_id(struct bus_type *subsys, unsigned int id
 		return NULL;
 
 	if (hint) {
-		klist_iter_init_node(&subsys->p->klist_devices, &i, &hint->p->knode_bus);
+		if (klist_iter_init_node(&subsys->p->klist_devices, &i,
+					 &hint->p->knode_bus) < 0)
+			return NULL;
 		dev = next_device(&i);
 		if (dev && dev->id == id && get_device(dev)) {
 			klist_iter_exit(&i);
@@ -455,11 +461,13 @@ int bus_for_each_drv(struct bus_type *bus, struct device_driver *start,
 	if (!bus)
 		return -EINVAL;
 
-	klist_iter_init_node(&bus->p->klist_drivers, &i,
-			     start ? &start->p->knode_bus : NULL);
-	while ((drv = next_driver(&i)) && !error)
-		error = fn(drv, data);
-	klist_iter_exit(&i);
+	error = klist_iter_init_node(&bus->p->klist_drivers, &i,
+				     start ? &start->p->knode_bus : NULL);
+	if (!error) {
+		while ((drv = next_driver(&i)) && !error)
+			error = fn(drv, data);
+		klist_iter_exit(&i);
+	}
 	return error;
 }
 EXPORT_SYMBOL_GPL(bus_for_each_drv);
@@ -1057,15 +1065,19 @@ EXPORT_SYMBOL_GPL(bus_sort_breadthfirst);
  * otherwise if it is NULL, the iteration starts at the beginning of
  * the list.
  */
-void subsys_dev_iter_init(struct subsys_dev_iter *iter, struct bus_type *subsys,
-			  struct device *start, const struct device_type *type)
+int subsys_dev_iter_init(struct subsys_dev_iter *iter, struct bus_type *subsys,
+			 struct device *start, const struct device_type *type)
 {
 	struct klist_node *start_knode = NULL;
+	int error;
 
 	if (start)
 		start_knode = &start->p->knode_bus;
-	klist_iter_init_node(&subsys->p->klist_devices, &iter->ki, start_knode);
-	iter->type = type;
+	error = klist_iter_init_node(&subsys->p->klist_devices, &iter->ki,
+				     start_knode);
+	if (!error)
+		iter->type = type;
+	return error;
 }
 EXPORT_SYMBOL_GPL(subsys_dev_iter_init);
 
diff --git a/drivers/base/class.c b/drivers/base/class.c
index f96f704..46bcbc0 100644
--- a/drivers/base/class.c
+++ b/drivers/base/class.c
@@ -290,15 +290,20 @@ void class_destroy(struct class *cls)
  * otherwise if it is NULL, the iteration starts at the beginning of
  * the list.
  */
-void class_dev_iter_init(struct class_dev_iter *iter, struct class *class,
-			 struct device *start, const struct device_type *type)
+int class_dev_iter_init(struct class_dev_iter *iter, struct class *class,
+			struct device *start, const struct device_type *type)
 {
 	struct klist_node *start_knode = NULL;
+	int error;
 
 	if (start)
 		start_knode = &start->knode_class;
-	klist_iter_init_node(&class->p->klist_devices, &iter->ki, start_knode);
-	iter->type = type;
+	error = klist_iter_init_node(&class->p->klist_devices, &iter->ki,
+				     start_knode);
+	if (!error)
+		iter->type = type;
+
+	return error;
 }
 EXPORT_SYMBOL_GPL(class_dev_iter_init);
 
@@ -376,14 +381,15 @@ int class_for_each_device(struct class *class, struct device *start,
 		return -EINVAL;
 	}
 
-	class_dev_iter_init(&iter, class, start, NULL);
-	while ((dev = class_dev_iter_next(&iter))) {
-		error = fn(dev, data);
-		if (error)
-			break;
+	error = class_dev_iter_init(&iter, class, start, NULL);
+	if (!error) {
+		while ((dev = class_dev_iter_next(&iter))) {
+			error = fn(dev, data);
+			if (error)
+				break;
+		}
+		class_dev_iter_exit(&iter);
 	}
-	class_dev_iter_exit(&iter);
-
 	return error;
 }
 EXPORT_SYMBOL_GPL(class_for_each_device);
@@ -423,7 +429,9 @@ struct device *class_find_device(struct class *class, struct device *start,
 		return NULL;
 	}
 
-	class_dev_iter_init(&iter, class, start, NULL);
+	if (class_dev_iter_init(&iter, class, start, NULL) < 0)
+		return NULL;
+
 	while ((dev = class_dev_iter_next(&iter))) {
 		if (match(dev, data)) {
 			get_device(dev);
diff --git a/drivers/base/driver.c b/drivers/base/driver.c
index 9e29943..d7836c1 100644
--- a/drivers/base/driver.c
+++ b/drivers/base/driver.c
@@ -50,11 +50,13 @@ int driver_for_each_device(struct device_driver *drv, struct device *start,
 	if (!drv)
 		return -EINVAL;
 
-	klist_iter_init_node(&drv->p->klist_devices, &i,
-			     start ? &start->p->knode_driver : NULL);
-	while ((dev = next_device(&i)) && !error)
-		error = fn(dev, data);
-	klist_iter_exit(&i);
+	error = klist_iter_init_node(&drv->p->klist_devices, &i,
+				     start ? &start->p->knode_driver : NULL);
+	if (!error) {
+		while ((dev = next_device(&i)) && !error)
+			error = fn(dev, data);
+		klist_iter_exit(&i);
+	}
 	return error;
 }
 EXPORT_SYMBOL_GPL(driver_for_each_device);
@@ -84,8 +86,10 @@ struct device *driver_find_device(struct device_driver *drv,
 	if (!drv || !drv->p)
 		return NULL;
 
-	klist_iter_init_node(&drv->p->klist_devices, &i,
-			     (start ? &start->p->knode_driver : NULL));
+	if (klist_iter_init_node(&drv->p->klist_devices, &i,
+				 (start ? &start->p->knode_driver : NULL)) < 0)
+		return NULL;
+
 	while ((dev = next_device(&i)))
 		if (match(dev, data) && get_device(dev))
 			break;
diff --git a/include/linux/device.h b/include/linux/device.h
index 952b010..9b5bed7 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -141,7 +141,7 @@ struct subsys_dev_iter {
 	struct klist_iter		ki;
 	const struct device_type	*type;
 };
-void subsys_dev_iter_init(struct subsys_dev_iter *iter,
+int subsys_dev_iter_init(struct subsys_dev_iter *iter,
 			 struct bus_type *subsys,
 			 struct device *start,
 			 const struct device_type *type);
@@ -400,10 +400,10 @@ int class_compat_create_link(struct class_compat *cls, struct device *dev,
 void class_compat_remove_link(struct class_compat *cls, struct device *dev,
 			      struct device *device_link);
 
-extern void class_dev_iter_init(struct class_dev_iter *iter,
-				struct class *class,
-				struct device *start,
-				const struct device_type *type);
+extern int class_dev_iter_init(struct class_dev_iter *iter,
+			       struct class *class,
+			       struct device *start,
+			       const struct device_type *type);
 extern struct device *class_dev_iter_next(struct class_dev_iter *iter);
 extern void class_dev_iter_exit(struct class_dev_iter *iter);
 
diff --git a/include/linux/klist.h b/include/linux/klist.h
index a370ce5..9f63323 100644
--- a/include/linux/klist.h
+++ b/include/linux/klist.h
@@ -60,7 +60,7 @@ struct klist_iter {
 
 
 extern void klist_iter_init(struct klist *k, struct klist_iter *i);
-extern void klist_iter_init_node(struct klist *k, struct klist_iter *i,
+extern int klist_iter_init_node(struct klist *k, struct klist_iter *i,
 				 struct klist_node *n);
 extern void klist_iter_exit(struct klist_iter *i);
 extern struct klist_node *klist_next(struct klist_iter *i);
diff --git a/lib/klist.c b/lib/klist.c
index 358a368..dc911f8 100644
--- a/lib/klist.c
+++ b/lib/klist.c
@@ -278,13 +278,20 @@ EXPORT_SYMBOL_GPL(klist_node_attached);
  * Similar to klist_iter_init(), but starts the action off with @n,
  * instead of with the list head.
  */
-void klist_iter_init_node(struct klist *k, struct klist_iter *i,
-			  struct klist_node *n)
+int klist_iter_init_node(struct klist *k, struct klist_iter *i,
+			 struct klist_node *n)
 {
+	if (n) {
+		if (!kref_get_unless_zero(&n->n_ref))
+			return -ENODEV;
+		if (!n->n_klist || knode_dead(n)) {
+			klist_dec_and_del(n);
+			return -ENODEV;
+		}
+	}
 	i->i_klist = k;
 	i->i_cur = n;
-	if (n)
-		kref_get(&n->n_ref);
+	return 0;
 }
 EXPORT_SYMBOL_GPL(klist_iter_init_node);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-05-21 22:59             ` Guenter Roeck
@ 2014-05-22  7:14               ` Greg Kroah-Hartmann
  2014-05-22  7:22                 ` Guenter Roeck
  0 siblings, 1 reply; 16+ messages in thread
From: Greg Kroah-Hartmann @ 2014-05-22  7:14 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Francesco Ruggeri, Hannes Reinecke, linux-kernel

On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote:
> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote:
> > I have been using an x86 platform.
> > When I started working on it I got early crashes until I added the
> > check for p not NULL in
> > 
> > +void bus_release_device(struct device *dev)
> > +{
> > + struct device_private *p = dev->p;
> > +
> > + if (p && klist_node_attached(&p->knode_bus))
> > + klist_put_last(&p->knode_bus);
> > +}
> > +
> > 
> > Maybe on powerpc *p is overriden between device_del and device_release?
> > 
> > Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are
> > treated as WARN_ONs in the current klist code.
> > The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I
> > ran into it without the second patch (but only when I ran my module
> > and tests).
> > 
> Hi Francesco,
> 
> I replaced the BUG_ON with WARN_ON; still crashes.
> 
> Anyway, the problem seems to be known. I found two related exchanges.
> 
> [1] describes pretty much the same problem. I don't see if/where it was
> ever fixed, though.
> 
> [2] is a patch to fix the problem. It did not apply cleanly to 3.14,
> so I had to make some adjustments in klist_iter_init_node. Resulting
> patch is below. With this patch, the problem is gone. It is not perfect,
> as it aborts the loop if it encounters a deleted kobject, but it is better
> than nothing. Unfortunately, the patch never made it upstream; no idea why.
> Copying the author and Greg to get additional feedback.
> 
> Guenter
> 
> [1] https://lkml.org/lkml/2008/10/26/79
> [2] https://lkml.org/lkml/2012/4/16/218

2 years ago?  I have no idea what was up with that, sorry...

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-05-22  7:14               ` Greg Kroah-Hartmann
@ 2014-05-22  7:22                 ` Guenter Roeck
  2014-05-22 16:19                   ` Francesco Ruggeri
  2014-05-23  2:31                   ` Greg Kroah-Hartmann
  0 siblings, 2 replies; 16+ messages in thread
From: Guenter Roeck @ 2014-05-22  7:22 UTC (permalink / raw)
  To: Greg Kroah-Hartmann; +Cc: Francesco Ruggeri, Hannes Reinecke, linux-kernel

On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote:
> On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote:
>> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote:
>>> I have been using an x86 platform.
>>> When I started working on it I got early crashes until I added the
>>> check for p not NULL in
>>>
>>> +void bus_release_device(struct device *dev)
>>> +{
>>> + struct device_private *p = dev->p;
>>> +
>>> + if (p && klist_node_attached(&p->knode_bus))
>>> + klist_put_last(&p->knode_bus);
>>> +}
>>> +
>>>
>>> Maybe on powerpc *p is overriden between device_del and device_release?
>>>
>>> Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are
>>> treated as WARN_ONs in the current klist code.
>>> The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I
>>> ran into it without the second patch (but only when I ran my module
>>> and tests).
>>>
>> Hi Francesco,
>>
>> I replaced the BUG_ON with WARN_ON; still crashes.
>>
>> Anyway, the problem seems to be known. I found two related exchanges.
>>
>> [1] describes pretty much the same problem. I don't see if/where it was
>> ever fixed, though.
>>
>> [2] is a patch to fix the problem. It did not apply cleanly to 3.14,
>> so I had to make some adjustments in klist_iter_init_node. Resulting
>> patch is below. With this patch, the problem is gone. It is not perfect,
>> as it aborts the loop if it encounters a deleted kobject, but it is better
>> than nothing. Unfortunately, the patch never made it upstream; no idea why.
>> Copying the author and Greg to get additional feedback.
>>
>> Guenter
>>
>> [1] https://lkml.org/lkml/2008/10/26/79
>> [2] https://lkml.org/lkml/2012/4/16/218
>
> 2 years ago?  I have no idea what was up with that, sorry...
>

Ok, but do you have comments on the patch itself in its current version ?

Guenter


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-05-22  7:22                 ` Guenter Roeck
@ 2014-05-22 16:19                   ` Francesco Ruggeri
  2014-05-22 17:57                     ` Guenter Roeck
  2014-05-23  2:31                   ` Greg Kroah-Hartmann
  1 sibling, 1 reply; 16+ messages in thread
From: Francesco Ruggeri @ 2014-05-22 16:19 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Greg Kroah-Hartmann, Hannes Reinecke, linux-kernel

Aborting a search does not sound like a correct solution.
How does a higher level user (eg for_each_pci_dev) know that a search
was aborted and decide whether it should try again, assuming it would
be ok repeating the action on the devices visited the first time?

Francesco


On Thu, May 22, 2014 at 12:22 AM, Guenter Roeck <linux@roeck-us.net> wrote:
> On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote:
>>
>> On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote:
>>>
>>> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote:
>>>>
>>>> I have been using an x86 platform.
>>>> When I started working on it I got early crashes until I added the
>>>> check for p not NULL in
>>>>
>>>> +void bus_release_device(struct device *dev)
>>>> +{
>>>> + struct device_private *p = dev->p;
>>>> +
>>>> + if (p && klist_node_attached(&p->knode_bus))
>>>> + klist_put_last(&p->knode_bus);
>>>> +}
>>>> +
>>>>
>>>> Maybe on powerpc *p is overriden between device_del and device_release?
>>>>
>>>> Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are
>>>> treated as WARN_ONs in the current klist code.
>>>> The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I
>>>> ran into it without the second patch (but only when I ran my module
>>>> and tests).
>>>>
>>> Hi Francesco,
>>>
>>> I replaced the BUG_ON with WARN_ON; still crashes.
>>>
>>> Anyway, the problem seems to be known. I found two related exchanges.
>>>
>>> [1] describes pretty much the same problem. I don't see if/where it was
>>> ever fixed, though.
>>>
>>> [2] is a patch to fix the problem. It did not apply cleanly to 3.14,
>>> so I had to make some adjustments in klist_iter_init_node. Resulting
>>> patch is below. With this patch, the problem is gone. It is not perfect,
>>> as it aborts the loop if it encounters a deleted kobject, but it is
>>> better
>>> than nothing. Unfortunately, the patch never made it upstream; no idea
>>> why.
>>> Copying the author and Greg to get additional feedback.
>>>
>>> Guenter
>>>
>>> [1] https://lkml.org/lkml/2008/10/26/79
>>> [2] https://lkml.org/lkml/2012/4/16/218
>>
>>
>> 2 years ago?  I have no idea what was up with that, sorry...
>>
>
> Ok, but do you have comments on the patch itself in its current version ?
>
> Guenter
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-05-22 16:19                   ` Francesco Ruggeri
@ 2014-05-22 17:57                     ` Guenter Roeck
  0 siblings, 0 replies; 16+ messages in thread
From: Guenter Roeck @ 2014-05-22 17:57 UTC (permalink / raw)
  To: Francesco Ruggeri; +Cc: Greg Kroah-Hartmann, Hannes Reinecke, linux-kernel

On Thu, May 22, 2014 at 09:19:40AM -0700, Francesco Ruggeri wrote:
> Aborting a search does not sound like a correct solution.
> How does a higher level user (eg for_each_pci_dev) know that a search
> was aborted and decide whether it should try again, assuming it would
> be ok repeating the action on the devices visited the first time?
> 
Agreed, it is less than desirable.

I would consider this to be a secondary problem, though, the immediate
problem being the crash. One possible solution might be to have the various
functions return error codes (ERR_PTR), but that would be quite invasive as
well. I really think we need input from Greg and, if the solution touches
the PCI subsystem, from Bjorn Helgaas to find an acceptable solution
to that problem.

Guenter

> Francesco
> 
> 
> On Thu, May 22, 2014 at 12:22 AM, Guenter Roeck <linux@roeck-us.net> wrote:
> > On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote:
> >>
> >> On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote:
> >>>
> >>> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote:
> >>>>
> >>>> I have been using an x86 platform.
> >>>> When I started working on it I got early crashes until I added the
> >>>> check for p not NULL in
> >>>>
> >>>> +void bus_release_device(struct device *dev)
> >>>> +{
> >>>> + struct device_private *p = dev->p;
> >>>> +
> >>>> + if (p && klist_node_attached(&p->knode_bus))
> >>>> + klist_put_last(&p->knode_bus);
> >>>> +}
> >>>> +
> >>>>
> >>>> Maybe on powerpc *p is overriden between device_del and device_release?
> >>>>
> >>>> Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are
> >>>> treated as WARN_ONs in the current klist code.
> >>>> The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I
> >>>> ran into it without the second patch (but only when I ran my module
> >>>> and tests).
> >>>>
> >>> Hi Francesco,
> >>>
> >>> I replaced the BUG_ON with WARN_ON; still crashes.
> >>>
> >>> Anyway, the problem seems to be known. I found two related exchanges.
> >>>
> >>> [1] describes pretty much the same problem. I don't see if/where it was
> >>> ever fixed, though.
> >>>
> >>> [2] is a patch to fix the problem. It did not apply cleanly to 3.14,
> >>> so I had to make some adjustments in klist_iter_init_node. Resulting
> >>> patch is below. With this patch, the problem is gone. It is not perfect,
> >>> as it aborts the loop if it encounters a deleted kobject, but it is
> >>> better
> >>> than nothing. Unfortunately, the patch never made it upstream; no idea
> >>> why.
> >>> Copying the author and Greg to get additional feedback.
> >>>
> >>> Guenter
> >>>
> >>> [1] https://lkml.org/lkml/2008/10/26/79
> >>> [2] https://lkml.org/lkml/2012/4/16/218
> >>
> >>
> >> 2 years ago?  I have no idea what was up with that, sorry...
> >>
> >
> > Ok, but do you have comments on the patch itself in its current version ?
> >
> > Guenter
> >
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-05-22  7:22                 ` Guenter Roeck
  2014-05-22 16:19                   ` Francesco Ruggeri
@ 2014-05-23  2:31                   ` Greg Kroah-Hartmann
  1 sibling, 0 replies; 16+ messages in thread
From: Greg Kroah-Hartmann @ 2014-05-23  2:31 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Francesco Ruggeri, Hannes Reinecke, linux-kernel

On Thu, May 22, 2014 at 12:22:40AM -0700, Guenter Roeck wrote:
> On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote:
> > On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote:
> >> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote:
> >>> I have been using an x86 platform.
> >>> When I started working on it I got early crashes until I added the
> >>> check for p not NULL in
> >>>
> >>> +void bus_release_device(struct device *dev)
> >>> +{
> >>> + struct device_private *p = dev->p;
> >>> +
> >>> + if (p && klist_node_attached(&p->knode_bus))
> >>> + klist_put_last(&p->knode_bus);
> >>> +}
> >>> +
> >>>
> >>> Maybe on powerpc *p is overriden between device_del and device_release?
> >>>
> >>> Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are
> >>> treated as WARN_ONs in the current klist code.
> >>> The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I
> >>> ran into it without the second patch (but only when I ran my module
> >>> and tests).
> >>>
> >> Hi Francesco,
> >>
> >> I replaced the BUG_ON with WARN_ON; still crashes.
> >>
> >> Anyway, the problem seems to be known. I found two related exchanges.
> >>
> >> [1] describes pretty much the same problem. I don't see if/where it was
> >> ever fixed, though.
> >>
> >> [2] is a patch to fix the problem. It did not apply cleanly to 3.14,
> >> so I had to make some adjustments in klist_iter_init_node. Resulting
> >> patch is below. With this patch, the problem is gone. It is not perfect,
> >> as it aborts the loop if it encounters a deleted kobject, but it is better
> >> than nothing. Unfortunately, the patch never made it upstream; no idea why.
> >> Copying the author and Greg to get additional feedback.
> >>
> >> Guenter
> >>
> >> [1] https://lkml.org/lkml/2008/10/26/79
> >> [2] https://lkml.org/lkml/2012/4/16/218
> >
> > 2 years ago?  I have no idea what was up with that, sorry...
> >
> 
> Ok, but do you have comments on the patch itself in its current version ?

I have no idea, and at the moment, no time to look at this at all,
sorry.  Feel free to work on it and verify if it is a valid fix or not
for this issue and let me know.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-06-04  3:25   ` Guenter Roeck
@ 2014-06-04  6:22     ` Francesco Ruggeri
  0 siblings, 0 replies; 16+ messages in thread
From: Francesco Ruggeri @ 2014-06-04  6:22 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Greg KH, Francesco Ruggeri, linux-kernel, Hannes Reinecke

>>
> Look for callers of bus_find_device. Unless I am missing something, only pci
> and scsi code call it with non-NULL 'start' argument, and the scsi use is
> limited to a walk through scsi devices for a proc file.
>
> Makes me wonder if the start argument should go away, and if pci and scsi
> should use another means to walk through devices.

I think that would be the correct approach.
In case of pci all functions using pci_get_device, pci_get_subsys or
pci_get_class (which call pci_get_dev_by_id/bus_find_device) to
iterate over the whole list using a non-NULL start argument would have
to be audited.
There seem to be quite a few of them using loops of the kind
while ((dev = pci_get_device( …, dev)) != NULL)
(and similarly for pci_get_subsys and pci_get_class) and they could
all be vulnerable if they try to resume their search from a device
that was unregistered.

Francesco


>
> Guenter

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-06-03 23:21 ` Greg KH
@ 2014-06-04  3:25   ` Guenter Roeck
  2014-06-04  6:22     ` Francesco Ruggeri
  0 siblings, 1 reply; 16+ messages in thread
From: Guenter Roeck @ 2014-06-04  3:25 UTC (permalink / raw)
  To: Greg KH; +Cc: Francesco Ruggeri, linux-kernel, hare, fruggeri

On Tue, Jun 03, 2014 at 04:21:00PM -0700, Greg KH wrote:
> On Tue, Jun 03, 2014 at 03:55:02PM -0700, Francesco Ruggeri wrote:
> > In-Reply-To: <20140523023141.GC13900@kroah.com>
> > 
> > 
> > Hi Guenter,
> > I got back to looking into this crash.
> > Just as an example, the attached diffs also fix my bus_find_device problem for
> > traversals that start from the head of the list and traverse it completely.
> > They are very specific to the case of bus_find_device, and a complete solution
> > would affect a lot of code.
> > The main issue seems to be that when a device is found in a klist by say
> > bus_find_device the klist_node reference should be returned to the caller,
> > who should then decide whether to use it for the next klist search, drop it or
> > maybe exchange it for a struct device reference. When resuming a search one
> > should already hold a klist_node reference from the previous search.
> > This model is broken by several functions using struct devices such as
> > bus_find_device, which resume klist searches on the implicit assumption that
> > holding a reference to the struct device is enough to acquire one on the
> > klist_node. 
> > The only reason that this has not been a big issue so far is probably that
> > on most systems struct devices are not destroyed and created very often.
> 
> Not true, this happens on every USB device insertion and removal, and on
> startup and shutdown.  What makes PCI special that we aren't hitting
> these issues in USB and other subsystems that do a lot of device
> creation/removal?
> 
Look for callers of bus_find_device. Unless I am missing something, only pci
and scsi code call it with non-NULL 'start' argument, and the scsi use is
limited to a walk through scsi devices for a proc file.

Makes me wonder if the start argument should go away, and if pci and scsi
should use another means to walk through devices.

Guenter

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-06-03 22:55 Francesco Ruggeri
  2014-06-03 23:21 ` Greg KH
@ 2014-06-03 23:23 ` Greg KH
  1 sibling, 0 replies; 16+ messages in thread
From: Greg KH @ 2014-06-03 23:23 UTC (permalink / raw)
  To: Francesco Ruggeri; +Cc: linux-kernel, hare, linux, fruggeri

On Tue, Jun 03, 2014 at 03:55:02PM -0700, Francesco Ruggeri wrote:
> @@ -719,6 +719,11 @@ struct pci_dev *pci_get_device(unsigned 
>  struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
>  				unsigned int ss_vendor, unsigned int ss_device,
>  				struct pci_dev *from);
> +struct pci_dev *pci_get_device_noref(unsigned int vendor, unsigned int device,
> +				     struct pci_dev *from);
> +struct pci_dev *pci_get_subsys_noref(unsigned int vendor, unsigned int device,
> +				     unsigned int ss_vendor, unsigned int ss_device,
> +				     struct pci_dev *from);

To follow up, what drivers are you thinking need to make these calls?
Perhaps they shouldn't be doing something like this?

And, to answer my previous question, the reason PCI is different is that
drivers want to walk the list of devices to find "stupid" things like
this out, USB doesn't do dumb stuff like that :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
  2014-06-03 22:55 Francesco Ruggeri
@ 2014-06-03 23:21 ` Greg KH
  2014-06-04  3:25   ` Guenter Roeck
  2014-06-03 23:23 ` Greg KH
  1 sibling, 1 reply; 16+ messages in thread
From: Greg KH @ 2014-06-03 23:21 UTC (permalink / raw)
  To: Francesco Ruggeri; +Cc: linux-kernel, hare, linux, fruggeri

On Tue, Jun 03, 2014 at 03:55:02PM -0700, Francesco Ruggeri wrote:
> In-Reply-To: <20140523023141.GC13900@kroah.com>
> 
> 
> Hi Guenter,
> I got back to looking into this crash.
> Just as an example, the attached diffs also fix my bus_find_device problem for
> traversals that start from the head of the list and traverse it completely.
> They are very specific to the case of bus_find_device, and a complete solution
> would affect a lot of code.
> The main issue seems to be that when a device is found in a klist by say
> bus_find_device the klist_node reference should be returned to the caller,
> who should then decide whether to use it for the next klist search, drop it or
> maybe exchange it for a struct device reference. When resuming a search one
> should already hold a klist_node reference from the previous search.
> This model is broken by several functions using struct devices such as
> bus_find_device, which resume klist searches on the implicit assumption that
> holding a reference to the struct device is enough to acquire one on the
> klist_node. 
> The only reason that this has not been a big issue so far is probably that
> on most systems struct devices are not destroyed and created very often.

Not true, this happens on every USB device insertion and removal, and on
startup and shutdown.  What makes PCI special that we aren't hitting
these issues in USB and other subsystems that do a lot of device
creation/removal?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: pci: kernel crash in bus_find_device
@ 2014-06-03 22:55 Francesco Ruggeri
  2014-06-03 23:21 ` Greg KH
  2014-06-03 23:23 ` Greg KH
  0 siblings, 2 replies; 16+ messages in thread
From: Francesco Ruggeri @ 2014-06-03 22:55 UTC (permalink / raw)
  To: linux-kernel, hare, gregkh, linux; +Cc: fruggeri

In-Reply-To: <20140523023141.GC13900@kroah.com>


Hi Guenter,
I got back to looking into this crash.
Just as an example, the attached diffs also fix my bus_find_device problem for
traversals that start from the head of the list and traverse it completely.
They are very specific to the case of bus_find_device, and a complete solution
would affect a lot of code.
The main issue seems to be that when a device is found in a klist by say
bus_find_device the klist_node reference should be returned to the caller,
who should then decide whether to use it for the next klist search, drop it or
maybe exchange it for a struct device reference. When resuming a search one
should already hold a klist_node reference from the previous search.
This model is broken by several functions using struct devices such as
bus_find_device, which resume klist searches on the implicit assumption that
holding a reference to the struct device is enough to acquire one on the
klist_node. 
The only reason that this has not been a big issue so far is probably that
on most systems struct devices are not destroyed and created very often.
Thanks,
Francesco


Index: linux-3.4/lib/klist.c
===================================================================
--- linux-3.4.orig/lib/klist.c
+++ linux-3.4/lib/klist.c
@@ -318,6 +318,51 @@ void klist_iter_exit(struct klist_iter *
 }
 EXPORT_SYMBOL_GPL(klist_iter_exit);
 
+/**
+ * klist_iter_init_node_noref - Initialize a klist_iter structure without
+ *				taking a kref.
+ * @k: klist we're iterating.
+ * @i: klist_iter we're filling.
+ * @n: node to start with.
+ *
+ * Similar to klist_iter_init_noref(), but starts the action off with @n,
+ * instead of with the list head.
+ */
+void klist_iter_init_node_noref(struct klist *k, struct klist_iter *i,
+			  struct klist_node *n)
+{
+	i->i_klist = k;
+	i->i_cur = n;
+}
+EXPORT_SYMBOL_GPL(klist_iter_init_node_noref);
+
+/**
+ * klist_iter_init_noref - Initalize a klist_iter structure without taking
+ *			   a kref.
+ * @k: klist we're iterating.
+ * @i: klist_iter structure we're filling.
+ *
+ * Similar to klist_iter_init_node_noref(), but start with the list head.
+ */
+void klist_iter_init_noref(struct klist *k, struct klist_iter *i)
+{
+	klist_iter_init_node_noref(k, i, NULL);
+}
+EXPORT_SYMBOL_GPL(klist_iter_init_noref);
+
+/**
+ * klist_iter_exit_noref - Finish a list iteration without dropping a kref.
+ * @i: Iterator structure.
+ *
+ * Not really needed, but always good form.
+ */
+void klist_iter_exit_noref(struct klist_iter *i)
+{
+	if (i->i_cur)
+		i->i_cur = NULL;
+}
+EXPORT_SYMBOL_GPL(klist_iter_exit_noref);
+
 static struct klist_node *to_klist_node(struct list_head *n)
 {
 	return container_of(n, struct klist_node, n_node);
Index: linux-3.4/drivers/base/bus.c
===================================================================
--- linux-3.4.orig/drivers/base/bus.c
+++ linux-3.4/drivers/base/bus.c
@@ -341,6 +341,37 @@ struct device *bus_find_device(struct bu
 }
 EXPORT_SYMBOL_GPL(bus_find_device);
 
+/**
+ * bus_find_device_noref - device iterator for locating a particular device.
+ * @bus: bus type
+ * @start: Device to begin with
+ * @data: Data to pass to match function
+ * @match: Callback function to check device
+ *
+ * Same as bus_find_device, but it assumes caller already has klist_node ref,
+ * and it returns to caller a struct device with the new kref.
+ */
+
+struct device *bus_find_device_noref(struct bus_type *bus,
+				     struct device *start, void *data,
+				     int (*match)(struct device *dev, void *data))
+{
+	struct klist_iter i;
+	struct device *dev;
+
+	if (!bus || !bus->p)
+		return NULL;
+
+	klist_iter_init_node_noref(&bus->p->klist_devices, &i,
+				   (start ? &start->p->knode_bus : NULL));
+	while ((dev = next_device(&i)))
+		if (match(dev, data) && get_device(dev))
+			break;
+	klist_iter_exit_noref(&i);
+	return dev;
+}
+EXPORT_SYMBOL_GPL(bus_find_device_noref);
+
 static int match_name(struct device *dev, void *data)
 {
 	const char *name = data;
Index: linux-3.4/drivers/pci/search.c
===================================================================
--- linux-3.4.orig/drivers/pci/search.c
+++ linux-3.4/drivers/pci/search.c
@@ -289,6 +289,94 @@ pci_get_device(unsigned int vendor, unsi
 	return pci_get_subsys(vendor, device, PCI_ANY_ID, PCI_ANY_ID, from);
 }
 
+/*
+ * pci_get_dev_by_id_noref - begin or continue searching for a PCI device by id
+ * @id: pointer to struct pci_device_id to match for the device
+ * @from: Previous PCI device found in search, or %NULL for new search.
+ *
+ * Same as pci_get_dev_by_id, but it expects caller to hold a kref on knode_bus
+ * in "from" from previous search, and it holds a kref on knode_bus in returned
+ * pdev.
+ * This is an internal function for use by the other search functions in
+ * this file.
+ */
+static struct pci_dev *pci_get_dev_by_id_noref(const struct pci_device_id *id,
+					       struct pci_dev *from)
+{
+	struct device *dev;
+	struct device *dev_start = NULL;
+	struct pci_dev *pdev = NULL;
+
+	WARN_ON(in_interrupt());
+	if (from)
+		dev_start = &from->dev;
+	dev = bus_find_device_noref(&pci_bus_type, dev_start, (void *)id,
+				    match_pci_dev_by_id);
+	if (dev)
+		pdev = to_pci_dev(dev);
+	if (from)
+		pci_dev_put(from);
+	return pdev;
+}
+
+/**
+ * pci_get_subsys_noref - begin or continue searching for a PCI device by vendor/subvendor/device/subdevice id
+ * @vendor: PCI vendor id to match, or %PCI_ANY_ID to match all vendor ids
+ * @device: PCI device id to match, or %PCI_ANY_ID to match all device ids
+ * @ss_vendor: PCI subsystem vendor id to match, or %PCI_ANY_ID to match all vendor ids
+ * @ss_device: PCI subsystem device id to match, or %PCI_ANY_ID to match all device ids
+ * @from: Previous PCI device found in search, or %NULL for new search.
+ *
+ * Same as pci_get_subsys, but it expects caller to hold a kref on knode_bus
+ * in "from" from previous search, and it holds a kref on knode_bus in returned
+ * pdev.
+ */
+struct pci_dev *pci_get_subsys_noref(unsigned int vendor, unsigned int device,
+				     unsigned int ss_vendor, unsigned int ss_device,
+				     struct pci_dev *from)
+{
+	struct pci_dev *pdev;
+	struct pci_device_id *id;
+
+	/*
+	 * pci_find_subsys() can be called on the ide_setup() path,
+	 * super-early in boot.  But the down_read() will enable local
+	 * interrupts, which can cause some machines to crash.  So here we
+	 * detect and flag that situation and bail out early.
+	 */
+	if (unlikely(no_pci_devices()))
+		return NULL;
+
+	id = kzalloc(sizeof(*id), GFP_KERNEL);
+	if (!id)
+		return NULL;
+	id->vendor = vendor;
+	id->device = device;
+	id->subvendor = ss_vendor;
+	id->subdevice = ss_device;
+
+	pdev = pci_get_dev_by_id_noref(id, from);
+	kfree(id);
+
+	return pdev;
+}
+
+/**
+ * pci_get_device_noref - begin or continue searching for a PCI device by vendor/device id
+ * @vendor: PCI vendor id to match, or %PCI_ANY_ID to match all vendor ids
+ * @device: PCI device id to match, or %PCI_ANY_ID to match all device ids
+ * @from: Previous PCI device found in search, or %NULL for new search.
+ *
+ * Same as pci_get_subsys, but it expects caller to hold a kref on knode_bus
+ * in "from" from previous search, and it holds a kref on knode_bus in returned
+ * pdev.
+ */
+struct pci_dev *
+pci_get_device_noref(unsigned int vendor, unsigned int device, struct pci_dev *from)
+{
+	return pci_get_subsys_noref(vendor, device, PCI_ANY_ID, PCI_ANY_ID, from);
+}
+
 /**
  * pci_get_class - begin or continue searching for a PCI device by class
  * @class: search for a PCI device with this class designation
@@ -355,5 +443,7 @@ EXPORT_SYMBOL(pci_find_next_bus);
 /* For everyone */
 EXPORT_SYMBOL(pci_get_device);
 EXPORT_SYMBOL(pci_get_subsys);
+EXPORT_SYMBOL(pci_get_device_noref);
+EXPORT_SYMBOL(pci_get_subsys_noref);
 EXPORT_SYMBOL(pci_get_slot);
 EXPORT_SYMBOL(pci_get_class);
Index: linux-3.4/include/linux/pci.h
===================================================================
--- linux-3.4.orig/include/linux/pci.h
+++ linux-3.4/include/linux/pci.h
@@ -719,6 +719,11 @@ struct pci_dev *pci_get_device(unsigned 
 struct pci_dev *pci_get_subsys(unsigned int vendor, unsigned int device,
 				unsigned int ss_vendor, unsigned int ss_device,
 				struct pci_dev *from);
+struct pci_dev *pci_get_device_noref(unsigned int vendor, unsigned int device,
+				     struct pci_dev *from);
+struct pci_dev *pci_get_subsys_noref(unsigned int vendor, unsigned int device,
+				     unsigned int ss_vendor, unsigned int ss_device,
+				     struct pci_dev *from);
 struct pci_dev *pci_get_slot(struct pci_bus *bus, unsigned int devfn);
 struct pci_dev *pci_get_domain_bus_and_slot(int domain, unsigned int bus,
 					    unsigned int devfn);
Index: linux-3.4/include/linux/device.h
===================================================================
--- linux-3.4.orig/include/linux/device.h
+++ linux-3.4/include/linux/device.h
@@ -140,6 +140,9 @@ int bus_for_each_dev(struct bus_type *bu
 struct device *bus_find_device(struct bus_type *bus, struct device *start,
 			       void *data,
 			       int (*match)(struct device *dev, void *data));
+struct device *bus_find_device_noref(struct bus_type *bus, struct device *start,
+				     void *data,
+				     int (*match)(struct device *dev, void *data));
 struct device *bus_find_device_by_name(struct bus_type *bus,
 				       struct device *start,
 				       const char *name);
Index: linux-3.4/include/linux/klist.h
===================================================================
--- linux-3.4.orig/include/linux/klist.h
+++ linux-3.4/include/linux/klist.h
@@ -63,6 +63,10 @@ extern void klist_iter_init(struct klist
 extern void klist_iter_init_node(struct klist *k, struct klist_iter *i,
 				 struct klist_node *n);
 extern void klist_iter_exit(struct klist_iter *i);
+extern void klist_iter_init_noref(struct klist *k, struct klist_iter *i);
+extern void klist_iter_init_node_noref(struct klist *k, struct klist_iter *i,
+				       struct klist_node *n);
+extern void klist_iter_exit_noref(struct klist_iter *i);
 extern struct klist_node *klist_next(struct klist_iter *i);
 
 #endif

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-06-04  6:22 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-05-20 19:17 pci: kernel crash in bus_find_device Francesco Ruggeri
2014-05-20 19:50 ` Guenter Roeck
2014-05-20 22:35   ` Francesco Ruggeri
2014-05-20 23:38     ` Guenter Roeck
     [not found]       ` <CA+HUmGge7AEpAnwAG_VJD2CKTtRBoC2bCGVU_t4qm-x6+OCr-g@mail.gmail.com>
     [not found]         ` <20140521193010.GA1721@roeck-us.net>
     [not found]           ` <CA+HUmGhm1VLTvMKW1TUUPqStUhD11M5u0VyTZyXyWz_ZS8uSVw@mail.gmail.com>
2014-05-21 22:59             ` Guenter Roeck
2014-05-22  7:14               ` Greg Kroah-Hartmann
2014-05-22  7:22                 ` Guenter Roeck
2014-05-22 16:19                   ` Francesco Ruggeri
2014-05-22 17:57                     ` Guenter Roeck
2014-05-23  2:31                   ` Greg Kroah-Hartmann
2014-05-21 17:39     ` Guenter Roeck
2014-06-03 22:55 Francesco Ruggeri
2014-06-03 23:21 ` Greg KH
2014-06-04  3:25   ` Guenter Roeck
2014-06-04  6:22     ` Francesco Ruggeri
2014-06-03 23:23 ` Greg KH

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).