Linux-PCI Archive on lore.kernel.org
 help / color / Atom feed
From: "Oliver O'Halloran" <oohall@gmail.com>
To: linux-pci@vger.kernel.org
Cc: Lukas Wunner <lukas@wunner.de>
Subject: pciehp lockdep splat on 5.2-rc6
Date: Thu, 27 Jun 2019 18:21:50 +1000
Message-ID: <CAOSf1CGnYaGUTZQHbgy39dC47XNZRi4At2aTWRK0MeBvupKrxg@mail.gmail.com> (raw)

Hi all,

I've been re-working the powerpc pci code to make it possible to use
pciehp on powernv. While testing I'm consistently seeing this lockdep
splat when hot-adding a U.2 NVMe drive. This might be my bug, but all
my changes are contained in arch/powerpc/ and the stack traces point
entirely at generic code, so maybe not. Does this look like a real
issue?

Oliver

[  169.920642][ T1952] pcieport 0020:02:02.0: pciehp: Slot(0-3): Card present
[  169.924806][ T1952] pcieport 0020:02:02.0: pciehp: Slot(0-3): Link Up
[  170.083951][ T1952] pci 0020:08:00.0: [1344:5192] type 00 class 0x010802
[  170.088307][ T1952] pci 0020:08:00.0: reg 0x10: [mem
0x00000000-0x00003fff 64bit]
[  170.092711][ T1952] pci 0020:08:00.0: reg 0x30: [mem
0x00000000-0x0001ffff pref]
[  170.096839][ T1952] pci 0020:08:00.0: Max Payload Size set to 256
(was 128, max 256)
[  170.100973][ T1952] pci 0020:08:00.0: enabling Extended Tags
[  170.105323][ T1952] pci 0020:08:00.0: BAR0 [mem size 0x00004000
64bit]: requesting alignment to 0x10000
[  170.111719][ T1952] pci 0020:08:00.0: Adding to iommu group 8
[  170.142333][ T1952] pcieport 0020:02:02.0: bridge window [io
0x1000-0x0fff] to [bus 08] add_size 1000
[  170.145079][ T1952] pcieport 0020:02:02.0: BAR 7: no space for [io
size 0x1000]
[  170.147783][ T1952] pcieport 0020:02:02.0: BAR 7: failed to assign
[io  size 0x1000]
[  170.150524][ T1952] pcieport 0020:02:02.0: BAR 7: no space for [io
size 0x1000]
[  170.153292][ T1952] pcieport 0020:02:02.0: BAR 7: failed to assign
[io  size 0x1000]
[  170.156078][ T1952] pci 0020:08:00.0: BAR 6: assigned [mem
0x3fe800800000-0x3fe80081ffff pref]
[  170.158897][ T1952] pci 0020:08:00.0: BAR 0: assigned [mem
0x3fe800820000-0x3fe800823fff 64bit]
[  170.161761][ T1952] 0020:08:00.0: No device node associated with device !
[  170.164607][ T1952] pcieport 0020:02:02.0: PCI bridge to [bus 08]
[  170.167478][ T1952] pcieport 0020:02:02.0:   bridge window [mem
0x3fe800800000-0x3fe800ffffff]
[  170.170386][ T1952] pcieport 0020:02:02.0:   bridge window [mem
0x240100000000-0x2401ffffffff 64bit pref]
[  170.173389][ T1952] pnv_pcibios_bus_add_device: EEH: Setting up
device 0020:08:00.0.
[  170.176372][ T1952] EEH: Adding device 0020:08:00.0
[  170.179352][ T1952] pnv_eeh_probe_pdev: probing 0020:08:00.0
[  170.182597][ T1952] EEH: Add 0020:08:00.0 to Device PE#fd, Parent PE#0
[  170.185783][ T1952] pnv_eeh_probe_pdev: EEH enabled on 08:00.0 PHB#20-PE#fd
[  170.189244][ T1952] nvme 0020:08:00.0: runtime IRQ mapping not
provided by arch
[  170.192617][ T1952]
[  170.194049][  T656] nvme nvme1: pci function 0020:08:00.0
[  170.195732][ T1952] ======================================================
[  170.195743][ T1952] WARNING: possible circular locking dependency detected
[  170.201813][ T1947] nvme 0020:08:00.0: enabling device (0000 -> 0002)
[  170.205419][ T1952] 5.2.0-rc6-00074-g99d8de2-dirty #225 Tainted: G        W
[  170.205422][ T1952] ------------------------------------------------------
[  170.205425][ T1952] irq/251-pciehp/1952 is trying to acquire lock:
[  170.205428][ T1952] 00000000c71b3eb9
((work_completion)(&wfc.work)){+.+.}, at: __flush_work+0x68/0x120
[  170.205443][ T1952]
[  170.205443][ T1952] but task is already holding lock:
[  170.205445][ T1952] 00000000788e6539
(pci_rescan_remove_lock){+.+.}, at: pci_lock_rescan_remove+0x2c/0x40
[  170.205456][ T1952]
[  170.205456][ T1952] which lock already depends on the new lock.
[  170.205456][ T1952]
[  170.205458][ T1952]
[  170.205458][ T1952] the existing dependency chain (in reverse order) is:
[  170.205460][ T1952]
[  170.205460][ T1952] -> #2 (pci_rescan_remove_lock){+.+.}:
[  170.211487][ T1947] nvme 0020:08:00.0: enabling bus mastering
[  170.215287][ T1952]        validate_chain.isra.12+0x520/0x780
[  170.215292][ T1952]        __lock_acquire+0x358/0x7c0
[  170.215297][ T1952]        lock_acquire+0xd4/0x230
[  170.215303][ T1952]        __mutex_lock+0x8c/0xa30
[  170.215308][ T1952]        pci_lock_rescan_remove+0x2c/0x40
[  170.215313][ T1952]        pciehp_unconfigure_device+0x54/0x1a0
[  170.215318][ T1952]        remove_board+0x28/0x90
[  170.215329][ T1952]        __pciehp_disable_slot+0x64/0xb0
[  170.292060][ T1952]        pciehp_disable_slot+0x58/0xc0
[  170.295062][ T1952]        pciehp_handle_presence_or_link_change+0x94/0x2a0
[  170.298064][ T1952]        pciehp_ist+0x1fc/0x280
[  170.300988][ T1952]        irq_thread_fn+0x4c/0xa0
[  170.303878][ T1952]        irq_thread+0xdc/0x150
[  170.306740][ T1952]        kthread+0x15c/0x1a0
[  170.309584][ T1952]        ret_from_kernel_thread+0x5c/0x78
[  170.312445][ T1952]
[  170.312445][ T1952] -> #1 (&ctrl->reset_lock){.+.+}:
[  170.318025][ T1952]        validate_chain.isra.12+0x520/0x780
[  170.320822][ T1952]        __lock_acquire+0x358/0x7c0
[  170.323589][ T1952]        lock_acquire+0xd4/0x230
[  170.326324][ T1952]        down_read+0x44/0x130
[  170.329053][ T1952]        pciehp_check_presence+0x3c/0x110
[  170.331737][ T1952]        pciehp_probe+0x94/0x1c0
[  170.334374][ T1952]        pcie_port_probe_service+0x60/0xb0
[  170.337014][ T1952]        really_probe+0x2c4/0x500
[  170.339616][ T1952]        driver_probe_device+0x124/0x150
[  170.342197][ T1952]        bus_for_each_drv+0x80/0xe0
[  170.344742][ T1952]        __device_attach+0x130/0x200
[  170.347246][ T1952]        bus_probe_device+0xe4/0xf0
[  170.349730][ T1952]        device_add+0x294/0x4a0
[  170.352174][ T1952]        pcie_device_init+0x124/0x170
[  170.354644][ T1952]        pcie_port_device_register+0x130/0x160
[  170.357053][ T1952]        pcie_portdrv_probe+0x74/0x130
[  170.359421][ T1952]        local_pci_probe+0x6c/0x110
[  170.361756][ T1952]        work_for_cpu_fn+0x38/0x60
[  170.364059][ T1952]        process_one_work+0x21c/0x6a0
[  170.366312][ T1952]        process_scheduled_works+0x50/0x80
[  170.368514][ T1952]        worker_thread+0x15c/0x2d0
[  170.370708][ T1952]        kthread+0x15c/0x1a0
[  170.372831][ T1952]        ret_from_kernel_thread+0x5c/0x78
[  170.374913][ T1952]
[  170.374913][ T1952] -> #0 ((work_completion)(&wfc.work)){+.+.}:
[  170.379010][ T1952]        check_prevs_add+0x160/0x1a0
[  170.381126][ T1952]        validate_chain.isra.12+0x520/0x780
[  170.383246][ T1952]        __lock_acquire+0x358/0x7c0
[  170.385334][ T1952]        lock_acquire+0xd4/0x230
[  170.387410][ T1952]        __flush_work+0x8c/0x120
[  170.389457][ T1952]        work_on_cpu+0xb8/0xe0
[  170.391483][ T1952]        pci_call_probe+0x124/0x160
[  170.393508][ T1952]        pci_device_probe+0x68/0xd0
[  170.395530][ T1952]        really_probe+0x2c4/0x500
[  170.397615][ T1952]        driver_probe_device+0x124/0x150
[  170.399647][ T1952]        bus_for_each_drv+0x80/0xe0
[  170.401676][ T1952]        __device_attach+0x130/0x200
[  170.403703][ T1952]        pci_bus_add_device+0x78/0x100
[  170.405731][ T1952]        pci_bus_add_devices+0x60/0xd0
[  170.407734][ T1952]        pciehp_configure_device+0xf4/0x1f0
[  170.409737][ T1952]        board_added+0xa0/0x190
[  170.411712][ T1952]        __pciehp_enable_slot+0x64/0x100
[  170.413707][ T1952]        pciehp_enable_slot+0x4c/0x130
[  170.415718][ T1952]        pciehp_handle_presence_or_link_change+0x120/0x2a0
[  170.417783][ T1952]        pciehp_ist+0x1fc/0x280
[  170.419816][ T1952]        irq_thread_fn+0x4c/0xa0
[  170.421840][ T1952]        irq_thread+0xdc/0x150
[  170.423851][ T1952]        kthread+0x15c/0x1a0
[  170.425839][ T1952]        ret_from_kernel_thread+0x5c/0x78
[  170.427835][ T1952]
[  170.427835][ T1952] other info that might help us debug this:
[  170.427835][ T1952]
[  170.433742][ T1952] Chain exists of:
[  170.433742][ T1952]   (work_completion)(&wfc.work) -->
&ctrl->reset_lock --> pci_rescan_remove_lock
[  170.433742][ T1952]
[  170.439873][ T1952]  Possible unsafe locking scenario:
[  170.439873][ T1952]
[  170.444014][ T1952]        CPU0                    CPU1
[  170.446158][ T1952]        ----                    ----
[  170.448228][ T1952]   lock(pci_rescan_remove_lock);
[  170.450316][ T1952]                                lock(&ctrl->reset_lock);
[  170.452454][ T1952]
lock(pci_rescan_remove_lock);
[  170.454592][ T1952]   lock((work_completion)(&wfc.work));
[  170.456728][ T1952]
[  170.456728][ T1952]  *** DEADLOCK ***
[  170.456728][ T1952]
[  170.462943][ T1952] 3 locks held by irq/251-pciehp/1952:
[  170.465064][ T1952]  #0: 000000006b79c00e
(&ctrl->reset_lock){.+.+}, at: pciehp_ist+0xd0/0x280
[  170.467301][ T1952]  #1: 00000000788e6539
(pci_rescan_remove_lock){+.+.}, at: pci_lock_rescan_remove+0x2c/0x40
[  170.469621][ T1952]  #2: 00000000af82701a (&dev->mutex){....}, at:
__device_attach+0x44/0x200
[  170.471965][ T1952]
[  170.471965][ T1952] stack backtrace:
[  170.476487][ T1952] CPU: 54 PID: 1952 Comm: irq/251-pciehp Tainted:
G        W         5.2.0-rc6-00074-g99d8de2-dirty #225
[  170.480804][ T1952] Call Trace:
[  170.484966][ T1952] [c000000005493110] [c000000000e22dd8]
__dump_stack+0x2c/0x40 (unreliable)
[  170.493286][ T1952] [c000000005493130] [c000000000e22ec0]
dump_stack+0xd4/0x144
[  170.500222][ T1952] [c000000005493180] [c000000000195da0]
print_circular_bug+0x160/0x170
[  170.508537][ T1952] [c000000005493210] [c000000000197640]
check_prev_add+0x6f0/0xef0
[  170.515472][ T1952] [c000000005493300] [c000000000197fa0]
check_prevs_add+0x160/0x1a0
[  170.523793][ T1952] [c000000005493360] [c000000000198500]
validate_chain.isra.12+0x520/0x780
[  170.532113][ T1952] [c000000005493420] [c000000000199d58]
__lock_acquire+0x358/0x7c0
[  170.540427][ T1952] [c0000000054934f0] [c00000000019ab94]
lock_acquire+0xd4/0x230
[  170.547366][ T1952] [c0000000054935b0] [c000000000135a9c]
__flush_work+0x8c/0x120
[  170.554299][ T1952] [c000000005493690] [c0000000001376f8]
work_on_cpu+0xb8/0xe0
[  170.562616][ T1952] [c000000005493730] [c00000000074c514]
pci_call_probe+0x124/0x160
[  170.569555][ T1952] [c000000005493780] [c00000000074cfc8]
pci_device_probe+0x68/0xd0
[  170.577868][ T1952] [c0000000054937c0] [c00000000087a274]
really_probe+0x2c4/0x500
[  170.584806][ T1952] [c000000005493850] [c00000000087a964]
driver_probe_device+0x124/0x150
[  170.593123][ T1952] [c0000000054938c0] [c000000000877120]
bus_for_each_drv+0x80/0xe0
[  170.601442][ T1952] [c000000005493910] [c00000000087a6b0]
__device_attach+0x130/0x200
[  170.608375][ T1952] [c0000000054939a0] [c000000000739eb8]
pci_bus_add_device+0x78/0x100
[  170.616694][ T1952] [c000000005493a10] [c000000000739fa0]
pci_bus_add_devices+0x60/0xd0
[  170.625014][ T1952] [c000000005493a50] [c00000000076a094]
pciehp_configure_device+0xf4/0x1f0
[  170.633339][ T1952] [c000000005493ad0] [c0000000007691b0]
board_added+0xa0/0x190
[  170.640268][ T1952] [c000000005493b50] [c000000000769304]
__pciehp_enable_slot+0x64/0x100
[  170.648588][ T1952] [c000000005493bd0] [c0000000007693ec]
pciehp_enable_slot+0x4c/0x130
[  170.656907][ T1952] [c000000005493c10] [c000000000769a10]
pciehp_handle_presence_or_link_change+0x120/0x2a0
[  170.666610][ T1952] [c000000005493c90] [c00000000076baec]
pciehp_ist+0x1fc/0x280
[  170.673549][ T1952] [c000000005493d20] [c0000000001b015c]
irq_thread_fn+0x4c/0xa0
[  170.680482][ T1952] [c000000005493d60] [c0000000001b10cc]
irq_thread+0xdc/0x150
[  170.688796][ T1952] [c000000005493db0] [c000000000140b6c] kthread+0x15c/0x1a0
[  170.695731][ T1952] [c000000005493e20] [c00000000000bb84]
ret_from_kernel_thread+0x5c/0x78
[  174.904923][ T1947] nvme nvme1: Shutdown timeout set to 10 seconds
[  174.993798][ T1947] nvme nvme1: 128/0/0 default/read/poll queues

                 reply index

Thread overview: [no followups] expand[flat|nested]  mbox.gz  Atom feed

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAOSf1CGnYaGUTZQHbgy39dC47XNZRi4At2aTWRK0MeBvupKrxg@mail.gmail.com \
    --to=oohall@gmail.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-PCI Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-pci/0 linux-pci/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-pci linux-pci/ https://lore.kernel.org/linux-pci \
		linux-pci@vger.kernel.org
	public-inbox-index linux-pci

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-pci


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git