All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] PCI: Add a pci_dev_depth() helper function
@ 2021-11-29 12:19 Hans de Goede
  2021-11-29 12:19 ` [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors Hans de Goede
  0 siblings, 1 reply; 9+ messages in thread
From: Hans de Goede @ 2021-11-29 12:19 UTC (permalink / raw)
  To: Bjorn Helgaas, Lukas Wunner
  Cc: Hans de Goede, Theodore Ts'o, Andreas Noever, Michael Jamet,
	Mika Westerberg, Yehezkel Bernat, linux-pci

Add a pci_dev_depth() helper function, which returns the depth
of a device in the PCI hierarchy.

This is useful to have for lockdep annotations for dealing with
nested locked when traversing the hierarchy.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 include/linux/pci.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 7d825637d7ca..6ad78cc67aa8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -691,6 +691,26 @@ static inline bool pci_is_bridge(struct pci_dev *dev)
 		dev->hdr_type == PCI_HEADER_TYPE_CARDBUS;
 }
 
+/**
+ * pci_dev_depth - Return the depth of a device in the PCI hierarchy
+ * @dev: PCI device
+ *
+ * Return the depth (number of parent busses above) the device in
+ * the PCI hierarchy.
+ */
+static inline int pci_dev_depth(struct pci_dev *dev)
+{
+	struct pci_bus *bus = dev->bus;
+	int depth = 0;
+
+	while (bus->parent) {
+		depth++;
+		bus = bus->parent;
+	}
+
+	return depth;
+}
+
 #define for_each_pci_bridge(dev, bus)				\
 	list_for_each_entry(dev, &bus->devices, bus_list)	\
 		if (!pci_is_bridge(dev)) {} else
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors
  2021-11-29 12:19 [PATCH 1/2] PCI: Add a pci_dev_depth() helper function Hans de Goede
@ 2021-11-29 12:19 ` Hans de Goede
  2021-11-29 15:39   ` Lukas Wunner
                     ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Hans de Goede @ 2021-11-29 12:19 UTC (permalink / raw)
  To: Bjorn Helgaas, Lukas Wunner
  Cc: Hans de Goede, Theodore Ts'o, Andreas Noever, Michael Jamet,
	Mika Westerberg, Yehezkel Bernat, linux-pci

Use down_read_nested() and down_write_nested() when taking the
ctrl->reset_lock rw-sem, passing the PCI-device depth in the hierarchy
as lock subclass parameter. This fixes the following false-positive lockdep
report when unplugging a Lenovo X1C8 from a Lenovo 2nd gen TB3 dock:

[   28.583853] pcieport 0000:06:01.0: pciehp: Slot(1): Link Down
[   28.583891] pcieport 0000:06:01.0: pciehp: Slot(1): Card not present
[   28.583995] pcieport 0000:09:04.0: can't change power state from D3cold to D0 (config space inaccessible)

[   28.584849] ============================================
[   28.584854] WARNING: possible recursive locking detected
[   28.584858] 5.16.0-rc2+ #621 Not tainted
[   28.584864] --------------------------------------------
[   28.584867] irq/124-pciehp/86 is trying to acquire lock:
[   28.584873] ffff8e5ac4299ef8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_check_presence+0x23/0x80
[   28.584904]
               but task is already holding lock:
[   28.584908] ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
[   28.584929]
               other info that might help us debug this:
[   28.584933]  Possible unsafe locking scenario:

[   28.584936]        CPU0
[   28.584939]        ----
[   28.584942]   lock(&ctrl->reset_lock);
[   28.584949]   lock(&ctrl->reset_lock);
[   28.584955]
                *** DEADLOCK ***

[   28.584959]  May be due to missing lock nesting notation

[   28.584963] 3 locks held by irq/124-pciehp/86:
[   28.584970]  #0: ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
[   28.584991]  #1: ffffffffa3b024e8 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pciehp_unconfigure_device+0x31/0x110
[   28.585012]  #2: ffff8e5ac1ee2248 (&dev->mutex){....}-{3:3}, at: device_release_driver+0x1c/0x40
[   28.585037]
               stack backtrace:
[   28.585042] CPU: 4 PID: 86 Comm: irq/124-pciehp Not tainted 5.16.0-rc2+ #621
[   28.585052] Hardware name: LENOVO 20U90SIT19/20U90SIT19, BIOS N2WET30W (1.20 ) 08/26/2021
[   28.585059] Call Trace:
[   28.585064]  <TASK>
[   28.585073]  dump_stack_lvl+0x59/0x73
[   28.585087]  __lock_acquire.cold+0xc5/0x2c6
[   28.585106]  ? find_held_lock+0x2b/0x80
[   28.585124]  lock_acquire+0xb5/0x2b0
[   28.585132]  ? pciehp_check_presence+0x23/0x80
[   28.585144]  ? lock_is_held_type+0xa8/0x120
[   28.585161]  down_read+0x3e/0x50
[   28.585172]  ? pciehp_check_presence+0x23/0x80
[   28.585183]  pciehp_check_presence+0x23/0x80
[   28.585194]  pciehp_runtime_resume+0x5c/0xa0
[   28.585206]  ? pci_msix_init+0x60/0x60
[   28.585214]  device_for_each_child+0x45/0x70
[   28.585227]  pcie_port_device_runtime_resume+0x20/0x30
[   28.585236]  pci_pm_runtime_resume+0xa7/0xc0
[   28.585246]  ? pci_pm_freeze_noirq+0x100/0x100
[   28.585257]  __rpm_callback+0x41/0x110
[   28.585271]  ? pci_pm_freeze_noirq+0x100/0x100
[   28.585281]  rpm_callback+0x59/0x70
[   28.585293]  rpm_resume+0x512/0x7b0
[   28.585309]  __pm_runtime_resume+0x4a/0x90
[   28.585322]  __device_release_driver+0x28/0x240
[   28.585338]  device_release_driver+0x26/0x40
[   28.585351]  pci_stop_bus_device+0x68/0x90
[   28.585363]  pci_stop_bus_device+0x2c/0x90
[   28.585373]  pci_stop_and_remove_bus_device+0xe/0x20
[   28.585384]  pciehp_unconfigure_device+0x6c/0x110
[   28.585396]  ? __pm_runtime_resume+0x58/0x90
[   28.585409]  pciehp_disable_slot+0x5b/0xe0
[   28.585421]  pciehp_handle_presence_or_link_change+0xc3/0x2f0
[   28.585436]  pciehp_ist+0x179/0x180
[   28.585449]  ? disable_irq_nosync+0x10/0x10
[   28.585460]  irq_thread_fn+0x1d/0x60
[   28.585470]  ? irq_thread+0x81/0x1a0
[   28.585480]  irq_thread+0xcb/0x1a0
[   28.585491]  ? irq_thread_fn+0x60/0x60
[   28.585502]  ? irq_thread_check_affinity+0xb0/0xb0
[   28.585514]  kthread+0x165/0x190
[   28.585522]  ? set_kthread_struct+0x40/0x40
[   28.585531]  ret_from_fork+0x1f/0x30
[   28.585554]  </TASK>
[   28.586512] xhci_hcd 0000:0a:00.0: remove, state 1
[   28.586538] usb usb4: USB disconnect, device number 1
[   28.586547] usb 4-2: USB disconnect, device number 2
[   28.586555] usb 4-2.1: USB disconnect, device number 3
[   28.586561] usb 4-2.1.2: USB disconnect, device number 5
[   28.587709] xhci_hcd 0000:0a:00.0: xHCI host controller not responding, assume dead
[   28.590021] usb 4-2.3: USB disconnect, device number 4
[   28.613814] xhci_hcd 0000:0a:00.0: WARN Can't disable streams for endpoint 0x1, streams are being disabled already
[   28.616865] xhci_hcd 0000:0a:00.0: USB bus 4 deregistered
[   28.617082] xhci_hcd 0000:0a:00.0: remove, state 1
[   28.617089] usb usb3: USB disconnect, device number 1
[   28.617092] usb 3-2: USB disconnect, device number 2
[   28.617094] usb 3-2.1: USB disconnect, device number 3
[   28.617096] usb 3-2.1.1: USB disconnect, device number 6
[   28.617098] usb 3-2.1.1.4: USB disconnect, device number 8
[   28.645354] usb 3-2.1.3: USB disconnect, device number 7
[   28.645357] usb 3-2.1.3.1: USB disconnect, device number 10
[   28.647489] usb 3-2.1.3.4: USB disconnect, device number 11
[   28.647494] usb 3-2.1.3.4.1: USB disconnect, device number 13
[   28.760411] usb 3-2.1.4: USB disconnect, device number 9
[   28.760414] usb 3-2.1.4.1: USB disconnect, device number 12
[   28.795513] usb 3-2.4: USB disconnect, device number 4
[   28.821464] usb 3-2.5: USB disconnect, device number 5
[   28.822850] xhci_hcd 0000:0a:00.0: Host halt failed, -19
[   28.822854] xhci_hcd 0000:0a:00.0: Host not accessible, reset failed.
[   28.823331] xhci_hcd 0000:0a:00.0: USB bus 3 deregistered
[   28.823589] pci 0000:0a:00.0: Removing from iommu group 15
[   28.823605] pci_bus 0000:0a: busn_res: [bus 0a] is released
[   28.823660] pci 0000:09:02.0: Removing from iommu group 15
[   28.823729] pci_bus 0000:0b: busn_res: [bus 0b-2c] is released
[   28.823773] pci 0000:09:04.0: Removing from iommu group 15
[   28.823782] pci_bus 0000:09: busn_res: [bus 09-2c] is released
[   28.823851] pci 0000:08:00.0: Removing from iommu group 15

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 drivers/pci/hotplug/pciehp.h      | 1 +
 drivers/pci/hotplug/pciehp_core.c | 2 +-
 drivers/pci/hotplug/pciehp_hpc.c  | 7 ++++---
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h
index 918dccbc74b6..7de17f32bf8c 100644
--- a/drivers/pci/hotplug/pciehp.h
+++ b/drivers/pci/hotplug/pciehp.h
@@ -106,6 +106,7 @@ struct controller {
 
 	struct hotplug_slot hotplug_slot;	/* hotplug core interface */
 	struct rw_semaphore reset_lock;
+	unsigned int depth;
 	unsigned int ist_running;
 	int request_result;
 	wait_queue_head_t requester;
diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c
index f34114d45259..4042d87d539d 100644
--- a/drivers/pci/hotplug/pciehp_core.c
+++ b/drivers/pci/hotplug/pciehp_core.c
@@ -166,7 +166,7 @@ static void pciehp_check_presence(struct controller *ctrl)
 {
 	int occupied;
 
-	down_read(&ctrl->reset_lock);
+	down_read_nested(&ctrl->reset_lock, ctrl->depth);
 	mutex_lock(&ctrl->state_lock);
 
 	occupied = pciehp_card_present_or_link_active(ctrl);
diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index 83a0fa119cae..54643a95baad 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -583,7 +583,7 @@ static void pciehp_ignore_dpc_link_change(struct controller *ctrl,
 	 * the corresponding link change may have been ignored above.
 	 * Synthesize it to ensure that it is acted on.
 	 */
-	down_read(&ctrl->reset_lock);
+	down_read_nested(&ctrl->reset_lock, ctrl->depth);
 	if (!pciehp_check_link_active(ctrl))
 		pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC);
 	up_read(&ctrl->reset_lock);
@@ -746,7 +746,7 @@ static irqreturn_t pciehp_ist(int irq, void *dev_id)
 	 * Disable requests have higher priority than Presence Detect Changed
 	 * or Data Link Layer State Changed events.
 	 */
-	down_read(&ctrl->reset_lock);
+	down_read_nested(&ctrl->reset_lock, ctrl->depth);
 	if (events & DISABLE_SLOT)
 		pciehp_handle_disable_request(ctrl);
 	else if (events & (PCI_EXP_SLTSTA_PDC | PCI_EXP_SLTSTA_DLLSC))
@@ -906,7 +906,7 @@ int pciehp_reset_slot(struct hotplug_slot *hotplug_slot, bool probe)
 	if (probe)
 		return 0;
 
-	down_write(&ctrl->reset_lock);
+	down_write_nested(&ctrl->reset_lock, ctrl->depth);
 
 	if (!ATTN_BUTTN(ctrl)) {
 		ctrl_mask |= PCI_EXP_SLTCTL_PDCE;
@@ -975,6 +975,7 @@ struct controller *pcie_init(struct pcie_device *dev)
 		return NULL;
 
 	ctrl->pcie = dev;
+	ctrl->depth = pci_dev_depth(dev->port);
 	pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, &slot_cap);
 
 	if (pdev->hotplug_user_indicators)
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors
  2021-11-29 12:19 ` [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors Hans de Goede
@ 2021-11-29 15:39   ` Lukas Wunner
  2021-11-29 18:59     ` Lukas Wunner
  2021-11-29 15:45   ` Lukas Wunner
  2021-11-29 23:45   ` Bjorn Helgaas
  2 siblings, 1 reply; 9+ messages in thread
From: Lukas Wunner @ 2021-11-29 15:39 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Bjorn Helgaas, Theodore Ts'o, Andreas Noever, Michael Jamet,
	Mika Westerberg, Yehezkel Bernat, linux-pci

On Mon, Nov 29, 2021 at 01:19:34PM +0100, Hans de Goede wrote:
> Use down_read_nested() and down_write_nested() when taking the
> ctrl->reset_lock rw-sem, passing the PCI-device depth in the hierarchy
> as lock subclass parameter. This fixes the following false-positive lockdep
> report when unplugging a Lenovo X1C8 from a Lenovo 2nd gen TB3 dock:
[...]
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>

That's exactly what I had in mind, thanks.

Reported-by: "Theodore Ts'o" <tytso@mit.edu>
Link: https://lore.kernel.org/linux-pci/20190402021933.GA2966@mit.edu/
Link: https://lore.kernel.org/linux-pci/de684a28-9038-8fc6-27ca-3f6f2f6400d7@redhat.com/
Reviewed-by: Lukas Wunner <lukas@wunner.de>


> Note the 2nd patch can probably use a Fixes: tag but I had no
> idea which commit to put there. Or maybe add a Cc: stable to
> both patches?

I'd just add a stable designation and let the stable maintainers decide
which versions to backport to.  The problem I see is the dependency on
the first patch in the series.  In theory there's a syntax to specify
such prerequisites (see "Option 3" in
Documentation/process/stable-kernel-rules.rst), but in practice,
my experience is that stable maintainers may ignore such prerequisite tags.
It might be simplest to just squash the two patches together.

If you do respin, it would be good to explain in the commit message why
one lockdep class is used per hierarchy level:  This is done to conserve
class keys because their number is limited and the complexity grows
quadratically with number of keys according to
Documentation/locking/lockdep-design.rst.

It would also be good to explain why the lockdep splat occurs and why
it's a false positive:  With Thunderbolt, hotplug ports are nested.  When
removing multiple devices in a daisy-chain, each hotplug port's reset_lock
may be acquired recursively.  It's never the same lock, so the lockdep
splat is a false positive.  Because locks at the same hierarchy level
are never acquired recursively, a per-level lockdep class is sufficient
to fix the lockdep splat.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors
  2021-11-29 12:19 ` [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors Hans de Goede
  2021-11-29 15:39   ` Lukas Wunner
@ 2021-11-29 15:45   ` Lukas Wunner
  2021-11-29 23:45   ` Bjorn Helgaas
  2 siblings, 0 replies; 9+ messages in thread
From: Lukas Wunner @ 2021-11-29 15:45 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Bjorn Helgaas, Theodore Ts'o, Andreas Noever, Michael Jamet,
	Mika Westerberg, Yehezkel Bernat, linux-pci

On Mon, Nov 29, 2021 at 01:19:34PM +0100, Hans de Goede wrote:
> --- a/drivers/pci/hotplug/pciehp.h
> +++ b/drivers/pci/hotplug/pciehp.h
> @@ -106,6 +106,7 @@ struct controller {
>  
>  	struct hotplug_slot hotplug_slot;	/* hotplug core interface */
>  	struct rw_semaphore reset_lock;
> +	unsigned int depth;
>  	unsigned int ist_running;
>  	int request_result;
>  	wait_queue_head_t requester;

Could you amend the kernel-doc of the struct with a short explanation
of the attribute you're adding above?  Thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors
  2021-11-29 15:39   ` Lukas Wunner
@ 2021-11-29 18:59     ` Lukas Wunner
  2021-11-30 10:15       ` Hans de Goede
  0 siblings, 1 reply; 9+ messages in thread
From: Lukas Wunner @ 2021-11-29 18:59 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Bjorn Helgaas, Theodore Ts'o, Andreas Noever, Michael Jamet,
	Mika Westerberg, Yehezkel Bernat, linux-pci

On Mon, Nov 29, 2021 at 04:39:43PM +0100, Lukas Wunner wrote:
> On Mon, Nov 29, 2021 at 01:19:34PM +0100, Hans de Goede wrote:
> > Use down_read_nested() and down_write_nested() when taking the
> > ctrl->reset_lock rw-sem, passing the PCI-device depth in the hierarchy
> > as lock subclass parameter. This fixes the following false-positive lockdep
> > report when unplugging a Lenovo X1C8 from a Lenovo 2nd gen TB3 dock:
> [...]
> > Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> 
> That's exactly what I had in mind, thanks.

Hm, I found the notes that I took when I investigated Theodore's report:
Using a subclass may be problematic because it's limited to a value < 8
(MAX_LOCKDEP_SUBCLASSES).  If there's a hotplug port at a deeper level
than 8 in the PCI hierarchy (can easily happen, Thunderbolt daisy chain
allows up to 6 devices, each device contains a PCIe switch, so 2 levels per
device), look_up_lock_class() in kernel/locking/lockdep.c will emit a BUG
error message.

It may be necessary to call lockdep_register_key() for each level or
for each hotplug port and assign the lock with lockdep_set_class()
(or ..._and_name() and use the dev_name()).

It's these complications that made me put aside the problem back in the day.
My apologies for not remembering them earlier.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors
  2021-11-29 12:19 ` [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors Hans de Goede
  2021-11-29 15:39   ` Lukas Wunner
  2021-11-29 15:45   ` Lukas Wunner
@ 2021-11-29 23:45   ` Bjorn Helgaas
  2 siblings, 0 replies; 9+ messages in thread
From: Bjorn Helgaas @ 2021-11-29 23:45 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Bjorn Helgaas, Lukas Wunner, Theodore Ts'o, Andreas Noever,
	Michael Jamet, Mika Westerberg, Yehezkel Bernat, linux-pci

On Mon, Nov 29, 2021 at 01:19:34PM +0100, Hans de Goede wrote:
> Use down_read_nested() and down_write_nested() when taking the
> ctrl->reset_lock rw-sem, passing the PCI-device depth in the hierarchy
> as lock subclass parameter. This fixes the following false-positive lockdep
> report when unplugging a Lenovo X1C8 from a Lenovo 2nd gen TB3 dock:
> 
> [   28.583853] pcieport 0000:06:01.0: pciehp: Slot(1): Link Down
> [   28.583891] pcieport 0000:06:01.0: pciehp: Slot(1): Card not present
> [   28.583995] pcieport 0000:09:04.0: can't change power state from D3cold to D0 (config space inaccessible)
> 
> [   28.584849] ============================================
> [   28.584854] WARNING: possible recursive locking detected
> [   28.584858] 5.16.0-rc2+ #621 Not tainted
> [   28.584864] --------------------------------------------
> [   28.584867] irq/124-pciehp/86 is trying to acquire lock:
> [   28.584873] ffff8e5ac4299ef8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_check_presence+0x23/0x80
> [   28.584904]
>                but task is already holding lock:
> [   28.584908] ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
> [   28.584929]
>                other info that might help us debug this:
> [   28.584933]  Possible unsafe locking scenario:
> 
> [   28.584936]        CPU0
> [   28.584939]        ----
> [   28.584942]   lock(&ctrl->reset_lock);
> [   28.584949]   lock(&ctrl->reset_lock);
> [   28.584955]
>                 *** DEADLOCK ***
> 
> [   28.584959]  May be due to missing lock nesting notation
> 
> [   28.584963] 3 locks held by irq/124-pciehp/86:
> [   28.584970]  #0: ffff8e5ac4298af8 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xf3/0x180
> [   28.584991]  #1: ffffffffa3b024e8 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pciehp_unconfigure_device+0x31/0x110
> [   28.585012]  #2: ffff8e5ac1ee2248 (&dev->mutex){....}-{3:3}, at: device_release_driver+0x1c/0x40
> [   28.585037]
>                stack backtrace:
> [   28.585042] CPU: 4 PID: 86 Comm: irq/124-pciehp Not tainted 5.16.0-rc2+ #621
> [   28.585052] Hardware name: LENOVO 20U90SIT19/20U90SIT19, BIOS N2WET30W (1.20 ) 08/26/2021
> [   28.585059] Call Trace:
> [   28.585064]  <TASK>
> [   28.585073]  dump_stack_lvl+0x59/0x73
> [   28.585087]  __lock_acquire.cold+0xc5/0x2c6
> [   28.585106]  ? find_held_lock+0x2b/0x80
> [   28.585124]  lock_acquire+0xb5/0x2b0
> [   28.585132]  ? pciehp_check_presence+0x23/0x80
> [   28.585144]  ? lock_is_held_type+0xa8/0x120
> [   28.585161]  down_read+0x3e/0x50
> [   28.585172]  ? pciehp_check_presence+0x23/0x80
> [   28.585183]  pciehp_check_presence+0x23/0x80
> [   28.585194]  pciehp_runtime_resume+0x5c/0xa0
> [   28.585206]  ? pci_msix_init+0x60/0x60
> [   28.585214]  device_for_each_child+0x45/0x70
> [   28.585227]  pcie_port_device_runtime_resume+0x20/0x30
> [   28.585236]  pci_pm_runtime_resume+0xa7/0xc0
> [   28.585246]  ? pci_pm_freeze_noirq+0x100/0x100
> [   28.585257]  __rpm_callback+0x41/0x110
> [   28.585271]  ? pci_pm_freeze_noirq+0x100/0x100
> [   28.585281]  rpm_callback+0x59/0x70
> [   28.585293]  rpm_resume+0x512/0x7b0
> [   28.585309]  __pm_runtime_resume+0x4a/0x90
> [   28.585322]  __device_release_driver+0x28/0x240
> [   28.585338]  device_release_driver+0x26/0x40
> [   28.585351]  pci_stop_bus_device+0x68/0x90
> [   28.585363]  pci_stop_bus_device+0x2c/0x90
> [   28.585373]  pci_stop_and_remove_bus_device+0xe/0x20
> [   28.585384]  pciehp_unconfigure_device+0x6c/0x110
> [   28.585396]  ? __pm_runtime_resume+0x58/0x90
> [   28.585409]  pciehp_disable_slot+0x5b/0xe0
> [   28.585421]  pciehp_handle_presence_or_link_change+0xc3/0x2f0
> [   28.585436]  pciehp_ist+0x179/0x180
> [   28.585449]  ? disable_irq_nosync+0x10/0x10
> [   28.585460]  irq_thread_fn+0x1d/0x60
> [   28.585470]  ? irq_thread+0x81/0x1a0
> [   28.585480]  irq_thread+0xcb/0x1a0
> [   28.585491]  ? irq_thread_fn+0x60/0x60
> [   28.585502]  ? irq_thread_check_affinity+0xb0/0xb0
> [   28.585514]  kthread+0x165/0x190
> [   28.585522]  ? set_kthread_struct+0x40/0x40
> [   28.585531]  ret_from_fork+0x1f/0x30
> [   28.585554]  </TASK>
> [   28.586512] xhci_hcd 0000:0a:00.0: remove, state 1
> [   28.586538] usb usb4: USB disconnect, device number 1
> [   28.586547] usb 4-2: USB disconnect, device number 2
> [   28.586555] usb 4-2.1: USB disconnect, device number 3
> [   28.586561] usb 4-2.1.2: USB disconnect, device number 5
> [   28.587709] xhci_hcd 0000:0a:00.0: xHCI host controller not responding, assume dead
> [   28.590021] usb 4-2.3: USB disconnect, device number 4
> [   28.613814] xhci_hcd 0000:0a:00.0: WARN Can't disable streams for endpoint 0x1, streams are being disabled already
> [   28.616865] xhci_hcd 0000:0a:00.0: USB bus 4 deregistered
> [   28.617082] xhci_hcd 0000:0a:00.0: remove, state 1
> [   28.617089] usb usb3: USB disconnect, device number 1
> [   28.617092] usb 3-2: USB disconnect, device number 2
> [   28.617094] usb 3-2.1: USB disconnect, device number 3
> [   28.617096] usb 3-2.1.1: USB disconnect, device number 6
> [   28.617098] usb 3-2.1.1.4: USB disconnect, device number 8
> [   28.645354] usb 3-2.1.3: USB disconnect, device number 7
> [   28.645357] usb 3-2.1.3.1: USB disconnect, device number 10
> [   28.647489] usb 3-2.1.3.4: USB disconnect, device number 11
> [   28.647494] usb 3-2.1.3.4.1: USB disconnect, device number 13
> [   28.760411] usb 3-2.1.4: USB disconnect, device number 9
> [   28.760414] usb 3-2.1.4.1: USB disconnect, device number 12
> [   28.795513] usb 3-2.4: USB disconnect, device number 4
> [   28.821464] usb 3-2.5: USB disconnect, device number 5
> [   28.822850] xhci_hcd 0000:0a:00.0: Host halt failed, -19
> [   28.822854] xhci_hcd 0000:0a:00.0: Host not accessible, reset failed.
> [   28.823331] xhci_hcd 0000:0a:00.0: USB bus 3 deregistered
> [   28.823589] pci 0000:0a:00.0: Removing from iommu group 15
> [   28.823605] pci_bus 0000:0a: busn_res: [bus 0a] is released
> [   28.823660] pci 0000:09:02.0: Removing from iommu group 15
> [   28.823729] pci_bus 0000:0b: busn_res: [bus 0b-2c] is released
> [   28.823773] pci 0000:09:04.0: Removing from iommu group 15
> [   28.823782] pci_bus 0000:09: busn_res: [bus 09-2c] is released
> [   28.823851] pci 0000:08:00.0: Removing from iommu group 15

I think you're planning a v2 for Lukas comments.  When you do, can you
strip out the timestamps above, since they don't contribute to
understanding the problem.  It would also be helpful if you could
remove the irrelevant items from the stack trace and dmesg, e.g.,
"Can't disable streams", "busn_res", "Removing from iommu group",
"Host halt failed", maybe "USB disconnect", etc.  We mainly want to
see the call paths that cause the lockdep warning, and just enough of
the dmesg to help someone match their symptom to this fix.

Bjorn

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors
  2021-11-29 18:59     ` Lukas Wunner
@ 2021-11-30 10:15       ` Hans de Goede
  2021-11-30 19:39         ` Lukas Wunner
  0 siblings, 1 reply; 9+ messages in thread
From: Hans de Goede @ 2021-11-30 10:15 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Theodore Ts'o, Andreas Noever, Michael Jamet,
	Mika Westerberg, Yehezkel Bernat, linux-pci

Hi Lukas,

First of all thank you for the review and for all the remarks,
I agree with all of them, so I'll incorporate all of them in v2
of this patch-set including squashing the 2 patches together.

On 11/29/21 19:59, Lukas Wunner wrote:
> On Mon, Nov 29, 2021 at 04:39:43PM +0100, Lukas Wunner wrote:
>> On Mon, Nov 29, 2021 at 01:19:34PM +0100, Hans de Goede wrote:
>>> Use down_read_nested() and down_write_nested() when taking the
>>> ctrl->reset_lock rw-sem, passing the PCI-device depth in the hierarchy
>>> as lock subclass parameter. This fixes the following false-positive lockdep
>>> report when unplugging a Lenovo X1C8 from a Lenovo 2nd gen TB3 dock:
>> [...]
>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>
>> That's exactly what I had in mind, thanks.
> 
> Hm, I found the notes that I took when I investigated Theodore's report:
> Using a subclass may be problematic because it's limited to a value < 8
> (MAX_LOCKDEP_SUBCLASSES).  If there's a hotplug port at a deeper level
> than 8 in the PCI hierarchy (can easily happen, Thunderbolt daisy chain
> allows up to 6 devices, each device contains a PCIe switch, so 2 levels per
> device), look_up_lock_class() in kernel/locking/lockdep.c will emit a BUG
> error message.
> 
> It may be necessary to call lockdep_register_key() for each level or
> for each hotplug port and assign the lock with lockdep_set_class()
> (or ..._and_name() and use the dev_name()).
> 
> It's these complications that made me put aside the problem back in the day.
> My apologies for not remembering them earlier.

No worries.

I've been looking at how to rework things and this should be pretty doable.

My plan is to have a global static array of lock_class_key-s in
drivers/pci/hotplug/pciehp_hpc.c, one per depth level (say 32 of them ?)
and then use lockdep_set_class() and that looks like it should be pretty
doable.

When using a static global array of lock_class_key-s there is no need to call
lockdep_register_key().

Also see:

drivers/usb/storage/usb.c

Which does something similar to avoid issues when a single usb-device has
multiple USB-storage interfaces.

Regards,

Hans



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors
  2021-11-30 10:15       ` Hans de Goede
@ 2021-11-30 19:39         ` Lukas Wunner
  2021-12-02 11:52           ` Hans de Goede
  0 siblings, 1 reply; 9+ messages in thread
From: Lukas Wunner @ 2021-11-30 19:39 UTC (permalink / raw)
  To: Hans de Goede
  Cc: Bjorn Helgaas, Theodore Ts'o, Andreas Noever, Michael Jamet,
	Mika Westerberg, Yehezkel Bernat, linux-pci

On Tue, Nov 30, 2021 at 11:15:40AM +0100, Hans de Goede wrote:
> On 11/29/21 19:59, Lukas Wunner wrote:
> > Hm, I found the notes that I took when I investigated Theodore's report:
> > Using a subclass may be problematic because it's limited to a value < 8
> > (MAX_LOCKDEP_SUBCLASSES).  If there's a hotplug port at a deeper level
> > than 8 in the PCI hierarchy (can easily happen, Thunderbolt daisy chain
> > allows up to 6 devices, each device contains a PCIe switch, so 2 levels per
> > device), look_up_lock_class() in kernel/locking/lockdep.c will emit a BUG
> > error message.

Actually, after thinking about this some more it has occurred to me
that you should be able to make do with 8 subclasses if you change
the algorithm in patch [1/2] to something like this:

	while (bus->parent) {
-		depth++;
		bus = bus->parent;
+		if (bus->self->is_hotplug_bridge)
+			depth++;
	}

(It may be necessary to add a NULL pointer check for bus->self,
off the top of my hat I'm not sure if that's set for the root bus.)

Because with the patches as they are, the array of 8 subclasses is
sparsely populated, i.e. if a parent port is not a hotplug port,
a subclass is wasted.  You only care about hotplug ports (more
specifically those handled by pciehp) because those are the only
ones with a reset_lock.  The above change should result in a
densely populated array of subclasses.

Naturally, because this makes the function pciehp-specific,
it should be moved from include/linux/pci.h to pciehp_hpc.c.

With Thunderbolt you can have 6 devices plus the hotplug port in
the host controller.  And there may be an 8th hotplug port at the
Root Port if the host controller can be "unplugged" when it goes
to D3cold.  (That's handled by acpiphp, not pciehp, and I'm not
even sure if is_hotplug_bridge is set in that case.)
So 8 subclasses should be sufficient.

Aside from Thunderbolt, nested hotplug ports exist in data centers
with hot-removable NVMe slots in hot-removable chassis, but I highly
doubt those use cases have more than 8 levels of hotplug ports.

So that would probably be a pragmatic and minimalistic approach to this
problem.

Thanks,

Lukas

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors
  2021-11-30 19:39         ` Lukas Wunner
@ 2021-12-02 11:52           ` Hans de Goede
  0 siblings, 0 replies; 9+ messages in thread
From: Hans de Goede @ 2021-12-02 11:52 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Bjorn Helgaas, Theodore Ts'o, Andreas Noever, Michael Jamet,
	Mika Westerberg, Yehezkel Bernat, linux-pci

Hi,

On 11/30/21 20:39, Lukas Wunner wrote:
> On Tue, Nov 30, 2021 at 11:15:40AM +0100, Hans de Goede wrote:
>> On 11/29/21 19:59, Lukas Wunner wrote:
>>> Hm, I found the notes that I took when I investigated Theodore's report:
>>> Using a subclass may be problematic because it's limited to a value < 8
>>> (MAX_LOCKDEP_SUBCLASSES).  If there's a hotplug port at a deeper level
>>> than 8 in the PCI hierarchy (can easily happen, Thunderbolt daisy chain
>>> allows up to 6 devices, each device contains a PCIe switch, so 2 levels per
>>> device), look_up_lock_class() in kernel/locking/lockdep.c will emit a BUG
>>> error message.
> 
> Actually, after thinking about this some more it has occurred to me
> that you should be able to make do with 8 subclasses if you change
> the algorithm in patch [1/2] to something like this:
> 
> 	while (bus->parent) {
> -		depth++;
> 		bus = bus->parent;
> +		if (bus->self->is_hotplug_bridge)
> +			depth++;
> 	}
> 
> (It may be necessary to add a NULL pointer check for bus->self,
> off the top of my hat I'm not sure if that's set for the root bus.)
> 
> Because with the patches as they are, the array of 8 subclasses is
> sparsely populated, i.e. if a parent port is not a hotplug port,
> a subclass is wasted.  You only care about hotplug ports (more
> specifically those handled by pciehp) because those are the only
> ones with a reset_lock.  The above change should result in a
> densely populated array of subclasses.
> 
> Naturally, because this makes the function pciehp-specific,
> it should be moved from include/linux/pci.h to pciehp_hpc.c.
> 
> With Thunderbolt you can have 6 devices plus the hotplug port in
> the host controller.  And there may be an 8th hotplug port at the
> Root Port if the host controller can be "unplugged" when it goes
> to D3cold.  (That's handled by acpiphp, not pciehp, and I'm not
> even sure if is_hotplug_bridge is set in that case.)
> So 8 subclasses should be sufficient.
> 
> Aside from Thunderbolt, nested hotplug ports exist in data centers
> with hot-removable NVMe slots in hot-removable chassis, but I highly
> doubt those use cases have more than 8 levels of hotplug ports.
> 
> So that would probably be a pragmatic and minimalistic approach to this
> problem.

Ack. I've prepared and tested a v2 patch using this approach, this
v2 also addresses all your and Bjorn's other remarks.

I will send out the v2 patch right after this email.

Regards,

Hans


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-12-02 11:52 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-29 12:19 [PATCH 1/2] PCI: Add a pci_dev_depth() helper function Hans de Goede
2021-11-29 12:19 ` [PATCH 2/2] PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors Hans de Goede
2021-11-29 15:39   ` Lukas Wunner
2021-11-29 18:59     ` Lukas Wunner
2021-11-30 10:15       ` Hans de Goede
2021-11-30 19:39         ` Lukas Wunner
2021-12-02 11:52           ` Hans de Goede
2021-11-29 15:45   ` Lukas Wunner
2021-11-29 23:45   ` Bjorn Helgaas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.