* [perf] lockdep warning between cpu_add_remove_lock and &dev->mutex.
@ 2023-01-23 10:39 Tetsuo Handa
  2023-01-23 11:41 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Tetsuo Handa @ 2023-01-23 10:39 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim
  Cc: linux-perf-users, LKML

Hello.

I tried to apply below patch, and hit lockdep warning during boot.
Can you break this dependency?

----------
From f7ff56455ae7813768c6ab85e8e3db374122f32b Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Date: Mon, 23 Jan 2023 19:32:26 +0900
Subject: [PATCH] drivers/core: Remove lockdep_set_novalidate_class() usage

This patch experimentally removes the lockdep_set_novalidate_class() call
from device_initialize(), introduced by commit 1704f47b50b5 ("lockdep:
Add novalidate class for dev->mutex conversion"), because that commit made
it impossible to find real deadlocks unless timing-dependent testing
happens to trigger a hung task, as in [1] and [2]. Let's see whether we can
find the remaining drivers that need separate lock classes without causing
so many crashes that testing cannot continue.

[1] https://syzkaller.appspot.com/bug?extid=2d6ac90723742279e101
[2] https://syzkaller.appspot.com/bug?extid=2e39bc6569d281acbcfb

Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
 drivers/base/core.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index a3e14143ec0c..68189722e343 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2941,7 +2941,6 @@ void device_initialize(struct device *dev)
 	kobject_init(&dev->kobj, &device_ktype);
 	INIT_LIST_HEAD(&dev->dma_pools);
 	mutex_init(&dev->mutex);
-	lockdep_set_novalidate_class(&dev->mutex);
 	spin_lock_init(&dev->devres_lock);
 	INIT_LIST_HEAD(&dev->devres_head);
 	device_pm_init(dev);
-- 
2.18.4
----------

----------
[    2.241650][    T9] Trying to unpack rootfs image as initramfs...
[    2.241630][    T1] software IO TLB: mapped [mem 0x00000000bbed0000-0x00000000bfed0000] (64MB)
[    2.241670][    T1] workingset: timestamp_bits=46 max_order=21 bucket_order=0
[    2.241670][    T1] SGI XFS with ACLs, security attributes, verbose warnings, quota, no debug enabled
[    2.241670][    T1] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[    2.798150][    T1] 
[    2.798660][    T1] ======================================================
[    2.798660][    T1] WARNING: possible circular locking dependency detected
[    2.798660][    T1] 6.2.0-rc5+ #9 Not tainted
[    2.798660][    T1] ------------------------------------------------------
[    2.798660][    T1] swapper/0/1 is trying to acquire lock:
[    2.798660][    T1] ffffffffb002e888 (cpu_add_remove_lock){+.+.}-{3:3}, at: cpu_hotplug_disable+0x12/0x30
[    2.798660][    T1] 
[    2.798660][    T1] but task is already holding lock:
[    2.798660][    T1] ffff941940a161b8 (&dev->mutex){+.+.}-{3:3}, at: __device_driver_lock+0x28/0x40
[    2.798660][    T1] 
[    2.798660][    T1] which lock already depends on the new lock.
[    2.798660][    T1] 
[    2.798660][    T1] 
[    2.798660][    T1] the existing dependency chain (in reverse order) is:
[    2.798660][    T1] 
[    2.798660][    T1] -> #3 (&dev->mutex){+.+.}-{3:3}:
[    2.798660][    T1]        lock_acquire+0xc7/0x2e0
[    2.798660][    T1]        __mutex_lock+0x99/0xf00
[    2.798660][    T1]        mutex_lock_nested+0x16/0x20
[    2.798660][    T1]        __device_attach+0x35/0x1a0
[    2.798660][    T1]        device_initial_probe+0xe/0x10
[    2.798660][    T1]        bus_probe_device+0x9b/0xb0
[    2.798660][    T1]        device_add+0x3e1/0x900
[    2.798660][    T1]        pmu_dev_alloc+0x98/0xf0
[    2.798660][    T1]        perf_event_sysfs_init+0x56/0x8f
[    2.798660][    T1]        do_one_initcall+0x58/0x300
[    2.798660][    T1]        kernel_init_freeable+0x181/0x1d2
[    2.798660][    T1]        kernel_init+0x15/0x120
[    2.798660][    T1]        ret_from_fork+0x1f/0x30
[    2.798660][    T1] 
[    2.798660][    T1] -> #2 (pmus_lock){+.+.}-{3:3}:
[    2.798660][    T1]        lock_acquire+0xc7/0x2e0
[    2.798660][    T1]        __mutex_lock+0x99/0xf00
[    2.798660][    T1]        mutex_lock_nested+0x16/0x20
[    2.798660][    T1]        perf_event_init_cpu+0x4c/0x110
[    2.798660][    T1]        cpuhp_invoke_callback+0x17a/0x880
[    2.798660][    T1]        __cpuhp_invoke_callback_range+0x77/0xb0
[    2.798660][    T1]        _cpu_up+0xdc/0x240
[    2.798660][    T1]        cpu_up+0x8c/0xa0
[    2.798660][    T1]        bringup_nonboot_cpus+0x56/0x60
[    2.798660][    T1]        smp_init+0x25/0x5f
[    2.798660][    T1]        kernel_init_freeable+0xb4/0x1d2
[    2.798660][    T1]        kernel_init+0x15/0x120
[    2.798660][    T1]        ret_from_fork+0x1f/0x30
[    2.798660][    T1] 
[    2.798660][    T1] -> #1 (cpu_hotplug_lock){++++}-{0:0}:
[    2.798660][    T1]        lock_acquire+0xc7/0x2e0
[    2.798660][    T1]        percpu_down_write+0x44/0x2c0
[    2.798660][    T1]        _cpu_up+0x35/0x240
[    2.798660][    T1]        cpu_up+0x8c/0xa0
[    2.798660][    T1]        bringup_nonboot_cpus+0x56/0x60
[    2.798660][    T1]        smp_init+0x25/0x5f
[    2.798660][    T1]        kernel_init_freeable+0xb4/0x1d2
[    2.798660][    T1]        kernel_init+0x15/0x120
[    2.798660][    T1]        ret_from_fork+0x1f/0x30
[    2.798660][    T1] 
[    2.798660][    T1] -> #0 (cpu_add_remove_lock){+.+.}-{3:3}:
[    2.798660][    T1]        check_prevs_add+0x16a/0x1070
[    2.798660][    T1]        __lock_acquire+0x11bd/0x1670
[    2.798660][    T1]        lock_acquire+0xc7/0x2e0
[    2.798660][    T1]        __mutex_lock+0x99/0xf00
[    2.798660][    T1]        mutex_lock_nested+0x16/0x20
[    2.798660][    T1]        cpu_hotplug_disable+0x12/0x30
[    2.798660][    T1]        pci_device_probe+0x8c/0x150
[    2.798660][    T1]        really_probe+0xd9/0x340
[    2.798660][    T1]        __driver_probe_device+0x78/0x170
[    2.798660][    T1]        driver_probe_device+0x1f/0x90
[    2.798660][    T1]        __driver_attach+0xaa/0x160
[    2.798660][    T1]        bus_for_each_dev+0x75/0xb0
[    2.798660][    T1]        driver_attach+0x19/0x20
[    2.798660][    T1]        bus_add_driver+0x1be/0x210
[    2.798660][    T1]        driver_register+0x6b/0xc0
[    2.798660][    T1]        __pci_register_driver+0x7c/0x80
[    2.798660][    T1]        pcie_portdrv_init+0x3d/0x45
[    2.798660][    T1]        do_one_initcall+0x58/0x300
[    2.798660][    T1]        kernel_init_freeable+0x181/0x1d2
[    2.798660][    T1]        kernel_init+0x15/0x120
[    2.798660][    T1]        ret_from_fork+0x1f/0x30
[    2.798660][    T1] 
[    2.798660][    T1] other info that might help us debug this:
[    2.798660][    T1] 
[    2.798660][    T1] Chain exists of:
[    2.798660][    T1]   cpu_add_remove_lock --> pmus_lock --> &dev->mutex
[    2.798660][    T1] 
[    2.798660][    T1]  Possible unsafe locking scenario:
[    2.798660][    T1] 
[    2.798660][    T1]        CPU0                    CPU1
[    2.798660][    T1]        ----                    ----
[    2.798660][    T1]   lock(&dev->mutex);
[    2.798660][    T1]                                lock(pmus_lock);
[    2.798660][    T1]                                lock(&dev->mutex);
[    2.798660][    T1]   lock(cpu_add_remove_lock);
[    2.798660][    T1] 
[    2.798660][    T1]  *** DEADLOCK ***
[    2.798660][    T1] 
[    2.798660][    T1] 1 lock held by swapper/0/1:
[    2.798660][    T1]  #0: ffff941940a161b8 (&dev->mutex){+.+.}-{3:3}, at: __device_driver_lock+0x28/0x40
[    2.798660][    T1] 
[    2.798660][    T1] stack backtrace:
[    2.798660][    T1] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc5+ #9
[    2.798660][    T1] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[    2.798660][    T1] Call Trace:
[    2.798660][    T1]  <TASK>
[    2.798660][    T1]  dump_stack_lvl+0x49/0x5e
[    2.798660][    T1]  dump_stack+0x10/0x12
[    2.798660][    T1]  print_circular_bug.isra.46.cold.66+0x13e/0x143
[    2.798660][    T1]  check_noncircular+0xfe/0x110
[    2.798660][    T1]  check_prevs_add+0x16a/0x1070
[    2.798660][    T1]  __lock_acquire+0x11bd/0x1670
[    2.798660][    T1]  lock_acquire+0xc7/0x2e0
[    2.798660][    T1]  ? cpu_hotplug_disable+0x12/0x30
[    2.798660][    T1]  __mutex_lock+0x99/0xf00
[    2.798660][    T1]  ? cpu_hotplug_disable+0x12/0x30
[    2.798660][    T1]  ? pci_match_device+0xd5/0x130
[    2.798660][    T1]  ? __this_cpu_preempt_check+0x13/0x20
[    2.798660][    T1]  ? cpu_hotplug_disable+0x12/0x30
[    2.798660][    T1]  ? kernfs_add_one+0xf1/0x130
[    2.798660][    T1]  mutex_lock_nested+0x16/0x20
[    2.798660][    T1]  ? mutex_lock_nested+0x16/0x20
[    2.798660][    T1]  cpu_hotplug_disable+0x12/0x30
[    2.798660][    T1]  pci_device_probe+0x8c/0x150
[    2.798660][    T1]  really_probe+0xd9/0x340
[    2.798660][    T1]  ? pm_runtime_barrier+0x52/0xb0
[    2.798660][    T1]  __driver_probe_device+0x78/0x170
[    2.798660][    T1]  driver_probe_device+0x1f/0x90
[    2.798660][    T1]  __driver_attach+0xaa/0x160
[    2.798660][    T1]  ? __device_attach_driver+0x100/0x100
[    2.798660][    T1]  bus_for_each_dev+0x75/0xb0
[    2.798660][    T1]  driver_attach+0x19/0x20
[    2.798660][    T1]  bus_add_driver+0x1be/0x210
[    2.798660][    T1]  ? dmi_pcie_pme_disable_msi+0x1f/0x1f
[    2.798660][    T1]  ? dmi_pcie_pme_disable_msi+0x1f/0x1f
[    2.798660][    T1]  ? rdinit_setup+0x27/0x27
[    2.798660][    T1]  driver_register+0x6b/0xc0
[    2.798660][    T1]  ? dmi_pcie_pme_disable_msi+0x1f/0x1f
[    2.798660][    T1]  __pci_register_driver+0x7c/0x80
[    2.798660][    T1]  pcie_portdrv_init+0x3d/0x45
[    2.798660][    T1]  do_one_initcall+0x58/0x300
[    2.798660][    T1]  ? rdinit_setup+0x27/0x27
[    2.798660][    T1]  ? rcu_read_lock_sched_held+0x4a/0x70
[    2.798660][    T1]  kernel_init_freeable+0x181/0x1d2
[    2.798660][    T1]  ? rest_init+0x190/0x190
[    2.798660][    T1]  kernel_init+0x15/0x120
[    2.798660][    T1]  ret_from_fork+0x1f/0x30
[    2.798660][    T1]  </TASK>
[    3.991673][   T92] tsc: Refined TSC clocksource calibration: 2611.210 MHz
[    3.991673][   T92] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x25a399d04c4, max_idle_ns: 440795206293 ns
[    4.992946][   T92] clocksource: Switched to clocksource tsc
----------


* Re: [perf] lockdep warning between cpu_add_remove_lock and &dev->mutex.
  2023-01-23 10:39 [perf] lockdep warning between cpu_add_remove_lock and &dev->mutex Tetsuo Handa
@ 2023-01-23 11:41 ` Peter Zijlstra
  2023-01-23 14:10   ` Tetsuo Handa
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2023-01-23 11:41 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-perf-users,
	LKML

On Mon, Jan 23, 2023 at 07:39:24PM +0900, Tetsuo Handa wrote:
> Hello.
> 
> I tried to apply below patch, and hit lockdep warning during boot.
> Can you break this dependency?

  cpu_add_remove_lock
    cpu_hotplug_lock
      pmus_lock
        dev->mutex		(pmu_dev_alloc)

vs

  dev->mutex
    cpu_add_remove_lock		(pci_device_probe)


Possibly something like this might do -- I'm not entirely sure it's
fully correct, needs a bit of auditing.

---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index eacc3702654d..d6b2265a9982 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -13570,9 +13570,9 @@ static void perf_event_exit_cpu_context(int cpu)
 {
 	struct perf_cpu_context *cpuctx;
 	struct perf_event_context *ctx;
+	int idx = srcu_read_lock(&pmus_srcu);
 
 	// XXX simplify cpuctx->online
-	mutex_lock(&pmus_lock);
 	cpuctx = per_cpu_ptr(&perf_cpu_context, cpu);
 	ctx = &cpuctx->ctx;
 
@@ -13581,7 +13581,7 @@ static void perf_event_exit_cpu_context(int cpu)
 	cpuctx->online = 0;
 	mutex_unlock(&ctx->mutex);
 	cpumask_clear_cpu(cpu, perf_online_mask);
-	mutex_unlock(&pmus_lock);
+	srcu_read_unlock(&pmus_srcu, idx);
 }
 #else
 
@@ -13593,10 +13594,11 @@ int perf_event_init_cpu(unsigned int cpu)
 {
 	struct perf_cpu_context *cpuctx;
 	struct perf_event_context *ctx;
+	int idx;
 
 	perf_swevent_init_cpu(cpu);
 
-	mutex_lock(&pmus_lock);
+	idx = srcu_read_lock(&pmus_srcu);
 	cpumask_set_cpu(cpu, perf_online_mask);
 	cpuctx = per_cpu_ptr(&perf_cpu_context, cpu);
 	ctx = &cpuctx->ctx;
@@ -13604,7 +13606,7 @@ int perf_event_init_cpu(unsigned int cpu)
 	mutex_lock(&ctx->mutex);
 	cpuctx->online = 1;
 	mutex_unlock(&ctx->mutex);
-	mutex_unlock(&pmus_lock);
+	srcu_read_unlock(&pmus_srcu, idx);
 
 	return 0;
 }
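
The two orderings listed at the top of this reply can be reproduced outside
the kernel as a plain graph problem. What follows is a minimal userspace
sketch, not lockdep's actual implementation: every identifier in it
(lock_graph, add_dependency, replay_reported_orderings, the enum values) is
hypothetical and chosen only to mirror the lock names in the report. It
records "B was taken while A was held" as an edge and refuses an edge that
would close a cycle.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock-class IDs; the names mirror the locks in the report. */
enum lock_class {
	CPU_ADD_REMOVE_LOCK,
	CPU_HOTPLUG_LOCK,
	PMUS_LOCK,
	DEV_MUTEX,
	NR_CLASSES
};

/* edge[a][b] is true once "b was acquired while a was held" was recorded. */
struct lock_graph {
	bool edge[NR_CLASSES][NR_CLASSES];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(const struct lock_graph *g, int from, int to,
		      bool *visited)
{
	if (from == to)
		return true;
	visited[from] = true;
	for (int next = 0; next < NR_CLASSES; next++) {
		if (g->edge[from][next] && !visited[next] &&
		    reachable(g, next, to, visited))
			return true;
	}
	return false;
}

/*
 * Record "taken was acquired while held was held".
 * Returns false, without adding the edge, if 'held' is already reachable
 * from 'taken', because the new edge would then close a cycle.
 */
static bool add_dependency(struct lock_graph *g, enum lock_class held,
			   enum lock_class taken)
{
	bool visited[NR_CLASSES] = { false };

	assert(held < NR_CLASSES && taken < NR_CLASSES);
	if (reachable(g, taken, held, visited))
		return false;
	g->edge[held][taken] = true;
	return true;
}

/* Replays both orderings; returns true if an inversion was detected. */
bool replay_reported_orderings(void)
{
	struct lock_graph g;
	bool ok = true;

	graph_init(&g);
	/* Ordering seen on the CPU bring-up and perf sysfs init paths. */
	ok &= add_dependency(&g, CPU_ADD_REMOVE_LOCK, CPU_HOTPLUG_LOCK);
	ok &= add_dependency(&g, CPU_HOTPLUG_LOCK, PMUS_LOCK);
	ok &= add_dependency(&g, PMUS_LOCK, DEV_MUTEX);
	/* Ordering seen in pci_device_probe(): dev->mutex is held first. */
	ok &= add_dependency(&g, DEV_MUTEX, CPU_ADD_REMOVE_LOCK);
	return !ok;
}
```

In this model the first three edges are accepted and the fourth is rejected,
which corresponds to the "possible circular locking dependency" report: no
deadlock has to actually happen, it is enough that both orderings were
observed once.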


* Re: [perf] lockdep warning between cpu_add_remove_lock and &dev->mutex.
  2023-01-23 11:41 ` Peter Zijlstra
@ 2023-01-23 14:10   ` Tetsuo Handa
  2023-01-23 15:02     ` Peter Zijlstra
                       ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Tetsuo Handa @ 2023-01-23 14:10 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-perf-users,
	LKML

On 2023/01/23 20:41, Peter Zijlstra wrote:
> On Mon, Jan 23, 2023 at 07:39:24PM +0900, Tetsuo Handa wrote:
>> Hello.
>>
>> I tried to apply below patch, and hit lockdep warning during boot.
>> Can you break this dependency?
> 
>   cpu_add_remove_lock
>     cpu_hotplug_lock
>       pmus_lock
>         dev->mutex		(pmu_dev_alloc)
> 
> vs
> 
>   dev->mutex
>     cpu_add_remove_lock		(pci_device_probe)
> 
> 
> Possibly something like this might do -- I'm not entirely sure it's
> fully correct, needs a bit of auditing.
> 

After applying your diff, lockdep message changed like below. Is this
the reason commit 1704f47b50b5 ("lockdep: Add novalidate class for
dev->mutex conversion") was applied?

----------
[    2.276394][    T9] Trying to unpack rootfs image as initramfs...
[    2.276394][    T1] software IO TLB: mapped [mem 0x00000000bbed0000-0x00000000bfed0000] (64MB)
[    2.276394][    T1] workingset: timestamp_bits=46 max_order=21 bucket_order=0
[    2.276394][    T1] SGI XFS with ACLs, security attributes, verbose warnings, quota, no debug enabled
[    2.276394][    T1] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
[    2.837244][    T1] 
[    2.837244][    T1] ============================================
[    2.837244][    T1] WARNING: possible recursive locking detected
[    2.837244][    T1] 6.2.0-rc5+ #10 Not tainted
[    2.837244][    T1] --------------------------------------------
[    2.837244][    T1] swapper/0/1 is trying to acquire lock:
[    2.837244][    T1] ffff984dc3d50108 (&dev->mutex){+.+.}-{3:3}, at: __device_attach+0x35/0x1a0
[    2.837244][    T1] 
[    2.837244][    T1] but task is already holding lock:
[    2.837244][    T1] ffff984dc1b5e1b8 (&dev->mutex){+.+.}-{3:3}, at: __device_driver_lock+0x28/0x40
[    2.837244][    T1] 
[    2.837244][    T1] other info that might help us debug this:
[    2.837244][    T1]  Possible unsafe locking scenario:
[    2.837244][    T1] 
[    2.837244][    T1]        CPU0
[    2.837244][    T1]        ----
[    2.837244][    T1]   lock(&dev->mutex);
[    2.837244][    T1]   lock(&dev->mutex);
[    2.837244][    T1] 
[    2.837244][    T1]  *** DEADLOCK ***
[    2.837244][    T1] 
[    2.837244][    T1]  May be due to missing lock nesting notation
[    2.837244][    T1] 
[    2.837244][    T1] 1 lock held by swapper/0/1:
[    2.837244][    T1]  #0: ffff984dc1b5e1b8 (&dev->mutex){+.+.}-{3:3}, at: __device_driver_lock+0x28/0x40
[    2.837244][    T1] 
[    2.837244][    T1] stack backtrace:
[    2.837244][    T1] CPU: 7 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc5+ #10
[    2.837244][    T1] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[    2.837244][    T1] Call Trace:
[    2.837244][    T1]  <TASK>
[    2.837244][    T1]  dump_stack_lvl+0x49/0x5e
[    2.837244][    T1]  dump_stack+0x10/0x12
[    2.837244][    T1]  __lock_acquire.cold.73+0x12e/0x2c7
[    2.837244][    T1]  lock_acquire+0xc7/0x2e0
[    2.837244][    T1]  ? __device_attach+0x35/0x1a0
[    2.837244][    T1]  __mutex_lock+0x99/0xf00
[    2.837244][    T1]  ? __device_attach+0x35/0x1a0
[    2.837244][    T1]  ? __this_cpu_preempt_check+0x13/0x20
[    2.837244][    T1]  ? __device_attach+0x35/0x1a0
[    2.837244][    T1]  ? kobject_uevent_env+0x12f/0x770
[    2.837244][    T1]  mutex_lock_nested+0x16/0x20
[    2.837244][    T1]  ? mutex_lock_nested+0x16/0x20
[    2.837244][    T1]  __device_attach+0x35/0x1a0
[    2.837244][    T1]  device_initial_probe+0xe/0x10
[    2.837244][    T1]  bus_probe_device+0x9b/0xb0
[    2.837244][    T1]  device_add+0x3e1/0x900
[    2.837244][    T1]  ? __init_waitqueue_head+0x4a/0x70
[    2.837244][    T1]  device_register+0x15/0x20
[    2.837244][    T1]  pcie_portdrv_probe+0x3e3/0x670
[    2.837244][    T1]  ? trace_hardirqs_on+0x3b/0x100
[    2.837244][    T1]  pci_device_probe+0xa8/0x150
[    2.837244][    T1]  really_probe+0xd9/0x340
[    2.837244][    T1]  ? pm_runtime_barrier+0x52/0xb0
[    2.837244][    T1]  __driver_probe_device+0x78/0x170
[    2.837244][    T1]  driver_probe_device+0x1f/0x90
[    2.837244][    T1]  __driver_attach+0xaa/0x160
[    2.837244][    T1]  ? __device_attach_driver+0x100/0x100
[    2.837244][    T1]  bus_for_each_dev+0x75/0xb0
[    2.837244][    T1]  driver_attach+0x19/0x20
[    2.837244][    T1]  bus_add_driver+0x1be/0x210
[    2.837244][    T1]  ? dmi_pcie_pme_disable_msi+0x1f/0x1f
[    2.837244][    T1]  ? dmi_pcie_pme_disable_msi+0x1f/0x1f
[    2.837244][    T1]  ? rdinit_setup+0x27/0x27
[    2.837244][    T1]  driver_register+0x6b/0xc0
[    2.837244][    T1]  ? dmi_pcie_pme_disable_msi+0x1f/0x1f
[    2.837244][    T1]  __pci_register_driver+0x7c/0x80
[    2.837244][    T1]  pcie_portdrv_init+0x3d/0x45
[    2.837244][    T1]  do_one_initcall+0x58/0x300
[    2.837244][    T1]  ? rdinit_setup+0x27/0x27
[    2.837244][    T1]  ? rcu_read_lock_sched_held+0x4a/0x70
[    2.837244][    T1]  kernel_init_freeable+0x181/0x1d2
[    2.837244][    T1]  ? rest_init+0x190/0x190
[    2.837244][    T1]  kernel_init+0x15/0x120
[    2.837244][    T1]  ret_from_fork+0x1f/0x30
[    2.837244][    T1]  </TASK>
[    4.126397][    T1] pcieport 0000:00:15.0: PME: Signaling with IRQ 24
[    4.126397][    T1] pcieport 0000:00:15.0: pciehp: Slot #160 AttnBtn+ PwrCtrl+ MRL- AttnInd- PwrInd- HotPlug+ Surprise- Interlock- NoCompl+ IbPresDis- LLActRep+
[    4.126397][    T1] pcieport 0000:00:15.1: PME: Signaling with IRQ 25
----------

# ./scripts/faddr2line --list vmlinux __device_attach+0x35/0x1a0 __device_driver_lock+0x28/0x40
__device_attach+0x35/0x1a0:

__device_attach at drivers/base/dd.c:984
 979    {
 980            int ret = 0;
 981            bool async = false;
 982
 983            device_lock(dev);
>984<           if (dev->p->dead) {
 985                    goto out_unlock;
 986            } else if (dev->driver) {
 987                    if (device_is_bound(dev)) {
 988                            ret = 1;
 989                            goto out_unlock;

__device_driver_lock+0x28/0x40:

__device_driver_lock at drivers/base/dd.c:1074
 1069   static void __device_driver_lock(struct device *dev, struct device *parent)
 1070   {
 1071           if (parent && dev->bus->need_parent_lock)
 1072                   device_lock(parent);
 1073           device_lock(dev);
>1074<  }
 1075
 1076   /*
 1077    * __device_driver_unlock - release locks needed to manipulate dev->drv
 1078    * @dev: Device we will update driver info for
 1079    * @parent: Parent device. Needed if the bus requires parent lock

# ./scripts/faddr2line vmlinux __device_attach+0x35/0x1a0 device_initial_probe+0xe/0x10 bus_probe_device+0x9b/0xb0 device_add+0x3e1/0x900 device_register+0x15/0x20 pcie_portdrv_probe+0x3e3/0x670 pci_device_probe+0xa8/0x150 really_probe+0xd9/0x340 __driver_probe_device+0x78/0x170 driver_probe_device+0x1f/0x90 __driver_attach+0xaa/0x160 bus_for_each_dev+0x75/0xb0 driver_attach+0x19/0x20 bus_add_driver+0x1be/0x210 driver_register+0x6b/0xc0
__device_attach+0x35/0x1a0:
__device_attach at drivers/base/dd.c:984

device_initial_probe+0xe/0x10:
device_initial_probe at drivers/base/dd.c:1058

bus_probe_device+0x9b/0xb0:
bus_probe_device at drivers/base/bus.c:487

device_add+0x3e1/0x900:
device_add at drivers/base/core.c:3485

device_register+0x15/0x20:
device_register at drivers/base/core.c:3560

pcie_portdrv_probe+0x3e3/0x670:
pcie_device_init at drivers/pci/pcie/portdrv.c:310
(inlined by) pcie_port_device_register at drivers/pci/pcie/portdrv.c:363
(inlined by) pcie_portdrv_probe at drivers/pci/pcie/portdrv.c:696

pci_device_probe+0xa8/0x150:
local_pci_probe at drivers/pci/pci-driver.c:324
(inlined by) pci_call_probe at drivers/pci/pci-driver.c:392
(inlined by) __pci_device_probe at drivers/pci/pci-driver.c:417
(inlined by) pci_device_probe at drivers/pci/pci-driver.c:460

really_probe+0xd9/0x340:
call_driver_probe at drivers/base/dd.c:560
(inlined by) really_probe at drivers/base/dd.c:639

__driver_probe_device+0x78/0x170:
__driver_probe_device at drivers/base/dd.c:778

driver_probe_device+0x1f/0x90:
driver_probe_device at drivers/base/dd.c:808

__driver_attach+0xaa/0x160:
__driver_attach at drivers/base/dd.c:1195

bus_for_each_dev+0x75/0xb0:
bus_for_each_dev at drivers/base/bus.c:300

driver_attach+0x19/0x20:
driver_attach at drivers/base/dd.c:1212

bus_add_driver+0x1be/0x210:
bus_add_driver at drivers/base/bus.c:619

driver_register+0x6b/0xc0:
driver_register at drivers/base/driver.c:246
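
The second splat is a different kind of report: the two addresses
(ffff984dc3d50108 and ffff984dc1b5e1b8) are different mutexes, but both
belong to the single &dev->mutex class, so lockdep sees the same class taken
twice by one task. The following is a simplified userspace sketch of that
rule, assuming a model in which a lock is identified only by its class key
and a subclass number. The names class_key, task_model, model_acquire and
parent_then_child are hypothetical, and real lockdep performs more checks
than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HELD 8

/* One key per initialization site: all locks set up there share a class. */
struct class_key {
	const char *name;
};

struct held_lock {
	const struct class_key *key;
	unsigned int subclass;
};

/* The locks one task currently holds, in acquisition order. */
struct task_model {
	struct held_lock held[MAX_HELD];
	size_t depth;
};

/*
 * Returns false if the task already holds a lock with the same class key
 * and the same subclass; the model cannot tell such instances apart.
 */
static bool model_acquire(struct task_model *t, const struct class_key *key,
			  unsigned int subclass)
{
	assert(t->depth < MAX_HELD);
	for (size_t i = 0; i < t->depth; i++) {
		if (t->held[i].key == key && t->held[i].subclass == subclass)
			return false;
	}
	t->held[t->depth].key = key;
	t->held[t->depth].subclass = subclass;
	t->depth++;
	return true;
}

/*
 * Take a "parent" device lock with subclass 0, then a "child" device lock
 * of the same class with the given subclass. Returns true if the second
 * acquisition is accepted by the model.
 */
bool parent_then_child(unsigned int child_subclass)
{
	static const struct class_key dev_mutex_key = { "&dev->mutex" };
	struct task_model t = { .depth = 0 };

	if (!model_acquire(&t, &dev_mutex_key, 0))
		return false;
	return model_acquire(&t, &dev_mutex_key, child_subclass);
}
```

With child_subclass 0 the model rejects the second acquisition, matching the
"possible recursive locking" report; with a different subclass it accepts
it, which is the idea behind the "missing lock nesting notation" hint in the
log. Whether a fixed set of subclasses can describe arbitrarily deep device
trees is a separate question that this sketch does not answer.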




* Re: [perf] lockdep warning between cpu_add_remove_lock and &dev->mutex.
  2023-01-23 14:10   ` Tetsuo Handa
@ 2023-01-23 15:02     ` Peter Zijlstra
  2023-01-23 15:28     ` Peter Zijlstra
       [not found]     ` <20230124095424.4448-1-hdanton@sina.com>
  2 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2023-01-23 15:02 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-perf-users,
	LKML

On Mon, Jan 23, 2023 at 11:10:57PM +0900, Tetsuo Handa wrote:

> After applying your diff, lockdep message changed like below. Is this
> the reason commit 1704f47b50b5 ("lockdep: Add novalidate class for
> dev->mutex conversion") was applied?

No, reason was device probing itself. There should be a thread about
that with Alan Stern some 15 years ago or so.

I'll try and have a look at the new splat later today.


* Re: [perf] lockdep warning between cpu_add_remove_lock and &dev->mutex.
  2023-01-23 14:10   ` Tetsuo Handa
  2023-01-23 15:02     ` Peter Zijlstra
@ 2023-01-23 15:28     ` Peter Zijlstra
       [not found]     ` <20230124095424.4448-1-hdanton@sina.com>
  2 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2023-01-23 15:28 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, linux-perf-users,
	LKML

On Mon, Jan 23, 2023 at 11:10:57PM +0900, Tetsuo Handa wrote:
> On 2023/01/23 20:41, Peter Zijlstra wrote:
> > On Mon, Jan 23, 2023 at 07:39:24PM +0900, Tetsuo Handa wrote:
> >> Hello.
> >>
> >> I tried to apply below patch, and hit lockdep warning during boot.
> >> Can you break this dependency?
> > 
> >   cpu_add_remove_lock
> >     cpu_hotplug_lock
> >       pmus_lock
> >         dev->mutex		(pmu_dev_alloc)
> > 
> > vs
> > 
> >   dev->mutex
> >     cpu_add_remove_lock		(pci_device_probe)
> > 
> > 
> > Possibly something like this might do -- I'm not entirely sure it's
> > fully correct, needs a bit of auditing.
> > 
> 
> After applying your diff, lockdep message changed like below. Is this
> the reason commit 1704f47b50b5 ("lockdep: Add novalidate class for
> dev->mutex conversion") was applied?

*sigh*, clearly I should have actually read the splat and not assumed it
was another perf splat.

Yes, something along these lines is why it was done. I think it was this
thread, but there might have been more:

  https://lore.kernel.org/all/Pine.LNX.4.44L0.0804171117450.18040-100000@iolanthe.rowland.org/


* Re: [perf] lockdep warning between cpu_add_remove_lock and &dev->mutex.
       [not found]     ` <20230124095424.4448-1-hdanton@sina.com>
@ 2023-01-24 11:16       ` Tetsuo Handa
  0 siblings, 0 replies; 6+ messages in thread
From: Tetsuo Handa @ 2023-01-24 11:16 UTC (permalink / raw)
  To: Hillf Danton; +Cc: Peter Zijlstra, Alan Stern, linux-kernel

On 2023/01/24 18:54, Hillf Danton wrote:
> Given device locked in the probe path, bind device to driver without lock held.
> Just to see whatever deadlock two steps ahead.
> 
>  void device_initial_probe(struct device *dev)
>  {
> +	/* invoked with device locked */
>  	__device_attach(dev, true);
>  }

Applying your diff together with the diff below resulted in a flood of
lockdep warnings saying that device_initial_probe() is invoked without the
device locked.

--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -1058,6 +1058,7 @@ EXPORT_SYMBOL_GPL(device_attach);
 void device_initial_probe(struct device *dev)
 {
         /* invoked with device locked */
+        device_lock_assert(dev);
         __device_attach(dev, true);
 }

----------
[    0.476081][    T1] PCI: Using configuration type 1 for base access
[    0.476375][    T1] ------------[ cut here ]------------
[    0.477185][    T1] WARNING: CPU: 0 PID: 1 at include/linux/device.h:851 device_initial_probe+0x37/0x50
[    0.478046][    T1] Modules linked in:
[    0.479046][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc5+ #13
[    0.480046][    T1] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[    0.481046][    T1] RIP: 0010:device_initial_probe+0x37/0x50
[    0.481268][    T1] Code: fb 85 c0 75 10 48 89 df be 01 00 00 00 e8 21 f4 ff ff 5b 5d c3 48 8d bf e8 00 00 00 be ff ff ff ff e8 cd 6d 2b 00 85 c0 75 db <0f> 0b be 01 00 00 00 48 89 df e8 fa f3 ff ff 5b 5d c3 0f 1f 80 00
[    0.483046][    T1] RSP: 0000:ffffaa0540017d48 EFLAGS: 00010246
[    0.484220][    T1] RAX: 0000000000000000 RBX: ffffa2e4f6e1a928 RCX: 0000000000000001
[    0.485233][    T1] RDX: 0000000000000001 RSI: ffffffffbde45e07 RDI: ffffffffbdea29ee
[    0.486065][    T1] RBP: ffffaa0540017d50 R08: 0000000000000001 R09: 0000000000000000
[    0.487056][    T1] R10: b7eeca500c46dbc8 R11: 0000000000000000 R12: ffffffffbe43ff40
[    0.488046][    T1] R13: ffffa2e4f6e1a928 R14: 0000000000000000 R15: ffffa2e4f6e1a928
[    0.489053][    T1] FS:  0000000000000000(0000) GS:ffffa2e4f6e00000(0000) knlGS:0000000000000000
[    0.489663][    T1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.490226][    T1] CR2: ffffa2e3e8202000 CR3: 0000000127011001 CR4: 0000000000370ef0
[    0.491046][    T1] Call Trace:
[    0.491208][    T1]  <TASK>
[    0.492084][    T1]  bus_probe_device+0x9b/0xb0
[    0.492317][    T1]  device_add+0x3e1/0x900
[    0.493046][    T1]  ? __init_waitqueue_head+0x4a/0x70
[    0.493276][    T1]  ? rdinit_setup+0x27/0x27
[    0.494174][    T1]  device_register+0x15/0x20
[    0.494327][    T1]  register_cpu+0xda/0x120
[    0.495146][    T1]  arch_register_cpu+0x48/0x110
[    0.495288][    T1]  topology_init+0x35/0x3e
[    0.496110][    T1]  ? enable_cpu0_hotplug+0x10/0x10
[    0.496283][    T1]  do_one_initcall+0x58/0x300
[    0.497153][    T1]  ? rdinit_setup+0x27/0x27
[    0.498058][    T1]  ? rcu_read_lock_sched_held+0x4a/0x70
[    0.498255][    T1]  kernel_init_freeable+0x181/0x1d2
[    0.499046][    T1]  ? rest_init+0x190/0x190
[    0.499249][    T1]  kernel_init+0x15/0x120
[    0.500046][    T1]  ret_from_fork+0x1f/0x30
[    0.500258][    T1]  </TASK>
[    0.501059][    T1] irq event stamp: 101019
[    0.501161][    T1] hardirqs last  enabled at (101029): [<ffffffffbd0d9953>] __up_console_sem+0x53/0x60
[    0.502054][    T1] hardirqs last disabled at (101046): [<ffffffffbd0d9938>] __up_console_sem+0x38/0x60
[    0.503063][    T1] softirqs last  enabled at (101044): [<ffffffffbd8d524b>] __do_softirq+0x30b/0x46f
[    0.504061][    T1] softirqs last disabled at (101039): [<ffffffffbd06e859>] irq_exit_rcu+0xb9/0xf0
[    0.505076][    T1] ---[ end trace 0000000000000000 ]---
[    0.506065][   T11] Callback from call_rcu_tasks() invoked.
----------
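
The warning is consistent with the faddr2line listing quoted earlier in the
thread, where __device_attach() calls device_lock(dev) itself. A small
userspace model of that situation is sketched below, under the assumption
that the lock can be represented by a single flag. All names (model_device,
model_attach, model_initial_probe, probe_from_unlocked_caller) are
hypothetical and only stand in for the kernel functions they are named
after.

```c
#include <assert.h>
#include <stdbool.h>

struct model_device {
	bool locked;       /* stands in for dev->mutex being held */
	int warnings;      /* how often the held-check failed */
	int attach_calls;  /* how often the attach step ran */
};

static void model_lock(struct model_device *d)
{
	/* Taking the lock twice would self-deadlock on a real mutex. */
	assert(!d->locked);
	d->locked = true;
}

static void model_unlock(struct model_device *d)
{
	assert(d->locked);
	d->locked = false;
}

/* Like the quoted __device_attach(): takes the device lock itself. */
static void model_attach(struct model_device *d)
{
	model_lock(d);
	d->attach_calls++;
	model_unlock(d);
}

/* The initial-probe step with the experimental held-check in front. */
static void model_initial_probe(struct model_device *d)
{
	if (!d->locked)
		d->warnings++;	/* where the WARNING in the log fires */
	model_attach(d);
}

/* A caller that does not take the device lock before probing. */
int probe_from_unlocked_caller(void)
{
	struct model_device d = { false, 0, 0 };

	model_initial_probe(&d);
	return d.warnings;
}
```

In the model, an unlocked caller produces one warning per probe and the
attach step still completes. A caller that did hold the lock would instead
trip the assert in model_lock(), the model's stand-in for a self-deadlock.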


