All of lore.kernel.org
 help / color / mirror / Atom feed
* lockdep null-ptr-deref
@ 2020-09-29 14:31 Qian Cai
  2020-09-29 23:08 ` Boqun Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Qian Cai @ 2020-09-29 14:31 UTC (permalink / raw)
  To: Peter Zijlstra, Boqun Feng
  Cc: Waiman Long, Ingo Molnar, Will Deacon, linux-kernel

I tried to add a few new Kconfig options like LEDS_TRIGGERS instantly trigger a
warning during the boot, and then there is null-ptr-deref in lockdep below. Any
idea?

[   16.487309] WARNING: possible irq lock inversion dependency detected
[   16.488313] 5.9.0-rc7-next-20200928+ #9 Not tainted
[   16.488936] --------------------------------------------------------
[   16.489767] swapper/6/0 just changed the state of lock:
[   16.490449] ffff8889eea6f418 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x1e/0x530 [libata]
__ata_sff_interrupt at /home/linux-mm/linux-next/drivers/ata/libata-sff.c:1534
(inlined by) ata_bmdma_interrupt at /home/linux-mm/linux-next/drivers/ata/libata-sff.c:2832
[   16.491639] but this lock took another, HARDIRQ-READ-unsafe lock in the past:
[   16.492561]  (&trig->leddev_list_lock){.+.?}-{2:2}
[   16.492565] 
[   16.492565] 
[   16.492565] and interrupts could create inverse lock ordering between them.
[   16.492565] 
[   16.494635] 
[   16.494635] other info that might help us debug this:
[   16.495479]  Possible interrupt unsafe locking scenario:
[   16.495479] 
[   16.496360]        CPU0                    CPU1
[   16.496941]        ----                    ----
[   16.497542]   lock(&trig->leddev_list_lock);
[   16.498095]                                local_irq_disable();
[   16.498864]                                lock(&host->lock);
[   16.499611]                                lock(&trig->leddev_list_lock);
[   16.500481]   <Interrupt>
[   16.500833]     lock(&host->lock);
[   16.501289] 
[   16.501289]  *** DEADLOCK ***
[   16.501289] 
[   16.502044] no locks held by swapper/6/0.
[   16.502566] 
[   16.502566] the shortest dependencies between 2nd lock and 1st lock:
[   16.503578]  -> (&trig->leddev_list_lock){.+.?}-{2:2} {
[   16.504259]     HARDIRQ-ON-R at:
[   16.504692]                       lock_acquire+0x17f/0x7e0
[   16.505411]                       _raw_read_lock+0x38/0x70
[   16.506120]                       led_trigger_event+0x2b/0xb0
led_trigger_event at drivers/leds/led-triggers.c:386
(inlined by) led_trigger_event at drivers/leds/led-triggers.c:377
[   16.506868]                       kbd_propagate_led_state+0x5d/0x80
[   16.507680]                       kbd_bh+0x14d/0x1d0
[   16.508335]                       tasklet_action_common.isra.13+0x23a/0x2e0
[   16.509235]                       __do_softirq+0x1ce/0x828
[   16.509940]                       run_ksoftirqd+0x26/0x50
[   16.510647]                       smpboot_thread_fn+0x30f/0x740
[   16.511413]                       kthread+0x357/0x420
[   16.512068]                       ret_from_fork+0x22/0x30
[   16.512762]     IN-SOFTIRQ-R at:
[   16.513187]                       lock_acquire+0x17f/0x7e0
[   16.513891]                       _raw_read_lock+0x38/0x70
[   16.514602]                       led_trigger_event+0x2b/0xb0
[   16.515356]                       kbd_propagate_led_state+0x5d/0x80
[   16.516165]                       kbd_bh+0x14d/0x1d0
[   16.516810]                       tasklet_action_common.isra.13+0x23a/0x2e0
[   16.517701]                       __do_softirq+0x1ce/0x828
[   16.518418]                       run_ksoftirqd+0x26/0x50
[   16.519119]                       smpboot_thread_fn+0x30f/0x740
[   16.519874]                       kthread+0x357/0x420
[   16.520460] scsi 0:0:0:0: Direct-Access     ATA      QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
[   16.520528]                       ret_from_fork+0x22/0x30
[   16.520531]     SOFTIRQ-ON-R at:
[   16.522704]                       lock_acquire+0x17f/0x7e0
[   16.523423]                       _raw_read_lock+0x5d/0x70
[   16.524124]                       led_trigger_event+0x2b/0xb0
[   16.524865]                       kbd_propagate_led_state+0x5d/0x80
[   16.525671]                       kbd_start+0xd2/0xf0
[   16.526332]                       input_register_handle+0x282/0x4f0
[   16.527142]                       kbd_connect+0xc0/0x120
[   16.527826]                       input_attach_handler+0x10a/0x170
[   16.528758]                       input_register_device.cold.22+0xac/0x29d
[   16.529651]                       atkbd_connect+0x58f/0x810
[   16.530374]                       serio_connect_driver+0x4a/0x70
[   16.531144]                       really_probe+0x222/0xb20
[   16.531861]                       driver_probe_device+0x1f6/0x380
[   16.532650]                       device_driver_attach+0xea/0x120
[   16.533441]                       __driver_attach+0xf5/0x270
[   16.534172]                       bus_for_each_dev+0x11c/0x1b0
[   16.534924]                       serio_handle_event+0x1df/0x7f0
[   16.535701]                       process_one_work+0x842/0x1410
[   16.536470]                       worker_thread+0x87/0xb40
[   16.537178]                       kthread+0x357/0x420
[   16.537828]                       ret_from_fork+0x22/0x30
[   16.538674]     INITIAL USE at:
[   16.539220]                      lock_acquire+0x17f/0x7e0
[   16.540116]                      _raw_read_lock+0x38/0x70
[   16.540999]                      led_trigger_event+0x2b/0xb0
[   16.541923]                      kbd_propagate_led_state+0x5d/0x80
[   16.542879]                      kbd_bh+0x14d/0x1d0
[   16.543538]                      tasklet_action_common.isra.13+0x23a/0x2e0
[   16.544427]                      __do_softirq+0x1ce/0x828
[   16.545117]                      run_ksoftirqd+0x26/0x50
[   16.545807]                      smpboot_thread_fn+0x30f/0x740
[   16.546566]                      kthread+0x357/0x420
[   16.547227]                      ret_from_fork+0x22/0x30
[   16.547899]     (null) at:
[   16.548298] general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] SMP KASAN PTI
[   16.549666] KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
[   16.550644] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 5.9.0-rc7-next-20200928+ #9
[   16.551615] Hardware name: Red Hat KVM, BIOS 1.14.0-1.module+el8.3.0+7638+07cf13d2 04/01/2014
[   16.552722] RIP: 0010:print_shortest_lock_dependencies.cold.65+0x10b/0x2a4
print_shortest_lock_dependencies.cold.65+0x10b/0x2a4:
print_lock_trace at kernel/locking/lockdep.c:1751
(inlined by) print_lock_class_header at kernel/locking/lockdep.c:2240
(inlined by) print_shortest_lock_dependencies at kernel/locking/lockdep.c:2263
[   16.553613] Code: 41 01 c6 4c 89 e8 48 c1 e8 03 45 01 fe 80 3c 18 00 74 08 4c 89 ef e8 db e2 ac fe 49 8b 45 00 48 8d 78 14 48 89 fa 48 c1 ea 03 <0f> b6 0c 1a 48 89 fa 83 e2 07 83 c2 03 38 ca 7c 08 84 c9 0f 82
[   16.555983] RSP: 0018:ffff888a7eb09988 EFLAGS: 00010003
[   16.556658] RAX: 0000000000000001 RBX: dffffc0000000000 RCX: 0000000000000027
[   16.557571] RDX: 0000000000000002 RSI: 0000000000000004 RDI: 0000000000000015
[   16.558499] RBP: ffffffffb3dd5888 R08: ffffed114fd640c2 R09: ffffed114fd640c2
[   16.559420] R10: ffff888a7eb2060b R11: ffffed114fd640c1 R12: 0000000000000009
[   16.560329] R13: ffffffffb3dd5928 R14: 000000000000000f R15: 0000000000000001
[   16.561237] FS:  0000000000000000(0000) GS:ffff888a7eb00000(0000) knlGS:0000000000000000
[   16.562264] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.562989] CR2: 00007fdc00f597a0 CR3: 000000091f02e006 CR4: 0000000000770ee0
[   16.563909] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   16.564818] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   16.565728] PKRU: 55555554
[   16.566080] Call Trace:
[   16.566410]  <IRQ>
[   16.566679]  print_irq_inversion_bug.cold.66+0x31b/0x36e
print_irq_inversion_bug at kernel/locking/lockdep.c:3771
(inlined by) print_irq_inversion_bug at kernel/locking/lockdep.c:3716
[   16.567369]  mark_lock.part.49+0x116a/0x1900
[   16.567915]  ? print_usage_bug+0x1f0/0x1f0
[   16.568456]  ? __lock_acquire+0xc66/0x3ac0
[   16.568983]  ? print_usage_bug+0x1f0/0x1f0
[   16.569521]  ? mark_lock.part.49+0x107/0x1900
[   16.570085]  __lock_acquire+0x14ec/0x3ac0
[   16.570610]  ? static_obj+0xb0/0xc0
[   16.571064]  ? lockdep_hardirqs_on_prepare+0x4d0/0x4d0
[   16.571731]  ? rcu_read_lock_sched_held+0xa1/0xd0
[   16.572347]  lock_acquire+0x17f/0x7e0
[   16.572825]  ? ata_bmdma_interrupt+0x1e/0x530 [libata]
[   16.573494]  ? rcu_read_unlock+0x40/0x40
[   16.573994]  ? __lock_acquire+0x1864/0x3ac0
[   16.574543]  _raw_spin_lock_irqsave+0x35/0x50
[   16.575111]  ? ata_bmdma_interrupt+0x1e/0x530 [libata]
[   16.575779]  ? ata_bmdma_port_intr+0x320/0x320 [libata]
[   16.576460]  ata_bmdma_interrupt+0x1e/0x530 [libata]
[   16.577092]  ? rcu_read_lock_bh_held+0xb0/0xb0
[   16.577676]  ? find_held_lock+0x33/0x1c0
[   16.578205]  ? ata_bmdma_port_intr+0x320/0x320 [libata]
[   16.578875]  __handle_irq_event_percpu+0xeb/0x5f0
[   16.579491]  handle_irq_event_percpu+0x73/0x140
[   16.580072]  ? __handle_irq_event_percpu+0x5f0/0x5f0
[   16.580714]  ? rwlock_bug.part.1+0x90/0x90
[   16.581251]  ? do_raw_spin_unlock+0x14b/0x230
[   16.581805]  handle_irq_event+0xa1/0x130
[   16.582324]  handle_edge_irq+0x1ee/0x770
[   16.582841]  asm_call_irq_on_stack+0x12/0x20
[   16.583416]  </IRQ>
[   16.583699]  common_interrupt+0xe2/0x180
[   16.584225]  asm_common_interrupt+0x1e/0x40
[   16.584760] RIP: 0010:default_idle+0x18/0x20
[   16.585323] Code: 5d e9 bc cd 35 ff cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 e8 96 07 6b fe e9 07 00 00 00 0f 00 2d 4a 98 5b 00 fb f4 <c3> 0f 1f 80 00 00 00 00 0f 1f 44 00 00 53 be 08 00 00 00 65 4b
[   16.587687] RSP: 0018:ffff888107317df8 EFLAGS: 00000202
[   16.588378] RAX: 0000000000008ca1 RBX: 0000000000000006 RCX: ffffffffb143f2e0
[   16.589297] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb1464e4a
[   16.590207] RBP: 0000000000000006 R08: 0000000000000001 R09: 0000000000000001
[   16.591109] R10: ffff888a7eb360eb R11: ffffed114fd66c1d R12: ffffffffb2d6fca0
[   16.592012] R13: 0000000000000000 R14: 1ffff11020e62fc6 R15: 0000000000000000
[   16.592931]  ? rcu_eqs_enter.constprop.75+0xa0/0xc0
[   16.593572]  ? default_idle+0xa/0x20
[   16.594034]  ? default_idle+0xa/0x20
[   16.594511]  default_idle_call+0xa0/0x370
[   16.595028]  do_idle+0x3c1/0x570
[   16.595462]  ? arch_cpu_idle_exit+0x40/0x40
[   16.595995]  ? lockdep_hardirqs_on_prepare+0x32b/0x4d0
[   16.596663]  ? _raw_spin_unlock_irqrestore+0x39/0x40
[   16.597306]  cpu_startup_entry+0x19/0x20
[   16.597817]  start_secondary+0x250/0x2f0
[   16.598342]  ? set_cpu_sibling_map+0x2430/0x2430
[   16.598935]  ? start_cpu0+0xc/0xc
[   16.599378]  secondary_startup_64_no_verify+0xb8/0xbb
[   16.600017] Modules linked in: crct10dif_pclmul crc32_pclmul crc32c_intel virtiofs ata_piix ghash_clmulni_intel fuse libata serio_raw e1000(+) sunrpc dm_mirror dm_region_hash dm_log dm_mod
[   16.602209] ---[ end trace f713527030b9fc6e ]---
[   16.602804] RIP: 0010:print_shortest_lock_dependencies.cold.65+0x10b/0x2a4
[   16.603695] Code: 41 01 c6 4c 89 e8 48 c1 e8 03 45 01 fe 80 3c 18 00 74 08 4c 89 ef e8 db e2 ac fe 49 8b 45 00 48 8d 78 14 48 89 fa 48 c1 ea 03 <0f> b6 0c 1a 48 89 fa 83 e2 07 83 c2 03 38 ca 7c 08 84 c9 0f 82
[   16.606060] RSP: 0018:ffff888a7eb09988 EFLAGS: 00010003
[   16.606730] RAX: 0000000000000001 RBX: dffffc0000000000 RCX: 0000000000000027
[   16.607640] RDX: 0000000000000002 RSI: 0000000000000004 RDI: 0000000000000015
[   16.608568] RBP: ffffffffb3dd5888 R08: ffffed114fd640c2 R09: ffffed114fd640c2
[   16.609495] R10: ffff888a7eb2060b R11: ffffed114fd640c1 R12: 0000000000000009
[   16.610410] R13: ffffffffb3dd5928 R14: 000000000000000f R15: 0000000000000001
[   16.611322] FS:  0000000000000000(0000) GS:ffff888a7eb00000(0000) knlGS:0000000000000000
[   16.612351] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.613078] CR2: 00007fdc00f597a0 CR3: 000000091f02e006 CR4: 0000000000770ee0
[   16.613980] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   16.614903] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   16.615824] PKRU: 55555554
[   16.616177] Kernel panic - not syncing: Fatal exception in interrupt
[   16.617445] Kernel Offset: 0x2e600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[   16.618792] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: lockdep null-ptr-deref
  2020-09-29 14:31 lockdep null-ptr-deref Qian Cai
@ 2020-09-29 23:08 ` Boqun Feng
  2020-09-30  9:16   ` Peter Zijlstra
  0 siblings, 1 reply; 35+ messages in thread
From: Boqun Feng @ 2020-09-29 23:08 UTC (permalink / raw)
  To: Qian Cai
  Cc: Peter Zijlstra, Waiman Long, Ingo Molnar, Will Deacon, linux-kernel

On Tue, Sep 29, 2020 at 10:31:56AM -0400, Qian Cai wrote:
> I tried to add a few new Kconfig options like LEDS_TRIGGERS instantly trigger a
> warning during the boot, and then there is null-ptr-deref in lockdep below. Any
> idea?
> 
> [   16.487309] WARNING: possible irq lock inversion dependency detected
> [   16.488313] 5.9.0-rc7-next-20200928+ #9 Not tainted
> [   16.488936] --------------------------------------------------------
> [   16.489767] swapper/6/0 just changed the state of lock:
> [   16.490449] ffff8889eea6f418 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x1e/0x530 [libata]
> __ata_sff_interrupt at /home/linux-mm/linux-next/drivers/ata/libata-sff.c:1534
> (inlined by) ata_bmdma_interrupt at /home/linux-mm/linux-next/drivers/ata/libata-sff.c:2832
> [   16.491639] but this lock took another, HARDIRQ-READ-unsafe lock in the past:
> [   16.492561]  (&trig->leddev_list_lock){.+.?}-{2:2}
> [   16.492565] 
> [   16.492565] 
> [   16.492565] and interrupts could create inverse lock ordering between them.
> [   16.492565] 
> [   16.494635] 
> [   16.494635] other info that might help us debug this:
> [   16.495479]  Possible interrupt unsafe locking scenario:
> [   16.495479] 
> [   16.496360]        CPU0                    CPU1
> [   16.496941]        ----                    ----
> [   16.497542]   lock(&trig->leddev_list_lock);
> [   16.498095]                                local_irq_disable();
> [   16.498864]                                lock(&host->lock);
> [   16.499611]                                lock(&trig->leddev_list_lock);
> [   16.500481]   <Interrupt>
> [   16.500833]     lock(&host->lock);
> [   16.501289] 
> [   16.501289]  *** DEADLOCK ***
> [   16.501289] 
> [   16.502044] no locks held by swapper/6/0.
> [   16.502566] 
> [   16.502566] the shortest dependencies between 2nd lock and 1st lock:
> [   16.503578]  -> (&trig->leddev_list_lock){.+.?}-{2:2} {
> [   16.504259]     HARDIRQ-ON-R at:
> [   16.504692]                       lock_acquire+0x17f/0x7e0
> [   16.505411]                       _raw_read_lock+0x38/0x70
> [   16.506120]                       led_trigger_event+0x2b/0xb0
> led_trigger_event at drivers/leds/led-triggers.c:386
> (inlined by) led_trigger_event at drivers/leds/led-triggers.c:377
> [   16.506868]                       kbd_propagate_led_state+0x5d/0x80
> [   16.507680]                       kbd_bh+0x14d/0x1d0
> [   16.508335]                       tasklet_action_common.isra.13+0x23a/0x2e0
> [   16.509235]                       __do_softirq+0x1ce/0x828
> [   16.509940]                       run_ksoftirqd+0x26/0x50
> [   16.510647]                       smpboot_thread_fn+0x30f/0x740
> [   16.511413]                       kthread+0x357/0x420
> [   16.512068]                       ret_from_fork+0x22/0x30
> [   16.512762]     IN-SOFTIRQ-R at:
> [   16.513187]                       lock_acquire+0x17f/0x7e0
> [   16.513891]                       _raw_read_lock+0x38/0x70
> [   16.514602]                       led_trigger_event+0x2b/0xb0
> [   16.515356]                       kbd_propagate_led_state+0x5d/0x80
> [   16.516165]                       kbd_bh+0x14d/0x1d0
> [   16.516810]                       tasklet_action_common.isra.13+0x23a/0x2e0
> [   16.517701]                       __do_softirq+0x1ce/0x828
> [   16.518418]                       run_ksoftirqd+0x26/0x50
> [   16.519119]                       smpboot_thread_fn+0x30f/0x740
> [   16.519874]                       kthread+0x357/0x420
> [   16.520460] scsi 0:0:0:0: Direct-Access     ATA      QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
> [   16.520528]                       ret_from_fork+0x22/0x30
> [   16.520531]     SOFTIRQ-ON-R at:
> [   16.522704]                       lock_acquire+0x17f/0x7e0
> [   16.523423]                       _raw_read_lock+0x5d/0x70
> [   16.524124]                       led_trigger_event+0x2b/0xb0
> [   16.524865]                       kbd_propagate_led_state+0x5d/0x80
> [   16.525671]                       kbd_start+0xd2/0xf0
> [   16.526332]                       input_register_handle+0x282/0x4f0
> [   16.527142]                       kbd_connect+0xc0/0x120
> [   16.527826]                       input_attach_handler+0x10a/0x170
> [   16.528758]                       input_register_device.cold.22+0xac/0x29d
> [   16.529651]                       atkbd_connect+0x58f/0x810
> [   16.530374]                       serio_connect_driver+0x4a/0x70
> [   16.531144]                       really_probe+0x222/0xb20
> [   16.531861]                       driver_probe_device+0x1f6/0x380
> [   16.532650]                       device_driver_attach+0xea/0x120
> [   16.533441]                       __driver_attach+0xf5/0x270
> [   16.534172]                       bus_for_each_dev+0x11c/0x1b0
> [   16.534924]                       serio_handle_event+0x1df/0x7f0
> [   16.535701]                       process_one_work+0x842/0x1410
> [   16.536470]                       worker_thread+0x87/0xb40
> [   16.537178]                       kthread+0x357/0x420
> [   16.537828]                       ret_from_fork+0x22/0x30
> [   16.538674]     INITIAL USE at:
> [   16.539220]                      lock_acquire+0x17f/0x7e0
> [   16.540116]                      _raw_read_lock+0x38/0x70
> [   16.540999]                      led_trigger_event+0x2b/0xb0
> [   16.541923]                      kbd_propagate_led_state+0x5d/0x80
> [   16.542879]                      kbd_bh+0x14d/0x1d0
> [   16.543538]                      tasklet_action_common.isra.13+0x23a/0x2e0
> [   16.544427]                      __do_softirq+0x1ce/0x828
> [   16.545117]                      run_ksoftirqd+0x26/0x50
> [   16.545807]                      smpboot_thread_fn+0x30f/0x740
> [   16.546566]                      kthread+0x357/0x420
> [   16.547227]                      ret_from_fork+0x22/0x30
> [   16.547899]     (null) at:

I think there are two problems here:

1) the "(null)" means we don't have the "usage_str" for a usage bit,
which I think is the LOCK_USED_READ bit introduced by Peter at
23870f122768 ('locking/lockdep: Fix "USED" <- "IN-NMI" inversions').

2) the next null-ptr-deref, and I think this is also caused by
LOCK_USED_READ bit, because in the loop inside
print_lock_class_header(), we iterate from 0 to LOCK_USAGE_STATES (which
is 4*2 + 3), however the class->usage_traces[] only has
XXX_LOCK_USAGE_STATES (which is 4*2 + 1) elements, so if we have
LOCK_USED_READ bit set in ->usage_mask, we will try to access an element
out of the ->usage_traces[] array.

Probably the following helps? And another possible fix is to enlarge the
->usage_trace[] array and record the call trace of LOCK_READ_USED.

Regards,
Boqun

---------->8
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 454355c033d2..ec7e6d90751c 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -600,6 +600,7 @@ static const char *usage_str[] =
 #include "lockdep_states.h"
 #undef LOCKDEP_STATE
 	[LOCK_USED] = "INITIAL USE",
+	[LOCK_READ_USED】 = "INITIAL READ USE",
 	[LOCK_USAGE_STATES] = "IN-NMI",
 };
 #endif
@@ -2231,7 +2232,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 #endif
 	printk(KERN_CONT " {\n");
 
-	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
+	for (bit = 0; bit < XXX_LOCK_USAGE_STATES; bit++) {
 		if (class->usage_mask & (1 << bit)) {
 			int len = depth;
 

> [   16.548298] general protection fault, probably for non-canonical address 0xdffffc0000000002: 0000 [#1] SMP KASAN PTI
> [   16.549666] KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017]
> [   16.550644] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 5.9.0-rc7-next-20200928+ #9
> [   16.551615] Hardware name: Red Hat KVM, BIOS 1.14.0-1.module+el8.3.0+7638+07cf13d2 04/01/2014
> [   16.552722] RIP: 0010:print_shortest_lock_dependencies.cold.65+0x10b/0x2a4
> print_shortest_lock_dependencies.cold.65+0x10b/0x2a4:
> print_lock_trace at kernel/locking/lockdep.c:1751
> (inlined by) print_lock_class_header at kernel/locking/lockdep.c:2240
> (inlined by) print_shortest_lock_dependencies at kernel/locking/lockdep.c:2263
> [   16.553613] Code: 41 01 c6 4c 89 e8 48 c1 e8 03 45 01 fe 80 3c 18 00 74 08 4c 89 ef e8 db e2 ac fe 49 8b 45 00 48 8d 78 14 48 89 fa 48 c1 ea 03 <0f> b6 0c 1a 48 89 fa 83 e2 07 83 c2 03 38 ca 7c 08 84 c9 0f 82
> [   16.555983] RSP: 0018:ffff888a7eb09988 EFLAGS: 00010003
> [   16.556658] RAX: 0000000000000001 RBX: dffffc0000000000 RCX: 0000000000000027
> [   16.557571] RDX: 0000000000000002 RSI: 0000000000000004 RDI: 0000000000000015
> [   16.558499] RBP: ffffffffb3dd5888 R08: ffffed114fd640c2 R09: ffffed114fd640c2
> [   16.559420] R10: ffff888a7eb2060b R11: ffffed114fd640c1 R12: 0000000000000009
> [   16.560329] R13: ffffffffb3dd5928 R14: 000000000000000f R15: 0000000000000001
> [   16.561237] FS:  0000000000000000(0000) GS:ffff888a7eb00000(0000) knlGS:0000000000000000
> [   16.562264] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   16.562989] CR2: 00007fdc00f597a0 CR3: 000000091f02e006 CR4: 0000000000770ee0
> [   16.563909] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   16.564818] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   16.565728] PKRU: 55555554
> [   16.566080] Call Trace:
> [   16.566410]  <IRQ>
> [   16.566679]  print_irq_inversion_bug.cold.66+0x31b/0x36e
> print_irq_inversion_bug at kernel/locking/lockdep.c:3771
> (inlined by) print_irq_inversion_bug at kernel/locking/lockdep.c:3716
> [   16.567369]  mark_lock.part.49+0x116a/0x1900
> [   16.567915]  ? print_usage_bug+0x1f0/0x1f0
> [   16.568456]  ? __lock_acquire+0xc66/0x3ac0
> [   16.568983]  ? print_usage_bug+0x1f0/0x1f0
> [   16.569521]  ? mark_lock.part.49+0x107/0x1900
> [   16.570085]  __lock_acquire+0x14ec/0x3ac0
> [   16.570610]  ? static_obj+0xb0/0xc0
> [   16.571064]  ? lockdep_hardirqs_on_prepare+0x4d0/0x4d0
> [   16.571731]  ? rcu_read_lock_sched_held+0xa1/0xd0
> [   16.572347]  lock_acquire+0x17f/0x7e0
> [   16.572825]  ? ata_bmdma_interrupt+0x1e/0x530 [libata]
> [   16.573494]  ? rcu_read_unlock+0x40/0x40
> [   16.573994]  ? __lock_acquire+0x1864/0x3ac0
> [   16.574543]  _raw_spin_lock_irqsave+0x35/0x50
> [   16.575111]  ? ata_bmdma_interrupt+0x1e/0x530 [libata]
> [   16.575779]  ? ata_bmdma_port_intr+0x320/0x320 [libata]
> [   16.576460]  ata_bmdma_interrupt+0x1e/0x530 [libata]
> [   16.577092]  ? rcu_read_lock_bh_held+0xb0/0xb0
> [   16.577676]  ? find_held_lock+0x33/0x1c0
> [   16.578205]  ? ata_bmdma_port_intr+0x320/0x320 [libata]
> [   16.578875]  __handle_irq_event_percpu+0xeb/0x5f0
> [   16.579491]  handle_irq_event_percpu+0x73/0x140
> [   16.580072]  ? __handle_irq_event_percpu+0x5f0/0x5f0
> [   16.580714]  ? rwlock_bug.part.1+0x90/0x90
> [   16.581251]  ? do_raw_spin_unlock+0x14b/0x230
> [   16.581805]  handle_irq_event+0xa1/0x130
> [   16.582324]  handle_edge_irq+0x1ee/0x770
> [   16.582841]  asm_call_irq_on_stack+0x12/0x20
> [   16.583416]  </IRQ>
> [   16.583699]  common_interrupt+0xe2/0x180
> [   16.584225]  asm_common_interrupt+0x1e/0x40
> [   16.584760] RIP: 0010:default_idle+0x18/0x20
> [   16.585323] Code: 5d e9 bc cd 35 ff cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 e8 96 07 6b fe e9 07 00 00 00 0f 00 2d 4a 98 5b 00 fb f4 <c3> 0f 1f 80 00 00 00 00 0f 1f 44 00 00 53 be 08 00 00 00 65 4b
> [   16.587687] RSP: 0018:ffff888107317df8 EFLAGS: 00000202
> [   16.588378] RAX: 0000000000008ca1 RBX: 0000000000000006 RCX: ffffffffb143f2e0
> [   16.589297] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffb1464e4a
> [   16.590207] RBP: 0000000000000006 R08: 0000000000000001 R09: 0000000000000001
> [   16.591109] R10: ffff888a7eb360eb R11: ffffed114fd66c1d R12: ffffffffb2d6fca0
> [   16.592012] R13: 0000000000000000 R14: 1ffff11020e62fc6 R15: 0000000000000000
> [   16.592931]  ? rcu_eqs_enter.constprop.75+0xa0/0xc0
> [   16.593572]  ? default_idle+0xa/0x20
> [   16.594034]  ? default_idle+0xa/0x20
> [   16.594511]  default_idle_call+0xa0/0x370
> [   16.595028]  do_idle+0x3c1/0x570
> [   16.595462]  ? arch_cpu_idle_exit+0x40/0x40
> [   16.595995]  ? lockdep_hardirqs_on_prepare+0x32b/0x4d0
> [   16.596663]  ? _raw_spin_unlock_irqrestore+0x39/0x40
> [   16.597306]  cpu_startup_entry+0x19/0x20
> [   16.597817]  start_secondary+0x250/0x2f0
> [   16.598342]  ? set_cpu_sibling_map+0x2430/0x2430
> [   16.598935]  ? start_cpu0+0xc/0xc
> [   16.599378]  secondary_startup_64_no_verify+0xb8/0xbb
> [   16.600017] Modules linked in: crct10dif_pclmul crc32_pclmul crc32c_intel virtiofs ata_piix ghash_clmulni_intel fuse libata serio_raw e1000(+) sunrpc dm_mirror dm_region_hash dm_log dm_mod
> [   16.602209] ---[ end trace f713527030b9fc6e ]---
> [   16.602804] RIP: 0010:print_shortest_lock_dependencies.cold.65+0x10b/0x2a4
> [   16.603695] Code: 41 01 c6 4c 89 e8 48 c1 e8 03 45 01 fe 80 3c 18 00 74 08 4c 89 ef e8 db e2 ac fe 49 8b 45 00 48 8d 78 14 48 89 fa 48 c1 ea 03 <0f> b6 0c 1a 48 89 fa 83 e2 07 83 c2 03 38 ca 7c 08 84 c9 0f 82
> [   16.606060] RSP: 0018:ffff888a7eb09988 EFLAGS: 00010003
> [   16.606730] RAX: 0000000000000001 RBX: dffffc0000000000 RCX: 0000000000000027
> [   16.607640] RDX: 0000000000000002 RSI: 0000000000000004 RDI: 0000000000000015
> [   16.608568] RBP: ffffffffb3dd5888 R08: ffffed114fd640c2 R09: ffffed114fd640c2
> [   16.609495] R10: ffff888a7eb2060b R11: ffffed114fd640c1 R12: 0000000000000009
> [   16.610410] R13: ffffffffb3dd5928 R14: 000000000000000f R15: 0000000000000001
> [   16.611322] FS:  0000000000000000(0000) GS:ffff888a7eb00000(0000) knlGS:0000000000000000
> [   16.612351] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   16.613078] CR2: 00007fdc00f597a0 CR3: 000000091f02e006 CR4: 0000000000770ee0
> [   16.613980] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   16.614903] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   16.615824] PKRU: 55555554
> [   16.616177] Kernel panic - not syncing: Fatal exception in interrupt
> [   16.617445] Kernel Offset: 0x2e600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [   16.618792] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: lockdep null-ptr-deref
  2020-09-29 23:08 ` Boqun Feng
@ 2020-09-30  9:16   ` Peter Zijlstra
  2020-09-30  9:49     ` Peter Zijlstra
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Zijlstra @ 2020-09-30  9:16 UTC (permalink / raw)
  To: Boqun Feng, g
  Cc: Qian Cai, Waiman Long, Ingo Molnar, Will Deacon, linux-kernel

On Wed, Sep 30, 2020 at 07:08:23AM +0800, Boqun Feng wrote:
> I think there are two problems here:
> 
> 1) the "(null)" means we don't have the "usage_str" for a usage bit,
> which I think is the LOCK_USED_READ bit introduced by Peter at
> 23870f122768 ('locking/lockdep: Fix "USED" <- "IN-NMI" inversions').
> 
> 2) the next null-ptr-deref, and I think this is also caused by
> LOCK_USED_READ bit, because in the loop inside
> print_lock_class_header(), we iterate from 0 to LOCK_USAGE_STATES (which
> is 4*2 + 3), however the class->usage_traces[] only has
> XXX_LOCK_USAGE_STATES (which is 4*2 + 1) elements, so if we have
> LOCK_USED_READ bit set in ->usage_mask, we will try to access an element
> out of the ->usage_traces[] array.
> 
> Probably the following helps? And another possible fix is to enlarge the
> ->usage_trace[] array and record the call trace of LOCK_READ_USED.

Urgh.. yeah, I wanted to avoid saving that trace; it's pretty useless :/
The existing USED trace is already mostly pointless, the consistent
thing would be to remove both but that might be too radical.

But you're right in that I made a right mess of it. Not sure what's
best here.

Let me have a play.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: lockdep null-ptr-deref
  2020-09-30  9:16   ` Peter Zijlstra
@ 2020-09-30  9:49     ` Peter Zijlstra
  2020-09-30 12:18       ` Boqun Feng
                         ` (3 more replies)
  0 siblings, 4 replies; 35+ messages in thread
From: Peter Zijlstra @ 2020-09-30  9:49 UTC (permalink / raw)
  To: Boqun Feng, g
  Cc: Qian Cai, Waiman Long, Ingo Molnar, Will Deacon, linux-kernel

On Wed, Sep 30, 2020 at 11:16:11AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 30, 2020 at 07:08:23AM +0800, Boqun Feng wrote:
> > I think there are two problems here:
> > 
> > 1) the "(null)" means we don't have the "usage_str" for a usage bit,
> > which I think is the LOCK_USED_READ bit introduced by Peter at
> > 23870f122768 ('locking/lockdep: Fix "USED" <- "IN-NMI" inversions').
> > 
> > 2) the next null-ptr-deref, and I think this is also caused by
> > LOCK_USED_READ bit, because in the loop inside
> > print_lock_class_header(), we iterate from 0 to LOCK_USAGE_STATES (which
> > is 4*2 + 3), however the class->usage_traces[] only has
> > XXX_LOCK_USAGE_STATES (which is 4*2 + 1) elements, so if we have
> > LOCK_USED_READ bit set in ->usage_mask, we will try to access an element
> > out of the ->usage_traces[] array.
> > 
> > Probably the following helps? And another possible fix is to enlarge the
> > ->usage_trace[] array and record the call trace of LOCK_READ_USED.
> 
> Urgh.. yeah, I wanted to avoid saving that trace; it's pretty useless :/
> The existing USED trace is already mostly pointless, the consistent
> thing would be to remove both but that might be too radical.
> 
> But you're right in that I made a right mess of it. Not sure what's
> best here.
> 
> Let me have a play.

How's something like this? It's bigger than I'd like, but I feel the
result is more consistent/readable.

---
diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
index bb35b449f533..a55b1d314ae8 100644
--- a/include/linux/lockdep_types.h
+++ b/include/linux/lockdep_types.h
@@ -35,8 +35,12 @@ enum lockdep_wait_type {
 /*
  * We'd rather not expose kernel/lockdep_states.h this wide, but we do need
  * the total number of states... :-(
+ *
+ * XXX_LOCK_USAGE_STATES is the number of lines in lockdep_states.h, for each
+ * of those we generates 4 states, Additionally we (for now) report on USED.
  */
-#define XXX_LOCK_USAGE_STATES		(1+2*4)
+#define XXX_LOCK_USAGE_STATES		2
+#define LOCK_TRACE_STATES		(XXX_LOCK_USAGE_STATES*4 + 1)
 
 /*
  * NR_LOCKDEP_CACHING_CLASSES ... Number of classes
@@ -106,7 +110,7 @@ struct lock_class {
 	 * IRQ/softirq usage tracking bits:
 	 */
 	unsigned long			usage_mask;
-	const struct lock_trace		*usage_traces[XXX_LOCK_USAGE_STATES];
+	const struct lock_trace		*usage_traces[LOCK_TRACE_STATES];
 
 	/*
 	 * Generation counter, when doing certain classes of graph walking,
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 454355c033d2..4f98ac8b4575 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -600,6 +600,8 @@ static const char *usage_str[] =
 #include "lockdep_states.h"
 #undef LOCKDEP_STATE
 	[LOCK_USED] = "INITIAL USE",
+	[LOCK_USED_READ] = "INITIAL READ USE",
+	/* abused as string storage for verify_lock_unused() */
 	[LOCK_USAGE_STATES] = "IN-NMI",
 };
 #endif
@@ -2231,7 +2233,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 #endif
 	printk(KERN_CONT " {\n");
 
-	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
+	for (bit = 0; bit < LOCK_TRACE_STATES; bit++) {
 		if (class->usage_mask & (1 << bit)) {
 			int len = depth;
 
@@ -4354,27 +4356,24 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	old_mask = hlock_class(this)->usage_mask;
 	hlock_class(this)->usage_mask |= new_mask;
 
-	/*
-	 * Save one usage_traces[] entry and map both LOCK_USED and
-	 * LOCK_USED_READ onto the same entry.
-	 */
-	if (new_bit == LOCK_USED || new_bit == LOCK_USED_READ) {
-		if (old_mask & (LOCKF_USED | LOCKF_USED_READ))
-			goto unlock;
-		new_bit = LOCK_USED;
+	if (new_bit < LOCK_TRACE_STATES) {
+		if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
+			return 0;
 	}
 
-	if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
-		return 0;
-
 	switch (new_bit) {
+	case 0 ... LOCK_USED-1:
+		ret = mark_lock_irq(curr, this, new_bit);
+		if (!ret)
+			return 0;
+		break;
+
 	case LOCK_USED:
 		debug_atomic_dec(nr_unused_locks);
 		break;
+
 	default:
-		ret = mark_lock_irq(curr, this, new_bit);
-		if (!ret)
-			return 0;
+		break;
 	}
 
 unlock:
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index b0be1560ed17..67dc46e46552 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -20,9 +20,12 @@ enum lock_usage_bit {
 #undef LOCKDEP_STATE
 	LOCK_USED,
 	LOCK_USED_READ,
-	LOCK_USAGE_STATES
+	LOCK_USAGE_STATES,
 };
 
+/* states after LOCK_USED are not traced and printed */
+static_assert(LOCK_TRACE_STATES == LOCK_USED_READ);
+
 #define LOCK_USAGE_READ_MASK 1
 #define LOCK_USAGE_DIR_MASK  2
 #define LOCK_USAGE_STATE_MASK (~(LOCK_USAGE_READ_MASK | LOCK_USAGE_DIR_MASK))
@@ -121,7 +124,7 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
 extern struct list_head all_lock_classes;
 extern struct lock_chain lock_chains[];
 
-#define LOCK_USAGE_CHARS (1+LOCK_USAGE_STATES/2)
+#define LOCK_USAGE_CHARS (2*XXX_LOCK_USAGE_STATES + 1)
 
 extern void get_usage_chars(struct lock_class *class,
 			    char usage[LOCK_USAGE_CHARS]);

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: lockdep null-ptr-deref
  2020-09-30  9:49     ` Peter Zijlstra
@ 2020-09-30 12:18       ` Boqun Feng
  2020-09-30 19:02         ` Peter Zijlstra
  2020-10-02 10:06       ` Peter Zijlstra
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 35+ messages in thread
From: Boqun Feng @ 2020-09-30 12:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: g, Qian Cai, Waiman Long, Ingo Molnar, Will Deacon, linux-kernel

On Wed, Sep 30, 2020 at 11:49:37AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 30, 2020 at 11:16:11AM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 30, 2020 at 07:08:23AM +0800, Boqun Feng wrote:
> > > I think there are two problems here:
> > > 
> > > 1) the "(null)" means we don't have the "usage_str" for a usage bit,
> > > which I think is the LOCK_USED_READ bit introduced by Peter at
> > > 23870f122768 ('locking/lockdep: Fix "USED" <- "IN-NMI" inversions').
> > > 
> > > 2) the next null-ptr-deref, and I think this is also caused by
> > > LOCK_USED_READ bit, because in the loop inside
> > > print_lock_class_header(), we iterate from 0 to LOCK_USAGE_STATES (which
> > > is 4*2 + 3), however the class->usage_traces[] only has
> > > XXX_LOCK_USAGE_STATES (which is 4*2 + 1) elements, so if we have
> > > LOCK_USED_READ bit set in ->usage_mask, we will try to access an element
> > > out of the ->usage_traces[] array.
> > > 
> > > Probably the following helps? And another possible fix is to enlarge the
> > > ->usage_trace[] array and record the call trace of LOCK_READ_USED.
> > 
> > Urgh.. yeah, I wanted to avoid saving that trace; it's pretty useless :/
> > The existing USED trace is already mostly pointless, the consistent
> > thing would be to remove both but that might be too radical.
> > 
> > But you're right in that I made a right mess of it. Not sure what's
> > best here.
> > 
> > Let me have a play.
> 
> How's something like this? It's bigger than I'd like, but I feel the
> result is more consistent/readable.
> 

Looks good to me.

For one thing, I do think that LOCK_READ_USED trace is helpful for
better reporting, because if there is a read lock in the dependency path
which causes the deadlock, it's better to have the LOCK_READ_USED trace
to know at least the initial READ usage. For example, if we have

	void f1(...)
	{
		write_lock(&A);
		spin_lock(&C);
		// A -> C
		...
	}

	void g(...)
	{
		read_lock(&A);
		...
	}
	void f2(...)
	{
		spin_lock(&B);
		g(...);
		// B -> A
	}

	void f3(...) {
		spin_lock(&C);
		spin_lock(&B);
		// C -> B, trigger lockdep splat
	}

when lockdep reports the deadlock (at the time f3() is called), it will
be useful if we have a trace like:

	INITIAL READ usage at:
	g+0x.../0x...
	f2+0x.../0x...

Thoughts?

Regards,
Boqun

> ---
> diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
> index bb35b449f533..a55b1d314ae8 100644
> --- a/include/linux/lockdep_types.h
> +++ b/include/linux/lockdep_types.h
> @@ -35,8 +35,12 @@ enum lockdep_wait_type {
>  /*
>   * We'd rather not expose kernel/lockdep_states.h this wide, but we do need
>   * the total number of states... :-(
> + *
> + * XXX_LOCK_USAGE_STATES is the number of lines in lockdep_states.h, for each
> + * of those we generates 4 states, Additionally we (for now) report on USED.
>   */
> -#define XXX_LOCK_USAGE_STATES		(1+2*4)
> +#define XXX_LOCK_USAGE_STATES		2
> +#define LOCK_TRACE_STATES		(XXX_LOCK_USAGE_STATES*4 + 1)
>  
>  /*
>   * NR_LOCKDEP_CACHING_CLASSES ... Number of classes
> @@ -106,7 +110,7 @@ struct lock_class {
>  	 * IRQ/softirq usage tracking bits:
>  	 */
>  	unsigned long			usage_mask;
> -	const struct lock_trace		*usage_traces[XXX_LOCK_USAGE_STATES];
> +	const struct lock_trace		*usage_traces[LOCK_TRACE_STATES];
>  
>  	/*
>  	 * Generation counter, when doing certain classes of graph walking,
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 454355c033d2..4f98ac8b4575 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -600,6 +600,8 @@ static const char *usage_str[] =
>  #include "lockdep_states.h"
>  #undef LOCKDEP_STATE
>  	[LOCK_USED] = "INITIAL USE",
> +	[LOCK_USED_READ] = "INITIAL READ USE",
> +	/* abused as string storage for verify_lock_unused() */
>  	[LOCK_USAGE_STATES] = "IN-NMI",
>  };
>  #endif
> @@ -2231,7 +2233,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
>  #endif
>  	printk(KERN_CONT " {\n");
>  
> -	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
> +	for (bit = 0; bit < LOCK_TRACE_STATES; bit++) {
>  		if (class->usage_mask & (1 << bit)) {
>  			int len = depth;
>  
> @@ -4354,27 +4356,24 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
>  	old_mask = hlock_class(this)->usage_mask;
>  	hlock_class(this)->usage_mask |= new_mask;
>  
> -	/*
> -	 * Save one usage_traces[] entry and map both LOCK_USED and
> -	 * LOCK_USED_READ onto the same entry.
> -	 */
> -	if (new_bit == LOCK_USED || new_bit == LOCK_USED_READ) {
> -		if (old_mask & (LOCKF_USED | LOCKF_USED_READ))
> -			goto unlock;
> -		new_bit = LOCK_USED;
> +	if (new_bit < LOCK_TRACE_STATES) {
> +		if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
> +			return 0;
>  	}
>  
> -	if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
> -		return 0;
> -
>  	switch (new_bit) {
> +	case 0 ... LOCK_USED-1:
> +		ret = mark_lock_irq(curr, this, new_bit);
> +		if (!ret)
> +			return 0;
> +		break;
> +
>  	case LOCK_USED:
>  		debug_atomic_dec(nr_unused_locks);
>  		break;
> +
>  	default:
> -		ret = mark_lock_irq(curr, this, new_bit);
> -		if (!ret)
> -			return 0;
> +		break;
>  	}
>  
>  unlock:
> diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
> index b0be1560ed17..67dc46e46552 100644
> --- a/kernel/locking/lockdep_internals.h
> +++ b/kernel/locking/lockdep_internals.h
> @@ -20,9 +20,12 @@ enum lock_usage_bit {
>  #undef LOCKDEP_STATE
>  	LOCK_USED,
>  	LOCK_USED_READ,
> -	LOCK_USAGE_STATES
> +	LOCK_USAGE_STATES,
>  };
>  
> +/* states after LOCK_USED are not traced and printed */
> +static_assert(LOCK_TRACE_STATES == LOCK_USED_READ);
> +
>  #define LOCK_USAGE_READ_MASK 1
>  #define LOCK_USAGE_DIR_MASK  2
>  #define LOCK_USAGE_STATE_MASK (~(LOCK_USAGE_READ_MASK | LOCK_USAGE_DIR_MASK))
> @@ -121,7 +124,7 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
>  extern struct list_head all_lock_classes;
>  extern struct lock_chain lock_chains[];
>  
> -#define LOCK_USAGE_CHARS (1+LOCK_USAGE_STATES/2)
> +#define LOCK_USAGE_CHARS (2*XXX_LOCK_USAGE_STATES + 1)
>  
>  extern void get_usage_chars(struct lock_class *class,
>  			    char usage[LOCK_USAGE_CHARS]);

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: lockdep null-ptr-deref
  2020-09-30 12:18       ` Boqun Feng
@ 2020-09-30 19:02         ` Peter Zijlstra
  2020-10-02 12:36           ` Boqun Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Zijlstra @ 2020-09-30 19:02 UTC (permalink / raw)
  To: Boqun Feng
  Cc: g, Qian Cai, Waiman Long, Ingo Molnar, Will Deacon, linux-kernel

On Wed, Sep 30, 2020 at 08:18:18PM +0800, Boqun Feng wrote:

> For one thing, I do think that LOCK_READ_USED trace is helpful for
> better reporting, because if there is a read lock in the dependency path
> which causes the deadlock, it's better to have the LOCK_READ_USED trace
> to know at least the initial READ usage. For example, if we have
> 
> 	void f1(...)
> 	{
> 		write_lock(&A);
> 		spin_lock(&C);
> 		// A -> C
> 		...
> 	}
> 
> 	void g(...)
> 	{
> 		read_lock(&A);
> 		...
> 	}
> 	void f2(...)
> 	{
> 		spin_lock(&B);
> 		g(...);
> 		// B -> A
> 	}
> 
> 	void f3(...) {
> 		spin_lock(&C);
> 		spin_lock(&B);
> 		// C -> B, trigger lockdep splat
> 	}
> 
> when lockdep reports the deadlock (at the time f3() is called), it will
> be useful if we have a trace like:
> 
> 	INITIAL READ usage at:
> 	g+0x.../0x...
> 	f2+0x.../0x...
> 
> Thoughts?

Wouldn't that also be in LOCK_ENABLED_*_READ ?

That is, with PROVE_LOCKING on, the initial usage is bound to set more
states, except for !check||trylock usage, and those aren't really all
that interesting.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: lockdep null-ptr-deref
  2020-09-30  9:49     ` Peter Zijlstra
  2020-09-30 12:18       ` Boqun Feng
@ 2020-10-02 10:06       ` Peter Zijlstra
  2020-10-02 13:40         ` Qian Cai
  2020-10-07 16:20       ` [tip: locking/core] lockdep: Fix usage_traceoverflow tip-bot2 for Peter Zijlstra
  2020-10-09  7:58       ` [tip: locking/core] lockdep: Fix usage_traceoverflow tip-bot2 for Peter Zijlstra
  3 siblings, 1 reply; 35+ messages in thread
From: Peter Zijlstra @ 2020-10-02 10:06 UTC (permalink / raw)
  To: Boqun Feng; +Cc: Qian Cai, Waiman Long, Ingo Molnar, Will Deacon, linux-kernel

On Wed, Sep 30, 2020 at 11:49:37AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 30, 2020 at 11:16:11AM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 30, 2020 at 07:08:23AM +0800, Boqun Feng wrote:
> > > I think there are two problems here:
> > > 
> > > 1) the "(null)" means we don't have the "usage_str" for a usage bit,
> > > which I think is the LOCK_USED_READ bit introduced by Peter at
> > > 23870f122768 ('locking/lockdep: Fix "USED" <- "IN-NMI" inversions').
> > > 
> > > 2) the next null-ptr-deref, and I think this is also caused by
> > > LOCK_USED_READ bit, because in the loop inside
> > > print_lock_class_header(), we iterate from 0 to LOCK_USAGE_STATES (which
> > > is 4*2 + 3), however the class->usage_traces[] only has
> > > XXX_LOCK_USAGE_STATES (which is 4*2 + 1) elements, so if we have
> > > LOCK_USED_READ bit set in ->usage_mask, we will try to access an element
> > > out of the ->usage_traces[] array.
> > > 
> > > Probably the following helps? And another possible fix is to enlarge the
> > > ->usage_trace[] array and record the call trace of LOCK_READ_USED.
> > 
> > Urgh.. yeah, I wanted to avoid saving that trace; it's pretty useless :/
> > The existing USED trace is already mostly pointless, the consistent
> > thing would be to remove both but that might be too radical.
> > 
> > But you're right in that I made a right mess of it. Not sure what's
> > best here.
> > 
> > Let me have a play.
> 
> How's something like this? It's bigger than I'd like, but I feel the
> result is more consistent/readable.
> 

Qian, could you confirm this works for you?

> ---
> diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
> index bb35b449f533..a55b1d314ae8 100644
> --- a/include/linux/lockdep_types.h
> +++ b/include/linux/lockdep_types.h
> @@ -35,8 +35,12 @@ enum lockdep_wait_type {
>  /*
>   * We'd rather not expose kernel/lockdep_states.h this wide, but we do need
>   * the total number of states... :-(
> + *
> + * XXX_LOCK_USAGE_STATES is the number of lines in lockdep_states.h, for each
> + * of those we generates 4 states, Additionally we (for now) report on USED.
>   */
> -#define XXX_LOCK_USAGE_STATES		(1+2*4)
> +#define XXX_LOCK_USAGE_STATES		2
> +#define LOCK_TRACE_STATES		(XXX_LOCK_USAGE_STATES*4 + 1)
>  
>  /*
>   * NR_LOCKDEP_CACHING_CLASSES ... Number of classes
> @@ -106,7 +110,7 @@ struct lock_class {
>  	 * IRQ/softirq usage tracking bits:
>  	 */
>  	unsigned long			usage_mask;
> -	const struct lock_trace		*usage_traces[XXX_LOCK_USAGE_STATES];
> +	const struct lock_trace		*usage_traces[LOCK_TRACE_STATES];
>  
>  	/*
>  	 * Generation counter, when doing certain classes of graph walking,
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 454355c033d2..4f98ac8b4575 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -600,6 +600,8 @@ static const char *usage_str[] =
>  #include "lockdep_states.h"
>  #undef LOCKDEP_STATE
>  	[LOCK_USED] = "INITIAL USE",
> +	[LOCK_USED_READ] = "INITIAL READ USE",
> +	/* abused as string storage for verify_lock_unused() */
>  	[LOCK_USAGE_STATES] = "IN-NMI",
>  };
>  #endif
> @@ -2231,7 +2233,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
>  #endif
>  	printk(KERN_CONT " {\n");
>  
> -	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
> +	for (bit = 0; bit < LOCK_TRACE_STATES; bit++) {
>  		if (class->usage_mask & (1 << bit)) {
>  			int len = depth;
>  
> @@ -4354,27 +4356,24 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
>  	old_mask = hlock_class(this)->usage_mask;
>  	hlock_class(this)->usage_mask |= new_mask;
>  
> -	/*
> -	 * Save one usage_traces[] entry and map both LOCK_USED and
> -	 * LOCK_USED_READ onto the same entry.
> -	 */
> -	if (new_bit == LOCK_USED || new_bit == LOCK_USED_READ) {
> -		if (old_mask & (LOCKF_USED | LOCKF_USED_READ))
> -			goto unlock;
> -		new_bit = LOCK_USED;
> +	if (new_bit < LOCK_TRACE_STATES) {
> +		if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
> +			return 0;
>  	}
>  
> -	if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
> -		return 0;
> -
>  	switch (new_bit) {
> +	case 0 ... LOCK_USED-1:
> +		ret = mark_lock_irq(curr, this, new_bit);
> +		if (!ret)
> +			return 0;
> +		break;
> +
>  	case LOCK_USED:
>  		debug_atomic_dec(nr_unused_locks);
>  		break;
> +
>  	default:
> -		ret = mark_lock_irq(curr, this, new_bit);
> -		if (!ret)
> -			return 0;
> +		break;
>  	}
>  
>  unlock:
> diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
> index b0be1560ed17..67dc46e46552 100644
> --- a/kernel/locking/lockdep_internals.h
> +++ b/kernel/locking/lockdep_internals.h
> @@ -20,9 +20,12 @@ enum lock_usage_bit {
>  #undef LOCKDEP_STATE
>  	LOCK_USED,
>  	LOCK_USED_READ,
> -	LOCK_USAGE_STATES
> +	LOCK_USAGE_STATES,
>  };
>  
> +/* states after LOCK_USED are not traced and printed */
> +static_assert(LOCK_TRACE_STATES == LOCK_USED_READ);
> +
>  #define LOCK_USAGE_READ_MASK 1
>  #define LOCK_USAGE_DIR_MASK  2
>  #define LOCK_USAGE_STATE_MASK (~(LOCK_USAGE_READ_MASK | LOCK_USAGE_DIR_MASK))
> @@ -121,7 +124,7 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
>  extern struct list_head all_lock_classes;
>  extern struct lock_chain lock_chains[];
>  
> -#define LOCK_USAGE_CHARS (1+LOCK_USAGE_STATES/2)
> +#define LOCK_USAGE_CHARS (2*XXX_LOCK_USAGE_STATES + 1)
>  
>  extern void get_usage_chars(struct lock_class *class,
>  			    char usage[LOCK_USAGE_CHARS]);

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: lockdep null-ptr-deref
  2020-09-30 19:02         ` Peter Zijlstra
@ 2020-10-02 12:36           ` Boqun Feng
  2020-10-02 13:09             ` Peter Zijlstra
  0 siblings, 1 reply; 35+ messages in thread
From: Boqun Feng @ 2020-10-02 12:36 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: g, Qian Cai, Waiman Long, Ingo Molnar, Will Deacon, linux-kernel

On Wed, Sep 30, 2020 at 09:02:28PM +0200, Peter Zijlstra wrote:
> On Wed, Sep 30, 2020 at 08:18:18PM +0800, Boqun Feng wrote:
> 
> > For one thing, I do think that LOCK_READ_USED trace is helpful for
> > better reporting, because if there is a read lock in the dependency path
> > which causes the deadlock, it's better to have the LOCK_READ_USED trace
> > to know at least the initial READ usage. For example, if we have
> > 
> > 	void f1(...)
> > 	{
> > 		write_lock(&A);
> > 		spin_lock(&C);
> > 		// A -> C
> > 		...
> > 	}
> > 
> > 	void g(...)
> > 	{
> > 		read_lock(&A);
> > 		...
> > 	}
> > 	void f2(...)
> > 	{
> > 		spin_lock(&B);
> > 		g(...);
> > 		// B -> A
> > 	}
> > 
> > 	void f3(...) {
> > 		spin_lock(&C);
> > 		spin_lock(&B);
> > 		// C -> B, trigger lockdep splat
> > 	}
> > 
> > when lockdep reports the deadlock (at the time f3() is called), it will
> > be useful if we have a trace like:
> > 
> > 	INITIAL READ usage at:
> > 	g+0x.../0x...
> > 	f2+0x.../0x...
> > 
> > Thoughts?
> 
> Wouldn't that also be in LOCK_ENABLED_*_READ ?
> 

But what if f2() is called with interrupt disabled? Or f2() disables
interrupt inside the function, like:

	void f2(...)
	{
		local_irq_disable();
		spin_lock(&B);
		g(...);
		...
		local_irq_enable();
	}

In this case, there wouldn't be any LOCK_ENABLED_*_READ usage for
rwlock_t A. As a result, we won't see it in the lockdep splat.

Regards,
Boqun

> That is, with PROVE_LOCKING on, the initial usage is bound to set more
> states, except for !check||trylock usage, and those aren't really all
> that interesting.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: lockdep null-ptr-deref
  2020-10-02 12:36           ` Boqun Feng
@ 2020-10-02 13:09             ` Peter Zijlstra
  2020-10-02 13:35               ` Boqun Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Zijlstra @ 2020-10-02 13:09 UTC (permalink / raw)
  To: Boqun Feng
  Cc: g, Qian Cai, Waiman Long, Ingo Molnar, Will Deacon, linux-kernel

On Fri, Oct 02, 2020 at 08:36:02PM +0800, Boqun Feng wrote:

> But what if f2() is called with interrupt disabled? Or f2() disables
> interrupt inside the function, like:
> 
> 	void f2(...)
> 	{
> 		local_irq_disable();
> 		spin_lock(&B);
> 		g(...);
> 		...
> 		local_irq_enable();
> 	}
> 
> In this case, there wouldn't be any LOCK_ENABLED_*_READ usage for
> rwlock_t A. As a result, we won't see it in the lockdep splat.

Hurm, fair enough. So just to make sure, you're arguing for:

-#define LOCK_TRACE_STATES              (XXX_LOCK_USAGE_STATES*4 + 1)
+#define LOCK_TRACE_STATES              (XXX_LOCK_USAGE_STATES*4 + 2)

On top of my earlier patch, right?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: lockdep null-ptr-deref
  2020-10-02 13:09             ` Peter Zijlstra
@ 2020-10-02 13:35               ` Boqun Feng
  0 siblings, 0 replies; 35+ messages in thread
From: Boqun Feng @ 2020-10-02 13:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: g, Qian Cai, Waiman Long, Ingo Molnar, Will Deacon, linux-kernel

On Fri, Oct 02, 2020 at 03:09:29PM +0200, Peter Zijlstra wrote:
> On Fri, Oct 02, 2020 at 08:36:02PM +0800, Boqun Feng wrote:
> 
> > But what if f2() is called with interrupt disabled? Or f2() disables
> > interrupt inside the function, like:
> > 
> > 	void f2(...)
> > 	{
> > 		local_irq_disable();
> > 		spin_lock(&B);
> > 		g(...);
> > 		...
> > 		local_irq_enable();
> > 	}
> > 
> > In this case, there wouldn't be any LOCK_ENABLED_*_READ usage for
> > rwlock_t A. As a result, we won't see it in the lockdep splat.
> 
> Hurm, fair enough. So just to make sure, you're arguing for:
> 
> -#define LOCK_TRACE_STATES              (XXX_LOCK_USAGE_STATES*4 + 1)
> +#define LOCK_TRACE_STATES              (XXX_LOCK_USAGE_STATES*4 + 2)
> 
> On top of my earlier patch, right?

Yep. Thanks ;-)

Regards,
Boqun

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: lockdep null-ptr-deref
  2020-10-02 10:06       ` Peter Zijlstra
@ 2020-10-02 13:40         ` Qian Cai
  0 siblings, 0 replies; 35+ messages in thread
From: Qian Cai @ 2020-10-02 13:40 UTC (permalink / raw)
  To: Peter Zijlstra, Boqun Feng
  Cc: Waiman Long, Ingo Molnar, Will Deacon, linux-kernel

On Fri, 2020-10-02 at 12:06 +0200, Peter Zijlstra wrote:
> On Wed, Sep 30, 2020 at 11:49:37AM +0200, Peter Zijlstra wrote:
> > On Wed, Sep 30, 2020 at 11:16:11AM +0200, Peter Zijlstra wrote:
> > > On Wed, Sep 30, 2020 at 07:08:23AM +0800, Boqun Feng wrote:
> > > > I think there are two problems here:
> > > > 
> > > > 1) the "(null)" means we don't have the "usage_str" for a usage bit,
> > > > which I think is the LOCK_USED_READ bit introduced by Peter at
> > > > 23870f122768 ('locking/lockdep: Fix "USED" <- "IN-NMI" inversions').
> > > > 
> > > > 2) the next null-ptr-deref, and I think this is also caused by
> > > > LOCK_USED_READ bit, because in the loop inside
> > > > print_lock_class_header(), we iterate from 0 to LOCK_USAGE_STATES (which
> > > > is 4*2 + 3), however the class->usage_traces[] only has
> > > > XXX_LOCK_USAGE_STATES (which is 4*2 + 1) elements, so if we have
> > > > LOCK_USED_READ bit set in ->usage_mask, we will try to access an element
> > > > out of the ->usage_traces[] array.
> > > > 
> > > > Probably the following helps? And another possible fix is to enlarge the
> > > > ->usage_trace[] array and record the call trace of LOCK_READ_USED.
> > > 
> > > Urgh.. yeah, I wanted to avoid saving that trace; it's pretty useless :/
> > > The existing USED trace is already mostly pointless, the consistent
> > > thing would be to remove both but that might be too radical.
> > > 
> > > But you're right in that I made a right mess of it. Not sure what's
> > > best here.
> > > 
> > > Let me have a play.
> > 
> > How's something like this? It's bigger than I'd like, but I feel the
> > result is more consistent/readable.
> > 
> 
> Qian, could you confirm this works for you?

It works fine. No crash. The new report looks like this:

[   29.439920] ========================================================
[   29.449225] WARNING: possible irq lock inversion dependency detected
[   29.453195] 5.9.0-rc7-next-20201002+ #5 Not tainted
[   29.456023] --------------------------------------------------------
[   29.459012] swapper/6/0 just changed the state of lock:
[   29.461442] ffff8889e8cc0818 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x1e/0x530 [libata]
[   29.462752] scsi 0:0:0:0: Direct-Access     ATA      QEMU HARDDISK    2.5+ PQ: 0 ANSI: 5
[   29.465625] but this lock took another, HARDIRQ-READ-unsafe lock in the past:
[   29.465626]  (&trig->leddev_list_lock){.+.?}-{2:2}
[   29.465629] 
               
               and interrupts could create inverse lock ordering between them.

[   29.465630] 
               other info that might help us debug this:
[   29.465631]  Possible interrupt unsafe locking scenario:

[   29.465632]        CPU0                    CPU1
[   29.465638]        ----                    ----
[   29.490458]   lock(&trig->leddev_list_lock);
[   29.492455]                                local_irq_disable();
[   29.495248]                                lock(&host->lock);
[   29.497882]                                lock(&trig->leddev_list_lock);
[   29.501018]   <Interrupt>
[   29.502293]     lock(&host->lock);
[   29.503853] 
                *** DEADLOCK ***

[   29.506636] no locks held by swapper/6/0.
[   29.508558] 
               the shortest dependencies between 2nd lock and 1st lock:
[   29.512201]  -> (&trig->leddev_list_lock){.+.?}-{2:2} {
[   29.514614]     HARDIRQ-ON-R at:
[   29.516194]                       lock_acquire+0x17f/0x7e0
[   29.518755]                       _raw_read_lock+0x38/0x70
[   29.521310]                       led_trigger_event+0x2b/0xb0
[   29.523945]                       kbd_propagate_led_state+0x5d/0x80
[   29.526858]                       kbd_bh+0x14d/0x1d0
[   29.529139]                       tasklet_action_common.isra.13+0x23a/0x2e0
[   29.532338]                       __do_softirq+0x1ce/0x828
[   29.534847]                       run_ksoftirqd+0x26/0x50
[   29.537390]                       smpboot_thread_fn+0x30f/0x740
[   29.540094]                       kthread+0x357/0x420
[   29.542468]                       ret_from_fork+0x22/0x30
[   29.544957]     IN-SOFTIRQ-R at:
[   29.546487]                       lock_acquire+0x17f/0x7e0
[   29.549003]                       _raw_read_lock+0x38/0x70
[   29.551577]                       led_trigger_event+0x2b/0xb0
[   29.554257]                       kbd_propagate_led_state+0x5d/0x80
[   29.557200]                       kbd_bh+0x14d/0x1d0
[   29.559497]                       tasklet_action_common.isra.13+0x23a/0x2e0
[   29.562705]                       __do_softirq+0x1ce/0x828
[   29.565315]                       run_ksoftirqd+0x26/0x50
[   29.567798]                       smpboot_thread_fn+0x30f/0x740
[   29.570543]                       kthread+0x357/0x420
[   29.572867]                       ret_from_fork+0x22/0x30
[   29.575428]     SOFTIRQ-ON-R at:
[   29.576915]                       lock_acquire+0x17f/0x7e0
[   29.579460]                       _raw_read_lock+0x5d/0x70
[   29.581954]                       led_trigger_event+0x2b/0xb0
[   29.584621]                       kbd_propagate_led_state+0x5d/0x80
[   29.587551]                       kbd_start+0xd2/0xf0
[   29.589901]                       input_register_handle+0x282/0x4f0
[   29.592811]                       kbd_connect+0xc0/0x120
[   29.595347]                       input_attach_handler+0x10a/0x170
[   29.598212]                       input_register_device.cold.22+0xac/0x29d
[   29.601390]                       atkbd_connect+0x58f/0x810
[   29.603950]                       serio_connect_driver+0x4a/0x70
[   29.606791]                       really_probe+0x222/0xb20
[   29.609369]                       driver_probe_device+0x1f6/0x380
[   29.612193]                       device_driver_attach+0xea/0x120
[   29.615002]                       __driver_attach+0xf5/0x270
[   29.617660]                       bus_for_each_dev+0x11c/0x1b0
[   29.620370]                       serio_handle_event+0x1df/0x7f0
[   29.623129]                       process_one_work+0x842/0x1410
[   29.625874]                       worker_thread+0x87/0xb40
[   29.628428]                       kthread+0x357/0x420
[   29.630740]                       ret_from_fork+0x22/0x30
[   29.633256]     INITIAL USE at:
[   29.634723]                      lock_acquire+0x17f/0x7e0
[   29.637236]                      _raw_write_lock_irqsave+0x35/0x50
[   29.640049]                      led_trigger_set+0x44b/0xb70
[   29.642692]                      led_trigger_set_default+0x163/0x200
[   29.645672]                      led_classdev_register_ext+0x482/0x670
[   29.648689]                      input_leds_connect+0x375/0x620
[   29.651449]                      input_attach_handler+0x10a/0x170
[   29.654293]                      input_register_device.cold.22+0xac/0x29d
[   29.657468]                      atkbd_connect+0x58f/0x810
[   29.659979]                      serio_connect_driver+0x4a/0x70
[   29.662731]                      really_probe+0x222/0xb20
[   29.665297]                      driver_probe_device+0x1f6/0x380
[   29.668039]                      device_driver_attach+0xea/0x120
[   29.670817]                      __driver_attach+0xf5/0x270
[   29.673510]                      bus_for_each_dev+0x11c/0x1b0
[   29.676237]                      serio_handle_event+0x1df/0x7f0
[   29.678962]                      process_one_work+0x842/0x1410
[   29.681676]                      worker_thread+0x87/0xb40
[   29.684198]                      kthread+0x357/0x420
[   29.686523]                      ret_from_fork+0x22/0x30
[   29.688958]   }
[   29.689836]   ... key      at: [<ffffffffbea9c780>] __key.33862+0x0/0x40
[   29.692932]   ... acquired at:
[   29.694395]    lock_acquire+0x17f/0x7e0
[   29.696221]    _raw_read_lock+0x5d/0x70
[   29.697972]    led_trigger_blink_oneshot+0x39/0xd0
[   29.700232]    ledtrig_disk_activity+0x77/0xf0
[   29.702322]    ata_qc_complete+0x63/0xd70 [libata]
[   29.704542]    ata_do_link_abort+0x1ae/0x340 [libata]
[   29.706909]    ata_sff_hsm_move+0x55f/0x1a30 [libata]
[   29.709298]    ata_sff_pio_task+0x397/0x4a0 [libata]
[   29.711582]    process_one_work+0x842/0x1410
[   29.713567]    worker_thread+0x87/0xb40
[   29.715392]    kthread+0x357/0x420
[   29.716945]    ret_from_fork+0x22/0x30

[   29.719446] -> (&host->lock){-...}-{2:2} {
[   29.721368]    IN-HARDIRQ-W at:
[   29.722817]                     lock_acquire+0x17f/0x7e0
[   29.725316]                     _raw_spin_lock_irqsave+0x35/0x50
[   29.728053]                     ata_bmdma_interrupt+0x1e/0x530 [libata]
[   29.731122]                     __handle_irq_event_percpu+0xeb/0x5f0
[   29.734033]                     handle_irq_event_percpu+0x73/0x140
[   29.736929]                     handle_irq_event+0xa1/0x130
[   29.739532]                     handle_edge_irq+0x1ee/0x770
[   29.742097]                     asm_call_irq_on_stack+0x12/0x20
[   29.744885]                     common_interrupt+0xe2/0x180
[   29.747514]                     asm_common_interrupt+0x1e/0x40
[   29.750208]                     default_idle+0x18/0x20
[   29.752579]                     default_idle_call+0xa0/0x370
[   29.755225]                     do_idle+0x3c1/0x570
[   29.757515]                     cpu_startup_entry+0x19/0x20
[   29.760094]                     start_secondary+0x250/0x2f0
[   29.762690]                     secondary_startup_64_no_verify+0xb8/0xbb
[   29.765805]    INITIAL USE at:
[   29.767270]                    lock_acquire+0x17f/0x7e0
[   29.769671]                    _raw_spin_lock_irqsave+0x35/0x50
[   29.772429]                    ata_dev_init+0x195/0x3c0 [libata]
[   29.775241]                    ata_link_init+0x19f/0x300 [libata]
[   29.778018]                    ata_port_alloc+0x498/0x580 [libata]
[   29.780885]                    ata_host_alloc+0x1c4/0x250 [libata]
[   29.783751]                    ata_host_alloc_pinfo+0x1d/0x540 [libata]
[   29.786840]                    ata_pci_sff_prepare_host+0x3c/0xb0 [libata]
[   29.790017]                    ata_pci_bmdma_prepare_host+0x12/0x60 [libata]
[   29.793318]                    piix_init_one+0x3f3/0xb62 [ata_piix]
[   29.796259]                    local_pci_probe+0xd8/0x170
[   29.798752]                    pci_device_probe+0x322/0x600
[   29.801363]                    really_probe+0x222/0xb20
[   29.803767]                    driver_probe_device+0x1f6/0x380
[   29.806511]                    device_driver_attach+0xea/0x120
[   29.809223]                    __driver_attach+0xf5/0x270
[   29.811716]                    bus_for_each_dev+0x11c/0x1b0
[   29.814320]                    bus_add_driver+0x39e/0x560
[   29.816821]                    driver_register+0x18e/0x310
[   29.819389]                    piix_init+0x1f/0x1000 [ata_piix]
[   29.822116]                    do_one_initcall+0xbb/0x3e0
[   29.824649]                    do_init_module+0x1f2/0x760
[   29.827156]                    load_module+0x5846/0x6e80
[   29.829595]                    __do_sys_finit_module+0x11a/0x1b0
[   29.832371]                    do_syscall_64+0x33/0x40
[   29.834714]                    entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   29.837755]  }
[   29.838569]  ... key      at: [<ffffffffc0552600>] __key.69338+0x0/0xfffffffffffbfa00 [libata]
[   29.842534]  ... acquired at:
[   29.843920]    __lock_acquire+0x14ec/0x3ac0
[   29.845895]    lock_acquire+0x17f/0x7e0
[   29.847707]    _raw_spin_lock_irqsave+0x35/0x50
[   29.849831]    ata_bmdma_interrupt+0x1e/0x530 [libata]
[   29.852253]    __handle_irq_event_percpu+0xeb/0x5f0
[   29.854480]    handle_irq_event_percpu+0x73/0x140
[   29.856678]    handle_irq_event+0xa1/0x130
[   29.858581]    handle_edge_irq+0x1ee/0x770
[   29.860507]    asm_call_irq_on_stack+0x12/0x20
[   29.862570]    common_interrupt+0xe2/0x180
[   29.864484]    asm_common_interrupt+0x1e/0x40
[   29.866517]    default_idle+0x18/0x20
[   29.868232]    default_idle_call+0xa0/0x370
[   29.870172]    do_idle+0x3c1/0x570
[   29.871747]    cpu_startup_entry+0x19/0x20
[   29.873655]    start_secondary+0x250/0x2f0
[   29.875580]    secondary_startup_64_no_verify+0xb8/0xbb

[   29.878701] 
               stack backtrace:
[   29.880737] CPU: 6 PID: 0 Comm: swapper/6 Not tainted 5.9.0-rc7-next-20201002+ #5
[   29.884206] Hardware name: Red Hat KVM, BIOS 1.14.0-1.module+el8.3.0+7638+07cf13d2 04/01/2014
[   29.888111] Call Trace:
[   29.889288]  <IRQ>
[   29.890254]  dump_stack+0x99/0xcb
[   29.891778]  mark_lock.part.48+0x131c/0x1be0
[   29.893756]  ? print_usage_bug+0x1f0/0x1f0
[   29.895678]  ? print_usage_bug+0x1f0/0x1f0
[   29.897582]  ? mark_lock.part.48+0x105/0x1be0
[   29.899595]  ? mark_lock.part.48+0x105/0x1be0
[   29.901607]  ? print_usage_bug+0x1f0/0x1f0
[   29.903517]  __lock_acquire+0x14ec/0x3ac0
[   29.905399]  ? __bpf_trace_lock+0xf0/0xf0
[   29.907257]  ? lockdep_hardirqs_on_prepare+0x4d0/0x4d0
[   29.909614]  ? rcu_read_lock_sched_held+0xa1/0xd0
[   29.911775]  lock_acquire+0x17f/0x7e0
[   29.913499]  ? ata_bmdma_interrupt+0x1e/0x530 [libata]
[   29.915868]  ? rcu_read_unlock+0x40/0x40
[   29.917690]  ? __lock_acquire+0x1864/0x3ac0
[   29.919637]  _raw_spin_lock_irqsave+0x35/0x50
[   29.921657]  ? ata_bmdma_interrupt+0x1e/0x530 [libata]
[   29.924010]  ? ata_bmdma_port_intr+0x320/0x320 [libata]
[   29.926469]  ata_bmdma_interrupt+0x1e/0x530 [libata]
[   29.928744]  ? rcu_read_lock_bh_held+0xb0/0xb0
[   29.930795]  ? find_held_lock+0x33/0x1c0
[   29.932625]  ? ata_bmdma_port_intr+0x320/0x320 [libata]
[   29.935034]  __handle_irq_event_percpu+0xeb/0x5f0
[   29.937254]  handle_irq_event_percpu+0x73/0x140
[   29.939369]  ? __handle_irq_event_percpu+0x5f0/0x5f0
[   29.941659]  ? rwlock_bug.part.1+0x90/0x90
[   29.943568]  ? do_raw_spin_unlock+0x14b/0x230
[   29.945613]  handle_irq_event+0xa1/0x130
[   29.947466]  handle_edge_irq+0x1ee/0x770
[   29.949301]  asm_call_irq_on_stack+0x12/0x20
[   29.951291]  </IRQ>
[   29.952297]  common_interrupt+0xe2/0x180
[   29.954103]  asm_common_interrupt+0x1e/0x40
[   29.956072] RIP: 0010:default_idle+0x18/0x20
[   29.958068] Code: 5d e9 7c 52 35 ff cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 e8 c6 a4 69 fe e9 07 00 00 00 0f 00 2d 0a 2a 5a 00 fb f4 <c3> 0f 1f 80 00 00 00 00 0f 1f 44 00 00 53 be 08 00 00 00 65 48 8b
[   29.966729] RSP: 0018:ffff8881068d7df8 EFLAGS: 00000202
[   29.969173] RAX: 000000000001235b RBX: 0000000000000006 RCX: ffffffffbac561f0
[   29.972467] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffbac7bf0a
[   29.975767] RBP: 0000000000000006 R08: 0000000000000001 R09: 0000000000000001
[   29.979051] R10: ffff888a7eb3606b R11: ffffed114fd66c0d R12: ffffffffbc5718a0
[   29.982377] R13: 0000000000000000 R14: 1ffff11020d1afc6 R15: 0000000000000000
[   29.985667]  ? rcu_eqs_enter.constprop.74+0xa0/0xc0
[   29.987913]  ? default_idle+0xa/0x20
[   29.989593]  ? default_idle+0xa/0x20
[   29.991282]  default_idle_call+0xa0/0x370
[   29.993228]  do_idle+0x3c1/0x570
[   29.994740]  ? arch_cpu_idle_exit+0x40/0x40
[   29.996670]  ? lockdep_hardirqs_on_prepare+0x32b/0x4d0
[   29.999021]  ? _raw_spin_unlock_irqrestore+0x39/0x40
[   30.001355]  cpu_startup_entry+0x19/0x20
[   30.003178]  start_secondary+0x250/0x2f0
[   30.004986]  ? set_cpu_sibling_map+0x2430/0x2430
[   30.007155]  ? start_cpu0+0xc/0xc
[   30.008700]  secondary_startup_64_no_verify+0xb8/0xbb



^ permalink raw reply	[flat|nested] 35+ messages in thread

* [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-09-30  9:49     ` Peter Zijlstra
  2020-09-30 12:18       ` Boqun Feng
  2020-10-02 10:06       ` Peter Zijlstra
@ 2020-10-07 16:20       ` tip-bot2 for Peter Zijlstra
  2020-10-27 11:29         ` Chris Wilson
  2020-10-09  7:58       ` [tip: locking/core] lockdep: Fix usage_traceoverflow tip-bot2 for Peter Zijlstra
  3 siblings, 1 reply; 35+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-10-07 16:20 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Qian Cai, Peter Zijlstra (Intel), x86, LKML

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     24d5a3bffef117ed90685f285c6c9d2faa3a02b4
Gitweb:        https://git.kernel.org/tip/24d5a3bffef117ed90685f285c6c9d2faa3a02b4
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 30 Sep 2020 11:49:37 +02:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Wed, 07 Oct 2020 18:14:17 +02:00

lockdep: Fix usage_traceoverflow

Basically print_lock_class_header()'s for loop is out of sync with the
the size of of ->usage_traces[].

Clean things up a bit.

Fixes: 23870f122768 ("locking/lockdep: Fix "USED" <- "IN-NMI" inversions")
Reported-by: Qian Cai <cai@redhat.com>
Debugged-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Qian Cai <cai@redhat.com>
Link: https://lkml.kernel.org/r/20200930094937.GE2651@hirez.programming.kicks-ass.net
---
 include/linux/lockdep_types.h      |  8 +++++--
 kernel/locking/lockdep.c           | 32 +++++++++++++----------------
 kernel/locking/lockdep_internals.h |  7 ++++--
 3 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
index bb35b44..9a1fd49 100644
--- a/include/linux/lockdep_types.h
+++ b/include/linux/lockdep_types.h
@@ -35,8 +35,12 @@ enum lockdep_wait_type {
 /*
  * We'd rather not expose kernel/lockdep_states.h this wide, but we do need
  * the total number of states... :-(
+ *
+ * XXX_LOCK_USAGE_STATES is the number of lines in lockdep_states.h, for each
+ * of those we generates 4 states, Additionally we report on USED and USED_READ.
  */
-#define XXX_LOCK_USAGE_STATES		(1+2*4)
+#define XXX_LOCK_USAGE_STATES		2
+#define LOCK_TRACE_STATES		(XXX_LOCK_USAGE_STATES*4 + 2)
 
 /*
  * NR_LOCKDEP_CACHING_CLASSES ... Number of classes
@@ -106,7 +110,7 @@ struct lock_class {
 	 * IRQ/softirq usage tracking bits:
 	 */
 	unsigned long			usage_mask;
-	const struct lock_trace		*usage_traces[XXX_LOCK_USAGE_STATES];
+	const struct lock_trace		*usage_traces[LOCK_TRACE_STATES];
 
 	/*
 	 * Generation counter, when doing certain classes of graph walking,
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index cdb3892..b3b3161 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -600,6 +600,8 @@ static const char *usage_str[] =
 #include "lockdep_states.h"
 #undef LOCKDEP_STATE
 	[LOCK_USED] = "INITIAL USE",
+	[LOCK_USED_READ] = "INITIAL READ USE",
+	/* abused as string storage for verify_lock_unused() */
 	[LOCK_USAGE_STATES] = "IN-NMI",
 };
 #endif
@@ -2252,7 +2254,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 #endif
 	printk(KERN_CONT " {\n");
 
-	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
+	for (bit = 0; bit < LOCK_TRACE_STATES; bit++) {
 		if (class->usage_mask & (1 << bit)) {
 			int len = depth;
 
@@ -4345,7 +4347,7 @@ static int separate_irq_context(struct task_struct *curr,
 static int mark_lock(struct task_struct *curr, struct held_lock *this,
 			     enum lock_usage_bit new_bit)
 {
-	unsigned int old_mask, new_mask, ret = 1;
+	unsigned int new_mask, ret = 1;
 
 	if (new_bit >= LOCK_USAGE_STATES) {
 		DEBUG_LOCKS_WARN_ON(1);
@@ -4372,30 +4374,26 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	if (unlikely(hlock_class(this)->usage_mask & new_mask))
 		goto unlock;
 
-	old_mask = hlock_class(this)->usage_mask;
 	hlock_class(this)->usage_mask |= new_mask;
 
-	/*
-	 * Save one usage_traces[] entry and map both LOCK_USED and
-	 * LOCK_USED_READ onto the same entry.
-	 */
-	if (new_bit == LOCK_USED || new_bit == LOCK_USED_READ) {
-		if (old_mask & (LOCKF_USED | LOCKF_USED_READ))
-			goto unlock;
-		new_bit = LOCK_USED;
+	if (new_bit < LOCK_TRACE_STATES) {
+		if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
+			return 0;
 	}
 
-	if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
-		return 0;
-
 	switch (new_bit) {
+	case 0 ... LOCK_USED-1:
+		ret = mark_lock_irq(curr, this, new_bit);
+		if (!ret)
+			return 0;
+		break;
+
 	case LOCK_USED:
 		debug_atomic_dec(nr_unused_locks);
 		break;
+
 	default:
-		ret = mark_lock_irq(curr, this, new_bit);
-		if (!ret)
-			return 0;
+		break;
 	}
 
 unlock:
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index b0be156..de49f9e 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -20,9 +20,12 @@ enum lock_usage_bit {
 #undef LOCKDEP_STATE
 	LOCK_USED,
 	LOCK_USED_READ,
-	LOCK_USAGE_STATES
+	LOCK_USAGE_STATES,
 };
 
+/* states after LOCK_USED_READ are not traced and printed */
+static_assert(LOCK_TRACE_STATES == LOCK_USAGE_STATES);
+
 #define LOCK_USAGE_READ_MASK 1
 #define LOCK_USAGE_DIR_MASK  2
 #define LOCK_USAGE_STATE_MASK (~(LOCK_USAGE_READ_MASK | LOCK_USAGE_DIR_MASK))
@@ -121,7 +124,7 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
 extern struct list_head all_lock_classes;
 extern struct lock_chain lock_chains[];
 
-#define LOCK_USAGE_CHARS (1+LOCK_USAGE_STATES/2)
+#define LOCK_USAGE_CHARS (2*XXX_LOCK_USAGE_STATES + 1)
 
 extern void get_usage_chars(struct lock_class *class,
 			    char usage[LOCK_USAGE_CHARS]);

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-09-30  9:49     ` Peter Zijlstra
                         ` (2 preceding siblings ...)
  2020-10-07 16:20       ` [tip: locking/core] lockdep: Fix usage_traceoverflow tip-bot2 for Peter Zijlstra
@ 2020-10-09  7:58       ` tip-bot2 for Peter Zijlstra
  3 siblings, 0 replies; 35+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-10-09  7:58 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Qian Cai, Peter Zijlstra (Intel), Ingo Molnar, x86, LKML

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     2bb8945bcc1a768f2bc402a16c9610bba8d5187d
Gitweb:        https://git.kernel.org/tip/2bb8945bcc1a768f2bc402a16c9610bba8d5187d
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 30 Sep 2020 11:49:37 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Fri, 09 Oct 2020 08:53:08 +02:00

lockdep: Fix usage_traceoverflow

Basically print_lock_class_header()'s for loop is out of sync with the
the size of of ->usage_traces[].

Also clean things up a bit while at it, to avoid such mishaps in the future.

Fixes: 23870f122768 ("locking/lockdep: Fix "USED" <- "IN-NMI" inversions")
Reported-by: Qian Cai <cai@redhat.com>
Debugged-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Qian Cai <cai@redhat.com>
Link: https://lkml.kernel.org/r/20200930094937.GE2651@hirez.programming.kicks-ass.net
---
 include/linux/lockdep_types.h      |  8 +++++--
 kernel/locking/lockdep.c           | 32 +++++++++++++----------------
 kernel/locking/lockdep_internals.h |  7 ++++--
 3 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
index bb35b44..9a1fd49 100644
--- a/include/linux/lockdep_types.h
+++ b/include/linux/lockdep_types.h
@@ -35,8 +35,12 @@ enum lockdep_wait_type {
 /*
  * We'd rather not expose kernel/lockdep_states.h this wide, but we do need
  * the total number of states... :-(
+ *
+ * XXX_LOCK_USAGE_STATES is the number of lines in lockdep_states.h, for each
+ * of those we generates 4 states, Additionally we report on USED and USED_READ.
  */
-#define XXX_LOCK_USAGE_STATES		(1+2*4)
+#define XXX_LOCK_USAGE_STATES		2
+#define LOCK_TRACE_STATES		(XXX_LOCK_USAGE_STATES*4 + 2)
 
 /*
  * NR_LOCKDEP_CACHING_CLASSES ... Number of classes
@@ -106,7 +110,7 @@ struct lock_class {
 	 * IRQ/softirq usage tracking bits:
 	 */
 	unsigned long			usage_mask;
-	const struct lock_trace		*usage_traces[XXX_LOCK_USAGE_STATES];
+	const struct lock_trace		*usage_traces[LOCK_TRACE_STATES];
 
 	/*
 	 * Generation counter, when doing certain classes of graph walking,
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 2facbbd..a430fbb 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -585,6 +585,8 @@ static const char *usage_str[] =
 #include "lockdep_states.h"
 #undef LOCKDEP_STATE
 	[LOCK_USED] = "INITIAL USE",
+	[LOCK_USED_READ] = "INITIAL READ USE",
+	/* abused as string storage for verify_lock_unused() */
 	[LOCK_USAGE_STATES] = "IN-NMI",
 };
 #endif
@@ -1939,7 +1941,7 @@ static void print_lock_class_header(struct lock_class *class, int depth)
 #endif
 	printk(KERN_CONT " {\n");
 
-	for (bit = 0; bit < LOCK_USAGE_STATES; bit++) {
+	for (bit = 0; bit < LOCK_TRACE_STATES; bit++) {
 		if (class->usage_mask & (1 << bit)) {
 			int len = depth;
 
@@ -3969,7 +3971,7 @@ static int separate_irq_context(struct task_struct *curr,
 static int mark_lock(struct task_struct *curr, struct held_lock *this,
 			     enum lock_usage_bit new_bit)
 {
-	unsigned int old_mask, new_mask, ret = 1;
+	unsigned int new_mask, ret = 1;
 
 	if (new_bit >= LOCK_USAGE_STATES) {
 		DEBUG_LOCKS_WARN_ON(1);
@@ -3996,30 +3998,26 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	if (unlikely(hlock_class(this)->usage_mask & new_mask))
 		goto unlock;
 
-	old_mask = hlock_class(this)->usage_mask;
 	hlock_class(this)->usage_mask |= new_mask;
 
-	/*
-	 * Save one usage_traces[] entry and map both LOCK_USED and
-	 * LOCK_USED_READ onto the same entry.
-	 */
-	if (new_bit == LOCK_USED || new_bit == LOCK_USED_READ) {
-		if (old_mask & (LOCKF_USED | LOCKF_USED_READ))
-			goto unlock;
-		new_bit = LOCK_USED;
+	if (new_bit < LOCK_TRACE_STATES) {
+		if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
+			return 0;
 	}
 
-	if (!(hlock_class(this)->usage_traces[new_bit] = save_trace()))
-		return 0;
-
 	switch (new_bit) {
+	case 0 ... LOCK_USED-1:
+		ret = mark_lock_irq(curr, this, new_bit);
+		if (!ret)
+			return 0;
+		break;
+
 	case LOCK_USED:
 		debug_atomic_dec(nr_unused_locks);
 		break;
+
 	default:
-		ret = mark_lock_irq(curr, this, new_bit);
-		if (!ret)
-			return 0;
+		break;
 	}
 
 unlock:
diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
index b0be156..de49f9e 100644
--- a/kernel/locking/lockdep_internals.h
+++ b/kernel/locking/lockdep_internals.h
@@ -20,9 +20,12 @@ enum lock_usage_bit {
 #undef LOCKDEP_STATE
 	LOCK_USED,
 	LOCK_USED_READ,
-	LOCK_USAGE_STATES
+	LOCK_USAGE_STATES,
 };
 
+/* states after LOCK_USED_READ are not traced and printed */
+static_assert(LOCK_TRACE_STATES == LOCK_USAGE_STATES);
+
 #define LOCK_USAGE_READ_MASK 1
 #define LOCK_USAGE_DIR_MASK  2
 #define LOCK_USAGE_STATE_MASK (~(LOCK_USAGE_READ_MASK | LOCK_USAGE_DIR_MASK))
@@ -121,7 +124,7 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
 extern struct list_head all_lock_classes;
 extern struct lock_chain lock_chains[];
 
-#define LOCK_USAGE_CHARS (1+LOCK_USAGE_STATES/2)
+#define LOCK_USAGE_CHARS (2*XXX_LOCK_USAGE_STATES + 1)
 
 extern void get_usage_chars(struct lock_class *class,
 			    char usage[LOCK_USAGE_CHARS]);

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-07 16:20       ` [tip: locking/core] lockdep: Fix usage_traceoverflow tip-bot2 for Peter Zijlstra
@ 2020-10-27 11:29         ` Chris Wilson
  2020-10-27 11:59           ` Peter Zijlstra
  0 siblings, 1 reply; 35+ messages in thread
From: Chris Wilson @ 2020-10-27 11:29 UTC (permalink / raw)
  To: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra
  Cc: Qian Cai, Peter Zijlstra, x86, LKML

Quoting tip-bot2 for Peter Zijlstra (2020-10-07 17:20:13)
> The following commit has been merged into the locking/core branch of tip:
> 
> Commit-ID:     24d5a3bffef117ed90685f285c6c9d2faa3a02b4
> Gitweb:        https://git.kernel.org/tip/24d5a3bffef117ed90685f285c6c9d2faa3a02b4
> Author:        Peter Zijlstra <peterz@infradead.org>
> AuthorDate:    Wed, 30 Sep 2020 11:49:37 +02:00
> Committer:     Peter Zijlstra <peterz@infradead.org>
> CommitterDate: Wed, 07 Oct 2020 18:14:17 +02:00
> 
> lockdep: Fix usage_traceoverflow
> 
> Basically print_lock_class_header()'s for loop is out of sync with the
> the size of of ->usage_traces[].

We're hitting a problem,

	$ cat /proc/lockdep_stats

upon boot generates:

[   29.465702] DEBUG_LOCKS_WARN_ON(debug_atomic_read(nr_unused_locks) != nr_unused)
[   29.465716] WARNING: CPU: 0 PID: 488 at kernel/locking/lockdep_proc.c:256 lockdep_stats_show+0xa33/0xac0

that bisected to this patch. Only just completed the bisection and
thought you would like a heads up.
-Chris

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-27 11:29         ` Chris Wilson
@ 2020-10-27 11:59           ` Peter Zijlstra
  2020-10-27 12:30             ` Peter Zijlstra
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Zijlstra @ 2020-10-27 11:59 UTC (permalink / raw)
  To: Chris Wilson
  Cc: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra,
	Qian Cai, x86

On Tue, Oct 27, 2020 at 11:29:35AM +0000, Chris Wilson wrote:
> Quoting tip-bot2 for Peter Zijlstra (2020-10-07 17:20:13)
> > The following commit has been merged into the locking/core branch of tip:
> > 
> > Commit-ID:     24d5a3bffef117ed90685f285c6c9d2faa3a02b4
> > Gitweb:        https://git.kernel.org/tip/24d5a3bffef117ed90685f285c6c9d2faa3a02b4
> > Author:        Peter Zijlstra <peterz@infradead.org>
> > AuthorDate:    Wed, 30 Sep 2020 11:49:37 +02:00
> > Committer:     Peter Zijlstra <peterz@infradead.org>
> > CommitterDate: Wed, 07 Oct 2020 18:14:17 +02:00
> > 
> > lockdep: Fix usage_traceoverflow
> > 
> > Basically print_lock_class_header()'s for loop is out of sync with the
> > the size of of ->usage_traces[].
> 
> We're hitting a problem,
> 
> 	$ cat /proc/lockdep_stats
> 
> upon boot generates:
> 
> [   29.465702] DEBUG_LOCKS_WARN_ON(debug_atomic_read(nr_unused_locks) != nr_unused)
> [   29.465716] WARNING: CPU: 0 PID: 488 at kernel/locking/lockdep_proc.c:256 lockdep_stats_show+0xa33/0xac0
> 
> that bisected to this patch. Only just completed the bisection and
> thought you would like a heads up.

Oh hey, that's 'curious'... it does indeed trivially reproduce, let me
have a poke.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-27 11:59           ` Peter Zijlstra
@ 2020-10-27 12:30             ` Peter Zijlstra
  2020-10-27 12:48               ` Peter Zijlstra
  2020-10-27 13:29               ` [tip: locking/core] lockdep: Fix usage_traceoverflow Chris Wilson
  0 siblings, 2 replies; 35+ messages in thread
From: Peter Zijlstra @ 2020-10-27 12:30 UTC (permalink / raw)
  To: Chris Wilson
  Cc: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra,
	Qian Cai, x86

On Tue, Oct 27, 2020 at 12:59:55PM +0100, Peter Zijlstra wrote:
> On Tue, Oct 27, 2020 at 11:29:35AM +0000, Chris Wilson wrote:
> > Quoting tip-bot2 for Peter Zijlstra (2020-10-07 17:20:13)
> > > The following commit has been merged into the locking/core branch of tip:
> > > 
> > > Commit-ID:     24d5a3bffef117ed90685f285c6c9d2faa3a02b4
> > > Gitweb:        https://git.kernel.org/tip/24d5a3bffef117ed90685f285c6c9d2faa3a02b4
> > > Author:        Peter Zijlstra <peterz@infradead.org>
> > > AuthorDate:    Wed, 30 Sep 2020 11:49:37 +02:00
> > > Committer:     Peter Zijlstra <peterz@infradead.org>
> > > CommitterDate: Wed, 07 Oct 2020 18:14:17 +02:00
> > > 
> > > lockdep: Fix usage_traceoverflow
> > > 
> > > Basically print_lock_class_header()'s for loop is out of sync with the
> > > the size of of ->usage_traces[].
> > 
> > We're hitting a problem,
> > 
> > 	$ cat /proc/lockdep_stats
> > 
> > upon boot generates:
> > 
> > [   29.465702] DEBUG_LOCKS_WARN_ON(debug_atomic_read(nr_unused_locks) != nr_unused)
> > [   29.465716] WARNING: CPU: 0 PID: 488 at kernel/locking/lockdep_proc.c:256 lockdep_stats_show+0xa33/0xac0
> > 
> > that bisected to this patch. Only just completed the bisection and
> > thought you would like a heads up.
> 
> Oh hey, that's 'curious'... it does indeed trivially reproduce, let me
> have a poke.

This seems to make it happy. Not quite sure that's the best solution.

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 3e99dfef8408..81295bc760fe 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4411,7 +4405,9 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 		break;
 
 	case LOCK_USED:
-		debug_atomic_dec(nr_unused_locks);
+	case LOCK_USED_READ:
+		if ((hlock_class(this)->usage_mask & (LOCKF_USED|LOCKF_USED_READ)) == new_mask)
+			debug_atomic_dec(nr_unused_locks);
 		break;
 
 	default:




^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-27 12:30             ` Peter Zijlstra
@ 2020-10-27 12:48               ` Peter Zijlstra
  2020-10-27 14:13                 ` Chris Wilson
  2020-10-31 11:30                 ` [tip: locking/urgent] lockdep: Fix nr_unused_locks accounting tip-bot2 for Peter Zijlstra
  2020-10-27 13:29               ` [tip: locking/core] lockdep: Fix usage_traceoverflow Chris Wilson
  1 sibling, 2 replies; 35+ messages in thread
From: Peter Zijlstra @ 2020-10-27 12:48 UTC (permalink / raw)
  To: Chris Wilson
  Cc: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra,
	Qian Cai, x86

On Tue, Oct 27, 2020 at 01:30:56PM +0100, Peter Zijlstra wrote:
> This seems to make it happy. Not quite sure that's the best solution.
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 3e99dfef8408..81295bc760fe 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -4411,7 +4405,9 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
>  		break;
>  
>  	case LOCK_USED:
> -		debug_atomic_dec(nr_unused_locks);
> +	case LOCK_USED_READ:
> +		if ((hlock_class(this)->usage_mask & (LOCKF_USED|LOCKF_USED_READ)) == new_mask)
> +			debug_atomic_dec(nr_unused_locks);
>  		break;
>  
>  	default:

This also works, and I think I likes it better.. anyone?

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 3e99dfef8408..e603e86c0227 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4396,6 +4390,9 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	if (unlikely(hlock_class(this)->usage_mask & new_mask))
 		goto unlock;
 
+	if (!hlock_class(this)->usage_mask)
+		debug_atomic_dec(nr_unused_locks);
+
 	hlock_class(this)->usage_mask |= new_mask;
 
 	if (new_bit < LOCK_TRACE_STATES) {
@@ -4403,19 +4400,10 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 			return 0;
 	}
 
-	switch (new_bit) {
-	case 0 ... LOCK_USED-1:
+	if (new_bit < LOCK_USED) {
 		ret = mark_lock_irq(curr, this, new_bit);
 		if (!ret)
 			return 0;
-		break;
-
-	case LOCK_USED:
-		debug_atomic_dec(nr_unused_locks);
-		break;
-
-	default:
-		break;
 	}
 
 unlock:



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-27 12:30             ` Peter Zijlstra
  2020-10-27 12:48               ` Peter Zijlstra
@ 2020-10-27 13:29               ` Chris Wilson
  2020-10-27 15:45                 ` Peter Zijlstra
  1 sibling, 1 reply; 35+ messages in thread
From: Chris Wilson @ 2020-10-27 13:29 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra,
	Qian Cai, x86

Quoting Peter Zijlstra (2020-10-27 12:30:56)
> On Tue, Oct 27, 2020 at 12:59:55PM +0100, Peter Zijlstra wrote:
> > On Tue, Oct 27, 2020 at 11:29:35AM +0000, Chris Wilson wrote:
> > > Quoting tip-bot2 for Peter Zijlstra (2020-10-07 17:20:13)
> > > > The following commit has been merged into the locking/core branch of tip:
> > > > 
> > > > Commit-ID:     24d5a3bffef117ed90685f285c6c9d2faa3a02b4
> > > > Gitweb:        https://git.kernel.org/tip/24d5a3bffef117ed90685f285c6c9d2faa3a02b4
> > > > Author:        Peter Zijlstra <peterz@infradead.org>
> > > > AuthorDate:    Wed, 30 Sep 2020 11:49:37 +02:00
> > > > Committer:     Peter Zijlstra <peterz@infradead.org>
> > > > CommitterDate: Wed, 07 Oct 2020 18:14:17 +02:00
> > > > 
> > > > lockdep: Fix usage_traceoverflow
> > > > 
> > > > Basically print_lock_class_header()'s for loop is out of sync with the
> > > > the size of of ->usage_traces[].
> > > 
> > > We're hitting a problem,
> > > 
> > >     $ cat /proc/lockdep_stats
> > > 
> > > upon boot generates:
> > > 
> > > [   29.465702] DEBUG_LOCKS_WARN_ON(debug_atomic_read(nr_unused_locks) != nr_unused)
> > > [   29.465716] WARNING: CPU: 0 PID: 488 at kernel/locking/lockdep_proc.c:256 lockdep_stats_show+0xa33/0xac0
> > > 
> > > that bisected to this patch. Only just completed the bisection and
> > > thought you would like a heads up.
> > 
> > Oh hey, that's 'curious'... it does indeed trivially reproduce, let me
> > have a poke.
> 
> This seems to make it happy. Not quite sure that's the best solution.

Finished the first round of testing on this patch (will try the second
in a second). It solves the nr_unused_locks issue, but we find something
else:

<4> [304.908891] hm#2, depth: 6 [6], 3425cfea6ff31f7f != 547d92e9ec2ab9af
<4> [304.908897] WARNING: CPU: 0 PID: 5658 at kernel/locking/lockdep.c:3679 check_chain_key+0x1a4/0x1f0
<4> [304.908898] Modules linked in: vgem snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio i915 mei_hdcp x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul snd_hda_intel btusb snd_intel_dspcfg btrtl snd_hda_codec btbcm btintel ghash_clmulni_intel snd_hwdep bluetooth snd_hda_core e1000e snd_pcm cdc_ether ptp usbnet mei_me mii pps_core mei ecdh_generic ecc intel_lpss_pci prime_numbers
<4> [304.908920] CPU: 0 PID: 5658 Comm: kms_psr Not tainted 5.10.0-rc1-CI-Trybot_7174+ #1
<4> [304.908922] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
<4> [304.908923] RIP: 0010:check_chain_key+0x1a4/0x1f0
<4> [304.908925] Code: a5 d8 08 00 00 74 e7 e8 7a eb 96 00 8b b5 e0 08 00 00 4c 89 e1 89 da 4c 8b 85 d8 08 00 00 48 c7 c7 d0 8f 30 82 e8 5f 2c 92 00 <0f> 0b 5b 5d 41 5c 41 5d c3 49 89 d5 49 c7 c4 ff ff ff ff 31 db e8
<4> [304.908926] RSP: 0018:ffffc90000ba7af0 EFLAGS: 00010086
<4> [304.908928] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000002
<4> [304.908929] RDX: 0000000080000002 RSI: ffffffff82348c47 RDI: 00000000ffffffff
<4> [304.908930] RBP: ffff88812d7dc040 R08: 0000000000000000 R09: c000000100002c92
<4> [304.908931] R10: 00000000003b5380 R11: ffffc90000ba7900 R12: 3425cfea6ff31f7f
<4> [304.908931] R13: ffff88812d7dc9f0 R14: 0000000000000003 R15: ffff88812d7dc9f0
<4> [304.908933] FS:  00007f51722bb300(0000) GS:ffff88849fa00000(0000) knlGS:0000000000000000
<4> [304.908934] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4> [304.908935] CR2: 00007ffd197adff0 CR3: 000000011d9ee004 CR4: 0000000000770ef0
<4> [304.908935] PKRU: 55555554
<4> [304.908936] Call Trace:
<4> [304.908939]  __lock_acquire+0x5d0/0x2740
<4> [304.908941]  lock_acquire+0xdc/0x3c0
<4> [304.908944]  ? drm_modeset_lock+0xf6/0x110
<4> [304.908947]  __ww_mutex_lock.constprop.18+0xd0/0x1010
<4> [304.908949]  ? drm_modeset_lock+0xf6/0x110
<4> [304.908951]  ? drm_modeset_lock+0xf6/0x110
<4> [304.908953]  ? ww_mutex_lock_interruptible+0x39/0xa0
<4> [304.908954]  ww_mutex_lock_interruptible+0x39/0xa0
<4> [304.908956]  drm_modeset_lock+0xf6/0x110
<4> [304.908958]  drm_atomic_get_connector_state+0x28/0x180
<4> [304.909003]  intel_psr_fastset_force+0x76/0x170 [i915]
<4> [304.909034]  i915_edp_psr_debug_set+0x53/0x70 [i915]
<4> [304.909037]  simple_attr_write+0xb1/0xd0
<4> [304.909040]  full_proxy_write+0x51/0x80
<4> [304.909042]  vfs_write+0xc4/0x230
<4> [304.909043]  ksys_write+0x5a/0xd0
<4> [304.909045]  do_syscall_64+0x33/0x80
<4> [304.909046]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
<4> [304.909047] RIP: 0033:0x7f517180d281
<4> [304.909049] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
<4> [304.909050] RSP: 002b:00007ffd197b0728 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4> [304.909051] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f517180d281
<4> [304.909052] RDX: 0000000000000003 RSI: 00007f5171cb0dee RDI: 0000000000000009
<4> [304.909053] RBP: 0000000000000003 R08: 00007ffd197eb1b0 R09: 000000000005d270
<4> [304.909054] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f5171cb0dee
<4> [304.909055] R13: 0000000000000009 R14: 000055bad33d98c0 R15: 0000000000000000
<4> [304.909057] CPU: 0 PID: 5658 Comm: kms_psr Not tainted 5.10.0-rc1-CI-Trybot_7174+ #1
<4> [304.909058] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
<4> [304.909059] Call Trace:
<4> [304.909061]  dump_stack+0x77/0x97
<4> [304.909064]  __warn.cold.14+0xe/0x4b
<4> [304.909066]  ? check_chain_key+0x1a4/0x1f0
<4> [304.909068]  report_bug+0xbd/0xf0
<4> [304.909070]  handle_bug+0x3f/0x70
<4> [304.909071]  exc_invalid_op+0x13/0x60
<4> [304.909072]  asm_exc_invalid_op+0x12/0x20
<4> [304.909074] RIP: 0010:check_chain_key+0x1a4/0x1f0
<4> [304.909075] Code: a5 d8 08 00 00 74 e7 e8 7a eb 96 00 8b b5 e0 08 00 00 4c 89 e1 89 da 4c 8b 85 d8 08 00 00 48 c7 c7 d0 8f 30 82 e8 5f 2c 92 00 <0f> 0b 5b 5d 41 5c 41 5d c3 49 89 d5 49 c7 c4 ff ff ff ff 31 db e8
<4> [304.909076] RSP: 0018:ffffc90000ba7af0 EFLAGS: 00010086
<4> [304.909077] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000002
<4> [304.909078] RDX: 0000000080000002 RSI: ffffffff82348c47 RDI: 00000000ffffffff
<4> [304.909079] RBP: ffff88812d7dc040 R08: 0000000000000000 R09: c000000100002c92
<4> [304.909080] R10: 00000000003b5380 R11: ffffc90000ba7900 R12: 3425cfea6ff31f7f
<4> [304.909081] R13: ffff88812d7dc9f0 R14: 0000000000000003 R15: ffff88812d7dc9f0
<4> [304.909083]  __lock_acquire+0x5d0/0x2740
<4> [304.909086]  lock_acquire+0xdc/0x3c0
<4> [304.909087]  ? drm_modeset_lock+0xf6/0x110
<4> [304.909090]  __ww_mutex_lock.constprop.18+0xd0/0x1010
<4> [304.909091]  ? drm_modeset_lock+0xf6/0x110
<4> [304.909093]  ? drm_modeset_lock+0xf6/0x110
<4> [304.909095]  ? ww_mutex_lock_interruptible+0x39/0xa0
<4> [304.909096]  ww_mutex_lock_interruptible+0x39/0xa0
<4> [304.909098]  drm_modeset_lock+0xf6/0x110
<4> [304.909100]  drm_atomic_get_connector_state+0x28/0x180
<4> [304.909149]  intel_psr_fastset_force+0x76/0x170 [i915]
<4> [304.909179]  i915_edp_psr_debug_set+0x53/0x70 [i915]
<4> [304.909181]  simple_attr_write+0xb1/0xd0
<4> [304.909183]  full_proxy_write+0x51/0x80
<4> [304.909184]  vfs_write+0xc4/0x230
<4> [304.909185]  ksys_write+0x5a/0xd0
<4> [304.909187]  do_syscall_64+0x33/0x80
<4> [304.909188]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
<4> [304.909189] RIP: 0033:0x7f517180d281
<4> [304.909190] Code: c3 0f 1f 84 00 00 00 00 00 48 8b 05 59 8d 20 00 c3 0f 1f 84 00 00 00 00 00 8b 05 8a d1 20 00 85 c0 75 16 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 57 f3 c3 0f 1f 44 00 00 41 54 55 49 89 d4 53
<4> [304.909191] RSP: 002b:00007ffd197b0728 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
<4> [304.909193] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f517180d281
<4> [304.909193] RDX: 0000000000000003 RSI: 00007f5171cb0dee RDI: 0000000000000009
<4> [304.909194] RBP: 0000000000000003 R08: 00007ffd197eb1b0 R09: 000000000005d270
<4> [304.909195] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f5171cb0dee
<4> [304.909196] R13: 0000000000000009 R14: 000055bad33d98c0 R15: 0000000000000000
<4> [304.909198] irq event stamp: 93035
<4> [304.909200] hardirqs last  enabled at (93035): [<ffffffff81a8b8b2>] _raw_spin_unlock_irqrestore+0x42/0x50
<4> [304.909201] hardirqs last disabled at (93034): [<ffffffff81a8b687>] _raw_spin_lock_irqsave+0x47/0x50
<4> [304.909203] softirqs last  enabled at (92760): [<ffffffff81e00342>] __do_softirq+0x342/0x48e
<4> [304.909204] softirqs last disabled at (92753): [<ffffffff81c00f4f>] asm_call_irq_on_stack+0xf/0x20

https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7174/fi-icl-u2/igt@kms_psr@primary_mmap_gtt.html
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7174/fi-kbl-r/igt@kms_psr@primary_page_flip.html
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7174/fi-cml-s/igt@kms_psr@primary_mmap_gtt.html
https://intel-gfx-ci.01.org/tree/drm-tip/Trybot_7174/fi-cml-u2/igt@kms_psr@primary_page_flip.html

It is suspicious that those are all the same tests midway through the
run, so it might be our bug, but it is one we haven't seen before.
-Chris

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-27 12:48               ` Peter Zijlstra
@ 2020-10-27 14:13                 ` Chris Wilson
  2020-10-31 11:30                 ` [tip: locking/urgent] lockdep: Fix nr_unused_locks accounting tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 35+ messages in thread
From: Chris Wilson @ 2020-10-27 14:13 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra,
	Qian Cai, x86

Quoting Peter Zijlstra (2020-10-27 12:48:34)
> On Tue, Oct 27, 2020 at 01:30:56PM +0100, Peter Zijlstra wrote:
> > This seems to make it happy. Not quite sure that's the best solution.
> > 
> > diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> > index 3e99dfef8408..81295bc760fe 100644
> > --- a/kernel/locking/lockdep.c
> > +++ b/kernel/locking/lockdep.c
> > @@ -4411,7 +4405,9 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
> >               break;
> >  
> >       case LOCK_USED:
> > -             debug_atomic_dec(nr_unused_locks);
> > +     case LOCK_USED_READ:
> > +             if ((hlock_class(this)->usage_mask & (LOCKF_USED|LOCKF_USED_READ)) == new_mask)
> > +                     debug_atomic_dec(nr_unused_locks);
> >               break;
> >  
> >       default:
> 
> This also works, and I think I likes it better.. anyone?
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 3e99dfef8408..e603e86c0227 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -4396,6 +4390,9 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
>         if (unlikely(hlock_class(this)->usage_mask & new_mask))
>                 goto unlock;
>  
> +       if (!hlock_class(this)->usage_mask)
> +               debug_atomic_dec(nr_unused_locks);
> +

From an outside perspective, this is much easier for me to match with
the assertion in lockdep_proc.

Our CI confirms this works, and we are just left with the new issue of

<4> [260.903453] hm#2, depth: 6 [6], eb18a85a2df37d3d != a6ee4649c0022599
<4> [260.903458] WARNING: CPU: 7 PID: 5515 at kernel/locking/lockdep.c:3679 check_chain_key+0x1a4/0x1f0

Thanks,
-Chris

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-27 13:29               ` [tip: locking/core] lockdep: Fix usage_traceoverflow Chris Wilson
@ 2020-10-27 15:45                 ` Peter Zijlstra
  2020-10-27 16:34                   ` Chris Wilson
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Zijlstra @ 2020-10-27 15:45 UTC (permalink / raw)
  To: Chris Wilson
  Cc: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra,
	Qian Cai, x86

On Tue, Oct 27, 2020 at 01:29:10PM +0000, Chris Wilson wrote:

> <4> [304.908891] hm#2, depth: 6 [6], 3425cfea6ff31f7f != 547d92e9ec2ab9af
> <4> [304.908897] WARNING: CPU: 0 PID: 5658 at kernel/locking/lockdep.c:3679 check_chain_key+0x1a4/0x1f0

Urgh, I don't think I've _ever_ seen that warning trigger.

The comments that go with it suggest memory corruption is the most
likely trigger of it. Is it easy to trigger?

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-27 15:45                 ` Peter Zijlstra
@ 2020-10-27 16:34                   ` Chris Wilson
  2020-10-28 17:40                     ` Chris Wilson
  0 siblings, 1 reply; 35+ messages in thread
From: Chris Wilson @ 2020-10-27 16:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra,
	Qian Cai, x86

Quoting Peter Zijlstra (2020-10-27 15:45:33)
> On Tue, Oct 27, 2020 at 01:29:10PM +0000, Chris Wilson wrote:
> 
> > <4> [304.908891] hm#2, depth: 6 [6], 3425cfea6ff31f7f != 547d92e9ec2ab9af
> > <4> [304.908897] WARNING: CPU: 0 PID: 5658 at kernel/locking/lockdep.c:3679 check_chain_key+0x1a4/0x1f0
> 
> Urgh, I don't think I've _ever_ seen that warning trigger.
> 
> The comments that go with it suggest memory corruption is the most
> likely trigger of it. Is it easy to trigger?

For the automated CI, yes, the few machines that run that particular HW
test seem to hit it regularly. I have not yet reproduced it for myself.
I thought it looked like something kasan would provide some insight for
and we should get a kasan run through CI over the w/e. I suspect we've
feed in some garbage and called it a lock.
-Chris

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-27 16:34                   ` Chris Wilson
@ 2020-10-28 17:40                     ` Chris Wilson
  2020-10-28 18:06                       ` Chris Wilson
  2020-10-28 19:42                       ` Peter Zijlstra
  0 siblings, 2 replies; 35+ messages in thread
From: Chris Wilson @ 2020-10-28 17:40 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra,
	Qian Cai, x86

Quoting Chris Wilson (2020-10-27 16:34:53)
> Quoting Peter Zijlstra (2020-10-27 15:45:33)
> > On Tue, Oct 27, 2020 at 01:29:10PM +0000, Chris Wilson wrote:
> > 
> > > <4> [304.908891] hm#2, depth: 6 [6], 3425cfea6ff31f7f != 547d92e9ec2ab9af
> > > <4> [304.908897] WARNING: CPU: 0 PID: 5658 at kernel/locking/lockdep.c:3679 check_chain_key+0x1a4/0x1f0
> > 
> > Urgh, I don't think I've _ever_ seen that warning trigger.
> > 
> > The comments that go with it suggest memory corruption is the most
> > likely trigger of it. Is it easy to trigger?
> 
> For the automated CI, yes, the few machines that run that particular HW
> test seem to hit it regularly. I have not yet reproduced it for myself.
> I thought it looked like something kasan would provide some insight for
> and we should get a kasan run through CI over the w/e. I suspect we've
> feed in some garbage and called it a lock.

I tracked it down to a second invocation of lock_acquire_shared_recursive()
intermingled with some other regular mutexes (in this case ww_mutex).

We hit this path in validate_chain():
	/*
	 * Mark recursive read, as we jump over it when
	 * building dependencies (just like we jump over
	 * trylock entries):
	 */
	if (ret == 2)
		hlock->read = 2;

and that is modifying hlock_id() and so the chain-key, after it has
already been computed.

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 035f81b1cc87..f193f756e1e3 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4831,7 +4831,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
        if (!validate_chain(curr, hlock, chain_head, chain_key))
                return 0;

-       curr->curr_chain_key = chain_key;
+       curr->curr_chain_key = iterate_chain_key(chain_key, hlock_id(hlock));
        curr->lockdep_depth++;
        check_chain_key(curr);

works as a heavy hammer.
-Chris

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-28 17:40                     ` Chris Wilson
@ 2020-10-28 18:06                       ` Chris Wilson
  2020-10-28 19:42                       ` Peter Zijlstra
  1 sibling, 0 replies; 35+ messages in thread
From: Chris Wilson @ 2020-10-28 18:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra,
	Qian Cai, x86

Quoting Chris Wilson (2020-10-28 17:40:48)
> Quoting Chris Wilson (2020-10-27 16:34:53)
> > Quoting Peter Zijlstra (2020-10-27 15:45:33)
> > > On Tue, Oct 27, 2020 at 01:29:10PM +0000, Chris Wilson wrote:
> > > 
> > > > <4> [304.908891] hm#2, depth: 6 [6], 3425cfea6ff31f7f != 547d92e9ec2ab9af
> > > > <4> [304.908897] WARNING: CPU: 0 PID: 5658 at kernel/locking/lockdep.c:3679 check_chain_key+0x1a4/0x1f0
> > > 
> > > Urgh, I don't think I've _ever_ seen that warning trigger.
> > > 
> > > The comments that go with it suggest memory corruption is the most
> > > likely trigger of it. Is it easy to trigger?
> > 
> > For the automated CI, yes, the few machines that run that particular HW
> > test seem to hit it regularly. I have not yet reproduced it for myself.
> > I thought it looked like something kasan would provide some insight for
> > and we should get a kasan run through CI over the w/e. I suspect we've
> > feed in some garbage and called it a lock.
> 
> I tracked it down to a second invocation of lock_acquire_shared_recursive()
> intermingled with some other regular mutexes (in this case ww_mutex).
> 
> We hit this path in validate_chain():
>         /*
>          * Mark recursive read, as we jump over it when
>          * building dependencies (just like we jump over
>          * trylock entries):
>          */
>         if (ret == 2)
>                 hlock->read = 2;
> 
> and that is modifying hlock_id() and so the chain-key, after it has
> already been computed.
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 035f81b1cc87..f193f756e1e3 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -4831,7 +4831,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
>         if (!validate_chain(curr, hlock, chain_head, chain_key))
>                 return 0;
> 
> -       curr->curr_chain_key = chain_key;
> +       curr->curr_chain_key = iterate_chain_key(chain_key, hlock_id(hlock));
>         curr->lockdep_depth++;
>         check_chain_key(curr);

No, chain_key should not be chained onto itself again, but I hope it
makes the issue clear nevertheless.
-Chris

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-28 17:40                     ` Chris Wilson
  2020-10-28 18:06                       ` Chris Wilson
@ 2020-10-28 19:42                       ` Peter Zijlstra
  2020-10-28 19:59                         ` Peter Zijlstra
  1 sibling, 1 reply; 35+ messages in thread
From: Peter Zijlstra @ 2020-10-28 19:42 UTC (permalink / raw)
  To: Chris Wilson
  Cc: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra,
	Qian Cai, x86, Boqun Feng

On Wed, Oct 28, 2020 at 05:40:48PM +0000, Chris Wilson wrote:
> Quoting Chris Wilson (2020-10-27 16:34:53)
> > Quoting Peter Zijlstra (2020-10-27 15:45:33)
> > > On Tue, Oct 27, 2020 at 01:29:10PM +0000, Chris Wilson wrote:
> > > 
> > > > <4> [304.908891] hm#2, depth: 6 [6], 3425cfea6ff31f7f != 547d92e9ec2ab9af
> > > > <4> [304.908897] WARNING: CPU: 0 PID: 5658 at kernel/locking/lockdep.c:3679 check_chain_key+0x1a4/0x1f0
> > > 
> > > Urgh, I don't think I've _ever_ seen that warning trigger.
> > > 
> > > The comments that go with it suggest memory corruption is the most
> > > likely trigger of it. Is it easy to trigger?
> > 
> > For the automated CI, yes, the few machines that run that particular HW
> > test seem to hit it regularly. I have not yet reproduced it for myself.
> > I thought it looked like something kasan would provide some insight for
> > and we should get a kasan run through CI over the w/e. I suspect we've
> > feed in some garbage and called it a lock.
> 
> I tracked it down to a second invocation of lock_acquire_shared_recursive()
> intermingled with some other regular mutexes (in this case ww_mutex).
> 
> We hit this path in validate_chain():
> 	/*
> 	 * Mark recursive read, as we jump over it when
> 	 * building dependencies (just like we jump over
> 	 * trylock entries):
> 	 */
> 	if (ret == 2)
> 		hlock->read = 2;
> 
> and that is modifying hlock_id() and so the chain-key, after it has
> already been computed.

Ooh, interesting.. I'll have to go look at this in the morning, brain is
fried already. Thanks for digging into it.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-28 19:42                       ` Peter Zijlstra
@ 2020-10-28 19:59                         ` Peter Zijlstra
  2020-10-30  3:51                           ` Boqun Feng
  0 siblings, 1 reply; 35+ messages in thread
From: Peter Zijlstra @ 2020-10-28 19:59 UTC (permalink / raw)
  To: Chris Wilson
  Cc: linux-kernel, linux-tip-commits, tip-bot2 for Peter Zijlstra,
	Qian Cai, x86, Boqun Feng

On Wed, Oct 28, 2020 at 08:42:09PM +0100, Peter Zijlstra wrote:
> On Wed, Oct 28, 2020 at 05:40:48PM +0000, Chris Wilson wrote:
> > Quoting Chris Wilson (2020-10-27 16:34:53)
> > > Quoting Peter Zijlstra (2020-10-27 15:45:33)
> > > > On Tue, Oct 27, 2020 at 01:29:10PM +0000, Chris Wilson wrote:
> > > > 
> > > > > <4> [304.908891] hm#2, depth: 6 [6], 3425cfea6ff31f7f != 547d92e9ec2ab9af
> > > > > <4> [304.908897] WARNING: CPU: 0 PID: 5658 at kernel/locking/lockdep.c:3679 check_chain_key+0x1a4/0x1f0
> > > > 
> > > > Urgh, I don't think I've _ever_ seen that warning trigger.
> > > > 
> > > > The comments that go with it suggest memory corruption is the most
> > > > likely trigger of it. Is it easy to trigger?
> > > 
> > > For the automated CI, yes, the few machines that run that particular HW
> > > test seem to hit it regularly. I have not yet reproduced it for myself.
> > > I thought it looked like something kasan would provide some insight for
> > > and we should get a kasan run through CI over the w/e. I suspect we've
> > > feed in some garbage and called it a lock.
> > 
> > I tracked it down to a second invocation of lock_acquire_shared_recursive()
> > intermingled with some other regular mutexes (in this case ww_mutex).
> > 
> > We hit this path in validate_chain():
> > 	/*
> > 	 * Mark recursive read, as we jump over it when
> > 	 * building dependencies (just like we jump over
> > 	 * trylock entries):
> > 	 */
> > 	if (ret == 2)
> > 		hlock->read = 2;
> > 
> > and that is modifying hlock_id() and so the chain-key, after it has
> > already been computed.
> 
> Ooh, interesting.. I'll have to go look at this in the morning, brain is
> fried already. Thanks for digging into it.

So that's commit f611e8cf98ec ("lockdep: Take read/write status in
consideration when generate chainkey") that did that.

So validate_chain() requires the new chain_key, but can change ->read
which then invalidates the chain_key we just calculated.

This happens when check_deadlock() returns 2, which only happens when:

  - next->read == 2 && ... ; however @hext is our @hlock, so that's
    pointless

  - when there's a nest_lock involved ; ww_mutex uses that !!!

I suppose something like the below _might_ just do it, but I haven't
compiled it, and like said, my brain is fried.

Boqun, could you have a look, you're a few timezones ahead of us so your
morning is earlier ;-)

---

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 3e99dfef8408..3caf63532bc2 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -3556,7 +3556,7 @@ static inline int lookup_chain_cache_add(struct task_struct *curr,
 
 static int validate_chain(struct task_struct *curr,
 			  struct held_lock *hlock,
-			  int chain_head, u64 chain_key)
+			  int chain_head, u64 *chain_key)
 {
 	/*
 	 * Trylock needs to maintain the stack of held locks, but it
@@ -3568,6 +3568,7 @@ static int validate_chain(struct task_struct *curr,
 	 * (If lookup_chain_cache_add() return with 1 it acquires
 	 * graph_lock for us)
 	 */
+again:
 	if (!hlock->trylock && hlock->check &&
 	    lookup_chain_cache_add(curr, hlock, chain_key)) {
 		/*
@@ -3597,8 +3598,12 @@ static int validate_chain(struct task_struct *curr,
 		 * building dependencies (just like we jump over
 		 * trylock entries):
 		 */
-		if (ret == 2)
+		if (ret == 2) {
 			hlock->read = 2;
+			*chain_key = iterate_chain_key(hlock->prev_chain_key, hlock_id(hlock));
+			goto again;
+		}
+
 		/*
 		 * Add dependency only if this lock is not the head
 		 * of the chain, and if it's not a secondary read-lock:
@@ -3620,7 +3625,7 @@ static int validate_chain(struct task_struct *curr,
 #else
 static inline int validate_chain(struct task_struct *curr,
 				 struct held_lock *hlock,
-				 int chain_head, u64 chain_key)
+				 int chain_head, u64 *chain_key)
 {
 	return 1;
 }
@@ -4834,7 +4839,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 		WARN_ON_ONCE(!hlock_class(hlock)->key);
 	}
 
-	if (!validate_chain(curr, hlock, chain_head, chain_key))
+	if (!validate_chain(curr, hlock, chain_head, &chain_key))
 		return 0;
 
 	curr->curr_chain_key = chain_key;

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-28 19:59                         ` Peter Zijlstra
@ 2020-10-30  3:51                           ` Boqun Feng
  2020-10-30  9:38                             ` Peter Zijlstra
  0 siblings, 1 reply; 35+ messages in thread
From: Boqun Feng @ 2020-10-30  3:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Chris Wilson, linux-kernel, linux-tip-commits,
	tip-bot2 for Peter Zijlstra, Qian Cai, x86

Hi Peter,

On Wed, Oct 28, 2020 at 08:59:10PM +0100, Peter Zijlstra wrote:
> On Wed, Oct 28, 2020 at 08:42:09PM +0100, Peter Zijlstra wrote:
> > On Wed, Oct 28, 2020 at 05:40:48PM +0000, Chris Wilson wrote:
> > > Quoting Chris Wilson (2020-10-27 16:34:53)
> > > > Quoting Peter Zijlstra (2020-10-27 15:45:33)
> > > > > On Tue, Oct 27, 2020 at 01:29:10PM +0000, Chris Wilson wrote:
> > > > > 
> > > > > > <4> [304.908891] hm#2, depth: 6 [6], 3425cfea6ff31f7f != 547d92e9ec2ab9af
> > > > > > <4> [304.908897] WARNING: CPU: 0 PID: 5658 at kernel/locking/lockdep.c:3679 check_chain_key+0x1a4/0x1f0
> > > > > 
> > > > > Urgh, I don't think I've _ever_ seen that warning trigger.
> > > > > 
> > > > > The comments that go with it suggest memory corruption is the most
> > > > > likely trigger of it. Is it easy to trigger?
> > > > 
> > > > For the automated CI, yes, the few machines that run that particular HW
> > > > test seem to hit it regularly. I have not yet reproduced it for myself.
> > > > I thought it looked like something kasan would provide some insight for
> > > > and we should get a kasan run through CI over the w/e. I suspect we've
> > > > feed in some garbage and called it a lock.
> > > 
> > > I tracked it down to a second invocation of lock_acquire_shared_recursive()
> > > intermingled with some other regular mutexes (in this case ww_mutex).
> > > 
> > > We hit this path in validate_chain():
> > > 	/*
> > > 	 * Mark recursive read, as we jump over it when
> > > 	 * building dependencies (just like we jump over
> > > 	 * trylock entries):
> > > 	 */
> > > 	if (ret == 2)
> > > 		hlock->read = 2;
> > > 
> > > and that is modifying hlock_id() and so the chain-key, after it has
> > > already been computed.
> > 
> > Ooh, interesting.. I'll have to go look at this in the morning, brain is
> > fried already. Thanks for digging into it.
> 

Sorry for the late response.

> So that's commit f611e8cf98ec ("lockdep: Take read/write status in
> consideration when generate chainkey") that did that.
> 

Yeah, I think that's related, howver ...

> So validate_chain() requires the new chain_key, but can change ->read
> which then invalidates the chain_key we just calculated.
> 
> This happens when check_deadlock() returns 2, which only happens when:
> 
>   - next->read == 2 && ... ; however @hext is our @hlock, so that's
>     pointless
> 

I don't think we should return 2 (earlier) in this case anymore. Because
now we have recursive read deadlock detection, it's safe to add dep:
"prev -> next" in the dependency graph. I think we can just continue in
this case. Actually I think this is something I'm missing in my
recursive read detection patchset :-/

>   - when there's a nest_lock involved ; ww_mutex uses that !!!
> 

That leaves check_deadlock() return 2 only if hlock is a nest_lock, and
...

> I suppose something like the below _might_ just do it, but I haven't
> compiled it, and like said, my brain is fried.
> 
> Boqun, could you have a look, you're a few timezones ahead of us so your
> morning is earlier ;-)
> 
> ---
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 3e99dfef8408..3caf63532bc2 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -3556,7 +3556,7 @@ static inline int lookup_chain_cache_add(struct task_struct *curr,
>  
>  static int validate_chain(struct task_struct *curr,
>  			  struct held_lock *hlock,
> -			  int chain_head, u64 chain_key)
> +			  int chain_head, u64 *chain_key)
>  {
>  	/*
>  	 * Trylock needs to maintain the stack of held locks, but it
> @@ -3568,6 +3568,7 @@ static int validate_chain(struct task_struct *curr,
>  	 * (If lookup_chain_cache_add() return with 1 it acquires
>  	 * graph_lock for us)
>  	 */
> +again:
>  	if (!hlock->trylock && hlock->check &&
>  	    lookup_chain_cache_add(curr, hlock, chain_key)) {
>  		/*
> @@ -3597,8 +3598,12 @@ static int validate_chain(struct task_struct *curr,
>  		 * building dependencies (just like we jump over
>  		 * trylock entries):
>  		 */
> -		if (ret == 2)
> +		if (ret == 2) {
>  			hlock->read = 2;
> +			*chain_key = iterate_chain_key(hlock->prev_chain_key, hlock_id(hlock));

If "ret == 2" means hlock is a a nest_lock, than we don't need the
"->read = 2" trick here and we don't need to update chain_key either.
We used to have this "->read = 2" only because we want to skip the
dependency adding step afterwards. So how about the following:

It survived a lockdep selftest at boot time.

Regards,
Boqun

----------------------------->8
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 3e99dfef8408..b23ca6196561 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2765,7 +2765,7 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
  * (Note that this has to be done separately, because the graph cannot
  * detect such classes of deadlocks.)
  *
- * Returns: 0 on deadlock detected, 1 on OK, 2 on recursive read
+ * Returns: 0 on deadlock detected, 1 on OK, 2 on nest_lock
  */
 static int
 check_deadlock(struct task_struct *curr, struct held_lock *next)
@@ -2788,7 +2788,7 @@ check_deadlock(struct task_struct *curr, struct held_lock *next)
 		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
 		 */
 		if ((next->read == 2) && prev->read)
-			return 2;
+			continue;
 
 		/*
 		 * We're holding the nest_lock, which serializes this lock's
@@ -3592,16 +3592,9 @@ static int validate_chain(struct task_struct *curr,
 
 		if (!ret)
 			return 0;
-		/*
-		 * Mark recursive read, as we jump over it when
-		 * building dependencies (just like we jump over
-		 * trylock entries):
-		 */
-		if (ret == 2)
-			hlock->read = 2;
 		/*
 		 * Add dependency only if this lock is not the head
-		 * of the chain, and if it's not a secondary read-lock:
+		 * of the chain, and if it's not a nest_lock:
 		 */
 		if (!chain_head && ret != 2) {
 			if (!check_prevs_add(curr, hlock))

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-30  3:51                           ` Boqun Feng
@ 2020-10-30  9:38                             ` Peter Zijlstra
  2020-10-30  9:55                               ` Peter Zijlstra
  2020-11-02  5:37                               ` [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain() Boqun Feng
  0 siblings, 2 replies; 35+ messages in thread
From: Peter Zijlstra @ 2020-10-30  9:38 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Chris Wilson, linux-kernel, linux-tip-commits,
	tip-bot2 for Peter Zijlstra, Qian Cai, x86

On Fri, Oct 30, 2020 at 11:51:18AM +0800, Boqun Feng wrote:
> On Wed, Oct 28, 2020 at 08:59:10PM +0100, Peter Zijlstra wrote:

> Sorry for the late response.

No worries, glad you could have a look.

> > So that's commit f611e8cf98ec ("lockdep: Take read/write status in
> > consideration when generate chainkey") that did that.
> > 
> 
> Yeah, I think that's related, howver ...

It's the commit that made the chainkey depend on the read state, and
thus introduced this connondrum.

> > So validate_chain() requires the new chain_key, but can change ->read
> > which then invalidates the chain_key we just calculated.
> > 
> > This happens when check_deadlock() returns 2, which only happens when:
> > 
> >   - next->read == 2 && ... ; however @hext is our @hlock, so that's
> >     pointless
> > 
> 
> I don't think we should return 2 (earlier) in this case anymore. Because
> now we have recursive read deadlock detection, it's safe to add dep:
> "prev -> next" in the dependency graph. I think we can just continue in
> this case. Actually I think this is something I'm missing in my
> recursive read detection patchset :-/

Yes, I agree, this case should go. We now fully support recursive read
depndencies per your recent work.

> >   - when there's a nest_lock involved ; ww_mutex uses that !!!
> > 
> 
> That leaves check_deadlock() return 2 only if hlock is a nest_lock, and
> ...

> > @@ -3597,8 +3598,12 @@ static int validate_chain(struct task_struct *curr,
> >  		 * building dependencies (just like we jump over
> >  		 * trylock entries):
> >  		 */
> > -		if (ret == 2)
> > +		if (ret == 2) {
> >  			hlock->read = 2;
> > +			*chain_key = iterate_chain_key(hlock->prev_chain_key, hlock_id(hlock));
> 
> If "ret == 2" means hlock is a a nest_lock, than we don't need the
> "->read = 2" trick here and we don't need to update chain_key either.
> We used to have this "->read = 2" only because we want to skip the
> dependency adding step afterwards. So how about the following:
> 
> It survived a lockdep selftest at boot time.

Right, but our self-tests didn't trigger this problem to begin with, let
me go try and create one that does.

> ----------------------------->8
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 3e99dfef8408..b23ca6196561 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -2765,7 +2765,7 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
>   * (Note that this has to be done separately, because the graph cannot
>   * detect such classes of deadlocks.)
>   *
> - * Returns: 0 on deadlock detected, 1 on OK, 2 on recursive read
> + * Returns: 0 on deadlock detected, 1 on OK, 2 on nest_lock
>   */
>  static int
>  check_deadlock(struct task_struct *curr, struct held_lock *next)
> @@ -2788,7 +2788,7 @@ check_deadlock(struct task_struct *curr, struct held_lock *next)
>  		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
>  		 */
>  		if ((next->read == 2) && prev->read)
> -			return 2;
> +			continue;
>  
>  		/*
>  		 * We're holding the nest_lock, which serializes this lock's
> @@ -3592,16 +3592,9 @@ static int validate_chain(struct task_struct *curr,
>  
>  		if (!ret)
>  			return 0;
> -		/*
> -		 * Mark recursive read, as we jump over it when
> -		 * building dependencies (just like we jump over
> -		 * trylock entries):
> -		 */
> -		if (ret == 2)
> -			hlock->read = 2;
>  		/*
>  		 * Add dependency only if this lock is not the head
> -		 * of the chain, and if it's not a secondary read-lock:
> +		 * of the chain, and if it's not a nest_lock:
>  		 */
>  		if (!chain_head && ret != 2) {
>  			if (!check_prevs_add(curr, hlock))

I'm not entirely sure that doesn't still trigger the problem. Consider
@chain_head := true.

Anyway, let me go try and write this self-tests, maybe that'll get my
snot-addled brains sufficiently aligned to make sense of all this :/

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [tip: locking/core] lockdep: Fix usage_traceoverflow
  2020-10-30  9:38                             ` Peter Zijlstra
@ 2020-10-30  9:55                               ` Peter Zijlstra
  2020-11-02  5:37                               ` [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain() Boqun Feng
  1 sibling, 0 replies; 35+ messages in thread
From: Peter Zijlstra @ 2020-10-30  9:55 UTC (permalink / raw)
  To: Boqun Feng
  Cc: Chris Wilson, linux-kernel, linux-tip-commits,
	tip-bot2 for Peter Zijlstra, Qian Cai, x86

On Fri, Oct 30, 2020 at 10:38:06AM +0100, Peter Zijlstra wrote:
> >  		if (!chain_head && ret != 2) {
> >  			if (!check_prevs_add(curr, hlock))
> 
> I'm not entirely sure that doesn't still trigger the problem. Consider
> @chain_head := true.

Urgh, clearly my brain really isn't working well. Ignore that.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [tip: locking/urgent] lockdep: Fix nr_unused_locks accounting
  2020-10-27 12:48               ` Peter Zijlstra
  2020-10-27 14:13                 ` Chris Wilson
@ 2020-10-31 11:30                 ` tip-bot2 for Peter Zijlstra
  1 sibling, 0 replies; 35+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2020-10-31 11:30 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Chris Wilson, Peter Zijlstra (Intel), x86, LKML

The following commit has been merged into the locking/urgent branch of tip:

Commit-ID:     1a39340865ce505a029b37aeb47a3e4c8db5f6c6
Gitweb:        https://git.kernel.org/tip/1a39340865ce505a029b37aeb47a3e4c8db5f6c6
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Tue, 27 Oct 2020 13:48:34 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 30 Oct 2020 17:07:18 +01:00

lockdep: Fix nr_unused_locks accounting

Chris reported that commit 24d5a3bffef1 ("lockdep: Fix
usage_traceoverflow") breaks the nr_unused_locks validation code
triggered by /proc/lockdep_stats.

By fully splitting LOCK_USED and LOCK_USED_READ it becomes a bad
indicator for accounting nr_unused_locks; simplyfy by using any first
bit.

Fixes: 24d5a3bffef1 ("lockdep: Fix usage_traceoverflow")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://lkml.kernel.org/r/20201027124834.GL2628@hirez.programming.kicks-ass.net
---
 kernel/locking/lockdep.c | 14 ++++----------
 1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 1102849..b71ad8d 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -4396,6 +4396,9 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 	if (unlikely(hlock_class(this)->usage_mask & new_mask))
 		goto unlock;
 
+	if (!hlock_class(this)->usage_mask)
+		debug_atomic_dec(nr_unused_locks);
+
 	hlock_class(this)->usage_mask |= new_mask;
 
 	if (new_bit < LOCK_TRACE_STATES) {
@@ -4403,19 +4406,10 @@ static int mark_lock(struct task_struct *curr, struct held_lock *this,
 			return 0;
 	}
 
-	switch (new_bit) {
-	case 0 ... LOCK_USED-1:
+	if (new_bit < LOCK_USED) {
 		ret = mark_lock_irq(curr, this, new_bit);
 		if (!ret)
 			return 0;
-		break;
-
-	case LOCK_USED:
-		debug_atomic_dec(nr_unused_locks);
-		break;
-
-	default:
-		break;
 	}
 
 unlock:

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain()
  2020-10-30  9:38                             ` Peter Zijlstra
  2020-10-30  9:55                               ` Peter Zijlstra
@ 2020-11-02  5:37                               ` Boqun Feng
  2020-11-02  5:37                                 ` [PATCH 2/2] lockdep/selftest: Add spin_nest_lock test Boqun Feng
                                                   ` (3 more replies)
  1 sibling, 4 replies; 35+ messages in thread
From: Boqun Feng @ 2020-11-02  5:37 UTC (permalink / raw)
  To: linux-kernel, linux-tip-commits
  Cc: Qian Cai, Boqun Feng, Chris Wilson, Peter Zijlstra, Ingo Molnar,
	Will Deacon

Chris Wilson reported a problem spotted by check_chain_key(): a chain
key got changed in validate_chain() because we modify the ->read in
validate_chain() to skip checks for dependency adding, and ->read is
taken into calculation for chain key since commit f611e8cf98ec
("lockdep: Take read/write status in consideration when generate
chainkey").

Fix this by avoiding to modify ->read in validate_chain() based on two
facts: a) since we now support recursive read lock detection, there is
no need to skip checks for dependency adding for recursive readers, b)
since we have a), there is only one case left (nest_lock) where we want
to skip checks in validate_chain(), we simply remove the modification
for ->read and rely on the return value of check_deadlock() to skip the
dependency adding.

Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
---
Peter,

I managed to get a reproducer for the problem Chris reported, please see
patch #2. With this patch, that problem gets fixed.

This small patchset is based on your locking/core, patch #2 actually
relies on your "s/raw_spin/spin" changes, thanks for taking care of that
;-)

Regards,
Boqun

 kernel/locking/lockdep.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 3e99dfef8408..a294326fd998 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2765,7 +2765,9 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
  * (Note that this has to be done separately, because the graph cannot
  * detect such classes of deadlocks.)
  *
- * Returns: 0 on deadlock detected, 1 on OK, 2 on recursive read
+ * Returns: 0 on deadlock detected, 1 on OK, 2 if another lock with the same
+ * lock class is held but nest_lock is also held, i.e. we rely on the
+ * nest_lock to avoid the deadlock.
  */
 static int
 check_deadlock(struct task_struct *curr, struct held_lock *next)
@@ -2788,7 +2790,7 @@ check_deadlock(struct task_struct *curr, struct held_lock *next)
 		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
 		 */
 		if ((next->read == 2) && prev->read)
-			return 2;
+			continue;
 
 		/*
 		 * We're holding the nest_lock, which serializes this lock's
@@ -3592,16 +3594,13 @@ static int validate_chain(struct task_struct *curr,
 
 		if (!ret)
 			return 0;
-		/*
-		 * Mark recursive read, as we jump over it when
-		 * building dependencies (just like we jump over
-		 * trylock entries):
-		 */
-		if (ret == 2)
-			hlock->read = 2;
 		/*
 		 * Add dependency only if this lock is not the head
-		 * of the chain, and if it's not a secondary read-lock:
+		 * of the chain, and if the new lock introduces no more
+		 * lock dependency (because we already hold a lock with the
+		 * same lock class) nor deadlock (because the nest_lock
+		 * serializes nesting locks), see the comments for
+		 * check_deadlock().
 		 */
 		if (!chain_head && ret != 2) {
 			if (!check_prevs_add(curr, hlock))
-- 
2.28.0


^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH 2/2] lockdep/selftest: Add spin_nest_lock test
  2020-11-02  5:37                               ` [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain() Boqun Feng
@ 2020-11-02  5:37                                 ` Boqun Feng
  2020-12-03 10:35                                   ` [tip: locking/core] " tip-bot2 for Boqun Feng
  2020-11-05  6:25                                 ` [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain() Boqun Feng
                                                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 35+ messages in thread
From: Boqun Feng @ 2020-11-02  5:37 UTC (permalink / raw)
  To: linux-kernel, linux-tip-commits
  Cc: Qian Cai, Boqun Feng, Chris Wilson, Peter Zijlstra, Ingo Molnar,
	Will Deacon

Add a self test case to test the behavior for the following case:

	lock(A);
	lock_nest_lock(C1, A);
	lock(B);
	lock_nest_lock(C2, A);

This is a reproducer for a problem[1] reported by Chris Wilson, and is
helpful to prevent this.

[1]: https://lore.kernel.org/lkml/160390684819.31966.12048967113267928793@build.alporthouse.com/

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 lib/locking-selftest.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index afa7d4bb291f..4c24ac8a456c 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -2009,6 +2009,19 @@ static void ww_test_spin_nest_unlocked(void)
 	U(A);
 }
 
+/* This is not a deadlock, because we have X1 to serialize Y1 and Y2 */
+static void ww_test_spin_nest_lock(void)
+{
+	spin_lock(&lock_X1);
+	spin_lock_nest_lock(&lock_Y1, &lock_X1);
+	spin_lock(&lock_A);
+	spin_lock_nest_lock(&lock_Y2, &lock_X1);
+	spin_unlock(&lock_A);
+	spin_unlock(&lock_Y2);
+	spin_unlock(&lock_Y1);
+	spin_unlock(&lock_X1);
+}
+
 static void ww_test_unneeded_slow(void)
 {
 	WWAI(&t);
@@ -2226,6 +2239,10 @@ static void ww_tests(void)
 	dotest(ww_test_spin_nest_unlocked, FAILURE, LOCKTYPE_WW);
 	pr_cont("\n");
 
+	print_testname("spinlock nest test");
+	dotest(ww_test_spin_nest_lock, SUCCESS, LOCKTYPE_WW);
+	pr_cont("\n");
+
 	printk("  -----------------------------------------------------\n");
 	printk("                                 |block | try  |context|\n");
 	printk("  -----------------------------------------------------\n");
-- 
2.28.0


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain()
  2020-11-02  5:37                               ` [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain() Boqun Feng
  2020-11-02  5:37                                 ` [PATCH 2/2] lockdep/selftest: Add spin_nest_lock test Boqun Feng
@ 2020-11-05  6:25                                 ` Boqun Feng
  2020-11-10 17:28                                 ` Peter Zijlstra
  2020-11-11  8:23                                 ` [tip: locking/urgent] " tip-bot2 for Boqun Feng
  3 siblings, 0 replies; 35+ messages in thread
From: Boqun Feng @ 2020-11-05  6:25 UTC (permalink / raw)
  To: linux-kernel, linux-tip-commits
  Cc: Qian Cai, Chris Wilson, Peter Zijlstra, Ingo Molnar, Will Deacon

Hi Chris,

Could you try this to see if it fixes the problem? Thanks!

Regards,
Boqun

On Mon, Nov 02, 2020 at 01:37:41PM +0800, Boqun Feng wrote:
> Chris Wilson reported a problem spotted by check_chain_key(): a chain
> key got changed in validate_chain() because we modify the ->read in
> validate_chain() to skip checks for dependency adding, and ->read is
> taken into calculation for chain key since commit f611e8cf98ec
> ("lockdep: Take read/write status in consideration when generate
> chainkey").
> 
> Fix this by avoiding to modify ->read in validate_chain() based on two
> facts: a) since we now support recursive read lock detection, there is
> no need to skip checks for dependency adding for recursive readers, b)
> since we have a), there is only one case left (nest_lock) where we want
> to skip checks in validate_chain(), we simply remove the modification
> for ->read and rely on the return value of check_deadlock() to skip the
> dependency adding.
> 
> Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> ---
> Peter,
> 
> I managed to get a reproducer for the problem Chris reported, please see
> patch #2. With this patch, that problem gets fixed.
> 
> This small patchset is based on your locking/core, patch #2 actually
> relies on your "s/raw_spin/spin" changes, thanks for taking care of that
> ;-)
> 
> Regards,
> Boqun
> 
>  kernel/locking/lockdep.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 3e99dfef8408..a294326fd998 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -2765,7 +2765,9 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
>   * (Note that this has to be done separately, because the graph cannot
>   * detect such classes of deadlocks.)
>   *
> - * Returns: 0 on deadlock detected, 1 on OK, 2 on recursive read
> + * Returns: 0 on deadlock detected, 1 on OK, 2 if another lock with the same
> + * lock class is held but nest_lock is also held, i.e. we rely on the
> + * nest_lock to avoid the deadlock.
>   */
>  static int
>  check_deadlock(struct task_struct *curr, struct held_lock *next)
> @@ -2788,7 +2790,7 @@ check_deadlock(struct task_struct *curr, struct held_lock *next)
>  		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
>  		 */
>  		if ((next->read == 2) && prev->read)
> -			return 2;
> +			continue;
>  
>  		/*
>  		 * We're holding the nest_lock, which serializes this lock's
> @@ -3592,16 +3594,13 @@ static int validate_chain(struct task_struct *curr,
>  
>  		if (!ret)
>  			return 0;
> -		/*
> -		 * Mark recursive read, as we jump over it when
> -		 * building dependencies (just like we jump over
> -		 * trylock entries):
> -		 */
> -		if (ret == 2)
> -			hlock->read = 2;
>  		/*
>  		 * Add dependency only if this lock is not the head
> -		 * of the chain, and if it's not a secondary read-lock:
> +		 * of the chain, and if the new lock introduces no more
> +		 * lock dependency (because we already hold a lock with the
> +		 * same lock class) nor deadlock (because the nest_lock
> +		 * serializes nesting locks), see the comments for
> +		 * check_deadlock().
>  		 */
>  		if (!chain_head && ret != 2) {
>  			if (!check_prevs_add(curr, hlock))
> -- 
> 2.28.0
> 

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain()
  2020-11-02  5:37                               ` [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain() Boqun Feng
  2020-11-02  5:37                                 ` [PATCH 2/2] lockdep/selftest: Add spin_nest_lock test Boqun Feng
  2020-11-05  6:25                                 ` [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain() Boqun Feng
@ 2020-11-10 17:28                                 ` Peter Zijlstra
  2020-11-11  8:23                                 ` [tip: locking/urgent] " tip-bot2 for Boqun Feng
  3 siblings, 0 replies; 35+ messages in thread
From: Peter Zijlstra @ 2020-11-10 17:28 UTC (permalink / raw)
  To: Boqun Feng
  Cc: linux-kernel, linux-tip-commits, Qian Cai, Chris Wilson,
	Ingo Molnar, Will Deacon

On Mon, Nov 02, 2020 at 01:37:41PM +0800, Boqun Feng wrote:
> Chris Wilson reported a problem spotted by check_chain_key(): a chain
> key got changed in validate_chain() because we modify the ->read in
> validate_chain() to skip checks for dependency adding, and ->read is
> taken into calculation for chain key since commit f611e8cf98ec
> ("lockdep: Take read/write status in consideration when generate
> chainkey").
> 
> Fix this by avoiding to modify ->read in validate_chain() based on two
> facts: a) since we now support recursive read lock detection, there is
> no need to skip checks for dependency adding for recursive readers, b)
> since we have a), there is only one case left (nest_lock) where we want
> to skip checks in validate_chain(), we simply remove the modification
> for ->read and rely on the return value of check_deadlock() to skip the
> dependency adding.
> 
> Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>

Thanks Boqun!

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [tip: locking/urgent] lockdep: Avoid to modify chain keys in validate_chain()
  2020-11-02  5:37                               ` [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain() Boqun Feng
                                                   ` (2 preceding siblings ...)
  2020-11-10 17:28                                 ` Peter Zijlstra
@ 2020-11-11  8:23                                 ` tip-bot2 for Boqun Feng
  3 siblings, 0 replies; 35+ messages in thread
From: tip-bot2 for Boqun Feng @ 2020-11-11  8:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Chris Wilson, Boqun Feng, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the locking/urgent branch of tip:

Commit-ID:     d61fc96a37603384cd531622c1e89de1096b5123
Gitweb:        https://git.kernel.org/tip/d61fc96a37603384cd531622c1e89de1096b5123
Author:        Boqun Feng <boqun.feng@gmail.com>
AuthorDate:    Mon, 02 Nov 2020 13:37:41 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Tue, 10 Nov 2020 18:38:38 +01:00

lockdep: Avoid to modify chain keys in validate_chain()

Chris Wilson reported a problem spotted by check_chain_key(): a chain
key got changed in validate_chain() because we modify the ->read in
validate_chain() to skip checks for dependency adding, and ->read is
taken into calculation for chain key since commit f611e8cf98ec
("lockdep: Take read/write status in consideration when generate
chainkey").

Fix this by avoiding to modify ->read in validate_chain() based on two
facts: a) since we now support recursive read lock detection, there is
no need to skip checks for dependency adding for recursive readers, b)
since we have a), there is only one case left (nest_lock) where we want
to skip checks in validate_chain(), we simply remove the modification
for ->read and rely on the return value of check_deadlock() to skip the
dependency adding.

Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201102053743.450459-1-boqun.feng@gmail.com
---
 kernel/locking/lockdep.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index b71ad8d..d9fb9e1 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2765,7 +2765,9 @@ print_deadlock_bug(struct task_struct *curr, struct held_lock *prev,
  * (Note that this has to be done separately, because the graph cannot
  * detect such classes of deadlocks.)
  *
- * Returns: 0 on deadlock detected, 1 on OK, 2 on recursive read
+ * Returns: 0 on deadlock detected, 1 on OK, 2 if another lock with the same
+ * lock class is held but nest_lock is also held, i.e. we rely on the
+ * nest_lock to avoid the deadlock.
  */
 static int
 check_deadlock(struct task_struct *curr, struct held_lock *next)
@@ -2788,7 +2790,7 @@ check_deadlock(struct task_struct *curr, struct held_lock *next)
 		 * lock class (i.e. read_lock(lock)+read_lock(lock)):
 		 */
 		if ((next->read == 2) && prev->read)
-			return 2;
+			continue;
 
 		/*
 		 * We're holding the nest_lock, which serializes this lock's
@@ -3593,15 +3595,12 @@ static int validate_chain(struct task_struct *curr,
 		if (!ret)
 			return 0;
 		/*
-		 * Mark recursive read, as we jump over it when
-		 * building dependencies (just like we jump over
-		 * trylock entries):
-		 */
-		if (ret == 2)
-			hlock->read = 2;
-		/*
 		 * Add dependency only if this lock is not the head
-		 * of the chain, and if it's not a secondary read-lock:
+		 * of the chain, and if the new lock introduces no more
+		 * lock dependency (because we already hold a lock with the
+		 * same lock class) nor deadlock (because the nest_lock
+		 * serializes nesting locks), see the comments for
+		 * check_deadlock().
 		 */
 		if (!chain_head && ret != 2) {
 			if (!check_prevs_add(curr, hlock))

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [tip: locking/core] lockdep/selftest: Add spin_nest_lock test
  2020-11-02  5:37                                 ` [PATCH 2/2] lockdep/selftest: Add spin_nest_lock test Boqun Feng
@ 2020-12-03 10:35                                   ` tip-bot2 for Boqun Feng
  0 siblings, 0 replies; 35+ messages in thread
From: tip-bot2 for Boqun Feng @ 2020-12-03 10:35 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Boqun Feng, Peter Zijlstra (Intel), x86, linux-kernel

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     e04ce676e7aa490dcf5df880592e3db5e842a9bc
Gitweb:        https://git.kernel.org/tip/e04ce676e7aa490dcf5df880592e3db5e842a9bc
Author:        Boqun Feng <boqun.feng@gmail.com>
AuthorDate:    Mon, 02 Nov 2020 13:37:42 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 03 Dec 2020 11:20:50 +01:00

lockdep/selftest: Add spin_nest_lock test

Add a self test case to test the behavior for the following case:

	lock(A);
	lock_nest_lock(C1, A);
	lock(B);
	lock_nest_lock(C2, A);

This is a reproducer for a problem[1] reported by Chris Wilson, and is
helpful to prevent this.

[1]: https://lore.kernel.org/lkml/160390684819.31966.12048967113267928793@build.alporthouse.com/

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20201102053743.450459-2-boqun.feng@gmail.com
---
 lib/locking-selftest.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/lib/locking-selftest.c b/lib/locking-selftest.c
index afa7d4b..4c24ac8 100644
--- a/lib/locking-selftest.c
+++ b/lib/locking-selftest.c
@@ -2009,6 +2009,19 @@ static void ww_test_spin_nest_unlocked(void)
 	U(A);
 }
 
+/* This is not a deadlock, because we have X1 to serialize Y1 and Y2 */
+static void ww_test_spin_nest_lock(void)
+{
+	spin_lock(&lock_X1);
+	spin_lock_nest_lock(&lock_Y1, &lock_X1);
+	spin_lock(&lock_A);
+	spin_lock_nest_lock(&lock_Y2, &lock_X1);
+	spin_unlock(&lock_A);
+	spin_unlock(&lock_Y2);
+	spin_unlock(&lock_Y1);
+	spin_unlock(&lock_X1);
+}
+
 static void ww_test_unneeded_slow(void)
 {
 	WWAI(&t);
@@ -2226,6 +2239,10 @@ static void ww_tests(void)
 	dotest(ww_test_spin_nest_unlocked, FAILURE, LOCKTYPE_WW);
 	pr_cont("\n");
 
+	print_testname("spinlock nest test");
+	dotest(ww_test_spin_nest_lock, SUCCESS, LOCKTYPE_WW);
+	pr_cont("\n");
+
 	printk("  -----------------------------------------------------\n");
 	printk("                                 |block | try  |context|\n");
 	printk("  -----------------------------------------------------\n");

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2020-12-03 10:36 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-29 14:31 lockdep null-ptr-deref Qian Cai
2020-09-29 23:08 ` Boqun Feng
2020-09-30  9:16   ` Peter Zijlstra
2020-09-30  9:49     ` Peter Zijlstra
2020-09-30 12:18       ` Boqun Feng
2020-09-30 19:02         ` Peter Zijlstra
2020-10-02 12:36           ` Boqun Feng
2020-10-02 13:09             ` Peter Zijlstra
2020-10-02 13:35               ` Boqun Feng
2020-10-02 10:06       ` Peter Zijlstra
2020-10-02 13:40         ` Qian Cai
2020-10-07 16:20       ` [tip: locking/core] lockdep: Fix usage_traceoverflow tip-bot2 for Peter Zijlstra
2020-10-27 11:29         ` Chris Wilson
2020-10-27 11:59           ` Peter Zijlstra
2020-10-27 12:30             ` Peter Zijlstra
2020-10-27 12:48               ` Peter Zijlstra
2020-10-27 14:13                 ` Chris Wilson
2020-10-31 11:30                 ` [tip: locking/urgent] lockdep: Fix nr_unused_locks accounting tip-bot2 for Peter Zijlstra
2020-10-27 13:29               ` [tip: locking/core] lockdep: Fix usage_traceoverflow Chris Wilson
2020-10-27 15:45                 ` Peter Zijlstra
2020-10-27 16:34                   ` Chris Wilson
2020-10-28 17:40                     ` Chris Wilson
2020-10-28 18:06                       ` Chris Wilson
2020-10-28 19:42                       ` Peter Zijlstra
2020-10-28 19:59                         ` Peter Zijlstra
2020-10-30  3:51                           ` Boqun Feng
2020-10-30  9:38                             ` Peter Zijlstra
2020-10-30  9:55                               ` Peter Zijlstra
2020-11-02  5:37                               ` [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain() Boqun Feng
2020-11-02  5:37                                 ` [PATCH 2/2] lockdep/selftest: Add spin_nest_lock test Boqun Feng
2020-12-03 10:35                                   ` [tip: locking/core] " tip-bot2 for Boqun Feng
2020-11-05  6:25                                 ` [PATCH 1/2] lockdep: Avoid to modify chain keys in validate_chain() Boqun Feng
2020-11-10 17:28                                 ` Peter Zijlstra
2020-11-11  8:23                                 ` [tip: locking/urgent] " tip-bot2 for Boqun Feng
2020-10-09  7:58       ` [tip: locking/core] lockdep: Fix usage_traceoverflow tip-bot2 for Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.