* KASAN: stack-out-of-bounds Read in csd_lock_record
@ 2020-07-03 23:31 syzbot
  2020-07-04  0:48 ` syzbot
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: syzbot @ 2020-07-03 23:31 UTC
  To: bigeasy, linux-kernel, mingo, paulmck, peterz, syzkaller-bugs, tglx

Hello,

syzbot found the following crash on:

HEAD commit:    9e50b94b Add linux-next specific files for 20200703
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1024b405100000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f99cc0faa1476ed6
dashboard link: https://syzkaller.appspot.com/bug?extid=0f719294463916a3fc0e
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16dc490f100000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: stack-out-of-bounds in csd_lock_record+0xcb/0xe0 kernel/smp.c:118
Read of size 8 at addr ffffc90001727710 by task syz-executor.0/10721

CPU: 1 PID: 10721 Comm: syz-executor.0 Not tainted 5.8.0-rc3-next-20200703-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x18f/0x20d lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0x5/0x436 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 csd_lock_record+0xcb/0xe0 kernel/smp.c:118
 flush_smp_call_function_queue+0x285/0x730 kernel/smp.c:391
 __sysvec_call_function_single+0x98/0x490 arch/x86/kernel/smp.c:248
 asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:706
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 sysvec_call_function_single+0xe0/0x120 arch/x86/kernel/smp.c:243
 asm_sysvec_call_function_single+0x12/0x20 arch/x86/include/asm/idtentry.h:604
RIP: 0010:arch_local_irq_restore arch/x86/include/asm/paravirt.h:765 [inline]
RIP: 0010:__raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:160 [inline]
RIP: 0010:_raw_spin_unlock_irqrestore+0x8c/0xe0 kernel/locking/spinlock.c:191
Code: 48 c7 c0 00 ff b4 89 48 ba 00 00 00 00 00 fc ff df 48 c1 e8 03 80 3c 10 00 75 37 48 83 3d 9b 74 c8 01 00 74 22 48 89 df 57 9d <0f> 1f 44 00 00 bf 01 00 00 00 e8 95 fb 62 f9 65 8b 05 fe 73 15 78
RSP: 0018:ffffc900016e7558 EFLAGS: 00000282
RAX: 1ffffffff1369fe0 RBX: 0000000000000282 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: 0000000000000000 RDI: 0000000000000282
RBP: ffffffff8cb02508 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 1ffffffff19604a0
R13: 0000000000000000 R14: dead000000000100 R15: dffffc0000000000
 __debug_check_no_obj_freed lib/debugobjects.c:977 [inline]
 debug_check_no_obj_freed+0x20c/0x41c lib/debugobjects.c:998
 free_pages_prepare mm/page_alloc.c:1219 [inline]
 __free_pages_ok+0x20b/0xc90 mm/page_alloc.c:1471
 release_pages+0x5ec/0x17a0 mm/swap.c:880
 tlb_batch_pages_flush mm/mmu_gather.c:49 [inline]
 tlb_flush_mmu_free mm/mmu_gather.c:242 [inline]
 tlb_flush_mmu+0xe9/0x6b0 mm/mmu_gather.c:249
 zap_pte_range mm/memory.c:1155 [inline]
 zap_pmd_range mm/memory.c:1193 [inline]
 zap_pud_range mm/memory.c:1222 [inline]
 zap_p4d_range mm/memory.c:1243 [inline]
 unmap_page_range+0x1e22/0x2b20 mm/memory.c:1264
 unmap_single_vma+0x198/0x300 mm/memory.c:1309
 unmap_vmas+0x16f/0x2f0 mm/memory.c:1341
 exit_mmap+0x2b1/0x530 mm/mmap.c:3165
 __mmput+0x122/0x470 kernel/fork.c:1075
 mmput+0x53/0x60 kernel/fork.c:1096
 exit_mm kernel/exit.c:483 [inline]
 do_exit+0xa8f/0x2a40 kernel/exit.c:793
 do_group_exit+0x125/0x310 kernel/exit.c:904
 get_signal+0x40b/0x1ee0 kernel/signal.c:2743
 do_signal+0x82/0x2520 arch/x86/kernel/signal.c:810
 exit_to_usermode_loop arch/x86/entry/common.c:218 [inline]
 __prepare_exit_to_usermode+0x156/0x1f0 arch/x86/entry/common.c:252
 do_syscall_64+0x6c/0xe0 arch/x86/entry/common.c:376
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45cb29
Code: Bad RIP value.
RSP: 002b:00007fb154b96cf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: 0000000000000001 RBX: 000000000078bf08 RCX: 000000000045cb29
RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 000000000078bf0c
RBP: 000000000078bf00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000078bf0c
R13: 00007ffd3933f26f R14: 00007fb154b979c0 R15: 000000000078bf0c


Memory state around the buggy address:
 ffffc90001727600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffc90001727680: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00 00 00 00
>ffffc90001727700: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
                         ^
 ffffc90001727780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffc90001727800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches


* Re: KASAN: stack-out-of-bounds Read in csd_lock_record
  2020-07-03 23:31 KASAN: stack-out-of-bounds Read in csd_lock_record syzbot
@ 2020-07-04  0:48 ` syzbot
  2020-07-04 16:45 ` Paul E. McKenney
  2020-10-09  6:35 ` [tip: core/rcu] kernel/smp: Provide CSD lock timeout diagnostics tip-bot2 for Paul E. McKenney
  2 siblings, 0 replies; 9+ messages in thread
From: syzbot @ 2020-07-04  0:48 UTC
  To: bigeasy, linux-kernel, mingo, paulmck, peterz, syzkaller-bugs, tglx

syzbot has found a reproducer for the following crash on:

HEAD commit:    9e50b94b Add linux-next specific files for 20200703
git tree:       linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1224dc83100000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f99cc0faa1476ed6
dashboard link: https://syzkaller.appspot.com/bug?extid=0f719294463916a3fc0e
compiler:       gcc (GCC) 10.1.0-syz 20200507
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=170442d5100000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=162ef66d100000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: stack-out-of-bounds in csd_lock_record+0xd2/0xe0 kernel/smp.c:119
Read of size 8 at addr ffffc900016d75f8 by task swapper/1/0

CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.8.0-rc3-next-20200703-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 <IRQ>
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x18f/0x20d lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0x5/0x436 mm/kasan/report.c:383
 __kasan_report mm/kasan/report.c:513 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:530
 csd_lock_record+0xd2/0xe0 kernel/smp.c:119
 flush_smp_call_function_queue+0x285/0x730 kernel/smp.c:391
 __sysvec_call_function_single+0x98/0x490 arch/x86/kernel/smp.c:248
 asm_call_on_stack+0xf/0x20 arch/x86/entry/entry_64.S:706
 </IRQ>
 __run_on_irqstack arch/x86/include/asm/irq_stack.h:22 [inline]
 run_on_irqstack_cond arch/x86/include/asm/irq_stack.h:48 [inline]
 sysvec_call_function_single+0xe0/0x120 arch/x86/kernel/smp.c:243
 asm_sysvec_call_function_single+0x12/0x20 arch/x86/include/asm/idtentry.h:604
RIP: 0010:native_safe_halt+0xe/0x10 arch/x86/include/asm/irqflags.h:61
Code: ff 4c 89 ef e8 33 30 c7 f9 e9 8e fe ff ff 48 89 df e8 26 30 c7 f9 eb 8a cc cc cc cc e9 07 00 00 00 0f 00 2d 14 4b 5c 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d 04 4b 5c 00 f4 c3 cc cc 55 53 e8 c9
RSP: 0018:ffffc90000d3fd18 EFLAGS: 00000293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff8880a95f0340 RSI: ffffffff87ec78c8 RDI: ffffffff87ec789e
RBP: ffff88821af4d864 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88821af4d864
R13: 1ffff920001a7fad R14: ffff88821af4d865 R15: 0000000000000001
 arch_safe_halt arch/x86/include/asm/paravirt.h:150 [inline]
 acpi_safe_halt+0x8d/0x110 drivers/acpi/processor_idle.c:111
 acpi_idle_do_entry+0x15c/0x1b0 drivers/acpi/processor_idle.c:525
 acpi_idle_enter+0x3f9/0xab0 drivers/acpi/processor_idle.c:651
 cpuidle_enter_state+0xff/0x960 drivers/cpuidle/cpuidle.c:235
 cpuidle_enter+0x4a/0xa0 drivers/cpuidle/cpuidle.c:346
 call_cpuidle kernel/sched/idle.c:126 [inline]
 cpuidle_idle_call kernel/sched/idle.c:214 [inline]
 do_idle+0x431/0x6d0 kernel/sched/idle.c:276
 cpu_startup_entry+0x14/0x20 kernel/sched/idle.c:372
 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243


Memory state around the buggy address:
 ffffc900016d7480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffc900016d7500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffffc900016d7580: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 f3 f3 f3 f3
                                                                ^
 ffffc900016d7600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffc900016d7680: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================



* Re: KASAN: stack-out-of-bounds Read in csd_lock_record
  2020-07-03 23:31 KASAN: stack-out-of-bounds Read in csd_lock_record syzbot
  2020-07-04  0:48 ` syzbot
@ 2020-07-04 16:45 ` Paul E. McKenney
  2020-07-04 18:34   ` Dmitry Vyukov
  2020-10-09  6:35 ` [tip: core/rcu] kernel/smp: Provide CSD lock timeout diagnostics tip-bot2 for Paul E. McKenney
  2 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2020-07-04 16:45 UTC
  To: syzbot; +Cc: bigeasy, linux-kernel, mingo, peterz, syzkaller-bugs, tglx

On Fri, Jul 03, 2020 at 04:31:22PM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    9e50b94b Add linux-next specific files for 20200703
> git tree:       linux-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1024b405100000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=f99cc0faa1476ed6
> dashboard link: https://syzkaller.appspot.com/bug?extid=0f719294463916a3fc0e
> compiler:       gcc (GCC) 10.1.0-syz 20200507
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16dc490f100000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com

Good catch!  A call to csd_lock_record() was on the wrong side of a
call to csd_unlock().

But the fix is folded into another commit for bisectability reasons, so
"Reported-by" would not make sense.  I have instead added this to the
commit log:

[ paulmck: Fix for syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com ]
Link: https://lore.kernel.org/lkml/00000000000042f21905a991ecea@google.com
Link: https://lore.kernel.org/lkml/0000000000002ef21705a9933cf3@google.com

							Thanx, Paul
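
The ordering problem described above (csd_lock_record() on the wrong
side of csd_unlock()) can be sketched with a small user-space model.
This is not the kernel code: struct toy_csd, struct toy_record, and all
toy_* functions are hypothetical names made up for illustration. The
model assumes one plausible reading of the report: once the csd is
unlocked, the requester may return and its on-stack csd may go out of
scope, so anything read from the csd afterwards is suspect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for call_single_data_t; field names are illustrative. */
struct toy_csd {
	bool locked;		/* set by the requester, cleared by toy_unlock() */
	void (*func)(void *);
	void *info;
};

/* Toy stand-in for a per-CPU "current CSD" diagnostic record. */
struct toy_record {
	void (*func)(void *);
	void *info;
};

/*
 * Copy func/info for diagnostics.  Returns false when called after the
 * unlock; in this model that marks the point where the requester's
 * stack frame holding the csd may already be gone.
 */
static bool toy_record_csd(struct toy_record *rec, const struct toy_csd *csd)
{
	if (!csd->locked)
		return false;
	rec->func = csd->func;
	rec->info = csd->info;
	return true;
}

static void toy_unlock(struct toy_csd *csd)
{
	csd->locked = false;	/* requester may now reuse or drop the csd */
}

/* Wrong order: unlock first, then read the csd. */
static bool toy_handle_wrong_order(struct toy_record *rec, struct toy_csd *csd)
{
	toy_unlock(csd);
	return toy_record_csd(rec, csd);
}

/* Safer order: read everything needed, then unlock. */
static bool toy_handle_record_first(struct toy_record *rec, struct toy_csd *csd)
{
	bool ok = toy_record_csd(rec, csd);

	toy_unlock(csd);
	return ok;
}

static void toy_nop(void *unused)
{
	(void)unused;
}

/* Returns 0 when both orderings behave as described above. */
static int toy_demo(void)
{
	struct toy_record rec = { NULL, NULL };
	struct toy_csd a = { true, toy_nop, NULL };
	struct toy_csd b = { true, toy_nop, NULL };
	bool wrong = toy_handle_wrong_order(&rec, &a);
	bool right = toy_handle_record_first(&rec, &b);

	assert(!a.locked && !b.locked);	/* both paths end unlocked */
	return (!wrong && right && rec.func == toy_nop) ? 0 : 1;
}
```

In the model the late read is merely rejected; in the kernel there is
no such guard, which is presumably why KASAN flagged the access.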



* Re: KASAN: stack-out-of-bounds Read in csd_lock_record
  2020-07-04 16:45 ` Paul E. McKenney
@ 2020-07-04 18:34   ` Dmitry Vyukov
  2020-07-07 15:51     ` Dmitry Vyukov
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Vyukov @ 2020-07-04 18:34 UTC
  To: Paul E. McKenney
  Cc: syzbot, Sebastian Andrzej Siewior, LKML, Ingo Molnar,
	Peter Zijlstra, syzkaller-bugs, Thomas Gleixner

On Sat, Jul 4, 2020 at 6:45 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Fri, Jul 03, 2020 at 04:31:22PM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    9e50b94b Add linux-next specific files for 20200703
> > git tree:       linux-next
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1024b405100000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=f99cc0faa1476ed6
> > dashboard link: https://syzkaller.appspot.com/bug?extid=0f719294463916a3fc0e
> > compiler:       gcc (GCC) 10.1.0-syz 20200507
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16dc490f100000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com
>
> Good catch!  A call to csd_lock_record() was on the wrong side of a
> call to csd_unlock().

Thanks for taking a look.

> But the fix is folded into another commit for bisectability reasons, so
> "Reported-by" would not make sense.  I have instead added this to the
> commit log:
>
> [ paulmck: Fix for syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com ]
> Link: https://lore.kernel.org/lkml/00000000000042f21905a991ecea@google.com
> Link: https://lore.kernel.org/lkml/0000000000002ef21705a9933cf3@google.com

This should work; as far as I remember, syzbot looks for the email+hash
anywhere in the commit.
FWIW Tested-by can make sense as well.
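
A minimal sketch of the "email+hash anywhere in the commit" matching
mentioned above. This is not syzbot's actual implementation; the
function name commit_mentions_syzbot_bug() and the fixed-size buffer
are assumptions made for illustration. It only shows why a free-form
line in the commit log can work as well as a Reported-by tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if the commit message contains the
 * syzbot reporting address for the given bug hash, on any line and
 * regardless of which tag (if any) precedes it.
 */
static bool commit_mentions_syzbot_bug(const char *commit_msg,
				       const char *bug_hash)
{
	char needle[128];
	int n;

	assert(commit_msg != NULL && bug_hash != NULL);

	n = snprintf(needle, sizeof(needle),
		     "syzbot+%s@syzkaller.appspotmail.com", bug_hash);
	if (n < 0 || (size_t)n >= sizeof(needle))
		return false;	/* hash too long for this sketch's buffer */

	return strstr(commit_msg, needle) != NULL;
}
```

With this kind of substring match, the bracketed "[ paulmck: Fix for
syzbot+...@syzkaller.appspotmail.com ]" line would be recognized just
like a Reported-by or Tested-by tag.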

>                                                         Thanx, Paul
>
>


* Re: KASAN: stack-out-of-bounds Read in csd_lock_record
  2020-07-04 18:34   ` Dmitry Vyukov
@ 2020-07-07 15:51     ` Dmitry Vyukov
  2020-07-07 16:26       ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Vyukov @ 2020-07-07 15:51 UTC
  To: Paul E. McKenney
  Cc: syzbot, Sebastian Andrzej Siewior, LKML, Ingo Molnar,
	Peter Zijlstra, syzkaller-bugs, Thomas Gleixner

On Sat, Jul 4, 2020 at 8:34 PM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> On Sat, Jul 4, 2020 at 6:45 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Fri, Jul 03, 2020 at 04:31:22PM -0700, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:    9e50b94b Add linux-next specific files for 20200703
> > > git tree:       linux-next
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1024b405100000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=f99cc0faa1476ed6
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=0f719294463916a3fc0e
> > > compiler:       gcc (GCC) 10.1.0-syz 20200507
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16dc490f100000
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com
> >
> > Good catch!  A call to csd_lock_record() was on the wrong side of a
> > call to csd_unlock().
>
> Thanks for taking a look.
>
> > But the fix is folded into another commit for bisectability reasons, so
> > "Reported-by" would not make sense.  I have instead added this to the
> > commit log:
> >
> > [ paulmck: Fix for syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com ]
> > Link: https://lore.kernel.org/lkml/00000000000042f21905a991ecea@google.com
> > Link: https://lore.kernel.org/lkml/0000000000002ef21705a9933cf3@google.com
>
> This should work; as far as I remember, syzbot looks for the email+hash
> anywhere in the commit.
> FWIW Tested-by can make sense as well.


Paul, there is also a spike of stalls in smp_call_function;
see the top ones at:
https://syzkaller.appspot.com/upstream#open

Can these be caused by the same root cause?
I am not sure which trees the bug was/is present in... This bug seems
to happen only on linux-next and nowhere else. But these stalls happen
on mainline as well...



> >                                                         Thanx, Paul
> >
> >


* Re: KASAN: stack-out-of-bounds Read in csd_lock_record
  2020-07-07 15:51     ` Dmitry Vyukov
@ 2020-07-07 16:26       ` Paul E. McKenney
  2020-07-09 10:13         ` Dmitry Vyukov
  0 siblings, 1 reply; 9+ messages in thread
From: Paul E. McKenney @ 2020-07-07 16:26 UTC
  To: Dmitry Vyukov
  Cc: syzbot, Sebastian Andrzej Siewior, LKML, Ingo Molnar,
	Peter Zijlstra, syzkaller-bugs, Thomas Gleixner

On Tue, Jul 07, 2020 at 05:51:48PM +0200, Dmitry Vyukov wrote:
> On Sat, Jul 4, 2020 at 8:34 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> >
> > On Sat, Jul 4, 2020 at 6:45 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Fri, Jul 03, 2020 at 04:31:22PM -0700, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following crash on:
> > > >
> > > > HEAD commit:    9e50b94b Add linux-next specific files for 20200703
> > > > git tree:       linux-next
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1024b405100000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=f99cc0faa1476ed6
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=0f719294463916a3fc0e
> > > > compiler:       gcc (GCC) 10.1.0-syz 20200507
> > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16dc490f100000
> > > >
> > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > Reported-by: syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com
> > >
> > > Good catch!  A call to csd_lock_record() was on the wrong side of a
> > > call to csd_unlock().
> >
> > Thanks for taking a look.
> >
> > > But the fix is folded into another commit for bisectability reasons, so
> > > "Reported-by" would not make sense.  I have instead added this to the
> > > commit log:
> > >
> > > [ paulmck: Fix for syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com ]
> > > Link: https://lore.kernel.org/lkml/00000000000042f21905a991ecea@google.com
> > > Link: https://lore.kernel.org/lkml/0000000000002ef21705a9933cf3@google.com
> >
> > This should work; as far as I remember, syzbot looks for the email+hash
> > anywhere in the commit.
> > FWIW Tested-by can make sense as well.
> 
> Paul, there is also a spike of stalls in smp_call_function,
> if you look at the top ones at:
> https://syzkaller.appspot.com/upstream#open
> 
> Can these be caused by the same root cause?
> I am not sure which trees the bug was/is present in... This seems to only
> happen on linux-next and nowhere else. But these stalls equally happen
> on mainline...

I would be surprised, given that the csd_unlock() was before the faulting
reference.  But then again, I have been surprised before.
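
To make the ordering concrete, here is a minimal user-space sketch in C.
It is not the kernel code: the file-scope cur_csd* variables stand in for
the per-CPU ones, run_async_style() and bump() are hypothetical names, and
the memory barriers used in the kernel are left out.  The only point it
illustrates is that the recording step dereferences the csd, so it has to
run before the unlock, because once the lock flag is cleared the owner is
free to reuse or discard the csd (which may live on its stack).

```c
#include <stdatomic.h>
#include <stddef.h>

#define CSD_FLAG_LOCK 0x01u

typedef void (*call_func_t)(void *info);

/* Simplified stand-in for call_single_data_t. */
struct csd {
	call_func_t func;
	void *info;
	atomic_uint flags;
};

/* Stand-ins for the per-CPU debugging variables (assumption: one "CPU"). */
static struct csd *cur_csd;
static call_func_t cur_csd_func;
static void *cur_csd_info;

/* Record the request being handled, or pass NULL to erase the record. */
static void csd_record(struct csd *csd)
{
	if (!csd) {
		cur_csd = NULL;
		return;
	}
	/* These loads touch *csd, so the csd must still be locked here. */
	cur_csd_func = csd->func;
	cur_csd_info = csd->info;
	cur_csd = csd;
}

/* Clear the lock flag; after this the owner may reuse or free the csd. */
static void csd_unlock(struct csd *csd)
{
	atomic_fetch_and_explicit(&csd->flags, ~CSD_FLAG_LOCK,
				  memory_order_release);
}

/* Asynchronous-style handling: record, unlock, call, erase the record. */
static void run_async_style(struct csd *csd)
{
	call_func_t func = csd->func;	/* copy out before unlocking */
	void *info = csd->info;

	csd_record(csd);	/* must precede csd_unlock() */
	csd_unlock(csd);
	func(info);
	csd_record(NULL);	/* does not dereference the csd */
}

/* Hypothetical handler used for illustration: increments an int counter. */
static void bump(void *info)
{
	++*(int *)info;
}
```

Swapping the csd_record(csd) and csd_unlock(csd) calls in
run_async_style() gives the ordering that is unsafe when the csd can go
away right after the unlock.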

You aren't running scftorture with its longwait parameter set to a
non-zero value, are you?  In that case, stalls are expected behavior.
This is to support testing the CSD lock diagnostics in -rcu.  Which isn't
in mainline yet, so maybe I am asking a stupid question.

If these are repeatable, one thing to try is to build the kernel with
CSD_LOCK_WAIT_DEBUG=y.  This requires c6c67d89c059 ("smp: Add source and
destination CPUs to __call_single_data") and 216d15e0d870 ("kernel/smp:
Provide CSD lock timeout diagnostics") from the -rcu tree's "dev" branch.
This will dump out the smp_call_function() function that was to be
invoked, on the off-chance that the problem is something like lock
contention in that function.

							Thanx, Paul


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: KASAN: stack-out-of-bounds Read in csd_lock_record
  2020-07-07 16:26       ` Paul E. McKenney
@ 2020-07-09 10:13         ` Dmitry Vyukov
  2020-07-09 16:45           ` Paul E. McKenney
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Vyukov @ 2020-07-09 10:13 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: syzbot, Sebastian Andrzej Siewior, LKML, Ingo Molnar,
	Peter Zijlstra, syzkaller-bugs, Thomas Gleixner

On Tue, Jul 7, 2020 at 6:26 PM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Tue, Jul 07, 2020 at 05:51:48PM +0200, Dmitry Vyukov wrote:
> > On Sat, Jul 4, 2020 at 8:34 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > >
> > > On Sat, Jul 4, 2020 at 6:45 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > On Fri, Jul 03, 2020 at 04:31:22PM -0700, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot found the following crash on:
> > > > >
> > > > > HEAD commit:    9e50b94b Add linux-next specific files for 20200703
> > > > > git tree:       linux-next
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1024b405100000
> > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=f99cc0faa1476ed6
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=0f719294463916a3fc0e
> > > > > compiler:       gcc (GCC) 10.1.0-syz 20200507
> > > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16dc490f100000
> > > > >
> > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > > Reported-by: syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com
> > > >
> > > > Good catch!  A call to csd_lock_record() was on the wrong side of a
> > > > call to csd_unlock().
> > >
> > > Thanks for taking a look.
> > >
> > > > But the fix is folded into another commit for bisectability reasons, so
> > > > "Reported-by" would not make sense.  I have instead added this to the
> > > > commit log:
> > > >
> > > > [ paulmck: Fix for syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com ]
> > > > Link: https://lore.kernel.org/lkml/00000000000042f21905a991ecea@google.com
> > > > Link: https://lore.kernel.org/lkml/0000000000002ef21705a9933cf3@google.com
> > >
> > > This should work; as far as I remember, syzbot looks for the email+hash
> > > anywhere in the commit.
> > > FWIW Tested-by can make sense as well.
> >
> > Paul, there is also a spike of stalls in smp_call_function,
> > if you look at the top ones at:
> > https://syzkaller.appspot.com/upstream#open
> >
> > Can these be caused by the same root cause?
> > I am not sure which trees the bug was/is present in... This seems to only
> > happen on linux-next and nowhere else. But these stalls equally happen
> > on mainline...
>
> I would be surprised, given that the csd_unlock() was before the faulting
> reference.  But then again, I have been surprised before.

Yes, it seems unrelated.
It looks like something broke in the kernel recently, and now instead
of diagnosing a stall on one CPU, it diagnoses it as a stall in
smp_call_function on another CPU. This produces a large number of
assorted stall reports that are not very actionable...


> You aren't running scftorture with its longwait parameter set to a
> non-zero value, are you?  In that case, stalls are expected behavior.
> This is to support testing the CSD lock diagnostics in -rcu.  Which isn't
> in mainline yet, so maybe I am asking a stupid question.

Since I don't know what scftorture/longwait is, I guess I am not running it :)

> If these are repeatable, one thing to try is to build the kernel with
> CSD_LOCK_WAIT_DEBUG=y.  This requires c6c67d89c059 ("smp: Add source and
> destination CPUs to __call_single_data") and 216d15e0d870 ("kernel/smp:
> Provide CSD lock timeout diagnostics") from the -rcu tree's "dev" branch.
> This will dump out the smp_call_function() function that was to be
> invoked, on the off-chance that the problem is something like lock
> contention in that function.

Here are some with reproducers:
https://syzkaller.appspot.com/bug?id=8a1e95291152ce5afea43c103a1fd62a257fcf4b
https://syzkaller.appspot.com/bug?id=5e3ac329b6304aacc6304cfaab1a514bca12ce82
https://syzkaller.appspot.com/bug?id=a01b4478f89e19cee91531f7c2b7751f0caf8c0c
https://syzkaller.appspot.com/bug?id=e4caef9fc41d0c019c532a4257faec129699a42e

But the question is whether CSD_LOCK_WAIT_DEBUG=y is useful in
general. Should we enable it all the time?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: KASAN: stack-out-of-bounds Read in csd_lock_record
  2020-07-09 10:13         ` Dmitry Vyukov
@ 2020-07-09 16:45           ` Paul E. McKenney
  0 siblings, 0 replies; 9+ messages in thread
From: Paul E. McKenney @ 2020-07-09 16:45 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, Sebastian Andrzej Siewior, LKML, Ingo Molnar,
	Peter Zijlstra, syzkaller-bugs, Thomas Gleixner

On Thu, Jul 09, 2020 at 12:13:44PM +0200, Dmitry Vyukov wrote:
> On Tue, Jul 7, 2020 at 6:26 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Tue, Jul 07, 2020 at 05:51:48PM +0200, Dmitry Vyukov wrote:
> > > On Sat, Jul 4, 2020 at 8:34 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > > >
> > > > On Sat, Jul 4, 2020 at 6:45 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > >
> > > > > On Fri, Jul 03, 2020 at 04:31:22PM -0700, syzbot wrote:
> > > > > > Hello,
> > > > > >
> > > > > > syzbot found the following crash on:
> > > > > >
> > > > > > HEAD commit:    9e50b94b Add linux-next specific files for 20200703
> > > > > > git tree:       linux-next
> > > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1024b405100000
> > > > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=f99cc0faa1476ed6
> > > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=0f719294463916a3fc0e
> > > > > > compiler:       gcc (GCC) 10.1.0-syz 20200507
> > > > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=16dc490f100000
> > > > > >
> > > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > > > Reported-by: syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com
> > > > >
> > > > > Good catch!  A call to csd_lock_record() was on the wrong side of a
> > > > > call to csd_unlock().
> > > >
> > > > Thanks for taking a look.
> > > >
> > > > > But the fix is folded into another commit for bisectability reasons, so
> > > > > "Reported-by" would not make sense.  I have instead added this to the
> > > > > commit log:
> > > > >
> > > > > [ paulmck: Fix for syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com ]
> > > > > Link: https://lore.kernel.org/lkml/00000000000042f21905a991ecea@google.com
> > > > > Link: https://lore.kernel.org/lkml/0000000000002ef21705a9933cf3@google.com
> > > >
> > > > This should work; as far as I remember, syzbot looks for the email+hash
> > > > anywhere in the commit.
> > > > FWIW Tested-by can make sense as well.
> > >
> > > Paul, there is also a spike of stalls in smp_call_function,
> > > if you look at the top ones at:
> > > https://syzkaller.appspot.com/upstream#open
> > >
> > > Can these be caused by the same root cause?
> > > I am not sure which trees the bug was/is present in... This seems to only
> > > happen on linux-next and nowhere else. But these stalls equally happen
> > > on mainline...
> >
> > I would be surprised, given that the csd_unlock() was before the faulting
> > reference.  But then again, I have been surprised before.
> 
> Yes, it seems unrelated.
> It looks like something broke in the kernel recently, and now instead
> of diagnosing a stall on one CPU, it diagnoses it as a stall in
> smp_call_function on another CPU. This produces a large number of
> assorted stall reports that are not very actionable...
> 
> 
> > You aren't running scftorture with its longwait parameter set to a
> > non-zero value, are you?  In that case, stalls are expected behavior.
> > This is to support testing the CSD lock diagnostics in -rcu.  Which isn't
> > in mainline yet, so maybe I am asking a stupid question.
> 
> Since I don't know what scftorture/longwait is, I guess I am not running it :)
> 
> > If these are repeatable, one thing to try is to build the kernel with
> > CSD_LOCK_WAIT_DEBUG=y.  This requires c6c67d89c059 ("smp: Add source and
> > destination CPUs to __call_single_data") and 216d15e0d870 ("kernel/smp:
> > Provide CSD lock timeout diagnostics") from the -rcu tree's "dev" branch.
> > This will dump out the smp_call_function() function that was to be
> > invoked, on the off-chance that the problem is something like lock
> > contention in that function.
> 
> Here are some with reproducers:
> https://syzkaller.appspot.com/bug?id=8a1e95291152ce5afea43c103a1fd62a257fcf4b
> https://syzkaller.appspot.com/bug?id=5e3ac329b6304aacc6304cfaab1a514bca12ce82
> https://syzkaller.appspot.com/bug?id=a01b4478f89e19cee91531f7c2b7751f0caf8c0c
> https://syzkaller.appspot.com/bug?id=e4caef9fc41d0c019c532a4257faec129699a42e
> 
> But the question is whether CSD_LOCK_WAIT_DEBUG=y is useful in
> general. Should we enable it all the time?

The CSD_LOCK_WAIT_DEBUG functionality is quite new, so it is quite
possible that it is causing rather than detecting problems.  ;-)

But once it is stable, then yes, it might be quite generally useful.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tip: core/rcu] kernel/smp: Provide CSD lock timeout diagnostics
  2020-07-03 23:31 KASAN: stack-out-of-bounds Read in csd_lock_record syzbot
  2020-07-04  0:48 ` syzbot
  2020-07-04 16:45 ` Paul E. McKenney
@ 2020-10-09  6:35 ` tip-bot2 for Paul E. McKenney
  2 siblings, 0 replies; 9+ messages in thread
From: tip-bot2 for Paul E. McKenney @ 2020-10-09  6:35 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner,
	Sebastian Andrzej Siewior, Paul E. McKenney, x86, LKML

The following commit has been merged into the core/rcu branch of tip:

Commit-ID:     35feb60474bf4f7fa7840e14fc7fd344996b919d
Gitweb:        https://git.kernel.org/tip/35feb60474bf4f7fa7840e14fc7fd344996b919d
Author:        Paul E. McKenney <paulmck@kernel.org>
AuthorDate:    Tue, 30 Jun 2020 13:22:54 -07:00
Committer:     Paul E. McKenney <paulmck@kernel.org>
CommitterDate: Fri, 04 Sep 2020 11:52:50 -07:00

kernel/smp: Provide CSD lock timeout diagnostics

This commit causes csd_lock_wait() to emit diagnostics when a CPU
fails to respond quickly enough to one of the smp_call_function()
family of function calls.  These diagnostics are enabled by a new
CSD_LOCK_WAIT_DEBUG Kconfig option that depends on DEBUG_KERNEL.

This commit was inspired by an earlier patch by Josef Bacik.

[ paulmck: Fix for syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com ]
[ paulmck: Fix KASAN use-after-free issue reported by Qian Cai. ]
[ paulmck: Fix botched nr_cpu_ids comparison per Dan Carpenter. ]
[ paulmck: Apply Peter Zijlstra feedback. ]
Link: https://lore.kernel.org/lkml/00000000000042f21905a991ecea@google.com
Link: https://lore.kernel.org/lkml/0000000000002ef21705a9933cf3@google.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
 kernel/smp.c      | 132 ++++++++++++++++++++++++++++++++++++++++++++-
 lib/Kconfig.debug |  11 ++++-
 2 files changed, 141 insertions(+), 2 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 865a876..c5d3188 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -20,6 +20,9 @@
 #include <linux/sched.h>
 #include <linux/sched/idle.h>
 #include <linux/hypervisor.h>
+#include <linux/sched/clock.h>
+#include <linux/nmi.h>
+#include <linux/sched/debug.h>
 
 #include "smpboot.h"
 #include "sched/smp.h"
@@ -96,6 +99,103 @@ void __init call_function_init(void)
 	smpcfd_prepare_cpu(smp_processor_id());
 }
 
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+
+static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
+static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
+static DEFINE_PER_CPU(void *, cur_csd_info);
+
+#define CSD_LOCK_TIMEOUT (5ULL * NSEC_PER_SEC)
+atomic_t csd_bug_count = ATOMIC_INIT(0);
+
+/* Record current CSD work for current CPU, NULL to erase. */
+static void csd_lock_record(call_single_data_t *csd)
+{
+	if (!csd) {
+		smp_mb(); /* NULL cur_csd after unlock. */
+		__this_cpu_write(cur_csd, NULL);
+		return;
+	}
+	__this_cpu_write(cur_csd_func, csd->func);
+	__this_cpu_write(cur_csd_info, csd->info);
+	smp_wmb(); /* func and info before csd. */
+	__this_cpu_write(cur_csd, csd);
+	smp_mb(); /* Update cur_csd before function call. */
+		  /* Or before unlock, as the case may be. */
+}
+
+static __always_inline int csd_lock_wait_getcpu(call_single_data_t *csd)
+{
+	unsigned int csd_type;
+
+	csd_type = CSD_TYPE(csd);
+	if (csd_type == CSD_TYPE_ASYNC || csd_type == CSD_TYPE_SYNC)
+		return csd->dst; /* Other CSD_TYPE_ values might not have ->dst. */
+	return -1;
+}
+
+/*
+ * Complain if too much time spent waiting.  Note that only
+ * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
+ * so waiting on other types gets much less information.
+ */
+static __always_inline bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
+{
+	int cpu = -1;
+	int cpux;
+	bool firsttime;
+	u64 ts2, ts_delta;
+	call_single_data_t *cpu_cur_csd;
+	unsigned int flags = READ_ONCE(csd->flags);
+
+	if (!(flags & CSD_FLAG_LOCK)) {
+		if (!unlikely(*bug_id))
+			return true;
+		cpu = csd_lock_wait_getcpu(csd);
+		pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
+			 *bug_id, raw_smp_processor_id(), cpu);
+		return true;
+	}
+
+	ts2 = sched_clock();
+	ts_delta = ts2 - *ts1;
+	if (likely(ts_delta <= CSD_LOCK_TIMEOUT))
+		return false;
+
+	firsttime = !*bug_id;
+	if (firsttime)
+		*bug_id = atomic_inc_return(&csd_bug_count);
+	cpu = csd_lock_wait_getcpu(csd);
+	if (WARN_ONCE(cpu < 0 || cpu >= nr_cpu_ids, "%s: cpu = %d\n", __func__, cpu))
+		cpux = 0;
+	else
+		cpux = cpu;
+	cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */
+	pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n",
+		 firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts2 - ts0,
+		 cpu, csd->func, csd->info);
+	if (cpu_cur_csd && csd != cpu_cur_csd) {
+		pr_alert("\tcsd: CSD lock (#%d) handling prior %pS(%ps) request.\n",
+			 *bug_id, READ_ONCE(per_cpu(cur_csd_func, cpux)),
+			 READ_ONCE(per_cpu(cur_csd_info, cpux)));
+	} else {
+		pr_alert("\tcsd: CSD lock (#%d) %s.\n",
+			 *bug_id, !cpu_cur_csd ? "unresponsive" : "handling this request");
+	}
+	if (cpu >= 0) {
+		if (!trigger_single_cpu_backtrace(cpu))
+			dump_cpu_task(cpu);
+		if (!cpu_cur_csd) {
+			pr_alert("csd: Re-sending CSD lock (#%d) IPI from CPU#%02d to CPU#%02d\n", *bug_id, raw_smp_processor_id(), cpu);
+			arch_send_call_function_single_ipi(cpu);
+		}
+	}
+	dump_stack();
+	*ts1 = ts2;
+
+	return false;
+}
+
 /*
  * csd_lock/csd_unlock used to serialize access to per-cpu csd resources
  *
@@ -105,8 +205,28 @@ void __init call_function_init(void)
  */
 static __always_inline void csd_lock_wait(call_single_data_t *csd)
 {
+	int bug_id = 0;
+	u64 ts0, ts1;
+
+	ts1 = ts0 = sched_clock();
+	for (;;) {
+		if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id))
+			break;
+		cpu_relax();
+	}
+	smp_acquire__after_ctrl_dep();
+}
+
+#else
+static void csd_lock_record(call_single_data_t *csd)
+{
+}
+
+static __always_inline void csd_lock_wait(call_single_data_t *csd)
+{
 	smp_cond_load_acquire(&csd->flags, !(VAL & CSD_FLAG_LOCK));
 }
+#endif
 
 static __always_inline void csd_lock(call_single_data_t *csd)
 {
@@ -166,9 +286,11 @@ static int generic_exec_single(int cpu, call_single_data_t *csd)
 		 * We can unlock early even for the synchronous on-stack case,
 		 * since we're doing this from the same CPU..
 		 */
+		csd_lock_record(csd);
 		csd_unlock(csd);
 		local_irq_save(flags);
 		func(info);
+		csd_lock_record(NULL);
 		local_irq_restore(flags);
 		return 0;
 	}
@@ -268,8 +390,10 @@ static void flush_smp_call_function_queue(bool warn_cpu_offline)
 				entry = &csd_next->llist;
 			}
 
+			csd_lock_record(csd);
 			func(info);
 			csd_unlock(csd);
+			csd_lock_record(NULL);
 		} else {
 			prev = &csd->llist;
 		}
@@ -296,8 +420,10 @@ static void flush_smp_call_function_queue(bool warn_cpu_offline)
 				smp_call_func_t func = csd->func;
 				void *info = csd->info;
 
+				csd_lock_record(csd);
 				csd_unlock(csd);
 				func(info);
+				csd_lock_record(NULL);
 			} else if (type == CSD_TYPE_IRQ_WORK) {
 				irq_work_single(csd);
 			}
@@ -375,7 +501,8 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 
 	csd->func = func;
 	csd->info = info;
-#ifdef CONFIG_64BIT
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+	csd->src = smp_processor_id();
 	csd->dst = cpu;
 #endif
 
@@ -543,7 +670,8 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 			csd->flags |= CSD_TYPE_SYNC;
 		csd->func = func;
 		csd->info = info;
-#ifdef CONFIG_64BIT
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+		csd->src = smp_processor_id();
 		csd->dst = cpu;
 #endif
 		if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e068c3c..86a35fd 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1367,6 +1367,17 @@ config WW_MUTEX_SELFTEST
 	  Say M if you want these self tests to build as a module.
 	  Say N if you are unsure.
 
+config CSD_LOCK_WAIT_DEBUG
+	bool "Debugging for csd_lock_wait(), called from smp_call_function*()"
+	depends on DEBUG_KERNEL
+	depends on 64BIT
+	default n
+	help
+	  This option enables debug prints when CPUs are slow to respond
+	  to the smp_call_function*() IPI wrappers.  These debug prints
+	  include the IPI handler function currently executing (if any)
+	  and relevant stack traces.
+
 endmenu # lock debugging
 
 config TRACE_IRQFLAGS

^ permalink raw reply related	[flat|nested] 9+ messages in thread
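
The timeout logic in csd_lock_wait_toolong() above can be easier to follow
in isolation.  Below is a rough user-space sketch in C, under stated
assumptions: wait_toolong() and next_bug_id are hypothetical names, the
current time is passed in as "now" instead of being read from
sched_clock(), the lock state is a plain bool, and the backtraces and IPI
re-send are omitted.  It keeps only the bookkeeping: ts0 is when the wait
began, *ts1 is when the last complaint was printed, and *bug_id stays zero
until the first complaint.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define LOCK_TIMEOUT_NS (5ULL * 1000000000ULL)	/* 5 seconds, as in the patch */

/*
 * Returns true when the wait is over (lock released), false to keep
 * spinning.  Complains once per timeout interval while the lock is held.
 */
static bool wait_toolong(bool locked, uint64_t now, uint64_t ts0,
			 uint64_t *ts1, int *bug_id, int *next_bug_id)
{
	bool firsttime;

	if (!locked) {
		/* Only mention the release if we complained earlier. */
		if (*bug_id)
			printf("lock (#%d) got unstuck\n", *bug_id);
		return true;
	}

	/* Still within the current interval: stay quiet. */
	if (now - *ts1 <= LOCK_TIMEOUT_NS)
		return false;

	/* First complaint for this wait allocates a new report number. */
	firsttime = !*bug_id;
	if (firsttime)
		*bug_id = ++*next_bug_id;

	printf("%s non-responsive lock (#%d), waiting %llu ns\n",
	       firsttime ? "Detected" : "Continued", *bug_id,
	       (unsigned long long)(now - ts0));

	*ts1 = now;	/* restart the interval, keep ts0 for the total */
	return false;
}
```

A caller would loop on this function with a relax/yield step between
calls, much as csd_lock_wait() does in the patch.  Resetting *ts1 but not
ts0 is what yields one report per interval while still printing the total
time waited.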

end of thread, other threads:[~2020-10-09  6:39 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
2020-07-03 23:31 KASAN: stack-out-of-bounds Read in csd_lock_record syzbot
2020-07-04  0:48 ` syzbot
2020-07-04 16:45 ` Paul E. McKenney
2020-07-04 18:34   ` Dmitry Vyukov
2020-07-07 15:51     ` Dmitry Vyukov
2020-07-07 16:26       ` Paul E. McKenney
2020-07-09 10:13         ` Dmitry Vyukov
2020-07-09 16:45           ` Paul E. McKenney
2020-10-09  6:35 ` [tip: core/rcu] kernel/smp: Provide CSD lock timeout diagnostics tip-bot2 for Paul E. McKenney
