* BUG: soft lockup in kvm_vm_ioctl
@ 2019-05-01 14:36 ` syzbot
  0 siblings, 0 replies; 18+ messages in thread
From: syzbot @ 2019-05-01 14:36 UTC (permalink / raw)
  To: adrian.hunter, davem, dedekind1, ebiggers, jbaron, jpoimboe,
	linux-kernel, linux-mtd, luto, mingo, peterz, richard, riel,
	rostedt, syzkaller-bugs, tglx

Hello,

syzbot found the following crash on:

HEAD commit:    baf76f0c slip: make slhc_free() silently accept an error p..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1407f57f200000
kernel config:  https://syzkaller.appspot.com/x/.config?x=a42d110b47dd6b36
dashboard link: https://syzkaller.appspot.com/bug?extid=8d9bb6157e7b379f740e
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1266a588a00000

The bug was bisected to:

commit 252153ba518ac0bcde6b7152c63380d4415bfe5d
Author: Eric Biggers <ebiggers@google.com>
Date:   Wed Nov 29 20:43:17 2017 +0000

     ubifs: switch to fscrypt_prepare_setattr()

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1448f588a00000
final crash:    https://syzkaller.appspot.com/x/report.txt?x=1648f588a00000
console output: https://syzkaller.appspot.com/x/log.txt?x=1248f588a00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+8d9bb6157e7b379f740e@syzkaller.appspotmail.com
Fixes: 252153ba518a ("ubifs: switch to fscrypt_prepare_setattr()")

watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.3:22023]
Modules linked in:
irq event stamp: 26556
hardirqs last  enabled at (26555): [<ffffffff81006673>] trace_hardirqs_on_thunk+0x1a/0x1c
hardirqs last disabled at (26556): [<ffffffff8100668f>] trace_hardirqs_off_thunk+0x1a/0x1c
softirqs last  enabled at (596): [<ffffffff87400662>] __do_softirq+0x662/0x95a kernel/softirq.c:320
softirqs last disabled at (517): [<ffffffff8144e4e0>] invoke_softirq kernel/softirq.c:374 [inline]
softirqs last disabled at (517): [<ffffffff8144e4e0>] irq_exit+0x180/0x1d0 kernel/softirq.c:414
CPU: 0 PID: 22023 Comm: syz-executor.3 Not tainted 5.1.0-rc6+ #89
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:csd_lock_wait kernel/smp.c:108 [inline]
RIP: 0010:smp_call_function_single+0x13e/0x420 kernel/smp.c:302
Code: 00 48 8b 4c 24 08 48 8b 54 24 10 48 8d 74 24 40 8b 7c 24 1c e8 23 fa ff ff 41 89 c5 eb 07 e8 e9 87 0a 00 f3 90 44 8b 64 24 58 <31> ff 41 83 e4 01 44 89 e6 e8 54 89 0a 00 45 85 e4 75 e1 e8 ca 87
RSP: 0018:ffff88809277f3e0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13
RAX: ffff8880a8bfc040 RBX: 1ffff110124efe80 RCX: ffffffff8166051c
RDX: 0000000000000000 RSI: ffffffff81660507 RDI: 0000000000000005
RBP: ffff88809277f4b8 R08: ffff8880a8bfc040 R09: ffffed1015d25be9
R10: ffffed1015d25be8 R11: ffff8880ae92df47 R12: 0000000000000003
R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
FS:  00007fd569980700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fd56997e178 CR3: 00000000a4fd2000 CR4: 00000000001426f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
  smp_call_function_many+0x750/0x8c0 kernel/smp.c:434
  smp_call_function+0x42/0x90 kernel/smp.c:492
  on_each_cpu+0x31/0x200 kernel/smp.c:602
  text_poke_bp+0x107/0x19b arch/x86/kernel/alternative.c:821
  __jump_label_transform+0x263/0x330 arch/x86/kernel/jump_label.c:91
  arch_jump_label_transform+0x2b/0x40 arch/x86/kernel/jump_label.c:99
  __jump_label_update+0x16a/0x210 kernel/jump_label.c:389
  jump_label_update kernel/jump_label.c:752 [inline]
  jump_label_update+0x1ce/0x3d0 kernel/jump_label.c:731
  static_key_slow_inc_cpuslocked+0x1c1/0x250 kernel/jump_label.c:129
  static_key_slow_inc+0x1b/0x30 kernel/jump_label.c:144
  kvm_arch_vcpu_init+0x6b7/0x870 arch/x86/kvm/x86.c:9068
  kvm_vcpu_init+0x272/0x370 arch/x86/kvm/../../../virt/kvm/kvm_main.c:320
  vmx_create_vcpu+0x191/0x2540 arch/x86/kvm/vmx/vmx.c:6577
  kvm_arch_vcpu_create+0x80/0x120 arch/x86/kvm/x86.c:8755
  kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569 [inline]
  kvm_vm_ioctl+0x5ce/0x19c0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3105
  vfs_ioctl fs/ioctl.c:46 [inline]
  file_ioctl fs/ioctl.c:509 [inline]
  do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
  __do_sys_ioctl fs/ioctl.c:720 [inline]
  __se_sys_ioctl fs/ioctl.c:718 [inline]
  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x458da9
Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fd56997fc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458da9
RDX: 0000000000000000 RSI: 000000000000ae41 RDI: 0000000000000005
RBP: 000000000073bfa0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd5699806d4
R13: 00000000004c1905 R14: 00000000004d40d0 R15: 00000000ffffffff
Sending NMI from CPU 0 to CPUs 1:


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

* Re: BUG: soft lockup in kvm_vm_ioctl
  2019-05-01 14:36 ` syzbot
@ 2019-05-02  2:34   ` Eric Biggers
  -1 siblings, 0 replies; 18+ messages in thread
From: Eric Biggers @ 2019-05-02  2:34 UTC (permalink / raw)
  To: syzbot, Dmitry Vyukov, kvm
  Cc: adrian.hunter, davem, dedekind1, jbaron, jpoimboe, linux-kernel,
	linux-mtd, luto, mingo, peterz, richard, riel, rostedt,
	syzkaller-bugs, tglx

On Wed, May 01, 2019 at 07:36:05AM -0700, syzbot wrote:
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    baf76f0c slip: make slhc_free() silently accept an error p..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1407f57f200000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=a42d110b47dd6b36
> dashboard link: https://syzkaller.appspot.com/bug?extid=8d9bb6157e7b379f740e
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1266a588a00000
> 
> The bug was bisected to:
> 
> commit 252153ba518ac0bcde6b7152c63380d4415bfe5d
> Author: Eric Biggers <ebiggers@google.com>
> Date:   Wed Nov 29 20:43:17 2017 +0000
> 
>     ubifs: switch to fscrypt_prepare_setattr()
> 
> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1448f588a00000
> final crash:    https://syzkaller.appspot.com/x/report.txt?x=1648f588a00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=1248f588a00000
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+8d9bb6157e7b379f740e@syzkaller.appspotmail.com
> Fixes: 252153ba518a ("ubifs: switch to fscrypt_prepare_setattr()")
> 
> watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.3:22023]
> Modules linked in:
> irq event stamp: 26556
> hardirqs last  enabled at (26555): [<ffffffff81006673>]
> trace_hardirqs_on_thunk+0x1a/0x1c
> hardirqs last disabled at (26556): [<ffffffff8100668f>]
> trace_hardirqs_off_thunk+0x1a/0x1c
> softirqs last  enabled at (596): [<ffffffff87400662>]
> __do_softirq+0x662/0x95a kernel/softirq.c:320
> softirqs last disabled at (517): [<ffffffff8144e4e0>] invoke_softirq
> kernel/softirq.c:374 [inline]
> softirqs last disabled at (517): [<ffffffff8144e4e0>] irq_exit+0x180/0x1d0
> kernel/softirq.c:414
> CPU: 0 PID: 22023 Comm: syz-executor.3 Not tainted 5.1.0-rc6+ #89
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> RIP: 0010:csd_lock_wait kernel/smp.c:108 [inline]
> RIP: 0010:smp_call_function_single+0x13e/0x420 kernel/smp.c:302
> Code: 00 48 8b 4c 24 08 48 8b 54 24 10 48 8d 74 24 40 8b 7c 24 1c e8 23 fa
> ff ff 41 89 c5 eb 07 e8 e9 87 0a 00 f3 90 44 8b 64 24 58 <31> ff 41 83 e4 01
> 44 89 e6 e8 54 89 0a 00 45 85 e4 75 e1 e8 ca 87
> RSP: 0018:ffff88809277f3e0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13
> RAX: ffff8880a8bfc040 RBX: 1ffff110124efe80 RCX: ffffffff8166051c
> RDX: 0000000000000000 RSI: ffffffff81660507 RDI: 0000000000000005
> RBP: ffff88809277f4b8 R08: ffff8880a8bfc040 R09: ffffed1015d25be9
> R10: ffffed1015d25be8 R11: ffff8880ae92df47 R12: 0000000000000003
> R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
> FS:  00007fd569980700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fd56997e178 CR3: 00000000a4fd2000 CR4: 00000000001426f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  smp_call_function_many+0x750/0x8c0 kernel/smp.c:434
>  smp_call_function+0x42/0x90 kernel/smp.c:492
>  on_each_cpu+0x31/0x200 kernel/smp.c:602
>  text_poke_bp+0x107/0x19b arch/x86/kernel/alternative.c:821
>  __jump_label_transform+0x263/0x330 arch/x86/kernel/jump_label.c:91
>  arch_jump_label_transform+0x2b/0x40 arch/x86/kernel/jump_label.c:99
>  __jump_label_update+0x16a/0x210 kernel/jump_label.c:389
>  jump_label_update kernel/jump_label.c:752 [inline]
>  jump_label_update+0x1ce/0x3d0 kernel/jump_label.c:731
>  static_key_slow_inc_cpuslocked+0x1c1/0x250 kernel/jump_label.c:129
>  static_key_slow_inc+0x1b/0x30 kernel/jump_label.c:144
>  kvm_arch_vcpu_init+0x6b7/0x870 arch/x86/kvm/x86.c:9068
>  kvm_vcpu_init+0x272/0x370 arch/x86/kvm/../../../virt/kvm/kvm_main.c:320
>  vmx_create_vcpu+0x191/0x2540 arch/x86/kvm/vmx/vmx.c:6577
>  kvm_arch_vcpu_create+0x80/0x120 arch/x86/kvm/x86.c:8755
>  kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569
> [inline]
>  kvm_vm_ioctl+0x5ce/0x19c0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3105
>  vfs_ioctl fs/ioctl.c:46 [inline]
>  file_ioctl fs/ioctl.c:509 [inline]
>  do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
>  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
>  __do_sys_ioctl fs/ioctl.c:720 [inline]
>  __se_sys_ioctl fs/ioctl.c:718 [inline]
>  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
>  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x458da9
> Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff
> 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007fd56997fc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458da9
> RDX: 0000000000000000 RSI: 000000000000ae41 RDI: 0000000000000005
> RBP: 000000000073bfa0 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd5699806d4
> R13: 00000000004c1905 R14: 00000000004d40d0 R15: 00000000ffffffff
> Sending NMI from CPU 0 to CPUs 1:
> 
> 
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
> 
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches
> 

Can the KVM maintainers take a look at this?  This doesn't have anything to do
with my commit that syzbot bisected it to.

+Dmitry, statistics lesson: if a crash occurs only 1 in 10 times, as was the
case here, then often it will happen 0 in 10 times by chance.  syzbot needs to
run the reproducer more times if it isn't working reliably.  Otherwise it ends
up blaming some random commit.
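
Concretely: if the reproducer triggers the crash with probability
p = 0.1, then the chance of seeing zero crashes in N = 10 runs is

    (1 - p)^N = 0.9^10 ≈ 0.35

so about one run-of-10 in three shows nothing at all by pure chance,
and across the dozen or so steps of a bisection a false "does not
reproduce" verdict somewhere becomes quite likely.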

I'm also curious how syzbot found the list of people to send this to, as it
seems very random.  This should obviously have gone to the kvm mailing list, but
it wasn't sent there; I had to manually add it.

- Eric

* Re: BUG: soft lockup in kvm_vm_ioctl
  2019-05-02  2:34   ` Eric Biggers
@ 2019-05-02  3:10     ` Steven Rostedt
  -1 siblings, 0 replies; 18+ messages in thread
From: Steven Rostedt @ 2019-05-02  3:10 UTC (permalink / raw)
  To: Eric Biggers
  Cc: syzbot, Dmitry Vyukov, kvm, adrian.hunter, davem, dedekind1,
	jbaron, jpoimboe, linux-kernel, linux-mtd, luto, mingo, peterz,
	richard, riel, syzkaller-bugs, tglx

On Wed, 1 May 2019 19:34:27 -0700
Eric Biggers <ebiggers@kernel.org> wrote:

> > Call Trace:
> >  smp_call_function_many+0x750/0x8c0 kernel/smp.c:434
> >  smp_call_function+0x42/0x90 kernel/smp.c:492
> >  on_each_cpu+0x31/0x200 kernel/smp.c:602
> >  text_poke_bp+0x107/0x19b arch/x86/kernel/alternative.c:821
> >  __jump_label_transform+0x263/0x330 arch/x86/kernel/jump_label.c:91
> >  arch_jump_label_transform+0x2b/0x40 arch/x86/kernel/jump_label.c:99
> >  __jump_label_update+0x16a/0x210 kernel/jump_label.c:389
> >  jump_label_update kernel/jump_label.c:752 [inline]
> >  jump_label_update+0x1ce/0x3d0 kernel/jump_label.c:731
> >  static_key_slow_inc_cpuslocked+0x1c1/0x250 kernel/jump_label.c:129
> >  static_key_slow_inc+0x1b/0x30 kernel/jump_label.c:144
> >  kvm_arch_vcpu_init+0x6b7/0x870 arch/x86/kvm/x86.c:9068
> >  kvm_vcpu_init+0x272/0x370 arch/x86/kvm/../../../virt/kvm/kvm_main.c:320
> >  vmx_create_vcpu+0x191/0x2540 arch/x86/kvm/vmx/vmx.c:6577
> >  kvm_arch_vcpu_create+0x80/0x120 arch/x86/kvm/x86.c:8755
> >  kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569
> > [inline]
> >  kvm_vm_ioctl+0x5ce/0x19c0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3105
> >  vfs_ioctl fs/ioctl.c:46 [inline]
> >  file_ioctl fs/ioctl.c:509 [inline]
> >  do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
> >  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
> >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe

> 
> I'm also curious how syzbot found the list of people to send this to, as it
> seems very random.  This should obviously have gone to the kvm mailing list, but
> it wasn't sent there; I had to manually add it.

My guess is that it went down the call stack, and picked those that
deal with the functions that are listed at the deepest part of the
stack. kvm doesn't appear until 12 functions up from the crash. It
probably stopped its search before that.

-- Steve

* Re: BUG: soft lockup in kvm_vm_ioctl
  2019-05-02  3:10     ` Steven Rostedt
@ 2019-05-08 11:23       ` Dmitry Vyukov
  -1 siblings, 0 replies; 18+ messages in thread
From: Dmitry Vyukov @ 2019-05-08 11:23 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Eric Biggers, syzbot, KVM list, adrian.hunter, David Miller,
	Artem Bityutskiy, jbaron, Josh Poimboeuf, LKML, linux-mtd,
	Andy Lutomirski, Ingo Molnar, Peter Zijlstra, Richard Weinberger,
	Rik van Riel, syzkaller-bugs, Thomas Gleixner

From: Steven Rostedt <rostedt@goodmis.org>
Date: Thu, May 2, 2019 at 5:10 AM
To: Eric Biggers
Cc: syzbot, Dmitry Vyukov, <kvm@vger.kernel.org>,
<adrian.hunter@intel.com>, <davem@davemloft.net>,
<dedekind1@gmail.com>, <jbaron@redhat.com>, <jpoimboe@redhat.com>,
<linux-kernel@vger.kernel.org>, <linux-mtd@lists.infradead.org>,
<luto@kernel.org>, <mingo@kernel.org>, <peterz@infradead.org>,
<richard@nod.at>, <riel@surriel.com>,
<syzkaller-bugs@googlegroups.com>, <tglx@linutronix.de>

> On Wed, 1 May 2019 19:34:27 -0700
> Eric Biggers <ebiggers@kernel.org> wrote:
>
> > > Call Trace:
> > >  smp_call_function_many+0x750/0x8c0 kernel/smp.c:434
> > >  smp_call_function+0x42/0x90 kernel/smp.c:492
> > >  on_each_cpu+0x31/0x200 kernel/smp.c:602
> > >  text_poke_bp+0x107/0x19b arch/x86/kernel/alternative.c:821
> > >  __jump_label_transform+0x263/0x330 arch/x86/kernel/jump_label.c:91
> > >  arch_jump_label_transform+0x2b/0x40 arch/x86/kernel/jump_label.c:99
> > >  __jump_label_update+0x16a/0x210 kernel/jump_label.c:389
> > >  jump_label_update kernel/jump_label.c:752 [inline]
> > >  jump_label_update+0x1ce/0x3d0 kernel/jump_label.c:731
> > >  static_key_slow_inc_cpuslocked+0x1c1/0x250 kernel/jump_label.c:129
> > >  static_key_slow_inc+0x1b/0x30 kernel/jump_label.c:144
> > >  kvm_arch_vcpu_init+0x6b7/0x870 arch/x86/kvm/x86.c:9068
> > >  kvm_vcpu_init+0x272/0x370 arch/x86/kvm/../../../virt/kvm/kvm_main.c:320
> > >  vmx_create_vcpu+0x191/0x2540 arch/x86/kvm/vmx/vmx.c:6577
> > >  kvm_arch_vcpu_create+0x80/0x120 arch/x86/kvm/x86.c:8755
> > >  kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569
> > > [inline]
> > >  kvm_vm_ioctl+0x5ce/0x19c0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3105
> > >  vfs_ioctl fs/ioctl.c:46 [inline]
> > >  file_ioctl fs/ioctl.c:509 [inline]
> > >  do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
> > >  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
> > >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> > >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> > >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> > >  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
> > >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> >
> > I'm also curious how syzbot found the list of people to send this to, as it
> > seems very random.  This should obviously have gone to the kvm mailing list, but
> > it wasn't sent there; I had to manually add it.
>
> My guess is that it went down the call stack, and picked those that
> deal with the functions that are listed at the deepest part of the
> stack. kvm doesn't appear until 12 functions up from the crash. It
> probably stopped its search before that.

Hi,

What we do now is the following: we take all file names in the report,
from top to bottom, and apply a blacklist that filters out utility
functions and bug-detection facilities:
https://github.com/google/syzkaller/blob/master/pkg/report/linux.go#L59-L89
The first file name that is not blacklisted is passed to get_maintainers.pl.
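
Roughly, the selection loop looks like this (a simplified sketch in Go;
the blacklist patterns below are made up for illustration, the real
regexp list is the one in the linux.go link above):

package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Illustrative blacklist: frames in these files are generic utility or
// bug-detection code and never identify the guilty subsystem.
var blacklist = []*regexp.Regexp{
	regexp.MustCompile(`^kernel/smp\.c`),
	regexp.MustCompile(`^kernel/softirq\.c`),
	regexp.MustCompile(`^kernel/jump_label\.c`),
	regexp.MustCompile(`^arch/x86/kernel/`),
	regexp.MustCompile(`^mm/kasan/`),
}

// Matches "path/to/file.c:123" inside a stack frame.
var srcFile = regexp.MustCompile(`([a-zA-Z0-9_./-]+\.[ch]):[0-9]+`)

// guiltyFile scans the report top to bottom and returns the first
// source file that is not blacklisted; that file is then fed to
// get_maintainers.pl to build the recipient list.
func guiltyFile(report string) string {
	for _, line := range strings.Split(report, "\n") {
		m := srcFile.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		blacklisted := false
		for _, re := range blacklist {
			if re.MatchString(m[1]) {
				blacklisted = true
				break
			}
		}
		if !blacklisted {
			return m[1]
		}
	}
	return ""
}

func main() {
	report := `smp_call_function_many+0x750/0x8c0 kernel/smp.c:434
text_poke_bp+0x107/0x19b arch/x86/kernel/alternative.c:821
jump_label_update+0x1ce/0x3d0 kernel/jump_label.c:731
kvm_arch_vcpu_init+0x6b7/0x870 arch/x86/kvm/x86.c:9068`
	fmt.Println(guiltyFile(report)) // prints "arch/x86/kvm/x86.c"
}

With a blacklist like this the frames above kvm_arch_vcpu_init are all
skipped and the report lands on arch/x86/kvm/x86.c; how sensible the
recipient list ends up being therefore depends on how well the
blacklist covers the utility frames at the top of the stack.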

* Re: BUG: soft lockup in kvm_vm_ioctl
  2019-05-02  2:34   ` Eric Biggers
@ 2019-05-08 11:25     ` Dmitry Vyukov
  -1 siblings, 0 replies; 18+ messages in thread
From: Dmitry Vyukov @ 2019-05-08 11:25 UTC (permalink / raw)
  To: Eric Biggers
  Cc: syzbot, KVM list, adrian.hunter, David Miller, Artem Bityutskiy,
	jbaron, Josh Poimboeuf, LKML, linux-mtd, Andy Lutomirski,
	Ingo Molnar, Peter Zijlstra, Richard Weinberger, Rik van Riel,
	Steven Rostedt, syzkaller-bugs, Thomas Gleixner

From: Eric Biggers <ebiggers@kernel.org>
Date: Thu, May 2, 2019 at 4:34 AM
To: syzbot, Dmitry Vyukov, <kvm@vger.kernel.org>
Cc: <adrian.hunter@intel.com>, <davem@davemloft.net>,
<dedekind1@gmail.com>, <jbaron@redhat.com>, <jpoimboe@redhat.com>,
<linux-kernel@vger.kernel.org>, <linux-mtd@lists.infradead.org>,
<luto@kernel.org>, <mingo@kernel.org>, <peterz@infradead.org>,
<richard@nod.at>, <riel@surriel.com>, <rostedt@goodmis.org>,
<syzkaller-bugs@googlegroups.com>, <tglx@linutronix.de>

> On Wed, May 01, 2019 at 07:36:05AM -0700, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    baf76f0c slip: make slhc_free() silently accept an error p..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1407f57f200000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=a42d110b47dd6b36
> > dashboard link: https://syzkaller.appspot.com/bug?extid=8d9bb6157e7b379f740e
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1266a588a00000
> >
> > The bug was bisected to:
> >
> > commit 252153ba518ac0bcde6b7152c63380d4415bfe5d
> > Author: Eric Biggers <ebiggers@google.com>
> > Date:   Wed Nov 29 20:43:17 2017 +0000
> >
> >     ubifs: switch to fscrypt_prepare_setattr()
> >
> > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1448f588a00000
> > final crash:    https://syzkaller.appspot.com/x/report.txt?x=1648f588a00000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1248f588a00000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+8d9bb6157e7b379f740e@syzkaller.appspotmail.com
> > Fixes: 252153ba518a ("ubifs: switch to fscrypt_prepare_setattr()")
> >
> > watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.3:22023]
> > Modules linked in:
> > irq event stamp: 26556
> > hardirqs last  enabled at (26555): [<ffffffff81006673>]
> > trace_hardirqs_on_thunk+0x1a/0x1c
> > hardirqs last disabled at (26556): [<ffffffff8100668f>]
> > trace_hardirqs_off_thunk+0x1a/0x1c
> > softirqs last  enabled at (596): [<ffffffff87400662>]
> > __do_softirq+0x662/0x95a kernel/softirq.c:320
> > softirqs last disabled at (517): [<ffffffff8144e4e0>] invoke_softirq
> > kernel/softirq.c:374 [inline]
> > softirqs last disabled at (517): [<ffffffff8144e4e0>] irq_exit+0x180/0x1d0
> > kernel/softirq.c:414
> > CPU: 0 PID: 22023 Comm: syz-executor.3 Not tainted 5.1.0-rc6+ #89
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > RIP: 0010:csd_lock_wait kernel/smp.c:108 [inline]
> > RIP: 0010:smp_call_function_single+0x13e/0x420 kernel/smp.c:302
> > Code: 00 48 8b 4c 24 08 48 8b 54 24 10 48 8d 74 24 40 8b 7c 24 1c e8 23 fa
> > ff ff 41 89 c5 eb 07 e8 e9 87 0a 00 f3 90 44 8b 64 24 58 <31> ff 41 83 e4 01
> > 44 89 e6 e8 54 89 0a 00 45 85 e4 75 e1 e8 ca 87
> > RSP: 0018:ffff88809277f3e0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13
> > RAX: ffff8880a8bfc040 RBX: 1ffff110124efe80 RCX: ffffffff8166051c
> > RDX: 0000000000000000 RSI: ffffffff81660507 RDI: 0000000000000005
> > RBP: ffff88809277f4b8 R08: ffff8880a8bfc040 R09: ffffed1015d25be9
> > R10: ffffed1015d25be8 R11: ffff8880ae92df47 R12: 0000000000000003
> > R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
> > FS:  00007fd569980700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007fd56997e178 CR3: 00000000a4fd2000 CR4: 00000000001426f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  smp_call_function_many+0x750/0x8c0 kernel/smp.c:434
> >  smp_call_function+0x42/0x90 kernel/smp.c:492
> >  on_each_cpu+0x31/0x200 kernel/smp.c:602
> >  text_poke_bp+0x107/0x19b arch/x86/kernel/alternative.c:821
> >  __jump_label_transform+0x263/0x330 arch/x86/kernel/jump_label.c:91
> >  arch_jump_label_transform+0x2b/0x40 arch/x86/kernel/jump_label.c:99
> >  __jump_label_update+0x16a/0x210 kernel/jump_label.c:389
> >  jump_label_update kernel/jump_label.c:752 [inline]
> >  jump_label_update+0x1ce/0x3d0 kernel/jump_label.c:731
> >  static_key_slow_inc_cpuslocked+0x1c1/0x250 kernel/jump_label.c:129
> >  static_key_slow_inc+0x1b/0x30 kernel/jump_label.c:144
> >  kvm_arch_vcpu_init+0x6b7/0x870 arch/x86/kvm/x86.c:9068
> >  kvm_vcpu_init+0x272/0x370 arch/x86/kvm/../../../virt/kvm/kvm_main.c:320
> >  vmx_create_vcpu+0x191/0x2540 arch/x86/kvm/vmx/vmx.c:6577
> >  kvm_arch_vcpu_create+0x80/0x120 arch/x86/kvm/x86.c:8755
> >  kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569
> > [inline]
> >  kvm_vm_ioctl+0x5ce/0x19c0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3105
> >  vfs_ioctl fs/ioctl.c:46 [inline]
> >  file_ioctl fs/ioctl.c:509 [inline]
> >  do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
> >  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
> >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x458da9
> > Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
> > 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff
> > 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > RSP: 002b:00007fd56997fc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458da9
> > RDX: 0000000000000000 RSI: 000000000000ae41 RDI: 0000000000000005
> > RBP: 000000000073bfa0 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd5699806d4
> > R13: 00000000004c1905 R14: 00000000004d40d0 R15: 00000000ffffffff
> > Sending NMI from CPU 0 to CPUs 1:

> Can the KVM maintainers take a look at this?  This doesn't have anything to do
> with my commit that syzbot bisected it to.
>
> +Dmitry, statistics lesson: if a crash occurs only 1 in 10 times, as was the
> case here, then often it will happen 0 in 10 times by chance.  syzbot needs to
> run the reproducer more times if it isn't working reliably.  Otherwise it ends
> up blaming some random commit.

Added a note to https://github.com/google/syzkaller/issues/1051
Thanks

* Re: BUG: soft lockup in kvm_vm_ioctl
  2019-05-08 11:25     ` Dmitry Vyukov
@ 2019-05-08 16:58       ` Dmitry Vyukov
  -1 siblings, 0 replies; 18+ messages in thread
From: Dmitry Vyukov @ 2019-05-08 16:58 UTC (permalink / raw)
  To: Eric Biggers
  Cc: syzbot, KVM list, adrian.hunter, David Miller, Artem Bityutskiy,
	jbaron, Josh Poimboeuf, LKML, linux-mtd, Andy Lutomirski,
	Ingo Molnar, Peter Zijlstra, Richard Weinberger, Rik van Riel,
	Steven Rostedt, syzkaller-bugs, Thomas Gleixner

From: Dmitry Vyukov <dvyukov@google.com>
Date: Wed, May 8, 2019 at 1:25 PM
To: Eric Biggers
Cc: syzbot, KVM list, <adrian.hunter@intel.com>, David Miller, Artem
Bityutskiy, <jbaron@redhat.com>, Josh Poimboeuf, LKML,
<linux-mtd@lists.infradead.org>, Andy Lutomirski, Ingo Molnar, Peter
Zijlstra, Richard Weinberger, Rik van Riel, Steven Rostedt,
syzkaller-bugs, Thomas Gleixner

> From: Eric Biggers <ebiggers@kernel.org>
> Date: Thu, May 2, 2019 at 4:34 AM
> To: syzbot, Dmitry Vyukov, <kvm@vger.kernel.org>
> Cc: <adrian.hunter@intel.com>, <davem@davemloft.net>,
> <dedekind1@gmail.com>, <jbaron@redhat.com>, <jpoimboe@redhat.com>,
> <linux-kernel@vger.kernel.org>, <linux-mtd@lists.infradead.org>,
> <luto@kernel.org>, <mingo@kernel.org>, <peterz@infradead.org>,
> <richard@nod.at>, <riel@surriel.com>, <rostedt@goodmis.org>,
> <syzkaller-bugs@googlegroups.com>, <tglx@linutronix.de>
>
> > On Wed, May 01, 2019 at 07:36:05AM -0700, syzbot wrote:
> > > Hello,
> > >
> > > syzbot found the following crash on:
> > >
> > > HEAD commit:    baf76f0c slip: make slhc_free() silently accept an error p..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1407f57f200000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=a42d110b47dd6b36
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=8d9bb6157e7b379f740e
> > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1266a588a00000
> > >
> > > The bug was bisected to:
> > >
> > > commit 252153ba518ac0bcde6b7152c63380d4415bfe5d
> > > Author: Eric Biggers <ebiggers@google.com>
> > > Date:   Wed Nov 29 20:43:17 2017 +0000
> > >
> > >     ubifs: switch to fscrypt_prepare_setattr()
> > >
> > > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1448f588a00000
> > > final crash:    https://syzkaller.appspot.com/x/report.txt?x=1648f588a00000
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=1248f588a00000
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+8d9bb6157e7b379f740e@syzkaller.appspotmail.com
> > > Fixes: 252153ba518a ("ubifs: switch to fscrypt_prepare_setattr()")
> > >
> > > watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.3:22023]
> > > Modules linked in:
> > > irq event stamp: 26556
> > > hardirqs last  enabled at (26555): [<ffffffff81006673>]
> > > trace_hardirqs_on_thunk+0x1a/0x1c
> > > hardirqs last disabled at (26556): [<ffffffff8100668f>]
> > > trace_hardirqs_off_thunk+0x1a/0x1c
> > > softirqs last  enabled at (596): [<ffffffff87400662>]
> > > __do_softirq+0x662/0x95a kernel/softirq.c:320
> > > softirqs last disabled at (517): [<ffffffff8144e4e0>] invoke_softirq
> > > kernel/softirq.c:374 [inline]
> > > softirqs last disabled at (517): [<ffffffff8144e4e0>] irq_exit+0x180/0x1d0
> > > kernel/softirq.c:414
> > > CPU: 0 PID: 22023 Comm: syz-executor.3 Not tainted 5.1.0-rc6+ #89
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > RIP: 0010:csd_lock_wait kernel/smp.c:108 [inline]
> > > RIP: 0010:smp_call_function_single+0x13e/0x420 kernel/smp.c:302
> > > Code: 00 48 8b 4c 24 08 48 8b 54 24 10 48 8d 74 24 40 8b 7c 24 1c e8 23 fa
> > > ff ff 41 89 c5 eb 07 e8 e9 87 0a 00 f3 90 44 8b 64 24 58 <31> ff 41 83 e4 01
> > > 44 89 e6 e8 54 89 0a 00 45 85 e4 75 e1 e8 ca 87
> > > RSP: 0018:ffff88809277f3e0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13
> > > RAX: ffff8880a8bfc040 RBX: 1ffff110124efe80 RCX: ffffffff8166051c
> > > RDX: 0000000000000000 RSI: ffffffff81660507 RDI: 0000000000000005
> > > RBP: ffff88809277f4b8 R08: ffff8880a8bfc040 R09: ffffed1015d25be9
> > > R10: ffffed1015d25be8 R11: ffff8880ae92df47 R12: 0000000000000003
> > > R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
> > > FS:  00007fd569980700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
> > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 00007fd56997e178 CR3: 00000000a4fd2000 CR4: 00000000001426f0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > Call Trace:
> > >  smp_call_function_many+0x750/0x8c0 kernel/smp.c:434
> > >  smp_call_function+0x42/0x90 kernel/smp.c:492
> > >  on_each_cpu+0x31/0x200 kernel/smp.c:602
> > >  text_poke_bp+0x107/0x19b arch/x86/kernel/alternative.c:821
> > >  __jump_label_transform+0x263/0x330 arch/x86/kernel/jump_label.c:91
> > >  arch_jump_label_transform+0x2b/0x40 arch/x86/kernel/jump_label.c:99
> > >  __jump_label_update+0x16a/0x210 kernel/jump_label.c:389
> > >  jump_label_update kernel/jump_label.c:752 [inline]
> > >  jump_label_update+0x1ce/0x3d0 kernel/jump_label.c:731
> > >  static_key_slow_inc_cpuslocked+0x1c1/0x250 kernel/jump_label.c:129
> > >  static_key_slow_inc+0x1b/0x30 kernel/jump_label.c:144
> > >  kvm_arch_vcpu_init+0x6b7/0x870 arch/x86/kvm/x86.c:9068
> > >  kvm_vcpu_init+0x272/0x370 arch/x86/kvm/../../../virt/kvm/kvm_main.c:320
> > >  vmx_create_vcpu+0x191/0x2540 arch/x86/kvm/vmx/vmx.c:6577
> > >  kvm_arch_vcpu_create+0x80/0x120 arch/x86/kvm/x86.c:8755
> > >  kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569
> > > [inline]
> > >  kvm_vm_ioctl+0x5ce/0x19c0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3105
> > >  vfs_ioctl fs/ioctl.c:46 [inline]
> > >  file_ioctl fs/ioctl.c:509 [inline]
> > >  do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
> > >  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
> > >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> > >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> > >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> > >  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
> > >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > > RIP: 0033:0x458da9
> > > Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
> > > 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff
> > > 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > > RSP: 002b:00007fd56997fc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458da9
> > > RDX: 0000000000000000 RSI: 000000000000ae41 RDI: 0000000000000005
> > > RBP: 000000000073bfa0 R08: 0000000000000000 R09: 0000000000000000
> > > R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd5699806d4
> > > R13: 00000000004c1905 R14: 00000000004d40d0 R15: 00000000ffffffff
> > > Sending NMI from CPU 0 to CPUs 1:
>
> > Can the KVM maintainers take a look at this?  This doesn't have anything to do
> > with my commit that syzbot bisected it to.
> >
> > +Dmitry, statistics lesson: if a crash occurs only 1 in 10 times, as was the
> > case here, then often it will happen 0 in 10 times by chance.  syzbot needs to
> > run the reproducer more times if it isn't working reliably.  Otherwise it ends
> > up blaming some random commit.
>
> Added a note to https://github.com/google/syzkaller/issues/1051
> Thanks

As we increase the number of instances, we increase the chances of
hitting unrelated bugs. E.g. take a look at the bisection log for:
https://syzkaller.appspot.com/bug?extid=f14868630901fc6151d3
What the optimum number of tests should be is a good question. I
suspect that the current 10 instances is close to the optimum. If we
use significantly more, we may break every other bisection on
unrelated bugs...
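
To make this trade-off concrete, here is a quick back-of-the-envelope
sketch in Go (syzkaller's implementation language). The 10% and 1%
per-instance rates are illustrative assumptions, not measured syzbot
numbers; more runs shrink the chance of missing a flaky repro but grow
the chance of tripping over an unrelated bug.

package main

import (
	"fmt"
	"math"
)

func main() {
	const realRate = 0.1   // target crash reproduces 1 time in 10 (assumed)
	const noiseRate = 0.01 // some unrelated bug fires 1 time in 100 (assumed)
	for _, n := range []int{10, 20, 50, 100} {
		// All n instances stay silent even though the kernel is buggy:
		missReal := math.Pow(1-realRate, float64(n))
		// At least one instance hits the unrelated bug:
		hitNoise := 1 - math.Pow(1-noiseRate, float64(n))
		fmt.Printf("n=%3d  miss real bug: %.2f  hit unrelated bug: %.2f\n",
			n, missReal, hitNoise)
	}
}

At 10 instances a 1-in-10 crash is missed about a third of the time,
which is exactly the failure mode of the bisection above; at 100
instances the unrelated-bug risk dominates instead.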

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: BUG: soft lockup in kvm_vm_ioctl
  2019-05-08 16:58       ` Dmitry Vyukov
@ 2019-05-09  3:18         ` Eric Biggers
  -1 siblings, 0 replies; 18+ messages in thread
From: Eric Biggers @ 2019-05-09  3:18 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, KVM list, adrian.hunter, David Miller, Artem Bityutskiy,
	jbaron, Josh Poimboeuf, LKML, linux-mtd, Andy Lutomirski,
	Ingo Molnar, Peter Zijlstra, Richard Weinberger, Rik van Riel,
	Steven Rostedt, syzkaller-bugs, Thomas Gleixner

On Wed, May 08, 2019 at 06:58:28PM +0200, 'Dmitry Vyukov' via syzkaller-bugs wrote:
> From: Dmitry Vyukov <dvyukov@google.com>
> Date: Wed, May 8, 2019 at 1:25 PM
> To: Eric Biggers
> Cc: syzbot, KVM list, <adrian.hunter@intel.com>, David Miller, Artem
> Bityutskiy, <jbaron@redhat.com>, Josh Poimboeuf, LKML,
> <linux-mtd@lists.infradead.org>, Andy Lutomirski, Ingo Molnar, Peter
> Zijlstra, Richard Weinberger, Rik van Riel, Steven Rostedt,
> syzkaller-bugs, Thomas Gleixner
> 
> > From: Eric Biggers <ebiggers@kernel.org>
> > Date: Thu, May 2, 2019 at 4:34 AM
> > To: syzbot, Dmitry Vyukov, <kvm@vger.kernel.org>
> > Cc: <adrian.hunter@intel.com>, <davem@davemloft.net>,
> > <dedekind1@gmail.com>, <jbaron@redhat.com>, <jpoimboe@redhat.com>,
> > <linux-kernel@vger.kernel.org>, <linux-mtd@lists.infradead.org>,
> > <luto@kernel.org>, <mingo@kernel.org>, <peterz@infradead.org>,
> > <richard@nod.at>, <riel@surriel.com>, <rostedt@goodmis.org>,
> > <syzkaller-bugs@googlegroups.com>, <tglx@linutronix.de>
> >
> > > On Wed, May 01, 2019 at 07:36:05AM -0700, syzbot wrote:
> > > > Hello,
> > > >
> > > > syzbot found the following crash on:
> > > >
> > > > HEAD commit:    baf76f0c slip: make slhc_free() silently accept an error p..
> > > > git tree:       upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1407f57f200000
> > > > kernel config:  https://syzkaller.appspot.com/x/.config?x=a42d110b47dd6b36
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=8d9bb6157e7b379f740e
> > > > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=1266a588a00000
> > > >
> > > > The bug was bisected to:
> > > >
> > > > commit 252153ba518ac0bcde6b7152c63380d4415bfe5d
> > > > Author: Eric Biggers <ebiggers@google.com>
> > > > Date:   Wed Nov 29 20:43:17 2017 +0000
> > > >
> > > >     ubifs: switch to fscrypt_prepare_setattr()
> > > >
> > > > bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=1448f588a00000
> > > > final crash:    https://syzkaller.appspot.com/x/report.txt?x=1648f588a00000
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=1248f588a00000
> > > >
> > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > Reported-by: syzbot+8d9bb6157e7b379f740e@syzkaller.appspotmail.com
> > > > Fixes: 252153ba518a ("ubifs: switch to fscrypt_prepare_setattr()")
> > > >
> > > > watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.3:22023]
> > > > Modules linked in:
> > > > irq event stamp: 26556
> > > > hardirqs last  enabled at (26555): [<ffffffff81006673>]
> > > > trace_hardirqs_on_thunk+0x1a/0x1c
> > > > hardirqs last disabled at (26556): [<ffffffff8100668f>]
> > > > trace_hardirqs_off_thunk+0x1a/0x1c
> > > > softirqs last  enabled at (596): [<ffffffff87400662>]
> > > > __do_softirq+0x662/0x95a kernel/softirq.c:320
> > > > softirqs last disabled at (517): [<ffffffff8144e4e0>] invoke_softirq
> > > > kernel/softirq.c:374 [inline]
> > > > softirqs last disabled at (517): [<ffffffff8144e4e0>] irq_exit+0x180/0x1d0
> > > > kernel/softirq.c:414
> > > > CPU: 0 PID: 22023 Comm: syz-executor.3 Not tainted 5.1.0-rc6+ #89
> > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > > Google 01/01/2011
> > > > RIP: 0010:csd_lock_wait kernel/smp.c:108 [inline]
> > > > RIP: 0010:smp_call_function_single+0x13e/0x420 kernel/smp.c:302
> > > > Code: 00 48 8b 4c 24 08 48 8b 54 24 10 48 8d 74 24 40 8b 7c 24 1c e8 23 fa
> > > > ff ff 41 89 c5 eb 07 e8 e9 87 0a 00 f3 90 44 8b 64 24 58 <31> ff 41 83 e4 01
> > > > 44 89 e6 e8 54 89 0a 00 45 85 e4 75 e1 e8 ca 87
> > > > RSP: 0018:ffff88809277f3e0 EFLAGS: 00000293 ORIG_RAX: ffffffffffffff13
> > > > RAX: ffff8880a8bfc040 RBX: 1ffff110124efe80 RCX: ffffffff8166051c
> > > > RDX: 0000000000000000 RSI: ffffffff81660507 RDI: 0000000000000005
> > > > RBP: ffff88809277f4b8 R08: ffff8880a8bfc040 R09: ffffed1015d25be9
> > > > R10: ffffed1015d25be8 R11: ffff8880ae92df47 R12: 0000000000000003
> > > > R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
> > > > FS:  00007fd569980700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
> > > > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > CR2: 00007fd56997e178 CR3: 00000000a4fd2000 CR4: 00000000001426f0
> > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > Call Trace:
> > > >  smp_call_function_many+0x750/0x8c0 kernel/smp.c:434
> > > >  smp_call_function+0x42/0x90 kernel/smp.c:492
> > > >  on_each_cpu+0x31/0x200 kernel/smp.c:602
> > > >  text_poke_bp+0x107/0x19b arch/x86/kernel/alternative.c:821
> > > >  __jump_label_transform+0x263/0x330 arch/x86/kernel/jump_label.c:91
> > > >  arch_jump_label_transform+0x2b/0x40 arch/x86/kernel/jump_label.c:99
> > > >  __jump_label_update+0x16a/0x210 kernel/jump_label.c:389
> > > >  jump_label_update kernel/jump_label.c:752 [inline]
> > > >  jump_label_update+0x1ce/0x3d0 kernel/jump_label.c:731
> > > >  static_key_slow_inc_cpuslocked+0x1c1/0x250 kernel/jump_label.c:129
> > > >  static_key_slow_inc+0x1b/0x30 kernel/jump_label.c:144
> > > >  kvm_arch_vcpu_init+0x6b7/0x870 arch/x86/kvm/x86.c:9068
> > > >  kvm_vcpu_init+0x272/0x370 arch/x86/kvm/../../../virt/kvm/kvm_main.c:320
> > > >  vmx_create_vcpu+0x191/0x2540 arch/x86/kvm/vmx/vmx.c:6577
> > > >  kvm_arch_vcpu_create+0x80/0x120 arch/x86/kvm/x86.c:8755
> > > >  kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:2569
> > > > [inline]
> > > >  kvm_vm_ioctl+0x5ce/0x19c0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3105
> > > >  vfs_ioctl fs/ioctl.c:46 [inline]
> > > >  file_ioctl fs/ioctl.c:509 [inline]
> > > >  do_vfs_ioctl+0xd6e/0x1390 fs/ioctl.c:696
> > > >  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
> > > >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> > > >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> > > >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> > > >  do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
> > > >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > > > RIP: 0033:0x458da9
> > > > Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7
> > > > 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff
> > > > 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
> > > > RSP: 002b:00007fd56997fc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > > > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458da9
> > > > RDX: 0000000000000000 RSI: 000000000000ae41 RDI: 0000000000000005
> > > > RBP: 000000000073bfa0 R08: 0000000000000000 R09: 0000000000000000
> > > > R10: 0000000000000000 R11: 0000000000000246 R12: 00007fd5699806d4
> > > > R13: 00000000004c1905 R14: 00000000004d40d0 R15: 00000000ffffffff
> > > > Sending NMI from CPU 0 to CPUs 1:
> >
> > > Can the KVM maintainers take a look at this?  This doesn't have anything to do
> > > with my commit that syzbot bisected it to.
> > >
> > > +Dmitry, statistics lesson: if a crash occurs only 1 in 10 times, as was the
> > > case here, then often it will happen 0 in 10 times by chance.  syzbot needs to
> > > run the reproducer more times if it isn't working reliably.  Otherwise it ends
> > > up blaming some random commit.
> >
> > Added a note to https://github.com/google/syzkaller/issues/1051
> > Thanks
> 
> As we increase the number of instances, we increase the chances of
> hitting unrelated bugs. E.g. take a look at the bisection log for:
> https://syzkaller.appspot.com/bug?extid=f14868630901fc6151d3
> What the optimum number of tests should be is a good question. I
> suspect that the current 10 instances is close to the optimum. If we
> use significantly more, we may break every other bisection on
> unrelated bugs...
> 

Only because syzbot is being super dumb in how it does the bisection.  AFAICS,
in the example you linked to, buggy kernels reliably crashed 10 out of 10 times
with the original crash signature, "WARNING in cgroup_exit".  Then at some point
it tested some kernel without the bug and got a different crash just 1 in 10
times, "WARNING: ODEBUG bug in netdev_freemem".

The facts that the crash frequency was very different, and the crash signature
was different, should be taken as a very strong signal that it's not the bug
being bisected for.  And this is something easily checked for in code.
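
For illustration, a minimal sketch in Go of the kind of check being
suggested here; the types and the factor-of-5 threshold are
hypothetical, not anything syzbot implements:

package main

import "fmt"

// runResult is a hypothetical summary of testing one kernel revision.
type runResult struct {
	title         string // crash signature, e.g. "WARNING in cgroup_exit"
	crashes, runs int
}

// confirmsOriginalBug counts a revision as "bad" only if the signature
// matches the bug under bisection and the frequency is in the same
// ballpark as the original reproduction rate.
func confirmsOriginalBug(orig, cur runResult) bool {
	if cur.title != orig.title {
		return false // different signature: likely an unrelated bug
	}
	origRate := float64(orig.crashes) / float64(orig.runs)
	curRate := float64(cur.crashes) / float64(cur.runs)
	return curRate >= origRate/5 // loose frequency match; threshold arbitrary
}

func main() {
	orig := runResult{"WARNING in cgroup_exit", 10, 10}
	noise := runResult{"WARNING: ODEBUG bug in netdev_freemem", 1, 10}
	fmt.Println(confirmsOriginalBug(orig, noise)) // false
}

Dmitry's reply below pushes back on exactly this: the title comparison
is fragile, e.g. a function rename mid-range makes the same bug look
like a different crash.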

BTW, I hope you're treating fixing this as a high priority, given that syzbot is
now sending bug reports to kernel developers literally selected at random.  This
is a great way to teach people to ignore syzbot reports.  (When I suggested
bisection originally, I had assumed you'd implement some basic sanity checks so
that only bisection results likely to be reliable would be mailed out.)

- Eric

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: BUG: soft lockup in kvm_vm_ioctl
  2019-05-09  3:18         ` Eric Biggers
@ 2019-05-09 14:22           ` Dmitry Vyukov
  -1 siblings, 0 replies; 18+ messages in thread
From: Dmitry Vyukov @ 2019-05-09 14:22 UTC (permalink / raw)
  To: Eric Biggers
  Cc: syzbot, KVM list, adrian.hunter, David Miller, Artem Bityutskiy,
	jbaron, Josh Poimboeuf, LKML, linux-mtd, Andy Lutomirski,
	Ingo Molnar, Peter Zijlstra, Richard Weinberger, Rik van Riel,
	Steven Rostedt, syzkaller-bugs, Thomas Gleixner, syzkaller

> > > > Can the KVM maintainers take a look at this?  This doesn't have anything to do
> > > > with my commit that syzbot bisected it to.
> > > >
> > > > +Dmitry, statistics lesson: if a crash occurs only 1 in 10 times, as was the
> > > > case here, then often it will happen 0 in 10 times by chance.  syzbot needs to
> > > > run the reproducer more times if it isn't working reliably.  Otherwise it ends
> > > > up blaming some random commit.
> > >
> > > Added a note to https://github.com/google/syzkaller/issues/1051
> > > Thanks
> >
> > As we increase the number of instances, we increase the chances of
> > hitting unrelated bugs. E.g. take a look at the bisection log for:
> > https://syzkaller.appspot.com/bug?extid=f14868630901fc6151d3
> > What the optimum number of tests should be is a good question. I
> > suspect that the current 10 instances is close to the optimum. If we
> > use significantly more, we may break every other bisection on
> > unrelated bugs...
> >
>
> Only because syzbot is being super dumb in how it does the bisection.  AFAICS,
> in the example you linked to, buggy kernels reliably crashed 10 out of 10 times
> with the original crash signature, "WARNING in cgroup_exit".  Then at some point
> it tested some kernel without the bug and got a different crash just 1 in 10
> times, "WARNING: ODEBUG bug in netdev_freemem".
>
> The facts that the crash frequency was very different, and the crash signature
> was different, should be taken as a very strong signal that it's not the bug
> being bisected for.  And this is something easily checked for in code.
>
> BTW, I hope you're treating fixing this as a high priority, given that syzbot is
> now sending bug reports to kernel developers literally selected at random.  This
> is a great way to teach people to ignore syzbot reports.  (When I suggested
> bisection originally, I had assumed you'd implement some basic sanity checks so
> that only bisection results likely to be reliable would be mailed out.)



While I believe we can get some quality improvement by shuffling the
numbers, I don't think we can get a significant improvement overall,
and we definitely cannot eliminate wrong bisection results entirely.
It's easy to take a single wrong bisection and design a system around
that scenario, but it's very hard to design a system that will handle
all of them in full generality. For example, look at these bisection
logs for cases where the reproduction frequency varies from 1 run to
all of them, but it's still the same bug:
https://syzkaller.appspot.com/x/bisect.txt?x=12df1ba3200000
https://syzkaller.appspot.com/x/bisect.txt?x=10daff1b200000
https://syzkaller.appspot.com/x/bisect.txt?x=1592b037200000
https://syzkaller.appspot.com/x/bisect.txt?x=11c610a7200000
https://syzkaller.appspot.com/x/bisect.txt?x=17affd1b200000
You also refer to "a different crash". But that's not a predicate we
can have, and definitely not something that is "easily checked for in
code". Consider: a function rename anywhere in the range will make the
same bug look like a different crash. If you look at all the bisection
logs you find lots of amusing cases where something that a program may
consider a different bug is actually the same bug, or the other way
around. So if we increase the number of tests and we don't have a way
to distinguish crashes (which we don't), we will necessarily increase
incorrect results due to unrelated bugs.

Bisection is a subtle process, and the predicate, whatever logic it
runs internally, in the end needs to produce a single yes/no. A single
wrong answer in the chain leads to a completely incorrect result.
There are some fundamental reasons for wrong results:
 - hard-to-reproduce bugs (not fixable)
 - unrelated bugs/broken builds (fixable)
While tuning the numbers may paper over these to some degree, these
reasons will stay and will keep leading to incorrect results. Also, I
don't see this tuning as something that is trivial to do, as you
suggest. For example, how exactly do you assess a crash as happening
reliably vs. episodically? How exactly do you choose the number of
tests for each case? Choosing too few tests will lead to incorrect
results; choosing too many will also lead to incorrect results. How
exactly do you assess that something that was happening reliably now
does not happen reliably? How do you assess that a crash is very
different? Each of these choices has a chance of producing more bad
results, so one would need to rerun hundreds of bisections with the
old and new versions, manually mark the results, and then estimate the
quality change (which will most likely be flaky or inconclusive in
lots of cases). Tuning the quality of heuristics-based algorithms is
very time consuming, especially if each experiment takes weeks.
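
To put a number on how a single noisy answer compounds: bisecting a
range of ~100,000 commits takes about 17 yes/no steps, and the run is
only as reliable as every one of them. A small sketch (the per-step
accuracies are made-up numbers):

package main

import (
	"fmt"
	"math"
)

func main() {
	steps := math.Ceil(math.Log2(100000)) // ~17 yes/no verdicts
	for _, p := range []float64{0.99, 0.95, 0.90} {
		// The blamed commit is right only if every verdict is right.
		fmt.Printf("per-step accuracy %.2f -> whole bisection correct: %.2f\n",
			p, math.Pow(p, steps))
	}
}

Even 95%-accurate steps leave the full chain right less than half the
time.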

There is another downside to algorithms that are not "super dumb":
explaining the results. Consider that syzbot mails out a bisection
where a crash happened, and a developer sees that it's the same crash,
but syzbot says "nope, did not crash". That will cause reasonable
questions, and somebody (who would that be?) will need to come and
explain what happened and why, and how that counter-intuitive local
result was shown to improve quality overall. Simpler algorithms are
much easier to explain.

I do consider bisection a high priority, but unfortunately only among
other high-priority and very-high-priority work.
Besides work on the fuzzer itself and bug detection tools, we now test
15 kernels across 6 different OSes. Operational work can't be
deprioritized, because then nothing will work at all. Change reviews
can't be deprioritized. Overseeing bug flow can't be deprioritized.
Updating crash parsing in response to new kernel output can't be
deprioritized. Answering all human emails can't be deprioritized.
^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: BUG: soft lockup in kvm_vm_ioctl
  2019-05-09 14:22           ` Dmitry Vyukov
@ 2019-05-09 17:52             ` Eric Biggers
  -1 siblings, 0 replies; 18+ messages in thread
From: Eric Biggers @ 2019-05-09 17:52 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: syzbot, KVM list, adrian.hunter, David Miller, Artem Bityutskiy,
	jbaron, Josh Poimboeuf, LKML, linux-mtd, Andy Lutomirski,
	Ingo Molnar, Peter Zijlstra, Richard Weinberger, Rik van Riel,
	Steven Rostedt, syzkaller-bugs, Thomas Gleixner, syzkaller

On Thu, May 09, 2019 at 04:22:56PM +0200, 'Dmitry Vyukov' via syzkaller wrote:
> > > > > Can the KVM maintainers take a look at this?  This doesn't have anything to do
> > > > > with my commit that syzbot bisected it to.
> > > > >
> > > > > +Dmitry, statistics lesson: if a crash occurs only 1 in 10 times, as was the
> > > > > case here, then often it will happen 0 in 10 times by chance.  syzbot needs to
> > > > > run the reproducer more times if it isn't working reliably.  Otherwise it ends
> > > > > up blaming some random commit.
> > > >
> > > > Added a note to https://github.com/google/syzkaller/issues/1051
> > > > Thanks
> > >
> > > As we increase the number of instances, we increase the chances of
> > > hitting unrelated bugs. E.g. take a look at the bisection log for:
> > > https://syzkaller.appspot.com/bug?extid=f14868630901fc6151d3
> > > What the optimum number of tests should be is a good question. I
> > > suspect that the current 10 instances is close to the optimum. If we
> > > use significantly more, we may break every other bisection on
> > > unrelated bugs...
> > >
> >
> > Only because syzbot is being super dumb in how it does the bisection.  AFAICS,
> > in the example you linked to, buggy kernels reliably crashed 10 out of 10 times
> > with the original crash signature, "WARNING in cgroup_exit".  Then at some point
> > it tested some kernel without the bug and got a different crash just 1 in 10
> > times, "WARNING: ODEBUG bug in netdev_freemem".
> >
> > The facts that the crash frequency was very different, and the crash signature
> > was different, should be taken as a very strong signal that it's not the bug
> > being bisected for.  And this is something easily checked for in code.
> >
> > BTW, I hope you're treating fixing this as a high priority, given that syzbot is
> > now sending bug reports to kernel developers literally selected at random.  This
> > is a great way to teach people to ignore syzbot reports.  (When I suggested
> > bisection originally, I had assumed you'd implement some basic sanity checks so
> > that only bisection results likely to be reliable would be mailed out.)
> 
> 
> 
> While I believe we can get some quality improvement by shuffling the
> numbers, I don't think we can get a significant improvement overall,
> and we definitely cannot eliminate wrong bisection results entirely.
> It's easy to take a single wrong bisection and design a system around
> that scenario, but it's very hard to design a system that will handle
> all of them in full generality. For example, look at these bisection
> logs for cases where the reproduction frequency varies from 1 run to
> all of them, but it's still the same bug:
> https://syzkaller.appspot.com/x/bisect.txt?x=12df1ba3200000
> https://syzkaller.appspot.com/x/bisect.txt?x=10daff1b200000
> https://syzkaller.appspot.com/x/bisect.txt?x=1592b037200000
> https://syzkaller.appspot.com/x/bisect.txt?x=11c610a7200000
> https://syzkaller.appspot.com/x/bisect.txt?x=17affd1b200000

In all those, the bisection either blamed the wrong commit or failed entirely.
So rather than supporting your argument, they're actually examples of how a
reproduction frequency of 1 causes the result to be unreliable.

> You also refer to "a different crash". But that's not a predicate we
> can have, and definitely not something that is "easily checked for in
> code". Consider: a function rename anywhere in the range will make the
> same bug look like a different crash. If you look at all the bisection
> logs you find lots of amusing cases where something that a program may
> consider a different bug is actually the same bug, or the other way
> around. So if we increase the number of tests and we don't have a way
> to distinguish crashes (which we don't), we will necessarily increase
> incorrect results due to unrelated bugs.
> 
> Bisection is a subtle process, and the predicate, whatever logic it
> runs internally, in the end needs to produce a single yes/no. A single
> wrong answer in the chain leads to a completely incorrect result.
> There are some fundamental reasons for wrong results:
>  - hard-to-reproduce bugs (not fixable)
>  - unrelated bugs/broken builds (fixable)
> While tuning the numbers may paper over these to some degree, these
> reasons will stay and will keep leading to incorrect results. Also, I
> don't see this tuning as something that is trivial to do, as you
> suggest. For example, how exactly do you assess a crash as happening
> reliably vs. episodically? How exactly do you choose the number of
> tests for each case? Choosing too few tests will lead to incorrect
> results; choosing too many will also lead to incorrect results. How
> exactly do you assess that something that was happening reliably now
> does not happen reliably? How do you assess that a crash is very
> different? Each of these choices has a chance of producing more bad
> results, so one would need to rerun hundreds of bisections with the
> old and new versions, manually mark the results, and then estimate the
> quality change (which will most likely be flaky or inconclusive in
> lots of cases). Tuning the quality of heuristics-based algorithms is
> very time consuming, especially if each experiment takes weeks.
> 
> There is another downside to algorithms that are not "super dumb":
> explaining the results. Consider that syzbot mails out a bisection
> where a crash happened, and a developer sees that it's the same crash,
> but syzbot says "nope, did not crash". That will cause reasonable
> questions, and somebody (who would that be?) will need to come and
> explain what happened and why, and how that counter-intuitive local
> result was shown to improve quality overall. Simpler algorithms are
> much easier to explain.

What I have in mind is that syzbot would assign a confidence level to each
bisection log.

Start at 100% confident.

If multiple different crash signatures were seen, decrease the confidence level.

If the crash is probabilistic (sometimes occurred n/10 times where 1 <= n <= 9),
decrease the confidence level again.  If it ever occurred just 1 time, decrease
it a lot more.  OFC it could be a smarter, more complex calculation using some
formula, but this would be the minimum.

If the bisection ended on a merge commit, a release commit, or an obviously
non-code change (e.g. one that only modified Documentation/), decrease the
confidence level again.

Then:

- If bisection result is very confident, mail out the result as-is.
- If bisection result is somewhat confident, mail out the result with a warning
  that syzbot detected that it may be unreliable.
- Otherwise don't mail out the result, just keep it on the syzbot dashboard.

Yes, these are heuristics.  Yes, they will sometimes be wrong and cause false
negatives.  I don't need you to go into a multi-page explanation of all the
theoretical edge cases.  But they should reduce the false positive rate
massively, which will encourage people to actually look at the reports...
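
For concreteness, a minimal sketch of this scoring in Go; the field
names, penalty values, and cut-offs are illustrative assumptions, not
anything syzbot actually implements:

package main

import "fmt"

// bisectStats is a hypothetical summary of one completed bisection.
type bisectStats struct {
	distinctTitles int  // crash signatures seen across all steps
	minCrashes     int  // smallest n among "crashed n/10 times" steps
	flaky          bool // any step crashed between 1 and 9 times out of 10
	implausibleEnd bool // blamed a merge/release/doc-only commit
}

func confidence(s bisectStats) int {
	c := 100
	if s.distinctTitles > 1 {
		c -= 30 // mixed signatures: may have chased an unrelated bug
	}
	if s.flaky {
		c -= 20 // probabilistic reproduction
		if s.minCrashes == 1 {
			c -= 30 // a 1/10 step is barely above chance
		}
	}
	if s.implausibleEnd {
		c -= 30
	}
	return c
}

func main() {
	s := bisectStats{distinctTitles: 2, minCrashes: 1, flaky: true}
	switch c := confidence(s); {
	case c >= 90:
		fmt.Println(c, "-> mail as-is")
	case c >= 60:
		fmt.Println(c, "-> mail with an unreliability warning")
	default:
		fmt.Println(c, "-> dashboard only") // prints: 20 -> dashboard only
	}
}

With made-up penalties like these, the cgroup_exit example above (two
distinct signatures, one step crashing only 1 in 10 times) lands well
below any mailing threshold.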

It seems you don't want to be "responsible" for a false negative, so you just
send out every result.  But that just makes the reports low quality, so people
ignore them, which is much worse.  The status quo of sending reports to authors
of kernel commits selected at random is really not acceptable.

> 
> I do consider bisection a high priority, but unfortunately only among
> other high-priority and very-high-priority work.
> Besides work on the fuzzer itself and bug detection tools, we now test
> 15 kernels across 6 different OSes. Operational work can't be
> deprioritized, because then nothing will work at all. Change reviews
> can't be deprioritized. Overseeing bug flow can't be deprioritized.
> Updating crash parsing in response to new kernel output can't be
> deprioritized. Answering all human emails can't be deprioritized.
> 

So don't be surprised when people ignore syzbot reports for the same reason.

- Eric

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: BUG: soft lockup in kvm_vm_ioctl
@ 2019-05-09 17:52             ` Eric Biggers
  0 siblings, 0 replies; 18+ messages in thread
From: Eric Biggers @ 2019-05-09 17:52 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Ingo Molnar, KVM list, Artem Bityutskiy, Peter Zijlstra, LKML,
	jbaron, Rik van Riel, syzkaller-bugs, adrian.hunter,
	Steven Rostedt, David Miller, Richard Weinberger, syzkaller,
	linux-mtd, Andy Lutomirski, Josh Poimboeuf, Thomas Gleixner,
	syzbot

On Thu, May 09, 2019 at 04:22:56PM +0200, 'Dmitry Vyukov' via syzkaller wrote:
> > > > > Can the KVM maintainers take a look at this?  This doesn't have anything to do
> > > > > with my commit that syzbot bisected it to.
> > > > >
> > > > > +Dmitry, statistics lession: if a crash occurs only 1 in 10 times, as was the
> > > > > case here, then often it will happen 0 in 10 times by chance.  syzbot needs to
> > > > > run the reproducer more times if it isn't working reliably.  Otherwise it ends
> > > > > up blaming some random commit.
> > > >
> > > > Added a note to https://github.com/google/syzkaller/issues/1051
> > > > Thanks
> > >
> > > As we increase number of instances, we increase chances of hitting
> > > unrelated bugs. E.g. take a look at the bisection log for:
> > > https://syzkaller.appspot.com/bug?extid=f14868630901fc6151d3
> > > What is the optimum number of tests is a good question. I suspect that
> > > the current 10 instances is close to optimum. If we use significantly
> > > more we may break every other bisection on unrelated bugs...
> > >
> >
> > Only because syzbot is being super dumb in how it does the bisection.  AFAICS,
> > in the example you linked to, buggy kernels reliably crashed 10 out of 10 times
> > with the original crash signature, "WARNING in cgroup_exit".  Then at some point
> > it tested some kernel without the bug and got a different crash just 1 in 10
> > times, "WARNING: ODEBUG bug in netdev_freemem".
> >
> > The facts that the crash frequency was very different, and the crash signature
> > was different, should be taken as a very strong signal that it's not the bug
> > being bisected for.  And this is something easily checked for in code.
> >
> > BTW, I hope you're treating fixing this as a high priority, given that syzbot is
> > now sending bug reports to kernel developers literally selected at random.  This
> > is a great way to teach people to ignore syzbot reports.  (When I suggested
> > bisection originally, I had assumed you'd implement some basic sanity checks so
> > that only bisection results likely to be reliable would be mailed out.)
> 
> 
> 
> While I believe we can get some quality improvement by shuffling
> numbers. I don't think we can get significant improvement overall and
> definitely not eliminate wrong bisection results entirely. It's easy
> to take a single wrong bisection and design a system around this
> scenario, but it's very hard to design a system that will handle all
> of them in all generality. For example, look at these bisection logs
> for cases where reproduction frequency varies from 1 to all, but
> that's still the same bug:
> https://syzkaller.appspot.com/x/bisect.txt?x=12df1ba3200000
> https://syzkaller.appspot.com/x/bisect.txt?x=10daff1b200000
> https://syzkaller.appspot.com/x/bisect.txt?x=1592b037200000
> https://syzkaller.appspot.com/x/bisect.txt?x=11c610a7200000
> https://syzkaller.appspot.com/x/bisect.txt?x=17affd1b200000

In all those, the bisection either blamed the wrong commit or failed entirely.
So rather than supporting your argument, they're actually examples of how a
reproduction frequency of 1 causes the result to be unreliable.

> You also refer to "a different crash". But that's not a predicate we
> can have. And definitely not something that is "easily checked for in
> code". Consider, a function rename anywhere in the range will lead to
> as if a different crash. If you look at all bisection logs you find
> lots of amusing cases where something that a program may consider a
> different bugs is actually the same bug, or the other way around. So
> if we increase number of tests and we don't have a way to distinguish
> crashes (which we don't), we will necessary increase incorrect results
> due to unrelated bugs.
> 
> Bisection is a subtle process, and the predicate, whatever logic it
> runs internally, in the end needs to produce a single yes/no answer.
> A single wrong answer in the chain leads to a completely incorrect
> result. There are some fundamental reasons for wrong results:
>  - hard-to-reproduce bugs (not fixable)
>  - unrelated bugs/broken builds (fixable)
> While tuning the numbers can paper over these to some degree (maybe),
> the underlying causes will stay and will keep producing incorrect
> results. I also don't see this tuning as something that is trivial to
> do, as you suggest. For example, how exactly do you assess a crash as
> happening reliably vs episodically? How exactly do you choose the
> number of tests for each case? Choosing too few tests will lead to
> incorrect results; choosing too many will also lead to incorrect
> results. How exactly do you assess that something that was happening
> reliably now does not happen reliably? How do you assess that a crash
> is "very different"? Each of these choices has a chance of producing
> more bad results, so one would need to rerun hundreds of bisections
> with the old and new versions, manually mark the results, and then
> estimate the quality change (which will most likely be flaky or
> inconclusive in lots of cases). Tuning the quality of heuristics-based
> algorithms is very time-consuming, especially when each experiment
> takes weeks.
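> 
> (As an illustration of why the numbers are not obvious, with made-up
> rates: to tell "always crashes" apart from "crashes 70% of the time"
> at 95% confidence, you need n runs with 0.7^n < 0.05, i.e. n >= 9.
> To tell 90% apart from 70% you already need dozens of runs per
> revision, and every extra run is another opportunity to hit an
> unrelated bug.)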
> 
> There is another downside to algorithms that are not "super dumb":
> explaining the results. Suppose syzbot mails out a bisection in which
> the crash clearly happened, and a developer sees that it's the same
> crash, but syzbot said "nope, did not crash". That will raise
> reasonable questions, and somebody (who would that be?) will need to
> come and explain what happened and why, and how that counter-intuitive
> local decision was shown to improve quality overall. Simpler
> algorithms are much easier to explain.

What I have in mind is that syzbot would assign a confidence level to each
bisection log.

Start at 100% confidence.

If multiple different crash signatures were seen, decrease the confidence level.

If the crash is probabilistic (sometimes occurred n/10 times where 1 <= n <= 9),
decrease the confidence level again.  If it ever occurred just 1 time, decrease
it a lot more.  Of course it could be a smarter, more complex calculation using
some formula, but this would be the minimum.

If the bisection ended on a merge commit or a release commit, or the blamed
commit is obviously a non-code change (e.g. it only modified Documentation/),
decrease the confidence level again.

Then:

- If the bisection result is very confident, mail it out as-is.
- If the bisection result is somewhat confident, mail it out with a warning
  that syzbot detected that it may be unreliable.
- Otherwise don't mail the result out at all; just keep it on the syzbot
  dashboard.
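
To make the shape of this concrete, here is a minimal sketch in Go (the
language syzkaller is written in).  All names, penalty values, and cutoffs
below are invented for illustration and would need tuning; this is not actual
syzkaller code:

package bisect

// Commit holds the properties of the blamed commit that matter here.
type Commit struct {
	IsMerge   bool
	IsRelease bool
	OnlyDocs  bool // every changed path is under Documentation/ or similar
}

// Result summarizes what the bisection observed.
type Result struct {
	CrashTitles   int // number of distinct crash signatures seen
	MinCrashCount int // fewest crashes in any crashing step, out of 10 runs
	Blamed        Commit
}

// confidence starts at 100 and subtracts a penalty for each bad sign.
func confidence(r Result) int {
	conf := 100
	if r.CrashTitles > 1 {
		conf -= 30 // mixed signatures: may be chasing an unrelated bug
	}
	switch {
	case r.MinCrashCount == 1:
		conf -= 40 // a 1-in-10 repro carries almost no signal
	case r.MinCrashCount < 10:
		conf -= 15 // flaky but repeatable
	}
	if r.Blamed.IsMerge || r.Blamed.IsRelease || r.Blamed.OnlyDocs {
		conf -= 30 // implausible commit to have introduced the crash
	}
	return conf
}

// Hypothetical mailing policy on top of the score: >= 80 mail as-is,
// >= 50 mail with an "unreliable" warning, below 50 dashboard only.

In the cgroup_exit example above, both the mixed-signature and the 1-in-10
penalties would fire, which would have kept that result off the mailing list.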

Yes, these are heuristics.  Yes, they will sometimes be wrong and cause false
negatives.  I don't need you to go into a multi-page explanation of all the
theoretical edge cases.  But they should massively reduce the false positive
rate, which would encourage people to actually look at the reports...

It seems you don't want to be "responsible" for a false negative, so you just
send out every result.  But that just makes the reports low quality, so people
ignore them, which is much worse.  The status quo of sending reports to the
authors of randomly selected kernel commits is really not acceptable.

> 
> I consider bisection high priority, but unfortunately it's only one
> item among other high-priority and very-high-priority work.
> Besides work on the fuzzer itself and on bug detection tools, we now
> test 15 kernels across 6 different OSes. Operational work can't be
> deprioritized, because then nothing will work at all. Change reviews
> can't be deprioritized. Overseeing bug flow can't be deprioritized.
> Updating crash parsing in response to new kernel output can't be
> deprioritized. Answering all human emails can't be deprioritized.
> 

So don't be surprised when people ignore syzbot reports for the same reason.

- Eric
