* BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
@ 2020-03-22  6:43 syzbot
  2020-03-22  6:59 ` Dmitry Vyukov
                   ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: syzbot @ 2020-03-22  6:43 UTC (permalink / raw)
  To: bp, hpa, jmattson, joro, kvm, linux-kernel, mingo, pbonzini,
      sean.j.christopherson, syzkaller-bugs, tglx, vkuznets, wanpengli, x86

Hello,

syzbot found the following crash on:

HEAD commit:    b74b991f Merge tag 'block-5.6-20200320' of git://git.kerne..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16403223e00000
kernel config:  https://syzkaller.appspot.com/x/.config?x=6dfa02302d6db985
dashboard link: https://syzkaller.appspot.com/bug?extid=3f29ca2efb056a761e38
compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com

BUG: kernel NULL pointer dereference, address: 0000000000000086
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD a63a4067 P4D a63a4067 PUD a7627067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 9785 Comm: syz-executor.2 Not tainted 5.6.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:0x86
Code: Bad RIP value.
RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086
RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000
RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453
RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000
R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
FS:  00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 handle_external_interrupt_irqoff+0x154/0x280 arch/x86/kvm/vmx/vmx.c:6274
 kvm_before_interrupt arch/x86/kvm/x86.h:343 [inline]
 handle_external_interrupt_irqoff+0x132/0x280 arch/x86/kvm/vmx/vmx.c:6272
 __irqentry_text_start+0x8/0x8
 vcpu_enter_guest+0x6c77/0x9290 arch/x86/kvm/x86.c:8405
 save_stack mm/kasan/common.c:72 [inline]
 set_track mm/kasan/common.c:80 [inline]
 kasan_set_free_info mm/kasan/common.c:337 [inline]
 __kasan_slab_free+0x12e/0x1e0 mm/kasan/common.c:476
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x10a/0x220 mm/slab.c:3757
 tomoyo_path_number_perm+0x525/0x690 security/tomoyo/file.c:736
 security_file_ioctl+0x55/0xb0 security/security.c:1441
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
 __lock_acquire+0xc5a/0x1bc0 kernel/locking/lockdep.c:3954
 test_bit include/asm-generic/bitops/instrumented-non-atomic.h:110 [inline]
 hlock_class kernel/locking/lockdep.c:163 [inline]
 mark_lock+0x107/0x1650 kernel/locking/lockdep.c:3642
 lock_acquire+0x154/0x250 kernel/locking/lockdep.c:4484
 rcu_lock_acquire+0x9/0x30 include/linux/rcupdate.h:208
 kvm_check_async_pf_completion+0x34e/0x360 arch/x86/kvm/../../../virt/kvm/async_pf.c:137
 vcpu_run+0x3a3/0xd50 arch/x86/kvm/x86.c:8513
 kvm_arch_vcpu_ioctl_run+0x419/0x880 arch/x86/kvm/x86.c:8735
 kvm_vcpu_ioctl+0x67c/0xa80 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2932
 kvm_vm_release+0x50/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:858
 vfs_ioctl fs/ioctl.c:47 [inline]
 ksys_ioctl fs/ioctl.c:763 [inline]
 __do_sys_ioctl fs/ioctl.c:772 [inline]
 __se_sys_ioctl+0xf9/0x160 fs/ioctl.c:770
 do_syscall_64+0xf3/0x1b0 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
Modules linked in:
CR2: 0000000000000086
---[ end trace 4da75c292cd7e3e8 ]---
RIP: 0010:0x86
Code: Bad RIP value.
RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086
RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000
RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453
RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000
R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
FS:  00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

^ permalink raw reply [flat|nested] 20+ messages in thread
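The two "#PF:" lines in the report above are the kernel's rendering of the x86 page-fault error code. As a minimal sketch of how 0x0010 breaks down, the bit positions below are the architectural ones, while the macro, struct, and function names are made up for illustration and are not the kernel's own definitions:

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (illustrative names, not kernel macros). */
#define PF_SKETCH_PROT  (1u << 0) /* 0: page not present, 1: protection fault */
#define PF_SKETCH_WRITE (1u << 1) /* 0: read access, 1: write access */
#define PF_SKETCH_USER  (1u << 2) /* 0: supervisor mode, 1: user mode */
#define PF_SKETCH_RSVD  (1u << 3) /* reserved bit set in a paging entry */
#define PF_SKETCH_INSTR (1u << 4) /* fault was an instruction fetch */

/* Hypothetical container for the decoded fields. */
struct pf_sketch {
	bool protection;
	bool write;
	bool user;
	bool reserved_bit;
	bool instr_fetch;
};

/* Split a raw error code into its individual flags. */
struct pf_sketch decode_pf_error_code(uint32_t error_code)
{
	struct pf_sketch info;

	info.protection   = (error_code & PF_SKETCH_PROT) != 0;
	info.write        = (error_code & PF_SKETCH_WRITE) != 0;
	info.user         = (error_code & PF_SKETCH_USER) != 0;
	info.reserved_bit = (error_code & PF_SKETCH_RSVD) != 0;
	info.instr_fetch  = (error_code & PF_SKETCH_INSTR) != 0;
	return info;
}
```

For error_code 0x0010 only the instruction-fetch bit is set: a supervisor-mode instruction fetch from a not-present page, which is consistent with RIP holding the bogus address 0x86.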
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
  2020-03-22  6:43 BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff syzbot
@ 2020-03-22  6:59 ` Dmitry Vyukov
  2020-03-22  7:03   ` Dmitry Vyukov
  2020-03-23  8:18   ` Paolo Bonzini
  2020-03-22  8:53 ` syzbot
  2020-03-22 13:29 ` syzbot
  2 siblings, 2 replies; 20+ messages in thread
From: Dmitry Vyukov @ 2020-03-22  6:59 UTC (permalink / raw)
  To: syzbot, clang-built-linux
  Cc: Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel,
      KVM list, LKML, Ingo Molnar, Paolo Bonzini, Christopherson, Sean J,
      syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanpengli,
      the arch/x86 maintainers

On Sun, Mar 22, 2020 at 7:43 AM syzbot
<syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    b74b991f Merge tag 'block-5.6-20200320' of git://git.kerne..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=16403223e00000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=6dfa02302d6db985
> dashboard link: https://syzkaller.appspot.com/bug?extid=3f29ca2efb056a761e38
> compiler:       clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com

+clang-built-linux

This only happens on the instance that uses clang. So potentially this
is related to clang. The instance also uses smack lsm, but it's less
likely to be involved I think.

This actually started happening around Mar 6, but the ORC unwinder
somehow fails to unwind stack and prints only questionable frames, so
the reports were classified as "corrupted" and all thrown in the
"corrupted reports" bucket:
https://syzkaller.appspot.com/bug?id=d5bc3e0c66d200d72216ab343a67c4327e4a3452

There is already some discussion about this on the clang-built-linux list:
https://groups.google.com/d/msg/clang-built-linux/Cm3VojRK69I/cfDGxIlTAwAJ

The handle_external_interrupt_irqoff has some inline asm and the
special STACK_FRAME_NON_STANDARD. So it has some potential for bad
interaction with compilers...

The commit range is presumably
fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7
But I don't see anything that says "it's me". The only commit that
does non-trivial changes to x86/vmx seems to be "KVM: VMX: check
descriptor table exits on instruction emulation":

$ git log --oneline fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7 virt/kvm/ arch/x86/kvm/
86f7e90ce840a KVM: VMX: check descriptor table exits on instruction emulation
e951445f4d3b5 Merge tag 'kvmarm-fixes-5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
ef935c25fd648 kvm: x86: Limit the number of "kvm: disabled by bios" messages
aaec7c03de92c KVM: x86: avoid useless copy of cpufreq policy
4f337faf1c55e KVM: allow disabling -Werror
575b255c1663c KVM: x86: allow compiling as non-module with W=1
7943f4acea3ca KVM: SVM: allocate AVIC data structures based on kvm_amd module parameter
b3f15ec3d809c kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe()
51b2569402a38 KVM: arm/arm64: Fix up includes for trace.h

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
  2020-03-22  6:59 ` Dmitry Vyukov
@ 2020-03-22  7:03   ` Dmitry Vyukov
  2020-03-23  8:18   ` Paolo Bonzini
  1 sibling, 0 replies; 20+ messages in thread
From: Dmitry Vyukov @ 2020-03-22  7:03 UTC (permalink / raw)
  To: syzbot, clang-built-linux
  Cc: Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel,
      KVM list, LKML, Ingo Molnar, Paolo Bonzini, Christopherson, Sean J,
      syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanpengli,
      the arch/x86 maintainers

On Sun, Mar 22, 2020 at 7:59 AM Dmitry Vyukov <dvyukov@google.com> wrote:
>
> +clang-built-linux
>
> This only happens on the instance that uses clang. So potentially this
> is related to clang. The instance also uses smack lsm, but it's less
> likely to be involved I think.
> This actually started happening around Mar 6, but the ORC unwinder
> somehow fails to unwind stack and prints only questionable frames, so
> the reports were classified as "corrupted" and all thrown in the
> "corrupted reports" bucket:
> https://syzkaller.appspot.com/bug?id=d5bc3e0c66d200d72216ab343a67c4327e4a3452
>
> There is already some discussion about this on the clang-built-linux list:
> https://groups.google.com/d/msg/clang-built-linux/Cm3VojRK69I/cfDGxIlTAwAJ
>
> The handle_external_interrupt_irqoff has some inline asm and the
> special STACK_FRAME_NON_STANDARD. So it has some potential for bad
> interaction with compilers...
>
> The commit range is presumably
> fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7
> But I don't see anything that says "it's me". The only commit that
> does non-trivial changes to x86/vmx seems to be "KVM: VMX: check
> descriptor table exits on instruction emulation":
>
> $ git log --oneline
> fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7
> virt/kvm/ arch/x86/kvm/
> 86f7e90ce840a KVM: VMX: check descriptor table exits on instruction emulation
> e951445f4d3b5 Merge tag 'kvmarm-fixes-5.6-1' of
> git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
> ef935c25fd648 kvm: x86: Limit the number of "kvm: disabled by bios" messages
> aaec7c03de92c KVM: x86: avoid useless copy of cpufreq policy
> 4f337faf1c55e KVM: allow disabling -Werror
> 575b255c1663c KVM: x86: allow compiling as non-module with W=1
> 7943f4acea3ca KVM: SVM: allocate AVIC data structures based on kvm_amd
> module parameter
> b3f15ec3d809c kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe()
> 51b2569402a38 KVM: arm/arm64: Fix up includes for trace.h

And the problem with this crash is that it happens all the time,
basically the only crash that now happens on the instance. So
effectively all kernel testing of all subsystems has stalled due to
this.

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
  2020-03-22  6:59 ` Dmitry Vyukov
  2020-03-22  7:03   ` Dmitry Vyukov
@ 2020-03-23  8:18   ` Paolo Bonzini
  2020-03-23 16:31     ` Alexander Potapenko
  1 sibling, 1 reply; 20+ messages in thread
From: Paolo Bonzini @ 2020-03-23  8:18 UTC (permalink / raw)
  To: Dmitry Vyukov, syzbot, clang-built-linux
  Cc: Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel,
      KVM list, LKML, Ingo Molnar, Christopherson, Sean J,
      syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanpengli,
      the arch/x86 maintainers

On 22/03/20 07:59, Dmitry Vyukov wrote:
>
> The commit range is presumably
> fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7
> But I don't see anything that says "it's me". The only commit that
> does non-trivial changes to x86/vmx seems to be "KVM: VMX: check
> descriptor table exits on instruction emulation":

That seems unlikely, it's a completely different file and it would only
affect the outside (non-nested) environment rather than your own kernel.

The only instance of "0x86" in the registers is in the flags:

> RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086
> RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000
> RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453
> RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
> R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000
> R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
> FS:  00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

That would suggest a miscompilation of the inline assembly, which does
push the flags:

#ifdef CONFIG_X86_64
		"mov %%" _ASM_SP ", %[sp]\n\t"
		"and $0xfffffffffffffff0, %%" _ASM_SP "\n\t"
		"push $%c[ss]\n\t"
		"push %[sp]\n\t"
#endif
		"pushf\n\t"
		__ASM_SIZE(push) " $%c[cs]\n\t"
		CALL_NOSPEC

It would not explain why it suddenly started to break, unless the clang
version also changed, but it would be easy to ascertain and fix (in
either KVM or clang). Dmitry, can you send me the vmx.o and
kvm-intel.ko files?

Thanks,

Paolo

^ permalink raw reply [flat|nested] 20+ messages in thread
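For reference, the EFLAGS value from the register dump can be decoded bit by bit. The sketch below is illustrative only: the bit positions are the architectural ones and the macros are named after the kernel's X86_EFLAGS_* constants but defined locally, while the two helper functions are hypothetical. It shows the arithmetic behind the "0x86" observation and does not by itself establish a root cause.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFLAGS bit positions, defined locally for this sketch. */
#define X86_EFLAGS_CF    (1u << 0)  /* carry flag */
#define X86_EFLAGS_FIXED (1u << 1)  /* reserved, always reads as 1 */
#define X86_EFLAGS_PF    (1u << 2)  /* parity flag */
#define X86_EFLAGS_ZF    (1u << 6)  /* zero flag */
#define X86_EFLAGS_SF    (1u << 7)  /* sign flag */
#define X86_EFLAGS_IF    (1u << 9)  /* interrupt enable flag */
#define X86_EFLAGS_RF    (1u << 16) /* resume flag */

/* Hypothetical helper: true if any bit in mask is set in eflags. */
bool eflags_test(uint32_t eflags, uint32_t mask)
{
	return (eflags & mask) != 0;
}

/* Hypothetical helper: the low 16 bits of the flags register. */
uint32_t eflags_low16(uint32_t eflags)
{
	return eflags & 0xffffu;
}
```

Applied to the dump, 0x00010086 is RF | SF | PF | the always-set bit 1, with IF clear (interrupts disabled), and its low 16 bits are 0x86, the same value as the bad RIP.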
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
  2020-03-23  8:18   ` Paolo Bonzini
@ 2020-03-23 16:31     ` Alexander Potapenko
  2020-03-23 16:39       ` Sean Christopherson
  0 siblings, 1 reply; 20+ messages in thread
From: Alexander Potapenko @ 2020-03-23 16:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Dmitry Vyukov, syzbot, clang-built-linux, Borislav Petkov,
      H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML,
      Ingo Molnar, Christopherson, Sean J, syzkaller-bugs,
      Thomas Gleixner, Vitaly Kuznetsov, wanpengli,
      the arch/x86 maintainers

[-- Attachment #1: Type: text/plain, Size: 2329 bytes --]

On Mon, Mar 23, 2020 at 9:18 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 22/03/20 07:59, Dmitry Vyukov wrote:
> >
> > The commit range is presumably
> > fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7
> > But I don't see anything that says "it's me". The only commit that
> > does non-trivial changes to x86/vmx seems to be "KVM: VMX: check
> > descriptor table exits on instruction emulation":
>
> That seems unlikely, it's a completely different file and it would only
> affect the outside (non-nested) environment rather than your own kernel.
>
> The only instance of "0x86" in the registers is in the flags:
>
> > RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086
> > RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000
> > RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453
> > RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
> > R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000
> > R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
> > FS:  00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>
> That would suggest a miscompilation of the inline assembly, which does
> push the flags:
>
> #ifdef CONFIG_X86_64
>                 "mov %%" _ASM_SP ", %[sp]\n\t"
>                 "and $0xfffffffffffffff0, %%" _ASM_SP "\n\t"
>                 "push $%c[ss]\n\t"
>                 "push %[sp]\n\t"
> #endif
>                 "pushf\n\t"
>                 __ASM_SIZE(push) " $%c[cs]\n\t"
>                 CALL_NOSPEC
>
>
> It would not explain why it suddenly started to break, unless the clang
> version also changed, but it would be easy to ascertain and fix (in
> either KVM or clang). Dmitry, can you send me the vmx.o and
> kvm-intel.ko files?

On a quick glance, Clang does not miscompile this part.
Attached is the disassembly of handle_external_interrupt_irqoff() from
v5.4 (where the problem seems to also reproduce) with Clang and GCC.
They do virtually the same (look for asm blob after kvm_before_interrupt()).
[-- Attachment #2: handle_external_interrupt_irqoff.gcc.txt --] [-- Type: text/plain, Size: 14916 bytes --] vmlinux.gcc: file format elf64-x86-64 Disassembly of section .text: ffffffff8118dfe0 <handle_external_interrupt_irqoff>: handle_external_interrupt_irqoff(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6230 kvm_after_interrupt(&vmx->vcpu); } } static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu) { ffffffff8118dfe0: 41 55 push %r13 ffffffff8118dfe2: 41 54 push %r12 ffffffff8118dfe4: 55 push %rbp ffffffff8118dfe5: 48 89 fd mov %rdi,%rbp ffffffff8118dfe8: 53 push %rbx arch_static_branch(): /usr/local/google/src/linux-trunk/./arch/x86/include/asm/jump_label.h:25 #include <linux/stringify.h> #include <linux/types.h> static __always_inline bool arch_static_branch(struct static_key *key, bool branch) { asm_volatile_goto("1:" ffffffff8118dfe9: e8 42 79 57 00 callq ffffffff81705930 <__sanitizer_cov_trace_pc> ffffffff8118dfee: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) ffffffff8118dff3: e8 38 79 57 00 callq ffffffff81705930 <__sanitizer_cov_trace_pc> __vmcs_readl(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/ops.h:70 static __always_inline unsigned long __vmcs_readl(unsigned long field) { unsigned long value; asm volatile("1: vmread %2, %1\n\t" ffffffff8118dff8: bb 04 44 00 00 mov $0x4404,%ebx ffffffff8118dffd: 0f 78 db vmread %rbx,%rbx ffffffff8118e000: 3e 77 0d ja,pt ffffffff8118e010 <handle_external_interrupt_irqoff+0x30> ffffffff8118e003: 48 89 df mov %rbx,%rdi ffffffff8118e006: 48 31 f6 xor %rsi,%rsi ffffffff8118e009: e8 62 93 00 00 callq ffffffff81197370 <vmread_error> ffffffff8118e00e: 31 db xor %ebx,%ebx is_external_intr(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmcs.h:129 == (INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK); } static inline bool is_external_intr(u32 intr_info) { return (intr_info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK)) ffffffff8118e010: 41 89 dc mov %ebx,%r12d handle_external_interrupt_irqoff(): 
/usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6240 #endif gate_desc *desc; u32 intr_info; intr_info = vmcs_read32(VM_EXIT_INTR_INFO); if (WARN_ONCE(!is_external_intr(intr_info), ffffffff8118e013: bf 00 00 00 80 mov $0x80000000,%edi vmcs_read32(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/ops.h:102 static __always_inline u32 vmcs_read32(unsigned long field) { vmcs_check32(field); if (static_branch_unlikely(&enable_evmcs)) return evmcs_read32(field); return __vmcs_readl(field); ffffffff8118e018: 41 89 dd mov %ebx,%r13d is_external_intr(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmcs.h:129 ffffffff8118e01b: 41 81 e4 00 07 00 80 and $0x80000700,%r12d handle_external_interrupt_irqoff(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6240 ffffffff8118e022: 44 89 e6 mov %r12d,%esi ffffffff8118e025: e8 76 7a 57 00 callq ffffffff81705aa0 <__sanitizer_cov_trace_const_cmp4> ffffffff8118e02a: 41 81 fc 00 00 00 80 cmp $0x80000000,%r12d ffffffff8118e031: 0f 85 7a 01 00 00 jne ffffffff8118e1b1 <handle_external_interrupt_irqoff+0x1d1> /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6244 "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info)) return; vector = intr_info & INTR_INFO_VECTOR_MASK; ffffffff8118e037: e8 f4 78 57 00 callq ffffffff81705930 <__sanitizer_cov_trace_pc> /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6245 desc = (gate_desc *)host_idt_base + vector; ffffffff8118e03c: 0f b6 db movzbl %bl,%ebx gate_offset(): /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:93 typedef struct gate_struct gate_desc; static inline unsigned long gate_offset(const gate_desc *g) { #ifdef CONFIG_X86_64 return g->offset_low | ((unsigned long)g->offset_middle << 16) | ffffffff8118e03f: 48 b9 00 00 00 00 00 movabs $0xdffffc0000000000,%rcx ffffffff8118e046: fc ff df handle_external_interrupt_irqoff(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6245 ffffffff8118e049: 48 c1 e3 04 shl $0x4,%rbx 
ffffffff8118e04d: 48 03 1d 2c 68 2f 0a add 0xa2f682c(%rip),%rbx # ffffffff8b484880 <host_idt_base> gate_offset(): /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:93 ffffffff8118e054: 48 89 d8 mov %rbx,%rax ffffffff8118e057: 48 c1 e8 03 shr $0x3,%rax ffffffff8118e05b: 0f b6 14 08 movzbl (%rax,%rcx,1),%edx ffffffff8118e05f: 48 8d 43 01 lea 0x1(%rbx),%rax ffffffff8118e063: 48 89 c6 mov %rax,%rsi ffffffff8118e066: 48 c1 ee 03 shr $0x3,%rsi ffffffff8118e06a: 0f b6 0c 0e movzbl (%rsi,%rcx,1),%ecx ffffffff8118e06e: 48 89 de mov %rbx,%rsi ffffffff8118e071: 83 e6 07 and $0x7,%esi ffffffff8118e074: 40 38 f2 cmp %sil,%dl ffffffff8118e077: 40 0f 9e c6 setle %sil ffffffff8118e07b: 84 d2 test %dl,%dl ffffffff8118e07d: 0f 95 c2 setne %dl ffffffff8118e080: 40 84 d6 test %dl,%sil ffffffff8118e083: 0f 85 7e 01 00 00 jne ffffffff8118e207 <handle_external_interrupt_irqoff+0x227> ffffffff8118e089: 83 e0 07 and $0x7,%eax ffffffff8118e08c: 38 c1 cmp %al,%cl ffffffff8118e08e: 0f 9e c2 setle %dl ffffffff8118e091: 84 c9 test %cl,%cl ffffffff8118e093: 0f 95 c0 setne %al ffffffff8118e096: 84 c2 test %al,%dl ffffffff8118e098: 0f 85 69 01 00 00 jne ffffffff8118e207 <handle_external_interrupt_irqoff+0x227> ffffffff8118e09e: 48 8d 7b 06 lea 0x6(%rbx),%rdi ffffffff8118e0a2: 44 0f b7 2b movzwl (%rbx),%r13d ffffffff8118e0a6: 48 b9 00 00 00 00 00 movabs $0xdffffc0000000000,%rcx ffffffff8118e0ad: fc ff df ffffffff8118e0b0: 48 89 f8 mov %rdi,%rax ffffffff8118e0b3: 48 c1 e8 03 shr $0x3,%rax ffffffff8118e0b7: 0f b6 14 08 movzbl (%rax,%rcx,1),%edx ffffffff8118e0bb: 48 8d 43 07 lea 0x7(%rbx),%rax ffffffff8118e0bf: 48 89 c6 mov %rax,%rsi ffffffff8118e0c2: 48 c1 ee 03 shr $0x3,%rsi ffffffff8118e0c6: 0f b6 0c 0e movzbl (%rsi,%rcx,1),%ecx ffffffff8118e0ca: 48 89 fe mov %rdi,%rsi ffffffff8118e0cd: 83 e6 07 and $0x7,%esi ffffffff8118e0d0: 40 38 f2 cmp %sil,%dl ffffffff8118e0d3: 40 0f 9e c6 setle %sil ffffffff8118e0d7: 84 d2 test %dl,%dl ffffffff8118e0d9: 0f 95 c2 setne %dl ffffffff8118e0dc: 
40 84 d6 test %dl,%sil ffffffff8118e0df: 0f 85 13 01 00 00 jne ffffffff8118e1f8 <handle_external_interrupt_irqoff+0x218> ffffffff8118e0e5: 83 e0 07 and $0x7,%eax ffffffff8118e0e8: 38 c1 cmp %al,%cl ffffffff8118e0ea: 0f 9e c2 setle %dl ffffffff8118e0ed: 84 c9 test %cl,%cl ffffffff8118e0ef: 0f 95 c0 setne %al ffffffff8118e0f2: 84 c2 test %al,%dl ffffffff8118e0f4: 0f 85 fe 00 00 00 jne ffffffff8118e1f8 <handle_external_interrupt_irqoff+0x218> /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:94 ((unsigned long) g->offset_high << 32); ffffffff8118e0fa: 48 8d 7b 08 lea 0x8(%rbx),%rdi /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:93 return g->offset_low | ((unsigned long)g->offset_middle << 16) | ffffffff8118e0fe: 44 0f b7 63 06 movzwl 0x6(%rbx),%r12d /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:94 ((unsigned long) g->offset_high << 32); ffffffff8118e103: 48 b9 00 00 00 00 00 movabs $0xdffffc0000000000,%rcx ffffffff8118e10a: fc ff df ffffffff8118e10d: 48 89 f8 mov %rdi,%rax ffffffff8118e110: 48 c1 e8 03 shr $0x3,%rax ffffffff8118e114: 0f b6 14 08 movzbl (%rax,%rcx,1),%edx ffffffff8118e118: 48 8d 43 0b lea 0xb(%rbx),%rax ffffffff8118e11c: 48 89 c6 mov %rax,%rsi /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:93 return g->offset_low | ((unsigned long)g->offset_middle << 16) | ffffffff8118e11f: 49 c1 e4 10 shl $0x10,%r12 /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:94 ((unsigned long) g->offset_high << 32); ffffffff8118e123: 48 c1 ee 03 shr $0x3,%rsi ffffffff8118e127: 0f b6 0c 0e movzbl (%rsi,%rcx,1),%ecx ffffffff8118e12b: 48 89 fe mov %rdi,%rsi ffffffff8118e12e: 83 e6 07 and $0x7,%esi ffffffff8118e131: 40 38 f2 cmp %sil,%dl ffffffff8118e134: 40 0f 9e c6 setle %sil ffffffff8118e138: 84 d2 test %dl,%dl ffffffff8118e13a: 0f 95 c2 setne %dl ffffffff8118e13d: 40 84 d6 test %dl,%sil ffffffff8118e140: 0f 85 a3 00 00 00 jne ffffffff8118e1e9 
<handle_external_interrupt_irqoff+0x209> ffffffff8118e146: 83 e0 07 and $0x7,%eax ffffffff8118e149: 38 c1 cmp %al,%cl ffffffff8118e14b: 0f 9e c2 setle %dl ffffffff8118e14e: 84 c9 test %cl,%cl ffffffff8118e150: 0f 95 c0 setne %al ffffffff8118e153: 84 c2 test %al,%dl ffffffff8118e155: 0f 85 8e 00 00 00 jne ffffffff8118e1e9 <handle_external_interrupt_irqoff+0x209> ffffffff8118e15b: 8b 5b 08 mov 0x8(%rbx),%ebx kvm_before_interrupt(): /usr/local/google/src/linux-trunk/arch/x86/kvm/x86.h:352 DECLARE_PER_CPU(struct kvm_vcpu *, current_vcpu); static inline void kvm_before_interrupt(struct kvm_vcpu *vcpu) { __this_cpu_write(current_vcpu, vcpu); ffffffff8118e15e: 48 c7 c7 80 fd e3 87 mov $0xffffffff87e3fd80,%rdi ffffffff8118e165: e8 26 21 6a 02 callq ffffffff83830290 <__this_cpu_preempt_check> ffffffff8118e16a: 65 48 89 2d 5e ff e8 mov %rbp,%gs:0x7ee8ff5e(%rip) # 1e0d0 <current_vcpu> ffffffff8118e171: 7e gate_offset(): /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:94 ffffffff8118e172: 48 c1 e3 20 shl $0x20,%rbx /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:93 return g->offset_low | ((unsigned long)g->offset_middle << 16) | ffffffff8118e176: 4c 09 e3 or %r12,%rbx ffffffff8118e179: 4c 09 eb or %r13,%rbx handle_external_interrupt_irqoff(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6250 entry = gate_offset(desc); kvm_before_interrupt(vcpu); asm volatile( ffffffff8118e17c: 48 89 e0 mov %rsp,%rax ffffffff8118e17f: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp ffffffff8118e183: 6a 18 pushq $0x18 ffffffff8118e185: 50 push %rax ffffffff8118e186: 9c pushfq ffffffff8118e187: 6a 10 pushq $0x10 ffffffff8118e189: ff d3 callq *%rbx kvm_after_interrupt(): /usr/local/google/src/linux-trunk/arch/x86/kvm/x86.h:357 } static inline void kvm_after_interrupt(struct kvm_vcpu *vcpu) { __this_cpu_write(current_vcpu, NULL); ffffffff8118e18b: 48 c7 c7 80 fd e3 87 mov $0xffffffff87e3fd80,%rdi ffffffff8118e192: e8 f9 20 6a 02 callq ffffffff83830290 
<__this_cpu_preempt_check> ffffffff8118e197: 65 48 c7 05 2d ff e8 movq $0x0,%gs:0x7ee8ff2d(%rip) # 1e0d0 <current_vcpu> ffffffff8118e19e: 7e 00 00 00 00 handle_external_interrupt_irqoff(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6272 [ss]"i"(__KERNEL_DS), [cs]"i"(__KERNEL_CS) ); kvm_after_interrupt(vcpu); } ffffffff8118e1a3: 5b pop %rbx ffffffff8118e1a4: 5d pop %rbp ffffffff8118e1a5: 41 5c pop %r12 ffffffff8118e1a7: 41 5d pop %r13 ffffffff8118e1a9: e9 82 77 57 00 jmpq ffffffff81705930 <__sanitizer_cov_trace_pc> vmcs_read32(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/ops.h:101 return evmcs_read32(field); ffffffff8118e1ae: 45 31 ed xor %r13d,%r13d handle_external_interrupt_irqoff(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6240 (discriminator 1) if (WARN_ONCE(!is_external_intr(intr_info), ffffffff8118e1b1: e8 7a 77 57 00 callq ffffffff81705930 <__sanitizer_cov_trace_pc> ffffffff8118e1b6: 0f b6 1d cf 2f fa 08 movzbl 0x8fa2fcf(%rip),%ebx # ffffffff8a13118c <__warned.77930> ffffffff8118e1bd: 31 ff xor %edi,%edi ffffffff8118e1bf: 89 de mov %ebx,%esi ffffffff8118e1c1: e8 9a 78 57 00 callq ffffffff81705a60 <__sanitizer_cov_trace_const_cmp1> ffffffff8118e1c6: 84 db test %bl,%bl ffffffff8118e1c8: 75 d9 jne ffffffff8118e1a3 <handle_external_interrupt_irqoff+0x1c3> /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6240 (discriminator 3) ffffffff8118e1ca: e8 61 77 57 00 callq ffffffff81705930 <__sanitizer_cov_trace_pc> ffffffff8118e1cf: 44 89 ee mov %r13d,%esi ffffffff8118e1d2: 48 c7 c7 e0 fc e3 87 mov $0xffffffff87e3fce0,%rdi ffffffff8118e1d9: c6 05 ac 2f fa 08 01 movb $0x1,0x8fa2fac(%rip) # ffffffff8a13118c <__warned.77930> ffffffff8118e1e0: e8 10 1a 2a 00 callq ffffffff8142fbf5 <__warn_printk> ffffffff8118e1e5: 0f 0b ud2 ffffffff8118e1e7: eb ba jmp ffffffff8118e1a3 <handle_external_interrupt_irqoff+0x1c3> gate_offset(): /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:94 ((unsigned long) g->offset_high << 
32); ffffffff8118e1e9: be 04 00 00 00 mov $0x4,%esi ffffffff8118e1ee: e8 4d 80 91 00 callq ffffffff81aa6240 <__asan_report_load_n_noabort> ffffffff8118e1f3: e9 63 ff ff ff jmpq ffffffff8118e15b <handle_external_interrupt_irqoff+0x17b> /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:93 return g->offset_low | ((unsigned long)g->offset_middle << 16) | ffffffff8118e1f8: be 02 00 00 00 mov $0x2,%esi ffffffff8118e1fd: e8 3e 80 91 00 callq ffffffff81aa6240 <__asan_report_load_n_noabort> ffffffff8118e202: e9 f3 fe ff ff jmpq ffffffff8118e0fa <handle_external_interrupt_irqoff+0x11a> ffffffff8118e207: be 02 00 00 00 mov $0x2,%esi ffffffff8118e20c: 48 89 df mov %rbx,%rdi ffffffff8118e20f: e8 2c 80 91 00 callq ffffffff81aa6240 <__asan_report_load_n_noabort> ffffffff8118e214: e9 85 fe ff ff jmpq ffffffff8118e09e <handle_external_interrupt_irqoff+0xbe> handle_external_interrupt_irqoff(): /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:93 ffffffff8118e219: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) [-- Attachment #3: handle_external_interrupt_irqoff.clang.txt --] [-- Type: text/plain, Size: 15167 bytes --] vmlinux.clang: file format elf64-x86-64 Disassembly of section .text: ffffffff811b7850 <handle_external_interrupt_irqoff>: handle_external_interrupt_irqoff(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6230 kvm_after_interrupt(&vmx->vcpu); } } static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu) { ffffffff811b7850: 55 push %rbp ffffffff811b7851: 41 57 push %r15 ffffffff811b7853: 41 56 push %r14 ffffffff811b7855: 41 55 push %r13 ffffffff811b7857: 41 54 push %r12 ffffffff811b7859: 53 push %rbx ffffffff811b785a: 48 83 ec 10 sub $0x10,%rsp ffffffff811b785e: 49 89 fe mov %rdi,%r14 ffffffff811b7861: e8 6a 06 56 00 callq ffffffff81717ed0 <__sanitizer_cov_trace_pc> ffffffff811b7866: 31 db xor %ebx,%ebx arch_static_branch(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6230 ffffffff811b7868: 0f 1f 44 00 00 nopl 
0x0(%rax,%rax,1) vmcs_read32(): /usr/local/google/src/linux-trunk/./arch/x86/include/asm/jump_label.h:25 #include <linux/stringify.h> #include <linux/types.h> static __always_inline bool arch_static_branch(struct static_key *key, bool branch) { asm_volatile_goto("1:" ffffffff811b786d: e8 5e 06 56 00 callq ffffffff81717ed0 <__sanitizer_cov_trace_pc> ffffffff811b7872: b8 04 44 00 00 mov $0x4404,%eax __vmcs_readl(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/ops.h:70 static __always_inline unsigned long __vmcs_readl(unsigned long field) { unsigned long value; asm volatile("1: vmread %2, %1\n\t" ffffffff811b7877: 0f 78 c3 vmread %rax,%rbx ffffffff811b787a: 3e 77 0d ja,pt ffffffff811b788a <handle_external_interrupt_irqoff+0x3a> ffffffff811b787d: 48 89 c7 mov %rax,%rdi ffffffff811b7880: 48 31 f6 xor %rsi,%rsi ffffffff811b7883: e8 d8 2f ff ff callq ffffffff811aa860 <vmread_error> ffffffff811b7888: 31 db xor %ebx,%ebx is_external_intr(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmcs.h:129 == (INTR_TYPE_NMI_INTR | INTR_INFO_VALID_MASK); } static inline bool is_external_intr(u32 intr_info) { return (intr_info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK)) ffffffff811b788a: 89 dd mov %ebx,%ebp ffffffff811b788c: 81 e5 00 07 00 80 and $0x80000700,%ebp /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmcs.h:130 == (INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR); ffffffff811b7892: bf 00 00 00 80 mov $0x80000000,%edi ffffffff811b7897: 89 ee mov %ebp,%esi ffffffff811b7899: e8 d2 09 56 00 callq ffffffff81718270 <__sanitizer_cov_trace_const_cmp4> ffffffff811b789e: 81 fd 00 00 00 80 cmp $0x80000000,%ebp handle_external_interrupt_irqoff(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6240 #endif gate_desc *desc; u32 intr_info; intr_info = vmcs_read32(VM_EXIT_INTR_INFO); if (WARN_ONCE(!is_external_intr(intr_info), ffffffff811b78a4: 0f 85 14 01 00 00 jne ffffffff811b79be <handle_external_interrupt_irqoff+0x16e> ffffffff811b78aa: 49 bc 00 00 00 00 00 
movabs $0xdffffc0000000000,%r12 ffffffff811b78b1: fc ff df /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6244 "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info)) return; vector = intr_info & INTR_INFO_VECTOR_MASK; ffffffff811b78b4: e8 17 06 56 00 callq ffffffff81717ed0 <__sanitizer_cov_trace_pc> ffffffff811b78b9: 0f b6 eb movzbl %bl,%ebp /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6245 desc = (gate_desc *)host_idt_base + vector; ffffffff811b78bc: 48 8b 1d 7d e3 2a 09 mov 0x92ae37d(%rip),%rbx # ffffffff8a465c40 <host_idt_base> gate_offset(): /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:93 typedef struct gate_struct gate_desc; static inline unsigned long gate_offset(const gate_desc *g) { #ifdef CONFIG_X86_64 return g->offset_low | ((unsigned long)g->offset_middle << 16) | ffffffff811b78c3: 48 c1 e5 04 shl $0x4,%rbp ffffffff811b78c7: 4c 8d 2c 2b lea (%rbx,%rbp,1),%r13 ffffffff811b78cb: 4c 8d 7c 2b 01 lea 0x1(%rbx,%rbp,1),%r15 ffffffff811b78d0: 4c 89 e8 mov %r13,%rax ffffffff811b78d3: 48 c1 e8 03 shr $0x3,%rax ffffffff811b78d7: 42 8a 04 20 mov (%rax,%r12,1),%al ffffffff811b78db: 84 c0 test %al,%al ffffffff811b78dd: 0f 85 18 01 00 00 jne ffffffff811b79fb <handle_external_interrupt_irqoff+0x1ab> ffffffff811b78e3: 4c 89 f8 mov %r15,%rax ffffffff811b78e6: 48 c1 e8 03 shr $0x3,%rax ffffffff811b78ea: 42 8a 04 20 mov (%rax,%r12,1),%al ffffffff811b78ee: 84 c0 test %al,%al ffffffff811b78f0: 0f 85 25 01 00 00 jne ffffffff811b7a1b <handle_external_interrupt_irqoff+0x1cb> ffffffff811b78f6: 41 0f b7 55 00 movzwl 0x0(%r13),%edx ffffffff811b78fb: 4c 8d 6c 2b 06 lea 0x6(%rbx,%rbp,1),%r13 ffffffff811b7900: 4c 8d 7c 2b 07 lea 0x7(%rbx,%rbp,1),%r15 ffffffff811b7905: 4c 89 e8 mov %r13,%rax ffffffff811b7908: 48 c1 e8 03 shr $0x3,%rax ffffffff811b790c: 42 8a 04 20 mov (%rax,%r12,1),%al ffffffff811b7910: 84 c0 test %al,%al ffffffff811b7912: 0f 85 23 01 00 00 jne ffffffff811b7a3b <handle_external_interrupt_irqoff+0x1eb> ffffffff811b7918: 
4c 89 34 24 mov %r14,(%rsp) ffffffff811b791c: 4c 89 f8 mov %r15,%rax ffffffff811b791f: 48 c1 e8 03 shr $0x3,%rax ffffffff811b7923: 42 8a 04 20 mov (%rax,%r12,1),%al ffffffff811b7927: 84 c0 test %al,%al ffffffff811b7929: 0f 85 34 01 00 00 jne ffffffff811b7a63 <handle_external_interrupt_irqoff+0x213> ffffffff811b792f: 45 0f b7 75 00 movzwl 0x0(%r13),%r14d ffffffff811b7934: 49 c1 e6 10 shl $0x10,%r14 /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:94 ((unsigned long) g->offset_high << 32); ffffffff811b7938: 4c 8d 6c 2b 08 lea 0x8(%rbx,%rbp,1),%r13 ffffffff811b793d: 4c 8d 7c 2b 0b lea 0xb(%rbx,%rbp,1),%r15 ffffffff811b7942: 4c 89 e8 mov %r13,%rax ffffffff811b7945: 48 c1 e8 03 shr $0x3,%rax ffffffff811b7949: 42 8a 04 20 mov (%rax,%r12,1),%al ffffffff811b794d: 84 c0 test %al,%al ffffffff811b794f: 0f 85 34 01 00 00 jne ffffffff811b7a89 <handle_external_interrupt_irqoff+0x239> ffffffff811b7955: 49 09 d6 or %rdx,%r14 ffffffff811b7958: 4c 89 f8 mov %r15,%rax ffffffff811b795b: 48 c1 e8 03 shr $0x3,%rax ffffffff811b795f: 42 8a 04 20 mov (%rax,%r12,1),%al ffffffff811b7963: 84 c0 test %al,%al ffffffff811b7965: 0f 85 44 01 00 00 jne ffffffff811b7aaf <handle_external_interrupt_irqoff+0x25f> ffffffff811b796b: 41 8b 45 00 mov 0x0(%r13),%eax ffffffff811b796f: 48 c1 e0 20 shl $0x20,%rax /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:93 return g->offset_low | ((unsigned long)g->offset_middle << 16) | ffffffff811b7973: 49 09 c6 or %rax,%r14 kvm_before_interrupt(): /usr/local/google/src/linux-trunk/arch/x86/kvm/x86.h:352 DECLARE_PER_CPU(struct kvm_vcpu *, current_vcpu); static inline void kvm_before_interrupt(struct kvm_vcpu *vcpu) { __this_cpu_write(current_vcpu, vcpu); ffffffff811b7976: 48 c7 c7 d8 93 84 88 mov $0xffffffff888493d8,%rdi ffffffff811b797d: e8 ce f3 69 02 callq ffffffff83856d50 <__this_cpu_preempt_check> ffffffff811b7982: 48 8b 04 24 mov (%rsp),%rax ffffffff811b7986: 65 48 89 05 2a 67 e6 mov %rax,%gs:0x7ee6672a(%rip) # 1e0b8 
<current_vcpu> ffffffff811b798d: 7e handle_external_interrupt_irqoff(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6250 entry = gate_offset(desc); kvm_before_interrupt(vcpu); asm volatile( ffffffff811b798e: 4c 89 74 24 08 mov %r14,0x8(%rsp) ffffffff811b7993: 48 89 e0 mov %rsp,%rax ffffffff811b7996: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp ffffffff811b799a: 6a 18 pushq $0x18 ffffffff811b799c: 50 push %rax ffffffff811b799d: 9c pushfq ffffffff811b799e: 6a 10 pushq $0x10 ffffffff811b79a0: ff 54 24 08 callq *0x8(%rsp) kvm_after_interrupt(): /usr/local/google/src/linux-trunk/arch/x86/kvm/x86.h:357 } static inline void kvm_after_interrupt(struct kvm_vcpu *vcpu) { __this_cpu_write(current_vcpu, NULL); ffffffff811b79a4: 48 c7 c7 d8 93 84 88 mov $0xffffffff888493d8,%rdi ffffffff811b79ab: e8 a0 f3 69 02 callq ffffffff83856d50 <__this_cpu_preempt_check> ffffffff811b79b0: 65 48 c7 05 fc 66 e6 movq $0x0,%gs:0x7ee666fc(%rip) # 1e0b8 <current_vcpu> ffffffff811b79b7: 7e 00 00 00 00 ffffffff811b79bc: eb 0e jmp ffffffff811b79cc <handle_external_interrupt_irqoff+0x17c> handle_external_interrupt_irqoff(): /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6240 if (WARN_ONCE(!is_external_intr(intr_info), ffffffff811b79be: 80 3d 80 3a e4 07 01 cmpb $0x1,0x7e43a80(%rip) # ffffffff88ffb445 <handle_external_interrupt_irqoff.__warned> ffffffff811b79c5: 75 14 jne ffffffff811b79db <handle_external_interrupt_irqoff+0x18b> ffffffff811b79c7: e8 04 05 56 00 callq ffffffff81717ed0 <__sanitizer_cov_trace_pc> /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6272 [ss]"i"(__KERNEL_DS), [cs]"i"(__KERNEL_CS) ); kvm_after_interrupt(vcpu); } ffffffff811b79cc: 48 83 c4 10 add $0x10,%rsp ffffffff811b79d0: 5b pop %rbx ffffffff811b79d1: 41 5c pop %r12 ffffffff811b79d3: 41 5d pop %r13 ffffffff811b79d5: 41 5e pop %r14 ffffffff811b79d7: 41 5f pop %r15 ffffffff811b79d9: 5d pop %rbp ffffffff811b79da: c3 retq /usr/local/google/src/linux-trunk/arch/x86/kvm/vmx/vmx.c:6240 if 
(WARN_ONCE(!is_external_intr(intr_info), ffffffff811b79db: e8 f0 04 56 00 callq ffffffff81717ed0 <__sanitizer_cov_trace_pc> ffffffff811b79e0: c6 05 5e 3a e4 07 01 movb $0x1,0x7e43a5e(%rip) # ffffffff88ffb445 <handle_external_interrupt_irqoff.__warned> ffffffff811b79e7: 48 c7 c7 47 88 7b 88 mov $0xffffffff887b8847,%rdi ffffffff811b79ee: 89 de mov %ebx,%esi ffffffff811b79f0: 31 c0 xor %eax,%eax ffffffff811b79f2: e8 e9 76 29 00 callq ffffffff8144f0e0 <__warn_printk> ffffffff811b79f7: 0f 0b ud2 ffffffff811b79f9: eb d1 jmp ffffffff811b79cc <handle_external_interrupt_irqoff+0x17c> gate_offset(): /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:93 ffffffff811b79fb: 44 89 e9 mov %r13d,%ecx ffffffff811b79fe: 80 e1 07 and $0x7,%cl ffffffff811b7a01: 38 c1 cmp %al,%cl ffffffff811b7a03: 0f 8c da fe ff ff jl ffffffff811b78e3 <handle_external_interrupt_irqoff+0x93> ffffffff811b7a09: be 02 00 00 00 mov $0x2,%esi ffffffff811b7a0e: 4c 89 ef mov %r13,%rdi ffffffff811b7a11: e8 aa 8f 8f 00 callq ffffffff81ab09c0 <__asan_report_load_n_noabort> ffffffff811b7a16: e9 c8 fe ff ff jmpq ffffffff811b78e3 <handle_external_interrupt_irqoff+0x93> ffffffff811b7a1b: 44 89 f9 mov %r15d,%ecx ffffffff811b7a1e: 80 e1 07 and $0x7,%cl ffffffff811b7a21: 38 c1 cmp %al,%cl ffffffff811b7a23: 0f 8c cd fe ff ff jl ffffffff811b78f6 <handle_external_interrupt_irqoff+0xa6> ffffffff811b7a29: be 02 00 00 00 mov $0x2,%esi ffffffff811b7a2e: 4c 89 ff mov %r15,%rdi ffffffff811b7a31: e8 8a 8f 8f 00 callq ffffffff81ab09c0 <__asan_report_load_n_noabort> ffffffff811b7a36: e9 bb fe ff ff jmpq ffffffff811b78f6 <handle_external_interrupt_irqoff+0xa6> ffffffff811b7a3b: 44 89 e9 mov %r13d,%ecx ffffffff811b7a3e: 80 e1 07 and $0x7,%cl ffffffff811b7a41: 38 c1 cmp %al,%cl ffffffff811b7a43: 0f 8c cf fe ff ff jl ffffffff811b7918 <handle_external_interrupt_irqoff+0xc8> ffffffff811b7a49: be 02 00 00 00 mov $0x2,%esi ffffffff811b7a4e: 4c 89 ef mov %r13,%rdi ffffffff811b7a51: 48 89 14 24 mov %rdx,(%rsp) 
ffffffff811b7a55: e8 66 8f 8f 00 callq ffffffff81ab09c0 <__asan_report_load_n_noabort> ffffffff811b7a5a: 48 8b 14 24 mov (%rsp),%rdx ffffffff811b7a5e: e9 b5 fe ff ff jmpq ffffffff811b7918 <handle_external_interrupt_irqoff+0xc8> ffffffff811b7a63: 44 89 f9 mov %r15d,%ecx ffffffff811b7a66: 80 e1 07 and $0x7,%cl ffffffff811b7a69: 38 c1 cmp %al,%cl ffffffff811b7a6b: 0f 8c be fe ff ff jl ffffffff811b792f <handle_external_interrupt_irqoff+0xdf> ffffffff811b7a71: be 02 00 00 00 mov $0x2,%esi ffffffff811b7a76: 4c 89 ff mov %r15,%rdi ffffffff811b7a79: 49 89 d6 mov %rdx,%r14 ffffffff811b7a7c: e8 3f 8f 8f 00 callq ffffffff81ab09c0 <__asan_report_load_n_noabort> ffffffff811b7a81: 4c 89 f2 mov %r14,%rdx ffffffff811b7a84: e9 a6 fe ff ff jmpq ffffffff811b792f <handle_external_interrupt_irqoff+0xdf> /usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:94 ((unsigned long) g->offset_high << 32); ffffffff811b7a89: 44 89 e9 mov %r13d,%ecx ffffffff811b7a8c: 80 e1 07 and $0x7,%cl ffffffff811b7a8f: 38 c1 cmp %al,%cl ffffffff811b7a91: 0f 8c be fe ff ff jl ffffffff811b7955 <handle_external_interrupt_irqoff+0x105> ffffffff811b7a97: be 04 00 00 00 mov $0x4,%esi ffffffff811b7a9c: 4c 89 ef mov %r13,%rdi ffffffff811b7a9f: 48 89 d3 mov %rdx,%rbx ffffffff811b7aa2: e8 19 8f 8f 00 callq ffffffff81ab09c0 <__asan_report_load_n_noabort> ffffffff811b7aa7: 48 89 da mov %rbx,%rdx ffffffff811b7aaa: e9 a6 fe ff ff jmpq ffffffff811b7955 <handle_external_interrupt_irqoff+0x105> ffffffff811b7aaf: 44 89 f9 mov %r15d,%ecx ffffffff811b7ab2: 80 e1 07 and $0x7,%cl ffffffff811b7ab5: 38 c1 cmp %al,%cl ffffffff811b7ab7: 0f 8c ae fe ff ff jl ffffffff811b796b <handle_external_interrupt_irqoff+0x11b> ffffffff811b7abd: be 04 00 00 00 mov $0x4,%esi ffffffff811b7ac2: 4c 89 ff mov %r15,%rdi ffffffff811b7ac5: e8 f6 8e 8f 00 callq ffffffff81ab09c0 <__asan_report_load_n_noabort> ffffffff811b7aca: e9 9c fe ff ff jmpq ffffffff811b796b <handle_external_interrupt_irqoff+0x11b> handle_external_interrupt_irqoff(): 
/usr/local/google/src/linux-trunk/./arch/x86/include/asm/desc_defs.h:94 ffffffff811b7acf: 90 nop ^ permalink raw reply [flat|nested] 20+ messages in thread
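Editorial sketch: both listings above inline gate_offset(), which assembles the 64-bit handler address from three fields of a 16-byte IDT gate (offsets 0x0, 0x6 and 0x8 in the disassembly, with the vector scaled by "shl $0x4"). The C below is a minimal stand-alone model of that computation. The struct and the demo_ names are hypothetical simplifications for illustration, not the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for a 64-bit IDT gate. Field names follow the
 * gate_offset() source quoted in the listing; the layout is an assumption
 * inferred from the offsets used in the disassembly. */
struct demo_gate {
    uint16_t offset_low;     /* bytes 0-1:  movzwl (%rbx)     */
    uint16_t segment;        /* bytes 2-3                     */
    uint16_t bits;           /* bytes 4-5:  type/dpl/present  */
    uint16_t offset_middle;  /* bytes 6-7:  movzwl 0x6(%rbx)  */
    uint32_t offset_high;    /* bytes 8-11: mov 0x8(%rbx)     */
    uint32_t reserved;       /* bytes 12-15                   */
};

/* One gate is 16 bytes, which is why the vector is shifted left by 4. */
static_assert(sizeof(struct demo_gate) == 16, "gate must be 16 bytes");

/* Same expression as the quoted gate_offset() for CONFIG_X86_64. */
static uint64_t demo_gate_offset(const struct demo_gate *g)
{
    return (uint64_t)g->offset_low |
           ((uint64_t)g->offset_middle << 16) |
           ((uint64_t)g->offset_high << 32);
}

/* desc = (gate_desc *)host_idt_base + vector */
static const struct demo_gate *demo_gate_for_vector(const struct demo_gate *idt,
                                                    uint8_t vector)
{
    return idt + vector;
}
```

With offset_low = 0x16f0, offset_middle = 0x8800 and offset_high = 0xffffffff, demo_gate_offset() yields 0xffffffff880016f0, the value held in R14 in the register dump of the report.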
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff 2020-03-23 16:31 ` Alexander Potapenko @ 2020-03-23 16:39 ` Sean Christopherson 2020-03-23 16:43 ` Alexander Potapenko 2020-03-23 16:57 ` Nick Desaulniers 0 siblings, 2 replies; 20+ messages in thread From: Sean Christopherson @ 2020-03-23 16:39 UTC (permalink / raw) To: Alexander Potapenko Cc: Paolo Bonzini, Dmitry Vyukov, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanpengli, the arch/x86 maintainers On Mon, Mar 23, 2020 at 05:31:15PM +0100, Alexander Potapenko wrote: > On Mon, Mar 23, 2020 at 9:18 AM Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > On 22/03/20 07:59, Dmitry Vyukov wrote: > > > > > > The commit range is presumably > > > fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7 > > > But I don't see anything that says "it's me". The only commit that > > > does non-trivial changes to x86/vmx seems to be "KVM: VMX: check > > > descriptor table exits on instruction emulation": > > > > That seems unlikely, it's a completely different file and it would only > > affect the outside (non-nested) environment rather than your own kernel. 
> > > > The only instance of "0x86" in the registers is in the flags: > > > > > RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086 > > > RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000 > > > RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453 > > > RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca > > > R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000 > > > R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb > > > FS: 00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0 > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > > That would suggest a miscompilation of the inline assembly, which does > > push the flags: > > > > #ifdef CONFIG_X86_64 > > "mov %%" _ASM_SP ", %[sp]\n\t" > > "and $0xfffffffffffffff0, %%" _ASM_SP "\n\t" > > "push $%c[ss]\n\t" > > "push %[sp]\n\t" > > #endif > > "pushf\n\t" > > __ASM_SIZE(push) " $%c[cs]\n\t" > > CALL_NOSPEC > > > > > > It would not explain why it suddenly started to break, unless the clang > > version also changed, but it would be easy to ascertain and fix (in > > either KVM or clang). Dmitry, can you send me the vmx.o and > > kvm-intel.ko files? > > On a quick glance, Clang does not miscompile this part. Clang definitely miscompiles the asm, the indirect call operates on the EFLAGS value, not on @entry as expected. It looks like clang doesn't honor ASM_CALL_CONSTRAINT, which effectively tells the compiler that %rsp is getting clobbered, e.g. the "mov %r14,0x8(%rsp)" is loading @entry for "callq *0x8(%rsp)", which breaks because of asm's pushes. 
clang: kvm_before_interrupt(vcpu); asm volatile( ffffffff811b798e: 4c 89 74 24 08 mov %r14,0x8(%rsp) ffffffff811b7993: 48 89 e0 mov %rsp,%rax ffffffff811b7996: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp ffffffff811b799a: 6a 18 pushq $0x18 ffffffff811b799c: 50 push %rax ffffffff811b799d: 9c pushfq ffffffff811b799e: 6a 10 pushq $0x10 ffffffff811b79a0: ff 54 24 08 callq *0x8(%rsp) <--------- calls the EFLAGS value kvm_after_interrupt(): gcc: kvm_before_interrupt(vcpu); asm volatile( ffffffff8118e17c: 48 89 e0 mov %rsp,%rax ffffffff8118e17f: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp ffffffff8118e183: 6a 18 pushq $0x18 ffffffff8118e185: 50 push %rax ffffffff8118e186: 9c pushfq ffffffff8118e187: 6a 10 pushq $0x10 ffffffff8118e189: ff d3 callq *%rbx <-------- calls @entry kvm_after_interrupt(): ^ permalink raw reply [flat|nested] 20+ messages in thread
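Editorial sketch: the failure described above can be reproduced with a toy model of the stack. The C below is not kernel code; every demo_ name is hypothetical, and the memory model is an assumption made only to replay the two instruction sequences from the listings. It stores @entry at 0x8(%rsp), performs the same alignment and four pushes as the inline asm, and then reads 0x8(%rsp) again, the way "callq *0x8(%rsp)" would.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_STACK_BYTES 256

/* Toy CPU state: a small byte-addressed stack and a stack pointer. */
struct demo_cpu {
    uint8_t  mem[DEMO_STACK_BYTES];
    uint64_t rsp;  /* byte offset into mem, grows downward */
};

static void demo_store64(struct demo_cpu *c, uint64_t addr, uint64_t val)
{
    memcpy(&c->mem[addr], &val, sizeof(val));
}

static uint64_t demo_load64(const struct demo_cpu *c, uint64_t addr)
{
    uint64_t val;
    memcpy(&val, &c->mem[addr], sizeof(val));
    return val;
}

static void demo_push(struct demo_cpu *c, uint64_t val)
{
    c->rsp -= 8;
    demo_store64(c, c->rsp, val);
}

/* Replays the clang sequence; returns the address the call would jump to. */
static uint64_t demo_clang_call_target(uint64_t entry, uint64_t rflags)
{
    struct demo_cpu c;
    uint64_t old_sp;

    memset(&c, 0, sizeof(c));
    c.rsp = 128;
    demo_store64(&c, c.rsp + 8, entry);  /* mov %r14,0x8(%rsp)             */
    old_sp = c.rsp;                      /* mov %rsp,%rax                  */
    c.rsp &= ~(uint64_t)0xf;             /* and $0xfffffffffffffff0,%rsp   */
    demo_push(&c, 0x18);                 /* pushq $0x18 (ss)               */
    demo_push(&c, old_sp);               /* push %rax                      */
    demo_push(&c, rflags);               /* pushfq                         */
    demo_push(&c, 0x10);                 /* pushq $0x10 (cs)               */
    return demo_load64(&c, c.rsp + 8);   /* callq *0x8(%rsp)               */
}

/* Replays the gcc sequence: the target lives in a register, so the pushes
 * cannot disturb it. */
static uint64_t demo_gcc_call_target(uint64_t entry, uint64_t rflags)
{
    (void)rflags;                        /* pushed, but never read back    */
    return entry;                        /* callq *%rbx                    */
}
```

In this model demo_clang_call_target(entry, 0x86) returns 0x86 regardless of entry, because after the four pushes the slot at 0x8(%rsp) holds the flags image written by pushfq. That is consistent with the "RIP: 0010:0x86" in the report, whereas the register-based variant returns entry unchanged.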
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff 2020-03-23 16:39 ` Sean Christopherson @ 2020-03-23 16:43 ` Alexander Potapenko 2020-03-23 16:57 ` Nick Desaulniers 1 sibling, 0 replies; 20+ messages in thread From: Alexander Potapenko @ 2020-03-23 16:43 UTC (permalink / raw) To: Sean Christopherson Cc: Paolo Bonzini, Dmitry Vyukov, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanpengli, the arch/x86 maintainers On Mon, Mar 23, 2020 at 5:39 PM Sean Christopherson <sean.j.christopherson@intel.com> wrote: > > On Mon, Mar 23, 2020 at 05:31:15PM +0100, Alexander Potapenko wrote: > > On Mon, Mar 23, 2020 at 9:18 AM Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > > > On 22/03/20 07:59, Dmitry Vyukov wrote: > > > > > > > > The commit range is presumably > > > > fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7 > > > > But I don't see anything that says "it's me". The only commit that > > > > does non-trivial changes to x86/vmx seems to be "KVM: VMX: check > > > > descriptor table exits on instruction emulation": > > > > > > That seems unlikely, it's a completely different file and it would only > > > affect the outside (non-nested) environment rather than your own kernel. 
> > > > > > The only instance of "0x86" in the registers is in the flags: > > > > > > > RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086 > > > > RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000 > > > > RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453 > > > > RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca > > > > R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000 > > > > R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb > > > > FS: 00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0 > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > > > > That would suggest a miscompilation of the inline assembly, which does > > > push the flags: > > > > > > #ifdef CONFIG_X86_64 > > > "mov %%" _ASM_SP ", %[sp]\n\t" > > > "and $0xfffffffffffffff0, %%" _ASM_SP "\n\t" > > > "push $%c[ss]\n\t" > > > "push %[sp]\n\t" > > > #endif > > > "pushf\n\t" > > > __ASM_SIZE(push) " $%c[cs]\n\t" > > > CALL_NOSPEC > > > > > > > > > It would not explain why it suddenly started to break, unless the clang > > > version also changed, but it would be easy to ascertain and fix (in > > > either KVM or clang). Dmitry, can you send me the vmx.o and > > > kvm-intel.ko files? > > > > On a quick glance, Clang does not miscompile this part. > > Clang definitely miscompiles the asm, the indirect call operates on the > EFLAGS value, not on @entry as expected. It looks like clang doesn't honor > ASM_CALL_CONSTRAINT, which effectively tells the compiler that %rsp is > getting clobbered, e.g. the "mov %r14,0x8(%rsp)" is loading @entry for > "callq *0x8(%rsp)", which breaks because of asm's pushes. Ugh, I completely overlooked this. Right, this is something to work on on the Clang side. 
> clang: > > kvm_before_interrupt(vcpu); > > asm volatile( > ffffffff811b798e: 4c 89 74 24 08 mov %r14,0x8(%rsp) > ffffffff811b7993: 48 89 e0 mov %rsp,%rax > ffffffff811b7996: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp > ffffffff811b799a: 6a 18 pushq $0x18 > ffffffff811b799c: 50 push %rax > ffffffff811b799d: 9c pushfq > ffffffff811b799e: 6a 10 pushq $0x10 > ffffffff811b79a0: ff 54 24 08 callq *0x8(%rsp) <--------- calls the EFLAGS value > kvm_after_interrupt(): > > > gcc: > kvm_before_interrupt(vcpu); > > asm volatile( > ffffffff8118e17c: 48 89 e0 mov %rsp,%rax > ffffffff8118e17f: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp > ffffffff8118e183: 6a 18 pushq $0x18 > ffffffff8118e185: 50 push %rax > ffffffff8118e186: 9c pushfq > ffffffff8118e187: 6a 10 pushq $0x10 > ffffffff8118e189: ff d3 callq *%rbx <-------- calls @entry > kvm_after_interrupt(): -- Alexander Potapenko Software Engineer Google Germany GmbH ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff 2020-03-23 16:39 ` Sean Christopherson 2020-03-23 16:43 ` Alexander Potapenko @ 2020-03-23 16:57 ` Nick Desaulniers 2020-03-23 17:28 ` Nick Desaulniers 1 sibling, 1 reply; 20+ messages in thread From: Nick Desaulniers @ 2020-03-23 16:57 UTC (permalink / raw) To: Sean Christopherson Cc: Alexander Potapenko, Paolo Bonzini, Dmitry Vyukov, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers On Mon, Mar 23, 2020 at 9:39 AM Sean Christopherson <sean.j.christopherson@intel.com> wrote: > > On Mon, Mar 23, 2020 at 05:31:15PM +0100, Alexander Potapenko wrote: > > On Mon, Mar 23, 2020 at 9:18 AM Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > > > On 22/03/20 07:59, Dmitry Vyukov wrote: > > > > > > > > The commit range is presumably > > > > fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7 > > > > But I don't see anything that says "it's me". The only commit that > > > > does non-trivial changes to x86/vmx seems to be "KVM: VMX: check > > > > descriptor table exits on instruction emulation": > > > > > > That seems unlikely, it's a completely different file and it would only > > > affect the outside (non-nested) environment rather than your own kernel. 
> > > > > > The only instance of "0x86" in the registers is in the flags: > > > > > > > RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086 > > > > RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000 > > > > RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453 > > > > RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca > > > > R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000 > > > > R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb > > > > FS: 00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > > CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0 > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > > > > > That would suggest a miscompilation of the inline assembly, which does > > > push the flags: > > > > > > #ifdef CONFIG_X86_64 > > > "mov %%" _ASM_SP ", %[sp]\n\t" > > > "and $0xfffffffffffffff0, %%" _ASM_SP "\n\t" > > > "push $%c[ss]\n\t" > > > "push %[sp]\n\t" > > > #endif > > > "pushf\n\t" > > > __ASM_SIZE(push) " $%c[cs]\n\t" > > > CALL_NOSPEC > > > > > > > > > It would not explain why it suddenly started to break, unless the clang > > > version also changed, but it would be easy to ascertain and fix (in > > > either KVM or clang). Dmitry, can you send me the vmx.o and > > > kvm-intel.ko files? > > > > On a quick glance, Clang does not miscompile this part. > > Clang definitely miscompiles the asm, the indirect call operates on the > EFLAGS value, not on @entry as expected. It looks like clang doesn't honor > ASM_CALL_CONSTRAINT, which effectively tells the compiler that %rsp is > getting clobbered, e.g. the "mov %r14,0x8(%rsp)" is loading @entry for > "callq *0x8(%rsp)", which breaks because of asm's pushes. 
> > clang: > > kvm_before_interrupt(vcpu); > > asm volatile( > ffffffff811b798e: 4c 89 74 24 08 mov %r14,0x8(%rsp) > ffffffff811b7993: 48 89 e0 mov %rsp,%rax > ffffffff811b7996: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp > ffffffff811b799a: 6a 18 pushq $0x18 > ffffffff811b799c: 50 push %rax > ffffffff811b799d: 9c pushfq > ffffffff811b799e: 6a 10 pushq $0x10 > ffffffff811b79a0: ff 54 24 08 callq *0x8(%rsp) <--------- calls the EFLAGS value > kvm_after_interrupt(): > > > gcc: > kvm_before_interrupt(vcpu); > > asm volatile( > ffffffff8118e17c: 48 89 e0 mov %rsp,%rax > ffffffff8118e17f: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp > ffffffff8118e183: 6a 18 pushq $0x18 > ffffffff8118e185: 50 push %rax > ffffffff8118e186: 9c pushfq > ffffffff8118e187: 6a 10 pushq $0x10 > ffffffff8118e189: ff d3 callq *%rbx <-------- calls @entry > kvm_after_interrupt(): Thanks for this analysis, it looks like this is dependent on some particular configuration; here's clang+defconfig+CONFIG_KVM_INTEL=y: 0x000000000000528f <+127>: pushq $0x18 0x0000000000005291 <+129>: push %rcx 0x0000000000005292 <+130>: pushfq 0x0000000000005293 <+131>: pushq $0x10 0x0000000000005295 <+133>: callq *%rax -- Thanks, ~Nick Desaulniers ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
  2020-03-23 16:57 ` Nick Desaulniers
@ 2020-03-23 17:28 ` Nick Desaulniers
  2020-03-23 17:55 ` Alexander Potapenko
  0 siblings, 1 reply; 20+ messages in thread
From: Nick Desaulniers @ 2020-03-23 17:28 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Alexander Potapenko, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson

On Mon, Mar 23, 2020 at 9:57 AM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> On Mon, Mar 23, 2020 at 9:39 AM Sean Christopherson
> <sean.j.christopherson@intel.com> wrote:
> >
> > On Mon, Mar 23, 2020 at 05:31:15PM +0100, Alexander Potapenko wrote:
> > > On Mon, Mar 23, 2020 at 9:18 AM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > >
> > > > On 22/03/20 07:59, Dmitry Vyukov wrote:
> > > > >
> > > > > The commit range is presumably
> > > > > fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7
> > > > > But I don't see anything that says "it's me". The only commit that
> > > > > does non-trivial changes to x86/vmx seems to be "KVM: VMX: check
> > > > > descriptor table exits on instruction emulation":
> > > >
> > > > That seems unlikely, it's a completely different file and it would only
> > > > affect the outside (non-nested) environment rather than your own kernel.
> > > >
> > > > The only instance of "0x86" in the registers is in the flags:
> > > >
> > > > > RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086
> > > > > RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000
> > > > > RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453
> > > > > RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
> > > > > R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000
> > > > > R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
> > > > > FS: 00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
> > > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > > CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0
> > > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > >
> > > > That would suggest a miscompilation of the inline assembly, which does
> > > > push the flags:
> > > >
> > > > #ifdef CONFIG_X86_64
> > > > "mov %%" _ASM_SP ", %[sp]\n\t"
> > > > "and $0xfffffffffffffff0, %%" _ASM_SP "\n\t"
> > > > "push $%c[ss]\n\t"
> > > > "push %[sp]\n\t"
> > > > #endif
> > > > "pushf\n\t"
> > > > __ASM_SIZE(push) " $%c[cs]\n\t"
> > > > CALL_NOSPEC
> > > >
> > > >
> > > > It would not explain why it suddenly started to break, unless the clang
> > > > version also changed, but it would be easy to ascertain and fix (in
> > > > either KVM or clang). Dmitry, can you send me the vmx.o and
> > > > kvm-intel.ko files?
> > >
> > > On a quick glance, Clang does not miscompile this part.
> >
> > Clang definitely miscompiles the asm, the indirect call operates on the
> > EFLAGS value, not on @entry as expected. It looks like clang doesn't honor
> > ASM_CALL_CONSTRAINT, which effectively tells the compiler that %rsp is

I noticed that in the syzkaller config I have, CONFIG_RETPOLINE is not
set. I'm more reliably able to reproduce this with
clang+defconfig+CONFIG_KVM=y+CONFIG_KVM_INTEL=y+CONFIG_RETPOLINE=n,
i.e. by manually disabling retpoline.

> > getting clobbered, e.g. the "mov %r14,0x8(%rsp)" is loading @entry for
> > "callq *0x8(%rsp)", which breaks because of asm's pushes.
> >
> > clang:
> >
> > kvm_before_interrupt(vcpu);
> >
> > asm volatile(
> > ffffffff811b798e: 4c 89 74 24 08    mov %r14,0x8(%rsp)
> > ffffffff811b7993: 48 89 e0          mov %rsp,%rax
> > ffffffff811b7996: 48 83 e4 f0       and $0xfffffffffffffff0,%rsp
> > ffffffff811b799a: 6a 18             pushq $0x18
> > ffffffff811b799c: 50                push %rax
> > ffffffff811b799d: 9c                pushfq
> > ffffffff811b799e: 6a 10             pushq $0x10
> > ffffffff811b79a0: ff 54 24 08       callq *0x8(%rsp) <--------- calls the EFLAGS value
> > kvm_after_interrupt():
> >
> >
> > gcc:
> > kvm_before_interrupt(vcpu);
> >
> > asm volatile(
> > ffffffff8118e17c: 48 89 e0          mov %rsp,%rax
> > ffffffff8118e17f: 48 83 e4 f0       and $0xfffffffffffffff0,%rsp
> > ffffffff8118e183: 6a 18             pushq $0x18
> > ffffffff8118e185: 50                push %rax
> > ffffffff8118e186: 9c                pushfq
> > ffffffff8118e187: 6a 10             pushq $0x10
> > ffffffff8118e189: ff d3             callq *%rbx <-------- calls @entry
> > kvm_after_interrupt():
>
> Thanks for this analysis, it looks like this is dependent on some
> particular configuration; here's clang+defconfig+CONFIG_KVM_INTEL=y:
>
> 0x000000000000528f <+127>: pushq $0x18
> 0x0000000000005291 <+129>: push %rcx
> 0x0000000000005292 <+130>: pushfq
> 0x0000000000005293 <+131>: pushq $0x10
> 0x0000000000005295 <+133>: callq *%rax
>
> --
> Thanks,
> ~Nick Desaulniers

--
Thanks,
~Nick Desaulniers
^ permalink raw reply [flat|nested] 20+ messages in thread
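The stack arithmetic behind the clang listing quoted above can be checked with a small user-space model. This is only a sketch under stated assumptions: the stack is a plain array addressed by byte offset, the helper names (`store64`, `load64`, `push64`, `simulate_call_target`) are hypothetical, and the constants 0x18 and 0x10 are simply the immediates seen in the disassembly. It replays the instruction sequence and reports which slot `callq *0x8(%rsp)` ends up reading.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a small downward-growing stack addressed by byte offset. */
#define STACK_BYTES 512
static uint64_t stack_mem[STACK_BYTES / 8];

static void store64(uint64_t addr, uint64_t val)
{
	assert(addr % 8 == 0 && addr < STACK_BYTES);
	stack_mem[addr / 8] = val;
}

static uint64_t load64(uint64_t addr)
{
	assert(addr % 8 == 0 && addr < STACK_BYTES);
	return stack_mem[addr / 8];
}

static void push64(uint64_t *rsp, uint64_t val)
{
	*rsp -= 8;
	store64(*rsp, val);
}

/*
 * Replays the clang-generated sequence from the thread and returns the
 * value that "callq *0x8(%rsp)" would load as its call target.
 */
static uint64_t simulate_call_target(uint64_t rsp, uint64_t entry,
				     uint64_t eflags)
{
	uint64_t saved_rsp;

	store64(rsp + 8, entry);	/* mov   %r14,0x8(%rsp)           */
	saved_rsp = rsp;		/* mov   %rsp,%rax                */
	rsp &= ~(uint64_t)0xf;		/* and   $0xfffffffffffffff0,%rsp */
	push64(&rsp, 0x18);		/* pushq $0x18                    */
	push64(&rsp, saved_rsp);	/* push  %rax                     */
	push64(&rsp, eflags);		/* pushfq                         */
	push64(&rsp, 0x10);		/* pushq $0x10                    */
	return load64(rsp + 8);		/* callq *0x8(%rsp)               */
}
```

In this model the four pushes leave the flags image at 0x8(%rsp) whether the starting offset was already 16-byte aligned or not, so the call target is the flags value rather than `entry`, which is consistent with the "RIP: 0010:0x86" in the report.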
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff 2020-03-23 17:28 ` Nick Desaulniers @ 2020-03-23 17:55 ` Alexander Potapenko 2020-03-23 18:06 ` Nick Desaulniers 2020-03-23 18:06 ` Alexander Potapenko 0 siblings, 2 replies; 20+ messages in thread From: Alexander Potapenko @ 2020-03-23 17:55 UTC (permalink / raw) To: Nick Desaulniers Cc: Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson I've reduced the faulty test case to the following code: ================================= a; long b; register unsigned long current_stack_pointer asm("rsp"); handle_external_interrupt_irqoff() { asm("and $0xfffffffffffffff0, %%rsp\n\tpush $%c[ss]\n\tpush " "%[sp]\n\tpushf\n\tpushq $%c[cs]\n\tcall *%[thunk_target]\n" : [ sp ] "=&r"(b), "+r" (current_stack_pointer) : [ thunk_target ] "rm"(a), [ ss ] "i"(3 * 8), [ cs ] "i"(2 * 8) ); } ================================= (in fact creduce even throws away current_stack_pointer, but we probably want to keep it to prove the point). Clang generates the following code for it: $ clang vmx.i -O2 -c -w -o vmx.o $ objdump -d vmx.o ... 0000000000000000 <handle_external_interrupt_irqoff>: 0: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 6 <handle_external_interrupt_irqoff+0x6> 6: 89 44 24 fc mov %eax,-0x4(%rsp) a: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp e: 6a 18 pushq $0x18 10: 50 push %rax 11: 9c pushfq 12: 6a 10 pushq $0x10 14: ff 54 24 fc callq *-0x4(%rsp) 18: 48 89 05 00 00 00 00 mov %rax,0x0(%rip) # 1f <handle_external_interrupt_irqoff+0x1f> 1f: c3 retq The question is whether using current_stack_pointer as an output is actually a valid way to tell the compiler it should not clobber RSP. Intuitively it is, but explicitly adding RSP to the clobber list sounds a bit more bulletproof. 
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
  2020-03-23 17:55 ` Alexander Potapenko
@ 2020-03-23 18:06 ` Nick Desaulniers
  2020-03-23 18:06 ` Alexander Potapenko
  1 sibling, 0 replies; 20+ messages in thread
From: Nick Desaulniers @ 2020-03-23 18:06 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson

On Mon, Mar 23, 2020 at 10:55 AM Alexander Potapenko <glider@google.com> wrote:
>
> I've reduced the faulty test case to the following code:
>
> =================================
> a;
> long b;
> register unsigned long current_stack_pointer asm("rsp");
> handle_external_interrupt_irqoff() {
> asm("and $0xfffffffffffffff0, %%rsp\n\tpush $%c[ss]\n\tpush "
> "%[sp]\n\tpushf\n\tpushq $%c[cs]\n\tcall *%[thunk_target]\n"
> : [ sp ] "=&r"(b), "+r" (current_stack_pointer)
> : [ thunk_target ] "rm"(a), [ ss ] "i"(3 * 8), [ cs ] "i"(2 * 8) );
> }
> =================================
> (in fact creduce even throws away current_stack_pointer, but we
> probably want to keep it to prove the point).
>
> Clang generates the following code for it:
>
> $ clang vmx.i -O2 -c -w -o vmx.o
> $ objdump -d vmx.o
> ...
> 0000000000000000 <handle_external_interrupt_irqoff>:
>    0: 8b 05 00 00 00 00       mov %0x0(%rip),%eax # 6
> <handle_external_interrupt_irqoff+0x6>
>    6: 89 44 24 fc             mov %eax,-0x4(%rsp)
>    a: 48 83 e4 f0             and $0xfffffffffffffff0,%rsp
>    e: 6a 18                   pushq $0x18
>   10: 50                      push %rax
>   11: 9c                      pushfq
>   12: 6a 10                   pushq $0x10
>   14: ff 54 24 fc             callq *-0x4(%rsp)
>   18: 48 89 05 00 00 00 00    mov %rax,0x0(%rip) # 1f
> <handle_external_interrupt_irqoff+0x1f>
>   1f: c3                      retq
>
> The question is whether using current_stack_pointer as an output is
> actually a valid way to tell the compiler it should not clobber RSP.
> Intuitively it is, but explicitly adding RSP to the clobber list
> sounds a bit more bulletproof.

Ok, I think this reproducer demonstrates the issue:
https://godbolt.org/z/jAafjz

I *think* what's happening is that we're not specifying correctly that
the stack is being modified by inline asm, so using variable
references against the stack pointer is not correct.

commit f5caf621ee357 ("x86/asm: Fix inline asm call constraints for Clang")
has more context about ASM_CALL_CONSTRAINT.

It seems that specifying "rsp" in the clobber list is a -Wdeprecated
warning in GCC, and an error in Clang (unless you remove
current_stack_pointer as an output, though doing so does get Clang to
produce the correct code).

--
Thanks,
~Nick Desaulniers
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff 2020-03-23 17:55 ` Alexander Potapenko 2020-03-23 18:06 ` Nick Desaulniers @ 2020-03-23 18:06 ` Alexander Potapenko 2020-03-23 18:16 ` Nick Desaulniers 1 sibling, 1 reply; 20+ messages in thread From: Alexander Potapenko @ 2020-03-23 18:06 UTC (permalink / raw) To: Nick Desaulniers Cc: Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson On Mon, Mar 23, 2020 at 6:55 PM Alexander Potapenko <glider@google.com> wrote: > > I've reduced the faulty test case to the following code: > > ================================= > a; > long b; > register unsigned long current_stack_pointer asm("rsp"); > handle_external_interrupt_irqoff() { > asm("and $0xfffffffffffffff0, %%rsp\n\tpush $%c[ss]\n\tpush " > "%[sp]\n\tpushf\n\tpushq $%c[cs]\n\tcall *%[thunk_target]\n" > : [ sp ] "=&r"(b), "+r" (current_stack_pointer) > : [ thunk_target ] "rm"(a), [ ss ] "i"(3 * 8), [ cs ] "i"(2 * 8) ); > } > ================================= > (in fact creduce even throws away current_stack_pointer, but we > probably want to keep it to prove the point). > > Clang generates the following code for it: > > $ clang vmx.i -O2 -c -w -o vmx.o > $ objdump -d vmx.o > ... 
> 0000000000000000 <handle_external_interrupt_irqoff>: > 0: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 6 > <handle_external_interrupt_irqoff+0x6> > 6: 89 44 24 fc mov %eax,-0x4(%rsp) > a: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp > e: 6a 18 pushq $0x18 > 10: 50 push %rax > 11: 9c pushfq > 12: 6a 10 pushq $0x10 > 14: ff 54 24 fc callq *-0x4(%rsp) > 18: 48 89 05 00 00 00 00 mov %rax,0x0(%rip) # 1f > <handle_external_interrupt_irqoff+0x1f> > 1f: c3 retq > > The question is whether using current_stack_pointer as an output is > actually a valid way to tell the compiler it should not clobber RSP. > Intuitively it is, but explicitly adding RSP to the clobber list > sounds a bit more bulletproof. Ok, I am wrong: according to https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html it's incorrect to list RSP in the clobber list. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff 2020-03-23 18:06 ` Alexander Potapenko @ 2020-03-23 18:16 ` Nick Desaulniers 2020-03-23 18:49 ` Nick Desaulniers 0 siblings, 1 reply; 20+ messages in thread From: Nick Desaulniers @ 2020-03-23 18:16 UTC (permalink / raw) To: Alexander Potapenko Cc: Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson On Mon, Mar 23, 2020 at 11:06 AM Alexander Potapenko <glider@google.com> wrote: > > On Mon, Mar 23, 2020 at 6:55 PM Alexander Potapenko <glider@google.com> wrote: > > > > I've reduced the faulty test case to the following code: > > > > ================================= > > a; > > long b; > > register unsigned long current_stack_pointer asm("rsp"); > > handle_external_interrupt_irqoff() { > > asm("and $0xfffffffffffffff0, %%rsp\n\tpush $%c[ss]\n\tpush " > > "%[sp]\n\tpushf\n\tpushq $%c[cs]\n\tcall *%[thunk_target]\n" > > : [ sp ] "=&r"(b), "+r" (current_stack_pointer) > > : [ thunk_target ] "rm"(a), [ ss ] "i"(3 * 8), [ cs ] "i"(2 * 8) ); > > } > > ================================= > > (in fact creduce even throws away current_stack_pointer, but we > > probably want to keep it to prove the point). > > > > Clang generates the following code for it: > > > > $ clang vmx.i -O2 -c -w -o vmx.o > > $ objdump -d vmx.o > > ... 
> > 0000000000000000 <handle_external_interrupt_irqoff>: > > 0: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 6 > > <handle_external_interrupt_irqoff+0x6> > > 6: 89 44 24 fc mov %eax,-0x4(%rsp) > > a: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp > > e: 6a 18 pushq $0x18 > > 10: 50 push %rax > > 11: 9c pushfq > > 12: 6a 10 pushq $0x10 > > 14: ff 54 24 fc callq *-0x4(%rsp) > > 18: 48 89 05 00 00 00 00 mov %rax,0x0(%rip) # 1f > > <handle_external_interrupt_irqoff+0x1f> > > 1f: c3 retq > > > > The question is whether using current_stack_pointer as an output is > > actually a valid way to tell the compiler it should not clobber RSP. > > Intuitively it is, but explicitly adding RSP to the clobber list > > sounds a bit more bulletproof. > > Ok, I am wrong: according to > https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html it's incorrect to > list RSP in the clobber list. You could force `entry` into a register: diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 4d22b1b5e822..083a7e980bb5 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6277,7 +6277,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu) #endif ASM_CALL_CONSTRAINT : - THUNK_TARGET(entry), + [thunk_target] "a"(entry), [ss]"i"(__KERNEL_DS), [cs]"i"(__KERNEL_CS) ); (https://stackoverflow.com/a/48877683/1027966 had some interesting feedback to this problem) -- Thanks, ~Nick Desaulniers ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff 2020-03-23 18:16 ` Nick Desaulniers @ 2020-03-23 18:49 ` Nick Desaulniers 2020-03-23 19:12 ` [PATCH] KVM: VMX: don't allow memory operands for inline asm that modifies SP Nick Desaulniers 2020-03-23 19:30 ` BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff Nick Desaulniers 0 siblings, 2 replies; 20+ messages in thread From: Nick Desaulniers @ 2020-03-23 18:49 UTC (permalink / raw) To: Alexander Potapenko Cc: Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson On Mon, Mar 23, 2020 at 11:16 AM Nick Desaulniers <ndesaulniers@google.com> wrote: > > On Mon, Mar 23, 2020 at 11:06 AM Alexander Potapenko <glider@google.com> wrote: > > > > On Mon, Mar 23, 2020 at 6:55 PM Alexander Potapenko <glider@google.com> wrote: > > > > > > I've reduced the faulty test case to the following code: > > > > > > ================================= > > > a; > > > long b; > > > register unsigned long current_stack_pointer asm("rsp"); > > > handle_external_interrupt_irqoff() { > > > asm("and $0xfffffffffffffff0, %%rsp\n\tpush $%c[ss]\n\tpush " > > > "%[sp]\n\tpushf\n\tpushq $%c[cs]\n\tcall *%[thunk_target]\n" > > > : [ sp ] "=&r"(b), "+r" (current_stack_pointer) > > > : [ thunk_target ] "rm"(a), [ ss ] "i"(3 * 8), [ cs ] "i"(2 * 8) ); > > > } > > > ================================= > > > (in fact creduce even throws away current_stack_pointer, but we > > > probably want to keep it to prove the point). > > > > > > Clang generates the following code for it: > > > > > > $ clang vmx.i -O2 -c -w -o vmx.o > > > $ objdump -d vmx.o > > > ... 
> > > 0000000000000000 <handle_external_interrupt_irqoff>: > > > 0: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 6 > > > <handle_external_interrupt_irqoff+0x6> > > > 6: 89 44 24 fc mov %eax,-0x4(%rsp) > > > a: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp > > > e: 6a 18 pushq $0x18 > > > 10: 50 push %rax > > > 11: 9c pushfq > > > 12: 6a 10 pushq $0x10 > > > 14: ff 54 24 fc callq *-0x4(%rsp) > > > 18: 48 89 05 00 00 00 00 mov %rax,0x0(%rip) # 1f > > > <handle_external_interrupt_irqoff+0x1f> > > > 1f: c3 retq > > > > > > The question is whether using current_stack_pointer as an output is > > > actually a valid way to tell the compiler it should not clobber RSP. > > > Intuitively it is, but explicitly adding RSP to the clobber list > > > sounds a bit more bulletproof. > > > > Ok, I am wrong: according to > > https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html it's incorrect to > > list RSP in the clobber list. > > You could force `entry` into a register: > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c > index 4d22b1b5e822..083a7e980bb5 100644 > --- a/arch/x86/kvm/vmx/vmx.c > +++ b/arch/x86/kvm/vmx/vmx.c > @@ -6277,7 +6277,7 @@ static void > handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu) > #endif > ASM_CALL_CONSTRAINT > : > - THUNK_TARGET(entry), > + [thunk_target] "a"(entry), > [ss]"i"(__KERNEL_DS), > [cs]"i"(__KERNEL_CS) > ); > > (https://stackoverflow.com/a/48877683/1027966 had some interesting > feedback to this problem) Sean said: > It looks like clang doesn't honor > ASM_CALL_CONSTRAINT, which effectively tells the compiler that %rsp is > getting clobbered, e.g. the "mov %r14,0x8(%rsp)" is loading @entry for > "callq *0x8(%rsp)", which breaks because of asm's pushes. I'm not sure about this, I think ASM_CALL_CONSTRAINT may be a red herring, based on the commit message that added it (commit f5caf621ee357 ("x86/asm: Fix inline asm call constraints for Clang")). Further, it seems the "m" in "rm" in THUNK_TARGET for CONFIG_RETPOLINE=n is problematic. 
THUNK_TARGET defines [thunk_target] as "rm" when CONFIG_RETPOLINE is not set, which isn't constrained enough for this specific case; if `entry` winds up in a stack slot near where rsp points, then an `%rsp`-relative memory operand is good enough to satisfy the constraints for using `entry` as an input. For inline assembly that modifies the stack pointer before using this input, the underspecification of constraints is dangerous, and results in an indirect call through the previously pushed value of the flags register. So maybe we can find why commit 76b043848fd2 ("x86/retpoline: Add initial retpoline support") added THUNK_TARGET with and without the "m" constraint, and either: - remove "m" from THUNK_TARGET. (Maybe this doesn't compile somewhere) or - use my above recommendation locally, avoiding THUNK_TARGET. We can use "r" rather than "a" (what Clang would have picked) or "b" (what GCC would have picked) to give the compilers maximal flexibility. -- Thanks, ~Nick Desaulniers ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH] KVM: VMX: don't allow memory operands for inline asm that modifies SP
  2020-03-23 18:49 ` Nick Desaulniers
@ 2020-03-23 19:12 ` Nick Desaulniers
  2020-03-23 19:30 ` BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff Nick Desaulniers
  1 sibling, 0 replies; 20+ messages in thread
From: Nick Desaulniers @ 2020-03-23 19:12 UTC (permalink / raw)
  To: pbonzini, sean.j.christopherson
  Cc: ndesaulniers, bp, clang-built-linux, dvyukov, glider, hpa, jmattson, joro, kvm, linux-kernel, mingo, syzbot+3f29ca2efb056a761e38, syzkaller-bugs, tglx, vkuznets, wanpengli, x86

THUNK_TARGET defines [thunk_target] as having "rm" input constraints
when CONFIG_RETPOLINE is not set, which isn't constrained enough for
this specific case.

For inline assembly that modifies the stack pointer before using this
input, the underspecification of constraints is dangerous, and results
in an indirect call to a previously pushed flags register.

In this case `entry`'s stack slot is good enough to satisfy the "m"
constraint in "rm", but the inline assembly in
handle_external_interrupt_irqoff() modifies the stack pointer via
push+pushf before using this input, which in this case results in
calling what was the previous state of the flags register, rather than
`entry`.

Be more specific in the constraints by requiring `entry` be in a
register, and not a memory operand.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Reported-by: syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com
Debugged-by: Alexander Potapenko <glider@google.com>
Debugged-by: Paolo Bonzini <pbonzini@redhat.com>
Debugged-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4d22b1b5e822..310e8c1169b8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6277,7 +6277,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
 #endif
 		ASM_CALL_CONSTRAINT
 		:
-		THUNK_TARGET(entry),
+		[thunk_target]"r"(entry),
 		[ss]"i"(__KERNEL_DS),
 		[cs]"i"(__KERNEL_CS)
 	);
-- 
2.25.1.696.g5e7596f4ac-goog
^ permalink raw reply related [flat|nested] 20+ messages in thread
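The difference between the "rm" and "r" constraints described in the patch can be illustrated with a toy model. This is a sketch, not kernel code: the stack is simulated with an array, and `enum operand_kind`, `call_target_after_pushes()`, the starting offset 256 and the displacement 8 are all hypothetical choices made for illustration. The model places `entry` either in a register or in an %rsp-relative slot, performs the same align-and-push sequence as the inline asm, and then resolves the operand the way the final indirect call would.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of where the compiler placed the [thunk_target] input. */
enum operand_kind {
	OPERAND_REG,		/* "r": value is held in a register   */
	OPERAND_MEM_RSP_REL	/* "m": value is held at disp(%rsp)   */
};

#define STACK_BYTES 512
static uint64_t stack_mem[STACK_BYTES / 8];	/* simulated stack memory */

static void store64(uint64_t addr, uint64_t val)
{
	assert(addr % 8 == 0 && addr < STACK_BYTES);
	stack_mem[addr / 8] = val;
}

static uint64_t load64(uint64_t addr)
{
	assert(addr % 8 == 0 && addr < STACK_BYTES);
	return stack_mem[addr / 8];
}

static void push64(uint64_t *rsp, uint64_t val)
{
	*rsp -= 8;
	store64(*rsp, val);
}

/* Returns the value the indirect call would use as its target. */
static uint64_t call_target_after_pushes(enum operand_kind kind,
					 uint64_t entry, uint64_t eflags)
{
	const uint64_t disp = 8;	/* assumed displacement, as in 0x8(%rsp) */
	uint64_t rsp = 256;		/* arbitrary starting offset in the model */
	uint64_t reg = 0;
	uint64_t saved_rsp;

	if (kind == OPERAND_MEM_RSP_REL)
		store64(rsp + disp, entry);	/* operand addressed via %rsp */
	else
		reg = entry;			/* operand kept in a register */

	/* The asm body: save %rsp, align it, then build the frame. */
	saved_rsp = rsp;
	rsp &= ~(uint64_t)0xf;
	push64(&rsp, 0x18);		/* ss */
	push64(&rsp, saved_rsp);	/* old %rsp */
	push64(&rsp, eflags);		/* pushf */
	push64(&rsp, 0x10);		/* cs */

	/* The operand is only resolved now, after %rsp has moved. */
	if (kind == OPERAND_MEM_RSP_REL)
		return load64(rsp + disp);
	return reg;
}
```

In this model the register operand still yields `entry` after the pushes, while the %rsp-relative operand resolves to the pushed flags value, which is why requiring a register is sufficient for this particular asm block.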
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
  2020-03-23 18:49 ` Nick Desaulniers
  2020-03-23 19:12 ` [PATCH] KVM: VMX: don't allow memory operands for inline asm that modifies SP Nick Desaulniers
@ 2020-03-23 19:30 ` Nick Desaulniers
  2020-03-23 19:39 ` Paolo Bonzini
  1 sibling, 1 reply; 20+ messages in thread
From: Nick Desaulniers @ 2020-03-23 19:30 UTC (permalink / raw)
  To: Alexander Potapenko
  Cc: Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson

On Mon, Mar 23, 2020 at 11:49 AM Nick Desaulniers <ndesaulniers@google.com> wrote:
>
> So maybe we can find why
> commit 76b043848fd2 ("x86/retpoline: Add initial retpoline support")
> added THUNK_TARGET with and without "m" constraint, and either:
> - remove "m" from THUNK_TARGET. (Maybe this doesn't compile somewhere)
> or
> - use my above recommendation locally avoiding THUNK_TARGET. We can
> use "r" rather than "a" (what Clang would have picked) or "b" (what GCC
> would have picked) to give the compilers maximal flexibility.

So I've sent a patch for the latter; my reason for not pursuing the former is:
1. I assume that the thunk target could be spilled, or a pointer, and
we'd like to keep flexibility for the general case of inline asm that
doesn't modify the stack pointer.
2. `entry` is local to `handle_external_interrupt_irqoff`; it's not
being passed in via pointer as a function parameter.
3. register pressure is irrelevant if the resulting code is incorrect.

--
Thanks,
~Nick Desaulniers
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
  2020-03-23 19:30 ` BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff Nick Desaulniers
@ 2020-03-23 19:39 ` Paolo Bonzini
  0 siblings, 0 replies; 20+ messages in thread
From: Paolo Bonzini @ 2020-03-23 19:39 UTC (permalink / raw)
  To: Nick Desaulniers, Alexander Potapenko
  Cc: Dmitry Vyukov, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson

On 23/03/20 20:30, Nick Desaulniers wrote:
> <ndesaulniers@google.com> wrote:
>> So maybe we can find why
>> commit 76b043848fd2 ("x86/retpoline: Add initial retpoline support")
>> added THUNK_TARGET with and without "m" constraint, and either:
>> - remove "m" from THUNK_TARGET. (Maybe this doesn't compile somewhere)
>> or
>> - use my above recommendation locally avoiding THUNK_TARGET. We can
>> use "r" rather than "a" (what Clang would have picked) or "b" (what GCC
>> would have picked) to give the compilers maximal flexibility.
> So I've sent a patch for the latter; my reason for not pursuing the former is:
> 1. I assume that the thunk target could be spilled, or a pointer, and
> we'd like to keep flexibility for the general case of inline asm that
> doesn't modify the stack pointer.
> 2. `entry` is local to `handle_external_interrupt_irqoff`; it's not
> being passed in via pointer as a function parameter.
> 3. register pressure is irrelevant if the resulting code is incorrect.

Yes, this is fair enough. I've queued your patch and will send it
shortly to Linus.

Paolo
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff 2020-03-22 6:43 BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff syzbot 2020-03-22 6:59 ` Dmitry Vyukov @ 2020-03-22 8:53 ` syzbot 2020-03-22 13:29 ` syzbot 2 siblings, 0 replies; 20+ messages in thread From: syzbot @ 2020-03-22 8:53 UTC (permalink / raw) To: bp, clang-built-linux, dvyukov, hpa, jmattson, joro, kvm, linux-kernel, mingo, pbonzini, sean.j.christopherson, syzkaller-bugs, tglx, vkuznets, wanpengli, x86 syzbot has found a reproducer for the following crash on: HEAD commit: b74b991f Merge tag 'block-5.6-20200320' of git://git.kerne.. git tree: upstream console output: https://syzkaller.appspot.com/x/log.txt?x=13059373e00000 kernel config: https://syzkaller.appspot.com/x/.config?x=6dfa02302d6db985 dashboard link: https://syzkaller.appspot.com/bug?extid=3f29ca2efb056a761e38 compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81) syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1199c0c5e00000 C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15097373e00000 IMPORTANT: if you fix the bug, please add the following tag to the commit: Reported-by: syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com BUG: kernel NULL pointer dereference, address: 0000000000000086 #PF: supervisor instruction fetch in kernel mode #PF: error_code(0x0010) - not-present page PGD 9330b067 P4D 9330b067 PUD 9e66f067 PMD 0 Oops: 0010 [#1] PREEMPT SMP KASAN CPU: 1 PID: 8439 Comm: syz-executor724 Not tainted 5.6.0-rc6-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:0x86 Code: Bad RIP value. 
RSP: 0018:ffffc900022e7998 EFLAGS: 00010086 RAX: ffffc900022e79c8 RBX: fffffe0000000000 RCX: ffff88809dcf2500 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000 RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca R10: ffff88809dcf2500 R11: 0000000000000002 R12: dffffc0000000000 R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb FS: 0000000001d0d880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000005c CR3: 00000000978c3000 CR4: 00000000001426e0 Call Trace: handle_external_interrupt_irqoff+0x154/0x280 arch/x86/kvm/vmx/vmx.c:6274 kvm_before_interrupt arch/x86/kvm/x86.h:343 [inline] handle_external_interrupt_irqoff+0x132/0x280 arch/x86/kvm/vmx/vmx.c:6272 __irqentry_text_start+0x8/0x8 vcpu_enter_guest+0x6c77/0x9290 arch/x86/kvm/x86.c:8405 save_stack mm/kasan/common.c:72 [inline] set_track mm/kasan/common.c:80 [inline] kasan_set_free_info mm/kasan/common.c:337 [inline] __kasan_slab_free+0x12e/0x1e0 mm/kasan/common.c:476 __cache_free mm/slab.c:3426 [inline] kfree+0x10a/0x220 mm/slab.c:3757 tomoyo_path_number_perm+0x525/0x690 security/tomoyo/file.c:736 security_file_ioctl+0x55/0xb0 security/security.c:1441 entry_SYSCALL_64_after_hwframe+0x49/0xbe __lock_acquire+0xc5a/0x1bc0 kernel/locking/lockdep.c:3954 paravirt_write_msr arch/x86/include/asm/paravirt.h:167 [inline] wrmsrl arch/x86/include/asm/paravirt.h:200 [inline] native_x2apic_icr_write arch/x86/include/asm/apic.h:249 [inline] __x2apic_send_IPI_dest arch/x86/kernel/apic/x2apic_phys.c:112 [inline] x2apic_send_IPI+0x96/0xc0 arch/x86/kernel/apic/x2apic_phys.c:41 test_bit include/asm-generic/bitops/instrumented-non-atomic.h:110 [inline] hlock_class kernel/locking/lockdep.c:163 [inline] mark_lock+0x107/0x1650 kernel/locking/lockdep.c:3642 lock_acquire+0x154/0x250 kernel/locking/lockdep.c:4484 rcu_lock_acquire+0x9/0x30 include/linux/rcupdate.h:208 vcpu_run+0x3a3/0xd50 arch/x86/kvm/x86.c:8513 
kvm_arch_vcpu_ioctl_run+0x419/0x880 arch/x86/kvm/x86.c:8735 kvm_vcpu_ioctl+0x67c/0xa80 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2932 lock_is_held include/linux/lockdep.h:361 [inline] rcu_read_lock_sched_held+0x106/0x170 kernel/rcu/update.c:121 kvm_vm_release+0x50/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:858 vfs_ioctl fs/ioctl.c:47 [inline] ksys_ioctl fs/ioctl.c:763 [inline] __do_sys_ioctl fs/ioctl.c:772 [inline] __se_sys_ioctl+0xf9/0x160 fs/ioctl.c:770 do_syscall_64+0xf3/0x1b0 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe Modules linked in: CR2: 0000000000000086 ---[ end trace 480d6b60d5a9226d ]--- RIP: 0010:0x86 Code: Bad RIP value. RSP: 0018:ffffc900022e7998 EFLAGS: 00010086 RAX: ffffc900022e79c8 RBX: fffffe0000000000 RCX: ffff88809dcf2500 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000 RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca R10: ffff88809dcf2500 R11: 0000000000000002 R12: dffffc0000000000 R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb FS: 0000000001d0d880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000005c CR3: 00000000978c3000 CR4: 00000000001426e0 ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
  2020-03-22 6:43 BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff syzbot
  2020-03-22 6:59 ` Dmitry Vyukov
  2020-03-22 8:53 ` syzbot
@ 2020-03-22 13:29 ` syzbot
  2020-03-22 13:43 ` Dmitry Vyukov
  2 siblings, 1 reply; 20+ messages in thread
From: syzbot @ 2020-03-22 13:29 UTC (permalink / raw)
  To: bp, clang-built-linux, davem, dhowells, dvyukov, hpa, jmattson, joro, kuba, kvm, linux-afs, linux-kernel, mingo, netdev, pbonzini, sean.j.christopherson, syzkaller-bugs, tglx, vkuznets, wanpengli, x86

syzbot has bisected this bug to:

commit f71dbf2fb28489a79bde0dca1c8adfb9cdb20a6b
Author: David Howells <dhowells@redhat.com>
Date: Thu Jan 30 21:50:36 2020 +0000

    rxrpc: Fix insufficient receive notification generation

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1483bb19e00000
start commit: b74b991f Merge tag 'block-5.6-20200320' of git://git.kerne..
git tree: upstream
final crash: https://syzkaller.appspot.com/x/report.txt?x=1683bb19e00000
console output: https://syzkaller.appspot.com/x/log.txt?x=1283bb19e00000
kernel config: https://syzkaller.appspot.com/x/.config?x=6dfa02302d6db985
dashboard link: https://syzkaller.appspot.com/bug?extid=3f29ca2efb056a761e38
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1199c0c5e00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15097373e00000

Reported-by: syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com
Fixes: f71dbf2fb284 ("rxrpc: Fix insufficient receive notification generation")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff
  2020-03-22 13:29 ` syzbot
@ 2020-03-22 13:43 ` Dmitry Vyukov
  0 siblings, 0 replies; 20+ messages in thread
From: Dmitry Vyukov @ 2020-03-22 13:43 UTC (permalink / raw)
  To: syzbot
  Cc: Borislav Petkov, clang-built-linux, David Miller, David Howells, H. Peter Anvin, Jim Mattson, Joerg Roedel, kuba, KVM list, linux-afs, LKML, Ingo Molnar, netdev, Paolo Bonzini, Christopherson, Sean J, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanpengli, the arch/x86 maintainers

On Sun, Mar 22, 2020 at 2:29 PM syzbot <syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com> wrote:
>
> syzbot has bisected this bug to:
>
> commit f71dbf2fb28489a79bde0dca1c8adfb9cdb20a6b
> Author: David Howells <dhowells@redhat.com>
> Date: Thu Jan 30 21:50:36 2020 +0000
>
>     rxrpc: Fix insufficient receive notification generation

This is unrelated.
Somehow the crash wasn't reproduced again on the same commit. Can it depend on host CPU type maybe?

> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1483bb19e00000
> start commit: b74b991f Merge tag 'block-5.6-20200320' of git://git.kerne..
> git tree: upstream
> final crash: https://syzkaller.appspot.com/x/report.txt?x=1683bb19e00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=1283bb19e00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=6dfa02302d6db985
> dashboard link: https://syzkaller.appspot.com/bug?extid=3f29ca2efb056a761e38
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1199c0c5e00000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15097373e00000
>
> Reported-by: syzbot+3f29ca2efb056a761e38@syzkaller.appspotmail.com
> Fixes: f71dbf2fb284 ("rxrpc: Fix insufficient receive notification generation")
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2020-03-23 19:40 UTC | newest]

Thread overview: 20+ messages
2020-03-22 6:43 BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff syzbot
2020-03-22 6:59 ` Dmitry Vyukov
2020-03-22 7:03 ` Dmitry Vyukov
2020-03-23 8:18 ` Paolo Bonzini
2020-03-23 16:31 ` Alexander Potapenko
2020-03-23 16:39 ` Sean Christopherson
2020-03-23 16:43 ` Alexander Potapenko
2020-03-23 16:57 ` Nick Desaulniers
2020-03-23 17:28 ` Nick Desaulniers
2020-03-23 17:55 ` Alexander Potapenko
2020-03-23 18:06 ` Nick Desaulniers
2020-03-23 18:06 ` Alexander Potapenko
2020-03-23 18:16 ` Nick Desaulniers
2020-03-23 18:49 ` Nick Desaulniers
2020-03-23 19:12 ` [PATCH] KVM: VMX: don't allow memory operands for inline asm that modifies SP Nick Desaulniers
2020-03-23 19:30 ` BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff Nick Desaulniers
2020-03-23 19:39 ` Paolo Bonzini
2020-03-22 8:53 ` syzbot
2020-03-22 13:29 ` syzbot
2020-03-22 13:43 ` Dmitry Vyukov