* KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page
@ 2018-11-07  1:38 syzbot
  2018-11-07 12:10 ` Alexander Potapenko
  0 siblings, 1 reply; 16+ messages in thread
From: syzbot @ 2018-11-07  1:38 UTC (permalink / raw)
  To: kvm, linux-kernel, pbonzini, rkrcmar, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    88b95ef4c780 kmsan: use MSan assembly instrumentation
git tree:       https://github.com/google/kmsan.git/master
console output: https://syzkaller.appspot.com/x/log.txt?x=12505e33400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=8df5fc509a1b351b
dashboard link: https://syzkaller.appspot.com/bug?extid=ded1696f6b50b615b630
compiler:       clang version 8.0.0 (trunk 343298)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15ce62f5400000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=174efca3400000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com

==================================================================
BUG: KMSAN: kernel-infoleak in __copy_to_user include/linux/uaccess.h:121 [inline]
BUG: KMSAN: kernel-infoleak in __kvm_write_guest_page arch/x86/kvm/../../../virt/kvm/kvm_main.c:1849 [inline]
BUG: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page+0x39a/0x510 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1870
CPU: 0 PID: 7918 Comm: syz-executor542 Not tainted 4.19.0+ #77
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x32d/0x480 lib/dump_stack.c:113
 kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:911
 kmsan_internal_check_memory+0x34c/0x430 mm/kmsan/kmsan.c:991
 kmsan_copy_to_user+0x85/0xe0 mm/kmsan/kmsan_hooks.c:552
 __copy_to_user include/linux/uaccess.h:121 [inline]
 __kvm_write_guest_page arch/x86/kvm/../../../virt/kvm/kvm_main.c:1849 [inline]
 kvm_vcpu_write_guest_page+0x39a/0x510 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1870
 nested_release_vmcs12 arch/x86/kvm/vmx.c:8441 [inline]
 handle_vmptrld+0x2384/0x26b0 arch/x86/kvm/vmx.c:8907
 vmx_handle_exit+0x1e81/0xbac0 arch/x86/kvm/vmx.c:10128
 vcpu_enter_guest arch/x86/kvm/x86.c:7667 [inline]
 vcpu_run arch/x86/kvm/x86.c:7730 [inline]
 kvm_arch_vcpu_ioctl_run+0xac32/0x11d80 arch/x86/kvm/x86.c:7930
 kvm_vcpu_ioctl+0xfb1/0x1f90 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2590
 do_vfs_ioctl+0xf77/0x2d30 fs/ioctl.c:46
 ksys_ioctl fs/ioctl.c:702 [inline]
 __do_sys_ioctl fs/ioctl.c:709 [inline]
 __se_sys_ioctl+0x1da/0x270 fs/ioctl.c:707
 __x64_sys_ioctl+0x4a/0x70 fs/ioctl.c:707
 do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7
RIP: 0033:0x44b6e9
Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b ff fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f096b292ce8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00000000006e3c48 RCX: 000000000044b6e9
RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
RBP: 00000000006e3c40 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000206 R12: 00000000006e3c4c
R13: 00007ffd978aeb2f R14: 00007f096b2939c0 R15: 00000000006e3d4c

Uninit was created at:
 kmsan_save_stack_with_flags mm/kmsan/kmsan.c:252 [inline]
 kmsan_internal_poison_shadow+0xc8/0x1e0 mm/kmsan/kmsan.c:177
 kmsan_kmalloc+0x98/0x110 mm/kmsan/kmsan_hooks.c:104
 __kmalloc+0x14c/0x4d0 mm/slub.c:3789
 kmalloc include/linux/slab.h:518 [inline]
 enter_vmx_operation+0x980/0x1a90 arch/x86/kvm/vmx.c:8278
 vmx_set_nested_state+0xc3a/0x1530 arch/x86/kvm/vmx.c:14045
 kvm_arch_vcpu_ioctl+0x4fc9/0x73a0 arch/x86/kvm/x86.c:4057
 kvm_vcpu_ioctl+0xca3/0x1f90 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2742
 do_vfs_ioctl+0xf77/0x2d30 fs/ioctl.c:46
 ksys_ioctl fs/ioctl.c:702 [inline]
 __do_sys_ioctl fs/ioctl.c:709 [inline]
 __se_sys_ioctl+0x1da/0x270 fs/ioctl.c:707
 __x64_sys_ioctl+0x4a/0x70 fs/ioctl.c:707
 do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
 entry_SYSCALL_64_after_hwframe+0x63/0xe7

Bytes 1000-4095 of 4096 are uninitialized
Memory access of size 4096 starts at ffff8801b5157000
==================================================================

---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page
  2018-11-07  1:38 KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page syzbot
@ 2018-11-07 12:10 ` Alexander Potapenko
  2018-11-07 12:47   ` Paolo Bonzini
  2018-11-07 12:52   ` KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page Liran Alon
  0 siblings, 2 replies; 16+ messages in thread
From: Alexander Potapenko @ 2018-11-07 12:10 UTC (permalink / raw)
  To: syzbot+ded1696f6b50b615b630; +Cc: kvm, LKML, pbonzini, rkrcmar, syzkaller-bugs

[-- Attachment #1: Type: text/plain, Size: 5671 bytes --]

On Wed, Nov 7, 2018 at 2:38 AM syzbot
<syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    88b95ef4c780 kmsan: use MSan assembly instrumentation
> git tree:       https://github.com/google/kmsan.git/master
> console output: https://syzkaller.appspot.com/x/log.txt?x=12505e33400000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=8df5fc509a1b351b
> dashboard link: https://syzkaller.appspot.com/bug?extid=ded1696f6b50b615b630
> compiler:       clang version 8.0.0 (trunk 343298)
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=15ce62f5400000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=174efca3400000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com
>
> ==================================================================
> BUG: KMSAN: kernel-infoleak in __copy_to_user include/linux/uaccess.h:121 [inline]
> BUG: KMSAN: kernel-infoleak in __kvm_write_guest_page arch/x86/kvm/../../../virt/kvm/kvm_main.c:1849 [inline]
> BUG: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page+0x39a/0x510 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1870
> CPU: 0 PID: 7918 Comm: syz-executor542 Not tainted 4.19.0+ #77
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x32d/0x480 lib/dump_stack.c:113
>  kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:911
>  kmsan_internal_check_memory+0x34c/0x430 mm/kmsan/kmsan.c:991
>  kmsan_copy_to_user+0x85/0xe0 mm/kmsan/kmsan_hooks.c:552
>  __copy_to_user include/linux/uaccess.h:121 [inline]
>  __kvm_write_guest_page arch/x86/kvm/../../../virt/kvm/kvm_main.c:1849 [inline]
>  kvm_vcpu_write_guest_page+0x39a/0x510 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1870
>  nested_release_vmcs12 arch/x86/kvm/vmx.c:8441 [inline]
>  handle_vmptrld+0x2384/0x26b0 arch/x86/kvm/vmx.c:8907
>  vmx_handle_exit+0x1e81/0xbac0 arch/x86/kvm/vmx.c:10128
>  vcpu_enter_guest arch/x86/kvm/x86.c:7667 [inline]
>  vcpu_run arch/x86/kvm/x86.c:7730 [inline]
>  kvm_arch_vcpu_ioctl_run+0xac32/0x11d80 arch/x86/kvm/x86.c:7930
>  kvm_vcpu_ioctl+0xfb1/0x1f90 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2590
>  do_vfs_ioctl+0xf77/0x2d30 fs/ioctl.c:46
>  ksys_ioctl fs/ioctl.c:702 [inline]
>  __do_sys_ioctl fs/ioctl.c:709 [inline]
>  __se_sys_ioctl+0x1da/0x270 fs/ioctl.c:707
>  __x64_sys_ioctl+0x4a/0x70 fs/ioctl.c:707
>  do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
>  entry_SYSCALL_64_after_hwframe+0x63/0xe7
> RIP: 0033:0x44b6e9
> Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b ff fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007f096b292ce8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
> RAX: ffffffffffffffda RBX: 00000000006e3c48 RCX: 000000000044b6e9
> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005
> RBP: 00000000006e3c40 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000206 R12: 00000000006e3c4c
> R13: 00007ffd978aeb2f R14: 00007f096b2939c0 R15: 00000000006e3d4c
>
> Uninit was created at:
>  kmsan_save_stack_with_flags mm/kmsan/kmsan.c:252 [inline]
>  kmsan_internal_poison_shadow+0xc8/0x1e0 mm/kmsan/kmsan.c:177
>  kmsan_kmalloc+0x98/0x110 mm/kmsan/kmsan_hooks.c:104
>  __kmalloc+0x14c/0x4d0 mm/slub.c:3789
>  kmalloc include/linux/slab.h:518 [inline]
>  enter_vmx_operation+0x980/0x1a90 arch/x86/kvm/vmx.c:8278
>  vmx_set_nested_state+0xc3a/0x1530 arch/x86/kvm/vmx.c:14045
>  kvm_arch_vcpu_ioctl+0x4fc9/0x73a0 arch/x86/kvm/x86.c:4057
>  kvm_vcpu_ioctl+0xca3/0x1f90 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2742
>  do_vfs_ioctl+0xf77/0x2d30 fs/ioctl.c:46
>  ksys_ioctl fs/ioctl.c:702 [inline]
>  __do_sys_ioctl fs/ioctl.c:709 [inline]
>  __se_sys_ioctl+0x1da/0x270 fs/ioctl.c:707
>  __x64_sys_ioctl+0x4a/0x70 fs/ioctl.c:707
>  do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291
>  entry_SYSCALL_64_after_hwframe+0x63/0xe7
>
> Bytes 1000-4095 of 4096 are uninitialized
> Memory access of size 4096 starts at ffff8801b5157000
> ==================================================================

This appears to be a real bug in KVM.
Please see a simplified reproducer attached.

>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
> syzbot can test patches for this bug, for details see:
> https://goo.gl/tpsmEJ#testing-patches
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/0000000000005de8da057a092ba2%40google.com.
> For more options, visit https://groups.google.com/d/optout.
--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

[-- Attachment #2: kvm-infoleak.c --]
[-- Type: text/x-csrc, Size: 27803 bytes --]

// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <dirent.h>
#include <endian.h>
#include <errno.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mount.h>
#include <sys/prctl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#include <linux/futex.h>
#include <linux/kvm.h>

const char kvm_asm64_init_vm[] =
    "\x0f\x20\xc0\x0d\x00\x00\x00\x80\x0f\x22\xc0\xea\xde\xc0\xad\x0b\x50\x00"
    "\x48\xc7\xc0\xd8\x00\x00\x00\x0f\x00\xd8\x48\xc7\xc1\x3a\x00\x00\x00\x0f"
    "\x32\x48\x83\xc8\x05\x0f\x30\x0f\x20\xe0\x48\x0d\x00\x20\x00\x00\x0f\x22"
    "\xe0\x48\xc7\xc1\x80\x04\x00\x00\x0f\x32\x48\xc7\xc2\x00\x60\x00\x00\x89"
    "\x02\x48\xc7\xc2\x00\x70\x00\x00\x89\x02\x48\xc7\xc0\x00\x5f\x00\x00\xf3"
    "\x0f\xc7\x30\x48\xc7\xc0\x08\x5f\x00\x00\x66\x0f\xc7\x30\x0f\xc7\x30\x48"
    "\xc7\xc1\x81\x04\x00\x00\x0f\x32\x48\x83\xc8\x3f\x48\x21\xd0\x48\xc7\xc2"
    "\x00\x40\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x02\x40\x00\x00\x48\xb8\x84\x9e"
    "\x99\xf3\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x1e\x40\x00\x00\x48\xc7"
    "\xc0\x81\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc1\x83\x04\x00\x00\x0f\x32\x48"
    "\x0d\xff\x6f\x03\x00\x48\x21\xd0\x48\xc7\xc2\x0c\x40\x00\x00\x0f\x79\xd0"
    "\x48\xc7\xc1\x84\x04\x00\x00\x0f\x32\x48\x0d\xff\x17\x00\x00\x48\x21\xd0"
    "\x48\xc7\xc2\x12\x40\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x04\x2c\x00\x00\x48"
    "\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x00\x28\x00\x00\x48\xc7"
"\xc0\xff\xff\xff\xff\x0f\x79\xd0\x48\xc7\xc2\x02\x0c\x00\x00\x48\xc7\xc0" "\x50\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc0\x58\x00\x00\x00\x48\xc7\xc2\x00" "\x0c\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x04\x0c\x00\x00\x0f\x79\xd0\x48\xc7" "\xc2\x06\x0c\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x08\x0c\x00\x00\x0f\x79\xd0" "\x48\xc7\xc2\x0a\x0c\x00\x00\x0f\x79\xd0\x48\xc7\xc0\xd8\x00\x00\x00\x48" "\xc7\xc2\x0c\x0c\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x02\x2c\x00\x00\x48\xc7" "\xc0\x00\x05\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x00\x4c\x00\x00\x48\xc7\xc0" "\x50\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x10\x6c\x00\x00\x48\xc7\xc0\x00" "\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x12\x6c\x00\x00\x48\xc7\xc0\x00\x00" "\x00\x00\x0f\x79\xd0\x0f\x20\xc0\x48\xc7\xc2\x00\x6c\x00\x00\x48\x89\xc0" "\x0f\x79\xd0\x0f\x20\xd8\x48\xc7\xc2\x02\x6c\x00\x00\x48\x89\xc0\x0f\x79" "\xd0\x0f\x20\xe0\x48\xc7\xc2\x04\x6c\x00\x00\x48\x89\xc0\x0f\x79\xd0\x48" "\xc7\xc2\x06\x6c\x00\x00\x48\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7" "\xc2\x08\x6c\x00\x00\x48\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2" "\x0a\x6c\x00\x00\x48\xc7\xc0\x00\x3a\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x0c" "\x6c\x00\x00\x48\xc7\xc0\x00\x10\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x0e\x6c" "\x00\x00\x48\xc7\xc0\x00\x38\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x14\x6c\x00" "\x00\x48\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x16\x6c\x00\x00" "\x48\x8b\x04\x25\x10\x5f\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x00\x00\x00\x00" "\x48\xc7\xc0\x01\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x02\x00\x00\x00\x48" "\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x00\x20\x00\x00\x48\xc7" "\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x02\x20\x00\x00\x48\xc7\xc0" "\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x04\x20\x00\x00\x48\xc7\xc0\x00" "\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x06\x20\x00\x00\x48\xc7\xc0\x00\x00" "\x00\x00\x0f\x79\xd0\x48\xc7\xc1\x77\x02\x00\x00\x0f\x32\x48\xc1\xe2\x20" "\x48\x09\xd0\x48\xc7\xc2\x00\x2c\x00\x00\x48\x89\xc0\x0f\x79\xd0\x48\xc7" 
"\xc2\x04\x40\x00\x00\x48\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2" "\x0a\x40\x00\x00\x48\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x0e" "\x40\x00\x00\x48\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x10\x40" "\x00\x00\x48\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x16\x40\x00" "\x00\x48\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x14\x40\x00\x00" "\x48\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x00\x60\x00\x00\x48" "\xc7\xc0\xff\xff\xff\xff\x0f\x79\xd0\x48\xc7\xc2\x02\x60\x00\x00\x48\xc7" "\xc0\xff\xff\xff\xff\x0f\x79\xd0\x48\xc7\xc2\x1c\x20\x00\x00\x48\xc7\xc0" "\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x1e\x20\x00\x00\x48\xc7\xc0\x00" "\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x20\x20\x00\x00\x48\xc7\xc0\x00\x00" "\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x22\x20\x00\x00\x48\xc7\xc0\x00\x00\x00" "\x00\x0f\x79\xd0\x48\xc7\xc2\x00\x08\x00\x00\x48\xc7\xc0\x58\x00\x00\x00" "\x0f\x79\xd0\x48\xc7\xc2\x02\x08\x00\x00\x48\xc7\xc0\x50\x00\x00\x00\x0f" "\x79\xd0\x48\xc7\xc2\x04\x08\x00\x00\x48\xc7\xc0\x58\x00\x00\x00\x0f\x79" "\xd0\x48\xc7\xc2\x06\x08\x00\x00\x48\xc7\xc0\x58\x00\x00\x00\x0f\x79\xd0" "\x48\xc7\xc2\x08\x08\x00\x00\x48\xc7\xc0\x58\x00\x00\x00\x0f\x79\xd0\x48" "\xc7\xc2\x0a\x08\x00\x00\x48\xc7\xc0\x58\x00\x00\x00\x0f\x79\xd0\x48\xc7" "\xc2\x0c\x08\x00\x00\x48\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2" "\x0e\x08\x00\x00\x48\xc7\xc0\xd8\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x12" "\x68\x00\x00\x48\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x14\x68" "\x00\x00\x48\xc7\xc0\x00\x3a\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x16\x68\x00" "\x00\x48\xc7\xc0\x00\x10\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x18\x68\x00\x00" "\x48\xc7\xc0\x00\x38\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x00\x48\x00\x00\x48" "\xc7\xc0\xff\xff\x0f\x00\x0f\x79\xd0\x48\xc7\xc2\x02\x48\x00\x00\x48\xc7" "\xc0\xff\xff\x0f\x00\x0f\x79\xd0\x48\xc7\xc2\x04\x48\x00\x00\x48\xc7\xc0" "\xff\xff\x0f\x00\x0f\x79\xd0\x48\xc7\xc2\x06\x48\x00\x00\x48\xc7\xc0\xff" 
"\xff\x0f\x00\x0f\x79\xd0\x48\xc7\xc2\x08\x48\x00\x00\x48\xc7\xc0\xff\xff" "\x0f\x00\x0f\x79\xd0\x48\xc7\xc2\x0a\x48\x00\x00\x48\xc7\xc0\xff\xff\x0f" "\x00\x0f\x79\xd0\x48\xc7\xc2\x0c\x48\x00\x00\x48\xc7\xc0\x00\x00\x00\x00" "\x0f\x79\xd0\x48\xc7\xc2\x0e\x48\x00\x00\x48\xc7\xc0\xff\x1f\x00\x00\x0f" "\x79\xd0\x48\xc7\xc2\x10\x48\x00\x00\x48\xc7\xc0\xff\x1f\x00\x00\x0f\x79" "\xd0\x48\xc7\xc2\x12\x48\x00\x00\x48\xc7\xc0\xff\x1f\x00\x00\x0f\x79\xd0" "\x48\xc7\xc2\x14\x48\x00\x00\x48\xc7\xc0\x93\x40\x00\x00\x0f\x79\xd0\x48" "\xc7\xc2\x16\x48\x00\x00\x48\xc7\xc0\x9b\x20\x00\x00\x0f\x79\xd0\x48\xc7" "\xc2\x18\x48\x00\x00\x48\xc7\xc0\x93\x40\x00\x00\x0f\x79\xd0\x48\xc7\xc2" "\x1a\x48\x00\x00\x48\xc7\xc0\x93\x40\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x1c" "\x48\x00\x00\x48\xc7\xc0\x93\x40\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x1e\x48" "\x00\x00\x48\xc7\xc0\x93\x40\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x20\x48\x00" "\x00\x48\xc7\xc0\x82\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x22\x48\x00\x00" "\x48\xc7\xc0\x8b\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x1c\x68\x00\x00\x48" "\xc7\xc0\x00\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x1e\x68\x00\x00\x48\xc7" "\xc0\x00\x91\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x20\x68\x00\x00\x48\xc7\xc0" "\x02\x00\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x06\x28\x00\x00\x48\xc7\xc0\x00" "\x05\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x0a\x28\x00\x00\x48\xc7\xc0\x00\x00" "\x00\x00\x0f\x79\xd0\x48\xc7\xc2\x0c\x28\x00\x00\x48\xc7\xc0\x00\x00\x00" "\x00\x0f\x79\xd0\x48\xc7\xc2\x0e\x28\x00\x00\x48\xc7\xc0\x00\x00\x00\x00" "\x0f\x79\xd0\x48\xc7\xc2\x10\x28\x00\x00\x48\xc7\xc0\x00\x00\x00\x00\x0f" "\x79\xd0\x0f\x20\xc0\x48\xc7\xc2\x00\x68\x00\x00\x48\x89\xc0\x0f\x79\xd0" "\x0f\x20\xd8\x48\xc7\xc2\x02\x68\x00\x00\x48\x89\xc0\x0f\x79\xd0\x0f\x20" "\xe0\x48\xc7\xc2\x04\x68\x00\x00\x48\x89\xc0\x0f\x79\xd0\x48\xc7\xc0\x18" "\x5f\x00\x00\x48\x8b\x10\x48\xc7\xc0\x20\x5f\x00\x00\x48\x8b\x08\x48\x31" "\xc0\x0f\x78\xd0\x48\x31\xc8\x0f\x79\xd0\x0f\x01\xc2\x48\xc7\xc2\x00\x44" "\x00\x00\x0f\x78\xd0\xf4"; const char 
kvm_asm64_vm_exit[] =
    "\x48\xc7\xc3\x00\x44\x00\x00\x0f\x78\xda\x48"
    "\xc7\xc3\x02\x44\x00\x00\x0f\x78\xd9\x48\xc7"
    "\xc0\x00\x64\x00\x00\x0f\x78\xc0\x48\xc7\xc3"
    "\x1e\x68\x00\x00\x0f\x78\xdb\xf4";

#define ADDR_TEXT 0x0000
#define ADDR_GDT 0x1000
#define ADDR_LDT 0x1800
#define ADDR_PML4 0x2000
#define ADDR_PDP 0x3000
#define ADDR_PD 0x4000
#define ADDR_STACK0 0x0f80
#define ADDR_VAR_HLT 0x2800
#define ADDR_VAR_SYSRET 0x2808
#define ADDR_VAR_SYSEXIT 0x2810
#define ADDR_VAR_IDT 0x3800
#define ADDR_VAR_TSS64 0x3a00
#define ADDR_VAR_TSS64_CPL3 0x3c00
#define ADDR_VAR_TSS16 0x3d00
#define ADDR_VAR_TSS16_2 0x3e00
#define ADDR_VAR_TSS16_CPL3 0x3f00
#define ADDR_VAR_TSS32 0x4800
#define ADDR_VAR_TSS32_2 0x4a00
#define ADDR_VAR_TSS32_CPL3 0x4c00
#define ADDR_VAR_TSS32_VM86 0x4e00
#define ADDR_VAR_VMXON_PTR 0x5f00
#define ADDR_VAR_VMCS_PTR 0x5f08
#define ADDR_VAR_VMEXIT_PTR 0x5f10
#define ADDR_VAR_VMWRITE_FLD 0x5f18
#define ADDR_VAR_VMWRITE_VAL 0x5f20
#define ADDR_VAR_VMXON 0x6000
#define ADDR_VAR_VMCS 0x7000
#define ADDR_VAR_VMEXIT_CODE 0x9000
#define ADDR_VAR_USER_CODE 0x9100
#define ADDR_VAR_USER_CODE2 0x9120

#define SEL_LDT (1 << 3)
#define SEL_CS16 (2 << 3)
#define SEL_DS16 (3 << 3)
#define SEL_CS16_CPL3 ((4 << 3) + 3)
#define SEL_DS16_CPL3 ((5 << 3) + 3)
#define SEL_CS32 (6 << 3)
#define SEL_DS32 (7 << 3)
#define SEL_CS32_CPL3 ((8 << 3) + 3)
#define SEL_DS32_CPL3 ((9 << 3) + 3)
#define SEL_CS64 (10 << 3)
#define SEL_DS64 (11 << 3)
#define SEL_CS64_CPL3 ((12 << 3) + 3)
#define SEL_DS64_CPL3 ((13 << 3) + 3)
#define SEL_CGATE16 (14 << 3)
#define SEL_TGATE16 (15 << 3)
#define SEL_CGATE32 (16 << 3)
#define SEL_TGATE32 (17 << 3)
#define SEL_CGATE64 (18 << 3)
#define SEL_CGATE64_HI (19 << 3)
#define SEL_TSS16 (20 << 3)
#define SEL_TSS16_2 (21 << 3)
#define SEL_TSS16_CPL3 ((22 << 3) + 3)
#define SEL_TSS32 (23 << 3)
#define SEL_TSS32_2 (24 << 3)
#define SEL_TSS32_CPL3 ((25 << 3) + 3)
#define SEL_TSS32_VM86 (26 << 3)
#define SEL_TSS64 (27 << 3)
#define SEL_TSS64_HI (28 << 3)
#define SEL_TSS64_CPL3 ((29 << 3) + 3)
#define SEL_TSS64_CPL3_HI (30 << 3)

#define MSR_IA32_SYSENTER_CS 0x174
#define MSR_IA32_SYSENTER_ESP 0x175
#define MSR_IA32_SYSENTER_EIP 0x176
#define MSR_IA32_STAR 0xC0000081
#define MSR_IA32_LSTAR 0xC0000082

#define PREFIX_SIZE 0xba1d

#define CR0_PE 1
#define CR0_MP (1 << 1)
#define CR0_EM (1 << 2)
#define CR0_TS (1 << 3)
#define CR0_ET (1 << 4)
#define CR0_NE (1 << 5)
#define CR4_PAE (1 << 5)
#define EFER_SCE 1
#define EFER_LME (1 << 8)
#define PDE64_PRESENT 1
#define PDE64_RW (1 << 1)
#define PDE64_USER (1 << 2)
#define PDE64_PS (1 << 7)

#define MIN_STRLEN 1

void dump_buf(unsigned char *buf, int len)
{
  int i, nz = 0;
  for (i = 0; i < len; i++) {
    if (buf[i]) {
      nz = 1;
      break;
    }
  }
  if (!nz)
    // The buffer is empty.
    return;
  //pthread_mutex_lock(&out_mutex);
  for (i = 0; i < len; i++) {
    if (buf[i]) {
      int str_len = strlen(&buf[i]);
      // Short string pieces are too boring.
      if (str_len >= MIN_STRLEN) {
        unsigned char *c;
        for (c = &buf[i]; c < &buf[i + str_len]; c++) {
          if ((*c > 127) || ((*c < 32) && (*c != 10) && (*c != 13))) {
            *c = ' ';
            continue;
          }
        }
        // Dump the buffer.
        fprintf(stderr, "%s\n", &buf[i]);
      }
      i += str_len;
    }
  }
  //pthread_mutex_unlock(&out_mutex);
}

struct tss16 {
  uint16_t prev;
  uint16_t sp0;
  uint16_t ss0;
  uint16_t sp1;
  uint16_t ss1;
  uint16_t sp2;
  uint16_t ss2;
  uint16_t ip;
  uint16_t flags;
  uint16_t ax;
  uint16_t cx;
  uint16_t dx;
  uint16_t bx;
  uint16_t sp;
  uint16_t bp;
  uint16_t si;
  uint16_t di;
  uint16_t es;
  uint16_t cs;
  uint16_t ss;
  uint16_t ds;
  uint16_t ldt;
} __attribute__((packed));

struct tss32 {
  uint16_t prev, prevh;
  uint32_t sp0;
  uint16_t ss0, ss0h;
  uint32_t sp1;
  uint16_t ss1, ss1h;
  uint32_t sp2;
  uint16_t ss2, ss2h;
  uint32_t cr3;
  uint32_t ip;
  uint32_t flags;
  uint32_t ax;
  uint32_t cx;
  uint32_t dx;
  uint32_t bx;
  uint32_t sp;
  uint32_t bp;
  uint32_t si;
  uint32_t di;
  uint16_t es, esh;
  uint16_t cs, csh;
  uint16_t ss, ssh;
  uint16_t ds, dsh;
  uint16_t fs, fsh;
  uint16_t gs, gsh;
  uint16_t ldt, ldth;
  uint16_t trace;
  uint16_t io_bitmap;
} __attribute__((packed));

struct tss64 {
  uint32_t reserved0;
  uint64_t rsp[3];
  uint64_t reserved1;
  uint64_t ist[7];
  uint64_t reserved2;
  uint32_t reserved3;
  uint32_t io_bitmap;
} __attribute__((packed));

static void fill_segment_descriptor(uint64_t* dt, uint64_t* lt, struct kvm_segment* seg)
{
  uint16_t index = seg->selector >> 3;
  uint64_t limit = seg->g ?
      seg->limit >> 12 : seg->limit;
  uint64_t sd = (limit & 0xffff) | (seg->base & 0xffffff) << 16 |
                (uint64_t)seg->type << 40 | (uint64_t)seg->s << 44 |
                (uint64_t)seg->dpl << 45 | (uint64_t)seg->present << 47 |
                (limit & 0xf0000ULL) << 48 | (uint64_t)seg->avl << 52 |
                (uint64_t)seg->l << 53 | (uint64_t)seg->db << 54 |
                (uint64_t)seg->g << 55 | (seg->base & 0xff000000ULL) << 56;
  dt[index] = sd;
  lt[index] = sd;
}

static void fill_segment_descriptor_dword(uint64_t* dt, uint64_t* lt, struct kvm_segment* seg)
{
  fill_segment_descriptor(dt, lt, seg);
  uint16_t index = seg->selector >> 3;
  dt[index + 1] = 0;
  lt[index + 1] = 0;
}

static void setup_syscall_msrs(int cpufd, uint16_t sel_cs, uint16_t sel_cs_cpl3)
{
  char buf[sizeof(struct kvm_msrs) + 5 * sizeof(struct kvm_msr_entry)];
  memset(buf, 0, sizeof(buf));
  struct kvm_msrs* msrs = (struct kvm_msrs*)buf;
  struct kvm_msr_entry* entries = msrs->entries;
  msrs->nmsrs = 5;
  entries[0].index = MSR_IA32_SYSENTER_CS;
  entries[0].data = sel_cs;
  entries[1].index = MSR_IA32_SYSENTER_ESP;
  entries[1].data = ADDR_STACK0;
  entries[2].index = MSR_IA32_SYSENTER_EIP;
  entries[2].data = ADDR_VAR_SYSEXIT;
  entries[3].index = MSR_IA32_STAR;
  entries[3].data = ((uint64_t)sel_cs << 32) | ((uint64_t)sel_cs_cpl3 << 48);
  entries[4].index = MSR_IA32_LSTAR;
  entries[4].data = ADDR_VAR_SYSRET;
  ioctl(cpufd, KVM_SET_MSRS, msrs);
}

static void setup_64bit_idt(struct kvm_sregs* sregs, char* host_mem, uintptr_t guest_mem)
{
  sregs->idt.base = guest_mem + ADDR_VAR_IDT;
  sregs->idt.limit = 0x1ff;
  uint64_t* idt = (uint64_t*)(host_mem + sregs->idt.base);
  int i;
  for (i = 0; i < 32; i++) {
    struct kvm_segment gate;
    gate.selector = (i * 2) << 3;
    gate.type = (i & 1) ? 14 : 15;
    gate.base = SEL_CS64;
    gate.limit = guest_mem + ADDR_VAR_USER_CODE2;
    gate.present = 1;
    gate.dpl = 0;
    gate.s = 0;
    gate.g = 0;
    gate.db = 0;
    gate.l = 0;
    gate.avl = 0;
    fill_segment_descriptor_dword(idt, idt, &gate);
  }
}

void syz_kvm_setup_vmfd(int vmfd, char *host_mem)
{
  const uintptr_t page_size = 4 << 10;
  const uintptr_t ioapic_page = 10;
  const uintptr_t guest_mem_size = 24 * page_size;
  const uintptr_t guest_mem = 0;
  uintptr_t i;
  for (i = 0; i < guest_mem_size / page_size; i++) {
    struct kvm_userspace_memory_region memreg;
    memreg.slot = i;
    memreg.flags = 0;
    memreg.guest_phys_addr = guest_mem + i * page_size;
    if (i == ioapic_page)
      memreg.guest_phys_addr = 0xfec00000;
    memreg.memory_size = page_size;
    memreg.userspace_addr = (uintptr_t)host_mem + i * page_size;
    if (vmfd != -1)
      ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &memreg);
  }
  struct kvm_userspace_memory_region memreg;
  memreg.slot = 1 + (1 << 16);
  memreg.flags = 0;
  memreg.guest_phys_addr = 0x30000;
  memreg.memory_size = 64 << 10;
  memreg.userspace_addr = (uintptr_t)host_mem;
  if (vmfd != -1)
    ioctl(vmfd, KVM_SET_USER_MEMORY_REGION, &memreg);
}

static uintptr_t syz_kvm_setup_cpu(int kvmfd, uintptr_t cpufd, char *host_mem)
{
  const uintptr_t guest_mem = 0;
  uintptr_t i;
  struct kvm_sregs sregs;
  if (ioctl(cpufd, KVM_GET_SREGS, &sregs))
    return -1;
  struct kvm_regs regs;
  memset(&regs, 0, sizeof(regs));
  regs.rip = guest_mem + ADDR_TEXT;
  regs.rsp = ADDR_STACK0;
  sregs.gdt.base = guest_mem + ADDR_GDT;
  sregs.gdt.limit = 256 * sizeof(uint64_t) - 1;
  uint64_t* gdt = (uint64_t*)(host_mem + sregs.gdt.base);
  struct kvm_segment seg_ldt;
  seg_ldt.selector = SEL_LDT;
  seg_ldt.type = 2;
  seg_ldt.base = guest_mem + ADDR_LDT;
  seg_ldt.limit = 256 * sizeof(uint64_t) - 1;
  seg_ldt.present = 1;
  seg_ldt.dpl = 0;
  seg_ldt.s = 0;
  seg_ldt.g = 0;
  seg_ldt.db = 1;
  seg_ldt.l = 0;
  sregs.ldt = seg_ldt;
  uint64_t* ldt = (uint64_t*)(host_mem + sregs.ldt.base);
  struct kvm_segment seg_cs16;
  seg_cs16.selector = SEL_CS16;
  seg_cs16.type = 11;
  seg_cs16.base = 0;
  seg_cs16.limit = 0xfffff;
  seg_cs16.present = 1;
  seg_cs16.dpl = 0;
  seg_cs16.s = 1;
  seg_cs16.g = 0;
  seg_cs16.db = 0;
  seg_cs16.l = 0;
  struct kvm_segment seg_ds16 = seg_cs16;
  seg_ds16.selector = SEL_DS16;
  seg_ds16.type = 3;
  struct kvm_segment seg_cs16_cpl3 = seg_cs16;
  seg_cs16_cpl3.selector = SEL_CS16_CPL3;
  seg_cs16_cpl3.dpl = 3;
  struct kvm_segment seg_ds16_cpl3 = seg_ds16;
  seg_ds16_cpl3.selector = SEL_DS16_CPL3;
  seg_ds16_cpl3.dpl = 3;
  struct kvm_segment seg_cs32 = seg_cs16;
  seg_cs32.selector = SEL_CS32;
  seg_cs32.db = 1;
  struct kvm_segment seg_ds32 = seg_ds16;
  seg_ds32.selector = SEL_DS32;
  seg_ds32.db = 1;
  struct kvm_segment seg_cs32_cpl3 = seg_cs32;
  seg_cs32_cpl3.selector = SEL_CS32_CPL3;
  seg_cs32_cpl3.dpl = 3;
  struct kvm_segment seg_ds32_cpl3 = seg_ds32;
  seg_ds32_cpl3.selector = SEL_DS32_CPL3;
  seg_ds32_cpl3.dpl = 3;
  struct kvm_segment seg_cs64 = seg_cs16;
  seg_cs64.selector = SEL_CS64;
  seg_cs64.l = 1;
  struct kvm_segment seg_ds64 = seg_ds32;
  seg_ds64.selector = SEL_DS64;
  struct kvm_segment seg_cs64_cpl3 = seg_cs64;
  seg_cs64_cpl3.selector = SEL_CS64_CPL3;
  seg_cs64_cpl3.dpl = 3;
  struct kvm_segment seg_ds64_cpl3 = seg_ds64;
  seg_ds64_cpl3.selector = SEL_DS64_CPL3;
  seg_ds64_cpl3.dpl = 3;
  struct kvm_segment seg_tss32;
  seg_tss32.selector = SEL_TSS32;
  seg_tss32.type = 9;
  seg_tss32.base = ADDR_VAR_TSS32;
  seg_tss32.limit = 0x1ff;
  seg_tss32.present = 1;
  seg_tss32.dpl = 0;
  seg_tss32.s = 0;
  seg_tss32.g = 0;
  seg_tss32.db = 0;
  seg_tss32.l = 0;
  struct kvm_segment seg_tss32_2 = seg_tss32;
  seg_tss32_2.selector = SEL_TSS32_2;
  seg_tss32_2.base = ADDR_VAR_TSS32_2;
  struct kvm_segment seg_tss32_cpl3 = seg_tss32;
  seg_tss32_cpl3.selector = SEL_TSS32_CPL3;
  seg_tss32_cpl3.base = ADDR_VAR_TSS32_CPL3;
  struct kvm_segment seg_tss32_vm86 = seg_tss32;
  seg_tss32_vm86.selector = SEL_TSS32_VM86;
  seg_tss32_vm86.base = ADDR_VAR_TSS32_VM86;
  struct kvm_segment seg_tss16 = seg_tss32;
  seg_tss16.selector = SEL_TSS16;
  seg_tss16.base = ADDR_VAR_TSS16;
  seg_tss16.limit = 0xff;
  seg_tss16.type = 1;
  struct kvm_segment
seg_tss16_2 = seg_tss16;
  seg_tss16_2.selector = SEL_TSS16_2;
  seg_tss16_2.base = ADDR_VAR_TSS16_2;
  seg_tss16_2.dpl = 0;
  struct kvm_segment seg_tss16_cpl3 = seg_tss16;
  seg_tss16_cpl3.selector = SEL_TSS16_CPL3;
  seg_tss16_cpl3.base = ADDR_VAR_TSS16_CPL3;
  seg_tss16_cpl3.dpl = 3;
  struct kvm_segment seg_tss64 = seg_tss32;
  seg_tss64.selector = SEL_TSS64;
  seg_tss64.base = ADDR_VAR_TSS64;
  seg_tss64.limit = 0x1ff;
  struct kvm_segment seg_tss64_cpl3 = seg_tss64;
  seg_tss64_cpl3.selector = SEL_TSS64_CPL3;
  seg_tss64_cpl3.base = ADDR_VAR_TSS64_CPL3;
  seg_tss64_cpl3.dpl = 3;
  struct kvm_segment seg_cgate16;
  seg_cgate16.selector = SEL_CGATE16;
  seg_cgate16.type = 4;
  seg_cgate16.base = SEL_CS16 | (2 << 16);
  seg_cgate16.limit = ADDR_VAR_USER_CODE2;
  seg_cgate16.present = 1;
  seg_cgate16.dpl = 0;
  seg_cgate16.s = 0;
  seg_cgate16.g = 0;
  seg_cgate16.db = 0;
  seg_cgate16.l = 0;
  seg_cgate16.avl = 0;
  struct kvm_segment seg_tgate16 = seg_cgate16;
  seg_tgate16.selector = SEL_TGATE16;
  seg_tgate16.type = 3;
  seg_cgate16.base = SEL_TSS16_2;
  seg_tgate16.limit = 0;
  struct kvm_segment seg_cgate32 = seg_cgate16;
  seg_cgate32.selector = SEL_CGATE32;
  seg_cgate32.type = 12;
  seg_cgate32.base = SEL_CS32 | (2 << 16);
  struct kvm_segment seg_tgate32 = seg_cgate32;
  seg_tgate32.selector = SEL_TGATE32;
  seg_tgate32.type = 11;
  seg_tgate32.base = SEL_TSS32_2;
  seg_tgate32.limit = 0;
  struct kvm_segment seg_cgate64 = seg_cgate16;
  seg_cgate64.selector = SEL_CGATE64;
  seg_cgate64.type = 12;
  seg_cgate64.base = SEL_CS64;
  char buf[sizeof(struct kvm_cpuid2) + 128 * sizeof(struct kvm_cpuid_entry2)];
  memset(buf, 0, sizeof(buf));
  struct kvm_cpuid2* cpuid = (struct kvm_cpuid2*)buf;
  cpuid->nent = 128;
  ioctl(kvmfd, KVM_GET_SUPPORTED_CPUID, cpuid);
  ioctl(cpufd, KVM_SET_CPUID2, cpuid);
  const char* text_prefix = 0;
  int text_prefix_size = 0;
  char* host_text = host_mem + ADDR_TEXT;
  if (1) {
    sregs.efer |= EFER_LME | EFER_SCE;
    sregs.cr0 |= CR0_PE;
    setup_syscall_msrs(cpufd, SEL_CS64, SEL_CS64_CPL3);
    setup_64bit_idt(&sregs, host_mem, guest_mem);
    sregs.cs = seg_cs32;
    sregs.ds = sregs.es = sregs.fs = sregs.gs = sregs.ss = seg_ds32;
    uint64_t pml4_addr = guest_mem + ADDR_PML4;
    uint64_t* pml4 = (uint64_t*)(host_mem + ADDR_PML4);
    uint64_t pdpt_addr = guest_mem + ADDR_PDP;
    uint64_t* pdpt = (uint64_t*)(host_mem + ADDR_PDP);
    uint64_t pd_addr = guest_mem + ADDR_PD;
    uint64_t* pd = (uint64_t*)(host_mem + ADDR_PD);
    pml4[0] = PDE64_PRESENT | PDE64_RW | PDE64_USER | pdpt_addr;
    pdpt[0] = PDE64_PRESENT | PDE64_RW | PDE64_USER | pd_addr;
    pd[0] = PDE64_PRESENT | PDE64_RW | PDE64_USER | PDE64_PS;
    sregs.cr3 = pml4_addr;
    sregs.cr4 |= CR4_PAE;
    if (1) {
      sregs.cr0 |= CR0_NE;
      *((uint64_t*)(host_mem + ADDR_VAR_VMXON_PTR)) = ADDR_VAR_VMXON;
      *((uint64_t*)(host_mem + ADDR_VAR_VMCS_PTR)) = ADDR_VAR_VMCS;
      memcpy(host_mem + ADDR_VAR_VMEXIT_CODE, kvm_asm64_vm_exit, sizeof(kvm_asm64_vm_exit) - 1);
      *((uint64_t*)(host_mem + ADDR_VAR_VMEXIT_PTR)) = ADDR_VAR_VMEXIT_CODE;
      text_prefix = kvm_asm64_init_vm;
      text_prefix_size = sizeof(kvm_asm64_init_vm) - 1;
    }
  }
  struct tss16 tss16;
  memset(&tss16, 0, sizeof(tss16));
  tss16.ss0 = tss16.ss1 = tss16.ss2 = SEL_DS16;
  tss16.sp0 = tss16.sp1 = tss16.sp2 = ADDR_STACK0;
  tss16.ip = ADDR_VAR_USER_CODE2;
  tss16.flags = (1 << 1);
  tss16.cs = SEL_CS16;
  tss16.es = tss16.ds = tss16.ss = SEL_DS16;
  tss16.ldt = SEL_LDT;
  struct tss16* tss16_addr = (struct tss16*)(host_mem + seg_tss16_2.base);
  memcpy(tss16_addr, &tss16, sizeof(tss16));
  memset(&tss16, 0, sizeof(tss16));
  tss16.ss0 = tss16.ss1 = tss16.ss2 = SEL_DS16;
  tss16.sp0 = tss16.sp1 = tss16.sp2 = ADDR_STACK0;
  tss16.ip = ADDR_VAR_USER_CODE2;
  tss16.flags = (1 << 1);
  tss16.cs = SEL_CS16_CPL3;
  tss16.es = tss16.ds = tss16.ss = SEL_DS16_CPL3;
  tss16.ldt = SEL_LDT;
  struct tss16* tss16_cpl3_addr = (struct tss16*)(host_mem + seg_tss16_cpl3.base);
  memcpy(tss16_cpl3_addr, &tss16, sizeof(tss16));
  struct tss32 tss32;
  memset(&tss32, 0, sizeof(tss32));
  tss32.ss0 = tss32.ss1 = tss32.ss2 = SEL_DS32;
  tss32.sp0 = tss32.sp1 = tss32.sp2 = ADDR_STACK0;
  tss32.ip = ADDR_VAR_USER_CODE;
  tss32.flags = (1 << 1) | (1 << 17);
  tss32.ldt = SEL_LDT;
  tss32.cr3 = sregs.cr3;
  tss32.io_bitmap = offsetof(struct tss32, io_bitmap);
  struct tss32* tss32_addr = (struct tss32*)(host_mem + seg_tss32_vm86.base);
  memcpy(tss32_addr, &tss32, sizeof(tss32));
  memset(&tss32, 0, sizeof(tss32));
  tss32.ss0 = tss32.ss1 = tss32.ss2 = SEL_DS32;
  tss32.sp0 = tss32.sp1 = tss32.sp2 = ADDR_STACK0;
  tss32.ip = ADDR_VAR_USER_CODE;
  tss32.flags = (1 << 1);
  tss32.cr3 = sregs.cr3;
  tss32.es = tss32.ds = tss32.ss = tss32.gs = tss32.fs = SEL_DS32;
  tss32.cs = SEL_CS32;
  tss32.ldt = SEL_LDT;
  tss32.cr3 = sregs.cr3;
  tss32.io_bitmap = offsetof(struct tss32, io_bitmap);
  struct tss32* tss32_cpl3_addr = (struct tss32*)(host_mem + seg_tss32_2.base);
  memcpy(tss32_cpl3_addr, &tss32, sizeof(tss32));
  struct tss64 tss64;
  memset(&tss64, 0, sizeof(tss64));
  tss64.rsp[0] = ADDR_STACK0;
  tss64.rsp[1] = ADDR_STACK0;
  tss64.rsp[2] = ADDR_STACK0;
  tss64.io_bitmap = offsetof(struct tss64, io_bitmap);
  struct tss64* tss64_addr = (struct tss64*)(host_mem + seg_tss64.base);
  memcpy(tss64_addr, &tss64, sizeof(tss64));
  memset(&tss64, 0, sizeof(tss64));
  tss64.rsp[0] = ADDR_STACK0;
  tss64.rsp[1] = ADDR_STACK0;
  tss64.rsp[2] = ADDR_STACK0;
  tss64.io_bitmap = offsetof(struct tss64, io_bitmap);
  struct tss64* tss64_cpl3_addr = (struct tss64*)(host_mem + seg_tss64_cpl3.base);
  memcpy(tss64_cpl3_addr, &tss64, sizeof(tss64));
  if (text_prefix) {
    memcpy(host_text, text_prefix, text_prefix_size);
    void* patch = 0;
    patch = memmem(host_text, text_prefix_size, "\xde\xc0\xad\x0b", 4);
    if (patch)
      *((uint32_t*)patch) = guest_mem + ADDR_TEXT + ((char*)patch - host_text) + 6;
    uint16_t magic = PREFIX_SIZE;
    patch = 0;
    patch = memmem(host_text, text_prefix_size, &magic, sizeof(magic));
    if (patch)
      *((uint16_t*)patch) = guest_mem + ADDR_TEXT + text_prefix_size;
  }
  *(host_text + text_prefix_size) = 0xf4;
  *(host_mem + ADDR_VAR_USER_CODE) = 0xf4;
  *(host_mem + ADDR_VAR_HLT) = 0xf4;
  memcpy(host_mem + ADDR_VAR_SYSRET, "\x0f\x07\xf4", 3);
  memcpy(host_mem + ADDR_VAR_SYSEXIT,
"\x0f\x35\xf4", 3); *(uint64_t*)(host_mem + ADDR_VAR_VMWRITE_FLD) = 0; *(uint64_t*)(host_mem + ADDR_VAR_VMWRITE_VAL) = 0; regs.rflags |= 2; fill_segment_descriptor(gdt, ldt, &seg_ldt); fill_segment_descriptor(gdt, ldt, &seg_cs16); fill_segment_descriptor(gdt, ldt, &seg_ds16); fill_segment_descriptor(gdt, ldt, &seg_cs16_cpl3); fill_segment_descriptor(gdt, ldt, &seg_ds16_cpl3); fill_segment_descriptor(gdt, ldt, &seg_cs32); fill_segment_descriptor(gdt, ldt, &seg_ds32); fill_segment_descriptor(gdt, ldt, &seg_cs32_cpl3); fill_segment_descriptor(gdt, ldt, &seg_ds32_cpl3); fill_segment_descriptor(gdt, ldt, &seg_cs64); fill_segment_descriptor(gdt, ldt, &seg_ds64); fill_segment_descriptor(gdt, ldt, &seg_cs64_cpl3); fill_segment_descriptor(gdt, ldt, &seg_ds64_cpl3); fill_segment_descriptor(gdt, ldt, &seg_tss32); fill_segment_descriptor(gdt, ldt, &seg_tss32_2); fill_segment_descriptor(gdt, ldt, &seg_tss32_cpl3); fill_segment_descriptor(gdt, ldt, &seg_tss32_vm86); fill_segment_descriptor(gdt, ldt, &seg_tss16); fill_segment_descriptor(gdt, ldt, &seg_tss16_2); fill_segment_descriptor(gdt, ldt, &seg_tss16_cpl3); fill_segment_descriptor_dword(gdt, ldt, &seg_tss64); fill_segment_descriptor_dword(gdt, ldt, &seg_tss64_cpl3); fill_segment_descriptor(gdt, ldt, &seg_cgate16); fill_segment_descriptor(gdt, ldt, &seg_tgate16); fill_segment_descriptor(gdt, ldt, &seg_cgate32); fill_segment_descriptor(gdt, ldt, &seg_tgate32); fill_segment_descriptor_dword(gdt, ldt, &seg_cgate64); if (ioctl(cpufd, KVM_SET_SREGS, &sregs)) return -1; if (ioctl(cpufd, KVM_SET_REGS, &regs)) return -1; return 0; } int kvmfd, vmfd, vcpufd; int main(void) { syscall(__NR_mmap, 0x20000000, 0x1000000, 3, 0x32, -1, 0); kvmfd = open("/dev/kvm", O_RDWR); vmfd = ioctl(kvmfd, KVM_CREATE_VM, 0); vcpufd = ioctl(vmfd, KVM_CREATE_VCPU, 0); syz_kvm_setup_vmfd(vmfd, 0x20000000); syz_kvm_setup_cpu(kvmfd, vcpufd, /*host_mem*/0x20000000); *(uint16_t*)0x20000580 = 0; *(uint16_t*)0x20000582 = 0; *(uint32_t*)0x20000584 = 1040;
*(uint64_t*)0x20000588 = 0; *(uint64_t*)0x20000590 = 0xe000; // host_mem output offset *(uint16_t*)0x20000598 = 0; ioctl(vcpufd, 0x4080aebf, 0x20000580); ioctl(vcpufd, KVM_RUN, 0); dump_buf(0x2000e000, 0x1000); return 0; } ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page 2018-11-07 12:10 ` Alexander Potapenko @ 2018-11-07 12:47 ` Paolo Bonzini 2018-11-07 12:58 ` Liran Alon 2018-11-07 12:52 ` KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page Liran Alon 1 sibling, 1 reply; 16+ messages in thread From: Paolo Bonzini @ 2018-11-07 12:47 UTC (permalink / raw) To: Alexander Potapenko, syzbot+ded1696f6b50b615b630 Cc: kvm, LKML, rkrcmar, syzkaller-bugs On 07/11/2018 13:10, Alexander Potapenko wrote: > This appears to be a real bug in KVM. > Please see a simplified reproducer attached. Thanks, I agree it's a real bug. The basic issue is that the kvm_state->size member is too small (1040) in the KVM_SET_NESTED_STATE ioctl, aka 0x4080aebf. One way to fix it would be to just change kmalloc to kzalloc when allocating cached_vmcs12 and cached_shadow_vmcs12, but really the ioctl is wrong and should be rejected. And the case where a shadow VMCS has to be loaded is even more wrong, and we have to fix it anyway, so I don't really like the idea of papering over the bug in the allocation. I'll test this patch and submit it formally: diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index c645f777b425..c546f0b1f3e0 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -14888,10 +14888,13 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu, if (ret) return ret; - /* Empty 'VMXON' state is permitted */ - if (kvm_state->size < sizeof(kvm_state) + sizeof(*vmcs12)) + /* Empty 'VMXON' state is permitted. A partial VMCS12 is not. 
*/ + if (kvm_state->size == sizeof(kvm_state)) return 0; + if (kvm_state->size < sizeof(kvm_state) + VMCS12_SIZE) + return -EINVAL; + if (kvm_state->vmx.vmcs_pa != -1ull) { if (kvm_state->vmx.vmcs_pa == kvm_state->vmx.vmxon_pa || !page_address_valid(vcpu, kvm_state->vmx.vmcs_pa)) @@ -14917,6 +14920,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu, } vmcs12 = get_vmcs12(vcpu); + BUILD_BUG_ON(sizeof(*vmcs12) > VMCS12_SIZE); if (copy_from_user(vmcs12, user_kvm_nested_state->data, sizeof(*vmcs12))) return -EFAULT; @@ -14932,7 +14936,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu, if (nested_cpu_has_shadow_vmcs(vmcs12) && vmcs12->vmcs_link_pointer != -1ull) { struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu); - if (kvm_state->size < sizeof(kvm_state) + 2 * sizeof(*vmcs12)) + if (kvm_state->size < sizeof(kvm_state) + 2 * VMCS12_SIZE) return -EINVAL; if (copy_from_user(shadow_vmcs12, Paolo ^ permalink raw reply related [flat|nested] 16+ messages in thread
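[Editor's note: the size checks in the patch above can be modeled outside the kernel. The sketch below is a hedged userspace rendering, not KVM code: KVM_STATE_HDR_SIZE is a hypothetical stand-in for sizeof(kvm_state) in the patch, and only the relationship between the sizes matters.]

```c
#include <stddef.h>

/* Hypothetical stand-ins: the fixed header size of the nested-state blob
 * and the full architectural VMCS12 region (VMCS12_SIZE is 4096 in KVM). */
#define KVM_STATE_HDR_SIZE 128
#define VMCS12_SIZE        4096

/* Mirrors the patched logic: an exactly-header-sized blob is the legal
 * "VMXON only, no VMCS12" case; anything between that and one full
 * VMCS12 is a partial VMCS12 and must be rejected, as must a partial
 * shadow VMCS12 when one is required. */
static int nested_state_size_ok(size_t size, int has_shadow_vmcs)
{
    if (size == KVM_STATE_HDR_SIZE)
        return 1;                       /* empty VMXON state: accept */
    if (size < KVM_STATE_HDR_SIZE + VMCS12_SIZE)
        return 0;                       /* partial vmcs12: -EINVAL */
    if (has_shadow_vmcs && size < KVM_STATE_HDR_SIZE + 2 * VMCS12_SIZE)
        return 0;                       /* partial shadow vmcs12 */
    return 1;
}
```

Under these checks, the reproducer's 1040-byte blob falls in the rejected "partial VMCS12" band instead of being accepted.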
* Re: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page 2018-11-07 12:47 ` Paolo Bonzini @ 2018-11-07 12:58 ` Liran Alon 2018-11-07 13:37 ` Paolo Bonzini 0 siblings, 1 reply; 16+ messages in thread From: Liran Alon @ 2018-11-07 12:58 UTC (permalink / raw) To: Paolo Bonzini Cc: Alexander Potapenko, syzbot+ded1696f6b50b615b630, kvm, LKML, rkrcmar, syzkaller-bugs > On 7 Nov 2018, at 14:47, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 07/11/2018 13:10, Alexander Potapenko wrote: >> This appears to be a real bug in KVM. >> Please see a simplified reproducer attached. > > Thanks, I agree it's a reael bug. The basic issue is that the > kvm_state->size member is too small (1040) in the KVM_SET_NESTED_STATE > ioctl, aka 0x4080aebf. > > One way to fix it would be to just change kmalloc to kzalloc when > allocating cached_vmcs12 and cached_shadow_vmcs12, but really the ioctl > is wrong and should be rejected. And the case where a shadow VMCS has > to be loaded is even more wrong, and we have to fix it anyway, so I > don't really like the idea of papering over the bug in the allocation. > > I'll test this patch and submit it formally: > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index c645f777b425..c546f0b1f3e0 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -14888,10 +14888,13 @@ static int vmx_set_nested_state(struct > kvm_vcpu *vcpu, > if (ret) > return ret; > > - /* Empty 'VMXON' state is permitted */ > - if (kvm_state->size < sizeof(kvm_state) + sizeof(*vmcs12)) > + /* Empty 'VMXON' state is permitted. A partial VMCS12 is not. */ > + if (kvm_state->size == sizeof(kvm_state)) > return 0; > > + if (kvm_state->size < sizeof(kvm_state) + VMCS12_SIZE) > + return -EINVAL; > + I don’t think that this test is sufficient to fully resolve issue. What if malicious userspace supplies valid size but pages containing nested_state->vmcs12 is unmapped? 
This will result in vmx_set_nested_state() still calling set_current_vmptr() but failing on copy_from_user() which still leaks cached_vmcs12 on next VMPTRLD of guest. Therefore, I think that the correct patch should be to change vmx_set_nested_state() to first gather all relevant information from userspace and validate it, and only then start applying it to KVM’s internal vCPU state. > if (kvm_state->vmx.vmcs_pa != -1ull) { > if (kvm_state->vmx.vmcs_pa == kvm_state->vmx.vmxon_pa || > !page_address_valid(vcpu, kvm_state->vmx.vmcs_pa)) > @@ -14917,6 +14920,7 @@ static int vmx_set_nested_state(struct kvm_vcpu > *vcpu, > } > > vmcs12 = get_vmcs12(vcpu); > + BUILD_BUG_ON(sizeof(*vmcs12) > VMCS12_SIZE); Why put this BUILD_BUG_ON() specifically here? There are many places which assumes cached_vmcs12 is of size VMCS12_SIZE. (Such as nested_release_vmcs12() and handle_vmptrld()). > if (copy_from_user(vmcs12, user_kvm_nested_state->data, sizeof(*vmcs12))) > return -EFAULT; > > @@ -14932,7 +14936,7 @@ static int vmx_set_nested_state(struct kvm_vcpu > *vcpu, > if (nested_cpu_has_shadow_vmcs(vmcs12) && > vmcs12->vmcs_link_pointer != -1ull) { > struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu); > - if (kvm_state->size < sizeof(kvm_state) + 2 * sizeof(*vmcs12)) > + if (kvm_state->size < sizeof(kvm_state) + 2 * VMCS12_SIZE) > return -EINVAL; > > if (copy_from_user(shadow_vmcs12, > > Paolo -Liran ^ permalink raw reply [flat|nested] 16+ messages in thread
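[Editor's note: the gather-validate-commit ordering Liran describes can be sketched as follows. This is an illustrative userspace model, not KVM code: the structs and the fetch_user() helper are hypothetical stand-ins, with fetch_user() playing the role of copy_from_user().]

```c
#include <string.h>

struct vcpu_state { int vmptr_set; unsigned char vmcs12[16]; };

/* Simulated copy_from_user(): fails when src is NULL ("unmapped page"). */
static int fetch_user(void *dst, const void *src, size_t n)
{
    if (!src)
        return -1;
    memcpy(dst, src, n);
    return 0;
}

static int set_nested_state(struct vcpu_state *v, const void *user_vmcs12)
{
    unsigned char tmp[16];

    /* Gather and validate everything before touching *v, so a fault
     * cannot leave the vCPU pointing at a half-initialized VMCS12. */
    if (fetch_user(tmp, user_vmcs12, sizeof(tmp)))
        return -1;            /* -EFAULT: *v is left unmodified */

    /* Commit phase: no failure paths below this point. */
    memcpy(v->vmcs12, tmp, sizeof(tmp));
    v->vmptr_set = 1;
    return 0;
}
```

The point is that a failing copy returns before any internal state (the analog of set_current_vmptr()) is updated.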
* Re: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page 2018-11-07 12:58 ` Liran Alon @ 2018-11-07 13:37 ` Paolo Bonzini 2019-01-14 23:47 ` [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 Tom Roeder 0 siblings, 1 reply; 16+ messages in thread From: Paolo Bonzini @ 2018-11-07 13:37 UTC (permalink / raw) To: Liran Alon Cc: Alexander Potapenko, syzbot+ded1696f6b50b615b630, kvm, LKML, rkrcmar, syzkaller-bugs On 07/11/2018 13:58, Liran Alon wrote: > > >> On 7 Nov 2018, at 14:47, Paolo Bonzini <pbonzini@redhat.com> wrote: >> >> On 07/11/2018 13:10, Alexander Potapenko wrote: >>> This appears to be a real bug in KVM. >>> Please see a simplified reproducer attached. >> >> Thanks, I agree it's a reael bug. The basic issue is that the >> kvm_state->size member is too small (1040) in the KVM_SET_NESTED_STATE >> ioctl, aka 0x4080aebf. >> >> One way to fix it would be to just change kmalloc to kzalloc when >> allocating cached_vmcs12 and cached_shadow_vmcs12, but really the ioctl >> is wrong and should be rejected. And the case where a shadow VMCS has >> to be loaded is even more wrong, and we have to fix it anyway, so I >> don't really like the idea of papering over the bug in the allocation. >> >> I'll test this patch and submit it formally: >> >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >> index c645f777b425..c546f0b1f3e0 100644 >> --- a/arch/x86/kvm/vmx.c >> +++ b/arch/x86/kvm/vmx.c >> @@ -14888,10 +14888,13 @@ static int vmx_set_nested_state(struct >> kvm_vcpu *vcpu, >> if (ret) >> return ret; >> >> - /* Empty 'VMXON' state is permitted */ >> - if (kvm_state->size < sizeof(kvm_state) + sizeof(*vmcs12)) >> + /* Empty 'VMXON' state is permitted. A partial VMCS12 is not. */ >> + if (kvm_state->size == sizeof(kvm_state)) >> return 0; >> >> + if (kvm_state->size < sizeof(kvm_state) + VMCS12_SIZE) >> + return -EINVAL; >> + > > I don’t think that this test is sufficient to fully resolve issue. 
> What if malicious userspace supplies valid size but pages containing nested_state->vmcs12 is unmapped? > This will result in vmx_set_nested_state() still calling set_current_vmptr() but failing on copy_from_user() > which still leaks cached_vmcs12 on next VMPTRLD of guest. Makes sense; since SET_NESTED_STATE is not a fast path, we can just memdup_user and pass a kernel pointer to vmx_set_nested_state. > Therefore, I think that the correct patch should be to change vmx_set_nested_state() to > first gather all relevant information from userspace and validate it, > and only then start applying it to KVM’s internal vCPU state. > >> if (kvm_state->vmx.vmcs_pa != -1ull) { >> if (kvm_state->vmx.vmcs_pa == kvm_state->vmx.vmxon_pa || >> !page_address_valid(vcpu, kvm_state->vmx.vmcs_pa)) >> @@ -14917,6 +14920,7 @@ static int vmx_set_nested_state(struct kvm_vcpu >> *vcpu, >> } >> >> vmcs12 = get_vmcs12(vcpu); >> + BUILD_BUG_ON(sizeof(*vmcs12) > VMCS12_SIZE); > > Why put this BUILD_BUG_ON() specifically here? > There are many places which assumes cached_vmcs12 is of size VMCS12_SIZE. > (Such as nested_release_vmcs12() and handle_vmptrld()). Unlike those places, here the copy has sizeof(*vmcs12) bytes and an overflow would cause a userspace write to kernel memory. Though, that means there is still a possibility of leaking kernel data when nested_release_vmcs12 is called. So it also makes sense to use VMCS12_SIZE for the memory copies, and kzalloc. 
Thanks, Paolo >> if (copy_from_user(vmcs12, user_kvm_nested_state->data, sizeof(*vmcs12))) >> return -EFAULT; >> >> @@ -14932,7 +14936,7 @@ static int vmx_set_nested_state(struct kvm_vcpu >> *vcpu, >> if (nested_cpu_has_shadow_vmcs(vmcs12) && >> vmcs12->vmcs_link_pointer != -1ull) { >> struct vmcs12 *shadow_vmcs12 = get_shadow_vmcs12(vcpu); >> - if (kvm_state->size < sizeof(kvm_state) + 2 * sizeof(*vmcs12)) >> + if (kvm_state->size < sizeof(kvm_state) + 2 * VMCS12_SIZE) >> return -EINVAL; >> >> if (copy_from_user(shadow_vmcs12, >> >> Paolo > > -Liran > > ^ permalink raw reply [flat|nested] 16+ messages in thread
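[Editor's note: the memdup_user() approach Paolo mentions bundles allocation and the userspace copy into one step, so the caller either gets a fully populated kernel buffer or an error. The sketch below is a hedged userspace analog; user_copy() is a hypothetical stand-in for copy_from_user(), and real kernel code would return ERR_PTR() values rather than NULL.]

```c
#include <stdlib.h>
#include <string.h>

/* Models copy_from_user(): faults when the source is NULL ("unmapped"). */
static int user_copy(void *dst, const void *src, size_t n)
{
    if (!src)
        return -1;
    memcpy(dst, src, n);
    return 0;
}

/* Userspace analog of the kernel's memdup_user(): allocate, copy, and
 * on any failure free the buffer so no partially filled object escapes. */
static void *memdup_user_sketch(const void *usrc, size_t len)
{
    void *p = malloc(len);

    if (!p)
        return NULL;          /* kernel: ERR_PTR(-ENOMEM) */
    if (user_copy(p, usrc, len)) {
        free(p);
        return NULL;          /* kernel: ERR_PTR(-EFAULT) */
    }
    return p;
}
```

Because the copy happens before the pointer is handed to the caller, a fault on an unmapped user page cannot leave stale heap contents reachable later.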
* [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 2018-11-07 13:37 ` Paolo Bonzini @ 2019-01-14 23:47 ` Tom Roeder 2019-01-15 0:03 ` Jim Mattson 2019-01-15 2:43 ` Sean Christopherson 0 siblings, 2 replies; 16+ messages in thread From: Tom Roeder @ 2019-01-14 23:47 UTC (permalink / raw) To: Paolo Bonzini, Radim Krčmář Cc: Liran Alon, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin, x86, kvm, linux-kernel, Tom Roeder, syzbot+ded1696f6b50b615b630 This changes the allocation of cached_vmcs12 to use kzalloc instead of kmalloc. This removes the information leak found by Syzkaller (see Reported-by) in this case and prevents similar leaks from happening based on cached_vmcs12. The email from Syzkaller led to a discussion about a patch in early November on the KVM list (I've made this a reply to that thread), but the current upstream kernel still has kmalloc instead of kzalloc for cached_vmcs12 and cached_shadow_vmcs12. This RFC proposes changing to kzalloc for defense in depth. 
Tested: rebuilt but not tested, since this is an RFC Reported-by: syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com Signed-off-by: Tom Roeder <tmroeder@google.com> --- arch/x86/kvm/vmx/nested.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index 2616bd2c7f2c7..ad46667042c7a 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -4140,11 +4140,11 @@ static int enter_vmx_operation(struct kvm_vcpu *vcpu) if (r < 0) goto out_vmcs02; - vmx->nested.cached_vmcs12 = kmalloc(VMCS12_SIZE, GFP_KERNEL); + vmx->nested.cached_vmcs12 = kzalloc(VMCS12_SIZE, GFP_KERNEL); if (!vmx->nested.cached_vmcs12) goto out_cached_vmcs12; - vmx->nested.cached_shadow_vmcs12 = kmalloc(VMCS12_SIZE, GFP_KERNEL); + vmx->nested.cached_shadow_vmcs12 = kzalloc(VMCS12_SIZE, GFP_KERNEL); if (!vmx->nested.cached_shadow_vmcs12) goto out_cached_shadow_vmcs12; -- 2.20.1.97.g81188d93c3-goog ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 2019-01-14 23:47 ` [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 Tom Roeder @ 2019-01-15 0:03 ` Jim Mattson 2019-01-15 2:43 ` Sean Christopherson 1 sibling, 0 replies; 16+ messages in thread From: Jim Mattson @ 2019-01-15 0:03 UTC (permalink / raw) To: Tom Roeder Cc: Paolo Bonzini, Radim Krčmář, Liran Alon, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin, the arch/x86 maintainers, kvm list, LKML, syzbot+ded1696f6b50b615b630 On Mon, Jan 14, 2019 at 3:48 PM Tom Roeder <tmroeder@google.com> wrote: > > This changes the allocation of cached_vmcs12 to use kzalloc instead of > kmalloc. This removes the information leak found by Syzkaller (see > Reported-by) in this case and prevents similar leaks from happening > based on cached_vmcs12. > > The email from Syszkaller led to a discussion about a patch in early > November on the KVM list (I've made this a reply to that thread), but > the current upstream kernel still has kmalloc instead of kzalloc for > cached_vmcs12 and cached_shadow_vmcs12. This RFC proposes changing to > kzalloc for defense in depth. > > Tested: rebuilt but not tested, since this is an RFC > > Reported-by: syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com > Signed-off-by: Tom Roeder <tmroeder@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 2019-01-14 23:47 ` [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 Tom Roeder 2019-01-15 0:03 ` Jim Mattson @ 2019-01-15 2:43 ` Sean Christopherson 2019-01-15 10:15 ` Paolo Bonzini 2019-01-15 17:51 ` Tom Roeder 1 sibling, 2 replies; 16+ messages in thread From: Sean Christopherson @ 2019-01-15 2:43 UTC (permalink / raw) To: Tom Roeder Cc: Paolo Bonzini, Radim Krčmář, Liran Alon, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin, x86, kvm, linux-kernel, syzbot+ded1696f6b50b615b630 On Mon, Jan 14, 2019 at 03:47:28PM -0800, Tom Roeder wrote: > This changes the allocation of cached_vmcs12 to use kzalloc instead of > kmalloc. This removes the information leak found by Syzkaller (see > Reported-by) in this case and prevents similar leaks from happening > based on cached_vmcs12. Is the leak specific to vmx_set_nested_state(), e.g. can we zero out the memory if copy_from_user() fails instead of taking the hit on every allocation? > The email from Syszkaller led to a discussion about a patch in early > November on the KVM list (I've made this a reply to that thread), but > the current upstream kernel still has kmalloc instead of kzalloc for > cached_vmcs12 and cached_shadow_vmcs12. This RFC proposes changing to > kzalloc for defense in depth. 
> > Tested: rebuilt but not tested, since this is an RFC > > Reported-by: syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com > Signed-off-by: Tom Roeder <tmroeder@google.com> > --- > arch/x86/kvm/vmx/nested.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c > index 2616bd2c7f2c7..ad46667042c7a 100644 > --- a/arch/x86/kvm/vmx/nested.c > +++ b/arch/x86/kvm/vmx/nested.c > @@ -4140,11 +4140,11 @@ static int enter_vmx_operation(struct kvm_vcpu *vcpu) > if (r < 0) > goto out_vmcs02; > > - vmx->nested.cached_vmcs12 = kmalloc(VMCS12_SIZE, GFP_KERNEL); > + vmx->nested.cached_vmcs12 = kzalloc(VMCS12_SIZE, GFP_KERNEL); > if (!vmx->nested.cached_vmcs12) > goto out_cached_vmcs12; Obviously not your code, but why do we allocate VMCS12_SIZE instead of sizeof(struct vmcs12)? I get why we require userspace to reserve the full 4k, but I don't understand why KVM needs to allocate the reserved bytes internally. > - vmx->nested.cached_shadow_vmcs12 = kmalloc(VMCS12_SIZE, GFP_KERNEL); > + vmx->nested.cached_shadow_vmcs12 = kzalloc(VMCS12_SIZE, GFP_KERNEL); > if (!vmx->nested.cached_shadow_vmcs12) > goto out_cached_shadow_vmcs12; > > -- > 2.20.1.97.g81188d93c3-goog > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 2019-01-15 2:43 ` Sean Christopherson @ 2019-01-15 10:15 ` Paolo Bonzini 2019-01-23 18:25 ` Tom Roeder 2019-01-15 17:51 ` Tom Roeder 1 sibling, 1 reply; 16+ messages in thread From: Paolo Bonzini @ 2019-01-15 10:15 UTC (permalink / raw) To: Sean Christopherson, Tom Roeder Cc: Radim Krčmář, Liran Alon, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin, x86, kvm, linux-kernel, syzbot+ded1696f6b50b615b630 On 15/01/19 03:43, Sean Christopherson wrote: >> - vmx->nested.cached_vmcs12 = kmalloc(VMCS12_SIZE, GFP_KERNEL); >> + vmx->nested.cached_vmcs12 = kzalloc(VMCS12_SIZE, GFP_KERNEL); >> if (!vmx->nested.cached_vmcs12) >> goto out_cached_vmcs12; > Obviously not your code, but why do we allocate VMCS12_SIZE instead of > sizeof(struct vmcs12)? I get why we require userspace to reserve the > full 4k, but I don't understand why KVM needs to allocate the reserved > bytes internally. It's just cleaner and shorter code to copy everything in and out, instead of having to explicitly zero the slack. Paolo ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 2019-01-15 10:15 ` Paolo Bonzini @ 2019-01-23 18:25 ` Tom Roeder 2019-01-24 1:17 ` Paolo Bonzini 0 siblings, 1 reply; 16+ messages in thread From: Tom Roeder @ 2019-01-23 18:25 UTC (permalink / raw) To: Paolo Bonzini Cc: Sean Christopherson, Radim Krčmář, Liran Alon, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin, x86, kvm, linux-kernel, syzbot+ded1696f6b50b615b630 On Tue, Jan 15, 2019 at 11:15:51AM +0100, Paolo Bonzini wrote: > On 15/01/19 03:43, Sean Christopherson wrote: > >> - vmx->nested.cached_vmcs12 = kmalloc(VMCS12_SIZE, GFP_KERNEL); > >> + vmx->nested.cached_vmcs12 = kzalloc(VMCS12_SIZE, GFP_KERNEL); > >> if (!vmx->nested.cached_vmcs12) > >> goto out_cached_vmcs12; > > Obviously not your code, but why do we allocate VMCS12_SIZE instead of > > sizeof(struct vmcs12)? I get why we require userspace to reserve the > > full 4k, but I don't understand why KVM needs to allocate the reserved > > bytes internally. > > It's just cleaner and shorter code to copy everything in and out, > instead of having to explicitly zero the slack. Could you please clarify? I don't see code that copies everything in and out, but it depends on what you mean by "everything". In the context of this email exchange, I assumed that "everything" was "all 4k (VMCS12_SIZE)". But it looks to me like the code doesn't copy 4k in and out, but rather only ever copies sizeof(struct vmcs12) in and out. The copy_from_user and copy_to_user cases in nested.c use sizeof(*vmcs12), which is sizeof(struct vmcs12). So maybe can switch to allocating sizeof(struct vmcs12). Is this correct, or is there some other reason to allocate the larger size? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 2019-01-23 18:25 ` Tom Roeder @ 2019-01-24 1:17 ` Paolo Bonzini 0 siblings, 0 replies; 16+ messages in thread From: Paolo Bonzini @ 2019-01-24 1:17 UTC (permalink / raw) To: Tom Roeder Cc: Sean Christopherson, Radim Krčmář, Liran Alon, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin, x86, kvm, linux-kernel, syzbot+ded1696f6b50b615b630 On 23/01/19 19:25, Tom Roeder wrote: > On Tue, Jan 15, 2019 at 11:15:51AM +0100, Paolo Bonzini wrote: >> On 15/01/19 03:43, Sean Christopherson wrote: >>>> - vmx->nested.cached_vmcs12 = kmalloc(VMCS12_SIZE, GFP_KERNEL); >>>> + vmx->nested.cached_vmcs12 = kzalloc(VMCS12_SIZE, GFP_KERNEL); >>>> if (!vmx->nested.cached_vmcs12) >>>> goto out_cached_vmcs12; >>> Obviously not your code, but why do we allocate VMCS12_SIZE instead of >>> sizeof(struct vmcs12)? I get why we require userspace to reserve the >>> full 4k, but I don't understand why KVM needs to allocate the reserved >>> bytes internally. >> >> It's just cleaner and shorter code to copy everything in and out, >> instead of having to explicitly zero the slack. > > Could you please clarify? I don't see code that copies everything in and > out, but it depends on what you mean by "everything". In the context of > this email exchange, I assumed that "everything" was "all 4k > (VMCS12_SIZE)". I was thinking of vmx_get_nested_state, but actually it only copies sizeof(*vmcs12). However, that is the place where we should copy 4k out of it, including the zeroes. Otherwise, our userspace clients (which don't know sizeof(*vmcs12)) could leak uninitialized data of their own. Paolo > But it looks to me like the code doesn't copy 4k in and out, but rather > only ever copies sizeof(struct vmcs12) in and out. The copy_from_user > and copy_to_user cases in nested.c use sizeof(*vmcs12), which is > sizeof(struct vmcs12). > > So maybe can switch to allocating sizeof(struct vmcs12). 
Is this > correct, or is there some other reason to allocate the larger size? > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 2019-01-15 2:43 ` Sean Christopherson 2019-01-15 10:15 ` Paolo Bonzini @ 2019-01-15 17:51 ` Tom Roeder 2019-01-23 18:33 ` Tom Roeder 1 sibling, 1 reply; 16+ messages in thread From: Tom Roeder @ 2019-01-15 17:51 UTC (permalink / raw) To: Sean Christopherson Cc: Paolo Bonzini, Radim Krčmář, Liran Alon, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin, x86, kvm, linux-kernel, syzbot+ded1696f6b50b615b630 On Mon, Jan 14, 2019 at 06:43:04PM -0800, Sean Christopherson wrote: > On Mon, Jan 14, 2019 at 03:47:28PM -0800, Tom Roeder wrote: > > This changes the allocation of cached_vmcs12 to use kzalloc instead of > > kmalloc. This removes the information leak found by Syzkaller (see > > Reported-by) in this case and prevents similar leaks from happening > > based on cached_vmcs12. > > Is the leak specific to vmx_set_nested_state(), e.g. can we zero out > the memory if copy_from_user() fails instead of taking the hit on every > allocation? I don't know if the leak is specific to vmx_set_nested_state. This question (and the rest of the thread from November) goes to the heart of what I wanted to get feedback about; hence the "RFC" part of the subject line. I'm new to the kernel, and I don't know all the idioms and expectations, so the follow analysis is an outsider's view of the issue at hand. What I see in this case is a field that is intended to be copied to the guest and is allocated initially with data from the kernel. I'm sure we could figure out all the current paths and error cases that we need to handle to make sure that this data never leaks. Future reviewers then also need to make sure that changes to the nested VMX code don't leak data from this field. Why not instead make sure that there isn't any data to leak in the first place? I understand that there's a cost to kzalloc vs. 
kmalloc, but I don't know what it is in practice; slab.c shows that the extra code for the __GFP_ZERO flag is a memset of 0 over the allocated memory. The allocation looks very infrequent for the two lines in this patch, since they occur in enter_vmx_operation. That sounds to me like the allocation only happens when the guest enables nested virtualization. Given the frequency of allocation and the relative security benefit of not having to worry about leaking the data, I'd advocate for changing it here. ^ permalink raw reply [flat|nested] 16+ messages in thread
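[Editor's note: the leak class and the kzalloc fix Tom describes can be demonstrated in userspace. The sketch below is illustrative only: calloc() plays the role of kzalloc()'s __GFP_ZERO, the buffer sizes are arbitrary, and the helpers are hypothetical rather than KVM code.]

```c
#include <stdlib.h>
#include <string.h>

#define CACHE_SIZE 4096   /* stand-in for VMCS12_SIZE */
#define FILLED     1024   /* stand-in for sizeof(struct vmcs12) */

/* Models the cached_vmcs12 buffer: allocated, then only partially
 * written. With zeroed=0 (kmalloc-like), the tail keeps whatever the
 * allocator left there; with zeroed=1 (kzalloc-like), it is all zeroes. */
static unsigned char *make_cache(int zeroed)
{
    unsigned char *buf = zeroed ? calloc(1, CACHE_SIZE)
                                : malloc(CACHE_SIZE);
    if (!buf)
        return NULL;
    memset(buf, 0xAB, FILLED);   /* only the first part is initialized */
    return buf;
}

/* If all CACHE_SIZE bytes are later copied out (as on the VMPTRLD
 * path), a zeroed tail is the property that prevents the infoleak. */
static int tail_is_zero(const unsigned char *buf)
{
    for (size_t i = FILLED; i < CACHE_SIZE; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}
```

With the kmalloc-like variant, the unwritten tail may contain stale heap data, which is exactly what KMSAN flagged being written back to the guest.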
* Re: [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 2019-01-15 17:51 ` Tom Roeder @ 2019-01-23 18:33 ` Tom Roeder 2019-01-24 1:18 ` Paolo Bonzini 0 siblings, 1 reply; 16+ messages in thread From: Tom Roeder @ 2019-01-23 18:33 UTC (permalink / raw) To: Sean Christopherson Cc: Paolo Bonzini, Radim Krčmář, Liran Alon, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin, x86, kvm, linux-kernel, syzbot+ded1696f6b50b615b630 On Tue, Jan 15, 2019 at 09:51:11AM -0800, Tom Roeder wrote: > On Mon, Jan 14, 2019 at 06:43:04PM -0800, Sean Christopherson wrote: > > On Mon, Jan 14, 2019 at 03:47:28PM -0800, Tom Roeder wrote: > > > This changes the allocation of cached_vmcs12 to use kzalloc instead of > > > kmalloc. This removes the information leak found by Syzkaller (see > > > Reported-by) in this case and prevents similar leaks from happening > > > based on cached_vmcs12. > > > > Is the leak specific to vmx_set_nested_state(), e.g. can we zero out > > the memory if copy_from_user() fails instead of taking the hit on every > > allocation? > > I don't know if the leak is specific to vmx_set_nested_state. I've looked at the code more now, and it looks to me like there might be other cases where the memory could leak. But I don't know enough of the flows to be sure. The enter_vmx_operation function is called in handle_vmon, and no data is copied from the guest immediately after that. So, it depends on what happens after VMXON. Even in vmx_set_nested_state, there are about 30 lines of code in between enter_vmx_operation and copy_from_user, and there are a couple of cases that cause vmx_set_nested_state to return with an error. So if we want to fix this by handling all the error paths, I think it might be cleanest to convert vmx_set_nested_state to use goto error handling, since that would allow us to clear the allocated memory in one place. What do you think? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 2019-01-23 18:33 ` Tom Roeder @ 2019-01-24 1:18 ` Paolo Bonzini 2019-01-24 21:46 ` Tom Roeder 0 siblings, 1 reply; 16+ messages in thread From: Paolo Bonzini @ 2019-01-24 1:18 UTC (permalink / raw) To: Tom Roeder, Sean Christopherson Cc: Radim Krčmář, Liran Alon, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin, x86, kvm, linux-kernel, syzbot+ded1696f6b50b615b630 On 23/01/19 19:33, Tom Roeder wrote: > > Even in vmx_set_nested_state, there are about 30 lines of code in > between enter_vmx_operation and copy_from_user, and there are a couple > of cases that cause vmx_set_nested_state to return with an error. So if > we want to fix this by handling all the error paths, I think it might be > cleanest to convert vmx_set_nested_state to use goto error handling, > since that would allow us to clear the allocated memory in one place. I prefer to have kzalloc, and change vmx_get_nested_state to copy the whole page including the trailing zeroes. Paolo ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 2019-01-24 1:18 ` Paolo Bonzini @ 2019-01-24 21:46 ` Tom Roeder 0 siblings, 0 replies; 16+ messages in thread From: Tom Roeder @ 2019-01-24 21:46 UTC (permalink / raw) To: Paolo Bonzini Cc: Sean Christopherson, Radim Krčmář, Liran Alon, Thomas Gleixner, Ingo Molnar, Borislav Petkov, H . Peter Anvin, x86, kvm, linux-kernel, syzbot+ded1696f6b50b615b630 On Thu, Jan 24, 2019 at 02:18:26AM +0100, Paolo Bonzini wrote: > On 23/01/19 19:33, Tom Roeder wrote: > > > > Even in vmx_set_nested_state, there are about 30 lines of code in > > between enter_vmx_operation and copy_from_user, and there are a couple > > of cases that cause vmx_set_nested_state to return with an error. So if > > we want to fix this by handling all the error paths, I think it might be > > cleanest to convert vmx_set_nested_state to use goto error handling, > > since that would allow us to clear the allocated memory in one place. > > I prefer to have kzalloc, and change vmx_get_nested_state to copy the > whole page including the trailing zeroes. OK, sending as a v2 patch in a new thread. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page 2018-11-07 12:10 ` Alexander Potapenko 2018-11-07 12:47 ` Paolo Bonzini @ 2018-11-07 12:52 ` Liran Alon 1 sibling, 0 replies; 16+ messages in thread From: Liran Alon @ 2018-11-07 12:52 UTC (permalink / raw) To: Alexander Potapenko Cc: syzbot+ded1696f6b50b615b630, kvm, LKML, pbonzini, rkrcmar, syzkaller-bugs > On 7 Nov 2018, at 14:10, Alexander Potapenko <glider@google.com> wrote: > > On Wed, Nov 7, 2018 at 2:38 AM syzbot > <syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com> wrote: >> >> Hello, >> >> syzbot found the following crash on: >> >> HEAD commit: 88b95ef4c780 kmsan: use MSan assembly instrumentation >> git tree: https://github.com/google/kmsan.git/master >> console output: https://syzkaller.appspot.com/x/log.txt?x=12505e33400000 >> kernel config: https://syzkaller.appspot.com/x/.config?x=8df5fc509a1b351b >> dashboard link: https://syzkaller.appspot.com/bug?extid=ded1696f6b50b615b630 >> compiler: clang version 8.0.0 (trunk 343298) >> syz
repro: https://syzkaller.appspot.com/x/repro.syz?x=15ce62f5400000 >> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=174efca3400000 >> >> IMPORTANT: if you fix the bug, please add the following tag to the commit: >> Reported-by: syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com >> >> ================================================================== >> BUG: KMSAN: kernel-infoleak in __copy_to_user include/linux/uaccess.h:121 >> [inline] >> BUG: KMSAN: kernel-infoleak in __kvm_write_guest_page >> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1849 [inline] >> BUG: KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page+0x39a/0x510 >> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1870 >> CPU: 0 PID: 7918 Comm: syz-executor542 Not tainted 4.19.0+ #77 >> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS >> Google 01/01/2011 >> Call Trace: >> __dump_stack lib/dump_stack.c:77 [inline] >> dump_stack+0x32d/0x480 lib/dump_stack.c:113 >> kmsan_report+0x1a2/0x2e0 mm/kmsan/kmsan.c:911 >> kmsan_internal_check_memory+0x34c/0x430 mm/kmsan/kmsan.c:991 >> kmsan_copy_to_user+0x85/0xe0 mm/kmsan/kmsan_hooks.c:552 >> __copy_to_user include/linux/uaccess.h:121 [inline] >> __kvm_write_guest_page arch/x86/kvm/../../../virt/kvm/kvm_main.c:1849 >> [inline] >> kvm_vcpu_write_guest_page+0x39a/0x510 >> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1870 >> nested_release_vmcs12 arch/x86/kvm/vmx.c:8441 [inline] >> handle_vmptrld+0x2384/0x26b0 arch/x86/kvm/vmx.c:8907 >> vmx_handle_exit+0x1e81/0xbac0
arch/x86/kvm/vmx.c:10128 >> vcpu_enter_guest arch/x86/kvm/x86.c:7667 [inline] >> vcpu_run arch/x86/kvm/x86.c:7730 [inline] >> kvm_arch_vcpu_ioctl_run+0xac32/0x11d80 arch/x86/kvm/x86.c:7930 >> kvm_vcpu_ioctl+0xfb1/0x1f90 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2590 >> do_vfs_ioctl+0xf77/0x2d30 fs/ioctl.c:46 >> ksys_ioctl fs/ioctl.c:702 [inline] >> __do_sys_ioctl fs/ioctl.c:709 [inline] >> __se_sys_ioctl+0x1da/0x270 fs/ioctl.c:707 >> __x64_sys_ioctl+0x4a/0x70 fs/ioctl.c:707 >> do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291 >> entry_SYSCALL_64_after_hwframe+0x63/0xe7 >> RIP: 0033:0x44b6e9 >> Code: e8 dc e6 ff ff 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 >> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff >> ff 0f 83 2b ff fb ff c3 66 2e 0f 1f 84 00 00 00 00 >> RSP: 002b:00007f096b292ce8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 >> RAX: ffffffffffffffda RBX: 00000000006e3c48 RCX: 000000000044b6e9 >> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005 >> RBP: 00000000006e3c40 R08: 0000000000000000 R09: 0000000000000000 >> R10: 0000000000000000 R11: 0000000000000206 R12: 00000000006e3c4c >> R13: 00007ffd978aeb2f R14: 00007f096b2939c0 R15: 00000000006e3d4c >> >> Uninit was created at: >> kmsan_save_stack_with_flags mm/kmsan/kmsan.c:252 [inline] >> kmsan_internal_poison_shadow+0xc8/0x1e0 mm/kmsan/kmsan.c:177 >> kmsan_kmalloc+0x98/0x110 mm/kmsan/kmsan_hooks.c:104 >> __kmalloc+0x14c/0x4d0 mm/slub.c:3789 >> kmalloc include/linux/slab.h:518 [inline] >> enter_vmx_operation+0x980/0x1a90 arch/x86/kvm/vmx.c:8278 >> vmx_set_nested_state+0xc3a/0x1530 arch/x86/kvm/vmx.c:14045 >> kvm_arch_vcpu_ioctl+0x4fc9/0x73a0 arch/x86/kvm/x86.c:4057 >> kvm_vcpu_ioctl+0xca3/0x1f90 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2742 >> do_vfs_ioctl+0xf77/0x2d30 fs/ioctl.c:46 >> ksys_ioctl fs/ioctl.c:702 [inline] >> __do_sys_ioctl fs/ioctl.c:709 [inline] >> __se_sys_ioctl+0x1da/0x270 fs/ioctl.c:707 >> __x64_sys_ioctl+0x4a/0x70 fs/ioctl.c:707 >> 
do_syscall_64+0xcf/0x110 arch/x86/entry/common.c:291 >> entry_SYSCALL_64_after_hwframe+0x63/0xe7 >> >> Bytes 1000-4095 of 4096 are uninitialized >> Memory access of size 4096 starts at ffff8801b5157000 >> ================================================================== > This appears to be a real bug in KVM. > Please see a simplified reproducer attached. If I understand the above trace correctly: 1) When the guest enters VMX operation, cached_vmcs12 is allocated in enter_vmx_operation() by kmalloc(). 2) When the guest executes VMPTRLD, handle_vmptrld() first releases the current vmcs12 via nested_release_vmcs12() and only then copies the newly loaded vmcs12 into cached_vmcs12 by memcpy(). 3) The theoretical bug is that nested_release_vmcs12() copies the entire cached_vmcs12 to guest memory, which is uninitialised if this is the very first VMPTRLD executed by the guest. But this case should not happen, because nested_release_vmcs12() starts by checking whether current_vmptr equals -1, and vmx_create_vcpu() sets current_vmptr to -1ull. However, one case where (3) can happen is via vmx_set_nested_state(). If kvm_state->vmx.vmcs_pa is valid, set_current_vmptr() is called, which sets current_vmptr. However, the copy_from_user() that fills cached_vmcs12 can then fail, leaving us with current_vmptr != -1 but an uninitialised cached_vmcs12. The next VMPTRLD will therefore indeed leak the uninitialised content of cached_vmcs12 into the guest. What I described seems to be a bug to me (correct me if I’m wrong), but it doesn’t seem to be the same issue described here, as we can see that the attached reproducer doesn’t use the KVM_SET_NESTED_STATE ioctl. -Liran ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2019-01-24 21:46 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-11-07 1:38 KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page syzbot 2018-11-07 12:10 ` Alexander Potapenko 2018-11-07 12:47 ` Paolo Bonzini 2018-11-07 12:58 ` Liran Alon 2018-11-07 13:37 ` Paolo Bonzini 2019-01-14 23:47 ` [RFC PATCH] kvm: x86/vmx: Use kzalloc for cached_vmcs12 Tom Roeder 2019-01-15 0:03 ` Jim Mattson 2019-01-15 2:43 ` Sean Christopherson 2019-01-15 10:15 ` Paolo Bonzini 2019-01-23 18:25 ` Tom Roeder 2019-01-24 1:17 ` Paolo Bonzini 2019-01-15 17:51 ` Tom Roeder 2019-01-23 18:33 ` Tom Roeder 2019-01-24 1:18 ` Paolo Bonzini 2019-01-24 21:46 ` Tom Roeder 2018-11-07 12:52 ` KMSAN: kernel-infoleak in kvm_vcpu_write_guest_page Liran Alon