kvm.vger.kernel.org archive mirror
* [Bug 103141] New: Host-triggerable NULL pointer oops
From: bugzilla-daemon @ 2015-08-19 16:42 UTC
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=103141

            Bug ID: 103141
           Summary: Host-triggerable NULL pointer oops
           Product: Virtualization
           Version: unspecified
    Kernel Version: 4.1.5
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: kvm
          Assignee: virtualization_kvm@kernel-bugs.osdl.org
          Reporter: felix.von.s@posteo.de
        Regression: No

Created attachment 185241
  --> https://bugzilla.kernel.org/attachment.cgi?id=185241&action=edit
Test program (C99)

Amusingly enough, I found this while trying to come up with a minimal test
program for #103131.

Running ioctl(KVM_CREATE_VCPU) _after_ ioctl(KVM_SET_USER_MEMORY_REGION) with
certain address/size combinations may generate a null pointer dereference.
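
For readers without access to the attachment, the call ordering described
above can be sketched roughly as follows. This is not the attached test
program: the helper name region_then_vcpu and its return codes are made up for
illustration, and the address/size values that trigger the oops are left to
the caller.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the attachment): registers one memory slot
 * and only afterwards creates vCPU 0.
 * Returns 0 on success, -1 if /dev/kvm cannot be opened, -2 if the VM cannot
 * be created, -3 if the backing mmap fails, -4 if
 * KVM_SET_USER_MEMORY_REGION fails, -5 if KVM_CREATE_VCPU fails.
 */
int region_then_vcpu(uint64_t guest_phys_addr, uint64_t memory_size)
{
    int ret = 0;
    int kvm_fd = open("/dev/kvm", O_RDWR);
    if (kvm_fd < 0)
        return -1;

    int vm_fd = ioctl(kvm_fd, KVM_CREATE_VM, 0);
    if (vm_fd < 0) {
        close(kvm_fd);
        return -2;
    }

    /* Anonymous host memory that backs the guest-physical range. */
    void *mem = mmap(NULL, memory_size, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        close(vm_fd);
        close(kvm_fd);
        return -3;
    }

    struct kvm_userspace_memory_region umr;
    memset(&umr, 0, sizeof(umr));
    umr.slot = 0;
    umr.guest_phys_addr = guest_phys_addr;
    umr.memory_size = memory_size;
    umr.userspace_addr = (uint64_t)(uintptr_t)mem;

    if (ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &umr) < 0) {
        ret = -4;
    } else {
        /* The report is about creating the vCPU after the region is set. */
        int vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0);
        if (vcpu_fd < 0)
            ret = -5;
        else
            close(vcpu_fd);
    }

    close(vm_fd);
    close(kvm_fd);
    munmap(mem, memory_size);
    return ret;
}
```

Calling region_then_vcpu(0x100000, 0x1000) exercises the same ordering with a
harmless one-page region; on an affected kernel, the combinations from the
attachment should only be tried on a machine that is allowed to oops.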

dmesg after running the test program:

[11557.519426] BUG: unable to handle kernel NULL pointer dereference at 000000000000005f
[11557.520561] IP: [<ffffffffa045b2f5>] vmx_fpu_activate+0x5/0x20 [kvm_intel]
[11557.521716] PGD 13841a067 PUD 13857c067 PMD 0
[11557.522891] Oops: 0000 [#25] PREEMPT SMP
[11557.524073] Modules linked in: [REDACTED]
[11557.534572] CPU: 5 PID: 4295 Comm: tcc Tainted: P      D    O    4.1.5-1-ARCH #1
[11557.536451] Hardware name: [REDACTED]
[11557.538361] task: ffff880068425180 ti: ffff880138784000 task.ti: ffff880138784000
[11557.540331] RIP: 0010:[<ffffffffa045b2f5>]  [<ffffffffa045b2f5>] vmx_fpu_activate+0x5/0x20 [kvm_intel]
[11557.542367] RSP: 0018:ffff880138787da0  EFLAGS: 00010292
[11557.544411] RAX: ffffffffa0476160 RBX: ffffffffffffffef RCX: 0000000000000000
[11557.546476] RDX: 0000000000001f85 RSI: ffff88014b15e8b0 RDI: ffffffffffffffef
[11557.548553] RBP: ffff880138787db8 R08: 000000000001e8b0 R09: ffffffffa045cbf3
[11557.550605] R10: ffffea00027eee00 R11: ffff88014b157348 R12: 0000000000000000
[11557.552637] R13: 0000000000000000 R14: 000000000000ae41 R15: 0000000000000000
[11557.554691] FS:  00007fba3936d700(0000) GS:ffff88014b140000(0000) knlGS:0000000000000000
[11557.556796] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11557.558914] CR2: 000000000000005f CR3: 000000013857d000 CR4: 00000000000426e0
[11557.561092] Stack:
[11557.563213]  ffffffffa03deaf1 0000000000000000 ffff8800a52fc000 ffff880138787e78
[11557.565412]  ffffffffa03ca6d8 ffff880138787de8 ffffffff81175b5b ffff88011edffb80
[11557.567650]  0000000000000000 00000000fffbc000 0000000000044000 00007fba39371000
[11557.569906] Call Trace:
[11557.572169]  [<ffffffffa03deaf1>] ? kvm_arch_vcpu_create+0x51/0x70 [kvm]
[11557.574476]  [<ffffffffa03ca6d8>] kvm_vm_ioctl+0x1c8/0x7a0 [kvm]
[11557.576773]  [<ffffffff81175b5b>] ? lru_cache_add_active_or_unevictable+0x2b/0xb0
[11557.579118]  [<ffffffff811f4646>] do_vfs_ioctl+0x2c6/0x4d0
[11557.581470]  [<ffffffff811f48d1>] SyS_ioctl+0x81/0xa0
[11557.583841]  [<ffffffff8158bf2e>] system_call_fastpath+0x12/0x71
[11557.586265] Code: 00 e8 20 bf ff ff 5b 41 5c 5d c3 0f 1f 00 48 8b 05 31 85 fc ff ff 90 b8 00 00 00 eb 87 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 <8b> 47 70 85 c0 75 0a 55 48 89 e5 e8 3b ff ff ff 5d f3 c3 0f 1f
[11557.592112] RIP  [<ffffffffa045b2f5>] vmx_fpu_activate+0x5/0x20 [kvm_intel]
[11557.594990]  RSP <ffff880138787da0>
[11557.597859] CR2: 000000000000005f
[11557.600786] ---[ end trace b28b93d27b3449c9 ]---

When I move ioctl(KVM_CREATE_VCPU) immediately below ioctl(KVM_CREATE_VM) there
is no oops, but a later KVM_RUN exits with KVM_EXIT_INTERNAL_ERROR, subcode
KVM_INTERNAL_ERROR_EMULATION. The crashes also stop when I decrease
umr.memory_size below what I specified in the attached test program.

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 103141] Host-triggerable NULL pointer oops
From: bugzilla-daemon @ 2015-08-19 22:48 UTC
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=103141

Wanpeng Li <wanpeng.li@hotmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wanpeng.li@hotmail.com

--- Comment #1 from Wanpeng Li <wanpeng.li@hotmail.com> ---
The below commit can fix it.

commit 370777daab3f024f1645177039955088e2e9ae73
Author: Radim Krčmář <rkrcmar@redhat.com>
Date:   Fri Jul 3 15:49:28 2015 +0200

    KVM: VMX: fix vmwrite to invalid VMCS

    fpu_activate is called outside of vcpu_load(), which means it should not
    touch VMCS, but fpu_activate needs to.  Avoid the call by moving it to a
    point where we know that the guest needs eager FPU and VMCS is loaded.

    This will get rid of the following trace

     vmwrite error: reg 6800 value 0 (err 1)
      [<ffffffff8162035b>] dump_stack+0x19/0x1b
      [<ffffffffa046c701>] vmwrite_error+0x2c/0x2e [kvm_intel]
      [<ffffffffa045f26f>] vmcs_writel+0x1f/0x30 [kvm_intel]
      [<ffffffffa04617e5>] vmx_fpu_activate.part.61+0x45/0xb0 [kvm_intel]
      [<ffffffffa0461865>] vmx_fpu_activate+0x15/0x20 [kvm_intel]
      [<ffffffffa0560b91>] kvm_arch_vcpu_create+0x51/0x70 [kvm]
      [<ffffffffa0548011>] kvm_vm_ioctl+0x1c1/0x760 [kvm]
      [<ffffffff8118b55a>] ? handle_mm_fault+0x49a/0xec0
      [<ffffffff811e47d5>] do_vfs_ioctl+0x2e5/0x4c0
      [<ffffffff8127abbe>] ? file_has_perm+0xae/0xc0
      [<ffffffff811e4a51>] SyS_ioctl+0xa1/0xc0
      [<ffffffff81630949>] system_call_fastpath+0x16/0x1b

    (Note: we also unconditionally activate FPU in vmx_vcpu_reset(), so the
     removed code added nothing.)

    Fixes: c447e76b4cab ("kvm/fpu: Enable eager restore kvm FPU for MPX")
    Cc: <stable@vger.kernel.org>
    Reported-by: Vlastimil Holer <vlastimil.holer@gmail.com>
    Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


* [Bug 103141] Host-triggerable NULL pointer oops
From: bugzilla-daemon @ 2015-08-24 15:46 UTC
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=103141

--- Comment #2 from felix <felix.von.s@posteo.de> ---
Created attachment 185681
  --> https://bugzilla.kernel.org/attachment.cgi?id=185681&action=edit
Test program 2 (C99)

You mean "can" as in "I think it does" or "it did for me"?

And anyway, it seems to fix only the most proximate cause of the crash. My
biggest worry is that KVM_SET_USER_MEMORY_REGION ioctls with guest_phys_addr
around the 0xfff00000 to 0xffff0000 range seem not to "register"; starting the
VM behaves as if the region wasn't placed there.

I attach test program 2. Running that on my system with 0x44000 as an argument
outputs "halted" (as expected), but 0x45000 and larger multiples of 0x1000 give
"internal error, subcode 1".
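
The two outcomes quoted above line up with KVM_RUN exit reasons. A minimal
sketch of how such output could be produced is given here; it is not taken from
the attachment. It assumes "halted" corresponds to KVM_EXIT_HLT, and that
"subcode 1" is the suberror field reported with KVM_EXIT_INTERNAL_ERROR
(KVM_INTERNAL_ERROR_EMULATION, as named in the original report). The function
name describe_exit is hypothetical.

```c
#include <linux/kvm.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: turns the exit information that KVM_RUN leaves in the
 * shared struct kvm_run into a short message written to buf.
 */
const char *describe_exit(const struct kvm_run *run, char *buf, size_t len)
{
    switch (run->exit_reason) {
    case KVM_EXIT_HLT:
        /* Assumed to be the "halted" case from the comment above. */
        snprintf(buf, len, "halted");
        break;
    case KVM_EXIT_INTERNAL_ERROR:
        /* suberror carries the KVM_INTERNAL_ERROR_* subcode. */
        snprintf(buf, len, "internal error, subcode %u",
                 (unsigned)run->internal.suberror);
        break;
    default:
        snprintf(buf, len, "unexpected exit reason %u",
                 (unsigned)run->exit_reason);
        break;
    }
    return buf;
}
```

In a real program, run would be the structure mmap()ed from the vCPU file
descriptor and inspected after each ioctl(vcpu_fd, KVM_RUN, 0) returns.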


* [Bug 103141] Host-triggerable NULL pointer oops
From: bugzilla-daemon @ 2015-08-24 15:57 UTC
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=103141

felix <felix.von.s@posteo.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #185681|0                           |1
        is obsolete|                            |

--- Comment #3 from felix <felix.von.s@posteo.de> ---
Created attachment 185691
  --> https://bugzilla.kernel.org/attachment.cgi?id=185691&action=edit
Test program 2 (C99) [non-oopsing version]


* [Bug 103141] Host-triggerable NULL pointer oops
From: bugzilla-daemon @ 2019-07-14 18:09 UTC
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=103141

Alex Lyakas (alex@zadara.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alex@zadara.com

--- Comment #4 from Alex Lyakas (alex@zadara.com) ---
We hit the same issue with kernel 3.18.19.

After some debugging, I see that the first test program felix attached causes
kvm_x86_ops->vcpu_create to return -EEXIST (as an error pointer) instead of a
valid vcpu pointer. The subsequent call to kvm_x86_ops->fpu_activate then
dereferences that invalid pointer, which the kernel reports as a NULL pointer
dereference.
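
The numbers in the original oops are consistent with this. As a small worked
check (a sketch, with helper names invented here): RDI holds ffffffffffffffef,
which is -EEXIST (-17) stored in a pointer, and the faulting bytes "8b 47 70"
decode to a 32-bit load from 0x70(%rdi), so the accessed address wraps around
to 0x5f, the value shown in CR2.

```c
#include <errno.h>
#include <stdint.h>

/* Displacement of the faulting load, read off the "8b 47 70" opcode bytes. */
#define FAULTING_LOAD_OFFSET 0x70u

/* Bit pattern of an errno value stored in a 64-bit pointer (like ERR_PTR). */
uint64_t err_ptr_bits(long err)
{
    return (uint64_t)(int64_t)err;
}

/* Address touched when a field at `offset` is read through that pointer. */
uint64_t fault_address(long err, uint64_t offset)
{
    /* Unsigned addition wraps modulo 2^64, as address arithmetic does. */
    return err_ptr_bits(err) + offset;
}
```

With these helpers, err_ptr_bits(-EEXIST) gives 0xffffffffffffffef and
fault_address(-EEXIST, FAULTING_LOAD_OFFSET) gives 0x5f. Because that address
falls in the first page, the kernel labels the fault a NULL pointer
dereference even though the pointer itself was not NULL.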

The suggested fix was delivered in kernel 4.2. Although it was tagged for
"stable", I don't see that it was backported to earlier kernels. As I
understand it, that fix targets a different issue, in which the vcpu pointer is
valid but a later VMCS write fails. Of course, it also addresses the issue
felix reported, but for the latter a simpler fix would suffice:

--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7012,20 +7012,24 @@ struct kvm_vcpu *kvm_arch_vcpu_create(struct kvm *kvm,
                                                unsigned int id)
 {
        struct kvm_vcpu *vcpu;

        if (check_tsc_unstable() && atomic_read(&kvm->online_vcpus) != 0)
                printk_once(KERN_WARNING
                "kvm: SMP vm created on host with unstable TSC; "
                "guest TSC will not be reliable\n");

        vcpu = kvm_x86_ops->vcpu_create(kvm, id);
+       if (IS_ERR(vcpu)) {
+               pr_err("kvm_x86_ops->vcpu_create id=%u err=%ld\n", id, PTR_ERR(vcpu));
+               return vcpu;
+       }

        /*
         * Activate fpu unconditionally in case the guest needs eager FPU.  It will be
         * deactivated soon if it doesn't.
         */
        kvm_x86_ops->fpu_activate(vcpu);
        return vcpu;
 }
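
The guard added by this patch relies on the kernel's error-pointer convention.
The following userspace sketch imitates it for illustration only: ERR_PTR,
PTR_ERR and IS_ERR are simplified stand-ins modelled on the kernel helpers of
the same names, while demo_vcpu, backend_create and arch_vcpu_create are
hypothetical and do not exist in KVM.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
    /* The top MAX_ERRNO addresses are reserved for encoded errno values. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for a vcpu; only one field is needed here. */
struct demo_vcpu { int fpu_active; };

static struct demo_vcpu the_vcpu;

/* Plays the role of vcpu_create: may hand back an encoded error. */
static struct demo_vcpu *backend_create(int simulate_failure)
{
    if (simulate_failure)
        return ERR_PTR(-EEXIST);
    the_vcpu.fpu_active = 0;
    return &the_vcpu;
}

/* Mirrors the patched control flow: check before touching the object. */
struct demo_vcpu *arch_vcpu_create(int simulate_failure)
{
    struct demo_vcpu *vcpu = backend_create(simulate_failure);

    if (IS_ERR(vcpu))
        return vcpu;          /* propagate the error, no dereference */

    vcpu->fpu_active = 1;     /* safe: vcpu is a real object here */
    return vcpu;
}
```

A caller would test the result with IS_ERR() and read the errno back with
PTR_ERR(); without the early return, the assignment to fpu_active would go
through the encoded error value, which is the failure mode described above.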
