kvm.vger.kernel.org archive mirror
* [Bug 203845] New: Can't run qemu/kvm on 5.0.0 kernel (NULL pointer dereference)
From: bugzilla-daemon @ 2019-06-07 15:57 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=203845

            Bug ID: 203845
           Summary: Can't run qemu/kvm on 5.0.0 kernel (NULL pointer
                    dereference)
           Product: Virtualization
           Version: unspecified
    Kernel Version: 5.0.0
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: kvm
          Assignee: virtualization_kvm@kernel-bugs.osdl.org
          Reporter: jpalecek@web.de
        Regression: No

Hello,

I can't start a Linux system (for autopkgtest testing) using KVM on an AMD
system. The qemu process is killed right after ioctl(..., KVM_RUN). In dmesg,
I see:

[54998.896817] BUG: unable to handle kernel NULL pointer dereference at 00000000
[54998.896823] #PF error: [WRITE]
[54998.896826] *pdpt = 0000000011b11001 *pde = 0000000000000000 
[54998.896831] Oops: 0002 [#9] SMP NOPTI
[54998.896836] CPU: 0 PID: 5289 Comm: qemu-system-i38 Tainted: P      D    OE     5.0.0-trunk-686-pae #1 Debian 5.0.2-1~exp1
[54998.896839] Hardware name: System manufacturer System Product Name/M4N68T-M, BIOS 1301    07/05/2011
[54998.896864] EIP: kvm_mmu_load+0xbc/0x4b0 [kvm]
[54998.896868] Code: 81 c1 00 00 00 40 c6 00 00 0f 1f 00 8b 87 60 02 00 00 8b
55 e8 83 c9 01 81 c3 00 00 04 00 83 d6 00 83 c4 10 8b 80 88 00 00 00 <89> 0c 10
c7 44 10 04 00 00 00 00 83 c2 08 89 55 e8 83 fa 20 75 89
[54998.896871] EAX: 00000000 EBX: 00040000 ECX: 0dc1d001 EDX: 00000000
[54998.896874] ESI: 00000000 EDI: cdf58000 EBP: cf811de0 ESP: cf811dbc
[54998.896876] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210282
[54998.896878] CR0: 80050033 CR2: 00000000 CR3: 22de02e0 CR4: 000006f0
[54998.896880] Call Trace:
[54998.896900]  ? kvm_ioapic_scan_entry+0x62/0xe0 [kvm]
[54998.896919]  kvm_arch_vcpu_ioctl_run+0x105f/0x1a20 [kvm]
[54998.896925]  ? _cond_resched+0x17/0x30
[54998.896943]  kvm_vcpu_ioctl+0x214/0x590 [kvm]
[54998.896959]  ? kvm_vcpu_ioctl+0x214/0x590 [kvm]
[54998.896964]  ? do_futex+0xae/0xa70
[54998.896969]  ? __fpu__restore_sig+0x265/0x500
[54998.896985]  ? __bpf_trace_kvm_async_pf_nopresent_ready+0x20/0x20 [kvm]
[54998.896988]  do_vfs_ioctl+0x9a/0x6c0
[54998.896993]  ? __audit_syscall_entry+0xb4/0xf0
[54998.896997]  ? syscall_trace_enter+0x1da/0x240
[54998.897000]  ksys_ioctl+0x56/0x80
[54998.897002]  sys_ioctl+0x16/0x20
[54998.897005]  do_fast_syscall_32+0x81/0x184
[54998.897009]  entry_SYSENTER_32+0x6b/0xbe
[54998.897011] EIP: 0xb7f49881
[54998.897014] Code: 8b 98 58 cd ff ff 89 c8 85 d2 74 02 89 0a 5b 5d c3 8b 04
24 c3 8b 14 24 c3 8b 34 24 c3 8b 3c 24 c3 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59
c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76
[54998.897016] EAX: ffffffda EBX: 0000000e ECX: 0000ae80 EDX: 00000000
[54998.897018] ESI: 0261b080 EDI: 00000000 EBP: b51cf000 ESP: b30fdc98
[54998.897021] DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b EFLAGS: 00200292
[54998.897024] Modules linked in: cfg80211 nfnetlink_queue nfnetlink_log
nfnetlink bluetooth drbg ansi_cprng ecdh_generic rfkill snd_hrtimer
snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device
cpufreq_powersave cpufreq_userspace cpufreq_conservative binfmt_misc
nvidia_drm(POE) drm_kms_helper drm nvidia_modeset(POE) nvidia(POE) fuse
snd_hda_codec_via nls_iso8859_2 nls_cp437 vfat snd_hda_codec_generic
snd_hda_codec_hdmi ledtrig_audio fat snd_hda_intel edac_mce_amd snd_hda_codec
kvm_amd snd_hda_core snd_hwdep kvm snd_pcm_oss sr_mod snd_mixer_oss snd_pcm
cdrom snd_timer sg irqbypass snd k10temp soundcore pcspkr asus_atk0110 ohci_pci
ohci_hcd pcc_cpufreq sata_nv ehci_pci forcedeth acpi_cpufreq i2c_nforce2
ehci_hcd button ipmi_devintf ipmi_msghandler usblp usbcore usb_common
parport_pc ppdev lp parport ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2
crc32c_generic fscrypto ecb crypto_simd cryptd aes_i586 sd_mod pata_amd
ata_generic libata psmouse evdev serio_raw scsi_mod
[54998.897072] CR2: 0000000000000000
[54998.897075] ---[ end trace a515b8f5e69d047e ]---
[54998.897093] EIP: kvm_mmu_load+0xbc/0x4b0 [kvm]
[54998.897096] Code: 81 c1 00 00 00 40 c6 00 00 0f 1f 00 8b 87 60 02 00 00 8b
55 e8 83 c9 01 81 c3 00 00 04 00 83 d6 00 83 c4 10 8b 80 88 00 00 00 <89> 0c 10
c7 44 10 04 00 00 00 00 83 c2 08 89 55 e8 83 fa 20 75 89
[54998.897098] EAX: 00000000 EBX: 00040000 ECX: 0facd001 EDX: 00000000
[54998.897100] ESI: 00000000 EDI: cd06b0c0 EBP: cd227de0 ESP: d4a59dfc
[54998.897102] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210282
[54998.897104] CR0: 80050033 CR2: 00000000 CR3: 22de02e0 CR4: 000006f0

debian:~# uname -a
Linux debian 5.0.0-trunk-686-pae #1 SMP Debian 5.0.2-1~exp1 (2019-03-18) i686 GNU/Linux

The processor of the host is AMD Athlon(tm) II X2 245

This particular VM worked previously, but I will have to look into whether it
is the kernel or maybe qemu that broke it. However, a null pointer dereference
should still not happen IMHO.
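
As a reading aid for the oops above, here is a minimal sketch of how the
page-fault error code in the "Oops: 0002" line breaks down. The bit layout is
the architectural x86 one; the names pf_decode, struct pf_info and
reported_oops_is_kernel_null_write are made up for this illustration and are
not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 page-fault error code bits. */
enum {
    PF_PRESENT = 1u << 0, /* 0: page not present, 1: protection violation */
    PF_WRITE   = 1u << 1, /* 0: read access, 1: write access */
    PF_USER    = 1u << 2, /* 0: supervisor-mode access, 1: user-mode access */
    PF_RSVD    = 1u << 3, /* reserved bit set in a paging-structure entry */
    PF_INSTR   = 1u << 4  /* fault caused by an instruction fetch */
};

/* Hypothetical helper type: one flag per error-code bit. */
struct pf_info {
    bool present, write, user, rsvd, instr;
};

static struct pf_info pf_decode(uint32_t error_code)
{
    struct pf_info pf = {
        .present = (error_code & PF_PRESENT) != 0,
        .write   = (error_code & PF_WRITE) != 0,
        .user    = (error_code & PF_USER) != 0,
        .rsvd    = (error_code & PF_RSVD) != 0,
        .instr   = (error_code & PF_INSTR) != 0,
    };
    return pf;
}

/* Values copied from the report: "Oops: 0002" and "CR2: 00000000". */
static bool reported_oops_is_kernel_null_write(void)
{
    const uint32_t error_code = 0x0002;
    const uint32_t cr2 = 0x00000000;
    struct pf_info pf = pf_decode(error_code);

    assert(!pf.rsvd && !pf.instr); /* only the write bit is set */
    return pf.write && !pf.present && !pf.user && cr2 == 0;
}
```

Read this way, 0002 is a kernel-mode write to a page that is not present, and
CR2 gives the faulting address as 0, which matches the "#PF error: [WRITE]"
line and the NULL pointer dereference message.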

-- 
You are receiving this mail because:
You are watching the assignee of the bug.


* [Bug 203845] Can't run qemu/kvm on 5.0.0 kernel (NULL pointer dereference)
From: bugzilla-daemon @ 2019-06-07 18:46 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=203845

Sean Christopherson (sean.j.christopherson@intel.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sean.j.christopherson@intel
                   |                            |.com

--- Comment #1 from Sean Christopherson (sean.j.christopherson@intel.com) ---
The EIP and code stream puts this at the following line in
mmu_alloc_direct_roots():

  vcpu->arch.mmu->pae_root[i] = root | PT_PRESENT_MASK;

The code in question is only encountered on 32-bit KVM when two-dimensional
paging (TDP) is disabled, i.e. without AMD's Nested Page Tables, which fits
your setup (i686-pae on Athlon).  What I can't figure out is how pae_root would
be NULL in this scenario.  There is one flow that I think could theoretically
result in a NULL pae_root, but it would require using nested virtualization,
which doesn't seem to be the case here.

I tried to test my theory but running a 32-bit KVM without TDP just hangs for
me, i.e. it appears to be broken on Intel VMX at least as far back as v4.20.

What was the last kernel that did work on your system?  That might help narrow
down when things went awry.  In the meantime, I'll try to debug/bisect the
issue I'm seeing as time allows.
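
The register dump from the report can be cross-checked against that source
line. The sketch below decodes the faulting bytes "<89> 0c 10" from the Code:
line and computes the store address from the first register dump. decode_store,
struct store and reported_store are hypothetical helpers written for this note;
they handle only this one addressing form and are not a general disassembler.

```c
#include <assert.h>
#include <stdint.h>

/* General-purpose registers in x86 encoding order. */
enum { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI };

struct store {
    int src_reg;    /* register being stored */
    uint32_t value; /* its contents */
    uint32_t addr;  /* effective address of the store */
};

/*
 * Decode "89 /r" (MOV r/m32, r32) for ModRM.mod == 0 with a SIB byte and
 * no displacement. Returns 0 on success, -1 for any other encoding.
 */
static int decode_store(const uint8_t *insn, const uint32_t regs[8],
                        struct store *out)
{
    if (insn[0] != 0x89)
        return -1;

    uint8_t modrm = insn[1];
    uint8_t mod = modrm >> 6, reg = (modrm >> 3) & 7, rm = modrm & 7;
    if (mod != 0 || rm != 4) /* rm == 4 means a SIB byte follows */
        return -1;

    uint8_t sib = insn[2];
    uint8_t scale = sib >> 6, index = (sib >> 3) & 7, base = sib & 7;
    if (base == 5 || index == 4) /* disp32 and no-index forms: not handled */
        return -1;

    out->src_reg = reg;
    out->value = regs[reg];
    out->addr = regs[base] + (regs[index] << scale);
    return 0;
}

/* Registers copied from the first dump in the report. */
static struct store reported_store(void)
{
    const uint8_t insn[] = { 0x89, 0x0c, 0x10 };
    uint32_t regs[8] = { 0 };
    struct store st;

    regs[EAX] = 0x00000000; regs[EBX] = 0x00040000;
    regs[ECX] = 0x0dc1d001; regs[EDX] = 0x00000000;
    regs[ESI] = 0x00000000; regs[EDI] = 0xcdf58000;
    regs[EBP] = 0xcf811de0; regs[ESP] = 0xcf811dbc;

    int rc = decode_store(insn, regs, &st);
    assert(rc == 0);
    return st;
}
```

The bytes decode to a store of ECX to the address EAX + EDX. With EAX and EDX
both zero the address is 0, matching CR2, and ECX (0dc1d001) has bit 0 set,
which is what one would expect for "root | PT_PRESENT_MASK" being written
through a NULL pae_root at index 0.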


* [Bug 203845] Can't run qemu/kvm on 5.0.0 kernel (NULL pointer dereference)
From: bugzilla-daemon @ 2019-06-08 17:36 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=203845

--- Comment #2 from Jiri Palecek (jpalecek@web.de) ---
Hello,

thanks for looking at it. 4.18.20 works for me, while 4.19.37 doesn't.


* [Bug 203845] Can't run qemu/kvm on 5.0.0 kernel (NULL pointer dereference)
From: bugzilla-daemon @ 2019-06-13  9:49 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=203845

--- Comment #3 from Jiri Palecek (jpalecek@web.de) ---
So, after bisecting, the first bad commit is:

commit ee6268ba3a6861b2806e569bff7fe91fbdf846dd (refs/bisect/bad)
Author: Liang Chen <liangchen.linux@gmail.com>
Date:   Wed Jul 25 16:32:14 2018 +0800

    KVM: x86: Skip pae_root shadow allocation if tdp enabled

    Considering the fact that the pae_root shadow is not needed when
    tdp is in use, skip the pae_root shadow page allocation to allow
    mmu creation even not being able to obtain memory from DMA32
    zone when particular cgroup cpuset.mems or mempolicy control is
    applied.

    Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


* [Bug 203845] Can't run qemu/kvm on 5.0.0 kernel (NULL pointer dereference)
From: bugzilla-daemon @ 2019-06-13 14:30 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=203845

--- Comment #4 from Sean Christopherson (sean.j.christopherson@intel.com) ---
Ah, I finally see how KVM ends up consuming a NULL pae_root.  32-bit KVM on SVM
with nested page tables uses PAE tables in the host.  I'll send a patch.
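
To make the interaction with the bisected commit concrete, here is a small
userspace model of the allocation decision. It is a sketch under assumptions,
not kernel code: the function names are invented, the two constants mirror the
kernel's PT32E_ROOT_LEVEL and PT64_ROOT_4LEVEL values, and the "guarded"
variant is only one plausible shape for a fix, not a quotation of any patch.

```c
#include <assert.h>
#include <stdbool.h>

#define PT32E_ROOT_LEVEL 3 /* PAE paging: four PDPTEs act as roots */
#define PT64_ROOT_4LEVEL 4 /* 4-level paging on a 64-bit host */

/* Behaviour described by the bisected commit: skip whenever TDP is on. */
static bool allocates_pae_root_bisected(bool tdp_enabled, int tdp_level)
{
    (void)tdp_level; /* the level is not consulted */
    return !tdp_enabled;
}

/* Assumed guard: skip only when TDP is on and the root is deeper than PAE. */
static bool allocates_pae_root_guarded(bool tdp_enabled, int tdp_level)
{
    return !(tdp_enabled && tdp_level > PT32E_ROOT_LEVEL);
}

/* Assumption: the direct-root path writes pae_root[] at PAE root level. */
static bool uses_pae_root(int root_level)
{
    return root_level == PT32E_ROOT_LEVEL;
}

/* A NULL dereference happens when pae_root is used but was never allocated. */
static bool null_deref(bool allocated, int root_level)
{
    return uses_pae_root(root_level) && !allocated;
}

/* 32-bit host on SVM with NPT: TDP is enabled and the root level is PAE. */
static void model_self_check(void)
{
    bool bisected = allocates_pae_root_bisected(true, PT32E_ROOT_LEVEL);
    bool guarded = allocates_pae_root_guarded(true, PT32E_ROOT_LEVEL);

    assert(null_deref(bisected, PT32E_ROOT_LEVEL));
    assert(!null_deref(guarded, PT32E_ROOT_LEVEL));
}
```

In this model a 64-bit host with TDP still skips the allocation under the
guarded check, so the memory saving the bisected commit was after is kept,
while the 32-bit NPT case gets its pae_root back.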


* [Bug 203845] Can't run qemu/kvm on 5.0.0 kernel (NULL pointer dereference)
From: bugzilla-daemon @ 2019-07-16 16:34 UTC (permalink / raw)
  To: kvm

https://bugzilla.kernel.org/show_bug.cgi?id=203845

Jiri Palecek (jpalecek@web.de) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |CODE_FIX

--- Comment #5 from Jiri Palecek (jpalecek@web.de) ---
Fixed by commit b6b80c78af
