kvm.vger.kernel.org archive mirror
* 5.2.11+ Regression: > nproc/2 lockups during initramfs
@ 2019-09-08 10:37 James Harvey
  2019-09-10 18:32 ` Sean Christopherson
  0 siblings, 1 reply; 5+ messages in thread
From: James Harvey @ 2019-09-08 10:37 UTC (permalink / raw)
  To: kvm

Host is up-to-date Arch Linux, with the exception of downgrading the
linux package to track this down; the problem is present in 5.2.11 -
5.2.13.  QEMU is 4.1.0, but I have also downgraded it to 4.0.0 to
confirm there is no change.
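
Downgrading on Arch is just a matter of installing an older package
from the cache (or the archive); the file name below is only an
example of what yours might look like:

  # hypothetical cache entry; use whatever linux-5.2.10* file you have
  $ sudo pacman -U /var/cache/pacman/pkg/linux-5.2.10.arch1-1-x86_64.pkg.tar.xz
  $ reboot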

Host is dual E5-2690 v1 Xeons.  With hyperthreading, 32 logical cores.
I've always been able to boot qemu with "-smp
cpus=30,cores=15,threads=1,sockets=2".  I leave 2 free for host
responsiveness.
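
For reference, the invocation is along these lines (memory size, CPU
model, and disk below are illustrative placeholders, not my exact
command line):

  $ qemu-system-x86_64 \
      -accel kvm \
      -smp cpus=30,cores=15,threads=1,sockets=2 \
      -m 16G -cpu host \
      -drive file=vm.qcow2,if=virtio
  # only -accel kvm and the -smp count seem to matter for the hang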

Upgrading from 5.2.10 to 5.2.11 causes the VM to lock up while loading
the initramfs about 90-95% of the time.  (Probably a slight race
condition.)  On host, QEMU shows as nVmCPUs*100% CPU usage, so around
3000% for 30 cpus.

If I back down to "cpus=16,cores=8", it always boots.  If I increase
to "cpus=18,cores=9", it goes back to locking up 90-95% of the time.

Omitting "-accel=kvm" allows 5.2.11 to work on the host without issue,
so combined with that the only package needing to be downgraded is
linux to 5.2.10 to prevent the issue with KVM, I think this must be a
KVM issue.
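
(For completeness: running with plain TCG instead of KVM, e.g.
something like the line below, boots reliably on 5.2.11; -m and the
disk image are placeholders.)

  # software emulation instead of KVM -- no hang
  # (or just drop -accel entirely; TCG is the default)
  $ qemu-system-x86_64 -accel tcg -smp cpus=30,cores=15,threads=1,sockets=2 -m 16G vm.qcow2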

Using a version of QEMU with debug symbols gives:
* gdb backtrace: http://ix.io/1UyO
* 11 seconds of attaching strace to locked up qemu (167K): http://ix.io/1UyP
* strace from the beginning of starting a qemu that locks up (8MB):
https://filebin.ca/4uI15ztGAarw/strace.qemu.from.start
** This definitely changed the timings, and it became harder to
replicate, to the point where I'd guess 20-30% of boots hang
** Interestingly, the strace only collected data for 5 seconds, even
though qemu continued at full CPU usage much longer.  I don't know
what to make of that, especially because the first strace was attached
to an already locked-up qemu that had gone well past 5 seconds.

Much like strace changing the timings, I have seen that attaching GDB
to a running qemu (which pauses it) and then simply running "continue"
gets it "unstuck" immediately.

I've let this go for 14 hours, but once it goes to full CPU usage, it
never comes out.

If booting from the September 2019 Arch ISO, it hangs right after the
ISO's UEFI bootloader selects Arch Linux, then the screen goes black.

If booting from grub/systemd, it hangs right after "Loading Initial Ramdisk..."


* Re: 5.2.11+ Regression: > nproc/2 lockups during initramfs
  2019-09-08 10:37 5.2.11+ Regression: > nproc/2 lockups during initramfs James Harvey
@ 2019-09-10 18:32 ` Sean Christopherson
  2019-09-12  7:59   ` James Harvey
  0 siblings, 1 reply; 5+ messages in thread
From: Sean Christopherson @ 2019-09-10 18:32 UTC (permalink / raw)
  To: James Harvey; +Cc: kvm, Alex Willamson, Paolo Bonzini

On Sun, Sep 08, 2019 at 06:37:43AM -0400, James Harvey wrote:
> Host is up-to-date Arch Linux, with the exception of downgrading the
> linux package to track this down; the problem is present in 5.2.11 -
> 5.2.13.  QEMU is 4.1.0, but I have also downgraded it to 4.0.0 to
> confirm there is no change.
> 
> Host is dual E5-2690 v1 Xeons.  With hyperthreading, 32 logical cores.
> I've always been able to boot qemu with "-smp
> cpus=30,cores=15,threads=1,sockets=2".  I leave 2 free for host
> responsiveness.
> 
> Upgrading from 5.2.10 to 5.2.11 causes the VM to lock up while loading
> the initramfs about 90-95% of the time.  (Probably a slight race
> condition.)  On host, QEMU shows as nVmCPUs*100% CPU usage, so around
> 3000% for 30 cpus.
> 
> If I back down to "cpus=16,cores=8", it always boots.  If I increase
> to "cpus=18,cores=9", it goes back to locking up 90-95% of the time.
> 
> Omitting "-accel=kvm" allows 5.2.11 to work on the host without issue,
> so combined with that the only package needing to be downgraded is
> linux to 5.2.10 to prevent the issue with KVM, I think this must be a
> KVM issue.
> 
> Using a version of QEMU with debug symbols gives:
> * gdb backtrace: http://ix.io/1UyO

Fudge.

One of the threads is deleting a memory region, and v5.2.11 reverted a
change related to flushing sptes on memory region deletion.

Can you try reverting the following commit?  Reverting the revert isn't a
viable solution, but it'll at least be helpful to confirm that it's the
source of your troubles.

commit 2ad350fb4c924f611d174e2b0da4edba8a6e430a
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Thu Aug 15 09:43:32 2019 +0200

    Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"
    
    commit d012a06ab1d23178fc6856d8d2161fbcc4dd8ebd upstream.
    
    This reverts commit 4e103134b862314dc2f2f18f2fb0ab972adc3f5f.
    Alex Williamson reported regressions with device assignment with
    this patch.  Even though the bug is probably elsewhere and still
    latent, this is needed to fix the regression.
    
    Fixes: 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05)
    Reported-by: Alex Willamson <alex.williamson@redhat.com>
    Cc: stable@vger.kernel.org
    Cc: Sean Christopherson <sean.j.christopherson@intel.com>
    Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
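
Something along these lines against a linux-5.2.y checkout should do
it (untested; adjust to however you normally build your kernels):

  $ git checkout v5.2.11    # or whichever 5.2.y tag you're running
  $ git revert --no-edit 2ad350fb4c924f611d174e2b0da4edba8a6e430a
  # then rebuild/install the kernel as usual and boot into it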


Thread 10 (Thread 0x7ff72fdff700 (LWP 4507)):
#1  0x000055c976c411a4 in kvm_vm_ioctl
#2  0x000055c976c3c6bf in kvm_set_user_memory_region
#3  0x000055c976c3dbb6 in kvm_set_phys_mem
#4  0x000055c976c3dd68 in kvm_region_del
#5  0x000055c976c272c8 in address_space_update_topology_pass
#6  0x000055c976c27897 in address_space_set_flatview
#7  0x000055c976c27a5d in memory_region_transaction_commit
#8  0x000055c976c2b3b5 in memory_region_del_subregion
#9  0x000055c976f267d6 in pci_update_mappings
#10 0x000055c976f26bb6 in pci_default_write_config
#11 0x000055c976fd9baa in virtio_write_config
#12 0x000055c976f30453 in pci_host_config_write_common
#13 0x000055c976f305b0 in pci_data_write
#14 0x000055c976f306dc in pci_host_data_write
#15 0x000055c976c25887 in memory_region_write_accessor
#16 0x000055c976c25aa7 in access_with_adjusted_size
#17 0x000055c976c28ad0 in memory_region_dispatch_write
#18 0x000055c976bc6b30 in flatview_write_continue
#19 0x000055c976bc6c77 in flatview_write
#20 0x000055c976bc6f82 in address_space_write
#21 0x000055c976bc6fd4 in address_space_rw
#22 0x000055c976c4059a in kvm_handle_io
#23 0x000055c976c40d33 in kvm_cpu_exec
#24 0x000055c976c166df in qemu_kvm_cpu_thread_fn
#25 0x000055c9771d3bd8 in qemu_thread_start
#26 0x00007ff73892357f in start_thread
#27 0x00007ff7388510e3 in clone

> * 11 seconds of attaching strace to locked up qemu (167K): http://ix.io/1UyP
> * strace from the beginning of starting a qemu that locks up (8MB):
> https://filebin.ca/4uI15ztGAarw/strace.qemu.from.start
> ** This definitely changed the timings, and it became harder to
> replicate, to the point where I'd guess 20-30% of boots hang
> ** Interestingly, the strace only collected data for 5 seconds, even
> though qemu continued at full CPU usage much longer.  I don't know
> what to make of that, especially because the first strace was attached
> to an already locked-up qemu that had gone well past 5 seconds.
> 
> Much like strace changing the timings, I have seen that attaching GDB
> to a running qemu (which pauses it) and then simply running "continue"
> gets it "unstuck" immediately.
> 
> I've let this go for 14 hours, but once it goes to full CPU usage, it
> never comes out.
> 
> If booting from the September 2019 Arch ISO, it hangs right after the
> ISO's UEFI bootloader selects Arch Linux, then the screen goes black.
> 
> If booting from grub/systemd, it hangs right after "Loading Initial Ramdisk..."


* Re: 5.2.11+ Regression: > nproc/2 lockups during initramfs
  2019-09-10 18:32 ` Sean Christopherson
@ 2019-09-12  7:59   ` James Harvey
  2019-09-17 13:36     ` Paolo Bonzini
  0 siblings, 1 reply; 5+ messages in thread
From: James Harvey @ 2019-09-12  7:59 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: kvm, Alex Willamson, Paolo Bonzini, gregkh

On Tue, Sep 10, 2019 at 2:32 PM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
>
> On Sun, Sep 08, 2019 at 06:37:43AM -0400, James Harvey wrote:
> > Host is up-to-date Arch Linux, with the exception of downgrading the
> > linux package to track this down; the problem is present in 5.2.11 -
> > 5.2.13.  QEMU is 4.1.0, but I have also downgraded it to 4.0.0 to
> > confirm there is no change.
> >
> > Host is dual E5-2690 v1 Xeons.  With hyperthreading, 32 logical cores.
> > I've always been able to boot qemu with "-smp
> > cpus=30,cores=15,threads=1,sockets=2".  I leave 2 free for host
> > responsiveness.
> >
> > Upgrading from 5.2.10 to 5.2.11 causes the VM to lock up while loading
> > the initramfs about 90-95% of the time.  (Probably a slight race
> > condition.)  On host, QEMU shows as nVmCPUs*100% CPU usage, so around
> > 3000% for 30 cpus.
> >
> > If I back down to "cpus=16,cores=8", it always boots.  If I increase
> > to "cpus=18,cores=9", it goes back to locking up 90-95% of the time.
> >
> > Omitting "-accel=kvm" allows 5.2.11 to work on the host without
> > issue.  Combined with the fact that downgrading linux to 5.2.10 is the
> > only package change needed to prevent the issue with KVM, I think this
> > must be a KVM issue.
> >
> > Using a version of QEMU with debug symbols gives:
> > * gdb backtrace: http://ix.io/1UyO
>
> Fudge.
>
> One of the threads is deleting a memory region, and v5.2.11 reverted a
> change related to flushing sptes on memory region deletion.
>
> Can you try reverting the following commit?  Reverting the revert isn't a
> viable solution, but it'll at least be helpful to confirm that it's the
> source of your troubles.
>
> commit 2ad350fb4c924f611d174e2b0da4edba8a6e430a
> Author: Paolo Bonzini <pbonzini@redhat.com>
> Date:   Thu Aug 15 09:43:32 2019 +0200
>
>     Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"
>
>     commit d012a06ab1d23178fc6856d8d2161fbcc4dd8ebd upstream.
>
>     This reverts commit 4e103134b862314dc2f2f18f2fb0ab972adc3f5f.
>     Alex Williamson reported regressions with device assignment with
>     this patch.  Even though the bug is probably elsewhere and still
>     latent, this is needed to fix the regression.
>
>     Fixes: 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05)
>     Reported-by: Alex Willamson <alex.williamson@redhat.com>
>     Cc: stable@vger.kernel.org
>     Cc: Sean Christopherson <sean.j.christopherson@intel.com>
>     Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>     Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Yes, I confirmed that reverting this commit (i.e. restoring the
originally reverted commit) fixes the issue.

I'm really surprised not to have found similar reports, especially
from Arch users, who have had 5.2.11 in the repos since Aug 29.  It
makes me wonder if it's reproducible on all hardware with host
hyperthreading when a VM is given > nproc/2 virtual cpus.

In the meantime, what should distros take into account when deciding
whether to revert, given that you mentioned "Reverting the revert
isn't a viable solution"?


* Re: 5.2.11+ Regression: > nproc/2 lockups during initramfs
  2019-09-12  7:59   ` James Harvey
@ 2019-09-17 13:36     ` Paolo Bonzini
  2019-09-17 13:41       ` Greg KH
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2019-09-17 13:36 UTC (permalink / raw)
  To: James Harvey, Sean Christopherson; +Cc: kvm, Alex Willamson, gregkh

On 12/09/19 09:59, James Harvey wrote:
> Yes, I confirmed that reverting this commit (i.e. restoring the
> originally reverted commit) fixes the issue.
> 
> I'm really surprised not to have found similar reports, especially
> from Arch users, who have had 5.2.11 in the repos since Aug 29.  It
> makes me wonder if it's reproducible on all hardware with host
> hyperthreading when a VM is given > nproc/2 virtual cpus.
> 
> In the meantime, what should distros take into account when deciding
> whether to revert, given that you mentioned "Reverting the revert
> isn't a viable solution"?

Hi James,

the fix (the problem turned out to be a livelock) is now part of 5.3.
You should expect it in the 5.2 stable kernels sometime soon.

    commit 002c5f73c508f7df5681bda339831c27f3c1aef4
    KVM: x86/mmu: Reintroduce fast invalidate/zap for flushing memslot
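
(The backport into 5.2.y will get its own sha1, so the simplest check
for whether a given stable release has it is to grep for the subject,
e.g. from a linux-stable checkout, with <tag> being the 5.2.y release
you want to check:)

  $ git log --oneline <tag> -- arch/x86/kvm/mmu.c | grep -i 'reintroduce fast invalidate'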

Thanks,

Paolo


* Re: 5.2.11+ Regression: > nproc/2 lockups during initramfs
  2019-09-17 13:36     ` Paolo Bonzini
@ 2019-09-17 13:41       ` Greg KH
  0 siblings, 0 replies; 5+ messages in thread
From: Greg KH @ 2019-09-17 13:41 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: James Harvey, Sean Christopherson, kvm, Alex Willamson

On Tue, Sep 17, 2019 at 03:36:00PM +0200, Paolo Bonzini wrote:
> On 12/09/19 09:59, James Harvey wrote:
> > Yes, I confirmed that reverting this commit (i.e. restoring the
> > originally reverted commit) fixes the issue.
> > 
> > I'm really surprised not to have found similar reports, especially
> > from Arch users, who have had 5.2.11 in the repos since Aug 29.  It
> > makes me wonder if it's reproducible on all hardware with host
> > hyperthreading when a VM is given > nproc/2 virtual cpus.
> > 
> > In the meantime, what should distros take into account when deciding
> > whether to revert, given that you mentioned "Reverting the revert
> > isn't a viable solution"?
> 
> Hi James,
> 
> the fix (the problem turned out to be a livelock) is now part of 5.3.
> You should expect it in the 5.2 stable kernels sometime soon.
> 
>     commit 002c5f73c508f7df5681bda339831c27f3c1aef4
>     KVM: x86/mmu: Reintroduce fast invalidate/zap for flushing memslot

Will be in the next 5.2-stable release in a few days.

thanks,

greg k-h

