kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [5.2 regression] x86/fpu changes cause crashes in KVM guest
       [not found] <217248af-e980-9cb0-ff0d-9773413b9d38@thomaslambertz.de>
@ 2019-07-19  8:59 ` Wanpeng Li
  2019-07-19 11:09   ` Paolo Bonzini
  0 siblings, 1 reply; 3+ messages in thread
From: Wanpeng Li @ 2019-07-19  8:59 UTC (permalink / raw)
  To: Thomas Lambertz
  Cc: Sebastian Andrzej Siewior, Rik van Riel, Dave Hansen,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	the arch/x86 maintainers, LKML, Paolo Bonzini, Radim Krcmar, kvm,
	Peter Zijlstra

Cc kvm ml,
On Thu, 18 Jul 2019 at 08:08, Thomas Lambertz <mail@thomaslambertz.de> wrote:
>
> Since kernel 5.2, I've been experiencing strange issues in my Windows 10
> QEMU/KVM guest.
> Via bisection, I have tracked down that the issue lies in the FPU state
> handling changes.
> Kernels before 8ff468c29e9a9c3afe9152c10c7b141343270bf3 work great, the
> ones afterwards are affected.
> Sometimes the state seems to be restored incorrectly in the guest.
>
> I have managed to reproduce it relatively cleanly, on a linux guest.
> (ubuntu-server 18.04, but that should not matter, since it occured on
> windows aswell)
>
> To reproduce the issue, you need prime95 (or mprime), from
> https://www.mersenne.org/download/ .
> This is just a stress test for the FPU, which helps reproduce the error
> much quicker.
>
> - Run it in the guest as 'Benchmark Only', and choose the '(2) Small
> FFTs' torture test. Give it the maximum amount of cores (for me 10).
> - On the host, run the same test. To keep my pc usable, I limited it to
> 5 cores. I do this to put some pressure on the system.
> - repeatedly focus and unfocus the qemu window
>
> With this config, errors in the guest usually occur within 30 seconds.
> Without the refocusing, takes ~5min on average, but the variance of this
> time is quite large.
>
> The error messages are either
>      "FATAL ERROR: Rounding was ......., expected less than 0.4"
> or
>      "FATAL ERROR: Resulting sum was ....., expexted: ......",
> suggesting that something in the calculation has gone wrong.
>
> On the host, no errors are ever observed!

I found it is offended by commit 5f409e20b (x86/fpu: Defer FPU state
load until return to userspace) and can only be reproduced when
CONFIG_PREEMPT is enabled. Why restore qemu userspace fpu context to
hardware before vmentry in the commit?
https://lkml.org/lkml/2017/11/14/945 Actually I suspect the commit
f775b13eedee2 (x86,kvm: move qemu/guest FPU switching out to vcpu_run)
inaccurately save guest fpu state which in xsave area into the qemu
userspace fpu buffer. However, Rik replied in
https://lkml.org/lkml/2017/11/14/891, "The scheduler will save the
guest fpu context when a vCPU thread is preempted, and restore it when
it is scheduled back in." But I can't find any scheduler codes do
this. In addition, below codes can fix the mprime error warning.
(Still not sure it is correct)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 58305cf..18f928e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3306,6 +3306,9 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)

     kvm_x86_ops->vcpu_load(vcpu, cpu);

+    if (test_thread_flag(TIF_NEED_FPU_LOAD))
+        switch_fpu_return();
+
     /* Apply any externally detected TSC adjustments (due to suspend) */
     if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
         adjust_tsc_offset_host(vcpu, vcpu->arch.tsc_offset_adjustment);
@@ -7990,10 +7993,6 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
     trace_kvm_entry(vcpu->vcpu_id);
     guest_enter_irqoff();

-    fpregs_assert_state_consistent();
-    if (test_thread_flag(TIF_NEED_FPU_LOAD))
-        switch_fpu_return();
-
     if (unlikely(vcpu->arch.switch_db_regs)) {
         set_debugreg(0, 7);
         set_debugreg(vcpu->arch.eff_db[0], 0);

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [5.2 regression] x86/fpu changes cause crashes in KVM guest
  2019-07-19  8:59 ` [5.2 regression] x86/fpu changes cause crashes in KVM guest Wanpeng Li
@ 2019-07-19 11:09   ` Paolo Bonzini
  2019-07-22  4:28     ` Wanpeng Li
  0 siblings, 1 reply; 3+ messages in thread
From: Paolo Bonzini @ 2019-07-19 11:09 UTC (permalink / raw)
  To: Wanpeng Li, Thomas Lambertz
  Cc: Sebastian Andrzej Siewior, Rik van Riel, Dave Hansen,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, H. Peter Anvin,
	the arch/x86 maintainers, LKML, Radim Krcmar, kvm,
	Peter Zijlstra, Marc Orr, Dave Hansen

On 19/07/19 10:59, Wanpeng Li wrote:
> https://lkml.org/lkml/2017/11/14/891, "The scheduler will save the
> guest fpu context when a vCPU thread is preempted, and restore it when
> it is scheduled back in." But I can't find any scheduler codes do
> this.

That's because applying commit 240c35a37 was completely wrong.  The idea
before commit 240c35a37 was that you have the following FPU states:

               userspace (QEMU)             guest
---------------------------------------------------------------------------
               processor                    vcpu->arch.guest_fpu
>>> KVM_RUN: kvm_load_guest_fpu
               vcpu->arch.user_fpu          processor
>>> preempt out
               vcpu->arch.user_fpu          current->thread.fpu
>>> preempt in
               vcpu->arch.user_fpu          processor
>>> back to userspace
>>> kvm_put_guest_fpu
               processor                    vcpu->arch.guest_fpu
---------------------------------------------------------------------------

After removing user_fpu, QEMU's FPU state is destroyed when KVM_RUN is
preempted.  So that's already messed up (I'll send a revert), and given
the diagram above your patch makes total sense.

With the new lazy model we want to hook into kvm_vcpu_arch_load and get
the state back to the processor from current->thread.fpu, and indeed
switch_fpu_return is essentially copy_kernel_to_fpregs(&current->thread.
fpu->state).

However I would keep the fpregs_assert_state_consistent in
kvm_arch_vcpu_load, and also
WARN_ON_ONCE(test_thread_flag(TIF_NEED_FPU_LOAD)) in vcpu_enter_guest.

Paolo

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [5.2 regression] x86/fpu changes cause crashes in KVM guest
  2019-07-19 11:09   ` Paolo Bonzini
@ 2019-07-22  4:28     ` Wanpeng Li
  0 siblings, 0 replies; 3+ messages in thread
From: Wanpeng Li @ 2019-07-22  4:28 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Thomas Lambertz, Sebastian Andrzej Siewior, Rik van Riel,
	Dave Hansen, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	H. Peter Anvin, the arch/x86 maintainers, LKML, Radim Krcmar,
	kvm, Peter Zijlstra, Marc Orr

On Fri, 19 Jul 2019 at 19:09, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
> On 19/07/19 10:59, Wanpeng Li wrote:
> > https://lkml.org/lkml/2017/11/14/891, "The scheduler will save the
> > guest fpu context when a vCPU thread is preempted, and restore it when
> > it is scheduled back in." But I can't find any scheduler codes do
> > this.
>
> That's because applying commit 240c35a37 was completely wrong.  The idea
> before commit 240c35a37 was that you have the following FPU states:
>
>                userspace (QEMU)             guest
> ---------------------------------------------------------------------------
>                processor                    vcpu->arch.guest_fpu
> >>> KVM_RUN: kvm_load_guest_fpu
>                vcpu->arch.user_fpu          processor
> >>> preempt out
>                vcpu->arch.user_fpu          current->thread.fpu
> >>> preempt in
>                vcpu->arch.user_fpu          processor
> >>> back to userspace
> >>> kvm_put_guest_fpu
>                processor                    vcpu->arch.guest_fpu
> ---------------------------------------------------------------------------
>
> After removing user_fpu, QEMU's FPU state is destroyed when KVM_RUN is
> preempted.  So that's already messed up (I'll send a revert), and given
> the diagram above your patch makes total sense.
>
> With the new lazy model we want to hook into kvm_vcpu_arch_load and get
> the state back to the processor from current->thread.fpu, and indeed
> switch_fpu_return is essentially copy_kernel_to_fpregs(&current->thread.
> fpu->state).
>
> However I would keep the fpregs_assert_state_consistent in
> kvm_arch_vcpu_load, and also
> WARN_ON_ONCE(test_thread_flag(TIF_NEED_FPU_LOAD)) in vcpu_enter_guest.

Looks good to me, just send out two patches rebase on the revert.

Regards,
Wanpeng Li

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2019-07-22  4:28 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <217248af-e980-9cb0-ff0d-9773413b9d38@thomaslambertz.de>
2019-07-19  8:59 ` [5.2 regression] x86/fpu changes cause crashes in KVM guest Wanpeng Li
2019-07-19 11:09   ` Paolo Bonzini
2019-07-22  4:28     ` Wanpeng Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).