[Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-11-03  6:58 Xulei (Stone, Euler)
  2015-11-04  0:48 ` Gonglei
  0 siblings, 1 reply; 41+ messages in thread
From: Xulei (Stone, Euler) @ 2015-11-03  6:58 UTC (permalink / raw)
  To: qemu-devel; +Cc: wangxin (U), Gonglei (Arei), Huangweidong (C)

[-- Attachment #1: Type: text/plain, Size: 1604 bytes --]

On qemu-kvm platform, when I reset a VM through "virsh reset", and coincidently
the VM is in process of internal rebooting at the same time. Then the VM will
not be successfully reseted any more due to the reset reentrancy. I found:
(1)SeaBios try to shutdown the VM after reseting it failed by apm_shutdown().
However, apm_shutdown() does not work on qemu-kvm platform;
(2)I add 1s sleep in qemu_prep_reset(), then continuously reset the VM twice,
aforementioned case must happen.
This patch fixes this issue by letting the VM always execute the reboot
routing while a reenrancy happenes instead of attempting apm_shutdown on
qemu-kvm platform.

Signed-off-by: Lei Xu <stone.xulei@huawei.com>
---
 roms/seabios/src/resume.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/roms/seabios/src/resume.c b/roms/seabios/src/resume.c
index 1903174..96ff79e 100644
--- a/roms/seabios/src/resume.c
+++ b/roms/seabios/src/resume.c
@@ -16,6 +16,7 @@
 #include "std/bda.h" // struct bios_data_area_s
 #include "string.h" // memset
 #include "util.h" // dma_setup
+#include "fw/paravirt.h" //runningOnKVM

 // Handler for post calls that look like a resume.
 void VISIBLE16
@@ -122,7 +123,11 @@ tryReboot(void)
         dprintf(1, "Unable to hard-reboot machine - attempting shutdown.\n");
         apm_shutdown();
     }
-    HaveAttemptedReboot = 1;
+    if (!runningOnKVM()) {
+        // Hard reboot has failed - try to shutdown machine.
+        HaveAttemptedReboot = 1;
+    }
+

     dprintf(1, "Attempting a hard reboot\n");

--
1.7.12.4

[-- Attachment #2: Type: text/html, Size: 2887 bytes --]

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-11-03  6:58 [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform Xulei (Stone, Euler)
@ 2015-11-04  0:48 ` Gonglei
  2015-11-04 17:42   ` Kevin O'Connor
  0 siblings, 1 reply; 41+ messages in thread
From: Gonglei @ 2015-11-04  0:48 UTC (permalink / raw)
  To: Xulei (Stone, Euler), qemu-devel; +Cc: wangxin (U), seabios, Huangweidong (C)

Ccing Seabios community.

On 2015/11/3 14:58, Xulei (Stone, Euler) wrote:
> On qemu-kvm platform, when I reset a VM through "virsh reset", and coincidently
> the VM is in process of internal rebooting at the same time. Then the VM will
> not be successfully reseted any more due to the reset reentrancy. I found:
> (1)SeaBios try to shutdown the VM after reseting it failed by apm_shutdown().
> However, apm_shutdown() does not work on qemu-kvm platform;
> (2)I add 1s sleep in qemu_prep_reset(), then continuously reset the VM twice,
> aforementioned case must happen.
> This patch fixes this issue by letting the VM always execute the reboot
> routing while a reenrancy happenes instead of attempting apm_shutdown on
> qemu-kvm platform.
> 
> Signed-off-by: Lei Xu <stone.xulei@huawei.com>
> ---
>   roms/seabios/src/resume.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/roms/seabios/src/resume.c b/roms/seabios/src/resume.c
> index 1903174..96ff79e 100644
> --- a/roms/seabios/src/resume.c
> +++ b/roms/seabios/src/resume.c
> @@ -16,6 +16,7 @@
>   #include "std/bda.h" // struct bios_data_area_s
>   #include "string.h" // memset
>   #include "util.h" // dma_setup
> +#include "fw/paravirt.h" //runningOnKVM
> 
>   // Handler for post calls that look like a resume.
>   void VISIBLE16
> @@ -122,7 +123,11 @@ tryReboot(void)
>           dprintf(1, "Unable to hard-reboot machine - attempting shutdown.\n");
>           apm_shutdown();
>       }
> -    HaveAttemptedReboot = 1;
> +    if (!runningOnKVM()) {
> +        // Hard reboot has failed - try to shutdown machine.
> +        HaveAttemptedReboot = 1;
> +    }
> +
> 
>       dprintf(1, "Attempting a hard reboot\n");
> 
> --
> 1.7.12.4
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-11-04  0:48 ` Gonglei
@ 2015-11-04 17:42   ` Kevin O'Connor
  2015-11-06  9:12     ` Xulei (Stone)
  0 siblings, 1 reply; 41+ messages in thread
From: Kevin O'Connor @ 2015-11-04 17:42 UTC (permalink / raw)
  To: Gonglei
  Cc: Huangweidong (C), wangxin (U), seabios, Xulei (Stone, Euler), qemu-devel

On Wed, Nov 04, 2015 at 08:48:20AM +0800, Gonglei wrote:
> On 2015/11/3 14:58, Xulei (Stone, Euler) wrote:
> > On qemu-kvm platform, when I reset a VM through "virsh reset", and coincidently
> > the VM is in process of internal rebooting at the same time. Then the VM will
> > not be successfully reseted any more due to the reset reentrancy. I found:
> > (1)SeaBios try to shutdown the VM after reseting it failed by apm_shutdown().
> > However, apm_shutdown() does not work on qemu-kvm platform;
> > (2)I add 1s sleep in qemu_prep_reset(), then continuously reset the VM twice,
> > aforementioned case must happen.

So, the problem occurs when issuing a second reset before the first
reset completes?

> > This patch fixes this issue by letting the VM always execute the reboot
> > routing while a reenrancy happenes instead of attempting apm_shutdown on
> > qemu-kvm platform.

The reason for the HaveAttemptedReboot check is to work around old
versions of KVM that unexpectedly map the same memory to both 0xf0000
and 0xffff0000.  So, it does not make sense to wrap the check in a
!runningOnKVM() block as that disables the only reason for the check.

I'm surprised you would see the above on a recent qemu/kvm though - as
on a newer KVM I think the second reset would have to happen after
HaveAttemptedReboot is set and prior to the memcpy in
qemu_prep_reset() completing.  Can you verify your KVM version?

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-11-04 17:42   ` Kevin O'Connor
@ 2015-11-06  9:12     ` Xulei (Stone)
  2015-11-09 13:32       ` Kevin O'Connor
  0 siblings, 1 reply; 41+ messages in thread
From: Xulei (Stone) @ 2015-11-06  9:12 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), wangxin (U), Gonglei (Arei), seabios, qemu-devel


>On Wed, Nov 04, 2015 at 08:48:20AM +0800, Gonglei wrote:
>> On 2015/11/3 14:58, Xulei (Stone, Euler) wrote:
>> > On qemu-kvm platform, when I reset a VM through "virsh reset", and coincidently
>> > the VM is in process of internal rebooting at the same time. Then the VM will
>> > not be successfully reseted any more due to the reset reentrancy. I found:
>> > (1)SeaBios try to shutdown the VM after reseting it failed by apm_shutdown().
>> > However, apm_shutdown() does not work on qemu-kvm platform;
>> > (2)I add 1s sleep in qemu_prep_reset(), then continuously reset the VM twice,
>> > aforementioned case must happen.
>
>So, the problem occurs when issuing a second reset before the first
>reset completes?

Yes. Detailedly, the 2nd reset issued after "HaveAttemptedReboot = 1"
and prior to the memcpy completing in qemu_prep_reset().

>> > This patch fixes this issue by letting the VM always execute the reboot
>> > routing while a reenrancy happenes instead of attempting apm_shutdown on
>> > qemu-kvm platform.
>
>The reason for the HaveAttemptedReboot check is to work around old
>versions of KVM that unexpectedly map the same memory to both 0xf0000
>and 0xffff0000.  So, it does not make sense to wrap the check in a
>!runningOnKVM() block as that disables the only reason for the check.
>
>I'm surprised you would see the above on a recent qemu/kvm though - as
>on a newer KVM I think the second reset would have to happen after
>HaveAttemptedReboot is set and prior to the memcpy in
>qemu_prep_reset() completing.  Can you verify your KVM version?
>
>-Kevin

I've tested on KVM-3.6 and KVM-4.1.3. On both of these versions, i can 
see this problem. 
I do like this: put a HA and a watchdog mechanism in a VM. Deliberately, 
let this VM lose heartbeat and don't feed dog. Then, after 2 minutes, 
a self-defined timeout, HA mechnism will issue a internal reboot command to
the VM and watchdog mechanism will issue a "virsh reset" from the host. Then, 
aforementioned problem will occurs in high probability. 

-Leixu

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-11-06  9:12     ` Xulei (Stone)
@ 2015-11-09 13:32       ` Kevin O'Connor
  2015-11-09 20:06         ` Kevin O'Connor
  0 siblings, 1 reply; 41+ messages in thread
From: Kevin O'Connor @ 2015-11-09 13:32 UTC (permalink / raw)
  To: Xulei (Stone)
  Cc: Huangweidong (C), wangxin (U), Gonglei (Arei), seabios, qemu-devel

On Fri, Nov 06, 2015 at 09:12:34AM +0000, Xulei (Stone) wrote:
> 
> >On Wed, Nov 04, 2015 at 08:48:20AM +0800, Gonglei wrote:
> >> On 2015/11/3 14:58, Xulei (Stone, Euler) wrote:
> >> > On qemu-kvm platform, when I reset a VM through "virsh reset", and coincidently
> >> > the VM is in process of internal rebooting at the same time. Then the VM will
> >> > not be successfully reseted any more due to the reset reentrancy. I found:
> >> > (1)SeaBios try to shutdown the VM after reseting it failed by apm_shutdown().
> >> > However, apm_shutdown() does not work on qemu-kvm platform;
> >> > (2)I add 1s sleep in qemu_prep_reset(), then continuously reset the VM twice,
> >> > aforementioned case must happen.
> >
> >So, the problem occurs when issuing a second reset before the first
> >reset completes?
> 
> Yes. Detailedly, the 2nd reset issued after "HaveAttemptedReboot = 1"
> and prior to the memcpy completing in qemu_prep_reset().
> 
> >> > This patch fixes this issue by letting the VM always execute the reboot
> >> > routing while a reenrancy happenes instead of attempting apm_shutdown on
> >> > qemu-kvm platform.
> >
> >The reason for the HaveAttemptedReboot check is to work around old
> >versions of KVM that unexpectedly map the same memory to both 0xf0000
> >and 0xffff0000.  So, it does not make sense to wrap the check in a
> >!runningOnKVM() block as that disables the only reason for the check.
> >
> >I'm surprised you would see the above on a recent qemu/kvm though - as
> >on a newer KVM I think the second reset would have to happen after
> >HaveAttemptedReboot is set and prior to the memcpy in
> >qemu_prep_reset() completing.  Can you verify your KVM version?
> >
> >-Kevin
> 
> I've tested on KVM-3.6 and KVM-4.1.3. On both of these versions, i can 
> see this problem. 
> I do like this: put a HA and a watchdog mechanism in a VM. Deliberately, 
> let this VM lose heartbeat and don't feed dog. Then, after 2 minutes, 
> a self-defined timeout, HA mechnism will issue a internal reboot command to
> the VM and watchdog mechanism will issue a "virsh reset" from the host. Then, 
> aforementioned problem will occurs in high probability. 

Ah, okay.  I'm not sure what the best solution to this problem is.  We
don't want to exclude KVM because the check is meant to prevent an
infinite loop on older versions of KVM (which looks like a mysterious
hang to users).  We also don't want to be in a situation where we
reboot and the memcpy hasn't fully completed, as that's likely to lead
to mysterious crashes on the next boot.

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-11-09 13:32       ` Kevin O'Connor
@ 2015-11-09 20:06         ` Kevin O'Connor
  2015-11-09 20:27           ` Kevin O'Connor
  0 siblings, 1 reply; 41+ messages in thread
From: Kevin O'Connor @ 2015-11-09 20:06 UTC (permalink / raw)
  To: Xulei (Stone)
  Cc: Huangweidong (C), wangxin (U), Gonglei (Arei), seabios, qemu-devel

On Mon, Nov 09, 2015 at 08:32:53AM -0500, Kevin O'Connor wrote:
> On Fri, Nov 06, 2015 at 09:12:34AM +0000, Xulei (Stone) wrote:
> > >On Wed, Nov 04, 2015 at 08:48:20AM +0800, Gonglei wrote:
> > >I'm surprised you would see the above on a recent qemu/kvm though - as
> > >on a newer KVM I think the second reset would have to happen after
> > >HaveAttemptedReboot is set and prior to the memcpy in
> > >qemu_prep_reset() completing.  Can you verify your KVM version?
> > 
> > I've tested on KVM-3.6 and KVM-4.1.3. On both of these versions, i can 
> > see this problem. 
> > I do like this: put a HA and a watchdog mechanism in a VM. Deliberately, 
> > let this VM lose heartbeat and don't feed dog. Then, after 2 minutes, 
> > a self-defined timeout, HA mechnism will issue a internal reboot command to
> > the VM and watchdog mechanism will issue a "virsh reset" from the host. Then, 
> > aforementioned problem will occurs in high probability. 
> 
> Ah, okay.  I'm not sure what the best solution to this problem is.

After thinking about this further, I think we can move the
HaveAttemptedReboot assignment after the memcpy.  Does the SeaBIOS
patch below fix things for you?

-Kevin


commit d0e9e2cca9fa6dacd2ad07081ef09c59be1ae945
Author: Kevin O'Connor <kevin@koconnor.net>
Date:   Mon Nov 9 15:00:19 2015 -0500

    resume: Don't set HaveAttemptedReboot until after internal bios memcpy
    
    Move the check for soft reboot loops from resume.c to shadow.c and
    only set the HaveAttemptedReboot flag after restoring the BIOS image.
    This prevents a hang if an external reboot request occurs during the
    BIOS memcpy.
    
    Signed-off-by: Kevin O'Connor <kevin@koconnor.net>

diff --git a/src/fw/shadow.c b/src/fw/shadow.c
index ee87d36..f2d0d65 100644
--- a/src/fw/shadow.c
+++ b/src/fw/shadow.c
@@ -156,6 +156,8 @@ make_bios_readonly(void)
         make_bios_readonly_intel(ShadowBDF, Q35_HOST_BRIDGE_PAM0);
 }
 
+u8 HaveAttemptedReboot VARLOW;
+
 void
 qemu_prep_reset(void)
 {
@@ -163,7 +165,13 @@ qemu_prep_reset(void)
         return;
     // QEMU doesn't map 0xc0000-0xfffff back to the original rom on a
     // reset, so do that manually before invoking a hard reset.
+    if (HaveAttemptedReboot) {
+        // Hard reboot has failed - try to shutdown machine.
+        dprintf(1, "Unable to hard-reboot machine - attempting shutdown.\n");
+        apm_shutdown();
+    }
     make_bios_writable();
     memcpy(VSYMBOL(code32flat_start), VSYMBOL(code32flat_start) + BIOS_SRC_OFFSET
            , SYMBOL(code32flat_end) - SYMBOL(code32flat_start));
+    HaveAttemptedReboot = 1;
 }
diff --git a/src/resume.c b/src/resume.c
index a5465d8..afeadcf 100644
--- a/src/resume.c
+++ b/src/resume.c
@@ -114,19 +114,10 @@ s3_resume(void)
     farcall16big(&br);
 }
 
-u8 HaveAttemptedReboot VARLOW;
-
 // Attempt to invoke a hard-reboot.
 static void
 tryReboot(void)
 {
-    if (HaveAttemptedReboot) {
-        // Hard reboot has failed - try to shutdown machine.
-        dprintf(1, "Unable to hard-reboot machine - attempting shutdown.\n");
-        apm_shutdown();
-    }
-    HaveAttemptedReboot = 1;
-
     dprintf(1, "Attempting a hard reboot\n");
 
     // Setup for reset on qemu.

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-11-09 20:06         ` Kevin O'Connor
@ 2015-11-09 20:27           ` Kevin O'Connor
  2015-11-19  1:04             ` Xulei (Stone)
  0 siblings, 1 reply; 41+ messages in thread
From: Kevin O'Connor @ 2015-11-09 20:27 UTC (permalink / raw)
  To: Xulei (Stone)
  Cc: Huangweidong (C), wangxin (U), Gonglei (Arei), seabios, qemu-devel

On Mon, Nov 09, 2015 at 03:06:18PM -0500, Kevin O'Connor wrote:
> On Mon, Nov 09, 2015 at 08:32:53AM -0500, Kevin O'Connor wrote:
> > On Fri, Nov 06, 2015 at 09:12:34AM +0000, Xulei (Stone) wrote:
> > > >On Wed, Nov 04, 2015 at 08:48:20AM +0800, Gonglei wrote:
> > > >I'm surprised you would see the above on a recent qemu/kvm though - as
> > > >on a newer KVM I think the second reset would have to happen after
> > > >HaveAttemptedReboot is set and prior to the memcpy in
> > > >qemu_prep_reset() completing.  Can you verify your KVM version?
> > > 
> > > I've tested on KVM-3.6 and KVM-4.1.3. On both of these versions, i can 
> > > see this problem. 
> > > I do like this: put a HA and a watchdog mechanism in a VM. Deliberately, 
> > > let this VM lose heartbeat and don't feed dog. Then, after 2 minutes, 
> > > a self-defined timeout, HA mechnism will issue a internal reboot command to
> > > the VM and watchdog mechanism will issue a "virsh reset" from the host. Then, 
> > > aforementioned problem will occurs in high probability. 
> > 
> > Ah, okay.  I'm not sure what the best solution to this problem is.
> 
> After thinking about this further, I think we can move the
> HaveAttemptedReboot assignment after the memcpy.

The previous patch could cause corruption if the memcpy() failed.  I
think the new SeaBIOS patch below should be okay though.

-Kevin


commit 8a6e44ad5c953266d2339b3299f5fb4ff32c8cbb
Author: Kevin O'Connor <kevin@koconnor.net>
Date:   Mon Nov 9 15:00:19 2015 -0500

    resume: Make KVM soft reboot loop detection more flexible
    
    Move the check for soft reboot loops from resume.c to shadow.c and
    directly check for the case where the memcpy fails.  This prevents a
    hang if an external reboot request occurs during the BIOS memcpy.
    
    Signed-off-by: Kevin O'Connor <kevin@koconnor.net>

diff --git a/src/fw/shadow.c b/src/fw/shadow.c
index ee87d36..b2f2dd8 100644
--- a/src/fw/shadow.c
+++ b/src/fw/shadow.c
@@ -156,6 +156,8 @@ make_bios_readonly(void)
         make_bios_readonly_intel(ShadowBDF, Q35_HOST_BRIDGE_PAM0);
 }
 
+static u8 AttemptingReboot;
+
 void
 qemu_prep_reset(void)
 {
@@ -164,6 +166,19 @@ qemu_prep_reset(void)
     // QEMU doesn't map 0xc0000-0xfffff back to the original rom on a
     // reset, so do that manually before invoking a hard reset.
     make_bios_writable();
+    AttemptingReboot = 1;
+    barrier();
+    if (!AttemptingReboot)
+        goto fail;
+    barrier();
     memcpy(VSYMBOL(code32flat_start), VSYMBOL(code32flat_start) + BIOS_SRC_OFFSET
            , SYMBOL(code32flat_end) - SYMBOL(code32flat_start));
+    barrier();
+    if (AttemptingReboot)
+        goto fail;
+    return;
+fail:
+    // Attempt to restore code has failed - try to shutdown machine.
+    dprintf(1, "Unable to hard-reboot machine - attempting shutdown.\n");
+    apm_shutdown();
 }
diff --git a/src/resume.c b/src/resume.c
index a5465d8..afeadcf 100644
--- a/src/resume.c
+++ b/src/resume.c
@@ -114,19 +114,10 @@ s3_resume(void)
     farcall16big(&br);
 }
 
-u8 HaveAttemptedReboot VARLOW;
-
 // Attempt to invoke a hard-reboot.
 static void
 tryReboot(void)
 {
-    if (HaveAttemptedReboot) {
-        // Hard reboot has failed - try to shutdown machine.
-        dprintf(1, "Unable to hard-reboot machine - attempting shutdown.\n");
-        apm_shutdown();
-    }
-    HaveAttemptedReboot = 1;
-
     dprintf(1, "Attempting a hard reboot\n");
 
     // Setup for reset on qemu.

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-11-09 20:27           ` Kevin O'Connor
@ 2015-11-19  1:04             ` Xulei (Stone)
  2015-11-19 12:42               ` Xulei (Stone)
  0 siblings, 1 reply; 41+ messages in thread
From: Xulei (Stone) @ 2015-11-19  1:04 UTC (permalink / raw)
  To: Kevin O'Connor; +Cc: Huangweidong (C), Gonglei (Arei), seabios, qemu-devel

Dear Kevin,

Sorry for delayed replying. This patch works for me well. Thanks a lot!

Recently, I found another odd thing. A qemu-kvm VM is stuck at the SeaBIOS 
after self-rebooting many times. Analyzing the SeaBIOS log attached below, I
think there maybe someting wrong from this block of code:

/src/fw/smp.c

    u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
    while (cmos_smp_count != CountCPUs)
        asm volatile(
            // Release lock and allow other processors to use the stack.
            "  movl %%esp, %1\n"
            "  movl $0, %0\n"
            // Reacquire lock and take back ownership of stack.
            "1:rep ; nop\n"
            "  lock btsl $0, %0\n"
            "  jc 1b\n"
            : "+m" (SMPLock), "+m" (SMPStack)
            : : "cc", "memory");
    yield();

It seems if SeaBIOS read an incorrect number sometimes from QEMU 
through cmos 0x5f，the SeaBIOS really may be stucked. So, i wonder
what may cause this problem after a VM self-rebooting many times?

================bad SeaBIOS log===========
[2015-11-13 18:45:58] In resume (status=0)
[2015-11-13 18:45:58] In 32bit resume
[2015-11-13 18:45:58] Attempting a hard reboot
[2015-11-13 18:46:00] SeaBIOS (version rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org)
[2015-11-13 18:46:00] No Xen hypervisor found.
[2015-11-13 18:46:00] Running on QEMU (i440fx)
[2015-11-13 18:46:00] Running on KVM
[2015-11-13 18:46:00] RamSize: 0xc0000000 [cmos]
[2015-11-13 18:46:00] Relocating init from 0x000de8f0 to 0xbffaec00 (size 70464)
[2015-11-13 18:46:00] Found QEMU fw_cfg
[2015-11-13 18:46:00] RamBlock: addr 0x0000000000000000 len 0x00000000c0000000 [e820]
[2015-11-13 18:46:00] RamBlock: addr 0x0000000100000000 len 0x0000000340000000 [e820]
[2015-11-13 18:46:00] Moving pm_base to 0x600
[2015-11-13 18:46:00] boot order:
[2015-11-13 18:46:00] 1: /pci@i0cf8/scsi@e/disk@0,0
[2015-11-13 18:46:00] 2: HALT
[2015-11-13 18:46:00] CPU Mhz=2402
[2015-11-13 18:46:00] === PCI bus & bridge init ===
[2015-11-13 18:46:00] PCI: pci_bios_init_bus_rec bus = 0x0
[2015-11-13 18:46:00] === PCI device probing ===
[2015-11-13 18:46:00] Found 21 PCI devices (max PCI bus is 00)
[2015-11-13 18:46:00] === PCI new allocation pass #1 ===
[2015-11-13 18:46:00] PCI: check devices
[2015-11-13 18:46:00] === PCI new allocation pass #2 ===
[2015-11-13 18:46:00] PCI: IO: c000 - c1cf
[2015-11-13 18:46:00] PCI: 32: 00000000c0000000 - 00000000fec00000
[2015-11-13 18:46:00] PCI: map device bdf=00:1f.0  bar 0, addr 0000c000, size 00000100 [io]
[2015-11-13 18:46:00] PCI: map device bdf=00:0e.0  bar 0, addr 0000c100, size 00000040 [io]
[2015-11-13 18:46:00] PCI: map device bdf=00:0f.0  bar 0, addr 0000c140, size 00000040 [io]
[2015-11-13 18:46:00] PCI: map device bdf=00:01.2  bar 4, addr 0000c180, size 00000020 [io]
[2015-11-13 18:46:00] PCI: map device bdf=00:0d.0  bar 0, addr 0000c1a0, size 00000020 [io]
[2015-11-13 18:46:00] PCI: map device bdf=00:01.1  bar 4, addr 0000c1c0, size 00000010 [io]
[2015-11-13 18:46:00] PCI: map device bdf=00:02.0  bar 6, addr febe0000, size 00010000 [mem]
[2015-11-13 18:46:00] PCI: map device bdf=00:02.0  bar 1, addr febf0000, size 00001000 [mem]
[2015-11-13 18:46:00] PCI: map device bdf=00:0d.0  bar 1, addr febf1000, size 00001000 [mem]
[2015-11-13 18:46:00] PCI: map device bdf=00:0e.0  bar 1, addr febf2000, size 00001000 [mem]
[2015-11-13 18:46:00] PCI: map device bdf=00:0f.0  bar 1, addr febf3000, size 00001000 [mem]
[2015-11-13 18:46:00] PCI: map device bdf=00:10.0  bar 0, addr febf4000, size 00001000 [mem]
[2015-11-13 18:46:00] PCI: map device bdf=00:02.0  bar 0, addr f6000000, size 02000000 [prefmem]
[2015-11-13 18:46:00] PCI: map device bdf=00:10.0  bar 2, addr f8000000, size 01000000 [prefmem]
[2015-11-13 18:46:00] PCI: map device bdf=00:03.0  bar 2, addr f9000000, size 00800000 [prefmem]
[2015-11-13 18:46:00] PCI: map device bdf=00:04.0  bar 2, addr f9800000, size 00800000 [prefmem]
[2015-11-13 18:46:00] PCI: map device bdf=00:05.0  bar 2, addr fa000000, size 00800000 [prefmem]
[2015-11-13 18:46:00] PCI: map device bdf=00:06.0  bar 2, addr fa800000, size 00800000 [prefmem]
[2015-11-13 18:46:00] PCI: map device bdf=00:07.0  bar 2, addr fb000000, size 00800000 [prefmem]
[2015-11-13 18:46:00] PCI: map device bdf=00:08.0  bar 2, addr fb800000, size 00800000 [prefmem]
[2015-11-13 18:46:00] PCI: map device bdf=00:09.0  bar 2, addr fc000000, size 00800000 [prefmem]
[2015-11-13 18:46:00] PCI: map device bdf=00:0a.0  bar 2, addr fc800000, size 00800000 [prefmem]
[2015-11-13 18:46:00] PCI: map device bdf=00:0b.0  bar 2, addr fd000000, size 00800000 [prefmem]
[2015-11-13 18:46:00] PCI: map device bdf=00:0c.0  bar 2, addr fd800000, size 00800000 [prefmem]
[2015-11-13 18:46:00] PCI: init bdf=00:00.0 id=8086:1237
[2015-11-13 18:46:00] PCI: init bdf=00:01.0 id=8086:7000
[2015-11-13 18:46:00] PIIX3/PIIX4 init: elcr=00 0c
[2015-11-13 18:46:00] PCI: init bdf=00:01.1 id=8086:7010
[2015-11-13 18:46:00] PCI: init bdf=00:01.2 id=8086:7020
[2015-11-13 18:46:00] PCI: init bdf=00:01.3 id=8086:7113
[2015-11-13 18:46:00] Using pmtimer, ioport 0x608
[2015-11-13 18:46:00] PCI: init bdf=00:02.0 id=1013:00b8
[2015-11-13 18:46:00] PCI: init bdf=00:03.0 id=15b3:1004
[2015-11-13 18:46:00] PCI: init bdf=00:04.0 id=15b3:1004
[2015-11-13 18:46:00] PCI: init bdf=00:05.0 id=15b3:1004
[2015-11-13 18:46:00] PCI: init bdf=00:06.0 id=15b3:1004
[2015-11-13 18:46:00] PCI: init bdf=00:07.0 id=15b3:1004
[2015-11-13 18:46:00] PCI: init bdf=00:08.0 id=15b3:1004
[2015-11-13 18:46:00] PCI: init bdf=00:09.0 id=15b3:1004
[2015-11-13 18:46:00] PCI: init bdf=00:0a.0 id=15b3:1004
[2015-11-13 18:46:00] PCI: init bdf=00:0b.0 id=15b3:1004
[2015-11-13 18:46:00] PCI: init bdf=00:0c.0 id=15b3:1004
[2015-11-13 18:46:00] PCI: init bdf=00:0d.0 id=1af4:1003
[2015-11-13 18:46:00] PCI: init bdf=00:0e.0 id=1af4:1001
[2015-11-13 18:46:00] PCI: init bdf=00:0f.0 id=1af4:1001
[2015-11-13 18:46:00] PCI: init bdf=00:10.0 id=1af4:1110
[2015-11-13 18:46:00] PCI: init bdf=00:1f.0 id=1af4:8888
[2015-11-13 18:46:00] PCI: Using 00:02.0 for primary VGA
[2015-11-13 18:46:00] handle_smp: apic_id=1
[2015-11-13 18:46:00] handle_smp: apic_id=6
[2015-11-13 18:46:00] handle_smp: apic_id=7
[2015-11-13 18:46:00] handle_smp: apic_id=3
[2015-11-13 18:46:00] handle_smp: apic_id=2
[2015-11-13 18:46:00] handle_smp: apic_id=5
[2015-11-13 18:46:00] handle_smp: apic_id=4
========The End, nothing more======
>On Mon, Nov 09, 2015 at 03:06:18PM -0500, Kevin O'Connor wrote:
>> On Mon, Nov 09, 2015 at 08:32:53AM -0500, Kevin O'Connor wrote:
>> > On Fri, Nov 06, 2015 at 09:12:34AM +0000, Xulei (Stone) wrote:
>> > > >On Wed, Nov 04, 2015 at 08:48:20AM +0800, Gonglei wrote:
>> > > >I'm surprised you would see the above on a recent qemu/kvm though - as
>> > > >on a newer KVM I think the second reset would have to happen after
>> > > >HaveAttemptedReboot is set and prior to the memcpy in
>> > > >qemu_prep_reset() completing.  Can you verify your KVM version?
>> > > 
>> > > I've tested on KVM-3.6 and KVM-4.1.3. On both of these versions, i can 
>> > > see this problem. 
>> > > I do like this: put a HA and a watchdog mechanism in a VM. Deliberately, 
>> > > let this VM lose heartbeat and don't feed dog. Then, after 2 minutes, 
>> > > a self-defined timeout, HA mechnism will issue a internal reboot command to
>> > > the VM and watchdog mechanism will issue a "virsh reset" from the host. Then, 
>> > > aforementioned problem will occurs in high probability. 
>> > 
>> > Ah, okay.  I'm not sure what the best solution to this problem is.
>> 
>> After thinking about this further, I think we can move the
>> HaveAttemptedReboot assignment after the memcpy.
>
>The previous patch could cause corruption if the memcpy() failed.  I
>think the new SeaBIOS patch below should be okay though.
>
>-Kevin
>
>
>commit 8a6e44ad5c953266d2339b3299f5fb4ff32c8cbb
>Author: Kevin O'Connor <kevin@koconnor.net>
>Date:   Mon Nov 9 15:00:19 2015 -0500
>
>    resume: Make KVM soft reboot loop detection more flexible
>    
>    Move the check for soft reboot loops from resume.c to shadow.c and
>    directly check for the case where the memcpy fails.  This prevents a
>    hang if an external reboot request occurs during the BIOS memcpy.
>    
>    Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
>
>diff --git a/src/fw/shadow.c b/src/fw/shadow.c
>index ee87d36..b2f2dd8 100644
>--- a/src/fw/shadow.c
>+++ b/src/fw/shadow.c
>@@ -156,6 +156,8 @@ make_bios_readonly(void)
>         make_bios_readonly_intel(ShadowBDF, Q35_HOST_BRIDGE_PAM0);
> }
> 
>+static u8 AttemptingReboot;
>+
> void
> qemu_prep_reset(void)
> {
>@@ -164,6 +166,19 @@ qemu_prep_reset(void)
>     // QEMU doesn't map 0xc0000-0xfffff back to the original rom on a
>     // reset, so do that manually before invoking a hard reset.
>     make_bios_writable();
>+    AttemptingReboot = 1;
>+    barrier();
>+    if (!AttemptingReboot)
>+        goto fail;
>+    barrier();
>     memcpy(VSYMBOL(code32flat_start), VSYMBOL(code32flat_start) + BIOS_SRC_OFFSET
>            , SYMBOL(code32flat_end) - SYMBOL(code32flat_start));
>+    barrier();
>+    if (AttemptingReboot)
>+        goto fail;
>+    return;
>+fail:
>+    // Attempt to restore code has failed - try to shutdown machine.
>+    dprintf(1, "Unable to hard-reboot machine - attempting shutdown.\n");
>+    apm_shutdown();
> }
>diff --git a/src/resume.c b/src/resume.c
>index a5465d8..afeadcf 100644
>--- a/src/resume.c
>+++ b/src/resume.c
>@@ -114,19 +114,10 @@ s3_resume(void)
>     farcall16big(&br);
> }
> 
>-u8 HaveAttemptedReboot VARLOW;
>-
> // Attempt to invoke a hard-reboot.
> static void
> tryReboot(void)
> {
>-    if (HaveAttemptedReboot) {
>-        // Hard reboot has failed - try to shutdown machine.
>-        dprintf(1, "Unable to hard-reboot machine - attempting shutdown.\n");
>-        apm_shutdown();
>-    }
>-    HaveAttemptedReboot = 1;
>-
>     dprintf(1, "Attempting a hard reboot\n");
> 
>     // Setup for reset on qemu.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-11-19  1:04             ` Xulei (Stone)
@ 2015-11-19 12:42               ` Xulei (Stone)
  2015-11-19 13:40                 ` Kevin O'Connor
  0 siblings, 1 reply; 41+ messages in thread
From: Xulei (Stone) @ 2015-11-19 12:42 UTC (permalink / raw)
  To: Kevin O'Connor; +Cc: Huangweidong (C), Gonglei (Arei), seabios, qemu-devel

Kevin,

After deeply analyzing, i think there may be 3 possible reasons:
1)wrong CountCPUs value. It seems CountCPUs++ in handle_smp() has no lock to protect.
So, sometimes, 2 or more vcpu may get the same current value of CountCPUs. Then we'll
get a single incrementation instead of 2 or more and "while (cmos_smp_count != CountCPUs)"
will loop forever;

2)wrong cmos_smp_count value. SeaBIOS rtc reads an incorrect number?

3)yield() stuck. Is it possible that SeaBIOS is stuck during yield? I've tested, 
when yield() is running, SeaBIOS seems has not created some other threads except
the main thread. So I don't know what's the function of yield() here.?

>Dear Kevin,
>
>Sorry for delayed replying. This patch works for me well. Thanks a lot!
>
>Recently, I found another odd thing. A qemu-kvm VM is stuck at the SeaBIOS
>after self-rebooting many times. Analyzing the SeaBIOS log attached below, I
>think there maybe someting wrong from this block of code:
>
>/src/fw/smp.c
>
>    u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
>    while (cmos_smp_count != CountCPUs)
>        asm volatile(
>            // Release lock and allow other processors to use the stack.
>            "  movl %%esp, %1\n"
>            "  movl $0, %0\n"
>            // Reacquire lock and take back ownership of stack.
>            "1:rep ; nop\n"
>            "  lock btsl $0, %0\n"
>            "  jc 1b\n"
>            : "+m" (SMPLock), "+m" (SMPStack)
>            : : "cc", "memory");
>    yield();
>
>It seems if SeaBIOS read an incorrect number sometimes from QEMU
>through cmos 0x5f，the SeaBIOS really may be stucked. So, i wonder
>what may cause this problem after a VM self-rebooting many times?
>
>================bad SeaBIOS log===========
>[2015-11-13 18:45:58] In resume (status=0)
>[2015-11-13 18:45:58] In 32bit resume
>[2015-11-13 18:45:58] Attempting a hard reboot
>[2015-11-13 18:46:00] SeaBIOS (version rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org)
>[2015-11-13 18:46:00] No Xen hypervisor found.
>[2015-11-13 18:46:00] Running on QEMU (i440fx)
>[2015-11-13 18:46:00] Running on KVM
>[2015-11-13 18:46:00] RamSize: 0xc0000000 [cmos]
>[2015-11-13 18:46:00] Relocating init from 0x000de8f0 to 0xbffaec00 (size 70464)
>[2015-11-13 18:46:00] Found QEMU fw_cfg
>[2015-11-13 18:46:00] RamBlock: addr 0x0000000000000000 len 0x00000000c0000000 [e820]
>[2015-11-13 18:46:00] RamBlock: addr 0x0000000100000000 len 0x0000000340000000 [e820]
>[2015-11-13 18:46:00] Moving pm_base to 0x600
>[2015-11-13 18:46:00] boot order:
>[2015-11-13 18:46:00] 1: /pci@i0cf8/scsi@e/disk@0,0
>[2015-11-13 18:46:00] 2: HALT
>[2015-11-13 18:46:00] CPU Mhz=2402
>[2015-11-13 18:46:00] === PCI bus & bridge init ===
>[2015-11-13 18:46:00] PCI: pci_bios_init_bus_rec bus = 0x0
>[2015-11-13 18:46:00] === PCI device probing ===
>[2015-11-13 18:46:00] Found 21 PCI devices (max PCI bus is 00)
>[2015-11-13 18:46:00] === PCI new allocation pass #1 ===
>[2015-11-13 18:46:00] PCI: check devices
>[2015-11-13 18:46:00] === PCI new allocation pass #2 ===
>[2015-11-13 18:46:00] PCI: IO: c000 - c1cf
>[2015-11-13 18:46:00] PCI: 32: 00000000c0000000 - 00000000fec00000
>[2015-11-13 18:46:00] PCI: map device bdf=00:1f.0  bar 0, addr 0000c000, size 00000100 [io]
>[2015-11-13 18:46:00] PCI: map device bdf=00:0e.0  bar 0, addr 0000c100, size 00000040 [io]
>[2015-11-13 18:46:00] PCI: map device bdf=00:0f.0  bar 0, addr 0000c140, size 00000040 [io]
>[2015-11-13 18:46:00] PCI: map device bdf=00:01.2  bar 4, addr 0000c180, size 00000020 [io]
>[2015-11-13 18:46:00] PCI: map device bdf=00:0d.0  bar 0, addr 0000c1a0, size 00000020 [io]
>[2015-11-13 18:46:00] PCI: map device bdf=00:01.1  bar 4, addr 0000c1c0, size 00000010 [io]
>[2015-11-13 18:46:00] PCI: map device bdf=00:02.0  bar 6, addr febe0000, size 00010000 [mem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:02.0  bar 1, addr febf0000, size 00001000 [mem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:0d.0  bar 1, addr febf1000, size 00001000 [mem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:0e.0  bar 1, addr febf2000, size 00001000 [mem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:0f.0  bar 1, addr febf3000, size 00001000 [mem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:10.0  bar 0, addr febf4000, size 00001000 [mem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:02.0  bar 0, addr f6000000, size 02000000 [prefmem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:10.0  bar 2, addr f8000000, size 01000000 [prefmem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:03.0  bar 2, addr f9000000, size 00800000 [prefmem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:04.0  bar 2, addr f9800000, size 00800000 [prefmem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:05.0  bar 2, addr fa000000, size 00800000 [prefmem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:06.0  bar 2, addr fa800000, size 00800000 [prefmem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:07.0  bar 2, addr fb000000, size 00800000 [prefmem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:08.0  bar 2, addr fb800000, size 00800000 [prefmem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:09.0  bar 2, addr fc000000, size 00800000 [prefmem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:0a.0  bar 2, addr fc800000, size 00800000 [prefmem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:0b.0  bar 2, addr fd000000, size 00800000 [prefmem]
>[2015-11-13 18:46:00] PCI: map device bdf=00:0c.0  bar 2, addr fd800000, size 00800000 [prefmem]
>[2015-11-13 18:46:00] PCI: init bdf=00:00.0 id=8086:1237
>[2015-11-13 18:46:00] PCI: init bdf=00:01.0 id=8086:7000
>[2015-11-13 18:46:00] PIIX3/PIIX4 init: elcr=00 0c
>[2015-11-13 18:46:00] PCI: init bdf=00:01.1 id=8086:7010
>[2015-11-13 18:46:00] PCI: init bdf=00:01.2 id=8086:7020
>[2015-11-13 18:46:00] PCI: init bdf=00:01.3 id=8086:7113
>[2015-11-13 18:46:00] Using pmtimer, ioport 0x608
>[2015-11-13 18:46:00] PCI: init bdf=00:02.0 id=1013:00b8
>[2015-11-13 18:46:00] PCI: init bdf=00:03.0 id=15b3:1004
>[2015-11-13 18:46:00] PCI: init bdf=00:04.0 id=15b3:1004
>[2015-11-13 18:46:00] PCI: init bdf=00:05.0 id=15b3:1004
>[2015-11-13 18:46:00] PCI: init bdf=00:06.0 id=15b3:1004
>[2015-11-13 18:46:00] PCI: init bdf=00:07.0 id=15b3:1004
>[2015-11-13 18:46:00] PCI: init bdf=00:08.0 id=15b3:1004
>[2015-11-13 18:46:00] PCI: init bdf=00:09.0 id=15b3:1004
>[2015-11-13 18:46:00] PCI: init bdf=00:0a.0 id=15b3:1004
>[2015-11-13 18:46:00] PCI: init bdf=00:0b.0 id=15b3:1004
>[2015-11-13 18:46:00] PCI: init bdf=00:0c.0 id=15b3:1004
>[2015-11-13 18:46:00] PCI: init bdf=00:0d.0 id=1af4:1003
>[2015-11-13 18:46:00] PCI: init bdf=00:0e.0 id=1af4:1001
>[2015-11-13 18:46:00] PCI: init bdf=00:0f.0 id=1af4:1001
>[2015-11-13 18:46:00] PCI: init bdf=00:10.0 id=1af4:1110
>[2015-11-13 18:46:00] PCI: init bdf=00:1f.0 id=1af4:8888
>[2015-11-13 18:46:00] PCI: Using 00:02.0 for primary VGA
>[2015-11-13 18:46:00] handle_smp: apic_id=1
>[2015-11-13 18:46:00] handle_smp: apic_id=6
>[2015-11-13 18:46:00] handle_smp: apic_id=7
>[2015-11-13 18:46:00] handle_smp: apic_id=3
>[2015-11-13 18:46:00] handle_smp: apic_id=2
>[2015-11-13 18:46:00] handle_smp: apic_id=5
>[2015-11-13 18:46:00] handle_smp: apic_id=4
>========The End, nothing more======
>>On Mon, Nov 09, 2015 at 03:06:18PM -0500, Kevin O'Connor wrote:
>>> On Mon, Nov 09, 2015 at 08:32:53AM -0500, Kevin O'Connor wrote:
>>> > On Fri, Nov 06, 2015 at 09:12:34AM +0000, Xulei (Stone) wrote:
>>> > > >On Wed, Nov 04, 2015 at 08:48:20AM +0800, Gonglei wrote:
>>> > > >I'm surprised you would see the above on a recent qemu/kvm though - as
>>> > > >on a newer KVM I think the second reset would have to happen after
>>> > > >HaveAttemptedReboot is set and prior to the memcpy in
>>> > > >qemu_prep_reset() completing.  Can you verify your KVM version?
>>> > >
>>> > > I've tested on KVM-3.6 and KVM-4.1.3. On both of these versions, i can
>>> > > see this problem.
>>> > > I do like this: put a HA and a watchdog mechanism in a VM. Deliberately,
>>> > > let this VM lose heartbeat and don't feed dog. Then, after 2 minutes,
>>> > > a self-defined timeout, HA mechnism will issue a internal reboot command to
>>> > > the VM and watchdog mechanism will issue a "virsh reset" from the host. Then,
>>> > > aforementioned problem will occurs in high probability.
>>> >
>>> > Ah, okay.  I'm not sure what the best solution to this problem is.
>>>
>>> After thinking about this further, I think we can move the
>>> HaveAttemptedReboot assignment after the memcpy.
>>
>>The previous patch could cause corruption if the memcpy() failed.  I
>>think the new SeaBIOS patch below should be okay though.
>>
>>-Kevin
>>
>>
>>commit 8a6e44ad5c953266d2339b3299f5fb4ff32c8cbb
>>Author: Kevin O'Connor <kevin@koconnor.net>
>>Date:   Mon Nov 9 15:00:19 2015 -0500
>>
>>    resume: Make KVM soft reboot loop detection more flexible
>>   
>>    Move the check for soft reboot loops from resume.c to shadow.c and
>>    directly check for the case where the memcpy fails.  This prevents a
>>    hang if an external reboot request occurs during the BIOS memcpy.
>>   
>>    Signed-off-by: Kevin O'Connor <kevin@koconnor.net>
>>
>>diff --git a/src/fw/shadow.c b/src/fw/shadow.c
>>index ee87d36..b2f2dd8 100644
>>--- a/src/fw/shadow.c
>>+++ b/src/fw/shadow.c
>>@@ -156,6 +156,8 @@ make_bios_readonly(void)
>>         make_bios_readonly_intel(ShadowBDF, Q35_HOST_BRIDGE_PAM0);
>> }
>>
>>+static u8 AttemptingReboot;
>>+
>> void
>> qemu_prep_reset(void)
>> {
>>@@ -164,6 +166,19 @@ qemu_prep_reset(void)
>>     // QEMU doesn't map 0xc0000-0xfffff back to the original rom on a
>>     // reset, so do that manually before invoking a hard reset.
>>     make_bios_writable();
>>+    AttemptingReboot = 1;
>>+    barrier();
>>+    if (!AttemptingReboot)
>>+        goto fail;
>>+    barrier();
>>     memcpy(VSYMBOL(code32flat_start), VSYMBOL(code32flat_start) + BIOS_SRC_OFFSET
>>            , SYMBOL(code32flat_end) - SYMBOL(code32flat_start));
>>+    barrier();
>>+    if (AttemptingReboot)
>>+        goto fail;
>>+    return;
>>+fail:
>>+    // Attempt to restore code has failed - try to shutdown machine.
>>+    dprintf(1, "Unable to hard-reboot machine - attempting shutdown.\n");
>>+    apm_shutdown();
>> }
>>diff --git a/src/resume.c b/src/resume.c
>>index a5465d8..afeadcf 100644
>>--- a/src/resume.c
>>+++ b/src/resume.c
>>@@ -114,19 +114,10 @@ s3_resume(void)
>>     farcall16big(&br);
>> }
>>
>>-u8 HaveAttemptedReboot VARLOW;
>>-
>> // Attempt to invoke a hard-reboot.
>> static void
>> tryReboot(void)
>> {
>>-    if (HaveAttemptedReboot) {
>>-        // Hard reboot has failed - try to shutdown machine.
>>-        dprintf(1, "Unable to hard-reboot machine - attempting shutdown.\n");
>>-        apm_shutdown();
>>-    }
>>-    HaveAttemptedReboot = 1;
>>-
>>     dprintf(1, "Attempting a hard reboot\n");
>>
>>     // Setup for reset on qemu. 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-11-19 12:42               ` Xulei (Stone)
@ 2015-11-19 13:40                 ` Kevin O'Connor
  2015-11-20  2:05                   ` Xulei (Stone)
                                     ` (2 more replies)
  0 siblings, 3 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-11-19 13:40 UTC (permalink / raw)
  To: Xulei (Stone); +Cc: Huangweidong (C), Gonglei (Arei), seabios, qemu-devel

On Thu, Nov 19, 2015 at 12:42:50PM +0000, Xulei (Stone) wrote:
> Kevin,
> 
> After deeply analyzing, i think there may be 3 possible reasons:
> 1)wrong CountCPUs value. It seems CountCPUs++ in handle_smp() has no
> lock to protect.  So, sometimes, 2 or more vcpu may get the same
> current value of CountCPUs. Then we'll get a single incrementation
> instead of 2 or more and "while (cmos_smp_count != CountCPUs)" will
> loop forever;

The handle_smp() code is called from romlayout.S:entry_smp() which
does take a lock.  So, all of handle_smp() should run synchronous.

> 2)wrong cmos_smp_count value. SeaBIOS rtc reads an incorrect number?

Not sure - the last time there were problems in this area of the code
others used kvmtrace to try and track this down.  Since you are
getting dprintf statements, you could also try outputting
cmos_smp_count prior to the loop (see patch below).

> 3)yield() stuck. Is it possible that SeaBIOS is stuck during yield?
> I've tested, when yield() is running, SeaBIOS seems has not created
> some other threads except the main thread. So I don't know what's
> the function of yield() here.?

The yield() allows hardware interrupts to occur.  But note that
yield() isn't called in the loop - is is only called after the loop
completes.

If you are only getting this on massive repetitive reboot requests,
there are some other possible explanations:

- perhaps the SIPI is getting lost because one of the CPUs is still
  resetting or still processing a SIPI from the last reboot?

- the seabios code itself may have been corrupted if the memcpy() in
  qemu_prep_reset() got far enough along to clear HaveRunPost, but did
  not get far enough along to fully complete the memcpy().

If the failure is reproducible, the patch below could help narrow the
possibilities.

-Kevin


--- a/src/fw/smp.c
+++ b/src/fw/smp.c
@@ -125,6 +125,7 @@ smp_setup(void)
 
     // Wait for other CPUs to process the SIPI.
     u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
+    dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count);
     while (cmos_smp_count != CountCPUs)
         asm volatile(
             // Release lock and allow other processors to use the stack.
@@ -136,6 +137,7 @@ smp_setup(void)
             "  jc 1b\n"
             : "+m" (SMPLock), "+m" (SMPStack)
             : : "cc", "memory");
+    dprintf(1, "finish smp\n");
     yield();
 
     // Restore memory.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-11-19 13:40                 ` Kevin O'Connor
@ 2015-11-20  2:05                   ` Xulei (Stone)
       [not found]                   ` <33183CC9F5247A488A2544077AF19020B02B72BA@SZXEMA503-MBS.china.huawei.com>
  2015-12-19  1:08                     ` [Qemu-devel] " Gonglei (Arei)
  2 siblings, 0 replies; 41+ messages in thread
From: Xulei (Stone) @ 2015-11-20  2:05 UTC (permalink / raw)
  To: Kevin O'Connor; +Cc: Huangweidong (C), Gonglei (Arei), seabios, qemu-devel



>On Thu, Nov 19, 2015 at 12:42:50PM +0000, Xulei (Stone) wrote:
>> Kevin,
>> 
>> After deeply analyzing, i think there may be 3 possible reasons:
>> 1)wrong CountCPUs value. It seems CountCPUs++ in handle_smp() has no
>> lock to protect.  So, sometimes, 2 or more vcpu may get the same
>> current value of CountCPUs. Then we'll get a single incrementation
>> instead of 2 or more and "while (cmos_smp_count != CountCPUs)" will
>> loop forever;
>
>The handle_smp() code is called from romlayout.S:entry_smp() which
>does take a lock.  So, all of handle_smp() should run synchronous.
>

Ok, this possibility is ruled out!

>> 2)wrong cmos_smp_count value. SeaBIOS rtc reads an incorrect number?
>
>Not sure - the last time there were problems in this area of the code
>others used kvmtrace to try and track this down.  Since you are
>getting dprintf statements, you could also try outputting
>cmos_smp_count prior to the loop (see patch below).
>
I'll test again with this patch, and observe the output.
But frankly speaking, i don't think SeaBIOS may read an incorrect number.
Because, QEMU set smp_cpus value in pc_cmos_init (cmos ox5f) at the 1st
time (QEMU does not execute pc_cmos_init again during repetitive reboot).
SeaBIOS works well for many times which means there is no reason that
a time SeaBIOS suddenly reads an incorrect number.

>> 3)yield() stuck. Is it possible that SeaBIOS is stuck during yield?
>> I've tested, when yield() is running, SeaBIOS seems has not created
>> some other threads except the main thread. So I don't know what's
>> the function of yield() here.?
>
>The yield() allows hardware interrupts to occur.  But note that
>yield() isn't called in the loop - is is only called after the loop
>completes.
>
>If you are only getting this on massive repetitive reboot requests,
>there are some other possible explanations:
>
>- perhaps the SIPI is getting lost because one of the CPUs is still
>  resetting or still processing a SIPI from the last reboot?
>
Seems impossible. Seen from the qemu log and SeaBIOS log, the VM has 
booted successfully and can execute our "regular rebooting" daemon
process before the last reboot.

>- the seabios code itself may have been corrupted if the memcpy() in
>  qemu_prep_reset() got far enough along to clear HaveRunPost, but did
>  not get far enough along to fully complete the memcpy().
>
BTW, my VM reboots every 220 second (reboot time interval = 220s). I think
SeaBIOS has enough time to process all kinds of affairs, like SIPI and
memcpy().

Kevin, I want to know whether it is possible that if my VM is stuck at QEMU
(a point of pci device reset procedure? or whatever) SeaBIOS will hold at
hanle_smp() and could not printf "Found %d cpu(s) max supported %d cpu(s)"?
Is this possible?

================== bad QEMU log=======
[2015-11-13 18:45:57] qemu_devices_reset:1941 reset all devices
[2015-11-13 18:45:57] set_nmi_flag:71 set nmi val = 0
[2015-11-13 18:45:58] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1447411558, "microseconds": 650381}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "channel0"}}
[2015-11-13 18:45:58] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1447411558, "microseconds": 796285}, "event": "RESET"}
[2015-11-13 18:45:58] qemu_devices_reset:1941 reset all devices
[2015-11-13 18:45:59] set_nmi_flag:71 set nmi val = 0
[2015-11-13 18:46:00] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1447411560, "microseconds": 212196}, "event": "RESET"}
[2015-11-13 18:46:00] qemu_reset_report:749 domain is rebooting
[2015-11-13 18:46:00] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1447411558, "microseconds": 650558}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "channel3"}}

================ good QEMU log=========
[2015-11-13 18:42:12] qemu_devices_reset:1941 reset all devices
[2015-11-13 18:42:12] set_nmi_flag:71 set nmi val = 0
[2015-11-13 18:42:13] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1447411333, "microseconds": 718617}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "channel0"}}
[2015-11-13 18:42:13] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1447411333, "microseconds": 848236}, "event": "RESET"}
[2015-11-13 18:42:14] qemu_devices_reset:1941 reset all devices
[2015-11-13 18:42:14] set_nmi_flag:71 set nmi val = 0
[2015-11-13 18:42:15] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1447411335, "microseconds": 280198}, "event": "RESET"}
[2015-11-13 18:42:15] qemu_reset_report:749 domain is rebooting
[2015-11-13 18:42:15] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1447411333, "microseconds": 718794}, "event": "VSERPORT_CHANGE", "data": {"open": false, "id": "channel3"}}
[2015-11-13 18:42:15] virtio_set_status:524 virtio-blk device status is 3 that means DRIVER
[2015-11-13 18:42:15] virtio_set_status:524 virtio-blk device status is 7 that means DRIVER OK
[2015-11-13 18:42:15] virtio_set_status:524 virtio-blk device status is 3 that means DRIVER
[2015-11-13 18:42:15] virtio_set_status:524 virtio-blk device status is 7 that means DRIVER OK
[2015-11-13 18:42:23] virtio_set_status:524 virtio-serial device status is 1 that means ACKNOWLEDGE
[2015-11-13 18:42:23] virtio_set_status:524 virtio-serial device status is 3 that means DRIVER
[2015-11-13 18:42:23] handle_control_message:333 virtio serial port '-1' hanle control message event = 0, value = 1
[2015-11-13 18:42:23] send_control_event:225 virtio serial port 1 send control message event = 1, value = 1
[2015-11-13 18:42:23] send_control_event:225 virtio serial port 2 send control message event = 1, value = 1
[2015-11-13 18:42:23] send_control_event:225 virtio serial port 3 send control message event = 1, value = 1
[2015-11-13 18:42:23] send_control_event:225 virtio serial port 4 send control message event = 1, value = 1
[2015-11-13 18:42:23] virtio_set_status:524 virtio-serial device status is 7 that means DRIVER OK
[2015-11-13 18:42:23] handle_control_message:333 virtio serial port '1' hanle control message event = 3, value = 1
[2015-11-13 18:42:23] send_control_event:225 virtio serial port 1 send control message event = 6, value = 1
[2015-11-13 18:42:23] handle_control_message:333 virtio serial port '2' hanle control message event = 3, value = 1
[2015-11-13 18:42:23] send_control_event:225 virtio serial port 2 send control message event = 6, value = 1
[2015-11-13 18:42:23] handle_control_message:333 virtio serial port '3' hanle control message event = 3, value = 1
[2015-11-13 18:42:23] send_control_event:225 virtio serial port 3 send control message event = 6, value = 1
[2015-11-13 18:42:23] handle_control_message:333 virtio serial port '4' hanle control message event = 3, value = 1
[2015-11-13 18:42:23] virtio_set_status:524 virtio-blk device status is 1 that means ACKNOWLEDGE
[2015-11-13 18:42:23] virtio_set_status:524 virtio-blk device status is 1 that means ACKNOWLEDGE
[2015-11-13 18:42:23] virtio_set_status:524 virtio-blk device status is 3 that means DRIVER
[2015-11-13 18:42:23] virtio_set_status:524 virtio-blk device status is 7 that means DRIVER OK
[2015-11-13 18:42:23] virtio_set_status:524 virtio-blk device status is 3 that means DRIVER
[2015-11-13 18:42:23] virtio_set_status:524 virtio-blk device status is 7 that means DRIVER OK
[2015-11-13 18:42:30] handle_control_message:333 virtio serial port '2' hanle control message event = 6, value = 1
[2015-11-13 18:42:30] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1447411350, "microseconds": 214826}, "event": "VSERPORT_CHANGE", "data": {"open": true, "id": "channel1"}}
[2015-11-13 18:42:30] handle_control_message:333 virtio serial port '1' hanle control message event = 6, value = 1
[2015-11-13 18:42:30] handle_control_message:333 virtio serial port '3' hanle control message event = 6, value = 1
[2015-11-13 18:42:30] handle_control_message:333 virtio serial port '4' hanle control message event = 6, value = 1
[2015-11-13 18:42:31] monitor_qapi_event_emit:483 {"timestamp": {"seconds": 1447411350, "microseconds": 220665}, "event": "VSERPORT_CHANGE", "data": {"open": true, "id": "channel3"}}

>If the failure is reproducible, the patch below could help narrow the
>possibilities.
>
>-Kevin
>
>
>--- a/src/fw/smp.c
>+++ b/src/fw/smp.c
>@@ -125,6 +125,7 @@ smp_setup(void)
> 
>     // Wait for other CPUs to process the SIPI.
>     u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
>+    dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count);
>     while (cmos_smp_count != CountCPUs)
>         asm volatile(
>             // Release lock and allow other processors to use the stack.
>@@ -136,6 +137,7 @@ smp_setup(void)
>             "  jc 1b\n"
>             : "+m" (SMPLock), "+m" (SMPStack)
>             : : "cc", "memory");
>+    dprintf(1, "finish smp\n");
>     yield();
> 
>     // Restore memory.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
       [not found]                   ` <33183CC9F5247A488A2544077AF19020B02B72BA@SZXEMA503-MBS.china.huawei.com>
@ 2015-12-18 23:13                       ` Kevin O'Connor
  0 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-18 23:13 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Xulei (Stone), Paolo Bonzini, qemu-devel, seabios, Huangweidong (C), kvm

On Fri, Dec 18, 2015 at 03:04:58AM +0000, Gonglei (Arei) wrote:
> Hi Kevin & Paolo,
> 
> Luckily, I reproduced this problem last night. And I got the below log when SeaBIOS is stuck.
[...]
> [2015-12-18 10:38:10]  gonglei: finish while   
[...]
> <...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
> <...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
> <...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
> <...>-31509 [035] 154753.180077: kvm_entry: vcpu 0

This is an odd finding.  It seems to indicate that the code is caught
in an infinite irq loop once irqs are enabled.  What doesn't make
sense is that an NMI shouldn't depend on the cpu irq enable flag.
Also, I can't explain why rip would be 0x03, nor why a #UD in an
exception handler wouldn't result in a triple fault.  Maybe someone
with more kvm knowledge could help here.

I did notice that you appear to be running with SeaBIOS v1.8.1 - I
recommend you upgrade to the latest.  There were two important fixes
in this area (8b9942fa and 3156b71a).  I don't think either of these
fixes would explain the log above, but it would be best to eliminate
the possibility.

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-18 23:13                       ` Kevin O'Connor
  0 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-18 23:13 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Huangweidong (C), kvm, seabios, Xulei (Stone), qemu-devel, Paolo Bonzini

On Fri, Dec 18, 2015 at 03:04:58AM +0000, Gonglei (Arei) wrote:
> Hi Kevin & Paolo,
> 
> Luckily, I reproduced this problem last night. And I got the below log when SeaBIOS is stuck.
[...]
> [2015-12-18 10:38:10]  gonglei: finish while   
[...]
> <...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
> <...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
> <...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
> <...>-31509 [035] 154753.180077: kvm_entry: vcpu 0

This is an odd finding.  It seems to indicate that the code is caught
in an infinite irq loop once irqs are enabled.  What doesn't make
sense is that an NMI shouldn't depend on the cpu irq enable flag.
Also, I can't explain why rip would be 0x03, nor why a #UD in an
exception handler wouldn't result in a triple fault.  Maybe someone
with more kvm knowledge could help here.

I did notice that you appear to be running with SeaBIOS v1.8.1 - I
recommend you upgrade to the latest.  There were two important fixes
in this area (8b9942fa and 3156b71a).  I don't think either of these
fixes would explain the log above, but it would be best to eliminate
the possibility.

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-11-19 13:40                 ` Kevin O'Connor
@ 2015-12-19  1:08                     ` Gonglei (Arei)
       [not found]                   ` <33183CC9F5247A488A2544077AF19020B02B72BA@SZXEMA503-MBS.china.huawei.com>
  2015-12-19  1:08                     ` [Qemu-devel] " Gonglei (Arei)
  2 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-19  1:08 UTC (permalink / raw)
  To: Kevin O'Connor, Xulei (Stone), Paolo Bonzini
  Cc: seabios, Huangweidong (C), qemu-devel, kvm

Hi Kevin & Paolo,

Luckily, I reproduced this problem last night. And I got the below log when SeaBIOS is stuck.

[BTW, the whole SeaBIOS log attached]

[2015-12-18 10:38:10] >>>>>gonglei: enter smp_setup()...
[2015-12-18 10:38:10] >>>>>gonglei: begine to enable local APIC...
[2015-12-18 10:38:10] >>>>>gonglei: finish enable local APIC...
[2015-12-18 10:38:10] >>>gonglei: cmos_smp_count=8
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=5
[2015-12-18 10:38:10] ===: CountCPUs=2, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=7
[2015-12-18 10:38:10] ===: CountCPUs=3, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=1
[2015-12-18 10:38:10] ===: CountCPUs=4, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=2
[2015-12-18 10:38:10] ===: CountCPUs=5, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=4
[2015-12-18 10:38:10] ===: CountCPUs=6, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=3
[2015-12-18 10:38:10] ===: CountCPUs=7, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=6
[2015-12-18 10:38:10] ===: CountCPUs=8, SMPStack=0x6d84
[2015-12-18 10:38:10]  gonglei: finish while   

[pid 31509 is a vcpu thread used 100% cpu overhead]

# cat /proc/31509/stack  
[<ffffffffa05c337c>] vmx_vcpu_run+0x35c/0x580 [kvm_intel]
[<ffffffffa06b0f10>] em_push+0x0/0x20 [kvm]
[<ffffffffa06a30dc>] x86_emulate_instruction+0x20c/0x440 [kvm]
[<ffffffffa05c9224>] handle_exception+0xe4/0x1b58 [kvm_intel]
[<ffffffffa06a24c5>] vcpu_enter_guest+0x565/0x790 [kvm]
[<ffffffffa05bf990>] vmx_get_segment_base+0x0/0xb0 [kvm_intel]
[<ffffffffa06a2888>] __vcpu_run+0x198/0x260 [kvm]
[<ffffffffa06a3508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
[<ffffffffa068f92e>] vcpu_load+0x4e/0x80 [kvm]
[<ffffffffa068fcee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
[<ffffffff8109618d>] futex_wake+0xfd/0x110
[<ffffffff811ed57c>] security_file_permission+0x1c/0xa0
[<ffffffff8116bedb>] do_vfs_ioctl+0x8b/0x3b0
[<ffffffff8116c2a1>] sys_ioctl+0xa1/0xb0
[<ffffffff81469272>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

And kvm tracing information:

<...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180077: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180078: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180078: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180079: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180079: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180079: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180080: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180080: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180080: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180081: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180081: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180081: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180081: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180082: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180083: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180083: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180083: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180084: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180084: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180084: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180084: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180085: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180085: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180085: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180085: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180086: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306

Now, it's very clear that the guest stuck in yiled(), and then kvm encounter the exception #UD.

Do you have any thoughts? Thanks!


The Seabios patch below:

diff --git a/roms/seabios/src/boot.c b/roms/seabios/src/boot.c
index f23e9e1..552914a 100644
--- a/roms/seabios/src/boot.c
+++ b/roms/seabios/src/boot.c
@@ -93,7 +93,7 @@ glob_prefix(const char *glob, const char *str)
 static int
 find_prio(const char *glob)
 {
-    dprintf(1, "Searching bootorder for: %s\n", glob);
+    //dprintf(1, "Searching bootorder for: %s\n", glob);
     int i;
     for (i = 0; i < BootorderCount; i++)
         if (glob_prefix(glob, Bootorder[i]))
diff --git a/roms/seabios/src/fw/smp.c b/roms/seabios/src/fw/smp.c
index a466ea6..46ec607 100644
--- a/roms/seabios/src/fw/smp.c
+++ b/roms/seabios/src/fw/smp.c
@@ -46,12 +46,16 @@ int apic_id_is_present(u8 apic_id)
     return !!(FoundAPICIDs[apic_id/32] & (1ul << (apic_id % 32)));
 }
 
+// Atomic lock for shared stack across processors.
+u32 SMPLock __VISIBLE;
+u32 SMPStack __VISIBLE;
+
 void VISIBLE32FLAT
 handle_smp(void)
 {
     if (!CONFIG_QEMU)
         return;
-
+    dprintf(1, ">>> enter handle_smp...\n");
     // Enable CPU caching
     setcr0(getcr0() & ~(CR0_CD|CR0_NW));
 
@@ -70,19 +74,16 @@ handle_smp(void)
     FoundAPICIDs[apic_id/32] |= (1 << (apic_id % 32));
 
     CountCPUs++;
+    dprintf(1, "===: CountCPUs=%d, SMPStack=0x%x\n", CountCPUs, SMPStack);
 }
 
-// Atomic lock for shared stack across processors.
-u32 SMPLock __VISIBLE;
-u32 SMPStack __VISIBLE;
-
 // find and initialize the CPUs by launching a SIPI to them
 void
 smp_setup(void)
 {
     if (!CONFIG_QEMU)
         return;
-
+    dprintf(1, ">>>>>gonglei: enter smp_setup()...\n");
     ASSERT32FLAT();
     u32 eax, ebx, ecx, cpuid_features;
     cpuid(1, &eax, &ebx, &ecx, &cpuid_features);
@@ -106,7 +107,7 @@ smp_setup(void)
     u64 new = (0xea | ((u64)SEG_BIOS<<24)
                | (((u32)entry_smp - BUILD_BIOS_ADDR) << 8));
     *(u64*)BUILD_AP_BOOT_ADDR = new;
-
+    dprintf(1, ">>>>>gonglei: begine to enable local APIC...\n");
     // enable local APIC
     u32 val = readl(APIC_SVR);
     writel(APIC_SVR, val | APIC_ENABLED);
@@ -125,10 +126,21 @@ smp_setup(void)
     writel(APIC_ICR_LOW, 0x000C4500);
     u32 sipi_vector = BUILD_AP_BOOT_ADDR >> 12;
     writel(APIC_ICR_LOW, 0x000C4600 | sipi_vector);
-
+    dprintf(1, ">>>>>gonglei: finish enable local APIC...\n");
     // Wait for other CPUs to process the SIPI.
     u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
-    while (cmos_smp_count != CountCPUs)
+    dprintf(1, ">>>gonglei: cmos_smp_count=%d\n", cmos_smp_count);
+    while (cmos_smp_count != CountCPUs) {
         asm volatile(
             // Release lock and allow other processors to use the stack.
             "  movl %%esp, %1\n"
@@ -139,8 +151,11 @@ smp_setup(void)
             "  jc 1b\n"
             : "+m" (SMPLock), "+m" (SMPStack)
             : : "cc", "memory");
+       //yield();
+    }
+    dprintf(1, " gonglei: finish while\n");
     yield();
-
+    dprintf(1, " gonglei: finish yield\n");
     // Restore memory.
     *(u64*)BUILD_AP_BOOT_ADDR = old;
 
diff --git a/roms/seabios/src/misc.c b/roms/seabios/src/misc.c
index 8caaf31..77f6be3 100644
--- a/roms/seabios/src/misc.c
+++ b/roms/seabios/src/misc.c
@@ -64,6 +64,10 @@ void VISIBLE16
 handle_02(void)
 {
     debug_isr(DEBUG_ISR_02);
+    dprintf(1, "gonglei hand nmi inject, write rtc \n");
 }
 
 void
diff --git a/roms/seabios/src/stacks.c b/roms/seabios/src/stacks.c
index 1dbdfe9..c1b5203 100644
--- a/roms/seabios/src/stacks.c
+++ b/roms/seabios/src/stacks.c
@@ -174,6 +174,7 @@ call16_smm(u32 eax, u32 edx, void *func)
 static void
 call32_sloppy_prep(void)
 {
+    dprintf(1, ">>> enter call32_sloppy_prep...\n");
     // Backup cmos index register and disable nmi
     u8 cmosindex = inb(PORT_CMOS_INDEX);
     outb(cmosindex | NMI_DISABLE_BIT, PORT_CMOS_INDEX);


Regards,
-Gonglei


> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Thursday, November 19, 2015 9:41 PM
> To: Xulei (Stone)
> Cc: Gonglei (Arei); qemu-devel; seabios@seabios.org; Huangweidong (C)
> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> problem on qemu-kvm platform
> 
> On Thu, Nov 19, 2015 at 12:42:50PM +0000, Xulei (Stone) wrote:
> > Kevin,
> >
> > After deeply analyzing, i think there may be 3 possible reasons:
> > 1)wrong CountCPUs value. It seems CountCPUs++ in handle_smp() has no
> > lock to protect.  So, sometimes, 2 or more vcpu may get the same
> > current value of CountCPUs. Then we'll get a single incrementation
> > instead of 2 or more and "while (cmos_smp_count != CountCPUs)" will
> > loop forever;
> 
> The handle_smp() code is called from romlayout.S:entry_smp() which does take
> a lock.  So, all of handle_smp() should run synchronous.
> 
> > 2)wrong cmos_smp_count value. SeaBIOS rtc reads an incorrect number?
> 
> Not sure - the last time there were problems in this area of the code others
> used kvmtrace to try and track this down.  Since you are getting dprintf
> statements, you could also try outputting cmos_smp_count prior to the loop
> (see patch below).
> 
> > 3)yield() stuck. Is it possible that SeaBIOS is stuck during yield?
> > I've tested, when yield() is running, SeaBIOS seems has not created
> > some other threads except the main thread. So I don't know what's the
> > function of yield() here.?
> 
> The yield() allows hardware interrupts to occur.  But note that
> yield() isn't called in the loop - is is only called after the loop completes.
> 
> If you are only getting this on massive repetitive reboot requests, there are
> some other possible explanations:
> 
> - perhaps the SIPI is getting lost because one of the CPUs is still
>   resetting or still processing a SIPI from the last reboot?
> 
> - the seabios code itself may have been corrupted if the memcpy() in
>   qemu_prep_reset() got far enough along to clear HaveRunPost, but did
>   not get far enough along to fully complete the memcpy().
> 
> If the failure is reproducible, the patch below could help narrow the
> possibilities.
> 
> -Kevin
> 
> 
> --- a/src/fw/smp.c
> +++ b/src/fw/smp.c
> @@ -125,6 +125,7 @@ smp_setup(void)
> 
>      // Wait for other CPUs to process the SIPI.
>      u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
> +    dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count);
>      while (cmos_smp_count != CountCPUs)
>          asm volatile(
>              // Release lock and allow other processors to use the stack.
> @@ -136,6 +137,7 @@ smp_setup(void)
>              "  jc 1b\n"
>              : "+m" (SMPLock), "+m" (SMPStack)
>              : : "cc", "memory");
> +    dprintf(1, "finish smp\n");
>      yield();
> 
>      // Restore memory.

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-19  1:08                     ` Gonglei (Arei)
  0 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-19  1:08 UTC (permalink / raw)
  To: Kevin O'Connor, Xulei (Stone), Paolo Bonzini
  Cc: seabios, Huangweidong (C), qemu-devel, kvm

Hi Kevin & Paolo,

Luckily, I reproduced this problem last night. And I got the below log when SeaBIOS is stuck.

[BTW, the whole SeaBIOS log attached]

[2015-12-18 10:38:10] >>>>>gonglei: enter smp_setup()...
[2015-12-18 10:38:10] >>>>>gonglei: begine to enable local APIC...
[2015-12-18 10:38:10] >>>>>gonglei: finish enable local APIC...
[2015-12-18 10:38:10] >>>gonglei: cmos_smp_count=8
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=5
[2015-12-18 10:38:10] ===: CountCPUs=2, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=7
[2015-12-18 10:38:10] ===: CountCPUs=3, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=1
[2015-12-18 10:38:10] ===: CountCPUs=4, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=2
[2015-12-18 10:38:10] ===: CountCPUs=5, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=4
[2015-12-18 10:38:10] ===: CountCPUs=6, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=3
[2015-12-18 10:38:10] ===: CountCPUs=7, SMPStack=0x6d84
[2015-12-18 10:38:10] >>> enter handle_smp...
[2015-12-18 10:38:10] handle_smp: apic_id=6
[2015-12-18 10:38:10] ===: CountCPUs=8, SMPStack=0x6d84
[2015-12-18 10:38:10]  gonglei: finish while   

[pid 31509 is a vcpu thread used 100% cpu overhead]

# cat /proc/31509/stack  
[<ffffffffa05c337c>] vmx_vcpu_run+0x35c/0x580 [kvm_intel]
[<ffffffffa06b0f10>] em_push+0x0/0x20 [kvm]
[<ffffffffa06a30dc>] x86_emulate_instruction+0x20c/0x440 [kvm]
[<ffffffffa05c9224>] handle_exception+0xe4/0x1b58 [kvm_intel]
[<ffffffffa06a24c5>] vcpu_enter_guest+0x565/0x790 [kvm]
[<ffffffffa05bf990>] vmx_get_segment_base+0x0/0xb0 [kvm_intel]
[<ffffffffa06a2888>] __vcpu_run+0x198/0x260 [kvm]
[<ffffffffa06a3508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
[<ffffffffa068f92e>] vcpu_load+0x4e/0x80 [kvm]
[<ffffffffa068fcee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
[<ffffffff8109618d>] futex_wake+0xfd/0x110
[<ffffffff811ed57c>] security_file_permission+0x1c/0xa0
[<ffffffff8116bedb>] do_vfs_ioctl+0x8b/0x3b0
[<ffffffff8116c2a1>] sys_ioctl+0xa1/0xb0
[<ffffffff81469272>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

And kvm tracing information:

<...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180077: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180078: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180078: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180079: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180079: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180079: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180080: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180080: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180080: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180081: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180081: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180081: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180081: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180082: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180083: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180083: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180083: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180084: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180084: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180084: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180084: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180085: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180085: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180085: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180085: kvm_entry: vcpu 0
<...>-31509 [035] 154753.180086: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306

Now, it's very clear that the guest stuck in yiled(), and then kvm encounter the exception #UD.

Do you have any thoughts? Thanks!


The Seabios patch below:

diff --git a/roms/seabios/src/boot.c b/roms/seabios/src/boot.c
index f23e9e1..552914a 100644
--- a/roms/seabios/src/boot.c
+++ b/roms/seabios/src/boot.c
@@ -93,7 +93,7 @@ glob_prefix(const char *glob, const char *str)
 static int
 find_prio(const char *glob)
 {
-    dprintf(1, "Searching bootorder for: %s\n", glob);
+    //dprintf(1, "Searching bootorder for: %s\n", glob);
     int i;
     for (i = 0; i < BootorderCount; i++)
         if (glob_prefix(glob, Bootorder[i]))
diff --git a/roms/seabios/src/fw/smp.c b/roms/seabios/src/fw/smp.c
index a466ea6..46ec607 100644
--- a/roms/seabios/src/fw/smp.c
+++ b/roms/seabios/src/fw/smp.c
@@ -46,12 +46,16 @@ int apic_id_is_present(u8 apic_id)
     return !!(FoundAPICIDs[apic_id/32] & (1ul << (apic_id % 32)));
 }
 
+// Atomic lock for shared stack across processors.
+u32 SMPLock __VISIBLE;
+u32 SMPStack __VISIBLE;
+
 void VISIBLE32FLAT
 handle_smp(void)
 {
     if (!CONFIG_QEMU)
         return;
-
+    dprintf(1, ">>> enter handle_smp...\n");
     // Enable CPU caching
     setcr0(getcr0() & ~(CR0_CD|CR0_NW));
 
@@ -70,19 +74,16 @@ handle_smp(void)
     FoundAPICIDs[apic_id/32] |= (1 << (apic_id % 32));
 
     CountCPUs++;
+    dprintf(1, "===: CountCPUs=%d, SMPStack=0x%x\n", CountCPUs, SMPStack);
 }
 
-// Atomic lock for shared stack across processors.
-u32 SMPLock __VISIBLE;
-u32 SMPStack __VISIBLE;
-
 // find and initialize the CPUs by launching a SIPI to them
 void
 smp_setup(void)
 {
     if (!CONFIG_QEMU)
         return;
-
+    dprintf(1, ">>>>>gonglei: enter smp_setup()...\n");
     ASSERT32FLAT();
     u32 eax, ebx, ecx, cpuid_features;
     cpuid(1, &eax, &ebx, &ecx, &cpuid_features);
@@ -106,7 +107,7 @@ smp_setup(void)
     u64 new = (0xea | ((u64)SEG_BIOS<<24)
                | (((u32)entry_smp - BUILD_BIOS_ADDR) << 8));
     *(u64*)BUILD_AP_BOOT_ADDR = new;
-
+    dprintf(1, ">>>>>gonglei: begine to enable local APIC...\n");
     // enable local APIC
     u32 val = readl(APIC_SVR);
     writel(APIC_SVR, val | APIC_ENABLED);
@@ -125,10 +126,21 @@ smp_setup(void)
     writel(APIC_ICR_LOW, 0x000C4500);
     u32 sipi_vector = BUILD_AP_BOOT_ADDR >> 12;
     writel(APIC_ICR_LOW, 0x000C4600 | sipi_vector);
-
+    dprintf(1, ">>>>>gonglei: finish enable local APIC...\n");
     // Wait for other CPUs to process the SIPI.
     u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
-    while (cmos_smp_count != CountCPUs)
+    dprintf(1, ">>>gonglei: cmos_smp_count=%d\n", cmos_smp_count);
+    while (cmos_smp_count != CountCPUs) {
         asm volatile(
             // Release lock and allow other processors to use the stack.
             "  movl %%esp, %1\n"
@@ -139,8 +151,11 @@ smp_setup(void)
             "  jc 1b\n"
             : "+m" (SMPLock), "+m" (SMPStack)
             : : "cc", "memory");
+       //yield();
+    }
+    dprintf(1, " gonglei: finish while\n");
     yield();
-
+    dprintf(1, " gonglei: finish yield\n");
     // Restore memory.
     *(u64*)BUILD_AP_BOOT_ADDR = old;
 
diff --git a/roms/seabios/src/misc.c b/roms/seabios/src/misc.c
index 8caaf31..77f6be3 100644
--- a/roms/seabios/src/misc.c
+++ b/roms/seabios/src/misc.c
@@ -64,6 +64,10 @@ void VISIBLE16
 handle_02(void)
 {
     debug_isr(DEBUG_ISR_02);
+    dprintf(1, "gonglei hand nmi inject, write rtc \n");
 }
 
 void
diff --git a/roms/seabios/src/stacks.c b/roms/seabios/src/stacks.c
index 1dbdfe9..c1b5203 100644
--- a/roms/seabios/src/stacks.c
+++ b/roms/seabios/src/stacks.c
@@ -174,6 +174,7 @@ call16_smm(u32 eax, u32 edx, void *func)
 static void
 call32_sloppy_prep(void)
 {
+    dprintf(1, ">>> enter call32_sloppy_prep...\n");
     // Backup cmos index register and disable nmi
     u8 cmosindex = inb(PORT_CMOS_INDEX);
     outb(cmosindex | NMI_DISABLE_BIT, PORT_CMOS_INDEX);


Regards,
-Gonglei


> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Thursday, November 19, 2015 9:41 PM
> To: Xulei (Stone)
> Cc: Gonglei (Arei); qemu-devel; seabios@seabios.org; Huangweidong (C)
> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> problem on qemu-kvm platform
> 
> On Thu, Nov 19, 2015 at 12:42:50PM +0000, Xulei (Stone) wrote:
> > Kevin,
> >
> > After deeply analyzing, i think there may be 3 possible reasons:
> > 1)wrong CountCPUs value. It seems CountCPUs++ in handle_smp() has no
> > lock to protect.  So, sometimes, 2 or more vcpu may get the same
> > current value of CountCPUs. Then we'll get a single incrementation
> > instead of 2 or more and "while (cmos_smp_count != CountCPUs)" will
> > loop forever;
> 
> The handle_smp() code is called from romlayout.S:entry_smp() which does take
> a lock.  So, all of handle_smp() should run synchronous.
> 
> > 2)wrong cmos_smp_count value. SeaBIOS rtc reads an incorrect number?
> 
> Not sure - the last time there were problems in this area of the code others
> used kvmtrace to try and track this down.  Since you are getting dprintf
> statements, you could also try outputting cmos_smp_count prior to the loop
> (see patch below).
> 
> > 3)yield() stuck. Is it possible that SeaBIOS is stuck during yield?
> > I've tested, when yield() is running, SeaBIOS seems has not created
> > some other threads except the main thread. So I don't know what's the
> > function of yield() here.?
> 
> The yield() allows hardware interrupts to occur.  But note that
> yield() isn't called in the loop - is is only called after the loop completes.
> 
> If you are only getting this on massive repetitive reboot requests, there are
> some other possible explanations:
> 
> - perhaps the SIPI is getting lost because one of the CPUs is still
>   resetting or still processing a SIPI from the last reboot?
> 
> - the seabios code itself may have been corrupted if the memcpy() in
>   qemu_prep_reset() got far enough along to clear HaveRunPost, but did
>   not get far enough along to fully complete the memcpy().
> 
> If the failure is reproducible, the patch below could help narrow the
> possibilities.
> 
> -Kevin
> 
> 
> --- a/src/fw/smp.c
> +++ b/src/fw/smp.c
> @@ -125,6 +125,7 @@ smp_setup(void)
> 
>      // Wait for other CPUs to process the SIPI.
>      u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
> +    dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count);
>      while (cmos_smp_count != CountCPUs)
>          asm volatile(
>              // Release lock and allow other processors to use the stack.
> @@ -136,6 +137,7 @@ smp_setup(void)
>              "  jc 1b\n"
>              : "+m" (SMPLock), "+m" (SMPStack)
>              : : "cc", "memory");
> +    dprintf(1, "finish smp\n");
>      yield();
> 
>      // Restore memory.

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-18 23:13                       ` Kevin O'Connor
@ 2015-12-19  6:28                         ` Gonglei (Arei)
  -1 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-19  6:28 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

>
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Saturday, December 19, 2015 7:13 AM
> To: Gonglei (Arei)
> Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> Huangweidong (C); kvm@vger.kernel.org
> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> problem on qemu-kvm platform
> 
> On Fri, Dec 18, 2015 at 03:04:58AM +0000, Gonglei (Arei) wrote:
> > Hi Kevin & Paolo,
> >
> > Luckily, I reproduced this problem last night. And I got the below log when
> SeaBIOS is stuck.
> [...]
> > [2015-12-18 10:38:10]  gonglei: finish while
> [...]
> > <...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3
> info 0 80000306
> > <...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
> > <...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
> > <...>-31509 [035] 154753.180077: kvm_entry: vcpu 0
> 
> This is an odd finding.  It seems to indicate that the code is caught
> in an infinite irq loop once irqs are enabled.  What doesn't make
> sense is that an NMI shouldn't depend on the cpu irq enable flag.
> Also, I can't explain why rip would be 0x03, nor why a #UD in an
> exception handler wouldn't result in a triple fault.  Maybe someone
> with more kvm knowledge could help here.
> 

Ccing Paolo and Radim.

> I did notice that you appear to be running with SeaBIOS v1.8.1 - I
> recommend you upgrade to the latest.  There were two important fixes
> in this area (8b9942fa and 3156b71a).  I don't think either of these
> fixes would explain the log above, but it would be best to eliminate
> the possibility.
> 
We can reproduce the problem using latest SeaBIOS too. :(


Regards,
-Gonglei

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-19  6:28                         ` Gonglei (Arei)
  0 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-19  6:28 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

>
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Saturday, December 19, 2015 7:13 AM
> To: Gonglei (Arei)
> Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> Huangweidong (C); kvm@vger.kernel.org
> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> problem on qemu-kvm platform
> 
> On Fri, Dec 18, 2015 at 03:04:58AM +0000, Gonglei (Arei) wrote:
> > Hi Kevin & Paolo,
> >
> > Luckily, I reproduced this problem last night. And I got the below log when
> SeaBIOS is stuck.
> [...]
> > [2015-12-18 10:38:10]  gonglei: finish while
> [...]
> > <...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3
> info 0 80000306
> > <...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
> > <...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
> > <...>-31509 [035] 154753.180077: kvm_entry: vcpu 0
> 
> This is an odd finding.  It seems to indicate that the code is caught
> in an infinite irq loop once irqs are enabled.  What doesn't make
> sense is that an NMI shouldn't depend on the cpu irq enable flag.
> Also, I can't explain why rip would be 0x03, nor why a #UD in an
> exception handler wouldn't result in a triple fault.  Maybe someone
> with more kvm knowledge could help here.
> 

Ccing Paolo and Radim.

> I did notice that you appear to be running with SeaBIOS v1.8.1 - I
> recommend you upgrade to the latest.  There were two important fixes
> in this area (8b9942fa and 3156b71a).  I don't think either of these
> fixes would explain the log above, but it would be best to eliminate
> the possibility.
> 
We can reproduce the problem using latest SeaBIOS too. :(


Regards,
-Gonglei

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-18 23:13                       ` Kevin O'Connor
@ 2015-12-19 12:03                         ` Gonglei (Arei)
  -1 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-19 12:03 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

Hi Kevin,


> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> 
> On Fri, Dec 18, 2015 at 03:04:58AM +0000, Gonglei (Arei) wrote:
> > Hi Kevin & Paolo,
> >
> > Luckily, I reproduced this problem last night. And I got the below log when
> SeaBIOS is stuck.
> [...]
> > [2015-12-18 10:38:10]  gonglei: finish while
> [...]
> > <...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3
> info 0 80000306
> > <...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
> > <...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
> > <...>-31509 [035] 154753.180077: kvm_entry: vcpu 0
> 
> This is an odd finding.  It seems to indicate that the code is caught
> in an infinite irq loop once irqs are enabled.  What doesn't make
> sense is that an NMI shouldn't depend on the cpu irq enable flag.

Maybe the root cause is not NMI but INTR, so yield() can open hardware interrupt,
And then execute interrupt handler, but the interrupt handler make the SeaBIOS
stack broken, so that the BSP can't execute the instruction and occur exception,
VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs except
the surface phenomenon.

Kevin, can we drop yield() in smp_setup() ?

diff --git a/src/fw/smp.c b/src/fw/smp.c
index 579acdb..dd23eda 100644
--- a/src/fw/smp.c
+++ b/src/fw/smp.c
@@ -136,7 +136,6 @@ smp_setup(void)
             "  jc 1b\n"
             : "+m" (SMPLock), "+m" (SMPStack)
             : : "cc", "memory");
-    yield();
 
     // Restore memory.
     *(u64*)BUILD_AP_BOOT_ADDR = old;

Is it really useful and allowable for SeaBIOS? Maybe for other components?
I'm not sure. Because we found that when SeaBIOS is booting, if we inject a
NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same with
the current problem.


Regards,
-Gonglei

> Also, I can't explain why rip would be 0x03, nor why a #UD in an
> exception handler wouldn't result in a triple fault.  Maybe someone
> with more kvm knowledge could help here.
> 
> I did notice that you appear to be running with SeaBIOS v1.8.1 - I
> recommend you upgrade to the latest.  There were two important fixes
> in this area (8b9942fa and 3156b71a).  I don't think either of these
> fixes would explain the log above, but it would be best to eliminate
> the possibility.
> 
> -Kevin

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-19 12:03                         ` Gonglei (Arei)
  0 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-19 12:03 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

Hi Kevin,


> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> 
> On Fri, Dec 18, 2015 at 03:04:58AM +0000, Gonglei (Arei) wrote:
> > Hi Kevin & Paolo,
> >
> > Luckily, I reproduced this problem last night. And I got the below log when
> SeaBIOS is stuck.
> [...]
> > [2015-12-18 10:38:10]  gonglei: finish while
> [...]
> > <...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3
> info 0 80000306
> > <...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
> > <...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
> > <...>-31509 [035] 154753.180077: kvm_entry: vcpu 0
> 
> This is an odd finding.  It seems to indicate that the code is caught
> in an infinite irq loop once irqs are enabled.  What doesn't make
> sense is that an NMI shouldn't depend on the cpu irq enable flag.

Maybe the root cause is not NMI but INTR, so yield() can open hardware interrupt,
And then execute interrupt handler, but the interrupt handler make the SeaBIOS
stack broken, so that the BSP can't execute the instruction and occur exception,
VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs except
the surface phenomenon.

Kevin, can we drop yield() in smp_setup() ?

diff --git a/src/fw/smp.c b/src/fw/smp.c
index 579acdb..dd23eda 100644
--- a/src/fw/smp.c
+++ b/src/fw/smp.c
@@ -136,7 +136,6 @@ smp_setup(void)
             "  jc 1b\n"
             : "+m" (SMPLock), "+m" (SMPStack)
             : : "cc", "memory");
-    yield();
 
     // Restore memory.
     *(u64*)BUILD_AP_BOOT_ADDR = old;

Is it really useful and allowable for SeaBIOS? Maybe for other components?
I'm not sure. Because we found that when SeaBIOS is booting, if we inject a
NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same with
the current problem.


Regards,
-Gonglei

> Also, I can't explain why rip would be 0x03, nor why a #UD in an
> exception handler wouldn't result in a triple fault.  Maybe someone
> with more kvm knowledge could help here.
> 
> I did notice that you appear to be running with SeaBIOS v1.8.1 - I
> recommend you upgrade to the latest.  There were two important fixes
> in this area (8b9942fa and 3156b71a).  I don't think either of these
> fixes would explain the log above, but it would be best to eliminate
> the possibility.
> 
> -Kevin

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-19 12:03                         ` [Qemu-devel] " Gonglei (Arei)
@ 2015-12-19 15:11                           ` Kevin O'Connor
  -1 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-19 15:11 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Xulei (Stone),
	Paolo Bonzini, qemu-devel, seabios, Huangweidong (C),
	kvm, Radim Krcmar

On Sat, Dec 19, 2015 at 12:03:15PM +0000, Gonglei (Arei) wrote:
> Maybe the root cause is not NMI but INTR, so yield() can open hardware interrupt,
> And then execute interrupt handler, but the interrupt handler make the SeaBIOS
> stack broken, so that the BSP can't execute the instruction and occur exception,
> VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs except
> the surface phenomenon.

I can't see any reason why allowing interrupts at this location would
be a problem.

> Kevin, can we drop yield() in smp_setup() ?

It's possible to eliminate this instance of yield, but I think it
would just push the crash to the next time interrupts are enabled.

> Is it really useful and allowable for SeaBIOS? Maybe for other components?
> I'm not sure. Because we found that when SeaBIOS is booting, if we inject a
> NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same with
> the current problem.

If you apply the patches you had to prevent that NMI crash problem,
does it also prevent the above crash?

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-19 15:11                           ` Kevin O'Connor
  0 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-19 15:11 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

On Sat, Dec 19, 2015 at 12:03:15PM +0000, Gonglei (Arei) wrote:
> Maybe the root cause is not NMI but INTR, so yield() can open hardware interrupt,
> And then execute interrupt handler, but the interrupt handler make the SeaBIOS
> stack broken, so that the BSP can't execute the instruction and occur exception,
> VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs except
> the surface phenomenon.

I can't see any reason why allowing interrupts at this location would
be a problem.

> Kevin, can we drop yield() in smp_setup() ?

It's possible to eliminate this instance of yield, but I think it
would just push the crash to the next time interrupts are enabled.

> Is it really useful and allowable for SeaBIOS? Maybe for other components?
> I'm not sure. Because we found that when SeaBIOS is booting, if we inject a
> NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same with
> the current problem.

If you apply the patches you had to prevent that NMI crash problem,
does it also prevent the above crash?

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-19 15:11                           ` Kevin O'Connor
@ 2015-12-20  9:49                             ` Gonglei (Arei)
  -1 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-20  9:49 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini


> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Saturday, December 19, 2015 11:12 PM
> On Sat, Dec 19, 2015 at 12:03:15PM +0000, Gonglei (Arei) wrote:
> > Maybe the root cause is not NMI but INTR, so yield() can open hardware
> interrupt,
> > And then execute interrupt handler, but the interrupt handler make the
> SeaBIOS
> > stack broken, so that the BSP can't execute the instruction and occur
> exception,
> > VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs except
> > the surface phenomenon.
> 
> I can't see any reason why allowing interrupts at this location would
> be a problem.
> 
Does it have any relationship with *extra stack* of SeaBIOS?

> > Kevin, can we drop yield() in smp_setup() ?
> 
> It's possible to eliminate this instance of yield, but I think it
> would just push the crash to the next time interrupts are enabled.
> 
Perhaps. I'm not sure.

> > Is it really useful and allowable for SeaBIOS? Maybe for other components?
> > I'm not sure. Because we found that when SeaBIOS is booting, if we inject a
> > NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same with
> > the current problem.
> 
> If you apply the patches you had to prevent that NMI crash problem,
> does it also prevent the above crash?
> 
Yes, but we cannot prevent the NMI injection (though I'll submit some patches to
forbid users' NMI injection after NMI_EN disabled by RTC bit7 of port 0x70).


Regards,
-Gonglei

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-20  9:49                             ` Gonglei (Arei)
  0 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-20  9:49 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini


> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Saturday, December 19, 2015 11:12 PM
> On Sat, Dec 19, 2015 at 12:03:15PM +0000, Gonglei (Arei) wrote:
> > Maybe the root cause is not NMI but INTR, so yield() can open hardware
> interrupt,
> > And then execute interrupt handler, but the interrupt handler make the
> SeaBIOS
> > stack broken, so that the BSP can't execute the instruction and occur
> exception,
> > VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs except
> > the surface phenomenon.
> 
> I can't see any reason why allowing interrupts at this location would
> be a problem.
> 
Does it have any relationship with *extra stack* of SeaBIOS?

> > Kevin, can we drop yield() in smp_setup() ?
> 
> It's possible to eliminate this instance of yield, but I think it
> would just push the crash to the next time interrupts are enabled.
> 
Perhaps. I'm not sure.

> > Is it really useful and allowable for SeaBIOS? Maybe for other components?
> > I'm not sure. Because we found that when SeaBIOS is booting, if we inject a
> > NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same with
> > the current problem.
> 
> If you apply the patches you had to prevent that NMI crash problem,
> does it also prevent the above crash?
> 
Yes, but we cannot prevent the NMI injection (though I'll submit some patches to
forbid users' NMI injection after NMI_EN disabled by RTC bit7 of port 0x70).


Regards,
-Gonglei

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-20  9:49                             ` [Qemu-devel] " Gonglei (Arei)
@ 2015-12-20 14:33                               ` Kevin O'Connor
  -1 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-20 14:33 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Xulei (Stone),
	Paolo Bonzini, qemu-devel, seabios, Huangweidong (C),
	kvm, Radim Krcmar

On Sun, Dec 20, 2015 at 09:49:54AM +0000, Gonglei (Arei) wrote:
> > From: Kevin O'Connor [mailto:kevin@koconnor.net]
> > Sent: Saturday, December 19, 2015 11:12 PM
> > On Sat, Dec 19, 2015 at 12:03:15PM +0000, Gonglei (Arei) wrote:
> > > Maybe the root cause is not NMI but INTR, so yield() can open hardware
> > interrupt,
> > > And then execute interrupt handler, but the interrupt handler make the
> > SeaBIOS
> > > stack broken, so that the BSP can't execute the instruction and occur
> > exception,
> > > VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs except
> > > the surface phenomenon.
> > 
> > I can't see any reason why allowing interrupts at this location would
> > be a problem.
> > 
> Does it have any relationship with *extra stack* of SeaBIOS?

None that I can see.  Also, the kvm trace seems to show the code
trying to execute at rip=0x03 - that will crash long before the extra
stack is used.

> > > Kevin, can we drop yield() in smp_setup() ?
> > 
> > It's possible to eliminate this instance of yield, but I think it
> > would just push the crash to the next time interrupts are enabled.
> > 
> Perhaps. I'm not sure.
> 
> > > Is it really useful and allowable for SeaBIOS? Maybe for other components?
> > > I'm not sure. Because we found that when SeaBIOS is booting, if we inject a
> > > NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same with
> > > the current problem.
> > 
> > If you apply the patches you had to prevent that NMI crash problem,
> > does it also prevent the above crash?
> > 
> Yes, but we cannot prevent the NMI injection (though I'll submit some patches to
> forbid users' NMI injection after NMI_EN disabled by RTC bit7 of port 0x70).
> 

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-20 14:33                               ` Kevin O'Connor
  0 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-20 14:33 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

On Sun, Dec 20, 2015 at 09:49:54AM +0000, Gonglei (Arei) wrote:
> > From: Kevin O'Connor [mailto:kevin@koconnor.net]
> > Sent: Saturday, December 19, 2015 11:12 PM
> > On Sat, Dec 19, 2015 at 12:03:15PM +0000, Gonglei (Arei) wrote:
> > > Maybe the root cause is not NMI but INTR, so yield() can open hardware
> > interrupt,
> > > And then execute interrupt handler, but the interrupt handler make the
> > SeaBIOS
> > > stack broken, so that the BSP can't execute the instruction and occur
> > exception,
> > > VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs except
> > > the surface phenomenon.
> > 
> > I can't see any reason why allowing interrupts at this location would
> > be a problem.
> > 
> Does it have any relationship with *extra stack* of SeaBIOS?

None that I can see.  Also, the kvm trace seems to show the code
trying to execute at rip=0x03 - that will crash long before the extra
stack is used.

> > > Kevin, can we drop yield() in smp_setup() ?
> > 
> > It's possible to eliminate this instance of yield, but I think it
> > would just push the crash to the next time interrupts are enabled.
> > 
> Perhaps. I'm not sure.
> 
> > > Is it really useful and allowable for SeaBIOS? Maybe for other components?
> > > I'm not sure. Because we found that when SeaBIOS is booting, if we inject a
> > > NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same with
> > > the current problem.
> > 
> > If you apply the patches you had to prevent that NMI crash problem,
> > does it also prevent the above crash?
> > 
> Yes, but we cannot prevent the NMI injection (though I'll submit some patches to
> forbid users' NMI injection after NMI_EN disabled by RTC bit7 of port 0x70).
> 

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-20 14:33                               ` Kevin O'Connor
@ 2015-12-21  9:41                                 ` Gonglei (Arei)
  -1 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-21  9:41 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

Dear Kevin,

> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Sunday, December 20, 2015 10:33 PM
> To: Gonglei (Arei)
> Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> problem on qemu-kvm platform
> 
> On Sun, Dec 20, 2015 at 09:49:54AM +0000, Gonglei (Arei) wrote:
> > > From: Kevin O'Connor [mailto:kevin@koconnor.net]
> > > Sent: Saturday, December 19, 2015 11:12 PM
> > > On Sat, Dec 19, 2015 at 12:03:15PM +0000, Gonglei (Arei) wrote:
> > > > Maybe the root cause is not NMI but INTR, so yield() can open hardware
> > > interrupt,
> > > > And then execute interrupt handler, but the interrupt handler make the
> > > SeaBIOS
> > > > stack broken, so that the BSP can't execute the instruction and occur
> > > exception,
> > > > VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs
> except
> > > > the surface phenomenon.
> > >
> > > I can't see any reason why allowing interrupts at this location would
> > > be a problem.
> > >
> > Does it have any relationship with *extra stack* of SeaBIOS?
> 
> None that I can see.  Also, the kvm trace seems to show the code
> trying to execute at rip=0x03 - that will crash long before the extra
> stack is used.
> 
When the gurb of OS is booting, then the softirq and C function send_disk_op()
may use extra stack of SeaBIOS. If we inject a NMI, romlayout.S: irqentry_extrastack
is invoked, and the extra stack will be used again. And the stack of first calling
will be broken, so that the SeaBIOS stuck. 

You can easily reproduce the problem.

1. start on guest
2. reset the guest
3. inject a NMI when the guest show the grub surface
4. then the guest stuck

If we disabled extra stack by setting

 CONFIG_ENTRY_EXTRASTACK=n

Then the problem is gone.

Besides, I have another thought:

Is it possible when one cpu is using the extra stack, but other cpus (APs)
still be waked up by hardware interrupt after yield() or br->flags = F_IF 
and used the extra stack again?


Regards,
-Gonglei

> > > > Kevin, can we drop yield() in smp_setup() ?
> > >
> > > It's possible to eliminate this instance of yield, but I think it
> > > would just push the crash to the next time interrupts are enabled.
> > >
> > Perhaps. I'm not sure.
> >
> > > > Is it really useful and allowable for SeaBIOS? Maybe for other
> components?
> > > > I'm not sure. Because we found that when SeaBIOS is booting, if we inject
> a
> > > > NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same
> with
> > > > the current problem.
> > >
> > > If you apply the patches you had to prevent that NMI crash problem,
> > > does it also prevent the above crash?
> > >
> > Yes, but we cannot prevent the NMI injection (though I'll submit some
> patches to
> > forbid users' NMI injection after NMI_EN disabled by RTC bit7 of port 0x70).
> >
> 
> -Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-21  9:41                                 ` Gonglei (Arei)
  0 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-21  9:41 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

Dear Kevin,

> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Sunday, December 20, 2015 10:33 PM
> To: Gonglei (Arei)
> Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> problem on qemu-kvm platform
> 
> On Sun, Dec 20, 2015 at 09:49:54AM +0000, Gonglei (Arei) wrote:
> > > From: Kevin O'Connor [mailto:kevin@koconnor.net]
> > > Sent: Saturday, December 19, 2015 11:12 PM
> > > On Sat, Dec 19, 2015 at 12:03:15PM +0000, Gonglei (Arei) wrote:
> > > > Maybe the root cause is not NMI but INTR, so yield() can open hardware
> > > interrupt,
> > > > And then execute interrupt handler, but the interrupt handler make the
> > > SeaBIOS
> > > > stack broken, so that the BSP can't execute the instruction and occur
> > > exception,
> > > > VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs
> except
> > > > the surface phenomenon.
> > >
> > > I can't see any reason why allowing interrupts at this location would
> > > be a problem.
> > >
> > Does it have any relationship with *extra stack* of SeaBIOS?
> 
> None that I can see.  Also, the kvm trace seems to show the code
> trying to execute at rip=0x03 - that will crash long before the extra
> stack is used.
> 
When the gurb of OS is booting, then the softirq and C function send_disk_op()
may use extra stack of SeaBIOS. If we inject a NMI, romlayout.S: irqentry_extrastack
is invoked, and the extra stack will be used again. And the stack of first calling
will be broken, so that the SeaBIOS stuck. 

You can easily reproduce the problem.

1. start on guest
2. reset the guest
3. inject a NMI when the guest show the grub surface
4. then the guest stuck

If we disabled extra stack by setting

 CONFIG_ENTRY_EXTRASTACK=n

Then the problem is gone.

Besides, I have another thought:

Is it possible when one cpu is using the extra stack, but other cpus (APs)
still be waked up by hardware interrupt after yield() or br->flags = F_IF 
and used the extra stack again?


Regards,
-Gonglei

> > > > Kevin, can we drop yield() in smp_setup() ?
> > >
> > > It's possible to eliminate this instance of yield, but I think it
> > > would just push the crash to the next time interrupts are enabled.
> > >
> > Perhaps. I'm not sure.
> >
> > > > Is it really useful and allowable for SeaBIOS? Maybe for other
> components?
> > > > I'm not sure. Because we found that when SeaBIOS is booting, if we inject
> a
> > > > NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same
> with
> > > > the current problem.
> > >
> > > If you apply the patches you had to prevent that NMI crash problem,
> > > does it also prevent the above crash?
> > >
> > Yes, but we cannot prevent the NMI injection (though I'll submit some
> patches to
> > forbid users' NMI injection after NMI_EN disabled by RTC bit7 of port 0x70).
> >
> 
> -Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-21  9:41                                 ` [Qemu-devel] " Gonglei (Arei)
@ 2015-12-21 18:47                                   ` Kevin O'Connor
  -1 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-21 18:47 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

On Mon, Dec 21, 2015 at 09:41:32AM +0000, Gonglei (Arei) wrote:
> When the gurb of OS is booting, then the softirq and C function send_disk_op()
> may use extra stack of SeaBIOS. If we inject a NMI, romlayout.S: irqentry_extrastack
> is invoked, and the extra stack will be used again. And the stack of first calling
> will be broken, so that the SeaBIOS stuck. 
> 
> You can easily reproduce the problem.
> 
> 1. start on guest
> 2. reset the guest
> 3. inject a NMI when the guest show the grub surface
> 4. then the guest stuck

Does the SeaBIOS patch below help?  I'm not familiar with how to
"inject a NMI" - can you describe the process in more detail?

-Kevin


--- a/src/romlayout.S
+++ b/src/romlayout.S
@@ -548,7 +548,9 @@ entry_post:
         ENTRY_INTO32 _cfunc32flat_handle_post   // Normal entry point
 
         ORG 0xe2c3
-        IRQ_ENTRY 02
+        .global entry_02
+entry_02:
+        ENTRY handle_02  // NMI handler does not switch onto extra stack
 
         ORG 0xe3fe
         .global entry_13_official

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-21 18:47                                   ` Kevin O'Connor
  0 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-21 18:47 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

On Mon, Dec 21, 2015 at 09:41:32AM +0000, Gonglei (Arei) wrote:
> When the gurb of OS is booting, then the softirq and C function send_disk_op()
> may use extra stack of SeaBIOS. If we inject a NMI, romlayout.S: irqentry_extrastack
> is invoked, and the extra stack will be used again. And the stack of first calling
> will be broken, so that the SeaBIOS stuck. 
> 
> You can easily reproduce the problem.
> 
> 1. start on guest
> 2. reset the guest
> 3. inject a NMI when the guest show the grub surface
> 4. then the guest stuck

Does the SeaBIOS patch below help?  I'm not familiar with how to
"inject a NMI" - can you describe the process in more detail?

-Kevin


--- a/src/romlayout.S
+++ b/src/romlayout.S
@@ -548,7 +548,9 @@ entry_post:
         ENTRY_INTO32 _cfunc32flat_handle_post   // Normal entry point
 
         ORG 0xe2c3
-        IRQ_ENTRY 02
+        .global entry_02
+entry_02:
+        ENTRY handle_02  // NMI handler does not switch onto extra stack
 
         ORG 0xe3fe
         .global entry_13_official

^ permalink raw reply	[flat|nested] 41+ messages in thread

* RE: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-21 18:47                                   ` [Qemu-devel] " Kevin O'Connor
@ 2015-12-22  2:14                                     ` Gonglei (Arei)
  -1 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-22  2:14 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Xulei (Stone),
	Paolo Bonzini, qemu-devel, seabios, Huangweidong (C),
	kvm, Radim Krcmar

> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Tuesday, December 22, 2015 2:47 AM
> To: Gonglei (Arei)
> Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> problem on qemu-kvm platform
> 
> On Mon, Dec 21, 2015 at 09:41:32AM +0000, Gonglei (Arei) wrote:
> > When the gurb of OS is booting, then the softirq and C function
> > send_disk_op() may use extra stack of SeaBIOS. If we inject a NMI,
> > romlayout.S: irqentry_extrastack is invoked, and the extra stack will
> > be used again. And the stack of first calling will be broken, so that the
> SeaBIOS stuck.
> >
> > You can easily reproduce the problem.
> >
> > 1. start on guest
> > 2. reset the guest
> > 3. inject a NMI when the guest show the grub surface 4. then the guest
> > stuck
> 
> Does the SeaBIOS patch below help?  

Sorry, it doesn't work. What's worse is we cannot stop SeaBIOS stuck by
Setting "CONFIG_ENTRY_EXTRASTACK=n" after applying this patch. 


> I'm not familiar with how to "inject a
> NMI" - can you describe the process in more detail?
> 

1. Qemu Command line:

#: /home/qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 4096 -smp 8 -name suse -vnc 0.0.0.0:10 \
-device virtio-scsi-pci,id=scsi0 -drive file=/home/suse11_sp3_32_2,if=none,id=drive-scsi0-0-0-0,format=raw,cache=none,aio=native \
-device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
-chardev file,id=seabios,path=/home/seabios.log -device isa-debugcon,iobase=0x402,chardev=seabios \
-monitor stdio -qmp unix:/tmp/qmp,server,nowait 

2. Inject a NMI by QMP:

#: /home/qemu/scripts/qmp # ./qmp-shell /tmp/qmp
Welcome to the QMP low-level shell!
Connected to QEMU 2.5.0

(QEMU) system_reset
{"return": {}}
(QEMU) inject-nmi  
{"return": {}}
(QEMU) inject-nmi
{"return": {}}


Regards,
-Gonglei

> -Kevin
> 
> 
> --- a/src/romlayout.S
> +++ b/src/romlayout.S
> @@ -548,7 +548,9 @@ entry_post:
>          ENTRY_INTO32 _cfunc32flat_handle_post   // Normal entry point
> 
>          ORG 0xe2c3
> -        IRQ_ENTRY 02
> +        .global entry_02
> +entry_02:
> +        ENTRY handle_02  // NMI handler does not switch onto extra
> +stack
> 
>          ORG 0xe3fe
>          .global entry_13_official

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-22  2:14                                     ` Gonglei (Arei)
  0 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-22  2:14 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Tuesday, December 22, 2015 2:47 AM
> To: Gonglei (Arei)
> Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> problem on qemu-kvm platform
> 
> On Mon, Dec 21, 2015 at 09:41:32AM +0000, Gonglei (Arei) wrote:
> > When the gurb of OS is booting, then the softirq and C function
> > send_disk_op() may use extra stack of SeaBIOS. If we inject a NMI,
> > romlayout.S: irqentry_extrastack is invoked, and the extra stack will
> > be used again. And the stack of first calling will be broken, so that the
> SeaBIOS stuck.
> >
> > You can easily reproduce the problem.
> >
> > 1. start on guest
> > 2. reset the guest
> > 3. inject a NMI when the guest show the grub surface 4. then the guest
> > stuck
> 
> Does the SeaBIOS patch below help?  

Sorry, it doesn't work. What's worse is we cannot stop SeaBIOS stuck by
Setting "CONFIG_ENTRY_EXTRASTACK=n" after applying this patch. 


> I'm not familiar with how to "inject a
> NMI" - can you describe the process in more detail?
> 

1. Qemu Command line:

#: /home/qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 4096 -smp 8 -name suse -vnc 0.0.0.0:10 \
-device virtio-scsi-pci,id=scsi0 -drive file=/home/suse11_sp3_32_2,if=none,id=drive-scsi0-0-0-0,format=raw,cache=none,aio=native \
-device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
-chardev file,id=seabios,path=/home/seabios.log -device isa-debugcon,iobase=0x402,chardev=seabios \
-monitor stdio -qmp unix:/tmp/qmp,server,nowait 

2. Inject a NMI by QMP:

#: /home/qemu/scripts/qmp # ./qmp-shell /tmp/qmp
Welcome to the QMP low-level shell!
Connected to QEMU 2.5.0

(QEMU) system_reset
{"return": {}}
(QEMU) inject-nmi  
{"return": {}}
(QEMU) inject-nmi
{"return": {}}


Regards,
-Gonglei

> -Kevin
> 
> 
> --- a/src/romlayout.S
> +++ b/src/romlayout.S
> @@ -548,7 +548,9 @@ entry_post:
>          ENTRY_INTO32 _cfunc32flat_handle_post   // Normal entry point
> 
>          ORG 0xe2c3
> -        IRQ_ENTRY 02
> +        .global entry_02
> +entry_02:
> +        ENTRY handle_02  // NMI handler does not switch onto extra
> +stack
> 
>          ORG 0xe3fe
>          .global entry_13_official

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-22  2:14                                     ` Gonglei (Arei)
@ 2015-12-22  3:15                                       ` Xulei (Stone)
  -1 siblings, 0 replies; 41+ messages in thread
From: Xulei (Stone) @ 2015-12-22  3:15 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, qemu-devel, Paolo Bonzini

Hi, Kevin,
Can you tell how to reset/reboot this VM, if it goes to the handle_hwpic1()
on its booting procedure? I mean, usually, SeaBIOS would not go to 
handle_hwpic routine. But in my test case, SeaBIOS calls handle_hwpic when
KVM injects a #UD expcetion (not irq) and  SeaBIOS will loop to handle this
if KVM persistently injects exception.
 
Now, i just wish to reset/reboot this VM if it is fall into handle_hwpic. I
tried follwing patch and it seems not work. What can i do to force reset/reboot? 

@@ -7,10 +7,11 @@
  #include "biosvar.h" // SET_IVT
  #include "config.h" // CONFIG_*
  #include "output.h" // dprintf
  #include "pic.h" // pic_*
+ #include "hw/ps2port.h"
 
 u16
 pic_irqmask_read(void)
 {
     if (!CONFIG_HARDWARE_IRQ)
@@ -103,10 +104,11 @@ pic_isr2_read(void)
 void VISIBLE16
 handle_hwpic1(struct bregs *regs)
 {
     dprintf(DEBUG_ISR_hwpic1, "handle_hwpic1 irq=%x\n", pic_isr1_read());
     pic_eoi1();
+	 i8042_reboot();
 }

useful information:
kmod ftrace:
<...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180077: kvm_entry: vcpu 0

bad SeaBIOS log:
[2015-12-17 12:37:30] In 32bit resume
[2015-12-17 12:37:30] =====Attempting a hard reboot====
[2015-12-17 12:37:30] SeaBIOS (version rel-1.8.1-0-g4adadbd-20151217_104405-linux-emBwNn)
[2015-12-17 12:37:30] No Xen hypervisor found.
[2015-12-17 12:37:30] Running on QEMU (i440fx)
[2015-12-17 12:37:30] Running on KVM
[2015-12-17 12:37:30] RamSize: 0x80000000 [cmos]
[2015-12-17 12:37:30] Relocating init from 0x000db230 to 0x7ffad360 (size 76768)
[2015-12-17 12:37:30] Found QEMU fw_cfg
[2015-12-17 12:37:30] RamBlock: addr 0x0000000000000000 len 0x0000000080000000 [e820]
[2015-12-17 12:37:30] Moving pm_base to 0x600
[2015-12-17 12:37:30] boot order:
[2015-12-17 12:37:30] 1: /pci@i0cf8/ide@1,1/drive@0/disk@0
[2015-12-17 12:37:30] 2: HALT
[2015-12-17 12:37:30] maininit
[2015-12-17 12:37:30] platform_hardware_setup
[2015-12-17 12:37:30] init pic
[2015-12-17 12:37:30] pic_setup
[2015-12-17 12:37:30] pic_reset
[2015-12-17 12:37:30] enable_hwirq
[2015-12-17 12:37:30] CPU Mhz=3304
[2015-12-17 12:37:30] enable_hwirq
[2015-12-17 12:37:30] enable_hwirq
[2015-12-17 12:37:30] === PCI bus & bridge init ===
[2015-12-17 12:37:30] PCI: pci_bios_init_bus_rec bus = 0x0
[2015-12-17 12:37:30] === PCI device probing ===
[2015-12-17 12:37:30] Found 6 PCI devices (max PCI bus is 00)
[2015-12-17 12:37:30] === PCI new allocation pass #1 ===
[2015-12-17 12:37:30] PCI: check devices
[2015-12-17 12:37:30] === PCI new allocation pass #2 ===
[2015-12-17 12:37:30] PCI: IO: c000 - c02f
[2015-12-17 12:37:30] PCI: 32: 0000000080000000 - 00000000fec00000
[2015-12-17 12:37:30] PCI: map device bdf=00:01.2  bar 4, addr 0000c000, size 00000020 [io]
[2015-12-17 12:37:30] PCI: map device bdf=00:01.1  bar 4, addr 0000c020, size 00000010 [io]
[2015-12-17 12:37:30] PCI: map device bdf=00:02.0  bar 6, addr febe0000, size 00010000 [mem]
[2015-12-17 12:37:30] PCI: map device bdf=00:02.0  bar 1, addr febf0000, size 00001000 [mem]
[2015-12-17 12:37:30] PCI: map device bdf=00:02.0  bar 0, addr fc000000, size 02000000 [prefmem]
[2015-12-17 12:37:30] PCI: init bdf=00:00.0 id=8086:1237
[2015-12-17 12:37:30] PCI: init bdf=00:01.0 id=8086:7000
[2015-12-17 12:37:30] PIIX3/PIIX4 init: elcr=00 0c
[2015-12-17 12:37:30] PCI: init bdf=00:01.1 id=8086:7010
[2015-12-17 12:37:30] PCI: init bdf=00:01.2 id=8086:7020
[2015-12-17 12:37:30] PCI: init bdf=00:01.3 id=8086:7113
[2015-12-17 12:37:30] Using pmtimer, ioport 0x608
[2015-12-17 12:37:30] PCI: init bdf=00:02.0 id=1013:00b8
[2015-12-17 12:37:30] PCI: Using 00:02.0 for primary VGA
[2015-12-17 12:37:30] handle_hshamanpnd:dl leae_p_sismcmp_p:i: d a=ap3               <<======= everytime stuck, AP setup log seems abnormal!
[2015-12-17 12:37:30] ièf[cf_^ifd_=f3
[2015-12-17 12:37:30] èf[f^f_f]fÃÍ^XË<90>Found 4 cpu(s) max supported 4 cpu(s)
[2015-12-17 12:37:30] Copying PIR from 0x7ffbea18 to 0x000f5700
[2015-12-17 12:37:30] Copying MPTABLE from 0x00006e30/7ffa42c0 to 0x000f55e0
[2015-12-17 12:37:30] Copying SMBIOS entry point from 0x00006e11 to 0x000f55c0
[2015-12-17 12:37:31] Scan for VGA option rom
[2015-12-17 12:37:31] Running option rom at c000:0003
[2015-12-17 12:37:31] Start SeaVGABIOS (version rel-1.8.1-0-g4adadbd-20150316_085902-nilsson.home.kraxel.org)
[2015-12-17 12:37:31] enter vga_post:
[2015-12-17 12:37:31]    a=00000010  b=0000ffff  c=00000000  d=0000ffff ds=0000 es=f000 ss=0000
[2015-12-17 12:37:31]   si=00000000 di=000057e0 bp=00000000 sp=00006dbe cs=f000 ip=d1fb  f=0000
[2015-12-17 12:37:31] cirrus init
[2015-12-17 12:37:31] cirrus init 2
[2015-12-17 12:37:31] Attempting to allocate VGA stack via pmm call to f000:d2a0   <<====== here stuck, loop handle PIC irq0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
... always hanle_hwpic1 irq=0, never ends anymore...


>> -----Original Message-----
>> From: Kevin O'Connor [mailto:kevin@koconnor.net]
>> Sent: Tuesday, December 22, 2015 2:47 AM
>> To: Gonglei (Arei)
>> Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
>> Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
>> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
>> problem on qemu-kvm platform
>>
>> On Mon, Dec 21, 2015 at 09:41:32AM +0000, Gonglei (Arei) wrote:
>> > When the gurb of OS is booting, then the softirq and C function
>> > send_disk_op() may use extra stack of SeaBIOS. If we inject a NMI,
>> > romlayout.S: irqentry_extrastack is invoked, and the extra stack will
>> > be used again. And the stack of first calling will be broken, so that the
>> SeaBIOS stuck.
>> >
>> > You can easily reproduce the problem.
>> >
>> > 1. start on guest
>> > 2. reset the guest
>> > 3. inject a NMI when the guest show the grub surface 4. then the guest
>> > stuck
>>
>> Does the SeaBIOS patch below help? 
>
>Sorry, it doesn't work. What's worse is we cannot stop SeaBIOS stuck by
>Setting "CONFIG_ENTRY_EXTRASTACK=n" after applying this patch.
>
>
>> I'm not familiar with how to "inject a
>> NMI" - can you describe the process in more detail?
>>
>
>1. Qemu Command line:
>
>#: /home/qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 4096 -smp 8 -name suse -vnc 0.0.0.0:10 \
>-device virtio-scsi-pci,id=scsi0 -drive file=/home/suse11_sp3_32_2,if=none,id=drive-scsi0-0-0-0,format=raw,cache=none,aio=native \
>-device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
>-chardev file,id=seabios,path=/home/seabios.log -device isa-debugcon,iobase=0x402,chardev=seabios \
>-monitor stdio -qmp unix:/tmp/qmp,server,nowait
>
>2. Inject a NMI by QMP:
>
>#: /home/qemu/scripts/qmp # ./qmp-shell /tmp/qmp
>Welcome to the QMP low-level shell!
>Connected to QEMU 2.5.0
>
>(QEMU) system_reset
>{"return": {}}
>(QEMU) inject-nmi 
>{"return": {}}
>(QEMU) inject-nmi
>{"return": {}}
>
>
>Regards,
>-Gonglei
>
>> -Kevin
>>
>>
>> --- a/src/romlayout.S
>> +++ b/src/romlayout.S
>> @@ -548,7 +548,9 @@ entry_post:
>>          ENTRY_INTO32 _cfunc32flat_handle_post   // Normal entry point
>>
>>          ORG 0xe2c3
>> -        IRQ_ENTRY 02
>> +        .global entry_02
>> +entry_02:
>> +        ENTRY handle_02  // NMI handler does not switch onto extra
>> +stack
>>
>>          ORG 0xe3fe
>>          .global entry_13_official

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-22  3:15                                       ` Xulei (Stone)
  0 siblings, 0 replies; 41+ messages in thread
From: Xulei (Stone) @ 2015-12-22  3:15 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, qemu-devel, Paolo Bonzini

Hi, Kevin,
Can you tell how to reset/reboot this VM, if it goes to the handle_hwpic1()
on its booting procedure? I mean, usually, SeaBIOS would not go to 
handle_hwpic routine. But in my test case, SeaBIOS calls handle_hwpic when
KVM injects a #UD expcetion (not irq) and  SeaBIOS will loop to handle this
if KVM persistently injects exception.
 
Now, i just wish to reset/reboot this VM if it is fall into handle_hwpic. I
tried follwing patch and it seems not work. What can i do to force reset/reboot? 

@@ -7,10 +7,11 @@
  #include "biosvar.h" // SET_IVT
  #include "config.h" // CONFIG_*
  #include "output.h" // dprintf
  #include "pic.h" // pic_*
+ #include "hw/ps2port.h"
 
 u16
 pic_irqmask_read(void)
 {
     if (!CONFIG_HARDWARE_IRQ)
@@ -103,10 +104,11 @@ pic_isr2_read(void)
 void VISIBLE16
 handle_hwpic1(struct bregs *regs)
 {
     dprintf(DEBUG_ISR_hwpic1, "handle_hwpic1 irq=%x\n", pic_isr1_read());
     pic_eoi1();
+	 i8042_reboot();
 }

useful information:
kmod ftrace:
<...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306
<...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real)
<...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0)
<...>-31509 [035] 154753.180077: kvm_entry: vcpu 0

bad SeaBIOS log:
[2015-12-17 12:37:30] In 32bit resume
[2015-12-17 12:37:30] =====Attempting a hard reboot====
[2015-12-17 12:37:30] SeaBIOS (version rel-1.8.1-0-g4adadbd-20151217_104405-linux-emBwNn)
[2015-12-17 12:37:30] No Xen hypervisor found.
[2015-12-17 12:37:30] Running on QEMU (i440fx)
[2015-12-17 12:37:30] Running on KVM
[2015-12-17 12:37:30] RamSize: 0x80000000 [cmos]
[2015-12-17 12:37:30] Relocating init from 0x000db230 to 0x7ffad360 (size 76768)
[2015-12-17 12:37:30] Found QEMU fw_cfg
[2015-12-17 12:37:30] RamBlock: addr 0x0000000000000000 len 0x0000000080000000 [e820]
[2015-12-17 12:37:30] Moving pm_base to 0x600
[2015-12-17 12:37:30] boot order:
[2015-12-17 12:37:30] 1: /pci@i0cf8/ide@1,1/drive@0/disk@0
[2015-12-17 12:37:30] 2: HALT
[2015-12-17 12:37:30] maininit
[2015-12-17 12:37:30] platform_hardware_setup
[2015-12-17 12:37:30] init pic
[2015-12-17 12:37:30] pic_setup
[2015-12-17 12:37:30] pic_reset
[2015-12-17 12:37:30] enable_hwirq
[2015-12-17 12:37:30] CPU Mhz=3304
[2015-12-17 12:37:30] enable_hwirq
[2015-12-17 12:37:30] enable_hwirq
[2015-12-17 12:37:30] === PCI bus & bridge init ===
[2015-12-17 12:37:30] PCI: pci_bios_init_bus_rec bus = 0x0
[2015-12-17 12:37:30] === PCI device probing ===
[2015-12-17 12:37:30] Found 6 PCI devices (max PCI bus is 00)
[2015-12-17 12:37:30] === PCI new allocation pass #1 ===
[2015-12-17 12:37:30] PCI: check devices
[2015-12-17 12:37:30] === PCI new allocation pass #2 ===
[2015-12-17 12:37:30] PCI: IO: c000 - c02f
[2015-12-17 12:37:30] PCI: 32: 0000000080000000 - 00000000fec00000
[2015-12-17 12:37:30] PCI: map device bdf=00:01.2  bar 4, addr 0000c000, size 00000020 [io]
[2015-12-17 12:37:30] PCI: map device bdf=00:01.1  bar 4, addr 0000c020, size 00000010 [io]
[2015-12-17 12:37:30] PCI: map device bdf=00:02.0  bar 6, addr febe0000, size 00010000 [mem]
[2015-12-17 12:37:30] PCI: map device bdf=00:02.0  bar 1, addr febf0000, size 00001000 [mem]
[2015-12-17 12:37:30] PCI: map device bdf=00:02.0  bar 0, addr fc000000, size 02000000 [prefmem]
[2015-12-17 12:37:30] PCI: init bdf=00:00.0 id=8086:1237
[2015-12-17 12:37:30] PCI: init bdf=00:01.0 id=8086:7000
[2015-12-17 12:37:30] PIIX3/PIIX4 init: elcr=00 0c
[2015-12-17 12:37:30] PCI: init bdf=00:01.1 id=8086:7010
[2015-12-17 12:37:30] PCI: init bdf=00:01.2 id=8086:7020
[2015-12-17 12:37:30] PCI: init bdf=00:01.3 id=8086:7113
[2015-12-17 12:37:30] Using pmtimer, ioport 0x608
[2015-12-17 12:37:30] PCI: init bdf=00:02.0 id=1013:00b8
[2015-12-17 12:37:30] PCI: Using 00:02.0 for primary VGA
[2015-12-17 12:37:30] handle_hshamanpnd:dl leae_p_sismcmp_p:i: d a=ap3               <<======= everytime stuck, AP setup log seems abnormal!
[2015-12-17 12:37:30] ièf[cf_^ifd_=f3
[2015-12-17 12:37:30] èf[f^f_f]fÃÍ^XË<90>Found 4 cpu(s) max supported 4 cpu(s)
[2015-12-17 12:37:30] Copying PIR from 0x7ffbea18 to 0x000f5700
[2015-12-17 12:37:30] Copying MPTABLE from 0x00006e30/7ffa42c0 to 0x000f55e0
[2015-12-17 12:37:30] Copying SMBIOS entry point from 0x00006e11 to 0x000f55c0
[2015-12-17 12:37:31] Scan for VGA option rom
[2015-12-17 12:37:31] Running option rom at c000:0003
[2015-12-17 12:37:31] Start SeaVGABIOS (version rel-1.8.1-0-g4adadbd-20150316_085902-nilsson.home.kraxel.org)
[2015-12-17 12:37:31] enter vga_post:
[2015-12-17 12:37:31]    a=00000010  b=0000ffff  c=00000000  d=0000ffff ds=0000 es=f000 ss=0000
[2015-12-17 12:37:31]   si=00000000 di=000057e0 bp=00000000 sp=00006dbe cs=f000 ip=d1fb  f=0000
[2015-12-17 12:37:31] cirrus init
[2015-12-17 12:37:31] cirrus init 2
[2015-12-17 12:37:31] Attempting to allocate VGA stack via pmm call to f000:d2a0   <<====== here stuck, loop handle PIC irq0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
[2015-12-17 12:37:35] handle_hwpic1 irq=0
... always hanle_hwpic1 irq=0, never ends anymore...


>> -----Original Message-----
>> From: Kevin O'Connor [mailto:kevin@koconnor.net]
>> Sent: Tuesday, December 22, 2015 2:47 AM
>> To: Gonglei (Arei)
>> Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
>> Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
>> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
>> problem on qemu-kvm platform
>>
>> On Mon, Dec 21, 2015 at 09:41:32AM +0000, Gonglei (Arei) wrote:
>> > When the gurb of OS is booting, then the softirq and C function
>> > send_disk_op() may use extra stack of SeaBIOS. If we inject a NMI,
>> > romlayout.S: irqentry_extrastack is invoked, and the extra stack will
>> > be used again. And the stack of first calling will be broken, so that the
>> SeaBIOS stuck.
>> >
>> > You can easily reproduce the problem.
>> >
>> > 1. start on guest
>> > 2. reset the guest
>> > 3. inject a NMI when the guest show the grub surface 4. then the guest
>> > stuck
>>
>> Does the SeaBIOS patch below help? 
>
>Sorry, it doesn't work. What's worse is we cannot stop SeaBIOS stuck by
>Setting "CONFIG_ENTRY_EXTRASTACK=n" after applying this patch.
>
>
>> I'm not familiar with how to "inject a
>> NMI" - can you describe the process in more detail?
>>
>
>1. Qemu Command line:
>
>#: /home/qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 4096 -smp 8 -name suse -vnc 0.0.0.0:10 \
>-device virtio-scsi-pci,id=scsi0 -drive file=/home/suse11_sp3_32_2,if=none,id=drive-scsi0-0-0-0,format=raw,cache=none,aio=native \
>-device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
>-chardev file,id=seabios,path=/home/seabios.log -device isa-debugcon,iobase=0x402,chardev=seabios \
>-monitor stdio -qmp unix:/tmp/qmp,server,nowait
>
>2. Inject a NMI by QMP:
>
>#: /home/qemu/scripts/qmp # ./qmp-shell /tmp/qmp
>Welcome to the QMP low-level shell!
>Connected to QEMU 2.5.0
>
>(QEMU) system_reset
>{"return": {}}
>(QEMU) inject-nmi 
>{"return": {}}
>(QEMU) inject-nmi
>{"return": {}}
>
>
>Regards,
>-Gonglei
>
>> -Kevin
>>
>>
>> --- a/src/romlayout.S
>> +++ b/src/romlayout.S
>> @@ -548,7 +548,9 @@ entry_post:
>>          ENTRY_INTO32 _cfunc32flat_handle_post   // Normal entry point
>>
>>          ORG 0xe2c3
>> -        IRQ_ENTRY 02
>> +        .global entry_02
>> +entry_02:
>> +        ENTRY handle_02  // NMI handler does not switch onto extra
>> +stack
>>
>>          ORG 0xe3fe
>>          .global entry_13_official

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-22  3:15                                       ` [Qemu-devel] " Xulei (Stone)
@ 2015-12-22 15:38                                         ` Kevin O'Connor
  -1 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-22 15:38 UTC (permalink / raw)
  To: Xulei (Stone)
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, qemu-devel, Paolo Bonzini

On Tue, Dec 22, 2015 at 03:15:26AM +0000, Xulei (Stone) wrote:
> Hi, Kevin,
> Can you tell how to reset/reboot this VM, if it goes to the handle_hwpic1()
> on its booting procedure? I mean, usually, SeaBIOS would not go to 
> handle_hwpic routine. But in my test case, SeaBIOS calls handle_hwpic when
> KVM injects a #UD expcetion (not irq) and  SeaBIOS will loop to handle this
> if KVM persistently injects exception.
>  
> Now, i just wish to reset/reboot this VM if it is fall into handle_hwpic. I
> tried follwing patch and it seems not work. What can i do to force reset/reboot? 

Call the reset() function.

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-22 15:38                                         ` Kevin O'Connor
  0 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-22 15:38 UTC (permalink / raw)
  To: Xulei (Stone)
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, qemu-devel, Paolo Bonzini

On Tue, Dec 22, 2015 at 03:15:26AM +0000, Xulei (Stone) wrote:
> Hi, Kevin,
> Can you tell how to reset/reboot this VM, if it goes to the handle_hwpic1()
> on its booting procedure? I mean, usually, SeaBIOS would not go to 
> handle_hwpic routine. But in my test case, SeaBIOS calls handle_hwpic when
> KVM injects a #UD expcetion (not irq) and  SeaBIOS will loop to handle this
> if KVM persistently injects exception.
>  
> Now, i just wish to reset/reboot this VM if it is fall into handle_hwpic. I
> tried follwing patch and it seems not work. What can i do to force reset/reboot? 

Call the reset() function.

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-22  2:14                                     ` Gonglei (Arei)
@ 2015-12-22 15:51                                       ` Kevin O'Connor
  -1 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-22 15:51 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Xulei (Stone),
	Paolo Bonzini, qemu-devel, seabios, Huangweidong (C),
	kvm, Radim Krcmar

On Tue, Dec 22, 2015 at 02:14:12AM +0000, Gonglei (Arei) wrote:
> > From: Kevin O'Connor [mailto:kevin@koconnor.net]
> > Sent: Tuesday, December 22, 2015 2:47 AM
> > To: Gonglei (Arei)
> > Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> > Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
> > Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> > problem on qemu-kvm platform
> > 
> > On Mon, Dec 21, 2015 at 09:41:32AM +0000, Gonglei (Arei) wrote:
> > > When the gurb of OS is booting, then the softirq and C function
> > > send_disk_op() may use extra stack of SeaBIOS. If we inject a NMI,
> > > romlayout.S: irqentry_extrastack is invoked, and the extra stack will
> > > be used again. And the stack of first calling will be broken, so that the
> > SeaBIOS stuck.
> > >
> > > You can easily reproduce the problem.
> > >
> > > 1. start on guest
> > > 2. reset the guest
> > > 3. inject a NMI when the guest show the grub surface 4. then the guest
> > > stuck
> > 
> > Does the SeaBIOS patch below help?  
> 
> Sorry, it doesn't work. What's worse is we cannot stop SeaBIOS stuck by
> Setting "CONFIG_ENTRY_EXTRASTACK=n" after applying this patch. 

Oops, can you try with the patch below instead?

> > I'm not familiar with how to "inject a
> > NMI" - can you describe the process in more detail?
> > 
> 
> 1. Qemu Command line:
> 
> #: /home/qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 4096 -smp 8 -name suse -vnc 0.0.0.0:10 \
> -device virtio-scsi-pci,id=scsi0 -drive file=/home/suse11_sp3_32_2,if=none,id=drive-scsi0-0-0-0,format=raw,cache=none,aio=native \
> -device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
> -chardev file,id=seabios,path=/home/seabios.log -device isa-debugcon,iobase=0x402,chardev=seabios \
> -monitor stdio -qmp unix:/tmp/qmp,server,nowait 
> 
> 2. Inject a NMI by QMP:
> 
> #: /home/qemu/scripts/qmp # ./qmp-shell /tmp/qmp
> Welcome to the QMP low-level shell!
> Connected to QEMU 2.5.0
> 
> (QEMU) system_reset
> {"return": {}}
> (QEMU) inject-nmi  
> {"return": {}}
> (QEMU) inject-nmi
> {"return": {}}
> 

I tried a few simple tests but was not able to reproduce.

-Kevin


--- a/src/romlayout.S
+++ b/src/romlayout.S
@@ -548,7 +548,10 @@ entry_post:
         ENTRY_INTO32 _cfunc32flat_handle_post   // Normal entry point
 
         ORG 0xe2c3
-        IRQ_ENTRY 02
+        .global entry_02
+entry_02:
+        ENTRY handle_02  // NMI handler does not switch onto extra stack
+        iretw
 
         ORG 0xe3fe
         .global entry_13_official

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-22 15:51                                       ` Kevin O'Connor
  0 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-22 15:51 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

On Tue, Dec 22, 2015 at 02:14:12AM +0000, Gonglei (Arei) wrote:
> > From: Kevin O'Connor [mailto:kevin@koconnor.net]
> > Sent: Tuesday, December 22, 2015 2:47 AM
> > To: Gonglei (Arei)
> > Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> > Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
> > Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> > problem on qemu-kvm platform
> > 
> > On Mon, Dec 21, 2015 at 09:41:32AM +0000, Gonglei (Arei) wrote:
> > > When the gurb of OS is booting, then the softirq and C function
> > > send_disk_op() may use extra stack of SeaBIOS. If we inject a NMI,
> > > romlayout.S: irqentry_extrastack is invoked, and the extra stack will
> > > be used again. And the stack of first calling will be broken, so that the
> > SeaBIOS stuck.
> > >
> > > You can easily reproduce the problem.
> > >
> > > 1. start on guest
> > > 2. reset the guest
> > > 3. inject a NMI when the guest show the grub surface 4. then the guest
> > > stuck
> > 
> > Does the SeaBIOS patch below help?  
> 
> Sorry, it doesn't work. What's worse is we cannot stop SeaBIOS stuck by
> Setting "CONFIG_ENTRY_EXTRASTACK=n" after applying this patch. 

Oops, can you try with the patch below instead?

> > I'm not familiar with how to "inject a
> > NMI" - can you describe the process in more detail?
> > 
> 
> 1. Qemu Command line:
> 
> #: /home/qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 4096 -smp 8 -name suse -vnc 0.0.0.0:10 \
> -device virtio-scsi-pci,id=scsi0 -drive file=/home/suse11_sp3_32_2,if=none,id=drive-scsi0-0-0-0,format=raw,cache=none,aio=native \
> -device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
> -chardev file,id=seabios,path=/home/seabios.log -device isa-debugcon,iobase=0x402,chardev=seabios \
> -monitor stdio -qmp unix:/tmp/qmp,server,nowait 
> 
> 2. Inject a NMI by QMP:
> 
> #: /home/qemu/scripts/qmp # ./qmp-shell /tmp/qmp
> Welcome to the QMP low-level shell!
> Connected to QEMU 2.5.0
> 
> (QEMU) system_reset
> {"return": {}}
> (QEMU) inject-nmi  
> {"return": {}}
> (QEMU) inject-nmi
> {"return": {}}
> 

I tried a few simple tests but was not able to reproduce.

-Kevin


--- a/src/romlayout.S
+++ b/src/romlayout.S
@@ -548,7 +548,10 @@ entry_post:
         ENTRY_INTO32 _cfunc32flat_handle_post   // Normal entry point
 
         ORG 0xe2c3
-        IRQ_ENTRY 02
+        .global entry_02
+entry_02:
+        ENTRY handle_02  // NMI handler does not switch onto extra stack
+        iretw
 
         ORG 0xe3fe
         .global entry_13_official

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-22 15:51                                       ` Kevin O'Connor
@ 2015-12-23  6:40                                         ` Gonglei (Arei)
  -1 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-23  6:40 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Tuesday, December 22, 2015 11:51 PM
> To: Gonglei (Arei)
> Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> problem on qemu-kvm platform
> 
> On Tue, Dec 22, 2015 at 02:14:12AM +0000, Gonglei (Arei) wrote:
> > > From: Kevin O'Connor [mailto:kevin@koconnor.net]
> > > Sent: Tuesday, December 22, 2015 2:47 AM
> > > To: Gonglei (Arei)
> > > Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> > > Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
> > > Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> > > problem on qemu-kvm platform
> > >
> > > On Mon, Dec 21, 2015 at 09:41:32AM +0000, Gonglei (Arei) wrote:
> > > > When the gurb of OS is booting, then the softirq and C function
> > > > send_disk_op() may use extra stack of SeaBIOS. If we inject a NMI,
> > > > romlayout.S: irqentry_extrastack is invoked, and the extra stack will
> > > > be used again. And the stack of first calling will be broken, so that the
> > > SeaBIOS stuck.
> > > >
> > > > You can easily reproduce the problem.
> > > >
> > > > 1. start on guest
> > > > 2. reset the guest
> > > > 3. inject a NMI when the guest show the grub surface 4. then the guest
> > > > stuck
> > >
> > > Does the SeaBIOS patch below help?
> >
> > Sorry, it doesn't work. What's worse is we cannot stop SeaBIOS stuck by
> > Setting "CONFIG_ENTRY_EXTRASTACK=n" after applying this patch.
> 
> Oops, can you try with the patch below instead?
> 

It works now. Thanks!

But do we need to check other possible situations
that maybe cause *extra stack* broken or overridden?


> > > I'm not familiar with how to "inject a
> > > NMI" - can you describe the process in more detail?
> > >
> >
> > 1. Qemu Command line:
> >
> > #: /home/qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 4096
> -smp 8 -name suse -vnc 0.0.0.0:10 \
> > -device virtio-scsi-pci,id=scsi0 -drive
> file=/home/suse11_sp3_32_2,if=none,id=drive-scsi0-0-0-0,format=raw,cache=
> none,aio=native \
> > -device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
> > -chardev file,id=seabios,path=/home/seabios.log -device
> isa-debugcon,iobase=0x402,chardev=seabios \
> > -monitor stdio -qmp unix:/tmp/qmp,server,nowait
> >
> > 2. Inject a NMI by QMP:
> >
> > #: /home/qemu/scripts/qmp # ./qmp-shell /tmp/qmp
> > Welcome to the QMP low-level shell!
> > Connected to QEMU 2.5.0
> >
> > (QEMU) system_reset
> > {"return": {}}
> > (QEMU) inject-nmi
> > {"return": {}}
> > (QEMU) inject-nmi
> > {"return": {}}
> >
> 
> I tried a few simple tests but was not able to reproduce.
> 
After reset the guest, then you inject an NMI when you see the grub surface
ASAP. 

Kevin, I sent you a picture in private. :)


Regards,
-Gonglei

> -Kevin
> 
> 
> --- a/src/romlayout.S
> +++ b/src/romlayout.S
> @@ -548,7 +548,10 @@ entry_post:
>          ENTRY_INTO32 _cfunc32flat_handle_post   // Normal entry point
> 
>          ORG 0xe2c3
> -        IRQ_ENTRY 02
> +        .global entry_02
> +entry_02:
> +        ENTRY handle_02  // NMI handler does not switch onto extra stack
> +        iretw
> 
>          ORG 0xe3fe
>          .global entry_13_official

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-23  6:40                                         ` Gonglei (Arei)
  0 siblings, 0 replies; 41+ messages in thread
From: Gonglei (Arei) @ 2015-12-23  6:40 UTC (permalink / raw)
  To: Kevin O'Connor
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

> -----Original Message-----
> From: Kevin O'Connor [mailto:kevin@koconnor.net]
> Sent: Tuesday, December 22, 2015 11:51 PM
> To: Gonglei (Arei)
> Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> problem on qemu-kvm platform
> 
> On Tue, Dec 22, 2015 at 02:14:12AM +0000, Gonglei (Arei) wrote:
> > > From: Kevin O'Connor [mailto:kevin@koconnor.net]
> > > Sent: Tuesday, December 22, 2015 2:47 AM
> > > To: Gonglei (Arei)
> > > Cc: Xulei (Stone); Paolo Bonzini; qemu-devel; seabios@seabios.org;
> > > Huangweidong (C); kvm@vger.kernel.org; Radim Krcmar
> > > Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
> > > problem on qemu-kvm platform
> > >
> > > On Mon, Dec 21, 2015 at 09:41:32AM +0000, Gonglei (Arei) wrote:
> > > > When the gurb of OS is booting, then the softirq and C function
> > > > send_disk_op() may use extra stack of SeaBIOS. If we inject a NMI,
> > > > romlayout.S: irqentry_extrastack is invoked, and the extra stack will
> > > > be used again. And the stack of first calling will be broken, so that the
> > > SeaBIOS stuck.
> > > >
> > > > You can easily reproduce the problem.
> > > >
> > > > 1. start on guest
> > > > 2. reset the guest
> > > > 3. inject a NMI when the guest show the grub surface 4. then the guest
> > > > stuck
> > >
> > > Does the SeaBIOS patch below help?
> >
> > Sorry, it doesn't work. What's worse is we cannot stop SeaBIOS stuck by
> > Setting "CONFIG_ENTRY_EXTRASTACK=n" after applying this patch.
> 
> Oops, can you try with the patch below instead?
> 

It works now. Thanks!

But do we need to check other possible situations
that maybe cause *extra stack* broken or overridden?


> > > I'm not familiar with how to "inject a
> > > NMI" - can you describe the process in more detail?
> > >
> >
> > 1. Qemu Command line:
> >
> > #: /home/qemu/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -m 4096
> -smp 8 -name suse -vnc 0.0.0.0:10 \
> > -device virtio-scsi-pci,id=scsi0 -drive
> file=/home/suse11_sp3_32_2,if=none,id=drive-scsi0-0-0-0,format=raw,cache=
> none,aio=native \
> > -device scsi-hd,bus=scsi0.0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0 \
> > -chardev file,id=seabios,path=/home/seabios.log -device
> isa-debugcon,iobase=0x402,chardev=seabios \
> > -monitor stdio -qmp unix:/tmp/qmp,server,nowait
> >
> > 2. Inject a NMI by QMP:
> >
> > #: /home/qemu/scripts/qmp # ./qmp-shell /tmp/qmp
> > Welcome to the QMP low-level shell!
> > Connected to QEMU 2.5.0
> >
> > (QEMU) system_reset
> > {"return": {}}
> > (QEMU) inject-nmi
> > {"return": {}}
> > (QEMU) inject-nmi
> > {"return": {}}
> >
> 
> I tried a few simple tests but was not able to reproduce.
> 
After reset the guest, then you inject an NMI when you see the grub surface
ASAP. 

Kevin, I sent you a picture in private. :)


Regards,
-Gonglei

> -Kevin
> 
> 
> --- a/src/romlayout.S
> +++ b/src/romlayout.S
> @@ -548,7 +548,10 @@ entry_post:
>          ENTRY_INTO32 _cfunc32flat_handle_post   // Normal entry point
> 
>          ORG 0xe2c3
> -        IRQ_ENTRY 02
> +        .global entry_02
> +entry_02:
> +        ENTRY handle_02  // NMI handler does not switch onto extra stack
> +        iretw
> 
>          ORG 0xe3fe
>          .global entry_13_official

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
  2015-12-23  6:40                                         ` [Qemu-devel] " Gonglei (Arei)
@ 2015-12-23 18:06                                           ` Kevin O'Connor
  -1 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-23 18:06 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Xulei (Stone),
	Paolo Bonzini, qemu-devel, seabios, Huangweidong (C),
	kvm, Radim Krcmar

On Wed, Dec 23, 2015 at 06:40:12AM +0000, Gonglei (Arei) wrote:
> > From: Kevin O'Connor [mailto:kevin@koconnor.net]
> > On Tue, Dec 22, 2015 at 02:14:12AM +0000, Gonglei (Arei) wrote:
> > > Sorry, it doesn't work. What's worse is we cannot stop SeaBIOS stuck by
> > > Setting "CONFIG_ENTRY_EXTRASTACK=n" after applying this patch.
> > 
> > Oops, can you try with the patch below instead?
> > 
> 
> It works now. Thanks!
> 
> But do we need to check other possible situations
> that maybe cause *extra stack* broken or overridden?

I believe the issue is that an NMI could occur while SeaBIOS is
already using its extra stack.  The code is not prepared to switch
into the extra stack while already on the extra stack.  SeaBIOS is
careful to always disable IRQs while running C code to prevent this
issue, but disabling normal IRQs does not disable NMIs.  So, I believe
this issue is specific to the nature of NMIs.

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
@ 2015-12-23 18:06                                           ` Kevin O'Connor
  0 siblings, 0 replies; 41+ messages in thread
From: Kevin O'Connor @ 2015-12-23 18:06 UTC (permalink / raw)
  To: Gonglei (Arei)
  Cc: Huangweidong (C), kvm, Radim Krcmar, seabios, Xulei (Stone),
	qemu-devel, Paolo Bonzini

On Wed, Dec 23, 2015 at 06:40:12AM +0000, Gonglei (Arei) wrote:
> > From: Kevin O'Connor [mailto:kevin@koconnor.net]
> > On Tue, Dec 22, 2015 at 02:14:12AM +0000, Gonglei (Arei) wrote:
> > > Sorry, it doesn't work. What's worse is we cannot stop SeaBIOS stuck by
> > > Setting "CONFIG_ENTRY_EXTRASTACK=n" after applying this patch.
> > 
> > Oops, can you try with the patch below instead?
> > 
> 
> It works now. Thanks!
> 
> But do we need to check other possible situations
> that maybe cause *extra stack* broken or overridden?

I believe the issue is that an NMI could occur while SeaBIOS is
already using its extra stack.  The code is not prepared to switch
into the extra stack while already on the extra stack.  SeaBIOS is
careful to always disable IRQs while running C code to prevent this
issue, but disabling normal IRQs does not disable NMIs.  So, I believe
this issue is specific to the nature of NMIs.

-Kevin

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2015-12-23 18:06 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-03  6:58 [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform Xulei (Stone, Euler)
2015-11-04  0:48 ` Gonglei
2015-11-04 17:42   ` Kevin O'Connor
2015-11-06  9:12     ` Xulei (Stone)
2015-11-09 13:32       ` Kevin O'Connor
2015-11-09 20:06         ` Kevin O'Connor
2015-11-09 20:27           ` Kevin O'Connor
2015-11-19  1:04             ` Xulei (Stone)
2015-11-19 12:42               ` Xulei (Stone)
2015-11-19 13:40                 ` Kevin O'Connor
2015-11-20  2:05                   ` Xulei (Stone)
     [not found]                   ` <33183CC9F5247A488A2544077AF19020B02B72BA@SZXEMA503-MBS.china.huawei.com>
2015-12-18 23:13                     ` Kevin O'Connor
2015-12-18 23:13                       ` Kevin O'Connor
2015-12-19  6:28                       ` Gonglei (Arei)
2015-12-19  6:28                         ` [Qemu-devel] " Gonglei (Arei)
2015-12-19 12:03                       ` Gonglei (Arei)
2015-12-19 12:03                         ` [Qemu-devel] " Gonglei (Arei)
2015-12-19 15:11                         ` Kevin O'Connor
2015-12-19 15:11                           ` Kevin O'Connor
2015-12-20  9:49                           ` Gonglei (Arei)
2015-12-20  9:49                             ` [Qemu-devel] " Gonglei (Arei)
2015-12-20 14:33                             ` Kevin O'Connor
2015-12-20 14:33                               ` Kevin O'Connor
2015-12-21  9:41                               ` Gonglei (Arei)
2015-12-21  9:41                                 ` [Qemu-devel] " Gonglei (Arei)
2015-12-21 18:47                                 ` Kevin O'Connor
2015-12-21 18:47                                   ` [Qemu-devel] " Kevin O'Connor
2015-12-22  2:14                                   ` Gonglei (Arei)
2015-12-22  2:14                                     ` Gonglei (Arei)
2015-12-22  3:15                                     ` Xulei (Stone)
2015-12-22  3:15                                       ` [Qemu-devel] " Xulei (Stone)
2015-12-22 15:38                                       ` Kevin O'Connor
2015-12-22 15:38                                         ` [Qemu-devel] " Kevin O'Connor
2015-12-22 15:51                                     ` Kevin O'Connor
2015-12-22 15:51                                       ` Kevin O'Connor
2015-12-23  6:40                                       ` Gonglei (Arei)
2015-12-23  6:40                                         ` [Qemu-devel] " Gonglei (Arei)
2015-12-23 18:06                                         ` Kevin O'Connor
2015-12-23 18:06                                           ` Kevin O'Connor
2015-12-19  1:08                   ` Gonglei (Arei)
2015-12-19  1:08                     ` [Qemu-devel] " Gonglei (Arei)

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.