From: "Gonglei (Arei)" <arei.gonglei@huawei.com> To: "Kevin O'Connor" <kevin@koconnor.net>, "Xulei (Stone)" <stone.xulei@huawei.com>, Paolo Bonzini <pbonzini@redhat.com> Cc: "seabios@seabios.org" <seabios@seabios.org>, "Huangweidong \(C\)" <weidong.huang@huawei.com>, qemu-devel <qemu-devel@nongnu.org>, "kvm@vger.kernel.org" <kvm@vger.kernel.org> Subject: Re: [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform Date: Sat, 19 Dec 2015 01:08:05 +0000 [thread overview] Message-ID: <33183CC9F5247A488A2544077AF19020B02B79EE@SZXEMA503-MBS.china.huawei.com> (raw) In-Reply-To: <20151119134039.GA27717@morn.lan> Hi Kevin & Paolo, Luckily, I reproduced this problem last night. And I got the below log when SeaBIOS is stuck. [BTW, the whole SeaBIOS log attached] [2015-12-18 10:38:10] >>>>>gonglei: enter smp_setup()... [2015-12-18 10:38:10] >>>>>gonglei: begine to enable local APIC... [2015-12-18 10:38:10] >>>>>gonglei: finish enable local APIC... [2015-12-18 10:38:10] >>>gonglei: cmos_smp_count=8 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=5 [2015-12-18 10:38:10] ===: CountCPUs=2, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=7 [2015-12-18 10:38:10] ===: CountCPUs=3, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=1 [2015-12-18 10:38:10] ===: CountCPUs=4, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=2 [2015-12-18 10:38:10] ===: CountCPUs=5, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=4 [2015-12-18 10:38:10] ===: CountCPUs=6, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=3 [2015-12-18 10:38:10] ===: CountCPUs=7, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=6 [2015-12-18 10:38:10] ===: CountCPUs=8, SMPStack=0x6d84 [2015-12-18 10:38:10] gonglei: finish while [pid 31509 is a vcpu thread used 100% cpu overhead] # cat /proc/31509/stack [<ffffffffa05c337c>] vmx_vcpu_run+0x35c/0x580 [kvm_intel] [<ffffffffa06b0f10>] em_push+0x0/0x20 [kvm] [<ffffffffa06a30dc>] x86_emulate_instruction+0x20c/0x440 [kvm] [<ffffffffa05c9224>] handle_exception+0xe4/0x1b58 [kvm_intel] [<ffffffffa06a24c5>] vcpu_enter_guest+0x565/0x790 [kvm] [<ffffffffa05bf990>] vmx_get_segment_base+0x0/0xb0 [kvm_intel] [<ffffffffa06a2888>] __vcpu_run+0x198/0x260 [kvm] [<ffffffffa06a3508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm] [<ffffffffa068f92e>] vcpu_load+0x4e/0x80 [kvm] [<ffffffffa068fcee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm] [<ffffffff8109618d>] futex_wake+0xfd/0x110 [<ffffffff811ed57c>] security_file_permission+0x1c/0xa0 [<ffffffff8116bedb>] do_vfs_ioctl+0x8b/0x3b0 [<ffffffff8116c2a1>] sys_ioctl+0xa1/0xb0 [<ffffffff81469272>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff And kvm tracing information: <...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180077: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180078: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180078: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180079: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180079: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180079: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180080: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180080: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180080: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180081: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180081: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180081: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180081: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180082: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180083: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180083: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180083: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180084: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180084: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180084: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180084: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180085: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180085: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180085: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180085: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180086: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 Now, it's very clear that the guest stuck in yiled(), and then kvm encounter the exception #UD. Do you have any thoughts? Thanks! The Seabios patch below: diff --git a/roms/seabios/src/boot.c b/roms/seabios/src/boot.c index f23e9e1..552914a 100644 --- a/roms/seabios/src/boot.c +++ b/roms/seabios/src/boot.c @@ -93,7 +93,7 @@ glob_prefix(const char *glob, const char *str) static int find_prio(const char *glob) { - dprintf(1, "Searching bootorder for: %s\n", glob); + //dprintf(1, "Searching bootorder for: %s\n", glob); int i; for (i = 0; i < BootorderCount; i++) if (glob_prefix(glob, Bootorder[i])) diff --git a/roms/seabios/src/fw/smp.c b/roms/seabios/src/fw/smp.c index a466ea6..46ec607 100644 --- a/roms/seabios/src/fw/smp.c +++ b/roms/seabios/src/fw/smp.c @@ -46,12 +46,16 @@ int apic_id_is_present(u8 apic_id) return !!(FoundAPICIDs[apic_id/32] & (1ul << (apic_id % 32))); } +// Atomic lock for shared stack across processors. +u32 SMPLock __VISIBLE; +u32 SMPStack __VISIBLE; + void VISIBLE32FLAT handle_smp(void) { if (!CONFIG_QEMU) return; - + dprintf(1, ">>> enter handle_smp...\n"); // Enable CPU caching setcr0(getcr0() & ~(CR0_CD|CR0_NW)); @@ -70,19 +74,16 @@ handle_smp(void) FoundAPICIDs[apic_id/32] |= (1 << (apic_id % 32)); CountCPUs++; + dprintf(1, "===: CountCPUs=%d, SMPStack=0x%x\n", CountCPUs, SMPStack); } -// Atomic lock for shared stack across processors. -u32 SMPLock __VISIBLE; -u32 SMPStack __VISIBLE; - // find and initialize the CPUs by launching a SIPI to them void smp_setup(void) { if (!CONFIG_QEMU) return; - + dprintf(1, ">>>>>gonglei: enter smp_setup()...\n"); ASSERT32FLAT(); u32 eax, ebx, ecx, cpuid_features; cpuid(1, &eax, &ebx, &ecx, &cpuid_features); @@ -106,7 +107,7 @@ smp_setup(void) u64 new = (0xea | ((u64)SEG_BIOS<<24) | (((u32)entry_smp - BUILD_BIOS_ADDR) << 8)); *(u64*)BUILD_AP_BOOT_ADDR = new; - + dprintf(1, ">>>>>gonglei: begine to enable local APIC...\n"); // enable local APIC u32 val = readl(APIC_SVR); writel(APIC_SVR, val | APIC_ENABLED); @@ -125,10 +126,21 @@ smp_setup(void) writel(APIC_ICR_LOW, 0x000C4500); u32 sipi_vector = BUILD_AP_BOOT_ADDR >> 12; writel(APIC_ICR_LOW, 0x000C4600 | sipi_vector); - + dprintf(1, ">>>>>gonglei: finish enable local APIC...\n"); // Wait for other CPUs to process the SIPI. u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1; - while (cmos_smp_count != CountCPUs) + dprintf(1, ">>>gonglei: cmos_smp_count=%d\n", cmos_smp_count); + while (cmos_smp_count != CountCPUs) { asm volatile( // Release lock and allow other processors to use the stack. " movl %%esp, %1\n" @@ -139,8 +151,11 @@ smp_setup(void) " jc 1b\n" : "+m" (SMPLock), "+m" (SMPStack) : : "cc", "memory"); + //yield(); + } + dprintf(1, " gonglei: finish while\n"); yield(); - + dprintf(1, " gonglei: finish yield\n"); // Restore memory. *(u64*)BUILD_AP_BOOT_ADDR = old; diff --git a/roms/seabios/src/misc.c b/roms/seabios/src/misc.c index 8caaf31..77f6be3 100644 --- a/roms/seabios/src/misc.c +++ b/roms/seabios/src/misc.c @@ -64,6 +64,10 @@ void VISIBLE16 handle_02(void) { debug_isr(DEBUG_ISR_02); + dprintf(1, "gonglei hand nmi inject, write rtc \n"); } void diff --git a/roms/seabios/src/stacks.c b/roms/seabios/src/stacks.c index 1dbdfe9..c1b5203 100644 --- a/roms/seabios/src/stacks.c +++ b/roms/seabios/src/stacks.c @@ -174,6 +174,7 @@ call16_smm(u32 eax, u32 edx, void *func) static void call32_sloppy_prep(void) { + dprintf(1, ">>> enter call32_sloppy_prep...\n"); // Backup cmos index register and disable nmi u8 cmosindex = inb(PORT_CMOS_INDEX); outb(cmosindex | NMI_DISABLE_BIT, PORT_CMOS_INDEX); Regards, -Gonglei > -----Original Message----- > From: Kevin O'Connor [mailto:kevin@koconnor.net] > Sent: Thursday, November 19, 2015 9:41 PM > To: Xulei (Stone) > Cc: Gonglei (Arei); qemu-devel; seabios@seabios.org; Huangweidong (C) > Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy > problem on qemu-kvm platform > > On Thu, Nov 19, 2015 at 12:42:50PM +0000, Xulei (Stone) wrote: > > Kevin, > > > > After deeply analyzing, i think there may be 3 possible reasons: > > 1)wrong CountCPUs value. It seems CountCPUs++ in handle_smp() has no > > lock to protect. So, sometimes, 2 or more vcpu may get the same > > current value of CountCPUs. Then we'll get a single incrementation > > instead of 2 or more and "while (cmos_smp_count != CountCPUs)" will > > loop forever; > > The handle_smp() code is called from romlayout.S:entry_smp() which does take > a lock. So, all of handle_smp() should run synchronous. > > > 2)wrong cmos_smp_count value. SeaBIOS rtc reads an incorrect number? > > Not sure - the last time there were problems in this area of the code others > used kvmtrace to try and track this down. Since you are getting dprintf > statements, you could also try outputting cmos_smp_count prior to the loop > (see patch below). > > > 3)yield() stuck. Is it possible that SeaBIOS is stuck during yield? > > I've tested, when yield() is running, SeaBIOS seems has not created > > some other threads except the main thread. So I don't know what's the > > function of yield() here.? > > The yield() allows hardware interrupts to occur. But note that > yield() isn't called in the loop - is is only called after the loop completes. > > If you are only getting this on massive repetitive reboot requests, there are > some other possible explanations: > > - perhaps the SIPI is getting lost because one of the CPUs is still > resetting or still processing a SIPI from the last reboot? > > - the seabios code itself may have been corrupted if the memcpy() in > qemu_prep_reset() got far enough along to clear HaveRunPost, but did > not get far enough along to fully complete the memcpy(). > > If the failure is reproducible, the patch below could help narrow the > possibilities. > > -Kevin > > > --- a/src/fw/smp.c > +++ b/src/fw/smp.c > @@ -125,6 +125,7 @@ smp_setup(void) > > // Wait for other CPUs to process the SIPI. > u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1; > + dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count); > while (cmos_smp_count != CountCPUs) > asm volatile( > // Release lock and allow other processors to use the stack. > @@ -136,6 +137,7 @@ smp_setup(void) > " jc 1b\n" > : "+m" (SMPLock), "+m" (SMPStack) > : : "cc", "memory"); > + dprintf(1, "finish smp\n"); > yield(); > > // Restore memory.
WARNING: multiple messages have this Message-ID (diff)
From: "Gonglei (Arei)" <arei.gonglei@huawei.com> To: Kevin O'Connor <kevin@koconnor.net>, "Xulei (Stone)" <stone.xulei@huawei.com>, Paolo Bonzini <pbonzini@redhat.com> Cc: "seabios@seabios.org" <seabios@seabios.org>, "Huangweidong (C)" <weidong.huang@huawei.com>, qemu-devel <qemu-devel@nongnu.org>, "kvm@vger.kernel.org" <kvm@vger.kernel.org> Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform Date: Sat, 19 Dec 2015 01:08:05 +0000 [thread overview] Message-ID: <33183CC9F5247A488A2544077AF19020B02B79EE@SZXEMA503-MBS.china.huawei.com> (raw) In-Reply-To: <20151119134039.GA27717@morn.lan> Hi Kevin & Paolo, Luckily, I reproduced this problem last night. And I got the below log when SeaBIOS is stuck. [BTW, the whole SeaBIOS log attached] [2015-12-18 10:38:10] >>>>>gonglei: enter smp_setup()... [2015-12-18 10:38:10] >>>>>gonglei: begine to enable local APIC... [2015-12-18 10:38:10] >>>>>gonglei: finish enable local APIC... [2015-12-18 10:38:10] >>>gonglei: cmos_smp_count=8 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=5 [2015-12-18 10:38:10] ===: CountCPUs=2, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=7 [2015-12-18 10:38:10] ===: CountCPUs=3, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=1 [2015-12-18 10:38:10] ===: CountCPUs=4, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=2 [2015-12-18 10:38:10] ===: CountCPUs=5, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=4 [2015-12-18 10:38:10] ===: CountCPUs=6, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=3 [2015-12-18 10:38:10] ===: CountCPUs=7, SMPStack=0x6d84 [2015-12-18 10:38:10] >>> enter handle_smp... [2015-12-18 10:38:10] handle_smp: apic_id=6 [2015-12-18 10:38:10] ===: CountCPUs=8, SMPStack=0x6d84 [2015-12-18 10:38:10] gonglei: finish while [pid 31509 is a vcpu thread used 100% cpu overhead] # cat /proc/31509/stack [<ffffffffa05c337c>] vmx_vcpu_run+0x35c/0x580 [kvm_intel] [<ffffffffa06b0f10>] em_push+0x0/0x20 [kvm] [<ffffffffa06a30dc>] x86_emulate_instruction+0x20c/0x440 [kvm] [<ffffffffa05c9224>] handle_exception+0xe4/0x1b58 [kvm_intel] [<ffffffffa06a24c5>] vcpu_enter_guest+0x565/0x790 [kvm] [<ffffffffa05bf990>] vmx_get_segment_base+0x0/0xb0 [kvm_intel] [<ffffffffa06a2888>] __vcpu_run+0x198/0x260 [kvm] [<ffffffffa06a3508>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm] [<ffffffffa068f92e>] vcpu_load+0x4e/0x80 [kvm] [<ffffffffa068fcee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm] [<ffffffff8109618d>] futex_wake+0xfd/0x110 [<ffffffff811ed57c>] security_file_permission+0x1c/0xa0 [<ffffffff8116bedb>] do_vfs_ioctl+0x8b/0x3b0 [<ffffffff8116c2a1>] sys_ioctl+0xa1/0xb0 [<ffffffff81469272>] system_call_fastpath+0x16/0x1b [<ffffffffffffffff>] 0xffffffffffffffff And kvm tracing information: <...>-31509 [035] 154753.180077: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180077: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180077: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180077: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180078: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180078: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180079: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180079: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180079: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180080: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180080: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180080: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180081: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180081: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180081: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180081: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180082: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180083: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180083: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180083: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180084: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180084: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180084: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180084: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180085: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 <...>-31509 [035] 154753.180085: kvm_emulate_insn: 0:3:f0 53 (real) <...>-31509 [035] 154753.180085: kvm_inj_exception: #UD (0x0) <...>-31509 [035] 154753.180085: kvm_entry: vcpu 0 <...>-31509 [035] 154753.180086: kvm_exit: reason EXCEPTION_NMI rip 0x3 info 0 80000306 Now, it's very clear that the guest stuck in yiled(), and then kvm encounter the exception #UD. Do you have any thoughts? Thanks! The Seabios patch below: diff --git a/roms/seabios/src/boot.c b/roms/seabios/src/boot.c index f23e9e1..552914a 100644 --- a/roms/seabios/src/boot.c +++ b/roms/seabios/src/boot.c @@ -93,7 +93,7 @@ glob_prefix(const char *glob, const char *str) static int find_prio(const char *glob) { - dprintf(1, "Searching bootorder for: %s\n", glob); + //dprintf(1, "Searching bootorder for: %s\n", glob); int i; for (i = 0; i < BootorderCount; i++) if (glob_prefix(glob, Bootorder[i])) diff --git a/roms/seabios/src/fw/smp.c b/roms/seabios/src/fw/smp.c index a466ea6..46ec607 100644 --- a/roms/seabios/src/fw/smp.c +++ b/roms/seabios/src/fw/smp.c @@ -46,12 +46,16 @@ int apic_id_is_present(u8 apic_id) return !!(FoundAPICIDs[apic_id/32] & (1ul << (apic_id % 32))); } +// Atomic lock for shared stack across processors. +u32 SMPLock __VISIBLE; +u32 SMPStack __VISIBLE; + void VISIBLE32FLAT handle_smp(void) { if (!CONFIG_QEMU) return; - + dprintf(1, ">>> enter handle_smp...\n"); // Enable CPU caching setcr0(getcr0() & ~(CR0_CD|CR0_NW)); @@ -70,19 +74,16 @@ handle_smp(void) FoundAPICIDs[apic_id/32] |= (1 << (apic_id % 32)); CountCPUs++; + dprintf(1, "===: CountCPUs=%d, SMPStack=0x%x\n", CountCPUs, SMPStack); } -// Atomic lock for shared stack across processors. -u32 SMPLock __VISIBLE; -u32 SMPStack __VISIBLE; - // find and initialize the CPUs by launching a SIPI to them void smp_setup(void) { if (!CONFIG_QEMU) return; - + dprintf(1, ">>>>>gonglei: enter smp_setup()...\n"); ASSERT32FLAT(); u32 eax, ebx, ecx, cpuid_features; cpuid(1, &eax, &ebx, &ecx, &cpuid_features); @@ -106,7 +107,7 @@ smp_setup(void) u64 new = (0xea | ((u64)SEG_BIOS<<24) | (((u32)entry_smp - BUILD_BIOS_ADDR) << 8)); *(u64*)BUILD_AP_BOOT_ADDR = new; - + dprintf(1, ">>>>>gonglei: begine to enable local APIC...\n"); // enable local APIC u32 val = readl(APIC_SVR); writel(APIC_SVR, val | APIC_ENABLED); @@ -125,10 +126,21 @@ smp_setup(void) writel(APIC_ICR_LOW, 0x000C4500); u32 sipi_vector = BUILD_AP_BOOT_ADDR >> 12; writel(APIC_ICR_LOW, 0x000C4600 | sipi_vector); - + dprintf(1, ">>>>>gonglei: finish enable local APIC...\n"); // Wait for other CPUs to process the SIPI. u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1; - while (cmos_smp_count != CountCPUs) + dprintf(1, ">>>gonglei: cmos_smp_count=%d\n", cmos_smp_count); + while (cmos_smp_count != CountCPUs) { asm volatile( // Release lock and allow other processors to use the stack. " movl %%esp, %1\n" @@ -139,8 +151,11 @@ smp_setup(void) " jc 1b\n" : "+m" (SMPLock), "+m" (SMPStack) : : "cc", "memory"); + //yield(); + } + dprintf(1, " gonglei: finish while\n"); yield(); - + dprintf(1, " gonglei: finish yield\n"); // Restore memory. *(u64*)BUILD_AP_BOOT_ADDR = old; diff --git a/roms/seabios/src/misc.c b/roms/seabios/src/misc.c index 8caaf31..77f6be3 100644 --- a/roms/seabios/src/misc.c +++ b/roms/seabios/src/misc.c @@ -64,6 +64,10 @@ void VISIBLE16 handle_02(void) { debug_isr(DEBUG_ISR_02); + dprintf(1, "gonglei hand nmi inject, write rtc \n"); } void diff --git a/roms/seabios/src/stacks.c b/roms/seabios/src/stacks.c index 1dbdfe9..c1b5203 100644 --- a/roms/seabios/src/stacks.c +++ b/roms/seabios/src/stacks.c @@ -174,6 +174,7 @@ call16_smm(u32 eax, u32 edx, void *func) static void call32_sloppy_prep(void) { + dprintf(1, ">>> enter call32_sloppy_prep...\n"); // Backup cmos index register and disable nmi u8 cmosindex = inb(PORT_CMOS_INDEX); outb(cmosindex | NMI_DISABLE_BIT, PORT_CMOS_INDEX); Regards, -Gonglei > -----Original Message----- > From: Kevin O'Connor [mailto:kevin@koconnor.net] > Sent: Thursday, November 19, 2015 9:41 PM > To: Xulei (Stone) > Cc: Gonglei (Arei); qemu-devel; seabios@seabios.org; Huangweidong (C) > Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy > problem on qemu-kvm platform > > On Thu, Nov 19, 2015 at 12:42:50PM +0000, Xulei (Stone) wrote: > > Kevin, > > > > After deeply analyzing, i think there may be 3 possible reasons: > > 1)wrong CountCPUs value. It seems CountCPUs++ in handle_smp() has no > > lock to protect. So, sometimes, 2 or more vcpu may get the same > > current value of CountCPUs. Then we'll get a single incrementation > > instead of 2 or more and "while (cmos_smp_count != CountCPUs)" will > > loop forever; > > The handle_smp() code is called from romlayout.S:entry_smp() which does take > a lock. So, all of handle_smp() should run synchronous. > > > 2)wrong cmos_smp_count value. SeaBIOS rtc reads an incorrect number? > > Not sure - the last time there were problems in this area of the code others > used kvmtrace to try and track this down. Since you are getting dprintf > statements, you could also try outputting cmos_smp_count prior to the loop > (see patch below). > > > 3)yield() stuck. Is it possible that SeaBIOS is stuck during yield? > > I've tested, when yield() is running, SeaBIOS seems has not created > > some other threads except the main thread. So I don't know what's the > > function of yield() here.? > > The yield() allows hardware interrupts to occur. But note that > yield() isn't called in the loop - is is only called after the loop completes. > > If you are only getting this on massive repetitive reboot requests, there are > some other possible explanations: > > - perhaps the SIPI is getting lost because one of the CPUs is still > resetting or still processing a SIPI from the last reboot? > > - the seabios code itself may have been corrupted if the memcpy() in > qemu_prep_reset() got far enough along to clear HaveRunPost, but did > not get far enough along to fully complete the memcpy(). > > If the failure is reproducible, the patch below could help narrow the > possibilities. > > -Kevin > > > --- a/src/fw/smp.c > +++ b/src/fw/smp.c > @@ -125,6 +125,7 @@ smp_setup(void) > > // Wait for other CPUs to process the SIPI. > u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1; > + dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count); > while (cmos_smp_count != CountCPUs) > asm volatile( > // Release lock and allow other processors to use the stack. > @@ -136,6 +137,7 @@ smp_setup(void) > " jc 1b\n" > : "+m" (SMPLock), "+m" (SMPStack) > : : "cc", "memory"); > + dprintf(1, "finish smp\n"); > yield(); > > // Restore memory.
next prev parent reply other threads:[~2015-12-19 1:08 UTC|newest] Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top 2015-11-03 6:58 [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform Xulei (Stone, Euler) 2015-11-04 0:48 ` Gonglei 2015-11-04 17:42 ` Kevin O'Connor 2015-11-06 9:12 ` Xulei (Stone) 2015-11-09 13:32 ` Kevin O'Connor 2015-11-09 20:06 ` Kevin O'Connor 2015-11-09 20:27 ` Kevin O'Connor 2015-11-19 1:04 ` Xulei (Stone) 2015-11-19 12:42 ` Xulei (Stone) 2015-11-19 13:40 ` Kevin O'Connor 2015-11-20 2:05 ` Xulei (Stone) [not found] ` <33183CC9F5247A488A2544077AF19020B02B72BA@SZXEMA503-MBS.china.huawei.com> 2015-12-18 23:13 ` Kevin O'Connor 2015-12-18 23:13 ` Kevin O'Connor 2015-12-19 6:28 ` Gonglei (Arei) 2015-12-19 6:28 ` [Qemu-devel] " Gonglei (Arei) 2015-12-19 12:03 ` Gonglei (Arei) 2015-12-19 12:03 ` [Qemu-devel] " Gonglei (Arei) 2015-12-19 15:11 ` Kevin O'Connor 2015-12-19 15:11 ` Kevin O'Connor 2015-12-20 9:49 ` Gonglei (Arei) 2015-12-20 9:49 ` [Qemu-devel] " Gonglei (Arei) 2015-12-20 14:33 ` Kevin O'Connor 2015-12-20 14:33 ` Kevin O'Connor 2015-12-21 9:41 ` Gonglei (Arei) 2015-12-21 9:41 ` [Qemu-devel] " Gonglei (Arei) 2015-12-21 18:47 ` Kevin O'Connor 2015-12-21 18:47 ` [Qemu-devel] " Kevin O'Connor 2015-12-22 2:14 ` Gonglei (Arei) 2015-12-22 2:14 ` Gonglei (Arei) 2015-12-22 3:15 ` Xulei (Stone) 2015-12-22 3:15 ` [Qemu-devel] " Xulei (Stone) 2015-12-22 15:38 ` Kevin O'Connor 2015-12-22 15:38 ` [Qemu-devel] " Kevin O'Connor 2015-12-22 15:51 ` Kevin O'Connor 2015-12-22 15:51 ` Kevin O'Connor 2015-12-23 6:40 ` Gonglei (Arei) 2015-12-23 6:40 ` [Qemu-devel] " Gonglei (Arei) 2015-12-23 18:06 ` Kevin O'Connor 2015-12-23 18:06 ` Kevin O'Connor 2015-12-19 1:08 ` Gonglei (Arei) [this message] 2015-12-19 1:08 ` Gonglei (Arei)
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=33183CC9F5247A488A2544077AF19020B02B79EE@SZXEMA503-MBS.china.huawei.com \ --to=arei.gonglei@huawei.com \ --cc=kevin@koconnor.net \ --cc=kvm@vger.kernel.org \ --cc=pbonzini@redhat.com \ --cc=qemu-devel@nongnu.org \ --cc=seabios@seabios.org \ --cc=stone.xulei@huawei.com \ --cc=weidong.huang@huawei.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.