From mboxrd@z Thu Jan 1 00:00:00 1970
From: Kevin O'Connor
Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy problem on qemu-kvm platform
Date: Sun, 20 Dec 2015 09:33:20 -0500
Message-ID: <20151220143320.GA23942@morn.lan>
References: <20151109200618.GA29129@morn.lan>
 <20151109202726.GA31490@morn.lan>
 <8E78D212B8C25246BE4CE7EA0E645FE52B5BE3@SZXEMI504-MBS.china.huawei.com>
 <8E78D212B8C25246BE4CE7EA0E645FE52B72B7@SZXEMI504-MBS.china.huawei.com>
 <20151119134039.GA27717@morn.lan>
 <33183CC9F5247A488A2544077AF19020B02B72BA@SZXEMA503-MBS.china.huawei.com>
 <20151218231326.GA4138@morn.lan>
 <33183CC9F5247A488A2544077AF19020B02B7A73@SZXEMA503-MBS.china.huawei.com>
 <20151219151159.GA22542@morn.lan>
 <33183CC9F5247A488A2544077AF19020B02B7BC2@SZXEMA503-MBS.china.huawei.com>
In-Reply-To: <33183CC9F5247A488A2544077AF19020B02B7BC2@SZXEMA503-MBS.china.huawei.com>
To: "Gonglei (Arei)"
Cc: "Xulei (Stone)", Paolo Bonzini, qemu-devel, "seabios@seabios.org",
 "Huangweidong (C)", "kvm@vger.kernel.org", Radim Krcmar

On Sun, Dec 20, 2015 at 09:49:54AM +0000, Gonglei (Arei) wrote:
> > From: Kevin O'Connor [mailto:kevin@koconnor.net]
> > Sent: Saturday, December 19, 2015 11:12 PM
> > On Sat, Dec 19, 2015 at 12:03:15PM +0000, Gonglei (Arei) wrote:
> > > Maybe the root cause is not NMI but INTR, so yield() can open hardware interrupt,
> > > And then execute interrupt handler, but the interrupt handler make the SeaBIOS
> > > stack broken, so that the BSP can't execute the instruction and occur exception,
> > > VM_EXIT to Kmod, which is an infinite loop. But I don't have any proofs except
> > > the surface phenomenon.
> >
> > I can't see any reason why allowing interrupts at this location would
> > be a problem.
> >
> Does it have any relationship with *extra stack* of SeaBIOS?

None that I can see.  Also, the kvm trace seems to show the code
trying to execute at rip=0x03 - that will crash long before the extra
stack is used.

> > > Kevin, can we drop yield() in smp_setup() ?
> >
> > It's possible to eliminate this instance of yield, but I think it
> > would just push the crash to the next time interrupts are enabled.
> >
> Perhaps. I'm not sure.
>
> > > Is it really useful and allowable for SeaBIOS? Maybe for other components?
> > > I'm not sure. Because we found that when SeaBIOS is booting, if we inject a
> > > NMI by QMP, the guest will *stuck*. And the kvm tracing log is the same with
> > > the current problem.
> >
> > If you apply the patches you had to prevent that NMI crash problem,
> > does it also prevent the above crash?
> >
> Yes, but we cannot prevent the NMI injection (though I'll submit some patches to
> forbid users' NMI injection after NMI_EN disabled by RTC bit7 of port 0x70).
>

-Kevin