From: Meng Xu
Subject: Re: Question about Xen reboot on panic
Date: Wed, 18 Nov 2015 22:58:48 -0500
To: Jan Beulich
Cc: Andrew Cooper, Wei Liu, "xen-devel@lists.xen.org"
In-Reply-To: <5645A1D502000078000B494F@prv-mh.provo.novell.com>
References: <5643C716.1050102@citrix.com> <5643D091.7090503@citrix.com> <56448BA8.6080705@citrix.com> <5644D58002000078000B469D@prv-mh.provo.novell.com> <5645A1D502000078000B494F@prv-mh.provo.novell.com>
List-Id: xen-devel@lists.xenproject.org

2015-11-13 2:39 GMT-05:00 Jan Beulich :
>>>> On 12.11.15 at 20:54, wrote:
>> However, the line after that if statement is:
>>     smp_send_stop();
>>
>> which is not in the if ( get_apic_id() != boot_cpu_physical_apicid )
>> statement.
>>
>> So P0 may run this code, and from what I read from this
>> smp_send_stop(), it has the following code:
>>
>>     local_irq_disable();
>>
>>     __stop_this_cpu();
>>
>>     disable_IO_APIC();
>>
>>     hpet_disable();
>>
>>     local_irq_enable();
>>
>> I'm guessing at __stop_this_cpu() when it is on P0, P0 will be
>> stopped. That's why P0 will never have the chance to proceed to the
>> rest of logic in the machine_restart(). Therefore, the machine won't
>> restart.
>
> The code is quite clear in this regard - smp_send_stop() stops all other
> CPUs, but calls only __stop_this_cpu() (not stop_this_cpu()) for itself.
> I.e. execution is at least supposed to make it back to the caller. Also
> please don't forget that this is working for most everyone else, so
> what you're looking for is more likely some oddity on your system, not
> some general issue.

I see. Hmm, maybe it is because of some oddity on my machine, which is
not a commodity machine but assembled from components.
:-(

>
> (Btw - are you doing this on master, which is what I'd expect you to?
> I ask because the sequence of calls you quote above doesn't match
> with what I see on there. I'd really like to avoid hunting a problem
> long fixed.)

Not really. I added several commits on top of master and then "buried"
a bug in the scheduler that crashes the system when I destroy a VM.
Because the reboot issue only appears in some cases when the kernel
crashes, I used this bug to test whether the Xen kernel can
successfully reboot after a crash.

I have not experienced a kernel crash on master. I will come back if I
see the reboot issue on Xen master.

But I still think that Xen should reboot even when other parts of Xen
(not the reboot logic) have a bug. Maybe I'm wrong?

>
>> If I move this smp_send_stop(void) into the if statement, Xen will reboot.
>>
>> Do you think this could be a fix?
>
> Definitely not.

I see the reason now... This issue does not happen on another machine
of mine. It is probably because of the oddity of the assembled machine,
as Jan said. :-(

Thank you very much for your help and advice!

Best,

Meng

-- 
-----------
Meng Xu
PhD Student in Computer and Information Science
University of Pennsylvania
http://www.cis.upenn.edu/~mengxu/