From mboxrd@z Thu Jan 1 00:00:00 1970
From: agya naila
Subject: Re: IBM HS20 Xen 4.1 and 4.2 Critical Interrupt - Front panel NMI crash
Date: Wed, 6 Feb 2013 13:01:55 +0100
References: <20130205163021.GR8912@reaktio.net> <20130205200847.GS8912@reaktio.net> <51121B5002000078000BC573@nat28.tlf.novell.com> <20130206112910.GT8912@reaktio.net>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============4336674204630531010=="
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Pasi Kärkkäinen
Cc: arrfab@centos.org, Jan Beulich, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

--===============4336674204630531010==
Content-Type: multipart/alternative; boundary=047d7b5d4808afaa1404d50d1720

--047d7b5d4808afaa1404d50d1720
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

As I read in an IBM paper:

When a non-maskable interrupt (NMI) signal is received, the processor
immediately drops what it was doing and attends to it. The NMI signal is
normally used only for critical problem situations, such as serious hardware
errors. This setting is Enabled by default. When an NMI is issued by a
critical event, the BMC resets the system to recover it. The BMC logs the
reboot and additional error events in the SEL.

I don't know why Xen triggers or causes this NMI signal, since when I boot the
machine with the same operating system (Ubuntu 12.04.1 Desktop 64-bit) without
Xen, it runs perfectly. One more interesting fact: the same Dom0, with exactly
the same Xen version and configuration, runs perfectly on my notebook, a
Toshiba Satellite L735 with an Intel i5. Hopefully someone has a solution for
the server.

Agya

On Wed, Feb 6, 2013 at 12:48 PM, agya naila wrote:

> Thank you, Pasi, for forwarding this email for me too; it seems I am not
> the only one facing this problem. I found that this person also hit a
> similar problem. It is in French, but we can translate it easily using
> Google:
> http://debian.2.n7.nabble.com/Probleme-XEN-4-0-1-et-SQUEEZE-64bits-reboot-td1230690.html
>
> I found the parameter nmi=ignore | dom0 | fatal
>
> nmi=reaction : Enables you to specify how the hypervisor reacts to a
> non-maskable interrupt (NMI) resulting from a parity or I/O error.
> Possible values for reaction are fatal (the hypervisor prints a
> diagnostic message and then hangs), dom0 (send a message to domain0 for
> logging purposes but continue), and ignore (ignore the error). If you do
> not specify this option, Xen uses the default value dom0 internally.
>
> But it still doesn't work on my machine.
>
> Agya
>
>
> On Wed, Feb 6, 2013 at 12:29 PM, Pasi Kärkkäinen wrote:
>
>> On Wed, Feb 06, 2013 at 07:58:56AM +0000, Jan Beulich wrote:
>> > >>> On 05.02.13 at 21:08, Pasi Kärkkäinen wrote:
>> > > Arrfab (CC'd) is actually seeing a similar problem on IBM HS20 blade with
>> > > Xen 4.2.1
>> > > with Linux 3.4.28 dom0 kernel.
>> > >
>> > > Does this ring anyone's bells?
>> > >
>> > >
>> > > serial console log of the crash
>> >
>> > Which doesn't even include the message in the subject afaics, so I
>> > don't even know what you're talking about. And the other, earlier
>> > report has no useful information either.
>> >
>> > From an abstract perspective, a front panel NMI to me would mean
>> > someone pressed an NMI button on the system's front panel. You
>> > don't think Xen can do anything about this, do you? And even if
>> > the NMI has another origin, it's still a hardware generated event
>> > that Xen has no control over.
>> >
>>
>> Arrfab said Xen crashes and reboots in the middle of the boot process,
>> and the blade chassis management logs the NMI error. The user is not
>> pressing (NMI) buttons.
>>
>> The serial log included is everything he gets.
>> No error visible in the serial log,
>> only a crash/reboot without any errors.. No idea what could be causing
>> that..
>>
>> The same Dom0 kernel (pvops 3.4.28) boots OK on baremetal without Xen.
>>
>> Do you have any Xen and/or dom0 kernel options to use to do further
>> analysis?
>>
>> -- Pasi
>>
>>
>
--047d7b5d4808afaa1404d50d1720--

--===============4336674204630531010==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--===============4336674204630531010==--