From: Huang Ying
Date: Fri, 20 May 2011 16:13:25 +0800
To: Don Zickus
CC: Andi Kleen, Cyrill Gorcunov, huang ying, Ingo Molnar, "linux-kernel@vger.kernel.org", Andi Kleen, Robert Richter
Subject: Re: [RFC] x86, NMI, Treat unknown NMI as hardware error
Message-ID: <4DD622A5.9030902@intel.com>
In-Reply-To: <20110517190738.GH29881@redhat.com>
References: <4DCE3493.4090404@gmail.com> <4DCF7413.4070704@gmail.com>
 <4DD07959.4030608@intel.com> <20110516190310.GH31888@redhat.com>
 <4DD20A2F.604@intel.com> <20110517142427.GL31888@redhat.com>
 <20110517163847.GF24805@tassilo.jf.intel.com> <20110517175707.GP31888@redhat.com>
 <20110517181859.GA25937@tassilo.jf.intel.com> <20110517190738.GH29881@redhat.com>

Hi, Don,

On 05/18/2011 03:07 AM, Don Zickus wrote:
> On Tue, May 17, 2011 at 11:18:59AM -0700, Andi Kleen wrote:
>>> Random thought, in the Firmware first mode of HEST (which is the only way
>>> GHES records get produced??), does an SCI happen first to jump into the
>>> firmware for processing, then an NMI?
>>
>> Either that or there is a separate service processor which handles it.
>> Presumably it depends a lot on the particular system.
>
> Ah interesting.
> I was going to suggest somehow setting a bit when an SCI
> comes in and check that bit in the unknown NMI path as a possible hint
> that the NMI might be related to HEST (sorta how we flag unknown NMIs in
> the perf code).
>
> It was just an idea.  Obviously a service processor will make that more
> difficult. :-)

Hmm, what's the conclusion?  Do you think an unknown NMI should be seen as
a hardware error?  At least on some whitelisted machines?

Best Regards,
Huang Ying