From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755018AbdA0Nho (ORCPT );
        Fri, 27 Jan 2017 08:37:44 -0500
Received: from mx1.molgen.mpg.de ([141.14.17.9]:47417 "EHLO
        mx1.molgen.mpg.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1755011AbdA0Ngj (ORCPT );
        Fri, 27 Jan 2017 08:36:39 -0500
Subject: Re: Dell XPS13: MCE (Hardware Error) reported
To: Ashok Raj
References: <20170104225546.wy36fu5t2jbow2dq@pd.tnic>
 <20170105011236.GA80100@otc-brkl-03>
 <662102c9-94da-3193-08c4-9fe75411cadb@molgen.mpg.de>
 <20170109192336.GA42856@otc-nc-03>
Cc: Borislav Petkov, Linux Kernel Mailing List, Thorsten Leemhuis,
 Len Brown, Tony Luck, Mario Limonciello, Thorsten Leemhuis
From: Paul Menzel
Message-ID: <2714c370-d3ba-4522-a7ec-be30186181f0@molgen.mpg.de>
Date: Fri, 27 Jan 2017 14:35:16 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.6.0
MIME-Version: 1.0
In-Reply-To: <20170109192336.GA42856@otc-nc-03>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Dear Ashok,


On 01/09/17 20:23, Raj, Ashok wrote:
> On Mon, Jan 09, 2017 at 12:53:33PM +0100, Paul Menzel wrote:
>> On 01/05/17 02:12, Raj, Ashok wrote:
>>>>> CPUID Vendor Intel Family 6 Model 142
>>> This is Kabylake Mobile
>>>
>>>>> Hardware event. This is not a software error.
>>>>> MCE 1
>>>>> CPU 0 BANK 7
>>>>> MISC 7880018086 ADDR fef1ce40
>>>>> TIME 1483543069 Wed Jan 4 16:17:49 2017
>>>>> STATUS ee0000000040110a MCGSTATUS 0
>>>
>>> Decoding the bits further from MCi_STATUS above:
>>> Val=1, OVER=1, UC=1, but EN=0 indicates this isn't an MCE, hence should have
>>> been signaled by a CMCI.
>>>
>>> PCC=1, but should be ignored when EN=0.
>>> MCACOD: 110a MSCOD: 0040
>
> This MSCOD indicates that it's a write back access to mmio space.
> It's possible that the BIOS is scanning certain memory regions during boot.
> During that time the BIOS disables generation of MCEs, which is why EN=0
> in the above log.
>
> It's a BIOS bug; one would expect the BIOS to clear these before handoff to
> the OS. During OS boot we also scan all MC banks and log/clear them.
>
> If you aren't observing them during normal operation you can safely ignore
> these preboot logs, or pass them along to your OEM.

Thank you very much for your help.

After wasting my time with Dell support over Twitter [1], where they
basically make you jump through hoops and then claim it’s an mcelog issue
(apparently they only execute `sudo mcelog`), I updated to the latest
firmware, 1.3.2, released yesterday [2].

With the new firmware version, it looks like the firmware has been fixed,
and Linux no longer reports any MCEs.

It’d be great if other Dell XPS13 9360 users could verify that.


Kind regards,

Paul


[1] https://twitter.com/pmenzel_molgen/status/818808708692115456
[2] XPS_9360_1.3.2.exe
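
PS: For anyone who wants to check the STATUS decoding quoted above against
their own mcelog output, here is a minimal Python sketch. The bit positions
(VAL=63, OVER=62, UC=61, EN=60, MISCV=59, ADDRV=58, PCC=57, MSCOD=31:16,
MCACOD=15:0) are an assumption taken from the IA32_MCi_STATUS layout in the
Intel SDM, not from this thread, and the names `FLAG_BITS` and
`decode_mci_status` are made up for illustration.

```python
# Sketch: split an IA32_MCi_STATUS value into the fields discussed above.
# Bit positions are assumed from the Intel SDM register layout.

# STATUS value from the mcelog output quoted in this thread.
STATUS = 0xEE0000000040110A

# Single-bit flags in the upper part of the register (assumed positions).
FLAG_BITS = {
    "VAL": 63,    # register contents are valid
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # uncorrected error
    "EN": 60,     # error reporting was enabled for this bank
    "MISCV": 59,  # MISC register is valid
    "ADDRV": 58,  # ADDR register is valid
    "PCC": 57,    # processor context corrupt
}


def decode_mci_status(status):
    """Return a dict with the flag bits plus the MSCOD and MCACOD fields."""
    fields = {name: (status >> bit) & 1 for name, bit in FLAG_BITS.items()}
    fields["MSCOD"] = (status >> 16) & 0xFFFF  # model-specific error code
    fields["MCACOD"] = status & 0xFFFF         # architectural error code
    return fields


decoded = decode_mci_status(STATUS)
for name, value in decoded.items():
    if name in ("MSCOD", "MCACOD"):
        print(f"{name}={value:04x}")
    else:
        print(f"{name}={value}")
```

Run on the value above, this should agree with Ashok's reading: EN=0 while
VAL, OVER, UC and PCC are set, with MCACOD 110a and MSCOD 0040.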