From: Borislav Petkov <bp@alien8.de>
To: Paul Menzel <pmenzel@molgen.mpg.de>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Thorsten Leemhuis <linux@leemhuis.info>,
	Len Brown <len.brown@intel.com>, Tony Luck <tony.luck@intel.com>,
	"Raj, Ashok" <ashok.raj@intel.com>
Subject: Re: Dell XPS13: MCE (Hardware Error) reported
Date: Wed, 4 Jan 2017 23:55:46 +0100
Message-ID: <20170104225546.wy36fu5t2jbow2dq@pd.tnic>
In-Reply-To: <f6d1d38d-ed57-8953-501b-c76a80a2f452@molgen.mpg.de>

Lemme add some more folks to CC.

On Wed, Jan 04, 2017 at 04:42:18PM +0100, Paul Menzel wrote:
> Dear Linux folks,
> 
> 
> The logs contain the following messages.
> 
> From Linux 4.10-rc2+ (0f64df301240 Merge branch 'parisc-4.10-2' of
> git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux):
> 
> > Jan 04 16:17:51 xps13 kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 6: ee0000000040110a
> > Jan 04 16:17:51 xps13 kernel: mce: [Hardware Error]: TSC 0 ADDR fef1ff40 MISC 47880018086
> > Jan 04 16:17:51 xps13 kernel: mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1483543069 SOCKET 0 APIC 0 microcode 0
> > Jan 04 16:17:51 xps13 kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 7: ee0000000040110a
> > Jan 04 16:17:51 xps13 kernel: mce: [Hardware Error]: TSC 0 ADDR fef1ce40 MISC 7880018086
> > Jan 04 16:17:51 xps13 kernel: mce: [Hardware Error]: PROCESSOR 0:806e9 TIME 1483543069 SOCKET 0 APIC 0 microcode 0
> 
> I am able to reproduce this also with Linux 4.8.11 from Debian Sid/unstable.
> 
> Installing *mcelog* 144+dfsg-1, the file below is created.
> 
> ```
> $ more /var/log/mcelog
> Hardware event. This is not a software error.
> MCE 0
> CPU 0 BANK 6
> MISC 47880018086 ADDR fef1ff40
> TIME 1483543069 Wed Jan  4 16:17:49 2017
> MCG status:
> MCi status:
> Error overflow
> Uncorrected error
> MCi_MISC register valid
> MCi_ADDR register valid
> Processor context corrupt
> MCA: corrected filtering (some unreported errors in same region)
> Generic CACHE Level-2 Generic Error
> STATUS ee0000000040110a MCGSTATUS 0
> MCGCAP c08 APICID 0 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 142
> Hardware event. This is not a software error.
> MCE 1
> CPU 0 BANK 7
> MISC 7880018086 ADDR fef1ce40
> TIME 1483543069 Wed Jan  4 16:17:49 2017
> MCG status:
> MCi status:
> Error overflow
> Uncorrected error
> MCi_MISC register valid
> MCi_ADDR register valid
> Processor context corrupt
> MCA: corrected filtering (some unreported errors in same region)
> Generic CACHE Level-2 Generic Error
> STATUS ee0000000040110a MCGSTATUS 0
> MCGCAP c08 APICID 0 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 142
> Hardware event. This is not a software error.
> MCE 0
> CPU 0 BANK 6
> MISC 47880018086 ADDR fef1ff40
> TIME 1483543581 Wed Jan  4 16:26:21 2017
> MCG status:
> MCi status:
> Error overflow
> Uncorrected error
> MCi_MISC register valid
> MCi_ADDR register valid
> Processor context corrupt
> MCA: corrected filtering (some unreported errors in same region)
> Generic CACHE Level-2 Generic Error
> STATUS ee0000000040110a MCGSTATUS 0
> MCGCAP c08 APICID 0 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 142
> Hardware event. This is not a software error.
> MCE 1
> CPU 0 BANK 7
> MISC 7880018086 ADDR fef1ce40
> TIME 1483543581 Wed Jan  4 16:26:21 2017
> MCG status:
> MCi status:
> Error overflow
> Uncorrected error
> MCi_MISC register valid
> MCi_ADDR register valid
> Processor context corrupt
> MCA: corrected filtering (some unreported errors in same region)
> Generic CACHE Level-2 Generic Error
> STATUS ee0000000040110a MCGSTATUS 0
> MCGCAP c08 APICID 0 SOCKETID 0
> CPUID Vendor Intel Family 6 Model 142
> ```
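
For anyone reading along: the flags mcelog prints above follow directly from the STATUS value. Below is a minimal Python sketch, assuming the architectural IA32_MCi_STATUS bit layout and the cache-hierarchy compound error code format (000F 0001 RRRR TTLL) described in the Intel SDM. The function and table names are made up for illustration, and this is not how mcelog itself is implemented.

```python
# Architectural flag bits of IA32_MCi_STATUS (bit number, short name).
FLAGS = [
    (63, "VAL"),    # register contents valid
    (62, "OVER"),   # error overflow
    (61, "UC"),     # uncorrected error
    (60, "EN"),     # error reporting enabled
    (59, "MISCV"),  # MCi_MISC register valid
    (58, "ADDRV"),  # MCi_ADDR register valid
    (57, "PCC"),    # processor context corrupt
]

# Sub-fields of the cache-hierarchy compound error code.
LEVEL = {0: "L0", 1: "L1", 2: "L2", 3: "generic"}
TRANSACTION = {0: "instruction", 1: "data", 2: "generic"}
REQUEST = {
    0: "generic error", 1: "generic read", 2: "generic write",
    3: "data read", 4: "data write", 5: "instruction fetch",
    6: "prefetch", 7: "eviction", 8: "snoop",
}


def decode_status(status):
    """Decode the flag bits and, if applicable, the cache error code."""
    mca_code = status & 0xFFFF  # bits 15:0, MCA error code
    decoded = {
        "flags": [name for bit, name in FLAGS if (status >> bit) & 1],
        "mca_code": mca_code,
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "cache_error": None,
    }
    # Cache-hierarchy errors match 000F 0001 RRRR TTLL; bit 12 (F) may be
    # either value, so it is excluded from the mask.
    if (mca_code & 0xEF00) == 0x0100:
        decoded["cache_error"] = {
            "corrected_filtering": bool(mca_code & 0x1000),
            "request": REQUEST.get((mca_code >> 4) & 0xF, "reserved"),
            "transaction": TRANSACTION.get((mca_code >> 2) & 0x3, "reserved"),
            "level": LEVEL[mca_code & 0x3],
        }
    return decoded


if __name__ == "__main__":
    # STATUS value reported for both bank 6 and bank 7 in the log above.
    print(decode_status(0xEE0000000040110A))
```

For the value in the log, this gives the same flags as mcelog (OVER, UC, MISCV, ADDRV, PCC) and a generic Level-2 cache error with the corrected-filtering bit set. Note that bit 60 (EN) is clear in this value.
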
> 
> It looks like it’s a common problem on this machine [1].
> 
> > First, I fear that I cannot really give good answers to your questions. I also own a Dell XPS 13 (9360) and see the same MCE messages. I'm in contact with Dell Support because of these. They replaced the mainboard but it did not help. Same messages in the logs. At some point they concluded that it is probably a false positive. They had no idea what is causing it, though (mcelog/kernel/Intel problem?). The correspondence with Support is still ongoing.
> > 
> > <rant> Btw, talking to Dell Support is a very unpleasant experience. They seem to suggest only the "standard" solutions, like resetting the firmware, running self-health tests and so on. I didn't have the impression that I was talking to someone with technical insight. </rant>
> > 
> > To add more details, I see the same issue on Fedora 24 so it seems not to be related to Ubuntu.
> > 
> > Regarding your questions:
> > 
> >     What do these errors mean and should I worry about them?
> > 
> > I don't know. Dell Support thinks those are false positives.
> > 
> >     Could these hardware errors be the cause of the freezes of the entire system?
> > 
> > Besides the messages my system works fine. I'd guess the freeze is a different issue.
> > 
> >     Should I have the laptop (or parts) replaced by the manufacturer?
> > 
> > Replacing the mainboard did not fix the MCE issue. It might solve the freezing issue, although it seems that this was fixed by a kernel update.
> > 
> >     Are there any other actions I should take?
> > 
> > If you are not already in contact with Support, contact them. Maybe they will come up with a real solution once they see that it affects more customers.
> 
> Could you please tell me whether and where I should open an issue in the
> Linux bug tracker [2]?
> 
> Any ideas are welcome.
> 
> 
> Kind regards,
> 
> Paul
> 
> 
> [1] https://unix.stackexchange.com/questions/324237/understanding-machine-check-exceptions-mce/330283
> [2] https://bugzilla.kernel.org/
> 

-- 
Regards/Gruss,
    Boris.
