From: "Luck, Tony" <tony.luck@intel.com>
To: "Verma, Vishal L" <vishal.l.verma@intel.com>,
	"Williams, Dan J" <dan.j.williams@intel.com>
Cc: "linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>
Subject: RE: [PATCH] acpi, nfit: fix the memory error check in nfit_handle_mce
Date: Fri, 21 Apr 2017 20:16:00 +0000
Message-ID: <3908561D78D1C84285E8C5FCA982C28F6127653B@ORSMSX114.amr.corp.intel.com>
In-Reply-To: <1492804517.2738.25.camel@intel.com>

>> > +       if (!(mce->status & 0xef80) == BIT(7))
>> 
>> Can we get a define for this, or a comment explaining all the magic
>> that's happening on that one line?
>
> Yes - also like lkp pointed out, the check isn't correct at all. Let me
> figure out what really needs to be done, and I will resend with a better
> comment. 

Needs extra parentheses to make it right. Vishal, sorry I led you astray.

	if (!((mce->status & 0xef80) == BIT(7)))
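
To make the difference between the two forms concrete, here is a minimal
userspace sketch. The BIT() macro is a stand-in for the kernel's, and the
function names skip_as_posted() and skip_as_fixed() are made up for
illustration; they are not from the patch. It assumes the condition is meant
to be true when the event is not a memory error.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's BIT() macro (assumption for this sketch). */
#define BIT(n) (1ULL << (n))

/*
 * The check as originally posted. Unary '!' binds tighter than '==',
 * so this evaluates !(status & 0xef80) first, which is 0 or 1, and then
 * compares that with BIT(7) (0x80). The result is false for every input.
 */
static bool skip_as_posted(uint64_t status)
{
	return !(status & 0xef80) == BIT(7);
}

/*
 * The check with the extra parentheses: the comparison is done first
 * and the result of the comparison is negated.
 */
static bool skip_as_fixed(uint64_t status)
{
	return !((status & 0xef80) == BIT(7));
}
```

With these, skip_as_posted() returns false for any status value, while
skip_as_fixed() returns false only when the masked error code equals BIT(7).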

The magic is shown in table 15-9 of the Intel Software Developer's Manual
(SDM), but perhaps not well explained there.

mce->status in the above code is a value plucked from a machine check
bank status register. See figure 15-6 in the SDM.  The important bits for this
are {15:0} which are the "MCA Error code".  Table 15-9 shows how these
are grouped into types, where the type is defined by the most significant '1'
bit in the field (excluding bit 12 which is the Correction Report Filtering bit,
see section 15.9.2.1).

So if BIT(3) is the most significant bit, then this is a "Generic Cache Hierarchy"
error, BIT(4) denotes a TLB error, BIT(7) a Memory error, and so on.
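
The decoding rule above can be sketched in plain C as follows. This is only
an illustration under the assumptions stated in this mail (Intel CPUs, type
given by the most significant set bit of the MCA error code, bit 12
ignored). The helper names mca_type_bit() and is_memory_error() are
hypothetical and BIT() is a stand-in for the kernel macro.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's BIT() macro (assumption for this sketch). */
#define BIT(n) (1ULL << (n))

/*
 * Return the position of the most significant set bit in the MCA error
 * code (status bits 15:0), ignoring bit 12, or -1 if no bit is set.
 * Hypothetical helper, not an existing kernel function.
 */
static int mca_type_bit(uint64_t status)
{
	uint64_t code = status & 0xffff & ~BIT(12);

	for (int bit = 15; bit >= 0; bit--)
		if (code & BIT(bit))
			return bit;
	return -1;
}

/*
 * 0xef80 selects bits 15:7 except bit 12. If that masked value equals
 * BIT(7), then bit 7 is set and nothing above it is (bit 12 aside),
 * which is the "Memory error" type. Bits 6:0 are not examined.
 */
static bool is_memory_error(uint64_t status)
{
	return (status & 0xef80) == BIT(7);
}
```

For any status value, is_memory_error() is true exactly when mca_type_bit()
returns 7, so the single mask-and-compare does the same job as the loop for
this one type.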

Maybe we should have defines in mce.h for them?  It gets a bit more complicated
as all the above only applies to Intel-branded x86 CPUs ... on AMD, different
decoding rules apply.

-Tony



[-- Attachment #2: compounderrorcodes.png --]
[-- Type: image/png, Size: 23390 bytes --]
