From: "Luck, Tony" <tony.luck@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@suse.de>,
	"Hansen, Dave" <dave.hansen@intel.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	"Elliott, Robert (Persistent Memory)" <elliott@hpe.com>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: RE: [PATCH-resend] mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages
Date: Thu, 17 Aug 2017 23:32:16 +0000
Message-ID: <3908561D78D1C84285E8C5FCA982C28F61342363@ORSMSX114.amr.corp.intel.com>
In-Reply-To: <20170817150942.017f87537b6cbb48e9cfc082@linux-foundation.org>

> It's unclear (to lil ole me) what the end-user-visible effects of this
> are.
>
> Could we please have a description of that?  So a) people can
> understand your decision to cc:stable and b) people whose kernels are
> misbehaving can use your description to decide whether your patch might
> fix the issue their users are reporting.

Ingo already applied this to the tip tree, so too late to fix the commit message :-(

A very, very unlucky end user with a system that supports machine check recovery
(Xeon E7, or Xeon-SP-platinum) that has recovered from one or more uncorrected
memory errors (lucky so far) might find a subsequent uncorrected memory error flagged
as fatal because the machine check bank that should log the error is already occupied
by a log caused by a speculative access to one of the earlier uncorrected errors (the
unlucky part).

We haven't seen this happen at the Linux OS level, but it is a theoretical possibility.
[Some BIOSes that map physical memory 1:1 have seen this when doing eMCA processing
for the first error ... as soon as they load the address of the error from the MCi_ADDR
register they are vulnerable to some speculative access dereferencing the register that
holds the address and setting the overflow bit in the machine check bank that still holds
the original log].
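
To make the failure mode concrete, here is a rough sketch of the bank behaviour
described above. It is a toy model, not kernel code: struct toy_bank and the
toy_* helpers are made-up names for illustration, and the overwrite rule is
simplified to the uncorrected-after-uncorrected case. Only the VAL/OVER/UC bit
positions are taken from the architectural MCi_STATUS layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural bit positions in IA32_MCi_STATUS. */
#define MCI_STATUS_VAL  (1ULL << 63)  /* bank holds a valid log */
#define MCI_STATUS_OVER (1ULL << 62)  /* another error arrived while VAL was set */
#define MCI_STATUS_UC   (1ULL << 61)  /* logged error was uncorrected */

/* Hypothetical toy model of one machine check bank (not a kernel structure). */
struct toy_bank {
	uint64_t status;  /* stands in for MCi_STATUS */
	uint64_t addr;    /* stands in for MCi_ADDR */
};

/*
 * Model the hardware logging an uncorrected error at 'addr'.
 * Simplifying assumption: if the bank already holds a valid log, the first
 * log is kept, the new address is lost, and only the overflow bit is set.
 */
static inline void toy_log_uc(struct toy_bank *b, uint64_t addr)
{
	assert(b != NULL);
	if (b->status & MCI_STATUS_VAL) {
		b->status |= MCI_STATUS_OVER;
		return;
	}
	b->status = MCI_STATUS_VAL | MCI_STATUS_UC;
	b->addr = addr;
}

/* Model software clearing the bank once it has finished with the log. */
static inline void toy_clear(struct toy_bank *b)
{
	assert(b != NULL);
	b->status = 0;
	b->addr = 0;
}

/* Non-zero if a later error landed on top of a log that was still valid. */
static inline int toy_overflowed(const struct toy_bank *b)
{
	assert(b != NULL);
	return (b->status & MCI_STATUS_OVER) != 0;
}
```

In this model, a speculative touch of an already-poisoned page is one
toy_log_uc() call, and a genuine new error before anyone calls toy_clear() is a
second one: the second call leaves toy_overflowed() true and the original
address in place, which is the situation that gets treated as fatal. With the
1:1 mapping of the poison page marked not present, the first of those two calls
should not happen.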

-Tony
