linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Luck, Tony" <tony.luck@intel.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
	Borislav Petkov <bp@alien8.de>,
	linux-edac@vger.kernel.org, X86 ML <x86@kernel.org>,
	Christoph Hellwig <hch@lst.de>, Linux MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Jan Kara <jack@suse.cz>
Subject: Re: [PATCH v2 07/11] x86, memory_failure: Introduce {set, clear}_mce_nospec()
Date: Mon, 4 Jun 2018 11:08:45 -0700	[thread overview]
Message-ID: <20180604180845.GA17942@agluck-desk> (raw)
In-Reply-To: <CAPcyv4g76nv=D2NXrpFLS-aWQR=N0T2n01WhrVY2_KPavkvLCA@mail.gmail.com>

On Mon, Jun 04, 2018 at 10:39:48AM -0700, Dan Williams wrote:
> On Mon, Jun 4, 2018 at 10:08 AM, Luck, Tony <tony.luck@intel.com> wrote:
> > On Sat, Jun 02, 2018 at 10:23:20PM -0700, Dan Williams wrote:
> >> +static inline int set_mce_nospec(unsigned long pfn)
> >> +{
> >> +     int rc;
> >> +
> >> +     rc = set_memory_uc((unsigned long) __va(PFN_PHYS(pfn)), 1);
> >
> > You should really do the decoy_addr thing here that I had in mce_unmap_kpfn().
> > Putting the virtual address of the page you mustn't accidentally prefetch
> > from into a register is a pretty good way to make sure that the processor
> > does do a prefetch.
> 
> Maybe I'm misreading, but doesn't that make the page completely
> inaccessible? We still want to read pmem through the driver and the
> linear mapping with memcpy_mcsafe(). Alternatively I could just drop
> this patch and setup a private / alias mapping for the pmem driver to
> use. It seems aliased mappings would be the safer option, but I want
> to make sure I've comprehended your suggestion correctly?

I'm OK with the call to set_memory_uc() to make this uncacheable
instead of set_memory_np() to make it inaccessible.

The problem is how to achieve that.

The result of __va(PFN_PHYS(pfn) is the virtual address where the poison
page is currently mapped into the kernel. That value gets put into
register %rdi to make the call to set_memory_uc() (which goes on to
call a bunch of other functions passing the virtual address along
the way).

Now imagine an impatient super-speculative processor is waiting for
some result to decide where to jump next, and picks a path that isn't
going to be taken ... out in the weeds somewhere it runs into:

	movzbl	(%rdi), %eax

Oops ... now you just read from the address you were trying to
avoid. So we log an error. Eventually the speculation gets sorted
out and the processor knows not to signal a machine check. But the
log is sitting in a machine check bank waiting to cause an overflow
if we try to log a second error.

The decoy_addr trick in mce_unmap_kpfn() throws in the high bit
to the address passed.  The set_memory_np() code (and I assume the
set_memory_uc()) code ignores it, but it means any stray speculative
access won't point at the poison page.

-Tony

Note: this is *mostly* a problem if the poison is in the first
cache line of the page.  But you could hit other lines if the
instruction you speculatively ran into had the right offset. E.g.
to hit the third line:

	movzbl	128(%rdi), %eax

  reply	other threads:[~2018-06-04 18:08 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-03  5:22 [PATCH v2 00/11] mm: Teach memory_failure() about ZONE_DEVICE pages Dan Williams
2018-06-03  5:22 ` [PATCH v2 01/11] device-dax: Convert to vmf_insert_mixed and vm_fault_t Dan Williams
2018-06-03  5:22 ` [PATCH v2 02/11] device-dax: Cleanup vm_fault de-reference chains Dan Williams
2018-06-03  5:22 ` [PATCH v2 03/11] device-dax: Enable page_mapping() Dan Williams
2018-06-03  5:23 ` [PATCH v2 04/11] device-dax: Set page->index Dan Williams
2018-06-03  5:23 ` [PATCH v2 05/11] filesystem-dax: " Dan Williams
2018-06-03  5:23 ` [PATCH v2 06/11] mm, madvise_inject_error: Let memory_failure() optionally take a page reference Dan Williams
2018-06-03  5:23 ` [PATCH v2 07/11] x86, memory_failure: Introduce {set, clear}_mce_nospec() Dan Williams
2018-06-04 17:08   ` Luck, Tony
2018-06-04 17:39     ` Dan Williams
2018-06-04 18:08       ` Luck, Tony [this message]
2018-06-04 18:35         ` Dan Williams
2018-06-03  5:23 ` [PATCH v2 08/11] mm, memory_failure: Pass page size to kill_proc() Dan Williams
2018-06-03  5:23 ` [PATCH v2 09/11] mm, memory_failure: Fix page->mapping assumptions relative to the page lock Dan Williams
2018-06-03  5:23 ` [PATCH v2 10/11] mm, memory_failure: Teach memory_failure() about dev_pagemap pages Dan Williams
2018-06-03  5:23 ` [PATCH v2 11/11] libnvdimm, pmem: Restore page attributes when clearing errors Dan Williams
2018-06-04 12:40 ` [PATCH v2 00/11] mm: Teach memory_failure() about ZONE_DEVICE pages Michal Hocko
2018-06-04 14:31   ` Dan Williams
2018-06-05 14:11     ` Michal Hocko
2018-06-05 14:33       ` Dan Williams
2018-06-06  7:39         ` Michal Hocko
2018-06-06 13:44           ` Dan Williams
2018-06-07 14:37             ` Michal Hocko
2018-06-07 16:52               ` Dan Williams
2018-06-11  7:50                 ` Michal Hocko
2018-06-11 14:44                   ` Dan Williams
2018-06-11 14:56                     ` Michal Hocko
2018-06-11 15:19                       ` Dan Williams
2018-06-11 17:35                         ` Andi Kleen
2018-06-12  1:50                         ` Naoya Horiguchi
2018-06-12  1:58                           ` Dan Williams
2018-06-12  4:04                           ` Jane Chu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180604180845.GA17942@agluck-desk \
    --to=tony.luck@intel.com \
    --cc=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=hch@lst.de \
    --cc=hpa@zytor.com \
    --cc=jack@suse.cz \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).