nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@alien8.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Linux NVDIMM <nvdimm@lists.linux.dev>,
	Jane Chu <jane.chu@oracle.com>,
	Luis Chamberlain <mcgrof@suse.com>,
	Tony Luck <tony.luck@intel.com>
Subject: Re: [RFT PATCH] x86/pat: Fix set_mce_nospec() for pmem
Date: Fri, 17 Sep 2021 13:30:13 +0200	[thread overview]
Message-ID: <YUR8RTx9blI2ezvQ@zn.tnic> (raw)
In-Reply-To: <CAPcyv4hABimEQ3z7HNjaQBWNtq7yhEXe=nbRx-N_xCuTZ1D_-g@mail.gmail.com>

On Thu, Sep 16, 2021 at 01:33:42PM -0700, Dan Williams wrote:
> I am specifically talking about the memory_failure_dev_pagemap() path
> taken from memory_failure().
> 
> > But then I have no clue what the "DAX-memory_failure()" path is.
> 
> Sorry, I should not have lazily used {PMEM,DAX} memory_failure() to
> refer to the exact routine in question: memory_failure_dev_pagemap().
> That path is taken when memory_failure() sees that a pfn was mapped by
> memremap_pages(). It determines that the pfn does not represent a
> dynamic page allocator page and may refer to a static page of storage
> from a device like /dev/pmem, or /dev/dax.

Aaaha, that's that innocent-looking pfn_to_online_page() call there in
memory_failure() which would return NULL for pmem/dax pages.

> The zeroing of the poison comes optionally later if userspace
> explicitly asks for it to be cleared.

I see.

> Yes. The device driver avoids reconsumption of the same error by
> recording the error in a "badblocks" structure (block/badblocks.c).
> The driver consults its badblocks instance on every subsequent access
> and preemptively returns EIO. The driver registers for machine-check
> notifications and translates those events into badblocks entries. So,
> repeat consumption is avoided unless/until the babblocks entry can be
> cleared along with the poison itself.

Sounds good.

> I'll note that going forward the filesystem wants to be more involved
> in tracking and managing these errors so the driver will grow the
> ability to forward those notifications up the stack.

... which would save the ask-the-driver each time thing. Yap.

> It does. In addition to machine-check synchronous notification of
> poison, there are asynchronous scanning mechanisms that prepopulate
> the badblocks entries to get ahead of consumption events.

Probably something similar to patrol scrubbing on normal DRAM.

> ... Because the entire page is dead there is no need for UC because all
> the cachelines are gone, nothing to read-around in this page. In short
> DRAM and PMEM want to share the exact same policy here and use NP for
> whole_page() and UC for not. Just the small matter of ignoring the
> memtype by using _set_memory_uc().

Hmm, so looking at this more:

        else
                rc = set_memory_uc(decoy_addr, 1);

->  memtype_reserve(__pa(addr), __pa(addr) + numpages * PAGE_SIZE,

->  memtype_reserve(__pa(addr), __pa(addr) + 1,

Looking at

  17fae1294ad9 ("x86/{mce,mm}: Unmap the entire page if the whole page is affected and poisoned")

it says

    Fix is to check the scope of the poison by checking the MCi_MISC register.
    If the entire page is affected, then unmap the page. If only part of the
    page is affected, then mark the page as uncacheable.

so I guess for normal DRAM we still want to map the page uncacheable so
that other parts except that cacheline can still be read.

And PMEM should be excluded from that treatise here because it needs to
be WB, as you said earlier.

Right?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

  reply	other threads:[~2021-09-17 11:30 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-07  1:01 [RFT PATCH] x86/pat: Fix set_mce_nospec() for pmem Dan Williams
2021-08-26 19:08 ` Dan Williams
2021-08-27  7:12   ` Jane Chu
2021-09-13 10:29 ` Borislav Petkov
2021-09-14 18:08   ` Dan Williams
2021-09-15 10:41     ` Borislav Petkov
2021-09-16 20:33       ` Dan Williams
2021-09-17 11:30         ` Borislav Petkov [this message]
2021-09-21  2:04           ` Dan Williams
2021-09-30 17:19             ` Borislav Petkov
2021-09-30 17:28               ` Luck, Tony
2021-09-30 19:30                 ` Borislav Petkov
2021-09-30 19:41                   ` Dan Williams
2021-09-30 19:44                   ` Luck, Tony
2021-09-30 20:01                     ` Borislav Petkov
2021-09-30 20:15                       ` Luck, Tony
2021-09-30 20:32                         ` Borislav Petkov
2021-09-30 20:39                           ` Dan Williams
2021-09-30 20:54                             ` Borislav Petkov
2021-09-30 21:05                               ` Dan Williams
2021-09-30 21:20                                 ` Borislav Petkov
2021-09-30 21:41                                   ` Dan Williams
2021-09-30 22:35                                     ` Borislav Petkov
2021-09-30 22:44                                       ` Dan Williams
2021-10-01 10:41                                         ` Borislav Petkov
2021-10-01  0:43                                       ` Jane Chu
2021-10-01  2:02                                         ` Dan Williams
2021-10-01 10:50                                           ` Borislav Petkov
2021-10-01 16:52                                             ` Dan Williams
2021-10-01 18:11                                               ` Borislav Petkov
2021-10-01 18:29                                                 ` Dan Williams
2021-10-02 10:17                                                   ` Borislav Petkov
2021-11-11  0:06                                                     ` Jane Chu
2021-11-12  0:30                                                       ` Jane Chu
2021-11-12  0:51                                                         ` Dan Williams
2021-11-12 17:57                                                           ` Jane Chu
2021-11-12 19:24                                                             ` Dan Williams
2021-11-12 22:35                                                               ` Jane Chu
2021-11-12 22:50                                                                 ` Jane Chu
2021-11-12 23:08                                                                 ` Dan Williams
2021-11-13  5:50                                                                   ` Jane Chu
2021-11-13 20:47                                                                     ` Dan Williams
2021-11-18 19:03                                                                       ` Jane Chu
2021-11-25  0:16                                                                         ` Dan Williams
2021-11-30 23:00                                                                           ` Jane Chu
2021-09-30 18:15         ` Jane Chu
2021-09-30 19:11           ` Dan Williams
2021-09-30 21:23             ` Jane Chu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YUR8RTx9blI2ezvQ@zn.tnic \
    --to=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=jane.chu@oracle.com \
    --cc=mcgrof@suse.com \
    --cc=nvdimm@lists.linux.dev \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).