nvdimm.lists.linux.dev archive mirror
From: Dan Williams <dan.j.williams@intel.com>
To: Jane Chu <jane.chu@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>,
	"Luck, Tony" <tony.luck@intel.com>,
	 Linux NVDIMM <nvdimm@lists.linux.dev>,
	Luis Chamberlain <mcgrof@suse.com>
Subject: Re: [RFT PATCH] x86/pat: Fix set_mce_nospec() for pmem
Date: Thu, 30 Sep 2021 19:02:13 -0700
Message-ID: <CAPcyv4j9KH+Y4hperuCwBMLOSPHKfbbku_T8uFNoqiNYrvfRdA@mail.gmail.com>
In-Reply-To: <ba3b12bf-c71e-7422-e205-258e96f29be5@oracle.com>

On Thu, Sep 30, 2021 at 5:43 PM Jane Chu <jane.chu@oracle.com> wrote:
>
>
> On 9/30/2021 3:35 PM, Borislav Petkov wrote:
> > On Thu, Sep 30, 2021 at 02:41:52PM -0700, Dan Williams wrote:
> >> I fail to see the point of that extra plumbing when MCi_MISC
> >> indicating "whole_page" or not is sufficient. What am I missing?
> >
> > I think you're looking at it from the wrong side... (or it is too late
> > here, but we'll see). Forget how a memory type can be mapped but think
> > about what the recovery action looks like.
> >
> > - DRAM: when a DRAM page is poisoned, it is only poisoned as a whole
> > page by memory_failure(). whole_page is always true here, no matter what
> > the hardware says because we don't and cannot do any sub-page recovery
> > actions. So it doesn't matter how we map it, UC, NP... I suggested NP
> > because the page is practically not present if you want to access it
> > because mm won't allow it...
> >
> > - PMEM: reportedly, we can do sub-page recovery here so PMEM should be
> > mapped in the way it is better for the recovery action to work.
> >
> > In both cases, the recovery action should control how the memory type is
> > mapped.
> >
> > Now, you say we cannot know the memory type when the error gets
> > reported.
> >
> > And I say: for simplicity's sake, we simply go and work with whole
> > pages. Always. That is the case anyway for DRAM.
>
> Sorry, please correct me if I misunderstand. The DRAM poison handling
> at page frame granularity is a helpless compromise due to lack of
> guarantee to decipher the precise error blast radius given all
> types of DRAM and architectures, right?  But that's not true for
> the PMEM case. So why should PMEM poison handling follow the lead
> of DRAM?

If I understand the proposal correctly, Boris is basically saying
"figure out how to do your special PMEM stuff in the driver directly
and make it so MCE code has no knowledge of the PMEM case". The flow
is:

memory_failure(pfn, flags)
nfit_handle_mce(...) <--- right now this is on the mce notifier chain
set_mce_nospec(pfn) <--- drop the "whole page" concept

This poses a problem because not all memory_failure() paths trigger
set_mce_nospec() or the mce notifier chain. If that disconnect
happens, attempts to read PMEM pages that have been signalled to
memory_failure() will now crash in the driver without workarounds for
NP pages.

So memory_failure() needs to ensure that it communicates with the
driver before any possible NP page attribute changes. I.e. the driver
needs to know that regardless of how many cachelines are poisoned the
entire page is always unmapped in the direct map.
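
To make the ordering requirement concrete, here is a minimal userspace
sketch, not kernel code. Every name in it (pmem_model, page_model,
model_memory_failure(), model_driver_knows()) is hypothetical and only
stands in for the real memory_failure() / driver interaction. The one
point it illustrates is that the driver is told about the pfn before
the page is marked not-present, and that the unmap is refused if the
driver could not be told.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 16

/* Hypothetical driver-side state: pfns the driver knows are unmapped. */
struct pmem_model {
	unsigned long unmapped_pfns[MAX_TRACKED];
	size_t count;
};

/* Hypothetical direct-map state for a single page. */
struct page_model {
	unsigned long pfn;
	bool present;	/* false once the page is marked NP */
};

/* Step 1: tell the driver the whole page is going away. */
static bool model_notify_driver(struct pmem_model *drv, unsigned long pfn)
{
	if (drv->count >= MAX_TRACKED)
		return false;
	drv->unmapped_pfns[drv->count++] = pfn;
	return true;
}

/* Step 2: only then clear the present bit (stand-in for the NP change). */
bool model_memory_failure(struct pmem_model *drv, struct page_model *pg)
{
	if (!model_notify_driver(drv, pg->pfn))
		return false;	/* never unmap behind the driver's back */
	pg->present = false;
	return true;
}

/* Driver-side check before touching a pfn through the direct map. */
bool model_driver_knows(const struct pmem_model *drv, unsigned long pfn)
{
	for (size_t i = 0; i < drv->count; i++)
		if (drv->unmapped_pfns[i] == pfn)
			return true;
	return false;
}
```

With this shape the driver can consult model_driver_knows() before a
read and take the recovery path instead of faulting on an NP page.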

Then, when the driver is called with the new RWF_RECOVER_DATA flag, it
can set up a new UC alias mapping for the pfn and access the good data
in the page while being careful to read around the poisoned cache
lines.
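
As a rough illustration of "reading around" poison, here is a small
userspace model, again not kernel code. It assumes a 4096-byte page
made of 64-byte cache lines and a hypothetical per-page bitmap
(struct poison_map) that says which lines are bad. recover_page() is
an invented name. It copies only the good lines, never reads a
poisoned line from the source, and zero-fills the gaps in the
destination.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE	4096u
#define CACHELINE_SIZE	64u
#define LINES_PER_PAGE	(MODEL_PAGE_SIZE / CACHELINE_SIZE)

/* Hypothetical: one bit per cache line, set if that line is poisoned. */
struct poison_map {
	uint64_t bits;	/* 64 lines per 4096-byte page */
};

static int line_is_poisoned(const struct poison_map *pm, unsigned line)
{
	return (pm->bits >> line) & 1u;
}

/*
 * Copy the good cache lines from src to dst and zero-fill the slots of
 * poisoned ones without reading them. Returns the bytes recovered.
 */
size_t recover_page(unsigned char *dst, const unsigned char *src,
		    const struct poison_map *pm)
{
	size_t recovered = 0;

	for (unsigned line = 0; line < LINES_PER_PAGE; line++) {
		size_t off = (size_t)line * CACHELINE_SIZE;

		if (line_is_poisoned(pm, line)) {
			memset(dst + off, 0, CACHELINE_SIZE);
			continue;	/* never touch the poisoned source line */
		}
		memcpy(dst + off, src + off, CACHELINE_SIZE);
		recovered += CACHELINE_SIZE;
	}
	return recovered;
}
```

In the real driver the source would be the UC alias of the pfn and the
poison information would come from the driver's own tracking of bad
ranges. The bitmap here is only a stand-in for that.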

In my mind this moves the RWF_RECOVER_DATA flag proposal from "nice to
have" to "critical for properly coordinating with memory_failure() and
mce expectations".
