nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@alien8.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Jane Chu <jane.chu@oracle.com>,
	"Luck, Tony" <tony.luck@intel.com>,
	Linux NVDIMM <nvdimm@lists.linux.dev>,
	Luis Chamberlain <mcgrof@suse.com>
Subject: Re: [RFT PATCH] x86/pat: Fix set_mce_nospec() for pmem
Date: Fri, 1 Oct 2021 12:50:06 +0200	[thread overview]
Message-ID: <YVbn3ohRhYkTNdEK@zn.tnic> (raw)
In-Reply-To: <CAPcyv4j9KH+Y4hperuCwBMLOSPHKfbbku_T8uFNoqiNYrvfRdA@mail.gmail.com>

On Thu, Sep 30, 2021 at 07:02:13PM -0700, Dan Williams wrote:
> If I understand the proposal correctly Boris is basically saying
> "figure out how to do your special PMEM stuff in the driver directly
> and make it so MCE code has no knowledge of the PMEM case".

I don't mind if it has to because of a good reason X - I'm just trying
to simplify things and not do stuff which is practically unnecessary.

> The flow
> is:
> 
> memory_failure(pfn, flags)
> nfit_handle_mce(...) <--- right now this on mce notifier chain
> set_mce_nospec(pfn) <--- drop the "whole page" concept
> 
> This poses a problem because not all memory_failure() paths trigger
> set_mce_nospec() or the mce notifier chain.

Hmm, so this sounds like more is needed than this small patch.

> If that disconnect happens, attempts to read PMEM pages that have been
> signalled to memory_failure() will now crash in the driver without
> workarounds for NP pages.
>
> So memory_failure() needs to ensure that it communicates with the
> driver before any possible NP page attribute changes. I.e. the driver
> needs to know that regardless of how many cachelines are poisoned the
> entire page is always unmapped in the direct map.

Ok, so those errors do get reported through MCE so MCE code does need to
know about them.

So the question is, how should MCE mark them so that the driver can deal
with them properly.

> Then, when the driver is called with the new RWF_RECOVER_DATA flag, it
> can set up a new UC alias mapping for the pfn and access the good data
> in the page while being careful to read around the poisoned cache
> lines.

See my other mail - that sounds good.

> In my mind this moves the RWF_RECOVER_DATA flag proposal from "nice to
> have" to "critical for properly coordinating with memory_failure() and
> mce expectations"

Right, so to reiterate: I don't mind if the MCE code knows about this -
in the end of the day, it is that code which does the error reporting.
My goal is to make that reporting and marking optimal so that the driver
can work with that marking and simplify and document *why* we're doing
that so that future MCE changes can keep that in mind.

So to sum up: it sounds to me like it should simply mark *whole* pages
as NC:

1. this should prevent the spec. access for DRAM.
2. PMEM can do UC alias mapping and do sub-page access to salvage data.

How's that?

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

  reply	other threads:[~2021-10-01 10:50 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-07  1:01 [RFT PATCH] x86/pat: Fix set_mce_nospec() for pmem Dan Williams
2021-08-26 19:08 ` Dan Williams
2021-08-27  7:12   ` Jane Chu
2021-09-13 10:29 ` Borislav Petkov
2021-09-14 18:08   ` Dan Williams
2021-09-15 10:41     ` Borislav Petkov
2021-09-16 20:33       ` Dan Williams
2021-09-17 11:30         ` Borislav Petkov
2021-09-21  2:04           ` Dan Williams
2021-09-30 17:19             ` Borislav Petkov
2021-09-30 17:28               ` Luck, Tony
2021-09-30 19:30                 ` Borislav Petkov
2021-09-30 19:41                   ` Dan Williams
2021-09-30 19:44                   ` Luck, Tony
2021-09-30 20:01                     ` Borislav Petkov
2021-09-30 20:15                       ` Luck, Tony
2021-09-30 20:32                         ` Borislav Petkov
2021-09-30 20:39                           ` Dan Williams
2021-09-30 20:54                             ` Borislav Petkov
2021-09-30 21:05                               ` Dan Williams
2021-09-30 21:20                                 ` Borislav Petkov
2021-09-30 21:41                                   ` Dan Williams
2021-09-30 22:35                                     ` Borislav Petkov
2021-09-30 22:44                                       ` Dan Williams
2021-10-01 10:41                                         ` Borislav Petkov
2021-10-01  0:43                                       ` Jane Chu
2021-10-01  2:02                                         ` Dan Williams
2021-10-01 10:50                                           ` Borislav Petkov [this message]
2021-10-01 16:52                                             ` Dan Williams
2021-10-01 18:11                                               ` Borislav Petkov
2021-10-01 18:29                                                 ` Dan Williams
2021-10-02 10:17                                                   ` Borislav Petkov
2021-11-11  0:06                                                     ` Jane Chu
2021-11-12  0:30                                                       ` Jane Chu
2021-11-12  0:51                                                         ` Dan Williams
2021-11-12 17:57                                                           ` Jane Chu
2021-11-12 19:24                                                             ` Dan Williams
2021-11-12 22:35                                                               ` Jane Chu
2021-11-12 22:50                                                                 ` Jane Chu
2021-11-12 23:08                                                                 ` Dan Williams
2021-11-13  5:50                                                                   ` Jane Chu
2021-11-13 20:47                                                                     ` Dan Williams
2021-11-18 19:03                                                                       ` Jane Chu
2021-11-25  0:16                                                                         ` Dan Williams
2021-11-30 23:00                                                                           ` Jane Chu
2021-09-30 18:15         ` Jane Chu
2021-09-30 19:11           ` Dan Williams
2021-09-30 21:23             ` Jane Chu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YVbn3ohRhYkTNdEK@zn.tnic \
    --to=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=jane.chu@oracle.com \
    --cc=mcgrof@suse.com \
    --cc=nvdimm@lists.linux.dev \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).