nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Jane Chu <jane.chu@oracle.com>,
	"Luck, Tony" <tony.luck@intel.com>,
	 Linux NVDIMM <nvdimm@lists.linux.dev>,
	Luis Chamberlain <mcgrof@suse.com>
Subject: Re: [RFT PATCH] x86/pat: Fix set_mce_nospec() for pmem
Date: Fri, 1 Oct 2021 11:29:43 -0700	[thread overview]
Message-ID: <CAPcyv4hrXPb1tASBZUg-GgdVs0OOFKXMXLiHmktg_kFi7YBMyQ@mail.gmail.com> (raw)
In-Reply-To: <YVdPWcggek5ykbft@zn.tnic>

On Fri, Oct 1, 2021 at 11:12 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Oct 01, 2021 at 09:52:21AM -0700, Dan Williams wrote:
> > I think that puts us back in the situation that Tony fixed in:
> >
> > 17fae1294ad9 x86/{mce,mm}: Unmap the entire page if the whole page is
> > affected and poisoned
> >
> > ...where the clflush in _set_memory_uc() causes more unwanted virtual
> > #MC injection.
>
> Hmm, lemme read that commit message again: so the guest kernel sees a
> *second* MCE while doing set_memory_uc().
>
> So what prevents the guest kernel from seeing a second MCE when it does
> set_memory_np() instead?
>
> "Further investigation showed that the VMM had passed in another machine
> check because is appeared that the guest was accessing the bad page."
>
> but then the beginning of the commit message says:
>
> "The VMM unmapped the bad page from guest physical space and passed the
> machine check to the guest."
>
> so I'm really confused here what actually happens. Did the VMM manage to
> unmap it or not really? Because if the VMM had unmapped it, then how was
> the guest still accessing the bad page? ... Maybe I'm reading that wrong.

My read is that the guest gets virtual #MC on an access to that page.
When the guest tries to do set_memory_uc() and instructs cpa_flush()
to do clean caches that results in taking another fault / exception
perhaps because the VMM unmapped the page from the guest? If the guest
had flipped the page to NP then cpa_flush() says "oh, no caching
change, skip the clflush() loop".

>
> > I think that means that we have no choice but to mark the page NP
> > unconditionally and do the work to ensure that the driver has updated
> > its poisoned cacheline tracking for data recovery requests.
>
> So a couple of emails earlier I had this:
>
> |Well, the driver has special knowledge so *actually* it could go and use
> |the NP marking as "this page has been poisoned" and mark it NC only for
> |itself, so that it gets the job done. Dunno if that would end up being
> |too ugly to live and turn into a layering violation or so.
>
> So if we have marked the page NP, then nothing would be able to access
> it anymore and it will be marked as hwpoison additionally, which will
> prevent that access too.
>
> Then, the PMEM driver would go and map the page however it wants to, it
> could even remove it from the direct map so that nothing else accesses
> it, even in the kernel, and then do all kinds of recovery.
>
> Hmm?

Yeah, I thought UC would make the PMEM driver's life easier, but if it
has to contend with an NP case at all, might as well make it handle
that case all the time.

Safe to say this patch of mine is woefully insufficient and I need to
go look at how to make the guarantees needed by the PMEM driver so it
can handle NP and set up alias maps.

This was a useful discussion. It proves that my commit:

284ce4011ba6 x86/memory_failure: Introduce {set, clear}_mce_nospec()

...was broken from the outset.

  reply	other threads:[~2021-10-01 18:29 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-07  1:01 [RFT PATCH] x86/pat: Fix set_mce_nospec() for pmem Dan Williams
2021-08-26 19:08 ` Dan Williams
2021-08-27  7:12   ` Jane Chu
2021-09-13 10:29 ` Borislav Petkov
2021-09-14 18:08   ` Dan Williams
2021-09-15 10:41     ` Borislav Petkov
2021-09-16 20:33       ` Dan Williams
2021-09-17 11:30         ` Borislav Petkov
2021-09-21  2:04           ` Dan Williams
2021-09-30 17:19             ` Borislav Petkov
2021-09-30 17:28               ` Luck, Tony
2021-09-30 19:30                 ` Borislav Petkov
2021-09-30 19:41                   ` Dan Williams
2021-09-30 19:44                   ` Luck, Tony
2021-09-30 20:01                     ` Borislav Petkov
2021-09-30 20:15                       ` Luck, Tony
2021-09-30 20:32                         ` Borislav Petkov
2021-09-30 20:39                           ` Dan Williams
2021-09-30 20:54                             ` Borislav Petkov
2021-09-30 21:05                               ` Dan Williams
2021-09-30 21:20                                 ` Borislav Petkov
2021-09-30 21:41                                   ` Dan Williams
2021-09-30 22:35                                     ` Borislav Petkov
2021-09-30 22:44                                       ` Dan Williams
2021-10-01 10:41                                         ` Borislav Petkov
2021-10-01  0:43                                       ` Jane Chu
2021-10-01  2:02                                         ` Dan Williams
2021-10-01 10:50                                           ` Borislav Petkov
2021-10-01 16:52                                             ` Dan Williams
2021-10-01 18:11                                               ` Borislav Petkov
2021-10-01 18:29                                                 ` Dan Williams [this message]
2021-10-02 10:17                                                   ` Borislav Petkov
2021-11-11  0:06                                                     ` Jane Chu
2021-11-12  0:30                                                       ` Jane Chu
2021-11-12  0:51                                                         ` Dan Williams
2021-11-12 17:57                                                           ` Jane Chu
2021-11-12 19:24                                                             ` Dan Williams
2021-11-12 22:35                                                               ` Jane Chu
2021-11-12 22:50                                                                 ` Jane Chu
2021-11-12 23:08                                                                 ` Dan Williams
2021-11-13  5:50                                                                   ` Jane Chu
2021-11-13 20:47                                                                     ` Dan Williams
2021-11-18 19:03                                                                       ` Jane Chu
2021-11-25  0:16                                                                         ` Dan Williams
2021-11-30 23:00                                                                           ` Jane Chu
2021-09-30 18:15         ` Jane Chu
2021-09-30 19:11           ` Dan Williams
2021-09-30 21:23             ` Jane Chu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAPcyv4hrXPb1tASBZUg-GgdVs0OOFKXMXLiHmktg_kFi7YBMyQ@mail.gmail.com \
    --to=dan.j.williams@intel.com \
    --cc=bp@alien8.de \
    --cc=jane.chu@oracle.com \
    --cc=mcgrof@suse.com \
    --cc=nvdimm@lists.linux.dev \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).