nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Jane Chu <jane.chu@oracle.com>
To: Borislav Petkov <bp@alien8.de>, Dan Williams <dan.j.williams@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>,
	Linux NVDIMM <nvdimm@lists.linux.dev>,
	Luis Chamberlain <mcgrof@suse.com>
Subject: Re: [RFT PATCH] x86/pat: Fix set_mce_nospec() for pmem
Date: Fri, 12 Nov 2021 00:30:33 +0000	[thread overview]
Message-ID: <7ae58b24-ad48-7afa-c023-23ccf28e3994@oracle.com> (raw)
In-Reply-To: <48c47c52-499a-8721-350a-5ac55a9a70de@oracle.com>

Just a quick update -

I managed to test the 'NP' and 'UC' effect on a pmem dax file.
The result is, as expected, both setting 'NP' and 'UC' works
well in preventing the prefetcher from accessing the poisoned
pmem page.

I injected back-to-back poisons to the 3rd block(512B) of
the 3rd page in my dax file.  With 'NP', the 'mc_safe read'
stops  after reading the 1st and 2nd pages, with 'UC',
the 'mc_safe read' was able to read [2 pages + 2 blocks] on
my test machine.

thanks,
-jane


On 11/10/2021 4:06 PM, Jane Chu wrote:
> Hi, All,
> 
> Since I volunteered to add a patch in my series to captured the
> idea discussed here about what set_mce_nospec() should do, I went
> back reread the discussion.
> 
> The conclusion of the discussion is that set_mce_nospec() should
> always do set_memory_np(), regardless 'whole_page', or
> DRAM vs PMEM which the code cannot tell any way.
> 
> I'd like to voice concern about the risk that Dan raised about
> this change of behavior to pmem users.  Yes, in theory, it's all
> driver's problem, let the driver deal with it. But in reality,
> that translates to many unhappy customers when their
> mission-critical applications crash and things don't get fixed
> the next day.
> 
> Is the risk unavoidable? If I'm not mistaken, I thought the
> central point of the discussion was about clarity/simplicity
> rather than correctness.  How about we address that without
> raising risk to existing customers?  Here is my proposed
> wording with the fix Dan sent earlier.
> 
>     /*
>      * set_memory_nospec - make memory type marking in order to prevent
>      * speculative access to poisoned page.
>      *
>      * @pfn - pfn of a page that is either poisoned in the whole, or
> partially
>      *       poisoned,
>      * @whole_page - indicates whether the entire page is poisoned or only
>      *       part of the page is poisoned accoding to the MSi_MISC register.
>      *
>      * The page might be a DRAM or a PMEM page which the code cannot tell.
>      * The page might have already been unmapped (when 'whole_page') is true
>      * and may not be accessed in any case (e.g. guest page).
>      *
>      * The page might be partially poisoned and hence the non-poisoned
>      * cachelines could be safely accessed _in theory_, although practically,
>      * when a DRAM page is marked PageHWPoison, the mm-subsystem prevents
>      * it from being accessed, but when a PMEM page is marked PageHWPoison,
>      * it could practically be accessed as it is not entirely under the
>      * mm-subsystem management.
>      *
>      * Setting mem_type of the page to either 'NP' or 'UC' prevents the
>      * prefetcher from accessing the page, henec the rest of the decision
>      * is based on minimizing the risk and maximizing the flexibility,
> that is,
>      * in case of 'whole_page', set mem_type to 'NP', but for partial page
>      * poisoning, set mem_type to 'UC', regardless the page is DRAM or PMEM.
>      */
> static inline int set_mce_nospec(unsigned long pfn, bool whole_page)
> <snip>
>           if (whole_page)
>                   rc = set_memory_np(decoy_addr, 1);
>           else {
>                  rc = _set_memory_uc(decoy_addr, 1);
>           }
> <snip>
> 
> Comments? Suggestions?
> 
> thanks,
> -jane
> 
> 
> 
> 
> On 10/2/2021 3:17 AM, Borislav Petkov wrote:
>> On Fri, Oct 01, 2021 at 11:29:43AM -0700, Dan Williams wrote:
>>> My read is that the guest gets virtual #MC on an access to that page.
>>> When the guest tries to do set_memory_uc() and instructs cpa_flush()
>>> to do clean caches that results in taking another fault / exception
>>> perhaps because the VMM unmapped the page from the guest? If the guest
>>> had flipped the page to NP then cpa_flush() says "oh, no caching
>>> change, skip the clflush() loop".
>>
>> ... and the CLFLUSH is the insn which caused the second MCE because it
>> "appeared that the guest was accessing the bad page."
>>
>> Uuf, that could be. Nasty.
>>
>>> Yeah, I thought UC would make the PMEM driver's life easier, but if it
>>> has to contend with an NP case at all, might as well make it handle
>>> that case all the time.
>>>
>>> Safe to say this patch of mine is woefully insufficient and I need to
>>> go look at how to make the guarantees needed by the PMEM driver so it
>>> can handle NP and set up alias maps.
>>>
>>> This was a useful discussion.
>>
>> Oh yeah, thanks for taking the time!
>>
>>> It proves that my commit:
>>>
>>> 284ce4011ba6 x86/memory_failure: Introduce {set, clear}_mce_nospec()
>>>
>>> ...was broken from the outset.
>>
>> Well, the problem with hw errors is that it is always very hard to test
>> the code. But I hear injection works now soo... :-)
>>
>> Thx.
>>
> 


  reply	other threads:[~2021-11-12  0:30 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-07  1:01 [RFT PATCH] x86/pat: Fix set_mce_nospec() for pmem Dan Williams
2021-08-26 19:08 ` Dan Williams
2021-08-27  7:12   ` Jane Chu
2021-09-13 10:29 ` Borislav Petkov
2021-09-14 18:08   ` Dan Williams
2021-09-15 10:41     ` Borislav Petkov
2021-09-16 20:33       ` Dan Williams
2021-09-17 11:30         ` Borislav Petkov
2021-09-21  2:04           ` Dan Williams
2021-09-30 17:19             ` Borislav Petkov
2021-09-30 17:28               ` Luck, Tony
2021-09-30 19:30                 ` Borislav Petkov
2021-09-30 19:41                   ` Dan Williams
2021-09-30 19:44                   ` Luck, Tony
2021-09-30 20:01                     ` Borislav Petkov
2021-09-30 20:15                       ` Luck, Tony
2021-09-30 20:32                         ` Borislav Petkov
2021-09-30 20:39                           ` Dan Williams
2021-09-30 20:54                             ` Borislav Petkov
2021-09-30 21:05                               ` Dan Williams
2021-09-30 21:20                                 ` Borislav Petkov
2021-09-30 21:41                                   ` Dan Williams
2021-09-30 22:35                                     ` Borislav Petkov
2021-09-30 22:44                                       ` Dan Williams
2021-10-01 10:41                                         ` Borislav Petkov
2021-10-01  0:43                                       ` Jane Chu
2021-10-01  2:02                                         ` Dan Williams
2021-10-01 10:50                                           ` Borislav Petkov
2021-10-01 16:52                                             ` Dan Williams
2021-10-01 18:11                                               ` Borislav Petkov
2021-10-01 18:29                                                 ` Dan Williams
2021-10-02 10:17                                                   ` Borislav Petkov
2021-11-11  0:06                                                     ` Jane Chu
2021-11-12  0:30                                                       ` Jane Chu [this message]
2021-11-12  0:51                                                         ` Dan Williams
2021-11-12 17:57                                                           ` Jane Chu
2021-11-12 19:24                                                             ` Dan Williams
2021-11-12 22:35                                                               ` Jane Chu
2021-11-12 22:50                                                                 ` Jane Chu
2021-11-12 23:08                                                                 ` Dan Williams
2021-11-13  5:50                                                                   ` Jane Chu
2021-11-13 20:47                                                                     ` Dan Williams
2021-11-18 19:03                                                                       ` Jane Chu
2021-11-25  0:16                                                                         ` Dan Williams
2021-11-30 23:00                                                                           ` Jane Chu
2021-09-30 18:15         ` Jane Chu
2021-09-30 19:11           ` Dan Williams
2021-09-30 21:23             ` Jane Chu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7ae58b24-ad48-7afa-c023-23ccf28e3994@oracle.com \
    --to=jane.chu@oracle.com \
    --cc=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=mcgrof@suse.com \
    --cc=nvdimm@lists.linux.dev \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).