All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] Missing compound_head() in memory-failure
@ 2022-01-31 20:29 Matthew Wilcox
  2022-01-31 20:54 ` Joao Martins
  0 siblings, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2022-01-31 20:29 UTC (permalink / raw)
  To: Joao Martins, Jane Chu; +Cc: Naoya Horiguchi, linux-mm, Andrew Morton

Unless I am mistaken, you have to pass the compound head of the page
which has the error to collect_procs().  Am I mistaken?

(found by code inspection as part of the folio work; I think
collect_procs() needs to take a folio, which would eliminate this class
of errors)

+++ b/mm/memory-failure.c
@@ -1633,7 +1633,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
         * SIGBUS (i.e. MF_MUST_KILL)
         */
        flags |= MF_ACTION_REQUIRED | MF_MUST_KILL;
-       collect_procs(page, &tokill, flags & MF_ACTION_REQUIRED);
+       collect_procs(compound_head(page), &tokill, flags & MF_ACTION_REQUIRED);

        list_for_each_entry(tk, &tokill, nd)
                if (tk->size_shift)




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Missing compound_head() in memory-failure
  2022-01-31 20:29 [RFC] Missing compound_head() in memory-failure Matthew Wilcox
@ 2022-01-31 20:54 ` Joao Martins
  2022-02-01  2:34   ` Matthew Wilcox
  2022-02-01 15:46   ` Matthew Wilcox
  0 siblings, 2 replies; 12+ messages in thread
From: Joao Martins @ 2022-01-31 20:54 UTC (permalink / raw)
  To: Matthew Wilcox, Jane Chu; +Cc: Naoya Horiguchi, linux-mm, Andrew Morton



On 1/31/22 20:29, Matthew Wilcox wrote:
> Unless I am mistaken, you have to pass the compound head of the page
> which has the error to collect_procs().  Am I mistaken?
> 
-rc2 already has a fix for it:

https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/

Earlier in that function there's a:

	page = compound_head(page);

So the @page passed to collect_procs() already is a head page.

filesystem-dax, though, it's always base pages. So there's no heads.

> (found by code inspection as part of the folio work; I think
> collect_procs() needs to take a folio, which would eliminate this class
> of errors)
> 
> +++ b/mm/memory-failure.c
> @@ -1633,7 +1633,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>          * SIGBUS (i.e. MF_MUST_KILL)
>          */
>         flags |= MF_ACTION_REQUIRED | MF_MUST_KILL;
> -       collect_procs(page, &tokill, flags & MF_ACTION_REQUIRED);
> +       collect_procs(compound_head(page), &tokill, flags & MF_ACTION_REQUIRED);
> 
>         list_for_each_entry(tk, &tokill, nd)
>                 if (tk->size_shift)
> 
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Missing compound_head() in memory-failure
  2022-01-31 20:54 ` Joao Martins
@ 2022-02-01  2:34   ` Matthew Wilcox
  2022-02-01 15:46   ` Matthew Wilcox
  1 sibling, 0 replies; 12+ messages in thread
From: Matthew Wilcox @ 2022-02-01  2:34 UTC (permalink / raw)
  To: Joao Martins; +Cc: Jane Chu, Naoya Horiguchi, linux-mm, Andrew Morton

On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
> 
> 
> On 1/31/22 20:29, Matthew Wilcox wrote:
> > Unless I am mistaken, you have to pass the compound head of the page
> > which has the error to collect_procs().  Am I mistaken?
> > 
> -rc2 already has a fix for it:
> 
> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/
> 
> Earlier in that function there's a:
> 
> 	page = compound_head(page);
> 
> So the @page passed to collect_procs() already is a head page.

Well, that'll teach me to be two days behind on Linus' tree ;-)



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Missing compound_head() in memory-failure
  2022-01-31 20:54 ` Joao Martins
  2022-02-01  2:34   ` Matthew Wilcox
@ 2022-02-01 15:46   ` Matthew Wilcox
  2022-02-01 16:01     ` Joao Martins
  2022-02-01 21:11     ` Jane Chu
  1 sibling, 2 replies; 12+ messages in thread
From: Matthew Wilcox @ 2022-02-01 15:46 UTC (permalink / raw)
  To: Joao Martins; +Cc: Jane Chu, Naoya Horiguchi, linux-mm, Andrew Morton

On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
> On 1/31/22 20:29, Matthew Wilcox wrote:
> > Unless I am mistaken, you have to pass the compound head of the page
> > which has the error to collect_procs().  Am I mistaken?
> > 
> -rc2 already has a fix for it:
> 
> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/
> 
> Earlier in that function there's a:
> 
> 	page = compound_head(page);
> 
> So the @page passed to collect_procs() already is a head page.

It's wrong though ;-(  You set the HWPoison bit on the page after
calling compound_head(), so you set the bit on the head page instead
of the precise page that had the poison.

I'm fixing this up as part of the folio patches, but you may wish to
fix it earlier than that.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Missing compound_head() in memory-failure
  2022-02-01 15:46   ` Matthew Wilcox
@ 2022-02-01 16:01     ` Joao Martins
  2022-02-01 17:40       ` Joao Martins
  2022-02-01 21:11     ` Jane Chu
  1 sibling, 1 reply; 12+ messages in thread
From: Joao Martins @ 2022-02-01 16:01 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jane Chu, Naoya Horiguchi, linux-mm, Andrew Morton, Dan Williams

On 2/1/22 15:46, Matthew Wilcox wrote:
> On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
>> On 1/31/22 20:29, Matthew Wilcox wrote:
>>> Unless I am mistaken, you have to pass the compound head of the page
>>> which has the error to collect_procs().  Am I mistaken?
>>>
>> -rc2 already has a fix for it:
>>
>> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/
>>
>> Earlier in that function there's a:
>>
>> 	page = compound_head(page);
>>
>> So the @page passed to collect_procs() already is a head page.
> 
> It's wrong though ;-(  You set the HWPoison bit on the page after
> calling compound_head(), so you set the bit on the head page instead
> of the precise page that had the poison.
> 
Considering that on device-dax we would unmap the whole 2M page regardless
of the poisoned subpage isn't that actually representative still?

> I'm fixing this up as part of the folio patches, but you may wish to
> fix it earlier than that.



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Missing compound_head() in memory-failure
  2022-02-01 16:01     ` Joao Martins
@ 2022-02-01 17:40       ` Joao Martins
  2022-02-01 17:47         ` Matthew Wilcox
  0 siblings, 1 reply; 12+ messages in thread
From: Joao Martins @ 2022-02-01 17:40 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jane Chu, Naoya Horiguchi, linux-mm, Andrew Morton, Dan Williams

On 2/1/22 16:01, Joao Martins wrote:
> On 2/1/22 15:46, Matthew Wilcox wrote:
>> On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
>>> On 1/31/22 20:29, Matthew Wilcox wrote:
>>>> Unless I am mistaken, you have to pass the compound head of the page
>>>> which has the error to collect_procs().  Am I mistaken?
>>>>
>>> -rc2 already has a fix for it:
>>>
>>> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/
>>>
>>> Earlier in that function there's a:
>>>
>>> 	page = compound_head(page);
>>>
>>> So the @page passed to collect_procs() already is a head page.
>>
>> It's wrong though ;-(  You set the HWPoison bit on the page after
>> calling compound_head(), so you set the bit on the head page instead
>> of the precise page that had the poison.
>>
> Considering that on device-dax we would unmap the whole 2M page regardless
> of the poisoned subpage isn't that actually representative still?
> 

To say this another way. We do set the HWPoison on the head page,
and not the subpage as you say, but we end up propagating the resultant
MCE action on the superset of pages in the whole PMD or PUD.

What I was trying to say in perhaps a convoluted way is that device-dax
case isn't different than HugeTLB that only wants to poison head. If there's
a head page, there's likely a PMD or PUD populated (depending what the device
was onlined with) and thus that's what gets unmapped. There's no idea of
subpages being treated any differently, at least as far as device-dax is
concerned -- unless I miss auditing some other code path.

fsdax IIUC seems to rely more on the subpage bit being flagged but no
functional change here for fsdax as there's only base pages there (no heads).

>> I'm fixing this up as part of the folio patches, but you may wish to
>> fix it earlier than that.
(+Dan in case I misrepresented or missed something)

Should we deem it a problem, I'll fix for the next -rc.
Just in case, here's the diff stashed:

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 2e2f740c63dc..661c23df8115 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1577,7 +1577,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
 static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
                struct dev_pagemap *pgmap)
 {
-       struct page *page = pfn_to_page(pfn);
+       struct page *page = pfn_to_page(pfn), *subpage = page;
        unsigned long size = 0;
        struct to_kill *tk;
        LIST_HEAD(tokill);
@@ -1631,7 +1631,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
         * Use this flag as an indication that the dax page has been
         * remapped UC to prevent speculative consumption of poison.
         */
-       SetPageHWPoison(page);
+       SetPageHWPoison(subpage);

        /*
         * Unlike System-RAM there is no possibility to swap in a


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC] Missing compound_head() in memory-failure
  2022-02-01 17:40       ` Joao Martins
@ 2022-02-01 17:47         ` Matthew Wilcox
  2022-02-01 18:34           ` Joao Martins
  0 siblings, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2022-02-01 17:47 UTC (permalink / raw)
  To: Joao Martins
  Cc: Jane Chu, Naoya Horiguchi, linux-mm, Andrew Morton, Dan Williams

On Tue, Feb 01, 2022 at 05:40:57PM +0000, Joao Martins wrote:
> On 2/1/22 16:01, Joao Martins wrote:
> > On 2/1/22 15:46, Matthew Wilcox wrote:
> >> On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
> >>> On 1/31/22 20:29, Matthew Wilcox wrote:
> >>>> Unless I am mistaken, you have to pass the compound head of the page
> >>>> which has the error to collect_procs().  Am I mistaken?
> >>>>
> >>> -rc2 already has a fix for it:
> >>>
> >>> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/
> >>>
> >>> Earlier in that function there's a:
> >>>
> >>> 	page = compound_head(page);
> >>>
> >>> So the @page passed to collect_procs() already is a head page.
> >>
> >> It's wrong though ;-(  You set the HWPoison bit on the page after
> >> calling compound_head(), so you set the bit on the head page instead
> >> of the precise page that had the poison.
> >>
> > Considering that on device-dax we would unmap the whole 2M page regardless
> > of the poisoned subpage isn't that actually representative still?
> > 
> 
> To say this another way. We do set the HWPoison on the head page,
> and not the subpage as you say, but we end up propagating the resultant
> MCE action on the superset of pages in the whole PMD or PUD.
> 
> What I was trying to say in perhaps a convoluted way is that device-dax
> case isn't different than HugeTLB that only wants to poison head. If there's
> a head page, there's likely a PMD or PUD populated (depending what the device
> was onlined with) and thus that's what gets unmapped. There's no idea of
> subpages being treated any differently, at least as far as device-dax is
> concerned -- unless I miss auditing some other code path.

Using HugeTLB as a model is not a good idea.  THP is the model to
aim for; one can choose to map pages askew (ie not aligned with a
PMD), and I don't think you'll find all the mappings with the current
code (eg if someone has mapped a single page of the hugepage).

> fsdax IIUC seems to rely more on the subpage bit being flagged but no
> functional change here for fsdax as there's only base pages there (no heads).
> 
> >> I'm fixing this up as part of the folio patches, but you may wish to
> >> fix it earlier than that.
> (+Dan in case I misrepresented or missed something)
> 
> Should we deem it a problem, I'll fix for the next -rc.
> Just in case, here's the diff stashed:
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 2e2f740c63dc..661c23df8115 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1577,7 +1577,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
>  static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>                 struct dev_pagemap *pgmap)
>  {
> -       struct page *page = pfn_to_page(pfn);
> +       struct page *page = pfn_to_page(pfn), *subpage = page;
>         unsigned long size = 0;
>         struct to_kill *tk;
>         LIST_HEAD(tokill);
> @@ -1631,7 +1631,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>          * Use this flag as an indication that the dax page has been
>          * remapped UC to prevent speculative consumption of poison.
>          */
> -       SetPageHWPoison(page);
> +       SetPageHWPoison(subpage);
> 
>         /*
>          * Unlike System-RAM there is no possibility to swap in a


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Missing compound_head() in memory-failure
  2022-02-01 17:47         ` Matthew Wilcox
@ 2022-02-01 18:34           ` Joao Martins
  0 siblings, 0 replies; 12+ messages in thread
From: Joao Martins @ 2022-02-01 18:34 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Jane Chu, Naoya Horiguchi, linux-mm, Andrew Morton, Dan Williams

On 2/1/22 17:47, Matthew Wilcox wrote:
> On Tue, Feb 01, 2022 at 05:40:57PM +0000, Joao Martins wrote:
>> On 2/1/22 16:01, Joao Martins wrote:
>>> On 2/1/22 15:46, Matthew Wilcox wrote:
>>>> On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
>>>>> On 1/31/22 20:29, Matthew Wilcox wrote:
>>>>>> Unless I am mistaken, you have to pass the compound head of the page
>>>>>> which has the error to collect_procs().  Am I mistaken?
>>>>>>
>>>>> -rc2 already has a fix for it:
>>>>>
>>>>> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/
>>>>>
>>>>> Earlier in that function there's a:
>>>>>
>>>>> 	page = compound_head(page);
>>>>>
>>>>> So the @page passed to collect_procs() already is a head page.
>>>>
>>>> It's wrong though ;-(  You set the HWPoison bit on the page after
>>>> calling compound_head(), so you set the bit on the head page instead
>>>> of the precise page that had the poison.
>>>>
>>> Considering that on device-dax we would unmap the whole 2M page regardless
>>> of the poisoned subpage isn't that actually representative still?
>>>
>>
>> To say this another way. We do set the HWPoison on the head page,
>> and not the subpage as you say, but we end up propagating the resultant
>> MCE action on the superset of pages in the whole PMD or PUD.
>>
>> What I was trying to say in perhaps a convoluted way is that device-dax
>> case isn't different than HugeTLB that only wants to poison head. If there's
>> a head page, there's likely a PMD or PUD populated (depending what the device
>> was onlined with) and thus that's what gets unmapped. There's no idea of
>> subpages being treated any differently, at least as far as device-dax is
>> concerned -- unless I miss auditing some other code path.
> 
> Using HugeTLB as a model is not a good idea.  THP is the model to
> aim for; one can choose to map pages askew (ie not aligned with a
> PMD), and I don't think you'll find all the mappings with the current
> code (eg if someone has mapped a single page of the hugepage).
> 

Don't know if that last sentence referred to THP or DAX -- but to be extra sure,
on device-dax current code you can't map or remap subpages of a hugepage.
Even if you open a device-dax fd and touch just a subpage, what gets mapped
in the user mm is the entirety of the PMD (or PUD). You can't even map a
single random page, it always needs to be aligned /at a minimum/ to the
namespace boundary (which is the page table granularity i.e. PTE, PMD or PUD).

>> fsdax IIUC seems to rely more on the subpage bit being flagged but no
>> functional change here for fsdax as there's only base pages there (no heads).
>>
>>>> I'm fixing this up as part of the folio patches, but you may wish to
>>>> fix it earlier than that.
>> (+Dan in case I misrepresented or missed something)
>>
>> Should we deem it a problem, I'll fix for the next -rc.
>> Just in case, here's the diff stashed:
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 2e2f740c63dc..661c23df8115 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -1577,7 +1577,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
>>  static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>>                 struct dev_pagemap *pgmap)
>>  {
>> -       struct page *page = pfn_to_page(pfn);
>> +       struct page *page = pfn_to_page(pfn), *subpage = page;
>>         unsigned long size = 0;
>>         struct to_kill *tk;
>>         LIST_HEAD(tokill);
>> @@ -1631,7 +1631,7 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
>>          * Use this flag as an indication that the dax page has been
>>          * remapped UC to prevent speculative consumption of poison.
>>          */
>> -       SetPageHWPoison(page);
>> +       SetPageHWPoison(subpage);
>>
>>         /*
>>          * Unlike System-RAM there is no possibility to swap in a


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Missing compound_head() in memory-failure
  2022-02-01 15:46   ` Matthew Wilcox
  2022-02-01 16:01     ` Joao Martins
@ 2022-02-01 21:11     ` Jane Chu
  2022-02-01 22:04       ` Joao Martins
  1 sibling, 1 reply; 12+ messages in thread
From: Jane Chu @ 2022-02-01 21:11 UTC (permalink / raw)
  To: Matthew Wilcox, Joao Martins; +Cc: Naoya Horiguchi, linux-mm, Andrew Morton

On 2/1/2022 7:46 AM, Matthew Wilcox wrote:
> On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
>> On 1/31/22 20:29, Matthew Wilcox wrote:
>>> Unless I am mistaken, you have to pass the compound head of the page
>>> which has the error to collect_procs().  Am I mistaken?
>>>
>> -rc2 already has a fix for it:
>>
>> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/
>>
>> Earlier in that function there's a:
>>
>> 	page = compound_head(page);
>>
>> So the @page passed to collect_procs() already is a head page.
> 
> It's wrong though ;-(  You set the HWPoison bit on the page after
> calling compound_head(), so you set the bit on the head page instead
> of the precise page that had the poison.

Indeed. The rest of the kernel including  pmem driver still deal with
base page on clearing poison, bookkeeping etc. So the HWpoison bit needs 
to be set precisely on the poisoned base page such that we pass the 
correct 'pfn' to set_mce_nospec() to discourage speculative access.

> 
> I'm fixing this up as part of the folio patches, but you may wish to
> fix it earlier than that.

Thanks for the fix!

-jane



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Missing compound_head() in memory-failure
  2022-02-01 21:11     ` Jane Chu
@ 2022-02-01 22:04       ` Joao Martins
  2022-02-01 22:20         ` Dan Williams
  0 siblings, 1 reply; 12+ messages in thread
From: Joao Martins @ 2022-02-01 22:04 UTC (permalink / raw)
  To: Jane Chu, Matthew Wilcox
  Cc: Naoya Horiguchi, linux-mm, Andrew Morton, Dan Williams

On 2/1/22 21:11, Jane Chu wrote:
> On 2/1/2022 7:46 AM, Matthew Wilcox wrote:
>> On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
>>> On 1/31/22 20:29, Matthew Wilcox wrote:
>>>> Unless I am mistaken, you have to pass the compound head of the page
>>>> which has the error to collect_procs().  Am I mistaken?
>>>>
>>> -rc2 already has a fix for it:
>>>
>>> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/
>>>
>>> Earlier in that function there's a:
>>>
>>> 	page = compound_head(page);
>>>
>>> So the @page passed to collect_procs() already is a head page.
>>
>> It's wrong though ;-(  You set the HWPoison bit on the page after
>> calling compound_head(), so you set the bit on the head page instead
>> of the precise page that had the poison.
> 
> Indeed. The rest of the kernel including  pmem driver still deal with
> base page on clearing poison, bookkeeping etc. So the HWpoison bit needs 
> to be set precisely on the poisoned base page such that we pass the 
> correct 'pfn' to set_mce_nospec() to discourage speculative access.
> 
set_mce_nospec() machinery makes no use of the HWPoison bit as far as
my reading goes. And the PFN that is passed to set_mce_nospec() is already
the subpage PFN that eventually lands on set_memory_np()/set_memory_uc() when
it changes the kernel page tables mapping (which also don't use the poison bit).

I still can't see how device-dax machinery makes use of that bit? At least
the one which could use it (clear_mce_nospec()) doesn't actually go through
device-dax nvdimm-specific code only fsdax which I reiterate that the patch
does not change as there's no compound head there. Am I missing something?

>> I'm fixing this up as part of the folio patches, but you may wish to
>> fix it earlier than that.
> 
> Thanks for the fix!

As I had mentioned earlier I have one prepped, if this turns out to be
indeed a problem. So far, I haven't spotted any on my testing since I
started this work, but it could also be an oversight on my end.

	Joao


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Missing compound_head() in memory-failure
  2022-02-01 22:04       ` Joao Martins
@ 2022-02-01 22:20         ` Dan Williams
  2022-02-01 23:04           ` Jane Chu
  0 siblings, 1 reply; 12+ messages in thread
From: Dan Williams @ 2022-02-01 22:20 UTC (permalink / raw)
  To: Joao Martins
  Cc: Jane Chu, Matthew Wilcox, Naoya Horiguchi, linux-mm, Andrew Morton

On Tue, Feb 1, 2022 at 2:05 PM Joao Martins <joao.m.martins@oracle.com> wrote:
>
> On 2/1/22 21:11, Jane Chu wrote:
> > On 2/1/2022 7:46 AM, Matthew Wilcox wrote:
> >> On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
> >>> On 1/31/22 20:29, Matthew Wilcox wrote:
> >>>> Unless I am mistaken, you have to pass the compound head of the page
> >>>> which has the error to collect_procs().  Am I mistaken?
> >>>>
> >>> -rc2 already has a fix for it:
> >>>
> >>> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/
> >>>
> >>> Earlier in that function there's a:
> >>>
> >>>     page = compound_head(page);
> >>>
> >>> So the @page passed to collect_procs() already is a head page.
> >>
> >> It's wrong though ;-(  You set the HWPoison bit on the page after
> >> calling compound_head(), so you set the bit on the head page instead
> >> of the precise page that had the poison.
> >
> > Indeed. The rest of the kernel including  pmem driver still deal with
> > base page on clearing poison, bookkeeping etc. So the HWpoison bit needs
> > to be set precisely on the poisoned base page such that we pass the
> > correct 'pfn' to set_mce_nospec() to discourage speculative access.
> >
> set_mce_nospec() machinery makes no use of the HWPoison bit as far as
> my reading goes. And the PFN that is passed to set_mce_nospec() is already
> the subpage PFN that eventually lands on set_memory_np()/set_memory_uc() when
> it changes the kernel page tables mapping (which also don't use the poison bit).
>
> I still can't see how device-dax machinery makes use of that bit? At least
> the one which could use it (clear_mce_nospec()) doesn't actually go through
> device-dax nvdimm-specific code only fsdax which I reiterate that the patch
> does not change as there's no compound head there. Am I missing something?

device-dax does not use that bit, because there is no kernel mediated
access to the backing range. In the fsdax case that bit is used to
determine when to run clear_mce_nospec(). Jane is in the process of
reworking this path.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC] Missing compound_head() in memory-failure
  2022-02-01 22:20         ` Dan Williams
@ 2022-02-01 23:04           ` Jane Chu
  0 siblings, 0 replies; 12+ messages in thread
From: Jane Chu @ 2022-02-01 23:04 UTC (permalink / raw)
  To: Dan Williams, Joao Martins
  Cc: Matthew Wilcox, Naoya Horiguchi, linux-mm, Andrew Morton

On 2/1/2022 2:20 PM, Dan Williams wrote:
> On Tue, Feb 1, 2022 at 2:05 PM Joao Martins <joao.m.martins@oracle.com> wrote:
>>
>> On 2/1/22 21:11, Jane Chu wrote:
>>> On 2/1/2022 7:46 AM, Matthew Wilcox wrote:
>>>> On Mon, Jan 31, 2022 at 08:54:39PM +0000, Joao Martins wrote:
>>>>> On 1/31/22 20:29, Matthew Wilcox wrote:
>>>>>> Unless I am mistaken, you have to pass the compound head of the page
>>>>>> which has the error to collect_procs().  Am I mistaken?
>>>>>>
>>>>> -rc2 already has a fix for it:
>>>>>
>>>>> https://lore.kernel.org/linux-mm/20220129021420.PgBIZm-q9%25akpm@linux-foundation.org/
>>>>>
>>>>> Earlier in that function there's a:
>>>>>
>>>>>      page = compound_head(page);
>>>>>
>>>>> So the @page passed to collect_procs() already is a head page.
>>>>
>>>> It's wrong though ;-(  You set the HWPoison bit on the page after
>>>> calling compound_head(), so you set the bit on the head page instead
>>>> of the precise page that had the poison.
>>>
>>> Indeed. The rest of the kernel including  pmem driver still deal with
>>> base page on clearing poison, bookkeeping etc. So the HWpoison bit needs
>>> to be set precisely on the poisoned base page such that we pass the
>>> correct 'pfn' to set_mce_nospec() to discourage speculative access.
>>>
>> set_mce_nospec() machinery makes no use of the HWPoison bit as far as
>> my reading goes. And the PFN that is passed to set_mce_nospec() is already
>> the subpage PFN that eventually lands on set_memory_np()/set_memory_uc() when
>> it changes the kernel page tables mapping (which also don't use the poison bit).
>>
>> I still can't see how device-dax machinery makes use of that bit? At least
>> the one which could use it (clear_mce_nospec()) doesn't actually go through
>> device-dax nvdimm-specific code only fsdax which I reiterate that the patch
>> does not change as there's no compound head there. Am I missing something?
> 
> device-dax does not use that bit, because there is no kernel mediated
> access to the backing range. In the fsdax case that bit is used to
> determine when to run clear_mce_nospec(). Jane is in the process of
> reworking this path.

Thanks Dan!  Sorry for confusing set_mce_nospec with PageHWpoison bit,
another look at the code leads me to realize that perhaps for devdax,
the PageHWpoison bit is never cleared during the life time of the page.
and that is okay since the kernel parts that deal with devdax pages do 
not care.  As for fsdax, fsdax use base pages and Joao's code doesn't 
change that, hence the PageHWpoison bit would be set precisely in the 
poisoned base page.

thanks!
-jane


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-02-01 23:04 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-31 20:29 [RFC] Missing compound_head() in memory-failure Matthew Wilcox
2022-01-31 20:54 ` Joao Martins
2022-02-01  2:34   ` Matthew Wilcox
2022-02-01 15:46   ` Matthew Wilcox
2022-02-01 16:01     ` Joao Martins
2022-02-01 17:40       ` Joao Martins
2022-02-01 17:47         ` Matthew Wilcox
2022-02-01 18:34           ` Joao Martins
2022-02-01 21:11     ` Jane Chu
2022-02-01 22:04       ` Joao Martins
2022-02-01 22:20         ` Dan Williams
2022-02-01 23:04           ` Jane Chu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.