* [PATCH] mm,hwpoison: unmap poisoned page before invalidation
@ 2022-03-25 20:14 Rik van Riel
  2022-03-26  7:48 ` Miaohe Lin
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Rik van Riel @ 2022-03-25 20:14 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, kernel-team, Oscar Salvador, Miaohe Lin,
	Naoya Horiguchi, Mel Gorman, Johannes Weiner, Andrew Morton,
	stable

In some cases it appears the invalidation of a hwpoisoned page
fails because the page is still mapped in another process. This
can cause a program to be continuously restarted and die when
it page faults on the page that was not invalidated. Avoid that
problem by unmapping the hwpoisoned page when we find it.

Another issue is that sometimes we end up oopsing in finish_fault,
if the code tries to do something with the now-NULL vmf->page.
I did not hit this error when submitting the previous patch because
there are several opportunities for alloc_set_pte to bail out before
accessing vmf->page, and that apparently happened on those systems,
and most of the time on other systems, too.

However, across several million systems that error does occur a
handful of times a day. It can be avoided by returning VM_FAULT_NOPAGE
which will cause do_read_fault to return before calling finish_fault.

Fixes: e53ac7374e64 ("mm: invalidate hwpoison page cache page in fault path")
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
---
 mm/memory.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index be44d0b36b18..76e3af9639d9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 		return ret;
 
 	if (unlikely(PageHWPoison(vmf->page))) {
+		struct page *page = vmf->page;
 		vm_fault_t poisonret = VM_FAULT_HWPOISON;
 		if (ret & VM_FAULT_LOCKED) {
+			if (page_mapped(page))
+				unmap_mapping_pages(page_mapping(page),
+						    page->index, 1, false);
 			/* Retry if a clean page was removed from the cache. */
-			if (invalidate_inode_page(vmf->page))
-				poisonret = 0;
-			unlock_page(vmf->page);
+			if (invalidate_inode_page(page))
+				poisonret = VM_FAULT_NOPAGE;
+			unlock_page(page);
 		}
-		put_page(vmf->page);
+		put_page(page);
 		vmf->page = NULL;
 		return poisonret;
 	}
-- 
2.35.1



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation
  2022-03-25 20:14 [PATCH] mm,hwpoison: unmap poisoned page before invalidation Rik van Riel
@ 2022-03-26  7:48 ` Miaohe Lin
  2022-03-26 20:14   ` Rik van Riel
  2022-03-28  9:00 ` Oscar Salvador
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Miaohe Lin @ 2022-03-26  7:48 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-mm, kernel-team, Oscar Salvador, Naoya Horiguchi,
	Mel Gorman, Johannes Weiner, Andrew Morton, stable, linux-kernel

On 2022/3/26 4:14, Rik van Riel wrote:
> In some cases it appears the invalidation of a hwpoisoned page
> fails because the page is still mapped in another process. This
> can cause a program to be continuously restarted and die when
> it page faults on the page that was not invalidated. Avoid that
> problem by unmapping the hwpoisoned page when we find it.
> 
> Another issue is that sometimes we end up oopsing in finish_fault,
> if the code tries to do something with the now-NULL vmf->page.
> I did not hit this error when submitting the previous patch because
> there are several opportunities for alloc_set_pte to bail out before
> accessing vmf->page, and that apparently happened on those systems,
> and most of the time on other systems, too.
> 
> However, across several million systems that error does occur a
> handful of times a day. It can be avoided by returning VM_FAULT_NOPAGE
> which will cause do_read_fault to return before calling finish_fault.
> 
> Fixes: e53ac7374e64 ("mm: invalidate hwpoison page cache page in fault path")
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: stable@vger.kernel.org
> ---
>  mm/memory.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index be44d0b36b18..76e3af9639d9 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>  		return ret;
>  
>  	if (unlikely(PageHWPoison(vmf->page))) {
> +		struct page *page = vmf->page;
>  		vm_fault_t poisonret = VM_FAULT_HWPOISON;
>  		if (ret & VM_FAULT_LOCKED) {
> +			if (page_mapped(page))
> +				unmap_mapping_pages(page_mapping(page),
> +						    page->index, 1, false);

It seems this unmap_mapping_pages also helps the success rate of the invalidate_inode_page call below.

>  			/* Retry if a clean page was removed from the cache. */
> -			if (invalidate_inode_page(vmf->page))
> -				poisonret = 0;
> -			unlock_page(vmf->page);
> +			if (invalidate_inode_page(page))
> +				poisonret = VM_FAULT_NOPAGE;
> +			unlock_page(page);
>  		}
> -		put_page(vmf->page);
> +		put_page(page);

Do we use page instead of vmf->page just for simplicity? Or is there some other concern?

>  		vmf->page = NULL;

We return either VM_FAULT_NOPAGE or VM_FAULT_HWPOISON with vmf->page = NULL. In either case,
finish_fault won't be called later. So I think your fix is right.

>  		return poisonret;
>  	}
> 

Many thanks for your patch.


* Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation
  2022-03-26  7:48 ` Miaohe Lin
@ 2022-03-26 20:14   ` Rik van Riel
  2022-03-28  2:14     ` Miaohe Lin
  0 siblings, 1 reply; 11+ messages in thread
From: Rik van Riel @ 2022-03-26 20:14 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: linux-mm, kernel-team, Oscar Salvador, Naoya Horiguchi,
	Mel Gorman, Johannes Weiner, Andrew Morton, stable, linux-kernel


On Sat, 2022-03-26 at 15:48 +0800, Miaohe Lin wrote:
> On 2022/3/26 4:14, Rik van Riel wrote:
> > 
> > +++ b/mm/memory.c
> > @@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
> >                 return ret;
> >  
> >         if (unlikely(PageHWPoison(vmf->page))) {
> > +               struct page *page = vmf->page;
> >                 vm_fault_t poisonret = VM_FAULT_HWPOISON;
> >                 if (ret & VM_FAULT_LOCKED) {
> > +                       if (page_mapped(page))
> > +                               unmap_mapping_pages(page_mapping(page),
> > +                                                   page->index, 1, false);
> 
> It seems this unmap_mapping_pages also helps the success rate of the
> invalidate_inode_page call below.
> 

That is indeed what it is supposed to do.

It isn't foolproof, since you can still end up
with dirty pages that don't get cleaned immediately,
but it seems to turn infinite loops of a program
being killed every time it's started into a more
manageable situation where the task succeeds again
pretty quickly.
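The interaction described here can be sketched in a small userspace model. This is purely illustrative: `fake_page`, `fake_invalidate` and `fake_unmap_then_invalidate` are made-up stand-ins for the kernel logic, not real kernel API, and the real invalidation path has more failure modes than the two flags shown.

```c
#include <stdbool.h>

/*
 * Toy model of the behaviour discussed above (names invented):
 * invalidation refuses a page that is still mapped or dirty, so
 * unmapping first lets a clean page be evicted, while a dirty page
 * still cannot be dropped immediately.
 */
struct fake_page {
	int mapcount;	/* number of ptes still mapping the page */
	bool dirty;	/* data not yet written back */
};

/* Stand-in for invalidate_inode_page(): clean + unmapped only. */
static bool fake_invalidate(struct fake_page *p)
{
	return p->mapcount == 0 && !p->dirty;
}

/* Stand-in for the patched path: unmap every pte, then invalidate. */
static bool fake_unmap_then_invalidate(struct fake_page *p)
{
	p->mapcount = 0;	/* unmap_mapping_pages() analogue */
	return fake_invalidate(p);
}
```

In this model a clean page mapped by another process fails plain invalidation but succeeds once unmapped first, matching the "not foolproof, but much better" behaviour described above for dirty pages.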

> >                         /* Retry if a clean page was removed from the cache. */
> > -                       if (invalidate_inode_page(vmf->page))
> > -                               poisonret = 0;
> > -                       unlock_page(vmf->page);
> > +                       if (invalidate_inode_page(page))
> > +                               poisonret = VM_FAULT_NOPAGE;
> > +                       unlock_page(page);
> >                 }
> > -               put_page(vmf->page);
> > +               put_page(page);
> 
> Do we use page instead of vmf->page just for simplicity? Or is there
> some other concern?
> 

Just a simplification, and not dereferencing the same thing
6 times.

> >                 vmf->page = NULL;
> 
> We return either VM_FAULT_NOPAGE or VM_FAULT_HWPOISON with vmf->page
>> = NULL. In either case,
> finish_fault won't be called later. So I think your fix is right.

Want to send in a Reviewed-by or Acked-by? :)

-- 
All Rights Reversed.



* Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation
  2022-03-26 20:14   ` Rik van Riel
@ 2022-03-28  2:14     ` Miaohe Lin
  2022-03-28  2:24       ` Rik van Riel
  0 siblings, 1 reply; 11+ messages in thread
From: Miaohe Lin @ 2022-03-28  2:14 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-mm, kernel-team, Oscar Salvador, Naoya Horiguchi,
	Mel Gorman, Johannes Weiner, Andrew Morton, stable, linux-kernel

On 2022/3/27 4:14, Rik van Riel wrote:
> On Sat, 2022-03-26 at 15:48 +0800, Miaohe Lin wrote:
>> On 2022/3/26 4:14, Rik van Riel wrote:
>>>
>>> +++ b/mm/memory.c
>>> @@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>>>                 return ret;
>>>  
>>>         if (unlikely(PageHWPoison(vmf->page))) {
>>> +               struct page *page = vmf->page;
>>>                 vm_fault_t poisonret = VM_FAULT_HWPOISON;
>>>                 if (ret & VM_FAULT_LOCKED) {
>>> +                       if (page_mapped(page))
>>> +                               unmap_mapping_pages(page_mapping(page),
>>> +                                                   page->index, 1, false);
>>
>> It seems this unmap_mapping_pages also helps the success rate of the
>> invalidate_inode_page call below.
>>
> 
> That is indeed what it is supposed to do.
> 
> It isn't foolproof, since you can still end up
> with dirty pages that don't get cleaned immediately,
> but it seems to turn infinite loops of a program
> being killed every time it's started into a more
> manageable situation where the task succeeds again
> pretty quickly.

Looks convincing to me.

> 
>>>                         /* Retry if a clean page was removed from the cache. */
>>> -                       if (invalidate_inode_page(vmf->page))
>>> -                               poisonret = 0;
>>> -                       unlock_page(vmf->page);
>>> +                       if (invalidate_inode_page(page))
>>> +                               poisonret = VM_FAULT_NOPAGE;
>>> +                       unlock_page(page);
>>>                 }
>>> -               put_page(vmf->page);
>>> +               put_page(page);
>>
>> Do we use page instead of vmf->page just for simplicity? Or is there
>> some other concern?
>>
> 
> Just a simplification, and not dereferencing the same thing
> 6 times.
> 

I see. :)

>>>                 vmf->page = NULL;
>>
>> We return either VM_FAULT_NOPAGE or VM_FAULT_HWPOISON with vmf->page
>> = NULL. In either case,
>> finish_fault won't be called later. So I think your fix is right.
> 
> Want to send in a Reviewed-by or Acked-by? :)
> 

Sure, but when I think more about this, it seems this fix isn't ideal:
if VM_FAULT_NOPAGE is returned with the page table entry unset, the process
will re-trigger the page fault again and again until invalidate_inode_page
succeeds in evicting the inode page. This might hang the process for a really
long time. Or am I missing something?

Thanks.


* Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation
  2022-03-28  2:14     ` Miaohe Lin
@ 2022-03-28  2:24       ` Rik van Riel
  2022-03-28  2:41         ` Miaohe Lin
  0 siblings, 1 reply; 11+ messages in thread
From: Rik van Riel @ 2022-03-28  2:24 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: linux-mm, kernel-team, Oscar Salvador, Naoya Horiguchi,
	Mel Gorman, Johannes Weiner, Andrew Morton, stable, linux-kernel


On Mon, 2022-03-28 at 10:14 +0800, Miaohe Lin wrote:
> On 2022/3/27 4:14, Rik van Riel wrote:
> 
> 
> > 
> > > >                         /* Retry if a clean page was removed from the cache. */
> > > > -                       if (invalidate_inode_page(vmf->page))
> > > > -                               poisonret = 0;
> > > > -                       unlock_page(vmf->page);
> > > > +                       if (invalidate_inode_page(page))
> > > > +                               poisonret = VM_FAULT_NOPAGE;
> > > > +                       unlock_page(page);
> > 
> 
> Sure, but when I think more about this, it seems this fix isn't
> ideal:
> If VM_FAULT_NOPAGE is returned with page table unset, the process
> will
> re-trigger page fault again and again until invalidate_inode_page
> succeeds
> in evicting the inode page. This might hang the process a really long
> time.
> Or am I missing something?
> 
If invalidate_inode_page fails, we will return
VM_FAULT_HWPOISON, and kill the task, instead
of looping indefinitely.

-- 
All Rights Reversed.



* Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation
  2022-03-28  2:24       ` Rik van Riel
@ 2022-03-28  2:41         ` Miaohe Lin
  0 siblings, 0 replies; 11+ messages in thread
From: Miaohe Lin @ 2022-03-28  2:41 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-mm, kernel-team, Oscar Salvador, Naoya Horiguchi,
	Mel Gorman, Johannes Weiner, Andrew Morton, stable, linux-kernel

On 2022/3/28 10:24, Rik van Riel wrote:
> On Mon, 2022-03-28 at 10:14 +0800, Miaohe Lin wrote:
>> On 2022/3/27 4:14, Rik van Riel wrote:
>>
>>
>>>
>>>>>                         /* Retry if a clean page was removed from the cache. */
>>>>> -                       if (invalidate_inode_page(vmf->page))
>>>>> -                               poisonret = 0;
>>>>> -                       unlock_page(vmf->page);
>>>>> +                       if (invalidate_inode_page(page))
>>>>> +                               poisonret = VM_FAULT_NOPAGE;
>>>>> +                       unlock_page(page);
>>>
>>
>> Sure, but when I think more about this, it seems this fix isn't
>> ideal:
>> If VM_FAULT_NOPAGE is returned with page table unset, the process
>> will
>> re-trigger page fault again and again until invalidate_inode_page
>> succeeds
>> in evicting the inode page. This might hang the process a really long
>> time.
>> Or am I missing something?
>>
> If invalidate_inode_page fails, we will return
> VM_FAULT_HWPOISON, and kill the task, instead
> of looping indefinitely.

Oh, really sorry! It's a drowsy Monday morning. :)

This patch looks good to me. Thanks!

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>

> 



* Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation
  2022-03-25 20:14 [PATCH] mm,hwpoison: unmap poisoned page before invalidation Rik van Riel
  2022-03-26  7:48 ` Miaohe Lin
@ 2022-03-28  9:00 ` Oscar Salvador
  2022-03-29 15:49   ` Rik van Riel
  2022-03-28 11:01 ` HORIGUCHI NAOYA(堀口 直也)
  2022-03-29 19:13 ` Oscar Salvador
  3 siblings, 1 reply; 11+ messages in thread
From: Oscar Salvador @ 2022-03-28  9:00 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-kernel, linux-mm, kernel-team, Miaohe Lin, Naoya Horiguchi,
	Mel Gorman, Johannes Weiner, Andrew Morton, stable

On Fri, Mar 25, 2022 at 04:14:28PM -0400, Rik van Riel wrote:
> In some cases it appears the invalidation of a hwpoisoned page
> fails because the page is still mapped in another process. This
> can cause a program to be continuously restarted and die when
> it page faults on the page that was not invalidated. Avoid that
> problem by unmapping the hwpoisoned page when we find it.
> 
> Another issue is that sometimes we end up oopsing in finish_fault,
> if the code tries to do something with the now-NULL vmf->page.
> I did not hit this error when submitting the previous patch because
> there are several opportunities for alloc_set_pte to bail out before
> accessing vmf->page, and that apparently happened on those systems,
> and most of the time on other systems, too.
> 
> However, across several million systems that error does occur a
> handful of times a day. It can be avoided by returning VM_FAULT_NOPAGE
> which will cause do_read_fault to return before calling finish_fault.
> 
> Fixes: e53ac7374e64 ("mm: invalidate hwpoison page cache page in fault path")
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: stable@vger.kernel.org
> ---
>  mm/memory.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index be44d0b36b18..76e3af9639d9 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>  		return ret;
>  
>  	if (unlikely(PageHWPoison(vmf->page))) {
> +		struct page *page = vmf->page;
>  		vm_fault_t poisonret = VM_FAULT_HWPOISON;
>  		if (ret & VM_FAULT_LOCKED) {
> +			if (page_mapped(page))
> +				unmap_mapping_pages(page_mapping(page),
> +						    page->index, 1, false);
>  			/* Retry if a clean page was removed from the cache. */
> -			if (invalidate_inode_page(vmf->page))
> -				poisonret = 0;
> -			unlock_page(vmf->page);
> +			if (invalidate_inode_page(page))
> +				poisonret = VM_FAULT_NOPAGE;

What is the effect of returning VM_FAULT_NOPAGE?
I take it that we are fine because the pte has been installed and points to
a new page? (I could not find where that is being done.)


-- 
Oscar Salvador
SUSE Labs


* Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation
  2022-03-25 20:14 [PATCH] mm,hwpoison: unmap poisoned page before invalidation Rik van Riel
  2022-03-26  7:48 ` Miaohe Lin
  2022-03-28  9:00 ` Oscar Salvador
@ 2022-03-28 11:01 ` HORIGUCHI NAOYA(堀口 直也)
  2022-03-29 19:13 ` Oscar Salvador
  3 siblings, 0 replies; 11+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2022-03-28 11:01 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-kernel, linux-mm, kernel-team, Oscar Salvador, Miaohe Lin,
	Mel Gorman, Johannes Weiner, Andrew Morton, stable

On Fri, Mar 25, 2022 at 04:14:28PM -0400, Rik van Riel wrote:
> In some cases it appears the invalidation of a hwpoisoned page
> fails because the page is still mapped in another process. This
> can cause a program to be continuously restarted and die when
> it page faults on the page that was not invalidated. Avoid that
> problem by unmapping the hwpoisoned page when we find it.
> 
> Another issue is that sometimes we end up oopsing in finish_fault,
> if the code tries to do something with the now-NULL vmf->page.
> I did not hit this error when submitting the previous patch because
> there are several opportunities for alloc_set_pte to bail out before
> accessing vmf->page, and that apparently happened on those systems,
> and most of the time on other systems, too.
> 
> However, across several million systems that error does occur a
> handful of times a day. It can be avoided by returning VM_FAULT_NOPAGE
> which will cause do_read_fault to return before calling finish_fault.

I artificially created clean/dirty page cache pages with PageHWPoison flag
(with SystemTap), then reproduced NULL pointer dereference by page fault on
current mainline branch (with e53ac7374e64).  And confirmed that the bug was
fixed with this patch, so the fix seems to work.

(Maybe I should've done this kind of testing before merging e53ac7374e64, sorry...)

Anyway, thank you very much.

Tested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

> 
> Fixes: e53ac7374e64 ("mm: invalidate hwpoison page cache page in fault path")
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: stable@vger.kernel.org
> ---
>  mm/memory.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index be44d0b36b18..76e3af9639d9 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>  		return ret;
>  
>  	if (unlikely(PageHWPoison(vmf->page))) {
> +		struct page *page = vmf->page;
>  		vm_fault_t poisonret = VM_FAULT_HWPOISON;
>  		if (ret & VM_FAULT_LOCKED) {
> +			if (page_mapped(page))
> +				unmap_mapping_pages(page_mapping(page),
> +						    page->index, 1, false);
>  			/* Retry if a clean page was removed from the cache. */
> -			if (invalidate_inode_page(vmf->page))
> -				poisonret = 0;
> -			unlock_page(vmf->page);
> +			if (invalidate_inode_page(page))
> +				poisonret = VM_FAULT_NOPAGE;
> +			unlock_page(page);
>  		}
> -		put_page(vmf->page);
> +		put_page(page);
>  		vmf->page = NULL;
>  		return poisonret;
>  	}
> -- 
> 2.35.1
> 


* Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation
  2022-03-28  9:00 ` Oscar Salvador
@ 2022-03-29 15:49   ` Rik van Riel
  2022-03-29 19:13     ` Oscar Salvador
  0 siblings, 1 reply; 11+ messages in thread
From: Rik van Riel @ 2022-03-29 15:49 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: linux-kernel, linux-mm, kernel-team, Miaohe Lin, Naoya Horiguchi,
	Mel Gorman, Johannes Weiner, Andrew Morton, stable


On Mon, 2022-03-28 at 11:00 +0200, Oscar Salvador wrote:
> On Fri, Mar 25, 2022 at 04:14:28PM -0400, Rik van Riel wrote:
> > +                       if (invalidate_inode_page(page))
> > +                               poisonret = VM_FAULT_NOPAGE;
> 
> What is the effect of returning VM_FAULT_NOPAGE?
> I take that we are cool because the pte has been installed and points
> to
> a new page? (I could not find where that is being done).
> 
It results in us returning to userspace as if the page
fault had been handled, resulting in a second fault on
the same address.

However, now the page is no longer in the page cache,
and we can read it in from disk, to a page that is not
hardware poisoned, and we can then use that second page
without issues.
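The retry sequence described here can be sketched as a toy state machine. Again purely illustrative: the enum and struct names below are invented stand-ins (only the VM_FAULT_* concepts come from the thread), and the real fault path involves the page cache, locking and writeback that this model omits.

```c
#include <stdbool.h>

/*
 * Toy model of the outcomes discussed above (names invented):
 * a clean poisoned page cache page is evicted and the fault retried
 * (NOPAGE), so the second fault reads a fresh, unpoisoned copy; a
 * dirty poisoned page cannot be evicted, so the task is killed
 * (HWPOISON) instead of looping indefinitely.
 */
enum fake_fault { FAULT_OK, FAULT_NOPAGE, FAULT_HWPOISON };

struct fake_cached_page {
	bool poisoned;	/* page cache copy is hwpoisoned */
	bool dirty;	/* dirty pages cannot be invalidated */
};

static enum fake_fault fake_fault_once(struct fake_cached_page *p)
{
	if (!p->poisoned)
		return FAULT_OK;	/* normal fault, page is usable */
	if (!p->dirty) {
		/* evicted: the next fault reads a fresh copy from disk */
		p->poisoned = false;
		return FAULT_NOPAGE;
	}
	return FAULT_HWPOISON;		/* cannot evict: kill the task */
}
```

Faulting twice on a clean poisoned page yields NOPAGE then OK, while a dirty poisoned page yields HWPOISON on the first attempt, mirroring the two paths described in the replies above.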

-- 
All Rights Reversed.



* Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation
  2022-03-29 15:49   ` Rik van Riel
@ 2022-03-29 19:13     ` Oscar Salvador
  0 siblings, 0 replies; 11+ messages in thread
From: Oscar Salvador @ 2022-03-29 19:13 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-kernel, linux-mm, kernel-team, Miaohe Lin, Naoya Horiguchi,
	Mel Gorman, Johannes Weiner, Andrew Morton, stable

On Tue, Mar 29, 2022 at 11:49:53AM -0400, Rik van Riel wrote:

> It results in us returning to userspace as if the page
> fault had been handled, resulting in a second fault on
> the same address.
> 
> However, now the page is no longer in the page cache,
> and we can read it in from disk, to a page that is not
> hardware poisoned, and we can then use that second page
> without issues.

Ok, I see, thanks a lot for the explanation Rik.


-- 
Oscar Salvador
SUSE Labs


* Re: [PATCH] mm,hwpoison: unmap poisoned page before invalidation
  2022-03-25 20:14 [PATCH] mm,hwpoison: unmap poisoned page before invalidation Rik van Riel
                   ` (2 preceding siblings ...)
  2022-03-28 11:01 ` HORIGUCHI NAOYA(堀口 直也)
@ 2022-03-29 19:13 ` Oscar Salvador
  3 siblings, 0 replies; 11+ messages in thread
From: Oscar Salvador @ 2022-03-29 19:13 UTC (permalink / raw)
  To: Rik van Riel
  Cc: linux-kernel, linux-mm, kernel-team, Miaohe Lin, Naoya Horiguchi,
	Mel Gorman, Johannes Weiner, Andrew Morton, stable

On Fri, Mar 25, 2022 at 04:14:28PM -0400, Rik van Riel wrote:
> In some cases it appears the invalidation of a hwpoisoned page
> fails because the page is still mapped in another process. This
> can cause a program to be continuously restarted and die when
> it page faults on the page that was not invalidated. Avoid that
> problem by unmapping the hwpoisoned page when we find it.
> 
> Another issue is that sometimes we end up oopsing in finish_fault,
> if the code tries to do something with the now-NULL vmf->page.
> I did not hit this error when submitting the previous patch because
> there are several opportunities for alloc_set_pte to bail out before
> accessing vmf->page, and that apparently happened on those systems,
> and most of the time on other systems, too.
> 
> However, across several million systems that error does occur a
> handful of times a day. It can be avoided by returning VM_FAULT_NOPAGE
> which will cause do_read_fault to return before calling finish_fault.
> 
> Fixes: e53ac7374e64 ("mm: invalidate hwpoison page cache page in fault path")
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Miaohe Lin <linmiaohe@huawei.com>
> Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: stable@vger.kernel.org

Reviewed-by: Oscar Salvador <osalvador@suse.de>

> ---
>  mm/memory.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index be44d0b36b18..76e3af9639d9 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3918,14 +3918,18 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>  		return ret;
>  
>  	if (unlikely(PageHWPoison(vmf->page))) {
> +		struct page *page = vmf->page;
>  		vm_fault_t poisonret = VM_FAULT_HWPOISON;
>  		if (ret & VM_FAULT_LOCKED) {
> +			if (page_mapped(page))
> +				unmap_mapping_pages(page_mapping(page),
> +						    page->index, 1, false);
>  			/* Retry if a clean page was removed from the cache. */
> -			if (invalidate_inode_page(vmf->page))
> -				poisonret = 0;
> -			unlock_page(vmf->page);
> +			if (invalidate_inode_page(page))
> +				poisonret = VM_FAULT_NOPAGE;
> +			unlock_page(page);
>  		}
> -		put_page(vmf->page);
> +		put_page(page);
>  		vmf->page = NULL;
>  		return poisonret;
>  	}
> -- 
> 2.35.1
> 
> 
> 

-- 
Oscar Salvador
SUSE Labs

