All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page()
@ 2021-11-20 17:44 Matthew Wilcox (Oracle)
  2021-11-21 23:35 ` HORIGUCHI NAOYA(堀口 直也)
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Matthew Wilcox (Oracle) @ 2021-11-20 17:44 UTC (permalink / raw)
  To: Yang Shi, Naoya Horiguchi, Kirill A . Shutemov, Andrew Morton,
	linux-mm, Hugh Dickins
  Cc: Matthew Wilcox (Oracle)

Pages are individually marked as suffering from hardware poisoning.
Checking that the head page is not hardware poisoned doesn't make
sense; we might be after a subpage.  We check each page individually
before we use it, so this was an optimisation gone wrong.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/filemap.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 0b6f996108b4..65973204112d 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3239,8 +3239,6 @@ static struct page *next_uptodate_page(struct page *page,
 			goto skip;
 		if (!PageUptodate(page) || PageReadahead(page))
 			goto skip;
-		if (PageHWPoison(page))
-			goto skip;
 		if (!trylock_page(page))
 			goto skip;
 		if (page->mapping != mapping)
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page()
  2021-11-20 17:44 [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page() Matthew Wilcox (Oracle)
@ 2021-11-21 23:35 ` HORIGUCHI NAOYA(堀口 直也)
  2021-11-22 19:28 ` Yang Shi
  2021-11-24  3:24 ` Andrew Morton
  2 siblings, 0 replies; 8+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-11-21 23:35 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Yang Shi, Kirill A . Shutemov, Andrew Morton, linux-mm, Hugh Dickins

On Sat, Nov 20, 2021 at 05:44:29PM +0000, Matthew Wilcox (Oracle) wrote:
> Pages are individually marked as suffering from hardware poisoning.
> Checking that the head page is not hardware poisoned doesn't make
> sense; we might be after a subpage.  We check each page individually
> before we use it, so this was an optimisation gone wrong.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Looks good to me. Thank you.

Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page()
  2021-11-20 17:44 [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page() Matthew Wilcox (Oracle)
  2021-11-21 23:35 ` HORIGUCHI NAOYA(堀口 直也)
@ 2021-11-22 19:28 ` Yang Shi
  2021-11-24  0:11   ` HORIGUCHI NAOYA(堀口 直也)
  2021-11-24  3:24 ` Andrew Morton
  2 siblings, 1 reply; 8+ messages in thread
From: Yang Shi @ 2021-11-22 19:28 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Naoya Horiguchi, Kirill A . Shutemov, Andrew Morton, Linux MM,
	Hugh Dickins

On Sat, Nov 20, 2021 at 9:44 AM Matthew Wilcox (Oracle)
<willy@infradead.org> wrote:
>
> Pages are individually marked as suffering from hardware poisoning.
> Checking that the head page is not hardware poisoned doesn't make
> sense; we might be after a subpage.  We check each page individually
> before we use it, so this was an optimisation gone wrong.

Yeah, it doesn't make too much sense to check the head page. And it
seems the non-poisoned subpages could be PTE mapped instead of
skipping the whole THP.

Not sure if this is by design, it seems the hwpoisoned check in
filemap_map_pages() does skip the subpages after the poisoned page. Or
we should just skip the poisoned page itself? If so the below change
may be needed:

diff --git a/mm/filemap.c b/mm/filemap.c
index daa0e23a6ee6..f1f0cb263b4a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3318,7 +3318,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
        do {
                page = find_subpage(head, xas.xa_index);
                if (PageHWPoison(page))
-                       goto unlock;
+                       goto skip;

                if (mmap_miss > 0)
                        mmap_miss--;
@@ -3337,6 +3337,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
                do_set_pte(vmf, page, addr);
                /* no need to invalidate: a not-present page won't be cached */
                update_mmu_cache(vma, addr, vmf->pte);
+skip:
                unlock_page(head);
                continue;
 unlock:


>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>  mm/filemap.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 0b6f996108b4..65973204112d 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3239,8 +3239,6 @@ static struct page *next_uptodate_page(struct page *page,
>                         goto skip;
>                 if (!PageUptodate(page) || PageReadahead(page))
>                         goto skip;
> -               if (PageHWPoison(page))
> -                       goto skip;
>                 if (!trylock_page(page))
>                         goto skip;
>                 if (page->mapping != mapping)
> --
> 2.33.0
>


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page()
  2021-11-22 19:28 ` Yang Shi
@ 2021-11-24  0:11   ` HORIGUCHI NAOYA(堀口 直也)
  2021-11-24  0:32     ` Yang Shi
  0 siblings, 1 reply; 8+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2021-11-24  0:11 UTC (permalink / raw)
  To: Yang Shi
  Cc: Matthew Wilcox (Oracle),
	Kirill A . Shutemov, Andrew Morton, Linux MM, Hugh Dickins

On Mon, Nov 22, 2021 at 11:28:10AM -0800, Yang Shi wrote:
> On Sat, Nov 20, 2021 at 9:44 AM Matthew Wilcox (Oracle)
> <willy@infradead.org> wrote:
> >
> > Pages are individually marked as suffering from hardware poisoning.
> > Checking that the head page is not hardware poisoned doesn't make
> > sense; we might be after a subpage.  We check each page individually
> > before we use it, so this was an optimisation gone wrong.
> 
> Yeah, it doesn't make too much sense to check the head page. And it
> seems the non-poisoned subpages could be PTE mapped instead of
> skipping the whole THP.
> 
> Not sure if this is by design, it seems the hwpoisoned check in
> filemap_map_pages() does skip the subpages after the poisoned page. Or
> we should just skip the poisoned page itself? If so the below change
> may be needed:
> 
> diff --git a/mm/filemap.c b/mm/filemap.c
> index daa0e23a6ee6..f1f0cb263b4a 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3318,7 +3318,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
>         do {
>                 page = find_subpage(head, xas.xa_index);
>                 if (PageHWPoison(page))
> -                       goto unlock;
> +                       goto skip;
> 
>                 if (mmap_miss > 0)
>                         mmap_miss--;
> @@ -3337,6 +3337,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
>                 do_set_pte(vmf, page, addr);
>                 /* no need to invalidate: a not-present page won't be cached */
>                 update_mmu_cache(vma, addr, vmf->pte);
> +skip:
>                 unlock_page(head);
>                 continue;
>  unlock:

first_map_page() or next_map_page() returns a page (if found) with
holding the refcount, and the new 'goto skip' path skips releasing it.
So this looks to me lead to the mismatch of refcount.
Could you explain the intention a little more (maybe related to your
recent patch about keeping hwpoison page in pagecache?) ?

Thanks,
Naoya Horiguchi

> 
> 
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> >  mm/filemap.c | 2 --
> >  1 file changed, 2 deletions(-)
> >
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index 0b6f996108b4..65973204112d 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -3239,8 +3239,6 @@ static struct page *next_uptodate_page(struct page *page,
> >                         goto skip;
> >                 if (!PageUptodate(page) || PageReadahead(page))
> >                         goto skip;
> > -               if (PageHWPoison(page))
> > -                       goto skip;
> >                 if (!trylock_page(page))
> >                         goto skip;
> >                 if (page->mapping != mapping)
> > --
> > 2.33.0
> >

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page()
  2021-11-24  0:11   ` HORIGUCHI NAOYA(堀口 直也)
@ 2021-11-24  0:32     ` Yang Shi
  2021-11-24  0:57       ` Yang Shi
  0 siblings, 1 reply; 8+ messages in thread
From: Yang Shi @ 2021-11-24  0:32 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: Matthew Wilcox (Oracle),
	Kirill A . Shutemov, Andrew Morton, Linux MM, Hugh Dickins

On Tue, Nov 23, 2021 at 4:11 PM HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@nec.com> wrote:
>
> On Mon, Nov 22, 2021 at 11:28:10AM -0800, Yang Shi wrote:
> > On Sat, Nov 20, 2021 at 9:44 AM Matthew Wilcox (Oracle)
> > <willy@infradead.org> wrote:
> > >
> > > Pages are individually marked as suffering from hardware poisoning.
> > > Checking that the head page is not hardware poisoned doesn't make
> > > sense; we might be after a subpage.  We check each page individually
> > > before we use it, so this was an optimisation gone wrong.
> >
> > Yeah, it doesn't make too much sense to check the head page. And it
> > seems the non-poisoned subpages could be PTE mapped instead of
> > skipping the whole THP.
> >
> > Not sure if this is by design, it seems the hwpoisoned check in
> > filemap_map_pages() does skip the subpages after the poisoned page. Or
> > we should just skip the poisoned page itself? If so the below change
> > may be needed:
> >
> > diff --git a/mm/filemap.c b/mm/filemap.c
> > index daa0e23a6ee6..f1f0cb263b4a 100644
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -3318,7 +3318,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
> >         do {
> >                 page = find_subpage(head, xas.xa_index);
> >                 if (PageHWPoison(page))
> > -                       goto unlock;
> > +                       goto skip;
> >
> >                 if (mmap_miss > 0)
> >                         mmap_miss--;
> > @@ -3337,6 +3337,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
> >                 do_set_pte(vmf, page, addr);
> >                 /* no need to invalidate: a not-present page won't be cached */
> >                 update_mmu_cache(vma, addr, vmf->pte);
> > +skip:
> >                 unlock_page(head);
> >                 continue;
> >  unlock:
>
> first_map_page() or next_map_page() returns a page (if found) with
> holding the refcount, and the new 'goto skip' path skips releasing it.
> So this looks to me lead to the mismatch of refcount.
> Could you explain the intention a little more (maybe related to your
> recent patch about keeping hwpoison page in pagecache?) ?

No, not related to my patches.

The current code maps the subpages by PTEs *before* the poisoned page,
but skips the subpages *after* the poisoned page IIUC. It seems not
right, I thought the code was intended to map all subpages by PTEs
except the poisoned pages. So the suggested code is trying to fix the
misbehavior.

That code is just a quick and untested illustration to the above
hypothesis. The corrected version:

diff --git a/mm/filemap.c b/mm/filemap.c
index daa0e23a6ee6..1a76e3edc878 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3317,8 +3317,11 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
        vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
        do {
                page = find_subpage(head, xas.xa_index);
-               if (PageHWPoison(page))
-                       goto unlock;
+               if (PageHWPoison(page)) {
+                       unlock_page(page);
+                       put_page(page);
+                       continue;
+               }

                if (mmap_miss > 0)
                        mmap_miss--;

>
> Thanks,
> Naoya Horiguchi
>
> >
> >
> > >
> > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > ---
> > >  mm/filemap.c | 2 --
> > >  1 file changed, 2 deletions(-)
> > >
> > > diff --git a/mm/filemap.c b/mm/filemap.c
> > > index 0b6f996108b4..65973204112d 100644
> > > --- a/mm/filemap.c
> > > +++ b/mm/filemap.c
> > > @@ -3239,8 +3239,6 @@ static struct page *next_uptodate_page(struct page *page,
> > >                         goto skip;
> > >                 if (!PageUptodate(page) || PageReadahead(page))
> > >                         goto skip;
> > > -               if (PageHWPoison(page))
> > > -                       goto skip;
> > >                 if (!trylock_page(page))
> > >                         goto skip;
> > >                 if (page->mapping != mapping)
> > > --
> > > 2.33.0
> > >


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page()
  2021-11-24  0:32     ` Yang Shi
@ 2021-11-24  0:57       ` Yang Shi
  0 siblings, 0 replies; 8+ messages in thread
From: Yang Shi @ 2021-11-24  0:57 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: Matthew Wilcox (Oracle),
	Kirill A . Shutemov, Andrew Morton, Linux MM, Hugh Dickins

On Tue, Nov 23, 2021 at 4:32 PM Yang Shi <shy828301@gmail.com> wrote:
>
> On Tue, Nov 23, 2021 at 4:11 PM HORIGUCHI NAOYA(堀口 直也)
> <naoya.horiguchi@nec.com> wrote:
> >
> > On Mon, Nov 22, 2021 at 11:28:10AM -0800, Yang Shi wrote:
> > > On Sat, Nov 20, 2021 at 9:44 AM Matthew Wilcox (Oracle)
> > > <willy@infradead.org> wrote:
> > > >
> > > > Pages are individually marked as suffering from hardware poisoning.
> > > > Checking that the head page is not hardware poisoned doesn't make
> > > > sense; we might be after a subpage.  We check each page individually
> > > > before we use it, so this was an optimisation gone wrong.
> > >
> > > Yeah, it doesn't make too much sense to check the head page. And it
> > > seems the non-poisoned subpages could be PTE mapped instead of
> > > skipping the whole THP.
> > >
> > > Not sure if this is by design, it seems the hwpoisoned check in
> > > filemap_map_pages() does skip the subpages after the poisoned page. Or
> > > we should just skip the poisoned page itself? If so the below change
> > > may be needed:
> > >
> > > diff --git a/mm/filemap.c b/mm/filemap.c
> > > index daa0e23a6ee6..f1f0cb263b4a 100644
> > > --- a/mm/filemap.c
> > > +++ b/mm/filemap.c
> > > @@ -3318,7 +3318,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
> > >         do {
> > >                 page = find_subpage(head, xas.xa_index);
> > >                 if (PageHWPoison(page))
> > > -                       goto unlock;
> > > +                       goto skip;
> > >
> > >                 if (mmap_miss > 0)
> > >                         mmap_miss--;
> > > @@ -3337,6 +3337,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
> > >                 do_set_pte(vmf, page, addr);
> > >                 /* no need to invalidate: a not-present page won't be cached */
> > >                 update_mmu_cache(vma, addr, vmf->pte);
> > > +skip:
> > >                 unlock_page(head);
> > >                 continue;
> > >  unlock:
> >
> > first_map_page() or next_map_page() returns a page (if found) with
> > holding the refcount, and the new 'goto skip' path skips releasing it.
> > So this looks to me lead to the mismatch of refcount.
> > Could you explain the intention a little more (maybe related to your
> > recent patch about keeping hwpoison page in pagecache?) ?
>
> No, not related to my patches.
>
> The current code maps the subpages by PTEs *before* the poisoned page,
> but skips the subpages *after* the poisoned page IIUC. It seems not
> right, I thought the code was intended to map all subpages by PTEs
> except the poisoned pages. So the suggested code is trying to fix the
> misbehavior.

Err... I think I misread the code. It does iterate every subpages to
map each of them by PTE and just skip the hwpoisoned subpages.

Sorry for the confusion.

>
> That code is just a quick and untested illustration to the above
> hypothesis. The corrected version:
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index daa0e23a6ee6..1a76e3edc878 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3317,8 +3317,11 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
>         vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl);
>         do {
>                 page = find_subpage(head, xas.xa_index);
> -               if (PageHWPoison(page))
> -                       goto unlock;
> +               if (PageHWPoison(page)) {
> +                       unlock_page(page);
> +                       put_page(page);
> +                       continue;
> +               }
>
>                 if (mmap_miss > 0)
>                         mmap_miss--;
>
> >
> > Thanks,
> > Naoya Horiguchi
> >
> > >
> > >
> > > >
> > > > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > > > ---
> > > >  mm/filemap.c | 2 --
> > > >  1 file changed, 2 deletions(-)
> > > >
> > > > diff --git a/mm/filemap.c b/mm/filemap.c
> > > > index 0b6f996108b4..65973204112d 100644
> > > > --- a/mm/filemap.c
> > > > +++ b/mm/filemap.c
> > > > @@ -3239,8 +3239,6 @@ static struct page *next_uptodate_page(struct page *page,
> > > >                         goto skip;
> > > >                 if (!PageUptodate(page) || PageReadahead(page))
> > > >                         goto skip;
> > > > -               if (PageHWPoison(page))
> > > > -                       goto skip;
> > > >                 if (!trylock_page(page))
> > > >                         goto skip;
> > > >                 if (page->mapping != mapping)
> > > > --
> > > > 2.33.0
> > > >


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page()
  2021-11-20 17:44 [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page() Matthew Wilcox (Oracle)
  2021-11-21 23:35 ` HORIGUCHI NAOYA(堀口 直也)
  2021-11-22 19:28 ` Yang Shi
@ 2021-11-24  3:24 ` Andrew Morton
  2021-11-25 14:51   ` Matthew Wilcox
  2 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2021-11-24  3:24 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Yang Shi, Naoya Horiguchi, Kirill A . Shutemov, linux-mm, Hugh Dickins

On Sat, 20 Nov 2021 17:44:29 +0000 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote:

> Pages are individually marked as suffering from hardware poisoning.
> Checking that the head page is not hardware poisoned doesn't make
> sense; we might be after a subpage.  We check each page individually
> before we use it, so this was an optimisation gone wrong.

What were the runtime effects of this bug?




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page()
  2021-11-24  3:24 ` Andrew Morton
@ 2021-11-25 14:51   ` Matthew Wilcox
  0 siblings, 0 replies; 8+ messages in thread
From: Matthew Wilcox @ 2021-11-25 14:51 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Yang Shi, Naoya Horiguchi, Kirill A . Shutemov, linux-mm, Hugh Dickins

On Tue, Nov 23, 2021 at 07:24:42PM -0800, Andrew Morton wrote:
> On Sat, 20 Nov 2021 17:44:29 +0000 "Matthew Wilcox (Oracle)" <willy@infradead.org> wrote:
> 
> > Pages are individually marked as suffering from hardware poisoning.
> > Checking that the head page is not hardware poisoned doesn't make
> > sense; we might be after a subpage.  We check each page individually
> > before we use it, so this was an optimisation gone wrong.
> 
> What were the runtime effects of this bug?

We'd fall back to the slow path when there was no need to do that.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-11-25 14:52 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-20 17:44 [PATCH] filemap: Remove PageHWPoison check from next_uptodate_page() Matthew Wilcox (Oracle)
2021-11-21 23:35 ` HORIGUCHI NAOYA(堀口 直也)
2021-11-22 19:28 ` Yang Shi
2021-11-24  0:11   ` HORIGUCHI NAOYA(堀口 直也)
2021-11-24  0:32     ` Yang Shi
2021-11-24  0:57       ` Yang Shi
2021-11-24  3:24 ` Andrew Morton
2021-11-25 14:51   ` Matthew Wilcox

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.