linux-kernel.vger.kernel.org archive mirror
* [PATCH] hugetlbfs: don't delete error page from pagecache
@ 2022-10-18 20:01 James Houghton
  2022-10-19 18:31 ` Yang Shi
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: James Houghton @ 2022-10-18 20:01 UTC (permalink / raw)
  To: Mike Kravetz, Muchun Song, Naoya Horiguchi, Miaohe Lin
  Cc: Andrew Morton, Yang Shi, Axel Rasmussen, linux-mm, linux-kernel,
	James Houghton

This change is very similar to the change that was made for shmem [1],
and it solves the same problem but for HugeTLBFS instead.

Currently, when poison is found in a HugeTLB page, the page is removed
from the page cache. That means that attempting to map or read that
hugepage in the future will result in a new hugepage being allocated
instead of notifying the user that the page was poisoned. As [1] states,
this is effectively memory corruption.

The fix is to leave the page in the page cache. If the user attempts to
use a poisoned HugeTLB page with a syscall, the syscall will fail with
EIO, the same error code that shmem uses. For attempts to map the page,
the thread will get a BUS_MCEERR_AR SIGBUS.

[1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")

Signed-off-by: James Houghton <jthoughton@google.com>
---
 fs/hugetlbfs/inode.c | 13 ++++++-------
 mm/hugetlb.c         |  4 ++++
 mm/memory-failure.c  |  5 ++++-
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index fef5165b73a5..7f836f8f9db1 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -328,6 +328,12 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		} else {
 			unlock_page(page);
 
+			if (PageHWPoison(page)) {
+				put_page(page);
+				retval = -EIO;
+				break;
+			}
+
 			/*
 			 * We have the page, copy it to user space buffer.
 			 */
@@ -1111,13 +1117,6 @@ static int hugetlbfs_migrate_folio(struct address_space *mapping,
 static int hugetlbfs_error_remove_page(struct address_space *mapping,
 				struct page *page)
 {
-	struct inode *inode = mapping->host;
-	pgoff_t index = page->index;
-
-	hugetlb_delete_from_page_cache(page_folio(page));
-	if (unlikely(hugetlb_unreserve_pages(inode, index, index + 1, 1)))
-		hugetlb_fix_reserve_counts(inode);
-
 	return 0;
 }
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 97896165fd3f..5120a9ccbf5b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6101,6 +6101,10 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
 
 	ptl = huge_pte_lock(h, dst_mm, dst_pte);
 
+	ret = -EIO;
+	if (PageHWPoison(page))
+		goto out_release_unlock;
+
 	/*
 	 * We allow to overwrite a pte marker: consider when both MISSING|WP
 	 * registered, we firstly wr-protect a none pte which has no page cache
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 145bb561ddb3..bead6bccc7f2 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1080,6 +1080,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 	int res;
 	struct page *hpage = compound_head(p);
 	struct address_space *mapping;
+	bool extra_pins = false;
 
 	if (!PageHuge(hpage))
 		return MF_DELAYED;
@@ -1087,6 +1088,8 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 	mapping = page_mapping(hpage);
 	if (mapping) {
 		res = truncate_error_page(hpage, page_to_pfn(p), mapping);
+		/* The page is kept in page cache. */
+		extra_pins = true;
 		unlock_page(hpage);
 	} else {
 		unlock_page(hpage);
@@ -1104,7 +1107,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
 		}
 	}
 
-	if (has_extra_refcount(ps, p, false))
+	if (has_extra_refcount(ps, p, extra_pins))
 		res = MF_FAILED;
 
 	return res;
-- 
2.38.0.413.g74048e4d9e-goog


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-10-18 20:01 [PATCH] hugetlbfs: don't delete error page from pagecache James Houghton
@ 2022-10-19 18:31 ` Yang Shi
  2022-10-19 18:42   ` James Houghton
  2022-10-19 18:55   ` Mike Kravetz
  2022-10-20  0:02 ` Mike Kravetz
  2022-11-04  2:10 ` HORIGUCHI NAOYA(堀口 直也)
  2 siblings, 2 replies; 13+ messages in thread
From: Yang Shi @ 2022-10-19 18:31 UTC (permalink / raw)
  To: James Houghton
  Cc: Mike Kravetz, Muchun Song, Naoya Horiguchi, Miaohe Lin,
	Andrew Morton, Axel Rasmussen, linux-mm, linux-kernel

On Tue, Oct 18, 2022 at 1:01 PM James Houghton <jthoughton@google.com> wrote:
>
> This change is very similar to the change that was made for shmem [1],
> and it solves the same problem but for HugeTLBFS instead.
>
> Currently, when poison is found in a HugeTLB page, the page is removed
> from the page cache. That means that attempting to map or read that
> hugepage in the future will result in a new hugepage being allocated
> instead of notifying the user that the page was poisoned. As [1] states,
> this is effectively memory corruption.
>
> The fix is to leave the page in the page cache. If the user attempts to
> use a poisoned HugeTLB page with a syscall, the syscall will fail with
> EIO, the same error code that shmem uses. For attempts to map the page,
> the thread will get a BUS_MCEERR_AR SIGBUS.
>
> [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
>
> Signed-off-by: James Houghton <jthoughton@google.com>

Thanks for the patch. Yes, we should do the same thing for hugetlbfs.
When I was working on shmem I did look into hugetlbfs too. But the
problem is that we actually make the whole hugetlb page unavailable
even though just one 4K subpage is hwpoisoned. That may be fine for a
2M hugetlb page, but a lot of memory is wasted for a 1G hugetlb page,
particularly on the page fault path.

So I discussed this with Mike offline last year, and I was told Google
was working on PTE-mapped hugetlb pages. That should be able to solve
the problem, and we'd like to have the high-granularity hugetlb
mapping support as a prerequisite.

There were some other details, but I can't remember all of them; I'll
have to refresh my memory by rereading the email discussions...

> ---
>  fs/hugetlbfs/inode.c | 13 ++++++-------
>  mm/hugetlb.c         |  4 ++++
>  mm/memory-failure.c  |  5 ++++-
>  3 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index fef5165b73a5..7f836f8f9db1 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -328,6 +328,12 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
>                 } else {
>                         unlock_page(page);
>
> +                       if (PageHWPoison(page)) {
> +                               put_page(page);
> +                               retval = -EIO;
> +                               break;
> +                       }
> +
>                         /*
>                          * We have the page, copy it to user space buffer.
>                          */
> @@ -1111,13 +1117,6 @@ static int hugetlbfs_migrate_folio(struct address_space *mapping,
>  static int hugetlbfs_error_remove_page(struct address_space *mapping,
>                                 struct page *page)
>  {
> -       struct inode *inode = mapping->host;
> -       pgoff_t index = page->index;
> -
> -       hugetlb_delete_from_page_cache(page_folio(page));
> -       if (unlikely(hugetlb_unreserve_pages(inode, index, index + 1, 1)))
> -               hugetlb_fix_reserve_counts(inode);
> -
>         return 0;
>  }
>
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 97896165fd3f..5120a9ccbf5b 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6101,6 +6101,10 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>
>         ptl = huge_pte_lock(h, dst_mm, dst_pte);
>
> +       ret = -EIO;
> +       if (PageHWPoison(page))
> +               goto out_release_unlock;
> +
>         /*
>          * We allow to overwrite a pte marker: consider when both MISSING|WP
>          * registered, we firstly wr-protect a none pte which has no page cache
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 145bb561ddb3..bead6bccc7f2 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1080,6 +1080,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
>         int res;
>         struct page *hpage = compound_head(p);
>         struct address_space *mapping;
> +       bool extra_pins = false;
>
>         if (!PageHuge(hpage))
>                 return MF_DELAYED;
> @@ -1087,6 +1088,8 @@ static int me_huge_page(struct page_state *ps, struct page *p)
>         mapping = page_mapping(hpage);
>         if (mapping) {
>                 res = truncate_error_page(hpage, page_to_pfn(p), mapping);
> +               /* The page is kept in page cache. */
> +               extra_pins = true;
>                 unlock_page(hpage);
>         } else {
>                 unlock_page(hpage);
> @@ -1104,7 +1107,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
>                 }
>         }
>
> -       if (has_extra_refcount(ps, p, false))
> +       if (has_extra_refcount(ps, p, extra_pins))
>                 res = MF_FAILED;
>
>         return res;
> --
> 2.38.0.413.g74048e4d9e-goog
>


* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-10-19 18:31 ` Yang Shi
@ 2022-10-19 18:42   ` James Houghton
  2022-10-20 18:27     ` Yang Shi
  2022-10-19 18:55   ` Mike Kravetz
  1 sibling, 1 reply; 13+ messages in thread
From: James Houghton @ 2022-10-19 18:42 UTC (permalink / raw)
  To: Yang Shi
  Cc: Mike Kravetz, Muchun Song, Naoya Horiguchi, Miaohe Lin,
	Andrew Morton, Axel Rasmussen, linux-mm, linux-kernel

On Wed, Oct 19, 2022 at 11:31 AM Yang Shi <shy828301@gmail.com> wrote:
>
> On Tue, Oct 18, 2022 at 1:01 PM James Houghton <jthoughton@google.com> wrote:
> >
> > This change is very similar to the change that was made for shmem [1],
> > and it solves the same problem but for HugeTLBFS instead.
> >
> > Currently, when poison is found in a HugeTLB page, the page is removed
> > from the page cache. That means that attempting to map or read that
> > hugepage in the future will result in a new hugepage being allocated
> > instead of notifying the user that the page was poisoned. As [1] states,
> > this is effectively memory corruption.
> >
> > The fix is to leave the page in the page cache. If the user attempts to
> > use a poisoned HugeTLB page with a syscall, the syscall will fail with
> > EIO, the same error code that shmem uses. For attempts to map the page,
> > the thread will get a BUS_MCEERR_AR SIGBUS.
> >
> > [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> >
> > Signed-off-by: James Houghton <jthoughton@google.com>
>
> Thanks for the patch. Yes, we should do the same thing for hugetlbfs.
> When I was working on shmem I did look into hugetlbfs too. But the
> problem is that we actually make the whole hugetlb page unavailable
> even though just one 4K subpage is hwpoisoned. That may be fine for a
> 2M hugetlb page, but a lot of memory is wasted for a 1G hugetlb page,
> particularly on the page fault path.

Right -- it is wasted until a hole is punched or the file is
truncated. Although we're wasting the rest of the hugepage for a
little longer with this patch, I think it's worth it to have correct
behavior.

>
> So I discussed this with Mike offline last year, and I was told Google
> was working on PTE-mapped hugetlb pages. That should be able to solve
> the problem, and we'd like to have the high-granularity hugetlb
> mapping support as a prerequisite.
>
> There were some other details, but I can't remember all of them; I'll
> have to refresh my memory by rereading the email discussions...

Yes! I am working on this. :) I will send up a series in the coming
weeks that implements basic support for high-granularity mapping
(HGM). This patch is required for hwpoison semantics to work properly
for high-granularity mapping (and, as the patch states, for shared
HugeTLB mappings generally). For HGM, if we partially map a hugepage
and find poison, faulting on the unmapped bits of it will allocate a
new hugepage. By keeping the poisoned page in the pagecache, we
correctly give userspace a SIGBUS. I didn't mention this in the commit
description because I think this patch is correct on its own.

I haven't implemented PAGE_SIZE poisoning of HugeTLB pages yet, but
high-granularity mapping unblocks this work. Hopefully that will be
ready in the coming months. :)

- James Houghton

>
> > ---
> >  fs/hugetlbfs/inode.c | 13 ++++++-------
> >  mm/hugetlb.c         |  4 ++++
> >  mm/memory-failure.c  |  5 ++++-
> >  3 files changed, 14 insertions(+), 8 deletions(-)
> >
> > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> > index fef5165b73a5..7f836f8f9db1 100644
> > --- a/fs/hugetlbfs/inode.c
> > +++ b/fs/hugetlbfs/inode.c
> > @@ -328,6 +328,12 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
> >                 } else {
> >                         unlock_page(page);
> >
> > +                       if (PageHWPoison(page)) {
> > +                               put_page(page);
> > +                               retval = -EIO;
> > +                               break;
> > +                       }
> > +
> >                         /*
> >                          * We have the page, copy it to user space buffer.
> >                          */
> > @@ -1111,13 +1117,6 @@ static int hugetlbfs_migrate_folio(struct address_space *mapping,
> >  static int hugetlbfs_error_remove_page(struct address_space *mapping,
> >                                 struct page *page)
> >  {
> > -       struct inode *inode = mapping->host;
> > -       pgoff_t index = page->index;
> > -
> > -       hugetlb_delete_from_page_cache(page_folio(page));
> > -       if (unlikely(hugetlb_unreserve_pages(inode, index, index + 1, 1)))
> > -               hugetlb_fix_reserve_counts(inode);
> > -
> >         return 0;
> >  }
> >
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 97896165fd3f..5120a9ccbf5b 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -6101,6 +6101,10 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> >
> >         ptl = huge_pte_lock(h, dst_mm, dst_pte);
> >
> > +       ret = -EIO;
> > +       if (PageHWPoison(page))
> > +               goto out_release_unlock;
> > +
> >         /*
> >          * We allow to overwrite a pte marker: consider when both MISSING|WP
> >          * registered, we firstly wr-protect a none pte which has no page cache
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 145bb561ddb3..bead6bccc7f2 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1080,6 +1080,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
> >         int res;
> >         struct page *hpage = compound_head(p);
> >         struct address_space *mapping;
> > +       bool extra_pins = false;
> >
> >         if (!PageHuge(hpage))
> >                 return MF_DELAYED;
> > @@ -1087,6 +1088,8 @@ static int me_huge_page(struct page_state *ps, struct page *p)
> >         mapping = page_mapping(hpage);
> >         if (mapping) {
> >                 res = truncate_error_page(hpage, page_to_pfn(p), mapping);
> > +               /* The page is kept in page cache. */
> > +               extra_pins = true;
> >                 unlock_page(hpage);
> >         } else {
> >                 unlock_page(hpage);
> > @@ -1104,7 +1107,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
> >                 }
> >         }
> >
> > -       if (has_extra_refcount(ps, p, false))
> > +       if (has_extra_refcount(ps, p, extra_pins))
> >                 res = MF_FAILED;
> >
> >         return res;
> > --
> > 2.38.0.413.g74048e4d9e-goog
> >


* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-10-19 18:31 ` Yang Shi
  2022-10-19 18:42   ` James Houghton
@ 2022-10-19 18:55   ` Mike Kravetz
  2022-10-20 18:42     ` Yang Shi
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Kravetz @ 2022-10-19 18:55 UTC (permalink / raw)
  To: Yang Shi
  Cc: James Houghton, Muchun Song, Naoya Horiguchi, Miaohe Lin,
	Andrew Morton, Axel Rasmussen, linux-mm, linux-kernel

On 10/19/22 11:31, Yang Shi wrote:
> On Tue, Oct 18, 2022 at 1:01 PM James Houghton <jthoughton@google.com> wrote:
> >
> > This change is very similar to the change that was made for shmem [1],
> > and it solves the same problem but for HugeTLBFS instead.
> >
> > Currently, when poison is found in a HugeTLB page, the page is removed
> > from the page cache. That means that attempting to map or read that
> > hugepage in the future will result in a new hugepage being allocated
> > instead of notifying the user that the page was poisoned. As [1] states,
> > this is effectively memory corruption.
> >
> > The fix is to leave the page in the page cache. If the user attempts to
> > use a poisoned HugeTLB page with a syscall, the syscall will fail with
> > EIO, the same error code that shmem uses. For attempts to map the page,
> > the thread will get a BUS_MCEERR_AR SIGBUS.
> >
> > [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> >
> > Signed-off-by: James Houghton <jthoughton@google.com>
> 
> Thanks for the patch. Yes, we should do the same thing for hugetlbfs.
> When I was working on shmem I did look into hugetlbfs too. But the
> problem is that we actually make the whole hugetlb page unavailable
> even though just one 4K subpage is hwpoisoned. That may be fine for a
> 2M hugetlb page, but a lot of memory is wasted for a 1G hugetlb page,
> particularly on the page fault path.

One thing that complicated this a bit is the vmemmap optimization for
hugetlb.  However, I believe Naoya may have addressed this recently.

> So I discussed this with Mike offline last year, and I was told Google
> was working on PTE-mapped hugetlb pages. That should be able to solve
> the problem, and we'd like to have the high-granularity hugetlb
> mapping support as a prerequisite.

Yes, I went back in my notes and noticed it had been one year.  No offense
intended to James and his great work on HGM.  However, in hindsight we should
have fixed this in some way without waiting for an HGM-based solution.

> There were some other details, but I can't remember all of them; I'll
> have to refresh my memory by rereading the email discussions...

I think the complicating factor was vmemmap optimization.  As mentioned
above, this may have already been addressed by Naoya in patches to
indicate which sub-page(s) had the actual error.

As Yang Shi notes, this patch makes the entire hugetlb page inaccessible.
With some work, we could allow reads of everything but the sub-page with
the error.  However, this should be much easier with HGM.  And, we could
potentially even allow page faults everywhere but the sub-page with the
error.

I still think it may be better to wait for HGM instead of trying to allow
read access to all but the sub-page with the error now.  But I'm entirely
open to other opinions.

I plan to do a review of this patch a little later.
-- 
Mike Kravetz


* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-10-18 20:01 [PATCH] hugetlbfs: don't delete error page from pagecache James Houghton
  2022-10-19 18:31 ` Yang Shi
@ 2022-10-20  0:02 ` Mike Kravetz
  2022-11-04  2:10 ` HORIGUCHI NAOYA(堀口 直也)
  2 siblings, 0 replies; 13+ messages in thread
From: Mike Kravetz @ 2022-10-20  0:02 UTC (permalink / raw)
  To: James Houghton
  Cc: Muchun Song, Naoya Horiguchi, Miaohe Lin, Andrew Morton,
	Yang Shi, Axel Rasmussen, linux-mm, linux-kernel

On 10/18/22 20:01, James Houghton wrote:
> This change is very similar to the change that was made for shmem [1],
> and it solves the same problem but for HugeTLBFS instead.
> 
> Currently, when poison is found in a HugeTLB page, the page is removed
> from the page cache. That means that attempting to map or read that
> hugepage in the future will result in a new hugepage being allocated
> instead of notifying the user that the page was poisoned. As [1] states,
> this is effectively memory corruption.
> 
> The fix is to leave the page in the page cache. If the user attempts to
> use a poisoned HugeTLB page with a syscall, the syscall will fail with
> EIO, the same error code that shmem uses. For attempts to map the page,
> the thread will get a BUS_MCEERR_AR SIGBUS.
> 
> [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> 
> Signed-off-by: James Houghton <jthoughton@google.com>
> ---
>  fs/hugetlbfs/inode.c | 13 ++++++-------
>  mm/hugetlb.c         |  4 ++++
>  mm/memory-failure.c  |  5 ++++-
>  3 files changed, 14 insertions(+), 8 deletions(-)

Thanks James!

Code looks correct.  One observation below, but I do not suggest changing anything.

Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>

> 
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index fef5165b73a5..7f836f8f9db1 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -328,6 +328,12 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
>  		} else {
>  			unlock_page(page);
>  
> +			if (PageHWPoison(page)) {
> +				put_page(page);
> +				retval = -EIO;
> +				break;
> +			}
> +
>  			/*
>  			 * We have the page, copy it to user space buffer.
>  			 */
> @@ -1111,13 +1117,6 @@ static int hugetlbfs_migrate_folio(struct address_space *mapping,
>  static int hugetlbfs_error_remove_page(struct address_space *mapping,
>  				struct page *page)
>  {
> -	struct inode *inode = mapping->host;
> -	pgoff_t index = page->index;
> -
> -	hugetlb_delete_from_page_cache(page_folio(page));
> -	if (unlikely(hugetlb_unreserve_pages(inode, index, index + 1, 1)))
> -		hugetlb_fix_reserve_counts(inode);
> -
>  	return 0;
>  }
>  
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 97896165fd3f..5120a9ccbf5b 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6101,6 +6101,10 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
>  
>  	ptl = huge_pte_lock(h, dst_mm, dst_pte);
>  
> +	ret = -EIO;
> +	if (PageHWPoison(page))
> +		goto out_release_unlock;
> +
>  	/*
>  	 * We allow to overwrite a pte marker: consider when both MISSING|WP
>  	 * registered, we firstly wr-protect a none pte which has no page cache
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 145bb561ddb3..bead6bccc7f2 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1080,6 +1080,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
>  	int res;
>  	struct page *hpage = compound_head(p);
>  	struct address_space *mapping;
> +	bool extra_pins = false;
>  
>  	if (!PageHuge(hpage))
>  		return MF_DELAYED;
> @@ -1087,6 +1088,8 @@ static int me_huge_page(struct page_state *ps, struct page *p)
>  	mapping = page_mapping(hpage);
>  	if (mapping) {
>  		res = truncate_error_page(hpage, page_to_pfn(p), mapping);
> +		/* The page is kept in page cache. */

That looks a bit silly, as we know truncate_error_page() is now a no-op for
hugetlb and we call it anyway.  I suppose we could just set 'res = MF_RECOVERED'.
However, it really should be left as is in case more non-hugetlb functionality
is added to truncate_error_page() in the future.
-- 
Mike Kravetz

> +		extra_pins = true;
>  		unlock_page(hpage);
>  	} else {
>  		unlock_page(hpage);
> @@ -1104,7 +1107,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
>  		}
>  	}
>  
> -	if (has_extra_refcount(ps, p, false))
> +	if (has_extra_refcount(ps, p, extra_pins))
>  		res = MF_FAILED;
>  
>  	return res;
> -- 
> 2.38.0.413.g74048e4d9e-goog
> 


* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-10-19 18:42   ` James Houghton
@ 2022-10-20 18:27     ` Yang Shi
  0 siblings, 0 replies; 13+ messages in thread
From: Yang Shi @ 2022-10-20 18:27 UTC (permalink / raw)
  To: James Houghton
  Cc: Mike Kravetz, Muchun Song, Naoya Horiguchi, Miaohe Lin,
	Andrew Morton, Axel Rasmussen, linux-mm, linux-kernel

On Wed, Oct 19, 2022 at 11:42 AM James Houghton <jthoughton@google.com> wrote:
>
> On Wed, Oct 19, 2022 at 11:31 AM Yang Shi <shy828301@gmail.com> wrote:
> >
> > On Tue, Oct 18, 2022 at 1:01 PM James Houghton <jthoughton@google.com> wrote:
> > >
> > > This change is very similar to the change that was made for shmem [1],
> > > and it solves the same problem but for HugeTLBFS instead.
> > >
> > > Currently, when poison is found in a HugeTLB page, the page is removed
> > > from the page cache. That means that attempting to map or read that
> > > hugepage in the future will result in a new hugepage being allocated
> > > instead of notifying the user that the page was poisoned. As [1] states,
> > > this is effectively memory corruption.
> > >
> > > The fix is to leave the page in the page cache. If the user attempts to
> > > use a poisoned HugeTLB page with a syscall, the syscall will fail with
> > > EIO, the same error code that shmem uses. For attempts to map the page,
> > > the thread will get a BUS_MCEERR_AR SIGBUS.
> > >
> > > [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> > >
> > > Signed-off-by: James Houghton <jthoughton@google.com>
> >
> > Thanks for the patch. Yes, we should do the same thing for hugetlbfs.
> > When I was working on shmem I did look into hugetlbfs too. But the
> > problem is that we actually make the whole hugetlb page unavailable
> > even though just one 4K subpage is hwpoisoned. That may be fine for a
> > 2M hugetlb page, but a lot of memory is wasted for a 1G hugetlb page,
> > particularly on the page fault path.
>
> Right -- it is wasted until a hole is punched or the file is
> truncated. Although we're wasting the rest of the hugepage for a
> little longer with this patch, I think it's worth it to have correct
> behavior.
>
> >
> > So I discussed this with Mike offline last year, and I was told Google
> > was working on PTE-mapped hugetlb pages. That should be able to solve
> > the problem, and we'd like to have the high-granularity hugetlb
> > mapping support as a prerequisite.
> >
> > There were some other details, but I can't remember all of them; I'll
> > have to refresh my memory by rereading the email discussions...
>
> Yes! I am working on this. :) I will send up a series in the coming
> weeks that implements basic support for high-granularity mapping
> (HGM). This patch is required for hwpoison semantics to work properly
> for high-granularity mapping (and, as the patch states, for shared
> HugeTLB mappings generally). For HGM, if we partially map a hugepage
> and find poison, faulting on the unmapped bits of it will allocate a
> new hugepage. By keeping the poisoned page in the pagecache, we
> correctly give userspace a SIGBUS. I didn't mention this in the commit
> description because I think this patch is correct on its own.

I don't mean it is not correct. I'm just not sure which one (this
patch or HGM) would go first. But it sounds like you think this patch
should precede HGM, since it delivers the correct signal to userspace?

>
> I haven't implemented PAGE_SIZE poisoning of HugeTLB pages yet, but
> high-granularity mapping unblocks this work. Hopefully that will be
> ready in the coming months. :)
>
> - James Houghton
>
> >
> > > ---
> > >  fs/hugetlbfs/inode.c | 13 ++++++-------
> > >  mm/hugetlb.c         |  4 ++++
> > >  mm/memory-failure.c  |  5 ++++-
> > >  3 files changed, 14 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> > > index fef5165b73a5..7f836f8f9db1 100644
> > > --- a/fs/hugetlbfs/inode.c
> > > +++ b/fs/hugetlbfs/inode.c
> > > @@ -328,6 +328,12 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
> > >                 } else {
> > >                         unlock_page(page);
> > >
> > > +                       if (PageHWPoison(page)) {
> > > +                               put_page(page);
> > > +                               retval = -EIO;
> > > +                               break;
> > > +                       }
> > > +
> > >                         /*
> > >                          * We have the page, copy it to user space buffer.
> > >                          */
> > > @@ -1111,13 +1117,6 @@ static int hugetlbfs_migrate_folio(struct address_space *mapping,
> > >  static int hugetlbfs_error_remove_page(struct address_space *mapping,
> > >                                 struct page *page)
> > >  {
> > > -       struct inode *inode = mapping->host;
> > > -       pgoff_t index = page->index;
> > > -
> > > -       hugetlb_delete_from_page_cache(page_folio(page));
> > > -       if (unlikely(hugetlb_unreserve_pages(inode, index, index + 1, 1)))
> > > -               hugetlb_fix_reserve_counts(inode);
> > > -
> > >         return 0;
> > >  }
> > >
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 97896165fd3f..5120a9ccbf5b 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -6101,6 +6101,10 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> > >
> > >         ptl = huge_pte_lock(h, dst_mm, dst_pte);
> > >
> > > +       ret = -EIO;
> > > +       if (PageHWPoison(page))
> > > +               goto out_release_unlock;
> > > +
> > >         /*
> > >          * We allow to overwrite a pte marker: consider when both MISSING|WP
> > >          * registered, we firstly wr-protect a none pte which has no page cache
> > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > > index 145bb561ddb3..bead6bccc7f2 100644
> > > --- a/mm/memory-failure.c
> > > +++ b/mm/memory-failure.c
> > > @@ -1080,6 +1080,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
> > >         int res;
> > >         struct page *hpage = compound_head(p);
> > >         struct address_space *mapping;
> > > +       bool extra_pins = false;
> > >
> > >         if (!PageHuge(hpage))
> > >                 return MF_DELAYED;
> > > @@ -1087,6 +1088,8 @@ static int me_huge_page(struct page_state *ps, struct page *p)
> > >         mapping = page_mapping(hpage);
> > >         if (mapping) {
> > >                 res = truncate_error_page(hpage, page_to_pfn(p), mapping);
> > > +               /* The page is kept in page cache. */
> > > +               extra_pins = true;
> > >                 unlock_page(hpage);
> > >         } else {
> > >                 unlock_page(hpage);
> > > @@ -1104,7 +1107,7 @@ static int me_huge_page(struct page_state *ps, struct page *p)
> > >                 }
> > >         }
> > >
> > > -       if (has_extra_refcount(ps, p, false))
> > > +       if (has_extra_refcount(ps, p, extra_pins))
> > >                 res = MF_FAILED;
> > >
> > >         return res;
> > > --
> > > 2.38.0.413.g74048e4d9e-goog
> > >


* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-10-19 18:55   ` Mike Kravetz
@ 2022-10-20 18:42     ` Yang Shi
  0 siblings, 0 replies; 13+ messages in thread
From: Yang Shi @ 2022-10-20 18:42 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: James Houghton, Muchun Song, Naoya Horiguchi, Miaohe Lin,
	Andrew Morton, Axel Rasmussen, linux-mm, linux-kernel

On Wed, Oct 19, 2022 at 11:55 AM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 10/19/22 11:31, Yang Shi wrote:
> > On Tue, Oct 18, 2022 at 1:01 PM James Houghton <jthoughton@google.com> wrote:
> > >
> > > This change is very similar to the change that was made for shmem [1],
> > > and it solves the same problem but for HugeTLBFS instead.
> > >
> > > Currently, when poison is found in a HugeTLB page, the page is removed
> > > from the page cache. That means that attempting to map or read that
> > > hugepage in the future will result in a new hugepage being allocated
> > > instead of notifying the user that the page was poisoned. As [1] states,
> > > this is effectively memory corruption.
> > >
> > > The fix is to leave the page in the page cache. If the user attempts to
> > > use a poisoned HugeTLB page with a syscall, the syscall will fail with
> > > EIO, the same error code that shmem uses. For attempts to map the page,
> > > the thread will get a BUS_MCEERR_AR SIGBUS.
> > >
> > > [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> > >
> > > Signed-off-by: James Houghton <jthoughton@google.com>
> >
> > Thanks for the patch. Yes, we should do the same thing for hugetlbfs.
> > When I was working on shmem I did look into hugetlbfs too. But the
> > problem is that we make the whole hugetlb page unavailable even though
> > just one 4K subpage is hwpoisoned. That may be fine for a 2MB hugetlb
> > page, but losing a 1GB hugetlb page is a huge waste of memory,
> > particularly on the page fault path.
>
> One thing that complicated this a bit is the vmemmap optimizations for
> hugetlb.  However, I believe Naoya may have addressed this recently.
>
> > So I discussed this with Mike offline last year, and I was told Google
> > was working on PTE-mapped hugetlb pages. That should be able to solve
> > the problem, and we'd like to have the high-granularity hugetlb
> > mapping support as a prerequisite.
>
> Yes, I went back in my notes and noticed it had been one year.  No offense
> intended to James and his great work on HGM.  However, in hindsight we should
> have fixed this in some way without waiting for an HGM-based solution.
>
> > There were some other details, but I can't remember all of them; I'll
> > have to refresh my memory by rereading the email discussions...
>
> I think the complicating factor was vmemmap optimization.  As mentioned
> above, this may have already been addressed by Naoya in patches to
> indicate which sub-page(s) had the actual error.
>
> As Yang Shi notes, this patch makes the entire hugetlb page inaccessible.
> With some work, we could allow reads to everything but the sub-page with
> error.  However, this should be much easier with HGM.  And, we could
> potentially even do page faults everywhere but the sub-page with error.
>
> I still think it may be better to wait for HGM instead of trying to allow
> read access to all but the sub-page with the error now.  But I am entirely
> open to other opinions.

I have no strong preference about which goes first.


>
> I plan to do a review of this patch a little later.
> --
> Mike Kravetz


* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-10-18 20:01 [PATCH] hugetlbfs: don't delete error page from pagecache James Houghton
  2022-10-19 18:31 ` Yang Shi
  2022-10-20  0:02 ` Mike Kravetz
@ 2022-11-04  2:10 ` HORIGUCHI NAOYA(堀口 直也)
  2022-11-04 13:27   ` Mike Kravetz
  2 siblings, 1 reply; 13+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2022-11-04  2:10 UTC (permalink / raw)
  To: James Houghton
  Cc: Mike Kravetz, Muchun Song, Miaohe Lin, Andrew Morton, Yang Shi,
	Axel Rasmussen, linux-mm, linux-kernel

On Tue, Oct 18, 2022 at 08:01:25PM +0000, James Houghton wrote:
> This change is very similar to the change that was made for shmem [1],
> and it solves the same problem but for HugeTLBFS instead.
> 
> Currently, when poison is found in a HugeTLB page, the page is removed
> from the page cache. That means that attempting to map or read that
> hugepage in the future will result in a new hugepage being allocated
> instead of notifying the user that the page was poisoned. As [1] states,
> this is effectively memory corruption.
> 
> The fix is to leave the page in the page cache. If the user attempts to
> use a poisoned HugeTLB page with a syscall, the syscall will fail with
> EIO, the same error code that shmem uses. For attempts to map the page,
> the thread will get a BUS_MCEERR_AR SIGBUS.
> 
> [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> 
> Signed-off-by: James Houghton <jthoughton@google.com>

I did some testing and found no issue. So I agree with this patch.
Thank you very much.

Tested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

As for whether to go in with the HGM patchset or not, I have no strong opinion.
As you stated in another email, this patch is correct without the HGM patches,
so it is OK with me to merge this first.

Thanks,
Naoya Horiguchi


* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-11-04  2:10 ` HORIGUCHI NAOYA(堀口 直也)
@ 2022-11-04 13:27   ` Mike Kravetz
  2022-11-07  4:25     ` HORIGUCHI NAOYA(堀口 直也)
  0 siblings, 1 reply; 13+ messages in thread
From: Mike Kravetz @ 2022-11-04 13:27 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: James Houghton, Muchun Song, Miaohe Lin, Andrew Morton, Yang Shi,
	Axel Rasmussen, linux-mm, linux-kernel

On 11/04/22 02:10, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Tue, Oct 18, 2022 at 08:01:25PM +0000, James Houghton wrote:
> > This change is very similar to the change that was made for shmem [1],
> > and it solves the same problem but for HugeTLBFS instead.
> > 
> > Currently, when poison is found in a HugeTLB page, the page is removed
> > from the page cache. That means that attempting to map or read that
> > hugepage in the future will result in a new hugepage being allocated
> > instead of notifying the user that the page was poisoned. As [1] states,
> > this is effectively memory corruption.
> > 
> > The fix is to leave the page in the page cache. If the user attempts to
> > use a poisoned HugeTLB page with a syscall, the syscall will fail with
> > EIO, the same error code that shmem uses. For attempts to map the page,
> > the thread will get a BUS_MCEERR_AR SIGBUS.
> > 
> > [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> > 
> > Signed-off-by: James Houghton <jthoughton@google.com>
> 
> I did some testing and found no issue. So I agree with this patch.
> Thank you very much.
> 
> Tested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> 
> As for whether to go in with the HGM patchset or not, I have no strong opinion.
> As you stated in another email, this patch is correct without the HGM patches,
> so it is OK with me to merge this first.

Thanks Naoya.

This is a late thought, but ...
Should this patch and Yang Shi's shmem patch be backported to stable releases?
Both address potential data corruption/loss, so it certainly seems like
stable material.
-- 
Mike Kravetz


* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-11-04 13:27   ` Mike Kravetz
@ 2022-11-07  4:25     ` HORIGUCHI NAOYA(堀口 直也)
  2022-11-07 16:55       ` Mike Kravetz
  2022-11-07 18:50       ` Yang Shi
  0 siblings, 2 replies; 13+ messages in thread
From: HORIGUCHI NAOYA(堀口 直也) @ 2022-11-07  4:25 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: James Houghton, Muchun Song, Miaohe Lin, Andrew Morton, Yang Shi,
	Axel Rasmussen, linux-mm, linux-kernel

On Fri, Nov 04, 2022 at 06:27:44AM -0700, Mike Kravetz wrote:
> On 11/04/22 02:10, HORIGUCHI NAOYA(堀口 直也) wrote:
> > On Tue, Oct 18, 2022 at 08:01:25PM +0000, James Houghton wrote:
> > > This change is very similar to the change that was made for shmem [1],
> > > and it solves the same problem but for HugeTLBFS instead.
> > > 
> > > Currently, when poison is found in a HugeTLB page, the page is removed
> > > from the page cache. That means that attempting to map or read that
> > > hugepage in the future will result in a new hugepage being allocated
> > > instead of notifying the user that the page was poisoned. As [1] states,
> > > this is effectively memory corruption.
> > > 
> > > The fix is to leave the page in the page cache. If the user attempts to
> > > use a poisoned HugeTLB page with a syscall, the syscall will fail with
> > > EIO, the same error code that shmem uses. For attempts to map the page,
> > > the thread will get a BUS_MCEERR_AR SIGBUS.
> > > 
> > > [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> > > 
> > > Signed-off-by: James Houghton <jthoughton@google.com>
> > 
> > I did some testing and found no issue. So I agree with this patch.
> > Thank you very much.
> > 
> > Tested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > 
> > As for whether to go in with the HGM patchset or not, I have no strong opinion.
> > As you stated in another email, this patch is correct without the HGM patches,
> > so it is OK with me to merge this first.
> 
> Thanks Naoya.
> 
> This is a late thought, but ...
> Should this patch and Yang Shi's shmem patch be backported to stable releases?
> Both address potential data corruption/loss, so it certainly seems like
> stable material.

Yes, I agree that backporting these could be helpful.

So I think that I'll try to backport commit a76054266661 and its dependencies
to 5.15 (and older LTS if possible).  For this patch, just adding "Cc: stable"
should be enough for now.

Thanks,
Naoya Horiguchi


* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-11-07  4:25     ` HORIGUCHI NAOYA(堀口 直也)
@ 2022-11-07 16:55       ` Mike Kravetz
  2022-11-07 19:41         ` Andrew Morton
  2022-11-07 18:50       ` Yang Shi
  1 sibling, 1 reply; 13+ messages in thread
From: Mike Kravetz @ 2022-11-07 16:55 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: James Houghton, Muchun Song, Miaohe Lin, Andrew Morton, Yang Shi,
	Axel Rasmussen, linux-mm, linux-kernel

On 11/07/22 04:25, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Fri, Nov 04, 2022 at 06:27:44AM -0700, Mike Kravetz wrote:
> > On 11/04/22 02:10, HORIGUCHI NAOYA(堀口 直也) wrote:
> > > On Tue, Oct 18, 2022 at 08:01:25PM +0000, James Houghton wrote:
> > > > This change is very similar to the change that was made for shmem [1],
> > > > and it solves the same problem but for HugeTLBFS instead.
> > > > 
> > > > Currently, when poison is found in a HugeTLB page, the page is removed
> > > > from the page cache. That means that attempting to map or read that
> > > > hugepage in the future will result in a new hugepage being allocated
> > > > instead of notifying the user that the page was poisoned. As [1] states,
> > > > this is effectively memory corruption.
> > > > 
> > > > The fix is to leave the page in the page cache. If the user attempts to
> > > > use a poisoned HugeTLB page with a syscall, the syscall will fail with
> > > > EIO, the same error code that shmem uses. For attempts to map the page,
> > > > the thread will get a BUS_MCEERR_AR SIGBUS.
> > > > 
> > > > [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> > > > 
> > > > Signed-off-by: James Houghton <jthoughton@google.com>
> > > 
> > > I did some testing and found no issue. So I agree with this patch.
> > > Thank you very much.
> > > 
> > > Tested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > 
> > > As for whether to go in with the HGM patchset or not, I have no strong opinion.
> > > As you stated in another email, this patch is correct without the HGM patches,
> > > so it is OK with me to merge this first.
> > 
> > Thanks Naoya.
> > 
> > This is a late thought, but ...
> > Should this patch and Yang Shi's shmem patch be backported to stable releases?
> > Both address potential data corruption/loss, so it certainly seems like
> > stable material.
> 
> Yes, I agree that backporting these could be helpful.
> 
> So I think that I'll try to backport commit a76054266661 and its dependencies
> to 5.15 (and older LTS if possible).  For this patch, just adding "Cc: stable"
> should be enough for now.

We have now actually seen users impacted by this issue.  I suggest
adding the following tags to this patch.

Fixes: 78bb920344b8 ("mm: hwpoison: dissolve in-use hugepage in unrecoverable memory error")
Cc: <stable@vger.kernel.org>

-- 
Mike Kravetz


* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-11-07  4:25     ` HORIGUCHI NAOYA(堀口 直也)
  2022-11-07 16:55       ` Mike Kravetz
@ 2022-11-07 18:50       ` Yang Shi
  1 sibling, 0 replies; 13+ messages in thread
From: Yang Shi @ 2022-11-07 18:50 UTC (permalink / raw)
  To: HORIGUCHI NAOYA(堀口 直也)
  Cc: Mike Kravetz, James Houghton, Muchun Song, Miaohe Lin,
	Andrew Morton, Axel Rasmussen, linux-mm, linux-kernel

On Sun, Nov 6, 2022 at 8:25 PM HORIGUCHI NAOYA(堀口 直也)
<naoya.horiguchi@nec.com> wrote:
>
> On Fri, Nov 04, 2022 at 06:27:44AM -0700, Mike Kravetz wrote:
> > On 11/04/22 02:10, HORIGUCHI NAOYA(堀口 直也) wrote:
> > > On Tue, Oct 18, 2022 at 08:01:25PM +0000, James Houghton wrote:
> > > > This change is very similar to the change that was made for shmem [1],
> > > > and it solves the same problem but for HugeTLBFS instead.
> > > >
> > > > Currently, when poison is found in a HugeTLB page, the page is removed
> > > > from the page cache. That means that attempting to map or read that
> > > > hugepage in the future will result in a new hugepage being allocated
> > > > instead of notifying the user that the page was poisoned. As [1] states,
> > > > this is effectively memory corruption.
> > > >
> > > > The fix is to leave the page in the page cache. If the user attempts to
> > > > use a poisoned HugeTLB page with a syscall, the syscall will fail with
> > > > EIO, the same error code that shmem uses. For attempts to map the page,
> > > > the thread will get a BUS_MCEERR_AR SIGBUS.
> > > >
> > > > [1]: commit a76054266661 ("mm: shmem: don't truncate page if memory failure happens")
> > > >
> > > > Signed-off-by: James Houghton <jthoughton@google.com>
> > >
> > > I did some testing and found no issue. So I agree with this patch.
> > > Thank you very much.
> > >
> > > Tested-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > > Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> > >
> > > As for whether to go in with the HGM patchset or not, I have no strong opinion.
> > > As you stated in another email, this patch is correct without the HGM patches,
> > > so it is OK with me to merge this first.
> >
> > Thanks Naoya.
> >
> > This is a late thought, but ...
> > Should this patch and Yang Shi's shmem patch be backported to stable releases?
> > Both address potential data corruption/loss, so it certainly seems like
> > stable material.
>
> Yes, I agree that backporting these could be helpful.

+1

>
> So I think that I'll try to backport commit a76054266661 and its dependencies
> to 5.15 (and older LTS if possible).  For this patch, just adding "Cc: stable"
> should be enough for now.

Thanks, Naoya.

>
> Thanks,
> Naoya Horiguchi


* Re: [PATCH] hugetlbfs: don't delete error page from pagecache
  2022-11-07 16:55       ` Mike Kravetz
@ 2022-11-07 19:41         ` Andrew Morton
  0 siblings, 0 replies; 13+ messages in thread
From: Andrew Morton @ 2022-11-07 19:41 UTC (permalink / raw)
  To: Mike Kravetz
  Cc: HORIGUCHI NAOYA, James Houghton, Muchun Song, Miaohe Lin,
	Yang Shi, Axel Rasmussen, linux-mm, linux-kernel

On Mon, 7 Nov 2022 08:55:43 -0800 Mike Kravetz <mike.kravetz@oracle.com> wrote:

> We have now actually seen users impacted by this issue.  I suggest
> adding the following tags to this patch.
> 
> Fixes: 78bb920344b8 ("mm: hwpoison: dissolve in-use hugepage in unrecoverable memory error")
> Cc: <stable@vger.kernel.org>

I've made those changes.


end of thread, other threads:[~2022-11-07 19:41 UTC | newest]

Thread overview: 13+ messages
2022-10-18 20:01 [PATCH] hugetlbfs: don't delete error page from pagecache James Houghton
2022-10-19 18:31 ` Yang Shi
2022-10-19 18:42   ` James Houghton
2022-10-20 18:27     ` Yang Shi
2022-10-19 18:55   ` Mike Kravetz
2022-10-20 18:42     ` Yang Shi
2022-10-20  0:02 ` Mike Kravetz
2022-11-04  2:10 ` HORIGUCHI NAOYA(堀口 直也)
2022-11-04 13:27   ` Mike Kravetz
2022-11-07  4:25     ` HORIGUCHI NAOYA(堀口 直也)
2022-11-07 16:55       ` Mike Kravetz
2022-11-07 19:41         ` Andrew Morton
2022-11-07 18:50       ` Yang Shi
