* [PATCH 0/3] Three memory-failure fixes
@ 2023-12-18 13:58 Matthew Wilcox (Oracle)
  2023-12-18 13:58 ` [PATCH 1/3] mm/memory-failure: Pass the folio and the page to collect_procs() Matthew Wilcox (Oracle)
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-18 13:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, Naoya Horiguchi, Dan Williams

I've been looking at the memory-failure code and I believe I have found
three bugs that need fixing -- one going all the way back to 2010!
I'll have more patches later to use folios more extensively but didn't
want these bugfixes to get caught up in that.

Matthew Wilcox (Oracle) (3):
  mm/memory-failure: Pass the folio and the page to collect_procs()
  mm/memory-failure: Check the mapcount of the precise page
  mm/memory-failure: Cast index to loff_t before shifting it

 mm/memory-failure.c | 33 ++++++++++++++++-----------------
 1 file changed, 16 insertions(+), 17 deletions(-)

-- 
2.42.0

* [PATCH 1/3] mm/memory-failure: Pass the folio and the page to collect_procs()
  2023-12-18 13:58 [PATCH 0/3] Three memory-failure fixes Matthew Wilcox (Oracle)
@ 2023-12-18 13:58 ` Matthew Wilcox (Oracle)
  2023-12-21  8:13   ` Naoya Horiguchi
  2023-12-18 13:58 ` [PATCH 2/3] mm/memory-failure: Check the mapcount of the precise page Matthew Wilcox (Oracle)
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-18 13:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, Naoya Horiguchi, Dan Williams, stable

Both collect_procs_anon() and collect_procs_file() iterate over the VMA
interval trees looking for a single pgoff, so it is wrong to look for
the pgoff of the head page as is currently done.  However, it is also
wrong to look at page->mapping of the precise page as this is invalid
for tail pages.  Clear up the confusion by passing both the folio and
the precise page to collect_procs().
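
For illustration only -- this fragment is not part of the diff below, it
just restates roughly what the fixed collect_procs_file() path ends up
doing: the interval tree walk is keyed on the pgoff of the precise
poisoned page, while the mapping is read from the folio, because tail
pages do not have a valid ->mapping:

	struct vm_area_struct *vma;
	struct address_space *mapping = folio->mapping;	/* valid on the folio only */
	pgoff_t pgoff = page_to_pgoff(page);		/* the poisoned page, not the head */

	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
		/* only VMAs that actually cover the poisoned page are visited */
	}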

Fixes: 415c64c1453a ("mm/memory-failure: split thp earlier in memory error handling")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/memory-failure.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 660c21859118..6953bda11e6e 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -595,10 +595,9 @@ struct task_struct *task_early_kill(struct task_struct *tsk, int force_early)
 /*
  * Collect processes when the error hit an anonymous page.
  */
-static void collect_procs_anon(struct page *page, struct list_head *to_kill,
-				int force_early)
+static void collect_procs_anon(struct folio *folio, struct page *page,
+		struct list_head *to_kill, int force_early)
 {
-	struct folio *folio = page_folio(page);
 	struct vm_area_struct *vma;
 	struct task_struct *tsk;
 	struct anon_vma *av;
@@ -633,12 +632,12 @@ static void collect_procs_anon(struct page *page, struct list_head *to_kill,
 /*
  * Collect processes when the error hit a file mapped page.
  */
-static void collect_procs_file(struct page *page, struct list_head *to_kill,
-				int force_early)
+static void collect_procs_file(struct folio *folio, struct page *page,
+		struct list_head *to_kill, int force_early)
 {
 	struct vm_area_struct *vma;
 	struct task_struct *tsk;
-	struct address_space *mapping = page->mapping;
+	struct address_space *mapping = folio->mapping;
 	pgoff_t pgoff;
 
 	i_mmap_lock_read(mapping);
@@ -704,17 +703,17 @@ static void collect_procs_fsdax(struct page *page,
 /*
  * Collect the processes who have the corrupted page mapped to kill.
  */
-static void collect_procs(struct page *page, struct list_head *tokill,
-				int force_early)
+static void collect_procs(struct folio *folio, struct page *page,
+		struct list_head *tokill, int force_early)
 {
-	if (!page->mapping)
+	if (!folio->mapping)
 		return;
 	if (unlikely(PageKsm(page)))
 		collect_procs_ksm(page, tokill, force_early);
 	else if (PageAnon(page))
-		collect_procs_anon(page, tokill, force_early);
+		collect_procs_anon(folio, page, tokill, force_early);
 	else
-		collect_procs_file(page, tokill, force_early);
+		collect_procs_file(folio, page, tokill, force_early);
 }
 
 struct hwpoison_walk {
@@ -1602,7 +1601,7 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
 	 * mapped in dirty form.  This has to be done before try_to_unmap,
 	 * because ttu takes the rmap data structures down.
 	 */
-	collect_procs(hpage, &tokill, flags & MF_ACTION_REQUIRED);
+	collect_procs(folio, p, &tokill, flags & MF_ACTION_REQUIRED);
 
 	if (PageHuge(hpage) && !PageAnon(hpage)) {
 		/*
@@ -1772,7 +1771,7 @@ static int mf_generic_kill_procs(unsigned long long pfn, int flags,
 	 * SIGBUS (i.e. MF_MUST_KILL)
 	 */
 	flags |= MF_ACTION_REQUIRED | MF_MUST_KILL;
-	collect_procs(&folio->page, &to_kill, true);
+	collect_procs(folio, &folio->page, &to_kill, true);
 
 	unmap_and_kill(&to_kill, pfn, folio->mapping, folio->index, flags);
 unlock:
-- 
2.42.0

* [PATCH 2/3] mm/memory-failure: Check the mapcount of the precise page
  2023-12-18 13:58 [PATCH 0/3] Three memory-failure fixes Matthew Wilcox (Oracle)
  2023-12-18 13:58 ` [PATCH 1/3] mm/memory-failure: Pass the folio and the page to collect_procs() Matthew Wilcox (Oracle)
@ 2023-12-18 13:58 ` Matthew Wilcox (Oracle)
  2023-12-22  1:23   ` Naoya Horiguchi
  2023-12-18 13:58 ` [PATCH 3/3] mm/memory-failure: Cast index to loff_t before shifting it Matthew Wilcox (Oracle)
  2023-12-18 14:03 ` [PATCH] mailmap: Add an old address for Naoya Horiguchi Matthew Wilcox (Oracle)
  3 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-18 13:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, Naoya Horiguchi, Dan Williams, stable

A process may map only some of the pages in a folio, and might be missed
if it maps the poisoned page but not the head page.  Or it might be
unnecessarily hit if it maps the head page, but not the poisoned page.
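
As an illustration of the scenario (not part of the patch; the path and
offsets below are made up), a process can map a single 4KiB page out of
a range that the page cache may back with one large folio, so whether
it maps the poisoned page depends on the precise page rather than on
the head page:

	#include <stddef.h>
	#include <fcntl.h>
	#include <sys/mman.h>

	int main(void)
	{
		int fd = open("/tmp/bigfile", O_RDONLY);	/* made-up path */

		if (fd < 0)
			return 1;
		/* Map one 4KiB page, 1MiB into the file; neighbouring pages
		 * may live in the same folio but are not mapped here. */
		void *p = mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 1 << 20);

		return p == MAP_FAILED;
	}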

Fixes: 7af446a841a2 ("HWPOISON, hugetlb: enable error handling path for hugepage")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/memory-failure.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 6953bda11e6e..82e15baabb48 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1570,7 +1570,7 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
 	 * This check implies we don't kill processes if their pages
 	 * are in the swap cache early. Those are always late kills.
 	 */
-	if (!page_mapped(hpage))
+	if (!page_mapped(p))
 		return true;
 
 	if (PageSwapCache(p)) {
@@ -1621,10 +1621,10 @@ static bool hwpoison_user_mappings(struct page *p, unsigned long pfn,
 		try_to_unmap(folio, ttu);
 	}
 
-	unmap_success = !page_mapped(hpage);
+	unmap_success = !page_mapped(p);
 	if (!unmap_success)
 		pr_err("%#lx: failed to unmap page (mapcount=%d)\n",
-		       pfn, page_mapcount(hpage));
+		       pfn, page_mapcount(p));
 
 	/*
 	 * try_to_unmap() might put mlocked page in lru cache, so call
-- 
2.42.0

* [PATCH 3/3] mm/memory-failure: Cast index to loff_t before shifting it
  2023-12-18 13:58 [PATCH 0/3] Three memory-failure fixes Matthew Wilcox (Oracle)
  2023-12-18 13:58 ` [PATCH 1/3] mm/memory-failure: Pass the folio and the page to collect_procs() Matthew Wilcox (Oracle)
  2023-12-18 13:58 ` [PATCH 2/3] mm/memory-failure: Check the mapcount of the precise page Matthew Wilcox (Oracle)
@ 2023-12-18 13:58 ` Matthew Wilcox (Oracle)
  2023-12-21  8:13   ` Naoya Horiguchi
  2023-12-18 14:03 ` [PATCH] mailmap: Add an old address for Naoya Horiguchi Matthew Wilcox (Oracle)
  3 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-18 13:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Matthew Wilcox (Oracle), linux-mm, Naoya Horiguchi, Dan Williams, stable

On 32-bit systems, we'll lose the top bits of index because arithmetic
will be performed in unsigned long instead of unsigned long long.  This
affects files over 4GB in size.
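
For illustration only (made-up index, assuming a 32-bit unsigned long
and PAGE_SHIFT == 12):

	pgoff_t index = 0x200000;			/* page 8GB into the file */
	loff_t bad  = index << PAGE_SHIFT;		/* shift done in 32 bits: wraps to 0 */
	loff_t good = (loff_t)index << PAGE_SHIFT;	/* shift done in 64 bits: 0x200000000 */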

Fixes: 6100e34b2526 ("mm, memory_failure: Teach memory_failure() about dev_pagemap pages")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/memory-failure.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 82e15baabb48..455093f73a70 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1704,7 +1704,7 @@ static void unmap_and_kill(struct list_head *to_kill, unsigned long pfn,
 		 * mapping being torn down is communicated in siginfo, see
 		 * kill_proc()
 		 */
-		loff_t start = (index << PAGE_SHIFT) & ~(size - 1);
+		loff_t start = ((loff_t)index << PAGE_SHIFT) & ~(size - 1);
 
 		unmap_mapping_range(mapping, start, size, 0);
 	}
-- 
2.42.0

* [PATCH] mailmap: Add an old address for Naoya Horiguchi
  2023-12-18 13:58 [PATCH 0/3] Three memory-failure fixes Matthew Wilcox (Oracle)
                   ` (2 preceding siblings ...)
  2023-12-18 13:58 ` [PATCH 3/3] mm/memory-failure: Cast index to loff_t before shifting it Matthew Wilcox (Oracle)
@ 2023-12-18 14:03 ` Matthew Wilcox (Oracle)
  2023-12-21  8:15   ` Naoya Horiguchi
  3 siblings, 1 reply; 9+ messages in thread
From: Matthew Wilcox (Oracle) @ 2023-12-18 14:03 UTC (permalink / raw)
  To: Andrew Morton, linux-mm, naoya.horiguchi; +Cc: Matthew Wilcox (Oracle)

This address now bounces; remap it to a current address.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 .mailmap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.mailmap b/.mailmap
index a18bbede70f8..b6fcce3eff94 100644
--- a/.mailmap
+++ b/.mailmap
@@ -433,6 +433,7 @@ Muna Sinada <quic_msinada@quicinc.com> <msinada@codeaurora.org>
 Murali Nalajala <quic_mnalajal@quicinc.com> <mnalajal@codeaurora.org>
 Mythri P K <mythripk@ti.com>
 Nadia Yvette Chambers <nyc@holomorphy.com> William Lee Irwin III <wli@holomorphy.com>
+Naoya Horiguchi <naoya.horiguchi@nec.com> <n-horiguchi@ah.jp.nec.com>
 Nathan Chancellor <nathan@kernel.org> <natechancellor@gmail.com>
 Neeraj Upadhyay <quic_neeraju@quicinc.com> <neeraju@codeaurora.org>
 Neil Armstrong <neil.armstrong@linaro.org> <narmstrong@baylibre.com>
-- 
2.42.0

* Re: [PATCH 1/3] mm/memory-failure: Pass the folio and the page to collect_procs()
  2023-12-18 13:58 ` [PATCH 1/3] mm/memory-failure: Pass the folio and the page to collect_procs() Matthew Wilcox (Oracle)
@ 2023-12-21  8:13   ` Naoya Horiguchi
  0 siblings, 0 replies; 9+ messages in thread
From: Naoya Horiguchi @ 2023-12-21  8:13 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Andrew Morton, linux-mm, Naoya Horiguchi, Dan Williams, stable

On Mon, Dec 18, 2023 at 01:58:35PM +0000, Matthew Wilcox (Oracle) wrote:
> Both collect_procs_anon() and collect_procs_file() iterate over the VMA
> interval trees looking for a single pgoff, so it is wrong to look for
> the pgoff of the head page as is currently done.  However, it is also
> wrong to look at page->mapping of the precise page as this is invalid
> for tail pages.  Clear up the confusion by passing both the folio and
> the precise page to collect_procs().
> 
> Fixes: 415c64c1453a ("mm/memory-failure: split thp earlier in memory error handling")
> Cc: stable@vger.kernel.org
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Looks good to me, thank you.

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

* Re: [PATCH 3/3] mm/memory-failure: Cast index to loff_t before shifting it
  2023-12-18 13:58 ` [PATCH 3/3] mm/memory-failure: Cast index to loff_t before shifting it Matthew Wilcox (Oracle)
@ 2023-12-21  8:13   ` Naoya Horiguchi
  0 siblings, 0 replies; 9+ messages in thread
From: Naoya Horiguchi @ 2023-12-21  8:13 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Andrew Morton, linux-mm, Naoya Horiguchi, Dan Williams, stable

On Mon, Dec 18, 2023 at 01:58:37PM +0000, Matthew Wilcox (Oracle) wrote:
> On 32-bit systems, we'll lose the top bits of index because arithmetic
> will be performed in unsigned long instead of unsigned long long.  This
> affects files over 4GB in size.
> 
> Fixes: 6100e34b2526 ("mm, memory_failure: Teach memory_failure() about dev_pagemap pages")
> Cc: stable@vger.kernel.org
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

* Re: [PATCH] mailmap: Add an old address for Naoya Horiguchi
  2023-12-18 14:03 ` [PATCH] mailmap: Add an old address for Naoya Horiguchi Matthew Wilcox (Oracle)
@ 2023-12-21  8:15   ` Naoya Horiguchi
  0 siblings, 0 replies; 9+ messages in thread
From: Naoya Horiguchi @ 2023-12-21  8:15 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle); +Cc: Andrew Morton, linux-mm, naoya.horiguchi

On Mon, Dec 18, 2023 at 02:03:28PM +0000, Matthew Wilcox (Oracle) wrote:
> This address now bounces; remap it to a current address.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Thank you for updating this file!

Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

* Re: [PATCH 2/3] mm/memory-failure: Check the mapcount of the precise page
  2023-12-18 13:58 ` [PATCH 2/3] mm/memory-failure: Check the mapcount of the precise page Matthew Wilcox (Oracle)
@ 2023-12-22  1:23   ` Naoya Horiguchi
  0 siblings, 0 replies; 9+ messages in thread
From: Naoya Horiguchi @ 2023-12-22  1:23 UTC (permalink / raw)
  To: Matthew Wilcox (Oracle)
  Cc: Andrew Morton, linux-mm, Naoya Horiguchi, Dan Williams, stable

On Mon, Dec 18, 2023 at 01:58:36PM +0000, Matthew Wilcox (Oracle) wrote:
> A process may map only some of the pages in a folio, and might be missed
> if it maps the poisoned page but not the head page.  Or it might be
> unnecessarily hit if it maps the head page, but not the poisoned page.
> 
> Fixes: 7af446a841a2 ("HWPOISON, hugetlb: enable error handling path for hugepage")
> Cc: stable@vger.kernel.org
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
