linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/vmscan: Fix NR_ISOLATED_FILE corruption on 64-bit
@ 2020-10-29  3:23 Nicholas Piggin
  2020-10-29 13:11 ` Michal Hocko
  0 siblings, 1 reply; 4+ messages in thread
From: Nicholas Piggin @ 2020-10-29  3:23 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Nicholas Piggin, linux-mm, Maninder Singh, Vaneet Narang,
	Michal Hocko, Amit Sahrawat, Mel Gorman, Vlastimil Babka

Previously the negated unsigned long would be cast back to signed long
which would have the correct negative value. After commit 730ec8c01a2b
("mm/vmscan.c: change prototype for shrink_page_list"), the large
unsigned int converts to a large positive signed long.

Symptoms include CMA allocations hanging forever holding the cma_mutex
due to alloc_contig_range->...->isolate_migratepages_block waiting
forever in "while (unlikely(too_many_isolated(pgdat)))".

Cc: linux-mm@kvack.org
Cc: Maninder Singh <maninder1.s@samsung.com>
Cc: Vaneet Narang <v.narang@samsung.com>
Cc: Maninder Singh <maninder1.s@samsung.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Amit Sahrawat <a.sahrawat@samsung.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Fixes: 730ec8c01a2b ("mm/vmscan.c: change prototype for shrink_page_list")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1b8f0e059767..92c507bacf09 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1516,7 +1516,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
 	nr_reclaimed = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
 			TTU_IGNORE_ACCESS, &stat, true);
 	list_splice(&clean_pages, page_list);
-	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -nr_reclaimed);
+	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -(long)nr_reclaimed);
 	/*
 	 * Since lazyfree pages are isolated from file LRU from the beginning,
 	 * they will rotate back to anonymous LRU in the end if it failed to
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm/vmscan: Fix NR_ISOLATED_FILE corruption on 64-bit
  2020-10-29  3:23 [PATCH] mm/vmscan: Fix NR_ISOLATED_FILE corruption on 64-bit Nicholas Piggin
@ 2020-10-29 13:11 ` Michal Hocko
  2020-10-30  5:25   ` Andrew Morton
  0 siblings, 1 reply; 4+ messages in thread
From: Michal Hocko @ 2020-10-29 13:11 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Andrew Morton, linux-mm, Maninder Singh, Vaneet Narang,
	Amit Sahrawat, Mel Gorman, Vlastimil Babka

On Thu 29-10-20 13:23:20, Nicholas Piggin wrote:
> Previously the negated unsigned long would be cast back to signed long
> which would have the correct negative value. After commit 730ec8c01a2b
> ("mm/vmscan.c: change prototype for shrink_page_list"), the large
> unsigned int converts to a large positive signed long.
> 
> Symptoms include CMA allocations hanging forever holding the cma_mutex
> due to alloc_contig_range->...->isolate_migratepages_block waiting
> forever in "while (unlikely(too_many_isolated(pgdat)))".

Nasty.

> Cc: linux-mm@kvack.org
> Cc: Maninder Singh <maninder1.s@samsung.com>
> Cc: Vaneet Narang <v.narang@samsung.com>
> Cc: Maninder Singh <maninder1.s@samsung.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Amit Sahrawat <a.sahrawat@samsung.com>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Fixes: 730ec8c01a2b ("mm/vmscan.c: change prototype for shrink_page_list")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  mm/vmscan.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 1b8f0e059767..92c507bacf09 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1516,7 +1516,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
>  	nr_reclaimed = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
>  			TTU_IGNORE_ACCESS, &stat, true);
>  	list_splice(&clean_pages, page_list);
> -	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -nr_reclaimed);
> +	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -(long)nr_reclaimed);
>  	/*
>  	 * Since lazyfree pages are isolated from file LRU from the beginning,
>  	 * they will rotate back to anonymous LRU in the end if it failed to

You want the same also for -stat.nr_lazyfree_fail right?

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm/vmscan: Fix NR_ISOLATED_FILE corruption on 64-bit
  2020-10-29 13:11 ` Michal Hocko
@ 2020-10-30  5:25   ` Andrew Morton
  2020-10-30  7:00     ` Michal Hocko
  0 siblings, 1 reply; 4+ messages in thread
From: Andrew Morton @ 2020-10-30  5:25 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Nicholas Piggin, linux-mm, Maninder Singh, Vaneet Narang,
	Amit Sahrawat, Mel Gorman, Vlastimil Babka

On Thu, 29 Oct 2020 14:11:03 +0100 Michal Hocko <mhocko@suse.com> wrote:

> On Thu 29-10-20 13:23:20, Nicholas Piggin wrote:
> > Previously the negated unsigned long would be cast back to signed long
> > which would have the correct negative value. After commit 730ec8c01a2b
> > ("mm/vmscan.c: change prototype for shrink_page_list"), the large
> > unsigned int converts to a large positive signed long.
> > 
> > Symptoms include CMA allocations hanging forever holding the cma_mutex
> > due to alloc_contig_range->...->isolate_migratepages_block waiting
> > forever in "while (unlikely(too_many_isolated(pgdat)))".
> 
> Nasty.
> 
> > Cc: linux-mm@kvack.org
> > Cc: Maninder Singh <maninder1.s@samsung.com>
> > Cc: Vaneet Narang <v.narang@samsung.com>
> > Cc: Maninder Singh <maninder1.s@samsung.com>
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Amit Sahrawat <a.sahrawat@samsung.com>
> > Cc: Mel Gorman <mgorman@suse.de>
> > Cc: Vlastimil Babka <vbabka@suse.cz>
> > Fixes: 730ec8c01a2b ("mm/vmscan.c: change prototype for shrink_page_list")
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> >  mm/vmscan.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 1b8f0e059767..92c507bacf09 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1516,7 +1516,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
> >  	nr_reclaimed = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
> >  			TTU_IGNORE_ACCESS, &stat, true);
> >  	list_splice(&clean_pages, page_list);
> > -	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -nr_reclaimed);
> > +	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -(long)nr_reclaimed);
> >  	/*
> >  	 * Since lazyfree pages are isolated from file LRU from the beginning,
> >  	 * they will rotate back to anonymous LRU in the end if it failed to
> 
> You want the same also for -stat.nr_lazyfree_fail right?

I did the below, and added a cc:stable.

--- a/mm/vmscan.c~mm-vmscan-fix-nr_isolated_file-corruption-on-64-bit-fix
+++ a/mm/vmscan.c
@@ -1516,7 +1516,8 @@ unsigned int reclaim_clean_pages_from_li
 	nr_reclaimed = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
 			TTU_IGNORE_ACCESS, &stat, true);
 	list_splice(&clean_pages, page_list);
-	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -(long)nr_reclaimed);
+	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
+			    -(long)nr_reclaimed);
 	/*
 	 * Since lazyfree pages are isolated from file LRU from the beginning,
 	 * they will rotate back to anonymous LRU in the end if it failed to
@@ -1526,7 +1527,7 @@ unsigned int reclaim_clean_pages_from_li
 	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_ANON,
 			    stat.nr_lazyfree_fail);
 	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
-			    -stat.nr_lazyfree_fail);
+			    -(long)stat.nr_lazyfree_fail);
 	return nr_reclaimed;
 }
 
_



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] mm/vmscan: Fix NR_ISOLATED_FILE corruption on 64-bit
  2020-10-30  5:25   ` Andrew Morton
@ 2020-10-30  7:00     ` Michal Hocko
  0 siblings, 0 replies; 4+ messages in thread
From: Michal Hocko @ 2020-10-30  7:00 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Nicholas Piggin, linux-mm, Maninder Singh, Vaneet Narang,
	Amit Sahrawat, Mel Gorman, Vlastimil Babka

On Thu 29-10-20 22:25:26, Andrew Morton wrote:
> On Thu, 29 Oct 2020 14:11:03 +0100 Michal Hocko <mhocko@suse.com> wrote:
> 
> > On Thu 29-10-20 13:23:20, Nicholas Piggin wrote:
> > > Previously the negated unsigned long would be cast back to signed long
> > > which would have the correct negative value. After commit 730ec8c01a2b
> > > ("mm/vmscan.c: change prototype for shrink_page_list"), the large
> > > unsigned int converts to a large positive signed long.
> > > 
> > > Symptoms include CMA allocations hanging forever holding the cma_mutex
> > > due to alloc_contig_range->...->isolate_migratepages_block waiting
> > > forever in "while (unlikely(too_many_isolated(pgdat)))".
> > 
> > Nasty.
> > 
> > > Cc: linux-mm@kvack.org
> > > Cc: Maninder Singh <maninder1.s@samsung.com>
> > > Cc: Vaneet Narang <v.narang@samsung.com>
> > > Cc: Maninder Singh <maninder1.s@samsung.com>
> > > Cc: Michal Hocko <mhocko@suse.com>
> > > Cc: Amit Sahrawat <a.sahrawat@samsung.com>
> > > Cc: Mel Gorman <mgorman@suse.de>
> > > Cc: Vlastimil Babka <vbabka@suse.cz>
> > > Fixes: 730ec8c01a2b ("mm/vmscan.c: change prototype for shrink_page_list")
> > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > > ---
> > >  mm/vmscan.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index 1b8f0e059767..92c507bacf09 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -1516,7 +1516,7 @@ unsigned int reclaim_clean_pages_from_list(struct zone *zone,
> > >  	nr_reclaimed = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
> > >  			TTU_IGNORE_ACCESS, &stat, true);
> > >  	list_splice(&clean_pages, page_list);
> > > -	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -nr_reclaimed);
> > > +	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -(long)nr_reclaimed);
> > >  	/*
> > >  	 * Since lazyfree pages are isolated from file LRU from the beginning,
> > >  	 * they will rotate back to anonymous LRU in the end if it failed to
> > 
> > You want the same also for -stat.nr_lazyfree_fail right?
> 
> I did the below, and added a cc:stable.
> 
> --- a/mm/vmscan.c~mm-vmscan-fix-nr_isolated_file-corruption-on-64-bit-fix
> +++ a/mm/vmscan.c
> @@ -1516,7 +1516,8 @@ unsigned int reclaim_clean_pages_from_li
>  	nr_reclaimed = shrink_page_list(&clean_pages, zone->zone_pgdat, &sc,
>  			TTU_IGNORE_ACCESS, &stat, true);
>  	list_splice(&clean_pages, page_list);
> -	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE, -(long)nr_reclaimed);
> +	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
> +			    -(long)nr_reclaimed);
>  	/*
>  	 * Since lazyfree pages are isolated from file LRU from the beginning,
>  	 * they will rotate back to anonymous LRU in the end if it failed to
> @@ -1526,7 +1527,7 @@ unsigned int reclaim_clean_pages_from_li
>  	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_ANON,
>  			    stat.nr_lazyfree_fail);
>  	mod_node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE,
> -			    -stat.nr_lazyfree_fail);
> +			    -(long)stat.nr_lazyfree_fail);
>  	return nr_reclaimed;
>  }

Acked-by: Michal Hocko <mhocko@suse.com>

Btw. regular reclaim paths are OK because nr_taken is unsigned long.

Btw #2 I was quite surprised about the inconsistency for !SMP case where
__mod_node_page_state uses int for the delta.

Thanks!
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2020-10-30  7:00 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-29  3:23 [PATCH] mm/vmscan: Fix NR_ISOLATED_FILE corruption on 64-bit Nicholas Piggin
2020-10-29 13:11 ` Michal Hocko
2020-10-30  5:25   ` Andrew Morton
2020-10-30  7:00     ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).