linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] Two minor cleanups for pcp list in page_alloc
@ 2023-08-09 10:07 Kemeng Shi
  2023-08-09 10:07 ` [PATCH 1/2] mm/page_alloc: remove track of active PCP lists range in bulk free Kemeng Shi
  2023-08-09 10:07 ` [PATCH 2/2] mm/page_alloc: remove unnecessary parameter batch of nr_pcp_free Kemeng Shi
  0 siblings, 2 replies; 8+ messages in thread
From: Kemeng Shi @ 2023-08-09 10:07 UTC (permalink / raw)
  To: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy
  Cc: shikemeng

These are two minor cleanups for the pcp lists in page_alloc. More details
can be found in the respective patches. Thanks!

Kemeng Shi (2):
  mm/page_alloc: remove track of active PCP lists range in bulk free
  mm/page_alloc: remove unnecessary parameter batch of nr_pcp_free

 mm/page_alloc.c | 23 ++++++-----------------
 1 file changed, 6 insertions(+), 17 deletions(-)

-- 
2.30.0


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] mm/page_alloc: remove track of active PCP lists range in bulk free
  2023-08-09 10:07 [PATCH 0/2] Two minor cleanups for pcp list in page_alloc Kemeng Shi
@ 2023-08-09 10:07 ` Kemeng Shi
  2023-08-15 17:45   ` Chris Li
  2023-08-09 10:07 ` [PATCH 2/2] mm/page_alloc: remove unnecessary parameter batch of nr_pcp_free Kemeng Shi
  1 sibling, 1 reply; 8+ messages in thread
From: Kemeng Shi @ 2023-08-09 10:07 UTC (permalink / raw)
  To: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy
  Cc: shikemeng

After commit fd56eef258a17 ("mm/page_alloc: simplify how many pages are
selected per pcp list during bulk free"), we drain all pages in the
selected pcp list, and the caller ensures the passed count is
< pcp->count. The search therefore finishes before it can wrap around,
so tracking the active PCP lists range, which was only needed for the
wrap-around case, can be removed.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/page_alloc.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 96b7c1a7d1f2..1ddcb2707d05 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1207,8 +1207,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 					int pindex)
 {
 	unsigned long flags;
-	int min_pindex = 0;
-	int max_pindex = NR_PCP_LISTS - 1;
 	unsigned int order;
 	bool isolated_pageblocks;
 	struct page *page;
@@ -1231,17 +1229,10 @@ static void free_pcppages_bulk(struct zone *zone, int count,
 
 		/* Remove pages from lists in a round-robin fashion. */
 		do {
-			if (++pindex > max_pindex)
-				pindex = min_pindex;
+			if (++pindex > NR_PCP_LISTS - 1)
+				pindex = 0;
 			list = &pcp->lists[pindex];
-			if (!list_empty(list))
-				break;
-
-			if (pindex == max_pindex)
-				max_pindex--;
-			if (pindex == min_pindex)
-				min_pindex++;
-		} while (1);
+		} while (list_empty(list));
 
 		order = pindex_to_order(pindex);
 		nr_pages = 1 << order;
-- 
2.30.0



* [PATCH 2/2] mm/page_alloc: remove unnecessary parameter batch of nr_pcp_free
  2023-08-09 10:07 [PATCH 0/2] Two minor cleanups for pcp list in page_alloc Kemeng Shi
  2023-08-09 10:07 ` [PATCH 1/2] mm/page_alloc: remove track of active PCP lists range in bulk free Kemeng Shi
@ 2023-08-09 10:07 ` Kemeng Shi
  2023-08-15 17:46   ` Chris Li
  1 sibling, 1 reply; 8+ messages in thread
From: Kemeng Shi @ 2023-08-09 10:07 UTC (permalink / raw)
  To: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy
  Cc: shikemeng

We read batch from pcp only to pass it to nr_pcp_free immediately. Read
batch from pcp inside nr_pcp_free instead, removing the unnecessary batch
parameter.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 mm/page_alloc.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1ddcb2707d05..bb1d14e806ad 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2376,10 +2376,10 @@ static bool free_unref_page_prepare(struct page *page, unsigned long pfn,
 	return true;
 }
 
-static int nr_pcp_free(struct per_cpu_pages *pcp, int high, int batch,
-		       bool free_high)
+static int nr_pcp_free(struct per_cpu_pages *pcp, int high, bool free_high)
 {
 	int min_nr_free, max_nr_free;
+	int batch = READ_ONCE(pcp->batch);
 
 	/* Free everything if batch freeing high-order pages. */
 	if (unlikely(free_high))
@@ -2446,9 +2446,7 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 
 	high = nr_pcp_high(pcp, zone, free_high);
 	if (pcp->count >= high) {
-		int batch = READ_ONCE(pcp->batch);
-
-		free_pcppages_bulk(zone, nr_pcp_free(pcp, high, batch, free_high), pcp, pindex);
+		free_pcppages_bulk(zone, nr_pcp_free(pcp, high, free_high), pcp, pindex);
 	}
 }
 
-- 
2.30.0



* Re: [PATCH 1/2] mm/page_alloc: remove track of active PCP lists range in bulk free
  2023-08-09 10:07 ` [PATCH 1/2] mm/page_alloc: remove track of active PCP lists range in bulk free Kemeng Shi
@ 2023-08-15 17:45   ` Chris Li
  2023-08-17  2:22     ` Kemeng Shi
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Li @ 2023-08-15 17:45 UTC (permalink / raw)
  To: Kemeng Shi
  Cc: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy

Hi Kemeng,

Can you confirm this patch has no intended functional change?

I have had a patch related to this count vs pcp->count issue sitting in
my tree for a while. A BPF function hook can potentially change
pcp->count and make count out of sync with pcp->count, which causes a
dead loop.

Maybe I can send mine out alongside yours for discussion?
I don't mind my patch being combined with yours.

Your change looks fine to me. There is more that can be done
on the cleanup.

Chris

On Wed, Aug 09, 2023 at 06:07:53PM +0800, Kemeng Shi wrote:
> After commit fd56eef258a17 ("mm/page_alloc: simplify how many pages are
> selected per pcp list during bulk free"), we drain all pages in the
> selected pcp list, and the caller ensures the passed count is
> < pcp->count. The search therefore finishes before it can wrap around,
> so tracking the active PCP lists range, which was only needed for the
> wrap-around case, can be removed.
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
>  mm/page_alloc.c | 15 +++------------
>  1 file changed, 3 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 96b7c1a7d1f2..1ddcb2707d05 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1207,8 +1207,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>  					int pindex)
>  {
>  	unsigned long flags;
> -	int min_pindex = 0;
> -	int max_pindex = NR_PCP_LISTS - 1;
>  	unsigned int order;
>  	bool isolated_pageblocks;
>  	struct page *page;
> @@ -1231,17 +1229,10 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>  
>  		/* Remove pages from lists in a round-robin fashion. */
>  		do {
> -			if (++pindex > max_pindex)
> -				pindex = min_pindex;
> +			if (++pindex > NR_PCP_LISTS - 1)
> +				pindex = 0;
>  			list = &pcp->lists[pindex];
> -			if (!list_empty(list))
> -				break;
> -
> -			if (pindex == max_pindex)
> -				max_pindex--;
> -			if (pindex == min_pindex)
> -				min_pindex++;
> -		} while (1);
> +		} while (list_empty(list));
>  
>  		order = pindex_to_order(pindex);
>  		nr_pages = 1 << order;
> -- 
> 2.30.0
> 


* Re: [PATCH 2/2] mm/page_alloc: remove unnecessary parameter batch of nr_pcp_free
  2023-08-09 10:07 ` [PATCH 2/2] mm/page_alloc: remove unnecessary parameter batch of nr_pcp_free Kemeng Shi
@ 2023-08-15 17:46   ` Chris Li
  2023-08-17  2:43     ` Kemeng Shi
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Li @ 2023-08-15 17:46 UTC (permalink / raw)
  To: Kemeng Shi
  Cc: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy

Hi Kemeng,

Since I am discussing the other patch in this series, I might as well
comment on this one too.

On Wed, Aug 09, 2023 at 06:07:54PM +0800, Kemeng Shi wrote:
> We read batch from pcp only to pass it to nr_pcp_free immediately. Read
> batch from pcp inside nr_pcp_free instead, removing the unnecessary batch
> parameter.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
> ---
>  mm/page_alloc.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1ddcb2707d05..bb1d14e806ad 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2376,10 +2376,10 @@ static bool free_unref_page_prepare(struct page *page, unsigned long pfn,
>  	return true;
>  }
>  
> -static int nr_pcp_free(struct per_cpu_pages *pcp, int high, int batch,
> -		       bool free_high)
> +static int nr_pcp_free(struct per_cpu_pages *pcp, int high, bool free_high)
>  {
>  	int min_nr_free, max_nr_free;
> +	int batch = READ_ONCE(pcp->batch);

Because nr_pcp_free is static and has only one caller, the function gets
inlined at the call site; I verified that in the compiled x86_64 code.

So, in my opinion, this change is not worthwhile: it produces the same
machine code, and one minor side effect is that it hides the commit
underneath it in "git blame".

Chris



* Re: [PATCH 1/2] mm/page_alloc: remove track of active PCP lists range in bulk free
  2023-08-15 17:45   ` Chris Li
@ 2023-08-17  2:22     ` Kemeng Shi
  2023-08-17  7:51       ` Chris Li
  0 siblings, 1 reply; 8+ messages in thread
From: Kemeng Shi @ 2023-08-17  2:22 UTC (permalink / raw)
  To: Chris Li; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy



On 8/16/2023 1:45 AM, Chris Li wrote:
> Hi Kemeng,
> 
> Can you confirm this patch has no intended functional change?
> 
Hi Chris, there is no functional change intended in this patch. As
I mentioned in the changelog, the list iteration never wraps around,
so the active PCP lists range would never be used.
> I have had a patch related to this count vs pcp->count issue sitting in
> my tree for a while. A BPF function hook can potentially change
> pcp->count and make count out of sync with pcp->count, which causes a
> dead loop.
> 
I guess pcp->count is set bigger than it should be. In that case, we
will keep trying to get pages even though all pages in the pcp lists
have already been taken off, and a dead loop will happen. That dead
loop happens with or without this patch, as the root cause is that we
try to get more pages than the pcp lists own.
> Maybe I can send mine out alongside yours for discussion?
> I don't mind my patch being combined with yours.
>
Either way is acceptable to me, just feel free to choose whichever you
like, and I'd like to see if there is more we could do here.

> Your change looks fine to me. There is more that can be done
> on the cleanup.
>
Thanks for the feedback; more cleanup is welcome.
> Chris
> 
> On Wed, Aug 09, 2023 at 06:07:53PM +0800, Kemeng Shi wrote:
>> After commit fd56eef258a17 ("mm/page_alloc: simplify how many pages are
>> selected per pcp list during bulk free"), we drain all pages in the
>> selected pcp list, and the caller ensures the passed count is
>> < pcp->count. The search therefore finishes before it can wrap around,
>> so tracking the active PCP lists range, which was only needed for the
>> wrap-around case, can be removed.
> 
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> ---
>>  mm/page_alloc.c | 15 +++------------
>>  1 file changed, 3 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 96b7c1a7d1f2..1ddcb2707d05 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1207,8 +1207,6 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>>  					int pindex)
>>  {
>>  	unsigned long flags;
>> -	int min_pindex = 0;
>> -	int max_pindex = NR_PCP_LISTS - 1;
>>  	unsigned int order;
>>  	bool isolated_pageblocks;
>>  	struct page *page;
>> @@ -1231,17 +1229,10 @@ static void free_pcppages_bulk(struct zone *zone, int count,
>>  
>>  		/* Remove pages from lists in a round-robin fashion. */
>>  		do {
>> -			if (++pindex > max_pindex)
>> -				pindex = min_pindex;
>> +			if (++pindex > NR_PCP_LISTS - 1)
>> +				pindex = 0;
>>  			list = &pcp->lists[pindex];
>> -			if (!list_empty(list))
>> -				break;
>> -
>> -			if (pindex == max_pindex)
>> -				max_pindex--;
>> -			if (pindex == min_pindex)
>> -				min_pindex++;
>> -		} while (1);
>> +		} while (list_empty(list));
>>  
>>  		order = pindex_to_order(pindex);
>>  		nr_pages = 1 << order;
>> -- 
>> 2.30.0
>>
> 



* Re: [PATCH 2/2] mm/page_alloc: remove unnecessary parameter batch of nr_pcp_free
  2023-08-15 17:46   ` Chris Li
@ 2023-08-17  2:43     ` Kemeng Shi
  0 siblings, 0 replies; 8+ messages in thread
From: Kemeng Shi @ 2023-08-17  2:43 UTC (permalink / raw)
  To: Chris Li; +Cc: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy



On 8/16/2023 1:46 AM, Chris Li wrote:
> Hi Kemeng,
> 
> Since I am discussing the other patch in this series, I might as well
> comment on this one too.
> 
> On Wed, Aug 09, 2023 at 06:07:54PM +0800, Kemeng Shi wrote:
>> We read batch from pcp only to pass it to nr_pcp_free immediately. Read
>> batch from pcp inside nr_pcp_free instead, removing the unnecessary batch
>> parameter.
>>
>> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
>> ---
>>  mm/page_alloc.c | 8 +++-----
>>  1 file changed, 3 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 1ddcb2707d05..bb1d14e806ad 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -2376,10 +2376,10 @@ static bool free_unref_page_prepare(struct page *page, unsigned long pfn,
>>  	return true;
>>  }
>>  
>> -static int nr_pcp_free(struct per_cpu_pages *pcp, int high, int batch,
>> -		       bool free_high)
>> +static int nr_pcp_free(struct per_cpu_pages *pcp, int high, bool free_high)
>>  {
>>  	int min_nr_free, max_nr_free;
>> +	int batch = READ_ONCE(pcp->batch);
> 
> Because nr_pcp_free is static and has only one caller, the function gets
> inlined at the call site; I verified that in the compiled x86_64 code.
> 
> So, in my opinion, this change is not worthwhile: it produces the same
> machine code, and one minor side effect is that it hides the commit
> underneath it in "git blame".
> 
Hi Chris, thanks for the reply. Besides reducing the number of arguments
to pass, this patch also tries to make the code a little cleaner. I think
it's always better to reduce variable scope and keep related code tight.
In this case, we can tell that batch comes from per_cpu_pages just by
reading nr_pcp_free alone, rather than having to search the callers to
find out. Also, any future callers of nr_pcp_free are freed from passing
pcp->batch, and so on. Anyway, in my opinion this patch definitely gains
a little without losing anything. :) Hope it makes sense to you.

> Chris
> 



* Re: [PATCH 1/2] mm/page_alloc: remove track of active PCP lists range in bulk free
  2023-08-17  2:22     ` Kemeng Shi
@ 2023-08-17  7:51       ` Chris Li
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Li @ 2023-08-17  7:51 UTC (permalink / raw)
  To: Kemeng Shi
  Cc: linux-mm, linux-kernel, akpm, baolin.wang, mgorman, david, willy

Hi Kemeng,

On Wed, Aug 16, 2023 at 7:22 PM Kemeng Shi <shikemeng@huaweicloud.com> wrote:
> Hi Chris, there is no functional change intended in this patch. As
> I mentioned in the changelog, the list iteration never wraps around,
> so the active PCP lists range would never be used.
> > I have had a patch related to this count vs pcp->count issue sitting in
> > my tree for a while. A BPF function hook can potentially change
> > pcp->count and make count out of sync with pcp->count, which causes a
> > dead loop.

In this case the BPF hook allocates a page inside the spin_lock, so
"pcp->count" becomes smaller than "count". The loop condition only checks
"count > 0", but all pages on the pcp lists have already been freed:
pcp->count is 0 while "count" is 1. After wrapping around a few times,
max_pindex becomes smaller than min_pindex and eventually reaches -1,
causing an invalid page fault.

> I guess pcp->count is set bigger than it should be. In that case, we
> will keep trying to get pages even though all pages in the pcp lists
> have already been taken off, and a dead loop will happen. That dead
> loop happens with or without this patch, as the root cause is that we
> try to get more pages than the pcp lists own.
> > Maybe I can send mine out alongside yours for discussion?

My patch is split into two parts. The first patch is a functional change
to allow pcp->count to drop below "count". The other patch is just a
cleanup and should be functionally the same.

Sure, I will send it out and CC you for discussion.

> > I don't mind my patch being combined with yours.
> >
> Either way is acceptable to me, just feel free to choose whichever you
> like, and I'd like to see if there is more we could do here.

Thanks

Chris

