From: Vlastimil Babka <vbabka@suse.cz>
To: Mel Gorman <mgorman@techsingularity.net>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Hillf Danton <hdanton@sina.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Michal Hocko <mhocko@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH 2/6] mm/page_alloc: Disassociate the pcp->high from pcp->batch
Date: Fri, 28 May 2021 12:27:58 +0200
Message-ID: <9ccf113a-d292-2b34-2470-5a4e2ed4276e@suse.cz>
In-Reply-To: <20210527105241.GB30378@techsingularity.net>

On 5/27/21 12:52 PM, Mel Gorman wrote:
> On Wed, May 26, 2021 at 08:14:13PM +0200, Vlastimil Babka wrote:
>> > @@ -6698,11 +6717,10 @@ static void __zone_set_pageset_high_and_batch(struct zone *zone, unsigned long h
>> >   */
>> >  static void zone_set_pageset_high_and_batch(struct zone *zone)
>> >  {
>> > -	unsigned long new_high, new_batch;
>> > +	int new_high, new_batch;
>> >  
>> > -	new_batch = zone_batchsize(zone);
>> > -	new_high = 6 * new_batch;
>> > -	new_batch = max(1UL, 1 * new_batch);
>> > +	new_batch = max(1, zone_batchsize(zone));
>> > +	new_high = zone_highsize(zone, new_batch);
>> >  
>> >  	if (zone->pageset_high == new_high &&
>> >  	    zone->pageset_batch == new_batch)
>> > @@ -8170,6 +8188,12 @@ static void __setup_per_zone_wmarks(void)
>> >  		zone->_watermark[WMARK_LOW]  = min_wmark_pages(zone) + tmp;
>> >  		zone->_watermark[WMARK_HIGH] = min_wmark_pages(zone) + tmp * 2;
>> >  
>> > +		/*
>> > +		 * The watermark size have changed so update the pcpu batch
>> > +		 * and high limits or the limits may be inappropriate.
>> > +		 */
>> > +		zone_set_pageset_high_and_batch(zone);
>> 
>> Hm so this puts the call in the path of various watermark related sysctl
>> handlers, but it's not protected by pcp_batch_high_lock. The zone lock won't
>> help against zone_pcp_update() from a hotplug handler. On the other hand, since
>> hotplug handlers also call __setup_per_zone_wmarks(), the zone_pcp_update()
>> calls there are now redundant and could be removed, no?
>> But later there will be a new sysctl in patch 6/6 using pcp_batch_high_lock,
>> thus that one will not be protected against the watermark related sysctl
>> handlers that reach here.
>> 
>> To solve all this, seems like the static lock in setup_per_zone_wmarks() could
>> become a top-level visible lock and pcp high/batch updates could switch to that
>> one instead of own pcp_batch_high_lock. And zone_pcp_update() calls from hotplug
>> handlers could be removed.
>> 
> 
> Hmm, the locking has very different hold times. The static lock in
> setup_per_zone_wmarks is a spinlock that protects against parallel updates
> of watermarks and is held for a short duration. The pcp_batch_high_lock
> is a mutex that is held for a relatively long time while memory is being
> offlined and can sleep. Memory hotplug updates the watermarks without
> pcp_batch_high_lock held so overall, unifying the locking there should
> be a separate series.
> 
> How about this as a fix for this patch?
> 
> ---8<---
> mm/page_alloc: Disassociate the pcp->high from pcp->batch -fix
> 
> Vlastimil Babka noted that __setup_per_zone_wmarks updating pcp->high
> did not protect watermark-related sysctl handlers from a parallel
> memory hotplug operations. This patch moves the PCP update to
> setup_per_zone_wmarks and updates the PCP high value while protected
> by the pcp_batch_high_lock mutex.
> 
> This is a fix to the mmotm patch mm-page_alloc-disassociate-the-pcp-high-from-pcp-batch.patch.
> It'll cause a conflict with mm-page_alloc-adjust-pcp-high-after-cpu-hotplug-events.patch
> but the resolution is simply to change the caller in setup_per_zone_wmarks
> to zone_pcp_update(zone, 0)
> 
> Signed-off-by: Mel Gorman <mgorman@techsingularity.net>

Looks fine. But I would also remove the redundancy introduced by this patch+fix,
as part of the patch:

online_pages()
  zone_pcp_update(zone); <- this predates the patch
  init_per_zone_wmark_min()
    setup_per_zone_wmarks()
      for_each_zone(zone)
         zone_pcp_update(zone); <- new in this patch

offline_pages() has the same duplicated call.
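
The call chain above can be modelled in userspace to make the duplication
visible. This is only a sketch, not kernel code: every toy_* name and the
zone count are made up for illustration, and the stand-in for
zone_pcp_update() merely counts how often it is reached for each zone.

```c
#include <assert.h>

#define TOY_NR_ZONES 3	/* hypothetical number of zones in the model */

static int toy_pcp_updates[TOY_NR_ZONES];

/* Stand-in for zone_pcp_update(): only counts calls per zone. */
static void toy_zone_pcp_update(int zone)
{
	assert(zone >= 0 && zone < TOY_NR_ZONES);
	toy_pcp_updates[zone]++;
}

/* Stand-in for setup_per_zone_wmarks() with the -fix applied. */
static void toy_setup_per_zone_wmarks(void)
{
	for (int zone = 0; zone < TOY_NR_ZONES; zone++)
		toy_zone_pcp_update(zone);	/* new in this patch */
}

static void toy_init_per_zone_wmark_min(void)
{
	toy_setup_per_zone_wmarks();
}

/* Stand-in for online_pages() acting on one zone. */
static void toy_online_pages(int zone)
{
	toy_zone_pcp_update(zone);	/* predates the patch */
	toy_init_per_zone_wmark_min();	/* reaches the new loop */
}
```

After a single toy_online_pages(1), the onlined zone has been updated twice
and every other zone once, which is the redundancy in question.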

In any case, for the fixed version:
Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/page_alloc.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 329b71e41db4..b1b3c66e9d88 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8199,12 +8199,6 @@ static void __setup_per_zone_wmarks(void)
>  		zone->_watermark[WMARK_LOW]  = min_wmark_pages(zone) + tmp;
>  		zone->_watermark[WMARK_HIGH] = min_wmark_pages(zone) + tmp * 2;
>  
> -		/*
> -		 * The watermark size have changed so update the pcpu batch
> -		 * and high limits or the limits may be inappropriate.
> -		 */
> -		zone_set_pageset_high_and_batch(zone);
> -
>  		spin_unlock_irqrestore(&zone->lock, flags);
>  	}
>  
> @@ -8221,11 +8215,19 @@ static void __setup_per_zone_wmarks(void)
>   */
>  void setup_per_zone_wmarks(void)
>  {
> +	struct zone *zone;
>  	static DEFINE_SPINLOCK(lock);
>  
>  	spin_lock(&lock);
>  	__setup_per_zone_wmarks();
>  	spin_unlock(&lock);
> +
> +	/*
> +	 * The watermark size have changed so update the pcpu batch
> +	 * and high limits or the limits may be inappropriate.
> +	 */
> +	for_each_zone(zone)
> +		zone_pcp_update(zone);
>  }
>  
>  /*
> 
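
The ordering in the fix, where the PCP update runs only after the spinlock is
dropped, can be sketched in userspace as follows. This is a simplified model
under stated assumptions, not the kernel implementation: the model_* names
are hypothetical, the "locks" are plain counters, and refusing the mutex
while a spinlock is held stands in for the rule that a sleeping lock must not
be taken in atomic context.

```c
#include <assert.h>
#include <stdbool.h>

static int model_atomic_depth;	/* > 0 while a modelled spinlock is held */
static bool model_mutex_held;	/* models pcp_batch_high_lock */
static int model_pcp_updates;

static void model_spin_lock(void)
{
	model_atomic_depth++;
}

static void model_spin_unlock(void)
{
	assert(model_atomic_depth > 0);
	model_atomic_depth--;
}

/* A mutex may sleep, so the model refuses it in atomic context. */
static bool model_mutex_lock(void)
{
	if (model_atomic_depth > 0)
		return false;
	model_mutex_held = true;
	return true;
}

static void model_mutex_unlock(void)
{
	model_mutex_held = false;
}

/* Stand-in for zone_pcp_update(): takes the mutex, then "updates". */
static bool model_zone_pcp_update(void)
{
	if (!model_mutex_lock())
		return false;
	assert(model_mutex_held);
	model_pcp_updates++;
	model_mutex_unlock();
	return true;
}

/* Ordering from the -fix: watermarks under the spinlock, PCP after it. */
static bool model_setup_per_zone_wmarks(int nr_zones)
{
	bool ok = true;

	model_spin_lock();
	/* the watermark recalculation would happen here */
	model_spin_unlock();

	for (int i = 0; i < nr_zones; i++)
		ok = model_zone_pcp_update() && ok;
	return ok;
}
```

In this model, calling model_zone_pcp_update() between model_spin_lock() and
model_spin_unlock() fails, while the ordering used by
model_setup_per_zone_wmarks() succeeds for every zone.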

