From: Mel Gorman <mgorman@techsingularity.net>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Hillf Danton <hdanton@sina.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Michal Hocko <mhocko@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH 6/6] mm/page_alloc: Introduce vm.percpu_pagelist_high_fraction
Date: Fri, 28 May 2021 13:53:35 +0100
Message-ID: <20210528125334.GP30378@techsingularity.net>
In-Reply-To: <018c4b99-81a5-bc12-03cd-662a938ef05a@suse.cz>

On Fri, May 28, 2021 at 01:59:37PM +0200, Vlastimil Babka wrote:
> On 5/25/21 10:01 AM, Mel Gorman wrote:
> > This introduces a new sysctl vm.percpu_pagelist_high_fraction. It is
> > similar to the old vm.percpu_pagelist_fraction. The old sysctl increased
> > both pcp->batch and pcp->high with the higher pcp->high potentially
> > reducing zone->lock contention. However, the higher pcp->batch value also
> > potentially increased allocation latency while the PCP was refilled.
> > This sysctl only adjusts pcp->high so that zone->lock contention is
> > potentially reduced but allocation latency during a PCP refill remains
> > the same.
> > 
> >   # grep -E "high:|batch" /proc/zoneinfo | tail -2
> >               high:  649
> >               batch: 63
> > 
> >   # sysctl vm.percpu_pagelist_high_fraction=8
> >   # grep -E "high:|batch" /proc/zoneinfo | tail -2
> >               high:  35071
> >               batch: 63
> > 
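> >   # sysctl vm.percpu_pagelist_high_fraction=64
> >   # grep -E "high:|batch" /proc/zoneinfo | tail -2
> >               high:  4383
> >               batch: 63
> > 
> >   # sysctl vm.percpu_pagelist_high_fraction=0
> >   # grep -E "high:|batch" /proc/zoneinfo | tail -2
> >               high:  649
> >               batch: 63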
> > 
> > Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> > Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> 

Thanks.

> Documentation nit below:
> 
> > @@ -789,6 +790,25 @@ panic_on_oom=2+kdump gives you very strong tool to investigate
> >  why oom happens. You can get snapshot.
> >  
> >  
> > +percpu_pagelist_high_fraction
> > +=============================
> > +
> > +This is the fraction of pages in each zone that are allocated for each
> > +per cpu page list.  The min value for this is 8.  It means that we do
> > +not allow more than 1/8th of pages in each zone to be allocated in any
> > +single per_cpu_pagelist.
> 
> This, while technically correct (as an upper limit) is somewhat misleading as
> the limit for a single per_cpu_pagelist also considers the number of local cpus.
> 
> >  This entry only changes the value of hot per
> > +cpu pagelists. User can specify a number like 100 to allocate 1/100th
> > +of each zone to each per cpu page list.
> 
> This is worse. Anyone trying to reproduce this example on a system with multiple
> cpus per node and checking the result will be puzzled.
> So I think the part about number of local cpus should be mentioned to avoid
> confusion.
> 

Is this any better?

diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index e85c2f21d209..2da25735a629 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -793,15 +793,16 @@ why oom happens. You can get snapshot.
 percpu_pagelist_high_fraction
 =============================
 
-This is the fraction of pages in each zone that are allocated for each
-per cpu page list.  The min value for this is 8.  It means that we do
-not allow more than 1/8th of pages in each zone to be allocated in any
-single per_cpu_pagelist.  This entry only changes the value of hot per
-cpu pagelists. User can specify a number like 100 to allocate 1/100th
-of each zone to each per cpu page list.
-
-The batch value of each per cpu pagelist remains the same regardless of the
-value of the high fraction so allocation latencies are unaffected.
+This is the fraction of pages in each zone that can be stored on
+per-cpu page lists. It is an upper boundary that is divided depending
+on the number of online CPUs. The min value for this is 8 which means
+that we do not allow more than 1/8th of pages in each zone to be stored
+on per-cpu page lists. This entry only changes the value of hot per-cpu
+page lists. A user can specify a number like 100 to allow 1/100th of
+each zone to be split between per-cpu lists.
+
+The batch value of each per-cpu page list remains the same regardless of
+the value of the high fraction so allocation latencies are unaffected.
 
 The initial value is zero. Kernel uses this value to set the high pcp->high
 mark based on the low watermark for the zone and the number of local
-- 
Mel Gorman
SUSE Labs
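
To make the "upper boundary that is divided depending on the number of
CPUs" wording concrete, here is a rough model of the arithmetic in Python.
It is only a sketch based on the description in this thread, not the kernel
code: the function name, the parameter names and the zone sizes in the
comments are hypothetical, and the floor of four batches is an assumption.

```python
def pcp_high(managed_pages, low_wmark_pages, high_fraction, nr_local_cpus, batch):
    """Approximate per-CPU pcp->high for one zone (illustrative model only)."""
    if high_fraction == 0:
        # Sysctl left at its initial value: the budget is derived from the
        # zone's low watermark.
        total_pages = low_wmark_pages
    else:
        # Sysctl set: at most 1/high_fraction of the zone's pages in total.
        total_pages = managed_pages // high_fraction

    # The budget is an upper boundary that is split between the CPUs local
    # to the zone, so a single list gets far less than 1/high_fraction.
    nr_split_cpus = max(nr_local_cpus, 1)
    high = total_pages // nr_split_cpus

    # Assumed floor (not stated in this thread): keep room for a few batches.
    return max(high, batch * 4)


# Hypothetical zone: 4,000,000 managed pages, low watermark of 10,000 pages,
# 16 local CPUs, batch of 63.
print(pcp_high(4_000_000, 10_000, 100, 16, 63))  # 1/100th of the zone, split 16 ways
print(pcp_high(4_000_000, 10_000, 0, 16, 63))    # default, based on the low watermark
```

With a fraction of 100 each list in this made-up zone is capped at 2500
pages rather than 40000, which is the confusion the documentation change is
meant to avoid.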

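The grep in the changelog only shows the last pageset in /proc/zoneinfo.
For checking every per-CPU list after changing the sysctl, something along
these lines may be handy. It is a minimal sketch: the helper name and the
sample text are made up, and it assumes the pageset lines use the "high:"
and "batch:" spelling shown in the changelog output.

```python
import re


def pcp_limits(zoneinfo_text):
    """Return one (high, batch) pair per per-CPU pageset found in the text."""
    # Zone watermark lines ("high     16500") have no colon and are skipped.
    highs = re.findall(r"^\s*high:\s+(\d+)", zoneinfo_text, re.MULTILINE)
    batches = re.findall(r"^\s*batch:\s+(\d+)", zoneinfo_text, re.MULTILINE)
    return [(int(h), int(b)) for h, b in zip(highs, batches)]


# Hypothetical excerpt in the layout of /proc/zoneinfo.
SAMPLE = """\
        min      11000
        low      13750
        high     16500
    cpu: 0
              count: 12
              high:  649
              batch: 63
    cpu: 1
              count: 0
              high:  649
              batch: 63
"""

print(pcp_limits(SAMPLE))
```

On a live system the same function can be fed the contents of
/proc/zoneinfo instead of the sample string.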