From: Mel Gorman <firstname.lastname@example.org>
To: Vlastimil Babka <email@example.com>
Cc: Andrew Morton <firstname.lastname@example.org>,
	Hillf Danton <email@example.com>,
	Dave Hansen <firstname.lastname@example.org>,
	Michal Hocko <email@example.com>,
	LKML <firstname.lastname@example.org>,
	Linux-MM <email@example.com>
Subject: Re: [PATCH 6/6] mm/page_alloc: Introduce vm.percpu_pagelist_high_fraction
Date: Fri, 28 May 2021 13:53:35 +0100
Message-ID: <20210528125334.GP30378@techsingularity.net>
In-Reply-To: <firstname.lastname@example.org>

On Fri, May 28, 2021 at 01:59:37PM +0200, Vlastimil Babka wrote:
> On 5/25/21 10:01 AM, Mel Gorman wrote:
> > This introduces a new sysctl vm.percpu_pagelist_high_fraction. It is
> > similar to the old vm.percpu_pagelist_fraction. The old sysctl increased
> > both pcp->batch and pcp->high with the higher pcp->high potentially
> > reducing zone->lock contention. However, the higher pcp->batch value also
> > potentially increased allocation latency while the PCP was refilled.
> > This sysctl only adjusts pcp->high so that zone->lock contention is
> > potentially reduced but allocation latency during a PCP refill remains
> > the same.
> >
> >   # grep -E "high:|batch" /proc/zoneinfo | tail -2
> >     high: 649
> >     batch: 63
> >
> >   # sysctl vm.percpu_pagelist_high_fraction=8
> >   # grep -E "high:|batch" /proc/zoneinfo | tail -2
> >     high: 35071
> >     batch: 63
> >
> >   # sysctl vm.percpu_pagelist_high_fraction=64
> >     high: 4383
> >     batch: 63
> >
> >   # sysctl vm.percpu_pagelist_high_fraction=0
> >     high: 649
> >     batch: 63
> >
> > Signed-off-by: Mel Gorman <email@example.com>
> > Acked-by: Dave Hansen <firstname.lastname@example.org>
>
> Acked-by: Vlastimil Babka <email@example.com>
>

Thanks.

> Documentation nit below:
>
> > @@ -789,6 +790,25 @@ panic_on_oom=2+kdump gives you very strong tool to investigate
> >  why oom happens. You can get snapshot.
> >
> >
> > +percpu_pagelist_high_fraction
> > +=============================
> > +
> > +This is the fraction of pages in each zone that are allocated for each
> > +per cpu page list. The min value for this is 8. It means that we do
> > +not allow more than 1/8th of pages in each zone to be allocated in any
> > +single per_cpu_pagelist.
>
> This, while technically correct (as an upper limit) is somewhat misleading as
> the limit for a single per_cpu_pagelist also considers the number of local cpus.
>
> > This entry only changes the value of hot per
> > +cpu pagelists. User can specify a number like 100 to allocate 1/100th
> > +of each zone to each per cpu page list.
>
> This is worse. Anyone trying to reproduce this example on a system with multiple
> cpus per node and checking the result will be puzzled.
> So I think the part about number of local cpus should be mentioned to avoid
> confusion.
>

Is this any better?

diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index e85c2f21d209..2da25735a629 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -793,15 +793,16 @@ why oom happens. You can get snapshot.
 percpu_pagelist_high_fraction
 =============================
 
-This is the fraction of pages in each zone that are allocated for each
-per cpu page list. The min value for this is 8. It means that we do
-not allow more than 1/8th of pages in each zone to be allocated in any
-single per_cpu_pagelist. This entry only changes the value of hot per
-cpu pagelists. User can specify a number like 100 to allocate 1/100th
-of each zone to each per cpu page list.
-
-The batch value of each per cpu pagelist remains the same regardless of the
-value of the high fraction so allocation latencies are unaffected.
+This is the fraction of pages in each zone that are can be stored to
+per-cpu page lists. It is an upper boundary that is divided depending
+on the number of online CPUs. The min value for this is 8 which means
+that we do not allow more than 1/8th of pages in each zone to be stored
+on per-cpu page lists. This entry only changes the value of hot per-cpu
+page lists. A user can specify a number like 100 to allocate 1/100th of
+each zone between per-cpu lists.
+
+The batch value of each per-cpu page list remains the same regardless of
+the value of the high fraction so allocation latencies are unaffected.
 
 The initial value is zero. Kernel uses this value to set the high pcp->high
 mark based on the low watermark for the zone and the number of local

-- 
Mel Gorman
SUSE Labs
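
The revised wording describes the fraction as an upper boundary that is then divided between CPUs. A minimal sketch of that arithmetic follows, purely as an illustration and not the kernel's implementation. The function name `pcp_high` and its parameters are hypothetical, the sample page counts are invented, and the model assumes plain integer division with no further clamping, which the real allocator may well apply.

```python
def pcp_high(managed_pages, low_wmark_pages, fraction, nr_local_cpus):
    """Rough model of pcp->high for one zone (illustrative, not kernel code).

    managed_pages   -- pages managed by the zone (hypothetical input)
    low_wmark_pages -- the zone's low watermark in pages (hypothetical input)
    fraction        -- value of vm.percpu_pagelist_high_fraction
    nr_local_cpus   -- CPUs sharing the zone's per-cpu budget
    """
    if nr_local_cpus < 1:
        raise ValueError("need at least one CPU")
    if fraction == 0:
        # Initial value: the budget is based on the zone's low watermark.
        total_pages = low_wmark_pages
    elif fraction < 8:
        # The documentation states a minimum of 8 (at most 1/8th of a zone).
        raise ValueError("minimum non-zero fraction is 8")
    else:
        # Upper boundary: 1/fraction of the zone across all per-cpu lists.
        total_pages = managed_pages // fraction
    # The boundary is divided between the CPUs; pcp->batch is not touched.
    return total_pages // nr_local_cpus


# Invented example: a zone of 1,000,000 pages shared by 4 local CPUs.
for fraction in (0, 8, 64):
    print(fraction, pcp_high(1_000_000, 10_000, fraction, 4))
```

Under this model, a setting of 100 on a node with several CPUs gives each per-cpu list noticeably less than 1/100th of the zone, which is the confusion the documentation change is meant to avoid.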