From: Vincent Guittot <vincent.guittot@linaro.org>
To: Bharata B Rao <bharata@linux.ibm.com>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
linux-mm@kvack.org, Christoph Lameter <cl@linux.com>,
David Rientjes <rientjes@google.com>,
Joonsoo Kim <iamjoonsoo.kim@lge.com>,
Andrew Morton <akpm@linux-foundation.org>,
guro@fb.com, vbabka@suse.cz, shakeelb@google.com,
Johannes Weiner <hannes@cmpxchg.org>,
aneesh.kumar@linux.ibm.com
Subject: Re: [RFC PATCH v0] mm/slub: Let number of online CPUs determine the slub page order
Date: Wed, 20 Jan 2021 18:36:31 +0100 [thread overview]
Message-ID: <CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj7Rou=xzZg@mail.gmail.com> (raw)
In-Reply-To: <20201118082759.1413056-1-bharata@linux.ibm.com>
Hi,
On Wed, 18 Nov 2020 at 09:28, Bharata B Rao <bharata@linux.ibm.com> wrote:
>
> The page order of the slab that gets chosen for a given slab
> cache depends on the number of objects that can be fit in the
> slab while meeting other requirements. We start with a value
> of minimum objects based on nr_cpu_ids that is driven by
> possible number of CPUs and hence could be higher than the
> actual number of CPUs present in the system. This leads to
> calculate_order() chosing a page order that is on the higher
> side leading to increased slab memory consumption on systems
> that have bigger page sizes.
>
> Hence rely on the number of online CPUs when determining the
> mininum objects, thereby increasing the chances of chosing
> a lower conservative page order for the slab.
>
> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
> ---
> This is a generic change and I am unsure how it would affect
> other archs, but as a start, here are some numbers from
> PowerPC pseries KVM guest with and without this patch:
>
> This table shows how this change has affected some of the slab
> caches.
> ===================================================================
> Current Patched
> Cache <objperslab> <pagesperslab> <objperslab> <pagesperslab>
> ===================================================================
> TCPv6 53 2 26 1
> net_namespace 53 4 26 2
> dtl 32 2 16 1
> names_cache 32 2 16 1
> task_struct 53 8 13 2
> thread_stack 32 8 8 2
> pgtable-2^11 16 8 8 4
> pgtable-2^8 32 2 16 1
> kmalloc-32k 16 8 8 4
> kmalloc-16k 32 8 8 2
> kmalloc-8k 32 4 8 1
> kmalloc-4k 32 2 16 1
> ===================================================================
>
> Slab memory (kB) consumption comparision
> ==================================================================
> Current Patched
> ==================================================================
> After-boot 205760 156096
> During-hackbench 629145 506752 (Avg of 5 runs)
> After-hackbench 474176 331840 (after drop_caches)
> ==================================================================
>
> Hackbench Time (Avg of 5 runs)
> (hackbench -s 1024 -l 200 -g 200 -f 25 -P)
> ==========================================
> Current Patched
> ==========================================
> 10.990 11.010
> ==========================================
>
> Measuring the effect due to CPU hotplug
> ----------------------------------------
> Since the patch doesn't consider all the possible CPUs for page
> order calcluation, let's see how affects the case when CPUs are
> hotplugged. Here I compare a system that is booted with 64CPUs
> with a system that is booted with 16CPUs but hotplugged with
> 48CPUs after boot. These numbers are with the patch applied.
>
> Slab memory (kB) consumption comparision
> ===================================================================
> 64bootCPUs 16bootCPUs+48HotPluggedCPUs
> ===================================================================
> After-boot 390272 159744
> After-hotplug - 251328
> During-hackbench 1001267 941926 (Avg of 5 runs)
> After-hackbench 913600 827200 (after drop_caches)
> ===================================================================
>
> Hackbench Time (Avg of 5 runs)
> (hackbench -s 1024 -l 200 -g 200 -f 25 -P)
> ===========================================
> 64bootCPUs 16bootCPUs+48HotPluggedCPUs
> ===========================================
> 12.554 12.589
> ===========================================
> mm/slub.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
I'm facing significant performances regression on a large arm64 server
system (224 CPUs). Regressions is also present on small arm64 system
(8 CPUs) but in a far smaller order of magnitude
On 224 CPUs system : 9 iterations of hackbench -l 16000 -g 16
v5.11-rc4 : 9.135sec (+/- 0.45%)
v5.11-rc4 + revert this patch: 3.173sec (+/- 0.48%)
v5.10: 3.136sec (+/- 0.40%)
This is a 191% regression compared to v5.10.
The problem is that calculate_order() is called a number of times
before secondaries CPUs are booted and it returns 1 instead of 224.
This makes the use of num_online_cpus() irrelevant for those cases
After adding in my command line "slub_min_objects=36" which equals to
4 * (fls(num_online_cpus()) + 1) with a correct num_online_cpus == 224
, the regression diseapears:
9 iterations of hackbench -l 16000 -g 16: 3.201sec (+/- 0.90%)
> diff --git a/mm/slub.c b/mm/slub.c
> index 34dcc09e2ec9..8342c0a167b2 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3433,7 +3433,7 @@ static inline int calculate_order(unsigned int size)
> */
> min_objects = slub_min_objects;
> if (!min_objects)
> - min_objects = 4 * (fls(nr_cpu_ids) + 1);
> + min_objects = 4 * (fls(num_online_cpus()) + 1);
> max_objects = order_objects(slub_max_order, size);
> min_objects = min(min_objects, max_objects);
>
> --
> 2.26.2
>
next prev parent reply other threads:[~2021-01-20 17:36 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-18 8:27 [RFC PATCH v0] mm/slub: Let number of online CPUs determine the slub page order Bharata B Rao
2020-11-18 11:25 ` Vlastimil Babka
2020-11-18 19:34 ` Roman Gushchin
2020-11-18 19:53 ` David Rientjes
2021-01-20 17:36 ` Vincent Guittot [this message]
2021-01-21 5:30 ` Bharata B Rao
2021-01-21 9:09 ` Vincent Guittot
2021-01-21 10:01 ` Christoph Lameter
2021-01-21 10:48 ` Vincent Guittot
2021-01-21 18:19 ` Vlastimil Babka
2021-01-22 8:03 ` Vincent Guittot
2021-01-22 12:03 ` Vlastimil Babka
2021-01-22 13:16 ` Vincent Guittot
2021-01-23 5:16 ` Bharata B Rao
2021-01-23 12:32 ` Vincent Guittot
2021-01-25 11:20 ` Vlastimil Babka
2021-01-26 23:03 ` Will Deacon
2021-01-27 9:10 ` Christoph Lameter
2021-01-27 11:04 ` Vlastimil Babka
2021-02-03 11:10 ` Bharata B Rao
2021-02-04 7:32 ` Vincent Guittot
2021-02-04 9:07 ` Christoph Lameter
2021-02-04 9:33 ` Vlastimil Babka
2021-02-08 13:41 ` [PATCH] mm, slub: better heuristic for number of cpus when calculating slab order Vlastimil Babka
2021-02-08 14:54 ` Vincent Guittot
2021-02-10 14:07 ` Mel Gorman
2021-01-22 13:05 ` [RFC PATCH v0] mm/slub: Let number of online CPUs determine the slub page order Jann Horn
2021-01-22 13:09 ` Jann Horn
2021-01-22 15:27 ` Vlastimil Babka
2021-01-25 4:28 ` Bharata B Rao
2021-01-26 8:52 ` Michal Hocko
2021-01-26 13:38 ` Vincent Guittot
2021-01-26 13:59 ` Michal Hocko
2021-01-28 13:45 ` Mel Gorman
2021-01-28 13:57 ` Michal Hocko
2021-01-28 14:42 ` Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj7Rou=xzZg@mail.gmail.com' \
--to=vincent.guittot@linaro.org \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@linux.ibm.com \
--cc=bharata@linux.ibm.com \
--cc=cl@linux.com \
--cc=guro@fb.com \
--cc=hannes@cmpxchg.org \
--cc=iamjoonsoo.kim@lge.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=rientjes@google.com \
--cc=shakeelb@google.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).