From: Bharata B Rao <bharata@linux.ibm.com>
To: Vincent Guittot <vincent.guittot@linaro.org>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, Christoph Lameter <cl@linux.com>,
	David Rientjes <rientjes@google.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	guro@fb.com, vbabka@suse.cz, shakeelb@google.com,
	Johannes Weiner <hannes@cmpxchg.org>,
	aneesh.kumar@linux.ibm.com
Subject: Re: [RFC PATCH v0] mm/slub: Let number of online CPUs determine the slub page order
Date: Thu, 21 Jan 2021 11:00:03 +0530
Message-ID: <20210121053003.GB2587010@in.ibm.com>
In-Reply-To: <CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj7Rou=xzZg@mail.gmail.com>

On Wed, Jan 20, 2021 at 06:36:31PM +0100, Vincent Guittot wrote:
> Hi,
> 
> On Wed, 18 Nov 2020 at 09:28, Bharata B Rao <bharata@linux.ibm.com> wrote:
> >
> > The page order of the slab that gets chosen for a given slab
> > cache depends on the number of objects that can be fit in the
> > slab while meeting other requirements. We start with a value
> > of minimum objects based on nr_cpu_ids that is driven by
> > possible number of CPUs and hence could be higher than the
> > actual number of CPUs present in the system. This leads to
> > calculate_order() choosing a page order that is on the higher
> > side leading to increased slab memory consumption on systems
> > that have bigger page sizes.
> >
> > Hence rely on the number of online CPUs when determining the
> > minimum objects, thereby increasing the chances of choosing
> > a lower conservative page order for the slab.
> >
> > Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
> > ---
> > This is a generic change and I am unsure how it would affect
> > other archs, but as a start, here are some numbers from
> > PowerPC pseries KVM guest with and without this patch:
> >
> > This table shows how this change has affected some of the slab
> > caches.
> > ===================================================================
> >                 Current                         Patched
> > Cache   <objperslab> <pagesperslab>     <objperslab> <pagesperslab>
> > ===================================================================
> > TCPv6           53    2                 26    1
> > net_namespace   53    4                 26    2
> > dtl             32    2                 16    1
> > names_cache     32    2                 16    1
> > task_struct     53    8                 13    2
> > thread_stack    32    8                 8     2
> > pgtable-2^11    16    8                 8     4
> > pgtable-2^8     32    2                 16    1
> > kmalloc-32k     16    8                 8     4
> > kmalloc-16k     32    8                 8     2
> > kmalloc-8k      32    4                 8     1
> > kmalloc-4k      32    2                 16    1
> > ===================================================================
> >
> > Slab memory (kB) consumption comparison
> > ==================================================================
> >                         Current         Patched
> > ==================================================================
> > After-boot              205760          156096
> > During-hackbench        629145          506752 (Avg of 5 runs)
> > After-hackbench         474176          331840 (after drop_caches)
> > ==================================================================
> >
> > Hackbench Time (Avg of 5 runs)
> > (hackbench -s 1024 -l 200 -g 200 -f 25 -P)
> > ==========================================
> > Current         Patched
> > ==========================================
> > 10.990          11.010
> > ==========================================
> >
> > Measuring the effect due to CPU hotplug
> > ----------------------------------------
> > Since the patch doesn't consider all the possible CPUs for page
> > order calculation, let's see how it affects the case when CPUs are
> > hotplugged. Here I compare a system that is booted with 64 CPUs
> > with a system that is booted with 16 CPUs but hotplugged with
> > 48 CPUs after boot. These numbers are with the patch applied.
> >
> > Slab memory (kB) consumption comparison
> > ===================================================================
> >                         64bootCPUs      16bootCPUs+48HotPluggedCPUs
> > ===================================================================
> > After-boot              390272          159744
> > After-hotplug           -               251328
> > During-hackbench        1001267         941926 (Avg of 5 runs)
> > After-hackbench         913600          827200 (after drop_caches)
> > ===================================================================
> >
> > Hackbench Time (Avg of 5 runs)
> > (hackbench -s 1024 -l 200 -g 200 -f 25 -P)
> > ===========================================
> > 64bootCPUs      16bootCPUs+48HotPluggedCPUs
> > ===========================================
> > 12.554          12.589
> > ===========================================
> >  mm/slub.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> 
> I'm seeing a significant performance regression on a large arm64 server
> system (224 CPUs). The regression is also present on a small arm64 system
> (8 CPUs), but it is far smaller in magnitude.
> 
> On the 224-CPU system: 9 iterations of hackbench -l 16000 -g 16
> v5.11-rc4 : 9.135sec (+/- 0.45%)
> v5.11-rc4 + revert this patch: 3.173sec (+/- 0.48%)
> v5.10: 3.136sec (+/- 0.40%)
> 
> This is a 191% regression compared to v5.10.
> 
> The problem is that calculate_order() is called a number of times
> before the secondary CPUs are booted, when num_online_cpus() returns 1
> instead of 224. This makes the use of num_online_cpus() irrelevant for
> those cases.
> 
> After adding "slub_min_objects=36" to my command line, which equals
> 4 * (fls(num_online_cpus()) + 1) with the correct num_online_cpus == 224,
> the regression disappears:
> 
> 9 iterations of hackbench -l 16000 -g 16: 3.201sec (+/- 0.90%)

Should we have switched to num_present_cpus() rather than
num_online_cpus()? If so, the patch below should address the
problem described above.

From 252b332ccbee7152da1e18f1fff5b83f8e01b8df Mon Sep 17 00:00:00 2001
From: Bharata B Rao <bharata@linux.ibm.com>
Date: Thu, 21 Jan 2021 10:35:08 +0530
Subject: [PATCH] mm/slub: let number of present CPUs determine the slub
 page order

Commit 045ab8c9487b ("mm/slub: let number of online CPUs determine
the slub page order") changed the slub page order to depend on
num_online_cpus() from nr_cpu_ids. However we find that certain
caches (kmalloc) are initialized even before the secondary CPUs
are onlined resulting in lower slub page order and subsequent
regression.

Switch to num_present_cpus() instead.

Reported-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Fixes: 045ab8c9487b ("mm/slub: let number of online CPUs determine the slub page order")
---
 mm/slub.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index d9e4e10683cc..2f3e412c849d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3433,7 +3433,7 @@ static inline int calculate_order(unsigned int size)
 	 */
 	min_objects = slub_min_objects;
 	if (!min_objects)
-		min_objects = 4 * (fls(num_online_cpus()) + 1);
+		min_objects = 4 * (fls(num_present_cpus()) + 1);
 	max_objects = order_objects(slub_max_order, size);
 	min_objects = min(min_objects, max_objects);
 
-- 
2.26.2





