From: Vincent Guittot
Date: Mon, 8 Feb 2021 15:54:02 +0100
Subject: Re: [PATCH] mm, slub: better heuristic for number of cpus when calculating slab order
To: Vlastimil Babka
Cc: Catalin Marinas, Andrew Morton, aneesh.kumar@linux.ibm.com, Bharata B Rao,
    Christoph Lameter, guro@fb.com, Johannes Weiner, Joonsoo Kim, Jann Horn,
    linux-kernel, linux-mm@kvack.org, Michal Hocko, David Rientjes,
    Shakeel Butt, Will Deacon, Mel Gorman, "# v4.16+"
In-Reply-To: <20210208134108.22286-1-vbabka@suse.cz>
References: <20210208134108.22286-1-vbabka@suse.cz>
16+" Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 8 Feb 2021 at 14:41, Vlastimil Babka wrote: > > When creating a new kmem cache, SLUB determines how large the slab pages will > based on number of inputs, including the number of CPUs in the system. Larger > slab pages mean that more objects can be allocated/free from per-cpu slabs > before accessing shared structures, but also potentially more memory can be > wasted due to low slab usage and fragmentation. > The rough idea of using number of CPUs is that larger systems will be more > likely to benefit from reduced contention, and also should have enough memory > to spare. > > Number of CPUs used to be determined as nr_cpu_ids, which is number of possible > cpus, but on some systems many will never be onlined, thus commit 045ab8c9487b > ("mm/slub: let number of online CPUs determine the slub page order") changed it > to nr_online_cpus(). However, for kmem caches created early before CPUs are > onlined, this may lead to permamently low slab page sizes. > > Vincent reports a regression [1] of hackbench on arm64 systems: > > > I'm facing significant performances regression on a large arm64 server > > system (224 CPUs). Regressions is also present on small arm64 system > > (8 CPUs) but in a far smaller order of magnitude > > > On 224 CPUs system : 9 iterations of hackbench -l 16000 -g 16 > > v5.11-rc4 : 9.135sec (+/- 0.45%) > > v5.11-rc4 + revert this patch: 3.173sec (+/- 0.48%) > > v5.10: 3.136sec (+/- 0.40%) > > Mel reports a regression [2] of hackbench on x86_64, with lockstat suggesting > page allocator contention: > > > i.e. the patch incurs a 7% to 32% performance penalty. This bisected > > cleanly yesterday when I was looking for the regression and then found > > the thread. > > > Numerous caches change size. For example, kmalloc-512 goes from order-0 > > (vanilla) to order-2 with the revert. > > > So mostly this is down to the number of times SLUB calls into the page > > allocator which only caches order-0 pages on a per-cpu basis. > > Clearly num_online_cpus() doesn't work too early in bootup. We could change > the order dynamically in a memory hotplug callback, but runtime order changing > for existing kmem caches has been already shown as dangerous, and removed in > 32a6f409b693 ("mm, slub: remove runtime allocation order changes"). It could be > resurrected in a safe manner with some effort, but to fix the regression we > need something simpler. > > We could use num_present_cpus() that should be the number of physically present > CPUs even before they are onlined. That would for for PowerPC [3], which minor typo : "That would for for PowerPC" should be "That would work for PowerPC" ? > triggered the original commit, but that still doesn't work on arm64 [4] as > explained in [5]. > > So this patch tries to determine the best available value without specific arch > knowledge. > - num_present_cpus() if the number is larger than 1, as that means the arch is > likely setting it properly > - nr_cpu_ids otherwise > > This should fix the reported regressions while also keeping the effect of > 045ab8c9487b for PowerPC systems. It's possible there are configurations where > num_present_cpus() is 1 during boot while nr_cpu_ids is at the same time > bloated, so these (if they exist) would keep the large orders based on > nr_cpu_ids as was before 045ab8c9487b. 
>
> [1] https://lore.kernel.org/linux-mm/CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj7Rou=xzZg@mail.gmail.com/
> [2] https://lore.kernel.org/linux-mm/20210128134512.GF3592@techsingularity.net/
> [3] https://lore.kernel.org/linux-mm/20210123051607.GC2587010@in.ibm.com/
> [4] https://lore.kernel.org/linux-mm/CAKfTPtAjyVmS5VYvU6DBxg4-JEo5bdmWbngf-03YsY18cmWv_g@mail.gmail.com/
> [5] https://lore.kernel.org/linux-mm/20210126230305.GD30941@willie-the-truck/
>
> Fixes: 045ab8c9487b ("mm/slub: let number of online CPUs determine the slub page order")
> Reported-by: Vincent Guittot
> Reported-by: Mel Gorman
> Cc:
> Signed-off-by: Vlastimil Babka

Tested on both large and small arm64 systems. There is no regression with
this patch applied.

Tested-by: Vincent Guittot

> ---
>
> OK, this is a 5.11 regression, so we should try to fix it by 5.12. I've
> also Cc'd stable for that reason although it's not a crash fix.
> We can still try later to replace this with a safe order update in
> hotplug callbacks, but that's infeasible for 5.12.
>
>  mm/slub.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/mm/slub.c b/mm/slub.c
> index 176b1cb0d006..8fc9190e6cb3 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -3454,6 +3454,7 @@ static inline int calculate_order(unsigned int size)
>         unsigned int order;
>         unsigned int min_objects;
>         unsigned int max_objects;
> +       unsigned int nr_cpus;
>
>         /*
>          * Attempt to find best configuration for a slab. This
> @@ -3464,8 +3465,21 @@ static inline int calculate_order(unsigned int size)
>          * we reduce the minimum objects required in a slab.
>          */
>         min_objects = slub_min_objects;
> -       if (!min_objects)
> -               min_objects = 4 * (fls(num_online_cpus()) + 1);
> +       if (!min_objects) {
> +               /*
> +                * Some architectures will only update present cpus when
> +                * onlining them, so don't trust the number if it's just 1.
> +                * But we also don't want to use nr_cpu_ids always, as on
> +                * some other architectures, there can be many possible
> +                * cpus, but never onlined. Here we compromise between
> +                * trying to avoid too high order on systems that appear
> +                * larger than they are, and too low order on systems that
> +                * appear smaller than they are.
> +                */
> +               nr_cpus = num_present_cpus();
> +               if (nr_cpus <= 1)
> +                       nr_cpus = nr_cpu_ids;
> +               min_objects = 4 * (fls(nr_cpus) + 1);
> +       }
>         max_objects = order_objects(slub_max_order, size);
>         min_objects = min(min_objects, max_objects);
>
> --
> 2.30.0
>
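
As a side note, here is a quick user-space sketch (my own illustration,
not part of the patch) of how the min_objects formula that
calculate_order() keeps reacts to the CPU count it is given; fls_approx()
is a stand-in for the kernel's fls() helper:

/*
 * Illustration only, not kernel code: show how min_objects scales with
 * the number of CPUs that SLUB trusts at cache-creation time.
 */
#include <stdio.h>

/* Mimics the kernel's fls(): index (1-based) of the highest set bit. */
static unsigned int fls_approx(unsigned int x)
{
        unsigned int r = 0;

        while (x) {
                x >>= 1;
                r++;
        }
        return r;
}

int main(void)
{
        unsigned int cpus[] = { 1, 8, 224 };
        unsigned int i;

        for (i = 0; i < sizeof(cpus) / sizeof(cpus[0]); i++)
                printf("cpus=%3u -> min_objects=%u\n",
                       cpus[i], 4 * (fls_approx(cpus[i]) + 1));
        /*
         * cpus=  1 -> min_objects=8   (early num_online_cpus() on arm64)
         * cpus=  8 -> min_objects=20
         * cpus=224 -> min_objects=36  (num_present_cpus() / nr_cpu_ids)
         */
        return 0;
}

With min_objects at 8 instead of 36, caches such as kmalloc-512 end up at
order-0 rather than order-2, which lines up with Mel's observation about
the extra trips into the page allocator.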