From: Arnd Bergmann <arnd@arndb.de>
To: Libo Chen <libo.chen@oracle.com>
Cc: Arnd Bergmann <arnd@arndb.de>,
	Randy Dunlap <rdunlap@infradead.org>,
	gregkh <gregkh@linuxfoundation.org>,
	Masahiro Yamada <masahiroy@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>, Vlastimil Babka <vbabka@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux Kbuild mailing list <linux-kbuild@vger.kernel.org>,
	Linux ARM <linux-arm-kernel@lists.infradead.org>,
	linux-arch <linux-arch@vger.kernel.org>
Subject: Re: [PATCH RESEND 1/1] lib/Kconfig: remove DEBUG_PER_CPU_MAPS dependency for CPUMASK_OFFSTACK
Date: Thu, 14 Apr 2022 13:41:11 +0200
Message-ID: <CAK8P3a0uy8JcHP_G_ebz61AMB-Mx6jr5+vuzJHmWbDCajTdTfQ@mail.gmail.com>
In-Reply-To: <ce420ed3-4a36-122f-460d-8cccd0310033@oracle.com>

On Wed, Apr 13, 2022 at 11:50 PM Libo Chen <libo.chen@oracle.com> wrote:
> On 4/13/22 13:52, Arnd Bergmann wrote:
> >>> Yes, it is. I don't know what the problem is...
> >> Masahiro explained that CPUMASK_OFFSTACK can only be configured by
> >> options not users if DEBUG_PER_CPU_MAPS is not enabled. This doesn't
> >> seem to be what we want.
> > I think the correct way to do it is to follow x86 and powerpc, and tie
> > CPUMASK_OFFSTACK to "large" values of CONFIG_NR_CPUS.
> > For smaller values of NR_CPUS, the onstack masks are obviously
> > cheaper, we just need to decide what the cut-off point is.
>
> I agree. It appears enabling CPUMASK_OFFSTACK breaks kernel builds on
> some architectures such as parisc and nios2 as reported by kernel test
> robot. Maybe it makes sense to use DEBUG_PER_CPU_MAPS as some kind of
> guard on CPUMASK_OFFSTACK.

NIOS2 does not support SMP builds at all, so it should never be possible to
select CPUMASK_OFFSTACK there. We may want to guard
DEBUG_PER_CPU_MAPS by adding a 'depends on SMP' in order to
prevent it from getting selected.

For PARISC, the largest configuration is 32-way SMP, so CPUMASK_OFFSTACK
is clearly pointless there as well, even though it should technically be
possible to support. What is the build error on parisc?

> > In x86, the onstack masks can be selected for normal SMP builds with
> > up to 512 CPUs, while CONFIG_MAXSMP=y raises the limit to 8192
> > CPUs while selecting CPUMASK_OFFSTACK.
> > PowerPC does it the other way round, selecting CPUMASK_OFFSTACK
> > implicitly whenever NR_CPUS is set to 8192 or more.
> >
> > I think we can easily do the same as powerpc on arm64. With the
> I am leaning more towards x86's way because even NR_CPUS=160 is too
> expensive for 4-core arm64 VMs according to apachebench. I highly doubt
> that there is a good cut-off point to make everybody happy (or not unhappy).

It seems surprising that you would see any improvement for offstack masks
when using NR_CPUS=160: an on-stack mask of that size is just three 64-bit
words worth of data, whereas the offstack variant requires allocating the
mask dynamically, which takes way more memory to initialize.
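
To make the arithmetic concrete, here is a minimal userspace sketch (not
kernel code) of how many words an on-stack mask occupies for the NR_CPUS
values mentioned in this thread. It assumes 64-bit longs; mask_words(),
mask_bytes() and check_thread_examples() are hypothetical names that only
mirror the rounding done by the kernel's BITS_TO_LONGS().

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-bit longs, as on arm64 and x86-64. */
#define WORD_BITS 64u

/* Round a CPU count up to whole words, like BITS_TO_LONGS() does. */
unsigned int mask_words(unsigned int nr_cpus)
{
	return (nr_cpus + WORD_BITS - 1) / WORD_BITS;
}

/* Bytes an on-stack mask of that size would occupy. */
size_t mask_bytes(unsigned int nr_cpus)
{
	return (size_t)mask_words(nr_cpus) * (WORD_BITS / 8);
}

/* Worked values for the NR_CPUS settings discussed above. */
void check_thread_examples(void)
{
	assert(mask_words(160) == 3);     /* 160 bits round up to three words */
	assert(mask_bytes(160) == 24);
	assert(mask_bytes(512) == 64);    /* x86 limit without MAXSMP */
	assert(mask_bytes(8192) == 1024); /* x86 MAXSMP / powerpc threshold */
}
```

With these assumptions, NR_CPUS=160 costs 24 bytes of stack per mask,
while NR_CPUS=8192 costs 1024 bytes, which is where keeping masks on the
stack starts to look expensive.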

> > ApacheBench test you cite in the patch description, what is the
> > value of NR_CPUS at which you start seeing a noticeable
> > benefit for offstack masks? Can you do the same test for
> > NR_CPUS=1024 or 2048?
>
> As mentioned above, a good cut-off point depends on the actual
> number of CPUs. But yeah I can do the same test for 1024 or even smaller
> NR_CPUS values on the same 64-core arm64 VM setup.

If you see an improvement for small NR_CPUS values using offstack masks,
it's possible that the actual difference is something completely different
and we can just make the on-stack case faster. Possibly the cause is
something about cacheline alignment or inlining decisions using your
specific kernel config.

Are you able to compare the 'perf report' output between runs with either
size to see where the extra time gets spent?

        Arnd
