linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH ARM64]: Introduce CONFIG_MAXSMP to allow up to 512 cpus
@ 2023-11-21  1:04 Christoph Lameter (Ampere)
  2023-11-23 19:33 ` Catalin Marinas
  2023-11-28  6:40 ` Anshuman Khandual
  0 siblings, 2 replies; 6+ messages in thread
From: Christoph Lameter (Ampere) @ 2023-11-21  1:04 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, Anshuman.Khandual, Valentin.Schneider,
	Vanshidhar Konda, Jonathan Cameron, Catalin Marinas,
	Robin Murphy, Dave Kleikamp, Matteo Carlini

Ampere Computing develops high end ARM processors that support an ever
increasing number of processors. The current default of 256 processors is
not enough for our newer products. The default is used by Linux
distros and therefore our customers cannot use distro kernels because
the number of processors is not supported.

The x86 arch has support for a "CONFIG_MAXSMP" configuration option that
enables support for the largest known configurations. This usually means
hundreds or thousands of processors. For those sizes it is no longer
practical to allocate bitmaps of cpus on the kernel stack. There is
a kernel option CONFIG_CPUMASK_OFFSTACK that makes the kernel allocate
and free bitmaps for cpu masks from slab memory instead of keeping it
on the stack etc.

With that it becomes possible to dynamically size the allocation of
the bitmap depending on the quantity of processors detected on
bootup.

This patch enables that logic if CONFIG_MAXSMP is enabled.

If CONFIG_MAXSMP is disabled then a default of 64 processors
is supported. A bitmap for 64 processors fits into one word and
therefore can be efficiently handled on the stack. Using a pointer
to a bitmap would be overkill.

The number of processors can be manually configured if
CONFIG_MAXSMP is not set.

Currently the default for CONFIG_MAXSMP is 512 processors.
This will have to be increased if ARM processor vendors start
supporting more processors.

Signed-off-by: Christoph Lameter (Ampere) <cl@linux.com>

---
NR_CPU limits on ARM64 were discussed before at
https://lore.kernel.org/all/20210110053615.3594358-1-vanshikonda@os.amperecomputing.com/


Index: linux/arch/arm64/Kconfig
===================================================================
--- linux.orig/arch/arm64/Kconfig
+++ linux/arch/arm64/Kconfig
@@ -1402,10 +1402,56 @@ config SCHED_SMT
   	  MultiThreading at a cost of slightly increased overhead in some
   	  places. If unsure say N here.

+
+config MAXSMP
+	bool "Compile kernel with support for the maximum number of SMP Processors"
+	depends on SMP && DEBUG_KERNEL
+	select CPUMASK_OFFSTACK
+	help
+	  Enable maximum number of CPUS and NUMA Nodes for this architecture.
+	  If unsure, say N.
+
+#
+# The maximum number of CPUs supported:
+#
+# The main config value is NR_CPUS, which defaults to NR_CPUS_DEFAULT,
+# and which can be configured interactively in the
+# [NR_CPUS_RANGE_BEGIN ... NR_CPUS_RANGE_END] range.
+#
+# ( If MAXSMP is enabled we just use the highest possible value and disable
+#   interactive configuration. )
+#
+
+config NR_CPUS_RANGE_BEGIN
+	int
+	default NR_CPUS_RANGE_END if MAXSMP
+	default    1 if !SMP
+	default    2
+
+config NR_CPUS_RANGE_END
+	int
+	default 8192 if  SMP && CPUMASK_OFFSTACK
+	default  512 if  SMP && !CPUMASK_OFFSTACK
+	default    1 if !SMP
+
+config NR_CPUS_DEFAULT
+	int
+	default  512 if  MAXSMP
+	default   64 if  SMP
+	default    1 if !SMP
+
   config NR_CPUS
-	int "Maximum number of CPUs (2-4096)"
-	range 2 4096
-	default "256"
+	int "Set maximum number of CPUs" if SMP && !MAXSMP
+	range NR_CPUS_RANGE_BEGIN NR_CPUS_RANGE_END
+	default NR_CPUS_DEFAULT
+	help
+	  This allows you to specify the maximum number of CPUs which this
+	  kernel will support.  If CPUMASK_OFFSTACK is enabled, the maximum
+	  supported value is 8192, otherwise the maximum value is 512.  The
+	  minimum value which makes sense is 2.
+
+	  This is purely to save memory: each supported CPU adds about 8KB
+	  to the kernel image.

   config HOTPLUG_CPU
   	bool "Support for hot-pluggable CPUs"
Index: linux/arch/arm64/configs/defconfig
===================================================================
--- linux.orig/arch/arm64/configs/defconfig
+++ linux/arch/arm64/configs/defconfig
@@ -15,6 +15,7 @@ CONFIG_TASK_IO_ACCOUNTING=y
   CONFIG_IKCONFIG=y
   CONFIG_IKCONFIG_PROC=y
   CONFIG_NUMA_BALANCING=y
+CONFIG_MAXSMP=y
   CONFIG_MEMCG=y
   CONFIG_BLK_CGROUP=y
   CONFIG_CGROUP_PIDS=y

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
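
The point in the patch description that a bitmap for 64 processors fits
into one word can be checked with a little arithmetic. The following is
a minimal userspace sketch, not kernel code: it assumes a 64-bit word as
on arm64, and mask_bytes() is a hypothetical helper introduced only for
this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: one machine word is 64 bits wide, as on arm64. */
#define WORD_BITS 64u

/*
 * Bytes needed for a bitmap holding one bit per possible CPU, rounded
 * up to whole words. Hypothetical helper, not a kernel API.
 */
size_t mask_bytes(unsigned int nr_cpus)
{
	assert(nr_cpus >= 1);
	size_t words = (nr_cpus + WORD_BITS - 1) / WORD_BITS;
	return words * sizeof(uint64_t);
}
```

Under these assumptions mask_bytes(64) is 8 (a single word),
mask_bytes(512) is 64 and mask_bytes(8192) is 1024, which is why large
NR_CPUS values make on-stack masks unattractive.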


* Re: [PATCH ARM64]: Introduce CONFIG_MAXSMP to allow up to 512 cpus
  2023-11-21  1:04 [PATCH ARM64]: Introduce CONFIG_MAXSMP to allow up to 512 cpus Christoph Lameter (Ampere)
@ 2023-11-23 19:33 ` Catalin Marinas
  2023-11-27 19:58   ` Christoph Lameter (Ampere)
  2023-11-28  6:40 ` Anshuman Khandual
  1 sibling, 1 reply; 6+ messages in thread
From: Catalin Marinas @ 2023-11-23 19:33 UTC (permalink / raw)
  To: Christoph Lameter (Ampere)
  Cc: linux-arm-kernel, linux-kernel, Anshuman.Khandual,
	Valentin.Schneider, Vanshidhar Konda, Jonathan Cameron,
	Robin Murphy, Dave Kleikamp, Matteo Carlini

On Mon, Nov 20, 2023 at 05:04:35PM -0800, Christoph Lameter (Ampere) wrote:
> Index: linux/arch/arm64/Kconfig
> ===================================================================
> --- linux.orig/arch/arm64/Kconfig
> +++ linux/arch/arm64/Kconfig
> @@ -1402,10 +1402,56 @@ config SCHED_SMT
>   	  MultiThreading at a cost of slightly increased overhead in some
>   	  places. If unsure say N here.
> 
> +
> +config MAXSMP
> +	bool "Compile kernel with support for the maximum number of SMP Processors"
> +	depends on SMP && DEBUG_KERNEL
> +	select CPUMASK_OFFSTACK
> +	help
> +	  Enable maximum number of CPUS and NUMA Nodes for this architecture.
> +	  If unsure, say N.
> +
> +#
> +# The maximum number of CPUs supported:
> +#
> +# The main config value is NR_CPUS, which defaults to NR_CPUS_DEFAULT,
> +# and which can be configured interactively in the
> +# [NR_CPUS_RANGE_BEGIN ... NR_CPUS_RANGE_END] range.
> +#
> +# ( If MAXSMP is enabled we just use the highest possible value and disable
> +#   interactive configuration. )
> +#
> +
> +config NR_CPUS_RANGE_BEGIN
> +	int
> +	default NR_CPUS_RANGE_END if MAXSMP
> +	default    1 if !SMP
> +	default    2

We don't support !SMP on arm64.

> +
> +config NR_CPUS_RANGE_END
> +	int
> +	default 8192 if  SMP && CPUMASK_OFFSTACK
> +	default  512 if  SMP && !CPUMASK_OFFSTACK
> +	default    1 if !SMP
> +
> +config NR_CPUS_DEFAULT
> +	int
> +	default  512 if  MAXSMP
> +	default   64 if  SMP
> +	default    1 if !SMP
> +
>   config NR_CPUS
> -	int "Maximum number of CPUs (2-4096)"
> -	range 2 4096
> -	default "256"
> +	int "Set maximum number of CPUs" if SMP && !MAXSMP
> +	range NR_CPUS_RANGE_BEGIN NR_CPUS_RANGE_END
> +	default NR_CPUS_DEFAULT
> +	help
> +	  This allows you to specify the maximum number of CPUs which this
> +	  kernel will support.  If CPUMASK_OFFSTACK is enabled, the maximum
> +	  supported value is 8192, otherwise the maximum value is 512.  The
> +	  minimum value which makes sense is 2.
> +
> +	  This is purely to save memory: each supported CPU adds about 8KB
> +	  to the kernel image.

Is this all needed just to select CPUMASK_OFFSTACK if larger NR_CPUS?
Would something like this do:

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7b071a00425d..697d5700bad1 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -119,6 +119,7 @@ config ARM64
 	select CLONE_BACKWARDS
 	select COMMON_CLK
 	select CPU_PM if (SUSPEND || CPU_IDLE)
+	select CPUMASK_OFFSTACK if NR_CPUS > 512
 	select CRC32
 	select DCACHE_WORD_ACCESS
 	select DYNAMIC_FTRACE if FUNCTION_TRACER

together with a larger NR_CPUS in defconfig?

-- 
Catalin
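
What switching a cpumask between on-stack and off-stack storage means
can be sketched in userspace C. This is only a model under stated
assumptions: every MODEL_/model_ name is hypothetical, calloc() stands
in for the kernel's slab allocation, and the 512 threshold is taken from
the suggestion above. It mirrors the idea behind the kernel's
cpumask_var_t but is not the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_NR_CPUS   1024                   /* hypothetical build-time limit */
#define MODEL_OFFSTACK  (MODEL_NR_CPUS > 512)  /* threshold from the suggestion */
#define MODEL_WORDS(n)  (((n) + 63) / 64)      /* 64-bit words for n bits */

#if MODEL_OFFSTACK
/* Off-stack: the variable is only a pointer, the bits live on the heap. */
typedef uint64_t *model_cpumask_var_t;

int model_alloc(model_cpumask_var_t *m, unsigned int nr_cpu_ids)
{
	/* Sized by the CPUs found at boot, not by MODEL_NR_CPUS. */
	*m = calloc(MODEL_WORDS(nr_cpu_ids), sizeof(uint64_t));
	return *m != NULL;
}

void model_free(model_cpumask_var_t m)
{
	free(m);
}
#else
/* On-stack: the variable itself holds MODEL_NR_CPUS bits. */
typedef uint64_t model_cpumask_var_t[MODEL_WORDS(MODEL_NR_CPUS)];

int model_alloc(model_cpumask_var_t *m, unsigned int nr_cpu_ids)
{
	(void)nr_cpu_ids;
	memset(*m, 0, sizeof(*m));
	return 1;
}

void model_free(model_cpumask_var_t m)
{
	(void)m;
}
#endif

/* Both variants decay to a uint64_t pointer, so the accessors are shared. */
void model_set_cpu(uint64_t *bits, unsigned int cpu)
{
	bits[cpu / 64] |= UINT64_C(1) << (cpu % 64);
}

int model_test_cpu(const uint64_t *bits, unsigned int cpu)
{
	return (int)((bits[cpu / 64] >> (cpu % 64)) & 1);
}
```

Callers use the same sequence in both variants: declare a
model_cpumask_var_t, call model_alloc(&mask, nr_cpu_ids), use the
accessors, then call model_free(mask). Only the typedef decides where
the bits are stored.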


* Re: [PATCH ARM64]: Introduce CONFIG_MAXSMP to allow up to 512 cpus
  2023-11-23 19:33 ` Catalin Marinas
@ 2023-11-27 19:58   ` Christoph Lameter (Ampere)
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Lameter (Ampere) @ 2023-11-27 19:58 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arm-kernel, linux-kernel, Anshuman.Khandual,
	Valentin.Schneider, Vanshidhar Konda, Jonathan Cameron,
	Robin Murphy, Dave Kleikamp, Matteo Carlini

On Thu, 23 Nov 2023, Catalin Marinas wrote:

>> +config NR_CPUS_RANGE_BEGIN
>> +	int
>> +	default NR_CPUS_RANGE_END if MAXSMP
>> +	default    1 if !SMP
>> +	default    2
>
> We don't support !SMP on arm64.

Ok we can drop that.

>> +	  This is purely to save memory: each supported CPU adds about 8KB
>> +	  to the kernel image.
>
> Is this all needed just to select CPUMASK_OFFSTACK if larger NR_CPUS?
> Would something like this do:
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 7b071a00425d..697d5700bad1 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -119,6 +119,7 @@ config ARM64
> 	select CLONE_BACKWARDS
> 	select COMMON_CLK
> 	select CPU_PM if (SUSPEND || CPU_IDLE)
> +	select CPUMASK_OFFSTACK if NR_CPUS > 512
> 	select CRC32
> 	select DCACHE_WORD_ACCESS
> 	select DYNAMIC_FTRACE if FUNCTION_TRACER
>
> together with a larger NR_CPUS in defconfig?

Well that is certainly better because it does not introduce an additional 
kernel config option.




* Re: [PATCH ARM64]: Introduce CONFIG_MAXSMP to allow up to 512 cpus
  2023-11-21  1:04 [PATCH ARM64]: Introduce CONFIG_MAXSMP to allow up to 512 cpus Christoph Lameter (Ampere)
  2023-11-23 19:33 ` Catalin Marinas
@ 2023-11-28  6:40 ` Anshuman Khandual
  2023-11-28 18:02   ` Christoph Lameter (Ampere)
  1 sibling, 1 reply; 6+ messages in thread
From: Anshuman Khandual @ 2023-11-28  6:40 UTC (permalink / raw)
  To: Christoph Lameter (Ampere), linux-arm-kernel
  Cc: linux-kernel, Valentin.Schneider, Vanshidhar Konda,
	Jonathan Cameron, Catalin Marinas, Robin Murphy, Dave Kleikamp,
	Matteo Carlini



On 11/21/23 06:34, Christoph Lameter (Ampere) wrote:
> Ampere Computing develops high end ARM processors that support an ever
> increasing number of processors. The current default of 256 processors is
> not enough for our newer products. The default is used by Linux
> distros and therefore our customers cannot use distro kernels because
> the number of processors is not supported.

In the previous thread mentioned below, Catalin had mentioned that the
distros do tweak the config for their needs. The default is applicable
to a wide range of systems, hence just wondering why the default NR_CPUS
should be changed for all.

Also just curious, what might be the concern for distros to have large
platform-specific configs overriding the default?

> 
> The x86 arch has support for a "CONFIG_MAXSMP" configuration option that
> enables support for the largest known configurations. This usually means
> hundreds or thousands of processors. For those sizes it is no longer
> practical to allocate bitmaps of cpus on the kernel stack. There is
> a kernel option CONFIG_CPUMASK_OFFSTACK that makes the kernel allocate
> and free bitmaps for cpu masks from slab memory instead of keeping it
> on the stack etc.
> 
> With that it becomes possible to dynamically size the allocation of
> the bitmap depending on the quantity of processors detected on
> bootup.
> 
> This patch enables that logic if CONFIG_MAXSMP is enabled.
> 
> If CONFIG_MAXSMP is disabled then a default of 64 processors
> is supported. A bitmap for 64 processors fits into one word and
> therefore can be efficiently handled on the stack. Using a pointer
> to a bitmap would be overkill.
> 
> The number of processors can be manually configured if
> CONFIG_MAXSMP is not set.
> 
> Currently the default for CONFIG_MAXSMP is 512 processors.
> This will have to be increased if ARM processor vendors start
> supporting more processors.
> 
> Signed-off-by: Christoph Lameter (Ampere) <cl@linux.com>
> 
> ---
> NR_CPU limits on ARM64 were discussed before at
> https://lore.kernel.org/all/20210110053615.3594358-1-vanshikonda@os.amperecomputing.com/
> 
> 
> Index: linux/arch/arm64/Kconfig
> ===================================================================
> --- linux.orig/arch/arm64/Kconfig
> +++ linux/arch/arm64/Kconfig
> @@ -1402,10 +1402,56 @@ config SCHED_SMT
>         MultiThreading at a cost of slightly increased overhead in some
>         places. If unsure say N here.
> 
> +
> +config MAXSMP
> +    bool "Compile kernel with support for the maximum number of SMP Processors"
> +    depends on SMP && DEBUG_KERNEL
> +    select CPUMASK_OFFSTACK
> +    help
> +      Enable maximum number of CPUS and NUMA Nodes for this architecture.
> +      If unsure, say N.
> +
> +#
> +# The maximum number of CPUs supported:
> +#
> +# The main config value is NR_CPUS, which defaults to NR_CPUS_DEFAULT,
> +# and which can be configured interactively in the
> +# [NR_CPUS_RANGE_BEGIN ... NR_CPUS_RANGE_END] range.
> +#
> +# ( If MAXSMP is enabled we just use the highest possible value and disable
> +#   interactive configuration. )
> +#
> +
> +config NR_CPUS_RANGE_BEGIN
> +    int
> +    default NR_CPUS_RANGE_END if MAXSMP
> +    default    1 if !SMP
> +    default    2
> +
> +config NR_CPUS_RANGE_END
> +    int
> +    default 8192 if  SMP && CPUMASK_OFFSTACK
> +    default  512 if  SMP && !CPUMASK_OFFSTACK
> +    default    1 if !SMP
> +
> +config NR_CPUS_DEFAULT
> +    int
> +    default  512 if  MAXSMP
> +    default   64 if  SMP
> +    default    1 if !SMP
> +
>   config NR_CPUS
> -    int "Maximum number of CPUs (2-4096)"
> -    range 2 4096
> -    default "256"
> +    int "Set maximum number of CPUs" if SMP && !MAXSMP
> +    range NR_CPUS_RANGE_BEGIN NR_CPUS_RANGE_END
> +    default NR_CPUS_DEFAULT
> +    help
> +      This allows you to specify the maximum number of CPUs which this
> +      kernel will support.  If CPUMASK_OFFSTACK is enabled, the maximum
> +      supported value is 8192, otherwise the maximum value is 512.  The
> +      minimum value which makes sense is 2.
> +
> +      This is purely to save memory: each supported CPU adds about 8KB
> +      to the kernel image.
> 
>   config HOTPLUG_CPU
>       bool "Support for hot-pluggable CPUs"
> Index: linux/arch/arm64/configs/defconfig
> ===================================================================
> --- linux.orig/arch/arm64/configs/defconfig
> +++ linux/arch/arm64/configs/defconfig
> @@ -15,6 +15,7 @@ CONFIG_TASK_IO_ACCOUNTING=y
>   CONFIG_IKCONFIG=y
>   CONFIG_IKCONFIG_PROC=y
>   CONFIG_NUMA_BALANCING=y
> +CONFIG_MAXSMP=y
>   CONFIG_MEMCG=y
>   CONFIG_BLK_CGROUP=y
>   CONFIG_CGROUP_PIDS=y

I do agree with Catalin's suggestion - just selecting CPUMASK_OFFSTACK
for larger NR_CPUS.


* Re: [PATCH ARM64]: Introduce CONFIG_MAXSMP to allow up to 512 cpus
  2023-11-28  6:40 ` Anshuman Khandual
@ 2023-11-28 18:02   ` Christoph Lameter (Ampere)
  2023-12-04 18:49     ` [PATCH] ARM64: Dynamicaly allocate cpumasks and increase supported CPUs to 512 (was: CONFIG_MAXSMP to allow up to 512 cpus) Christoph Lameter (Ampere)
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Lameter (Ampere) @ 2023-11-28 18:02 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Valentin.Schneider,
	Vanshidhar Konda, Jonathan Cameron, Catalin Marinas,
	Robin Murphy, Dave Kleikamp, Matteo Carlini

On Tue, 28 Nov 2023, Anshuman Khandual wrote:

>
>
> On 11/21/23 06:34, Christoph Lameter (Ampere) wrote:
>> Ampere Computing develops high end ARM processors that support an ever
>> increasing number of processors. The current default of 256 processors is
>> not enough for our newer products. The default is used by Linux
>> distros and therefore our customers cannot use distro kernels because
>> the number of processors is not supported.
>
> In the previous thread mentioned below, Catalin had mentioned that the
> distros do tweak the config for their needs. The default is applicable
> to a wide range of systems, hence just wondering why the default NR_CPUS
> should be changed for all.

We would like the standard kernel to be able to boot on our systems, and
those have more processors than the current NR_CPUS default. The distros
only tweak things on request, and with this change the tweaking is no
longer necessary.

> Also just curious, what might be the concern for distros to have large
> platform specific configs overriding the default.

There are numerous distributions as well as individuals who build kernels.
It is surprising if someone builds an upstream kernel with the defaults
that should fit all supported platforms only to find that only a portion
of their cpus come up. The work of discovering why this is and how to fix
it has to be done by numerous individuals and organizations in order to
enable all cpus. That work is not necessary if the default configuration
supports a sufficient number of processors to accommodate all ARM hardware.

The CONFIG_MAXSMP configuration on x86 was developed exactly for these
situations, and there is a special Kconfig option (CPUMASK_OFFSTACK) to
have potentially large bitmaps for cpus allocated as needed in the kernel
core. The patch enables the use of that facility.
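
The effect described above, where only a portion of the cpus come up,
can be pictured as a simple cap. This is a simplified model, not the
actual arm64 bring-up code: usable_cpus() is a hypothetical helper, and
the machine sizes in the note below are made-up illustration values.

```c
#include <assert.h>

/*
 * Hypothetical helper: how many of the processors present in a machine
 * can be used by a kernel built with a given NR_CPUS value.
 */
unsigned int usable_cpus(unsigned int detected, unsigned int nr_cpus_config)
{
	assert(nr_cpus_config >= 1);
	return detected < nr_cpus_config ? detected : nr_cpus_config;
}
```

For an imaginary 384-processor machine, usable_cpus(384, 256) gives 256
while usable_cpus(384, 512) gives all 384, which is the difference a
larger default makes for someone building an unmodified upstream config.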




* [PATCH] ARM64: Dynamicaly allocate cpumasks and increase supported CPUs to 512 (was: CONFIG_MAXSMP to allow up to 512 cpus)
  2023-11-28 18:02   ` Christoph Lameter (Ampere)
@ 2023-12-04 18:49     ` Christoph Lameter (Ampere)
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Lameter (Ampere) @ 2023-12-04 18:49 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Valentin.Schneider,
	Vanshidhar Konda, Jonathan Cameron, Catalin Marinas,
	Robin Murphy, Dave Kleikamp, Matteo Carlini

New version of the patch after feedback from Catalin:


From: Christoph Lameter (Ampere) <cl@linux.com>
Subject: [PATCH] ARM64: Dynamicaly allocate cpumasks and increase supported CPUs to 512

Ampere Computing develops high end ARM processors that support an ever
increasing number of processors. The default 256 processors are
not enough for our newer products. The default is used by
distros and therefore our customers cannot use distro kernels because
the number of processors is not supported.

One of the objections against earlier patches to increase the limit
was that the memory use becomes too high. There is a feature called
CPUMASK_OFFSTACK that configures the cpumasks in the kernel to be
dynamically allocated. This was used in the X86 architecture in the
past to enable support for larger CPU configurations up to 8k cpus.

With that it becomes possible to dynamically size the allocation of
the cpu bitmaps depending on the quantity of processors detected on
bootup.

This patch enables that logic if more than 256 processors
are configured and increases the default to 512 processors.

Further increases may be needed if ARM processor vendors start
supporting more processors. Given the current inflationary trends
in core counts from multiple processor manufacturers this may occur.

Signed-off-by: Christoph Lameter (Ampere) <cl@linux.com>

Index: linux/arch/arm64/Kconfig
===================================================================
--- linux.orig/arch/arm64/Kconfig
+++ linux/arch/arm64/Kconfig
@@ -1407,7 +1407,21 @@ config SCHED_SMT
  config NR_CPUS
  	int "Maximum number of CPUs (2-4096)"
  	range 2 4096
-	default "256"
+	default 512
+
+#
+# Determines the placement of cpumasks.
+#
+# With CPUMASK_OFFSTACK the cpumasks are dynamically allocated.
+# Useful for machines with lots of cores because it avoids increasing
+# the size of many of the data structures in the kernel.
+#
+# If this is off then the cpumasks have a static size and are
+# embedded within data structures.
+#
+config CPUMASK_OFFSTACK
+	def_bool y
+	depends on NR_CPUS > 256

  config HOTPLUG_CPU
  	bool "Support for hot-pluggable CPUs"
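
The decision this version of the patch encodes in Kconfig can be
restated as a small C sketch. It is an illustration only: the v2_ names
are hypothetical, the 2-4096 range and the 256 threshold are copied from
the hunk above, and the per-field cost applies to a field declared
through the pointer-or-array cpumask_var_t type on a 64-bit build.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define V2_NR_CPUS_DEFAULT     512u  /* new default in the hunk above */
#define V2_OFFSTACK_THRESHOLD  256u  /* "depends on NR_CPUS > 256" */

/* Would CPUMASK_OFFSTACK be enabled for this NR_CPUS setting? */
bool v2_cpumask_offstack(unsigned int nr_cpus)
{
	assert(nr_cpus >= 2 && nr_cpus <= 4096);  /* "range 2 4096" */
	return nr_cpus > V2_OFFSTACK_THRESHOLD;
}

/*
 * Bytes one such mask field adds to an enclosing structure: a pointer
 * when masks are allocated dynamically, the full bitmap otherwise.
 */
size_t v2_field_bytes(unsigned int nr_cpus)
{
	if (v2_cpumask_offstack(nr_cpus))
		return sizeof(void *);
	return ((nr_cpus + 63) / 64) * 8;
}
```

With the old default, v2_field_bytes(256) is 32 bytes per embedded mask.
With the new default of 512 the field shrinks to one pointer, and the
bitmap itself is allocated separately.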
