All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
@ 2022-09-05 23:08 Yury Norov
  2022-09-05 23:08 ` [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1 Yury Norov
                   ` (6 more replies)
  0 siblings, 7 replies; 40+ messages in thread
From: Yury Norov @ 2022-09-05 23:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers

cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
despite that the variables have different meaning and purpose. It makes
some cpumask functions broken.

This series cleans that mess and adds new config FORCE_NR_CPUS that
allows to optimize cpumask subsystem if the number of CPUs is known
at compile-time.

After some testing I found build broken when SMP is on and NR_CPUS == 1.
This is addressed in a new patch #1, and in the following patch #2 that
now declares set_nr_cpu_ids in cpumask.h (was in smp.h).

v1: https://lore.kernel.org/lkml/20220829165748.1008477-1-yury.norov@gmail.com/T/#mecbd787f8d1bff1454a4ec2fe46ad6dc168df695
v2:
 - don't declare nr_cpu_ids if NR_CPUS == 1;
 - move set_nr_cpu_ids() from smp.h to cpumask.h to avoid errors in some
   inclusion paths;
 - drop 'default n' in FORCE_NR_CPUS option description;
 - rebase on top of v6.0-rc4.

Yury Norov (5):
  smp: don't declare nr_cpu_ids if NR_CPUS == 1
  smp: add set_nr_cpu_ids()
  lib/cpumask: delete misleading comment
  lib/cpumask: deprecate nr_cpumask_bits
  lib/cpumask: add FORCE_NR_CPUS config option

 arch/loongarch/kernel/setup.c |  2 +-
 arch/mips/kernel/setup.c      |  2 +-
 arch/x86/kernel/smpboot.c     |  4 ++--
 arch/x86/xen/smp_pv.c         |  2 +-
 include/linux/cpumask.h       | 22 +++++++++++-----------
 kernel/smp.c                  |  6 ++++--
 lib/Kconfig                   |  9 +++++++++
 7 files changed, 29 insertions(+), 18 deletions(-)

-- 
2.34.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1
  2022-09-05 23:08 [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Yury Norov
@ 2022-09-05 23:08 ` Yury Norov
  2022-09-06  8:53   ` Peter Zijlstra
  2022-09-05 23:08 ` [PATCH v2 2/5] smp: add set_nr_cpu_ids() Yury Norov
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 40+ messages in thread
From: Yury Norov @ 2022-09-05 23:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers

SMP and NR_CPUS are independent options, hence nr_cpu_ids may be
declared even if NR_CPUS == 1, which is useless.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 kernel/smp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/smp.c b/kernel/smp.c
index 650810a6f29b..e971c9626a1b 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -1088,9 +1088,11 @@ static int __init maxcpus(char *str)
 
 early_param("maxcpus", maxcpus);
 
+#if (NR_CPUS > 1)
 /* Setup number of possible processor ids */
 unsigned int nr_cpu_ids __read_mostly = NR_CPUS;
 EXPORT_SYMBOL(nr_cpu_ids);
+#endif
 
 /* An arch may set nr_cpu_ids earlier if needed, so this would be redundant */
 void __init setup_nr_cpu_ids(void)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 2/5] smp: add set_nr_cpu_ids()
  2022-09-05 23:08 [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Yury Norov
  2022-09-05 23:08 ` [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1 Yury Norov
@ 2022-09-05 23:08 ` Yury Norov
  2022-09-05 23:08 ` [PATCH v2 3/5] lib/cpumask: delete misleading comment Yury Norov
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-09-05 23:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers

In preparation to support compile-time nr_cpu_ids, add a setter for
the variable.

This is a no-op for all arches.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 arch/loongarch/kernel/setup.c | 2 +-
 arch/mips/kernel/setup.c      | 2 +-
 arch/x86/kernel/smpboot.c     | 4 ++--
 arch/x86/xen/smp_pv.c         | 2 +-
 include/linux/cpumask.h       | 5 +++++
 kernel/smp.c                  | 4 ++--
 6 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
index 8f5c2f9a1a83..18a81edd3ac5 100644
--- a/arch/loongarch/kernel/setup.c
+++ b/arch/loongarch/kernel/setup.c
@@ -346,7 +346,7 @@ static void __init prefill_possible_map(void)
 	for (; i < NR_CPUS; i++)
 		set_cpu_possible(i, false);
 
-	nr_cpu_ids = possible;
+	set_nr_cpu_ids(possible);
 }
 #endif
 
diff --git a/arch/mips/kernel/setup.c b/arch/mips/kernel/setup.c
index 2ca156a5b231..e8a0759cb4d0 100644
--- a/arch/mips/kernel/setup.c
+++ b/arch/mips/kernel/setup.c
@@ -750,7 +750,7 @@ static void __init prefill_possible_map(void)
 	for (; i < NR_CPUS; i++)
 		set_cpu_possible(i, false);
 
-	nr_cpu_ids = possible;
+	set_nr_cpu_ids(possible);
 }
 #else
 static inline void prefill_possible_map(void) {}
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index f24227bc3220..3f3ea0287f69 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1316,7 +1316,7 @@ static void __init smp_sanity_check(void)
 			nr++;
 		}
 
-		nr_cpu_ids = 8;
+		set_nr_cpu_ids(8);
 	}
 #endif
 
@@ -1569,7 +1569,7 @@ __init void prefill_possible_map(void)
 		possible = i;
 	}
 
-	nr_cpu_ids = possible;
+	set_nr_cpu_ids(possible);
 
 	pr_info("Allowing %d CPUs, %d hotplug CPUs\n",
 		possible, max_t(int, possible - num_processors, 0));
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index ba7af2eca755..480be82e9b7b 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -179,7 +179,7 @@ static void __init _get_smp_config(unsigned int early)
 	 * hypercall to expand the max number of VCPUs an already
 	 * running guest has. So cap it up to X. */
 	if (subtract)
-		nr_cpu_ids = nr_cpu_ids - subtract;
+		set_nr_cpu_ids(nr_cpu_ids - subtract);
 #endif
 
 }
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index bd047864c7ac..0588bfe350d4 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -39,6 +39,11 @@ typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
 #define nr_cpu_ids		1U
 #else
 extern unsigned int nr_cpu_ids;
+
+static inline void set_nr_cpu_ids(unsigned int nr)
+{
+	nr_cpu_ids = nr;
+}
 #endif
 
 #ifdef CONFIG_CPUMASK_OFFSTACK
diff --git a/kernel/smp.c b/kernel/smp.c
index e971c9626a1b..150310a0947a 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -1070,7 +1070,7 @@ static int __init nrcpus(char *str)
 	int nr_cpus;
 
 	if (get_option(&str, &nr_cpus) && nr_cpus > 0 && nr_cpus < nr_cpu_ids)
-		nr_cpu_ids = nr_cpus;
+		set_nr_cpu_ids(nr_cpus);
 
 	return 0;
 }
@@ -1097,7 +1097,7 @@ EXPORT_SYMBOL(nr_cpu_ids);
 /* An arch may set nr_cpu_ids earlier if needed, so this would be redundant */
 void __init setup_nr_cpu_ids(void)
 {
-	nr_cpu_ids = find_last_bit(cpumask_bits(cpu_possible_mask),NR_CPUS) + 1;
+	set_nr_cpu_ids(find_last_bit(cpumask_bits(cpu_possible_mask), NR_CPUS) + 1);
 }
 
 /* Called by boot processor to activate the rest. */
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 3/5] lib/cpumask: delete misleading comment
  2022-09-05 23:08 [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Yury Norov
  2022-09-05 23:08 ` [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1 Yury Norov
  2022-09-05 23:08 ` [PATCH v2 2/5] smp: add set_nr_cpu_ids() Yury Norov
@ 2022-09-05 23:08 ` Yury Norov
  2022-09-05 23:08 ` [PATCH v2 4/5] lib/cpumask: deprecate nr_cpumask_bits Yury Norov
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-09-05 23:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers

The comment says that HOTPLUG config option enables all cpus in
cpu_possible_mask up to NR_CPUs. This is wrong. Even if HOTPLUG is
enabled, the mask is populated on boot with respect to ACPI/DT records.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/cpumask.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 0588bfe350d4..8bac1dee8448 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -72,10 +72,6 @@ static inline void set_nr_cpu_ids(unsigned int nr)
  *  cpu_online_mask is the dynamic subset of cpu_present_mask,
  *  indicating those CPUs available for scheduling.
  *
- *  If HOTPLUG is enabled, then cpu_possible_mask is forced to have
- *  all NR_CPUS bits set, otherwise it is just the set of CPUs that
- *  ACPI reports present at boot.
- *
  *  If HOTPLUG is enabled, then cpu_present_mask varies dynamically,
  *  depending on what ACPI reports as currently plugged in, otherwise
  *  cpu_present_mask is just a copy of cpu_possible_mask.
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 4/5] lib/cpumask: deprecate nr_cpumask_bits
  2022-09-05 23:08 [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Yury Norov
                   ` (2 preceding siblings ...)
  2022-09-05 23:08 ` [PATCH v2 3/5] lib/cpumask: delete misleading comment Yury Norov
@ 2022-09-05 23:08 ` Yury Norov
  2022-09-05 23:08 ` [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option Yury Norov
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-09-05 23:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers

Cpumask code is written in assumption that when CONFIG_CPUMASK_OFFSTACK
is enabled, all cpumasks have boot-time defined size, otherwise the size
is always NR_CPUS.

The latter is wrong because the number of possible cpus is always
calculated on boot, and it may be less than NR_CPUS.

On my 4-cpu arm64 VM the nr_cpu_ids is 4, as expected, and nr_cpumask_bits
is 256, which corresponds to NR_CPUS. This not only leads to useless
traversing of cpumask bits greater than 4, this also makes some cpumask
routines fail.

For example, cpumask_full(0b1111000..000) would erroneously return false
in the example above because tail bits in the mask are all unset.

This patch deprecates nr_cpumask_bits and wires it to nr_cpu_ids
unconditionally, so that cpumask routines will not waste time traversing
unused part of cpu masks. It also fixes cpumask_full() and similar
routines.

As a side effect, because now a length of cpumasks is defined at run-time
even if CPUMASK_OFFSTACK is disabled, compiler can't optimize corresponding
functions.

It increases kernel size by ~2.5KB if OFFSTACK is off. This is addressed in
the following patch.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/cpumask.h | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 8bac1dee8448..e01fba8ecc27 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -46,13 +46,8 @@ static inline void set_nr_cpu_ids(unsigned int nr)
 }
 #endif
 
-#ifdef CONFIG_CPUMASK_OFFSTACK
-/* Assuming NR_CPUS is huge, a runtime limit is more efficient.  Also,
- * not all bits may be allocated. */
+/* Deprecated. Always use nr_cpu_ids. */
 #define nr_cpumask_bits	nr_cpu_ids
-#else
-#define nr_cpumask_bits	((unsigned int)NR_CPUS)
-#endif
 
 /*
  * The following particular system cpumasks and operations manage
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
  2022-09-05 23:08 [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Yury Norov
                   ` (3 preceding siblings ...)
  2022-09-05 23:08 ` [PATCH v2 4/5] lib/cpumask: deprecate nr_cpumask_bits Yury Norov
@ 2022-09-05 23:08 ` Yury Norov
  2022-10-18  8:21   ` Geert Uytterhoeven
  2022-09-06  8:55 ` [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Peter Zijlstra
  2022-09-15 14:45 ` Yury Norov
  6 siblings, 1 reply; 40+ messages in thread
From: Yury Norov @ 2022-09-05 23:08 UTC (permalink / raw)
  To: linux-kernel
  Cc: Yury Norov, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers

The size of cpumasks is hard-limited by compile-time parameter NR_CPUS,
but defined at boot-time when kernel parses ACPI/DT tables, and stored in
nr_cpu_ids. In many practical cases, number of CPUs for a target is known
at compile time, and can be provided with NR_CPUS.

In that case, compiler may be instructed to rely on NR_CPUS as on actual
number of CPUs, not an upper limit. It allows to optimize many cpumask
routines and significantly shrink size of the kernel image.

This patch adds FORCE_NR_CPUS option to teach the compiler to rely on
NR_CPUS and enable corresponding optimizations.

If FORCE_NR_CPUS=y, kernel will not set nr_cpu_ids at boot, but only check
that the actual number of possible CPUs is equal to NR_CPUS, and WARN if
that doesn't hold.

The new option is especially useful in embedded applications because
kernel configurations are unique for each SoC, the number of CPUs is
constant and known well, and memory limitations are typically harder.

For my 4-CPU ARM64 build with NR_CPUS=4, FORCE_NR_CPUS=y saves 46KB:
  add/remove: 3/4 grow/shrink: 46/729 up/down: 652/-46952 (-46300)

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 include/linux/cpumask.h | 10 +++++++---
 kernel/smp.c            |  2 +-
 lib/Kconfig             |  9 +++++++++
 3 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index e01fba8ecc27..0753202819e5 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -35,16 +35,20 @@ typedef struct cpumask { DECLARE_BITMAP(bits, NR_CPUS); } cpumask_t;
  */
 #define cpumask_pr_args(maskp)		nr_cpu_ids, cpumask_bits(maskp)
 
-#if NR_CPUS == 1
-#define nr_cpu_ids		1U
+#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
+#define nr_cpu_ids ((unsigned int)NR_CPUS)
 #else
 extern unsigned int nr_cpu_ids;
+#endif
 
 static inline void set_nr_cpu_ids(unsigned int nr)
 {
+#if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
+	WARN_ON(nr != nr_cpu_ids);
+#else
 	nr_cpu_ids = nr;
-}
 #endif
+}
 
 /* Deprecated. Always use nr_cpu_ids. */
 #define nr_cpumask_bits	nr_cpu_ids
diff --git a/kernel/smp.c b/kernel/smp.c
index 150310a0947a..661d09ae5d6a 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -1088,7 +1088,7 @@ static int __init maxcpus(char *str)
 
 early_param("maxcpus", maxcpus);
 
-#if (NR_CPUS > 1)
+#if (NR_CPUS > 1) && !defined(CONFIG_FORCE_NR_CPUS)
 /* Setup number of possible processor ids */
 unsigned int nr_cpu_ids __read_mostly = NR_CPUS;
 EXPORT_SYMBOL(nr_cpu_ids);
diff --git a/lib/Kconfig b/lib/Kconfig
index dc1ab2ed1dc6..77ead982c8b9 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -527,6 +527,15 @@ config CPUMASK_OFFSTACK
 	  them on the stack.  This is a bit more expensive, but avoids
 	  stack overflow.
 
+config FORCE_NR_CPUS
+       bool "NR_CPUS is set to an actual number of CPUs"
+       depends on SMP
+       help
+         Say Yes if you have NR_CPUS set to an actual number of possible
+         CPUs in your system, not to a default value. This forces the core
+         code to rely on compile-time value and optimize kernel routines
+         better.
+
 config CPU_RMAP
 	bool
 	depends on SMP
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1
  2022-09-05 23:08 ` [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1 Yury Norov
@ 2022-09-06  8:53   ` Peter Zijlstra
  2022-09-06 14:06     ` Yury Norov
  0 siblings, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2022-09-06  8:53 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On Mon, Sep 05, 2022 at 04:08:16PM -0700, Yury Norov wrote:
> SMP and NR_CPUS are independent options, hence nr_cpu_ids may be
> declared even if NR_CPUS == 1, which is useless.

I'm thikning you're fixing the wrong problem here. Also who the heck
cares about SMP=y NR_CPUS=1 anyway?

Why do we need extra source complexity for this?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
  2022-09-05 23:08 [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Yury Norov
                   ` (4 preceding siblings ...)
  2022-09-05 23:08 ` [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option Yury Norov
@ 2022-09-06  8:55 ` Peter Zijlstra
  2022-09-06 12:06   ` Valentin Schneider
  2022-09-06 14:35   ` Yury Norov
  2022-09-15 14:45 ` Yury Norov
  6 siblings, 2 replies; 40+ messages in thread
From: Peter Zijlstra @ 2022-09-06  8:55 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On Mon, Sep 05, 2022 at 04:08:15PM -0700, Yury Norov wrote:
> cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
> despite that the variables have different meaning and purpose. It makes
> some cpumask functions broken.
> 
> This series cleans that mess and adds new config FORCE_NR_CPUS that
> allows to optimize cpumask subsystem if the number of CPUs is known
> at compile-time.

Who will use this? Distro's can't, which means 99% of people will not
use this ever. Is it worth it?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
  2022-09-06  8:55 ` [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Peter Zijlstra
@ 2022-09-06 12:06   ` Valentin Schneider
  2022-09-06 14:38     ` Peter Zijlstra
  2022-09-06 14:48     ` Yury Norov
  2022-09-06 14:35   ` Yury Norov
  1 sibling, 2 replies; 40+ messages in thread
From: Valentin Schneider @ 2022-09-06 12:06 UTC (permalink / raw)
  To: Peter Zijlstra, Yury Norov
  Cc: linux-kernel, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Sander Vanheule, Alexey Klimov,
	Eric Biggers

On 06/09/22 10:55, Peter Zijlstra wrote:
> On Mon, Sep 05, 2022 at 04:08:15PM -0700, Yury Norov wrote:
>> cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
>> despite that the variables have different meaning and purpose. It makes
>> some cpumask functions broken.
>>
>> This series cleans that mess and adds new config FORCE_NR_CPUS that
>> allows to optimize cpumask subsystem if the number of CPUs is known
>> at compile-time.
>
> Who will use this? Distro's can't, which means 99% of people will not
> use this ever. Is it worth it?

I'd tend to agree here.

One extra thing worth noting is CONFIG_CPUMASK_OFFSTACK=n cpumask_size()
still uses NR_CPUS under the hood, despite being (mostly) used to
dynamically allocate cpumasks. So having an unconditionnal

  #define nr_cpumask_bits nr_cpu_ids

would save up some memory for those allocations.

A quick compile test on x86 defconfig (OFFSTACK=n) gives me:

  Total: Before=18711411, After=18705653, chg -0.03%

If it's in the range of barely-half-a-page on other archs, could we just
do that then?


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1
  2022-09-06  8:53   ` Peter Zijlstra
@ 2022-09-06 14:06     ` Yury Norov
  2022-09-06 14:36       ` Peter Zijlstra
  0 siblings, 1 reply; 40+ messages in thread
From: Yury Norov @ 2022-09-06 14:06 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On Tue, Sep 06, 2022 at 10:53:53AM +0200, Peter Zijlstra wrote:
> On Mon, Sep 05, 2022 at 04:08:16PM -0700, Yury Norov wrote:
> > SMP and NR_CPUS are independent options, hence nr_cpu_ids may be
> > declared even if NR_CPUS == 1, which is useless.
> 
> I'm thikning you're fixing the wrong problem here.

I'm removing dead code. If NR_CPUS == 1, nr_cpu_ids does exist, exported
as an interface variable, but never normally reached, because in some
other piece of code (not even in smp.h) it's declared conditionally. 

> Also who the heck
> cares about SMP=y NR_CPUS=1 anyway?

Build bots.

> Why do we need extra source complexity for this?

To have effective code generation for UP builds.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
  2022-09-06  8:55 ` [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Peter Zijlstra
  2022-09-06 12:06   ` Valentin Schneider
@ 2022-09-06 14:35   ` Yury Norov
  1 sibling, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-09-06 14:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On Tue, Sep 06, 2022 at 10:55:20AM +0200, Peter Zijlstra wrote:
> On Mon, Sep 05, 2022 at 04:08:15PM -0700, Yury Norov wrote:
> > cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
> > despite that the variables have different meaning and purpose. It makes
> > some cpumask functions broken.
> > 
> > This series cleans that mess and adds new config FORCE_NR_CPUS that
> > allows to optimize cpumask subsystem if the number of CPUs is known
> > at compile-time.
> 
> Who will use this? Distro's can't,

Raspberry PI can. Smartphone vendors can, and probably should. Ditto
for vehicles and all embedded applications.

> which means 99% of people will not
> use this ever. Is it worth it?

I will definitely enable it for myself. Actually anyone who set NR_CPUS
to a non-default value, may also be interested in FORCE_NR_CPUS.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1
  2022-09-06 14:06     ` Yury Norov
@ 2022-09-06 14:36       ` Peter Zijlstra
  2022-09-06 15:01         ` Andy Shevchenko
                           ` (2 more replies)
  0 siblings, 3 replies; 40+ messages in thread
From: Peter Zijlstra @ 2022-09-06 14:36 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On Tue, Sep 06, 2022 at 07:06:31AM -0700, Yury Norov wrote:
> On Tue, Sep 06, 2022 at 10:53:53AM +0200, Peter Zijlstra wrote:
> > On Mon, Sep 05, 2022 at 04:08:16PM -0700, Yury Norov wrote:
> > > SMP and NR_CPUS are independent options, hence nr_cpu_ids may be
> > > declared even if NR_CPUS == 1, which is useless.
> > 
> > I'm thikning you're fixing the wrong problem here.
> 
> I'm removing dead code. If NR_CPUS == 1, nr_cpu_ids does exist, exported
> as an interface variable, but never normally reached, because in some
> other piece of code (not even in smp.h) it's declared conditionally. 

Can't you simply disallow NR_CPUS==1 for SMP builds? It doesn't make
sense anyway.

> > Why do we need extra source complexity for this?
> 
> To have effective code generation for UP builds.

Again, who cares... isn't it hard to find actual UP chips these days?

It was suggested the other day we remove a whole bunch of SMP=n code and
unconditionally use SMP code, even if its pointless on UP just to make
the source simpler.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
  2022-09-06 12:06   ` Valentin Schneider
@ 2022-09-06 14:38     ` Peter Zijlstra
  2022-09-06 15:45       ` Valentin Schneider
  2022-09-06 14:48     ` Yury Norov
  1 sibling, 1 reply; 40+ messages in thread
From: Peter Zijlstra @ 2022-09-06 14:38 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Yury Norov, linux-kernel, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On Tue, Sep 06, 2022 at 01:06:47PM +0100, Valentin Schneider wrote:

>   #define nr_cpumask_bits nr_cpu_ids

That assumes the CPU space is dense; is this so? That is, you can have 4
CPUs while the highest cpu number is vastly larger than 4.

It's uncommon, but not unheard of I think. ISTR some BIOSes leaving
holes in the CPU space when there were empty CPU sockets on the
motherboard.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
  2022-09-06 12:06   ` Valentin Schneider
  2022-09-06 14:38     ` Peter Zijlstra
@ 2022-09-06 14:48     ` Yury Norov
  1 sibling, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-09-06 14:48 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Peter Zijlstra, linux-kernel, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On Tue, Sep 06, 2022 at 01:06:47PM +0100, Valentin Schneider wrote:
> On 06/09/22 10:55, Peter Zijlstra wrote:
> > On Mon, Sep 05, 2022 at 04:08:15PM -0700, Yury Norov wrote:
> >> cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
> >> despite that the variables have different meaning and purpose. It makes
> >> some cpumask functions broken.
> >>
> >> This series cleans that mess and adds new config FORCE_NR_CPUS that
> >> allows to optimize cpumask subsystem if the number of CPUs is known
> >> at compile-time.
> >
> > Who will use this? Distro's can't, which means 99% of people will not
> > use this ever. Is it worth it?
> 
> I'd tend to agree here.
> 
> One extra thing worth noting is CONFIG_CPUMASK_OFFSTACK=n cpumask_size()
> still uses NR_CPUS under the hood, despite being (mostly) used to
> dynamically allocate cpumasks. So having an unconditionnal
> 
>   #define nr_cpumask_bits nr_cpu_ids
> 
> would save up some memory for those allocations.

Thanks, I didn't mention this. This is exactly what I meant by
'cleaning the mess'.
 
> A quick compile test on x86 defconfig (OFFSTACK=n) gives me:
> 
>   Total: Before=18711411, After=18705653, chg -0.03%
 
All cpumask_size() allocations are runtime, right?

> If it's in the range of barely-half-a-page on other archs, could we just
> do that then?

How many is that in terms of I-cache lines?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1
  2022-09-06 14:36       ` Peter Zijlstra
@ 2022-09-06 15:01         ` Andy Shevchenko
  2022-09-06 17:53           ` Peter Zijlstra
  2022-09-06 15:07         ` Yury Norov
  2022-09-07  8:02         ` David Laight
  2 siblings, 1 reply; 40+ messages in thread
From: Andy Shevchenko @ 2022-09-06 15:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Yury Norov, linux-kernel, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On Tue, Sep 06, 2022 at 04:36:39PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 07:06:31AM -0700, Yury Norov wrote:
> > On Tue, Sep 06, 2022 at 10:53:53AM +0200, Peter Zijlstra wrote:
> > > On Mon, Sep 05, 2022 at 04:08:16PM -0700, Yury Norov wrote:

...

> > > Why do we need extra source complexity for this?
> > 
> > To have effective code generation for UP builds.
> 
> Again, who cares... isn't it hard to find actual UP chips these days?

For the record, Intel produced somewhat a volume of Intel Quark chips [1],
that are UP (in 2013-2019).

[1]: https://en.wikipedia.org/wiki/Intel_Quark

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1
  2022-09-06 14:36       ` Peter Zijlstra
  2022-09-06 15:01         ` Andy Shevchenko
@ 2022-09-06 15:07         ` Yury Norov
  2022-09-07  8:02         ` David Laight
  2 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-09-06 15:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On Tue, Sep 06, 2022 at 04:36:39PM +0200, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 07:06:31AM -0700, Yury Norov wrote:
> > On Tue, Sep 06, 2022 at 10:53:53AM +0200, Peter Zijlstra wrote:
> > > On Mon, Sep 05, 2022 at 04:08:16PM -0700, Yury Norov wrote:
> > > > SMP and NR_CPUS are independent options, hence nr_cpu_ids may be
> > > > declared even if NR_CPUS == 1, which is useless.
> > > 
> > > I'm thikning you're fixing the wrong problem here.
> > 
> > I'm removing dead code. If NR_CPUS == 1, nr_cpu_ids does exist, exported
> > as an interface variable, but never normally reached, because in some
> > other piece of code (not even in smp.h) it's declared conditionally. 
> 
> Can't you simply disallow NR_CPUS==1 for SMP builds? It doesn't make
> sense anyway.

There are SMP_ON_UP and SMP_UP options in arm and mips configs. I have
no idea what do they do, but disallowing NR_CPUS==1 && SMP=y looks
unsafe...

 
> > > Why do we need extra source complexity for this?
> > 
> > To have effective code generation for UP builds.
> 
> Again, who cares... isn't it hard to find actual UP chips these days?

What about UP VMs? People are interested in UP. Check for example the
recent b81dce77cedce ("cpumask: Fix invalid uniprocessor mask assumption")

> It was suggested the other day we remove a whole bunch of SMP=n code and
> unconditionally use SMP code, even if its pointless on UP just to make
> the source simpler.

So while SMP=n is there, let's keep the code base coherent?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
  2022-09-06 14:38     ` Peter Zijlstra
@ 2022-09-06 15:45       ` Valentin Schneider
  2022-09-06 16:26         ` Yury Norov
  0 siblings, 1 reply; 40+ messages in thread
From: Valentin Schneider @ 2022-09-06 15:45 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Yury Norov, linux-kernel, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On 06/09/22 16:38, Peter Zijlstra wrote:
> On Tue, Sep 06, 2022 at 01:06:47PM +0100, Valentin Schneider wrote:
>
>>   #define nr_cpumask_bits nr_cpu_ids
>
> That assumes the CPU space is dense; is this so? That is, you can have 4
> CPUs while the highest cpu number is vastly larger than 4.
>
> It's uncommon, but not unheard of I think. ISTR some BIOSes leaving
> holes in the CPU space when there were empty CPU sockets on the
> motherboard.

I'd assume this would be visible in the cpu_possible_mask and thus be
properly reflected in nr_cpu_ids. Otherwise that would already break with
CONFIG_CPUMASK_OFFSTACK=y, I think.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
  2022-09-06 15:45       ` Valentin Schneider
@ 2022-09-06 16:26         ` Yury Norov
  0 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-09-06 16:26 UTC (permalink / raw)
  To: Valentin Schneider
  Cc: Peter Zijlstra, linux-kernel, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On Tue, Sep 6, 2022 at 8:45 AM Valentin Schneider <vschneid@redhat.com> wrote:
>
> On 06/09/22 16:38, Peter Zijlstra wrote:
> > On Tue, Sep 06, 2022 at 01:06:47PM +0100, Valentin Schneider wrote:
> >
> >>   #define nr_cpumask_bits nr_cpu_ids
> >
> > That assumes the CPU space is dense; is this so? That is, you can have 4
> > CPUs while the highest cpu number is vastly larger than 4.

It's quite common. One can configure qemu to give to the user one cpu at
start, and hotplug more on demand. In that case those unattached CPUS
are set in cpu_possible_mask.

> > It's uncommon, but not unheard of I think. ISTR some BIOSes leaving
> > holes in the CPU space when there were empty CPU sockets on the
> > motherboard.

Didn't get my hands on that particular board, but I suspect that those missed
CPUs will be set in possible mask, and unset in present, online and active
masks.

> I'd assume this would be visible in the cpu_possible_mask and thus be
> properly reflected in nr_cpu_ids. Otherwise that would already break with
> CONFIG_CPUMASK_OFFSTACK=y, I think.

Yes.

The nr_cpu_ids is set as:
  find_last_bit(cpumask_bits(cpu_possible_mask), NR_CPUS) + 1

For all board and VM configurations I've been working with, the
cpu_possible_mask
was dense up to nr_cpu_ids.The holes that may appear if HOTPLUG is enabled
or by any other reason are all in the present mask, and therefore
nr_cpu_ids is set
correctly.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1
  2022-09-06 15:01         ` Andy Shevchenko
@ 2022-09-06 17:53           ` Peter Zijlstra
  0 siblings, 0 replies; 40+ messages in thread
From: Peter Zijlstra @ 2022-09-06 17:53 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Yury Norov, linux-kernel, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers

On Tue, Sep 06, 2022 at 06:01:01PM +0300, Andy Shevchenko wrote:

> For the record, Intel produced somewhat a volume of Intel Quark chips [1],
> that are UP (in 2013-2019).

The day we drop 32bit x86 support can't be soon enough ;-)

^ permalink raw reply	[flat|nested] 40+ messages in thread

* RE: [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1
  2022-09-06 14:36       ` Peter Zijlstra
  2022-09-06 15:01         ` Andy Shevchenko
  2022-09-06 15:07         ` Yury Norov
@ 2022-09-07  8:02         ` David Laight
  2 siblings, 0 replies; 40+ messages in thread
From: David Laight @ 2022-09-07  8:02 UTC (permalink / raw)
  To: 'Peter Zijlstra', Yury Norov
  Cc: linux-kernel, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers

From: Peter Zijlstra
> Sent: 06 September 2022 15:37
...
> It was suggested the other day we remove a whole bunch of SMP=n code and
> unconditionally use SMP code, even if its pointless on UP just to make
> the source simpler.

All the world isn't x86.

There are some small embedded systems which are definitely UP
and where you don't want any of the SMP 'bloat'.
Architectures like NiosII will only ever be SMP.

There certainly have recently been (and probably still are)
small ppc cpu for embedded systems that are UP.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess
  2022-09-05 23:08 [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Yury Norov
                   ` (5 preceding siblings ...)
  2022-09-06  8:55 ` [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Peter Zijlstra
@ 2022-09-15 14:45 ` Yury Norov
  6 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-09-15 14:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, David Laight,
	Eric Biggers

On Mon, Sep 05, 2022 at 04:08:15PM -0700, Yury Norov wrote:
> cpumask subsystem uses nr_cpu_ids and nr_cpumask_bits interchangeably
> despite that the variables have different meaning and purpose. It makes
> some cpumask functions broken.
> 
> This series cleans that mess and adds new config FORCE_NR_CPUS that
> allows to optimize cpumask subsystem if the number of CPUs is known
> at compile-time.
> 
> After some testing I found build broken when SMP is on and NR_CPUS == 1.
> This is addressed in a new patch #1, and in the following patch #2 that
> now declares set_nr_cpu_ids in cpumask.h (was in smp.h).
> 
> v1: https://lore.kernel.org/lkml/20220829165748.1008477-1-yury.norov@gmail.com/T/#mecbd787f8d1bff1454a4ec2fe46ad6dc168df695
> v2:
>  - don't declare nr_cpu_ids if NR_CPUS == 1;
>  - move set_nr_cpu_ids() from smp.h to cpumask.h to avoid errors in some
>    inclusion paths;
>  - drop 'default n' in FORCE_NR_CPUS option description;
>  - rebase on top of v6.0-rc4.

Any more comments? If not, I'd like to move it into bitmap-for-next.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
  2022-09-05 23:08 ` [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option Yury Norov
@ 2022-10-18  8:21   ` Geert Uytterhoeven
  2022-10-18 13:15       ` Geert Uytterhoeven
  0 siblings, 1 reply; 40+ messages in thread
From: Geert Uytterhoeven @ 2022-10-18  8:21 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers

Hi Yury,

On Tue, Sep 6, 2022 at 1:10 AM Yury Norov <yury.norov@gmail.com> wrote:
> The size of cpumasks is hard-limited by compile-time parameter NR_CPUS,
> but defined at boot-time when kernel parses ACPI/DT tables, and stored in
> nr_cpu_ids. In many practical cases, number of CPUs for a target is known
> at compile time, and can be provided with NR_CPUS.
>
> In that case, compiler may be instructed to rely on NR_CPUS as on actual
> number of CPUs, not an upper limit. It allows to optimize many cpumask
> routines and significantly shrink size of the kernel image.
>
> This patch adds FORCE_NR_CPUS option to teach the compiler to rely on
> NR_CPUS and enable corresponding optimizations.
>
> If FORCE_NR_CPUS=y, kernel will not set nr_cpu_ids at boot, but only check
> that the actual number of possible CPUs is equal to NR_CPUS, and WARN if
> that doesn't hold.
>
> The new option is especially useful in embedded applications because
> kernel configurations are unique for each SoC, the number of CPUs is
> constant and known well, and memory limitations are typically harder.
>
> For my 4-CPU ARM64 build with NR_CPUS=4, FORCE_NR_CPUS=y saves 46KB:
>   add/remove: 3/4 grow/shrink: 46/729 up/down: 652/-46952 (-46300)
>
> Signed-off-by: Yury Norov <yury.norov@gmail.com>

Thanks for your patch, which is now commit 6f9c07be9d020489
("lib/cpumask: add FORCE_NR_CPUS config option") in v6.1-rc1.

FORCE_NR_CPUS is enabled for e.g. an allmodconfig kernel, which I
believe now makes it unsafe to boot such a kernel on any system that
does not have exactly CONFIG_NR_CPUS CPU cores?

If my assumption is true, this really needs some protection to prevent
enabling this option inadvertently, as it is quite common to boot
allmodconfig kernels for testing.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
  2022-10-18  8:21   ` Geert Uytterhoeven
@ 2022-10-18 13:15       ` Geert Uytterhoeven
  0 siblings, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2022-10-18 13:15 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

Hi Yury,

On Tue, Oct 18, 2022 at 10:21 AM Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> On Tue, Sep 6, 2022 at 1:10 AM Yury Norov <yury.norov@gmail.com> wrote:
> > The size of cpumasks is hard-limited by compile-time parameter NR_CPUS,
> > but defined at boot-time when kernel parses ACPI/DT tables, and stored in
> > nr_cpu_ids. In many practical cases, number of CPUs for a target is known
> > at compile time, and can be provided with NR_CPUS.
> >
> > In that case, compiler may be instructed to rely on NR_CPUS as on actual
> > number of CPUs, not an upper limit. It allows to optimize many cpumask
> > routines and significantly shrink size of the kernel image.
> >
> > This patch adds FORCE_NR_CPUS option to teach the compiler to rely on
> > NR_CPUS and enable corresponding optimizations.
> >
> > If FORCE_NR_CPUS=y, kernel will not set nr_cpu_ids at boot, but only check
> > that the actual number of possible CPUs is equal to NR_CPUS, and WARN if
> > that doesn't hold.
> >
> > The new option is especially useful in embedded applications because
> > kernel configurations are unique for each SoC, the number of CPUs is
> > constant and known well, and memory limitations are typically harder.
> >
> > For my 4-CPU ARM64 build with NR_CPUS=4, FORCE_NR_CPUS=y saves 46KB:
> >   add/remove: 3/4 grow/shrink: 46/729 up/down: 652/-46952 (-46300)
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
>
> Thanks for your patch, which is now commit 6f9c07be9d020489
> ("lib/cpumask: add FORCE_NR_CPUS config option") in v6.1-rc1.
>
> FORCE_NR_CPUS is enabled for e.g. an allmodconfig kernel, which I
> believe now makes it unsafe to boot such a kernel on any system that
> does not have exactly CONFIG_NR_CPUS CPU cores?
>
> If my assumption is true, this really needs some protection to prevent
> enabling this option inadvertently, as it is quite common to boot
> allmodconfig kernels for testing.

Moreover, this cannot be used on all systems.  E.g. on Icicle Kit with
Microchip PolarFire SoC, CONFIG_NR_CPUS needs to be larger than 4,
as the system has actually 5 CPU cores (1xE51 and 4xU54), but Linux
runs only on 4 of them.  So you cannot use FORCE_NR_CPUS=y.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
@ 2022-10-18 13:15       ` Geert Uytterhoeven
  0 siblings, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2022-10-18 13:15 UTC (permalink / raw)
  To: Yury Norov
  Cc: linux-kernel, Andy Shevchenko, Rasmus Villemoes, Andrew Morton,
	Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

Hi Yury,

On Tue, Oct 18, 2022 at 10:21 AM Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> On Tue, Sep 6, 2022 at 1:10 AM Yury Norov <yury.norov@gmail.com> wrote:
> > The size of cpumasks is hard-limited by compile-time parameter NR_CPUS,
> > but defined at boot-time when kernel parses ACPI/DT tables, and stored in
> > nr_cpu_ids. In many practical cases, number of CPUs for a target is known
> > at compile time, and can be provided with NR_CPUS.
> >
> > In that case, compiler may be instructed to rely on NR_CPUS as on actual
> > number of CPUs, not an upper limit. It allows to optimize many cpumask
> > routines and significantly shrink size of the kernel image.
> >
> > This patch adds FORCE_NR_CPUS option to teach the compiler to rely on
> > NR_CPUS and enable corresponding optimizations.
> >
> > If FORCE_NR_CPUS=y, kernel will not set nr_cpu_ids at boot, but only check
> > that the actual number of possible CPUs is equal to NR_CPUS, and WARN if
> > that doesn't hold.
> >
> > The new option is especially useful in embedded applications because
> > kernel configurations are unique for each SoC, the number of CPUs is
> > constant and known well, and memory limitations are typically harder.
> >
> > For my 4-CPU ARM64 build with NR_CPUS=4, FORCE_NR_CPUS=y saves 46KB:
> >   add/remove: 3/4 grow/shrink: 46/729 up/down: 652/-46952 (-46300)
> >
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
>
> Thanks for your patch, which is now commit 6f9c07be9d020489
> ("lib/cpumask: add FORCE_NR_CPUS config option") in v6.1-rc1.
>
> FORCE_NR_CPUS is enabled for e.g. an allmodconfig kernel, which I
> believe now makes it unsafe to boot such a kernel on any system that
> does not have exactly CONFIG_NR_CPUS CPU cores?
>
> If my assumption is true, this really needs some protection to prevent
> enabling this option inadvertently, as it is quite common to boot
> allmodconfig kernels for testing.

Moreover, this cannot be used on all systems.  E.g. on Icicle Kit with
Microchip PolarFire SoC, CONFIG_NR_CPUS needs to be larger than 4,
as the system has actually 5 CPU cores (1xE51 and 4xU54), but Linux
runs only on 4 of them.  So you cannot use FORCE_NR_CPUS=y.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
  2022-10-18 13:15       ` Geert Uytterhoeven
@ 2022-10-18 13:41         ` Andreas Schwab
  -1 siblings, 0 replies; 40+ messages in thread
From: Andreas Schwab @ 2022-10-18 13:41 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Yury Norov, linux-kernel, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

On Okt 18 2022, Geert Uytterhoeven wrote:

> Moreover, this cannot be used on all systems.  E.g. on Icicle Kit with
> Microchip PolarFire SoC, CONFIG_NR_CPUS needs to be larger than 4,
> as the system has actually 5 CPU cores (1xE51 and 4xU54), but Linux
> runs only on 4 of them.  So you cannot use FORCE_NR_CPUS=y.

But does Linux acually see the E51 core?  On the Hifive boards it is
disabled in the device tree, and the cpu probing just skips it,
effectively resulting in only four cpus.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
@ 2022-10-18 13:41         ` Andreas Schwab
  0 siblings, 0 replies; 40+ messages in thread
From: Andreas Schwab @ 2022-10-18 13:41 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Yury Norov, linux-kernel, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

On Okt 18 2022, Geert Uytterhoeven wrote:

> Moreover, this cannot be used on all systems.  E.g. on Icicle Kit with
> Microchip PolarFire SoC, CONFIG_NR_CPUS needs to be larger than 4,
> as the system has actually 5 CPU cores (1xE51 and 4xU54), but Linux
> runs only on 4 of them.  So you cannot use FORCE_NR_CPUS=y.

But does Linux acually see the E51 core?  On the Hifive boards it is
disabled in the device tree, and the cpu probing just skips it,
effectively resulting in only four cpus.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
  2022-10-18 13:41         ` Andreas Schwab
@ 2022-10-18 13:50           ` Geert Uytterhoeven
  -1 siblings, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2022-10-18 13:50 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Yury Norov, linux-kernel, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

Hi Andreas,

On Tue, Oct 18, 2022 at 3:41 PM Andreas Schwab <schwab@suse.de> wrote:
> On Okt 18 2022, Geert Uytterhoeven wrote:
> > Moreover, this cannot be used on all systems.  E.g. on Icicle Kit with
> > Microchip PolarFire SoC, CONFIG_NR_CPUS needs to be larger than 4,
> > as the system has actually 5 CPU cores (1xE51 and 4xU54), but Linux
> > runs only on 4 of them.  So you cannot use FORCE_NR_CPUS=y.
>
> But does Linux acually see the E51 core?  On the Hifive boards it is
> disabled in the device tree, and the cpu probing just skips it,
> effectively resulting in only four cpus.

The E51 is indeed disabled in DT.
The CPU parts of arch/riscv/boot/dts/sifive/fu540-c000.dtsi and
arch/riscv/boot/dts/microchip/mpfs.dtsi arre very similar.
Do you get 4 CPUs on Hifive with CONFIG_NR_CPUS=4?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
@ 2022-10-18 13:50           ` Geert Uytterhoeven
  0 siblings, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2022-10-18 13:50 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Yury Norov, linux-kernel, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

Hi Andreas,

On Tue, Oct 18, 2022 at 3:41 PM Andreas Schwab <schwab@suse.de> wrote:
> On Okt 18 2022, Geert Uytterhoeven wrote:
> > Moreover, this cannot be used on all systems.  E.g. on Icicle Kit with
> > Microchip PolarFire SoC, CONFIG_NR_CPUS needs to be larger than 4,
> > as the system has actually 5 CPU cores (1xE51 and 4xU54), but Linux
> > runs only on 4 of them.  So you cannot use FORCE_NR_CPUS=y.
>
> But does Linux acually see the E51 core?  On the Hifive boards it is
> disabled in the device tree, and the cpu probing just skips it,
> effectively resulting in only four cpus.

The E51 is indeed disabled in DT.
The CPU parts of arch/riscv/boot/dts/sifive/fu540-c000.dtsi and
arch/riscv/boot/dts/microchip/mpfs.dtsi arre very similar.
Do you get 4 CPUs on Hifive with CONFIG_NR_CPUS=4?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
  2022-10-18 13:50           ` Geert Uytterhoeven
@ 2022-10-18 14:35             ` Yury Norov
  -1 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-10-18 14:35 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Andreas Schwab, linux-kernel, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:
> Hi Andreas,
> 
> On Tue, Oct 18, 2022 at 3:41 PM Andreas Schwab <schwab@suse.de> wrote:
> > On Okt 18 2022, Geert Uytterhoeven wrote:
> > > Moreover, this cannot be used on all systems.  E.g. on Icicle Kit with
> > > Microchip PolarFire SoC, CONFIG_NR_CPUS needs to be larger than 4,
> > > as the system has actually 5 CPU cores (1xE51 and 4xU54), but Linux
> > > runs only on 4 of them.  So you cannot use FORCE_NR_CPUS=y.
> >
> > But does Linux acually see the E51 core?  On the Hifive boards it is
> > disabled in the device tree, and the cpu probing just skips it,
> > effectively resulting in only four cpus.
> 
> The E51 is indeed disabled in DT.
> The CPU parts of arch/riscv/boot/dts/sifive/fu540-c000.dtsi and
> arch/riscv/boot/dts/microchip/mpfs.dtsi arre very similar.
> Do you get 4 CPUs on Hifive with CONFIG_NR_CPUS=4?
> 
> Gr{oetje,eeting}s,

Hi Geert, Andreas,

Thanks for pointing that. Indeed, it should be disabled in
allmodconfig.

Linus also asked to make this option depending on CONFIG_EXPERT.
So, I'm working on a patch.

From general considerations, NR_CPUS defines length of cpumasks,
per-cpu arrays etc. If it's impossible to boot Linux on a specific
core, we don't need to reserve memory for all that. In other words,
number of possible CPUs in your example should be equal to 4.

When FORCE_NR_CPUS=y, the boot code still parses DT/ACPI tables for
actual numbers of CPUs, and if it doesn't equal to NR_CPUS, prints
a message in syslog. 

For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
to a value that matches to what's parsed from DT.

Can you please look at the draft below that disables FORCE_NR_CPUS
in allmodconfig? If it's OK with you, I'll send a patch. If you think
that there are architectures where it's not possible to set correct
NR_CPUS at compile time for some reason, I'll add ARCH_UNFORCE_NR_CPUS
option.

Thanks,
Yury
---
 lib/Kconfig | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/lib/Kconfig b/lib/Kconfig
index 9bbf8a4b2108..578cee3593d7 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -528,15 +528,33 @@ config CPUMASK_OFFSTACK
 	  them on the stack.  This is a bit more expensive, but avoids
 	  stack overflow.
 
+choice
+	prompt "Number of CPUs detection"
+	default UNFORCE_NR_CPUS
+        depends on SMP && EXPERT
+	help
+	  Select between boot-time and compile-time detection of number
+	  of CPUs. If it's possible to provide exact number of CPUs at
+	  compile-time, kernel code may be optimized better.
+
+	  For general-purpose kernel, choose "boot time" option.
+
+config UNFORCE_NR_CPUS
+	bool "Detect number of CPUs at boot time"
+	help
+	  Choose it if you build general-purpose kernel and want to rely
+	  on kernel to detect actual number of CPUs.
+
 config FORCE_NR_CPUS
        bool "NR_CPUS is set to an actual number of CPUs"
-       depends on SMP
        help
          Say Yes if you have NR_CPUS set to an actual number of possible
          CPUs in your system, not to a default value. This forces the core
          code to rely on compile-time value and optimize kernel routines
          better.
 
+endchoice
+
 config CPU_RMAP
 	bool
 	depends on SMP
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
@ 2022-10-18 14:35             ` Yury Norov
  0 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-10-18 14:35 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Andreas Schwab, linux-kernel, Andy Shevchenko, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:
> Hi Andreas,
> 
> On Tue, Oct 18, 2022 at 3:41 PM Andreas Schwab <schwab@suse.de> wrote:
> > On Okt 18 2022, Geert Uytterhoeven wrote:
> > > Moreover, this cannot be used on all systems.  E.g. on Icicle Kit with
> > > Microchip PolarFire SoC, CONFIG_NR_CPUS needs to be larger than 4,
> > > as the system has actually 5 CPU cores (1xE51 and 4xU54), but Linux
> > > runs only on 4 of them.  So you cannot use FORCE_NR_CPUS=y.
> >
> > But does Linux acually see the E51 core?  On the Hifive boards it is
> > disabled in the device tree, and the cpu probing just skips it,
> > effectively resulting in only four cpus.
> 
> The E51 is indeed disabled in DT.
> The CPU parts of arch/riscv/boot/dts/sifive/fu540-c000.dtsi and
> arch/riscv/boot/dts/microchip/mpfs.dtsi arre very similar.
> Do you get 4 CPUs on Hifive with CONFIG_NR_CPUS=4?
> 
> Gr{oetje,eeting}s,

Hi Geert, Andreas,

Thanks for pointing that. Indeed, it should be disabled in
allmodconfig.

Linus also asked to make this option depending on CONFIG_EXPERT.
So, I'm working on a patch.

From general considerations, NR_CPUS defines length of cpumasks,
per-cpu arrays etc. If it's impossible to boot Linux on a specific
core, we don't need to reserve memory for all that. In other words,
number of possible CPUs in your example should be equal to 4.

When FORCE_NR_CPUS=y, the boot code still parses DT/ACPI tables for
actual numbers of CPUs, and if it doesn't equal to NR_CPUS, prints
a message in syslog. 

For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
to a value that matches to what's parsed from DT.

Can you please look at the draft below that disables FORCE_NR_CPUS
in allmodconfig? If it's OK with you, I'll send a patch. If you think
that there are architectures where it's not possible to set correct
NR_CPUS at compile time for some reason, I'll add ARCH_UNFORCE_NR_CPUS
option.

Thanks,
Yury
---
 lib/Kconfig | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/lib/Kconfig b/lib/Kconfig
index 9bbf8a4b2108..578cee3593d7 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -528,15 +528,33 @@ config CPUMASK_OFFSTACK
 	  them on the stack.  This is a bit more expensive, but avoids
 	  stack overflow.
 
+choice
+	prompt "Number of CPUs detection"
+	default UNFORCE_NR_CPUS
+        depends on SMP && EXPERT
+	help
+	  Select between boot-time and compile-time detection of number
+	  of CPUs. If it's possible to provide exact number of CPUs at
+	  compile-time, kernel code may be optimized better.
+
+	  For general-purpose kernel, choose "boot time" option.
+
+config UNFORCE_NR_CPUS
+	bool "Detect number of CPUs at boot time"
+	help
+	  Choose it if you build general-purpose kernel and want to rely
+	  on kernel to detect actual number of CPUs.
+
 config FORCE_NR_CPUS
        bool "NR_CPUS is set to an actual number of CPUs"
-       depends on SMP
        help
          Say Yes if you have NR_CPUS set to an actual number of possible
          CPUs in your system, not to a default value. This forces the core
          code to rely on compile-time value and optimize kernel routines
          better.
 
+endchoice
+
 config CPU_RMAP
 	bool
 	depends on SMP
-- 
2.34.1


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply related	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
  2022-10-18 14:35             ` Yury Norov
@ 2022-10-18 14:44               ` Andy Shevchenko
  -1 siblings, 0 replies; 40+ messages in thread
From: Andy Shevchenko @ 2022-10-18 14:44 UTC (permalink / raw)
  To: Yury Norov
  Cc: Geert Uytterhoeven, Andreas Schwab, linux-kernel,
	Rasmus Villemoes, Andrew Morton, Stephen Rothwell,
	Peter Zijlstra, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers, linux-riscv

On Tue, Oct 18, 2022 at 07:35:09AM -0700, Yury Norov wrote:
> On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:

...

> For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
> to a value that matches to what's parsed from DT.
> 
> Can you please look at the draft below that disables FORCE_NR_CPUS
> in allmodconfig? If it's OK with you, I'll send a patch. If you think
> that there are architectures where it's not possible to set correct
> NR_CPUS at compile time for some reason, I'll add ARCH_UNFORCE_NR_CPUS
> option.

Instead you may simply add

	depends on CONFIG_$ARCH/$MACHINE=n

and so on to the FORCE_NR_CPUS, no?

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
@ 2022-10-18 14:44               ` Andy Shevchenko
  0 siblings, 0 replies; 40+ messages in thread
From: Andy Shevchenko @ 2022-10-18 14:44 UTC (permalink / raw)
  To: Yury Norov
  Cc: Geert Uytterhoeven, Andreas Schwab, linux-kernel,
	Rasmus Villemoes, Andrew Morton, Stephen Rothwell,
	Peter Zijlstra, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers, linux-riscv

On Tue, Oct 18, 2022 at 07:35:09AM -0700, Yury Norov wrote:
> On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:

...

> For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
> to a value that matches to what's parsed from DT.
> 
> Can you please look at the draft below that disables FORCE_NR_CPUS
> in allmodconfig? If it's OK with you, I'll send a patch. If you think
> that there are architectures where it's not possible to set correct
> NR_CPUS at compile time for some reason, I'll add ARCH_UNFORCE_NR_CPUS
> option.

Instead you may simply add

	depends on CONFIG_$ARCH/$MACHINE=n

and so on to the FORCE_NR_CPUS, no?

-- 
With Best Regards,
Andy Shevchenko



_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
  2022-10-18 14:44               ` Andy Shevchenko
@ 2022-10-18 14:59                 ` Yury Norov
  -1 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-10-18 14:59 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Geert Uytterhoeven, Andreas Schwab, linux-kernel,
	Rasmus Villemoes, Andrew Morton, Stephen Rothwell,
	Peter Zijlstra, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers, linux-riscv

On Tue, Oct 18, 2022 at 05:44:09PM +0300, Andy Shevchenko wrote:
> On Tue, Oct 18, 2022 at 07:35:09AM -0700, Yury Norov wrote:
> > On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:
> 
> ...
> 
> > For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
> > to a value that matches to what's parsed from DT.
> > 
> > Can you please look at the draft below that disables FORCE_NR_CPUS
> > in allmodconfig? If it's OK with you, I'll send a patch. If you think
> > that there are architectures where it's not possible to set correct
> > NR_CPUS at compile time for some reason, I'll add ARCH_UNFORCE_NR_CPUS
> > option.
> 
> Instead you may simply add
> 
> 	depends on CONFIG_$ARCH/$MACHINE=n
> 
> and so on to the FORCE_NR_CPUS, no?

Yes, if there's just one machine like that. If there's many of them, the
'depends' list would be too long.

I hope there's no such a weird machines, and we don't need that at
all. Let's see what Geert will say.


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
@ 2022-10-18 14:59                 ` Yury Norov
  0 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-10-18 14:59 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Geert Uytterhoeven, Andreas Schwab, linux-kernel,
	Rasmus Villemoes, Andrew Morton, Stephen Rothwell,
	Peter Zijlstra, Thomas Gleixner, Paul E . McKenney,
	Vlastimil Babka, Dmitry Vyukov, Valentin Schneider,
	Sander Vanheule, Alexey Klimov, Eric Biggers, linux-riscv

On Tue, Oct 18, 2022 at 05:44:09PM +0300, Andy Shevchenko wrote:
> On Tue, Oct 18, 2022 at 07:35:09AM -0700, Yury Norov wrote:
> > On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:
> 
> ...
> 
> > For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
> > to a value that matches to what's parsed from DT.
> > 
> > Can you please look at the draft below that disables FORCE_NR_CPUS
> > in allmodconfig? If it's OK with you, I'll send a patch. If you think
> > that there are architectures where it's not possible to set correct
> > NR_CPUS at compile time for some reason, I'll add ARCH_UNFORCE_NR_CPUS
> > option.
> 
> Instead you may simply add
> 
> 	depends on CONFIG_$ARCH/$MACHINE=n
> 
> and so on to the FORCE_NR_CPUS, no?

Yes, if there's just one machine like that. If there's many of them, the
'depends' list would be too long.

I hope there's no such a weird machines, and we don't need that at
all. Let's see what Geert will say.


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
  2022-10-18 14:59                 ` Yury Norov
@ 2022-10-18 15:15                   ` Geert Uytterhoeven
  -1 siblings, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2022-10-18 15:15 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Andreas Schwab, linux-kernel, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

Hi Yuri,

On Tue, Oct 18, 2022 at 5:01 PM Yury Norov <yury.norov@gmail.com> wrote:
> On Tue, Oct 18, 2022 at 05:44:09PM +0300, Andy Shevchenko wrote:
> > On Tue, Oct 18, 2022 at 07:35:09AM -0700, Yury Norov wrote:
> > > On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:
> >
> > ...
> >
> > > For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
> > > to a value that matches to what's parsed from DT.
> > >
> > > Can you please look at the draft below that disables FORCE_NR_CPUS
> > > in allmodconfig? If it's OK with you, I'll send a patch. If you think
> > > that there are architectures where it's not possible to set correct
> > > NR_CPUS at compile time for some reason, I'll add ARCH_UNFORCE_NR_CPUS
> > > option.
> >
> > Instead you may simply add
> >
> >       depends on CONFIG_$ARCH/$MACHINE=n
> >
> > and so on to the FORCE_NR_CPUS, no?
>
> Yes, if there's just one machine like that. If there's many of them, the
> 'depends' list would be too long.
>
> I hope there's no such a weird machines, and we don't need that at
> all. Let's see what Geert will say.

I haven't tried the patch from your other email yet, but I did try
CONFIG_NR_CPUS=4 and CONFIG_FORCE_NR_CPUS=y on
Icicle earlier today.

There was no warning, as the number of CPUs did match, but the
fourth CPU (cpu@4, i.e. the fifth core in DT) failed to come online:

    CPU3: failed to come online
    smp: Brought up 1 node, 3 CPUs

BTW, it behaves the same with CONFIG_FORCE_NR_CPUS=n.
Increasing CONFIG_NR_CPUS (before I used 8) makes the fourth
CPU core come online again.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
@ 2022-10-18 15:15                   ` Geert Uytterhoeven
  0 siblings, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2022-10-18 15:15 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Andreas Schwab, linux-kernel, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

Hi Yuri,

On Tue, Oct 18, 2022 at 5:01 PM Yury Norov <yury.norov@gmail.com> wrote:
> On Tue, Oct 18, 2022 at 05:44:09PM +0300, Andy Shevchenko wrote:
> > On Tue, Oct 18, 2022 at 07:35:09AM -0700, Yury Norov wrote:
> > > On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:
> >
> > ...
> >
> > > For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
> > > to a value that matches to what's parsed from DT.
> > >
> > > Can you please look at the draft below that disables FORCE_NR_CPUS
> > > in allmodconfig? If it's OK with you, I'll send a patch. If you think
> > > that there are architectures where it's not possible to set correct
> > > NR_CPUS at compile time for some reason, I'll add ARCH_UNFORCE_NR_CPUS
> > > option.
> >
> > Instead you may simply add
> >
> >       depends on CONFIG_$ARCH/$MACHINE=n
> >
> > and so on to the FORCE_NR_CPUS, no?
>
> Yes, if there's just one machine like that. If there's many of them, the
> 'depends' list would be too long.
>
> I hope there's no such a weird machines, and we don't need that at
> all. Let's see what Geert will say.

I haven't tried the patch from your other email yet, but I did try
CONFIG_NR_CPUS=4 and CONFIG_FORCE_NR_CPUS=y on
Icicle earlier today.

There was no warning, as the number of CPUs did match, but the
fourth CPU (cpu@4, i.e. the fifth core in DT) failed to come online:

    CPU3: failed to come online
    smp: Brought up 1 node, 3 CPUs

BTW, it behaves the same with CONFIG_FORCE_NR_CPUS=n.
Increasing CONFIG_NR_CPUS (before I used 8) makes the fourth
CPU core come online again.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
  2022-10-18 15:15                   ` Geert Uytterhoeven
@ 2022-10-18 16:16                     ` Yury Norov
  -1 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-10-18 16:16 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Andy Shevchenko, Andreas Schwab, linux-kernel, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

On Tue, Oct 18, 2022 at 05:15:41PM +0200, Geert Uytterhoeven wrote:
> Hi Yuri,
> 
> On Tue, Oct 18, 2022 at 5:01 PM Yury Norov <yury.norov@gmail.com> wrote:
> > On Tue, Oct 18, 2022 at 05:44:09PM +0300, Andy Shevchenko wrote:
> > > On Tue, Oct 18, 2022 at 07:35:09AM -0700, Yury Norov wrote:
> > > > On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:
> > >
> > > ...
> > >
> > > > For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
> > > > to a value that matches to what's parsed from DT.

...

> I haven't tried the patch from your other email yet, but I did try
> CONFIG_NR_CPUS=4 and CONFIG_FORCE_NR_CPUS=y on
> Icicle earlier today.
> 
> There was no warning, as the number of CPUs did match, but the
> fourth CPU (cpu@4, i.e. the fifth core in DT) failed to come online:
> 
>     CPU3: failed to come online
>     smp: Brought up 1 node, 3 CPUs
> 
> BTW, it behaves the same with CONFIG_FORCE_NR_CPUS=n.
> Increasing CONFIG_NR_CPUS (before I used 8) makes the fourth
> CPU core come online again.

The problem is seemingly unrelated to CONFIG_FORCE_NR_CPUS...
If so, we don't need ARCH_UNFORCE_NR_CPUS. Is that right?

This all looks weird. RISCV hasn't an arch code to setup nr_cpu_ids,
and therefore should use generic setup_nr_cpu_ids(), which is:

  void __init setup_nr_cpu_ids(void)
  {
          set_nr_cpu_ids(find_last_bit(cpumask_bits(cpu_possible_mask), NR_CPUS) + 1);
  }

Where:

  static inline void set_nr_cpu_ids(unsigned int nr)
  {
  #if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
          WARN_ON(nr != nr_cpu_ids);
  #else
          nr_cpu_ids = nr;
  #endif
  }


As you can see, at this point cpu_possible_mask is initialized based
on DT, and even if arch has non-dense cpu_possible_mask, the logic
should still be correct.

Wish I could tell more, if I had an access to the hardware...

Thanks,
Yury

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
@ 2022-10-18 16:16                     ` Yury Norov
  0 siblings, 0 replies; 40+ messages in thread
From: Yury Norov @ 2022-10-18 16:16 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Andy Shevchenko, Andreas Schwab, linux-kernel, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

On Tue, Oct 18, 2022 at 05:15:41PM +0200, Geert Uytterhoeven wrote:
> Hi Yuri,
> 
> On Tue, Oct 18, 2022 at 5:01 PM Yury Norov <yury.norov@gmail.com> wrote:
> > On Tue, Oct 18, 2022 at 05:44:09PM +0300, Andy Shevchenko wrote:
> > > On Tue, Oct 18, 2022 at 07:35:09AM -0700, Yury Norov wrote:
> > > > On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:
> > >
> > > ...
> > >
> > > > For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
> > > > to a value that matches to what's parsed from DT.

...

> I haven't tried the patch from your other email yet, but I did try
> CONFIG_NR_CPUS=4 and CONFIG_FORCE_NR_CPUS=y on
> Icicle earlier today.
> 
> There was no warning, as the number of CPUs did match, but the
> fourth CPU (cpu@4, i.e. the fifth core in DT) failed to come online:
> 
>     CPU3: failed to come online
>     smp: Brought up 1 node, 3 CPUs
> 
> BTW, it behaves the same with CONFIG_FORCE_NR_CPUS=n.
> Increasing CONFIG_NR_CPUS (before I used 8) makes the fourth
> CPU core come online again.

The problem is seemingly unrelated to CONFIG_FORCE_NR_CPUS...
If so, we don't need ARCH_UNFORCE_NR_CPUS. Is that right?

This all looks weird. RISCV hasn't an arch code to setup nr_cpu_ids,
and therefore should use generic setup_nr_cpu_ids(), which is:

  void __init setup_nr_cpu_ids(void)
  {
          set_nr_cpu_ids(find_last_bit(cpumask_bits(cpu_possible_mask), NR_CPUS) + 1);
  }

Where:

  static inline void set_nr_cpu_ids(unsigned int nr)
  {
  #if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
          WARN_ON(nr != nr_cpu_ids);
  #else
          nr_cpu_ids = nr;
  #endif
  }


As you can see, at this point cpu_possible_mask is initialized based
on DT, and even if arch has non-dense cpu_possible_mask, the logic
should still be correct.

Wish I could tell more, if I had an access to the hardware...

Thanks,
Yury

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
  2022-10-18 16:16                     ` Yury Norov
@ 2022-10-19  6:58                       ` Geert Uytterhoeven
  -1 siblings, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2022-10-19  6:58 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Andreas Schwab, linux-kernel, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

Hi Yury,

On Tue, Oct 18, 2022 at 6:19 PM Yury Norov <yury.norov@gmail.com> wrote:
> On Tue, Oct 18, 2022 at 05:15:41PM +0200, Geert Uytterhoeven wrote:
> > On Tue, Oct 18, 2022 at 5:01 PM Yury Norov <yury.norov@gmail.com> wrote:
> > > On Tue, Oct 18, 2022 at 05:44:09PM +0300, Andy Shevchenko wrote:
> > > > On Tue, Oct 18, 2022 at 07:35:09AM -0700, Yury Norov wrote:
> > > > > On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:
> > > >
> > > > ...
> > > >
> > > > > For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
> > > > > to a value that matches to what's parsed from DT.
>
> ...
>
> > I haven't tried the patch from your other email yet, but I did try
> > CONFIG_NR_CPUS=4 and CONFIG_FORCE_NR_CPUS=y on
> > Icicle earlier today.
> >
> > There was no warning, as the number of CPUs did match, but the
> > fourth CPU (cpu@4, i.e. the fifth core in DT) failed to come online:
> >
> >     CPU3: failed to come online
> >     smp: Brought up 1 node, 3 CPUs
> >
> > BTW, it behaves the same with CONFIG_FORCE_NR_CPUS=n.
> > Increasing CONFIG_NR_CPUS (before I used 8) makes the fourth
> > CPU core come online again.
>
> The problem is seemingly unrelated to CONFIG_FORCE_NR_CPUS...
> If so, we don't need ARCH_UNFORCE_NR_CPUS. Is that right?
>
> This all looks weird. RISCV hasn't an arch code to setup nr_cpu_ids,
> and therefore should use generic setup_nr_cpu_ids(), which is:
>
>   void __init setup_nr_cpu_ids(void)
>   {
>           set_nr_cpu_ids(find_last_bit(cpumask_bits(cpu_possible_mask), NR_CPUS) + 1);
>   }
>
> Where:
>
>   static inline void set_nr_cpu_ids(unsigned int nr)
>   {
>   #if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
>           WARN_ON(nr != nr_cpu_ids);
>   #else
>           nr_cpu_ids = nr;
>   #endif
>   }
>
>
> As you can see, at this point cpu_possible_mask is initialized based
> on DT, and even if arch has non-dense cpu_possible_mask, the logic
> should still be correct.

Quite possible this is just an issue with the RISC-V CPU sparse hart ID
handling code.  E.g. arm64 works fine with cpu@{0,1,2,3,100,101,102,103}
and CONFIG_NR_CPUS=8.  NR_CPUS in arch/riscv/Kconfig has always
defaulted to at least 8, while all upstream DTS files describe only
4 Linux-capable CPU cores (+ 1 management core).

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option
@ 2022-10-19  6:58                       ` Geert Uytterhoeven
  0 siblings, 0 replies; 40+ messages in thread
From: Geert Uytterhoeven @ 2022-10-19  6:58 UTC (permalink / raw)
  To: Yury Norov
  Cc: Andy Shevchenko, Andreas Schwab, linux-kernel, Rasmus Villemoes,
	Andrew Morton, Stephen Rothwell, Peter Zijlstra, Thomas Gleixner,
	Paul E . McKenney, Vlastimil Babka, Dmitry Vyukov,
	Valentin Schneider, Sander Vanheule, Alexey Klimov, Eric Biggers,
	linux-riscv

Hi Yury,

On Tue, Oct 18, 2022 at 6:19 PM Yury Norov <yury.norov@gmail.com> wrote:
> On Tue, Oct 18, 2022 at 05:15:41PM +0200, Geert Uytterhoeven wrote:
> > On Tue, Oct 18, 2022 at 5:01 PM Yury Norov <yury.norov@gmail.com> wrote:
> > > On Tue, Oct 18, 2022 at 05:44:09PM +0300, Andy Shevchenko wrote:
> > > > On Tue, Oct 18, 2022 at 07:35:09AM -0700, Yury Norov wrote:
> > > > > On Tue, Oct 18, 2022 at 03:50:31PM +0200, Geert Uytterhoeven wrote:
> > > >
> > > > ...
> > > >
> > > > > For those who choose FORCE_NR_CPUS, it's required to set NR_CPUS
> > > > > to a value that matches to what's parsed from DT.
>
> ...
>
> > I haven't tried the patch from your other email yet, but I did try
> > CONFIG_NR_CPUS=4 and CONFIG_FORCE_NR_CPUS=y on
> > Icicle earlier today.
> >
> > There was no warning, as the number of CPUs did match, but the
> > fourth CPU (cpu@4, i.e. the fifth core in DT) failed to come online:
> >
> >     CPU3: failed to come online
> >     smp: Brought up 1 node, 3 CPUs
> >
> > BTW, it behaves the same with CONFIG_FORCE_NR_CPUS=n.
> > Increasing CONFIG_NR_CPUS (before I used 8) makes the fourth
> > CPU core come online again.
>
> The problem is seemingly unrelated to CONFIG_FORCE_NR_CPUS...
> If so, we don't need ARCH_UNFORCE_NR_CPUS. Is that right?
>
> This all looks weird. RISCV hasn't an arch code to setup nr_cpu_ids,
> and therefore should use generic setup_nr_cpu_ids(), which is:
>
>   void __init setup_nr_cpu_ids(void)
>   {
>           set_nr_cpu_ids(find_last_bit(cpumask_bits(cpu_possible_mask), NR_CPUS) + 1);
>   }
>
> Where:
>
>   static inline void set_nr_cpu_ids(unsigned int nr)
>   {
>   #if (NR_CPUS == 1) || defined(CONFIG_FORCE_NR_CPUS)
>           WARN_ON(nr != nr_cpu_ids);
>   #else
>           nr_cpu_ids = nr;
>   #endif
>   }
>
>
> As you can see, at this point cpu_possible_mask is initialized based
> on DT, and even if arch has non-dense cpu_possible_mask, the logic
> should still be correct.

Quite possible this is just an issue with the RISC-V CPU sparse hart ID
handling code.  E.g. arm64 works fine with cpu@{0,1,2,3,100,101,102,103}
and CONFIG_NR_CPUS=8.  NR_CPUS in arch/riscv/Kconfig has always
defaulted to at least 8, while all upstream DTS files describe only
4 Linux-capable CPU cores (+ 1 management core).

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2022-10-19  6:59 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-05 23:08 [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Yury Norov
2022-09-05 23:08 ` [PATCH v2 1/5] smp: don't declare nr_cpu_ids if NR_CPUS == 1 Yury Norov
2022-09-06  8:53   ` Peter Zijlstra
2022-09-06 14:06     ` Yury Norov
2022-09-06 14:36       ` Peter Zijlstra
2022-09-06 15:01         ` Andy Shevchenko
2022-09-06 17:53           ` Peter Zijlstra
2022-09-06 15:07         ` Yury Norov
2022-09-07  8:02         ` David Laight
2022-09-05 23:08 ` [PATCH v2 2/5] smp: add set_nr_cpu_ids() Yury Norov
2022-09-05 23:08 ` [PATCH v2 3/5] lib/cpumask: delete misleading comment Yury Norov
2022-09-05 23:08 ` [PATCH v2 4/5] lib/cpumask: deprecate nr_cpumask_bits Yury Norov
2022-09-05 23:08 ` [PATCH v2 5/5] lib/cpumask: add FORCE_NR_CPUS config option Yury Norov
2022-10-18  8:21   ` Geert Uytterhoeven
2022-10-18 13:15     ` Geert Uytterhoeven
2022-10-18 13:15       ` Geert Uytterhoeven
2022-10-18 13:41       ` Andreas Schwab
2022-10-18 13:41         ` Andreas Schwab
2022-10-18 13:50         ` Geert Uytterhoeven
2022-10-18 13:50           ` Geert Uytterhoeven
2022-10-18 14:35           ` Yury Norov
2022-10-18 14:35             ` Yury Norov
2022-10-18 14:44             ` Andy Shevchenko
2022-10-18 14:44               ` Andy Shevchenko
2022-10-18 14:59               ` Yury Norov
2022-10-18 14:59                 ` Yury Norov
2022-10-18 15:15                 ` Geert Uytterhoeven
2022-10-18 15:15                   ` Geert Uytterhoeven
2022-10-18 16:16                   ` Yury Norov
2022-10-18 16:16                     ` Yury Norov
2022-10-19  6:58                     ` Geert Uytterhoeven
2022-10-19  6:58                       ` Geert Uytterhoeven
2022-09-06  8:55 ` [PATCH v2 0/5] cpumask: cleanup nr_cpu_ids vs nr_cpumask_bits mess Peter Zijlstra
2022-09-06 12:06   ` Valentin Schneider
2022-09-06 14:38     ` Peter Zijlstra
2022-09-06 15:45       ` Valentin Schneider
2022-09-06 16:26         ` Yury Norov
2022-09-06 14:48     ` Yury Norov
2022-09-06 14:35   ` Yury Norov
2022-09-15 14:45 ` Yury Norov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.