* [PATCH] cpu/hotplug: Cache number of online CPUs
@ 2019-07-04 20:42 Thomas Gleixner
  2019-07-04 20:59 ` Mathieu Desnoyers
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2019-07-04 20:42 UTC (permalink / raw)
  To: LKML; +Cc: x86, Nadav Amit, Mathieu Desnoyers

Re-evaluating the bitmap weight of the online cpus bitmap in every
invocation of num_online_cpus() over and over is a pretty useless
exercise. Especially when num_online_cpus() is used in code paths like the
IPI delivery of x86 or the membarrier code.

Cache the number of online CPUs in the core and just return the cached
variable.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/cpumask.h |   16 +++++++---------
 kernel/cpu.c            |   16 ++++++++++++++++
 2 files changed, 23 insertions(+), 9 deletions(-)

--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -95,8 +95,13 @@ extern struct cpumask __cpu_active_mask;
 #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
 #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
 
+extern unsigned int __num_online_cpus;
+
 #if NR_CPUS > 1
-#define num_online_cpus()	cpumask_weight(cpu_online_mask)
+static inline unsigned int num_online_cpus(void)
+{
+	return __num_online_cpus;
+}
 #define num_possible_cpus()	cpumask_weight(cpu_possible_mask)
 #define num_present_cpus()	cpumask_weight(cpu_present_mask)
 #define num_active_cpus()	cpumask_weight(cpu_active_mask)
@@ -821,14 +826,7 @@ set_cpu_present(unsigned int cpu, bool p
 		cpumask_clear_cpu(cpu, &__cpu_present_mask);
 }
 
-static inline void
-set_cpu_online(unsigned int cpu, bool online)
-{
-	if (online)
-		cpumask_set_cpu(cpu, &__cpu_online_mask);
-	else
-		cpumask_clear_cpu(cpu, &__cpu_online_mask);
-}
+void set_cpu_online(unsigned int cpu, bool online);
 
 static inline void
 set_cpu_active(unsigned int cpu, bool active)
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2291,6 +2291,9 @@ EXPORT_SYMBOL(__cpu_present_mask);
 struct cpumask __cpu_active_mask __read_mostly;
 EXPORT_SYMBOL(__cpu_active_mask);
 
+unsigned int __num_online_cpus __read_mostly;
+EXPORT_SYMBOL(__num_online_cpus);
+
 void init_cpu_present(const struct cpumask *src)
 {
 	cpumask_copy(&__cpu_present_mask, src);
@@ -2306,6 +2309,19 @@ void init_cpu_online(const struct cpumas
 	cpumask_copy(&__cpu_online_mask, src);
 }
 
+void set_cpu_online(unsigned int cpu, bool online)
+{
+	lockdep_assert_cpus_held();
+
+	if (online) {
+		if (!cpumask_test_and_set_cpu(cpu, &__cpu_online_mask))
+			__num_online_cpus++;
+	} else {
+		if (cpumask_test_and_clear_cpu(cpu, &__cpu_online_mask))
+			__num_online_cpus--;
+	}
+}
+
 /*
  * Activate the first processor.
  */

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH] cpu/hotplug: Cache number of online CPUs
  2019-07-04 20:42 [PATCH] cpu/hotplug: Cache number of online CPUs Thomas Gleixner
@ 2019-07-04 20:59 ` Mathieu Desnoyers
  2019-07-04 21:10   ` Thomas Gleixner
  0 siblings, 1 reply; 18+ messages in thread
From: Mathieu Desnoyers @ 2019-07-04 20:59 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel, x86, Nadav Amit

----- On Jul 4, 2019, at 4:42 PM, Thomas Gleixner tglx@linutronix.de wrote:

> Re-evaluating the bitmap weight of the online cpus bitmap in every
> invocation of num_online_cpus() over and over is a pretty useless
> exercise. Especially when num_online_cpus() is used in code paths like the
> IPI delivery of x86 or the membarrier code.
> 
> Cache the number of online CPUs in the core and just return the cached
> variable.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> include/linux/cpumask.h |   16 +++++++---------
> kernel/cpu.c            |   16 ++++++++++++++++
> 2 files changed, 23 insertions(+), 9 deletions(-)
> 
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -95,8 +95,13 @@ extern struct cpumask __cpu_active_mask;
> #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
> #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
> 
> +extern unsigned int __num_online_cpus;

[...]

> +
> +void set_cpu_online(unsigned int cpu, bool online)
> +{
> +	lockdep_assert_cpus_held();

I don't think it is required that the cpu_hotplug lock is held
when reading __num_online_cpus, right ?

I would have expected the increment/decrement below to be performed
with a WRITE_ONCE(), and use a READ_ONCE() when reading the current
value.

Thanks,

Mathieu

> +
> +	if (online) {
> +		if (!cpumask_test_and_set_cpu(cpu, &__cpu_online_mask))
> +			__num_online_cpus++;
> +	} else {
> +		if (cpumask_test_and_clear_cpu(cpu, &__cpu_online_mask))
> +			__num_online_cpus--;
> +	}
> +}
> +
> /*
>  * Activate the first processor.
>   */

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* Re: [PATCH] cpu/hotplug: Cache number of online CPUs
  2019-07-04 20:59 ` Mathieu Desnoyers
@ 2019-07-04 21:10   ` Thomas Gleixner
  2019-07-04 22:00     ` Mathieu Desnoyers
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2019-07-04 21:10 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: linux-kernel, x86, Nadav Amit

On Thu, 4 Jul 2019, Mathieu Desnoyers wrote:

> ----- On Jul 4, 2019, at 4:42 PM, Thomas Gleixner tglx@linutronix.de wrote:
> 
> > Re-evaluating the bitmap weight of the online cpus bitmap in every
> > invocation of num_online_cpus() over and over is a pretty useless
> > exercise. Especially when num_online_cpus() is used in code paths like the
> > IPI delivery of x86 or the membarrier code.
> > 
> > Cache the number of online CPUs in the core and just return the cached
> > variable.
> > 
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > ---
> > include/linux/cpumask.h |   16 +++++++---------
> > kernel/cpu.c            |   16 ++++++++++++++++
> > 2 files changed, 23 insertions(+), 9 deletions(-)
> > 
> > --- a/include/linux/cpumask.h
> > +++ b/include/linux/cpumask.h
> > @@ -95,8 +95,13 @@ extern struct cpumask __cpu_active_mask;
> > #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
> > #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
> > 
> > +extern unsigned int __num_online_cpus;
> 
> [...]
> 
> > +
> > +void set_cpu_online(unsigned int cpu, bool online)
> > +{
> > +	lockdep_assert_cpus_held();
> 
> I don't think it is required that the cpu_hotplug lock is held
> when reading __num_online_cpus, right ?

Errm, that's the update function. And this is better called from a hotplug
lock held region and not from some random crappy code.

> I would have expected the increment/decrement below to be performed
> with a WRITE_ONCE(), and use a READ_ONCE() when reading the current
> value.

What for?

num_online_cpus() is racy today vs. CPU hotplug operations as
long as you don't hold the hotplug lock.

Thanks,

	tglx





* Re: [PATCH] cpu/hotplug: Cache number of online CPUs
  2019-07-04 21:10   ` Thomas Gleixner
@ 2019-07-04 22:00     ` Mathieu Desnoyers
  2019-07-04 22:33       ` Thomas Gleixner
  0 siblings, 1 reply; 18+ messages in thread
From: Mathieu Desnoyers @ 2019-07-04 22:00 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel, x86, Nadav Amit

----- On Jul 4, 2019, at 5:10 PM, Thomas Gleixner tglx@linutronix.de wrote:

> On Thu, 4 Jul 2019, Mathieu Desnoyers wrote:
> 
>> ----- On Jul 4, 2019, at 4:42 PM, Thomas Gleixner tglx@linutronix.de wrote:
>> 
>> > Re-evaluating the bitmap weight of the online cpus bitmap in every
>> > invocation of num_online_cpus() over and over is a pretty useless
>> > exercise. Especially when num_online_cpus() is used in code paths like the
>> > IPI delivery of x86 or the membarrier code.
>> > 
>> > Cache the number of online CPUs in the core and just return the cached
>> > variable.
>> > 
>> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
>> > ---
>> > include/linux/cpumask.h |   16 +++++++---------
>> > kernel/cpu.c            |   16 ++++++++++++++++
>> > 2 files changed, 23 insertions(+), 9 deletions(-)
>> > 
>> > --- a/include/linux/cpumask.h
>> > +++ b/include/linux/cpumask.h
>> > @@ -95,8 +95,13 @@ extern struct cpumask __cpu_active_mask;
>> > #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
>> > #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
>> > 
>> > +extern unsigned int __num_online_cpus;
>> 
>> [...]
>> 
>> > +
>> > +void set_cpu_online(unsigned int cpu, bool online)
>> > +{
>> > +	lockdep_assert_cpus_held();
>> 
>> I don't think it is required that the cpu_hotplug lock is held
>> when reading __num_online_cpus, right ?
> 
> Errm, that's the update function. And this is better called from a hotplug
> lock held region and not from some random crappy code.

Sure, this is fine to assume this lock is held for the update.
It's the read-side I'm worried about (which does not hold the lock).

> 
>> I would have expected the increment/decrement below to be performed
>> with a WRITE_ONCE(), and use a READ_ONCE() when reading the current
>> value.
> 
> What for?
> 
> num_online_cpus() is racy today vs. CPU hotplug operations as
> long as you don't hold the hotplug lock.

Fair point, AFAIU none of the loads performed within num_online_cpus()
seem to rely on atomic or volatile accesses. So not using a volatile
access to load the cached value should not introduce any regression.

I'm concerned that some code may rely on re-fetching of the cached
value between iterations of a loop. The lack of READ_ONCE() would
let the compiler keep a lifted load within a register and never
re-fetch, unless there is a cpu_relax() or a barrier() within the
loop.

Thoughts ?

Thanks,

Mathieu


> 
> Thanks,
> 
> 	tglx

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* Re: [PATCH] cpu/hotplug: Cache number of online CPUs
  2019-07-04 22:00     ` Mathieu Desnoyers
@ 2019-07-04 22:33       ` Thomas Gleixner
  2019-07-04 23:34         ` Mathieu Desnoyers
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2019-07-04 22:33 UTC (permalink / raw)
  To: Mathieu Desnoyers; +Cc: linux-kernel, x86, Nadav Amit

On Thu, 4 Jul 2019, Mathieu Desnoyers wrote:
> ----- On Jul 4, 2019, at 5:10 PM, Thomas Gleixner tglx@linutronix.de wrote:
> >
> > num_online_cpus() is racy today vs. CPU hotplug operations as
> > long as you don't hold the hotplug lock.
> 
> Fair point, AFAIU none of the loads performed within num_online_cpus()
> seem to rely on atomic or volatile accesses. So not using a volatile
> access to load the cached value should not introduce any regression.
> 
> I'm concerned that some code may rely on re-fetching of the cached
> value between iterations of a loop. The lack of READ_ONCE() would
> let the compiler keep a lifted load within a register and never
> re-fetch, unless there is a cpu_relax() or a barrier() within the
> loop.

If someone really wants to write code which can handle concurrent CPU
hotplug operations and rely on that information, then it's probably better
to write out:

     ncpus = READ_ONCE(__num_online_cpus);

explicitly along with a big fat comment.

I can't figure out why one wants to do that and how it is supposed to work,
but my brain is in shutdown mode already :)

I'd rather write a proper kernel doc comment for num_online_cpus() which
explains what the constraints are instead of pretending that the READ_ONCE
in the inline has any meaning.

Thanks,

	tglx


* Re: [PATCH] cpu/hotplug: Cache number of online CPUs
  2019-07-04 22:33       ` Thomas Gleixner
@ 2019-07-04 23:34         ` Mathieu Desnoyers
  2019-07-05  8:49           ` Ingo Molnar
  0 siblings, 1 reply; 18+ messages in thread
From: Mathieu Desnoyers @ 2019-07-04 23:34 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: linux-kernel, x86, Nadav Amit, Paul E. McKenney

----- On Jul 4, 2019, at 6:33 PM, Thomas Gleixner tglx@linutronix.de wrote:

> On Thu, 4 Jul 2019, Mathieu Desnoyers wrote:
>> ----- On Jul 4, 2019, at 5:10 PM, Thomas Gleixner tglx@linutronix.de wrote:
>> >
>> > num_online_cpus() is racy today vs. CPU hotplug operations as
>> > long as you don't hold the hotplug lock.
>> 
>> Fair point, AFAIU none of the loads performed within num_online_cpus()
>> seem to rely on atomic or volatile accesses. So not using a volatile
>> access to load the cached value should not introduce any regression.
>> 
>> I'm concerned that some code may rely on re-fetching of the cached
>> value between iterations of a loop. The lack of READ_ONCE() would
>> let the compiler keep a lifted load within a register and never
>> re-fetch, unless there is a cpu_relax() or a barrier() within the
>> loop.
> 
> If someone really wants to write code which can handle concurrent CPU
> hotplug operations and rely on that information, then it's probably better
> to write out:
> 
>     ncpus = READ_ONCE(__num_online_cpus);
> 
> explicitly along with a big fat comment.
> 
> I can't figure out why one wants to do that and how it is supposed to work,
> but my brain is in shutdown mode already :)
> 
> I'd rather write a proper kernel doc comment for num_online_cpus() which
> explains what the constraints are instead of pretending that the READ_ONCE
> in the inline has any meaning.

The other aspect I am concerned about is freedom given to the compiler
to perform the store to __num_online_cpus non-atomically, or the load
non-atomically due to memory pressure. Is that something we should be
concerned about ?

I thought we had WRITE_ONCE and READ_ONCE to take care of that kind of
situation.

The semantic I am looking for here is C11's relaxed atomics.

Thanks,

Mathieu


> 
> Thanks,
> 
> 	tglx

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* Re: [PATCH] cpu/hotplug: Cache number of online CPUs
  2019-07-04 23:34         ` Mathieu Desnoyers
@ 2019-07-05  8:49           ` Ingo Molnar
  2019-07-05 15:38             ` Mathieu Desnoyers
  0 siblings, 1 reply; 18+ messages in thread
From: Ingo Molnar @ 2019-07-05  8:49 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Thomas Gleixner, linux-kernel, x86, Nadav Amit, Paul E. McKenney


* Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> ----- On Jul 4, 2019, at 6:33 PM, Thomas Gleixner tglx@linutronix.de wrote:
> 
> > On Thu, 4 Jul 2019, Mathieu Desnoyers wrote:
> >> ----- On Jul 4, 2019, at 5:10 PM, Thomas Gleixner tglx@linutronix.de wrote:
> >> >
> >> > num_online_cpus() is racy today vs. CPU hotplug operations as
> >> > long as you don't hold the hotplug lock.
> >> 
> >> Fair point, AFAIU none of the loads performed within num_online_cpus()
> >> seem to rely on atomic or volatile accesses. So not using a volatile
> >> access to load the cached value should not introduce any regression.
> >> 
> >> I'm concerned that some code may rely on re-fetching of the cached
> >> value between iterations of a loop. The lack of READ_ONCE() would
> >> let the compiler keep a lifted load within a register and never
> >> re-fetch, unless there is a cpu_relax() or a barrier() within the
> >> loop.
> > 
> > If someone really wants to write code which can handle concurrent CPU
> > hotplug operations and rely on that information, then it's probably better
> > to write out:
> > 
> >     ncpus = READ_ONCE(__num_online_cpus);
> > 
> > explicitly along with a big fat comment.
> > 
> > I can't figure out why one wants to do that and how it is supposed to work,
> > but my brain is in shutdown mode already :)
> > 
> > I'd rather write a proper kernel doc comment for num_online_cpus() which
> > explains what the constraints are instead of pretending that the READ_ONCE
> > in the inline has any meaning.
> 
> The other aspect I am concerned about is freedom given to the compiler 
> to perform the store to __num_online_cpus non-atomically, or the load 
> non-atomically due to memory pressure.

What connection does "memory pressure" have to what the compiler does? 

Did you confuse it with "register pressure"?

> Is that something we should be concerned about ?

Once I understand it :)

> I thought we had WRITE_ONCE and READ_ONCE to take care of that kind of 
> situation.

Store and load tearing is one of the minor properties of READ_ONCE() and 
WRITE_ONCE() - the main properties are the ordering guarantees.

Since __num_online_cpus is neither weirdly aligned nor is it written via 
constants I don't see how load/store tearing could occur. Can you outline 
such a scenario?

> The semantic I am looking for here is C11's relaxed atomics.

What does this mean?

Thanks,

	Ingo


* Re: [PATCH] cpu/hotplug: Cache number of online CPUs
  2019-07-05  8:49           ` Ingo Molnar
@ 2019-07-05 15:38             ` Mathieu Desnoyers
  2019-07-05 20:53               ` Thomas Gleixner
  0 siblings, 1 reply; 18+ messages in thread
From: Mathieu Desnoyers @ 2019-07-05 15:38 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Thomas Gleixner, linux-kernel, x86, Nadav Amit, paulmck,
	Peter Zijlstra, Will Deacon



----- On Jul 5, 2019, at 4:49 AM, Ingo Molnar mingo@kernel.org wrote:

> * Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> 
>> ----- On Jul 4, 2019, at 6:33 PM, Thomas Gleixner tglx@linutronix.de wrote:
>> 
>> > On Thu, 4 Jul 2019, Mathieu Desnoyers wrote:
>> >> ----- On Jul 4, 2019, at 5:10 PM, Thomas Gleixner tglx@linutronix.de wrote:
>> >> >
>> >> > num_online_cpus() is racy today vs. CPU hotplug operations as
>> >> > long as you don't hold the hotplug lock.
>> >> 
>> >> Fair point, AFAIU none of the loads performed within num_online_cpus()
>> >> seem to rely on atomic or volatile accesses. So not using a volatile
>> >> access to load the cached value should not introduce any regression.
>> >> 
>> >> I'm concerned that some code may rely on re-fetching of the cached
>> >> value between iterations of a loop. The lack of READ_ONCE() would
>> >> let the compiler keep a lifted load within a register and never
>> >> re-fetch, unless there is a cpu_relax() or a barrier() within the
>> >> loop.
>> > 
>> > If someone really wants to write code which can handle concurrent CPU
>> > hotplug operations and rely on that information, then it's probably better
>> > to write out:
>> > 
>> >     ncpus = READ_ONCE(__num_online_cpus);
>> > 
>> > explicitly along with a big fat comment.
>> > 
>> > I can't figure out why one wants to do that and how it is supposed to work,
>> > but my brain is in shutdown mode already :)
>> > 
>> > I'd rather write a proper kernel doc comment for num_online_cpus() which
>> > explains what the constraints are instead of pretending that the READ_ONCE
>> > in the inline has any meaning.
>> 
>> The other aspect I am concerned about is freedom given to the compiler
>> to perform the store to __num_online_cpus non-atomically, or the load
>> non-atomically due to memory pressure.
> 
> What connection does "memory pressure" have to what the compiler does?
> 
> Did you confuse it with "register pressure"?

My brain wires must have been crossed when I wrote that. Indeed, I mean
register pressure.

> 
>> Is that something we should be concerned about ?
> 
> Once I understand it :)
> 
>> I thought we had WRITE_ONCE and READ_ONCE to take care of that kind of
>> situation.
> 
> Store and load tearing is one of the minor properties of READ_ONCE() and
> WRITE_ONCE() - the main properties are the ordering guarantees.
> 
> Since __num_online_cpus is neither weirdly aligned nor is it written via
> constants I don't see how load/store tearing could occur. Can you outline
> such a scenario?

I'm referring to the examples provided in Documentation/core-api/atomic_ops.rst,
e.g.:

"For another example, consider the following code::

        tmp_a = a;
        do_something_with(tmp_a);
        do_something_else_with(tmp_a);

 If the compiler can prove that do_something_with() does not store to the
 variable a, then the compiler is within its rights to manufacture an
 additional load as follows::

        tmp_a = a;
        do_something_with(tmp_a);
        tmp_a = a;
        do_something_else_with(tmp_a);"

So if instead of "a" we have num_online_cpus(), the compiler would be
within its rights to re-fetch the number of online cpus between the two
calls, which seems unexpected.

> 
>> The semantic I am looking for here is C11's relaxed atomics.
> 
> What does this mean?

C11 states:

"Atomic operations specifying memory_order_relaxed are  relaxed  only  with  respect
to memory ordering.  Implementations must still guarantee that any given atomic access
to a particular atomic object be indivisible with respect to all other atomic accesses
to that object."

So I am concerned that num_online_cpus() as proposed in this patch
try to access __num_online_cpus non-atomically, and without using
READ_ONCE().

Similarly, the update-side should use WRITE_ONCE(). Protecting with a mutex
does not provide mutual exclusion against concurrent readers of that variable.

Again based on Documentation/core-api/atomic_ops.rst:

"For a final example, consider the following code, assuming that the
 variable a is set at boot time before the second CPU is brought online
 and never changed later, so that memory barriers are not needed::

        if (a)
                b = 9;
        else
                b = 42;

 The compiler is within its rights to manufacture an additional store
 by transforming the above code into the following::

        b = 42;
        if (a)
                b = 9;

 This could come as a fatal surprise to other code running concurrently
 that expected b to never have the value 42 if a was zero.  To prevent
 the compiler from doing this, write something like::

        if (a)
                WRITE_ONCE(b, 9);
        else
                WRITE_ONCE(b, 42);"

If the compiler decides to manufacture additional stores when updating
the __num_online_cpus cached value, then loads could observe this
intermediate value because they fetch it without holding any mutex.

Thanks,

Mathieu

> 
> Thanks,
> 
> 	Ingo

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* Re: [PATCH] cpu/hotplug: Cache number of online CPUs
  2019-07-05 15:38             ` Mathieu Desnoyers
@ 2019-07-05 20:53               ` Thomas Gleixner
  2019-07-05 21:00                 ` Thomas Gleixner
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2019-07-05 20:53 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, linux-kernel, x86, Nadav Amit, paulmck,
	Peter Zijlstra, Will Deacon

On Fri, 5 Jul 2019, Mathieu Desnoyers wrote:
> ----- On Jul 5, 2019, at 4:49 AM, Ingo Molnar mingo@kernel.org wrote:
> > * Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> >> The semantic I am looking for here is C11's relaxed atomics.
> > 
> > What does this mean?
> 
> C11 states:
> 
> "Atomic operations specifying memory_order_relaxed are  relaxed  only  with  respect
> to memory ordering.  Implementations must still guarantee that any given atomic access
> to a particular atomic object be indivisible with respect to all other atomic accesses
> to that object."
> 
> So I am concerned that num_online_cpus() as proposed in this patch
> try to access __num_online_cpus non-atomically, and without using
> READ_ONCE().
>
> 
> Similarly, the update-side should use WRITE_ONCE(). Protecting with a mutex
> does not provide mutual exclusion against concurrent readers of that variable.

Again. This is nothing new. The current implementation of num_online_cpus()
has no guarantees whatsoever. 

bitmap_hweight() can be hit by a concurrent update of the mask it is
looking at.

num_online_cpus() gives you only the correct number if you invoke it inside
a cpuhp_lock held section. So why do we need that fuss about atomicity now?

It's racy and was racy forever and even if we add that READ/WRITE_ONCE muck
then it still won't give you a reliable answer unless you hold cpuhp_lock at
least for read. So for me that READ/WRITE_ONCE is just a cosmetic and
misleading reality distortion.

Thanks,

	tglx






* Re: [PATCH] cpu/hotplug: Cache number of online CPUs
  2019-07-05 20:53               ` Thomas Gleixner
@ 2019-07-05 21:00                 ` Thomas Gleixner
  2019-07-06 23:24                   ` Mathieu Desnoyers
  2019-07-08 13:43                   ` [PATCH V2] " Thomas Gleixner
  0 siblings, 2 replies; 18+ messages in thread
From: Thomas Gleixner @ 2019-07-05 21:00 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, linux-kernel, x86, Nadav Amit, paulmck,
	Peter Zijlstra, Will Deacon

On Fri, 5 Jul 2019, Thomas Gleixner wrote:
> On Fri, 5 Jul 2019, Mathieu Desnoyers wrote:
> > ----- On Jul 5, 2019, at 4:49 AM, Ingo Molnar mingo@kernel.org wrote:
> > > * Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> > >> The semantic I am looking for here is C11's relaxed atomics.
> > > 
> > > What does this mean?
> > 
> > C11 states:
> > 
> > "Atomic operations specifying memory_order_relaxed are  relaxed  only  with  respect
> > to memory ordering.  Implementations must still guarantee that any given atomic access
> > to a particular atomic object be indivisible with respect to all other atomic accesses
> > to that object."
> > 
> > So I am concerned that num_online_cpus() as proposed in this patch
> > try to access __num_online_cpus non-atomically, and without using
> > READ_ONCE().
> >
> > 
> > Similarly, the update-side should use WRITE_ONCE(). Protecting with a mutex
> > does not provide mutual exclusion against concurrent readers of that variable.
> 
> Again. This is nothing new. The current implementation of num_online_cpus()
> has no guarantees whatsoever. 
> 
> bitmap_hweight() can be hit by a concurrent update of the mask it is
> looking at.
> 
> num_online_cpus() gives you only the correct number if you invoke it inside
> a cpuhp_lock held section. So why do we need that fuss about atomicity now?
> 
> It's racy and was racy forever and even if we add that READ/WRITE_ONCE muck
> then it still won't give you a reliable answer unless you hold cpuhp_lock at
> least for read. So for me that READ/WRITE_ONCE is just a cosmetic and
> misleading reality distortion.

That said. If it makes everyone happy and feel better, I'm happy to add it
along with a big fat comment which explains that it's just preventing a
theoretical store/load tearing issue and does not provide any guarantees
other than that.

Thanks,

	tglx


* Re: [PATCH] cpu/hotplug: Cache number of online CPUs
  2019-07-05 21:00                 ` Thomas Gleixner
@ 2019-07-06 23:24                   ` Mathieu Desnoyers
  2019-07-08 13:43                   ` [PATCH V2] " Thomas Gleixner
  1 sibling, 0 replies; 18+ messages in thread
From: Mathieu Desnoyers @ 2019-07-06 23:24 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Ingo Molnar, linux-kernel, x86, Nadav Amit, paulmck,
	Peter Zijlstra, Will Deacon

----- On Jul 5, 2019, at 5:00 PM, Thomas Gleixner tglx@linutronix.de wrote:

> On Fri, 5 Jul 2019, Thomas Gleixner wrote:
>> On Fri, 5 Jul 2019, Mathieu Desnoyers wrote:
>> > ----- On Jul 5, 2019, at 4:49 AM, Ingo Molnar mingo@kernel.org wrote:
>> > > * Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
>> > >> The semantic I am looking for here is C11's relaxed atomics.
>> > > 
>> > > What does this mean?
>> > 
>> > C11 states:
>> > 
>> > "Atomic operations specifying memory_order_relaxed are relaxed only with respect
>> > to memory ordering. Implementations must still guarantee that any given atomic
>> > access to a particular atomic object be indivisible with respect to all other
>> > atomic accesses to that object."
>> > 
>> > So I am concerned that num_online_cpus() as proposed in this patch
>> > try to access __num_online_cpus non-atomically, and without using
>> > READ_ONCE().
>> >
>> > 
>> > Similarly, the update-side should use WRITE_ONCE(). Protecting with a mutex
>> > does not provide mutual exclusion against concurrent readers of that variable.
>> 
>> Again. This is nothing new. The current implementation of num_online_cpus()
>> has no guarantees whatsoever.
>> 
>> bitmap_hweight() can be hit by a concurrent update of the mask it is
>> looking at.
>> 
>> num_online_cpus() gives you only the correct number if you invoke it inside
>> a cpuhp_lock held section. So why do we need that fuss about atomicity now?
>> 
>> It's racy and was racy forever and even if we add that READ/WRITE_ONCE muck
>> then it still won't give you a reliable answer unless you hold cpuhp_lock at
>> least for read. So for me that READ/WRITE_ONCE is just a cosmetic and
>> misleading reality distortion.
> 
> That said. If it makes everyone happy and feel better, I'm happy to add it
> along with a big fat comment which explains that it's just preventing a
> theoretical store/load tearing issue and does not provide any guarantees
> other than that.

One example where the lack of READ_ONCE() makes me uneasy is found
in drivers/net/wireless/intel/iwlwifi/pcie/trans.c: iwl_pcie_set_interrupt_capa():

static void iwl_pcie_set_interrupt_capa(struct pci_dev *pdev,
                                        struct iwl_trans *trans)
{
        struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
        int max_irqs, num_irqs, i, ret;
[...]
        max_irqs = min_t(u32, num_online_cpus() + 2, IWL_MAX_RX_HW_QUEUES);
        for (i = 0; i < max_irqs; i++)
                trans_pcie->msix_entries[i].entry = i;

msix_entries is an array of IWL_MAX_RX_HW_QUEUES entries. AFAIU, if the compiler
decides to re-fetch num_online_cpus() for the loop condition after hot-plugging
of a few cpus, we end up doing an out-of-bounds access to the array.

The scenario would be:

- load __num_online_cpus into a register,
- compare register + 2 to IWL_MAX_RX_HW_QUEUES
  (let's say the current number of cpus is lower that IWL_MAX_RX_HW_QUEUES - 2)
- a few other cpus are brought online,
- compiler decides to re-fetch __num_online_cpus for the loop bound, because
  its value is rightfully assumed to have never changed,
- corruption and sadness follows.

I'm not saying the current compiler implementations actually generate this
for this specific function, but I'm concerned about the fact that they are
within their rights to do so. It seems quite fragile to expose a kernel API
which can yield to this kind of subtle bug.

A quick kernel tree grep for both "num_online_cpus" and "min" on the same line
gives 64 results, so it's not an isolated case.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* [PATCH V2] cpu/hotplug: Cache number of online CPUs
  2019-07-05 21:00                 ` Thomas Gleixner
  2019-07-06 23:24                   ` Mathieu Desnoyers
@ 2019-07-08 13:43                   ` Thomas Gleixner
  2019-07-08 14:07                     ` Paul E. McKenney
  1 sibling, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2019-07-08 13:43 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, linux-kernel, x86, Nadav Amit, paulmck,
	Peter Zijlstra, Will Deacon

Re-evaluating the bitmap weight of the online cpus bitmap in every
invocation of num_online_cpus() over and over is a pretty useless
exercise. Especially when num_online_cpus() is used in code paths like the
IPI delivery of x86 or the membarrier code.

Cache the number of online CPUs in the core and just return the cached
variable.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V2: Use READ/WRITE_ONCE() and add comment what it actually achieves. Remove
    the bogus lockdep assert in the write path as the caller cannot hold the
    lock. It's a task on the plugged CPU which is not the controlling task.
---
 include/linux/cpumask.h |   26 +++++++++++++++++---------
 kernel/cpu.c            |   22 ++++++++++++++++++++++
 2 files changed, 39 insertions(+), 9 deletions(-)

--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -95,8 +95,23 @@ extern struct cpumask __cpu_active_mask;
 #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
 #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
 
+extern unsigned int __num_online_cpus;
+
 #if NR_CPUS > 1
-#define num_online_cpus()	cpumask_weight(cpu_online_mask)
+/**
+ * num_online_cpus() - Read the number of online CPUs
+ *
+ * READ_ONCE() protects against theoretical load tearing and prevents
+ * the compiler from reloading the value in a function or loop.
+ *
+ * Even with that, this interface gives only a momentary snapshot and is
+ * not protected against concurrent CPU hotplug operations unless invoked
+ * from a cpuhp_lock held region.
+ */
+static inline unsigned int num_online_cpus(void)
+{
+	return READ_ONCE(__num_online_cpus);
+}
 #define num_possible_cpus()	cpumask_weight(cpu_possible_mask)
 #define num_present_cpus()	cpumask_weight(cpu_present_mask)
 #define num_active_cpus()	cpumask_weight(cpu_active_mask)
@@ -805,14 +820,7 @@ set_cpu_present(unsigned int cpu, bool p
 		cpumask_clear_cpu(cpu, &__cpu_present_mask);
 }
 
-static inline void
-set_cpu_online(unsigned int cpu, bool online)
-{
-	if (online)
-		cpumask_set_cpu(cpu, &__cpu_online_mask);
-	else
-		cpumask_clear_cpu(cpu, &__cpu_online_mask);
-}
+void set_cpu_online(unsigned int cpu, bool online);
 
 static inline void
 set_cpu_active(unsigned int cpu, bool active)
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2288,6 +2288,9 @@ EXPORT_SYMBOL(__cpu_present_mask);
 struct cpumask __cpu_active_mask __read_mostly;
 EXPORT_SYMBOL(__cpu_active_mask);
 
+unsigned int __num_online_cpus __read_mostly;
+EXPORT_SYMBOL(__num_online_cpus);
+
 void init_cpu_present(const struct cpumask *src)
 {
 	cpumask_copy(&__cpu_present_mask, src);
@@ -2303,6 +2306,25 @@ void init_cpu_online(const struct cpumas
 	cpumask_copy(&__cpu_online_mask, src);
 }
 
+void set_cpu_online(unsigned int cpu, bool online)
+{
+	int adj = 0;
+
+	if (online) {
+		if (!cpumask_test_and_set_cpu(cpu, &__cpu_online_mask))
+			adj = 1;
+	} else {
+		if (cpumask_test_and_clear_cpu(cpu, &__cpu_online_mask))
+			adj = -1;
+	}
+	/*
+	 * WRITE_ONCE() protects only against the theoretical stupidity of
+	 * a compiler to tear the store, but won't protect readers which
+	 * are not serialized against concurrent hotplug operations.
+	 */
+	WRITE_ONCE(__num_online_cpus, __num_online_cpus + adj);
+}
+
 /*
  * Activate the first processor.
  */


* Re: [PATCH V2] cpu/hotplug: Cache number of online CPUs
  2019-07-08 13:43                   ` [PATCH V2] " Thomas Gleixner
@ 2019-07-08 14:07                     ` Paul E. McKenney
  2019-07-08 14:20                       ` Thomas Gleixner
  0 siblings, 1 reply; 18+ messages in thread
From: Paul E. McKenney @ 2019-07-08 14:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mathieu Desnoyers, Ingo Molnar, linux-kernel, x86, Nadav Amit,
	Peter Zijlstra, Will Deacon

On Mon, Jul 08, 2019 at 03:43:55PM +0200, Thomas Gleixner wrote:
> Revaluating the bitmap wheight of the online cpus bitmap in every

s/wheight/weight/?

> invocation of num_online_cpus() over and over is a pretty useless
> exercise. Especially when num_online_cpus() is used in code pathes like the
> IPI delivery of x86 or the membarrier code.
> 
> Cache the number of online CPUs in the core and just return the cached
> variable.

I do like this and the comments on limited guarantees make sense.
One suggestion for saving a few lines below, but either way:

Acked-by: Paul E. McKenney <paulmck@linux.ibm.com>

> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> V2: Use READ/WRITE_ONCE() and add comment what it actually achieves. Remove
>     the bogus lockdep assert in the write path as the caller cannot hold the
>     lock. It's a task on the plugged CPU which is not the controlling task.
> ---
>  include/linux/cpumask.h |   26 +++++++++++++++++---------
>  kernel/cpu.c            |   22 ++++++++++++++++++++++
>  2 files changed, 39 insertions(+), 9 deletions(-)
> 
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -95,8 +95,23 @@ extern struct cpumask __cpu_active_mask;
>  #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
>  #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
>  
> +extern unsigned int __num_online_cpus;
> +
>  #if NR_CPUS > 1
> -#define num_online_cpus()	cpumask_weight(cpu_online_mask)
> +/**
> + * num_online_cpus() - Read the number of online CPUs
> + *
> + * READ_ONCE() protects against theoretical load tearing and prevents
> + * the compiler from reloading the value in a function or loop.
> + *
> + * Even with that, this interface gives only a momentary snapshot and is
> + * not protected against concurrent CPU hotplug operations unless invoked
> + * from a cpuhp_lock held region.
> + */
> +static inline unsigned int num_online_cpus(void)
> +{
> +	return READ_ONCE(__num_online_cpus);
> +}
>  #define num_possible_cpus()	cpumask_weight(cpu_possible_mask)
>  #define num_present_cpus()	cpumask_weight(cpu_present_mask)
>  #define num_active_cpus()	cpumask_weight(cpu_active_mask)
> @@ -805,14 +820,7 @@ set_cpu_present(unsigned int cpu, bool p
>  		cpumask_clear_cpu(cpu, &__cpu_present_mask);
>  }
>  
> -static inline void
> -set_cpu_online(unsigned int cpu, bool online)
> -{
> -	if (online)
> -		cpumask_set_cpu(cpu, &__cpu_online_mask);
> -	else
> -		cpumask_clear_cpu(cpu, &__cpu_online_mask);
> -}
> +void set_cpu_online(unsigned int cpu, bool online);
>  
>  static inline void
>  set_cpu_active(unsigned int cpu, bool active)
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -2288,6 +2288,9 @@ EXPORT_SYMBOL(__cpu_present_mask);
>  struct cpumask __cpu_active_mask __read_mostly;
>  EXPORT_SYMBOL(__cpu_active_mask);
>  
> +unsigned int __num_online_cpus __read_mostly;
> +EXPORT_SYMBOL(__num_online_cpus);
> +
>  void init_cpu_present(const struct cpumask *src)
>  {
>  	cpumask_copy(&__cpu_present_mask, src);
> @@ -2303,6 +2306,25 @@ void init_cpu_online(const struct cpumas
>  	cpumask_copy(&__cpu_online_mask, src);
>  }
>  
> +void set_cpu_online(unsigned int cpu, bool online)
> +{
> +	int adj = 0;
> +
> +	if (online) {
> +		if (!cpumask_test_and_set_cpu(cpu, &__cpu_online_mask))
> +			adj = 1;
> +	} else {
> +		if (cpumask_test_and_clear_cpu(cpu, &__cpu_online_mask))
> +			adj = -1;
> +	}
> +	/*
> +	 * WRITE_ONCE() protects only against the theoretical stupidity of
> +	 * a compiler to tear the store, but won't protect readers which
> +	 * are not serialized against concurrent hotplug operations.
> +	 */
> +	WRITE_ONCE(__num_online_cpus, __num_online_cpus + adj);

	WRITE_ONCE(__num_online_cpus, cpumask_weight(__cpu_online_mask));

Then "adj" can be dispensed with, and the old non-value-returning atomic
updates can be used on __cpu_online_mask.  Or is someone now depending
on full ordering from set_cpu_online() or some such?

> +}
> +
>  /*
>   * Activate the first processor.
>   */
> 



* Re: [PATCH V2] cpu/hotplug: Cache number of online CPUs
  2019-07-08 14:07                     ` Paul E. McKenney
@ 2019-07-08 14:20                       ` Thomas Gleixner
  2019-07-09 14:23                         ` [PATCH V3] " Thomas Gleixner
  0 siblings, 1 reply; 18+ messages in thread
From: Thomas Gleixner @ 2019-07-08 14:20 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Mathieu Desnoyers, Ingo Molnar, linux-kernel, x86, Nadav Amit,
	Peter Zijlstra, Will Deacon

On Mon, 8 Jul 2019, Paul E. McKenney wrote:

> On Mon, Jul 08, 2019 at 03:43:55PM +0200, Thomas Gleixner wrote:
> > Revaluating the bitmap wheight of the online cpus bitmap in every
> 
> s/wheight/weight/?
> 
> > invocation of num_online_cpus() over and over is a pretty useless
> > exercise. Especially when num_online_cpus() is used in code pathes like the
> > IPI delivery of x86 or the membarrier code.
> > 
> > Cache the number of online CPUs in the core and just return the cached
> > variable.
> 
> I do like this and the comments on limited guarantees make sense.
> One suggestion for saving a few lines below, but either way:

Nah. I have to redo it. Mathieu just pointed out in IRC that there is the
reboot and emergency/panic/kexec crap which calls set_cpu_online()
completely non-serialized. Sigh....

Thanks,

	tglx




* [PATCH V3] cpu/hotplug: Cache number of online CPUs
  2019-07-08 14:20                       ` Thomas Gleixner
@ 2019-07-09 14:23                         ` Thomas Gleixner
  2019-07-09 15:52                           ` Mathieu Desnoyers
                                             ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Thomas Gleixner @ 2019-07-09 14:23 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Mathieu Desnoyers, Ingo Molnar, linux-kernel, x86, Nadav Amit,
	Peter Zijlstra, Will Deacon

Re-evaluating the bitmap weight of the online cpus bitmap in every
invocation of num_online_cpus() over and over is a pretty useless
exercise. Especially when num_online_cpus() is used in code paths
like the IPI delivery of x86 or the membarrier code.

Cache the number of online CPUs in the core and just return the cached
variable. The accessor function provides only a snapshot when used without
protection against concurrent CPU hotplug.

The storage needs to use an atomic_t because the kexec and reboot code
(ab)use set_cpu_online() in their 'shutdown' handlers without any form of
serialization as pointed out by Mathieu. Regular CPU hotplug usage is
properly serialized.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
V3: Make __num_online_cpus an atomic_t because the kexec and reboot code
    (ab)use set_cpu_online() in their 'shutdown' handlers.

V2: Use READ/WRITE_ONCE() and add comment what it actually achieves. Remove
    the bogus lockdep assert in the write path as the caller cannot hold the
    lock. It's a task on the plugged CPU which is not the controlling task.
---
 include/linux/cpumask.h |   25 ++++++++++++++++---------
 kernel/cpu.c            |   24 ++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 9 deletions(-)

--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -10,6 +10,7 @@
 #include <linux/kernel.h>
 #include <linux/threads.h>
 #include <linux/bitmap.h>
+#include <linux/atomic.h>
 #include <linux/bug.h>
 
 /* Don't assign or return these: may not be this big! */
@@ -95,8 +96,21 @@ extern struct cpumask __cpu_active_mask;
 #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
 #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
 
+extern atomic_t __num_online_cpus;
+
 #if NR_CPUS > 1
-#define num_online_cpus()	cpumask_weight(cpu_online_mask)
+/**
+ * num_online_cpus() - Read the number of online CPUs
+ *
+ * Despite the fact that __num_online_cpus is of type atomic_t, this
+ * interface gives only a momentary snapshot and is not protected against
+ * concurrent CPU hotplug operations unless invoked from a cpuhp_lock held
+ * region.
+ */
+static inline unsigned int num_online_cpus(void)
+{
+	return atomic_read(&__num_online_cpus);
+}
 #define num_possible_cpus()	cpumask_weight(cpu_possible_mask)
 #define num_present_cpus()	cpumask_weight(cpu_present_mask)
 #define num_active_cpus()	cpumask_weight(cpu_active_mask)
@@ -805,14 +819,7 @@ set_cpu_present(unsigned int cpu, bool p
 		cpumask_clear_cpu(cpu, &__cpu_present_mask);
 }
 
-static inline void
-set_cpu_online(unsigned int cpu, bool online)
-{
-	if (online)
-		cpumask_set_cpu(cpu, &__cpu_online_mask);
-	else
-		cpumask_clear_cpu(cpu, &__cpu_online_mask);
-}
+void set_cpu_online(unsigned int cpu, bool online);
 
 static inline void
 set_cpu_active(unsigned int cpu, bool active)
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2295,6 +2295,9 @@ EXPORT_SYMBOL(__cpu_present_mask);
 struct cpumask __cpu_active_mask __read_mostly;
 EXPORT_SYMBOL(__cpu_active_mask);
 
+atomic_t __num_online_cpus __read_mostly;
+EXPORT_SYMBOL(__num_online_cpus);
+
 void init_cpu_present(const struct cpumask *src)
 {
 	cpumask_copy(&__cpu_present_mask, src);
@@ -2310,6 +2313,27 @@ void init_cpu_online(const struct cpumas
 	cpumask_copy(&__cpu_online_mask, src);
 }
 
+void set_cpu_online(unsigned int cpu, bool online)
+{
+	/*
+	 * atomic_inc/dec() is required to handle the horrid abuse of this
+	 * function by the reboot and kexec code which invoke it from
+	 * IPI/NMI broadcasts when shutting down CPUs. Invocation from
+	 * regular CPU hotplug is properly serialized.
+	 *
+	 * Note, that the fact that __num_online_cpus is of type atomic_t
+	 * does not protect readers which are not serialized against
+	 * concurrent hotplug operations.
+	 */
+	if (online) {
+		if (!cpumask_test_and_set_cpu(cpu, &__cpu_online_mask))
+			atomic_inc(&__num_online_cpus);
+	} else {
+		if (cpumask_test_and_clear_cpu(cpu, &__cpu_online_mask))
+			atomic_dec(&__num_online_cpus);
+	}
+}
+
 /*
  * Activate the first processor.
  */


* Re: [PATCH V3] cpu/hotplug: Cache number of online CPUs
  2019-07-09 14:23                         ` [PATCH V3] " Thomas Gleixner
@ 2019-07-09 15:52                           ` Mathieu Desnoyers
  2019-07-22  7:58                           ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
  2019-07-25 14:11                           ` tip-bot for Thomas Gleixner
  2 siblings, 0 replies; 18+ messages in thread
From: Mathieu Desnoyers @ 2019-07-09 15:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: paulmck, Ingo Molnar, linux-kernel, x86, Nadav Amit,
	Peter Zijlstra, Will Deacon

----- On Jul 9, 2019, at 10:23 AM, Thomas Gleixner tglx@linutronix.de wrote:
[...]
> +void set_cpu_online(unsigned int cpu, bool online)
> +{
> +	/*
> +	 * atomic_inc/dec() is required to handle the horrid abuse of this
> +	 * function by the reboot and kexec code which invokes it from

invokes -> invoke

> +	 * IPI/NMI broadcasts when shutting down CPUs. Inocation from

Inocation -> Invocation

The rest looks good!

Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

> +	 * regular CPU hotplug is properly serialized.
> +	 *
> +	 * Note, that the fact that __num_online_cpus is of type atomic_t
> +	 * does not protect readers which are not serialized against
> +	 * concurrent hotplug operations.
> +	 */
> +	if (online) {
> +		if (!cpumask_test_and_set_cpu(cpu, &__cpu_online_mask))
> +			atomic_inc(&__num_online_cpus);
> +	} else {
> +		if (cpumask_test_and_clear_cpu(cpu, &__cpu_online_mask))
> +			atomic_dec(&__num_online_cpus);
> +	}
> +}
> +
> /*
>  * Activate the first processor.
>   */

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* [tip:smp/hotplug] cpu/hotplug: Cache number of online CPUs
  2019-07-09 14:23                         ` [PATCH V3] " Thomas Gleixner
  2019-07-09 15:52                           ` Mathieu Desnoyers
@ 2019-07-22  7:58                           ` tip-bot for Thomas Gleixner
  2019-07-25 14:11                           ` tip-bot for Thomas Gleixner
  2 siblings, 0 replies; 18+ messages in thread
From: tip-bot for Thomas Gleixner @ 2019-07-22  7:58 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: mathieu.desnoyers, linux-kernel, mingo, tglx, hpa

Commit-ID:  3f70915f5be935a145d11b5f46a26627066b6261
Gitweb:     https://git.kernel.org/tip/3f70915f5be935a145d11b5f46a26627066b6261
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 9 Jul 2019 16:23:40 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 22 Jul 2019 09:52:21 +0200

cpu/hotplug: Cache number of online CPUs

Re-evaluating the bitmap weight of the online cpus bitmap in every
invocation of num_online_cpus() over and over is a pretty useless
exercise. Especially when num_online_cpus() is used in code paths
like the IPI delivery of x86 or the membarrier code.

Cache the number of online CPUs in the core and just return the cached
variable. The accessor function provides only a snapshot when used without
protection against concurrent CPU hotplug.

The storage needs to use an atomic_t because the kexec and reboot code
(ab)use set_cpu_online() in their 'shutdown' handlers without any form of
serialization as pointed out by Mathieu. Regular CPU hotplug usage is
properly serialized.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907091622590.1634@nanos.tec.linutronix.de

---
 include/linux/cpumask.h | 25 ++++++++++++++++---------
 kernel/cpu.c            | 24 ++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 21755471b1c3..fdd627dbfa3a 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -10,6 +10,7 @@
 #include <linux/kernel.h>
 #include <linux/threads.h>
 #include <linux/bitmap.h>
+#include <linux/atomic.h>
 #include <linux/bug.h>
 
 /* Don't assign or return these: may not be this big! */
@@ -95,8 +96,21 @@ extern struct cpumask __cpu_active_mask;
 #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
 #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
 
+extern atomic_t __num_online_cpus;
+
 #if NR_CPUS > 1
-#define num_online_cpus()	cpumask_weight(cpu_online_mask)
+/**
+ * num_online_cpus() - Read the number of online CPUs
+ *
+ * Despite the fact that __num_online_cpus is of type atomic_t, this
+ * interface gives only a momentary snapshot and is not protected against
+ * concurrent CPU hotplug operations unless invoked from a cpuhp_lock held
+ * region.
+ */
+static inline unsigned int num_online_cpus(void)
+{
+	return atomic_read(&__num_online_cpus);
+}
 #define num_possible_cpus()	cpumask_weight(cpu_possible_mask)
 #define num_present_cpus()	cpumask_weight(cpu_present_mask)
 #define num_active_cpus()	cpumask_weight(cpu_active_mask)
@@ -805,14 +819,7 @@ set_cpu_present(unsigned int cpu, bool present)
 		cpumask_clear_cpu(cpu, &__cpu_present_mask);
 }
 
-static inline void
-set_cpu_online(unsigned int cpu, bool online)
-{
-	if (online)
-		cpumask_set_cpu(cpu, &__cpu_online_mask);
-	else
-		cpumask_clear_cpu(cpu, &__cpu_online_mask);
-}
+void set_cpu_online(unsigned int cpu, bool online);
 
 static inline void
 set_cpu_active(unsigned int cpu, bool active)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index e84c0873559e..3bf9881a3be5 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2295,6 +2295,9 @@ EXPORT_SYMBOL(__cpu_present_mask);
 struct cpumask __cpu_active_mask __read_mostly;
 EXPORT_SYMBOL(__cpu_active_mask);
 
+atomic_t __num_online_cpus __read_mostly;
+EXPORT_SYMBOL(__num_online_cpus);
+
 void init_cpu_present(const struct cpumask *src)
 {
 	cpumask_copy(&__cpu_present_mask, src);
@@ -2310,6 +2313,27 @@ void init_cpu_online(const struct cpumask *src)
 	cpumask_copy(&__cpu_online_mask, src);
 }
 
+void set_cpu_online(unsigned int cpu, bool online)
+{
+	/*
+	 * atomic_inc/dec() is required to handle the horrid abuse of this
+	 * function by the reboot and kexec code which invoke it from
+	 * IPI/NMI broadcasts when shutting down CPUs. Invocation from
+	 * regular CPU hotplug is properly serialized.
+	 *
+	 * Note, that the fact that __num_online_cpus is of type atomic_t
+	 * does not protect readers which are not serialized against
+	 * concurrent hotplug operations.
+	 */
+	if (online) {
+		if (!cpumask_test_and_set_cpu(cpu, &__cpu_online_mask))
+			atomic_inc(&__num_online_cpus);
+	} else {
+		if (cpumask_test_and_clear_cpu(cpu, &__cpu_online_mask))
+			atomic_dec(&__num_online_cpus);
+	}
+}
+
 /*
  * Activate the first processor.
  */


* [tip:smp/hotplug] cpu/hotplug: Cache number of online CPUs
  2019-07-09 14:23                         ` [PATCH V3] " Thomas Gleixner
  2019-07-09 15:52                           ` Mathieu Desnoyers
  2019-07-22  7:58                           ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
@ 2019-07-25 14:11                           ` tip-bot for Thomas Gleixner
  2 siblings, 0 replies; 18+ messages in thread
From: tip-bot for Thomas Gleixner @ 2019-07-25 14:11 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, mingo, tglx, hpa, mathieu.desnoyers

Commit-ID:  0c09ab96fc820109d63097a2adcbbd20836b655f
Gitweb:     https://git.kernel.org/tip/0c09ab96fc820109d63097a2adcbbd20836b655f
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Tue, 9 Jul 2019 16:23:40 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Thu, 25 Jul 2019 15:48:01 +0200

cpu/hotplug: Cache number of online CPUs

Re-evaluating the bitmap weight of the online cpus bitmap in every
invocation of num_online_cpus() over and over is a pretty useless
exercise. Especially when num_online_cpus() is used in code paths
like the IPI delivery of x86 or the membarrier code.

Cache the number of online CPUs in the core and just return the cached
variable. The accessor function provides only a snapshot when used without
protection against concurrent CPU hotplug.

The storage needs to use an atomic_t because the kexec and reboot code
(ab)use set_cpu_online() in their 'shutdown' handlers without any form of
serialization as pointed out by Mathieu. Regular CPU hotplug usage is
properly serialized.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1907091622590.1634@nanos.tec.linutronix.de

---
 include/linux/cpumask.h | 25 ++++++++++++++++---------
 kernel/cpu.c            | 24 ++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 9 deletions(-)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 0c7db5efe66c..b5a5a1ed9efd 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -10,6 +10,7 @@
 #include <linux/kernel.h>
 #include <linux/threads.h>
 #include <linux/bitmap.h>
+#include <linux/atomic.h>
 #include <linux/bug.h>
 
 /* Don't assign or return these: may not be this big! */
@@ -95,8 +96,21 @@ extern struct cpumask __cpu_active_mask;
 #define cpu_present_mask  ((const struct cpumask *)&__cpu_present_mask)
 #define cpu_active_mask   ((const struct cpumask *)&__cpu_active_mask)
 
+extern atomic_t __num_online_cpus;
+
 #if NR_CPUS > 1
-#define num_online_cpus()	cpumask_weight(cpu_online_mask)
+/**
+ * num_online_cpus() - Read the number of online CPUs
+ *
+ * Despite the fact that __num_online_cpus is of type atomic_t, this
+ * interface gives only a momentary snapshot and is not protected against
+ * concurrent CPU hotplug operations unless invoked from a cpuhp_lock held
+ * region.
+ */
+static inline unsigned int num_online_cpus(void)
+{
+	return atomic_read(&__num_online_cpus);
+}
 #define num_possible_cpus()	cpumask_weight(cpu_possible_mask)
 #define num_present_cpus()	cpumask_weight(cpu_present_mask)
 #define num_active_cpus()	cpumask_weight(cpu_active_mask)
@@ -821,14 +835,7 @@ set_cpu_present(unsigned int cpu, bool present)
 		cpumask_clear_cpu(cpu, &__cpu_present_mask);
 }
 
-static inline void
-set_cpu_online(unsigned int cpu, bool online)
-{
-	if (online)
-		cpumask_set_cpu(cpu, &__cpu_online_mask);
-	else
-		cpumask_clear_cpu(cpu, &__cpu_online_mask);
-}
+void set_cpu_online(unsigned int cpu, bool online);
 
 static inline void
 set_cpu_active(unsigned int cpu, bool active)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 05778e32674a..e1967e9eddc2 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2298,6 +2298,9 @@ EXPORT_SYMBOL(__cpu_present_mask);
 struct cpumask __cpu_active_mask __read_mostly;
 EXPORT_SYMBOL(__cpu_active_mask);
 
+atomic_t __num_online_cpus __read_mostly;
+EXPORT_SYMBOL(__num_online_cpus);
+
 void init_cpu_present(const struct cpumask *src)
 {
 	cpumask_copy(&__cpu_present_mask, src);
@@ -2313,6 +2316,27 @@ void init_cpu_online(const struct cpumask *src)
 	cpumask_copy(&__cpu_online_mask, src);
 }
 
+void set_cpu_online(unsigned int cpu, bool online)
+{
+	/*
+	 * atomic_inc/dec() is required to handle the horrid abuse of this
+	 * function by the reboot and kexec code which invoke it from
+	 * IPI/NMI broadcasts when shutting down CPUs. Invocation from
+	 * regular CPU hotplug is properly serialized.
+	 *
+	 * Note, that the fact that __num_online_cpus is of type atomic_t
+	 * does not protect readers which are not serialized against
+	 * concurrent hotplug operations.
+	 */
+	if (online) {
+		if (!cpumask_test_and_set_cpu(cpu, &__cpu_online_mask))
+			atomic_inc(&__num_online_cpus);
+	} else {
+		if (cpumask_test_and_clear_cpu(cpu, &__cpu_online_mask))
+			atomic_dec(&__num_online_cpus);
+	}
+}
+
 /*
  * Activate the first processor.
  */


end of thread, other threads:[~2019-07-25 14:11 UTC | newest]

Thread overview: 18+ messages
2019-07-04 20:42 [PATCH] cpu/hotplug: Cache number of online CPUs Thomas Gleixner
2019-07-04 20:59 ` Mathieu Desnoyers
2019-07-04 21:10   ` Thomas Gleixner
2019-07-04 22:00     ` Mathieu Desnoyers
2019-07-04 22:33       ` Thomas Gleixner
2019-07-04 23:34         ` Mathieu Desnoyers
2019-07-05  8:49           ` Ingo Molnar
2019-07-05 15:38             ` Mathieu Desnoyers
2019-07-05 20:53               ` Thomas Gleixner
2019-07-05 21:00                 ` Thomas Gleixner
2019-07-06 23:24                   ` Mathieu Desnoyers
2019-07-08 13:43                   ` [PATCH V2] " Thomas Gleixner
2019-07-08 14:07                     ` Paul E. McKenney
2019-07-08 14:20                       ` Thomas Gleixner
2019-07-09 14:23                         ` [PATCH V3] " Thomas Gleixner
2019-07-09 15:52                           ` Mathieu Desnoyers
2019-07-22  7:58                           ` [tip:smp/hotplug] " tip-bot for Thomas Gleixner
2019-07-25 14:11                           ` tip-bot for Thomas Gleixner
