* [PATCH v5 0/3] CPU hotplug: Fix the long-standing "IPI to offline CPU" issue
@ 2014-05-15 19:12 Srivatsa S. Bhat
  2014-05-15 19:13 ` [PATCH v5 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU Srivatsa S. Bhat
                   ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-15 19:12 UTC (permalink / raw)
  To: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch
  Cc: mgorman, riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel, srivatsa.bhat


Hi,

There is a long-standing problem related to CPU hotplug which causes IPIs to
be delivered to offline CPUs, and the smp-call-function IPI handler code
prints out a warning whenever this is detected. Every once in a while this
(usually harmless) warning gets reported on LKML, but so far it has not been
completely fixed. Usually the solution involves finding out the IPI sender
and fixing it by adding appropriate synchronization with CPU hotplug.

However, while going through one such internal bug report, I found that
there is a significant bug in the receiver side itself (more specifically,
in stop-machine) that can lead to this problem even when the sender code
is perfectly fine. This patchset fixes that synchronization problem in the
CPU hotplug stop-machine code.

Patch 1 adds some additional debug code to the smp-call-function framework,
to help debug such issues easily.

Patch 2 modifies the stop-machine code to ensure that any IPIs that were sent
while the target CPU was online, would be noticed and handled by that CPU
without fail before it goes offline. Thus, this avoids scenarios where IPIs
are received on offline CPUs (as long as the sender uses proper hotplug
synchronization).

Patch 3 adds a mechanism to flush any pending smp-call-function callbacks
queued on the CPU going offline (including those callbacks for which the
IPIs from the source CPUs might not have arrived in time at the outgoing CPU).
This ensures that a CPU never goes offline with work still pending.


In fact, I debugged the problem by using Patch 1, and found that the
payload of the IPI was always the block layer's trigger_softirq() function.
But I was not able to find anything wrong with the block layer code. That's
when I started looking at the stop-machine code and realized that there is
a race-window which makes the IPI _receiver_ the culprit, not the sender.
Patch 2 fixes that race, and hence should put an end to most of these
hard-to-debug IPI-to-offline-CPU issues.


Changes in v5:
Added Patch 3 to flush out any pending smp-call-function callbacks on the
outgoing CPU, as suggested by Frederic Weisbecker.

Changes in v4:
Rewrote a comment in Patch 2 and reorganized the code for better readability.

Changes in v3:
Rewrote patch 2 and split the MULTI_STOP_DISABLE_IRQ state into two:
MULTI_STOP_DISABLE_IRQ_INACTIVE and MULTI_STOP_DISABLE_IRQ_ACTIVE, and
used this framework to ensure that the CPU going offline always disables
its interrupts last. Suggested by Tejun Heo.

v1 and v2:
https://lkml.org/lkml/2014/5/6/474


 Srivatsa S. Bhat (3):
      smp: Print more useful debug info upon receiving IPI on an offline CPU
      CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU"
      CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline


 include/linux/smp.h   |    2 ++
 kernel/smp.c          |   48 +++++++++++++++++++++++++++++++++++++++++++++--
 kernel/stop_machine.c |   50 ++++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 93 insertions(+), 7 deletions(-)


Regards,
Srivatsa S. Bhat
IBM Linux Technology Center


^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v5 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU
  2014-05-15 19:12 [PATCH v5 0/3] CPU hotplug: Fix the long-standing "IPI to offline CPU" issue Srivatsa S. Bhat
@ 2014-05-15 19:13 ` Srivatsa S. Bhat
  2014-05-15 19:19   ` Joe Perches
  2014-05-15 19:13 ` [PATCH v5 2/3] CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU" Srivatsa S. Bhat
  2014-05-15 19:14 ` [PATCH v5 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline Srivatsa S. Bhat
  2 siblings, 1 reply; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-15 19:13 UTC (permalink / raw)
  To: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch
  Cc: mgorman, riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel, srivatsa.bhat

Today the smp-call-function code just prints a warning if we get an IPI on
an offline CPU. This info is sufficient to let us know that something went
wrong, but often it is very hard to debug exactly who sent the IPI and why,
from this info alone.

In most cases, we get the warning about an IPI to an offline CPU immediately
after the CPU going offline comes out of the stop-machine phase and re-enables
interrupts. Since all online CPUs participate in stop-machine, the information
regarding the sender of the IPI is already lost by the time we exit the
stop-machine loop. So even if we dump the stack on each CPU at this point,
we won't find anything useful, since all of them will show the stack-trace of
the stopper thread. Hence we need a better way to figure out who sent the IPI
and why.

To achieve this, when we detect an IPI targeted at an offline CPU, loop through
the call-single-data linked list and print out the payload (i.e., the name
of the function that was supposed to be executed by the target CPU). This
gives us insight into who might have sent the IPI and helps us debug it
further.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 kernel/smp.c |   18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 06d574e..ae45446 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -185,14 +185,26 @@ void generic_smp_call_function_single_interrupt(void)
 {
 	struct llist_node *entry;
 	struct call_single_data *csd, *csd_next;
+	static bool warned;
+
+	entry = llist_del_all(&__get_cpu_var(call_single_queue));
+	entry = llist_reverse_order(entry);
 
 	/*
 	 * Shouldn't receive this interrupt on a cpu that is not yet online.
 	 */
-	WARN_ON_ONCE(!cpu_online(smp_processor_id()));
+	if (unlikely(!cpu_online(smp_processor_id()) && !warned)) {
+		warned = true;
+		WARN_ONCE(1, "IPI on offline CPU %d\n", smp_processor_id());
 
-	entry = llist_del_all(&__get_cpu_var(call_single_queue));
-	entry = llist_reverse_order(entry);
+		/*
+		 * We don't have to use the _safe() variant here
+		 * because we are not invoking the IPI handlers yet.
+		 */
+		llist_for_each_entry(csd, entry, llist)
+			pr_warn("IPI callback %pS sent to offline CPU\n",
+				csd->func);
+	}
 
 	llist_for_each_entry_safe(csd, csd_next, entry, llist) {
 		csd->func(csd->info);



* [PATCH v5 2/3] CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU"
  2014-05-15 19:12 [PATCH v5 0/3] CPU hotplug: Fix the long-standing "IPI to offline CPU" issue Srivatsa S. Bhat
  2014-05-15 19:13 ` [PATCH v5 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU Srivatsa S. Bhat
@ 2014-05-15 19:13 ` Srivatsa S. Bhat
  2014-05-15 19:17   ` Tejun Heo
  2014-05-15 19:14 ` [PATCH v5 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline Srivatsa S. Bhat
  2 siblings, 1 reply; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-15 19:13 UTC (permalink / raw)
  To: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch
  Cc: mgorman, riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel, srivatsa.bhat

During CPU offline, stop-machine is used to take control over all the online
CPUs (via the per-cpu stopper thread) and then run take_cpu_down() on the CPU
that is to be taken offline.

But stop-machine itself has several stages: _PREPARE, _DISABLE_IRQ, _RUN etc.
The important thing to note here is that the _DISABLE_IRQ stage comes much
later after starting stop-machine, and hence there is a large window where
other CPUs can send IPIs to the CPU going offline. As a result, we can
encounter a scenario as depicted below, which causes IPIs to be sent to the
CPU going offline, and that CPU notices them *after* it has gone offline,
triggering the "IPI-to-offline-CPU" warning from the smp-call-function code.


              CPU 1                                         CPU 2
          (Online CPU)                               (CPU going offline)

       Enter _PREPARE stage                          Enter _PREPARE stage

                                                     Enter _DISABLE_IRQ stage


                                                   =
       Got a device interrupt,                     | Didn't notice the IPI
       and the interrupt handler                   | since interrupts were
       called smp_call_function()                  | disabled on this CPU.
       and sent an IPI to CPU 2.                   |
                                                   =


       Enter _DISABLE_IRQ stage


       Enter _RUN stage                              Enter _RUN stage

                                  =
       Busy loop with interrupts  |                  Invoke take_cpu_down()
       disabled.                  |                  and take CPU 2 offline
                                  =


       Enter _EXIT stage                             Enter _EXIT stage

       Re-enable interrupts                          Re-enable interrupts

                                                     The pending IPI is noted
                                                     immediately, but alas,
                                                     the CPU is offline at
                                                     this point.



So, as we can observe from this scenario, the IPI was sent when CPU 2 was
still online, and hence it was perfectly legal. But unfortunately it was
noticed only after CPU 2 went offline, resulting in the warning from the
IPI handling code. In other words, the fault was not on the sender's side,
but on the receiver's - and if we look closely, the real bug is in the
stop-machine sequence itself.

The problem here is that the CPU going offline disabled its local interrupts
(by entering _DISABLE_IRQ phase) *before* the other CPUs. And that's the
reason why it was not able to respond to the IPI before going offline.

A simple solution to this problem is to ensure that the CPU going offline
disables its interrupts only *after* the other CPUs do the same thing.
To achieve this, split the _DISABLE_IRQ state into 2 parts:

1st part: MULTI_STOP_DISABLE_IRQ_INACTIVE, where only the non-active CPUs
(i.e., the "other" CPUs) disable their interrupts.

2nd part: MULTI_STOP_DISABLE_IRQ_ACTIVE, where the active CPU (i.e., the
CPU going offline) disables its interrupts.

With this in place, the CPU going offline will always be the last one to
disable interrupts. After this step, no further IPIs can be sent to the
outgoing CPU, since all the other CPUs would be executing the stop-machine
code with interrupts disabled. And by the time stop-machine ends, the CPU
would have gone offline and disappeared from the cpu_online_mask, and hence
future invocations of smp_call_function() and friends will automatically
prune that CPU out. Thus, we can guarantee that no CPU will end up
*inadvertently* sending IPIs to an offline CPU.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 kernel/stop_machine.c |   39 ++++++++++++++++++++++++++++++++++-----
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 01fbae5..288f7fe 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -130,8 +130,10 @@ enum multi_stop_state {
 	MULTI_STOP_NONE,
 	/* Awaiting everyone to be scheduled. */
 	MULTI_STOP_PREPARE,
-	/* Disable interrupts. */
-	MULTI_STOP_DISABLE_IRQ,
+	/* Disable interrupts on CPUs not in ->active_cpus mask. */
+	MULTI_STOP_DISABLE_IRQ_INACTIVE,
+	/* Disable interrupts on CPUs in ->active_cpus mask. */
+	MULTI_STOP_DISABLE_IRQ_ACTIVE,
 	/* Run the function */
 	MULTI_STOP_RUN,
 	/* Exit */
@@ -189,12 +191,39 @@ static int multi_cpu_stop(void *data)
 	do {
 		/* Chill out and ensure we re-read multi_stop_state. */
 		cpu_relax();
+
+		/*
+		 * We use 2 separate stages to disable interrupts, namely
+		 * _INACTIVE and _ACTIVE, to ensure that the inactive CPUs
+		 * disable their interrupts first, followed by the active CPUs.
+		 *
+		 * This is done to avoid a race in the CPU offline path, which
+		 * can lead to receiving IPIs on the outgoing CPU *after* it
+		 * has gone offline.
+		 *
+		 * During CPU offline, we don't want the other CPUs to send
+		 * IPIs to the active_cpu (the outgoing CPU) *after* it has
+		 * disabled interrupts (because, then it will notice the IPIs
+		 * only after it has gone offline). We can prevent this by
+		 * making the other CPUs disable their interrupts first - that
+		 * way, they will run the stop-machine code with interrupts
+		 * disabled, and hence won't send IPIs after that point.
+		 */
+
 		if (msdata->state != curstate) {
 			curstate = msdata->state;
 			switch (curstate) {
-			case MULTI_STOP_DISABLE_IRQ:
-				local_irq_disable();
-				hard_irq_disable();
+			case MULTI_STOP_DISABLE_IRQ_INACTIVE:
+				if (!is_active) {
+					local_irq_disable();
+					hard_irq_disable();
+				}
+				break;
+			case MULTI_STOP_DISABLE_IRQ_ACTIVE:
+				if (is_active) {
+					local_irq_disable();
+					hard_irq_disable();
+				}
 				break;
 			case MULTI_STOP_RUN:
 				if (is_active)



* [PATCH v5 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-15 19:12 [PATCH v5 0/3] CPU hotplug: Fix the long-standing "IPI to offline CPU" issue Srivatsa S. Bhat
  2014-05-15 19:13 ` [PATCH v5 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU Srivatsa S. Bhat
  2014-05-15 19:13 ` [PATCH v5 2/3] CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU" Srivatsa S. Bhat
@ 2014-05-15 19:14 ` Srivatsa S. Bhat
  2014-05-15 19:19   ` Tejun Heo
  2 siblings, 1 reply; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-15 19:14 UTC (permalink / raw)
  To: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch
  Cc: mgorman, riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel, srivatsa.bhat

During CPU offline, in the stop-machine loop, we use 2 separate stages to
disable interrupts, to ensure that the CPU going offline doesn't get any new
IPIs from the other CPUs after it has gone offline.

However, an IPI sent much earlier might arrive late on the target CPU
(possibly _after_ the CPU has gone offline) due to hardware latencies,
and due to this, the smp-call-function callbacks queued on the outgoing
CPU might not get noticed (and hence not executed) at all.

This is somewhat theoretical, but in any case, it makes sense to explicitly
loop through the call_single_queue and flush any pending callbacks before the
CPU goes completely offline. So, flush the queued smp-call-function callbacks
in the MULTI_STOP_DISABLE_IRQ_ACTIVE stage, after disabling interrupts on the
active CPU. That way, we would have handled all the queued callbacks before
going offline, and also, no new IPIs can be sent by the other CPUs to the
outgoing CPU at that point, because they will all be executing the stop-machine
code with interrupts disabled.

Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 include/linux/smp.h   |    2 ++
 kernel/smp.c          |   32 ++++++++++++++++++++++++++++++++
 kernel/stop_machine.c |   11 +++++++++++
 3 files changed, 45 insertions(+)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 633f5ed..7924191 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -52,6 +52,8 @@ void on_each_cpu_cond(bool (*cond_func)(int cpu, void *info),
 
 int smp_call_function_single_async(int cpu, struct call_single_data *csd);
 
+void flush_smp_call_function_queue(void);
+
 #ifdef CONFIG_SMP
 
 #include <linux/preempt.h>
diff --git a/kernel/smp.c b/kernel/smp.c
index ae45446..3722fd4 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -213,6 +213,38 @@ void generic_smp_call_function_single_interrupt(void)
 }
 
 /*
+ * flush_smp_call_function_queue - Flush any pending smp-call-function
+ * callbacks queued on this CPU (including those for which the source CPU's
+ * IPIs might not have been received on this CPU yet). This is invoked by a
+ * CPU about to go offline, to ensure that all pending IPI functions are run
+ * before it goes completely offline.
+ *
+ * Loop through the call_single_queue and run all the queued functions.
+ * Must be called with interrupts disabled.
+ */
+void flush_smp_call_function_queue(void)
+{
+	struct llist_head *head;
+	struct llist_node *entry;
+	struct call_single_data *csd, *csd_next;
+
+	WARN_ON(!irqs_disabled());
+
+	head = &__get_cpu_var(call_single_queue);
+
+	if (likely(llist_empty(head)))
+		return;
+
+	entry = llist_del_all(head);
+	entry = llist_reverse_order(entry);
+
+	llist_for_each_entry_safe(csd, csd_next, entry, llist) {
+		csd->func(csd->info);
+		csd_unlock(csd);
+	}
+}
+
+/*
  * smp_call_function_single - Run a function on a specific CPU
  * @func: The function to run. This must be fast and non-blocking.
  * @info: An arbitrary pointer to pass to the function.
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 288f7fe..6b3a2f3 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -21,6 +21,7 @@
 #include <linux/smpboot.h>
 #include <linux/atomic.h>
 #include <linux/lglock.h>
+#include <linux/smp.h>
 
 /*
  * Structure to determine completion condition and record errors.  May
@@ -224,6 +225,16 @@ static int multi_cpu_stop(void *data)
 					local_irq_disable();
 					hard_irq_disable();
 				}
+
+				/*
+				 * IPIs (from the inactive CPUs) might arrive
+				 * late due to hardware latencies. So flush
+				 * out any pending IPI callbacks explicitly,
+				 * to ensure that the outgoing CPU doesn't go
+				 * offline with work still pending (during
+				 * CPU hotplug).
+				 */
+				flush_smp_call_function_queue();
 				break;
 			case MULTI_STOP_RUN:
 				if (is_active)



* Re: [PATCH v5 2/3] CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU"
  2014-05-15 19:13 ` [PATCH v5 2/3] CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU" Srivatsa S. Bhat
@ 2014-05-15 19:17   ` Tejun Heo
  2014-05-15 19:18     ` Srivatsa S. Bhat
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2014-05-15 19:17 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: peterz, tglx, mingo, rusty, akpm, fweisbec, hch, mgorman, riel,
	bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw, linux-kernel

On Fri, May 16, 2014 at 12:43:44AM +0530, Srivatsa S. Bhat wrote:
> During CPU offline, stop-machine is used to take control over all the online
> CPUs (via the per-cpu stopper thread) and then run take_cpu_down() on the CPU
> that is to be taken offline.
...
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>

Reviewed-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun


* Re: [PATCH v5 2/3] CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU"
  2014-05-15 19:17   ` Tejun Heo
@ 2014-05-15 19:18     ` Srivatsa S. Bhat
  0 siblings, 0 replies; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-15 19:18 UTC (permalink / raw)
  To: Tejun Heo
  Cc: peterz, tglx, mingo, rusty, akpm, fweisbec, hch, mgorman, riel,
	bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw, linux-kernel

On 05/16/2014 12:47 AM, Tejun Heo wrote:
> On Fri, May 16, 2014 at 12:43:44AM +0530, Srivatsa S. Bhat wrote:
>> During CPU offline, stop-machine is used to take control over all the online
>> CPUs (via the per-cpu stopper thread) and then run take_cpu_down() on the CPU
>> that is to be taken offline.
> ...
>> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> 
> Reviewed-by: Tejun Heo <tj@kernel.org>
> 

Cool! Thanks a lot Tejun!
 
Regards,
Srivatsa S. Bhat



* Re: [PATCH v5 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU
  2014-05-15 19:13 ` [PATCH v5 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU Srivatsa S. Bhat
@ 2014-05-15 19:19   ` Joe Perches
  2014-05-15 19:34     ` Srivatsa S. Bhat
  0 siblings, 1 reply; 26+ messages in thread
From: Joe Perches @ 2014-05-15 19:19 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch, mgorman,
	riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel

On Fri, 2014-05-16 at 00:43 +0530, Srivatsa S. Bhat wrote:
> Today the smp-call-function code just prints a warning if we get an IPI on
> an offline CPU. This info is sufficient to let us know that something went
> wrong, but often it is very hard to debug exactly who sent the IPI and why,
> from this info alone.
[]
> diff --git a/kernel/smp.c b/kernel/smp.c
[]
> @@ -185,14 +185,26 @@ void generic_smp_call_function_single_interrupt(void)
[]
> -	entry = llist_del_all(&__get_cpu_var(call_single_queue));
> -	entry = llist_reverse_order(entry);
> +		/*
> +		 * We don't have to use the _safe() variant here
> +		 * because we are not invoking the IPI handlers yet.
> +		 */
> +		llist_for_each_entry(csd, entry, llist)
> +			pr_warn("IPI callback %pS sent to offline CPU\n",
> +				csd->func);

Perhaps add ratelimited?




* Re: [PATCH v5 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-15 19:14 ` [PATCH v5 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline Srivatsa S. Bhat
@ 2014-05-15 19:19   ` Tejun Heo
  2014-05-15 19:26     ` Srivatsa S. Bhat
  0 siblings, 1 reply; 26+ messages in thread
From: Tejun Heo @ 2014-05-15 19:19 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: peterz, tglx, mingo, rusty, akpm, fweisbec, hch, mgorman, riel,
	bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw, linux-kernel

Hello,

On Fri, May 16, 2014 at 12:44:13AM +0530, Srivatsa S. Bhat wrote:
>  /*
> + * flush_smp_call_function_queue - Flush any pending smp-call-function

Don't we need a blank line here?

> + * callbacks queued on this CPU (including those for which the source CPU's
> + * IPIs might not have been received on this CPU yet). This is invoked by a
> + * CPU about to go offline, to ensure that all pending IPI functions are run
> + * before it goes completely offline.
> + *
> + * Loop through the call_single_queue and run all the queued functions.
> + * Must be called with interrupts disabled.
> + */

Other than that, looks good to me.

Reviewed-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun


* Re: [PATCH v5 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-15 19:19   ` Tejun Heo
@ 2014-05-15 19:26     ` Srivatsa S. Bhat
  2014-05-15 19:36       ` Paul E. McKenney
  0 siblings, 1 reply; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-15 19:26 UTC (permalink / raw)
  To: Tejun Heo
  Cc: peterz, tglx, mingo, rusty, akpm, fweisbec, hch, mgorman, riel,
	bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw, linux-kernel

On 05/16/2014 12:49 AM, Tejun Heo wrote:
> Hello,
> 
> On Fri, May 16, 2014 at 12:44:13AM +0530, Srivatsa S. Bhat wrote:
>>  /*
>> + * flush_smp_call_function_queue - Flush any pending smp-call-function
> 
> Don't we need a blank line here?
>

Hmm? That sentence continues on the next line, hence I didn't add any blank
line there.

>> + * callbacks queued on this CPU (including those for which the source CPU's
>> + * IPIs might not have been received on this CPU yet). This is invoked by a
>> + * CPU about to go offline, to ensure that all pending IPI functions are run
>> + * before it goes completely offline.
>> + *
>> + * Loop through the call_single_queue and run all the queued functions.
>> + * Must be called with interrupts disabled.
>> + */
> 
> Other than that, looks good to me.
> 
> Reviewed-by: Tejun Heo <tj@kernel.org>
> 

Thank you!

Regards,
Srivatsa S. Bhat



* Re: [PATCH v5 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU
  2014-05-15 19:19   ` Joe Perches
@ 2014-05-15 19:34     ` Srivatsa S. Bhat
  2014-05-15 19:43       ` Joe Perches
  0 siblings, 1 reply; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-15 19:34 UTC (permalink / raw)
  To: Joe Perches
  Cc: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch, mgorman,
	riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel

On 05/16/2014 12:49 AM, Joe Perches wrote:
> On Fri, 2014-05-16 at 00:43 +0530, Srivatsa S. Bhat wrote:
>> Today the smp-call-function code just prints a warning if we get an IPI on
>> an offline CPU. This info is sufficient to let us know that something went
>> wrong, but often it is very hard to debug exactly who sent the IPI and why,
>> from this info alone.
> []
>> diff --git a/kernel/smp.c b/kernel/smp.c
> []
>> @@ -185,14 +185,26 @@ void generic_smp_call_function_single_interrupt(void)
> []
>> -	entry = llist_del_all(&__get_cpu_var(call_single_queue));
>> -	entry = llist_reverse_order(entry);
>> +		/*
>> +		 * We don't have to use the _safe() variant here
>> +		 * because we are not invoking the IPI handlers yet.
>> +		 */
>> +		llist_for_each_entry(csd, entry, llist)
>> +			pr_warn("IPI callback %pS sent to offline CPU\n",
>> +				csd->func);
> 
> Perhaps add ratelimited?
> 

This entire scenario is expected to be _very_ infrequent, and even if
it happens, these prints will appear only once during the entire run
(note the use of the 'warned' variable to control that). So I don't think
ratelimiting is called for in this case.

Regards,
Srivatsa S. Bhat



* Re: [PATCH v5 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-15 19:26     ` Srivatsa S. Bhat
@ 2014-05-15 19:36       ` Paul E. McKenney
  2014-05-15 19:43         ` [PATCH v5 UPDATED " Srivatsa S. Bhat
  2014-05-15 19:44         ` [PATCH v5 " Tejun Heo
  0 siblings, 2 replies; 26+ messages in thread
From: Paul E. McKenney @ 2014-05-15 19:36 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Tejun Heo, peterz, tglx, mingo, rusty, akpm, fweisbec, hch,
	mgorman, riel, bp, rostedt, mgalbraith, ego, oleg, rjw,
	linux-kernel

On Fri, May 16, 2014 at 12:56:11AM +0530, Srivatsa S. Bhat wrote:
> On 05/16/2014 12:49 AM, Tejun Heo wrote:
> > Hello,
> > 
> > On Fri, May 16, 2014 at 12:44:13AM +0530, Srivatsa S. Bhat wrote:
> >>  /*
> >> + * flush_smp_call_function_queue - Flush any pending smp-call-function
> > 
> > Don't we need a blank line here?
> 
> Hmm? That sentence continues on the next line, hence I didn't add any blank
> line there.

Tejun might be wanting this to be a docbook comment, in which case
it also needs "/**" at the beginning of the comment.

							Thanx, Paul

> >> + * callbacks queued on this CPU (including those for which the source CPU's
> >> + * IPIs might not have been received on this CPU yet). This is invoked by a
> >> + * CPU about to go offline, to ensure that all pending IPI functions are run
> >> + * before it goes completely offline.
> >> + *
> >> + * Loop through the call_single_queue and run all the queued functions.
> >> + * Must be called with interrupts disabled.
> >> + */
> > 
> > Other than that, looks good to me.
> > 
> > Reviewed-by: Tejun Heo <tj@kernel.org>
> > 
> 
> Thank you!
> 
> Regards,
> Srivatsa S. Bhat
> 



* [PATCH v5 UPDATED 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-15 19:36       ` Paul E. McKenney
@ 2014-05-15 19:43         ` Srivatsa S. Bhat
  2014-05-15 19:45           ` Tejun Heo
  2014-05-19 12:19           ` [PATCH v5 UPDATEDv2 " Srivatsa S. Bhat
  2014-05-15 19:44         ` [PATCH v5 " Tejun Heo
  1 sibling, 2 replies; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-15 19:43 UTC (permalink / raw)
  To: paulmck
  Cc: Tejun Heo, peterz, tglx, mingo, rusty, akpm, fweisbec, hch,
	mgorman, riel, bp, rostedt, mgalbraith, ego, oleg, rjw,
	linux-kernel

On 05/16/2014 01:06 AM, Paul E. McKenney wrote:
> On Fri, May 16, 2014 at 12:56:11AM +0530, Srivatsa S. Bhat wrote:
>> On 05/16/2014 12:49 AM, Tejun Heo wrote:
>>> Hello,
>>>
>>> On Fri, May 16, 2014 at 12:44:13AM +0530, Srivatsa S. Bhat wrote:
>>>>  /*
>>>> + * flush_smp_call_function_queue - Flush any pending smp-call-function
>>>
>>> Don't we need a blank line here?
>>
>> Hmm? That sentence continues on the next line, hence I didn't add any blank
>> line there.
> 
> Tejun might be wanting this to be a docbook comment, in which case
> it also needs "/**" at the beginning of the comment.
> 

Oh, I see. Like this?

-----------------------------------------------------------------------

From: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
[PATCH v5 UPDATED 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline

During CPU offline, in the stop-machine loop, we use 2 separate stages to
disable interrupts, to ensure that the CPU going offline doesn't get any new
IPIs from the other CPUs after it has gone offline.

However, an IPI sent much earlier might arrive late on the target CPU
(possibly _after_ the CPU has gone offline) due to hardware latencies,
and due to this, the smp-call-function callbacks queued on the outgoing
CPU might not get noticed (and hence not executed) at all.

This is somewhat theoretical, but in any case, it makes sense to explicitly
loop through the call_single_queue and flush any pending callbacks before the
CPU goes completely offline. So, flush the queued smp-call-function callbacks
in the MULTI_STOP_DISABLE_IRQ_ACTIVE stage, after disabling interrupts on the
active CPU. That way, we would have handled all the queued callbacks before
going offline, and also, no new IPIs can be sent by the other CPUs to the
outgoing CPU at that point, because they will all be executing the stop-machine
code with interrupts disabled.

Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 include/linux/smp.h   |    2 ++
 kernel/smp.c          |   33 +++++++++++++++++++++++++++++++++
 kernel/stop_machine.c |   11 +++++++++++
 3 files changed, 46 insertions(+)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 633f5ed..7924191 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -52,6 +52,8 @@ void on_each_cpu_cond(bool (*cond_func)(int cpu, void *info),
 
 int smp_call_function_single_async(int cpu, struct call_single_data *csd);
 
+void flush_smp_call_function_queue(void);
+
 #ifdef CONFIG_SMP
 
 #include <linux/preempt.h>
diff --git a/kernel/smp.c b/kernel/smp.c
index ae45446..a152152 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -212,6 +212,39 @@ void generic_smp_call_function_single_interrupt(void)
 	}
 }
 
+/**
+ * flush_smp_call_function_queue - Flush pending smp-call-function callbacks
+ *
+ * Flush any pending smp-call-function callbacks queued on this CPU (including
+ * those for which the source CPU's IPIs might not have been received on this
+ * CPU yet). This is invoked by a CPU about to go offline, to ensure that all
+ * pending IPI functions are run before it goes completely offline.
+ *
+ * Loop through the call_single_queue and run all the queued functions.
+ * Must be called with interrupts disabled.
+ */
+void flush_smp_call_function_queue(void)
+{
+	struct llist_head *head;
+	struct llist_node *entry;
+	struct call_single_data *csd, *csd_next;
+
+	WARN_ON(!irqs_disabled());
+
+	head = &__get_cpu_var(call_single_queue);
+
+	if (likely(llist_empty(head)))
+		return;
+
+	entry = llist_del_all(head);
+	entry = llist_reverse_order(entry);
+
+	llist_for_each_entry_safe(csd, csd_next, entry, llist) {
+		csd->func(csd->info);
+		csd_unlock(csd);
+	}
+}
+
 /*
  * smp_call_function_single - Run a function on a specific CPU
  * @func: The function to run. This must be fast and non-blocking.
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 288f7fe..6b3a2f3 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -21,6 +21,7 @@
 #include <linux/smpboot.h>
 #include <linux/atomic.h>
 #include <linux/lglock.h>
+#include <linux/smp.h>
 
 /*
  * Structure to determine completion condition and record errors.  May
@@ -224,6 +225,16 @@ static int multi_cpu_stop(void *data)
 					local_irq_disable();
 					hard_irq_disable();
 				}
+
+				/*
+				 * IPIs (from the inactive CPUs) might arrive
+				 * late due to hardware latencies. So flush
+				 * out any pending IPI callbacks explicitly,
+				 * to ensure that the outgoing CPU doesn't go
+				 * offline with work still pending (during
+				 * CPU hotplug).
+				 */
+				flush_smp_call_function_queue();
 				break;
 			case MULTI_STOP_RUN:
 				if (is_active)



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v5 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU
  2014-05-15 19:34     ` Srivatsa S. Bhat
@ 2014-05-15 19:43       ` Joe Perches
  2014-05-15 19:51         ` [PATCH v5 UPDATED " Srivatsa S. Bhat
  0 siblings, 1 reply; 26+ messages in thread
From: Joe Perches @ 2014-05-15 19:43 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch, mgorman,
	riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel

On Fri, 2014-05-16 at 01:04 +0530, Srivatsa S. Bhat wrote:
> On 05/16/2014 12:49 AM, Joe Perches wrote:
> > On Fri, 2014-05-16 at 00:43 +0530, Srivatsa S. Bhat wrote:
> >> Today the smp-call-function code just prints a warning if we get an IPI on
> >> an offline CPU. This info is sufficient to let us know that something went
> >> wrong, but often it is very hard to debug exactly who sent the IPI and why,
> >> from this info alone.
> > []
> >> diff --git a/kernel/smp.c b/kernel/smp.c
> > []
> >> @@ -185,14 +185,26 @@ void generic_smp_call_function_single_interrupt(void)
> > []
> >> -	entry = llist_del_all(&__get_cpu_var(call_single_queue));
> >> -	entry = llist_reverse_order(entry);
> >> +		/*
> >> +		 * We don't have to use the _safe() variant here
> >> +		 * because we are not invoking the IPI handlers yet.
> >> +		 */
> >> +		llist_for_each_entry(csd, entry, llist)
> >> +			pr_warn("IPI callback %pS sent to offline CPU\n",
> >> +				csd->func);
> > 
> > Perhaps add ratelimited?
> > 
> 
> This entire scenario is expected to be _very_ infrequent, and even if
> it happens, these prints will appear only once during the entire run
> (note the use of the 'warned' variable to control that). So I don't think
> ratelimiting is called for in this case.

Ah, good.

I was misled a bit by the WARN_ONCE that is in the
same block.  Since there is already a guard flag
above the block, perhaps the WARN_ONCE should just
be WARN.




* Re: [PATCH v5 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-15 19:36       ` Paul E. McKenney
  2014-05-15 19:43         ` [PATCH v5 UPDATED " Srivatsa S. Bhat
@ 2014-05-15 19:44         ` Tejun Heo
  1 sibling, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2014-05-15 19:44 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Srivatsa S. Bhat, peterz, tglx, mingo, rusty, akpm, fweisbec,
	hch, mgorman, riel, bp, rostedt, mgalbraith, ego, oleg, rjw,
	linux-kernel

Hello,

On Thu, May 15, 2014 at 12:36:56PM -0700, Paul E. McKenney wrote:
> On Fri, May 16, 2014 at 12:56:11AM +0530, Srivatsa S. Bhat wrote:
> > On 05/16/2014 12:49 AM, Tejun Heo wrote:
> > > Hello,
> > > 
> > > On Fri, May 16, 2014 at 12:44:13AM +0530, Srivatsa S. Bhat wrote:
> > >>  /*
> > >> + * flush_smp_call_function_queue - Flush any pending smp-call-function
> > > 
> > > Don't we need a blank line here?
> > 
> > Hmm? That sentence continues on the next line, hence I didn't add any blank
> > line there.
> 
> Tejun might be wanting this to be a docbook comment, in which case
> it also needs "/**" at the beginning of the comment.

Yes, right.  Gotta at least read the text before commenting. :)

Thanks.

-- 
tejun


* Re: [PATCH v5 UPDATED 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-15 19:43         ` [PATCH v5 UPDATED " Srivatsa S. Bhat
@ 2014-05-15 19:45           ` Tejun Heo
  2014-05-19 12:19           ` [PATCH v5 UPDATEDv2 " Srivatsa S. Bhat
  1 sibling, 0 replies; 26+ messages in thread
From: Tejun Heo @ 2014-05-15 19:45 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: paulmck, peterz, tglx, mingo, rusty, akpm, fweisbec, hch,
	mgorman, riel, bp, rostedt, mgalbraith, ego, oleg, rjw,
	linux-kernel

On Fri, May 16, 2014 at 01:13:18AM +0530, Srivatsa S. Bhat wrote:
> On 05/16/2014 01:06 AM, Paul E. McKenney wrote:
> > On Fri, May 16, 2014 at 12:56:11AM +0530, Srivatsa S. Bhat wrote:
> >> On 05/16/2014 12:49 AM, Tejun Heo wrote:
> >>> Hello,
> >>>
> >>> On Fri, May 16, 2014 at 12:44:13AM +0530, Srivatsa S. Bhat wrote:
> >>>>  /*
> >>>> + * flush_smp_call_function_queue - Flush any pending smp-call-function
> >>>
> >>> Don't we need a blank line here?
> >>
> >> Hmm? That sentence continues on the next line, hence I didn't add any blank
> >> line there.
> > 
> > Tejun might be wanting this to be a docbook comment, in which case
> > it also needs "/**" at the beginning of the comment.
> > 
> 
> Oh, I see. Like this?

Yeah, looks good to me.

Thanks.

-- 
tejun


* [PATCH v5 UPDATED 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU
  2014-05-15 19:43       ` Joe Perches
@ 2014-05-15 19:51         ` Srivatsa S. Bhat
  0 siblings, 0 replies; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-15 19:51 UTC (permalink / raw)
  To: Joe Perches
  Cc: peterz, tglx, mingo, tj, rusty, akpm, fweisbec, hch, mgorman,
	riel, bp, rostedt, mgalbraith, ego, paulmck, oleg, rjw,
	linux-kernel

On 05/16/2014 01:13 AM, Joe Perches wrote:
[...]
> Ah, good.
> 
> I was misled a bit by the WARN_ONCE that is in the
> same block.  Perhaps because there is a guard flag
> above the block, maybe the WARN_ONCE should just be
> WARN.
> 

Ah, right, just WARN is sufficient there. Thanks!

-------------------------------------------------------

From: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
[PATCH v5 UPDATED 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU

Today the smp-call-function code just prints a warning if we get an IPI on
an offline CPU. This info is sufficient to let us know that something went
wrong, but often it is very hard to debug exactly who sent the IPI and why,
from this info alone.

In most cases, we get the warning about the IPI to an offline CPU immediately
after the CPU going offline comes out of the stop-machine phase and reenables
interrupts. Since all online CPUs participate in stop-machine, the information
regarding the sender of the IPI is already lost by the time we exit the
stop-machine loop. So even if we dump the stack on each CPU at this point,
we won't find anything useful since all of them will show the stack-trace of
the stopper thread. So we need a better way to figure out who sent the IPI and
why.

To achieve this, when we detect an IPI targeted to an offline CPU, loop through
the call-single-data linked list and print out the payload (i.e., the name
of the function which was supposed to be executed by the target CPU). This
would give us an insight as to who might have sent the IPI and help us debug
this further.

Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 kernel/smp.c |   18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 06d574e..306f818 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -185,14 +185,26 @@ void generic_smp_call_function_single_interrupt(void)
 {
 	struct llist_node *entry;
 	struct call_single_data *csd, *csd_next;
+	static bool warned;
+
+	entry = llist_del_all(&__get_cpu_var(call_single_queue));
+	entry = llist_reverse_order(entry);
 
 	/*
 	 * Shouldn't receive this interrupt on a cpu that is not yet online.
 	 */
-	WARN_ON_ONCE(!cpu_online(smp_processor_id()));
+	if (unlikely(!cpu_online(smp_processor_id()) && !warned)) {
+		warned = true;
+		WARN(1, "IPI on offline CPU %d\n", smp_processor_id());
 
-	entry = llist_del_all(&__get_cpu_var(call_single_queue));
-	entry = llist_reverse_order(entry);
+		/*
+		 * We don't have to use the _safe() variant here
+		 * because we are not invoking the IPI handlers yet.
+		 */
+		llist_for_each_entry(csd, entry, llist)
+			pr_warn("IPI callback %pS sent to offline CPU\n",
+				csd->func);
+	}
 
 	llist_for_each_entry_safe(csd, csd_next, entry, llist) {
 		csd->func(csd->info);




* [PATCH v5 UPDATEDv2 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-15 19:43         ` [PATCH v5 UPDATED " Srivatsa S. Bhat
  2014-05-15 19:45           ` Tejun Heo
@ 2014-05-19 12:19           ` Srivatsa S. Bhat
  2014-05-19 16:18             ` Oleg Nesterov
  1 sibling, 1 reply; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-19 12:19 UTC (permalink / raw)
  To: Tejun Heo, akpm, fweisbec
  Cc: paulmck, peterz, tglx, mingo, rusty, hch, mgorman, riel, bp,
	rostedt, mgalbraith, ego, oleg, rjw, linux-kernel

On 05/16/2014 01:13 AM, Srivatsa S. Bhat wrote:
> -----------------------------------------------------------------------
> 
> From: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> [PATCH v5 UPDATED 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
> 
[...]
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 288f7fe..6b3a2f3 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -21,6 +21,7 @@
>  #include <linux/smpboot.h>
>  #include <linux/atomic.h>
>  #include <linux/lglock.h>
> +#include <linux/smp.h>
>  
>  /*
>   * Structure to determine completion condition and record errors.  May
> @@ -224,6 +225,16 @@ static int multi_cpu_stop(void *data)
>  					local_irq_disable();
>  					hard_irq_disable();
>  				}
> +
> +				/*
> +				 * IPIs (from the inactive CPUs) might arrive
> +				 * late due to hardware latencies. So flush
> +				 * out any pending IPI callbacks explicitly,
> +				 * to ensure that the outgoing CPU doesn't go
> +				 * offline with work still pending (during
> +				 * CPU hotplug).
> +				 */
> +				flush_smp_call_function_queue();
>  				break;

I just realized that since flush_smp_call_function_queue() is called outside
of the "if (is_active)" block, it will get executed by all CPUs (even the
inactive ones) unnecessarily. Only the active CPU needs to run it; the others
don't have to.

I must have been carried away by the name of the state (which has _ACTIVE in it)
and thought that only the active CPUs would run it. So, here is an updated patch.

-----------------------------------------------------------------------------

From: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
[PATCH v5 UPDATEDv2 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline

During CPU offline, in the stop-machine loop, we use 2 separate stages to
disable interrupts, to ensure that the CPU going offline doesn't get any new
IPIs from the other CPUs after it has gone offline.

However, an IPI sent much earlier might arrive late on the target CPU
(possibly _after_ the CPU has gone offline) due to hardware latencies,
and due to this, the smp-call-function callbacks queued on the outgoing
CPU might not get noticed (and hence not executed) at all.

This is somewhat theoretical, but in any case, it makes sense to explicitly
loop through the call_single_queue and flush any pending callbacks before the
CPU goes completely offline. So, flush the queued smp-call-function callbacks
in the MULTI_STOP_DISABLE_IRQ_ACTIVE stage, after disabling interrupts on the
active CPU. That way, we would have handled all the queued callbacks before
going offline, and also, no new IPIs can be sent by the other CPUs to the
outgoing CPU at that point, because they will all be executing the stop-machine
code with interrupts disabled.

Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 include/linux/smp.h   |    2 ++
 kernel/smp.c          |   33 +++++++++++++++++++++++++++++++++
 kernel/stop_machine.c |   11 +++++++++++
 3 files changed, 46 insertions(+)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 633f5ed..7924191 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -52,6 +52,8 @@ void on_each_cpu_cond(bool (*cond_func)(int cpu, void *info),
 
 int smp_call_function_single_async(int cpu, struct call_single_data *csd);
 
+void flush_smp_call_function_queue(void);
+
 #ifdef CONFIG_SMP
 
 #include <linux/preempt.h>
diff --git a/kernel/smp.c b/kernel/smp.c
index 306f818..f361681 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -212,6 +212,39 @@ void generic_smp_call_function_single_interrupt(void)
 	}
 }
 
+/**
+ * flush_smp_call_function_queue - Flush pending smp-call-function callbacks
+ *
+ * Flush any pending smp-call-function callbacks queued on this CPU (including
+ * those for which the source CPU's IPIs might not have been received on this
+ * CPU yet). This is invoked by a CPU about to go offline, to ensure that all
+ * pending IPI functions are run before it goes completely offline.
+ *
+ * Loop through the call_single_queue and run all the queued functions.
+ * Must be called with interrupts disabled.
+ */
+void flush_smp_call_function_queue(void)
+{
+	struct llist_head *head;
+	struct llist_node *entry;
+	struct call_single_data *csd, *csd_next;
+
+	WARN_ON(!irqs_disabled());
+
+	head = &__get_cpu_var(call_single_queue);
+
+	if (likely(llist_empty(head)))
+		return;
+
+	entry = llist_del_all(head);
+	entry = llist_reverse_order(entry);
+
+	llist_for_each_entry_safe(csd, csd_next, entry, llist) {
+		csd->func(csd->info);
+		csd_unlock(csd);
+	}
+}
+
 /*
  * smp_call_function_single - Run a function on a specific CPU
  * @func: The function to run. This must be fast and non-blocking.
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 288f7fe..8581d20 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -21,6 +21,7 @@
 #include <linux/smpboot.h>
 #include <linux/atomic.h>
 #include <linux/lglock.h>
+#include <linux/smp.h>
 
 /*
  * Structure to determine completion condition and record errors.  May
@@ -223,6 +224,16 @@ static int multi_cpu_stop(void *data)
 				if (is_active) {
 					local_irq_disable();
 					hard_irq_disable();
+
+					/*
+					 * IPIs (from the inactive CPUs) might
+					 * arrive late due to hardware latencies.
+					 * So flush out any pending IPI callbacks
+					 * explicitly, to ensure that the outgoing
+					 * CPU doesn't go offline with work still
+					 * pending (during CPU hotplug).
+					 */
+					flush_smp_call_function_queue();
 				}
 				break;
 			case MULTI_STOP_RUN:




* Re: [PATCH v5 UPDATEDv2 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-19 12:19           ` [PATCH v5 UPDATEDv2 " Srivatsa S. Bhat
@ 2014-05-19 16:18             ` Oleg Nesterov
  2014-05-19 19:49               ` Srivatsa S. Bhat
  0 siblings, 1 reply; 26+ messages in thread
From: Oleg Nesterov @ 2014-05-19 16:18 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Tejun Heo, akpm, fweisbec, paulmck, peterz, tglx, mingo, rusty,
	hch, mgorman, riel, bp, rostedt, mgalbraith, ego, rjw,
	linux-kernel

On 05/19, Srivatsa S. Bhat wrote:
>
> However, an IPI sent much earlier might arrive late on the target CPU
> (possibly _after_ the CPU has gone offline) due to hardware latencies,
> and due to this, the smp-call-function callbacks queued on the outgoing
> CPU might not get noticed (and hence not executed) at all.

OK, but

> +void flush_smp_call_function_queue(void)
> +{
> +	struct llist_head *head;
> +	struct llist_node *entry;
> +	struct call_single_data *csd, *csd_next;
> +
> +	WARN_ON(!irqs_disabled());
> +
> +	head = &__get_cpu_var(call_single_queue);
> +
> +	if (likely(llist_empty(head)))
> +		return;
> +
> +	entry = llist_del_all(head);
> +	entry = llist_reverse_order(entry);
> +
> +	llist_for_each_entry_safe(csd, csd_next, entry, llist) {
> +		csd->func(csd->info);
> +		csd_unlock(csd);
> +	}
> +}

why do we need it? Can't multi_cpu_stop() just call
generic_smp_call_function_single_interrupt() ? This cpu is still online,
we should not worry about WARN_ON(!cpu_online()) ?

Oleg.



* Re: [PATCH v5 UPDATEDv2 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-19 16:18             ` Oleg Nesterov
@ 2014-05-19 19:49               ` Srivatsa S. Bhat
  2014-05-19 20:22                 ` [PATCH v5 UPDATEDv3 " Srivatsa S. Bhat
  0 siblings, 1 reply; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-19 19:49 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, akpm, fweisbec, paulmck, peterz, tglx, mingo, rusty,
	hch, mgorman, riel, bp, rostedt, mgalbraith, ego, rjw,
	linux-kernel

On 05/19/2014 09:48 PM, Oleg Nesterov wrote:
> On 05/19, Srivatsa S. Bhat wrote:
>>
>> However, an IPI sent much earlier might arrive late on the target CPU
>> (possibly _after_ the CPU has gone offline) due to hardware latencies,
>> and due to this, the smp-call-function callbacks queued on the outgoing
>> CPU might not get noticed (and hence not executed) at all.
> 
> OK, but
> 
>> +void flush_smp_call_function_queue(void)
>> +{
>> +	struct llist_head *head;
>> +	struct llist_node *entry;
>> +	struct call_single_data *csd, *csd_next;
>> +
>> +	WARN_ON(!irqs_disabled());
>> +
>> +	head = &__get_cpu_var(call_single_queue);
>> +
>> +	if (likely(llist_empty(head)))
>> +		return;
>> +
>> +	entry = llist_del_all(head);
>> +	entry = llist_reverse_order(entry);
>> +
>> +	llist_for_each_entry_safe(csd, csd_next, entry, llist) {
>> +		csd->func(csd->info);
>> +		csd_unlock(csd);
>> +	}
>> +}
> 
> why do we need it? Can't multi_cpu_stop() just call
> generic_smp_call_function_single_interrupt() ? This cpu is still online,
> we should not worry about WARN_ON(!cpu_online()) ?
> 

Ah, cool idea! :-) I'll use this method and post an updated patch.

Thank you!
 
Regards,
Srivatsa S. Bhat



* [PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-19 19:49               ` Srivatsa S. Bhat
@ 2014-05-19 20:22                 ` Srivatsa S. Bhat
  2014-05-20  9:42                   ` Peter Zijlstra
  0 siblings, 1 reply; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-19 20:22 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Tejun Heo, akpm, fweisbec, paulmck, peterz, tglx, mingo, rusty,
	hch, mgorman, riel, bp, rostedt, mgalbraith, ego, rjw,
	linux-kernel

On 05/20/2014 01:19 AM, Srivatsa S. Bhat wrote:
> On 05/19/2014 09:48 PM, Oleg Nesterov wrote:
>> On 05/19, Srivatsa S. Bhat wrote:
>>>
>>> However, an IPI sent much earlier might arrive late on the target CPU
>>> (possibly _after_ the CPU has gone offline) due to hardware latencies,
>>> and due to this, the smp-call-function callbacks queued on the outgoing
>>> CPU might not get noticed (and hence not executed) at all.
>>
[...]
>> why do we need it? Can't multi_cpu_stop() just call
>> generic_smp_call_function_single_interrupt() ? This cpu is still online,
>> we should not worry about WARN_ON(!cpu_online()) ?
>>
> 
> Ah, cool idea! :-) I'll use this method and post an updated patch.
> 

--------------------------------------------------------------------

From: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
[PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline

During CPU offline, in the stop-machine loop, we use 2 separate stages to
disable interrupts, to ensure that the CPU going offline doesn't get any new
IPIs from the other CPUs after it has gone offline.

However, an IPI sent much earlier might arrive late on the target CPU
(possibly _after_ the CPU has gone offline) due to hardware latencies,
and due to this, the smp-call-function callbacks queued on the outgoing
CPU might not get noticed (and hence not executed) at all.

This is somewhat theoretical, but in any case, it makes sense to explicitly
loop through the call_single_queue and flush any pending callbacks before the
CPU goes completely offline. So, flush the queued smp-call-function callbacks
in the MULTI_STOP_DISABLE_IRQ_ACTIVE stage, after disabling interrupts on the
active CPU. This can be trivially achieved by invoking the
generic_smp_call_function_single_interrupt() function itself (and since the
outgoing CPU is still online at this point, we won't trigger the "IPI to offline
CPU" warning in this function; so we are safe to call it here).

This way, we would have handled all the queued callbacks before going offline,
and also, no new IPIs can be sent by the other CPUs to the outgoing CPU at that
point, because they will all be executing the stop-machine code with interrupts
disabled.

Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
---

 include/linux/smp.h   |    2 ++
 kernel/smp.c          |   17 ++++++++++++++---
 kernel/stop_machine.c |   11 +++++++++++
 3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 633f5ed..e6b090d 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -151,6 +151,8 @@ smp_call_function_any(const struct cpumask *mask, smp_call_func_t func,
 
 static inline void kick_all_cpus_sync(void) {  }
 
+static inline void generic_smp_call_function_single_interrupt(void) { }
+
 #endif /* !SMP */
 
 /*
diff --git a/kernel/smp.c b/kernel/smp.c
index 306f818..b765167 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -177,9 +177,18 @@ static int generic_exec_single(int cpu, struct call_single_data *csd,
 	return 0;
 }
 
-/*
- * Invoked by arch to handle an IPI for call function single. Must be
- * called from the arch with interrupts disabled.
+/**
+ * generic_smp_call_function_single_interrupt - Execute SMP IPI callbacks
+ *
+ * Invoked by arch to handle an IPI for call function single.
+ *
+ * This is also invoked by a CPU about to go offline, to flush any pending
+ * smp-call-function callbacks queued on this CPU (including those for which
+ * the source CPU's IPIs might not have been received on this CPU yet).
+ * This ensures that all pending IPI callbacks are run before the CPU goes
+ * completely offline.
+ *
+ * Must be called with interrupts disabled.
  */
 void generic_smp_call_function_single_interrupt(void)
 {
@@ -187,6 +196,8 @@ void generic_smp_call_function_single_interrupt(void)
 	struct call_single_data *csd, *csd_next;
 	static bool warned;
 
+	WARN_ON(!irqs_disabled());
+
 	entry = llist_del_all(&__get_cpu_var(call_single_queue));
 	entry = llist_reverse_order(entry);
 
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 288f7fe..4069c13 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -21,6 +21,7 @@
 #include <linux/smpboot.h>
 #include <linux/atomic.h>
 #include <linux/lglock.h>
+#include <linux/smp.h>
 
 /*
  * Structure to determine completion condition and record errors.  May
@@ -223,6 +224,16 @@ static int multi_cpu_stop(void *data)
 				if (is_active) {
 					local_irq_disable();
 					hard_irq_disable();
+
+					/*
+					 * IPIs (from the inactive CPUs) might
+					 * arrive late due to hardware latencies.
+					 * So flush out any pending IPI callbacks
+					 * explicitly, to ensure that the outgoing
+					 * CPU doesn't go offline with work still
+					 * pending (during CPU hotplug).
+					 */
+					generic_smp_call_function_single_interrupt();
 				}
 				break;
 			case MULTI_STOP_RUN:




* Re: [PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-19 20:22                 ` [PATCH v5 UPDATEDv3 " Srivatsa S. Bhat
@ 2014-05-20  9:42                   ` Peter Zijlstra
  2014-05-20 10:09                     ` Srivatsa S. Bhat
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Zijlstra @ 2014-05-20  9:42 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Oleg Nesterov, Tejun Heo, akpm, fweisbec, paulmck, tglx, mingo,
	rusty, hch, mgorman, riel, bp, rostedt, mgalbraith, ego, rjw,
	linux-kernel

On Tue, May 20, 2014 at 01:52:41AM +0530, Srivatsa S. Bhat wrote:
> From: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> [PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
> 
> During CPU offline, in the stop-machine loop, we use 2 separate stages to
> disable interrupts, to ensure that the CPU going offline doesn't get any new
> IPIs from the other CPUs after it has gone offline.
> 
> However, an IPI sent much earlier might arrive late on the target CPU
> (possibly _after_ the CPU has gone offline) due to hardware latencies,
> and due to this, the smp-call-function callbacks queued on the outgoing
> CPU might not get noticed (and hence not executed) at all.
> 
> This is somewhat theoretical, but in any case, it makes sense to explicitly
> loop through the call_single_queue and flush any pending callbacks before the
> CPU goes completely offline. So, flush the queued smp-call-function callbacks
> in the MULTI_STOP_DISABLE_IRQ_ACTIVE stage, after disabling interrupts on the
> active CPU. This can be trivially achieved by invoking the
> generic_smp_call_function_single_interrupt() function itself (and since the
> outgoing CPU is still online at this point, we won't trigger the "IPI to offline
> CPU" warning in this function; so we are safe to call it here).
> 
> This way, we would have handled all the queued callbacks before going offline,
> and also, no new IPIs can be sent by the other CPUs to the outgoing CPU at that
> point, because they will all be executing the stop-machine code with interrupts
> disabled.
> 
> Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
> Reviewed-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
> ---
> 
>  include/linux/smp.h   |    2 ++
>  kernel/smp.c          |   17 ++++++++++++++---
>  kernel/stop_machine.c |   11 +++++++++++
>  3 files changed, 27 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/smp.h b/include/linux/smp.h
> index 633f5ed..e6b090d 100644
> --- a/include/linux/smp.h
> +++ b/include/linux/smp.h
> @@ -151,6 +151,8 @@ smp_call_function_any(const struct cpumask *mask, smp_call_func_t func,
>  
>  static inline void kick_all_cpus_sync(void) {  }
>  
> +static inline void generic_smp_call_function_single_interrupt(void) { }
> +
>  #endif /* !SMP */
>  
>  /*
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 306f818..b765167 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -177,9 +177,18 @@ static int generic_exec_single(int cpu, struct call_single_data *csd,
>  	return 0;
>  }
>  
> -/*
> - * Invoked by arch to handle an IPI for call function single. Must be
> - * called from the arch with interrupts disabled.
> +/**
> + * generic_smp_call_function_single_interrupt - Execute SMP IPI callbacks
> + *
> + * Invoked by arch to handle an IPI for call function single.
> + *
> + * This is also invoked by a CPU about to go offline, to flush any pending
> + * smp-call-function callbacks queued on this CPU (including those for which
> + * the source CPU's IPIs might not have been received on this CPU yet).
> + * This ensures that all pending IPI callbacks are run before the CPU goes
> + * completely offline.
> + *
> + * Must be called with interrupts disabled.
>   */
>  void generic_smp_call_function_single_interrupt(void)
>  {
> @@ -187,6 +196,8 @@ void generic_smp_call_function_single_interrupt(void)
>  	struct call_single_data *csd, *csd_next;
>  	static bool warned;
>  
> +	WARN_ON(!irqs_disabled());
> +
>  	entry = llist_del_all(&__get_cpu_var(call_single_queue));
>  	entry = llist_reverse_order(entry);
>  
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 288f7fe..4069c13 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -21,6 +21,7 @@
>  #include <linux/smpboot.h>
>  #include <linux/atomic.h>
>  #include <linux/lglock.h>
> +#include <linux/smp.h>
>  
>  /*
>   * Structure to determine completion condition and record errors.  May
> @@ -223,6 +224,16 @@ static int multi_cpu_stop(void *data)
>  				if (is_active) {
>  					local_irq_disable();
>  					hard_irq_disable();
> +
> +					/*
> +					 * IPIs (from the inactive CPUs) might
> +					 * arrive late due to hardware latencies.
> +					 * So flush out any pending IPI callbacks
> +					 * explicitly, to ensure that the outgoing
> +					 * CPU doesn't go offline with work still
> +					 * pending (during CPU hotplug).
> +					 */
> +					generic_smp_call_function_single_interrupt();
>  				}
>  				break;
>  			case MULTI_STOP_RUN:

The multi_cpu_stop() path isn't exclusive to hotplug, so your changelog
is wrong or the patch is.

Furthermore, you yourself have worked on trying to remove stop-machine
from hotplug; that too should've been a big hint that this is the wrong
place for this.


* Re: [PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-20  9:42                   ` Peter Zijlstra
@ 2014-05-20 10:09                     ` Srivatsa S. Bhat
  2014-05-20 10:25                       ` Peter Zijlstra
  0 siblings, 1 reply; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-20 10:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, Tejun Heo, akpm, fweisbec, paulmck, tglx, mingo,
	rusty, hch, mgorman, riel, bp, rostedt, mgalbraith, ego, rjw,
	linux-kernel

On 05/20/2014 03:12 PM, Peter Zijlstra wrote:
> On Tue, May 20, 2014 at 01:52:41AM +0530, Srivatsa S. Bhat wrote:
>> From: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
>> [PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
>>
>> During CPU offline, in the stop-machine loop, we use 2 separate stages to
>> disable interrupts, to ensure that the CPU going offline doesn't get any new
>> IPIs from the other CPUs after it has gone offline.
>>
>> However, an IPI sent much earlier might arrive late on the target CPU
>> (possibly _after_ the CPU has gone offline) due to hardware latencies,
>> and due to this, the smp-call-function callbacks queued on the outgoing
>> CPU might not get noticed (and hence not executed) at all.
>>
>> This is somewhat theoretical, but in any case, it makes sense to explicitly
>> loop through the call_single_queue and flush any pending callbacks before the
>> CPU goes completely offline. So, flush the queued smp-call-function callbacks
>> in the MULTI_STOP_DISABLE_IRQ_ACTIVE stage, after disabling interrupts on the
>> active CPU. This can be trivially achieved by invoking the
>> generic_smp_call_function_single_interrupt() function itself (and since the
>> outgoing CPU is still online at this point, we won't trigger the "IPI to offline
>> CPU" warning in this function; so we are safe to call it here).
>>
>> This way, we would have handled all the queued callbacks before going offline,
>> and also, no new IPIs can be sent by the other CPUs to the outgoing CPU at that
>> point, because they will all be executing the stop-machine code with interrupts
>> disabled.
>>
>> Suggested-by: Frederic Weisbecker <fweisbec@gmail.com>
>> Reviewed-by: Tejun Heo <tj@kernel.org>
>> Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
>> ---
>>
>>  include/linux/smp.h   |    2 ++
>>  kernel/smp.c          |   17 ++++++++++++++---
>>  kernel/stop_machine.c |   11 +++++++++++
>>  3 files changed, 27 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/smp.h b/include/linux/smp.h
>> index 633f5ed..e6b090d 100644
>> --- a/include/linux/smp.h
>> +++ b/include/linux/smp.h
>> @@ -151,6 +151,8 @@ smp_call_function_any(const struct cpumask *mask, smp_call_func_t func,
>>  
>>  static inline void kick_all_cpus_sync(void) {  }
>>  
>> +static inline void generic_smp_call_function_single_interrupt(void) { }
>> +
>>  #endif /* !SMP */
>>  
>>  /*
>> diff --git a/kernel/smp.c b/kernel/smp.c
>> index 306f818..b765167 100644
>> --- a/kernel/smp.c
>> +++ b/kernel/smp.c
>> @@ -177,9 +177,18 @@ static int generic_exec_single(int cpu, struct call_single_data *csd,
>>  	return 0;
>>  }
>>  
>> -/*
>> - * Invoked by arch to handle an IPI for call function single. Must be
>> - * called from the arch with interrupts disabled.
>> +/**
>> + * generic_smp_call_function_single_interrupt - Execute SMP IPI callbacks
>> + *
>> + * Invoked by arch to handle an IPI for call function single.
>> + *
>> + * This is also invoked by a CPU about to go offline, to flush any pending
>> + * smp-call-function callbacks queued on this CPU (including those for which
>> + * the source CPU's IPIs might not have been received on this CPU yet).
>> + * This ensures that all pending IPI callbacks are run before the CPU goes
>> + * completely offline.
>> + *
>> + * Must be called with interrupts disabled.
>>   */
>>  void generic_smp_call_function_single_interrupt(void)
>>  {
>> @@ -187,6 +196,8 @@ void generic_smp_call_function_single_interrupt(void)
>>  	struct call_single_data *csd, *csd_next;
>>  	static bool warned;
>>  
>> +	WARN_ON(!irqs_disabled());
>> +
>>  	entry = llist_del_all(&__get_cpu_var(call_single_queue));
>>  	entry = llist_reverse_order(entry);
>>  
>> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
>> index 288f7fe..4069c13 100644
>> --- a/kernel/stop_machine.c
>> +++ b/kernel/stop_machine.c
>> @@ -21,6 +21,7 @@
>>  #include <linux/smpboot.h>
>>  #include <linux/atomic.h>
>>  #include <linux/lglock.h>
>> +#include <linux/smp.h>
>>  
>>  /*
>>   * Structure to determine completion condition and record errors.  May
>> @@ -223,6 +224,16 @@ static int multi_cpu_stop(void *data)
>>  				if (is_active) {
>>  					local_irq_disable();
>>  					hard_irq_disable();
>> +
>> +					/*
>> +					 * IPIs (from the inactive CPUs) might
>> +					 * arrive late due to hardware latencies.
>> +					 * So flush out any pending IPI callbacks
>> +					 * explicitly, to ensure that the outgoing
>> +					 * CPU doesn't go offline with work still
>> +					 * pending (during CPU hotplug).
>> +					 */
>> +					generic_smp_call_function_single_interrupt();
>>  				}
>>  				break;
>>  			case MULTI_STOP_RUN:
> 
> The multi_cpu_stop() path isn't exclusive to hotplug, so your changelog
> is wrong or the patch is.
> 

Yes, I know that multi_cpu_stop() isn't exclusive to hotplug. That's why
I have explicitly referred to CPU hotplug in the comment as well as the
changelog.

But I totally agree that code-wise this is not the best way to do it since
this affects (although harmlessly) usecases other than hotplug as well.

Do you have any other suggestions?

> Furthermore, you yourself have worked on trying to remove stop machine
> from hotplug, that too should've been a big hint this is the wrong place
> for this.
> 

I know :-( But for lack of any other solution (other than getting rid of
stop-machine from the hotplug path), I went with this rather poor fix, as
a stop-gap arrangement until we implement the long-term solution
(stop-machine-free CPU hotplug).

While we are on the topic of stop-machine removal, I'll try to revive that
patchset again and hope that this time it will get more reviews and probably
get a chance to go upstream. That should solve a _lot_ of problems with
hotplug!

Regards,
Srivatsa S. Bhat



* Re: [PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-20 10:09                     ` Srivatsa S. Bhat
@ 2014-05-20 10:25                       ` Peter Zijlstra
  2014-05-20 10:31                         ` Srivatsa S. Bhat
  0 siblings, 1 reply; 26+ messages in thread
From: Peter Zijlstra @ 2014-05-20 10:25 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: Oleg Nesterov, Tejun Heo, akpm, fweisbec, paulmck, tglx, mingo,
	rusty, hch, mgorman, riel, bp, rostedt, mgalbraith, ego, rjw,
	linux-kernel

On Tue, May 20, 2014 at 03:39:59PM +0530, Srivatsa S. Bhat wrote:
> > The multi_cpu_stop() path isn't exclusive to hotplug, so your changelog
> > is wrong or the patch is.
> > 
> 
> Yes, I know that multi_cpu_stop() isn't exclusive to hotplug. That's why
> I have explicitly referred to CPU hotplug in the comment as well as the
> changelog.
> 
> But I totally agree that code-wise this is not the best way to do it since
> this affects (although harmlessly) usecases other than hotplug as well.
> 
> Do you have any other suggestions?

How about making a kernel/smp.c hotplug notifier and stuffing it in the
CPU_DYING list? That's typically after we've already torn down the
interrupts for that cpu, so no chance of any new ones coming in.

Or is that too late?
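
[Editorial note: a rough, illustrative sketch of the notifier Peter is
suggesting, written against the 2014-era cpu notifier API; the function and
variable names here are hypothetical, not taken from any actual patch:]

```c
/*
 * Illustrative sketch only: a CPU_DYING notifier in kernel/smp.c that
 * flushes the outgoing CPU's call_single_queue. CPU_DYING notifiers run
 * on the dying CPU itself, with interrupts disabled, under stop-machine,
 * so no new IPIs can be queued while this runs.
 */
static int smp_call_cpu_callback(struct notifier_block *nfb,
				 unsigned long action, void *hcpu)
{
	switch (action & ~CPU_TASKS_FROZEN) {
	case CPU_DYING:
		/* Flush callbacks queued before interrupt teardown */
		generic_smp_call_function_single_interrupt();
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block smp_call_cpu_notifier = {
	.notifier_call = smp_call_cpu_callback,
};
```

[One caveat with this naive form: CPU_DYING runs after the CPU has been
marked offline, so calling the regular interrupt handler here would trip
its "IPI to offline CPU" warning, which is the point Srivatsa raises in
his follow-up.]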


* Re: [PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-20 10:25                       ` Peter Zijlstra
@ 2014-05-20 10:31                         ` Srivatsa S. Bhat
  2014-05-20 10:38                           ` Srivatsa S. Bhat
  0 siblings, 1 reply; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-20 10:31 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, Tejun Heo, akpm, fweisbec, paulmck, tglx, mingo,
	rusty, hch, mgorman, riel, bp, rostedt, mgalbraith, ego, rjw,
	linux-kernel

On 05/20/2014 03:55 PM, Peter Zijlstra wrote:
> On Tue, May 20, 2014 at 03:39:59PM +0530, Srivatsa S. Bhat wrote:
>>> The multi_cpu_stop() path isn't exclusive to hotplug, so your changelog
>>> is wrong or the patch is.
>>>
>>
>> Yes, I know that multi_cpu_stop() isn't exclusive to hotplug. That's why
>> I have explicitly referred to CPU hotplug in the comment as well as the
>> changelog.
>>
>> But I totally agree that code-wise this is not the best way to do it since
>> this affects (although harmlessly) usecases other than hotplug as well.
>>
>> Do you have any other suggestions?
> 
> How about making a kernel/smp.c hotplug notifier and stuffing it in the
> CPU_DYING list? That's typically after we've already torn down the
> interrupts for that cpu, so no chance of any new ones coming in.
> 
> Or is that too late?
> 

No, that should work just fine. Thank you for the suggestion! I'll give
it a shot.

Regards,
Srivatsa S. Bhat



* Re: [PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-20 10:31                         ` Srivatsa S. Bhat
@ 2014-05-20 10:38                           ` Srivatsa S. Bhat
  2014-05-20 10:42                             ` Srivatsa S. Bhat
  0 siblings, 1 reply; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-20 10:38 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, Tejun Heo, akpm, fweisbec, paulmck, tglx, mingo,
	rusty, hch, mgorman, riel, bp, rostedt, mgalbraith, ego, rjw,
	linux-kernel

On 05/20/2014 04:01 PM, Srivatsa S. Bhat wrote:
> On 05/20/2014 03:55 PM, Peter Zijlstra wrote:
>> On Tue, May 20, 2014 at 03:39:59PM +0530, Srivatsa S. Bhat wrote:
>>>> The multi_cpu_stop() path isn't exclusive to hotplug, so your changelog
>>>> is wrong or the patch is.
>>>>
>>>
>>> Yes, I know that multi_cpu_stop() isn't exclusive to hotplug. That's why
>>> I have explicitly referred to CPU hotplug in the comment as well as the
>>> changelog.
>>>
>>> But I totally agree that code-wise this is not the best way to do it since
>>> this affects (although harmlessly) usecases other than hotplug as well.
>>>
>>> Do you have any other suggestions?
>>
>> How about making a kernel/smp.c hotplug notifier and stuffing it in the
>> CPU_DYING list? That's typically after we've already torn down the
>> interrupts for that cpu, so no chance of any new ones coming in.
>>
>> Or is that too late?
>>
> 
> No, that should work just fine. Thank you for the suggestion! I'll give
> it a shot.
> 

The only problem will be that CPU_DYING notifiers are run after marking
the CPU offline, and hence the warning will trigger. We can avoid that by
defining a __generic_smp_call_function_single_interrupt() that doesn't
check for cpu_online(smp_processor_id()) and calling that variant from the
hotplug notifier.
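
[Editorial note: a sketch of the split being described, purely illustrative;
the actual refactoring would move the body of the existing handler into the
underscored variant rather than duplicate it:]

```c
/*
 * Illustrative sketch only: an internal variant that can optionally skip
 * the "IPI to offline CPU" warning, so a CPU_DYING notifier can flush the
 * queue even though the dying CPU is already marked offline at that point.
 */
static void __generic_smp_call_function_single_interrupt(bool warn_offline)
{
	struct llist_node *entry;
	struct call_single_data *csd, *csd_next;

	WARN_ON(warn_offline && !cpu_online(smp_processor_id()));

	/* Drain this CPU's queue and run entries in submission order */
	entry = llist_del_all(&__get_cpu_var(call_single_queue));
	entry = llist_reverse_order(entry);

	llist_for_each_entry_safe(csd, csd_next, entry, llist) {
		csd->func(csd->info);
		csd_unlock(csd);
	}
}

/* Regular IPI entry point: keep warning about offline CPUs */
void generic_smp_call_function_single_interrupt(void)
{
	__generic_smp_call_function_single_interrupt(true);
}
```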


Regards,
Srivatsa S. Bhat



* Re: [PATCH v5 UPDATEDv3 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline
  2014-05-20 10:38                           ` Srivatsa S. Bhat
@ 2014-05-20 10:42                             ` Srivatsa S. Bhat
  0 siblings, 0 replies; 26+ messages in thread
From: Srivatsa S. Bhat @ 2014-05-20 10:42 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Oleg Nesterov, Tejun Heo, akpm, fweisbec, paulmck, tglx, mingo,
	rusty, hch, mgorman, riel, bp, rostedt, mgalbraith, ego, rjw,
	linux-kernel

On 05/20/2014 04:08 PM, Srivatsa S. Bhat wrote:
> On 05/20/2014 04:01 PM, Srivatsa S. Bhat wrote:
>> On 05/20/2014 03:55 PM, Peter Zijlstra wrote:
>>> On Tue, May 20, 2014 at 03:39:59PM +0530, Srivatsa S. Bhat wrote:
>>>>> The multi_cpu_stop() path isn't exclusive to hotplug, so your changelog
>>>>> is wrong or the patch is.
>>>>>
>>>>
>>>> Yes, I know that multi_cpu_stop() isn't exclusive to hotplug. That's why
>>>> I have explicitly referred to CPU hotplug in the comment as well as the
>>>> changelog.
>>>>
>>>> But I totally agree that code-wise this is not the best way to do it since
>>>> this affects (although harmlessly) usecases other than hotplug as well.
>>>>
>>>> Do you have any other suggestions?
>>>
>>> How about making a kernel/smp.c hotplug notifier and stuffing it in the
>>> CPU_DYING list? That's typically after we've already torn down the
>>> interrupts for that cpu, so no chance of any new ones coming in.
>>>
>>> Or is that too late?
>>>
>>
>> No, that should work just fine. Thank you for the suggestion! I'll give
>> it a shot.
>>
> 
> The only problem will be that CPU_DYING notifiers are run after marking
> the CPU offline, and hence the warning will trigger. We can avoid that by
> defining a __generic_smp_call_function_single_interrupt() that doesn't
> check for cpu_online(smp_processor_id) and call this function from the
> hotplug notifier.
> 
> 

(And that's probably something to fix while cleaning up hotplug later:
Why should we mark the CPU offline _before_ running CPU_DYING? It makes
more sense to mark it offline _after_ running CPU_DYING notifiers. I'll
audit that path as well and see what I can find).
 
Regards,
Srivatsa S. Bhat



end of thread

Thread overview: 26+ messages
2014-05-15 19:12 [PATCH v5 0/3] CPU hotplug: Fix the long-standing "IPI to offline CPU" issue Srivatsa S. Bhat
2014-05-15 19:13 ` [PATCH v5 1/3] smp: Print more useful debug info upon receiving IPI on an offline CPU Srivatsa S. Bhat
2014-05-15 19:19   ` Joe Perches
2014-05-15 19:34     ` Srivatsa S. Bhat
2014-05-15 19:43       ` Joe Perches
2014-05-15 19:51         ` [PATCH v5 UPDATED " Srivatsa S. Bhat
2014-05-15 19:13 ` [PATCH v5 2/3] CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU" Srivatsa S. Bhat
2014-05-15 19:17   ` Tejun Heo
2014-05-15 19:18     ` Srivatsa S. Bhat
2014-05-15 19:14 ` [PATCH v5 3/3] CPU hotplug, smp: Flush any pending IPI callbacks before CPU offline Srivatsa S. Bhat
2014-05-15 19:19   ` Tejun Heo
2014-05-15 19:26     ` Srivatsa S. Bhat
2014-05-15 19:36       ` Paul E. McKenney
2014-05-15 19:43         ` [PATCH v5 UPDATED " Srivatsa S. Bhat
2014-05-15 19:45           ` Tejun Heo
2014-05-19 12:19           ` [PATCH v5 UPDATEDv2 " Srivatsa S. Bhat
2014-05-19 16:18             ` Oleg Nesterov
2014-05-19 19:49               ` Srivatsa S. Bhat
2014-05-19 20:22                 ` [PATCH v5 UPDATEDv3 " Srivatsa S. Bhat
2014-05-20  9:42                   ` Peter Zijlstra
2014-05-20 10:09                     ` Srivatsa S. Bhat
2014-05-20 10:25                       ` Peter Zijlstra
2014-05-20 10:31                         ` Srivatsa S. Bhat
2014-05-20 10:38                           ` Srivatsa S. Bhat
2014-05-20 10:42                             ` Srivatsa S. Bhat
2014-05-15 19:44         ` [PATCH v5 " Tejun Heo
