* [PATCH tip/core/rcu 0/4] Add smp_call_function() debugging for v5.10
@ 2020-08-31 18:13 Paul E. McKenney
2020-08-31 18:14 ` [PATCH tip/core/rcu 1/4] lib: Add backtrace_idle parameter to force backtrace of idle CPUs paulmck
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Paul E. McKenney @ 2020-08-31 18:13 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
edumazet, fweisbec, oleg, joel
Hello!
This series adds debugging code for stalled smp_call_function() handlers:
1. Add backtrace_idle parameter to force backtrace of idle CPUs.
2. Add source and destination CPUs to __call_single_data.
3. kernel/smp: Provide CSD lock timeout diagnostics.
4. Make symbol 'csd_bug_count' static.
Thanx, Paul
------------------------------------------------------------------------
Documentation/admin-guide/kernel-parameters.txt | 4
include/linux/smp.h | 3
include/linux/smp_types.h | 3
kernel/smp.c | 136 +++++++++++++++++++++++-
lib/Kconfig.debug | 11 +
lib/nmi_backtrace.c | 6 -
6 files changed, 161 insertions(+), 2 deletions(-)
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH tip/core/rcu 1/4] lib: Add backtrace_idle parameter to force backtrace of idle CPUs
2020-08-31 18:13 [PATCH tip/core/rcu 0/4] Add smp_call_function() debugging for v5.10 Paul E. McKenney
@ 2020-08-31 18:14 ` paulmck
2020-08-31 18:14 ` [PATCH tip/core/rcu 2/4] smp: Add source and destination CPUs to __call_single_data paulmck
` (2 subsequent siblings)
3 siblings, 0 replies; 10+ messages in thread
From: paulmck @ 2020-08-31 18:14 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
edumazet, fweisbec, oleg, joel, Paul E. McKenney,
Jonathan Corbet, Greg Kroah-Hartman, linux-doc
From: "Paul E. McKenney" <paulmck@kernel.org>
Currently, the nmi_cpu_backtrace() declines to produce backtraces for
idle CPUs. This is a good choice in the common case in which problems are
caused only by non-idle CPUs. However, there are occasionally situations
in which idle CPUs are helping to cause problems. This commit therefore
adds an nmi_backtrace.backtrace_idle kernel boot parameter that causes
nmi_cpu_backtrace() to dump stacks even of idle CPUs.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <linux-doc@vger.kernel.org>
---
Documentation/admin-guide/kernel-parameters.txt | 4 ++++
lib/nmi_backtrace.c | 6 +++++-
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index bdc1f33..5e6d191 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3073,6 +3073,10 @@
and gids from such clients. This is intended to ease
migration from NFSv2/v3.
+ nmi_backtrace.backtrace_idle [KNL]
+ Dump stacks even of idle CPUs in response to an
+ NMI stack-backtrace request.
+
nmi_debug= [KNL,SH] Specify one or more actions to take
when a NMI is triggered.
Format: [state][,regs][,debounce][,die]
diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c
index 15ca78e..8abe187 100644
--- a/lib/nmi_backtrace.c
+++ b/lib/nmi_backtrace.c
@@ -85,12 +85,16 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask,
put_cpu();
}
+// Dump stacks even for idle CPUs.
+static bool backtrace_idle;
+module_param(backtrace_idle, bool, 0644);
+
bool nmi_cpu_backtrace(struct pt_regs *regs)
{
int cpu = smp_processor_id();
if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
- if (regs && cpu_in_idle(instruction_pointer(regs))) {
+ if (!READ_ONCE(backtrace_idle) && regs && cpu_in_idle(instruction_pointer(regs))) {
pr_warn("NMI backtrace for cpu %d skipped: idling at %pS\n",
cpu, (void *)instruction_pointer(regs));
} else {
--
2.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH tip/core/rcu 2/4] smp: Add source and destination CPUs to __call_single_data
2020-08-31 18:13 [PATCH tip/core/rcu 0/4] Add smp_call_function() debugging for v5.10 Paul E. McKenney
2020-08-31 18:14 ` [PATCH tip/core/rcu 1/4] lib: Add backtrace_idle parameter to force backtrace of idle CPUs paulmck
@ 2020-08-31 18:14 ` paulmck
2020-09-04 2:16 ` Boqun Feng
2020-08-31 18:14 ` [PATCH tip/core/rcu 3/4] kernel/smp: Provide CSD lock timeout diagnostics paulmck
2020-08-31 18:14 ` [PATCH tip/core/rcu 4/4] smp: Make symbol 'csd_bug_count' static paulmck
3 siblings, 1 reply; 10+ messages in thread
From: paulmck @ 2020-08-31 18:14 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
edumazet, fweisbec, oleg, joel, Paul E. McKenney,
Sebastian Andrzej Siewior, Frederic Weisbecker
From: "Paul E. McKenney" <paulmck@kernel.org>
This commit adds a destination CPU to __call_single_data, and is inspired
by an earlier commit by Peter Zijlstra. This version adds #ifdef to
permit use by 32-bit systems and supplying the destination CPU for all
smp_call_function*() requests, not just smp_call_function_single().
If need be, 32-bit systems could be accommodated by shrinking the flags
field to 16 bits (the atomic_t variant is currently unused) and by
providing only eight bits for CPU on such systems.
It is not clear that the addition of the fields to __call_single_node
are really needed.
Link: https://lore.kernel.org/lkml/20200615164048.GC2531@hirez.programming.kicks-ass.net/
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
include/linux/smp.h | 3 +++
include/linux/smp_types.h | 3 +++
kernel/smp.c | 2 ++
3 files changed, 8 insertions(+)
diff --git a/include/linux/smp.h b/include/linux/smp.h
index 80d557e..9f13966 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -26,6 +26,9 @@ struct __call_single_data {
struct {
struct llist_node llist;
unsigned int flags;
+#ifdef CONFIG_64BIT
+ u16 src, dst;
+#endif
};
};
smp_call_func_t func;
diff --git a/include/linux/smp_types.h b/include/linux/smp_types.h
index 364b3ae..2e8461a 100644
--- a/include/linux/smp_types.h
+++ b/include/linux/smp_types.h
@@ -61,6 +61,9 @@ struct __call_single_node {
unsigned int u_flags;
atomic_t a_flags;
};
+#ifdef CONFIG_64BIT
+ u16 src, dst;
+#endif
};
#endif /* __LINUX_SMP_TYPES_H */
diff --git a/kernel/smp.c b/kernel/smp.c
index d0ae8eb..a47382d 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -375,6 +375,7 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
csd->func = func;
csd->info = info;
+ csd->dst = cpu;
err = generic_exec_single(cpu, csd);
@@ -540,6 +541,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
csd->flags |= CSD_TYPE_SYNC;
csd->func = func;
csd->info = info;
+ csd->dst = cpu;
if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
__cpumask_set_cpu(cpu, cfd->cpumask_ipi);
}
--
2.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH tip/core/rcu 3/4] kernel/smp: Provide CSD lock timeout diagnostics
2020-08-31 18:13 [PATCH tip/core/rcu 0/4] Add smp_call_function() debugging for v5.10 Paul E. McKenney
2020-08-31 18:14 ` [PATCH tip/core/rcu 1/4] lib: Add backtrace_idle parameter to force backtrace of idle CPUs paulmck
2020-08-31 18:14 ` [PATCH tip/core/rcu 2/4] smp: Add source and destination CPUs to __call_single_data paulmck
@ 2020-08-31 18:14 ` paulmck
2020-09-04 3:19 ` Boqun Feng
2020-09-04 3:31 ` Boqun Feng
2020-08-31 18:14 ` [PATCH tip/core/rcu 4/4] smp: Make symbol 'csd_bug_count' static paulmck
3 siblings, 2 replies; 10+ messages in thread
From: paulmck @ 2020-08-31 18:14 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
edumazet, fweisbec, oleg, joel, Paul E. McKenney,
Sebastian Andrzej Siewior
From: "Paul E. McKenney" <paulmck@kernel.org>
This commit causes csd_lock_wait() to emit diagnostics when a CPU
fails to respond quickly enough to one of the smp_call_function()
family of function calls. These diagnostics are enabled by a new
CSD_LOCK_WAIT_DEBUG Kconfig option that depends on DEBUG_KERNEL.
This commit was inspired by an earlier patch by Josef Bacik.
[ paulmck: Fix for syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com ]
[ paulmck: Fix KASAN use-after-free issue reported by Qian Cai. ]
[ paulmck: Fix botched nr_cpu_ids comparison per Dan Carpenter. ]
[ paulmck: Apply Peter Zijlstra feedback. ]
Link: https://lore.kernel.org/lkml/00000000000042f21905a991ecea@google.com
Link: https://lore.kernel.org/lkml/0000000000002ef21705a9933cf3@google.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
---
kernel/smp.c | 132 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
lib/Kconfig.debug | 11 +++++
2 files changed, 143 insertions(+)
diff --git a/kernel/smp.c b/kernel/smp.c
index a47382d..c5d3188 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -20,6 +20,9 @@
#include <linux/sched.h>
#include <linux/sched/idle.h>
#include <linux/hypervisor.h>
+#include <linux/sched/clock.h>
+#include <linux/nmi.h>
+#include <linux/sched/debug.h>
#include "smpboot.h"
#include "sched/smp.h"
@@ -96,6 +99,103 @@ void __init call_function_init(void)
smpcfd_prepare_cpu(smp_processor_id());
}
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+
+static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
+static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
+static DEFINE_PER_CPU(void *, cur_csd_info);
+
+#define CSD_LOCK_TIMEOUT (5ULL * NSEC_PER_SEC)
+atomic_t csd_bug_count = ATOMIC_INIT(0);
+
+/* Record current CSD work for current CPU, NULL to erase. */
+static void csd_lock_record(call_single_data_t *csd)
+{
+ if (!csd) {
+ smp_mb(); /* NULL cur_csd after unlock. */
+ __this_cpu_write(cur_csd, NULL);
+ return;
+ }
+ __this_cpu_write(cur_csd_func, csd->func);
+ __this_cpu_write(cur_csd_info, csd->info);
+ smp_wmb(); /* func and info before csd. */
+ __this_cpu_write(cur_csd, csd);
+ smp_mb(); /* Update cur_csd before function call. */
+ /* Or before unlock, as the case may be. */
+}
+
+static __always_inline int csd_lock_wait_getcpu(call_single_data_t *csd)
+{
+ unsigned int csd_type;
+
+ csd_type = CSD_TYPE(csd);
+ if (csd_type == CSD_TYPE_ASYNC || csd_type == CSD_TYPE_SYNC)
+ return csd->dst; /* Other CSD_TYPE_ values might not have ->dst. */
+ return -1;
+}
+
+/*
+ * Complain if too much time spent waiting. Note that only
+ * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
+ * so waiting on other types gets much less information.
+ */
+static __always_inline bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
+{
+ int cpu = -1;
+ int cpux;
+ bool firsttime;
+ u64 ts2, ts_delta;
+ call_single_data_t *cpu_cur_csd;
+ unsigned int flags = READ_ONCE(csd->flags);
+
+ if (!(flags & CSD_FLAG_LOCK)) {
+ if (!unlikely(*bug_id))
+ return true;
+ cpu = csd_lock_wait_getcpu(csd);
+ pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
+ *bug_id, raw_smp_processor_id(), cpu);
+ return true;
+ }
+
+ ts2 = sched_clock();
+ ts_delta = ts2 - *ts1;
+ if (likely(ts_delta <= CSD_LOCK_TIMEOUT))
+ return false;
+
+ firsttime = !*bug_id;
+ if (firsttime)
+ *bug_id = atomic_inc_return(&csd_bug_count);
+ cpu = csd_lock_wait_getcpu(csd);
+ if (WARN_ONCE(cpu < 0 || cpu >= nr_cpu_ids, "%s: cpu = %d\n", __func__, cpu))
+ cpux = 0;
+ else
+ cpux = cpu;
+ cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */
+ pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n",
+ firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts2 - ts0,
+ cpu, csd->func, csd->info);
+ if (cpu_cur_csd && csd != cpu_cur_csd) {
+ pr_alert("\tcsd: CSD lock (#%d) handling prior %pS(%ps) request.\n",
+ *bug_id, READ_ONCE(per_cpu(cur_csd_func, cpux)),
+ READ_ONCE(per_cpu(cur_csd_info, cpux)));
+ } else {
+ pr_alert("\tcsd: CSD lock (#%d) %s.\n",
+ *bug_id, !cpu_cur_csd ? "unresponsive" : "handling this request");
+ }
+ if (cpu >= 0) {
+ if (!trigger_single_cpu_backtrace(cpu))
+ dump_cpu_task(cpu);
+ if (!cpu_cur_csd) {
+ pr_alert("csd: Re-sending CSD lock (#%d) IPI from CPU#%02d to CPU#%02d\n", *bug_id, raw_smp_processor_id(), cpu);
+ arch_send_call_function_single_ipi(cpu);
+ }
+ }
+ dump_stack();
+ *ts1 = ts2;
+
+ return false;
+}
+
/*
* csd_lock/csd_unlock used to serialize access to per-cpu csd resources
*
@@ -105,8 +205,28 @@ void __init call_function_init(void)
*/
static __always_inline void csd_lock_wait(call_single_data_t *csd)
{
+ int bug_id = 0;
+ u64 ts0, ts1;
+
+ ts1 = ts0 = sched_clock();
+ for (;;) {
+ if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id))
+ break;
+ cpu_relax();
+ }
+ smp_acquire__after_ctrl_dep();
+}
+
+#else
+static void csd_lock_record(call_single_data_t *csd)
+{
+}
+
+static __always_inline void csd_lock_wait(call_single_data_t *csd)
+{
smp_cond_load_acquire(&csd->flags, !(VAL & CSD_FLAG_LOCK));
}
+#endif
static __always_inline void csd_lock(call_single_data_t *csd)
{
@@ -166,9 +286,11 @@ static int generic_exec_single(int cpu, call_single_data_t *csd)
* We can unlock early even for the synchronous on-stack case,
* since we're doing this from the same CPU..
*/
+ csd_lock_record(csd);
csd_unlock(csd);
local_irq_save(flags);
func(info);
+ csd_lock_record(NULL);
local_irq_restore(flags);
return 0;
}
@@ -268,8 +390,10 @@ static void flush_smp_call_function_queue(bool warn_cpu_offline)
entry = &csd_next->llist;
}
+ csd_lock_record(csd);
func(info);
csd_unlock(csd);
+ csd_lock_record(NULL);
} else {
prev = &csd->llist;
}
@@ -296,8 +420,10 @@ static void flush_smp_call_function_queue(bool warn_cpu_offline)
smp_call_func_t func = csd->func;
void *info = csd->info;
+ csd_lock_record(csd);
csd_unlock(csd);
func(info);
+ csd_lock_record(NULL);
} else if (type == CSD_TYPE_IRQ_WORK) {
irq_work_single(csd);
}
@@ -375,7 +501,10 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
csd->func = func;
csd->info = info;
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+ csd->src = smp_processor_id();
csd->dst = cpu;
+#endif
err = generic_exec_single(cpu, csd);
@@ -541,7 +670,10 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
csd->flags |= CSD_TYPE_SYNC;
csd->func = func;
csd->info = info;
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+ csd->src = smp_processor_id();
csd->dst = cpu;
+#endif
if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
__cpumask_set_cpu(cpu, cfd->cpumask_ipi);
}
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e068c3c..86a35fd 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1367,6 +1367,17 @@ config WW_MUTEX_SELFTEST
Say M if you want these self tests to build as a module.
Say N if you are unsure.
+config CSD_LOCK_WAIT_DEBUG
+ bool "Debugging for csd_lock_wait(), called from smp_call_function*()"
+ depends on DEBUG_KERNEL
+ depends on 64BIT
+ default n
+ help
+ This option enables debug prints when CPUs are slow to respond
+ to the smp_call_function*() IPI wrappers. These debug prints
+ include the IPI handler function currently executing (if any)
+ and relevant stack traces.
+
endmenu # lock debugging
config TRACE_IRQFLAGS
--
2.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH tip/core/rcu 4/4] smp: Make symbol 'csd_bug_count' static
2020-08-31 18:13 [PATCH tip/core/rcu 0/4] Add smp_call_function() debugging for v5.10 Paul E. McKenney
` (2 preceding siblings ...)
2020-08-31 18:14 ` [PATCH tip/core/rcu 3/4] kernel/smp: Provide CSD lock timeout diagnostics paulmck
@ 2020-08-31 18:14 ` paulmck
3 siblings, 0 replies; 10+ messages in thread
From: paulmck @ 2020-08-31 18:14 UTC (permalink / raw)
To: rcu
Cc: linux-kernel, kernel-team, mingo, jiangshanlai, dipankar, akpm,
mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
edumazet, fweisbec, oleg, joel, Wei Yongjun, Paul E . McKenney,
Sebastian Andrzej Siewior
From: Wei Yongjun <weiyongjun1@huawei.com>
The sparse tool complains as follows:
kernel/smp.c:107:10: warning:
symbol 'csd_bug_count' was not declared. Should it be static?
Because variable is not used outside of smp.c, this commit marks it
static.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/smp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/smp.c b/kernel/smp.c
index c5d3188..b25383d 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -106,7 +106,7 @@ static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
static DEFINE_PER_CPU(void *, cur_csd_info);
#define CSD_LOCK_TIMEOUT (5ULL * NSEC_PER_SEC)
-atomic_t csd_bug_count = ATOMIC_INIT(0);
+static atomic_t csd_bug_count = ATOMIC_INIT(0);
/* Record current CSD work for current CPU, NULL to erase. */
static void csd_lock_record(call_single_data_t *csd)
--
2.9.5
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH tip/core/rcu 2/4] smp: Add source and destination CPUs to __call_single_data
2020-08-31 18:14 ` [PATCH tip/core/rcu 2/4] smp: Add source and destination CPUs to __call_single_data paulmck
@ 2020-09-04 2:16 ` Boqun Feng
2020-09-04 2:42 ` Paul E. McKenney
0 siblings, 1 reply; 10+ messages in thread
From: Boqun Feng @ 2020-09-04 2:16 UTC (permalink / raw)
To: paulmck
Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar,
akpm, mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
edumazet, fweisbec, oleg, joel, Sebastian Andrzej Siewior,
Frederic Weisbecker
Hi Paul,
On Mon, Aug 31, 2020 at 11:14:15AM -0700, paulmck@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@kernel.org>
>
> This commit adds a destination CPU to __call_single_data, and is inspired
> by an earlier commit by Peter Zijlstra. This version adds #ifdef to
> permit use by 32-bit systems and supplying the destination CPU for all
> smp_call_function*() requests, not just smp_call_function_single().
>
> If need be, 32-bit systems could be accommodated by shrinking the flags
> field to 16 bits (the atomic_t variant is currently unused) and by
> providing only eight bits for CPU on such systems.
>
> It is not clear that the addition of the fields to __call_single_node
> are really needed.
>
> Link: https://lore.kernel.org/lkml/20200615164048.GC2531@hirez.programming.kicks-ass.net/
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> ---
> include/linux/smp.h | 3 +++
> include/linux/smp_types.h | 3 +++
> kernel/smp.c | 2 ++
> 3 files changed, 8 insertions(+)
>
> diff --git a/include/linux/smp.h b/include/linux/smp.h
> index 80d557e..9f13966 100644
> --- a/include/linux/smp.h
> +++ b/include/linux/smp.h
> @@ -26,6 +26,9 @@ struct __call_single_data {
> struct {
> struct llist_node llist;
> unsigned int flags;
> +#ifdef CONFIG_64BIT
> + u16 src, dst;
> +#endif
> };
> };
> smp_call_func_t func;
> diff --git a/include/linux/smp_types.h b/include/linux/smp_types.h
> index 364b3ae..2e8461a 100644
> --- a/include/linux/smp_types.h
> +++ b/include/linux/smp_types.h
> @@ -61,6 +61,9 @@ struct __call_single_node {
> unsigned int u_flags;
> atomic_t a_flags;
> };
> +#ifdef CONFIG_64BIT
> + u16 src, dst;
> +#endif
> };
>
> #endif /* __LINUX_SMP_TYPES_H */
> diff --git a/kernel/smp.c b/kernel/smp.c
> index d0ae8eb..a47382d 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -375,6 +375,7 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>
> csd->func = func;
> csd->info = info;
> + csd->dst = cpu;
Would this and the next modification cause compile errors with
CONFIG_64BIT = n? I saw you add #ifdef guard in the next patch, so maybe
move those two into next patch (of course, if they trigger compile
errors)
Regards,
Boqun
>
> err = generic_exec_single(cpu, csd);
>
> @@ -540,6 +541,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
> csd->flags |= CSD_TYPE_SYNC;
> csd->func = func;
> csd->info = info;
> + csd->dst = cpu;
> if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
> __cpumask_set_cpu(cpu, cfd->cpumask_ipi);
> }
> --
> 2.9.5
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH tip/core/rcu 2/4] smp: Add source and destination CPUs to __call_single_data
2020-09-04 2:16 ` Boqun Feng
@ 2020-09-04 2:42 ` Paul E. McKenney
0 siblings, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2020-09-04 2:42 UTC (permalink / raw)
To: Boqun Feng
Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar,
akpm, mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
edumazet, fweisbec, oleg, joel, Sebastian Andrzej Siewior,
Frederic Weisbecker
On Fri, Sep 04, 2020 at 10:16:10AM +0800, Boqun Feng wrote:
> Hi Paul,
>
> On Mon, Aug 31, 2020 at 11:14:15AM -0700, paulmck@kernel.org wrote:
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> >
> > This commit adds a destination CPU to __call_single_data, and is inspired
> > by an earlier commit by Peter Zijlstra. This version adds #ifdef to
> > permit use by 32-bit systems and supplying the destination CPU for all
> > smp_call_function*() requests, not just smp_call_function_single().
> >
> > If need be, 32-bit systems could be accommodated by shrinking the flags
> > field to 16 bits (the atomic_t variant is currently unused) and by
> > providing only eight bits for CPU on such systems.
> >
> > It is not clear that the addition of the fields to __call_single_node
> > are really needed.
> >
> > Link: https://lore.kernel.org/lkml/20200615164048.GC2531@hirez.programming.kicks-ass.net/
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > Cc: Frederic Weisbecker <frederic@kernel.org>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > ---
> > include/linux/smp.h | 3 +++
> > include/linux/smp_types.h | 3 +++
> > kernel/smp.c | 2 ++
> > 3 files changed, 8 insertions(+)
> >
> > diff --git a/include/linux/smp.h b/include/linux/smp.h
> > index 80d557e..9f13966 100644
> > --- a/include/linux/smp.h
> > +++ b/include/linux/smp.h
> > @@ -26,6 +26,9 @@ struct __call_single_data {
> > struct {
> > struct llist_node llist;
> > unsigned int flags;
> > +#ifdef CONFIG_64BIT
> > + u16 src, dst;
> > +#endif
> > };
> > };
> > smp_call_func_t func;
> > diff --git a/include/linux/smp_types.h b/include/linux/smp_types.h
> > index 364b3ae..2e8461a 100644
> > --- a/include/linux/smp_types.h
> > +++ b/include/linux/smp_types.h
> > @@ -61,6 +61,9 @@ struct __call_single_node {
> > unsigned int u_flags;
> > atomic_t a_flags;
> > };
> > +#ifdef CONFIG_64BIT
> > + u16 src, dst;
> > +#endif
> > };
> >
> > #endif /* __LINUX_SMP_TYPES_H */
> > diff --git a/kernel/smp.c b/kernel/smp.c
> > index d0ae8eb..a47382d 100644
> > --- a/kernel/smp.c
> > +++ b/kernel/smp.c
> > @@ -375,6 +375,7 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
> >
> > csd->func = func;
> > csd->info = info;
> > + csd->dst = cpu;
>
> Would this and the next modification cause compile errors with
> CONFIG_64BIT = n? I saw you add #ifdef guard in the next patch, so maybe
> move those two into next patch (of course, if they trigger compile
> errors)
Good point, will fix, thank you!
Thanx, Paul
> Regards,
> Boqun
>
> >
> > err = generic_exec_single(cpu, csd);
> >
> > @@ -540,6 +541,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
> > csd->flags |= CSD_TYPE_SYNC;
> > csd->func = func;
> > csd->info = info;
> > + csd->dst = cpu;
> > if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
> > __cpumask_set_cpu(cpu, cfd->cpumask_ipi);
> > }
> > --
> > 2.9.5
> >
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH tip/core/rcu 3/4] kernel/smp: Provide CSD lock timeout diagnostics
2020-08-31 18:14 ` [PATCH tip/core/rcu 3/4] kernel/smp: Provide CSD lock timeout diagnostics paulmck
@ 2020-09-04 3:19 ` Boqun Feng
2020-09-04 18:09 ` Paul E. McKenney
2020-09-04 3:31 ` Boqun Feng
1 sibling, 1 reply; 10+ messages in thread
From: Boqun Feng @ 2020-09-04 3:19 UTC (permalink / raw)
To: paulmck
Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar,
akpm, mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
edumazet, fweisbec, oleg, joel, Sebastian Andrzej Siewior
On Mon, Aug 31, 2020 at 11:14:16AM -0700, paulmck@kernel.org wrote:
> From: "Paul E. McKenney" <paulmck@kernel.org>
>
> This commit causes csd_lock_wait() to emit diagnostics when a CPU
> fails to respond quickly enough to one of the smp_call_function()
> family of function calls. These diagnostics are enabled by a new
> CSD_LOCK_WAIT_DEBUG Kconfig option that depends on DEBUG_KERNEL.
>
> This commit was inspired by an earlier patch by Josef Bacik.
>
> [ paulmck: Fix for syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com ]
> [ paulmck: Fix KASAN use-after-free issue reported by Qian Cai. ]
> [ paulmck: Fix botched nr_cpu_ids comparison per Dan Carpenter. ]
> [ paulmck: Apply Peter Zijlstra feedback. ]
> Link: https://lore.kernel.org/lkml/00000000000042f21905a991ecea@google.com
> Link: https://lore.kernel.org/lkml/0000000000002ef21705a9933cf3@google.com
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> ---
> kernel/smp.c | 132 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> lib/Kconfig.debug | 11 +++++
> 2 files changed, 143 insertions(+)
>
> diff --git a/kernel/smp.c b/kernel/smp.c
> index a47382d..c5d3188 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -20,6 +20,9 @@
> #include <linux/sched.h>
> #include <linux/sched/idle.h>
> #include <linux/hypervisor.h>
> +#include <linux/sched/clock.h>
> +#include <linux/nmi.h>
> +#include <linux/sched/debug.h>
>
> #include "smpboot.h"
> #include "sched/smp.h"
> @@ -96,6 +99,103 @@ void __init call_function_init(void)
> smpcfd_prepare_cpu(smp_processor_id());
> }
>
> +#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
> +
> +static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
> +static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
> +static DEFINE_PER_CPU(void *, cur_csd_info);
> +
> +#define CSD_LOCK_TIMEOUT (5ULL * NSEC_PER_SEC)
> +atomic_t csd_bug_count = ATOMIC_INIT(0);
> +
> +/* Record current CSD work for current CPU, NULL to erase. */
> +static void csd_lock_record(call_single_data_t *csd)
> +{
> + if (!csd) {
> + smp_mb(); /* NULL cur_csd after unlock. */
> + __this_cpu_write(cur_csd, NULL);
> + return;
> + }
> + __this_cpu_write(cur_csd_func, csd->func);
> + __this_cpu_write(cur_csd_info, csd->info);
> + smp_wmb(); /* func and info before csd. */
> + __this_cpu_write(cur_csd, csd);
> + smp_mb(); /* Update cur_csd before function call. */
> + /* Or before unlock, as the case may be. */
> +}
> +
> +static __always_inline int csd_lock_wait_getcpu(call_single_data_t *csd)
> +{
> + unsigned int csd_type;
> +
> + csd_type = CSD_TYPE(csd);
> + if (csd_type == CSD_TYPE_ASYNC || csd_type == CSD_TYPE_SYNC)
> + return csd->dst; /* Other CSD_TYPE_ values might not have ->dst. */
> + return -1;
> +}
> +
> +/*
> + * Complain if too much time spent waiting. Note that only
> + * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
> + * so waiting on other types gets much less information.
> + */
> +static __always_inline bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
> +{
> + int cpu = -1;
> + int cpux;
> + bool firsttime;
> + u64 ts2, ts_delta;
> + call_single_data_t *cpu_cur_csd;
> + unsigned int flags = READ_ONCE(csd->flags);
> +
> + if (!(flags & CSD_FLAG_LOCK)) {
> + if (!unlikely(*bug_id))
> + return true;
> + cpu = csd_lock_wait_getcpu(csd);
> + pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
> + *bug_id, raw_smp_processor_id(), cpu);
> + return true;
> + }
> +
> + ts2 = sched_clock();
> + ts_delta = ts2 - *ts1;
> + if (likely(ts_delta <= CSD_LOCK_TIMEOUT))
> + return false;
> +
> + firsttime = !*bug_id;
> + if (firsttime)
> + *bug_id = atomic_inc_return(&csd_bug_count);
> + cpu = csd_lock_wait_getcpu(csd);
> + if (WARN_ONCE(cpu < 0 || cpu >= nr_cpu_ids, "%s: cpu = %d\n", __func__, cpu))
> + cpux = 0;
> + else
> + cpux = cpu;
> + cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */
Should we re-read csd->flags and check CSD_FLAG_LOCK again to improve
the accuracy or to avoid unnecessary stack dumping? Because cpux may
just finish the csd in question and clear the CSD_FLAG_LOCK bit, so it's
late but no so late ;-) Maybe we can put the following check:
> + pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n",
> + firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts2 - ts0,
> + cpu, csd->func, csd->info);
/*
* Re-read and check the flag.
*
* If destination cpu just handled the csd in question, we want
* to report the timeout but may not want to do stack dumping on
* either src or dst cpu, because in that case the stacks will
* be pretty much pointless. Since csd_lock_record(NULL) will
* issue a smp_mb() before setting cur_csd, we are guaranteed
* to observe the unlocked state of ->flags if the previous
* smp_load_acquire() observe the csd_lock_record(NULL) of
* handling the csd in question as follow:
*
* CPU 0 CPU 1
* <queue csd>
* <handle other csds> csd_lock_wait_toolong()
* flags = READ_ONCE(csd->flags); // locked
* <handle the csd of CPU 1>
* csd_lock_record(csd);
* func(info);
* csd_unlock(csd); // unlocked
* csd_lock_record(NULL);
* <ts_delat > CSD_LOCK_TIMEOUT> // going to report the timeout
* cpu_cur_csd = smp_load_acquire(...);
* <handle other csds>
* csd_lock_record(other_csd);
* flags = READ_ONCE(csd->flags); // observe the unlocked state
* // no matter cpu_cur_csd is NULL or other_csd
*
*/
flags = READ_ONCE(csd->flags);
if (!(flags & CSD_FLAG_LOCK)) {
pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
*bug_id, raw_smp_processor_id(), cpux);
return true;
}
Thoughts?
Regards,
Boqun
> + if (cpu_cur_csd && csd != cpu_cur_csd) {
> + pr_alert("\tcsd: CSD lock (#%d) handling prior %pS(%ps) request.\n",
> + *bug_id, READ_ONCE(per_cpu(cur_csd_func, cpux)),
> + READ_ONCE(per_cpu(cur_csd_info, cpux)));
> + } else {
> + pr_alert("\tcsd: CSD lock (#%d) %s.\n",
> + *bug_id, !cpu_cur_csd ? "unresponsive" : "handling this request");
> + }
> + if (cpu >= 0) {
> + if (!trigger_single_cpu_backtrace(cpu))
> + dump_cpu_task(cpu);
> + if (!cpu_cur_csd) {
> + pr_alert("csd: Re-sending CSD lock (#%d) IPI from CPU#%02d to CPU#%02d\n", *bug_id, raw_smp_processor_id(), cpu);
> + arch_send_call_function_single_ipi(cpu);
> + }
> + }
> + dump_stack();
> + *ts1 = ts2;
> +
> + return false;
> +}
> +
> /*
> * csd_lock/csd_unlock used to serialize access to per-cpu csd resources
> *
> @@ -105,8 +205,28 @@ void __init call_function_init(void)
> */
> static __always_inline void csd_lock_wait(call_single_data_t *csd)
> {
> + int bug_id = 0;
> + u64 ts0, ts1;
> +
> + ts1 = ts0 = sched_clock();
> + for (;;) {
> + if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id))
> + break;
> + cpu_relax();
> + }
> + smp_acquire__after_ctrl_dep();
> +}
> +
> +#else
> +static void csd_lock_record(call_single_data_t *csd)
> +{
> +}
> +
> +static __always_inline void csd_lock_wait(call_single_data_t *csd)
> +{
> smp_cond_load_acquire(&csd->flags, !(VAL & CSD_FLAG_LOCK));
> }
> +#endif
>
> static __always_inline void csd_lock(call_single_data_t *csd)
> {
> @@ -166,9 +286,11 @@ static int generic_exec_single(int cpu, call_single_data_t *csd)
> * We can unlock early even for the synchronous on-stack case,
> * since we're doing this from the same CPU..
> */
> + csd_lock_record(csd);
> csd_unlock(csd);
> local_irq_save(flags);
> func(info);
> + csd_lock_record(NULL);
> local_irq_restore(flags);
> return 0;
> }
> @@ -268,8 +390,10 @@ static void flush_smp_call_function_queue(bool warn_cpu_offline)
> entry = &csd_next->llist;
> }
>
> + csd_lock_record(csd);
> func(info);
> csd_unlock(csd);
> + csd_lock_record(NULL);
> } else {
> prev = &csd->llist;
> }
> @@ -296,8 +420,10 @@ static void flush_smp_call_function_queue(bool warn_cpu_offline)
> smp_call_func_t func = csd->func;
> void *info = csd->info;
>
> + csd_lock_record(csd);
> csd_unlock(csd);
> func(info);
> + csd_lock_record(NULL);
> } else if (type == CSD_TYPE_IRQ_WORK) {
> irq_work_single(csd);
> }
> @@ -375,7 +501,10 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>
> csd->func = func;
> csd->info = info;
> +#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
> + csd->src = smp_processor_id();
> csd->dst = cpu;
> +#endif
>
> err = generic_exec_single(cpu, csd);
>
> @@ -541,7 +670,10 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
> csd->flags |= CSD_TYPE_SYNC;
> csd->func = func;
> csd->info = info;
> +#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
> + csd->src = smp_processor_id();
> csd->dst = cpu;
> +#endif
> if (llist_add(&csd->llist, &per_cpu(call_single_queue, cpu)))
> __cpumask_set_cpu(cpu, cfd->cpumask_ipi);
> }
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index e068c3c..86a35fd 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1367,6 +1367,17 @@ config WW_MUTEX_SELFTEST
> Say M if you want these self tests to build as a module.
> Say N if you are unsure.
>
> +config CSD_LOCK_WAIT_DEBUG
> + bool "Debugging for csd_lock_wait(), called from smp_call_function*()"
> + depends on DEBUG_KERNEL
> + depends on 64BIT
> + default n
> + help
> + This option enables debug prints when CPUs are slow to respond
> + to the smp_call_function*() IPI wrappers. These debug prints
> + include the IPI handler function currently executing (if any)
> + and relevant stack traces.
> +
> endmenu # lock debugging
>
> config TRACE_IRQFLAGS
> --
> 2.9.5
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH tip/core/rcu 3/4] kernel/smp: Provide CSD lock timeout diagnostics
2020-08-31 18:14 ` [PATCH tip/core/rcu 3/4] kernel/smp: Provide CSD lock timeout diagnostics paulmck
2020-09-04 3:19 ` Boqun Feng
@ 2020-09-04 3:31 ` Boqun Feng
1 sibling, 0 replies; 10+ messages in thread
From: Boqun Feng @ 2020-09-04 3:31 UTC (permalink / raw)
To: paulmck
Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar,
akpm, mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
edumazet, fweisbec, oleg, joel, Sebastian Andrzej Siewior
On Mon, Aug 31, 2020 at 11:14:16AM -0700, paulmck@kernel.org wrote:
[...]
> +static __always_inline bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
> +{
> + int cpu = -1;
> + int cpux;
> + bool firsttime;
> + u64 ts2, ts_delta;
> + call_single_data_t *cpu_cur_csd;
> + unsigned int flags = READ_ONCE(csd->flags);
> +
> + if (!(flags & CSD_FLAG_LOCK)) {
> + if (!unlikely(*bug_id))
> + return true;
> + cpu = csd_lock_wait_getcpu(csd);
> + pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
> + *bug_id, raw_smp_processor_id(), cpu);
> + return true;
> + }
> +
> + ts2 = sched_clock();
> + ts_delta = ts2 - *ts1;
> + if (likely(ts_delta <= CSD_LOCK_TIMEOUT))
> + return false;
> +
> + firsttime = !*bug_id;
> + if (firsttime)
> + *bug_id = atomic_inc_return(&csd_bug_count);
> + cpu = csd_lock_wait_getcpu(csd);
> + if (WARN_ONCE(cpu < 0 || cpu >= nr_cpu_ids, "%s: cpu = %d\n", __func__, cpu))
> + cpux = 0;
> + else
> + cpux = cpu;
> + cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */
> + pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n",
> + firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts2 - ts0,
> + cpu, csd->func, csd->info);
> + if (cpu_cur_csd && csd != cpu_cur_csd) {
> + pr_alert("\tcsd: CSD lock (#%d) handling prior %pS(%ps) request.\n",
> + *bug_id, READ_ONCE(per_cpu(cur_csd_func, cpux)),
> + READ_ONCE(per_cpu(cur_csd_info, cpux)));
> + } else {
> + pr_alert("\tcsd: CSD lock (#%d) %s.\n",
> + *bug_id, !cpu_cur_csd ? "unresponsive" : "handling this request");
> + }
> + if (cpu >= 0) {
> + if (!trigger_single_cpu_backtrace(cpu))
> + dump_cpu_task(cpu);
> + if (!cpu_cur_csd) {
> + pr_alert("csd: Re-sending CSD lock (#%d) IPI from CPU#%02d to CPU#%02d\n", *bug_id, raw_smp_processor_id(), cpu);
> + arch_send_call_function_single_ipi(cpu);
> + }
> + }
> + dump_stack();
> + *ts1 = ts2;
> +
> + return false;
> +}
> +
> /*
> * csd_lock/csd_unlock used to serialize access to per-cpu csd resources
> *
> @@ -105,8 +205,28 @@ void __init call_function_init(void)
> */
> static __always_inline void csd_lock_wait(call_single_data_t *csd)
> {
> + int bug_id = 0;
> + u64 ts0, ts1;
> +
> + ts1 = ts0 = sched_clock();
> + for (;;) {
> + if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id))
> + break;
> + cpu_relax();
> + }
> + smp_acquire__after_ctrl_dep();
It's a little difficult here to figure out what operation we want to add
ACQUIRE semantics. So maybe a few lines of comments?
/*
* Add the ACQUIRE semantics for the read of csd->flags in
* csd_lock_wait_toolong().
*/
Regards,
Boqun
> +}
> +
> +#else
> +static void csd_lock_record(call_single_data_t *csd)
> +{
> +}
> +
> +static __always_inline void csd_lock_wait(call_single_data_t *csd)
> +{
> smp_cond_load_acquire(&csd->flags, !(VAL & CSD_FLAG_LOCK));
> }
> +#endif
>
[...]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH tip/core/rcu 3/4] kernel/smp: Provide CSD lock timeout diagnostics
2020-09-04 3:19 ` Boqun Feng
@ 2020-09-04 18:09 ` Paul E. McKenney
0 siblings, 0 replies; 10+ messages in thread
From: Paul E. McKenney @ 2020-09-04 18:09 UTC (permalink / raw)
To: Boqun Feng
Cc: rcu, linux-kernel, kernel-team, mingo, jiangshanlai, dipankar,
akpm, mathieu.desnoyers, josh, tglx, peterz, rostedt, dhowells,
edumazet, fweisbec, oleg, joel, Sebastian Andrzej Siewior
On Fri, Sep 04, 2020 at 11:19:53AM +0800, Boqun Feng wrote:
> On Mon, Aug 31, 2020 at 11:14:16AM -0700, paulmck@kernel.org wrote:
> > From: "Paul E. McKenney" <paulmck@kernel.org>
> >
> > This commit causes csd_lock_wait() to emit diagnostics when a CPU
> > fails to respond quickly enough to one of the smp_call_function()
> > family of function calls. These diagnostics are enabled by a new
> > CSD_LOCK_WAIT_DEBUG Kconfig option that depends on DEBUG_KERNEL.
> >
> > This commit was inspired by an earlier patch by Josef Bacik.
> >
> > [ paulmck: Fix for syzbot+0f719294463916a3fc0e@syzkaller.appspotmail.com ]
> > [ paulmck: Fix KASAN use-after-free issue reported by Qian Cai. ]
> > [ paulmck: Fix botched nr_cpu_ids comparison per Dan Carpenter. ]
> > [ paulmck: Apply Peter Zijlstra feedback. ]
> > Link: https://lore.kernel.org/lkml/00000000000042f21905a991ecea@google.com
> > Link: https://lore.kernel.org/lkml/0000000000002ef21705a9933cf3@google.com
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Thomas Gleixner <tglx@linutronix.de>
> > Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
> > ---
> > kernel/smp.c | 132 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > lib/Kconfig.debug | 11 +++++
> > 2 files changed, 143 insertions(+)
> >
> > diff --git a/kernel/smp.c b/kernel/smp.c
> > index a47382d..c5d3188 100644
> > --- a/kernel/smp.c
> > +++ b/kernel/smp.c
> > @@ -20,6 +20,9 @@
> > #include <linux/sched.h>
> > #include <linux/sched/idle.h>
> > #include <linux/hypervisor.h>
> > +#include <linux/sched/clock.h>
> > +#include <linux/nmi.h>
> > +#include <linux/sched/debug.h>
> >
> > #include "smpboot.h"
> > #include "sched/smp.h"
> > @@ -96,6 +99,103 @@ void __init call_function_init(void)
> > smpcfd_prepare_cpu(smp_processor_id());
> > }
> >
> > +#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
> > +
> > +static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
> > +static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
> > +static DEFINE_PER_CPU(void *, cur_csd_info);
> > +
> > +#define CSD_LOCK_TIMEOUT (5ULL * NSEC_PER_SEC)
> > +atomic_t csd_bug_count = ATOMIC_INIT(0);
> > +
> > +/* Record current CSD work for current CPU, NULL to erase. */
> > +static void csd_lock_record(call_single_data_t *csd)
> > +{
> > + if (!csd) {
> > + smp_mb(); /* NULL cur_csd after unlock. */
> > + __this_cpu_write(cur_csd, NULL);
> > + return;
> > + }
> > + __this_cpu_write(cur_csd_func, csd->func);
> > + __this_cpu_write(cur_csd_info, csd->info);
> > + smp_wmb(); /* func and info before csd. */
> > + __this_cpu_write(cur_csd, csd);
> > + smp_mb(); /* Update cur_csd before function call. */
> > + /* Or before unlock, as the case may be. */
> > +}
> > +
> > +static __always_inline int csd_lock_wait_getcpu(call_single_data_t *csd)
> > +{
> > + unsigned int csd_type;
> > +
> > + csd_type = CSD_TYPE(csd);
> > + if (csd_type == CSD_TYPE_ASYNC || csd_type == CSD_TYPE_SYNC)
> > + return csd->dst; /* Other CSD_TYPE_ values might not have ->dst. */
> > + return -1;
> > +}
> > +
> > +/*
> > + * Complain if too much time spent waiting. Note that only
> > + * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
> > + * so waiting on other types gets much less information.
> > + */
> > +static __always_inline bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
> > +{
> > + int cpu = -1;
> > + int cpux;
> > + bool firsttime;
> > + u64 ts2, ts_delta;
> > + call_single_data_t *cpu_cur_csd;
> > + unsigned int flags = READ_ONCE(csd->flags);
> > +
> > + if (!(flags & CSD_FLAG_LOCK)) {
> > + if (!unlikely(*bug_id))
> > + return true;
> > + cpu = csd_lock_wait_getcpu(csd);
> > + pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
> > + *bug_id, raw_smp_processor_id(), cpu);
> > + return true;
> > + }
> > +
> > + ts2 = sched_clock();
> > + ts_delta = ts2 - *ts1;
> > + if (likely(ts_delta <= CSD_LOCK_TIMEOUT))
> > + return false;
> > +
> > + firsttime = !*bug_id;
> > + if (firsttime)
> > + *bug_id = atomic_inc_return(&csd_bug_count);
> > + cpu = csd_lock_wait_getcpu(csd);
> > + if (WARN_ONCE(cpu < 0 || cpu >= nr_cpu_ids, "%s: cpu = %d\n", __func__, cpu))
> > + cpux = 0;
> > + else
> > + cpux = cpu;
> > + cpu_cur_csd = smp_load_acquire(&per_cpu(cur_csd, cpux)); /* Before func and info. */
>
> Should we re-read csd->flags and check CSD_FLAG_LOCK again to improve
> the accuracy or to avoid unnecessary stack dumping? Because cpux may
> just finish the csd in question and clear the CSD_FLAG_LOCK bit, so it's
> late but no so late ;-) Maybe we can put the following check:
>
> > + pr_alert("csd: %s non-responsive CSD lock (#%d) on CPU#%d, waiting %llu ns for CPU#%02d %pS(%ps).\n",
> > + firsttime ? "Detected" : "Continued", *bug_id, raw_smp_processor_id(), ts2 - ts0,
> > + cpu, csd->func, csd->info);
>
> /*
> * Re-read and check the flag.
> *
> * If destination cpu just handled the csd in question, we want
> * to report the timeout but may not want to do stack dumping on
> * either src or dst cpu, because in that case the stacks will
> * be pretty much pointless. Since csd_lock_record(NULL) will
> * issue a smp_mb() before setting cur_csd, we are guaranteed
> * to observe the unlocked state of ->flags if the previous
> * smp_load_acquire() observe the csd_lock_record(NULL) of
> * handling the csd in question as follow:
> *
> * CPU 0 CPU 1
> * <queue csd>
> * <handle other csds> csd_lock_wait_toolong()
> * flags = READ_ONCE(csd->flags); // locked
> * <handle the csd of CPU 1>
> * csd_lock_record(csd);
> * func(info);
> * csd_unlock(csd); // unlocked
> * csd_lock_record(NULL);
> * <ts_delat > CSD_LOCK_TIMEOUT> // going to report the timeout
> * cpu_cur_csd = smp_load_acquire(...);
> * <handle other csds>
> * csd_lock_record(other_csd);
> * flags = READ_ONCE(csd->flags); // observe the unlocked state
> * // no matter cpu_cur_csd is NULL or other_csd
> *
> */
> flags = READ_ONCE(csd->flags);
> if (!(flags & CSD_FLAG_LOCK)) {
> pr_alert("csd: CSD lock (#%d) got unstuck on CPU#%02d, CPU#%02d released the lock.\n",
> *bug_id, raw_smp_processor_id(), cpux);
> return true;
> }
>
> Thoughts?
it might eventually be necessary to do this, but we do need to see
a real problem before layering on more code. After all, the episode
could just as easily end right after the re-read. And thus far, aside
from scftorture, the episodes in the wild were of the type that never
does end. Plus the stack dump will take quite a bit more time than the
interval between the initial read and your proposed re-read. So again,
we need to see a real problem before doing something like this.
Thanx, Paul
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2020-09-04 18:10 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-31 18:13 [PATCH tip/core/rcu 0/4] Add smp_call_function() debugging for v5.10 Paul E. McKenney
2020-08-31 18:14 ` [PATCH tip/core/rcu 1/4] lib: Add backtrace_idle parameter to force backtrace of idle CPUs paulmck
2020-08-31 18:14 ` [PATCH tip/core/rcu 2/4] smp: Add source and destination CPUs to __call_single_data paulmck
2020-09-04 2:16 ` Boqun Feng
2020-09-04 2:42 ` Paul E. McKenney
2020-08-31 18:14 ` [PATCH tip/core/rcu 3/4] kernel/smp: Provide CSD lock timeout diagnostics paulmck
2020-09-04 3:19 ` Boqun Feng
2020-09-04 18:09 ` Paul E. McKenney
2020-09-04 3:31 ` Boqun Feng
2020-08-31 18:14 ` [PATCH tip/core/rcu 4/4] smp: Make symbol 'csd_bug_count' static paulmck
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).