* [PATCH 0/3] kernel/smp.c: add more CSD lock debugging
@ 2021-02-26 11:25 Juergen Gross
  2021-02-26 11:25 ` [PATCH 1/3] kernel/smp: add boot parameter for controlling " Juergen Gross
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Juergen Gross @ 2021-02-26 11:25 UTC
  To: linux-kernel, linux-doc; +Cc: paulmck, mhocko, Juergen Gross, Jonathan Corbet

This patch series was created to help catch a rather long-standing
problem with smp_call_function_any() and friends.

Very rarely, a remote cpu seems not to execute a queued function, and
the cpu that queued the function request then waits forever for the
CSD lock to be released by the remote cpu.

This problem has been observed primarily when running as a guest on
top of KVM or Xen, but there are reports of the same pattern on bare
metal, too. It seems to have existed for about two years now, and
there is not much data available.

What is known so far is that resending an IPI to the remote cpu
resolves the hang.

The patches add more debug data to be printed in a hang situation on
a kernel built with CONFIG_CSD_LOCK_WAIT_DEBUG. Additionally, the
debug code can be controlled via a new parameter in order to make it
easier to use such a kernel in a production environment without too
much negative performance impact. By default the debugging additions
are switched off; they can be activated via the new boot parameter:

csdlock_debug=1 switches on the basic debugging and the IPI resend.
csdlock_debug=ext additionally prints extended data in a hang
  situation, but this option has a larger impact on performance.

I hope that the "ext" setting will help find the root cause of the
problem.
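
As an illustration, a test system could activate the extended
debugging by appending the parameter to the kernel command line
(everything in this line besides csdlock_debug=ext is made up for the
example and depends on the distribution and bootloader):

    linux /boot/vmlinuz root=/dev/vda1 ro csdlock_debug=ext

Without the parameter both static keys stay disabled, so the debug
code is reduced to patched-out branches on the hot paths.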

Juergen Gross (3):
  kernel/smp: add boot parameter for controlling CSD lock debugging
  kernel/smp: prepare more CSD lock debugging
  kernel/smp: add more data to CSD lock debugging

 .../admin-guide/kernel-parameters.txt         |  10 +
 kernel/smp.c                                  | 193 +++++++++++++++++-
 2 files changed, 192 insertions(+), 11 deletions(-)

-- 
2.26.2



* [PATCH 1/3] kernel/smp: add boot parameter for controlling CSD lock debugging
  2021-02-26 11:25 [PATCH 0/3] kernel/smp.c: add more CSD lock debugging Juergen Gross
@ 2021-02-26 11:25 ` Juergen Gross
  2021-02-26 11:25 ` [PATCH 2/3] kernel/smp: prepare more " Juergen Gross
  2021-02-26 11:25 ` [PATCH 3/3] kernel/smp: add more data to " Juergen Gross
  2 siblings, 0 replies; 11+ messages in thread
From: Juergen Gross @ 2021-02-26 11:25 UTC
  To: linux-kernel, linux-doc; +Cc: paulmck, mhocko, Juergen Gross, Jonathan Corbet

Currently, CSD lock debugging can be switched on and off only via a
kernel config option. Unfortunately there is at least one problem with
CSD lock handling that has been pending for about two years now and
has been seen in different environments (mostly when running
virtualized under KVM or Xen, at least once on bare metal). Multiple
attempts to catch this issue have finally led to the introduction of
CSD lock debug code, but this code is not in use in most distros as it
has some impact on performance.

In order to be able to ship kernels with CONFIG_CSD_LOCK_WAIT_DEBUG
enabled even for production use, add a boot parameter for switching
the debug functionality on. This reduces the performance impact of the
debug code to a bare minimum when it is not in use.
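
The switch is implemented below with the kernel's jump-label (static
key) facility, so the disabled case costs only a patched-out branch.
The general pattern, as a minimal sketch with made-up names (the real
code uses csdlock_debug_enabled and wraps csd_lock_record() and
csd_lock_wait()):

    #include <linux/init.h>
    #include <linux/jump_label.h>

    /* Key defaults to false, so the branch compiles to a NOP. */
    static DEFINE_STATIC_KEY_FALSE(my_debug_enabled);

    static int __init my_debug_setup(char *str)
    {
            /* Flip the key at boot, before SMP bringup. */
            static_branch_enable(&my_debug_enabled);
            return 0;
    }
    early_param("my_debug", my_debug_setup);

    static void my_debug_slow_path(void)
    {
            /* Expensive checks live here. */
    }

    static inline void hot_path_hook(void)
    {
            /* A NOP when disabled, a jump to the slow path when enabled. */
            if (static_branch_unlikely(&my_debug_enabled))
                    my_debug_slow_path();
    }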

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 .../admin-guide/kernel-parameters.txt         |  6 +++
 kernel/smp.c                                  | 38 +++++++++++++++++--
 2 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index bab6a8b01202..af9749b866c2 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -784,6 +784,12 @@
 	cs89x0_media=	[HW,NET]
 			Format: { rj45 | aui | bnc }
 
+	csdlock_debug=	[KNL] Enable debug add-ons for cross-cpu function call
+			handling. When switched on, additional debug data is
+			printed to the console in case a hanging cpu is
+			detected, and that cpu is pinged again in order to try
+			to resolve the hang situation.
+
 	dasd=		[HW,NET]
 			See header of drivers/s390/block/dasd_devmap.c.
 
diff --git a/kernel/smp.c b/kernel/smp.c
index aeb0adfa0606..d5f0b21ab55e 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -24,6 +24,7 @@
 #include <linux/sched/clock.h>
 #include <linux/nmi.h>
 #include <linux/sched/debug.h>
+#include <linux/jump_label.h>
 
 #include "smpboot.h"
 #include "sched/smp.h"
@@ -102,6 +103,20 @@ void __init call_function_init(void)
 
 #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
 
+static DEFINE_STATIC_KEY_FALSE(csdlock_debug_enabled);
+
+static int __init csdlock_debug(char *str)
+{
+	unsigned int val = 0;
+
+	get_option(&str, &val);
+	if (val)
+		static_branch_enable(&csdlock_debug_enabled);
+
+	return 0;
+}
+early_param("csdlock_debug", csdlock_debug);
+
 static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
 static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
 static DEFINE_PER_CPU(void *, cur_csd_info);
@@ -110,7 +125,7 @@ static DEFINE_PER_CPU(void *, cur_csd_info);
 static atomic_t csd_bug_count = ATOMIC_INIT(0);
 
 /* Record current CSD work for current CPU, NULL to erase. */
-static void csd_lock_record(call_single_data_t *csd)
+static void __csd_lock_record(call_single_data_t *csd)
 {
 	if (!csd) {
 		smp_mb(); /* NULL cur_csd after unlock. */
@@ -125,7 +140,13 @@ static void csd_lock_record(call_single_data_t *csd)
 		  /* Or before unlock, as the case may be. */
 }
 
-static __always_inline int csd_lock_wait_getcpu(call_single_data_t *csd)
+static __always_inline void csd_lock_record(call_single_data_t *csd)
+{
+	if (static_branch_unlikely(&csdlock_debug_enabled))
+		__csd_lock_record(csd);
+}
+
+static int csd_lock_wait_getcpu(call_single_data_t *csd)
 {
 	unsigned int csd_type;
 
@@ -140,7 +161,7 @@ static __always_inline int csd_lock_wait_getcpu(call_single_data_t *csd)
  * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
  * so waiting on other types gets much less information.
  */
-static __always_inline bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
+static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, int *bug_id)
 {
 	int cpu = -1;
 	int cpux;
@@ -204,7 +225,7 @@ static __always_inline bool csd_lock_wait_toolong(call_single_data_t *csd, u64 t
  * previous function call. For multi-cpu calls its even more interesting
  * as we'll have to ensure no other cpu is observing our csd.
  */
-static __always_inline void csd_lock_wait(call_single_data_t *csd)
+static void __csd_lock_wait(call_single_data_t *csd)
 {
 	int bug_id = 0;
 	u64 ts0, ts1;
@@ -218,6 +239,15 @@ static __always_inline void csd_lock_wait(call_single_data_t *csd)
 	smp_acquire__after_ctrl_dep();
 }
 
+static __always_inline void csd_lock_wait(call_single_data_t *csd)
+{
+	if (static_branch_unlikely(&csdlock_debug_enabled)) {
+		__csd_lock_wait(csd);
+		return;
+	}
+
+	smp_cond_load_acquire(&csd->node.u_flags, !(VAL & CSD_FLAG_LOCK));
+}
 #else
 static void csd_lock_record(call_single_data_t *csd)
 {
-- 
2.26.2



* [PATCH 2/3] kernel/smp: prepare more CSD lock debugging
  2021-02-26 11:25 [PATCH 0/3] kernel/smp.c: add more CSD lock debugging Juergen Gross
  2021-02-26 11:25 ` [PATCH 1/3] kernel/smp: add boot parameter for controlling " Juergen Gross
@ 2021-02-26 11:25 ` Juergen Gross
  2021-02-26 11:25 ` [PATCH 3/3] kernel/smp: add more data to " Juergen Gross
  2 siblings, 0 replies; 11+ messages in thread
From: Juergen Gross @ 2021-02-26 11:25 UTC
  To: linux-kernel; +Cc: paulmck, mhocko, Juergen Gross

In order to be able to easily add more CSD lock debugging data to
the csd member of struct call_function_data, move the
call_single_data_t element into a sub-structure.
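
The result, reduced to its essentials (a sketch of the layout after
this patch; the placeholder comment marks where the next patch will
add its fields):

    struct cfd_percpu {
            call_single_data_t      csd;
            /* patch 3 adds per-destination sequence counters here */
    };

    struct call_function_data {
            struct cfd_percpu       __percpu *pcpu;
            cpumask_var_t           cpumask;
            cpumask_var_t           cpumask_ipi;
    };

Accesses change from per_cpu_ptr(cfd->csd, cpu) to
&per_cpu_ptr(cfd->pcpu, cpu)->csd, with no functional change.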

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 kernel/smp.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index d5f0b21ab55e..6d7e6dbe33dc 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -31,8 +31,12 @@
 
 #define CSD_TYPE(_csd)	((_csd)->node.u_flags & CSD_FLAG_TYPE_MASK)
 
+struct cfd_percpu {
+	call_single_data_t	csd;
+};
+
 struct call_function_data {
-	call_single_data_t	__percpu *csd;
+	struct cfd_percpu	__percpu *pcpu;
 	cpumask_var_t		cpumask;
 	cpumask_var_t		cpumask_ipi;
 };
@@ -55,8 +59,8 @@ int smpcfd_prepare_cpu(unsigned int cpu)
 		free_cpumask_var(cfd->cpumask);
 		return -ENOMEM;
 	}
-	cfd->csd = alloc_percpu(call_single_data_t);
-	if (!cfd->csd) {
+	cfd->pcpu = alloc_percpu(struct cfd_percpu);
+	if (!cfd->pcpu) {
 		free_cpumask_var(cfd->cpumask);
 		free_cpumask_var(cfd->cpumask_ipi);
 		return -ENOMEM;
@@ -71,7 +75,7 @@ int smpcfd_dead_cpu(unsigned int cpu)
 
 	free_cpumask_var(cfd->cpumask);
 	free_cpumask_var(cfd->cpumask_ipi);
-	free_percpu(cfd->csd);
+	free_percpu(cfd->pcpu);
 	return 0;
 }
 
@@ -694,7 +698,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 
 	cpumask_clear(cfd->cpumask_ipi);
 	for_each_cpu(cpu, cfd->cpumask) {
-		call_single_data_t *csd = per_cpu_ptr(cfd->csd, cpu);
+		call_single_data_t *csd = &per_cpu_ptr(cfd->pcpu, cpu)->csd;
 
 		if (cond_func && !cond_func(cpu, info))
 			continue;
@@ -719,7 +723,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 		for_each_cpu(cpu, cfd->cpumask) {
 			call_single_data_t *csd;
 
-			csd = per_cpu_ptr(cfd->csd, cpu);
+			csd = &per_cpu_ptr(cfd->pcpu, cpu)->csd;
 			csd_lock_wait(csd);
 		}
 	}
-- 
2.26.2



* [PATCH 3/3] kernel/smp: add more data to CSD lock debugging
  2021-02-26 11:25 [PATCH 0/3] kernel/smp.c: add more CSD lock debugging Juergen Gross
  2021-02-26 11:25 ` [PATCH 1/3] kernel/smp: add boot parameter for controlling " Juergen Gross
  2021-02-26 11:25 ` [PATCH 2/3] kernel/smp: prepare more " Juergen Gross
@ 2021-02-26 11:25 ` Juergen Gross
  2021-02-26 16:38   ` Peter Zijlstra
  2 siblings, 1 reply; 11+ messages in thread
From: Juergen Gross @ 2021-02-26 11:25 UTC
  To: linux-kernel, linux-doc; +Cc: paulmck, mhocko, Juergen Gross, Jonathan Corbet

In order to help identify problems with IPI handling and remote
function execution, add some more data to the IPI debugging code.

There have been multiple reports of cpus looping for long times (many
seconds) in smp_call_function_many(), waiting for another cpu to
execute a function such as a TLB flush. Most of these reports were
for cases where the kernel was running as a guest on top of KVM or
Xen (there are rumours of that happening under VMware, too, and even
on bare metal).

The root cause hasn't been found yet, even after more than two years
of different developers chasing this bug.

Commit 35feb60474bf4f7 ("kernel/smp: Provide CSD lock timeout
diagnostics") tried to address this by adding some debug code and by
issuing another IPI when a hang was detected. This helped mitigate
the problem (the repeated IPI unblocks the hang), but the root cause
is still unknown.

Currently available data suggests that either an IPI wasn't sent when
it should have been, or the IPI didn't result in the target cpu
executing the queued function (due to the IPI not reaching the cpu,
the IPI handler not being called, or the handler not seeing the
queued request).

Add more diagnostic data by introducing a global atomic counter which
is incremented at critical operations (before and after queueing a
new request, when sending an IPI, and when dequeueing a request). The
counter value is stored in percpu variables which can be printed out
when a hang is detected.
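
Each snapshot packs the source cpu, destination cpu, event type and a
28-bit running count into one u64 (see union cfd_seq_cnt below). As a
sketch of how a value printed in the hang report could be decoded in
userspace, assuming the x86-64/GCC convention of allocating bit-fields
from the least significant bit (the helper itself is hypothetical, not
part of the patch):

    #include <stdint.h>
    #include <stdio.h>

    /* Field layout per union cfd_seq_cnt: src:16, dst:16, type:4, cnt:28. */
    static void cfd_seq_decode(uint64_t val)
    {
            unsigned int src  = val & 0xffff;             /* 0xffff: no cpu (CFD_SEQ_NOCPU) */
            unsigned int dst  = (val >> 16) & 0xffff;
            unsigned int type = (val >> 32) & 0xf;        /* CFD_SEQ_QUEUE ... CFD_SEQ_GOTIPI */
            unsigned int cnt  = (val >> 36) & 0xfffffff;  /* global event order */

            printf("src=%u dst=%u type=%u cnt=%u\n", src, dst, type, cnt);
    }

Sorting the decoded cnt values of all snapshots reconstructs the
global order of the queue/IPI/dequeue events leading up to the hang.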

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 .../admin-guide/kernel-parameters.txt         |   4 +
 kernel/smp.c                                  | 143 +++++++++++++++++-
 2 files changed, 144 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index af9749b866c2..89da345ec8f1 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -789,6 +789,10 @@
 			printed to the console in case a hanging cpu is
 			detected, and that cpu is pinged again in order to try
 			to resolve the hang situation.
+			0: disable csdlock debugging (default)
+			1: enable basic csdlock debugging (minor impact)
+			ext: enable extended csdlock debugging (more impact,
+			     but more data)
 
 	dasd=		[HW,NET]
 			See header of drivers/s390/block/dasd_devmap.c.
diff --git a/kernel/smp.c b/kernel/smp.c
index 6d7e6dbe33dc..0d6a6a802fd1 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -31,8 +31,44 @@
 
 #define CSD_TYPE(_csd)	((_csd)->node.u_flags & CSD_FLAG_TYPE_MASK)
 
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+union cfd_seq_cnt {
+	u64		val;
+	struct {
+		u64	src:16;
+		u64	dst:16;
+#define CFD_SEQ_NOCPU	0xffff
+		u64	type:4;
+#define CFD_SEQ_QUEUE	0
+#define CFD_SEQ_IPI	1
+#define CFD_SEQ_NOIPI	2
+#define CFD_SEQ_PING	3
+#define CFD_SEQ_PINGED	4
+#define CFD_SEQ_HANDLE	5
+#define CFD_SEQ_DEQUEUE	6
+#define CFD_SEQ_IDLE	7
+#define CFD_SEQ_GOTIPI	8
+		u64	cnt:28;
+	}		u;
+};
+
+struct cfd_seq_local {
+	u64	ping;
+	u64	pinged;
+	u64	handle;
+	u64	dequeue;
+	u64	idle;
+	u64	gotipi;
+};
+#endif
+
 struct cfd_percpu {
 	call_single_data_t	csd;
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+	u64	seq_queue;
+	u64	seq_ipi;
+	u64	seq_noipi;
+#endif
 };
 
 struct call_function_data {
@@ -108,12 +144,18 @@ void __init call_function_init(void)
 #ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
 
 static DEFINE_STATIC_KEY_FALSE(csdlock_debug_enabled);
+static DEFINE_STATIC_KEY_FALSE(csdlock_debug_extended);
 
 static int __init csdlock_debug(char *str)
 {
 	unsigned int val = 0;
 
-	get_option(&str, &val);
+	if (str && !strcmp(str, "ext")) {
+		val = 1;
+		static_branch_enable(&csdlock_debug_extended);
+	} else
+		get_option(&str, &val);
+
 	if (val)
 		static_branch_enable(&csdlock_debug_enabled);
 
@@ -124,9 +166,33 @@ early_param("csdlock_debug", csdlock_debug);
 static DEFINE_PER_CPU(call_single_data_t *, cur_csd);
 static DEFINE_PER_CPU(smp_call_func_t, cur_csd_func);
 static DEFINE_PER_CPU(void *, cur_csd_info);
+static DEFINE_PER_CPU(struct cfd_seq_local, cfd_seq_local);
 
 #define CSD_LOCK_TIMEOUT (5ULL * NSEC_PER_SEC)
 static atomic_t csd_bug_count = ATOMIC_INIT(0);
+static u64 cfd_seq;
+
+static u64 cfd_seq_inc(unsigned int src, unsigned int dst, unsigned int type)
+{
+	union cfd_seq_cnt new, old;
+
+	new.u.src = src;
+	new.u.dst = dst;
+	new.u.type = type;
+
+	do {
+		old.val = READ_ONCE(cfd_seq);
+		new.u.cnt = old.u.cnt + 1;
+	} while (cmpxchg(&cfd_seq, old.val, new.val) != old.val);
+
+	return old.val;
+}
+
+#define cfd_seq_store(var, src, dst, type)				\
+	do {								\
+		if (static_branch_unlikely(&csdlock_debug_extended))	\
+			var = cfd_seq_inc(src, dst, type);		\
+	} while (0)
 
 /* Record current CSD work for current CPU, NULL to erase. */
 static void __csd_lock_record(call_single_data_t *csd)
@@ -160,6 +226,24 @@ static int csd_lock_wait_getcpu(call_single_data_t *csd)
 	return -1;
 }
 
+static void csd_lock_print_extended(call_single_data_t *csd, int cpu)
+{
+	struct cfd_seq_local *seq = &per_cpu(cfd_seq_local, cpu);
+	unsigned int srccpu = csd->node.src;
+	struct call_function_data *cfd = per_cpu_ptr(&cfd_data, srccpu);
+	struct cfd_percpu *pcpu = per_cpu_ptr(cfd->pcpu, cpu);
+
+	pr_alert("\tcsd: CPU#%d->CPU#%d%s: queue=%016llx, ipi=%016llx, noipi=%016llx\n",
+		 srccpu, cpu,
+		 (srccpu != raw_smp_processor_id()) ? " (might be stale)" : "",
+		 pcpu->seq_queue, pcpu->seq_ipi, pcpu->seq_noipi);
+	pr_alert("\tcsd: CSD source CPU#%d: ping=%016llx, pinged=%016llx\n", srccpu,
+		 per_cpu(cfd_seq_local.ping, srccpu),
+		 per_cpu(cfd_seq_local.pinged, srccpu));
+	pr_alert("\tcsd: CSD target CPU#%d: idle=%016llx, gotipi=%016llx, handle=%016llx, dequeue=%016llx\n",
+		 cpu, seq->idle, seq->gotipi, seq->handle, seq->dequeue);
+}
+
 /*
  * Complain if too much time spent waiting.  Note that only
  * the CSD_TYPE_SYNC/ASYNC types provide the destination CPU,
@@ -209,6 +293,8 @@ static bool csd_lock_wait_toolong(call_single_data_t *csd, u64 ts0, u64 *ts1, in
 			 *bug_id, !cpu_cur_csd ? "unresponsive" : "handling this request");
 	}
 	if (cpu >= 0) {
+		if (static_branch_unlikely(&csdlock_debug_extended))
+			csd_lock_print_extended(csd, cpu);
 		if (!trigger_single_cpu_backtrace(cpu))
 			dump_cpu_task(cpu);
 		if (!cpu_cur_csd) {
@@ -252,7 +338,27 @@ static __always_inline void csd_lock_wait(call_single_data_t *csd)
 
 	smp_cond_load_acquire(&csd->node.u_flags, !(VAL & CSD_FLAG_LOCK));
 }
+
+static void __smp_call_single_queue_debug(int cpu, struct llist_node *node)
+{
+	unsigned int this_cpu = smp_processor_id();
+	struct cfd_seq_local *seq = this_cpu_ptr(&cfd_seq_local);
+	struct call_function_data *cfd = this_cpu_ptr(&cfd_data);
+	struct cfd_percpu *pcpu = per_cpu_ptr(cfd->pcpu, cpu);
+
+	cfd_seq_store(pcpu->seq_queue, this_cpu, cpu, CFD_SEQ_QUEUE);
+	if (llist_add(node, &per_cpu(call_single_queue, cpu))) {
+		cfd_seq_store(pcpu->seq_ipi, this_cpu, cpu, CFD_SEQ_IPI);
+		cfd_seq_store(seq->ping, this_cpu, cpu, CFD_SEQ_PING);
+		send_call_function_single_ipi(cpu);
+		cfd_seq_store(seq->pinged, this_cpu, cpu, CFD_SEQ_PINGED);
+	} else {
+		cfd_seq_store(pcpu->seq_noipi, this_cpu, cpu, CFD_SEQ_NOIPI);
+	}
+}
 #else
+#define cfd_seq_store(var, src, dst, type)
+
 static void csd_lock_record(call_single_data_t *csd)
 {
 }
@@ -290,6 +396,19 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(call_single_data_t, csd_data);
 
 void __smp_call_single_queue(int cpu, struct llist_node *node)
 {
+#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
+	if (static_branch_unlikely(&csdlock_debug_extended)) {
+		unsigned int type;
+
+		type = CSD_TYPE(container_of(node, call_single_data_t,
+					     node.llist));
+		if (type == CSD_TYPE_SYNC || type == CSD_TYPE_ASYNC) {
+			__smp_call_single_queue_debug(cpu, node);
+			return;
+		}
+	}
+#endif
+
 	/*
 	 * The list addition should be visible before sending the IPI
 	 * handler locks the list to pull the entry off it because of
@@ -348,6 +467,8 @@ static int generic_exec_single(int cpu, call_single_data_t *csd)
  */
 void generic_smp_call_function_single_interrupt(void)
 {
+	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->gotipi, CFD_SEQ_NOCPU,
+		      smp_processor_id(), CFD_SEQ_GOTIPI);
 	flush_smp_call_function_queue(true);
 }
 
@@ -375,7 +496,11 @@ static void flush_smp_call_function_queue(bool warn_cpu_offline)
 	lockdep_assert_irqs_disabled();
 
 	head = this_cpu_ptr(&call_single_queue);
+	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->handle, CFD_SEQ_NOCPU,
+		      smp_processor_id(), CFD_SEQ_HANDLE);
 	entry = llist_del_all(head);
+	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->dequeue, CFD_SEQ_NOCPU,
+		      smp_processor_id(), CFD_SEQ_DEQUEUE);
 	entry = llist_reverse_order(entry);
 
 	/* There shouldn't be any pending callbacks on an offline CPU. */
@@ -482,6 +607,8 @@ void flush_smp_call_function_from_idle(void)
 	if (llist_empty(this_cpu_ptr(&call_single_queue)))
 		return;
 
+	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->idle, CFD_SEQ_NOCPU,
+		      smp_processor_id(), CFD_SEQ_IDLE);
 	local_irq_save(flags);
 	flush_smp_call_function_queue(true);
 	if (local_softirq_pending())
@@ -698,7 +825,8 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 
 	cpumask_clear(cfd->cpumask_ipi);
 	for_each_cpu(cpu, cfd->cpumask) {
-		call_single_data_t *csd = &per_cpu_ptr(cfd->pcpu, cpu)->csd;
+		struct cfd_percpu *pcpu = per_cpu_ptr(cfd->pcpu, cpu);
+		call_single_data_t *csd = &pcpu->csd;
 
 		if (cond_func && !cond_func(cpu, info))
 			continue;
@@ -712,12 +840,21 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 		csd->node.src = smp_processor_id();
 		csd->node.dst = cpu;
 #endif
-		if (llist_add(&csd->node.llist, &per_cpu(call_single_queue, cpu)))
+		cfd_seq_store(pcpu->seq_queue, this_cpu, cpu, CFD_SEQ_QUEUE);
+		if (llist_add(&csd->node.llist, &per_cpu(call_single_queue, cpu))) {
 			__cpumask_set_cpu(cpu, cfd->cpumask_ipi);
+			cfd_seq_store(pcpu->seq_ipi, this_cpu, cpu, CFD_SEQ_IPI);
+		} else {
+			cfd_seq_store(pcpu->seq_noipi, this_cpu, cpu, CFD_SEQ_NOIPI);
+		}
 	}
 
 	/* Send a message to all CPUs in the map */
+	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->ping, this_cpu,
+		      CFD_SEQ_NOCPU, CFD_SEQ_PING);
 	arch_send_call_function_ipi_mask(cfd->cpumask_ipi);
+	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->pinged, this_cpu,
+		      CFD_SEQ_NOCPU, CFD_SEQ_PINGED);
 
 	if (wait) {
 		for_each_cpu(cpu, cfd->cpumask) {
-- 
2.26.2



* Re: [PATCH 3/3] kernel/smp: add more data to CSD lock debugging
  2021-02-26 11:25 ` [PATCH 3/3] kernel/smp: add more data to " Juergen Gross
@ 2021-02-26 16:38   ` Peter Zijlstra
  2021-02-26 18:12     ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2021-02-26 16:38 UTC
  To: Juergen Gross; +Cc: linux-kernel, linux-doc, paulmck, mhocko, Jonathan Corbet


I hate all of this, but if this will finally catch the actual problem,
we can then revert all this, so sure.

Also, I think this will conflict with the patches from Nadav that I have
queued:

  https://lkml.kernel.org/r/20210220231712.2475218-1-namit@vmware.com

which I'll be pushing to tip/x86/mm once -rc1 happens.

On Fri, Feb 26, 2021 at 12:25:21PM +0100, Juergen Gross wrote:

> +static void __smp_call_single_queue_debug(int cpu, struct llist_node *node)
> +{
> +	unsigned int this_cpu = smp_processor_id();
> +	struct cfd_seq_local *seq = this_cpu_ptr(&cfd_seq_local);
> +	struct call_function_data *cfd = this_cpu_ptr(&cfd_data);
> +	struct cfd_percpu *pcpu = per_cpu_ptr(cfd->pcpu, cpu);
> +
> +	cfd_seq_store(pcpu->seq_queue, this_cpu, cpu, CFD_SEQ_QUEUE);
> +	if (llist_add(node, &per_cpu(call_single_queue, cpu))) {
> +		cfd_seq_store(pcpu->seq_ipi, this_cpu, cpu, CFD_SEQ_IPI);
> +		cfd_seq_store(seq->ping, this_cpu, cpu, CFD_SEQ_PING);
> +		send_call_function_single_ipi(cpu);
> +		cfd_seq_store(seq->pinged, this_cpu, cpu, CFD_SEQ_PINGED);
> +	} else {
> +		cfd_seq_store(pcpu->seq_noipi, this_cpu, cpu, CFD_SEQ_NOIPI);
> +	}
> +}
>  #else
> +#define cfd_seq_store(var, src, dst, type)
> +
>  static void csd_lock_record(call_single_data_t *csd)
>  {
>  }
> @@ -290,6 +396,19 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(call_single_data_t, csd_data);
>  
>  void __smp_call_single_queue(int cpu, struct llist_node *node)
>  {
> +#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
> +	if (static_branch_unlikely(&csdlock_debug_extended)) {
> +		unsigned int type;
> +
> +		type = CSD_TYPE(container_of(node, call_single_data_t,
> +					     node.llist));
> +		if (type == CSD_TYPE_SYNC || type == CSD_TYPE_ASYNC) {
> +			__smp_call_single_queue_debug(cpu, node);
> +			return;
> +		}
> +	}
> +#endif

This is a bit weird, might as well put it in generic_exec_single()
because there you still know the type matches.


> @@ -712,12 +840,21 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
>  		csd->node.src = smp_processor_id();
>  		csd->node.dst = cpu;
>  #endif
> -		if (llist_add(&csd->node.llist, &per_cpu(call_single_queue, cpu)))
> +		cfd_seq_store(pcpu->seq_queue, this_cpu, cpu, CFD_SEQ_QUEUE);
> +		if (llist_add(&csd->node.llist, &per_cpu(call_single_queue, cpu))) {
>  			__cpumask_set_cpu(cpu, cfd->cpumask_ipi);
> +			cfd_seq_store(pcpu->seq_ipi, this_cpu, cpu, CFD_SEQ_IPI);
> +		} else {
> +			cfd_seq_store(pcpu->seq_noipi, this_cpu, cpu, CFD_SEQ_NOIPI);
> +		}
>  	}
>  
>  	/* Send a message to all CPUs in the map */
> +	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->ping, this_cpu,
> +		      CFD_SEQ_NOCPU, CFD_SEQ_PING);
>  	arch_send_call_function_ipi_mask(cfd->cpumask_ipi);
> +	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->pinged, this_cpu,
> +		      CFD_SEQ_NOCPU, CFD_SEQ_PINGED);

Too bad we can't share with the single case, ah well.


* Re: [PATCH 3/3] kernel/smp: add more data to CSD lock debugging
  2021-02-26 16:38   ` Peter Zijlstra
@ 2021-02-26 18:12     ` Paul E. McKenney
  2021-02-26 21:05       ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2021-02-26 18:12 UTC
  To: Peter Zijlstra
  Cc: Juergen Gross, linux-kernel, linux-doc, mhocko, Jonathan Corbet

On Fri, Feb 26, 2021 at 05:38:44PM +0100, Peter Zijlstra wrote:
> 
> I hate all of this, but if this will finally catch the actual problem,
> we can then revert all this, so sure.

OK, I will bite...  What exactly do you hate about it?

							Thanx, Paul

> Also, I think this will conflict with the patches from Nadav that I have
> queued:
> 
>   https://lkml.kernel.org/r/20210220231712.2475218-1-namit@vmware.com
> 
> which I'll be pushing to tip/x86/mm once -rc1 happens.
> 
> On Fri, Feb 26, 2021 at 12:25:21PM +0100, Juergen Gross wrote:
> 
> > +static void __smp_call_single_queue_debug(int cpu, struct llist_node *node)
> > +{
> > +	unsigned int this_cpu = smp_processor_id();
> > +	struct cfd_seq_local *seq = this_cpu_ptr(&cfd_seq_local);
> > +	struct call_function_data *cfd = this_cpu_ptr(&cfd_data);
> > +	struct cfd_percpu *pcpu = per_cpu_ptr(cfd->pcpu, cpu);
> > +
> > +	cfd_seq_store(pcpu->seq_queue, this_cpu, cpu, CFD_SEQ_QUEUE);
> > +	if (llist_add(node, &per_cpu(call_single_queue, cpu))) {
> > +		cfd_seq_store(pcpu->seq_ipi, this_cpu, cpu, CFD_SEQ_IPI);
> > +		cfd_seq_store(seq->ping, this_cpu, cpu, CFD_SEQ_PING);
> > +		send_call_function_single_ipi(cpu);
> > +		cfd_seq_store(seq->pinged, this_cpu, cpu, CFD_SEQ_PINGED);
> > +	} else {
> > +		cfd_seq_store(pcpu->seq_noipi, this_cpu, cpu, CFD_SEQ_NOIPI);
> > +	}
> > +}
> >  #else
> > +#define cfd_seq_store(var, src, dst, type)
> > +
> >  static void csd_lock_record(call_single_data_t *csd)
> >  {
> >  }
> > @@ -290,6 +396,19 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(call_single_data_t, csd_data);
> >  
> >  void __smp_call_single_queue(int cpu, struct llist_node *node)
> >  {
> > +#ifdef CONFIG_CSD_LOCK_WAIT_DEBUG
> > +	if (static_branch_unlikely(&csdlock_debug_extended)) {
> > +		unsigned int type;
> > +
> > +		type = CSD_TYPE(container_of(node, call_single_data_t,
> > +					     node.llist));
> > +		if (type == CSD_TYPE_SYNC || type == CSD_TYPE_ASYNC) {
> > +			__smp_call_single_queue_debug(cpu, node);
> > +			return;
> > +		}
> > +	}
> > +#endif
> 
> This is a bit weird, might as well put it in generic_exec_single()
> because there you still know the type matches.
> 
> 
> > @@ -712,12 +840,21 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
> >  		csd->node.src = smp_processor_id();
> >  		csd->node.dst = cpu;
> >  #endif
> > -		if (llist_add(&csd->node.llist, &per_cpu(call_single_queue, cpu)))
> > +		cfd_seq_store(pcpu->seq_queue, this_cpu, cpu, CFD_SEQ_QUEUE);
> > +		if (llist_add(&csd->node.llist, &per_cpu(call_single_queue, cpu))) {
> >  			__cpumask_set_cpu(cpu, cfd->cpumask_ipi);
> > +			cfd_seq_store(pcpu->seq_ipi, this_cpu, cpu, CFD_SEQ_IPI);
> > +		} else {
> > +			cfd_seq_store(pcpu->seq_noipi, this_cpu, cpu, CFD_SEQ_NOIPI);
> > +		}
> >  	}
> >  
> >  	/* Send a message to all CPUs in the map */
> > +	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->ping, this_cpu,
> > +		      CFD_SEQ_NOCPU, CFD_SEQ_PING);
> >  	arch_send_call_function_ipi_mask(cfd->cpumask_ipi);
> > +	cfd_seq_store(this_cpu_ptr(&cfd_seq_local)->pinged, this_cpu,
> > +		      CFD_SEQ_NOCPU, CFD_SEQ_PINGED);
> 
> Too bad we can't share with the single case, ah well.


* Re: [PATCH 3/3] kernel/smp: add more data to CSD lock debugging
  2021-02-26 18:12     ` Paul E. McKenney
@ 2021-02-26 21:05       ` Peter Zijlstra
  2021-03-01  0:07         ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2021-02-26 21:05 UTC
  To: Paul E. McKenney
  Cc: Juergen Gross, linux-kernel, linux-doc, mhocko, Jonathan Corbet

On Fri, Feb 26, 2021 at 10:12:05AM -0800, Paul E. McKenney wrote:
> On Fri, Feb 26, 2021 at 05:38:44PM +0100, Peter Zijlstra wrote:
> > 
> > I hate all of this, but if this will finally catch the actual problem,
> > we can then revert all this, so sure.
> 
> OK, I will bite...  What exactly do you hate about it?

It's ugly and messy. We're basically committing a debug session. Normally
gunk like this is done in private trees, then we find the problem and
fix that and crap like this never sees the light of day.


* Re: [PATCH 3/3] kernel/smp: add more data to CSD lock debugging
  2021-02-26 21:05       ` Peter Zijlstra
@ 2021-03-01  0:07         ` Paul E. McKenney
  2021-03-01 15:45           ` Peter Zijlstra
  0 siblings, 1 reply; 11+ messages in thread
From: Paul E. McKenney @ 2021-03-01  0:07 UTC
  To: Peter Zijlstra
  Cc: Juergen Gross, linux-kernel, linux-doc, mhocko, Jonathan Corbet

On Fri, Feb 26, 2021 at 10:05:51PM +0100, Peter Zijlstra wrote:
> On Fri, Feb 26, 2021 at 10:12:05AM -0800, Paul E. McKenney wrote:
> > On Fri, Feb 26, 2021 at 05:38:44PM +0100, Peter Zijlstra wrote:
> > > 
> > > I hate all of this, but if this will finally catch the actual problem,
> > > we can then revert all this, so sure.
> > 
> > OK, I will bite...  What exactly do you hate about it?
> 
> It's ugly and messy. We're basically committing a debug session. Normally
> gunk like this is done in private trees, then we find the problem and
> fix that and crap like this never sees the light of day.

Is your hatred due to the presence of debug code at all, or a belief that
this code is unlikely to be useful in any subsequent IPI-related bug hunt?

							Thanx, Paul


* Re: [PATCH 3/3] kernel/smp: add more data to CSD lock debugging
  2021-03-01  0:07         ` Paul E. McKenney
@ 2021-03-01 15:45           ` Peter Zijlstra
  2021-03-01 15:53             ` Jürgen Groß
  0 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2021-03-01 15:45 UTC
  To: Paul E. McKenney
  Cc: Juergen Gross, linux-kernel, linux-doc, mhocko, Jonathan Corbet

On Sun, Feb 28, 2021 at 04:07:04PM -0800, Paul E. McKenney wrote:
> On Fri, Feb 26, 2021 at 10:05:51PM +0100, Peter Zijlstra wrote:
> > On Fri, Feb 26, 2021 at 10:12:05AM -0800, Paul E. McKenney wrote:
> > > On Fri, Feb 26, 2021 at 05:38:44PM +0100, Peter Zijlstra wrote:
> > > > 
> > > > I hate all of this, but if this will finally catch the actual problem,
> > > > we can then revert all this, so sure.
> > > 
> > > OK, I will bite...  What exactly do you hate about it?
> > 
> > It's ugly and messy. We're basically committing a debug session. Normally
> > gunk like this is done in private trees, then we find the problem and
> > fix that and crap like this never sees the light of day.
> 
> Is your hatred due to the presence of debug code at all, or a belief that
> this code is unlikely to be useful in any subsequent IPI-related bug hunt?

The clutter, mostly I think. There's a lot of debug sprinkled about.


* Re: [PATCH 3/3] kernel/smp: add more data to CSD lock debugging
  2021-03-01 15:45           ` Peter Zijlstra
@ 2021-03-01 15:53             ` Jürgen Groß
  2021-03-01 16:45               ` Paul E. McKenney
  0 siblings, 1 reply; 11+ messages in thread
From: Jürgen Groß @ 2021-03-01 15:53 UTC
  To: Peter Zijlstra, Paul E. McKenney
  Cc: linux-kernel, linux-doc, mhocko, Jonathan Corbet


On 01.03.21 16:45, Peter Zijlstra wrote:
> On Sun, Feb 28, 2021 at 04:07:04PM -0800, Paul E. McKenney wrote:
>> On Fri, Feb 26, 2021 at 10:05:51PM +0100, Peter Zijlstra wrote:
>>> On Fri, Feb 26, 2021 at 10:12:05AM -0800, Paul E. McKenney wrote:
>>>> On Fri, Feb 26, 2021 at 05:38:44PM +0100, Peter Zijlstra wrote:
>>>>>
>>>>> I hate all of this, but if this will finally catch the actual problem,
>>>>> we can then revert all this, so sure.
>>>>
>>>> OK, I will bite...  What exactly do you hate about it?
>>>
>>> It's ugly and messy. We're basically committing a debug session. Normally
>>> gunk like this is done in private trees, then we find the problem and
>>> fix that and crap like this never sees the light of day.
>>
>> Is your hatred due to the presence of debug code at all, or a belief that
>> this code is unlikely to be useful in any subsequent IPI-related bug hunt?
> 
> The clutter, mostly I think. There's a lot of debug sprinkled about.

I agree.

In case we are able to identify the root cause of the problem, I think
it would be no problem to revert this patch and put a comment into smp.c
naming the commit-id of this patch and what it was good for. This will
enable future bug hunters to make use of the patch without spoiling the
code for others.


Juergen




* Re: [PATCH 3/3] kernel/smp: add more data to CSD lock debugging
  2021-03-01 15:53             ` Jürgen Groß
@ 2021-03-01 16:45               ` Paul E. McKenney
  0 siblings, 0 replies; 11+ messages in thread
From: Paul E. McKenney @ 2021-03-01 16:45 UTC
  To: Jürgen Groß
  Cc: Peter Zijlstra, linux-kernel, linux-doc, mhocko, Jonathan Corbet

On Mon, Mar 01, 2021 at 04:53:27PM +0100, Jürgen Groß wrote:
> On 01.03.21 16:45, Peter Zijlstra wrote:
> > On Sun, Feb 28, 2021 at 04:07:04PM -0800, Paul E. McKenney wrote:
> > > On Fri, Feb 26, 2021 at 10:05:51PM +0100, Peter Zijlstra wrote:
> > > > On Fri, Feb 26, 2021 at 10:12:05AM -0800, Paul E. McKenney wrote:
> > > > > On Fri, Feb 26, 2021 at 05:38:44PM +0100, Peter Zijlstra wrote:
> > > > > > 
> > > > > > I hate all of this, but if this will finally catch the actual problem,
> > > > > > we can then revert all this, so sure.
> > > > > 
> > > > > OK, I will bite...  What exactly do you hate about it?
> > > > 
> > > > It's ugly and messy. We're basically committing a debug session. Normally
> > > > gunk like this is done in private trees, then we find the problem and
> > > > fix that and crap like this never sees the light of day.
> > > 
> > > Is your hatred due to the presence of debug code at all, or a belief that
> > > this code is unlikely to be useful in any subsequent IPI-related bug hunt?
> > 
> > The clutter, mostly I think. There's a lot of debug sprinkled about.
> 
> I agree.
> 
> In case we are able to identify the root cause of the problem, I think
> it would be no problem to revert this patch and put a comment into smp.c
> naming the commit-id of this patch and what it was good for. This will
> enable future bug hunters to make use of the patch without spoiling the
> code for others.

Works for me!

							Thanx, Paul

