* [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable()
@ 2017-04-07 14:01 Steven Rostedt
  2017-04-07 14:01 ` [PATCH 1/5 v2] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines Steven Rostedt
                   ` (7 more replies)
  0 siblings, 8 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 14:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

Paul,

Here's my latest. You OK with it?

-- Steve


Paul E. McKenney (1):
      rcu: Fix dyntick-idle tracing

Steven Rostedt (VMware) (4):
      ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines
      tracing: Replace the per_cpu() with this_cpu() in trace_stack.c
      tracing: Add stack_tracer_disable/enable() functions
      tracing: Rename trace_active to disable_stack_tracer and inline its modification

----
Changes since v1:

  . Make stack_tracer_{dis,en}able() into inlines

  . Rename trace_active to disable_stack_tracer and make global

  . Do not call preempt_disable() from stack_tracer_disable(),
    nor preempt_enable() from stack_tracer_enable()

  . Add check for preemption being disabled when CONFIG_PREEMPT_DEBUG is set

 include/linux/ftrace.h     | 38 ++++++++++++++++++++++++++++++++++++
 kernel/rcu/tree.c          | 48 ++++++++++++++++++++++------------------------
 kernel/trace/Kconfig       |  3 ++-
 kernel/trace/ftrace.c      | 42 +++++++++++++++++-----------------------
 kernel/trace/trace_stack.c | 27 +++++++++-----------------
 5 files changed, 90 insertions(+), 68 deletions(-)

Diff from v1:

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 40afee3..4bde7ff 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -287,8 +287,40 @@ stack_trace_sysctl(struct ctl_table *table, int write,
 		   void __user *buffer, size_t *lenp,
 		   loff_t *ppos);
 
-void stack_tracer_disable(void);
-void stack_tracer_enable(void);
+/* DO NOT MODIFY THIS VARIABLE DIRECTLY! */
+DECLARE_PER_CPU(int, disable_stack_tracer);
+
+/**
+ * stack_tracer_disable - temporarily disable the stack tracer
+ *
+ * There's a few locations (namely in RCU) where stack tracing
+ * cannot be executed. This function is used to disable stack
+ * tracing during those critical sections.
+ *
+ * This function must be called with preemption or interrupts
+ * disabled and stack_tracer_enable() must be called shortly after
+ * while preemption or interrupts are still disabled.
+ */
+static inline void stack_tracer_disable(void)
+{
+	/* Preemption or interrupts must be disabled */
+	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
+		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
+	this_cpu_inc(disable_stack_tracer);
+}
+
+/**
+ * stack_tracer_enable - re-enable the stack tracer
+ *
+ * After stack_tracer_disable() is called, stack_tracer_enable()
+ * must be called shortly afterward.
+ */
+static inline void stack_tracer_enable(void)
+{
+	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
+		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
+	this_cpu_dec(disable_stack_tracer);
+}
 #else
 static inline void stack_tracer_disable(void) { }
 static inline void stack_tracer_enabe(void) { }
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 5adbb73..84fafb6 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -35,40 +35,12 @@ unsigned long stack_trace_max_size;
 arch_spinlock_t stack_trace_max_lock =
 	(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 
-static DEFINE_PER_CPU(int, trace_active);
+DEFINE_PER_CPU(int, disable_stack_tracer);
 static DEFINE_MUTEX(stack_sysctl_mutex);
 
 int stack_tracer_enabled;
 static int last_stack_tracer_enabled;
 
-/**
- * stack_tracer_disable - temporarily disable the stack tracer
- *
- * There's a few locations (namely in RCU) where stack tracing
- * can not be executed. This function is used to disable stack
- * tracing during those critical sections.
- *
- * This function will disable preemption. stack_tracer_enable()
- * must be called shortly after this is called.
- */
-void stack_tracer_disable(void)
-{
-	preempt_disable_notrace();
-	this_cpu_inc(trace_active);
-}
-
-/**
- * stack_tracer_enable - re-enable the stack tracer
- *
- * After stack_tracer_disable() is called, stack_tracer_enable()
- * must shortly be called afterward.
- */
-void stack_tracer_enable(void)
-{
-	this_cpu_dec(trace_active);
-	preempt_enable_notrace();
-}
-
 void stack_trace_print(void)
 {
 	long i;
@@ -239,8 +211,8 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
 	preempt_disable_notrace();
 
 	/* no atomic needed, we only modify this variable by this cpu */
-	this_cpu_inc(trace_active);
-	if (this_cpu_read(trace_active) != 1)
+	this_cpu_inc(disable_stack_tracer);
+	if (this_cpu_read(disable_stack_tracer) != 1)
 		goto out;
 
 	ip += MCOUNT_INSN_SIZE;
@@ -248,7 +220,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
 	check_stack(ip, &stack);
 
  out:
-	this_cpu_dec(trace_active);
+	this_cpu_dec(disable_stack_tracer);
 	/* prevent recursion in schedule */
 	preempt_enable_notrace();
 }
@@ -290,15 +262,15 @@ stack_max_size_write(struct file *filp, const char __user *ubuf,
 	/*
 	 * In case we trace inside arch_spin_lock() or after (NMI),
 	 * we will cause circular lock, so we also need to increase
-	 * the percpu trace_active here.
+	 * the percpu disable_stack_tracer here.
 	 */
-	this_cpu_inc(trace_active);
+	this_cpu_inc(disable_stack_tracer);
 
 	arch_spin_lock(&stack_trace_max_lock);
 	*ptr = val;
 	arch_spin_unlock(&stack_trace_max_lock);
 
-	this_cpu_dec(trace_active);
+	this_cpu_dec(disable_stack_tracer);
 	local_irq_restore(flags);
 
 	return count;
@@ -334,7 +306,7 @@ static void *t_start(struct seq_file *m, loff_t *pos)
 {
 	local_irq_disable();
 
-	this_cpu_inc(trace_active);
+	this_cpu_inc(disable_stack_tracer);
 
 	arch_spin_lock(&stack_trace_max_lock);
 
@@ -348,7 +320,7 @@ static void t_stop(struct seq_file *m, void *p)
 {
 	arch_spin_unlock(&stack_trace_max_lock);
 
-	this_cpu_dec(trace_active);
+	this_cpu_dec(disable_stack_tracer);
 
 	local_irq_enable();
 }

* [PATCH 1/5 v2] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines
  2017-04-07 14:01 [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
@ 2017-04-07 14:01 ` Steven Rostedt
  2017-04-07 14:01 ` [PATCH 2/5 v2] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c Steven Rostedt
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 14:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The function tracer needs to be more careful than other subsystems when it
comes to freeing data, especially if that data is actually executable code.
When a single function is traced, a trampoline can be dynamically allocated
which is called to jump to the function trace callback. When the callback is
no longer needed, the dynamically allocated trampoline needs to be freed. This
is where the issues arise. The dynamically allocated trampoline must not be
used again. As function tracing can trace all subsystems, including
subsystems that are used to serialize aspects of freeing (namely RCU), it
must take extra care when doing the freeing.

Before synchronize_rcu_tasks() was around, there was no way for the function
tracer to know that nothing was using the dynamically allocated trampoline
when CONFIG_PREEMPT was enabled. That's because a task could be indefinitely
preempted while sitting on the trampoline. Now with synchronize_rcu_tasks(),
it will wait until all tasks have either voluntarily scheduled (not on the
trampoline) or gone into userspace (not on the trampoline). Then it is safe
to free the trampoline even with CONFIG_PREEMPT set.
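
To make the ordering concrete, here is a condensed sketch of the teardown
(the wrapper name is made up for illustration; the real ordering lives in
the ftrace_shutdown() hunk below):

	static void trampoline_teardown_sketch(struct ftrace_ops *ops)
	{
		/* Flush callers that run with preemption disabled. */
		schedule_on_each_cpu(ftrace_sync);

		/*
		 * A preempted task may still be sitting on the trampoline;
		 * wait until every task has voluntarily scheduled or
		 * entered user space before reusing the memory.
		 */
		if (IS_ENABLED(CONFIG_PREEMPT))
			synchronize_rcu_tasks();

		arch_ftrace_trampoline_free(ops);
	}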

Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/Kconfig  |  3 ++-
 kernel/trace/ftrace.c | 42 ++++++++++++++++++------------------------
 2 files changed, 20 insertions(+), 25 deletions(-)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index d4a06e714645..67b463b4f169 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -134,7 +134,8 @@ config FUNCTION_TRACER
 	select KALLSYMS
 	select GENERIC_TRACER
 	select CONTEXT_SWITCH_TRACER
-        select GLOB
+	select GLOB
+	select TASKS_RCU if PREEMPT
 	help
 	  Enable the kernel to trace every kernel function. This is done
 	  by using a compiler feature to insert a small, 5-byte No-Operation
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 8efd9fe7aec0..34f63e78d661 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2808,18 +2808,28 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
 	 * callers are done before leaving this function.
 	 * The same goes for freeing the per_cpu data of the per_cpu
 	 * ops.
-	 *
-	 * Again, normal synchronize_sched() is not good enough.
-	 * We need to do a hard force of sched synchronization.
-	 * This is because we use preempt_disable() to do RCU, but
-	 * the function tracers can be called where RCU is not watching
-	 * (like before user_exit()). We can not rely on the RCU
-	 * infrastructure to do the synchronization, thus we must do it
-	 * ourselves.
 	 */
 	if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_PER_CPU)) {
+		/*
+		 * We need to do a hard force of sched synchronization.
+		 * This is because we use preempt_disable() to do RCU, but
+		 * the function tracers can be called where RCU is not watching
+		 * (like before user_exit()). We can not rely on the RCU
+		 * infrastructure to do the synchronization, thus we must do it
+		 * ourselves.
+		 */
 		schedule_on_each_cpu(ftrace_sync);
 
+		/*
+		 * When the kernel is preemptive, tasks can be preempted
+		 * while on a ftrace trampoline. Just scheduling a task on
+		 * a CPU is not good enough to flush them. Calling
+		 * synchronize_rcu_tasks() will wait for those tasks to
+		 * execute and either schedule voluntarily or enter user space.
+		 */
+		if (IS_ENABLED(CONFIG_PREEMPT))
+			synchronize_rcu_tasks();
+
 		arch_ftrace_trampoline_free(ops);
 
 		if (ops->flags & FTRACE_OPS_FL_PER_CPU)
@@ -5366,22 +5376,6 @@ void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 
 static void ftrace_update_trampoline(struct ftrace_ops *ops)
 {
-
-/*
- * Currently there's no safe way to free a trampoline when the kernel
- * is configured with PREEMPT. That is because a task could be preempted
- * when it jumped to the trampoline, it may be preempted for a long time
- * depending on the system load, and currently there's no way to know
- * when it will be off the trampoline. If the trampoline is freed
- * too early, when the task runs again, it will be executing on freed
- * memory and crash.
- */
-#ifdef CONFIG_PREEMPT
-	/* Currently, only non dynamic ops can have a trampoline */
-	if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
-		return;
-#endif
-
 	arch_ftrace_update_trampoline(ops);
 }
 
-- 
2.10.2

* [PATCH 2/5 v2] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c
  2017-04-07 14:01 [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
  2017-04-07 14:01 ` [PATCH 1/5 v2] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines Steven Rostedt
@ 2017-04-07 14:01 ` Steven Rostedt
  2017-04-07 14:36   ` Paul E. McKenney
  2017-04-07 14:01 ` [PATCH 3/5 v2] tracing: Add stack_tracer_disable/enable() functions Steven Rostedt
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 14:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The trace_active per-CPU variable can be updated with the this_cpu_*()
functions, as it is only ever updated on the CPU that the variable is on.
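
For instance, the conversion looks like this (a sketch; the names are the
ones used in the diff below):

	int cpu;

	/* Before: explicit CPU lookup, preemption already disabled. */
	cpu = raw_smp_processor_id();
	per_cpu(trace_active, cpu)++;

	/* After: a single per-CPU operation, no cpu variable needed. */
	this_cpu_inc(trace_active);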

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_stack.c | 23 +++++++----------------
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 5fb1f2c87e6b..05ad2b86461e 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -207,13 +207,12 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
 		 struct ftrace_ops *op, struct pt_regs *pt_regs)
 {
 	unsigned long stack;
-	int cpu;
 
 	preempt_disable_notrace();
 
-	cpu = raw_smp_processor_id();
 	/* no atomic needed, we only modify this variable by this cpu */
-	if (per_cpu(trace_active, cpu)++ != 0)
+	this_cpu_inc(trace_active);
+	if (this_cpu_read(trace_active) != 1)
 		goto out;
 
 	ip += MCOUNT_INSN_SIZE;
@@ -221,7 +220,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
 	check_stack(ip, &stack);
 
  out:
-	per_cpu(trace_active, cpu)--;
+	this_cpu_dec(trace_active);
 	/* prevent recursion in schedule */
 	preempt_enable_notrace();
 }
@@ -253,7 +252,6 @@ stack_max_size_write(struct file *filp, const char __user *ubuf,
 	long *ptr = filp->private_data;
 	unsigned long val, flags;
 	int ret;
-	int cpu;
 
 	ret = kstrtoul_from_user(ubuf, count, 10, &val);
 	if (ret)
@@ -266,14 +264,13 @@ stack_max_size_write(struct file *filp, const char __user *ubuf,
 	 * we will cause circular lock, so we also need to increase
 	 * the percpu trace_active here.
 	 */
-	cpu = smp_processor_id();
-	per_cpu(trace_active, cpu)++;
+	this_cpu_inc(trace_active);
 
 	arch_spin_lock(&stack_trace_max_lock);
 	*ptr = val;
 	arch_spin_unlock(&stack_trace_max_lock);
 
-	per_cpu(trace_active, cpu)--;
+	this_cpu_dec(trace_active);
 	local_irq_restore(flags);
 
 	return count;
@@ -307,12 +304,9 @@ t_next(struct seq_file *m, void *v, loff_t *pos)
 
 static void *t_start(struct seq_file *m, loff_t *pos)
 {
-	int cpu;
-
 	local_irq_disable();
 
-	cpu = smp_processor_id();
-	per_cpu(trace_active, cpu)++;
+	this_cpu_inc(trace_active);
 
 	arch_spin_lock(&stack_trace_max_lock);
 
@@ -324,12 +318,9 @@ static void *t_start(struct seq_file *m, loff_t *pos)
 
 static void t_stop(struct seq_file *m, void *p)
 {
-	int cpu;
-
 	arch_spin_unlock(&stack_trace_max_lock);
 
-	cpu = smp_processor_id();
-	per_cpu(trace_active, cpu)--;
+	this_cpu_dec(trace_active);
 
 	local_irq_enable();
 }
-- 
2.10.2

* [PATCH 3/5 v2] tracing: Add stack_tracer_disable/enable() functions
  2017-04-07 14:01 [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
  2017-04-07 14:01 ` [PATCH 1/5 v2] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines Steven Rostedt
  2017-04-07 14:01 ` [PATCH 2/5 v2] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c Steven Rostedt
@ 2017-04-07 14:01 ` Steven Rostedt
  2017-04-07 14:22   ` Steven Rostedt
  2017-04-07 14:25   ` [PATCH 3/5 v2.1] " Steven Rostedt
  2017-04-07 14:01 ` [PATCH 4/5 v2] tracing: Rename trace_active to disable_stack_tracer and inline its modification Steven Rostedt
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 14:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

There are certain parts of the kernel that cannot let stack tracing
proceed (namely in RCU), because the stack tracer uses RCU, and parts of RCU
internals cannot handle having RCU read side locks taken.

Add stack_tracer_disable() and stack_tracer_enable() functions to let RCU
stop stack tracing on the current CPU when it is in those critical sections.
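
The intended calling pattern is (a sketch; the caller must already have
preemption or interrupts disabled, as RCU's idle-entry path does):

	unsigned long flags;

	local_irq_save(flags);
	stack_tracer_disable();
	/* ... section where stack tracing must not run ... */
	stack_tracer_enable();
	local_irq_restore(flags);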

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/ftrace.h     |  6 ++++++
 kernel/trace/trace_stack.c | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index ef7123219f14..40afee35565a 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -286,6 +286,12 @@ int
 stack_trace_sysctl(struct ctl_table *table, int write,
 		   void __user *buffer, size_t *lenp,
 		   loff_t *ppos);
+
+void stack_tracer_disable(void);
+void stack_tracer_enable(void);
+#else
+static inline void stack_tracer_disable(void) { }
+static inline void stack_tracer_enabe(void) { }
 #endif
 
 struct ftrace_func_command {
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 05ad2b86461e..7d52504680f9 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -41,6 +41,38 @@ static DEFINE_MUTEX(stack_sysctl_mutex);
 int stack_tracer_enabled;
 static int last_stack_tracer_enabled;
 
+/**
+ * stack_tracer_disable - temporarily disable the stack tracer
+ *
+ * There's a few locations (namely in RCU) where stack tracing
+ * cannot be executed. This function is used to disable stack
+ * tracing during those critical sections.
+ *
+ * This function must be called with preemption or interrupts
+ * disabled and stack_tracer_enable() must be called shortly after
+ * while preemption or interrupts are still disabled.
+ */
+void stack_tracer_disable(void)
+{
+	/* Preemption or interrupts must be disabled */
+	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
+		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
+	this_cpu_inc(trace_active);
+}
+
+/**
+ * stack_tracer_enable - re-enable the stack tracer
+ *
+ * After stack_tracer_disable() is called, stack_tracer_enable()
+ * must be called shortly afterward.
+ */
+void stack_tracer_enable(void)
+{
+	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
+		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
+	this_cpu_dec(trace_active);
+}
+
 void stack_trace_print(void)
 {
 	long i;
-- 
2.10.2

* [PATCH 4/5 v2] tracing: Rename trace_active to disable_stack_tracer and inline its modification
  2017-04-07 14:01 [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
                   ` (2 preceding siblings ...)
  2017-04-07 14:01 ` [PATCH 3/5 v2] tracing: Add stack_tracer_disable/enable() functions Steven Rostedt
@ 2017-04-07 14:01 ` Steven Rostedt
  2017-04-07 14:01 ` [PATCH 5/5 v2] rcu: Fix dyntick-idle tracing Steven Rostedt
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 14:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

In order to eliminate a function call, rename "trace_active" to
"disable_stack_tracer" and convert stack_tracer_disable() and friends into
static inline functions.
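
Schematically, this is the usual pattern for inlining per-CPU updates
(condensed from the diff below; the debug checks are omitted here):

	/* include/linux/ftrace.h: visible to every caller, so the
	 * update is inlined at each call site. */
	DECLARE_PER_CPU(int, disable_stack_tracer);

	static inline void stack_tracer_disable(void)
	{
		this_cpu_inc(disable_stack_tracer);
	}

	/* kernel/trace/trace_stack.c: the single definition. */
	DEFINE_PER_CPU(int, disable_stack_tracer);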

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/ftrace.h     | 36 +++++++++++++++++++++++++++++++--
 kernel/trace/trace_stack.c | 50 +++++++++-------------------------------------
 2 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 40afee35565a..4bde7ff2621d 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -287,8 +287,40 @@ stack_trace_sysctl(struct ctl_table *table, int write,
 		   void __user *buffer, size_t *lenp,
 		   loff_t *ppos);
 
-void stack_tracer_disable(void);
-void stack_tracer_enable(void);
+/* DO NOT MODIFY THIS VARIABLE DIRECTLY! */
+DECLARE_PER_CPU(int, disable_stack_tracer);
+
+/**
+ * stack_tracer_disable - temporarily disable the stack tracer
+ *
+ * There's a few locations (namely in RCU) where stack tracing
+ * cannot be executed. This function is used to disable stack
+ * tracing during those critical sections.
+ *
+ * This function must be called with preemption or interrupts
+ * disabled and stack_tracer_enable() must be called shortly after
+ * while preemption or interrupts are still disabled.
+ */
+static inline void stack_tracer_disable(void)
+{
+	/* Preemption or interrupts must be disabled */
+	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
+		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
+	this_cpu_inc(disable_stack_tracer);
+}
+
+/**
+ * stack_tracer_enable - re-enable the stack tracer
+ *
+ * After stack_tracer_disable() is called, stack_tracer_enable()
+ * must be called shortly afterward.
+ */
+static inline void stack_tracer_enable(void)
+{
+	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
+		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
+	this_cpu_dec(disable_stack_tracer);
+}
 #else
 static inline void stack_tracer_disable(void) { }
 static inline void stack_tracer_enabe(void) { }
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 7d52504680f9..84fafb6a331c 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -35,44 +35,12 @@ unsigned long stack_trace_max_size;
 arch_spinlock_t stack_trace_max_lock =
 	(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
 
-static DEFINE_PER_CPU(int, trace_active);
+DEFINE_PER_CPU(int, disable_stack_tracer);
 static DEFINE_MUTEX(stack_sysctl_mutex);
 
 int stack_tracer_enabled;
 static int last_stack_tracer_enabled;
 
-/**
- * stack_tracer_disable - temporarily disable the stack tracer
- *
- * There's a few locations (namely in RCU) where stack tracing
- * cannot be executed. This function is used to disable stack
- * tracing during those critical sections.
- *
- * This function must be called with preemption or interrupts
- * disabled and stack_tracer_enable() must be called shortly after
- * while preemption or interrupts are still disabled.
- */
-void stack_tracer_disable(void)
-{
-	/* Preemption or interrupts must be disabled */
-	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
-		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
-	this_cpu_inc(trace_active);
-}
-
-/**
- * stack_tracer_enable - re-enable the stack tracer
- *
- * After stack_tracer_disable() is called, stack_tracer_enable()
- * must be called shortly afterward.
- */
-void stack_tracer_enable(void)
-{
-	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
-		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
-	this_cpu_dec(trace_active);
-}
-
 void stack_trace_print(void)
 {
 	long i;
@@ -243,8 +211,8 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
 	preempt_disable_notrace();
 
 	/* no atomic needed, we only modify this variable by this cpu */
-	this_cpu_inc(trace_active);
-	if (this_cpu_read(trace_active) != 1)
+	this_cpu_inc(disable_stack_tracer);
+	if (this_cpu_read(disable_stack_tracer) != 1)
 		goto out;
 
 	ip += MCOUNT_INSN_SIZE;
@@ -252,7 +220,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
 	check_stack(ip, &stack);
 
  out:
-	this_cpu_dec(trace_active);
+	this_cpu_dec(disable_stack_tracer);
 	/* prevent recursion in schedule */
 	preempt_enable_notrace();
 }
@@ -294,15 +262,15 @@ stack_max_size_write(struct file *filp, const char __user *ubuf,
 	/*
 	 * In case we trace inside arch_spin_lock() or after (NMI),
 	 * we will cause circular lock, so we also need to increase
-	 * the percpu trace_active here.
+	 * the percpu disable_stack_tracer here.
 	 */
-	this_cpu_inc(trace_active);
+	this_cpu_inc(disable_stack_tracer);
 
 	arch_spin_lock(&stack_trace_max_lock);
 	*ptr = val;
 	arch_spin_unlock(&stack_trace_max_lock);
 
-	this_cpu_dec(trace_active);
+	this_cpu_dec(disable_stack_tracer);
 	local_irq_restore(flags);
 
 	return count;
@@ -338,7 +306,7 @@ static void *t_start(struct seq_file *m, loff_t *pos)
 {
 	local_irq_disable();
 
-	this_cpu_inc(trace_active);
+	this_cpu_inc(disable_stack_tracer);
 
 	arch_spin_lock(&stack_trace_max_lock);
 
@@ -352,7 +320,7 @@ static void t_stop(struct seq_file *m, void *p)
 {
 	arch_spin_unlock(&stack_trace_max_lock);
 
-	this_cpu_dec(trace_active);
+	this_cpu_dec(disable_stack_tracer);
 
 	local_irq_enable();
 }
-- 
2.10.2

* [PATCH 5/5 v2] rcu: Fix dyntick-idle tracing
  2017-04-07 14:01 [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
                   ` (3 preceding siblings ...)
  2017-04-07 14:01 ` [PATCH 4/5 v2] tracing: Rename trace_active to disable_stack_tracer and inline its modification Steven Rostedt
@ 2017-04-07 14:01 ` Steven Rostedt
  2017-04-07 14:40   ` Paul E. McKenney
  2017-04-07 14:43 ` [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Paul E. McKenney
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 14:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The tracing subsystem started using rcu_irq_entry() and rcu_irq_exit()
(with my blessing) to allow the current _rcuidle alternative tracepoint
name to be dispensed with while still maintaining good performance.
Unfortunately, this causes RCU's dyntick-idle entry code's tracing to
appear to RCU like an interrupt that occurs where RCU is not designed
to handle interrupts.

This commit fixes this problem by moving the zeroing of ->dynticks_nesting
after the offending trace_rcu_dyntick() statement, which narrows the
window of vulnerability to a pair of adjacent statements that are now
marked with comments to that effect.
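
In short, the reordering looks like this (condensed from the
rcu_eqs_enter_common() hunk below, not the full function):

	/*
	 * Old: the caller zeroed ->dynticks_nesting before calling here,
	 * so even the "Start" tracepoint ran with nesting == 0.
	 *
	 * New: trace first, and zero the counter only immediately before
	 * the EQS transition:
	 */
	trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0);
	...
	rdtp->dynticks_nesting = 0;   /* Breaks tracing momentarily. */
	rcu_dynticks_eqs_enter();     /* After this, tracing works again. */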

Link: http://lkml.kernel.org/r/20170405193928.GM1600@linux.vnet.ibm.com

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/rcu/tree.c | 48 +++++++++++++++++++++++-------------------------
 1 file changed, 23 insertions(+), 25 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 50fee7689e71..8b4d273331e4 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -57,6 +57,7 @@
 #include <linux/random.h>
 #include <linux/trace_events.h>
 #include <linux/suspend.h>
+#include <linux/ftrace.h>
 
 #include "tree.h"
 #include "rcu.h"
@@ -771,25 +772,24 @@ cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
 }
 
 /*
- * rcu_eqs_enter_common - current CPU is moving towards extended quiescent state
+ * rcu_eqs_enter_common - current CPU is entering an extended quiescent state
  *
- * If the new value of the ->dynticks_nesting counter now is zero,
- * we really have entered idle, and must do the appropriate accounting.
- * The caller must have disabled interrupts.
+ * Enter idle, doing appropriate accounting.  The caller must have
+ * disabled interrupts.
  */
-static void rcu_eqs_enter_common(long long oldval, bool user)
+static void rcu_eqs_enter_common(bool user)
 {
 	struct rcu_state *rsp;
 	struct rcu_data *rdp;
-	RCU_TRACE(struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);)
+	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
 
-	trace_rcu_dyntick(TPS("Start"), oldval, rdtp->dynticks_nesting);
+	trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0);
 	if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
 	    !user && !is_idle_task(current)) {
 		struct task_struct *idle __maybe_unused =
 			idle_task(smp_processor_id());
 
-		trace_rcu_dyntick(TPS("Error on entry: not idle task"), oldval, 0);
+		trace_rcu_dyntick(TPS("Error on entry: not idle task"), rdtp->dynticks_nesting, 0);
 		rcu_ftrace_dump(DUMP_ORIG);
 		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
 			  current->pid, current->comm,
@@ -800,7 +800,10 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
 		do_nocb_deferred_wakeup(rdp);
 	}
 	rcu_prepare_for_idle();
-	rcu_dynticks_eqs_enter();
+	stack_tracer_disable();
+	rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
+	rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
+	stack_tracer_enable();
 	rcu_dynticks_task_enter();
 
 	/*
@@ -821,19 +824,15 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
  */
 static void rcu_eqs_enter(bool user)
 {
-	long long oldval;
 	struct rcu_dynticks *rdtp;
 
 	rdtp = this_cpu_ptr(&rcu_dynticks);
-	oldval = rdtp->dynticks_nesting;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-		     (oldval & DYNTICK_TASK_NEST_MASK) == 0);
-	if ((oldval & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE) {
-		rdtp->dynticks_nesting = 0;
-		rcu_eqs_enter_common(oldval, user);
-	} else {
+		     (rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == 0);
+	if ((rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE)
+		rcu_eqs_enter_common(user);
+	else
 		rdtp->dynticks_nesting -= DYNTICK_TASK_NEST_VALUE;
-	}
 }
 
 /**
@@ -892,19 +891,18 @@ void rcu_user_enter(void)
  */
 void rcu_irq_exit(void)
 {
-	long long oldval;
 	struct rcu_dynticks *rdtp;
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_exit() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
-	oldval = rdtp->dynticks_nesting;
-	rdtp->dynticks_nesting--;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-		     rdtp->dynticks_nesting < 0);
-	if (rdtp->dynticks_nesting)
-		trace_rcu_dyntick(TPS("--="), oldval, rdtp->dynticks_nesting);
-	else
-		rcu_eqs_enter_common(oldval, true);
+		     rdtp->dynticks_nesting < 1);
+	if (rdtp->dynticks_nesting <= 1) {
+		rcu_eqs_enter_common(true);
+	} else {
+		trace_rcu_dyntick(TPS("--="), rdtp->dynticks_nesting, rdtp->dynticks_nesting - 1);
+		rdtp->dynticks_nesting--;
+	}
 	rcu_sysidle_enter(1);
 }
 
-- 
2.10.2

* Re: [PATCH 3/5 v2] tracing: Add stack_tracer_disable/enable() functions
  2017-04-07 14:01 ` [PATCH 3/5 v2] tracing: Add stack_tracer_disable/enable() functions Steven Rostedt
@ 2017-04-07 14:22   ` Steven Rostedt
  2017-04-07 14:25   ` [PATCH 3/5 v2.1] " Steven Rostedt
  1 sibling, 0 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 14:22 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

On Fri, 07 Apr 2017 10:01:09 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index ef7123219f14..40afee35565a 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -286,6 +286,12 @@ int
>  stack_trace_sysctl(struct ctl_table *table, int write,
>  		   void __user *buffer, size_t *lenp,
>  		   loff_t *ppos);
> +
> +void stack_tracer_disable(void);
> +void stack_tracer_enable(void);
> +#else
> +static inline void stack_tracer_disable(void) { }
> +static inline void stack_tracer_enabe(void) { }

Grumble. kbot caught that this is enabe not enable.

I cut and pasted the bad name to all places, and when I compiled, it
complained. I fixed the ones to make my system build, but forgot to fix
the #ifndef version :-(

I'll fix this before pushing.

-- Steve

* [PATCH 3/5 v2.1] tracing: Add stack_tracer_disable/enable() functions
  2017-04-07 14:01 ` [PATCH 3/5 v2] tracing: Add stack_tracer_disable/enable() functions Steven Rostedt
  2017-04-07 14:22   ` Steven Rostedt
@ 2017-04-07 14:25   ` Steven Rostedt
  1 sibling, 0 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 14:25 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney


There are certain parts of the kernel that cannot let stack tracing
proceed (namely in RCU), because the stack tracer uses RCU, and parts of RCU
internals cannot handle having RCU read side locks taken.

Add stack_tracer_disable() and stack_tracer_enable() functions to let RCU
stop stack tracing on the current CPU when it is in those critical sections.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/ftrace.h     |  6 ++++++
 kernel/trace/trace_stack.c | 32 ++++++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

v2.1 change: s/stack_tracer_enabe/stack_tracer_enable/
   (kbuild test robot)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index ef71232..7b4e657 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -286,6 +286,12 @@ int
 stack_trace_sysctl(struct ctl_table *table, int write,
 		   void __user *buffer, size_t *lenp,
 		   loff_t *ppos);
+
+void stack_tracer_disable(void);
+void stack_tracer_enable(void);
+#else
+static inline void stack_tracer_disable(void) { }
+static inline void stack_tracer_enable(void) { }
 #endif
 
 struct ftrace_func_command {
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 05ad2b8..7d52504 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -41,6 +41,38 @@ static DEFINE_MUTEX(stack_sysctl_mutex);
 int stack_tracer_enabled;
 static int last_stack_tracer_enabled;
 
+/**
+ * stack_tracer_disable - temporarily disable the stack tracer
+ *
+ * There's a few locations (namely in RCU) where stack tracing
+ * cannot be executed. This function is used to disable stack
+ * tracing during those critical sections.
+ *
+ * This function must be called with preemption or interrupts
+ * disabled and stack_tracer_enable() must be called shortly after
+ * while preemption or interrupts are still disabled.
+ */
+void stack_tracer_disable(void)
+{
+	/* Preemption or interrupts must be disabled */
+	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
+		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
+	this_cpu_inc(trace_active);
+}
+
+/**
+ * stack_tracer_enable - re-enable the stack tracer
+ *
+ * After stack_tracer_disable() is called, stack_tracer_enable()
+ * must be called shortly afterward.
+ */
+void stack_tracer_enable(void)
+{
+	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
+		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
+	this_cpu_dec(trace_active);
+}
+
 void stack_trace_print(void)
 {
 	long i;
-- 
2.9.3

* Re: [PATCH 2/5 v2] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c
  2017-04-07 14:01 ` [PATCH 2/5 v2] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c Steven Rostedt
@ 2017-04-07 14:36   ` Paul E. McKenney
  2017-04-07 14:48     ` Steven Rostedt
  0 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2017-04-07 14:36 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, Apr 07, 2017 at 10:01:08AM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> The trace_active per-CPU variable can be updated with the this_cpu_*()
> functions, as it is only ever updated on the CPU that the variable is on.
> 
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/trace/trace_stack.c | 23 +++++++----------------
>  1 file changed, 7 insertions(+), 16 deletions(-)
> 
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 5fb1f2c87e6b..05ad2b86461e 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -207,13 +207,12 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
>  		 struct ftrace_ops *op, struct pt_regs *pt_regs)
>  {
>  	unsigned long stack;
> -	int cpu;
> 
>  	preempt_disable_notrace();
> 
> -	cpu = raw_smp_processor_id();
>  	/* no atomic needed, we only modify this variable by this cpu */
> -	if (per_cpu(trace_active, cpu)++ != 0)
> +	this_cpu_inc(trace_active);

For whatever it is worth...

I was about to complain that this_cpu_inc() only disables preemption,
not interrupts, but then I realized that any correct interrupt handler
would have to restore the per-CPU variable to its original value.

Presumably you have to sum up all the per-CPU trace_active counts,
given that there is no guarantee that a process-level dec will happen
on the same CPU that did the inc.

							Thanx, Paul

> +	if (this_cpu_read(trace_active) != 1)
>  		goto out;
> 
>  	ip += MCOUNT_INSN_SIZE;
> @@ -221,7 +220,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
>  	check_stack(ip, &stack);
> 
>   out:
> -	per_cpu(trace_active, cpu)--;
> +	this_cpu_dec(trace_active);
>  	/* prevent recursion in schedule */
>  	preempt_enable_notrace();
>  }
> @@ -253,7 +252,6 @@ stack_max_size_write(struct file *filp, const char __user *ubuf,
>  	long *ptr = filp->private_data;
>  	unsigned long val, flags;
>  	int ret;
> -	int cpu;
> 
>  	ret = kstrtoul_from_user(ubuf, count, 10, &val);
>  	if (ret)
> @@ -266,14 +264,13 @@ stack_max_size_write(struct file *filp, const char __user *ubuf,
>  	 * we will cause circular lock, so we also need to increase
>  	 * the percpu trace_active here.
>  	 */
> -	cpu = smp_processor_id();
> -	per_cpu(trace_active, cpu)++;
> +	this_cpu_inc(trace_active);
> 
>  	arch_spin_lock(&stack_trace_max_lock);
>  	*ptr = val;
>  	arch_spin_unlock(&stack_trace_max_lock);
> 
> -	per_cpu(trace_active, cpu)--;
> +	this_cpu_dec(trace_active);
>  	local_irq_restore(flags);
> 
>  	return count;
> @@ -307,12 +304,9 @@ t_next(struct seq_file *m, void *v, loff_t *pos)
> 
>  static void *t_start(struct seq_file *m, loff_t *pos)
>  {
> -	int cpu;
> -
>  	local_irq_disable();
> 
> -	cpu = smp_processor_id();
> -	per_cpu(trace_active, cpu)++;
> +	this_cpu_inc(trace_active);
> 
>  	arch_spin_lock(&stack_trace_max_lock);
> 
> @@ -324,12 +318,9 @@ static void *t_start(struct seq_file *m, loff_t *pos)
> 
>  static void t_stop(struct seq_file *m, void *p)
>  {
> -	int cpu;
> -
>  	arch_spin_unlock(&stack_trace_max_lock);
> 
> -	cpu = smp_processor_id();
> -	per_cpu(trace_active, cpu)--;
> +	this_cpu_dec(trace_active);
> 
>  	local_irq_enable();
>  }
> -- 
> 2.10.2
> 
> 

* Re: [PATCH 5/5 v2] rcu: Fix dyntick-idle tracing
  2017-04-07 14:01 ` [PATCH 5/5 v2] rcu: Fix dyntick-idle tracing Steven Rostedt
@ 2017-04-07 14:40   ` Paul E. McKenney
  2017-04-07 14:53     ` Steven Rostedt
  0 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2017-04-07 14:40 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, Apr 07, 2017 at 10:01:11AM -0400, Steven Rostedt wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> The tracing subsystem started using rcu_irq_entry() and rcu_irq_exit()
> (with my blessing) to allow the current _rcuidle alternative tracepoint
> name to be dispensed with while still maintaining good performance.
> Unfortunately, this causes RCU's dyntick-idle entry code's tracing to
> appear to RCU like an interrupt that occurs where RCU is not designed
> to handle interrupts.
> 
> This commit fixes this problem by moving the zeroing of ->dynticks_nesting
> after the offending trace_rcu_dyntick() statement, which narrows the
> window of vulnerability to a pair of adjacent statements that are now
> marked with comments to that effect.
> 
> Link: http://lkml.kernel.org/r/20170405193928.GM1600@linux.vnet.ibm.com
> 
> Reported-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/rcu/tree.c | 48 +++++++++++++++++++++++-------------------------
>  1 file changed, 23 insertions(+), 25 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 50fee7689e71..8b4d273331e4 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -57,6 +57,7 @@
>  #include <linux/random.h>
>  #include <linux/trace_events.h>
>  #include <linux/suspend.h>
> +#include <linux/ftrace.h>
> 
>  #include "tree.h"
>  #include "rcu.h"
> @@ -771,25 +772,24 @@ cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
>  }
> 
>  /*
> - * rcu_eqs_enter_common - current CPU is moving towards extended quiescent state
> + * rcu_eqs_enter_common - current CPU is entering an extended quiescent state
>   *
> - * If the new value of the ->dynticks_nesting counter now is zero,
> - * we really have entered idle, and must do the appropriate accounting.
> - * The caller must have disabled interrupts.
> + * Enter idle, doing appropriate accounting.  The caller must have
> + * disabled interrupts.
>   */
> -static void rcu_eqs_enter_common(long long oldval, bool user)
> +static void rcu_eqs_enter_common(bool user)
>  {
>  	struct rcu_state *rsp;
>  	struct rcu_data *rdp;
> -	RCU_TRACE(struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);)
> +	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> 
> -	trace_rcu_dyntick(TPS("Start"), oldval, rdtp->dynticks_nesting);
> +	trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0);
>  	if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
>  	    !user && !is_idle_task(current)) {
>  		struct task_struct *idle __maybe_unused =
>  			idle_task(smp_processor_id());
> 
> -		trace_rcu_dyntick(TPS("Error on entry: not idle task"), oldval, 0);
> +		trace_rcu_dyntick(TPS("Error on entry: not idle task"), rdtp->dynticks_nesting, 0);
>  		rcu_ftrace_dump(DUMP_ORIG);
>  		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
>  			  current->pid, current->comm,
> @@ -800,7 +800,10 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
>  		do_nocb_deferred_wakeup(rdp);
>  	}
>  	rcu_prepare_for_idle();
> -	rcu_dynticks_eqs_enter();
> +	stack_tracer_disable();
> +	rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
> +	rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
> +	stack_tracer_enable();

Hmmm...  There is not supposed to be any tracing in this interval,
and interrupts are disabled.  Wouldn't it be better to have something
that made tracing illegal during this interval?

Yeah, I am a bit concerned about idle-entry latency...

							Thanx, Paul

>  	rcu_dynticks_task_enter();
> 
>  	/*
> @@ -821,19 +824,15 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
>   */
>  static void rcu_eqs_enter(bool user)
>  {
> -	long long oldval;
>  	struct rcu_dynticks *rdtp;
> 
>  	rdtp = this_cpu_ptr(&rcu_dynticks);
> -	oldval = rdtp->dynticks_nesting;
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> -		     (oldval & DYNTICK_TASK_NEST_MASK) == 0);
> -	if ((oldval & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE) {
> -		rdtp->dynticks_nesting = 0;
> -		rcu_eqs_enter_common(oldval, user);
> -	} else {
> +		     (rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == 0);
> +	if ((rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE)
> +		rcu_eqs_enter_common(user);
> +	else
>  		rdtp->dynticks_nesting -= DYNTICK_TASK_NEST_VALUE;
> -	}
>  }
> 
>  /**
> @@ -892,19 +891,18 @@ void rcu_user_enter(void)
>   */
>  void rcu_irq_exit(void)
>  {
> -	long long oldval;
>  	struct rcu_dynticks *rdtp;
> 
>  	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_exit() invoked with irqs enabled!!!");
>  	rdtp = this_cpu_ptr(&rcu_dynticks);
> -	oldval = rdtp->dynticks_nesting;
> -	rdtp->dynticks_nesting--;
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> -		     rdtp->dynticks_nesting < 0);
> -	if (rdtp->dynticks_nesting)
> -		trace_rcu_dyntick(TPS("--="), oldval, rdtp->dynticks_nesting);
> -	else
> -		rcu_eqs_enter_common(oldval, true);
> +		     rdtp->dynticks_nesting < 1);
> +	if (rdtp->dynticks_nesting <= 1) {
> +		rcu_eqs_enter_common(true);
> +	} else {
> +		trace_rcu_dyntick(TPS("--="), rdtp->dynticks_nesting, rdtp->dynticks_nesting - 1);
> +		rdtp->dynticks_nesting--;
> +	}
>  	rcu_sysidle_enter(1);
>  }
> 
> -- 
> 2.10.2
> 
> 

* Re: [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable()
  2017-04-07 14:01 [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
                   ` (4 preceding siblings ...)
  2017-04-07 14:01 ` [PATCH 5/5 v2] rcu: Fix dyntick-idle tracing Steven Rostedt
@ 2017-04-07 14:43 ` Paul E. McKenney
  2017-04-07 14:58   ` Steven Rostedt
  2017-04-07 16:35 ` [PATCH 6/5]rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work Steven Rostedt
  2017-04-07 17:06 ` [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events Steven Rostedt
  7 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2017-04-07 14:43 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, Apr 07, 2017 at 10:01:06AM -0400, Steven Rostedt wrote:
> Paul,
> 
> Here's my latest. You OK with it?

Given your update to 3/5, I suspect that we could live with it.  I am
expecting some complaints about increases in idle-entry latency, but it might
be best to wait for complaints rather than complexifying too proactively.

That said, there isn't supposed to be any tracing during the now very
small interval where RCU's idle-entry is incomplete.  Mightn't it be
better to (under CONFIG_PROVE_RCU or some such) give splats if tracing
showed up in that interval?
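
(Purely illustrative -- such a splat could be as simple as a check at
tracepoint entry along the lines of:

	if (IS_ENABLED(CONFIG_PROVE_RCU))
		WARN_ON_ONCE(!rcu_is_watching());

since rcu_is_watching() already reports whether RCU is paying attention
to the current CPU.)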

							Thanx, Paul

> -- Steve
> 
> 
> Paul E. McKenney (1):
>       rcu: Fix dyntick-idle tracing
> 
> Steven Rostedt (VMware) (4):
>       ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines
>       tracing: Replace the per_cpu() with this_cpu() in trace_stack.c
>       tracing: Add stack_tracer_disable/enable() functions
>       tracing: Rename trace_active to disable_stack_tracer and inline its modification
> 
> ----
> Changes since v1:
> 
>   . Make stack_tracer_{dis,en}able() into inlines
> 
>   . Rename trace_active to disable_stack_tracer and make global
> 
>   . Do not call preempt_disable() from stack_tracer_disable(),
>     nor preempt_enable() from stack_tracer_enable()
> 
>   . Add check for preemption being disabled when CONFIG_PREEMPT_DEBUG is set
> 
>  include/linux/ftrace.h     | 38 ++++++++++++++++++++++++++++++++++++
>  kernel/rcu/tree.c          | 48 ++++++++++++++++++++++------------------------
>  kernel/trace/Kconfig       |  3 ++-
>  kernel/trace/ftrace.c      | 42 +++++++++++++++++-----------------------
>  kernel/trace/trace_stack.c | 27 +++++++++-----------------
>  5 files changed, 90 insertions(+), 68 deletions(-)
> 
> Diff from v1:
> 
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index 40afee3..4bde7ff 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -287,8 +287,40 @@ stack_trace_sysctl(struct ctl_table *table, int write,
>  		   void __user *buffer, size_t *lenp,
>  		   loff_t *ppos);
> 
> -void stack_tracer_disable(void);
> -void stack_tracer_enable(void);
> +/* DO NOT MODIFY THIS VARIABLE DIRECTLY! */
> +DECLARE_PER_CPU(int, disable_stack_tracer);
> +
> +/**
> + * stack_tracer_disable - temporarily disable the stack tracer
> + *
> + * There's a few locations (namely in RCU) where stack tracing
> + * cannot be executed. This function is used to disable stack
> + * tracing during those critical sections.
> + *
> + * This function must be called with preemption or interrupts
> + * disabled and stack_tracer_enable() must be called shortly after
> + * while preemption or interrupts are still disabled.
> + */
> +static inline void stack_tracer_disable(void)
> +{
> +	/* Preemption or interrupts must be disabled */
> +	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
> +		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
> +	this_cpu_inc(disable_stack_tracer);
> +}
> +
> +/**
> + * stack_tracer_enable - re-enable the stack tracer
> + *
> + * After stack_tracer_disable() is called, stack_tracer_enable()
> + * must be called shortly afterward.
> + */
> +static inline void stack_tracer_enable(void)
> +{
> +	if (IS_ENABLED(CONFIG_PREEMPT_DEBUG))
> +		WARN_ON_ONCE(!preempt_count() || !irqs_disabled());
> +	this_cpu_dec(disable_stack_tracer);
> +}
>  #else
>  static inline void stack_tracer_disable(void) { }
>  static inline void stack_tracer_enabe(void) { }
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 5adbb73..84fafb6 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -35,40 +35,12 @@ unsigned long stack_trace_max_size;
>  arch_spinlock_t stack_trace_max_lock =
>  	(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
> 
> -static DEFINE_PER_CPU(int, trace_active);
> +DEFINE_PER_CPU(int, disable_stack_tracer);
>  static DEFINE_MUTEX(stack_sysctl_mutex);
> 
>  int stack_tracer_enabled;
>  static int last_stack_tracer_enabled;
> 
> -/**
> - * stack_tracer_disable - temporarily disable the stack tracer
> - *
> - * There's a few locations (namely in RCU) where stack tracing
> - * can not be executed. This function is used to disable stack
> - * tracing during those critical sections.
> - *
> - * This function will disable preemption. stack_tracer_enable()
> - * must be called shortly after this is called.
> - */
> -void stack_tracer_disable(void)
> -{
> -	preempt_disable_notrace();
> -	this_cpu_inc(trace_active);
> -}
> -
> -/**
> - * stack_tracer_enable - re-enable the stack tracer
> - *
> - * After stack_tracer_disable() is called, stack_tracer_enable()
> - * must shortly be called afterward.
> - */
> -void stack_tracer_enable(void)
> -{
> -	this_cpu_dec(trace_active);
> -	preempt_enable_notrace();
> -}
> -
>  void stack_trace_print(void)
>  {
>  	long i;
> @@ -239,8 +211,8 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
>  	preempt_disable_notrace();
> 
>  	/* no atomic needed, we only modify this variable by this cpu */
> -	this_cpu_inc(trace_active);
> -	if (this_cpu_read(trace_active) != 1)
> +	this_cpu_inc(disable_stack_tracer);
> +	if (this_cpu_read(disable_stack_tracer) != 1)
>  		goto out;
> 
>  	ip += MCOUNT_INSN_SIZE;
> @@ -248,7 +220,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
>  	check_stack(ip, &stack);
> 
>   out:
> -	this_cpu_dec(trace_active);
> +	this_cpu_dec(disable_stack_tracer);
>  	/* prevent recursion in schedule */
>  	preempt_enable_notrace();
>  }
> @@ -290,15 +262,15 @@ stack_max_size_write(struct file *filp, const char __user *ubuf,
>  	/*
>  	 * In case we trace inside arch_spin_lock() or after (NMI),
>  	 * we will cause circular lock, so we also need to increase
> -	 * the percpu trace_active here.
> +	 * the percpu disable_stack_tracer here.
>  	 */
> -	this_cpu_inc(trace_active);
> +	this_cpu_inc(disable_stack_tracer);
> 
>  	arch_spin_lock(&stack_trace_max_lock);
>  	*ptr = val;
>  	arch_spin_unlock(&stack_trace_max_lock);
> 
> -	this_cpu_dec(trace_active);
> +	this_cpu_dec(disable_stack_tracer);
>  	local_irq_restore(flags);
> 
>  	return count;
> @@ -334,7 +306,7 @@ static void *t_start(struct seq_file *m, loff_t *pos)
>  {
>  	local_irq_disable();
> 
> -	this_cpu_inc(trace_active);
> +	this_cpu_inc(disable_stack_tracer);
> 
>  	arch_spin_lock(&stack_trace_max_lock);
> 
> @@ -348,7 +320,7 @@ static void t_stop(struct seq_file *m, void *p)
>  {
>  	arch_spin_unlock(&stack_trace_max_lock);
> 
> -	this_cpu_dec(trace_active);
> +	this_cpu_dec(disable_stack_tracer);
> 
>  	local_irq_enable();
>  }
> 

* Re: [PATCH 2/5 v2] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c
  2017-04-07 14:36   ` Paul E. McKenney
@ 2017-04-07 14:48     ` Steven Rostedt
  2017-04-07 15:08       ` Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 14:48 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, 7 Apr 2017 07:36:19 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Fri, Apr 07, 2017 at 10:01:08AM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > 
> > The trace_active per-CPU variable can be updated with the this_cpu_*()
> > functions, as it is only ever updated on the CPU that the variable is on.
> > 
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> > ---
> >  kernel/trace/trace_stack.c | 23 +++++++----------------
> >  1 file changed, 7 insertions(+), 16 deletions(-)
> > 
> > diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> > index 5fb1f2c87e6b..05ad2b86461e 100644
> > --- a/kernel/trace/trace_stack.c
> > +++ b/kernel/trace/trace_stack.c
> > @@ -207,13 +207,12 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
> >  		 struct ftrace_ops *op, struct pt_regs *pt_regs)
> >  {
> >  	unsigned long stack;
> > -	int cpu;
> > 
> >  	preempt_disable_notrace();
> > 
> > -	cpu = raw_smp_processor_id();
> >  	/* no atomic needed, we only modify this variable by this cpu */
> > -	if (per_cpu(trace_active, cpu)++ != 0)
> > +	this_cpu_inc(trace_active);  
> 
> For whatever it is worth...
> 
> I was about to complain that this_cpu_inc() only disables preemption,
> not interrupts, but then I realized that any correct interrupt handler
> would have to restore the per-CPU variable to its original value.

Yep, that's the reason for the comment about "no atomic needed". This
is a "stack modification": anything that interrupts the flow will put the
variable back the way it was before returning to what it interrupted.
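
For example, the nesting on a given CPU is strictly LIFO (a sketch):

	this_cpu_inc(trace_active);     /* process level: 0 -> 1 */
	/*
	 * An irq arriving here does its own inc (1 -> 2) ... dec (2 -> 1),
	 * so the count is back to 1 by the time the handler returns.
	 */
	this_cpu_dec(trace_active);     /* process level: 1 -> 0 */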

> 
> Presumably you have to sum up all the per-CPU trace_active counts,
> given that there is no guarantee that a process-level dec will happen
> on the same CPU that did the inc.

That's why we disable preemption. We guarantee that a process-level dec
*will* happen on the same CPU that did the inc.

It's also the reason for the preemption disabled check in the
stack_tracer_disable() code.

-- Steve

* Re: [PATCH 5/5 v2] rcu: Fix dyntick-idle tracing
  2017-04-07 14:40   ` Paul E. McKenney
@ 2017-04-07 14:53     ` Steven Rostedt
  2017-04-07 15:09       ` Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 14:53 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, 7 Apr 2017 07:40:11 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Fri, Apr 07, 2017 at 10:01:11AM -0400, Steven Rostedt wrote:
> > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > 
> > The tracing subsystem started using rcu_irq_entry() and rcu_irq_exit()
> > (with my blessing) to allow the current _rcuidle alternative tracepoint
> > name to be dispensed with while still maintaining good performance.
> > Unfortunately, this causes RCU's dyntick-idle entry code's tracing to
> > appear to RCU like an interrupt that occurs where RCU is not designed
> > to handle interrupts.
> > 
> > This commit fixes this problem by moving the zeroing of ->dynticks_nesting
> > after the offending trace_rcu_dyntick() statement, which narrows the
> > window of vulnerability to a pair of adjacent statements that are now
> > marked with comments to that effect.
> > 
> > Link: http://lkml.kernel.org/r/20170405193928.GM1600@linux.vnet.ibm.com
> > 
> > Reported-by: Steven Rostedt <rostedt@goodmis.org>
> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> > ---
> >  kernel/rcu/tree.c | 48 +++++++++++++++++++++++-------------------------
> >  1 file changed, 23 insertions(+), 25 deletions(-)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 50fee7689e71..8b4d273331e4 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -57,6 +57,7 @@
> >  #include <linux/random.h>
> >  #include <linux/trace_events.h>
> >  #include <linux/suspend.h>
> > +#include <linux/ftrace.h>
> > 
> >  #include "tree.h"
> >  #include "rcu.h"
> > @@ -771,25 +772,24 @@ cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
> >  }
> > 
> >  /*
> > - * rcu_eqs_enter_common - current CPU is moving towards extended quiescent state
> > + * rcu_eqs_enter_common - current CPU is entering an extended quiescent state
> >   *
> > - * If the new value of the ->dynticks_nesting counter now is zero,
> > - * we really have entered idle, and must do the appropriate accounting.
> > - * The caller must have disabled interrupts.
> > + * Enter idle, doing appropriate accounting.  The caller must have
> > + * disabled interrupts.
> >   */
> > -static void rcu_eqs_enter_common(long long oldval, bool user)
> > +static void rcu_eqs_enter_common(bool user)
> >  {
> >  	struct rcu_state *rsp;
> >  	struct rcu_data *rdp;
> > -	RCU_TRACE(struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);)
> > +	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> > 
> > -	trace_rcu_dyntick(TPS("Start"), oldval, rdtp->dynticks_nesting);
> > +	trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0);
> >  	if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> >  	    !user && !is_idle_task(current)) {
> >  		struct task_struct *idle __maybe_unused =
> >  			idle_task(smp_processor_id());
> > 
> > -		trace_rcu_dyntick(TPS("Error on entry: not idle task"), oldval, 0);
> > +		trace_rcu_dyntick(TPS("Error on entry: not idle task"), rdtp->dynticks_nesting, 0);
> >  		rcu_ftrace_dump(DUMP_ORIG);
> >  		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
> >  			  current->pid, current->comm,
> > @@ -800,7 +800,10 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
> >  		do_nocb_deferred_wakeup(rdp);
> >  	}
> >  	rcu_prepare_for_idle();
> > -	rcu_dynticks_eqs_enter();
> > +	stack_tracer_disable();
> > +	rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
> > +	rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
> > +	stack_tracer_enable();  
> 
> Hmmm...  There is not supposed to be any tracing in this interval,

Why not? Function tracing happens without an issue. But then again,
function tracing doesn't depend on RCU.

> and interrupts are disabled.  Wouldn't it be better to have something
> that made tracing illegal during this interval?

I don't see an issue here. Function tracing is fine. There should be no
trace_events() as those are static events and shouldn't dynamically
appear in this interval.

The problem I hit is that stack tracing uses function tracing to check
the stack usage of all functions. It doesn't need RCU either, unless it
hits a new "max stack", in which case it calls save_stack_trace(), which
does a lot, including an rcu_read_lock(), and that is what broke.

I'm fine with tracing, as that works. What doesn't work is tracing a
new max stack.
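
To spell out the two paths, here is a simplified user-space model of
the idea (not the actual kernel code; the _sim names are made up):

	#include <stdio.h>

	static unsigned long stack_trace_max_sim; /* recorded worst case */

	/*
	 * Stand-in for the save_stack_trace() slow path: in the kernel
	 * this ends up taking rcu_read_lock(), so it must not run where
	 * RCU is not watching.
	 */
	static void record_new_max(unsigned long depth)
	{
		stack_trace_max_sim = depth;
		printf("new max stack: %lu\n", depth);
	}

	/*
	 * Stand-in for the per-function callback: it runs on every traced
	 * function but needs nothing from RCU unless a new max is seen.
	 */
	static void check_stack_sim(unsigned long depth)
	{
		if (depth <= stack_trace_max_sim)
			return;			/* common, RCU-free case */
		record_new_max(depth);		/* the rare case that broke */
	}

	int main(void)
	{
		check_stack_sim(100);	/* new max -> RCU-using slow path */
		check_stack_sim(80);	/* fast path, no RCU needed */
		check_stack_sim(150);	/* new max again */
		return 0;
	}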

> 
> Yeah, I am a bit concerned about idle-entry latency...
> 

Which should now be fine because of the inlined this_cpu_inc/dec(),
which are very efficient and made for fast paths like this.

-- Steve

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable()
  2017-04-07 14:43 ` [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Paul E. McKenney
@ 2017-04-07 14:58   ` Steven Rostedt
  2017-04-07 15:11     ` Paul E. McKenney
  0 siblings, 1 reply; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 14:58 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, 7 Apr 2017 07:43:35 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Fri, Apr 07, 2017 at 10:01:06AM -0400, Steven Rostedt wrote:
> > Paul,
> > 
> > Here's my latest. You OK with it?  
> 
> Given your update to 3/5, I suspect that we could live with it.  I am
> expecting some complaints about increases in idle-entry latency, but might
> be best to wait for complaints rather than complexifying too proactively.

We only added a this_cpu_inc() and this_cpu_dec(), which are very fast
operations. I highly doubt it will be measurable. Although I'm talking
about x86; IIRC, the this_cpu_inc/dec were poorly implemented for other
archs in the past. I'm not sure if that was fixed though.

> 
> That said, there isn't supposed to be any tracing during the now very
> small interval where RCU's idle-entry is incomplete.  Mightn't it be
> better to (under CONFIG_PROVE_RCU or some such) give splats if tracing
> showed up in that interval?
> 

Again, tracing is not the issue. I do function tracing in that location
without any problems. The issue here was the stack tracer.

Maybe we can create a new variable that is more cache local to the RCU
code.

What about calling it "rcu_disabled"? Then tracing that depends on RCU
can simply check that.

s/stack_trace_disable/disable_rcu/
s/stack_trace_enable/enable_rcu/

export a per cpu variable rcu_disabled

Then I can have the stack tracer check that variable as well. And we
could even put in a WARN_ON(this_cpu_read(rcu_disabled)) in the
TRACE_EVENT() macros.

Thoughts?

-- Steve

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/5 v2] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c
  2017-04-07 14:48     ` Steven Rostedt
@ 2017-04-07 15:08       ` Paul E. McKenney
  0 siblings, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2017-04-07 15:08 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, Apr 07, 2017 at 10:48:38AM -0400, Steven Rostedt wrote:
> On Fri, 7 Apr 2017 07:36:19 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Fri, Apr 07, 2017 at 10:01:08AM -0400, Steven Rostedt wrote:
> > > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > > 
> > > The updates to the trace_active per cpu variable can be updated with the
> > > this_cpu_*() functions as it only gets updated on the CPU that the variable
> > > is on.
> > > 
> > > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> > > ---
> > >  kernel/trace/trace_stack.c | 23 +++++++----------------
> > >  1 file changed, 7 insertions(+), 16 deletions(-)
> > > 
> > > diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> > > index 5fb1f2c87e6b..05ad2b86461e 100644
> > > --- a/kernel/trace/trace_stack.c
> > > +++ b/kernel/trace/trace_stack.c
> > > @@ -207,13 +207,12 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
> > >  		 struct ftrace_ops *op, struct pt_regs *pt_regs)
> > >  {
> > >  	unsigned long stack;
> > > -	int cpu;
> > > 
> > >  	preempt_disable_notrace();
> > > 
> > > -	cpu = raw_smp_processor_id();
> > >  	/* no atomic needed, we only modify this variable by this cpu */
> > > -	if (per_cpu(trace_active, cpu)++ != 0)
> > > +	this_cpu_inc(trace_active);  
> > 
> > For whatever it is worth...
> > 
> > I was about to complain that this_cpu_inc() only disables preemption,
> > not interrupts, but then I realized that any correct interrupt handler
> > would have to restore the per-CPU variable to its original value.
> 
> Yep, that's the reason for the comment about "no atomic needed". This
> is a "stack modification": any interrupt that modifies the counter will
> restore it to its original value before returning to what it interrupted.
> 
> > 
> > Presumably you have to sum up all the per-CPU trace_active counts,
> > given that there is no guarantee that a process-level dec will happen
> > on the same CPU that did the inc.
> 
> That's why we disable preemption. We guarantee that a process-level dec
> *will* happen on the same CPU that did the inc.

But in that case, can't you use __this_cpu_inc()?  Save a few cycles
on RISC systems.

							Thanx, Paul

> It's also the reason for the preemption disabled check in the
> stack_tracer_disable() code.
> 
> -- Steve
> 
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 5/5 v2] rcu: Fix dyntick-idle tracing
  2017-04-07 14:53     ` Steven Rostedt
@ 2017-04-07 15:09       ` Paul E. McKenney
  2017-04-07 15:29         ` Steven Rostedt
  0 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2017-04-07 15:09 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, Apr 07, 2017 at 10:53:16AM -0400, Steven Rostedt wrote:
> On Fri, 7 Apr 2017 07:40:11 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Fri, Apr 07, 2017 at 10:01:11AM -0400, Steven Rostedt wrote:
> > > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> > > 
> > > The tracing subsystem started using rcu_irq_entry() and rcu_irq_exit()
> > > (with my blessing) to allow the current _rcuidle alternative tracepoint
> > > name to be dispensed with while still maintaining good performance.
> > > Unfortunately, this causes RCU's dyntick-idle entry code's tracing to
> > > appear to RCU like an interrupt that occurs where RCU is not designed
> > > to handle interrupts.
> > > 
> > > This commit fixes this problem by moving the zeroing of ->dynticks_nesting
> > > after the offending trace_rcu_dyntick() statement, which narrows the
> > > window of vulnerability to a pair of adjacent statements that are now
> > > marked with comments to that effect.
> > > 
> > > Link: http://lkml.kernel.org/r/20170405193928.GM1600@linux.vnet.ibm.com
> > > 
> > > Reported-by: Steven Rostedt <rostedt@goodmis.org>
> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> > > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> > > ---
> > >  kernel/rcu/tree.c | 48 +++++++++++++++++++++++-------------------------
> > >  1 file changed, 23 insertions(+), 25 deletions(-)
> > > 
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 50fee7689e71..8b4d273331e4 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -57,6 +57,7 @@
> > >  #include <linux/random.h>
> > >  #include <linux/trace_events.h>
> > >  #include <linux/suspend.h>
> > > +#include <linux/ftrace.h>
> > > 
> > >  #include "tree.h"
> > >  #include "rcu.h"
> > > @@ -771,25 +772,24 @@ cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
> > >  }
> > > 
> > >  /*
> > > - * rcu_eqs_enter_common - current CPU is moving towards extended quiescent state
> > > + * rcu_eqs_enter_common - current CPU is entering an extended quiescent state
> > >   *
> > > - * If the new value of the ->dynticks_nesting counter now is zero,
> > > - * we really have entered idle, and must do the appropriate accounting.
> > > - * The caller must have disabled interrupts.
> > > + * Enter idle, doing appropriate accounting.  The caller must have
> > > + * disabled interrupts.
> > >   */
> > > -static void rcu_eqs_enter_common(long long oldval, bool user)
> > > +static void rcu_eqs_enter_common(bool user)
> > >  {
> > >  	struct rcu_state *rsp;
> > >  	struct rcu_data *rdp;
> > > -	RCU_TRACE(struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);)
> > > +	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> > > 
> > > -	trace_rcu_dyntick(TPS("Start"), oldval, rdtp->dynticks_nesting);
> > > +	trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0);
> > >  	if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> > >  	    !user && !is_idle_task(current)) {
> > >  		struct task_struct *idle __maybe_unused =
> > >  			idle_task(smp_processor_id());
> > > 
> > > -		trace_rcu_dyntick(TPS("Error on entry: not idle task"), oldval, 0);
> > > +		trace_rcu_dyntick(TPS("Error on entry: not idle task"), rdtp->dynticks_nesting, 0);
> > >  		rcu_ftrace_dump(DUMP_ORIG);
> > >  		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
> > >  			  current->pid, current->comm,
> > > @@ -800,7 +800,10 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
> > >  		do_nocb_deferred_wakeup(rdp);
> > >  	}
> > >  	rcu_prepare_for_idle();
> > > -	rcu_dynticks_eqs_enter();
> > > +	stack_tracer_disable();
> > > +	rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
> > > +	rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
> > > +	stack_tracer_enable();  
> > 
> > Hmmm...  There is not supposed to be any tracing in this interval,
> 
> Why not? Function tracing happens without an issue. But then again,
> function tracing doesn't depend on RCU.
> 
> > and interrupts are disabled.  Wouldn't it be better to have something
> > that made tracing illegal during this interval?
> 
> I don't see an issue here. Function tracing is fine. There should be no
> trace_events() as those are static events and shouldn't dynamically
> appear in this interval.
> 
> The problem I hit is that stack tracing uses function tracing to check
> the stack usage of all functions. It doesn't need RCU either, unless it
> hits a new "max stack", in which case it calls save_stack_trace(), which
> does a lot, including an rcu_read_lock(), and that is what broke.
> 
> I'm fine with tracing, as that works. What doesn't work is tracing a
> new max stack.
> 
> > 
> > Yeah, I am a bit concerned about idle-entry latency...
> > 
> 
> Which should now be fine because of the inlined this_cpu_inc/dec(),
> which are very efficient and made for fast paths like this.

OK, I am willing to let people complain if they can measure the
difference.  If they can measure the difference, there are some things
we can do.

						Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable()
  2017-04-07 14:58   ` Steven Rostedt
@ 2017-04-07 15:11     ` Paul E. McKenney
  2017-04-07 15:28       ` Steven Rostedt
  0 siblings, 1 reply; 39+ messages in thread
From: Paul E. McKenney @ 2017-04-07 15:11 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, Apr 07, 2017 at 10:58:26AM -0400, Steven Rostedt wrote:
> On Fri, 7 Apr 2017 07:43:35 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Fri, Apr 07, 2017 at 10:01:06AM -0400, Steven Rostedt wrote:
> > > Paul,
> > > 
> > > Here's my latest. You OK with it?  
> > 
> > Given your update to 3/5, I suspect that we could live with it.  I am
> > expecting some complaints about increases in idle-entry latency, but might
> > be best to wait for complaints rather than complexifying too proactively.
> 
> We only added a this_cpu_inc() and this_cpu_dec(), which are very fast
> operations. I highly doubt it will be measurable. Although I'm talking
> about x86; IIRC, the this_cpu_inc/dec were poorly implemented for other
> archs in the past. I'm not sure if that was fixed though.

That is an issue for CPUs that don't have a to-memory increment
instruction.  How about __this_cpu_inc() and __this_cpu_dec(), given
that preemption is disabled?
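
Roughly, the generic fallbacks differ like this (a simplified sketch of
the idea only; the real macros live in include/asm-generic/percpu.h and
are more involved):

	/*
	 * this_cpu_inc(): must be irq-safe on archs without a single
	 * to-memory increment, so the generic version brackets the
	 * read-modify-write with local_irq_save()/local_irq_restore().
	 */
	#define this_cpu_inc_sketch(pcp)		\
	do {						\
		unsigned long __flags;			\
		local_irq_save(__flags);		\
		*raw_cpu_ptr(&(pcp)) += 1;		\
		local_irq_restore(__flags);		\
	} while (0)

	/*
	 * __this_cpu_inc(): the caller promises preemption is already
	 * disabled (and that any interrupt leaves the variable balanced),
	 * so the generic version is just a plain read-modify-write.
	 */
	#define __this_cpu_inc_sketch(pcp)		\
	do {						\
		*raw_cpu_ptr(&(pcp)) += 1;		\
	} while (0)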

> > That said, there isn't supposed to be any tracing during the now very
> > small interval where RCU's idle-entry is incomplete.  Mightn't it be
> > better to (under CONFIG_PROVE_RCU or some such) give splats if tracing
> > showed up in that interval?
> 
> Again, tracing is not the issue. I do function tracing in that location
> without any problems. The issue here was the stack tracer.
> 
> Maybe we can create a new variable that is more cache local to the RCU
> code.
> 
> What about calling it "rcu_disabled"? Then tracing that depends on RCU
> can simply check that.
> 
> s/stack_trace_disable/disable_rcu/
> s/stack_trace_enable/enable_rcu/
> 
> export a per cpu variable rcu_disabled
> 
> Then I can have the stack tracer check that variable as well. And we
> could even put in a WARN_ON(this_cpu_read(rcu_disabled)) in the
> TRACE_EVENT() macros.
> 
> Thoughts?

At this point, if you can use the "__" versions, the change should be
small.  With that change, if no one else complains, I am OK.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable()
  2017-04-07 15:11     ` Paul E. McKenney
@ 2017-04-07 15:28       ` Steven Rostedt
  0 siblings, 0 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 15:28 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, 7 Apr 2017 08:11:46 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Fri, Apr 07, 2017 at 10:58:26AM -0400, Steven Rostedt wrote:
> > On Fri, 7 Apr 2017 07:43:35 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > On Fri, Apr 07, 2017 at 10:01:06AM -0400, Steven Rostedt wrote:  
> > > > Paul,
> > > > 
> > > > Here's my latest. You OK with it?    
> > > 
> > > Given your update to 3/5, I suspect that we could live with it.  I am
> > > expecting some complaints about increases in idle-entry latency, but might
> > > be best to wait for complaints rather than complexifying too proactively.  
> > 
> > We only added a this_cpu_inc() and this_cpu_dec(), which are very fast
> > operations. I highly doubt it will be measurable. Although I'm talking
> > about x86; IIRC, the this_cpu_inc/dec were poorly implemented for other
> > archs in the past. I'm not sure if that was fixed though.  
> 
> That is an issue for CPUs that don't have a to-memory increment
> instruction.  How about __this_cpu_inc() and __this_cpu_dec(), given
> that preemption is disabled?

Ah, so the issue still exists. I remember complaining to Christoph
Lameter about this. Yeah, switching to __this_cpu_inc() works as well.


> 
> > > That said, there isn't supposed to be any tracing during the now very
> > > small interval where RCU's idle-entry is incomplete.  Mightn't it be
> > > better to (under CONFIG_PROVE_RCU or some such) give splats if tracing
> > > showed up in that interval?  
> > 
> > Again, tracing is not the issue. I do function tracing in that location
> > without any problems. The issue here was the stack tracer.
> > 
> > Maybe we can create a new variable that is more cache local to the RCU
> > code.
> > 
> > What about calling it "rcu_disabled"? Then tracing that depends on RCU
> > can simply check that.
> > 
> > s/stack_trace_disable/disable_rcu/
> > s/stack_trace_enable/enable_rcu/
> > 
> > export a per cpu variable rcu_disabled
> > 
> > Then I can have the stack tracer check that variable as well. And we
> > could even put in a WARN_ON(this_cpu_read(rcu_disabled)) in the
> > TRACE_EVENT() macros.
> > 
> > Thoughts?  
> 
> At this point, if you can use the "__" versions, the change should be
> small.  With that change, if no one else complains, I am OK.

Yeah, I was thinking this_cpu_inc() was the same as __this_cpu_inc()
(I simply forgot about the existence of __this_cpu_inc(); otherwise I
would have used it from the beginning).

But that said, I wonder if an rcu_disabled variable would make more
sense. I kinda feel we are currently doing this backwards. The real
state change is that RCU is currently doing something that will
definitely break any rcu_irq_enter() call. Anything calling
rcu_irq_enter() to wake RCU up should be informed that it doesn't
currently work. Having a way for RCU to tell others that
"rcu_irq_enter() won't do anything right now" seems more appropriate to
me, as we can then do checks in the trace_event_rcuidle() code to do
the same thing.

Having RCU single out just the stack tracer seems backward to me.

-- Steve

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 5/5 v2] rcu: Fix dyntick-idle tracing
  2017-04-07 15:09       ` Paul E. McKenney
@ 2017-04-07 15:29         ` Steven Rostedt
  0 siblings, 0 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 15:29 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, 7 Apr 2017 08:09:22 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
   
> > > 
> > > Yeah, I am a bit concerned about idle-entry latency...
> > >   
> > 
> > Which should now be fine because of the inlined this_cpu_inc/dec(),
> > which are very efficient and made for fast paths like this.
> 
> OK, I am willing to let people complain if they can measure the
> difference.  If they can measure the difference, there are some things
> we can do.

Again, I was getting this_cpu_inc/dec confused with __this_cpu_inc/dec.
I'll update my patch to fix that.

-- Steve

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 6/5]rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work
  2017-04-07 14:01 [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
                   ` (5 preceding siblings ...)
  2017-04-07 14:43 ` [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Paul E. McKenney
@ 2017-04-07 16:35 ` Steven Rostedt
  2017-04-07 16:42   ` Paul E. McKenney
  2017-04-07 17:06 ` [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events Steven Rostedt
  7 siblings, 1 reply; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 16:35 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Tracing uses rcu_irq_enter() as a way to make sure that RCU is watching when
it needs to use rcu_read_lock() and friends. This is because tracing can
happen as RCU is about to enter user space, or about to go idle, and RCU
does not watch for RCU read side critical sections as it makes the
transition.

There is a small location within the RCU infrastructure that rcu_irq_enter()
itself will not work. If tracing were to occur in that section it will break
if it tries to use rcu_irq_enter().

Originally, this happens with the stack_tracer, because it will call
save_stack_trace when it encounters stack usage that is greater than any
stack usage it had encountered previously. There was a case where that
happened in the RCU section where rcu_irq_enter() did not work, and lockdep
complained loudly about it. To fix it, stack tracing added a call to be
disabled and RCU would disable stack tracing during the critical section
that rcu_irq_enter() was inoperable. This solution worked, but there are
other cases that use rcu_irq_enter() and it would be a good idea to let RCU
give a way to let others know that rcu_irq_enter() will not work. For
example, in trace events.

Another helpful aspect of this change is that it also moves the per cpu
variable called in the RCU critical section into a cache locale along with
other RCU per cpu variables used in that same location.

I'm keeping the stack_trace_disable() code, as that still could be used in
the future by places that really need to disable it. And since it's only a
static inline, it wont take up any kernel text if it is not used.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/rcupdate.h   |  5 +++++
 kernel/rcu/tree.c          | 18 ++++++++++++++++--
 kernel/trace/trace_stack.c |  8 ++++++++
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index de88b33..d922ee7 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -97,6 +97,7 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
 			       unsigned long secs,
 			       unsigned long c_old,
 			       unsigned long c);
+bool rcu_is_disabled(void);
 #else
 static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
 					  int *flags,
@@ -113,6 +114,10 @@ static inline void rcutorture_record_test_transition(void)
 static inline void rcutorture_record_progress(unsigned long vernum)
 {
 }
+bool rcu_is_disabled(void)
+{
+	return false;
+}
 #ifdef CONFIG_RCU_TRACE
 void do_trace_rcu_torture_read(const char *rcutorturename,
 			       struct rcu_head *rhp,
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8b4d273..727325f 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -285,6 +285,20 @@ static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
 };
 
 /*
+ * There's a few places, currently just in the tracing infrastructure,
+ * that uses rcu_irq_enter() to make sure RCU is watching. But there's
+ * a small location where that will not even work. In those cases
+ * rcu_is_disabled() needs to be checked to make sure rcu_irq_enter()
+ * can be called.
+ */
+static DEFINE_PER_CPU(bool, rcu_disabled);
+
+bool rcu_is_disabled(void)
+{
+	return this_cpu_read(rcu_disabled);
+}
+
+/*
  * Record entry into an extended quiescent state.  This is only to be
  * called when not already in an extended quiescent state.
  */
@@ -800,10 +814,10 @@ static void rcu_eqs_enter_common(bool user)
 		do_nocb_deferred_wakeup(rdp);
 	}
 	rcu_prepare_for_idle();
-	stack_tracer_disable();
+	__this_cpu_inc(rcu_disabled);
 	rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
 	rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
-	stack_tracer_enable();
+	__this_cpu_dec(rcu_disabled);
 	rcu_dynticks_task_enter();
 
 	/*
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index f2f02ff..aec2df1 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -96,6 +96,14 @@ check_stack(unsigned long ip, unsigned long *stack)
 	if (in_nmi())
 		return;
 
+	/*
+	 * There's a slight chance that we are tracing inside the
+	 * RCU infrastructure, and rcu_irq_enter() will not work
+	 * as expected.
+	 */
+	if (unlikely(rcu_is_disabled()))
+		return;
+
 	local_irq_save(flags);
 	arch_spin_lock(&stack_trace_max_lock);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/5]rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work
  2017-04-07 16:35 ` [PATCH 6/5]rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work Steven Rostedt
@ 2017-04-07 16:42   ` Paul E. McKenney
  2017-04-07 16:44     ` Steven Rostedt
  2017-04-07 17:03     ` [PATCH 6/5 v2] rcu/tracing: " Steven Rostedt
  0 siblings, 2 replies; 39+ messages in thread
From: Paul E. McKenney @ 2017-04-07 16:42 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, Apr 07, 2017 at 12:35:16PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Tracing uses rcu_irq_enter() as a way to make sure that RCU is watching when
> it needs to use rcu_read_lock() and friends. This is because tracing can
> happen as RCU is about to enter user space, or about to go idle, and RCU
> does not watch for RCU read side critical sections as it makes the
> transition.
> 
> There is a small location within the RCU infrastructure that rcu_irq_enter()
> itself will not work. If tracing were to occur in that section it will break
> if it tries to use rcu_irq_enter().
> 
> Originally, this happens with the stack_tracer, because it will call
> save_stack_trace when it encounters stack usage that is greater than any
> stack usage it had encountered previously. There was a case where that
> happened in the RCU section where rcu_irq_enter() did not work, and lockdep
> complained loudly about it. To fix it, stack tracing added a call to be
> disabled and RCU would disable stack tracing during the critical section
> that rcu_irq_enter() was inoperable. This solution worked, but there are
> other cases that use rcu_irq_enter() and it would be a good idea to let RCU
> give a way to let others know that rcu_irq_enter() will not work. For
> example, in trace events.
> 
> Another helpful aspect of this change is that it also moves the per cpu
> variable called in the RCU critical section into a cache locale along with
> other RCU per cpu variables used in that same location.
> 
> I'm keeping the stack_trace_disable() code, as that still could be used in
> the future by places that really need to disable it. And since it's only a
> static inline, it wont take up any kernel text if it is not used.
> 
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Looks better, especially __this_cpu_inc() and __this_cpu_dec().

How about rcu_irq_enter_disabled instead of rcu_disabled?  We aren't
really disabling all of RCU.  ;-)

							Thanx, Paul

> ---
>  include/linux/rcupdate.h   |  5 +++++
>  kernel/rcu/tree.c          | 18 ++++++++++++++++--
>  kernel/trace/trace_stack.c |  8 ++++++++
>  3 files changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index de88b33..d922ee7 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -97,6 +97,7 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
>  			       unsigned long secs,
>  			       unsigned long c_old,
>  			       unsigned long c);
> +bool rcu_is_disabled(void);
>  #else
>  static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
>  					  int *flags,
> @@ -113,6 +114,10 @@ static inline void rcutorture_record_test_transition(void)
>  static inline void rcutorture_record_progress(unsigned long vernum)
>  {
>  }
> +bool rcu_is_disabled(void)
> +{
> +	return false;
> +}
>  #ifdef CONFIG_RCU_TRACE
>  void do_trace_rcu_torture_read(const char *rcutorturename,
>  			       struct rcu_head *rhp,
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 8b4d273..727325f 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -285,6 +285,20 @@ static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
>  };
> 
>  /*
> + * There's a few places, currently just in the tracing infrastructure,
> + * that uses rcu_irq_enter() to make sure RCU is watching. But there's
> + * a small location where that will not even work. In those cases
> + * rcu_is_disabled() needs to be checked to make sure rcu_irq_enter()
> + * can be called.
> + */
> +static DEFINE_PER_CPU(bool, rcu_disabled);
> +
> +bool rcu_is_disabled(void)
> +{
> +	return this_cpu_read(rcu_disabled);
> +}
> +
> +/*
>   * Record entry into an extended quiescent state.  This is only to be
>   * called when not already in an extended quiescent state.
>   */
> @@ -800,10 +814,10 @@ static void rcu_eqs_enter_common(bool user)
>  		do_nocb_deferred_wakeup(rdp);
>  	}
>  	rcu_prepare_for_idle();
> -	stack_tracer_disable();
> +	__this_cpu_inc(rcu_disabled);
>  	rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
>  	rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
> -	stack_tracer_enable();
> +	__this_cpu_dec(rcu_disabled);
>  	rcu_dynticks_task_enter();
> 
>  	/*
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index f2f02ff..aec2df1 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -96,6 +96,14 @@ check_stack(unsigned long ip, unsigned long *stack)
>  	if (in_nmi())
>  		return;
> 
> +	/*
> +	 * There's a slight chance that we are tracing inside the
> +	 * RCU infrastructure, and rcu_irq_enter() will not work
> +	 * as expected.
> +	 */
> +	if (unlikely(rcu_is_disabled()))
> +		return;
> +
>  	local_irq_save(flags);
>  	arch_spin_lock(&stack_trace_max_lock);
> 
> -- 
> 2.9.3
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/5]rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work
  2017-04-07 16:42   ` Paul E. McKenney
@ 2017-04-07 16:44     ` Steven Rostedt
  2017-04-07 16:53       ` Paul E. McKenney
  2017-04-07 17:03     ` [PATCH 6/5 v2] rcu/tracing: " Steven Rostedt
  1 sibling, 1 reply; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 16:44 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, 7 Apr 2017 09:42:20 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Fri, Apr 07, 2017 at 12:35:16PM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > 
> > Tracing uses rcu_irq_enter() as a way to make sure that RCU is watching when
> > it needs to use rcu_read_lock() and friends. This is because tracing can
> > happen as RCU is about to enter user space, or about to go idle, and RCU
> > does not watch for RCU read side critical sections as it makes the
> > transition.
> > 
> > There is a small location within the RCU infrastructure that rcu_irq_enter()
> > itself will not work. If tracing were to occur in that section it will break
> > if it tries to use rcu_irq_enter().
> > 
> > Originally, this happens with the stack_tracer, because it will call
> > save_stack_trace when it encounters stack usage that is greater than any
> > stack usage it had encountered previously. There was a case where that
> > happened in the RCU section where rcu_irq_enter() did not work, and lockdep
> > complained loudly about it. To fix it, stack tracing added a call to be
> > disabled and RCU would disable stack tracing during the critical section
> > that rcu_irq_enter() was inoperable. This solution worked, but there are
> > other cases that use rcu_irq_enter() and it would be a good idea to let RCU
> > give a way to let others know that rcu_irq_enter() will not work. For
> > example, in trace events.
> > 
> > Another helpful aspect of this change is that it also moves the per cpu
> > variable called in the RCU critical section into a cache locale along with
> > other RCU per cpu variables used in that same location.
> > 
> > I'm keeping the stack_trace_disable() code, as that still could be used in
> > the future by places that really need to disable it. And since it's only a
> > static inline, it wont take up any kernel text if it is not used.
> > 
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>  
> 
> Looks better, especially __this_cpu_inc() and __this_cpu_dec().
> 
> How about rcu_irq_enter_disabled instead of rcu_disabled?  We aren't
> really disabling all of RCU.  ;-)

OK, I'll make the update and send a 6.1 (and also a new 7/5)!

-- Steve

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/5]rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work
  2017-04-07 16:44     ` Steven Rostedt
@ 2017-04-07 16:53       ` Paul E. McKenney
  0 siblings, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2017-04-07 16:53 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, Apr 07, 2017 at 12:44:40PM -0400, Steven Rostedt wrote:
> On Fri, 7 Apr 2017 09:42:20 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Fri, Apr 07, 2017 at 12:35:16PM -0400, Steven Rostedt wrote:
> > > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > > 
> > > Tracing uses rcu_irq_enter() as a way to make sure that RCU is watching when
> > > it needs to use rcu_read_lock() and friends. This is because tracing can
> > > happen as RCU is about to enter user space, or about to go idle, and RCU
> > > does not watch for RCU read side critical sections as it makes the
> > > transition.
> > > 
> > > There is a small location within the RCU infrastructure that rcu_irq_enter()
> > > itself will not work. If tracing were to occur in that section it will break
> > > if it tries to use rcu_irq_enter().
> > > 
> > > Originally, this happens with the stack_tracer, because it will call
> > > save_stack_trace when it encounters stack usage that is greater than any
> > > stack usage it had encountered previously. There was a case where that
> > > happened in the RCU section where rcu_irq_enter() did not work, and lockdep
> > > complained loudly about it. To fix it, stack tracing added a call to be
> > > disabled and RCU would disable stack tracing during the critical section
> > > that rcu_irq_enter() was inoperable. This solution worked, but there are
> > > other cases that use rcu_irq_enter() and it would be a good idea to let RCU
> > > give a way to let others know that rcu_irq_enter() will not work. For
> > > example, in trace events.
> > > 
> > > Another helpful aspect of this change is that it also moves the per cpu
> > > variable called in the RCU critical section into a cache locale along with
> > > other RCU per cpu variables used in that same location.
> > > 
> > > I'm keeping the stack_trace_disable() code, as that still could be used in
> > > the future by places that really need to disable it. And since it's only a
> > > static inline, it wont take up any kernel text if it is not used.
> > > 
> > > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>  
> > 
> > Looks better, especially __this_cpu_inc() and __this_cpu_dec().
> > 
> > How about rcu_irq_enter_disabled instead of rcu_disabled?  We aren't
> > really disabling all of RCU.  ;-)
> 
> OK, I'll make the update and send a 6.1 (and also a new 7/5)!

So the ex-Borg member was really 9 of 7 rather than 7 of 9, then?  ;-)

							Thanx, Paul

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 6/5 v2] rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work
  2017-04-07 16:42   ` Paul E. McKenney
  2017-04-07 16:44     ` Steven Rostedt
@ 2017-04-07 17:03     ` Steven Rostedt
  2017-04-07 17:15       ` Paul E. McKenney
  1 sibling, 1 reply; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 17:03 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Tracing uses rcu_irq_enter() as a way to make sure that RCU is watching when
it needs to use rcu_read_lock() and friends. This is because tracing can
happen as RCU is about to enter user space, or about to go idle, and RCU
does not watch for RCU read side critical sections as it makes the
transition.

There is a small location within the RCU infrastructure that rcu_irq_enter()
itself will not work. If tracing were to occur in that section it will break
if it tries to use rcu_irq_enter().

Originally, this happens with the stack_tracer, because it will call
save_stack_trace when it encounters stack usage that is greater than any
stack usage it had encountered previously. There was a case where that
happened in the RCU section where rcu_irq_enter() did not work, and lockdep
complained loudly about it. To fix it, stack tracing added a call to be
disabled and RCU would disable stack tracing during the critical section
that rcu_irq_enter() was inoperable. This solution worked, but there are
other cases that use rcu_irq_enter() and it would be a good idea to let RCU
give a way to let others know that rcu_irq_enter() will not work. For
example, in trace events.

Another helpful aspect of this change is that it also moves the per cpu
variable called in the RCU critical section into a cache locale along with
other RCU per cpu variables used in that same location.

I'm keeping the stack_trace_disable() code, as that still could be used in
the future by places that really need to disable it. And since it's only a
static inline, it wont take up any kernel text if it is not used.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/rcupdate.h   |  5 +++++
 kernel/rcu/tree.c          | 18 ++++++++++++++++--
 kernel/trace/trace_stack.c |  8 ++++++++
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index de88b33..b096685 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -97,6 +97,7 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
 			       unsigned long secs,
 			       unsigned long c_old,
 			       unsigned long c);
+bool rcu_irq_enter_disabled(void);
 #else
 static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
 					  int *flags,
@@ -113,6 +114,10 @@ static inline void rcutorture_record_test_transition(void)
 static inline void rcutorture_record_progress(unsigned long vernum)
 {
 }
+bool rcu_irq_enter_disabled(void)
+{
+	return false;
+}
 #ifdef CONFIG_RCU_TRACE
 void do_trace_rcu_torture_read(const char *rcutorturename,
 			       struct rcu_head *rhp,
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8b4d273..a6dcf3b 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -285,6 +285,20 @@ static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
 };
 
 /*
+ * There's a few places, currently just in the tracing infrastructure,
+ * that uses rcu_irq_enter() to make sure RCU is watching. But there's
+ * a small location where that will not even work. In those cases
+ * rcu_irq_enter_disabled() needs to be checked to make sure rcu_irq_enter()
+ * can be called.
+ */
+static DEFINE_PER_CPU(bool, disable_rcu_irq_enter);
+
+bool rcu_irq_enter_disabled(void)
+{
+	return this_cpu_read(disable_rcu_irq_enter);
+}
+
+/*
  * Record entry into an extended quiescent state.  This is only to be
  * called when not already in an extended quiescent state.
  */
@@ -800,10 +814,10 @@ static void rcu_eqs_enter_common(bool user)
 		do_nocb_deferred_wakeup(rdp);
 	}
 	rcu_prepare_for_idle();
-	stack_tracer_disable();
+	__this_cpu_inc(disable_rcu_irq_enter);
 	rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
 	rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
-	stack_tracer_enable();
+	__this_cpu_dec(disable_rcu_irq_enter);
 	rcu_dynticks_task_enter();
 
 	/*
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index f2f02ff..76aa04d 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -96,6 +96,14 @@ check_stack(unsigned long ip, unsigned long *stack)
 	if (in_nmi())
 		return;
 
+	/*
+	 * There's a slight chance that we are tracing inside the
+	 * RCU infrastructure, and rcu_irq_enter() will not work
+	 * as expected.
+	 */
+	if (unlikely(rcu_irq_enter_disabled()))
+		return;
+
 	local_irq_save(flags);
 	arch_spin_lock(&stack_trace_max_lock);
 
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 14:01 [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
                   ` (6 preceding siblings ...)
  2017-04-07 16:35 ` [PATCH 6/5]rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work Steven Rostedt
@ 2017-04-07 17:06 ` Steven Rostedt
  2017-04-07 17:15   ` Paul E. McKenney
  2017-04-07 17:19   ` Mathieu Desnoyers
  7 siblings, 2 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 17:06 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney, Mathieu Desnoyers

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Stack tracing discovered that there's a small location inside the RCU
infrastructure that calling rcu_irq_enter() does not work. As trace events
use rcu_irq_enter() it must make sure that it is functionable. A check
against rcu_irq_enter_disabled() is added with a WARN_ON_ONCE() as no trace
event should ever be used in that part of RCU. If the warning is triggered,
then the trace event is ignored.

Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/tracepoint.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index f72fcfe..8baef96 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -159,6 +159,8 @@ extern void syscall_unregfunc(void);
 				TP_PROTO(data_proto),			\
 				TP_ARGS(data_args),			\
 				TP_CONDITION(cond),			\
+				if (WARN_ON_ONCE(rcu_irq_enter_disabled())) \
+					return;				\
 				rcu_irq_enter_irqson(),			\
 				rcu_irq_exit_irqson());			\
 	}
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [PATCH 6/5 v2] rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work
  2017-04-07 17:03     ` [PATCH 6/5 v2] rcu/tracing: " Steven Rostedt
@ 2017-04-07 17:15       ` Paul E. McKenney
  0 siblings, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2017-04-07 17:15 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Fri, Apr 07, 2017 at 01:03:15PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Tracing uses rcu_irq_enter() as a way to make sure that RCU is watching when
> it needs to use rcu_read_lock() and friends. This is because tracing can
> happen as RCU is about to enter user space, or about to go idle, and RCU
> does not watch for RCU read side critical sections as it makes the
> transition.
> 
> There is a small location within the RCU infrastructure that rcu_irq_enter()
> itself will not work. If tracing were to occur in that section it will break
> if it tries to use rcu_irq_enter().
> 
> Originally, this happens with the stack_tracer, because it will call
> save_stack_trace when it encounters stack usage that is greater than any
> stack usage it had encountered previously. There was a case where that
> happened in the RCU section where rcu_irq_enter() did not work, and lockdep
> complained loudly about it. To fix it, stack tracing added a call to be
> disabled and RCU would disable stack tracing during the critical section
> that rcu_irq_enter() was inoperable. This solution worked, but there are
> other cases that use rcu_irq_enter() and it would be a good idea to let RCU
> give a way to let others know that rcu_irq_enter() will not work. For
> example, in trace events.
> 
> Another helpful aspect of this change is that it also moves the per cpu
> variable called in the RCU critical section into a cache locale along with
> other RCU per cpu variables used in that same location.
> 
> I'm keeping the stack_trace_disable() code, as that still could be used in
> the future by places that really need to disable it. And since it's only a
> static inline, it wont take up any kernel text if it is not used.
> 
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Looks good!

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  include/linux/rcupdate.h   |  5 +++++
>  kernel/rcu/tree.c          | 18 ++++++++++++++++--
>  kernel/trace/trace_stack.c |  8 ++++++++
>  3 files changed, 29 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index de88b33..b096685 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -97,6 +97,7 @@ void do_trace_rcu_torture_read(const char *rcutorturename,
>  			       unsigned long secs,
>  			       unsigned long c_old,
>  			       unsigned long c);
> +bool rcu_irq_enter_disabled(void);
>  #else
>  static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
>  					  int *flags,
> @@ -113,6 +114,10 @@ static inline void rcutorture_record_test_transition(void)
>  static inline void rcutorture_record_progress(unsigned long vernum)
>  {
>  }
> +bool rcu_irq_enter_disabled(void)
> +{
> +	return false;
> +}
>  #ifdef CONFIG_RCU_TRACE
>  void do_trace_rcu_torture_read(const char *rcutorturename,
>  			       struct rcu_head *rhp,
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 8b4d273..a6dcf3b 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -285,6 +285,20 @@ static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
>  };
> 
>  /*
> + * There's a few places, currently just in the tracing infrastructure,
> + * that uses rcu_irq_enter() to make sure RCU is watching. But there's
> + * a small location where that will not even work. In those cases
> + * rcu_irq_enter_disabled() needs to be checked to make sure rcu_irq_enter()
> + * can be called.
> + */
> +static DEFINE_PER_CPU(bool, disable_rcu_irq_enter);
> +
> +bool rcu_irq_enter_disabled(void)
> +{
> +	return this_cpu_read(disable_rcu_irq_enter);
> +}
> +
> +/*
>   * Record entry into an extended quiescent state.  This is only to be
>   * called when not already in an extended quiescent state.
>   */
> @@ -800,10 +814,10 @@ static void rcu_eqs_enter_common(bool user)
>  		do_nocb_deferred_wakeup(rdp);
>  	}
>  	rcu_prepare_for_idle();
> -	stack_tracer_disable();
> +	__this_cpu_inc(disable_rcu_irq_enter);
>  	rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
>  	rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
> -	stack_tracer_enable();
> +	__this_cpu_dec(disable_rcu_irq_enter);
>  	rcu_dynticks_task_enter();
> 
>  	/*
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index f2f02ff..76aa04d 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -96,6 +96,14 @@ check_stack(unsigned long ip, unsigned long *stack)
>  	if (in_nmi())
>  		return;
> 
> +	/*
> +	 * There's a slight chance that we are tracing inside the
> +	 * RCU infrastructure, and rcu_irq_enter() will not work
> +	 * as expected.
> +	 */
> +	if (unlikely(rcu_irq_enter_disabled()))
> +		return;
> +
>  	local_irq_save(flags);
>  	arch_spin_lock(&stack_trace_max_lock);
> 
> -- 
> 2.9.3
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 17:06 ` [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events Steven Rostedt
@ 2017-04-07 17:15   ` Paul E. McKenney
  2017-04-07 17:19   ` Mathieu Desnoyers
  1 sibling, 0 replies; 39+ messages in thread
From: Paul E. McKenney @ 2017-04-07 17:15 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Mathieu Desnoyers

On Fri, Apr 07, 2017 at 01:06:15PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Stack tracing discovered that there's a small location inside the RCU
> infrastructure that calling rcu_irq_enter() does not work. As trace events
> use rcu_irq_enter() it must make sure that it is functionable. A check
> against rcu_irq_enter_disabled() is added with a WARN_ON_ONCE() as no trace
> event should ever be used in that part of RCU. If the warning is triggered,
> then the trace event is ignored.
> 
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
>  include/linux/tracepoint.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index f72fcfe..8baef96 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -159,6 +159,8 @@ extern void syscall_unregfunc(void);
>  				TP_PROTO(data_proto),			\
>  				TP_ARGS(data_args),			\
>  				TP_CONDITION(cond),			\
> +				if (WARN_ON_ONCE(rcu_irq_enter_disabled())) \
> +					return;				\
>  				rcu_irq_enter_irqson(),			\
>  				rcu_irq_exit_irqson());			\
>  	}
> -- 
> 2.9.3
> 

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 17:06 ` [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events Steven Rostedt
  2017-04-07 17:15   ` Paul E. McKenney
@ 2017-04-07 17:19   ` Mathieu Desnoyers
  2017-04-07 17:26     ` Steven Rostedt
                       ` (2 more replies)
  1 sibling, 3 replies; 39+ messages in thread
From: Mathieu Desnoyers @ 2017-04-07 17:19 UTC (permalink / raw)
  To: rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney

----- On Apr 7, 2017, at 1:06 PM, rostedt rostedt@goodmis.org wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Stack tracing discovered that there's a small location inside the RCU
> infrastructure that calling rcu_irq_enter() does not work. As trace events

that -> where

Do you have a link to the lkml thread where this stack tracing discovery
happened ?

> use rcu_irq_enter() it must make sure that it is functionable. A check

I don't think functionable is the word you are looking for here. Perhaps
"must make sure that it can be invoked" ?

> against rcu_irq_enter_disabled() is added with a WARN_ON_ONCE() as no trace
> event should ever be used in that part of RCU. If the warning is triggered,
> then the trace event is ignored.
> 
> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
> include/linux/tracepoint.h | 2 ++
> 1 file changed, 2 insertions(+)
> 
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index f72fcfe..8baef96 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -159,6 +159,8 @@ extern void syscall_unregfunc(void);
> 				TP_PROTO(data_proto),			\
> 				TP_ARGS(data_args),			\
> 				TP_CONDITION(cond),			\
> +				if (WARN_ON_ONCE(rcu_irq_enter_disabled())) \
> +					return;				\

I must admit that it's a bit odd to have:

if (WARN_ON_ONCE(rcu_irq_enter_disabled()))
       return;
rcu_irq_enter_irqson()

as one argument to the __DO_TRACE() macro. To me it's a bit unexpected
coding-style wise. Am I the only one not comfortable with the proposed
syntax ?

Thanks,

Mathieu


> 				rcu_irq_enter_irqson(),			\
> 				rcu_irq_exit_irqson());			\
> 	}
> --
> 2.9.3

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 17:19   ` Mathieu Desnoyers
@ 2017-04-07 17:26     ` Steven Rostedt
  2017-04-07 17:32       ` Steven Rostedt
  2017-04-07 17:49       ` Mathieu Desnoyers
  2017-04-07 17:28     ` [PATCH 7/5] " Steven Rostedt
  2017-04-07 17:48     ` [PATCH 7/5 v2] " Steven Rostedt
  2 siblings, 2 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 17:26 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney

On Fri, 7 Apr 2017 17:19:05 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> ----- On Apr 7, 2017, at 1:06 PM, rostedt rostedt@goodmis.org wrote:
> 
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > 
> > Stack tracing discovered that there's a small location inside the RCU
> > infrastructure that calling rcu_irq_enter() does not work. As trace events  
> 
> that -> where

ok

> 
> Do you have a link to the lkml thread where this stack tracing discovery
> happened ?

Actually it's this thread. But here:

Version 1 of the patch series:
 http://lkml.kernel.org/r/20170406164237.874767449@goodmis.org

Version 2 of the patch series:
 http://lkml.kernel.org/r/20170407140106.051135969@goodmis.org

> 
> > use rcu_irq_enter() it must make sure that it is functionable. A check  
> 
> I don't think functionable is the word you are looking for here. Perhaps
> "must make sure that it can be invoked" ?
> 
> > against rcu_irq_enter_disabled() is added with a WARN_ON_ONCE() as no trace
> > event should ever be used in that part of RCU. If the warning is triggered,
> > then the trace event is ignored.
> > 
> > Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> > ---
> > include/linux/tracepoint.h | 2 ++
> > 1 file changed, 2 insertions(+)
> > 
> > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> > index f72fcfe..8baef96 100644
> > --- a/include/linux/tracepoint.h
> > +++ b/include/linux/tracepoint.h
> > @@ -159,6 +159,8 @@ extern void syscall_unregfunc(void);
> > 				TP_PROTO(data_proto),			\
> > 				TP_ARGS(data_args),			\
> > 				TP_CONDITION(cond),			\
> > +				if (WARN_ON_ONCE(rcu_irq_enter_disabled())) \
> > +					return;				\  
> 
> I must admit that it's a bit odd to have:
> 
> if (WARN_ON_ONCE(rcu_irq_enter_disabled()))
>        return;
> rcu_irq_enter_irqson()

Welcome to MACRO MAGIC!

> 
> as one argument to the __DO_TRACE() macro. To me it's a bit unexpected
> coding-style wise. Am I the only one not comfortable with the proposed
> syntax ?

The entire TRACE_EVENT()/__DO_TRACE() is special.

I thought about adding yet another parameter, but as it doesn't change
much, I figured this was good enough. We could break it up if you like:

#define RCU_IRQ_ENTER_CHECK \
	if (WARN_ON_ONCE(rcu_irq_enter_disabled()))	\
		return;					\
	rcu_irq_enter_irqson();

[..]
			__DO_TRACE(&__tracepoint_##name,		\
				TP_PROTO(data_proto),			\
				TP_ARGS(data_args),			\
				TP_CONDITION(cond),			\
				PARAMS(RCU_IRQ_ENTER_CHECK),		\
				rcu_irq_exit_irqson());			\


Would that make you feel more comfortable?
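
For reference, the reason a whole statement can be passed there at all
is that __DO_TRACE() pastes its prercu argument in as a statement
inside the generated trace_##name##_rcuidle() inline function, so the
early return just skips the tracepoint for that call. Roughly (a
simplified sketch, not the exact macro):

	#define __DO_TRACE(tp, proto, args, cond, prercu, postrcu)	\
	do {								\
		if (!(cond))						\
			return;						\
		/* the "if (...) return; rcu_irq_enter_irqson()"	\
		 * argument expands here, as a statement: */		\
		prercu;							\
		rcu_read_lock_sched_notrace();				\
		/* ... iterate over and call the probe functions ... */	\
		rcu_read_unlock_sched_notrace();			\
		postrcu;						\
	} while (0)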

-- Steve

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 17:19   ` Mathieu Desnoyers
  2017-04-07 17:26     ` Steven Rostedt
@ 2017-04-07 17:28     ` Steven Rostedt
  2017-04-07 17:48     ` [PATCH 7/5 v2] " Steven Rostedt
  2 siblings, 0 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 17:28 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney


> > use rcu_irq_enter() it must make sure that it is functionable. A check  
> 
> I don't think functionable is the word you are looking for here. Perhaps
> "must make sure that it can be invoked" ?

Well, rcu_irq_enter() doesn't function in that location. And it's in a
change log, not in the code. I think it gets the point across enough.

-- Steve


* Re: [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 17:26     ` Steven Rostedt
@ 2017-04-07 17:32       ` Steven Rostedt
  2017-04-07 17:49       ` Mathieu Desnoyers
  1 sibling, 0 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 17:32 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney

On Fri, 7 Apr 2017 13:26:13 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> > Do you have a link to the lkml thread where this stack tracing discovery
> > happened ?  
> 
> Actually it's this thread. But here:
> 
> Version 1 of the patch series:
>  http://lkml.kernel.org/r/20170406164237.874767449@goodmis.org
> 
> Version 2 of the patch series:
>  http://lkml.kernel.org/r/20170407140106.051135969@goodmis.org

Oh, and I did post a bug report on it. I should add this link in the
change log:

  http://lkml.kernel.org/r/20170405093207.404f8deb@gandalf.local.home

-- Steve


* [PATCH 7/5 v2] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 17:19   ` Mathieu Desnoyers
  2017-04-07 17:26     ` Steven Rostedt
  2017-04-07 17:28     ` [PATCH 7/5] " Steven Rostedt
@ 2017-04-07 17:48     ` Steven Rostedt
  2 siblings, 0 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 17:48 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Stack tracing discovered that there's a small location inside the RCU
infrastructure where calling rcu_irq_enter() does not work. As trace events
use rcu_irq_enter() it must make sure that it is functionable. A check
against rcu_irq_enter_disabled() is added with a WARN_ON_ONCE() as no trace
event should ever be used in that part of RCU. If the warning is triggered,
then the trace event is ignored.

Link: http://lkml.kernel.org/r/20170405093207.404f8deb@gandalf.local.home

Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
Mathieu, is this better? (yeah, I left in functionable, because I like
that word ;-)

 include/linux/tracepoint.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index f72fcfe..352f32a 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -151,7 +151,12 @@ extern void syscall_unregfunc(void);
 	} while (0)
 
 #ifndef MODULE
-#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args)	\
+#define TRACE_RCU_IRQ_ENTER_CHECK					\
+	if (WARN_ON_ONCE(rcu_irq_enter_disabled()))			\
+		return;							\
+	rcu_irq_enter_irqson()
+
+#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args) \
 	static inline void trace_##name##_rcuidle(proto)		\
 	{								\
 		if (static_key_false(&__tracepoint_##name.key))		\
@@ -159,7 +164,7 @@ extern void syscall_unregfunc(void);
 				TP_PROTO(data_proto),			\
 				TP_ARGS(data_args),			\
 				TP_CONDITION(cond),			\
-				rcu_irq_enter_irqson(),			\
+				PARAMS(TRACE_RCU_IRQ_ENTER_CHECK),	\
 				rcu_irq_exit_irqson());			\
 	}
 #else
-- 
2.9.3
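
(Aside: the patch leans on WARN_ON_ONCE() acting both as a test and as a
warn-only-once diagnostic. A rough userspace approximation of those
semantics, using a hypothetical MY_WARN_ON_ONCE and the GCC
statement-expression extension, not the kernel's actual implementation:

#include <stdio.h>

/* Evaluates to the truth value of cond; warns only the first time
 * it sees cond true at this call site. */
#define MY_WARN_ON_ONCE(cond) ({				\
	static int __warned;					\
	int __ret = !!(cond);					\
	if (__ret && !__warned) {				\
		__warned = 1;					\
		fprintf(stderr, "warning: %s\n", #cond);	\
	}							\
	__ret;							\
})

static int rcu_irq_enter_disabled_stub(void) { return 1; }

int main(void)
{
	for (int i = 0; i < 3; i++) {
		if (MY_WARN_ON_ONCE(rcu_irq_enter_disabled_stub()))
			continue;	/* the trace event would be ignored */
	}
	return 0;	/* only one warning was printed */
}

So the guard suppresses the trace event every time, but complains only once.)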


* Re: [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 17:26     ` Steven Rostedt
  2017-04-07 17:32       ` Steven Rostedt
@ 2017-04-07 17:49       ` Mathieu Desnoyers
  2017-04-07 17:55         ` Steven Rostedt
  2017-04-07 18:10         ` [PATCH 7/5 v3] " Steven Rostedt
  1 sibling, 2 replies; 39+ messages in thread
From: Mathieu Desnoyers @ 2017-04-07 17:49 UTC (permalink / raw)
  To: rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney

----- On Apr 7, 2017, at 1:26 PM, rostedt rostedt@goodmis.org wrote:

> On Fri, 7 Apr 2017 17:19:05 +0000 (UTC)
> Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> 
[...]
>> > ---
>> > include/linux/tracepoint.h | 2 ++
>> > 1 file changed, 2 insertions(+)
>> > 
>> > diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
>> > index f72fcfe..8baef96 100644
>> > --- a/include/linux/tracepoint.h
>> > +++ b/include/linux/tracepoint.h
>> > @@ -159,6 +159,8 @@ extern void syscall_unregfunc(void);
>> > 				TP_PROTO(data_proto),			\
>> > 				TP_ARGS(data_args),			\
>> > 				TP_CONDITION(cond),			\
>> > +				if (WARN_ON_ONCE(rcu_irq_enter_disabled())) \
>> > +					return;				\
>> 
>> I must admit that it's a bit odd to have:
>> 
>> if (WARN_ON_ONCE(rcu_irq_enter_disabled()))
>>        return;
>> rcu_irq_enter_irqson()
> 
> Welcome to MACRO MAGIC!
> 
>> 
>> as one argument to the __DO_TRACE() macro. To me it's a bit unexpected
>> coding-style wise. Am I the only one not comfortable with the proposed
>> syntax ?
> 
> The entire TRACE_EVENT()/__DO_TRACE() is special.
> 
> I thought about adding yet another parameter, but as it doesn't change
> much, I figured this was good enough. We could break it up if you like:
> 
> #define RCU_IRQ_ENTER_CHECK \
>	if (WARN_ON_ONCE(rcu_irq_enter_disabled()))	\
>		return;					\
>	rcu_irq_enter_irqson();
> 
> [..]
>			__DO_TRACE(&__tracepoint_##name,		\
>				TP_PROTO(data_proto),			\
>				TP_ARGS(data_args),			\
>				TP_CONDITION(cond),			\
>				PARAMS(RCU_IRQ_ENTER_CHECK),		\
>				rcu_irq_exit_irqson());			\
> 
> 
> Would that make you feel more comfortable?

No, it's almost worse: it still adds a return that applies within __DO_TRACE(),
but which is passed as an argument (code as macro argument), which I find really
unsettling.

I would prefer to add a new argument to __DO_TRACE, which we can call
"checkrcu", e.g.:

#define __DO_TRACE(tp, proto, args, cond, checkrcu, prercu, postrcu)    \
        do {                                                            \
                struct tracepoint_func *it_func_ptr;                    \
                void *it_func;                                          \
                void *__data;                                           \
                                                                        \
                if (!((cond) && (checkrcu)))                            \
                        return;                                         \
                prercu;                                                 \
                rcu_read_lock_sched_notrace();                          \
                it_func_ptr = rcu_dereference_sched((tp)->funcs);       \
                if (it_func_ptr) {                                      \
                        do {                                            \
                                it_func = (it_func_ptr)->func;          \
                                __data = (it_func_ptr)->data;           \
                                ((void(*)(proto))(it_func))(args);      \
                        } while ((++it_func_ptr)->func);                \
                }                                                       \
                rcu_read_unlock_sched_notrace();                        \
                postrcu;                                                \
        } while (0)

And use it like this:

#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args)     \
        static inline void trace_##name##_rcuidle(proto)                \
        {                                                               \
                if (static_key_false(&__tracepoint_##name.key))         \
                        __DO_TRACE(&__tracepoint_##name,                \
                                TP_PROTO(data_proto),                   \
                                TP_ARGS(data_args),                     \
                                TP_CONDITION(cond),                     \
                                !WARN_ON_ONCE(rcu_irq_enter_disabled()),\
                                rcu_irq_enter_irqson(),                 \
                                rcu_irq_exit_irqson());                 \
        }

This way we only pass evaluated expression (not code with "return" that
changes the flow) as arguments to __DO_TRACE, which makes it behave more
like a "sub-function", which is what we usually expect.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
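
(Aside: one property of this expression-based form is that, because cond and
checkrcu are combined with &&, C's short-circuit rule guarantees the RCU
check, and hence its warning, is never evaluated when the condition is
already false. A toy model with hypothetical names:

#include <stdio.h>

static int noisy_check(void)
{
	puts("check evaluated");	/* side effect to make ordering visible */
	return 1;
}

/* Same shape as the proposed __DO_TRACE() guard. */
#define DO_IF(cond, check)			\
	do {					\
		if (!((cond) && (check)))	\
			return;			\
		puts("traced");			\
	} while (0)

static void trace_event(int cond)
{
	DO_IF(cond, noisy_check());
}

int main(void)
{
	trace_event(0);	/* prints nothing: && short-circuits past the check */
	trace_event(1);	/* prints "check evaluated", then "traced" */
	return 0;
}

Here every argument is an evaluated expression, so the control flow stays
inside the macro body.)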


* Re: [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 17:49       ` Mathieu Desnoyers
@ 2017-04-07 17:55         ` Steven Rostedt
  2017-04-07 18:10         ` [PATCH 7/5 v3] " Steven Rostedt
  1 sibling, 0 replies; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 17:55 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney

On Fri, 7 Apr 2017 17:49:17 +0000 (UTC)
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> > Welcome to MACRO MAGIC!

Somebody is not wizardly happy.

> >   
> >> 
> >> as one argument to the __DO_TRACE() macro. To me it's a bit unexpected
> >> coding-style wise. Am I the only one not comfortable with the proposed
> >> syntax ?  
> > 
> > The entire TRACE_EVENT()/__DO_TRACE() is special.
> > 
> > I thought about adding yet another parameter, but as it doesn't change
> > much, I figured this was good enough. We could break it up if you like:
> > 
> > #define RCU_IRQ_ENTER_CHECK \
> >	if (WARN_ON_ONCE(rcu_irq_enter_disabled()))	\
> >		return;					\
> >	rcu_irq_enter_irqson();
> > 
> > [..]
> >			__DO_TRACE(&__tracepoint_##name,		\
> >				TP_PROTO(data_proto),			\
> >				TP_ARGS(data_args),			\
> >				TP_CONDITION(cond),			\
> >				PARAMS(RCU_IRQ_ENTER_CHECK),		\
> >				rcu_irq_exit_irqson());			\
> > 
> > 
> > Would that make you feel more comfortable?  
> 
> No, it's almost worse: it still adds a return that applies within __DO_TRACE(),
> but which is passed as an argument (code as macro argument), which I find really
> unsettling.

/me finds it strangely enjoyable to make Mathieu unsettled.

> 
> I would prefer to add a new argument to __DO_TRACE, which we can call
> "checkrcu", e.g.:
> 
> #define __DO_TRACE(tp, proto, args, cond, checkrcu, prercu, postrcu)    \

Grumble. I was trying to avoid making the patch more intrusive. But I
do understand your concern.

>         do {                                                            \
>                 struct tracepoint_func *it_func_ptr;                    \
>                 void *it_func;                                          \
>                 void *__data;                                           \
>                                                                         \
>                 if (!((cond) && (checkrcu)))                            \
>                         return;                                         \
>                 prercu;                                                 \
>                 rcu_read_lock_sched_notrace();                          \
>                 it_func_ptr = rcu_dereference_sched((tp)->funcs);       \
>                 if (it_func_ptr) {                                      \
>                         do {                                            \
>                                 it_func = (it_func_ptr)->func;          \
>                                 __data = (it_func_ptr)->data;           \
>                                 ((void(*)(proto))(it_func))(args);      \
>                         } while ((++it_func_ptr)->func);                \
>                 }                                                       \
>                 rcu_read_unlock_sched_notrace();                        \
>                 postrcu;                                                \
>         } while (0)
> 
> And use it like this:
> 
> #define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args)     \
>         static inline void trace_##name##_rcuidle(proto)                \
>         {                                                               \
>                 if (static_key_false(&__tracepoint_##name.key))         \
>                         __DO_TRACE(&__tracepoint_##name,                \
>                                 TP_PROTO(data_proto),                   \
>                                 TP_ARGS(data_args),                     \
>                                 TP_CONDITION(cond),                     \
>                                 !WARN_ON_ONCE(rcu_irq_enter_disabled()),\
>                                 rcu_irq_enter_irqson(),                 \
>                                 rcu_irq_exit_irqson());                 \
>         }
> 
> This way we only pass evaluated expression (not code with "return" that
> changes the flow) as arguments to __DO_TRACE, which makes it behave more
> like a "sub-function", which is what we usually expect.

I understand what you are getting at, and I will concede your point.
OK, I'll do it your way, but I still think you take all the fun out of
it. ;-)

-- Steve


* [PATCH 7/5 v3] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 17:49       ` Mathieu Desnoyers
  2017-04-07 17:55         ` Steven Rostedt
@ 2017-04-07 18:10         ` Steven Rostedt
  2017-04-07 18:17           ` Mathieu Desnoyers
  1 sibling, 1 reply; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 18:10 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Stack tracing discovered that there's a small location inside the RCU
infrastructure where calling rcu_irq_enter() does not work. As trace events
use rcu_irq_enter() it must make sure that it is functionable. A check
against rcu_irq_enter_disabled() is added with a WARN_ON_ONCE() as no trace
event should ever be used in that part of RCU. If the warning is triggered,
then the trace event is ignored.

Link: http://lkml.kernel.org/r/20170405093207.404f8deb@gandalf.local.home

Cc: Mathieu (making code boring) Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---

Mathieu,

 There! Are you now satisfied?



 include/linux/tracepoint.h | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index f72fcfe..7f98d50 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -128,7 +128,7 @@ extern void syscall_unregfunc(void);
  * as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just
  * "void *data", where as the DECLARE_TRACE() will pass in "void *data, proto".
  */
-#define __DO_TRACE(tp, proto, args, cond, prercu, postrcu)		\
+#define __DO_TRACE(tp, proto, args, cond, rcucheck, prercu, postrcu)	\
 	do {								\
 		struct tracepoint_func *it_func_ptr;			\
 		void *it_func;						\
@@ -136,6 +136,8 @@ extern void syscall_unregfunc(void);
 									\
 		if (!(cond))						\
 			return;						\
+		if (rcucheck)						\
+			return;						\
 		prercu;							\
 		rcu_read_lock_sched_notrace();				\
 		it_func_ptr = rcu_dereference_sched((tp)->funcs);	\
@@ -151,7 +153,7 @@ extern void syscall_unregfunc(void);
 	} while (0)
 
 #ifndef MODULE
-#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args)	\
+#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args) \
 	static inline void trace_##name##_rcuidle(proto)		\
 	{								\
 		if (static_key_false(&__tracepoint_##name.key))		\
@@ -159,6 +161,7 @@ extern void syscall_unregfunc(void);
 				TP_PROTO(data_proto),			\
 				TP_ARGS(data_args),			\
 				TP_CONDITION(cond),			\
+				WARN_ON_ONCE(rcu_irq_enter_disabled()),\
 				rcu_irq_enter_irqson(),			\
 				rcu_irq_exit_irqson());			\
 	}
@@ -186,7 +189,7 @@ extern void syscall_unregfunc(void);
 			__DO_TRACE(&__tracepoint_##name,		\
 				TP_PROTO(data_proto),			\
 				TP_ARGS(data_args),			\
-				TP_CONDITION(cond),,);			\
+				TP_CONDITION(cond),0,,);		\
 		if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {		\
 			rcu_read_lock_sched_notrace();			\
 			rcu_dereference_sched(__tracepoint_##name.funcs);\
-- 
2.9.3


* Re: [PATCH 7/5 v3] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 18:10         ` [PATCH 7/5 v3] " Steven Rostedt
@ 2017-04-07 18:17           ` Mathieu Desnoyers
  2017-04-07 19:41             ` [PATCH 7/5 v4] " Steven Rostedt
  0 siblings, 1 reply; 39+ messages in thread
From: Mathieu Desnoyers @ 2017-04-07 18:17 UTC (permalink / raw)
  To: rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney

----- On Apr 7, 2017, at 2:10 PM, rostedt rostedt@goodmis.org wrote:

> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> Stack tracing discovered that there's a small location inside the RCU
> infrastructure where calling rcu_irq_enter() does not work. As trace events
> use rcu_irq_enter() it must make sure that it is functionable. A check
> against rcu_irq_enter_disabled() is added with a WARN_ON_ONCE() as no trace
> event should ever be used in that part of RCU. If the warning is triggered,
> then the trace event is ignored.
> 
> Link: http://lkml.kernel.org/r/20170405093207.404f8deb@gandalf.local.home
> 
> Cc: Mathieu (making code boring) Desnoyers <mathieu.desnoyers@efficios.com>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
> 
> Mathieu,
> 
> There! Are you now satisfied?

Since you reversed the boolean logic from my proposal, perhaps
rename the "rcucheck" argument to "rcudisabled" to reflect the
logic swap ?

Other than that:

Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Thanks,

Mathieu

> 
> 
> 
> include/linux/tracepoint.h | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
> index f72fcfe..7f98d50 100644
> --- a/include/linux/tracepoint.h
> +++ b/include/linux/tracepoint.h
> @@ -128,7 +128,7 @@ extern void syscall_unregfunc(void);
>  * as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just
>  * "void *data", where as the DECLARE_TRACE() will pass in "void *data, proto".
>  */
> -#define __DO_TRACE(tp, proto, args, cond, prercu, postrcu)		\
> +#define __DO_TRACE(tp, proto, args, cond, rcucheck, prercu, postrcu)	\
> 	do {								\
> 		struct tracepoint_func *it_func_ptr;			\
> 		void *it_func;						\
> @@ -136,6 +136,8 @@ extern void syscall_unregfunc(void);
> 									\
> 		if (!(cond))						\
> 			return;						\
> +		if (rcucheck)						\
> +			return;						\
> 		prercu;							\
> 		rcu_read_lock_sched_notrace();				\
> 		it_func_ptr = rcu_dereference_sched((tp)->funcs);	\
> @@ -151,7 +153,7 @@ extern void syscall_unregfunc(void);
> 	} while (0)
> 
> #ifndef MODULE
> -#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args)	\
> +#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args) \
> 	static inline void trace_##name##_rcuidle(proto)		\
> 	{								\
> 		if (static_key_false(&__tracepoint_##name.key))		\
> @@ -159,6 +161,7 @@ extern void syscall_unregfunc(void);
> 				TP_PROTO(data_proto),			\
> 				TP_ARGS(data_args),			\
> 				TP_CONDITION(cond),			\
> +				WARN_ON_ONCE(rcu_irq_enter_disabled()),\
> 				rcu_irq_enter_irqson(),			\
> 				rcu_irq_exit_irqson());			\
> 	}
> @@ -186,7 +189,7 @@ extern void syscall_unregfunc(void);
> 			__DO_TRACE(&__tracepoint_##name,		\
> 				TP_PROTO(data_proto),			\
> 				TP_ARGS(data_args),			\
> -				TP_CONDITION(cond),,);			\
> +				TP_CONDITION(cond),0,,);		\
> 		if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {		\
> 			rcu_read_lock_sched_notrace();			\
> 			rcu_dereference_sched(__tracepoint_##name.funcs);\
> --
> 2.9.3

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com


* [PATCH 7/5 v4] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 18:17           ` Mathieu Desnoyers
@ 2017-04-07 19:41             ` Steven Rostedt
  2017-04-07 19:43               ` Steven Rostedt
  0 siblings, 1 reply; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 19:41 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Stack tracing discovered that there's a small location inside the RCU
infrastructure where calling rcu_irq_enter() does not work. As trace events
use rcu_irq_enter() it must make sure that it is functionable. A check
against rcu_irq_enter_disabled() is added with a WARN_ON_ONCE() as no trace
event should ever be used in that part of RCU. If the warning is triggered,
then the trace event is ignored.

Restructure the __DO_TRACE() a bit to get rid of the prercu and postrcu,
and just have an rcucheck that does the work from within the __DO_TRACE()
macro. gcc optimization will compile out the rcucheck=0 case.

Link: http://lkml.kernel.org/r/20170405093207.404f8deb@gandalf.local.home

Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/tracepoint.h | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index f72fcfe..4ecb116 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -128,7 +128,7 @@ extern void syscall_unregfunc(void);
  * as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just
  * "void *data", where as the DECLARE_TRACE() will pass in "void *data, proto".
  */
-#define __DO_TRACE(tp, proto, args, cond, prercu, postrcu)		\
+#define __DO_TRACE(tp, proto, args, cond, rcucheck)			\
 	do {								\
 		struct tracepoint_func *it_func_ptr;			\
 		void *it_func;						\
@@ -136,7 +136,11 @@ extern void syscall_unregfunc(void);
 									\
 		if (!(cond))						\
 			return;						\
-		prercu;							\
+		if (rcucheck) {						\
+			if (WARN_ON_ONCE(rcu_irq_enter_disabled()))	\
+				return;					\
+			rcu_irq_enter_irqson();				\
+		}							\
 		rcu_read_lock_sched_notrace();				\
 		it_func_ptr = rcu_dereference_sched((tp)->funcs);	\
 		if (it_func_ptr) {					\
@@ -147,20 +151,19 @@ extern void syscall_unregfunc(void);
 			} while ((++it_func_ptr)->func);		\
 		}							\
 		rcu_read_unlock_sched_notrace();			\
-		postrcu;						\
+		if (rcucheck)						\
+			rcu_irq_exit_irqson();				\
 	} while (0)
 
 #ifndef MODULE
-#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args)	\
+#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args) \
 	static inline void trace_##name##_rcuidle(proto)		\
 	{								\
 		if (static_key_false(&__tracepoint_##name.key))		\
 			__DO_TRACE(&__tracepoint_##name,		\
 				TP_PROTO(data_proto),			\
 				TP_ARGS(data_args),			\
-				TP_CONDITION(cond),			\
-				rcu_irq_enter_irqson(),			\
-				rcu_irq_exit_irqson());			\
+				TP_CONDITION(cond),1);			\
 	}
 #else
 #define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args)
@@ -186,7 +189,7 @@ extern void syscall_unregfunc(void);
 			__DO_TRACE(&__tracepoint_##name,		\
 				TP_PROTO(data_proto),			\
 				TP_ARGS(data_args),			\
-				TP_CONDITION(cond),,);			\
+				TP_CONDITION(cond),0);			\
 		if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {		\
 			rcu_read_lock_sched_notrace();			\
 			rcu_dereference_sched(__tracepoint_##name.funcs);\
-- 
2.9.3
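
(Aside: the changelog's claim that gcc compiles out the rcucheck=0 case rests
on ordinary constant folding, since the literal 0 or 1 is visible at every
expansion site. A minimal sketch of the same shape, with toy names rather
than the real macro:

#include <stdio.h>

/* "rcucheck" arrives as a literal 0 or 1, so the dead branch is
 * eliminated at compile time by any optimizing compiler. */
#define DO_TOY_TRACE(rcucheck)				\
	do {						\
		if (rcucheck)				\
			puts("pre/post RCU work");	\
		puts("call tracepoint funcs");		\
	} while (0)

static void trace_normal(void)  { DO_TOY_TRACE(0); }	/* branch folded away */
static void trace_rcuidle(void) { DO_TOY_TRACE(1); }	/* branch kept */

int main(void)
{
	trace_normal();
	trace_rcuidle();
	return 0;
}
)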


* Re: [PATCH 7/5 v4] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 19:41             ` [PATCH 7/5 v4] " Steven Rostedt
@ 2017-04-07 19:43               ` Steven Rostedt
  2017-04-10 17:11                 ` Mathieu Desnoyers
  0 siblings, 1 reply; 39+ messages in thread
From: Steven Rostedt @ 2017-04-07 19:43 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney

On Fri, 7 Apr 2017 15:41:17 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
 
>  #ifndef MODULE
> -#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args)	\
> +#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args) \
>  	static inline void trace_##name##_rcuidle(proto)		\
>  	{								\
>  		if (static_key_false(&__tracepoint_##name.key))		\
>  			__DO_TRACE(&__tracepoint_##name,		\
>  				TP_PROTO(data_proto),			\
>  				TP_ARGS(data_args),			\
> -				TP_CONDITION(cond),			\
> -				rcu_irq_enter_irqson(),			\
> -				rcu_irq_exit_irqson());			\
> +				TP_CONDITION(cond),1);			\

I'm going to update this patch to add a space before the 1.

>  	}
>  #else
>  #define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args)
> @@ -186,7 +189,7 @@ extern void syscall_unregfunc(void);
>  			__DO_TRACE(&__tracepoint_##name,		\
>  				TP_PROTO(data_proto),			\
>  				TP_ARGS(data_args),			\
> -				TP_CONDITION(cond),,);			\
> +				TP_CONDITION(cond),0);			\

And before the 0.

-- Steve

>  		if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {		\
>  			rcu_read_lock_sched_notrace();			\
>  			rcu_dereference_sched(__tracepoint_##name.funcs);\


* Re: [PATCH 7/5 v4] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events
  2017-04-07 19:43               ` Steven Rostedt
@ 2017-04-10 17:11                 ` Mathieu Desnoyers
  0 siblings, 0 replies; 39+ messages in thread
From: Mathieu Desnoyers @ 2017-04-10 17:11 UTC (permalink / raw)
  To: rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney


----- On Apr 7, 2017, at 3:43 PM, rostedt rostedt@goodmis.org wrote:

> On Fri, 7 Apr 2017 15:41:17 -0400
> Steven Rostedt <rostedt@goodmis.org> wrote:
> 
>>  #ifndef MODULE
>> -#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args)	\
>> +#define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args) \
>>  	static inline void trace_##name##_rcuidle(proto)		\
>>  	{								\
>>  		if (static_key_false(&__tracepoint_##name.key))		\
>>  			__DO_TRACE(&__tracepoint_##name,		\
>>  				TP_PROTO(data_proto),			\
>>  				TP_ARGS(data_args),			\
>> -				TP_CONDITION(cond),			\
>> -				rcu_irq_enter_irqson(),			\
>> -				rcu_irq_exit_irqson());			\
>> +				TP_CONDITION(cond),1);			\
> 
> I'm going to update this patch to add a space before the 1.
> 
>>  	}
>>  #else
>>  #define __DECLARE_TRACE_RCU(name, proto, args, cond, data_proto, data_args)
>> @@ -186,7 +189,7 @@ extern void syscall_unregfunc(void);
>>  			__DO_TRACE(&__tracepoint_##name,		\
>>  				TP_PROTO(data_proto),			\
>>  				TP_ARGS(data_args),			\
>> -				TP_CONDITION(cond),,);			\
>> +				TP_CONDITION(cond),0);			\
> 
> And before the 0.

With those cosmetic changes, I'm ok with it.

Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>

Thanks!

Mathieu


> 
> -- Steve
> 
>>  		if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) {		\
>>  			rcu_read_lock_sched_notrace();			\
> >  			rcu_dereference_sched(__tracepoint_##name.funcs);\

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com



Thread overview: 39+ messages
2017-04-07 14:01 [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
2017-04-07 14:01 ` [PATCH 1/5 v2] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines Steven Rostedt
2017-04-07 14:01 ` [PATCH 2/5 v2] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c Steven Rostedt
2017-04-07 14:36   ` Paul E. McKenney
2017-04-07 14:48     ` Steven Rostedt
2017-04-07 15:08       ` Paul E. McKenney
2017-04-07 14:01 ` [PATCH 3/5 v2] tracing: Add stack_tracer_disable/enable() functions Steven Rostedt
2017-04-07 14:22   ` Steven Rostedt
2017-04-07 14:25   ` [PATCH 3/5 v2.1] " Steven Rostedt
2017-04-07 14:01 ` [PATCH 4/5 v2] tracing: Rename trace_active to disable_stack_tracer and inline its modification Steven Rostedt
2017-04-07 14:01 ` [PATCH 5/5 v2] rcu: Fix dyntick-idle tracing Steven Rostedt
2017-04-07 14:40   ` Paul E. McKenney
2017-04-07 14:53     ` Steven Rostedt
2017-04-07 15:09       ` Paul E. McKenney
2017-04-07 15:29         ` Steven Rostedt
2017-04-07 14:43 ` [PATCH 0/5 v2] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Paul E. McKenney
2017-04-07 14:58   ` Steven Rostedt
2017-04-07 15:11     ` Paul E. McKenney
2017-04-07 15:28       ` Steven Rostedt
2017-04-07 16:35 ` [PATCH 6/5]rcu/tracing: Add rcu_disabled to denote when rcu_irq_enter() will not work Steven Rostedt
2017-04-07 16:42   ` Paul E. McKenney
2017-04-07 16:44     ` Steven Rostedt
2017-04-07 16:53       ` Paul E. McKenney
2017-04-07 17:03     ` [PATCH 6/5 v2] rcu/tracing: " Steven Rostedt
2017-04-07 17:15       ` Paul E. McKenney
2017-04-07 17:06 ` [PATCH 7/5] tracing: Make sure rcu_irq_enter() can work for trace_*_rcuidle() trace events Steven Rostedt
2017-04-07 17:15   ` Paul E. McKenney
2017-04-07 17:19   ` Mathieu Desnoyers
2017-04-07 17:26     ` Steven Rostedt
2017-04-07 17:32       ` Steven Rostedt
2017-04-07 17:49       ` Mathieu Desnoyers
2017-04-07 17:55         ` Steven Rostedt
2017-04-07 18:10         ` [PATCH 7/5 v3] " Steven Rostedt
2017-04-07 18:17           ` Mathieu Desnoyers
2017-04-07 19:41             ` [PATCH 7/5 v4] " Steven Rostedt
2017-04-07 19:43               ` Steven Rostedt
2017-04-10 17:11                 ` Mathieu Desnoyers
2017-04-07 17:28     ` [PATCH 7/5] " Steven Rostedt
2017-04-07 17:48     ` [PATCH 7/5 v2] " Steven Rostedt
