* [PATCH 0/4] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable()
@ 2017-04-06 16:42 Steven Rostedt
  2017-04-06 16:42 ` [PATCH 1/4] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: Steven Rostedt @ 2017-04-06 16:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

Paul, can you take a quick look at these patches? Namely patches 1, 3, and 4.

I modified your patch to use the updated name for stack_tracer_disable().

If all looks well, can you give them your ack?

Thanks!

-- Steve



Paul E. McKenney (1):
      rcu: Fix dyntick-idle tracing

Steven Rostedt (VMware) (3):
      ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines
      tracing: Replace the per_cpu() with this_cpu() in trace_stack.c
      tracing: Add stack_tracer_disable/enable() functions

----
 include/linux/ftrace.h     |  6 ++++++
 kernel/rcu/tree.c          | 48 +++++++++++++++++++++----------------------
 kernel/trace/Kconfig       |  3 ++-
 kernel/trace/ftrace.c      | 42 ++++++++++++++++----------------------
 kernel/trace/trace_stack.c | 51 +++++++++++++++++++++++++++++++---------------
 5 files changed, 84 insertions(+), 66 deletions(-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines
  2017-04-06 16:42 [PATCH 0/4] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
@ 2017-04-06 16:42 ` Steven Rostedt
  2017-04-06 18:05   ` Paul E. McKenney
  2017-04-06 16:42 ` [PATCH 2/4] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2017-04-06 16:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

[-- Attachment #1: 0001-ftrace-Add-use-of-synchronize_rcu_tasks-with-dynamic.patch --]
[-- Type: text/plain, Size: 4458 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The function tracer needs to be more careful than other subsystems when it
comes to freeing data. Especially if that data is actually executable code.
When a single function is traced, a trampoline can be dynamically allocated
which is called to jump to the function trace callback. When the callback is
no longer needed, the dynamically allocated trampoline needs to be freed. This
is where the issues arise. The dynamically allocated trampoline must not be
used again. As function tracing can trace all subsystems, including
subsystems that are used to serialize aspects of freeing (namely RCU), it
must take extra care when doing the freeing.

Before synchronize_rcu_tasks() was around, there was no way for the function
tracer to know that nothing was using the dynamically allocated trampoline
when CONFIG_PREEMPT was enabled. That's because a task could be indefinitely
preempted while sitting on the trampoline. Now with synchronize_rcu_tasks(),
it will wait until all tasks have either voluntarily scheduled (not on the
trampoline) or gone into userspace (not on the trampoline). Then it is safe
to free the trampoline even with CONFIG_PREEMPT set.
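
For illustration, the resulting free path boils down to the sequence below.
This is only a sketch (the example_free_dynamic_trampoline() wrapper is made
up for this description); the real change is in ftrace_shutdown() in the diff
that follows.

	/* Sketch: safe teardown order for a dynamically allocated trampoline. */
	static void example_free_dynamic_trampoline(struct ftrace_ops *ops)
	{
		/*
		 * Flush callbacks that run with preemption disabled but where
		 * RCU may not be watching (e.g. before user_exit()).
		 */
		schedule_on_each_cpu(ftrace_sync);

		/*
		 * On CONFIG_PREEMPT kernels a task may still be preempted while
		 * sitting on the trampoline, so also wait until every task has
		 * voluntarily scheduled or entered user space.
		 */
		if (IS_ENABLED(CONFIG_PREEMPT))
			synchronize_rcu_tasks();

		/* Nothing can be executing the trampoline now; free it. */
		arch_ftrace_trampoline_free(ops);
	}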

Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/Kconfig  |  3 ++-
 kernel/trace/ftrace.c | 42 ++++++++++++++++++------------------------
 2 files changed, 20 insertions(+), 25 deletions(-)

diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index d4a06e714645..67b463b4f169 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -134,7 +134,8 @@ config FUNCTION_TRACER
 	select KALLSYMS
 	select GENERIC_TRACER
 	select CONTEXT_SWITCH_TRACER
-        select GLOB
+	select GLOB
+	select TASKS_RCU if PREEMPT
 	help
 	  Enable the kernel to trace every kernel function. This is done
 	  by using a compiler feature to insert a small, 5-byte No-Operation
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 8efd9fe7aec0..34f63e78d661 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2808,18 +2808,28 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
 	 * callers are done before leaving this function.
 	 * The same goes for freeing the per_cpu data of the per_cpu
 	 * ops.
-	 *
-	 * Again, normal synchronize_sched() is not good enough.
-	 * We need to do a hard force of sched synchronization.
-	 * This is because we use preempt_disable() to do RCU, but
-	 * the function tracers can be called where RCU is not watching
-	 * (like before user_exit()). We can not rely on the RCU
-	 * infrastructure to do the synchronization, thus we must do it
-	 * ourselves.
 	 */
 	if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_PER_CPU)) {
+		/*
+		 * We need to do a hard force of sched synchronization.
+		 * This is because we use preempt_disable() to do RCU, but
+		 * the function tracers can be called where RCU is not watching
+		 * (like before user_exit()). We can not rely on the RCU
+		 * infrastructure to do the synchronization, thus we must do it
+		 * ourselves.
+		 */
 		schedule_on_each_cpu(ftrace_sync);
 
+		/*
+		 * When the kernel is preemptive, tasks can be preempted
+		 * while on a ftrace trampoline. Just scheduling a task on
+		 * a CPU is not good enough to flush them. Calling
+		 * synchronize_rcu_tasks() will wait for those tasks to
+		 * execute and either schedule voluntarily or enter user space.
+		 */
+		if (IS_ENABLED(CONFIG_PREEMPT))
+			synchronize_rcu_tasks();
+
 		arch_ftrace_trampoline_free(ops);
 
 		if (ops->flags & FTRACE_OPS_FL_PER_CPU)
@@ -5366,22 +5376,6 @@ void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
 
 static void ftrace_update_trampoline(struct ftrace_ops *ops)
 {
-
-/*
- * Currently there's no safe way to free a trampoline when the kernel
- * is configured with PREEMPT. That is because a task could be preempted
- * when it jumped to the trampoline, it may be preempted for a long time
- * depending on the system load, and currently there's no way to know
- * when it will be off the trampoline. If the trampoline is freed
- * too early, when the task runs again, it will be executing on freed
- * memory and crash.
- */
-#ifdef CONFIG_PREEMPT
-	/* Currently, only non dynamic ops can have a trampoline */
-	if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
-		return;
-#endif
-
 	arch_ftrace_update_trampoline(ops);
 }
 
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c
  2017-04-06 16:42 [PATCH 0/4] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
  2017-04-06 16:42 ` [PATCH 1/4] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines Steven Rostedt
@ 2017-04-06 16:42 ` Steven Rostedt
  2017-04-06 16:42 ` [PATCH 3/4] tracing: Add stack_tracer_disable/enable() functions Steven Rostedt
  2017-04-06 16:42 ` [PATCH 4/4] rcu: Fix dyntick-idle tracing Steven Rostedt
  3 siblings, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2017-04-06 16:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

[-- Attachment #1: 0002-tracing-Replace-the-per_cpu-with-this_cpu-in-trace_s.patch --]
[-- Type: text/plain, Size: 2603 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The trace_active per-CPU variable can be updated with the this_cpu_*()
functions, as each instance is only ever updated on the CPU that it
belongs to.
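
For reference, the conversion pattern is shown below. This is an illustrative
sketch, not the actual trace_stack.c code (the *_old()/*_new() helper names
are made up); as in the hunks that follow, callers run with preemption or
interrupts disabled.

	static DEFINE_PER_CPU(int, trace_active);

	/* Old style: look up the CPU explicitly, then index the per-CPU variable. */
	static void trace_active_inc_old(void)
	{
		int cpu = raw_smp_processor_id();

		per_cpu(trace_active, cpu)++;
	}

	/* New style: this_cpu_inc() operates on the current CPU's instance directly. */
	static void trace_active_inc_new(void)
	{
		this_cpu_inc(trace_active);
	}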

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_stack.c | 23 +++++++----------------
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 5fb1f2c87e6b..05ad2b86461e 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -207,13 +207,12 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
 		 struct ftrace_ops *op, struct pt_regs *pt_regs)
 {
 	unsigned long stack;
-	int cpu;
 
 	preempt_disable_notrace();
 
-	cpu = raw_smp_processor_id();
 	/* no atomic needed, we only modify this variable by this cpu */
-	if (per_cpu(trace_active, cpu)++ != 0)
+	this_cpu_inc(trace_active);
+	if (this_cpu_read(trace_active) != 1)
 		goto out;
 
 	ip += MCOUNT_INSN_SIZE;
@@ -221,7 +220,7 @@ stack_trace_call(unsigned long ip, unsigned long parent_ip,
 	check_stack(ip, &stack);
 
  out:
-	per_cpu(trace_active, cpu)--;
+	this_cpu_dec(trace_active);
 	/* prevent recursion in schedule */
 	preempt_enable_notrace();
 }
@@ -253,7 +252,6 @@ stack_max_size_write(struct file *filp, const char __user *ubuf,
 	long *ptr = filp->private_data;
 	unsigned long val, flags;
 	int ret;
-	int cpu;
 
 	ret = kstrtoul_from_user(ubuf, count, 10, &val);
 	if (ret)
@@ -266,14 +264,13 @@ stack_max_size_write(struct file *filp, const char __user *ubuf,
 	 * we will cause circular lock, so we also need to increase
 	 * the percpu trace_active here.
 	 */
-	cpu = smp_processor_id();
-	per_cpu(trace_active, cpu)++;
+	this_cpu_inc(trace_active);
 
 	arch_spin_lock(&stack_trace_max_lock);
 	*ptr = val;
 	arch_spin_unlock(&stack_trace_max_lock);
 
-	per_cpu(trace_active, cpu)--;
+	this_cpu_dec(trace_active);
 	local_irq_restore(flags);
 
 	return count;
@@ -307,12 +304,9 @@ t_next(struct seq_file *m, void *v, loff_t *pos)
 
 static void *t_start(struct seq_file *m, loff_t *pos)
 {
-	int cpu;
-
 	local_irq_disable();
 
-	cpu = smp_processor_id();
-	per_cpu(trace_active, cpu)++;
+	this_cpu_inc(trace_active);
 
 	arch_spin_lock(&stack_trace_max_lock);
 
@@ -324,12 +318,9 @@ static void *t_start(struct seq_file *m, loff_t *pos)
 
 static void t_stop(struct seq_file *m, void *p)
 {
-	int cpu;
-
 	arch_spin_unlock(&stack_trace_max_lock);
 
-	cpu = smp_processor_id();
-	per_cpu(trace_active, cpu)--;
+	this_cpu_dec(trace_active);
 
 	local_irq_enable();
 }
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] tracing: Add stack_tracer_disable/enable() functions
  2017-04-06 16:42 [PATCH 0/4] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
  2017-04-06 16:42 ` [PATCH 1/4] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines Steven Rostedt
  2017-04-06 16:42 ` [PATCH 2/4] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c Steven Rostedt
@ 2017-04-06 16:42 ` Steven Rostedt
  2017-04-06 18:12   ` Paul E. McKenney
  2017-04-06 16:42 ` [PATCH 4/4] rcu: Fix dyntick-idle tracing Steven Rostedt
  3 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2017-04-06 16:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

[-- Attachment #1: 0003-tracing-Add-stack_tracer_disable-enable-functions.patch --]
[-- Type: text/plain, Size: 2226 bytes --]

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

There are certain parts of the kernel that can not let stack tracing
proceed (namely in RCU), because the stack tracer uses RCU, and parts of RCU
internals can not handle having RCU read side locks taken.

Add stack_tracer_disable() and stack_tracer_enable() functions to let RCU
stop stack tracing on the current CPU as it is in those critical sections.
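
The expected usage is a tight bracket around the statements that must not be
stack traced. An illustrative sketch only (the real caller is added by patch 4
of this series):

	stack_tracer_disable();		/* also disables preemption */
	/* ... statements that RCU cannot have stack traced on this CPU ... */
	stack_tracer_enable();		/* re-enables preemption */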

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/ftrace.h     |  6 ++++++
 kernel/trace/trace_stack.c | 28 ++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index ef7123219f14..40afee35565a 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -286,6 +286,12 @@ int
 stack_trace_sysctl(struct ctl_table *table, int write,
 		   void __user *buffer, size_t *lenp,
 		   loff_t *ppos);
+
+void stack_tracer_disable(void);
+void stack_tracer_enable(void);
+#else
+static inline void stack_tracer_disable(void) { }
+static inline void stack_tracer_enabe(void) { }
 #endif
 
 struct ftrace_func_command {
diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
index 05ad2b86461e..5adbb73ec2ec 100644
--- a/kernel/trace/trace_stack.c
+++ b/kernel/trace/trace_stack.c
@@ -41,6 +41,34 @@ static DEFINE_MUTEX(stack_sysctl_mutex);
 int stack_tracer_enabled;
 static int last_stack_tracer_enabled;
 
+/**
+ * stack_tracer_disable - temporarily disable the stack tracer
+ *
+ * There's a few locations (namely in RCU) where stack tracing
+ * can not be executed. This function is used to disable stack
+ * tracing during those critical sections.
+ *
+ * This function will disable preemption. stack_tracer_enable()
+ * must be called shortly after this is called.
+ */
+void stack_tracer_disable(void)
+{
+	preempt_disable_notrace();
+	this_cpu_inc(trace_active);
+}
+
+/**
+ * stack_tracer_enable - re-enable the stack tracer
+ *
+ * After stack_tracer_disable() is called, stack_tracer_enable()
+ * must shortly be called afterward.
+ */
+void stack_tracer_enable(void)
+{
+	this_cpu_dec(trace_active);
+	preempt_enable_notrace();
+}
+
 void stack_trace_print(void)
 {
 	long i;
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] rcu: Fix dyntick-idle tracing
  2017-04-06 16:42 [PATCH 0/4] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
                   ` (2 preceding siblings ...)
  2017-04-06 16:42 ` [PATCH 3/4] tracing: Add stack_tracer_disable/enable() functions Steven Rostedt
@ 2017-04-06 16:42 ` Steven Rostedt
  2017-04-06 18:01   ` Paul E. McKenney
  2017-04-07  4:50   ` kbuild test robot
  3 siblings, 2 replies; 16+ messages in thread
From: Steven Rostedt @ 2017-04-06 16:42 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Paul E. McKenney

[-- Attachment #1: 0004-rcu-Fix-dyntick-idle-tracing.patch --]
[-- Type: text/plain, Size: 4765 bytes --]

From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

The tracing subsystem started using rcu_irq_entry() and rcu_irq_exit()
(with my blessing) to allow the current _rcuidle alternative tracepoint
name to be dispensed with while still maintaining good performance.
Unfortunately, this causes RCU's dyntick-idle entry code's tracing to
appear to RCU like an interrupt that occurs where RCU is not designed
to handle interrupts.

This commit fixes this problem by moving the zeroing of ->dynticks_nesting
after the offending trace_rcu_dyntick() statement, which narrows the
window of vulnerability to a pair of adjacent statements that are now
marked with comments to that effect.
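
Schematically, the ordering changes as sketched below (simplified, not the
literal diff):

	/* Before: the caller zeroed the nesting counter before tracing ran. */
	rdtp->dynticks_nesting = 0;		/* RCU already looks idle here...       */
	rcu_eqs_enter_common(oldval, user);	/* ...when trace_rcu_dyntick() fires
						   and may use rcu_irq_enter()/exit(). */

	/* After: trace while nesting is still nonzero, then keep the two
	   statements that tracing cannot cover adjacent to each other. */
	trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0);
	rdtp->dynticks_nesting = 0;		/* Breaks tracing momentarily.      */
	rcu_dynticks_eqs_enter();		/* After this, tracing works again. */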

Link: http://lkml.kernel.org/r/20170405193928.GM1600@linux.vnet.ibm.com

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/rcu/tree.c | 48 +++++++++++++++++++++++-------------------------
 1 file changed, 23 insertions(+), 25 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 50fee7689e71..8b4d273331e4 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -57,6 +57,7 @@
 #include <linux/random.h>
 #include <linux/trace_events.h>
 #include <linux/suspend.h>
+#include <linux/ftrace.h>
 
 #include "tree.h"
 #include "rcu.h"
@@ -771,25 +772,24 @@ cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
 }
 
 /*
- * rcu_eqs_enter_common - current CPU is moving towards extended quiescent state
+ * rcu_eqs_enter_common - current CPU is entering an extended quiescent state
  *
- * If the new value of the ->dynticks_nesting counter now is zero,
- * we really have entered idle, and must do the appropriate accounting.
- * The caller must have disabled interrupts.
+ * Enter idle, doing appropriate accounting.  The caller must have
+ * disabled interrupts.
  */
-static void rcu_eqs_enter_common(long long oldval, bool user)
+static void rcu_eqs_enter_common(bool user)
 {
 	struct rcu_state *rsp;
 	struct rcu_data *rdp;
-	RCU_TRACE(struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);)
+	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
 
-	trace_rcu_dyntick(TPS("Start"), oldval, rdtp->dynticks_nesting);
+	trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0);
 	if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
 	    !user && !is_idle_task(current)) {
 		struct task_struct *idle __maybe_unused =
 			idle_task(smp_processor_id());
 
-		trace_rcu_dyntick(TPS("Error on entry: not idle task"), oldval, 0);
+		trace_rcu_dyntick(TPS("Error on entry: not idle task"), rdtp->dynticks_nesting, 0);
 		rcu_ftrace_dump(DUMP_ORIG);
 		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
 			  current->pid, current->comm,
@@ -800,7 +800,10 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
 		do_nocb_deferred_wakeup(rdp);
 	}
 	rcu_prepare_for_idle();
-	rcu_dynticks_eqs_enter();
+	stack_tracer_disable();
+	rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
+	rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
+	stack_tracer_enable();
 	rcu_dynticks_task_enter();
 
 	/*
@@ -821,19 +824,15 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
  */
 static void rcu_eqs_enter(bool user)
 {
-	long long oldval;
 	struct rcu_dynticks *rdtp;
 
 	rdtp = this_cpu_ptr(&rcu_dynticks);
-	oldval = rdtp->dynticks_nesting;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-		     (oldval & DYNTICK_TASK_NEST_MASK) == 0);
-	if ((oldval & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE) {
-		rdtp->dynticks_nesting = 0;
-		rcu_eqs_enter_common(oldval, user);
-	} else {
+		     (rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == 0);
+	if ((rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE)
+		rcu_eqs_enter_common(user);
+	else
 		rdtp->dynticks_nesting -= DYNTICK_TASK_NEST_VALUE;
-	}
 }
 
 /**
@@ -892,19 +891,18 @@ void rcu_user_enter(void)
  */
 void rcu_irq_exit(void)
 {
-	long long oldval;
 	struct rcu_dynticks *rdtp;
 
 	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_exit() invoked with irqs enabled!!!");
 	rdtp = this_cpu_ptr(&rcu_dynticks);
-	oldval = rdtp->dynticks_nesting;
-	rdtp->dynticks_nesting--;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-		     rdtp->dynticks_nesting < 0);
-	if (rdtp->dynticks_nesting)
-		trace_rcu_dyntick(TPS("--="), oldval, rdtp->dynticks_nesting);
-	else
-		rcu_eqs_enter_common(oldval, true);
+		     rdtp->dynticks_nesting < 1);
+	if (rdtp->dynticks_nesting <= 1) {
+		rcu_eqs_enter_common(true);
+	} else {
+		trace_rcu_dyntick(TPS("--="), rdtp->dynticks_nesting, rdtp->dynticks_nesting - 1);
+		rdtp->dynticks_nesting--;
+	}
 	rcu_sysidle_enter(1);
 }
 
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] rcu: Fix dyntick-idle tracing
  2017-04-06 16:42 ` [PATCH 4/4] rcu: Fix dyntick-idle tracing Steven Rostedt
@ 2017-04-06 18:01   ` Paul E. McKenney
  2017-04-07  4:50   ` kbuild test robot
  1 sibling, 0 replies; 16+ messages in thread
From: Paul E. McKenney @ 2017-04-06 18:01 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Thu, Apr 06, 2017 at 12:42:41PM -0400, Steven Rostedt wrote:
> From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> The tracing subsystem started using rcu_irq_entry() and rcu_irq_exit()
> (with my blessing) to allow the current _rcuidle alternative tracepoint
> name to be dispensed with while still maintaining good performance.
> Unfortunately, this causes RCU's dyntick-idle entry code's tracing to
> appear to RCU like an interrupt that occurs where RCU is not designed
> to handle interrupts.
> 
> This commit fixes this problem by moving the zeroing of ->dynticks_nesting
> after the offending trace_rcu_dyntick() statement, which narrows the
> window of vulnerability to a pair of adjacent statements that are now
> marked with comments to that effect.
> 
> Link: http://lkml.kernel.org/r/20170405193928.GM1600@linux.vnet.ibm.com
> 
> Reported-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

Really confirming my Signed-off-by given Steven's changes, but whatever.  ;-)

							Thanx, Paul

> ---
>  kernel/rcu/tree.c | 48 +++++++++++++++++++++++-------------------------
>  1 file changed, 23 insertions(+), 25 deletions(-)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 50fee7689e71..8b4d273331e4 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -57,6 +57,7 @@
>  #include <linux/random.h>
>  #include <linux/trace_events.h>
>  #include <linux/suspend.h>
> +#include <linux/ftrace.h>
> 
>  #include "tree.h"
>  #include "rcu.h"
> @@ -771,25 +772,24 @@ cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
>  }
> 
>  /*
> - * rcu_eqs_enter_common - current CPU is moving towards extended quiescent state
> + * rcu_eqs_enter_common - current CPU is entering an extended quiescent state
>   *
> - * If the new value of the ->dynticks_nesting counter now is zero,
> - * we really have entered idle, and must do the appropriate accounting.
> - * The caller must have disabled interrupts.
> + * Enter idle, doing appropriate accounting.  The caller must have
> + * disabled interrupts.
>   */
> -static void rcu_eqs_enter_common(long long oldval, bool user)
> +static void rcu_eqs_enter_common(bool user)
>  {
>  	struct rcu_state *rsp;
>  	struct rcu_data *rdp;
> -	RCU_TRACE(struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);)
> +	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
> 
> -	trace_rcu_dyntick(TPS("Start"), oldval, rdtp->dynticks_nesting);
> +	trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0);
>  	if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
>  	    !user && !is_idle_task(current)) {
>  		struct task_struct *idle __maybe_unused =
>  			idle_task(smp_processor_id());
> 
> -		trace_rcu_dyntick(TPS("Error on entry: not idle task"), oldval, 0);
> +		trace_rcu_dyntick(TPS("Error on entry: not idle task"), rdtp->dynticks_nesting, 0);
>  		rcu_ftrace_dump(DUMP_ORIG);
>  		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
>  			  current->pid, current->comm,
> @@ -800,7 +800,10 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
>  		do_nocb_deferred_wakeup(rdp);
>  	}
>  	rcu_prepare_for_idle();
> -	rcu_dynticks_eqs_enter();
> +	stack_tracer_disable();
> +	rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
> +	rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
> +	stack_tracer_enable();
>  	rcu_dynticks_task_enter();
> 
>  	/*
> @@ -821,19 +824,15 @@ static void rcu_eqs_enter_common(long long oldval, bool user)
>   */
>  static void rcu_eqs_enter(bool user)
>  {
> -	long long oldval;
>  	struct rcu_dynticks *rdtp;
> 
>  	rdtp = this_cpu_ptr(&rcu_dynticks);
> -	oldval = rdtp->dynticks_nesting;
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> -		     (oldval & DYNTICK_TASK_NEST_MASK) == 0);
> -	if ((oldval & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE) {
> -		rdtp->dynticks_nesting = 0;
> -		rcu_eqs_enter_common(oldval, user);
> -	} else {
> +		     (rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == 0);
> +	if ((rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE)
> +		rcu_eqs_enter_common(user);
> +	else
>  		rdtp->dynticks_nesting -= DYNTICK_TASK_NEST_VALUE;
> -	}
>  }
> 
>  /**
> @@ -892,19 +891,18 @@ void rcu_user_enter(void)
>   */
>  void rcu_irq_exit(void)
>  {
> -	long long oldval;
>  	struct rcu_dynticks *rdtp;
> 
>  	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_irq_exit() invoked with irqs enabled!!!");
>  	rdtp = this_cpu_ptr(&rcu_dynticks);
> -	oldval = rdtp->dynticks_nesting;
> -	rdtp->dynticks_nesting--;
>  	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
> -		     rdtp->dynticks_nesting < 0);
> -	if (rdtp->dynticks_nesting)
> -		trace_rcu_dyntick(TPS("--="), oldval, rdtp->dynticks_nesting);
> -	else
> -		rcu_eqs_enter_common(oldval, true);
> +		     rdtp->dynticks_nesting < 1);
> +	if (rdtp->dynticks_nesting <= 1) {
> +		rcu_eqs_enter_common(true);
> +	} else {
> +		trace_rcu_dyntick(TPS("--="), rdtp->dynticks_nesting, rdtp->dynticks_nesting - 1);
> +		rdtp->dynticks_nesting--;
> +	}
>  	rcu_sysidle_enter(1);
>  }
> 
> -- 
> 2.10.2
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines
  2017-04-06 16:42 ` [PATCH 1/4] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines Steven Rostedt
@ 2017-04-06 18:05   ` Paul E. McKenney
  2017-04-06 18:13     ` Steven Rostedt
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2017-04-06 18:05 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Thu, Apr 06, 2017 at 12:42:38PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> The function tracer needs to be more careful than other subsystems when it
> comes to freeing data. Especially if that data is actually executable code.
> When a single function is traced, a trampoline can be dynamically allocated
> which is called to jump to the function trace callback. When the callback is
> no longer needed, the dynamically allocated trampoline needs to be freed. This
> is where the issues arise. The dynamically allocated trampoline must not be
> used again. As function tracing can trace all subsystems, including
> subsystems that are used to serialize aspects of freeing (namely RCU), it
> must take extra care when doing the freeing.
> 
> Before synchronize_rcu_tasks() was around, there was no way for the function
> tracer to know that nothing was using the dynamically allocated trampoline
> when CONFIG_PREEMPT was enabled. That's because a task could be indefinitely
> preempted while sitting on the trampoline. Now with synchronize_rcu_tasks(),
> it will wait until all tasks have either voluntarily scheduled (not on the
> trampoline) or gone into userspace (not on the trampoline). Then it is safe
> to free the trampoline even with CONFIG_PREEMPT set.
> 
> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

One question below.  Other than that:

Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> ---
>  kernel/trace/Kconfig  |  3 ++-
>  kernel/trace/ftrace.c | 42 ++++++++++++++++++------------------------
>  2 files changed, 20 insertions(+), 25 deletions(-)
> 
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index d4a06e714645..67b463b4f169 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -134,7 +134,8 @@ config FUNCTION_TRACER
>  	select KALLSYMS
>  	select GENERIC_TRACER
>  	select CONTEXT_SWITCH_TRACER
> -        select GLOB
> +	select GLOB

Does GLOB really want to be selected in production environments?
I could understand "select GLOB if WE_ARE_TESTING" or some such.

> +	select TASKS_RCU if PREEMPT
>  	help
>  	  Enable the kernel to trace every kernel function. This is done
>  	  by using a compiler feature to insert a small, 5-byte No-Operation
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 8efd9fe7aec0..34f63e78d661 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2808,18 +2808,28 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
>  	 * callers are done before leaving this function.
>  	 * The same goes for freeing the per_cpu data of the per_cpu
>  	 * ops.
> -	 *
> -	 * Again, normal synchronize_sched() is not good enough.
> -	 * We need to do a hard force of sched synchronization.
> -	 * This is because we use preempt_disable() to do RCU, but
> -	 * the function tracers can be called where RCU is not watching
> -	 * (like before user_exit()). We can not rely on the RCU
> -	 * infrastructure to do the synchronization, thus we must do it
> -	 * ourselves.
>  	 */
>  	if (ops->flags & (FTRACE_OPS_FL_DYNAMIC | FTRACE_OPS_FL_PER_CPU)) {
> +		/*
> +		 * We need to do a hard force of sched synchronization.
> +		 * This is because we use preempt_disable() to do RCU, but
> +		 * the function tracers can be called where RCU is not watching
> +		 * (like before user_exit()). We can not rely on the RCU
> +		 * infrastructure to do the synchronization, thus we must do it
> +		 * ourselves.
> +		 */
>  		schedule_on_each_cpu(ftrace_sync);
> 
> +		/*
> +		 * When the kernel is preemptive, tasks can be preempted
> +		 * while on a ftrace trampoline. Just scheduling a task on
> +		 * a CPU is not good enough to flush them. Calling
> +		 * synchronize_rcu_tasks() will wait for those tasks to
> +		 * execute and either schedule voluntarily or enter user space.
> +		 */
> +		if (IS_ENABLED(CONFIG_PREEMPT))
> +			synchronize_rcu_tasks();
> +
>  		arch_ftrace_trampoline_free(ops);
> 
>  		if (ops->flags & FTRACE_OPS_FL_PER_CPU)
> @@ -5366,22 +5376,6 @@ void __weak arch_ftrace_update_trampoline(struct ftrace_ops *ops)
> 
>  static void ftrace_update_trampoline(struct ftrace_ops *ops)
>  {
> -
> -/*
> - * Currently there's no safe way to free a trampoline when the kernel
> - * is configured with PREEMPT. That is because a task could be preempted
> - * when it jumped to the trampoline, it may be preempted for a long time
> - * depending on the system load, and currently there's no way to know
> - * when it will be off the trampoline. If the trampoline is freed
> - * too early, when the task runs again, it will be executing on freed
> - * memory and crash.
> - */
> -#ifdef CONFIG_PREEMPT
> -	/* Currently, only non dynamic ops can have a trampoline */
> -	if (ops->flags & FTRACE_OPS_FL_DYNAMIC)
> -		return;
> -#endif
> -
>  	arch_ftrace_update_trampoline(ops);
>  }
> 
> -- 
> 2.10.2
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] tracing: Add stack_tracer_disable/enable() functions
  2017-04-06 16:42 ` [PATCH 3/4] tracing: Add stack_tracer_disable/enable() functions Steven Rostedt
@ 2017-04-06 18:12   ` Paul E. McKenney
  2017-04-06 18:48     ` Steven Rostedt
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2017-04-06 18:12 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Thu, Apr 06, 2017 at 12:42:40PM -0400, Steven Rostedt wrote:
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> 
> There are certain parts of the kernel that can not let stack tracing
> proceed (namely in RCU), because the stack tracer uses RCU, and parts of RCU
> internals can not handle having RCU read side locks taken.
> 
> Add stack_tracer_disable() and stack_tracer_enable() functions to let RCU
> stop stack tracing on the current CPU as it is in those critical sections.

s/as it is in/when it is in/?

> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

One quibble above, one objection below.

							Thanx, Paul

> ---
>  include/linux/ftrace.h     |  6 ++++++
>  kernel/trace/trace_stack.c | 28 ++++++++++++++++++++++++++++
>  2 files changed, 34 insertions(+)
> 
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index ef7123219f14..40afee35565a 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -286,6 +286,12 @@ int
>  stack_trace_sysctl(struct ctl_table *table, int write,
>  		   void __user *buffer, size_t *lenp,
>  		   loff_t *ppos);
> +
> +void stack_tracer_disable(void);
> +void stack_tracer_enable(void);
> +#else
> +static inline void stack_tracer_disable(void) { }
> +static inline void stack_tracer_enabe(void) { }
>  #endif
> 
>  struct ftrace_func_command {
> diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> index 05ad2b86461e..5adbb73ec2ec 100644
> --- a/kernel/trace/trace_stack.c
> +++ b/kernel/trace/trace_stack.c
> @@ -41,6 +41,34 @@ static DEFINE_MUTEX(stack_sysctl_mutex);
>  int stack_tracer_enabled;
>  static int last_stack_tracer_enabled;
> 
> +/**
> + * stack_tracer_disable - temporarily disable the stack tracer
> + *
> + * There's a few locations (namely in RCU) where stack tracing
> + * can not be executed. This function is used to disable stack
> + * tracing during those critical sections.
> + *
> + * This function will disable preemption. stack_tracer_enable()
> + * must be called shortly after this is called.
> + */
> +void stack_tracer_disable(void)
> +{
> +	preempt_disable_notrace();

Interrupts are disabled in all current call points, so you don't really
need to disable preemption.  I would normally not worry, given the
ease-of-use improvements, but some people get annoyed about even slight
increases in idle-entry overhead.

> +	this_cpu_inc(trace_active);
> +}
> +
> +/**
> + * stack_tracer_enable - re-enable the stack tracer
> + *
> + * After stack_tracer_disable() is called, stack_tracer_enable()
> + * must shortly be called afterward.
> + */
> +void stack_tracer_enable(void)
> +{
> +	this_cpu_dec(trace_active);
> +	preempt_enable_notrace();

Ditto...

> +}
> +
>  void stack_trace_print(void)
>  {
>  	long i;
> -- 
> 2.10.2
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines
  2017-04-06 18:05   ` Paul E. McKenney
@ 2017-04-06 18:13     ` Steven Rostedt
  2017-04-06 20:20       ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2017-04-06 18:13 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Thu, 6 Apr 2017 11:05:53 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Thu, Apr 06, 2017 at 12:42:38PM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > 
> > The function tracer needs to be more careful than other subsystems when it
> > comes to freeing data. Especially if that data is actually executable code.
> > When a single function is traced, a trampoline can be dynamically allocated
> > which is called to jump to the function trace callback. When the callback is
> > no longer needed, the dynamically allocated trampoline needs to be freed. This
> > is where the issues arise. The dynamically allocated trampoline must not be
> > used again. As function tracing can trace all subsystems, including
> > subsystems that are used to serialize aspects of freeing (namely RCU), it
> > must take extra care when doing the freeing.
> > 
> > Before synchronize_rcu_tasks() was around, there was no way for the function
> > tracer to know that nothing was using the dynamically allocated trampoline
> > when CONFIG_PREEMPT was enabled. That's because a task could be indefinitely
> > preempted while sitting on the trampoline. Now with synchronize_rcu_tasks(),
> > it will wait until all tasks have either voluntarily scheduled (not on the
> > trampoline) or gone into userspace (not on the trampoline). Then it is safe
> > to free the trampoline even with CONFIG_PREEMPT set.
> > 
> > Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>  
> 
> One question below.  Other than that:
> 
> Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>

Thanks!

> 
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> > ---
> >  kernel/trace/Kconfig  |  3 ++-
> >  kernel/trace/ftrace.c | 42 ++++++++++++++++++------------------------
> >  2 files changed, 20 insertions(+), 25 deletions(-)
> > 
> > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> > index d4a06e714645..67b463b4f169 100644
> > --- a/kernel/trace/Kconfig
> > +++ b/kernel/trace/Kconfig
> > @@ -134,7 +134,8 @@ config FUNCTION_TRACER
> >  	select KALLSYMS
> >  	select GENERIC_TRACER
> >  	select CONTEXT_SWITCH_TRACER
> > -        select GLOB
> > +	select GLOB  
> 
> Does GLOB really want to be selected in production environments?
> I could understand "select GLOB if WE_ARE_TESTING" or some such.

Note, this patch just fixes the whitespace issue for the "select GLOB".
All the selects had a tab in front of them, whereas GLOB had all
spaces.

As for your question, FUNCTION_TRACER depends on glob. It's what is
used for the function filters. You can do:

	echo '*sync*rcu*' > /sys/kernel/tracing/set_ftrace_filter

And get:

synchronize_rcu_tasks
synchronize_srcu
synchronize_srcu_expedited
sync_rcu_exp_select_cpus
sync_rcu_exp_handler
synchronize_rcu_bh.part.65
synchronize_rcu_expedited
synchronize_rcu.part.67
synchronize_rcu
synchronize_rcu_bh

selected.

So yes, production environments do want it selected.

-- Steve


> 
> > +	select TASKS_RCU if PREEMPT
> >  	help
> >  	  Enable the kernel to trace every kernel function. This is done
> >  	  by using a compiler feature to insert a small, 5-byte No-Operation

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] tracing: Add stack_tracer_disable/enable() functions
  2017-04-06 18:12   ` Paul E. McKenney
@ 2017-04-06 18:48     ` Steven Rostedt
  2017-04-06 20:21       ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2017-04-06 18:48 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Thu, 6 Apr 2017 11:12:22 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Thu, Apr 06, 2017 at 12:42:40PM -0400, Steven Rostedt wrote:
> > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > 
> > There are certain parts of the kernel that can not let stack tracing
> > proceed (namely in RCU), because the stack tracer uses RCU, and parts of RCU
> > internals can not handle having RCU read side locks taken.
> > 
> > Add stack_tracer_disable() and stack_tracer_enable() functions to let RCU
> > stop stack tracing on the current CPU as it is in those critical sections.  
> 
> s/as it is in/when it is in/?
> 
> > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>  
> 
> One quibble above, one objection below.
> 
> 							Thanx, Paul
> 
> > ---
> >  include/linux/ftrace.h     |  6 ++++++
> >  kernel/trace/trace_stack.c | 28 ++++++++++++++++++++++++++++
> >  2 files changed, 34 insertions(+)
> > 
> > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > index ef7123219f14..40afee35565a 100644
> > --- a/include/linux/ftrace.h
> > +++ b/include/linux/ftrace.h
> > @@ -286,6 +286,12 @@ int
> >  stack_trace_sysctl(struct ctl_table *table, int write,
> >  		   void __user *buffer, size_t *lenp,
> >  		   loff_t *ppos);
> > +
> > +void stack_tracer_disable(void);
> > +void stack_tracer_enable(void);
> > +#else
> > +static inline void stack_tracer_disable(void) { }
> > +static inline void stack_tracer_enabe(void) { }
> >  #endif
> > 
> >  struct ftrace_func_command {
> > diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> > index 05ad2b86461e..5adbb73ec2ec 100644
> > --- a/kernel/trace/trace_stack.c
> > +++ b/kernel/trace/trace_stack.c
> > @@ -41,6 +41,34 @@ static DEFINE_MUTEX(stack_sysctl_mutex);
> >  int stack_tracer_enabled;
> >  static int last_stack_tracer_enabled;
> > 
> > +/**
> > + * stack_tracer_disable - temporarily disable the stack tracer
> > + *
> > + * There's a few locations (namely in RCU) where stack tracing
> > + * can not be executed. This function is used to disable stack
> > + * tracing during those critical sections.
> > + *
> > + * This function will disable preemption. stack_tracer_enable()
> > + * must be called shortly after this is called.
> > + */
> > +void stack_tracer_disable(void)
> > +{
> > +	preempt_disable_notrace();  
> 
> Interrupts are disabled in all current call points, so you don't really
> need to disable preemption.  I would normally not worry, given the
> ease-of-use improvements, but some people get annoyed about even slight
> increases in idle-entry overhead.

My worry is that we add another caller that doesn't disable interrupts
or preemption.

I could add a __stack_trace_disable() that skips the disabling of
preemption, as the "__" usually denotes the call is "special".

-- Steve

> 
> > +	this_cpu_inc(trace_active);
> > +}
> > +
> > +/**
> > + * stack_tracer_enable - re-enable the stack tracer
> > + *
> > + * After stack_tracer_disable() is called, stack_tracer_enable()
> > + * must shortly be called afterward.
> > + */
> > +void stack_tracer_enable(void)
> > +{
> > +	this_cpu_dec(trace_active);
> > +	preempt_enable_notrace();  
> 
> Ditto...
> 
> > +}
> > +
> >  void stack_trace_print(void)
> >  {
> >  	long i;
> > -- 
> > 2.10.2
> > 
> >   

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines
  2017-04-06 18:13     ` Steven Rostedt
@ 2017-04-06 20:20       ` Paul E. McKenney
  0 siblings, 0 replies; 16+ messages in thread
From: Paul E. McKenney @ 2017-04-06 20:20 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Thu, Apr 06, 2017 at 02:13:29PM -0400, Steven Rostedt wrote:
> On Thu, 6 Apr 2017 11:05:53 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Thu, Apr 06, 2017 at 12:42:38PM -0400, Steven Rostedt wrote:
> > > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > > 
> > > The function tracer needs to be more careful than other subsystems when it
> > > comes to freeing data. Especially if that data is actually executable code.
> > > When a single function is traced, a trampoline can be dynamically allocated
> > > which is called to jump to the function trace callback. When the callback is
> > > no longer needed, the dynamically allocated trampoline needs to be freed. This
> > > is where the issues arise. The dynamically allocated trampoline must not be
> > > used again. As function tracing can trace all subsystems, including
> > > subsystems that are used to serialize aspects of freeing (namely RCU), it
> > > must take extra care when doing the freeing.
> > > 
> > > Before synchronize_rcu_tasks() was around, there was no way for the function
> > > tracer to know that nothing was using the dynamically allocated trampoline
> > > when CONFIG_PREEMPT was enabled. That's because a task could be indefinitely
> > > preempted while sitting on the trampoline. Now with synchronize_rcu_tasks(),
> > > it will wait until all tasks have either voluntarily scheduled (not on the
> > > trampoline) or gone into userspace (not on the trampoline). Then it is safe
> > > to free the trampoline even with CONFIG_PREEMPT set.
> > > 
> > > Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>  
> > 
> > One question below.  Other than that:
> > 
> > Acked-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
> 
> Thanks!
> 
> > 
> > > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> > > ---
> > >  kernel/trace/Kconfig  |  3 ++-
> > >  kernel/trace/ftrace.c | 42 ++++++++++++++++++------------------------
> > >  2 files changed, 20 insertions(+), 25 deletions(-)
> > > 
> > > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> > > index d4a06e714645..67b463b4f169 100644
> > > --- a/kernel/trace/Kconfig
> > > +++ b/kernel/trace/Kconfig
> > > @@ -134,7 +134,8 @@ config FUNCTION_TRACER
> > >  	select KALLSYMS
> > >  	select GENERIC_TRACER
> > >  	select CONTEXT_SWITCH_TRACER
> > > -        select GLOB
> > > +	select GLOB  
> > 
> > Does GLOB really want to be selected in production environments?
> > I could understand "select GLOB if WE_ARE_TESTING" or some such.
> 
> Note, this patch just fixes the whitespace issue for the "select GLOB".
> All the selects had a tab in front of them, where as GLOB had all
> spaces.
> 
> As for your question, FUNCTION_TRACER depends on glob. It's what is
> used for the function filters. You can do:
> 
> 	echo '*sync*rcu*' > /sys/kernel/tracing/set_ftrace_filter
> 
> And get:
> 
> synchronize_rcu_tasks
> synchronize_srcu
> synchronize_srcu_expedited
> sync_rcu_exp_select_cpus
> sync_rcu_exp_handler
> synchronize_rcu_bh.part.65
> synchronize_rcu_expedited
> synchronize_rcu.part.67
> synchronize_rcu
> synchronize_rcu_bh
> 
> selected.
> 
> So yes, production environments do want it selected.

Now that you mention it, that does seem like it might be important.  ;-)

Ah, I was mistakenly looking at GLOB_SELFTEST instead of just plain
GLOB -- sorry for the noise!

							Thanx, Paul

> -- Steve
> 
> 
> > 
> > > +	select TASKS_RCU if PREEMPT
> > >  	help
> > >  	  Enable the kernel to trace every kernel function. This is done
> > >  	  by using a compiler feature to insert a small, 5-byte No-Operation
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] tracing: Add stack_tracer_disable/enable() functions
  2017-04-06 18:48     ` Steven Rostedt
@ 2017-04-06 20:21       ` Paul E. McKenney
  2017-04-06 21:23         ` Steven Rostedt
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2017-04-06 20:21 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Thu, Apr 06, 2017 at 02:48:03PM -0400, Steven Rostedt wrote:
> On Thu, 6 Apr 2017 11:12:22 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Thu, Apr 06, 2017 at 12:42:40PM -0400, Steven Rostedt wrote:
> > > From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
> > > 
> > > There are certain parts of the kernel that can not let stack tracing
> > > proceed (namely in RCU), because the stack tracer uses RCU, and parts of RCU
> > > internals can not handle having RCU read side locks taken.
> > > 
> > > Add stack_tracer_disable() and stack_tracer_enable() functions to let RCU
> > > stop stack tracing on the current CPU as it is in those critical sections.  
> > 
> > s/as it is in/when it is in/?
> > 
> > > Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>  
> > 
> > One quibble above, one objection below.
> > 
> > 							Thanx, Paul
> > 
> > > ---
> > >  include/linux/ftrace.h     |  6 ++++++
> > >  kernel/trace/trace_stack.c | 28 ++++++++++++++++++++++++++++
> > >  2 files changed, 34 insertions(+)
> > > 
> > > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > > index ef7123219f14..40afee35565a 100644
> > > --- a/include/linux/ftrace.h
> > > +++ b/include/linux/ftrace.h
> > > @@ -286,6 +286,12 @@ int
> > >  stack_trace_sysctl(struct ctl_table *table, int write,
> > >  		   void __user *buffer, size_t *lenp,
> > >  		   loff_t *ppos);
> > > +
> > > +void stack_tracer_disable(void);
> > > +void stack_tracer_enable(void);
> > > +#else
> > > +static inline void stack_tracer_disable(void) { }
> > > +static inline void stack_tracer_enabe(void) { }
> > >  #endif
> > > 
> > >  struct ftrace_func_command {
> > > diff --git a/kernel/trace/trace_stack.c b/kernel/trace/trace_stack.c
> > > index 05ad2b86461e..5adbb73ec2ec 100644
> > > --- a/kernel/trace/trace_stack.c
> > > +++ b/kernel/trace/trace_stack.c
> > > @@ -41,6 +41,34 @@ static DEFINE_MUTEX(stack_sysctl_mutex);
> > >  int stack_tracer_enabled;
> > >  static int last_stack_tracer_enabled;
> > > 
> > > +/**
> > > + * stack_tracer_disable - temporarily disable the stack tracer
> > > + *
> > > + * There's a few locations (namely in RCU) where stack tracing
> > > + * can not be executed. This function is used to disable stack
> > > + * tracing during those critical sections.
> > > + *
> > > + * This function will disable preemption. stack_tracer_enable()
> > > + * must be called shortly after this is called.
> > > + */
> > > +void stack_tracer_disable(void)
> > > +{
> > > +	preempt_disable_notrace();  
> > 
> > Interrupts are disabled in all current call points, so you don't really
> > need to disable preemption.  I would normally not worry, given the
> > ease-of-use improvements, but some people get annoyed about even slight
> > increases in idle-entry overhead.
> 
> My worry is that we add another caller that doesn't disable interrupts
> or preemption.
> 
> I could add a __stack_trace_disable() that skips the disabling of
> preemption, as the "__" usually denotes the call is "special".

Given that interrupts are disabled at that point, and given also that
NMI skips stack tracing if growth is required, could we just leave
out the stack_tracer_disable() and stack_tracer_enable()?

							Thanx, Paul

> -- Steve
> 
> > 
> > > +	this_cpu_inc(trace_active);
> > > +}
> > > +
> > > +/**
> > > + * stack_tracer_enable - re-enable the stack tracer
> > > + *
> > > + * After stack_tracer_disable() is called, stack_tracer_enable()
> > > + * must shortly be called afterward.
> > > + */
> > > +void stack_tracer_enable(void)
> > > +{
> > > +	this_cpu_dec(trace_active);
> > > +	preempt_enable_notrace();  
> > 
> > Ditto...
> > 
> > > +}
> > > +
> > >  void stack_trace_print(void)
> > >  {
> > >  	long i;
> > > -- 
> > > 2.10.2
> > > 
> > >   
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] tracing: Add stack_tracer_disable/enable() functions
  2017-04-06 20:21       ` Paul E. McKenney
@ 2017-04-06 21:23         ` Steven Rostedt
  2017-04-06 22:08           ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Steven Rostedt @ 2017-04-06 21:23 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Thu, 6 Apr 2017 13:21:17 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> > My worry is that we add another caller that doesn't disable interrupts
> > or preemption.
> > 
> > I could add a __stack_trace_disable() that skips the disabling of
> > preemption, as the "__" usually denotes the call is "special".  
> 
> Given that interrupts are disabled at that point, and given also that
> NMI skips stack tracing if growth is required, could we just leave
> out the stack_tracer_disable() and stack_tracer_enable()?

There may be other use cases. Hmm, maybe I'll just have it do a check
to make sure preemption is disabled. Something like:

	if (IS_ENABLED(CONFIG_DEBUG_PREEMPT))
		WARN_ON_ONCE(!preempt_count());

-- Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] tracing: Add stack_tracer_disable/enable() functions
  2017-04-06 21:23         ` Steven Rostedt
@ 2017-04-06 22:08           ` Paul E. McKenney
  2017-04-06 23:57             ` Steven Rostedt
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2017-04-06 22:08 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Thu, Apr 06, 2017 at 05:23:48PM -0400, Steven Rostedt wrote:
> On Thu, 6 Apr 2017 13:21:17 -0700
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > > My worry is that we add another caller that doesn't disable interrupts
> > > or preemption.
> > > 
> > > I could add a __stack_trace_disable() that skips the disabling of
> > > preemption, as the "__" usually denotes the call is "special".  
> > 
> > Given that interrupts are disabled at that point, and given also that
> > NMI skips stack tracing if growth is required, could we just leave
> > out the stack_tracer_disable() and stack_tracer_enable()?
> 
> There may be other use cases. Hmm, maybe I'll just have it do a check
> to make sure preemption is disabled. Something like:
> 
> 	if (IS_ENABLED(CONFIG_DEBUG_PREEMPT))
> 		WARN_ON_ONCE(!preempt_count());

This in the include/linux/ftrace.h file so that it can be inlined?
That makes sense to me.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] tracing: Add stack_tracer_disable/enable() functions
  2017-04-06 22:08           ` Paul E. McKenney
@ 2017-04-06 23:57             ` Steven Rostedt
  0 siblings, 0 replies; 16+ messages in thread
From: Steven Rostedt @ 2017-04-06 23:57 UTC (permalink / raw)
  To: Paul E. McKenney; +Cc: linux-kernel, Ingo Molnar, Andrew Morton

On Thu, 6 Apr 2017 15:08:21 -0700
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:

> On Thu, Apr 06, 2017 at 05:23:48PM -0400, Steven Rostedt wrote:
> > On Thu, 6 Apr 2017 13:21:17 -0700
> > "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> >   
> > > > My worry is that we add another caller that doesn't disable interrupts
> > > > or preemption.
> > > > 
> > > > I could add a __stack_trace_disable() that skips the disabling of
> > > > preemption, as the "__" usually denotes the call is "special".    
> > > 
> > > Given that interrupts are disabled at that point, and given also that
> > > NMI skips stack tracing if growth is required, could we just leave
> > > out the stack_tracer_disable() and stack_tracer_enable()?  
> > 
> > There may be other use cases. Hmm, maybe I'll just have it do a check
> > to make sure preemption is disabled. Something like:
> > 
> > 	if (IS_ENABLED(CONFIG_DEBUG_PREEMPT))
> > 		WARN_ON_ONCE(!preempt_count());  
> 
> This in the include/linux/ftrace.h file so that it can be inlined?
> That makes sense to me.
> 

Hah, I already had that part (inlining) written. It's a separate patch
though. I'll post another series tomorrow.

-- Steve

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] rcu: Fix dyntick-idle tracing
  2017-04-06 16:42 ` [PATCH 4/4] rcu: Fix dyntick-idle tracing Steven Rostedt
  2017-04-06 18:01   ` Paul E. McKenney
@ 2017-04-07  4:50   ` kbuild test robot
  1 sibling, 0 replies; 16+ messages in thread
From: kbuild test robot @ 2017-04-07  4:50 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: kbuild-all, linux-kernel, Ingo Molnar, Andrew Morton, Paul E. McKenney

[-- Attachment #1: Type: text/plain, Size: 1433 bytes --]

Hi Paul,

[auto build test ERROR on tip/perf/core]
[also build test ERROR on v4.11-rc5 next-20170406]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Steven-Rostedt/tracing-Add-usecase-of-synchronize_rcu_tasks-and-stack_tracer_disable/20170407-122352
config: x86_64-randconfig-x009-201714 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

All errors (new ones prefixed by >>):

   kernel/rcu/tree.c: In function 'rcu_eqs_enter_common':
>> kernel/rcu/tree.c:806:2: error: implicit declaration of function 'stack_tracer_enable' [-Werror=implicit-function-declaration]
     stack_tracer_enable();
     ^~~~~~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/stack_tracer_enable +806 kernel/rcu/tree.c

   800			do_nocb_deferred_wakeup(rdp);
   801		}
   802		rcu_prepare_for_idle();
   803		stack_tracer_disable();
   804		rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
   805		rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
 > 806		stack_tracer_enable();
   807		rcu_dynticks_task_enter();
   808	
   809		/*

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 27768 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-04-07  4:50 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-06 16:42 [PATCH 0/4] tracing: Add usecase of synchronize_rcu_tasks() and stack_tracer_disable() Steven Rostedt
2017-04-06 16:42 ` [PATCH 1/4] ftrace: Add use of synchronize_rcu_tasks() with dynamic trampolines Steven Rostedt
2017-04-06 18:05   ` Paul E. McKenney
2017-04-06 18:13     ` Steven Rostedt
2017-04-06 20:20       ` Paul E. McKenney
2017-04-06 16:42 ` [PATCH 2/4] tracing: Replace the per_cpu() with this_cpu() in trace_stack.c Steven Rostedt
2017-04-06 16:42 ` [PATCH 3/4] tracing: Add stack_tracer_disable/enable() functions Steven Rostedt
2017-04-06 18:12   ` Paul E. McKenney
2017-04-06 18:48     ` Steven Rostedt
2017-04-06 20:21       ` Paul E. McKenney
2017-04-06 21:23         ` Steven Rostedt
2017-04-06 22:08           ` Paul E. McKenney
2017-04-06 23:57             ` Steven Rostedt
2017-04-06 16:42 ` [PATCH 4/4] rcu: Fix dyntick-idle tracing Steven Rostedt
2017-04-06 18:01   ` Paul E. McKenney
2017-04-07  4:50   ` kbuild test robot
