All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20
@ 2018-11-30  3:18 Steven Rostedt
  2018-11-30  3:18 ` [PATCH 1/2] tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique Steven Rostedt
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Steven Rostedt @ 2018-11-30  3:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: Linus Torvalds, Ingo Molnar, Andrew Morton


Linus,

This includes two more fixes:

 - Change idx variable in DO_TRACE macro to __idx to avoid name conflicts.
   A kvm event had "idx" as a parameter and it confused the macro.

 - Fix a race where interrupts would be traced when set_graph_function was set.
   The previous patch set increased a race window that tricked the function graph
   tracer to think it should trace interrupts when it really should not have.
   The bug has been there before, but was seldom hit. Only the last patch series
   made it more common.

Note, this is on top of a previous git pull that I have submitted:

  http://lkml.kernel.org/r/20181127224031.76681fe0@vmware.local.home

Please pull the latest trace-v4.20-rc4 tree, which can be found at:


  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
trace-v4.20-rc4

Tag SHA1: 9ecaefab677642b895cdb44889a2baff3e7436bc
Head SHA1: 5cf99a0f3161bc3ae2391269d134d6bf7e26f00e


Steven Rostedt (VMware) (1):
      tracing/fgraph: Fix set_graph_function from showing interrupts

Zenghui Yu (1):
      tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique

----
 include/linux/tracepoint.h           |  6 ++--
 kernel/trace/trace.h                 | 57 ++++++++++++++++++++++++++++++++++--
 kernel/trace/trace_functions_graph.c |  4 +++
 kernel/trace/trace_irqsoff.c         |  2 ++
 kernel/trace/trace_sched_wakeup.c    |  2 ++
 5 files changed, 65 insertions(+), 6 deletions(-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/2] tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique
  2018-11-30  3:18 [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20 Steven Rostedt
@ 2018-11-30  3:18 ` Steven Rostedt
  2018-11-30  3:18 ` [PATCH 2/2] tracing/fgraph: Fix set_graph_function from showing interrupts Steven Rostedt
  2018-11-30 17:41 ` [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20 Linus Torvalds
  2 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2018-11-30  3:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, Joel Fernandes,
	Wang Haibin, linux-trace-devel, stable, Zenghui Yu

From: Zenghui Yu <yuzenghui@huawei.com>

After enabling KVM event tracing, almost all of trace_kvm_exit()'s
printk shows

	"kvm_exit: IRQ: ..."

even if the actual exception_type is NOT IRQ.  More specifically,
trace_kvm_exit() is defined in virt/kvm/arm/trace.h by TRACE_EVENT.

This slight problem may have existed after commit e6753f23d961
("tracepoint: Make rcuidle tracepoint callers use SRCU"). There are
two variables in trace_kvm_exit() and __DO_TRACE() which have the
same name, *idx*. Thus the actual value of *idx* will be overwritten
when tracing. Fix it by adding a simple prefix.

Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Wang Haibin <wanghaibin.wang@huawei.com>
Cc: linux-trace-devel@vger.kernel.org
Cc: stable@vger.kernel.org
Fixes: e6753f23d961 ("tracepoint: Make rcuidle tracepoint callers use SRCU")
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 include/linux/tracepoint.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index 538ba1a58f5b..e9de8ad0bad7 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -166,7 +166,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 		struct tracepoint_func *it_func_ptr;			\
 		void *it_func;						\
 		void *__data;						\
-		int __maybe_unused idx = 0;				\
+		int __maybe_unused __idx = 0;				\
 									\
 		if (!(cond))						\
 			return;						\
@@ -182,7 +182,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 		 * doesn't work from the idle path.			\
 		 */							\
 		if (rcuidle) {						\
-			idx = srcu_read_lock_notrace(&tracepoint_srcu);	\
+			__idx = srcu_read_lock_notrace(&tracepoint_srcu);\
 			rcu_irq_enter_irqson();				\
 		}							\
 									\
@@ -198,7 +198,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
 									\
 		if (rcuidle) {						\
 			rcu_irq_exit_irqson();				\
-			srcu_read_unlock_notrace(&tracepoint_srcu, idx);\
+			srcu_read_unlock_notrace(&tracepoint_srcu, __idx);\
 		}							\
 									\
 		preempt_enable_notrace();				\
-- 
2.19.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/2] tracing/fgraph: Fix set_graph_function from showing interrupts
  2018-11-30  3:18 [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20 Steven Rostedt
  2018-11-30  3:18 ` [PATCH 1/2] tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique Steven Rostedt
@ 2018-11-30  3:18 ` Steven Rostedt
  2018-11-30 17:41 ` [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20 Linus Torvalds
  2 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2018-11-30  3:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: Linus Torvalds, Ingo Molnar, Andrew Morton, stable, Joe Lawrence

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The tracefs file set_graph_function is used to only function graph functions
that are listed in that file (or all functions if the file is empty). The
way this is implemented is that the function graph tracer looks at every
function, and if the current depth is zero and the function matches
something in the file then it will trace that function. When other functions
are called, the depth will be greater than zero (because the original
function will be at depth zero), and all functions will be traced where the
depth is greater than zero.

The issue is that when a function is first entered, and the handler that
checks this logic is called, the depth is set to zero. If an interrupt comes
in and a function in the interrupt handler is traced, its depth will be
greater than zero and it will automatically be traced, even if the original
function was not. But because the logic only looks at depth it may trace
interrupts when it should not be.

The recent design change of the function graph tracer to fix other bugs
caused the depth to be zero while the function graph callback handler is
being called for a longer time, widening the race of this happening. This
bug was actually there for a longer time, but because the race window was so
small it seldom happened. The Fixes tag below is for the commit that widen
the race window, because that commit belongs to a series that will also help
fix the original bug.

Cc: stable@kernel.org
Fixes: 39eb456dacb5 ("function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack")
Reported-by: Joe Lawrence <joe.lawrence@redhat.com>
Tested-by: Joe Lawrence <joe.lawrence@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace.h                 | 57 ++++++++++++++++++++++++++--
 kernel/trace/trace_functions_graph.c |  4 ++
 kernel/trace/trace_irqsoff.c         |  2 +
 kernel/trace/trace_sched_wakeup.c    |  2 +
 4 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 3b8c0e24ab30..447bd96ee658 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -512,12 +512,44 @@ enum {
  * can only be modified by current, we can reuse trace_recursion.
  */
 	TRACE_IRQ_BIT,
+
+	/* Set if the function is in the set_graph_function file */
+	TRACE_GRAPH_BIT,
+
+	/*
+	 * In the very unlikely case that an interrupt came in
+	 * at a start of graph tracing, and we want to trace
+	 * the function in that interrupt, the depth can be greater
+	 * than zero, because of the preempted start of a previous
+	 * trace. In an even more unlikely case, depth could be 2
+	 * if a softirq interrupted the start of graph tracing,
+	 * followed by an interrupt preempting a start of graph
+	 * tracing in the softirq, and depth can even be 3
+	 * if an NMI came in at the start of an interrupt function
+	 * that preempted a softirq start of a function that
+	 * preempted normal context!!!! Luckily, it can't be
+	 * greater than 3, so the next two bits are a mask
+	 * of what the depth is when we set TRACE_GRAPH_BIT
+	 */
+
+	TRACE_GRAPH_DEPTH_START_BIT,
+	TRACE_GRAPH_DEPTH_END_BIT,
 };
 
 #define trace_recursion_set(bit)	do { (current)->trace_recursion |= (1<<(bit)); } while (0)
 #define trace_recursion_clear(bit)	do { (current)->trace_recursion &= ~(1<<(bit)); } while (0)
 #define trace_recursion_test(bit)	((current)->trace_recursion & (1<<(bit)))
 
+#define trace_recursion_depth() \
+	(((current)->trace_recursion >> TRACE_GRAPH_DEPTH_START_BIT) & 3)
+#define trace_recursion_set_depth(depth) \
+	do {								\
+		current->trace_recursion &=				\
+			~(3 << TRACE_GRAPH_DEPTH_START_BIT);		\
+		current->trace_recursion |=				\
+			((depth) & 3) << TRACE_GRAPH_DEPTH_START_BIT;	\
+	} while (0)
+
 #define TRACE_CONTEXT_BITS	4
 
 #define TRACE_FTRACE_START	TRACE_FTRACE_BIT
@@ -843,8 +875,9 @@ extern void __trace_graph_return(struct trace_array *tr,
 extern struct ftrace_hash *ftrace_graph_hash;
 extern struct ftrace_hash *ftrace_graph_notrace_hash;
 
-static inline int ftrace_graph_addr(unsigned long addr)
+static inline int ftrace_graph_addr(struct ftrace_graph_ent *trace)
 {
+	unsigned long addr = trace->func;
 	int ret = 0;
 
 	preempt_disable_notrace();
@@ -855,6 +888,14 @@ static inline int ftrace_graph_addr(unsigned long addr)
 	}
 
 	if (ftrace_lookup_ip(ftrace_graph_hash, addr)) {
+
+		/*
+		 * This needs to be cleared on the return functions
+		 * when the depth is zero.
+		 */
+		trace_recursion_set(TRACE_GRAPH_BIT);
+		trace_recursion_set_depth(trace->depth);
+
 		/*
 		 * If no irqs are to be traced, but a set_graph_function
 		 * is set, and called by an interrupt handler, we still
@@ -872,6 +913,13 @@ static inline int ftrace_graph_addr(unsigned long addr)
 	return ret;
 }
 
+static inline void ftrace_graph_addr_finish(struct ftrace_graph_ret *trace)
+{
+	if (trace_recursion_test(TRACE_GRAPH_BIT) &&
+	    trace->depth == trace_recursion_depth())
+		trace_recursion_clear(TRACE_GRAPH_BIT);
+}
+
 static inline int ftrace_graph_notrace_addr(unsigned long addr)
 {
 	int ret = 0;
@@ -885,7 +933,7 @@ static inline int ftrace_graph_notrace_addr(unsigned long addr)
 	return ret;
 }
 #else
-static inline int ftrace_graph_addr(unsigned long addr)
+static inline int ftrace_graph_addr(struct ftrace_graph_ent *trace)
 {
 	return 1;
 }
@@ -894,6 +942,8 @@ static inline int ftrace_graph_notrace_addr(unsigned long addr)
 {
 	return 0;
 }
+static inline void ftrace_graph_addr_finish(struct ftrace_graph_ret *trace)
+{ }
 #endif /* CONFIG_DYNAMIC_FTRACE */
 
 extern unsigned int fgraph_max_depth;
@@ -901,7 +951,8 @@ extern unsigned int fgraph_max_depth;
 static inline bool ftrace_graph_ignore_func(struct ftrace_graph_ent *trace)
 {
 	/* trace it when it is-nested-in or is a function enabled. */
-	return !(trace->depth || ftrace_graph_addr(trace->func)) ||
+	return !(trace_recursion_test(TRACE_GRAPH_BIT) ||
+		 ftrace_graph_addr(trace)) ||
 		(trace->depth < 0) ||
 		(fgraph_max_depth && trace->depth >= fgraph_max_depth);
 }
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 2561460d7baf..086af4f5c3e8 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -509,6 +509,8 @@ void trace_graph_return(struct ftrace_graph_ret *trace)
 	int cpu;
 	int pc;
 
+	ftrace_graph_addr_finish(trace);
+
 	local_irq_save(flags);
 	cpu = raw_smp_processor_id();
 	data = per_cpu_ptr(tr->trace_buffer.data, cpu);
@@ -532,6 +534,8 @@ void set_graph_array(struct trace_array *tr)
 
 static void trace_graph_thresh_return(struct ftrace_graph_ret *trace)
 {
+	ftrace_graph_addr_finish(trace);
+
 	if (tracing_thresh &&
 	    (trace->rettime - trace->calltime < tracing_thresh))
 		return;
diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index b7357f9f82a3..98ea6d28df15 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -208,6 +208,8 @@ static void irqsoff_graph_return(struct ftrace_graph_ret *trace)
 	unsigned long flags;
 	int pc;
 
+	ftrace_graph_addr_finish(trace);
+
 	if (!func_prolog_dec(tr, &data, &flags))
 		return;
 
diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
index a86b303e6c67..7d04b9890755 100644
--- a/kernel/trace/trace_sched_wakeup.c
+++ b/kernel/trace/trace_sched_wakeup.c
@@ -270,6 +270,8 @@ static void wakeup_graph_return(struct ftrace_graph_ret *trace)
 	unsigned long flags;
 	int pc;
 
+	ftrace_graph_addr_finish(trace);
+
 	if (!func_prolog_preempt_disable(tr, &data, &pc))
 		return;
 
-- 
2.19.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20
  2018-11-30  3:18 [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20 Steven Rostedt
  2018-11-30  3:18 ` [PATCH 1/2] tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique Steven Rostedt
  2018-11-30  3:18 ` [PATCH 2/2] tracing/fgraph: Fix set_graph_function from showing interrupts Steven Rostedt
@ 2018-11-30 17:41 ` Linus Torvalds
  2018-11-30 18:13   ` Steven Rostedt
  2018-11-30 18:56   ` Linus Torvalds
  2 siblings, 2 replies; 7+ messages in thread
From: Linus Torvalds @ 2018-11-30 17:41 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Linux List Kernel Mailing, Ingo Molnar, Andrew Morton

On Thu, Nov 29, 2018 at 7:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Note, this is on top of a previous git pull that I have submitted:
>
>   http://lkml.kernel.org/r/20181127224031.76681fe0@vmware.local.home

Hmm.

I had dismissed that, because the patch descriptors for that series
had had "for-next" in them.

    https://lore.kernel.org/lkml/20181122002801.501220343@goodmis.org/

so I dismissed that pull request entirely as being not for this
release entirely.

I went back and merged things, but in general, please try to avoid
confusing me. I'm easily confused when I get mixed messages about the
patches and the pull requests, and will then generally default to
"ignore, this is informational".

                  Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20
  2018-11-30 17:41 ` [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20 Linus Torvalds
@ 2018-11-30 18:13   ` Steven Rostedt
  2018-11-30 18:56   ` Linus Torvalds
  1 sibling, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2018-11-30 18:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux List Kernel Mailing, Ingo Molnar, Andrew Morton

On Fri, 30 Nov 2018 09:41:00 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, Nov 29, 2018 at 7:19 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > Note, this is on top of a previous git pull that I have submitted:
> >
> >   http://lkml.kernel.org/r/20181127224031.76681fe0@vmware.local.home  
> 
> Hmm.
> 
> I had dismissed that, because the patch descriptors for that series
> had had "for-next" in them.
> 
>     https://lore.kernel.org/lkml/20181122002801.501220343@goodmis.org/
> 
> so I dismissed that pull request entirely as being not for this
> release entirely.
> 
> I went back and merged things, but in general, please try to avoid
> confusing me. I'm easily confused when I get mixed messages about the
> patches and the pull requests, and will then generally default to
> "ignore, this is informational".
>

My apologies. I used my scripts to push them into my linux-next repo,
and it added the [for-next] when doing so in the series. I wanted it to
sit in next for a week (because I modified a bunch of architecture code
that I could only compile test, but not run).

I'll be more careful next time.

Thanks!

-- Steve

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20
  2018-11-30 17:41 ` [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20 Linus Torvalds
  2018-11-30 18:13   ` Steven Rostedt
@ 2018-11-30 18:56   ` Linus Torvalds
  2018-11-30 19:35     ` Steven Rostedt
  1 sibling, 1 reply; 7+ messages in thread
From: Linus Torvalds @ 2018-11-30 18:56 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Linux List Kernel Mailing, Ingo Molnar, Andrew Morton

On Fri, Nov 30, 2018 at 9:41 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> I went back and merged things [..]

Note that I did this as two merges, even if one would have done (since
the second pull request was just adding new commits on top of the
first).

This way I got the matching diffstat from your pull requests, but more
importantly also the independent merge messages.

The history looks slightly odd this way (with two adjacent merges of
continuous history), but I thought I'd explain the reason.

                   Linus

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20
  2018-11-30 18:56   ` Linus Torvalds
@ 2018-11-30 19:35     ` Steven Rostedt
  0 siblings, 0 replies; 7+ messages in thread
From: Steven Rostedt @ 2018-11-30 19:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Linux List Kernel Mailing, Ingo Molnar, Andrew Morton

On Fri, 30 Nov 2018 10:56:06 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> This way I got the matching diffstat from your pull requests, but more
> importantly also the independent merge messages.
> 
> The history looks slightly odd this way (with two adjacent merges of
> continuous history), but I thought I'd explain the reason.

It makes perfect sense.

-- Steve

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2018-11-30 19:35 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-30  3:18 [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20 Steven Rostedt
2018-11-30  3:18 ` [PATCH 1/2] tracepoint: Use __idx instead of idx in DO_TRACE macro to make it unique Steven Rostedt
2018-11-30  3:18 ` [PATCH 2/2] tracing/fgraph: Fix set_graph_function from showing interrupts Steven Rostedt
2018-11-30 17:41 ` [PATCH 0/2] [GIT PULL] tracing: More fixes for 4.20 Linus Torvalds
2018-11-30 18:13   ` Steven Rostedt
2018-11-30 18:56   ` Linus Torvalds
2018-11-30 19:35     ` Steven Rostedt

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.