linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] Improve the latency tracers
@ 2019-01-01 15:46 Changbin Du
  2019-01-01 15:46 ` [PATCH 1/5] function_graph: Support displaying relative timestamp Changbin Du
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Changbin Du @ 2019-01-01 15:46 UTC (permalink / raw)
  To: rostedt; +Cc: mingo, corbet, linux-doc, linux-kernel, Changbin Du

Happy new year!

This series make some improments for the kernel latency tracers, especilly
for the wakeup tracers. The latency tracers will show us more useful
information. With this series, the wakeup tracers look like this
when display-graph is enabled:

# tracer: wakeup
#
# wakeup latency trace v1.1.5 on 4.20.0+
# --------------------------------------------------------------------
# latency: 593 us, #674/674, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
#    -----------------
#    | task: kworker/0:1H-339 (uid:0 nice:-20 policy:0 rt_prio:0)
#    -----------------
#
#                                       _-----=> irqs-off        
#                                      / _----=> need-resched    
#                                     | / _---=> hardirq/softirq 
#                                     || / _--=> preempt-depth   
#                                     ||| /                      
#   REL TIME      CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
#      |          |     |    |        ||||      |   |                     |   |   |   |
        0 us |   0)    <idle>-0    |  dNs. |               |  /*      0:120:R   + [000]   339:100:R kworker/0:1H */
        3 us |   0)    <idle>-0    |  dNs. |   0.000 us    |            (null)();
  <idle>-0       0dNs.   66us : <stack trace>
 => try_to_wake_up
 => __queue_work
 => queue_work_on
 => call_timer_fn
 => run_timer_softirq
 => __do_softirq
 => irq_exit
 => smp_apic_timer_interrupt
 => apic_timer_interrupt
 => native_safe_halt
 => default_idle
 => default_idle_call
 => do_idle
 => cpu_startup_entry
 => start_kernel
 => secondary_startup_64
       67 us |   0)    <idle>-0    |  dNs. |   0.721 us    |  ttwu_stat();
       69 us |   0)    <idle>-0    |  dNs. |   0.607 us    |  _raw_spin_unlock_irqrestore();
       71 us |   0)    <idle>-0    |  .Ns. |   0.598 us    |  _raw_spin_lock_irq();
       72 us |   0)    <idle>-0    |  .Ns. |   0.584 us    |  _raw_spin_lock_irq();
       73 us |   0)    <idle>-0    |  dNs. |   1.125 us    |  __next_timer_interrupt();
       75 us |   0)    <idle>-0    |  dNs. |               |  call_timer_fn() {
       76 us |   0)    <idle>-0    |  dNs. |               |    delayed_work_timer_fn() {
      [...]
      564 us |   0)   kworker-13   |  d... |               |      set_next_entity() {
      565 us |   0)   kworker-13   |  d... |   0.524 us    |        __update_load_avg_se();
      566 us |   0)   kworker-13   |  d... |   0.562 us    |        __update_load_avg_cfs_rq();
      567 us |   0)   kworker-13   |  d... |   2.765 us    |      }
      568 us |   0)   kworker-13   |  d... | + 10.077 us   |    }
      569 us |   0)   kworker-13   |  d... |   0.000 us    |  __schedule();
kworker/-13      0d...  593us : <stack trace>
 => schedule
 => worker_thread
 => kthread
 => ret_from_fork
      593 us |   0)   kworker-13   |  d... |               |  /*     13:120:I ==> [000]   339:100:R kworker/0:1H */

Changbin Du (5):
  function_graph: Support displaying relative timestamp
  sched/tracing: Show more info for funcgraph wakeup tracers
  sched/tracing: Put a margin between flags and duration for wakeup
    tracers
  sched/tracing: Show stacktrace for wakeup tracers
  trace/doc: Add latency tracer funcgraph example

 Documentation/trace/ftrace.rst       | 51 ++++++++++++++++++++++++++++
 kernel/trace/trace.h                 |  9 ++---
 kernel/trace/trace_functions_graph.c | 30 ++++++++++++++--
 kernel/trace/trace_irqsoff.c         |  2 +-
 kernel/trace/trace_sched_wakeup.c    | 11 ++++--
 5 files changed, 94 insertions(+), 9 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/5] function_graph: Support displaying relative timestamp
  2019-01-01 15:46 [PATCH 0/5] Improve the latency tracers Changbin Du
@ 2019-01-01 15:46 ` Changbin Du
  2019-01-01 15:46 ` [PATCH 2/5] sched/tracing: Show more info for funcgraph wakeup tracers Changbin Du
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Changbin Du @ 2019-01-01 15:46 UTC (permalink / raw)
  To: rostedt; +Cc: mingo, corbet, linux-doc, linux-kernel, Changbin Du

When function_graph is used for latency tracers, relative timestamp
is more straightforward than absolute timestamp as function trace
does. This change adds relative timestamp support to function_graph
and applies to latency tracers (wakeup and irqsoff).

Signed-off-by: Changbin Du <changbin.du@gmail.com>
---
 kernel/trace/trace.h                 |  9 +++++----
 kernel/trace/trace_functions_graph.c | 25 +++++++++++++++++++++++++
 kernel/trace/trace_irqsoff.c         |  2 +-
 kernel/trace/trace_sched_wakeup.c    |  2 +-
 4 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 08900828d282..a34fa5e76abb 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -855,10 +855,11 @@ static __always_inline bool ftrace_hash_empty(struct ftrace_hash *hash)
 #define TRACE_GRAPH_PRINT_PROC          0x8
 #define TRACE_GRAPH_PRINT_DURATION      0x10
 #define TRACE_GRAPH_PRINT_ABS_TIME      0x20
-#define TRACE_GRAPH_PRINT_IRQS          0x40
-#define TRACE_GRAPH_PRINT_TAIL          0x80
-#define TRACE_GRAPH_SLEEP_TIME		0x100
-#define TRACE_GRAPH_GRAPH_TIME		0x200
+#define TRACE_GRAPH_PRINT_REL_TIME      0x40
+#define TRACE_GRAPH_PRINT_IRQS          0x80
+#define TRACE_GRAPH_PRINT_TAIL          0x100
+#define TRACE_GRAPH_SLEEP_TIME          0x200
+#define TRACE_GRAPH_GRAPH_TIME          0x400
 #define TRACE_GRAPH_PRINT_FILL_SHIFT	28
 #define TRACE_GRAPH_PRINT_FILL_MASK	(0x3 << TRACE_GRAPH_PRINT_FILL_SHIFT)
 
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index c2af1560e856..16ebbdd7b22e 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -500,6 +500,17 @@ static void print_graph_abs_time(u64 t, struct trace_seq *s)
 			 (unsigned long)t, usecs_rem);
 }
 
+static void
+print_graph_rel_time(struct trace_iterator *iter, struct trace_seq *s)
+{
+	unsigned long long usecs;
+
+	usecs = iter->ts - iter->trace_buffer->time_start;
+	do_div(usecs, NSEC_PER_USEC);
+
+	trace_seq_printf(s, "%9llu us |  ", usecs);
+}
+
 static void
 print_graph_irq(struct trace_iterator *iter, unsigned long addr,
 		enum trace_type type, int cpu, pid_t pid, u32 flags)
@@ -517,6 +528,10 @@ print_graph_irq(struct trace_iterator *iter, unsigned long addr,
 		if (flags & TRACE_GRAPH_PRINT_ABS_TIME)
 			print_graph_abs_time(iter->ts, s);
 
+		/* Relative time */
+		if (flags & TRACE_GRAPH_PRINT_REL_TIME)
+			print_graph_rel_time(iter, s);
+
 		/* Cpu */
 		if (flags & TRACE_GRAPH_PRINT_CPU)
 			print_graph_cpu(s, cpu);
@@ -725,6 +740,10 @@ print_graph_prologue(struct trace_iterator *iter, struct trace_seq *s,
 	if (flags & TRACE_GRAPH_PRINT_ABS_TIME)
 		print_graph_abs_time(iter->ts, s);
 
+	/* Relative time */
+	if (flags & TRACE_GRAPH_PRINT_REL_TIME)
+		print_graph_rel_time(iter, s);
+
 	/* Cpu */
 	if (flags & TRACE_GRAPH_PRINT_CPU)
 		print_graph_cpu(s, cpu);
@@ -1101,6 +1120,8 @@ static void print_lat_header(struct seq_file *s, u32 flags)
 
 	if (flags & TRACE_GRAPH_PRINT_ABS_TIME)
 		size += 16;
+	if (flags & TRACE_GRAPH_PRINT_REL_TIME)
+		size += 16;
 	if (flags & TRACE_GRAPH_PRINT_CPU)
 		size += 4;
 	if (flags & TRACE_GRAPH_PRINT_PROC)
@@ -1125,6 +1146,8 @@ static void __print_graph_headers_flags(struct trace_array *tr,
 	seq_putc(s, '#');
 	if (flags & TRACE_GRAPH_PRINT_ABS_TIME)
 		seq_puts(s, "     TIME       ");
+	if (flags & TRACE_GRAPH_PRINT_REL_TIME)
+		seq_puts(s, "   REL TIME     ");
 	if (flags & TRACE_GRAPH_PRINT_CPU)
 		seq_puts(s, " CPU");
 	if (flags & TRACE_GRAPH_PRINT_PROC)
@@ -1139,6 +1162,8 @@ static void __print_graph_headers_flags(struct trace_array *tr,
 	seq_putc(s, '#');
 	if (flags & TRACE_GRAPH_PRINT_ABS_TIME)
 		seq_puts(s, "      |         ");
+	if (flags & TRACE_GRAPH_PRINT_REL_TIME)
+		seq_puts(s, "      |         ");
 	if (flags & TRACE_GRAPH_PRINT_CPU)
 		seq_puts(s, " |  ");
 	if (flags & TRACE_GRAPH_PRINT_PROC)
diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index d3294721f119..e6a62e7c6b69 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -238,7 +238,7 @@ static void irqsoff_trace_close(struct trace_iterator *iter)
 
 #define GRAPH_TRACER_FLAGS (TRACE_GRAPH_PRINT_CPU | \
 			    TRACE_GRAPH_PRINT_PROC | \
-			    TRACE_GRAPH_PRINT_ABS_TIME | \
+			    TRACE_GRAPH_PRINT_REL_TIME | \
 			    TRACE_GRAPH_PRINT_DURATION)
 
 static enum print_line_t irqsoff_print_line(struct trace_iterator *iter)
diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
index 4ea7e6845efb..b6c5fa10347e 100644
--- a/kernel/trace/trace_sched_wakeup.c
+++ b/kernel/trace/trace_sched_wakeup.c
@@ -180,7 +180,7 @@ static void wakeup_trace_close(struct trace_iterator *iter)
 }
 
 #define GRAPH_TRACER_FLAGS (TRACE_GRAPH_PRINT_PROC | \
-			    TRACE_GRAPH_PRINT_ABS_TIME | \
+			    TRACE_GRAPH_PRINT_REL_TIME | \
 			    TRACE_GRAPH_PRINT_DURATION)
 
 static enum print_line_t wakeup_print_line(struct trace_iterator *iter)
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/5] sched/tracing: Show more info for funcgraph wakeup tracers
  2019-01-01 15:46 [PATCH 0/5] Improve the latency tracers Changbin Du
  2019-01-01 15:46 ` [PATCH 1/5] function_graph: Support displaying relative timestamp Changbin Du
@ 2019-01-01 15:46 ` Changbin Du
  2019-01-01 15:46 ` [PATCH 3/5] sched/tracing: Put a margin between flags and duration for " Changbin Du
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Changbin Du @ 2019-01-01 15:46 UTC (permalink / raw)
  To: rostedt; +Cc: mingo, corbet, linux-doc, linux-kernel, Changbin Du

Add these info fields to funcgraph wakeup tracers:
  o Show CPU info since the waker could be on a different CPU.
  o Show function duration and overhead.
  o Show IRQ markers.

Signed-off-by: Changbin Du <changbin.du@gmail.com>
---
 kernel/trace/trace_sched_wakeup.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
index b6c5fa10347e..da5b6e012840 100644
--- a/kernel/trace/trace_sched_wakeup.c
+++ b/kernel/trace/trace_sched_wakeup.c
@@ -180,8 +180,11 @@ static void wakeup_trace_close(struct trace_iterator *iter)
 }
 
 #define GRAPH_TRACER_FLAGS (TRACE_GRAPH_PRINT_PROC | \
+			    TRACE_GRAPH_PRINT_CPU |  \
 			    TRACE_GRAPH_PRINT_REL_TIME | \
-			    TRACE_GRAPH_PRINT_DURATION)
+			    TRACE_GRAPH_PRINT_DURATION | \
+			    TRACE_GRAPH_PRINT_OVERHEAD | \
+			    TRACE_GRAPH_PRINT_IRQS)
 
 static enum print_line_t wakeup_print_line(struct trace_iterator *iter)
 {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/5] sched/tracing: Put a margin between flags and duration for wakeup tracers
  2019-01-01 15:46 [PATCH 0/5] Improve the latency tracers Changbin Du
  2019-01-01 15:46 ` [PATCH 1/5] function_graph: Support displaying relative timestamp Changbin Du
  2019-01-01 15:46 ` [PATCH 2/5] sched/tracing: Show more info for funcgraph wakeup tracers Changbin Du
@ 2019-01-01 15:46 ` Changbin Du
  2019-01-01 15:46 ` [PATCH 4/5] sched/tracing: Show stacktrace " Changbin Du
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Changbin Du @ 2019-01-01 15:46 UTC (permalink / raw)
  To: rostedt; +Cc: mingo, corbet, linux-doc, linux-kernel, Changbin Du

Don't mix context flags with function duration info. Now the output looks
like this:

\# tracer: wakeup
\#
\# wakeup latency trace v1.1.5 on 4.20.0+
\# --------------------------------------------------------------------
\# latency: 593 us, #674/674, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
\#    -----------------
\#    | task: kworker/0:1H-339 (uid:0 nice:-20 policy:0 rt_prio:0)
\#    -----------------
\#
\#                                      _-----=> irqs-off
\#                                     / _----=> need-resched
\#                                    | / _---=> hardirq/softirq
\#                                    || / _--=> preempt-depth
\#                                    ||| /
\#  REL TIME      CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
\#     |          |     |    |        ||||      |   |                     |   |   |   |
        0 us |   0)    <idle>-0    |  dNs. |               |  /*      0:120:R   + [000]   339:100:R kworker/0:1H */
        3 us |   0)    <idle>-0    |  dNs. |   0.000 us    |            (null)();
       67 us |   0)    <idle>-0    |  dNs. |   0.721 us    |  ttwu_stat();
       69 us |   0)    <idle>-0    |  dNs. |   0.607 us    |  _raw_spin_unlock_irqrestore();
       71 us |   0)    <idle>-0    |  .Ns. |   0.598 us    |  _raw_spin_lock_irq();
       72 us |   0)    <idle>-0    |  .Ns. |   0.584 us    |  _raw_spin_lock_irq();
       73 us |   0)    <idle>-0    |  dNs. | + 11.118 us   |  __next_timer_interrupt();
       75 us |   0)    <idle>-0    |  dNs. |               |  call_timer_fn() {
       76 us |   0)    <idle>-0    |  dNs. |               |    delayed_work_timer_fn() {
       76 us |   0)    <idle>-0    |  dNs. |               |      __queue_work() {
       ...

Signed-off-by: Changbin Du <changbin.du@gmail.com>
---
 kernel/trace/trace_functions_graph.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 16ebbdd7b22e..69ebf3c2f1b5 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -380,6 +380,7 @@ static void print_graph_lat_fmt(struct trace_seq *s, struct trace_entry *entry)
 {
 	trace_seq_putc(s, ' ');
 	trace_print_lat_fmt(s, entry);
+	trace_seq_puts(s, " | ");
 }
 
 /* If the pid changed since the last trace, output this event */
@@ -1153,7 +1154,7 @@ static void __print_graph_headers_flags(struct trace_array *tr,
 	if (flags & TRACE_GRAPH_PRINT_PROC)
 		seq_puts(s, "  TASK/PID       ");
 	if (lat)
-		seq_puts(s, "||||");
+		seq_puts(s, "||||   ");
 	if (flags & TRACE_GRAPH_PRINT_DURATION)
 		seq_puts(s, "  DURATION   ");
 	seq_puts(s, "               FUNCTION CALLS\n");
@@ -1169,7 +1170,7 @@ static void __print_graph_headers_flags(struct trace_array *tr,
 	if (flags & TRACE_GRAPH_PRINT_PROC)
 		seq_puts(s, "   |    |        ");
 	if (lat)
-		seq_puts(s, "||||");
+		seq_puts(s, "||||   ");
 	if (flags & TRACE_GRAPH_PRINT_DURATION)
 		seq_puts(s, "   |   |      ");
 	seq_puts(s, "               |   |   |   |\n");
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/5] sched/tracing: Show stacktrace for wakeup tracers
  2019-01-01 15:46 [PATCH 0/5] Improve the latency tracers Changbin Du
                   ` (2 preceding siblings ...)
  2019-01-01 15:46 ` [PATCH 3/5] sched/tracing: Put a margin between flags and duration for " Changbin Du
@ 2019-01-01 15:46 ` Changbin Du
  2019-01-16  3:25   ` Steven Rostedt
  2019-01-01 15:46 ` [PATCH 5/5] trace/doc: Add latency tracer funcgraph example Changbin Du
  2019-01-12  4:05 ` [PATCH 0/5] Improve the latency tracers Changbin Du
  5 siblings, 1 reply; 10+ messages in thread
From: Changbin Du @ 2019-01-01 15:46 UTC (permalink / raw)
  To: rostedt; +Cc: mingo, corbet, linux-doc, linux-kernel, Changbin Du

This align the behavior of wakeup tracers with irqsoff latency tracer
that we record stacktrace at the beginning and end of waking up. The
stacktrace shows us what is happening in the kernel.

Signed-off-by: Changbin Du <changbin.du@gmail.com>
---
 kernel/trace/trace_sched_wakeup.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
index da5b6e012840..0ec136d408ff 100644
--- a/kernel/trace/trace_sched_wakeup.c
+++ b/kernel/trace/trace_sched_wakeup.c
@@ -474,6 +474,8 @@ probe_wakeup_sched_switch(void *ignore, bool preempt,
 	data = per_cpu_ptr(wakeup_trace->trace_buffer.data, wakeup_cpu);
 
 	__trace_function(wakeup_trace, CALLER_ADDR0, CALLER_ADDR1, flags, pc);
+	/* Skip 2 functions to get to the task switch function */
+	__trace_stack(wakeup_trace, flags, 2, pc);
 	tracing_sched_switch_trace(wakeup_trace, prev, next, flags, pc);
 
 	T0 = data->preempt_timestamp;
@@ -593,6 +595,8 @@ probe_wakeup(void *ignore, struct task_struct *p)
 	 * it should be safe to use it here.
 	 */
 	__trace_function(wakeup_trace, CALLER_ADDR1, CALLER_ADDR2, flags, pc);
+	/* Skip 2 functions to get to the task wakeup function */
+	__trace_stack(wakeup_trace, flags, 2, pc);
 
 out_locked:
 	arch_spin_unlock(&wakeup_lock);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 5/5] trace/doc: Add latency tracer funcgraph example
  2019-01-01 15:46 [PATCH 0/5] Improve the latency tracers Changbin Du
                   ` (3 preceding siblings ...)
  2019-01-01 15:46 ` [PATCH 4/5] sched/tracing: Show stacktrace " Changbin Du
@ 2019-01-01 15:46 ` Changbin Du
  2019-01-12  4:05 ` [PATCH 0/5] Improve the latency tracers Changbin Du
  5 siblings, 0 replies; 10+ messages in thread
From: Changbin Du @ 2019-01-01 15:46 UTC (permalink / raw)
  To: rostedt; +Cc: mingo, corbet, linux-doc, linux-kernel, Changbin Du

This add an example about how to use funcgraph with latency tracers.

Signed-off-by: Changbin Du <changbin.du@gmail.com>
---
 Documentation/trace/ftrace.rst | 51 ++++++++++++++++++++++++++++++++++
 1 file changed, 51 insertions(+)

diff --git a/Documentation/trace/ftrace.rst b/Documentation/trace/ftrace.rst
index 0131df7f5968..6ce2763a2a3e 100644
--- a/Documentation/trace/ftrace.rst
+++ b/Documentation/trace/ftrace.rst
@@ -1396,6 +1396,57 @@ enabling function tracing, we incur an added overhead. This
 overhead may extend the latency times. But nevertheless, this
 trace has provided some very helpful debugging information.
 
+If we prefer function graph output instead of function, we can set
+display-graph option::
+ with echo 1 > options/display-graph
+
+  # tracer: irqsoff
+  #
+  # irqsoff latency trace v1.1.5 on 4.20.0-rc6+
+  # --------------------------------------------------------------------
+  # latency: 3751 us, #274/274, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
+  #    -----------------
+  #    | task: bash-1507 (uid:0 nice:0 policy:0 rt_prio:0)
+  #    -----------------
+  #  => started at: free_debug_processing
+  #  => ended at:   return_to_handler
+  #
+  #
+  #                                       _-----=> irqs-off
+  #                                      / _----=> need-resched
+  #                                     | / _---=> hardirq/softirq
+  #                                     || / _--=> preempt-depth
+  #                                     ||| /
+  #   REL TIME      CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
+  #      |          |     |    |        ||||      |   |                     |   |   |   |
+          0 us |   0)   bash-1507    |  d... |   0.000 us    |  _raw_spin_lock_irqsave();
+          0 us |   0)   bash-1507    |  d..1 |   0.378 us    |    do_raw_spin_trylock();
+          1 us |   0)   bash-1507    |  d..2 |               |    set_track() {
+          2 us |   0)   bash-1507    |  d..2 |               |      save_stack_trace() {
+          2 us |   0)   bash-1507    |  d..2 |               |        __save_stack_trace() {
+          3 us |   0)   bash-1507    |  d..2 |               |          __unwind_start() {
+          3 us |   0)   bash-1507    |  d..2 |               |            get_stack_info() {
+          3 us |   0)   bash-1507    |  d..2 |   0.351 us    |              in_task_stack();
+          4 us |   0)   bash-1507    |  d..2 |   1.107 us    |            }
+  [...]
+       3750 us |   0)   bash-1507    |  d..1 |   0.516 us    |      do_raw_spin_unlock();
+       3750 us |   0)   bash-1507    |  d..1 |   0.000 us    |  _raw_spin_unlock_irqrestore();
+       3764 us |   0)   bash-1507    |  d..1 |   0.000 us    |  tracer_hardirqs_on();
+      bash-1507    0d..1 3792us : <stack trace>
+   => free_debug_processing
+   => __slab_free
+   => kmem_cache_free
+   => vm_area_free
+   => remove_vma
+   => exit_mmap
+   => mmput
+   => flush_old_exec
+   => load_elf_binary
+   => search_binary_handler
+   => __do_execve_file.isra.32
+   => __x64_sys_execve
+   => do_syscall_64
+   => entry_SYSCALL_64_after_hwframe
 
 preemptoff
 ----------
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/5] Improve the latency tracers
  2019-01-01 15:46 [PATCH 0/5] Improve the latency tracers Changbin Du
                   ` (4 preceding siblings ...)
  2019-01-01 15:46 ` [PATCH 5/5] trace/doc: Add latency tracer funcgraph example Changbin Du
@ 2019-01-12  4:05 ` Changbin Du
  2019-01-14 15:00   ` Steven Rostedt
  5 siblings, 1 reply; 10+ messages in thread
From: Changbin Du @ 2019-01-12  4:05 UTC (permalink / raw)
  To: Changbin Du; +Cc: rostedt, mingo, corbet, linux-doc, linux-kernel

Hi Steven, Have you checked this serias yet? :)

On Tue, Jan 01, 2019 at 11:46:09PM +0800, Changbin Du wrote:
> Happy new year!
> 
> This series make some improments for the kernel latency tracers, especilly
> for the wakeup tracers. The latency tracers will show us more useful
> information. With this series, the wakeup tracers look like this
> when display-graph is enabled:
> 
> # tracer: wakeup
> #
> # wakeup latency trace v1.1.5 on 4.20.0+
> # --------------------------------------------------------------------
> # latency: 593 us, #674/674, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:4)
> #    -----------------
> #    | task: kworker/0:1H-339 (uid:0 nice:-20 policy:0 rt_prio:0)
> #    -----------------
> #
> #                                       _-----=> irqs-off        
> #                                      / _----=> need-resched    
> #                                     | / _---=> hardirq/softirq 
> #                                     || / _--=> preempt-depth   
> #                                     ||| /                      
> #   REL TIME      CPU  TASK/PID       ||||     DURATION                  FUNCTION CALLS
> #      |          |     |    |        ||||      |   |                     |   |   |   |
>         0 us |   0)    <idle>-0    |  dNs. |               |  /*      0:120:R   + [000]   339:100:R kworker/0:1H */
>         3 us |   0)    <idle>-0    |  dNs. |   0.000 us    |            (null)();
>   <idle>-0       0dNs.   66us : <stack trace>
>  => try_to_wake_up
>  => __queue_work
>  => queue_work_on
>  => call_timer_fn
>  => run_timer_softirq
>  => __do_softirq
>  => irq_exit
>  => smp_apic_timer_interrupt
>  => apic_timer_interrupt
>  => native_safe_halt
>  => default_idle
>  => default_idle_call
>  => do_idle
>  => cpu_startup_entry
>  => start_kernel
>  => secondary_startup_64
>        67 us |   0)    <idle>-0    |  dNs. |   0.721 us    |  ttwu_stat();
>        69 us |   0)    <idle>-0    |  dNs. |   0.607 us    |  _raw_spin_unlock_irqrestore();
>        71 us |   0)    <idle>-0    |  .Ns. |   0.598 us    |  _raw_spin_lock_irq();
>        72 us |   0)    <idle>-0    |  .Ns. |   0.584 us    |  _raw_spin_lock_irq();
>        73 us |   0)    <idle>-0    |  dNs. |   1.125 us    |  __next_timer_interrupt();
>        75 us |   0)    <idle>-0    |  dNs. |               |  call_timer_fn() {
>        76 us |   0)    <idle>-0    |  dNs. |               |    delayed_work_timer_fn() {
>       [...]
>       564 us |   0)   kworker-13   |  d... |               |      set_next_entity() {
>       565 us |   0)   kworker-13   |  d... |   0.524 us    |        __update_load_avg_se();
>       566 us |   0)   kworker-13   |  d... |   0.562 us    |        __update_load_avg_cfs_rq();
>       567 us |   0)   kworker-13   |  d... |   2.765 us    |      }
>       568 us |   0)   kworker-13   |  d... | + 10.077 us   |    }
>       569 us |   0)   kworker-13   |  d... |   0.000 us    |  __schedule();
> kworker/-13      0d...  593us : <stack trace>
>  => schedule
>  => worker_thread
>  => kthread
>  => ret_from_fork
>       593 us |   0)   kworker-13   |  d... |               |  /*     13:120:I ==> [000]   339:100:R kworker/0:1H */
> 
> Changbin Du (5):
>   function_graph: Support displaying relative timestamp
>   sched/tracing: Show more info for funcgraph wakeup tracers
>   sched/tracing: Put a margin between flags and duration for wakeup
>     tracers
>   sched/tracing: Show stacktrace for wakeup tracers
>   trace/doc: Add latency tracer funcgraph example
> 
>  Documentation/trace/ftrace.rst       | 51 ++++++++++++++++++++++++++++
>  kernel/trace/trace.h                 |  9 ++---
>  kernel/trace/trace_functions_graph.c | 30 ++++++++++++++--
>  kernel/trace/trace_irqsoff.c         |  2 +-
>  kernel/trace/trace_sched_wakeup.c    | 11 ++++--
>  5 files changed, 94 insertions(+), 9 deletions(-)
> 
> -- 
> 2.17.1
> 

-- 
Cheers,
Changbin Du

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 0/5] Improve the latency tracers
  2019-01-12  4:05 ` [PATCH 0/5] Improve the latency tracers Changbin Du
@ 2019-01-14 15:00   ` Steven Rostedt
  0 siblings, 0 replies; 10+ messages in thread
From: Steven Rostedt @ 2019-01-14 15:00 UTC (permalink / raw)
  To: Changbin Du; +Cc: mingo, corbet, linux-doc, linux-kernel

On Sat, 12 Jan 2019 12:05:33 +0800
Changbin Du <changbin.du@gmail.com> wrote:

> Hi Steven, Have you checked this serias yet? :)
> 


Not yet, I'll try to do it today.

-- Steve

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/5] sched/tracing: Show stacktrace for wakeup tracers
  2019-01-01 15:46 ` [PATCH 4/5] sched/tracing: Show stacktrace " Changbin Du
@ 2019-01-16  3:25   ` Steven Rostedt
  2019-01-16 16:04     ` Changbin Du
  0 siblings, 1 reply; 10+ messages in thread
From: Steven Rostedt @ 2019-01-16  3:25 UTC (permalink / raw)
  To: Changbin Du; +Cc: mingo, corbet, linux-doc, linux-kernel

On Tue,  1 Jan 2019 23:46:13 +0800
Changbin Du <changbin.du@gmail.com> wrote:

> This align the behavior of wakeup tracers with irqsoff latency tracer
> that we record stacktrace at the beginning and end of waking up. The
> stacktrace shows us what is happening in the kernel.

OK, so I've applied (locally) all of the patches in this series except
this one.

> 
> Signed-off-by: Changbin Du <changbin.du@gmail.com>
> ---
>  kernel/trace/trace_sched_wakeup.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
> index da5b6e012840..0ec136d408ff 100644
> --- a/kernel/trace/trace_sched_wakeup.c
> +++ b/kernel/trace/trace_sched_wakeup.c
> @@ -474,6 +474,8 @@ probe_wakeup_sched_switch(void *ignore, bool preempt,
>  	data = per_cpu_ptr(wakeup_trace->trace_buffer.data, wakeup_cpu);
>  
>  	__trace_function(wakeup_trace, CALLER_ADDR0, CALLER_ADDR1, flags, pc);
> +	/* Skip 2 functions to get to the task switch function */
> +	__trace_stack(wakeup_trace, flags, 2, pc);

1) Just put in zero for skip. I found that with all the new updates to
the unwinders, you can never get this number right :-(, as well as with
gcc playing games, and retpolines and all that jazz.

>  	tracing_sched_switch_trace(wakeup_trace, prev, next, flags, pc);

2) Have the stack trace go after the sched_switch trace, otherwise it
looks funny:

      285 us |   5)    <idle>-0    |  dN.2 |   1.632 us    |    }
      286 us |   5)    <idle>-0    |  d..3 |   0.000 us    |  __schedule();
  <idle>-0       5d..3  299us : <stack trace>
 => schedule_idle
 => do_idle
 => cpu_startup_entry
 => start_secondary
 => secondary_startup_64
      299 us |   5)    <idle>-0    |  d..3 |               |  /*      0:120:R ==> [005]   811: 98:R i915/signal:0 */

Note, I removed the skip and moved the trace and it looks like this:

      180 us |   3)    <idle>-0    |  dN.2 |   0.944 us    |    }
      181 us |   3)    <idle>-0    |  d..3 |   0.000 us    |  __schedule();
      181 us |   3)    <idle>-0    |  d..3 |               |  /*      0:120:R ==> [003]    25:  0:R migration/3 */
  <idle>-0       3d..3  195us : <stack trace>
 => probe_wakeup_sched_switch
 => __schedule
 => schedule_idle
 => do_idle
 => cpu_startup_entry
 => start_secondary
 => secondary_startup_64

Yeah, it shows the "probe_wakeup_sched" but its better to show too much
than not enough. I've had a hard time debugging some kernels because
the skip was too high.

Please resend this patch with the above updates. Just this patch.

Thanks!

-- Steve

>  
>  	T0 = data->preempt_timestamp;
> @@ -593,6 +595,8 @@ probe_wakeup(void *ignore, struct task_struct *p)
>  	 * it should be safe to use it here.
>  	 */
>  	__trace_function(wakeup_trace, CALLER_ADDR1, CALLER_ADDR2, flags, pc);
> +	/* Skip 2 functions to get to the task wakeup function */
> +	__trace_stack(wakeup_trace, flags, 2, pc);
>  
>  out_locked:
>  	arch_spin_unlock(&wakeup_lock);


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/5] sched/tracing: Show stacktrace for wakeup tracers
  2019-01-16  3:25   ` Steven Rostedt
@ 2019-01-16 16:04     ` Changbin Du
  0 siblings, 0 replies; 10+ messages in thread
From: Changbin Du @ 2019-01-16 16:04 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Changbin Du, mingo, corbet, linux-doc, linux-kernel

Steven, I just send v2 of this one that has applied your 2 suggestions.
Please check. Thanks!

On Tue, Jan 15, 2019 at 10:25:00PM -0500, Steven Rostedt wrote:
> On Tue,  1 Jan 2019 23:46:13 +0800
> Changbin Du <changbin.du@gmail.com> wrote:
> 
> > This align the behavior of wakeup tracers with irqsoff latency tracer
> > that we record stacktrace at the beginning and end of waking up. The
> > stacktrace shows us what is happening in the kernel.
> 
> OK, so I've applied (locally) all of the patches in this series except
> this one.
> 
> > 
> > Signed-off-by: Changbin Du <changbin.du@gmail.com>
> > ---
> >  kernel/trace/trace_sched_wakeup.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/kernel/trace/trace_sched_wakeup.c b/kernel/trace/trace_sched_wakeup.c
> > index da5b6e012840..0ec136d408ff 100644
> > --- a/kernel/trace/trace_sched_wakeup.c
> > +++ b/kernel/trace/trace_sched_wakeup.c
> > @@ -474,6 +474,8 @@ probe_wakeup_sched_switch(void *ignore, bool preempt,
> >  	data = per_cpu_ptr(wakeup_trace->trace_buffer.data, wakeup_cpu);
> >  
> >  	__trace_function(wakeup_trace, CALLER_ADDR0, CALLER_ADDR1, flags, pc);
> > +	/* Skip 2 functions to get to the task switch function */
> > +	__trace_stack(wakeup_trace, flags, 2, pc);
> 
> 1) Just put in zero for skip. I found that with all the new updates to
> the unwinders, you can never get this number right :-(, as well as with
> gcc playing games, and retpolines and all that jazz.
> 
> >  	tracing_sched_switch_trace(wakeup_trace, prev, next, flags, pc);
> 
> 2) Have the stack trace go after the sched_switch trace, otherwise it
> looks funny:
> 
>       285 us |   5)    <idle>-0    |  dN.2 |   1.632 us    |    }
>       286 us |   5)    <idle>-0    |  d..3 |   0.000 us    |  __schedule();
>   <idle>-0       5d..3  299us : <stack trace>
>  => schedule_idle
>  => do_idle
>  => cpu_startup_entry
>  => start_secondary
>  => secondary_startup_64
>       299 us |   5)    <idle>-0    |  d..3 |               |  /*      0:120:R ==> [005]   811: 98:R i915/signal:0 */
> 
> Note, I removed the skip and moved the trace and it looks like this:
> 
>       180 us |   3)    <idle>-0    |  dN.2 |   0.944 us    |    }
>       181 us |   3)    <idle>-0    |  d..3 |   0.000 us    |  __schedule();
>       181 us |   3)    <idle>-0    |  d..3 |               |  /*      0:120:R ==> [003]    25:  0:R migration/3 */
>   <idle>-0       3d..3  195us : <stack trace>
>  => probe_wakeup_sched_switch
>  => __schedule
>  => schedule_idle
>  => do_idle
>  => cpu_startup_entry
>  => start_secondary
>  => secondary_startup_64
> 
> Yeah, it shows the "probe_wakeup_sched" but its better to show too much
> than not enough. I've had a hard time debugging some kernels because
> the skip was too high.
> 
> Please resend this patch with the above updates. Just this patch.
> 
> Thanks!
> 
> -- Steve
> 
> >  
> >  	T0 = data->preempt_timestamp;
> > @@ -593,6 +595,8 @@ probe_wakeup(void *ignore, struct task_struct *p)
> >  	 * it should be safe to use it here.
> >  	 */
> >  	__trace_function(wakeup_trace, CALLER_ADDR1, CALLER_ADDR2, flags, pc);
> > +	/* Skip 2 functions to get to the task wakeup function */
> > +	__trace_stack(wakeup_trace, flags, 2, pc);
> >  
> >  out_locked:
> >  	arch_spin_unlock(&wakeup_lock);
> 

-- 
Cheers,
Changbin Du

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-01-16 16:05 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-01 15:46 [PATCH 0/5] Improve the latency tracers Changbin Du
2019-01-01 15:46 ` [PATCH 1/5] function_graph: Support displaying relative timestamp Changbin Du
2019-01-01 15:46 ` [PATCH 2/5] sched/tracing: Show more info for funcgraph wakeup tracers Changbin Du
2019-01-01 15:46 ` [PATCH 3/5] sched/tracing: Put a margin between flags and duration for " Changbin Du
2019-01-01 15:46 ` [PATCH 4/5] sched/tracing: Show stacktrace " Changbin Du
2019-01-16  3:25   ` Steven Rostedt
2019-01-16 16:04     ` Changbin Du
2019-01-01 15:46 ` [PATCH 5/5] trace/doc: Add latency tracer funcgraph example Changbin Du
2019-01-12  4:05 ` [PATCH 0/5] Improve the latency tracers Changbin Du
2019-01-14 15:00   ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).