linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] perf: sched timehist command
@ 2013-11-18 20:32 David Ahern
  2013-11-18 20:32 ` [PATCH 1/8] perf tool: Skip ignored symbols while printing callchain David Ahern
                   ` (7 more replies)
  0 siblings, 8 replies; 28+ messages in thread
From: David Ahern @ 2013-11-18 20:32 UTC (permalink / raw)
  To: acme, linux-kernel; +Cc: David Ahern

Here is the latest incarnation of the timehist command; this time integrated
into perf-sched. Patches can also be found here:

    https://github.com/dsahern/linux perf-timehist

I have other options ready to go, but this seems like a good point to push
out for review.

David Ahern (8):
  perf tool: Skip ignored symbols while printing callchain
  perf symbols: Move idle syms check from top to generic function
  perf symbol: Save vmlinux or kallsyms path loaded
  perf thread: Move comm_list check into function
  perf tool: export setup_list
  perf sched: Introduce timehist command
  perf sched timehist: Add support for context-switch event
  perf sched : Add documentation for timehist options

 tools/perf/Documentation/perf-sched.txt |   60 +-
 tools/perf/builtin-sched.c              | 1002 ++++++++++++++++++++++++++++++-
 tools/perf/builtin-top.c                |   25 +-
 tools/perf/util/event.c                 |    3 +-
 tools/perf/util/session.c               |   10 +-
 tools/perf/util/symbol.c                |   39 +-
 tools/perf/util/symbol.h                |    4 +
 tools/perf/util/thread.h                |   12 +
 8 files changed, 1124 insertions(+), 31 deletions(-)

-- 
1.8.3.4 (Apple Git-47)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/8] perf tool: Skip ignored symbols while printing callchain
  2013-11-18 20:32 [PATCH 0/8] perf: sched timehist command David Ahern
@ 2013-11-18 20:32 ` David Ahern
  2013-11-30 12:52   ` [tip:perf/core] perf evsel: " tip-bot for David Ahern
  2013-11-18 20:32 ` [PATCH 2/8] perf symbols: Move idle syms check from top to generic function David Ahern
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 28+ messages in thread
From: David Ahern @ 2013-11-18 20:32 UTC (permalink / raw)
  To: acme, linux-kernel
  Cc: David Ahern, Ingo Molnar, Frederic Weisbecker, Peter Zijlstra,
	Jiri Olsa, Namhyung Kim, Stephane Eranian

Allows a command to have a symbol_filter controlled by the user to skip
certain functions in a backtrace. One example is to allow the user to reduce
repeating patterns like:
    do_select  core_sys_select  sys_select

to just sys_select when dumping callchains, consuming less real estate on
the screen while still conveying the essential message - the process is in
a select call.

This option is leveraged by the upcoming timehist command.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
---
 tools/perf/util/session.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f36d24a02445..948f7644b4d9 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1522,6 +1522,9 @@ void perf_evsel__print_ip(struct perf_evsel *evsel, union perf_event *event,
 			if (!node)
 				break;
 
+			if (node->sym && node->sym->ignore)
+				goto next;
+
 			if (print_ip)
 				printf("%c%16" PRIx64, s, node->ip);
 
@@ -1544,12 +1547,15 @@ void perf_evsel__print_ip(struct perf_evsel *evsel, union perf_event *event,
 			if (!print_oneline)
 				printf("\n");
 
-			callchain_cursor_advance(&callchain_cursor);
-
 			stack_depth--;
+next:
+			callchain_cursor_advance(&callchain_cursor);
 		}
 
 	} else {
+		if (al.sym->ignore)
+			return;
+
 		if (print_ip)
 			printf("%16" PRIx64, sample->ip);
 
-- 
1.8.3.4 (Apple Git-47)


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/8] perf symbols: Move idle syms check from top to generic function
  2013-11-18 20:32 [PATCH 0/8] perf: sched timehist command David Ahern
  2013-11-18 20:32 ` [PATCH 1/8] perf tool: Skip ignored symbols while printing callchain David Ahern
@ 2013-11-18 20:32 ` David Ahern
  2013-11-28  8:36   ` Namhyung Kim
  2013-11-30 12:52   ` [tip:perf/core] " tip-bot for David Ahern
  2013-11-18 20:32 ` [PATCH 3/8] perf symbol: Save vmlinux or kallsyms path loaded David Ahern
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 28+ messages in thread
From: David Ahern @ 2013-11-18 20:32 UTC (permalink / raw)
  To: acme, linux-kernel
  Cc: David Ahern, Ingo Molnar, Frederic Weisbecker, Peter Zijlstra,
	Jiri Olsa, Namhyung Kim, Stephane Eranian

Allows list of idle symbols to be leveraged by other commands,
such as the upcoming timehist command.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
---
 tools/perf/builtin-top.c | 25 ++-----------------------
 tools/perf/util/symbol.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/symbol.h |  1 +
 3 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 531522d3d97b..03d37a76c612 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -634,26 +634,9 @@ repeat:
 	return NULL;
 }
 
-/* Tag samples to be skipped. */
-static const char *skip_symbols[] = {
-	"intel_idle",
-	"default_idle",
-	"native_safe_halt",
-	"cpu_idle",
-	"enter_idle",
-	"exit_idle",
-	"mwait_idle",
-	"mwait_idle_with_hints",
-	"poll_idle",
-	"ppc64_runlatch_off",
-	"pseries_dedicated_idle_sleep",
-	NULL
-};
-
 static int symbol_filter(struct map *map __maybe_unused, struct symbol *sym)
 {
 	const char *name = sym->name;
-	int i;
 
 	/*
 	 * ppc64 uses function descriptors and appends a '.' to the
@@ -671,12 +654,8 @@ static int symbol_filter(struct map *map __maybe_unused, struct symbol *sym)
 	    strstr(name, "_text_end"))
 		return 1;
 
-	for (i = 0; skip_symbols[i]; i++) {
-		if (!strcmp(skip_symbols[i], name)) {
-			sym->ignore = true;
-			break;
-		}
-	}
+	if (symbol__is_idle(sym))
+		sym->ignore = true;
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index c0c36965fff0..f55c18da1e40 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -573,6 +573,36 @@ static u8 kallsyms2elf_type(char type)
 	return isupper(type) ? STB_GLOBAL : STB_LOCAL;
 }
 
+bool symbol__is_idle(struct symbol *sym)
+{
+	const char * const idle_symbols[] = {
+		"cpu_idle",
+		"intel_idle",
+		"default_idle",
+		"native_safe_halt",
+		"enter_idle",
+		"exit_idle",
+		"mwait_idle",
+		"mwait_idle_with_hints",
+		"poll_idle",
+		"ppc64_runlatch_off",
+		"pseries_dedicated_idle_sleep",
+		NULL
+	};
+
+	int i;
+
+	if (!sym)
+		return false;
+
+	for (i = 0; idle_symbols[i]; i++) {
+		if (!strcmp(idle_symbols[i], sym->name))
+			return true;
+	}
+
+	return false;
+}
+
 static int map__process_kallsym_symbol(void *arg, const char *name,
 				       char type, u64 start)
 {
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 07de8fea2f48..ad13c5d50b91 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -240,6 +240,7 @@ size_t symbol__fprintf(struct symbol *sym, FILE *fp);
 bool symbol_type__is_a(char symbol_type, enum map_type map_type);
 bool symbol__restricted_filename(const char *filename,
 				 const char *restricted_filename);
+bool symbol__is_idle(struct symbol *sym);
 
 int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 		  struct symsrc *runtime_ss, symbol_filter_t filter,
-- 
1.8.3.4 (Apple Git-47)


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/8] perf symbol: Save vmlinux or kallsyms path loaded
  2013-11-18 20:32 [PATCH 0/8] perf: sched timehist command David Ahern
  2013-11-18 20:32 ` [PATCH 1/8] perf tool: Skip ignored symbols while printing callchain David Ahern
  2013-11-18 20:32 ` [PATCH 2/8] perf symbols: Move idle syms check from top to generic function David Ahern
@ 2013-11-18 20:32 ` David Ahern
  2013-11-22 18:44   ` Arnaldo Carvalho de Melo
  2013-11-18 20:32 ` [PATCH 4/8] perf thread: Move comm_list check into function David Ahern
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 28+ messages in thread
From: David Ahern @ 2013-11-18 20:32 UTC (permalink / raw)
  To: acme, linux-kernel
  Cc: David Ahern, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Frederic Weisbecker, Peter Zijlstra, Stephane Eranian

Save vmlinux or kallsyms path loaded using embedded default lookup (ie.,
not a user specified path). Upcoming perf sched timehist command requires
kernel symbols for properly computing idle times and prints a warning
if the kernel symbols are not loaded.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
---
 tools/perf/util/symbol.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index f55c18da1e40..915eb6c4a248 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1437,6 +1437,8 @@ int dso__load_vmlinux(struct dso *dso, struct map *map,
 			dso->data_type = DSO_BINARY_TYPE__VMLINUX;
 		dso__set_long_name(dso, (char *)vmlinux);
 		dso__set_loaded(dso, map->type);
+		if (symbol_conf.vmlinux_name == NULL)
+			symbol_conf.vmlinux_name = strdup(symfs_vmlinux);
 		pr_debug("Using %s for symbols\n", symfs_vmlinux);
 	}
 
@@ -1629,8 +1631,11 @@ static int dso__load_kernel_sym(struct dso *dso, struct map *map,
 
 do_kallsyms:
 	err = dso__load_kallsyms(dso, kallsyms_filename, map, filter);
-	if (err > 0)
+	if (err > 0) {
+		if (symbol_conf.kallsyms_name == NULL)
+			symbol_conf.kallsyms_name = strdup(kallsyms_filename);
 		pr_debug("Using %s for symbols\n", kallsyms_filename);
+	}
 	free(kallsyms_allocated_filename);
 
 	if (err > 0 && !dso__is_kcore(dso)) {
-- 
1.8.3.4 (Apple Git-47)


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/8] perf thread: Move comm_list check into function
  2013-11-18 20:32 [PATCH 0/8] perf: sched timehist command David Ahern
                   ` (2 preceding siblings ...)
  2013-11-18 20:32 ` [PATCH 3/8] perf symbol: Save vmlinux or kallsyms path loaded David Ahern
@ 2013-11-18 20:32 ` David Ahern
  2013-11-30 12:52   ` [tip:perf/core] " tip-bot for David Ahern
  2013-11-18 20:32 ` [PATCH 5/8] perf tool: export setup_list David Ahern
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 28+ messages in thread
From: David Ahern @ 2013-11-18 20:32 UTC (permalink / raw)
  To: acme, linux-kernel
  Cc: David Ahern, Ingo Molnar, Frederic Weisbecker, Peter Zijlstra,
	Jiri Olsa, Namhyung Kim, Stephane Eranian

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
---
 tools/perf/util/event.c  |  3 +--
 tools/perf/util/thread.h | 12 ++++++++++++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index bb788c109fe6..c77814bf01e1 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -732,8 +732,7 @@ int perf_event__preprocess_sample(const union perf_event *event,
 	if (thread == NULL)
 		return -1;
 
-	if (symbol_conf.comm_list &&
-	    !strlist__has_entry(symbol_conf.comm_list, thread__comm_str(thread)))
+	if (thread__is_filtered(thread))
 		goto out_filtered;
 
 	dump_printf(" ... thread: %s:%d\n", thread__comm_str(thread), thread->tid);
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 897c1b2a750a..5b856bf942e1 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -6,6 +6,7 @@
 #include <unistd.h>
 #include <sys/types.h>
 #include "symbol.h"
+#include <strlist.h>
 
 struct thread {
 	union {
@@ -66,4 +67,15 @@ static inline void thread__set_priv(struct thread *thread, void *p)
 {
 	thread->priv = p;
 }
+
+static inline bool thread__is_filtered(struct thread *thread)
+{
+	if (symbol_conf.comm_list &&
+	    !strlist__has_entry(symbol_conf.comm_list, thread__comm_str(thread))) {
+		return true;
+	}
+
+	return false;
+}
+
 #endif	/* __PERF_THREAD_H */
-- 
1.8.3.4 (Apple Git-47)


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 5/8] perf tool: export setup_list
  2013-11-18 20:32 [PATCH 0/8] perf: sched timehist command David Ahern
                   ` (3 preceding siblings ...)
  2013-11-18 20:32 ` [PATCH 4/8] perf thread: Move comm_list check into function David Ahern
@ 2013-11-18 20:32 ` David Ahern
  2013-11-28  8:45   ` Namhyung Kim
  2013-11-30 12:52   ` [tip:perf/core] perf tools: Export setup_list tip-bot for David Ahern
  2013-11-18 20:32 ` [PATCH 6/8] perf sched: Introduce timehist command David Ahern
                   ` (2 subsequent siblings)
  7 siblings, 2 replies; 28+ messages in thread
From: David Ahern @ 2013-11-18 20:32 UTC (permalink / raw)
  To: acme, linux-kernel
  Cc: David Ahern, Ingo Molnar, Frederic Weisbecker, Peter Zijlstra,
	Jiri Olsa, Namhyung Kim, Stephane Eranian

Used in next patch (perf sched timehist command)

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
---
 tools/perf/util/symbol.c | 2 +-
 tools/perf/util/symbol.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 915eb6c4a248..92b8e0cd6266 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1754,7 +1754,7 @@ out_fail:
 	return -1;
 }
 
-static int setup_list(struct strlist **list, const char *list_str,
+int setup_list(struct strlist **list, const char *list_str,
 		      const char *list_name)
 {
 	if (list_str == NULL)
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index ad13c5d50b91..f1031a1358a1 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -274,4 +274,7 @@ void kcore_extract__delete(struct kcore_extract *kce);
 int kcore_copy(const char *from_dir, const char *to_dir);
 int compare_proc_modules(const char *from, const char *to);
 
+int setup_list(struct strlist **list, const char *list_str,
+	       const char *list_name);
+
 #endif /* __PERF_SYMBOL */
-- 
1.8.3.4 (Apple Git-47)


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 6/8] perf sched: Introduce timehist command
  2013-11-18 20:32 [PATCH 0/8] perf: sched timehist command David Ahern
                   ` (4 preceding siblings ...)
  2013-11-18 20:32 ` [PATCH 5/8] perf tool: export setup_list David Ahern
@ 2013-11-18 20:32 ` David Ahern
  2013-11-28  9:50   ` Namhyung Kim
  2013-11-28 15:38   ` Namhyung Kim
  2013-11-18 20:32 ` [PATCH 7/8] perf sched timehist: Add support for context-switch event David Ahern
  2013-11-18 20:32 ` [PATCH 8/8] perf sched : Add documentation for timehist options David Ahern
  7 siblings, 2 replies; 28+ messages in thread
From: David Ahern @ 2013-11-18 20:32 UTC (permalink / raw)
  To: acme, linux-kernel
  Cc: David Ahern, Ingo Molnar, Frederic Weisbecker, Peter Zijlstra,
	Mike Galbraith, Jiri Olsa, Namhyung Kim, Stephane Eranian,
	Pekka Enberg

'perf sched timehist' provides an analysis of scheduling events.

Example usage:
    perf sched record -- sleep 1
    perf sched timehist

By default it shows the individual schedule events, including the time between
sched-in events for the task, the task scheduling delay (time between wakeup
and actually running) and run time for the task:

           time cpu  task name[tid/pid]    b/n time sch delay  run time
--------------- ---- -------------------- --------- --------- ---------
   79371.874569 [11] gcc[31949]            0.000014  0.000000  0.001148
   79371.874591 [10] gcc[31951]            0.000000  0.000000  0.000024
   79371.874603 [10] migration/10[59]      0.003350  0.000004  0.000011
   79371.874604 [11] <idle>                0.001148  0.000000  0.000035
   79371.874723 [05] <idle>                0.000016  0.000000  0.001383
   79371.874746 [05] gcc[31949]            0.000153  0.000078  0.000022
...

If callchains were recorded they are appended to the line with a default stack depth of 5:

   79371.874569 [11] gcc[31949]                  0.000014  0.000000  0.001148  wait_for_completion_killable do_fork sys_vfork stub_vfork __vfork
   79371.874591 [10] gcc[31951]                  0.000000  0.000000  0.000024  __cond_resched _cond_resched wait_for_completion stop_one_cpu sched_exec
   79371.874603 [10] migration/10[59]            0.003350  0.000004  0.000011  smpboot_thread_fn kthread ret_from_fork
   79371.874604 [11] <idle>                      0.001148  0.000000  0.000035  cpu_startup_entry start_secondary
   79371.874723 [05] <idle>                      0.000016  0.000000  0.001383  cpu_startup_entry start_secondary
   79371.874746 [05] gcc[31949]                  0.000153  0.000078  0.000022  do_wait sys_wait4 system_call_fastpath __GI___waitpid

--no-call-graph can be used to not show the callchains. --max-stack is used
to control the number of frames shown (default of 5). -x/--excl options can
be used to collapse redundant callchains to get more relevant data on screen.

Similar to perf-trace -s and -S can be used to dump a statistical summary
without or with events (respectively). Statistics include min run time,
average run time and max run time. Stats are also shown for run time by
cpu.

The cpu-visual option provides a visual aid for sched switches by cpu:
...
   79371.874569 [11]            s      gcc[31949]                  0.000014  0.000000  0.001148
   79371.874591 [10]           s       gcc[31951]                  0.000000  0.000000  0.000024
   79371.874603 [10]           s       migration/10[59]            0.003350  0.000004  0.000011
   79371.874604 [11]            i      <idle>                      0.001148  0.000000  0.000035
   79371.874723 [05]      i            <idle>                      0.000016  0.000000  0.001383
   79371.874746 [05]      s            gcc[31949]                  0.000153  0.000078  0.000022
...

Additional options:
- -w option can be used to include a dump of wakeup events
- -c comm to only display events and stats for the given comm (csv list)
- -p/-t to only display events and stats for given pid/tid (csv list)

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
---
 tools/perf/builtin-sched.c | 972 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 971 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 0f3c65518a2c..ffe11067083c 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -16,6 +16,9 @@
 
 #include "util/debug.h"
 
+#include "util/strlist.h"
+#include "util/intlist.h"
+#include "asm/bug.h"
 #include <sys/prctl.h>
 #include <sys/resource.h>
 
@@ -169,8 +172,49 @@ struct perf_sched {
 	u64		 cpu_last_switched[MAX_CPUS];
 	struct rb_root	 atom_root, sorted_atom_root;
 	struct list_head sort_list, cmp_pid;
+
+	/* options for timehist command */
+	bool		summary;
+	bool		summary_only;
+	bool		show_callchain;
+	unsigned int	max_stack;
+	bool		show_cpu_visual;
+	bool		show_wakeups;
+	/* process and task id's of interest */
+	struct target target;
+	struct intlist  *pid, *tid;
+};
+
+/* used in symbol filter */
+static const char	*excl_sym_list_str;
+static struct strlist	*excl_sym_list;
+
+/* per thread run time data */
+struct thread_runtime {
+	u64 last_time;      /* time of previous sched in/out event */
+	u64 dt_run;         /* run time */
+	u64 dt_between;     /* time between CPU access (off cpu) */
+	u64 dt_delay;       /* time between wakeup and sched-in */
+	u64 ready_to_run;   /* time of wakeup */
+
+	u64 count;
+	u64 total_run_time;
+	u64 min_run_time;
+	u64 max_run_time;
+	double mean, M2;
 };
 
+/* per event run time data */
+struct evsel_runtime {
+	u64 *last_time; /* time this event was last seen per cpu */
+	u32 ncpu;       /* highest cpu slot allocated */
+};
+
+/* track idle times per cpu */
+static struct thread **idle_threads;
+static int idle_max_cpu;
+static char idle_comm[] = "<idle>";
+
 static u64 get_nsecs(void)
 {
 	struct timespec ts;
@@ -1484,6 +1528,883 @@ out_delete:
 	return -1;
 }
 
+static inline void printf_nsecs(unsigned long long nsecs, int width_sec)
+{
+	unsigned long secs;
+	unsigned long usecs;
+
+	secs = nsecs / NSECS_PER_SEC;
+	nsecs -= secs * NSECS_PER_SEC;
+	usecs = nsecs / NSECS_PER_USEC;
+	printf("%*lu.%06lu ", width_sec, secs, usecs);
+}
+
+static struct evsel_runtime *perf_evsel__get_runtime(struct perf_evsel *evsel)
+{
+	struct evsel_runtime *r = evsel->priv;
+
+	if (r == NULL) {
+		r = zalloc(sizeof(struct evsel_runtime));
+		evsel->priv = r;
+	}
+
+	return r;
+}
+
+static void perf_evsel__save_time(struct perf_evsel *evsel,
+				  u64 timestamp, u32 cpu)
+{
+	struct evsel_runtime *r = perf_evsel__get_runtime(evsel);
+
+	if (r == NULL)
+		return;
+
+	if ((cpu > r->ncpu) || (r->last_time == NULL)) {
+		unsigned int i;
+		void *p = r->last_time;
+
+		r->last_time = realloc(r->last_time, (cpu+1) * sizeof(u64));
+		if (!r->last_time) {
+			free(p);
+			return;
+		}
+
+		i = r->ncpu ? r->ncpu + 1 : 0;
+		for (; i <= cpu; ++i)
+			r->last_time[i] = (u64) 0;
+
+		r->ncpu = cpu;
+	}
+
+	r->last_time[cpu] = timestamp;
+}
+
+static u64 perf_evsel__get_time(struct perf_evsel *evsel, u32 cpu)
+{
+	struct evsel_runtime *r = perf_evsel__get_runtime(evsel);
+
+	if ((r == NULL) || (r->last_time == NULL) || (cpu > r->ncpu))
+		return 0;
+
+	return r->last_time[cpu];
+}
+
+static unsigned int comm_width = 20;
+
+static char *timehist_get_commstr(struct thread *thread)
+{
+	static char str[32];
+	const char *comm = thread__comm_str(thread);
+	pid_t tid = thread->tid;
+	pid_t pid = thread->pid_;
+	unsigned int n;
+
+	if (pid == 0)
+		scnprintf(str, sizeof(str), "%s", comm);
+
+	else if (tid != pid)
+		scnprintf(str, sizeof(str), "%s[%d/%d]", comm, tid, pid);
+
+	else
+		scnprintf(str, sizeof(str), "%s[%d]", comm, tid);
+
+	n = strlen(str);
+	if (n > comm_width)
+		comm_width = n;
+
+	return str;
+}
+
+static void timehist_header(struct perf_sched *sched)
+{
+	u32 max_cpus = sched->max_cpu;
+	u32 i, j;
+
+	printf("%15s %-4s", "time", "cpu");
+
+	if (sched->show_cpu_visual && max_cpus) {
+		printf(" ");
+		for (i = 0, j = 0; i < max_cpus; ++i) {
+			printf("%x", j++);
+			if (j > 15)
+				j = 0;
+		}
+		printf(" ");
+	}
+
+	printf(" %-20s %9s %9s %9s",
+		"task name[tid/pid]", "b/n time", "sch delay", "run time");
+
+	if (sched->show_wakeups)
+		printf("  %-20s", "wakeup");
+
+	printf("\n");
+
+	printf("%15s %4s", "---------------", "----");
+
+	if (sched->show_cpu_visual && max_cpus) {
+		printf(" ");
+		for (i = 0; i < max_cpus; ++i)
+			printf("-");
+		printf(" ");
+	}
+
+	printf(" %20s %9s %9s %9s",
+		"--------------------", "---------", "---------", "---------");
+
+	if (sched->show_wakeups)
+		printf("  %20s", "--------------------");
+
+	printf("\n");
+}
+
+static char *timehist_time_str(char *tstr, int len, u64 t)
+{
+	unsigned long secs, usecs;
+	unsigned long long nsecs;
+
+	nsecs = t;
+	secs = nsecs / NSECS_PER_SEC;
+	nsecs -= secs * NSECS_PER_SEC;
+	usecs = nsecs / NSECS_PER_USEC;
+	snprintf(tstr, len, "%5lu.%06lu", secs, usecs);
+
+	return tstr;
+}
+
+static void timehist_print_sample(struct perf_sched *sched,
+				  union perf_event *event,
+				  struct perf_evsel *evsel,
+				  struct perf_sample *sample,
+				  struct thread *thread,
+				  struct machine *machine)
+{
+	struct thread_runtime *tr = thread__priv(thread);
+	char tstr[64];
+	u32 max_cpus = sched->max_cpu;
+
+	printf("%15s ", timehist_time_str(tstr, sizeof(tstr), sample->time));
+
+	printf("[%02d] ", sample->cpu);
+
+	if (sched->show_cpu_visual && max_cpus) {
+		u32 i;
+		char c;
+		for (i = 0; i < max_cpus; ++i) {
+			if (i == sample->cpu)
+				c = (thread->tid == 0) ? 'i' : 's';
+			else
+				c = ' ';
+			printf("%c", c);
+		}
+		printf("  ");
+	}
+
+	printf("%-*s ", comm_width, timehist_get_commstr(thread));
+
+	printf_nsecs(tr->dt_between, 2);
+	printf_nsecs(tr->dt_delay, 2);
+	printf_nsecs(tr->dt_run, 2);
+
+	if (sched->show_wakeups)
+		printf(" %-*s ", comm_width, "");
+
+	perf_evsel__print_ip(evsel, event, sample, machine,
+			     PRINT_IP_OPT_SYM | PRINT_IP_OPT_ONELINE,
+			     sched->max_stack);
+
+	printf("\n");
+}
+
+/*
+ * Explanation of delta-time stats:
+ *
+ *            t = time of current schedule out event
+ *        tprev = time of previous sched out event
+ *                also time of schedule-in event for current task
+ *    last_time = time of last sched change event for current task
+ *                (i.e, time process was last scheduled out)
+ * ready_to_run = time of wakeup for current task
+ *
+ * -----|------------|------------|------------|------
+ *    last         ready        tprev          t
+ *    time         to run
+ *
+ *      |------- dt_between ------|
+ *                   |- dt_delay -|-- dt_run --|
+ *
+ *     dt_run = run time of current task
+ * dt_between = time between last schedule out event for task and tprev
+ *              represents time spent off the cpu
+ */
+static void timehist_update_runtime_stats(struct thread_runtime *r,
+					 u64 t, u64 tprev)
+{
+	double delta;
+
+	r->count++;
+
+	r->dt_delay = 0;
+	r->dt_between = 0;
+	if (tprev) {
+		r->dt_run = t - tprev;
+		if (r->ready_to_run)
+			r->dt_delay = tprev - r->ready_to_run;
+
+		if (r->last_time > tprev)
+			pr_debug("time travel: last sched out time for thread > previous sched_switch event\n");
+		else if (r->last_time)
+			r->dt_between = tprev - r->last_time;
+	}
+
+	delta    = r->dt_run - r->mean;
+	r->mean += delta / r->count;
+	r->M2   += delta * (r->dt_run - r->mean);
+
+	if (r->dt_run > r->max_run_time)
+		r->max_run_time = r->dt_run;
+
+	if (r->dt_run < r->min_run_time)
+		r->min_run_time = r->dt_run;
+
+	r->total_run_time += r->dt_run;
+
+	/* sched out event for task so reset ready to run time */
+	r->ready_to_run = 0;
+}
+
+static bool is_idle_sample(struct perf_sample *sample,
+			   struct perf_evsel *evsel,
+			   struct machine *machine)
+{
+	struct thread *thread;
+	struct callchain_cursor *cursor = &callchain_cursor;
+	struct callchain_cursor_node *node;
+	struct addr_location al;
+	int iter = 5;
+
+	/* pid 0 == swapper == idle task */
+	if (sample->pid == 0)
+		return true;
+
+	/* want main thread for process - has maps */
+	thread = machine__findnew_thread(machine, sample->pid, sample->pid);
+	if (thread == NULL) {
+		pr_debug("Failed to get thread for pid %d.\n", sample->pid);
+		return false;
+	}
+
+	if (!symbol_conf.use_callchain || sample->callchain == NULL)
+		return false;
+
+	if (machine__resolve_callchain(machine, evsel, thread,
+					sample, NULL, &al, PERF_MAX_STACK_DEPTH) != 0) {
+		if (verbose)
+			error("Failed to resolve callchain. Skipping\n");
+
+		return false;
+	}
+	callchain_cursor_commit(cursor);
+
+	/* idle symbol should be early in the stack */
+	while (iter) {
+		node = callchain_cursor_current(cursor);
+		if (!node)
+			break;
+
+		if (symbol__is_idle(node->sym))
+			return true;
+
+		callchain_cursor_advance(cursor);
+
+		iter--;
+	}
+
+	return false;
+}
+
+static int init_idle_threads(int ncpu)
+{
+	int i;
+
+	if (ncpu == 0)
+		ncpu = 1;
+
+	idle_threads = zalloc(ncpu * sizeof(struct thread *));
+	if (!idle_threads)
+		return -ENOMEM;
+
+	idle_max_cpu = ncpu - 1;
+
+	/* allocate the actual thread struct if needed */
+	for (i = 0; i < ncpu; ++i) {
+		idle_threads[i] = thread__new(0, 0);
+		if (idle_threads[i] == NULL)
+			return -ENOMEM;
+
+		thread__set_comm(idle_threads[i], idle_comm, 0);
+	}
+
+	return 0;
+}
+
+static void free_idle_threads(void)
+{
+	int i;
+
+	if (idle_threads == NULL)
+		return;
+
+	for (i = 0; i <= idle_max_cpu; ++i)
+		free(idle_threads[i]);
+
+	free(idle_threads);
+}
+
+static struct thread *get_idle_thread(int cpu)
+{
+	/* expand/allocate array of pointers to local thread
+	 * structs if needed
+	 */
+	if ((cpu > idle_max_cpu) || (idle_threads == NULL)) {
+		int i, j = 15;
+		void *p;
+
+		if (cpu > j)
+			j = cpu;
+		p = realloc(idle_threads, (j+1) * sizeof(struct thread *));
+		if (!p)
+			return NULL;
+
+		idle_threads = (struct thread **) p;
+		i = idle_max_cpu ? idle_max_cpu + 1 : 0;
+		for (; i <= cpu; ++i)
+			idle_threads[i] = NULL;
+
+		idle_max_cpu = cpu;
+	}
+
+	/* allocate the actual thread struct if needed */
+	if (idle_threads[cpu] == NULL) {
+		idle_threads[cpu] = thread__new(0, 0);
+		if (idle_threads[cpu]) {
+			idle_threads[cpu]->tid = 0;
+			thread__set_comm(idle_threads[cpu], idle_comm, 0);
+		}
+	}
+
+	return idle_threads[cpu];
+}
+
+static struct thread_runtime *thread__init_runtime(struct thread *thread)
+{
+	struct thread_runtime *r;
+
+	r = zalloc(sizeof(struct thread_runtime));
+	if (!r)
+		return NULL;
+
+	r->min_run_time = (u64) -1;
+	thread__set_priv(thread, r);
+
+	return r;
+}
+
+static struct thread_runtime *thread__get_runtime(struct thread *thread)
+{
+	struct thread_runtime *tr;
+
+	tr = thread__priv(thread);
+	if (tr == NULL) {
+		tr = thread__init_runtime(thread);
+		if (tr == NULL)
+			pr_debug("failed to malloc memory for runtime data.\n");
+	}
+
+	return tr;
+}
+
+static struct thread *timehist_get_thread(struct perf_evsel *evsel,
+					  struct perf_sample *sample,
+					  struct machine *machine)
+{
+	struct thread *thread;
+
+	if (is_idle_sample(sample, evsel, machine)) {
+		thread = get_idle_thread(sample->cpu);
+		if (thread == NULL)
+			pr_err("failed to get idle thread for cpu %d.\n", sample->cpu);
+
+	} else {
+		thread = machine__findnew_thread(machine, sample->pid, sample->tid);
+		if (thread == NULL) {
+			pr_debug("Failed to get thread for tid %d. skipping sample.\n",
+				 sample->tid);
+		}
+	}
+
+	return thread;
+}
+
+static bool timehist_skip_sample(struct perf_sched *sched,
+				 struct thread *thread)
+{
+	/*
+	 * if user gave a comm list, only show event if waker or wakee
+	 * is on the list
+	 */
+	if (thread__is_filtered(thread))
+		return true;
+
+	if (sched->pid && intlist__find(sched->pid, thread->pid_) == NULL)
+		return true;
+
+	if (sched->tid && intlist__find(sched->tid, thread->tid) == NULL)
+		return true;
+
+	return false;
+}
+
+static void timehist_print_wakeup_event(struct perf_sched *sched,
+					struct perf_sample *sample,
+					struct machine *machine,
+					struct thread *awakened)
+{
+	struct thread *thread;
+	char tstr[64];
+
+	thread = machine__findnew_thread(machine, sample->pid, sample->tid);
+	if (thread == NULL)
+		return;
+
+	if (timehist_skip_sample(sched, thread) &&
+	    timehist_skip_sample(sched, awakened)) {
+		return;
+	}
+
+	printf("%15s ", timehist_time_str(tstr, sizeof(tstr), sample->time));
+	printf("[%02d] ", sample->cpu);
+	if (sched->show_cpu_visual && sched->max_cpu)
+		printf("%*s  ", sched->max_cpu, "");
+
+	printf("%-*s ", comm_width, timehist_get_commstr(thread));
+
+	/* dt spacer */
+	printf(" %9s %9s %9s ", "", "", "");
+
+	printf("%-*s ", comm_width, timehist_get_commstr(awakened));
+
+	printf("\n");
+
+	return;
+}
+
+static int timehist_sched_wakeup_event(struct perf_tool *tool,
+				       union perf_event *event __maybe_unused,
+				       struct perf_evsel *evsel,
+				       struct perf_sample *sample,
+				       struct machine *machine)
+{
+	struct perf_sched *sched = container_of(tool, struct perf_sched, tool);
+	struct thread *thread;
+	struct thread_runtime *tr = NULL;
+	/* want pid of awakened task not pid in sample */
+	const u32 pid = perf_evsel__intval(evsel, sample, "pid");
+
+	thread = machine__findnew_thread(machine, 0, pid);
+	if (thread == NULL)
+		return -1;
+
+	tr = thread__get_runtime(thread);
+	if (tr == NULL)
+		return -1;
+
+	if (tr->ready_to_run == 0)
+		tr->ready_to_run = sample->time;
+
+	/* show wakeups if requested */
+	if (sched->show_wakeups && !sched->summary_only)
+		timehist_print_wakeup_event(sched, sample, machine, thread);
+
+	return 0;
+}
+
+static int timehist_sched_change_event(struct perf_tool *tool,
+				       union perf_event *event,
+				       struct perf_evsel *evsel,
+				       struct perf_sample *sample,
+				       struct machine *machine)
+{
+	struct perf_sched *sched = container_of(tool, struct perf_sched, tool);
+	struct thread *thread;
+	struct thread_runtime *tr = NULL;
+	u64 tprev;
+	int rc = 0;
+
+	thread = timehist_get_thread(evsel, sample, machine);
+	if (thread == NULL) {
+		rc = -1;
+		goto out;
+	}
+
+	if (timehist_skip_sample(sched, thread))
+		goto out;
+
+	tr = thread__get_runtime(thread);
+	if (tr == NULL) {
+		rc = -1;
+		goto out;
+	}
+
+	tprev = perf_evsel__get_time(evsel, sample->cpu);
+
+	timehist_update_runtime_stats(tr, sample->time, tprev);
+	if (!sched->summary_only)
+		timehist_print_sample(sched, event, evsel, sample, thread, machine);
+
+out:
+	if (tr)
+		tr->last_time = sample->time;
+	perf_evsel__save_time(evsel, sample->time, sample->cpu);
+
+	return rc;
+}
+
+static int timehist_sched_switch_event(struct perf_tool *tool,
+				       union perf_event *event,
+				       struct perf_evsel *evsel,
+				       struct perf_sample *sample,
+				       struct machine *machine)
+{
+	return timehist_sched_change_event(tool, event, evsel, sample, machine);
+}
+
+static int process_lost(struct perf_tool *tool __maybe_unused,
+			union perf_event *event,
+			struct perf_sample *sample,
+			struct machine *machine __maybe_unused)
+{
+	char tstr[64];
+
+	printf("%15s ", timehist_time_str(tstr, sizeof(tstr), sample->time));
+	printf("lost %" PRIu64 " events on cpu %d\n", event->lost.lost, sample->cpu);
+
+	return 0;
+}
+
+
+static void print_thread_runtime(struct thread *t, struct thread_runtime *r)
+{
+	double variance;
+	float stddev;
+
+	printf("    %*s  %5d  %6" PRIu64 "  ",
+	       comm_width, timehist_get_commstr(t), t->ppid, r->count);
+
+	printf_nsecs(r->total_run_time, 2);
+	if (r->count > 1) {
+		variance = r->M2 / (r->count - 1);
+		stddev = 100.0 * sqrt(variance / r->count) / r->mean;
+
+		printf(" ");
+		printf_nsecs(r->min_run_time, 2);
+		printf(" ");
+		printf_nsecs((u64) r->mean, 2);
+		printf(" ");
+		printf_nsecs(r->max_run_time, 2);
+		printf(" ");
+		printf("%5.2f", stddev);
+	}
+	printf("\n");
+}
+
+#define	UPDATE_TOTALS(r) {\
+			total_count += (r)->count; \
+			total_run_time += (r)->total_run_time; \
+			total_task++; \
+		}
+
+static void timehist_print_summary(struct perf_session *session)
+{
+	struct machine *m = &session->machines.host;
+	struct rb_node *nd;
+	struct thread *t;
+	struct thread_runtime *r;
+	u64 total_run_time = 0;
+	u64 total_count = 0;
+	int i, total_task = 0;
+
+	printf("\nRuntime summary (times are in seconds)\n");
+	printf("    %*s  parent  number  ", comm_width, "comm");
+	printf("run-time    min-run    avg-run   max-run  stddev(%%)\n");
+
+	nd = rb_first(&m->threads);
+	while (nd) {
+		t = rb_entry(nd, struct thread, rb_node);
+		if (t->dead)
+			goto next1;
+
+		if (thread__is_filtered(t))
+			goto next1;
+
+		r = thread__priv(t);
+		if (r && r->count) {
+			UPDATE_TOTALS(r);
+			print_thread_runtime(t, r);
+		}
+next1:
+		nd = rb_next(nd);
+	}
+
+	printf("\nTerminated tasks:\n");
+	nd = rb_first(&m->threads);
+	while (nd) {
+		t = rb_entry(nd, struct thread, rb_node);
+		if (!t->dead)
+			goto next2;
+
+		if (thread__is_filtered(t))
+			goto next2;
+
+		r = thread__priv(t);
+		if (r && r->count) {
+			UPDATE_TOTALS(r);
+			print_thread_runtime(t, r);
+		}
+next2:
+		nd = rb_next(nd);
+	}
+
+	list_for_each_entry(t, &m->dead_threads, node) {
+		r = thread__priv(t);
+		if (r && r->count) {
+			UPDATE_TOTALS(r);
+			print_thread_runtime(t, r);
+		}
+	}
+
+	printf("\nIdle stats:\n");
+	for (i = 0; i <= idle_max_cpu; ++i) {
+		t = idle_threads[i];
+		if (!t)
+			continue;
+
+		r = thread__priv(t);
+		if (r && r->count) {
+			total_count += r->count;
+			printf("    CPU %d idle for ", i);
+			printf_nsecs(r->total_run_time, 2);
+			printf("\n");
+		}
+	}
+
+	if (!symbol_conf.vmlinux_name && !symbol_conf.kallsyms_name)
+		pr_warning("kernel symbols not given. If this is an offline analysis idle time computations will be wrong\n");
+
+	printf("\n"
+	       "    Total number of unique tasks: %d\n"
+	       "Total number of context switches: %" PRIu64 "\n"
+	       "            Total run time (sec): ",
+	       total_task, total_count);
+	printf_nsecs(total_run_time, 2);
+	printf("\n");
+}
+
+typedef int (*sched_handler)(struct perf_tool *tool,
+			  union perf_event *event,
+			  struct perf_evsel *evsel,
+			  struct perf_sample *sample,
+			  struct machine *machine);
+
+static int perf_timehist__process_sample(struct perf_tool *tool,
+					 union perf_event *event,
+					 struct perf_sample *sample,
+					 struct perf_evsel *evsel,
+					 struct machine *machine)
+{
+	int err = 0;
+
+	evsel->hists.stats.total_period += sample->period;
+	hists__inc_nr_events(&evsel->hists, PERF_RECORD_SAMPLE);
+
+	if (evsel->handler != NULL) {
+		sched_handler f = evsel->handler;
+
+		err = f(tool, event, evsel, sample, machine);
+	}
+
+	return err;
+}
+
+static int setup_excl_sym(void)
+{
+	if (excl_sym_list_str &&
+	    setup_list(&excl_sym_list, excl_sym_list_str, "excl_sym") < 0)
+		return -1;
+
+	return 0;
+}
+
+static int timehist_symbol_filter(struct map *map, struct symbol *sym)
+{
+	/* filter out schedule and syscall related symbols from stack trace */
+	if (map->dso->kernel == DSO_TYPE_KERNEL) {
+		if ((strncmp(sym->name, "schedule", 8) == 0) ||
+		    (strcmp(sym->name, "__schedule") == 0)) {
+			sym->ignore = true;
+			return 0;
+		}
+		if ((strcmp(sym->name, "syscall") == 0) ||
+		    (strcmp(sym->name, "system_call_done") == 0) ||
+		    (strcmp(sym->name, "ia32_syscall_done") == 0)) {
+			sym->ignore = true;
+			return 0;
+		}
+	}
+
+	if ((excl_sym_list && strlist__has_entry(excl_sym_list, sym->name)) ||
+	    symbol__is_idle(sym))
+		sym->ignore = true;
+
+	return 0;
+}
+
+static int parse_target_str(struct perf_sched *sched)
+{
+	if (sched->target.pid) {
+		sched->pid = intlist__new(sched->target.pid);
+		if (sched->pid == NULL) {
+			pr_err("Error parsing process id string\n");
+			return -EINVAL;
+		}
+	}
+
+	if (sched->target.tid) {
+		sched->tid = intlist__new(sched->target.tid);
+		if (sched->tid == NULL) {
+			pr_err("Error parsing thread id string\n");
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+static int timehist_check_attr(struct perf_sched *sched,
+			       struct perf_evlist *evlist)
+{
+	struct perf_evsel *evsel;
+	struct evsel_runtime *er;
+
+	list_for_each_entry(evsel, &evlist->entries, node) {
+		er = perf_evsel__get_runtime(evsel);
+		if (er == NULL) {
+			pr_err("Failed to allocate memory for evsel runtime data\n");
+			return -1;
+		}
+
+		if (sched->show_callchain &&
+		    !(evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN)) {
+			pr_info("Samples do not have callchains.\n");
+			sched->show_callchain = 0;
+			symbol_conf.use_callchain = 0;
+		}
+	}
+
+	return 0;
+}
+
+static int perf_sched__timehist(struct perf_sched *sched)
+{
+	const struct perf_evsel_str_handler handlers[] = {
+		{ "sched:sched_switch",       timehist_sched_switch_event, },
+		{ "sched:sched_wakeup",	      timehist_sched_wakeup_event, },
+		{ "sched:sched_wakeup_new",   timehist_sched_wakeup_event, },
+	};
+	struct perf_data_file file = {
+		.path = input_name,
+		.mode = PERF_DATA_MODE_READ,
+	};
+
+	struct perf_session *session;
+	int err = -1;
+
+	/*
+	 * event handlers for timehist option
+	 */
+	sched->tool.sample	 = perf_timehist__process_sample;
+	sched->tool.mmap	 = perf_event__process_mmap;
+	sched->tool.comm	 = perf_event__process_comm;
+	sched->tool.exit	 = perf_event__process_exit;
+	sched->tool.fork	 = perf_event__process_fork;
+	sched->tool.lost	 = process_lost;
+	sched->tool.attr	 = perf_event__process_attr;
+	sched->tool.tracing_data = perf_event__process_tracing_data;
+	sched->tool.build_id	 = perf_event__process_build_id;
+
+	sched->tool.ordered_samples = true;
+	sched->tool.ordering_requires_timestamps = true;
+
+	if (setup_excl_sym() < 0)
+		return -1;
+
+	symbol_conf.use_callchain = sched->show_callchain;
+	if (symbol__init() < 0)
+		return -1;
+
+	session = perf_session__new(&file, false, &sched->tool);
+	if (session == NULL)
+		return -ENOMEM;
+
+	machines__set_symbol_filter(&session->machines, timehist_symbol_filter);
+
+	if (timehist_check_attr(sched, session->evlist) != 0)
+		goto out;
+
+	if (parse_target_str(sched) != 0)
+		goto out;
+
+	setup_pager();
+
+	/* setup per-evsel handlers */
+	if (perf_session__set_tracepoints_handlers(session, handlers))
+		goto out;
+
+	if (!perf_session__has_traces(session, "record -R"))
+		goto out;
+
+	/* pre-allocate struct for per-CPU idle stats */
+	sched->max_cpu = session->header.env.nr_cpus_online;
+	if (init_idle_threads(sched->max_cpu))
+		goto out;
+
+	/* summary_only implies summary option, but don't overwrite summary if set */
+	if (sched->summary_only)
+		sched->summary = sched->summary_only;
+
+	if (!sched->summary_only)
+		timehist_header(sched);
+
+	err = perf_session__process_events(session, &sched->tool);
+	if (err) {
+		pr_err("Failed to process events, error %d", err);
+		goto out;
+	}
+
+	sched->nr_events      = session->stats.nr_events[0];
+	sched->nr_lost_events = session->stats.total_lost;
+	sched->nr_lost_chunks = session->stats.nr_events[PERF_RECORD_LOST];
+
+	if (sched->summary)
+		timehist_print_summary(session);
+
+out:
+	free_idle_threads();
+	perf_session__delete(session);
+
+	return err;
+}
+
 static void print_bad_events(struct perf_sched *sched)
 {
 	if (sched->nr_unordered_timestamps && sched->nr_timestamps) {
@@ -1675,6 +2596,8 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
 		.profile_cpu	      = -1,
 		.next_shortname1      = 'A',
 		.next_shortname2      = '0',
+		.show_callchain	      = 1,
+		.max_stack            = 5,
 	};
 	const struct option latency_options[] = {
 	OPT_STRING('s', "sort", &sched.sort_order, "key[,key2...]",
@@ -1705,6 +2628,42 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
 		    "dump raw trace in ASCII"),
 	OPT_END()
 	};
+	const struct option timehist_options[] = {
+	OPT_STRING('i', "input", &input_name, "file",
+		    "input file name"),
+	OPT_INCR('v', "verbose", &verbose,
+		    "be more verbose (show symbol address, etc)"),
+	OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
+		   "file", "vmlinux pathname"),
+	OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name,
+		   "file", "kallsyms pathname"),
+	OPT_STRING('c', "comms", &symbol_conf.comm_list_str, "comm[,comm...]",
+		   "only display events for these comms"),
+	OPT_STRING('p', "pid", &sched.target.pid, "pid",
+		   "analyze events only for given process id(s)"),
+	OPT_STRING('t', "tid", &sched.target.tid, "tid",
+		    "analyze events only for given thread id(s)"),
+	OPT_BOOLEAN('g', "call-graph", &sched.show_callchain,
+		    "Display call chains if present (default on)"),
+	OPT_UINTEGER(0, "max-stack", &sched.max_stack,
+		   "Maximum number of functions to display backtrace."),
+	OPT_STRING('x', "excl", &excl_sym_list_str, "sym[,sym...]",
+		   "symbols to skip in backtrace"),
+	OPT_STRING(0, "symfs", &symbol_conf.symfs, "directory",
+		    "Look for files with symbols relative to this directory"),
+	OPT_BOOLEAN('s', "summary", &sched.summary_only,
+		    "Show only syscall summary with statistics"),
+	OPT_BOOLEAN('S', "with-summary", &sched.summary,
+		    "Show all syscalls and summary with statistics"),
+	OPT_BOOLEAN('w', "wakeups", &sched.show_wakeups, "Show wakeup events"),
+	OPT_BOOLEAN('V', "cpu-visual", &sched.show_cpu_visual, "Add CPU visual"),
+	OPT_END()
+	};
+	const char * const timehist_usage[] = {
+		"perf sched timehist [<options>]",
+		NULL
+	};
+
 	const char * const latency_usage[] = {
 		"perf sched latency [<options>]",
 		NULL
@@ -1747,7 +2706,6 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
 	if (!strcmp(argv[0], "script"))
 		return cmd_script(argc, argv, prefix);
 
-	symbol__init();
 	if (!strncmp(argv[0], "rec", 3)) {
 		return __cmd_record(argc, argv);
 	} else if (!strncmp(argv[0], "lat", 3)) {
@@ -1758,10 +2716,12 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
 				usage_with_options(latency_usage, latency_options);
 		}
 		setup_sorting(&sched, latency_options, latency_usage);
+		symbol__init();
 		return perf_sched__lat(&sched);
 	} else if (!strcmp(argv[0], "map")) {
 		sched.tp_handler = &map_ops;
 		setup_sorting(&sched, latency_options, latency_usage);
+		symbol__init();
 		return perf_sched__map(&sched);
 	} else if (!strncmp(argv[0], "rep", 3)) {
 		sched.tp_handler = &replay_ops;
@@ -1770,7 +2730,17 @@ int cmd_sched(int argc, const char **argv, const char *prefix __maybe_unused)
 			if (argc)
 				usage_with_options(replay_usage, replay_options);
 		}
+		symbol__init();
 		return perf_sched__replay(&sched);
+	} else if (!strcmp(argv[0], "timehist")) {
+		if (argc) {
+			argc = parse_options(argc, argv, timehist_options,
+					     timehist_usage, 0);
+			if (argc)
+				usage_with_options(timehist_usage, timehist_options);
+		}
+
+		return perf_sched__timehist(&sched);
 	} else {
 		usage_with_options(sched_usage, sched_options);
 	}
-- 
1.8.3.4 (Apple Git-47)


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 7/8] perf sched timehist: Add support for context-switch event
  2013-11-18 20:32 [PATCH 0/8] perf: sched timehist command David Ahern
                   ` (5 preceding siblings ...)
  2013-11-18 20:32 ` [PATCH 6/8] perf sched: Introduce timehist command David Ahern
@ 2013-11-18 20:32 ` David Ahern
  2013-11-18 20:32 ` [PATCH 8/8] perf sched : Add documentation for timehist options David Ahern
  7 siblings, 0 replies; 28+ messages in thread
From: David Ahern @ 2013-11-18 20:32 UTC (permalink / raw)
  To: acme, linux-kernel
  Cc: David Ahern, Ingo Molnar, Frederic Weisbecker, Peter Zijlstra,
	Mike Galbraith, Jiri Olsa, Namhyung Kim, Stephane Eranian,
	Pekka Enberg

Context switch events are 64 bytes; sched_switch events are 136 bytes.  Both
indicate scheduling changes, so allow user to leverage the smaller event. If
both events exist in a data file, then context-switch event is ignored.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
---
 tools/perf/builtin-sched.c | 40 +++++++++++++++++++++++++++++++++++-----
 1 file changed, 35 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index ffe11067083c..79a94655d3ed 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -180,6 +180,7 @@ struct perf_sched {
 	unsigned int	max_stack;
 	bool		show_cpu_visual;
 	bool		show_wakeups;
+	bool		have_traces;
 	/* process and task id's of interest */
 	struct target target;
 	struct intlist  *pid, *tid;
@@ -2070,6 +2071,15 @@ out:
 	return rc;
 }
 
+static int timehist_cs_event(struct perf_tool *tool __maybe_unused,
+			     union perf_event *event,
+			     struct perf_evsel *evsel,
+			     struct perf_sample *sample,
+			     struct machine *machine __maybe_unused)
+{
+	return timehist_sched_change_event(tool, event, evsel, sample, machine);
+}
+
 static int timehist_sched_switch_event(struct perf_tool *tool,
 				       union perf_event *event,
 				       struct perf_evsel *evsel,
@@ -2293,10 +2303,25 @@ static int parse_target_str(struct perf_sched *sched)
 static int timehist_check_attr(struct perf_sched *sched,
 			       struct perf_evlist *evlist)
 {
-	struct perf_evsel *evsel;
+	struct perf_evsel *evsel, *cs_evsel = NULL;
 	struct evsel_runtime *er;
+	const char *evname;
+	bool have_cs_event = false, have_sched_event = false;
 
 	list_for_each_entry(evsel, &evlist->entries, node) {
+		if (evsel->attr.type == PERF_TYPE_TRACEPOINT)
+			sched->have_traces = true;
+
+		evname = perf_evsel__name(evsel);
+		if (strcmp(evname, "cs") == 0 ||
+		    strcmp(evname, "context-switch") == 0) {
+			cs_evsel = evsel;
+			have_cs_event = true;
+		} else if (strcmp(evname, "sched:sched_switch") == 0) {
+			have_sched_event = true;
+		}
+
+
 		er = perf_evsel__get_runtime(evsel);
 		if (er == NULL) {
 			pr_err("Failed to allocate memory for evsel runtime data\n");
@@ -2311,6 +2336,13 @@ static int timehist_check_attr(struct perf_sched *sched,
 		}
 	}
 
+	if (have_cs_event && have_sched_event) {
+		pr_debug("Both schedule change events exist. Ignoring context-switch event\n");
+	} else if (have_cs_event) {
+		pr_debug("Using context-switch events.\n");
+		cs_evsel->handler = timehist_cs_event;
+	}
+
 	return 0;
 }
 
@@ -2367,10 +2399,8 @@ static int perf_sched__timehist(struct perf_sched *sched)
 	setup_pager();
 
 	/* setup per-evsel handlers */
-	if (perf_session__set_tracepoints_handlers(session, handlers))
-		goto out;
-
-	if (!perf_session__has_traces(session, "record -R"))
+	if (sched->have_traces &&
+	    perf_session__set_tracepoints_handlers(session, handlers))
 		goto out;
 
 	/* pre-allocate struct for per-CPU idle stats */
-- 
1.8.3.4 (Apple Git-47)


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 8/8] perf sched : Add documentation for timehist options
  2013-11-18 20:32 [PATCH 0/8] perf: sched timehist command David Ahern
                   ` (6 preceding siblings ...)
  2013-11-18 20:32 ` [PATCH 7/8] perf sched timehist: Add support for context-switch event David Ahern
@ 2013-11-18 20:32 ` David Ahern
  2013-11-28 15:42   ` Namhyung Kim
  7 siblings, 1 reply; 28+ messages in thread
From: David Ahern @ 2013-11-18 20:32 UTC (permalink / raw)
  To: acme, linux-kernel
  Cc: David Ahern, Ingo Molnar, Frederic Weisbecker, Peter Zijlstra,
	Mike Galbraith, Jiri Olsa, Namhyung Kim, Stephane Eranian,
	Pekka Enberg

Add entry to perf-sched documentation for timehist command and its
options.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Pekka Enberg <penberg@kernel.org>
---
 tools/perf/Documentation/perf-sched.txt | 60 ++++++++++++++++++++++++++++++++-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
index 8ff4df956951..83fac8b86f52 100644
--- a/tools/perf/Documentation/perf-sched.txt
+++ b/tools/perf/Documentation/perf-sched.txt
@@ -8,7 +8,7 @@ perf-sched - Tool to trace/measure scheduler properties (latencies)
 SYNOPSIS
 --------
 [verse]
-'perf sched' {record|latency|map|replay|script}
+'perf sched' {record|latency|map|replay|script|timehist}
 
 DESCRIPTION
 -----------
@@ -36,6 +36,11 @@ There are five variants of perf sched:
   are running on a CPU. A '*' denotes the CPU that had the event, and
   a dot signals an idle CPU.
 
+  'perf sched timehist' to show context-switching analysis with times
+  between schedule-in, schedule delay, and run time. If callchains are
+  present stack trace is dumped at the end of the line. A summary of
+  run times can be shown as well.
+
 OPTIONS
 -------
 -i::
@@ -50,6 +55,59 @@ OPTIONS
 --dump-raw-trace=::
         Display verbose dump of the sched data.
 
+TIMEHIST OPTIONS
+----------------
+-k::
+--vmlinux=<file>::
+    vmlinux pathname
+
+--kallsyms=<file>::
+    kallsyms pathname
+
+-s::
+--summary::
+    Show only a summary of scheduling by thread with min, max, and average
+    run times (in sec) and relative stddev.
+
+-S::
+--with-summary::
+    Show all scheduling events followed by a summary by thread with min,
+    max, and average run times (in sec) and relative stddev.
+
+-w::
+--wakeups::
+    Show wakeup events.
+
+-c::
+--comms=::
+    Only show events for given comms (comma separated list)
+
+-p::
+--pid=::
+    Only show events for given process ID (comma separated list).
+
+-t::
+--tid=::
+    Only show events for given thread ID (comma separated list).
+
+--call-graph::
+        Display call chains. Default is on. Use --no-call-graph to suppress
+
+--max-stack::
+    Set the stack depth limit when showing the callchains. Default: 127
+
+-x::
+--excl::
+    Symbols to skip in backtrace. Useful to compact repetitive or
+    uninteresting frames to pull more useful frames into view.
+
+--symfs=<directory>::
+    Look for files with symbols relative to this directory.
+
+-V::
+--cpu-visual::
+    Add a visual for following events by cpu
+
 SEE ALSO
 --------
 linkperf:perf-record[1]
-- 
1.8.3.4 (Apple Git-47)


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/8] perf symbol: Save vmlinux or kallsyms path loaded
  2013-11-18 20:32 ` [PATCH 3/8] perf symbol: Save vmlinux or kallsyms path loaded David Ahern
@ 2013-11-22 18:44   ` Arnaldo Carvalho de Melo
  2013-11-22 19:13     ` David Ahern
  0 siblings, 1 reply; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-22 18:44 UTC (permalink / raw)
  To: David Ahern
  Cc: linux-kernel, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Frederic Weisbecker, Peter Zijlstra, Stephane Eranian

Em Mon, Nov 18, 2013 at 01:32:46PM -0700, David Ahern escreveu:
> Save vmlinux or kallsyms path loaded using embedded default lookup (ie.,
> not a user specified path). Upcoming perf sched timehist command requires
> kernel symbols for properly computing idle times and prints a warning
> if the kernel symbols are not loaded.

What would happen if strdup failed here?

Why don't we undo anything done so far in this function and return an
error?

- Arnaldo
 
> Signed-off-by: David Ahern <dsahern@gmail.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Stephane Eranian <eranian@google.com>
> ---
>  tools/perf/util/symbol.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index f55c18da1e40..915eb6c4a248 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -1437,6 +1437,8 @@ int dso__load_vmlinux(struct dso *dso, struct map *map,
>  			dso->data_type = DSO_BINARY_TYPE__VMLINUX;
>  		dso__set_long_name(dso, (char *)vmlinux);
>  		dso__set_loaded(dso, map->type);
> +		if (symbol_conf.vmlinux_name == NULL)
> +			symbol_conf.vmlinux_name = strdup(symfs_vmlinux);
>  		pr_debug("Using %s for symbols\n", symfs_vmlinux);
>  	}
>  
> @@ -1629,8 +1631,11 @@ static int dso__load_kernel_sym(struct dso *dso, struct map *map,
>  
>  do_kallsyms:
>  	err = dso__load_kallsyms(dso, kallsyms_filename, map, filter);
> -	if (err > 0)
> +	if (err > 0) {
> +		if (symbol_conf.kallsyms_name == NULL)
> +			symbol_conf.kallsyms_name = strdup(kallsyms_filename);
>  		pr_debug("Using %s for symbols\n", kallsyms_filename);
> +	}
>  	free(kallsyms_allocated_filename);
>  
>  	if (err > 0 && !dso__is_kcore(dso)) {
> -- 
> 1.8.3.4 (Apple Git-47)

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/8] perf symbol: Save vmlinux or kallsyms path loaded
  2013-11-22 18:44   ` Arnaldo Carvalho de Melo
@ 2013-11-22 19:13     ` David Ahern
  2013-11-22 19:40       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 28+ messages in thread
From: David Ahern @ 2013-11-22 19:13 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Frederic Weisbecker, Peter Zijlstra, Stephane Eranian

On 11/22/13, 11:44 AM, Arnaldo Carvalho de Melo wrote:
> Em Mon, Nov 18, 2013 at 01:32:46PM -0700, David Ahern escreveu:
>> Save vmlinux or kallsyms path loaded using embedded default lookup (ie.,
>> not a user specified path). Upcoming perf sched timehist command requires
>> kernel symbols for properly computing idle times and prints a warning
>> if the kernel symbols are not loaded.
>
> What would happen if strdup failed here?

perf continues on happily. On the timehist command will question 
computation of idle times which require kernel symbols.

>
> Why don't we undo anything done so far in this function and return an
> error?
>

In the notion of best effort of continuing. It's not a show stopper that 
strdup failed; it's just not the filename it used for symbols.

David

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/8] perf symbol: Save vmlinux or kallsyms path loaded
  2013-11-22 19:13     ` David Ahern
@ 2013-11-22 19:40       ` Arnaldo Carvalho de Melo
  2013-11-22 20:09         ` David Ahern
  0 siblings, 1 reply; 28+ messages in thread
From: Arnaldo Carvalho de Melo @ 2013-11-22 19:40 UTC (permalink / raw)
  To: David Ahern
  Cc: linux-kernel, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Frederic Weisbecker, Peter Zijlstra, Stephane Eranian

Em Fri, Nov 22, 2013 at 12:13:52PM -0700, David Ahern escreveu:
> On 11/22/13, 11:44 AM, Arnaldo Carvalho de Melo wrote:
> >Em Mon, Nov 18, 2013 at 01:32:46PM -0700, David Ahern escreveu:
> >>Save vmlinux or kallsyms path loaded using embedded default lookup (ie.,
> >>not a user specified path). Upcoming perf sched timehist command requires
> >>kernel symbols for properly computing idle times and prints a warning
> >>if the kernel symbols are not loaded.

> >What would happen if strdup failed here?

> perf continues on happily. On the timehist command will question
> computation of idle times which require kernel symbols.

> >Why don't we undo anything done so far in this function and return an
> >error?

> In the notion of best effort of continuing. It's not a show stopper
> that strdup failed; it's just not the filename it used for symbols.

So if later on one would need to print that info, it would just show
"unknown"?

- Arnaldo

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 3/8] perf symbol: Save vmlinux or kallsyms path loaded
  2013-11-22 19:40       ` Arnaldo Carvalho de Melo
@ 2013-11-22 20:09         ` David Ahern
  0 siblings, 0 replies; 28+ messages in thread
From: David Ahern @ 2013-11-22 20:09 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, Ingo Molnar, Jiri Olsa, Namhyung Kim,
	Frederic Weisbecker, Peter Zijlstra, Stephane Eranian

On 11/22/13, 12:40 PM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Nov 22, 2013 at 12:13:52PM -0700, David Ahern escreveu:
>> On 11/22/13, 11:44 AM, Arnaldo Carvalho de Melo wrote:
>>> Em Mon, Nov 18, 2013 at 01:32:46PM -0700, David Ahern escreveu:
>>>> Save vmlinux or kallsyms path loaded using embedded default lookup (ie.,
>>>> not a user specified path). Upcoming perf sched timehist command requires
>>>> kernel symbols for properly computing idle times and prints a warning
>>>> if the kernel symbols are not loaded.
>
>>> What would happen if strdup failed here?
>
>> perf continues on happily. On the timehist command will question
>> computation of idle times which require kernel symbols.
>
>>> Why don't we undo anything done so far in this function and return an
>>> error?
>
>> In the notion of best effort of continuing. It's not a show stopper
>> that strdup failed; it's just not the filename it used for symbols.
>
> So if later on one would need to print that info, it would just show
> "unknown"?

no. you get a warning message that idle values might be wrong.

David


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/8] perf symbols: Move idle syms check from top to generic function
  2013-11-18 20:32 ` [PATCH 2/8] perf symbols: Move idle syms check from top to generic function David Ahern
@ 2013-11-28  8:36   ` Namhyung Kim
  2013-11-30 15:58     ` David Ahern
  2013-11-30 12:52   ` [tip:perf/core] " tip-bot for David Ahern
  1 sibling, 1 reply; 28+ messages in thread
From: Namhyung Kim @ 2013-11-28  8:36 UTC (permalink / raw)
  To: David Ahern
  Cc: acme, linux-kernel, Ingo Molnar, Frederic Weisbecker,
	Peter Zijlstra, Jiri Olsa, Stephane Eranian

Hi David,

Just minor nits below..

On Mon, 18 Nov 2013 13:32:45 -0700, David Ahern wrote:
> Allows list of idle symbols to be leveraged by other commands,
> such as the upcoming timehist command.

[SNIP]
> +bool symbol__is_idle(struct symbol *sym)
> +{
> +	const char * const idle_symbols[] = {

Wouldn't it better making it static?  It seems to build a table
everytime otherwise.


> +		"cpu_idle",
> +		"intel_idle",
> +		"default_idle",
> +		"native_safe_halt",
> +		"enter_idle",
> +		"exit_idle",
> +		"mwait_idle",
> +		"mwait_idle_with_hints",
> +		"poll_idle",
> +		"ppc64_runlatch_off",
> +		"pseries_dedicated_idle_sleep",
> +		NULL
> +	};
> +
> +	int i;
> +
> +	if (!sym)
> +		return false;
> +
> +	for (i = 0; idle_symbols[i]; i++) {

Also we can use ARRAY_SIZE() here and let the last NULL go IMHO.

Thanks,
Namhyung


> +		if (!strcmp(idle_symbols[i], sym->name))
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
>  static int map__process_kallsym_symbol(void *arg, const char *name,
>  				       char type, u64 start)
>  {
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 07de8fea2f48..ad13c5d50b91 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -240,6 +240,7 @@ size_t symbol__fprintf(struct symbol *sym, FILE *fp);
>  bool symbol_type__is_a(char symbol_type, enum map_type map_type);
>  bool symbol__restricted_filename(const char *filename,
>  				 const char *restricted_filename);
> +bool symbol__is_idle(struct symbol *sym);
>  
>  int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>  		  struct symsrc *runtime_ss, symbol_filter_t filter,

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 5/8] perf tool: export setup_list
  2013-11-18 20:32 ` [PATCH 5/8] perf tool: export setup_list David Ahern
@ 2013-11-28  8:45   ` Namhyung Kim
  2013-11-30 12:52   ` [tip:perf/core] perf tools: Export setup_list tip-bot for David Ahern
  1 sibling, 0 replies; 28+ messages in thread
From: Namhyung Kim @ 2013-11-28  8:45 UTC (permalink / raw)
  To: David Ahern
  Cc: acme, linux-kernel, Ingo Molnar, Frederic Weisbecker,
	Peter Zijlstra, Jiri Olsa, Stephane Eranian

On Mon, 18 Nov 2013 13:32:48 -0700, David Ahern wrote:
> Used in next patch (perf sched timehist command)

Why not moving it to util/strlist.c and rename to strlist__setup()?

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/8] perf sched: Introduce timehist command
  2013-11-18 20:32 ` [PATCH 6/8] perf sched: Introduce timehist command David Ahern
@ 2013-11-28  9:50   ` Namhyung Kim
  2013-11-28 16:01     ` David Ahern
  2013-11-28 15:38   ` Namhyung Kim
  1 sibling, 1 reply; 28+ messages in thread
From: Namhyung Kim @ 2013-11-28  9:50 UTC (permalink / raw)
  To: David Ahern
  Cc: acme, linux-kernel, Ingo Molnar, Frederic Weisbecker,
	Peter Zijlstra, Mike Galbraith, Jiri Olsa, Stephane Eranian,
	Pekka Enberg

On Mon, 18 Nov 2013 13:32:49 -0700, David Ahern wrote:
> 'perf sched timehist' provides an analysis of scheduling events.
>
> Example usage:
>     perf sched record -- sleep 1
>     perf sched timehist
>
> By default it shows the individual schedule events, including the time between
> sched-in events for the task, the task scheduling delay (time between wakeup
> and actually running) and run time for the task:
>
>            time cpu  task name[tid/pid]    b/n time sch delay  run time
> --------------- ---- -------------------- --------- --------- ---------
>    79371.874569 [11] gcc[31949]            0.000014  0.000000  0.001148
>    79371.874591 [10] gcc[31951]            0.000000  0.000000  0.000024
>    79371.874603 [10] migration/10[59]      0.003350  0.000004  0.000011
>    79371.874604 [11] <idle>                0.001148  0.000000  0.000035
>    79371.874723 [05] <idle>                0.000016  0.000000  0.001383
>    79371.874746 [05] gcc[31949]            0.000153  0.000078  0.000022
> ...
>
> If callchains were recorded they are appended to the line with a default stack depth of 5:
>
>    79371.874569 [11] gcc[31949]                  0.000014  0.000000  0.001148  wait_for_completion_killable do_fork sys_vfork stub_vfork __vfork
>    79371.874591 [10] gcc[31951]                  0.000000  0.000000  0.000024  __cond_resched _cond_resched wait_for_completion stop_one_cpu sched_exec
>    79371.874603 [10] migration/10[59]            0.003350  0.000004  0.000011  smpboot_thread_fn kthread ret_from_fork
>    79371.874604 [11] <idle>                      0.001148  0.000000  0.000035  cpu_startup_entry start_secondary
>    79371.874723 [05] <idle>                      0.000016  0.000000  0.001383  cpu_startup_entry start_secondary
>    79371.874746 [05] gcc[31949]                  0.000153  0.000078  0.000022  do_wait sys_wait4 system_call_fastpath __GI___waitpid
>
> --no-call-graph can be used to not show the callchains. --max-stack is used
> to control the number of frames shown (default of 5). -x/--excl options can
> be used to collapse redundant callchains to get more relevant data on screen.
>
> Similar to perf-trace -s and -S can be used to dump a statistical summary
> without or with events (respectively). Statistics include min run time,
> average run time and max run time. Stats are also shown for run time by
> cpu.
>
> The cpu-visual option provides a visual aid for sched switches by cpu:
> ...
>    79371.874569 [11]            s      gcc[31949]                  0.000014  0.000000  0.001148
>    79371.874591 [10]           s       gcc[31951]                  0.000000  0.000000  0.000024
>    79371.874603 [10]           s       migration/10[59]            0.003350  0.000004  0.000011
>    79371.874604 [11]            i      <idle>                      0.001148  0.000000  0.000035
>    79371.874723 [05]      i            <idle>                      0.000016  0.000000  0.001383
>    79371.874746 [05]      s            gcc[31949]                  0.000153  0.000078  0.000022
> ...
>
> Additional options:
> - -w option can be used to include a dump of wakeup events
> - -c comm to only display events and stats for the given comm (csv list)
> - -p/-t to only display events and stats for given pid/tid (csv list)

Looks like a very nice feature.  I haven't read the whole patch yet but
just want some comments.  More will come later. :)


[SNIP]
> +static inline void printf_nsecs(unsigned long long nsecs, int width_sec)
> +{
> +	unsigned long secs;
> +	unsigned long usecs;
> +
> +	secs = nsecs / NSECS_PER_SEC;
> +	nsecs -= secs * NSECS_PER_SEC;
> +	usecs = nsecs / NSECS_PER_USEC;
> +	printf("%*lu.%06lu ", width_sec, secs, usecs);
> +}

It seems very similar to what timehist_time_str() does.  Better to
factor out?


[SNIP]
> +static unsigned int comm_width = 20;
> +
> +static char *timehist_get_commstr(struct thread *thread)
> +{
> +	static char str[32];
> +	const char *comm = thread__comm_str(thread);
> +	pid_t tid = thread->tid;
> +	pid_t pid = thread->pid_;
> +	unsigned int n;
> +
> +	if (pid == 0)
> +		scnprintf(str, sizeof(str), "%s", comm);
> +
> +	else if (tid != pid)
> +		scnprintf(str, sizeof(str), "%s[%d/%d]", comm, tid, pid);
> +
> +	else
> +		scnprintf(str, sizeof(str), "%s[%d]", comm, tid);
> +
> +	n = strlen(str);

Why not just using return value of scnprintf()?


> +	if (n > comm_width)
> +		comm_width = n;
> +
> +	return str;
> +}
> +
> +static void timehist_header(struct perf_sched *sched)
> +{
> +	u32 max_cpus = sched->max_cpu;
> +	u32 i, j;
> +
> +	printf("%15s %-4s", "time", "cpu");
> +
> +	if (sched->show_cpu_visual && max_cpus) {
> +		printf(" ");
> +		for (i = 0, j = 0; i < max_cpus; ++i) {
> +			printf("%x", j++);
> +			if (j > 15)
> +				j = 0;
> +		}
> +		printf(" ");
> +	}
> +
> +	printf(" %-20s %9s %9s %9s",
> +		"task name[tid/pid]", "b/n time", "sch delay", "run time");
> +
> +	if (sched->show_wakeups)
> +		printf("  %-20s", "wakeup");
> +
> +	printf("\n");
> +
> +	printf("%15s %4s", "---------------", "----");

You might want to use "%.15s" and graph_dotted_line.

> +
> +	if (sched->show_cpu_visual && max_cpus) {
> +		printf(" ");
> +		for (i = 0; i < max_cpus; ++i)
> +			printf("-");
> +		printf(" ");
> +	}
> +
> +	printf(" %20s %9s %9s %9s",
> +		"--------------------", "---------", "---------", "---------");

Ditto.

> +
> +	if (sched->show_wakeups)
> +		printf("  %20s", "--------------------");

Ditto.

> +
> +	printf("\n");
> +}
> +
> +static char *timehist_time_str(char *tstr, int len, u64 t)
> +{
> +	unsigned long secs, usecs;
> +	unsigned long long nsecs;
> +
> +	nsecs = t;
> +	secs = nsecs / NSECS_PER_SEC;
> +	nsecs -= secs * NSECS_PER_SEC;
> +	usecs = nsecs / NSECS_PER_USEC;
> +	snprintf(tstr, len, "%5lu.%06lu", secs, usecs);
> +
> +	return tstr;
> +}

See above.

> +
> +static void timehist_print_sample(struct perf_sched *sched,
> +				  union perf_event *event,
> +				  struct perf_evsel *evsel,
> +				  struct perf_sample *sample,
> +				  struct thread *thread,
> +				  struct machine *machine)
> +{
> +	struct thread_runtime *tr = thread__priv(thread);
> +	char tstr[64];
> +	u32 max_cpus = sched->max_cpu;
> +
> +	printf("%15s ", timehist_time_str(tstr, sizeof(tstr), sample->time));
> +
> +	printf("[%02d] ", sample->cpu);
> +
> +	if (sched->show_cpu_visual && max_cpus) {
> +		u32 i;
> +		char c;
> +		for (i = 0; i < max_cpus; ++i) {
> +			if (i == sample->cpu)
> +				c = (thread->tid == 0) ? 'i' : 's';

It'd better explaining what the 'i' and 's' mean..


> +			else
> +				c = ' ';
> +			printf("%c", c);
> +		}
> +		printf("  ");
> +	}
> +
> +	printf("%-*s ", comm_width, timehist_get_commstr(thread));
> +
> +	printf_nsecs(tr->dt_between, 2);
> +	printf_nsecs(tr->dt_delay, 2);
> +	printf_nsecs(tr->dt_run, 2);
> +
> +	if (sched->show_wakeups)
> +		printf(" %-*s ", comm_width, "");
> +
> +	perf_evsel__print_ip(evsel, event, sample, machine,
> +			     PRINT_IP_OPT_SYM | PRINT_IP_OPT_ONELINE,
> +			     sched->max_stack);
> +
> +	printf("\n");
> +}

[SNIP]
> +static void free_idle_threads(void)
> +{
> +	int i;
> +
> +	if (idle_threads == NULL)
> +		return;
> +
> +	for (i = 0; i <= idle_max_cpu; ++i)
> +		free(idle_threads[i]);

Doesn't it need to be thread__delete(idle_threads[i]) ?

Thanks,
Namhyung

> +
> +	free(idle_threads);
> +}

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/8] perf sched: Introduce timehist command
  2013-11-18 20:32 ` [PATCH 6/8] perf sched: Introduce timehist command David Ahern
  2013-11-28  9:50   ` Namhyung Kim
@ 2013-11-28 15:38   ` Namhyung Kim
  2013-11-28 16:01     ` David Ahern
  1 sibling, 1 reply; 28+ messages in thread
From: Namhyung Kim @ 2013-11-28 15:38 UTC (permalink / raw)
  To: David Ahern
  Cc: Arnaldo Melo, linux-kernel, Ingo Molnar, Frederic Weisbecker,
	Peter Zijlstra, Mike Galbraith, Jiri Olsa, Stephane Eranian,
	Pekka Enberg

On Tue, Nov 19, 2013 at 5:32 AM, David Ahern <dsahern@gmail.com> wrote:

[SNIP]
> +static bool is_idle_sample(struct perf_sample *sample,
> +                          struct perf_evsel *evsel,
> +                          struct machine *machine)
> +{
> +       struct thread *thread;
> +       struct callchain_cursor *cursor = &callchain_cursor;
> +       struct callchain_cursor_node *node;
> +       struct addr_location al;
> +       int iter = 5;

Shouldn't it be sched->max_stack somehow?

> +
> +       /* pid 0 == swapper == idle task */
> +       if (sample->pid == 0)
> +               return true;
> +
> +       /* want main thread for process - has maps */
> +       thread = machine__findnew_thread(machine, sample->pid, sample->pid);
> +       if (thread == NULL) {
> +               pr_debug("Failed to get thread for pid %d.\n", sample->pid);
> +               return false;
> +       }
> +
> +       if (!symbol_conf.use_callchain || sample->callchain == NULL)
> +               return false;
> +
> +       if (machine__resolve_callchain(machine, evsel, thread,
> +                                       sample, NULL, &al, PERF_MAX_STACK_DEPTH) != 0) {
> +               if (verbose)
> +                       error("Failed to resolve callchain. Skipping\n");
> +
> +               return false;
> +       }

I think this callchain resolving logic should be moved to the
beginning of perf_hist__process_sample() like other commands do.  It's
not for idle threads only.

And it also needs to pass sched->max_stack.


> +       callchain_cursor_commit(cursor);
> +
> +       /* idle symbol should be early in the stack */
> +       while (iter) {
> +               node = callchain_cursor_current(cursor);
> +               if (!node)
> +                       break;
> +
> +               if (symbol__is_idle(node->sym))
> +                       return true;
> +
> +               callchain_cursor_advance(cursor);
> +
> +               iter--;
> +       }
> +
> +       return false;
> +}

[SNIP]
> +       /* show wakeups if requested */
> +       if (sched->show_wakeups && !sched->summary_only)

Hmm.. maybe we can emit a warning if -w and -s options are given at
the same time.


> +               timehist_print_wakeup_event(sched, sample, machine, thread);
> +
> +       return 0;
> +}

[SNIP]
> +static int parse_target_str(struct perf_sched *sched)
> +{
> +       if (sched->target.pid) {
> +               sched->pid = intlist__new(sched->target.pid);
> +               if (sched->pid == NULL) {
> +                       pr_err("Error parsing process id string\n");
> +                       return -EINVAL;
> +               }
> +       }
> +
> +       if (sched->target.tid) {
> +               sched->tid = intlist__new(sched->target.tid);
> +               if (sched->tid == NULL) {
> +                       pr_err("Error parsing thread id string\n");


Need to call intlist__delete(sched->pid) here?


> +                       return -EINVAL;
> +               }
> +       }
> +
> +       return 0;
> +}

[SNIP]
> +       const struct option timehist_options[] = {
> +       OPT_STRING('i', "input", &input_name, "file",
> +                   "input file name"),
> +       OPT_INCR('v', "verbose", &verbose,
> +                   "be more verbose (show symbol address, etc)"),
> +       OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
> +                  "file", "vmlinux pathname"),
> +       OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name,
> +                  "file", "kallsyms pathname"),
> +       OPT_STRING('c', "comms", &symbol_conf.comm_list_str, "comm[,comm...]",
> +                  "only display events for these comms"),
> +       OPT_STRING('p', "pid", &sched.target.pid, "pid",
> +                  "analyze events only for given process id(s)"),
> +       OPT_STRING('t', "tid", &sched.target.tid, "tid",
> +                   "analyze events only for given thread id(s)"),
> +       OPT_BOOLEAN('g', "call-graph", &sched.show_callchain,
> +                   "Display call chains if present (default on)"),
> +       OPT_UINTEGER(0, "max-stack", &sched.max_stack,
> +                  "Maximum number of functions to display backtrace."),
> +       OPT_STRING('x', "excl", &excl_sym_list_str, "sym[,sym...]",

Why not renaming long option name to --exclude or --exclude-symbols?

Thanks,
Namhyung


> +                  "symbols to skip in backtrace"),
> +       OPT_STRING(0, "symfs", &symbol_conf.symfs, "directory",
> +                   "Look for files with symbols relative to this directory"),
> +       OPT_BOOLEAN('s', "summary", &sched.summary_only,
> +                   "Show only syscall summary with statistics"),
> +       OPT_BOOLEAN('S', "with-summary", &sched.summary,
> +                   "Show all syscalls and summary with statistics"),
> +       OPT_BOOLEAN('w', "wakeups", &sched.show_wakeups, "Show wakeup events"),
> +       OPT_BOOLEAN('V', "cpu-visual", &sched.show_cpu_visual, "Add CPU visual"),
> +       OPT_END()
> +       };
> +       const char * const timehist_usage[] = {
> +               "perf sched timehist [<options>]",
> +               NULL
> +       };
> +

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 8/8] perf sched : Add documentation for timehist options
  2013-11-18 20:32 ` [PATCH 8/8] perf sched : Add documentation for timehist options David Ahern
@ 2013-11-28 15:42   ` Namhyung Kim
  0 siblings, 0 replies; 28+ messages in thread
From: Namhyung Kim @ 2013-11-28 15:42 UTC (permalink / raw)
  To: David Ahern
  Cc: Arnaldo Melo, linux-kernel, Ingo Molnar, Frederic Weisbecker,
	Peter Zijlstra, Mike Galbraith, Jiri Olsa, Stephane Eranian,
	Pekka Enberg

On Tue, Nov 19, 2013 at 5:32 AM, David Ahern <dsahern@gmail.com> wrote:
> Add entry to perf-sched documentation for timehist command and its
> options.
>

> +--max-stack::
> +    Set the stack depth limit when showing the callchains. Default: 127

I think the default is 5.

> +
> +-x::
> +--excl::

The long name can be more verbose as I said.


> +    Symbols to skip in backtrace. Useful to compact repetitive or
> +    uninteresting frames to pull more useful frames into view.
> +
> +--symfs=<directory>::
> +    Look for files with symbols relative to this directory.
> +
> +-V::
> +--cpu-visual::
> +    Add a visual for following events by cpu

I think it'd be better if you elaborate it bit more. :)

Thanks,
Namhyung


> +
>  SEE ALSO
>  --------
>  linkperf:perf-record[1]
> --
> 1.8.3.4 (Apple Git-47)
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/8] perf sched: Introduce timehist command
  2013-11-28  9:50   ` Namhyung Kim
@ 2013-11-28 16:01     ` David Ahern
  0 siblings, 0 replies; 28+ messages in thread
From: David Ahern @ 2013-11-28 16:01 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: acme, linux-kernel, Ingo Molnar, Frederic Weisbecker,
	Peter Zijlstra, Mike Galbraith, Jiri Olsa, Stephane Eranian,
	Pekka Enberg

On 11/28/13, 2:50 AM, Namhyung Kim wrote:
>> +static inline void printf_nsecs(unsigned long long nsecs, int width_sec)
>> >+{
>> >+	unsigned long secs;
>> >+	unsigned long usecs;
>> >+
>> >+	secs = nsecs / NSECS_PER_SEC;
>> >+	nsecs -= secs * NSECS_PER_SEC;
>> >+	usecs = nsecs / NSECS_PER_USEC;
>> >+	printf("%*lu.%06lu ", width_sec, secs, usecs);
>> >+}
> It seems very similar to what timehist_time_str() does.  Better to
> factor out?

yes and no. The above is used to dump time-based stats (time between 
sched-in events, scheduling delay, and run time).

What I really want for timehist_time_str is time-of-day. And in local 
variants of this code I have that. Until I can convince Peter, et al. 
about the real need to pull perf_clock into userspace this is a bit of a 
place holder so that local code and this version do not differ too much.

>
>
> [SNIP]
>> >+static unsigned int comm_width = 20;
>> >+
>> >+static char *timehist_get_commstr(struct thread *thread)
>> >+{
>> >+	static char str[32];
>> >+	const char *comm = thread__comm_str(thread);
>> >+	pid_t tid = thread->tid;
>> >+	pid_t pid = thread->pid_;
>> >+	unsigned int n;
>> >+
>> >+	if (pid == 0)
>> >+		scnprintf(str, sizeof(str), "%s", comm);
>> >+
>> >+	else if (tid != pid)
>> >+		scnprintf(str, sizeof(str), "%s[%d/%d]", comm, tid, pid);
>> >+
>> >+	else
>> >+		scnprintf(str, sizeof(str), "%s[%d]", comm, tid);
>> >+
>> >+	n = strlen(str);
> Why not just using return value of scnprintf()?

right. will do.

>
>
>> >+	if (n > comm_width)
>> >+		comm_width = n;
>> >+
>> >+	return str;
>> >+}
>> >+
>> >+static void timehist_header(struct perf_sched *sched)
>> >+{
>> >+	u32 max_cpus = sched->max_cpu;
>> >+	u32 i, j;
>> >+
>> >+	printf("%15s %-4s", "time", "cpu");
>> >+
>> >+	if (sched->show_cpu_visual && max_cpus) {
>> >+		printf(" ");
>> >+		for (i = 0, j = 0; i < max_cpus; ++i) {
>> >+			printf("%x", j++);
>> >+			if (j > 15)
>> >+				j = 0;
>> >+		}
>> >+		printf(" ");
>> >+	}
>> >+
>> >+	printf(" %-20s %9s %9s %9s",
>> >+		"task name[tid/pid]", "b/n time", "sch delay", "run time");
>> >+
>> >+	if (sched->show_wakeups)
>> >+		printf("  %-20s", "wakeup");
>> >+
>> >+	printf("\n");
>> >+
>> >+	printf("%15s %4s", "---------------", "----");
> You might want to use "%.15s" and graph_dotted_line.

did not know about that. will update.


>> >+static void timehist_print_sample(struct perf_sched *sched,
>> >+				  union perf_event *event,
>> >+				  struct perf_evsel *evsel,
>> >+				  struct perf_sample *sample,
>> >+				  struct thread *thread,
>> >+				  struct machine *machine)
>> >+{
>> >+	struct thread_runtime *tr = thread__priv(thread);
>> >+	char tstr[64];
>> >+	u32 max_cpus = sched->max_cpu;
>> >+
>> >+	printf("%15s ", timehist_time_str(tstr, sizeof(tstr), sample->time));
>> >+
>> >+	printf("[%02d] ", sample->cpu);
>> >+
>> >+	if (sched->show_cpu_visual && max_cpus) {
>> >+		u32 i;
>> >+		char c;
>> >+		for (i = 0; i < max_cpus; ++i) {
>> >+			if (i == sample->cpu)
>> >+				c = (thread->tid == 0) ? 'i' : 's';
> It'd better explaining what the 'i' and 's' mean..

visual aid: i = idle, s = sched-event. will add a comment.

>> >+static void free_idle_threads(void)
>> >+{
>> >+	int i;
>> >+
>> >+	if (idle_threads == NULL)
>> >+		return;
>> >+
>> >+	for (i = 0; i <= idle_max_cpu; ++i)
>> >+		free(idle_threads[i]);
> Doesn't it need to be thread__delete(idle_threads[i]) ?

yes. forgot to update this to the new api.

David


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/8] perf sched: Introduce timehist command
  2013-11-28 15:38   ` Namhyung Kim
@ 2013-11-28 16:01     ` David Ahern
  2013-11-29  0:48       ` Namhyung Kim
  0 siblings, 1 reply; 28+ messages in thread
From: David Ahern @ 2013-11-28 16:01 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Melo, linux-kernel, Ingo Molnar, Frederic Weisbecker,
	Peter Zijlstra, Mike Galbraith, Jiri Olsa, Stephane Eranian,
	Pekka Enberg

On 11/28/13, 8:38 AM, Namhyung Kim wrote:
> On Tue, Nov 19, 2013 at 5:32 AM, David Ahern <dsahern@gmail.com> wrote:
>
> [SNIP]
>> +static bool is_idle_sample(struct perf_sample *sample,
>> +                          struct perf_evsel *evsel,
>> +                          struct machine *machine)
>> +{
>> +       struct thread *thread;
>> +       struct callchain_cursor *cursor = &callchain_cursor;
>> +       struct callchain_cursor_node *node;
>> +       struct addr_location al;
>> +       int iter = 5;
>
> Shouldn't it be sched->max_stack somehow?

max_stack is used to dump callstack to the user. In this case we are 
walking the stack looking for an idle symbol.

>
>> +
>> +       /* pid 0 == swapper == idle task */
>> +       if (sample->pid == 0)
>> +               return true;
>> +
>> +       /* want main thread for process - has maps */
>> +       thread = machine__findnew_thread(machine, sample->pid, sample->pid);
>> +       if (thread == NULL) {
>> +               pr_debug("Failed to get thread for pid %d.\n", sample->pid);
>> +               return false;
>> +       }
>> +
>> +       if (!symbol_conf.use_callchain || sample->callchain == NULL)
>> +               return false;
>> +
>> +       if (machine__resolve_callchain(machine, evsel, thread,
>> +                                       sample, NULL, &al, PERF_MAX_STACK_DEPTH) != 0) {
>> +               if (verbose)
>> +                       error("Failed to resolve callchain. Skipping\n");
>> +
>> +               return false;
>> +       }
>
> I think this callchain resolving logic should be moved to the
> beginning of perf_hist__process_sample() like other commands do.  It's
> not for idle threads only.

I'll see what can be done.

>
> And it also needs to pass sched->max_stack.

Per above, max_stack has a different purpose

>
>
>> +       callchain_cursor_commit(cursor);
>> +
>> +       /* idle symbol should be early in the stack */
>> +       while (iter) {
>> +               node = callchain_cursor_current(cursor);
>> +               if (!node)
>> +                       break;
>> +
>> +               if (symbol__is_idle(node->sym))
>> +                       return true;
>> +
>> +               callchain_cursor_advance(cursor);
>> +
>> +               iter--;
>> +       }
>> +
>> +       return false;
>> +}
>
> [SNIP]
>> +       /* show wakeups if requested */
>> +       if (sched->show_wakeups && !sched->summary_only)
>
> Hmm.. maybe we can emit a warning if -w and -s options are given at
> the same time.

ok.

>
>
>> +               timehist_print_wakeup_event(sched, sample, machine, thread);
>> +
>> +       return 0;
>> +}
>
> [SNIP]
>> +static int parse_target_str(struct perf_sched *sched)
>> +{
>> +       if (sched->target.pid) {
>> +               sched->pid = intlist__new(sched->target.pid);
>> +               if (sched->pid == NULL) {
>> +                       pr_err("Error parsing process id string\n");
>> +                       return -EINVAL;
>> +               }
>> +       }
>> +
>> +       if (sched->target.tid) {
>> +               sched->tid = intlist__new(sched->target.tid);
>> +               if (sched->tid == NULL) {
>> +                       pr_err("Error parsing thread id string\n");
>
>
> Need to call intlist__delete(sched->pid) here?

indeed. thanks.


>> +       const struct option timehist_options[] = {
>> +       OPT_STRING('i', "input", &input_name, "file",
>> +                   "input file name"),
>> +       OPT_INCR('v', "verbose", &verbose,
>> +                   "be more verbose (show symbol address, etc)"),
>> +       OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
>> +                  "file", "vmlinux pathname"),
>> +       OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name,
>> +                  "file", "kallsyms pathname"),
>> +       OPT_STRING('c', "comms", &symbol_conf.comm_list_str, "comm[,comm...]",
>> +                  "only display events for these comms"),
>> +       OPT_STRING('p', "pid", &sched.target.pid, "pid",
>> +                  "analyze events only for given process id(s)"),
>> +       OPT_STRING('t', "tid", &sched.target.tid, "tid",
>> +                   "analyze events only for given thread id(s)"),
>> +       OPT_BOOLEAN('g', "call-graph", &sched.show_callchain,
>> +                   "Display call chains if present (default on)"),
>> +       OPT_UINTEGER(0, "max-stack", &sched.max_stack,
>> +                  "Maximum number of functions to display backtrace."),
>> +       OPT_STRING('x', "excl", &excl_sym_list_str, "sym[,sym...]",
>
> Why not renaming long option name to --exclude or --exclude-symbols?

ok.

David

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/8] perf sched: Introduce timehist command
  2013-11-28 16:01     ` David Ahern
@ 2013-11-29  0:48       ` Namhyung Kim
  2013-11-29  1:58         ` David Ahern
  0 siblings, 1 reply; 28+ messages in thread
From: Namhyung Kim @ 2013-11-29  0:48 UTC (permalink / raw)
  To: David Ahern
  Cc: Arnaldo Melo, linux-kernel, Ingo Molnar, Frederic Weisbecker,
	Peter Zijlstra, Mike Galbraith, Jiri Olsa, Stephane Eranian,
	Pekka Enberg

Hi David,

On Thu, 28 Nov 2013 09:01:23 -0700, David Ahern wrote:
> On 11/28/13, 8:38 AM, Namhyung Kim wrote:
>> On Tue, Nov 19, 2013 at 5:32 AM, David Ahern <dsahern@gmail.com> wrote:
>>
>> [SNIP]
>>> +static bool is_idle_sample(struct perf_sample *sample,
>>> +                          struct perf_evsel *evsel,
>>> +                          struct machine *machine)
>>> +{
>>> +       struct thread *thread;
>>> +       struct callchain_cursor *cursor = &callchain_cursor;
>>> +       struct callchain_cursor_node *node;
>>> +       struct addr_location al;
>>> +       int iter = 5;
>>
>> Shouldn't it be sched->max_stack somehow?
>
> max_stack is used to dump callstack to the user. In this case we are
> walking the stack looking for an idle symbol.

Do we really need to look up the callchain to find out an idle thread?

  $ perf sched script | grep swapper | head
         swapper     0 [001] 4294177.326996: sched:sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=Xorg next_pid=1094 next_prio=120
         swapper     0 [010] 4294177.327019: sched:sched_switch: prev_comm=swapper/10 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=13902 next_prio=120
            perf 13901 [002] 4294177.327074: sched:sched_switch: prev_comm=perf prev_pid=13901 prev_prio=120 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120
         swapper     0 [004] 4294177.327096: sched:sched_switch: prev_comm=swapper/4 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=synergys next_pid=1521 next_prio=120
         swapper     0 [000] 4294177.327102: sched:sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=gnome-terminal next_pid=2392 next_prio=120
            Xorg  1094 [001] 4294177.327112: sched:sched_switch: prev_comm=Xorg prev_pid=1094 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120
         swapper     0 [007] 4294177.327122: sched:sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=13902 next_prio=120
    migration/10    58 [010] 4294177.327124: sched:sched_switch: prev_comm=migration/10 prev_pid=58 prev_prio=0 prev_state=S ==> next_comm=swapper/10 next_pid=0 next_prio=120
        synergys  1521 [004] 4294177.327144: sched:sched_switch: prev_comm=synergys prev_pid=1521 prev_prio=120 prev_state=S ==> next_comm=swapper/4 next_pid=0 next_prio=120
  gnome-terminal  2392 [000] 4294177.327286: sched:sched_switch: prev_comm=gnome-terminal prev_pid=2392 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120

It seems every idle/swapper thread for each cpu has a pid of 0.

>
>>
>>> +
>>> +       /* pid 0 == swapper == idle task */
>>> +       if (sample->pid == 0)
>>> +               return true;
>>> +
>>> +       /* want main thread for process - has maps */
>>> +       thread = machine__findnew_thread(machine, sample->pid, sample->pid);
>>> +       if (thread == NULL) {
>>> +               pr_debug("Failed to get thread for pid %d.\n", sample->pid);
>>> +               return false;
>>> +       }
>>> +
>>> +       if (!symbol_conf.use_callchain || sample->callchain == NULL)
>>> +               return false;
>>> +
>>> +       if (machine__resolve_callchain(machine, evsel, thread,
>>> +                                       sample, NULL, &al, PERF_MAX_STACK_DEPTH) != 0) {
>>> +               if (verbose)
>>> +                       error("Failed to resolve callchain. Skipping\n");
>>> +
>>> +               return false;
>>> +       }
>>
>> I think this callchain resolving logic should be moved to the
>> beginning of perf_hist__process_sample() like other commands do.  It's
>> not for idle threads only.
>
> I'll see what can be done.
>
>>
>> And it also needs to pass sched->max_stack.
>
> Per above, max_stack has a different purpose

Hmm.. anyway I don't think we need to pass PERF_MAX_STACK_DEPTH for
machine__resolve_callchain() as we'll only look up to max_stack entries.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/8] perf sched: Introduce timehist command
  2013-11-29  0:48       ` Namhyung Kim
@ 2013-11-29  1:58         ` David Ahern
  2013-12-04  4:15           ` David Ahern
  0 siblings, 1 reply; 28+ messages in thread
From: David Ahern @ 2013-11-29  1:58 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Melo, linux-kernel, Ingo Molnar, Frederic Weisbecker,
	Peter Zijlstra, Mike Galbraith, Jiri Olsa, Stephane Eranian,
	Pekka Enberg

On 11/28/13, 5:48 PM, Namhyung Kim wrote:
> Do we really need to look up the callchain to find out an idle thread?
>
>    $ perf sched script | grep swapper | head
>           swapper     0 [001] 4294177.326996: sched:sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=Xorg next_pid=1094 next_prio=120
>           swapper     0 [010] 4294177.327019: sched:sched_switch: prev_comm=swapper/10 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=13902 next_prio=120
>              perf 13901 [002] 4294177.327074: sched:sched_switch: prev_comm=perf prev_pid=13901 prev_prio=120 prev_state=S ==> next_comm=swapper/2 next_pid=0 next_prio=120
>           swapper     0 [004] 4294177.327096: sched:sched_switch: prev_comm=swapper/4 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=synergys next_pid=1521 next_prio=120
>           swapper     0 [000] 4294177.327102: sched:sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=gnome-terminal next_pid=2392 next_prio=120
>              Xorg  1094 [001] 4294177.327112: sched:sched_switch: prev_comm=Xorg prev_pid=1094 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120
>           swapper     0 [007] 4294177.327122: sched:sched_switch: prev_comm=swapper/7 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=perf next_pid=13902 next_prio=120
>      migration/10    58 [010] 4294177.327124: sched:sched_switch: prev_comm=migration/10 prev_pid=58 prev_prio=0 prev_state=S ==> next_comm=swapper/10 next_pid=0 next_prio=120
>          synergys  1521 [004] 4294177.327144: sched:sched_switch: prev_comm=synergys prev_pid=1521 prev_prio=120 prev_state=S ==> next_comm=swapper/4 next_pid=0 next_prio=120
>    gnome-terminal  2392 [000] 4294177.327286: sched:sched_switch: prev_comm=gnome-terminal prev_pid=2392 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
>
> It seems every idle/swapper thread for each cpu has a pid of 0.

The command was developed using context-switch events. Try the next 
patch and use:

perf record -e cs -c 1 -ag
perf sched timehist

David

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [tip:perf/core] perf evsel: Skip ignored symbols while printing callchain
  2013-11-18 20:32 ` [PATCH 1/8] perf tool: Skip ignored symbols while printing callchain David Ahern
@ 2013-11-30 12:52   ` tip-bot for David Ahern
  0 siblings, 0 replies; 28+ messages in thread
From: tip-bot for David Ahern @ 2013-11-30 12:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, eranian, hpa, mingo, peterz, namhyung, jolsa,
	fweisbec, dsahern, adrian.hunter, tglx

Commit-ID:  d2ff1b1499c8e0ad2fc79376a4215ba37771823f
Gitweb:     http://git.kernel.org/tip/d2ff1b1499c8e0ad2fc79376a4215ba37771823f
Author:     David Ahern <dsahern@gmail.com>
AuthorDate: Mon, 18 Nov 2013 13:32:44 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 27 Nov 2013 14:58:37 -0300

perf evsel: Skip ignored symbols while printing callchain

Allows a command to have a symbol_filter controlled by the user to skip
certain functions in a backtrace. One example is to allow the user to
reduce repeating patterns like:

    do_select  core_sys_select  sys_select

to just sys_select when dumping callchains, consuming less real estate
on the screen while still conveying the essential message - the process
is in a select call.

This option is leveraged by the upcoming timehist command.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384806771-2945-2-git-send-email-dsahern@gmail.com
[ Checked if al.sym is NULL before touching al.sym->ignored, as noted by Adrian Hunter ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/session.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f36d24a..b0b15e2 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1522,6 +1522,9 @@ void perf_evsel__print_ip(struct perf_evsel *evsel, union perf_event *event,
 			if (!node)
 				break;
 
+			if (node->sym && node->sym->ignore)
+				goto next;
+
 			if (print_ip)
 				printf("%c%16" PRIx64, s, node->ip);
 
@@ -1544,12 +1547,15 @@ void perf_evsel__print_ip(struct perf_evsel *evsel, union perf_event *event,
 			if (!print_oneline)
 				printf("\n");
 
-			callchain_cursor_advance(&callchain_cursor);
-
 			stack_depth--;
+next:
+			callchain_cursor_advance(&callchain_cursor);
 		}
 
 	} else {
+		if (al.sym && al.sym->ignore)
+			return;
+
 		if (print_ip)
 			printf("%16" PRIx64, sample->ip);
 

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [tip:perf/core] perf symbols: Move idle syms check from top to generic function
  2013-11-18 20:32 ` [PATCH 2/8] perf symbols: Move idle syms check from top to generic function David Ahern
  2013-11-28  8:36   ` Namhyung Kim
@ 2013-11-30 12:52   ` tip-bot for David Ahern
  1 sibling, 0 replies; 28+ messages in thread
From: tip-bot for David Ahern @ 2013-11-30 12:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, eranian, hpa, mingo, peterz, namhyung, jolsa,
	fweisbec, dsahern, tglx

Commit-ID:  82d1deb0546a4af7a2ddbcfed99690b3a61776c5
Gitweb:     http://git.kernel.org/tip/82d1deb0546a4af7a2ddbcfed99690b3a61776c5
Author:     David Ahern <dsahern@gmail.com>
AuthorDate: Mon, 18 Nov 2013 13:32:45 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 27 Nov 2013 14:58:37 -0300

perf symbols: Move idle syms check from top to generic function

Allows list of idle symbols to be leveraged by other commands, such as
the upcoming timehist command.

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384806771-2945-3-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/builtin-top.c | 25 ++-----------------------
 tools/perf/util/symbol.c | 30 ++++++++++++++++++++++++++++++
 tools/perf/util/symbol.h |  1 +
 3 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 531522d..03d37a7 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -634,26 +634,9 @@ repeat:
 	return NULL;
 }
 
-/* Tag samples to be skipped. */
-static const char *skip_symbols[] = {
-	"intel_idle",
-	"default_idle",
-	"native_safe_halt",
-	"cpu_idle",
-	"enter_idle",
-	"exit_idle",
-	"mwait_idle",
-	"mwait_idle_with_hints",
-	"poll_idle",
-	"ppc64_runlatch_off",
-	"pseries_dedicated_idle_sleep",
-	NULL
-};
-
 static int symbol_filter(struct map *map __maybe_unused, struct symbol *sym)
 {
 	const char *name = sym->name;
-	int i;
 
 	/*
 	 * ppc64 uses function descriptors and appends a '.' to the
@@ -671,12 +654,8 @@ static int symbol_filter(struct map *map __maybe_unused, struct symbol *sym)
 	    strstr(name, "_text_end"))
 		return 1;
 
-	for (i = 0; skip_symbols[i]; i++) {
-		if (!strcmp(skip_symbols[i], name)) {
-			sym->ignore = true;
-			break;
-		}
-	}
+	if (symbol__is_idle(sym))
+		sym->ignore = true;
 
 	return 0;
 }
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index c0c3696..f55c18d 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -573,6 +573,36 @@ static u8 kallsyms2elf_type(char type)
 	return isupper(type) ? STB_GLOBAL : STB_LOCAL;
 }
 
+bool symbol__is_idle(struct symbol *sym)
+{
+	const char * const idle_symbols[] = {
+		"cpu_idle",
+		"intel_idle",
+		"default_idle",
+		"native_safe_halt",
+		"enter_idle",
+		"exit_idle",
+		"mwait_idle",
+		"mwait_idle_with_hints",
+		"poll_idle",
+		"ppc64_runlatch_off",
+		"pseries_dedicated_idle_sleep",
+		NULL
+	};
+
+	int i;
+
+	if (!sym)
+		return false;
+
+	for (i = 0; idle_symbols[i]; i++) {
+		if (!strcmp(idle_symbols[i], sym->name))
+			return true;
+	}
+
+	return false;
+}
+
 static int map__process_kallsym_symbol(void *arg, const char *name,
 				       char type, u64 start)
 {
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 07de8fe..ad13c5d 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -240,6 +240,7 @@ size_t symbol__fprintf(struct symbol *sym, FILE *fp);
 bool symbol_type__is_a(char symbol_type, enum map_type map_type);
 bool symbol__restricted_filename(const char *filename,
 				 const char *restricted_filename);
+bool symbol__is_idle(struct symbol *sym);
 
 int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 		  struct symsrc *runtime_ss, symbol_filter_t filter,

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [tip:perf/core] perf thread: Move comm_list check into function
  2013-11-18 20:32 ` [PATCH 4/8] perf thread: Move comm_list check into function David Ahern
@ 2013-11-30 12:52   ` tip-bot for David Ahern
  0 siblings, 0 replies; 28+ messages in thread
From: tip-bot for David Ahern @ 2013-11-30 12:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, eranian, hpa, mingo, peterz, namhyung, jolsa,
	fweisbec, dsahern, tglx

Commit-ID:  1f3878c11ca8c9f2d9545ec6e1e6bb5386e0565f
Gitweb:     http://git.kernel.org/tip/1f3878c11ca8c9f2d9545ec6e1e6bb5386e0565f
Author:     David Ahern <dsahern@gmail.com>
AuthorDate: Mon, 18 Nov 2013 13:32:47 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 27 Nov 2013 14:58:37 -0300

perf thread: Move comm_list check into function

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384806771-2945-5-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/event.c  |  3 +--
 tools/perf/util/thread.h | 12 ++++++++++++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index bb788c1..c77814b 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -732,8 +732,7 @@ int perf_event__preprocess_sample(const union perf_event *event,
 	if (thread == NULL)
 		return -1;
 
-	if (symbol_conf.comm_list &&
-	    !strlist__has_entry(symbol_conf.comm_list, thread__comm_str(thread)))
+	if (thread__is_filtered(thread))
 		goto out_filtered;
 
 	dump_printf(" ... thread: %s:%d\n", thread__comm_str(thread), thread->tid);
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 897c1b2..5b856bf 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -6,6 +6,7 @@
 #include <unistd.h>
 #include <sys/types.h>
 #include "symbol.h"
+#include <strlist.h>
 
 struct thread {
 	union {
@@ -66,4 +67,15 @@ static inline void thread__set_priv(struct thread *thread, void *p)
 {
 	thread->priv = p;
 }
+
+static inline bool thread__is_filtered(struct thread *thread)
+{
+	if (symbol_conf.comm_list &&
+	    !strlist__has_entry(symbol_conf.comm_list, thread__comm_str(thread))) {
+		return true;
+	}
+
+	return false;
+}
+
 #endif	/* __PERF_THREAD_H */

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [tip:perf/core] perf tools: Export setup_list
  2013-11-18 20:32 ` [PATCH 5/8] perf tool: export setup_list David Ahern
  2013-11-28  8:45   ` Namhyung Kim
@ 2013-11-30 12:52   ` tip-bot for David Ahern
  1 sibling, 0 replies; 28+ messages in thread
From: tip-bot for David Ahern @ 2013-11-30 12:52 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, eranian, hpa, mingo, peterz, namhyung, jolsa,
	fweisbec, dsahern, tglx

Commit-ID:  3bfe5f81fc8f87bf822f3da36927cfc549f3b3db
Gitweb:     http://git.kernel.org/tip/3bfe5f81fc8f87bf822f3da36927cfc549f3b3db
Author:     David Ahern <dsahern@gmail.com>
AuthorDate: Mon, 18 Nov 2013 13:32:48 -0700
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 27 Nov 2013 14:58:38 -0300

perf tools: Export setup_list

Used in upcoming patches (perf sched timehist command).

Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1384806771-2945-6-git-send-email-dsahern@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/symbol.c | 2 +-
 tools/perf/util/symbol.h | 3 +++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index f55c18d..ce9ce10 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1749,7 +1749,7 @@ out_fail:
 	return -1;
 }
 
-static int setup_list(struct strlist **list, const char *list_str,
+int setup_list(struct strlist **list, const char *list_str,
 		      const char *list_name)
 {
 	if (list_str == NULL)
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index ad13c5d..f1031a1 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -274,4 +274,7 @@ void kcore_extract__delete(struct kcore_extract *kce);
 int kcore_copy(const char *from_dir, const char *to_dir);
 int compare_proc_modules(const char *from, const char *to);
 
+int setup_list(struct strlist **list, const char *list_str,
+	       const char *list_name);
+
 #endif /* __PERF_SYMBOL */

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 2/8] perf symbols: Move idle syms check from top to generic function
  2013-11-28  8:36   ` Namhyung Kim
@ 2013-11-30 15:58     ` David Ahern
  0 siblings, 0 replies; 28+ messages in thread
From: David Ahern @ 2013-11-30 15:58 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: acme, linux-kernel, Ingo Molnar, Frederic Weisbecker,
	Peter Zijlstra, Jiri Olsa, Stephane Eranian

On 11/28/13, 1:36 AM, Namhyung Kim wrote:
> Hi David,
>
> Just minor nits below..
>
> On Mon, 18 Nov 2013 13:32:45 -0700, David Ahern wrote:
>> Allows list of idle symbols to be leveraged by other commands,
>> such as the upcoming timehist command.
>
> [SNIP]
>> +bool symbol__is_idle(struct symbol *sym)
>> +{
>> +	const char * const idle_symbols[] = {
>
> Wouldn't it better making it static?  It seems to build a table
> everytime otherwise.

Checked a couple of gcc's (4.3.2 32-bit and 4.6.3 64-bit) and both put 
idle_symbols in read-only data section. e.g.,

0000000000000020 r idle_symbols.13404

So I think it is fine as is.

>
>
>> +		"cpu_idle",
>> +		"intel_idle",
>> +		"default_idle",
>> +		"native_safe_halt",
>> +		"enter_idle",
>> +		"exit_idle",
>> +		"mwait_idle",
>> +		"mwait_idle_with_hints",
>> +		"poll_idle",
>> +		"ppc64_runlatch_off",
>> +		"pseries_dedicated_idle_sleep",
>> +		NULL
>> +	};
>> +
>> +	int i;
>> +
>> +	if (!sym)
>> +		return false;
>> +
>> +	for (i = 0; idle_symbols[i]; i++) {
>
> Also we can use ARRAY_SIZE() here and let the last NULL go IMHO.

I copied what was in builtin-top.c.

David

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 6/8] perf sched: Introduce timehist command
  2013-11-29  1:58         ` David Ahern
@ 2013-12-04  4:15           ` David Ahern
  0 siblings, 0 replies; 28+ messages in thread
From: David Ahern @ 2013-12-04  4:15 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Melo, linux-kernel, Ingo Molnar, Frederic Weisbecker,
	Peter Zijlstra, Mike Galbraith, Jiri Olsa, Stephane Eranian,
	Pekka Enberg

On 11/28/13, 6:58 PM, David Ahern wrote:
> On 11/28/13, 5:48 PM, Namhyung Kim wrote:
>> Do we really need to look up the callchain to find out an idle thread?
---8<---
>> It seems every idle/swapper thread for each cpu has a pid of 0.
>

I knew I had this code in there for a reason....

Older kernels (e.g., RHEL6) show init as the idle task for cpus != 0. 
So, to be robust across kernel versions the idle check needs to do more 
than just looking at the swapper thread as the incoming or outgoing 
task. It needs to walk the first few frames of the callstack looking for 
a known idle symbol.

David


^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2013-12-04  4:15 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-18 20:32 [PATCH 0/8] perf: sched timehist command David Ahern
2013-11-18 20:32 ` [PATCH 1/8] perf tool: Skip ignored symbols while printing callchain David Ahern
2013-11-30 12:52   ` [tip:perf/core] perf evsel: " tip-bot for David Ahern
2013-11-18 20:32 ` [PATCH 2/8] perf symbols: Move idle syms check from top to generic function David Ahern
2013-11-28  8:36   ` Namhyung Kim
2013-11-30 15:58     ` David Ahern
2013-11-30 12:52   ` [tip:perf/core] " tip-bot for David Ahern
2013-11-18 20:32 ` [PATCH 3/8] perf symbol: Save vmlinux or kallsyms path loaded David Ahern
2013-11-22 18:44   ` Arnaldo Carvalho de Melo
2013-11-22 19:13     ` David Ahern
2013-11-22 19:40       ` Arnaldo Carvalho de Melo
2013-11-22 20:09         ` David Ahern
2013-11-18 20:32 ` [PATCH 4/8] perf thread: Move comm_list check into function David Ahern
2013-11-30 12:52   ` [tip:perf/core] " tip-bot for David Ahern
2013-11-18 20:32 ` [PATCH 5/8] perf tool: export setup_list David Ahern
2013-11-28  8:45   ` Namhyung Kim
2013-11-30 12:52   ` [tip:perf/core] perf tools: Export setup_list tip-bot for David Ahern
2013-11-18 20:32 ` [PATCH 6/8] perf sched: Introduce timehist command David Ahern
2013-11-28  9:50   ` Namhyung Kim
2013-11-28 16:01     ` David Ahern
2013-11-28 15:38   ` Namhyung Kim
2013-11-28 16:01     ` David Ahern
2013-11-29  0:48       ` Namhyung Kim
2013-11-29  1:58         ` David Ahern
2013-12-04  4:15           ` David Ahern
2013-11-18 20:32 ` [PATCH 7/8] perf sched timehist: Add support for context-switch event David Ahern
2013-11-18 20:32 ` [PATCH 8/8] perf sched : Add documentation for timehist options David Ahern
2013-11-28 15:42   ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).