linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: Clark Williams <williams@redhat.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	Andi Kleen <ak@linux.intel.com>,
	Kim Phillips <kim.phillips@arm.com>,
	Arnaldo Carvalho de Melo <acme@redhat.com>
Subject: [PATCH 36/37] perf script: Implement --graph-function
Date: Thu, 25 Oct 2018 08:10:30 -0300	[thread overview]
Message-ID: <20181025111031.3440-37-acme@kernel.org> (raw)
In-Reply-To: <20181025111031.3440-1-acme@kernel.org>

From: Andi Kleen <ak@linux.intel.com>

Add a ftrace style --graph-function argument to 'perf script' that
allows to print itrace function calls only below a given function. This
makes it easier to find the code of interest in a large trace.

% perf record -e intel_pt//k -a sleep 1
% perf script --graph-function group_sched_in --call-trace
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])          group_sched_in
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])              __x86_indirect_thunk_rax
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])              event_sched_in.isra.107
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_event_set_state.part.71
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_time
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_pmu_disable
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_log_itrace_start
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  __x86_indirect_thunk_rax
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                      perf_event_update_userpage
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          calc_timer_values
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              sched_clock_cpu
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          __x86_indirect_thunk_rax
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                          arch_perf_update_userpage
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              __fentry__
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              using_native_sched_clock
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                              sched_clock_stable
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])                  perf_pmu_enable
            perf   900 [000] 194167.205652203: ([kernel.kallsyms])              __x86_indirect_thunk_rax
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])          group_sched_in
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])              __x86_indirect_thunk_rax
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])              event_sched_in.isra.107
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_event_set_state.part.71
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                      perf_event_update_time
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_pmu_disable
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  perf_log_itrace_start
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                  __x86_indirect_thunk_rax
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                      perf_event_update_userpage
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          calc_timer_values
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              sched_clock_cpu
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          __x86_indirect_thunk_rax
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                          arch_perf_update_userpage
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              __fentry__
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              using_native_sched_clock
         swapper     0 [001] 194167.205660693: ([kernel.kallsyms])                              sched_clock_stable

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Link: http://lkml.kernel.org/r/20180920180540.14039-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-script.txt |  4 ++
 tools/perf/builtin-script.c              | 96 ++++++++++++++++++++++++++------
 tools/perf/util/symbol.h                 |  3 +-
 tools/perf/util/thread.h                 |  2 +
 4 files changed, 86 insertions(+), 19 deletions(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 805baabd238e..a2b37ce48094 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -397,6 +397,10 @@ include::itrace.txt[]
 --call-ret-trace::
 	Show call and return stream for intel_pt traces.
 
+--graph-function::
+	For itrace only show specified functions and their callees for
+	itrace. Multiple functions can be separated by comma.
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 566e1450898a..9d2249ea75e3 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1105,6 +1105,35 @@ static int perf_sample__fprintf_addr(struct perf_sample *sample,
 	return printed;
 }
 
+static const char *resolve_branch_sym(struct perf_sample *sample,
+				      struct perf_evsel *evsel,
+				      struct thread *thread,
+				      struct addr_location *al,
+				      u64 *ip)
+{
+	struct addr_location addr_al;
+	struct perf_event_attr *attr = &evsel->attr;
+	const char *name = NULL;
+
+	if (sample->flags & (PERF_IP_FLAG_CALL | PERF_IP_FLAG_TRACE_BEGIN)) {
+		if (sample_addr_correlates_sym(attr)) {
+			thread__resolve(thread, &addr_al, sample);
+			if (addr_al.sym)
+				name = addr_al.sym->name;
+			else
+				*ip = sample->addr;
+		} else {
+			*ip = sample->addr;
+		}
+	} else if (sample->flags & (PERF_IP_FLAG_RETURN | PERF_IP_FLAG_TRACE_END)) {
+		if (al->sym)
+			name = al->sym->name;
+		else
+			*ip = sample->ip;
+	}
+	return name;
+}
+
 static int perf_sample__fprintf_callindent(struct perf_sample *sample,
 					   struct perf_evsel *evsel,
 					   struct thread *thread,
@@ -1112,7 +1141,6 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
 {
 	struct perf_event_attr *attr = &evsel->attr;
 	size_t depth = thread_stack__depth(thread);
-	struct addr_location addr_al;
 	const char *name = NULL;
 	static int spacing;
 	int len = 0;
@@ -1126,22 +1154,7 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
 	if (thread->ts && sample->flags & PERF_IP_FLAG_RETURN)
 		depth += 1;
 
-	if (sample->flags & (PERF_IP_FLAG_CALL | PERF_IP_FLAG_TRACE_BEGIN)) {
-		if (sample_addr_correlates_sym(attr)) {
-			thread__resolve(thread, &addr_al, sample);
-			if (addr_al.sym)
-				name = addr_al.sym->name;
-			else
-				ip = sample->addr;
-		} else {
-			ip = sample->addr;
-		}
-	} else if (sample->flags & (PERF_IP_FLAG_RETURN | PERF_IP_FLAG_TRACE_END)) {
-		if (al->sym)
-			name = al->sym->name;
-		else
-			ip = sample->ip;
-	}
+	name = resolve_branch_sym(sample, evsel, thread, al, &ip);
 
 	if (PRINT_FIELD(DSO) && !(PRINT_FIELD(IP) || PRINT_FIELD(ADDR))) {
 		dlen += fprintf(fp, "(");
@@ -1647,6 +1660,47 @@ static void perf_sample__fprint_metric(struct perf_script *script,
 	}
 }
 
+static bool show_event(struct perf_sample *sample,
+		       struct perf_evsel *evsel,
+		       struct thread *thread,
+		       struct addr_location *al)
+{
+	int depth = thread_stack__depth(thread);
+
+	if (!symbol_conf.graph_function)
+		return true;
+
+	if (thread->filter) {
+		if (depth <= thread->filter_entry_depth) {
+			thread->filter = false;
+			return false;
+		}
+		return true;
+	} else {
+		const char *s = symbol_conf.graph_function;
+		u64 ip;
+		const char *name = resolve_branch_sym(sample, evsel, thread, al,
+				&ip);
+		unsigned nlen;
+
+		if (!name)
+			return false;
+		nlen = strlen(name);
+		while (*s) {
+			unsigned len = strcspn(s, ",");
+			if (nlen == len && !strncmp(name, s, len)) {
+				thread->filter = true;
+				thread->filter_entry_depth = depth;
+				return true;
+			}
+			s += len;
+			if (*s == ',')
+				s++;
+		}
+		return false;
+	}
+}
+
 static void process_event(struct perf_script *script,
 			  struct perf_sample *sample, struct perf_evsel *evsel,
 			  struct addr_location *al,
@@ -1661,6 +1715,9 @@ static void process_event(struct perf_script *script,
 	if (output[type].fields == 0)
 		return;
 
+	if (!show_event(sample, evsel, thread, al))
+		return;
+
 	++es->samples;
 
 	perf_sample__fprintf_start(sample, thread, evsel,
@@ -3237,6 +3294,8 @@ int cmd_script(int argc, const char **argv)
 			"Decode calls from from itrace", parse_call_trace),
 	OPT_CALLBACK_OPTARG(0, "call-ret-trace", &itrace_synth_opts, NULL, NULL,
 			"Decode calls and returns from itrace", parse_callret_trace),
+	OPT_STRING(0, "graph-function", &symbol_conf.graph_function, "symbol[,symbol...]",
+			"Only print symbols and callees with --call-trace/--call-ret-trace"),
 	OPT_STRING(0, "stop-bt", &symbol_conf.bt_stop_list_str, "symbol[,symbol...]",
 		   "Stop display of callgraph at these symbols"),
 	OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
@@ -3494,7 +3553,8 @@ int cmd_script(int argc, const char **argv)
 	script.session = session;
 	script__setup_sample_type(&script);
 
-	if (output[PERF_TYPE_HARDWARE].fields & PERF_OUTPUT_CALLINDENT)
+	if ((output[PERF_TYPE_HARDWARE].fields & PERF_OUTPUT_CALLINDENT) ||
+	    symbol_conf.graph_function)
 		itrace_synth_opts.thread_stack = true;
 
 	session->itrace_synth_opts = &itrace_synth_opts;
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index f25fae4b5743..d726a8a7bb1b 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -123,7 +123,8 @@ struct symbol_conf {
 	const char	*vmlinux_name,
 			*kallsyms_name,
 			*source_prefix,
-			*field_sep;
+			*field_sep,
+			*graph_function;
 	const char	*default_guest_vmlinux_name,
 			*default_guest_kallsyms,
 			*default_guest_modules;
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 07606aa6998d..36c09a9904e6 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -42,6 +42,8 @@ struct thread {
 	void				*addr_space;
 	struct unwind_libunwind_ops	*unwind_libunwind_ops;
 #endif
+	bool			filter;
+	int			filter_entry_depth;
 };
 
 struct machine;
-- 
2.14.4


  parent reply	other threads:[~2018-10-25 11:12 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-25 11:09 [PATCH 00/37] perf/core improvements and fixes Arnaldo Carvalho de Melo
2018-10-25 11:09 ` [PATCH 01/37] perf record: Encode -k clockid frequency into Perf trace Arnaldo Carvalho de Melo
2018-10-25 11:09 ` [PATCH 02/37] perf annotate: Add Sparc support Arnaldo Carvalho de Melo
2018-10-25 11:09 ` [PATCH 03/37] perf jitdump: " Arnaldo Carvalho de Melo
2018-10-25 11:09 ` [PATCH 04/37] perf symbols: Set PLT entry/header sizes properly on Sparc Arnaldo Carvalho de Melo
2018-10-25 11:09 ` [PATCH 05/37] perf arm64: Fix generate system call table failed with /tmp mounted with noexec Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 06/37] tools lib subcmd: Introduce OPTION_ULONG Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 07/37] perf trace: Introduce --max-events Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 08/37] perf evsel: Introduce per event max_events property Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 09/37] perf evsel: Mark a evsel as disabled when asking the kernel do disable it Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 10/37] perf trace: Drop addr_location refcounts Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 11/37] perf trace: Drop thread refcount in trace__event_handler() Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 12/37] perf stat: Poll for monitored tasks being alive Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 13/37] perf script: Allow extended console debug output Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 14/37] perf script: Flush output stream after events in verbose mode Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 15/37] perf trace: Introduce per-event maximum number of events property Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 16/37] perf scripts python: call-graph-from-sql.py: Use SPDX license identifier Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 17/37] perf scripts python: call-graph-from-sql.py: Provide better default column sizes Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 18/37] perf scripts python: call-graph-from-sql.py: Set a minimum window size Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 19/37] perf scripts python: call-graph-from-sql.py: Change icon Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 20/37] perf scripts python: call-graph-from-sql.py: Make a "Main" function Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 21/37] perf scripts python: call-graph-from-sql.py: Separate the database details into a class Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 22/37] perf scripts python: call-graph-from-sql.py: Add a class for global data Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 23/37] perf scripts python: call-graph-from-sql.py: Remove use of setObjectName() Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 24/37] perf scripts python: call-graph-from-sql.py: Factor out CallGraphModel from TreeModel Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 25/37] perf scripts python: call-graph-from-sql.py: Add data helper functions Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 26/37] perf scripts python: call-graph-from-sql.py: Refactor TreeItem class Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 27/37] perf scripts python: call-graph-from-sql.py: Rename to exported-sql-viewer.py Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 28/37] perf scripts python: exported-sql-viewer.py: Add support for multiple sub-windows Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 29/37] perf scripts python: exported-sql-viewer.py: Add ability to find symbols in the call-graph Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 30/37] perf scripts python: exported-sql-viewer.py: Add ability to shrink / enlarge font Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 31/37] perf scripts python: exported-sql-viewer.py: Add ability to display all the database tables Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 32/37] perf scripts python: exported-sql-viewer.py: Add All branches report Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 33/37] perf script: Add --insn-trace for instruction decoding Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 34/37] perf script: Make itrace script default to all calls Arnaldo Carvalho de Melo
2018-10-25 11:10 ` [PATCH 35/37] tools script: Add --call-trace and --call-ret-trace Arnaldo Carvalho de Melo
2018-10-25 11:10 ` Arnaldo Carvalho de Melo [this message]
2018-10-25 11:10 ` [PATCH 37/37] perf script: Support total cycles count Arnaldo Carvalho de Melo
2018-10-26  7:25 ` [PATCH 00/37] perf/core improvements and fixes Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181025111031.3440-37-acme@kernel.org \
    --to=acme@kernel.org \
    --cc=acme@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=kim.phillips@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=williams@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).