linux-kernel.vger.kernel.org archive mirror
* Implement dwarf variable/type resolving for perf script
@ 2017-11-28  0:23 Andi Kleen
  2017-11-28  0:23 ` [PATCH 01/12] perf, tools, pt: Clear instruction for ptwrite samples Andi Kleen
                   ` (12 more replies)
  0 siblings, 13 replies; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel

This patchkit extends perf script to query dwarf information for variable
names or types/structure fields accessed from code. 

The dwarf resolution is all on top of Masami's perf probe dwarf code.

It supports multiple use cases:
- When we sample registers it can use the dwarf information to resolve
the registers to names.
- When we sample any instruction the instruction can be decoded and
we can determine the type/struct field to make an estimate of the 
memory access patterns in data structures. 
- When we sample the new PTWRITE instruction the logged value from
the PT log can be associated with a variable.
- Various cleanups and fixes to make all of the above possible.

It is all implemented with new output formats in perf script:
iregvals (map register values to names) and insnvar (decode the instruction
and map its memory operand back to a dwarf variable/type).

There are some limitations: it cannot decode everything and is
somewhat slow, but it's already quite useful for typical code.


    % perf record -Idi,si ./targ
    % perf script -F +iregvals
    ...
        targ  8584 169763.761843:    2091795 cycles:ppp:            40041a main (targ)
        targ  8584 169763.762520:    1913932 cycles:ppp:            400534 f1 (targ) { b = 0x2, int }  { a = 0x1, int }
        targ  8584 169763.763141:    1638913 cycles:ppp:            400523 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
        targ  8584 169763.763672:    1516522 cycles:ppp:            400522 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
        targ  8584 169763.764165:    1335501 cycles:ppp:            400523 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
        targ  8584 169763.764598:    1253289 cycles:ppp:            400522 f2 (targ) { b = 0x2, int }  { a = 0x1, int }
        targ  8584 169763.765005:    1135131 cycles:ppp:            400534 f1 (targ) { b = 0x2, int }  { a = 0x1, int }
        targ  8584 169763.765373:    1080325 cycles:ppp:            400522 f2 (targ) { b = 0x2, int }  { a = 0x1, int }
        targ  8584 169763.765724:    1036999 cycles:ppp:            400522 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
        targ  8584 169763.766061:     971213 cycles:ppp:            400534 f1 (targ) { b = 0x2, int }  { a = 0x1, int }


    % perf record -e intel_pt//u  -a sleep 1
    % perf script --itrace=i0ns -F insnvar,insn,ip,sym  -f 2>&1 | xed -F insn: -A -64 | less
    ...
               4f7e61 xyarray__max_y                pushq  %rbp
               4f7e62 xyarray__max_y                mov %rsp, %rbp
               4f7e65 xyarray__max_y                sub $0x20, %rsp
               4f7e69 xyarray__max_y                movq  %rdi, -0x18(%rbp) { -24(xy), struct xyarray* }
               4f7e6d xyarray__max_y                movq  %fs:0x28, %rax
               4f7e76 xyarray__max_y                movq  %rax, -0x8(%rbp) { -8(xy), struct xyarray* }
               4f7e7a xyarray__max_y                xor %eax, %eax
               4f7e7c xyarray__max_y                movq  -0x18(%rbp), %rax { -24(xy), struct xyarray* }
               4f7e80 xyarray__max_y                movq  0x20(%rax), %rax
               4f7e84 xyarray__max_y                movq  -0x8(%rbp), %rdx { -8(xy), struct xyarray* }
               4f7e88 xyarray__max_y                xorq  %fs:0x28, %rdx
               4f7e91 xyarray__max_y                jz 0x7
               4f7e98 xyarray__max_y                leaveq
               4f7e99 xyarray__max_y                retq
    
In this example we now know that this function accesses two fields in struct xyarray *
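
As a rough illustration of the last step (going from a memory operand's
displacement to a structure field), here is a minimal sketch. It is not the
code from this patchkit: struct demo, struct demo_field and demo_field_at()
are hypothetical names, and the field table is built with offsetof() instead
of being read from dwarf.

```c
#include <stddef.h>

/* Hypothetical structure standing in for a type described by dwarf. */
struct demo {
	size_t row_size;
	size_t entry_size;
	size_t entries;
	size_t max_x;
	size_t max_y;
};

/* One entry per member: name, byte offset and size (assumed layout table). */
struct demo_field {
	const char *name;
	size_t offset;
	size_t size;
};

#define DEMO_FIELD(type, member) \
	{ #member, offsetof(type, member), sizeof(((type *)0)->member) }

static const struct demo_field demo_fields[] = {
	DEMO_FIELD(struct demo, row_size),
	DEMO_FIELD(struct demo, entry_size),
	DEMO_FIELD(struct demo, entries),
	DEMO_FIELD(struct demo, max_x),
	DEMO_FIELD(struct demo, max_y),
};

#define DEMO_NFIELDS (sizeof(demo_fields) / sizeof(demo_fields[0]))

/*
 * Return the name of the member that covers displacement 'disp' from the
 * base pointer, or NULL if the displacement is outside every member.
 */
static const char *demo_field_at(const struct demo_field *fields, size_t n,
				 size_t disp)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (disp >= fields[i].offset &&
		    disp < fields[i].offset + fields[i].size)
			return fields[i].name;
	}
	return NULL;
}
```

Calling demo_field_at(demo_fields, DEMO_NFIELDS, offsetof(struct demo, max_y))
returns "max_y". A real implementation would take the member offsets from the
debug information of the pointed-to type.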

Available from

git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc perf/var-resolve-2

v1: Initial post


* [PATCH 01/12] perf, tools, pt: Clear instruction for ptwrite samples
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-11-28  0:23 ` [PATCH 02/12] perf, tools, script: Print insn/insnlen for non PT sample Andi Kleen
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

When a PTWRITE sample is synthesized, the PT decoder has already
run ahead, so sample->insn contains the next branch instruction,
not the PTWRITE.

Clear it for PTWRITE samples to avoid confusion.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/intel-pt.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 23f9ba676df0..485c8040484e 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -1262,6 +1262,12 @@ static void intel_pt_prep_p_sample(struct intel_pt *pt,
 	 */
 	if (!sample->ip)
 		sample->flags = 0;
+
+	/*
+	 * Don't have valid instructions because decoder already ran ahead.
+	 */
+	sample->insn_len = 0;
+	memset(sample->insn, 0, INTEL_PT_INSN_BUF_SZ);
 }
 
 static int intel_pt_synth_ptwrite_sample(struct intel_pt_queue *ptq)
-- 
2.13.6


* [PATCH 02/12] perf, tools, script: Print insn/insnlen for non PT sample
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
  2017-11-28  0:23 ` [PATCH 01/12] perf, tools, pt: Clear instruction for ptwrite samples Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-11-28  0:23 ` [PATCH 03/12] perf, tools: Support storing additional data in strlist Andi Kleen
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Dumping insn/insnlen only works for PT samples where the PT decoder
fills in these fields. Add a fallback for other samples where we
grab the instructions manually and then call an architecture
specific function to determine the instruction length.

The architecture specific function is currently only implemented
for x86, and uses the standard Linux instruction decoder.
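
The fallback arrangement can be sketched in isolation as below. This is only
an illustration under assumptions, not the patch code: demo_arch_insn_len(),
demo_fill_insn() and DEMO_INSN_BUF_SZ are hypothetical names, and the bounds
check on the returned length is an extra precaution added in this sketch.

```c
#include <string.h>

#define DEMO_INSN_BUF_SZ 16	/* hypothetical buffer size for this sketch */

/*
 * Generic fallback: architectures without a decoder report length 0.
 * A non-weak definition in an architecture specific file overrides this.
 */
__attribute__((weak)) int demo_arch_insn_len(const unsigned char *buf,
					     int len, int is64bit)
{
	(void)buf;
	(void)len;
	(void)is64bit;
	return 0;
}

/*
 * Copy one instruction from 'bytes' into 'dst'. Returns the number of
 * bytes copied, or 0 if the length is unknown or implausible.
 */
static int demo_fill_insn(unsigned char *dst, const unsigned char *bytes,
			  int avail, int is64bit)
{
	int len = demo_arch_insn_len(bytes, avail, is64bit);

	/* Never copy more than was fetched or than fits in dst. */
	if (len <= 0 || len > avail || len > DEMO_INSN_BUF_SZ)
		return 0;
	memcpy(dst, bytes, len);
	return len;
}
```

With only the weak definition linked in, demo_fill_insn() returns 0 and the
caller prints no instruction bytes, which matches the intended behaviour on
architectures without a decoder.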

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/arch/x86/util/Build     |  1 +
 tools/perf/arch/x86/util/insnlen.c | 12 ++++++++++++
 tools/perf/builtin-script.c        | 13 +++++++++++++
 tools/perf/util/Build              |  1 +
 tools/perf/util/insnlen.c          | 10 ++++++++++
 tools/perf/util/insnlen.h          |  6 ++++++
 6 files changed, 43 insertions(+)
 create mode 100644 tools/perf/arch/x86/util/insnlen.c
 create mode 100644 tools/perf/util/insnlen.c
 create mode 100644 tools/perf/util/insnlen.h

diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index f95e6f46ef0d..139f9f1a56f9 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -1,6 +1,7 @@
 libperf-y += header.o
 libperf-y += tsc.o
 libperf-y += pmu.o
+libperf-y += insnlen.o
 libperf-y += kvm-stat.o
 libperf-y += perf_regs.o
 libperf-y += group.o
diff --git a/tools/perf/arch/x86/util/insnlen.c b/tools/perf/arch/x86/util/insnlen.c
new file mode 100644
index 000000000000..8e2e50bd5201
--- /dev/null
+++ b/tools/perf/arch/x86/util/insnlen.c
@@ -0,0 +1,12 @@
+#include "intel-pt-decoder/insn.h"
+#include "intel-pt-decoder/inat.h"
+#include "insnlen.h"
+
+int arch_insn_len(char *insnbytes, int insnlen, int is64bit)
+{
+	struct insn insn;
+
+	insn_init(&insn, insnbytes, insnlen, is64bit);
+	insn_get_length(&insn);
+	return insn.length;
+}
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index ee7c7aaaae72..cae4b13fc715 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -20,6 +20,7 @@
 #include "util/data.h"
 #include "util/auxtrace.h"
 #include "util/cpumap.h"
+#include "util/insnlen.h"
 #include "util/thread_map.h"
 #include "util/stat.h"
 #include "util/string2.h"
@@ -1108,6 +1109,18 @@ static int perf_sample__fprintf_insn(struct perf_sample *sample,
 {
 	int printed = 0;
 
+	if ((PRINT_FIELD(INSNLEN) || PRINT_FIELD(INSN)) && !sample->insn_len) {
+		u8 ibuf[64];
+		bool is64bit;
+		u8 cpumode;
+
+		if (grab_bb(ibuf, sample->ip, sample->ip + 16,
+			    machine, thread, &is64bit, &cpumode, false) > 0) {
+			sample->insn_len = arch_insn_len((char *)ibuf, 16, is64bit);
+			memcpy(sample->insn, ibuf, sample->insn_len);
+		}
+	}
+
 	if (PRINT_FIELD(INSNLEN))
 		printed += fprintf(fp, " ilen: %d", sample->insn_len);
 	if (PRINT_FIELD(INSN)) {
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index a3de7916fe63..80c05329835a 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -14,6 +14,7 @@ libperf-y += kallsyms.o
 libperf-y += levenshtein.o
 libperf-y += llvm-utils.o
 libperf-y += mmap.o
+libperf-y += insnlen.o
 libperf-y += memswap.o
 libperf-y += parse-events.o
 libperf-y += perf_regs.o
diff --git a/tools/perf/util/insnlen.c b/tools/perf/util/insnlen.c
new file mode 100644
index 000000000000..6c126960a7e6
--- /dev/null
+++ b/tools/perf/util/insnlen.c
@@ -0,0 +1,10 @@
+#include "perf.h"
+#include "insnlen.h"
+
+/* Fallback for architectures not supporting this */
+__weak int arch_insn_len(char *buf __maybe_unused,
+			 int len __maybe_unused,
+			 int is64bit __maybe_unused)
+{
+	return 0;
+}
diff --git a/tools/perf/util/insnlen.h b/tools/perf/util/insnlen.h
new file mode 100644
index 000000000000..289877dff89d
--- /dev/null
+++ b/tools/perf/util/insnlen.h
@@ -0,0 +1,6 @@
+#ifndef INSNLEN_H
+#define INSNLEN_H 1
+
+int arch_insn_len(char *buf, int len, int is64bit);
+
+#endif
-- 
2.13.6


* [PATCH 03/12] perf, tools: Support storing additional data in strlist
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
  2017-11-28  0:23 ` [PATCH 01/12] perf, tools, pt: Clear instruction for ptwrite samples Andi Kleen
  2017-11-28  0:23 ` [PATCH 02/12] perf, tools, script: Print insn/insnlen for non PT sample Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-11-28 13:31   ` Masami Hiramatsu
  2017-11-28  0:23 ` [PATCH 04/12] perf, tools: Store variable name and register for dwarf variable lists Andi Kleen
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add a configurable node size to strlist, which allows users
to store additional data in a str_node. Also add a new interface
to add a new strlist node, and return the node, so additional
data can be added.
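
The idea of a configurable node size can be sketched without the rbtree
machinery. The names below (demo_node, demo_list, demo_var_node,
demo_node_new, demo_container_of) are hypothetical and only illustrate the
layout: the base node is the first member of a larger structure, and the
list allocates node_size bytes per node.

```c
#include <stddef.h>
#include <stdlib.h>

/* Base node, standing in for struct str_node in this sketch. */
struct demo_node {
	const char *s;
};

/* List-wide configuration: how many bytes to allocate per node. */
struct demo_list {
	size_t node_size;	/* 0 means sizeof(struct demo_node) */
};

/* A user-defined node that carries extra data after the base node. */
struct demo_var_node {
	struct demo_node node;	/* embedded base node */
	char value[64];
};

#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Allocate a zeroed node of the configured size and set its string. */
static struct demo_node *demo_node_new(const struct demo_list *list,
				       const char *s)
{
	size_t size = list->node_size;
	struct demo_node *n;

	if (size < sizeof(struct demo_node))
		size = sizeof(struct demo_node);
	n = calloc(1, size);
	if (n)
		n->s = s;
	return n;
}
```

A caller sets node_size to sizeof(struct demo_var_node), gets the base node
back from demo_node_new(), and reaches the extra fields with
demo_container_of(n, struct demo_var_node, node).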

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/rblist.c  | 16 ++++++++++++++--
 tools/perf/util/rblist.h  |  2 ++
 tools/perf/util/strlist.c | 15 ++++++++++++++-
 tools/perf/util/strlist.h |  8 ++++++++
 4 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/rblist.c b/tools/perf/util/rblist.c
index 0dfe27d99458..fa221e6b0932 100644
--- a/tools/perf/util/rblist.c
+++ b/tools/perf/util/rblist.c
@@ -11,11 +11,13 @@
 
 #include "rblist.h"
 
-int rblist__add_node(struct rblist *rblist, const void *new_entry)
+int rblist__add_node_ptr(struct rblist *rblist, const void *new_entry,
+		       struct rb_node **nodep)
 {
 	struct rb_node **p = &rblist->entries.rb_node;
 	struct rb_node *parent = NULL, *new_node;
 
+	*nodep = NULL;
 	while (*p != NULL) {
 		int rc;
 
@@ -26,13 +28,16 @@ int rblist__add_node(struct rblist *rblist, const void *new_entry)
 			p = &(*p)->rb_left;
 		else if (rc < 0)
 			p = &(*p)->rb_right;
-		else
+		else {
+			*nodep = parent;
 			return -EEXIST;
+		}
 	}
 
 	new_node = rblist->node_new(rblist, new_entry);
 	if (new_node == NULL)
 		return -ENOMEM;
+	*nodep = new_node;
 
 	rb_link_node(new_node, parent, p);
 	rb_insert_color(new_node, &rblist->entries);
@@ -41,6 +46,13 @@ int rblist__add_node(struct rblist *rblist, const void *new_entry)
 	return 0;
 }
 
+int rblist__add_node(struct rblist *rblist, const void *new_entry)
+{
+	struct rb_node *nd;
+
+	return rblist__add_node_ptr(rblist, new_entry, &nd);
+}
+
 void rblist__remove_node(struct rblist *rblist, struct rb_node *rb_node)
 {
 	rb_erase(rb_node, &rblist->entries);
diff --git a/tools/perf/util/rblist.h b/tools/perf/util/rblist.h
index 4c8638a22571..2941e4295f63 100644
--- a/tools/perf/util/rblist.h
+++ b/tools/perf/util/rblist.h
@@ -31,6 +31,8 @@ struct rblist {
 void rblist__init(struct rblist *rblist);
 void rblist__delete(struct rblist *rblist);
 int rblist__add_node(struct rblist *rblist, const void *new_entry);
+int rblist__add_node_ptr(struct rblist *rblist, const void *new_entry,
+			 struct rb_node **nodep);
 void rblist__remove_node(struct rblist *rblist, struct rb_node *rb_node);
 struct rb_node *rblist__find(struct rblist *rblist, const void *entry);
 struct rb_node *rblist__findnew(struct rblist *rblist, const void *entry);
diff --git a/tools/perf/util/strlist.c b/tools/perf/util/strlist.c
index 9de5434bb49e..68ef21c3797c 100644
--- a/tools/perf/util/strlist.c
+++ b/tools/perf/util/strlist.c
@@ -18,7 +18,7 @@ struct rb_node *strlist__node_new(struct rblist *rblist, const void *entry)
 	const char *s = entry;
 	struct rb_node *rc = NULL;
 	struct strlist *strlist = container_of(rblist, struct strlist, rblist);
-	struct str_node *snode = malloc(sizeof(*snode));
+	struct str_node *snode = malloc(strlist->node_size);
 
 	if (snode != NULL) {
 		if (strlist->dupstr) {
@@ -66,6 +66,14 @@ int strlist__add(struct strlist *slist, const char *new_entry)
 	return rblist__add_node(&slist->rblist, new_entry);
 }
 
+struct str_node *strlist__add_node(struct strlist *slist, const char *new_entry)
+{
+	struct rb_node *nd;
+
+	rblist__add_node_ptr(&slist->rblist, new_entry, &nd);
+	return container_of(nd, struct str_node, rb_node);
+}
+
 int strlist__load(struct strlist *slist, const char *filename)
 {
 	char entry[1024];
@@ -165,11 +173,15 @@ struct strlist *strlist__new(const char *list, const struct strlist_config *conf
 		bool dupstr = true;
 		bool file_only = false;
 		const char *dirname = NULL;
+		size_t node_size = sizeof(struct str_node);
 
 		if (config) {
 			dupstr = !config->dont_dupstr;
 			dirname = config->dirname;
 			file_only = config->file_only;
+			node_size = config->node_size;
+			if (!node_size)
+				node_size = sizeof(struct str_node);
 		}
 
 		rblist__init(&slist->rblist);
@@ -179,6 +191,7 @@ struct strlist *strlist__new(const char *list, const struct strlist_config *conf
 
 		slist->dupstr	 = dupstr;
 		slist->file_only = file_only;
+		slist->node_size = node_size;
 
 		if (list && strlist__parse_list(slist, list, dirname) != 0)
 			goto out_error;
diff --git a/tools/perf/util/strlist.h b/tools/perf/util/strlist.h
index d58f1e08b170..fd407e11e124 100644
--- a/tools/perf/util/strlist.h
+++ b/tools/perf/util/strlist.h
@@ -16,25 +16,33 @@ struct strlist {
 	struct rblist rblist;
 	bool	      dupstr;
 	bool	      file_only;
+	size_t	      node_size;
 };
 
 /*
  * @file_only: When dirname is present, only consider entries as filenames,
  *             that should not be added to the list if dirname/entry is not
  *             found
+ * @node_size: Allocate extra space after str_node which can be used for other
+ *	       data. This is the complete size including str_node
  */
 struct strlist_config {
 	bool dont_dupstr;
 	bool file_only;
 	const char *dirname;
+	size_t node_size;
 };
 
+#define STRLIST_CONFIG_DEFAULT \
+	{ false, false, NULL, sizeof(struct str_node) }
+
 struct strlist *strlist__new(const char *slist, const struct strlist_config *config);
 void strlist__delete(struct strlist *slist);
 
 void strlist__remove(struct strlist *slist, struct str_node *sn);
 int strlist__load(struct strlist *slist, const char *filename);
 int strlist__add(struct strlist *slist, const char *str);
+struct str_node *strlist__add_node(struct strlist *slist, const char *str);
 
 struct str_node *strlist__entry(const struct strlist *slist, unsigned int idx);
 struct str_node *strlist__find(struct strlist *slist, const char *entry);
-- 
2.13.6


* [PATCH 04/12] perf, tools: Store variable name and register for dwarf variable lists
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
                   ` (2 preceding siblings ...)
  2017-11-28  0:23 ` [PATCH 03/12] perf, tools: Support storing additional data in strlist Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-11-28  0:23 ` [PATCH 05/12] perf, tools, probe: Print location for resolved variables Andi Kleen
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Extend the strlist returned by debuginfo__find_available_vars_at to also
directly include the variable name and the location of the resolved
variables in each node. This makes it easier to use for callers that
parse the output instead of just printing it.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/probe-finder.c | 29 +++++++++++++++++++++++++----
 tools/perf/util/probe-finder.h |  7 +++++++
 2 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index a5731de0e5eb..0149428d453e 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1368,9 +1368,12 @@ static int collect_variables_cb(Dwarf_Die *die_mem, void *data)
 	tag = dwarf_tag(die_mem);
 	if (tag == DW_TAG_formal_parameter ||
 	    tag == DW_TAG_variable) {
+		struct probe_trace_arg ta;
+
+		memset(&ta, 0, sizeof(struct probe_trace_arg));
 		ret = convert_variable_location(die_mem, af->pf.addr,
 						af->pf.fb_ops, &af->pf.sp_die,
-						af->pf.machine, NULL);
+						af->pf.machine, &ta);
 		if (ret == 0 || ret == -ERANGE) {
 			int ret2;
 			bool externs = !af->child;
@@ -1400,8 +1403,23 @@ static int collect_variables_cb(Dwarf_Die *die_mem, void *data)
 
 			pr_debug("Add new var: %s\n", buf.buf);
 			if (ret2 == 0) {
-				strlist__add(vl->vars,
-					strbuf_detach(&buf, NULL));
+				struct str_node *sn;
+
+				/* Will get confused with shadowed variables */
+				sn = strlist__add_node(vl->vars,
+						       strbuf_detach(&buf, NULL));
+				if (sn) {
+					struct variable_node *vn =
+						container_of(sn, struct variable_node, snode);
+					if (dwarf_diename(die_mem))
+						strlcpy(vn->name, dwarf_diename(die_mem), sizeof(vn->name));
+					else
+						vn->name[0] = 0;
+					if (ta.value)
+						strlcpy(vn->value, ta.value, sizeof(vn->value));
+					else
+						vn->value[0] = 0;
+				}
 			}
 			strbuf_release(&buf);
 		}
@@ -1420,6 +1438,7 @@ static int collect_variables_cb(Dwarf_Die *die_mem, void *data)
 /* Add a found vars into available variables list */
 static int add_available_vars(Dwarf_Die *sc_die, struct probe_finder *pf)
 {
+	static struct strlist_config sconfig = STRLIST_CONFIG_DEFAULT;
 	struct available_var_finder *af =
 			container_of(pf, struct available_var_finder, pf);
 	struct perf_probe_point *pp = &pf->pev->point;
@@ -1427,6 +1446,8 @@ static int add_available_vars(Dwarf_Die *sc_die, struct probe_finder *pf)
 	Dwarf_Die die_mem;
 	int ret;
 
+	sconfig.node_size = sizeof(struct variable_node);
+
 	/* Check number of tevs */
 	if (af->nvls == af->max_vls) {
 		pr_warning("Too many( > %d) probe point found.\n", af->max_vls);
@@ -1444,7 +1465,7 @@ static int add_available_vars(Dwarf_Die *sc_die, struct probe_finder *pf)
 		 vl->point.offset);
 
 	/* Find local variables */
-	vl->vars = strlist__new(NULL, NULL);
+	vl->vars = strlist__new(NULL, &sconfig);
 	if (vl->vars == NULL)
 		return -ENOMEM;
 	af->child = true;
diff --git a/tools/perf/util/probe-finder.h b/tools/perf/util/probe-finder.h
index 16252980ff00..6368e95a5d16 100644
--- a/tools/perf/util/probe-finder.h
+++ b/tools/perf/util/probe-finder.h
@@ -4,6 +4,7 @@
 
 #include <stdbool.h>
 #include "intlist.h"
+#include "strlist.h"
 #include "probe-event.h"
 #include "sane_ctype.h"
 
@@ -117,6 +118,12 @@ struct line_finder {
 	int			found;
 };
 
+struct variable_node {
+	struct str_node		snode;
+	char			value[64];
+	char			name[256];
+};
+
 #endif /* HAVE_DWARF_SUPPORT */
 
 #endif /*_PROBE_FINDER_H */
-- 
2.13.6


* [PATCH 05/12] perf, tools, probe: Print location for resolved variables
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
                   ` (3 preceding siblings ...)
  2017-11-28  0:23 ` [PATCH 04/12] perf, tools: Store variable name and register for dwarf variable lists Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-11-29  1:19   ` Masami Hiramatsu
  2017-11-28  0:23 ` [PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open Andi Kleen
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Print the location, e.g. the register, for resolved variables
with perf probe -V. This is useful for debugging, and manually
making sense of disassembly. I also have some scripts
which can make use of this information.

Before:

% perf probe  -x  ./tsrc/tstruct  -V  main+20
Available variables at main+20
        @<main+20>
                struct str*     xp

After:

% perf probe  -x  ./tsrc/tstruct  -V  main+20
Available variables at main+20
        @<main+20>
                struct str*     xp      %ax
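
The location suffix is a tab, the register or symbol string, and one
" off N" per dereference offset. A standalone sketch of that formatting
follows; struct demo_ref and demo_format_location() are hypothetical names,
and the NULL check on the value string is an addition made in this sketch.

```c
#include <stdio.h>

/* Hypothetical chain of dereference offsets attached to a location. */
struct demo_ref {
	long offset;
	const struct demo_ref *next;
};

/*
 * Format "\t<value>[ off <n>]..." into 'out'. Returns the number of
 * characters that were needed (as snprintf does), or -1 on error.
 */
static int demo_format_location(char *out, size_t outsz, const char *value,
				const struct demo_ref *ref)
{
	int n = snprintf(out, outsz, "\t%s", value ? value : "(unknown)");

	if (n < 0)
		return -1;
	for (; ref && (size_t)n < outsz; ref = ref->next) {
		int m = snprintf(out + n, outsz - n, " off %ld", ref->offset);

		if (m < 0)
			return -1;
		n += m;
	}
	return n;
}
```

For a value of "%bp" with a single offset of -24 this produces
"\t%bp off -24".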

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/probe-finder.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 0149428d453e..699f29d8a28e 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1369,6 +1369,7 @@ static int collect_variables_cb(Dwarf_Die *die_mem, void *data)
 	if (tag == DW_TAG_formal_parameter ||
 	    tag == DW_TAG_variable) {
 		struct probe_trace_arg ta;
+		struct probe_trace_arg_ref *ref;
 
 		memset(&ta, 0, sizeof(struct probe_trace_arg));
 		ret = convert_variable_location(die_mem, af->pf.addr,
@@ -1401,6 +1402,10 @@ static int collect_variables_cb(Dwarf_Die *die_mem, void *data)
 							die_mem, &buf);
 			}
 
+			strbuf_addf(&buf, "\t%s", ta.value);
+			for (ref = ta.ref; ref; ref = ref->next)
+				strbuf_addf(&buf, " off %ld", ref->offset);
+
 			pr_debug("Add new var: %s\n", buf.buf);
 			if (ret2 == 0) {
 				struct str_node *sn;
-- 
2.13.6


* [PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
                   ` (4 preceding siblings ...)
  2017-11-28  0:23 ` [PATCH 05/12] perf, tools, probe: Print location for resolved variables Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-11-29  3:14   ` Masami Hiramatsu
  2017-11-28  0:23 ` [PATCH 07/12] perf, tools, script: Resolve variable names for registers Andi Kleen
                   ` (6 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add an extra quiet argument to the debug info open / probe finder
code that allows perf script to make them quieter. Otherwise
we may end up with too many error messages when lots of
instructions fail debug info parsing.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/probe-event.c  |  4 ++--
 tools/perf/util/probe-finder.c | 19 ++++++++++++-------
 tools/perf/util/probe-finder.h |  5 ++++-
 3 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index b7aaf9b2294d..2f9469e862fb 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1071,12 +1071,12 @@ static int show_available_vars_at(struct debuginfo *dinfo,
 		return -EINVAL;
 	pr_debug("Searching variables at %s\n", buf);
 
-	ret = debuginfo__find_available_vars_at(dinfo, pev, &vls);
+	ret = debuginfo__find_available_vars_at(dinfo, pev, &vls, false);
 	if (!ret) {  /* Not found, retry with an alternative */
 		ret = get_alternative_probe_event(dinfo, pev, &tmp);
 		if (!ret) {
 			ret = debuginfo__find_available_vars_at(dinfo, pev,
-								&vls);
+								&vls, false);
 			/* Release the old probe_point */
 			clear_perf_probe_point(&tmp);
 		}
diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 699f29d8a28e..137b2fe71838 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -685,12 +685,14 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct probe_finder *pf)
 	if (!die_is_func_def(sc_die)) {
 		if (!die_find_realfunc(&pf->cu_die, pf->addr, &pf->sp_die)) {
 			if (die_find_tailfunc(&pf->cu_die, pf->addr, &pf->sp_die)) {
-				pr_warning("Ignoring tail call from %s\n",
+				if (!pf->quiet)
+					pr_warning("Ignoring tail call from %s\n",
 						dwarf_diename(&pf->sp_die));
 				return 0;
 			} else {
-				pr_warning("Failed to find probe point in any "
-					   "functions.\n");
+				if (!pf->quiet)
+					pr_warning("Failed to find probe point in any "
+						   "functions.\n");
 				return -ENOENT;
 			}
 		}
@@ -708,8 +710,9 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct probe_finder *pf)
 		if ((dwarf_cfi_addrframe(pf->cfi_eh, pf->addr, &frame) != 0 &&
 		     (dwarf_cfi_addrframe(pf->cfi_dbg, pf->addr, &frame) != 0)) ||
 		    dwarf_frame_cfa(frame, &pf->fb_ops, &nops) != 0) {
-			pr_warning("Failed to get call frame on 0x%jx\n",
-				   (uintmax_t)pf->addr);
+			if (!pf->quiet)
+				pr_warning("Failed to get call frame on 0x%jx\n",
+					   (uintmax_t)pf->addr);
 			free(frame);
 			return -ENOENT;
 		}
@@ -1499,10 +1502,12 @@ static int add_available_vars(Dwarf_Die *sc_die, struct probe_finder *pf)
  */
 int debuginfo__find_available_vars_at(struct debuginfo *dbg,
 				      struct perf_probe_event *pev,
-				      struct variable_list **vls)
+				      struct variable_list **vls,
+				      bool be_quiet)
 {
 	struct available_var_finder af = {
-			.pf = {.pev = pev, .callback = add_available_vars},
+			.pf = {.pev = pev, .callback = add_available_vars,
+			       .quiet = be_quiet},
 			.mod = dbg->mod,
 			.max_vls = probe_conf.max_probes};
 	int ret;
diff --git a/tools/perf/util/probe-finder.h b/tools/perf/util/probe-finder.h
index 6368e95a5d16..abcb2262ea72 100644
--- a/tools/perf/util/probe-finder.h
+++ b/tools/perf/util/probe-finder.h
@@ -57,7 +57,8 @@ int debuginfo__find_line_range(struct debuginfo *dbg, struct line_range *lr);
 /* Find available variables */
 int debuginfo__find_available_vars_at(struct debuginfo *dbg,
 				      struct perf_probe_event *pev,
-				      struct variable_list **vls);
+				      struct variable_list **vls,
+				      bool quiet);
 
 /* Find a src file from a DWARF tag path */
 int get_real_path(const char *raw_path, const char *comp_dir,
@@ -88,6 +89,8 @@ struct probe_finder {
 	unsigned int		machine;	/* Target machine arch */
 	struct perf_probe_arg	*pvar;		/* Current target variable */
 	struct probe_trace_arg	*tvar;		/* Current result variable */
+
+	bool			quiet;		/* Avoid warnings */
 };
 
 struct trace_event_finder {
-- 
2.13.6


* [PATCH 07/12] perf, tools, script: Resolve variable names for registers
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
                   ` (5 preceding siblings ...)
  2017-11-28  0:23 ` [PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-11-28  0:23 ` [PATCH 08/12] perf, tools: Always print probe finder warnings with -v Andi Kleen
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Add a new iregvals output that uses the dwarf decoding code in
perf probe to resolve registers to names based on dwarf debuginfo.

This allows tracing the values of variables that are stored in registers
and sampled by perf.

We also print the type.

This builds on top of Masami Hiramatsu's perf probe code, which
implements all the difficult dwarf magic, but we use it for a
'reverse lookup' to figure out the variable names.

We use the perf probe code to generate a list of variables
with their registers at the sample point, and then print
any variable for which we have a valid register.
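
The matching step can be sketched as follows. This is only an illustration,
not the dwarf-sample.c code: struct demo_var, struct demo_reg,
demo_find_reg() and demo_format_var() are hypothetical names, and registers
are matched by name string purely for simplicity.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* A variable as reported by the debug info: name, type and location. */
struct demo_var {
	const char *name;
	const char *type;
	const char *reg;	/* e.g. "%di"; assumed register notation */
};

/* A register captured in the sample. */
struct demo_reg {
	const char *name;
	uint64_t value;
};

/* Find a sampled register by name, or return NULL if it was not sampled. */
static const struct demo_reg *demo_find_reg(const struct demo_reg *regs,
					    size_t nregs, const char *name)
{
	size_t i;

	for (i = 0; i < nregs; i++) {
		if (!strcmp(regs[i].name, name))
			return &regs[i];
	}
	return NULL;
}

/*
 * Format "{ name = value, type }" if the variable lives in a sampled
 * register. Returns 0 when there is nothing to print.
 */
static int demo_format_var(char *out, size_t outsz,
			   const struct demo_var *var,
			   const struct demo_reg *regs, size_t nregs)
{
	const struct demo_reg *r = demo_find_reg(regs, nregs, var->reg);

	if (!r)
		return 0;
	return snprintf(out, outsz, "{ %s = %#llx, %s }", var->name,
			(unsigned long long)r->value, var->type);
}
```

With %di = 1 and %si = 2 sampled, a variable "a" of type int located in %di
formats as "{ a = 0x1, int }", while a variable located in an unsampled
register is skipped.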

Note: when testing, please compile the program with -O2 -g. Unoptimized
code generally stores variables on the stack, which we cannot
decode with this. -g is needed for the debug information.

For example to track the first two arguments passed in
registers on x86-64:

% cat targ.c
volatile int c;

__attribute__((noinline)) void f2(int a, int b)
{
	c = a / b;
}

__attribute__((noinline)) void f1(int a, int b)
{
	f2(a, b);
	f2(b, a);
}

int main(void)
{
	int i;

	for (i = 0; i < 5000000; i++)
		f1(1, 2);
	return 0;
}
% gcc -O2 -g -o targ targ.c

% perf record -Idi,si ./targ
% perf script -F +iregvals
...
    targ  8584 169763.761843:    2091795 cycles:ppp:            40041a main (targ)
    targ  8584 169763.762520:    1913932 cycles:ppp:            400534 f1 (targ) { b = 0x2, int }  { a = 0x1, int }
    targ  8584 169763.763141:    1638913 cycles:ppp:            400523 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
    targ  8584 169763.763672:    1516522 cycles:ppp:            400522 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
    targ  8584 169763.764165:    1335501 cycles:ppp:            400523 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
    targ  8584 169763.764598:    1253289 cycles:ppp:            400522 f2 (targ) { b = 0x2, int }  { a = 0x1, int }
    targ  8584 169763.765005:    1135131 cycles:ppp:            400534 f1 (targ) { b = 0x2, int }  { a = 0x1, int }
    targ  8584 169763.765373:    1080325 cycles:ppp:            400522 f2 (targ) { b = 0x2, int }  { a = 0x1, int }
    targ  8584 169763.765724:    1036999 cycles:ppp:            400522 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
    targ  8584 169763.766061:     971213 cycles:ppp:            400534 f1 (targ) { b = 0x2, int }  { a = 0x1, int }

It works for other variables too, as long as they are in integer registers
and the compiler generated correct dwarf debug information.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-script.txt |   6 +-
 tools/perf/builtin-script.c              |  63 ++++++++++++++++-
 tools/perf/util/Build                    |   1 +
 tools/perf/util/dwarf-sample.c           | 113 +++++++++++++++++++++++++++++++
 tools/perf/util/dwarf-sample.h           |  13 ++++
 tools/perf/util/probe-event.c            |   2 +-
 tools/perf/util/probe-event.h            |   3 +
 7 files changed, 196 insertions(+), 5 deletions(-)
 create mode 100644 tools/perf/util/dwarf-sample.c
 create mode 100644 tools/perf/util/dwarf-sample.h

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 2811fcf684cb..e296944cc03f 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -117,7 +117,7 @@ OPTIONS
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
         srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
-        brstackoff, callindent, insn, insnlen, synth, phys_addr.
+        brstackoff, callindent, insn, insnlen, synth, phys_addr, iregvals.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
@@ -217,6 +217,10 @@ OPTIONS
 
 	The brstackoff field will print an offset into a specific dso/binary.
 
+	With iregvals perf script uses dwarf debug information to map sampled register values
+	(with perf record -I ...) to their symbolic names in the program. This requires availability
+	of debug information in the binaries.
+
 -k::
 --vmlinux=<file>::
         vmlinux pathname
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index cae4b13fc715..7913ec732620 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -34,6 +34,8 @@
 #include "asm/bug.h"
 #include "util/mem-events.h"
 #include "util/dump-insn.h"
+#include "util/probe-finder.h"
+#include "util/dwarf-sample.h"
 #include <dirent.h>
 #include <errno.h>
 #include <inttypes.h>
@@ -42,7 +44,6 @@
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <unistd.h>
-
 #include "sane_ctype.h"
 
 static char const		*script_name;
@@ -91,6 +92,7 @@ enum perf_output_field {
 	PERF_OUTPUT_SYNTH           = 1U << 25,
 	PERF_OUTPUT_PHYS_ADDR       = 1U << 26,
 	PERF_OUTPUT_UREGS	    = 1U << 27,
+	PERF_OUTPUT_IREG_VALS	    = 1U << 28,
 };
 
 struct output_option {
@@ -125,6 +127,7 @@ struct output_option {
 	{.str = "brstackoff", .field = PERF_OUTPUT_BRSTACKOFF},
 	{.str = "synth", .field = PERF_OUTPUT_SYNTH},
 	{.str = "phys_addr", .field = PERF_OUTPUT_PHYS_ADDR},
+	{.str = "iregvals", .field = PERF_OUTPUT_IREG_VALS},
 };
 
 enum {
@@ -424,7 +427,7 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
 					   PERF_OUTPUT_CPU, allow_user_set))
 		return -EINVAL;
 
-	if (PRINT_FIELD(IREGS) &&
+	if ((PRINT_FIELD(IREGS) || PRINT_FIELD(IREG_VALS)) &&
 		perf_evsel__check_stype(evsel, PERF_SAMPLE_REGS_INTR, "IREGS",
 					PERF_OUTPUT_IREGS))
 		return -EINVAL;
@@ -439,6 +442,11 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
 					PERF_OUTPUT_PHYS_ADDR))
 		return -EINVAL;
 
+	if (PRINT_FIELD(IREG_VALS)) {
+		if (init_probe_symbol_maps(false) >= 0)
+			probe_conf.max_probes = MAX_PROBES;
+	}
+
 	return 0;
 }
 
@@ -542,6 +550,51 @@ static int perf_session__check_output_opt(struct perf_session *session)
 	return 0;
 }
 
+#ifdef HAVE_DWARF_SUPPORT
+/* Resolve registers in samples to their dwarf name */
+static void print_sample__fprintf_ireg_vals(struct perf_sample *sample,
+				   struct perf_event_attr *attr,
+				   struct thread *thread,
+				   FILE *fp)
+{
+	struct regs_dump *regs = &sample->intr_regs;
+	uint64_t mask = attr->sample_regs_intr;
+	unsigned i = 0, r;
+	struct variable_list *vls;
+	int dret;
+
+	if (!regs->regs)
+		return;
+
+	dret = dwarf_resolve_sample(sample, thread, &vls);
+	if (dret < 0)
+		return;
+
+	for_each_set_bit(r, (unsigned long *) &mask, sizeof(mask) * 8) {
+		u64 val = regs->regs[i++];
+		char *name;
+		char *type;
+
+		if (!dwarf_varlist_find_reg(vls, dret, r, &name, &type))
+			fprintf(fp, " { %s = %#" PRIx64 ", %.*s } ", name, val,
+					(int)strcspn(type, "\t"), type);
+	}
+
+	dwarf_free_varlist(vls, dret);
+}
+
+#else
+
+static void print_sample__fprintf_ireg_vals(
+		struct perf_sample *sample __maybe_unused,
+		struct perf_event_attr *attr __maybe_unused,
+		struct thread *thread __maybe_unused,
+		FILE *fp __maybe_unused)
+{
+}
+
+#endif
+
 static int perf_sample__fprintf_iregs(struct perf_sample *sample,
 				      struct perf_event_attr *attr, FILE *fp)
 {
@@ -1558,6 +1611,9 @@ static void process_event(struct perf_script *script,
 	if (PRINT_FIELD(UREGS))
 		perf_sample__fprintf_uregs(sample, attr, fp);
 
+	if (PRINT_FIELD(IREG_VALS))
+		print_sample__fprintf_ireg_vals(sample, attr, thread, fp);
+
 	if (PRINT_FIELD(BRSTACK))
 		perf_sample__fprintf_brstack(sample, thread, attr, fp);
 	else if (PRINT_FIELD(BRSTACKSYM))
@@ -2952,7 +3008,8 @@ int cmd_script(int argc, const char **argv)
 		     "Valid types: hw,sw,trace,raw,synth. "
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
 		     "addr,symoff,period,iregs,uregs,brstack,brstacksym,flags,"
-		     "bpf-output,callindent,insn,insnlen,brstackinsn,synth,phys_addr",
+		     "bpf-output,callindent,insn,insnlen,brstackinsn,synth,phys_addr,"
+		     "iregvals",
 		     parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 80c05329835a..361db92a4bfd 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -42,6 +42,7 @@ libperf-y += callchain.o
 libperf-y += values.o
 libperf-y += debug.o
 libperf-y += machine.o
+libperf-y += dwarf-sample.o
 libperf-y += map.o
 libperf-y += pstack.o
 libperf-y += session.o
diff --git a/tools/perf/util/dwarf-sample.c b/tools/perf/util/dwarf-sample.c
new file mode 100644
index 000000000000..ee2c1b0d3bd7
--- /dev/null
+++ b/tools/perf/util/dwarf-sample.c
@@ -0,0 +1,113 @@
+/*
+ * Copyright (c) 2017, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+/* Resolve variable names from samples using DWARF. */
+#include <errno.h>
+#include "perf.h"
+#include "thread.h"
+#include "strbuf.h"
+#include "strlist.h"
+#include "util.h"
+#include "debug.h"
+#include "probe-finder.h"
+#include "dwarf-sample.h"
+
+/* Resolve dwarf variables at a sample IP */
+int dwarf_resolve_sample(struct perf_sample *sample,
+			 struct thread *thread,
+			 struct variable_list **vls)
+{
+	struct addr_location al;
+	int dret = -1;
+	struct perf_probe_event pev;
+	struct debuginfo *dinfo = NULL;
+
+	memset(&pev, 0, sizeof(struct perf_probe_event));
+	memset(&al, 0, sizeof(al));
+	thread__find_addr_map(thread, sample->cpumode, MAP__FUNCTION, sample->ip, &al);
+	if (al.map && al.map->dso->long_name) {
+		pev.target = build_id_cache__complement(al.map->dso->long_name);
+		if (pev.target) {
+			char *t = pev.target;
+			pev.target = build_id_cache__origname(pev.target);
+			free(t);
+		} else {
+			pev.target = strdup(al.map->dso->long_name);
+		}
+		if (!pev.target)
+			return -EIO;
+		al.sym = map__find_symbol(al.map, al.addr);
+		if (al.sym) {
+			pev.point.function = al.sym->name;
+			pev.point.offset = al.addr - al.sym->start;
+		}
+		dinfo = debuginfo_cache__open(pev.target, true);
+		if (dinfo)
+			dret = debuginfo__find_available_vars_at(dinfo, &pev, vls,
+								verbose == 0);
+	}
+
+	free(pev.target);
+
+	return dret;
+}
+
+/* Free resolved variable list */
+void dwarf_free_varlist(struct variable_list *vls, int dret)
+{
+	int i;
+
+	for (i = 0; i < dret; i++) {
+		zfree(&vls[i].point.symbol);
+		strlist__delete(vls[i].vars);
+	}
+	free(vls);
+}
+
+/* Find given register in resolved variables. */
+int dwarf_varlist_find_reg(struct variable_list *vls, int dret, int r, char **name,
+			   char **type)
+{
+	int i;
+	const char *regn = perf_reg_name(r);
+
+	*name = NULL;
+	for (i = 0; i < dret; i++) {
+		if (vls[i].vars) {
+			struct str_node *node;
+
+			strlist__for_each_entry(node, vls[i].vars) {
+				struct variable_node *vn =
+					container_of(node, struct variable_node, snode);
+				char *rr, *value;
+
+				for (rr = vn->value; *rr; rr++)
+					*rr = toupper(*rr);
+				value = vn->value;
+				if (*value == '%')
+					value++;
+				if (strncmp(regn, value, strlen(regn)))
+					continue;
+				*name = vn->name;
+				if (type) {
+					*type = (char *)node->s;
+					if ((*type)[0] == ' ' && (*type)[1] == '(')
+						(*type) += 2;
+				}
+				return 0;
+			}
+		}
+	}
+	return -EINVAL;
+}
diff --git a/tools/perf/util/dwarf-sample.h b/tools/perf/util/dwarf-sample.h
new file mode 100644
index 000000000000..39ec1a6c45d3
--- /dev/null
+++ b/tools/perf/util/dwarf-sample.h
@@ -0,0 +1,13 @@
+#ifndef DWARF_SAMPLE_H
+#define DWARF_SAMPLE_H 1
+
+#include "probe-finder.h"
+
+int dwarf_resolve_sample(struct perf_sample *sample,
+			 struct thread *thread,
+			 struct variable_list **vls);
+void dwarf_free_varlist(struct variable_list *vls, int dret);
+int dwarf_varlist_find_reg(struct variable_list *vls, int dret, int r,
+			   char **name, char **type);
+
+#endif
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 2f9469e862fb..4ef6ee967468 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -507,7 +507,7 @@ static struct debuginfo *open_debuginfo(const char *module, struct nsinfo *nsi,
 static struct debuginfo *debuginfo_cache;
 static char *debuginfo_cache_path;
 
-static struct debuginfo *debuginfo_cache__open(const char *module, bool silent)
+struct debuginfo *debuginfo_cache__open(const char *module, bool silent)
 {
 	const char *path = module;
 
diff --git a/tools/perf/util/probe-event.h b/tools/perf/util/probe-event.h
index 45b14f020558..e694a3680e1b 100644
--- a/tools/perf/util/probe-event.h
+++ b/tools/perf/util/probe-event.h
@@ -190,4 +190,7 @@ struct map *get_target_map(const char *target, struct nsinfo *nsi, bool user);
 void arch__post_process_probe_trace_events(struct perf_probe_event *pev,
 					   int ntevs);
 
+struct debuginfo;
+struct debuginfo *debuginfo_cache__open(const char *module, bool silent);
+
 #endif /*_PROBE_EVENT_H */
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 08/12] perf, tools: Always print probe finder warnings with -v
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
                   ` (6 preceding siblings ...)
  2017-11-28  0:23 ` [PATCH 07/12] perf, tools, script: Resolve variable names for registers Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-11-29  3:16   ` Masami Hiramatsu
  2017-11-28  0:23 ` [PATCH 09/12] perf, tools: Downgrade register mapping message to warning Andi Kleen
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Debug info resolution in perf script normally suppresses warnings.
Allow -v to override that, which is useful for finding out why a
lookup does not work.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/probe-event.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 4ef6ee967468..fb5031ac24a2 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -482,7 +482,7 @@ static struct debuginfo *open_debuginfo(const char *module, struct nsinfo *nsi,
 					strcpy(reason, "(unknown)");
 			} else
 				dso__strerror_load(dso, reason, STRERR_BUFSIZE);
-			if (!silent)
+			if (!silent || verbose)
 				pr_err("Failed to find the path for %s: %s\n",
 					module ?: "kernel", reason);
 			return NULL;
@@ -491,7 +491,7 @@ static struct debuginfo *open_debuginfo(const char *module, struct nsinfo *nsi,
 	}
 	nsinfo__mountns_enter(nsi, &nsc);
 	ret = debuginfo__new(path);
-	if (!ret && !silent) {
+	if (!ret && (!silent || verbose)) {
 		pr_warning("The %s file has no debug information.\n", path);
 		if (!module || !strtailcmp(path, ".ko"))
 			pr_warning("Rebuild with CONFIG_DEBUG_INFO=y, ");
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 09/12] perf, tools: Downgrade register mapping message to warning
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
                   ` (7 preceding siblings ...)
  2017-11-28  0:23 ` [PATCH 08/12] perf, tools: Always print probe finder warnings with -v Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-11-29  5:56   ` Masami Hiramatsu
  2017-11-28  0:23 ` [PATCH 10/12] perf, tools: Add args and gprs shortcut for registers Andi Kleen
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

When tracing floating point code it's quite possible that perf
doesn't recognize the register number. Downgrade the warning
for unknown registers to a debug message.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/probe-finder.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 137b2fe71838..5fe6466254f9 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -272,8 +272,8 @@ static int convert_variable_location(Dwarf_Die *vr_die, Dwarf_Addr addr,
 
 	regs = get_dwarf_regstr(regn, machine);
 	if (!regs) {
-		/* This should be a bug in DWARF or this tool */
-		pr_warning("Mapping for the register number %u "
+		/* This can happen with floating point */
+		pr_debug("Mapping for the register number %u "
 			   "missing on this architecture.\n", regn);
 		return -ENOTSUP;
 	}
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 10/12] perf, tools: Add args and gprs shortcut for registers
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
                   ` (8 preceding siblings ...)
  2017-11-28  0:23 ` [PATCH 09/12] perf, tools: Downgrade register mapping message to warning Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-11-28  0:23 ` [PATCH 11/12] perf, tools: Print probe warnings for binaries only once per binary Andi Kleen
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Writing out all registers to sample with -I can give very long command
lines. Add the shorthands "args" (the integer argument registers) and
"gpr" (all general purpose registers).
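
A minimal sketch of how such a shorthand mask can be composed, assuming
that a sample register mask carries one bit per register index (as the
existing SMPL_REG entries do). The REG_* enum and reg_mask() are
hypothetical stand-ins, assumed to mirror the order of the x86 perf
register indexes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative register indexes, assumed to follow the x86 perf order. */
enum { REG_AX, REG_BX, REG_CX, REG_DX, REG_SI, REG_DI };

/* Build a sample mask with one bit set per register index. */
static uint64_t reg_mask(const int *regs, size_t nregs)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < nregs; i++) {
		assert(regs[i] >= 0 && regs[i] < 64);
		mask |= 1ULL << regs[i];
	}
	return mask;
}
```

For DI, SI, DX and CX this yields 0x3c. ORing the raw indexes instead
would give 7, which selects AX, BX and CX, so the shift matters.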

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/arch/x86/util/perf_regs.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/perf/arch/x86/util/perf_regs.c b/tools/perf/arch/x86/util/perf_regs.c
index 4b2caf6d48e7..7d877c49bae5 100644
--- a/tools/perf/arch/x86/util/perf_regs.c
+++ b/tools/perf/arch/x86/util/perf_regs.c
@@ -30,6 +30,15 @@ const struct sample_reg sample_reg_masks[] = {
 	SMPL_REG(R13, PERF_REG_X86_R13),
 	SMPL_REG(R14, PERF_REG_X86_R14),
 	SMPL_REG(R15, PERF_REG_X86_R15),
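+	{ "args", (1ULL << PERF_REG_X86_DI) | (1ULL << PERF_REG_X86_SI) |
+		  (1ULL << PERF_REG_X86_DX) | (1ULL << PERF_REG_X86_CX) |
+		  (1ULL << PERF_REG_X86_R8) | (1ULL << PERF_REG_X86_R9) },
+	{ "gpr",  (1ULL << PERF_REG_X86_AX) | (1ULL << PERF_REG_X86_BX) | (1ULL << PERF_REG_X86_CX) |
+		  (1ULL << PERF_REG_X86_DX) | (1ULL << PERF_REG_X86_SI) | (1ULL << PERF_REG_X86_DI) |
+		  (1ULL << PERF_REG_X86_BP) | (1ULL << PERF_REG_X86_SP) | (1ULL << PERF_REG_X86_R8) |
+		  (1ULL << PERF_REG_X86_R9) | (1ULL << PERF_REG_X86_R10) | (1ULL << PERF_REG_X86_R11) |
+		  (1ULL << PERF_REG_X86_R12) | (1ULL << PERF_REG_X86_R13) | (1ULL << PERF_REG_X86_R14) |
+		  (1ULL << PERF_REG_X86_R15) },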
 #endif
 	SMPL_REG_END
 };
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 11/12] perf, tools: Print probe warnings for binaries only once per binary
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
                   ` (9 preceding siblings ...)
  2017-11-28  0:23 ` [PATCH 10/12] perf, tools: Add args and gprs shortcut for registers Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-11-30  2:38   ` Masami Hiramatsu
  2017-11-28  0:23 ` [PATCH 12/12] perf, tools, script: Implement dwarf resolving of instructions Andi Kleen
  2017-11-28  5:31 ` Implement dwarf variable/type resolving for perf script Masami Hiramatsu
  12 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

When the perf probe code is called from perf script we may end up
with a flood of bad binary errors with -v. Only print the error message
once in this case.
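
A minimal sketch of the suppression idea, assuming that remembering only
the most recently reported path is good enough. should_warn() is a
hypothetical helper name, not a function in perf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true the first time a path is seen and false while the same
 * path repeats. Only the most recent path is remembered, so a path that
 * reappears after a different one is reported again. An empty path
 * matches the initial state and is never reported.
 */
static bool should_warn(const char *path)
{
	static char last[1024];

	assert(path != NULL);
	if (!strcmp(path, last))
		return false;
	snprintf(last, sizeof(last), "%s", path);
	return true;
}
```

This is cheap and needs no allocation, but samples that alternate between
two binaries without debug information would still warn on every switch.
A set of already reported paths would avoid that at the cost of more
state.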

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/util/probe-event.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index fb5031ac24a2..85fbeeb364bf 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -492,12 +492,16 @@ static struct debuginfo *open_debuginfo(const char *module, struct nsinfo *nsi,
 	nsinfo__mountns_enter(nsi, &nsc);
 	ret = debuginfo__new(path);
 	if (!ret && (!silent || verbose)) {
-		pr_warning("The %s file has no debug information.\n", path);
-		if (!module || !strtailcmp(path, ".ko"))
-			pr_warning("Rebuild with CONFIG_DEBUG_INFO=y, ");
-		else
-			pr_warning("Rebuild with -g, ");
-		pr_warning("or install an appropriate debuginfo package.\n");
+		static char printed[1024];
+		if (strcmp(path, printed)) {
+			snprintf(printed, sizeof printed, "%s", path);
+			pr_warning("The %s file has no debug information.\n", path);
+			if (!module || !strtailcmp(path, ".ko"))
+				pr_warning("Rebuild with CONFIG_DEBUG_INFO=y, ");
+			else
+				pr_warning("Rebuild with -g, ");
+			pr_warning("or install an appropriate debuginfo package.\n");
+		}
 	}
 	nsinfo__mountns_exit(&nsc);
 	return ret;
-- 
2.13.6

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 12/12] perf, tools, script: Implement dwarf resolving of instructions
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
                   ` (10 preceding siblings ...)
  2017-11-28  0:23 ` [PATCH 11/12] perf, tools: Print probe warnings for binaries only once per binary Andi Kleen
@ 2017-11-28  0:23 ` Andi Kleen
  2017-12-01  2:36   ` Masami Hiramatsu
  2017-11-28  5:31 ` Implement dwarf variable/type resolving for perf script Masami Hiramatsu
  12 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2017-11-28  0:23 UTC (permalink / raw)
  To: acme; +Cc: jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

From: Andi Kleen <ak@linux.intel.com>

Implement resolving arguments of instructions to dwarf variable names.

When we sample an instruction, decode the instruction and try to
symbolize the register or destination it is using. Also print the type.
It builds on the perf probe debugging information reverse lookup
infrastructure added earlier.

The dwarf decoding magic is all done using Masami Hiramatsu's perf probe code.

This is useful for

- The PTWRITE instruction: when the compiler generates debugging information
for PTWRITE arguments.  The value logged by PTWRITE is available to the
PT decoder, so it can print the value.

- It also works for other samples with an IP, so it's possible to follow
their memory access patterns (but not the values)

For the sample we use the instruction decoder to decode the instruction
at the sample point, and then map the arguments to dwarf information.

For structure references we only print the numeric offset, but do not
resolve the field name.

Absolute memory references are not supported.

It doesn't distinguish SSE registers from GPRs (this would require
extending the instruction decoder to detect SSE instructions).
Instructions with a VEX prefix (AVX) are not decoded at all.
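
A minimal sketch of the operand classification step, assuming the
standard x86 ModRM layout (mod in bits 7-6, r/m in bits 2-0, REX.B
extending r/m to 16 registers). decode_modrm_rm() and its types are
hypothetical names for illustration; the patch itself relies on the
existing instruction decoder for this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum operand_kind {
	OPERAND_REG,		/* operand is the register itself */
	OPERAND_MEM,		/* operand is memory at [reg + disp] */
	OPERAND_UNSUPPORTED,	/* SIB or displacement-only forms */
};

struct modrm_operand {
	enum operand_kind kind;
	unsigned int reg;	/* hardware register number 0-15 */
};

/* Classify the r/m operand of a ModRM byte. */
static struct modrm_operand decode_modrm_rm(uint8_t modrm, bool rex_b)
{
	struct modrm_operand op = { OPERAND_UNSUPPORTED, 0 };
	unsigned int mod = modrm >> 6;
	unsigned int rm = modrm & 0x07;

	if (mod != 3 && rm == 4)
		return op;	/* a SIB byte follows: not handled */
	if (mod == 0 && rm == 5)
		return op;	/* disp32 or RIP-relative: no base register */

	op.kind = (mod == 3) ? OPERAND_REG : OPERAND_MEM;
	op.reg = rm | (rex_b ? 8 : 0);
	return op;
}
```

For example, the ModRM byte 0x7d of "movq %rdi, -0x18(%rbp)" decodes to a
memory operand based on register 5 (BP), while 0xe5 of "mov %rsp, %rbp"
decodes to register 5 directly.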

Example:

From perf itself:

% perf record -e intel_pt//u  -a sleep 1
% perf script --itrace=i0ns -F insnvar,insn,ip,sym  -f 2>&1 | xed -F insn: -A -64 | less
...
           4f7e61 xyarray__max_y                pushq  %rbp
           4f7e62 xyarray__max_y                mov %rsp, %rbp
           4f7e65 xyarray__max_y                sub $0x20, %rsp
           4f7e69 xyarray__max_y                movq  %rdi, -0x18(%rbp) { -24(xy), struct xyarray* }
           4f7e6d xyarray__max_y                movq  %fs:0x28, %rax
           4f7e76 xyarray__max_y                movq  %rax, -0x8(%rbp) { -8(xy), struct xyarray* }
           4f7e7a xyarray__max_y                xor %eax, %eax
           4f7e7c xyarray__max_y                movq  -0x18(%rbp), %rax { -24(xy), struct xyarray* }
           4f7e80 xyarray__max_y                movq  0x20(%rax), %rax
           4f7e84 xyarray__max_y                movq  -0x8(%rbp), %rdx { -8(xy), struct xyarray* }
           4f7e88 xyarray__max_y                xorq  %fs:0x28, %rdx
           4f7e91 xyarray__max_y                jz 0x7
           4f7e98 xyarray__max_y                leaveq
           4f7e99 xyarray__max_y                retq

In this example we now know that this function accesses two fields in struct xyarray *

Open Issues:
- It is fairly slow. Some caching would likely help.
- Frame pointer references, which are common in unoptimized code,
are usually not resolved correctly. That's usually fine because
memory accesses on the stack are not very interesting.
- It cannot resolve some references.

But I find it already quite useful.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
 tools/perf/Documentation/perf-script.txt |   8 +-
 tools/perf/arch/x86/util/Build           |   1 +
 tools/perf/arch/x86/util/operand.c       | 131 +++++++++++++++++++++++++
 tools/perf/builtin-script.c              | 162 ++++++++++++++++++++++++++++++-
 tools/perf/util/Build                    |   1 +
 tools/perf/util/operand.c                |  16 +++
 tools/perf/util/operand.h                |  16 +++
 tools/perf/util/probe-event.c            |   3 +
 8 files changed, 335 insertions(+), 3 deletions(-)
 create mode 100644 tools/perf/arch/x86/util/operand.c
 create mode 100644 tools/perf/util/operand.c
 create mode 100644 tools/perf/util/operand.h

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index e296944cc03f..d3b93b7f804b 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -117,7 +117,9 @@ OPTIONS
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
         srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
-        brstackoff, callindent, insn, insnlen, synth, phys_addr, iregvals.
+        brstackoff, callindent, insn, insnlen, synth, phys_addr, iregvals,
+        insnvar.
+
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
@@ -221,6 +223,10 @@ OPTIONS
 	(with perf record -I ...) to their symbolic names in the program. This requires availability
 	of debug information in the binaries.
 
+	With insnvar perf script tries to decode and symbolize operands of sampled or traced
+	instructions using debug information.  When PTWRITEs are synthesized with Intel PT the
+	values of the PTWRITEs are automatically symbolized.
+
 -k::
 --vmlinux=<file>::
         vmlinux pathname
diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index 139f9f1a56f9..8e9a2140e72b 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -5,6 +5,7 @@ libperf-y += insnlen.o
 libperf-y += kvm-stat.o
 libperf-y += perf_regs.o
 libperf-y += group.o
+libperf-y += operand.o
 
 libperf-$(CONFIG_DWARF) += dwarf-regs.o
 libperf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
diff --git a/tools/perf/arch/x86/util/operand.c b/tools/perf/arch/x86/util/operand.c
new file mode 100644
index 000000000000..c78c21e8955f
--- /dev/null
+++ b/tools/perf/arch/x86/util/operand.c
@@ -0,0 +1,131 @@
+/*
+ * Copyright (c) 2017, Intel Corporation.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ */
+
+/* Decode instructions to resolve operands. */
+#include <stdio.h>
+#include "debug.h"
+#include "perf.h"
+#include "operand.h"
+#include "intel-pt-decoder/insn.h"
+#include "intel-pt-decoder/inat.h"
+
+static unsigned char x86_reg_to_perf[16] = {
+	[0] = PERF_REG_X86_AX,
+	[1] = PERF_REG_X86_CX,
+	[2] = PERF_REG_X86_DX,
+	[3] = PERF_REG_X86_BX,
+	[4] = PERF_REG_X86_SP,
+	[5] = PERF_REG_X86_BP,
+	[6] = PERF_REG_X86_SI,
+	[7] = PERF_REG_X86_DI,
+#ifdef HAVE_ARCH_X86_64_SUPPORT
+	[8] = PERF_REG_X86_R8,
+	[9] = PERF_REG_X86_R9,
+	[10] = PERF_REG_X86_R10,
+	[11] = PERF_REG_X86_R11,
+	[12] = PERF_REG_X86_R12,
+	[13] = PERF_REG_X86_R13,
+	[14] = PERF_REG_X86_R14,
+	[15] = PERF_REG_X86_R15,
+#endif
+};
+
+/* Decode x86 instruction and print address mode. */
+int arch_resolve_operand(char *insnbytes, int insnlen, bool is64bit,
+			 u64 ip,
+			 u64 val,
+			 struct operand_print_ops *ops,
+			 void *ctx)
+{
+	struct insn insn;
+	bool has_value = false;
+	int reg;
+
+	insn_init(&insn, insnbytes, insnlen, is64bit);
+	insn_get_length(&insn);
+	if (!insn_complete(&insn))
+		goto unknown;
+	/* Cannot handle Y/Zmm */
+	if (insn.vex_prefix.nbytes > 0)
+		goto unknown;
+	if (!insn.modrm.nbytes)
+		goto unknown;
+
+	switch (insn.opcode.bytes[0]) {
+	case 0x0f:
+		/* For PTWRITE use the caller value */
+		if (insn.opcode.bytes[1] == 0xae)
+			has_value = true;
+		break;
+	case 0xb0:
+	case 0xb8:
+	case 0xc6:
+	case 0xc7:
+		/* For MOV $xxx use the immediate */
+		if (insn.immediate.nbytes) {
+			has_value = true;
+			val = insn.immediate.value;
+		}
+		break;
+	default:
+		break;
+	}
+
+	/* Could also get known register values from caller */
+
+	/* Should check for SSE instructions to detect XMM* */
+
+	if (insn_rip_relative(&insn)) {
+		ops->print_symbol(ctx, ip + insn.length + insn.displacement.value,
+				  has_value, val);
+		return 0;
+	}
+
+	/* Should handle direct memory offset */
+
+	reg = X86_MODRM_RM(insn.modrm.value);
+	if (insn.rex_prefix.nbytes && X86_REX_B(insn.rex_prefix.value))
+		reg += 8;
+	reg = x86_reg_to_perf[reg];
+
+	switch (X86_MODRM_MOD(insn.modrm.value)) {
+	case 0: /* [r/m] */
+	case 1: /* [r/m + disp8] */
+	case 2: /* [r/m + disp32] */
+		if (insn.sib.nbytes) {
+			/*
+			 * Scaling and multiple registers
+			 * not supported for now.
+			 */
+			pr_debug("SIB encoding not supported\n");
+			goto unknown;
+		}
+		ops->print_indirect_reg(ctx, reg, insn.displacement.value,
+					has_value, val);
+		break;
+
+	case 3: /* register value */
+		ops->print_reg(ctx, reg, has_value, val);
+		break;
+
+	default:
+		goto unknown;
+	}
+	return 0;
+
+unknown:
+	ops->print_unknown(ctx);
+	return 0;
+
+}
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7913ec732620..792e1d2dfdd4 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -36,6 +36,7 @@
 #include "util/dump-insn.h"
 #include "util/probe-finder.h"
 #include "util/dwarf-sample.h"
+#include "util/operand.h"
 #include <dirent.h>
 #include <errno.h>
 #include <inttypes.h>
@@ -93,6 +94,7 @@ enum perf_output_field {
 	PERF_OUTPUT_PHYS_ADDR       = 1U << 26,
 	PERF_OUTPUT_UREGS	    = 1U << 27,
 	PERF_OUTPUT_IREG_VALS	    = 1U << 28,
+	PERF_OUTPUT_INSN_VAR	    = 1U << 29,
 };
 
 struct output_option {
@@ -128,6 +130,7 @@ struct output_option {
 	{.str = "synth", .field = PERF_OUTPUT_SYNTH},
 	{.str = "phys_addr", .field = PERF_OUTPUT_PHYS_ADDR},
 	{.str = "iregvals", .field = PERF_OUTPUT_IREG_VALS},
+	{.str = "insnvar", .field = PERF_OUTPUT_INSN_VAR},
 };
 
 enum {
@@ -442,7 +445,7 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
 					PERF_OUTPUT_PHYS_ADDR))
 		return -EINVAL;
 
-	if (PRINT_FIELD(IREG_VALS)) {
+	if (PRINT_FIELD(IREG_VALS) || PRINT_FIELD(INSN_VAR)) {
 		if (init_probe_symbol_maps(false) >= 0)
 			probe_conf.max_probes = MAX_PROBES;
 	}
@@ -1068,6 +1071,159 @@ static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
 	return printed;
 }
 
+#ifdef HAVE_DWARF_SUPPORT
+
+struct operand_print_ctx {
+	FILE *fp;
+	struct variable_list *vls;
+	int dret;
+	struct thread *thread;
+	u8 cpumode;
+};
+
+static void print_op_reg(void *ctx, int reg, bool has_value, u64 val)
+{
+	struct operand_print_ctx *oc = ctx;
+	char *name, *type;
+
+	if (!dwarf_varlist_find_reg(oc->vls, oc->dret, reg, &name, &type)) {
+		fprintf(oc->fp, " { %s", name);
+		if (has_value)
+			fprintf(oc->fp, " = %#" PRIx64, val);
+		fprintf(oc->fp, ", %.*s }", (int)strcspn(type, "\t"), type);
+	} else if (verbose)
+		fprintf(oc->fp, " {?NO-MATCH-REG}");
+}
+
+static void print_op_symbol(void *ctx, u64 addr, bool has_val, u64 val)
+{
+	struct operand_print_ctx *oc = ctx;
+	struct addr_location al;
+
+	memset(&al, 0, sizeof(struct addr_location));
+	thread__find_addr_map(oc->thread, oc->cpumode, MAP__VARIABLE,
+			      addr, &al);
+	if (!al.map)
+		thread__find_addr_map(oc->thread, oc->cpumode, MAP__FUNCTION,
+			      addr, &al);
+	if (al.map)
+		al.sym = map__find_symbol(al.map, al.addr);
+
+	if (al.map && al.sym) {
+		fprintf(oc->fp, " { ");
+		symbol__fprintf_symname_offs(al.sym, &al, oc->fp);
+		if (has_val)
+			fprintf(oc->fp, " = %#" PRIx64, val);
+		fprintf(oc->fp, ", symbol }");
+	} else
+		fprintf(oc->fp, " {?BAD-SYM}");
+}
+
+static void print_op_unknown(void *ctx)
+{
+	struct operand_print_ctx *oc = ctx;
+
+	if (verbose)
+		fprintf(oc->fp, " {?}");
+}
+
+static void print_op_indirect_reg(void *ctx,
+				  int reg,
+				  s32 off,
+				  bool has_val,
+				  u64 val)
+{
+	struct operand_print_ctx *oc = ctx;
+	char *name, *type;
+
+	/* Should resolve field names too, for now just print offsets */
+	if (!dwarf_varlist_find_reg(oc->vls, oc->dret, reg, &name, &type)) {
+		/* Likely frame pointer. Should resolve separately. */
+		if (!strncmp(type, "unknown_type", 12))
+			return;
+
+		fprintf(oc->fp, " { %d(%s)", off, name);
+		if (has_val)
+			fprintf(oc->fp, " = %" PRIx64, val);
+		fprintf(oc->fp, ", %.*s }", (int)strcspn(type, "\t"), type);
+	} else if (verbose)
+		fprintf(oc->fp, " {?NO-MATCH-IND-REG}");
+
+}
+
+static struct operand_print_ops operand_ops = {
+	.print_reg = print_op_reg,
+	.print_symbol = print_op_symbol,
+	.print_unknown = print_op_unknown,
+	.print_indirect_reg = print_op_indirect_reg,
+};
+
+#define MAX_INSN 16
+
+/* Resolve operands of instructions to their dwarf name */
+static void perf_sample__fprint_insn_var(struct perf_sample *sample,
+			   struct thread *thread,
+			   struct perf_event_attr *attr,
+			   struct machine *machine,
+			   FILE *fp)
+{
+	struct operand_print_ctx oc = {
+		.fp = fp,
+		.thread = thread,
+	};
+	u8 ibuf[MAX_INSN*2];
+	bool is64bit;
+	u64 val = 0;
+
+	if (grab_bb(ibuf, sample->ip, sample->ip + MAX_INSN,
+		    machine,
+		    thread,
+		    &is64bit,
+		    &oc.cpumode,
+		    false) < 0) {
+		if (verbose)
+			fprintf(fp, " {?NO-TEXT}");
+		return;
+	}
+
+	oc.cpumode = sample->cpumode;
+
+	oc.dret = dwarf_resolve_sample(sample, thread, &oc.vls);
+	if (oc.dret < 0) {
+		if (verbose)
+			fprintf(fp, " {?BAD-DWARF}");
+		return;
+	}
+
+	if (attr->config == PERF_SYNTH_INTEL_PTWRITE) {
+		struct perf_synth_intel_ptwrite *data =
+			perf_sample__synth_ptr(sample);
+		if (!perf_sample__bad_synth_size(sample, *data))
+			val = le64_to_cpu(data->payload);
+	}
+
+	arch_resolve_operand((char *)ibuf, MAX_INSN, is64bit,
+			     sample->ip,
+			     val,
+			     &operand_ops,
+			     &oc);
+}
+
+#else
+
+static void perf_sample__fprint_insn_var(
+		struct perf_sample *sample __maybe_unused,
+		struct thread *thread __maybe_unused,
+		struct perf_event_attr *attr __maybe_unused,
+		struct machine *machine __maybe_unused,
+		FILE *fp __maybe_unused)
+{
+	if (verbose)
+		fprintf(fp, " {?}");
+}
+
+#endif
+
 static int perf_sample__fprintf_addr(struct perf_sample *sample,
 				     struct thread *thread,
 				     struct perf_event_attr *attr, FILE *fp)
@@ -1183,6 +1339,8 @@ static int perf_sample__fprintf_insn(struct perf_sample *sample,
 		for (i = 0; i < sample->insn_len; i++)
 			printed += fprintf(fp, " %02x", (unsigned char)sample->insn[i]);
 	}
+	if (PRINT_FIELD(INSN_VAR))
+		perf_sample__fprint_insn_var(sample, thread, attr, machine, fp);
 	if (PRINT_FIELD(BRSTACKINSN))
 		printed += perf_sample__fprintf_brstackinsn(sample, thread, attr, machine, fp);
 
@@ -3009,7 +3167,7 @@ int cmd_script(int argc, const char **argv)
 		     "Fields: comm,tid,pid,time,cpu,event,trace,ip,sym,dso,"
 		     "addr,symoff,period,iregs,uregs,brstack,brstacksym,flags,"
 		     "bpf-output,callindent,insn,insnlen,brstackinsn,synth,phys_addr,"
-		     "iregvals",
+		     "iregvals,insnvar",
 		     parse_output_fields),
 	OPT_BOOLEAN('a', "all-cpus", &system_wide,
 		    "system-wide collection from all CPUs"),
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 361db92a4bfd..4f42b2fad398 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -42,6 +42,7 @@ libperf-y += callchain.o
 libperf-y += values.o
 libperf-y += debug.o
 libperf-y += machine.o
+libperf-y += operand.o
 libperf-y += dwarf-sample.o
 libperf-y += map.o
 libperf-y += pstack.o
diff --git a/tools/perf/util/operand.c b/tools/perf/util/operand.c
new file mode 100644
index 000000000000..88d9284a7049
--- /dev/null
+++ b/tools/perf/util/operand.c
@@ -0,0 +1,16 @@
+#include <errno.h>
+#include "perf.h"
+#include "operand.h"
+
+/* Fallback, can be overridden per architecture */
+__weak
+int arch_resolve_operand(char *insn __maybe_unused,
+			 int insnlen __maybe_unused,
+			 bool is64bit __maybe_unused,
+			 u64 ip __maybe_unused,
+			 u64 val __maybe_unused,
+			 struct operand_print_ops *ops __maybe_unused,
+			 void *ctx __maybe_unused)
+{
+	return -EINVAL;
+}
diff --git a/tools/perf/util/operand.h b/tools/perf/util/operand.h
new file mode 100644
index 000000000000..63a7602727a1
--- /dev/null
+++ b/tools/perf/util/operand.h
@@ -0,0 +1,16 @@
+#ifndef OPERAND_H
+#define OPERAND_H 1
+
+struct operand_print_ops {
+	void (*print_reg)(void *ctx, int reg, bool has_val, u64 val);
+	void (*print_symbol)(void *ctx, u64 addr, bool has_val, u64 val);
+	void (*print_indirect_reg)(void *ctx, int reg, s32 off, bool has_val, u64 val);
+	void (*print_unknown)(void *ctx);
+};
+
+int arch_resolve_operand(char *insn, int insnlen, bool is64bit, u64 ip,
+			 u64 val,
+			 struct operand_print_ops *ops,
+			 void *ctx);
+
+#endif
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 85fbeeb364bf..2a65ebed0998 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -78,6 +78,9 @@ int init_probe_symbol_maps(bool user_only)
 {
 	int ret;
 
+	if (host_machine)
+		return 0;
+
 	symbol_conf.sort_by_name = true;
 	symbol_conf.allow_aliases = true;
 	ret = symbol__init(NULL);
-- 
2.13.6

* Re: Implement dwarf variable/type resolving for perf script
  2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
                   ` (11 preceding siblings ...)
  2017-11-28  0:23 ` [PATCH 12/12] perf, tools, script: Implement dwarf resolving of instructions Andi Kleen
@ 2017-11-28  5:31 ` Masami Hiramatsu
  12 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-11-28  5:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, mhiramat, adrian.hunter, linux-kernel

Hi Andi,

On Mon, 27 Nov 2017 16:23:09 -0800
Andi Kleen <andi@firstfloor.org> wrote:

> This patchkit extends perf script to query dwarf information for variable
> names or types/structure fields accessed from code. 

Very interesting! So it is a kind of reverse query interface.

> 
> The dwarf resolution is all on top of Masami's perf probe dwarf code.
> 
> It supports multiple use cases:
> - When we sample registers it can use the dwarf information to resolve
> the registers to names.
> - When we sample any instruction the instruction can be decoded and
> we can determine the type/struct field to make an estimate of the 
> memory access patterns in data structures. 
> - When we sample the new PTWRITE instruction the logged value from
> the PT log can be associated with a variable.
> - Various cleanups and fixes to make the one above all possible.

OK, I'll review that.

Thanks!

> 
> It is all implemented with new output formats in perf script:
> iregval (map register values to names) and insnvar (decode instruction
> and map back memory operand to dwarf operation)
> 
> There are some limitations, it cannot decode everything, and is
> somewhat slow, but it's already quite useful for typical code
> 
> 
>     % perf record -Idi,si ./targ
>     % perf script -F +iregvals
>     ...
>         targ  8584 169763.761843:    2091795 cycles:ppp:            40041a main (targ)
>         targ  8584 169763.762520:    1913932 cycles:ppp:            400534 f1 (targ) { b = 0x2, int }  { a = 0x1, int }
>         targ  8584 169763.763141:    1638913 cycles:ppp:            400523 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
>         targ  8584 169763.763672:    1516522 cycles:ppp:            400522 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
>         targ  8584 169763.764165:    1335501 cycles:ppp:            400523 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
>         targ  8584 169763.764598:    1253289 cycles:ppp:            400522 f2 (targ) { b = 0x2, int }  { a = 0x1, int }
>         targ  8584 169763.765005:    1135131 cycles:ppp:            400534 f1 (targ) { b = 0x2, int }  { a = 0x1, int }
>         targ  8584 169763.765373:    1080325 cycles:ppp:            400522 f2 (targ) { b = 0x2, int }  { a = 0x1, int }
>         targ  8584 169763.765724:    1036999 cycles:ppp:            400522 f2 (targ) { b = 0x1, int }  { a = 0x2, int }
>         targ  8584 169763.766061:     971213 cycles:ppp:            400534 f1 (targ) { b = 0x2, int }  { a = 0x1, int }
> 
> 
>     % perf record -e intel_pt//u  -a sleep 1
>     % perf script --itrace=i0ns -F insnvar,insn,ip,sym  -f 2>&1 | xed -F insn: -A -64 | less
>     ...
>                4f7e61 xyarray__max_y                pushq  %rbp
>                4f7e62 xyarray__max_y                mov %rsp, %rbp
>                4f7e65 xyarray__max_y                sub $0x20, %rsp
>                4f7e69 xyarray__max_y                movq  %rdi, -0x18(%rbp) { -24(xy), struct xyarray* }
>                4f7e6d xyarray__max_y                movq  %fs:0x28, %rax
>                4f7e76 xyarray__max_y                movq  %rax, -0x8(%rbp) { -8(xy), struct xyarray* }
>                4f7e7a xyarray__max_y                xor %eax, %eax
>                4f7e7c xyarray__max_y                movq  -0x18(%rbp), %rax { -24(xy), struct xyarray* }
>                4f7e80 xyarray__max_y                movq  0x20(%rax), %rax
>                4f7e84 xyarray__max_y                movq  -0x8(%rbp), %rdx { -8(xy), struct xyarray* }
>                4f7e88 xyarray__max_y                xorq  %fs:0x28, %rdx
>                4f7e91 xyarray__max_y                jz 0x7
>                4f7e98 xyarray__max_y                leaveq
>                4f7e99 xyarray__max_y                retq
>     
> In this example we now know that this function accesses two fields in struct xyarray *
> 
> Available from
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc perf/var-resolve-2
> 
> v1: Initial post


-- 
Masami Hiramatsu <mhiramat@kernel.org>

* Re: [PATCH 03/12] perf, tools: Support storing additional data in strlist
  2017-11-28  0:23 ` [PATCH 03/12] perf, tools: Support storing additional data in strlist Andi Kleen
@ 2017-11-28 13:31   ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-11-28 13:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

On Mon, 27 Nov 2017 16:23:12 -0800
Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> Add a configurable node size to strlist, which allows users
> to store additional data in a str_node. Also add a new interface
> to add a new strlist node, and return the node, so additional
> data can be added.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  tools/perf/util/rblist.c  | 16 ++++++++++++++--
>  tools/perf/util/rblist.h  |  2 ++
>  tools/perf/util/strlist.c | 15 ++++++++++++++-
>  tools/perf/util/strlist.h |  8 ++++++++
>  4 files changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/tools/perf/util/rblist.c b/tools/perf/util/rblist.c
> index 0dfe27d99458..fa221e6b0932 100644
> --- a/tools/perf/util/rblist.c
> +++ b/tools/perf/util/rblist.c
> @@ -11,11 +11,13 @@
>  
>  #include "rblist.h"
>  
> -int rblist__add_node(struct rblist *rblist, const void *new_entry)
> +int rblist__add_node_ptr(struct rblist *rblist, const void *new_entry,
> +		       struct rb_node **nodep)
>  {
>  	struct rb_node **p = &rblist->entries.rb_node;
>  	struct rb_node *parent = NULL, *new_node;
>  
> +	*nodep = NULL;
>  	while (*p != NULL) {
>  		int rc;
>  
> @@ -26,13 +28,16 @@ int rblist__add_node(struct rblist *rblist, const void *new_entry)
>  			p = &(*p)->rb_left;
>  		else if (rc < 0)
>  			p = &(*p)->rb_right;
> -		else
> +		else {
> +			*nodep = parent;
>  			return -EEXIST;
> +		}
>  	}
>  
>  	new_node = rblist->node_new(rblist, new_entry);
>  	if (new_node == NULL)
>  		return -ENOMEM;
> +	*nodep = new_node;
>  
>  	rb_link_node(new_node, parent, p);
>  	rb_insert_color(new_node, &rblist->entries);
> @@ -41,6 +46,13 @@ int rblist__add_node(struct rblist *rblist, const void *new_entry)
>  	return 0;
>  }
>  
> +int rblist__add_node(struct rblist *rblist, const void *new_entry)
> +{
> +	struct rb_node *nd;
> +
> +	return rblist__add_node_ptr(rblist, new_entry, &nd);
> +}
> +
>  void rblist__remove_node(struct rblist *rblist, struct rb_node *rb_node)
>  {
>  	rb_erase(rb_node, &rblist->entries);
> diff --git a/tools/perf/util/rblist.h b/tools/perf/util/rblist.h
> index 4c8638a22571..2941e4295f63 100644
> --- a/tools/perf/util/rblist.h
> +++ b/tools/perf/util/rblist.h
> @@ -31,6 +31,8 @@ struct rblist {
>  void rblist__init(struct rblist *rblist);
>  void rblist__delete(struct rblist *rblist);
>  int rblist__add_node(struct rblist *rblist, const void *new_entry);
> +int rblist__add_node_ptr(struct rblist *rblist, const void *new_entry,
> +			 struct rb_node **nodep);
>  void rblist__remove_node(struct rblist *rblist, struct rb_node *rb_node);
>  struct rb_node *rblist__find(struct rblist *rblist, const void *entry);
>  struct rb_node *rblist__findnew(struct rblist *rblist, const void *entry);
> diff --git a/tools/perf/util/strlist.c b/tools/perf/util/strlist.c
> index 9de5434bb49e..68ef21c3797c 100644
> --- a/tools/perf/util/strlist.c
> +++ b/tools/perf/util/strlist.c
> @@ -18,7 +18,7 @@ struct rb_node *strlist__node_new(struct rblist *rblist, const void *entry)
>  	const char *s = entry;
>  	struct rb_node *rc = NULL;
>  	struct strlist *strlist = container_of(rblist, struct strlist, rblist);
> -	struct str_node *snode = malloc(sizeof(*snode));
> +	struct str_node *snode = malloc(strlist->node_size);
>  
>  	if (snode != NULL) {
>  		if (strlist->dupstr) {
> @@ -66,6 +66,14 @@ int strlist__add(struct strlist *slist, const char *new_entry)
>  	return rblist__add_node(&slist->rblist, new_entry);
>  }
>  
> +struct str_node *strlist__add_node(struct strlist *slist, const char *new_entry)
> +{
> +	struct rb_node *nd;
> +
> +	rblist__add_node_ptr(&slist->rblist, new_entry, &nd);
> +	return container_of(nd, struct str_node, rb_node);
> +}

It should check the result of rblist__add_node_ptr() and return NULL on error.
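
A minimal sketch of the check suggested above. It does not use the real
rblist code: list_add_node_ptr() and the list types are hypothetical
stand-ins (a plain linked list) that only mimic the return contract of
rblist__add_node_ptr() in this patch. Whether -EEXIST should count as an
error is left open here; the sketch treats every negative return as one.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; the real perf types are rb-tree based. */
struct node {
	struct node *next;
};

struct str_node {
	struct node link;
	const char *s;
};

struct strlist {
	struct node *head;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Same contract as the patched helper: 0 and *nodep = new node on
 * success, -EEXIST and *nodep = existing node for a duplicate,
 * -ENOMEM and *nodep = NULL when allocation fails.
 */
static int list_add_node_ptr(struct strlist *sl, const char *s,
			     struct node **nodep)
{
	struct str_node *sn;
	struct node *n;

	*nodep = NULL;
	for (n = sl->head; n; n = n->next) {
		if (!strcmp(container_of(n, struct str_node, link)->s, s)) {
			*nodep = n;
			return -EEXIST;
		}
	}

	sn = malloc(sizeof(*sn));
	if (!sn)
		return -ENOMEM;
	sn->s = s;
	sn->link.next = sl->head;
	sl->head = &sn->link;
	*nodep = &sn->link;
	return 0;
}

/* Check the result before touching the node pointer. */
static struct str_node *strlist_add_node_checked(struct strlist *sl,
						 const char *s)
{
	struct node *nd;

	if (list_add_node_ptr(sl, s, &nd) < 0)
		return NULL;
	return container_of(nd, struct str_node, link);
}
```

With this shape the caller never applies container_of() to a NULL
pointer, which only happens to work when the embedded node is the
first member.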

> +
>  int strlist__load(struct strlist *slist, const char *filename)
>  {
>  	char entry[1024];
> @@ -165,11 +173,15 @@ struct strlist *strlist__new(const char *list, const struct strlist_config *conf
>  		bool dupstr = true;
>  		bool file_only = false;
>  		const char *dirname = NULL;
> +		size_t node_size = sizeof(struct str_node);
>  
>  		if (config) {
>  			dupstr = !config->dont_dupstr;
>  			dirname = config->dirname;
>  			file_only = config->file_only;
> +			node_size = config->node_size;
> +			if (!node_size)
> +				node_size = sizeof(struct str_node);

This is dangerous: node_size can be smaller than sizeof(struct str_node).
Instead, why don't we use a "data_size" field that gives only the
additional data size?

Also, we can add "u8 data[0]" at the end of struct str_node, so that
the user can easily access the appended data.
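
A rough sketch of the "data_size" idea, under the assumption that the
payload sits directly behind the node. The struct below is a simplified
stand-in for perf's struct str_node (the real one embeds an rb_node),
str_node_new() is a hypothetical helper, and the C99 flexible array
member is used in place of "u8 data[0]".

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for perf's struct str_node. */
struct str_node {
	const char *s;
	unsigned char data[];	/* caller-owned payload */
};

/*
 * data_size counts only the extra bytes, so the allocation can never
 * be smaller than the node header itself.
 */
static struct str_node *str_node_new(const char *s, size_t data_size)
{
	struct str_node *snode = calloc(1, sizeof(*snode) + data_size);

	if (snode)
		snode->s = s;
	return snode;
}
```

A caller that wants to attach, say, a register number would pass
sizeof(int) as data_size and copy the value into snode->data.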

>  		}
>  
>  		rblist__init(&slist->rblist);
> @@ -179,6 +191,7 @@ struct strlist *strlist__new(const char *list, const struct strlist_config *conf
>  
>  		slist->dupstr	 = dupstr;
>  		slist->file_only = file_only;
> +		slist->node_size = node_size;
>  
>  		if (list && strlist__parse_list(slist, list, dirname) != 0)
>  			goto out_error;
> diff --git a/tools/perf/util/strlist.h b/tools/perf/util/strlist.h
> index d58f1e08b170..fd407e11e124 100644
> --- a/tools/perf/util/strlist.h
> +++ b/tools/perf/util/strlist.h
> @@ -16,25 +16,33 @@ struct strlist {
>  	struct rblist rblist;
>  	bool	      dupstr;
>  	bool	      file_only;
> +	size_t	      node_size;
>  };
>  
>  /*
>   * @file_only: When dirname is present, only consider entries as filenames,
>   *             that should not be added to the list if dirname/entry is not
>   *             found
> + * @node_size: Allocate extra space after str_node which can be used for other
> + *	       data. This is the complete size including str_node
>   */
>  struct strlist_config {
>  	bool dont_dupstr;
>  	bool file_only;
>  	const char *dirname;
> +	size_t node_size;
>  };
>  
> +#define STRLIST_CONFIG_DEFAULT \
> +	{ false, false, NULL, sizeof(struct str_node) }
> +
>  struct strlist *strlist__new(const char *slist, const struct strlist_config *config);
>  void strlist__delete(struct strlist *slist);
>  
>  void strlist__remove(struct strlist *slist, struct str_node *sn);
>  int strlist__load(struct strlist *slist, const char *filename);
>  int strlist__add(struct strlist *slist, const char *str);
> +struct str_node *strlist__add_node(struct strlist *slist, const char *str);

Hmm, I see this is much more efficient, but from a programming point of view,
we could also combine strlist__add() and strlist__find() for that purpose.

BTW, what happens if the same string is already on the strlist?
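
A rough sketch of the add-then-find combination mentioned above. The
demo_* types and helpers are hypothetical stand-ins (a plain linked list
instead of perf's rb-tree); only the calling pattern is the point. In
this sketch a duplicate add returns -EEXIST and the following find
returns the node that is already there, so the caller still gets a
usable node.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for perf's strlist and str_node. */
struct demo_node {
	struct demo_node *next;
	const char *s;
	int payload;		/* extra per-string data */
};

struct demo_list {
	struct demo_node *head;
};

static struct demo_node *demo_find(struct demo_list *l, const char *s)
{
	struct demo_node *n;

	for (n = l->head; n; n = n->next)
		if (!strcmp(n->s, s))
			return n;
	return NULL;
}

/* 0 on success, -EEXIST for a duplicate, -ENOMEM on allocation failure */
static int demo_add(struct demo_list *l, const char *s)
{
	struct demo_node *n;

	if (demo_find(l, s))
		return -EEXIST;
	n = calloc(1, sizeof(*n));
	if (!n)
		return -ENOMEM;
	n->s = s;
	n->next = l->head;
	l->head = n;
	return 0;
}

/* Add, then find: two lookups, but no new interface is needed. */
static struct demo_node *demo_add_and_get(struct demo_list *l, const char *s)
{
	int ret = demo_add(l, s);

	if (ret < 0 && ret != -EEXIST)
		return NULL;
	return demo_find(l, s);
}
```

The cost is a second lookup per insertion, which is the efficiency
trade-off noted above.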

Thank you,


>  
>  struct str_node *strlist__entry(const struct strlist *slist, unsigned int idx);
>  struct str_node *strlist__find(struct strlist *slist, const char *entry);
> -- 
> 2.13.6
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

* Re: [PATCH 05/12] perf, tools, probe: Print location for resolved variables
  2017-11-28  0:23 ` [PATCH 05/12] perf, tools, probe: Print location for resolved variables Andi Kleen
@ 2017-11-29  1:19   ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-11-29  1:19 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

On Mon, 27 Nov 2017 16:23:14 -0800
Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> Print the location, e.g. the register, for resolved variables
> with perf probe -V. This is useful for debugging, and manually
> making sense of disassembly. I also have some scripts
> which can make use of this information.
> 
> Before:
> 
> % perf probe  -x  ./tsrc/tstruct  -V  main+20
> Available variables at main+20
>         @<main+20>
>                 struct str*     xp
> 
> After:
> 
> % perf probe  -x  ./tsrc/tstruct  -V  main+20
> Available variables at main+20
>         @<main+20>
>                 struct str*     xp      %ax
> 

Sounds good :)
To separate it clearly from the variable name,
I would like to see it printed as below:

 % perf probe  -x  ./tsrc/tstruct  -V  main+20
 Available variables at main+20
         @<main+20>
                 struct str*     xp      // %ax


Thank you,

> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  tools/perf/util/probe-finder.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> index 0149428d453e..699f29d8a28e 100644
> --- a/tools/perf/util/probe-finder.c
> +++ b/tools/perf/util/probe-finder.c
> @@ -1369,6 +1369,7 @@ static int collect_variables_cb(Dwarf_Die *die_mem, void *data)
>  	if (tag == DW_TAG_formal_parameter ||
>  	    tag == DW_TAG_variable) {
>  		struct probe_trace_arg ta;
> +		struct probe_trace_arg_ref *ref;
>  
>  		memset(&ta, 0, sizeof(struct probe_trace_arg));
>  		ret = convert_variable_location(die_mem, af->pf.addr,
> @@ -1401,6 +1402,10 @@ static int collect_variables_cb(Dwarf_Die *die_mem, void *data)
>  							die_mem, &buf);
>  			}
>  
> +			strbuf_addf(&buf, "\t%s", ta.value);
> +			for (ref = ta.ref; ref; ref = ref->next)
> +				strbuf_addf(&buf, " off %ld", ref->offset);
> +
>  			pr_debug("Add new var: %s\n", buf.buf);
>  			if (ret2 == 0) {
>  				struct str_node *sn;
> -- 
> 2.13.6
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

* Re: [PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open
  2017-11-28  0:23 ` [PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open Andi Kleen
@ 2017-11-29  3:14   ` Masami Hiramatsu
  2017-11-29  3:39     ` Andi Kleen
  0 siblings, 1 reply; 23+ messages in thread
From: Masami Hiramatsu @ 2017-11-29  3:14 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

On Mon, 27 Nov 2017 16:23:15 -0800
Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> Add a extra quiet argument to the debug info open / probe finder
> code that allows perf script to make them quieter. Otherwise
> we may end up with too many error messages when lots of
> instructions fail debug info parsing.

IMHO, this kind of simple warning suppression would be better
done in the pr_* implementation (or its macros), since new messages
may be added, or other functions that print warnings may be called,
in the future.

Thank you,

> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  tools/perf/util/probe-event.c  |  4 ++--
>  tools/perf/util/probe-finder.c | 19 ++++++++++++-------
>  tools/perf/util/probe-finder.h |  5 ++++-
>  3 files changed, 18 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index b7aaf9b2294d..2f9469e862fb 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -1071,12 +1071,12 @@ static int show_available_vars_at(struct debuginfo *dinfo,
>  		return -EINVAL;
>  	pr_debug("Searching variables at %s\n", buf);
>  
> -	ret = debuginfo__find_available_vars_at(dinfo, pev, &vls);
> +	ret = debuginfo__find_available_vars_at(dinfo, pev, &vls, false);
>  	if (!ret) {  /* Not found, retry with an alternative */
>  		ret = get_alternative_probe_event(dinfo, pev, &tmp);
>  		if (!ret) {
>  			ret = debuginfo__find_available_vars_at(dinfo, pev,
> -								&vls);
> +								&vls, false);
>  			/* Release the old probe_point */
>  			clear_perf_probe_point(&tmp);
>  		}
> diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> index 699f29d8a28e..137b2fe71838 100644
> --- a/tools/perf/util/probe-finder.c
> +++ b/tools/perf/util/probe-finder.c
> @@ -685,12 +685,14 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct probe_finder *pf)
>  	if (!die_is_func_def(sc_die)) {
>  		if (!die_find_realfunc(&pf->cu_die, pf->addr, &pf->sp_die)) {
>  			if (die_find_tailfunc(&pf->cu_die, pf->addr, &pf->sp_die)) {
> -				pr_warning("Ignoring tail call from %s\n",
> +				if (!pf->quiet)
> +					pr_warning("Ignoring tail call from %s\n",
>  						dwarf_diename(&pf->sp_die));
>  				return 0;
>  			} else {
> -				pr_warning("Failed to find probe point in any "
> -					   "functions.\n");
> +				if (!pf->quiet)
> +					pr_warning("Failed to find probe point in any "
> +						   "functions.\n");
>  				return -ENOENT;
>  			}
>  		}
> @@ -708,8 +710,9 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct probe_finder *pf)
>  		if ((dwarf_cfi_addrframe(pf->cfi_eh, pf->addr, &frame) != 0 &&
>  		     (dwarf_cfi_addrframe(pf->cfi_dbg, pf->addr, &frame) != 0)) ||
>  		    dwarf_frame_cfa(frame, &pf->fb_ops, &nops) != 0) {
> -			pr_warning("Failed to get call frame on 0x%jx\n",
> -				   (uintmax_t)pf->addr);
> +			if (!pf->quiet)
> +				pr_warning("Failed to get call frame on 0x%jx\n",
> +					   (uintmax_t)pf->addr);
>  			free(frame);
>  			return -ENOENT;
>  		}
> @@ -1499,10 +1502,12 @@ static int add_available_vars(Dwarf_Die *sc_die, struct probe_finder *pf)
>   */
>  int debuginfo__find_available_vars_at(struct debuginfo *dbg,
>  				      struct perf_probe_event *pev,
> -				      struct variable_list **vls)
> +				      struct variable_list **vls,
> +				      bool be_quiet)
>  {
>  	struct available_var_finder af = {
> -			.pf = {.pev = pev, .callback = add_available_vars},
> +			.pf = {.pev = pev, .callback = add_available_vars,
> +			       .quiet = be_quiet},
>  			.mod = dbg->mod,
>  			.max_vls = probe_conf.max_probes};
>  	int ret;
> diff --git a/tools/perf/util/probe-finder.h b/tools/perf/util/probe-finder.h
> index 6368e95a5d16..abcb2262ea72 100644
> --- a/tools/perf/util/probe-finder.h
> +++ b/tools/perf/util/probe-finder.h
> @@ -57,7 +57,8 @@ int debuginfo__find_line_range(struct debuginfo *dbg, struct line_range *lr);
>  /* Find available variables */
>  int debuginfo__find_available_vars_at(struct debuginfo *dbg,
>  				      struct perf_probe_event *pev,
> -				      struct variable_list **vls);
> +				      struct variable_list **vls,
> +				      bool quiet);
>  
>  /* Find a src file from a DWARF tag path */
>  int get_real_path(const char *raw_path, const char *comp_dir,
> @@ -88,6 +89,8 @@ struct probe_finder {
>  	unsigned int		machine;	/* Target machine arch */
>  	struct perf_probe_arg	*pvar;		/* Current target variable */
>  	struct probe_trace_arg	*tvar;		/* Current result variable */
> +
> +	bool			quiet;		/* Avoid warnings */
>  };
>  
>  struct trace_event_finder {
> -- 
> 2.13.6
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

* Re: [PATCH 08/12] perf, tools: Always print probe finder warnings with -v
  2017-11-28  0:23 ` [PATCH 08/12] perf, tools: Always print probe finder warnings with -v Andi Kleen
@ 2017-11-29  3:16   ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-11-29  3:16 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

On Mon, 27 Nov 2017 16:23:17 -0800
Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> Normally perf script debug info resolution doesn't print
> warnings, but allow -v to override that.  Useful for finding out why
> things don't work.

This must be done at the call site, since we clearly don't need it
in the case below.

----
/* Try to find perf_probe_event with debuginfo */
static int try_to_find_probe_trace_events(struct perf_probe_event *pev,
                                          struct probe_trace_event **tevs)
{
        bool need_dwarf = perf_probe_event_need_dwarf(pev);
        struct perf_probe_point tmp;
        struct debuginfo *dinfo;
        int ntevs, ret = 0;

        dinfo = open_debuginfo(pev->target, pev->nsi, !need_dwarf);
----

Thanks,

> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  tools/perf/util/probe-event.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index 4ef6ee967468..fb5031ac24a2 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -482,7 +482,7 @@ static struct debuginfo *open_debuginfo(const char *module, struct nsinfo *nsi,
>  					strcpy(reason, "(unknown)");
>  			} else
>  				dso__strerror_load(dso, reason, STRERR_BUFSIZE);
> -			if (!silent)
> +			if (!silent || verbose)
>  				pr_err("Failed to find the path for %s: %s\n",
>  					module ?: "kernel", reason);
>  			return NULL;
> @@ -491,7 +491,7 @@ static struct debuginfo *open_debuginfo(const char *module, struct nsinfo *nsi,
>  	}
>  	nsinfo__mountns_enter(nsi, &nsc);
>  	ret = debuginfo__new(path);
> -	if (!ret && !silent) {
> +	if (!ret && (!silent || verbose)) {
>  		pr_warning("The %s file has no debug information.\n", path);
>  		if (!module || !strtailcmp(path, ".ko"))
>  			pr_warning("Rebuild with CONFIG_DEBUG_INFO=y, ");
> -- 
> 2.13.6
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

* Re: [PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open
  2017-11-29  3:14   ` Masami Hiramatsu
@ 2017-11-29  3:39     ` Andi Kleen
  2017-11-30  2:36       ` Masami Hiramatsu
  0 siblings, 1 reply; 23+ messages in thread
From: Andi Kleen @ 2017-11-29  3:39 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Andi Kleen, acme, jolsa, adrian.hunter, linux-kernel, Andi Kleen

On Wed, Nov 29, 2017 at 12:14:00PM +0900, Masami Hiramatsu wrote:
> On Mon, 27 Nov 2017 16:23:15 -0800
> Andi Kleen <andi@firstfloor.org> wrote:
> 
> > From: Andi Kleen <ak@linux.intel.com>
> > 
> > Add a extra quiet argument to the debug info open / probe finder
> > code that allows perf script to make them quieter. Otherwise
> > we may end up with too many error messages when lots of
> > instructions fail debug info parsing.
> 
> IMHO, this kind of simple suppress warning message would better
> be done with pr_* implementation (or its macro) since there
> maybe new message added or other function which has warning
> messages can be called in the future.

Do you really mean adding a special pr_ / global just for this case?

Seems less clean to me, but I can do it.

-Andi

* Re: [PATCH 09/12] perf, tools: Downgrade register mapping message to warning
  2017-11-28  0:23 ` [PATCH 09/12] perf, tools: Downgrade register mapping message to warning Andi Kleen
@ 2017-11-29  5:56   ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-11-29  5:56 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

On Mon, 27 Nov 2017 16:23:18 -0800
Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> When tracing floating point code it's quite possible that perf
> doesn't recognize the register number. Downgrade the warning
> for unknown registers to a debug message.

Hmm, but without this message, the user will just see an ENOTSUP error.
I'm considering introducing storage for an error string
in probe-finder so that the user can choose whether to show it.
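
One possible shape for such an error-string store, purely as a sketch:
struct finder_err, map_register() and the register limit of 16 are all
hypothetical and not part of probe-finder. The finder records the
message instead of printing it, and the caller decides what to do with
it.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical per-finder error storage. */
struct finder_err {
	char msg[128];
};

/*
 * Stand-in for the register lookup: numbers above 16 are treated as
 * unmapped. On failure the reason is stored, not printed.
 */
static int map_register(unsigned int regn, struct finder_err *err)
{
	if (regn > 16) {
		snprintf(err->msg, sizeof(err->msg),
			 "Mapping for the register number %u missing on this architecture.",
			 regn);
		return -ENOTSUP;
	}
	err->msg[0] = '\0';
	return 0;
}
```

A perf probe caller could then print err.msg as a warning, while perf
script could drop it or show it only with -v.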

Thank you,

> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  tools/perf/util/probe-finder.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> index 137b2fe71838..5fe6466254f9 100644
> --- a/tools/perf/util/probe-finder.c
> +++ b/tools/perf/util/probe-finder.c
> @@ -272,8 +272,8 @@ static int convert_variable_location(Dwarf_Die *vr_die, Dwarf_Addr addr,
>  
>  	regs = get_dwarf_regstr(regn, machine);
>  	if (!regs) {
> -		/* This should be a bug in DWARF or this tool */
> -		pr_warning("Mapping for the register number %u "
> +		/* This can happen with floating point */
> +		pr_debug("Mapping for the register number %u "
>  			   "missing on this architecture.\n", regn);
>  		return -ENOTSUP;
>  	}
> -- 
> 2.13.6
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

* Re: [PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open
  2017-11-29  3:39     ` Andi Kleen
@ 2017-11-30  2:36       ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-11-30  2:36 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, adrian.hunter, linux-kernel, Andi Kleen

On Tue, 28 Nov 2017 19:39:43 -0800
Andi Kleen <andi@firstfloor.org> wrote:

> On Wed, Nov 29, 2017 at 12:14:00PM +0900, Masami Hiramatsu wrote:
> > On Mon, 27 Nov 2017 16:23:15 -0800
> > Andi Kleen <andi@firstfloor.org> wrote:
> > 
> > > From: Andi Kleen <ak@linux.intel.com>
> > > 
> > > Add a extra quiet argument to the debug info open / probe finder
> > > code that allows perf script to make them quieter. Otherwise
> > > we may end up with too many error messages when lots of
> > > instructions fail debug info parsing.
> > 
> > IMHO, this kind of simple suppress warning message would better
> > be done with pr_* implementation (or its macro) since there
> > maybe new message added or other function which has warning
> > messages can be called in the future.
> 
> Do you really mean adding a special pr_ / global just for this case?

I thought that we could control it by changing the "verbose" global variable.
Something like inc_verbose() & dec_verbose() macros would help us
suppress warning/debug messages temporarily.
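
A minimal sketch of how such macros might look. It assumes that
warnings are printed only while the verbosity level is non-negative;
warn_msg() and resolve_quietly() are hypothetical stand-ins, not the
real pr_warning() machinery.

```c
#include <stdarg.h>
#include <stdio.h>

static int verbose;	/* stand-in for perf's global verbosity level */

#define dec_verbose()	(verbose--)
#define inc_verbose()	(verbose++)

/* Hypothetical warning helper: silent while verbose is negative. */
static int warn_msg(const char *fmt, ...)
{
	va_list ap;
	int n;

	if (verbose < 0)
		return 0;
	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);
	return n;
}

/* Lower the level around a noisy call, then restore it. */
static int resolve_quietly(void)
{
	int n;

	dec_verbose();
	n = warn_msg("Failed to get call frame on 0x%x\n", 0x400522);
	inc_verbose();
	return n;
}
```

Because the level is adjusted relative to its current value, a user
who passed -v would still see the warnings inside the quiet section.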

Thank you,

> 
> Seems less clean to me, but I can do it.
> 
> -Andi


-- 
Masami Hiramatsu <mhiramat@kernel.org>

* Re: [PATCH 11/12] perf, tools: Print probe warnings for binaries only once per binary
  2017-11-28  0:23 ` [PATCH 11/12] perf, tools: Print probe warnings for binaries only once per binary Andi Kleen
@ 2017-11-30  2:38   ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-11-30  2:38 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

On Mon, 27 Nov 2017 16:23:20 -0800
Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> When the perf probe code is called from perf script we may end up
> with a flood of bad binary errors with -v. Only print the error message
> once in this case.

Indeed, but this looks like a hack. You may need to store the path in
a blacklist and skip it next time.
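
A rough sketch of such a blacklist, using a hypothetical growable array
of path copies (perf's own strlist would be a natural fit, but is not
used here to keep the example self-contained). Unlike the single static
buffer in the patch, which remembers only the last path, this keeps
every path that has already been reported.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical set of paths that already produced a warning. */
struct path_blacklist {
	char **paths;
	size_t nr;
};

static bool blacklist_has(const struct path_blacklist *bl, const char *path)
{
	size_t i;

	for (i = 0; i < bl->nr; i++)
		if (!strcmp(bl->paths[i], path))
			return true;
	return false;
}

static int blacklist_add(struct path_blacklist *bl, const char *path)
{
	size_t len = strlen(path) + 1;
	char **paths = realloc(bl->paths, (bl->nr + 1) * sizeof(*paths));
	char *copy;

	if (!paths)
		return -1;
	bl->paths = paths;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, path, len);
	bl->paths[bl->nr++] = copy;
	return 0;
}

/* True only the first time a given path is seen. */
static bool should_warn_once(struct path_blacklist *bl, const char *path)
{
	if (blacklist_has(bl, path))
		return false;
	blacklist_add(bl, path);	/* on failure we just warn again later */
	return true;
}
```

With samples alternating between two binaries without debug info, each
binary is reported once instead of on every switch.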

Thank you,

> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
>  tools/perf/util/probe-event.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index fb5031ac24a2..85fbeeb364bf 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -492,12 +492,16 @@ static struct debuginfo *open_debuginfo(const char *module, struct nsinfo *nsi,
>  	nsinfo__mountns_enter(nsi, &nsc);
>  	ret = debuginfo__new(path);
>  	if (!ret && (!silent || verbose)) {
> -		pr_warning("The %s file has no debug information.\n", path);
> -		if (!module || !strtailcmp(path, ".ko"))
> -			pr_warning("Rebuild with CONFIG_DEBUG_INFO=y, ");
> -		else
> -			pr_warning("Rebuild with -g, ");
> -		pr_warning("or install an appropriate debuginfo package.\n");
> +		static char printed[1024];
> +		if (strcmp(path, printed)) {
> +			snprintf(printed, sizeof printed, "%s", path);
> +			pr_warning("The %s file has no debug information.\n", path);
> +			if (!module || !strtailcmp(path, ".ko"))
> +				pr_warning("Rebuild with CONFIG_DEBUG_INFO=y, ");
> +			else
> +				pr_warning("Rebuild with -g, ");
> +			pr_warning("or install an appropriate debuginfo package.\n");
> +		}
>  	}
>  	nsinfo__mountns_exit(&nsc);
>  	return ret;
> -- 
> 2.13.6
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

* Re: [PATCH 12/12] perf, tools, script: Implement dwarf resolving of instructions
  2017-11-28  0:23 ` [PATCH 12/12] perf, tools, script: Implement dwarf resolving of instructions Andi Kleen
@ 2017-12-01  2:36   ` Masami Hiramatsu
  0 siblings, 0 replies; 23+ messages in thread
From: Masami Hiramatsu @ 2017-12-01  2:36 UTC (permalink / raw)
  To: Andi Kleen; +Cc: acme, jolsa, mhiramat, adrian.hunter, linux-kernel, Andi Kleen

On Mon, 27 Nov 2017 16:23:21 -0800
Andi Kleen <andi@firstfloor.org> wrote:

> From: Andi Kleen <ak@linux.intel.com>
> 
> Implement resolving arguments of instructions to dwarf variable names.
> 
> When we sample an instruction, decode the instruction and try to
> symbolize the register or destination it is using. Also print the type.
> It builds on the perf probe debugging information reverse lookup
> infrastructure added earlier.
> 
> The dwarf decoding magic is all done using Masami Hiramatsu's perf probe code.
> 
> This is useful for
> 
> - The PTWRITE instruction: when the compiler generates debugging information
> for PTWRITE arguments.  The value logged by PTWRITE is available to the
> PT decoder, so it can print the value.
> 
> - It also works for other samples with an IP, so it's possible to follow
> their memory access patterns (but not the values)
> 
> For the sample we use the instruction decoder to decode the instruction
> at the sample point, and then map the arguments to dwarf information.
> 
> For structure reference we only print the numeric offset, but do not
> resolve the field name.
> 
> Absolute memory references are not supported

Hmm, I think perf-probe also has the same limitation.

> 
> It doesn't distinguish SSE (but AVX) registers from GPRs
> (this would require extending the instruction decoder to detect SSE
> instructions)
> 
> Example:
> 
> From perf itself
> 
> % perf record -e intel_pt//u  -a sleep 1
> % perf script --itrace=i0ns -F insnvar,insn,ip,sym  -f 2>&1 | xed -F insn: -A -64 | less
> ...
>            4f7e61 xyarray__max_y                pushq  %rbp
>            4f7e62 xyarray__max_y                mov %rsp, %rbp
>            4f7e65 xyarray__max_y                sub $0x20, %rsp
>            4f7e69 xyarray__max_y                movq  %rdi, -0x18(%rbp) { -24(xy), struct xyarray* }
>            4f7e6d xyarray__max_y                movq  %fs:0x28, %rax
>            4f7e76 xyarray__max_y                movq  %rax, -0x8(%rbp) { -8(xy), struct xyarray* }
>            4f7e7a xyarray__max_y                xor %eax, %eax
>            4f7e7c xyarray__max_y                movq  -0x18(%rbp), %rax { -24(xy), struct xyarray* }
>            4f7e80 xyarray__max_y                movq  0x20(%rax), %rax
>            4f7e84 xyarray__max_y                movq  -0x8(%rbp), %rdx { -8(xy), struct xyarray* }
>            4f7e88 xyarray__max_y                xorq  %fs:0x28, %rdx
>            4f7e91 xyarray__max_y                jz 0x7
>            4f7e98 xyarray__max_y                leaveq
>            4f7e99 xyarray__max_y                retq

Nice! :)

> 
> In this example we now know that this function accesses two fields in struct xyarray *
> 
> Open Issues:
> - It is fairly slow. Some caching would likely help.

OK, we can keep debuginfo open, but it may consume a lot of memory.
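
One way to bound the memory would be a small fixed-size cache with
least-recently-used eviction. This is only a sketch under assumptions:
fake_debuginfo and fake_open() stand in for struct debuginfo and
debuginfo__new(), the slot count is arbitrary, and free() stands in for
debuginfo__delete():

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DBG_CACHE_SLOTS 4	/* hypothetical bound on open handles */

/* Stand-in for struct debuginfo; the real one holds libdw state. */
struct fake_debuginfo {
	char path[256];		/* longer paths are truncated in this sketch */
};

struct cache_slot {
	struct fake_debuginfo *dbg;
	unsigned long last_used;	/* 0 means the slot was never used */
};

static struct cache_slot cache[DBG_CACHE_SLOTS];
static unsigned long cache_clock;
static unsigned int cache_opens;	/* counts real opens, for illustration */

/* Stand-in for the expensive open step. */
static struct fake_debuginfo *fake_open(const char *path)
{
	struct fake_debuginfo *dbg = calloc(1, sizeof(*dbg));

	if (dbg) {
		snprintf(dbg->path, sizeof(dbg->path), "%s", path);
		cache_opens++;
	}
	return dbg;
}

/* Return a cached handle for @path, opening and evicting as needed. */
static struct fake_debuginfo *cached_open(const char *path)
{
	struct cache_slot *victim = &cache[0];
	struct fake_debuginfo *dbg;
	int i;

	assert(path);
	for (i = 0; i < DBG_CACHE_SLOTS; i++) {
		if (cache[i].dbg && !strcmp(cache[i].dbg->path, path)) {
			cache[i].last_used = ++cache_clock;
			return cache[i].dbg;	/* hit: no reopen needed */
		}
		/* Track the oldest slot; unused slots have last_used == 0. */
		if (cache[i].last_used < victim->last_used)
			victim = &cache[i];
	}

	dbg = fake_open(path);
	if (!dbg)
		return NULL;
	free(victim->dbg);	/* release the evicted handle, if any */
	victim->dbg = dbg;
	victim->last_used = ++cache_clock;
	return dbg;
}
```

Callers must not keep a handle across later cached_open() calls in this
sketch, since eviction frees it.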

> - Frame pointer references are usually not correctly resolved,
> which are common in unoptimized code. That's usually fine
> because memory access on the stack is not very interesting.
> - It cannot resolve some references.

OK, let's find out what kinds of references cannot be resolved.

> 
> But I find it already quite useful.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
[..]
> diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
> index 85fbeeb364bf..2a65ebed0998 100644
> --- a/tools/perf/util/probe-event.c
> +++ b/tools/perf/util/probe-event.c
> @@ -78,6 +78,9 @@ int init_probe_symbol_maps(bool user_only)
>  {
>  	int ret;
>  
> +	if (host_machine)
> +		return 0;

What is this code for? Please add a comment or make it a separate patch.

Thanks!


> +
>  	symbol_conf.sort_by_name = true;
>  	symbol_conf.allow_aliases = true;
>  	ret = symbol__init(NULL);
> -- 
> 2.13.6
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

Thread overview: 23+ messages
2017-11-28  0:23 Implement dwarf variable/type resolving for perf script Andi Kleen
2017-11-28  0:23 ` [PATCH 01/12] perf, tools, pt: Clear instruction for ptwrite samples Andi Kleen
2017-11-28  0:23 ` [PATCH 02/12] perf, tools, script: Print insn/insnlen for non PT sample Andi Kleen
2017-11-28  0:23 ` [PATCH 03/12] perf, tools: Support storing additional data in strlist Andi Kleen
2017-11-28 13:31   ` Masami Hiramatsu
2017-11-28  0:23 ` [PATCH 04/12] perf, tools: Store variable name and register for dwarf variable lists Andi Kleen
2017-11-28  0:23 ` [PATCH 05/12] perf, tools, probe: Print location for resolved variables Andi Kleen
2017-11-29  1:19   ` Masami Hiramatsu
2017-11-28  0:23 ` [PATCH 06/12] perf, tools, probe: Support a quiet argument for debug info open Andi Kleen
2017-11-29  3:14   ` Masami Hiramatsu
2017-11-29  3:39     ` Andi Kleen
2017-11-30  2:36       ` Masami Hiramatsu
2017-11-28  0:23 ` [PATCH 07/12] perf, tools, script: Resolve variable names for registers Andi Kleen
2017-11-28  0:23 ` [PATCH 08/12] perf, tools: Always print probe finder warnings with -v Andi Kleen
2017-11-29  3:16   ` Masami Hiramatsu
2017-11-28  0:23 ` [PATCH 09/12] perf, tools: Downgrade register mapping message to warning Andi Kleen
2017-11-29  5:56   ` Masami Hiramatsu
2017-11-28  0:23 ` [PATCH 10/12] perf, tools: Add args and gprs shortcut for registers Andi Kleen
2017-11-28  0:23 ` [PATCH 11/12] perf, tools: Print probe warnings for binaries only once per binary Andi Kleen
2017-11-30  2:38   ` Masami Hiramatsu
2017-11-28  0:23 ` [PATCH 12/12] perf, tools, script: Implement dwarf resolving of instructions Andi Kleen
2017-12-01  2:36   ` Masami Hiramatsu
2017-11-28  5:31 ` Implement dwarf variable/type resolving for perf script Masami Hiramatsu