linux-toolchains.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHSET 0/9] perf tools: More updates on data type profiling (v4)
@ 2024-01-17  6:26 Namhyung Kim
  2024-01-17  6:26 ` [PATCH 1/9] perf annotate-data: Parse 'lock' prefix from llvm-objdump Namhyung Kim
                   ` (10 more replies)
  0 siblings, 11 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-01-17  6:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Linus Torvalds, Stephane Eranian,
	Masami Hiramatsu, linux-toolchains, linux-trace-devel,
	Ben Woodard, Joe Mario, Kees Cook, David Blaikie, Xu Liu,
	Kan Liang, Ravi Bangoria, Mark Wielaard, Jason Merrill,
	Jose E . Marchesi, William Huang

Hello,

This is a continuation of the data type profiling series.  Now the basic
part (v3) which uses pointer variables is merged to the perf-tools-next
tree.  And this part is for memory accesses without pointers as well as
small updates to handle some corner cases.  Still mores to come to
complete the original series.

There's no change from the previous version.  For background and usages,
pleaes refer the posting of previous version [1] and a LWN article [2].

Basically most memory accesses happen with pointers, but there are cases
don't use pointers - direct accesses to global and local variables.

Global variables are located in a static memory at a specific address.
So the DWARF location expression for the global vairable would also have
the static address.  And it's common to access them using PC-relative
addressing mode.  Thus it needs a special handling for global variables.

On the other hand, local variables are located in the stack which varies
as program executes.  So the local variables are accessed either by the
(stack) frame pointer or (current) stack pointer.  But sometimes DWARF
location expression uses a frame base address (CFA) to specify location
of local variables.  So it may need to convert or normalize the location
extracted from the instruction to match DWARF expression.

Lastly, there are some cases DWARF location expressions end up having
complex (or not straight-forward) location.  In that case, it cannot
simply match just the first expression with the instruction location.
It'd be safer to reject them.

The code is available at 'perf/data-profile-update-v4' branch in the tree
below.  The full version of the code is in 'perf/data-profile-v4' branch.

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Cc: Ben Woodard <woodard@redhat.com> 
Cc: Joe Mario <jmario@redhat.com>
CC: Kees Cook <keescook@chromium.org>
Cc: David Blaikie <blaikie@google.com>
Cc: Xu Liu <xliuprof@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Jason Merrill <jason@redhat.com>
Cc: Jose E. Marchesi <jose.marchesi@oracle.com>
Cc: William Huang <williamjhuang@google.com>

[1] https://lore.kernel.org/linux-perf-users/20231213001323.718046-1-namhyung@kernel.org/
[2] https://lwn.net/Articles/955709/


Namhyung Kim (9):
  perf annotate-data: Parse 'lock' prefix from llvm-objdump
  perf annotate-data: Handle macro fusion on x86
  perf annotate-data: Handle array style accesses
  perf annotate-data: Add stack operation pseudo type
  perf annotate-data: Handle PC-relative addressing
  perf annotate-data: Support global variables
  perf dwarf-aux: Add die_get_cfa()
  perf annotate-data: Support stack variables
  perf dwarf-aux: Check allowed DWARF Ops

 tools/perf/util/annotate-data.c | 119 ++++++++++++++++----
 tools/perf/util/annotate-data.h |   8 +-
 tools/perf/util/annotate.c      | 153 ++++++++++++++++++++++++--
 tools/perf/util/annotate.h      |  12 +-
 tools/perf/util/dwarf-aux.c     | 187 ++++++++++++++++++++++++++++----
 tools/perf/util/dwarf-aux.h     |  18 +++
 6 files changed, 439 insertions(+), 58 deletions(-)


base-commit: d988c9f511af71a3445b6a4f3a2c67208ff8e480
-- 
2.43.0.381.gb435a96ce8-goog


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/9] perf annotate-data: Parse 'lock' prefix from llvm-objdump
  2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
@ 2024-01-17  6:26 ` Namhyung Kim
  2024-01-17  6:26 ` [PATCH 2/9] perf annotate-data: Handle macro fusion on x86 Namhyung Kim
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-01-17  6:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Linus Torvalds, Stephane Eranian,
	Masami Hiramatsu, linux-toolchains, linux-trace-devel

For the performance reason, I prefer llvm-objdump over GNU's.  But I
found that llvm-objdump puts x86 lock prefix in a separate line like
below.

  ffffffff81000695: f0                    lock
  ffffffff81000696: ff 83 54 0b 00 00     incl    2900(%rbx)

This should be parsed properly, but I just changed to find the insn
with next offset for now.

This improves the statistics as it can process more instructions.

  Annotate data type stats:
  total 294, ok 144 (49.0%), bad 150 (51.0%)
  -----------------------------------------------------------
          30 : no_sym
          35 : no_mem_ops
          71 : no_var
           6 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 9b70ab110ce7..8d761be1a102 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3660,8 +3660,17 @@ static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip)
 	notes = symbol__annotation(sym);
 
 	list_for_each_entry(dl, &notes->src->source, al.node) {
-		if (sym->start + dl->al.offset == ip)
+		if (sym->start + dl->al.offset == ip) {
+			/*
+			 * llvm-objdump places "lock" in a separate line and
+			 * in that case, we want to get the next line.
+			 */
+			if (!strcmp(dl->ins.name, "lock") && *dl->ops.raw == '\0') {
+				ip++;
+				continue;
+			}
 			return dl;
+		}
 	}
 	return NULL;
 }
@@ -3758,6 +3767,9 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		if (!op_loc->mem_ref)
 			continue;
 
+		/* Recalculate IP since it can be changed due to LOCK prefix */
+		ip = ms->sym->start + dl->al.offset;
+
 		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
 		if (mem_type)
 			istat->good++;
-- 
2.43.0.381.gb435a96ce8-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/9] perf annotate-data: Handle macro fusion on x86
  2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
  2024-01-17  6:26 ` [PATCH 1/9] perf annotate-data: Parse 'lock' prefix from llvm-objdump Namhyung Kim
@ 2024-01-17  6:26 ` Namhyung Kim
  2024-01-17  6:26 ` [PATCH 3/9] perf annotate-data: Handle array style accesses Namhyung Kim
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-01-17  6:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Linus Torvalds, Stephane Eranian,
	Masami Hiramatsu, linux-toolchains, linux-trace-devel

When a sample was come from a conditional branch without a memory
operand, it could be due to a macro fusion with a previous instruction.
So it needs to check the memory operand in the previous one.

This improves the stat like below:

  Annotate data type stats:
  total 294, ok 147 (50.0%), bad 147 (50.0%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          71 : no_var
           6 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 8d761be1a102..0ec42e85ca5c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3751,6 +3751,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		return NULL;
 	}
 
+retry:
 	istat = annotate_data_stat(&ann_insn_stat, dl->ins.name);
 	if (istat == NULL) {
 		ann_data_stat.no_insn++;
@@ -3767,7 +3768,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		if (!op_loc->mem_ref)
 			continue;
 
-		/* Recalculate IP since it can be changed due to LOCK prefix */
+		/* Recalculate IP because of LOCK prefix or insn fusion */
 		ip = ms->sym->start + dl->al.offset;
 
 		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
@@ -3786,6 +3787,20 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		return mem_type;
 	}
 
+	/*
+	 * Some instructions can be fused and the actual memory access came
+	 * from the previous instruction.
+	 */
+	if (dl->al.offset > 0) {
+		struct disasm_line *prev_dl;
+
+		prev_dl = list_prev_entry(dl, al.node);
+		if (ins__is_fused(arch, prev_dl->ins.name, dl->ins.name)) {
+			dl = prev_dl;
+			goto retry;
+		}
+	}
+
 	ann_data_stat.no_mem_ops++;
 	istat->bad++;
 	return NULL;
-- 
2.43.0.381.gb435a96ce8-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/9] perf annotate-data: Handle array style accesses
  2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
  2024-01-17  6:26 ` [PATCH 1/9] perf annotate-data: Parse 'lock' prefix from llvm-objdump Namhyung Kim
  2024-01-17  6:26 ` [PATCH 2/9] perf annotate-data: Handle macro fusion on x86 Namhyung Kim
@ 2024-01-17  6:26 ` Namhyung Kim
  2024-01-17  6:26 ` [PATCH 4/9] perf annotate-data: Add stack operation pseudo type Namhyung Kim
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-01-17  6:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Linus Torvalds, Stephane Eranian,
	Masami Hiramatsu, linux-toolchains, linux-trace-devel

On x86, instructions for array access often looks like below.

  mov  0x1234(%rax,%rbx,8), %rcx

Usually the first register holds the type information and the second one
has the index.  And the current code only looks up a variable for the
first register.  But it's possible to be in the other way around so it
needs to check the second register if the first one failed.

The stat changed like this.

  Annotate data type stats:
  total 294, ok 148 (50.3%), bad 146 (49.7%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          66 : no_var
          10 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 24 +++++++++++++-----
 tools/perf/util/annotate-data.h |  5 ++--
 tools/perf/util/annotate.c      | 43 ++++++++++++++++++++++++++-------
 tools/perf/util/annotate.h      |  8 ++++--
 4 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index f22b4f18271c..d1ceac7e2441 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -9,6 +9,7 @@
 #include <stdlib.h>
 #include <inttypes.h>
 
+#include "annotate.h"
 #include "annotate-data.h"
 #include "debuginfo.h"
 #include "debug.h"
@@ -207,7 +208,8 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 	 * It expects a pointer type for a memory access.
 	 * Convert to a real type it points to.
 	 */
-	if (dwarf_tag(type_die) != DW_TAG_pointer_type ||
+	if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
+	     dwarf_tag(type_die) != DW_TAG_array_type) ||
 	    die_get_real_type(type_die, type_die) == NULL) {
 		pr_debug("no pointer or no type\n");
 		ann_data_stat.no_typeinfo++;
@@ -233,10 +235,11 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 
 /* The result will be saved in @type_die */
 static int find_data_type_die(struct debuginfo *di, u64 pc,
-			      int reg, int offset, Dwarf_Die *type_die)
+			      struct annotated_op_loc *loc, Dwarf_Die *type_die)
 {
 	Dwarf_Die cu_die, var_die;
 	Dwarf_Die *scopes = NULL;
+	int reg, offset;
 	int ret = -1;
 	int i, nr_scopes;
 
@@ -250,6 +253,10 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
 	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
 
+	reg = loc->reg1;
+	offset = loc->offset;
+
+retry:
 	/* Search from the inner-most scope to the outer */
 	for (i = nr_scopes - 1; i >= 0; i--) {
 		/* Look up variables/parameters in this scope */
@@ -260,6 +267,12 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 		ret = check_variable(&var_die, type_die, offset);
 		goto out;
 	}
+
+	if (loc->multi_regs && reg == loc->reg1 && loc->reg1 != loc->reg2) {
+		reg = loc->reg2;
+		goto retry;
+	}
+
 	if (ret < 0)
 		ann_data_stat.no_var++;
 
@@ -272,15 +285,14 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
  * find_data_type - Return a data type at the location
  * @ms: map and symbol at the location
  * @ip: instruction address of the memory access
- * @reg: register that holds the base address
- * @offset: offset from the base address
+ * @loc: instruction operand location
  *
  * This functions searches the debug information of the binary to get the data
  * type it accesses.  The exact location is expressed by (ip, reg, offset).
  * It return %NULL if not found.
  */
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   int reg, int offset)
+					   struct annotated_op_loc *loc)
 {
 	struct annotated_data_type *result = NULL;
 	struct dso *dso = map__dso(ms->map);
@@ -300,7 +312,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	 * a file address for DWARF processing.
 	 */
 	pc = map__rip_2objdump(ms->map, ip);
-	if (find_data_type_die(di, pc, reg, offset, &type_die) < 0)
+	if (find_data_type_die(di, pc, loc, &type_die) < 0)
 		goto out;
 
 	result = dso__findnew_data_type(dso, &type_die);
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 8e73096c01d1..65ddd839850f 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -7,6 +7,7 @@
 #include <linux/rbtree.h>
 #include <linux/types.h>
 
+struct annotated_op_loc;
 struct evsel;
 struct map_symbol;
 
@@ -105,7 +106,7 @@ extern struct annotated_data_stat ann_data_stat;
 
 /* Returns data type at the location (ip, reg, offset) */
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   int reg, int offset);
+					   struct annotated_op_loc *loc);
 
 /* Update type access histogram at the given offset */
 int annotated_data_type__update_samples(struct annotated_data_type *adt,
@@ -119,7 +120,7 @@ void annotated_data_type__tree_delete(struct rb_root *root);
 
 static inline struct annotated_data_type *
 find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
-	       int reg __maybe_unused, int offset __maybe_unused)
+	       struct annotated_op_loc *loc __maybe_unused)
 {
 	return NULL;
 }
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 0ec42e85ca5c..3cdcdd449c86 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3563,8 +3563,22 @@ static int extract_reg_offset(struct arch *arch, const char *str,
 	if (regname == NULL)
 		return -1;
 
-	op_loc->reg = get_dwarf_regnum(regname, 0);
+	op_loc->reg1 = get_dwarf_regnum(regname, 0);
 	free(regname);
+
+	/* Get the second register */
+	if (op_loc->multi_regs) {
+		p = strchr(p + 1, arch->objdump.register_char);
+		if (p == NULL)
+			return -1;
+
+		regname = strdup(p);
+		if (regname == NULL)
+			return -1;
+
+		op_loc->reg2 = get_dwarf_regnum(regname, 0);
+		free(regname);
+	}
 	return 0;
 }
 
@@ -3577,14 +3591,20 @@ static int extract_reg_offset(struct arch *arch, const char *str,
  * Get detailed location info (register and offset) in the instruction.
  * It needs both source and target operand and whether it accesses a
  * memory location.  The offset field is meaningful only when the
- * corresponding mem flag is set.
+ * corresponding mem flag is set.  The reg2 field is meaningful only
+ * when multi_regs flag is set.
  *
  * Some examples on x86:
  *
- *   mov  (%rax), %rcx   # src_reg = rax, src_mem = 1, src_offset = 0
- *                       # dst_reg = rcx, dst_mem = 0
+ *   mov  (%rax), %rcx   # src_reg1 = rax, src_mem = 1, src_offset = 0
+ *                       # dst_reg1 = rcx, dst_mem = 0
  *
- *   mov  0x18, %r8      # src_reg = -1, dst_reg = r8
+ *   mov  0x18, %r8      # src_reg1 = -1, src_mem = 0
+ *                       # dst_reg1 = r8, dst_mem = 0
+ *
+ *   mov  %rsi, 8(%rbx,%rcx,4)  # src_reg1 = rsi, src_mem = 0, dst_multi_regs = 0
+ *                              # dst_reg1 = rbx, dst_reg2 = rcx, dst_mem = 1
+ *                              # dst_multi_regs = 1, dst_offset = 8
  */
 int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 			       struct annotated_insn_loc *loc)
@@ -3605,24 +3625,29 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 
 	for_each_insn_op_loc(loc, i, op_loc) {
 		const char *insn_str = ops->source.raw;
+		bool multi_regs = ops->source.multi_regs;
 
-		if (i == INSN_OP_TARGET)
+		if (i == INSN_OP_TARGET) {
 			insn_str = ops->target.raw;
+			multi_regs = ops->target.multi_regs;
+		}
 
 		/* Invalidate the register by default */
-		op_loc->reg = -1;
+		op_loc->reg1 = -1;
+		op_loc->reg2 = -1;
 
 		if (insn_str == NULL)
 			continue;
 
 		if (strchr(insn_str, arch->objdump.memory_ref_char)) {
 			op_loc->mem_ref = true;
+			op_loc->multi_regs = multi_regs;
 			extract_reg_offset(arch, insn_str, op_loc);
 		} else {
 			char *s = strdup(insn_str);
 
 			if (s) {
-				op_loc->reg = get_dwarf_regnum(s, 0);
+				op_loc->reg1 = get_dwarf_regnum(s, 0);
 				free(s);
 			}
 		}
@@ -3771,7 +3796,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		/* Recalculate IP because of LOCK prefix or insn fusion */
 		ip = ms->sym->start + dl->al.offset;
 
-		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
+		mem_type = find_data_type(ms, ip, op_loc);
 		if (mem_type)
 			istat->good++;
 		else
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index dba50762c6e8..d0ff677b460c 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -442,14 +442,18 @@ int annotate_check_args(void);
 
 /**
  * struct annotated_op_loc - Location info of instruction operand
- * @reg: Register in the operand
+ * @reg1: First register in the operand
+ * @reg2: Second register in the operand
  * @offset: Memory access offset in the operand
  * @mem_ref: Whether the operand accesses memory
+ * @multi_regs: Whether the second register is used
  */
 struct annotated_op_loc {
-	int reg;
+	int reg1;
+	int reg2;
 	int offset;
 	bool mem_ref;
+	bool multi_regs;
 };
 
 enum annotated_insn_ops {
-- 
2.43.0.381.gb435a96ce8-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/9] perf annotate-data: Add stack operation pseudo type
  2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
                   ` (2 preceding siblings ...)
  2024-01-17  6:26 ` [PATCH 3/9] perf annotate-data: Handle array style accesses Namhyung Kim
@ 2024-01-17  6:26 ` Namhyung Kim
  2024-01-17  6:26 ` [PATCH 5/9] perf annotate-data: Handle PC-relative addressing Namhyung Kim
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-01-17  6:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Linus Torvalds, Stephane Eranian,
	Masami Hiramatsu, linux-toolchains, linux-trace-devel

A typical function prologue and epilogue include multiple stack
operations to save and restore the current value of registers.
On x86, it looks like below:

  push  r15
  push  r14
  push  r13
  push  r12

  ...

  pop   r12
  pop   r13
  pop   r14
  pop   r15
  ret

As these all touches the stack memory region, chances are high that they
appear in a memory profile data.  But these are not used for any real
purpose yet so it'd return no types.

One of my profile type shows that non neglible portion of data came from
the stack operations.  It also seems GCC generates more stack operations
than clang.

Annotate Instruction stats
total 264, ok 169 (64.0%), bad 95 (36.0%)

    Name      :  Good   Bad
  -----------------------------------------------------------
    movq      :    49    27
    movl      :    24     9
    popq      :     0    19   <-- here
    cmpl      :    17     2
    addq      :    14     1
    cmpq      :    12     2
    cmpxchgl  :     3     7

Instead of dealing them as unknown, let's create a seperate pseudo type
to represent those stack operations separately.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.h |  1 +
 tools/perf/util/annotate.c      | 26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 65ddd839850f..214c625e7bc9 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -70,6 +70,7 @@ struct annotated_data_type {
 };
 
 extern struct annotated_data_type unknown_type;
+extern struct annotated_data_type stackop_type;
 
 /**
  * struct annotated_data_stat - Debug statistics
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 3cdcdd449c86..655bd9443f5e 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -107,6 +107,14 @@ static struct ins_ops ret_ops;
 struct annotated_data_stat ann_data_stat;
 LIST_HEAD(ann_insn_stat);
 
+/* Pseudo data types */
+struct annotated_data_type stackop_type = {
+	.self = {
+		.type_name = (char *)"(stack operation)",
+		.children = LIST_HEAD_INIT(stackop_type.self.children),
+	},
+};
+
 static int arch__grow_instructions(struct arch *arch)
 {
 	struct ins *new_instructions;
@@ -3724,6 +3732,18 @@ static struct annotated_item_stat *annotate_data_stat(struct list_head *head,
 	return istat;
 }
 
+static bool is_stack_operation(struct arch *arch, struct disasm_line *dl)
+{
+	if (arch__is(arch, "x86")) {
+		if (!strncmp(dl->ins.name, "push", 4) ||
+		    !strncmp(dl->ins.name, "pop", 3) ||
+		    !strncmp(dl->ins.name, "ret", 3))
+			return true;
+	}
+
+	return false;
+}
+
 /**
  * hist_entry__get_data_type - find data type for given hist entry
  * @he: hist entry
@@ -3789,6 +3809,12 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		return NULL;
 	}
 
+	if (is_stack_operation(arch, dl)) {
+		istat->good++;
+		he->mem_type_off = 0;
+		return &stackop_type;
+	}
+
 	for_each_insn_op_loc(&loc, i, op_loc) {
 		if (!op_loc->mem_ref)
 			continue;
-- 
2.43.0.381.gb435a96ce8-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/9] perf annotate-data: Handle PC-relative addressing
  2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
                   ` (3 preceding siblings ...)
  2024-01-17  6:26 ` [PATCH 4/9] perf annotate-data: Add stack operation pseudo type Namhyung Kim
@ 2024-01-17  6:26 ` Namhyung Kim
  2024-01-17  6:26 ` [PATCH 6/9] perf annotate-data: Support global variables Namhyung Kim
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-01-17  6:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Linus Torvalds, Stephane Eranian,
	Masami Hiramatsu, linux-toolchains, linux-trace-devel

Extend find_data_type_die() to find data type from PC-relative address
using die_find_variable_by_addr().  Users need to pass the address for
the (global) variable.

The offset for the variable should be updated after finding the type
because the offset in the instruction is just to calcuate the address
for the variable.  So it changed to pass a pointer to offset and renamed
it to 'poffset'.

First it searches variables in the CU DIE as it's likely that the global
variables are defined in the file level.  And then it iterates the scope
DIEs to find a local (static) variable.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 56 ++++++++++++++++++++++-----------
 1 file changed, 38 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index d1ceac7e2441..58c0fac42e9d 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -14,6 +14,7 @@
 #include "debuginfo.h"
 #include "debug.h"
 #include "dso.h"
+#include "dwarf-regs.h"
 #include "evsel.h"
 #include "evlist.h"
 #include "map.h"
@@ -193,7 +194,8 @@ static bool find_cu_die(struct debuginfo *di, u64 pc, Dwarf_Die *cu_die)
 }
 
 /* The type info will be saved in @type_die */
-static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
+static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,
+			  bool is_pointer)
 {
 	Dwarf_Word size;
 
@@ -205,15 +207,18 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 	}
 
 	/*
-	 * It expects a pointer type for a memory access.
-	 * Convert to a real type it points to.
+	 * Usually it expects a pointer type for a memory access.
+	 * Convert to a real type it points to.  But global variables
+	 * are accessed directly without a pointer.
 	 */
-	if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
-	     dwarf_tag(type_die) != DW_TAG_array_type) ||
-	    die_get_real_type(type_die, type_die) == NULL) {
-		pr_debug("no pointer or no type\n");
-		ann_data_stat.no_typeinfo++;
-		return -1;
+	if (is_pointer) {
+		if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
+		     dwarf_tag(type_die) != DW_TAG_array_type) ||
+		    die_get_real_type(type_die, type_die) == NULL) {
+			pr_debug("no pointer or no type\n");
+			ann_data_stat.no_typeinfo++;
+			return -1;
+		}
 	}
 
 	/* Get the size of the actual type */
@@ -234,7 +239,7 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 }
 
 /* The result will be saved in @type_die */
-static int find_data_type_die(struct debuginfo *di, u64 pc,
+static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 			      struct annotated_op_loc *loc, Dwarf_Die *type_die)
 {
 	Dwarf_Die cu_die, var_die;
@@ -250,21 +255,36 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 		return -1;
 	}
 
-	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
-	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
-
 	reg = loc->reg1;
 	offset = loc->offset;
 
+	if (reg == DWARF_REG_PC &&
+	    die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) {
+		ret = check_variable(&var_die, type_die, offset,
+				     /*is_pointer=*/false);
+		goto out;
+	}
+
+	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
+	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
+
 retry:
 	/* Search from the inner-most scope to the outer */
 	for (i = nr_scopes - 1; i >= 0; i--) {
-		/* Look up variables/parameters in this scope */
-		if (!die_find_variable_by_reg(&scopes[i], pc, reg, &var_die))
-			continue;
+		if (reg == DWARF_REG_PC) {
+			if (!die_find_variable_by_addr(&scopes[i], pc, addr,
+						       &var_die, &offset))
+				continue;
+		} else {
+			/* Look up variables/parameters in this scope */
+			if (!die_find_variable_by_reg(&scopes[i], pc, reg,
+						      &var_die))
+				continue;
+		}
 
 		/* Found a variable, see if it's correct */
-		ret = check_variable(&var_die, type_die, offset);
+		ret = check_variable(&var_die, type_die, offset,
+				     reg != DWARF_REG_PC);
 		goto out;
 	}
 
@@ -312,7 +332,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	 * a file address for DWARF processing.
 	 */
 	pc = map__rip_2objdump(ms->map, ip);
-	if (find_data_type_die(di, pc, loc, &type_die) < 0)
+	if (find_data_type_die(di, pc, 0, loc, &type_die) < 0)
 		goto out;
 
 	result = dso__findnew_data_type(dso, &type_die);
-- 
2.43.0.381.gb435a96ce8-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 6/9] perf annotate-data: Support global variables
  2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
                   ` (4 preceding siblings ...)
  2024-01-17  6:26 ` [PATCH 5/9] perf annotate-data: Handle PC-relative addressing Namhyung Kim
@ 2024-01-17  6:26 ` Namhyung Kim
  2024-01-17  6:26 ` [PATCH 7/9] perf dwarf-aux: Add die_get_cfa() Namhyung Kim
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-01-17  6:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Linus Torvalds, Stephane Eranian,
	Masami Hiramatsu, linux-toolchains, linux-trace-devel

Global variables are accessed using PC-relative address so it needs to
be handled separately.  The PC-rel addressing is detected by using
DWARF_REG_PC.  On x86, %rip register would be used.

The address can be calculated using the ip and offset in the
instruction.  But it should start from the next instruction so add
calculate_pcrel_addr() to do it properly.

But global variables defined in a different file would only have a
declaration which doesn't include a location list.  So it first tries
to get the type info using the address, and then looks up the variable
declarations using name.  The name of global variables should be get
from the symbol table.  The declaration would have the type info.

So extend find_var_type() to take both address and name for global
variables.

The stat is now looks like:

  Annotate data type stats:
  total 294, ok 153 (52.0%), bad 141 (48.0%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          61 : no_var
          10 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 38 ++++++++++++++++------
 tools/perf/util/annotate-data.h |  6 ++--
 tools/perf/util/annotate.c      | 57 +++++++++++++++++++++++++++++++--
 tools/perf/util/annotate.h      |  4 +++
 4 files changed, 92 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 58c0fac42e9d..e375dd288f67 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -240,7 +240,8 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,
 
 /* The result will be saved in @type_die */
 static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
-			      struct annotated_op_loc *loc, Dwarf_Die *type_die)
+			      const char *var_name, struct annotated_op_loc *loc,
+			      Dwarf_Die *type_die)
 {
 	Dwarf_Die cu_die, var_die;
 	Dwarf_Die *scopes = NULL;
@@ -258,11 +259,21 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 	reg = loc->reg1;
 	offset = loc->offset;
 
-	if (reg == DWARF_REG_PC &&
-	    die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) {
-		ret = check_variable(&var_die, type_die, offset,
-				     /*is_pointer=*/false);
-		goto out;
+	if (reg == DWARF_REG_PC) {
+		if (die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) {
+			ret = check_variable(&var_die, type_die, offset,
+					     /*is_pointer=*/false);
+			loc->offset = offset;
+			goto out;
+		}
+
+		if (var_name && die_find_variable_at(&cu_die, var_name, pc,
+						     &var_die)) {
+			ret = check_variable(&var_die, type_die, 0,
+					     /*is_pointer=*/false);
+			/* loc->offset will be updated by the caller */
+			goto out;
+		}
 	}
 
 	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
@@ -285,6 +296,7 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 		/* Found a variable, see if it's correct */
 		ret = check_variable(&var_die, type_die, offset,
 				     reg != DWARF_REG_PC);
+		loc->offset = offset;
 		goto out;
 	}
 
@@ -306,13 +318,21 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
  * @ms: map and symbol at the location
  * @ip: instruction address of the memory access
  * @loc: instruction operand location
+ * @addr: data address of the memory access
+ * @var_name: global variable name
  *
  * This functions searches the debug information of the binary to get the data
- * type it accesses.  The exact location is expressed by (ip, reg, offset).
+ * type it accesses.  The exact location is expressed by (@ip, reg, offset)
+ * for pointer variables or (@ip, @addr) for global variables.  Note that global
+ * variables might update the @loc->offset after finding the start of the variable.
+ * If it cannot find a global variable by address, it tried to fine a declaration
+ * of the variable using @var_name.  In that case, @loc->offset won't be updated.
+ *
  * It return %NULL if not found.
  */
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   struct annotated_op_loc *loc)
+					   struct annotated_op_loc *loc, u64 addr,
+					   const char *var_name)
 {
 	struct annotated_data_type *result = NULL;
 	struct dso *dso = map__dso(ms->map);
@@ -332,7 +352,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	 * a file address for DWARF processing.
 	 */
 	pc = map__rip_2objdump(ms->map, ip);
-	if (find_data_type_die(di, pc, 0, loc, &type_die) < 0)
+	if (find_data_type_die(di, pc, addr, var_name, loc, &type_die) < 0)
 		goto out;
 
 	result = dso__findnew_data_type(dso, &type_die);
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 214c625e7bc9..1b0db8e8c40e 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -107,7 +107,8 @@ extern struct annotated_data_stat ann_data_stat;
 
 /* Returns data type at the location (ip, reg, offset) */
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   struct annotated_op_loc *loc);
+					   struct annotated_op_loc *loc, u64 addr,
+					   const char *var_name);
 
 /* Update type access histogram at the given offset */
 int annotated_data_type__update_samples(struct annotated_data_type *adt,
@@ -121,7 +122,8 @@ void annotated_data_type__tree_delete(struct rb_root *root);
 
 static inline struct annotated_data_type *
 find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
-	       struct annotated_op_loc *loc __maybe_unused)
+	       struct annotated_op_loc *loc __maybe_unused,
+	       u64 addr __maybe_unused, const char *var_name __maybe_unused)
 {
 	return NULL;
 }
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 655bd9443f5e..107b264fa41e 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -37,6 +37,7 @@
 #include "util/sharded_mutex.h"
 #include "arch/common.h"
 #include "namespaces.h"
+#include "thread.h"
 #include <regex.h>
 #include <linux/bitops.h>
 #include <linux/kernel.h>
@@ -3744,6 +3745,30 @@ static bool is_stack_operation(struct arch *arch, struct disasm_line *dl)
 	return false;
 }
 
+u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset,
+			struct disasm_line *dl)
+{
+	struct annotation *notes;
+	struct disasm_line *next;
+	u64 addr;
+
+	notes = symbol__annotation(ms->sym);
+	/*
+	 * PC-relative addressing starts from the next instruction address
+	 * But the IP is for the current instruction.  Since disasm_line
+	 * doesn't have the instruction size, calculate it using the next
+	 * disasm_line.  If it's the last one, we can use symbol's end
+	 * address directly.
+	 */
+	if (&dl->al.node == notes->src->source.prev)
+		addr = ms->sym->end + offset;
+	else {
+		next = list_next_entry(dl, al.node);
+		addr = ip + (next->al.offset - dl->al.offset) + offset;
+	}
+	return map__rip_2objdump(ms->map, addr);
+}
+
 /**
  * hist_entry__get_data_type - find data type for given hist entry
  * @he: hist entry
@@ -3763,7 +3788,9 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	struct annotated_op_loc *op_loc;
 	struct annotated_data_type *mem_type;
 	struct annotated_item_stat *istat;
-	u64 ip = he->ip;
+	u64 ip = he->ip, addr = 0;
+	const char *var_name = NULL;
+	int var_offset;
 	int i;
 
 	ann_data_stat.total++;
@@ -3822,12 +3849,38 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		/* Recalculate IP because of LOCK prefix or insn fusion */
 		ip = ms->sym->start + dl->al.offset;
 
-		mem_type = find_data_type(ms, ip, op_loc);
+		var_offset = op_loc->offset;
+
+		/* PC-relative addressing */
+		if (op_loc->reg1 == DWARF_REG_PC) {
+			struct addr_location al;
+			struct symbol *var;
+			u64 map_addr;
+
+			addr = annotate_calc_pcrel(ms, ip, op_loc->offset, dl);
+			/* Kernel symbols might be relocated */
+			map_addr = addr + map__reloc(ms->map);
+
+			addr_location__init(&al);
+			var = thread__find_symbol_fb(he->thread, he->cpumode,
+						     map_addr, &al);
+			if (var) {
+				var_name = var->name;
+				/* Calculate type offset from the start of variable */
+				var_offset = map_addr - map__unmap_ip(al.map, var->start);
+			}
+			addr_location__exit(&al);
+		}
+
+		mem_type = find_data_type(ms, ip, op_loc, addr, var_name);
 		if (mem_type)
 			istat->good++;
 		else
 			istat->bad++;
 
+		if (mem_type && var_name)
+			op_loc->offset = var_offset;
+
 		if (symbol_conf.annotate_data_sample) {
 			annotated_data_type__update_samples(mem_type, evsel,
 							    op_loc->offset,
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index d0ff677b460c..94435607c958 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -491,4 +491,8 @@ struct annotated_item_stat {
 };
 extern struct list_head ann_insn_stat;
 
+/* Calculate PC-relative address */
+u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset,
+			struct disasm_line *dl);
+
 #endif	/* __PERF_ANNOTATE_H */
-- 
2.43.0.381.gb435a96ce8-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 7/9] perf dwarf-aux: Add die_get_cfa()
  2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
                   ` (5 preceding siblings ...)
  2024-01-17  6:26 ` [PATCH 6/9] perf annotate-data: Support global variables Namhyung Kim
@ 2024-01-17  6:26 ` Namhyung Kim
  2024-01-17  6:26 ` [PATCH 8/9] perf annotate-data: Support stack variables Namhyung Kim
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-01-17  6:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Linus Torvalds, Stephane Eranian,
	Masami Hiramatsu, linux-toolchains, linux-trace-devel

The die_get_cfa() is to get frame base register and offset at the given
instruction address (pc).  This info will be used to locate stack
variables which have location expression using DW_OP_fbreg.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c | 68 ++++++++++++++++++++++++++++++++++++-
 tools/perf/util/dwarf-aux.h | 15 ++++++++
 2 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 7aa5fee0da19..3d42a8613869 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1407,7 +1407,73 @@ Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc,
 		*offset = data.offset;
 	return result;
 }
-#endif
+#endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
+
+#ifdef HAVE_DWARF_CFI_SUPPORT
+static int reg_from_dwarf_op(Dwarf_Op *op)
+{
+	switch (op->atom) {
+	case DW_OP_reg0 ... DW_OP_reg31:
+		return op->atom - DW_OP_reg0;
+	case DW_OP_breg0 ... DW_OP_breg31:
+		return op->atom - DW_OP_breg0;
+	case DW_OP_regx:
+	case DW_OP_bregx:
+		return op->number;
+	default:
+		break;
+	}
+	return -1;
+}
+
+static int offset_from_dwarf_op(Dwarf_Op *op)
+{
+	switch (op->atom) {
+	case DW_OP_reg0 ... DW_OP_reg31:
+	case DW_OP_regx:
+		return 0;
+	case DW_OP_breg0 ... DW_OP_breg31:
+		return op->number;
+	case DW_OP_bregx:
+		return op->number2;
+	default:
+		break;
+	}
+	return -1;
+}
+
+/**
+ * die_get_cfa - Get frame base information
+ * @dwarf: a Dwarf info
+ * @pc: program address
+ * @preg: pointer for saved register
+ * @poffset: pointer for saved offset
+ *
+ * This function gets register and offset for CFA (Canonical Frame Address)
+ * by searching the CIE/FDE info.  The CFA usually points to the start address
+ * of the current stack frame and local variables can be located using an offset
+ * from the CFA.  The @preg and @poffset will be updated if it returns 0.
+ */
+int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset)
+{
+	Dwarf_CFI *cfi;
+	Dwarf_Frame *frame = NULL;
+	Dwarf_Op *ops = NULL;
+	size_t nops;
+
+	cfi = dwarf_getcfi(dwarf);
+	if (cfi == NULL)
+		return -1;
+
+	if (!dwarf_cfi_addrframe(cfi, pc, &frame) &&
+	    !dwarf_frame_cfa(frame, &ops, &nops) && nops == 1) {
+		*preg = reg_from_dwarf_op(ops);
+		*poffset = offset_from_dwarf_op(ops);
+		return 0;
+	}
+	return -1;
+}
+#endif /* HAVE_DWARF_CFI_SUPPORT */
 
 /*
  * die_has_loclist - Check if DW_AT_location of @vr_die is a location list
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index 4e64caac6df8..f209f9162908 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -177,4 +177,19 @@ static inline Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die __maybe_unu
 
 #endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
+#ifdef HAVE_DWARF_CFI_SUPPORT
+
+/* Get the frame base information from CFA */
+int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset);
+
+#else /* HAVE_DWARF_CFI_SUPPORT */
+
+static inline int die_get_cfa(Dwarf *dwarf __maybe_unused, u64 pc __maybe_unused,
+			      int *preg __maybe_unused, int *poffset __maybe_unused)
+{
+	return -1;
+}
+
+#endif /* HAVE_DWARF_CFI_SUPPORT */
+
 #endif /* _DWARF_AUX_H */
-- 
2.43.0.381.gb435a96ce8-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 8/9] perf annotate-data: Support stack variables
  2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
                   ` (6 preceding siblings ...)
  2024-01-17  6:26 ` [PATCH 7/9] perf dwarf-aux: Add die_get_cfa() Namhyung Kim
@ 2024-01-17  6:26 ` Namhyung Kim
  2024-01-17  6:26 ` [PATCH 9/9] perf dwarf-aux: Check allowed DWARF Ops Namhyung Kim
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-01-17  6:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Linus Torvalds, Stephane Eranian,
	Masami Hiramatsu, linux-toolchains, linux-trace-devel

Local variables are allocated in the stack and the location list
should look like base register(s) and an offset.  Extend the
die_find_variable_by_reg() to handle the following expressions

 * DW_OP_breg{0..31}
 * DW_OP_bregx
 * DW_OP_fbreg

Ususally DWARF subprogram entries have frame base information and
use it to locate stack variable like below:

 <2><43d1575>: Abbrev Number: 62 (DW_TAG_variable)
    <43d1576>   DW_AT_location    : 2 byte block: 91 7c         (DW_OP_fbreg: -4)  <--- here
    <43d1579>   DW_AT_name        : (indirect string, offset: 0x2c00c9): i
    <43d157d>   DW_AT_decl_file   : 1
    <43d157e>   DW_AT_decl_line   : 78
    <43d157f>   DW_AT_type        : <0x43d19d7>

I found some differences on saving the frame base between gcc and clang.
The gcc uses the CFA to get the base so it needs to check the current
frame's CFI info.  In this case, stack offset needs to be adjusted from
the start of the CFA.

 <1><1bb8d>: Abbrev Number: 102 (DW_TAG_subprogram)
    <1bb8e>   DW_AT_name        : (indirect string, offset: 0x74d41): kernel_init
    <1bb92>   DW_AT_decl_file   : 2
    <1bb92>   DW_AT_decl_line   : 1440
    <1bb94>   DW_AT_decl_column : 18
    <1bb95>   DW_AT_prototyped  : 1
    <1bb95>   DW_AT_type        : <0xcc>
    <1bb99>   DW_AT_low_pc      : 0xffffffff81bab9e0
    <1bba1>   DW_AT_high_pc     : 0x1b2
    <1bba9>   DW_AT_frame_base  : 1 byte block: 9c      (DW_OP_call_frame_cfa)  <------ here
    <1bbab>   DW_AT_call_all_calls: 1
    <1bbab>   DW_AT_sibling     : <0x1bf5a>

While clang sets it to a register directly and it can check the register
and offset in the instruction directly.

 <1><43d1542>: Abbrev Number: 60 (DW_TAG_subprogram)
    <43d1543>   DW_AT_low_pc      : 0xffffffff816a7c60
    <43d154b>   DW_AT_high_pc     : 0x98
    <43d154f>   DW_AT_frame_base  : 1 byte block: 56    (DW_OP_reg6 (rbp))  <---------- here
    <43d1551>   DW_AT_GNU_all_call_sites: 1
    <43d1551>   DW_AT_name        : (indirect string, offset: 0x3bce91): foo
    <43d1555>   DW_AT_decl_file   : 1
    <43d1556>   DW_AT_decl_line   : 75
    <43d1557>   DW_AT_prototyped  : 1
    <43d1557>   DW_AT_type        : <0x43c7332>
    <43d155b>   DW_AT_external    : 1

Also it needs to update the offset after finding the type like global
variables since the offset was from the frame base.  Factor out
match_var_offset() to check global and local variables in the same way.

The type stats are improved too:

  Annotate data type stats:
  total 294, ok 160 (54.4%), bad 134 (45.6%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          51 : no_var
          14 : no_typeinfo
           7 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 35 +++++++++++++--
 tools/perf/util/dwarf-aux.c     | 79 ++++++++++++++++++++++++---------
 tools/perf/util/dwarf-aux.h     |  3 ++
 3 files changed, 93 insertions(+), 24 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index e375dd288f67..30c4d19fcf11 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -209,7 +209,7 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,
 	/*
 	 * Usually it expects a pointer type for a memory access.
 	 * Convert to a real type it points to.  But global variables
-	 * are accessed directly without a pointer.
+	 * and local variables are accessed directly without a pointer.
 	 */
 	if (is_pointer) {
 		if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
@@ -248,6 +248,9 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 	int reg, offset;
 	int ret = -1;
 	int i, nr_scopes;
+	int fbreg = -1;
+	bool is_fbreg = false;
+	int fb_offset = 0;
 
 	/* Get a compile_unit for this address */
 	if (!find_cu_die(di, pc, &cu_die)) {
@@ -279,7 +282,33 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
 	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
 
+	if (reg != DWARF_REG_PC && dwarf_hasattr(&scopes[0], DW_AT_frame_base)) {
+		Dwarf_Attribute attr;
+		Dwarf_Block block;
+
+		/* Check if the 'reg' is assigned as frame base register */
+		if (dwarf_attr(&scopes[0], DW_AT_frame_base, &attr) != NULL &&
+		    dwarf_formblock(&attr, &block) == 0 && block.length == 1) {
+			switch (*block.data) {
+			case DW_OP_reg0 ... DW_OP_reg31:
+				fbreg = *block.data - DW_OP_reg0;
+				break;
+			case DW_OP_call_frame_cfa:
+				if (die_get_cfa(di->dbg, pc, &fbreg,
+						&fb_offset) < 0)
+					fbreg = -1;
+				break;
+			default:
+				break;
+			}
+		}
+	}
+
 retry:
+	is_fbreg = (reg == fbreg);
+	if (is_fbreg)
+		offset = loc->offset - fb_offset;
+
 	/* Search from the inner-most scope to the outer */
 	for (i = nr_scopes - 1; i >= 0; i--) {
 		if (reg == DWARF_REG_PC) {
@@ -289,13 +318,13 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 		} else {
 			/* Look up variables/parameters in this scope */
 			if (!die_find_variable_by_reg(&scopes[i], pc, reg,
-						      &var_die))
+						      &offset, is_fbreg, &var_die))
 				continue;
 		}
 
 		/* Found a variable, see if it's correct */
 		ret = check_variable(&var_die, type_die, offset,
-				     reg != DWARF_REG_PC);
+				     reg != DWARF_REG_PC && !is_fbreg);
 		loc->offset = offset;
 		goto out;
 	}
diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 3d42a8613869..7caf52fdc255 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1272,11 +1272,39 @@ struct find_var_data {
 	unsigned reg;
 	/* Access offset, set for global data */
 	int offset;
+	/* True if the current register is the frame base */
+	bool is_fbreg;
 };
 
 /* Max number of registers DW_OP_regN supports */
 #define DWARF_OP_DIRECT_REGS  32
 
+static bool match_var_offset(Dwarf_Die *die_mem, struct find_var_data *data,
+			     u64 addr_offset, u64 addr_type)
+{
+	Dwarf_Die type_die;
+	Dwarf_Word size;
+
+	if (addr_offset == addr_type) {
+		/* Update offset relative to the start of the variable */
+		data->offset = 0;
+		return true;
+	}
+
+	if (die_get_real_type(die_mem, &type_die) == NULL)
+		return false;
+
+	if (dwarf_aggregate_size(&type_die, &size) < 0)
+		return false;
+
+	if (addr_offset >= addr_type + size)
+		return false;
+
+	/* Update offset relative to the start of the variable */
+	data->offset = addr_offset - addr_type;
+	return true;
+}
+
 /* Only checks direct child DIEs in the given scope. */
 static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
 {
@@ -1301,14 +1329,30 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
 		if (start > data->pc)
 			break;
 
+		/* Local variables accessed using frame base register */
+		if (data->is_fbreg && ops->atom == DW_OP_fbreg &&
+		    data->offset >= (int)ops->number &&
+		    match_var_offset(die_mem, data, data->offset, ops->number))
+			return DIE_FIND_CB_END;
+
 		/* Only match with a simple case */
 		if (data->reg < DWARF_OP_DIRECT_REGS) {
 			if (ops->atom == (DW_OP_reg0 + data->reg) && nops == 1)
 				return DIE_FIND_CB_END;
+
+			/* Local variables accessed by a register + offset */
+			if (ops->atom == (DW_OP_breg0 + data->reg) &&
+			    match_var_offset(die_mem, data, data->offset, ops->number))
+				return DIE_FIND_CB_END;
 		} else {
 			if (ops->atom == DW_OP_regx && ops->number == data->reg &&
 			    nops == 1)
 				return DIE_FIND_CB_END;
+
+			/* Local variables accessed by a register + offset */
+			if (ops->atom == DW_OP_bregx && data->reg == ops->number &&
+			    match_var_offset(die_mem, data, data->offset, ops->number2))
+				return DIE_FIND_CB_END;
 		}
 	}
 	return DIE_FIND_CB_SIBLING;
@@ -1319,18 +1363,29 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
  * @sc_die: a scope DIE
  * @pc: the program address to find
  * @reg: the register number to find
+ * @poffset: pointer to offset, will be updated for fbreg case
+ * @is_fbreg: boolean value if the current register is the frame base
  * @die_mem: a buffer to save the resulting DIE
  *
- * Find the variable DIE accessed by the given register.
+ * Find the variable DIE accessed by the given register.  It'll update the @offset
+ * when the variable is in the stack.
  */
 Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
+				    int *poffset, bool is_fbreg,
 				    Dwarf_Die *die_mem)
 {
 	struct find_var_data data = {
 		.pc = pc,
 		.reg = reg,
+		.offset = *poffset,
+		.is_fbreg = is_fbreg,
 	};
-	return die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem);
+	Dwarf_Die *result;
+
+	result = die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem);
+	if (result)
+		*poffset = data.offset;
+	return result;
 }
 
 /* Only checks direct child DIEs in the given scope */
@@ -1341,8 +1396,6 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg)
 	ptrdiff_t off = 0;
 	Dwarf_Attribute attr;
 	Dwarf_Addr base, start, end;
-	Dwarf_Word size;
-	Dwarf_Die type_die;
 	Dwarf_Op *ops;
 	size_t nops;
 
@@ -1359,24 +1412,8 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg)
 		if (data->addr < ops->number)
 			continue;
 
-		if (data->addr == ops->number) {
-			/* Update offset relative to the start of the variable */
-			data->offset = 0;
+		if (match_var_offset(die_mem, data, data->addr, ops->number))
 			return DIE_FIND_CB_END;
-		}
-
-		if (die_get_real_type(die_mem, &type_die) == NULL)
-			continue;
-
-		if (dwarf_aggregate_size(&type_die, &size) < 0)
-			continue;
-
-		if (data->addr >= ops->number + size)
-			continue;
-
-		/* Update offset relative to the start of the variable */
-		data->offset = data->addr - ops->number;
-		return DIE_FIND_CB_END;
 	}
 	return DIE_FIND_CB_SIBLING;
 }
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index f209f9162908..85dd527ae1f7 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -142,6 +142,7 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf);
 
 /* Find a variable saved in the 'reg' at given address */
 Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
+				    int *poffset, bool is_fbreg,
 				    Dwarf_Die *die_mem);
 
 /* Find a (global) variable located in the 'addr' */
@@ -161,6 +162,8 @@ static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused,
 static inline Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die __maybe_unused,
 						  Dwarf_Addr pc __maybe_unused,
 						  int reg __maybe_unused,
+						  int *poffset __maybe_unused,
+						  bool is_fbreg __maybe_unused,
 						  Dwarf_Die *die_mem __maybe_unused)
 {
 	return NULL;
-- 
2.43.0.381.gb435a96ce8-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 9/9] perf dwarf-aux: Check allowed DWARF Ops
  2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
                   ` (7 preceding siblings ...)
  2024-01-17  6:26 ` [PATCH 8/9] perf annotate-data: Support stack variables Namhyung Kim
@ 2024-01-17  6:26 ` Namhyung Kim
  2024-01-18 16:36 ` [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Ian Rogers
  2024-01-22 20:37 ` Namhyung Kim
  10 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-01-17  6:26 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
	linux-perf-users, Linus Torvalds, Stephane Eranian,
	Masami Hiramatsu, linux-toolchains, linux-trace-devel

The DWARF location expression can be fairly complex and it'd be hard
to match it with the condition correctly.  So let's be conservative
and only allow simple expressions.  For now it just checks the first
operation in the list.  The following operations looks ok:

 * DW_OP_stack_value
 * DW_OP_deref_size
 * DW_OP_deref
 * DW_OP_piece

To refuse complex (and unsupported) location expressions, add
check_allowed_ops() to compare the rest of the list.  It seems earlier
result contained those unsupported expressions.  For example, I found
some local struct variable is placed like below.

 <2><43d1517>: Abbrev Number: 62 (DW_TAG_variable)
    <43d1518>   DW_AT_location    : 15 byte block: 91 50 93 8 91 78 93 4 93 84 8 91 68 93 4
        (DW_OP_fbreg: -48; DW_OP_piece: 8;
         DW_OP_fbreg: -8; DW_OP_piece: 4;
         DW_OP_piece: 1028;
         DW_OP_fbreg: -24; DW_OP_piece: 4)

Another example is something like this.

    0057c8be ffffffffffffffff ffffffff812109f0 (base address)
    0057c8ce ffffffff812112b5 ffffffff812112c8 (DW_OP_breg3 (rbx): 0;
                                                DW_OP_constu: 18446744073709551612;
                                                DW_OP_and;
                                                DW_OP_stack_value)

It should refuse them.  After the change, the stat shows:

  Annotate data type stats:
  total 294, ok 158 (53.7%), bad 136 (46.3%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          53 : no_var
          14 : no_typeinfo
           7 : bad_offset

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c | 44 +++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 7caf52fdc255..2791126069b4 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1305,6 +1305,34 @@ static bool match_var_offset(Dwarf_Die *die_mem, struct find_var_data *data,
 	return true;
 }
 
+static bool check_allowed_ops(Dwarf_Op *ops, size_t nops)
+{
+	/* The first op is checked separately */
+	ops++;
+	nops--;
+
+	/*
+	 * It needs to make sure if the location expression matches to the given
+	 * register and offset exactly.  Thus it rejects any complex expressions
+	 * and only allows a few of selected operators that doesn't change the
+	 * location.
+	 */
+	while (nops) {
+		switch (ops->atom) {
+		case DW_OP_stack_value:
+		case DW_OP_deref_size:
+		case DW_OP_deref:
+		case DW_OP_piece:
+			break;
+		default:
+			return false;
+		}
+		ops++;
+		nops--;
+	}
+	return true;
+}
+
 /* Only checks direct child DIEs in the given scope. */
 static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
 {
@@ -1332,25 +1360,31 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
 		/* Local variables accessed using frame base register */
 		if (data->is_fbreg && ops->atom == DW_OP_fbreg &&
 		    data->offset >= (int)ops->number &&
+		    check_allowed_ops(ops, nops) &&
 		    match_var_offset(die_mem, data, data->offset, ops->number))
 			return DIE_FIND_CB_END;
 
 		/* Only match with a simple case */
 		if (data->reg < DWARF_OP_DIRECT_REGS) {
-			if (ops->atom == (DW_OP_reg0 + data->reg) && nops == 1)
+			/* pointer variables saved in a register 0 to 31 */
+			if (ops->atom == (DW_OP_reg0 + data->reg) &&
+			    check_allowed_ops(ops, nops))
 				return DIE_FIND_CB_END;
 
 			/* Local variables accessed by a register + offset */
 			if (ops->atom == (DW_OP_breg0 + data->reg) &&
+			    check_allowed_ops(ops, nops) &&
 			    match_var_offset(die_mem, data, data->offset, ops->number))
 				return DIE_FIND_CB_END;
 		} else {
+			/* pointer variables saved in a register 32 or above */
 			if (ops->atom == DW_OP_regx && ops->number == data->reg &&
-			    nops == 1)
+			    check_allowed_ops(ops, nops))
 				return DIE_FIND_CB_END;
 
 			/* Local variables accessed by a register + offset */
 			if (ops->atom == DW_OP_bregx && data->reg == ops->number &&
+			    check_allowed_ops(ops, nops) &&
 			    match_var_offset(die_mem, data, data->offset, ops->number2))
 				return DIE_FIND_CB_END;
 		}
@@ -1412,7 +1446,8 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg)
 		if (data->addr < ops->number)
 			continue;
 
-		if (match_var_offset(die_mem, data, data->addr, ops->number))
+		if (check_allowed_ops(ops, nops) &&
+		    match_var_offset(die_mem, data, data->addr, ops->number))
 			return DIE_FIND_CB_END;
 	}
 	return DIE_FIND_CB_SIBLING;
@@ -1503,7 +1538,8 @@ int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset)
 		return -1;
 
 	if (!dwarf_cfi_addrframe(cfi, pc, &frame) &&
-	    !dwarf_frame_cfa(frame, &ops, &nops) && nops == 1) {
+	    !dwarf_frame_cfa(frame, &ops, &nops) &&
+	    check_allowed_ops(ops, nops)) {
 		*preg = reg_from_dwarf_op(ops);
 		*poffset = offset_from_dwarf_op(ops);
 		return 0;
-- 
2.43.0.381.gb435a96ce8-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCHSET 0/9] perf tools: More updates on data type profiling (v4)
  2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
                   ` (8 preceding siblings ...)
  2024-01-17  6:26 ` [PATCH 9/9] perf dwarf-aux: Check allowed DWARF Ops Namhyung Kim
@ 2024-01-18 16:36 ` Ian Rogers
  2024-01-22 20:37 ` Namhyung Kim
  10 siblings, 0 replies; 12+ messages in thread
From: Ian Rogers @ 2024-01-18 16:36 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Adrian Hunter,
	Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu,
	linux-toolchains, linux-trace-devel, Ben Woodard, Joe Mario,
	Kees Cook, David Blaikie, Xu Liu, Kan Liang, Ravi Bangoria,
	Mark Wielaard, Jason Merrill, Jose E . Marchesi, William Huang

On Tue, Jan 16, 2024 at 10:27 PM Namhyung Kim <namhyung@kernel.org> wrote:
>
> Hello,
>
> This is a continuation of the data type profiling series.  Now the basic
> part (v3) which uses pointer variables is merged to the perf-tools-next
> tree.  And this part is for memory accesses without pointers as well as
> small updates to handle some corner cases.  Still mores to come to
> complete the original series.
>
> There's no change from the previous version.  For background and usages,
> pleaes refer the posting of previous version [1] and a LWN article [2].
>
> Basically most memory accesses happen with pointers, but there are cases
> don't use pointers - direct accesses to global and local variables.
>
> Global variables are located in a static memory at a specific address.
> So the DWARF location expression for the global vairable would also have
> the static address.  And it's common to access them using PC-relative
> addressing mode.  Thus it needs a special handling for global variables.
>
> On the other hand, local variables are located in the stack which varies
> as program executes.  So the local variables are accessed either by the
> (stack) frame pointer or (current) stack pointer.  But sometimes DWARF
> location expression uses a frame base address (CFA) to specify location
> of local variables.  So it may need to convert or normalize the location
> extracted from the instruction to match DWARF expression.
>
> Lastly, there are some cases DWARF location expressions end up having
> complex (or not straight-forward) location.  In that case, it cannot
> simply match just the first expression with the instruction location.
> It'd be safer to reject them.
>
> The code is available at 'perf/data-profile-update-v4' branch in the tree
> below.  The full version of the code is in 'perf/data-profile-v4' branch.
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung
>
>
> Cc: Ben Woodard <woodard@redhat.com>
> Cc: Joe Mario <jmario@redhat.com>
> CC: Kees Cook <keescook@chromium.org>
> Cc: David Blaikie <blaikie@google.com>
> Cc: Xu Liu <xliuprof@google.com>
> Cc: Kan Liang <kan.liang@linux.intel.com>
> Cc: Ravi Bangoria <ravi.bangoria@amd.com>
> Cc: Mark Wielaard <mark@klomp.org>
> Cc: Jason Merrill <jason@redhat.com>
> Cc: Jose E. Marchesi <jose.marchesi@oracle.com>
> Cc: William Huang <williamjhuang@google.com>
>
> [1] https://lore.kernel.org/linux-perf-users/20231213001323.718046-1-namhyung@kernel.org/
> [2] https://lwn.net/Articles/955709/
>
>
> Namhyung Kim (9):
>   perf annotate-data: Parse 'lock' prefix from llvm-objdump
>   perf annotate-data: Handle macro fusion on x86
>   perf annotate-data: Handle array style accesses
>   perf annotate-data: Add stack operation pseudo type
>   perf annotate-data: Handle PC-relative addressing
>   perf annotate-data: Support global variables
>   perf dwarf-aux: Add die_get_cfa()
>   perf annotate-data: Support stack variables
>   perf dwarf-aux: Check allowed DWARF Ops

Series:
Reviewed-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

>  tools/perf/util/annotate-data.c | 119 ++++++++++++++++----
>  tools/perf/util/annotate-data.h |   8 +-
>  tools/perf/util/annotate.c      | 153 ++++++++++++++++++++++++--
>  tools/perf/util/annotate.h      |  12 +-
>  tools/perf/util/dwarf-aux.c     | 187 ++++++++++++++++++++++++++++----
>  tools/perf/util/dwarf-aux.h     |  18 +++
>  6 files changed, 439 insertions(+), 58 deletions(-)
>
>
> base-commit: d988c9f511af71a3445b6a4f3a2c67208ff8e480
> --
> 2.43.0.381.gb435a96ce8-goog
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHSET 0/9] perf tools: More updates on data type profiling (v4)
  2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
                   ` (9 preceding siblings ...)
  2024-01-18 16:36 ` [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Ian Rogers
@ 2024-01-22 20:37 ` Namhyung Kim
  10 siblings, 0 replies; 12+ messages in thread
From: Namhyung Kim @ 2024-01-22 20:37 UTC (permalink / raw)
  To: Namhyung Kim, Arnaldo Carvalho de Melo, Jiri Olsa
  Cc: Ravi Bangoria, Xu Liu, Adrian Hunter, Linus Torvalds, Ian Rogers,
	Jason Merrill, Peter Zijlstra, LKML, William Huang, Ben Woodard,
	Stephane Eranian, Kees Cook, linux-toolchains, Joe Mario,
	linux-trace-devel, Masami Hiramatsu, Kan Liang,
	Jose E . Marchesi, linux-perf-users, David Blaikie,
	Mark Wielaard, Ingo Molnar

On Tue, 16 Jan 2024 22:26:48 -0800, Namhyung Kim wrote:
> This is a continuation of the data type profiling series.  Now the basic
> part (v3) which uses pointer variables is merged to the perf-tools-next
> tree.  And this part is for memory accesses without pointers as well as
> small updates to handle some corner cases.  Still mores to come to
> complete the original series.
> 
> There's no change from the previous version.  For background and usages,
> pleaes refer the posting of previous version [1] and a LWN article [2].
> 
> [...]

Applied to perf-tools-next, thanks!

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-01-22 20:37 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-17  6:26 [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Namhyung Kim
2024-01-17  6:26 ` [PATCH 1/9] perf annotate-data: Parse 'lock' prefix from llvm-objdump Namhyung Kim
2024-01-17  6:26 ` [PATCH 2/9] perf annotate-data: Handle macro fusion on x86 Namhyung Kim
2024-01-17  6:26 ` [PATCH 3/9] perf annotate-data: Handle array style accesses Namhyung Kim
2024-01-17  6:26 ` [PATCH 4/9] perf annotate-data: Add stack operation pseudo type Namhyung Kim
2024-01-17  6:26 ` [PATCH 5/9] perf annotate-data: Handle PC-relative addressing Namhyung Kim
2024-01-17  6:26 ` [PATCH 6/9] perf annotate-data: Support global variables Namhyung Kim
2024-01-17  6:26 ` [PATCH 7/9] perf dwarf-aux: Add die_get_cfa() Namhyung Kim
2024-01-17  6:26 ` [PATCH 8/9] perf annotate-data: Support stack variables Namhyung Kim
2024-01-17  6:26 ` [PATCH 9/9] perf dwarf-aux: Check allowed DWARF Ops Namhyung Kim
2024-01-18 16:36 ` [PATCHSET 0/9] perf tools: More updates on data type profiling (v4) Ian Rogers
2024-01-22 20:37 ` Namhyung Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).