linux-kernel.vger.kernel.org archive mirror
* [RFC 00/52] perf tools: Introduce data type profiling (v2)
@ 2023-11-09 23:59 Namhyung Kim
  2023-11-09 23:59 ` [PATCH 01/52] perf annotate: Pass "-l" option to objdump conditionally Namhyung Kim
                   ` (52 more replies)
  0 siblings, 53 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains, Ben Woodard, Joe Mario,
	Kees Cook, David Blaikie, Xu Liu, Kan Liang, Ravi Bangoria,
	Mark Wielaard, Jason Merrill

Hello,

I'm happy to share my work on data type profiling.  It associates PMU
samples with the data types they refer to, using DWARF debug information.
So it basically depends on the quality of the PMU events and of the DWARF
info the compiler produces.  But it doesn't require any changes in the
target program.

As this is at an early stage, I've targeted the kernel on x86 to reduce
the amount of work, but IIUC there's no fundamental blocker to applying
it to other architectures and applications.


* v2 changes
 - speed up analysis by not asking objdump for (unused) line number info
 - support annotating only a specific data type by passing a type name like
   `perf annotate --data-type=<TYPENAME>`  (PeterZ)
 - allow event group view to see multiple event results together like
   `perf annotate --data-type --group`  (PeterZ)
 - rename to `die_get_typename_from_type()`  (Masami)
 - add a feature check for HAVE_DWARF_CFI_SUPPORT  (Masami)
 - add Acked-by tags from Masami
 

* How to use it

To get precise memory access samples, users can use the `perf mem record`
command, which picks the events supported by their architecture.  Intel
machines would work best as they have dedicated memory access events, but
those events have a filter that ignores low latency loads, by default
those under 30 cycles (use the --ldlat option to change the threshold).

    # To get memory access samples in kernel for 1 second (on Intel)
    $ sudo perf mem record -a -K --ldlat=4 -- sleep 1

    # Similar for the AMD (but it requires 6.3+ kernel for BPF filters)
    $ sudo perf mem record -a --filter 'mem_op == load || mem_op == store, ip > 0x8000000000000000' -- sleep 1

Note that the examples use 'sudo' because they collect events in
system-wide mode.  Whether root is actually needed depends on the
kernel.perf_event_paranoid sysctl setting.  AMD still needs root due to
the BPF filter, though.

Users can actually use a different event as long as it gives precise
instruction addresses in samples.  But perf mem record picks the
available events that give more information, like data source or
latency (for advanced usage).

After collecting the profile data, run perf report or perf annotate as
usual to see the result.  Make sure that you have a kernel debug package
installed or a vmlinux with DWARF info.

I've added new options and sort keys to enable data type profiling.
I probably need to add it to the perf mem or perf c2c command for a
better user experience.  I'm open to discussing how we can make it
simpler and more intuitive for regular users.  But let's talk about the
lower level interface for now.

In perf report, it's just a matter of selecting the new sort keys: 'type'
and 'typeoff'.  The 'type' key shows the name of the data type as a whole
while 'typeoff' shows the name of the field in the data type.  I found it
useful to combine them with the --hierarchy option to group relevant
entries in the same level.

Also, if you have both load and store events, pass --group option to
see the result together.  This would give 3 output columns with a dummy
event.  I think we should get rid of the dummy event after recording or
discard from the output at least.

    $ sudo perf report -s type,typeoff --hierarchy --group --stdio
    ...
    #
    # Samples: 10K of events 'cpu/mem-loads,ldlat=4/P, cpu/mem-stores/P, dummy:u'
    # Event count (approx.): 602758064
    #
    #                    Overhead  Data Type / Data Type Offset
    # ...........................  ............................
    #
        26.09%   3.28%   0.00%     long unsigned int
           26.09%   3.28%   0.00%     long unsigned int +0 (no field)
        18.48%   0.73%   0.00%     struct page
           10.83%   0.02%   0.00%     struct page +8 (lru.next)
            3.90%   0.28%   0.00%     struct page +0 (flags)
            3.45%   0.06%   0.00%     struct page +24 (mapping)
            0.25%   0.28%   0.00%     struct page +48 (_mapcount.counter)
            0.02%   0.06%   0.00%     struct page +32 (index)
            0.02%   0.00%   0.00%     struct page +52 (_refcount.counter)
            0.02%   0.01%   0.00%     struct page +56 (memcg_data)
            0.00%   0.01%   0.00%     struct page +16 (lru.prev)
        15.37%  17.54%   0.00%     (stack operation)
           15.37%  17.54%   0.00%     (stack operation) +0 (no field)
        11.71%  50.27%   0.00%     (unknown)
           11.71%  50.27%   0.00%     (unknown) +0 (no field)
    ...

The most frequently accessed type was long unsigned int, followed by
struct page, where the second field (lru.next) at offset 8 received most
of the accesses.

The (stack operation) and (unknown) entries have no type or field info.
FYI, the stack operations are samples on PUSH, POP or RET instructions,
which save or restore registers to/from the stack.  They are usually part
of the function prologue and epilogue and have no type info.

In perf annotate, a new --data-type option was added to enable data
field level annotation.  For now it only shows the number of samples for
each field, but we can improve it.  The --data-type option optionally
takes an argument to specify the name of the data type to display.
Otherwise, it displays all data types that have samples.

    $ sudo perf annotate --data-type=page --group
    Annotate type: 'struct page' in [kernel.kallsyms] (480 samples):
     event[0] = cpu/mem-loads,ldlat=4/P
     event[1] = cpu/mem-stores/P
     event[2] = dummy:u
    ============================================================================
                              samples     offset       size  field
            447         33          0          0         64  struct page     {
            108          8          0          0          8      long unsigned int  flags;
            319         13          0          8         40      union       {
            319         13          0          8         40          struct          {
            236          2          0          8         16              union       {
            236          2          0          8         16                  struct list_head       lru {
            236          1          0          8          8                      struct list_head*  next;
              0          1          0         16          8                      struct list_head*  prev;
                                                                             };
            236          2          0          8         16                  struct          {
            236          1          0          8          8                      void*      __filler;
              0          1          0         16          4                      unsigned int       mlock_count;
                                                                             };
            236          2          0          8         16                  struct list_head       buddy_list {
            236          1          0          8          8                      struct list_head*  next;
              0          1          0         16          8                      struct list_head*  prev;
                                                                             };
            236          2          0          8         16                  struct list_head       pcp_list {
            236          1          0          8          8                      struct list_head*  next;
              0          1          0         16          8                      struct list_head*  prev;
                                                                             };
                                                                         };
             82          4          0         24          8              struct address_space*      mapping;
              1          7          0         32          8              union       {
              1          7          0         32          8                  long unsigned int      index;
              1          7          0         32          8                  long unsigned int      share;
                                                                         };
              0          0          0         40          8              long unsigned int  private;
                                                                     };
        ...

This shows each struct one by one, with field-level access info in a
C-like style.  The number of samples for the outer struct is the sum of
the samples of every field in the struct.  In unions, all fields are
placed at the same offset so they have the same number of samples.
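
To illustrate why union members report identical counts, here is a
minimal C sketch.  It uses a simplified, hypothetical stand-in for struct
page: struct demo_page, DEMO_COVERS() and demo_account() are made up for
this example and are neither the kernel layout nor code from this series.
A sample is attributed to every member whose byte range covers the
sampled offset, so members that overlay each other are all counted.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct page (not the kernel layout). */
struct demo_list_head {
	struct demo_list_head *next;
	struct demo_list_head *prev;
};

struct demo_page {
	unsigned long flags;
	union {
		struct demo_list_head lru;
		struct {
			void *filler;
			unsigned int mlock_count;
		} mlock;
		struct demo_list_head buddy_list;
	} u;
};

/* Per-member sample counters for the three overlapping first members. */
struct demo_counts {
	unsigned int lru_next;
	unsigned int filler;
	unsigned int buddy_next;
};

/* True if byte offset 'off' falls inside 'member' of struct demo_page. */
#define DEMO_COVERS(off, member)					\
	((off) >= offsetof(struct demo_page, member) &&			\
	 (off) < offsetof(struct demo_page, member) +			\
		 sizeof(((struct demo_page *)0)->member))

/* Attribute one sample at byte offset 'off' to every member covering it. */
static void demo_account(struct demo_counts *c, size_t off)
{
	if (DEMO_COVERS(off, u.lru.next))
		c->lru_next++;
	if (DEMO_COVERS(off, u.mlock.filler))
		c->filler++;
	if (DEMO_COVERS(off, u.buddy_list.next))
		c->buddy_next++;
}
```

Calling demo_account() with the offset of the union bumps all three
counters at once, which mirrors the repeated counts shown for lru.next,
__filler and buddy_list.next in the output above.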

No TUI support yet.


* How it works

The basic idea is to use the DWARF location expressions in the debug
entries for variables.  Say we got a sample in the instruction below:

    0x123456:  mov    0x18(%rdi), %rcx

Then we know the instruction at 0x123456 is accessing a memory region
whose base address is in the %rdi register, at offset 0x18 from the base.
DWARF would have a debug info entry for a function or a block which
covers that address.  For example, we might have something like this:

    <1><100>: Abbrev Number: 10 (DW_TAG_subprogram)
       <101>    DW_AT_name       : (indirect string, offset: 0x184e6): foo
       <105>    DW_AT_type       : <0x29ad7>
       <106>    DW_AT_low_pc     : 0x123400
       <10e>    DW_AT_high_pc    : 0x1234ff
    <2><116>: Abbrev Number: 8 (DW_TAG_formal_parameter)
       <117>    DW_AT_name       : (indirect string, offset: 0x18527): bar
       <11b>    DW_AT_type       : <0x29b3a>
       <11c>    DW_AT_location   : 1 byte block: 55    (DW_OP_reg5 (rdi))

So the function 'foo' covers the instructions from 0x123400 to 0x1234ff
and we know the sampled instruction belongs to that function.  It has a
parameter called 'bar' which is located in the %rdi register.  Then we
know the instruction is using the variable bar and its type would be a
pointer (to a struct).  We can follow the type info of bar and verify
the access by checking the offset in the instruction (0x18) against the
size of the (struct) type.
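
The offset check described above can be sketched in C as follows.  This
is only an illustration: struct demo_member and both helpers are
hypothetical names, and a real implementation would read member offsets
and sizes from DWARF rather than from a hand-written table.

```c
#include <stddef.h>

/* Hypothetical description of one struct member: name, byte offset, size. */
struct demo_member {
	const char *name;
	size_t offset;
	size_t size;
};

/* Return 1 if [offset, offset + len) lies inside a type of type_size bytes. */
static int demo_access_in_type(size_t type_size, long offset, size_t len)
{
	if (offset < 0)
		return 0;
	return (size_t)offset < type_size &&
	       len <= type_size - (size_t)offset;
}

/* Find the member covering 'offset', or NULL if it falls in padding. */
static const struct demo_member *
demo_find_member(const struct demo_member *m, size_t nr, size_t offset)
{
	for (size_t i = 0; i < nr; i++) {
		if (offset >= m[i].offset && offset - m[i].offset < m[i].size)
			return &m[i];
	}
	return NULL;
}
```

With a table built from the struct page output shown earlier (size 64,
mapping at offset 24 with size 8), an 8-byte access at offset 0x18 passes
demo_access_in_type() and demo_find_member() resolves it to mapping.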

Well.. this is a simple example where 'bar' has a single location.
Other variables might be located in various places over time, but that
should be covered by the location list of the debug entry.  Therefore,
as long as the compiler produces correct location expressions in DWARF
for a variable, it should be possible to find the variable using the
location info.
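
As a rough sketch of that lookup, assume the location list has already
been flattened into (address range, register) entries.  The struct
demo_loc layout and demo_find_var_by_reg() are hypothetical and only
cover the register-location case; real location lists are decoded from
DWARF and can hold arbitrary expressions.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical flattened location-list entry: the variable lives in
 * DWARF register 'reg' while the PC is in [start, end).
 */
struct demo_loc {
	const char *var;
	uint64_t start;
	uint64_t end;
	int reg;	/* DWARF register number, e.g. 5 for %rdi on x86-64 */
};

/* Return the name of the variable held in 'reg' at 'pc', or NULL. */
static const char *demo_find_var_by_reg(const struct demo_loc *locs,
					size_t nr, uint64_t pc, int reg)
{
	for (size_t i = 0; i < nr; i++) {
		if (pc >= locs[i].start && pc < locs[i].end &&
		    locs[i].reg == reg)
			return locs[i].var;
	}
	return NULL;
}
```

If 'bar' sits in %rdi for the first part of the function and is moved to
another register later, a lookup for %rdi only succeeds for sample
addresses inside the first range.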

Global variables and local variables are different as they can be
accessed directly without a pointer.  They are located at an absolute
address or at a position relative to the current stack frame.  So it
needs to handle such location expressions as well.

However, some memory accesses don't have a variable at all.  For
example, say you have a pointer variable to a struct which contains
other pointers.  Then you can dereference the inner pointers directly
without using a variable.  Consider the following source code.

    int foo(struct baz *bar) {
        ...
        if (bar->p->q == 0)
            return 1;
        ...
    }

This can generate instructions like below.

    ...
    0x123456:  mov    0x18(%rdi), %rcx
    0x12345a:  mov    0x10(%rcx), %rax     <=== sample
    0x12345e:  test   %rax, %rax
    0x123461:  je     <...>
    ...

And imagine we have a sample at 0x12345a.  Then it cannot find a
variable for %rcx since DWARF doesn't have one (it only knows about
'bar').  Without compiler support, all it can do is to track the code
execution instruction by instruction and propagate the type info to each
register and stack location by following the memory accesses.
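
The propagation idea can be sketched in C like this.  It is a toy model
under stated assumptions, not the implementation in this series: the
demo_* types are hypothetical, registers are indexed by DWARF number, and
only loads of the form "mov off(%src), %dst" are modeled.

```c
#include <stddef.h>

struct demo_type;

/* Hypothetical member record: pointee is non-NULL for pointer members. */
struct demo_field {
	size_t offset;
	const struct demo_type *pointee;
};

struct demo_type {
	const char *name;
	const struct demo_field *fields;
	size_t nr_fields;
};

#define DEMO_NR_REGS 16

/* Per-register state: the struct type each register points to, or NULL. */
struct demo_state {
	const struct demo_type *reg[DEMO_NR_REGS];
};

/*
 * Model "mov off(%src), %dst".  Returns the type being accessed through
 * src (NULL if unknown) and updates dst with the pointee type of the
 * loaded member, or clears it when nothing is known.
 */
static const struct demo_type *
demo_track_load(struct demo_state *st, int src, size_t off, int dst)
{
	const struct demo_type *base, *result = NULL;

	if (src < 0 || src >= DEMO_NR_REGS || dst < 0 || dst >= DEMO_NR_REGS)
		return NULL;

	base = st->reg[src];
	if (base) {
		for (size_t i = 0; i < base->nr_fields; i++) {
			if (base->fields[i].offset == off) {
				result = base->fields[i].pointee;
				break;
			}
		}
	}
	st->reg[dst] = result;
	return base;
}
```

Starting with %rdi typed as the struct that bar points to, the first load
gives %rcx the type of the member at 0x18, so the sampled load at
0x12345a can be attributed to that type at offset 0x10.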

Actually I found a discussion in the DWARF mailing list to support
"inverted location lists" and it seems a perfect fit for this project.
It'd be great if new DWARF would provide a way to lookup variable and
type info using a concrete location info (like a register number).

  https://lists.dwarfstd.org/pipermail/dwarf-discuss/2023-June/002278.html 


* Patch structure

Patches 1-5 are cleanups and a fix that can be applied separately.
Patches 6-25 are the main changes in perf report and perf annotate to
support simple cases with a pointer variable.  Patches 26-36 improve it
by handling global and local variables (without a pointer) and some edge
cases.  Patches 37-43 implement instruction tracking to infer the data
type when there's no variable for it.  Patches 47-51 handle
kernel-specific per-cpu variables (only for the current CPU).  Patch 52
is to help debugging and is not intended for merge.


* Limitations and future work

As I said earlier, this work is at a very early stage and has many
limitations and room for improvement.  Basically it uses the objdump
tool to extract location information from the sampled instruction.  And
the parsing code and instruction tracking work on x86 only.
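
For a feel of what extracting the location from objdump text involves,
here is a minimal C sketch that parses a simple AT&T-syntax memory
operand.  demo_parse_mem_operand() is a hypothetical helper, not the
parser in this series, and it deliberately rejects anything other than
the plain "disp(%reg)" or "(%reg)" forms.

```c
#include <stdio.h>

/*
 * Parse an operand such as "0x18(%rdi)" or "(%rcx)" into a displacement
 * and a base register name.  Returns 0 on success, -1 otherwise.
 * Operands with an index register or a segment prefix are rejected.
 */
static int demo_parse_mem_operand(const char *op, long *disp, char reg[16])
{
	int end = 0;

	if (op[0] == '(') {
		*disp = 0;
		if (sscanf(op, "(%%%15[a-z0-9])%n", reg, &end) != 1)
			return -1;
	} else if (sscanf(op, "%li(%%%15[a-z0-9])%n", disp, reg, &end) != 2) {
		return -1;
	}

	/* %n is only stored once the closing parenthesis has matched. */
	return (end > 0 && op[end] == '\0') ? 0 : -1;
}
```

For "0x18(%rdi)" this yields displacement 0x18 and register "rdi", which
are the two pieces of information the DWARF lookup needs.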

In the previous version, I mentioned a performance issue with objdump but
it turned out to be caused by bad usage.  I realized perf passed the
"-l" option to display line numbers along with the disassembly, but data
type profiling doesn't use the line numbers.  So I just got rid of it,
which gives a huge speedup (285s -> 9.5s).

    $ time ./perf.v1 report -s type > output1
    real	4m45.248s
    user	4m0.714s
    sys		0m44.338s

    $ time ./perf.v2 report -s type > output2
    real	0m9.464s
    user	0m3.271s
    sys		0m6.254s

    $ md5sum output*
    1489f96658bfaee61812df9a42fa7812  output1
    1489f96658bfaee61812df9a42fa7812  output2

Interestingly, now GNU objdump outperforms llvm-objdump for some reason.
If you set the perf config annotate.objdump=llvm-objdump as I said in
the v1 cover letter, please remove it now. :)

Even with this change, most of the processing time is still spent in
objdump to get the disassembly.  It'd be nice if we could get the result
without using objdump at all.

Also I only tested it with C programs (mostly vmlinux) and I believe
there are many issues in handling C++ applications.  Probably other
languages (like Rust?) could be supported too.  But even for C programs,
there is room to improve things like support for union and array types,
dealing with type casts and so on.

I think the compiler could generate more DWARF information to help this
kind of analysis.  Like I mentioned, there is no variable for the
intermediate pointers when they are chained: a->b->c.  The chain could
be longer, which makes it hard to track the type from the previous
variable.  If the compiler could generate (artificial) debug entries for
the intermediate pointers with a precise location expression and type
info, it would be really helpful.

And I plan to improve the analysis in perf tools with better integration
with existing commands like perf mem and/or perf c2c.  It'd be pretty
interesting to see per-struct or per-field access patterns for both load
and store events at the same time.  Also using data-source or snoop info
for each struct/field would give some insights on optimizing memory
usage or layout.

There are kernel specific issues too.  Some per-cpu variable accesses
create complex instruction patterns so it was hard to determine which
data/type they accessed.  For now, it only parses simple patterns for
this-cpu access using the %gs segment register.  Also it should handle
self-modifying code like kprobes, ftrace, live patching and so on.  I
guess they would usually create an out-of-line copy of the modified
instructions but that needs more checking.  And I have no idea about the
status of struct layout randomization and the DWARF info of the
resulting struct.  Maybe there are more issues I'm not aware of, so
please let me know if you notice something.


* Summary

Despite all the issues, I believe this would be a good addition to our
performance toolset.  It would help to observe memory overheads from a
different angle and to optimize memory usage.  I'm really looking
forward to hearing any feedback.

The code is available at 'perf/data-profile-v2' branch in

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Enjoy,
Namhyung


Cc: Ben Woodard <woodard@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: David Blaikie <blaikie@google.com>
Cc: Xu Liu <xliuprof@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Jason Merrill <jason@redhat.com>


Namhyung Kim (52):
  perf annotate: Pass "-l" option to objdump conditionally
  perf annotate: Move raw_comment and raw_func_start
  perf tools: Add util/debuginfo.[ch] files
  perf dwarf-aux: Fix die_get_typename() for void *
  perf dwarf-aux: Move #ifdef code to the header file
  perf dwarf-aux: Add die_get_scopes() helper
  perf dwarf-aux: Add die_find_variable_by_reg() helper
  perf build: Add feature check for dwarf_getcfi()
  perf probe: Convert to check dwarf_getcfi feature
  perf dwarf-aux: Factor out die_get_typename_from_type()
  perf dwarf-regs: Add get_dwarf_regnum()
  perf annotate-data: Add find_data_type()
  perf annotate-data: Add dso->data_types tree
  perf annotate: Factor out evsel__get_arch()
  perf annotate: Check if operand has multiple regs
  perf annotate: Add annotate_get_insn_location()
  perf annotate: Implement hist_entry__get_data_type()
  perf report: Add 'type' sort key
  perf report: Support data type profiling
  perf annotate-data: Add member field in the data type
  perf annotate-data: Update sample histogram for type
  perf report: Add 'typeoff' sort key
  perf report: Add 'symoff' sort key
  perf annotate: Add --data-type option
  perf annotate: Support event group display
  perf annotate: Add --type-stat option for debugging
  perf annotate: Add --insn-stat option for debugging
  perf annotate-data: Parse 'lock' prefix from llvm-objdump
  perf annotate-data: Handle macro fusion on x86
  perf annotate-data: Handle array style accesses
  perf annotate-data: Add stack operation pseudo type
  perf dwarf-aux: Add die_find_variable_by_addr()
  perf annotate-data: Handle PC-relative addressing
  perf annotate-data: Support global variables
  perf dwarf-aux: Add die_get_cfa()
  perf annotate-data: Support stack variables
  perf dwarf-aux: Check allowed DWARF Ops
  perf dwarf-aux: Add die_collect_vars()
  perf dwarf-aux: Handle type transfer for memory access
  perf annotate-data: Introduce struct data_loc_info
  perf map: Add map__objdump_2rip()
  perf annotate: Add annotate_get_basic_blocks()
  perf annotate-data: Maintain variable type info
  perf annotate-data: Add update_insn_state()
  perf annotate-data: Handle global variable access
  perf annotate-data: Handle call instructions
  perf annotate-data: Implement instruction tracking
  perf annotate: Parse x86 segment register location
  perf annotate-data: Handle this-cpu variables in kernel
  perf annotate-data: Track instructions with a this-cpu variable
  perf annotate-data: Add stack canary type
  perf annotate-data: Add debug message

 tools/build/Makefile.feature                  |    1 +
 tools/build/feature/Makefile                  |    4 +
 tools/build/feature/test-dwarf_getcfi.c       |    9 +
 tools/perf/Documentation/perf-annotate.txt    |   11 +
 tools/perf/Documentation/perf-report.txt      |    3 +
 tools/perf/Makefile.config                    |    5 +
 .../arch/loongarch/annotate/instructions.c    |    6 +-
 tools/perf/arch/x86/util/dwarf-regs.c         |   38 +
 tools/perf/builtin-annotate.c                 |  217 ++-
 tools/perf/builtin-report.c                   |   21 +-
 tools/perf/util/Build                         |    2 +
 tools/perf/util/annotate-data.c               | 1221 +++++++++++++++++
 tools/perf/util/annotate-data.h               |  222 +++
 tools/perf/util/annotate.c                    |  786 ++++++++++-
 tools/perf/util/annotate.h                    |  104 +-
 tools/perf/util/debuginfo.c                   |  205 +++
 tools/perf/util/debuginfo.h                   |   64 +
 tools/perf/util/dso.c                         |    4 +
 tools/perf/util/dso.h                         |    2 +
 tools/perf/util/dwarf-aux.c                   |  566 +++++++-
 tools/perf/util/dwarf-aux.h                   |   92 +-
 tools/perf/util/dwarf-regs.c                  |   34 +
 tools/perf/util/hist.h                        |    3 +
 tools/perf/util/include/dwarf-regs.h          |   19 +
 tools/perf/util/map.c                         |   20 +
 tools/perf/util/map.h                         |    3 +
 tools/perf/util/probe-finder.c                |  201 +--
 tools/perf/util/probe-finder.h                |   19 +-
 tools/perf/util/sort.c                        |  202 ++-
 tools/perf/util/sort.h                        |    7 +
 tools/perf/util/symbol_conf.h                 |    4 +-
 31 files changed, 3830 insertions(+), 265 deletions(-)
 create mode 100644 tools/build/feature/test-dwarf_getcfi.c
 create mode 100644 tools/perf/util/annotate-data.c
 create mode 100644 tools/perf/util/annotate-data.h
 create mode 100644 tools/perf/util/debuginfo.c
 create mode 100644 tools/perf/util/debuginfo.h


base-commit: 6512b6aa237db36d881a81cc312db39668e61853
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 01/52] perf annotate: Pass "-l" option to objdump conditionally
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 02/52] perf annotate: Move raw_comment and raw_func_start Namhyung Kim
                   ` (51 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

The "-l" option makes objdump print line numbers in its output.  Only
the perf annotate TUI can show the line numbers later, but the option
causes a big slowdown for the kernel binary.

Similarly, showing source code also takes a long time and it
already has an option to control it.

  $ time objdump ... -d -S -C vmlinux > /dev/null
  real	0m3.474s
  user	0m3.047s
  sys	0m0.428s

  $ time objdump ... -d -l -C vmlinux > /dev/null
  real	0m1.796s
  user	0m1.459s
  sys	0m0.338s

  $ time objdump ... -d -C vmlinux > /dev/null
  real	0m0.051s
  user	0m0.036s
  sys	0m0.016s

As it's not needed for data type profiling, let's make it conditional
so that it can skip the unnecessary work.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 9b68b8e3791c..118195c787b9 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2144,12 +2144,13 @@ static int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
 	err = asprintf(&command,
 		 "%s %s%s --start-address=0x%016" PRIx64
 		 " --stop-address=0x%016" PRIx64
-		 " -l -d %s %s %s %c%s%c %s%s -C \"$1\"",
+		 " %s -d %s %s %s %c%s%c %s%s -C \"$1\"",
 		 opts->objdump_path ?: "objdump",
 		 opts->disassembler_style ? "-M " : "",
 		 opts->disassembler_style ?: "",
 		 map__rip_2objdump(map, sym->start),
 		 map__rip_2objdump(map, sym->end),
+		 opts->show_linenr ? "-l" : "",
 		 opts->show_asm_raw ? "" : "--no-show-raw-insn",
 		 opts->annotate_src ? "-S" : "",
 		 opts->prefix ? "--prefix " : "",
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 02/52] perf annotate: Move raw_comment and raw_func_start
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
  2023-11-09 23:59 ` [PATCH 01/52] perf annotate: Pass "-l" option to objdump conditionally Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 03/52] perf tools: Add util/debuginfo.[ch] files Namhyung Kim
                   ` (50 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains, Huacai Chen, WANG Rui

These two fields are used only for the jump_ops, so move them into the
union to save some bytes.  Also add a jump__delete() callback so that
the fields are not freed, as they didn't allocate new strings.

Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Rui <wangrui@loongson.cn>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 .../perf/arch/loongarch/annotate/instructions.c |  6 +++---
 tools/perf/util/annotate.c                      | 17 +++++++++++++----
 tools/perf/util/annotate.h                      |  6 ++++--
 3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/tools/perf/arch/loongarch/annotate/instructions.c b/tools/perf/arch/loongarch/annotate/instructions.c
index 98e19c5366ac..21cc7e4149f7 100644
--- a/tools/perf/arch/loongarch/annotate/instructions.c
+++ b/tools/perf/arch/loongarch/annotate/instructions.c
@@ -61,10 +61,10 @@ static int loongarch_jump__parse(struct arch *arch, struct ins_operands *ops, st
 	const char *c = strchr(ops->raw, '#');
 	u64 start, end;
 
-	ops->raw_comment = strchr(ops->raw, arch->objdump.comment_char);
-	ops->raw_func_start = strchr(ops->raw, '<');
+	ops->jump.raw_comment = strchr(ops->raw, arch->objdump.comment_char);
+	ops->jump.raw_func_start = strchr(ops->raw, '<');
 
-	if (ops->raw_func_start && c > ops->raw_func_start)
+	if (ops->jump.raw_func_start && c > ops->jump.raw_func_start)
 		c = NULL;
 
 	if (c++ != NULL)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 118195c787b9..3364edf30f50 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -340,10 +340,10 @@ bool ins__is_call(const struct ins *ins)
  */
 static inline const char *validate_comma(const char *c, struct ins_operands *ops)
 {
-	if (ops->raw_comment && c > ops->raw_comment)
+	if (ops->jump.raw_comment && c > ops->jump.raw_comment)
 		return NULL;
 
-	if (ops->raw_func_start && c > ops->raw_func_start)
+	if (ops->jump.raw_func_start && c > ops->jump.raw_func_start)
 		return NULL;
 
 	return c;
@@ -359,8 +359,8 @@ static int jump__parse(struct arch *arch, struct ins_operands *ops, struct map_s
 	const char *c = strchr(ops->raw, ',');
 	u64 start, end;
 
-	ops->raw_comment = strchr(ops->raw, arch->objdump.comment_char);
-	ops->raw_func_start = strchr(ops->raw, '<');
+	ops->jump.raw_comment = strchr(ops->raw, arch->objdump.comment_char);
+	ops->jump.raw_func_start = strchr(ops->raw, '<');
 
 	c = validate_comma(c, ops);
 
@@ -462,7 +462,16 @@ static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
 			 ops->target.offset);
 }
 
+static void jump__delete(struct ins_operands *ops __maybe_unused)
+{
+	/*
+	 * The ops->jump.raw_comment and ops->jump.raw_func_start belong to the
+	 * raw string, don't free them.
+	 */
+}
+
 static struct ins_ops jump_ops = {
+	.free	   = jump__delete,
 	.parse	   = jump__parse,
 	.scnprintf = jump__scnprintf,
 };
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index de59c1aff08e..bc8b95e8b1be 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -31,8 +31,6 @@ struct ins {
 
 struct ins_operands {
 	char	*raw;
-	char	*raw_comment;
-	char	*raw_func_start;
 	struct {
 		char	*raw;
 		char	*name;
@@ -52,6 +50,10 @@ struct ins_operands {
 			struct ins	    ins;
 			struct ins_operands *ops;
 		} locked;
+		struct {
+			char	*raw_comment;
+			char	*raw_func_start;
+		} jump;
 	};
 };
 
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 03/52] perf tools: Add util/debuginfo.[ch] files
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
  2023-11-09 23:59 ` [PATCH 01/52] perf annotate: Pass "-l" option to objdump conditionally Namhyung Kim
  2023-11-09 23:59 ` [PATCH 02/52] perf annotate: Move raw_comment and raw_func_start Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 04/52] perf dwarf-aux: Fix die_get_typename() for void * Namhyung Kim
                   ` (49 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Split the debuginfo data structure and related functions into a separate
file so that they can be used by code other than the probe-finder.

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/Build          |   1 +
 tools/perf/util/debuginfo.c    | 205 +++++++++++++++++++++++++++++++++
 tools/perf/util/debuginfo.h    |  64 ++++++++++
 tools/perf/util/probe-finder.c | 193 +------------------------------
 tools/perf/util/probe-finder.h |  19 +--
 5 files changed, 272 insertions(+), 210 deletions(-)
 create mode 100644 tools/perf/util/debuginfo.c
 create mode 100644 tools/perf/util/debuginfo.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 96058f949ec9..73e3f194f949 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -195,6 +195,7 @@ endif
 perf-$(CONFIG_DWARF) += probe-finder.o
 perf-$(CONFIG_DWARF) += dwarf-aux.o
 perf-$(CONFIG_DWARF) += dwarf-regs.o
+perf-$(CONFIG_DWARF) += debuginfo.o
 
 perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
 perf-$(CONFIG_LOCAL_LIBUNWIND)    += unwind-libunwind-local.o
diff --git a/tools/perf/util/debuginfo.c b/tools/perf/util/debuginfo.c
new file mode 100644
index 000000000000..19acf4775d35
--- /dev/null
+++ b/tools/perf/util/debuginfo.c
@@ -0,0 +1,205 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * DWARF debug information handling code.  Copied from probe-finder.c.
+ *
+ * Written by Masami Hiramatsu <mhiramat@redhat.com>
+ */
+
+#include <errno.h>
+#include <fcntl.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <linux/zalloc.h>
+
+#include "build-id.h"
+#include "dso.h"
+#include "debug.h"
+#include "debuginfo.h"
+#include "symbol.h"
+
+#ifdef HAVE_DEBUGINFOD_SUPPORT
+#include <elfutils/debuginfod.h>
+#endif
+
+/* Dwarf FL wrappers */
+static char *debuginfo_path;	/* Currently dummy */
+
+static const Dwfl_Callbacks offline_callbacks = {
+	.find_debuginfo = dwfl_standard_find_debuginfo,
+	.debuginfo_path = &debuginfo_path,
+
+	.section_address = dwfl_offline_section_address,
+
+	/* We use this table for core files too.  */
+	.find_elf = dwfl_build_id_find_elf,
+};
+
+/* Get a Dwarf from offline image */
+static int debuginfo__init_offline_dwarf(struct debuginfo *dbg,
+					 const char *path)
+{
+	GElf_Addr dummy;
+	int fd;
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0)
+		return fd;
+
+	dbg->dwfl = dwfl_begin(&offline_callbacks);
+	if (!dbg->dwfl)
+		goto error;
+
+	dwfl_report_begin(dbg->dwfl);
+	dbg->mod = dwfl_report_offline(dbg->dwfl, "", "", fd);
+	if (!dbg->mod)
+		goto error;
+
+	dbg->dbg = dwfl_module_getdwarf(dbg->mod, &dbg->bias);
+	if (!dbg->dbg)
+		goto error;
+
+	dwfl_module_build_id(dbg->mod, &dbg->build_id, &dummy);
+
+	dwfl_report_end(dbg->dwfl, NULL, NULL);
+
+	return 0;
+error:
+	if (dbg->dwfl)
+		dwfl_end(dbg->dwfl);
+	else
+		close(fd);
+	memset(dbg, 0, sizeof(*dbg));
+
+	return -ENOENT;
+}
+
+static struct debuginfo *__debuginfo__new(const char *path)
+{
+	struct debuginfo *dbg = zalloc(sizeof(*dbg));
+	if (!dbg)
+		return NULL;
+
+	if (debuginfo__init_offline_dwarf(dbg, path) < 0)
+		zfree(&dbg);
+	if (dbg)
+		pr_debug("Open Debuginfo file: %s\n", path);
+	return dbg;
+}
+
+enum dso_binary_type distro_dwarf_types[] = {
+	DSO_BINARY_TYPE__FEDORA_DEBUGINFO,
+	DSO_BINARY_TYPE__UBUNTU_DEBUGINFO,
+	DSO_BINARY_TYPE__OPENEMBEDDED_DEBUGINFO,
+	DSO_BINARY_TYPE__BUILDID_DEBUGINFO,
+	DSO_BINARY_TYPE__MIXEDUP_UBUNTU_DEBUGINFO,
+	DSO_BINARY_TYPE__NOT_FOUND,
+};
+
+struct debuginfo *debuginfo__new(const char *path)
+{
+	enum dso_binary_type *type;
+	char buf[PATH_MAX], nil = '\0';
+	struct dso *dso;
+	struct debuginfo *dinfo = NULL;
+	struct build_id bid;
+
+	/* Try to open distro debuginfo files */
+	dso = dso__new(path);
+	if (!dso)
+		goto out;
+
+	/* Set the build id for DSO_BINARY_TYPE__BUILDID_DEBUGINFO */
+	if (is_regular_file(path) && filename__read_build_id(path, &bid) > 0)
+		dso__set_build_id(dso, &bid);
+
+	for (type = distro_dwarf_types;
+	     !dinfo && *type != DSO_BINARY_TYPE__NOT_FOUND;
+	     type++) {
+		if (dso__read_binary_type_filename(dso, *type, &nil,
+						   buf, PATH_MAX) < 0)
+			continue;
+		dinfo = __debuginfo__new(buf);
+	}
+	dso__put(dso);
+
+out:
+	/* if failed to open all distro debuginfo, open given binary */
+	return dinfo ? : __debuginfo__new(path);
+}
+
+void debuginfo__delete(struct debuginfo *dbg)
+{
+	if (dbg) {
+		if (dbg->dwfl)
+			dwfl_end(dbg->dwfl);
+		free(dbg);
+	}
+}
+
+/* For the kernel module, we need a special code to get a DIE */
+int debuginfo__get_text_offset(struct debuginfo *dbg, Dwarf_Addr *offs,
+				bool adjust_offset)
+{
+	int n, i;
+	Elf32_Word shndx;
+	Elf_Scn *scn;
+	Elf *elf;
+	GElf_Shdr mem, *shdr;
+	const char *p;
+
+	elf = dwfl_module_getelf(dbg->mod, &dbg->bias);
+	if (!elf)
+		return -EINVAL;
+
+	/* Get the number of relocations */
+	n = dwfl_module_relocations(dbg->mod);
+	if (n < 0)
+		return -ENOENT;
+	/* Search the relocation related .text section */
+	for (i = 0; i < n; i++) {
+		p = dwfl_module_relocation_info(dbg->mod, i, &shndx);
+		if (strcmp(p, ".text") == 0) {
+			/* OK, get the section header */
+			scn = elf_getscn(elf, shndx);
+			if (!scn)
+				return -ENOENT;
+			shdr = gelf_getshdr(scn, &mem);
+			if (!shdr)
+				return -ENOENT;
+			*offs = shdr->sh_addr;
+			if (adjust_offset)
+				*offs -= shdr->sh_offset;
+		}
+	}
+	return 0;
+}
+
+#ifdef HAVE_DEBUGINFOD_SUPPORT
+int get_source_from_debuginfod(const char *raw_path,
+			       const char *sbuild_id, char **new_path)
+{
+	debuginfod_client *c = debuginfod_begin();
+	const char *p = raw_path;
+	int fd;
+
+	if (!c)
+		return -ENOMEM;
+
+	fd = debuginfod_find_source(c, (const unsigned char *)sbuild_id,
+				0, p, new_path);
+	pr_debug("Search %s from debuginfod -> %d\n", p, fd);
+	if (fd >= 0)
+		close(fd);
+	debuginfod_end(c);
+	if (fd < 0) {
+		pr_debug("Failed to find %s in debuginfod (%s)\n",
+			raw_path, sbuild_id);
+		return -ENOENT;
+	}
+	pr_debug("Got a source %s\n", *new_path);
+
+	return 0;
+}
+#endif /* HAVE_DEBUGINFOD_SUPPORT */
diff --git a/tools/perf/util/debuginfo.h b/tools/perf/util/debuginfo.h
new file mode 100644
index 000000000000..4d65b8c605fc
--- /dev/null
+++ b/tools/perf/util/debuginfo.h
@@ -0,0 +1,64 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _PERF_DEBUGINFO_H
+#define _PERF_DEBUGINFO_H
+
+#include <errno.h>
+#include <linux/compiler.h>
+
+#ifdef HAVE_DWARF_SUPPORT
+
+#include "dwarf-aux.h"
+
+/* debug information structure */
+struct debuginfo {
+	Dwarf		*dbg;
+	Dwfl_Module	*mod;
+	Dwfl		*dwfl;
+	Dwarf_Addr	bias;
+	const unsigned char	*build_id;
+};
+
+/* This also tries to open distro debuginfo */
+struct debuginfo *debuginfo__new(const char *path);
+void debuginfo__delete(struct debuginfo *dbg);
+
+int debuginfo__get_text_offset(struct debuginfo *dbg, Dwarf_Addr *offs,
+			       bool adjust_offset);
+
+#else /* HAVE_DWARF_SUPPORT */
+
+/* dummy debug information structure */
+struct debuginfo {
+};
+
+static inline struct debuginfo *debuginfo__new(const char *path __maybe_unused)
+{
+	return NULL;
+}
+
+static inline void debuginfo__delete(struct debuginfo *dbg __maybe_unused)
+{
+}
+
+static inline int debuginfo__get_text_offset(struct debuginfo *dbg __maybe_unused,
+					     Dwarf_Addr *offs __maybe_unused,
+					     bool adjust_offset __maybe_unused)
+{
+	return -EINVAL;
+}
+
+#endif /* HAVE_DWARF_SUPPORT */
+
+#ifdef HAVE_DEBUGINFOD_SUPPORT
+int get_source_from_debuginfod(const char *raw_path, const char *sbuild_id,
+			       char **new_path);
+#else /* HAVE_DEBUGINFOD_SUPPORT */
+static inline int get_source_from_debuginfod(const char *raw_path __maybe_unused,
+					     const char *sbuild_id __maybe_unused,
+					     char **new_path __maybe_unused)
+{
+	return -ENOTSUP;
+}
+#endif /* HAVE_DEBUGINFOD_SUPPORT */
+
+#endif /* _PERF_DEBUGINFO_H */
diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index f171360b0ef4..8d3dd85f9ff4 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -23,6 +23,7 @@
 #include "event.h"
 #include "dso.h"
 #include "debug.h"
+#include "debuginfo.h"
 #include "intlist.h"
 #include "strbuf.h"
 #include "strlist.h"
@@ -31,128 +32,9 @@
 #include "probe-file.h"
 #include "string2.h"
 
-#ifdef HAVE_DEBUGINFOD_SUPPORT
-#include <elfutils/debuginfod.h>
-#endif
-
 /* Kprobe tracer basic type is up to u64 */
 #define MAX_BASIC_TYPE_BITS	64
 
-/* Dwarf FL wrappers */
-static char *debuginfo_path;	/* Currently dummy */
-
-static const Dwfl_Callbacks offline_callbacks = {
-	.find_debuginfo = dwfl_standard_find_debuginfo,
-	.debuginfo_path = &debuginfo_path,
-
-	.section_address = dwfl_offline_section_address,
-
-	/* We use this table for core files too.  */
-	.find_elf = dwfl_build_id_find_elf,
-};
-
-/* Get a Dwarf from offline image */
-static int debuginfo__init_offline_dwarf(struct debuginfo *dbg,
-					 const char *path)
-{
-	GElf_Addr dummy;
-	int fd;
-
-	fd = open(path, O_RDONLY);
-	if (fd < 0)
-		return fd;
-
-	dbg->dwfl = dwfl_begin(&offline_callbacks);
-	if (!dbg->dwfl)
-		goto error;
-
-	dwfl_report_begin(dbg->dwfl);
-	dbg->mod = dwfl_report_offline(dbg->dwfl, "", "", fd);
-	if (!dbg->mod)
-		goto error;
-
-	dbg->dbg = dwfl_module_getdwarf(dbg->mod, &dbg->bias);
-	if (!dbg->dbg)
-		goto error;
-
-	dwfl_module_build_id(dbg->mod, &dbg->build_id, &dummy);
-
-	dwfl_report_end(dbg->dwfl, NULL, NULL);
-
-	return 0;
-error:
-	if (dbg->dwfl)
-		dwfl_end(dbg->dwfl);
-	else
-		close(fd);
-	memset(dbg, 0, sizeof(*dbg));
-
-	return -ENOENT;
-}
-
-static struct debuginfo *__debuginfo__new(const char *path)
-{
-	struct debuginfo *dbg = zalloc(sizeof(*dbg));
-	if (!dbg)
-		return NULL;
-
-	if (debuginfo__init_offline_dwarf(dbg, path) < 0)
-		zfree(&dbg);
-	if (dbg)
-		pr_debug("Open Debuginfo file: %s\n", path);
-	return dbg;
-}
-
-enum dso_binary_type distro_dwarf_types[] = {
-	DSO_BINARY_TYPE__FEDORA_DEBUGINFO,
-	DSO_BINARY_TYPE__UBUNTU_DEBUGINFO,
-	DSO_BINARY_TYPE__OPENEMBEDDED_DEBUGINFO,
-	DSO_BINARY_TYPE__BUILDID_DEBUGINFO,
-	DSO_BINARY_TYPE__MIXEDUP_UBUNTU_DEBUGINFO,
-	DSO_BINARY_TYPE__NOT_FOUND,
-};
-
-struct debuginfo *debuginfo__new(const char *path)
-{
-	enum dso_binary_type *type;
-	char buf[PATH_MAX], nil = '\0';
-	struct dso *dso;
-	struct debuginfo *dinfo = NULL;
-	struct build_id bid;
-
-	/* Try to open distro debuginfo files */
-	dso = dso__new(path);
-	if (!dso)
-		goto out;
-
-	/* Set the build id for DSO_BINARY_TYPE__BUILDID_DEBUGINFO */
-	if (is_regular_file(path) && filename__read_build_id(path, &bid) > 0)
-		dso__set_build_id(dso, &bid);
-
-	for (type = distro_dwarf_types;
-	     !dinfo && *type != DSO_BINARY_TYPE__NOT_FOUND;
-	     type++) {
-		if (dso__read_binary_type_filename(dso, *type, &nil,
-						   buf, PATH_MAX) < 0)
-			continue;
-		dinfo = __debuginfo__new(buf);
-	}
-	dso__put(dso);
-
-out:
-	/* if failed to open all distro debuginfo, open given binary */
-	return dinfo ? : __debuginfo__new(path);
-}
-
-void debuginfo__delete(struct debuginfo *dbg)
-{
-	if (dbg) {
-		if (dbg->dwfl)
-			dwfl_end(dbg->dwfl);
-		free(dbg);
-	}
-}
-
 /*
  * Probe finder related functions
  */
@@ -1677,44 +1559,6 @@ int debuginfo__find_available_vars_at(struct debuginfo *dbg,
 	return (ret < 0) ? ret : af.nvls;
 }
 
-/* For the kernel module, we need a special code to get a DIE */
-int debuginfo__get_text_offset(struct debuginfo *dbg, Dwarf_Addr *offs,
-				bool adjust_offset)
-{
-	int n, i;
-	Elf32_Word shndx;
-	Elf_Scn *scn;
-	Elf *elf;
-	GElf_Shdr mem, *shdr;
-	const char *p;
-
-	elf = dwfl_module_getelf(dbg->mod, &dbg->bias);
-	if (!elf)
-		return -EINVAL;
-
-	/* Get the number of relocations */
-	n = dwfl_module_relocations(dbg->mod);
-	if (n < 0)
-		return -ENOENT;
-	/* Search the relocation related .text section */
-	for (i = 0; i < n; i++) {
-		p = dwfl_module_relocation_info(dbg->mod, i, &shndx);
-		if (strcmp(p, ".text") == 0) {
-			/* OK, get the section header */
-			scn = elf_getscn(elf, shndx);
-			if (!scn)
-				return -ENOENT;
-			shdr = gelf_getshdr(scn, &mem);
-			if (!shdr)
-				return -ENOENT;
-			*offs = shdr->sh_addr;
-			if (adjust_offset)
-				*offs -= shdr->sh_offset;
-		}
-	}
-	return 0;
-}
-
 /* Reverse search */
 int debuginfo__find_probe_point(struct debuginfo *dbg, u64 addr,
 				struct perf_probe_point *ppt)
@@ -2009,41 +1853,6 @@ int debuginfo__find_line_range(struct debuginfo *dbg, struct line_range *lr)
 	return (ret < 0) ? ret : lf.found;
 }
 
-#ifdef HAVE_DEBUGINFOD_SUPPORT
-/* debuginfod doesn't require the comp_dir but buildid is required */
-static int get_source_from_debuginfod(const char *raw_path,
-				const char *sbuild_id, char **new_path)
-{
-	debuginfod_client *c = debuginfod_begin();
-	const char *p = raw_path;
-	int fd;
-
-	if (!c)
-		return -ENOMEM;
-
-	fd = debuginfod_find_source(c, (const unsigned char *)sbuild_id,
-				0, p, new_path);
-	pr_debug("Search %s from debuginfod -> %d\n", p, fd);
-	if (fd >= 0)
-		close(fd);
-	debuginfod_end(c);
-	if (fd < 0) {
-		pr_debug("Failed to find %s in debuginfod (%s)\n",
-			raw_path, sbuild_id);
-		return -ENOENT;
-	}
-	pr_debug("Got a source %s\n", *new_path);
-
-	return 0;
-}
-#else
-static inline int get_source_from_debuginfod(const char *raw_path __maybe_unused,
-				const char *sbuild_id __maybe_unused,
-				char **new_path __maybe_unused)
-{
-	return -ENOTSUP;
-}
-#endif
 /*
  * Find a src file from a DWARF tag path. Prepend optional source path prefix
  * and chop off leading directories that do not exist. Result is passed back as
diff --git a/tools/perf/util/probe-finder.h b/tools/perf/util/probe-finder.h
index 8bc1c80d3c1c..3add5ff516e1 100644
--- a/tools/perf/util/probe-finder.h
+++ b/tools/perf/util/probe-finder.h
@@ -24,21 +24,7 @@ static inline int is_c_varname(const char *name)
 #ifdef HAVE_DWARF_SUPPORT
 
 #include "dwarf-aux.h"
-
-/* TODO: export debuginfo data structure even if no dwarf support */
-
-/* debug information structure */
-struct debuginfo {
-	Dwarf		*dbg;
-	Dwfl_Module	*mod;
-	Dwfl		*dwfl;
-	Dwarf_Addr	bias;
-	const unsigned char	*build_id;
-};
-
-/* This also tries to open distro debuginfo */
-struct debuginfo *debuginfo__new(const char *path);
-void debuginfo__delete(struct debuginfo *dbg);
+#include "debuginfo.h"
 
 /* Find probe_trace_events specified by perf_probe_event from debuginfo */
 int debuginfo__find_trace_events(struct debuginfo *dbg,
@@ -49,9 +35,6 @@ int debuginfo__find_trace_events(struct debuginfo *dbg,
 int debuginfo__find_probe_point(struct debuginfo *dbg, u64 addr,
 				struct perf_probe_point *ppt);
 
-int debuginfo__get_text_offset(struct debuginfo *dbg, Dwarf_Addr *offs,
-			       bool adjust_offset);
-
 /* Find a line range */
 int debuginfo__find_line_range(struct debuginfo *dbg, struct line_range *lr);
 
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 04/52] perf dwarf-aux: Fix die_get_typename() for void *
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (2 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 03/52] perf tools: Add util/debuginfo.[ch] files Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 05/52] perf dwarf-aux: Move #ifdef code to the header file Namhyung Kim
                   ` (48 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

die_get_typename() returns a C-like type name for a DWARF debug entry,
and it follows the referenced data type if the target entry is a pointer
type.  But void pointers don't have a type attribute to follow, so the
function returns an error in that case.  This results in a broken type
string for void pointer types.

For example, the following type entries are pointer types.

 <1><48c>: Abbrev Number: 4 (DW_TAG_pointer_type)
    <48d>   DW_AT_byte_size   : 8
    <48d>   DW_AT_type        : <0x481>
 <1><491>: Abbrev Number: 211 (DW_TAG_pointer_type)
    <493>   DW_AT_byte_size   : 8
 <1><494>: Abbrev Number: 4 (DW_TAG_pointer_type)
    <495>   DW_AT_byte_size   : 8
    <495>   DW_AT_type        : <0x49e>

The first one at offset 48c and the third one at offset 494 have type
information, so they are pointers to the referenced types.  But the
second one at offset 491 doesn't have the type attribute, which is how
a void pointer is encoded.
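
As a rough illustration of the fallback, here is a minimal, self-contained
sketch.  It does not use libdw: struct toy_type, enum toy_tag and
toy_typename() are hypothetical stand-ins invented for this example, and
a NULL type pointer models a missing DW_AT_type attribute.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a DWARF type entry (not Dwarf_Die). */
enum toy_tag { TOY_TAG_BASE, TOY_TAG_POINTER };

struct toy_type {
	enum toy_tag tag;
	const char *name;		/* set for base types only */
	const struct toy_type *type;	/* models DW_AT_type; NULL when absent */
};

/* Write a C-like type name into buf; return 0 or a negative errno. */
static int toy_typename(const struct toy_type *t, char *buf, size_t len)
{
	int ret;

	assert(buf != NULL && len > 0);

	if (t == NULL)
		return -ENOENT;		/* no type attribute to follow */

	if (t->tag == TOY_TAG_BASE) {
		snprintf(buf, len, "%s", t->name);
		return 0;
	}

	/* Pointer type: name the referenced type first, then append "*". */
	ret = toy_typename(t->type, buf, len);
	if (ret < 0) {
		/* A pointer without a referenced type is a void pointer. */
		if (ret == -ENOENT) {
			snprintf(buf, len, "void*");
			return 0;
		}
		return ret;
	}
	strncat(buf, "*", len - strlen(buf) - 1);
	return 0;
}
```

With this model, a pointer whose type field is NULL yields "void*" instead
of an error, and a pointer to such a pointer yields "void**".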

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 2941d88f2199..4849c3bbfd95 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1090,7 +1090,14 @@ int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf)
 		return strbuf_addf(buf, "%s%s", tmp, name ?: "");
 	}
 	ret = die_get_typename(&type, buf);
-	return ret ? ret : strbuf_addstr(buf, tmp);
+	if (ret < 0) {
+		/* void pointer has no type attribute */
+		if (tag == DW_TAG_pointer_type && ret == -ENOENT)
+			return strbuf_addf(buf, "void*");
+
+		return ret;
+	}
+	return strbuf_addstr(buf, tmp);
 }
 
 /**
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 05/52] perf dwarf-aux: Move #ifdef code to the header file
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (3 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 04/52] perf dwarf-aux: Fix die_get_typename() for void * Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 06/52] perf dwarf-aux: Add die_get_scopes() helper Namhyung Kim
                   ` (47 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

It's a usual convention that the conditional code is handled in a header
file.  As I'm planning to add some more of them, let's move the current
code to the header first.

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c |  7 -------
 tools/perf/util/dwarf-aux.h | 19 +++++++++++++++++--
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 4849c3bbfd95..adef2635587d 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1245,13 +1245,6 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf)
 out:
 	return ret;
 }
-#else
-int die_get_var_range(Dwarf_Die *sp_die __maybe_unused,
-		      Dwarf_Die *vr_die __maybe_unused,
-		      struct strbuf *buf __maybe_unused)
-{
-	return -ENOTSUP;
-}
 #endif
 
 /*
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index 7ec8bc1083bb..4f5d0211ee4f 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -121,7 +121,6 @@ int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf);
 
 /* Get the name and type of given variable DIE, stored as "type\tname" */
 int die_get_varname(Dwarf_Die *vr_die, struct strbuf *buf);
-int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf);
 
 /* Check if target program is compiled with optimization */
 bool die_is_optimized_target(Dwarf_Die *cu_die);
@@ -130,4 +129,20 @@ bool die_is_optimized_target(Dwarf_Die *cu_die);
 void die_skip_prologue(Dwarf_Die *sp_die, Dwarf_Die *cu_die,
 		       Dwarf_Addr *entrypc);
 
-#endif
+#ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT
+
+/* Get byte offset range of given variable DIE */
+int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf);
+
+#else /*  HAVE_DWARF_GETLOCATIONS_SUPPORT */
+
+static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused,
+				    Dwarf_Die *vr_die __maybe_unused,
+				    struct strbuf *buf __maybe_unused)
+{
+	return -ENOTSUP;
+}
+
+#endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
+
+#endif /* _DWARF_AUX_H */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 06/52] perf dwarf-aux: Add die_get_scopes() helper
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (4 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 05/52] perf dwarf-aux: Move #ifdef code to the header file Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 07/52] perf dwarf-aux: Add die_find_variable_by_reg() helper Namhyung Kim
                   ` (46 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

die_get_scopes() returns the number of DIEs enclosing the given address
and fills an array with those DIEs, like dwarf_getscopes().  But it
doesn't follow the abstract origin of inlined functions, as we want
information about the concrete instance.  This is needed to check the
location of parameters and local variables properly.  Users can check
the origin separately if needed.
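
The walk can be sketched without libdw as follows.  This is only an
illustration under simplifying assumptions: struct toy_scope and
toy_get_scopes() are hypothetical names, and each scope is modeled as a
single [low, high) address range instead of a real DIE.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy scope node; the real code walks libdw Dwarf_Die entries. */
struct toy_scope {
	unsigned long low, high;		/* covers [low, high) */
	const struct toy_scope *child;		/* first child scope */
	const struct toy_scope *sibling;	/* next scope at the same level */
};

/*
 * Collect copies of the scopes containing pc, outermost first (index 0).
 * Returns the number of scopes and stores the array in *scopes, which the
 * caller must free().  On allocation failure it stops early and returns
 * what was collected so far.
 */
static int toy_get_scopes(const struct toy_scope *root, unsigned long pc,
			  struct toy_scope **scopes)
{
	struct toy_scope *found = NULL;
	const struct toy_scope *s = root;
	int nr = 0;

	assert(scopes != NULL);

	while (s) {
		if (pc >= s->low && pc < s->high) {
			struct toy_scope *tmp;

			tmp = realloc(found, (nr + 1) * sizeof(*tmp));
			if (tmp == NULL)
				break;
			tmp[nr++] = *s;
			found = tmp;
			s = s->child;	/* descend into the matching scope */
		} else {
			s = s->sibling;	/* try the next scope at this level */
		}
	}

	*scopes = found;
	return nr;
}
```

The last element of the returned array is the innermost scope at pc, which
is where a lookup of parameters and local variables would start.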

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c | 53 +++++++++++++++++++++++++++++++++++++
 tools/perf/util/dwarf-aux.h |  3 +++
 2 files changed, 56 insertions(+)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index adef2635587d..10aa32334d6f 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1425,3 +1425,56 @@ void die_skip_prologue(Dwarf_Die *sp_die, Dwarf_Die *cu_die,
 
 	*entrypc = postprologue_addr;
 }
+
+/* Internal parameters for __die_find_scope_cb() */
+struct find_scope_data {
+	/* Target instruction address */
+	Dwarf_Addr pc;
+	/* Number of scopes found [output] */
+	int nr;
+	/* Array of scopes found, 0 for the outermost one. [output] */
+	Dwarf_Die *scopes;
+};
+
+static int __die_find_scope_cb(Dwarf_Die *die_mem, void *arg)
+{
+	struct find_scope_data *data = arg;
+
+	if (dwarf_haspc(die_mem, data->pc)) {
+		Dwarf_Die *tmp;
+
+		tmp = realloc(data->scopes, (data->nr + 1) * sizeof(*tmp));
+		if (tmp == NULL)
+			return DIE_FIND_CB_END;
+
+		memcpy(tmp + data->nr, die_mem, sizeof(*die_mem));
+		data->scopes = tmp;
+		data->nr++;
+		return DIE_FIND_CB_CHILD;
+	}
+	return DIE_FIND_CB_SIBLING;
+}
+
+/**
+ * die_get_scopes - Return a list of scopes including the address
+ * @cu_die: a compile unit DIE
+ * @pc: the address to find
+ * @scopes: the array of DIEs for scopes (result)
+ *
+ * This function does the same as the dwarf_getscopes() but doesn't follow
+ * the origins of inlined functions.  It returns the number of scopes saved
+ * in the @scopes argument.  The outer scope will be saved first (index 0) and
+ * the last one is the innermost scope at the @pc.
+ */
+int die_get_scopes(Dwarf_Die *cu_die, Dwarf_Addr pc, Dwarf_Die **scopes)
+{
+	struct find_scope_data data = {
+		.pc = pc,
+	};
+	Dwarf_Die die_mem;
+
+	die_find_child(cu_die, __die_find_scope_cb, &data, &die_mem);
+
+	*scopes = data.scopes;
+	return data.nr;
+}
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index 4f5d0211ee4f..f9d765f80fb0 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -129,6 +129,9 @@ bool die_is_optimized_target(Dwarf_Die *cu_die);
 void die_skip_prologue(Dwarf_Die *sp_die, Dwarf_Die *cu_die,
 		       Dwarf_Addr *entrypc);
 
+/* Get the list of including scopes */
+int die_get_scopes(Dwarf_Die *cu_die, Dwarf_Addr pc, Dwarf_Die **scopes);
+
 #ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT
 
 /* Get byte offset range of given variable DIE */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 07/52] perf dwarf-aux: Add die_find_variable_by_reg() helper
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (5 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 06/52] perf dwarf-aux: Add die_get_scopes() helper Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 08/52] perf build: Add feature check for dwarf_getcfi() Namhyung Kim
                   ` (45 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

die_find_variable_by_reg() searches the given scope DIE for a variable
or parameter sub-DIE whose location matches the given register.

In the simplest and most common case, a memory access uses a base
register plus an offset to the field, so the register holds a pointer
kept in a variable or function parameter.  We can find that variable if
it has a location expression covering the (instruction) address.  This
function only handles this simple case for now.

In this case, the expression would have a single DW_OP_regN operation
where N < 32.  If the register index (N) is greater than or equal to 32,
a DW_OP_regx operation with an operand holding the value of N would be
used instead.  Expressions with more than one operation are rejected.
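
The matching rule can be sketched in isolation like this.  It is a
simplified model, not the libdw API: struct toy_op is a hypothetical,
reduced stand-in for Dwarf_Op, and the TOY_* macros are local names for
the opcode values defined by the DWARF specification (DW_OP_reg0 is 0x50,
DW_OP_regx is 0x90).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Opcode values from the DWARF specification, under local names. */
#define TOY_DW_OP_reg0	0x50
#define TOY_DW_OP_regx	0x90
/* Number of registers that have a dedicated DW_OP_regN opcode. */
#define TOY_DIRECT_REGS	32

/* Hypothetical, reduced stand-in for libdw's Dwarf_Op. */
struct toy_op {
	unsigned char atom;	/* operation code */
	unsigned long number;	/* first operand, if any */
};

/* Return true if the expression is exactly "value lives in register reg". */
static bool toy_loc_is_reg(const struct toy_op *ops, size_t nops, unsigned reg)
{
	assert(ops != NULL);

	/* Only the simple single-operation case is accepted. */
	if (nops != 1)
		return false;

	if (reg < TOY_DIRECT_REGS)
		return ops->atom == TOY_DW_OP_reg0 + reg;

	return ops->atom == TOY_DW_OP_regx && ops->number == reg;
}
```

For example, a lone opcode 0x55 (DW_OP_reg5) matches register 5, while the
same opcode followed by another operation does not match anything.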

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c | 67 +++++++++++++++++++++++++++++++++++++
 tools/perf/util/dwarf-aux.h | 12 +++++++
 2 files changed, 79 insertions(+)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 10aa32334d6f..652e6e7368a2 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1245,6 +1245,73 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf)
 out:
 	return ret;
 }
+
+/* Internal parameters for __die_find_var_reg_cb() */
+struct find_var_data {
+	/* Target instruction address */
+	Dwarf_Addr pc;
+	/* Target register */
+	unsigned reg;
+};
+
+/* Max number of registers DW_OP_regN supports */
+#define DWARF_OP_DIRECT_REGS  32
+
+/* Only checks direct child DIEs in the given scope. */
+static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
+{
+	struct find_var_data *data = arg;
+	int tag = dwarf_tag(die_mem);
+	ptrdiff_t off = 0;
+	Dwarf_Attribute attr;
+	Dwarf_Addr base, start, end;
+	Dwarf_Op *ops;
+	size_t nops;
+
+	if (tag != DW_TAG_variable && tag != DW_TAG_formal_parameter)
+		return DIE_FIND_CB_SIBLING;
+
+	if (dwarf_attr(die_mem, DW_AT_location, &attr) == NULL)
+		return DIE_FIND_CB_SIBLING;
+
+	while ((off = dwarf_getlocations(&attr, off, &base, &start, &end, &ops, &nops)) > 0) {
+		/* Assuming the location list is sorted by address */
+		if (end < data->pc)
+			continue;
+		if (start > data->pc)
+			break;
+
+		/* Only match with a simple case */
+		if (data->reg < DWARF_OP_DIRECT_REGS) {
+			if (ops->atom == (DW_OP_reg0 + data->reg) && nops == 1)
+				return DIE_FIND_CB_END;
+		} else {
+			if (ops->atom == DW_OP_regx && ops->number == data->reg &&
+			    nops == 1)
+				return DIE_FIND_CB_END;
+		}
+	}
+	return DIE_FIND_CB_SIBLING;
+}
+
+/**
+ * die_find_variable_by_reg - Find a variable saved in a register
+ * @sc_die: a scope DIE
+ * @pc: the program address to find
+ * @reg: the register number to find
+ * @die_mem: a buffer to save the resulting DIE
+ *
+ * Find the variable DIE accessed by the given register.
+ */
+Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
+				    Dwarf_Die *die_mem)
+{
+	struct find_var_data data = {
+		.pc = pc,
+		.reg = reg,
+	};
+	return die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem);
+}
 #endif
 
 /*
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index f9d765f80fb0..b6f430730bd1 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -137,6 +137,10 @@ int die_get_scopes(Dwarf_Die *cu_die, Dwarf_Addr pc, Dwarf_Die **scopes);
 /* Get byte offset range of given variable DIE */
 int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf);
 
+/* Find a variable saved in the 'reg' at given address */
+Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
+				    Dwarf_Die *die_mem);
+
 #else /*  HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
 static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused,
@@ -146,6 +150,14 @@ static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused,
 	return -ENOTSUP;
 }
 
+static inline Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die __maybe_unused,
+						  Dwarf_Addr pc __maybe_unused,
+						  int reg __maybe_unused,
+						  Dwarf_Die *die_mem __maybe_unused)
+{
+	return NULL;
+}
+
 #endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
 #endif /* _DWARF_AUX_H */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 08/52] perf build: Add feature check for dwarf_getcfi()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (6 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 07/52] perf dwarf-aux: Add die_find_variable_by_reg() helper Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-10 10:26   ` Masami Hiramatsu
  2023-11-09 23:59 ` [PATCH 09/52] perf probe: Convert to check dwarf_getcfi feature Namhyung Kim
                   ` (44 subsequent siblings)
  52 siblings, 1 reply; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

dwarf_getcfi() is available in libdw 0.142+.  Instead of just checking
the version number, it'd be nice to have a config item to check the
feature at build time.

Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/build/Makefile.feature            | 1 +
 tools/build/feature/Makefile            | 4 ++++
 tools/build/feature/test-dwarf_getcfi.c | 9 +++++++++
 3 files changed, 14 insertions(+)
 create mode 100644 tools/build/feature/test-dwarf_getcfi.c

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 934e2777a2db..64df118376df 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -32,6 +32,7 @@ FEATURE_TESTS_BASIC :=                  \
         backtrace                       \
         dwarf                           \
         dwarf_getlocations              \
+        dwarf_getcfi                    \
         eventfd                         \
         fortify-source                  \
         get_current_dir_name            \
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index dad79ede4e0a..37722e509eb9 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -7,6 +7,7 @@ FILES=                                          \
          test-bionic.bin                        \
          test-dwarf.bin                         \
          test-dwarf_getlocations.bin            \
+         test-dwarf_getcfi.bin                  \
          test-eventfd.bin                       \
          test-fortify-source.bin                \
          test-get_current_dir_name.bin          \
@@ -154,6 +155,9 @@ endif
 $(OUTPUT)test-dwarf_getlocations.bin:
 	$(BUILD) $(DWARFLIBS)
 
+$(OUTPUT)test-dwarf_getcfi.bin:
+	$(BUILD) $(DWARFLIBS)
+
 $(OUTPUT)test-libelf-getphdrnum.bin:
 	$(BUILD) -lelf
 
diff --git a/tools/build/feature/test-dwarf_getcfi.c b/tools/build/feature/test-dwarf_getcfi.c
new file mode 100644
index 000000000000..50e7d7cb7bdf
--- /dev/null
+++ b/tools/build/feature/test-dwarf_getcfi.c
@@ -0,0 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <elfutils/libdw.h>
+
+int main(void)
+{
+	Dwarf *dwarf = NULL;
+	return dwarf_getcfi(dwarf) == NULL;
+}
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 09/52] perf probe: Convert to check dwarf_getcfi feature
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (7 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 08/52] perf build: Add feature check for dwarf_getcfi() Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-10 10:25   ` Masami Hiramatsu
  2023-11-09 23:59 ` [PATCH 10/52] perf dwarf-aux: Factor out die_get_typename_from_type() Namhyung Kim
                   ` (43 subsequent siblings)
  52 siblings, 1 reply; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Now that there is a feature check for dwarf_getcfi(), use it and convert
the code to check the HAVE_DWARF_CFI_SUPPORT definition.

Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Makefile.config     | 5 +++++
 tools/perf/util/probe-finder.c | 8 ++++----
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 8b6cffbc4858..aa55850fbc21 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -476,6 +476,11 @@ else
       else
         CFLAGS += -DHAVE_DWARF_GETLOCATIONS_SUPPORT
       endif # dwarf_getlocations
+      ifneq ($(feature-dwarf_getcfi), 1)
+        msg := $(warning Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.142);
+      else
+        CFLAGS += -DHAVE_DWARF_CFI_SUPPORT
+      endif # dwarf_getcfi
     endif # Dwarf support
   endif # libelf support
 endif # NO_LIBELF
diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 8d3dd85f9ff4..c8923375e30d 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -604,7 +604,7 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct probe_finder *pf)
 	ret = dwarf_getlocation_addr(&fb_attr, pf->addr, &pf->fb_ops, &nops, 1);
 	if (ret <= 0 || nops == 0) {
 		pf->fb_ops = NULL;
-#if _ELFUTILS_PREREQ(0, 142)
+#ifdef HAVE_DWARF_CFI_SUPPORT
 	} else if (nops == 1 && pf->fb_ops[0].atom == DW_OP_call_frame_cfa &&
 		   (pf->cfi_eh != NULL || pf->cfi_dbg != NULL)) {
 		if ((dwarf_cfi_addrframe(pf->cfi_eh, pf->addr, &frame) != 0 &&
@@ -615,7 +615,7 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct probe_finder *pf)
 			free(frame);
 			return -ENOENT;
 		}
-#endif
+#endif /* HAVE_DWARF_CFI_SUPPORT */
 	}
 
 	/* Call finder's callback handler */
@@ -1140,7 +1140,7 @@ static int debuginfo__find_probes(struct debuginfo *dbg,
 
 	pf->machine = ehdr.e_machine;
 
-#if _ELFUTILS_PREREQ(0, 142)
+#ifdef HAVE_DWARF_CFI_SUPPORT
 	do {
 		GElf_Shdr shdr;
 
@@ -1150,7 +1150,7 @@ static int debuginfo__find_probes(struct debuginfo *dbg,
 
 		pf->cfi_dbg = dwarf_getcfi(dbg->dbg);
 	} while (0);
-#endif
+#endif /* HAVE_DWARF_CFI_SUPPORT */
 
 	ret = debuginfo__find_probe_location(dbg, pf);
 	return ret;
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 10/52] perf dwarf-aux: Factor out die_get_typename_from_type()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (8 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 09/52] perf probe: Convert to check dwarf_getcfi feature Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 11/52] perf dwarf-regs: Add get_dwarf_regnum() Namhyung Kim
                   ` (42 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

die_get_typename_from_type() gets the C-style type name of the given
DIE.  Unlike die_get_typename(), it does not follow DW_AT_type and uses
the given DIE directly.  It is meant for callers that already have the
type DIE.
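
The split can be pictured with a toy model.  This is only a sketch under
assumptions: struct toy_die and the toy_* functions are made up for
illustration and are not the libdw or perf API.  One function names a
type entry directly, the other starts from a variable and follows its
type link first.

```c
#include <stdio.h>
#include <string.h>

/* Toy stand-ins for DWARF tags and DIEs; none of this is the libdw API. */
enum toy_tag { TOY_BASE, TOY_STRUCT, TOY_POINTER, TOY_VARIABLE };

struct toy_die {
	enum toy_tag tag;
	const char *name;		/* NULL for anonymous entries */
	const struct toy_die *type;	/* plays the role of DW_AT_type */
};

/* Name the given type DIE directly.  Returns 0 on success, -1 on error. */
int toy_typename_from_type(const struct toy_die *type_die,
			   char *buf, size_t len)
{
	int n;

	if (type_die->tag == TOY_POINTER) {
		if (type_die->type == NULL) {
			/* A pointer without a target type is "void*" */
			n = snprintf(buf, len, "void*");
			return (n < 0 || (size_t)n >= len) ? -1 : 0;
		}
		/* Name the pointee first, then append the "*" */
		if (toy_typename_from_type(type_die->type, buf, len) < 0)
			return -1;
		if (strlen(buf) + 2 > len)
			return -1;
		strcat(buf, "*");
		return 0;
	}

	n = snprintf(buf, len, "%s%s",
		     type_die->tag == TOY_STRUCT ? "struct " : "",
		     type_die->name ? type_die->name : "");
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Name the type of a variable DIE: follow its type link first. */
int toy_typename(const struct toy_die *var_die, char *buf, size_t len)
{
	if (var_die->type == NULL)
		return -1;
	return toy_typename_from_type(var_die->type, buf, len);
}
```

A caller holding only the variable uses toy_typename(); a caller that
has already resolved the type calls toy_typename_from_type() and skips
the extra lookup.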

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c | 38 ++++++++++++++++++++++++++-----------
 tools/perf/util/dwarf-aux.h |  3 +++
 2 files changed, 30 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 652e6e7368a2..4bdcd3dea28f 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1051,32 +1051,28 @@ Dwarf_Die *die_find_member(Dwarf_Die *st_die, const char *name,
 }
 
 /**
- * die_get_typename - Get the name of given variable DIE
- * @vr_die: a variable DIE
+ * die_get_typename_from_type - Get the name of given type DIE
+ * @type_die: a type DIE
  * @buf: a strbuf for result type name
  *
- * Get the name of @vr_die and stores it to @buf. Return 0 if succeeded.
+ * Get the name of @type_die and store it in @buf. Return 0 if succeeded.
  * and Return -ENOENT if failed to find type name.
  * Note that the result will stores typedef name if possible, and stores
  * "*(function_type)" if the type is a function pointer.
  */
-int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf)
+int die_get_typename_from_type(Dwarf_Die *type_die, struct strbuf *buf)
 {
-	Dwarf_Die type;
 	int tag, ret;
 	const char *tmp = "";
 
-	if (__die_get_real_type(vr_die, &type) == NULL)
-		return -ENOENT;
-
-	tag = dwarf_tag(&type);
+	tag = dwarf_tag(type_die);
 	if (tag == DW_TAG_array_type || tag == DW_TAG_pointer_type)
 		tmp = "*";
 	else if (tag == DW_TAG_subroutine_type) {
 		/* Function pointer */
 		return strbuf_add(buf, "(function_type)", 15);
 	} else {
-		const char *name = dwarf_diename(&type);
+		const char *name = dwarf_diename(type_die);
 
 		if (tag == DW_TAG_union_type)
 			tmp = "union ";
@@ -1089,7 +1085,7 @@ int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf)
 		/* Write a base name */
 		return strbuf_addf(buf, "%s%s", tmp, name ?: "");
 	}
-	ret = die_get_typename(&type, buf);
+	ret = die_get_typename(type_die, buf);
 	if (ret < 0) {
 		/* void pointer has no type attribute */
 		if (tag == DW_TAG_pointer_type && ret == -ENOENT)
@@ -1100,6 +1096,26 @@ int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf)
 	return strbuf_addstr(buf, tmp);
 }
 
+/**
+ * die_get_typename - Get the name of given variable DIE
+ * @vr_die: a variable DIE
+ * @buf: a strbuf for result type name
+ *
+ * Get the type name of @vr_die and store it in @buf.  Return 0 on success,
+ * or -ENOENT if the type name cannot be found.
+ * Note that the result stores the typedef name if possible, and stores
+ * "*(function_type)" if the type is a function pointer.
+ */
+int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf)
+{
+	Dwarf_Die type;
+
+	if (__die_get_real_type(vr_die, &type) == NULL)
+		return -ENOENT;
+
+	return die_get_typename_from_type(&type, buf);
+}
+
 /**
  * die_get_varname - Get the name and type of given variable DIE
  * @vr_die: a variable DIE
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index b6f430730bd1..f9763d3b7572 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -116,6 +116,9 @@ Dwarf_Die *die_find_variable_at(Dwarf_Die *sp_die, const char *name,
 Dwarf_Die *die_find_member(Dwarf_Die *st_die, const char *name,
 			   Dwarf_Die *die_mem);
 
+/* Get the name of given type DIE */
+int die_get_typename_from_type(Dwarf_Die *type_die, struct strbuf *buf);
+
 /* Get the name of given variable DIE */
 int die_get_typename(Dwarf_Die *vr_die, struct strbuf *buf);
 
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 11/52] perf dwarf-regs: Add get_dwarf_regnum()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (9 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 10/52] perf dwarf-aux: Factor out die_get_typename_from_type() Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 12/52] perf annotate-data: Add find_data_type() Namhyung Kim
                   ` (41 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

get_dwarf_regnum() returns the DWARF register number for a register
name string according to the psABI.  Also add two pseudo encodings:
DWARF_REG_PC for the register used by PC-relative addressing, and
DWARF_REG_FB for the frame base register.  They need to be handled in a
special way.
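
For illustration, the lookup can be sketched as below.  This is a
reduced, standalone version with a hypothetical name
(toy_dwarf_regnum) and only a handful of registers; it assumes
AT&T-style operands as printed by objdump, where a register starts with
'%' and may be followed by ' ', ',' or ')'.  The numbers follow the
x86-64 psABI DWARF mapping.

```c
#include <string.h>

struct toy_reg {
	const char *name;
	int idx;
};

/* A few x86-64 registers with their psABI DWARF numbers */
static const struct toy_reg toy_regs[] = {
	{ "rax", 0 }, { "eax", 0 },
	{ "rdx", 1 }, { "rcx", 2 },
	{ "rbx", 3 }, { "ebx", 3 },
	{ "rsi", 4 }, { "rdi", 5 },
	{ "rbp", 6 }, { "rsp", 7 },
};

/* Returns the DWARF register number, or -1 if @operand is not known */
int toy_dwarf_regnum(const char *operand)
{
	char name[16];
	size_t len, i;

	if (operand[0] != '%')
		return -1;

	/* Cut the name at the first trailing separator */
	len = strcspn(operand + 1, " ,)");
	if (len == 0 || len >= sizeof(name))
		return -1;
	memcpy(name, operand + 1, len);
	name[len] = '\0';

	for (i = 0; i < sizeof(toy_regs) / sizeof(toy_regs[0]); i++)
		if (!strcmp(toy_regs[i].name, name))
			return toy_regs[i].idx;
	return -1;
}
```

Stripping the trailing characters lets a caller pass a pointer into the
middle of an operand such as "0x10(%rbx)," without copying it first.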

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/arch/x86/util/dwarf-regs.c | 38 +++++++++++++++++++++++++++
 tools/perf/util/dwarf-regs.c          | 34 ++++++++++++++++++++++++
 tools/perf/util/include/dwarf-regs.h  | 19 ++++++++++++++
 3 files changed, 91 insertions(+)

diff --git a/tools/perf/arch/x86/util/dwarf-regs.c b/tools/perf/arch/x86/util/dwarf-regs.c
index 530934805710..399c4a0a29d8 100644
--- a/tools/perf/arch/x86/util/dwarf-regs.c
+++ b/tools/perf/arch/x86/util/dwarf-regs.c
@@ -113,3 +113,41 @@ int regs_query_register_offset(const char *name)
 			return roff->offset;
 	return -EINVAL;
 }
+
+struct dwarf_regs_idx {
+	const char *name;
+	int idx;
+};
+
+static const struct dwarf_regs_idx x86_regidx_table[] = {
+	{ "rax", 0 }, { "eax", 0 }, { "ax", 0 }, { "al", 0 },
+	{ "rdx", 1 }, { "edx", 1 }, { "dx", 1 }, { "dl", 1 },
+	{ "rcx", 2 }, { "ecx", 2 }, { "cx", 2 }, { "cl", 2 },
+	{ "rbx", 3 }, { "ebx", 3 }, { "bx", 3 }, { "bl", 3 },
+	{ "rsi", 4 }, { "esi", 4 }, { "si", 4 }, { "sil", 4 },
+	{ "rdi", 5 }, { "edi", 5 }, { "di", 5 }, { "dil", 5 },
+	{ "rbp", 6 }, { "ebp", 6 }, { "bp", 6 }, { "bpl", 6 },
+	{ "rsp", 7 }, { "esp", 7 }, { "sp", 7 }, { "spl", 7 },
+	{ "r8", 8 }, { "r8d", 8 }, { "r8w", 8 }, { "r8b", 8 },
+	{ "r9", 9 }, { "r9d", 9 }, { "r9w", 9 }, { "r9b", 9 },
+	{ "r10", 10 }, { "r10d", 10 }, { "r10w", 10 }, { "r10b", 10 },
+	{ "r11", 11 }, { "r11d", 11 }, { "r11w", 11 }, { "r11b", 11 },
+	{ "r12", 12 }, { "r12d", 12 }, { "r12w", 12 }, { "r12b", 12 },
+	{ "r13", 13 }, { "r13d", 13 }, { "r13w", 13 }, { "r13b", 13 },
+	{ "r14", 14 }, { "r14d", 14 }, { "r14w", 14 }, { "r14b", 14 },
+	{ "r15", 15 }, { "r15d", 15 }, { "r15w", 15 }, { "r15b", 15 },
+	{ "rip", DWARF_REG_PC },
+};
+
+int get_arch_regnum(const char *name)
+{
+	unsigned int i;
+
+	if (*name != '%')
+		return -EINVAL;
+
+	for (i = 0; i < ARRAY_SIZE(x86_regidx_table); i++)
+		if (!strcmp(x86_regidx_table[i].name, name + 1))
+			return x86_regidx_table[i].idx;
+	return -ENOENT;
+}
diff --git a/tools/perf/util/dwarf-regs.c b/tools/perf/util/dwarf-regs.c
index 69cfaa5953bf..5b7f86c0063f 100644
--- a/tools/perf/util/dwarf-regs.c
+++ b/tools/perf/util/dwarf-regs.c
@@ -5,9 +5,12 @@
  * Written by: Masami Hiramatsu <mhiramat@kernel.org>
  */
 
+#include <stdlib.h>
+#include <string.h>
 #include <debug.h>
 #include <dwarf-regs.h>
 #include <elf.h>
+#include <errno.h>
 #include <linux/kernel.h>
 
 #ifndef EM_AARCH64
@@ -68,3 +71,34 @@ const char *get_dwarf_regstr(unsigned int n, unsigned int machine)
 	}
 	return NULL;
 }
+
+__weak int get_arch_regnum(const char *name __maybe_unused)
+{
+	return -ENOTSUP;
+}
+
+/* Return DWARF register number from architecture register name */
+int get_dwarf_regnum(const char *name, unsigned int machine)
+{
+	char *regname = strdup(name);
+	int reg = -1;
+	char *p;
+
+	if (regname == NULL)
+		return -EINVAL;
+
+	/* For convenience, remove trailing characters */
+	p = strpbrk(regname, " ,)");
+	if (p)
+		*p = '\0';
+
+	switch (machine) {
+	case EM_NONE:	/* Generic arch - use host arch */
+		reg = get_arch_regnum(regname);
+		break;
+	default:
+		pr_err("ELF MACHINE %x is not supported.\n", machine);
+	}
+	free(regname);
+	return reg;
+}
diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
index 7d99a084e82d..01fb25a1150a 100644
--- a/tools/perf/util/include/dwarf-regs.h
+++ b/tools/perf/util/include/dwarf-regs.h
@@ -2,6 +2,9 @@
 #ifndef _PERF_DWARF_REGS_H_
 #define _PERF_DWARF_REGS_H_
 
+#define DWARF_REG_PC  0xd3af9c /* random number */
+#define DWARF_REG_FB  0xd3affb /* random number */
+
 #ifdef HAVE_DWARF_SUPPORT
 const char *get_arch_regstr(unsigned int n);
 /*
@@ -10,6 +13,22 @@ const char *get_arch_regstr(unsigned int n);
  * machine: ELF machine signature (EM_*)
  */
 const char *get_dwarf_regstr(unsigned int n, unsigned int machine);
+
+int get_arch_regnum(const char *name);
+/*
+ * get_dwarf_regnum - Returns DWARF regnum from register name
+ * name: architecture register name
+ * machine: ELF machine signature (EM_*)
+ */
+int get_dwarf_regnum(const char *name, unsigned int machine);
+
+#else /* HAVE_DWARF_SUPPORT */
+
+static inline int get_dwarf_regnum(const char *name __maybe_unused,
+				   unsigned int machine __maybe_unused)
+{
+	return -1;
+}
 #endif
 
 #ifdef HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 12/52] perf annotate-data: Add find_data_type()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (10 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 11/52] perf dwarf-regs: Add get_dwarf_regnum() Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
       [not found]   ` <CA+JHD90fkWNrQWO5DrHeV8mCmFyKKqJ8fV=KwztRi7TSw+8yDg@mail.gmail.com>
  2023-11-09 23:59 ` [PATCH 13/52] perf annotate-data: Add dso->data_types tree Namhyung Kim
                   ` (40 subsequent siblings)
  52 siblings, 1 reply; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

find_data_type() gets the data type of the memory access at the given
address (IP) using a register and an offset.  It requires DWARF debug
info in the DSO and searches the variables and function parameters in
the scopes enclosing the address.

In pseudo code, it basically does the following:

  find_data_type(dso, ip, reg, offset)
  {
      pc = map__rip_2objdump(ip);
      CU = dwarf_addrdie(dso->dwarf, pc);
      scopes = die_get_scopes(CU, pc);
      for_each_scope(S, scopes) {
          V = die_find_variable_by_reg(S, pc, reg);
          if (V && V.type == pointer_type) {
              T = die_get_real_type(V);
              if (offset < T.size)
                  return T;
          }
      }
      return NULL;
  }
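
The scope walk and the offset check in the pseudo code above can be
tried out with a toy model.  This is a sketch under assumptions, not the
patch's code: the toy_* structures are made up and replace the DWARF
scopes and variables, and the outer array index stands for the nesting
level (index 0 is the outermost scope).

```c
#include <stddef.h>

struct toy_type {
	const char *name;
	size_t size;
};

struct toy_var {
	int reg;			/* register holding the variable */
	const struct toy_type *pointee;	/* NULL if not a pointer */
};

struct toy_scope {
	const struct toy_var *vars;
	size_t nr_vars;
};

/*
 * Search from the innermost scope outwards for a variable in @reg.
 * The first match decides: it must be a pointer and @offset must fall
 * inside the pointed-to type, otherwise the lookup fails.
 */
const struct toy_type *toy_find_data_type(const struct toy_scope *scopes,
					  int nr_scopes, int reg, int offset)
{
	int i;
	size_t j;

	for (i = nr_scopes - 1; i >= 0; i--) {
		for (j = 0; j < scopes[i].nr_vars; j++) {
			const struct toy_var *v = &scopes[i].vars[j];

			if (v->reg != reg)
				continue;
			if (v->pointee == NULL)
				return NULL;
			if (offset < 0 || (size_t)offset >= v->pointee->size)
				return NULL;
			return v->pointee;
		}
	}
	return NULL;
}
```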

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/Build           |   1 +
 tools/perf/util/annotate-data.c | 163 ++++++++++++++++++++++++++++++++
 tools/perf/util/annotate-data.h |  40 ++++++++
 3 files changed, 204 insertions(+)
 create mode 100644 tools/perf/util/annotate-data.c
 create mode 100644 tools/perf/util/annotate-data.h

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 73e3f194f949..5cf000302080 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -196,6 +196,7 @@ perf-$(CONFIG_DWARF) += probe-finder.o
 perf-$(CONFIG_DWARF) += dwarf-aux.o
 perf-$(CONFIG_DWARF) += dwarf-regs.o
 perf-$(CONFIG_DWARF) += debuginfo.o
+perf-$(CONFIG_DWARF) += annotate-data.o
 
 perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
 perf-$(CONFIG_LOCAL_LIBUNWIND)    += unwind-libunwind-local.o
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
new file mode 100644
index 000000000000..98c42dff2645
--- /dev/null
+++ b/tools/perf/util/annotate-data.c
@@ -0,0 +1,163 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Convert sample address to data type using DWARF debug info.
+ *
+ * Written by Namhyung Kim <namhyung@kernel.org>
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#include "annotate-data.h"
+#include "debuginfo.h"
+#include "debug.h"
+#include "dso.h"
+#include "map.h"
+#include "map_symbol.h"
+#include "strbuf.h"
+#include "symbol.h"
+
+static bool find_cu_die(struct debuginfo *di, u64 pc, Dwarf_Die *cu_die)
+{
+	Dwarf_Off off, next_off;
+	size_t header_size;
+
+	if (dwarf_addrdie(di->dbg, pc, cu_die) != NULL)
+		return true;
+
+	/*
+	 * Some kernels don't have full aranges and contain only a few aranges
+	 * entries.  Fall back to iterating over all CU entries in .debug_info
+	 * in case it's missing.
+	 */
+	off = 0;
+	while (dwarf_nextcu(di->dbg, off, &next_off, &header_size,
+			    NULL, NULL, NULL) == 0) {
+		if (dwarf_offdie(di->dbg, off + header_size, cu_die) &&
+		    dwarf_haspc(cu_die, pc))
+			return true;
+
+		off = next_off;
+	}
+	return false;
+}
+
+/* The type info will be saved in @type_die */
+static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
+{
+	Dwarf_Word size;
+
+	/* Get the type of the variable */
+	if (die_get_real_type(var_die, type_die) == NULL) {
+		pr_debug("variable has no type\n");
+		return -1;
+	}
+
+	/*
+	 * It expects a pointer type for a memory access.
+	 * Convert to a real type it points to.
+	 */
+	if (dwarf_tag(type_die) != DW_TAG_pointer_type ||
+	    die_get_real_type(type_die, type_die) == NULL) {
+		pr_debug("no pointer or no type\n");
+		return -1;
+	}
+
+	/* Get the size of the actual type */
+	if (dwarf_aggregate_size(type_die, &size) < 0) {
+		pr_debug("type size is unknown\n");
+		return -1;
+	}
+
+	/* Minimal sanity check */
+	if ((unsigned)offset >= size) {
+		pr_debug("offset: %d is bigger than size: %lu\n", offset, size);
+		return -1;
+	}
+
+	return 0;
+}
+
+/* The result will be saved in @type_die */
+static int find_data_type_die(struct debuginfo *di, u64 pc,
+			      int reg, int offset, Dwarf_Die *type_die)
+{
+	Dwarf_Die cu_die, var_die;
+	Dwarf_Die *scopes = NULL;
+	int ret = -1;
+	int i, nr_scopes;
+
+	/* Get a compile_unit for this address */
+	if (!find_cu_die(di, pc, &cu_die)) {
+		pr_debug("cannot find CU for address %lx\n", pc);
+		return -1;
+	}
+
+	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
+	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
+
+	/* Search from the inner-most scope to the outer */
+	for (i = nr_scopes - 1; i >= 0; i--) {
+		/* Look up variables/parameters in this scope */
+		if (!die_find_variable_by_reg(&scopes[i], pc, reg, &var_die))
+			continue;
+
+		/* Found a variable, see if it's correct */
+		ret = check_variable(&var_die, type_die, offset);
+		break;
+	}
+
+	free(scopes);
+	return ret;
+}
+
+/**
+ * find_data_type - Return a data type at the location
+ * @ms: map and symbol at the location
+ * @ip: instruction address of the memory access
+ * @reg: register that holds the base address
+ * @offset: offset from the base address
+ *
+ * This function searches the debug information of the binary to get the data
+ * type it accesses.  The exact location is expressed by (ip, reg, offset).
+ * It returns %NULL if not found.
+ */
+struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
+					   int reg, int offset)
+{
+	struct annotated_data_type *result = NULL;
+	struct dso *dso = ms->map->dso;
+	struct debuginfo *di;
+	Dwarf_Die type_die;
+	struct strbuf sb;
+	u64 pc;
+
+	di = debuginfo__new(dso->long_name);
+	if (di == NULL) {
+		pr_debug("cannot get the debug info\n");
+		return NULL;
+	}
+
+	/*
+	 * IP is a relative instruction address from the start of the map, as
+	 * it can be randomized/relocated, it needs to translate to PC which is
+	 * a file address for DWARF processing.
+	 */
+	pc = map__rip_2objdump(ms->map, ip);
+	if (find_data_type_die(di, pc, reg, offset, &type_die) < 0)
+		goto out;
+
+	result = zalloc(sizeof(*result));
+	if (result == NULL)
+		goto out;
+
+	strbuf_init(&sb, 32);
+	if (die_get_typename_from_type(&type_die, &sb) < 0)
+		strbuf_add(&sb, "(unknown type)", 14);
+
+	result->type_name = strbuf_detach(&sb, NULL);
+
+out:
+	debuginfo__delete(di);
+	return result;
+}
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
new file mode 100644
index 000000000000..633147f78ca5
--- /dev/null
+++ b/tools/perf/util/annotate-data.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _PERF_ANNOTATE_DATA_H
+#define _PERF_ANNOTATE_DATA_H
+
+#include <errno.h>
+#include <linux/compiler.h>
+#include <linux/types.h>
+
+struct map_symbol;
+
+/**
+ * struct annotated_data_type - Data type to profile
+ * @type_name: Name of the data type
+ * @type_size: Size of the data type
+ *
+ * This represents a data type accessed by samples in the profile data.
+ */
+struct annotated_data_type {
+	char *type_name;
+	int type_size;
+};
+
+#ifdef HAVE_DWARF_SUPPORT
+
+/* Returns data type at the location (ip, reg, offset) */
+struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
+					   int reg, int offset);
+
+#else /* HAVE_DWARF_SUPPORT */
+
+static inline struct annotated_data_type *
+find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
+	       int reg __maybe_unused, int offset __maybe_unused)
+{
+	return NULL;
+}
+
+#endif /* HAVE_DWARF_SUPPORT */
+
+#endif /* _PERF_ANNOTATE_DATA_H */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 13/52] perf annotate-data: Add dso->data_types tree
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (11 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 12/52] perf annotate-data: Add find_data_type() Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-12-21 20:10   ` Arnaldo Carvalho de Melo
  2023-11-09 23:59 ` [PATCH 14/52] perf annotate: Factor out evsel__get_arch() Namhyung Kim
                   ` (39 subsequent siblings)
  52 siblings, 1 reply; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

To aggregate accesses to the same data type, add a 'data_types' tree to
the DSO to maintain data types and find them by name and size.  There
might be different data types that happen to have the same name, so it
also compares the size of the type.  Even though that is not a 100%
guarantee, it reduces the possibility of mishandling such conflicts.
And I don't think it's common to have different types with the same name.
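
The (size, name) ordering can be sketched outside of perf.  Note the
swap: the patch keeps the types in an rbtree, while this standalone
sketch uses a sorted array with qsort()/bsearch() from libc so that it
builds on its own.  The toy_* names are made up for illustration.

```c
#include <stdlib.h>
#include <string.h>

struct toy_data_type {
	const char *name;
	int size;
};

/* Order by size first, then by name, like the tree comparison */
int toy_data_type_cmp(const void *a, const void *b)
{
	const struct toy_data_type *ta = a;
	const struct toy_data_type *tb = b;

	if (ta->size != tb->size)
		return ta->size < tb->size ? -1 : 1;
	return strcmp(ta->name, tb->name);
}

/* Look up a type by name and size in an array sorted with the above */
const struct toy_data_type *
toy_lookup_type(const struct toy_data_type *sorted, size_t nr,
		const char *name, int size)
{
	struct toy_data_type key = { name, size };

	return bsearch(&key, sorted, nr, sizeof(*sorted), toy_data_type_cmp);
}
```

With this key, two types called "foo" with sizes 8 and 16 stay separate
entries, while two same-sized types that share a name would still be
merged.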

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 95 +++++++++++++++++++++++++++++----
 tools/perf/util/annotate-data.h |  9 ++++
 tools/perf/util/dso.c           |  4 ++
 tools/perf/util/dso.h           |  2 +
 4 files changed, 100 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 98c42dff2645..475cc30b33e1 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -17,6 +17,76 @@
 #include "strbuf.h"
 #include "symbol.h"
 
+/*
+ * Compare type name and size to maintain them in a tree.
+ * I'm not sure if DWARF would have information of a single type in many
+ * different places (compilation units).  If not, it could compare the
+ * offset of the type entry in the .debug_info section.
+ */
+static int data_type_cmp(const void *_key, const struct rb_node *node)
+{
+	const struct annotated_data_type *key = _key;
+	struct annotated_data_type *type;
+
+	type = rb_entry(node, struct annotated_data_type, node);
+
+	if (key->type_size != type->type_size)
+		return key->type_size - type->type_size;
+	return strcmp(key->type_name, type->type_name);
+}
+
+static bool data_type_less(struct rb_node *node_a, const struct rb_node *node_b)
+{
+	struct annotated_data_type *a, *b;
+
+	a = rb_entry(node_a, struct annotated_data_type, node);
+	b = rb_entry(node_b, struct annotated_data_type, node);
+
+	if (a->type_size != b->type_size)
+		return a->type_size < b->type_size;
+	return strcmp(a->type_name, b->type_name) < 0;
+}
+
+static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
+							  Dwarf_Die *type_die)
+{
+	struct annotated_data_type *result = NULL;
+	struct annotated_data_type key;
+	struct rb_node *node;
+	struct strbuf sb;
+	char *type_name;
+	Dwarf_Word size;
+
+	strbuf_init(&sb, 32);
+	if (die_get_typename_from_type(type_die, &sb) < 0)
+		strbuf_add(&sb, "(unknown type)", 14);
+	type_name = strbuf_detach(&sb, NULL);
+	dwarf_aggregate_size(type_die, &size);
+
+	/* Check existing nodes in dso->data_types tree */
+	key.type_name = type_name;
+	key.type_size = size;
+	node = rb_find(&key, &dso->data_types, data_type_cmp);
+	if (node) {
+		result = rb_entry(node, struct annotated_data_type, node);
+		free(type_name);
+		return result;
+	}
+
+	/* If not, add a new one */
+	result = zalloc(sizeof(*result));
+	if (result == NULL) {
+		free(type_name);
+		return NULL;
+	}
+
+	result->type_name = type_name;
+	result->type_size = size;
+
+	rb_add(&result->node, &dso->data_types, data_type_less);
+	return result;
+}
+
 static bool find_cu_die(struct debuginfo *di, u64 pc, Dwarf_Die *cu_die)
 {
 	Dwarf_Off off, next_off;
@@ -129,7 +199,6 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	struct dso *dso = ms->map->dso;
 	struct debuginfo *di;
 	Dwarf_Die type_die;
-	struct strbuf sb;
 	u64 pc;
 
 	di = debuginfo__new(dso->long_name);
@@ -147,17 +216,23 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	if (find_data_type_die(di, pc, reg, offset, &type_die) < 0)
 		goto out;
 
-	result = zalloc(sizeof(*result));
-	if (result == NULL)
-		goto out;
-
-	strbuf_init(&sb, 32);
-	if (die_get_typename_from_type(&type_die, &sb) < 0)
-		strbuf_add(&sb, "(unknown type)", 14);
-
-	result->type_name = strbuf_detach(&sb, NULL);
+	result = dso__findnew_data_type(dso, &type_die);
 
 out:
 	debuginfo__delete(di);
 	return result;
 }
+
+void annotated_data_type__tree_delete(struct rb_root *root)
+{
+	struct annotated_data_type *pos;
+
+	while (!RB_EMPTY_ROOT(root)) {
+		struct rb_node *node = rb_first(root);
+
+		rb_erase(node, root);
+		pos = rb_entry(node, struct annotated_data_type, node);
+		free(pos->type_name);
+		free(pos);
+	}
+}
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 633147f78ca5..ab9f187bd7f1 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -4,6 +4,7 @@
 
 #include <errno.h>
 #include <linux/compiler.h>
+#include <linux/rbtree.h>
 #include <linux/types.h>
 
 struct map_symbol;
@@ -16,6 +17,7 @@ struct map_symbol;
  * This represents a data type accessed by samples in the profile data.
  */
 struct annotated_data_type {
+	struct rb_node node;
 	char *type_name;
 	int type_size;
 };
@@ -26,6 +28,9 @@ struct annotated_data_type {
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 					   int reg, int offset);
 
+/* Release all data type information in the tree */
+void annotated_data_type__tree_delete(struct rb_root *root);
+
 #else /* HAVE_DWARF_SUPPORT */
 
 static inline struct annotated_data_type *
@@ -35,6 +40,10 @@ find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
 	return NULL;
 }
 
+static inline void annotated_data_type__tree_delete(struct rb_root *root __maybe_unused)
+{
+}
+
 #endif /* HAVE_DWARF_SUPPORT */
 
 #endif /* _PERF_ANNOTATE_DATA_H */
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 1f629b6fb7cf..22fd5fa806ed 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -31,6 +31,7 @@
 #include "debug.h"
 #include "string2.h"
 #include "vdso.h"
+#include "annotate-data.h"
 
 static const char * const debuglink_paths[] = {
 	"%.0s%s",
@@ -1327,6 +1328,7 @@ struct dso *dso__new_id(const char *name, struct dso_id *id)
 		dso->data.cache = RB_ROOT;
 		dso->inlined_nodes = RB_ROOT_CACHED;
 		dso->srclines = RB_ROOT_CACHED;
+		dso->data_types = RB_ROOT;
 		dso->data.fd = -1;
 		dso->data.status = DSO_DATA_STATUS_UNKNOWN;
 		dso->symtab_type = DSO_BINARY_TYPE__NOT_FOUND;
@@ -1370,6 +1372,8 @@ void dso__delete(struct dso *dso)
 	symbols__delete(&dso->symbols);
 	dso->symbol_names_len = 0;
 	zfree(&dso->symbol_names);
+	annotated_data_type__tree_delete(&dso->data_types);
+
 	if (dso->short_name_allocated) {
 		zfree((char **)&dso->short_name);
 		dso->short_name_allocated = false;
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 3759de8c2267..ce9f3849a773 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -154,6 +154,8 @@ struct dso {
 	size_t		 symbol_names_len;
 	struct rb_root_cached inlined_nodes;
 	struct rb_root_cached srclines;
+	struct rb_root	data_types;
+
 	struct {
 		u64		addr;
 		struct symbol	*symbol;
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 14/52] perf annotate: Factor out evsel__get_arch()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (12 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 13/52] perf annotate-data: Add dso->data_types tree Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-12-23 14:14   ` Arnaldo Carvalho de Melo
  2023-11-09 23:59 ` [PATCH 15/52] perf annotate: Check if operand has multiple regs Namhyung Kim
                   ` (38 subsequent siblings)
  52 siblings, 1 reply; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

evsel__get_arch() gets the architecture info from the perf environment.
It'll be used in other places later, so let's factor it out.

Also add arch__is() to check the arch info by name.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 44 +++++++++++++++++++++++++++-----------
 tools/perf/util/annotate.h |  2 ++
 2 files changed, 33 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 3364edf30f50..83e0996992af 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -804,6 +804,11 @@ static struct arch *arch__find(const char *name)
 	return bsearch(name, architectures, nmemb, sizeof(struct arch), arch__key_cmp);
 }
 
+bool arch__is(struct arch *arch, const char *name)
+{
+	return !strcmp(arch->name, name);
+}
+
 static struct annotated_source *annotated_source__new(void)
 {
 	struct annotated_source *src = zalloc(sizeof(*src));
@@ -2340,15 +2345,8 @@ void symbol__calc_percent(struct symbol *sym, struct evsel *evsel)
 	annotation__calc_percent(notes, evsel, symbol__size(sym));
 }
 
-int symbol__annotate(struct map_symbol *ms, struct evsel *evsel,
-		     struct annotation_options *options, struct arch **parch)
+static int evsel__get_arch(struct evsel *evsel, struct arch **parch)
 {
-	struct symbol *sym = ms->sym;
-	struct annotation *notes = symbol__annotation(sym);
-	struct annotate_args args = {
-		.evsel		= evsel,
-		.options	= options,
-	};
 	struct perf_env *env = evsel__env(evsel);
 	const char *arch_name = perf_env__arch(env);
 	struct arch *arch;
@@ -2357,23 +2355,43 @@ int symbol__annotate(struct map_symbol *ms, struct evsel *evsel,
 	if (!arch_name)
 		return errno;
 
-	args.arch = arch = arch__find(arch_name);
+	*parch = arch = arch__find(arch_name);
 	if (arch == NULL) {
 		pr_err("%s: unsupported arch %s\n", __func__, arch_name);
 		return ENOTSUP;
 	}
 
-	if (parch)
-		*parch = arch;
-
 	if (arch->init) {
 		err = arch->init(arch, env ? env->cpuid : NULL);
 		if (err) {
-			pr_err("%s: failed to initialize %s arch priv area\n", __func__, arch->name);
+			pr_err("%s: failed to initialize %s arch priv area\n",
+			       __func__, arch->name);
 			return err;
 		}
 	}
+	return 0;
+}
+
+int symbol__annotate(struct map_symbol *ms, struct evsel *evsel,
+		     struct annotation_options *options, struct arch **parch)
+{
+	struct symbol *sym = ms->sym;
+	struct annotation *notes = symbol__annotation(sym);
+	struct annotate_args args = {
+		.evsel		= evsel,
+		.options	= options,
+	};
+	struct arch *arch = NULL;
+	int err;
+
+	err = evsel__get_arch(evsel, &arch);
+	if (err)
+		return err;
+
+	if (parch)
+		*parch = arch;
 
+	args.arch = arch;
 	args.ms = *ms;
 	if (notes->options && notes->options->full_addr)
 		notes->start = map__objdump_2mem(ms->map, ms->sym->start);
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index bc8b95e8b1be..e8b0173f5f00 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -59,6 +59,8 @@ struct ins_operands {
 
 struct arch;
 
+bool arch__is(struct arch *arch, const char *name);
+
 struct ins_ops {
 	void (*free)(struct ins_operands *ops);
 	int (*parse)(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms);
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 15/52] perf annotate: Check if operand has multiple regs
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (13 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 14/52] perf annotate: Factor out evsel__get_arch() Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 16/52] perf annotate: Add annotate_get_insn_location() Namhyung Kim
                   ` (37 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

It needs to check all possible information in an instruction.  Let's add
a field indicating whether the operand has multiple registers.  It'll be
used to search type information, for example in an array access on x86:

  mov    0x10(%rax,%rbx,8), %rcx
             -------------
                 here
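
As a rough illustration (not part of the patch), the check can be tried
outside perf with a stand-alone function.  The name
operand_has_multi_regs() is hypothetical, and '%' and '(' are hard-coded
here, whereas the patch reads them from arch->objdump:

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical stand-alone version of the check: count register sigils
 * ('%' in AT&T syntax) that appear at or after the first '('.
 */
static bool operand_has_multi_regs(const char *op)
{
	int count = 0;

	/* Skip anything before the memory reference, e.g. "%gs:" */
	op = strchr(op, '(');
	if (op == NULL)
		return false;

	while ((op = strchr(op, '%')) != NULL) {
		count++;
		op++;
	}

	return count > 1;
}
```

With this sketch "0x10(%rax,%rbx,8)" counts as multi-register, while
"%gs:0x5678(%rcx)" does not, because the segment selector sits before
the '('.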

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 36 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/annotate.h |  2 ++
 2 files changed, 38 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 83e0996992af..9e297adc8c59 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -85,6 +85,8 @@ struct arch {
 	struct		{
 		char comment_char;
 		char skip_functions_char;
+		char register_char;
+		char memory_ref_char;
 	} objdump;
 };
 
@@ -188,6 +190,8 @@ static struct arch architectures[] = {
 		.insn_suffix = "bwlq",
 		.objdump =  {
 			.comment_char = '#',
+			.register_char = '%',
+			.memory_ref_char = '(',
 		},
 	},
 	{
@@ -566,6 +570,34 @@ static struct ins_ops lock_ops = {
 	.scnprintf = lock__scnprintf,
 };
 
+/*
+ * Check if the operand has more than one register like x86 SIB addressing:
+ *   0x1234(%rax, %rbx, 8)
+ *
+ * But it doesn't care about segment selectors like %gs:0x5678(%rcx), so just
+ * check the input string after 'memory_ref_char' if it exists.
+ */
+static bool check_multi_regs(struct arch *arch, const char *op)
+{
+	int count = 0;
+
+	if (arch->objdump.register_char == 0)
+		return false;
+
+	if (arch->objdump.memory_ref_char) {
+		op = strchr(op, arch->objdump.memory_ref_char);
+		if (op == NULL)
+			return false;
+	}
+
+	while ((op = strchr(op, arch->objdump.register_char)) != NULL) {
+		count++;
+		op++;
+	}
+
+	return count > 1;
+}
+
 static int mov__parse(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms __maybe_unused)
 {
 	char *s = strchr(ops->raw, ','), *target, *comment, prev;
@@ -593,6 +625,8 @@ static int mov__parse(struct arch *arch, struct ins_operands *ops, struct map_sy
 	if (ops->source.raw == NULL)
 		return -1;
 
+	ops->source.multi_regs = check_multi_regs(arch, ops->source.raw);
+
 	target = skip_spaces(++s);
 	comment = strchr(s, arch->objdump.comment_char);
 
@@ -613,6 +647,8 @@ static int mov__parse(struct arch *arch, struct ins_operands *ops, struct map_sy
 	if (ops->target.raw == NULL)
 		goto out_free_source;
 
+	ops->target.multi_regs = check_multi_regs(arch, ops->target.raw);
+
 	if (comment == NULL)
 		return 0;
 
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index e8b0173f5f00..4ebc6407c68a 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -39,12 +39,14 @@ struct ins_operands {
 		s64	offset;
 		bool	offset_avail;
 		bool	outside;
+		bool	multi_regs;
 	} target;
 	union {
 		struct {
 			char	*raw;
 			char	*name;
 			u64	addr;
+			bool	multi_regs;
 		} source;
 		struct {
 			struct ins	    ins;
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 16/52] perf annotate: Add annotate_get_insn_location()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (14 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 15/52] perf annotate: Check if operand has multiple regs Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 17/52] perf annotate: Implement hist_entry__get_data_type() Namhyung Kim
                   ` (36 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

annotate_get_insn_location() gets detailed location information, such
as registers and offsets, from an instruction.  It keeps the source and
target operand locations in an array.  Each operand can have a
register and an offset.  The offset is meaningful only when the mem_ref
flag is set.
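
To make the OFFSET(REG) parsing concrete, here is a minimal sketch under
some assumptions: struct op_loc_sketch and parse_mem_operand() are
hypothetical names, the AT&T markers '%' and '(' are hard-coded, and the
register is kept as a name string, whereas the patch converts it to a
DWARF register number with get_dwarf_regnum():

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

struct op_loc_sketch {
	char reg[16];	/* register name without the '%' sigil */
	long offset;	/* displacement, 0 when it is omitted */
};

/* Returns 0 and fills @loc on success, -1 if no register is found. */
static int parse_mem_operand(const char *str, struct op_loc_sketch *loc)
{
	const char *p;
	char *end;
	size_t len;

	/* Skip a leading segment selector such as "%gs:" */
	if (*str == '%') {
		while (*str && !isdigit((unsigned char)*str) && *str != '(')
			str++;
	}

	/* strtol() returns 0 and leaves end == str when no digits exist */
	loc->offset = strtol(str, &end, 0);

	p = strchr(end, '%');
	if (p == NULL)
		return -1;
	p++;

	/* The register name ends at ',' (SIB form) or ')' */
	len = strcspn(p, ",)");
	if (len == 0 || len >= sizeof(loc->reg))
		return -1;

	memcpy(loc->reg, p, len);
	loc->reg[len] = '\0';
	return 0;
}
```

Only the first (base) register is reported for a SIB operand, which is
also what the patch does at this stage.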

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 107 +++++++++++++++++++++++++++++++++++++
 tools/perf/util/annotate.h |  36 +++++++++++++
 2 files changed, 143 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 9e297adc8c59..f0c89552087d 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -31,6 +31,7 @@
 #include "bpf-utils.h"
 #include "block-range.h"
 #include "string2.h"
+#include "dwarf-regs.h"
 #include "util/event.h"
 #include "util/sharded_mutex.h"
 #include "arch/common.h"
@@ -3522,3 +3523,109 @@ int annotate_check_args(struct annotation_options *args)
 	}
 	return 0;
 }
+
+/*
+ * Get register number and access offset from the given instruction.
+ * It assumes AT&T x86 asm format like OFFSET(REG).  The format may
+ * need to be revisited when it handles different architectures.
+ * Fills @op_loc->reg and @op_loc->offset when it returns 0.
+ */
+static int extract_reg_offset(struct arch *arch, const char *str,
+			      struct annotated_op_loc *op_loc)
+{
+	char *p;
+	char *regname;
+
+	if (arch->objdump.register_char == 0)
+		return -1;
+
+	/*
+	 * It should start with the offset, but a zero offset can be
+	 * omitted in the asm.  So 0(%rax) is the same as (%rax).
+	 *
+	 * However, it can also start with a segment selector register like
+	 * %gs:0x18(%rbx).  In that case it should skip that part.
+	 */
+	if (*str == arch->objdump.register_char) {
+		while (*str && !isdigit(*str) &&
+		       *str != arch->objdump.memory_ref_char)
+			str++;
+	}
+
+	op_loc->offset = strtol(str, &p, 0);
+
+	p = strchr(p, arch->objdump.register_char);
+	if (p == NULL)
+		return -1;
+
+	regname = strdup(p);
+	if (regname == NULL)
+		return -1;
+
+	op_loc->reg = get_dwarf_regnum(regname, 0);
+	free(regname);
+	return 0;
+}
+
+/**
+ * annotate_get_insn_location - Get location of instruction
+ * @arch: the architecture info
+ * @dl: the target instruction
+ * @loc: a buffer to save the data
+ *
+ * Get detailed location info (register and offset) in the instruction.
+ * It needs both source and target operand and whether it accesses a
+ * memory location.  The offset field is meaningful only when the
+ * corresponding mem flag is set.
+ *
+ * Some examples on x86:
+ *
+ *   mov  (%rax), %rcx   # src_reg = rax, src_mem = 1, src_offset = 0
+ *                       # dst_reg = rcx, dst_mem = 0
+ *
+ *   mov  0x18, %r8      # src_reg = -1, dst_reg = r8
+ */
+int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
+			       struct annotated_insn_loc *loc)
+{
+	struct ins_operands *ops;
+	struct annotated_op_loc *op_loc;
+	int i;
+
+	if (!strcmp(dl->ins.name, "lock"))
+		ops = dl->ops.locked.ops;
+	else
+		ops = &dl->ops;
+
+	if (ops == NULL)
+		return -1;
+
+	memset(loc, 0, sizeof(*loc));
+
+	for_each_insn_op_loc(loc, i, op_loc) {
+		const char *insn_str = ops->source.raw;
+
+		if (i == INSN_OP_TARGET)
+			insn_str = ops->target.raw;
+
+		/* Invalidate the register by default */
+		op_loc->reg = -1;
+
+		if (insn_str == NULL)
+			continue;
+
+		if (strchr(insn_str, arch->objdump.memory_ref_char)) {
+			op_loc->mem_ref = true;
+			extract_reg_offset(arch, insn_str, op_loc);
+		} else {
+			char *s = strdup(insn_str);
+
+			if (s) {
+				op_loc->reg = get_dwarf_regnum(s, 0);
+				free(s);
+			}
+		}
+	}
+
+	return 0;
+}
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 4ebc6407c68a..10eefecf49c4 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -445,4 +445,40 @@ int annotate_parse_percent_type(const struct option *opt, const char *_str,
 
 int annotate_check_args(struct annotation_options *args);
 
+/**
+ * struct annotated_op_loc - Location info of instruction operand
+ * @reg: Register in the operand
+ * @offset: Memory access offset in the operand
+ * @mem_ref: Whether the operand accesses memory
+ */
+struct annotated_op_loc {
+	int reg;
+	int offset;
+	bool mem_ref;
+};
+
+enum annotated_insn_ops {
+	INSN_OP_SOURCE = 0,
+	INSN_OP_TARGET = 1,
+
+	INSN_OP_MAX,
+};
+
+/**
+ * struct annotated_insn_loc - Location info of instruction
+ * @ops: Array of location info for source and target operands
+ */
+struct annotated_insn_loc {
+	struct annotated_op_loc ops[INSN_OP_MAX];
+};
+
+#define for_each_insn_op_loc(insn_loc, i, op_loc)			\
+	for (i = INSN_OP_SOURCE, op_loc = &(insn_loc)->ops[i];		\
+	     i < INSN_OP_MAX;						\
+	     i++, op_loc++)
+
+/* Get detailed location info in the instruction */
+int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
+			       struct annotated_insn_loc *loc);
+
 #endif	/* __PERF_ANNOTATE_H */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 17/52] perf annotate: Implement hist_entry__get_data_type()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (15 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 16/52] perf annotate: Add annotate_get_insn_location() Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 18/52] perf report: Add 'type' sort key Namhyung Kim
                   ` (35 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

This function finds the type info for the given sample data.  It will
be called from the hist_entry sort logic when the 'type' sort key is
used.

It first calls objdump to disassemble the instructions and figure out
information about the memory access at the location.  Maybe we can do it
better by analyzing the instruction directly, but I'll leave it for
later work.

A memory access is detected by checking whether an instruction operand
contains "(", and then the register name and offset are extracted from
it.  The function returns NULL if no data type is found.
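
The lookup flow can be sketched in isolation as below.  This is only an
illustration under assumptions: struct insn_sketch, find_insn() and
first_mem_operand() are hypothetical stand-ins, and a plain array
replaces the disasm_line list that perf actually walks:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_SOURCE, OP_TARGET, OP_MAX };

struct op_sketch {
	int reg;	/* -1 when no register was recognized */
	int offset;
	bool mem_ref;	/* the operand contained "(" */
};

struct insn_sketch {
	uint64_t offset;		/* offset from the symbol start */
	struct op_sketch ops[OP_MAX];
};

/* Find the instruction located exactly at @ip, or NULL. */
static const struct insn_sketch *
find_insn(const struct insn_sketch *insns, size_t nr,
	  uint64_t sym_start, uint64_t ip)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (sym_start + insns[i].offset == ip)
			return &insns[i];
	}
	return NULL;
}

/* Return the first operand that references memory, or NULL. */
static const struct op_sketch *
first_mem_operand(const struct insn_sketch *insn)
{
	int i;

	for (i = 0; i < OP_MAX; i++) {
		if (insn->ops[i].mem_ref)
			return &insn->ops[i];
	}
	return NULL;
}
```

In the patch, the register and offset of that operand are then passed to
find_data_type() together with the ip.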

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 85 ++++++++++++++++++++++++++++++++++++++
 tools/perf/util/annotate.h |  4 ++
 2 files changed, 89 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index f0c89552087d..c08686b91861 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -25,6 +25,7 @@
 #include "units.h"
 #include "debug.h"
 #include "annotate.h"
+#include "annotate-data.h"
 #include "evsel.h"
 #include "evlist.h"
 #include "bpf-event.h"
@@ -3629,3 +3630,87 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 
 	return 0;
 }
+
+static void symbol__ensure_annotate(struct map_symbol *ms, struct evsel *evsel)
+{
+	struct disasm_line *dl, *tmp_dl;
+	struct annotation *notes;
+
+	notes = symbol__annotation(ms->sym);
+	if (!list_empty(&notes->src->source))
+		return;
+
+	if (symbol__annotate(ms, evsel, notes->options, NULL) < 0)
+		return;
+
+	/* remove non-insn disasm lines for simplicity */
+	list_for_each_entry_safe(dl, tmp_dl, &notes->src->source, al.node) {
+		if (dl->al.offset == -1) {
+			list_del(&dl->al.node);
+			free(dl);
+		}
+	}
+}
+
+static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip)
+{
+	struct disasm_line *dl;
+	struct annotation *notes;
+
+	notes = symbol__annotation(sym);
+
+	list_for_each_entry(dl, &notes->src->source, al.node) {
+		if (sym->start + dl->al.offset == ip)
+			return dl;
+	}
+	return NULL;
+}
+
+/**
+ * hist_entry__get_data_type - find data type for given hist entry
+ * @he: hist entry
+ *
+ * This function first annotates the instruction at @he->ip and extracts
+ * register and offset info from it.  Then it searches the DWARF debug
+ * info to get a variable and type information using the address, register,
+ * and offset.
+ */
+struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
+{
+	struct map_symbol *ms = &he->ms;
+	struct evsel *evsel = hists_to_evsel(he->hists);
+	struct arch *arch;
+	struct disasm_line *dl;
+	struct annotated_insn_loc loc;
+	struct annotated_op_loc *op_loc;
+	u64 ip = he->ip;
+	int i;
+
+	if (ms->map == NULL || ms->sym == NULL)
+		return NULL;
+
+	if (evsel__get_arch(evsel, &arch))
+		return NULL;
+
+	/* Make sure it runs objdump to get disasm of the function */
+	symbol__ensure_annotate(ms, evsel);
+
+	/*
+	 * Get a disasm to extract the location from the insn.
+	 * This is too slow...
+	 */
+	dl = find_disasm_line(ms->sym, ip);
+	if (dl == NULL)
+		return NULL;
+
+	if (annotate_get_insn_location(arch, dl, &loc) < 0)
+		return NULL;
+
+	for_each_insn_op_loc(&loc, i, op_loc) {
+		if (!op_loc->mem_ref)
+			continue;
+
+		return find_data_type(ms, ip, op_loc->reg, op_loc->offset);
+	}
+	return NULL;
+}
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 10eefecf49c4..06281a50ecf6 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -23,6 +23,7 @@ struct option;
 struct perf_sample;
 struct evsel;
 struct symbol;
+struct annotated_data_type;
 
 struct ins {
 	const char     *name;
@@ -481,4 +482,7 @@ struct annotated_insn_loc {
 int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 			       struct annotated_insn_loc *loc);
 
+/* Returns a data type from the sample instruction (if any) */
+struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he);
+
 #endif	/* __PERF_ANNOTATE_H */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 18/52] perf report: Add 'type' sort key
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (16 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 17/52] perf annotate: Implement hist_entry__get_data_type() Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-21 17:55   ` Arnaldo Carvalho de Melo
  2023-11-09 23:59 ` [PATCH 19/52] perf report: Support data type profiling Namhyung Kim
                   ` (34 subsequent siblings)
  52 siblings, 1 reply; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

The 'type' sort key aggregates hist entries by the data type they
access.  Add a mem_type field to the hist_entry struct to save the type.
If hist_entry__get_data_type() returns NULL, it uses the
'unknown_type' instance.
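
A small sketch of the fallback and the name-based comparison, assuming
hypothetical names (type_sketch, entry_sketch, entry_init_type(),
entry_type_cmp()) in place of the real perf structures:

```c
#include <stddef.h>
#include <string.h>

struct type_sketch {
	const char *type_name;
};

/* Shared placeholder for samples whose type could not be resolved */
static struct type_sketch unknown_type_sketch = {
	.type_name = "(unknown)",
};

struct entry_sketch {
	struct type_sketch *mem_type;
};

/* Resolve the type once; @resolved is NULL when the lookup failed. */
static void entry_init_type(struct entry_sketch *e,
			    struct type_sketch *resolved)
{
	if (e->mem_type)
		return;

	e->mem_type = resolved ? resolved : &unknown_type_sketch;
}

/* Entries with the same type name compare equal and can be merged. */
static int entry_type_cmp(const struct entry_sketch *left,
			  const struct entry_sketch *right)
{
	return strcmp(left->mem_type->type_name, right->mem_type->type_name);
}
```

Since every unresolved entry points to the same placeholder, they all
end up in a single "(unknown)" row.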

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-report.txt |  1 +
 tools/perf/util/annotate-data.h          |  2 +
 tools/perf/util/hist.h                   |  1 +
 tools/perf/util/sort.c                   | 69 +++++++++++++++++++++++-
 tools/perf/util/sort.h                   |  4 ++
 5 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index af068b4f1e5a..aec34417090b 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -118,6 +118,7 @@ OPTIONS
 	- retire_lat: On X86, this reports pipeline stall of this instruction compared
 	  to the previous instruction in cycles. And currently supported only on X86
 	- simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicate. "p" for partial Arm SVE predicate
+	- type: Data type of sample memory access.
 
 	By default, comm, dso and symbol keys are used.
 	(i.e. --sort comm,dso,symbol)
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index ab9f187bd7f1..6efdd7e21b28 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -22,6 +22,8 @@ struct annotated_data_type {
 	int type_size;
 };
 
+extern struct annotated_data_type unknown_type;
+
 #ifdef HAVE_DWARF_SUPPORT
 
 /* Returns data type at the location (ip, reg, offset) */
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index afc9f1c7f4dc..9bfed867f288 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -82,6 +82,7 @@ enum hist_column {
 	HISTC_ADDR_TO,
 	HISTC_ADDR,
 	HISTC_SIMD,
+	HISTC_TYPE,
 	HISTC_NR_COLS, /* Last entry */
 };
 
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 27b123ccd2d1..e647f0117bb5 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -24,6 +24,7 @@
 #include "strbuf.h"
 #include "mem-events.h"
 #include "annotate.h"
+#include "annotate-data.h"
 #include "event.h"
 #include "time-utils.h"
 #include "cgroup.h"
@@ -2094,7 +2095,7 @@ struct sort_entry sort_dso_size = {
 	.se_width_idx	= HISTC_DSO_SIZE,
 };
 
-/* --sort dso_size */
+/* --sort addr */
 
 static int64_t
 sort__addr_cmp(struct hist_entry *left, struct hist_entry *right)
@@ -2131,6 +2132,69 @@ struct sort_entry sort_addr = {
 	.se_width_idx	= HISTC_ADDR,
 };
 
+/* --sort type */
+
+struct annotated_data_type unknown_type = {
+	.type_name = (char *)"(unknown)",
+};
+
+static int64_t
+sort__type_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	return sort__addr_cmp(left, right);
+}
+
+static void sort__type_init(struct hist_entry *he)
+{
+	if (he->mem_type)
+		return;
+
+	he->mem_type = hist_entry__get_data_type(he);
+	if (he->mem_type == NULL)
+		he->mem_type = &unknown_type;
+}
+
+static int64_t
+sort__type_collapse(struct hist_entry *left, struct hist_entry *right)
+{
+	struct annotated_data_type *left_type = left->mem_type;
+	struct annotated_data_type *right_type = right->mem_type;
+
+	if (!left_type) {
+		sort__type_init(left);
+		left_type = left->mem_type;
+	}
+
+	if (!right_type) {
+		sort__type_init(right);
+		right_type = right->mem_type;
+	}
+
+	return strcmp(left_type->type_name, right_type->type_name);
+}
+
+static int64_t
+sort__type_sort(struct hist_entry *left, struct hist_entry *right)
+{
+	return sort__type_collapse(left, right);
+}
+
+static int hist_entry__type_snprintf(struct hist_entry *he, char *bf,
+				     size_t size, unsigned int width)
+{
+	return repsep_snprintf(bf, size, "%-*s", width, he->mem_type->type_name);
+}
+
+struct sort_entry sort_type = {
+	.se_header	= "Data Type",
+	.se_cmp		= sort__type_cmp,
+	.se_collapse	= sort__type_collapse,
+	.se_sort	= sort__type_sort,
+	.se_init	= sort__type_init,
+	.se_snprintf	= hist_entry__type_snprintf,
+	.se_width_idx	= HISTC_TYPE,
+};
+
 
 struct sort_dimension {
 	const char		*name;
@@ -2185,7 +2249,8 @@ static struct sort_dimension common_sort_dimensions[] = {
 	DIM(SORT_ADDR, "addr", sort_addr),
 	DIM(SORT_LOCAL_RETIRE_LAT, "local_retire_lat", sort_local_p_stage_cyc),
 	DIM(SORT_GLOBAL_RETIRE_LAT, "retire_lat", sort_global_p_stage_cyc),
-	DIM(SORT_SIMD, "simd", sort_simd)
+	DIM(SORT_SIMD, "simd", sort_simd),
+	DIM(SORT_ANNOTATE_DATA_TYPE, "type", sort_type),
 };
 
 #undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index ecfb7f1359d5..aabf0b8331a3 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -15,6 +15,7 @@
 
 struct option;
 struct thread;
+struct annotated_data_type;
 
 extern regex_t parent_regex;
 extern const char *sort_order;
@@ -34,6 +35,7 @@ extern struct sort_entry sort_dso_to;
 extern struct sort_entry sort_sym_from;
 extern struct sort_entry sort_sym_to;
 extern struct sort_entry sort_srcline;
+extern struct sort_entry sort_type;
 extern const char default_mem_sort_order[];
 extern bool chk_double_cl;
 
@@ -154,6 +156,7 @@ struct hist_entry {
 	struct perf_hpp_list	*hpp_list;
 	struct hist_entry	*parent_he;
 	struct hist_entry_ops	*ops;
+	struct annotated_data_type *mem_type;
 	union {
 		/* this is for hierarchical entry structure */
 		struct {
@@ -243,6 +246,7 @@ enum sort_type {
 	SORT_LOCAL_RETIRE_LAT,
 	SORT_GLOBAL_RETIRE_LAT,
 	SORT_SIMD,
+	SORT_ANNOTATE_DATA_TYPE,
 
 	/* branch stack specific sort keys */
 	__SORT_BRANCH_STACK,
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 19/52] perf report: Support data type profiling
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (17 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 18/52] perf report: Add 'type' sort key Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 20/52] perf annotate-data: Add member field in the data type Namhyung Kim
                   ` (33 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Enable type annotation when the 'type' sort key is used.
It shows the types of the variables the samples access at the moment.
Users can see which types are accessed frequently.

  $ perf report -s dso,type --stdio
  ...
  # Overhead  Shared Object      Data Type
  # ........  .................  .........
  #
      35.47%  [kernel.kallsyms]  (unknown)
       1.62%  [kernel.kallsyms]  struct sched_entry
       1.23%  [kernel.kallsyms]  struct cfs_rq
       0.83%  [kernel.kallsyms]  struct task_struct
       0.34%  [kernel.kallsyms]  struct list_head
       0.30%  [kernel.kallsyms]  struct mem_cgroup
  ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-report.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 121a2781323c..cd2cefd1ea9a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -96,6 +96,7 @@ struct report {
 	bool			stitch_lbr;
 	bool			disable_order;
 	bool			skip_empty;
+	bool			data_type;
 	int			max_stack;
 	struct perf_read_values	show_threads_values;
 	struct annotation_options annotation_opts;
@@ -171,7 +172,7 @@ static int hist_iter__report_callback(struct hist_entry_iter *iter,
 	struct mem_info *mi;
 	struct branch_info *bi;
 
-	if (!ui__has_annotation() && !rep->symbol_ipc)
+	if (!ui__has_annotation() && !rep->symbol_ipc && !rep->data_type)
 		return 0;
 
 	if (sort__mode == SORT_MODE__BRANCH) {
@@ -323,10 +324,19 @@ static int process_sample_event(struct perf_tool *tool,
 	if (al.map != NULL)
 		map__dso(al.map)->hit = 1;
 
-	if (ui__has_annotation() || rep->symbol_ipc || rep->total_cycles_mode) {
+	if (ui__has_annotation() || rep->symbol_ipc || rep->total_cycles_mode ||
+	    rep->data_type) {
 		hist__account_cycles(sample->branch_stack, &al, sample,
 				     rep->nonany_branch_mode,
 				     &rep->total_cycles);
+		if (rep->data_type) {
+			struct symbol *sym = al.sym;
+			struct annotation *notes = sym ? symbol__annotation(sym) : NULL;
+
+			/* XXX: Save annotate options here */
+			if (notes)
+				notes->options = &rep->annotation_opts;
+		}
 	}
 
 	ret = hist_entry_iter__add(&iter, &al, rep->max_stack, rep);
@@ -1622,6 +1632,11 @@ int cmd_report(int argc, const char **argv)
 			sort_order = NULL;
 	}
 
+	if (sort_order && strstr(sort_order, "type")) {
+		report.data_type = true;
+		report.annotation_opts.annotate_src = false;
+	}
+
 	if (strcmp(input_name, "-") != 0)
 		setup_browser(true);
 	else
@@ -1680,7 +1695,7 @@ int cmd_report(int argc, const char **argv)
 	 * so don't allocate extra space that won't be used in the stdio
 	 * implementation.
 	 */
-	if (ui__has_annotation() || report.symbol_ipc ||
+	if (ui__has_annotation() || report.symbol_ipc || report.data_type ||
 	    report.total_cycles_mode) {
 		ret = symbol__annotation_init();
 		if (ret < 0)
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 20/52] perf annotate-data: Add member field in the data type
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (18 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 19/52] perf report: Support data type profiling Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 21/52] perf annotate-data: Update sample histogram for type Namhyung Kim
                   ` (32 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Add child member fields if the current type is a composite type like a
struct or union.  The member fields are linked in the children list,
and the same is done recursively if a child is itself a composite type.
Add a 'self' member to annotated_data_type so that the type itself and
its members can be handled in the same way.
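
To show why each member stores its offset from the outermost type, here
is a minimal sketch.  The names (member_sketch, add_child(),
find_member()) are hypothetical, a simple sibling pointer replaces
list_head, and the lookup helper is my own illustration rather than
something this patch adds:

```c
#include <stddef.h>

struct member_sketch {
	const char *var_name;
	int offset;			/* offset from the outermost type */
	int size;
	struct member_sketch *children;	/* first child */
	struct member_sketch *next;	/* next sibling */
};

/* Append @child to @parent, turning its relative offset into an absolute one. */
static void add_child(struct member_sketch *parent,
		      struct member_sketch *child, int rel_offset)
{
	struct member_sketch **pos = &parent->children;

	child->offset = parent->offset + rel_offset;
	while (*pos)
		pos = &(*pos)->next;
	*pos = child;
}

/* Find the innermost member that covers @offset, or NULL. */
static const struct member_sketch *
find_member(const struct member_sketch *m, int offset)
{
	const struct member_sketch *child;

	if (offset < m->offset || offset >= m->offset + m->size)
		return NULL;

	for (child = m->children; child; child = child->next) {
		const struct member_sketch *hit = find_member(child, offset);

		if (hit)
			return hit;
	}
	return m;
}
```

A parent has to be linked before its own children are added, so that
its absolute offset is already known; the DWARF walk in the patch has
the same ordering.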

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 101 ++++++++++++++++++++++++++++----
 tools/perf/util/annotate-data.h |  27 +++++++--
 tools/perf/util/sort.c          |   9 ++-
 3 files changed, 119 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 475cc30b33e1..107e3248a541 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -30,9 +30,9 @@ static int data_type_cmp(const void *_key, const struct rb_node *node)
 
 	type = rb_entry(node, struct annotated_data_type, node);
 
-	if (key->type_size != type->type_size)
-		return key->type_size - type->type_size;
-	return strcmp(key->type_name, type->type_name);
+	if (key->self.size != type->self.size)
+		return key->self.size - type->self.size;
+	return strcmp(key->self.type_name, type->self.type_name);
 }
 
 static bool data_type_less(struct rb_node *node_a, const struct rb_node *node_b)
@@ -42,9 +42,80 @@ static bool data_type_less(struct rb_node *node_a, const struct rb_node *node_b)
 	a = rb_entry(node_a, struct annotated_data_type, node);
 	b = rb_entry(node_b, struct annotated_data_type, node);
 
-	if (a->type_size != b->type_size)
-		return a->type_size < b->type_size;
-	return strcmp(a->type_name, b->type_name) < 0;
+	if (a->self.size != b->self.size)
+		return a->self.size < b->self.size;
+	return strcmp(a->self.type_name, b->self.type_name) < 0;
+}
+
+/* Recursively add new members for struct/union */
+static int __add_member_cb(Dwarf_Die *die, void *arg)
+{
+	struct annotated_member *parent = arg;
+	struct annotated_member *member;
+	Dwarf_Die member_type, die_mem;
+	Dwarf_Word size, loc;
+	Dwarf_Attribute attr;
+	struct strbuf sb;
+	int tag;
+
+	if (dwarf_tag(die) != DW_TAG_member)
+		return DIE_FIND_CB_SIBLING;
+
+	member = zalloc(sizeof(*member));
+	if (member == NULL)
+		return DIE_FIND_CB_END;
+
+	strbuf_init(&sb, 32);
+	die_get_typename(die, &sb);
+
+	die_get_real_type(die, &member_type);
+	if (dwarf_aggregate_size(&member_type, &size) < 0)
+		size = 0;
+
+	if (!dwarf_attr_integrate(die, DW_AT_data_member_location, &attr))
+		loc = 0;
+	else
+		dwarf_formudata(&attr, &loc);
+
+	member->type_name = strbuf_detach(&sb, NULL);
+	/* member->var_name can be NULL */
+	if (dwarf_diename(die))
+		member->var_name = strdup(dwarf_diename(die));
+	member->size = size;
+	member->offset = loc + parent->offset;
+	INIT_LIST_HEAD(&member->children);
+	list_add_tail(&member->node, &parent->children);
+
+	tag = dwarf_tag(&member_type);
+	switch (tag) {
+	case DW_TAG_structure_type:
+	case DW_TAG_union_type:
+		die_find_child(&member_type, __add_member_cb, member, &die_mem);
+		break;
+	default:
+		break;
+	}
+	return DIE_FIND_CB_SIBLING;
+}
+
+static void add_member_types(struct annotated_data_type *parent, Dwarf_Die *type)
+{
+	Dwarf_Die die_mem;
+
+	die_find_child(type, __add_member_cb, &parent->self, &die_mem);
+}
+
+static void delete_members(struct annotated_member *member)
+{
+	struct annotated_member *child, *tmp;
+
+	list_for_each_entry_safe(child, tmp, &member->children, node) {
+		list_del(&child->node);
+		delete_members(child);
+		free(child->type_name);
+		free(child->var_name);
+		free(child);
+	}
 }
 
 static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
@@ -64,8 +135,8 @@ static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
 	dwarf_aggregate_size(type_die, &size);
 
 	/* Check existing nodes in dso->data_types tree */
-	key.type_name = type_name;
-	key.type_size = size;
+	key.self.type_name = type_name;
+	key.self.size = size;
 	node = rb_find(&key, &dso->data_types, data_type_cmp);
 	if (node) {
 		result = rb_entry(node, struct annotated_data_type, node);
@@ -80,8 +151,15 @@ static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
 		return NULL;
 	}
 
-	result->type_name = type_name;
-	result->type_size = size;
+	result->self.type_name = type_name;
+	result->self.size = size;
+	INIT_LIST_HEAD(&result->self.children);
+
+	/*
+	 * Fill member info unconditionally for now,
+	 * later perf annotate would need it.
+	 */
+	add_member_types(result, type_die);
 
 	rb_add(&result->node, &dso->data_types, data_type_less);
 	return result;
@@ -232,7 +310,8 @@ void annotated_data_type__tree_delete(struct rb_root *root)
 
 		rb_erase(node, root);
 		pos = rb_entry(node, struct annotated_data_type, node);
-		free(pos->type_name);
+		delete_members(&pos->self);
+		free(pos->self.type_name);
 		free(pos);
 	}
 }
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 6efdd7e21b28..33748222e6aa 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -9,17 +9,36 @@
 
 struct map_symbol;
 
+/**
+ * struct annotated_member - Type of member field
+ * @node: List entry in the parent list
+ * @children: List head for child nodes
+ * @type_name: Name of the member type
+ * @var_name: Name of the member variable
+ * @offset: Offset from the outer data type
+ * @size: Size of the member field
+ *
+ * This represents a member type in a data type.
+ */
+struct annotated_member {
+	struct list_head node;
+	struct list_head children;
+	char *type_name;
+	char *var_name;
+	int offset;
+	int size;
+};
+
 /**
  * struct annotated_data_type - Data type to profile
- * @type_name: Name of the data type
- * @type_size: Size of the data type
+ * @node: RB-tree node for dso->data_types
+ * @self: Actual type information
  *
  * This represents a data type accessed by samples in the profile data.
  */
 struct annotated_data_type {
 	struct rb_node node;
-	char *type_name;
-	int type_size;
+	struct annotated_member self;
 };
 
 extern struct annotated_data_type unknown_type;
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index e647f0117bb5..a41209e242ae 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -2135,7 +2135,10 @@ struct sort_entry sort_addr = {
 /* --sort type */
 
 struct annotated_data_type unknown_type = {
-	.type_name = (char *)"(unknown)",
+	.self = {
+		.type_name = (char *)"(unknown)",
+		.children = LIST_HEAD_INIT(unknown_type.self.children),
+	},
 };
 
 static int64_t
@@ -2170,7 +2173,7 @@ sort__type_collapse(struct hist_entry *left, struct hist_entry *right)
 		right_type = right->mem_type;
 	}
 
-	return strcmp(left_type->type_name, right_type->type_name);
+	return strcmp(left_type->self.type_name, right_type->self.type_name);
 }
 
 static int64_t
@@ -2182,7 +2185,7 @@ sort__type_sort(struct hist_entry *left, struct hist_entry *right)
 static int hist_entry__type_snprintf(struct hist_entry *he, char *bf,
 				     size_t size, unsigned int width)
 {
-	return repsep_snprintf(bf, size, "%-*s", width, he->mem_type->type_name);
+	return repsep_snprintf(bf, size, "%-*s", width, he->mem_type->self.type_name);
 }
 
 struct sort_entry sort_type = {
-- 
2.42.0.869.gea05f2083d-goog


* [PATCH 21/52] perf annotate-data: Update sample histogram for type
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (19 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 20/52] perf annotate-data: Add member field in the data type Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 22/52] perf report: Add 'typeoff' sort key Namhyung Kim
                   ` (31 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Add annotated_data_type__update_samples() to build a histogram of data
type accesses.  It'll be called by perf annotate to show which fields in
the data type are accessed frequently.
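
As an illustration only (not part of this patch), the per-offset
accounting described above can be sketched in standalone C.  The names
offset_hist, hist_slot, offset_hist_new() and offset_hist_update() are
hypothetical stand-ins, and the sketch keeps a single histogram rather
than one per event as the patch does.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-offset histogram entry. */
struct hist_slot {
	int      nr_samples;
	uint64_t period;
};

/* Hypothetical stand-in for a type histogram of a single event. */
struct offset_hist {
	uint64_t         nr_samples;  /* total samples in the whole type */
	uint64_t         period;      /* total event count in the whole type */
	int              size;        /* number of slots (type size in bytes) */
	struct hist_slot addr[];      /* one slot per byte offset */
};

/* Allocate a zeroed histogram with one slot per byte of the type. */
static struct offset_hist *offset_hist_new(int type_size)
{
	struct offset_hist *h;

	assert(type_size >= 0);
	h = calloc(1, sizeof(*h) + sizeof(h->addr[0]) * (size_t)type_size);
	if (h != NULL)
		h->size = type_size;
	return h;
}

/* Account pre-aggregated samples at @offset; returns -1 if out of range. */
static int offset_hist_update(struct offset_hist *h, int offset,
			      int nr_samples, uint64_t period)
{
	if (h == NULL || offset < 0 || offset >= h->size)
		return -1;

	h->nr_samples += nr_samples;
	h->addr[offset].nr_samples += nr_samples;
	h->period += period;
	h->addr[offset].period += period;
	return 0;
}
```

Both the type-wide totals and the slot at the given offset are updated
in one call, so a caller can pass already aggregated counts.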

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 81 +++++++++++++++++++++++++++++++++
 tools/perf/util/annotate-data.h | 42 +++++++++++++++++
 tools/perf/util/annotate.c      |  9 +++-
 3 files changed, 131 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 107e3248a541..3c452d037948 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -12,6 +12,8 @@
 #include "debuginfo.h"
 #include "debug.h"
 #include "dso.h"
+#include "evsel.h"
+#include "evlist.h"
 #include "map.h"
 #include "map_symbol.h"
 #include "strbuf.h"
@@ -301,6 +303,44 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	return result;
 }
 
+static int alloc_data_type_histograms(struct annotated_data_type *adt, int nr_entries)
+{
+	int i;
+	size_t sz = sizeof(struct type_hist);
+
+	sz += sizeof(struct type_hist_entry) * adt->self.size;
+
+	/* Allocate a table of pointers for each event */
+	adt->nr_histograms = nr_entries;
+	adt->histograms = calloc(nr_entries, sizeof(*adt->histograms));
+	if (adt->histograms == NULL)
+		return -ENOMEM;
+
+	/*
+	 * Each histogram is allocated for the whole size of the type.
+	 * TODO: Probably we can move the histogram to members.
+	 */
+	for (i = 0; i < nr_entries; i++) {
+		adt->histograms[i] = zalloc(sz);
+		if (adt->histograms[i] == NULL)
+			goto err;
+	}
+	return 0;
+
+err:
+	while (--i >= 0)
+		free(adt->histograms[i]);
+	free(adt->histograms);
+	return -ENOMEM;
+}
+
+static void delete_data_type_histograms(struct annotated_data_type *adt)
+{
+	for (int i = 0; i < adt->nr_histograms; i++)
+		free(adt->histograms[i]);
+	free(adt->histograms);
+}
+
 void annotated_data_type__tree_delete(struct rb_root *root)
 {
 	struct annotated_data_type *pos;
@@ -311,7 +351,48 @@ void annotated_data_type__tree_delete(struct rb_root *root)
 		rb_erase(node, root);
 		pos = rb_entry(node, struct annotated_data_type, node);
 		delete_members(&pos->self);
+		delete_data_type_histograms(pos);
 		free(pos->self.type_name);
 		free(pos);
 	}
 }
+
+/**
+ * annotated_data_type__update_samples - Update histogram
+ * @adt: Data type to update
+ * @evsel: Event to update
+ * @offset: Offset in the type
+ * @nr_samples: Number of samples at this offset
+ * @period: Event count at this offset
+ *
+ * This function updates the type histogram at @offset for @evsel.  Samples
+ * are aggregated before calling this function, so it can be called with
+ * more than one sample at a given offset.
+ */
+int annotated_data_type__update_samples(struct annotated_data_type *adt,
+					struct evsel *evsel, int offset,
+					int nr_samples, u64 period)
+{
+	struct type_hist *h;
+
+	if (adt == NULL)
+		return 0;
+
+	if (adt->histograms == NULL) {
+		int nr = evsel->evlist->core.nr_entries;
+
+		if (alloc_data_type_histograms(adt, nr) < 0)
+			return -1;
+	}
+
+	if (offset < 0 || offset >= adt->self.size)
+		return -1;
+
+	h = adt->histograms[evsel->core.idx];
+
+	h->nr_samples += nr_samples;
+	h->addr[offset].nr_samples += nr_samples;
+	h->period += period;
+	h->addr[offset].period += period;
+	return 0;
+}
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 33748222e6aa..d2dc025b1934 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -7,6 +7,7 @@
 #include <linux/rbtree.h>
 #include <linux/types.h>
 
+struct evsel;
 struct map_symbol;
 
 /**
@@ -29,16 +30,42 @@ struct annotated_member {
 	int size;
 };
 
+/**
+ * struct type_hist_entry - Histogram entry per offset
+ * @nr_samples: Number of samples
+ * @period: Count of event
+ */
+struct type_hist_entry {
+	int nr_samples;
+	u64 period;
+};
+
+/**
+ * struct type_hist - Type histogram for each event
+ * @nr_samples: Total number of samples in this data type
+ * @period: Total count of the event in this data type
+ * @addr: Array of histogram entries, one per byte offset in the type
+ */
+struct type_hist {
+	u64			nr_samples;
+	u64			period;
+	struct type_hist_entry	addr[];
+};
+
 /**
  * struct annotated_data_type - Data type to profile
  * @node: RB-tree node for dso->type_tree
  * @self: Actual type information
+ * @nr_histograms: Number of histograms (one per event)
+ * @histograms: An array of pointers to histograms
  *
  * This represents a data type accessed by samples in the profile data.
  */
 struct annotated_data_type {
 	struct rb_node node;
 	struct annotated_member self;
+	int nr_histograms;
+	struct type_hist **histograms;
 };
 
 extern struct annotated_data_type unknown_type;
@@ -49,6 +76,11 @@ extern struct annotated_data_type unknown_type;
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 					   int reg, int offset);
 
+/* Update type access histogram at the given offset */
+int annotated_data_type__update_samples(struct annotated_data_type *adt,
+					struct evsel *evsel, int offset,
+					int nr_samples, u64 period);
+
 /* Release all data type information in the tree */
 void annotated_data_type__tree_delete(struct rb_root *root);
 
@@ -61,6 +93,16 @@ find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
 	return NULL;
 }
 
+static inline int
+annotated_data_type__update_samples(struct annotated_data_type *adt __maybe_unused,
+				    struct evsel *evsel __maybe_unused,
+				    int offset __maybe_unused,
+				    int nr_samples __maybe_unused,
+				    u64 period __maybe_unused)
+{
+	return -1;
+}
+
 static inline void annotated_data_type__tree_delete(struct rb_root *root __maybe_unused)
 {
 }
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index c08686b91861..049d6ba394bd 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3683,6 +3683,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	struct disasm_line *dl;
 	struct annotated_insn_loc loc;
 	struct annotated_op_loc *op_loc;
+	struct annotated_data_type *mem_type;
 	u64 ip = he->ip;
 	int i;
 
@@ -3710,7 +3711,13 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		if (!op_loc->mem_ref)
 			continue;
 
-		return find_data_type(ms, ip, op_loc->reg, op_loc->offset);
+		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
+
+		annotated_data_type__update_samples(mem_type, evsel,
+						    op_loc->offset,
+						    he->stat.nr_events,
+						    he->stat.period);
+		return mem_type;
 	}
 	return NULL;
 }
-- 
2.42.0.869.gea05f2083d-goog


* [PATCH 22/52] perf report: Add 'typeoff' sort key
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (20 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 21/52] perf annotate-data: Update sample histogram for type Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 23/52] perf report: Add 'symoff' " Namhyung Kim
                   ` (30 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

The typeoff sort key shows the data type name, offset and the name of
the field.  This is useful to see which field in the struct is accessed
most frequently.

  $ perf report -s type,typeoff --hierarchy --stdio
  ...
  #     Overhead  Data Type / Data Type Offset
  # ............  ............................
  #
  ...
        1.23%     struct cfs_rq
           0.19%    struct cfs_rq +404 (throttle_count)
           0.19%    struct cfs_rq +0 (load.weight)
           0.19%    struct cfs_rq +336 (leaf_cfs_rq_list.next)
           0.09%    struct cfs_rq +272 (propagate)
           0.09%    struct cfs_rq +196 (removed.nr)
           0.09%    struct cfs_rq +80 (curr)
           0.09%    struct cfs_rq +544 (lt_b_children_throttled)
           0.06%    struct cfs_rq +320 (rq)
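
As an illustration only, the way an offset is turned into a dotted
field name such as "load.weight" can be sketched in standalone C.  This
is a simplified model, not the code in this patch: struct member and
member_path() are hypothetical names, children are kept in a plain
array instead of a list, and snprintf() with an explicit truncation
check is used where the patch uses scnprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified model of a member tree. */
struct member {
	const char *var_name;           /* NULL for anonymous struct/union */
	int offset;                     /* offset from the outermost type */
	int size;
	const struct member *children;  /* array of nested members */
	int nr_children;
};

/*
 * Append the dotted path of the innermost member covering @offset.
 * The caller must set buf[0] = '\0' before the first call.
 */
static void member_path(char *buf, size_t sz, const struct member *m,
			int offset, int first)
{
	assert(buf != NULL && m != NULL);

	for (int i = 0; i < m->nr_children; i++) {
		const struct member *c = &m->children[i];
		int len = 0;

		if (offset < c->offset || offset >= c->offset + c->size)
			continue;

		/* Anonymous members add nothing to the path. */
		if (c->var_name) {
			len = snprintf(buf, sz, "%s%s",
				       first ? "" : ".", c->var_name);
			if (len < 0 || (size_t)len >= sz)
				return;  /* output truncated, stop here */
			first = 0;
		}

		member_path(buf + len, sz - len, c, offset, first);
		return;
	}
}
```

An offset that falls into padding, or into a type without member
information, leaves the buffer empty in this sketch.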

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-report.txt |  1 +
 tools/perf/util/annotate.c               |  1 +
 tools/perf/util/hist.h                   |  1 +
 tools/perf/util/sort.c                   | 83 +++++++++++++++++++++++-
 tools/perf/util/sort.h                   |  2 +
 5 files changed, 87 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index aec34417090b..b57eb51b47aa 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -119,6 +119,7 @@ OPTIONS
 	  to the previous instruction in cycles. And currently supported only on X86
 	- simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicate. "p" for partial Arm SVE predicate
 	- type: Data type of sample memory access.
+	- typeoff: Offset in the data type of sample memory access.
 
 	By default, comm, dso and symbol keys are used.
 	(i.e. --sort comm,dso,symbol)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 049d6ba394bd..76309f1e6e39 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3717,6 +3717,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 						    op_loc->offset,
 						    he->stat.nr_events,
 						    he->stat.period);
+		he->mem_type_off = op_loc->offset;
 		return mem_type;
 	}
 	return NULL;
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 9bfed867f288..941176afcebc 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -83,6 +83,7 @@ enum hist_column {
 	HISTC_ADDR,
 	HISTC_SIMD,
 	HISTC_TYPE,
+	HISTC_TYPE_OFFSET,
 	HISTC_NR_COLS, /* Last entry */
 };
 
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index a41209e242ae..d78e680d3988 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -2153,8 +2153,10 @@ static void sort__type_init(struct hist_entry *he)
 		return;
 
 	he->mem_type = hist_entry__get_data_type(he);
-	if (he->mem_type == NULL)
+	if (he->mem_type == NULL) {
 		he->mem_type = &unknown_type;
+		he->mem_type_off = 0;
+	}
 }
 
 static int64_t
@@ -2198,6 +2200,84 @@ struct sort_entry sort_type = {
 	.se_width_idx	= HISTC_TYPE,
 };
 
+/* --sort typeoff */
+
+static int64_t
+sort__typeoff_sort(struct hist_entry *left, struct hist_entry *right)
+{
+	struct annotated_data_type *left_type = left->mem_type;
+	struct annotated_data_type *right_type = right->mem_type;
+	int64_t ret;
+
+	if (!left_type) {
+		sort__type_init(left);
+		left_type = left->mem_type;
+	}
+
+	if (!right_type) {
+		sort__type_init(right);
+		right_type = right->mem_type;
+	}
+
+	ret = strcmp(left_type->self.type_name, right_type->self.type_name);
+	if (ret)
+		return ret;
+	return left->mem_type_off - right->mem_type_off;
+}
+
+static void fill_member_name(char *buf, size_t sz, struct annotated_member *m,
+			     int offset, bool first)
+{
+	struct annotated_member *child;
+
+	if (list_empty(&m->children))
+		return;
+
+	list_for_each_entry(child, &m->children, node) {
+		if (child->offset <= offset && offset < child->offset + child->size) {
+			int len = 0;
+
+			/* It can have anonymous struct/union members */
+			if (child->var_name) {
+				len = scnprintf(buf, sz, "%s%s",
+						first ? "" : ".", child->var_name);
+				first = false;
+			}
+
+			fill_member_name(buf + len, sz - len, child, offset, first);
+			return;
+		}
+	}
+}
+
+static int hist_entry__typeoff_snprintf(struct hist_entry *he, char *bf,
+				     size_t size, unsigned int width __maybe_unused)
+{
+	struct annotated_data_type *he_type = he->mem_type;
+	char buf[4096];
+
+	buf[0] = '\0';
+	if (list_empty(&he_type->self.children))
+		snprintf(buf, sizeof(buf), "no field");
+	else
+		fill_member_name(buf, sizeof(buf), &he_type->self,
+				 he->mem_type_off, true);
+	buf[4095] = '\0';
+
+	return repsep_snprintf(bf, size, "%s %+d (%s)", he_type->self.type_name,
+			       he->mem_type_off, buf);
+}
+
+struct sort_entry sort_type_offset = {
+	.se_header	= "Data Type Offset",
+	.se_cmp		= sort__type_cmp,
+	.se_collapse	= sort__typeoff_sort,
+	.se_sort	= sort__typeoff_sort,
+	.se_init	= sort__type_init,
+	.se_snprintf	= hist_entry__typeoff_snprintf,
+	.se_width_idx	= HISTC_TYPE_OFFSET,
+};
+
 
 struct sort_dimension {
 	const char		*name;
@@ -2254,6 +2334,7 @@ static struct sort_dimension common_sort_dimensions[] = {
 	DIM(SORT_GLOBAL_RETIRE_LAT, "retire_lat", sort_global_p_stage_cyc),
 	DIM(SORT_SIMD, "simd", sort_simd),
 	DIM(SORT_ANNOTATE_DATA_TYPE, "type", sort_type),
+	DIM(SORT_ANNOTATE_DATA_TYPE_OFFSET, "typeoff", sort_type_offset),
 };
 
 #undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index aabf0b8331a3..d806adcc1e1e 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -113,6 +113,7 @@ struct hist_entry {
 	u64			p_stage_cyc;
 	u8			cpumode;
 	u8			depth;
+	int			mem_type_off;
 	struct simd_flags	simd_flags;
 
 	/* We are added by hists__add_dummy_entry. */
@@ -247,6 +248,7 @@ enum sort_type {
 	SORT_GLOBAL_RETIRE_LAT,
 	SORT_SIMD,
 	SORT_ANNOTATE_DATA_TYPE,
+	SORT_ANNOTATE_DATA_TYPE_OFFSET,
 
 	/* branch stack specific sort keys */
 	__SORT_BRANCH_STACK,
-- 
2.42.0.869.gea05f2083d-goog


* [PATCH 23/52] perf report: Add 'symoff' sort key
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (21 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 22/52] perf report: Add 'typeoff' sort key Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-12-23 14:29   ` Arnaldo Carvalho de Melo
  2023-11-09 23:59 ` [PATCH 24/52] perf annotate: Add --data-type option Namhyung Kim
                   ` (29 subsequent siblings)
  52 siblings, 1 reply; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

The symoff sort key prints the symbol and the offset of the sample
within it.  This is useful for data type profiling because it shows the
exact instruction in the function that refers to the data.

  $ perf report -s type,sym,typeoff,symoff --hierarchy
  ...
  #       Overhead  Data Type / Symbol / Data Type Offset / Symbol Offset
  # ..............  .....................................................
  #
      1.23%         struct cfs_rq
        0.84%         update_blocked_averages
          0.19%         struct cfs_rq +336 (leaf_cfs_rq_list.next)
             0.19%         [k] update_blocked_averages+0x96
          0.19%         struct cfs_rq +0 (load.weight)
             0.14%         [k] update_blocked_averages+0x104
             0.04%         [k] update_blocked_averages+0x31c
          0.17%         struct cfs_rq +404 (throttle_count)
             0.12%         [k] update_blocked_averages+0x9d
             0.05%         [k] update_blocked_averages+0x1f9
          0.08%         struct cfs_rq +272 (propagate)
             0.07%         [k] update_blocked_averages+0x3d3
             0.02%         [k] update_blocked_averages+0x45b
  ...
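
As an illustration only, the ordering and formatting behind a key like
this can be sketched in standalone C.  struct sample_loc, symoff_cmp()
and symoff_format() are hypothetical names; the sketch compares symbols
by name and assumes both samples have a resolved symbol, which is a
simplification of what perf does.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified view of a resolved sample. */
struct sample_loc {
	const char *sym_name;   /* NULL when no symbol was found */
	uint64_t    sym_start;  /* start address of the symbol */
	uint64_t    ip;         /* sampled instruction address */
};

/* Order by symbol first, then by instruction address. */
static int symoff_cmp(const struct sample_loc *a, const struct sample_loc *b)
{
	int ret;

	assert(a->sym_name != NULL && b->sym_name != NULL);

	ret = strcmp(a->sym_name, b->sym_name);
	if (ret)
		return ret;

	/* Explicit comparison avoids wrap-around of a u64 difference. */
	return (a->ip > b->ip) - (a->ip < b->ip);
}

/* Print "[level] symbol+0xoff", or the raw address without a symbol. */
static int symoff_format(char *buf, size_t sz, char level,
			 const struct sample_loc *s)
{
	if (s->sym_name == NULL)
		return snprintf(buf, sz, "[%c] %#" PRIx64, level, s->ip);

	return snprintf(buf, sz, "[%c] %s+0x%" PRIx64, level,
			s->sym_name, s->ip - s->sym_start);
}
```

Two samples in the same function at different instructions compare as
different entries, which is what gives one output line per instruction.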

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-report.txt |  1 +
 tools/perf/util/hist.h                   |  1 +
 tools/perf/util/sort.c                   | 47 ++++++++++++++++++++++++
 tools/perf/util/sort.h                   |  1 +
 4 files changed, 50 insertions(+)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index b57eb51b47aa..38f59ac064f7 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -120,6 +120,7 @@ OPTIONS
 	- simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicate. "p" for partial Arm SVE predicate
 	- type: Data type of sample memory access.
 	- typeoff: Offset in the data type of sample memory access.
+	- symoff: Offset in the symbol.
 
 	By default, comm, dso and symbol keys are used.
 	(i.e. --sort comm,dso,symbol)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 941176afcebc..1ce0ee262abe 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -84,6 +84,7 @@ enum hist_column {
 	HISTC_SIMD,
 	HISTC_TYPE,
 	HISTC_TYPE_OFFSET,
+	HISTC_SYMBOL_OFFSET,
 	HISTC_NR_COLS, /* Last entry */
 };
 
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index d78e680d3988..0cbbd5ba8175 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -419,6 +419,52 @@ struct sort_entry sort_sym = {
 	.se_width_idx	= HISTC_SYMBOL,
 };
 
+/* --sort symoff */
+
+static int64_t
+sort__symoff_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	int64_t ret;
+
+	ret = sort__sym_cmp(left, right);
+	if (ret)
+		return ret;
+
+	return left->ip - right->ip;
+}
+
+static int64_t
+sort__symoff_sort(struct hist_entry *left, struct hist_entry *right)
+{
+	int64_t ret;
+
+	ret = sort__sym_sort(left, right);
+	if (ret)
+		return ret;
+
+	return left->ip - right->ip;
+}
+
+static int
+hist_entry__symoff_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width)
+{
+	struct symbol *sym = he->ms.sym;
+
+	if (sym == NULL)
+		return repsep_snprintf(bf, size, "[%c] %-#.*llx", he->level, width - 4, he->ip);
+
+	return repsep_snprintf(bf, size, "[%c] %s+0x%llx", he->level, sym->name, he->ip - sym->start);
+}
+
+struct sort_entry sort_sym_offset = {
+	.se_header	= "Symbol Offset",
+	.se_cmp		= sort__symoff_cmp,
+	.se_sort	= sort__symoff_sort,
+	.se_snprintf	= hist_entry__symoff_snprintf,
+	.se_filter	= hist_entry__sym_filter,
+	.se_width_idx	= HISTC_SYMBOL_OFFSET,
+};
+
 /* --sort srcline */
 
 char *hist_entry__srcline(struct hist_entry *he)
@@ -2335,6 +2381,7 @@ static struct sort_dimension common_sort_dimensions[] = {
 	DIM(SORT_SIMD, "simd", sort_simd),
 	DIM(SORT_ANNOTATE_DATA_TYPE, "type", sort_type),
 	DIM(SORT_ANNOTATE_DATA_TYPE_OFFSET, "typeoff", sort_type_offset),
+	DIM(SORT_SYM_OFFSET, "symoff", sort_sym_offset),
 };
 
 #undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index d806adcc1e1e..6f6b4189a389 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -249,6 +249,7 @@ enum sort_type {
 	SORT_SIMD,
 	SORT_ANNOTATE_DATA_TYPE,
 	SORT_ANNOTATE_DATA_TYPE_OFFSET,
+	SORT_SYM_OFFSET,
 
 	/* branch stack specific sort keys */
 	__SORT_BRANCH_STACK,
-- 
2.42.0.869.gea05f2083d-goog


* [PATCH 24/52] perf annotate: Add --data-type option
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (22 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 23/52] perf report: Add 'symoff' " Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 25/52] perf annotate: Support event group display Namhyung Kim
                   ` (28 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Support data type annotation with a new --data-type option.  It
internally uses the type sort key to collect a sample histogram for the
type and displays every member like below.

  $ perf annotate --data-type
  ...
  Annotate type: 'struct cfs_rq' in [kernel.kallsyms] (13 samples):
  ============================================================================
      samples     offset       size  field
           13          0        640  struct cfs_rq         {
            2          0         16      struct load_weight       load {
            2          0          8          unsigned long        weight;
            0          8          4          u32  inv_weight;
                                         };
            0         16          8      unsigned long    runnable_weight;
            0         24          4      unsigned int     nr_running;
            1         28          4      unsigned int     h_nr_running;
  ...

For simplicity it prints the number of samples per field for now.
But it should be easy to show the overhead percentage instead.

The number at the outer struct is the sum of the numbers of the inner
members.  For example, struct cfs_rq got 13 samples in total, of which 2
came from load (struct load_weight) and 1 from h_nr_running.  Similarly,
struct load_weight got 2 samples in total and they all came from the
weight field.
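
As an illustration only, the summing rule above can be sketched in
standalone C: the count shown for a member is the sum of the per-byte
sample counts over the range it covers.  member_samples() and the flat
per_byte array are hypothetical; the patch reads the counts from the
type histogram instead.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Sum per-byte sample counts over [offset, offset + size), clamped to
 * the size of the enclosing type.
 */
static int member_samples(const int *per_byte, size_t type_size,
			  size_t offset, size_t size)
{
	int total = 0;

	assert(per_byte != NULL);

	for (size_t i = offset; i < offset + size && i < type_size; i++)
		total += per_byte[i];

	return total;
}
```

Because an outer struct covers the byte ranges of all its members, its
count is always at least the count of any single member.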

I've added two new flags to symbol_conf for this.  annotate_data_member
makes it collect the members of the type; it is also needed for perf
report with the typeoff sort key.  annotate_data_sample makes it update
the sample stats for each offset and is used only in annotate.

Currently it only supports the stdio output mode; TUI support can be
added later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-annotate.txt |  8 ++
 tools/perf/builtin-annotate.c              | 94 +++++++++++++++++++++-
 tools/perf/util/annotate-data.c            |  8 +-
 tools/perf/util/annotate.c                 | 10 ++-
 tools/perf/util/sort.c                     |  2 +
 tools/perf/util/symbol_conf.h              |  4 +-
 6 files changed, 115 insertions(+), 11 deletions(-)

diff --git a/tools/perf/Documentation/perf-annotate.txt b/tools/perf/Documentation/perf-annotate.txt
index fe168e8165c8..0e6a49b7795c 100644
--- a/tools/perf/Documentation/perf-annotate.txt
+++ b/tools/perf/Documentation/perf-annotate.txt
@@ -155,6 +155,14 @@ include::itrace.txt[]
 	stdio or stdio2 (Default: 0).  Note that this is about selection of
 	functions to display, not about lines within the function.
 
+--data-type[=TYPE_NAME]::
+	Display data type annotation instead of code.  It infers data type of
+	samples (if they are memory accessing instructions) using DWARF debug
+	information.  It can take an optional argument of data type name.  In
+	that case it'd show annotation for the type only, otherwise it'd show
+	all data types it finds.
+
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-report[1]
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index a9129b51d511..2290ce3bdc2e 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -20,6 +20,7 @@
 #include "util/evlist.h"
 #include "util/evsel.h"
 #include "util/annotate.h"
+#include "util/annotate-data.h"
 #include "util/event.h"
 #include <subcmd/parse-options.h>
 #include "util/parse-events.h"
@@ -56,9 +57,11 @@ struct perf_annotate {
 	bool	   skip_missing;
 	bool	   has_br_stack;
 	bool	   group_set;
+	bool	   data_type;
 	float	   min_percent;
 	const char *sym_hist_filter;
 	const char *cpu_list;
+	const char *target_data_type;
 	DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
 };
 
@@ -234,8 +237,12 @@ static int evsel__add_sample(struct evsel *evsel, struct perf_sample *sample,
 {
 	struct hists *hists = evsel__hists(evsel);
 	struct hist_entry *he;
+	struct annotation *notes = al->sym ? symbol__annotation(al->sym) : NULL;
 	int ret;
 
+	if (notes)
+		notes->options = &ann->opts;
+
 	if ((!ann->has_br_stack || !has_annotation(ann)) &&
 	    ann->sym_hist_filter != NULL &&
 	    (al->sym == NULL ||
@@ -323,6 +330,32 @@ static int hist_entry__tty_annotate(struct hist_entry *he,
 	return symbol__tty_annotate2(&he->ms, evsel, &ann->opts);
 }
 
+static void print_annotated_data_type(struct annotated_data_type *mem_type,
+				      struct annotated_member *member,
+				      struct evsel *evsel, int indent)
+{
+	struct annotated_member *child;
+	struct type_hist *h = mem_type->histograms[evsel->core.idx];
+	int i, samples = 0;
+
+	for (i = 0; i < member->size; i++)
+		samples += h->addr[member->offset + i].nr_samples;
+
+	printf(" %10d %10d %10d  %*s%s\t%s",
+	       samples, member->offset, member->size, indent, "", member->type_name,
+	       member->var_name ?: "");
+
+	if (!list_empty(&member->children))
+		printf(" {\n");
+
+	list_for_each_entry(child, &member->children, node)
+		print_annotated_data_type(mem_type, child, evsel, indent + 4);
+
+	if (!list_empty(&member->children))
+		printf("%*s}", 35 + indent, "");
+	printf(";\n");
+}
+
 static void hists__find_annotations(struct hists *hists,
 				    struct evsel *evsel,
 				    struct perf_annotate *ann)
@@ -362,6 +395,40 @@ static void hists__find_annotations(struct hists *hists,
 			continue;
 		}
 
+		if (ann->data_type) {
+			struct map *map = he->ms.map;
+
+			/* skip unknown type */
+			if (he->mem_type->histograms == NULL)
+				goto find_next;
+
+			if (ann->target_data_type) {
+				const char *type_name = he->mem_type->self.type_name;
+
+				/* skip 'struct ' prefix in the type name */
+				if (strncmp(ann->target_data_type, "struct ", 7) &&
+				    !strncmp(type_name, "struct ", 7))
+					type_name += 7;
+
+				/* skip 'union ' prefix in the type name */
+				if (strncmp(ann->target_data_type, "union ", 6) &&
+				    !strncmp(type_name, "union ", 6))
+					type_name += 6;
+
+				if (strcmp(ann->target_data_type, type_name))
+					goto find_next;
+			}
+
+			printf("Annotate type: '%s' in %s (%d samples):\n",
+				he->mem_type->self.type_name, map->dso->name, he->stat.nr_events);
+			printf("============================================================================\n");
+			printf(" %10s %10s %10s  %s\n", "samples", "offset", "size", "field");
+
+			print_annotated_data_type(he->mem_type, &he->mem_type->self, evsel, 0);
+			printf("\n");
+			goto find_next;
+		}
+
 		if (use_browser == 2) {
 			int ret;
 			int (*annotate)(struct hist_entry *he,
@@ -498,6 +565,17 @@ static int parse_percent_limit(const struct option *opt, const char *str,
 	return 0;
 }
 
+static int parse_data_type(const struct option *opt, const char *str, int unset)
+{
+	struct perf_annotate *ann = opt->value;
+
+	ann->data_type = !unset;
+	if (str)
+		ann->target_data_type = strdup(str);
+
+	return 0;
+}
+
 static const char * const annotate_usage[] = {
 	"perf annotate [<options>]",
 	NULL
@@ -609,6 +687,9 @@ int cmd_annotate(int argc, const char **argv)
 	OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
 			    "Instruction Tracing options\n" ITRACE_HELP,
 			    itrace_parse_synth_opts),
+	OPT_CALLBACK_OPTARG(0, "data-type", &annotate, NULL, "name",
+			    "Show data type annotate for the memory accesses",
+			    parse_data_type),
 
 	OPT_END()
 	};
@@ -705,6 +786,14 @@ int cmd_annotate(int argc, const char **argv)
 		use_browser = 2;
 #endif
 
+	/* FIXME: only support stdio for now */
+	if (annotate.data_type) {
+		use_browser = 0;
+		annotate.opts.annotate_src = false;
+		symbol_conf.annotate_data_member = true;
+		symbol_conf.annotate_data_sample = true;
+	}
+
 	setup_browser(true);
 
 	/*
@@ -712,7 +801,10 @@ int cmd_annotate(int argc, const char **argv)
 	 * symbol, we do not care about the processes in annotate,
 	 * set sort order to avoid repeated output.
 	 */
-	sort_order = "dso,symbol";
+	if (annotate.data_type)
+		sort_order = "dso,type";
+	else
+		sort_order = "dso,symbol";
 
 	/*
 	 * Set SORT_MODE__BRANCH so that annotate display IPC/Cycle
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 3c452d037948..5326396b08ec 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -18,6 +18,7 @@
 #include "map_symbol.h"
 #include "strbuf.h"
 #include "symbol.h"
+#include "symbol_conf.h"
 
 /*
  * Compare type name and size to maintain them in a tree.
@@ -157,11 +158,8 @@ static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
 	result->self.size = size;
 	INIT_LIST_HEAD(&result->self.children);
 
-	/*
-	 * Fill member info unconditionally for now,
-	 * later perf annotate would need it.
-	 */
-	add_member_types(result, type_die);
+	if (symbol_conf.annotate_data_member)
+		add_member_types(result, type_die);
 
 	rb_add(&result->node, &dso->data_types, data_type_less);
 	return result;
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 76309f1e6e39..4d725562fd0a 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3713,10 +3713,12 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 
 		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
 
-		annotated_data_type__update_samples(mem_type, evsel,
-						    op_loc->offset,
-						    he->stat.nr_events,
-						    he->stat.period);
+		if (symbol_conf.annotate_data_sample) {
+			annotated_data_type__update_samples(mem_type, evsel,
+							    op_loc->offset,
+							    he->stat.nr_events,
+							    he->stat.period);
+		}
 		he->mem_type_off = op_loc->offset;
 		return mem_type;
 	}
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 0cbbd5ba8175..30254eb63709 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -3401,6 +3401,8 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
 			list->thread = 1;
 		} else if (sd->entry == &sort_comm) {
 			list->comm = 1;
+		} else if (sd->entry == &sort_type_offset) {
+			symbol_conf.annotate_data_member = true;
 		}
 
 		return __sort_dimension__add(sd, list, level);
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index 6040286e07a6..c114bbceef40 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -44,7 +44,9 @@ struct symbol_conf {
 			buildid_mmap2,
 			guest_code,
 			lazy_load_kernel_maps,
-			keep_exited_threads;
+			keep_exited_threads,
+			annotate_data_member,
+			annotate_data_sample;
 	const char	*vmlinux_name,
 			*kallsyms_name,
 			*source_prefix,
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 25/52] perf annotate: Support event group display
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (23 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 24/52] perf annotate: Add --data-type option Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 26/52] perf annotate: Add --type-stat option for debugging Namhyung Kim
                   ` (27 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

When events are grouped together, it'd be natural to show them at once,
as in other modes.  Handle group leaders with members to collect the
number of samples together and display them like below:

  $ perf annotate --data-type --group
  ...
  Annotate type: 'struct page' in vmlinux (1 samples):
   event[0] = cpu/mem-loads,ldlat=30/P
   event[1] = cpu/mem-stores/P
   event[2] = dummy:u
  ============================================================================
                            samples     offset       size  field
            1          0          0          0         64  struct page     {
            0          0          0          0          8      long unsigned int  flags;
            0          0          0          8         40      union       {
            0          0          0          8         40          struct          {
            0          0          0          8         16              union       {
            0          0          0          8         16                  struct list_head       lru {
            0          0          0          8          8                      struct list_head*  next;
            0          0          0         16          8                      struct list_head*  prev;
                                                                           };
            0          0          0          8         16                  struct          {
            0          0          0          8          8                      void*      __filler;
            0          0          0         16          4                      unsigned int       mlock_count;
                                                                           };
            0          0          0          8         16                  struct list_head       buddy_list {
            0          0          0          8          8                      struct list_head*  next;
            0          0          0         16          8                      struct list_head*  prev;
                                                                           };
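
The column layout above can be sketched as follows.  This is only an
illustration, assuming each per-event sample count is printed with
" %10d" (11 characters), so the "samples" label is right-aligned in a
field of 11 * nr_members.  The helper names format_header() and
format_row() are hypothetical and not part of perf.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the header line for a group of nr_members
 * events.  The "samples" label spans all per-event columns.
 */
static int format_header(char *buf, size_t len, int nr_members)
{
	return snprintf(buf, len, "%*s %10s %10s  %s",
			11 * nr_members, "samples", "offset", "size", "field");
}

/*
 * Hypothetical helper: build one row with a sample count per event,
 * followed by the offset, size and field columns.
 */
static int format_row(char *buf, size_t len, const int *samples,
		      int nr_members, int offset, int size, const char *field)
{
	int n = 0;

	for (int i = 0; i < nr_members; i++)
		n += snprintf(buf + n, len - n, " %10d", samples[i]);

	n += snprintf(buf + n, len - n, " %10d %10d  %s", offset, size, field);
	return n;
}
```

With a single event the header is the same width as the old
" %10s %10s %10s  %s" format, so the non-group output stays aligned.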

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-annotate.c | 89 ++++++++++++++++++++++++++++++-----
 1 file changed, 77 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 2290ce3bdc2e..7e4ef93b19a0 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -330,19 +330,64 @@ static int hist_entry__tty_annotate(struct hist_entry *he,
 	return symbol__tty_annotate2(&he->ms, evsel, &ann->opts);
 }
 
+static void print_annotated_data_header(struct hist_entry *he, struct evsel *evsel)
+{
+	struct map *map = he->ms.map;
+	int nr_members = 1;
+	int nr_samples = he->stat.nr_events;
+
+	if (evsel__is_group_event(evsel)) {
+		struct hist_entry *pair;
+
+		list_for_each_entry(pair, &he->pairs.head, pairs.node)
+			nr_samples += pair->stat.nr_events;
+	}
+
+	printf("Annotate type: '%s' in %s (%d samples):\n",
+	       he->mem_type->self.type_name, map->dso->name, nr_samples);
+
+	if (evsel__is_group_event(evsel)) {
+		struct evsel *pos;
+		int i = 0;
+
+		for_each_group_evsel(pos, evsel)
+			printf(" event[%d] = %s\n", i++, pos->name);
+
+		nr_members = evsel->core.nr_members;
+	}
+
+	printf("============================================================================\n");
+	printf("%*s %10s %10s  %s\n", 11 * nr_members, "samples", "offset", "size", "field");
+}
+
 static void print_annotated_data_type(struct annotated_data_type *mem_type,
 				      struct annotated_member *member,
 				      struct evsel *evsel, int indent)
 {
 	struct annotated_member *child;
 	struct type_hist *h = mem_type->histograms[evsel->core.idx];
-	int i, samples = 0;
+	int i, nr_events = 1, samples = 0;
 
 	for (i = 0; i < member->size; i++)
 		samples += h->addr[member->offset + i].nr_samples;
+	printf(" %10d", samples);
+
+	if (evsel__is_group_event(evsel)) {
+		struct evsel *pos;
+
+		for_each_group_member(pos, evsel) {
+			h = mem_type->histograms[pos->core.idx];
+
+			samples = 0;
+			for (i = 0; i < member->size; i++)
+				samples += h->addr[member->offset + i].nr_samples;
+			printf(" %10d", samples);
+		}
+		nr_events = evsel->core.nr_members;
+	}
 
-	printf(" %10d %10d %10d  %*s%s\t%s",
-	       samples, member->offset, member->size, indent, "", member->type_name,
+	printf(" %10d %10d  %*s%s\t%s",
+	       member->offset, member->size, indent, "", member->type_name,
 	       member->var_name ?: "");
 
 	if (!list_empty(&member->children))
@@ -352,7 +397,7 @@ static void print_annotated_data_type(struct annotated_data_type *mem_type,
 		print_annotated_data_type(mem_type, child, evsel, indent + 4);
 
 	if (!list_empty(&member->children))
-		printf("%*s}", 35 + indent, "");
+		printf("%*s}", 11 * nr_events + 24 + indent, "");
 	printf(";\n");
 }
 
@@ -396,8 +441,6 @@ static void hists__find_annotations(struct hists *hists,
 		}
 
 		if (ann->data_type) {
-			struct map *map = he->ms.map;
-
 			/* skip unknown type */
 			if (he->mem_type->histograms == NULL)
 				goto find_next;
@@ -419,11 +462,7 @@ static void hists__find_annotations(struct hists *hists,
 					goto find_next;
 			}
 
-			printf("Annotate type: '%s' in %s (%d samples):\n",
-				he->mem_type->self.type_name, map->dso->name, he->stat.nr_events);
-			printf("============================================================================\n");
-			printf(" %10s %10s %10s  %s\n", "samples", "offset", "size", "field");
-
+			print_annotated_data_header(he, evsel);
 			print_annotated_data_type(he->mem_type, &he->mem_type->self, evsel, 0);
 			printf("\n");
 			goto find_next;
@@ -527,8 +566,20 @@ static int __cmd_annotate(struct perf_annotate *ann)
 			evsel__reset_sample_bit(pos, CALLCHAIN);
 			evsel__output_resort(pos, NULL);
 
-			if (symbol_conf.event_group && !evsel__is_group_leader(pos))
+			/*
+			 * An event group needs to display other events too.
+			 * Let's delay printing until other events are processed.
+			 */
+			if (symbol_conf.event_group) {
+				if (!evsel__is_group_leader(pos)) {
+					struct hists *leader_hists;
+
+					leader_hists = evsel__hists(evsel__leader(pos));
+					hists__match(leader_hists, hists);
+					hists__link(leader_hists, hists);
+				}
 				continue;
+			}
 
 			hists__find_annotations(hists, pos, ann);
 		}
@@ -539,6 +590,20 @@ static int __cmd_annotate(struct perf_annotate *ann)
 		goto out;
 	}
 
+	/* Display group events together */
+	evlist__for_each_entry(session->evlist, pos) {
+		struct hists *hists = evsel__hists(pos);
+		u32 nr_samples = hists->stats.nr_samples;
+
+		if (nr_samples == 0)
+			continue;
+
+		if (!symbol_conf.event_group || !evsel__is_group_leader(pos))
+			continue;
+
+		hists__find_annotations(hists, pos, ann);
+	}
+
 	if (use_browser == 2) {
 		void (*show_annotations)(void);
 
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 26/52] perf annotate: Add --type-stat option for debugging
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (24 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 25/52] perf annotate: Support event group display Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 27/52] perf annotate: Add --insn-stat " Namhyung Kim
                   ` (26 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

The --type-stat option is meant to be used with --data-type; it prints
detailed failure reasons for the data type annotation.

  $ perf annotate --data-type --type-stat
  Annotate data type stats:
  total 294, ok 116 (39.5%), bad 178 (60.5%)
  -----------------------------------------------------------
          30 : no_sym
          40 : no_insn_ops
          33 : no_mem_ops
          63 : no_var
           4 : no_typeinfo
           8 : bad_offset
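
How the summary line relates to the per-reason counters can be sketched
like this.  It is a reduced illustration, assuming only the six counters
shown in the output above; struct stat_sketch, count_bad() and percent()
are made-up names, and the real structure has more fields.

```c
#include <stdio.h>

/* Reduced, hypothetical version of the stat structure */
struct stat_sketch {
	int total;
	int no_sym;
	int no_insn_ops;
	int no_mem_ops;
	int no_var;
	int no_typeinfo;
	int bad_offset;
};

/* "bad" is simply the sum of all failure counters */
static int count_bad(const struct stat_sketch *s)
{
	return s->no_sym + s->no_insn_ops + s->no_mem_ops +
	       s->no_var + s->no_typeinfo + s->bad_offset;
}

/* Avoid a division by zero when nothing was counted */
static double percent(int part, int total)
{
	return 100.0 * part / (total ? total : 1);
}

/* Print a counter with its own field name, skipping zero entries */
#define PRINT_STAT(s, fld) \
	do { if ((s)->fld) printf("%10d : %s\n", (s)->fld, #fld); } while (0)

static void print_stat_sketch(const struct stat_sketch *s)
{
	int bad = count_bad(s);
	int ok = s->total - bad;

	printf("total %d, ok %d (%.1f%%), bad %d (%.1f%%)\n",
	       s->total, ok, percent(ok, s->total), bad, percent(bad, s->total));
	PRINT_STAT(s, no_sym);
	PRINT_STAT(s, no_insn_ops);
	PRINT_STAT(s, no_mem_ops);
	PRINT_STAT(s, no_var);
	PRINT_STAT(s, no_typeinfo);
	PRINT_STAT(s, bad_offset);
}
```

With the numbers above, the failures add up to 30 + 40 + 33 + 63 + 4 + 8
= 178, which leaves 294 - 178 = 116 successful entries.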

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/Documentation/perf-annotate.txt |  3 ++
 tools/perf/builtin-annotate.c              | 44 +++++++++++++++++++++-
 tools/perf/util/annotate-data.c            | 10 ++++-
 tools/perf/util/annotate-data.h            | 31 +++++++++++++++
 tools/perf/util/annotate.c                 | 23 +++++++++--
 5 files changed, 105 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-annotate.txt b/tools/perf/Documentation/perf-annotate.txt
index 0e6a49b7795c..b95524bea021 100644
--- a/tools/perf/Documentation/perf-annotate.txt
+++ b/tools/perf/Documentation/perf-annotate.txt
@@ -162,6 +162,9 @@ include::itrace.txt[]
 	that case it'd show annotation for the type only, otherwise it'd show
 	all data types it finds.
 
+--type-stat::
+	Show stats for the data type annotation.
+
 
 SEE ALSO
 --------
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 7e4ef93b19a0..e4fc00bc8fdf 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -58,6 +58,7 @@ struct perf_annotate {
 	bool	   has_br_stack;
 	bool	   group_set;
 	bool	   data_type;
+	bool	   type_stat;
 	float	   min_percent;
 	const char *sym_hist_filter;
 	const char *cpu_list;
@@ -401,6 +402,43 @@ static void print_annotated_data_type(struct annotated_data_type *mem_type,
 	printf(";\n");
 }
 
+static void print_annotate_data_stat(struct annotated_data_stat *s)
+{
+#define PRINT_STAT(fld) if (s->fld) printf("%10d : %s\n", s->fld, #fld)
+
+	int bad = s->no_sym +
+			s->no_insn +
+			s->no_insn_ops +
+			s->no_mem_ops +
+			s->no_reg +
+			s->no_dbginfo +
+			s->no_cuinfo +
+			s->no_var +
+			s->no_typeinfo +
+			s->invalid_size +
+			s->bad_offset;
+	int ok = s->total - bad;
+
+	printf("Annotate data type stats:\n");
+	printf("total %d, ok %d (%.1f%%), bad %d (%.1f%%)\n",
+		s->total, ok, 100.0 * ok / (s->total ?: 1), bad, 100.0 * bad / (s->total ?: 1));
+	printf("-----------------------------------------------------------\n");
+	PRINT_STAT(no_sym);
+	PRINT_STAT(no_insn);
+	PRINT_STAT(no_insn_ops);
+	PRINT_STAT(no_mem_ops);
+	PRINT_STAT(no_reg);
+	PRINT_STAT(no_dbginfo);
+	PRINT_STAT(no_cuinfo);
+	PRINT_STAT(no_var);
+	PRINT_STAT(no_typeinfo);
+	PRINT_STAT(invalid_size);
+	PRINT_STAT(bad_offset);
+	printf("\n");
+
+#undef PRINT_STAT
+}
+
 static void hists__find_annotations(struct hists *hists,
 				    struct evsel *evsel,
 				    struct perf_annotate *ann)
@@ -408,6 +446,9 @@ static void hists__find_annotations(struct hists *hists,
 	struct rb_node *nd = rb_first_cached(&hists->entries), *next;
 	int key = K_RIGHT;
 
+	if (ann->type_stat)
+		print_annotate_data_stat(&ann_data_stat);
+
 	while (nd) {
 		struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
 		struct annotation *notes;
@@ -755,7 +796,8 @@ int cmd_annotate(int argc, const char **argv)
 	OPT_CALLBACK_OPTARG(0, "data-type", &annotate, NULL, "name",
 			    "Show data type annotate for the memory accesses",
 			    parse_data_type),
-
+	OPT_BOOLEAN(0, "type-stat", &annotate.type_stat,
+		    "Show stats for the data type annotation"),
 	OPT_END()
 	};
 	int ret;
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 5326396b08ec..79f09ce92f15 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -198,6 +198,7 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 	/* Get the type of the variable */
 	if (die_get_real_type(var_die, type_die) == NULL) {
 		pr_debug("variable has no type\n");
+		ann_data_stat.no_typeinfo++;
 		return -1;
 	}
 
@@ -208,18 +209,21 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 	if (dwarf_tag(type_die) != DW_TAG_pointer_type ||
 	    die_get_real_type(type_die, type_die) == NULL) {
 		pr_debug("no pointer or no type\n");
+		ann_data_stat.no_typeinfo++;
 		return -1;
 	}
 
 	/* Get the size of the actual type */
 	if (dwarf_aggregate_size(type_die, &size) < 0) {
 		pr_debug("type size is unknown\n");
+		ann_data_stat.invalid_size++;
 		return -1;
 	}
 
 	/* Minimal sanity check */
 	if ((unsigned)offset >= size) {
 		pr_debug("offset: %d is bigger than size: %lu\n", offset, size);
+		ann_data_stat.bad_offset++;
 		return -1;
 	}
 
@@ -238,6 +242,7 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 	/* Get a compile_unit for this address */
 	if (!find_cu_die(di, pc, &cu_die)) {
 		pr_debug("cannot find CU for address %lx\n", pc);
+		ann_data_stat.no_cuinfo++;
 		return -1;
 	}
 
@@ -252,9 +257,12 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 
 		/* Found a variable, see if it's correct */
 		ret = check_variable(&var_die, type_die, offset);
-		break;
+		goto out;
 	}
+	if (ret < 0)
+		ann_data_stat.no_var++;
 
+out:
 	free(scopes);
 	return ret;
 }
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index d2dc025b1934..8e73096c01d1 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -70,6 +70,37 @@ struct annotated_data_type {
 
 extern struct annotated_data_type unknown_type;
 
+/**
+ * struct annotated_data_stat - Debug statistics
+ * @total: Total number of entries
+ * @no_sym: No symbol or map found
+ * @no_insn: Failed to get disasm line
+ * @no_insn_ops: The instruction has no operands
+ * @no_mem_ops: The instruction has no memory operands
+ * @no_reg: Failed to extract a register from the operand
+ * @no_dbginfo: The binary has no debug information
+ * @no_cuinfo: Failed to find a compile_unit
+ * @no_var: Failed to find a matching variable
+ * @no_typeinfo: Failed to get a type info for the variable
+ * @invalid_size: Failed to get a size info of the type
+ * @bad_offset: The access offset is out of the type
+ */
+struct annotated_data_stat {
+	int total;
+	int no_sym;
+	int no_insn;
+	int no_insn_ops;
+	int no_mem_ops;
+	int no_reg;
+	int no_dbginfo;
+	int no_cuinfo;
+	int no_var;
+	int no_typeinfo;
+	int invalid_size;
+	int bad_offset;
+};
+extern struct annotated_data_stat ann_data_stat;
+
 #ifdef HAVE_DWARF_SUPPORT
 
 /* Returns data type at the location (ip, reg, offset) */
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 4d725562fd0a..c284a29979d6 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -100,6 +100,9 @@ static struct ins_ops nop_ops;
 static struct ins_ops lock_ops;
 static struct ins_ops ret_ops;
 
+/* Data type collection debug statistics */
+struct annotated_data_stat ann_data_stat;
+
 static int arch__grow_instructions(struct arch *arch)
 {
 	struct ins *new_instructions;
@@ -3687,11 +3690,17 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	u64 ip = he->ip;
 	int i;
 
-	if (ms->map == NULL || ms->sym == NULL)
+	ann_data_stat.total++;
+
+	if (ms->map == NULL || ms->sym == NULL) {
+		ann_data_stat.no_sym++;
 		return NULL;
+	}
 
-	if (evsel__get_arch(evsel, &arch) < 0)
+	if (evsel__get_arch(evsel, &arch) < 0) {
+		ann_data_stat.no_insn++;
 		return NULL;
+	}
 
 	/* Make sure it runs objdump to get disasm of the function */
 	symbol__ensure_annotate(ms, evsel);
@@ -3701,11 +3710,15 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	 * This is too slow...
 	 */
 	dl = find_disasm_line(ms->sym, ip);
-	if (dl == NULL)
+	if (dl == NULL) {
+		ann_data_stat.no_insn++;
 		return NULL;
+	}
 
-	if (annotate_get_insn_location(arch, dl, &loc) < 0)
+	if (annotate_get_insn_location(arch, dl, &loc) < 0) {
+		ann_data_stat.no_insn_ops++;
 		return NULL;
+	}
 
 	for_each_insn_op_loc(&loc, i, op_loc) {
 		if (!op_loc->mem_ref)
@@ -3722,5 +3735,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		he->mem_type_off = op_loc->offset;
 		return mem_type;
 	}
+
+	ann_data_stat.no_mem_ops++;
 	return NULL;
 }
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 27/52] perf annotate: Add --insn-stat option for debugging
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (25 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 26/52] perf annotate: Add --type-stat option for debugging Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 28/52] perf annotate-data: Parse 'lock' prefix from llvm-objdump Namhyung Kim
                   ` (25 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

This is for debugging purposes.  It'd be useful to see per-instruction
success/failure stats.

  $ perf annotate --data-type --insn-stat
  Annotate Instruction stats
  total 264, ok 143 (54.2%), bad 121 (45.8%)

    Name      :  Good   Bad
  -----------------------------------------------------------
    movq      :    45    31
    movl      :    22    11
    popq      :     0    19
    cmpl      :    16     3
    addq      :     8     7
    cmpq      :    11     3
    cmpxchgl  :     3     7
    cmpxchgq  :     8     0
    incl      :     3     3
    movzbl    :     4     2
    incq      :     4     2
    decl      :     6     0
    ...
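
The ordering of the table (by good + bad, descending) can be sketched
as below.  Note this is an array-based insertion sort for illustration
only, while the patch sorts a linked list; struct item_stat_sketch and
sort_by_total() are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical per-instruction counter, one entry per mnemonic */
struct item_stat_sketch {
	const char *name;
	int good;
	int bad;
};

/*
 * Insertion sort by total count (good + bad), largest first.
 * Entries with equal totals keep their original relative order.
 */
static void sort_by_total(struct item_stat_sketch *v, size_t n)
{
	for (size_t i = 1; i < n; i++) {
		struct item_stat_sketch key = v[i];
		int sum = key.good + key.bad;
		size_t j = i;

		/* Shift smaller totals one slot to the right */
		while (j > 0 && v[j - 1].good + v[j - 1].bad < sum) {
			v[j] = v[j - 1];
			j--;
		}
		v[j] = key;
	}
}
```

In the output above, popq and cmpl both have 19 entries in total, so
their relative order depends on which one was seen first.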

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/builtin-annotate.c | 41 +++++++++++++++++++++++++++++++++++
 tools/perf/util/annotate.c    | 38 ++++++++++++++++++++++++++++++++
 tools/perf/util/annotate.h    |  8 +++++++
 3 files changed, 87 insertions(+)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index e4fc00bc8fdf..9516d2dbc488 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -59,6 +59,7 @@ struct perf_annotate {
 	bool	   group_set;
 	bool	   data_type;
 	bool	   type_stat;
+	bool	   insn_stat;
 	float	   min_percent;
 	const char *sym_hist_filter;
 	const char *cpu_list;
@@ -439,6 +440,42 @@ static void print_annotate_data_stat(struct annotated_data_stat *s)
 #undef PRINT_STAT
 }
 
+static void print_annotate_item_stat(struct list_head *head, const char *title)
+{
+	struct annotated_item_stat *istat, *pos, *iter;
+	int total_good, total_bad, total;
+	int sum1, sum2;
+	LIST_HEAD(tmp);
+
+	/* sort the list by count */
+	list_splice_init(head, &tmp);
+	total_good = total_bad = 0;
+
+	list_for_each_entry_safe(istat, pos, &tmp, list) {
+		total_good += istat->good;
+		total_bad += istat->bad;
+		sum1 = istat->good + istat->bad;
+
+		list_for_each_entry(iter, head, list) {
+			sum2 = iter->good + iter->bad;
+			if (sum1 > sum2)
+				break;
+		}
+		list_move_tail(&istat->list, &iter->list);
+	}
+	total = total_good + total_bad;
+
+	printf("Annotate %s stats\n", title);
+	printf("total %d, ok %d (%.1f%%), bad %d (%.1f%%)\n\n", total,
+	       total_good, 100.0 * total_good / (total ?: 1),
+	       total_bad, 100.0 * total_bad / (total ?: 1));
+	printf("  %-10s: %5s %5s\n", "Name", "Good", "Bad");
+	printf("-----------------------------------------------------------\n");
+	list_for_each_entry(istat, head, list)
+		printf("  %-10s: %5d %5d\n", istat->name, istat->good, istat->bad);
+	printf("\n");
+}
+
 static void hists__find_annotations(struct hists *hists,
 				    struct evsel *evsel,
 				    struct perf_annotate *ann)
@@ -448,6 +485,8 @@ static void hists__find_annotations(struct hists *hists,
 
 	if (ann->type_stat)
 		print_annotate_data_stat(&ann_data_stat);
+	if (ann->insn_stat)
+		print_annotate_item_stat(&ann_insn_stat, "Instruction");
 
 	while (nd) {
 		struct hist_entry *he = rb_entry(nd, struct hist_entry, rb_node);
@@ -798,6 +837,8 @@ int cmd_annotate(int argc, const char **argv)
 			    parse_data_type),
 	OPT_BOOLEAN(0, "type-stat", &annotate.type_stat,
 		    "Show stats for the data type annotation"),
+	OPT_BOOLEAN(0, "insn-stat", &annotate.insn_stat,
+		    "Show instruction stats for the data type annotation"),
 	OPT_END()
 	};
 	int ret;
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index c284a29979d6..3ac601d70f61 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -102,6 +102,7 @@ static struct ins_ops ret_ops;
 
 /* Data type collection debug statistics */
 struct annotated_data_stat ann_data_stat;
+LIST_HEAD(ann_insn_stat);
 
 static int arch__grow_instructions(struct arch *arch)
 {
@@ -3669,6 +3670,30 @@ static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip)
 	return NULL;
 }
 
+static struct annotated_item_stat *annotate_data_stat(struct list_head *head,
+						      const char *name)
+{
+	struct annotated_item_stat *istat;
+
+	list_for_each_entry(istat, head, list) {
+		if (!strcmp(istat->name, name))
+			return istat;
+	}
+
+	istat = zalloc(sizeof(*istat));
+	if (istat == NULL)
+		return NULL;
+
+	istat->name = strdup(name);
+	if (istat->name == NULL) {
+		free(istat);
+		return NULL;
+	}
+
+	list_add_tail(&istat->list, head);
+	return istat;
+}
+
 /**
  * hist_entry__get_data_type - find data type for given hist entry
  * @he: hist entry
@@ -3687,6 +3712,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	struct annotated_insn_loc loc;
 	struct annotated_op_loc *op_loc;
 	struct annotated_data_type *mem_type;
+	struct annotated_item_stat *istat;
 	u64 ip = he->ip;
 	int i;
 
@@ -3715,8 +3741,15 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		return NULL;
 	}
 
+	istat = annotate_data_stat(&ann_insn_stat, dl->ins.name);
+	if (istat == NULL) {
+		ann_data_stat.no_insn++;
+		return NULL;
+	}
+
 	if (annotate_get_insn_location(arch, dl, &loc) < 0) {
 		ann_data_stat.no_insn_ops++;
+		istat->bad++;
 		return NULL;
 	}
 
@@ -3725,6 +3758,10 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 			continue;
 
 		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
+		if (mem_type)
+			istat->good++;
+		else
+			istat->bad++;
 
 		if (symbol_conf.annotate_data_sample) {
 			annotated_data_type__update_samples(mem_type, evsel,
@@ -3737,5 +3774,6 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	}
 
 	ann_data_stat.no_mem_ops++;
+	istat->bad++;
 	return NULL;
 }
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 06281a50ecf6..2cef96859e45 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -485,4 +485,12 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 /* Returns a data type from the sample instruction (if any) */
 struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he);
 
+struct annotated_item_stat {
+	struct list_head list;
+	char *name;
+	int good;
+	int bad;
+};
+extern struct list_head ann_insn_stat;
+
 #endif	/* __PERF_ANNOTATE_H */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 28/52] perf annotate-data: Parse 'lock' prefix from llvm-objdump
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (26 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 27/52] perf annotate: Add --insn-stat " Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 29/52] perf annotate-data: Handle macro fusion on x86 Namhyung Kim
                   ` (24 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

For performance reasons, I prefer llvm-objdump over GNU objdump.  But I
found that llvm-objdump puts the x86 lock prefix on a separate line like
below.

  ffffffff81000695: f0                    lock
  ffffffff81000696: ff 83 54 0b 00 00     incl    2900(%rbx)

This should be parsed properly, but for now I just changed it to find
the insn at the next offset.
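
The workaround can be sketched like this.  It is a simplified model,
assuming disassembly lines are kept in an array sorted by offset and
that the lock prefix is one byte; struct line_sketch and find_line()
are hypothetical names, not the perf data structures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified disassembly line */
struct line_sketch {
	unsigned long offset;	/* offset from the symbol start */
	const char *name;	/* instruction mnemonic */
	const char *ops;	/* raw operands, "" if none */
};

/*
 * Find the line at @ip.  If it is a bare "lock" prefix line, move on to
 * the next byte so that the following instruction is returned instead.
 */
static const struct line_sketch *
find_line(const struct line_sketch *lines, size_t n,
	  unsigned long sym_start, unsigned long ip)
{
	for (size_t i = 0; i < n; i++) {
		if (sym_start + lines[i].offset != ip)
			continue;

		if (!strcmp(lines[i].name, "lock") && lines[i].ops[0] == '\0') {
			ip++;	/* the prefix takes a single byte */
			continue;
		}
		return &lines[i];
	}
	return NULL;
}
```

Since the returned line may start one byte after the sampled address,
the caller has to recompute the IP from the line offset before looking
up debug info.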

This improves the statistics as it can process more instructions.

  Annotate data type stats:
  total 294, ok 144 (49.0%), bad 150 (51.0%)
  -----------------------------------------------------------
          30 : no_sym
          35 : no_mem_ops
          71 : no_var
           6 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 3ac601d70f61..2f325a9cf33a 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3664,8 +3664,17 @@ static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip)
 	notes = symbol__annotation(sym);
 
 	list_for_each_entry(dl, &notes->src->source, al.node) {
-		if (sym->start + dl->al.offset == ip)
+		if (sym->start + dl->al.offset == ip) {
+			/*
+			 * llvm-objdump places "lock" in a separate line and
+			 * in that case, we want to get the next line.
+			 */
+			if (!strcmp(dl->ins.name, "lock") && *dl->ops.raw == '\0') {
+				ip++;
+				continue;
+			}
 			return dl;
+		}
 	}
 	return NULL;
 }
@@ -3757,6 +3766,9 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		if (!op_loc->mem_ref)
 			continue;
 
+		/* Recalculate IP since it can be changed due to LOCK prefix */
+		ip = ms->sym->start + dl->al.offset;
+
 		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
 		if (mem_type)
 			istat->good++;
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 29/52] perf annotate-data: Handle macro fusion on x86
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (27 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 28/52] perf annotate-data: Parse 'lock' prefix from llvm-objdump Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 30/52] perf annotate-data: Handle array style accesses Namhyung Kim
                   ` (23 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

When a sample comes from a conditional branch without a memory operand,
it could be due to macro fusion with the previous instruction.  So it
needs to check the memory operand in the previous one.
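
The fallback can be sketched as follows.  This is a rough model only:
is_fused_sketch() is a simplified stand-in that treats a cmp/test
followed by a conditional jump as fused, which is an assumption and not
the actual per-CPU-model check, and struct insn_sketch and
pick_mem_insn() are hypothetical names.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified instruction record */
struct insn_sketch {
	const char *name;	/* mnemonic */
	bool has_mem_operand;	/* true if any operand references memory */
};

/* Simplified stand-in: cmp/test followed by a conditional jump */
static bool is_fused_sketch(const char *prev, const char *cur)
{
	bool cmp_like = (!strncmp(prev, "cmp", 3) && !strstr(prev, "xchg")) ||
			!strncmp(prev, "test", 4);
	bool cond_jump = cur[0] == 'j' && strncmp(cur, "jmp", 3) != 0;

	return cmp_like && cond_jump;
}

/*
 * Return the index of the instruction whose memory operand should be
 * used for the sample at @idx, or -1 if there is none.
 */
static int pick_mem_insn(const struct insn_sketch *insns, int idx)
{
	while (idx >= 0) {
		if (insns[idx].has_mem_operand)
			return idx;

		/* No memory operand: only retry if fused with the previous insn */
		if (idx == 0 ||
		    !is_fused_sketch(insns[idx - 1].name, insns[idx].name))
			return -1;
		idx--;
	}
	return -1;
}
```

For a "cmpl 0x10(%rdi), %eax" followed by "jne", a sample on the jne
would be attributed to the memory operand of the cmpl.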

This improves the stat like below:

  Annotate data type stats:
  total 294, ok 147 (50.0%), bad 147 (50.0%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          71 : no_var
           6 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 2f325a9cf33a..7d733bc85c9a 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3750,6 +3750,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		return NULL;
 	}
 
+retry:
 	istat = annotate_data_stat(&ann_insn_stat, dl->ins.name);
 	if (istat == NULL) {
 		ann_data_stat.no_insn++;
@@ -3766,7 +3767,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		if (!op_loc->mem_ref)
 			continue;
 
-		/* Recalculate IP since it can be changed due to LOCK prefix */
+		/* Recalculate IP because of LOCK prefix or insn fusion */
 		ip = ms->sym->start + dl->al.offset;
 
 		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
@@ -3785,6 +3786,20 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		return mem_type;
 	}
 
+	/*
+	 * Some instructions can be fused and the actual memory access came
+	 * from the previous instruction.
+	 */
+	if (dl->al.offset > 0) {
+		struct disasm_line *prev_dl;
+
+		prev_dl = list_prev_entry(dl, al.node);
+		if (ins__is_fused(arch, prev_dl->ins.name, dl->ins.name)) {
+			dl = prev_dl;
+			goto retry;
+		}
+	}
+
 	ann_data_stat.no_mem_ops++;
 	istat->bad++;
 	return NULL;
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 30/52] perf annotate-data: Handle array style accesses
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (28 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 29/52] perf annotate-data: Handle macro fusion on x86 Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 31/52] perf annotate-data: Add stack operation pseudo type Namhyung Kim
                   ` (22 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

On x86, instructions for array access often look like below.

  mov  0x1234(%rax,%rbx,8), %rcx

Usually the first register holds the type information and the second one
has the index.  And the current code only looks up a variable for the
first register.  But it can be the other way around, so it needs to
check the second register if the first one failed.
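
The register fallback can be sketched like this.  It is a minimal
model, assuming the variable lookup is abstracted behind a callback;
struct op_loc_sketch, lookup_only() and find_var_reg() are hypothetical
names used only for the illustration.

```c
#include <stdbool.h>

/* Hypothetical operand location: base register, index register, offset */
struct op_loc_sketch {
	int reg1;
	int reg2;
	int offset;
	bool multi_regs;	/* true for forms like 0x1234(%rax,%rbx,8) */
};

/* Lookup callback: returns 0 if a variable lives in @reg, negative if not */
typedef int (*lookup_fn)(int reg, void *ctx);

/* Example callback: @ctx points to the only register that holds a variable */
static int lookup_only(int reg, void *ctx)
{
	return reg == *(int *)ctx ? 0 : -1;
}

/*
 * Try the first register, then fall back to the second one if the
 * operand uses two different registers.  Returns the register that
 * matched, or -1 if neither did.
 */
static int find_var_reg(const struct op_loc_sketch *loc,
			lookup_fn lookup, void *ctx)
{
	int reg = loc->reg1;

	for (;;) {
		if (lookup(reg, ctx) == 0)
			return reg;

		if (loc->multi_regs && reg == loc->reg1 &&
		    loc->reg1 != loc->reg2) {
			reg = loc->reg2;
			continue;
		}
		return -1;
	}
}
```

The reg == reg1 test makes sure the second register is tried at most
once, so the retry cannot loop forever.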

The stat changed like this.

  Annotate data type stats:
  total 294, ok 148 (50.3%), bad 146 (49.7%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          66 : no_var
          10 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 24 +++++++++++++-----
 tools/perf/util/annotate-data.h |  5 ++--
 tools/perf/util/annotate.c      | 43 ++++++++++++++++++++++++++-------
 tools/perf/util/annotate.h      |  8 ++++--
 4 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 79f09ce92f15..159fceeebaa4 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -8,6 +8,7 @@
 #include <stdio.h>
 #include <stdlib.h>
 
+#include "annotate.h"
 #include "annotate-data.h"
 #include "debuginfo.h"
 #include "debug.h"
@@ -206,7 +207,8 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 	 * It expects a pointer type for a memory access.
 	 * Convert to a real type it points to.
 	 */
-	if (dwarf_tag(type_die) != DW_TAG_pointer_type ||
+	if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
+	     dwarf_tag(type_die) != DW_TAG_array_type) ||
 	    die_get_real_type(type_die, type_die) == NULL) {
 		pr_debug("no pointer or no type\n");
 		ann_data_stat.no_typeinfo++;
@@ -232,10 +234,11 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 
 /* The result will be saved in @type_die */
 static int find_data_type_die(struct debuginfo *di, u64 pc,
-			      int reg, int offset, Dwarf_Die *type_die)
+			      struct annotated_op_loc *loc, Dwarf_Die *type_die)
 {
 	Dwarf_Die cu_die, var_die;
 	Dwarf_Die *scopes = NULL;
+	int reg, offset;
 	int ret = -1;
 	int i, nr_scopes;
 
@@ -249,6 +252,10 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
 	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
 
+	reg = loc->reg1;
+	offset = loc->offset;
+
+retry:
 	/* Search from the inner-most scope to the outer */
 	for (i = nr_scopes - 1; i >= 0; i--) {
 		/* Look up variables/parameters in this scope */
@@ -259,6 +266,12 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 		ret = check_variable(&var_die, type_die, offset);
 		goto out;
 	}
+
+	if (loc->multi_regs && reg == loc->reg1 && loc->reg1 != loc->reg2) {
+		reg = loc->reg2;
+		goto retry;
+	}
+
 	if (ret < 0)
 		ann_data_stat.no_var++;
 
@@ -271,15 +284,14 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
  * find_data_type - Return a data type at the location
  * @ms: map and symbol at the location
  * @ip: instruction address of the memory access
- * @reg: register that holds the base address
- * @offset: offset from the base address
+ * @loc: instruction operand location
  *
  * This functions searches the debug information of the binary to get the data
  * type it accesses.  The exact location is expressed by (ip, reg, offset).
  * It return %NULL if not found.
  */
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   int reg, int offset)
+					   struct annotated_op_loc *loc)
 {
 	struct annotated_data_type *result = NULL;
 	struct dso *dso = ms->map->dso;
@@ -299,7 +311,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	 * a file address for DWARF processing.
 	 */
 	pc = map__rip_2objdump(ms->map, ip);
-	if (find_data_type_die(di, pc, reg, offset, &type_die) < 0)
+	if (find_data_type_die(di, pc, loc, &type_die) < 0)
 		goto out;
 
 	result = dso__findnew_data_type(dso, &type_die);
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 8e73096c01d1..65ddd839850f 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -7,6 +7,7 @@
 #include <linux/rbtree.h>
 #include <linux/types.h>
 
+struct annotated_op_loc;
 struct evsel;
 struct map_symbol;
 
@@ -105,7 +106,7 @@ extern struct annotated_data_stat ann_data_stat;
 
 /* Returns data type at the location (ip, reg, offset) */
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   int reg, int offset);
+					   struct annotated_op_loc *loc);
 
 /* Update type access histogram at the given offset */
 int annotated_data_type__update_samples(struct annotated_data_type *adt,
@@ -119,7 +120,7 @@ void annotated_data_type__tree_delete(struct rb_root *root);
 
 static inline struct annotated_data_type *
 find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
-	       int reg __maybe_unused, int offset __maybe_unused)
+	       struct annotated_op_loc *loc __maybe_unused)
 {
 	return NULL;
 }
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 7d733bc85c9a..19e7f4000368 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3567,8 +3567,22 @@ static int extract_reg_offset(struct arch *arch, const char *str,
 	if (regname == NULL)
 		return -1;
 
-	op_loc->reg = get_dwarf_regnum(regname, 0);
+	op_loc->reg1 = get_dwarf_regnum(regname, 0);
 	free(regname);
+
+	/* Get the second register */
+	if (op_loc->multi_regs) {
+		p = strchr(p + 1, arch->objdump.register_char);
+		if (p == NULL)
+			return -1;
+
+		regname = strdup(p);
+		if (regname == NULL)
+			return -1;
+
+		op_loc->reg2 = get_dwarf_regnum(regname, 0);
+		free(regname);
+	}
 	return 0;
 }
 
@@ -3581,14 +3595,20 @@ static int extract_reg_offset(struct arch *arch, const char *str,
  * Get detailed location info (register and offset) in the instruction.
  * It needs both source and target operand and whether it accesses a
  * memory location.  The offset field is meaningful only when the
- * corresponding mem flag is set.
+ * corresponding mem flag is set.  The reg2 field is meaningful only
+ * when multi_regs flag is set.
  *
  * Some examples on x86:
  *
- *   mov  (%rax), %rcx   # src_reg = rax, src_mem = 1, src_offset = 0
- *                       # dst_reg = rcx, dst_mem = 0
+ *   mov  (%rax), %rcx   # src_reg1 = rax, src_mem = 1, src_offset = 0
+ *                       # dst_reg1 = rcx, dst_mem = 0
  *
- *   mov  0x18, %r8      # src_reg = -1, dst_reg = r8
+ *   mov  0x18, %r8      # src_reg1 = -1, src_mem = 0
+ *                       # dst_reg1 = r8, dst_mem = 0
+ *
+ *   mov  %rsi, 8(%rbx,%rcx,4)  # src_reg1 = rsi, src_mem = 0, dst_multi_regs = 0
+ *                              # dst_reg1 = rbx, dst_reg2 = rcx, dst_mem = 1
+ *                              # dst_multi_regs = 1, dst_offset = 8
  */
 int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 			       struct annotated_insn_loc *loc)
@@ -3609,24 +3629,29 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 
 	for_each_insn_op_loc(loc, i, op_loc) {
 		const char *insn_str = ops->source.raw;
+		bool multi_regs = ops->source.multi_regs;
 
-		if (i == INSN_OP_TARGET)
+		if (i == INSN_OP_TARGET) {
 			insn_str = ops->target.raw;
+			multi_regs = ops->target.multi_regs;
+		}
 
 		/* Invalidate the register by default */
-		op_loc->reg = -1;
+		op_loc->reg1 = -1;
+		op_loc->reg2 = -1;
 
 		if (insn_str == NULL)
 			continue;
 
 		if (strchr(insn_str, arch->objdump.memory_ref_char)) {
 			op_loc->mem_ref = true;
+			op_loc->multi_regs = multi_regs;
 			extract_reg_offset(arch, insn_str, op_loc);
 		} else {
 			char *s = strdup(insn_str);
 
 			if (s) {
-				op_loc->reg = get_dwarf_regnum(s, 0);
+				op_loc->reg1 = get_dwarf_regnum(s, 0);
 				free(s);
 			}
 		}
@@ -3770,7 +3795,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		/* Recalculate IP because of LOCK prefix or insn fusion */
 		ip = ms->sym->start + dl->al.offset;
 
-		mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
+		mem_type = find_data_type(ms, ip, op_loc);
 		if (mem_type)
 			istat->good++;
 		else
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 2cef96859e45..f5a6c3227757 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -448,14 +448,18 @@ int annotate_check_args(struct annotation_options *args);
 
 /**
  * struct annotated_op_loc - Location info of instruction operand
- * @reg: Register in the operand
+ * @reg1: First register in the operand
+ * @reg2: Second register in the operand
  * @offset: Memory access offset in the operand
  * @mem_ref: Whether the operand accesses memory
+ * @multi_regs: Whether the second register is used
  */
 struct annotated_op_loc {
-	int reg;
+	int reg1;
+	int reg2;
 	int offset;
 	bool mem_ref;
+	bool multi_regs;
 };
 
 enum annotated_insn_ops {
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 31/52] perf annotate-data: Add stack operation pseudo type
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (29 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 30/52] perf annotate-data: Handle array style accesses Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 32/52] perf dwarf-aux: Add die_find_variable_by_addr() Namhyung Kim
                   ` (21 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

A typical function prologue and epilogue include multiple stack
operations to save and restore the current value of registers.
On x86, it looks like below:

  push  r15
  push  r14
  push  r13
  push  r12

  ...

  pop   r12
  pop   r13
  pop   r14
  pop   r15
  ret

As these all touch the stack memory region, chances are high that they
appear in memory profile data.  But these are not used for any real
purpose yet so they'd return no types.

One of my profiles shows that a non-negligible portion of data came from
the stack operations.  It also seems GCC generates more stack operations
than clang.

Annotate Instruction stats
total 264, ok 169 (64.0%), bad 95 (36.0%)

    Name      :  Good   Bad
  -----------------------------------------------------------
    movq      :    49    27
    movl      :    24     9
    popq      :     0    19   <-- here
    cmpl      :    17     2
    addq      :    14     1
    cmpq      :    12     2
    cmpxchgl  :     3     7

Instead of treating them as unknown, let's create a separate pseudo type
to represent those stack operations.
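
The classification is a plain prefix match on the mnemonic.  A
standalone sketch of that check follows; is_stack_mnemonic() is a
hypothetical name for this example, while the patch does the same
comparison on dl->ins.name after checking the arch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if @name starts with one of the x86 stack-related
 * mnemonics.  Suffixed forms such as "pushq" or "retq" match too
 * because only the prefix is compared.
 */
static bool is_stack_mnemonic(const char *name)
{
	assert(name != NULL);

	return !strncmp(name, "push", 4) ||
	       !strncmp(name, "pop", 3) ||
	       !strncmp(name, "ret", 3);
}
```

Note that a prefix match like this would also accept a mnemonic such as
"popcnt", which is not a stack operation, so a stricter match might be
worth considering.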

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.h |  1 +
 tools/perf/util/annotate.c      | 26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+)

diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 65ddd839850f..214c625e7bc9 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -70,6 +70,7 @@ struct annotated_data_type {
 };
 
 extern struct annotated_data_type unknown_type;
+extern struct annotated_data_type stackop_type;
 
 /**
  * struct annotated_data_stat - Debug statistics
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 19e7f4000368..4ea32c2dee4b 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -104,6 +104,14 @@ static struct ins_ops ret_ops;
 struct annotated_data_stat ann_data_stat;
 LIST_HEAD(ann_insn_stat);
 
+/* Pseudo data types */
+struct annotated_data_type stackop_type = {
+	.self = {
+		.type_name = (char *)"(stack operation)",
+		.children = LIST_HEAD_INIT(stackop_type.self.children),
+	},
+};
+
 static int arch__grow_instructions(struct arch *arch)
 {
 	struct ins *new_instructions;
@@ -3728,6 +3736,18 @@ static struct annotated_item_stat *annotate_data_stat(struct list_head *head,
 	return istat;
 }
 
+static bool is_stack_operation(struct arch *arch, struct disasm_line *dl)
+{
+	if (arch__is(arch, "x86")) {
+		if (!strncmp(dl->ins.name, "push", 4) ||
+		    !strncmp(dl->ins.name, "pop", 3) ||
+		    !strncmp(dl->ins.name, "ret", 3))
+			return true;
+	}
+
+	return false;
+}
+
 /**
  * hist_entry__get_data_type - find data type for given hist entry
  * @he: hist entry
@@ -3788,6 +3808,12 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		return NULL;
 	}
 
+	if (is_stack_operation(arch, dl)) {
+		istat->good++;
+		he->mem_type_off = 0;
+		return &stackop_type;
+	}
+
 	for_each_insn_op_loc(&loc, i, op_loc) {
 		if (!op_loc->mem_ref)
 			continue;
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 32/52] perf dwarf-aux: Add die_find_variable_by_addr()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (30 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 31/52] perf annotate-data: Add stack operation pseudo type Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-27 22:07   ` Arnaldo Carvalho de Melo
  2023-11-09 23:59 ` [PATCH 33/52] perf annotate-data: Handle PC-relative addressing Namhyung Kim
                   ` (20 subsequent siblings)
  52 siblings, 1 reply; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

die_find_variable_by_addr() finds a variable in the given DIE using the
given (PC-relative) address.  Global variables have a location
expression with DW_OP_addr, which holds an address, so it can simply be
compared with the target address.

  <1><143a7>: Abbrev Number: 2 (DW_TAG_variable)
      <143a8>   DW_AT_name        : loops_per_jiffy
      <143ac>   DW_AT_type        : <0x1cca>
      <143b0>   DW_AT_external    : 1
      <143b0>   DW_AT_decl_file   : 193
      <143b1>   DW_AT_decl_line   : 213
      <143b2>   DW_AT_location    : 9 byte block: 3 b0 46 41 82 ff ff ff ff
                                     (DW_OP_addr: ffffffff824146b0)

Note that the type-offset should be calculated from the base address of
the global variable.

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c | 79 +++++++++++++++++++++++++++++++++++++
 tools/perf/util/dwarf-aux.h | 14 +++++++
 2 files changed, 93 insertions(+)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 4bdcd3dea28f..7aa5fee0da19 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1266,8 +1266,12 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf)
 struct find_var_data {
 	/* Target instruction address */
 	Dwarf_Addr pc;
+	/* Target memory address (for global data) */
+	Dwarf_Addr addr;
 	/* Target register */
 	unsigned reg;
+	/* Access offset, set for global data */
+	int offset;
 };
 
 /* Max number of registers DW_OP_regN supports */
@@ -1328,6 +1332,81 @@ Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
 	};
 	return die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem);
 }
+
+/* Only checks direct child DIEs in the given scope */
+static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg)
+{
+	struct find_var_data *data = arg;
+	int tag = dwarf_tag(die_mem);
+	ptrdiff_t off = 0;
+	Dwarf_Attribute attr;
+	Dwarf_Addr base, start, end;
+	Dwarf_Word size;
+	Dwarf_Die type_die;
+	Dwarf_Op *ops;
+	size_t nops;
+
+	if (tag != DW_TAG_variable)
+		return DIE_FIND_CB_SIBLING;
+
+	if (dwarf_attr(die_mem, DW_AT_location, &attr) == NULL)
+		return DIE_FIND_CB_SIBLING;
+
+	while ((off = dwarf_getlocations(&attr, off, &base, &start, &end, &ops, &nops)) > 0) {
+		if (ops->atom != DW_OP_addr)
+			continue;
+
+		if (data->addr < ops->number)
+			continue;
+
+		if (data->addr == ops->number) {
+			/* Update offset relative to the start of the variable */
+			data->offset = 0;
+			return DIE_FIND_CB_END;
+		}
+
+		if (die_get_real_type(die_mem, &type_die) == NULL)
+			continue;
+
+		if (dwarf_aggregate_size(&type_die, &size) < 0)
+			continue;
+
+		if (data->addr >= ops->number + size)
+			continue;
+
+		/* Update offset relative to the start of the variable */
+		data->offset = data->addr - ops->number;
+		return DIE_FIND_CB_END;
+	}
+	return DIE_FIND_CB_SIBLING;
+}
+
+/**
+ * die_find_variable_by_addr - Find variable located at given address
+ * @sc_die: a scope DIE
+ * @pc: the program address to find
+ * @addr: the data address to find
+ * @die_mem: a buffer to save the resulting DIE
+ * @offset: the offset in the resulting type
+ *
+ * Find the variable DIE located at the given address (in PC-relative mode).
+ * This is usually for global variables.
+ */
+Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc,
+				     Dwarf_Addr addr, Dwarf_Die *die_mem,
+				     int *offset)
+{
+	struct find_var_data data = {
+		.pc = pc,
+		.addr = addr,
+	};
+	Dwarf_Die *result;
+
+	result = die_find_child(sc_die, __die_find_var_addr_cb, &data, die_mem);
+	if (result)
+		*offset = data.offset;
+	return result;
+}
 #endif
 
 /*
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index f9763d3b7572..4e64caac6df8 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -144,6 +144,11 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf);
 Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
 				    Dwarf_Die *die_mem);
 
+/* Find a (global) variable located in the 'addr' */
+Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc,
+				     Dwarf_Addr addr, Dwarf_Die *die_mem,
+				     int *offset);
+
 #else /*  HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
 static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused,
@@ -161,6 +166,15 @@ static inline Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die __maybe_unus
 	return NULL;
 }
 
+static inline Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die __maybe_unused,
+						   Dwarf_Addr pc __maybe_unused,
+						   Dwarf_Addr addr __maybe_unused,
+						   Dwarf_Die *die_mem __maybe_unused,
+						   int *offset __maybe_unused)
+{
+	return NULL;
+}
+
 #endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
 #endif /* _DWARF_AUX_H */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 33/52] perf annotate-data: Handle PC-relative addressing
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (31 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 32/52] perf dwarf-aux: Add die_find_variable_by_addr() Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 34/52] perf annotate-data: Support global variables Namhyung Kim
                   ` (19 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Extend find_data_type_die() to find data type from PC-relative address
using die_find_variable_by_addr().  Users need to pass the address for
the (global) variable.

The offset for the variable should be updated after finding the type
because the offset in the instruction is just to calculate the address
for the variable.  So it changed to pass a pointer to offset and renamed
it to 'poffset'.

First it searches variables in the CU DIE as it's likely that the global
variables are defined in the file level.  And then it iterates the scope
DIEs to find a local (static) variable.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 56 ++++++++++++++++++++++-----------
 1 file changed, 38 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 159fceeebaa4..61d2a17044c0 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -13,6 +13,7 @@
 #include "debuginfo.h"
 #include "debug.h"
 #include "dso.h"
+#include "dwarf-regs.h"
 #include "evsel.h"
 #include "evlist.h"
 #include "map.h"
@@ -192,7 +193,8 @@ static bool find_cu_die(struct debuginfo *di, u64 pc, Dwarf_Die *cu_die)
 }
 
 /* The type info will be saved in @type_die */
-static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
+static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,
+			  bool is_pointer)
 {
 	Dwarf_Word size;
 
@@ -204,15 +206,18 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 	}
 
 	/*
-	 * It expects a pointer type for a memory access.
-	 * Convert to a real type it points to.
+	 * Usually it expects a pointer type for a memory access.
+	 * Convert to a real type it points to.  But global variables
+	 * are accessed directly without a pointer.
 	 */
-	if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
-	     dwarf_tag(type_die) != DW_TAG_array_type) ||
-	    die_get_real_type(type_die, type_die) == NULL) {
-		pr_debug("no pointer or no type\n");
-		ann_data_stat.no_typeinfo++;
-		return -1;
+	if (is_pointer) {
+		if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
+		     dwarf_tag(type_die) != DW_TAG_array_type) ||
+		    die_get_real_type(type_die, type_die) == NULL) {
+			pr_debug("no pointer or no type\n");
+			ann_data_stat.no_typeinfo++;
+			return -1;
+		}
 	}
 
 	/* Get the size of the actual type */
@@ -233,7 +238,7 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
 }
 
 /* The result will be saved in @type_die */
-static int find_data_type_die(struct debuginfo *di, u64 pc,
+static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 			      struct annotated_op_loc *loc, Dwarf_Die *type_die)
 {
 	Dwarf_Die cu_die, var_die;
@@ -249,21 +254,36 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
 		return -1;
 	}
 
-	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
-	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
-
 	reg = loc->reg1;
 	offset = loc->offset;
 
+	if (reg == DWARF_REG_PC &&
+	    die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) {
+		ret = check_variable(&var_die, type_die, offset,
+				     /*is_pointer=*/false);
+		goto out;
+	}
+
+	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
+	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
+
 retry:
 	/* Search from the inner-most scope to the outer */
 	for (i = nr_scopes - 1; i >= 0; i--) {
-		/* Look up variables/parameters in this scope */
-		if (!die_find_variable_by_reg(&scopes[i], pc, reg, &var_die))
-			continue;
+		if (reg == DWARF_REG_PC) {
+			if (!die_find_variable_by_addr(&scopes[i], pc, addr,
+						       &var_die, &offset))
+				continue;
+		} else {
+			/* Look up variables/parameters in this scope */
+			if (!die_find_variable_by_reg(&scopes[i], pc, reg,
+						      &var_die))
+				continue;
+		}
 
 		/* Found a variable, see if it's correct */
-		ret = check_variable(&var_die, type_die, offset);
+		ret = check_variable(&var_die, type_die, offset,
+				     reg != DWARF_REG_PC);
 		goto out;
 	}
 
@@ -311,7 +331,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	 * a file address for DWARF processing.
 	 */
 	pc = map__rip_2objdump(ms->map, ip);
-	if (find_data_type_die(di, pc, loc, &type_die) < 0)
+	if (find_data_type_die(di, pc, 0, loc, &type_die) < 0)
 		goto out;
 
 	result = dso__findnew_data_type(dso, &type_die);
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 34/52] perf annotate-data: Support global variables
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (32 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 33/52] perf annotate-data: Handle PC-relative addressing Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 35/52] perf dwarf-aux: Add die_get_cfa() Namhyung Kim
                   ` (18 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Global variables are accessed using PC-relative address so it needs to
be handled separately.  The PC-rel addressing is detected by using
DWARF_REG_PC.  On x86, %rip register would be used.

The address can be calculated using the ip and offset in the
instruction.  But it should start from the next instruction so add
annotate_calc_pcrel() to do it properly.
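
The arithmetic itself is small.  A standalone sketch, assuming the
instruction length is already known (the patch derives it from the
offset of the next disasm_line instead), could look like this;
pcrel_target() is a hypothetical name used only here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the target of a PC-relative access.  The displacement is
 * relative to the address of the *next* instruction, i.e. the current
 * instruction address plus its length.
 */
static uint64_t pcrel_target(uint64_t ip, uint64_t insn_len, int32_t disp)
{
	assert(insn_len > 0);

	/* Sign-extend the displacement before adding it. */
	return ip + insn_len + (uint64_t)(int64_t)disp;
}
```

For example, a 7-byte instruction at 0x1000 with displacement 0x10
refers to 0x1017, not 0x1010.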

But global variables defined in a different file would only have a
declaration which doesn't include a location list.  So it first tries
to get the type info using the address, and then looks up the variable
declarations using name.  The name of global variables should be obtained
from the symbol table.  The declaration would have the type info.

So extend find_data_type() to take both address and name for global
variables.

The stat now looks like:

  Annotate data type stats:
  total 294, ok 153 (52.0%), bad 141 (48.0%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          61 : no_var
          10 : no_typeinfo
           8 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 38 ++++++++++++++++------
 tools/perf/util/annotate-data.h |  6 ++--
 tools/perf/util/annotate.c      | 57 +++++++++++++++++++++++++++++++--
 tools/perf/util/annotate.h      |  4 +++
 4 files changed, 92 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 61d2a17044c0..99ecf4b3665c 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -239,7 +239,8 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,
 
 /* The result will be saved in @type_die */
 static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
-			      struct annotated_op_loc *loc, Dwarf_Die *type_die)
+			      const char *var_name, struct annotated_op_loc *loc,
+			      Dwarf_Die *type_die)
 {
 	Dwarf_Die cu_die, var_die;
 	Dwarf_Die *scopes = NULL;
@@ -257,11 +258,21 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 	reg = loc->reg1;
 	offset = loc->offset;
 
-	if (reg == DWARF_REG_PC &&
-	    die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) {
-		ret = check_variable(&var_die, type_die, offset,
-				     /*is_pointer=*/false);
-		goto out;
+	if (reg == DWARF_REG_PC) {
+		if (die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) {
+			ret = check_variable(&var_die, type_die, offset,
+					     /*is_pointer=*/false);
+			loc->offset = offset;
+			goto out;
+		}
+
+		if (var_name && die_find_variable_at(&cu_die, var_name, pc,
+						     &var_die)) {
+			ret = check_variable(&var_die, type_die, 0,
+					     /*is_pointer=*/false);
+			/* loc->offset will be updated by the caller */
+			goto out;
+		}
 	}
 
 	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
@@ -284,6 +295,7 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 		/* Found a variable, see if it's correct */
 		ret = check_variable(&var_die, type_die, offset,
 				     reg != DWARF_REG_PC);
+		loc->offset = offset;
 		goto out;
 	}
 
@@ -305,13 +317,21 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
  * @ms: map and symbol at the location
  * @ip: instruction address of the memory access
  * @loc: instruction operand location
+ * @addr: data address of the memory access
+ * @var_name: global variable name
  *
  * This functions searches the debug information of the binary to get the data
- * type it accesses.  The exact location is expressed by (ip, reg, offset).
+ * type it accesses.  The exact location is expressed by (@ip, reg, offset)
+ * for pointer variables or (@ip, @addr) for global variables.  Note that global
+ * variables might update the @loc->offset after finding the start of the variable.
+ * If it cannot find a global variable by address, it tries to find a declaration
+ * of the variable using @var_name.  In that case, @loc->offset won't be updated.
+ *
  * It return %NULL if not found.
  */
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   struct annotated_op_loc *loc)
+					   struct annotated_op_loc *loc, u64 addr,
+					   const char *var_name)
 {
 	struct annotated_data_type *result = NULL;
 	struct dso *dso = ms->map->dso;
@@ -331,7 +351,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
 	 * a file address for DWARF processing.
 	 */
 	pc = map__rip_2objdump(ms->map, ip);
-	if (find_data_type_die(di, pc, 0, loc, &type_die) < 0)
+	if (find_data_type_die(di, pc, addr, var_name, loc, &type_die) < 0)
 		goto out;
 
 	result = dso__findnew_data_type(dso, &type_die);
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 214c625e7bc9..1b0db8e8c40e 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -107,7 +107,8 @@ extern struct annotated_data_stat ann_data_stat;
 
 /* Returns data type at the location (ip, reg, offset) */
 struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   struct annotated_op_loc *loc);
+					   struct annotated_op_loc *loc, u64 addr,
+					   const char *var_name);
 
 /* Update type access histogram at the given offset */
 int annotated_data_type__update_samples(struct annotated_data_type *adt,
@@ -121,7 +122,8 @@ void annotated_data_type__tree_delete(struct rb_root *root);
 
 static inline struct annotated_data_type *
 find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
-	       struct annotated_op_loc *loc __maybe_unused)
+	       struct annotated_op_loc *loc __maybe_unused,
+	       u64 addr __maybe_unused, const char *var_name __maybe_unused)
 {
 	return NULL;
 }
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 4ea32c2dee4b..4f74db1d3256 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -37,6 +37,7 @@
 #include "util/sharded_mutex.h"
 #include "arch/common.h"
 #include "namespaces.h"
+#include "thread.h"
 #include <regex.h>
 #include <linux/bitops.h>
 #include <linux/kernel.h>
@@ -3748,6 +3749,30 @@ static bool is_stack_operation(struct arch *arch, struct disasm_line *dl)
 	return false;
 }
 
+u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset,
+			struct disasm_line *dl)
+{
+	struct annotation *notes;
+	struct disasm_line *next;
+	u64 addr;
+
+	notes = symbol__annotation(ms->sym);
+	/*
+	 * PC-relative addressing starts from the next instruction address
+	 * But the IP is for the current instruction.  Since disasm_line
+	 * doesn't have the instruction size, calculate it using the next
+	 * disasm_line.  If it's the last one, we can use symbol's end
+	 * address directly.
+	 */
+	if (&dl->al.node == notes->src->source.prev)
+		addr = ms->sym->end + offset;
+	else {
+		next = list_next_entry(dl, al.node);
+		addr = ip + (next->al.offset - dl->al.offset) + offset;
+	}
+	return map__rip_2objdump(ms->map, addr);
+}
+
 /**
  * hist_entry__get_data_type - find data type for given hist entry
  * @he: hist entry
@@ -3767,7 +3792,9 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	struct annotated_op_loc *op_loc;
 	struct annotated_data_type *mem_type;
 	struct annotated_item_stat *istat;
-	u64 ip = he->ip;
+	u64 ip = he->ip, addr = 0;
+	const char *var_name = NULL;
+	int var_offset;
 	int i;
 
 	ann_data_stat.total++;
@@ -3821,12 +3848,38 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		/* Recalculate IP because of LOCK prefix or insn fusion */
 		ip = ms->sym->start + dl->al.offset;
 
-		mem_type = find_data_type(ms, ip, op_loc);
+		var_offset = op_loc->offset;
+
+		/* PC-relative addressing */
+		if (op_loc->reg1 == DWARF_REG_PC) {
+			struct addr_location al;
+			struct symbol *var;
+			u64 map_addr;
+
+			addr = annotate_calc_pcrel(ms, ip, op_loc->offset, dl);
+			/* Kernel symbols might be relocated */
+			map_addr = addr + map__reloc(ms->map);
+
+			addr_location__init(&al);
+			var = thread__find_symbol_fb(he->thread, he->cpumode,
+						     map_addr, &al);
+			if (var) {
+				var_name = var->name;
+				/* Calculate type offset from the start of variable */
+				var_offset = map_addr - map__unmap_ip(al.map, var->start);
+			}
+			addr_location__exit(&al);
+		}
+
+		mem_type = find_data_type(ms, ip, op_loc, addr, var_name);
 		if (mem_type)
 			istat->good++;
 		else
 			istat->bad++;
 
+		if (mem_type && var_name)
+			op_loc->offset = var_offset;
+
 		if (symbol_conf.annotate_data_sample) {
 			annotated_data_type__update_samples(mem_type, evsel,
 							    op_loc->offset,
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index f5a6c3227757..79ccc65c9ff9 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -497,4 +497,8 @@ struct annotated_item_stat {
 };
 extern struct list_head ann_insn_stat;
 
+/* Calculate PC-relative address */
+u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset,
+			struct disasm_line *dl);
+
 #endif	/* __PERF_ANNOTATE_H */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 35/52] perf dwarf-aux: Add die_get_cfa()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (33 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 34/52] perf annotate-data: Support global variables Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 36/52] perf annotate-data: Support stack variables Namhyung Kim
                   ` (17 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

die_get_cfa() gets the frame base register and offset at the given
instruction address (pc).  This info will be used to locate stack
variables whose location expression uses DW_OP_fbreg.
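
As a rough illustration of the register/offset extraction described above,
here is a minimal standalone sketch.  It does not use libdw: struct sk_op
and sk_cfa_from_op() are hypothetical stand-ins, and only the opcode values
come from the DWARF standard.  Unlike the helpers in this patch, the sketch
reports failure for an unsupported opcode.

```c
#include <assert.h>
#include <stdint.h>

/* Opcode values as assigned by the DWARF standard */
enum {
	SK_OP_REG0   = 0x50,	/* DW_OP_reg0 */
	SK_OP_REG31  = 0x6f,	/* DW_OP_reg31 */
	SK_OP_BREG0  = 0x70,	/* DW_OP_breg0 */
	SK_OP_BREG31 = 0x8f,	/* DW_OP_breg31 */
	SK_OP_REGX   = 0x90,	/* DW_OP_regx */
	SK_OP_BREGX  = 0x92,	/* DW_OP_bregx */
};

/* Hypothetical stand-in for libdw's Dwarf_Op (not the real type) */
struct sk_op {
	uint8_t atom;
	int64_t number;
	int64_t number2;
};

/* Return 0 and fill @preg/@poffset if @op is a plain "register (+ offset)" rule */
static int sk_cfa_from_op(const struct sk_op *op, int *preg, int *poffset)
{
	assert(op && preg && poffset);

	if (op->atom >= SK_OP_REG0 && op->atom <= SK_OP_REG31) {
		*preg = op->atom - SK_OP_REG0;
		*poffset = 0;
	} else if (op->atom >= SK_OP_BREG0 && op->atom <= SK_OP_BREG31) {
		*preg = op->atom - SK_OP_BREG0;
		*poffset = (int)op->number;
	} else if (op->atom == SK_OP_REGX) {
		*preg = (int)op->number;
		*poffset = 0;
	} else if (op->atom == SK_OP_BREGX) {
		*preg = (int)op->number;
		*poffset = (int)op->number2;
	} else {
		return -1;	/* anything else is too complex for this sketch */
	}
	return 0;
}
```

For instance, a CFA rule of "DW_OP_breg7 (rsp): 8" would yield register 7
and offset 8 here.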

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c | 68 ++++++++++++++++++++++++++++++++++++-
 tools/perf/util/dwarf-aux.h | 15 ++++++++
 2 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 7aa5fee0da19..3d42a8613869 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1407,7 +1407,73 @@ Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc,
 		*offset = data.offset;
 	return result;
 }
-#endif
+#endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
+
+#ifdef HAVE_DWARF_CFI_SUPPORT
+static int reg_from_dwarf_op(Dwarf_Op *op)
+{
+	switch (op->atom) {
+	case DW_OP_reg0 ... DW_OP_reg31:
+		return op->atom - DW_OP_reg0;
+	case DW_OP_breg0 ... DW_OP_breg31:
+		return op->atom - DW_OP_breg0;
+	case DW_OP_regx:
+	case DW_OP_bregx:
+		return op->number;
+	default:
+		break;
+	}
+	return -1;
+}
+
+static int offset_from_dwarf_op(Dwarf_Op *op)
+{
+	switch (op->atom) {
+	case DW_OP_reg0 ... DW_OP_reg31:
+	case DW_OP_regx:
+		return 0;
+	case DW_OP_breg0 ... DW_OP_breg31:
+		return op->number;
+	case DW_OP_bregx:
+		return op->number2;
+	default:
+		break;
+	}
+	return -1;
+}
+
+/**
+ * die_get_cfa - Get frame base information
+ * @dwarf: a Dwarf info
+ * @pc: program address
+ * @preg: pointer for saved register
+ * @poffset: pointer for saved offset
+ *
+ * This function gets register and offset for CFA (Canonical Frame Address)
+ * by searching the CIE/FDE info.  The CFA usually points to the start address
+ * of the current stack frame and local variables can be located using an offset
+ * from the CFA.  The @preg and @poffset will be updated if it returns 0.
+ */
+int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset)
+{
+	Dwarf_CFI *cfi;
+	Dwarf_Frame *frame = NULL;
+	Dwarf_Op *ops = NULL;
+	size_t nops;
+
+	cfi = dwarf_getcfi(dwarf);
+	if (cfi == NULL)
+		return -1;
+
+	if (!dwarf_cfi_addrframe(cfi, pc, &frame) &&
+	    !dwarf_frame_cfa(frame, &ops, &nops) && nops == 1) {
+		*preg = reg_from_dwarf_op(ops);
+		*poffset = offset_from_dwarf_op(ops);
+		return 0;
+	}
+	return -1;
+}
+#endif /* HAVE_DWARF_CFI_SUPPORT */
 
 /*
  * die_has_loclist - Check if DW_AT_location of @vr_die is a location list
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index 4e64caac6df8..f209f9162908 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -177,4 +177,19 @@ static inline Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die __maybe_unu
 
 #endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
+#ifdef HAVE_DWARF_CFI_SUPPORT
+
+/* Get the frame base information from CFA */
+int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset);
+
+#else /* HAVE_DWARF_CFI_SUPPORT */
+
+static inline int die_get_cfa(Dwarf *dwarf __maybe_unused, u64 pc __maybe_unused,
+			      int *preg __maybe_unused, int *poffset __maybe_unused)
+{
+	return -1;
+}
+
+#endif /* HAVE_DWARF_CFI_SUPPORT */
+
 #endif /* _DWARF_AUX_H */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 36/52] perf annotate-data: Support stack variables
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (34 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 35/52] perf dwarf-aux: Add die_get_cfa() Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 37/52] perf dwarf-aux: Check allowed DWARF Ops Namhyung Kim
                   ` (16 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Local variables are allocated on the stack, so their location
expressions should consist of a base register and an offset.  Extend
die_find_variable_by_reg() to handle the following expressions:

 * DW_OP_breg{0..31}
 * DW_OP_bregx
 * DW_OP_fbreg

Usually DWARF subprogram entries have frame base information and
use it to locate stack variables like below:

 <2><43d1575>: Abbrev Number: 62 (DW_TAG_variable)
    <43d1576>   DW_AT_location    : 2 byte block: 91 7c         (DW_OP_fbreg: -4)  <--- here
    <43d1579>   DW_AT_name        : (indirect string, offset: 0x2c00c9): i
    <43d157d>   DW_AT_decl_file   : 1
    <43d157e>   DW_AT_decl_line   : 78
    <43d157f>   DW_AT_type        : <0x43d19d7>

I found some differences in how gcc and clang describe the frame base.
gcc uses the CFA as the base, so it needs to check the current frame's
CFI info.  In this case, the stack offset in the instruction needs to be
adjusted so that it is relative to the CFA.

 <1><1bb8d>: Abbrev Number: 102 (DW_TAG_subprogram)
    <1bb8e>   DW_AT_name        : (indirect string, offset: 0x74d41): kernel_init
    <1bb92>   DW_AT_decl_file   : 2
    <1bb92>   DW_AT_decl_line   : 1440
    <1bb94>   DW_AT_decl_column : 18
    <1bb95>   DW_AT_prototyped  : 1
    <1bb95>   DW_AT_type        : <0xcc>
    <1bb99>   DW_AT_low_pc      : 0xffffffff81bab9e0
    <1bba1>   DW_AT_high_pc     : 0x1b2
    <1bba9>   DW_AT_frame_base  : 1 byte block: 9c      (DW_OP_call_frame_cfa)  <------ here
    <1bbab>   DW_AT_call_all_calls: 1
    <1bbab>   DW_AT_sibling     : <0x1bf5a>

clang, on the other hand, sets the frame base to a register directly, so
the register and offset in the instruction can be checked as they are.

 <1><43d1542>: Abbrev Number: 60 (DW_TAG_subprogram)
    <43d1543>   DW_AT_low_pc      : 0xffffffff816a7c60
    <43d154b>   DW_AT_high_pc     : 0x98
    <43d154f>   DW_AT_frame_base  : 1 byte block: 56    (DW_OP_reg6 (rbp))  <---------- here
    <43d1551>   DW_AT_GNU_all_call_sites: 1
    <43d1551>   DW_AT_name        : (indirect string, offset: 0x3bce91): foo
    <43d1555>   DW_AT_decl_file   : 1
    <43d1556>   DW_AT_decl_line   : 75
    <43d1557>   DW_AT_prototyped  : 1
    <43d1557>   DW_AT_type        : <0x43c7332>
    <43d155b>   DW_AT_external    : 1
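
The two frame base styles above can be told apart from the single opcode
byte in DW_AT_frame_base.  The following is only a sketch under stated
assumptions: sk_frame_base_reg(), sk_fbreg_offset() and SK_FB_NEEDS_CFI are
hypothetical names, and the numbers in the usage note are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SK_OP_REG0		0x50	/* DW_OP_reg0 */
#define SK_OP_REG31		0x6f	/* DW_OP_reg31 */
#define SK_OP_CALL_FRAME_CFA	0x9c	/* DW_OP_call_frame_cfa */

/* Hypothetical marker: the base has to be looked up in the CFI */
#define SK_FB_NEEDS_CFI		(-2)

/* Classify a 1-byte DW_AT_frame_base block */
static int sk_frame_base_reg(uint8_t op)
{
	if (op >= SK_OP_REG0 && op <= SK_OP_REG31)
		return op - SK_OP_REG0;		/* clang style: a register */
	if (op == SK_OP_CALL_FRAME_CFA)
		return SK_FB_NEEDS_CFI;		/* gcc style: use the CFA */
	return -1;				/* not handled */
}

/* Convert an instruction offset into an offset from the frame base */
static int sk_fbreg_offset(int insn_offset, int fb_offset)
{
	return insn_offset - fb_offset;
}
```

With the dumps above, 0x56 maps to register 6 (rbp) and 0x9c asks for a CFI
lookup.  If the CFA were %rsp + 16 (an assumed value) and the instruction
accessed 0xc(%rsp), the adjusted offset would be 12 - 16 = -4, which is the
form DW_OP_fbreg uses.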

Also it needs to update the offset after finding the type, as with global
variables, since the offset was relative to the frame base.  Factor out
match_var_offset() to check global and local variables in the same way.

The type stats are improved too:

  Annotate data type stats:
  total 294, ok 160 (54.4%), bad 134 (45.6%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          51 : no_var
          14 : no_typeinfo
           7 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 35 +++++++++++++--
 tools/perf/util/dwarf-aux.c     | 79 ++++++++++++++++++++++++---------
 tools/perf/util/dwarf-aux.h     |  3 ++
 3 files changed, 93 insertions(+), 24 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 99ecf4b3665c..b60c24091360 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -208,7 +208,7 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,
 	/*
 	 * Usually it expects a pointer type for a memory access.
 	 * Convert to a real type it points to.  But global variables
-	 * are accessed directly without a pointer.
+	 * and local variables are accessed directly without a pointer.
 	 */
 	if (is_pointer) {
 		if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
@@ -247,6 +247,9 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 	int reg, offset;
 	int ret = -1;
 	int i, nr_scopes;
+	int fbreg = -1;
+	bool is_fbreg = false;
+	int fb_offset = 0;
 
 	/* Get a compile_unit for this address */
 	if (!find_cu_die(di, pc, &cu_die)) {
@@ -278,7 +281,33 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 	/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
 	nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
 
+	if (reg != DWARF_REG_PC && dwarf_hasattr(&scopes[0], DW_AT_frame_base)) {
+		Dwarf_Attribute attr;
+		Dwarf_Block block;
+
+		/* Check if the 'reg' is assigned as frame base register */
+		if (dwarf_attr(&scopes[0], DW_AT_frame_base, &attr) != NULL &&
+		    dwarf_formblock(&attr, &block) == 0 && block.length == 1) {
+			switch (*block.data) {
+			case DW_OP_reg0 ... DW_OP_reg31:
+				fbreg = *block.data - DW_OP_reg0;
+				break;
+			case DW_OP_call_frame_cfa:
+				if (die_get_cfa(di->dbg, pc, &fbreg,
+						&fb_offset) < 0)
+					fbreg = -1;
+				break;
+			default:
+				break;
+			}
+		}
+	}
+
 retry:
+	is_fbreg = (reg == fbreg);
+	if (is_fbreg)
+		offset = loc->offset - fb_offset;
+
 	/* Search from the inner-most scope to the outer */
 	for (i = nr_scopes - 1; i >= 0; i--) {
 		if (reg == DWARF_REG_PC) {
@@ -288,13 +317,13 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 		} else {
 			/* Look up variables/parameters in this scope */
 			if (!die_find_variable_by_reg(&scopes[i], pc, reg,
-						      &var_die))
+						      &offset, is_fbreg, &var_die))
 				continue;
 		}
 
 		/* Found a variable, see if it's correct */
 		ret = check_variable(&var_die, type_die, offset,
-				     reg != DWARF_REG_PC);
+				     reg != DWARF_REG_PC && !is_fbreg);
 		loc->offset = offset;
 		goto out;
 	}
diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 3d42a8613869..7caf52fdc255 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1272,11 +1272,39 @@ struct find_var_data {
 	unsigned reg;
 	/* Access offset, set for global data */
 	int offset;
+	/* True if the current register is the frame base */
+	bool is_fbreg;
 };
 
 /* Max number of registers DW_OP_regN supports */
 #define DWARF_OP_DIRECT_REGS  32
 
+static bool match_var_offset(Dwarf_Die *die_mem, struct find_var_data *data,
+			     u64 addr_offset, u64 addr_type)
+{
+	Dwarf_Die type_die;
+	Dwarf_Word size;
+
+	if (addr_offset == addr_type) {
+		/* Update offset relative to the start of the variable */
+		data->offset = 0;
+		return true;
+	}
+
+	if (die_get_real_type(die_mem, &type_die) == NULL)
+		return false;
+
+	if (dwarf_aggregate_size(&type_die, &size) < 0)
+		return false;
+
+	if (addr_offset >= addr_type + size)
+		return false;
+
+	/* Update offset relative to the start of the variable */
+	data->offset = addr_offset - addr_type;
+	return true;
+}
+
 /* Only checks direct child DIEs in the given scope. */
 static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
 {
@@ -1301,14 +1329,30 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
 		if (start > data->pc)
 			break;
 
+		/* Local variables accessed using frame base register */
+		if (data->is_fbreg && ops->atom == DW_OP_fbreg &&
+		    data->offset >= (int)ops->number &&
+		    match_var_offset(die_mem, data, data->offset, ops->number))
+			return DIE_FIND_CB_END;
+
 		/* Only match with a simple case */
 		if (data->reg < DWARF_OP_DIRECT_REGS) {
 			if (ops->atom == (DW_OP_reg0 + data->reg) && nops == 1)
 				return DIE_FIND_CB_END;
+
+			/* Local variables accessed by a register + offset */
+			if (ops->atom == (DW_OP_breg0 + data->reg) &&
+			    match_var_offset(die_mem, data, data->offset, ops->number))
+				return DIE_FIND_CB_END;
 		} else {
 			if (ops->atom == DW_OP_regx && ops->number == data->reg &&
 			    nops == 1)
 				return DIE_FIND_CB_END;
+
+			/* Local variables accessed by a register + offset */
+			if (ops->atom == DW_OP_bregx && data->reg == ops->number &&
+			    match_var_offset(die_mem, data, data->offset, ops->number2))
+				return DIE_FIND_CB_END;
 		}
 	}
 	return DIE_FIND_CB_SIBLING;
@@ -1319,18 +1363,29 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
  * @sc_die: a scope DIE
  * @pc: the program address to find
  * @reg: the register number to find
+ * @poffset: pointer to offset, will be updated for fbreg case
+ * @is_fbreg: boolean value if the current register is the frame base
  * @die_mem: a buffer to save the resulting DIE
  *
- * Find the variable DIE accessed by the given register.
+ * Find the variable DIE accessed by the given register.  It'll update the @poffset
+ * when the variable is in the stack.
  */
 Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
+				    int *poffset, bool is_fbreg,
 				    Dwarf_Die *die_mem)
 {
 	struct find_var_data data = {
 		.pc = pc,
 		.reg = reg,
+		.offset = *poffset,
+		.is_fbreg = is_fbreg,
 	};
-	return die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem);
+	Dwarf_Die *result;
+
+	result = die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem);
+	if (result)
+		*poffset = data.offset;
+	return result;
 }
 
 /* Only checks direct child DIEs in the given scope */
@@ -1341,8 +1396,6 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg)
 	ptrdiff_t off = 0;
 	Dwarf_Attribute attr;
 	Dwarf_Addr base, start, end;
-	Dwarf_Word size;
-	Dwarf_Die type_die;
 	Dwarf_Op *ops;
 	size_t nops;
 
@@ -1359,24 +1412,8 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg)
 		if (data->addr < ops->number)
 			continue;
 
-		if (data->addr == ops->number) {
-			/* Update offset relative to the start of the variable */
-			data->offset = 0;
+		if (match_var_offset(die_mem, data, data->addr, ops->number))
 			return DIE_FIND_CB_END;
-		}
-
-		if (die_get_real_type(die_mem, &type_die) == NULL)
-			continue;
-
-		if (dwarf_aggregate_size(&type_die, &size) < 0)
-			continue;
-
-		if (data->addr >= ops->number + size)
-			continue;
-
-		/* Update offset relative to the start of the variable */
-		data->offset = data->addr - ops->number;
-		return DIE_FIND_CB_END;
 	}
 	return DIE_FIND_CB_SIBLING;
 }
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index f209f9162908..85dd527ae1f7 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -142,6 +142,7 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf);
 
 /* Find a variable saved in the 'reg' at given address */
 Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
+				    int *poffset, bool is_fbreg,
 				    Dwarf_Die *die_mem);
 
 /* Find a (global) variable located in the 'addr' */
@@ -161,6 +162,8 @@ static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused,
 static inline Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die __maybe_unused,
 						  Dwarf_Addr pc __maybe_unused,
 						  int reg __maybe_unused,
+						  int *poffset __maybe_unused,
+						  bool is_fbreg __maybe_unused,
 						  Dwarf_Die *die_mem __maybe_unused)
 {
 	return NULL;
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 37/52] perf dwarf-aux: Check allowed DWARF Ops
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (35 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 36/52] perf annotate-data: Support stack variables Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 38/52] perf dwarf-aux: Add die_collect_vars() Namhyung Kim
                   ` (15 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

A DWARF location expression can be fairly complex and it'd be hard
to match it with the condition correctly.  So let's be conservative
and only allow simple expressions.  For now it just checks the first
operation in the list.  The following operations look ok when they
follow it:

 * DW_OP_stack_value
 * DW_OP_deref_size
 * DW_OP_deref
 * DW_OP_piece

To refuse complex (and unsupported) location expressions, add
check_allowed_ops() to check the rest of the list.  It seems the earlier
results included such unsupported expressions.  For example, I found
a local struct variable placed like below.

 <2><43d1517>: Abbrev Number: 62 (DW_TAG_variable)
    <43d1518>   DW_AT_location    : 15 byte block: 91 50 93 8 91 78 93 4 93 84 8 91 68 93 4
        (DW_OP_fbreg: -48; DW_OP_piece: 8;
         DW_OP_fbreg: -8; DW_OP_piece: 4;
         DW_OP_piece: 1028;
         DW_OP_fbreg: -24; DW_OP_piece: 4)

Another example is something like this.

    0057c8be ffffffffffffffff ffffffff812109f0 (base address)
    0057c8ce ffffffff812112b5 ffffffff812112c8 (DW_OP_breg3 (rbx): 0;
                                                DW_OP_constu: 18446744073709551612;
                                                DW_OP_and;
                                                DW_OP_stack_value)

Both of them should be refused.  After the change, the stat shows:

  Annotate data type stats:
  total 294, ok 158 (53.7%), bad 136 (46.3%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          53 : no_var
          14 : no_typeinfo
           7 : bad_offset
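
To make the filtering rule above concrete, here is a minimal sketch that
works on raw opcode bytes only.  sk_allowed_ops() is a hypothetical helper
and the operands are dropped for brevity; the opcode values are the ones
defined by the DWARF standard.  It walks the list by index starting from the
second operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
	SK_OP_DEREF       = 0x06,	/* DW_OP_deref */
	SK_OP_PIECE       = 0x93,	/* DW_OP_piece */
	SK_OP_DEREF_SIZE  = 0x94,	/* DW_OP_deref_size */
	SK_OP_STACK_VALUE = 0x9f,	/* DW_OP_stack_value */
};

/* Accept the expression only if every op after the first is harmless */
static bool sk_allowed_ops(const uint8_t *atoms, size_t nops)
{
	size_t i;

	assert(atoms != NULL);

	/* The first op is checked separately by the caller */
	for (i = 1; i < nops; i++) {
		switch (atoms[i]) {
		case SK_OP_STACK_VALUE:
		case SK_OP_DEREF_SIZE:
		case SK_OP_DEREF:
		case SK_OP_PIECE:
			break;
		default:
			return false;
		}
	}
	return true;
}
```

Both examples from this message are rejected: the first one hits a second
DW_OP_fbreg (0x91) and the second one hits DW_OP_constu (0x10).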

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c | 44 +++++++++++++++++++++++++++++++++----
 1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 7caf52fdc255..2791126069b4 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1305,6 +1305,34 @@ static bool match_var_offset(Dwarf_Die *die_mem, struct find_var_data *data,
 	return true;
 }
 
+static bool check_allowed_ops(Dwarf_Op *ops, size_t nops)
+{
+	/* The first op is checked separately */
+	ops++;
+	nops--;
+
+	/*
+	 * It needs to make sure that the location expression matches the given
+	 * register and offset exactly.  Thus it rejects any complex expressions
+	 * and only allows a few selected operators that don't change the
+	 * location.
+	 */
+	while (nops) {
+		switch (ops->atom) {
+		case DW_OP_stack_value:
+		case DW_OP_deref_size:
+		case DW_OP_deref:
+		case DW_OP_piece:
+			break;
+		default:
+			return false;
+		}
+		ops++;
+		nops--;
+	}
+	return true;
+}
+
 /* Only checks direct child DIEs in the given scope. */
 static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
 {
@@ -1332,25 +1360,31 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
 		/* Local variables accessed using frame base register */
 		if (data->is_fbreg && ops->atom == DW_OP_fbreg &&
 		    data->offset >= (int)ops->number &&
+		    check_allowed_ops(ops, nops) &&
 		    match_var_offset(die_mem, data, data->offset, ops->number))
 			return DIE_FIND_CB_END;
 
 		/* Only match with a simple case */
 		if (data->reg < DWARF_OP_DIRECT_REGS) {
-			if (ops->atom == (DW_OP_reg0 + data->reg) && nops == 1)
+			/* pointer variables saved in a register 0 to 31 */
+			if (ops->atom == (DW_OP_reg0 + data->reg) &&
+			    check_allowed_ops(ops, nops))
 				return DIE_FIND_CB_END;
 
 			/* Local variables accessed by a register + offset */
 			if (ops->atom == (DW_OP_breg0 + data->reg) &&
+			    check_allowed_ops(ops, nops) &&
 			    match_var_offset(die_mem, data, data->offset, ops->number))
 				return DIE_FIND_CB_END;
 		} else {
+			/* pointer variables saved in a register 32 or above */
 			if (ops->atom == DW_OP_regx && ops->number == data->reg &&
-			    nops == 1)
+			    check_allowed_ops(ops, nops))
 				return DIE_FIND_CB_END;
 
 			/* Local variables accessed by a register + offset */
 			if (ops->atom == DW_OP_bregx && data->reg == ops->number &&
+			    check_allowed_ops(ops, nops) &&
 			    match_var_offset(die_mem, data, data->offset, ops->number2))
 				return DIE_FIND_CB_END;
 		}
@@ -1412,7 +1446,8 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg)
 		if (data->addr < ops->number)
 			continue;
 
-		if (match_var_offset(die_mem, data, data->addr, ops->number))
+		if (check_allowed_ops(ops, nops) &&
+		    match_var_offset(die_mem, data, data->addr, ops->number))
 			return DIE_FIND_CB_END;
 	}
 	return DIE_FIND_CB_SIBLING;
@@ -1503,7 +1538,8 @@ int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset)
 		return -1;
 
 	if (!dwarf_cfi_addrframe(cfi, pc, &frame) &&
-	    !dwarf_frame_cfa(frame, &ops, &nops) && nops == 1) {
+	    !dwarf_frame_cfa(frame, &ops, &nops) &&
+	    check_allowed_ops(ops, nops)) {
 		*preg = reg_from_dwarf_op(ops);
 		*poffset = offset_from_dwarf_op(ops);
 		return 0;
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 38/52] perf dwarf-aux: Add die_collect_vars()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (36 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 37/52] perf dwarf-aux: Check allowed DWARF Ops Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 39/52] perf dwarf-aux: Handle type transfer for memory access Namhyung Kim
                   ` (14 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

die_collect_vars() finds all variables in the scope, including function
parameters.  struct die_var_type saves the type of each variable with its
location (reg and offset) as well as where it's defined in the code
(addr).
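
Since the result is a singly-linked list that the caller has to free, the
collect/free pattern can be sketched as below.  The struct mirrors the
fields of struct die_var_type, but sk_var, sk_add_var() and sk_free_vars()
are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Same fields as struct die_var_type, under a sketch-only name */
struct sk_var {
	struct sk_var *next;
	uint64_t die_off;	/* DIE offset of the type */
	uint64_t addr;		/* start address of the location */
	int reg;
	int offset;
};

/* Prepend one entry, so the last collected variable ends up first */
static int sk_add_var(struct sk_var **head, uint64_t die_off, uint64_t addr,
		      int reg, int offset)
{
	struct sk_var *vt;

	assert(head != NULL);

	vt = malloc(sizeof(*vt));
	if (vt == NULL)
		return -1;

	vt->die_off = die_off;
	vt->addr = addr;
	vt->reg = reg;
	vt->offset = offset;
	vt->next = *head;
	*head = vt;
	return 0;
}

/* Release the whole list and reset the head */
static void sk_free_vars(struct sk_var **head)
{
	assert(head != NULL);

	while (*head) {
		struct sk_var *next = (*head)->next;

		free(*head);
		*head = next;
	}
}
```

A caller would start with a NULL head, collect, walk the list and then free
it once the type state has been built.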

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c | 118 +++++++++++++++++++++++++++---------
 tools/perf/util/dwarf-aux.h |  17 ++++++
 2 files changed, 107 insertions(+), 28 deletions(-)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 2791126069b4..f878014c9e27 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1136,6 +1136,40 @@ int die_get_varname(Dwarf_Die *vr_die, struct strbuf *buf)
 	return ret < 0 ? ret : strbuf_addf(buf, "\t%s", dwarf_diename(vr_die));
 }
 
+#if defined(HAVE_DWARF_GETLOCATIONS_SUPPORT) || defined(HAVE_DWARF_CFI_SUPPORT)
+static int reg_from_dwarf_op(Dwarf_Op *op)
+{
+	switch (op->atom) {
+	case DW_OP_reg0 ... DW_OP_reg31:
+		return op->atom - DW_OP_reg0;
+	case DW_OP_breg0 ... DW_OP_breg31:
+		return op->atom - DW_OP_breg0;
+	case DW_OP_regx:
+	case DW_OP_bregx:
+		return op->number;
+	default:
+		break;
+	}
+	return -1;
+}
+
+static int offset_from_dwarf_op(Dwarf_Op *op)
+{
+	switch (op->atom) {
+	case DW_OP_reg0 ... DW_OP_reg31:
+	case DW_OP_regx:
+		return 0;
+	case DW_OP_breg0 ... DW_OP_breg31:
+		return op->number;
+	case DW_OP_bregx:
+		return op->number2;
+	default:
+		break;
+	}
+	return -1;
+}
+#endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT || HAVE_DWARF_CFI_SUPPORT */
+
 #ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT
 /**
  * die_get_var_innermost_scope - Get innermost scope range of given variable DIE
@@ -1479,41 +1513,69 @@ Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc,
 		*offset = data.offset;
 	return result;
 }
-#endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
-#ifdef HAVE_DWARF_CFI_SUPPORT
-static int reg_from_dwarf_op(Dwarf_Op *op)
+static int __die_collect_vars_cb(Dwarf_Die *die_mem, void *arg)
 {
-	switch (op->atom) {
-	case DW_OP_reg0 ... DW_OP_reg31:
-		return op->atom - DW_OP_reg0;
-	case DW_OP_breg0 ... DW_OP_breg31:
-		return op->atom - DW_OP_breg0;
-	case DW_OP_regx:
-	case DW_OP_bregx:
-		return op->number;
-	default:
-		break;
-	}
-	return -1;
+	struct die_var_type **var_types = arg;
+	Dwarf_Die type_die;
+	int tag = dwarf_tag(die_mem);
+	Dwarf_Attribute attr;
+	Dwarf_Addr base, start, end;
+	Dwarf_Op *ops;
+	size_t nops;
+	struct die_var_type *vt;
+
+	if (tag != DW_TAG_variable && tag != DW_TAG_formal_parameter)
+		return DIE_FIND_CB_SIBLING;
+
+	if (dwarf_attr(die_mem, DW_AT_location, &attr) == NULL)
+		return DIE_FIND_CB_SIBLING;
+
+	/*
+	 * Only collect the first location as it can reconstruct the
+	 * remaining state by following the instructions.
+	 * start = 0 means it covers the whole range.
+	 */
+	if (dwarf_getlocations(&attr, 0, &base, &start, &end, &ops, &nops) <= 0)
+		return DIE_FIND_CB_SIBLING;
+
+	if (die_get_real_type(die_mem, &type_die) == NULL)
+		return DIE_FIND_CB_SIBLING;
+
+	vt = malloc(sizeof(*vt));
+	if (vt == NULL)
+		return DIE_FIND_CB_END;
+
+	vt->die_off = dwarf_dieoffset(&type_die);
+	vt->addr = start;
+	vt->reg = reg_from_dwarf_op(ops);
+	vt->offset = offset_from_dwarf_op(ops);
+	vt->next = *var_types;
+	*var_types = vt;
+
+	return DIE_FIND_CB_SIBLING;
 }
 
-static int offset_from_dwarf_op(Dwarf_Op *op)
+/**
+ * die_collect_vars - Save all variables and parameters
+ * @sc_die: a scope DIE
+ * @var_types: a pointer to save the resulting list
+ *
+ * Save all variables and parameters in the @sc_die and save them to @var_types.
+ * The @var_types is a singly-linked list containing type and location info.
+ * Actual type can be retrieved using dwarf_offdie() with 'die_off' later.
+ *
+ * Callers should free @var_types.
+ */
+void die_collect_vars(Dwarf_Die *sc_die, struct die_var_type **var_types)
 {
-	switch (op->atom) {
-	case DW_OP_reg0 ... DW_OP_reg31:
-	case DW_OP_regx:
-		return 0;
-	case DW_OP_breg0 ... DW_OP_breg31:
-		return op->number;
-	case DW_OP_bregx:
-		return op->number2;
-	default:
-		break;
-	}
-	return -1;
+	Dwarf_Die die_mem;
+
+	die_find_child(sc_die, __die_collect_vars_cb, (void *)var_types, &die_mem);
 }
+#endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
+#ifdef HAVE_DWARF_CFI_SUPPORT
 /**
  * die_get_cfa - Get frame base information
  * @dwarf: a Dwarf info
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index 85dd527ae1f7..efafd3a1f5b6 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -135,6 +135,15 @@ void die_skip_prologue(Dwarf_Die *sp_die, Dwarf_Die *cu_die,
 /* Get the list of including scopes */
 int die_get_scopes(Dwarf_Die *cu_die, Dwarf_Addr pc, Dwarf_Die **scopes);
 
+/* Variable type information */
+struct die_var_type {
+	struct die_var_type *next;
+	u64 die_off;
+	u64 addr;
+	int reg;
+	int offset;
+};
+
 #ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT
 
 /* Get byte offset range of given variable DIE */
@@ -150,6 +159,9 @@ Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc,
 				     Dwarf_Addr addr, Dwarf_Die *die_mem,
 				     int *offset);
 
+/* Save all variables and parameters in this scope */
+void die_collect_vars(Dwarf_Die *sc_die, struct die_var_type **var_types);
+
 #else /*  HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
 static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused,
@@ -178,6 +190,11 @@ static inline Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die __maybe_unu
 	return NULL;
 }
 
+static inline void die_collect_vars(Dwarf_Die *sc_die __maybe_unused,
+				    struct die_var_type **var_types __maybe_unused)
+{
+}
+
 #endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
 
 #ifdef HAVE_DWARF_CFI_SUPPORT
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 39/52] perf dwarf-aux: Handle type transfer for memory access
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (37 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 38/52] perf dwarf-aux: Add die_collect_vars() Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-09 23:59 ` [PATCH 40/52] perf annotate-data: Introduce struct data_loc_info Namhyung Kim
                   ` (13 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

We want to track type states as instructions are executed.  Each
instruction can access compound types like struct or union and load/
store its members to a different location.

die_deref_ptr_type() finds the type of a memory access through a
pointer variable.  If it points to a compound type like a struct, the
target memory is a member of the struct.  The access will happen
with an offset indicating which member it refers to.  Let's follow the
DWARF info to figure out the type of the pointer target.

For example, say we have the following code.

  struct foo {
    int a;
    int b;
  };

  struct foo *p = malloc(sizeof(*p));
  p->b = 0;

The last pointer access should produce x86 asm like below:

  movl  $0x0, 0x4(%rbx)

We know the %rbx register holds a pointer to struct foo, so offset 4
should return the debug info of member 'b'.

Also variables of compound types can be accessed directly without a
pointer.  die_get_member_type() is to handle such a case.
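
The offset-to-member lookup can be illustrated without DWARF by describing
struct foo with a small table.  This is only a sketch: sk_member,
foo_members and sk_member_at() are hypothetical, and the table is built from
offsetof()/sizeof() instead of DW_AT_data_member_location.

```c
#include <assert.h>
#include <stddef.h>

struct foo {
	int a;
	int b;
};

/* Hypothetical description of one member: where it starts and how big it is */
struct sk_member {
	const char *name;
	size_t loc;
	size_t size;
};

static const struct sk_member foo_members[] = {
	{ "a", offsetof(struct foo, a), sizeof(int) },
	{ "b", offsetof(struct foo, b), sizeof(int) },
};

/* Return the index of the member that covers @offset, or -1 if none does */
static int sk_member_at(const struct sk_member *m, size_t nr, size_t offset)
{
	size_t i;

	assert(m != NULL);

	for (i = 0; i < nr; i++) {
		/* Either the start of the member or somewhere inside it */
		if (offset == m[i].loc ||
		    (m[i].loc < offset && offset < m[i].loc + m[i].size))
			return (int)i;
	}
	return -1;
}
```

With a 4-byte int, the store to 4(%rbx) above maps to index 1, which is
member 'b'.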

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/dwarf-aux.c | 110 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/dwarf-aux.h |   6 ++
 2 files changed, 116 insertions(+)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index f878014c9e27..39851ff1d5c4 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1841,3 +1841,113 @@ int die_get_scopes(Dwarf_Die *cu_die, Dwarf_Addr pc, Dwarf_Die **scopes)
 	*scopes = data.scopes;
 	return data.nr;
 }
+
+static int __die_find_member_offset_cb(Dwarf_Die *die_mem, void *arg)
+{
+	Dwarf_Die type_die;
+	Dwarf_Word size, loc;
+	Dwarf_Word offset = (long)arg;
+	int tag = dwarf_tag(die_mem);
+
+	if (tag != DW_TAG_member)
+		return DIE_FIND_CB_SIBLING;
+
+	/* Unions might not have location */
+	if (die_get_data_member_location(die_mem, &loc) < 0)
+		loc = 0;
+
+	if (offset == loc)
+		return DIE_FIND_CB_END;
+
+	die_get_real_type(die_mem, &type_die);
+
+	if (dwarf_aggregate_size(&type_die, &size) < 0)
+		size = 0;
+
+	if (loc < offset && offset < (loc + size))
+		return DIE_FIND_CB_END;
+
+	return DIE_FIND_CB_SIBLING;
+}
+
+/**
+ * die_get_member_type - Return type info of struct member
+ * @type_die: a type DIE
+ * @offset: offset in the type
+ * @die_mem: a buffer to save the resulting DIE
+ *
+ * This function returns a type of a member in @type_die where it's located at
+ * @offset if it's a struct.  For now, it just returns the first matching
+ * member in a union.  For other types, it'd return the given type directly
+ * if it's within the size of the type or NULL otherwise.
+ */
+Dwarf_Die *die_get_member_type(Dwarf_Die *type_die, int offset,
+			       Dwarf_Die *die_mem)
+{
+	Dwarf_Die *member;
+	Dwarf_Die mb_type;
+	int tag;
+
+	tag = dwarf_tag(type_die);
+	/* If it's not a compound type, return the type directly */
+	if (tag != DW_TAG_structure_type && tag != DW_TAG_union_type) {
+		Dwarf_Word size;
+
+		if (dwarf_aggregate_size(type_die, &size) < 0)
+			size = 0;
+
+		if ((unsigned)offset >= size)
+			return NULL;
+
+		*die_mem = *type_die;
+		return die_mem;
+	}
+
+	mb_type = *type_die;
+	/* TODO: Handle union types better? */
+	while (tag == DW_TAG_structure_type || tag == DW_TAG_union_type) {
+		member = die_find_child(&mb_type, __die_find_member_offset_cb,
+					(void *)(long)offset, die_mem);
+		if (member == NULL)
+			return NULL;
+
+		if (die_get_real_type(member, &mb_type) == NULL)
+			return NULL;
+
+		tag = dwarf_tag(&mb_type);
+
+		if (tag == DW_TAG_structure_type || tag == DW_TAG_union_type) {
+			Dwarf_Word loc;
+
+			/* Update offset for the start of the member struct */
+			if (die_get_data_member_location(member, &loc) == 0)
+				offset -= loc;
+		}
+	}
+	*die_mem = mb_type;
+	return die_mem;
+}
+
+/**
+ * die_deref_ptr_type - Return type info for pointer access
+ * @ptr_die: a pointer type DIE
+ * @offset: access offset for the pointer
+ * @die_mem: a buffer to save the resulting DIE
+ *
+ * This function follows the pointer in @ptr_die with given @offset
+ * and saves the resulting type in @die_mem.  If the pointer points to
+ * a struct type, the actual member at the offset is returned.
+ */
+Dwarf_Die *die_deref_ptr_type(Dwarf_Die *ptr_die, int offset,
+			      Dwarf_Die *die_mem)
+{
+	Dwarf_Die type_die;
+
+	if (dwarf_tag(ptr_die) != DW_TAG_pointer_type)
+		return NULL;
+
+	if (die_get_real_type(ptr_die, &type_die) == NULL)
+		return NULL;
+
+	return die_get_member_type(&type_die, offset, die_mem);
+}
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index efafd3a1f5b6..ad4d7322fcbf 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -144,6 +144,12 @@ struct die_var_type {
 	int offset;
 };
 
+/* Return type info of a member at offset */
+Dwarf_Die *die_get_member_type(Dwarf_Die *type_die, int offset, Dwarf_Die *die_mem);
+
+/* Return type info where the pointer and offset point to */
+Dwarf_Die *die_deref_ptr_type(Dwarf_Die *ptr_die, int offset, Dwarf_Die *die_mem);
+
 #ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT
 
 /* Get byte offset range of given variable DIE */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 40/52] perf annotate-data: Introduce struct data_loc_info
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (38 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 39/52] perf dwarf-aux: Handle type transfer for memory access Namhyung Kim
@ 2023-11-09 23:59 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 41/52] perf map: Add map__objdump_2rip() Namhyung Kim
                   ` (12 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-09 23:59 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

The find_data_type() needs a lot of information to describe the location
of the data.  Add the new struct data_loc_info to pass that information
at once.

No functional changes intended.
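
To give a feel for the calling convention, below is a minimal sketch of
the same "pass one struct instead of many arguments" pattern.  The
toy_loc_info struct and toy_init_type_offset() are hypothetical,
trimmed-down stand-ins invented for this example; only the default
handling of the type offset follows what find_data_type() does in this
patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down analogue of struct data_loc_info. */
struct toy_loc_info {
	/* Inputs, filled by the caller */
	uint64_t ip;
	uint64_t var_addr;
	const char *var_name;
	int op_offset;

	/* Result, filled by the callee */
	int type_offset;
};

/*
 * Without a global variable name, the type offset starts out equal to
 * the operand offset.  Otherwise the value set by the caller is kept.
 */
static void toy_init_type_offset(struct toy_loc_info *loc)
{
	if (loc->var_name == NULL)
		loc->type_offset = loc->op_offset;
}
```

A caller would fill the input fields with designated initializers and
read type_offset back after the call, rather than having the callee
write into the operand location.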

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 83 +++++++++++++++++----------------
 tools/perf/util/annotate-data.h | 38 ++++++++++++---
 tools/perf/util/annotate.c      | 30 ++++++------
 3 files changed, 91 insertions(+), 60 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index b60c24091360..c61f5b5b6adc 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -238,21 +238,28 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,
 }
 
 /* The result will be saved in @type_die */
-static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
-			      const char *var_name, struct annotated_op_loc *loc,
-			      Dwarf_Die *type_die)
+static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
 {
+	struct annotated_op_loc *loc = dloc->op;
 	Dwarf_Die cu_die, var_die;
 	Dwarf_Die *scopes = NULL;
 	int reg, offset;
 	int ret = -1;
 	int i, nr_scopes;
 	int fbreg = -1;
-	bool is_fbreg = false;
 	int fb_offset = 0;
+	bool is_fbreg = false;
+	u64 pc;
+
+	/*
+	 * IP is a relative instruction address from the start of the map, as
+	 * it can be randomized/relocated, it needs to translate to PC which is
+	 * a file address for DWARF processing.
+	 */
+	pc = map__rip_2objdump(dloc->ms->map, dloc->ip);
 
 	/* Get a compile_unit for this address */
-	if (!find_cu_die(di, pc, &cu_die)) {
+	if (!find_cu_die(dloc->di, pc, &cu_die)) {
 		pr_debug("cannot find CU for address %lx\n", pc);
 		ann_data_stat.no_cuinfo++;
 		return -1;
@@ -262,18 +269,19 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 	offset = loc->offset;
 
 	if (reg == DWARF_REG_PC) {
-		if (die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) {
+		if (die_find_variable_by_addr(&cu_die, pc, dloc->var_addr,
+					      &var_die, &offset)) {
 			ret = check_variable(&var_die, type_die, offset,
 					     /*is_pointer=*/false);
-			loc->offset = offset;
+			dloc->type_offset = offset;
 			goto out;
 		}
 
-		if (var_name && die_find_variable_at(&cu_die, var_name, pc,
-						     &var_die)) {
-			ret = check_variable(&var_die, type_die, 0,
+		if (dloc->var_name &&
+		    die_find_variable_at(&cu_die, dloc->var_name, pc, &var_die)) {
+			ret = check_variable(&var_die, type_die, dloc->type_offset,
 					     /*is_pointer=*/false);
-			/* loc->offset will be updated by the caller */
+			/* dloc->type_offset was updated by the caller */
 			goto out;
 		}
 	}
@@ -290,10 +298,11 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 		    dwarf_formblock(&attr, &block) == 0 && block.length == 1) {
 			switch (*block.data) {
 			case DW_OP_reg0 ... DW_OP_reg31:
-				fbreg = *block.data - DW_OP_reg0;
+				fbreg = dloc->fbreg = *block.data - DW_OP_reg0;
 				break;
 			case DW_OP_call_frame_cfa:
-				if (die_get_cfa(di->dbg, pc, &fbreg,
+				dloc->fb_cfa = true;
+				if (die_get_cfa(dloc->di->dbg, pc, &fbreg,
 						&fb_offset) < 0)
 					fbreg = -1;
 				break;
@@ -311,7 +320,7 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 	/* Search from the inner-most scope to the outer */
 	for (i = nr_scopes - 1; i >= 0; i--) {
 		if (reg == DWARF_REG_PC) {
-			if (!die_find_variable_by_addr(&scopes[i], pc, addr,
+			if (!die_find_variable_by_addr(&scopes[i], pc, dloc->var_addr,
 						       &var_die, &offset))
 				continue;
 		} else {
@@ -324,7 +333,7 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 		/* Found a variable, see if it's correct */
 		ret = check_variable(&var_die, type_die, offset,
 				     reg != DWARF_REG_PC && !is_fbreg);
-		loc->offset = offset;
+		dloc->type_offset = offset;
 		goto out;
 	}
 
@@ -343,50 +352,46 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
 
 /**
  * find_data_type - Return a data type at the location
- * @ms: map and symbol at the location
- * @ip: instruction address of the memory access
- * @loc: instruction operand location
- * @addr: data address of the memory access
- * @var_name: global variable name
+ * @dloc: data location
  *
  * This functions searches the debug information of the binary to get the data
- * type it accesses.  The exact location is expressed by (@ip, reg, offset)
- * for pointer variables or (@ip, @addr) for global variables.  Note that global
- * variables might update the @loc->offset after finding the start of the variable.
- * If it cannot find a global variable by address, it tried to fine a declaration
- * of the variable using @var_name.  In that case, @loc->offset won't be updated.
+ * type it accesses.  The exact location is expressed by (ip, reg, offset)
+ * for pointer variables or (ip, addr) for global variables.  Note that global
+ * variables might update the @dloc->type_offset after finding the start of the
+ * variable.  If it cannot find a global variable by address, it tries to find
+ * a declaration of the variable using var_name.  In that case,
+ * @dloc->type_offset won't be updated.
  *
  * It return %NULL if not found.
  */
-struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   struct annotated_op_loc *loc, u64 addr,
-					   const char *var_name)
+struct annotated_data_type *find_data_type(struct data_loc_info *dloc)
 {
 	struct annotated_data_type *result = NULL;
-	struct dso *dso = ms->map->dso;
-	struct debuginfo *di;
+	struct dso *dso = dloc->ms->map->dso;
 	Dwarf_Die type_die;
-	u64 pc;
 
-	di = debuginfo__new(dso->long_name);
-	if (di == NULL) {
+	dloc->di = debuginfo__new(dso->long_name);
+	if (dloc->di == NULL) {
 		pr_debug("cannot get the debug info\n");
 		return NULL;
 	}
 
 	/*
-	 * IP is a relative instruction address from the start of the map, as
-	 * it can be randomized/relocated, it needs to translate to PC which is
-	 * a file address for DWARF processing.
+	 * The type offset is the same as instruction offset by default.
+	 * But when finding a global variable, the offset won't be valid.
 	 */
-	pc = map__rip_2objdump(ms->map, ip);
-	if (find_data_type_die(di, pc, addr, var_name, loc, &type_die) < 0)
+	if (dloc->var_name == NULL)
+		dloc->type_offset = dloc->op->offset;
+
+	dloc->fbreg = -1;
+
+	if (find_data_type_die(dloc, &type_die) < 0)
 		goto out;
 
 	result = dso__findnew_data_type(dso, &type_die);
 
 out:
-	debuginfo__delete(di);
+	debuginfo__delete(dloc->di);
 	return result;
 }
 
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 1b0db8e8c40e..ad6493ea2c8e 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -8,6 +8,7 @@
 #include <linux/types.h>
 
 struct annotated_op_loc;
+struct debuginfo;
 struct evsel;
 struct map_symbol;
 
@@ -72,6 +73,35 @@ struct annotated_data_type {
 extern struct annotated_data_type unknown_type;
 extern struct annotated_data_type stackop_type;
 
+/**
+ * struct data_loc_info - Data location information
+ * @ms: Map and Symbol info
+ * @ip: Instruction address
+ * @var_addr: Data address (for global variables)
+ * @var_name: Variable name (for global variables)
+ * @op: Instruction operand location (regs and offset)
+ * @di: Debug info
+ * @fbreg: Frame base register
+ * @fb_cfa: Whether the frame needs to check CFA
+ * @type_offset: Final offset in the type
+ */
+struct data_loc_info {
+	/* These are input fields and should be filled by the caller */
+	struct map_symbol *ms;
+	u64 ip;
+	u64 var_addr;
+	const char *var_name;
+	struct annotated_op_loc *op;
+
+	/* These are used internally */
+	struct debuginfo *di;
+	int fbreg;
+	bool fb_cfa;
+
+	/* This is for the result */
+	int type_offset;
+};
+
 /**
  * struct annotated_data_stat - Debug statistics
  * @total: Total number of entry
@@ -106,9 +136,7 @@ extern struct annotated_data_stat ann_data_stat;
 #ifdef HAVE_DWARF_SUPPORT
 
 /* Returns data type at the location (ip, reg, offset) */
-struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
-					   struct annotated_op_loc *loc, u64 addr,
-					   const char *var_name);
+struct annotated_data_type *find_data_type(struct data_loc_info *dloc);
 
 /* Update type access histogram at the given offset */
 int annotated_data_type__update_samples(struct annotated_data_type *adt,
@@ -121,9 +149,7 @@ void annotated_data_type__tree_delete(struct rb_root *root);
 #else /* HAVE_DWARF_SUPPORT */
 
 static inline struct annotated_data_type *
-find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
-	       struct annotated_op_loc *loc __maybe_unused,
-	       u64 addr __maybe_unused, const char *var_name __maybe_unused)
+find_data_type(struct data_loc_info *dloc __maybe_unused)
 {
 	return NULL;
 }
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 4f74db1d3256..136a00e17a5c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3792,9 +3792,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	struct annotated_op_loc *op_loc;
 	struct annotated_data_type *mem_type;
 	struct annotated_item_stat *istat;
-	u64 ip = he->ip, addr = 0;
-	const char *var_name = NULL;
-	int var_offset;
+	u64 ip = he->ip;
 	int i;
 
 	ann_data_stat.total++;
@@ -3842,51 +3840,53 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	}
 
 	for_each_insn_op_loc(&loc, i, op_loc) {
+		struct data_loc_info dloc = {
+			.ms = ms,
+			/* Recalculate IP for LOCK prefix or insn fusion */
+			.ip = ms->sym->start + dl->al.offset,
+			.op = op_loc,
+		};
+
 		if (!op_loc->mem_ref)
 			continue;
 
 		/* Recalculate IP because of LOCK prefix or insn fusion */
 		ip = ms->sym->start + dl->al.offset;
 
-		var_offset = op_loc->offset;
-
 		/* PC-relative addressing */
 		if (op_loc->reg1 == DWARF_REG_PC) {
 			struct addr_location al;
 			struct symbol *var;
 			u64 map_addr;
 
-			addr = annotate_calc_pcrel(ms, ip, op_loc->offset, dl);
+			dloc.var_addr = annotate_calc_pcrel(ms, ip, op_loc->offset, dl);
 			/* Kernel symbols might be relocated */
-			map_addr = addr + map__reloc(ms->map);
+			map_addr = dloc.var_addr + map__reloc(ms->map);
 
 			addr_location__init(&al);
 			var = thread__find_symbol_fb(he->thread, he->cpumode,
 						     map_addr, &al);
 			if (var) {
-				var_name = var->name;
+				dloc.var_name = var->name;
 				/* Calculate type offset from the start of variable */
-				var_offset = map_addr - map__unmap_ip(al.map, var->start);
+				dloc.type_offset = map_addr - map__unmap_ip(al.map, var->start);
 			}
 			addr_location__exit(&al);
 		}
 
-		mem_type = find_data_type(ms, ip, op_loc, addr, var_name);
+		mem_type = find_data_type(&dloc);
 		if (mem_type)
 			istat->good++;
 		else
 			istat->bad++;
 
-		if (mem_type && var_name)
-			op_loc->offset = var_offset;
-
 		if (symbol_conf.annotate_data_sample) {
 			annotated_data_type__update_samples(mem_type, evsel,
-							    op_loc->offset,
+							    dloc.type_offset,
 							    he->stat.nr_events,
 							    he->stat.period);
 		}
-		he->mem_type_off = op_loc->offset;
+		he->mem_type_off = dloc.type_offset;
 		return mem_type;
 	}
 
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 41/52] perf map: Add map__objdump_2rip()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (39 preceding siblings ...)
  2023-11-09 23:59 ` [PATCH 40/52] perf annotate-data: Introduce struct data_loc_info Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 42/52] perf annotate: Add annotate_get_basic_blocks() Namhyung Kim
                   ` (11 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Sometimes we want to convert an address in objdump output to a
map-relative address to match it against sample data.  Let's add
map__objdump_2rip() for that.
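
The conversion can be pictured with a small standalone model.  In the
sketch below, struct toy_map is a hypothetical flattened view of the
map and dso fields involved, and it assumes map__map_ip() computes
"addr - start + pgoff" (and map__unmap_ip() the reverse).  It is only
meant to show that the new helper is the inverse of the existing
rip -> objdump direction, not to replace the real code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flattened view of the map/dso fields used here. */
struct toy_map {
	uint64_t start;		/* start address of the map */
	uint64_t pgoff;		/* file offset of the mapping */
	uint64_t reloc;		/* relocation applied to the kernel */
	uint64_t text_offset;	/* text offset of a user dso */
	bool adjust_symbols;
	bool rel;		/* ET_REL object, e.g. a kernel module */
	bool user;		/* user-space dso */
};

/* objdump address -> map-relative address, mirroring the patch */
static uint64_t toy_objdump_2rip(const struct toy_map *map, uint64_t ip)
{
	if (!map->adjust_symbols)
		return ip;
	if (map->rel)
		return ip + map->pgoff;
	if (map->user)
		return ip - map->text_offset;
	/* assumed equivalent of map__map_ip() */
	return (ip + map->reloc) - map->start + map->pgoff;
}

/* map-relative address -> objdump address, the opposite direction */
static uint64_t toy_rip_2objdump(const struct toy_map *map, uint64_t rip)
{
	if (!map->adjust_symbols)
		return rip;
	if (map->rel)
		return rip - map->pgoff;
	if (map->user)
		return rip + map->text_offset;
	/* assumed equivalent of map__unmap_ip() */
	return (rip + map->start - map->pgoff) - map->reloc;
}
```

Feeding the result of one function into the other should give back the
original address in every branch.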

Cc: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/map.c | 20 ++++++++++++++++++++
 tools/perf/util/map.h |  3 +++
 2 files changed, 23 insertions(+)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index f64b83004421..f25cf664c898 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -595,6 +595,26 @@ u64 map__objdump_2mem(struct map *map, u64 ip)
 	return ip + map__reloc(map);
 }
 
+u64 map__objdump_2rip(struct map *map, u64 ip)
+{
+	const struct dso *dso = map__dso(map);
+
+	if (!dso->adjust_symbols)
+		return ip;
+
+	if (dso->rel)
+		return ip + map__pgoff(map);
+
+	/*
+	 * kernel modules also have DSO_SPACE__USER in dso->kernel,
+	 * but all kernel modules are ET_REL, so won't get here.
+	 */
+	if (dso->kernel == DSO_SPACE__USER)
+		return ip - dso->text_offset;
+
+	return map__map_ip(map, ip + map__reloc(map));
+}
+
 bool map__contains_symbol(const struct map *map, const struct symbol *sym)
 {
 	u64 ip = map__unmap_ip(map, sym->start);
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 1b53d53adc86..b7bcf0aa3b67 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -129,6 +129,9 @@ u64 map__rip_2objdump(struct map *map, u64 rip);
 /* objdump address -> memory address */
 u64 map__objdump_2mem(struct map *map, u64 ip);
 
+/* objdump address -> rip */
+u64 map__objdump_2rip(struct map *map, u64 ip);
+
 struct symbol;
 struct thread;
 
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 42/52] perf annotate: Add annotate_get_basic_blocks()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (40 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 41/52] perf map: Add map__objdump_2rip() Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 43/52] perf annotate-data: Maintain variable type info Namhyung Kim
                   ` (10 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

The annotate_get_basic_blocks() finds a list of basic blocks from the
source instruction to the destination instruction in a function.

It'll be used to find variables in a scope.  It uses BFS (Breadth First
Search) to find the shortest path, so that the variable/register state
carried along it is minimal.

Also change find_disasm_line() so that it can be used in
annotate_get_basic_blocks(), and add an 'allow_update' argument to
control whether it can update the IP.
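
The BFS with parent links can be sketched on a toy control flow graph.
The code below is a simplified standalone model: struct toy_cfg and
toy_shortest_path() are hypothetical names, blocks are plain integers
with at most two successors, and arrays replace the linked lists used
in the patch.  The parent[] array plays the role of basic_block_link's
parent pointer.

```c
#include <string.h>

#define TOY_MAX_BLOCKS 8

/* Hypothetical CFG: each block has up to two successors, -1 means none. */
struct toy_cfg {
	int nr_blocks;
	int succ[TOY_MAX_BLOCKS][2];
};

/*
 * Find a shortest path of blocks from @src to @dst using BFS.  The path
 * is written to @path in order (src first).  Returns the number of
 * blocks on the path, or -1 if @dst cannot be reached.
 */
static int toy_shortest_path(const struct toy_cfg *cfg, int src, int dst,
			     int path[TOY_MAX_BLOCKS])
{
	int queue[TOY_MAX_BLOCKS], head = 0, tail = 0;
	int parent[TOY_MAX_BLOCKS];
	char visited[TOY_MAX_BLOCKS];
	int len = 0, cur;

	if (src < 0 || src >= cfg->nr_blocks ||
	    dst < 0 || dst >= cfg->nr_blocks)
		return -1;

	memset(visited, 0, sizeof(visited));
	queue[tail++] = src;
	visited[src] = 1;
	parent[src] = -1;

	while (head < tail) {
		cur = queue[head++];
		if (cur == dst)
			break;
		for (int i = 0; i < 2; i++) {
			int next = cfg->succ[cur][i];

			if (next < 0 || next >= cfg->nr_blocks || visited[next])
				continue;
			visited[next] = 1;
			parent[next] = cur;	/* remember where we came from */
			queue[tail++] = next;
		}
	}
	if (!visited[dst])
		return -1;

	/* Count the blocks on the path, then fill @path back to front. */
	for (cur = dst; cur != -1; cur = parent[cur])
		len++;
	for (cur = dst, head = len; cur != -1; cur = parent[cur])
		path[--head] = cur;
	return len;
}
```

Because every block is queued at most once and records a single parent,
walking the parents back from the destination yields one shortest chain
of blocks, which is what keeps the state to carry small.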

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 222 ++++++++++++++++++++++++++++++++++++-
 tools/perf/util/annotate.h |  16 +++
 2 files changed, 235 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 136a00e17a5c..d54a9ec16af4 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3690,7 +3690,8 @@ static void symbol__ensure_annotate(struct map_symbol *ms, struct evsel *evsel)
 	}
 }
 
-static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip)
+static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip,
+					    bool allow_update)
 {
 	struct disasm_line *dl;
 	struct annotation *notes;
@@ -3703,7 +3704,8 @@ static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip)
 			 * llvm-objdump places "lock" in a separate line and
 			 * in that case, we want to get the next line.
 			 */
-			if (!strcmp(dl->ins.name, "lock") && *dl->ops.raw == '\0') {
+			if (!strcmp(dl->ins.name, "lock") &&
+			    *dl->ops.raw == '\0' && allow_update) {
 				ip++;
 				continue;
 			}
@@ -3814,7 +3816,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	 * Get a disasm to extract the location from the insn.
 	 * This is too slow...
 	 */
-	dl = find_disasm_line(ms->sym, ip);
+	dl = find_disasm_line(ms->sym, ip, /*allow_update=*/true);
 	if (dl == NULL) {
 		ann_data_stat.no_insn++;
 		return NULL;
@@ -3908,3 +3910,217 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	istat->bad++;
 	return NULL;
 }
+
+/* Basic block traversal (BFS) data structure */
+struct basic_block_data {
+	struct list_head queue;
+	struct list_head visited;
+};
+
+/*
+ * During the traversal, it needs to know the parent block where the current
+ * block started from.  Note that a single basic block can be the parent of
+ * two child basic blocks (in case of a conditional jump).
+ */
+struct basic_block_link {
+	struct list_head node;
+	struct basic_block_link *parent;
+	struct annotated_basic_block *bb;
+};
+
+/* Check whether any basic block in the list already covers the offset */
+static bool basic_block_has_offset(struct list_head *head, s64 offset)
+{
+	struct basic_block_link *link;
+
+	list_for_each_entry(link, head, node) {
+		s64 begin_offset = link->bb->begin->al.offset;
+		s64 end_offset = link->bb->end->al.offset;
+
+		if (begin_offset <= offset && offset <= end_offset)
+			return true;
+	}
+	return false;
+}
+
+static bool is_new_basic_block(struct basic_block_data *bb_data,
+			       struct disasm_line *dl)
+{
+	s64 offset = dl->al.offset;
+
+	if (basic_block_has_offset(&bb_data->visited, offset))
+		return false;
+	if (basic_block_has_offset(&bb_data->queue, offset))
+		return false;
+	return true;
+}
+
+/* Add a basic block starting from dl and link it to the parent */
+static int add_basic_block(struct basic_block_data *bb_data,
+			   struct basic_block_link *parent,
+			   struct disasm_line *dl)
+{
+	struct annotated_basic_block *bb;
+	struct basic_block_link *link;
+
+	if (dl == NULL)
+		return -1;
+
+	if (!is_new_basic_block(bb_data, dl))
+		return 0;
+
+	bb = zalloc(sizeof(*bb));
+	if (bb == NULL)
+		return -1;
+
+	bb->begin = dl;
+	bb->end = dl;
+	INIT_LIST_HEAD(&bb->list);
+
+	link = malloc(sizeof(*link));
+	if (link == NULL) {
+		free(bb);
+		return -1;
+	}
+
+	link->bb = bb;
+	link->parent = parent;
+	list_add_tail(&link->node, &bb_data->queue);
+	return 0;
+}
+
+/* Returns true when it finds the target in the current basic block */
+static bool process_basic_block(struct basic_block_data *bb_data,
+				struct basic_block_link *link,
+				struct symbol *sym, u64 target)
+{
+	struct disasm_line *dl, *next_dl, *last_dl;
+	struct annotation *notes = symbol__annotation(sym);
+	bool found = false;
+
+	dl = link->bb->begin;
+	/* Check if it's already visited */
+	if (basic_block_has_offset(&bb_data->visited, dl->al.offset))
+		return false;
+
+	last_dl = list_last_entry(&notes->src->source,
+				  struct disasm_line, al.node);
+
+	list_for_each_entry_from(dl, &notes->src->source, al.node) {
+		/* Found the target instruction */
+		if (sym->start + dl->al.offset == target) {
+			found = true;
+			break;
+		}
+		/* End of the function, finish the block */
+		if (dl == last_dl)
+			break;
+		/* 'return' instruction finishes the block */
+		if (dl->ins.ops == &ret_ops)
+			break;
+		/* normal instructions are part of the basic block */
+		if (dl->ins.ops != &jump_ops)
+			continue;
+		/* jump to a different function, tail call or return */
+		if (dl->ops.target.outside)
+			break;
+		/* jump instruction creates new basic block(s) */
+		next_dl = find_disasm_line(sym, sym->start + dl->ops.target.offset,
+					   /*allow_update=*/false);
+		add_basic_block(bb_data, link, next_dl);
+
+		/*
+		 * FIXME: determine conditional jumps properly.
+		 * Conditional jumps create another basic block with the
+		 * next disasm line.
+		 */
+		if (!strstr(dl->ins.name, "jmp")) {
+			next_dl = list_next_entry(dl, al.node);
+			add_basic_block(bb_data, link, next_dl);
+		}
+		break;
+
+	}
+	link->bb->end = dl;
+	return found;
+}
+
+/*
+ * Once the target basic block is found, build a proper linked list of basic
+ * blocks by following the parent links.
+ */
+static void link_found_basic_blocks(struct basic_block_link *link,
+				    struct list_head *head)
+{
+	while (link) {
+		struct basic_block_link *parent = link->parent;
+
+		list_move(&link->bb->list, head);
+		list_del(&link->node);
+		free(link);
+
+		link = parent;
+	}
+}
+
+static void delete_basic_blocks(struct basic_block_data *bb_data)
+{
+	struct basic_block_link *link, *tmp;
+
+	list_for_each_entry_safe(link, tmp, &bb_data->queue, node) {
+		list_del(&link->node);
+		free(link->bb);
+		free(link);
+	}
+
+	list_for_each_entry_safe(link, tmp, &bb_data->visited, node) {
+		list_del(&link->node);
+		free(link->bb);
+		free(link);
+	}
+}
+
+/**
+ * annotate_get_basic_blocks - Get basic blocks for given address range
+ * @sym: symbol to annotate
+ * @src: source address
+ * @dst: destination address
+ * @head: list head to save basic blocks
+ *
+ * This function traverses disasm_lines from @src to @dst and save them in a
+ * list of annotated_basic_block to @head.  It uses BFS to find the shortest
+ * path between two.  The basic_block_link is to maintain parent links so
+ * that it can build a list of blocks from the start.
+ */
+int annotate_get_basic_blocks(struct symbol *sym, s64 src, s64 dst,
+			      struct list_head *head)
+{
+	struct basic_block_data bb_data = {
+		.queue = LIST_HEAD_INIT(bb_data.queue),
+		.visited = LIST_HEAD_INIT(bb_data.visited),
+	};
+	struct basic_block_link *link;
+	struct disasm_line *dl;
+	int ret = -1;
+
+	dl = find_disasm_line(sym, src, /*allow_update=*/false);
+	if (dl == NULL)
+		return -1;
+
+	if (add_basic_block(&bb_data, /*parent=*/NULL, dl) < 0)
+		return -1;
+
+	/* Find shortest path from src to dst using BFS */
+	while (!list_empty(&bb_data.queue)) {
+		link = list_first_entry(&bb_data.queue, struct basic_block_link, node);
+
+		if (process_basic_block(&bb_data, link, sym, dst)) {
+			link_found_basic_blocks(link, head);
+			ret = 0;
+			break;
+		}
+		list_move(&link->node, &bb_data.visited);
+	}
+	delete_basic_blocks(&bb_data);
+	return ret;
+}
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 79ccc65c9ff9..e1fa86341281 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -501,4 +501,20 @@ extern struct list_head ann_insn_stat;
 u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset,
 			struct disasm_line *dl);
 
+/**
+ * struct annotated_basic_block - Basic block of instructions
+ * @list: List node
+ * @begin: start instruction in the block
+ * @end: end instruction in the block
+ */
+struct annotated_basic_block {
+	struct list_head list;
+	struct disasm_line *begin;
+	struct disasm_line *end;
+};
+
+/* Get a list of basic blocks from src to dst addresses */
+int annotate_get_basic_blocks(struct symbol *sym, s64 src, s64 dst,
+			      struct list_head *head);
+
 #endif	/* __PERF_ANNOTATE_H */
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 43/52] perf annotate-data: Maintain variable type info
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (41 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 42/52] perf annotate: Add annotate_get_basic_blocks() Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 44/52] perf annotate-data: Add update_insn_state() Namhyung Kim
                   ` (9 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Now that basic block and variable information is collected for each
scope, we can build a state table to find the matching variable at a
given location.

The struct type_state keeps the type info saved in each register and
stack slot.  The update_var_state() updates the table when it finds
variables at the current address.  It expects die_collect_vars() to have
filled a list of variables with type info and starting address.
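
The stack part of the table can be sketched as follows.  This is a
simplified standalone model with hypothetical names (toy_stack_slot,
toy_find_stack(), toy_findnew_stack()): a type name string stands in
for the Dwarf_Die, and a small fixed array replaces the dynamically
allocated list.  The lookup rule is the one used by find_stack_state()
in this patch: an exact offset match, or any offset inside a compound
(struct/union) slot.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_STACK 4

/* Hypothetical stack slot: a type name stands in for the Dwarf_Die. */
struct toy_stack_slot {
	const char *type;
	int offset;
	int size;
	bool compound;	/* struct or union: matches offsets inside it */
	bool used;
};

struct toy_type_state {
	struct toy_stack_slot stack[TOY_MAX_STACK];
};

/* Find the slot at @offset, or the compound slot that contains it. */
static struct toy_stack_slot *toy_find_stack(struct toy_type_state *state,
					     int offset)
{
	for (int i = 0; i < TOY_MAX_STACK; i++) {
		struct toy_stack_slot *slot = &state->stack[i];

		if (!slot->used)
			continue;
		if (offset == slot->offset)
			return slot;
		if (slot->compound && slot->offset < offset &&
		    offset < slot->offset + slot->size)
			return slot;
	}
	return NULL;
}

/* Update the slot covering @offset or claim a free one; NULL if full. */
static struct toy_stack_slot *toy_findnew_stack(struct toy_type_state *state,
						int offset, const char *type,
						int size, bool compound)
{
	struct toy_stack_slot *slot = toy_find_stack(state, offset);

	for (int i = 0; slot == NULL && i < TOY_MAX_STACK; i++) {
		if (!state->stack[i].used)
			slot = &state->stack[i];
	}
	if (slot == NULL)
		return NULL;

	slot->type = type;
	slot->offset = offset;
	slot->size = size;
	slot->compound = compound;
	slot->used = true;
	return slot;
}
```

For instance, after recording an 8-byte struct at offset -16 and an int
at offset -4, a lookup at -12 finds the struct slot while a lookup at -2
finds nothing, since a scalar slot only matches its exact offset.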

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 155 ++++++++++++++++++++++++++++++++
 tools/perf/util/annotate-data.h |  29 ++++++
 tools/perf/util/dwarf-aux.c     |   4 +
 3 files changed, 188 insertions(+)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index c61f5b5b6adc..438d6234020b 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -22,6 +22,57 @@
 #include "symbol.h"
 #include "symbol_conf.h"
 
+/* Type information in a register, valid when ok is true */
+struct type_state_reg {
+	Dwarf_Die type;
+	bool ok;
+	bool scratch;
+};
+
+/* Type information in a stack location, dynamically allocated */
+struct type_state_stack {
+	struct list_head list;
+	Dwarf_Die type;
+	int offset;
+	int size;
+	bool compound;
+};
+
+/* FIXME: This should be arch-dependent */
+#define TYPE_STATE_MAX_REGS  16
+
+/*
+ * State table to maintain type info in each register and stack location.
+ * It'll be updated when new variable is allocated or type info is moved
+ * to a new location (register or stack).  As it'd be used with the
+ * shortest path of basic blocks, it only maintains a single table.
+ */
+struct type_state {
+	struct type_state_reg regs[TYPE_STATE_MAX_REGS];
+	struct list_head stack_vars;
+};
+
+static bool has_reg_type(struct type_state *state, int reg)
+{
+	return (unsigned)reg < ARRAY_SIZE(state->regs);
+}
+
+void init_type_state(struct type_state *state, struct arch *arch __maybe_unused)
+{
+	memset(state, 0, sizeof(*state));
+	INIT_LIST_HEAD(&state->stack_vars);
+}
+
+void exit_type_state(struct type_state *state)
+{
+	struct type_state_stack *stack, *tmp;
+
+	list_for_each_entry_safe(stack, tmp, &state->stack_vars, list) {
+		list_del(&stack->list);
+		free(stack);
+	}
+}
+
 /*
  * Compare type name and size to maintain them in a tree.
  * I'm not sure if DWARF would have information of a single type in many
@@ -237,6 +288,110 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,
 	return 0;
 }
 
+static struct type_state_stack *find_stack_state(struct type_state *state,
+						 int offset)
+{
+	struct type_state_stack *stack;
+
+	list_for_each_entry(stack, &state->stack_vars, list) {
+		if (offset == stack->offset)
+			return stack;
+
+		if (stack->compound && stack->offset < offset &&
+		    offset < stack->offset + stack->size)
+			return stack;
+	}
+	return NULL;
+}
+
+static void set_stack_state(struct type_state_stack *stack, int offset,
+			    Dwarf_Die *type_die)
+{
+	int tag;
+	Dwarf_Word size;
+
+	if (dwarf_aggregate_size(type_die, &size) < 0)
+		size = 0;
+
+	tag = dwarf_tag(type_die);
+
+	stack->type = *type_die;
+	stack->size = size;
+	stack->offset = offset;
+
+	switch (tag) {
+	case DW_TAG_structure_type:
+	case DW_TAG_union_type:
+		stack->compound = true;
+		break;
+	default:
+		stack->compound = false;
+		break;
+	}
+}
+
+static struct type_state_stack *findnew_stack_state(struct type_state *state,
+						    int offset, Dwarf_Die *type_die)
+{
+	struct type_state_stack *stack = find_stack_state(state, offset);
+
+	if (stack) {
+		set_stack_state(stack, offset, type_die);
+		return stack;
+	}
+
+	stack = malloc(sizeof(*stack));
+	if (stack) {
+		set_stack_state(stack, offset, type_die);
+		list_add(&stack->list, &state->stack_vars);
+	}
+	return stack;
+}
+
+/**
+ * update_var_state - Update type state using given variables
+ * @state: type state table
+ * @dloc: data location info
+ * @addr: instruction address to update
+ * @var_types: list of variables with type info
+ *
+ * This function fills the @state table using @var_types info.  Each variable
+ * is used only at the given location and updates an entry in the table.
+ */
+void update_var_state(struct type_state *state, struct data_loc_info *dloc,
+		      u64 addr, struct die_var_type *var_types)
+{
+	Dwarf_Die mem_die;
+	struct die_var_type *var;
+	int fbreg = dloc->fbreg;
+	int fb_offset = 0;
+
+	if (dloc->fb_cfa) {
+		if (die_get_cfa(dloc->di->dbg, addr, &fbreg, &fb_offset) < 0)
+			fbreg = -1;
+	}
+
+	for (var = var_types; var != NULL; var = var->next) {
+		if (var->addr != addr)
+			continue;
+		/* Get the type DIE using the offset */
+		if (!dwarf_offdie(dloc->di->dbg, var->die_off, &mem_die))
+			continue;
+
+		if (var->reg == DWARF_REG_FB) {
+			findnew_stack_state(state, var->offset, &mem_die);
+		} else if (var->reg == fbreg) {
+			findnew_stack_state(state, var->offset - fb_offset, &mem_die);
+		} else if (has_reg_type(state, var->reg)) {
+			struct type_state_reg *reg;
+
+			reg = &state->regs[var->reg];
+			reg->type = mem_die;
+			reg->ok = true;
+		}
+	}
+}
+
 /* The result will be saved in @type_die */
 static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
 {
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index ad6493ea2c8e..7fbb9eb2e96f 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -8,9 +8,12 @@
 #include <linux/types.h>
 
 struct annotated_op_loc;
+struct arch;
 struct debuginfo;
+struct die_var_type;
 struct evsel;
 struct map_symbol;
+struct type_state;
 
 /**
  * struct annotated_member - Type of member field
@@ -146,6 +149,16 @@ int annotated_data_type__update_samples(struct annotated_data_type *adt,
 /* Release all data type information in the tree */
 void annotated_data_type__tree_delete(struct rb_root *root);
 
+/* Initialize type state table */
+void init_type_state(struct type_state *state, struct arch *arch);
+
+/* Destroy type state table */
+void exit_type_state(struct type_state *state);
+
+/* Update type state table using variables */
+void update_var_state(struct type_state *state, struct data_loc_info *dloc,
+		      u64 addr, struct die_var_type *var_types);
+
 #else /* HAVE_DWARF_SUPPORT */
 
 static inline struct annotated_data_type *
@@ -168,6 +181,22 @@ static inline void annotated_data_type__tree_delete(struct rb_root *root __maybe
 {
 }
 
+static inline void init_type_state(struct type_state *state __maybe_unused,
+				   struct arch *arch __maybe_unused)
+{
+}
+
+static inline void exit_type_state(struct type_state *state __maybe_unused)
+{
+}
+
+static inline void update_var_state(struct type_state *state __maybe_unused,
+				    struct data_loc_info *dloc __maybe_unused,
+				    u64 addr __maybe_unused,
+				    struct die_var_type *var_types __maybe_unused)
+{
+}
+
 #endif /* HAVE_DWARF_SUPPORT */
 
 #endif /* _PERF_ANNOTATE_DATA_H */
diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 39851ff1d5c4..f88a8fb4a350 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -9,6 +9,7 @@
 #include <stdlib.h>
 #include "debug.h"
 #include "dwarf-aux.h"
+#include "dwarf-regs.h"
 #include "strbuf.h"
 #include "string2.h"
 
@@ -1147,6 +1148,8 @@ static int reg_from_dwarf_op(Dwarf_Op *op)
 	case DW_OP_regx:
 	case DW_OP_bregx:
 		return op->number;
+	case DW_OP_fbreg:
+		return DWARF_REG_FB;
 	default:
 		break;
 	}
@@ -1160,6 +1163,7 @@ static int offset_from_dwarf_op(Dwarf_Op *op)
 	case DW_OP_regx:
 		return 0;
 	case DW_OP_breg0 ... DW_OP_breg31:
+	case DW_OP_fbreg:
 		return op->number;
 	case DW_OP_bregx:
 		return op->number2;
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 44/52] perf annotate-data: Add update_insn_state()
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (42 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 43/52] perf annotate-data: Maintain variable type info Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 45/52] perf annotate-data: Handle global variable access Namhyung Kim
                   ` (8 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

The update_insn_state() function updates the type state table after
processing each instruction.  For now, it only handles MOV instructions
(on x86), which transfer type info from the source location to the target.

The location can be a register or a stack slot.  When a memory reference
is involved, it checks the operand carefully to fetch the right type.  It
basically ignores writes to memory since they don't change the type info.
The one exception is writes to (new) stack slots for register spilling.
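
As a rough illustration of the register-to-register case only, the
transfer can be sketched as below.  This is a simplified model, not the
code in this patch: the mv_* names are made up for the example and a
plain string stands in for the Dwarf_Die that the real table stores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MV_MAX_REGS 16

/* Hypothetical stand-in for a Dwarf_Die: a type is just a name here. */
struct mv_reg {
	const char *type;
	bool ok;
};

struct mv_state {
	struct mv_reg regs[MV_MAX_REGS];
};

/* Negative numbers wrap to large unsigned values and are rejected too. */
static bool mv_has_reg(int reg)
{
	return (unsigned)reg < MV_MAX_REGS;
}

/* "mov %src, %dst": the target inherits whatever the source holds. */
static void mv_reg_to_reg(struct mv_state *st, int src, int dst)
{
	assert(st != NULL);

	/* Untracked target register: nothing to update. */
	if (!mv_has_reg(dst))
		return;

	if (mv_has_reg(src))
		st->regs[dst] = st->regs[src];
	else
		st->regs[dst].ok = false;	/* unknown source, drop the type */
}
```

Copying the whole entry (including a false 'ok') means an unknown source
also makes the target unknown, which matches the intent of Case 1.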

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 128 +++++++++++++++++++++++++++++++-
 tools/perf/util/annotate-data.h |  13 ++++
 tools/perf/util/annotate.c      |   1 +
 3 files changed, 140 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 438d6234020b..09ccac1d0769 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -26,7 +26,6 @@
 struct type_state_reg {
 	Dwarf_Die type;
 	bool ok;
-	bool scratch;
 };
 
 /* Type information in a stack location, dynamically allocated */
@@ -382,7 +381,7 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc,
 			findnew_stack_state(state, var->offset, &mem_die);
 		} else if (var->reg == fbreg) {
 			findnew_stack_state(state, var->offset - fb_offset, &mem_die);
-		} else if (has_reg_type(state, var->reg)) {
+		} else if (has_reg_type(state, var->reg) && var->offset == 0) {
 			struct type_state_reg *reg;
 
 			reg = &state->regs[var->reg];
@@ -392,6 +391,131 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc,
 	}
 }
 
+/**
+ * update_insn_state - Update type state for an instruction
+ * @state: type state table
+ * @dloc: data location info
+ * @dl: disasm line for the instruction
+ *
+ * This function updates the @state table for the target operand of the
+ * instruction at @dl if it transfers the type like MOV on x86.  Since it
+ * tracks the type, it doesn't care about value changes made by arithmetic
+ * instructions like ADD/SUB/MUL/DIV and INC/DEC.
+ *
+ * Note that ops->reg2 is only available when both mem_ref and multi_regs
+ * are true.
+ */
+void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
+		       struct disasm_line *dl)
+{
+	struct annotated_insn_loc loc;
+	struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
+	struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
+	Dwarf_Die type_die;
+	int fbreg = dloc->fbreg;
+	int fboff = 0;
+
+	/* FIXME: remove x86 specific code and handle more instructions like LEA */
+	if (!strstr(dl->ins.name, "mov"))
+		return;
+
+	if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
+		return;
+
+	if (dloc->fb_cfa) {
+		u64 ip = dloc->ms->sym->start + dl->al.offset;
+		u64 pc = map__rip_2objdump(dloc->ms->map, ip);
+
+		if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fboff) < 0)
+			fbreg = -1;
+	}
+
+	/* Case 1. register to register transfers */
+	if (!src->mem_ref && !dst->mem_ref) {
+		if (!has_reg_type(state, dst->reg1))
+			return;
+
+		if (has_reg_type(state, src->reg1))
+			state->regs[dst->reg1] = state->regs[src->reg1];
+		else
+			state->regs[dst->reg1].ok = false;
+	}
+	/* Case 2. memory to register transfers */
+	if (src->mem_ref && !dst->mem_ref) {
+		int sreg = src->reg1;
+
+		if (!has_reg_type(state, dst->reg1))
+			return;
+
+retry:
+		/* Check stack variables with offset */
+		if (sreg == fbreg) {
+			struct type_state_stack *stack;
+			int offset = src->offset - fboff;
+
+			stack = find_stack_state(state, offset);
+			if (stack && die_get_member_type(&stack->type,
+							 offset - stack->offset,
+							 &type_die)) {
+				state->regs[dst->reg1].type = type_die;
+				state->regs[dst->reg1].ok = true;
+			} else
+				state->regs[dst->reg1].ok = false;
+		}
+		/* And then dereference the pointer if it has one */
+		else if (has_reg_type(state, sreg) && state->regs[sreg].ok &&
+			 die_deref_ptr_type(&state->regs[sreg].type,
+					    src->offset, &type_die)) {
+			state->regs[dst->reg1].type = type_die;
+			state->regs[dst->reg1].ok = true;
+		}
+		/* Or try another register if any */
+		else if (src->multi_regs && sreg == src->reg1 &&
+			 src->reg1 != src->reg2) {
+			sreg = src->reg2;
+			goto retry;
+		}
+		/* It failed to get a type info, mark it as invalid */
+		else {
+			state->regs[dst->reg1].ok = false;
+		}
+	}
+	/* Case 3. register to memory transfers */
+	if (!src->mem_ref && dst->mem_ref) {
+		if (!has_reg_type(state, src->reg1) ||
+		    !state->regs[src->reg1].ok)
+			return;
+
+		/* Check stack variables with offset */
+		if (dst->reg1 == fbreg) {
+			struct type_state_stack *stack;
+			int offset = dst->offset - fboff;
+
+			stack = find_stack_state(state, offset);
+			if (stack) {
+				/*
+				 * The source register is likely to hold a type
+				 * of member if it's a compound type.  Do not
+				 * update the stack variable type since we can
+				 * get the member type later by using the
+				 * die_get_member_type().
+				 */
+				if (!stack->compound)
+					set_stack_state(stack, offset,
+							&state->regs[src->reg1].type);
+			} else {
+				findnew_stack_state(state, offset,
+						    &state->regs[src->reg1].type);
+			}
+		}
+		/*
+		 * Ignore other transfers since it'd set a value in a struct
+		 * and won't change the type.
+		 */
+	}
+	/* Case 4. memory to memory transfers (not handled for now) */
+}
+
 /* The result will be saved in @type_die */
 static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
 {
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 7fbb9eb2e96f..ff9acf6ea808 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -11,6 +11,7 @@ struct annotated_op_loc;
 struct arch;
 struct debuginfo;
 struct die_var_type;
+struct disasm_line;
 struct evsel;
 struct map_symbol;
 struct type_state;
@@ -78,6 +79,7 @@ extern struct annotated_data_type stackop_type;
 
 /**
  * struct data_loc_info - Data location information
+ * @arch: architecture info
  * @ms: Map and Symbol info
  * @ip: Instruction address
  * @var_addr: Data address (for global variables)
@@ -90,6 +92,7 @@ extern struct annotated_data_type stackop_type;
  */
 struct data_loc_info {
 	/* These are input field, should be filled by caller */
+	struct arch *arch;
 	struct map_symbol *ms;
 	u64 ip;
 	u64 var_addr;
@@ -159,6 +162,10 @@ void exit_type_state(struct type_state *state);
 void update_var_state(struct type_state *state, struct data_loc_info *dloc,
 		      u64 addr, struct die_var_type *var_types);
 
+/* Update type state table for an instruction */
+void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
+		       struct disasm_line *dl);
+
 #else /* HAVE_DWARF_SUPPORT */
 
 static inline struct annotated_data_type *
@@ -197,6 +204,12 @@ static inline void update_var_state(struct type_state *state __maybe_unused,
 {
 }
 
+static inline void update_insn_state(struct type_state *state __maybe_unused,
+				     struct data_loc_info *dloc __maybe_unused,
+				     struct disasm_line *dl __maybe_unused)
+{
+}
+
 #endif /* HAVE_DWARF_SUPPORT */
 
 #endif /* _PERF_ANNOTATE_DATA_H */
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index d54a9ec16af4..ffbdba50b50a 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3843,6 +3843,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 
 	for_each_insn_op_loc(&loc, i, op_loc) {
 		struct data_loc_info dloc = {
+			.arch = arch,
 			.ms = ms,
 			/* Recalculate IP for LOCK prefix or insn fusion */
 			.ip = ms->sym->start + dl->al.offset,
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 45/52] perf annotate-data: Handle global variable access
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (43 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 44/52] perf annotate-data: Add update_insn_state() Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 46/52] perf annotate-data: Handle call instructions Namhyung Kim
                   ` (7 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

When updating the instruction states, it also needs to handle global
variable accesses.  Same as it does for PC-relative addressing, it can
look up the type by address (if it's defined in the same file), or by
name after finding the symbol by address (for declarations).
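
For reference, the address arithmetic behind a PC-relative access can be
sketched as below.  It relies on the x86-64 rule that a RIP-relative
displacement is applied to the address of the next instruction.  The
pcrel_* helpers are hypothetical names for this example and leave out
the map and relocation handling that the perf code has to do.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Target of a RIP-relative operand: the displacement is added to the
 * address of the next instruction (ip + length of this instruction).
 */
static uint64_t pcrel_target(uint64_t ip, uint64_t insn_len, int32_t disp)
{
	/* Sign-extend the displacement before the unsigned addition. */
	return ip + insn_len + (uint64_t)(int64_t)disp;
}

/* Offset of the accessed byte from the start of the variable's symbol. */
static int pcrel_var_offset(uint64_t addr, uint64_t var_start)
{
	assert(addr >= var_start);
	return (int)(addr - var_start);
}
```

The second helper mirrors the "type offset from the start of variable"
step: once the symbol covering the address is known, the remaining
offset selects the member inside the variable's type.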

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 45 ++++++++++++++++++++++++++++++---
 tools/perf/util/annotate-data.h | 10 ++++++--
 tools/perf/util/annotate.c      | 45 ++++++++++++++++++++-------------
 tools/perf/util/annotate.h      |  5 ++++
 4 files changed, 83 insertions(+), 22 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 09ccac1d0769..bbd271cd3419 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -395,6 +395,7 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc,
  * update_insn_state - Update type state for an instruction
  * @state: type state table
  * @dloc: data location info
+ * @cu_die: compile unit debug entry
  * @dl: disasm line for the instruction
  *
  * This function updates the @state table for the target operand of the
@@ -406,7 +407,7 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc,
  * are true.
  */
 void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
-		       struct disasm_line *dl)
+		       void *cu_die, struct disasm_line *dl)
 {
 	struct annotated_insn_loc loc;
 	struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
@@ -448,8 +449,46 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 			return;
 
 retry:
-		/* Check stack variables with offset */
-		if (sreg == fbreg) {
+		/* Check if it's a global variable */
+		if (sreg == DWARF_REG_PC) {
+			Dwarf_Die var_die;
+			struct map_symbol *ms = dloc->ms;
+			int offset = src->offset;
+			u64 ip = ms->sym->start + dl->al.offset;
+			u64 pc, addr;
+			const char *var_name = NULL;
+
+			addr = annotate_calc_pcrel(ms, ip, offset, dl);
+			pc = map__rip_2objdump(ms->map, ip);
+
+			if (die_find_variable_by_addr(cu_die, pc, addr,
+						      &var_die, &offset) &&
+			    check_variable(&var_die, &type_die, offset,
+					   /*is_pointer=*/false) == 0 &&
+			    die_get_member_type(&type_die, offset, &type_die)) {
+				state->regs[dst->reg1].type = type_die;
+				state->regs[dst->reg1].ok = true;
+				return;
+			}
+
+			/* Try to get the name of global variable */
+			offset = src->offset;
+			get_global_var_info(dloc->thread, ms, ip, dl,
+					    dloc->cpumode, &addr,
+					    &var_name, &offset);
+
+			if (var_name && die_find_variable_at(cu_die, var_name,
+							     pc, &var_die) &&
+			    check_variable(&var_die, &type_die, offset,
+					   /*is_pointer=*/false) == 0 &&
+			    die_get_member_type(&type_die, offset, &type_die)) {
+				state->regs[dst->reg1].type = type_die;
+				state->regs[dst->reg1].ok = true;
+			} else
+				state->regs[dst->reg1].ok = false;
+		}
+		/* And check stack variables with offset */
+		else if (sreg == fbreg) {
 			struct type_state_stack *stack;
 			int offset = src->offset - fboff;
 
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index ff9acf6ea808..0bfef29fa52c 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -14,6 +14,7 @@ struct die_var_type;
 struct disasm_line;
 struct evsel;
 struct map_symbol;
+struct thread;
 struct type_state;
 
 /**
@@ -79,11 +80,13 @@ extern struct annotated_data_type stackop_type;
 
 /**
  * struct data_loc_info - Data location information
- * @arch: architecture info
+ * @arch: CPU architecture info
+ * @thread: Thread info
  * @ms: Map and Symbol info
  * @ip: Instruction address
  * @var_addr: Data address (for global variables)
  * @var_name: Variable name (for global variables)
+ * @cpumode: CPU execution mode
  * @op: Instruction operand location (regs and offset)
  * @di: Debug info
  * @fbreg: Frame base register
@@ -94,8 +97,10 @@ struct data_loc_info {
 	/* These are input field, should be filled by caller */
 	struct arch *arch;
 	struct map_symbol *ms;
+	struct thread *thread;
 	u64 ip;
 	u64 var_addr;
+	u8 cpumode;
 	const char *var_name;
 	struct annotated_op_loc *op;
 
@@ -164,7 +169,7 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc,
 
 /* Update type state table for an instruction */
 void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
-		       struct disasm_line *dl);
+		       void *cu_die, struct disasm_line *dl);
 
 #else /* HAVE_DWARF_SUPPORT */
 
@@ -206,6 +211,7 @@ static inline void update_var_state(struct type_state *state __maybe_unused,
 
 static inline void update_insn_state(struct type_state *state __maybe_unused,
 				     struct data_loc_info *dloc __maybe_unused,
+				     void *cu_die __maybe_unused,
 				     struct disasm_line *dl __maybe_unused)
 {
 }
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index ffbdba50b50a..33fd032bf463 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3775,6 +3775,28 @@ u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset,
 	return map__rip_2objdump(ms->map, addr);
 }
 
+void get_global_var_info(struct thread *thread, struct map_symbol *ms, u64 ip,
+			 struct disasm_line *dl, u8 cpumode, u64 *var_addr,
+			 const char **var_name, int *poffset)
+{
+	struct addr_location al;
+	struct symbol *var;
+	u64 map_addr;
+
+	*var_addr = annotate_calc_pcrel(ms, ip, *poffset, dl);
+	/* Kernel symbols might be relocated */
+	map_addr = *var_addr + map__reloc(ms->map);
+
+	addr_location__init(&al);
+	var = thread__find_symbol_fb(thread, cpumode, map_addr, &al);
+	if (var) {
+		*var_name = var->name;
+		/* Calculate type offset from the start of variable */
+		*poffset = map_addr - map__unmap_ip(al.map, var->start);
+	}
+	addr_location__exit(&al);
+}
+
 /**
  * hist_entry__get_data_type - find data type for given hist entry
  * @he: hist entry
@@ -3844,6 +3866,8 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 	for_each_insn_op_loc(&loc, i, op_loc) {
 		struct data_loc_info dloc = {
 			.arch = arch,
+			.thread = he->thread,
+			.cpumode = he->cpumode,
 			.ms = ms,
 			/* Recalculate IP for LOCK prefix or insn fusion */
 			.ip = ms->sym->start + dl->al.offset,
@@ -3858,23 +3882,10 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 
 		/* PC-relative addressing */
 		if (op_loc->reg1 == DWARF_REG_PC) {
-			struct addr_location al;
-			struct symbol *var;
-			u64 map_addr;
-
-			dloc.var_addr = annotate_calc_pcrel(ms, ip, op_loc->offset, dl);
-			/* Kernel symbols might be relocated */
-			map_addr = dloc.var_addr + map__reloc(ms->map);
-
-			addr_location__init(&al);
-			var = thread__find_symbol_fb(he->thread, he->cpumode,
-						     map_addr, &al);
-			if (var) {
-				dloc.var_name = var->name;
-				/* Calculate type offset from the start of variable */
-				dloc.type_offset = map_addr - map__unmap_ip(al.map, var->start);
-			}
-			addr_location__exit(&al);
+			dloc.type_offset = op_loc->offset;
+			get_global_var_info(he->thread, ms, ip, dl, he->cpumode,
+					    &dloc.var_addr, &dloc.var_name,
+					    &dloc.type_offset);
 		}
 
 		mem_type = find_data_type(&dloc);
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index e1fa86341281..13c9b6a30b15 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -23,6 +23,7 @@ struct option;
 struct perf_sample;
 struct evsel;
 struct symbol;
+struct thread;
 struct annotated_data_type;
 
 struct ins {
@@ -501,6 +502,10 @@ extern struct list_head ann_insn_stat;
 u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset,
 			struct disasm_line *dl);
 
+void get_global_var_info(struct thread *thread, struct map_symbol *ms, u64 ip,
+			 struct disasm_line *dl, u8 cpumode, u64 *var_addr,
+			 const char **var_name, int *poffset);
+
 /**
  * struct annotated_basic_block - Basic block of instructions
  * @list: List node
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 46/52] perf annotate-data: Handle call instructions
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (44 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 45/52] perf annotate-data: Handle global variable access Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 47/52] perf annotate-data: Implement instruction tracking Namhyung Kim
                   ` (6 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

When updating instruction states, the call instruction should play a
role since it can change the register states.  For simplicity, mark some
registers as scratch registers (should be arch-dependent), and
invalidate them all after a function call.

If the function returns something, the designated register (ret_reg)
will have the type info.
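
A minimal sketch of this rule, under the same simplification of using a
string for the type, might look like below.  The call_* names are
invented for the example.  The register numbers are the x86-64 DWARF
numbers used in this patch (0=rax, 1=rdx, 2=rcx, 4=rsi, 5=rdi,
8-11=r8-r11).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CALL_MAX_REGS 16

struct call_reg {
	const char *type;	/* stand-in for a Dwarf_Die */
	bool ok;
	bool scratch;		/* clobbered by a function call */
};

struct call_state {
	struct call_reg regs[CALL_MAX_REGS];
	int ret_reg;
};

/* Mark the caller-saved registers on x86-64 as scratch. */
static void call_init_x86(struct call_state *st)
{
	static const int scratch[] = { 0, 1, 2, 4, 5, 8, 9, 10, 11 };
	size_t i;

	assert(st != NULL);

	for (i = 0; i < CALL_MAX_REGS; i++)
		st->regs[i] = (struct call_reg){ NULL, false, false };
	for (i = 0; i < sizeof(scratch) / sizeof(scratch[0]); i++)
		st->regs[scratch[i]].scratch = true;
	st->ret_reg = 0;	/* return value is in rax */
}

/* After a call: drop scratch registers, then record the return type. */
static void call_handle(struct call_state *st, const char *ret_type)
{
	size_t i;

	assert(st != NULL);

	for (i = 0; i < CALL_MAX_REGS; i++) {
		if (st->regs[i].scratch)
			st->regs[i].ok = false;
	}

	/* ret_type is NULL when the callee's return type is unknown */
	if (ret_type) {
		st->regs[st->ret_reg].type = ret_type;
		st->regs[st->ret_reg].ok = true;
	}
}
```

Callee-saved registers such as rbx (3) keep their type across the call,
while the return register is invalidated first and only becomes valid
again when a return type is found.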

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 45 +++++++++++++++++++++++++++++++--
 1 file changed, 43 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index bbd271cd3419..54791dfc6244 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -22,10 +22,14 @@
 #include "symbol.h"
 #include "symbol_conf.h"
 
-/* Type information in a register, valid when ok is true */
+/*
+ * Type information in a register, valid when @ok is true.
+ * The @scratch registers are invalidated after a function call.
+ */
 struct type_state_reg {
 	Dwarf_Die type;
 	bool ok;
+	bool scratch;
 };
 
 /* Type information in a stack location, dynamically allocated */
@@ -49,6 +53,7 @@ struct type_state_stack {
 struct type_state {
 	struct type_state_reg regs[TYPE_STATE_MAX_REGS];
 	struct list_head stack_vars;
+	int ret_reg;
 };
 
 static bool has_reg_type(struct type_state *state, int reg)
@@ -56,10 +61,23 @@ static bool has_reg_type(struct type_state *state, int reg)
 	return (unsigned)reg < ARRAY_SIZE(state->regs);
 }
 
-void init_type_state(struct type_state *state, struct arch *arch __maybe_unused)
+void init_type_state(struct type_state *state, struct arch *arch)
 {
 	memset(state, 0, sizeof(*state));
 	INIT_LIST_HEAD(&state->stack_vars);
+
+	if (arch__is(arch, "x86")) {
+		state->regs[0].scratch = true;
+		state->regs[1].scratch = true;
+		state->regs[2].scratch = true;
+		state->regs[4].scratch = true;
+		state->regs[5].scratch = true;
+		state->regs[8].scratch = true;
+		state->regs[9].scratch = true;
+		state->regs[10].scratch = true;
+		state->regs[11].scratch = true;
+		state->ret_reg = 0;
+	}
 }
 
 void exit_type_state(struct type_state *state)
@@ -416,6 +434,29 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 	int fbreg = dloc->fbreg;
 	int fboff = 0;
 
+	if (ins__is_call(&dl->ins)) {
+		Dwarf_Die func_die;
+
+		/* __fentry__ will preserve all registers */
+		if (dl->ops.target.sym &&
+		    !strcmp(dl->ops.target.sym->name, "__fentry__"))
+			return;
+
+		/* Otherwise invalidate scratch registers after call */
+		for (unsigned i = 0; i < ARRAY_SIZE(state->regs); i++) {
+			if (state->regs[i].scratch)
+				state->regs[i].ok = false;
+		}
+
+		/* Update register with the return type (if any) */
+		if (die_find_realfunc(cu_die, dl->ops.target.addr, &func_die) &&
+		    die_get_real_type(&func_die, &type_die)) {
+			state->regs[state->ret_reg].type = type_die;
+			state->regs[state->ret_reg].ok = true;
+		}
+		return;
+	}
+
 	/* FIXME: remove x86 specific code and handle more instructions like LEA */
 	if (!strstr(dl->ins.name, "mov"))
 		return;
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 47/52] perf annotate-data: Implement instruction tracking
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (45 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 46/52] perf annotate-data: Handle call instructions Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 48/52] perf annotate: Parse x86 segment register location Namhyung Kim
                   ` (5 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

If it fails to find a variable for the location directly, it might be
because the source code has no variable for it.  For example, accessing
pointer variables in a chain can result in a case like below:

  struct foo *foo = ...;

  int i = foo->bar->baz;

The DWARF debug information is created for each variable so it'd have
one for 'foo'.  But there's no variable for 'foo->bar' and then it
cannot know the type of 'bar' and 'baz'.

The above source code can be compiled to the following x86 instructions:

  mov  0x8(%rax), %rcx
  mov  0x4(%rcx), %rdx   <=== PMU sample
  mov  %rdx, -4(%rbp)

Let's say 'foo' is located in %rax and it holds a pointer to struct
foo.  But the perf sample is captured at the second instruction and
there is no variable or type info for %rcx.

It'd be great if compiler could generate debug info for %rcx, but we
should handle it on our side.  So this patch implements the logic to
iterate instructions and update the type table for each location.
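
To make the idea concrete for the foo->bar->baz example, here is a toy
replay of the first two loads.  Everything in it is hypothetical: the
toy_* names, the one-entry-per-struct "debug info" table and the member
offsets (8 and 4) are only chosen to match the instructions above, and
registers use the x86-64 DWARF numbers (0=rax, 1=rdx, 2=rcx).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature "debug info": one known member per struct. */
struct toy_member {
	const char *owner;	/* struct that contains the member */
	int offset;		/* byte offset inside the owner */
	const char *pointee;	/* struct it points to, NULL if not a pointer */
};

static const struct toy_member toy_members[] = {
	{ "foo", 8, "bar" },	/* struct foo { ...; struct bar *bar; } */
	{ "bar", 4, NULL },	/* struct bar { ...; int baz; } */
};

/* Return the struct that the member at (owner, offset) points to. */
static const char *toy_deref(const char *owner, int offset)
{
	size_t i;

	if (owner == NULL)
		return NULL;

	for (i = 0; i < sizeof(toy_members) / sizeof(toy_members[0]); i++) {
		if (!strcmp(toy_members[i].owner, owner) &&
		    toy_members[i].offset == offset)
			return toy_members[i].pointee;
	}
	return NULL;
}

/*
 * Replay "mov off(%src), %dst".  regs[] records which struct each
 * register points to, or NULL when that is unknown.
 */
static void toy_load(const char *regs[], int src, int off, int dst)
{
	assert(regs != NULL);
	regs[dst] = toy_deref(regs[src], off);
}
```

Starting with only %rax known to point to struct foo, replaying the
first mov makes %rcx point to struct bar, so the sampled access at
0x4(%rcx) can be attributed to struct bar at offset 4.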

As it already collected a list of scopes including the target
instruction, we can use it to construct the type table smartly.

  +----------------  scope[0] subprogram
  |
  | +--------------  scope[1] lexical_block
  | |
  | | +------------  scope[2] inlined_subroutine
  | | |
  | | | +----------  scope[3] inlined_subroutine
  | | | |
  | | | | +--------  scope[4] lexical_block
  | | | | |
  | | | | |     ***  target instruction
  ...

Imagine the target instruction is nested in 5 scopes, each with its own
variables and parameters.  It starts with the innermost scope (4): it
searches the shortest path from the start of scope[4] to the target
address and builds a list of basic blocks.  Then it iterates the basic
blocks with the variables in the scope and updates the table.  If it
finds a type at the target instruction, it returns it.

Otherwise, it moves to the upper scope[3].  Now it searches the shortest
path from the start of scope[3] to the start of scope[4] and prepends
it to the existing basic block list.  Then it iterates the blocks with
the variables for both scopes.  It repeats this until it finds a type at
the target instruction or reaches the top scope[0].

As the basic blocks contain the shortest path, it doesn't need to worry
about branches and can simply update the table.

With this change, the stat now looks like below:

  Annotate data type stats:
  total 294, ok 185 (62.9%), bad 109 (37.1%)
  -----------------------------------------------------------
          30 : no_sym
          32 : no_mem_ops
          27 : no_var
          13 : no_typeinfo
           7 : bad_offset

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 232 ++++++++++++++++++++++++++++++++
 1 file changed, 232 insertions(+)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 54791dfc6244..56dfbddb53d2 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -596,6 +596,231 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 	/* Case 4. memory to memory transfers (not handled for now) */
 }
 
+/* Prepend this_blocks to full_blocks, removing the duplicate disasm line */
+static void prepend_basic_blocks(struct list_head *this_blocks,
+				 struct list_head *full_blocks)
+{
+	struct annotated_basic_block *first_bb, *last_bb;
+
+	last_bb = list_last_entry(this_blocks, typeof(*last_bb), list);
+	first_bb = list_first_entry(full_blocks, typeof(*first_bb), list);
+
+	if (list_empty(full_blocks))
+		goto out;
+
+	if (last_bb->end != first_bb->begin) {
+		pr_debug("prepend basic blocks: mismatched disasm line %lx -> %lx\n",
+			 last_bb->end->al.offset, first_bb->begin->al.offset);
+		goto out;
+	}
+
+	/* Does the basic block have only one disasm_line? */
+	if (last_bb->begin == last_bb->end) {
+		list_del(&last_bb->list);
+		free(last_bb);
+		goto out;
+	}
+
+	last_bb->end = list_prev_entry(last_bb->end, al.node);
+
+out:
+	list_splice(this_blocks, full_blocks);
+}
+
+static void delete_basic_blocks(struct list_head *basic_blocks)
+{
+	struct annotated_basic_block *bb, *tmp;
+
+	list_for_each_entry_safe(bb, tmp, basic_blocks, list) {
+		list_del(&bb->list);
+		free(bb);
+	}
+}
+
+/* Make sure all variables have a valid start address */
+static void fixup_var_address(struct die_var_type *var_types, u64 addr)
+{
+	while (var_types) {
+		/*
+		 * Some variables have no address range, meaning they are
+		 * always available in the whole scope.  Let's adjust the start
+		 * address to the start of the scope.
+		 */
+		if (var_types->addr == 0)
+			var_types->addr = addr;
+
+		var_types = var_types->next;
+	}
+}
+
+static void delete_var_types(struct die_var_type *var_types)
+{
+	while (var_types) {
+		struct die_var_type *next = var_types->next;
+
+		free(var_types);
+		var_types = next;
+	}
+}
+
+/* It's at the target address, check if it has a matching type */
+static bool find_matching_type(struct type_state *state,
+			       struct data_loc_info *dloc, int reg,
+			       Dwarf_Die *type_die)
+{
+	Dwarf_Word size;
+
+	if (state->regs[reg].ok) {
+		int tag = dwarf_tag(&state->regs[reg].type);
+
+		/*
+		 * Normal registers should hold a pointer (or array) to
+		 * dereference a memory location.
+		 */
+		if (tag != DW_TAG_pointer_type && tag != DW_TAG_array_type)
+			return false;
+
+		if (die_get_real_type(&state->regs[reg].type, type_die) == NULL)
+			return false;
+
+		dloc->type_offset = dloc->op->offset;
+
+		/* Get the size of the actual type */
+		if (dwarf_aggregate_size(type_die, &size) < 0 ||
+		    (unsigned)dloc->type_offset >= size)
+			return false;
+
+		return true;
+	}
+
+	if (reg == dloc->fbreg) {
+		struct type_state_stack *stack;
+
+		stack = find_stack_state(state, dloc->type_offset);
+		if (stack == NULL)
+			return false;
+
+		*type_die = stack->type;
+		/* Update the type offset from the start of slot */
+		dloc->type_offset -= stack->offset;
+		return true;
+	}
+
+	if (dloc->fb_cfa) {
+		struct type_state_stack *stack;
+		u64 pc = map__rip_2objdump(dloc->ms->map, dloc->ip);
+		int fbreg, fboff;
+
+		if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fboff) < 0)
+			fbreg = -1;
+
+		if (reg != fbreg)
+			return false;
+
+		stack = find_stack_state(state, dloc->type_offset - fboff);
+		if (stack == NULL)
+			return false;
+
+		*type_die = stack->type;
+		/* Update the type offset from the start of slot */
+		dloc->type_offset -= fboff + stack->offset;
+		return true;
+	}
+
+	return false;
+}
+
+/* Iterate instructions in basic blocks and update type table */
+static bool find_data_type_insn(struct data_loc_info *dloc, int reg,
+				struct list_head *basic_blocks,
+				struct die_var_type *var_types,
+				Dwarf_Die *cu_die, Dwarf_Die *type_die)
+{
+	struct type_state state;
+	struct symbol *sym = dloc->ms->sym;
+	struct annotation *notes = symbol__annotation(sym);
+	struct annotated_basic_block *bb;
+	bool found = false;
+
+	init_type_state(&state, dloc->arch);
+
+	list_for_each_entry(bb, basic_blocks, list) {
+		struct disasm_line *dl = bb->begin;
+
+		list_for_each_entry_from(dl, &notes->src->source, al.node) {
+			u64 this_ip = sym->start + dl->al.offset;
+			u64 addr = map__rip_2objdump(dloc->ms->map, this_ip);
+
+			/* Update variable type at this address */
+			update_var_state(&state, dloc, addr, var_types);
+
+			if (this_ip == dloc->ip) {
+				found = find_matching_type(&state, dloc, reg,
+							   type_die);
+				goto out;
+			}
+
+			/* Update type table after processing the instruction */
+			update_insn_state(&state, dloc, cu_die, dl);
+			if (dl == bb->end)
+				break;
+		}
+	}
+
+out:
+	exit_type_state(&state);
+	return found;
+}
+
+/*
+ * Construct a list of basic blocks for each scope with variables and try to find
+ * the data type by updating a type state table through instructions.
+ */
+static int find_data_type_block(struct data_loc_info *dloc, int reg,
+				Dwarf_Die *cu_die, Dwarf_Die *scopes,
+				int nr_scopes, Dwarf_Die *type_die)
+{
+	LIST_HEAD(basic_blocks);
+	struct die_var_type *var_types = NULL;
+	u64 src_ip, dst_ip;
+	int ret = -1;
+
+	dst_ip = dloc->ip;
+	for (int i = nr_scopes - 1; i >= 0; i--) {
+		Dwarf_Addr base, start, end;
+		LIST_HEAD(this_blocks);
+
+		if (dwarf_ranges(&scopes[i], 0, &base, &start, &end) < 0)
+			break;
+
+		src_ip = map__objdump_2rip(dloc->ms->map, start);
+
+		/* Get basic blocks for this scope */
+		if (annotate_get_basic_blocks(dloc->ms->sym, src_ip, dst_ip,
+					      &this_blocks) < 0)
+			continue;
+		prepend_basic_blocks(&this_blocks, &basic_blocks);
+
+		/* Get variable info for this scope and add to var_types list */
+		die_collect_vars(&scopes[i], &var_types);
+		fixup_var_address(var_types, start);
+
+		/* Find from start of this scope to the target instruction */
+		if (find_data_type_insn(dloc, reg, &basic_blocks, var_types,
+					cu_die, type_die)) {
+			ret = 0;
+			break;
+		}
+
+		/* Go up to the next scope and find blocks to the start */
+		dst_ip = src_ip;
+	}
+
+	delete_basic_blocks(&basic_blocks);
+	delete_var_types(var_types);
+	return ret;
+}
+
 /* The result will be saved in @type_die */
 static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
 {
@@ -696,6 +921,13 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
 		goto out;
 	}
 
+	if (reg != DWARF_REG_PC) {
+		ret = find_data_type_block(dloc, reg, &cu_die, scopes,
+					   nr_scopes, type_die);
+		if (ret == 0)
+			goto out;
+	}
+
 	if (loc->multi_regs && reg == loc->reg1 && loc->reg1 != loc->reg2) {
 		reg = loc->reg2;
 		goto retry;
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* [PATCH 48/52] perf annotate: Parse x86 segment register location
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (46 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 47/52] perf annotate-data: Implement instruction tracking Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 49/52] perf annotate-data: Handle this-cpu variables in kernel Namhyung Kim
                   ` (4 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Add a segment field to struct annotated_op_loc and save it for
segment-based addressing like %gs:0x28.  For simplicity, it handles only
the %gs register for now.
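
As a rough illustration of the parsing step (a standalone sketch, not the
perf code itself), the snippet below recognizes a "%gs:" prefix and reads
the constant that follows it.  The names struct op_loc, enum seg and
parse_gs_operand() are hypothetical stand-ins chosen for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the enum and struct fields named in this patch. */
enum seg { SEG_NONE = 0, SEG_X86_GS };

struct op_loc {
	int offset;
	unsigned char segment;
};

/*
 * Parse an operand such as "%gs:0x28".  Returns 1 and fills @loc if the
 * operand starts with the %gs prefix, 0 otherwise.
 */
static int parse_gs_operand(const char *str, struct op_loc *loc)
{
	loc->segment = SEG_NONE;
	loc->offset = 0;

	if (strncmp(str, "%gs:", 4) != 0)
		return 0;

	loc->segment = SEG_X86_GS;
	/* Base 0 lets strtol() accept both "0x28" and "40". */
	loc->offset = (int)strtol(str + 4, NULL, 0);
	return 1;
}
```

Because strtol() stops at the first character that is not part of the
number, an operand like "%gs:0x18(%rbx)" would still yield offset 0x18 in
this sketch.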

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 21 +++++++++++++++++++--
 tools/perf/util/annotate.h | 13 +++++++++++++
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 33fd032bf463..a9075af10d24 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3561,6 +3561,12 @@ static int extract_reg_offset(struct arch *arch, const char *str,
 	 * %gs:0x18(%rbx).  In that case it should skip the part.
 	 */
 	if (*str == arch->objdump.register_char) {
+		if (arch__is(arch, "x86")) {
+			/* FIXME: Handle other segment registers */
+			if (!strncmp(str, "%gs:", 4))
+				op_loc->segment = INSN_SEG_X86_GS;
+		}
+
 		while (*str && !isdigit(*str) &&
 		       *str != arch->objdump.memory_ref_char)
 			str++;
@@ -3657,8 +3663,19 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
 			op_loc->multi_regs = multi_regs;
 			extract_reg_offset(arch, insn_str, op_loc);
 		} else {
-			char *s = strdup(insn_str);
+			char *s;
+
+			if (arch__is(arch, "x86")) {
+				/* FIXME: Handle other segment registers */
+				if (!strncmp(insn_str, "%gs:", 4)) {
+					op_loc->segment = INSN_SEG_X86_GS;
+					op_loc->offset = strtol(insn_str + 4,
+								NULL, 0);
+					continue;
+				}
+			}
 
+			s = strdup(insn_str);
 			if (s) {
 				op_loc->reg1 = get_dwarf_regnum(s, 0);
 				free(s);
@@ -3874,7 +3891,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 			.op = op_loc,
 		};
 
-		if (!op_loc->mem_ref)
+		if (!op_loc->mem_ref && op_loc->segment == INSN_SEG_NONE)
 			continue;
 
 		/* Recalculate IP because of LOCK prefix or insn fusion */
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 13c9b6a30b15..21a0947ed5e9 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -452,6 +452,7 @@ int annotate_check_args(struct annotation_options *args);
  * @reg1: First register in the operand
  * @reg2: Second register in the operand
  * @offset: Memory access offset in the operand
+ * @segment: Segment selector register
  * @mem_ref: Whether the operand accesses memory
  * @multi_regs: Whether the second register is used
  */
@@ -459,6 +460,7 @@ struct annotated_op_loc {
 	int reg1;
 	int reg2;
 	int offset;
+	u8 segment;
 	bool mem_ref;
 	bool multi_regs;
 };
@@ -470,6 +472,17 @@ enum annotated_insn_ops {
 	INSN_OP_MAX,
 };
 
+enum annotated_x86_segment {
+	INSN_SEG_NONE = 0,
+
+	INSN_SEG_X86_CS,
+	INSN_SEG_X86_DS,
+	INSN_SEG_X86_ES,
+	INSN_SEG_X86_FS,
+	INSN_SEG_X86_GS,
+	INSN_SEG_X86_SS,
+};
+
 /**
  * struct annotated_insn_loc - Location info of instruction
  * @ops: Array of location info for source and target operands
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 49/52] perf annotate-data: Handle this-cpu variables in kernel
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (47 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 48/52] perf annotate: Parse x86 segment register location Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 50/52] perf annotate-data: Track instructions with a this-cpu variable Namhyung Kim
                   ` (3 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

On x86, the kernel gets the current task using the current macro like
below:

  #define current  get_current()

  static __always_inline struct task_struct *get_current(void)
  {
      return this_cpu_read_stable(pcpu_hot.current_task);
  }

So it returns the current_task field of struct pcpu_hot which is the
first member.  On my build, it's located at 0x32940.

  $ nm vmlinux | grep pcpu_hot
  0000000000032940 D pcpu_hot

And the current macro generates an instruction like the one below:

  mov  %gs:0x32940, %rcx

So the %gs segment register points to the beginning of the per-cpu
region of this CPU, and the variable is addressed with a constant offset
from it.

Let's update the instruction location info to have a segment register
and handle %gs in the kernel to look up a global variable.  The new
get_percpu_var_info() helper gets the name of the variable and the
offset within it.  The access is then treated as a global variable
access by changing the register number to DWARF_REG_PC.
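
The lookup can be pictured with a small standalone sketch (not the perf
implementation): given the constant taken from the instruction, find the
symbol that covers it and compute the offset from the start of that
symbol.  struct sym, lookup_percpu() and example_syms are hypothetical;
only the pcpu_hot address comes from the nm output above, while the sizes
and the second entry are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol record; perf's struct symbol is richer than this. */
struct sym {
	const char *name;
	uint64_t start;
	uint64_t size;
};

/*
 * Find the symbol covering @addr and report the offset into it.
 * Returns NULL when no symbol covers the address.
 */
static const struct sym *lookup_percpu(const struct sym *tbl, size_t n,
				       uint64_t addr, int *poffset)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= tbl[i].start && addr - tbl[i].start < tbl[i].size) {
			*poffset = (int)(addr - tbl[i].start);
			return &tbl[i];
		}
	}
	return NULL;
}

/* Sizes and the second entry are assumptions, not values from a real build. */
static const struct sym example_syms[] = {
	{ "pcpu_hot",    0x32940, 64 },
	{ "example_var", 0x32980, 8 },
};
```

With this table, the constant 0x32940 resolves to pcpu_hot at offset 0,
which corresponds to the current_task member described above.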

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate.c | 31 +++++++++++++++++++++++++++++++
 tools/perf/util/annotate.h |  4 ++++
 2 files changed, 35 insertions(+)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index a9075af10d24..9b72eae2400c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3814,6 +3814,27 @@ void get_global_var_info(struct thread *thread, struct map_symbol *ms, u64 ip,
 	addr_location__exit(&al);
 }
 
+void get_percpu_var_info(struct thread *thread, struct map_symbol *ms,
+			 u8 cpumode, u64 var_addr, const char **var_name,
+			 int *poffset)
+{
+	struct addr_location al;
+	struct symbol *var;
+	u64 map_addr;
+
+	/* Kernel symbols might be relocated */
+	map_addr = var_addr + map__reloc(ms->map);
+
+	addr_location__init(&al);
+	var = thread__find_symbol_fb(thread, cpumode, map_addr, &al);
+	if (var) {
+		*var_name = var->name;
+		/* Calculate type offset from the start of variable */
+		*poffset = map_addr - map__unmap_ip(al.map, var->start);
+	}
+	addr_location__exit(&al);
+}
+
 /**
  * hist_entry__get_data_type - find data type for given hist entry
  * @he: hist entry
@@ -3905,6 +3926,16 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 					    &dloc.type_offset);
 		}
 
+		/* This CPU access in kernel - pretend PC-relative addressing */
+		if (op_loc->reg1 < 0 && ms->map->dso->kernel &&
+		    arch__is(arch, "x86") && op_loc->segment == INSN_SEG_X86_GS) {
+			dloc.var_addr = op_loc->offset;
+			get_percpu_var_info(he->thread, ms, he->cpumode,
+					    dloc.var_addr, &dloc.var_name,
+					    &dloc.type_offset);
+			op_loc->reg1 = DWARF_REG_PC;
+		}
+
 		mem_type = find_data_type(&dloc);
 		if (mem_type)
 			istat->good++;
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 21a0947ed5e9..c3cc0cba10b7 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -519,6 +519,10 @@ void get_global_var_info(struct thread *thread, struct map_symbol *ms, u64 ip,
 			 struct disasm_line *dl, u8 cpumode, u64 *var_addr,
 			 const char **var_name, int *poffset);
 
+void get_percpu_var_info(struct thread *thread, struct map_symbol *ms,
+			 u8 cpumode, u64 var_addr, const char **var_name,
+			 int *poffset);
+
 /**
  * struct annotated_basic_block - Basic block of instructions
  * @list: List node
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 50/52] perf annotate-data: Track instructions with a this-cpu variable
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (48 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 49/52] perf annotate-data: Handle this-cpu variables in kernel Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 51/52] perf annotate-data: Add stack canary type Namhyung Kim
                   ` (2 subsequent siblings)
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Like global variables, per-cpu variables of this CPU should be tracked
correctly.  Factor out get_global_var_type() to handle both global
and per-cpu (for this cpu) variables in the same manner.
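
The shape of the shared helper is "try the address first, then fall back
to the name".  The standalone sketch below shows only that control flow
under simplifying assumptions: struct var_info and find_type() are
hypothetical, and a flat array stands in for the DWARF queries that the
real code performs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened view of a variable in debug info. */
struct var_info {
	const char *name;
	uint64_t addr;		/* 0 if the debug info has no usable address */
	const char *type_name;
};

/* Return the type name for a variable, or NULL if neither lookup matches. */
static const char *find_type(const struct var_info *vars, size_t n,
			     uint64_t var_addr, const char *var_name)
{
	/* First pass: match by address. */
	for (size_t i = 0; i < n; i++) {
		if (vars[i].addr != 0 && vars[i].addr == var_addr)
			return vars[i].type_name;
	}

	if (var_name == NULL)
		return NULL;

	/* Second pass: fall back to the symbol name. */
	for (size_t i = 0; i < n; i++) {
		if (strcmp(vars[i].name, var_name) == 0)
			return vars[i].type_name;
	}
	return NULL;
}
```

Both the PC-relative case and the %gs case can feed such a helper once
each has produced an address and, optionally, a name.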

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 84 +++++++++++++++++++++++----------
 1 file changed, 60 insertions(+), 24 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 56dfbddb53d2..416c0b5649fc 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -409,6 +409,37 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc,
 	}
 }
 
+static bool get_global_var_type(Dwarf_Die *cu_die, struct map_symbol *ms, u64 ip,
+				u64 var_addr, const char *var_name, int var_offset,
+				Dwarf_Die *type_die)
+{
+	u64 pc;
+	int offset = var_offset;
+	bool is_pointer = false;
+	Dwarf_Die var_die;
+
+	pc = map__rip_2objdump(ms->map, ip);
+
+	/* Try to get the variable by address first */
+	if (die_find_variable_by_addr(cu_die, pc, var_addr, &var_die, &offset) &&
+	    check_variable(&var_die, type_die, offset, is_pointer) == 0 &&
+	    die_get_member_type(type_die, offset, type_die))
+		return true;
+
+	if (var_name == NULL)
+		return false;
+
+	offset = var_offset;
+
+	/* Then try to find the global variable by name */
+	if (die_find_variable_at(cu_die, var_name, pc, &var_die) &&
+	    check_variable(&var_die, type_die, offset, is_pointer) == 0 &&
+	    die_get_member_type(type_die, offset, type_die))
+		return true;
+
+	return false;
+}
+
 /**
  * update_insn_state - Update type state for an instruction
  * @state: type state table
@@ -472,14 +503,36 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 			fbreg = -1;
 	}
 
-	/* Case 1. register to register transfers */
+	/* Case 1. register to register or segment:offset to register transfers */
 	if (!src->mem_ref && !dst->mem_ref) {
 		if (!has_reg_type(state, dst->reg1))
 			return;
 
 		if (has_reg_type(state, src->reg1))
 			state->regs[dst->reg1] = state->regs[src->reg1];
-		else
+		else if (dloc->ms->map->dso->kernel &&
+			 src->segment == INSN_SEG_X86_GS) {
+			struct map_symbol *ms = dloc->ms;
+			int offset = src->offset;
+			u64 ip = ms->sym->start + dl->al.offset;
+			const char *var_name = NULL;
+			u64 var_addr;
+
+			/*
+			 * In kernel, %gs points to a per-cpu region for the
+			 * current CPU.  Access with a constant offset should
+			 * be treated as a global variable access.
+			 */
+			var_addr = src->offset;
+			get_percpu_var_info(dloc->thread, ms, dloc->cpumode,
+					    var_addr, &var_name, &offset);
+
+			if (get_global_var_type(cu_die, ms, ip, var_addr,
+						var_name, offset, &type_die)) {
+				state->regs[dst->reg1].type = type_die;
+				state->regs[dst->reg1].ok = true;
+			}
+		} else
 			state->regs[dst->reg1].ok = false;
 	}
 	/* Case 2. memory to register transers */
@@ -492,37 +545,20 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 retry:
 		/* Check if it's a global variable */
 		if (sreg == DWARF_REG_PC) {
-			Dwarf_Die var_die;
 			struct map_symbol *ms = dloc->ms;
 			int offset = src->offset;
 			u64 ip = ms->sym->start + dl->al.offset;
-			u64 pc, addr;
 			const char *var_name = NULL;
+			u64 var_addr;
 
-			addr = annotate_calc_pcrel(ms, ip, offset, dl);
-			pc = map__rip_2objdump(ms->map, ip);
-
-			if (die_find_variable_by_addr(cu_die, pc, addr,
-						      &var_die, &offset) &&
-			    check_variable(&var_die, &type_die, offset,
-					   /*is_pointer=*/false) == 0 &&
-			    die_get_member_type(&type_die, offset, &type_die)) {
-				state->regs[dst->reg1].type = type_die;
-				state->regs[dst->reg1].ok = true;
-				return;
-			}
+			var_addr = annotate_calc_pcrel(ms, ip, offset, dl);
 
-			/* Try to get the name of global variable */
-			offset = src->offset;
 			get_global_var_info(dloc->thread, ms, ip, dl,
-					    dloc->cpumode, &addr,
+					    dloc->cpumode, &var_addr,
 					    &var_name, &offset);
 
-			if (var_name && die_find_variable_at(cu_die, var_name,
-							     pc, &var_die) &&
-			    check_variable(&var_die, &type_die, offset,
-					   /*is_pointer=*/false) == 0 &&
-			    die_get_member_type(&type_die, offset, &type_die)) {
+			if (get_global_var_type(cu_die, ms, ip, var_addr,
+						var_name, offset, &type_die)) {
 				state->regs[dst->reg1].type = type_die;
 				state->regs[dst->reg1].ok = true;
 			} else
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 51/52] perf annotate-data: Add stack canary type
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (49 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 50/52] perf annotate-data: Track instructions with a this-cpu variable Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10  0:00 ` [PATCH 52/52] perf annotate-data: Add debug message Namhyung Kim
  2023-11-10 12:05 ` [RFC 00/52] perf tools: Introduce data type profiling (v2) Arnaldo Carvalho de Melo
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

When the stack protector is enabled, the compiler generates code to
check for stack overflow at runtime using a special value called a
'stack canary'.  On x86_64, GCC hard-codes the stack canary location as
%gs:40.

While there's a definition of fixed_percpu_data in asm/processor.h,
it seems that the header is not included everywhere, so in many places
the type info cannot be found.  As it's at a well-known location
(%gs:40), let's add a pseudo stack canary type to handle it specially.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.h |  1 +
 tools/perf/util/annotate.c      | 24 ++++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 0bfef29fa52c..e293980eb11b 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -77,6 +77,7 @@ struct annotated_data_type {
 
 extern struct annotated_data_type unknown_type;
 extern struct annotated_data_type stackop_type;
+extern struct annotated_data_type canary_type;
 
 /**
  * struct data_loc_info - Data location information
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 9b72eae2400c..e183c53531fe 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -113,6 +113,13 @@ struct annotated_data_type stackop_type = {
 	},
 };
 
+struct annotated_data_type canary_type = {
+	.self = {
+		.type_name = (char *)"(stack canary)",
+		.children = LIST_HEAD_INIT(canary_type.self.children),
+	},
+};
+
 static int arch__grow_instructions(struct arch *arch)
 {
 	struct ins *new_instructions;
@@ -3768,6 +3775,17 @@ static bool is_stack_operation(struct arch *arch, struct disasm_line *dl)
 	return false;
 }
 
+static bool is_stack_canary(struct arch *arch, struct annotated_op_loc *loc)
+{
+	/* On x86_64, %gs:40 is used for stack canary */
+	if (arch__is(arch, "x86")) {
+		if (loc->segment == INSN_SEG_X86_GS && loc->offset == 40)
+			return true;
+	}
+
+	return false;
+}
+
 u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset,
 			struct disasm_line *dl)
 {
@@ -3937,6 +3955,12 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
 		}
 
 		mem_type = find_data_type(&dloc);
+
+		if (mem_type == NULL && is_stack_canary(arch, op_loc)) {
+			mem_type = &canary_type;
+			dloc.type_offset = 0;
+		}
+
 		if (mem_type)
 			istat->good++;
 		else
-- 
2.42.0.869.gea05f2083d-goog



* [PATCH 52/52] perf annotate-data: Add debug message
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (50 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 51/52] perf annotate-data: Add stack canary type Namhyung Kim
@ 2023-11-10  0:00 ` Namhyung Kim
  2023-11-10 12:05 ` [RFC 00/52] perf tools: Introduce data type profiling (v2) Arnaldo Carvalho de Melo
  52 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-10  0:00 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra
  Cc: Ian Rogers, Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

This is just for debugging and not for merge.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/annotate-data.c | 122 +++++++++++++++++++++++++++++---
 tools/perf/util/annotate-data.h |   2 +-
 2 files changed, 114 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 416c0b5649fc..8e318349f430 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -90,6 +90,21 @@ void exit_type_state(struct type_state *state)
 	}
 }
 
+static void debug_print_type_name(Dwarf_Die *die)
+{
+	struct strbuf sb;
+	char *str;
+
+	if (!verbose)
+		return;
+
+	strbuf_init(&sb, 32);
+	die_get_typename_from_type(die, &sb);
+	str = strbuf_detach(&sb, NULL);
+	pr_debug("%s (die:%lx)\n", str, dwarf_dieoffset(die));
+	free(str);
+}
+
 /*
  * Compare type name and size to maintain them in a tree.
  * I'm not sure if DWARF would have information of a single type in many
@@ -376,7 +391,7 @@ static struct type_state_stack *findnew_stack_state(struct type_state *state,
  * is used only at the given location and updates an entry in the table.
  */
 void update_var_state(struct type_state *state, struct data_loc_info *dloc,
-		      u64 addr, struct die_var_type *var_types)
+		      u64 addr, u64 off, struct die_var_type *var_types)
 {
 	Dwarf_Die mem_die;
 	struct die_var_type *var;
@@ -397,14 +412,20 @@ void update_var_state(struct type_state *state, struct data_loc_info *dloc,
 
 		if (var->reg == DWARF_REG_FB) {
 			findnew_stack_state(state, var->offset, &mem_die);
+			pr_debug("var [%lx] stack fbreg (%x, %d) type=", off, var->offset, var->offset);
+			debug_print_type_name(&mem_die);
 		} else if (var->reg == fbreg) {
 			findnew_stack_state(state, var->offset - fb_offset, &mem_die);
+			pr_debug("var [%lx] stack cfa (%x, %d) fb-offset=%d type=", off, var->offset - fb_offset, var->offset - fb_offset, fb_offset);
+			debug_print_type_name(&mem_die);
 		} else if (has_reg_type(state, var->reg) && var->offset == 0) {
 			struct type_state_reg *reg;
 
 			reg = &state->regs[var->reg];
 			reg->type = mem_die;
 			reg->ok = true;
+			pr_debug("var [%lx] reg%d type=", off, var->reg);
+			debug_print_type_name(&mem_die);
 		}
 	}
 }
@@ -484,6 +505,8 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 		    die_get_real_type(&func_die, &type_die)) {
 			state->regs[state->ret_reg].type = type_die;
 			state->regs[state->ret_reg].ok = true;
+			pr_debug("fun [%lx] reg0 return from %s type=", dl->al.offset, dwarf_diename(&func_die));
+			debug_print_type_name(&type_die);
 		}
 		return;
 	}
@@ -492,8 +515,10 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 	if (!strstr(dl->ins.name, "mov"))
 		return;
 
-	if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
+	if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0) {
+		pr_debug("failed to get mov insn loc\n");
 		return;
+	}
 
 	if (dloc->fb_cfa) {
 		u64 ip = dloc->ms->sym->start + dl->al.offset;
@@ -508,10 +533,14 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 		if (!has_reg_type(state, dst->reg1))
 			return;
 
-		if (has_reg_type(state, src->reg1))
+		if (has_reg_type(state, src->reg1)) {
 			state->regs[dst->reg1] = state->regs[src->reg1];
-		else if (dloc->ms->map->dso->kernel &&
-			 src->segment == INSN_SEG_X86_GS) {
+			if (state->regs[dst->reg1].ok) {
+				pr_debug("mov [%lx] reg%d -> reg%d type=", dl->al.offset, src->reg1, dst->reg1);
+				debug_print_type_name(&state->regs[dst->reg1].type);
+			}
+		} else if (dloc->ms->map->dso->kernel &&
+			   src->segment == INSN_SEG_X86_GS) {
 			struct map_symbol *ms = dloc->ms;
 			int offset = src->offset;
 			u64 ip = ms->sym->start + dl->al.offset;
@@ -531,6 +560,8 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 						var_name, offset, &type_die)) {
 				state->regs[dst->reg1].type = type_die;
 				state->regs[dst->reg1].ok = true;
+				pr_debug("mov [%lx] percpu -> reg%d type=", dl->al.offset, dst->reg1);
+				debug_print_type_name(&state->regs[dst->reg1].type);
 			}
 		} else
 			state->regs[dst->reg1].ok = false;
@@ -561,8 +592,13 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 						var_name, offset, &type_die)) {
 				state->regs[dst->reg1].type = type_die;
 				state->regs[dst->reg1].ok = true;
-			} else
+				pr_debug("mov [%lx] PC-rel -> reg%d type=", dl->al.offset, dst->reg1);
+				debug_print_type_name(&type_die);
+			} else {
+				if (var_name)
+					pr_debug("??? [%lx] PC-rel (%lx: %s%+d)\n", dl->al.offset, var_addr, var_name, offset);
 				state->regs[dst->reg1].ok = false;
+			}
 		}
 		/* And check stack variables with offset */
 		else if (sreg == fbreg) {
@@ -575,6 +611,8 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 							 &type_die)) {
 				state->regs[dst->reg1].type = type_die;
 				state->regs[dst->reg1].ok = true;
+				pr_debug("mov [%lx] stack (-%#x, %d) -> reg%d type=", dl->al.offset, -offset, offset, dst->reg1);
+				debug_print_type_name(&type_die);
 			} else
 				state->regs[dst->reg1].ok = false;
 		}
@@ -584,6 +622,8 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 					    src->offset, &type_die)) {
 			state->regs[dst->reg1].type = type_die;
 			state->regs[dst->reg1].ok = true;
+			pr_debug("mov [%lx] %#x(reg%d) -> reg%d type=", dl->al.offset, src->offset, sreg, dst->reg1);
+			debug_print_type_name(&type_die);
 		}
 		/* Or try another register if any */
 		else if (src->multi_regs && sreg == src->reg1 &&
@@ -623,6 +663,8 @@ void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
 				findnew_stack_state(state, offset,
 						    &state->regs[src->reg1].type);
 			}
+			pr_debug("mov [%lx] reg%d -> stack (-%#x, %d) type=", dl->al.offset, src->reg1, -offset, offset);
+			debug_print_type_name(&state->regs[src->reg1].type);
 		}
 		/*
 		 * Ignore other transfers since it'd set a value in a struct
@@ -726,6 +768,9 @@ static bool find_matching_type(struct type_state *state,
 		    (unsigned)dloc->type_offset >= size)
 			return false;
 
+		pr_debug("%s: [%lx] reg=%d offset=%d type=",
+			 __func__, dloc->ip - dloc->ms->sym->start, reg, dloc->type_offset);
+		debug_print_type_name(type_die);
 		return true;
 	}
 
@@ -739,6 +784,10 @@ static bool find_matching_type(struct type_state *state,
 		*type_die = stack->type;
 		/* Update the type offset from the start of slot */
 		dloc->type_offset -= stack->offset;
+
+		pr_debug("%s: [%lx] stack offset=%d type=",
+			 __func__, dloc->ip - dloc->ms->sym->start, dloc->type_offset);
+		debug_print_type_name(type_die);
 		return true;
 	}
 
@@ -760,6 +809,11 @@ static bool find_matching_type(struct type_state *state,
 		*type_die = stack->type;
 		/* Update the type offset from the start of slot */
 		dloc->type_offset -= fboff + stack->offset;
+
+		pr_debug("%s: [%lx] cfa stack offset=%d type_offset=%d type=",
+			 __func__, dloc->ip - dloc->ms->sym->start,
+			 dloc->type_offset + stack->offset, dloc->type_offset);
+		debug_print_type_name(type_die);
 		return true;
 	}
 
@@ -783,12 +837,13 @@ static bool find_data_type_insn(struct data_loc_info *dloc, int reg,
 	list_for_each_entry(bb, basic_blocks, list) {
 		struct disasm_line *dl = bb->begin;
 
+		pr_debug("bb: [%lx - %lx]\n", bb->begin->al.offset, bb->end->al.offset);
 		list_for_each_entry_from(dl, &notes->src->source, al.node) {
 			u64 this_ip = sym->start + dl->al.offset;
 			u64 addr = map__rip_2objdump(dloc->ms->map, this_ip);
 
 			/* Update variable type at this address */
-			update_var_state(&state, dloc, addr, var_types);
+			update_var_state(&state, dloc, addr, dl->al.offset, var_types);
 
 			if (this_ip == dloc->ip) {
 				found = find_matching_type(&state, dloc, reg,
@@ -821,6 +876,16 @@ static int find_data_type_block(struct data_loc_info *dloc, int reg,
 	u64 src_ip, dst_ip;
 	int ret = -1;
 
+	if (dloc->fb_cfa) {
+		u64 pc = map__rip_2objdump(dloc->ms->map, dloc->ip);
+		int fbreg, fboff = 0;
+
+		if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fboff) < 0)
+			fbreg = -1;
+
+		pr_debug("CFA reg=%d offset=%d\n", fbreg, fboff);
+	}
+
 	dst_ip = dloc->ip;
 	for (int i = nr_scopes - 1; i >= 0; i--) {
 		Dwarf_Addr base, start, end;
@@ -829,12 +894,16 @@ static int find_data_type_block(struct data_loc_info *dloc, int reg,
 		if (dwarf_ranges(&scopes[i], 0, &base, &start, &end) < 0)
 			break;
 
+		pr_debug("scope: [%d/%d] (die:%lx)\n", i + 1, nr_scopes, dwarf_dieoffset(&scopes[i]));
 		src_ip = map__objdump_2rip(dloc->ms->map, start);
 
 		/* Get basic blocks for this scope */
 		if (annotate_get_basic_blocks(dloc->ms->sym, src_ip, dst_ip,
-					      &this_blocks) < 0)
+					      &this_blocks) < 0) {
+			pr_debug("cannot find a basic block from %lx to %lx\n",
+				 src_ip - dloc->ms->sym->start, dst_ip - dloc->ms->sym->start);
 			continue;
+		}
 		prepend_basic_blocks(&this_blocks, &basic_blocks);
 
 		/* Get variable info for this scope and add to var_types list */
@@ -870,6 +939,18 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
 	int fb_offset = 0;
 	bool is_fbreg = false;
 	u64 pc;
+	char buf[64];
+
+	if (dloc->op->multi_regs)
+		snprintf(buf, sizeof(buf), " or reg%d", dloc->op->reg2);
+	else if (dloc->op->reg1 == DWARF_REG_PC)
+		snprintf(buf, sizeof(buf), " (PC)");
+	else
+		buf[0] = '\0';
+
+	pr_debug("-----------------------------------------------------------\n");
+	pr_debug("%s [%lx] for reg%d%s in %s\n", __func__, dloc->ip - dloc->ms->sym->start,
+		 dloc->op->reg1, buf, dloc->ms->sym->name);
 
 	/*
 	 * IP is a relative instruction address from the start of the map, as
@@ -888,11 +969,15 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
 	reg = loc->reg1;
 	offset = loc->offset;
 
+	pr_debug("CU die offset: %lx\n", dwarf_dieoffset(&cu_die));
+
 	if (reg == DWARF_REG_PC) {
 		if (die_find_variable_by_addr(&cu_die, pc, dloc->var_addr,
 					      &var_die, &offset)) {
 			ret = check_variable(&var_die, type_die, offset,
 					     /*is_pointer=*/false);
+			if (ret == 0)
+				pr_debug("found PC-rel by addr=%lx offset=%d\n", dloc->var_addr, offset);
 			dloc->type_offset = offset;
 			goto out;
 		}
@@ -901,6 +986,8 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
 		    die_find_variable_at(&cu_die, dloc->var_name, pc, &var_die)) {
 			ret = check_variable(&var_die, type_die, dloc->type_offset,
 					     /*is_pointer=*/false);
+			if (ret == 0)
+				pr_debug("found \"%s\" by name offset=%d\n", dloc->var_name, dloc->type_offset);
 			/* dloc->type_offset was updated by the caller */
 			goto out;
 		}
@@ -953,6 +1040,21 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
 		/* Found a variable, see if it's correct */
 		ret = check_variable(&var_die, type_die, offset,
 				     reg != DWARF_REG_PC && !is_fbreg);
+		if (ret == 0) {
+#if 0
+			const char *filename;
+			int lineno;
+
+			if (cu_find_lineinfo(&cu_die, pc, &filename, &lineno) < 0) {
+				filename = "unknown";
+				lineno = 0;
+			}
+#endif
+			pr_debug("found \"%s\" in scope=%d/%d reg=%d offset=%#x (%d) loc->offset=%d fb-offset=%d (die:%lx scope:%lx) type=",
+				 dwarf_diename(&var_die), i+1, nr_scopes, reg, offset, offset, loc->offset, fb_offset, dwarf_dieoffset(&var_die),
+				 dwarf_dieoffset(&scopes[i])/*, filename, lineno*/);
+			debug_print_type_name(type_die);
+		}
 		dloc->type_offset = offset;
 		goto out;
 	}
@@ -969,8 +1071,10 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
 		goto retry;
 	}
 
-	if (ret < 0)
+	if (ret < 0) {
+		pr_debug("no variable found\n");
 		ann_data_stat.no_var++;
+	}
 
 out:
 	free(scopes);
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index e293980eb11b..44e0f3770432 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -166,7 +166,7 @@ void exit_type_state(struct type_state *state);
 
 /* Update type state table using variables */
 void update_var_state(struct type_state *state, struct data_loc_info *dloc,
-		      u64 addr, struct die_var_type *var_types);
+		      u64 addr, u64 off, struct die_var_type *var_types);
 
 /* Update type state table for an instruction */
 void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
-- 
2.42.0.869.gea05f2083d-goog


^ permalink raw reply related	[flat|nested] 69+ messages in thread

* Re: [PATCH 09/52] perf probe: Convert to check dwarf_getcfi feature
  2023-11-09 23:59 ` [PATCH 09/52] perf probe: Convert to check dwarf_getcfi feature Namhyung Kim
@ 2023-11-10 10:25   ` Masami Hiramatsu
  0 siblings, 0 replies; 69+ messages in thread
From: Masami Hiramatsu @ 2023-11-10 10:25 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ian Rogers,
	Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

On Thu,  9 Nov 2023 15:59:28 -0800
Namhyung Kim <namhyung@kernel.org> wrote:

> Now that there is a feature check for dwarf_getcfi(), use it and convert
> the code to check the HAVE_DWARF_CFI_SUPPORT definition.
> 

Looks good to me!

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thanks!

> Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/Makefile.config     | 5 +++++
>  tools/perf/util/probe-finder.c | 8 ++++----
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
> index 8b6cffbc4858..aa55850fbc21 100644
> --- a/tools/perf/Makefile.config
> +++ b/tools/perf/Makefile.config
> @@ -476,6 +476,11 @@ else
>        else
>          CFLAGS += -DHAVE_DWARF_GETLOCATIONS_SUPPORT
>        endif # dwarf_getlocations
> +      ifneq ($(feature-dwarf_getcfi), 1)
> +        msg := $(warning Old libdw.h, finding variables at given 'perf probe' point will not work, install elfutils-devel/libdw-dev >= 0.142);
> +      else
> +        CFLAGS += -DHAVE_DWARF_CFI_SUPPORT
> +      endif # dwarf_getcfi
>      endif # Dwarf support
>    endif # libelf support
>  endif # NO_LIBELF
> diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> index 8d3dd85f9ff4..c8923375e30d 100644
> --- a/tools/perf/util/probe-finder.c
> +++ b/tools/perf/util/probe-finder.c
> @@ -604,7 +604,7 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct probe_finder *pf)
>  	ret = dwarf_getlocation_addr(&fb_attr, pf->addr, &pf->fb_ops, &nops, 1);
>  	if (ret <= 0 || nops == 0) {
>  		pf->fb_ops = NULL;
> -#if _ELFUTILS_PREREQ(0, 142)
> +#ifdef HAVE_DWARF_CFI_SUPPORT
>  	} else if (nops == 1 && pf->fb_ops[0].atom == DW_OP_call_frame_cfa &&
>  		   (pf->cfi_eh != NULL || pf->cfi_dbg != NULL)) {
>  		if ((dwarf_cfi_addrframe(pf->cfi_eh, pf->addr, &frame) != 0 &&
> @@ -615,7 +615,7 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct probe_finder *pf)
>  			free(frame);
>  			return -ENOENT;
>  		}
> -#endif
> +#endif /* HAVE_DWARF_CFI_SUPPORT */
>  	}
>  
>  	/* Call finder's callback handler */
> @@ -1140,7 +1140,7 @@ static int debuginfo__find_probes(struct debuginfo *dbg,
>  
>  	pf->machine = ehdr.e_machine;
>  
> -#if _ELFUTILS_PREREQ(0, 142)
> +#ifdef HAVE_DWARF_CFI_SUPPORT
>  	do {
>  		GElf_Shdr shdr;
>  
> @@ -1150,7 +1150,7 @@ static int debuginfo__find_probes(struct debuginfo *dbg,
>  
>  		pf->cfi_dbg = dwarf_getcfi(dbg->dbg);
>  	} while (0);
> -#endif
> +#endif /* HAVE_DWARF_CFI_SUPPORT */
>  
>  	ret = debuginfo__find_probe_location(dbg, pf);
>  	return ret;
> -- 
> 2.42.0.869.gea05f2083d-goog
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 08/52] perf build: Add feature check for dwarf_getcfi()
  2023-11-09 23:59 ` [PATCH 08/52] perf build: Add feature check for dwarf_getcfi() Namhyung Kim
@ 2023-11-10 10:26   ` Masami Hiramatsu
  0 siblings, 0 replies; 69+ messages in thread
From: Masami Hiramatsu @ 2023-11-10 10:26 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ian Rogers,
	Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

On Thu,  9 Nov 2023 15:59:27 -0800
Namhyung Kim <namhyung@kernel.org> wrote:

> dwarf_getcfi() is available in libdw 0.142+.  Instead of just
> checking the version number, it'd be nice to have a config item to check
> the feature at build time.

Looks good to me.

Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Thanks!
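
As a rough illustration of what the config item buys, the sketch below guards a code path with the HAVE_DWARF_CFI_SUPPORT macro instead of a version comparison. The helper name cfi_lookup_compiled_in() is made up for this example and is not part of perf; the macro is assumed to be passed by the build system as -DHAVE_DWARF_CFI_SUPPORT when the feature test passes.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of perf: reports whether the CFI-based
 * code path was compiled in.  HAVE_DWARF_CFI_SUPPORT is assumed to come
 * from the build system when the dwarf_getcfi() feature test succeeds.
 */
static bool cfi_lookup_compiled_in(void)
{
#ifdef HAVE_DWARF_CFI_SUPPORT
	return true;	/* code that calls dwarf_getcfi() can be built */
#else
	return false;	/* callers have to take a non-CFI fallback */
#endif
}
```

The point of the design is that the source only tests one macro, while the decision about which libdw is good enough stays in the build-time feature test.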

> 
> Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/build/Makefile.feature            | 1 +
>  tools/build/feature/Makefile            | 4 ++++
>  tools/build/feature/test-dwarf_getcfi.c | 9 +++++++++
>  3 files changed, 14 insertions(+)
>  create mode 100644 tools/build/feature/test-dwarf_getcfi.c
> 
> diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
> index 934e2777a2db..64df118376df 100644
> --- a/tools/build/Makefile.feature
> +++ b/tools/build/Makefile.feature
> @@ -32,6 +32,7 @@ FEATURE_TESTS_BASIC :=                  \
>          backtrace                       \
>          dwarf                           \
>          dwarf_getlocations              \
> +        dwarf_getcfi                    \
>          eventfd                         \
>          fortify-source                  \
>          get_current_dir_name            \
> diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
> index dad79ede4e0a..37722e509eb9 100644
> --- a/tools/build/feature/Makefile
> +++ b/tools/build/feature/Makefile
> @@ -7,6 +7,7 @@ FILES=                                          \
>           test-bionic.bin                        \
>           test-dwarf.bin                         \
>           test-dwarf_getlocations.bin            \
> +         test-dwarf_getcfi.bin                  \
>           test-eventfd.bin                       \
>           test-fortify-source.bin                \
>           test-get_current_dir_name.bin          \
> @@ -154,6 +155,9 @@ endif
>  $(OUTPUT)test-dwarf_getlocations.bin:
>  	$(BUILD) $(DWARFLIBS)
>  
> +$(OUTPUT)test-dwarf_getcfi.bin:
> +	$(BUILD) $(DWARFLIBS)
> +
>  $(OUTPUT)test-libelf-getphdrnum.bin:
>  	$(BUILD) -lelf
>  
> diff --git a/tools/build/feature/test-dwarf_getcfi.c b/tools/build/feature/test-dwarf_getcfi.c
> new file mode 100644
> index 000000000000..50e7d7cb7bdf
> --- /dev/null
> +++ b/tools/build/feature/test-dwarf_getcfi.c
> @@ -0,0 +1,9 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <stdio.h>
> +#include <elfutils/libdw.h>
> +
> +int main(void)
> +{
> +	Dwarf *dwarf = NULL;
> +	return dwarf_getcfi(dwarf) == NULL;
> +}
> -- 
> 2.42.0.869.gea05f2083d-goog
> 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC 00/52] perf tools: Introduce data type profiling (v2)
  2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
                   ` (51 preceding siblings ...)
  2023-11-10  0:00 ` [PATCH 52/52] perf annotate-data: Add debug message Namhyung Kim
@ 2023-11-10 12:05 ` Arnaldo Carvalho de Melo
  2023-11-11  2:27   ` Namhyung Kim
  52 siblings, 1 reply; 69+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-11-10 12:05 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains, Ben Woodard, Joe Mario,
	Kees Cook, David Blaikie, Xu Liu, Kan Liang, Ravi Bangoria,
	Mark Wielaard, Jason Merrill

Em Thu, Nov 09, 2023 at 03:59:19PM -0800, Namhyung Kim escreveu:
> * Patch structure
> 
> Patches 1-5 are cleanups and a fix that can be applied separately.

Applied 1-9 so far, will continue later.

- Arnaldo

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [RFC 00/52] perf tools: Introduce data type profiling (v2)
  2023-11-10 12:05 ` [RFC 00/52] perf tools: Introduce data type profiling (v2) Arnaldo Carvalho de Melo
@ 2023-11-11  2:27   ` Namhyung Kim
  0 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-11  2:27 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains, Ben Woodard, Joe Mario,
	Kees Cook, David Blaikie, Xu Liu, Kan Liang, Ravi Bangoria,
	Mark Wielaard, Jason Merrill

On Fri, Nov 10, 2023 at 4:05 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Thu, Nov 09, 2023 at 03:59:19PM -0800, Namhyung Kim escreveu:
> > * Patch structure
> >
> > Patches 1-5 are cleanups and a fix that can be applied separately.
>
> Applied 1-9 so far, will continue later.

Great, thanks a lot!
Namhyung

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 12/52] perf annotate-data: Add find_data_type()
       [not found]   ` <CA+JHD90fkWNrQWO5DrHeV8mCmFyKKqJ8fV=KwztRi7TSw+8yDg@mail.gmail.com>
@ 2023-11-20 20:43     ` Namhyung Kim
  0 siblings, 0 replies; 69+ messages in thread
From: Namhyung Kim @ 2023-11-20 20:43 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ian Rogers,
	Adrian Hunter, Ingo Molnar, LKML, linux-perf-users,
	Linus Torvalds, Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	Linux Trace Devel, linux-toolchains

Hi Arnaldo,

On Sat, Nov 11, 2023 at 10:55 AM Arnaldo Carvalho de Melo
<arnaldo.melo@gmail.com> wrote:
>
> On Thu, Nov 9, 2023, 9:00 PM Namhyung Kim <namhyung@kernel.org> wrote:
>>
>>
>> +static bool find_cu_die(struct debuginfo *di, u64 pc, Dwarf_Die *cu_die)
>> +{
>> +       Dwarf_Off off, next_off;
>> +       size_t header_size;
>> +
>> +       if (dwarf_addrdie(di->dbg, pc, cu_die) != NULL)
>> +               return cu_die;
>
> Isn't the return type a bool?
>
> Shouldn't it be 'return true;'?
>
> It ends up working like that since cu_die isn't NULL, but it looks confusing.

Ok, will change.
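
For reference, a standalone sketch of why the original compiles and behaves the same: a non-NULL pointer returned from a bool function converts to true. The struct and function names below (fake_die, fake_addrdie, find_implicit, find_explicit) are invented for illustration and do not use libdw.

```c
#include <stdbool.h>
#include <stddef.h>

struct fake_die { int dummy; };	/* hypothetical stand-in for Dwarf_Die */

/* Pretend lookup: succeeds (returns 'out') only for pc values below 0x1000 */
static struct fake_die *fake_addrdie(unsigned long pc, struct fake_die *out)
{
	return pc < 0x1000 ? out : NULL;
}

/* Original shape: the pointer is implicitly converted to bool */
static bool find_implicit(unsigned long pc, struct fake_die *out)
{
	if (fake_addrdie(pc, out) != NULL)
		return out;	/* non-NULL pointer converts to true */
	return false;
}

/* Suggested shape: same result, but the intent is explicit */
static bool find_explicit(unsigned long pc, struct fake_die *out)
{
	if (fake_addrdie(pc, out) != NULL)
		return true;
	return false;
}
```

Both variants return the same value for every input here, so the change is about readability rather than behavior.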

>
>> +
>> +       /*
>> +        * There are some kernels don't have full aranges and contain only a few
>> +        * aranges entries.  Fallback to iterate all CU entries in .debug_info
>> +        * in case it's missing.
>> +        */
>> +       off = 0;
>> +       while (dwarf_nextcu(di->dbg, off, &next_off, &header_size,
>> +                           NULL, NULL, NULL) == 0) {
>> +               if (dwarf_offdie(di->dbg, off + header_size, cu_die) &&
>> +                   dwarf_haspc(cu_die, pc))
>> +                       return true;
>> +
>> +               off = next_off;
>> +       }
>> +       return false;
>> +}
>> +
>>
>> +struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
>> +                                          int reg, int offset)
>> +{
>> +       struct annotated_data_type *result = NULL;
>> +       struct dso *dso = ms->map->dso;
>> +       struct debuginfo *di;
>> +       Dwarf_Die type_die;
>> +       struct strbuf sb;
>> +       u64 pc;
>> +
>> +       di = debuginfo__new(dso->long_name);
>> +       if (di == NULL) {
>> +               pr_debug("cannot get the debug info\n");
>
>
> Shouldn't this report dso->long_name and the function name to ease debugging?

Sounds good, I'll update it in v3.

Thanks,
Namhyung

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 18/52] perf report: Add 'type' sort key
  2023-11-09 23:59 ` [PATCH 18/52] perf report: Add 'type' sort key Namhyung Kim
@ 2023-11-21 17:55   ` Arnaldo Carvalho de Melo
  2023-11-22 18:49     ` Namhyung Kim
  0 siblings, 1 reply; 69+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-11-21 17:55 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Em Thu, Nov 09, 2023 at 03:59:37PM -0800, Namhyung Kim escreveu:
> The 'type' sort key aggregates hist entries by the data type they
> access.  Add a mem_type field to the hist_entry struct to save the type.
> If hist_entry__get_data_type() returns NULL, it uses the
> 'unknown_type' instance.
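
The fallback described in the quoted commit message can be sketched in isolation. This is only an illustration with made-up stand-ins (fake_type, fake_entry, lookup_type) for the perf structures; the type lookup itself is stubbed out and does not touch DWARF.

```c
#include <stddef.h>
#include <string.h>

struct fake_type { const char *type_name; };	/* stand-in for annotated_data_type */
struct fake_entry {				/* stand-in for hist_entry */
	unsigned long addr;
	struct fake_type *mem_type;
};

static struct fake_type unknown = { .type_name = "(unknown)" };
static struct fake_type known = { .type_name = "struct page" };

/* Stubbed lookup: pretend only even addresses resolve to a type */
static struct fake_type *lookup_type(const struct fake_entry *e)
{
	return (e->addr % 2 == 0) ? &known : NULL;
}

/* Resolve the type once and fall back to the shared 'unknown' instance */
static void entry_init_type(struct fake_entry *e)
{
	if (e->mem_type)
		return;
	e->mem_type = lookup_type(e);
	if (e->mem_type == NULL)
		e->mem_type = &unknown;
}

/* Entries collapse together when their type names compare equal */
static int entry_collapse(struct fake_entry *l, struct fake_entry *r)
{
	entry_init_type(l);
	entry_init_type(r);
	return strcmp(l->mem_type->type_name, r->mem_type->type_name);
}
```

With this shape, every entry whose lookup fails ends up in one "(unknown)" bucket instead of being dropped or compared through a NULL pointer.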

I built up to here and then tried it on a random perf.data file:

⬢[acme@toolbox perf-tools-next]$ perf evlist
cycles:Pu
⬢[acme@toolbox perf-tools-next]$ perf evlist -v
cycles:Pu: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID|LOST, disabled: 1, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
⬢[acme@toolbox perf-tools-next]$

And got:

⬢[acme@toolbox perf-tools-next]$ perf report -s type
perf: Segmentation fault
-------- backtrace --------
perf[0x69f743]
/lib64/libc.so.6(+0x3dbb0)[0x7f89b4778bb0]
perf[0x505af6]
perf[0x512d47]
perf[0x512f82]
perf[0x5b3461]
perf[0x5b3516]
perf[0x5b3a3e]
perf[0x5bbb05]
perf[0x5bc68f]
perf[0x5bca7c]
perf[0x42ead1]
perf[0x42fa08]
perf[0x43200d]
perf[0x504856]
perf[0x504ac5]
perf[0x504c14]
perf[0x504f01]
/lib64/libc.so.6(+0x27b8a)[0x7f89b4762b8a]
/lib64/libc.so.6(__libc_start_main+0x8b)[0x7f89b4762c4b]
perf[0x40ed65]
⬢[acme@toolbox perf-tools-next]$

Using gdb:

(gdb) run report --stdio -s type
Starting program: /home/acme/bin/perf report --stdio -s type

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[Detaching after fork from child process 811109]

Program received signal SIGSEGV, Segmentation fault.
0x0000000000505af6 in list_empty (head=0x14c20) at /home/acme/git/perf-tools-next/tools/include/linux/list.h:189
189		return head->next == head;
(gdb) bt
#0  0x0000000000505af6 in list_empty (head=0x14c20) at /home/acme/git/perf-tools-next/tools/include/linux/list.h:189
#1  0x0000000000512d47 in symbol__ensure_annotate (ms=0xe6f258, evsel=0xe276f0) at util/annotate.c:3640
#2  0x0000000000512f82 in hist_entry__get_data_type (he=0xe6f1e0) at util/annotate.c:3696
#3  0x00000000005b3461 in sort__type_init (he=0xe6f1e0) at util/sort.c:2152
#4  0x00000000005b3516 in sort__type_collapse (left=0xe6ed80, right=0xe6f1e0) at util/sort.c:2169
#5  0x00000000005b3a3e in __sort__hpp_collapse (fmt=0xe448f0, a=0xe6ed80, b=0xe6f1e0) at util/sort.c:2394
#6  0x00000000005bbb05 in hist_entry__collapse (left=0xe6ed80, right=0xe6f1e0) at util/hist.c:1306
#7  0x00000000005bc68f in hists__collapse_insert_entry (hists=0xe27960, root=0xe27998, he=0xe6f1e0) at util/hist.c:1613
#8  0x00000000005bca7c in hists__collapse_resort (hists=0xe27960, prog=0x7fffffffb820) at util/hist.c:1697
#9  0x000000000042ead1 in report__collapse_hists (rep=0x7fffffffbac0) at builtin-report.c:723
#10 0x000000000042fa08 in __cmd_report (rep=0x7fffffffbac0) at builtin-report.c:1042
#11 0x000000000043200d in cmd_report (argc=0, argv=0x7fffffffe1b0) at builtin-report.c:1733
#12 0x0000000000504856 in run_builtin (p=0xdf7da0 <commands+288>, argc=4, argv=0x7fffffffe1b0) at perf.c:322
#13 0x0000000000504ac5 in handle_internal_command (argc=4, argv=0x7fffffffe1b0) at perf.c:375
#14 0x0000000000504c14 in run_argv (argcp=0x7fffffffdfcc, argv=0x7fffffffdfc0) at perf.c:419
#15 0x0000000000504f01 in main (argc=4, argv=0x7fffffffe1b0) at perf.c:535
(gdb)


static void symbol__ensure_annotate(struct map_symbol *ms, struct evsel *evsel)
+{
+       struct disasm_line *dl, *tmp_dl;
+       struct annotation *notes;
+
+       notes = symbol__annotation(ms->sym);
+       if (!list_empty(&notes->src->source))
+               return;
+
+       if (symbol__annotate(ms, evsel, notes->options, NULL) < 0)
+               return;
+
+       /* remove non-insn disasm lines for simplicity */
+       list_for_each_entry_safe(dl, tmp_dl, &notes->src->source, al.node) {
+               if (dl->al.offset == -1) {
+                       list_del(&dl->al.node);
+                       free(dl);
+               }
+       }
+}

Probably annotated_source__new() wasn't called? Yeah, seems so:

(gdb) b annotated_source__new
Breakpoint 1 at 0x50a894: file util/annotate.c, line 851.
(gdb) run report --stdio -s type
Starting program: /home/acme/bin/perf report --stdio -s type

This GDB supports auto-downloading debuginfo from the following URLs:
  <https://debuginfod.fedoraproject.org/>
Enable debuginfod for this session? (y or [n]) y
Debuginfod has been enabled.
To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[Detaching after fork from child process 818292]

Program received signal SIGSEGV, Segmentation fault.
0x0000000000505af6 in list_empty (head=0x14c20) at /home/acme/git/perf-tools-next/tools/include/linux/list.h:189
189		return head->next == head;
(gdb)
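
If the guess above is right and the source list was never allocated, a defensive check before list_empty() would avoid the bad dereference. The sketch below is an assumption-laden illustration, not the actual fix: fake_notes, fake_source and needs_annotate() are hypothetical, trimmed-down stand-ins, and the real root cause may be elsewhere.

```c
#include <stdbool.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Hypothetical, trimmed-down stand-ins for the perf structures */
struct fake_source { struct list_head source; };
struct fake_notes { struct fake_source *src; };

static bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* True when disassembly still has to be produced, including src == NULL */
static bool needs_annotate(const struct fake_notes *notes)
{
	if (notes == NULL || notes->src == NULL)
		return true;	/* nothing allocated yet: do not touch the list */
	return list_empty(&notes->src->source);
}
```

A caller would then allocate or annotate when needs_annotate() returns true, rather than passing an address derived from an unset pointer to list_empty().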



 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/Documentation/perf-report.txt |  1 +
>  tools/perf/util/annotate-data.h          |  2 +
>  tools/perf/util/hist.h                   |  1 +
>  tools/perf/util/sort.c                   | 69 +++++++++++++++++++++++-
>  tools/perf/util/sort.h                   |  4 ++
>  5 files changed, 75 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index af068b4f1e5a..aec34417090b 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -118,6 +118,7 @@ OPTIONS
>  	- retire_lat: On X86, this reports pipeline stall of this instruction compared
>  	  to the previous instruction in cycles. And currently supported only on X86
>  	- simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicate. "p" for partial Arm SVE predicate
> +	- type: Data type of sample memory access.
>  
>  	By default, comm, dso and symbol keys are used.
>  	(i.e. --sort comm,dso,symbol)
> diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
> index ab9f187bd7f1..6efdd7e21b28 100644
> --- a/tools/perf/util/annotate-data.h
> +++ b/tools/perf/util/annotate-data.h
> @@ -22,6 +22,8 @@ struct annotated_data_type {
>  	int type_size;
>  };
>  
> +extern struct annotated_data_type unknown_type;
> +
>  #ifdef HAVE_DWARF_SUPPORT
>  
>  /* Returns data type at the location (ip, reg, offset) */
> diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
> index afc9f1c7f4dc..9bfed867f288 100644
> --- a/tools/perf/util/hist.h
> +++ b/tools/perf/util/hist.h
> @@ -82,6 +82,7 @@ enum hist_column {
>  	HISTC_ADDR_TO,
>  	HISTC_ADDR,
>  	HISTC_SIMD,
> +	HISTC_TYPE,
>  	HISTC_NR_COLS, /* Last entry */
>  };
>  
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 27b123ccd2d1..e647f0117bb5 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -24,6 +24,7 @@
>  #include "strbuf.h"
>  #include "mem-events.h"
>  #include "annotate.h"
> +#include "annotate-data.h"
>  #include "event.h"
>  #include "time-utils.h"
>  #include "cgroup.h"
> @@ -2094,7 +2095,7 @@ struct sort_entry sort_dso_size = {
>  	.se_width_idx	= HISTC_DSO_SIZE,
>  };
>  
> -/* --sort dso_size */
> +/* --sort addr */
>  
>  static int64_t
>  sort__addr_cmp(struct hist_entry *left, struct hist_entry *right)
> @@ -2131,6 +2132,69 @@ struct sort_entry sort_addr = {
>  	.se_width_idx	= HISTC_ADDR,
>  };
>  
> +/* --sort type */
> +
> +struct annotated_data_type unknown_type = {
> +	.type_name = (char *)"(unknown)",
> +};
> +
> +static int64_t
> +sort__type_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> +	return sort__addr_cmp(left, right);
> +}
> +
> +static void sort__type_init(struct hist_entry *he)
> +{
> +	if (he->mem_type)
> +		return;
> +
> +	he->mem_type = hist_entry__get_data_type(he);
> +	if (he->mem_type == NULL)
> +		he->mem_type = &unknown_type;
> +}
> +
> +static int64_t
> +sort__type_collapse(struct hist_entry *left, struct hist_entry *right)
> +{
> +	struct annotated_data_type *left_type = left->mem_type;
> +	struct annotated_data_type *right_type = right->mem_type;
> +
> +	if (!left_type) {
> +		sort__type_init(left);
> +		left_type = left->mem_type;
> +	}
> +
> +	if (!right_type) {
> +		sort__type_init(right);
> +		right_type = right->mem_type;
> +	}
> +
> +	return strcmp(left_type->type_name, right_type->type_name);
> +}
> +
> +static int64_t
> +sort__type_sort(struct hist_entry *left, struct hist_entry *right)
> +{
> +	return sort__type_collapse(left, right);
> +}
> +
> +static int hist_entry__type_snprintf(struct hist_entry *he, char *bf,
> +				     size_t size, unsigned int width)
> +{
> +	return repsep_snprintf(bf, size, "%-*s", width, he->mem_type->type_name);
> +}
> +
> +struct sort_entry sort_type = {
> +	.se_header	= "Data Type",
> +	.se_cmp		= sort__type_cmp,
> +	.se_collapse	= sort__type_collapse,
> +	.se_sort	= sort__type_sort,
> +	.se_init	= sort__type_init,
> +	.se_snprintf	= hist_entry__type_snprintf,
> +	.se_width_idx	= HISTC_TYPE,
> +};
> +
>  
>  struct sort_dimension {
>  	const char		*name;
> @@ -2185,7 +2249,8 @@ static struct sort_dimension common_sort_dimensions[] = {
>  	DIM(SORT_ADDR, "addr", sort_addr),
>  	DIM(SORT_LOCAL_RETIRE_LAT, "local_retire_lat", sort_local_p_stage_cyc),
>  	DIM(SORT_GLOBAL_RETIRE_LAT, "retire_lat", sort_global_p_stage_cyc),
> -	DIM(SORT_SIMD, "simd", sort_simd)
> +	DIM(SORT_SIMD, "simd", sort_simd),
> +	DIM(SORT_ANNOTATE_DATA_TYPE, "type", sort_type),
>  };
>  
>  #undef DIM
> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index ecfb7f1359d5..aabf0b8331a3 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -15,6 +15,7 @@
>  
>  struct option;
>  struct thread;
> +struct annotated_data_type;
>  
>  extern regex_t parent_regex;
>  extern const char *sort_order;
> @@ -34,6 +35,7 @@ extern struct sort_entry sort_dso_to;
>  extern struct sort_entry sort_sym_from;
>  extern struct sort_entry sort_sym_to;
>  extern struct sort_entry sort_srcline;
> +extern struct sort_entry sort_type;
>  extern const char default_mem_sort_order[];
>  extern bool chk_double_cl;
>  
> @@ -154,6 +156,7 @@ struct hist_entry {
>  	struct perf_hpp_list	*hpp_list;
>  	struct hist_entry	*parent_he;
>  	struct hist_entry_ops	*ops;
> +	struct annotated_data_type *mem_type;
>  	union {
>  		/* this is for hierarchical entry structure */
>  		struct {
> @@ -243,6 +246,7 @@ enum sort_type {
>  	SORT_LOCAL_RETIRE_LAT,
>  	SORT_GLOBAL_RETIRE_LAT,
>  	SORT_SIMD,
> +	SORT_ANNOTATE_DATA_TYPE,
>  
>  	/* branch stack specific sort keys */
>  	__SORT_BRANCH_STACK,
> -- 
> 2.42.0.869.gea05f2083d-goog
> 

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 69+ messages in thread

* Re: [PATCH 18/52] perf report: Add 'type' sort key
  2023-11-21 17:55   ` Arnaldo Carvalho de Melo
@ 2023-11-22 18:49     ` Namhyung Kim
  2023-11-22 19:54       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 69+ messages in thread
From: Namhyung Kim @ 2023-11-22 18:49 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

On Tue, Nov 21, 2023 at 9:55 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Thu, Nov 09, 2023 at 03:59:37PM -0800, Namhyung Kim escreveu:
> > The 'type' sort key aggregates hist entries by the data type they
> > access.  Add a mem_type field to the hist_entry struct to save the type.
> > If hist_entry__get_data_type() returns NULL, it uses the
> > 'unknown_type' instance.
>
> I built up to here and then tried it on a random perf.data file:
>
> ⬢[acme@toolbox perf-tools-next]$ perf evlist
> cycles:Pu
> ⬢[acme@toolbox perf-tools-next]$ perf evlist -v
> cycles:Pu: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, read_format: ID|LOST, disabled: 1, inherit: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
> ⬢[acme@toolbox perf-tools-next]$
>
> And got:
>
> ⬢[acme@toolbox perf-tools-next]$ perf report -s type
> perf: Segmentation fault
> -------- backtrace --------
> perf[0x69f743]
> /lib64/libc.so.6(+0x3dbb0)[0x7f89b4778bb0]
> perf[0x505af6]
> perf[0x512d47]
> perf[0x512f82]
> perf[0x5b3461]
> perf[0x5b3516]
> perf[0x5b3a3e]
> perf[0x5bbb05]
> perf[0x5bc68f]
> perf[0x5bca7c]
> perf[0x42ead1]
> perf[0x42fa08]
> perf[0x43200d]
> perf[0x504856]
> perf[0x504ac5]
> perf[0x504c14]
> perf[0x504f01]
> /lib64/libc.so.6(+0x27b8a)[0x7f89b4762b8a]
> /lib64/libc.so.6(__libc_start_main+0x8b)[0x7f89b4762c4b]
> perf[0x40ed65]
> ⬢[acme@toolbox perf-tools-next]$

Right, the 'type' sort key was added here, but unfortunately
it's not ready for prime time yet.  It also needs the next patch,
19/52 ("perf report: Support data type profiling"), to fully enable
the feature.  Do you think it'd be better to squash that patch into
this one?

Thanks,
Namhyung

>
> Using gdb:
>
> (gdb) run report --stdio -s type
> Starting program: /home/acme/bin/perf report --stdio -s type
>
> This GDB supports auto-downloading debuginfo from the following URLs:
>   <https://debuginfod.fedoraproject.org/>
> Enable debuginfod for this session? (y or [n]) y
> Debuginfod has been enabled.
> To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [Detaching after fork from child process 811109]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000000000505af6 in list_empty (head=0x14c20) at /home/acme/git/perf-tools-next/tools/include/linux/list.h:189
> 189             return head->next == head;
> (gdb) bt
> #0  0x0000000000505af6 in list_empty (head=0x14c20) at /home/acme/git/perf-tools-next/tools/include/linux/list.h:189
> #1  0x0000000000512d47 in symbol__ensure_annotate (ms=0xe6f258, evsel=0xe276f0) at util/annotate.c:3640
> #2  0x0000000000512f82 in hist_entry__get_data_type (he=0xe6f1e0) at util/annotate.c:3696
> #3  0x00000000005b3461 in sort__type_init (he=0xe6f1e0) at util/sort.c:2152
> #4  0x00000000005b3516 in sort__type_collapse (left=0xe6ed80, right=0xe6f1e0) at util/sort.c:2169
> #5  0x00000000005b3a3e in __sort__hpp_collapse (fmt=0xe448f0, a=0xe6ed80, b=0xe6f1e0) at util/sort.c:2394
> #6  0x00000000005bbb05 in hist_entry__collapse (left=0xe6ed80, right=0xe6f1e0) at util/hist.c:1306
> #7  0x00000000005bc68f in hists__collapse_insert_entry (hists=0xe27960, root=0xe27998, he=0xe6f1e0) at util/hist.c:1613
> #8  0x00000000005bca7c in hists__collapse_resort (hists=0xe27960, prog=0x7fffffffb820) at util/hist.c:1697
> #9  0x000000000042ead1 in report__collapse_hists (rep=0x7fffffffbac0) at builtin-report.c:723
> #10 0x000000000042fa08 in __cmd_report (rep=0x7fffffffbac0) at builtin-report.c:1042
> #11 0x000000000043200d in cmd_report (argc=0, argv=0x7fffffffe1b0) at builtin-report.c:1733
> #12 0x0000000000504856 in run_builtin (p=0xdf7da0 <commands+288>, argc=4, argv=0x7fffffffe1b0) at perf.c:322
> #13 0x0000000000504ac5 in handle_internal_command (argc=4, argv=0x7fffffffe1b0) at perf.c:375
> #14 0x0000000000504c14 in run_argv (argcp=0x7fffffffdfcc, argv=0x7fffffffdfc0) at perf.c:419
> #15 0x0000000000504f01 in main (argc=4, argv=0x7fffffffe1b0) at perf.c:535
> (gdb)
>
>
> static void symbol__ensure_annotate(struct map_symbol *ms, struct evsel *evsel)
> +{
> +       struct disasm_line *dl, *tmp_dl;
> +       struct annotation *notes;
> +
> +       notes = symbol__annotation(ms->sym);
> +       if (!list_empty(&notes->src->source))
> +               return;
> +
> +       if (symbol__annotate(ms, evsel, notes->options, NULL) < 0)
> +               return;
> +
> +       /* remove non-insn disasm lines for simplicity */
> +       list_for_each_entry_safe(dl, tmp_dl, &notes->src->source, al.node) {
> +               if (dl->al.offset == -1) {
> +                       list_del(&dl->al.node);
> +                       free(dl);
> +               }
> +       }
> +}
>
> Probably annotated_source__new() wasn't called? Yeah, seems so:
>
> (gdb) b annotated_source__new
> Breakpoint 1 at 0x50a894: file util/annotate.c, line 851.
> (gdb) run report --stdio -s type
> Starting program: /home/acme/bin/perf report --stdio -s type
>
> This GDB supports auto-downloading debuginfo from the following URLs:
>   <https://debuginfod.fedoraproject.org/>
> Enable debuginfod for this session? (y or [n]) y
> Debuginfod has been enabled.
> To make this setting permanent, add 'set debuginfod enabled on' to .gdbinit.
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [Detaching after fork from child process 818292]
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x0000000000505af6 in list_empty (head=0x14c20) at /home/acme/git/perf-tools-next/tools/include/linux/list.h:189
> 189             return head->next == head;
> (gdb)

* Re: [PATCH 18/52] perf report: Add 'type' sort key
  2023-11-22 18:49     ` Namhyung Kim
@ 2023-11-22 19:54       ` Arnaldo Carvalho de Melo
  2023-11-22 21:13         ` Namhyung Kim
  0 siblings, 1 reply; 69+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-11-22 19:54 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Em Wed, Nov 22, 2023 at 10:49:13AM -0800, Namhyung Kim escreveu:
> On Tue, Nov 21, 2023 at 9:55 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > ⬢[acme@toolbox perf-tools-next]$ perf report -s type
> > perf: Segmentation fault
> > -------- backtrace --------
> > perf[0x69f743]
> > /lib64/libc.so.6(+0x3dbb0)[0x7f89b4778bb0]
> > perf[0x505af6]
<SNIP>
> > perf[0x504f01]
> > /lib64/libc.so.6(+0x27b8a)[0x7f89b4762b8a]
> > /lib64/libc.so.6(__libc_start_main+0x8b)[0x7f89b4762c4b]
> > perf[0x40ed65]
> > ⬢[acme@toolbox perf-tools-next]$
>
> Right, the 'type' sort key was added here but unfortunately
> it's not ready for prime time yet.  It also needs the next patch
> 19/52 ("perf report: Support data type profiling") to fully enable
> the feature.  Do you think it's better to squash into here?

I haven't checked if squashing would be a good idea, but if you think
it's the right granularity, then do it, as long as we can test features
in various ways as they are getting added, as I did, using a random
perf.data file.

- Arnaldo

* Re: [PATCH 18/52] perf report: Add 'type' sort key
  2023-11-22 19:54       ` Arnaldo Carvalho de Melo
@ 2023-11-22 21:13         ` Namhyung Kim
  2023-11-23 13:40           ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 69+ messages in thread
From: Namhyung Kim @ 2023-11-22 21:13 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

On Wed, Nov 22, 2023 at 11:54 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> Em Wed, Nov 22, 2023 at 10:49:13AM -0800, Namhyung Kim escreveu:
> > On Tue, Nov 21, 2023 at 9:55 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > ⬢[acme@toolbox perf-tools-next]$ perf report -s type
> > > perf: Segmentation fault
> > > -------- backtrace --------
> > > perf[0x69f743]
> > > /lib64/libc.so.6(+0x3dbb0)[0x7f89b4778bb0]
> > > perf[0x505af6]
> <SNIP>
> > > perf[0x504f01]
> > > /lib64/libc.so.6(+0x27b8a)[0x7f89b4762b8a]
> > > /lib64/libc.so.6(__libc_start_main+0x8b)[0x7f89b4762c4b]
> > > perf[0x40ed65]
> > > ⬢[acme@toolbox perf-tools-next]$
> >
> > Right, the 'type' sort key was added here but unfortunately
> > it's not ready for prime time yet.  It also needs the next patch
> > 19/52 ("perf report: Support data type profiling") to fully enable
> > the feature.  Do you think it's better to squash into here?
>
> I haven't checked if squashing would be a good idea, but if you think
> it's the right granularity, then do it, as long as we can test features
> in various ways as they are getting added, as I did, using a random
> perf.data file.

I still think it's better to split the change as it's logically separate.
But since it's prematurely exposed, it may need some protection.

Thanks,
Namhyung

* Re: [PATCH 18/52] perf report: Add 'type' sort key
  2023-11-22 21:13         ` Namhyung Kim
@ 2023-11-23 13:40           ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 69+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-11-23 13:40 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Em Wed, Nov 22, 2023 at 01:13:04PM -0800, Namhyung Kim escreveu:
> On Wed, Nov 22, 2023 at 11:54 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > Em Wed, Nov 22, 2023 at 10:49:13AM -0800, Namhyung Kim escreveu:
> > > On Tue, Nov 21, 2023 at 9:55 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
> > > > ⬢[acme@toolbox perf-tools-next]$ perf report -s type
> > > > perf: Segmentation fault

> > > Right, the 'type' sort key was added here but unfortunately
> > > it's not ready for prime time yet.  It also needs the next patch
> > > 19/52 ("perf report: Support data type profiling") to fully enable
> > > the feature.  Do you think it's better to squash into here?

> > I haven't checked if squashing would be a good idea, but if you think
> > it's the right granularity, then do it, as long as we can test features
> > in various ways as they are getting added, as I did, using a random
> > perf.data file.
 
> I still think it's better to split the change as it's logically separate.

The smaller the patches, the better, I'd say in general.

> But since it's prematurely exposed, it may need some protection.

Yeah, that is what I felt like it needed, make it more robust by
checking if the used fields were properly initialized, etc.
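
A minimal sketch of the kind of check meant here, assuming the crash comes
from notes->src not having been allocated yet (as the gdb session quoted
earlier in the thread suggests).  The structures are simplified stand-ins
and annotation__has_source() is a hypothetical helper name, not something
from the patch:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the perf structures; not the real definitions. */
struct list_head {
	struct list_head *next, *prev;
};

struct annotated_source {
	struct list_head source;
};

struct annotation {
	struct annotated_source *src;	/* NULL until it gets allocated */
};

static inline bool list_empty(const struct list_head *head)
{
	return head->next == head;
}

/*
 * Hypothetical helper: true only when the source list exists and already
 * holds entries.  Testing notes->src first avoids computing
 * &notes->src->source from a NULL pointer, which is what list_empty()
 * would otherwise dereference.
 */
bool annotation__has_source(const struct annotation *notes)
{
	if (notes == NULL || notes->src == NULL)
		return false;

	return !list_empty(&notes->src->source);
}
```

A caller in the style of symbol__ensure_annotate() could then bail out or
allocate the source when this returns false instead of touching the list
directly.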

- Arnaldo

* Re: [PATCH 32/52] perf dwarf-aux: Add die_find_variable_by_addr()
  2023-11-09 23:59 ` [PATCH 32/52] perf dwarf-aux: Add die_find_variable_by_addr() Namhyung Kim
@ 2023-11-27 22:07   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 69+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-11-27 22:07 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Em Thu, Nov 09, 2023 at 03:59:51PM -0800, Namhyung Kim escreveu:
> die_find_variable_by_addr() finds a variable in the given DIE using
> the given (PC-relative) address.  Global variables have a location
> expression with DW_OP_addr, which carries an address, so we can simply
> compare it with the given address.
> 
>   <1><143a7>: Abbrev Number: 2 (DW_TAG_variable)
>       <143a8>   DW_AT_name        : loops_per_jiffy
>       <143ac>   DW_AT_type        : <0x1cca>
>       <143b0>   DW_AT_external    : 1
>       <143b0>   DW_AT_decl_file   : 193
>       <143b1>   DW_AT_decl_line   : 213
>       <143b2>   DW_AT_location    : 9 byte block: 3 b0 46 41 82 ff ff ff ff
>                                      (DW_OP_addr: ffffffff824146b0)
> 
> Note that the type-offset should be calculated from the base address of
> the global variable.
> 
> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>

Thanks, applied to perf-tools-next.
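
For reference, the location block in the quoted DIE can be decoded by
hand.  The sketch below is only an illustration, not the code from the
patch: decode_op_addr() and var_offset() are hypothetical helpers, and the
8-byte variable size used in the usage note is an assumption.  It reads
the DW_OP_addr opcode (0x03) followed by a little-endian 8-byte address,
then computes an offset relative to the variable's base address:

```c
#include <stddef.h>
#include <stdint.h>

#define DW_OP_ADDR_OPCODE 0x03	/* value of DW_OP_addr in the DWARF spec */

/*
 * Decode a location block of the form
 *   DW_OP_addr <8-byte little-endian address>
 * Returns 0 and stores the address on success, -1 for any other block.
 */
int decode_op_addr(const uint8_t *block, size_t len, uint64_t *addr)
{
	size_t i;

	if (len != 9 || block[0] != DW_OP_ADDR_OPCODE)
		return -1;

	*addr = 0;
	for (i = 0; i < 8; i++)
		*addr |= (uint64_t)block[1 + i] << (8 * i);

	return 0;
}

/*
 * If addr falls inside [base, base + size), store the offset from the
 * base of the variable and return 0; otherwise return -1.
 */
int var_offset(uint64_t base, uint64_t size, uint64_t addr, uint64_t *offset)
{
	if (addr < base || addr - base >= size)
		return -1;

	*offset = addr - base;
	return 0;
}
```

Feeding it the bytes 3 b0 46 41 82 ff ff ff ff from the dump yields
ffffffff824146b0, matching the (DW_OP_addr: ...) annotation.  Assuming an
8-byte variable, a sample at base + 4 gives offset 4.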

- Arnaldo


* Re: [PATCH 13/52] perf annotate-data: Add dso->data_types tree
  2023-11-09 23:59 ` [PATCH 13/52] perf annotate-data: Add dso->data_types tree Namhyung Kim
@ 2023-12-21 20:10   ` Arnaldo Carvalho de Melo
  2023-12-21 20:13     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 69+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-12-21 20:10 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Em Thu, Nov 09, 2023 at 03:59:32PM -0800, Namhyung Kim escreveu:
> +++ b/tools/perf/util/dso.h
> @@ -154,6 +154,8 @@ struct dso {
>  	size_t		 symbol_names_len;
>  	struct rb_root_cached inlined_nodes;
>  	struct rb_root_cached srclines;
> +	struct rb_root	data_types;
> +
>  	struct {
>  		u64		addr;
>  		struct symbol	*symbol;

At some point we need to make these feature-specific members
associated on demand, maybe through some hash table, etc.

I.e. the most basic workflow, what everybody needs, should be in 'struct
dso'; something one _may_ want, like data profiling, should be
associated with that DSO in some other way.

I'm applying this now as this is a super cool feature, but think about
it.
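
A rough sketch of what "associated on demand" could look like, purely as
an illustration: a small side table keyed by the object pointer, so that
only users of a feature pay for its state.  All names here (side_table,
side_table__set(), etc.) are hypothetical and the bucket count is
arbitrary; nothing below exists in perf:

```c
#include <stdint.h>
#include <stdlib.h>

#define SIDE_TABLE_BUCKETS 64	/* illustrative size, must be a power of two */

struct side_entry {
	const void *key;		/* e.g. a 'struct dso *' */
	void *data;			/* feature-private state, e.g. a type tree */
	struct side_entry *next;
};

struct side_table {
	struct side_entry *buckets[SIDE_TABLE_BUCKETS];
};

static size_t side_hash(const void *key)
{
	/* Drop the low alignment bits, then mask into the bucket range. */
	return ((uintptr_t)key >> 4) & (SIDE_TABLE_BUCKETS - 1);
}

/* Returns the data attached to key, or NULL if nothing was attached. */
void *side_table__find(struct side_table *t, const void *key)
{
	struct side_entry *e;

	for (e = t->buckets[side_hash(key)]; e; e = e->next) {
		if (e->key == key)
			return e->data;
	}
	return NULL;
}

/* Attach (or replace) data for key; 0 on success, -1 if out of memory. */
int side_table__set(struct side_table *t, const void *key, void *data)
{
	size_t idx = side_hash(key);
	struct side_entry *e;

	for (e = t->buckets[idx]; e; e = e->next) {
		if (e->key == key) {
			e->data = data;
			return 0;
		}
	}

	e = malloc(sizeof(*e));
	if (e == NULL)
		return -1;

	e->key = key;
	e->data = data;
	e->next = t->buckets[idx];
	t->buckets[idx] = e;
	return 0;
}

/* Free all entries; the attached data is owned by the caller. */
void side_table__exit(struct side_table *t)
{
	size_t i;

	for (i = 0; i < SIDE_TABLE_BUCKETS; i++) {
		struct side_entry *e = t->buckets[i];

		while (e) {
			struct side_entry *next = e->next;

			free(e);
			e = next;
		}
		t->buckets[i] = NULL;
	}
}
```

With something like this, the data type tree would be looked up by DSO
pointer only when data type profiling is in use, and 'struct dso' itself
would not grow.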

- Arnaldo

* Re: [PATCH 13/52] perf annotate-data: Add dso->data_types tree
  2023-12-21 20:10   ` Arnaldo Carvalho de Melo
@ 2023-12-21 20:13     ` Arnaldo Carvalho de Melo
  2023-12-21 20:32       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 69+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-12-21 20:13 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Em Thu, Dec 21, 2023 at 05:10:53PM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Nov 09, 2023 at 03:59:32PM -0800, Namhyung Kim escreveu:
> > +++ b/tools/perf/util/dso.h
> > @@ -154,6 +154,8 @@ struct dso {
> >  	size_t		 symbol_names_len;
> >  	struct rb_root_cached inlined_nodes;
> >  	struct rb_root_cached srclines;
> > +	struct rb_root	data_types;
> > +
> >  	struct {
> >  		u64		addr;
> >  		struct symbol	*symbol;
> 
> At some point we need to make these feature specific members to be
> associated on demand, maybe thru some hash table, etc.
> 
> I.e. the most basic workflow, what everybody needs should be in 'struct
> dso', something one _may_ want, like data profiling, should be
> associated with that DSO thru some other way.
> 
> I'm applying this now as this is a super cool feature, but think about
> it.

I think I have this series applied up to this patch, the next one is not
applying cleanly, so I'll do the usual set of build tests so that I can
push this for linux-next consumption.

This should be on tmp.perf-tools-next in a few jiffies, in
perf-tools-next a bit later.

- Arnaldo

* Re: [PATCH 13/52] perf annotate-data: Add dso->data_types tree
  2023-12-21 20:13     ` Arnaldo Carvalho de Melo
@ 2023-12-21 20:32       ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 69+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-12-21 20:32 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Em Thu, Dec 21, 2023 at 05:13:11PM -0300, Arnaldo Carvalho de Melo escreveu:
> > At some point we need to make these feature specific members to be
> > associated on demand, maybe thru some hash table, etc.

> > I.e. the most basic workflow, what everybody needs should be in 'struct
> > dso', something one _may_ want, like data profiling, should be
> > associated with that DSO thru some other way.

> > I'm applying this now as this is a super cool feature, but think about
> > it.

> I think I have this series applied up to this patch, the next one is not
> applying cleanly, so I'll do the usual set of build tests so that I can
> push this for linux-next consumption.

> This should be on tmp.perf-tools-next in a few jiffies, in
> perf-tools-next a bit later.

Ok, as discussed on google chat, I got v3, all is in my local tree,
build testing while I review.

- Arnaldo

* Re: [PATCH 14/52] perf annotate: Factor out evsel__get_arch()
  2023-11-09 23:59 ` [PATCH 14/52] perf annotate: Factor out evsel__get_arch() Namhyung Kim
@ 2023-12-23 14:14   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 69+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-12-23 14:14 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Em Thu, Nov 09, 2023 at 03:59:33PM -0800, Namhyung Kim escreveu:
> The evsel__get_arch() is to get architecture info from the environ.
> It'll be used by other places later so let's factor it out.

evsel__get_arch():

  The "get" is mostly associated with refcounts, so at some point we
should rename it to something better.  That's not a reason to delay
processing this patch right now, so I'm applying it as is.
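
To illustrate the convention being referred to: a __get() helper usually
takes a reference and is paired with a __put() that drops it.  The sketch
below uses a made-up 'struct thing' and a plain int counter for brevity
(real code would use an atomic refcount); it is not taken from perf:

```c
#include <stdlib.h>

/* Made-up object type, only to show the get/put pairing. */
struct thing {
	int refcnt;
};

/* Allocate a new object holding one reference for the caller. */
struct thing *thing__new(void)
{
	struct thing *t = calloc(1, sizeof(*t));

	if (t)
		t->refcnt = 1;
	return t;
}

/* Take an extra reference; returns the same pointer for chaining. */
struct thing *thing__get(struct thing *t)
{
	if (t)
		t->refcnt++;
	return t;
}

/* Drop a reference; frees the object and returns 1 on the last put. */
int thing__put(struct thing *t)
{
	if (t && --t->refcnt == 0) {
		free(t);
		return 1;
	}
	return 0;
}
```

A reader seeing evsel__get_arch() might expect a matching put, which is
why a different verb would be less surprising for a plain lookup.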
 
> Also add arch__is() to check the arch info by name.

cool
 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/annotate.c | 44 +++++++++++++++++++++++++++-----------
>  tools/perf/util/annotate.h |  2 ++
>  2 files changed, 33 insertions(+), 13 deletions(-)
> 
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index 3364edf30f50..83e0996992af 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -804,6 +804,11 @@ static struct arch *arch__find(const char *name)
>  	return bsearch(name, architectures, nmemb, sizeof(struct arch), arch__key_cmp);
>  }
>  
> +bool arch__is(struct arch *arch, const char *name)
> +{
> +	return !strcmp(arch->name, name);
> +}
> +
>  static struct annotated_source *annotated_source__new(void)
>  {
>  	struct annotated_source *src = zalloc(sizeof(*src));
> @@ -2340,15 +2345,8 @@ void symbol__calc_percent(struct symbol *sym, struct evsel *evsel)
>  	annotation__calc_percent(notes, evsel, symbol__size(sym));
>  }
>  
> -int symbol__annotate(struct map_symbol *ms, struct evsel *evsel,
> -		     struct annotation_options *options, struct arch **parch)
> +static int evsel__get_arch(struct evsel *evsel, struct arch **parch)
>  {
> -	struct symbol *sym = ms->sym;
> -	struct annotation *notes = symbol__annotation(sym);
> -	struct annotate_args args = {
> -		.evsel		= evsel,
> -		.options	= options,
> -	};
>  	struct perf_env *env = evsel__env(evsel);
>  	const char *arch_name = perf_env__arch(env);
>  	struct arch *arch;
> @@ -2357,23 +2355,43 @@ int symbol__annotate(struct map_symbol *ms, struct evsel *evsel,
>  	if (!arch_name)
>  		return errno;
>  
> -	args.arch = arch = arch__find(arch_name);
> +	*parch = arch = arch__find(arch_name);
>  	if (arch == NULL) {
>  		pr_err("%s: unsupported arch %s\n", __func__, arch_name);
>  		return ENOTSUP;
>  	}
>  
> -	if (parch)
> -		*parch = arch;
> -
>  	if (arch->init) {
>  		err = arch->init(arch, env ? env->cpuid : NULL);
>  		if (err) {
> -			pr_err("%s: failed to initialize %s arch priv area\n", __func__, arch->name);
> +			pr_err("%s: failed to initialize %s arch priv area\n",
> +			       __func__, arch->name);
>  			return err;
>  		}
>  	}
> +	return 0;
> +}
> +
> +int symbol__annotate(struct map_symbol *ms, struct evsel *evsel,
> +		     struct annotation_options *options, struct arch **parch)
> +{
> +	struct symbol *sym = ms->sym;
> +	struct annotation *notes = symbol__annotation(sym);
> +	struct annotate_args args = {
> +		.evsel		= evsel,
> +		.options	= options,
> +	};
> +	struct arch *arch = NULL;
> +	int err;
> +
> +	err = evsel__get_arch(evsel, &arch);
> +	if (err < 0)
> +		return err;
> +
> +	if (parch)
> +		*parch = arch;
>  
> +	args.arch = arch;
>  	args.ms = *ms;
>  	if (notes->options && notes->options->full_addr)
>  		notes->start = map__objdump_2mem(ms->map, ms->sym->start);
> diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
> index bc8b95e8b1be..e8b0173f5f00 100644
> --- a/tools/perf/util/annotate.h
> +++ b/tools/perf/util/annotate.h
> @@ -59,6 +59,8 @@ struct ins_operands {
>  
>  struct arch;
>  
> +bool arch__is(struct arch *arch, const char *name);
> +
>  struct ins_ops {
>  	void (*free)(struct ins_operands *ops);
>  	int (*parse)(struct arch *arch, struct ins_operands *ops, struct map_symbol *ms);
> -- 
> 2.42.0.869.gea05f2083d-goog
> 

-- 

- Arnaldo

* Re: [PATCH 23/52] perf report: Add 'symoff' sort key
  2023-11-09 23:59 ` [PATCH 23/52] perf report: Add 'symoff' " Namhyung Kim
@ 2023-12-23 14:29   ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 69+ messages in thread
From: Arnaldo Carvalho de Melo @ 2023-12-23 14:29 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Jiri Olsa, Peter Zijlstra, Ian Rogers, Adrian Hunter,
	Ingo Molnar, LKML, linux-perf-users, Linus Torvalds,
	Stephane Eranian, Masami Hiramatsu, Andi Kleen,
	linux-trace-devel, linux-toolchains

Em Thu, Nov 09, 2023 at 03:59:42PM -0800, Namhyung Kim escreveu:
> The symoff sort key prints the symbol and offset of a sample.  This is
> useful for data type profiling to show the exact instruction in the
> function which refers to the data.

Cool, perhaps we can add a "symoffexcerpt" that would print a few lines
of assembly/source before and after that offset so that we save having
to go from there to the annotate view, or even jump to the annotate view
when pressing 'A' (annotate) on top of that line in the TUI.
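
A sketch of the excerpt idea, under the assumption that the disassembly is
available as an array of (offset, text) pairs.  'struct insn_line',
excerpt_range() and excerpt_print() are hypothetical names used only for
this illustration:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical flattened disassembly line: offset within the symbol + text. */
struct insn_line {
	unsigned long offset;
	const char *text;
};

/*
 * Find the line at 'target' and compute the index range [*first, *last]
 * covering up to 'context' lines before and after it.
 * Returns 0 on success, -1 if no line has that offset.
 */
int excerpt_range(const struct insn_line *lines, size_t nr,
		  unsigned long target, size_t context,
		  size_t *first, size_t *last)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (lines[i].offset != target)
			continue;

		*first = i > context ? i - context : 0;
		*last = (nr - 1 - i > context) ? i + context : nr - 1;
		return 0;
	}
	return -1;
}

/* Print the excerpt, marking the sampled instruction with '>'. */
void excerpt_print(FILE *fp, const struct insn_line *lines, size_t nr,
		   unsigned long target, size_t context)
{
	size_t first, last, i;

	if (excerpt_range(lines, nr, target, context, &first, &last) < 0)
		return;

	for (i = first; i <= last; i++) {
		fprintf(fp, "%c %#6lx  %s\n",
			lines[i].offset == target ? '>' : ' ',
			lines[i].offset, lines[i].text);
	}
}
```

For an entry such as update_blocked_averages+0x96, the target would be
0x96 and the context a small number like 2 or 3.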

- Arnaldo
 
>   $ perf report -s type,sym,typeoff,symoff --hierarchy
>   ...
>   #       Overhead  Data Type / Symbol / Data Type Offset / Symbol Offset
>   # ..............  .....................................................
>   #
>       1.23%         struct cfs_rq
>         0.84%         update_blocked_averages
>           0.19%         struct cfs_rq +336 (leaf_cfs_rq_list.next)
>              0.19%         [k] update_blocked_averages+0x96
>           0.19%         struct cfs_rq +0 (load.weight)
>              0.14%         [k] update_blocked_averages+0x104
>              0.04%         [k] update_blocked_averages+0x31c
>           0.17%         struct cfs_rq +404 (throttle_count)
>              0.12%         [k] update_blocked_averages+0x9d
>              0.05%         [k] update_blocked_averages+0x1f9
>           0.08%         struct cfs_rq +272 (propagate)
>              0.07%         [k] update_blocked_averages+0x3d3
>              0.02%         [k] update_blocked_averages+0x45b
>   ...
> 
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/Documentation/perf-report.txt |  1 +
>  tools/perf/util/hist.h                   |  1 +
>  tools/perf/util/sort.c                   | 47 ++++++++++++++++++++++++
>  tools/perf/util/sort.h                   |  1 +
>  4 files changed, 50 insertions(+)
> 
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index b57eb51b47aa..38f59ac064f7 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -120,6 +120,7 @@ OPTIONS
>  	- simd: Flags describing a SIMD operation. "e" for empty Arm SVE predicate. "p" for partial Arm SVE predicate
>  	- type: Data type of sample memory access.
>  	- typeoff: Offset in the data type of sample memory access.
> +	- symoff: Offset in the symbol.
>  
>  	By default, comm, dso and symbol keys are used.
>  	(i.e. --sort comm,dso,symbol)
> diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
> index 941176afcebc..1ce0ee262abe 100644
> --- a/tools/perf/util/hist.h
> +++ b/tools/perf/util/hist.h
> @@ -84,6 +84,7 @@ enum hist_column {
>  	HISTC_SIMD,
>  	HISTC_TYPE,
>  	HISTC_TYPE_OFFSET,
> +	HISTC_SYMBOL_OFFSET,
>  	HISTC_NR_COLS, /* Last entry */
>  };
>  
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index d78e680d3988..0cbbd5ba8175 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -419,6 +419,52 @@ struct sort_entry sort_sym = {
>  	.se_width_idx	= HISTC_SYMBOL,
>  };
>  
> +/* --sort symoff */
> +
> +static int64_t
> +sort__symoff_cmp(struct hist_entry *left, struct hist_entry *right)
> +{
> +	int64_t ret;
> +
> +	ret = sort__sym_cmp(left, right);
> +	if (ret)
> +		return ret;
> +
> +	return left->ip - right->ip;
> +}
> +
> +static int64_t
> +sort__symoff_sort(struct hist_entry *left, struct hist_entry *right)
> +{
> +	int64_t ret;
> +
> +	ret = sort__sym_sort(left, right);
> +	if (ret)
> +		return ret;
> +
> +	return left->ip - right->ip;
> +}
> +
> +static int
> +hist_entry__symoff_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width)
> +{
> +	struct symbol *sym = he->ms.sym;
> +
> +	if (sym == NULL)
> +		return repsep_snprintf(bf, size, "[%c] %-#.*llx", he->level, width - 4, he->ip);
> +
> +	return repsep_snprintf(bf, size, "[%c] %s+0x%llx", he->level, sym->name, he->ip - sym->start);
> +}
> +
> +struct sort_entry sort_sym_offset = {
> +	.se_header	= "Symbol Offset",
> +	.se_cmp		= sort__symoff_cmp,
> +	.se_sort	= sort__symoff_sort,
> +	.se_snprintf	= hist_entry__symoff_snprintf,
> +	.se_filter	= hist_entry__sym_filter,
> +	.se_width_idx	= HISTC_SYMBOL_OFFSET,
> +};
> +
>  /* --sort srcline */
>  
>  char *hist_entry__srcline(struct hist_entry *he)
> @@ -2335,6 +2381,7 @@ static struct sort_dimension common_sort_dimensions[] = {
>  	DIM(SORT_SIMD, "simd", sort_simd),
>  	DIM(SORT_ANNOTATE_DATA_TYPE, "type", sort_type),
>  	DIM(SORT_ANNOTATE_DATA_TYPE_OFFSET, "typeoff", sort_type_offset),
> +	DIM(SORT_SYM_OFFSET, "symoff", sort_sym_offset),
>  };
>  
>  #undef DIM
> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index d806adcc1e1e..6f6b4189a389 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -249,6 +249,7 @@ enum sort_type {
>  	SORT_SIMD,
>  	SORT_ANNOTATE_DATA_TYPE,
>  	SORT_ANNOTATE_DATA_TYPE_OFFSET,
> +	SORT_SYM_OFFSET,
>  
>  	/* branch stack specific sort keys */
>  	__SORT_BRANCH_STACK,
> -- 
> 2.42.0.869.gea05f2083d-goog
> 

-- 

- Arnaldo

Thread overview: 69+ messages
2023-11-09 23:59 [RFC 00/52] perf tools: Introduce data type profiling (v2) Namhyung Kim
2023-11-09 23:59 ` [PATCH 01/52] perf annotate: Pass "-l" option to objdump conditionally Namhyung Kim
2023-11-09 23:59 ` [PATCH 02/52] perf annotate: Move raw_comment and raw_func_start Namhyung Kim
2023-11-09 23:59 ` [PATCH 03/52] perf tools: Add util/debuginfo.[ch] files Namhyung Kim
2023-11-09 23:59 ` [PATCH 04/52] perf dwarf-aux: Fix die_get_typename() for void * Namhyung Kim
2023-11-09 23:59 ` [PATCH 05/52] perf dwarf-aux: Move #ifdef code to the header file Namhyung Kim
2023-11-09 23:59 ` [PATCH 06/52] perf dwarf-aux: Add die_get_scopes() helper Namhyung Kim
2023-11-09 23:59 ` [PATCH 07/52] perf dwarf-aux: Add die_find_variable_by_reg() helper Namhyung Kim
2023-11-09 23:59 ` [PATCH 08/52] perf build: Add feature check for dwarf_getcfi() Namhyung Kim
2023-11-10 10:26   ` Masami Hiramatsu
2023-11-09 23:59 ` [PATCH 09/52] perf probe: Convert to check dwarf_getcfi feature Namhyung Kim
2023-11-10 10:25   ` Masami Hiramatsu
2023-11-09 23:59 ` [PATCH 10/52] perf dwarf-aux: Factor out die_get_typename_from_type() Namhyung Kim
2023-11-09 23:59 ` [PATCH 11/52] perf dwarf-regs: Add get_dwarf_regnum() Namhyung Kim
2023-11-09 23:59 ` [PATCH 12/52] perf annotate-data: Add find_data_type() Namhyung Kim
     [not found]   ` <CA+JHD90fkWNrQWO5DrHeV8mCmFyKKqJ8fV=KwztRi7TSw+8yDg@mail.gmail.com>
2023-11-20 20:43     ` Namhyung Kim
2023-11-09 23:59 ` [PATCH 13/52] perf annotate-data: Add dso->data_types tree Namhyung Kim
2023-12-21 20:10   ` Arnaldo Carvalho de Melo
2023-12-21 20:13     ` Arnaldo Carvalho de Melo
2023-12-21 20:32       ` Arnaldo Carvalho de Melo
2023-11-09 23:59 ` [PATCH 14/52] perf annotate: Factor out evsel__get_arch() Namhyung Kim
2023-12-23 14:14   ` Arnaldo Carvalho de Melo
2023-11-09 23:59 ` [PATCH 15/52] perf annotate: Check if operand has multiple regs Namhyung Kim
2023-11-09 23:59 ` [PATCH 16/52] perf annotate: Add annotate_get_insn_location() Namhyung Kim
2023-11-09 23:59 ` [PATCH 17/52] perf annotate: Implement hist_entry__get_data_type() Namhyung Kim
2023-11-09 23:59 ` [PATCH 18/52] perf report: Add 'type' sort key Namhyung Kim
2023-11-21 17:55   ` Arnaldo Carvalho de Melo
2023-11-22 18:49     ` Namhyung Kim
2023-11-22 19:54       ` Arnaldo Carvalho de Melo
2023-11-22 21:13         ` Namhyung Kim
2023-11-23 13:40           ` Arnaldo Carvalho de Melo
2023-11-09 23:59 ` [PATCH 19/52] perf report: Support data type profiling Namhyung Kim
2023-11-09 23:59 ` [PATCH 20/52] perf annotate-data: Add member field in the data type Namhyung Kim
2023-11-09 23:59 ` [PATCH 21/52] perf annotate-data: Update sample histogram for type Namhyung Kim
2023-11-09 23:59 ` [PATCH 22/52] perf report: Add 'typeoff' sort key Namhyung Kim
2023-11-09 23:59 ` [PATCH 23/52] perf report: Add 'symoff' " Namhyung Kim
2023-12-23 14:29   ` Arnaldo Carvalho de Melo
2023-11-09 23:59 ` [PATCH 24/52] perf annotate: Add --data-type option Namhyung Kim
2023-11-09 23:59 ` [PATCH 25/52] perf annotate: Support event group display Namhyung Kim
2023-11-09 23:59 ` [PATCH 26/52] perf annotate: Add --type-stat option for debugging Namhyung Kim
2023-11-09 23:59 ` [PATCH 27/52] perf annotate: Add --insn-stat " Namhyung Kim
2023-11-09 23:59 ` [PATCH 28/52] perf annotate-data: Parse 'lock' prefix from llvm-objdump Namhyung Kim
2023-11-09 23:59 ` [PATCH 29/52] perf annotate-data: Handle macro fusion on x86 Namhyung Kim
2023-11-09 23:59 ` [PATCH 30/52] perf annotate-data: Handle array style accesses Namhyung Kim
2023-11-09 23:59 ` [PATCH 31/52] perf annotate-data: Add stack operation pseudo type Namhyung Kim
2023-11-09 23:59 ` [PATCH 32/52] perf dwarf-aux: Add die_find_variable_by_addr() Namhyung Kim
2023-11-27 22:07   ` Arnaldo Carvalho de Melo
2023-11-09 23:59 ` [PATCH 33/52] perf annotate-data: Handle PC-relative addressing Namhyung Kim
2023-11-09 23:59 ` [PATCH 34/52] perf annotate-data: Support global variables Namhyung Kim
2023-11-09 23:59 ` [PATCH 35/52] perf dwarf-aux: Add die_get_cfa() Namhyung Kim
2023-11-09 23:59 ` [PATCH 36/52] perf annotate-data: Support stack variables Namhyung Kim
2023-11-09 23:59 ` [PATCH 37/52] perf dwarf-aux: Check allowed DWARF Ops Namhyung Kim
2023-11-09 23:59 ` [PATCH 38/52] perf dwarf-aux: Add die_collect_vars() Namhyung Kim
2023-11-09 23:59 ` [PATCH 39/52] perf dwarf-aux: Handle type transfer for memory access Namhyung Kim
2023-11-09 23:59 ` [PATCH 40/52] perf annotate-data: Introduce struct data_loc_info Namhyung Kim
2023-11-10  0:00 ` [PATCH 41/52] perf map: Add map__objdump_2rip() Namhyung Kim
2023-11-10  0:00 ` [PATCH 42/52] perf annotate: Add annotate_get_basic_blocks() Namhyung Kim
2023-11-10  0:00 ` [PATCH 43/52] perf annotate-data: Maintain variable type info Namhyung Kim
2023-11-10  0:00 ` [PATCH 44/52] perf annotate-data: Add update_insn_state() Namhyung Kim
2023-11-10  0:00 ` [PATCH 45/52] perf annotate-data: Handle global variable access Namhyung Kim
2023-11-10  0:00 ` [PATCH 46/52] perf annotate-data: Handle call instructions Namhyung Kim
2023-11-10  0:00 ` [PATCH 47/52] perf annotate-data: Implement instruction tracking Namhyung Kim
2023-11-10  0:00 ` [PATCH 48/52] perf annotate: Parse x86 segment register location Namhyung Kim
2023-11-10  0:00 ` [PATCH 49/52] perf annotate-data: Handle this-cpu variables in kernel Namhyung Kim
2023-11-10  0:00 ` [PATCH 50/52] perf annotate-data: Track instructions with a this-cpu variable Namhyung Kim
2023-11-10  0:00 ` [PATCH 51/52] perf annotate-data: Add stack canary type Namhyung Kim
2023-11-10  0:00 ` [PATCH 52/52] perf annotate-data: Add debug message Namhyung Kim
2023-11-10 12:05 ` [RFC 00/52] perf tools: Introduce data type profiling (v2) Arnaldo Carvalho de Melo
2023-11-11  2:27   ` Namhyung Kim
