* [PATCH RESEND WITH CCs v3 1/4] perf tools: record aarch64 registers automatically
@ 2021-03-04 16:32 Alexandre Truong
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 2/4] perf tools: add a mechanism to inject stack frames Alexandre Truong
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Alexandre Truong @ 2021-03-04 16:32 UTC (permalink / raw)
To: linux-kernel, linux-perf-users
Cc: Alexandre Truong, John Garry, Will Deacon, Mathieu Poirier,
Leo Yan, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Kemeng Shi, Ian Rogers, Andi Kleen, Kan Liang, Jin Yao,
Adrian Hunter, Suzuki K Poulose, Al Grant, James Clark,
Wilco Dijkstra
On arm64, automatically record all user registers when frame pointer
mode is on. They are used for a DWARF unwind that finds the caller
of the leaf frame when the frame pointer was omitted.
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Wilco Dijkstra <wilco.dijkstra@arm.com>
---
tools/perf/arch/arm64/util/machine.c | 7 +++++++
tools/perf/builtin-record.c | 7 +++++++
tools/perf/util/callchain.h | 2 ++
3 files changed, 16 insertions(+)
diff --git a/tools/perf/arch/arm64/util/machine.c b/tools/perf/arch/arm64/util/machine.c
index 40c5e0b5bda8..bf2f9c447867 100644
--- a/tools/perf/arch/arm64/util/machine.c
+++ b/tools/perf/arch/arm64/util/machine.c
@@ -5,6 +5,8 @@
#include <string.h>
#include "debug.h"
#include "symbol.h"
+#include "callchain.h"
+#include "record.h"
/* On arm64, kernel text segment start at high memory address,
* for example 0xffff 0000 8xxx xxxx. Modules start at a low memory
@@ -26,3 +28,8 @@ void arch__symbols__fixup_end(struct symbol *p, struct symbol *c)
p->end = c->start;
pr_debug4("%s sym:%s end:%#" PRIx64 "\n", __func__, p->name, p->end);
}
+
+void arch__add_leaf_frame_record_opts(struct record_opts *opts)
+{
+ opts->sample_user_regs = arch__user_reg_mask();
+}
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 8a0127d4fb52..496307ef490e 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2244,6 +2244,10 @@ static int record__parse_mmap_pages(const struct option *opt,
return ret;
}
+void __weak arch__add_leaf_frame_record_opts(struct record_opts *opts __maybe_unused)
+{
+}
+
static int parse_control_option(const struct option *opt,
const char *str,
int unset __maybe_unused)
@@ -2813,6 +2817,9 @@ int cmd_record(int argc, const char **argv)
/* Enable ignoring missing threads when -u/-p option is defined. */
rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX || rec->opts.target.pid;
+ if (callchain_param.enabled && callchain_param.record_mode == CALLCHAIN_FP)
+ arch__add_leaf_frame_record_opts(&rec->opts);
+
err = -ENOMEM;
if (evlist__create_maps(rec->evlist, &rec->opts.target) < 0)
usage_with_options(record_usage, record_options);
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 5824134f983b..77fba053c677 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -280,6 +280,8 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused,
}
#endif
+void arch__add_leaf_frame_record_opts(struct record_opts *opts);
+
char *callchain_list__sym_name(struct callchain_list *cl,
char *bf, size_t bfsize, bool show_dso);
char *callchain_node__scnprintf_value(struct callchain_node *node,
--
2.23.0
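
The gating added to cmd_record() above can be sketched in isolation. This is a minimal illustration, not the perf code: `struct callchain_cfg`, `struct rec_opts`, `setup_record_opts()` and `ARM64_ALL_REGS_MASK` are hypothetical stand-ins, and the 33-register mask is an assumption about what arch__user_reg_mask() returns on arm64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for perf's callchain_param and record_opts. */
enum record_mode { MODE_NONE, MODE_FP, MODE_DWARF, MODE_LBR };

struct callchain_cfg {
	bool enabled;
	enum record_mode mode;
};

struct rec_opts {
	uint64_t sample_user_regs;
};

/* Assumed mask: 33 arm64 registers (x0-x30, sp, pc), one bit each. */
#define ARM64_ALL_REGS_MASK ((1ULL << 33) - 1)

/* Arch hook: request every user register, as the arm64 override does. */
static void add_leaf_frame_record_opts(struct rec_opts *opts)
{
	assert(opts != NULL);
	opts->sample_user_regs = ARM64_ALL_REGS_MASK;
}

/* Caller side: only invoke the hook for frame pointer call graphs. */
static void setup_record_opts(const struct callchain_cfg *cfg,
			      struct rec_opts *opts)
{
	if (cfg->enabled && cfg->mode == MODE_FP)
		add_leaf_frame_record_opts(opts);
}
```

Note that, like the patch, the hook assigns the mask rather than OR-ing it into any value already present in the options.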
* [PATCH RESEND WITH CCs v3 2/4] perf tools: add a mechanism to inject stack frames
2021-03-04 16:32 [PATCH RESEND WITH CCs v3 1/4] perf tools: record aarch64 registers automatically Alexandre Truong
@ 2021-03-04 16:32 ` Alexandre Truong
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 3/4] perf tools: enable dwarf_callchain_users on aarch64 Alexandre Truong
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address Alexandre Truong
2 siblings, 0 replies; 12+ messages in thread
From: Alexandre Truong @ 2021-03-04 16:32 UTC (permalink / raw)
To: linux-kernel, linux-perf-users
Cc: Alexandre Truong, John Garry, Will Deacon, Mathieu Poirier,
Leo Yan, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Kemeng Shi, Ian Rogers, Andi Kleen, Kan Liang, Jin Yao,
Adrian Hunter, Suzuki K Poulose, Al Grant, James Clark,
Wilco Dijkstra
Add a mechanism for platforms to inject a stack frame for the caller
of the leaf frame when DWARF or another post-processing mechanism
provides enough information to determine that a frame is missing.
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Wilco Dijkstra <wilco.dijkstra@arm.com>
---
tools/perf/util/machine.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index ab8a6b3e801d..7f03ffa016b0 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2671,6 +2671,12 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
return err;
}
+static u64 get_leaf_frame_caller(struct perf_sample *sample __maybe_unused,
+ struct thread *thread __maybe_unused)
+{
+ return 0;
+}
+
static int thread__resolve_callchain_sample(struct thread *thread,
struct callchain_cursor *cursor,
struct evsel *evsel,
@@ -2687,6 +2693,8 @@ static int thread__resolve_callchain_sample(struct thread *thread,
int i, j, err, nr_entries;
int skip_idx = -1;
int first_call = 0;
+ u64 leaf_frame_caller;
+ int pos;
if (chain)
chain_nr = chain->nr;
@@ -2811,6 +2819,21 @@ static int thread__resolve_callchain_sample(struct thread *thread,
continue;
}
+ pos = callchain_param.order == ORDER_CALLEE ? 2 : chain_nr - 2;
+
+ if (i == pos) {
+ leaf_frame_caller = get_leaf_frame_caller(sample, thread);
+
+ if (leaf_frame_caller && leaf_frame_caller != ip) {
+
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al, &cpumode, leaf_frame_caller,
+ false, NULL, NULL, 0);
+ if (err)
+ return (err < 0) ? err : 0;
+ }
+ }
+
err = add_callchain_ip(thread, cursor, parent,
root_al, &cpumode, ip,
false, NULL, NULL, 0);
--
2.23.0
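
The injection step in the hunk above can be sketched on a plain array. This is a simplified illustration under stated assumptions: `inject_leaf_caller()` is a hypothetical helper, the chain is in callee order with the leaf at index 0, and the extra entries that precede the leaf in perf's raw chain are not modelled, so the index used here differs from the one in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy `n` frames from `in` (callee order: in[0] is the leaf) to `out`,
 * inserting `leaf_caller` right after the leaf when it is non-zero and
 * differs from the frame the frame pointer walk already found there.
 * Returns the number of frames written; `out` must hold n + 1 entries.
 */
static size_t inject_leaf_caller(const uint64_t *in, size_t n,
				 uint64_t leaf_caller, uint64_t *out)
{
	size_t w = 0;

	assert(in != NULL && out != NULL);

	for (size_t i = 0; i < n; i++) {
		/* i == 1 is the slot directly above the leaf frame. */
		if (i == 1 && leaf_caller && leaf_caller != in[i])
			out[w++] = leaf_caller;
		out[w++] = in[i];
	}
	return w;
}
```

As in the patch, a zero result from the lookup or an address equal to the frame already in that slot leaves the chain unchanged.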
* [PATCH RESEND WITH CCs v3 3/4] perf tools: enable dwarf_callchain_users on aarch64
2021-03-04 16:32 [PATCH RESEND WITH CCs v3 1/4] perf tools: record aarch64 registers automatically Alexandre Truong
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 2/4] perf tools: add a mechanism to inject stack frames Alexandre Truong
@ 2021-03-04 16:32 ` Alexandre Truong
2021-03-05 11:51 ` Leo Yan
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address Alexandre Truong
2 siblings, 1 reply; 12+ messages in thread
From: Alexandre Truong @ 2021-03-04 16:32 UTC (permalink / raw)
To: linux-kernel, linux-perf-users
Cc: Alexandre Truong, John Garry, Will Deacon, Mathieu Poirier,
Leo Yan, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Kemeng Shi, Ian Rogers, Andi Kleen, Kan Liang, Jin Yao,
Adrian Hunter, Suzuki K Poulose, Al Grant, James Clark,
Wilco Dijkstra
On arm64, enable dwarf_callchain_users, which is needed to do
a DWARF unwind to get the caller of the leaf frame.
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Wilco Dijkstra <wilco.dijkstra@arm.com>
---
tools/perf/builtin-report.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 2a845d6cac09..93661a3eaeb1 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -405,6 +405,10 @@ static int report__setup_sample_type(struct report *rep)
callchain_param_setup(sample_type);
+ if (callchain_param.record_mode == CALLCHAIN_FP &&
+ strncmp(rep->session->header.env.arch, "aarch64", 7) == 0)
+ dwarf_callchain_users = true;
+
if (rep->stitch_lbr && (callchain_param.record_mode != CALLCHAIN_LBR)) {
ui__warning("Can't find LBR callchain. Switch off --stitch-lbr.\n"
"Please apply --call-graph lbr when recording.\n");
--
2.23.0
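
The condition added in the hunk above can be written as a small predicate. This is a sketch only: `arch_is_aarch64()` and `needs_dwarf_callchain_users()` are hypothetical names, the enum is a stand-in for perf's record modes, and the NULL guard is an addition that the posted patch does not have.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for perf's callchain record modes. */
enum record_mode { MODE_FP, MODE_DWARF, MODE_LBR };

/*
 * Prefix match on the first 7 characters, like the strncmp() in the
 * patch, so "aarch64_be" also matches. The NULL guard is an addition
 * of this sketch.
 */
static bool arch_is_aarch64(const char *arch)
{
	return arch != NULL && strncmp(arch, "aarch64", 7) == 0;
}

/* True when an FP call graph still needs the DWARF unwinder set up. */
static bool needs_dwarf_callchain_users(enum record_mode mode,
					const char *arch)
{
	return mode == MODE_FP && arch_is_aarch64(arch);
}
```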
* [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address
2021-03-04 16:32 [PATCH RESEND WITH CCs v3 1/4] perf tools: record aarch64 registers automatically Alexandre Truong
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 2/4] perf tools: add a mechanism to inject stack frames Alexandre Truong
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 3/4] perf tools: enable dwarf_callchain_users on aarch64 Alexandre Truong
@ 2021-03-04 16:32 ` Alexandre Truong
2021-03-05 8:54 ` James Clark
2 siblings, 1 reply; 12+ messages in thread
From: Alexandre Truong @ 2021-03-04 16:32 UTC (permalink / raw)
To: linux-kernel, linux-perf-users
Cc: Alexandre Truong, John Garry, Will Deacon, Mathieu Poirier,
Leo Yan, Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Kemeng Shi, Ian Rogers, Andi Kleen, Kan Liang, Jin Yao,
Adrian Hunter, Suzuki K Poulose, Al Grant, James Clark,
Wilco Dijkstra
On arm64 in frame pointer mode (e.g. perf record --call-graph fp),
use DWARF unwind info to check whether the link register holds the
return address, so that it can be injected into the frame pointer stack.
Write the following application:

    int a = 10;

    void f2(void)
    {
            for (int i = 0; i < 1000000; i++)
                    a *= a;
    }

    void f1()
    {
            for (int i = 0; i < 10; i++)
                    f2();
    }

    int main (void)
    {
            f1();
            return 0;
    }

with the following compilation flags:

    gcc -fno-omit-frame-pointer -fno-inline -O2

The compiler omits the frame pointer for f2 on arm. This is a problem
with any leaf call: for example, an application with many different
calls to malloc() would always lose the calling frame, even when it
can be determined.

    ./perf record --call-graph fp ./a.out
    ./perf report

currently gives the following stack:

    0xffffea52f361
    _start
    __libc_start_main
    main
    f2

After this change, perf report correctly shows f1() calling f2(),
even though it was missing from the frame pointer unwind:

    ./perf report

    0xffffea52f361
    _start
    __libc_start_main
    main
    f1
    f2

Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Wilco Dijkstra <wilco.dijkstra@arm.com>
---
tools/perf/util/Build | 1 +
.../util/arm-frame-pointer-unwind-support.c | 44 +++++++++++++++++++
.../util/arm-frame-pointer-unwind-support.h | 7 +++
tools/perf/util/machine.c | 9 ++--
4 files changed, 58 insertions(+), 3 deletions(-)
create mode 100644 tools/perf/util/arm-frame-pointer-unwind-support.c
create mode 100644 tools/perf/util/arm-frame-pointer-unwind-support.h
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 188521f34347..3b82cb992bce 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -1,3 +1,4 @@
+perf-y += arm-frame-pointer-unwind-support.o
perf-y += annotate.o
perf-y += block-info.o
perf-y += block-range.o
diff --git a/tools/perf/util/arm-frame-pointer-unwind-support.c b/tools/perf/util/arm-frame-pointer-unwind-support.c
new file mode 100644
index 000000000000..964efd08e72e
--- /dev/null
+++ b/tools/perf/util/arm-frame-pointer-unwind-support.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "../arch/arm64/include/uapi/asm/perf_regs.h"
+#include "arch/arm64/include/perf_regs.h"
+#include "event.h"
+#include "arm-frame-pointer-unwind-support.h"
+#include "callchain.h"
+#include "unwind.h"
+
+struct entries {
+ u64 stack[2];
+ size_t length;
+};
+
+static bool get_leaf_frame_caller_enabled(struct perf_sample *sample)
+{
+ return callchain_param.record_mode == CALLCHAIN_FP && sample->user_regs.regs
+ && sample->user_regs.mask == PERF_REGS_MASK;
+}
+
+static int add_entry(struct unwind_entry *entry, void *arg)
+{
+ struct entries *entries = arg;
+
+ entries->stack[entries->length++] = entry->ip;
+ return 0;
+}
+
+u64 get_leaf_frame_caller_aarch64(struct perf_sample *sample, struct thread *thread)
+{
+ int ret;
+
+ struct entries entries = {{0, 0}, 0};
+
+ if (!get_leaf_frame_caller_enabled(sample))
+ return 0;
+
+ ret = unwind__get_entries(add_entry, &entries, thread, sample, 2);
+
+ if (ret || entries.length != 2)
+ return ret;
+
+ return callchain_param.order == ORDER_CALLER ?
+ entries.stack[0] : entries.stack[1];
+}
diff --git a/tools/perf/util/arm-frame-pointer-unwind-support.h b/tools/perf/util/arm-frame-pointer-unwind-support.h
new file mode 100644
index 000000000000..16dc03fa9abe
--- /dev/null
+++ b/tools/perf/util/arm-frame-pointer-unwind-support.h
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_ARM_FRAME_POINTER_UNWIND_SUPPORT_H
+#define __PERF_ARM_FRAME_POINTER_UNWIND_SUPPORT_H
+
+u64 get_leaf_frame_caller_aarch64(struct perf_sample *sample, struct thread *thread);
+
+#endif /* __PERF_ARM_FRAME_POINTER_UNWIND_SUPPORT_H */
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 7f03ffa016b0..dfb72dbc0e2d 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -34,6 +34,7 @@
#include "bpf-event.h"
#include <internal/lib.h> // page_size
#include "cgroup.h"
+#include "arm-frame-pointer-unwind-support.h"
#include <linux/ctype.h>
#include <symbol/kallsyms.h>
@@ -2671,10 +2672,12 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
return err;
}
-static u64 get_leaf_frame_caller(struct perf_sample *sample __maybe_unused,
- struct thread *thread __maybe_unused)
+static u64 get_leaf_frame_caller(struct perf_sample *sample, struct thread *thread)
{
- return 0;
+ if (strncmp(thread->maps->machine->env->arch, "aarch64", 7) == 0)
+ return get_leaf_frame_caller_aarch64(sample, thread);
+ else
+ return 0;
}
static int thread__resolve_callchain_sample(struct thread *thread,
--
2.23.0
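
The two-entry lookup in get_leaf_frame_caller_aarch64() can be sketched with a stub in place of the real unwinder. Everything here is illustrative: `fake_unwind()` and `leaf_frame_caller()` are hypothetical, the stub reports frames leaf first, and the bounds check in the callback is an addition. The sketch also returns 0 on every failure, whereas the posted code returns `ret` through a u64, where a negative error would convert to a large non-zero value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ENTRIES 2

struct entries {
	uint64_t stack[MAX_ENTRIES];
	size_t length;
};

/* Callback in the style of add_entry(); the bounds check is an addition. */
static int add_entry(uint64_t ip, void *arg)
{
	struct entries *entries = arg;

	if (entries->length >= MAX_ENTRIES)
		return -1;
	entries->stack[entries->length++] = ip;
	return 0;
}

/* Hypothetical unwinder: reports up to max_stack ips, leaf first. */
static int fake_unwind(int (*cb)(uint64_t, void *), void *arg,
		       const uint64_t *ips, size_t n, size_t max_stack)
{
	for (size_t i = 0; i < n && i < max_stack; i++) {
		int ret = cb(ips[i], arg);

		if (ret)
			return ret;
	}
	return 0;
}

/* Returns the leaf's caller, or 0 when the unwind fails or is too short. */
static uint64_t leaf_frame_caller(const uint64_t *ips, size_t n)
{
	struct entries entries = { { 0, 0 }, 0 };

	if (fake_unwind(add_entry, &entries, ips, n, MAX_ENTRIES) ||
	    entries.length != MAX_ENTRIES)
		return 0;

	return entries.stack[1];
}
```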
* Re: [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address Alexandre Truong
@ 2021-03-05 8:54 ` James Clark
2021-03-06 12:55 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 12+ messages in thread
From: James Clark @ 2021-03-05 8:54 UTC (permalink / raw)
To: Alexandre Truong, linux-kernel, linux-perf-users
Cc: John Garry, Will Deacon, Mathieu Poirier, Leo Yan,
Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Kemeng Shi, Ian Rogers, Andi Kleen, Kan Liang, Jin Yao,
Adrian Hunter, Suzuki K Poulose, Al Grant, Wilco Dijkstra
I've tested this patchset on a few different applications and have seen it significantly improve
the quality of frame pointer stacks on aarch64. For example, with GDB 10 and default build options,
'bfd_calc_gnu_debuglink_crc32' is a leaf function, and its caller 'gdb_bfd_crc' is omitted,
but with the patchset it is included. I've also confirmed that this is correct by looking at
the source code.
Before:
# Children Self Command Shared Object Symbol
# ........ ........ ............... .......................... ...........
#
34.55% 0.00% gdb-100 gdb-100 [.] _start
0.78%
_start
__libc_start_main
main
gdb_main
captured_command_loop
gdb_do_one_event
check_async_event_handlers
fetch_inferior_event
inferior_event_handler
do_all_inferior_continuations
attach_post_wait
post_create_inferior
svr4_solib_create_inferior_hook
solib_add
solib_read_symbols
symbol_file_add_with_addrs
read_symbols
elf_symfile_read
find_separate_debug_file_by_debuglink[abi:cxx11]
find_separate_debug_file
separate_debug_file_exists
gdb_bfd_crc
bfd_calc_gnu_debuglink_crc32
After:
# Children Self Command Shared Object Symbol
# ........ ........ ............... .......................... ...........
#
34.55% 0.00% gdb-100 gdb-100 [.] _start
0.78%
_start
__libc_start_main
main
gdb_main
captured_command_loop
gdb_do_one_event
check_async_event_handlers
fetch_inferior_event
inferior_event_handler
do_all_inferior_continuations
attach_post_wait
post_create_inferior
svr4_solib_create_inferior_hook
solib_add
solib_read_symbols
symbol_file_add_with_addrs
read_symbols
elf_symfile_read
find_separate_debug_file_by_debuglink[abi:cxx11]
find_separate_debug_file
separate_debug_file_exists
get_file_crc <--------------------- leaf frame caller added
bfd_calc_gnu_debuglink_crc32
There is a question about whether the overhead of recording all the registers is acceptable, in
terms of file size and time. We could make it a manual step, at the cost of not showing better
frame pointer stacks by default.
Tested-by: James Clark <james.clark@arm.com>
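
For a rough sense of the file size side of that question, the per-sample cost of user register capture can be estimated from the mask. This is a back-of-the-envelope sketch: the helper names are hypothetical, and it assumes the PERF_SAMPLE_REGS_USER payload is one u64 ABI word followed by one u64 per selected register. With an assumed 33-bit arm64 mask that comes to 8 * (1 + 33) = 272 bytes per sample.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count set bits in the register mask (portable popcount). */
static unsigned int mask_weight(uint64_t mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/*
 * Bytes added to each sample by user register capture: one u64 ABI
 * word followed by one u64 per register selected in the mask.
 */
static size_t user_regs_sample_bytes(uint64_t mask)
{
	return sizeof(uint64_t) * (1 + mask_weight(mask));
}
```

The time overhead is not covered by this estimate and would need to be measured.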
On 04/03/2021 18:32, Alexandre Truong wrote:
> On arm64 and frame pointer mode (e.g: perf record --callgraph fp),
> use dwarf unwind info to check if the link register is the return
> address in order to inject it to the frame pointer stack.
>
> Write the following application:
>
> int a = 10;
>
> void f2(void)
> {
> for (int i = 0; i < 1000000; i++)
> a *= a;
> }
>
> void f1()
> {
> for (int i = 0; i < 10; i++)
> f2();
> }
>
> int main (void)
> {
> f1();
> return 0;
> }
>
> with the following compilation flags:
> gcc -fno-omit-frame-pointer -fno-inline -O2
>
> The compiler omits the frame pointer for f2 on arm. This is a problem
> with any leaf call, for example an application with many different
> calls to malloc() would always omit the calling frame, even if it
> can be determined.
>
> ./perf record --call-graph fp ./a.out
> ./perf report
>
> currently gives the following stack:
>
> 0xffffea52f361
> _start
> __libc_start_main
> main
> f2
>
> After this change, perf report correctly shows f1() calling f2(),
> even though it was missing from the frame pointer unwind:
>
> ./perf report
>
> 0xffffea52f361
> _start
> __libc_start_main
> main
> f1
> f2
>
> Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Leo Yan <leo.yan@linaro.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Kemeng Shi <shikemeng@huawei.com>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Kan Liang <kan.liang@linux.intel.com>
> Cc: Jin Yao <yao.jin@linux.intel.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Al Grant <al.grant@arm.com>
> Cc: James Clark <james.clark@arm.com>
> Cc: Wilco Dijkstra <wilco.dijkstra@arm.com>
> ---
> tools/perf/util/Build | 1 +
> .../util/arm-frame-pointer-unwind-support.c | 44 +++++++++++++++++++
> .../util/arm-frame-pointer-unwind-support.h | 7 +++
> tools/perf/util/machine.c | 9 ++--
> 4 files changed, 58 insertions(+), 3 deletions(-)
> create mode 100644 tools/perf/util/arm-frame-pointer-unwind-support.c
> create mode 100644 tools/perf/util/arm-frame-pointer-unwind-support.h
>
> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index 188521f34347..3b82cb992bce 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -1,3 +1,4 @@
> +perf-y += arm-frame-pointer-unwind-support.o
> perf-y += annotate.o
> perf-y += block-info.o
> perf-y += block-range.o
> diff --git a/tools/perf/util/arm-frame-pointer-unwind-support.c b/tools/perf/util/arm-frame-pointer-unwind-support.c
> new file mode 100644
> index 000000000000..964efd08e72e
> --- /dev/null
> +++ b/tools/perf/util/arm-frame-pointer-unwind-support.c
> @@ -0,0 +1,44 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include "../arch/arm64/include/uapi/asm/perf_regs.h"
> +#include "arch/arm64/include/perf_regs.h"
> +#include "event.h"
> +#include "arm-frame-pointer-unwind-support.h"
> +#include "callchain.h"
> +#include "unwind.h"
> +
> +struct entries {
> + u64 stack[2];
> + size_t length;
> +};
> +
> +static bool get_leaf_frame_caller_enabled(struct perf_sample *sample)
> +{
> + return callchain_param.record_mode == CALLCHAIN_FP && sample->user_regs.regs
> + && sample->user_regs.mask == PERF_REGS_MASK;
> +}
> +
> +static int add_entry(struct unwind_entry *entry, void *arg)
> +{
> + struct entries *entries = arg;
> +
> + entries->stack[entries->length++] = entry->ip;
> + return 0;
> +}
> +
> +u64 get_leaf_frame_caller_aarch64(struct perf_sample *sample, struct thread *thread)
> +{
> + int ret;
> +
> + struct entries entries = {{0, 0}, 0};
> +
> + if (!get_leaf_frame_caller_enabled(sample))
> + return 0;
> +
> + ret = unwind__get_entries(add_entry, &entries, thread, sample, 2);
> +
> + if (ret || entries.length != 2)
> + return ret;
> +
> + return callchain_param.order == ORDER_CALLER ?
> + entries.stack[0] : entries.stack[1];
> +}
> diff --git a/tools/perf/util/arm-frame-pointer-unwind-support.h b/tools/perf/util/arm-frame-pointer-unwind-support.h
> new file mode 100644
> index 000000000000..16dc03fa9abe
> --- /dev/null
> +++ b/tools/perf/util/arm-frame-pointer-unwind-support.h
> @@ -0,0 +1,7 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __PERF_ARM_FRAME_POINTER_UNWIND_SUPPORT_H
> +#define __PERF_ARM_FRAME_POINTER_UNWIND_SUPPORT_H
> +
> +u64 get_leaf_frame_caller_aarch64(struct perf_sample *sample, struct thread *thread);
> +
> +#endif /* __PERF_ARM_FRAME_POINTER_UNWIND_SUPPORT_H */
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 7f03ffa016b0..dfb72dbc0e2d 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -34,6 +34,7 @@
> #include "bpf-event.h"
> #include <internal/lib.h> // page_size
> #include "cgroup.h"
> +#include "arm-frame-pointer-unwind-support.h"
>
> #include <linux/ctype.h>
> #include <symbol/kallsyms.h>
> @@ -2671,10 +2672,12 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
> return err;
> }
>
> -static u64 get_leaf_frame_caller(struct perf_sample *sample __maybe_unused,
> - struct thread *thread __maybe_unused)
> +static u64 get_leaf_frame_caller(struct perf_sample *sample, struct thread *thread)
> {
> - return 0;
> + if (strncmp(thread->maps->machine->env->arch, "aarch64", 7) == 0)
> + return get_leaf_frame_caller_aarch64(sample, thread);
> + else
> + return 0;
> }
>
> static int thread__resolve_callchain_sample(struct thread *thread,
>
* Re: [PATCH RESEND WITH CCs v3 3/4] perf tools: enable dwarf_callchain_users on aarch64
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 3/4] perf tools: enable dwarf_callchain_users on aarch64 Alexandre Truong
@ 2021-03-05 11:51 ` Leo Yan
2021-03-05 14:07 ` Leo Yan
0 siblings, 1 reply; 12+ messages in thread
From: Leo Yan @ 2021-03-05 11:51 UTC (permalink / raw)
To: Alexandre Truong
Cc: linux-kernel, linux-perf-users, John Garry, Will Deacon,
Mathieu Poirier, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Kemeng Shi, Ian Rogers, Andi Kleen,
Kan Liang, Jin Yao, Adrian Hunter, Suzuki K Poulose, Al Grant,
James Clark, Wilco Dijkstra
Hi Alexandre,
On Thu, Mar 04, 2021 at 04:32:54PM +0000, Alexandre Truong wrote:
> On arm64, enable dwarf_callchain_users which will be needed
> to do a dwarf unwind in order to get the caller of the leaf frame.
>
> Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
> Cc: John Garry <john.garry@huawei.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Leo Yan <leo.yan@linaro.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Kemeng Shi <shikemeng@huawei.com>
> Cc: Ian Rogers <irogers@google.com>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Kan Liang <kan.liang@linux.intel.com>
> Cc: Jin Yao <yao.jin@linux.intel.com>
> Cc: Adrian Hunter <adrian.hunter@intel.com>
> Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> Cc: Al Grant <al.grant@arm.com>
> Cc: James Clark <james.clark@arm.com>
> Cc: Wilco Dijkstra <wilco.dijkstra@arm.com>
> ---
> tools/perf/builtin-report.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 2a845d6cac09..93661a3eaeb1 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -405,6 +405,10 @@ static int report__setup_sample_type(struct report *rep)
>
> callchain_param_setup(sample_type);
>
> + if (callchain_param.record_mode == CALLCHAIN_FP &&
> + strncmp(rep->session->header.env.arch, "aarch64", 7) == 0)
> + dwarf_callchain_users = true;
> +
I am not familiar with DWARF or FP unwinding.
This patch looks suspicious to me, since it only fixes the issue for
the "perf report" command and does not cover "perf script".
I did a quick test of the "perf script" command with the test code from
patch 04; it seems the FP omission issue is not fixed for
"perf script":
arm64_fp_test 11211 2282.355095: 176307 cycles:
aaaac2e40740 f2+0x10 (/root/arm64_fp_test)
aaaac2e4061c main+0xc (/root/arm64_fp_test)
ffff961fbd24 __libc_start_main+0xe4 (/usr/lib/aarch64-linux-gnu/libc-2.28.so)
aaaac2e4065c _start+0x34 (/root/arm64_fp_test)
Could you check this? Thanks!
Leo
> if (rep->stitch_lbr && (callchain_param.record_mode != CALLCHAIN_LBR)) {
> ui__warning("Can't find LBR callchain. Switch off --stitch-lbr.\n"
> "Please apply --call-graph lbr when recording.\n");
> --
> 2.23.0
>
* Re: [PATCH RESEND WITH CCs v3 3/4] perf tools: enable dwarf_callchain_users on aarch64
2021-03-05 11:51 ` Leo Yan
@ 2021-03-05 14:07 ` Leo Yan
2021-03-09 16:10 ` Alexandre Truong
0 siblings, 1 reply; 12+ messages in thread
From: Leo Yan @ 2021-03-05 14:07 UTC (permalink / raw)
To: Alexandre Truong
Cc: linux-kernel, linux-perf-users, John Garry, Will Deacon,
Mathieu Poirier, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Kemeng Shi, Ian Rogers, Andi Kleen,
Kan Liang, Jin Yao, Adrian Hunter, Suzuki K Poulose, Al Grant,
James Clark, Wilco Dijkstra
On Fri, Mar 05, 2021 at 07:51:20PM +0800, Leo Yan wrote:
[...]
> > diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> > index 2a845d6cac09..93661a3eaeb1 100644
> > --- a/tools/perf/builtin-report.c
> > +++ b/tools/perf/builtin-report.c
> > @@ -405,6 +405,10 @@ static int report__setup_sample_type(struct report *rep)
> >
> > callchain_param_setup(sample_type);
> >
> > + if (callchain_param.record_mode == CALLCHAIN_FP &&
> > + strncmp(rep->session->header.env.arch, "aarch64", 7) == 0)
> > + dwarf_callchain_users = true;
> > +
>
> I don't have knowledge for dwarf or FP.
>
> This patch is suspicious for me that since it only fixes the issue for
> "perf report" command, but it cannot support "perf script".
>
> I did a quick testing for "perf script" command with the test code from
> patch 04, seems to me it cannot fix the fp omitting issue for
> "perf script" command:
>
> arm64_fp_test 11211 2282.355095: 176307 cycles:
> aaaac2e40740 f2+0x10 (/root/arm64_fp_test)
> aaaac2e4061c main+0xc (/root/arm64_fp_test)
> ffff961fbd24 __libc_start_main+0xe4 (/usr/lib/aarch64-linux-gnu/libc-2.28.so)
> aaaac2e4065c _start+0x34 (/root/arm64_fp_test)
>
> Could you check for this? Thanks!
Maybe we can consolidate the setting of the global variable
"dwarf_callchain_users" with the change below; this covers most of
the tools. I used the change below to replace patch 03, and both
"perf report" and "perf script" work well with it.
Please note, if you want to move forward this way, it's better to
use a separate patch that first refactors script__setup_sample_type()
to use the general API callchain_param_setup() in place of the
duplicated code for setting up the callchain parameters.
After that, you could apply the rest of the change, which adds the new
parameter "arch" to callchain_param_setup().
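
The consolidated setup described above can be sketched as one function that both tools would call. This is an illustration under stated assumptions: `struct callchain_state` and `callchain_setup()` are hypothetical, the SAMPLE_* bit values are placeholders for the real PERF_SAMPLE_* constants, the symbol_conf guard is omitted, and the arm64 condition is borrowed from patch 03 rather than from the diff in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder bits; the real PERF_SAMPLE_* values are in perf_event.h. */
#define SAMPLE_REGS_USER    (1ULL << 0)
#define SAMPLE_STACK_USER   (1ULL << 1)
#define SAMPLE_BRANCH_STACK (1ULL << 2)

enum record_mode { MODE_FP, MODE_DWARF, MODE_LBR };

struct callchain_state {
	enum record_mode mode;
	bool dwarf_users;
};

/* Pick the record mode from the sample type, then apply the arm64 fixup. */
static void callchain_setup(struct callchain_state *st, uint64_t sample_type,
			    const char *arch)
{
	assert(st != NULL);
	st->dwarf_users = false;

	if ((sample_type & SAMPLE_REGS_USER) &&
	    (sample_type & SAMPLE_STACK_USER)) {
		st->mode = MODE_DWARF;
		st->dwarf_users = true;
	} else if (sample_type & SAMPLE_BRANCH_STACK) {
		st->mode = MODE_LBR;
	} else {
		st->mode = MODE_FP;
	}

	/* arm64: FP chains still need the DWARF unwinder for the leaf caller. */
	if (st->mode == MODE_FP && arch && strncmp(arch, "aarch64", 7) == 0)
		st->dwarf_users = true;
}
```

With patch 01 applied, an arm64 FP recording carries user registers but no user stack, so it still falls through to the FP branch before the fixup applies.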
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 2a845d6cac09..ca2e8c9096ea 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1090,7 +1090,8 @@ static int process_attr(struct perf_tool *tool __maybe_unused,
* on events sample_type.
*/
sample_type = evlist__combined_sample_type(*pevlist);
- callchain_param_setup(sample_type);
+ callchain_param_setup(sample_type,
+ perf_env__arch((*pevlist)->env));
return 0;
}
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 5915f19cee55..c49212c135b2 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2250,7 +2250,8 @@ static int process_attr(struct perf_tool *tool, union perf_event *event,
* on events sample_type.
*/
sample_type = evlist__combined_sample_type(evlist);
- callchain_param_setup(sample_type);
+ callchain_param_setup(sample_type,
+ perf_env__arch((*pevlist)->env));
/* Enable fields for callchain entries */
if (symbol_conf.use_callchain &&
@@ -3309,16 +3310,8 @@ static void script__setup_sample_type(struct perf_script *script)
struct perf_session *session = script->session;
u64 sample_type = evlist__combined_sample_type(session->evlist);
- if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
- if ((sample_type & PERF_SAMPLE_REGS_USER) &&
- (sample_type & PERF_SAMPLE_STACK_USER)) {
- callchain_param.record_mode = CALLCHAIN_DWARF;
- dwarf_callchain_users = true;
- } else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
- callchain_param.record_mode = CALLCHAIN_LBR;
- else
- callchain_param.record_mode = CALLCHAIN_FP;
- }
+ callchain_param_setup(sample_type,
+ perf_env__arch(session->machines.host.env));
if (script->stitch_lbr && (callchain_param.record_mode != CALLCHAIN_LBR)) {
pr_warning("Can't find LBR callchain. Switch off --stitch-lbr.\n"
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 1b60985690bb..d9766b54cd1a 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1600,7 +1600,7 @@ void callchain_cursor_reset(struct callchain_cursor *cursor)
map__zput(node->ms.map);
}
-void callchain_param_setup(u64 sample_type)
+void callchain_param_setup(u64 sample_type, const char *arch)
{
if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
if ((sample_type & PERF_SAMPLE_REGS_USER) &&
@@ -1612,6 +1612,14 @@ void callchain_param_setup(u64 sample_type)
else
callchain_param.record_mode = CALLCHAIN_FP;
}
+
+ /*
+ * Fixup for arm64 due to the frame pointer was omitted for the
+ * caller of the leaf frame.
+ */
+ if (callchain_param.record_mode == CALLCHAIN_FP &&
+ strncmp(arch, "arm64", 6) == 0)
+ dwarf_callchain_users = true;
}
static bool chain_match(struct callchain_list *base_chain,
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 77fba053c677..d95615daed73 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -300,7 +300,7 @@ int callchain_branch_counts(struct callchain_root *root,
u64 *branch_count, u64 *predicted_count,
u64 *abort_count, u64 *cycles_count);
-void callchain_param_setup(u64 sample_type);
+void callchain_param_setup(u64 sample_type, const char *arch);
bool callchain_cnode_matched(struct callchain_node *base_cnode,
struct callchain_node *pair_cnode);
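Both spellings of the architecture name appear in this thread: the original patch compares the arch string against "aarch64", while the change above compares the result of perf_env__arch() against "arm64". As a minimal sketch only, a check that tolerates either spelling and a missing string could look like the following; arch_is_arm64() is a hypothetical helper used for illustration, not an existing perf function:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of perf): report whether an architecture
 * string names 64-bit Arm. "aarch64" is matched as a prefix, mirroring the
 * strncmp(..., "aarch64", 7) in the original patch; "arm64" is matched
 * exactly, mirroring strncmp(arch, "arm64", 6) in the change above.
 */
static bool arch_is_arm64(const char *arch)
{
	if (arch == NULL)	/* defensive: the caller may have no arch string */
		return false;

	return strncmp(arch, "aarch64", 7) == 0 || strcmp(arch, "arm64") == 0;
}
```

With such a helper the fixup would not depend on which of the two spellings the caller happens to pass in.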
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address
2021-03-05 8:54 ` James Clark
@ 2021-03-06 12:55 ` Arnaldo Carvalho de Melo
2021-03-06 19:10 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-03-06 12:55 UTC (permalink / raw)
To: James Clark
Cc: Alexandre Truong, linux-kernel, linux-perf-users, John Garry,
Will Deacon, Mathieu Poirier, Leo Yan, Peter Zijlstra,
Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Kemeng Shi, Ian Rogers, Andi Kleen, Kan Liang,
Jin Yao, Adrian Hunter, Suzuki K Poulose, Al Grant,
Wilco Dijkstra
Em Fri, Mar 05, 2021 at 10:54:03AM +0200, James Clark escreveu:
> I've tested this patchset on a few different applications and have seen it significantly improve
> quality of frame pointer stacks on aarch64. For example with GDB 10 and default build options,
> 'bfd_calc_gnu_debuglink_crc32' is a leaf function, and its caller 'gdb_bfd_crc' is omitted,
> but with the patchset it is included. I've also confirmed that this is correct from looking at
> the source code.
>
> Before:
>
> # Children Self Command Shared Object Symbol
> # ........ ........ ............... .......................... ...........
> #
> 34.55% 0.00% gdb-100 gdb-100 [.] _start
> 0.78%
> _start
> __libc_start_main
> main
> gdb_main
> captured_command_loop
> gdb_do_one_event
> check_async_event_handlers
> fetch_inferior_event
> inferior_event_handler
> do_all_inferior_continuations
> attach_post_wait
> post_create_inferior
> svr4_solib_create_inferior_hook
> solib_add
> solib_read_symbols
> symbol_file_add_with_addrs
> read_symbols
> elf_symfile_read
> find_separate_debug_file_by_debuglink[abi:cxx11]
> find_separate_debug_file
> separate_debug_file_exists
> gdb_bfd_crc
> bfd_calc_gnu_debuglink_crc32
>
> After:
>
> # Children Self Command Shared Object Symbol
> # ........ ........ ............... .......................... ...........
> #
> 34.55% 0.00% gdb-100 gdb-100 [.] _start
> 0.78%
> _start
> __libc_start_main
> main
> gdb_main
> captured_command_loop
> gdb_do_one_event
> check_async_event_handlers
> fetch_inferior_event
> inferior_event_handler
> do_all_inferior_continuations
> attach_post_wait
> post_create_inferior
> svr4_solib_create_inferior_hook
> solib_add
> solib_read_symbols
> symbol_file_add_with_addrs
> read_symbols
> elf_symfile_read
> find_separate_debug_file_by_debuglink[abi:cxx11]
> find_separate_debug_file
> separate_debug_file_exists
> get_file_crc <--------------------- leaf frame caller added
> bfd_calc_gnu_debuglink_crc32
>
> There is a question about whether the overhead of recording all the registers is acceptable, for
> filesize and time. We could make it a manual step, at the cost of not showing better frame pointer
> stacks by default.
Can someone quantify this, i.e. how much space per perf.data for a
typical scenario? But anyway, I'm applying it as is now, we can change
it if needed, it's not like files with the extra registers won't be
valid if/when we decide not to collect it by default in the future.
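As a back-of-the-envelope sketch rather than a measurement: with PERF_SAMPLE_REGS_USER each user-space sample carries one u64 ABI word followed by one u64 per bit set in the register mask. Assuming the arm64 PERF_REGS_MASK selects 33 registers (x0-x30, sp and pc; that count is an assumption and may differ between kernel versions), the added bytes per sample can be estimated with the hypothetical helpers below:

```c
#include <stdint.h>

/* Assumption: x0..x30, sp and pc, i.e. 33 registers in the arm64 mask. */
#define SKETCH_ARM64_NR_REGS 33

/* Count the bits set in a register mask (portable popcount). */
static unsigned int sketch_mask_weight(uint64_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Bytes added to one sample by PERF_SAMPLE_REGS_USER:
 * one u64 ABI word plus one u64 per register selected in the mask.
 */
static uint64_t sketch_regs_user_bytes(uint64_t mask)
{
	return sizeof(uint64_t) * (1 + sketch_mask_weight(mask));
}
```

For the full 33-register mask this gives 8 * (1 + 33) = 272 bytes per user-space sample, so one million samples would add roughly 272 MB before any compression. Real perf.data growth still needs to be measured on a typical workload.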
If we decide to make this selectable, we should have it as a .perfconfig
knob as well, so that one can set it and change the default, etc.
- Arnaldo
> Tested-by: James Clark <james.clark@arm.com>
>
> On 04/03/2021 18:32, Alexandre Truong wrote:
> > On arm64 and frame pointer mode (e.g: perf record --callgraph fp),
> > use dwarf unwind info to check if the link register is the return
> > address in order to inject it to the frame pointer stack.
> >
> > Write the following application:
> >
> > int a = 10;
> >
> > void f2(void)
> > {
> > for (int i = 0; i < 1000000; i++)
> > a *= a;
> > }
> >
> > void f1()
> > {
> > for (int i = 0; i < 10; i++)
> > f2();
> > }
> >
> > int main (void)
> > {
> > f1();
> > return 0;
> > }
> >
> > with the following compilation flags:
> > gcc -fno-omit-frame-pointer -fno-inline -O2
> >
> > The compiler omits the frame pointer for f2 on arm. This is a problem
> > with any leaf call, for example an application with many different
> > calls to malloc() would always omit the calling frame, even if it
> > can be determined.
> >
> > ./perf record --call-graph fp ./a.out
> > ./perf report
> >
> > currently gives the following stack:
> >
> > 0xffffea52f361
> > _start
> > __libc_start_main
> > main
> > f2
> >
> > After this change, perf report correctly shows f1() calling f2(),
> > even though it was missing from the frame pointer unwind:
> >
> > ./perf report
> >
> > 0xffffea52f361
> > _start
> > __libc_start_main
> > main
> > f1
> > f2
> >
> > Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
> > Cc: John Garry <john.garry@huawei.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> > Cc: Leo Yan <leo.yan@linaro.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > Cc: Jiri Olsa <jolsa@redhat.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Cc: Kemeng Shi <shikemeng@huawei.com>
> > Cc: Ian Rogers <irogers@google.com>
> > Cc: Andi Kleen <ak@linux.intel.com>
> > Cc: Kan Liang <kan.liang@linux.intel.com>
> > Cc: Jin Yao <yao.jin@linux.intel.com>
> > Cc: Adrian Hunter <adrian.hunter@intel.com>
> > Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> > Cc: Al Grant <al.grant@arm.com>
> > Cc: James Clark <james.clark@arm.com>
> > Cc: Wilco Dijkstra <wilco.dijkstra@arm.com>
> > ---
> > tools/perf/util/Build | 1 +
> > .../util/arm-frame-pointer-unwind-support.c | 44 +++++++++++++++++++
> > .../util/arm-frame-pointer-unwind-support.h | 7 +++
> > tools/perf/util/machine.c | 9 ++--
> > 4 files changed, 58 insertions(+), 3 deletions(-)
> > create mode 100644 tools/perf/util/arm-frame-pointer-unwind-support.c
> > create mode 100644 tools/perf/util/arm-frame-pointer-unwind-support.h
> >
> > diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> > index 188521f34347..3b82cb992bce 100644
> > --- a/tools/perf/util/Build
> > +++ b/tools/perf/util/Build
> > @@ -1,3 +1,4 @@
> > +perf-y += arm-frame-pointer-unwind-support.o
> > perf-y += annotate.o
> > perf-y += block-info.o
> > perf-y += block-range.o
> > diff --git a/tools/perf/util/arm-frame-pointer-unwind-support.c b/tools/perf/util/arm-frame-pointer-unwind-support.c
> > new file mode 100644
> > index 000000000000..964efd08e72e
> > --- /dev/null
> > +++ b/tools/perf/util/arm-frame-pointer-unwind-support.c
> > @@ -0,0 +1,44 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include "../arch/arm64/include/uapi/asm/perf_regs.h"
> > +#include "arch/arm64/include/perf_regs.h"
> > +#include "event.h"
> > +#include "arm-frame-pointer-unwind-support.h"
> > +#include "callchain.h"
> > +#include "unwind.h"
> > +
> > +struct entries {
> > + u64 stack[2];
> > + size_t length;
> > +};
> > +
> > +static bool get_leaf_frame_caller_enabled(struct perf_sample *sample)
> > +{
> > + return callchain_param.record_mode == CALLCHAIN_FP && sample->user_regs.regs
> > + && sample->user_regs.mask == PERF_REGS_MASK;
> > +}
> > +
> > +static int add_entry(struct unwind_entry *entry, void *arg)
> > +{
> > + struct entries *entries = arg;
> > +
> > + entries->stack[entries->length++] = entry->ip;
> > + return 0;
> > +}
> > +
> > +u64 get_leaf_frame_caller_aarch64(struct perf_sample *sample, struct thread *thread)
> > +{
> > + int ret;
> > +
> > + struct entries entries = {{0, 0}, 0};
> > +
> > + if (!get_leaf_frame_caller_enabled(sample))
> > + return 0;
> > +
> > + ret = unwind__get_entries(add_entry, &entries, thread, sample, 2);
> > +
> > + if (ret || entries.length != 2)
> > + return ret;
> > +
> > + return callchain_param.order == ORDER_CALLER ?
> > + entries.stack[0] : entries.stack[1];
> > +}
> > diff --git a/tools/perf/util/arm-frame-pointer-unwind-support.h b/tools/perf/util/arm-frame-pointer-unwind-support.h
> > new file mode 100644
> > index 000000000000..16dc03fa9abe
> > --- /dev/null
> > +++ b/tools/perf/util/arm-frame-pointer-unwind-support.h
> > @@ -0,0 +1,7 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef __PERF_ARM_FRAME_POINTER_UNWIND_SUPPORT_H
> > +#define __PERF_ARM_FRAME_POINTER_UNWIND_SUPPORT_H
> > +
> > +u64 get_leaf_frame_caller_aarch64(struct perf_sample *sample, struct thread *thread);
> > +
> > +#endif /* __PERF_ARM_FRAME_POINTER_UNWIND_SUPPORT_H */
> > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > index 7f03ffa016b0..dfb72dbc0e2d 100644
> > --- a/tools/perf/util/machine.c
> > +++ b/tools/perf/util/machine.c
> > @@ -34,6 +34,7 @@
> > #include "bpf-event.h"
> > #include <internal/lib.h> // page_size
> > #include "cgroup.h"
> > +#include "arm-frame-pointer-unwind-support.h"
> >
> > #include <linux/ctype.h>
> > #include <symbol/kallsyms.h>
> > @@ -2671,10 +2672,12 @@ static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
> > return err;
> > }
> >
> > -static u64 get_leaf_frame_caller(struct perf_sample *sample __maybe_unused,
> > - struct thread *thread __maybe_unused)
> > +static u64 get_leaf_frame_caller(struct perf_sample *sample, struct thread *thread)
> > {
> > - return 0;
> > + if (strncmp(thread->maps->machine->env->arch, "aarch64", 7) == 0)
> > + return get_leaf_frame_caller_aarch64(sample, thread);
> > + else
> > + return 0;
> > }
> >
> > static int thread__resolve_callchain_sample(struct thread *thread,
> >
--
- Arnaldo
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address
2021-03-06 12:55 ` Arnaldo Carvalho de Melo
@ 2021-03-06 19:10 ` Arnaldo Carvalho de Melo
2021-03-22 11:57 ` Alexandre Truong
0 siblings, 1 reply; 12+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-03-06 19:10 UTC (permalink / raw)
To: James Clark
Cc: Alexandre Truong, linux-kernel, linux-perf-users, John Garry,
Will Deacon, Mathieu Poirier, Leo Yan, Peter Zijlstra,
Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
Namhyung Kim, Kemeng Shi, Ian Rogers, Andi Kleen, Kan Liang,
Jin Yao, Adrian Hunter, Suzuki K Poulose, Al Grant,
Wilco Dijkstra
Em Sat, Mar 06, 2021 at 09:55:32AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Mar 05, 2021 at 10:54:03AM +0200, James Clark escreveu:
> > I've tested this patchset on a few different applications and have seen it significantly improve
> > quality of frame pointer stacks on aarch64. For example with GDB 10 and default build options,
> > 'bfd_calc_gnu_debuglink_crc32' is a leaf function, and its caller 'gdb_bfd_crc' is omitted,
> > but with the patchset it is included. I've also confirmed that this is correct from looking at
> > the source code.
> >
> > Before:
> >
> > # Children Self Command Shared Object Symbol
> > # ........ ........ ............... .......................... ...........
> > #
> > 34.55% 0.00% gdb-100 gdb-100 [.] _start
> > 0.78%
> > _start
> > __libc_start_main
> > main
> > gdb_main
> > captured_command_loop
> > gdb_do_one_event
> > check_async_event_handlers
> > fetch_inferior_event
> > inferior_event_handler
> > do_all_inferior_continuations
> > attach_post_wait
> > post_create_inferior
> > svr4_solib_create_inferior_hook
> > solib_add
> > solib_read_symbols
> > symbol_file_add_with_addrs
> > read_symbols
> > elf_symfile_read
> > find_separate_debug_file_by_debuglink[abi:cxx11]
> > find_separate_debug_file
> > separate_debug_file_exists
> > gdb_bfd_crc
> > bfd_calc_gnu_debuglink_crc32
> >
> > After:
> >
> > # Children Self Command Shared Object Symbol
> > # ........ ........ ............... .......................... ...........
> > #
> > 34.55% 0.00% gdb-100 gdb-100 [.] _start
> > 0.78%
> > _start
> > __libc_start_main
> > main
> > gdb_main
> > captured_command_loop
> > gdb_do_one_event
> > check_async_event_handlers
> > fetch_inferior_event
> > inferior_event_handler
> > do_all_inferior_continuations
> > attach_post_wait
> > post_create_inferior
> > svr4_solib_create_inferior_hook
> > solib_add
> > solib_read_symbols
> > symbol_file_add_with_addrs
> > read_symbols
> > elf_symfile_read
> > find_separate_debug_file_by_debuglink[abi:cxx11]
> > find_separate_debug_file
> > separate_debug_file_exists
> > get_file_crc <--------------------- leaf frame caller added
> > bfd_calc_gnu_debuglink_crc32
> >
> > There is a question about whether the overhead of recording all the registers is acceptable, for
> > filesize and time. We could make it a manual step, at the cost of not showing better frame pointer
> > stacks by default.
>
> Can someone quantify this, i.e. how much space per perf.data for a
> typical scenario? But anyway, I'm applying it as is now, we can change
> it if needed, it's not like files with the extra registers won't be
> valid if/when we decide not to collect it by default in the future.
>
> If we decide to make this selectable, we should have it as a .perfconfig
> knob as well, so that one can set it and change the default, etc.
> > Tested-by: James Clark <james.clark@arm.com>
This is unconditionally asking for asm/perf_regs.h and it is not available
everywhere, so I think this has to be abstracted away, maybe using a weak
function that arm provides a replacement for?
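One way the weak-function idea could look, as a minimal sketch only: a generic default that reports "no leaf caller", which an arm64-only source file would override with a non-weak definition of the same name. The name arch__leaf_frame_caller(), the SKETCH_WEAK macro and the opaque struct declarations are illustrative assumptions, not existing perf symbols:

```c
#include <stdint.h>

/* Opaque stand-ins for perf's types; the sketch never dereferences them. */
struct perf_sample;
struct thread;

/* GCC/Clang attribute; perf itself spells this __weak. */
#define SKETCH_WEAK __attribute__((weak))

/*
 * Generic fallback, compiled on every architecture. Because it is weak,
 * the linker picks a non-weak definition of the same symbol when one is
 * linked in (for example from an arm64-only source file), so this file
 * needs no arm64 headers at all.
 */
SKETCH_WEAK uint64_t arch__leaf_frame_caller(struct perf_sample *sample,
					     struct thread *thread)
{
	(void)sample;
	(void)thread;
	return 0;	/* 0 means: nothing to inject */
}
```

An override that is only built on arm64 would not be present in a perf binary built for another architecture, so this approach on its own would only help when the analysis runs on arm64.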
Humm
+++ b/tools/perf/util/Build
@@ -1,3 +1,4 @@
+perf-y += arm-frame-pointer-unwind-support.o
Is this for doing cross-platform analysis? I.e. record a perf.data file
on arm64 and then do a perf-report on it on an x86_64 machine? Yeah, that
is expected to work, but then:
+++ b/tools/perf/util/arm-frame-pointer-unwind-support.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "../arch/arm64/include/uapi/asm/perf_regs.h"
+#include "arch/arm64/include/perf_regs.h"
[acme@five perf]$ head -25 tools/perf/arch/arm64/include/perf_regs.h
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef ARCH_PERF_REGS_H
#define ARCH_PERF_REGS_H
#include <stdlib.h>
#include <linux/types.h>
#include <asm/perf_regs.h>
void perf_regs_load(u64 *regs);
#define PERF_REGS_MASK ((1ULL << PERF_REG_ARM64_MAX) - 1)
#define PERF_REGS_MAX PERF_REG_ARM64_MAX
#define PERF_SAMPLE_REGS_ABI PERF_SAMPLE_REGS_ABI_64
#define PERF_REG_IP PERF_REG_ARM64_PC
#define PERF_REG_SP PERF_REG_ARM64_SP
static inline const char *__perf_reg_name(int id)
{
switch (id) {
case PERF_REG_ARM64_X0:
return "x0";
case PERF_REG_ARM64_X1:
return "x1";
case PERF_REG_ARM64_X2:
[acme@five perf]$
Won't this get the wrong file when cross-building? See below.
- Arnaldo
[perfbuilder@five ~]$ time dm
Sat Mar 6 12:02:01 PM -03 2021
# export PERF_TARBALL=http://192.168.86.5/perf/perf-5.11.0.tar.xz
# dm
1 78.76 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0 , clang version 3.8.0 (tags/RELEASE_380/final)
2 79.07 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822 , clang version 3.8.1 (tags/RELEASE_381/final)
3 82.68 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0 , clang version 4.0.0 (tags/RELEASE_400/final)
4 88.57 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0 , Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0)
5 89.47 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0 , Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
6 93.63 alpine:3.9 : Ok gcc (Alpine 8.3.0) 8.3.0 , Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1)
7 125.41 alpine:3.10 : Ok gcc (Alpine 8.3.0) 8.3.0 , Alpine clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0)
8 142.89 alpine:3.11 : Ok gcc (Alpine 9.3.0) 9.3.0 , Alpine clang version 9.0.0 (https://git.alpinelinux.org/aports f7f0d2c2b8bcd6a5843401a9a702029556492689) (based on LLVM 9.0.0)
9 126.52 alpine:3.12 : Ok gcc (Alpine 9.3.0) 9.3.0 , Alpine clang version 10.0.0 (https://gitlab.alpinelinux.org/alpine/aports.git 7445adce501f8473efdb93b17b5eaf2f1445ed4c)
10 135.80 alpine:3.13 : Ok gcc (Alpine 10.2.1_pre1) 10.2.1 20201203 , Alpine clang version 10.0.1
11 134.19 alpine:edge : Ok gcc (Alpine 10.2.1_pre1) 10.2.1 20201203 , Alpine clang version 10.0.1
12 78.12 alt:p8 : Ok x86_64-alt-linux-gcc (GCC) 5.3.1 20151207 (ALT p8 5.3.1-alt3.M80P.1) , clang version 3.8.0 (tags/RELEASE_380/final)
13 92.57 alt:p9 : Ok x86_64-alt-linux-gcc (GCC) 8.4.1 20200305 (ALT p9 8.4.1-alt0.p9.1) , clang version 10.0.0
14 93.72 alt:sisyphus : Ok x86_64-alt-linux-gcc (GCC) 10.2.1 20201125 (ALT Sisyphus 10.2.1-alt2) , clang version 10.0.1
15 75.10 amazonlinux:1 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2) , clang version 3.6.2 (tags/RELEASE_362/final)
16 116.14 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12) , clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2)
17 10.26 android-ndk:r12b-arm : FAIL gcc version 4.9.x 20150123 (prerelease) (GCC)
from util/arm-frame-pointer-unwind-support.c:3:
/git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
enum perf_event_arm_regs {
^
/git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
In file included from util/arm-frame-pointer-unwind-support.c:2:0:
/git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
enum perf_event_arm_regs {
^
make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
18 10.71 android-ndk:r15c-arm : FAIL gcc version 4.9.x 20150123 (prerelease) (GCC)
from util/arm-frame-pointer-unwind-support.c:3:
/git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
enum perf_event_arm_regs {
^
/git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
In file included from util/arm-frame-pointer-unwind-support.c:2:0:
/git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
enum perf_event_arm_regs {
^
19 28.97 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
20 34.56 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
21 110.26 centos:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5) , clang version 10.0.1 (Red Hat 10.0.1-1.module_el8.3.0+467+cb298d5b)
22 67.12 clearlinux:latest : Ok gcc (Clear Linux OS for Intel Architecture) 10.2.1 20201217 releases/gcc-10.2.0-643-g7cbb07d2fc , clang version 10.0.1
23 87.08 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u2) 4.9.2 , Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
24 92.73 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516 , clang version 3.8.1-24 (tags/RELEASE_381/final)
25 86.91 debian:10 : Ok gcc (Debian 8.3.0-6) 8.3.0 , clang version 7.0.1-8+deb10u2 (tags/RELEASE_701/final)
26 86.13 debian:experimental : Ok gcc (Debian 10.2.1-6) 10.2.1 20210110 , Debian clang version 11.0.1-2
27 36.91 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110
28 9.08 debian:experimental-x-mips : FAIL gcc version 10.2.1 20201224 (Debian 10.2.1-3)
from builtin-diff.c:12:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
29 33.76 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 10.2.1-3) 10.2.1 20201224
30 12.74 debian:experimental-x-mipsel : FAIL gcc version 10.2.1 20201224 (Debian 10.2.1-3)
from builtin-diff.c:12:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
In file included from util/perf_regs.h:30,
from util/event.h:15,
from util/branch.h:15,
from util/callchain.h:8,
from builtin-record.c:16:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
In file included from util/perf_regs.h:30,
from util/event.h:15,
from util/session.h:6,
from builtin-buildid-list.c:17:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
In file included from util/perf_regs.h:30,
from util/event.h:15,
from util/thread.h:16,
from builtin-sched.c:11:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
In file included from util/perf_regs.h:30,
from util/event.h:15,
from builtin-top.c:31:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
In file included from util/perf_regs.h:30,
from util/event.h:15,
from util/session.h:6,
from builtin-evlist.c:16:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
In file included from util/perf_regs.h:30,
from util/event.h:15,
from util/session.h:6,
from builtin-buildid-cache.c:24:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
In file included from util/perf_regs.h:30,
from util/event.h:15,
from builtin-stat.c:49:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
In file included from util/perf_regs.h:30,
from util/event.h:15,
from util/branch.h:15,
from builtin-report.c:24:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
In file included from util/perf_regs.h:30,
from util/event.h:15,
from builtin-annotate.c:24:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
In file included from util/perf_regs.h:30,
from util/event.h:15,
from util/thread.h:16,
from builtin-timechart.c:24:
/git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
31 32.64 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
32 35.41 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6) , clang version 3.5.0 (tags/RELEASE_350/final)
33 77.83 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6) , clang version 3.7.0 (tags/RELEASE_370/final)
34 96.28 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1) , clang version 3.8.1 (tags/RELEASE_381/final)
35 10.80 fedora:24-x-ARC-uClibc : FAIL gcc version 7.1.1 20170710 (ARCompact ISA Linux uClibc toolchain 2017.09-rc2)
In file included from util/arm-frame-pointer-unwind-support.c:3:0:
/git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
#include <asm/perf_regs.h>
^~~~~~~~~~~~~~~~~
compilation terminated.
/git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
make[3]: *** [util] Error 2
36 98.38 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1) , clang version 3.9.1 (tags/RELEASE_391/final)
37 111.11 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2) , clang version 4.0.1 (tags/RELEASE_401/final)
38 111.55 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6) , clang version 5.0.2 (tags/RELEASE_502/final)
39 127.99 fedora:28 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2) , clang version 6.0.1 (tags/RELEASE_601/final)
40 136.36 fedora:29 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2) , clang version 7.0.1 (Fedora 7.0.1-6.fc29)
41 139.67 fedora:30 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2) , clang version 8.0.0 (Fedora 8.0.0-3.fc30)
42 10.57 fedora:30-x-ARC-uClibc : FAIL gcc version 8.3.1 20190225 (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1)
In file included from util/arm-frame-pointer-unwind-support.c:3:
/git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
#include <asm/perf_regs.h>
^~~~~~~~~~~~~~~~~
compilation terminated.
make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
43 136.09 fedora:31 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2) , clang version 9.0.1 (Fedora 9.0.1-4.fc31)
44 112.14 fedora:32 : Ok gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9) , clang version 10.0.1 (Fedora 10.0.1-3.fc32)
45 111.28 fedora:33 : Ok gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9) , clang version 11.0.0 (Fedora 11.0.0-2.fc33)
46 116.62 fedora:34 : Ok gcc (GCC) 11.0.0 20210225 (Red Hat 11.0.0-0) , clang version 12.0.0 (Fedora 12.0.0-0.1.rc1.fc34)
47 117.53 fedora:rawhide : Ok gcc (GCC) 11.0.0 20210210 (Red Hat 11.0.0-0) , clang version 12.0.0 (Fedora 12.0.0-0.1.rc1.fc35)
48 38.03 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 9.3.0-r1 p3) 9.3.0
49 78.35 mageia:5 : Ok gcc (GCC) 4.9.2 , clang version 3.5.2 (tags/RELEASE_352/final)
50 97.37 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0 , clang version 3.9.1 (tags/RELEASE_391/final)
51 116.36 manjaro:latest : Ok gcc (GCC) 10.2.0 , clang version 10.0.1
52 246.83 openmandriva:cooker : Ok gcc (GCC) 10.2.0 20200723 (OpenMandriva) , OpenMandriva 11.0.0-1 clang version 11.0.0 (/builddir/build/BUILD/llvm-project-llvmorg-11.0.0/clang 63e22714ac938c6b537bd958f70680d3331a2030)
53 134.08 opensuse:15.0 : Ok gcc (SUSE Linux) 7.4.1 20190905 [gcc-7-branch revision 275407] , clang version 5.0.1 (tags/RELEASE_501/final 312548)
54 142.32 opensuse:15.1 : Ok gcc (SUSE Linux) 7.5.0 , clang version 7.0.1 (tags/RELEASE_701/final 349238)
55 133.68 opensuse:15.2 : Ok gcc (SUSE Linux) 7.5.0 , clang version 9.0.1
56 146.85 opensuse:15.3 : Ok gcc (SUSE Linux) 7.5.0 , clang version 7.0.1 (tags/RELEASE_701/final 349238)
57 136.96 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5 , clang version 3.8.0 (tags/RELEASE_380/final 262553)
58 122.69 opensuse:tumbleweed : Ok gcc (SUSE Linux) 10.2.1 20200825 [revision c0746a1beb1ba073c7981eb09f55b3d993b32e5c] , clang version 10.0.1
59 28.36 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
60 35.22 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44.0.3)
61 109.60 oraclelinux:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5.0.1) , clang version 10.0.1 (Red Hat 10.0.1-1.0.1.module+el8.3.0+7827+89335dbf)
62 29.20 ubuntu:12.04 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 , Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)
63 33.46 ubuntu:14.04 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
64 93.53 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 , clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
65 11.70 ubuntu:16.04-x-arm : FAIL gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)
from util/arm-frame-pointer-unwind-support.c:3:
/git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
enum perf_event_arm_regs {
^
/git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
In file included from util/arm-frame-pointer-unwind-support.c:2:0:
/git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
enum perf_event_arm_regs {
^
/git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
make[3]: *** [util] Error 2
66 28.74 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
67 27.61 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
68 28.82 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
69 28.16 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
70 27.96 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
71 101.40 ubuntu:18.04 : Ok gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 , clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
72 11.69 ubuntu:18.04-x-arm : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)
from util/arm-frame-pointer-unwind-support.c:3:
/git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
enum perf_event_arm_regs {
^~~~~~~~~~~~~~~~~~~
/git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
In file included from util/arm-frame-pointer-unwind-support.c:2:0:
/git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
enum perf_event_arm_regs {
^~~~~~~~~~~~~~~~~~~
/git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
make[3]: *** [util] Error 2
73 29.62 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
74 10.74 ubuntu:18.04-x-m68k : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
In file included from util/arm-frame-pointer-unwind-support.c:3:0:
/git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
#include <asm/perf_regs.h>
^~~~~~~~~~~~~~~~~
compilation terminated.
/git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
make[3]: *** [util] Error 2
75 29.37 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
76 32.06 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
77 31.37 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
78 11.66 ubuntu:18.04-x-riscv64 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
In file included from util/arm-frame-pointer-unwind-support.c:3:0:
/git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
#include <asm/perf_regs.h>
^~~~~~~~~~~~~~~~~
compilation terminated.
/git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
make[3]: *** [util] Error 2
79 26.84 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
80 11.50 ubuntu:18.04-x-sh4 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
In file included from util/arm-frame-pointer-unwind-support.c:3:0:
/git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
#include <asm/perf_regs.h>
^~~~~~~~~~~~~~~~~
compilation terminated.
/git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
make[3]: *** [util] Error 2
81 10.62 ubuntu:18.04-x-sparc64 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
In file included from util/arm-frame-pointer-unwind-support.c:3:0:
/git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
#include <asm/perf_regs.h>
^~~~~~~~~~~~~~~~~
compilation terminated.
/git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
make[3]: *** [util] Error 2
82 80.18 ubuntu:19.10 : Ok gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008 , clang version 8.0.1-3build1 (tags/RELEASE_801/final)
83 11.44 ubuntu:19.10-x-alpha : FAIL gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu1)
In file included from util/arm-frame-pointer-unwind-support.c:3:
/git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
84 11.67 ubuntu:19.10-x-hppa : FAIL gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu1)
In file included from util/arm-frame-pointer-unwind-support.c:3:
/git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
7 | #include <asm/perf_regs.h>
| ^~~~~~~~~~~~~~~~~
compilation terminated.
make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
85 83.72 ubuntu:20.04 : Ok gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 , clang version 10.0.0-4ubuntu1
86 34.23 ubuntu:20.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0
87 83.44 ubuntu:20.10 : Ok gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0 , Ubuntu clang version 11.0.0-2
88 79.47 ubuntu:21.04 : Ok gcc (Ubuntu 10.2.1-6ubuntu1) 10.2.1 20210110 , Ubuntu clang version 11.0.1-2
89 6489
real 109m25.234s
user 1m34.076s
sys 0m55.476s
[perfbuilder@five ~]$
* Re: [PATCH RESEND WITH CCs v3 3/4] perf tools: enable dwarf_callchain_users on aarch64
2021-03-05 14:07 ` Leo Yan
@ 2021-03-09 16:10 ` Alexandre Truong
0 siblings, 0 replies; 12+ messages in thread
From: Alexandre Truong @ 2021-03-09 16:10 UTC (permalink / raw)
To: Leo Yan
Cc: linux-kernel, linux-perf-users, John Garry, Will Deacon,
Mathieu Poirier, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
Jiri Olsa, Namhyung Kim, Kemeng Shi, Ian Rogers, Andi Kleen,
Kan Liang, Jin Yao, Adrian Hunter, Suzuki K Poulose, Al Grant,
James Clark, Wilco Dijkstra
Hi Leo,
Thanks for your message. I'll apply your suggestion in v4 of the patch.
Regards,
Alexandre
On 3/5/21 2:07 PM, Leo Yan wrote:
> On Fri, Mar 05, 2021 at 07:51:20PM +0800, Leo Yan wrote:
>
> [...]
>
>>> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
>>> index 2a845d6cac09..93661a3eaeb1 100644
>>> --- a/tools/perf/builtin-report.c
>>> +++ b/tools/perf/builtin-report.c
>>> @@ -405,6 +405,10 @@ static int report__setup_sample_type(struct report *rep)
>>>
>>> callchain_param_setup(sample_type);
>>>
>>> + if (callchain_param.record_mode == CALLCHAIN_FP &&
>>> + strncmp(rep->session->header.env.arch, "aarch64", 7) == 0)
>>> + dwarf_callchain_users = true;
>>> +
>>
>> I'm not familiar with DWARF or FP unwinding.
>>
>> This patch looks suspicious to me, since it only fixes the issue for
>> the "perf report" command and does not cover "perf script".
>>
>> I did a quick test of the "perf script" command with the test code from
>> patch 04; it seems the omitted-FP issue is not fixed for
>> "perf script":
>>
>> arm64_fp_test 11211 2282.355095: 176307 cycles:
>> aaaac2e40740 f2+0x10 (/root/arm64_fp_test)
>> aaaac2e4061c main+0xc (/root/arm64_fp_test)
>> ffff961fbd24 __libc_start_main+0xe4 (/usr/lib/aarch64-linux-gnu/libc-2.28.so)
>> aaaac2e4065c _start+0x34 (/root/arm64_fp_test)
>>
>> Could you check for this? Thanks!
>
> Maybe we can consolidate the setting of the global variable
> "dwarf_callchain_users" with the change below; this helps us cover
> the tools in most cases. I used the change below to replace patch
> 03, and both "perf report" and "perf script" work well with it.
>
> Please note, if you want to move forward this way, it's better to
> use a separate patch that first refactors the function
> script__setup_sample_type() to use the general API
> callchain_param_setup() in place of the duplicated code for
> setting up the callchain parameters.
>
> After that, you could apply the rest of the change, which adds the new
> parameter "arch" to the function callchain_param_setup().
>
>
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 2a845d6cac09..ca2e8c9096ea 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -1090,7 +1090,8 @@ static int process_attr(struct perf_tool *tool __maybe_unused,
> * on events sample_type.
> */
> sample_type = evlist__combined_sample_type(*pevlist);
> - callchain_param_setup(sample_type);
> + callchain_param_setup(sample_type,
> + perf_env__arch((*pevlist)->env));
> return 0;
> }
>
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index 5915f19cee55..c49212c135b2 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -2250,7 +2250,8 @@ static int process_attr(struct perf_tool *tool, union perf_event *event,
> * on events sample_type.
> */
> sample_type = evlist__combined_sample_type(evlist);
> - callchain_param_setup(sample_type);
> + callchain_param_setup(sample_type,
> + perf_env__arch((*pevlist)->env));
>
> /* Enable fields for callchain entries */
> if (symbol_conf.use_callchain &&
> @@ -3309,16 +3310,8 @@ static void script__setup_sample_type(struct perf_script *script)
> struct perf_session *session = script->session;
> u64 sample_type = evlist__combined_sample_type(session->evlist);
>
> - if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
> - if ((sample_type & PERF_SAMPLE_REGS_USER) &&
> - (sample_type & PERF_SAMPLE_STACK_USER)) {
> - callchain_param.record_mode = CALLCHAIN_DWARF;
> - dwarf_callchain_users = true;
> - } else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
> - callchain_param.record_mode = CALLCHAIN_LBR;
> - else
> - callchain_param.record_mode = CALLCHAIN_FP;
> - }
> + callchain_param_setup(sample_type,
> + perf_env__arch(session->machines.host.env));
>
> if (script->stitch_lbr && (callchain_param.record_mode != CALLCHAIN_LBR)) {
> pr_warning("Can't find LBR callchain. Switch off --stitch-lbr.\n"
> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
> index 1b60985690bb..d9766b54cd1a 100644
> --- a/tools/perf/util/callchain.c
> +++ b/tools/perf/util/callchain.c
> @@ -1600,7 +1600,7 @@ void callchain_cursor_reset(struct callchain_cursor *cursor)
> map__zput(node->ms.map);
> }
>
> -void callchain_param_setup(u64 sample_type)
> +void callchain_param_setup(u64 sample_type, const char *arch)
> {
> if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
> if ((sample_type & PERF_SAMPLE_REGS_USER) &&
> @@ -1612,6 +1612,14 @@ void callchain_param_setup(u64 sample_type)
> else
> callchain_param.record_mode = CALLCHAIN_FP;
> }
> +
> + /*
> + * Fixup for arm64 due to the frame pointer was omitted for the
> + * caller of the leaf frame.
> + */
> + if (callchain_param.record_mode == CALLCHAIN_FP &&
> + strncmp(arch, "arm64", 6) == 0)
> + dwarf_callchain_users = true;
> }
>
> static bool chain_match(struct callchain_list *base_chain,
> diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
> index 77fba053c677..d95615daed73 100644
> --- a/tools/perf/util/callchain.h
> +++ b/tools/perf/util/callchain.h
> @@ -300,7 +300,7 @@ int callchain_branch_counts(struct callchain_root *root,
> u64 *branch_count, u64 *predicted_count,
> u64 *abort_count, u64 *cycles_count);
>
> -void callchain_param_setup(u64 sample_type);
> +void callchain_param_setup(u64 sample_type, const char *arch);
>
> bool callchain_cnode_matched(struct callchain_node *base_cnode,
> struct callchain_node *pair_cnode);
>
* Re: [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address
2021-03-06 19:10 ` Arnaldo Carvalho de Melo
@ 2021-03-22 11:57 ` Alexandre Truong
2021-03-26 12:15 ` James Clark
0 siblings, 1 reply; 12+ messages in thread
From: Alexandre Truong @ 2021-03-22 11:57 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo, James Clark
Cc: linux-kernel, linux-perf-users, John Garry, Will Deacon,
Mathieu Poirier, Leo Yan, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Kemeng Shi, Ian Rogers, Andi Kleen, Kan Liang, Jin Yao,
Adrian Hunter, Suzuki K Poulose, Al Grant, Wilco Dijkstra
Hi Arnaldo,
Thanks for your reply.
I profiled a few applications and here are the results.
--------------------------------------------------------------------------------------------------------------
App         | Before the patch     | With the patch (% of increase) | If only LR is recorded (% of increase) |
--------------------------------------------------------------------------------------------------------------
firefox     | nb of samples: 834k  | nb of samples: 685k (+70%)     | nb of samples: 671k (+10%)             |
            | size: 101 MB         | size: 141 MB                   | size: 90 MB                            |
--------------------------------------------------------------------------------------------------------------
htop        | nb of samples: 500k  | nb of samples: 443k (+71%)     | nb of samples: 504k (+7%)              |
            | size: 69 MB          | size: 105 MB                   | size: 75 MB                            |
--------------------------------------------------------------------------------------------------------------
top         | nb of samples: 500k  | nb of samples: 521k (+70%)     | nb of samples: 481k (+7%)              |
            | size: 69 MB          | size: 122 MB                   | size: 71 MB                            |
--------------------------------------------------------------------------------------------------------------
thunderbird | nb of samples: 266k  | nb of samples: 271k (+43%)     | nb of samples: 269k (+5%)              |
            | size: 31 MB          | size: 45 MB                    | size: 33 MB                            |
--------------------------------------------------------------------------------------------------------------
What do you think of these results?
Should there be a selectable mode specified in .perfconfig?
I will also look into reducing the size of the file, since in theory we only need the link register.
Also, regarding the compilation failures on different platforms,
what do you think of this fix?
diff --git a/tools/arch/arm64/include/uapi/asm/perf_regs.h b/tools/arch/arm64/include/uapi/asm/perf_regs.h
index d54daafa89e3..15b6805202c1 100644
--- a/tools/arch/arm64/include/uapi/asm/perf_regs.h
+++ b/tools/arch/arm64/include/uapi/asm/perf_regs.h
@@ -2,7 +2,7 @@
#ifndef _ASM_ARM64_PERF_REGS_H
#define _ASM_ARM64_PERF_REGS_H
-enum perf_event_arm_regs {
+enum perf_event_arm64_regs {
PERF_REG_ARM64_X0,
PERF_REG_ARM64_X1,
PERF_REG_ARM64_X2,
diff --git a/tools/perf/util/arm-frame-pointer-unwind-support.c b/tools/perf/util/arm-frame-pointer-unwind-support.c
index 964efd08e72e..0104477a762a 100644
--- a/tools/perf/util/arm-frame-pointer-unwind-support.c
+++ b/tools/perf/util/arm-frame-pointer-unwind-support.c
@@ -1,6 +1,5 @@
// SPDX-License-Identifier: GPL-2.0
#include "../arch/arm64/include/uapi/asm/perf_regs.h"
-#include "arch/arm64/include/perf_regs.h"
#include "event.h"
#include "arm-frame-pointer-unwind-support.h"
#include "callchain.h"
@@ -13,8 +12,9 @@ struct entries {
static bool get_leaf_frame_caller_enabled(struct perf_sample *sample)
{
+ unsigned long long arm64_regs_mask = ((1ULL << PERF_REG_ARM64_MAX) - 1);
return callchain_param.record_mode == CALLCHAIN_FP && sample->user_regs.regs
- && sample->user_regs.mask == PERF_REGS_MASK;
+ && sample->user_regs.mask == arm64_regs_mask;
}
Regards,
Alexandre
On 3/6/21 7:10 PM, Arnaldo Carvalho de Melo wrote:
> Em Sat, Mar 06, 2021 at 09:55:32AM -0300, Arnaldo Carvalho de Melo escreveu:
>> Em Fri, Mar 05, 2021 at 10:54:03AM +0200, James Clark escreveu:
>>> I've tested this patchset on a few different applications and have seen it significantly improve
>>> quality of frame pointer stacks on aarch64. For example with GDB 10 and default build options,
>>> 'bfd_calc_gnu_debuglink_crc32' is a leaf function, and its caller 'gdb_bfd_crc' is omitted,
>>> but with the patchset it is included. I've also confirmed that this is correct from looking at
>>> the source code.
>>>
>>> Before:
>>>
>>> # Children Self Command Shared Object Symbol
>>> # ........ ........ ............... .......................... ...........
>>> #
>>> 34.55% 0.00% gdb-100 gdb-100 [.] _start
>>> 0.78%
>>> _start
>>> __libc_start_main
>>> main
>>> gdb_main
>>> captured_command_loop
>>> gdb_do_one_event
>>> check_async_event_handlers
>>> fetch_inferior_event
>>> inferior_event_handler
>>> do_all_inferior_continuations
>>> attach_post_wait
>>> post_create_inferior
>>> svr4_solib_create_inferior_hook
>>> solib_add
>>> solib_read_symbols
>>> symbol_file_add_with_addrs
>>> read_symbols
>>> elf_symfile_read
>>> find_separate_debug_file_by_debuglink[abi:cxx11]
>>> find_separate_debug_file
>>> separate_debug_file_exists
>>> gdb_bfd_crc
>>> bfd_calc_gnu_debuglink_crc32
>>>
>>> After:
>>>
>>> # Children Self Command Shared Object Symbol
>>> # ........ ........ ............... .......................... ...........
>>> #
>>> 34.55% 0.00% gdb-100 gdb-100 [.] _start
>>> 0.78%
>>> _start
>>> __libc_start_main
>>> main
>>> gdb_main
>>> captured_command_loop
>>> gdb_do_one_event
>>> check_async_event_handlers
>>> fetch_inferior_event
>>> inferior_event_handler
>>> do_all_inferior_continuations
>>> attach_post_wait
>>> post_create_inferior
>>> svr4_solib_create_inferior_hook
>>> solib_add
>>> solib_read_symbols
>>> symbol_file_add_with_addrs
>>> read_symbols
>>> elf_symfile_read
>>> find_separate_debug_file_by_debuglink[abi:cxx11]
>>> find_separate_debug_file
>>> separate_debug_file_exists
>>> get_file_crc <--------------------- leaf frame caller added
>>> bfd_calc_gnu_debuglink_crc32
>>>
>>> There is a question about whether the overhead of recording all the registers is acceptable, for
>>> filesize and time. We could make it a manual step, at the cost of not showing better frame pointer
>>> stacks by default.
>>
>> Can someone quantify this, i.e. how much space per perf.data for a
>> typical scenario? But anyway, I'm applying it as is now, we can change
>> it if needed, it's not like files with the extra registers won't be
>> valid if/when we decide not to collect it by default in the future.
>>
>> If we decide to make this selectable, we should have it as a .perfconfig
>> knob as well, so that one can set it and change the default, etc.
>
>>> Tested-by: James Clark <james.clark@arm.com>
>
>
> This is unconditionally asking for asm/perf_regs.h and it is not available
> everywhere, so I think this has to be abstracted away, maybe using a weak
> function that arm provides a replacement for?
>
> A:Humm
>
> +++ b/tools/perf/util/Build
> @@ -1,3 +1,4 @@
> +perf-y += arm-frame-pointer-unwind-support.o
>
> Is this for doing cross-platform analysis? I.e. record a perf.data file
> on arm64 and then do a perf-report on it on a x86_64 machine? Yeah, that
> is expected to work, but then:
>
> +++ b/tools/perf/util/arm-frame-pointer-unwind-support.c
> @@ -0,0 +1,44 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include "../arch/arm64/include/uapi/asm/perf_regs.h"
> +#include "arch/arm64/include/perf_regs.h"
>
>
> [acme@five perf]$ head -25 tools/perf/arch/arm64/include/perf_regs.h
> /* SPDX-License-Identifier: GPL-2.0 */
> #ifndef ARCH_PERF_REGS_H
> #define ARCH_PERF_REGS_H
>
> #include <stdlib.h>
> #include <linux/types.h>
> #include <asm/perf_regs.h>
>
> void perf_regs_load(u64 *regs);
>
> #define PERF_REGS_MASK ((1ULL << PERF_REG_ARM64_MAX) - 1)
> #define PERF_REGS_MAX PERF_REG_ARM64_MAX
> #define PERF_SAMPLE_REGS_ABI PERF_SAMPLE_REGS_ABI_64
>
> #define PERF_REG_IP PERF_REG_ARM64_PC
> #define PERF_REG_SP PERF_REG_ARM64_SP
>
> static inline const char *__perf_reg_name(int id)
> {
> switch (id) {
> case PERF_REG_ARM64_X0:
> return "x0";
> case PERF_REG_ARM64_X1:
> return "x1";
> case PERF_REG_ARM64_X2:
> [acme@five perf]$
>
> Won't this get the wrong file when cross-building? See below.
>
> - Arnaldo
>
> [perfbuilder@five ~]$ time dm
> Sat Mar 6 12:02:01 PM -03 2021
> # export PERF_TARBALL=http://192.168.86.5/perf/perf-5.11.0.tar.xz
> # dm
> 1 78.76 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0 , clang version 3.8.0 (tags/RELEASE_380/final)
> 2 79.07 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822 , clang version 3.8.1 (tags/RELEASE_381/final)
> 3 82.68 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0 , clang version 4.0.0 (tags/RELEASE_400/final)
> 4 88.57 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0 , Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0)
> 5 89.47 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0 , Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
> 6 93.63 alpine:3.9 : Ok gcc (Alpine 8.3.0) 8.3.0 , Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1)
> 7 125.41 alpine:3.10 : Ok gcc (Alpine 8.3.0) 8.3.0 , Alpine clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0)
> 8 142.89 alpine:3.11 : Ok gcc (Alpine 9.3.0) 9.3.0 , Alpine clang version 9.0.0 (https://git.alpinelinux.org/aports f7f0d2c2b8bcd6a5843401a9a702029556492689) (based on LLVM 9.0.0)
> 9 126.52 alpine:3.12 : Ok gcc (Alpine 9.3.0) 9.3.0 , Alpine clang version 10.0.0 (https://gitlab.alpinelinux.org/alpine/aports.git 7445adce501f8473efdb93b17b5eaf2f1445ed4c)
> 10 135.80 alpine:3.13 : Ok gcc (Alpine 10.2.1_pre1) 10.2.1 20201203 , Alpine clang version 10.0.1
> 11 134.19 alpine:edge : Ok gcc (Alpine 10.2.1_pre1) 10.2.1 20201203 , Alpine clang version 10.0.1
> 12 78.12 alt:p8 : Ok x86_64-alt-linux-gcc (GCC) 5.3.1 20151207 (ALT p8 5.3.1-alt3.M80P.1) , clang version 3.8.0 (tags/RELEASE_380/final)
> 13 92.57 alt:p9 : Ok x86_64-alt-linux-gcc (GCC) 8.4.1 20200305 (ALT p9 8.4.1-alt0.p9.1) , clang version 10.0.0
> 14 93.72 alt:sisyphus : Ok x86_64-alt-linux-gcc (GCC) 10.2.1 20201125 (ALT Sisyphus 10.2.1-alt2) , clang version 10.0.1
> 15 75.10 amazonlinux:1 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2) , clang version 3.6.2 (tags/RELEASE_362/final)
> 16 116.14 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12) , clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2)
> 17 10.26 android-ndk:r12b-arm : FAIL gcc version 4.9.x 20150123 (prerelease) (GCC)
> from util/arm-frame-pointer-unwind-support.c:3:
> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
> enum perf_event_arm_regs {
> ^
> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
> In file included from util/arm-frame-pointer-unwind-support.c:2:0:
> /git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
> enum perf_event_arm_regs {
> ^
> make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
> 18 10.71 android-ndk:r15c-arm : FAIL gcc version 4.9.x 20150123 (prerelease) (GCC)
> from util/arm-frame-pointer-unwind-support.c:3:
> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
> enum perf_event_arm_regs {
> ^
> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
> In file included from util/arm-frame-pointer-unwind-support.c:2:0:
> /git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
> enum perf_event_arm_regs {
> ^
> 19 28.97 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
> 20 34.56 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
> 21 110.26 centos:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5) , clang version 10.0.1 (Red Hat 10.0.1-1.module_el8.3.0+467+cb298d5b)
> 22 67.12 clearlinux:latest : Ok gcc (Clear Linux OS for Intel Architecture) 10.2.1 20201217 releases/gcc-10.2.0-643-g7cbb07d2fc , clang version 10.0.1
> 23 87.08 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u2) 4.9.2 , Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
> 24 92.73 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516 , clang version 3.8.1-24 (tags/RELEASE_381/final)
> 25 86.91 debian:10 : Ok gcc (Debian 8.3.0-6) 8.3.0 , clang version 7.0.1-8+deb10u2 (tags/RELEASE_701/final)
> 26 86.13 debian:experimental : Ok gcc (Debian 10.2.1-6) 10.2.1 20210110 , Debian clang version 11.0.1-2
> 27 36.91 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110
> 28 9.08 debian:experimental-x-mips : FAIL gcc version 10.2.1 20201224 (Debian 10.2.1-3)
> from builtin-diff.c:12:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> 29 33.76 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 10.2.1-3) 10.2.1 20201224
> 30 12.74 debian:experimental-x-mipsel : FAIL gcc version 10.2.1 20201224 (Debian 10.2.1-3)
> from builtin-diff.c:12:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> In file included from util/perf_regs.h:30,
> from util/event.h:15,
> from util/branch.h:15,
> from util/callchain.h:8,
> from builtin-record.c:16:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> In file included from util/perf_regs.h:30,
> from util/event.h:15,
> from util/session.h:6,
> from builtin-buildid-list.c:17:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> In file included from util/perf_regs.h:30,
> from util/event.h:15,
> from util/thread.h:16,
> from builtin-sched.c:11:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> In file included from util/perf_regs.h:30,
> from util/event.h:15,
> from builtin-top.c:31:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> In file included from util/perf_regs.h:30,
> from util/event.h:15,
> from util/session.h:6,
> from builtin-evlist.c:16:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> In file included from util/perf_regs.h:30,
> from util/event.h:15,
> from util/session.h:6,
> from builtin-buildid-cache.c:24:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> In file included from util/perf_regs.h:30,
> from util/event.h:15,
> from builtin-stat.c:49:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> In file included from util/perf_regs.h:30,
> from util/event.h:15,
> from util/branch.h:15,
> from builtin-report.c:24:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> In file included from util/perf_regs.h:30,
> from util/event.h:15,
> from builtin-annotate.c:24:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> In file included from util/perf_regs.h:30,
> from util/event.h:15,
> from util/thread.h:16,
> from builtin-timechart.c:24:
> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> 31 32.64 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
> 32 35.41 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6) , clang version 3.5.0 (tags/RELEASE_350/final)
> 33 77.83 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6) , clang version 3.7.0 (tags/RELEASE_370/final)
> 34 96.28 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1) , clang version 3.8.1 (tags/RELEASE_381/final)
> 35 10.80 fedora:24-x-ARC-uClibc : FAIL gcc version 7.1.1 20170710 (ARCompact ISA Linux uClibc toolchain 2017.09-rc2)
> In file included from util/arm-frame-pointer-unwind-support.c:3:0:
> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> #include <asm/perf_regs.h>
> ^~~~~~~~~~~~~~~~~
> compilation terminated.
> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
> make[3]: *** [util] Error 2
> 36 98.38 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1) , clang version 3.9.1 (tags/RELEASE_391/final)
> 37 111.11 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2) , clang version 4.0.1 (tags/RELEASE_401/final)
> 38 111.55 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6) , clang version 5.0.2 (tags/RELEASE_502/final)
> 39 127.99 fedora:28 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2) , clang version 6.0.1 (tags/RELEASE_601/final)
> 40 136.36 fedora:29 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2) , clang version 7.0.1 (Fedora 7.0.1-6.fc29)
> 41 139.67 fedora:30 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2) , clang version 8.0.0 (Fedora 8.0.0-3.fc30)
> 42 10.57 fedora:30-x-ARC-uClibc : FAIL gcc version 8.3.1 20190225 (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1)
> In file included from util/arm-frame-pointer-unwind-support.c:3:
> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> #include <asm/perf_regs.h>
> ^~~~~~~~~~~~~~~~~
> compilation terminated.
> make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
> 43 136.09 fedora:31 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2) , clang version 9.0.1 (Fedora 9.0.1-4.fc31)
> 44 112.14 fedora:32 : Ok gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9) , clang version 10.0.1 (Fedora 10.0.1-3.fc32)
> 45 111.28 fedora:33 : Ok gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9) , clang version 11.0.0 (Fedora 11.0.0-2.fc33)
> 46 116.62 fedora:34 : Ok gcc (GCC) 11.0.0 20210225 (Red Hat 11.0.0-0) , clang version 12.0.0 (Fedora 12.0.0-0.1.rc1.fc34)
> 47 117.53 fedora:rawhide : Ok gcc (GCC) 11.0.0 20210210 (Red Hat 11.0.0-0) , clang version 12.0.0 (Fedora 12.0.0-0.1.rc1.fc35)
> 48 38.03 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 9.3.0-r1 p3) 9.3.0
> 49 78.35 mageia:5 : Ok gcc (GCC) 4.9.2 , clang version 3.5.2 (tags/RELEASE_352/final)
> 50 97.37 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0 , clang version 3.9.1 (tags/RELEASE_391/final)
> 51 116.36 manjaro:latest : Ok gcc (GCC) 10.2.0 , clang version 10.0.1
> 52 246.83 openmandriva:cooker : Ok gcc (GCC) 10.2.0 20200723 (OpenMandriva) , OpenMandriva 11.0.0-1 clang version 11.0.0 (/builddir/build/BUILD/llvm-project-llvmorg-11.0.0/clang 63e22714ac938c6b537bd958f70680d3331a2030)
> 53 134.08 opensuse:15.0 : Ok gcc (SUSE Linux) 7.4.1 20190905 [gcc-7-branch revision 275407] , clang version 5.0.1 (tags/RELEASE_501/final 312548)
> 54 142.32 opensuse:15.1 : Ok gcc (SUSE Linux) 7.5.0 , clang version 7.0.1 (tags/RELEASE_701/final 349238)
> 55 133.68 opensuse:15.2 : Ok gcc (SUSE Linux) 7.5.0 , clang version 9.0.1
> 56 146.85 opensuse:15.3 : Ok gcc (SUSE Linux) 7.5.0 , clang version 7.0.1 (tags/RELEASE_701/final 349238)
> 57 136.96 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5 , clang version 3.8.0 (tags/RELEASE_380/final 262553)
> 58 122.69 opensuse:tumbleweed : Ok gcc (SUSE Linux) 10.2.1 20200825 [revision c0746a1beb1ba073c7981eb09f55b3d993b32e5c] , clang version 10.0.1
> 59 28.36 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
> 60 35.22 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44.0.3)
> 61 109.60 oraclelinux:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5.0.1) , clang version 10.0.1 (Red Hat 10.0.1-1.0.1.module+el8.3.0+7827+89335dbf)
> 62 29.20 ubuntu:12.04 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 , Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)
> 63 33.46 ubuntu:14.04 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
> 64 93.53 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 , clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
> 65 11.70 ubuntu:16.04-x-arm : FAIL gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)
> from util/arm-frame-pointer-unwind-support.c:3:
> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
> enum perf_event_arm_regs {
> ^
> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
> In file included from util/arm-frame-pointer-unwind-support.c:2:0:
> /git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
> enum perf_event_arm_regs {
> ^
> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
> make[3]: *** [util] Error 2
> 66 28.74 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
> 67 27.61 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
> 68 28.82 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
> 69 28.16 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
> 70 27.96 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
> 71 101.40 ubuntu:18.04 : Ok gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 , clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
> 72 11.69 ubuntu:18.04-x-arm : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)
> from util/arm-frame-pointer-unwind-support.c:3:
> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
> enum perf_event_arm_regs {
> ^~~~~~~~~~~~~~~~~~~
> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
> In file included from util/arm-frame-pointer-unwind-support.c:2:0:
> /git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
> enum perf_event_arm_regs {
> ^~~~~~~~~~~~~~~~~~~
> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
> make[3]: *** [util] Error 2
> 73 29.62 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
> 74 10.74 ubuntu:18.04-x-m68k : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
> In file included from util/arm-frame-pointer-unwind-support.c:3:0:
> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> #include <asm/perf_regs.h>
> ^~~~~~~~~~~~~~~~~
> compilation terminated.
> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
> make[3]: *** [util] Error 2
> 75 29.37 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
> 76 32.06 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
> 77 31.37 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
> 78 11.66 ubuntu:18.04-x-riscv64 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
> In file included from util/arm-frame-pointer-unwind-support.c:3:0:
> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> #include <asm/perf_regs.h>
> ^~~~~~~~~~~~~~~~~
> compilation terminated.
> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
> make[3]: *** [util] Error 2
> 79 26.84 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
> 80 11.50 ubuntu:18.04-x-sh4 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
> In file included from util/arm-frame-pointer-unwind-support.c:3:0:
> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> #include <asm/perf_regs.h>
> ^~~~~~~~~~~~~~~~~
> compilation terminated.
> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
> make[3]: *** [util] Error 2
> 81 10.62 ubuntu:18.04-x-sparc64 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
> In file included from util/arm-frame-pointer-unwind-support.c:3:0:
> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> #include <asm/perf_regs.h>
> ^~~~~~~~~~~~~~~~~
> compilation terminated.
> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
> make[3]: *** [util] Error 2
> 82 80.18 ubuntu:19.10 : Ok gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008 , clang version 8.0.1-3build1 (tags/RELEASE_801/final)
> 83 11.44 ubuntu:19.10-x-alpha : FAIL gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu1)
> In file included from util/arm-frame-pointer-unwind-support.c:3:
> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
> 84 11.67 ubuntu:19.10-x-hppa : FAIL gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu1)
> In file included from util/arm-frame-pointer-unwind-support.c:3:
> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
> 7 | #include <asm/perf_regs.h>
> | ^~~~~~~~~~~~~~~~~
> compilation terminated.
> make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
> 85 83.72 ubuntu:20.04 : Ok gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 , clang version 10.0.0-4ubuntu1
> 86 34.23 ubuntu:20.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0
> 87 83.44 ubuntu:20.10 : Ok gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0 , Ubuntu clang version 11.0.0-2
> 88 79.47 ubuntu:21.04 : Ok gcc (Ubuntu 10.2.1-6ubuntu1) 10.2.1 20210110 , Ubuntu clang version 11.0.1-2
> 89 6489
>
> real 109m25.234s
> user 1m34.076s
> sys 0m55.476s
> [perfbuilder@five ~]$
>
>
>
>
* Re: [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address
2021-03-22 11:57 ` Alexandre Truong
@ 2021-03-26 12:15 ` James Clark
0 siblings, 0 replies; 12+ messages in thread
From: James Clark @ 2021-03-26 12:15 UTC (permalink / raw)
To: Alexandre Truong, Arnaldo Carvalho de Melo
Cc: linux-kernel, linux-perf-users, John Garry, Will Deacon,
Mathieu Poirier, Leo Yan, Peter Zijlstra, Ingo Molnar,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Kemeng Shi, Ian Rogers, Andi Kleen, Kan Liang, Jin Yao,
Adrian Hunter, Suzuki K Poulose, Al Grant, Wilco Dijkstra
On 22/03/2021 13:57, Alexandre Truong wrote:
> Hi Arnaldo,
>
> Thanks for your reply.
>
> I profiled a few applications and here are the results.
>
> ---------------------------------------------------------------------------------------------------------------
> | | | |
> App | Before the patch | With the patch (% of increase) | if only LR is recorded (% of increase) |
> | | | |
> ---------------------------------------------------------------------------------------------------------------
> | | | |
> firefox | nb of samples: 834k | nb of samples: 685k (+70%) | nb of samples: 671k (+10%) |
> | size: 101 MB | size: 141 MB | size: 90 MB |
Hi Alex,
I think the 70% increase when recording all of the registers is too much, so we should continue with the change
to only record the link register. In my opinion a 5% - 10% increase by enabling LR recording by default wouldn't
be noticeable, so it's worth doing. But obviously we'll need other opinions too.
Do you also have the average stack lengths for these figures?
James
> ---------------------------------------------------------------------------------------------------------------
> | | | |
> htop | nb of samples: 500k | nb of samples: 443k (+71%) | nb of samples: 504k (+7%) |
> | size: 69 MB | size: 105 MB | size: 75 MB |
> ---------------------------------------------------------------------------------------------------------------
> | | | |
> top | nb of samples: 500k | nb of samples: 521k (+70%) | nb of samples: 481k (+7%) |
> | size: 69 MB | size: 122 MB | size: 71 MB |
> ---------------------------------------------------------------------------------------------------------------
> | | | |
> thunderbird | nb of samples: 266k | nb of samples: 271k (+43%) | nb of samples: 269k (+5%) |
> | size: 31 MB | size: 45 MB | size: 33 MB |
> ---------------------------------------------------------------------------------------------------------------
>
> What do you think of these results?
> Should there be a selectable mode specified in .perfconfig?
> I will also investigate reducing the size of the file, since in theory we only need the link register.
>
> Also, for the compilation failures on the different platforms,
> what do you think of this fix?
>
> diff --git a/tools/arch/arm64/include/uapi/asm/perf_regs.h b/tools/arch/arm64/include/uapi/asm/perf_regs.h
> index d54daafa89e3..15b6805202c1 100644
> --- a/tools/arch/arm64/include/uapi/asm/perf_regs.h
> +++ b/tools/arch/arm64/include/uapi/asm/perf_regs.h
> @@ -2,7 +2,7 @@
> #ifndef _ASM_ARM64_PERF_REGS_H
> #define _ASM_ARM64_PERF_REGS_H
>
> -enum perf_event_arm_regs {
> +enum perf_event_arm64_regs {
> PERF_REG_ARM64_X0,
> PERF_REG_ARM64_X1,
> PERF_REG_ARM64_X2,
> diff --git a/tools/perf/util/arm-frame-pointer-unwind-support.c b/tools/perf/util/arm-frame-pointer-unwind-support.c
> index 964efd08e72e..0104477a762a 100644
> --- a/tools/perf/util/arm-frame-pointer-unwind-support.c
> +++ b/tools/perf/util/arm-frame-pointer-unwind-support.c
> @@ -1,6 +1,5 @@
> // SPDX-License-Identifier: GPL-2.0
> #include "../arch/arm64/include/uapi/asm/perf_regs.h"
> -#include "arch/arm64/include/perf_regs.h"
> #include "event.h"
> #include "arm-frame-pointer-unwind-support.h"
> #include "callchain.h"
> @@ -13,8 +12,9 @@ struct entries {
>
> static bool get_leaf_frame_caller_enabled(struct perf_sample *sample)
> {
> + unsigned long long arm64_regs_mask = ((1ULL << PERF_REG_ARM64_MAX) - 1);
> return callchain_param.record_mode == CALLCHAIN_FP && sample->user_regs.regs
> - && sample->user_regs.mask == PERF_REGS_MASK;
> + && sample->user_regs.mask == arm64_regs_mask;
> }
>
> Regards,
>
> Alexandre
>
> On 3/6/21 7:10 PM, Arnaldo Carvalho de Melo wrote:
>> Em Sat, Mar 06, 2021 at 09:55:32AM -0300, Arnaldo Carvalho de Melo escreveu:
>>> Em Fri, Mar 05, 2021 at 10:54:03AM +0200, James Clark escreveu:
>>>> I've tested this patchset on a few different applications and have seen it significantly improve
>>>> quality of frame pointer stacks on aarch64. For example with GDB 10 and default build options,
>>>> 'bfd_calc_gnu_debuglink_crc32' is a leaf function, and its caller 'get_file_crc' is omitted,
>>>> but with the patchset it is included. I've also confirmed that this is correct from looking at
>>>> the source code.
>>>>
>>>> Before:
>>>>
>>>> # Children Self Command Shared Object Symbol
>>>> # ........ ........ ............... .......................... ...........
>>>> #
>>>> 34.55% 0.00% gdb-100 gdb-100 [.] _start
>>>> 0.78%
>>>> _start
>>>> __libc_start_main
>>>> main
>>>> gdb_main
>>>> captured_command_loop
>>>> gdb_do_one_event
>>>> check_async_event_handlers
>>>> fetch_inferior_event
>>>> inferior_event_handler
>>>> do_all_inferior_continuations
>>>> attach_post_wait
>>>> post_create_inferior
>>>> svr4_solib_create_inferior_hook
>>>> solib_add
>>>> solib_read_symbols
>>>> symbol_file_add_with_addrs
>>>> read_symbols
>>>> elf_symfile_read
>>>> find_separate_debug_file_by_debuglink[abi:cxx11]
>>>> find_separate_debug_file
>>>> separate_debug_file_exists
>>>> gdb_bfd_crc
>>>> bfd_calc_gnu_debuglink_crc32
>>>>
>>>> After:
>>>>
>>>> # Children Self Command Shared Object Symbol
>>>> # ........ ........ ............... .......................... ...........
>>>> #
>>>> 34.55% 0.00% gdb-100 gdb-100 [.] _start
>>>> 0.78%
>>>> _start
>>>> __libc_start_main
>>>> main
>>>> gdb_main
>>>> captured_command_loop
>>>> gdb_do_one_event
>>>> check_async_event_handlers
>>>> fetch_inferior_event
>>>> inferior_event_handler
>>>> do_all_inferior_continuations
>>>> attach_post_wait
>>>> post_create_inferior
>>>> svr4_solib_create_inferior_hook
>>>> solib_add
>>>> solib_read_symbols
>>>> symbol_file_add_with_addrs
>>>> read_symbols
>>>> elf_symfile_read
>>>> find_separate_debug_file_by_debuglink[abi:cxx11]
>>>> find_separate_debug_file
>>>> separate_debug_file_exists
>>>> get_file_crc <--------------------- leaf frame caller added
>>>> bfd_calc_gnu_debuglink_crc32
>>>>
>>>> There is a question about whether the overhead of recording all the registers is acceptable, for
>>>> filesize and time. We could make it a manual step, at the cost of not showing better frame pointer
>>>> stacks by default.
>>>
>>> Can someone quantify this, i.e. how much space per perf.data for a
>>> typical scenario? But anyway, I'm applying it as is now, we can change
>>> it if needed, it's not like files with the extra registers won't be
>>> valid if/when we decide not to collect it by default in the future.
>>>
>>> If we decide to make this selectable, we should have it as a .perfconfig
>>> knob as well, so that one can set it and change the default, etc.
>>
>>>> Tested-by: James Clark <james.clark@arm.com>
>>
>>
>> This is unconditionally asking for asm/perf_regs.h and it is not available
>> everywhere, so I think this has to be abstracted away, maybe using a weak
>> function that arm provides a replacement for?
>>
>> A:Humm
>>
>> +++ b/tools/perf/util/Build
>> @@ -1,3 +1,4 @@
>> +perf-y += arm-frame-pointer-unwind-support.o
>>
>> Is this for doing cross-platform analysis? I.e. record a perf.data file
>> on arm64 and then do a perf-report on it on an x86_64 machine? Yeah, that
>> is expected to work, but then:
>>
>> +++ b/tools/perf/util/arm-frame-pointer-unwind-support.c
>> @@ -0,0 +1,44 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +#include "../arch/arm64/include/uapi/asm/perf_regs.h"
>> +#include "arch/arm64/include/perf_regs.h"
>>
>>
>> [acme@five perf]$ head -25 tools/perf/arch/arm64/include/perf_regs.h
>> /* SPDX-License-Identifier: GPL-2.0 */
>> #ifndef ARCH_PERF_REGS_H
>> #define ARCH_PERF_REGS_H
>>
>> #include <stdlib.h>
>> #include <linux/types.h>
>> #include <asm/perf_regs.h>
>>
>> void perf_regs_load(u64 *regs);
>>
>> #define PERF_REGS_MASK ((1ULL << PERF_REG_ARM64_MAX) - 1)
>> #define PERF_REGS_MAX PERF_REG_ARM64_MAX
>> #define PERF_SAMPLE_REGS_ABI PERF_SAMPLE_REGS_ABI_64
>>
>> #define PERF_REG_IP PERF_REG_ARM64_PC
>> #define PERF_REG_SP PERF_REG_ARM64_SP
>>
>> static inline const char *__perf_reg_name(int id)
>> {
>> switch (id) {
>> case PERF_REG_ARM64_X0:
>> return "x0";
>> case PERF_REG_ARM64_X1:
>> return "x1";
>> case PERF_REG_ARM64_X2:
>> [acme@five perf]$
>>
>> Won't this get the wrong file when cross-building? See below.
>>
>> - Arnaldo
>>
>> [perfbuilder@five ~]$ time dm
>> Sat Mar 6 12:02:01 PM -03 2021
>> # export PERF_TARBALL=http://192.168.86.5/perf/perf-5.11.0.tar.xz
>> # dm
>> 1 78.76 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0 , clang version 3.8.0 (tags/RELEASE_380/final)
>> 2 79.07 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822 , clang version 3.8.1 (tags/RELEASE_381/final)
>> 3 82.68 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0 , clang version 4.0.0 (tags/RELEASE_400/final)
>> 4 88.57 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0 , Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0)
>> 5 89.47 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0 , Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
>> 6 93.63 alpine:3.9 : Ok gcc (Alpine 8.3.0) 8.3.0 , Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1)
>> 7 125.41 alpine:3.10 : Ok gcc (Alpine 8.3.0) 8.3.0 , Alpine clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0)
>> 8 142.89 alpine:3.11 : Ok gcc (Alpine 9.3.0) 9.3.0 , Alpine clang version 9.0.0 (https://git.alpinelinux.org/aports f7f0d2c2b8bcd6a5843401a9a702029556492689) (based on LLVM 9.0.0)
>> 9 126.52 alpine:3.12 : Ok gcc (Alpine 9.3.0) 9.3.0 , Alpine clang version 10.0.0 (https://gitlab.alpinelinux.org/alpine/aports.git 7445adce501f8473efdb93b17b5eaf2f1445ed4c)
>> 10 135.80 alpine:3.13 : Ok gcc (Alpine 10.2.1_pre1) 10.2.1 20201203 , Alpine clang version 10.0.1
>> 11 134.19 alpine:edge : Ok gcc (Alpine 10.2.1_pre1) 10.2.1 20201203 , Alpine clang version 10.0.1
>> 12 78.12 alt:p8 : Ok x86_64-alt-linux-gcc (GCC) 5.3.1 20151207 (ALT p8 5.3.1-alt3.M80P.1) , clang version 3.8.0 (tags/RELEASE_380/final)
>> 13 92.57 alt:p9 : Ok x86_64-alt-linux-gcc (GCC) 8.4.1 20200305 (ALT p9 8.4.1-alt0.p9.1) , clang version 10.0.0
>> 14 93.72 alt:sisyphus : Ok x86_64-alt-linux-gcc (GCC) 10.2.1 20201125 (ALT Sisyphus 10.2.1-alt2) , clang version 10.0.1
>> 15 75.10 amazonlinux:1 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2) , clang version 3.6.2 (tags/RELEASE_362/final)
>> 16 116.14 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12) , clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2)
>> 17 10.26 android-ndk:r12b-arm : FAIL gcc version 4.9.x 20150123 (prerelease) (GCC)
>> from util/arm-frame-pointer-unwind-support.c:3:
>> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
>> enum perf_event_arm_regs {
>> ^
>> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
>> In file included from util/arm-frame-pointer-unwind-support.c:2:0:
>> /git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
>> enum perf_event_arm_regs {
>> ^
>> make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
>> 18 10.71 android-ndk:r15c-arm : FAIL gcc version 4.9.x 20150123 (prerelease) (GCC)
>> from util/arm-frame-pointer-unwind-support.c:3:
>> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
>> enum perf_event_arm_regs {
>> ^
>> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
>> In file included from util/arm-frame-pointer-unwind-support.c:2:0:
>> /git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
>> enum perf_event_arm_regs {
>> ^
>> 19 28.97 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
>> 20 34.56 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
>> 21 110.26 centos:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5) , clang version 10.0.1 (Red Hat 10.0.1-1.module_el8.3.0+467+cb298d5b)
>> 22 67.12 clearlinux:latest : Ok gcc (Clear Linux OS for Intel Architecture) 10.2.1 20201217 releases/gcc-10.2.0-643-g7cbb07d2fc , clang version 10.0.1
>> 23 87.08 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u2) 4.9.2 , Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
>> 24 92.73 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516 , clang version 3.8.1-24 (tags/RELEASE_381/final)
>> 25 86.91 debian:10 : Ok gcc (Debian 8.3.0-6) 8.3.0 , clang version 7.0.1-8+deb10u2 (tags/RELEASE_701/final)
>> 26 86.13 debian:experimental : Ok gcc (Debian 10.2.1-6) 10.2.1 20210110 , Debian clang version 11.0.1-2
>> 27 36.91 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 10.2.1-6) 10.2.1 20210110
>> 28 9.08 debian:experimental-x-mips : FAIL gcc version 10.2.1 20201224 (Debian 10.2.1-3)
>> from builtin-diff.c:12:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> 29 33.76 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 10.2.1-3) 10.2.1 20201224
>> 30 12.74 debian:experimental-x-mipsel : FAIL gcc version 10.2.1 20201224 (Debian 10.2.1-3)
>> from builtin-diff.c:12:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> In file included from util/perf_regs.h:30,
>> from util/event.h:15,
>> from util/branch.h:15,
>> from util/callchain.h:8,
>> from builtin-record.c:16:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> In file included from util/perf_regs.h:30,
>> from util/event.h:15,
>> from util/session.h:6,
>> from builtin-buildid-list.c:17:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> In file included from util/perf_regs.h:30,
>> from util/event.h:15,
>> from util/thread.h:16,
>> from builtin-sched.c:11:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> In file included from util/perf_regs.h:30,
>> from util/event.h:15,
>> from builtin-top.c:31:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> In file included from util/perf_regs.h:30,
>> from util/event.h:15,
>> from util/session.h:6,
>> from builtin-evlist.c:16:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> In file included from util/perf_regs.h:30,
>> from util/event.h:15,
>> from util/session.h:6,
>> from builtin-buildid-cache.c:24:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> In file included from util/perf_regs.h:30,
>> from util/event.h:15,
>> from builtin-stat.c:49:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> In file included from util/perf_regs.h:30,
>> from util/event.h:15,
>> from util/branch.h:15,
>> from builtin-report.c:24:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> In file included from util/perf_regs.h:30,
>> from util/event.h:15,
>> from builtin-annotate.c:24:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> In file included from util/perf_regs.h:30,
>> from util/event.h:15,
>> from util/thread.h:16,
>> from builtin-timechart.c:24:
>> /git/linux/tools/perf/arch/mips/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> 31 32.64 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
>> 32 35.41 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6) , clang version 3.5.0 (tags/RELEASE_350/final)
>> 33 77.83 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6) , clang version 3.7.0 (tags/RELEASE_370/final)
>> 34 96.28 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1) , clang version 3.8.1 (tags/RELEASE_381/final)
>> 35 10.80 fedora:24-x-ARC-uClibc : FAIL gcc version 7.1.1 20170710 (ARCompact ISA Linux uClibc toolchain 2017.09-rc2)
>> In file included from util/arm-frame-pointer-unwind-support.c:3:0:
>> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> #include <asm/perf_regs.h>
>> ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
>> make[3]: *** [util] Error 2
>> 36 98.38 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1) , clang version 3.9.1 (tags/RELEASE_391/final)
>> 37 111.11 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2) , clang version 4.0.1 (tags/RELEASE_401/final)
>> 38 111.55 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6) , clang version 5.0.2 (tags/RELEASE_502/final)
>> 39 127.99 fedora:28 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2) , clang version 6.0.1 (tags/RELEASE_601/final)
>> 40 136.36 fedora:29 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2) , clang version 7.0.1 (Fedora 7.0.1-6.fc29)
>> 41 139.67 fedora:30 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2) , clang version 8.0.0 (Fedora 8.0.0-3.fc30)
>> 42 10.57 fedora:30-x-ARC-uClibc : FAIL gcc version 8.3.1 20190225 (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1)
>> In file included from util/arm-frame-pointer-unwind-support.c:3:
>> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> #include <asm/perf_regs.h>
>> ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
>> 43 136.09 fedora:31 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2) , clang version 9.0.1 (Fedora 9.0.1-4.fc31)
>> 44 112.14 fedora:32 : Ok gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9) , clang version 10.0.1 (Fedora 10.0.1-3.fc32)
>> 45 111.28 fedora:33 : Ok gcc (GCC) 10.2.1 20201125 (Red Hat 10.2.1-9) , clang version 11.0.0 (Fedora 11.0.0-2.fc33)
>> 46 116.62 fedora:34 : Ok gcc (GCC) 11.0.0 20210225 (Red Hat 11.0.0-0) , clang version 12.0.0 (Fedora 12.0.0-0.1.rc1.fc34)
>> 47 117.53 fedora:rawhide : Ok gcc (GCC) 11.0.0 20210210 (Red Hat 11.0.0-0) , clang version 12.0.0 (Fedora 12.0.0-0.1.rc1.fc35)
>> 48 38.03 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 9.3.0-r1 p3) 9.3.0
>> 49 78.35 mageia:5 : Ok gcc (GCC) 4.9.2 , clang version 3.5.2 (tags/RELEASE_352/final)
>> 50 97.37 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0 , clang version 3.9.1 (tags/RELEASE_391/final)
>> 51 116.36 manjaro:latest : Ok gcc (GCC) 10.2.0 , clang version 10.0.1
>> 52 246.83 openmandriva:cooker : Ok gcc (GCC) 10.2.0 20200723 (OpenMandriva) , OpenMandriva 11.0.0-1 clang version 11.0.0 (/builddir/build/BUILD/llvm-project-llvmorg-11.0.0/clang 63e22714ac938c6b537bd958f70680d3331a2030)
>> 53 134.08 opensuse:15.0 : Ok gcc (SUSE Linux) 7.4.1 20190905 [gcc-7-branch revision 275407] , clang version 5.0.1 (tags/RELEASE_501/final 312548)
>> 54 142.32 opensuse:15.1 : Ok gcc (SUSE Linux) 7.5.0 , clang version 7.0.1 (tags/RELEASE_701/final 349238)
>> 55 133.68 opensuse:15.2 : Ok gcc (SUSE Linux) 7.5.0 , clang version 9.0.1
>> 56 146.85 opensuse:15.3 : Ok gcc (SUSE Linux) 7.5.0 , clang version 7.0.1 (tags/RELEASE_701/final 349238)
>> 57 136.96 opensuse:42.3 : Ok gcc (SUSE Linux) 4.8.5 , clang version 3.8.0 (tags/RELEASE_380/final 262553)
>> 58 122.69 opensuse:tumbleweed : Ok gcc (SUSE Linux) 10.2.1 20200825 [revision c0746a1beb1ba073c7981eb09f55b3d993b32e5c] , clang version 10.0.1
>> 59 28.36 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
>> 60 35.22 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44.0.3)
>> 61 109.60 oraclelinux:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5.0.1) , clang version 10.0.1 (Red Hat 10.0.1-1.0.1.module+el8.3.0+7827+89335dbf)
>> 62 29.20 ubuntu:12.04 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 , Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)
>> 63 33.46 ubuntu:14.04 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
>> 64 93.53 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 , clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
>> 65 11.70 ubuntu:16.04-x-arm : FAIL gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9)
>> from util/arm-frame-pointer-unwind-support.c:3:
>> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
>> enum perf_event_arm_regs {
>> ^
>> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
>> In file included from util/arm-frame-pointer-unwind-support.c:2:0:
>> /git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
>> enum perf_event_arm_regs {
>> ^
>> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
>> make[3]: *** [util] Error 2
>> 66 28.74 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
>> 67 27.61 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
>> 68 28.82 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
>> 69 28.16 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
>> 70 27.96 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
>> 71 101.40 ubuntu:18.04 : Ok gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0 , clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
>> 72 11.69 ubuntu:18.04-x-arm : FAIL gcc version 7.5.0 (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04)
>> from util/arm-frame-pointer-unwind-support.c:3:
>> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: nested redefinition of 'enum perf_event_arm_regs'
>> enum perf_event_arm_regs {
>> ^~~~~~~~~~~~~~~~~~~
>> /git/linux/tools/arch/arm/include/uapi/asm/perf_regs.h:5:6: error: redeclaration of 'enum perf_event_arm_regs'
>> In file included from util/arm-frame-pointer-unwind-support.c:2:0:
>> /git/linux/tools/include/../arch/arm64/include/uapi/asm/perf_regs.h:5:6: note: originally defined here
>> enum perf_event_arm_regs {
>> ^~~~~~~~~~~~~~~~~~~
>> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
>> make[3]: *** [util] Error 2
>> 73 29.62 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
>> 74 10.74 ubuntu:18.04-x-m68k : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
>> In file included from util/arm-frame-pointer-unwind-support.c:3:0:
>> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> #include <asm/perf_regs.h>
>> ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
>> make[3]: *** [util] Error 2
>> 75 29.37 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
>> 76 32.06 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
>> 77 31.37 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
>> 78 11.66 ubuntu:18.04-x-riscv64 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
>> In file included from util/arm-frame-pointer-unwind-support.c:3:0:
>> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> #include <asm/perf_regs.h>
>> ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
>> make[3]: *** [util] Error 2
>> 79 26.84 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
>> 80 11.50 ubuntu:18.04-x-sh4 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
>> In file included from util/arm-frame-pointer-unwind-support.c:3:0:
>> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> #include <asm/perf_regs.h>
>> ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
>> make[3]: *** [util] Error 2
>> 81 10.62 ubuntu:18.04-x-sparc64 : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)
>> In file included from util/arm-frame-pointer-unwind-support.c:3:0:
>> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> #include <asm/perf_regs.h>
>> ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> /git/linux/tools/build/Makefile.build:139: recipe for target 'util' failed
>> make[3]: *** [util] Error 2
>> 82 80.18 ubuntu:19.10 : Ok gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008 , clang version 8.0.1-3build1 (tags/RELEASE_801/final)
>> 83 11.44 ubuntu:19.10-x-alpha : FAIL gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu1)
>> In file included from util/arm-frame-pointer-unwind-support.c:3:
>> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
>> 84 11.67 ubuntu:19.10-x-hppa : FAIL gcc version 9.2.1 20191008 (Ubuntu 9.2.1-9ubuntu1)
>> In file included from util/arm-frame-pointer-unwind-support.c:3:
>> /git/linux/tools/perf/arch/arm64/include/perf_regs.h:7:10: fatal error: asm/perf_regs.h: No such file or directory
>> 7 | #include <asm/perf_regs.h>
>> | ^~~~~~~~~~~~~~~~~
>> compilation terminated.
>> make[3]: *** [/git/linux/tools/build/Makefile.build:139: util] Error 2
>> 85 83.72 ubuntu:20.04 : Ok gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 , clang version 10.0.0-4ubuntu1
>> 86 34.23 ubuntu:20.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0
>> 87 83.44 ubuntu:20.10 : Ok gcc (Ubuntu 10.2.0-13ubuntu1) 10.2.0 , Ubuntu clang version 11.0.0-2
>> 88 79.47 ubuntu:21.04 : Ok gcc (Ubuntu 10.2.1-6ubuntu1) 10.2.1 20210110 , Ubuntu clang version 11.0.1-2
>> 89 6489
>>
>> real 109m25.234s
>> user 1m34.076s
>> sys 0m55.476s
>> [perfbuilder@five ~]$
>>
>>
>>
>>
Thread overview: 12+ messages
2021-03-04 16:32 [PATCH RESEND WITH CCs v3 1/4] perf tools: record aarch64 registers automatically Alexandre Truong
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 2/4] perf tools: add a mechanism to inject stack frames Alexandre Truong
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 3/4] perf tools: enable dwarf_callchain_users on aarch64 Alexandre Truong
2021-03-05 11:51 ` Leo Yan
2021-03-05 14:07 ` Leo Yan
2021-03-09 16:10 ` Alexandre Truong
2021-03-04 16:32 ` [PATCH RESEND WITH CCs v3 4/4] perf tools: determine if LR is the return address Alexandre Truong
2021-03-05 8:54 ` James Clark
2021-03-06 12:55 ` Arnaldo Carvalho de Melo
2021-03-06 19:10 ` Arnaldo Carvalho de Melo
2021-03-22 11:57 ` Alexandre Truong
2021-03-26 12:15 ` James Clark