* [GIT PULL 0/7] perf/urgent callchain fixes
@ 2017-05-24  6:21 Namhyung Kim
  2017-05-24  6:21 ` [PATCH 1/7] perf report: don't crash on invalid maps in `-g srcline` mode Namhyung Kim
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Milian Wolff, Yao Jin

Hi Ingo,

Please consider pulling the perf tooling changes below.  Build tested on
Ubuntu, Fedora and Archlinux.  I found a problem during `perf test` but it
seems unrelated to this series.  I will take a look at it later.

Thanks,
Namhyung


The following changes since commit 88b0193d9418c00340e45e0a913a0813bc6c8c96:

  perf/callchain: Force USER_DS when invoking perf_callchain_user() (2017-05-10 07:54:00 +0200)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf tags/perf-urgent-for-mingo-4.12-20170524

for you to fetch changes up to 37d4e1b6ba56773cef96122dff4436c2c534c381:

  perf tools: Fix to put caller above callee in children mode (2017-05-24 08:51:11 +0900)

----------------------------------------------------------------
perf/urgent fixes

Fixes:

 - Fix segfault on `perf report -g srcline` if a callchain address cannot
   find a map for some reason.  The srcline sorting mode needs a DSO to
   resolve line numbers, and the DSO is accessed via a map, so the code
   should first check whether a map is available for the address.
   (Milian Wolff)

 - Fix off-by-one for srcline output.  The (unwound) address was passed
   as-is to resolve the srcline for callchains, but it is a return
   address, which points to the instruction after the call.  This leads
   to an off-by-one in the srcline info, so pass "address - 1" instead to
   get the correct srcline.  Signal frames are handled as well: they carry
   the exact address, so the address is passed directly in that case.
   (Milian Wolff)

 - Fix missing inlined function.  The current code failed to display
   inlined functions at the end.  This was found when comparing the
   output of addr2line and perf script.  (Milian Wolff)

User Visible:

 - `perf script` also gained an `--inline` option to show inlined
   functions with callchains.  This helped to find a bug in the current
   inline code.  (Namhyung Kim)

 - Fix missed callchain ordering with `-g callee/caller` when libbfd is
   not available.  (Milian Wolff)

 - Reorder output entries in `perf report --children` so that parent
   entries are put above their children.  It used to work like this, but
   that was missed when the callchain display order was changed with
   `-g caller`.  Now the default is `-g caller` if children mode is
   enabled.  (Namhyung Kim)

----------------------------------------------------------------
Milian Wolff (5):
      perf report: don't crash on invalid maps in `-g srcline` mode
      perf report: fix memory leak in addr2line when called by addr2inlines
      perf report: fix off-by-one for non-activation frames
      perf report: always honor callchain order for inlined nodes
      perf report: do not drop last inlined frame

Namhyung Kim (2):
      perf script: Add --inline option
      perf tools: Fix to put caller above callee in children mode

 tools/perf/Documentation/perf-script.txt |  4 +++
 tools/perf/builtin-script.c              |  2 ++
 tools/perf/ui/hist.c                     |  2 ++
 tools/perf/util/callchain.c              | 13 ++++++---
 tools/perf/util/evsel_fprintf.c          | 33 +++++++++++++++++++++
 tools/perf/util/srcline.c                | 49 +++++++++++++++++---------------
 tools/perf/util/unwind-libdw.c           |  6 +++-
 tools/perf/util/unwind-libunwind-local.c | 11 +++++++
 8 files changed, 92 insertions(+), 28 deletions(-)

* [PATCH 1/7] perf report: don't crash on invalid maps in `-g srcline` mode
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
@ 2017-05-24  6:21 ` Namhyung Kim
  2017-05-24  7:03   ` [tip:perf/urgent] perf report: Don't " tip-bot for Milian Wolff
  2017-05-24  6:21 ` [PATCH 2/7] perf report: fix memory leak in addr2line when called by addr2inlines Namhyung Kim
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Milian Wolff, Yao Jin, Arnaldo Carvalho de Melo, David Ahern, Peter Zijlstra

From: Milian Wolff <milian.wolff@kdab.com>

I just hit a segfault when doing `perf report -g srcline`.
Valgrind pointed me at this code as the culprit:

==8359== Invalid read of size 8
==8359==    at 0x3096D9: map__rip_2objdump (map.c:430)
==8359==    by 0x2FC1A3: match_chain_srcline (callchain.c:645)
==8359==    by 0x2FC1A3: match_chain (callchain.c:700)
==8359==    by 0x2FC1A3: append_chain (callchain.c:895)
==8359==    by 0x2FC1A3: append_chain_children (callchain.c:846)
==8359==    by 0x2FF719: callchain_append (callchain.c:944)
==8359==    by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
==8359==    by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
==8359==    by 0x33195C: hist_entry_iter__add (hist.c:1050)
==8359==    by 0x258F65: process_sample_event (builtin-report.c:204)
==8359==    by 0x30D60C: perf_session__deliver_event (session.c:1310)
==8359==    by 0x30D60C: ordered_events__deliver_event (session.c:119)
==8359==    by 0x310D12: __ordered_events__flush (ordered-events.c:210)
==8359==    by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
==8359==    by 0x30DD3C: perf_session__process_user_event (session.c:1349)
==8359==    by 0x30DD3C: perf_session__process_event (session.c:1475)
==8359==    by 0x30FC3C: __perf_session__process_events (session.c:1867)
==8359==    by 0x30FC3C: perf_session__process_events (session.c:1921)
==8359==    by 0x25A985: __cmd_report (builtin-report.c:575)
==8359==    by 0x25A985: cmd_report (builtin-report.c:1054)
==8359==    by 0x2B9A80: run_builtin (perf.c:296)
==8359==  Address 0x70 is not stack'd, malloc'd or (recently) free'd

This patch fixes the issue by checking that a map is available before
resolving the srcline.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
[namhyung@kernel.org: remove dependency from another change]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/callchain.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 81fc29ac798f..b4204b43ed58 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -621,14 +621,19 @@ enum match_result {
 static enum match_result match_chain_srcline(struct callchain_cursor_node *node,
 					     struct callchain_list *cnode)
 {
-	char *left = get_srcline(cnode->ms.map->dso,
+	char *left = NULL;
+	char *right = NULL;
+	enum match_result ret = MATCH_EQ;
+	int cmp;
+
+	if (cnode->ms.map)
+		left = get_srcline(cnode->ms.map->dso,
 				 map__rip_2objdump(cnode->ms.map, cnode->ip),
 				 cnode->ms.sym, true, false);
-	char *right = get_srcline(node->map->dso,
+	if (node->map)
+		right = get_srcline(node->map->dso,
 				  map__rip_2objdump(node->map, node->ip),
 				  node->sym, true, false);
-	enum match_result ret = MATCH_EQ;
-	int cmp;
 
 	if (left && right)
 		cmp = strcmp(left, right);
-- 
2.13.0

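The guard this patch adds can be illustrated in isolation. What follows is a minimal sketch, not perf code: the demo_* types and functions are hypothetical stand-ins for perf's map/dso structures, and the ordering used when only one side has a srcline is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for perf's struct dso and struct map. */
struct demo_dso { const char *srcline; };
struct demo_map { struct demo_dso *dso; };

/* Resolve a srcline, or return NULL when no map was found for the address. */
static const char *demo_srcline(const struct demo_map *map)
{
	if (!map || !map->dso)
		return NULL;	/* the unguarded code dereferenced map here */
	return map->dso->srcline;
}

/*
 * Compare two callchain entries by srcline and return -1, 0 or 1.
 * Assumption for this sketch: an entry without a srcline sorts first,
 * and two entries without a srcline compare equal.
 */
static int demo_match_srcline(const struct demo_map *a,
			      const struct demo_map *b)
{
	const char *left = demo_srcline(a);
	const char *right = demo_srcline(b);
	int cmp;

	if (left && right) {
		cmp = strcmp(left, right);
		return (cmp > 0) - (cmp < 0);
	}
	if (!left && !right)
		return 0;
	return left ? 1 : -1;
}
```

The point of the sketch is only that both lookups tolerate a missing map, so the comparison can run on whatever information is available instead of crashing.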
* [tip:perf/urgent] perf report: Don't crash on invalid maps in `-g srcline` mode
  2017-05-24  6:21 ` [PATCH 1/7] perf report: don't crash on invalid maps in `-g srcline` mode Namhyung Kim
@ 2017-05-24  7:03   ` tip-bot for Milian Wolff
  0 siblings, 0 replies; 26+ messages in thread
From: tip-bot for Milian Wolff @ 2017-05-24  7:03 UTC
  To: linux-tip-commits
  Cc: linux-kernel, jolsa, yao.jin, acme, torvalds, acme, milian.wolff, mingo, namhyung, jolsa, tglx, hpa, peterz, dsahern, a.p.zijlstra

Commit-ID:  7d4df089d77306914426a604c890175f91a9a459
Gitweb:     http://git.kernel.org/tip/7d4df089d77306914426a604c890175f91a9a459
Author:     Milian Wolff <milian.wolff@kdab.com>
AuthorDate: Wed, 24 May 2017 15:21:23 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:47 +0200

perf report: Don't crash on invalid maps in `-g srcline` mode

I just hit a segfault when doing `perf report -g srcline`.
Valgrind pointed me at this code as the culprit:

==8359== Invalid read of size 8
==8359==    at 0x3096D9: map__rip_2objdump (map.c:430)
==8359==    by 0x2FC1A3: match_chain_srcline (callchain.c:645)
==8359==    by 0x2FC1A3: match_chain (callchain.c:700)
==8359==    by 0x2FC1A3: append_chain (callchain.c:895)
==8359==    by 0x2FC1A3: append_chain_children (callchain.c:846)
==8359==    by 0x2FF719: callchain_append (callchain.c:944)
==8359==    by 0x2FF719: hist_entry__append_callchain (callchain.c:1058)
==8359==    by 0x32FA06: iter_add_single_cumulative_entry (hist.c:908)
==8359==    by 0x33195C: hist_entry_iter__add (hist.c:1050)
==8359==    by 0x258F65: process_sample_event (builtin-report.c:204)
==8359==    by 0x30D60C: perf_session__deliver_event (session.c:1310)
==8359==    by 0x30D60C: ordered_events__deliver_event (session.c:119)
==8359==    by 0x310D12: __ordered_events__flush (ordered-events.c:210)
==8359==    by 0x310D12: ordered_events__flush.part.3 (ordered-events.c:277)
==8359==    by 0x30DD3C: perf_session__process_user_event (session.c:1349)
==8359==    by 0x30DD3C: perf_session__process_event (session.c:1475)
==8359==    by 0x30FC3C: __perf_session__process_events (session.c:1867)
==8359==    by 0x30FC3C: perf_session__process_events (session.c:1921)
==8359==    by 0x25A985: __cmd_report (builtin-report.c:575)
==8359==    by 0x25A985: cmd_report (builtin-report.c:1054)
==8359==    by 0x2B9A80: run_builtin (perf.c:296)
==8359==  Address 0x70 is not stack'd, malloc'd or (recently) free'd

This patch fixes the issue by checking that a map is available before
resolving the srcline.

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
[ Remove dependency from another change ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-2-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/util/callchain.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 81fc29a..b4204b4 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -621,14 +621,19 @@ enum match_result {
 static enum match_result match_chain_srcline(struct callchain_cursor_node *node,
 					     struct callchain_list *cnode)
 {
-	char *left = get_srcline(cnode->ms.map->dso,
+	char *left = NULL;
+	char *right = NULL;
+	enum match_result ret = MATCH_EQ;
+	int cmp;
+
+	if (cnode->ms.map)
+		left = get_srcline(cnode->ms.map->dso,
 				 map__rip_2objdump(cnode->ms.map, cnode->ip),
 				 cnode->ms.sym, true, false);
-	char *right = get_srcline(node->map->dso,
+	if (node->map)
+		right = get_srcline(node->map->dso,
 				  map__rip_2objdump(node->map, node->ip),
 				  node->sym, true, false);
-	enum match_result ret = MATCH_EQ;
-	int cmp;
 
 	if (left && right)
 		cmp = strcmp(left, right);

* [PATCH 2/7] perf report: fix memory leak in addr2line when called by addr2inlines
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
  2017-05-24  6:21 ` [PATCH 1/7] perf report: don't crash on invalid maps in `-g srcline` mode Namhyung Kim
@ 2017-05-24  6:21 ` Namhyung Kim
  2017-05-24  7:04   ` [tip:perf/urgent] perf report: Fix " tip-bot for Milian Wolff
  2017-05-24  6:21 ` [PATCH 3/7] perf report: fix off-by-one for non-activation frames Namhyung Kim
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Milian Wolff, Yao Jin, Arnaldo Carvalho de Melo, David Ahern, Peter Zijlstra

From: Milian Wolff <milian.wolff@kdab.com>

When a filename was found in addr2line it was duplicated via strdup
but never freed. Now we pass NULL and handle this gracefully in
addr2line.

Detected by Valgrind:

==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
==16331==    at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16331==    by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
==16331==    by 0x52769F: addr2line (srcline.c:256)
==16331==    by 0x52769F: addr2inlines (srcline.c:294)
==16331==    by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
==16331==    by 0x574D7A: inline__fprintf (hist.c:41)
==16331==    by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
==16331==    by 0x57518A: __callchain__fprintf_graph (hist.c:212)
==16331==    by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
==16331==    by 0x57738E: hist_entry__fprintf (hist.c:628)
==16331==    by 0x57738E: hists__fprintf (hist.c:882)
==16331==    by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
==16331==    by 0x44A20F: report__browse_hists (builtin-report.c:491)
==16331==    by 0x44A20F: __cmd_report (builtin-report.c:624)
==16331==    by 0x44A20F: cmd_report (builtin-report.c:1054)
==16331==    by 0x4A49CE: run_builtin (perf.c:296)
==16331==    by 0x4A4CC0: handle_internal_command (perf.c:348)
==16331==    by 0x434371: run_argv (perf.c:392)
==16331==    by 0x434371: main (perf.c:530)

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/srcline.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index df051a52393c..5e376d64d59e 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -230,7 +230,10 @@ static int addr2line(const char *dso_name, u64 addr,
 
 	bfd_map_over_sections(a2l->abfd, find_address_in_section, a2l);
 
-	if (a2l->found && unwind_inlines) {
+	if (!a2l->found)
+		return 0;
+
+	if (unwind_inlines) {
 		int cnt = 0;
 
 		while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
@@ -243,6 +246,8 @@ static int addr2line(const char *dso_name, u64 addr,
 						a2l->line, node,
 						dso) != 0)
 				return 0;
+			// found at least one inline frame
+			ret = 1;
 		}
 	}
 
@@ -252,14 +257,14 @@ static int addr2line(const char *dso_name, u64 addr,
 		}
 	}
 
-	if (a2l->found && a2l->filename) {
-		*file = strdup(a2l->filename);
-		*line = a2l->line;
-
-		if (*file)
-			ret = 1;
+	if (file) {
+		*file = a2l->filename ? strdup(a2l->filename) : NULL;
+		ret = *file ? 1 : 0;
 	}
 
+	if (line)
+		*line = a2l->line;
+
 	return ret;
 }
 
@@ -278,8 +283,6 @@ void dso__free_a2l(struct dso *dso)
 static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 	struct dso *dso)
 {
-	char *file = NULL;
-	unsigned int line = 0;
 	struct inline_node *node;
 
 	node = zalloc(sizeof(*node));
@@ -291,7 +294,7 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 	INIT_LIST_HEAD(&node->val);
 	node->addr = addr;
 
-	if (!addr2line(dso_name, addr, &file, &line, dso, TRUE, node))
+	if (!addr2line(dso_name, addr, NULL, NULL, dso, TRUE, node))
 		goto out_free_inline_node;
 
 	if (list_empty(&node->val))
-- 
2.13.0

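The ownership rule behind this fix (only duplicate the filename when the caller asked for it, so nothing is allocated that nobody will free) can be shown on its own. This is a sketch under stated assumptions: demo_result, demo_strdup and demo_report are hypothetical names invented for the example, and a hand-written duplicate helper is used instead of strdup() only to keep the snippet within ISO C.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical lookup result, standing in for the state addr2line keeps. */
struct demo_result {
	const char *filename;	/* may be NULL when nothing was found */
	unsigned int line;
};

/* Portable duplicate helper; returns NULL on allocation failure. */
static char *demo_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Deliver the result through optional out-parameters; either may be NULL.
 * Returns 1 if a filename was handed to the caller, 0 otherwise.
 * When *file is set, the caller owns it and must free() it.
 */
static int demo_report(const struct demo_result *res,
		       char **file, unsigned int *line)
{
	int ret = 0;

	if (file) {
		*file = res->filename ? demo_strdup(res->filename) : NULL;
		ret = *file ? 1 : 0;
	}

	if (line)
		*line = res->line;

	return ret;
}
```

A caller that only cares about side effects passes NULL for both out-parameters and therefore never receives memory it would have to release, which is the situation addr2inlines is in after the patch.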
* [tip:perf/urgent] perf report: Fix memory leak in addr2line when called by addr2inlines
  2017-05-24  6:21 ` [PATCH 2/7] perf report: fix memory leak in addr2line when called by addr2inlines Namhyung Kim
@ 2017-05-24  7:04   ` tip-bot for Milian Wolff
  0 siblings, 0 replies; 26+ messages in thread
From: tip-bot for Milian Wolff @ 2017-05-24  7:04 UTC
  To: linux-tip-commits
  Cc: linux-kernel, peterz, jolsa, hpa, milian.wolff, jolsa, namhyung, acme, yao.jin, dsahern, mingo, acme, tglx, a.p.zijlstra, torvalds

Commit-ID:  b21cc97810932a551f7aac46f0b89c469c828b3f
Gitweb:     http://git.kernel.org/tip/b21cc97810932a551f7aac46f0b89c469c828b3f
Author:     Milian Wolff <milian.wolff@kdab.com>
AuthorDate: Wed, 24 May 2017 15:21:24 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf report: Fix memory leak in addr2line when called by addr2inlines

When a filename was found in addr2line it was duplicated via strdup()
but never freed. Now we pass NULL and handle this gracefully in
addr2line.

Detected by Valgrind:

==16331== 1,680 bytes in 21 blocks are definitely lost in loss record 148 of 220
==16331==    at 0x4C2AF1F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16331==    by 0x672FA69: strdup (in /usr/lib/libc-2.25.so)
==16331==    by 0x52769F: addr2line (srcline.c:256)
==16331==    by 0x52769F: addr2inlines (srcline.c:294)
==16331==    by 0x52769F: dso__parse_addr_inlines (srcline.c:502)
==16331==    by 0x574D7A: inline__fprintf (hist.c:41)
==16331==    by 0x574D7A: ipchain__fprintf_graph (hist.c:147)
==16331==    by 0x57518A: __callchain__fprintf_graph (hist.c:212)
==16331==    by 0x5753CF: callchain__fprintf_graph.constprop.6 (hist.c:337)
==16331==    by 0x57738E: hist_entry__fprintf (hist.c:628)
==16331==    by 0x57738E: hists__fprintf (hist.c:882)
==16331==    by 0x44A20F: perf_evlist__tty_browse_hists (builtin-report.c:399)
==16331==    by 0x44A20F: report__browse_hists (builtin-report.c:491)
==16331==    by 0x44A20F: __cmd_report (builtin-report.c:624)
==16331==    by 0x44A20F: cmd_report (builtin-report.c:1054)
==16331==    by 0x4A49CE: run_builtin (perf.c:296)
==16331==    by 0x4A4CC0: handle_internal_command (perf.c:348)
==16331==    by 0x434371: run_argv (perf.c:392)
==16331==    by 0x434371: main (perf.c:530)

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-3-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/util/srcline.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index df051a5..5e376d6 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -230,7 +230,10 @@ static int addr2line(const char *dso_name, u64 addr,
 
 	bfd_map_over_sections(a2l->abfd, find_address_in_section, a2l);
 
-	if (a2l->found && unwind_inlines) {
+	if (!a2l->found)
+		return 0;
+
+	if (unwind_inlines) {
 		int cnt = 0;
 
 		while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
@@ -243,6 +246,8 @@ static int addr2line(const char *dso_name, u64 addr,
 						a2l->line, node,
 						dso) != 0)
 				return 0;
+			// found at least one inline frame
+			ret = 1;
 		}
 	}
 
@@ -252,14 +257,14 @@ static int addr2line(const char *dso_name, u64 addr,
 		}
 	}
 
-	if (a2l->found && a2l->filename) {
-		*file = strdup(a2l->filename);
-		*line = a2l->line;
-
-		if (*file)
-			ret = 1;
+	if (file) {
+		*file = a2l->filename ? strdup(a2l->filename) : NULL;
+		ret = *file ? 1 : 0;
 	}
 
+	if (line)
+		*line = a2l->line;
+
 	return ret;
 }
 
@@ -278,8 +283,6 @@ void dso__free_a2l(struct dso *dso)
 static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 	struct dso *dso)
 {
-	char *file = NULL;
-	unsigned int line = 0;
 	struct inline_node *node;
 
 	node = zalloc(sizeof(*node));
@@ -291,7 +294,7 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 	INIT_LIST_HEAD(&node->val);
 	node->addr = addr;
 
-	if (!addr2line(dso_name, addr, &file, &line, dso, TRUE, node))
+	if (!addr2line(dso_name, addr, NULL, NULL, dso, TRUE, node))
 		goto out_free_inline_node;
 
 	if (list_empty(&node->val))

* [PATCH 3/7] perf report: fix off-by-one for non-activation frames
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
  2017-05-24  6:21 ` [PATCH 1/7] perf report: don't crash on invalid maps in `-g srcline` mode Namhyung Kim
  2017-05-24  6:21 ` [PATCH 2/7] perf report: fix memory leak in addr2line when called by addr2inlines Namhyung Kim
@ 2017-05-24  6:21 ` Namhyung Kim
  2017-05-24  7:05   ` [tip:perf/urgent] perf report: Fix " tip-bot for Milian Wolff
  2017-05-24  6:21 ` [PATCH 4/7] perf script: Add --inline option Namhyung Kim
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Milian Wolff, Yao Jin, Arnaldo Carvalho de Melo, David Ahern, Peter Zijlstra

From: Milian Wolff <milian.wolff@kdab.com>

As the documentation for dwfl_frame_pc says, frames that are not
activation frames need to have their program counter decremented by
one to properly find the function of the caller.

This fixes many cases where perf report currently attributes the cost
to the next line. For example, I have code like this:

~~~~~~~~~~~~~~~
  #include <thread>
  #include <chrono>

  using namespace std;

  int main()
  {
    this_thread::sleep_for(chrono::milliseconds(1000));
    this_thread::sleep_for(chrono::milliseconds(100));
    this_thread::sleep_for(chrono::milliseconds(10));

    return 0;
  }
~~~~~~~~~~~~~~~

Now compile and record it:

~~~~~~~~~~~~~~~
  g++ -std=c++11 -g -O2 test.cpp
  echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
  perf record \
    --event sched:sched_stat_sleep \
    --event sched:sched_process_exit \
    --event sched:sched_switch --call-graph=dwarf \
    --output perf.data.raw \
    ./a.out
  echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
  perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~

Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:

~~~~~~~~~~~~~~~
  Overhead  Source:Line
  ........  ...........

   100.00%  core.c:0
            |
            ---__schedule core.c:0
               schedule
               do_nanosleep hrtimer.c:0
               hrtimer_nanosleep
               sys_nanosleep
               entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
               __nanosleep_nocancel .:0
               std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
               |
               |--90.08%--main test.cpp:9
               |          __libc_start_main
               |          _start
               |
               |--9.01%--main test.cpp:10
               |          __libc_start_main
               |          _start
               |
                --0.91%--main test.cpp:13
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~

With this patch applied, the issue is fixed. The report becomes
much more usable:

~~~~~~~~~~~~~~~
  Overhead  Source:Line
  ........  ...........

   100.00%  core.c:0
            |
            ---__schedule core.c:0
               schedule
               do_nanosleep hrtimer.c:0
               hrtimer_nanosleep
               sys_nanosleep
               entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
               __nanosleep_nocancel .:0
               std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
               |
               |--90.08%--main test.cpp:8
               |          __libc_start_main
               |          _start
               |
               |--9.01%--main test.cpp:9
               |          __libc_start_main
               |          _start
               |
                --0.91%--main test.cpp:10
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~

Similarly it works for signal frames:

~~~~~~~~~~~~~~~
  __noinline void bar(void)
  {
    volatile long cnt = 0;

    for (cnt = 0; cnt < 100000000; cnt++);
  }

  __noinline void foo(void)
  {
    bar();
  }

  void sig_handler(int sig)
  {
    foo();
  }

  int main(void)
  {
    signal(SIGUSR1, sig_handler);
    raise(SIGUSR1);

    foo();
    return 0;
  }
~~~~~~~~~~~~~~~~

Before, the report wrongly points to `signal.c:29` after raise():

~~~~~~~~~~~~~~~~
  $ perf report --stdio --no-children -g srcline -s srcline
  ...
   100.00%  signal.c:11
            |
            ---bar signal.c:11
               |
               |--50.49%--main signal.c:29
               |          __libc_start_main
               |          _start
               |
                --49.51%--0x33a8f
                          raise .:0
                          main signal.c:29
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~~

With this patch in, the issue is fixed and we instead get:

~~~~~~~~~~~~~~~~
   100.00%  signal   signal            [.] bar
            |
            ---bar signal.c:11
               |
               |--50.49%--main signal.c:29
               |          __libc_start_main
               |          _start
               |
                --49.51%--0x33a8f
                          raise .:0
                          main signal.c:27
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~~

Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straight-forward thanks
to dwfl_frame_pc. For libunwind, we replace the functionality via
unw_is_signal_frame for any but the very first frame.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/unwind-libdw.c           |  6 +++++-
 tools/perf/util/unwind-libunwind-local.c | 11 +++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c
index f90e11a555b2..943a06291587 100644
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -168,12 +168,16 @@ frame_callback(Dwfl_Frame *state, void *arg)
 {
 	struct unwind_info *ui = arg;
 	Dwarf_Addr pc;
+	bool isactivation;
 
-	if (!dwfl_frame_pc(state, &pc, NULL)) {
+	if (!dwfl_frame_pc(state, &pc, &isactivation)) {
 		pr_err("%s", dwfl_errmsg(-1));
 		return DWARF_CB_ABORT;
 	}
 
+	if (!isactivation)
+		--pc;
+
 	return entry(pc, ui) || !(--ui->max_stack) ?
 	       DWARF_CB_ABORT : DWARF_CB_OK;
 }
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index f8455bed6e65..84d553898e2a 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -692,6 +692,17 @@ static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 
 		while (!ret && (unw_step(&c) > 0) && i < max_stack) {
 			unw_get_reg(&c, UNW_REG_IP, &ips[i]);
+
+			/*
+			 * Decrement the IP for any non-activation frames.
+			 * this is required to properly find the srcline
+			 * for caller frames.
+			 * See also the documentation for dwfl_frame_pc,
+			 * which this code tries to replicate.
+			 */
+			if (unw_is_signal_frame(&c) <= 0)
+				--ips[i];
+
 			++i;
 		}
 
-- 
2.13.0

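Why subtracting one from a return address lands on the right source line can be seen with a toy line table. This is only an illustration under stated assumptions: the demo_* names, the table layout and the addresses are invented for the example and have nothing to do with how perf or libdw store line information.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical line-table row: code from 'start' onward belongs to 'line'. */
struct demo_line_entry {
	uint64_t start;
	unsigned int line;
};

/*
 * Return the line of the last row whose start is <= addr, or 0 if addr
 * lies before the first row.  The table must be sorted by start address.
 */
static unsigned int demo_addr_to_line(const struct demo_line_entry *tbl,
				      size_t n, uint64_t addr)
{
	unsigned int line = 0;
	size_t i;

	for (i = 0; i < n && tbl[i].start <= addr; i++)
		line = tbl[i].line;
	return line;
}

/*
 * Pick the address to look up: an activation frame (the sampled
 * instruction, or a signal frame) is used as-is; any other frame holds
 * a return address, so step back into the call instruction.
 */
static uint64_t demo_lookup_addr(uint64_t pc, bool is_activation)
{
	return is_activation ? pc : pc - 1;
}
```

With rows {0x100, 8}, {0x110, 9}, {0x120, 10} and a call on line 8 that ends exactly at 0x110, the return address 0x110 already belongs to line 9; looking up 0x10f instead attributes the frame to line 8, which is the behaviour the patch restores.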
* [tip:perf/urgent] perf report: Fix off-by-one for non-activation frames
  2017-05-24  6:21 ` [PATCH 3/7] perf report: fix off-by-one for non-activation frames Namhyung Kim
@ 2017-05-24  7:05   ` tip-bot for Milian Wolff
  0 siblings, 0 replies; 26+ messages in thread
From: tip-bot for Milian Wolff @ 2017-05-24  7:05 UTC
  To: linux-tip-commits
  Cc: a.p.zijlstra, jolsa, yao.jin, hpa, torvalds, linux-kernel, dsahern, jolsa, acme, milian.wolff, peterz, acme, namhyung, mingo, tglx

Commit-ID:  1982ad48fc82c284a5cc55697a012d3357e84d01
Gitweb:     http://git.kernel.org/tip/1982ad48fc82c284a5cc55697a012d3357e84d01
Author:     Milian Wolff <milian.wolff@kdab.com>
AuthorDate: Wed, 24 May 2017 15:21:25 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf report: Fix off-by-one for non-activation frames

As the documentation for dwfl_frame_pc says, frames that are not
activation frames need to have their program counter decremented by
one to properly find the function of the caller.

This fixes many cases where perf report currently attributes the cost
to the next line. For example, I have code like this:

~~~~~~~~~~~~~~~
  #include <thread>
  #include <chrono>

  using namespace std;

  int main()
  {
    this_thread::sleep_for(chrono::milliseconds(1000));
    this_thread::sleep_for(chrono::milliseconds(100));
    this_thread::sleep_for(chrono::milliseconds(10));

    return 0;
  }
~~~~~~~~~~~~~~~

Now compile and record it:

~~~~~~~~~~~~~~~
  g++ -std=c++11 -g -O2 test.cpp
  echo 1 | sudo tee /proc/sys/kernel/sched_schedstats
  perf record \
    --event sched:sched_stat_sleep \
    --event sched:sched_process_exit \
    --event sched:sched_switch --call-graph=dwarf \
    --output perf.data.raw \
    ./a.out
  echo 0 | sudo tee /proc/sys/kernel/sched_schedstats
  perf inject --sched-stat --input perf.data.raw --output perf.data
~~~~~~~~~~~~~~~

Before this patch, the report clearly shows the off-by-one issue.
Most notably, the last sleep invocation is incorrectly attributed
to the "return 0;" line:

~~~~~~~~~~~~~~~
  Overhead  Source:Line
  ........  ...........

   100.00%  core.c:0
            |
            ---__schedule core.c:0
               schedule
               do_nanosleep hrtimer.c:0
               hrtimer_nanosleep
               sys_nanosleep
               entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
               __nanosleep_nocancel .:0
               std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
               |
               |--90.08%--main test.cpp:9
               |          __libc_start_main
               |          _start
               |
               |--9.01%--main test.cpp:10
               |          __libc_start_main
               |          _start
               |
                --0.91%--main test.cpp:13
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~

With this patch applied, the issue is fixed. The report becomes
much more usable:

~~~~~~~~~~~~~~~
  Overhead  Source:Line
  ........  ...........

   100.00%  core.c:0
            |
            ---__schedule core.c:0
               schedule
               do_nanosleep hrtimer.c:0
               hrtimer_nanosleep
               sys_nanosleep
               entry_SYSCALL_64_fastpath .tmp_entry_64.o:0
               __nanosleep_nocancel .:0
               std::this_thread::sleep_for<long, std::ratio<1l, 1000l> > thread:323
               |
               |--90.08%--main test.cpp:8
               |          __libc_start_main
               |          _start
               |
               |--9.01%--main test.cpp:9
               |          __libc_start_main
               |          _start
               |
                --0.91%--main test.cpp:10
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~

Similarly it works for signal frames:

~~~~~~~~~~~~~~~
  __noinline void bar(void)
  {
    volatile long cnt = 0;

    for (cnt = 0; cnt < 100000000; cnt++);
  }

  __noinline void foo(void)
  {
    bar();
  }

  void sig_handler(int sig)
  {
    foo();
  }

  int main(void)
  {
    signal(SIGUSR1, sig_handler);
    raise(SIGUSR1);

    foo();
    return 0;
  }
~~~~~~~~~~~~~~~~

Before, the report wrongly points to `signal.c:29` after raise():

~~~~~~~~~~~~~~~~
  $ perf report --stdio --no-children -g srcline -s srcline
  ...
   100.00%  signal.c:11
            |
            ---bar signal.c:11
               |
               |--50.49%--main signal.c:29
               |          __libc_start_main
               |          _start
               |
                --49.51%--0x33a8f
                          raise .:0
                          main signal.c:29
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~~

With this patch in, the issue is fixed and we instead get:

~~~~~~~~~~~~~~~~
   100.00%  signal   signal            [.] bar
            |
            ---bar signal.c:11
               |
               |--50.49%--main signal.c:29
               |          __libc_start_main
               |          _start
               |
                --49.51%--0x33a8f
                          raise .:0
                          main signal.c:27
                          __libc_start_main
                          _start
~~~~~~~~~~~~~~~~

Note how this patch fixes this issue for both unwinding methods, i.e.
both dwfl and libunwind. The former case is straight-forward thanks
to dwfl_frame_pc(). For libunwind, we replace the functionality via
unw_is_signal_frame() for any but the very first frame.

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-4-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/util/unwind-libdw.c           |  6 +++++-
 tools/perf/util/unwind-libunwind-local.c | 11 +++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c
index f90e11a..943a0629 100644
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -168,12 +168,16 @@ frame_callback(Dwfl_Frame *state, void *arg)
 {
 	struct unwind_info *ui = arg;
 	Dwarf_Addr pc;
+	bool isactivation;
 
-	if (!dwfl_frame_pc(state, &pc, NULL)) {
+	if (!dwfl_frame_pc(state, &pc, &isactivation)) {
 		pr_err("%s", dwfl_errmsg(-1));
 		return DWARF_CB_ABORT;
 	}
 
+	if (!isactivation)
+		--pc;
+
 	return entry(pc, ui) || !(--ui->max_stack) ?
 	       DWARF_CB_ABORT : DWARF_CB_OK;
 }
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index f8455be..672c2ad 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -692,6 +692,17 @@ static int get_entries(struct unwind_info *ui, unwind_entry_cb_t cb,
 
 		while (!ret && (unw_step(&c) > 0) && i < max_stack) {
 			unw_get_reg(&c, UNW_REG_IP, &ips[i]);
+
+			/*
+			 * Decrement the IP for any non-activation frames.
+			 * this is required to properly find the srcline
+			 * for caller frames.
+			 * See also the documentation for dwfl_frame_pc(),
+			 * which this code tries to replicate.
+			 */
+			if (unw_is_signal_frame(&c) <= 0)
+				--ips[i];
+
 			++i;
 		}
 

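The libunwind side of the change, "decrement every frame but the very first, unless it is a signal frame", can be sketched on a plain array without libunwind. The demo_frame type and demo_adjust_ips function are hypothetical; in the real code the signal-frame flag comes from unw_is_signal_frame() while stepping the cursor, whereas here it is simply a field supplied by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical unwound frame: the raw IP plus a signal-frame flag. */
struct demo_frame {
	uint64_t ip;
	bool is_signal_frame;
};

/*
 * Adjust frames in place for srcline lookup.  Frame 0 holds the sampled
 * IP and is left untouched.  Later frames hold return addresses and are
 * decremented by one, except signal frames, whose IP is already exact.
 */
static void demo_adjust_ips(struct demo_frame *frames, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++) {
		if (!frames[i].is_signal_frame)
			--frames[i].ip;
	}
}
```

For frames {0x400, 0x500, 0x600} where only the last one is a signal frame, the adjusted IPs would be 0x400, 0x4ff and 0x600.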
* [PATCH 4/7] perf script: Add --inline option 2017-05-24 6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim ` (2 preceding siblings ...) 2017-05-24 6:21 ` [PATCH 3/7] perf report: fix off-by-one for non-activation frames Namhyung Kim @ 2017-05-24 6:21 ` Namhyung Kim 2017-05-24 6:38 ` Ingo Molnar 2017-05-24 7:05 ` [tip:perf/urgent] perf script: Add --inline option for debugging tip-bot for Namhyung Kim 2017-05-24 6:21 ` [PATCH 5/7] perf report: always honor callchain order for inlined nodes Namhyung Kim ` (4 subsequent siblings) 8 siblings, 2 replies; 26+ messages in thread From: Namhyung Kim @ 2017-05-24 6:21 UTC (permalink / raw) To: Ingo Molnar Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Milian Wolff, Yao Jin The --inline option is to show inlined functions in callchains. For example, $ perf script a.out 5644 11611.467597: 309961 cycles:u: 790 main (/home/namhyung/tmp/perf/a.out) 20511 __libc_start_main (/usr/lib/libc-2.25.so) 8ba _start (/home/namhyung/tmp/perf/a.out) ... $ perf script --inline a.out 5644 11611.467597: 309961 cycles:u: 790 main (/home/namhyung/tmp/perf/a.out) std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > main 20511 __libc_start_main (/usr/lib/libc-2.25.so) 8ba _start (/home/namhyung/tmp/perf/a.out) ... 
Cc: Jin Yao <yao.jin@linux.intel.com> Reviewed-and-tested-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/Documentation/perf-script.txt | 4 ++++ tools/perf/builtin-script.c | 2 ++ tools/perf/util/evsel_fprintf.c | 33 ++++++++++++++++++++++++++++++++ 3 files changed, 39 insertions(+) diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt index cb0eda3925e6..3517e204a2b3 100644 --- a/tools/perf/Documentation/perf-script.txt +++ b/tools/perf/Documentation/perf-script.txt @@ -311,6 +311,10 @@ include::itrace.txt[] Set the maximum number of program blocks to print with brstackasm for each sample. +--inline:: + If a callgraph address belongs to an inlined function, the inline stack + will be printed. Each entry has function name and file/line. + SEE ALSO -------- linkperf:perf-record[1], linkperf:perf-script-perl[1], diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c index d05aec491cff..4761b0d7fcb5 100644 --- a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -2494,6 +2494,8 @@ int cmd_script(int argc, const char **argv) "Enable kernel symbol demangling"), OPT_STRING(0, "time", &script.time_str, "str", "Time span of interest (start,stop)"), + OPT_BOOLEAN(0, "inline", &symbol_conf.inline_name, + "Show inline function"), OPT_END() }; const char * const script_subcommands[] = { "record", "report", NULL }; diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c index e415aee6a245..583f3a602506 100644 --- a/tools/perf/util/evsel_fprintf.c +++ b/tools/perf/util/evsel_fprintf.c @@ -7,6 +7,7 @@ #include "map.h" #include "strlist.h" #include "symbol.h" +#include "srcline.h" static int comma_fprintf(FILE *fp, bool *first, const char *fmt, ...) 
{ @@ -168,6 +169,38 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment, if (!print_oneline) printed += fprintf(fp, "\n"); + if (symbol_conf.inline_name && node->map) { + struct inline_node *inode; + + addr = map__rip_2objdump(node->map, node->ip), + inode = dso__parse_addr_inlines(node->map->dso, addr); + + if (inode) { + struct inline_list *ilist; + + list_for_each_entry(ilist, &inode->val, list) { + if (print_arrow) + printed += fprintf(fp, " <-"); + + /* IP is same, just skip it */ + if (print_ip) + printed += fprintf(fp, "%c%16s", + s, ""); + if (print_sym) + printed += fprintf(fp, " %s", + ilist->funcname); + if (print_srcline) + printed += fprintf(fp, "\n %s:%d", + ilist->filename, + ilist->line_nr); + if (!print_oneline) + printed += fprintf(fp, "\n"); + } + + inline_node__delete(inode); + } + } + if (symbol_conf.bt_stop_list && node->sym && strlist__has_entry(symbol_conf.bt_stop_list, -- 2.13.0 ^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 4/7] perf script: Add --inline option 2017-05-24 6:21 ` [PATCH 4/7] perf script: Add --inline option Namhyung Kim @ 2017-05-24 6:38 ` Ingo Molnar 2017-05-24 7:13 ` Namhyung Kim 2017-05-24 7:05 ` [tip:perf/urgent] perf script: Add --inline option for debugging tip-bot for Namhyung Kim 1 sibling, 1 reply; 26+ messages in thread From: Ingo Molnar @ 2017-05-24 6:38 UTC (permalink / raw) To: Namhyung Kim Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Milian Wolff, Yao Jin * Namhyung Kim <namhyung@kernel.org> wrote: > The --inline option is to show inlined functions in callchains. > > For example, > > $ perf script > a.out 5644 11611.467597: 309961 cycles:u: > 790 main (/home/namhyung/tmp/perf/a.out) > 20511 __libc_start_main (/usr/lib/libc-2.25.so) > 8ba _start (/home/namhyung/tmp/perf/a.out) > ... > > $ perf script --inline > a.out 5644 11611.467597: 309961 cycles:u: > 790 main (/home/namhyung/tmp/perf/a.out) > std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() > std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > > std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > > main > 20511 __libc_start_main (/usr/lib/libc-2.25.so) > 8ba _start (/home/namhyung/tmp/perf/a.out) > ... Shouldn't this be the default behavior, to make call chains more readable? Thanks, Ingo ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/7] perf script: Add --inline option 2017-05-24 6:38 ` Ingo Molnar @ 2017-05-24 7:13 ` Namhyung Kim 2017-05-24 7:21 ` Ingo Molnar 0 siblings, 1 reply; 26+ messages in thread From: Namhyung Kim @ 2017-05-24 7:13 UTC (permalink / raw) To: Ingo Molnar Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Milian Wolff, Yao Jin On Wed, May 24, 2017 at 08:38:11AM +0200, Ingo Molnar wrote: > > * Namhyung Kim <namhyung@kernel.org> wrote: > > > The --inline option is to show inlined functions in callchains. > > > > For example, > > > > $ perf script > > a.out 5644 11611.467597: 309961 cycles:u: > > 790 main (/home/namhyung/tmp/perf/a.out) > > 20511 __libc_start_main (/usr/lib/libc-2.25.so) > > 8ba _start (/home/namhyung/tmp/perf/a.out) > > ... > > > > $ perf script --inline > > a.out 5644 11611.467597: 309961 cycles:u: > > 790 main (/home/namhyung/tmp/perf/a.out) > > std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() > > std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > > > std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > > > main > > 20511 __libc_start_main (/usr/lib/libc-2.25.so) > > 8ba _start (/home/namhyung/tmp/perf/a.out) > > ... > > Shouldn't this be the default behavior, to make call chains more readable? AFAIK perf report didn't make it the default because of the performance impact, but I don't know how large that impact is. In particular, if perf was not built with libbfd, it will run an external addr2line to get the inlined functions for each callchain entry. Thanks, Namhyung ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/7] perf script: Add --inline option 2017-05-24 7:13 ` Namhyung Kim @ 2017-05-24 7:21 ` Ingo Molnar 2017-05-24 7:53 ` Milian Wolff 0 siblings, 1 reply; 26+ messages in thread From: Ingo Molnar @ 2017-05-24 7:21 UTC (permalink / raw) To: Namhyung Kim Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Milian Wolff, Yao Jin * Namhyung Kim <namhyung@kernel.org> wrote: > On Wed, May 24, 2017 at 08:38:11AM +0200, Ingo Molnar wrote: > > > > * Namhyung Kim <namhyung@kernel.org> wrote: > > > > > The --inline option is to show inlined functions in callchains. > > > > > > For example, > > > > > > $ perf script > > > a.out 5644 11611.467597: 309961 cycles:u: > > > 790 main (/home/namhyung/tmp/perf/a.out) > > > 20511 __libc_start_main (/usr/lib/libc-2.25.so) > > > 8ba _start (/home/namhyung/tmp/perf/a.out) > > > ... > > > > > > $ perf script --inline > > > a.out 5644 11611.467597: 309961 cycles:u: > > > 790 main (/home/namhyung/tmp/perf/a.out) > > > std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() > > > std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > > > > std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > > > > main > > > 20511 __libc_start_main (/usr/lib/libc-2.25.so) > > > 8ba _start (/home/namhyung/tmp/perf/a.out) > > > ... > > > > Shouldn't this be the default behavior, to make call chains more readable? > > AFAIK perf report didn't make it default due to a performance impact, > but I didn't know how much it is. Especially if perf was not built > with libbfd it'll run external addr2line to get inlined functions for > each callchain entry.. So then at least let's make it the default when all libraries are present. 
Not enabling something when the build is not 'complete' is fair game - distros will typically have all the libraries available. We need to remember that roughly 99% of all our users will use as few perf command line options as they can get away with - myself included. Adding a non-debugging feature as a non-default command line option is really as if we didn't do anything: very few if any people will use it, and it might bitrot in the future without people noticing. So we need to put some thought into making it available to two orders of magnitude more people! If someone types 'perf report' we should give the best selection of all the features we have available. Thanks, Ingo ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/7] perf script: Add --inline option 2017-05-24 7:21 ` Ingo Molnar @ 2017-05-24 7:53 ` Milian Wolff 2017-05-24 8:06 ` Ingo Molnar 0 siblings, 1 reply; 26+ messages in thread From: Milian Wolff @ 2017-05-24 7:53 UTC (permalink / raw) To: Ingo Molnar Cc: Namhyung Kim, LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Yao Jin On Wednesday, May 24, 2017 9:21:42 AM CEST Ingo Molnar wrote: > * Namhyung Kim <namhyung@kernel.org> wrote: > > On Wed, May 24, 2017 at 08:38:11AM +0200, Ingo Molnar wrote: > > > * Namhyung Kim <namhyung@kernel.org> wrote: > > > > The --inline option is to show inlined functions in callchains. > > > > > > > > For example, > > > > > > > > $ perf script > > > > > > > > a.out 5644 11611.467597: 309961 cycles:u: > > > > 790 main (/home/namhyung/tmp/perf/a.out) > > > > > > > > 20511 __libc_start_main (/usr/lib/libc-2.25.so) > > > > > > > > 8ba _start (/home/namhyung/tmp/perf/a.out) > > > > > > > > ... > > > > > > > > $ perf script --inline > > > > > > > > a.out 5644 11611.467597: 309961 cycles:u: > > > > 790 main (/home/namhyung/tmp/perf/a.out) > > > > > > > > std::__detail::_Adaptor<std::linear_congruent > > > > ial_engine<unsigned long, 16807ul, 0ul, > > > > 2147483647ul>, double>::operator() > > > > std::uniform_real_distribution<double>::oper > > > > ator()<std::linear_congruential_engine<unsign > > > > ed long, 16807ul, 0ul, 2147483647ul> > > > > > std::uniform_real_distribution<double>::oper > > > > ator()<std::linear_congruential_engine<unsign > > > > ed long, 16807ul, 0ul, 2147483647ul> > main > > > > > > > > 20511 __libc_start_main (/usr/lib/libc-2.25.so) > > > > > > > > 8ba _start (/home/namhyung/tmp/perf/a.out) > > > > > > > > ... > > > > > > Shouldn't this be the default behavior, to make call chains more > > > readable? > > > > AFAIK perf report didn't make it default due to a performance impact, > > but I didn't know how much it is.
Especially if perf was not built > > with libbfd it'll run external addr2line to get inlined functions for > > each callchain entry.. > > So then at least let's make it the default when all libraries are present. > Not enabling something when the build is not 'complete' is fair game - > distros will typically have all the libraries available. > > We need to remember that roughly 99% of all our users will use as few perf > command line options as they can get away with - myself included. Adding a > non-debugging feature as a non-default command line option is really as if > we didn't do anything: very few if any people will use it, and it might > bitrot in the future without people noticing. > > So we need apply some thought into making it available to two orders of > magnitude more people! If someone types 'perf report' we should give the > best selection of all the features we have available. Just a suggestion: My larger patch set that is in review now adds some caching features that already speed up the whole process considerably. As such, my suggestion is to wait for this patch set to be integrated. Then we could enable --inline unconditionally, or at least when libbfd is available. Cheers -- Milian Wolff | milian.wolff@kdab.com | Software Engineer KDAB (Deutschland) GmbH&Co KG, a KDAB Group company Tel: +49-30-521325470 KDAB - The Qt Experts ^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 4/7] perf script: Add --inline option 2017-05-24 7:53 ` Milian Wolff @ 2017-05-24 8:06 ` Ingo Molnar 0 siblings, 0 replies; 26+ messages in thread From: Ingo Molnar @ 2017-05-24 8:06 UTC (permalink / raw) To: Milian Wolff Cc: Namhyung Kim, LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Yao Jin * Milian Wolff <milian.wolff@kdab.com> wrote: > On Wednesday, May 24, 2017 9:21:42 AM CEST Ingo Molnar wrote: > > * Namhyung Kim <namhyung@kernel.org> wrote: > > > On Wed, May 24, 2017 at 08:38:11AM +0200, Ingo Molnar wrote: > > > > * Namhyung Kim <namhyung@kernel.org> wrote: > > > > > The --inline option is to show inlined functions in callchains. > > > > > > > > > > For example, > > > > > > > > > > $ perf script > > > > > > > > > > a.out 5644 11611.467597: 309961 cycles:u: > > > > > 790 main (/home/namhyung/tmp/perf/a.out) > > > > > > > > > > 20511 __libc_start_main (/usr/lib/libc-2.25.so) > > > > > > > > > > 8ba _start (/home/namhyung/tmp/perf/a.out) > > > > > > > > > > ... > > > > > > > > > > $ perf script --inline > > > > > > > > > > a.out 5644 11611.467597: 309961 cycles:u: > > > > > 790 main (/home/namhyung/tmp/perf/a.out) > > > > > > > > > > std::__detail::_Adaptor<std::linear_congruent > > > > > ial_engine<unsigned long, 16807ul, 0ul, > > > > > 2147483647ul>, double>::operator() > > > > > std::uniform_real_distribution<double>::oper > > > > > ator()<std::linear_congruential_engine<unsign > > > > > ed long, 16807ul, 0ul, 2147483647ul> > > > > > > std::uniform_real_distribution<double>::oper > > > > > ator()<std::linear_congruential_engine<unsign > > > > > ed long, 16807ul, 0ul, 2147483647ul> > main > > > > > > > > > > 20511 __libc_start_main (/usr/lib/libc-2.25.so) > > > > > > > > > > 8ba _start (/home/namhyung/tmp/perf/a.out) > > > > > > > > > > ... > > > > > > > > Shouldn't this be the default behavior, to make call chains more > > > > readable? 
> > > > > > AFAIK perf report didn't make it default due to a performance impact, > > > but I didn't know how much it is. Especially if perf was not built > > > with libbfd it'll run external addr2line to get inlined functions for > > > each callchain entry.. > > > > So then at least let's make it the default when all libraries are present. > > Not enabling something when the build is not 'complete' is fair game - > > distros will typically have all the libraries available. > > > > We need to remember that roughly 99% of all our users will use as few perf > > command line options as they can get away with - myself included. Adding a > > non-debugging feature as a non-default command line option is really as if > > we didn't do anything: very few if any people will use it, and it might > > bitrot in the future without people noticing. > > > > So we need apply some thought into making it available to two orders of > > magnitude more people! If someone types 'perf report' we should give the > > best selection of all the features we have available. > > Just a suggestion: My larger patch set that is in review now adds some caching > features which already speeds up the whole process considerably. As such, my > suggestion is to wait for this patch set to be integrated. Then we could > enable --inline unconditionally, or at least only when libbfd is available. I'm fine with that - and please make the default-enabling part of your patch series, so it does not get forgotten. Thanks, Ingo ^ permalink raw reply [flat|nested] 26+ messages in thread
* [tip:perf/urgent] perf script: Add --inline option for debugging 2017-05-24 6:21 ` [PATCH 4/7] perf script: Add --inline option Namhyung Kim 2017-05-24 6:38 ` Ingo Molnar @ 2017-05-24 7:05 ` tip-bot for Namhyung Kim 1 sibling, 0 replies; 26+ messages in thread From: tip-bot for Namhyung Kim @ 2017-05-24 7:05 UTC (permalink / raw) To: linux-tip-commits Cc: jolsa, tglx, linux-kernel, torvalds, peterz, milian.wolff, acme, namhyung, mingo, hpa, acme, jolsa, yao.jin Commit-ID: 325fbff51f961491adff4037d0e0a94d6132bd9b Gitweb: http://git.kernel.org/tip/325fbff51f961491adff4037d0e0a94d6132bd9b Author: Namhyung Kim <namhyung@kernel.org> AuthorDate: Wed, 24 May 2017 15:21:26 +0900 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 24 May 2017 08:41:48 +0200 perf script: Add --inline option for debugging The --inline option is to show inlined functions in callchains. For example: $ perf script a.out 5644 11611.467597: 309961 cycles:u: 790 main (/home/namhyung/tmp/perf/a.out) 20511 __libc_start_main (/usr/lib/libc-2.25.so) 8ba _start (/home/namhyung/tmp/perf/a.out) ... $ perf script --inline a.out 5644 11611.467597: 309961 cycles:u: 790 main (/home/namhyung/tmp/perf/a.out) std::__detail::_Adaptor<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul>, double>::operator() std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> > main 20511 __libc_start_main (/usr/lib/libc-2.25.so) 8ba _start (/home/namhyung/tmp/perf/a.out) ... 
Reviewed-and-tested-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170524062129.32529-5-namhyung@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> --- tools/perf/Documentation/perf-script.txt | 4 ++++ tools/perf/builtin-script.c | 2 ++ tools/perf/util/evsel_fprintf.c | 33 ++++++++++++++++++++++++++++++++ 3 files changed, 39 insertions(+) diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt index cb0eda3..3517e20 100644 --- a/tools/perf/Documentation/perf-script.txt +++ b/tools/perf/Documentation/perf-script.txt @@ -311,6 +311,10 @@ include::itrace.txt[] Set the maximum number of program blocks to print with brstackasm for each sample. +--inline:: + If a callgraph address belongs to an inlined function, the inline stack + will be printed. Each entry has function name and file/line. 
+ SEE ALSO -------- linkperf:perf-record[1], linkperf:perf-script-perl[1], diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c index d05aec4..4761b0d 100644 --- a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -2494,6 +2494,8 @@ int cmd_script(int argc, const char **argv) "Enable kernel symbol demangling"), OPT_STRING(0, "time", &script.time_str, "str", "Time span of interest (start,stop)"), + OPT_BOOLEAN(0, "inline", &symbol_conf.inline_name, + "Show inline function"), OPT_END() }; const char * const script_subcommands[] = { "record", "report", NULL }; diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c index e415aee..583f3a6 100644 --- a/tools/perf/util/evsel_fprintf.c +++ b/tools/perf/util/evsel_fprintf.c @@ -7,6 +7,7 @@ #include "map.h" #include "strlist.h" #include "symbol.h" +#include "srcline.h" static int comma_fprintf(FILE *fp, bool *first, const char *fmt, ...) { @@ -168,6 +169,38 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment, if (!print_oneline) printed += fprintf(fp, "\n"); + if (symbol_conf.inline_name && node->map) { + struct inline_node *inode; + + addr = map__rip_2objdump(node->map, node->ip), + inode = dso__parse_addr_inlines(node->map->dso, addr); + + if (inode) { + struct inline_list *ilist; + + list_for_each_entry(ilist, &inode->val, list) { + if (print_arrow) + printed += fprintf(fp, " <-"); + + /* IP is same, just skip it */ + if (print_ip) + printed += fprintf(fp, "%c%16s", + s, ""); + if (print_sym) + printed += fprintf(fp, " %s", + ilist->funcname); + if (print_srcline) + printed += fprintf(fp, "\n %s:%d", + ilist->filename, + ilist->line_nr); + if (!print_oneline) + printed += fprintf(fp, "\n"); + } + + inline_node__delete(inode); + } + } + if (symbol_conf.bt_stop_list && node->sym && strlist__has_entry(symbol_conf.bt_stop_list, ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 5/7] perf report: always honor callchain order for inlined nodes 2017-05-24 6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim ` (3 preceding siblings ...) 2017-05-24 6:21 ` [PATCH 4/7] perf script: Add --inline option Namhyung Kim @ 2017-05-24 6:21 ` Namhyung Kim 2017-05-24 7:06 ` [tip:perf/urgent] perf report: Always " tip-bot for Milian Wolff 2017-05-24 6:21 ` [PATCH 6/7] perf report: do not drop last inlined frame Namhyung Kim ` (3 subsequent siblings) 8 siblings, 1 reply; 26+ messages in thread From: Namhyung Kim @ 2017-05-24 6:21 UTC (permalink / raw) To: Ingo Molnar Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Milian Wolff, Yao Jin, Arnaldo Carvalho de Melo, David Ahern, Peter Zijlstra From: Milian Wolff <milian.wolff@kdab.com> So far, the inlined nodes were only reversed when we built perf against libbfd. If that was not available, the addr2line fallback code path was missing the inline_list__reverse call. Now we always add the nodes in the correct order within inline_list__append. This removes the need to reverse the list and also ensures that all callers construct the list in the right order. 
Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yao Jin <yao.jin@linux.intel.com> Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/util/srcline.c | 18 ++++-------------- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c index 5e376d64d59e..6af0364cad06 100644 --- a/tools/perf/util/srcline.c +++ b/tools/perf/util/srcline.c @@ -56,7 +56,10 @@ static int inline_list__append(char *filename, char *funcname, int line_nr, } } - list_add_tail(&ilist->list, &node->val); + if (callchain_param.order == ORDER_CALLEE) + list_add_tail(&ilist->list, &node->val); + else + list_add(&ilist->list, &node->val); return 0; } @@ -200,14 +203,6 @@ static void addr2line_cleanup(struct a2l_data *a2l) #define MAX_INLINE_NEST 1024 -static void inline_list__reverse(struct inline_node *node) -{ - struct inline_list *ilist, *n; - - list_for_each_entry_safe_reverse(ilist, n, &node->val, list) - list_move_tail(&ilist->list, &node->val); -} - static int addr2line(const char *dso_name, u64 addr, char **file, unsigned int *line, struct dso *dso, bool unwind_inlines, struct inline_node *node) @@ -250,11 +245,6 @@ static int addr2line(const char *dso_name, u64 addr, ret = 1; } } - - if ((node != NULL) && - (callchain_param.order != ORDER_CALLEE)) { - inline_list__reverse(node); - } } if (file) { -- 2.13.0 ^ permalink raw reply related [flat|nested] 26+ messages in thread
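The idea of building the list in the right order at insertion time, rather than reversing it afterwards, can be sketched in isolation. The following is a simplified model, not the perf implementation: it uses a fixed-size array instead of the kernel list helpers, and `struct frame_list`, `frame_list_append` and `enum chain_order` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the callchain order setting. */
enum chain_order { CHAIN_ORDER_CALLER, CHAIN_ORDER_CALLEE };

#define MAX_FRAMES 8

/* Simplified frame list: names[0] is printed first. */
struct frame_list {
	const char *names[MAX_FRAMES];
	size_t count;
};

/*
 * Add one frame. Frames are assumed to arrive innermost first.
 * For callee order the new frame goes to the tail (like list_add_tail),
 * otherwise it goes to the head (like list_add), so no later reversal
 * pass is needed. Returns 0 on success, -1 if the list is full.
 */
static int frame_list_append(struct frame_list *l, const char *name,
			     enum chain_order order)
{
	assert(l != NULL);

	if (l->count == MAX_FRAMES)
		return -1;

	if (order == CHAIN_ORDER_CALLEE) {
		l->names[l->count] = name;
	} else {
		/* Shift existing entries up by one slot, then insert at head. */
		memmove(&l->names[1], &l->names[0],
			l->count * sizeof(l->names[0]));
		l->names[0] = name;
	}
	l->count++;
	return 0;
}
```

Because every producer goes through the same append helper, a code path cannot forget the reversal step, which is the failure mode the commit message describes for the addr2line fallback.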
* [tip:perf/urgent] perf report: Always honor callchain order for inlined nodes 2017-05-24 6:21 ` [PATCH 5/7] perf report: always honor callchain order for inlined nodes Namhyung Kim @ 2017-05-24 7:06 ` tip-bot for Milian Wolff 0 siblings, 0 replies; 26+ messages in thread From: tip-bot for Milian Wolff @ 2017-05-24 7:06 UTC (permalink / raw) To: linux-tip-commits Cc: peterz, hpa, jolsa, yao.jin, milian.wolff, torvalds, tglx, dsahern, linux-kernel, acme, jolsa, a.p.zijlstra, namhyung, mingo, acme Commit-ID: 28071f51839e393f697d0d1df0b223a4bc373606 Gitweb: http://git.kernel.org/tip/28071f51839e393f697d0d1df0b223a4bc373606 Author: Milian Wolff <milian.wolff@kdab.com> AuthorDate: Wed, 24 May 2017 15:21:27 +0900 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 24 May 2017 08:41:48 +0200 perf report: Always honor callchain order for inlined nodes So far, the inlined nodes were only reversed when we built perf against libbfd. If that was not available, the addr2line fallback code path was missing the inline_list__reverse call. Now we always add the nodes in the correct order within inline_list__append. This removes the need to reverse the list and also ensures that all callers construct the list in the right order. 
Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yao Jin <yao.jin@linux.intel.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170524062129.32529-6-namhyung@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> --- tools/perf/util/srcline.c | 18 ++++-------------- 1 file changed, 4 insertions(+), 14 deletions(-) diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c index 5e376d6..6af0364 100644 --- a/tools/perf/util/srcline.c +++ b/tools/perf/util/srcline.c @@ -56,7 +56,10 @@ static int inline_list__append(char *filename, char *funcname, int line_nr, } } - list_add_tail(&ilist->list, &node->val); + if (callchain_param.order == ORDER_CALLEE) + list_add_tail(&ilist->list, &node->val); + else + list_add(&ilist->list, &node->val); return 0; } @@ -200,14 +203,6 @@ static void addr2line_cleanup(struct a2l_data *a2l) #define MAX_INLINE_NEST 1024 -static void inline_list__reverse(struct inline_node *node) -{ - struct inline_list *ilist, *n; - - list_for_each_entry_safe_reverse(ilist, n, &node->val, list) - list_move_tail(&ilist->list, &node->val); -} - static int addr2line(const char *dso_name, u64 addr, char **file, unsigned int *line, struct dso *dso, bool unwind_inlines, struct inline_node *node) @@ -250,11 +245,6 @@ static int addr2line(const char *dso_name, u64 addr, ret = 1; } } - - if ((node != NULL) && - (callchain_param.order != ORDER_CALLEE)) { - inline_list__reverse(node); - } } if (file) { ^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 6/7] perf report: do not drop last inlined frame 2017-05-24 6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim ` (4 preceding siblings ...) 2017-05-24 6:21 ` [PATCH 5/7] perf report: always honor callchain order for inlined nodes Namhyung Kim @ 2017-05-24 6:21 ` Namhyung Kim 2017-05-24 7:06 ` [tip:perf/urgent] perf report: Do " tip-bot for Milian Wolff 2017-05-24 6:21 ` [PATCH 7/7] perf tools: Fix to put caller above callee in children mode Namhyung Kim ` (2 subsequent siblings) 8 siblings, 1 reply; 26+ messages in thread From: Namhyung Kim @ 2017-05-24 6:21 UTC (permalink / raw) To: Ingo Molnar Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Milian Wolff, Yao Jin, Arnaldo Carvalho de Melo, David Ahern, Peter Zijlstra From: Milian Wolff <milian.wolff@kdab.com> The very last inlined frame, i.e. the one furthest away from the non-inlined frame, was silently dropped. This is apparent when comparing the output of `perf script` and `addr2line`: ~~~~~~ $ perf script --inline ... a.out 26722 80836.309329: 72425 cycles: 21561 __hypot_finite (/usr/lib/libm-2.25.so) ace3 hypot (/usr/lib/libm-2.25.so) a4a main (a.out) std::abs<double> std::_Norm_helper<true>::_S_do_it<double> std::norm<double> main 20510 __libc_start_main (/usr/lib/libc-2.25.so) bd9 _start (a.out) $ addr2line -a -f -i -e /tmp/a.out a4a | c++filt 0x0000000000000a4a std::__complex_abs(doublecomplex ) /usr/include/c++/6.3.1/complex:589 double std::abs<double>(std::complex<double> const&) /usr/include/c++/6.3.1/complex:597 double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&) /usr/include/c++/6.3.1/complex:654 double std::norm<double>(std::complex<double> const&) /usr/include/c++/6.3.1/complex:664 main /tmp/inlining.cpp:14 ~~~~~ Note how `std::__complex_abs` is missing from the `perf script` output. This is similarly showing up in `perf report`. 
The patch here fixes this issue, and the output becomes: ~~~~~ a.out 26722 80836.309329: 72425 cycles: 21561 __hypot_finite (/usr/lib/libm-2.25.so) ace3 hypot (/usr/lib/libm-2.25.so) a4a main (a.out) std::__complex_abs std::abs<double> std::_Norm_helper<true>::_S_do_it<double> std::norm<double> main 20510 __libc_start_main (/usr/lib/libc-2.25.so) bd9 _start (a.out) ~~~~~ Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yao Jin <yao.jin@linux.intel.com> Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/util/srcline.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c index 6af0364cad06..ebc88a74e67b 100644 --- a/tools/perf/util/srcline.c +++ b/tools/perf/util/srcline.c @@ -203,6 +203,16 @@ static void addr2line_cleanup(struct a2l_data *a2l) #define MAX_INLINE_NEST 1024 +static int inline_list__append_dso_a2l(struct dso *dso, + struct inline_node *node) +{ + struct a2l_data *a2l = dso->a2l; + char *funcname = a2l->funcname ? strdup(a2l->funcname) : NULL; + char *filename = a2l->filename ? 
strdup(a2l->filename) : NULL; + + return inline_list__append(filename, funcname, a2l->line, node, dso); +} + static int addr2line(const char *dso_name, u64 addr, char **file, unsigned int *line, struct dso *dso, bool unwind_inlines, struct inline_node *node) @@ -231,15 +241,15 @@ static int addr2line(const char *dso_name, u64 addr, if (unwind_inlines) { int cnt = 0; + if (node && inline_list__append_dso_a2l(dso, node)) + return 0; + while (bfd_find_inliner_info(a2l->abfd, &a2l->filename, &a2l->funcname, &a2l->line) && cnt++ < MAX_INLINE_NEST) { if (node != NULL) { - if (inline_list__append(strdup(a2l->filename), - strdup(a2l->funcname), - a2l->line, node, - dso) != 0) + if (inline_list__append_dso_a2l(dso, node)) return 0; // found at least one inline frame ret = 1; -- 2.13.0 ^ permalink raw reply related [flat|nested] 26+ messages in thread
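The shape of this fix can be shown with a standalone model. This is only a sketch under stated assumptions: `struct resolver`, `next_inliner` and `collect_frames` are hypothetical names, and the resolver is a stub that plays the role of an initial address lookup followed by repeated "find the next inliner" queries. The point is that the frame resolved before the loop has to be recorded too, otherwise the innermost inlined function is lost.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical resolver state. frames[0] is the innermost inlined
 * function, which the initial lookup has already resolved; later
 * entries are the functions it was inlined into, in order.
 */
struct resolver {
	const char *const *frames;
	size_t nr_frames;
	size_t pos;	/* index of the currently resolved frame */
};

/* Advance to the next outer frame; return 1 if there was one, else 0. */
static int next_inliner(struct resolver *r)
{
	if (r->pos + 1 >= r->nr_frames)
		return 0;
	r->pos++;
	return 1;
}

/*
 * Collect up to max frames into out and return how many were stored.
 * The first store happens before the loop: a loop that only stores
 * after each successful next_inliner() call would skip frames[0].
 */
static size_t collect_frames(struct resolver *r, const char **out, size_t max)
{
	size_t n = 0;

	assert(r != NULL && out != NULL);

	if (r->nr_frames == 0 || max == 0)
		return 0;

	/* Record the frame that was resolved before iterating. */
	out[n++] = r->frames[r->pos];

	while (n < max && next_inliner(r))
		out[n++] = r->frames[r->pos];

	return n;
}
```

Dropping the store before the loop reproduces the symptom from the commit message: the outer frames are still listed, but the innermost one never appears.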
* [tip:perf/urgent] perf report: Do not drop last inlined frame
  2017-05-24  6:21 ` [PATCH 6/7] perf report: do not drop last inlined frame Namhyung Kim
@ 2017-05-24  7:06   ` tip-bot for Milian Wolff
  0 siblings, 0 replies; 26+ messages in thread
From: tip-bot for Milian Wolff @ 2017-05-24  7:06 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: mingo, dsahern, a.p.zijlstra, linux-kernel, hpa, jolsa, acme,
	tglx, namhyung, peterz, acme, torvalds, yao.jin, jolsa,
	milian.wolff

Commit-ID:  4d53b9d546f9f4505e6e3d58c8eed894d6f684e7
Gitweb:     http://git.kernel.org/tip/4d53b9d546f9f4505e6e3d58c8eed894d6f684e7
Author:     Milian Wolff <milian.wolff@kdab.com>
AuthorDate: Wed, 24 May 2017 15:21:28 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:48 +0200

perf report: Do not drop last inlined frame

The very last inlined frame, i.e. the one furthest away from the
non-inlined frame, was silently dropped. This is apparent when
comparing the output of `perf script` and `addr2line`:

~~~~~
  $ perf script --inline
  ...
  a.out 26722 80836.309329:      72425 cycles:
                     21561 __hypot_finite (/usr/lib/libm-2.25.so)
                      ace3 hypot (/usr/lib/libm-2.25.so)
                       a4a main (a.out)
                           std::abs<double>
                           std::_Norm_helper<true>::_S_do_it<double>
                           std::norm<double>
                           main
                     20510 __libc_start_main (/usr/lib/libc-2.25.so)
                       bd9 _start (a.out)

  $ addr2line -a -f -i -e /tmp/a.out a4a | c++filt
  0x0000000000000a4a
  std::__complex_abs(doublecomplex )
  /usr/include/c++/6.3.1/complex:589
  double std::abs<double>(std::complex<double> const&)
  /usr/include/c++/6.3.1/complex:597
  double std::_Norm_helper<true>::_S_do_it<double>(std::complex<double> const&)
  /usr/include/c++/6.3.1/complex:654
  double std::norm<double>(std::complex<double> const&)
  /usr/include/c++/6.3.1/complex:664
  main
  /tmp/inlining.cpp:14
~~~~~

Note how `std::__complex_abs` is missing from the `perf script`
output. This is similarly showing up in `perf report`.
The patch here fixes this issue, and the output becomes: ~~~~~ a.out 26722 80836.309329: 72425 cycles: 21561 __hypot_finite (/usr/lib/libm-2.25.so) ace3 hypot (/usr/lib/libm-2.25.so) a4a main (a.out) std::__complex_abs std::abs<double> std::_Norm_helper<true>::_S_do_it<double> std::norm<double> main 20510 __libc_start_main (/usr/lib/libc-2.25.so) bd9 _start (a.out) ~~~~~ Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yao Jin <yao.jin@linux.intel.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170524062129.32529-7-namhyung@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> --- tools/perf/util/srcline.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c index 6af0364..ebc88a7 100644 --- a/tools/perf/util/srcline.c +++ b/tools/perf/util/srcline.c @@ -203,6 +203,16 @@ static void addr2line_cleanup(struct a2l_data *a2l) #define MAX_INLINE_NEST 1024 +static int inline_list__append_dso_a2l(struct dso *dso, + struct inline_node *node) +{ + struct a2l_data *a2l = dso->a2l; + char *funcname = a2l->funcname ? strdup(a2l->funcname) : NULL; + char *filename = a2l->filename ? 
strdup(a2l->filename) : NULL; + + return inline_list__append(filename, funcname, a2l->line, node, dso); +} + static int addr2line(const char *dso_name, u64 addr, char **file, unsigned int *line, struct dso *dso, bool unwind_inlines, struct inline_node *node) @@ -231,15 +241,15 @@ static int addr2line(const char *dso_name, u64 addr, if (unwind_inlines) { int cnt = 0; + if (node && inline_list__append_dso_a2l(dso, node)) + return 0; + while (bfd_find_inliner_info(a2l->abfd, &a2l->filename, &a2l->funcname, &a2l->line) && cnt++ < MAX_INLINE_NEST) { if (node != NULL) { - if (inline_list__append(strdup(a2l->filename), - strdup(a2l->funcname), - a2l->line, node, - dso) != 0) + if (inline_list__append_dso_a2l(dso, node)) return 0; // found at least one inline frame ret = 1; ^ permalink raw reply related [flat|nested] 26+ messages in thread
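The two ideas in the patch above (store the result of the initial lookup before walking the inliners, and never duplicate a NULL string) can be sketched outside of perf. This is only an illustration under stated assumptions: `struct frame_src`, `struct frame_list`, `dup_or_null()`, `frame_list_append()`, `collect_frames()` and `frame_list_release()` are hypothetical names invented here, not perf or libbfd APIs, and the array of lookup results stands in for what the real inliner walk would report.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_FRAMES 16	/* arbitrary cap for this sketch */

/* Hypothetical stand-in for one lookup result (file, function, line). */
struct frame_src {
	const char *filename;	/* may be NULL when debug info is missing */
	const char *funcname;	/* may be NULL as well */
	unsigned int line;
};

/* Hypothetical stand-in for one stored entry of the inline list. */
struct frame {
	char *filename;
	char *funcname;
	unsigned int line;
};

struct frame_list {
	struct frame entries[MAX_FRAMES];
	int nr;
};

/*
 * Portable replacement for "s ? strdup(s) : NULL": strlen(NULL), like
 * strdup(NULL), is undefined, so only real strings are duplicated.
 */
static char *dup_or_null(const char *s)
{
	size_t len;
	char *copy;

	if (!s)
		return NULL;

	len = strlen(s) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Copy one lookup result into the list; returns 0 on success, -1 if full. */
static int frame_list_append(struct frame_list *list,
			     const struct frame_src *src)
{
	struct frame *f;

	if (list->nr >= MAX_FRAMES)
		return -1;

	f = &list->entries[list->nr++];
	f->filename = dup_or_null(src->filename);
	f->funcname = dup_or_null(src->funcname);
	f->line = src->line;
	return 0;
}

/*
 * srcs[0] models the result of the initial address lookup (the innermost
 * inlined function); srcs[1..n-1] model what each later "find the inliner"
 * step would report. Returns the number of stored frames, or -1 on error.
 */
static int collect_frames(struct frame_list *list,
			  const struct frame_src *srcs, int n)
{
	int i;

	if (n <= 0)
		return 0;

	/* Store the initial result first; omitting this step loses a frame. */
	if (frame_list_append(list, &srcs[0]))
		return -1;

	for (i = 1; i < n; i++) {
		if (frame_list_append(list, &srcs[i]))
			return -1;
	}
	return list->nr;
}

/* Release the duplicated strings; free(NULL) is a no-op. */
static void frame_list_release(struct frame_list *list)
{
	int i;

	for (i = 0; i < list->nr; i++) {
		free(list->entries[i].filename);
		free(list->entries[i].funcname);
	}
	list->nr = 0;
}
```

A caller would zero `nr`, pass the lookup results innermost-first to `collect_frames()`, and call `frame_list_release()` when done. Entries with a NULL file or function name are stored as NULL rather than crashing the duplication step.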
* [PATCH 7/7] perf tools: Fix to put caller above callee in children mode
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
                   ` (5 preceding siblings ...)
  2017-05-24  6:21 ` [PATCH 6/7] perf report: do not drop last inlined frame Namhyung Kim
@ 2017-05-24  6:21 ` Namhyung Kim
  2017-05-24  7:07   ` [tip:perf/urgent] perf tools: Put caller above callee in --children mode tip-bot for Namhyung Kim
  2017-05-24  6:53 ` [GIT PULL 0/7] perf/urgent callchain fixes Ingo Molnar
  2017-06-08 13:15 ` [GIT PULL 0/7] perf/urgent callchain fixes Milian Wolff
  8 siblings, 1 reply; 26+ messages in thread
From: Namhyung Kim @ 2017-05-24  6:21 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin, Frederic Weisbecker

__hpp__sort_acc() sorts entries by callchain depth in order to put
callers above callees in children mode.  But it assumed the callchain
order was callee-first.  Now the default (for children mode) is
caller-first, so the order of entries is reversed.

For example, consider the following case.

  $ perf report --no-children
  ..l
  # Overhead  Command  Shared Object        Symbol
  # ........  .......  ...................  ..........................
  #
      99.44%  a.out    a.out                [.] main
              |
              ---main
                 __libc_start_main
                 _start

Then children mode should show '_start' above '__libc_start_main'
since it's the caller (parent) of __libc_start_main.  But the two are
reversed:

  # Children      Self  Command  Shared Object    Symbol
  # ........  ........  .......  ...............  .....................
  #
      99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
      99.61%     0.00%  a.out    a.out            [.] _start
      99.54%    99.44%  a.out    a.out            [.] main

This patch fixes it.

  # Children      Self  Command  Shared Object    Symbol
  # ........  ........  .......  ...............  .....................
  #
      99.61%     0.00%  a.out    a.out            [.] _start
      99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
      99.54%    99.44%  a.out    a.out            [.] main

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/ui/hist.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 59addd52d9cd..ddb2c6fbdf91 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -210,6 +210,8 @@ static int __hpp__sort_acc(struct hist_entry *a, struct hist_entry *b,
 			return 0;
 
 		ret = b->callchain->max_depth - a->callchain->max_depth;
+		if (callchain_param.order == ORDER_CALLER)
+			ret = -ret;
 	}
 	return ret;
 }
-- 
2.13.0

^ permalink raw reply related	[flat|nested] 26+ messages in thread
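The sign flip in the patch above can be tried in isolation with a small comparator. This is a minimal sketch, not perf code: `struct entry`, `enum chain_order`, `cmp_by_depth()` and `sort_entries()` are hypothetical names, the depth values a caller supplies are made up, and the sketch models only the depth comparison, leaving out whatever comparisons precede it in the real function.

```c
#include <stdlib.h>

/* Hypothetical mirror of the two callchain display orders. */
enum chain_order { ORDER_CALLER, ORDER_CALLEE };

/* Hypothetical stand-in for a histogram entry with a callchain depth. */
struct entry {
	const char *sym;
	int max_depth;
};

/* qsort() has no context argument, so the order lives in a static. */
static enum chain_order sort_order = ORDER_CALLEE;

/*
 * Same arithmetic as the patched code: deeper chains sort first for
 * callee-first chains, and the result is negated for caller-first
 * chains so that shallower chains (the callers) come out on top.
 */
static int cmp_by_depth(const void *pa, const void *pb)
{
	const struct entry *a = pa;
	const struct entry *b = pb;
	int ret = b->max_depth - a->max_depth;

	if (sort_order == ORDER_CALLER)
		ret = -ret;
	return ret;
}

static void sort_entries(struct entry *entries, size_t nr,
			 enum chain_order order)
{
	sort_order = order;
	qsort(entries, nr, sizeof(*entries), cmp_by_depth);
}
```

With assumed depths of 1 for `_start`, 2 for `__libc_start_main` and 3 for `main`, sorting with `ORDER_CALLER` yields ascending depth and sorting with `ORDER_CALLEE` yields descending depth. Without the negation both orders would sort the same way, which is the symptom described in the changelog.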
* [tip:perf/urgent] perf tools: Put caller above callee in --children mode
  2017-05-24  6:21 ` [PATCH 7/7] perf tools: Fix to put caller above callee in children mode Namhyung Kim
@ 2017-05-24  7:07   ` tip-bot for Namhyung Kim
  0 siblings, 0 replies; 26+ messages in thread
From: tip-bot for Namhyung Kim @ 2017-05-24  7:07 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, hpa, jolsa, milian.wolff, mingo, peterz, yao.jin, tglx,
	torvalds, fweisbec, namhyung, linux-kernel, acme, jolsa

Commit-ID:  7111ffff60a68f55d864200cd6c7677319e5c242
Gitweb:     http://git.kernel.org/tip/7111ffff60a68f55d864200cd6c7677319e5c242
Author:     Namhyung Kim <namhyung@kernel.org>
AuthorDate: Wed, 24 May 2017 15:21:29 +0900
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 24 May 2017 08:41:49 +0200

perf tools: Put caller above callee in --children mode

__hpp__sort_acc() sorts entries by callchain depth in order to put
callers above callees in children mode.  But it assumed the callchain
order was callee-first.  Now the default (for children mode) is
caller-first, so the order of entries is reversed.

For example, consider the following case:

  $ perf report --no-children
  ..l
  # Overhead  Command  Shared Object        Symbol
  # ........  .......  ...................  ..........................
  #
      99.44%  a.out    a.out                [.] main
              |
              ---main
                 __libc_start_main
                 _start

Then children mode should show '_start' above '__libc_start_main'
since it's the caller (parent) of __libc_start_main.  But the two are
reversed:

  # Children      Self  Command  Shared Object    Symbol
  # ........  ........  .......  ...............  .....................
  #
      99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
      99.61%     0.00%  a.out    a.out            [.] _start
      99.54%    99.44%  a.out    a.out            [.] main

This patch fixes it.

  # Children      Self  Command  Shared Object    Symbol
  # ........  ........  .......  ...............  .....................
  #
      99.61%     0.00%  a.out    a.out            [.] _start
      99.61%     0.00%  a.out    libc-2.25.so     [.] __libc_start_main
      99.54%    99.44%  a.out    a.out            [.] main

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yao Jin <yao.jin@linux.intel.com>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170524062129.32529-8-namhyung@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 tools/perf/ui/hist.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 59addd5..ddb2c6f 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -210,6 +210,8 @@ static int __hpp__sort_acc(struct hist_entry *a, struct hist_entry *b,
 			return 0;
 
 		ret = b->callchain->max_depth - a->callchain->max_depth;
+		if (callchain_param.order == ORDER_CALLER)
+			ret = -ret;
 	}
 	return ret;
 }

^ permalink raw reply related	[flat|nested] 26+ messages in thread
* Re: [GIT PULL 0/7] perf/urgent callchain fixes
  2017-05-24  6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
                   ` (6 preceding siblings ...)
  2017-05-24  6:21 ` [PATCH 7/7] perf tools: Fix to put caller above callee in children mode Namhyung Kim
@ 2017-05-24  6:53 ` Ingo Molnar
  2017-05-24  6:57   ` [PATCH] tools/include: Sync kernel ABI headers with tooling headers Ingo Molnar
  2017-06-08 13:15 ` [GIT PULL 0/7] perf/urgent callchain fixes Milian Wolff
  8 siblings, 1 reply; 26+ messages in thread
From: Ingo Molnar @ 2017-05-24  6:53 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa,
	Milian Wolff, Yao Jin


* Namhyung Kim <namhyung@kernel.org> wrote:

> Hi Ingo,
>
> Please consider pulling the perf tooling changes below.  Build tested
> on Ubuntu, Fedora and Archlinux.  I found a problem during `perf test`
> but it seems unrelated to this series.  Will take a look it later.
>
> Thanks,
> Namhyung
>
>
> The following changes since commit 88b0193d9418c00340e45e0a913a0813bc6c8c96:
>
>   perf/callchain: Force USER_DS when invoking perf_callchain_user() (2017-05-10 07:54:00 +0200)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf tags/perf-urgent-for-mingo-4.12-20170524
>
> for you to fetch changes up to 37d4e1b6ba56773cef96122dff4436c2c534c381:
>
>   perf tools: Fix to put caller above callee in children mode (2017-05-24 08:51:11 +0900)
>
> ----------------------------------------------------------------
> perf/urgent fixes
>
> Fixes:
>
>   - Fix segfault on `perf report -g srcline` if a callchain address
>     cannot find a map for some reason.  The srcline sorting mode needs
>     a DSO to resolve line numbers and it's accessed via a map.  But it
>     should check if map is available for the address first. (Milian Wolff)
>
>   - Fix off-by-one for srcline output.  It passed (unwound) address to
>     resolve srcline for callchains.  But it's a return address of the
>     function which points to a next instruction.  This leads to
>     off-by-one for srcline info.  So pass the "address - 1" instead to
>     get the correct srcline.  This also considers "signal frame" as
>     well which has the exact address, so pass the address directly in
>     this case. (Milian Wolff)
>
>   - Fix missing inlined function.  Current code missed to display
>     inlined functions at the end.  This was found when comparing the
>     output of addr2line and perf script. (Milian Wolff)
>
>
> User Visible:
>
>   - `perf script` also gained `--inline` option to show inlined
>     functions with callchains.  This helped to find a bug in the
>     current inline code. (Namhyung Kim)
>
>   - Fix missed callchain ordering with `-g callee/caller` when libbfd
>     is not available. (Milian Wolff)
>
>   - Reorder output entries in `perf report --children` so that it can
>     put parent entries above their children.  It worked like this but
>     missed when callchain display order was changed with `-g caller`.
>     Now default is `-g caller` if children mode enabled. (Namhyung Kim)
>
>
> ----------------------------------------------------------------
>
> Milian Wolff (5):
>       perf report: don't crash on invalid maps in `-g srcline` mode
>       perf report: fix memory leak in addr2line when called by addr2inlines
>       perf report: fix off-by-one for non-activation frames
>       perf report: always honor callchain order for inlined nodes
>       perf report: do not drop last inlined frame
>
> Namhyung Kim (2):
>       perf script: Add --inline option
>       perf tools: Fix to put caller above callee in children mode
>
>  tools/perf/Documentation/perf-script.txt |  4 +++
>  tools/perf/builtin-script.c              |  2 ++
>  tools/perf/ui/hist.c                     |  2 ++
>  tools/perf/util/callchain.c              | 13 ++++++---
>  tools/perf/util/evsel_fprintf.c          | 33 +++++++++++++++++++++
>  tools/perf/util/srcline.c                | 49 +++++++++++++++++---------------
>  tools/perf/util/unwind-libdw.c           |  6 +++-
>  tools/perf/util/unwind-libunwind-local.c | 11 +++++++
>  8 files changed, 92 insertions(+), 28 deletions(-)

Thanks, I've applied the fixes from email with some minor tweaks to the
changelogs.

I also noticed that we now have a lot of warnings about out of sync headers:

  Warning: include/uapi/linux/stat.h differs from kernel
  Warning: arch/x86/include/asm/disabled-features.h differs from kernel
  Warning: arch/x86/include/asm/required-features.h differs from kernel
  Warning: arch/x86/include/asm/cpufeatures.h differs from kernel
  Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel
  Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel
  Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel

... will post a separate patch for that.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 26+ messages in thread
* [PATCH] tools/include: Sync kernel ABI headers with tooling headers 2017-05-24 6:53 ` [GIT PULL 0/7] perf/urgent callchain fixes Ingo Molnar @ 2017-05-24 6:57 ` Ingo Molnar 2017-05-24 7:07 ` [tip:perf/urgent] " tip-bot for Ingo Molnar 0 siblings, 1 reply; 26+ messages in thread From: Ingo Molnar @ 2017-05-24 6:57 UTC (permalink / raw) To: Namhyung Kim Cc: LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Milian Wolff, Yao Jin Sync (copy) the following v4.12 kernel headers to the tooling headers: arch/x86/include/asm/disabled-features.h: arch/x86/include/uapi/asm/kvm.h: arch/powerpc/include/uapi/asm/kvm.h: arch/s390/include/uapi/asm/kvm.h: arch/arm/include/uapi/asm/kvm.h: arch/arm64/include/uapi/asm/kvm.h: - 'struct kvm_sync_regs' got changed in an ABI-incompatible way, fortunately none of the (in-kernel) tooling relied on it - new KVM_DEV calls added arch/x86/include/asm/required-features.h: - 5-level paging hardware ABI detail added arch/x86/include/asm/cpufeatures.h: - new CPU feature added arch/x86/include/uapi/asm/vmx.h: - new VMX exit conditions None of the changes requires fixes in the tooling source code. 
This addresses the following warnings: Warning: include/uapi/linux/stat.h differs from kernel Warning: arch/x86/include/asm/disabled-features.h differs from kernel Warning: arch/x86/include/asm/required-features.h differs from kernel Warning: arch/x86/include/asm/cpufeatures.h differs from kernel Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> --- tools/arch/arm/include/uapi/asm/kvm.h | 10 +++++++++- tools/arch/arm64/include/uapi/asm/kvm.h | 10 +++++++++- tools/arch/powerpc/include/uapi/asm/kvm.h | 3 +++ tools/arch/s390/include/uapi/asm/kvm.h | 26 ++++++++++++++++++++++++-- tools/arch/x86/include/asm/cpufeatures.h | 2 ++ tools/arch/x86/include/asm/disabled-features.h | 8 +++++++- tools/arch/x86/include/asm/required-features.h | 8 +++++++- tools/arch/x86/include/uapi/asm/kvm.h | 3 +++ tools/arch/x86/include/uapi/asm/vmx.h | 25 ++++++++++++++++++------- tools/include/uapi/linux/stat.h | 8 ++------ 10 files changed, 84 insertions(+), 19 deletions(-) diff --git a/tools/arch/arm/include/uapi/asm/kvm.h b/tools/arch/arm/include/uapi/asm/kvm.h index 6ebd3e6a1fd1..5e3c673fa3f4 100644 --- a/tools/arch/arm/include/uapi/asm/kvm.h +++ b/tools/arch/arm/include/uapi/asm/kvm.h @@ -27,6 +27,8 @@ #define __KVM_HAVE_IRQ_LINE #define __KVM_HAVE_READONLY_MEM +#define KVM_COALESCED_MMIO_PAGE_OFFSET 1 + #define KVM_REG_SIZE(id) \ (1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT)) @@ 
-114,6 +116,8 @@ struct kvm_debug_exit_arch { }; struct kvm_sync_regs { + /* Used with KVM_CAP_ARM_USER_IRQ */ + __u64 device_irq_level; }; struct kvm_arch_memory_slot { @@ -192,13 +196,17 @@ struct kvm_arch_memory_slot { #define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5 #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO 7 +#define KVM_DEV_ARM_VGIC_GRP_ITS_REGS 8 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT 10 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \ (0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT) #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK 0x3ff #define VGIC_LEVEL_INFO_LINE_LEVEL 0 -#define KVM_DEV_ARM_VGIC_CTRL_INIT 0 +#define KVM_DEV_ARM_VGIC_CTRL_INIT 0 +#define KVM_DEV_ARM_ITS_SAVE_TABLES 1 +#define KVM_DEV_ARM_ITS_RESTORE_TABLES 2 +#define KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES 3 /* KVM_IRQ_LINE irq field index values */ #define KVM_ARM_IRQ_TYPE_SHIFT 24 diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h index c2860358ae3e..70eea2ecc663 100644 --- a/tools/arch/arm64/include/uapi/asm/kvm.h +++ b/tools/arch/arm64/include/uapi/asm/kvm.h @@ -39,6 +39,8 @@ #define __KVM_HAVE_IRQ_LINE #define __KVM_HAVE_READONLY_MEM +#define KVM_COALESCED_MMIO_PAGE_OFFSET 1 + #define KVM_REG_SIZE(id) \ (1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT)) @@ -143,6 +145,8 @@ struct kvm_debug_exit_arch { #define KVM_GUESTDBG_USE_HW (1 << 17) struct kvm_sync_regs { + /* Used with KVM_CAP_ARM_USER_IRQ */ + __u64 device_irq_level; }; struct kvm_arch_memory_slot { @@ -212,13 +216,17 @@ struct kvm_arch_memory_slot { #define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5 #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO 7 +#define KVM_DEV_ARM_VGIC_GRP_ITS_REGS 8 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT 10 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \ (0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT) #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK 0x3ff #define 
VGIC_LEVEL_INFO_LINE_LEVEL 0 -#define KVM_DEV_ARM_VGIC_CTRL_INIT 0 +#define KVM_DEV_ARM_VGIC_CTRL_INIT 0 +#define KVM_DEV_ARM_ITS_SAVE_TABLES 1 +#define KVM_DEV_ARM_ITS_RESTORE_TABLES 2 +#define KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES 3 /* Device Control API on vcpu fd */ #define KVM_ARM_VCPU_PMU_V3_CTRL 0 diff --git a/tools/arch/powerpc/include/uapi/asm/kvm.h b/tools/arch/powerpc/include/uapi/asm/kvm.h index 4edbe4bb0e8b..07fbeb927834 100644 --- a/tools/arch/powerpc/include/uapi/asm/kvm.h +++ b/tools/arch/powerpc/include/uapi/asm/kvm.h @@ -29,6 +29,9 @@ #define __KVM_HAVE_IRQ_LINE #define __KVM_HAVE_GUEST_DEBUG +/* Not always available, but if it is, this is the correct offset. */ +#define KVM_COALESCED_MMIO_PAGE_OFFSET 1 + struct kvm_regs { __u64 pc; __u64 cr; diff --git a/tools/arch/s390/include/uapi/asm/kvm.h b/tools/arch/s390/include/uapi/asm/kvm.h index 7f4fd65e9208..3dd2a1d308dd 100644 --- a/tools/arch/s390/include/uapi/asm/kvm.h +++ b/tools/arch/s390/include/uapi/asm/kvm.h @@ -26,6 +26,8 @@ #define KVM_DEV_FLIC_ADAPTER_REGISTER 6 #define KVM_DEV_FLIC_ADAPTER_MODIFY 7 #define KVM_DEV_FLIC_CLEAR_IO_IRQ 8 +#define KVM_DEV_FLIC_AISM 9 +#define KVM_DEV_FLIC_AIRQ_INJECT 10 /* * We can have up to 4*64k pending subchannels + 8 adapter interrupts, * as well as up to ASYNC_PF_PER_VCPU*KVM_MAX_VCPUS pfault done interrupts. 
@@ -41,7 +43,14 @@ struct kvm_s390_io_adapter { __u8 isc; __u8 maskable; __u8 swap; - __u8 pad; + __u8 flags; +}; + +#define KVM_S390_ADAPTER_SUPPRESSIBLE 0x01 + +struct kvm_s390_ais_req { + __u8 isc; + __u16 mode; }; #define KVM_S390_IO_ADAPTER_MASK 1 @@ -110,6 +119,7 @@ struct kvm_s390_vm_cpu_machine { #define KVM_S390_VM_CPU_FEAT_CMMA 10 #define KVM_S390_VM_CPU_FEAT_PFMFI 11 #define KVM_S390_VM_CPU_FEAT_SIGPIF 12 +#define KVM_S390_VM_CPU_FEAT_KSS 13 struct kvm_s390_vm_cpu_feat { __u64 feat[16]; }; @@ -198,6 +208,10 @@ struct kvm_guest_debug_arch { #define KVM_SYNC_VRS (1UL << 6) #define KVM_SYNC_RICCB (1UL << 7) #define KVM_SYNC_FPRS (1UL << 8) +#define KVM_SYNC_GSCB (1UL << 9) +/* length and alignment of the sdnx as a power of two */ +#define SDNXC 8 +#define SDNXL (1UL << SDNXC) /* definition of registers in kvm_run */ struct kvm_sync_regs { __u64 prefix; /* prefix register */ @@ -218,8 +232,16 @@ struct kvm_sync_regs { }; __u8 reserved[512]; /* for future vector expansion */ __u32 fpc; /* valid on KVM_SYNC_VRS or KVM_SYNC_FPRS */ - __u8 padding[52]; /* riccb needs to be 64byte aligned */ + __u8 padding1[52]; /* riccb needs to be 64byte aligned */ __u8 riccb[64]; /* runtime instrumentation controls block */ + __u8 padding2[192]; /* sdnx needs to be 256byte aligned */ + union { + __u8 sdnx[SDNXL]; /* state description annex */ + struct { + __u64 reserved1[2]; + __u64 gscb[4]; + }; + }; }; #define KVM_REG_S390_TODPR (KVM_REG_S390 | KVM_REG_SIZE_U32 | 0x1) diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h index 0fe00446f9ca..2701e5f8145b 100644 --- a/tools/arch/x86/include/asm/cpufeatures.h +++ b/tools/arch/x86/include/asm/cpufeatures.h @@ -202,6 +202,8 @@ #define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */ #define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */ +#define X86_FEATURE_MBA ( 7*32+18) /* Memory Bandwidth Allocation */ + /* 
Virtualization flags: Linux defined, word 8 */ #define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ #define X86_FEATURE_VNMI ( 8*32+ 1) /* Intel Virtual NMI */ diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h index 85599ad4d024..5dff775af7cd 100644 --- a/tools/arch/x86/include/asm/disabled-features.h +++ b/tools/arch/x86/include/asm/disabled-features.h @@ -36,6 +36,12 @@ # define DISABLE_OSPKE (1<<(X86_FEATURE_OSPKE & 31)) #endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */ +#ifdef CONFIG_X86_5LEVEL +# define DISABLE_LA57 0 +#else +# define DISABLE_LA57 (1<<(X86_FEATURE_LA57 & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -55,7 +61,7 @@ #define DISABLED_MASK13 0 #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 -#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE) +#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57) #define DISABLED_MASK17 0 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18) diff --git a/tools/arch/x86/include/asm/required-features.h b/tools/arch/x86/include/asm/required-features.h index fac9a5c0abe9..d91ba04dd007 100644 --- a/tools/arch/x86/include/asm/required-features.h +++ b/tools/arch/x86/include/asm/required-features.h @@ -53,6 +53,12 @@ # define NEED_MOVBE 0 #endif +#ifdef CONFIG_X86_5LEVEL +# define NEED_LA57 (1<<(X86_FEATURE_LA57 & 31)) +#else +# define NEED_LA57 0 +#endif + #ifdef CONFIG_X86_64 #ifdef CONFIG_PARAVIRT /* Paravirtualized systems may not have PSE or PGE available */ @@ -98,7 +104,7 @@ #define REQUIRED_MASK13 0 #define REQUIRED_MASK14 0 #define REQUIRED_MASK15 0 -#define REQUIRED_MASK16 0 +#define REQUIRED_MASK16 (NEED_LA57) #define REQUIRED_MASK17 0 #define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18) diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h index 739c0c594022..c2824d02ba37 100644 --- a/tools/arch/x86/include/uapi/asm/kvm.h +++ 
b/tools/arch/x86/include/uapi/asm/kvm.h @@ -9,6 +9,9 @@ #include <linux/types.h> #include <linux/ioctl.h> +#define KVM_PIO_PAGE_OFFSET 1 +#define KVM_COALESCED_MMIO_PAGE_OFFSET 2 + #define DE_VECTOR 0 #define DB_VECTOR 1 #define BP_VECTOR 3 diff --git a/tools/arch/x86/include/uapi/asm/vmx.h b/tools/arch/x86/include/uapi/asm/vmx.h index 14458658e988..690a2dcf4078 100644 --- a/tools/arch/x86/include/uapi/asm/vmx.h +++ b/tools/arch/x86/include/uapi/asm/vmx.h @@ -76,7 +76,11 @@ #define EXIT_REASON_WBINVD 54 #define EXIT_REASON_XSETBV 55 #define EXIT_REASON_APIC_WRITE 56 +#define EXIT_REASON_RDRAND 57 #define EXIT_REASON_INVPCID 58 +#define EXIT_REASON_VMFUNC 59 +#define EXIT_REASON_ENCLS 60 +#define EXIT_REASON_RDSEED 61 #define EXIT_REASON_PML_FULL 62 #define EXIT_REASON_XSAVES 63 #define EXIT_REASON_XRSTORS 64 @@ -90,6 +94,7 @@ { EXIT_REASON_TASK_SWITCH, "TASK_SWITCH" }, \ { EXIT_REASON_CPUID, "CPUID" }, \ { EXIT_REASON_HLT, "HLT" }, \ + { EXIT_REASON_INVD, "INVD" }, \ { EXIT_REASON_INVLPG, "INVLPG" }, \ { EXIT_REASON_RDPMC, "RDPMC" }, \ { EXIT_REASON_RDTSC, "RDTSC" }, \ @@ -108,6 +113,8 @@ { EXIT_REASON_IO_INSTRUCTION, "IO_INSTRUCTION" }, \ { EXIT_REASON_MSR_READ, "MSR_READ" }, \ { EXIT_REASON_MSR_WRITE, "MSR_WRITE" }, \ + { EXIT_REASON_INVALID_STATE, "INVALID_STATE" }, \ + { EXIT_REASON_MSR_LOAD_FAIL, "MSR_LOAD_FAIL" }, \ { EXIT_REASON_MWAIT_INSTRUCTION, "MWAIT_INSTRUCTION" }, \ { EXIT_REASON_MONITOR_TRAP_FLAG, "MONITOR_TRAP_FLAG" }, \ { EXIT_REASON_MONITOR_INSTRUCTION, "MONITOR_INSTRUCTION" }, \ @@ -115,20 +122,24 @@ { EXIT_REASON_MCE_DURING_VMENTRY, "MCE_DURING_VMENTRY" }, \ { EXIT_REASON_TPR_BELOW_THRESHOLD, "TPR_BELOW_THRESHOLD" }, \ { EXIT_REASON_APIC_ACCESS, "APIC_ACCESS" }, \ - { EXIT_REASON_GDTR_IDTR, "GDTR_IDTR" }, \ - { EXIT_REASON_LDTR_TR, "LDTR_TR" }, \ + { EXIT_REASON_EOI_INDUCED, "EOI_INDUCED" }, \ + { EXIT_REASON_GDTR_IDTR, "GDTR_IDTR" }, \ + { EXIT_REASON_LDTR_TR, "LDTR_TR" }, \ { EXIT_REASON_EPT_VIOLATION, "EPT_VIOLATION" }, \ { 
EXIT_REASON_EPT_MISCONFIG, "EPT_MISCONFIG" }, \ { EXIT_REASON_INVEPT, "INVEPT" }, \ + { EXIT_REASON_RDTSCP, "RDTSCP" }, \ { EXIT_REASON_PREEMPTION_TIMER, "PREEMPTION_TIMER" }, \ + { EXIT_REASON_INVVPID, "INVVPID" }, \ { EXIT_REASON_WBINVD, "WBINVD" }, \ + { EXIT_REASON_XSETBV, "XSETBV" }, \ { EXIT_REASON_APIC_WRITE, "APIC_WRITE" }, \ - { EXIT_REASON_EOI_INDUCED, "EOI_INDUCED" }, \ - { EXIT_REASON_INVALID_STATE, "INVALID_STATE" }, \ - { EXIT_REASON_MSR_LOAD_FAIL, "MSR_LOAD_FAIL" }, \ - { EXIT_REASON_INVD, "INVD" }, \ - { EXIT_REASON_INVVPID, "INVVPID" }, \ + { EXIT_REASON_RDRAND, "RDRAND" }, \ { EXIT_REASON_INVPCID, "INVPCID" }, \ + { EXIT_REASON_VMFUNC, "VMFUNC" }, \ + { EXIT_REASON_ENCLS, "ENCLS" }, \ + { EXIT_REASON_RDSEED, "RDSEED" }, \ + { EXIT_REASON_PML_FULL, "PML_FULL" }, \ { EXIT_REASON_XSAVES, "XSAVES" }, \ { EXIT_REASON_XRSTORS, "XRSTORS" } diff --git a/tools/include/uapi/linux/stat.h b/tools/include/uapi/linux/stat.h index d538897b8e08..17b10304c393 100644 --- a/tools/include/uapi/linux/stat.h +++ b/tools/include/uapi/linux/stat.h @@ -48,17 +48,13 @@ * tv_sec holds the number of seconds before (negative) or after (positive) * 00:00:00 1st January 1970 UTC. * - * tv_nsec holds a number of nanoseconds before (0..-999,999,999 if tv_sec is - * negative) or after (0..999,999,999 if tv_sec is positive) the tv_sec time. - * - * Note that if both tv_sec and tv_nsec are non-zero, then the two values must - * either be both positive or both negative. + * tv_nsec holds a number of nanoseconds (0..999,999,999) after the tv_sec time. * * __reserved is held in case we need a yet finer resolution. */ struct statx_timestamp { __s64 tv_sec; - __s32 tv_nsec; + __u32 tv_nsec; __s32 __reserved; }; ^ permalink raw reply related [flat|nested] 26+ messages in thread
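The `DISABLE_LA57` and `NEED_LA57` definitions in the header sync above rely on the convention that a feature number is `word*32 + bit`, so `feature & 31` selects the bit and the word index picks the mask (here `DISABLED_MASK16` / `REQUIRED_MASK16`). The following is a minimal sketch of that arithmetic only. The `SKETCH_*` names and the helper functions are hypothetical; the value used for MBA is the one shown in the diff, while the bit position used for LA57 is an assumption (the diff only confirms that it lives in word 16).

```c
#include <stdint.h>

#define SKETCH_NCAPINTS		18	/* number of mask words, as in the NCAPINTS check */

/* A feature number encodes (word * 32 + bit). */
#define FEATURE_WORD(f)		((f) >> 5)
#define FEATURE_BIT(f)		(1u << ((f) & 31))

#define SKETCH_FEATURE_MBA	(7 * 32 + 18)	/* value taken from the diff */
#define SKETCH_FEATURE_LA57	(16 * 32 + 16)	/* assumed bit position in word 16 */

/* Set the bit for one feature in an array of per-word masks. */
static void feature_mask_set(uint32_t masks[SKETCH_NCAPINTS],
			     unsigned int feature)
{
	unsigned int word = FEATURE_WORD(feature);

	if (word < SKETCH_NCAPINTS)
		masks[word] |= FEATURE_BIT(feature);
}

/* Return 1 if the feature's bit is set in its mask word, 0 otherwise. */
static int feature_in_mask(const uint32_t masks[SKETCH_NCAPINTS],
			   unsigned int feature)
{
	unsigned int word = FEATURE_WORD(feature);

	if (word >= SKETCH_NCAPINTS)
		return 0;
	return (masks[word] & FEATURE_BIT(feature)) != 0;
}
```

Setting the assumed LA57 feature in a zeroed mask array touches only word 16, which is why the patch adds `DISABLE_LA57` to `DISABLED_MASK16` and `NEED_LA57` to `REQUIRED_MASK16` and leaves the other words alone.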
* [tip:perf/urgent] tools/include: Sync kernel ABI headers with tooling headers 2017-05-24 6:57 ` [PATCH] tools/include: Sync kernel ABI headers with tooling headers Ingo Molnar @ 2017-05-24 7:07 ` tip-bot for Ingo Molnar 0 siblings, 0 replies; 26+ messages in thread From: tip-bot for Ingo Molnar @ 2017-05-24 7:07 UTC (permalink / raw) To: linux-tip-commits Cc: acme, acme, namhyung, linux-kernel, torvalds, milian.wolff, yao.jin, tglx, hpa, mingo, jolsa, peterz, jolsa Commit-ID: 6e30437bd42c4d4e9cfc4c40efda00eb83a11cde Gitweb: http://git.kernel.org/tip/6e30437bd42c4d4e9cfc4c40efda00eb83a11cde Author: Ingo Molnar <mingo@kernel.org> AuthorDate: Wed, 24 May 2017 08:57:21 +0200 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 24 May 2017 09:00:21 +0200 tools/include: Sync kernel ABI headers with tooling headers Sync (copy) the following v4.12 kernel headers to the tooling headers: arch/x86/include/asm/disabled-features.h: arch/x86/include/uapi/asm/kvm.h: arch/powerpc/include/uapi/asm/kvm.h: arch/s390/include/uapi/asm/kvm.h: arch/arm/include/uapi/asm/kvm.h: arch/arm64/include/uapi/asm/kvm.h: - 'struct kvm_sync_regs' got changed in an ABI-incompatible way, fortunately none of the (in-kernel) tooling relied on it - new KVM_DEV calls added arch/x86/include/asm/required-features.h: - 5-level paging hardware ABI detail added arch/x86/include/asm/cpufeatures.h: - new CPU feature added arch/x86/include/uapi/asm/vmx.h: - new VMX exit conditions None of the changes requires fixes in the tooling source code. 
This addresses the following warnings: Warning: include/uapi/linux/stat.h differs from kernel Warning: arch/x86/include/asm/disabled-features.h differs from kernel Warning: arch/x86/include/asm/required-features.h differs from kernel Warning: arch/x86/include/asm/cpufeatures.h differs from kernel Warning: arch/x86/include/uapi/asm/kvm.h differs from kernel Warning: arch/x86/include/uapi/asm/vmx.h differs from kernel Warning: arch/powerpc/include/uapi/asm/kvm.h differs from kernel Warning: arch/s390/include/uapi/asm/kvm.h differs from kernel Warning: arch/arm/include/uapi/asm/kvm.h differs from kernel Warning: arch/arm64/include/uapi/asm/kvm.h differs from kernel Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yao Jin <yao.jin@linux.intel.com> Cc: kernel-team@lge.com Link: http://lkml.kernel.org/r/20170524065721.j2mlch6bgk5klgbc@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- tools/arch/arm/include/uapi/asm/kvm.h | 10 +++++++++- tools/arch/arm64/include/uapi/asm/kvm.h | 10 +++++++++- tools/arch/powerpc/include/uapi/asm/kvm.h | 3 +++ tools/arch/s390/include/uapi/asm/kvm.h | 26 ++++++++++++++++++++++++-- tools/arch/x86/include/asm/cpufeatures.h | 2 ++ tools/arch/x86/include/asm/disabled-features.h | 8 +++++++- tools/arch/x86/include/asm/required-features.h | 8 +++++++- tools/arch/x86/include/uapi/asm/kvm.h | 3 +++ tools/arch/x86/include/uapi/asm/vmx.h | 25 ++++++++++++++++++------- tools/include/uapi/linux/stat.h | 8 ++------ 10 files changed, 84 insertions(+), 19 deletions(-) diff --git a/tools/arch/arm/include/uapi/asm/kvm.h b/tools/arch/arm/include/uapi/asm/kvm.h index 6ebd3e6..5e3c673 100644 --- 
a/tools/arch/arm/include/uapi/asm/kvm.h +++ b/tools/arch/arm/include/uapi/asm/kvm.h @@ -27,6 +27,8 @@ #define __KVM_HAVE_IRQ_LINE #define __KVM_HAVE_READONLY_MEM +#define KVM_COALESCED_MMIO_PAGE_OFFSET 1 + #define KVM_REG_SIZE(id) \ (1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT)) @@ -114,6 +116,8 @@ struct kvm_debug_exit_arch { }; struct kvm_sync_regs { + /* Used with KVM_CAP_ARM_USER_IRQ */ + __u64 device_irq_level; }; struct kvm_arch_memory_slot { @@ -192,13 +196,17 @@ struct kvm_arch_memory_slot { #define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5 #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO 7 +#define KVM_DEV_ARM_VGIC_GRP_ITS_REGS 8 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT 10 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \ (0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT) #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK 0x3ff #define VGIC_LEVEL_INFO_LINE_LEVEL 0 -#define KVM_DEV_ARM_VGIC_CTRL_INIT 0 +#define KVM_DEV_ARM_VGIC_CTRL_INIT 0 +#define KVM_DEV_ARM_ITS_SAVE_TABLES 1 +#define KVM_DEV_ARM_ITS_RESTORE_TABLES 2 +#define KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES 3 /* KVM_IRQ_LINE irq field index values */ #define KVM_ARM_IRQ_TYPE_SHIFT 24 diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h index c286035..70eea2e 100644 --- a/tools/arch/arm64/include/uapi/asm/kvm.h +++ b/tools/arch/arm64/include/uapi/asm/kvm.h @@ -39,6 +39,8 @@ #define __KVM_HAVE_IRQ_LINE #define __KVM_HAVE_READONLY_MEM +#define KVM_COALESCED_MMIO_PAGE_OFFSET 1 + #define KVM_REG_SIZE(id) \ (1U << (((id) & KVM_REG_SIZE_MASK) >> KVM_REG_SIZE_SHIFT)) @@ -143,6 +145,8 @@ struct kvm_debug_exit_arch { #define KVM_GUESTDBG_USE_HW (1 << 17) struct kvm_sync_regs { + /* Used with KVM_CAP_ARM_USER_IRQ */ + __u64 device_irq_level; }; struct kvm_arch_memory_slot { @@ -212,13 +216,17 @@ struct kvm_arch_memory_slot { #define KVM_DEV_ARM_VGIC_GRP_REDIST_REGS 5 #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6 #define 
KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO 7 +#define KVM_DEV_ARM_VGIC_GRP_ITS_REGS 8 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT 10 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \ (0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT) #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INTID_MASK 0x3ff #define VGIC_LEVEL_INFO_LINE_LEVEL 0 -#define KVM_DEV_ARM_VGIC_CTRL_INIT 0 +#define KVM_DEV_ARM_VGIC_CTRL_INIT 0 +#define KVM_DEV_ARM_ITS_SAVE_TABLES 1 +#define KVM_DEV_ARM_ITS_RESTORE_TABLES 2 +#define KVM_DEV_ARM_VGIC_SAVE_PENDING_TABLES 3 /* Device Control API on vcpu fd */ #define KVM_ARM_VCPU_PMU_V3_CTRL 0 diff --git a/tools/arch/powerpc/include/uapi/asm/kvm.h b/tools/arch/powerpc/include/uapi/asm/kvm.h index 4edbe4b..07fbeb9 100644 --- a/tools/arch/powerpc/include/uapi/asm/kvm.h +++ b/tools/arch/powerpc/include/uapi/asm/kvm.h @@ -29,6 +29,9 @@ #define __KVM_HAVE_IRQ_LINE #define __KVM_HAVE_GUEST_DEBUG +/* Not always available, but if it is, this is the correct offset. */ +#define KVM_COALESCED_MMIO_PAGE_OFFSET 1 + struct kvm_regs { __u64 pc; __u64 cr; diff --git a/tools/arch/s390/include/uapi/asm/kvm.h b/tools/arch/s390/include/uapi/asm/kvm.h index 7f4fd65..3dd2a1d 100644 --- a/tools/arch/s390/include/uapi/asm/kvm.h +++ b/tools/arch/s390/include/uapi/asm/kvm.h @@ -26,6 +26,8 @@ #define KVM_DEV_FLIC_ADAPTER_REGISTER 6 #define KVM_DEV_FLIC_ADAPTER_MODIFY 7 #define KVM_DEV_FLIC_CLEAR_IO_IRQ 8 +#define KVM_DEV_FLIC_AISM 9 +#define KVM_DEV_FLIC_AIRQ_INJECT 10 /* * We can have up to 4*64k pending subchannels + 8 adapter interrupts, * as well as up to ASYNC_PF_PER_VCPU*KVM_MAX_VCPUS pfault done interrupts. 
@@ -41,7 +43,14 @@ struct kvm_s390_io_adapter { __u8 isc; __u8 maskable; __u8 swap; - __u8 pad; + __u8 flags; +}; + +#define KVM_S390_ADAPTER_SUPPRESSIBLE 0x01 + +struct kvm_s390_ais_req { + __u8 isc; + __u16 mode; }; #define KVM_S390_IO_ADAPTER_MASK 1 @@ -110,6 +119,7 @@ struct kvm_s390_vm_cpu_machine { #define KVM_S390_VM_CPU_FEAT_CMMA 10 #define KVM_S390_VM_CPU_FEAT_PFMFI 11 #define KVM_S390_VM_CPU_FEAT_SIGPIF 12 +#define KVM_S390_VM_CPU_FEAT_KSS 13 struct kvm_s390_vm_cpu_feat { __u64 feat[16]; }; @@ -198,6 +208,10 @@ struct kvm_guest_debug_arch { #define KVM_SYNC_VRS (1UL << 6) #define KVM_SYNC_RICCB (1UL << 7) #define KVM_SYNC_FPRS (1UL << 8) +#define KVM_SYNC_GSCB (1UL << 9) +/* length and alignment of the sdnx as a power of two */ +#define SDNXC 8 +#define SDNXL (1UL << SDNXC) /* definition of registers in kvm_run */ struct kvm_sync_regs { __u64 prefix; /* prefix register */ @@ -218,8 +232,16 @@ struct kvm_sync_regs { }; __u8 reserved[512]; /* for future vector expansion */ __u32 fpc; /* valid on KVM_SYNC_VRS or KVM_SYNC_FPRS */ - __u8 padding[52]; /* riccb needs to be 64byte aligned */ + __u8 padding1[52]; /* riccb needs to be 64byte aligned */ __u8 riccb[64]; /* runtime instrumentation controls block */ + __u8 padding2[192]; /* sdnx needs to be 256byte aligned */ + union { + __u8 sdnx[SDNXL]; /* state description annex */ + struct { + __u64 reserved1[2]; + __u64 gscb[4]; + }; + }; }; #define KVM_REG_S390_TODPR (KVM_REG_S390 | KVM_REG_SIZE_U32 | 0x1) diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h index 0fe0044..2701e5f 100644 --- a/tools/arch/x86/include/asm/cpufeatures.h +++ b/tools/arch/x86/include/asm/cpufeatures.h @@ -202,6 +202,8 @@ #define X86_FEATURE_AVX512_4VNNIW (7*32+16) /* AVX-512 Neural Network Instructions */ #define X86_FEATURE_AVX512_4FMAPS (7*32+17) /* AVX-512 Multiply Accumulation Single precision */ +#define X86_FEATURE_MBA ( 7*32+18) /* Memory Bandwidth Allocation */ + /* Virtualization 
flags: Linux defined, word 8 */ #define X86_FEATURE_TPR_SHADOW ( 8*32+ 0) /* Intel TPR Shadow */ #define X86_FEATURE_VNMI ( 8*32+ 1) /* Intel Virtual NMI */ diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h index 85599ad..5dff775 100644 --- a/tools/arch/x86/include/asm/disabled-features.h +++ b/tools/arch/x86/include/asm/disabled-features.h @@ -36,6 +36,12 @@ # define DISABLE_OSPKE (1<<(X86_FEATURE_OSPKE & 31)) #endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */ +#ifdef CONFIG_X86_5LEVEL +# define DISABLE_LA57 0 +#else +# define DISABLE_LA57 (1<<(X86_FEATURE_LA57 & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -55,7 +61,7 @@ #define DISABLED_MASK13 0 #define DISABLED_MASK14 0 #define DISABLED_MASK15 0 -#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE) +#define DISABLED_MASK16 (DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57) #define DISABLED_MASK17 0 #define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18) diff --git a/tools/arch/x86/include/asm/required-features.h b/tools/arch/x86/include/asm/required-features.h index fac9a5c..d91ba04 100644 --- a/tools/arch/x86/include/asm/required-features.h +++ b/tools/arch/x86/include/asm/required-features.h @@ -53,6 +53,12 @@ # define NEED_MOVBE 0 #endif +#ifdef CONFIG_X86_5LEVEL +# define NEED_LA57 (1<<(X86_FEATURE_LA57 & 31)) +#else +# define NEED_LA57 0 +#endif + #ifdef CONFIG_X86_64 #ifdef CONFIG_PARAVIRT /* Paravirtualized systems may not have PSE or PGE available */ @@ -98,7 +104,7 @@ #define REQUIRED_MASK13 0 #define REQUIRED_MASK14 0 #define REQUIRED_MASK15 0 -#define REQUIRED_MASK16 0 +#define REQUIRED_MASK16 (NEED_LA57) #define REQUIRED_MASK17 0 #define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18) diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h index 739c0c5..c2824d0 100644 --- a/tools/arch/x86/include/uapi/asm/kvm.h +++ b/tools/arch/x86/include/uapi/asm/kvm.h @@ -9,6 +9,9 @@ 
#include <linux/types.h> #include <linux/ioctl.h> +#define KVM_PIO_PAGE_OFFSET 1 +#define KVM_COALESCED_MMIO_PAGE_OFFSET 2 + #define DE_VECTOR 0 #define DB_VECTOR 1 #define BP_VECTOR 3 diff --git a/tools/arch/x86/include/uapi/asm/vmx.h b/tools/arch/x86/include/uapi/asm/vmx.h index 1445865..690a2dc 100644 --- a/tools/arch/x86/include/uapi/asm/vmx.h +++ b/tools/arch/x86/include/uapi/asm/vmx.h @@ -76,7 +76,11 @@ #define EXIT_REASON_WBINVD 54 #define EXIT_REASON_XSETBV 55 #define EXIT_REASON_APIC_WRITE 56 +#define EXIT_REASON_RDRAND 57 #define EXIT_REASON_INVPCID 58 +#define EXIT_REASON_VMFUNC 59 +#define EXIT_REASON_ENCLS 60 +#define EXIT_REASON_RDSEED 61 #define EXIT_REASON_PML_FULL 62 #define EXIT_REASON_XSAVES 63 #define EXIT_REASON_XRSTORS 64 @@ -90,6 +94,7 @@ { EXIT_REASON_TASK_SWITCH, "TASK_SWITCH" }, \ { EXIT_REASON_CPUID, "CPUID" }, \ { EXIT_REASON_HLT, "HLT" }, \ + { EXIT_REASON_INVD, "INVD" }, \ { EXIT_REASON_INVLPG, "INVLPG" }, \ { EXIT_REASON_RDPMC, "RDPMC" }, \ { EXIT_REASON_RDTSC, "RDTSC" }, \ @@ -108,6 +113,8 @@ { EXIT_REASON_IO_INSTRUCTION, "IO_INSTRUCTION" }, \ { EXIT_REASON_MSR_READ, "MSR_READ" }, \ { EXIT_REASON_MSR_WRITE, "MSR_WRITE" }, \ + { EXIT_REASON_INVALID_STATE, "INVALID_STATE" }, \ + { EXIT_REASON_MSR_LOAD_FAIL, "MSR_LOAD_FAIL" }, \ { EXIT_REASON_MWAIT_INSTRUCTION, "MWAIT_INSTRUCTION" }, \ { EXIT_REASON_MONITOR_TRAP_FLAG, "MONITOR_TRAP_FLAG" }, \ { EXIT_REASON_MONITOR_INSTRUCTION, "MONITOR_INSTRUCTION" }, \ @@ -115,20 +122,24 @@ { EXIT_REASON_MCE_DURING_VMENTRY, "MCE_DURING_VMENTRY" }, \ { EXIT_REASON_TPR_BELOW_THRESHOLD, "TPR_BELOW_THRESHOLD" }, \ { EXIT_REASON_APIC_ACCESS, "APIC_ACCESS" }, \ - { EXIT_REASON_GDTR_IDTR, "GDTR_IDTR" }, \ - { EXIT_REASON_LDTR_TR, "LDTR_TR" }, \ + { EXIT_REASON_EOI_INDUCED, "EOI_INDUCED" }, \ + { EXIT_REASON_GDTR_IDTR, "GDTR_IDTR" }, \ + { EXIT_REASON_LDTR_TR, "LDTR_TR" }, \ { EXIT_REASON_EPT_VIOLATION, "EPT_VIOLATION" }, \ { EXIT_REASON_EPT_MISCONFIG, "EPT_MISCONFIG" }, \ { EXIT_REASON_INVEPT, "INVEPT" }, \ + 
{ EXIT_REASON_RDTSCP, "RDTSCP" }, \ { EXIT_REASON_PREEMPTION_TIMER, "PREEMPTION_TIMER" }, \ + { EXIT_REASON_INVVPID, "INVVPID" }, \ { EXIT_REASON_WBINVD, "WBINVD" }, \ + { EXIT_REASON_XSETBV, "XSETBV" }, \ { EXIT_REASON_APIC_WRITE, "APIC_WRITE" }, \ - { EXIT_REASON_EOI_INDUCED, "EOI_INDUCED" }, \ - { EXIT_REASON_INVALID_STATE, "INVALID_STATE" }, \ - { EXIT_REASON_MSR_LOAD_FAIL, "MSR_LOAD_FAIL" }, \ - { EXIT_REASON_INVD, "INVD" }, \ - { EXIT_REASON_INVVPID, "INVVPID" }, \ + { EXIT_REASON_RDRAND, "RDRAND" }, \ { EXIT_REASON_INVPCID, "INVPCID" }, \ + { EXIT_REASON_VMFUNC, "VMFUNC" }, \ + { EXIT_REASON_ENCLS, "ENCLS" }, \ + { EXIT_REASON_RDSEED, "RDSEED" }, \ + { EXIT_REASON_PML_FULL, "PML_FULL" }, \ { EXIT_REASON_XSAVES, "XSAVES" }, \ { EXIT_REASON_XRSTORS, "XRSTORS" } diff --git a/tools/include/uapi/linux/stat.h b/tools/include/uapi/linux/stat.h index d538897..17b1030 100644 --- a/tools/include/uapi/linux/stat.h +++ b/tools/include/uapi/linux/stat.h @@ -48,17 +48,13 @@ * tv_sec holds the number of seconds before (negative) or after (positive) * 00:00:00 1st January 1970 UTC. * - * tv_nsec holds a number of nanoseconds before (0..-999,999,999 if tv_sec is - * negative) or after (0..999,999,999 if tv_sec is positive) the tv_sec time. - * - * Note that if both tv_sec and tv_nsec are non-zero, then the two values must - * either be both positive or both negative. + * tv_nsec holds a number of nanoseconds (0..999,999,999) after the tv_sec time. * * __reserved is held in case we need a yet finer resolution. */ struct statx_timestamp { __s64 tv_sec; - __s32 tv_nsec; + __u32 tv_nsec; __s32 __reserved; }; ^ permalink raw reply related [flat|nested] 26+ messages in thread
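Note on the last hunk above (tools/include/uapi/linux/stat.h): tv_nsec in struct statx_timestamp becomes unsigned and, per the updated comment, always counts 0..999,999,999 nanoseconds after tv_sec, including for times before 1970. The following is a minimal sketch of what that convention implies when splitting a signed nanosecond count, not code from the patch. The names demo_statx_timestamp, demo_from_ns, demo_to_ns and DEMO_NSEC_PER_SEC are hypothetical and exist only for illustration; the struct is a local stand-in that mirrors the field types shown in the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical local stand-in mirroring the field types in the synced header. */
struct demo_statx_timestamp {
	int64_t  tv_sec;   /* seconds before (negative) or after the epoch */
	uint32_t tv_nsec;  /* 0..999,999,999 nanoseconds after tv_sec */
	int32_t  reserved; /* stands in for the header's __reserved field */
};

#define DEMO_NSEC_PER_SEC 1000000000LL

/*
 * Split a signed nanosecond count relative to the epoch into floor
 * seconds plus a non-negative remainder. C division truncates toward
 * zero, so a negative remainder is folded into the previous second.
 */
static struct demo_statx_timestamp demo_from_ns(int64_t ns)
{
	int64_t sec = ns / DEMO_NSEC_PER_SEC;
	int64_t rem = ns % DEMO_NSEC_PER_SEC;

	if (rem < 0) {
		rem += DEMO_NSEC_PER_SEC;
		sec -= 1;
	}

	struct demo_statx_timestamp ts = {
		.tv_sec = sec,
		.tv_nsec = (uint32_t)rem,
		.reserved = 0,
	};

	/* The remainder must now satisfy the documented range. */
	assert(ts.tv_nsec < DEMO_NSEC_PER_SEC);
	return ts;
}

/*
 * Inverse of demo_from_ns(). Assumes the result fits in int64_t;
 * extreme tv_sec values would overflow and are not handled here.
 */
static int64_t demo_to_ns(struct demo_statx_timestamp ts)
{
	return ts.tv_sec * DEMO_NSEC_PER_SEC + (int64_t)ts.tv_nsec;
}
```

Under this convention, half a second before the epoch (demo_from_ns(-500000000)) is represented as tv_sec = -1, tv_nsec = 500,000,000, rather than as tv_sec = 0 with a negative tv_nsec as the old comment allowed.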
* Re: [GIT PULL 0/7] perf/urgent callchain fixes
  2017-05-24 6:21 [GIT PULL 0/7] perf/urgent callchain fixes Namhyung Kim
  ` (7 preceding siblings ...)
  2017-05-24 6:53 ` [GIT PULL 0/7] perf/urgent callchain fixes Ingo Molnar
@ 2017-06-08 13:15 ` Milian Wolff
  2017-06-08 13:59 ` Arnaldo Carvalho de Melo
  8 siblings, 1 reply; 26+ messages in thread
From: Milian Wolff @ 2017-06-08 13:15 UTC
To: Namhyung Kim
Cc: Ingo Molnar, LKML, kernel-team, Arnaldo Carvalho de Melo, Jiri Olsa, Yao Jin

On Wednesday, May 24, 2017 8:21:22 AM CEST Namhyung Kim wrote:
> Hi Ingo,
>
> Please consider pulling the perf tooling changes below. Build tested
> on Ubuntu, Fedora and Archlinux. I found a problem during `perf test`
> but it seems unrelated to this series. Will take a look it later.

Hey guys,

I notice that these patches are not in acme's perf/core branch. Can they be
applied there too please?

Thanks

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
* Re: [GIT PULL 0/7] perf/urgent callchain fixes
  2017-06-08 13:15 ` [GIT PULL 0/7] perf/urgent callchain fixes Milian Wolff
@ 2017-06-08 13:59 ` Arnaldo Carvalho de Melo
  2017-06-08 14:34 ` Milian Wolff
  0 siblings, 1 reply; 26+ messages in thread
From: Arnaldo Carvalho de Melo @ 2017-06-08 13:59 UTC
To: Milian Wolff
Cc: Namhyung Kim, Ingo Molnar, LKML, kernel-team, Jiri Olsa, Yao Jin

On Thu, Jun 08, 2017 at 03:15:32PM +0200, Milian Wolff wrote:
> On Wednesday, May 24, 2017 8:21:22 AM CEST Namhyung Kim wrote:
> > Hi Ingo,
> >
> > Please consider pulling the perf tooling changes below. Build tested
> > on Ubuntu, Fedora and Archlinux. I found a problem during `perf test`
> > but it seems unrelated to this series. Will take a look it later.
>
> Hey guys,
>
> I notice that these patches are not in acme's perf/core branch. Can they be
> applied there too please?

It is there now, Ingo merged tip/perf/urgent into tip/perf/core and I
just rebased my perf/core with that, just pushed.

- Arnaldo
* Re: [GIT PULL 0/7] perf/urgent callchain fixes
  2017-06-08 13:59 ` Arnaldo Carvalho de Melo
@ 2017-06-08 14:34 ` Milian Wolff
  0 siblings, 0 replies; 26+ messages in thread
From: Milian Wolff @ 2017-06-08 14:34 UTC
To: Arnaldo Carvalho de Melo
Cc: Namhyung Kim, Ingo Molnar, LKML, kernel-team, Jiri Olsa, Yao Jin

On Thursday, June 8, 2017 3:59:31 PM CEST Arnaldo Carvalho de Melo wrote:
> On Thu, Jun 08, 2017 at 03:15:32PM +0200, Milian Wolff wrote:
> > On Wednesday, May 24, 2017 8:21:22 AM CEST Namhyung Kim wrote:
> > > Hi Ingo,
> > >
> > > Please consider pulling the perf tooling changes below. Build tested
> > > on Ubuntu, Fedora and Archlinux. I found a problem during `perf test`
> > > but it seems unrelated to this series. Will take a look it later.
> >
> > Hey guys,
> >
> > I notice that these patches are not in acme's perf/core branch. Can they be
> > applied there too please?
>
> It is there now, Ingo merged tip/perf/urgent into tip/perf/core and I
> just rebased my perf/core with that, just pushed.

Excellent, thank you.

Cheers

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts