* [PATCH 01/10] perf bench numa: Fix cpu0 binding
2019-08-08 18:53 [GIT PULL] perf/urgent improvements and fixes Arnaldo Carvalho de Melo
@ 2019-08-08 18:53 ` Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 02/10] perf annotate: Fix printing of unaugmented disassembled instructions from BPF Arnaldo Carvalho de Melo
` (8 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-08 18:53 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, Michael Petlan, Alexander Shishkin, Andi Kleen,
Peter Zijlstra, Satheesh Rajendran, Arnaldo Carvalho de Melo
From: Jiri Olsa <jolsa@kernel.org>
Michael reported an issue with perf bench numa failing with binding to
cpu0 with '-0' option.
# perf bench numa mem -p 3 -t 1 -P 512 -s 100 -zZcm0 --thp 1 -M 1 -ddd
# Running 'numa/mem' benchmark:
# Running main, "perf bench numa numa-mem -p 3 -t 1 -P 512 -s 100 -zZcm0 --thp 1 -M 1 -ddd"
binding to node 0, mask: 0000000000000001 => -1
perf: bench/numa.c:356: bind_to_memnode: Assertion `!(ret)' failed.
Aborted (core dumped)
This happens when the cpu0 is not part of node0, which is the benchmark
assumption and we can see that's not the case for some powerpc servers.
Using correct node for cpu0 binding.
Reported-by: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20190801142642.28004-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/bench/numa.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/tools/perf/bench/numa.c b/tools/perf/bench/numa.c
index a640ca7aaada..513cb2f2fa32 100644
--- a/tools/perf/bench/numa.c
+++ b/tools/perf/bench/numa.c
@@ -379,8 +379,10 @@ static u8 *alloc_data(ssize_t bytes0, int map_flags,
/* Allocate and initialize all memory on CPU#0: */
if (init_cpu0) {
- orig_mask = bind_to_node(0);
- bind_to_memnode(0);
+ int node = numa_node_of_cpu(0);
+
+ orig_mask = bind_to_node(node);
+ bind_to_memnode(node);
}
bytes = bytes0 + HPSIZE;
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 02/10] perf annotate: Fix printing of unaugmented disassembled instructions from BPF
2019-08-08 18:53 [GIT PULL] perf/urgent improvements and fixes Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 01/10] perf bench numa: Fix cpu0 binding Arnaldo Carvalho de Melo
@ 2019-08-08 18:53 ` Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 03/10] perf db-export: Fix thread__exec_comm() Arnaldo Carvalho de Melo
` (7 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-08 18:53 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, Arnaldo Carvalho de Melo, Adrian Hunter,
Song Liu
From: Arnaldo Carvalho de Melo <acme@redhat.com>
The code to disassemble BPF programs uses binutil's disassembling
routines, and those use in turn fprintf to print to a memstream FILE,
adding a newline at the end of each line, which ends up confusing the
TUI routines called from:
annotate_browser__write()
annotate_line__write()
annotate_browser__printf()
ui_browser__vprintf()
SLsmg_vprintf()
The SLsmg_vprintf() function in the slang library gets confused with the
terminating newline, so make the disasm_line__parse() function that
parses the lines produced by the BPF specific disassembler (that uses
binutil's libopcodes) and the lines produced by the objdump based
disassembler used for everything else (and that doesn't adds this
terminating newline) trim the end of the line in addition of the
beginning.
This way when disasm_line->ops.raw, i.e. for instructions without a
special scnprintf() method, we'll not have that \n getting in the way of
filling the screen right after the instruction with spaces to avoid
leaving what was on the screen before and thus garbling the annotation
screen, breaking scrolling, etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Fixes: 6987561c9e86 ("perf annotate: Enable annotation of BPF programs")
Link: https://lkml.kernel.org/n/tip-unbr5a5efakobfr6rhxq99ta@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/annotate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index ac9ad2330f93..163536720149 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1122,7 +1122,7 @@ static int disasm_line__parse(char *line, const char **namep, char **rawp)
goto out;
(*rawp)[0] = tmp;
- *rawp = skip_spaces(*rawp);
+ *rawp = strim(*rawp);
return 0;
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 03/10] perf db-export: Fix thread__exec_comm()
2019-08-08 18:53 [GIT PULL] perf/urgent improvements and fixes Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 01/10] perf bench numa: Fix cpu0 binding Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 02/10] perf annotate: Fix printing of unaugmented disassembled instructions from BPF Arnaldo Carvalho de Melo
@ 2019-08-08 18:53 ` Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 04/10] perf ftrace: Fix failure to set cpumask when only one cpu is present Arnaldo Carvalho de Melo
` (6 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-08 18:53 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, Adrian Hunter, Jiri Olsa, stable,
Arnaldo Carvalho de Melo
From: Adrian Hunter <adrian.hunter@intel.com>
Threads synthesized from /proc have comms with a start time of zero, and
not marked as "exec". Currently, there can be 2 such comms. The first is
created by processing a synthesized fork event and is set to the
parent's comm string, and the second by processing a synthesized comm
event set to the thread's current comm string.
In the absence of an "exec" comm, thread__exec_comm() picks the last
(oldest) comm, which, in the case above, is the parent's comm string.
For a main thread, that is very probably wrong. Use the second-to-last
in that case.
This affects only db-export because it is the only user of
thread__exec_comm().
Example:
$ sudo perf record -a -o pt-a-sleep-1 -e intel_pt//u -- sleep 1
$ sudo chown ahunter pt-a-sleep-1
Before:
$ perf script -i pt-a-sleep-1 --itrace=bep -s tools/perf/scripts/python/export-to-sqlite.py pt-a-sleep-1.db branches calls
$ sqlite3 -header -column pt-a-sleep-1.db 'select * from comm_threads_view'
comm_id command thread_id pid tid
---------- ---------- ---------- ---------- ----------
1 swapper 1 0 0
2 rcu_sched 2 10 10
3 kthreadd 3 78 78
5 sudo 4 15180 15180
5 sudo 5 15180 15182
7 kworker/4: 6 10335 10335
8 kthreadd 7 55 55
10 systemd 8 865 865
10 systemd 9 865 875
13 perf 10 15181 15181
15 sleep 10 15181 15181
16 kworker/3: 11 14179 14179
17 kthreadd 12 29376 29376
19 systemd 13 746 746
21 systemd 14 401 401
23 systemd 15 879 879
23 systemd 16 879 945
25 kthreadd 17 556 556
27 kworker/u1 18 14136 14136
28 kworker/u1 19 15021 15021
29 kthreadd 20 509 509
31 systemd 21 836 836
31 systemd 22 836 967
33 systemd 23 1148 1148
33 systemd 24 1148 1163
35 kworker/2: 25 17988 17988
36 kworker/0: 26 13478 13478
After:
$ perf script -i pt-a-sleep-1 --itrace=bep -s tools/perf/scripts/python/export-to-sqlite.py pt-a-sleep-1b.db branches calls
$ sqlite3 -header -column pt-a-sleep-1b.db 'select * from comm_threads_view'
comm_id command thread_id pid tid
---------- ---------- ---------- ---------- ----------
1 swapper 1 0 0
2 rcu_sched 2 10 10
3 kswapd0 3 78 78
4 perf 4 15180 15180
4 perf 5 15180 15182
6 kworker/4: 6 10335 10335
7 kcompactd0 7 55 55
8 accounts-d 8 865 865
8 accounts-d 9 865 875
10 perf 10 15181 15181
12 sleep 10 15181 15181
13 kworker/3: 11 14179 14179
14 kworker/1: 12 29376 29376
15 haveged 13 746 746
16 systemd-jo 14 401 401
17 NetworkMan 15 879 879
17 NetworkMan 16 879 945
19 irq/131-iw 17 556 556
20 kworker/u1 18 14136 14136
21 kworker/u1 19 15021 15021
22 kworker/u1 20 509 509
23 thermald 21 836 836
23 thermald 22 836 967
25 unity-sett 23 1148 1148
25 unity-sett 24 1148 1163
27 kworker/2: 25 17988 17988
28 kworker/0: 26 13478 13478
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: 65de51f93ebf ("perf tools: Identify which comms are from exec")
Link: http://lkml.kernel.org/r/20190808064823.14846-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/thread.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 873ab505ca80..590793cc5142 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -214,14 +214,24 @@ struct comm *thread__comm(const struct thread *thread)
struct comm *thread__exec_comm(const struct thread *thread)
{
- struct comm *comm, *last = NULL;
+ struct comm *comm, *last = NULL, *second_last = NULL;
list_for_each_entry(comm, &thread->comm_list, list) {
if (comm->exec)
return comm;
+ second_last = last;
last = comm;
}
+ /*
+ * 'last' with no start time might be the parent's comm of a synthesized
+ * thread (created by processing a synthesized fork event). For a main
+ * thread, that is very probably wrong. Prefer a later comm to avoid
+ * that case.
+ */
+ if (second_last && !last->start && thread->pid_ == thread->tid)
+ return second_last;
+
return last;
}
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 04/10] perf ftrace: Fix failure to set cpumask when only one cpu is present
2019-08-08 18:53 [GIT PULL] perf/urgent improvements and fixes Arnaldo Carvalho de Melo
` (2 preceding siblings ...)
2019-08-08 18:53 ` [PATCH 03/10] perf db-export: Fix thread__exec_comm() Arnaldo Carvalho de Melo
@ 2019-08-08 18:53 ` Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 05/10] perf cpumap: Fix writing to illegal memory in handling cpumap mask Arnaldo Carvalho de Melo
` (5 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-08 18:53 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, He Zhe, Arnaldo Carvalho de Melo,
Alexander Shishkin, Alexey Budankov, Jiri Olsa, Kan Liang,
Peter Zijlstra, Stephane Eranian
From: He Zhe <zhe.he@windriver.com>
The buffer containing the string used to set cpumask is overwritten at
the end of the string later in cpu_map__snprint_mask due to not enough
memory space, when there is only one cpu.
And thus causes the following failure:
$ perf ftrace ls
failed to reset ftrace
$
This patch fixes the calculation of the cpumask string size.
Signed-off-by: He Zhe <zhe.he@windriver.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: dc23103278c5 ("perf ftrace: Add support for -a and -C option")
Link: http://lkml.kernel.org/r/1564734592-15624-1-git-send-email-zhe.he@windriver.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/builtin-ftrace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/builtin-ftrace.c b/tools/perf/builtin-ftrace.c
index 66d5a6658daf..019312810405 100644
--- a/tools/perf/builtin-ftrace.c
+++ b/tools/perf/builtin-ftrace.c
@@ -173,7 +173,7 @@ static int set_tracing_cpumask(struct cpu_map *cpumap)
int last_cpu;
last_cpu = cpu_map__cpu(cpumap, cpumap->nr - 1);
- mask_size = (last_cpu + 3) / 4 + 1;
+ mask_size = last_cpu / 4 + 2; /* one more byte for EOS */
mask_size += last_cpu / 32; /* ',' is needed for every 32th cpus */
cpumask = malloc(mask_size);
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 05/10] perf cpumap: Fix writing to illegal memory in handling cpumap mask
2019-08-08 18:53 [GIT PULL] perf/urgent improvements and fixes Arnaldo Carvalho de Melo
` (3 preceding siblings ...)
2019-08-08 18:53 ` [PATCH 04/10] perf ftrace: Fix failure to set cpumask when only one cpu is present Arnaldo Carvalho de Melo
@ 2019-08-08 18:53 ` Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 06/10] perf tools: Fix a typo in a variable name in the Documentation Makefile Arnaldo Carvalho de Melo
` (4 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-08 18:53 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, He Zhe, Alexander Shishkin, Alexey Budankov,
Jiri Olsa, Kan Liang, Peter Zijlstra, Stephane Eranian,
Arnaldo Carvalho de Melo
From: He Zhe <zhe.he@windriver.com>
cpu_map__snprint_mask() would write to illegal memory pointed by
zalloc(0) when there is only one cpu.
This patch fixes the calculation and adds sanity check against the input
parameters.
Signed-off-by: He Zhe <zhe.he@windriver.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: 4400ac8a9a90 ("perf cpumap: Introduce cpu_map__snprint_mask()")
Link: http://lkml.kernel.org/r/1564734592-15624-2-git-send-email-zhe.he@windriver.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/cpumap.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 3acfbe34ebaf..39cce66b4ebc 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -751,7 +751,10 @@ size_t cpu_map__snprint_mask(struct cpu_map *map, char *buf, size_t size)
unsigned char *bitmap;
int last_cpu = cpu_map__cpu(map, map->nr - 1);
- bitmap = zalloc((last_cpu + 7) / 8);
+ if (buf == NULL)
+ return 0;
+
+ bitmap = zalloc(last_cpu / 8 + 1);
if (bitmap == NULL) {
buf[0] = '\0';
return 0;
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 06/10] perf tools: Fix a typo in a variable name in the Documentation Makefile
2019-08-08 18:53 [GIT PULL] perf/urgent improvements and fixes Arnaldo Carvalho de Melo
` (4 preceding siblings ...)
2019-08-08 18:53 ` [PATCH 05/10] perf cpumap: Fix writing to illegal memory in handling cpumap mask Arnaldo Carvalho de Melo
@ 2019-08-08 18:53 ` Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 07/10] perf tools: Fix include paths in ui directory Arnaldo Carvalho de Melo
` (3 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-08 18:53 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, Masanari Iida, Mukesh Ojha, Alexander Shishkin,
Peter Zijlstra, Arnaldo Carvalho de Melo
From: Masanari Iida <standby24x7@gmail.com>
This patch fix a spelling typo in a variable name in the Documentation Makefile.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190801032812.25018-1-standby24x7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Documentation/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/Documentation/Makefile b/tools/perf/Documentation/Makefile
index 6d148a40551c..adc5a7e44b98 100644
--- a/tools/perf/Documentation/Makefile
+++ b/tools/perf/Documentation/Makefile
@@ -242,7 +242,7 @@ $(OUTPUT)doc.dep : $(wildcard *.txt) build-docdep.perl
$(PERL_PATH) ./build-docdep.perl >$@+ $(QUIET_STDERR) && \
mv $@+ $@
--include $(OUPTUT)doc.dep
+-include $(OUTPUT)doc.dep
_cmds_txt = cmds-ancillaryinterrogators.txt \
cmds-ancillarymanipulators.txt \
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 07/10] perf tools: Fix include paths in ui directory
2019-08-08 18:53 [GIT PULL] perf/urgent improvements and fixes Arnaldo Carvalho de Melo
` (5 preceding siblings ...)
2019-08-08 18:53 ` [PATCH 06/10] perf tools: Fix a typo in a variable name in the Documentation Makefile Arnaldo Carvalho de Melo
@ 2019-08-08 18:53 ` Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 08/10] perf record: Fix module size on s390 Arnaldo Carvalho de Melo
` (2 subsequent siblings)
9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-08 18:53 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, Ian Rogers, Alexander Shishkin, Andi Kleen,
Jiri Olsa, Peter Zijlstra, Stephane Eranian,
Arnaldo Carvalho de Melo
From: Ian Rogers <irogers@google.com>
These paths point to the wrong location but still work because they get
picked up by a -I flag that happens to direct to the correct file. Fix
paths to point to the correct location without -I flags.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190731225441.233800-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/ui/browser.c | 9 +++++----
tools/perf/ui/tui/progress.c | 2 +-
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c
index f80c51d53565..d227d74b28f8 100644
--- a/tools/perf/ui/browser.c
+++ b/tools/perf/ui/browser.c
@@ -1,7 +1,8 @@
// SPDX-License-Identifier: GPL-2.0
-#include "../string2.h"
-#include "../config.h"
-#include "../../perf.h"
+#include "../util/util.h"
+#include "../util/string2.h"
+#include "../util/config.h"
+#include "../perf.h"
#include "libslang.h"
#include "ui.h"
#include "util.h"
@@ -14,7 +15,7 @@
#include "browser.h"
#include "helpline.h"
#include "keysyms.h"
-#include "../color.h"
+#include "../util/color.h"
#include <linux/ctype.h>
#include <linux/zalloc.h>
diff --git a/tools/perf/ui/tui/progress.c b/tools/perf/ui/tui/progress.c
index bc134b82829d..5a24dd3ce4db 100644
--- a/tools/perf/ui/tui/progress.c
+++ b/tools/perf/ui/tui/progress.c
@@ -1,6 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/kernel.h>
-#include "../cache.h"
+#include "../../util/cache.h"
#include "../progress.h"
#include "../libslang.h"
#include "../ui.h"
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 08/10] perf record: Fix module size on s390
2019-08-08 18:53 [GIT PULL] perf/urgent improvements and fixes Arnaldo Carvalho de Melo
` (6 preceding siblings ...)
2019-08-08 18:53 ` [PATCH 07/10] perf tools: Fix include paths in ui directory Arnaldo Carvalho de Melo
@ 2019-08-08 18:53 ` Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 09/10] perf annotate: Fix s390 gap between kernel end and module start Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 10/10] perf pmu-events: Fix missing "cpu_clk_unhalted.core" event Arnaldo Carvalho de Melo
9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-08 18:53 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, Thomas Richter, Stefan Liebler, Heiko Carstens,
Hendrik Brueckner, Vasily Gorbik, stable,
Arnaldo Carvalho de Melo
From: Thomas Richter <tmricht@linux.ibm.com>
On s390 the modules loaded in memory have the text segment located after
the GOT and Relocation table. This can be seen with this output:
[root@m35lp76 perf]# fgrep qeth /proc/modules
qeth 151552 1 qeth_l2, Live 0x000003ff800b2000
...
[root@m35lp76 perf]# cat /sys/module/qeth/sections/.text
0x000003ff800b3990
[root@m35lp76 perf]#
There is an offset of 0x1990 bytes. The size of the qeth module is
151552 bytes (0x25000 in hex).
The location of the GOT/relocation table at the beginning of a module is
unique to s390.
commit 203d8a4aa6ed ("perf s390: Fix 'start' address of module's map")
adjusts the start address of a module in the map structures, but does
not adjust the size of the modules. This leads to overlapping of module
maps as this example shows:
[root@m35lp76 perf] # ./perf report -D
0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x25000)
@ 0]: x /lib/modules/.../qeth.ko.xz
0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x8000)
@ 0]: x /lib/modules/.../ip6_tables.ko.xz
The module qeth.ko has an adjusted start address modified to b3990, but
its size is unchanged and the module ends at 0x3ff800d8990. This end
address overlaps with the next modules start address of 0x3ff800d85a0.
When the size of the leading GOT/Relocation table stored in the
beginning of the text segment (0x1990 bytes) is subtracted from module
qeth end address, there are no overlaps anymore:
0x3ff800d8990 - 0x1990 = 0x0x3ff800d7000
which is the same as
0x3ff800b2000 + 0x25000 = 0x0x3ff800d7000.
To fix this issue, also adjust the modules size in function
arch__fix_module_text_start(). Add another function parameter named size
and reduce the size of the module when the text segment start address is
changed.
Output after:
0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x23670)
@ 0]: x /lib/modules/.../qeth.ko.xz
0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x7a60)
@ 0]: x /lib/modules/.../ip6_tables.ko.xz
Reported-by: Stefan Liebler <stli@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: stable@vger.kernel.org
Fixes: 203d8a4aa6ed ("perf s390: Fix 'start' address of module's map")
Link: http://lkml.kernel.org/r/20190724122703.3996-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/arch/s390/util/machine.c | 14 +++++++++++++-
tools/perf/util/machine.c | 3 ++-
tools/perf/util/machine.h | 2 +-
3 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/tools/perf/arch/s390/util/machine.c b/tools/perf/arch/s390/util/machine.c
index a19690a17291..de26b1441a48 100644
--- a/tools/perf/arch/s390/util/machine.c
+++ b/tools/perf/arch/s390/util/machine.c
@@ -7,7 +7,7 @@
#include "api/fs/fs.h"
#include "debug.h"
-int arch__fix_module_text_start(u64 *start, const char *name)
+int arch__fix_module_text_start(u64 *start, u64 *size, const char *name)
{
u64 m_start = *start;
char path[PATH_MAX];
@@ -17,6 +17,18 @@ int arch__fix_module_text_start(u64 *start, const char *name)
if (sysfs__read_ull(path, (unsigned long long *)start) < 0) {
pr_debug2("Using module %s start:%#lx\n", path, m_start);
*start = m_start;
+ } else {
+ /* Successful read of the modules segment text start address.
+ * Calculate difference between module start address
+ * in memory and module text segment start address.
+ * For example module load address is 0x3ff8011b000
+ * (from /proc/modules) and module text segment start
+ * address is 0x3ff8011b870 (from file above).
+ *
+ * Adjust the module size and subtract the GOT table
+ * size located at the beginning of the module.
+ */
+ *size -= (*start - m_start);
}
return 0;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index cf826eca3aaf..83b2fbbeeb90 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1378,6 +1378,7 @@ static int machine__set_modules_path(struct machine *machine)
return map_groups__set_modules_path_dir(&machine->kmaps, modules_path, 0);
}
int __weak arch__fix_module_text_start(u64 *start __maybe_unused,
+ u64 *size __maybe_unused,
const char *name __maybe_unused)
{
return 0;
@@ -1389,7 +1390,7 @@ static int machine__create_module(void *arg, const char *name, u64 start,
struct machine *machine = arg;
struct map *map;
- if (arch__fix_module_text_start(&start, name) < 0)
+ if (arch__fix_module_text_start(&start, &size, name) < 0)
return -1;
map = machine__findnew_module_map(machine, start, name);
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index f70ab98a7bde..7aa38da26427 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -222,7 +222,7 @@ struct symbol *machine__find_kernel_symbol_by_name(struct machine *machine,
struct map *machine__findnew_module_map(struct machine *machine, u64 start,
const char *filename);
-int arch__fix_module_text_start(u64 *start, const char *name);
+int arch__fix_module_text_start(u64 *start, u64 *size, const char *name);
int machine__load_kallsyms(struct machine *machine, const char *filename);
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 09/10] perf annotate: Fix s390 gap between kernel end and module start
2019-08-08 18:53 [GIT PULL] perf/urgent improvements and fixes Arnaldo Carvalho de Melo
` (7 preceding siblings ...)
2019-08-08 18:53 ` [PATCH 08/10] perf record: Fix module size on s390 Arnaldo Carvalho de Melo
@ 2019-08-08 18:53 ` Arnaldo Carvalho de Melo
2019-08-08 18:53 ` [PATCH 10/10] perf pmu-events: Fix missing "cpu_clk_unhalted.core" event Arnaldo Carvalho de Melo
9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-08 18:53 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, Thomas Richter, Klaus Theurich, Heiko Carstens,
Hendrik Brueckner, Vasily Gorbik, stable,
Arnaldo Carvalho de Melo
From: Thomas Richter <tmricht@linux.ibm.com>
During execution of command 'perf top' the error message:
Not enough memory for annotating '__irf_end' symbol!)
is emitted from this call sequence:
__cmd_top
perf_top__mmap_read
perf_top__mmap_read_idx
perf_event__process_sample
hist_entry_iter__add
hist_iter__top_callback
perf_top__record_precise_ip
hist_entry__inc_addr_samples
symbol__inc_addr_samples
symbol__get_annotation
symbol__alloc_hist
In this function the size of symbol __irf_end is calculated. The size of
a symbol is the difference between its start and end address.
When the symbol was read the first time, its start and end was set to:
symbol__new: __irf_end 0xe954d0-0xe954d0
which is correct and maps with /proc/kallsyms:
root@s8360046:~/linux-4.15.0/tools/perf# fgrep _irf_end /proc/kallsyms
0000000000e954d0 t __irf_end
root@s8360046:~/linux-4.15.0/tools/perf#
In function symbol__alloc_hist() the end of symbol __irf_end is
symbol__alloc_hist sym:__irf_end start:0xe954d0 end:0x3ff80045a8
which is identical with the first module entry in /proc/kallsyms
This results in a symbol size of __irf_req for histogram analyses of
70334140059072 bytes and a malloc() for this requested size fails.
The root cause of this is function
__dso__load_kallsyms()
+-> symbols__fixup_end()
Function symbols__fixup_end() enlarges the last symbol in the kallsyms
map:
# fgrep __irf_end /proc/kallsyms
0000000000e954d0 t __irf_end
#
to the start address of the first module:
# cat /proc/kallsyms | sort | egrep ' [tT] '
....
0000000000e952d0 T __security_initcall_end
0000000000e954d0 T __initramfs_size
0000000000e954d0 t __irf_end
000003ff800045a8 T fc_get_event_number [scsi_transport_fc]
000003ff800045d0 t store_fc_vport_disable [scsi_transport_fc]
000003ff800046a8 T scsi_is_fc_rport [scsi_transport_fc]
000003ff800046d0 t fc_target_setup [scsi_transport_fc]
On s390 the kernel is located around memory address 0x200, 0x10000 or
0x100000, depending on linux version. Modules however start some- where
around 0x3ff xxxx xxxx.
This is different than x86 and produces a large gap for which histogram
allocation fails.
Fix this by detecting the kernel's last symbol and do no adjustment for
it. Introduce a weak function and handle s390 specifics.
Reported-by: Klaus Theurich <klaus.theurich@de.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20190724122703.3996-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/arch/s390/util/machine.c | 17 +++++++++++++++++
tools/perf/util/symbol.c | 7 ++++++-
tools/perf/util/symbol.h | 1 +
3 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/tools/perf/arch/s390/util/machine.c b/tools/perf/arch/s390/util/machine.c
index de26b1441a48..c8c86a0c9b79 100644
--- a/tools/perf/arch/s390/util/machine.c
+++ b/tools/perf/arch/s390/util/machine.c
@@ -6,6 +6,7 @@
#include "machine.h"
#include "api/fs/fs.h"
#include "debug.h"
+#include "symbol.h"
int arch__fix_module_text_start(u64 *start, u64 *size, const char *name)
{
@@ -33,3 +34,19 @@ int arch__fix_module_text_start(u64 *start, u64 *size, const char *name)
return 0;
}
+
+/* On s390 kernel text segment start is located at very low memory addresses,
+ * for example 0x10000. Modules are located at very high memory addresses,
+ * for example 0x3ff xxxx xxxx. The gap between end of kernel text segment
+ * and beginning of first module's text segment is very big.
+ * Therefore do not fill this gap and do not assign it to the kernel dso map.
+ */
+void arch__symbols__fixup_end(struct symbol *p, struct symbol *c)
+{
+ if (strchr(p->name, '[') == NULL && strchr(c->name, '['))
+ /* Last kernel symbol mapped to end of page */
+ p->end = roundup(p->end, page_size);
+ else
+ p->end = c->start;
+ pr_debug4("%s sym:%s end:%#lx\n", __func__, p->name, p->end);
+}
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 173f3378aaa0..4efde7879474 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -92,6 +92,11 @@ static int prefix_underscores_count(const char *str)
return tail - str;
}
+void __weak arch__symbols__fixup_end(struct symbol *p, struct symbol *c)
+{
+ p->end = c->start;
+}
+
const char * __weak arch__normalize_symbol_name(const char *name)
{
return name;
@@ -218,7 +223,7 @@ void symbols__fixup_end(struct rb_root_cached *symbols)
curr = rb_entry(nd, struct symbol, rb_node);
if (prev->end == prev->start && prev->end != curr->start)
- prev->end = curr->start;
+ arch__symbols__fixup_end(prev, curr);
}
/* Last entry */
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 12755b42ea93..183f630cb5f1 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -288,6 +288,7 @@ const char *arch__normalize_symbol_name(const char *name);
#define SYMBOL_A 0
#define SYMBOL_B 1
+void arch__symbols__fixup_end(struct symbol *p, struct symbol *c);
int arch__compare_symbol_names(const char *namea, const char *nameb);
int arch__compare_symbol_names_n(const char *namea, const char *nameb,
unsigned int n);
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH 10/10] perf pmu-events: Fix missing "cpu_clk_unhalted.core" event
2019-08-08 18:53 [GIT PULL] perf/urgent improvements and fixes Arnaldo Carvalho de Melo
` (8 preceding siblings ...)
2019-08-08 18:53 ` [PATCH 09/10] perf annotate: Fix s390 gap between kernel end and module start Arnaldo Carvalho de Melo
@ 2019-08-08 18:53 ` Arnaldo Carvalho de Melo
9 siblings, 0 replies; 11+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-08-08 18:53 UTC (permalink / raw)
To: Ingo Molnar, Thomas Gleixner
Cc: Jiri Olsa, Namhyung Kim, Clark Williams, linux-kernel,
linux-perf-users, Jin Yao, Alexander Shishkin, Andi Kleen,
Jin Yao, Kan Liang, Peter Zijlstra, Arnaldo Carvalho de Melo
From: Jin Yao <yao.jin@linux.intel.com>
The events defined in pmu-events JSON are parsed and added into perf
tool. For fixed counters, we handle the encodings between JSON and perf
by using a static array fixed[].
But the fixed[] has missed an important event "cpu_clk_unhalted.core".
For example, on the Tremont platform,
[root@localhost ~]# perf stat -e cpu_clk_unhalted.core -a
event syntax error: 'cpu_clk_unhalted.core'
\___ parser error
With this patch, the event cpu_clk_unhalted.core can be parsed.
[root@localhost perf]# ./perf stat -e cpu_clk_unhalted.core -a -vvv
------------------------------------------------------------
perf_event_attr:
type 4
size 112
config 0x3c
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
...
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190729072755.2166-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/pmu-events/jevents.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index 1a91a197cafb..d413761621b0 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -453,6 +453,7 @@ static struct fixed {
{ "inst_retired.any_p", "event=0xc0" },
{ "cpu_clk_unhalted.ref", "event=0x0,umask=0x03" },
{ "cpu_clk_unhalted.thread", "event=0x3c" },
+ { "cpu_clk_unhalted.core", "event=0x3c" },
{ "cpu_clk_unhalted.thread_any", "event=0x3c,any=1" },
{ NULL, NULL},
};
--
2.21.0
^ permalink raw reply related [flat|nested] 11+ messages in thread