* [PATCH v4 01/15] perf tools: Add utility function to fetch executable
2019-03-05 14:47 Support sample context in perf report Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-06 20:53 ` Arnaldo Carvalho de Melo
2019-03-09 19:59 ` [tip:perf/urgent] perf thread: Generalize function to copy from thread addr space from intel-bts code tip-bot for Andi Kleen
2019-03-05 14:47 ` [PATCH v4 02/15] perf tools script: Support insn output for normal samples Andi Kleen
` (14 subsequent siblings)
15 siblings, 2 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add a utility function to fetch executable code. Convert one
user over to it. There are more places doing that, but they
do significantly different actions, so they are not
easy to fit into a single library function.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/util/Build | 1 +
tools/perf/util/fetch.c | 28 ++++++++++++++++++++++++++++
tools/perf/util/fetch.h | 7 +++++++
tools/perf/util/intel-bts.c | 21 +++------------------
4 files changed, 39 insertions(+), 18 deletions(-)
create mode 100644 tools/perf/util/fetch.c
create mode 100644 tools/perf/util/fetch.h
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 8dd3102301ea..649321fc3fb9 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -65,6 +65,7 @@ perf-y += trace-event-scripting.o
perf-y += trace-event.o
perf-y += svghelper.o
perf-y += sort.o
+perf-y += fetch.o
perf-y += hist.o
perf-y += util.o
perf-y += xyarray.o
diff --git a/tools/perf/util/fetch.c b/tools/perf/util/fetch.c
new file mode 100644
index 000000000000..1430083e7eca
--- /dev/null
+++ b/tools/perf/util/fetch.c
@@ -0,0 +1,28 @@
+#include "perf.h"
+#include "machine.h"
+#include "thread.h"
+#include "symbol.h"
+#include "map.h"
+#include "fetch.h"
+
+int fetch_exe(u64 ip, struct thread *thread, struct machine *machine,
+ char *buf, int len, bool *is64bit)
+{
+ struct addr_location al;
+ u8 cpumode;
+ long offset;
+
+ if (machine__kernel_ip(machine, ip))
+ cpumode = PERF_RECORD_MISC_KERNEL;
+ else
+ cpumode = PERF_RECORD_MISC_USER;
+ if (!thread__find_map(thread, cpumode, ip, &al) || !al.map->dso)
+ return -1;
+ if (al.map->dso->data.status == DSO_DATA_STATUS_ERROR)
+ return -1;
+ map__load(al.map);
+ offset = al.map->map_ip(al.map, ip);
+ if (is64bit)
+ *is64bit = al.map->dso->is_64_bit;
+ return dso__data_read_offset(al.map->dso, machine, offset, (u8 *)buf, len);
+}
diff --git a/tools/perf/util/fetch.h b/tools/perf/util/fetch.h
new file mode 100644
index 000000000000..7b77b8cee55a
--- /dev/null
+++ b/tools/perf/util/fetch.h
@@ -0,0 +1,7 @@
+#ifndef FETCH_H
+#define FETCH_H 1
+
+int fetch_exe(u64 ip, struct thread *thread, struct machine *machine,
+ char *buf, int len, bool *is64bit);
+
+#endif
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 0c0180c67574..915f4662e52e 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -38,6 +38,7 @@
#include "auxtrace.h"
#include "intel-pt-decoder/intel-pt-insn-decoder.h"
#include "intel-bts.h"
+#include "fetch.h"
#define MAX_TIMESTAMP (~0ULL)
@@ -328,35 +329,19 @@ static int intel_bts_get_next_insn(struct intel_bts_queue *btsq, u64 ip)
{
struct machine *machine = btsq->bts->machine;
struct thread *thread;
- struct addr_location al;
unsigned char buf[INTEL_PT_INSN_BUF_SZ];
ssize_t len;
- int x86_64;
- uint8_t cpumode;
+ bool x86_64;
int err = -1;
- if (machine__kernel_ip(machine, ip))
- cpumode = PERF_RECORD_MISC_KERNEL;
- else
- cpumode = PERF_RECORD_MISC_USER;
-
thread = machine__find_thread(machine, -1, btsq->tid);
if (!thread)
return -1;
- if (!thread__find_map(thread, cpumode, ip, &al) || !al.map->dso)
- goto out_put;
-
- len = dso__data_read_addr(al.map->dso, al.map, machine, ip, buf,
- INTEL_PT_INSN_BUF_SZ);
+ len = fetch_exe(ip, thread, machine, (char*)buf, INTEL_PT_INSN_BUF_SZ, &x86_64);
if (len <= 0)
goto out_put;
- /* Load maps to ensure dso->is_64_bit has been updated */
- map__load(al.map);
-
- x86_64 = al.map->dso->is_64_bit;
-
if (intel_pt_get_insn(buf, len, x86_64, &btsq->intel_pt_insn))
goto out_put;
--
2.20.1
* Re: [PATCH v4 01/15] perf tools: Add utility function to fetch executable
2019-03-05 14:47 ` [PATCH v4 01/15] perf tools: Add utility function to fetch executable Andi Kleen
@ 2019-03-06 20:53 ` Arnaldo Carvalho de Melo
2019-03-06 21:08 ` Andi Kleen
2019-03-09 19:59 ` [tip:perf/urgent] perf thread: Generalize function to copy from thread addr space from intel-bts code tip-bot for Andi Kleen
1 sibling, 1 reply; 49+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-03-06 20:53 UTC (permalink / raw)
To: Andi Kleen; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
On Tue, Mar 05, 2019 at 06:47:44AM -0800, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> Add a utility function to fetch executable code. Convert one
> user over to it. There are more places doing that, but they
> do significantly different actions, so they are not
> easy to fit into a single library function.
>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
> tools/perf/util/Build | 1 +
> tools/perf/util/fetch.c | 28 ++++++++++++++++++++++++++++
> tools/perf/util/fetch.h | 7 +++++++
> tools/perf/util/intel-bts.c | 21 +++------------------
> 4 files changed, 39 insertions(+), 18 deletions(-)
> create mode 100644 tools/perf/util/fetch.c
> create mode 100644 tools/perf/util/fetch.h
So I've made some changes and the end result is below. I will fix up the
following patches; holler if you find any problems:
Changes:
. No need to cast around, make 'buf' be a void pointer
. Rename it to thread__memcpy() to reflect the fact it is about copying
a chunk of memory from a thread, i.e. from its address space.
. No need to have it in a separate object file, move it to thread.[ch]
. Check the return of map__load(), the original code didn't do it, but
since we're moving this around, check that as well, could be moved to
a separate patch tho.
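For readers outside the perf tree, the flow the helper follows (find the map
covering the address, translate the address into an offset in the backing
object, then read at most 'len' bytes into a void pointer) can be sketched
standalone roughly like this. Everything below is hypothetical and only
illustrative: struct toy_map, toy_find_map() and toy_memcpy() are not perf
APIs, and the in-memory byte array stands in for the DSO data that
dso__data_read_offset() would read.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a mapping; NOT perf's struct map. */
struct toy_map {
	uint64_t start;        /* first address covered by the mapping */
	uint64_t end;          /* one past the last covered address */
	uint64_t pgoff;        /* offset of the mapping inside the image */
	const uint8_t *image;  /* backing bytes, NULL if unavailable */
	size_t image_len;      /* number of bytes in image */
	bool is_64_bit;        /* whether the backing object is 64-bit */
};

/* Return the mapping that covers ip, or NULL if none does. */
static const struct toy_map *toy_find_map(const struct toy_map *maps,
					  size_t nr, uint64_t ip)
{
	for (size_t i = 0; i < nr; i++)
		if (ip >= maps[i].start && ip < maps[i].end)
			return &maps[i];
	return NULL;
}

/*
 * Copy up to len bytes located at address ip into buf.
 * Returns the number of bytes copied, or -1 on any failure.
 * Every step is checked before the next one runs.
 */
static int toy_memcpy(const struct toy_map *maps, size_t nr,
		      void *buf, uint64_t ip, int len, bool *is64bit)
{
	const struct toy_map *map = toy_find_map(maps, nr, ip);
	uint64_t offset;
	size_t avail;

	if (!map || !map->image || len < 0)
		return -1;

	/* Translate the address into an offset inside the image. */
	offset = ip - map->start + map->pgoff;
	if (offset >= map->image_len)
		return -1;

	if (is64bit)
		*is64bit = map->is_64_bit;

	/* Short reads are allowed near the end of the image. */
	avail = map->image_len - offset;
	if ((size_t)len > avail)
		len = (int)avail;
	memcpy(buf, map->image + offset, (size_t)len);
	return len;
}
```

Taking 'buf' as a void pointer is what lets a caller pass an unsigned char
array (as intel-bts does) or a char array without casting.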
- Arnaldo
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 0c0180c67574..47025bc727e1 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -328,35 +328,19 @@ static int intel_bts_get_next_insn(struct intel_bts_queue *btsq, u64 ip)
{
struct machine *machine = btsq->bts->machine;
struct thread *thread;
- struct addr_location al;
unsigned char buf[INTEL_PT_INSN_BUF_SZ];
ssize_t len;
- int x86_64;
- uint8_t cpumode;
+ bool x86_64;
int err = -1;
- if (machine__kernel_ip(machine, ip))
- cpumode = PERF_RECORD_MISC_KERNEL;
- else
- cpumode = PERF_RECORD_MISC_USER;
-
thread = machine__find_thread(machine, -1, btsq->tid);
if (!thread)
return -1;
- if (!thread__find_map(thread, cpumode, ip, &al) || !al.map->dso)
- goto out_put;
-
- len = dso__data_read_addr(al.map->dso, al.map, machine, ip, buf,
- INTEL_PT_INSN_BUF_SZ);
+ len = thread__memcpy(thread, machine, buf, ip, INTEL_PT_INSN_BUF_SZ, &x86_64);
if (len <= 0)
goto out_put;
- /* Load maps to ensure dso->is_64_bit has been updated */
- map__load(al.map);
-
- x86_64 = al.map->dso->is_64_bit;
-
if (intel_pt_get_insn(buf, len, x86_64, &btsq->intel_pt_insn))
goto out_put;
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 4c179fef442d..50678d318185 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -12,6 +12,7 @@
#include "debug.h"
#include "namespaces.h"
#include "comm.h"
+#include "map.h"
#include "symbol.h"
#include "unwind.h"
@@ -393,3 +394,25 @@ struct thread *thread__main_thread(struct machine *machine, struct thread *threa
return machine__find_thread(machine, thread->pid_, thread->pid_);
}
+
+int thread__memcpy(struct thread *thread, struct machine *machine,
+ void *buf, u64 ip, int len, bool *is64bit)
+{
+ u8 cpumode = PERF_RECORD_MISC_USER;
+ struct addr_location al;
+ long offset;
+
+ if (machine__kernel_ip(machine, ip))
+ cpumode = PERF_RECORD_MISC_KERNEL;
+
+ if (!thread__find_map(thread, cpumode, ip, &al) || !al.map->dso ||
+ al.map->dso->data.status == DSO_DATA_STATUS_ERROR ||
+ map__load(al.map) < 0)
+ return -1;
+
+ offset = al.map->map_ip(al.map, ip);
+ if (is64bit)
+ *is64bit = al.map->dso->is_64_bit;
+
+ return dso__data_read_offset(al.map->dso, machine, offset, buf, len);
+}
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 8276ffeec556..cf8375c017a0 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -113,6 +113,9 @@ struct symbol *thread__find_symbol_fb(struct thread *thread, u8 cpumode,
void thread__find_cpumode_addr_location(struct thread *thread, u64 addr,
struct addr_location *al);
+int thread__memcpy(struct thread *thread, struct machine *machine,
+ void *buf, u64 ip, int len, bool *is64bit);
+
static inline void *thread__priv(struct thread *thread)
{
return thread->priv;
* Re: [PATCH v4 01/15] perf tools: Add utility function to fetch executable
2019-03-06 20:53 ` Arnaldo Carvalho de Melo
@ 2019-03-06 21:08 ` Andi Kleen
0 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-06 21:08 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Andi Kleen, jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
> . No need to cast around, make 'buf' be a void pointer
>
> . Rename it to thread__memcpy() to reflect the fact it is about copying
> a chunk of memory from a thread, i.e. from its address space.
>
> . No need to have it in a separate object file, move it to thread.[ch]
>
> . Check the return of map__load(), the original code didn't do it, but
> since we're moving this around, check that as well, could be moved to
> a separate patch tho.
Changes look good. Thanks Arnaldo.
-Andi
* [tip:perf/urgent] perf thread: Generalize function to copy from thread addr space from intel-bts code
2019-03-05 14:47 ` [PATCH v4 01/15] perf tools: Add utility function to fetch executable Andi Kleen
2019-03-06 20:53 ` Arnaldo Carvalho de Melo
@ 2019-03-09 19:59 ` tip-bot for Andi Kleen
1 sibling, 0 replies; 49+ messages in thread
From: tip-bot for Andi Kleen @ 2019-03-09 19:59 UTC (permalink / raw)
To: linux-tip-commits
Cc: namhyung, tglx, mingo, ak, acme, linux-kernel, jolsa, hpa
Commit-ID: 153259382633ecbbc0af4f3f0b6515757ebe2984
Gitweb: https://git.kernel.org/tip/153259382633ecbbc0af4f3f0b6515757ebe2984
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Wed, 6 Mar 2019 17:55:35 -0300
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Wed, 6 Mar 2019 17:55:35 -0300
perf thread: Generalize function to copy from thread addr space from intel-bts code
Add a utility function to fetch executable code. Convert one
user over to it. There are more places doing that, but they
do significantly different actions, so they are not
easy to fit into a single library function.
Committer changes:
. No need to cast around, make 'buf' be a void pointer.
. Rename it to thread__memcpy() to reflect the fact it is about copying
a chunk of memory from a thread, i.e. from its address space.
. No need to have it in a separate object file, move it to thread.[ch]
. Check the return of map__load(), the original code didn't do it, but
since we're moving this around, check that as well.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/r/20190305144758.12397-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/intel-bts.c | 20 ++------------------
tools/perf/util/thread.c | 23 +++++++++++++++++++++++
tools/perf/util/thread.h | 3 +++
3 files changed, 28 insertions(+), 18 deletions(-)
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 0c0180c67574..47025bc727e1 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -328,35 +328,19 @@ static int intel_bts_get_next_insn(struct intel_bts_queue *btsq, u64 ip)
{
struct machine *machine = btsq->bts->machine;
struct thread *thread;
- struct addr_location al;
unsigned char buf[INTEL_PT_INSN_BUF_SZ];
ssize_t len;
- int x86_64;
- uint8_t cpumode;
+ bool x86_64;
int err = -1;
- if (machine__kernel_ip(machine, ip))
- cpumode = PERF_RECORD_MISC_KERNEL;
- else
- cpumode = PERF_RECORD_MISC_USER;
-
thread = machine__find_thread(machine, -1, btsq->tid);
if (!thread)
return -1;
- if (!thread__find_map(thread, cpumode, ip, &al) || !al.map->dso)
- goto out_put;
-
- len = dso__data_read_addr(al.map->dso, al.map, machine, ip, buf,
- INTEL_PT_INSN_BUF_SZ);
+ len = thread__memcpy(thread, machine, buf, ip, INTEL_PT_INSN_BUF_SZ, &x86_64);
if (len <= 0)
goto out_put;
- /* Load maps to ensure dso->is_64_bit has been updated */
- map__load(al.map);
-
- x86_64 = al.map->dso->is_64_bit;
-
if (intel_pt_get_insn(buf, len, x86_64, &btsq->intel_pt_insn))
goto out_put;
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 4c179fef442d..50678d318185 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -12,6 +12,7 @@
#include "debug.h"
#include "namespaces.h"
#include "comm.h"
+#include "map.h"
#include "symbol.h"
#include "unwind.h"
@@ -393,3 +394,25 @@ struct thread *thread__main_thread(struct machine *machine, struct thread *threa
return machine__find_thread(machine, thread->pid_, thread->pid_);
}
+
+int thread__memcpy(struct thread *thread, struct machine *machine,
+ void *buf, u64 ip, int len, bool *is64bit)
+{
+ u8 cpumode = PERF_RECORD_MISC_USER;
+ struct addr_location al;
+ long offset;
+
+ if (machine__kernel_ip(machine, ip))
+ cpumode = PERF_RECORD_MISC_KERNEL;
+
+ if (!thread__find_map(thread, cpumode, ip, &al) || !al.map->dso ||
+ al.map->dso->data.status == DSO_DATA_STATUS_ERROR ||
+ map__load(al.map) < 0)
+ return -1;
+
+ offset = al.map->map_ip(al.map, ip);
+ if (is64bit)
+ *is64bit = al.map->dso->is_64_bit;
+
+ return dso__data_read_offset(al.map->dso, machine, offset, buf, len);
+}
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 8276ffeec556..cf8375c017a0 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -113,6 +113,9 @@ struct symbol *thread__find_symbol_fb(struct thread *thread, u8 cpumode,
void thread__find_cpumode_addr_location(struct thread *thread, u64 addr,
struct addr_location *al);
+int thread__memcpy(struct thread *thread, struct machine *machine,
+ void *buf, u64 ip, int len, bool *is64bit);
+
static inline void *thread__priv(struct thread *thread)
{
return thread->priv;
* [PATCH v4 02/15] perf tools script: Support insn output for normal samples
2019-03-05 14:47 Support sample context in perf report Andi Kleen
2019-03-05 14:47 ` [PATCH v4 01/15] perf tools: Add utility function to fetch executable Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-08 13:37 ` Arnaldo Carvalho de Melo
2019-03-22 21:59 ` [tip:perf/urgent] perf " tip-bot for Andi Kleen
2019-03-05 14:47 ` [PATCH v4 03/15] perf tools script: Filter COMM/FORK/.. events by CPU Andi Kleen
` (13 subsequent siblings)
15 siblings, 2 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
perf script -F +insn was only working for PT traces because
the PT instruction decoder was filling in the insn/insn_len
sample attributes. Support it for non PT samples too on x86
using the existing x86 instruction decoder.
This adds some extra checking to ensure that we don't try
to decode instructions when using perf.data from a different
architecture.
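As a rough standalone illustration of that architecture check: the host's
uname() machine string is compared with the architecture recorded in the
perf.data header, and an x86_64 host is additionally allowed to decode i386
recordings. The helper names arch_matches() and recorded_on_native_arch()
are made up for this sketch, and the NULL and uname() failure checks are
extra defensive assumptions that the patch itself does not make.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: true when instructions recorded on 'recorded'
 * can be decoded on 'host'. Identical strings match, and an x86_64
 * host also accepts i386 recordings.
 */
static bool arch_matches(const char *host, const char *recorded)
{
	if (!host || !recorded)
		return false;
	if (!strcmp(host, recorded))
		return true;
	return !strcmp(host, "x86_64") && !strcmp(recorded, "i386");
}

/* Hypothetical wrapper: compare the running machine with 'recorded'. */
static bool recorded_on_native_arch(const char *recorded)
{
	struct utsname uts;

	if (uname(&uts) < 0)
		return false;
	return arch_matches(uts.machine, recorded);
}
```

The reverse direction (an i386 host reading x86_64 data) is deliberately not
accepted in this sketch, matching the one-way special case in the patch.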
% perf record -a sleep 1
% perf script -F ip,sym,insn --xed
ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
ffffffff8100bb50 intel_bts_enable_local retq
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff810f1f79 generic_exec_single xor %eax, %eax
ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
ffffffff8100bb34 intel_bts_enable_local movl 0x2000(%rax), %edx
ffffffff81048610 native_apic_mem_write mov %edi, %edi
...
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
v2:
Avoid printing instruction when empty
Only decode when perf.data file was collected on same architecture
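The first v2 item (print nothing when no instruction was captured) can be
sketched outside perf as below. This is only an illustration: format_insn()
is a hypothetical helper that writes into a caller-supplied buffer rather
than a FILE, and the " insn:" prefix followed by " %02x" per byte is an
assumption about the output layout.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write " insn: xx xx ..." into out, or leave out
 * empty when insn_len is zero, so an uncaptured instruction prints
 * nothing. Returns the number of characters written (excluding the
 * terminating NUL), or -1 if out is too small.
 */
static int format_insn(char *out, size_t outsz,
		       const unsigned char *insn, int insn_len)
{
	int n, printed;

	if (outsz == 0)
		return -1;
	out[0] = '\0';

	/* Nothing captured: emit nothing at all, not even the label. */
	if (insn_len <= 0)
		return 0;

	n = snprintf(out, outsz, " insn:");
	if (n < 0 || (size_t)n >= outsz)
		return -1;
	printed = n;

	for (int i = 0; i < insn_len; i++) {
		size_t room = outsz - (size_t)printed;

		n = snprintf(out + printed, room, " %02x", (unsigned)insn[i]);
		if (n < 0 || (size_t)n >= room)
			return -1;
		printed += n;
	}
	return printed;
}
```

With the two bytes 0f 30 (the wrmsr opcode) this yields " insn: 0f 30", and
with a length of zero the output stays empty.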
---
tools/perf/arch/x86/util/Build | 1 +
tools/perf/arch/x86/util/archinsn.c | 28 ++++++++++++++++++++++++++++
tools/perf/builtin-script.c | 21 ++++++++++++++++++++-
tools/perf/util/archinsn.h | 12 ++++++++++++
4 files changed, 61 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/arch/x86/util/archinsn.c
create mode 100644 tools/perf/util/archinsn.h
diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index 7aab0be5fc5f..7b8e69bbbdfe 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -6,6 +6,7 @@ perf-y += perf_regs.o
perf-y += group.o
perf-y += machine.o
perf-y += event.o
+perf-y += archinsn.o
perf-$(CONFIG_DWARF) += dwarf-regs.o
perf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
diff --git a/tools/perf/arch/x86/util/archinsn.c b/tools/perf/arch/x86/util/archinsn.c
new file mode 100644
index 000000000000..10b3c2a08b8f
--- /dev/null
+++ b/tools/perf/arch/x86/util/archinsn.c
@@ -0,0 +1,28 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "perf.h"
+#include "archinsn.h"
+#include "fetch.h"
+#include "util/intel-pt-decoder/insn.h"
+#include "machine.h"
+#include "thread.h"
+#include "symbol.h"
+
+void arch_fetch_insn(struct perf_sample *sample,
+ struct thread *thread,
+ struct machine *machine)
+{
+ struct insn insn;
+ int len;
+ bool is64bit = false;
+
+ if (!sample->ip)
+ return;
+ len = fetch_exe(sample->ip, thread, machine, sample->insn,
+ sizeof(sample->insn), &is64bit);
+ if (len <= 0)
+ return;
+ insn_init(&insn, sample->insn, len, is64bit);
+ insn_get_length(&insn);
+ if (insn_complete(&insn) && insn.length <= len)
+ sample->insn_len = insn.length;
+}
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 2d8cb1d1682c..fbc440bdf880 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -29,10 +29,12 @@
#include "util/time-utils.h"
#include "util/path.h"
#include "print_binary.h"
+#include "archinsn.h"
#include <linux/bitmap.h>
#include <linux/kernel.h>
#include <linux/stringify.h>
#include <linux/time64.h>
+#include <sys/utsname.h>
#include "asm/bug.h"
#include "util/mem-events.h"
#include "util/dump-insn.h"
@@ -63,6 +65,7 @@ static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
static struct perf_stat_config stat_config;
static int max_blocks;
+static bool native_arch;
unsigned int scripting_max_stack = PERF_MAX_STACK_DEPTH;
@@ -1227,6 +1230,12 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
return len + dlen;
}
+__weak void arch_fetch_insn(struct perf_sample *sample __maybe_unused,
+ struct thread *thread __maybe_unused,
+ struct machine *machine __maybe_unused)
+{
+}
+
static int perf_sample__fprintf_insn(struct perf_sample *sample,
struct perf_event_attr *attr,
struct thread *thread,
@@ -1234,9 +1243,12 @@ static int perf_sample__fprintf_insn(struct perf_sample *sample,
{
int printed = 0;
+ if (sample->insn_len == 0 && native_arch)
+ arch_fetch_insn(sample, thread, machine);
+
if (PRINT_FIELD(INSNLEN))
printed += fprintf(fp, " ilen: %d", sample->insn_len);
- if (PRINT_FIELD(INSN)) {
+ if (PRINT_FIELD(INSN) && sample->insn_len) {
int i;
printed += fprintf(fp, " insn:");
@@ -3277,6 +3289,7 @@ int cmd_script(int argc, const char **argv)
.set = false,
.default_no_sample = true,
};
+ struct utsname uts;
char *script_path = NULL;
const char **__argv;
int i, j, err = 0;
@@ -3615,6 +3628,12 @@ int cmd_script(int argc, const char **argv)
if (symbol__init(&session->header.env) < 0)
goto out_delete;
+ uname(&uts);
+ if (!strcmp(uts.machine, session->header.env.arch) ||
+ (!strcmp(uts.machine, "x86_64") &&
+ !strcmp(session->header.env.arch, "i386")))
+ native_arch = true;
+
script.session = session;
script__setup_sample_type(&script);
diff --git a/tools/perf/util/archinsn.h b/tools/perf/util/archinsn.h
new file mode 100644
index 000000000000..448cbb6b8d7e
--- /dev/null
+++ b/tools/perf/util/archinsn.h
@@ -0,0 +1,12 @@
+#ifndef INSN_H
+#define INSN_H 1
+
+struct perf_sample;
+struct machine;
+struct thread;
+
+void arch_fetch_insn(struct perf_sample *sample,
+ struct thread *thread,
+ struct machine *machine);
+
+#endif
--
2.20.1
* Re: [PATCH v4 02/15] perf tools script: Support insn output for normal samples
2019-03-05 14:47 ` [PATCH v4 02/15] perf tools script: Support insn output for normal samples Andi Kleen
@ 2019-03-08 13:37 ` Arnaldo Carvalho de Melo
2019-03-22 21:59 ` [tip:perf/urgent] perf " tip-bot for Andi Kleen
1 sibling, 0 replies; 49+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-03-08 13:37 UTC (permalink / raw)
To: Andi Kleen; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
On Tue, Mar 05, 2019 at 06:47:45AM -0800, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> perf script -F +insn was only working for PT traces because
> the PT instruction decoder was filling in the insn/insn_len
> sample attributes. Support it for non PT samples too on x86
> using the existing x86 instruction decoder.
>
> This adds some extra checking to ensure that we don't try
> to decode instructions when using perf.data from a different
> architecture.
>
> % perf record -a sleep 1
> % perf script -F ip,sym,insn --xed
> ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
> ffffffff8100bb50 intel_bts_enable_local retq
> ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
> ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
> ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
> ffffffff810f1f79 generic_exec_single xor %eax, %eax
> ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
> ffffffff8100bb34 intel_bts_enable_local movl 0x2000(%rax), %edx
> ffffffff81048610 native_apic_mem_write mov %edi, %edi
> ...
>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
>
> ---
> v2:
> Avoid printing instruction when empty
> Only decode when perf.data file was collected on same architecture
Thanks, applied and added some testing notes comparing the output with
the one from 'perf annotate --stdio2':
commit 2109d7fa73200bc6ff7abe909b5fe821224f98b3
Author: Andi Kleen <ak@linux.intel.com>
Date: Tue Mar 5 06:47:45 2019 -0800
perf script: Support insn output for normal samples
perf script -F +insn was only working for PT traces because the PT
instruction decoder was filling in the insn/insn_len sample attributes.
Support it for non PT samples too on x86 using the existing x86
instruction decoder.
This adds some extra checking to ensure that we don't try to decode
instructions when using perf.data from a different architecture.
% perf record -a sleep 1
% perf script -F ip,sym,insn --xed
ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
ffffffff8100bb50 intel_bts_enable_local retq
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff810f1f79 generic_exec_single xor %eax, %eax
ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
ffffffff8100bb34 intel_bts_enable_local movl 0x2000(%rax), %edx
ffffffff81048610 native_apic_mem_write mov %edi, %edi
...
Committer testing:
Before:
# perf script -F ip,sym,insn --xed | head -5
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068806 native_write_msr addb %al, (%rax)
ffffffffa4068806 native_write_msr addb %al, (%rax)
# perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)"
#
After:
# perf script -F ip,sym,insn --xed | head -5
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
# perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)" | head -5
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
#
More examples:
# perf script -F ip,sym,insn --xed | grep -v native_write_msr | head
ffffffffa416b90e tick_check_broadcast_expired btq %rax, 0x1a5f42a(%rip)
ffffffffa4956bd0 nmi_cpu_backtrace pushq %r13
ffffffffa415b95e __hrtimer_next_event_base movq 0x18(%rax), %rdx
ffffffffa4956bf3 nmi_cpu_backtrace popq %r12
ffffffffa4171d5c smp_call_function_single pause
ffffffffa4956bdd nmi_cpu_backtrace mov %ebp, %r12d
ffffffffa4797e4d menu_select cmp $0x190, %rax
ffffffffa4171d5c smp_call_function_single pause
ffffffffa405a7d8 nmi_cpu_backtrace_handler callq 0xffffffffa4956bd0
ffffffffa4797f7a menu_select shr $0x3, %rax
#
Which matches the annotate output modulo resolving callqs:
# perf annotate --stdio2 nmi_cpu_backtrace_handler
Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 35908, [percent: local period]
nmi_cpu_backtrace_handler() /lib/modules/5.0.0+/build/vmlinux
Percent
Disassembly of section .text:
ffffffff8105a7d0 <nmi_cpu_backtrace_handler>:
nmi_cpu_backtrace_handler():
nmi_trigger_cpumask_backtrace(mask, exclude_self,
nmi_raise_cpu_backtrace);
}
static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
{
24.45 → callq __fentry__
if (nmi_cpu_backtrace(regs))
mov %rsi,%rdi
75.55 → callq nmi_cpu_backtrace
return NMI_HANDLED;
movzbl %al,%eax
return NMI_DONE;
}
← retq
#
# perf annotate --stdio2 __hrtimer_next_event_base
Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 767977, [percent: local period]
__hrtimer_next_event_base() /lib/modules/5.0.0+/build/vmlinux
Percent
Disassembly of section .text:
ffffffff8115b910 <__hrtimer_next_event_base>:
__hrtimer_next_event_base():
static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base,
const struct hrtimer *exclude,
unsigned int active,
ktime_t expires_next)
{
→ callq __fentry__
<SNIP>
4a: add $0x1,%r14
77.31 mov 0x18(%rax),%rdx
shl $0x6,%r14
sub 0x38(%rbx,%r14,1),%rdx
if (expires < expires_next) {
cmp %r12,%rdx
↓ jge 68
<SNIP>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-3-andi@firstfloor.org
[ Converted fetch_exe() to use the name it ended up having when merged: thread__memcpy() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index 7aab0be5fc5f..7b8e69bbbdfe 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -6,6 +6,7 @@ perf-y += perf_regs.o
perf-y += group.o
perf-y += machine.o
perf-y += event.o
+perf-y += archinsn.o
perf-$(CONFIG_DWARF) += dwarf-regs.o
perf-$(CONFIG_BPF_PROLOGUE) += dwarf-regs.o
diff --git a/tools/perf/arch/x86/util/archinsn.c b/tools/perf/arch/x86/util/archinsn.c
new file mode 100644
index 000000000000..4237bb2e7fa2
--- /dev/null
+++ b/tools/perf/arch/x86/util/archinsn.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "perf.h"
+#include "archinsn.h"
+#include "util/intel-pt-decoder/insn.h"
+#include "machine.h"
+#include "thread.h"
+#include "symbol.h"
+
+void arch_fetch_insn(struct perf_sample *sample,
+ struct thread *thread,
+ struct machine *machine)
+{
+ struct insn insn;
+ int len;
+ bool is64bit = false;
+
+ if (!sample->ip)
+ return;
+ len = thread__memcpy(thread, machine, sample->insn, sample->ip, sizeof(sample->insn), &is64bit);
+ if (len <= 0)
+ return;
+ insn_init(&insn, sample->insn, len, is64bit);
+ insn_get_length(&insn);
+ if (insn_complete(&insn) && insn.length <= len)
+ sample->insn_len = insn.length;
+}
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 53f78cf3113f..a5080afd361d 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -29,10 +29,12 @@
#include "util/time-utils.h"
#include "util/path.h"
#include "print_binary.h"
+#include "archinsn.h"
#include <linux/bitmap.h>
#include <linux/kernel.h>
#include <linux/stringify.h>
#include <linux/time64.h>
+#include <sys/utsname.h>
#include "asm/bug.h"
#include "util/mem-events.h"
#include "util/dump-insn.h"
@@ -63,6 +65,7 @@ static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
static struct perf_stat_config stat_config;
static int max_blocks;
+static bool native_arch;
unsigned int scripting_max_stack = PERF_MAX_STACK_DEPTH;
@@ -1227,6 +1230,12 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
return len + dlen;
}
+__weak void arch_fetch_insn(struct perf_sample *sample __maybe_unused,
+ struct thread *thread __maybe_unused,
+ struct machine *machine __maybe_unused)
+{
+}
+
static int perf_sample__fprintf_insn(struct perf_sample *sample,
struct perf_event_attr *attr,
struct thread *thread,
@@ -1234,9 +1243,12 @@ static int perf_sample__fprintf_insn(struct perf_sample *sample,
{
int printed = 0;
+ if (sample->insn_len == 0 && native_arch)
+ arch_fetch_insn(sample, thread, machine);
+
if (PRINT_FIELD(INSNLEN))
printed += fprintf(fp, " ilen: %d", sample->insn_len);
- if (PRINT_FIELD(INSN)) {
+ if (PRINT_FIELD(INSN) && sample->insn_len) {
int i;
printed += fprintf(fp, " insn:");
@@ -3277,6 +3289,7 @@ int cmd_script(int argc, const char **argv)
.set = false,
.default_no_sample = true,
};
+ struct utsname uts;
char *script_path = NULL;
const char **__argv;
int i, j, err = 0;
@@ -3615,6 +3628,12 @@ int cmd_script(int argc, const char **argv)
if (symbol__init(&session->header.env) < 0)
goto out_delete;
+ uname(&uts);
+ if (!strcmp(uts.machine, session->header.env.arch) ||
+ (!strcmp(uts.machine, "x86_64") &&
+ !strcmp(session->header.env.arch, "i386")))
+ native_arch = true;
+
script.session = session;
script__setup_sample_type(&script);
diff --git a/tools/perf/util/archinsn.h b/tools/perf/util/archinsn.h
new file mode 100644
index 000000000000..448cbb6b8d7e
--- /dev/null
+++ b/tools/perf/util/archinsn.h
@@ -0,0 +1,12 @@
+#ifndef INSN_H
+#define INSN_H 1
+
+struct perf_sample;
+struct machine;
+struct thread;
+
+void arch_fetch_insn(struct perf_sample *sample,
+ struct thread *thread,
+ struct machine *machine);
+
+#endif
* [tip:perf/urgent] perf script: Support insn output for normal samples
2019-03-05 14:47 ` [PATCH v4 02/15] perf tools script: Support insn output for normal samples Andi Kleen
2019-03-08 13:37 ` Arnaldo Carvalho de Melo
@ 2019-03-22 21:59 ` tip-bot for Andi Kleen
1 sibling, 0 replies; 49+ messages in thread
From: tip-bot for Andi Kleen @ 2019-03-22 21:59 UTC (permalink / raw)
To: linux-tip-commits
Cc: hpa, jolsa, namhyung, ak, tglx, linux-kernel, acme, mingo
Commit-ID: 3ab481a1cfe1511b94e142b648e2c5ade9175ed3
Gitweb: https://git.kernel.org/tip/3ab481a1cfe1511b94e142b648e2c5ade9175ed3
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Tue, 5 Mar 2019 06:47:45 -0800
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Mar 2019 11:56:02 -0300
perf script: Support insn output for normal samples
perf script -F +insn was only working for PT traces because the PT
instruction decoder was filling in the insn/insn_len sample attributes.
Support it for non-PT samples too on x86, using the existing x86
instruction decoder.
This adds some extra checking to ensure that we don't try to decode
instructions when using perf.data from a different architecture.
% perf record -a sleep 1
% perf script -F ip,sym,insn --xed
ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
ffffffff8100bb50 intel_bts_enable_local retq
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff810f1f79 generic_exec_single xor %eax, %eax
ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
ffffffff8100bb34 intel_bts_enable_local movl 0x2000(%rax), %edx
ffffffff81048610 native_apic_mem_write mov %edi, %edi
...
Committer testing:
Before:
# perf script -F ip,sym,insn --xed | head -5
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068806 native_write_msr addb %al, (%rax)
ffffffffa4068806 native_write_msr addb %al, (%rax)
# perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)"
#
After:
# perf script -F ip,sym,insn --xed | head -5
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
# perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)" | head -5
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
#
More examples:
# perf script -F ip,sym,insn --xed | grep -v native_write_msr | head
ffffffffa416b90e tick_check_broadcast_expired btq %rax, 0x1a5f42a(%rip)
ffffffffa4956bd0 nmi_cpu_backtrace pushq %r13
ffffffffa415b95e __hrtimer_next_event_base movq 0x18(%rax), %rdx
ffffffffa4956bf3 nmi_cpu_backtrace popq %r12
ffffffffa4171d5c smp_call_function_single pause
ffffffffa4956bdd nmi_cpu_backtrace mov %ebp, %r12d
ffffffffa4797e4d menu_select cmp $0x190, %rax
ffffffffa4171d5c smp_call_function_single pause
ffffffffa405a7d8 nmi_cpu_backtrace_handler callq 0xffffffffa4956bd0
ffffffffa4797f7a menu_select shr $0x3, %rax
#
Which matches the annotate output modulo resolving callqs:
# perf annotate --stdio2 nmi_cpu_backtrace_handler
Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 35908, [percent: local period]
nmi_cpu_backtrace_handler() /lib/modules/5.0.0+/build/vmlinux
Percent
Disassembly of section .text:
ffffffff8105a7d0 <nmi_cpu_backtrace_handler>:
nmi_cpu_backtrace_handler():
nmi_trigger_cpumask_backtrace(mask, exclude_self,
nmi_raise_cpu_backtrace);
}
static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
{
24.45 → callq __fentry__
if (nmi_cpu_backtrace(regs))
mov %rsi,%rdi
75.55 → callq nmi_cpu_backtrace
return NMI_HANDLED;
movzbl %al,%eax
return NMI_DONE;
}
← retq
#
# perf annotate --stdio2 __hrtimer_next_event_base
Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 767977, [percent: local period]
__hrtimer_next_event_base() /lib/modules/5.0.0+/build/vmlinux
Percent
Disassembly of section .text:
ffffffff8115b910 <__hrtimer_next_event_base>:
__hrtimer_next_event_base():
static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base,
const struct hrtimer *exclude,
unsigned int active,
ktime_t expires_next)
{
→ callq __fentry__
<SNIP>
4a: add $0x1,%r14
77.31 mov 0x18(%rax),%rdx
shl $0x6,%r14
sub 0x38(%rbx,%r14,1),%rdx
if (expires < expires_next) {
cmp %r12,%rdx
↓ jge 68
<SNIP>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-3-andi@firstfloor.org
[ Converted fetch_exe() to use the name it ended up having when merged: thread__memcpy() ]
[ archinsn.c needs the instruction decoder that is only built when CONFIG_AUXTRACE=y, fix that ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/arch/x86/util/Build | 1 +
tools/perf/arch/x86/util/archinsn.c | 26 ++++++++++++++++++++++++++
tools/perf/builtin-script.c | 21 ++++++++++++++++++++-
tools/perf/util/archinsn.h | 12 ++++++++++++
4 files changed, 59 insertions(+), 1 deletion(-)
diff --git a/tools/perf/arch/x86/util/Build b/tools/perf/arch/x86/util/Build
index 7aab0be5fc5f..47f9c56e744f 100644
--- a/tools/perf/arch/x86/util/Build
+++ b/tools/perf/arch/x86/util/Build
@@ -14,5 +14,6 @@ perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
perf-$(CONFIG_AUXTRACE) += auxtrace.o
+perf-$(CONFIG_AUXTRACE) += archinsn.o
perf-$(CONFIG_AUXTRACE) += intel-pt.o
perf-$(CONFIG_AUXTRACE) += intel-bts.o
diff --git a/tools/perf/arch/x86/util/archinsn.c b/tools/perf/arch/x86/util/archinsn.c
new file mode 100644
index 000000000000..4237bb2e7fa2
--- /dev/null
+++ b/tools/perf/arch/x86/util/archinsn.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "perf.h"
+#include "archinsn.h"
+#include "util/intel-pt-decoder/insn.h"
+#include "machine.h"
+#include "thread.h"
+#include "symbol.h"
+
+void arch_fetch_insn(struct perf_sample *sample,
+ struct thread *thread,
+ struct machine *machine)
+{
+ struct insn insn;
+ int len;
+ bool is64bit = false;
+
+ if (!sample->ip)
+ return;
+ len = thread__memcpy(thread, machine, sample->insn, sample->ip, sizeof(sample->insn), &is64bit);
+ if (len <= 0)
+ return;
+ insn_init(&insn, sample->insn, len, is64bit);
+ insn_get_length(&insn);
+ if (insn_complete(&insn) && insn.length <= len)
+ sample->insn_len = insn.length;
+}
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 53f78cf3113f..a5080afd361d 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -29,10 +29,12 @@
#include "util/time-utils.h"
#include "util/path.h"
#include "print_binary.h"
+#include "archinsn.h"
#include <linux/bitmap.h>
#include <linux/kernel.h>
#include <linux/stringify.h>
#include <linux/time64.h>
+#include <sys/utsname.h>
#include "asm/bug.h"
#include "util/mem-events.h"
#include "util/dump-insn.h"
@@ -63,6 +65,7 @@ static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
static struct perf_stat_config stat_config;
static int max_blocks;
+static bool native_arch;
unsigned int scripting_max_stack = PERF_MAX_STACK_DEPTH;
@@ -1227,6 +1230,12 @@ static int perf_sample__fprintf_callindent(struct perf_sample *sample,
return len + dlen;
}
+__weak void arch_fetch_insn(struct perf_sample *sample __maybe_unused,
+ struct thread *thread __maybe_unused,
+ struct machine *machine __maybe_unused)
+{
+}
+
static int perf_sample__fprintf_insn(struct perf_sample *sample,
struct perf_event_attr *attr,
struct thread *thread,
@@ -1234,9 +1243,12 @@ static int perf_sample__fprintf_insn(struct perf_sample *sample,
{
int printed = 0;
+ if (sample->insn_len == 0 && native_arch)
+ arch_fetch_insn(sample, thread, machine);
+
if (PRINT_FIELD(INSNLEN))
printed += fprintf(fp, " ilen: %d", sample->insn_len);
- if (PRINT_FIELD(INSN)) {
+ if (PRINT_FIELD(INSN) && sample->insn_len) {
int i;
printed += fprintf(fp, " insn:");
@@ -3277,6 +3289,7 @@ int cmd_script(int argc, const char **argv)
.set = false,
.default_no_sample = true,
};
+ struct utsname uts;
char *script_path = NULL;
const char **__argv;
int i, j, err = 0;
@@ -3615,6 +3628,12 @@ int cmd_script(int argc, const char **argv)
if (symbol__init(&session->header.env) < 0)
goto out_delete;
+ uname(&uts);
+ if (!strcmp(uts.machine, session->header.env.arch) ||
+ (!strcmp(uts.machine, "x86_64") &&
+ !strcmp(session->header.env.arch, "i386")))
+ native_arch = true;
+
script.session = session;
script__setup_sample_type(&script);
diff --git a/tools/perf/util/archinsn.h b/tools/perf/util/archinsn.h
new file mode 100644
index 000000000000..448cbb6b8d7e
--- /dev/null
+++ b/tools/perf/util/archinsn.h
@@ -0,0 +1,12 @@
+#ifndef INSN_H
+#define INSN_H 1
+
+struct perf_sample;
+struct machine;
+struct thread;
+
+void arch_fetch_insn(struct perf_sample *sample,
+ struct thread *thread,
+ struct machine *machine);
+
+#endif
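
The architecture check that the patch above adds to cmd_script() can be sketched in isolation as follows. This is only an illustration: arch_is_native() is a hypothetical helper name that does not appear in the patch, and the two strings stand in for uts.machine and session->header.env.arch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): returns true when code
 * recorded on recorded_arch may be decoded on host_arch. It mirrors the
 * strcmp() test that the patch adds to cmd_script().
 */
bool arch_is_native(const char *host_arch, const char *recorded_arch)
{
	assert(host_arch && recorded_arch);

	/* Same architecture string: decoding is safe. */
	if (!strcmp(host_arch, recorded_arch))
		return true;

	/* An x86_64 host is also allowed to decode i386 recordings. */
	return !strcmp(host_arch, "x86_64") && !strcmp(recorded_arch, "i386");
}
```

In the patch the host string comes from uname() and the result is stored in the static native_arch flag, which then gates the call to arch_fetch_insn().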
* [PATCH v4 03/15] perf tools script: Filter COMM/FORK/.. events by CPU
2019-03-05 14:47 Support sample context in perf report Andi Kleen
2019-03-05 14:47 ` [PATCH v4 01/15] perf tools: Add utility function to fetch executable Andi Kleen
2019-03-05 14:47 ` [PATCH v4 02/15] perf tools script: Support insn output for normal samples Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-06 14:23 ` Adrian Hunter
2019-03-05 14:47 ` [PATCH v4 04/15] perf tools report: Support nano seconds Andi Kleen
` (12 subsequent siblings)
15 siblings, 1 reply; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
The --cpu option only filtered samples. Filter other perf events,
such as COMM, FORK, SWITCH by the CPU too.
Reported-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/builtin-script.c | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index fbc440bdf880..3813f60d1dc0 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2038,6 +2038,9 @@ static int process_comm_event(struct perf_tool *tool,
struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
int ret = -1;
+ if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
+ return 0;
+
thread = machine__findnew_thread(machine, event->comm.pid, event->comm.tid);
if (thread == NULL) {
pr_debug("problem processing COMM event, skipping it.\n");
@@ -2073,6 +2076,9 @@ static int process_namespaces_event(struct perf_tool *tool,
struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
int ret = -1;
+ if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
+ return 0;
+
thread = machine__findnew_thread(machine, event->namespaces.pid,
event->namespaces.tid);
if (thread == NULL) {
@@ -2108,6 +2114,9 @@ static int process_fork_event(struct perf_tool *tool,
struct perf_session *session = script->session;
struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
+ if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
+ return 0;
+
if (perf_event__process_fork(tool, event, sample, machine) < 0)
return -1;
@@ -2141,6 +2150,9 @@ static int process_exit_event(struct perf_tool *tool,
struct perf_session *session = script->session;
struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
+ if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
+ return 0;
+
thread = machine__findnew_thread(machine, event->fork.pid, event->fork.tid);
if (thread == NULL) {
pr_debug("problem processing EXIT event, skipping it.\n");
@@ -2174,6 +2186,9 @@ static int process_mmap_event(struct perf_tool *tool,
struct perf_session *session = script->session;
struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
+ if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
+ return 0;
+
if (perf_event__process_mmap(tool, event, sample, machine) < 0)
return -1;
@@ -2206,6 +2221,9 @@ static int process_mmap2_event(struct perf_tool *tool,
struct perf_session *session = script->session;
struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
+ if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
+ return 0;
+
if (perf_event__process_mmap2(tool, event, sample, machine) < 0)
return -1;
@@ -2238,6 +2256,9 @@ static int process_switch_event(struct perf_tool *tool,
struct perf_session *session = script->session;
struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
+ if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
+ return 0;
+
if (perf_event__process_switch(tool, event, sample, machine) < 0)
return -1;
@@ -2266,6 +2287,9 @@ process_lost_event(struct perf_tool *tool,
struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
struct thread *thread;
+ if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
+ return 0;
+
thread = machine__findnew_thread(machine, sample->pid,
sample->tid);
if (thread == NULL)
--
2.20.1
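
The check repeated in each handler above (skip the event when a CPU list was given and the sample's CPU is not in it) can be sketched as a small standalone bitmap filter. The names struct cpu_filter, cpu_filter_add() and cpu_filter_skips() and the MAX_CPUS limit are assumptions made for this sketch; perf itself uses cpu_list, cpu_bitmap and test_bit().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_CPUS 1024 /* hypothetical limit, for illustration only */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define CPU_WORDS ((MAX_CPUS + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct cpu_filter {
	bool active;                   /* true once a CPU list was given */
	unsigned long bits[CPU_WORDS]; /* one bit per selected CPU */
};

/* Select one CPU; giving any CPU turns the filter on. */
void cpu_filter_add(struct cpu_filter *f, unsigned int cpu)
{
	assert(cpu < MAX_CPUS);
	f->bits[cpu / BITS_PER_WORD] |= 1UL << (cpu % BITS_PER_WORD);
	f->active = true;
}

/* True when an event from this CPU should be skipped. */
bool cpu_filter_skips(const struct cpu_filter *f, unsigned int cpu)
{
	if (!f->active)
		return false; /* no list given: keep everything */
	if (cpu >= MAX_CPUS)
		return true;  /* out of range: cannot be selected */
	return !(f->bits[cpu / BITS_PER_WORD] & (1UL << (cpu % BITS_PER_WORD)));
}
```

A handler would call cpu_filter_skips() first and return 0 when it reports true, which is the shape of the early return added to each process_*_event() function.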
* Re: [PATCH v4 03/15] perf tools script: Filter COMM/FORK/.. events by CPU
2019-03-05 14:47 ` [PATCH v4 03/15] perf tools script: Filter COMM/FORK/.. events by CPU Andi Kleen
@ 2019-03-06 14:23 ` Adrian Hunter
2019-03-07 11:02 ` Jiri Olsa
0 siblings, 1 reply; 49+ messages in thread
From: Adrian Hunter @ 2019-03-06 14:23 UTC (permalink / raw)
To: Andi Kleen, acme
Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
On 5/03/19 4:47 PM, Andi Kleen wrote:
> From: Andi Kleen <ak@linux.intel.com>
>
> The --cpu option only filtered samples. Filter other perf events,
> such as COMM, FORK, SWITCH by the CPU too.
Because tasks can migrate from cpu to cpu, we probably need to process most
of the events anyway, even if they are not printed.
>
> Reported-by: Jiri Olsa <jolsa@kernel.org>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> ---
> tools/perf/builtin-script.c | 24 ++++++++++++++++++++++++
> 1 file changed, 24 insertions(+)
>
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index fbc440bdf880..3813f60d1dc0 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -2038,6 +2038,9 @@ static int process_comm_event(struct perf_tool *tool,
> struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> int ret = -1;
>
> + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> + return 0;
> +
> thread = machine__findnew_thread(machine, event->comm.pid, event->comm.tid);
> if (thread == NULL) {
> pr_debug("problem processing COMM event, skipping it.\n");
> @@ -2073,6 +2076,9 @@ static int process_namespaces_event(struct perf_tool *tool,
> struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> int ret = -1;
>
> + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> + return 0;
> +
> thread = machine__findnew_thread(machine, event->namespaces.pid,
> event->namespaces.tid);
> if (thread == NULL) {
> @@ -2108,6 +2114,9 @@ static int process_fork_event(struct perf_tool *tool,
> struct perf_session *session = script->session;
> struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
>
> + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> + return 0;
> +
> if (perf_event__process_fork(tool, event, sample, machine) < 0)
> return -1;
>
> @@ -2141,6 +2150,9 @@ static int process_exit_event(struct perf_tool *tool,
> struct perf_session *session = script->session;
> struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
>
> + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> + return 0;
> +
> thread = machine__findnew_thread(machine, event->fork.pid, event->fork.tid);
> if (thread == NULL) {
> pr_debug("problem processing EXIT event, skipping it.\n");
> @@ -2174,6 +2186,9 @@ static int process_mmap_event(struct perf_tool *tool,
> struct perf_session *session = script->session;
> struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
>
> + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> + return 0;
> +
> if (perf_event__process_mmap(tool, event, sample, machine) < 0)
> return -1;
>
> @@ -2206,6 +2221,9 @@ static int process_mmap2_event(struct perf_tool *tool,
> struct perf_session *session = script->session;
> struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
>
> + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> + return 0;
> +
> if (perf_event__process_mmap2(tool, event, sample, machine) < 0)
> return -1;
>
> @@ -2238,6 +2256,9 @@ static int process_switch_event(struct perf_tool *tool,
> struct perf_session *session = script->session;
> struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
>
> + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> + return 0;
> +
> if (perf_event__process_switch(tool, event, sample, machine) < 0)
> return -1;
>
> @@ -2266,6 +2287,9 @@ process_lost_event(struct perf_tool *tool,
> struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> struct thread *thread;
>
> + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> + return 0;
> +
> thread = machine__findnew_thread(machine, sample->pid,
> sample->tid);
> if (thread == NULL)
>
* Re: [PATCH v4 03/15] perf tools script: Filter COMM/FORK/.. events by CPU
2019-03-06 14:23 ` Adrian Hunter
@ 2019-03-07 11:02 ` Jiri Olsa
2019-03-08 13:39 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 49+ messages in thread
From: Jiri Olsa @ 2019-03-07 11:02 UTC (permalink / raw)
To: Adrian Hunter
Cc: Andi Kleen, acme, jolsa, namhyung, linux-kernel,
linux-perf-users, Andi Kleen
On Wed, Mar 06, 2019 at 04:23:40PM +0200, Adrian Hunter wrote:
> On 5/03/19 4:47 PM, Andi Kleen wrote:
> > From: Andi Kleen <ak@linux.intel.com>
> >
> > The --cpu option only filtered samples. Filter other perf events,
> > such as COMM, FORK, SWITCH by the CPU too.
>
> Because tasks can migrate from cpu to cpu, we probably need to process most
> of the events anyway, even if they are not printed.
agreed, I wonder if we could just make the perf_event__fprintf conditional
jirka
>
> >
> > Reported-by: Jiri Olsa <jolsa@kernel.org>
> > Signed-off-by: Andi Kleen <ak@linux.intel.com>
> > ---
> > tools/perf/builtin-script.c | 24 ++++++++++++++++++++++++
> > 1 file changed, 24 insertions(+)
> >
> > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> > index fbc440bdf880..3813f60d1dc0 100644
> > --- a/tools/perf/builtin-script.c
> > +++ b/tools/perf/builtin-script.c
> > @@ -2038,6 +2038,9 @@ static int process_comm_event(struct perf_tool *tool,
> > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> > int ret = -1;
> >
> > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > + return 0;
> > +
> > thread = machine__findnew_thread(machine, event->comm.pid, event->comm.tid);
> > if (thread == NULL) {
> > pr_debug("problem processing COMM event, skipping it.\n");
> > @@ -2073,6 +2076,9 @@ static int process_namespaces_event(struct perf_tool *tool,
> > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> > int ret = -1;
> >
> > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > + return 0;
> > +
> > thread = machine__findnew_thread(machine, event->namespaces.pid,
> > event->namespaces.tid);
> > if (thread == NULL) {
> > @@ -2108,6 +2114,9 @@ static int process_fork_event(struct perf_tool *tool,
> > struct perf_session *session = script->session;
> > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> >
> > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > + return 0;
> > +
> > if (perf_event__process_fork(tool, event, sample, machine) < 0)
> > return -1;
> >
> > @@ -2141,6 +2150,9 @@ static int process_exit_event(struct perf_tool *tool,
> > struct perf_session *session = script->session;
> > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> >
> > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > + return 0;
> > +
> > thread = machine__findnew_thread(machine, event->fork.pid, event->fork.tid);
> > if (thread == NULL) {
> > pr_debug("problem processing EXIT event, skipping it.\n");
> > @@ -2174,6 +2186,9 @@ static int process_mmap_event(struct perf_tool *tool,
> > struct perf_session *session = script->session;
> > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> >
> > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > + return 0;
> > +
> > if (perf_event__process_mmap(tool, event, sample, machine) < 0)
> > return -1;
> >
> > @@ -2206,6 +2221,9 @@ static int process_mmap2_event(struct perf_tool *tool,
> > struct perf_session *session = script->session;
> > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> >
> > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > + return 0;
> > +
> > if (perf_event__process_mmap2(tool, event, sample, machine) < 0)
> > return -1;
> >
> > @@ -2238,6 +2256,9 @@ static int process_switch_event(struct perf_tool *tool,
> > struct perf_session *session = script->session;
> > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> >
> > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > + return 0;
> > +
> > if (perf_event__process_switch(tool, event, sample, machine) < 0)
> > return -1;
> >
> > @@ -2266,6 +2287,9 @@ process_lost_event(struct perf_tool *tool,
> > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> > struct thread *thread;
> >
> > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > + return 0;
> > +
> > thread = machine__findnew_thread(machine, sample->pid,
> > sample->tid);
> > if (thread == NULL)
> >
>
* Re: [PATCH v4 03/15] perf tools script: Filter COMM/FORK/.. events by CPU
2019-03-07 11:02 ` Jiri Olsa
@ 2019-03-08 13:39 ` Arnaldo Carvalho de Melo
2019-03-08 15:08 ` Andi Kleen
0 siblings, 1 reply; 49+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-03-08 13:39 UTC (permalink / raw)
To: Jiri Olsa
Cc: Adrian Hunter, Andi Kleen, jolsa, namhyung, linux-kernel,
linux-perf-users, Andi Kleen
Em Thu, Mar 07, 2019 at 12:02:31PM +0100, Jiri Olsa escreveu:
> On Wed, Mar 06, 2019 at 04:23:40PM +0200, Adrian Hunter wrote:
> > On 5/03/19 4:47 PM, Andi Kleen wrote:
> > > From: Andi Kleen <ak@linux.intel.com>
> > >
> > > The --cpu option only filtered samples. Filter other perf events,
> > > such as COMM, FORK, SWITCH by the CPU too.
> >
> > Because tasks can migrate from cpu to cpu, we probably need to process most
> > of the events anyway, even if they are not printed.
>
> agreed, I wonder if we could just make the perf_event__fprintf conditional
Humm, probably just doing the filtering on PERF_RECORD_SAMPLE is enough?
I.e. having the other PERF_RECORD_{COMM,MMAP,} etc. is required in the face
of migration.
- Arnaldo
> jirka
>
> >
> > >
> > > Reported-by: Jiri Olsa <jolsa@kernel.org>
> > > Signed-off-by: Andi Kleen <ak@linux.intel.com>
> > > ---
> > > tools/perf/builtin-script.c | 24 ++++++++++++++++++++++++
> > > 1 file changed, 24 insertions(+)
> > >
> > > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> > > index fbc440bdf880..3813f60d1dc0 100644
> > > --- a/tools/perf/builtin-script.c
> > > +++ b/tools/perf/builtin-script.c
> > > @@ -2038,6 +2038,9 @@ static int process_comm_event(struct perf_tool *tool,
> > > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> > > int ret = -1;
> > >
> > > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > > + return 0;
> > > +
> > > thread = machine__findnew_thread(machine, event->comm.pid, event->comm.tid);
> > > if (thread == NULL) {
> > > pr_debug("problem processing COMM event, skipping it.\n");
> > > @@ -2073,6 +2076,9 @@ static int process_namespaces_event(struct perf_tool *tool,
> > > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> > > int ret = -1;
> > >
> > > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > > + return 0;
> > > +
> > > thread = machine__findnew_thread(machine, event->namespaces.pid,
> > > event->namespaces.tid);
> > > if (thread == NULL) {
> > > @@ -2108,6 +2114,9 @@ static int process_fork_event(struct perf_tool *tool,
> > > struct perf_session *session = script->session;
> > > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> > >
> > > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > > + return 0;
> > > +
> > > if (perf_event__process_fork(tool, event, sample, machine) < 0)
> > > return -1;
> > >
> > > @@ -2141,6 +2150,9 @@ static int process_exit_event(struct perf_tool *tool,
> > > struct perf_session *session = script->session;
> > > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> > >
> > > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > > + return 0;
> > > +
> > > thread = machine__findnew_thread(machine, event->fork.pid, event->fork.tid);
> > > if (thread == NULL) {
> > > pr_debug("problem processing EXIT event, skipping it.\n");
> > > @@ -2174,6 +2186,9 @@ static int process_mmap_event(struct perf_tool *tool,
> > > struct perf_session *session = script->session;
> > > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> > >
> > > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > > + return 0;
> > > +
> > > if (perf_event__process_mmap(tool, event, sample, machine) < 0)
> > > return -1;
> > >
> > > @@ -2206,6 +2221,9 @@ static int process_mmap2_event(struct perf_tool *tool,
> > > struct perf_session *session = script->session;
> > > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> > >
> > > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > > + return 0;
> > > +
> > > if (perf_event__process_mmap2(tool, event, sample, machine) < 0)
> > > return -1;
> > >
> > > @@ -2238,6 +2256,9 @@ static int process_switch_event(struct perf_tool *tool,
> > > struct perf_session *session = script->session;
> > > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> > >
> > > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > > + return 0;
> > > +
> > > if (perf_event__process_switch(tool, event, sample, machine) < 0)
> > > return -1;
> > >
> > > @@ -2266,6 +2287,9 @@ process_lost_event(struct perf_tool *tool,
> > > struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
> > > struct thread *thread;
> > >
> > > + if (cpu_list && !test_bit(sample->cpu, cpu_bitmap))
> > > + return 0;
> > > +
> > > thread = machine__findnew_thread(machine, sample->pid,
> > > sample->tid);
> > > if (thread == NULL)
> > >
> >
--
- Arnaldo
* Re: [PATCH v4 03/15] perf tools script: Filter COMM/FORK/.. events by CPU
2019-03-08 13:39 ` Arnaldo Carvalho de Melo
@ 2019-03-08 15:08 ` Andi Kleen
0 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-08 15:08 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Jiri Olsa, Adrian Hunter, Andi Kleen, jolsa, namhyung,
linux-kernel, linux-perf-users
On Fri, Mar 08, 2019 at 10:39:01AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Mar 07, 2019 at 12:02:31PM +0100, Jiri Olsa escreveu:
> > On Wed, Mar 06, 2019 at 04:23:40PM +0200, Adrian Hunter wrote:
> > > On 5/03/19 4:47 PM, Andi Kleen wrote:
> > > > From: Andi Kleen <ak@linux.intel.com>
> > > >
> > > > The --cpu option only filtered samples. Filter other perf events,
> > > > such as COMM, FORK, SWITCH by the CPU too.
> > >
> > > Because tasks can migrate from cpu to cpu, we probably need to process most
> > > of the events anyway, even if they are not printed.
> >
> > agreed, I wonder if we could just make the perf_event__fprintf conditional
>
> Humm, probably just doing the filtering on PERF_RECORD_SAMPLE is enough?
> I.e. having the other PERF_RECORD_{COMM,MMAP,} etc. is required in the face
> of migration.
The goal was to only show the output for the correct CPU in the perf
sample context browser. Otherwise the output on larger systems
is very confusing because most of it is for irrelevant CPUs.
-andi
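
The alternative raised in this subthread, always processing side-band events and making only the printing conditional on the CPU filter, could look roughly like the sketch below. Everything in it is hypothetical (struct tracker, handle_comm(), comm_for_sample(), the tiny thread table); it only illustrates why a COMM event seen on a filtered-out CPU still has to update thread state when the task may later run on a selected CPU.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

#define MAX_TIDS 8 /* hypothetical, tiny table for illustration */

struct thread_state {
	char comm[16];
};

struct tracker {
	unsigned long cpu_mask;                /* bit N set: show CPU N */
	struct thread_state threads[MAX_TIDS]; /* state kept for all CPUs */
	int printed;                           /* number of lines emitted */
};

static bool cpu_selected(const struct tracker *t, unsigned int cpu)
{
	return cpu < sizeof(t->cpu_mask) * CHAR_BIT &&
	       (t->cpu_mask & (1UL << cpu));
}

/* Always update thread state; only the output depends on the filter. */
void handle_comm(struct tracker *t, unsigned int cpu, unsigned int tid,
		 const char *comm)
{
	assert(tid < MAX_TIDS);
	snprintf(t->threads[tid].comm, sizeof(t->threads[tid].comm), "%s", comm);

	if (cpu_selected(t, cpu)) {
		printf("COMM cpu=%u tid=%u %s\n", cpu, tid, comm);
		t->printed++;
	}
}

/* Name to show for a later sample of this thread, on whatever CPU. */
const char *comm_for_sample(const struct tracker *t, unsigned int tid)
{
	assert(tid < MAX_TIDS);
	return t->threads[tid].comm;
}
```

With only CPU 0 selected, a COMM event arriving on CPU 1 prints nothing, yet a later sample of the same thread on CPU 0 still resolves to the new name.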
* [PATCH v4 04/15] perf tools report: Support nano seconds
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (2 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 03/15] perf tools script: Filter COMM/FORK/.. events by CPU Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-22 21:59 ` [tip:perf/urgent] perf report: Support output in nanoseconds tip-bot for Andi Kleen
2019-03-05 14:47 ` [PATCH v4 05/15] perf tools report: Parse time quantum Andi Kleen
` (11 subsequent siblings)
15 siblings, 1 reply; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Upcoming changes add timestamp output in perf report. Add a --ns
argument similar to perf script to support nanosecond resolution
when needed.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
v2:
Move flag into symbol_conf and change all users
---
tools/perf/Documentation/perf-report.txt | 3 +++
tools/perf/builtin-report.c | 1 +
tools/perf/builtin-script.c | 11 +++++------
tools/perf/util/symbol.c | 1 +
tools/perf/util/symbol_conf.h | 1 +
5 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 1a27bfe05039..51dbc519dbce 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -477,6 +477,9 @@ include::itrace.txt[]
Please note that not all mmaps are stored, options affecting which ones
are include 'perf record --data', for instance.
+--ns::
+ Show time stamps in nanoseconds.
+
--stats::
Display overall events statistics without any further processing.
(like the one at the end of the perf report -D command)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 1532ebde6c4b..09180e559ad6 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1147,6 +1147,7 @@ int cmd_report(int argc, const char **argv)
OPT_CALLBACK(0, "percent-type", &report.annotation_opts, "local-period",
"Set percent type local/global-period/hits",
annotate_parse_percent_type),
+ OPT_BOOLEAN(0, "ns", &symbol_conf.nanosecs, "Show times in nanosecs"),
OPT_END()
};
struct perf_data data = {
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 3813f60d1dc0..d5e819b68970 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -60,7 +60,6 @@ static bool no_callchain;
static bool latency_format;
static bool system_wide;
static bool print_flags;
-static bool nanosecs;
static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
static struct perf_stat_config stat_config;
@@ -691,7 +690,7 @@ static int perf_sample__fprintf_start(struct perf_sample *sample,
secs = nsecs / NSEC_PER_SEC;
nsecs -= secs * NSEC_PER_SEC;
- if (nanosecs)
+ if (symbol_conf.nanosecs)
printed += fprintf(fp, "%5lu.%09llu: ", secs, nsecs);
else {
char sample_time[32];
@@ -3268,7 +3267,7 @@ static int parse_insn_trace(const struct option *opt __maybe_unused,
{
parse_output_fields(NULL, "+insn,-event,-period", 0);
itrace_parse_synth_opts(opt, "i0ns", 0);
- nanosecs = true;
+ symbol_conf.nanosecs = true;
return 0;
}
@@ -3286,7 +3285,7 @@ static int parse_call_trace(const struct option *opt __maybe_unused,
{
parse_output_fields(NULL, "-ip,-addr,-event,-period,+callindent", 0);
itrace_parse_synth_opts(opt, "cewp", 0);
- nanosecs = true;
+ symbol_conf.nanosecs = true;
return 0;
}
@@ -3296,7 +3295,7 @@ static int parse_callret_trace(const struct option *opt __maybe_unused,
{
parse_output_fields(NULL, "-ip,-addr,-event,-period,+callindent,+flags", 0);
itrace_parse_synth_opts(opt, "crewp", 0);
- nanosecs = true;
+ symbol_conf.nanosecs = true;
return 0;
}
@@ -3432,7 +3431,7 @@ int cmd_script(int argc, const char **argv)
OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
OPT_INTEGER(0, "max-blocks", &max_blocks,
"Maximum number of code blocks to dump with brstackinsn"),
- OPT_BOOLEAN(0, "ns", &nanosecs,
+ OPT_BOOLEAN(0, "ns", &symbol_conf.nanosecs,
"Use 9 decimal places when displaying time"),
OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
"Instruction Tracing options\n" ITRACE_HELP,
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 758bf5f74e6e..eb873ea1c405 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -39,6 +39,7 @@ int vmlinux_path__nr_entries;
char **vmlinux_path;
struct symbol_conf symbol_conf = {
+ .nanosecs = false,
.use_modules = true,
.try_vmlinux_path = true,
.demangle = true,
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index fffea68c1203..095a297c8b47 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -8,6 +8,7 @@ struct strlist;
struct intlist;
struct symbol_conf {
+ bool nanosecs;
unsigned short priv_size;
bool try_vmlinux_path,
init_annotation,
--
2.20.1
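
For readers following along: the flag moved into symbol_conf above only
selects how a nanosecond timestamp is printed. A minimal sketch of that
split, assuming a hypothetical helper format_sample_time() that is not
part of perf, and a plain microsecond fallback that only approximates
what perf prints without --ns:

```c
#include <stdbool.h>
#include <stdio.h>

#define NSEC_PER_SEC  1000000000ULL
#define NSEC_PER_USEC 1000ULL

/*
 * Hypothetical helper, not part of perf: split a nanosecond timestamp
 * into seconds and a sub-second remainder, then print either 9 decimal
 * places (the --ns case) or 6 (an assumed microsecond fallback).
 */
static int format_sample_time(char *buf, size_t size,
			      unsigned long long timestamp, bool nanosecs)
{
	unsigned long long secs = timestamp / NSEC_PER_SEC;
	unsigned long long nsecs = timestamp % NSEC_PER_SEC;

	if (nanosecs)
		return snprintf(buf, size, "%5llu.%09llu", secs, nsecs);

	return snprintf(buf, size, "%5llu.%06llu", secs,
			nsecs / NSEC_PER_USEC);
}
```

The seconds field is padded to at least five characters, as in the
"%5lu.%09llu" format used by perf script.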
* [tip:perf/urgent] perf report: Support output in nanoseconds
2019-03-05 14:47 ` [PATCH v4 04/15] perf tools report: Support nano seconds Andi Kleen
@ 2019-03-22 21:59 ` tip-bot for Andi Kleen
0 siblings, 0 replies; 49+ messages in thread
From: tip-bot for Andi Kleen @ 2019-03-22 21:59 UTC (permalink / raw)
To: linux-tip-commits
Cc: mingo, linux-kernel, ak, namhyung, jolsa, acme, hpa, tglx
Commit-ID: 52bab8868211b7c504146f6239e101421d4d125b
Gitweb: https://git.kernel.org/tip/52bab8868211b7c504146f6239e101421d4d125b
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Tue, 5 Mar 2019 06:47:47 -0800
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Mar 2019 11:56:02 -0300
perf report: Support output in nanoseconds
Upcoming changes add timestamp output in perf report. Add a --ns
argument similar to perf script to support nanoseconds resolution when
needed.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-5-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Documentation/perf-report.txt | 3 +++
tools/perf/builtin-report.c | 1 +
tools/perf/builtin-script.c | 11 +++++------
tools/perf/util/symbol.c | 1 +
tools/perf/util/symbol_conf.h | 1 +
5 files changed, 11 insertions(+), 6 deletions(-)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 1a27bfe05039..51dbc519dbce 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -477,6 +477,9 @@ include::itrace.txt[]
Please note that not all mmaps are stored, options affecting which ones
are include 'perf record --data', for instance.
+--ns::
+ Show time stamps in nanoseconds.
+
--stats::
Display overall events statistics without any further processing.
(like the one at the end of the perf report -D command)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ee93c18a6685..515864ba504a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1147,6 +1147,7 @@ int cmd_report(int argc, const char **argv)
OPT_CALLBACK(0, "percent-type", &report.annotation_opts, "local-period",
"Set percent type local/global-period/hits",
annotate_parse_percent_type),
+ OPT_BOOLEAN(0, "ns", &symbol_conf.nanosecs, "Show times in nanosecs"),
OPT_END()
};
struct perf_data data = {
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index a5080afd361d..111787e83784 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -60,7 +60,6 @@ static bool no_callchain;
static bool latency_format;
static bool system_wide;
static bool print_flags;
-static bool nanosecs;
static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
static struct perf_stat_config stat_config;
@@ -691,7 +690,7 @@ static int perf_sample__fprintf_start(struct perf_sample *sample,
secs = nsecs / NSEC_PER_SEC;
nsecs -= secs * NSEC_PER_SEC;
- if (nanosecs)
+ if (symbol_conf.nanosecs)
printed += fprintf(fp, "%5lu.%09llu: ", secs, nsecs);
else {
char sample_time[32];
@@ -3244,7 +3243,7 @@ static int parse_insn_trace(const struct option *opt __maybe_unused,
{
parse_output_fields(NULL, "+insn,-event,-period", 0);
itrace_parse_synth_opts(opt, "i0ns", 0);
- nanosecs = true;
+ symbol_conf.nanosecs = true;
return 0;
}
@@ -3262,7 +3261,7 @@ static int parse_call_trace(const struct option *opt __maybe_unused,
{
parse_output_fields(NULL, "-ip,-addr,-event,-period,+callindent", 0);
itrace_parse_synth_opts(opt, "cewp", 0);
- nanosecs = true;
+ symbol_conf.nanosecs = true;
return 0;
}
@@ -3272,7 +3271,7 @@ static int parse_callret_trace(const struct option *opt __maybe_unused,
{
parse_output_fields(NULL, "-ip,-addr,-event,-period,+callindent,+flags", 0);
itrace_parse_synth_opts(opt, "crewp", 0);
- nanosecs = true;
+ symbol_conf.nanosecs = true;
return 0;
}
@@ -3408,7 +3407,7 @@ int cmd_script(int argc, const char **argv)
OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
OPT_INTEGER(0, "max-blocks", &max_blocks,
"Maximum number of code blocks to dump with brstackinsn"),
- OPT_BOOLEAN(0, "ns", &nanosecs,
+ OPT_BOOLEAN(0, "ns", &symbol_conf.nanosecs,
"Use 9 decimal places when displaying time"),
OPT_CALLBACK_OPTARG(0, "itrace", &itrace_synth_opts, NULL, "opts",
"Instruction Tracing options\n" ITRACE_HELP,
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 758bf5f74e6e..eb873ea1c405 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -39,6 +39,7 @@ int vmlinux_path__nr_entries;
char **vmlinux_path;
struct symbol_conf symbol_conf = {
+ .nanosecs = false,
.use_modules = true,
.try_vmlinux_path = true,
.demangle = true,
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index fffea68c1203..095a297c8b47 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -8,6 +8,7 @@ struct strlist;
struct intlist;
struct symbol_conf {
+ bool nanosecs;
unsigned short priv_size;
bool try_vmlinux_path,
init_annotation,
* [PATCH v4 05/15] perf tools report: Parse time quantum
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (3 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 04/15] perf tools report: Support nano seconds Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-08 13:45 ` Arnaldo Carvalho de Melo
2019-03-22 22:01 ` [tip:perf/urgent] perf " tip-bot for Andi Kleen
2019-03-05 14:47 ` [PATCH v4 06/15] perf tools report: Support time sort key Andi Kleen
` (10 subsequent siblings)
15 siblings, 2 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Many workloads change over time. perf report currently aggregates
the whole time range reported in perf.data.
This patch adds an option for a time quantum to quantize the
perf.data over time.
This just adds the option; it will be used in follow-on patches
for a time sort key.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
v2:
Move time_quantum to symbol_conf. check for zero time quantum
v3:
Document s unit
---
tools/perf/Documentation/perf-report.txt | 4 +++
tools/perf/builtin-report.c | 41 ++++++++++++++++++++++++
tools/perf/util/symbol.c | 1 +
tools/perf/util/symbol_conf.h | 1 +
4 files changed, 47 insertions(+)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 51dbc519dbce..9ec1702bccdd 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -497,6 +497,10 @@ include::itrace.txt[]
The period/hits keywords set the base the percentage is computed
on - the samples period or the number of samples (hits).
+--time-quantum::
+ Configure time quantum for time sort key. Default 100ms.
+ Accepts s, us, ms, ns units.
+
include::callchain-overhead-calculation.txt[]
SEE ALSO
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 09180e559ad6..c19952072a3a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -47,6 +47,7 @@
#include <errno.h>
#include <inttypes.h>
#include <regex.h>
+#include "sane_ctype.h"
#include <signal.h>
#include <linux/bitmap.h>
#include <linux/stringify.h>
@@ -926,6 +927,43 @@ report_parse_callchain_opt(const struct option *opt, const char *arg, int unset)
return parse_callchain_report_opt(arg);
}
+static int
+parse_time_quantum(const struct option *opt, const char *arg,
+ int unset __maybe_unused)
+{
+ unsigned long *time_q = opt->value;
+ char *end;
+
+ *time_q = strtoul(arg, &end, 0);
+ if (end == arg)
+ goto parse_err;
+ if (*time_q == 0) {
+ pr_err("time quantum cannot be 0");
+ return -1;
+ }
+ while (isspace(*end))
+ end++;
+ if (*end == 0)
+ return 0;
+ if (!strcmp(end, "s")) {
+ *time_q *= 1000000000;
+ return 0;
+ }
+ if (!strcmp(end, "ms")) {
+ *time_q *= 1000000;
+ return 0;
+ }
+ if (!strcmp(end, "us")) {
+ *time_q *= 1000;
+ return 0;
+ }
+ if (!strcmp(end, "ns"))
+ return 0;
+parse_err:
+ pr_err("Cannot parse time quantum `%s'\n", arg);
+ return -1;
+}
+
int
report_parse_ignore_callees_opt(const struct option *opt __maybe_unused,
const char *arg, int unset __maybe_unused)
@@ -1148,6 +1186,9 @@ int cmd_report(int argc, const char **argv)
"Set percent type local/global-period/hits",
annotate_parse_percent_type),
OPT_BOOLEAN(0, "ns", &symbol_conf.nanosecs, "Show times in nanosecs"),
+ OPT_CALLBACK(0, "time-quantum", &symbol_conf.time_quantum, "time (ms|us|ns|s)",
+ "Set time quantum for time sort key (default 100ms)",
+ parse_time_quantum),
OPT_END()
};
struct perf_data data = {
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index eb873ea1c405..0f80743a1c25 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -45,6 +45,7 @@ struct symbol_conf symbol_conf = {
.demangle = true,
.demangle_kernel = false,
.cumulate_callchain = true,
+ .time_quantum = 100000000, /* 100ms */
.show_hist_headers = true,
.symfs = "",
.event_group = true,
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index 095a297c8b47..a5684a71b78e 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -56,6 +56,7 @@ struct symbol_conf {
*sym_list_str,
*col_width_list_str,
*bt_stop_list_str;
+ unsigned long time_quantum;
struct strlist *dso_list,
*comm_list,
*sym_list,
--
2.20.1
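
The unit handling in parse_time_quantum() can be tried outside perf with
a standalone variant. This is only a sketch under stated assumptions:
parse_quantum_ns() is a hypothetical name, it returns the value through
an out parameter instead of opt->value, it uses unsigned long long rather
than the patch's unsigned long, and the NSEC_PER_* constants are defined
locally instead of coming from linux/time64.h.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define NSEC_PER_USEC 1000ULL
#define NSEC_PER_MSEC 1000000ULL
#define NSEC_PER_SEC  1000000000ULL

/*
 * Hypothetical standalone variant of parse_time_quantum(). Stores the
 * quantum in nanoseconds in *out. Returns 0 on success, -1 if the
 * number is missing, zero, or followed by an unknown unit.
 */
static int parse_quantum_ns(const char *arg, unsigned long long *out)
{
	char *end;
	unsigned long long val = strtoull(arg, &end, 0);

	if (end == arg || val == 0)
		return -1;

	/* Allow whitespace between the number and the unit. */
	while (isspace((unsigned char)*end))
		end++;

	if (*end == '\0' || !strcmp(end, "ns")) {
		/* No unit or "ns": the value is already in nanoseconds. */
	} else if (!strcmp(end, "us")) {
		val *= NSEC_PER_USEC;
	} else if (!strcmp(end, "ms")) {
		val *= NSEC_PER_MSEC;
	} else if (!strcmp(end, "s")) {
		val *= NSEC_PER_SEC;
	} else {
		return -1;
	}

	*out = val;
	return 0;
}
```

As in the patch, a bare number is taken as nanoseconds, and *out is left
untouched on error so a caller can keep its default.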
* Re: [PATCH v4 05/15] perf tools report: Parse time quantum
2019-03-05 14:47 ` [PATCH v4 05/15] perf tools report: Parse time quantum Andi Kleen
@ 2019-03-08 13:45 ` Arnaldo Carvalho de Melo
2019-03-08 13:52 ` Arnaldo Carvalho de Melo
2019-03-22 22:01 ` [tip:perf/urgent] perf " tip-bot for Andi Kleen
1 sibling, 1 reply; 49+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-03-08 13:45 UTC (permalink / raw)
To: Andi Kleen; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
Em Tue, Mar 05, 2019 at 06:47:48AM -0800, Andi Kleen escreveu:
> From: Andi Kleen <ak@linux.intel.com>
>
> Many workloads change over time. perf report currently aggregates
> the whole time range reported in perf.data.
>
> This patch adds an option for a time quantum to quantize the
> perf.data over time.
>
> This just adds the option; it will be used in follow-on patches
> for a time sort key.
>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
>
> ---
> v2:
> Move time_quantum to symbol_conf. check for zero time quantum
> v3:
> Document s unit
> ---
> tools/perf/Documentation/perf-report.txt | 4 +++
> tools/perf/builtin-report.c | 41 ++++++++++++++++++++++++
> tools/perf/util/symbol.c | 1 +
> tools/perf/util/symbol_conf.h | 1 +
> 4 files changed, 47 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index 51dbc519dbce..9ec1702bccdd 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -497,6 +497,10 @@ include::itrace.txt[]
> The period/hits keywords set the base the percentage is computed
> on - the samples period or the number of samples (hits).
>
> +--time-quantum::
> + Configure time quantum for time sort key. Default 100ms.
> + Accepts s, us, ms, ns units.
> +
> include::callchain-overhead-calculation.txt[]
>
> SEE ALSO
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 09180e559ad6..c19952072a3a 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -47,6 +47,7 @@
> #include <errno.h>
> #include <inttypes.h>
> #include <regex.h>
> +#include "sane_ctype.h"
> #include <signal.h>
> #include <linux/bitmap.h>
> #include <linux/stringify.h>
> @@ -926,6 +927,43 @@ report_parse_callchain_opt(const struct option *opt, const char *arg, int unset)
> return parse_callchain_report_opt(arg);
> }
>
> +static int
> +parse_time_quantum(const struct option *opt, const char *arg,
> + int unset __maybe_unused)
> +{
> + unsigned long *time_q = opt->value;
> + char *end;
> +
> + *time_q = strtoul(arg, &end, 0);
> + if (end == arg)
> + goto parse_err;
> + if (*time_q == 0) {
> + pr_err("time quantum cannot be 0");
> + return -1;
> + }
> + while (isspace(*end))
> + end++;
> + if (*end == 0)
> + return 0;
We have tools/include/linux/time64.h, just like the kernel, so please
use:
#include <linux/time64.h>
> + if (!strcmp(end, "s")) {
> + *time_q *= 1000000000;
NSEC_PER_SEC;
> + return 0;
> + }
> + if (!strcmp(end, "ms")) {
> + *time_q *= 1000000;
NSEC_PER_MSEC;
> + return 0;
> + }
> + if (!strcmp(end, "us")) {
> + *time_q *= 1000;
NSEC_PER_USEC;
one more note below
> + return 0;
> + }
> + if (!strcmp(end, "ns"))
> + return 0;
> +parse_err:
> + pr_err("Cannot parse time quantum `%s'\n", arg);
> + return -1;
> +}
> +
> int
> report_parse_ignore_callees_opt(const struct option *opt __maybe_unused,
> const char *arg, int unset __maybe_unused)
> @@ -1148,6 +1186,9 @@ int cmd_report(int argc, const char **argv)
> "Set percent type local/global-period/hits",
> annotate_parse_percent_type),
> OPT_BOOLEAN(0, "ns", &symbol_conf.nanosecs, "Show times in nanosecs"),
> + OPT_CALLBACK(0, "time-quantum", &symbol_conf.time_quantum, "time (ms|us|ns|s)",
> + "Set time quantum for time sort key (default 100ms)",
> + parse_time_quantum),
> OPT_END()
> };
> struct perf_data data = {
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index eb873ea1c405..0f80743a1c25 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -45,6 +45,7 @@ struct symbol_conf symbol_conf = {
> .demangle = true,
> .demangle_kernel = false,
> .cumulate_callchain = true,
> + .time_quantum = 100000000, /* 100ms */
100 * NSEC_PER_MSEC;
> .show_hist_headers = true,
> .symfs = "",
> .event_group = true,
> diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
> index 095a297c8b47..a5684a71b78e 100644
> --- a/tools/perf/util/symbol_conf.h
> +++ b/tools/perf/util/symbol_conf.h
> @@ -56,6 +56,7 @@ struct symbol_conf {
> *sym_list_str,
> *col_width_list_str,
> *bt_stop_list_str;
> + unsigned long time_quantum;
> struct strlist *dso_list,
> *comm_list,
> *sym_list,
> --
> 2.20.1
--
- Arnaldo
* Re: [PATCH v4 05/15] perf tools report: Parse time quantum
2019-03-08 13:45 ` Arnaldo Carvalho de Melo
@ 2019-03-08 13:52 ` Arnaldo Carvalho de Melo
2019-03-08 17:39 ` Andi Kleen
0 siblings, 1 reply; 49+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-03-08 13:52 UTC (permalink / raw)
To: Andi Kleen; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
Em Fri, Mar 08, 2019 at 10:45:32AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Tue, Mar 05, 2019 at 06:47:48AM -0800, Andi Kleen escreveu:
> > From: Andi Kleen <ak@linux.intel.com>
> > + if (*end == 0)
> > + return 0;
>
> We have tools/include/linux/time64.h, just like the kernel, so please
> use:
>
> #include <linux/time64.h>
I'll do that this time.
- Arnaldo
* Re: [PATCH v4 05/15] perf tools report: Parse time quantum
2019-03-08 13:52 ` Arnaldo Carvalho de Melo
@ 2019-03-08 17:39 ` Andi Kleen
0 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-08 17:39 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Andi Kleen, jolsa, namhyung, linux-kernel, linux-perf-users
On Fri, Mar 08, 2019 at 10:52:36AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Fri, Mar 08, 2019 at 10:45:32AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Tue, Mar 05, 2019 at 06:47:48AM -0800, Andi Kleen escreveu:
> > > From: Andi Kleen <ak@linux.intel.com>
> > > + if (*end == 0)
> > > + return 0;
> >
> > We have tools/include/linux/time64.h, just like the kernel, so please
> > use:
> >
> > #include <linux/time64.h>
>
> I'll do that this time.
Thanks. I already did it in my copy too.
-Andi
* [tip:perf/urgent] perf report: Parse time quantum
2019-03-05 14:47 ` [PATCH v4 05/15] perf tools report: Parse time quantum Andi Kleen
2019-03-08 13:45 ` Arnaldo Carvalho de Melo
@ 2019-03-22 22:01 ` tip-bot for Andi Kleen
1 sibling, 0 replies; 49+ messages in thread
From: tip-bot for Andi Kleen @ 2019-03-22 22:01 UTC (permalink / raw)
To: linux-tip-commits
Cc: tglx, linux-kernel, jolsa, namhyung, ak, acme, hpa, mingo
Commit-ID: 2a1292cbd4e5c81edbf815a410fa2072c341db1e
Gitweb: https://git.kernel.org/tip/2a1292cbd4e5c81edbf815a410fa2072c341db1e
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Tue, 5 Mar 2019 06:47:48 -0800
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Mar 2019 11:56:03 -0300
perf report: Parse time quantum
Many workloads change over time. 'perf report' currently aggregates the
whole time range reported in perf.data.
This patch adds an option for a time quantum to quantize the perf.data
over time.
This just adds the option; it will be used in follow-on patches for a time
sort key.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-6-andi@firstfloor.org
[ Use NSEC_PER_[MU]SEC ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/Documentation/perf-report.txt | 4 +++
tools/perf/builtin-report.c | 42 ++++++++++++++++++++++++++++++++
tools/perf/util/symbol.c | 2 ++
tools/perf/util/symbol_conf.h | 1 +
4 files changed, 49 insertions(+)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 51dbc519dbce..9ec1702bccdd 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -497,6 +497,10 @@ include::itrace.txt[]
The period/hits keywords set the base the percentage is computed
on - the samples period or the number of samples (hits).
+--time-quantum::
+ Configure time quantum for time sort key. Default 100ms.
+ Accepts s, us, ms, ns units.
+
include::callchain-overhead-calculation.txt[]
SEE ALSO
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 515864ba504a..05c8dd41106c 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -47,9 +47,11 @@
#include <errno.h>
#include <inttypes.h>
#include <regex.h>
+#include "sane_ctype.h"
#include <signal.h>
#include <linux/bitmap.h>
#include <linux/stringify.h>
+#include <linux/time64.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
@@ -926,6 +928,43 @@ report_parse_callchain_opt(const struct option *opt, const char *arg, int unset)
return parse_callchain_report_opt(arg);
}
+static int
+parse_time_quantum(const struct option *opt, const char *arg,
+ int unset __maybe_unused)
+{
+ unsigned long *time_q = opt->value;
+ char *end;
+
+ *time_q = strtoul(arg, &end, 0);
+ if (end == arg)
+ goto parse_err;
+ if (*time_q == 0) {
+ pr_err("time quantum cannot be 0");
+ return -1;
+ }
+ while (isspace(*end))
+ end++;
+ if (*end == 0)
+ return 0;
+ if (!strcmp(end, "s")) {
+ *time_q *= NSEC_PER_SEC;
+ return 0;
+ }
+ if (!strcmp(end, "ms")) {
+ *time_q *= NSEC_PER_MSEC;
+ return 0;
+ }
+ if (!strcmp(end, "us")) {
+ *time_q *= NSEC_PER_USEC;
+ return 0;
+ }
+ if (!strcmp(end, "ns"))
+ return 0;
+parse_err:
+ pr_err("Cannot parse time quantum `%s'\n", arg);
+ return -1;
+}
+
int
report_parse_ignore_callees_opt(const struct option *opt __maybe_unused,
const char *arg, int unset __maybe_unused)
@@ -1148,6 +1187,9 @@ int cmd_report(int argc, const char **argv)
"Set percent type local/global-period/hits",
annotate_parse_percent_type),
OPT_BOOLEAN(0, "ns", &symbol_conf.nanosecs, "Show times in nanosecs"),
+ OPT_CALLBACK(0, "time-quantum", &symbol_conf.time_quantum, "time (ms|us|ns|s)",
+ "Set time quantum for time sort key (default 100ms)",
+ parse_time_quantum),
OPT_END()
};
struct perf_data data = {
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index eb873ea1c405..6b73a0eeb6a1 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -6,6 +6,7 @@
#include <string.h>
#include <linux/kernel.h>
#include <linux/mman.h>
+#include <linux/time64.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/param.h>
@@ -45,6 +46,7 @@ struct symbol_conf symbol_conf = {
.demangle = true,
.demangle_kernel = false,
.cumulate_callchain = true,
+ .time_quantum = 100 * NSEC_PER_MSEC, /* 100ms */
.show_hist_headers = true,
.symfs = "",
.event_group = true,
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index 095a297c8b47..a5684a71b78e 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -56,6 +56,7 @@ struct symbol_conf {
*sym_list_str,
*col_width_list_str,
*bt_stop_list_str;
+ unsigned long time_quantum;
struct strlist *dso_list,
*comm_list,
*sym_list,
* [PATCH v4 06/15] perf tools report: Support time sort key
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (4 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 05/15] perf tools report: Parse time quantum Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-05 14:47 ` [PATCH v4 07/15] perf tools report: Use less for scripts output Andi Kleen
` (9 subsequent siblings)
15 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add a time sort key to perf report to display samples for
different time quanta separately. This allows easier
analysis of workloads that change over time, and will
also allow looking at the context of samples.
% perf record ...
% perf report --sort time,overhead,symbol --time-quantum 1ms --stdio
...
0.67% 277061.87300 [.] _dl_start
0.50% 277061.87300 [.] f1
0.50% 277061.87300 [.] f2
0.33% 277061.87300 [.] main
0.29% 277061.87300 [.] _dl_lookup_symbol_x
0.29% 277061.87300 [.] dl_main
0.29% 277061.87300 [.] do_lookup_x
0.17% 277061.87300 [.] _dl_debug_initialize
0.17% 277061.87300 [.] _dl_init_paths
0.08% 277061.87300 [.] check_match
0.04% 277061.87300 [.] _dl_count_modids
1.33% 277061.87400 [.] f1
1.33% 277061.87400 [.] f2
1.33% 277061.87400 [.] main
1.17% 277061.87500 [.] main
1.08% 277061.87500 [.] f1
1.08% 277061.87500 [.] f2
1.00% 277061.87600 [.] main
0.83% 277061.87600 [.] f1
0.83% 277061.87600 [.] f2
1.00% 277061.87700 [.] main
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
v2:
Use symbol_conf.time_quantum
---
tools/perf/Documentation/perf-report.txt | 2 ++
tools/perf/util/hist.c | 10 +++++++
tools/perf/util/hist.h | 1 +
tools/perf/util/sort.c | 38 ++++++++++++++++++++++++
tools/perf/util/sort.h | 2 ++
5 files changed, 53 insertions(+)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 9ec1702bccdd..546d87221ad8 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -105,6 +105,8 @@ OPTIONS
guest machine
- sample: Number of sample
- period: Raw number of event count of sample
+ - time: Separate the samples by time stamp with the resolution specified by
+ --time-quantum (default 100ms). Specify with overhead and before it.
By default, comm, dso and symbol keys are used.
(i.e. --sort comm,dso,symbol)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 669f961316f0..6040eb49ea23 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -192,6 +192,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
hists__new_col_len(hists, HISTC_MEM_LVL, 21 + 3);
hists__new_col_len(hists, HISTC_LOCAL_WEIGHT, 12);
hists__new_col_len(hists, HISTC_GLOBAL_WEIGHT, 12);
+ hists__new_col_len(hists, HISTC_TIME, 12);
if (h->srcline) {
len = MAX(strlen(h->srcline), strlen(sort_srcline.se_header));
@@ -246,6 +247,14 @@ static void he_stat__add_cpumode_period(struct he_stat *he_stat,
}
}
+static long hist_time(unsigned long time)
+{
+ unsigned long time_quantum = symbol_conf.time_quantum;
+ if (time_quantum)
+ return (time / time_quantum) * time_quantum;
+ return (time / 1000000) * 1000000;
+}
+
static void he_stat__add_period(struct he_stat *he_stat, u64 period,
u64 weight)
{
@@ -626,6 +635,7 @@ __hists__add_entry(struct hists *hists,
.raw_data = sample->raw_data,
.raw_size = sample->raw_size,
.ops = ops,
+ .time = hist_time(sample->time),
}, *he = hists__findnew_entry(hists, &entry, al, sample_self);
if (!hists->has_callchains && he && he->callchain_size != 0)
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 4af27fbab24f..6279eca56409 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -31,6 +31,7 @@ enum hist_filter {
enum hist_column {
HISTC_SYMBOL,
+ HISTC_TIME,
HISTC_DSO,
HISTC_THREAD,
HISTC_COMM,
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index d2299e912e59..22f24bb2bf8a 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -15,6 +15,7 @@
#include <traceevent/event-parse.h>
#include "mem-events.h"
#include "annotate.h"
+#include "time-utils.h"
#include <linux/kernel.h>
regex_t parent_regex;
@@ -654,6 +655,42 @@ struct sort_entry sort_socket = {
.se_width_idx = HISTC_SOCKET,
};
+/* --sort time */
+
+static int64_t
+sort__time_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ return right->time - left->time;
+}
+
+static int hist_entry__time_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ unsigned long secs;
+ unsigned long long nsecs;
+ char he_time[32];
+
+ nsecs = he->time;
+ secs = nsecs / 1000000000;
+ nsecs -= secs * 1000000000;
+
+ if (symbol_conf.nanosecs)
+ snprintf(he_time, sizeof he_time, "%5lu.%09llu: ",
+ secs, nsecs);
+ else
+ timestamp__scnprintf_usec(he->time, he_time,
+ sizeof(he_time));
+
+ return repsep_snprintf(bf, size, "%-.*s", width, he_time);
+}
+
+struct sort_entry sort_time = {
+ .se_header = "Time",
+ .se_cmp = sort__time_cmp,
+ .se_snprintf = hist_entry__time_snprintf,
+ .se_width_idx = HISTC_TIME,
+};
+
/* --sort trace */
static char *get_trace_output(struct hist_entry *he)
@@ -1634,6 +1671,7 @@ static struct sort_dimension common_sort_dimensions[] = {
DIM(SORT_DSO_SIZE, "dso_size", sort_dso_size),
DIM(SORT_CGROUP_ID, "cgroup_id", sort_cgroup_id),
DIM(SORT_SYM_IPC_NULL, "ipc_null", sort_sym_ipc_null),
+ DIM(SORT_TIME, "time", sort_time),
};
#undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 2fbee0b1011c..19dceb7f6145 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -135,6 +135,7 @@ struct hist_entry {
char *srcfile;
struct symbol *parent;
struct branch_info *branch_info;
+ long time;
struct hists *hists;
struct mem_info *mem_info;
void *raw_data;
@@ -231,6 +232,7 @@ enum sort_type {
SORT_DSO_SIZE,
SORT_CGROUP_ID,
SORT_SYM_IPC_NULL,
+ SORT_TIME,
/* branch stack specific sort keys */
__SORT_BRANCH_STACK,
--
2.20.1
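
The bucketing done by hist_time() above is a round-down to a multiple of
the quantum, so all samples inside one quantum get the same key and are
merged into one hist entry per symbol. A small sketch, assuming a
hypothetical quantize_time() that takes the quantum as a parameter
instead of reading symbol_conf.time_quantum:

```c
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL

/*
 * Hypothetical helper, not part of perf: round a nanosecond timestamp
 * down to the start of its time quantum. A zero quantum falls back to
 * 1ms, mirroring the 1000000 fallback in the patch.
 */
static uint64_t quantize_time(uint64_t time_ns, uint64_t quantum_ns)
{
	if (quantum_ns == 0)
		quantum_ns = NSEC_PER_MSEC;

	/* Integer division drops the remainder within the quantum. */
	return (time_ns / quantum_ns) * quantum_ns;
}
```

With a 1ms quantum, two samples taken at 277061.873456789 and
277061.873999999 both map to 277061.873000000, which is why the example
output above shows several symbols under the same time value.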
* [PATCH v4 07/15] perf tools report: Use less for scripts output
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (5 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 06/15] perf tools report: Support time sort key Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-08 13:47 ` Arnaldo Carvalho de Melo
2019-03-05 14:47 ` [PATCH v4 08/15] perf tools report: Support running scripts for current time range Andi Kleen
` (8 subsequent siblings)
15 siblings, 1 reply; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
The UI viewer for scripts output has a lot of limitations: limited size,
no search or save function, slow, and various other issues.
Just use 'less' to display directly on the terminal instead.
This won't work in gtk mode, but gtk doesn't support these
context menus anyway. If that is ever done, it could use a terminal
for the output.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
v2:
Remove some unneeded headers
---
tools/perf/ui/browsers/scripts.c | 130 ++++---------------------------
1 file changed, 17 insertions(+), 113 deletions(-)
diff --git a/tools/perf/ui/browsers/scripts.c b/tools/perf/ui/browsers/scripts.c
index 90a32ac69e76..7f36630694bf 100644
--- a/tools/perf/ui/browsers/scripts.c
+++ b/tools/perf/ui/browsers/scripts.c
@@ -1,35 +1,12 @@
// SPDX-License-Identifier: GPL-2.0
-#include <elf.h>
-#include <inttypes.h>
-#include <sys/ttydefaults.h>
-#include <string.h>
#include "../../util/sort.h"
#include "../../util/util.h"
#include "../../util/hist.h"
#include "../../util/debug.h"
#include "../../util/symbol.h"
#include "../browser.h"
-#include "../helpline.h"
#include "../libslang.h"
-/* 2048 lines should be enough for a script output */
-#define MAX_LINES 2048
-
-/* 160 bytes for one output line */
-#define AVERAGE_LINE_LEN 160
-
-struct script_line {
- struct list_head node;
- char line[AVERAGE_LINE_LEN];
-};
-
-struct perf_script_browser {
- struct ui_browser b;
- struct list_head entries;
- const char *script_name;
- int nr_lines;
-};
-
#define SCRIPT_NAMELEN 128
#define SCRIPT_MAX_NO 64
/*
@@ -73,69 +50,29 @@ static int list_scripts(char *script_name)
return ret;
}
-static void script_browser__write(struct ui_browser *browser,
- void *entry, int row)
+static void run_script(char *cmd)
{
- struct script_line *sline = list_entry(entry, struct script_line, node);
- bool current_entry = ui_browser__is_current_entry(browser, row);
-
- ui_browser__set_color(browser, current_entry ? HE_COLORSET_SELECTED :
- HE_COLORSET_NORMAL);
-
- ui_browser__write_nstring(browser, sline->line, browser->width);
+ pr_debug("Running %s\n", cmd);
+ SLang_reset_tty();
+ if (system(cmd) < 0)
+ pr_warning("Cannot run %s\n", cmd);
+ /*
+ * SLang doesn't seem to reset the whole terminal, so be more
+ * forceful to get back to the original state.
+ */
+ printf("\033[c\033[H\033[J");
+ fflush(stdout);
+ SLang_init_tty(0, 0, 0);
+ SLsmg_refresh();
}
-static int script_browser__run(struct perf_script_browser *browser)
-{
- int key;
-
- if (ui_browser__show(&browser->b, browser->script_name,
- "Press ESC to exit") < 0)
- return -1;
-
- while (1) {
- key = ui_browser__run(&browser->b, 0);
-
- /* We can add some special key handling here if needed */
- break;
- }
-
- ui_browser__hide(&browser->b);
- return key;
-}
-
-
int script_browse(const char *script_opt)
{
char cmd[SCRIPT_FULLPATH_LEN*2], script_name[SCRIPT_FULLPATH_LEN];
- char *line = NULL;
- size_t len = 0;
- ssize_t retlen;
- int ret = -1, nr_entries = 0;
- FILE *fp;
- void *buf;
- struct script_line *sline;
-
- struct perf_script_browser script = {
- .b = {
- .refresh = ui_browser__list_head_refresh,
- .seek = ui_browser__list_head_seek,
- .write = script_browser__write,
- },
- .script_name = script_name,
- };
-
- INIT_LIST_HEAD(&script.entries);
-
- /* Save each line of the output in one struct script_line object. */
- buf = zalloc((sizeof(*sline)) * MAX_LINES);
- if (!buf)
- return -1;
- sline = buf;
memset(script_name, 0, SCRIPT_FULLPATH_LEN);
if (list_scripts(script_name))
- goto exit;
+ return -1;
sprintf(cmd, "perf script -s %s ", script_name);
@@ -147,42 +84,9 @@ int script_browse(const char *script_opt)
strcat(cmd, input_name);
}
- strcat(cmd, " 2>&1");
-
- fp = popen(cmd, "r");
- if (!fp)
- goto exit;
-
- while ((retlen = getline(&line, &len, fp)) != -1) {
- strncpy(sline->line, line, AVERAGE_LINE_LEN);
-
- /* If one output line is very large, just cut it short */
- if (retlen >= AVERAGE_LINE_LEN) {
- sline->line[AVERAGE_LINE_LEN - 1] = '\0';
- sline->line[AVERAGE_LINE_LEN - 2] = '\n';
- }
- list_add_tail(&sline->node, &script.entries);
-
- if (script.b.width < retlen)
- script.b.width = retlen;
-
- if (nr_entries++ >= MAX_LINES - 1)
- break;
- sline++;
- }
-
- if (script.b.width > AVERAGE_LINE_LEN)
- script.b.width = AVERAGE_LINE_LEN;
-
- free(line);
- pclose(fp);
+ strcat(cmd, " 2>&1 | less");
- script.nr_lines = nr_entries;
- script.b.nr_entries = nr_entries;
- script.b.entries = &script.entries;
+ run_script(cmd);
- ret = script_browser__run(&script);
-exit:
- free(buf);
- return ret;
+ return 0;
}
--
2.20.1
^ permalink raw reply related [flat|nested] 49+ messages in thread
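Note on the command construction in patch 07/15 above: script_browse() assembles the shell command with sprintf() and strcat() into a fixed-size buffer. A minimal sketch of the same string built with one bounded snprintf() call follows. The helper name build_script_cmd() and its return convention are assumptions made for illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: build the shell command that
 * runs a perf script and pages its output through less.
 * Returns 0 on success, -1 if the command does not fit in buf.
 */
static int build_script_cmd(char *buf, size_t sz, const char *script_name,
			    const char *script_opt, const char *input_name)
{
	int n;

	assert(buf && sz > 0 && script_name);
	/* stderr is merged into stdout so that errors also show up in less */
	n = snprintf(buf, sz, "perf script -s %s %s%s%s 2>&1 | less",
		     script_name,
		     script_opt ? script_opt : "",
		     input_name ? " -i " : "",
		     input_name ? input_name : "");
	/* snprintf reports the length it wanted; treat truncation as an error */
	return (n < 0 || (size_t)n >= sz) ? -1 : 0;
}
```

The sketch does no shell quoting, so a script or input name containing spaces or shell metacharacters would still need escaping before being handed to system().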
* Re: [PATCH v4 07/15] perf tools report: Use less for scripts output
2019-03-05 14:47 ` [PATCH v4 07/15] perf tools report: Use less for scripts output Andi Kleen
@ 2019-03-08 13:47 ` Arnaldo Carvalho de Melo
2019-03-09 7:52 ` Feng Tang
0 siblings, 1 reply; 49+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-03-08 13:47 UTC (permalink / raw)
To: Andi Kleen
Cc: Feng Tang, jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
Em Tue, Mar 05, 2019 at 06:47:50AM -0800, Andi Kleen escreveu:
> From: Andi Kleen <ak@linux.intel.com>
>
> The UI viewer for scripts output has a lot of limitations: limited size,
> no search or save function, slow, and various other issues.
>
> Just use 'less' to display directly on the terminal instead.
I'm ok with this, CCing Feng tho since he contributed this browser, to
let him know.
- Arnaldo
> This won't work in gtk mode, but gtk doesn't support these
> context menus anyways. If that is ever done it could use a terminal
> for the output.
>
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
>
> ---
>
> v2:
> Remove some unneeded headers
> ---
> tools/perf/ui/browsers/scripts.c | 130 ++++---------------------------
> 1 file changed, 17 insertions(+), 113 deletions(-)
>
> diff --git a/tools/perf/ui/browsers/scripts.c b/tools/perf/ui/browsers/scripts.c
> index 90a32ac69e76..7f36630694bf 100644
> --- a/tools/perf/ui/browsers/scripts.c
> +++ b/tools/perf/ui/browsers/scripts.c
> @@ -1,35 +1,12 @@
> // SPDX-License-Identifier: GPL-2.0
> -#include <elf.h>
> -#include <inttypes.h>
> -#include <sys/ttydefaults.h>
> -#include <string.h>
> #include "../../util/sort.h"
> #include "../../util/util.h"
> #include "../../util/hist.h"
> #include "../../util/debug.h"
> #include "../../util/symbol.h"
> #include "../browser.h"
> -#include "../helpline.h"
> #include "../libslang.h"
>
> -/* 2048 lines should be enough for a script output */
> -#define MAX_LINES 2048
> -
> -/* 160 bytes for one output line */
> -#define AVERAGE_LINE_LEN 160
> -
> -struct script_line {
> - struct list_head node;
> - char line[AVERAGE_LINE_LEN];
> -};
> -
> -struct perf_script_browser {
> - struct ui_browser b;
> - struct list_head entries;
> - const char *script_name;
> - int nr_lines;
> -};
> -
> #define SCRIPT_NAMELEN 128
> #define SCRIPT_MAX_NO 64
> /*
> @@ -73,69 +50,29 @@ static int list_scripts(char *script_name)
> return ret;
> }
>
> -static void script_browser__write(struct ui_browser *browser,
> - void *entry, int row)
> +static void run_script(char *cmd)
> {
> - struct script_line *sline = list_entry(entry, struct script_line, node);
> - bool current_entry = ui_browser__is_current_entry(browser, row);
> -
> - ui_browser__set_color(browser, current_entry ? HE_COLORSET_SELECTED :
> - HE_COLORSET_NORMAL);
> -
> - ui_browser__write_nstring(browser, sline->line, browser->width);
> + pr_debug("Running %s\n", cmd);
> + SLang_reset_tty();
> + if (system(cmd) < 0)
> + pr_warning("Cannot run %s\n", cmd);
> + /*
> + * SLang doesn't seem to reset the whole terminal, so be more
> + * forceful to get back to the original state.
> + */
> + printf("\033[c\033[H\033[J");
> + fflush(stdout);
> + SLang_init_tty(0, 0, 0);
> + SLsmg_refresh();
> }
>
> -static int script_browser__run(struct perf_script_browser *browser)
> -{
> - int key;
> -
> - if (ui_browser__show(&browser->b, browser->script_name,
> - "Press ESC to exit") < 0)
> - return -1;
> -
> - while (1) {
> - key = ui_browser__run(&browser->b, 0);
> -
> - /* We can add some special key handling here if needed */
> - break;
> - }
> -
> - ui_browser__hide(&browser->b);
> - return key;
> -}
> -
> -
> int script_browse(const char *script_opt)
> {
> char cmd[SCRIPT_FULLPATH_LEN*2], script_name[SCRIPT_FULLPATH_LEN];
> - char *line = NULL;
> - size_t len = 0;
> - ssize_t retlen;
> - int ret = -1, nr_entries = 0;
> - FILE *fp;
> - void *buf;
> - struct script_line *sline;
> -
> - struct perf_script_browser script = {
> - .b = {
> - .refresh = ui_browser__list_head_refresh,
> - .seek = ui_browser__list_head_seek,
> - .write = script_browser__write,
> - },
> - .script_name = script_name,
> - };
> -
> - INIT_LIST_HEAD(&script.entries);
> -
> - /* Save each line of the output in one struct script_line object. */
> - buf = zalloc((sizeof(*sline)) * MAX_LINES);
> - if (!buf)
> - return -1;
> - sline = buf;
>
> memset(script_name, 0, SCRIPT_FULLPATH_LEN);
> if (list_scripts(script_name))
> - goto exit;
> + return -1;
>
> sprintf(cmd, "perf script -s %s ", script_name);
>
> @@ -147,42 +84,9 @@ int script_browse(const char *script_opt)
> strcat(cmd, input_name);
> }
>
> - strcat(cmd, " 2>&1");
> -
> - fp = popen(cmd, "r");
> - if (!fp)
> - goto exit;
> -
> - while ((retlen = getline(&line, &len, fp)) != -1) {
> - strncpy(sline->line, line, AVERAGE_LINE_LEN);
> -
> - /* If one output line is very large, just cut it short */
> - if (retlen >= AVERAGE_LINE_LEN) {
> - sline->line[AVERAGE_LINE_LEN - 1] = '\0';
> - sline->line[AVERAGE_LINE_LEN - 2] = '\n';
> - }
> - list_add_tail(&sline->node, &script.entries);
> -
> - if (script.b.width < retlen)
> - script.b.width = retlen;
> -
> - if (nr_entries++ >= MAX_LINES - 1)
> - break;
> - sline++;
> - }
> -
> - if (script.b.width > AVERAGE_LINE_LEN)
> - script.b.width = AVERAGE_LINE_LEN;
> -
> - free(line);
> - pclose(fp);
> + strcat(cmd, " 2>&1 | less");
>
> - script.nr_lines = nr_entries;
> - script.b.nr_entries = nr_entries;
> - script.b.entries = &script.entries;
> + run_script(cmd);
>
> - ret = script_browser__run(&script);
> -exit:
> - free(buf);
> - return ret;
> + return 0;
> }
> --
> 2.20.1
--
- Arnaldo
^ permalink raw reply [flat|nested] 49+ messages in thread
* Re: [PATCH v4 07/15] perf tools report: Use less for scripts output
2019-03-08 13:47 ` Arnaldo Carvalho de Melo
@ 2019-03-09 7:52 ` Feng Tang
0 siblings, 0 replies; 49+ messages in thread
From: Feng Tang @ 2019-03-09 7:52 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: Andi Kleen, jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
On Fri, Mar 08, 2019 at 10:47:54AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Mar 05, 2019 at 06:47:50AM -0800, Andi Kleen escreveu:
> > From: Andi Kleen <ak@linux.intel.com>
> >
> > The UI viewer for scripts output has a lot of limitations: limited size,
> > no search or save function, slow, and various other issues.
> >
> > Just use 'less' to display directly on the terminal instead.
>
> I'm ok with this, CCing Feng tho since he contributed this browser, to
> let him know.
Thanks for the note!
>
> - Arnaldo
>
> > This won't work in gtk mode, but gtk doesn't support these
> > context menus anyways. If that is ever done it could use a terminal
> > for the output.
> >
> > Signed-off-by: Andi Kleen <ak@linux.intel.com>
> >
Looks good to me.
Acked-by: Feng Tang <feng.tang@intel.com>
Thanks,
Feng
^ permalink raw reply [flat|nested] 49+ messages in thread
* [PATCH v4 08/15] perf tools report: Support running scripts for current time range
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (6 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 07/15] perf tools report: Use less for scripts output Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-05 14:47 ` [PATCH v4 09/15] perf tools report: Support builtin perf script in scripts menu Andi Kleen
` (7 subsequent siblings)
15 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
When using the time sort key, add new context menus to run
scripts for only the currently selected time range. Compute
the correct range for the selection and pass it as the --time option to
perf script.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
v2:
Use symbol_conf.time_quantum
---
tools/perf/ui/browsers/hists.c | 82 +++++++++++++++++++++++++++++-----
1 file changed, 71 insertions(+), 11 deletions(-)
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index aef800d97ea1..663970f766e1 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -30,6 +30,7 @@
#include "srcline.h"
#include "string2.h"
#include "units.h"
+#include "time-utils.h"
#include "sane_ctype.h"
@@ -2338,6 +2339,7 @@ static int switch_data_file(void)
}
struct popup_action {
+ unsigned long time;
struct thread *thread;
struct map_symbol ms;
int socket;
@@ -2527,36 +2529,64 @@ static int
do_run_script(struct hist_browser *browser __maybe_unused,
struct popup_action *act)
{
- char script_opt[64];
- memset(script_opt, 0, sizeof(script_opt));
+ char *script_opt;
+ int len;
+ int n = 0;
+
+ len = 100;
+ if (act->thread)
+ len += strlen(thread__comm_str(act->thread));
+ else if (act->ms.sym)
+ len += strlen(act->ms.sym->name);
+ script_opt = malloc(len);
+ if (!script_opt)
+ return -1;
+ script_opt[0] = 0;
if (act->thread) {
- scnprintf(script_opt, sizeof(script_opt), " -c %s ",
+ n = scnprintf(script_opt, len, " -c %s ",
thread__comm_str(act->thread));
} else if (act->ms.sym) {
- scnprintf(script_opt, sizeof(script_opt), " -S %s ",
+ n = scnprintf(script_opt, len, " -S %s ",
act->ms.sym->name);
}
+ if (act->time) {
+ char start[64], end[64];
+ unsigned long starttime = act->time;
+ unsigned long endtime = act->time + symbol_conf.time_quantum;
+
+ if (starttime == endtime) { /* Display 1ms as fallback */
+ starttime -= 1*1000000;
+ endtime += 1*1000000;
+ }
+ timestamp__scnprintf_usec(starttime, start, sizeof start);
+ timestamp__scnprintf_usec(endtime, end, sizeof end);
+ n += snprintf(script_opt + n, len - n, " --time %s,%s", start, end);
+ }
+
script_browse(script_opt);
+ free(script_opt);
return 0;
}
static int
-add_script_opt(struct hist_browser *browser __maybe_unused,
+add_script_opt_2(struct hist_browser *browser __maybe_unused,
struct popup_action *act, char **optstr,
- struct thread *thread, struct symbol *sym)
+ struct thread *thread, struct symbol *sym,
+ const char *tstr)
{
+
if (thread) {
- if (asprintf(optstr, "Run scripts for samples of thread [%s]",
- thread__comm_str(thread)) < 0)
+ if (asprintf(optstr, "Run scripts for samples of thread [%s]%s",
+ thread__comm_str(thread), tstr) < 0)
return 0;
} else if (sym) {
- if (asprintf(optstr, "Run scripts for samples of symbol [%s]",
- sym->name) < 0)
+ if (asprintf(optstr, "Run scripts for samples of symbol [%s]%s",
+ sym->name, tstr) < 0)
return 0;
} else {
- if (asprintf(optstr, "Run scripts for all samples") < 0)
+ if (asprintf(optstr, "Run scripts for all samples%s", tstr) < 0)
return 0;
}
@@ -2566,6 +2596,36 @@ add_script_opt(struct hist_browser *browser __maybe_unused,
return 1;
}
+static int
+add_script_opt(struct hist_browser *browser,
+ struct popup_action *act, char **optstr,
+ struct thread *thread, struct symbol *sym)
+{
+ int n, j;
+ struct hist_entry *he;
+
+ n = add_script_opt_2(browser, act, optstr, thread, sym, "");
+
+ he = hist_browser__selected_entry(browser);
+ if (sort_order && strstr(sort_order, "time")) {
+ char tstr[128];
+
+ optstr++;
+ act++;
+ j = sprintf(tstr, " in ");
+ j += timestamp__scnprintf_usec(he->time, tstr + j,
+ sizeof tstr - j);
+ j += sprintf(tstr + j, "-");
+ timestamp__scnprintf_usec(he->time + symbol_conf.time_quantum,
+ tstr + j,
+ sizeof tstr - j);
+ n += add_script_opt_2(browser, act, optstr, thread, sym,
+ tstr);
+ act->time = he->time;
+ }
+ return n;
+}
+
static int
do_switch_data(struct hist_browser *browser __maybe_unused,
struct popup_action *act __maybe_unused)
--
2.20.1
^ permalink raw reply related [flat|nested] 49+ messages in thread
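Note on the time range computation in patch 08/15 above: do_run_script() takes the start of the selected time bucket, adds symbol_conf.time_quantum to get the end, and widens a zero-width range by 1ms on each side before formatting both ends for --time. A standalone sketch of that logic follows. The names fmt_usec() and format_time_opt() are made up for illustration, and the clamp at zero is an addition of this sketch; the patch subtracts unconditionally.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NSEC_PER_USEC 1000ULL
#define NSEC_PER_MSEC 1000000ULL
#define NSEC_PER_SEC  1000000000ULL

/* Format a nanosecond timestamp as "sec.usec"; stand-in for perf's helper. */
static int fmt_usec(uint64_t ts, char *buf, size_t sz)
{
	uint64_t sec = ts / NSEC_PER_SEC;
	uint64_t usec = (ts % NSEC_PER_SEC) / NSEC_PER_USEC;

	return snprintf(buf, sz, "%" PRIu64 ".%06" PRIu64, sec, usec);
}

/*
 * Hypothetical helper: write " --time START,END" for the bucket that starts
 * at `time` and is `quantum` wide (both in nanoseconds).
 * Returns 0 on success, -1 if the option does not fit in buf.
 */
static int format_time_opt(uint64_t time, uint64_t quantum,
			   char *buf, size_t sz)
{
	char start[64], end[64];
	uint64_t starttime = time;
	uint64_t endtime = time + quantum;
	int n;

	assert(buf && sz > 0);
	if (starttime == endtime) {
		/* zero-width bucket: show 1ms either side as a fallback */
		starttime = starttime > NSEC_PER_MSEC ?
			    starttime - NSEC_PER_MSEC : 0;
		endtime += NSEC_PER_MSEC;
	}
	fmt_usec(starttime, start, sizeof(start));
	fmt_usec(endtime, end, sizeof(end));
	n = snprintf(buf, sz, " --time %s,%s", start, end);
	return (n < 0 || (size_t)n >= sz) ? -1 : 0;
}
```

For a bucket starting at 1.5s with a 100ms quantum this yields " --time 1.500000,1.600000".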
* [PATCH v4 09/15] perf tools report: Support builtin perf script in scripts menu
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (7 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 08/15] perf tools report: Support running scripts for current time range Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-05 14:47 ` [PATCH v4 10/15] perf tools: Add utility function to print ns time stamps Andi Kleen
` (6 subsequent siblings)
15 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
The scripts menu traditionally only showed custom perf scripts.
Allow running the standard perf script with useful default options too.
- Normal perf script
- perf script with assembler (needs xed installed)
- perf script with source code output (needs debuginfo)
- perf script with custom arguments
Then we automatically select the right options to
display the information in the perf.data file.
For example with -b display branch contexts.
It's not easily possible to check for xed's existence
in advance. perf script usually gives sensible error messages when
it's not available.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
v2:
Pass --inline if needed
v3:
Avoid compiler warnings. Allocate cmd buffer dynamically.
Change formatting
---
tools/perf/ui/browsers/annotate.c | 2 +-
tools/perf/ui/browsers/hists.c | 23 +++---
tools/perf/ui/browsers/scripts.c | 127 +++++++++++++++++++++++-------
tools/perf/util/hist.h | 8 +-
4 files changed, 120 insertions(+), 40 deletions(-)
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 35bdfd8b1e71..98d934a36d86 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -750,7 +750,7 @@ static int annotate_browser__run(struct annotate_browser *browser,
continue;
case 'r':
{
- script_browse(NULL);
+ script_browse(NULL, NULL);
continue;
}
case 'k':
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 663970f766e1..9c7b5fa9df39 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2343,6 +2343,7 @@ struct popup_action {
struct thread *thread;
struct map_symbol ms;
int socket;
+ struct perf_evsel *evsel;
int (*fn)(struct hist_browser *browser, struct popup_action *act);
};
@@ -2565,7 +2566,7 @@ do_run_script(struct hist_browser *browser __maybe_unused,
n += snprintf(script_opt + n, len - n, " --time %s,%s", start, end);
}
- script_browse(script_opt);
+ script_browse(script_opt, act->evsel);
free(script_opt);
return 0;
}
@@ -2574,7 +2575,7 @@ static int
add_script_opt_2(struct hist_browser *browser __maybe_unused,
struct popup_action *act, char **optstr,
struct thread *thread, struct symbol *sym,
- const char *tstr)
+ struct perf_evsel *evsel, const char *tstr)
{
if (thread) {
@@ -2592,6 +2593,7 @@ add_script_opt_2(struct hist_browser *browser __maybe_unused,
act->thread = thread;
act->ms.sym = sym;
+ act->evsel = evsel;
act->fn = do_run_script;
return 1;
}
@@ -2599,12 +2601,13 @@ add_script_opt_2(struct hist_browser *browser __maybe_unused,
static int
add_script_opt(struct hist_browser *browser,
struct popup_action *act, char **optstr,
- struct thread *thread, struct symbol *sym)
+ struct thread *thread, struct symbol *sym,
+ struct perf_evsel *evsel)
{
int n, j;
struct hist_entry *he;
- n = add_script_opt_2(browser, act, optstr, thread, sym, "");
+ n = add_script_opt_2(browser, act, optstr, thread, sym, evsel, "");
he = hist_browser__selected_entry(browser);
if (sort_order && strstr(sort_order, "time")) {
@@ -2617,10 +2620,9 @@ add_script_opt(struct hist_browser *browser,
sizeof tstr - j);
j += sprintf(tstr + j, "-");
timestamp__scnprintf_usec(he->time + symbol_conf.time_quantum,
- tstr + j,
- sizeof tstr - j);
+ tstr + j, sizeof tstr - j);
n += add_script_opt_2(browser, act, optstr, thread, sym,
- tstr);
+ evsel, tstr);
act->time = he->time;
}
return n;
@@ -3091,7 +3093,7 @@ static int perf_evsel__hists_browse(struct perf_evsel *evsel, int nr_events,
nr_options += add_script_opt(browser,
&actions[nr_options],
&options[nr_options],
- thread, NULL);
+ thread, NULL, evsel);
}
/*
* Note that browser->selection != NULL
@@ -3106,11 +3108,12 @@ static int perf_evsel__hists_browse(struct perf_evsel *evsel, int nr_events,
nr_options += add_script_opt(browser,
&actions[nr_options],
&options[nr_options],
- NULL, browser->selection->sym);
+ NULL, browser->selection->sym,
+ evsel);
}
}
nr_options += add_script_opt(browser, &actions[nr_options],
- &options[nr_options], NULL, NULL);
+ &options[nr_options], NULL, NULL, evsel);
nr_options += add_switch_opt(browser, &actions[nr_options],
&options[nr_options]);
skip_scripting:
diff --git a/tools/perf/ui/browsers/scripts.c b/tools/perf/ui/browsers/scripts.c
index 7f36630694bf..9e5f87558af6 100644
--- a/tools/perf/ui/browsers/scripts.c
+++ b/tools/perf/ui/browsers/scripts.c
@@ -17,36 +17,111 @@
*/
#define SCRIPT_FULLPATH_LEN 256
+struct script_config {
+ const char **names;
+ char **paths;
+ int index;
+ const char *perf;
+ char extra_format[256];
+};
+
+void attr_to_script(char *extra_format, struct perf_event_attr *attr)
+{
+ extra_format[0] = 0;
+ if (attr->read_format & PERF_FORMAT_GROUP)
+ strcat(extra_format, " -F +metric");
+ if (attr->sample_type & PERF_SAMPLE_BRANCH_STACK)
+ strcat(extra_format, " -F +brstackinsn --xed");
+ if (attr->sample_type & PERF_SAMPLE_REGS_INTR)
+ strcat(extra_format, " -F +iregs");
+ if (attr->sample_type & PERF_SAMPLE_REGS_USER)
+ strcat(extra_format, " -F +uregs");
+ if (attr->sample_type & PERF_SAMPLE_PHYS_ADDR)
+ strcat(extra_format, " -F +phys_addr");
+}
+
+static int add_script_option(const char *name, const char *opt,
+ struct script_config *c)
+{
+ c->names[c->index] = name;
+ if (asprintf(&c->paths[c->index],
+ "%s script %s -F +metric %s %s",
+ c->perf, opt, symbol_conf.inline_name ? " --inline" : "",
+ c->extra_format) < 0)
+ return -1;
+ c->index++;
+ return 0;
+}
+
/*
* When success, will copy the full path of the selected script
* into the buffer pointed by script_name, and return 0.
* Return -1 on failure.
*/
-static int list_scripts(char *script_name)
+static int list_scripts(char *script_name, bool *custom,
+ struct perf_evsel *evsel)
{
- char *buf, *names[SCRIPT_MAX_NO], *paths[SCRIPT_MAX_NO];
- int i, num, choice, ret = -1;
+ char *buf, *paths[SCRIPT_MAX_NO], *names[SCRIPT_MAX_NO];
+ int i, num, choice;
+ int ret = 0;
+ int max_std, custom_perf;
+ char pbuf[256];
+ const char *perf = perf_exe(pbuf, sizeof pbuf);
+ struct script_config scriptc = {
+ .names = (const char **)names,
+ .paths = paths,
+ .perf = perf
+ };
+
+ script_name[0] = 0;
/* Preset the script name to SCRIPT_NAMELEN */
buf = malloc(SCRIPT_MAX_NO * (SCRIPT_NAMELEN + SCRIPT_FULLPATH_LEN));
if (!buf)
- return ret;
+ return -1;
+
+ if (evsel)
+ attr_to_script(scriptc.extra_format, &evsel->attr);
+ add_script_option("Show individual samples", "", &scriptc);
+ add_script_option("Show individual samples with assembler", "-F +insn --xed",
+ &scriptc);
+ add_script_option("Show individual samples with source", "-F +srcline,+srccode",
+ &scriptc);
+ custom_perf = scriptc.index;
+ add_script_option("Show samples with custom perf script arguments", "", &scriptc);
+ i = scriptc.index;
+ max_std = i;
- for (i = 0; i < SCRIPT_MAX_NO; i++) {
- names[i] = buf + i * (SCRIPT_NAMELEN + SCRIPT_FULLPATH_LEN);
+ for (; i < SCRIPT_MAX_NO; i++) {
+ names[i] = buf + (i - max_std) * (SCRIPT_NAMELEN + SCRIPT_FULLPATH_LEN);
paths[i] = names[i] + SCRIPT_NAMELEN;
}
- num = find_scripts(names, paths);
- if (num > 0) {
- choice = ui__popup_menu(num, names);
- if (choice < num && choice >= 0) {
- strcpy(script_name, paths[choice]);
- ret = 0;
- }
+ num = find_scripts(names + max_std, paths + max_std);
+ if (num < 0)
+ num = 0;
+ choice = ui__popup_menu(num + max_std, (char * const *)names);
+ if (choice < 0) {
+ ret = -1;
+ goto out;
}
+ if (choice == custom_perf) {
+ char script_args[50];
+ int key = ui_browser__input_window("perf script command",
+ "Enter perf script command line (without perf script prefix)",
+ script_args, "", 0);
+ if (key != K_ENTER)
+ return -1;
+ sprintf(script_name, "%s script %s", perf, script_args);
+ } else if (choice < num + max_std) {
+ strcpy(script_name, paths[choice]);
+ }
+ *custom = choice >= max_std;
+out:
free(buf);
+ for (i = 0; i < max_std; i++)
+ free(paths[i]);
return ret;
}
@@ -66,27 +141,25 @@ static void run_script(char *cmd)
SLsmg_refresh();
}
-int script_browse(const char *script_opt)
+int script_browse(const char *script_opt, struct perf_evsel *evsel)
{
- char cmd[SCRIPT_FULLPATH_LEN*2], script_name[SCRIPT_FULLPATH_LEN];
+ char *cmd, script_name[SCRIPT_FULLPATH_LEN];
+ bool custom = false;
memset(script_name, 0, SCRIPT_FULLPATH_LEN);
- if (list_scripts(script_name))
+ if (list_scripts(script_name, &custom, evsel))
return -1;
- sprintf(cmd, "perf script -s %s ", script_name);
-
- if (script_opt)
- strcat(cmd, script_opt);
-
- if (input_name) {
- strcat(cmd, " -i ");
- strcat(cmd, input_name);
- }
-
- strcat(cmd, " 2>&1 | less");
+ if (asprintf(&cmd, "%s%s %s %s%s 2>&1 | less",
+ custom ? "perf script -s " : "",
+ script_name,
+ script_opt ? script_opt : "",
+ input_name ? "-i " : "",
+ input_name ? input_name : "") < 0)
+ return -1;
run_script(cmd);
+ free(cmd);
return 0;
}
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 6279eca56409..2113a6639cea 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -436,6 +436,8 @@ struct annotation_options;
#ifdef HAVE_SLANG_SUPPORT
#include "../ui/keysyms.h"
+void attr_to_script(char *buf, struct perf_event_attr *attr);
+
int map_symbol__tui_annotate(struct map_symbol *ms, struct perf_evsel *evsel,
struct hist_browser_timer *hbt,
struct annotation_options *annotation_opts);
@@ -450,7 +452,8 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
struct perf_env *env,
bool warn_lost_event,
struct annotation_options *annotation_options);
-int script_browse(const char *script_opt);
+
+int script_browse(const char *script_opt, struct perf_evsel *evsel);
#else
static inline
int perf_evlist__tui_browse_hists(struct perf_evlist *evlist __maybe_unused,
@@ -479,7 +482,8 @@ static inline int hist_entry__tui_annotate(struct hist_entry *he __maybe_unused,
return 0;
}
-static inline int script_browse(const char *script_opt __maybe_unused)
+static inline int script_browse(const char *script_opt __maybe_unused,
+ struct perf_evsel *evsel __maybe_unused)
{
return 0;
}
--
2.20.1
^ permalink raw reply related [flat|nested] 49+ messages in thread
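Note on attr_to_script() in patch 09/15 above: it maps bits in the event attributes to extra perf script options by appending with strcat() into a 256-byte buffer. A table-driven sketch with explicit bounds checking follows. The SK_* bits and flags_to_script_opts() are hypothetical stand-ins; the real code tests PERF_FORMAT_GROUP in attr->read_format and the PERF_SAMPLE_* bits in attr->sample_type, which this sketch flattens into a single mask for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag bits standing in for the perf_event_attr bits. */
enum {
	SK_GROUP	= 1 << 0,
	SK_BRANCH_STACK	= 1 << 1,
	SK_REGS_INTR	= 1 << 2,
	SK_REGS_USER	= 1 << 3,
	SK_PHYS_ADDR	= 1 << 4,
};

static const struct {
	unsigned int bit;
	const char *opt;
} sk_opts[] = {
	{ SK_GROUP,		" -F +metric" },
	{ SK_BRANCH_STACK,	" -F +brstackinsn --xed" },
	{ SK_REGS_INTR,		" -F +iregs" },
	{ SK_REGS_USER,		" -F +uregs" },
	{ SK_PHYS_ADDR,		" -F +phys_addr" },
};

/*
 * Append the option string of every set flag to buf, in table order.
 * Returns 0 on success, -1 if buf is too small.
 */
static int flags_to_script_opts(unsigned int flags, char *buf, size_t sz)
{
	size_t used = 0;
	size_t i;

	assert(buf && sz > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(sk_opts) / sizeof(sk_opts[0]); i++) {
		size_t len;

		if (!(flags & sk_opts[i].bit))
			continue;
		len = strlen(sk_opts[i].opt);
		if (used + len >= sz)	/* keep room for the trailing NUL */
			return -1;
		memcpy(buf + used, sk_opts[i].opt, len + 1);
		used += len;
	}
	return 0;
}
```

Keeping the mapping in a table means a new sample type needs one more row rather than one more if statement.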
* [PATCH v4 10/15] perf tools: Add utility function to print ns time stamps
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (8 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 09/15] perf tools report: Support builtin perf script in scripts menu Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-08 13:51 ` Arnaldo Carvalho de Melo
2019-03-22 22:00 ` [tip:perf/urgent] perf time-utils: Add utility function to print time stamps in nanoseconds tip-bot for Andi Kleen
2019-03-05 14:47 ` [PATCH v4 11/15] perf tools report: Implement browsing of individual samples Andi Kleen
` (5 subsequent siblings)
15 siblings, 2 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add a utility function to print nanosecond timestamps.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/util/time-utils.c | 8 ++++++++
tools/perf/util/time-utils.h | 1 +
2 files changed, 9 insertions(+)
diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 6193b46050a5..a63bdf4cfd1b 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -404,6 +404,14 @@ int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz)
return scnprintf(buf, sz, "%"PRIu64".%06"PRIu64, sec, usec);
}
+int timestamp__scnprintf_nsec(u64 timestamp, char *buf, size_t sz)
+{
+ u64 sec = timestamp / NSEC_PER_SEC;
+ u64 nsec = timestamp % NSEC_PER_SEC;
+
+ return scnprintf(buf, sz, "%"PRIu64".%09"PRIu64, sec, nsec);
+}
+
int fetch_current_timestamp(char *buf, size_t sz)
{
struct timeval tv;
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 70b177d2b98c..9266cf4a8e58 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -24,6 +24,7 @@ bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
int num, u64 timestamp);
int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);
+int timestamp__scnprintf_nsec(u64 timestamp, char *buf, size_t sz);
int fetch_current_timestamp(char *buf, size_t sz);
--
2.20.1
^ permalink raw reply related [flat|nested] 49+ messages in thread
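Note on timestamp__scnprintf_nsec() in patch 10/15 above: the function splits a nanosecond timestamp into seconds and a nine-digit remainder. A userspace sketch follows, assuming plain snprintf() instead of the perf tree's scnprintf(). The name fmt_nsec() is made up. The return value is clamped to the number of characters actually stored, which is how scnprintf() differs from snprintf().

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Format a nanosecond timestamp as "sec.nnnnnnnnn".
 * Returns the number of characters stored in buf, excluding the NUL,
 * even when the output had to be truncated.
 */
static int fmt_nsec(uint64_t timestamp, char *buf, size_t sz)
{
	uint64_t sec = timestamp / NSEC_PER_SEC;
	uint64_t nsec = timestamp % NSEC_PER_SEC;
	int n;

	assert(buf && sz > 0);
	n = snprintf(buf, sz, "%" PRIu64 ".%09" PRIu64, sec, nsec);
	if (n < 0)
		return 0;
	return (size_t)n >= sz ? (int)(sz - 1) : n;
}
```

The clamped return value is what makes chained appends of the form `j += fmt(buf + j, size - j)` safe, as used with the usec variant in hists.c.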
* Re: [PATCH v4 10/15] perf tools: Add utility function to print ns time stamps
2019-03-05 14:47 ` [PATCH v4 10/15] perf tools: Add utility function to print ns time stamps Andi Kleen
@ 2019-03-08 13:51 ` Arnaldo Carvalho de Melo
2019-03-22 22:00 ` [tip:perf/urgent] perf time-utils: Add utility function to print time stamps in nanoseconds tip-bot for Andi Kleen
1 sibling, 0 replies; 49+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-03-08 13:51 UTC (permalink / raw)
To: Andi Kleen; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
Em Tue, Mar 05, 2019 at 06:47:53AM -0800, Andi Kleen escreveu:
> From: Andi Kleen <ak@linux.intel.com>
>
> Add a utility function to print nanosecond timestamps.
Applied.
- Arnaldo
^ permalink raw reply [flat|nested] 49+ messages in thread
* [tip:perf/urgent] perf time-utils: Add utility function to print time stamps in nanoseconds
2019-03-05 14:47 ` [PATCH v4 10/15] perf tools: Add utility function to print ns time stamps Andi Kleen
2019-03-08 13:51 ` Arnaldo Carvalho de Melo
@ 2019-03-22 22:00 ` tip-bot for Andi Kleen
1 sibling, 0 replies; 49+ messages in thread
From: tip-bot for Andi Kleen @ 2019-03-22 22:00 UTC (permalink / raw)
To: linux-tip-commits
Cc: namhyung, hpa, tglx, acme, linux-kernel, ak, mingo, jolsa
Commit-ID: f8c856cb2c947f4fad0a2dff5e95cdcddb801303
Gitweb: https://git.kernel.org/tip/f8c856cb2c947f4fad0a2dff5e95cdcddb801303
Author: Andi Kleen <ak@linux.intel.com>
AuthorDate: Tue, 5 Mar 2019 06:47:53 -0800
Committer: Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Mon, 11 Mar 2019 11:56:02 -0300
perf time-utils: Add utility function to print time stamps in nanoseconds
Add a utility function to print nanosecond timestamps.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20190305144758.12397-11-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
tools/perf/util/time-utils.c | 8 ++++++++
tools/perf/util/time-utils.h | 1 +
2 files changed, 9 insertions(+)
diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 0f53baec660e..20663a460df3 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -453,6 +453,14 @@ int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz)
return scnprintf(buf, sz, "%"PRIu64".%06"PRIu64, sec, usec);
}
+int timestamp__scnprintf_nsec(u64 timestamp, char *buf, size_t sz)
+{
+ u64 sec = timestamp / NSEC_PER_SEC,
+ nsec = timestamp % NSEC_PER_SEC;
+
+ return scnprintf(buf, sz, "%" PRIu64 ".%09" PRIu64, sec, nsec);
+}
+
int fetch_current_timestamp(char *buf, size_t sz)
{
struct timeval tv;
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index b923de44e36f..72a42ea1d513 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -30,6 +30,7 @@ int perf_time__parse_for_ranges(const char *str, struct perf_session *session,
int *range_size, int *range_num);
int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);
+int timestamp__scnprintf_nsec(u64 timestamp, char *buf, size_t sz);
int fetch_current_timestamp(char *buf, size_t sz);
^ permalink raw reply related [flat|nested] 49+ messages in thread
* [PATCH v4 11/15] perf tools report: Implement browsing of individual samples
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (9 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 10/15] perf tools: Add utility function to print ns time stamps Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-05 14:47 ` [PATCH v4 12/15] perf tools: Add some new tips describing the new options Andi Kleen
` (4 subsequent siblings)
15 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Now report can show whole time periods with perf script,
but the user still has to find individual samples of interest
manually.
It would be expensive and complicated to search for the
right samples in the whole perf file. Typically users
only need to look at a small number of samples for useful
analysis.
Also the full scripts tend to show samples of all CPUs and all
threads mixed up, which can be very confusing on larger systems.
Add a new --samples option to save a small random number of samples
per hist entry.
Use a reservoir sampling technique to select a representative
number of samples.
Then allow browsing the samples using perf script
as part of the hist entry context menu. This automatically
adds the right filters, so only the thread or cpu of the sample
is displayed. Then we use less' search functionality
to jump directly to the time stamp of the selected
sample.
It uses different menus for assembler and source display.
Assembler needs xed installed and source needs debuginfo.
Currently it only supports as many samples as fit on
the screen due to some limitations in the slang ui code.
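
Note on the reservoir sampling mentioned above: the textbook form (Algorithm R) keeps the first N items and then replaces a random slot with probability N/seen for each later item, so every item ends up in the reservoir with equal probability without the total being known in advance. A minimal sketch follows. It is not necessarily how the patch implements it; struct reservoir, toy_rand() and the fixed RES_MAX are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RES_MAX 8

struct reservoir {
	uint64_t samples[RES_MAX];	/* kept samples (here just timestamps) */
	size_t cap;			/* how many to keep, at most RES_MAX */
	size_t num;			/* slots filled so far */
	uint64_t seen;			/* total number of items offered */
	uint32_t rng;			/* state of the toy generator below */
};

/* Toy linear congruential generator; only meant to keep the sketch small. */
static uint32_t toy_rand(uint32_t *state)
{
	*state = *state * 1664525u + 1013904223u;
	return *state >> 8;
}

static void reservoir_init(struct reservoir *r, size_t cap, uint32_t seed)
{
	assert(r && cap > 0 && cap <= RES_MAX);
	r->cap = cap;
	r->num = 0;
	r->seen = 0;
	r->rng = seed;
}

static void reservoir_add(struct reservoir *r, uint64_t sample)
{
	uint64_t j;

	r->seen++;
	if (r->num < r->cap) {
		/* still filling up: keep everything */
		r->samples[r->num++] = sample;
		return;
	}
	/* draw j in [0, seen); the item survives only if j hits a slot */
	j = toy_rand(&r->rng) % r->seen;
	if (j < r->cap)
		r->samples[j] = sample;
}
```

The modulo draw is slightly biased for large counts; a real implementation would use a better random source.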
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
v2:
Free names on error path
Pass --inline and --show-*-event to child perf as needed.
v3:
Don't use num_res to keep track of total samples
Don't pass -1 cpus to perf script.
Don't pass extra event option with tid filtering.
Fix sampling bug.
---
tools/perf/Documentation/perf-report.txt | 4 ++
tools/perf/builtin-report.c | 2 +
tools/perf/ui/browsers/Build | 1 +
tools/perf/ui/browsers/hists.c | 45 +++++++++++++
tools/perf/ui/browsers/res_sample.c | 81 ++++++++++++++++++++++++
tools/perf/ui/browsers/scripts.c | 2 +-
tools/perf/util/hist.c | 36 +++++++++++
tools/perf/util/hist.h | 19 ++++++
tools/perf/util/sort.h | 8 +++
tools/perf/util/symbol.c | 1 +
tools/perf/util/symbol_conf.h | 1 +
11 files changed, 199 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/ui/browsers/res_sample.c
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 546d87221ad8..f441baa794ce 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -461,6 +461,10 @@ include::itrace.txt[]
--socket-filter::
Only report the samples on the processor socket that match with this filter
+--samples=N::
+ Save N individual samples for each histogram entry to show context in perf
+ report tui browser.
+
--raw-trace::
When displaying traceevent output, do not use print fmt or plugins.
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index c19952072a3a..05ca1378132f 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -1158,6 +1158,8 @@ int cmd_report(int argc, const char **argv)
OPT_BOOLEAN(0, "demangle-kernel", &symbol_conf.demangle_kernel,
"Enable kernel symbol demangling"),
OPT_BOOLEAN(0, "mem-mode", &report.mem_mode, "mem access profile"),
+ OPT_INTEGER(0, "samples", &symbol_conf.res_sample,
+ "Number of samples to save per histogram entry for individual browsing"),
OPT_CALLBACK(0, "percent-limit", &report, "percent",
"Don't show entries under that percent", parse_percent_limit),
OPT_CALLBACK(0, "percentage", NULL, "relative|absolute",
diff --git a/tools/perf/ui/browsers/Build b/tools/perf/ui/browsers/Build
index 8fee56b46502..fdf86f7981ca 100644
--- a/tools/perf/ui/browsers/Build
+++ b/tools/perf/ui/browsers/Build
@@ -3,6 +3,7 @@ perf-y += hists.o
perf-y += map.o
perf-y += scripts.o
perf-y += header.o
+perf-y += res_sample.o
CFLAGS_annotate.o += -DENABLE_SLFUTURE_CONST
CFLAGS_hists.o += -DENABLE_SLFUTURE_CONST
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 9c7b5fa9df39..364f5c577d00 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2344,6 +2344,7 @@ struct popup_action {
struct map_symbol ms;
int socket;
struct perf_evsel *evsel;
+ enum rstype rstype;
int (*fn)(struct hist_browser *browser, struct popup_action *act);
};
@@ -2571,6 +2572,17 @@ do_run_script(struct hist_browser *browser __maybe_unused,
return 0;
}
+static int
+do_res_sample_script(struct hist_browser *browser __maybe_unused,
+ struct popup_action *act)
+{
+ struct hist_entry *he;
+
+ he = hist_browser__selected_entry(browser);
+ res_sample_browse(he->res_samples, he->num_res, act->evsel, act->rstype);
+ return 0;
+}
+
static int
add_script_opt_2(struct hist_browser *browser __maybe_unused,
struct popup_action *act, char **optstr,
@@ -2628,6 +2640,27 @@ add_script_opt(struct hist_browser *browser,
return n;
}
+static int
+add_res_sample_opt(struct hist_browser *browser __maybe_unused,
+ struct popup_action *act, char **optstr,
+ struct res_sample *res_sample,
+ struct perf_evsel *evsel,
+ enum rstype type)
+{
+ if (!res_sample)
+ return 0;
+
+ if (asprintf(optstr, "Show context for individual samples %s",
+ type == A_ASM ? "with assembler" :
+ type == A_SOURCE ? "with source" : "") < 0)
+ return 0;
+
+ act->fn = do_res_sample_script;
+ act->evsel = evsel;
+ act->rstype = type;
+ return 1;
+}
+
static int
do_switch_data(struct hist_browser *browser __maybe_unused,
struct popup_action *act __maybe_unused)
@@ -3114,6 +3147,18 @@ static int perf_evsel__hists_browse(struct perf_evsel *evsel, int nr_events,
}
nr_options += add_script_opt(browser, &actions[nr_options],
&options[nr_options], NULL, NULL, evsel);
+ nr_options += add_res_sample_opt(browser, &actions[nr_options],
+ &options[nr_options],
+ hist_browser__selected_entry(browser)->res_samples,
+ evsel, A_NORMAL);
+ nr_options += add_res_sample_opt(browser, &actions[nr_options],
+ &options[nr_options],
+ hist_browser__selected_entry(browser)->res_samples,
+ evsel, A_ASM);
+ nr_options += add_res_sample_opt(browser, &actions[nr_options],
+ &options[nr_options],
+ hist_browser__selected_entry(browser)->res_samples,
+ evsel, A_SOURCE);
nr_options += add_switch_opt(browser, &actions[nr_options],
&options[nr_options]);
skip_scripting:
diff --git a/tools/perf/ui/browsers/res_sample.c b/tools/perf/ui/browsers/res_sample.c
new file mode 100644
index 000000000000..8b045482744e
--- /dev/null
+++ b/tools/perf/ui/browsers/res_sample.c
@@ -0,0 +1,81 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Display a menu with individual samples to browse with perf script */
+#include "util.h"
+#include "hist.h"
+#include "evsel.h"
+#include "hists.h"
+#include "sort.h"
+#include "time-utils.h"
+
+/* In ns. Could make configurable. */
+#define CONTEXT_LEN 100000000
+
+int res_sample_browse(struct res_sample *res_samples, int num_res,
+ struct perf_evsel *evsel, enum rstype rstype)
+{
+ char **names;
+ int i, n;
+ int choice;
+ char *cmd;
+ char pbuf[256], tidbuf[32], cpubuf[32];
+ const char *perf = perf_exe(pbuf, sizeof pbuf);
+ char trange[128], tsample[64];
+ struct res_sample *r;
+ char extra_format[256];
+
+ /* For now since ui__popup_menu doesn't like lists that don't fit */
+ num_res = max(min(SLtt_Screen_Rows - 4, num_res), 0);
+
+ names = calloc(num_res, sizeof(char *));
+ if (!names)
+ return -1;
+ for (i = 0; i < num_res; i++) {
+ char tbuf[64];
+
+ timestamp__scnprintf_nsec(res_samples[i].time, tbuf, sizeof tbuf);
+ if (asprintf(&names[i], "%s: CPU %d tid %d", tbuf,
+ res_samples[i].cpu, res_samples[i].tid) < 0) {
+ while (--i >= 0)
+ free(names[i]);
+ free(names);
+ return -1;
+ }
+ }
+ choice = ui__popup_menu(num_res, names);
+ for (i = 0; i < num_res; i++)
+ free(names[i]);
+ free(names);
+
+ if (choice < 0 || choice >= num_res)
+ return -1;
+ r = &res_samples[choice];
+
+ n = timestamp__scnprintf_nsec(r->time - CONTEXT_LEN, trange, sizeof trange);
+ trange[n++] = ',';
+ timestamp__scnprintf_nsec(r->time + CONTEXT_LEN, trange + n, sizeof trange - n);
+
+ timestamp__scnprintf_nsec(r->time, tsample, sizeof tsample);
+
+ attr_to_script(extra_format, &evsel->attr);
+
+ if (asprintf(&cmd, "%s script %s%s --time %s %s%s %s%s --ns %s %s %s %s %s | less +/%s",
+ perf,
+ input_name ? "-i " : "",
+ input_name ? input_name : "",
+ trange,
+ r->cpu >= 0 ? "--cpu " : "",
+ r->cpu >= 0 ? (sprintf(cpubuf, "%d", r->cpu), cpubuf) : "",
+ r->tid ? "--tid " : "",
+ r->tid ? (sprintf(tidbuf, "%d", r->tid), tidbuf) : "",
+ extra_format,
+ rstype == A_ASM ? "-F +insn --xed" :
+ rstype == A_SOURCE ? "-F +srcline,+srccode" : "",
+ symbol_conf.inline_name ? "--inline" : "",
+ "--show-lost-events ",
+ r->tid ? "--show-switch-events --show-task-events " : "",
+ tsample) < 0)
+ return -1;
+ run_script(cmd);
+ free(cmd);
+ return 0;
+}
diff --git a/tools/perf/ui/browsers/scripts.c b/tools/perf/ui/browsers/scripts.c
index 9e5f87558af6..cdba58447b85 100644
--- a/tools/perf/ui/browsers/scripts.c
+++ b/tools/perf/ui/browsers/scripts.c
@@ -125,7 +125,7 @@ static int list_scripts(char *script_name, bool *custom,
return ret;
}
-static void run_script(char *cmd)
+void run_script(char *cmd)
{
pr_debug("Running %s\n", cmd);
SLang_reset_tty();
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6040eb49ea23..1460450f47c4 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -444,6 +444,14 @@ static int hist_entry__init(struct hist_entry *he,
return -ENOMEM;
}
}
+
+ if (symbol_conf.res_sample) {
+ he->res_samples = calloc(sizeof(struct res_sample),
+ symbol_conf.res_sample);
+ if (!he->res_samples)
+ return -ENOMEM;
+ }
+
INIT_LIST_HEAD(&he->pairs.node);
thread__get(he->thread);
he->hroot_in = RB_ROOT_CACHED;
@@ -593,6 +601,32 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
return he;
}
+static unsigned random_max(unsigned high)
+{
+ unsigned thresh = -high % high;
+ for (;;) {
+ unsigned r = random();
+ if (r >= thresh)
+ return r % high;
+ }
+}
+
+static void hists__res_sample(struct hist_entry *he, struct perf_sample *sample)
+{
+ struct res_sample *r;
+ int j;
+
+ if (he->num_res < symbol_conf.res_sample) {
+ j = he->num_res++;
+ } else {
+ j = random_max(symbol_conf.res_sample);
+ }
+ r = &he->res_samples[j];
+ r->time = sample->time;
+ r->cpu = sample->cpu;
+ r->tid = sample->tid;
+}
+
static struct hist_entry*
__hists__add_entry(struct hists *hists,
struct addr_location *al,
@@ -640,6 +674,8 @@ __hists__add_entry(struct hists *hists,
if (!hists->has_callchains && he && he->callchain_size != 0)
hists->has_callchains = true;
+ if (he && symbol_conf.res_sample)
+ hists__res_sample(he, sample);
return he;
}
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 2113a6639cea..67b9b91e92df 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -433,6 +433,13 @@ struct hist_browser_timer {
};
struct annotation_options;
+struct res_sample;
+
+enum rstype {
+ A_NORMAL,
+ A_ASM,
+ A_SOURCE
+};
#ifdef HAVE_SLANG_SUPPORT
#include "../ui/keysyms.h"
@@ -454,6 +461,10 @@ int perf_evlist__tui_browse_hists(struct perf_evlist *evlist, const char *help,
struct annotation_options *annotation_options);
int script_browse(const char *script_opt, struct perf_evsel *evsel);
+
+void run_script(char *cmd);
+int res_sample_browse(struct res_sample *res_samples, int num_res,
+ struct perf_evsel *evsel, enum rstype rstype);
#else
static inline
int perf_evlist__tui_browse_hists(struct perf_evlist *evlist __maybe_unused,
@@ -488,6 +499,14 @@ static inline int script_browse(const char *script_opt __maybe_unused,
return 0;
}
+static inline int res_sample_browse(struct res_sample *res_samples __maybe_unused,
+ int num_res __maybe_unused,
+ struct perf_evsel *evsel __maybe_unused,
+ enum rstype rstype __maybe_unused)
+{
+ return 0;
+}
+
#define K_LEFT -1000
#define K_RIGHT -2000
#define K_SWITCH_INPUT_DATA -3000
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 19dceb7f6145..bb9442ab7a0c 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -47,6 +47,12 @@ extern struct sort_entry sort_srcline;
extern enum sort_type sort__first_dimension;
extern const char default_mem_sort_order[];
+struct res_sample {
+ u64 time;
+ int cpu;
+ int tid;
+};
+
struct he_stat {
u64 period;
u64 period_sys;
@@ -140,6 +146,8 @@ struct hist_entry {
struct mem_info *mem_info;
void *raw_data;
u32 raw_size;
+ int num_res;
+ struct res_sample *res_samples;
void *trace_output;
struct perf_hpp_list *hpp_list;
struct hist_entry *parent_he;
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 0f80743a1c25..754f7f2c1650 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -50,6 +50,7 @@ struct symbol_conf symbol_conf = {
.symfs = "",
.event_group = true,
.inline_name = true,
+ .res_sample = 0,
};
static enum dso_binary_type binary_type_symtab[] = {
diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index a5684a71b78e..6c55fa6fccec 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -68,6 +68,7 @@ struct symbol_conf {
struct intlist *pid_list,
*tid_list;
const char *symfs;
+ int res_sample;
};
extern struct symbol_conf symbol_conf;
--
2.20.1
* [PATCH v4 12/15] perf tools: Add some new tips describing the new options
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (10 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 11/15] perf tools report: Implement browsing of individual samples Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-05 14:47 ` [PATCH v4 13/15] perf tools report: Add custom scripts to script menu Andi Kleen
` (3 subsequent siblings)
15 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
v2:
Even more tips.
---
tools/perf/Documentation/tips.txt | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tools/perf/Documentation/tips.txt b/tools/perf/Documentation/tips.txt
index 849599f39c5e..3cb9a59c58ba 100644
--- a/tools/perf/Documentation/tips.txt
+++ b/tools/perf/Documentation/tips.txt
@@ -15,6 +15,7 @@ To see callchains in a more compact form: perf report -g folded
Show individual samples with: perf script
Limit to show entries above 5% only: perf report --percent-limit 5
Profiling branch (mis)predictions with: perf record -b / perf report
+To show assembler sample contexts use perf record -b / perf script -F +brstackinsn --xed
Treat branches as callchains: perf report --branch-history
To count events in every 1000 msec: perf stat -I 1000
Print event counts in CSV format with: perf stat -x,
@@ -34,3 +35,8 @@ Show current config key-value pairs: perf config --list
Show user configuration overrides: perf config --user --list
To add Node.js USDT(User-Level Statically Defined Tracing): perf buildid-cache --add `which node`
To report cacheline events from previous recording: perf c2c report
+To browse sample contexts use perf report --samples 10 and select in context menu
+To separate samples by time use perf report --sort time,overhead,sym
+To set sample time separation other than 100ms with --sort time use --time-quantum
+Add -I to perf record to sample register values, which will be visible in perf report sample context.
+To show IPC for sampling periods use perf record -e '{cycles,instructions}:S' and then browse context
--
2.20.1
* [PATCH v4 13/15] perf tools report: Add custom scripts to script menu
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (11 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 12/15] perf tools: Add some new tips describing the new options Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-05 14:47 ` [PATCH v4 14/15] perf tools script: Add array bound checking to list_scripts Andi Kleen
` (2 subsequent siblings)
15 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Add a way to define custom scripts through ~/.perfconfig, which
are then added to the scripts menu. The scripts get the same
arguments as perf script, in particular -i, --cpu, --tid.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/Documentation/perf-config.txt | 8 ++++++++
tools/perf/ui/browsers/scripts.c | 20 ++++++++++++++++++++
2 files changed, 28 insertions(+)
diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index 86f3dcc15f83..dc1009e334e8 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -584,6 +584,14 @@ llvm.*::
llvm.opts::
Options passed to llc.
+script.*::
+
+ Any option defines a script that is added to the scripts menu
+ in the interactive perf browser and whose output is displayed.
+ The option name is the menu entry name; the value is the script command line.
+ The script is passed the same options as a full perf script,
+ in particular -i perfdata file, --cpu, --tid.
+
SEE ALSO
--------
linkperf:perf[1]
diff --git a/tools/perf/ui/browsers/scripts.c b/tools/perf/ui/browsers/scripts.c
index cdba58447b85..5d407ab834b5 100644
--- a/tools/perf/ui/browsers/scripts.c
+++ b/tools/perf/ui/browsers/scripts.c
@@ -6,6 +6,7 @@
#include "../../util/symbol.h"
#include "../browser.h"
#include "../libslang.h"
+#include "config.h"
#define SCRIPT_NAMELEN 128
#define SCRIPT_MAX_NO 64
@@ -53,6 +54,24 @@ static int add_script_option(const char *name, const char *opt,
return 0;
}
+static int scripts_config(const char *var, const char *value, void *data)
+{
+ struct script_config *c = data;
+
+ if (!strstarts(var, "script."))
+ return -1;
+ if (c->index >= SCRIPT_MAX_NO)
+ return -1;
+ c->names[c->index] = strdup(var + 7);
+ if (!c->names[c->index])
+ return -1;
+ if (asprintf(&c->paths[c->index], "%s %s", value,
+ c->extra_format) < 0)
+ return -1;
+ c->index++;
+ return 0;
+}
+
/*
* When success, will copy the full path of the selected script
* into the buffer pointed by script_name, and return 0.
@@ -87,6 +106,7 @@ static int list_scripts(char *script_name, bool *custom,
&scriptc);
add_script_option("Show individual samples with source", "-F +srcline,+srccode",
&scriptc);
+ perf_config(scripts_config, &scriptc);
custom_perf = scriptc.index;
add_script_option("Show samples with custom perf script arguments", "", &scriptc);
i = scriptc.index;
--
2.20.1
* [PATCH v4 14/15] perf tools script: Add array bound checking to list_scripts
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (12 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 13/15] perf tools report: Add custom scripts to script menu Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-05 14:47 ` [PATCH v4 15/15] perf tools ui: Fix ui popup browser for many entries Andi Kleen
2019-03-07 10:57 ` Support sample context in perf report Jiri Olsa
15 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Don't overflow array when the scripts directory is too large,
or the script file name is too long.
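A rough sketch of the kind of bounds checking meant here, not the patched find_scripts() itself. The helper add_path() and its parameters are hypothetical; it also treats a truncated path as an error by checking the snprintf() return value, which the patch below does not do.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "dir/name" into paths[idx].
 * Returns 0 on success, -1 if the table is full (idx >= num)
 * or the result would not fit into pathlen bytes.
 */
static int add_path(char **paths, int idx, int num, size_t pathlen,
		    const char *dir, const char *name)
{
	int n;

	if (idx >= num)
		return -1;
	/* snprintf() never writes more than pathlen bytes, NUL included. */
	n = snprintf(paths[idx], pathlen, "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= pathlen)
		return -1;	/* output error or truncated */
	return 0;
}
```

The caller passes the remaining capacity of the table and the size of each path buffer, just as the patched find_scripts() takes num and pathlen.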
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/builtin-script.c | 8 ++++++--
tools/perf/builtin.h | 3 ++-
tools/perf/ui/browsers/scripts.c | 3 ++-
3 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index d5e819b68970..652c114f8d40 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -2983,7 +2983,8 @@ static int check_ev_match(char *dir_name, char *scriptname,
* will list all statically runnable scripts, select one, execute it and
* show the output in a perf browser.
*/
-int find_scripts(char **scripts_array, char **scripts_path_array)
+int find_scripts(char **scripts_array, char **scripts_path_array, int num,
+ int pathlen)
{
struct dirent *script_dirent, *lang_dirent;
char scripts_path[MAXPATHLEN], lang_path[MAXPATHLEN];
@@ -3028,7 +3029,10 @@ int find_scripts(char **scripts_array, char **scripts_path_array)
/* Skip those real time scripts: xxxtop.p[yl] */
if (strstr(script_dirent->d_name, "top."))
continue;
- sprintf(scripts_path_array[i], "%s/%s", lang_path,
+ if (i >= num)
+ break;
+ snprintf(scripts_path_array[i], pathlen, "%s/%s",
+ lang_path,
script_dirent->d_name);
temp = strchr(script_dirent->d_name, '.');
snprintf(scripts_array[i],
diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h
index 05745f3ce912..999fe9170122 100644
--- a/tools/perf/builtin.h
+++ b/tools/perf/builtin.h
@@ -40,5 +40,6 @@ int cmd_mem(int argc, const char **argv);
int cmd_data(int argc, const char **argv);
int cmd_ftrace(int argc, const char **argv);
-int find_scripts(char **scripts_array, char **scripts_path_array);
+int find_scripts(char **scripts_array, char **scripts_path_array, int num,
+ int pathlen);
#endif
diff --git a/tools/perf/ui/browsers/scripts.c b/tools/perf/ui/browsers/scripts.c
index 5d407ab834b5..6759d6657e1a 100644
--- a/tools/perf/ui/browsers/scripts.c
+++ b/tools/perf/ui/browsers/scripts.c
@@ -117,7 +117,8 @@ static int list_scripts(char *script_name, bool *custom,
paths[i] = names[i] + SCRIPT_NAMELEN;
}
- num = find_scripts(names + max_std, paths + max_std);
+ num = find_scripts(names + max_std, paths + max_std, SCRIPT_MAX_NO - max_std,
+ SCRIPT_FULLPATH_LEN);
if (num < 0)
num = 0;
choice = ui__popup_menu(num + max_std, (char * const *)names);
--
2.20.1
* [PATCH v4 15/15] perf tools ui: Fix ui popup browser for many entries
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (13 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 14/15] perf tools script: Add array bound checking to list_scripts Andi Kleen
@ 2019-03-05 14:47 ` Andi Kleen
2019-03-07 10:57 ` Support sample context in perf report Jiri Olsa
15 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-05 14:47 UTC (permalink / raw)
To: acme; +Cc: jolsa, namhyung, linux-kernel, linux-perf-users, Andi Kleen
From: Andi Kleen <ak@linux.intel.com>
Fix the argv ui browser code to correctly display more entries
than fit on the screen without crashing. The problem was some type
confusion with pointer types in the ->seek function. Do
the argv arithmetic correctly with char ** pointers. Also
add some asserts to find overruns and limit the display function
correctly.
Then finally remove a workaround for this in the res sample
browser.
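To illustrate the pointer arithmetic issue, here is a stripped-down sketch, not the real ui_browser code. struct argv_view and argv_view_seek() are hypothetical stand-ins; unlike the patched function, this sketch honours the offset for SEEK_SET too. The point is that the void * members are converted to char ** before any arithmetic, so one step moves by one entry rather than by one byte (GNU C treats void * arithmetic as byte-sized).

```c
#include <assert.h>
#include <stdio.h>	/* SEEK_SET, SEEK_CUR, SEEK_END */

/* Hypothetical stand-in for the relevant struct ui_browser members. */
struct argv_view {
	void *entries;		/* really a char ** array */
	void *top;		/* first entry shown, also char ** */
	unsigned nr_entries;
};

static void argv_view_seek(struct argv_view *v, long offset, int whence)
{
	char **base = v->entries;
	char **top = v->top;
	long idx;

	switch (whence) {
	case SEEK_SET:
		idx = offset;
		break;
	case SEEK_CUR:
		/* Difference of two char ** counts entries, not bytes. */
		idx = (long)(top - base) + offset;
		break;
	case SEEK_END:
		idx = (long)v->nr_entries - 1 + offset;
		break;
	default:
		return;
	}
	/* Catch overruns before forming the new pointer. */
	assert(idx >= 0 && idx < (long)v->nr_entries);
	v->top = base + idx;
}
```

Computing an index first and asserting on it avoids ever forming a pointer outside the array.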
Signed-off-by: Andi Kleen <ak@linux.intel.com>
---
tools/perf/ui/browser.c | 10 +++++++---
tools/perf/ui/browsers/res_sample.c | 3 ---
2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c
index 4f75561424ed..4ad37d8c7d6a 100644
--- a/tools/perf/ui/browser.c
+++ b/tools/perf/ui/browser.c
@@ -611,14 +611,16 @@ void ui_browser__argv_seek(struct ui_browser *browser, off_t offset, int whence)
browser->top = browser->entries;
break;
case SEEK_CUR:
- browser->top = browser->top + browser->top_idx + offset;
+ browser->top = (char **)browser->top + offset;
break;
case SEEK_END:
- browser->top = browser->top + browser->nr_entries - 1 + offset;
+ browser->top = (char **)browser->entries + browser->nr_entries - 1 + offset;
break;
default:
return;
}
+ assert((char **)browser->top < (char **)browser->entries + browser->nr_entries);
+ assert((char **)browser->top >= (char **)browser->entries);
}
unsigned int ui_browser__argv_refresh(struct ui_browser *browser)
@@ -630,7 +632,9 @@ unsigned int ui_browser__argv_refresh(struct ui_browser *browser)
browser->top = browser->entries;
pos = (char **)browser->top;
- while (idx < browser->nr_entries) {
+ while (idx < browser->nr_entries &&
+ row < (unsigned)SLtt_Screen_Rows - 1) {
+ assert(pos < (char **)browser->entries + browser->nr_entries);
if (!browser->filter || !browser->filter(browser, *pos)) {
ui_browser__gotorc(browser, row, 0);
browser->write(browser, pos, row);
diff --git a/tools/perf/ui/browsers/res_sample.c b/tools/perf/ui/browsers/res_sample.c
index 8b045482744e..62775efc32b4 100644
--- a/tools/perf/ui/browsers/res_sample.c
+++ b/tools/perf/ui/browsers/res_sample.c
@@ -23,9 +23,6 @@ int res_sample_browse(struct res_sample *res_samples, int num_res,
struct res_sample *r;
char extra_format[256];
- /* For now since ui__popup_menu doesn't like lists that don't fit */
- num_res = max(min(SLtt_Screen_Rows - 4, num_res), 0);
-
names = calloc(num_res, sizeof(char *));
if (!names)
return -1;
--
2.20.1
* Re: Support sample context in perf report
2019-03-05 14:47 Support sample context in perf report Andi Kleen
` (14 preceding siblings ...)
2019-03-05 14:47 ` [PATCH v4 15/15] perf tools ui: Fix ui popup browser for many entries Andi Kleen
@ 2019-03-07 10:57 ` Jiri Olsa
2019-03-07 16:57 ` Andi Kleen
15 siblings, 1 reply; 49+ messages in thread
From: Jiri Olsa @ 2019-03-07 10:57 UTC (permalink / raw)
To: Andi Kleen; +Cc: acme, jolsa, namhyung, linux-kernel, linux-perf-users
On Tue, Mar 05, 2019 at 06:47:43AM -0800, Andi Kleen wrote:
> [Changes:
> v4:
> Address review comments.
> Fix --cpu filtering.
> Fix a sampling bug.
> Add support for configuring custom script menu entries in perfconfig.
> Fix display of more samples than fit on screen.
> Fix some buffer overruns in legacy code.
> Add more tips
hi,
getting gcc error on your branch, similar like last time:
ui/browsers/hists.c: In function ‘perf_evsel__hists_browse’:
ui/browsers/hists.c:2567:8: error: ‘%s’ directive output may be truncated writing up to 63 bytes into a region of size between 28 and 91 [-Werror=format-truncation=]
n += snprintf(script_opt + n, len - n, " --time %s,%s", start, end);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/stdio.h:862,
from ui/browsers/hists.c:5:
/usr/include/bits/stdio2.h:64:10: note: ‘__builtin___snprintf_chk’ output between 10 and 136 bytes into a destination of size 100
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
jirka
* Re: Support sample context in perf report
2019-03-07 10:57 ` Support sample context in perf report Jiri Olsa
@ 2019-03-07 16:57 ` Andi Kleen
0 siblings, 0 replies; 49+ messages in thread
From: Andi Kleen @ 2019-03-07 16:57 UTC (permalink / raw)
To: Jiri Olsa
Cc: Andi Kleen, acme, jolsa, namhyung, linux-kernel, linux-perf-users
On Thu, Mar 07, 2019 at 11:57:43AM +0100, Jiri Olsa wrote:
> On Tue, Mar 05, 2019 at 06:47:43AM -0800, Andi Kleen wrote:
> > [Changes:
> > v4:
> > Address review comments.
> > Fix --cpu filtering.
> > Fix a sampling bug.
> > Add support for configuring custom script menu entries in perfconfig.
> > Fix display of more samples than fit on screen.
> > Fix some buffer overruns in legacy code.
> > Add more tips
>
> hi,
> getting gcc error on your branch, similar like last time:
Okay I figured out now why I'm not seeing this. It's because my perf
builds are with DEBUG=1, and Makefile.config has this insanity:
ifeq ($(DEBUG),0)
ifeq ($(feature-fortify-source), 1)
CFLAGS += -D_FORTIFY_SOURCE=2
endif
endif
Anyways the warnings are false positives because the strings can never
be that big. In fact I think they're harmful because it discourages
adding safety margins to stack buffers.
Anyways you can use this patch as a workaround.
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 364f5c577d00..3e795bd1279d 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -2554,7 +2554,7 @@ do_run_script(struct hist_browser *browser __maybe_unused,
}
if (act->time) {
- char start[64], end[64];
+ char start[32], end[32];
unsigned long starttime = act->time;
unsigned long endtime = act->time + symbol_conf.time_quantum;