* [PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE
@ 2020-08-06  2:40 Leo Yan
  2020-08-06  2:40 ` [PATCH v1 01/11] perf mem: Search event name with more flexible path Leo Yan
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: Leo Yan @ 2020-08-06  2:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim, Nick Desaulniers,
	Naveen N. Rao, Ian Rogers, Kemeng Shi, James Clark, Wei Li,
	Adrian Hunter, Al Grant, linux-kernel
  Cc: Leo Yan

This patch set adds AUX trace support to the perf memory tool, with Arm
SPE as the first enabled hardware tracing source.

Patches 01 ~ 04 are preparation patches which mainly address the
handling of memory events.  The existing code hard-codes the memory
events based on the x86 and PowerPC architectures, so patches 01 ~ 04
extend it to support more flexible memory event names, and introduce
weak functions that allow every architecture to define its own memory
event structure and return the event pointer and name respectively.

Patch 05 extends the perf memory tool to support AUX trace.

Patches 06 ~ 11 support Arm SPE with the perf memory tool.  First they
register SPE events as memory events, then they extend the SPE packet
to pass address info and operation types, and they also set the
'data_src' field so that the tool can display a readable string in the
result.

This patch set has been tested on an ARMv8 Hisilicon D06 platform.  I
noticed that the 'data object' cannot currently be displayed properly;
this should be a separate issue, so it needs to be checked separately.
Below is the testing result:

# Samples: 73  of event 'l1d-miss'
# Total weight : 73
# Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead       Samples  Local Weight  Memory access             Symbol                              Shared Object      Data Symbol                                                   Data Object         Snoop         TLB access              Locked
# ........  ............  ............  ........................  ..................................  .................  ............................................................  ..................  ............  ......................  ......
#
     2.74%             2  0             L1 or L1 miss             [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0xffff2027aacb08a8                                        [unknown]           N/A           N/A                     No
     2.74%             2  0             L1 or L1 miss             [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0xffff2027be6488a8                                        [unknown]           N/A           N/A                     No
     2.74%             2  0             L1 or L1 miss             [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0xffff2027c432f8a8                                        [unknown]           N/A           N/A                     No
     1.37%             1  0             L1 or L1 miss             [k] __arch_copy_to_user             [kernel.kallsyms]  [k] 0xffff0027a65352a0                                        [unknown]           N/A           N/A                     No
     1.37%             1  0             L1 or L1 miss             [k] __d_lookup_rcu                  [kernel.kallsyms]  [k] 0xffff0027d3cbf468                                        [unknown]           N/A           N/A                     No
     1.37%             1  0             L1 or L1 miss             [k] __d_lookup_rcu                  [kernel.kallsyms]  [k] 0xffff0027d8f44490                                        [unknown]           N/A           N/A                     No
     [...]


# Samples: 101  of event 'l1d-access'
# Total weight : 101
# Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead       Samples  Local Weight  Memory access             Symbol                              Shared Object      Data Symbol                                                   Data Object             Snoop         TLB access              Locked
# ........  ............  ............  ........................  ..................................  .................  ............................................................  ......................  ............  ......................  ......
#
     2.97%             3  0             L1 or L1 hit              [k] perf_event_mmap                 [kernel.kallsyms]  [k] perf_swevent+0x5c                                         [kernel.kallsyms].data  N/A           N/A                     No
     1.98%             2  0             L1 or L1 hit              [k] kmem_cache_alloc                [kernel.kallsyms]  [k] 0xffff2027af40e3d0                                        [unknown]               N/A           N/A                     No
     1.98%             2  0             L1 or L1 hit              [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0xffff2027aacb08a8                                        [unknown]               N/A           N/A                     No
     1.98%             2  0             L1 or L1 hit              [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0xffff2027be6488a8                                        [unknown]               N/A           N/A                     No
     1.98%             2  0             L1 or L1 hit              [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0xffff2027c432f8a8                                        [unknown]               N/A           N/A                     No
     0.99%             1  0             L1 or L1 hit              [k] __arch_copy_to_user             [kernel.kallsyms]  [k] 0xffff0027a65352a0                                        [unknown]               N/A           N/A                     No
     0.99%             1  0             L1 or L1 hit              [k] __d_lookup_rcu                  [kernel.kallsyms]  [k] 0xffff0027d3cbf468                                        [unknown]               N/A           N/A                     No
     0.99%             1  0             L1 or L1 hit              [k] __d_lookup_rcu                  [kernel.kallsyms]  [k] 0xffff0027d8f44490                                        [unknown]               N/A           N/A                     No
     [...]


# Samples: 46  of event 'llc-miss'
# Total weight : 46
# Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead       Samples  Local Weight  Memory access             Symbol                              Shared Object      Data Symbol                                                   Data Object         Snoop         TLB access              Locked
# ........  ............  ............  ........................  ..................................  .................  ............................................................  ..................  ............  ......................  ......
#
     2.17%             1  0             L3 or L3 miss             [k] __arch_copy_to_user             [kernel.kallsyms]  [k] 0xffff0027a65352a0                                        [unknown]           N/A           N/A                     No
     2.17%             1  0             L3 or L3 miss             [k] __d_lookup_rcu                  [kernel.kallsyms]  [k] 0xffff0027d3cbf468                                        [unknown]           N/A           N/A                     No
     2.17%             1  0             L3 or L3 miss             [k] __d_lookup_rcu                  [kernel.kallsyms]  [k] 0xffff0027d8f44490                                        [unknown]           N/A           N/A                     No
     2.17%             1  0             L3 or L3 miss             [k] __tty_buffer_request_room       [kernel.kallsyms]  [k] 0xffff2027c424ac08                                        [unknown]           N/A           N/A                     No
     2.17%             1  0             L3 or L3 miss             [.] _dl_addr                        libc-2.28.so       [.] 0x0000ffff9afc94c4                                        libc-2.28.so        N/A           N/A                     No
     2.17%             1  0             L3 or L3 miss             [.] _dl_addr                        libc-2.28.so       [.] 0x0000ffff9afc98b6                                        libc-2.28.so        N/A           N/A                     No
     2.17%             1  0             L3 or L3 miss             [.] _dl_lookup_symbol_x             ld-2.28.so         [.] 0x0000ffff9af38703                                        libdl-2.28.so       N/A           N/A                     No
     [...]


# Samples: 6  of event 'llc-access'
# Total weight : 6
# Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead       Samples  Local Weight  Memory access             Symbol                   Shared Object      Data Symbol             Data Object       Snoop         TLB access              Locked
# ........  ............  ............  ........................  .......................  .................  ......................  ................  ............  ......................  ......
#
    16.67%             1  0             L3 or L3 hit              [.] _dl_addr             libc-2.28.so       [.] 0x0000ffff9afc98b6  libc-2.28.so      N/A           N/A                     No
    16.67%             1  0             L3 or L3 hit              [.] _dl_lookup_symbol_x  ld-2.28.so         [.] 0x0000ffff9af38703  libdl-2.28.so     N/A           N/A                     No
    16.67%             1  0             L3 or L3 hit              [.] _dl_relocate_object  ld-2.28.so         [.] 0x0000aaaadc9e4dd0  ls                N/A           N/A                     No
    16.67%             1  0             L3 or L3 hit              [.] _dl_relocate_object  ld-2.28.so         [.] 0x0000aaaadc9e5970  ls                N/A           N/A                     No
    16.67%             1  0             L3 or L3 hit              [k] copy_page            [kernel.kallsyms]  [k] 0xffff0027a8528be0  [unknown]         N/A           N/A                     No
    16.67%             1  0             L3 or L3 hit              [k] copy_page            [kernel.kallsyms]  [k] 0xffff2027c66a65e0  [unknown]         N/A           N/A                     No


# Samples: 32  of event 'tlb-miss'
# Total weight : 32
# Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead       Samples  Local Weight  Memory access             Symbol                              Shared Object      Data Symbol             Data Object         Snoop         TLB access              Locked
# ........  ............  ............  ........................  ..................................  .................  ......................  ..................  ............  ......................  ......
#
     6.25%             2  0             N/A                       [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0xffff2027be6488a8  [unknown]           N/A           Walker miss             No
     6.25%             2  0             N/A                       [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0xffff2027c432f8a8  [unknown]           N/A           Walker miss             No
     3.12%             1  0             N/A                       [k] __arch_clear_user               [kernel.kallsyms]  [k] 0x0000aaaadca14658  [unknown]           N/A           Walker miss             No
     3.12%             1  0             N/A                       [.] _dl_map_object_deps             ld-2.28.so         [.] 0x0000ffffdd807c50  [stack]             N/A           Walker miss             No
     3.12%             1  0             N/A                       [.] _dl_map_object_from_fd          ld-2.28.so         [.] 0x0000ffff9af32d50  libpthread-2.28.so  N/A           Walker miss             No
     [...]


# Samples: 114  of event 'tlb-access'
# Total weight : 114
# Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead       Samples  Local Weight  Memory access             Symbol                              Shared Object      Data Symbol                                                   Data Object             Snoop         TLB access              Locked
# ........  ............  ............  ........................  ..................................  .................  ............................................................  ......................  ............  ......................  ......
#
     2.63%             3  0             N/A                       [k] perf_event_mmap                 [kernel.kallsyms]  [k] perf_swevent+0x5c                                         [kernel.kallsyms].data  N/A           Walker hit              No
     1.75%             2  0             N/A                       [k] kmem_cache_alloc                [kernel.kallsyms]  [k] 0xffff2027af40e3d0                                        [unknown]               N/A           Walker hit              No
     1.75%             2  0             N/A                       [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0xffff2027aacb08a8                                        [unknown]               N/A           Walker hit              No
     1.75%             2  0             N/A                       [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0xffff2027be6488a8                                        [unknown]               N/A           Walker hit              No
     1.75%             2  0             N/A                       [k] perf_iterate_ctx.constprop.151  [kernel.kallsyms]  [k] 0xffff2027c432f8a8                                        [unknown]               N/A           Walker hit              No
     0.88%             1  0             N/A                       [k] __arch_clear_user               [kernel.kallsyms]  [k] 0x0000aaaadca14658                                        [unknown]               N/A           Walker hit              No
     0.88%             1  0             N/A                       [k] __arch_clear_user               [kernel.kallsyms]  [k] 0x0000ffff9b1963f8                                        [unknown]               N/A           Walker hit              No
     [...]


# Samples: 21  of event 'remote-access'
# Total weight : 21
# Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked
#
# Overhead       Samples  Local Weight  Memory access             Symbol                      Shared Object      Data Symbol                                                   Data Object         Snoop         TLB access              Locked
# ........  ............  ............  ........................  ..........................  .................  ............................................................  ..................  ............  ......................  ......
#
     4.76%             1  0             Remote Cache (1 hop) or Any cache hit  [k] __arch_copy_to_user     [kernel.kallsyms]  [k] 0xffff0027a65352a0                                        [unknown]           N/A           N/A                     No
     4.76%             1  0             Remote Cache (1 hop) or Any cache hit  [k] __d_lookup_rcu          [kernel.kallsyms]  [k] 0xffff0027d3cbf468                                        [unknown]           N/A           N/A                     No
     4.76%             1  0             Remote Cache (1 hop) or Any cache hit  [k] __d_lookup_rcu          [kernel.kallsyms]  [k] 0xffff0027d8f44490                                        [unknown]           N/A           N/A                     No
     4.76%             1  0             Remote Cache (1 hop) or Any cache hit  [.] _dl_addr                libc-2.28.so       [.] 0x0000ffff9afc94c4                                        libc-2.28.so        N/A           N/A                     No
     [...]


Leo Yan (11):
  perf mem: Search event name with more flexible path
  perf mem: Introduce weak function perf_mem_events__ptr()
  perf mem: Support new memory event PERF_MEM_EVENTS__LOAD_STORE
  perf mem: Only initialize memory event for recording
  perf mem: Support AUX trace
  perf mem: Support Arm SPE events
  perf arm-spe: Enable attribution PERF_SAMPLE_DATA_SRC
  perf arm-spe: Save memory addresses in packet
  perf arm-spe: Store operation types in packet
  perf arm-spe: Fill address info for memory samples
  perf arm-spe: Set sample's data source field

 tools/perf/arch/arm64/util/Build              |   2 +-
 tools/perf/arch/arm64/util/mem-events.c       |  46 +++++++
 tools/perf/builtin-c2c.c                      |  18 ++-
 tools/perf/builtin-mem.c                      |  71 ++++++++--
 .../util/arm-spe-decoder/arm-spe-decoder.c    |  15 +++
 .../util/arm-spe-decoder/arm-spe-decoder.h    |   8 ++
 tools/perf/util/arm-spe.c                     | 125 +++++++++++++++---
 tools/perf/util/mem-events.c                  |  32 +++--
 tools/perf/util/mem-events.h                  |   3 +-
 9 files changed, 266 insertions(+), 54 deletions(-)
 create mode 100644 tools/perf/arch/arm64/util/mem-events.c

-- 
2.17.1



* [PATCH v1 01/11] perf mem: Search event name with more flexible path
  2020-08-06  2:40 [PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE Leo Yan
@ 2020-08-06  2:40 ` Leo Yan
  2020-08-06  2:40 ` [PATCH v1 02/11] perf mem: Introduce weak function perf_mem_events__ptr() Leo Yan
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2020-08-06  2:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim, Nick Desaulniers,
	Naveen N. Rao, Ian Rogers, Kemeng Shi, James Clark, Wei Li,
	Adrian Hunter, Al Grant, linux-kernel
  Cc: Leo Yan

The perf tool searches for memory event names under the folder
'/sys/devices/cpu/events/', which limits memory profiling events to
those located under this folder.  Thus it's impossible to use any event
outside this specific folder as a memory event, e.g. Arm SPE hardware
tracing cannot be used for memory profiling.

This patch changes the search folder from '/sys/devices/cpu/events/' to
'/sys/devices', which gives the flexibility to find other events that
can be used for memory profiling.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/mem-events.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index ea0af0bc4314..35c8d175a9d2 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -18,8 +18,8 @@ unsigned int perf_mem_events__loads_ldlat = 30;
 #define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
 
 struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
-	E("ldlat-loads",	"cpu/mem-loads,ldlat=%u/P",	"mem-loads"),
-	E("ldlat-stores",	"cpu/mem-stores/P",		"mem-stores"),
+	E("ldlat-loads",	"cpu/mem-loads,ldlat=%u/P",	"cpu/events/mem-loads"),
+	E("ldlat-stores",	"cpu/mem-stores/P",		"cpu/events/mem-stores"),
 };
 #undef E
 
@@ -93,7 +93,7 @@ int perf_mem_events__init(void)
 		struct perf_mem_event *e = &perf_mem_events[j];
 		struct stat st;
 
-		scnprintf(path, PATH_MAX, "%s/devices/cpu/events/%s",
+		scnprintf(path, PATH_MAX, "%s/devices/%s",
 			  mnt, e->sysfs_name);
 
 		if (!stat(path, &st))
-- 
2.17.1
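
The path lookup changed by this patch can be sketched in isolation.
This is a minimal stand-alone sketch, not the perf implementation:
build_event_path() and event_is_supported() are hypothetical helpers,
and plain snprintf() stands in for the tool's scnprintf().

```c
#include <stdio.h>
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: join the sysfs mount point and an event's
 * sysfs_name with the patched format string "%s/devices/%s".
 * Returns false if the result does not fit into 'buf'.
 */
bool build_event_path(char *buf, size_t len,
		      const char *mnt, const char *sysfs_name)
{
	int n = snprintf(buf, len, "%s/devices/%s", mnt, sysfs_name);

	return n >= 0 && (size_t)n < len;
}

/* Hypothetical helper: an event is supported if its sysfs file exists. */
bool event_is_supported(const char *mnt, const char *sysfs_name)
{
	char path[4096];
	struct stat st;

	if (!build_event_path(path, sizeof(path), mnt, sysfs_name))
		return false;

	return stat(path, &st) == 0;
}
```

With the prefix moved into sysfs_name, an entry such as
"cpu/events/mem-loads" resolves to the same file as before, while an
entry for another device under '/sys/devices' becomes reachable too.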



* [PATCH v1 02/11] perf mem: Introduce weak function perf_mem_events__ptr()
  2020-08-06  2:40 [PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE Leo Yan
  2020-08-06  2:40 ` [PATCH v1 01/11] perf mem: Search event name with more flexible path Leo Yan
@ 2020-08-06  2:40 ` Leo Yan
  2020-08-06  2:40 ` [PATCH v1 03/11] perf mem: Support new memory event PERF_MEM_EVENTS__LOAD_STORE Leo Yan
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2020-08-06  2:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim, Nick Desaulniers,
	Naveen N. Rao, Ian Rogers, Kemeng Shi, James Clark, Wei Li,
	Adrian Hunter, Al Grant, linux-kernel
  Cc: Leo Yan

Different architectures might use different events or different event
parameters for memory profiling.  This patch introduces the weak
function perf_mem_events__ptr(), which returns the architecture
specific memory event.

After the function perf_mem_events__ptr() is introduced, the variable
'perf_mem_events' can be accessed through this new function, so mark
the variable as 'static'; this allows each architecture to define its
own memory event array.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-c2c.c     | 18 ++++++++++++------
 tools/perf/builtin-mem.c     | 21 ++++++++++++++-------
 tools/perf/util/mem-events.c | 26 +++++++++++++++++++-------
 tools/perf/util/mem-events.h |  2 +-
 4 files changed, 46 insertions(+), 21 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 5938b100eaf4..88e68f36aa62 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -2914,6 +2914,7 @@ static int perf_c2c__record(int argc, const char **argv)
 	int ret;
 	bool all_user = false, all_kernel = false;
 	bool event_set = false;
+	struct perf_mem_event *e;
 	struct option options[] = {
 	OPT_CALLBACK('e', "event", &event_set, "event",
 		     "event selector. Use 'perf mem record -e list' to list available events",
@@ -2941,11 +2942,15 @@ static int perf_c2c__record(int argc, const char **argv)
 	rec_argv[i++] = "record";
 
 	if (!event_set) {
-		perf_mem_events[PERF_MEM_EVENTS__LOAD].record  = true;
-		perf_mem_events[PERF_MEM_EVENTS__STORE].record = true;
+		e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+		e->record = true;
+
+		e = perf_mem_events__ptr(PERF_MEM_EVENTS__STORE);
+		e->record = true;
 	}
 
-	if (perf_mem_events[PERF_MEM_EVENTS__LOAD].record)
+	e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+	if (e->record)
 		rec_argv[i++] = "-W";
 
 	rec_argv[i++] = "-d";
@@ -2953,12 +2958,13 @@ static int perf_c2c__record(int argc, const char **argv)
 	rec_argv[i++] = "--sample-cpu";
 
 	for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
-		if (!perf_mem_events[j].record)
+		e = perf_mem_events__ptr(j);
+		if (!e->record)
 			continue;
 
-		if (!perf_mem_events[j].supported) {
+		if (!e->supported) {
 			pr_err("failed: event '%s' not supported\n",
-			       perf_mem_events[j].name);
+			       perf_mem_events__name(j));
 			free(rec_argv);
 			return -1;
 		}
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 3523279af6af..9a7df8d01296 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -64,6 +64,7 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
 	const char **rec_argv;
 	int ret;
 	bool all_user = false, all_kernel = false;
+	struct perf_mem_event *e;
 	struct option options[] = {
 	OPT_CALLBACK('e', "event", &mem, "event",
 		     "event selector. use 'perf mem record -e list' to list available events",
@@ -86,13 +87,18 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
 
 	rec_argv[i++] = "record";
 
-	if (mem->operation & MEM_OPERATION_LOAD)
-		perf_mem_events[PERF_MEM_EVENTS__LOAD].record = true;
+	if (mem->operation & MEM_OPERATION_LOAD) {
+		e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+		e->record = true;
+	}
 
-	if (mem->operation & MEM_OPERATION_STORE)
-		perf_mem_events[PERF_MEM_EVENTS__STORE].record = true;
+	if (mem->operation & MEM_OPERATION_STORE) {
+		e = perf_mem_events__ptr(PERF_MEM_EVENTS__STORE);
+		e->record = true;
+	}
 
-	if (perf_mem_events[PERF_MEM_EVENTS__LOAD].record)
+	e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
+	if (e->record)
 		rec_argv[i++] = "-W";
 
 	rec_argv[i++] = "-d";
@@ -101,10 +107,11 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
 		rec_argv[i++] = "--phys-data";
 
 	for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
-		if (!perf_mem_events[j].record)
+		e = perf_mem_events__ptr(j);
+		if (!e->record)
 			continue;
 
-		if (!perf_mem_events[j].supported) {
+		if (!e->supported) {
 			pr_err("failed: event '%s' not supported\n",
 			       perf_mem_events__name(j));
 			free(rec_argv);
diff --git a/tools/perf/util/mem-events.c b/tools/perf/util/mem-events.c
index 35c8d175a9d2..7a5a0d699e27 100644
--- a/tools/perf/util/mem-events.c
+++ b/tools/perf/util/mem-events.c
@@ -17,7 +17,7 @@ unsigned int perf_mem_events__loads_ldlat = 30;
 
 #define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
 
-struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
+static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
 	E("ldlat-loads",	"cpu/mem-loads,ldlat=%u/P",	"cpu/events/mem-loads"),
 	E("ldlat-stores",	"cpu/mem-stores/P",		"cpu/events/mem-stores"),
 };
@@ -28,19 +28,31 @@ struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
 static char mem_loads_name[100];
 static bool mem_loads_name__init;
 
+struct perf_mem_event * __weak perf_mem_events__ptr(int i)
+{
+	if (i >= PERF_MEM_EVENTS__MAX)
+		return NULL;
+
+	return &perf_mem_events[i];
+}
+
 char * __weak perf_mem_events__name(int i)
 {
+	struct perf_mem_event *e = perf_mem_events__ptr(i);
+
+	if (!e)
+		return NULL;
+
 	if (i == PERF_MEM_EVENTS__LOAD) {
 		if (!mem_loads_name__init) {
 			mem_loads_name__init = true;
 			scnprintf(mem_loads_name, sizeof(mem_loads_name),
-				  perf_mem_events[i].name,
-				  perf_mem_events__loads_ldlat);
+				  e->name, perf_mem_events__loads_ldlat);
 		}
 		return mem_loads_name;
 	}
 
-	return (char *)perf_mem_events[i].name;
+	return (char *)e->name;
 }
 
 int perf_mem_events__parse(const char *str)
@@ -61,7 +73,7 @@ int perf_mem_events__parse(const char *str)
 
 	while (tok) {
 		for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
-			struct perf_mem_event *e = &perf_mem_events[j];
+			struct perf_mem_event *e = perf_mem_events__ptr(j);
 
 			if (strstr(e->tag, tok))
 				e->record = found = true;
@@ -90,7 +102,7 @@ int perf_mem_events__init(void)
 
 	for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
 		char path[PATH_MAX];
-		struct perf_mem_event *e = &perf_mem_events[j];
+		struct perf_mem_event *e = perf_mem_events__ptr(j);
 		struct stat st;
 
 		scnprintf(path, PATH_MAX, "%s/devices/%s",
@@ -108,7 +120,7 @@ void perf_mem_events__list(void)
 	int j;
 
 	for (j = 0; j < PERF_MEM_EVENTS__MAX; j++) {
-		struct perf_mem_event *e = &perf_mem_events[j];
+		struct perf_mem_event *e = perf_mem_events__ptr(j);
 
 		fprintf(stderr, "%-13s%-*s%s\n",
 			e->tag,
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index 904dad34f7f7..726a9c8103e4 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -31,13 +31,13 @@ enum {
 	PERF_MEM_EVENTS__MAX,
 };
 
-extern struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX];
 extern unsigned int perf_mem_events__loads_ldlat;
 
 int perf_mem_events__parse(const char *str);
 int perf_mem_events__init(void);
 
 char *perf_mem_events__name(int i);
+struct perf_mem_event *perf_mem_events__ptr(int i);
 
 void perf_mem_events__list(void);
 
-- 
2.17.1
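
The weak-accessor pattern can be shown as a reduced stand-alone
sketch.  The types and names below (struct mem_event,
mem_events__ptr(), MEM_EVENTS__*) are simplified stand-ins invented
for illustration, the weak marker is spelled out as the GCC/Clang
attribute, and the lower-bound check is an addition of this sketch
that the patch itself does not make.

```c
#include <stddef.h>
#include <stdbool.h>

/* Simplified stand-ins for the perf types; only the fields used here. */
enum {
	MEM_EVENTS__LOAD,
	MEM_EVENTS__STORE,
	MEM_EVENTS__MAX,
};

struct mem_event {
	bool		record;
	const char	*tag;
};

/* The generic table is file-local, so callers must use the accessor. */
static struct mem_event mem_events[MEM_EVENTS__MAX] = {
	{ .tag = "ldlat-loads" },
	{ .tag = "ldlat-stores" },
};

/*
 * Generic accessor, marked weak: if another object file provides a
 * non-weak definition of mem_events__ptr(), the linker uses that one.
 */
__attribute__((weak)) struct mem_event *mem_events__ptr(int i)
{
	if (i < 0 || i >= MEM_EVENTS__MAX)
		return NULL;

	return &mem_events[i];
}
```

An architecture can then supply its own table and a strong
mem_events__ptr() in a separate file without touching the callers.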



* [PATCH v1 03/11] perf mem: Support new memory event PERF_MEM_EVENTS__LOAD_STORE
  2020-08-06  2:40 [PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE Leo Yan
  2020-08-06  2:40 ` [PATCH v1 01/11] perf mem: Search event name with more flexible path Leo Yan
  2020-08-06  2:40 ` [PATCH v1 02/11] perf mem: Introduce weak function perf_mem_events__ptr() Leo Yan
@ 2020-08-06  2:40 ` Leo Yan
  2020-08-06  2:40 ` [PATCH v1 04/11] perf mem: Only initialize memory event for recording Leo Yan
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2020-08-06  2:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim, Nick Desaulniers,
	Naveen N. Rao, Ian Rogers, Kemeng Shi, James Clark, Wei Li,
	Adrian Hunter, Al Grant, linux-kernel
  Cc: Leo Yan

Architectures that already support perf memory profiling usually
provide two types of hardware events, load and store, so to profile
memory for both load and store operations the tool uses these two
events at the same time.  This does not hold for an AUX tracing event:
the same event can be configured differently for memory operation
filtering, e.g. the event can be used to trace only memory loads, only
memory stores, or both memory loads and stores.

This patch introduces a new event PERF_MEM_EVENTS__LOAD_STORE, which
supports an event that can record both memory load and store
operations.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-mem.c     | 11 +++++++++--
 tools/perf/util/mem-events.h |  1 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 9a7df8d01296..bd4229ca3685 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -19,8 +19,9 @@
 #include "util/symbol.h"
 #include <linux/err.h>
 
-#define MEM_OPERATION_LOAD	0x1
-#define MEM_OPERATION_STORE	0x2
+#define MEM_OPERATION_LOAD		0x1
+#define MEM_OPERATION_STORE		0x2
+#define MEM_OPERATION_LOAD_STORE	0x4
 
 struct perf_mem {
 	struct perf_tool	tool;
@@ -97,6 +98,11 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
 		e->record = true;
 	}
 
+	if (mem->operation & MEM_OPERATION_LOAD_STORE) {
+		e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD_STORE);
+		e->record = true;
+	}
+
 	e = perf_mem_events__ptr(PERF_MEM_EVENTS__LOAD);
 	if (e->record)
 		rec_argv[i++] = "-W";
@@ -326,6 +332,7 @@ struct mem_mode {
 static const struct mem_mode mem_modes[]={
 	MEM_OPT("load", MEM_OPERATION_LOAD),
 	MEM_OPT("store", MEM_OPERATION_STORE),
+	MEM_OPT("ldst", MEM_OPERATION_LOAD_STORE),
 	MEM_END
 };
 
diff --git a/tools/perf/util/mem-events.h b/tools/perf/util/mem-events.h
index 726a9c8103e4..5ef178278909 100644
--- a/tools/perf/util/mem-events.h
+++ b/tools/perf/util/mem-events.h
@@ -28,6 +28,7 @@ struct mem_info {
 enum {
 	PERF_MEM_EVENTS__LOAD,
 	PERF_MEM_EVENTS__STORE,
+	PERF_MEM_EVENTS__LOAD_STORE,
 	PERF_MEM_EVENTS__MAX,
 };
 
-- 
2.17.1
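
How a mode table like the one extended here maps names to operation
bits can be sketched as follows.  This is only an illustration: the
OP_* macros, struct op_mode and parse_ops() are hypothetical names,
and the comma-separated input format is an assumption of the sketch
rather than something taken from this patch.

```c
#include <string.h>

#define OP_LOAD		0x1
#define OP_STORE	0x2
#define OP_LOAD_STORE	0x4

struct op_mode {
	const char	*name;
	int		mode;
};

/* NULL-terminated table, with "ldst" as the combined load/store mode. */
static const struct op_mode op_modes[] = {
	{ "load",  OP_LOAD },
	{ "store", OP_STORE },
	{ "ldst",  OP_LOAD_STORE },
	{ NULL,    0 },
};

/*
 * Hypothetical parser: turn a comma-separated list such as
 * "load,store" into a bitmask.  Returns -1 on an unknown name.
 */
int parse_ops(const char *str)
{
	int ops = 0;

	while (*str) {
		size_t len = strcspn(str, ",");
		const struct op_mode *m;

		for (m = op_modes; m->name; m++) {
			if (strlen(m->name) == len &&
			    !strncmp(m->name, str, len))
				break;
		}
		if (!m->name)
			return -1;

		ops |= m->mode;
		str += len;
		if (*str == ',')
			str++;
	}

	return ops;
}
```

Keeping the combined mode as its own bit, instead of reusing
OP_LOAD | OP_STORE, lets the caller tell "two separate events" apart
from "one event that records both operations".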



* [PATCH v1 04/11] perf mem: Only initialize memory event for recording
  2020-08-06  2:40 [PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE Leo Yan
                   ` (2 preceding siblings ...)
  2020-08-06  2:40 ` [PATCH v1 03/11] perf mem: Support new memory event PERF_MEM_EVENTS__LOAD_STORE Leo Yan
@ 2020-08-06  2:40 ` Leo Yan
  2020-08-06  2:40 ` [PATCH v1 05/11] perf mem: Support AUX trace Leo Yan
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2020-08-06  2:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim, Nick Desaulniers,
	Naveen N. Rao, Ian Rogers, Kemeng Shi, James Clark, Wei Li,
	Adrian Hunter, Al Grant, linux-kernel
  Cc: Leo Yan

It's needless to initialize memory events for perf reporting, so only
initialize memory events for perf recording.  This change allows perf
data to be parsed across platforms, e.g. the perf tool can output
reports even if the machine doesn't enable any memory events.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-mem.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index bd4229ca3685..a7204634893c 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -78,6 +78,11 @@ static int __cmd_record(int argc, const char **argv, struct perf_mem *mem)
 	OPT_END()
 	};
 
+	if (perf_mem_events__init()) {
+		pr_err("failed: memory events not supported\n");
+		return -1;
+	}
+
 	argc = parse_options(argc, argv, options, record_mem_usage,
 			     PARSE_OPT_KEEP_UNKNOWN);
 
@@ -436,11 +441,6 @@ int cmd_mem(int argc, const char **argv)
 		NULL
 	};
 
-	if (perf_mem_events__init()) {
-		pr_err("failed: memory events not supported\n");
-		return -1;
-	}
-
 	argc = parse_options_subcommand(argc, argv, mem_options, mem_subcommands,
 					mem_usage, PARSE_OPT_KEEP_UNKNOWN);
 
-- 
2.17.1



* [PATCH v1 05/11] perf mem: Support AUX trace
  2020-08-06  2:40 [PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE Leo Yan
                   ` (3 preceding siblings ...)
  2020-08-06  2:40 ` [PATCH v1 04/11] perf mem: Only initialize memory event for recording Leo Yan
@ 2020-08-06  2:40 ` Leo Yan
  2020-08-06  2:40 ` [PATCH v1 06/11] perf mem: Support Arm SPE events Leo Yan
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2020-08-06  2:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim, Nick Desaulniers,
	Naveen N. Rao, Ian Rogers, Kemeng Shi, James Clark, Wei Li,
	Adrian Hunter, Al Grant, linux-kernel
  Cc: Leo Yan

Perf memory profiling doesn't support AUX trace data, so the tool cannot
receive the samples synthesized from hardware tracing data.  The Arm64
platform doesn't support PMU events for memory loads and stores, but
Armv8's SPE is a good candidate for memory profiling: the hardware
tracer can record memory access operations with their physical and
virtual addresses for different cache levels, and it also collects
statistics on memory operations for remote access and TLB.

To allow the perf memory tool to support AUX trace, this patch adds
the AUX callbacks to the session structure.  It passes the predefined
synth options (like llc, flc, remote_access, tlb, etc.) to notify the
tracing decoder to generate the corresponding samples.  This patch also
invokes the standard API perf_event__process_attr() to register sample
IDs into the evlist.
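
The way such options can steer a decoder might be sketched as follows.
This is a simplified model under stated assumptions, not the perf
implementation: struct demo_synth_opts is a hypothetical, trimmed-down
stand-in for struct itrace_synth_opts, the sample classes are invented
for illustration, and the sketch assumes that nothing is synthesized
when the options were never set:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct itrace_synth_opts. */
struct demo_synth_opts {
	bool set;
	bool flc;
	bool llc;
	bool tlb;
	bool remote_access;
};

/* Hypothetical sample classes a tracing decoder might produce. */
enum demo_sample_class {
	DEMO_FLC,
	DEMO_LLC,
	DEMO_TLB,
	DEMO_REMOTE_ACCESS,
	DEMO_BRANCH,
};

/* Return true if the decoder should synthesize a sample of this class. */
static bool demo_want_sample(const struct demo_synth_opts *opts,
			     enum demo_sample_class cls)
{
	assert(opts != NULL);

	/* Assumption of this sketch: unset options request nothing. */
	if (!opts->set)
		return false;

	switch (cls) {
	case DEMO_FLC:
		return opts->flc;
	case DEMO_LLC:
		return opts->llc;
	case DEMO_TLB:
		return opts->tlb;
	case DEMO_REMOTE_ACCESS:
		return opts->remote_access;
	default:
		return false;
	}
}

/* The option set a memory tool would request in this sketch. */
static const struct demo_synth_opts demo_mem_opts = {
	.set = true,
	.flc = true,
	.llc = true,
	.tlb = true,
	.remote_access = true,
};
```

Here the memory-related classes are accepted and DEMO_BRANCH is
rejected, which is the filtering effect the option set is meant to
have.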

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/builtin-mem.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index a7204634893c..6c8b5e956a4a 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -7,6 +7,7 @@
 #include "perf.h"
 
 #include <subcmd/parse-options.h>
+#include "util/auxtrace.h"
 #include "util/trace-event.h"
 #include "util/tool.h"
 #include "util/session.h"
@@ -249,6 +250,15 @@ static int process_sample_event(struct perf_tool *tool,
 
 static int report_raw_events(struct perf_mem *mem)
 {
+	struct itrace_synth_opts itrace_synth_opts = {
+		.set = true,
+		.flc = true,		/* First level cache samples */
+		.llc = true,		/* Last level cache samples */
+		.tlb = true,		/* TLB samples */
+		.remote_access = true,	/* Remote access samples */
+		.default_no_sample = true,
+	};
+
 	struct perf_data data = {
 		.path  = input_name,
 		.mode  = PERF_DATA_MODE_READ,
@@ -261,6 +271,8 @@ static int report_raw_events(struct perf_mem *mem)
 	if (IS_ERR(session))
 		return PTR_ERR(session);
 
+	session->itrace_synth_opts = &itrace_synth_opts;
+
 	if (mem->cpu_list) {
 		ret = perf_session__cpu_bitmap(session, mem->cpu_list,
 					       mem->cpu_bitmap);
@@ -394,6 +406,19 @@ parse_mem_ops(const struct option *opt, const char *str, int unset)
 	return ret;
 }
 
+static int process_attr(struct perf_tool *tool,
+			union perf_event *event,
+			struct evlist **pevlist)
+{
+	int err;
+
+	err = perf_event__process_attr(tool, event, pevlist);
+	if (err)
+		return err;
+
+	return 0;
+}
+
 int cmd_mem(int argc, const char **argv)
 {
 	struct stat st;
@@ -405,8 +430,12 @@ int cmd_mem(int argc, const char **argv)
 			.comm		= perf_event__process_comm,
 			.lost		= perf_event__process_lost,
 			.fork		= perf_event__process_fork,
+			.attr		= process_attr,
 			.build_id	= perf_event__process_build_id,
 			.namespaces	= perf_event__process_namespaces,
+			.auxtrace_info  = perf_event__process_auxtrace_info,
+			.auxtrace       = perf_event__process_auxtrace,
+			.auxtrace_error = perf_event__process_auxtrace_error,
 			.ordered_events	= true,
 		},
 		.input_name		 = "perf.data",
-- 
2.17.1



* [PATCH v1 06/11] perf mem: Support Arm SPE events
  2020-08-06  2:40 [PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE Leo Yan
                   ` (4 preceding siblings ...)
  2020-08-06  2:40 ` [PATCH v1 05/11] perf mem: Support AUX trace Leo Yan
@ 2020-08-06  2:40 ` Leo Yan
  2020-08-06  2:40 ` [PATCH v1 07/11] perf arm-spe: Enable attribution PERF_SAMPLE_DATA_SRC Leo Yan
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2020-08-06  2:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim, Nick Desaulniers,
	Naveen N. Rao, Ian Rogers, Kemeng Shi, James Clark, Wei Li,
	Adrian Hunter, Al Grant, linux-kernel
  Cc: Leo Yan

This patch adds Arm SPE events for perf memory profiling.  It supports
three Arm SPE events:

  - spe-load: memory event for only recording memory load ops;
  - spe-store: memory event for only recording memory store ops;
  - spe-ldst: memory event for recording memory load and store ops.
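
For illustration only, the sketch below shows how the three event names
could be expanded with a minimum latency.  The event strings are the
ones used in this patch; the demo_* names are hypothetical, and the
latency argument stands in for perf_mem_events__loads_ldlat:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical selector for the three SPE memory events. */
enum demo_spe_event {
	DEMO_SPE_LOAD,
	DEMO_SPE_STORE,
	DEMO_SPE_LDST,
};

/*
 * Write the event name for 'ev' into 'buf'.  Returns the snprintf()
 * result, or -1 for an unknown event.  Only the load and load/store
 * events take a minimum latency.
 */
static int demo_spe_event_name(enum demo_spe_event ev,
			       unsigned int min_latency,
			       char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	switch (ev) {
	case DEMO_SPE_LOAD:
		return snprintf(buf, len,
				"arm_spe_0/ts_enable=1,load_filter=1,"
				"store_filter=0,min_latency=%u/",
				min_latency);
	case DEMO_SPE_STORE:
		return snprintf(buf, len,
				"arm_spe_0/ts_enable=1,load_filter=0,"
				"store_filter=1/");
	case DEMO_SPE_LDST:
		return snprintf(buf, len,
				"arm_spe_0/ts_enable=1,load_filter=1,"
				"store_filter=1,min_latency=%u/",
				min_latency);
	}
	return -1;
}
```

With a latency of 30, the load event name ends in "min_latency=30/",
while the store event name carries no latency term.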

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/arch/arm64/util/Build        |  2 +-
 tools/perf/arch/arm64/util/mem-events.c | 46 +++++++++++++++++++++++++
 2 files changed, 47 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/arch/arm64/util/mem-events.c

diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index 5c13438c7bd4..cb18442e840f 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -8,4 +8,4 @@ perf-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
 perf-$(CONFIG_AUXTRACE) += ../../arm/util/pmu.o \
 			      ../../arm/util/auxtrace.o \
 			      ../../arm/util/cs-etm.o \
-			      arm-spe.o
+			      arm-spe.o mem-events.o
diff --git a/tools/perf/arch/arm64/util/mem-events.c b/tools/perf/arch/arm64/util/mem-events.c
new file mode 100644
index 000000000000..f23128db54fb
--- /dev/null
+++ b/tools/perf/arch/arm64/util/mem-events.c
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0
+#include "map_symbol.h"
+#include "mem-events.h"
+
+#define E(t, n, s) { .tag = t, .name = n, .sysfs_name = s }
+
+static struct perf_mem_event perf_mem_events[PERF_MEM_EVENTS__MAX] = {
+	E("spe-load",	"arm_spe_0/ts_enable=1,load_filter=1,store_filter=0,min_latency=%u/",	"arm_spe_0"),
+	E("spe-store",	"arm_spe_0/ts_enable=1,load_filter=0,store_filter=1/",			"arm_spe_0"),
+	E("spe-ldst",	"arm_spe_0/ts_enable=1,load_filter=1,store_filter=1,min_latency=%u/",	"arm_spe_0"),
+};
+
+static char mem_ld_name[100];
+static char mem_st_name[100];
+static char mem_ldst_name[100];
+
+struct perf_mem_event *perf_mem_events__ptr(int i)
+{
+	if (i >= PERF_MEM_EVENTS__MAX)
+		return NULL;
+
+	return &perf_mem_events[i];
+}
+
+char *perf_mem_events__name(int i)
+{
+	struct perf_mem_event *e = perf_mem_events__ptr(i);
+
+	if (i >= PERF_MEM_EVENTS__MAX)
+		return NULL;
+
+	if (i == PERF_MEM_EVENTS__LOAD) {
+		scnprintf(mem_ld_name, sizeof(mem_ld_name),
+			  e->name, perf_mem_events__loads_ldlat);
+		return mem_ld_name;
+	}
+
+	if (i == PERF_MEM_EVENTS__STORE) {
+		scnprintf(mem_st_name, sizeof(mem_st_name), "%s", e->name);
+		return mem_st_name;
+	}
+
+	scnprintf(mem_ldst_name, sizeof(mem_ldst_name),
+		  e->name, perf_mem_events__loads_ldlat);
+	return mem_ldst_name;
+}
-- 
2.17.1



* [PATCH v1 07/11] perf arm-spe: Enable attribution PERF_SAMPLE_DATA_SRC
  2020-08-06  2:40 [PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE Leo Yan
                   ` (5 preceding siblings ...)
  2020-08-06  2:40 ` [PATCH v1 06/11] perf mem: Support Arm SPE events Leo Yan
@ 2020-08-06  2:40 ` Leo Yan
  2020-08-06  2:40 ` [PATCH v1 08/11] perf arm-spe: Save memory addresses in packet Leo Yan
  2020-08-06  2:40 ` [PATCH v1 09/11] perf arm-spe: Store operation types " Leo Yan
  8 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2020-08-06  2:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim, Nick Desaulniers,
	Naveen N. Rao, Ian Rogers, Kemeng Shi, James Clark, Wei Li,
	Adrian Hunter, Al Grant, linux-kernel
  Cc: Leo Yan

This patch enables the attribute PERF_SAMPLE_DATA_SRC for the perf
data.  When decoding the tracing data, this tells the tool that the
samples contain memory data.
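
A minimal sketch of the bitmask handling follows.  The DEMO_SAMPLE_*
constants are local stand-ins whose bit positions are modelled on the
PERF_SAMPLE_* flags in include/uapi/linux/perf_event.h; real code
should use that header.  The function name is hypothetical, and only
the timeless branch visible in the diff below is modelled:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins modelled on the PERF_SAMPLE_* bits (assumption). */
#define DEMO_SAMPLE_IP		(1ULL << 0)
#define DEMO_SAMPLE_TID		(1ULL << 1)
#define DEMO_SAMPLE_TIME	(1ULL << 2)
#define DEMO_SAMPLE_PERIOD	(1ULL << 8)
#define DEMO_SAMPLE_DATA_SRC	(1ULL << 15)

/*
 * Build the sample_type for synthesized samples: start from the bits
 * inherited from the recording event, add the bits every synthesized
 * sample carries (now including DATA_SRC), and drop the timestamp bit
 * for timeless decoding.
 */
static uint64_t demo_synth_sample_type(uint64_t inherited, bool timeless)
{
	uint64_t type = inherited;

	type |= DEMO_SAMPLE_IP | DEMO_SAMPLE_TID |
		DEMO_SAMPLE_PERIOD | DEMO_SAMPLE_DATA_SRC;

	if (timeless)
		type &= ~DEMO_SAMPLE_TIME;

	return type;
}
```

A consumer can then test the DATA_SRC bit to decide whether a sample
carries memory information.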

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
index 3882a5360ada..c2cf5058648f 100644
--- a/tools/perf/util/arm-spe.c
+++ b/tools/perf/util/arm-spe.c
@@ -803,7 +803,7 @@ arm_spe_synth_events(struct arm_spe *spe, struct perf_session *session)
 	attr.type = PERF_TYPE_HARDWARE;
 	attr.sample_type = evsel->core.attr.sample_type & PERF_SAMPLE_MASK;
 	attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
-		PERF_SAMPLE_PERIOD;
+			    PERF_SAMPLE_PERIOD | PERF_SAMPLE_DATA_SRC;
 	if (spe->timeless_decoding)
 		attr.sample_type &= ~(u64)PERF_SAMPLE_TIME;
 	else
-- 
2.17.1



* [PATCH v1 08/11] perf arm-spe: Save memory addresses in packet
  2020-08-06  2:40 [PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE Leo Yan
                   ` (6 preceding siblings ...)
  2020-08-06  2:40 ` [PATCH v1 07/11] perf arm-spe: Enable attribution PERF_SAMPLE_DATA_SRC Leo Yan
@ 2020-08-06  2:40 ` Leo Yan
  2020-08-06  2:40 ` [PATCH v1 09/11] perf arm-spe: Store operation types " Leo Yan
  8 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2020-08-06  2:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim, Nick Desaulniers,
	Naveen N. Rao, Ian Rogers, Kemeng Shi, James Clark, Wei Li,
	Adrian Hunter, Al Grant, linux-kernel
  Cc: Leo Yan

This patch saves the virtual and physical memory addresses from the
address packet into the decoder record; the address info can be used
for generating memory samples.
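
The routing of an address payload by packet index could be sketched
like this.  The enum is a hypothetical stand-in for the
SPE_ADDR_PKT_HDR_INDEX_* values (its numbering is an assumption of the
sketch), and struct demo_record models only the address fields of
struct arm_spe_record:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the SPE_ADDR_PKT_HDR_INDEX_* values. */
enum demo_addr_index {
	DEMO_ADDR_INS,
	DEMO_ADDR_BRANCH,
	DEMO_ADDR_DATA_VIRT,
	DEMO_ADDR_DATA_PHYS,
};

/* Only the address fields of the decoder record are modelled here. */
struct demo_record {
	uint64_t from_ip;
	uint64_t to_ip;
	uint64_t addr;
	uint64_t phys_addr;
};

/* Store an address payload in the field selected by the packet index. */
static void demo_store_addr(struct demo_record *rec,
			    enum demo_addr_index idx, uint64_t payload)
{
	switch (idx) {
	case DEMO_ADDR_INS:
		rec->from_ip = payload;
		break;
	case DEMO_ADDR_BRANCH:
		rec->to_ip = payload;
		break;
	case DEMO_ADDR_DATA_VIRT:
		rec->addr = payload;
		break;
	case DEMO_ADDR_DATA_PHYS:
		rec->phys_addr = payload;
		break;
	}
}
```

Each address packet touches exactly one field, so a record that saw
both data address packets ends up with addr and phys_addr filled in.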

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 4 ++++
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.h | 2 ++
 2 files changed, 6 insertions(+)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 93e063f22be5..373dc2d1cf06 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -162,6 +162,10 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
 				decoder->record.from_ip = ip;
 			else if (idx == SPE_ADDR_PKT_HDR_INDEX_BRANCH)
 				decoder->record.to_ip = ip;
+			else if (idx == SPE_ADDR_PKT_HDR_INDEX_DATA_VIRT)
+				decoder->record.addr = ip;
+			else if (idx == SPE_ADDR_PKT_HDR_INDEX_DATA_PHYS)
+				decoder->record.phys_addr = ip;
 			break;
 		case ARM_SPE_COUNTER:
 			break;
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
index a5111a8d4360..5acddfcffbd1 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
@@ -47,6 +47,8 @@ struct arm_spe_record {
 	u64 from_ip;
 	u64 to_ip;
 	u64 timestamp;
+	u64 addr;
+	u64 phys_addr;
 };
 
 struct arm_spe_insn;
-- 
2.17.1



* [PATCH v1 09/11] perf arm-spe: Store operation types in packet
  2020-08-06  2:40 [PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE Leo Yan
                   ` (7 preceding siblings ...)
  2020-08-06  2:40 ` [PATCH v1 08/11] perf arm-spe: Save memory addresses in packet Leo Yan
@ 2020-08-06  2:40 ` Leo Yan
  8 siblings, 0 replies; 10+ messages in thread
From: Leo Yan @ 2020-08-06  2:40 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Jiri Olsa, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Namhyung Kim, Nick Desaulniers,
	Naveen N. Rao, Ian Rogers, Kemeng Shi, James Clark, Wei Li,
	Adrian Hunter, Al Grant, linux-kernel
  Cc: Leo Yan

This patch stores the operation type in the decoder record; the
frontend can use it to generate memory access info for samples.
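
A small sketch of the decoding rule, following the comment in the diff
below: when the header class equals 1, the payload's least significant
bit selects store (1) or load (0).  The demo_* names are hypothetical,
and returning 0 for other classes is a choice made for this sketch:

```c
#include <stdint.h>

/* Mirrors the two operation flags introduced by this patch. */
enum demo_op_type {
	DEMO_OP_LD = 1 << 0,
	DEMO_OP_ST = 1 << 1,
};

/* Header class that marks a load/store operation type packet. */
#define DEMO_OP_CLASS_LD_ST	1

/*
 * Decode the operation type.  Returns 0 when the packet is not a
 * load/store packet, otherwise DEMO_OP_ST or DEMO_OP_LD depending on
 * the payload's least significant bit.
 */
static uint32_t demo_decode_op(unsigned int hdr_class, uint64_t payload)
{
	if (hdr_class != DEMO_OP_CLASS_LD_ST)
		return 0;

	return (payload & 0x1) ? DEMO_OP_ST : DEMO_OP_LD;
}
```

Only bit 0 of the payload is examined, so higher payload bits do not
change the result.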

Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.c | 11 +++++++++++
 tools/perf/util/arm-spe-decoder/arm-spe-decoder.h |  6 ++++++
 2 files changed, 17 insertions(+)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
index 373dc2d1cf06..cba394784b0d 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.c
@@ -172,6 +172,17 @@ static int arm_spe_read_record(struct arm_spe_decoder *decoder)
 		case ARM_SPE_CONTEXT:
 			break;
 		case ARM_SPE_OP_TYPE:
+			/*
+			 * When operation type packet header's class equals 1,
+			 * the payload's least significant bit (LSB) indicates
+			 * the operation type: load/swap or store.
+			 */
+			if (idx == 1) {
+				if (payload & 0x1)
+					decoder->record.op = ARM_SPE_ST;
+				else
+					decoder->record.op = ARM_SPE_LD;
+			}
 			break;
 		case ARM_SPE_EVENTS:
 			if (payload & BIT(EV_L1D_REFILL))
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
index 5acddfcffbd1..f23188282ef0 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h
@@ -41,9 +41,15 @@ enum arm_spe_sample_type {
 	ARM_SPE_REMOTE_ACCESS	= 1 << 7,
 };
 
+enum arm_spe_op_type {
+	ARM_SPE_LD		= 1 << 0,
+	ARM_SPE_ST		= 1 << 1,
+};
+
 struct arm_spe_record {
 	enum arm_spe_sample_type type;
 	int err;
+	u32 op;
 	u64 from_ip;
 	u64 to_ip;
 	u64 timestamp;
-- 
2.17.1



Thread overview: 10+ messages
2020-08-06  2:40 [PATCH v1 00/11] perf mem: Support AUX trace and Arm SPE Leo Yan
2020-08-06  2:40 ` [PATCH v1 01/11] perf mem: Search event name with more flexible path Leo Yan
2020-08-06  2:40 ` [PATCH v1 02/11] perf mem: Introduce weak function perf_mem_events__ptr() Leo Yan
2020-08-06  2:40 ` [PATCH v1 03/11] perf mem: Support new memory event PERF_MEM_EVENTS__LOAD_STORE Leo Yan
2020-08-06  2:40 ` [PATCH v1 04/11] perf mem: Only initialize memory event for recording Leo Yan
2020-08-06  2:40 ` [PATCH v1 05/11] perf mem: Support AUX trace Leo Yan
2020-08-06  2:40 ` [PATCH v1 06/11] perf mem: Support Arm SPE events Leo Yan
2020-08-06  2:40 ` [PATCH v1 07/11] perf arm-spe: Enable attribution PERF_SAMPLE_DATA_SRC Leo Yan
2020-08-06  2:40 ` [PATCH v1 08/11] perf arm-spe: Save memory addresses in packet Leo Yan
2020-08-06  2:40 ` [PATCH v1 09/11] perf arm-spe: Store operation types " Leo Yan
