linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/4] powerpc/perf: Export processor pipeline stage cycles information
@ 2021-03-09 14:03 Athira Rajeev
  2021-03-09 14:03 ` [PATCH 1/4] powerpc/perf: Expose processor pipeline stage cycles using PERF_SAMPLE_WEIGHT_STRUCT Athira Rajeev
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Athira Rajeev @ 2021-03-09 14:03 UTC
  To: linuxppc-dev, linux-kernel, linux-perf-users, mpe, acme, jolsa
  Cc: maddy, ravi.bangoria, kjain, kan.liang, peterz

Performance Monitoring Unit (PMU) registers in powerpc export the
number of cycles elapsed between different stages in the pipeline,
for example the sampling registers in ISA v3.1.

This patchset implements kernel and perf tools support to expose
these pipeline stage cycles using the sample type PERF_SAMPLE_WEIGHT_STRUCT.

Patch 1/4 adds kernel side support to store the cycle counter
values as part of 'var2_w' and 'var3_w' fields of perf_sample_weight
structure.

Patch 2/4 makes the perf report column header strings dynamic.

Patch 3/4 adds powerpc support in perf tools for
PERF_SAMPLE_WEIGHT_STRUCT, one of the two sample types covered by the
PERF_SAMPLE_WEIGHT_TYPE mask.

Patch 4/4 adds support to present pipeline stage cycles as part of
mem-mode.

Sample output on powerpc:

# perf mem record ls
# perf mem report

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 11  of event 'cpu/mem-loads/'
# Total weight : 1332
# Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,stall_cyc
#
# Overhead       Samples  Local Weight  Memory access             Symbol                              Shared Object     Data Symbol                                    Data Object            Snoop         TLB access              Locked  Blocked     Finish Cyc     Dispatch Cyc 
# ........  ............  ............  ........................  ..................................  ................  .............................................  .....................  ............  ......................  ......  ..........  .............  .............
#
    44.14%             1  588           L1 hit                    [k] rcu_nmi_exit                    [kernel.vmlinux]  [k] 0xc0000007ffdd21b0                         [unknown]              N/A           N/A                     No       N/A        7              5            
    22.22%             1  296           L1 hit                    [k] copypage_power7                 [kernel.vmlinux]  [k] 0xc0000000ff6a1780                         [unknown]              N/A           N/A                     No       N/A        293            3            
     6.98%             1  93            L1 hit                    [.] _dl_addr                        libc-2.31.so      [.] 0x00007fff86fa5058                         libc-2.31.so           N/A           N/A                     No       N/A        7              1            
     6.61%             1  88            L2 hit                    [.] new_do_write                    libc-2.31.so      [.] _IO_2_1_stdout_+0x0                        libc-2.31.so           N/A           N/A                     No       N/A        84             1            
     5.93%             1  79            L1 hit                    [k] printk_nmi_exit                 [kernel.vmlinux]  [k] 0xc0000006085df6b0                         [unknown]              N/A           N/A                     No       N/A        7              1            
     4.05%             1  54            L2 hit                    [.] __alloc_dir                     libc-2.31.so      [.] 0x00007fffdb70a640                         [stack]                N/A           N/A                     No       N/A        18             1            
     3.60%             1  48            L1 hit                    [.] _init                           ls                [.] 0x000000016ca82118                         [heap]                 N/A           N/A                     No       N/A        7              6            
     2.40%             1  32            L1 hit                    [k] desc_read                       [kernel.vmlinux]  [k] _printk_rb_static_descs+0x1ea10            [kernel.vmlinux].data  N/A           N/A                     No       N/A        7              1            
     1.65%             1  22            L2 hit                    [k] perf_iterate_ctx.constprop.139  [kernel.vmlinux]  [k] 0xc00000064d79e8a8                         [unknown]              N/A           N/A                     No       N/A        16             1            
     1.58%             1  21            L1 hit                    [k] perf_event_interrupt            [kernel.vmlinux]  [k] 0xc0000006085df6b0                         [unknown]              N/A           N/A                     No       N/A        7              1            
     0.83%             1  11            L1 hit                    [k] perf_event_exec                 [kernel.vmlinux]  [k] 0xc0000007ffdd3288                         [unknown]              N/A           N/A                     No       N/A        7              4            


Athira Rajeev (4):
  powerpc/perf: Expose processor pipeline stage cycles using
    PERF_SAMPLE_WEIGHT_STRUCT
  tools/perf: Add dynamic headers for perf report columns
  tools/perf: Add powerpc support for PERF_SAMPLE_WEIGHT_STRUCT
  tools/perf: Support pipeline stage cycles for powerpc

 arch/powerpc/include/asm/perf_event_server.h |  2 +-
 arch/powerpc/perf/core-book3s.c              |  4 +--
 arch/powerpc/perf/isa207-common.c            | 29 ++++++++++++++++--
 arch/powerpc/perf/isa207-common.h            |  6 +++-
 tools/perf/Documentation/perf-report.txt     |  1 +
 tools/perf/arch/powerpc/util/Build           |  2 ++
 tools/perf/arch/powerpc/util/event.c         | 46 ++++++++++++++++++++++++++++
 tools/perf/arch/powerpc/util/evsel.c         |  8 +++++
 tools/perf/util/event.h                      |  2 ++
 tools/perf/util/hist.c                       | 11 +++++--
 tools/perf/util/hist.h                       |  1 +
 tools/perf/util/session.c                    |  4 ++-
 tools/perf/util/sort.c                       | 41 +++++++++++++++++++++++--
 tools/perf/util/sort.h                       |  2 ++
 14 files changed, 146 insertions(+), 13 deletions(-)
 create mode 100644 tools/perf/arch/powerpc/util/event.c
 create mode 100644 tools/perf/arch/powerpc/util/evsel.c

-- 
1.8.3.1



* [PATCH 1/4] powerpc/perf: Expose processor pipeline stage cycles using PERF_SAMPLE_WEIGHT_STRUCT
  2021-03-09 14:03 [PATCH 0/4] powerpc/perf: Export processor pipeline stage cycles information Athira Rajeev
@ 2021-03-09 14:03 ` Athira Rajeev
  2021-03-09 14:03 ` [PATCH 2/4] tools/perf: Add dynamic headers for perf report columns Athira Rajeev
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Athira Rajeev @ 2021-03-09 14:03 UTC
  To: linuxppc-dev, linux-kernel, linux-perf-users, mpe, acme, jolsa
  Cc: maddy, ravi.bangoria, kjain, kan.liang, peterz

Performance Monitoring Unit (PMU) registers in powerpc provide
information on cycles elapsed between different stages in the
pipeline. This can be used for application tuning. On ISA v3.1
platforms, this information is exposed by sampling registers.
This patch adds kernel support to capture two of the cycle counters
as part of the perf sample using the sample type
PERF_SAMPLE_WEIGHT_STRUCT.
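
As an illustration of how the two counters are pulled out of the sampling
register, here is a minimal user-space sketch that mirrors the two
P10_SIER2_* macros added to isa207-common.h in this patch. The sketch_*
names are hypothetical and exist only for this example; the shift and mask
values are copied from the macros. Power documentation numbers bits from
the most significant end (bit 0 is the MSB), which is why the shifts are
written as (63 - n).

```c
#include <stdint.h>

/* 11-bit field mask, same value as the 0x7ff used by the patch macros. */
#define SKETCH_CYC_MASK 0x7ffULL

/*
 * Mirrors P10_SIER2_FINISH_CYC: shifting right by (63 - 37) moves
 * IBM-numbered bit 37 down to the least significant position.
 */
static inline uint16_t sketch_finish_cyc(uint64_t sier2)
{
	return (uint16_t)((sier2 >> (63 - 37)) & SKETCH_CYC_MASK);
}

/*
 * Mirrors P10_SIER2_DISPATCH_CYC: shifting right by (63 - 13) moves
 * IBM-numbered bit 13 down to the least significant position.
 */
static inline uint16_t sketch_dispatch_cyc(uint64_t sier2)
{
	return (uint16_t)((sier2 >> (63 - 13)) & SKETCH_CYC_MASK);
}
```

Both results fit in 11 bits, so they can be stored in the 16-bit 'var2_w'
and 'var3_w' fields without truncation.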

The power PMU function 'get_mem_weight' currently uses the 64-bit weight
field of perf_sample_data to capture memory latency. But following the
introduction of PERF_SAMPLE_WEIGHT_TYPE, the weight field can contain a
64-bit or a 32-bit value depending on the architecture support for
PERF_SAMPLE_WEIGHT_STRUCT. This patch uses WEIGHT_STRUCT to expose the
pipeline stage cycles info. Hence update the ppmu functions to work for
64-bit and 32-bit weight values.

If the sample type is PERF_SAMPLE_WEIGHT, use the 64-bit weight field.
If the sample type is PERF_SAMPLE_WEIGHT_STRUCT, memory subsystem
latency is stored in the low 32 bits of the perf_sample_weight structure.
Also, for CPU_FTR_ARCH_31, capture the two cycle counters in the
two 16-bit fields of the perf_sample_weight structure.
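
The layout described above can be sketched in user space as follows. This
is only an illustration under stated assumptions: the sketch_* names are
hypothetical, the two SKETCH_SAMPLE_* constants are local stand-ins for the
uapi PERF_SAMPLE_WEIGHT and PERF_SAMPLE_WEIGHT_STRUCT bits, and shifts are
used instead of the union so the example does not depend on host
endianness. The special handling of threshold values 0 and 7 in
isa207_get_mem_weight() is left out.

```c
#include <stdint.h>

/* Local stand-ins for the uapi sample type bits (assumed values). */
#define SKETCH_SAMPLE_WEIGHT        (1ULL << 14) /* PERF_SAMPLE_WEIGHT */
#define SKETCH_SAMPLE_WEIGHT_STRUCT (1ULL << 24) /* PERF_SAMPLE_WEIGHT_STRUCT */

/* Threshold counter decode, as in the patch: mantissa << (2 * exp). */
static uint64_t sketch_latency(uint64_t mantissa, uint64_t exp)
{
	return mantissa << (2 * exp);
}

/*
 * Build the 64-bit value that ends up in the sample:
 * - WEIGHT:        the full 64 bits carry the latency ('full').
 * - WEIGHT_STRUCT: low 32 bits carry the latency ('var1_dw'), the next
 *                  16 bits the first counter ('var2_w'), and the top
 *                  16 bits the second counter ('var3_w').
 */
static uint64_t sketch_pack_weight(uint64_t type, uint64_t latency,
				   uint16_t finish_cyc, uint16_t dispatch_cyc)
{
	if (type & SKETCH_SAMPLE_WEIGHT)
		return latency;

	return (latency & 0xffffffffULL) |
	       ((uint64_t)finish_cyc << 32) |
	       ((uint64_t)dispatch_cyc << 48);
}
```

For instance, a latency of 588 with counters 7 and 5 packs to
0x000500070000024c under WEIGHT_STRUCT and to plain 588 under WEIGHT.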

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/perf_event_server.h |  2 +-
 arch/powerpc/perf/core-book3s.c              |  4 ++--
 arch/powerpc/perf/isa207-common.c            | 29 +++++++++++++++++++++++++---
 arch/powerpc/perf/isa207-common.h            |  6 +++++-
 4 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 00e7e671bb4b..112cf092d7b3 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -43,7 +43,7 @@ struct power_pmu {
 				u64 alt[]);
 	void		(*get_mem_data_src)(union perf_mem_data_src *dsrc,
 				u32 flags, struct pt_regs *regs);
-	void		(*get_mem_weight)(u64 *weight);
+	void		(*get_mem_weight)(u64 *weight, u64 type);
 	unsigned long	group_constraint_mask;
 	unsigned long	group_constraint_val;
 	u64             (*bhrb_filter_map)(u64 branch_sample_type);
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 6817331e22ff..57ff2494880c 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2206,9 +2206,9 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
 						ppmu->get_mem_data_src)
 			ppmu->get_mem_data_src(&data.data_src, ppmu->flags, regs);
 
-		if (event->attr.sample_type & PERF_SAMPLE_WEIGHT &&
+		if (event->attr.sample_type & PERF_SAMPLE_WEIGHT_TYPE &&
 						ppmu->get_mem_weight)
-			ppmu->get_mem_weight(&data.weight.full);
+			ppmu->get_mem_weight(&data.weight.full, event->attr.sample_type);
 
 		if (perf_event_overflow(event, &data, regs))
 			power_pmu_stop(event, 0);
diff --git a/arch/powerpc/perf/isa207-common.c b/arch/powerpc/perf/isa207-common.c
index e4f577da33d8..5dcbdbd54598 100644
--- a/arch/powerpc/perf/isa207-common.c
+++ b/arch/powerpc/perf/isa207-common.c
@@ -284,8 +284,10 @@ void isa207_get_mem_data_src(union perf_mem_data_src *dsrc, u32 flags,
 	}
 }
 
-void isa207_get_mem_weight(u64 *weight)
+void isa207_get_mem_weight(u64 *weight, u64 type)
 {
+	union perf_sample_weight *weight_fields;
+	u64 weight_lat;
 	u64 mmcra = mfspr(SPRN_MMCRA);
 	u64 exp = MMCRA_THR_CTR_EXP(mmcra);
 	u64 mantissa = MMCRA_THR_CTR_MANT(mmcra);
@@ -296,9 +298,30 @@ void isa207_get_mem_weight(u64 *weight)
 		mantissa = P10_MMCRA_THR_CTR_MANT(mmcra);
 
 	if (val == 0 || val == 7)
-		*weight = 0;
+		weight_lat = 0;
 	else
-		*weight = mantissa << (2 * exp);
+		weight_lat = mantissa << (2 * exp);
+
+	/*
+	 * Use 64 bit weight field (full) if sample type is
+	 * WEIGHT.
+	 *
+	 * if sample type is WEIGHT_STRUCT:
+	 * - store memory latency in the lower 32 bits.
+	 * - For ISA v3.1, use remaining two 16 bit fields of
+	 *   perf_sample_weight to store cycle counter values
+	 *   from sier2.
+	 */
+	weight_fields = (union perf_sample_weight *)weight;
+	if (type & PERF_SAMPLE_WEIGHT)
+		weight_fields->full = weight_lat;
+	else {
+		weight_fields->var1_dw = (u32)weight_lat;
+		if (cpu_has_feature(CPU_FTR_ARCH_31)) {
+			weight_fields->var2_w = P10_SIER2_FINISH_CYC(mfspr(SPRN_SIER2));
+			weight_fields->var3_w = P10_SIER2_DISPATCH_CYC(mfspr(SPRN_SIER2));
+		}
+	}
 }
 
 int isa207_get_constraint(u64 event, unsigned long *maskp, unsigned long *valp, u64 event_config1)
diff --git a/arch/powerpc/perf/isa207-common.h b/arch/powerpc/perf/isa207-common.h
index 1af0e8c97ac7..fc30d43c4d0c 100644
--- a/arch/powerpc/perf/isa207-common.h
+++ b/arch/powerpc/perf/isa207-common.h
@@ -265,6 +265,10 @@
 #define ISA207_SIER_DATA_SRC_SHIFT	53
 #define ISA207_SIER_DATA_SRC_MASK	(0x7ull << ISA207_SIER_DATA_SRC_SHIFT)
 
+/* Bits in SIER2/SIER3 for Power10 */
+#define P10_SIER2_FINISH_CYC(sier2)	(((sier2) >> (63 - 37)) & 0x7fful)
+#define P10_SIER2_DISPATCH_CYC(sier2)	(((sier2) >> (63 - 13)) & 0x7fful)
+
 #define P(a, b)				PERF_MEM_S(a, b)
 #define PH(a, b)			(P(LVL, HIT) | P(a, b))
 #define PM(a, b)			(P(LVL, MISS) | P(a, b))
@@ -278,6 +282,6 @@ int isa207_get_alternatives(u64 event, u64 alt[], int size, unsigned int flags,
 					const unsigned int ev_alt[][MAX_ALT]);
 void isa207_get_mem_data_src(union perf_mem_data_src *dsrc, u32 flags,
 							struct pt_regs *regs);
-void isa207_get_mem_weight(u64 *weight);
+void isa207_get_mem_weight(u64 *weight, u64 type);
 
 #endif
-- 
1.8.3.1



* [PATCH 2/4] tools/perf: Add dynamic headers for perf report columns
  2021-03-09 14:03 [PATCH 0/4] powerpc/perf: Export processor pipeline stage cycles information Athira Rajeev
  2021-03-09 14:03 ` [PATCH 1/4] powerpc/perf: Expose processor pipeline stage cycles using PERF_SAMPLE_WEIGHT_STRUCT Athira Rajeev
@ 2021-03-09 14:03 ` Athira Rajeev
  2021-03-12 12:57   ` Jiri Olsa
  2021-03-09 14:03 ` [PATCH 3/4] tools/perf: Add powerpc support for PERF_SAMPLE_WEIGHT_STRUCT Athira Rajeev
  2021-03-09 14:04 ` [PATCH 4/4] tools/perf: Support pipeline stage cycles for powerpc Athira Rajeev
  3 siblings, 1 reply; 11+ messages in thread
From: Athira Rajeev @ 2021-03-09 14:03 UTC
  To: linuxppc-dev, linux-kernel, linux-perf-users, mpe, acme, jolsa
  Cc: maddy, ravi.bangoria, kjain, kan.liang, peterz

Currently the header string for each column in perf report
is fixed. Some fields of a perf sample can have a different meaning
on different architectures than the meaning conveyed by the header
string. An example is the new field 'var2_w' of the perf_sample_weight
structure. This is presently shown as 'Local INSTR Latency' in
perf mem report, but it could be used to denote a different latency
cycle on another architecture.

Introduce a weak function arch_perf_header_entry__add() to set
the arch specific header string for the fields which can have a dynamic
header. If the architecture does not have this function, fall back to the
default header string value.
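
A minimal sketch of the weak-symbol fallback, assuming a GCC or Clang
toolchain. sketch_header_entry() is a hypothetical stand-in for
arch_perf_header_entry__add(); it is not the function from the patch.

```c
/*
 * Generic default. The 'weak' attribute lets an architecture object file
 * provide a regular (strong) definition of the same symbol; the linker
 * then picks the strong one. When no such definition is linked in, this
 * one is used and the header string is returned unchanged.
 */
__attribute__((weak)) const char *sketch_header_entry(const char *se_header)
{
	return se_header;
}
```

With only this definition linked, a call such as
sketch_header_entry("Local INSTR Latency") returns the same string it was
given, which corresponds to the fallback case described above.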

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 tools/perf/util/event.h |  1 +
 tools/perf/util/sort.c  | 19 ++++++++++++++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index f603edbbbc6f..89b149e2e70a 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -427,5 +427,6 @@ void  cpu_map_data__synthesize(struct perf_record_cpu_map_data *data, struct per
 
 void arch_perf_parse_sample_weight(struct perf_sample *data, const __u64 *array, u64 type);
 void arch_perf_synthesize_sample_weight(const struct perf_sample *data, __u64 *array, u64 type);
+const char *arch_perf_header_entry__add(const char *se_header);
 
 #endif /* __PERF_RECORD_H */
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 0d5ad42812b9..741a6df29fa0 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -25,6 +25,7 @@
 #include <traceevent/event-parse.h>
 #include "mem-events.h"
 #include "annotate.h"
+#include "event.h"
 #include "time-utils.h"
 #include "cgroup.h"
 #include "machine.h"
@@ -45,6 +46,7 @@
 regex_t		ignore_callees_regex;
 int		have_ignore_callees = 0;
 enum sort_mode	sort__mode = SORT_MODE__NORMAL;
+const char	*dynamic_headers[] = {"local_ins_lat"};
 
 /*
  * Replaces all occurrences of a char used with the:
@@ -1816,6 +1818,16 @@ struct sort_dimension {
 	int			taken;
 };
 
+const char * __weak arch_perf_header_entry__add(const char *se_header)
+{
+	return se_header;
+}
+
+static void sort_dimension_add_dynamic_header(struct sort_dimension *sd)
+{
+	sd->entry->se_header = arch_perf_header_entry__add(sd->entry->se_header);
+}
+
 #define DIM(d, n, func) [d] = { .name = n, .entry = &(func) }
 
 static struct sort_dimension common_sort_dimensions[] = {
@@ -2739,11 +2751,16 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
 			struct evlist *evlist,
 			int level)
 {
-	unsigned int i;
+	unsigned int i, j;
 
 	for (i = 0; i < ARRAY_SIZE(common_sort_dimensions); i++) {
 		struct sort_dimension *sd = &common_sort_dimensions[i];
 
+		for (j = 0; j < ARRAY_SIZE(dynamic_headers); j++) {
+			if (!strcmp(dynamic_headers[j], sd->name))
+				sort_dimension_add_dynamic_header(sd);
+		}
+
 		if (strncasecmp(tok, sd->name, strlen(tok)))
 			continue;
 
-- 
1.8.3.1



* [PATCH 3/4] tools/perf: Add powerpc support for PERF_SAMPLE_WEIGHT_STRUCT
  2021-03-09 14:03 [PATCH 0/4] powerpc/perf: Export processor pipeline stage cycles information Athira Rajeev
  2021-03-09 14:03 ` [PATCH 1/4] powerpc/perf: Expose processor pipeline stage cycles using PERF_SAMPLE_WEIGHT_STRUCT Athira Rajeev
  2021-03-09 14:03 ` [PATCH 2/4] tools/perf: Add dynamic headers for perf report columns Athira Rajeev
@ 2021-03-09 14:03 ` Athira Rajeev
  2021-03-09 14:04 ` [PATCH 4/4] tools/perf: Support pipeline stage cycles for powerpc Athira Rajeev
  3 siblings, 0 replies; 11+ messages in thread
From: Athira Rajeev @ 2021-03-09 14:03 UTC
  To: linuxppc-dev, linux-kernel, linux-perf-users, mpe, acme, jolsa
  Cc: maddy, ravi.bangoria, kjain, kan.liang, peterz

Add arch specific arch_evsel__set_sample_weight() to set the new
sample type for powerpc.

Add arch specific arch_perf_parse_sample_weight() to store the
sample->weight values depending on the sample type applied.
If the new sample type (PERF_SAMPLE_WEIGHT_STRUCT) is applied,
store only the lower 32 bits to sample->weight. If the sample type
is 'PERF_SAMPLE_WEIGHT', store the full 64-bit value to sample->weight.
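
The selection rule can be sketched in isolation as below. The names are
hypothetical and the two SKETCH_SAMPLE_* constants are local stand-ins for
the uapi sample type bits (assumed values); the real implementation reads
the value through union perf_sample_weight rather than masking.

```c
#include <stdint.h>

/* Local stand-ins for the uapi sample type bits (assumed values). */
#define SKETCH_SAMPLE_WEIGHT        (1ULL << 14) /* PERF_SAMPLE_WEIGHT */
#define SKETCH_SAMPLE_WEIGHT_STRUCT (1ULL << 24) /* PERF_SAMPLE_WEIGHT_STRUCT */

/*
 * Return the value to store in sample->weight:
 * the full 64 bits for WEIGHT, only the low 32 bits otherwise.
 */
static uint64_t sketch_parse_weight(uint64_t raw, uint64_t type)
{
	if (type & SKETCH_SAMPLE_WEIGHT)
		return raw;

	return raw & 0xffffffffULL;
}
```

Masking matters for WEIGHT_STRUCT because the upper 32 bits of the raw
value may hold other fields, which would otherwise inflate the reported
weight.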

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 tools/perf/arch/powerpc/util/Build   |  2 ++
 tools/perf/arch/powerpc/util/event.c | 32 ++++++++++++++++++++++++++++++++
 tools/perf/arch/powerpc/util/evsel.c |  8 ++++++++
 3 files changed, 42 insertions(+)
 create mode 100644 tools/perf/arch/powerpc/util/event.c
 create mode 100644 tools/perf/arch/powerpc/util/evsel.c

diff --git a/tools/perf/arch/powerpc/util/Build b/tools/perf/arch/powerpc/util/Build
index b7945e5a543b..8a79c4126e5b 100644
--- a/tools/perf/arch/powerpc/util/Build
+++ b/tools/perf/arch/powerpc/util/Build
@@ -4,6 +4,8 @@ perf-y += kvm-stat.o
 perf-y += perf_regs.o
 perf-y += mem-events.o
 perf-y += sym-handling.o
+perf-y += evsel.o
+perf-y += event.o
 
 perf-$(CONFIG_DWARF) += dwarf-regs.o
 perf-$(CONFIG_DWARF) += skip-callchain-idx.o
diff --git a/tools/perf/arch/powerpc/util/event.c b/tools/perf/arch/powerpc/util/event.c
new file mode 100644
index 000000000000..f49d32c2c8ae
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/event.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/types.h>
+#include <linux/string.h>
+#include <linux/zalloc.h>
+
+#include "../../../util/event.h"
+#include "../../../util/synthetic-events.h"
+#include "../../../util/machine.h"
+#include "../../../util/tool.h"
+#include "../../../util/map.h"
+#include "../../../util/debug.h"
+
+void arch_perf_parse_sample_weight(struct perf_sample *data,
+				   const __u64 *array, u64 type)
+{
+	union perf_sample_weight weight;
+
+	weight.full = *array;
+	if (type & PERF_SAMPLE_WEIGHT)
+		data->weight = weight.full;
+	else
+		data->weight = weight.var1_dw;
+}
+
+void arch_perf_synthesize_sample_weight(const struct perf_sample *data,
+					__u64 *array, u64 type)
+{
+	*array = data->weight;
+
+	if (type & PERF_SAMPLE_WEIGHT_STRUCT)
+		*array &= 0xffffffff;
+}
diff --git a/tools/perf/arch/powerpc/util/evsel.c b/tools/perf/arch/powerpc/util/evsel.c
new file mode 100644
index 000000000000..2f733cdc8dbb
--- /dev/null
+++ b/tools/perf/arch/powerpc/util/evsel.c
@@ -0,0 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include "util/evsel.h"
+
+void arch_evsel__set_sample_weight(struct evsel *evsel)
+{
+	evsel__set_sample_bit(evsel, WEIGHT_STRUCT);
+}
-- 
1.8.3.1



* [PATCH 4/4] tools/perf: Support pipeline stage cycles for powerpc
  2021-03-09 14:03 [PATCH 0/4] powerpc/perf: Export processor pipeline stage cycles information Athira Rajeev
                   ` (2 preceding siblings ...)
  2021-03-09 14:03 ` [PATCH 3/4] tools/perf: Add powerpc support for PERF_SAMPLE_WEIGHT_STRUCT Athira Rajeev
@ 2021-03-09 14:04 ` Athira Rajeev
  2021-03-12 12:56   ` Jiri Olsa
  3 siblings, 1 reply; 11+ messages in thread
From: Athira Rajeev @ 2021-03-09 14:04 UTC
  To: linuxppc-dev, linux-kernel, linux-perf-users, mpe, acme, jolsa
  Cc: maddy, ravi.bangoria, kjain, kan.liang, peterz

The pipeline stage cycle details can be recorded on powerpc from
the contents of Performance Monitor Unit (PMU) registers. On
ISA v3.1 platforms, sampling registers expose the cycles spent in
different pipeline stages. This patch adds perf tools support to present
two of the cycle counters along with memory latency (weight).

Re-use the field 'ins_lat' for storing the first pipeline stage cycle.
This is stored in 'var2_w' field of 'perf_sample_weight'.

Add a new field 'p_stage_cyc' to store the second pipeline stage cycle
which is stored in 'var3_w' field of perf_sample_weight.

Add a new sort function 'Pipeline Stage Cycle' and include it in
default_mem_sort_order[]. This new sort function may be used to denote
some other pipeline stage on another architecture, so add it to the
list of sort entries that can have a dynamic header string.
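
To make the dynamic header lookup concrete, here is a small stand-alone
sketch. The struct and the sketch_* names are hypothetical and much simpler
than perf's struct sort_dimension; only the key names and the header
strings are taken from this series.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for a sort dimension: key name plus column header. */
struct sketch_sort_key {
	const char *name;
	const char *header;
};

/* Sort keys whose header may be replaced by the architecture. */
static const char *const sketch_dynamic_keys[] = {
	"local_ins_lat", "p_stage_cyc",
};

/* powerpc-style override, using the same string mapping as this patch. */
static const char *sketch_arch_header(const char *header)
{
	if (!strcmp(header, "Local INSTR Latency"))
		return "Finish Cyc";
	if (!strcmp(header, "Pipeline Stage Cycle"))
		return "Dispatch Cyc";
	return header;
}

/* Rewrite the header only when the key is in the dynamic list. */
static void sketch_apply_dynamic_header(struct sketch_sort_key *key)
{
	size_t i;
	size_t n = sizeof(sketch_dynamic_keys) / sizeof(sketch_dynamic_keys[0]);

	for (i = 0; i < n; i++) {
		if (!strcmp(sketch_dynamic_keys[i], key->name)) {
			key->header = sketch_arch_header(key->header);
			return;
		}
	}
}
```

Applying the function twice to the same key is harmless in this sketch: once
the header reads "Dispatch Cyc" it no longer matches either generic string,
so the second call leaves it unchanged.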

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 tools/perf/Documentation/perf-report.txt |  1 +
 tools/perf/arch/powerpc/util/event.c     | 18 ++++++++++++++++--
 tools/perf/util/event.h                  |  1 +
 tools/perf/util/hist.c                   | 11 ++++++++---
 tools/perf/util/hist.h                   |  1 +
 tools/perf/util/session.c                |  4 +++-
 tools/perf/util/sort.c                   | 24 ++++++++++++++++++++++--
 tools/perf/util/sort.h                   |  2 ++
 8 files changed, 54 insertions(+), 8 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index f546b5e9db05..9691d9c227ba 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -112,6 +112,7 @@ OPTIONS
 	- ins_lat: Instruction latency in core cycles. This is the global instruction
 	  latency
 	- local_ins_lat: Local instruction latency version
+	- p_stage_cyc: Number of cycles spent in a pipeline stage.
 
 	By default, comm, dso and symbol keys are used.
 	(i.e. --sort comm,dso,symbol)
diff --git a/tools/perf/arch/powerpc/util/event.c b/tools/perf/arch/powerpc/util/event.c
index f49d32c2c8ae..b80fbee83b6e 100644
--- a/tools/perf/arch/powerpc/util/event.c
+++ b/tools/perf/arch/powerpc/util/event.c
@@ -18,8 +18,11 @@ void arch_perf_parse_sample_weight(struct perf_sample *data,
 	weight.full = *array;
 	if (type & PERF_SAMPLE_WEIGHT)
 		data->weight = weight.full;
-	else
+	else {
 		data->weight = weight.var1_dw;
+		data->ins_lat = weight.var2_w;
+		data->p_stage_cyc = weight.var3_w;
+	}
 }
 
 void arch_perf_synthesize_sample_weight(const struct perf_sample *data,
@@ -27,6 +30,17 @@ void arch_perf_synthesize_sample_weight(const struct perf_sample *data,
 {
 	*array = data->weight;
 
-	if (type & PERF_SAMPLE_WEIGHT_STRUCT)
+	if (type & PERF_SAMPLE_WEIGHT_STRUCT) {
 		*array &= 0xffffffff;
+		*array |= ((u64)data->ins_lat << 32);
+	}
+}
+
+const char *arch_perf_header_entry__add(const char *se_header)
+{
+	if (!strcmp(se_header, "Local INSTR Latency"))
+		return "Finish Cyc";
+	else if (!strcmp(se_header, "Pipeline Stage Cycle"))
+		return "Dispatch Cyc";
+	return se_header;
 }
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 89b149e2e70a..65f89e80916f 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -147,6 +147,7 @@ struct perf_sample {
 	u8  cpumode;
 	u16 misc;
 	u16 ins_lat;
+	u16 p_stage_cyc;
 	bool no_hw_idx;		/* No hw_idx collected in branch_stack */
 	char insn[MAX_INSN];
 	void *raw_data;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index c82f5fc26af8..9299ee535518 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -211,6 +211,7 @@ void hists__calc_col_len(struct hists *hists, struct hist_entry *h)
 	hists__new_col_len(hists, HISTC_MEM_BLOCKED, 10);
 	hists__new_col_len(hists, HISTC_LOCAL_INS_LAT, 13);
 	hists__new_col_len(hists, HISTC_GLOBAL_INS_LAT, 13);
+	hists__new_col_len(hists, HISTC_P_STAGE_CYC, 13);
 	if (symbol_conf.nanosecs)
 		hists__new_col_len(hists, HISTC_TIME, 16);
 	else
@@ -289,13 +290,14 @@ static long hist_time(unsigned long htime)
 }
 
 static void he_stat__add_period(struct he_stat *he_stat, u64 period,
-				u64 weight, u64 ins_lat)
+				u64 weight, u64 ins_lat, u64 p_stage_cyc)
 {
 
 	he_stat->period		+= period;
 	he_stat->weight		+= weight;
 	he_stat->nr_events	+= 1;
 	he_stat->ins_lat	+= ins_lat;
+	he_stat->p_stage_cyc	+= p_stage_cyc;
 }
 
 static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
@@ -308,6 +310,7 @@ static void he_stat__add_stat(struct he_stat *dest, struct he_stat *src)
 	dest->nr_events		+= src->nr_events;
 	dest->weight		+= src->weight;
 	dest->ins_lat		+= src->ins_lat;
+	dest->p_stage_cyc		+= src->p_stage_cyc;
 }
 
 static void he_stat__decay(struct he_stat *he_stat)
@@ -597,6 +600,7 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
 	u64 period = entry->stat.period;
 	u64 weight = entry->stat.weight;
 	u64 ins_lat = entry->stat.ins_lat;
+	u64 p_stage_cyc = entry->stat.p_stage_cyc;
 	bool leftmost = true;
 
 	p = &hists->entries_in->rb_root.rb_node;
@@ -615,11 +619,11 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
 
 		if (!cmp) {
 			if (sample_self) {
-				he_stat__add_period(&he->stat, period, weight, ins_lat);
+				he_stat__add_period(&he->stat, period, weight, ins_lat, p_stage_cyc);
 				hist_entry__add_callchain_period(he, period);
 			}
 			if (symbol_conf.cumulate_callchain)
-				he_stat__add_period(he->stat_acc, period, weight, ins_lat);
+				he_stat__add_period(he->stat_acc, period, weight, ins_lat, p_stage_cyc);
 
 			/*
 			 * This mem info was allocated from sample__resolve_mem
@@ -731,6 +735,7 @@ static void hists__res_sample(struct hist_entry *he, struct perf_sample *sample)
 			.period	= sample->period,
 			.weight = sample->weight,
 			.ins_lat = sample->ins_lat,
+			.p_stage_cyc = sample->p_stage_cyc,
 		},
 		.parent = sym_parent,
 		.filtered = symbol__parent_filter(sym_parent) | al->filtered,
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 3c537232294b..e2faa745c8d6 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -75,6 +75,7 @@ enum hist_column {
 	HISTC_MEM_BLOCKED,
 	HISTC_LOCAL_INS_LAT,
 	HISTC_GLOBAL_INS_LAT,
+	HISTC_P_STAGE_CYC,
 	HISTC_NR_COLS, /* Last entry */
 };
 
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 859832a82496..a6fed96d783d 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1302,8 +1302,10 @@ static void dump_sample(struct evsel *evsel, union perf_event *event,
 
 	if (sample_type & PERF_SAMPLE_WEIGHT_TYPE) {
 		printf("... weight: %" PRIu64 "", sample->weight);
-			if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT)
+			if (sample_type & PERF_SAMPLE_WEIGHT_STRUCT) {
 				printf(",0x%"PRIx16"", sample->ins_lat);
+				printf(",0x%"PRIx16"", sample->p_stage_cyc);
+			}
 		printf("\n");
 	}
 
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 741a6df29fa0..cbb3899e7eca 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -37,7 +37,7 @@
 const char	*parent_pattern = default_parent_pattern;
 const char	*default_sort_order = "comm,dso,symbol";
 const char	default_branch_sort_order[] = "comm,dso_from,symbol_from,symbol_to,cycles";
-const char	default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat";
+const char	default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,p_stage_cyc";
 const char	default_top_sort_order[] = "dso,symbol";
 const char	default_diff_sort_order[] = "dso,symbol";
 const char	default_tracepoint_sort_order[] = "trace";
@@ -46,7 +46,7 @@
 regex_t		ignore_callees_regex;
 int		have_ignore_callees = 0;
 enum sort_mode	sort__mode = SORT_MODE__NORMAL;
-const char	*dynamic_headers[] = {"local_ins_lat"};
+const char	*dynamic_headers[] = {"local_ins_lat", "p_stage_cyc"};
 
 /*
  * Replaces all occurrences of a char used with the:
@@ -1410,6 +1410,25 @@ struct sort_entry sort_global_ins_lat = {
 	.se_width_idx	= HISTC_GLOBAL_INS_LAT,
 };
 
+static int64_t
+sort__global_p_stage_cyc_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+	return left->stat.p_stage_cyc - right->stat.p_stage_cyc;
+}
+
+static int hist_entry__p_stage_cyc_snprintf(struct hist_entry *he, char *bf,
+					size_t size, unsigned int width)
+{
+	return repsep_snprintf(bf, size, "%-*u", width, he->stat.p_stage_cyc);
+}
+
+struct sort_entry sort_p_stage_cyc = {
+	.se_header      = "Pipeline Stage Cycle",
+	.se_cmp         = sort__global_p_stage_cyc_cmp,
+	.se_snprintf	= hist_entry__p_stage_cyc_snprintf,
+	.se_width_idx	= HISTC_P_STAGE_CYC,
+};
+
 struct sort_entry sort_mem_daddr_sym = {
 	.se_header	= "Data Symbol",
 	.se_cmp		= sort__daddr_cmp,
@@ -1853,6 +1872,7 @@ static void sort_dimension_add_dynamic_header(struct sort_dimension *sd)
 	DIM(SORT_CODE_PAGE_SIZE, "code_page_size", sort_code_page_size),
 	DIM(SORT_LOCAL_INS_LAT, "local_ins_lat", sort_local_ins_lat),
 	DIM(SORT_GLOBAL_INS_LAT, "ins_lat", sort_global_ins_lat),
+	DIM(SORT_P_STAGE_CYC, "p_stage_cyc", sort_p_stage_cyc),
 };
 
 #undef DIM
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 63f67a3f3630..23b20cbbc846 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -51,6 +51,7 @@ struct he_stat {
 	u64			period_guest_us;
 	u64			weight;
 	u64			ins_lat;
+	u64			p_stage_cyc;
 	u32			nr_events;
 };
 
@@ -234,6 +235,7 @@ enum sort_type {
 	SORT_CODE_PAGE_SIZE,
 	SORT_LOCAL_INS_LAT,
 	SORT_GLOBAL_INS_LAT,
+	SORT_P_STAGE_CYC,
 
 	/* branch stack specific sort keys */
 	__SORT_BRANCH_STACK,
-- 
1.8.3.1



* Re: [PATCH 4/4] tools/perf: Support pipeline stage cycles for powerpc
  2021-03-09 14:04 ` [PATCH 4/4] tools/perf: Support pipeline stage cycles for powerpc Athira Rajeev
@ 2021-03-12 12:56   ` Jiri Olsa
  2021-03-15  7:52     ` Athira Rajeev
  0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2021-03-12 12:56 UTC
  To: Athira Rajeev
  Cc: linuxppc-dev, linux-kernel, linux-perf-users, mpe, acme, jolsa,
	maddy, ravi.bangoria, kjain, kan.liang, peterz

On Tue, Mar 09, 2021 at 09:04:00AM -0500, Athira Rajeev wrote:
> The pipeline stage cycle details can be recorded on powerpc from
> the contents of Performance Monitor Unit (PMU) registers. On
> ISA v3.1 platforms, the sampling registers expose the cycles spent in
> different pipeline stages. This patch adds perf tools support to present
> two of these cycle counters along with memory latency (weight).
> 
> Re-use the field 'ins_lat' for storing the first pipeline stage cycle.
> This is stored in the 'var2_w' field of 'perf_sample_weight'.
> 
> Add a new field 'p_stage_cyc' to store the second pipeline stage cycle,
> which is stored in the 'var3_w' field of perf_sample_weight.
> 
> Add a new sort function 'Pipeline Stage Cycle' and include it in
> default_mem_sort_order[]. This new sort function may be used to denote
> some other pipeline stage in another architecture, so add it to the
> list of sort entries that can have a dynamic header string.
> 
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
>  tools/perf/Documentation/perf-report.txt |  1 +
>  tools/perf/arch/powerpc/util/event.c     | 18 ++++++++++++++++--
>  tools/perf/util/event.h                  |  1 +
>  tools/perf/util/hist.c                   | 11 ++++++++---
>  tools/perf/util/hist.h                   |  1 +
>  tools/perf/util/session.c                |  4 +++-
>  tools/perf/util/sort.c                   | 24 ++++++++++++++++++++++--
>  tools/perf/util/sort.h                   |  2 ++
>  8 files changed, 54 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index f546b5e9db05..9691d9c227ba 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -112,6 +112,7 @@ OPTIONS
>  	- ins_lat: Instruction latency in core cycles. This is the global instruction
>  	  latency
>  	- local_ins_lat: Local instruction latency version
> +	- p_stage_cyc: Number of cycles spent in a pipeline stage.

please specify in here that it's ppc only

SNIP

> +struct sort_entry sort_p_stage_cyc = {
> +	.se_header      = "Pipeline Stage Cycle",
> +	.se_cmp         = sort__global_p_stage_cyc_cmp,
> +	.se_snprintf	= hist_entry__p_stage_cyc_snprintf,
> +	.se_width_idx	= HISTC_P_STAGE_CYC,
> +};
> +
>  struct sort_entry sort_mem_daddr_sym = {
>  	.se_header	= "Data Symbol",
>  	.se_cmp		= sort__daddr_cmp,
> @@ -1853,6 +1872,7 @@ static void sort_dimension_add_dynamic_header(struct sort_dimension *sd)
>  	DIM(SORT_CODE_PAGE_SIZE, "code_page_size", sort_code_page_size),
>  	DIM(SORT_LOCAL_INS_LAT, "local_ins_lat", sort_local_ins_lat),
>  	DIM(SORT_GLOBAL_INS_LAT, "ins_lat", sort_global_ins_lat),
> +	DIM(SORT_P_STAGE_CYC, "p_stage_cyc", sort_p_stage_cyc),

this might be out of scope for this patch, but would it make sense
to add arch specific sort dimension? so the specific column is
not even visible on arch that it's not supported on


>  };
>  
>  #undef DIM
> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index 63f67a3f3630..23b20cbbc846 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -51,6 +51,7 @@ struct he_stat {
>  	u64			period_guest_us;
>  	u64			weight;
>  	u64			ins_lat;
> +	u64			p_stage_cyc;
>  	u32			nr_events;
>  };
>  
> @@ -234,6 +235,7 @@ enum sort_type {
>  	SORT_CODE_PAGE_SIZE,
>  	SORT_LOCAL_INS_LAT,
>  	SORT_GLOBAL_INS_LAT,
> +	SORT_P_STAGE_CYC,

we could have the whole 'SORT_PIPELINE_STAGE_CYC',
so it's more obvious

thanks,
jirka
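
The field mapping described in the quoted commit message ('ins_lat' taken
from 'var2_w', 'p_stage_cyc' taken from 'var3_w') can be sketched as below.
This is a minimal illustration, assuming the PERF_SAMPLE_WEIGHT_STRUCT
layout in which the 64-bit value carries var1_dw in the low 32 bits, var2_w
in bits 32-47 and var3_w in bits 48-63. The names weight_fields and
unpack_weight() are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* Hypothetical container for the three sub-fields of one weight sample. */
struct weight_fields {
	uint32_t var1_dw;	/* assumed: memory latency (weight) */
	uint16_t var2_w;	/* first pipeline stage cycle, kept in 'ins_lat' */
	uint16_t var3_w;	/* second pipeline stage cycle, kept in 'p_stage_cyc' */
};

/* Split the 64-bit value with shifts, so host byte order does not matter. */
static struct weight_fields unpack_weight(uint64_t full)
{
	struct weight_fields f;

	f.var1_dw = (uint32_t)(full & 0xffffffffULL);
	f.var2_w  = (uint16_t)((full >> 32) & 0xffff);
	f.var3_w  = (uint16_t)((full >> 48) & 0xffff);
	return f;
}
```

Using shifts rather than overlaying a struct keeps the sketch independent of
the bitfield-endianness handling that the real union definition needs.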



* Re: [PATCH 2/4] tools/perf: Add dynamic headers for perf report columns
  2021-03-09 14:03 ` [PATCH 2/4] tools/perf: Add dynamic headers for perf report columns Athira Rajeev
@ 2021-03-12 12:57   ` Jiri Olsa
  2021-03-15  7:41     ` Athira Rajeev
  0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2021-03-12 12:57 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: linuxppc-dev, linux-kernel, linux-perf-users, mpe, acme, jolsa,
	maddy, ravi.bangoria, kjain, kan.liang, peterz

On Tue, Mar 09, 2021 at 09:03:58AM -0500, Athira Rajeev wrote:
> Currently the header string for different columns in perf report
> is fixed. Some fields of a perf sample could have a different meaning
> on different architectures than the meaning conveyed by the header
> string. An example is the new field 'var2_w' of the perf_sample_weight
> structure. This is presently captured as 'Local INSTR Latency' in
> perf mem report, but it could be used to denote a different latency
> cycle on another architecture.
> 
> Introduce a weak function arch_perf_header_entry__add() to set
> the arch-specific header string for the fields which can have a dynamic
> header. If the architecture does not provide this function, fall back to the
> default header string value.
> 
> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
> ---
>  tools/perf/util/event.h |  1 +
>  tools/perf/util/sort.c  | 19 ++++++++++++++++++-
>  2 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index f603edbbbc6f..89b149e2e70a 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -427,5 +427,6 @@ void  cpu_map_data__synthesize(struct perf_record_cpu_map_data *data, struct per
>  
>  void arch_perf_parse_sample_weight(struct perf_sample *data, const __u64 *array, u64 type);
>  void arch_perf_synthesize_sample_weight(const struct perf_sample *data, __u64 *array, u64 type);
> +const char *arch_perf_header_entry__add(const char *se_header);
>  
>  #endif /* __PERF_RECORD_H */
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 0d5ad42812b9..741a6df29fa0 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -25,6 +25,7 @@
>  #include <traceevent/event-parse.h>
>  #include "mem-events.h"
>  #include "annotate.h"
> +#include "event.h"
>  #include "time-utils.h"
>  #include "cgroup.h"
>  #include "machine.h"
> @@ -45,6 +46,7 @@
>  regex_t		ignore_callees_regex;
>  int		have_ignore_callees = 0;
>  enum sort_mode	sort__mode = SORT_MODE__NORMAL;
> +const char	*dynamic_headers[] = {"local_ins_lat"};
>  
>  /*
>   * Replaces all occurrences of a char used with the:
> @@ -1816,6 +1818,16 @@ struct sort_dimension {
>  	int			taken;
>  };
>  
> +const char * __weak arch_perf_header_entry__add(const char *se_header)

no need for the __add suffix in here

jirka

> +{
> +	return se_header;
> +}
> +
> +static void sort_dimension_add_dynamic_header(struct sort_dimension *sd)
> +{
> +	sd->entry->se_header = arch_perf_header_entry__add(sd->entry->se_header);
> +}
> +
>  #define DIM(d, n, func) [d] = { .name = n, .entry = &(func) }
>  
>  static struct sort_dimension common_sort_dimensions[] = {
> @@ -2739,11 +2751,16 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
>  			struct evlist *evlist,
>  			int level)
>  {
> -	unsigned int i;
> +	unsigned int i, j;
>  
>  	for (i = 0; i < ARRAY_SIZE(common_sort_dimensions); i++) {
>  		struct sort_dimension *sd = &common_sort_dimensions[i];
>  
> +		for (j = 0; j < ARRAY_SIZE(dynamic_headers); j++) {
> +			if (!strcmp(dynamic_headers[j], sd->name))
> +				sort_dimension_add_dynamic_header(sd);
> +		}
> +
>  		if (strncasecmp(tok, sd->name, strlen(tok)))
>  			continue;
>  
> -- 
> 1.8.3.1
> 
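
The mechanism in the quoted patch (a weak default that returns the header
unchanged, consulted only for keys listed in dynamic_headers[]) can be
sketched in isolation as below. This is a simplified illustration, not the
perf code: struct sort_entry_sketch and apply_dynamic_header() are
hypothetical stand-ins, the hook is named without the __add suffix as
suggested in the review, and __attribute__((weak)) assumes GCC or Clang.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Simplified stand-in for perf's sort entry; only the header matters here. */
struct sort_entry_sketch {
	const char *name;	/* sort key, e.g. "local_ins_lat" */
	const char *se_header;	/* column header shown by perf report */
};

/* Sort keys whose header an architecture is allowed to rename. */
static const char *dynamic_headers[] = { "local_ins_lat" };

/*
 * Weak default: keep the generic header. An architecture can provide a
 * non-weak definition in another object file to override it at link time.
 */
__attribute__((weak)) const char *arch_perf_header_entry(const char *se_header)
{
	return se_header;
}

/* Consult the arch hook only for keys listed in dynamic_headers[]. */
static void apply_dynamic_header(struct sort_entry_sketch *se)
{
	size_t j;

	for (j = 0; j < ARRAY_SIZE(dynamic_headers); j++) {
		if (!strcmp(dynamic_headers[j], se->name))
			se->se_header = arch_perf_header_entry(se->se_header);
	}
}
```

With only the weak default linked in, every header stays as it was; an
architecture-specific definition could instead return a string such as
"Dispatch Cyc" for the headers it wants to rename.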



* Re: [PATCH 2/4] tools/perf: Add dynamic headers for perf report columns
  2021-03-12 12:57   ` Jiri Olsa
@ 2021-03-15  7:41     ` Athira Rajeev
  0 siblings, 0 replies; 11+ messages in thread
From: Athira Rajeev @ 2021-03-15  7:41 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: linuxppc-dev, linux-kernel, linux-perf-users, Michael Ellerman,
	Arnaldo Carvalho de Melo, jolsa, Madhavan Srinivasan,
	ravi.bangoria, kjain, kan.liang, peterz



> On 12-Mar-2021, at 6:27 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
> On Tue, Mar 09, 2021 at 09:03:58AM -0500, Athira Rajeev wrote:
>> Currently the header string for different columns in perf report
>> is fixed. Some fields of a perf sample could have a different meaning
>> on different architectures than the meaning conveyed by the header
>> string. An example is the new field 'var2_w' of the perf_sample_weight
>> structure. This is presently captured as 'Local INSTR Latency' in
>> perf mem report, but it could be used to denote a different latency
>> cycle on another architecture.
>> 
>> Introduce a weak function arch_perf_header_entry__add() to set
>> the arch-specific header string for the fields which can have a dynamic
>> header. If the architecture does not provide this function, fall back to the
>> default header string value.
>> 
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> ---
>> tools/perf/util/event.h |  1 +
>> tools/perf/util/sort.c  | 19 ++++++++++++++++++-
>> 2 files changed, 19 insertions(+), 1 deletion(-)
>> 
>> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
>> index f603edbbbc6f..89b149e2e70a 100644
>> --- a/tools/perf/util/event.h
>> +++ b/tools/perf/util/event.h
>> @@ -427,5 +427,6 @@ void  cpu_map_data__synthesize(struct perf_record_cpu_map_data *data, struct per
>> 
>> void arch_perf_parse_sample_weight(struct perf_sample *data, const __u64 *array, u64 type);
>> void arch_perf_synthesize_sample_weight(const struct perf_sample *data, __u64 *array, u64 type);
>> +const char *arch_perf_header_entry__add(const char *se_header);
>> 
>> #endif /* __PERF_RECORD_H */
>> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
>> index 0d5ad42812b9..741a6df29fa0 100644
>> --- a/tools/perf/util/sort.c
>> +++ b/tools/perf/util/sort.c
>> @@ -25,6 +25,7 @@
>> #include <traceevent/event-parse.h>
>> #include "mem-events.h"
>> #include "annotate.h"
>> +#include "event.h"
>> #include "time-utils.h"
>> #include "cgroup.h"
>> #include "machine.h"
>> @@ -45,6 +46,7 @@
>> regex_t		ignore_callees_regex;
>> int		have_ignore_callees = 0;
>> enum sort_mode	sort__mode = SORT_MODE__NORMAL;
>> +const char	*dynamic_headers[] = {"local_ins_lat"};
>> 
>> /*
>>  * Replaces all occurrences of a char used with the:
>> @@ -1816,6 +1818,16 @@ struct sort_dimension {
>> 	int			taken;
>> };
>> 
>> +const char * __weak arch_perf_header_entry__add(const char *se_header)
> 
> no need for the __add suffix in here
> 
> jirka
> 

Thanks Jiri for the review.

I will include this change in the next version.

Thanks
Athira

>> +{
>> +	return se_header;
>> +}
>> +
>> +static void sort_dimension_add_dynamic_header(struct sort_dimension *sd)
>> +{
>> +	sd->entry->se_header = arch_perf_header_entry__add(sd->entry->se_header);
>> +}
>> +
>> #define DIM(d, n, func) [d] = { .name = n, .entry = &(func) }
>> 
>> static struct sort_dimension common_sort_dimensions[] = {
>> @@ -2739,11 +2751,16 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
>> 			struct evlist *evlist,
>> 			int level)
>> {
>> -	unsigned int i;
>> +	unsigned int i, j;
>> 
>> 	for (i = 0; i < ARRAY_SIZE(common_sort_dimensions); i++) {
>> 		struct sort_dimension *sd = &common_sort_dimensions[i];
>> 
>> +		for (j = 0; j < ARRAY_SIZE(dynamic_headers); j++) {
>> +			if (!strcmp(dynamic_headers[j], sd->name))
>> +				sort_dimension_add_dynamic_header(sd);
>> +		}
>> +
>> 		if (strncasecmp(tok, sd->name, strlen(tok)))
>> 			continue;
>> 
>> -- 
>> 1.8.3.1



* Re: [PATCH 4/4] tools/perf: Support pipeline stage cycles for powerpc
  2021-03-12 12:56   ` Jiri Olsa
@ 2021-03-15  7:52     ` Athira Rajeev
  2021-03-15 23:18       ` Jiri Olsa
  0 siblings, 1 reply; 11+ messages in thread
From: Athira Rajeev @ 2021-03-15  7:52 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Ravi Bangoria, Madhavan Srinivasan, Peter Zijlstra, linux-kernel,
	acme, linux-perf-users, jolsa, kjain, linuxppc-dev, kan.liang



> On 12-Mar-2021, at 6:26 PM, Jiri Olsa <jolsa@redhat.com> wrote:
> 
> On Tue, Mar 09, 2021 at 09:04:00AM -0500, Athira Rajeev wrote:
>> The pipeline stage cycle details can be recorded on powerpc from
>> the contents of Performance Monitor Unit (PMU) registers. On
>> ISA v3.1 platforms, the sampling registers expose the cycles spent in
>> different pipeline stages. This patch adds perf tools support to present
>> two of these cycle counters along with memory latency (weight).
>> 
>> Re-use the field 'ins_lat' for storing the first pipeline stage cycle.
>> This is stored in the 'var2_w' field of 'perf_sample_weight'.
>> 
>> Add a new field 'p_stage_cyc' to store the second pipeline stage cycle,
>> which is stored in the 'var3_w' field of perf_sample_weight.
>> 
>> Add a new sort function 'Pipeline Stage Cycle' and include it in
>> default_mem_sort_order[]. This new sort function may be used to denote
>> some other pipeline stage in another architecture, so add it to the
>> list of sort entries that can have a dynamic header string.
>> 
>> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
>> ---
>> tools/perf/Documentation/perf-report.txt |  1 +
>> tools/perf/arch/powerpc/util/event.c     | 18 ++++++++++++++++--
>> tools/perf/util/event.h                  |  1 +
>> tools/perf/util/hist.c                   | 11 ++++++++---
>> tools/perf/util/hist.h                   |  1 +
>> tools/perf/util/session.c                |  4 +++-
>> tools/perf/util/sort.c                   | 24 ++++++++++++++++++++++--
>> tools/perf/util/sort.h                   |  2 ++
>> 8 files changed, 54 insertions(+), 8 deletions(-)
>> 
>> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
>> index f546b5e9db05..9691d9c227ba 100644
>> --- a/tools/perf/Documentation/perf-report.txt
>> +++ b/tools/perf/Documentation/perf-report.txt
>> @@ -112,6 +112,7 @@ OPTIONS
>> 	- ins_lat: Instruction latency in core cycles. This is the global instruction
>> 	  latency
>> 	- local_ins_lat: Local instruction latency version
>> +	- p_stage_cyc: Number of cycles spent in a pipeline stage.
> 
> please specify in here that it's ppc only

Ok Sure,

> 
> SNIP
> 
>> +struct sort_entry sort_p_stage_cyc = {
>> +	.se_header      = "Pipeline Stage Cycle",
>> +	.se_cmp         = sort__global_p_stage_cyc_cmp,
>> +	.se_snprintf	= hist_entry__p_stage_cyc_snprintf,
>> +	.se_width_idx	= HISTC_P_STAGE_CYC,
>> +};
>> +
>> struct sort_entry sort_mem_daddr_sym = {
>> 	.se_header	= "Data Symbol",
>> 	.se_cmp		= sort__daddr_cmp,
>> @@ -1853,6 +1872,7 @@ static void sort_dimension_add_dynamic_header(struct sort_dimension *sd)
>> 	DIM(SORT_CODE_PAGE_SIZE, "code_page_size", sort_code_page_size),
>> 	DIM(SORT_LOCAL_INS_LAT, "local_ins_lat", sort_local_ins_lat),
>> 	DIM(SORT_GLOBAL_INS_LAT, "ins_lat", sort_global_ins_lat),
>> +	DIM(SORT_P_STAGE_CYC, "p_stage_cyc", sort_p_stage_cyc),
> 
> this might be out of scope for this patch, but would it make sense
> to add arch specific sort dimension? so the specific column is
> not even visible on arch that it's not supported on
> 

Hi Jiri,

Thanks for the suggestions.

Below is an approach I came up with for adding a dynamic sort key based on architecture support.
With this patch, perf report in mem mode will display the new sort key only on supported archs.
Please help review whether this approach looks good. I have created this on top of my current set. If it looks fine,
I can include it in the version 2 patch set.

From 8ebbe6ae802d895103335899e4e60dde5e562f33 Mon Sep 17 00:00:00 2001
From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Date: Mon, 15 Mar 2021 02:33:28 +0000
Subject: [PATCH] tools/perf: Add dynamic sort dimensions for mem mode

Add dynamic sort dimensions for mem mode.

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
---
 tools/perf/arch/powerpc/util/event.c |  7 +++++
 tools/perf/util/event.h              |  1 +
 tools/perf/util/sort.c               | 43 +++++++++++++++++++++++++++-
 3 files changed, 50 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/powerpc/util/event.c b/tools/perf/arch/powerpc/util/event.c
index b80fbee83b6e..fddfc288c415 100644
--- a/tools/perf/arch/powerpc/util/event.c
+++ b/tools/perf/arch/powerpc/util/event.c
@@ -44,3 +44,10 @@ const char *arch_perf_header_entry__add(const char *se_header)
 		return "Dispatch Cyc";
 	return se_header;
 }
+
+int arch_support_dynamic_key(const char *sort_key)
+{
+	if (!strcmp(sort_key, "p_stage_cyc"))
+		return 1;
+	return 0;
+}
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 65f89e80916f..6cd4bf54dbdc 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -429,5 +429,6 @@ char *get_page_size_name(u64 size, char *str);
 void arch_perf_parse_sample_weight(struct perf_sample *data, const __u64 *array, u64 type);
 void arch_perf_synthesize_sample_weight(const struct perf_sample *data, __u64 *array, u64 type);
 const char *arch_perf_header_entry__add(const char *se_header);
+int arch_support_dynamic_key(const char *sort_key);
 
 #endif /* __PERF_RECORD_H */
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index cbb3899e7eca..e194b1187db8 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -37,7 +37,7 @@ const char	default_parent_pattern[] = "^sys_|^do_page_fault";
 const char	*parent_pattern = default_parent_pattern;
 const char	*default_sort_order = "comm,dso,symbol";
 const char	default_branch_sort_order[] = "comm,dso_from,symbol_from,symbol_to,cycles";
-const char	default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,p_stage_cyc";
+const char	default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat";
 const char	default_top_sort_order[] = "dso,symbol";
 const char	default_diff_sort_order[] = "dso,symbol";
 const char	default_tracepoint_sort_order[] = "trace";
@@ -47,6 +47,7 @@ regex_t		ignore_callees_regex;
 int		have_ignore_callees = 0;
 enum sort_mode	sort__mode = SORT_MODE__NORMAL;
 const char	*dynamic_headers[] = {"local_ins_lat", "p_stage_cyc"};
+const char	*dynamic_sort_keys_mem[] = {"p_stage_cyc"};
 
 /*
  * Replaces all occurrences of a char used with the:
@@ -2997,6 +2998,20 @@ static char *prefix_if_not_in(const char *pre, char *str)
 	return n;
 }
 
+/*
+ * Appends ',suff' to 'str' if 'suff' is
+ * not already part of 'str'.
+ */
+static char *suffix_if_not_in(const char *suff, char *str)
+{
+	if (!str || strstr(str, suff))
+		return str;
+
+	if (asprintf(&str, "%s,%s", str, suff) < 0)
+		str = NULL;
+	return str;
+}
+
 static char *setup_overhead(char *keys)
 {
 	if (sort__mode == SORT_MODE__DIFF)
@@ -3010,6 +3025,26 @@ static char *setup_overhead(char *keys)
 	return keys;
 }
 
+int __weak arch_support_dynamic_key(const char *sort_key __maybe_unused)
+{
+	return 0;
+}
+
+static char *setup_dynamic_sort_keys(char *str)
+{
+	unsigned int j;
+
+	if (sort__mode == SORT_MODE__MEMORY)
+		for (j = 0; j < ARRAY_SIZE(dynamic_sort_keys_mem); j++)
+			if (arch_support_dynamic_key(dynamic_sort_keys_mem[j])) {
+				str = suffix_if_not_in(dynamic_sort_keys_mem[j], str);
+				if (str == NULL)
+					return str;
+			}
+
+	return str;
+}
+
 static int __setup_sorting(struct evlist *evlist)
 {
 	char *str;
@@ -3050,6 +3085,12 @@ static int __setup_sorting(struct evlist *evlist)
 		}
 	}
 
+	str = setup_dynamic_sort_keys(str);
+	if (str == NULL) {
+		pr_err("Not enough memory to setup dynamic sort keys");
+		return -ENOMEM;
+	}
+
 	ret = setup_sort_list(&perf_hpp_list, str, evlist);
 
 	free(str);
-- 
2.26.2


Thanks,
Athira

> 
>> };
>> 
>> #undef DIM
>> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
>> index 63f67a3f3630..23b20cbbc846 100644
>> --- a/tools/perf/util/sort.h
>> +++ b/tools/perf/util/sort.h
>> @@ -51,6 +51,7 @@ struct he_stat {
>> 	u64			period_guest_us;
>> 	u64			weight;
>> 	u64			ins_lat;
>> +	u64			p_stage_cyc;
>> 	u32			nr_events;
>> };
>> 
>> @@ -234,6 +235,7 @@ enum sort_type {
>> 	SORT_CODE_PAGE_SIZE,
>> 	SORT_LOCAL_INS_LAT,
>> 	SORT_GLOBAL_INS_LAT,
>> +	SORT_P_STAGE_CYC,
> 
> we could have the whole 'SORT_PIPELINE_STAGE_CYC',
> so it's more obvious

Ok.

> 
> thanks,
> jirka
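
Two details of suffix_if_not_in() in the patch above may be worth a second
look: strstr() also matches a key that is only a substring of another key
(for example "sym" inside "symbol_daddr"), and asprintf() is given 'str' as
both source and destination, which appears to leave the original buffer
unfreed. A variant that compares whole comma-separated tokens and frees the
old string could look like the sketch below. The helper names key_in_list()
and append_key_if_missing() are hypothetical and are not part of perf.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return true if 'key' appears as a complete comma-separated token in 'list'. */
static bool key_in_list(const char *list, const char *key)
{
	size_t klen = strlen(key);
	const char *p = list;

	while (*p) {
		size_t tlen = strcspn(p, ",");	/* length of the current token */

		if (tlen == klen && !strncmp(p, key, klen))
			return true;
		p += tlen;
		if (*p == ',')
			p++;
	}
	return false;
}

/*
 * Append ",key" to the heap-allocated 'str' unless 'key' is already a token.
 * Takes ownership of 'str'; returns the (possibly new) string, or NULL on
 * allocation failure, in which case 'str' has already been freed.
 */
static char *append_key_if_missing(char *str, const char *key)
{
	size_t len;
	char *n;

	if (!str || key_in_list(str, key))
		return str;

	len = strlen(str) + 1 + strlen(key) + 1;	/* str + ',' + key + NUL */
	n = malloc(len);
	if (n)
		snprintf(n, len, "%s,%s", str, key);
	free(str);
	return n;
}
```

For the single key "p_stage_cyc" the substring case does not arise in
practice, so the token comparison only matters if shorter keys are added to
the dynamic list later.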



* Re: [PATCH 4/4] tools/perf: Support pipeline stage cycles for powerpc
  2021-03-15  7:52     ` Athira Rajeev
@ 2021-03-15 23:18       ` Jiri Olsa
       [not found]         ` <CA827A39-FA2A-4B0C-BF8F-9DB428CD58B8@linux.vnet.ibm.com>
  0 siblings, 1 reply; 11+ messages in thread
From: Jiri Olsa @ 2021-03-15 23:18 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: Ravi Bangoria, Madhavan Srinivasan, Peter Zijlstra, linux-kernel,
	acme, linux-perf-users, jolsa, kjain, linuxppc-dev, kan.liang

On Mon, Mar 15, 2021 at 01:22:09PM +0530, Athira Rajeev wrote:

SNIP

> +
> +static char *setup_dynamic_sort_keys(char *str)
> +{
> +	unsigned int j;
> +
> +	if (sort__mode == SORT_MODE__MEMORY)
> +		for (j = 0; j < ARRAY_SIZE(dynamic_sort_keys_mem); j++)
> +			if (arch_support_dynamic_key(dynamic_sort_keys_mem[j])) {
> +				str = suffix_if_not_in(dynamic_sort_keys_mem[j], str);
> +				if (str == NULL)
> +					return str;
> +			}
> +
> +	return str;
> +}
> +
>  static int __setup_sorting(struct evlist *evlist)
>  {
>  	char *str;
> @@ -3050,6 +3085,12 @@ static int __setup_sorting(struct evlist *evlist)
>  		}
>  	}
>  
> +	str = setup_dynamic_sort_keys(str);
> +	if (str == NULL) {
> +		pr_err("Not enough memory to setup dynamic sort keys");
> +		return -ENOMEM;
> +	}

hum, so this is basically overloading the default_mem_sort_order for
architecture, right?

then I think it'd be easier to just overload default_mem_sort_order directly

I was thinking more about adding an extra (arch specific) loop to
sort_dimension__add or somehow adding arch-specific stuff to
memory_sort_dimensions

jirka
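
One possible reading of the "extra (arch specific) loop" idea is sketched
below: the lookup tries the common table first and then a table supplied by
an arch hook. This is only an illustration under stated assumptions. The
types and functions (dim_sketch, arch_sort_dims(), find_dim()) are
hypothetical, the lookup uses an exact match rather than the prefix match
that sort_dimension__add() performs, and __attribute__((weak)) assumes GCC
or Clang.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for perf's struct sort_dimension. */
struct dim_sketch {
	const char *name;	/* sort key accepted on the command line */
	const char *header;	/* column header */
};

static const struct dim_sketch common_dims[] = {
	{ "local_weight",  "Local Weight" },
	{ "local_ins_lat", "Local INSTR Latency" },
};

/*
 * Hypothetical arch hook: the weak default exposes no extra dimensions.
 * An architecture would override it and return its own table.
 */
__attribute__((weak)) const struct dim_sketch *arch_sort_dims(size_t *nr)
{
	*nr = 0;
	return NULL;
}

/* Linear search of one table; safe to call with nr == 0. */
static const struct dim_sketch *find_in(const struct dim_sketch *tbl, size_t nr,
					const char *tok)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!strcmp(tbl[i].name, tok))
			return &tbl[i];
	}
	return NULL;
}

/* Look up 'tok' in the common table first, then in the arch table. */
static const struct dim_sketch *find_dim(const char *tok)
{
	const struct dim_sketch *d;
	const struct dim_sketch *arch;
	size_t nr;

	d = find_in(common_dims, sizeof(common_dims) / sizeof(common_dims[0]), tok);
	if (d)
		return d;

	arch = arch_sort_dims(&nr);
	return find_in(arch, nr, tok);
}
```

With this shape, a key such as "p_stage_cyc" simply does not exist on
architectures that do not register it, so no separate filtering step is
needed.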



* Re: [PATCH 4/4] tools/perf: Support pipeline stage cycles for powerpc
       [not found]         ` <CA827A39-FA2A-4B0C-BF8F-9DB428CD58B8@linux.vnet.ibm.com>
@ 2021-03-17 12:16           ` Jiri Olsa
  0 siblings, 0 replies; 11+ messages in thread
From: Jiri Olsa @ 2021-03-17 12:16 UTC (permalink / raw)
  To: Athira Rajeev
  Cc: Ravi Bangoria, Madhavan Srinivasan, Peter Zijlstra, Kajol Jain,
	linux-kernel, acme, linux-perf-users, jolsa, linuxppc-dev,
	kan.liang

On Wed, Mar 17, 2021 at 05:01:27PM +0530, Athira Rajeev wrote:
> > On 16-Mar-2021, at 4:48 AM, Jiri Olsa <jolsa@redhat.com> wrote:
> > 
> > On Mon, Mar 15, 2021 at 01:22:09PM +0530, Athira Rajeev wrote:
> > 
> > SNIP
> > 
> > > +
> > > +static char *setup_dynamic_sort_keys(char *str)
> > > +{
> > > +	unsigned int j;
> > > +
> > > +	if (sort__mode == SORT_MODE__MEMORY)
> > > +		for (j = 0; j < ARRAY_SIZE(dynamic_sort_keys_mem); j++)
> > > +			if (arch_support_dynamic_key(dynamic_sort_keys_mem[j])) {
> > > +				str = suffix_if_not_in(dynamic_sort_keys_mem[j], str);
> > > +				if (str == NULL)
> > > +					return str;
> > > +			}
> > > +
> > > +	return str;
> > > +}
> > > +
> > >  static int __setup_sorting(struct evlist *evlist)
> > >  {
> > >  	char *str;
> > > @@ -3050,6 +3085,12 @@ static int __setup_sorting(struct evlist *evlist)
> > >  		}
> > >  	}
> > >  
> > > +	str = setup_dynamic_sort_keys(str);
> > > +	if (str == NULL) {
> > > +		pr_err("Not enough memory to setup dynamic sort keys");
> > > +		return -ENOMEM;
> > > +	}
> > 
> > hum, so this is basically overloading the default_mem_sort_order for
> > architecture, right?
> > 
> > then I think it'd be easier just overload default_mem_sort_order directly
> > 
> > I was thinking more about adding extra (arch specific) loop to
> > sort_dimension__add or somehow add arch's specific stuff to
> > memory_sort_dimensions
> 
> Hi Jiri,
> 
> Above patch was to append additional sort keys to sort order based on
> sort mode and architecture support. I had initially thought of defining two
> orders ( say default_mem_sort_order plus mem_sort_order_pstage ). But if
> new sort keys get added for mem mode in future, we will need to keep
> updating both orders. So preferred the approach of "appending" supported sort
> keys to default order.
> 
> Following your thought on using "sort_dimension__add", I tried below approach
> which is easier. The new sort dimension "p_stage_cyc" is presently only supported
> on powerpc. For unsupported platforms, we don't want to display it
> in the perf report output columns. Hence added check in sort_dimension__add()
> and skip the sort key in case it's not applicable for particular arch.
> 
> Please help to check if below approach looks fine.
> 
> 
> diff --git a/tools/perf/arch/powerpc/util/event.c b/tools/perf/arch/powerpc/util/event.c
> index b80fbee83b6e..7205767d75eb 100644
> --- a/tools/perf/arch/powerpc/util/event.c
> +++ b/tools/perf/arch/powerpc/util/event.c
> @@ -44,3 +44,10 @@ const char *arch_perf_header_entry__add(const char *se_header)
>  		return "Dispatch Cyc";
>  	return se_header;
>  }
> +
> +int arch_support_sort_key(const char *sort_key)
> +{
> +	if (!strcmp(sort_key, "p_stage_cyc"))
> +		return 1;
> +	return 0;
> +}
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index 65f89e80916f..612a92aaaefb 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -429,5 +429,6 @@ char *get_page_size_name(u64 size, char *str);
>  void arch_perf_parse_sample_weight(struct perf_sample *data, const __u64 *array, u64 type);
>  void arch_perf_synthesize_sample_weight(const struct perf_sample *data, __u64 *array, u64 type);
>  const char *arch_perf_header_entry__add(const char *se_header);
> +int arch_support_sort_key(const char *sort_key);
>  
>  #endif /* __PERF_RECORD_H */
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index cbb3899e7eca..d8b0b0b43a81 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -47,6 +47,7 @@ regex_t		ignore_callees_regex;
>  int		have_ignore_callees = 0;
>  enum sort_mode	sort__mode = SORT_MODE__NORMAL;
>  const char	*dynamic_headers[] = {"local_ins_lat", "p_stage_cyc"};
> +const char	*arch_specific_sort_keys[] = {"p_stage_cyc"};
>  
>  /*
>   * Replaces all occurrences of a char used with the:
> @@ -1837,6 +1838,11 @@ struct sort_dimension {
>  	int			taken;
>  };
>  
> +int __weak arch_support_sort_key(const char *sort_key __maybe_unused)
> +{
> +	return 0;
> +}
> +
>  const char * __weak arch_perf_header_entry__add(const char *se_header)
>  {
>  	return se_header;
> @@ -2773,6 +2779,18 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok,
>  {
>  	unsigned int i, j;
>  
> +	/* Check to see if there are any arch specific
> +	 * sort dimensions not applicable for the current
> +	 * architecture. If so, Skip that sort key since
> +	 * we don't want to display it in the output fields.
> +	 */
> +	for (j = 0; j < ARRAY_SIZE(arch_specific_sort_keys); j++) {
> +		if (!strcmp(arch_specific_sort_keys[j], tok) &&
> +				!arch_support_sort_key(tok)) {
> +			return 0;
> +		}
> +	}
> +
>  	for (i = 0; i < ARRAY_SIZE(common_sort_dimensions); i++) {
>  		struct sort_dimension *sd = &common_sort_dimensions[i];
>  
> -- 
> 2.26.2
> 
> Thanks
> Athira
> 
> > 
> > jirka
> > 
> 

apart from the html content, looks ok ;-)

jirka
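
The effect of the check added to sort_dimension__add() in the quoted patch
can be sketched outside perf as below: an arch-specific key that the current
architecture does not support is silently skipped, so it never becomes an
output column. This is a simplified illustration. skip_sort_key() and
count_visible_keys() are hypothetical helpers, and __attribute__((weak))
assumes GCC or Clang in place of perf's __weak macro.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Keys that exist only on some architectures (taken from the patch). */
static const char *arch_specific_sort_keys[] = { "p_stage_cyc" };

/* Weak default: no arch-specific key is supported unless an arch says so. */
__attribute__((weak)) int arch_support_sort_key(const char *sort_key)
{
	(void)sort_key;
	return 0;
}

/* True if 'tok' is arch-specific and the current arch does not support it. */
static bool skip_sort_key(const char *tok)
{
	size_t j;

	for (j = 0; j < ARRAY_SIZE(arch_specific_sort_keys); j++) {
		if (!strcmp(arch_specific_sort_keys[j], tok) &&
		    !arch_support_sort_key(tok))
			return true;
	}
	return false;
}

/* Count the keys of a comma-separated sort order that would be displayed. */
static int count_visible_keys(const char *order)
{
	char tok[64];
	int visible = 0;

	while (*order) {
		size_t len = strcspn(order, ",");

		/* Ignore empty tokens and tokens too long for the buffer. */
		if (len > 0 && len < sizeof(tok)) {
			memcpy(tok, order, len);
			tok[len] = '\0';
			if (!skip_sort_key(tok))
				visible++;
		}
		order += len;
		if (*order == ',')
			order++;
	}
	return visible;
}
```

Because the weak default returns 0, a build without an arch override drops
"p_stage_cyc" from any sort order, including one the user typed explicitly;
whether an explicit request should be dropped silently is a design choice
the thread does not settle.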



Thread overview: 11+ messages
2021-03-09 14:03 [PATCH 0/4] powerpc/perf: Export processor pipeline stage cycles information Athira Rajeev
2021-03-09 14:03 ` [PATCH 1/4] powerpc/perf: Expose processor pipeline stage cycles using PERF_SAMPLE_WEIGHT_STRUCT Athira Rajeev
2021-03-09 14:03 ` [PATCH 2/4] tools/perf: Add dynamic headers for perf report columns Athira Rajeev
2021-03-12 12:57   ` Jiri Olsa
2021-03-15  7:41     ` Athira Rajeev
2021-03-09 14:03 ` [PATCH 3/4] tools/perf: Add powerpc support for PERF_SAMPLE_WEIGHT_STRUCT Athira Rajeev
2021-03-09 14:04 ` [PATCH 4/4] tools/perf: Support pipeline stage cycles for powerpc Athira Rajeev
2021-03-12 12:56   ` Jiri Olsa
2021-03-15  7:52     ` Athira Rajeev
2021-03-15 23:18       ` Jiri Olsa
     [not found]         ` <CA827A39-FA2A-4B0C-BF8F-9DB428CD58B8@linux.vnet.ibm.com>
2021-03-17 12:16           ` Jiri Olsa
