All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/4] perf: Support a new 'percore' event qualifier
@ 2019-04-12 13:59 Jin Yao
  2019-04-12 13:59 ` [PATCH v4 1/4] perf: Add a " Jin Yao
                   ` (5 more replies)
  0 siblings, 6 replies; 13+ messages in thread
From: Jin Yao @ 2019-04-12 13:59 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

The 'percore' event qualifier which sums up the event counts for both
hardware threads in a core. For example,

perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/

In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line.

We can already support per-core counting with --per-core, but it's
often useful to do this together with other metrics that are collected
per CPU (per hardware thread). So this patch series supports this
per-core counting on a event level.

 v4:
 ---
 Add percore qualifier to documantation.
 Rebase to latest perf/core branch

 v3:
 ---
 Simplify the patch: "perf: Add a 'percore' event qualifier"
 Other patches don't have changes.

 v2:
 ---
 1. Change 'coresum' to 'percore'.
 2. Move the aggregate counts printing to a seperate patch.

Jin Yao (4):
  perf: Add a 'percore' event qualifier
  perf stat: Factor out aggregate counts printing
  perf stat: Support 'percore' event qualifier
  perf test: Add a simple test for term 'percore'

 tools/perf/Documentation/perf-list.txt |  12 ++++
 tools/perf/Documentation/perf-stat.txt |   4 ++
 tools/perf/builtin-stat.c              |  21 +++++++
 tools/perf/tests/parse-events.c        |  10 ++-
 tools/perf/util/evsel.c                |   2 +
 tools/perf/util/evsel.h                |   3 +
 tools/perf/util/parse-events.c         |  27 +++++++++
 tools/perf/util/parse-events.h         |   1 +
 tools/perf/util/parse-events.l         |   1 +
 tools/perf/util/stat-display.c         | 107 ++++++++++++++++++++++++---------
 tools/perf/util/stat.c                 |   8 ++-
 11 files changed, 163 insertions(+), 33 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v4 1/4] perf: Add a 'percore' event qualifier
  2019-04-12 13:59 [PATCH v4 0/4] perf: Support a new 'percore' event qualifier Jin Yao
@ 2019-04-12 13:59 ` Jin Yao
  2019-05-18  9:35   ` [tip:perf/core] perf tools: " tip-bot for Jin Yao
  2019-04-12 13:59 ` [PATCH v4 2/4] perf stat: Factor out aggregate counts printing Jin Yao
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Jin Yao @ 2019-04-12 13:59 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Add a 'percore' event qualifier, like cpu/event=0,umask=0x3,percore=1/,
that sums up the event counts for both hardware threads in a core.

We can already do this with --per-core, but it's often useful to do
this together with other metrics that are collected per hardware thread.
So we need to support this per-core counting on a event level.

This can be implemented in only the user tool, no kernel support needed.

 v4:
 ---
 1. Add Arnaldo's patch which updates the documentation for
    this new qualifier.
 2. Rebase to latest perf/core branch

 v3:
 ---
 Simplify the code according to Jiri's comments.
 Before:
   "return term->val.percore ? true : false;"
 Now:
   "return term->val.percore;"

 v2:
 ---
 Change the qualifier name from 'coresum' to 'percore' according to
 comments from Jiri and Andi.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/Documentation/perf-list.txt | 12 ++++++++++++
 tools/perf/util/evsel.c                |  2 ++
 tools/perf/util/evsel.h                |  3 +++
 tools/perf/util/parse-events.c         | 27 +++++++++++++++++++++++++++
 tools/perf/util/parse-events.h         |  1 +
 tools/perf/util/parse-events.l         |  1 +
 6 files changed, 46 insertions(+)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 138fb6e..18ed1b0 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -199,6 +199,18 @@ also be supplied. For example:
 
   perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...
 
+EVENT QUALIFIERS:
+
+It is also possible to add extra qualifiers to an event:
+
+percore:
+
+Sums up the event counts for all hardware threads in a core, e.g.:
+
+
+  perf stat -e cpu/event=0,umask=0x3,percore=1/
+
+
 EVENT GROUPS
 ------------
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 84cfb9f..99b6143 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -813,6 +813,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			break;
 		case PERF_EVSEL__CONFIG_TERM_DRV_CFG:
 			break;
+		case PERF_EVSEL__CONFIG_TERM_PERCORE:
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 6d190cb..cad54e8 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -50,6 +50,7 @@ enum term_type {
 	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_DRV_CFG,
 	PERF_EVSEL__CONFIG_TERM_BRANCH,
+	PERF_EVSEL__CONFIG_TERM_PERCORE,
 };
 
 struct perf_evsel_config_term {
@@ -67,6 +68,7 @@ struct perf_evsel_config_term {
 		bool	overwrite;
 		char	*branch;
 		unsigned long max_events;
+		bool	percore;
 	} val;
 	bool weak;
 };
@@ -158,6 +160,7 @@ struct perf_evsel {
 	struct perf_evsel	**metric_events;
 	bool			collect_stat;
 	bool			weak_group;
+	bool			percore;
 	const char		*pmu_name;
 	struct {
 		perf_evsel__sb_cb_t	*cb;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4432bfe..cf0b9b8 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -950,6 +950,7 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_OVERWRITE]		= "overwrite",
 	[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]		= "no-overwrite",
 	[PARSE_EVENTS__TERM_TYPE_DRV_CFG]		= "driver-config",
+	[PARSE_EVENTS__TERM_TYPE_PERCORE]		= "percore",
 };
 
 static bool config_term_shrinked;
@@ -970,6 +971,7 @@ config_term_avail(int term_type, struct parse_events_error *err)
 	case PARSE_EVENTS__TERM_TYPE_CONFIG2:
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 	case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
+	case PARSE_EVENTS__TERM_TYPE_PERCORE:
 		return true;
 	default:
 		if (!err)
@@ -1061,6 +1063,14 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_MAX_EVENTS:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_PERCORE:
+		CHECK_TYPE_VAL(NUM);
+		if ((unsigned int)term->val.num > 1) {
+			err->str = strdup("expected 0 or 1");
+			err->idx = term->err_val;
+			return -EINVAL;
+		}
+		break;
 	default:
 		err->str = strdup("unknown term");
 		err->idx = term->err_term;
@@ -1199,6 +1209,10 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_DRV_CFG:
 			ADD_CONFIG_TERM(DRV_CFG, drv_cfg, term->val.str);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_PERCORE:
+			ADD_CONFIG_TERM(PERCORE, percore,
+					term->val.num ? true : false);
+			break;
 		default:
 			break;
 		}
@@ -1260,6 +1274,18 @@ int parse_events_add_tool(struct parse_events_state *parse_state,
 	return add_event_tool(list, &parse_state->idx, tool_event);
 }
 
+static bool config_term_percore(struct list_head *config_terms)
+{
+	struct perf_evsel_config_term *term;
+
+	list_for_each_entry(term, config_terms, list) {
+		if (term->type == PERF_EVSEL__CONFIG_TERM_PERCORE)
+			return term->val.percore;
+	}
+
+	return false;
+}
+
 int parse_events_add_pmu(struct parse_events_state *parse_state,
 			 struct list_head *list, char *name,
 			 struct list_head *head_config,
@@ -1333,6 +1359,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state,
 		evsel->metric_name = info.metric_name;
 		evsel->pmu_name = name;
 		evsel->use_uncore_alias = use_uncore_alias;
+		evsel->percore = config_term_percore(&evsel->config_terms);
 	}
 
 	return evsel ? 0 : -ENOMEM;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index a052cd6..f7139e1 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -75,6 +75,7 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
 	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 	PARSE_EVENTS__TERM_TYPE_DRV_CFG,
+	PARSE_EVENTS__TERM_TYPE_PERCORE,
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index c54bfe8..ca60988 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -283,6 +283,7 @@ inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
 overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
 no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
+percore			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 2/4] perf stat: Factor out aggregate counts printing
  2019-04-12 13:59 [PATCH v4 0/4] perf: Support a new 'percore' event qualifier Jin Yao
  2019-04-12 13:59 ` [PATCH v4 1/4] perf: Add a " Jin Yao
@ 2019-04-12 13:59 ` Jin Yao
  2019-05-18  9:36   ` [tip:perf/core] " tip-bot for Jin Yao
  2019-04-12 13:59 ` [PATCH v4 3/4] perf stat: Support 'percore' event qualifier Jin Yao
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 13+ messages in thread
From: Jin Yao @ 2019-04-12 13:59 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Move the aggregate counts printing to a new function
print_counter_aggrdata, which will be used in following
patches.

 v4:
 ---
 Rebase to latest perf/core branch.

 v3:
 ---
 No change

 v2:
 ---
 Create this patch according to Jiri's comments.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/stat-display.c | 64 +++++++++++++++++++++++++-----------------
 1 file changed, 39 insertions(+), 25 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 3324f23..f5b4ee7 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -594,6 +594,41 @@ static void aggr_cb(struct perf_stat_config *config,
 	}
 }
 
+static void print_counter_aggrdata(struct perf_stat_config *config,
+				   struct perf_evsel *counter, int s,
+				   char *prefix, bool metric_only,
+				   bool *first)
+{
+	struct aggr_data ad;
+	FILE *output = config->output;
+	u64 ena, run, val;
+	int id, nr;
+	double uval;
+
+	ad.id = id = config->aggr_map->map[s];
+	ad.val = ad.ena = ad.run = 0;
+	ad.nr = 0;
+	if (!collect_data(config, counter, aggr_cb, &ad))
+		return;
+
+	nr = ad.nr;
+	ena = ad.ena;
+	run = ad.run;
+	val = ad.val;
+	if (*first && metric_only) {
+		*first = false;
+		aggr_printout(config, counter, id, nr);
+	}
+	if (prefix && !metric_only)
+		fprintf(output, "%s", prefix);
+
+	uval = val * counter->scale;
+	printout(config, id, nr, counter, uval, prefix,
+		 run, ena, 1.0, &rt_stat);
+	if (!metric_only)
+		fputc('\n', output);
+}
+
 static void print_aggr(struct perf_stat_config *config,
 		       struct perf_evlist *evlist,
 		       char *prefix)
@@ -601,9 +636,7 @@ static void print_aggr(struct perf_stat_config *config,
 	bool metric_only = config->metric_only;
 	FILE *output = config->output;
 	struct perf_evsel *counter;
-	int s, id, nr;
-	double uval;
-	u64 ena, run, val;
+	int s;
 	bool first;
 
 	if (!(config->aggr_map || config->aggr_get_id))
@@ -616,33 +649,14 @@ static void print_aggr(struct perf_stat_config *config,
 	 * Without each counter has its own line.
 	 */
 	for (s = 0; s < config->aggr_map->nr; s++) {
-		struct aggr_data ad;
 		if (prefix && metric_only)
 			fprintf(output, "%s", prefix);
 
-		ad.id = id = config->aggr_map->map[s];
 		first = true;
 		evlist__for_each_entry(evlist, counter) {
-			ad.val = ad.ena = ad.run = 0;
-			ad.nr = 0;
-			if (!collect_data(config, counter, aggr_cb, &ad))
-				continue;
-			nr = ad.nr;
-			ena = ad.ena;
-			run = ad.run;
-			val = ad.val;
-			if (first && metric_only) {
-				first = false;
-				aggr_printout(config, counter, id, nr);
-			}
-			if (prefix && !metric_only)
-				fprintf(output, "%s", prefix);
-
-			uval = val * counter->scale;
-			printout(config, id, nr, counter, uval, prefix,
-				 run, ena, 1.0, &rt_stat);
-			if (!metric_only)
-				fputc('\n', output);
+			print_counter_aggrdata(config, counter, s,
+					       prefix, metric_only,
+					       &first);
 		}
 		if (metric_only)
 			fputc('\n', output);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 3/4] perf stat: Support 'percore' event qualifier
  2019-04-12 13:59 [PATCH v4 0/4] perf: Support a new 'percore' event qualifier Jin Yao
  2019-04-12 13:59 ` [PATCH v4 1/4] perf: Add a " Jin Yao
  2019-04-12 13:59 ` [PATCH v4 2/4] perf stat: Factor out aggregate counts printing Jin Yao
@ 2019-04-12 13:59 ` Jin Yao
  2019-04-15 14:39   ` Ravi Bangoria
  2019-05-18  9:36   ` [tip:perf/core] " tip-bot for Jin Yao
  2019-04-12 13:59 ` [PATCH v4 4/4] perf test: Add a simple test for term 'percore' Jin Yao
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 13+ messages in thread
From: Jin Yao @ 2019-04-12 13:59 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

With this patch, we can use the 'percore' event qualifier in perf-stat.

root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/ -a -A -I1000
     1.000773050 S0-C0             98,352,832      cpu/event=0,umask=0x3,percore=1/                                     (50.01%)
     1.000773050 S0-C1            103,763,057      cpu/event=0,umask=0x3,percore=1/                                     (50.02%)
     1.000773050 S0-C2            196,776,995      cpu/event=0,umask=0x3,percore=1/                                     (50.02%)
     1.000773050 S0-C3            176,493,779      cpu/event=0,umask=0x3,percore=1/                                     (50.02%)
     1.000773050 CPU0              47,699,641      cpu/event=0,umask=0x3/                                        (50.02%)
     1.000773050 CPU1              49,052,451      cpu/event=0,umask=0x3/                                        (49.98%)
     1.000773050 CPU2             102,771,422      cpu/event=0,umask=0x3/                                        (49.98%)
     1.000773050 CPU3             100,784,662      cpu/event=0,umask=0x3/                                        (49.98%)
     1.000773050 CPU4              43,171,342      cpu/event=0,umask=0x3/                                        (49.98%)
     1.000773050 CPU5              54,152,158      cpu/event=0,umask=0x3/                                        (49.98%)
     1.000773050 CPU6              93,618,410      cpu/event=0,umask=0x3/                                        (49.98%)
     1.000773050 CPU7              74,477,589      cpu/event=0,umask=0x3/                                        (49.99%)

In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line. From the output, we can see:

  S0-C0 = CPU0 + CPU4
  S0-C1 = CPU1 + CPU5
  S0-C2 = CPU2 + CPU6
  S0-C3 = CPU3 + CPU7

So the result is expected (tiny difference is ignored).

Note that, the 'percore' event qualifier needs to use with option '-A'.

 v4:
 ---
 Rebase to latest perf/core branch.

 v3:
 ---
 No change

 v2:
 ---
 Change 'coresum' to 'percore'.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/Documentation/perf-stat.txt |  4 ++++
 tools/perf/builtin-stat.c              | 21 +++++++++++++++++
 tools/perf/util/stat-display.c         | 43 ++++++++++++++++++++++++++++++----
 tools/perf/util/stat.c                 |  8 ++++---
 4 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 39c05f8..1e312c2 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -43,6 +43,10 @@ report::
 	  param1 and param2 are defined as formats for the PMU in
 	  /sys/bus/event_source/devices/<pmu>/format/*
 
+	  'percore' is a event qualifier that sums up the event counts for both
+	  hardware threads in a core. For example:
+	  perf stat -A -a -e cpu/event,percore=1/,otherevent ...
+
 	- a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
 	  where M, N, K are numbers (in decimal, hex, octal format).
 	  Acceptable values for each of 'config', 'config1' and 'config2'
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 7f9c4b7..4a79fa9 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -847,6 +847,18 @@ static int perf_stat__get_core_cached(struct perf_stat_config *config,
 	return perf_stat__get_aggr(config, perf_stat__get_core, map, idx);
 }
 
+static bool term_percore_set(void)
+{
+	struct perf_evsel *counter;
+
+	evlist__for_each_entry(evsel_list, counter) {
+		if (counter->percore)
+			return true;
+	}
+
+	return false;
+}
+
 static int perf_stat_init_aggr_mode(void)
 {
 	int nr;
@@ -867,6 +879,15 @@ static int perf_stat_init_aggr_mode(void)
 		stat_config.aggr_get_id = perf_stat__get_core_cached;
 		break;
 	case AGGR_NONE:
+		if (term_percore_set()) {
+			if (cpu_map__build_core_map(evsel_list->cpus,
+						    &stat_config.aggr_map)) {
+				perror("cannot build core map");
+				return -1;
+			}
+			stat_config.aggr_get_id = perf_stat__get_core_cached;
+		}
+		break;
 	case AGGR_GLOBAL:
 	case AGGR_THREAD:
 	case AGGR_UNSET:
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index f5b4ee7..4c53bae 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -88,9 +88,17 @@ static void aggr_printout(struct perf_stat_config *config,
 			config->csv_sep);
 			break;
 	case AGGR_NONE:
-		fprintf(config->output, "CPU%*d%s",
-			config->csv_output ? 0 : -4,
-			perf_evsel__cpus(evsel)->map[id], config->csv_sep);
+		if (evsel->percore) {
+			fprintf(config->output, "S%d-C%*d%s",
+				cpu_map__id_to_socket(id),
+				config->csv_output ? 0 : -5,
+				cpu_map__id_to_cpu(id), config->csv_sep);
+		} else {
+			fprintf(config->output, "CPU%*d%s ",
+				config->csv_output ? 0 : -5,
+				perf_evsel__cpus(evsel)->map[id],
+				config->csv_sep);
+		}
 		break;
 	case AGGR_THREAD:
 		fprintf(config->output, "%*s-%*d%s",
@@ -1103,6 +1111,30 @@ static void print_footer(struct perf_stat_config *config)
 			"the same PMU. Try reorganizing the group.\n");
 }
 
+static void print_percore(struct perf_stat_config *config,
+			  struct perf_evsel *counter, char *prefix)
+{
+	bool metric_only = config->metric_only;
+	FILE *output = config->output;
+	int s;
+	bool first = true;
+
+	if (!(config->aggr_map || config->aggr_get_id))
+		return;
+
+	for (s = 0; s < config->aggr_map->nr; s++) {
+		if (prefix && metric_only)
+			fprintf(output, "%s", prefix);
+
+		print_counter_aggrdata(config, counter, s,
+				       prefix, metric_only,
+				       &first);
+	}
+
+	if (metric_only)
+		fputc('\n', output);
+}
+
 void
 perf_evlist__print_counters(struct perf_evlist *evlist,
 			    struct perf_stat_config *config,
@@ -1153,7 +1185,10 @@ perf_evlist__print_counters(struct perf_evlist *evlist,
 			print_no_aggr_metric(config, evlist, prefix);
 		else {
 			evlist__for_each_entry(evlist, counter) {
-				print_counter(config, counter, prefix);
+				if (counter->percore)
+					print_percore(config, counter, prefix);
+				else
+					print_counter(config, counter, prefix);
 			}
 		}
 		break;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 2856cc9..c3115d9 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -277,9 +277,11 @@ process_counter_values(struct perf_stat_config *config, struct perf_evsel *evsel
 		if (!evsel->snapshot)
 			perf_evsel__compute_deltas(evsel, cpu, thread, count);
 		perf_counts_values__scale(count, config->scale, NULL);
-		if (config->aggr_mode == AGGR_NONE)
-			perf_stat__update_shadow_stats(evsel, count->val, cpu,
-						       &rt_stat);
+		if ((config->aggr_mode == AGGR_NONE) && (!evsel->percore)) {
+			perf_stat__update_shadow_stats(evsel, count->val,
+						       cpu, &rt_stat);
+		}
+
 		if (config->aggr_mode == AGGR_THREAD) {
 			if (config->stats)
 				perf_stat__update_shadow_stats(evsel,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v4 4/4] perf test: Add a simple test for term 'percore'
  2019-04-12 13:59 [PATCH v4 0/4] perf: Support a new 'percore' event qualifier Jin Yao
                   ` (2 preceding siblings ...)
  2019-04-12 13:59 ` [PATCH v4 3/4] perf stat: Support 'percore' event qualifier Jin Yao
@ 2019-04-12 13:59 ` Jin Yao
  2019-04-15  8:51 ` [PATCH v4 0/4] perf: Support a new 'percore' event qualifier Jiri Olsa
  2019-05-15 19:40 ` Arnaldo Carvalho de Melo
  5 siblings, 0 replies; 13+ messages in thread
From: Jin Yao @ 2019-04-12 13:59 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

It's a simple test which just checks if parser works.

 v4:
 ---
 No change

 v3:
 ---
 No change

 v2:
 ---
 Change 'coresum' to 'percore'

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/tests/parse-events.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 4a69c07..e59a7bb 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -605,6 +605,14 @@ static int test__checkterms_simple(struct list_head *terms)
 	TEST_ASSERT_VAL("wrong val", term->val.num == 1);
 	TEST_ASSERT_VAL("wrong config", !strcmp(term->config, "umask"));
 
+	/* percore=1*/
+	term = list_entry(term->list.next, struct parse_events_term, list);
+	TEST_ASSERT_VAL("wrong type term",
+			term->type_term == PARSE_EVENTS__TERM_TYPE_PERCORE);
+	TEST_ASSERT_VAL("wrong type val",
+			term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+	TEST_ASSERT_VAL("wrong val", term->val.num == 1);
+
 	return 0;
 }
 
@@ -1734,7 +1742,7 @@ struct terms_test {
 
 static struct terms_test test__terms[] = {
 	[0] = {
-		.str   = "config=10,config1,config2=3,umask=1",
+		.str   = "config=10,config1,config2=3,umask=1,percore=1",
 		.check = test__checkterms_simple,
 	},
 };
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 0/4] perf: Support a new 'percore' event qualifier
  2019-04-12 13:59 [PATCH v4 0/4] perf: Support a new 'percore' event qualifier Jin Yao
                   ` (3 preceding siblings ...)
  2019-04-12 13:59 ` [PATCH v4 4/4] perf test: Add a simple test for term 'percore' Jin Yao
@ 2019-04-15  8:51 ` Jiri Olsa
  2019-05-15 19:40 ` Arnaldo Carvalho de Melo
  5 siblings, 0 replies; 13+ messages in thread
From: Jiri Olsa @ 2019-04-15  8:51 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Fri, Apr 12, 2019 at 09:59:46PM +0800, Jin Yao wrote:
> The 'percore' event qualifier which sums up the event counts for both
> hardware threads in a core. For example,
> 
> perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/
> 
> In this example, we count the event 'ref-cycles' per-core and per-CPU in
> one perf stat command-line.
> 
> We can already support per-core counting with --per-core, but it's
> often useful to do this together with other metrics that are collected
> per CPU (per hardware thread). So this patch series supports this
> per-core counting on a event level.
> 
>  v4:
>  ---
>  Add percore qualifier to documantation.
>  Rebase to latest perf/core branch

Acked-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

> 
>  v3:
>  ---
>  Simplify the patch: "perf: Add a 'percore' event qualifier"
>  Other patches don't have changes.
> 
>  v2:
>  ---
>  1. Change 'coresum' to 'percore'.
>  2. Move the aggregate counts printing to a seperate patch.
> 
> Jin Yao (4):
>   perf: Add a 'percore' event qualifier
>   perf stat: Factor out aggregate counts printing
>   perf stat: Support 'percore' event qualifier
>   perf test: Add a simple test for term 'percore'
> 
>  tools/perf/Documentation/perf-list.txt |  12 ++++
>  tools/perf/Documentation/perf-stat.txt |   4 ++
>  tools/perf/builtin-stat.c              |  21 +++++++
>  tools/perf/tests/parse-events.c        |  10 ++-
>  tools/perf/util/evsel.c                |   2 +
>  tools/perf/util/evsel.h                |   3 +
>  tools/perf/util/parse-events.c         |  27 +++++++++
>  tools/perf/util/parse-events.h         |   1 +
>  tools/perf/util/parse-events.l         |   1 +
>  tools/perf/util/stat-display.c         | 107 ++++++++++++++++++++++++---------
>  tools/perf/util/stat.c                 |   8 ++-
>  11 files changed, 163 insertions(+), 33 deletions(-)
> 
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 3/4] perf stat: Support 'percore' event qualifier
  2019-04-12 13:59 ` [PATCH v4 3/4] perf stat: Support 'percore' event qualifier Jin Yao
@ 2019-04-15 14:39   ` Ravi Bangoria
  2019-04-15 19:26     ` Arnaldo Carvalho de Melo
  2019-05-18  9:36   ` [tip:perf/core] " tip-bot for Jin Yao
  1 sibling, 1 reply; 13+ messages in thread
From: Ravi Bangoria @ 2019-04-15 14:39 UTC (permalink / raw)
  To: Jin Yao, acme
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin, Ravi Bangoria



On 4/12/19 7:29 PM, Jin Yao wrote:
> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> index 39c05f8..1e312c2 100644
> --- a/tools/perf/Documentation/perf-stat.txt
> +++ b/tools/perf/Documentation/perf-stat.txt
> @@ -43,6 +43,10 @@ report::
>  	  param1 and param2 are defined as formats for the PMU in
>  	  /sys/bus/event_source/devices/<pmu>/format/*
>  
> +	  'percore' is a event qualifier that sums up the event counts for both
> +	  hardware threads in a core.

s/both/all/  :

  $ lscpu | grep Thread
  Thread(s) per core:    4


Apart from that, for the series:
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 3/4] perf stat: Support 'percore' event qualifier
  2019-04-15 14:39   ` Ravi Bangoria
@ 2019-04-15 19:26     ` Arnaldo Carvalho de Melo
  2019-04-16  0:31       ` Jin, Yao
  0 siblings, 1 reply; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-04-15 19:26 UTC (permalink / raw)
  To: Ravi Bangoria
  Cc: Jin Yao, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel,
	ak, kan.liang, yao.jin

Em Mon, Apr 15, 2019 at 08:09:09PM +0530, Ravi Bangoria escreveu:
> 
> 
> On 4/12/19 7:29 PM, Jin Yao wrote:
> > diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
> > index 39c05f8..1e312c2 100644
> > --- a/tools/perf/Documentation/perf-stat.txt
> > +++ b/tools/perf/Documentation/perf-stat.txt
> > @@ -43,6 +43,10 @@ report::
> >  	  param1 and param2 are defined as formats for the PMU in
> >  	  /sys/bus/event_source/devices/<pmu>/format/*
> >  
> > +	  'percore' is a event qualifier that sums up the event counts for both
> > +	  hardware threads in a core.
> 
> s/both/all/  :
> 
>   $ lscpu | grep Thread
>   Thread(s) per core:    4
> 
> 
> Apart from that, for the series:
> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>

Yeah I had fixed up that already, fixed up the patch description this
time around, "both" was being used there as well, changed to "all".

Thanks, added your Tested-by tag, appreciated.

- Arnaldo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 3/4] perf stat: Support 'percore' event qualifier
  2019-04-15 19:26     ` Arnaldo Carvalho de Melo
@ 2019-04-16  0:31       ` Jin, Yao
  0 siblings, 0 replies; 13+ messages in thread
From: Jin, Yao @ 2019-04-16  0:31 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ravi Bangoria
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 4/16/2019 3:26 AM, Arnaldo Carvalho de Melo wrote:
> Em Mon, Apr 15, 2019 at 08:09:09PM +0530, Ravi Bangoria escreveu:
>>
>>
>> On 4/12/19 7:29 PM, Jin Yao wrote:
>>> diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
>>> index 39c05f8..1e312c2 100644
>>> --- a/tools/perf/Documentation/perf-stat.txt
>>> +++ b/tools/perf/Documentation/perf-stat.txt
>>> @@ -43,6 +43,10 @@ report::
>>>   	  param1 and param2 are defined as formats for the PMU in
>>>   	  /sys/bus/event_source/devices/<pmu>/format/*
>>>   
>>> +	  'percore' is a event qualifier that sums up the event counts for both
>>> +	  hardware threads in a core.
>>
>> s/both/all/  :
>>
>>    $ lscpu | grep Thread
>>    Thread(s) per core:    4
>>
>>
>> Apart from that, for the series:
>> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
> 
> Yeah I had fixed up that already, fixed up the patch description this
> time around, "both" was being used there as well, changed to "all".
> 
> Thanks, added your Tested-by tag, appreciated.
> 
> - Arnaldo
> 

Thanks Ravi, Thanks Arnaldo!

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v4 0/4] perf: Support a new 'percore' event qualifier
  2019-04-12 13:59 [PATCH v4 0/4] perf: Support a new 'percore' event qualifier Jin Yao
                   ` (4 preceding siblings ...)
  2019-04-15  8:51 ` [PATCH v4 0/4] perf: Support a new 'percore' event qualifier Jiri Olsa
@ 2019-05-15 19:40 ` Arnaldo Carvalho de Melo
  5 siblings, 0 replies; 13+ messages in thread
From: Arnaldo Carvalho de Melo @ 2019-05-15 19:40 UTC (permalink / raw)
  To: Jin Yao
  Cc: jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

Em Fri, Apr 12, 2019 at 09:59:46PM +0800, Jin Yao escreveu:
> The 'percore' event qualifier which sums up the event counts for both
> hardware threads in a core. For example,
> 
> perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/
> 
> In this example, we count the event 'ref-cycles' per-core and per-CPU in
> one perf stat command-line.
> 
> We can already support per-core counting with --per-core, but it's
> often useful to do this together with other metrics that are collected
> per CPU (per hardware thread). So this patch series supports this
> per-core counting on a event level.
> 
>  v4:
>  ---
>  Add percore qualifier to documantation.
>  Rebase to latest perf/core branch

Thanks, applied.

- Arnaldo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [tip:perf/core] perf tools: Add a 'percore' event qualifier
  2019-04-12 13:59 ` [PATCH v4 1/4] perf: Add a " Jin Yao
@ 2019-05-18  9:35   ` tip-bot for Jin Yao
  0 siblings, 0 replies; 13+ messages in thread
From: tip-bot for Jin Yao @ 2019-05-18  9:35 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: yao.jin, ravi.bangoria, hpa, mingo, acme, jolsa, yao.jin, ak,
	alexander.shishkin, linux-kernel, peterz, kan.liang, tglx

Commit-ID:  064b4e82aa1633c27c383cc686b87ced57e072d1
Gitweb:     https://git.kernel.org/tip/064b4e82aa1633c27c383cc686b87ced57e072d1
Author:     Jin Yao <yao.jin@linux.intel.com>
AuthorDate: Fri, 12 Apr 2019 21:59:47 +0800
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 16 May 2019 14:17:24 -0300

perf tools: Add a 'percore' event qualifier

Add a 'percore' event qualifier, like cpu/event=0,umask=0x3,percore=1/,
that sums up the event counts for both hardware threads in a core.

We can already do this with --per-core, but it's often useful to do
this together with other metrics that are collected per hardware thread.
So we need to support this per-core counting on a event level.

This can be implemented in only the user tool, no kernel support needed.

 v4:
 ---
 1. Add Arnaldo's patch which updates the documentation for
    this new qualifier.
 2. Rebase to latest perf/core branch

 v3:
 ---
 Simplify the code according to Jiri's comments.
 Before:
   "return term->val.percore ? true : false;"
 Now:
   "return term->val.percore;"

 v2:
 ---
 Change the qualifier name from 'coresum' to 'percore' according to
 comments from Jiri and Andi.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1555077590-27664-2-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-list.txt | 12 ++++++++++++
 tools/perf/util/evsel.c                |  2 ++
 tools/perf/util/evsel.h                |  3 +++
 tools/perf/util/parse-events.c         | 27 +++++++++++++++++++++++++++
 tools/perf/util/parse-events.h         |  1 +
 tools/perf/util/parse-events.l         |  1 +
 6 files changed, 46 insertions(+)

diff --git a/tools/perf/Documentation/perf-list.txt b/tools/perf/Documentation/perf-list.txt
index 138fb6e94b3c..18ed1b0fceb3 100644
--- a/tools/perf/Documentation/perf-list.txt
+++ b/tools/perf/Documentation/perf-list.txt
@@ -199,6 +199,18 @@ also be supplied. For example:
 
   perf stat -C 0 -e 'hv_gpci/dtbp_ptitc,phys_processor_idx=0x2/' ...
 
+EVENT QUALIFIERS:
+
+It is also possible to add extra qualifiers to an event:
+
+percore:
+
+Sums up the event counts for all hardware threads in a core, e.g.:
+
+
+  perf stat -e cpu/event=0,umask=0x3,percore=1/
+
+
 EVENT GROUPS
 ------------
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index a10cf4cde920..a6f572a40deb 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -813,6 +813,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			break;
 		case PERF_EVSEL__CONFIG_TERM_DRV_CFG:
 			break;
+		case PERF_EVSEL__CONFIG_TERM_PERCORE:
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 6d190cbf1070..cad54e8ba522 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -50,6 +50,7 @@ enum term_type {
 	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_DRV_CFG,
 	PERF_EVSEL__CONFIG_TERM_BRANCH,
+	PERF_EVSEL__CONFIG_TERM_PERCORE,
 };
 
 struct perf_evsel_config_term {
@@ -67,6 +68,7 @@ struct perf_evsel_config_term {
 		bool	overwrite;
 		char	*branch;
 		unsigned long max_events;
+		bool	percore;
 	} val;
 	bool weak;
 };
@@ -158,6 +160,7 @@ struct perf_evsel {
 	struct perf_evsel	**metric_events;
 	bool			collect_stat;
 	bool			weak_group;
+	bool			percore;
 	const char		*pmu_name;
 	struct {
 		perf_evsel__sb_cb_t	*cb;
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4432bfe039fd..cf0b9b81c5aa 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -950,6 +950,7 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_OVERWRITE]		= "overwrite",
 	[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]		= "no-overwrite",
 	[PARSE_EVENTS__TERM_TYPE_DRV_CFG]		= "driver-config",
+	[PARSE_EVENTS__TERM_TYPE_PERCORE]		= "percore",
 };
 
 static bool config_term_shrinked;
@@ -970,6 +971,7 @@ config_term_avail(int term_type, struct parse_events_error *err)
 	case PARSE_EVENTS__TERM_TYPE_CONFIG2:
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 	case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
+	case PARSE_EVENTS__TERM_TYPE_PERCORE:
 		return true;
 	default:
 		if (!err)
@@ -1061,6 +1063,14 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_MAX_EVENTS:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_PERCORE:
+		CHECK_TYPE_VAL(NUM);
+		if ((unsigned int)term->val.num > 1) {
+			err->str = strdup("expected 0 or 1");
+			err->idx = term->err_val;
+			return -EINVAL;
+		}
+		break;
 	default:
 		err->str = strdup("unknown term");
 		err->idx = term->err_term;
@@ -1199,6 +1209,10 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_DRV_CFG:
 			ADD_CONFIG_TERM(DRV_CFG, drv_cfg, term->val.str);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_PERCORE:
+			ADD_CONFIG_TERM(PERCORE, percore,
+					term->val.num ? true : false);
+			break;
 		default:
 			break;
 		}
@@ -1260,6 +1274,18 @@ int parse_events_add_tool(struct parse_events_state *parse_state,
 	return add_event_tool(list, &parse_state->idx, tool_event);
 }
 
+static bool config_term_percore(struct list_head *config_terms)
+{
+	struct perf_evsel_config_term *term;
+
+	list_for_each_entry(term, config_terms, list) {
+		if (term->type == PERF_EVSEL__CONFIG_TERM_PERCORE)
+			return term->val.percore;
+	}
+
+	return false;
+}
+
 int parse_events_add_pmu(struct parse_events_state *parse_state,
 			 struct list_head *list, char *name,
 			 struct list_head *head_config,
@@ -1333,6 +1359,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state,
 		evsel->metric_name = info.metric_name;
 		evsel->pmu_name = name;
 		evsel->use_uncore_alias = use_uncore_alias;
+		evsel->percore = config_term_percore(&evsel->config_terms);
 	}
 
 	return evsel ? 0 : -ENOMEM;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index a052cd6ac63e..f7139e1a2fd3 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -75,6 +75,7 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
 	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 	PARSE_EVENTS__TERM_TYPE_DRV_CFG,
+	PARSE_EVENTS__TERM_TYPE_PERCORE,
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index c54bfe88626c..ca6098874fe2 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -283,6 +283,7 @@ inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
 overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
 no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
+percore			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [tip:perf/core] perf stat: Factor out aggregate counts printing
  2019-04-12 13:59 ` [PATCH v4 2/4] perf stat: Factor out aggregate counts printing Jin Yao
@ 2019-05-18  9:36   ` tip-bot for Jin Yao
  0 siblings, 0 replies; 13+ messages in thread
From: tip-bot for Jin Yao @ 2019-05-18  9:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: yao.jin, acme, kan.liang, yao.jin, jolsa, ravi.bangoria, tglx,
	linux-kernel, ak, mingo, alexander.shishkin, peterz, hpa

Commit-ID:  40480a8136700d678dc07222c4d7287c89d0c04d
Gitweb:     https://git.kernel.org/tip/40480a8136700d678dc07222c4d7287c89d0c04d
Author:     Jin Yao <yao.jin@linux.intel.com>
AuthorDate: Fri, 12 Apr 2019 21:59:48 +0800
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 16 May 2019 14:17:24 -0300

perf stat: Factor out aggregate counts printing

Move the aggregate counts printing to a new function
print_counter_aggrdata, which will be used in following patches.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1555077590-27664-3-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/stat-display.c | 64 +++++++++++++++++++++++++-----------------
 1 file changed, 39 insertions(+), 25 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 3324f23c7efc..f5b4ee79568c 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -594,6 +594,41 @@ static void aggr_cb(struct perf_stat_config *config,
 	}
 }
 
+static void print_counter_aggrdata(struct perf_stat_config *config,
+				   struct perf_evsel *counter, int s,
+				   char *prefix, bool metric_only,
+				   bool *first)
+{
+	struct aggr_data ad;
+	FILE *output = config->output;
+	u64 ena, run, val;
+	int id, nr;
+	double uval;
+
+	ad.id = id = config->aggr_map->map[s];
+	ad.val = ad.ena = ad.run = 0;
+	ad.nr = 0;
+	if (!collect_data(config, counter, aggr_cb, &ad))
+		return;
+
+	nr = ad.nr;
+	ena = ad.ena;
+	run = ad.run;
+	val = ad.val;
+	if (*first && metric_only) {
+		*first = false;
+		aggr_printout(config, counter, id, nr);
+	}
+	if (prefix && !metric_only)
+		fprintf(output, "%s", prefix);
+
+	uval = val * counter->scale;
+	printout(config, id, nr, counter, uval, prefix,
+		 run, ena, 1.0, &rt_stat);
+	if (!metric_only)
+		fputc('\n', output);
+}
+
 static void print_aggr(struct perf_stat_config *config,
 		       struct perf_evlist *evlist,
 		       char *prefix)
@@ -601,9 +636,7 @@ static void print_aggr(struct perf_stat_config *config,
 	bool metric_only = config->metric_only;
 	FILE *output = config->output;
 	struct perf_evsel *counter;
-	int s, id, nr;
-	double uval;
-	u64 ena, run, val;
+	int s;
 	bool first;
 
 	if (!(config->aggr_map || config->aggr_get_id))
@@ -616,33 +649,14 @@ static void print_aggr(struct perf_stat_config *config,
 	 * Without each counter has its own line.
 	 */
 	for (s = 0; s < config->aggr_map->nr; s++) {
-		struct aggr_data ad;
 		if (prefix && metric_only)
 			fprintf(output, "%s", prefix);
 
-		ad.id = id = config->aggr_map->map[s];
 		first = true;
 		evlist__for_each_entry(evlist, counter) {
-			ad.val = ad.ena = ad.run = 0;
-			ad.nr = 0;
-			if (!collect_data(config, counter, aggr_cb, &ad))
-				continue;
-			nr = ad.nr;
-			ena = ad.ena;
-			run = ad.run;
-			val = ad.val;
-			if (first && metric_only) {
-				first = false;
-				aggr_printout(config, counter, id, nr);
-			}
-			if (prefix && !metric_only)
-				fprintf(output, "%s", prefix);
-
-			uval = val * counter->scale;
-			printout(config, id, nr, counter, uval, prefix,
-				 run, ena, 1.0, &rt_stat);
-			if (!metric_only)
-				fputc('\n', output);
+			print_counter_aggrdata(config, counter, s,
+					       prefix, metric_only,
+					       &first);
 		}
 		if (metric_only)
 			fputc('\n', output);

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [tip:perf/core] perf stat: Support 'percore' event qualifier
  2019-04-12 13:59 ` [PATCH v4 3/4] perf stat: Support 'percore' event qualifier Jin Yao
  2019-04-15 14:39   ` Ravi Bangoria
@ 2019-05-18  9:36   ` tip-bot for Jin Yao
  1 sibling, 0 replies; 13+ messages in thread
From: tip-bot for Jin Yao @ 2019-05-18  9:36 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, mingo, tglx, alexander.shishkin, hpa,
	yao.jin, yao.jin, kan.liang, jolsa, peterz, ak, ravi.bangoria

Commit-ID:  4fc4d8dfa056dfd48afe73b9ea3b7570ceb80b9c
Gitweb:     https://git.kernel.org/tip/4fc4d8dfa056dfd48afe73b9ea3b7570ceb80b9c
Author:     Jin Yao <yao.jin@linux.intel.com>
AuthorDate: Fri, 12 Apr 2019 21:59:49 +0800
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Thu, 16 May 2019 14:17:24 -0300

perf stat: Support 'percore' event qualifier

With this patch, we can use the 'percore' event qualifier in perf-stat.

  root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,percore=1/,cpu/event=0,umask=0x3/ -a -A -I1000
    1.000773050 S0-C0   98,352,832 cpu/event=0,umask=0x3,percore=1/  (50.01%)
    1.000773050 S0-C1  103,763,057 cpu/event=0,umask=0x3,percore=1/  (50.02%)
    1.000773050 S0-C2  196,776,995 cpu/event=0,umask=0x3,percore=1/  (50.02%)
    1.000773050 S0-C3  176,493,779 cpu/event=0,umask=0x3,percore=1/  (50.02%)
    1.000773050 CPU0    47,699,641 cpu/event=0,umask=0x3/            (50.02%)
    1.000773050 CPU1    49,052,451 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU2   102,771,422 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU3   100,784,662 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU4    43,171,342 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU5    54,152,158 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU6    93,618,410 cpu/event=0,umask=0x3/            (49.98%)
    1.000773050 CPU7    74,477,589 cpu/event=0,umask=0x3/            (49.99%)

In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line. From the output, we can see:

  S0-C0 = CPU0 + CPU4
  S0-C1 = CPU1 + CPU5
  S0-C2 = CPU2 + CPU6
  S0-C3 = CPU3 + CPU7

So the result is expected (tiny difference is ignored).

Note that, the 'percore' event qualifier needs to use with option '-A'.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1555077590-27664-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/Documentation/perf-stat.txt |  4 ++++
 tools/perf/builtin-stat.c              | 21 +++++++++++++++++
 tools/perf/util/stat-display.c         | 43 ++++++++++++++++++++++++++++++----
 tools/perf/util/stat.c                 |  8 ++++---
 4 files changed, 69 insertions(+), 7 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 39c05f89104e..1e312c2672e4 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -43,6 +43,10 @@ report::
 	  param1 and param2 are defined as formats for the PMU in
 	  /sys/bus/event_source/devices/<pmu>/format/*
 
+	  'percore' is a event qualifier that sums up the event counts for both
+	  hardware threads in a core. For example:
+	  perf stat -A -a -e cpu/event,percore=1/,otherevent ...
+
 	- a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
 	  where M, N, K are numbers (in decimal, hex, octal format).
 	  Acceptable values for each of 'config', 'config1' and 'config2'
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a3c060878faa..24b8e690fb69 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -847,6 +847,18 @@ static int perf_stat__get_core_cached(struct perf_stat_config *config,
 	return perf_stat__get_aggr(config, perf_stat__get_core, map, idx);
 }
 
+static bool term_percore_set(void)
+{
+	struct perf_evsel *counter;
+
+	evlist__for_each_entry(evsel_list, counter) {
+		if (counter->percore)
+			return true;
+	}
+
+	return false;
+}
+
 static int perf_stat_init_aggr_mode(void)
 {
 	int nr;
@@ -867,6 +879,15 @@ static int perf_stat_init_aggr_mode(void)
 		stat_config.aggr_get_id = perf_stat__get_core_cached;
 		break;
 	case AGGR_NONE:
+		if (term_percore_set()) {
+			if (cpu_map__build_core_map(evsel_list->cpus,
+						    &stat_config.aggr_map)) {
+				perror("cannot build core map");
+				return -1;
+			}
+			stat_config.aggr_get_id = perf_stat__get_core_cached;
+		}
+		break;
 	case AGGR_GLOBAL:
 	case AGGR_THREAD:
 	case AGGR_UNSET:
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index f5b4ee79568c..4c53bae5644b 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -88,9 +88,17 @@ static void aggr_printout(struct perf_stat_config *config,
 			config->csv_sep);
 			break;
 	case AGGR_NONE:
-		fprintf(config->output, "CPU%*d%s",
-			config->csv_output ? 0 : -4,
-			perf_evsel__cpus(evsel)->map[id], config->csv_sep);
+		if (evsel->percore) {
+			fprintf(config->output, "S%d-C%*d%s",
+				cpu_map__id_to_socket(id),
+				config->csv_output ? 0 : -5,
+				cpu_map__id_to_cpu(id), config->csv_sep);
+		} else {
+			fprintf(config->output, "CPU%*d%s ",
+				config->csv_output ? 0 : -5,
+				perf_evsel__cpus(evsel)->map[id],
+				config->csv_sep);
+		}
 		break;
 	case AGGR_THREAD:
 		fprintf(config->output, "%*s-%*d%s",
@@ -1103,6 +1111,30 @@ static void print_footer(struct perf_stat_config *config)
 			"the same PMU. Try reorganizing the group.\n");
 }
 
+static void print_percore(struct perf_stat_config *config,
+			  struct perf_evsel *counter, char *prefix)
+{
+	bool metric_only = config->metric_only;
+	FILE *output = config->output;
+	int s;
+	bool first = true;
+
+	if (!(config->aggr_map || config->aggr_get_id))
+		return;
+
+	for (s = 0; s < config->aggr_map->nr; s++) {
+		if (prefix && metric_only)
+			fprintf(output, "%s", prefix);
+
+		print_counter_aggrdata(config, counter, s,
+				       prefix, metric_only,
+				       &first);
+	}
+
+	if (metric_only)
+		fputc('\n', output);
+}
+
 void
 perf_evlist__print_counters(struct perf_evlist *evlist,
 			    struct perf_stat_config *config,
@@ -1153,7 +1185,10 @@ perf_evlist__print_counters(struct perf_evlist *evlist,
 			print_no_aggr_metric(config, evlist, prefix);
 		else {
 			evlist__for_each_entry(evlist, counter) {
-				print_counter(config, counter, prefix);
+				if (counter->percore)
+					print_percore(config, counter, prefix);
+				else
+					print_counter(config, counter, prefix);
 			}
 		}
 		break;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 2856cc9d5a31..c3115d939b0b 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -277,9 +277,11 @@ process_counter_values(struct perf_stat_config *config, struct perf_evsel *evsel
 		if (!evsel->snapshot)
 			perf_evsel__compute_deltas(evsel, cpu, thread, count);
 		perf_counts_values__scale(count, config->scale, NULL);
-		if (config->aggr_mode == AGGR_NONE)
-			perf_stat__update_shadow_stats(evsel, count->val, cpu,
-						       &rt_stat);
+		if ((config->aggr_mode == AGGR_NONE) && (!evsel->percore)) {
+			perf_stat__update_shadow_stats(evsel, count->val,
+						       cpu, &rt_stat);
+		}
+
 		if (config->aggr_mode == AGGR_THREAD) {
 			if (config->stats)
 				perf_stat__update_shadow_stats(evsel,

^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-05-18  9:38 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-12 13:59 [PATCH v4 0/4] perf: Support a new 'percore' event qualifier Jin Yao
2019-04-12 13:59 ` [PATCH v4 1/4] perf: Add a " Jin Yao
2019-05-18  9:35   ` [tip:perf/core] perf tools: " tip-bot for Jin Yao
2019-04-12 13:59 ` [PATCH v4 2/4] perf stat: Factor out aggregate counts printing Jin Yao
2019-05-18  9:36   ` [tip:perf/core] " tip-bot for Jin Yao
2019-04-12 13:59 ` [PATCH v4 3/4] perf stat: Support 'percore' event qualifier Jin Yao
2019-04-15 14:39   ` Ravi Bangoria
2019-04-15 19:26     ` Arnaldo Carvalho de Melo
2019-04-16  0:31       ` Jin, Yao
2019-05-18  9:36   ` [tip:perf/core] " tip-bot for Jin Yao
2019-04-12 13:59 ` [PATCH v4 4/4] perf test: Add a simple test for term 'percore' Jin Yao
2019-04-15  8:51 ` [PATCH v4 0/4] perf: Support a new 'percore' event qualifier Jiri Olsa
2019-05-15 19:40 ` Arnaldo Carvalho de Melo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.