linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
  2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
@ 2019-03-15 13:34 ` Jiri Olsa
  2019-03-15 23:07   ` Jin, Yao
  2019-03-15 16:04 ` [PATCH v1 1/3] perf: Add a " Jin Yao
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2019-03-15 13:34 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Sat, Mar 16, 2019 at 12:04:13AM +0800, Jin Yao wrote:
> The coresum event qualifier which sums up the event counts for both
> hardware threads in a core. For example,
> 
> perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/
> 
> In this example, we count the event 'ref-cycles' per-core and per-CPU in
> one perf stat command-line.
> 
> We can already support per-core counting with --per-core, but it's
> often useful to do this together with other metrics that are collected
> per CPU (per hardware thread). So this patch series supports this
> per-core counting on a event level.

seems useful, but perhaps we should follow the --per-core option
we already have and call it 'per-core' instead of coresum

jirka

> 
> Jin Yao (3):
>   perf: Add a coresum event qualifier
>   perf stat: Support coresum event qualifier
>   perf test: Add a simple test for term coresum
> 
>  tools/perf/Documentation/perf-stat.txt |   4 ++
>  tools/perf/builtin-stat.c              |  21 +++++++
>  tools/perf/tests/parse-events.c        |  10 ++-
>  tools/perf/util/evsel.c                |   2 +
>  tools/perf/util/evsel.h                |   3 +
>  tools/perf/util/parse-events.c         |  27 +++++++++
>  tools/perf/util/parse-events.h         |   1 +
>  tools/perf/util/parse-events.l         |   1 +
>  tools/perf/util/stat-display.c         | 108 ++++++++++++++++++++++++---------
>  tools/perf/util/stat.c                 |   8 ++-
>  10 files changed, 151 insertions(+), 34 deletions(-)
> 
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 2/3] perf stat: Support coresum event qualifier
  2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
@ 2019-03-15 13:34   ` Jiri Olsa
  2019-03-15 23:08     ` Jin, Yao
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2019-03-15 13:34 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Sat, Mar 16, 2019 at 12:04:15AM +0800, Jin Yao wrote:

SNIP

> +static void print_counter_aggrdata(struct perf_stat_config *config,
> +				   struct perf_evsel *counter, int s,
> +				   char *prefix, bool metric_only,
> +				   bool *first)
> +{
> +	struct aggr_data ad;
> +	FILE *output = config->output;
> +	u64 ena, run, val;
> +	int id, nr;
> +	double uval;
> +
> +	ad.id = id = config->aggr_map->map[s];
> +	ad.val = ad.ena = ad.run = 0;
> +	ad.nr = 0;
> +	if (!collect_data(config, counter, aggr_cb, &ad))
> +		return;
> +
> +	nr = ad.nr;
> +	ena = ad.ena;
> +	run = ad.run;
> +	val = ad.val;
> +	if (*first && metric_only) {
> +		*first = false;
> +		aggr_printout(config, counter, id, nr);
> +	}
> +	if (prefix && !metric_only)
> +		fprintf(output, "%s", prefix);
> +
> +	uval = val * counter->scale;
> +	printout(config, id, nr, counter, uval, prefix,
> +		 run, ena, 1.0, &rt_stat);
> +	if (!metric_only)
> +		fputc('\n', output);
> +}

plese put the factoring out of print_counter_aggrdata function
into separate patch

thanks,
jirka


> +
>  static void print_aggr(struct perf_stat_config *config,
>  		       struct perf_evlist *evlist,
>  		       char *prefix)
> @@ -606,9 +649,7 @@ static void print_aggr(struct perf_stat_config *config,
>  	bool metric_only = config->metric_only;
>  	FILE *output = config->output;
>  	struct perf_evsel *counter;
> -	int s, id, nr;
> -	double uval;
> -	u64 ena, run, val;
> +	int s;
>  	bool first;
>  
>  	if (!(config->aggr_map || config->aggr_get_id))
> @@ -621,36 +662,16 @@ static void print_aggr(struct perf_stat_config *config,
>  	 * Without each counter has its own line.
>  	 */
>  	for (s = 0; s < config->aggr_map->nr; s++) {
> -		struct aggr_data ad;
>  		if (prefix && metric_only)
>  			fprintf(output, "%s", prefix);
>  
> -		ad.id = id = config->aggr_map->map[s];
>  		first = true;
>  		evlist__for_each_entry(evlist, counter) {
>  			if (is_duration_time(counter))
>  				continue;
> -
> -			ad.val = ad.ena = ad.run = 0;
> -			ad.nr = 0;
> -			if (!collect_data(config, counter, aggr_cb, &ad))
> -				continue;
> -			nr = ad.nr;
> -			ena = ad.ena;
> -			run = ad.run;
> -			val = ad.val;
> -			if (first && metric_only) {
> -				first = false;
> -				aggr_printout(config, counter, id, nr);
> -			}
> -			if (prefix && !metric_only)
> -				fprintf(output, "%s", prefix);
> -
> -			uval = val * counter->scale;
> -			printout(config, id, nr, counter, uval, prefix,
> -				 run, ena, 1.0, &rt_stat);
> -			if (!metric_only)
> -				fputc('\n', output);
> +			print_counter_aggrdata(config, counter, s,
> +					       prefix, metric_only,
> +					       &first);

SNIP

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 0/3] perf: Support a new coresum event qualifier
@ 2019-03-15 16:04 Jin Yao
  2019-03-15 13:34 ` Jiri Olsa
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

The coresum event qualifier which sums up the event counts for both
hardware threads in a core. For example,

perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/

In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line.

We can already support per-core counting with --per-core, but it's
often useful to do this together with other metrics that are collected
per CPU (per hardware thread). So this patch series supports this
per-core counting on a event level.

Jin Yao (3):
  perf: Add a coresum event qualifier
  perf stat: Support coresum event qualifier
  perf test: Add a simple test for term coresum

 tools/perf/Documentation/perf-stat.txt |   4 ++
 tools/perf/builtin-stat.c              |  21 +++++++
 tools/perf/tests/parse-events.c        |  10 ++-
 tools/perf/util/evsel.c                |   2 +
 tools/perf/util/evsel.h                |   3 +
 tools/perf/util/parse-events.c         |  27 +++++++++
 tools/perf/util/parse-events.h         |   1 +
 tools/perf/util/parse-events.l         |   1 +
 tools/perf/util/stat-display.c         | 108 ++++++++++++++++++++++++---------
 tools/perf/util/stat.c                 |   8 ++-
 10 files changed, 151 insertions(+), 34 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 1/3] perf: Add a coresum event qualifier
  2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
  2019-03-15 13:34 ` Jiri Olsa
@ 2019-03-15 16:04 ` Jin Yao
  2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
  2019-03-15 16:04 ` [PATCH v1 3/3] perf test: Add a simple test for term coresum Jin Yao
  3 siblings, 0 replies; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Add a coresum event qualifier, like cpu/event=0,umask=0x3,coresum=1/,
that sums up the event counts for both hardware threads in a core.

We can already do this with --per-core, but it's often useful to do
this together with other metrics that are collected per hardware thread.
So we need to support this per-core counting on a event level.

This can be implemented in only the user tool, no kernel support needed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/evsel.c        |  2 ++
 tools/perf/util/evsel.h        |  3 +++
 tools/perf/util/parse-events.c | 27 +++++++++++++++++++++++++++
 tools/perf/util/parse-events.h |  1 +
 tools/perf/util/parse-events.l |  1 +
 5 files changed, 34 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 3bbf73e..978e10b 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -803,6 +803,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			break;
 		case PERF_EVSEL__CONFIG_TERM_DRV_CFG:
 			break;
+		case PERF_EVSEL__CONFIG_TERM_CORESUM:
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index cc578e0..25a0777 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -50,6 +50,7 @@ enum term_type {
 	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_DRV_CFG,
 	PERF_EVSEL__CONFIG_TERM_BRANCH,
+	PERF_EVSEL__CONFIG_TERM_CORESUM,
 };
 
 struct perf_evsel_config_term {
@@ -67,6 +68,7 @@ struct perf_evsel_config_term {
 		bool	overwrite;
 		char	*branch;
 		unsigned long max_events;
+		bool	coresum;
 	} val;
 	bool weak;
 };
@@ -150,6 +152,7 @@ struct perf_evsel {
 	struct perf_evsel	**metric_events;
 	bool			collect_stat;
 	bool			weak_group;
+	bool			coresum;
 	const char		*pmu_name;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4dcc01b..9c2808c 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -930,6 +930,7 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_OVERWRITE]		= "overwrite",
 	[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]		= "no-overwrite",
 	[PARSE_EVENTS__TERM_TYPE_DRV_CFG]		= "driver-config",
+	[PARSE_EVENTS__TERM_TYPE_CORESUM]		= "coresum",
 };
 
 static bool config_term_shrinked;
@@ -950,6 +951,7 @@ config_term_avail(int term_type, struct parse_events_error *err)
 	case PARSE_EVENTS__TERM_TYPE_CONFIG2:
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 	case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
+	case PARSE_EVENTS__TERM_TYPE_CORESUM:
 		return true;
 	default:
 		if (!err)
@@ -1041,6 +1043,14 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_MAX_EVENTS:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_CORESUM:
+		CHECK_TYPE_VAL(NUM);
+		if ((unsigned int)term->val.num > 1) {
+			err->str = strdup("expected 0 or 1");
+			err->idx = term->err_val;
+			return -EINVAL;
+		}
+		break;
 	default:
 		err->str = strdup("unknown term");
 		err->idx = term->err_term;
@@ -1179,6 +1189,10 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_DRV_CFG:
 			ADD_CONFIG_TERM(DRV_CFG, drv_cfg, term->val.str);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_CORESUM:
+			ADD_CONFIG_TERM(CORESUM, coresum,
+					term->val.num ? 1 : 0);
+			break;
 		default:
 			break;
 		}
@@ -1233,6 +1247,18 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
 			 get_config_name(head_config), &config_terms);
 }
 
+static bool config_term_coresum(struct list_head *config_terms)
+{
+	struct perf_evsel_config_term *term;
+
+	list_for_each_entry(term, config_terms, list) {
+		if (term->type == PERF_EVSEL__CONFIG_TERM_CORESUM)
+			return term->val.coresum ? true : false;
+	}
+
+	return false;
+}
+
 int parse_events_add_pmu(struct parse_events_state *parse_state,
 			 struct list_head *list, char *name,
 			 struct list_head *head_config,
@@ -1305,6 +1331,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state,
 		evsel->metric_name = info.metric_name;
 		evsel->pmu_name = name;
 		evsel->use_uncore_alias = use_uncore_alias;
+		evsel->coresum = config_term_coresum(&evsel->config_terms);
 	}
 
 	return evsel ? 0 : -ENOMEM;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 5ed035c..8d259c5 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -75,6 +75,7 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
 	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 	PARSE_EVENTS__TERM_TYPE_DRV_CFG,
+	PARSE_EVENTS__TERM_TYPE_CORESUM,
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 7805c71..3410bc5 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -274,6 +274,7 @@ inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
 overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
 no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
+coresum			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CORESUM); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v1 2/3] perf stat: Support coresum event qualifier
  2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
  2019-03-15 13:34 ` Jiri Olsa
  2019-03-15 16:04 ` [PATCH v1 1/3] perf: Add a " Jin Yao
@ 2019-03-15 16:04 ` Jin Yao
  2019-03-15 13:34   ` Jiri Olsa
  2019-03-15 16:04 ` [PATCH v1 3/3] perf test: Add a simple test for term coresum Jin Yao
  3 siblings, 1 reply; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

With this patch, we can use the coresum event qualifier in perf-stat.

root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/ -a -A -I1000
 #           time CPU                  counts      unit events
     1.000816660 S0-C0             80,676,154      cpu/event=0,umask=0x3,coresum=1/                                     (50.04%)
     1.000816660 S0-C1            115,314,959      cpu/event=0,umask=0x3,coresum=1/                                     (50.04%)
     1.000816660 S0-C2            126,541,249      cpu/event=0,umask=0x3,coresum=1/                                     (50.04%)
     1.000816660 S0-C3            119,950,015      cpu/event=0,umask=0x3,coresum=1/                                     (50.04%)
     1.000816660 CPU0              52,439,489      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU1              53,431,155      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU2              91,192,070      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU3              90,852,159      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU4              29,715,956      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU5              63,289,507      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU6              29,036,120      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU7              28,996,591      cpu/event=0,umask=0x3/                                        (49.97%)

In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line. From the output, we can see:

  S0-C0 = CPU0 + CPU4
  S0-C1 = CPU1 + CPU5
  S0-C2 = CPU2 + CPU6
  S0-C3 = CPU3 + CPU7

So the result is expected (tiny difference is ignored).

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/Documentation/perf-stat.txt |   4 ++
 tools/perf/builtin-stat.c              |  21 +++++++
 tools/perf/util/stat-display.c         | 108 ++++++++++++++++++++++++---------
 tools/perf/util/stat.c                 |   8 ++-
 4 files changed, 108 insertions(+), 33 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 4bc2085..cd5fbd9 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -43,6 +43,10 @@ report::
 	  param1 and param2 are defined as formats for the PMU in
 	  /sys/bus/event_source/devices/<pmu>/format/*
 
+	  'coresum' is a event qualifier that sums up the event counts for both
+	  hardware threads in a core. For example:
+	  perf stat -A -a -e cpu/event,coresum=1/,otherevent ...
+
 	- a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
 	  where M, N, K are numbers (in decimal, hex, octal format).
 	  Acceptable values for each of 'config', 'config1' and 'config2'
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 7b8f09b..8ce0168 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -832,6 +832,18 @@ static int perf_stat__get_core_cached(struct perf_stat_config *config,
 	return perf_stat__get_aggr(config, perf_stat__get_core, map, idx);
 }
 
+static bool term_coresum_set(void)
+{
+	struct perf_evsel *counter;
+
+	evlist__for_each_entry(evsel_list, counter) {
+		if (counter->coresum)
+			return true;
+	}
+
+	return false;
+}
+
 static int perf_stat_init_aggr_mode(void)
 {
 	int nr;
@@ -852,6 +864,15 @@ static int perf_stat_init_aggr_mode(void)
 		stat_config.aggr_get_id = perf_stat__get_core_cached;
 		break;
 	case AGGR_NONE:
+		if (term_coresum_set()) {
+			if (cpu_map__build_core_map(evsel_list->cpus,
+						    &stat_config.aggr_map)) {
+				perror("cannot build core map");
+				return -1;
+			}
+			stat_config.aggr_get_id = perf_stat__get_core_cached;
+		}
+		break;
 	case AGGR_GLOBAL:
 	case AGGR_THREAD:
 	case AGGR_UNSET:
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 6d043c7..23c9368 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -93,9 +93,17 @@ static void aggr_printout(struct perf_stat_config *config,
 			config->csv_sep);
 			break;
 	case AGGR_NONE:
-		fprintf(config->output, "CPU%*d%s",
-			config->csv_output ? 0 : -4,
-			perf_evsel__cpus(evsel)->map[id], config->csv_sep);
+		if (evsel->coresum) {
+			fprintf(config->output, "S%d-C%*d%s",
+				cpu_map__id_to_socket(id),
+				config->csv_output ? 0 : -5,
+				cpu_map__id_to_cpu(id), config->csv_sep);
+		} else {
+			fprintf(config->output, "CPU%*d%s ",
+				config->csv_output ? 0 : -5,
+				perf_evsel__cpus(evsel)->map[id],
+				config->csv_sep);
+		}
 		break;
 	case AGGR_THREAD:
 		fprintf(config->output, "%*s-%*d%s",
@@ -599,6 +607,41 @@ static void aggr_cb(struct perf_stat_config *config,
 	}
 }
 
+static void print_counter_aggrdata(struct perf_stat_config *config,
+				   struct perf_evsel *counter, int s,
+				   char *prefix, bool metric_only,
+				   bool *first)
+{
+	struct aggr_data ad;
+	FILE *output = config->output;
+	u64 ena, run, val;
+	int id, nr;
+	double uval;
+
+	ad.id = id = config->aggr_map->map[s];
+	ad.val = ad.ena = ad.run = 0;
+	ad.nr = 0;
+	if (!collect_data(config, counter, aggr_cb, &ad))
+		return;
+
+	nr = ad.nr;
+	ena = ad.ena;
+	run = ad.run;
+	val = ad.val;
+	if (*first && metric_only) {
+		*first = false;
+		aggr_printout(config, counter, id, nr);
+	}
+	if (prefix && !metric_only)
+		fprintf(output, "%s", prefix);
+
+	uval = val * counter->scale;
+	printout(config, id, nr, counter, uval, prefix,
+		 run, ena, 1.0, &rt_stat);
+	if (!metric_only)
+		fputc('\n', output);
+}
+
 static void print_aggr(struct perf_stat_config *config,
 		       struct perf_evlist *evlist,
 		       char *prefix)
@@ -606,9 +649,7 @@ static void print_aggr(struct perf_stat_config *config,
 	bool metric_only = config->metric_only;
 	FILE *output = config->output;
 	struct perf_evsel *counter;
-	int s, id, nr;
-	double uval;
-	u64 ena, run, val;
+	int s;
 	bool first;
 
 	if (!(config->aggr_map || config->aggr_get_id))
@@ -621,36 +662,16 @@ static void print_aggr(struct perf_stat_config *config,
 	 * Without each counter has its own line.
 	 */
 	for (s = 0; s < config->aggr_map->nr; s++) {
-		struct aggr_data ad;
 		if (prefix && metric_only)
 			fprintf(output, "%s", prefix);
 
-		ad.id = id = config->aggr_map->map[s];
 		first = true;
 		evlist__for_each_entry(evlist, counter) {
 			if (is_duration_time(counter))
 				continue;
-
-			ad.val = ad.ena = ad.run = 0;
-			ad.nr = 0;
-			if (!collect_data(config, counter, aggr_cb, &ad))
-				continue;
-			nr = ad.nr;
-			ena = ad.ena;
-			run = ad.run;
-			val = ad.val;
-			if (first && metric_only) {
-				first = false;
-				aggr_printout(config, counter, id, nr);
-			}
-			if (prefix && !metric_only)
-				fprintf(output, "%s", prefix);
-
-			uval = val * counter->scale;
-			printout(config, id, nr, counter, uval, prefix,
-				 run, ena, 1.0, &rt_stat);
-			if (!metric_only)
-				fputc('\n', output);
+			print_counter_aggrdata(config, counter, s,
+					       prefix, metric_only,
+					       &first);
 		}
 		if (metric_only)
 			fputc('\n', output);
@@ -1101,6 +1122,30 @@ static void print_footer(struct perf_stat_config *config)
 			"the same PMU. Try reorganizing the group.\n");
 }
 
+static void print_coresum(struct perf_stat_config *config,
+			  struct perf_evsel *counter, char *prefix)
+{
+	bool metric_only = config->metric_only;
+	FILE *output = config->output;
+	int s;
+	bool first = true;
+
+	if (!(config->aggr_map || config->aggr_get_id))
+		return;
+
+	for (s = 0; s < config->aggr_map->nr; s++) {
+		if (prefix && metric_only)
+			fprintf(output, "%s", prefix);
+
+		print_counter_aggrdata(config, counter, s,
+				       prefix, metric_only,
+				       &first);
+	}
+
+	if (metric_only)
+		fputc('\n', output);
+}
+
 void
 perf_evlist__print_counters(struct perf_evlist *evlist,
 			    struct perf_stat_config *config,
@@ -1157,7 +1202,10 @@ perf_evlist__print_counters(struct perf_evlist *evlist,
 			evlist__for_each_entry(evlist, counter) {
 				if (is_duration_time(counter))
 					continue;
-				print_counter(config, counter, prefix);
+				if (counter->coresum)
+					print_coresum(config, counter, prefix);
+				else
+					print_counter(config, counter, prefix);
 			}
 		}
 		break;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 4d40515..33a6e3c 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -277,9 +277,11 @@ process_counter_values(struct perf_stat_config *config, struct perf_evsel *evsel
 		if (!evsel->snapshot)
 			perf_evsel__compute_deltas(evsel, cpu, thread, count);
 		perf_counts_values__scale(count, config->scale, NULL);
-		if (config->aggr_mode == AGGR_NONE)
-			perf_stat__update_shadow_stats(evsel, count->val, cpu,
-						       &rt_stat);
+		if ((config->aggr_mode == AGGR_NONE) && (!evsel->coresum)) {
+			perf_stat__update_shadow_stats(evsel, count->val,
+						       cpu, &rt_stat);
+		}
+
 		if (config->aggr_mode == AGGR_THREAD) {
 			if (config->stats)
 				perf_stat__update_shadow_stats(evsel,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v1 3/3] perf test: Add a simple test for term coresum
  2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
                   ` (2 preceding siblings ...)
  2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
@ 2019-03-15 16:04 ` Jin Yao
  3 siblings, 0 replies; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

It's a simple test which just checks if parser works.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/tests/parse-events.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 4a69c07..8d741d4 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -605,6 +605,14 @@ static int test__checkterms_simple(struct list_head *terms)
 	TEST_ASSERT_VAL("wrong val", term->val.num == 1);
 	TEST_ASSERT_VAL("wrong config", !strcmp(term->config, "umask"));
 
+	/* coresum=1*/
+	term = list_entry(term->list.next, struct parse_events_term, list);
+	TEST_ASSERT_VAL("wrong type term",
+			term->type_term == PARSE_EVENTS__TERM_TYPE_CORESUM);
+	TEST_ASSERT_VAL("wrong type val",
+			term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+	TEST_ASSERT_VAL("wrong val", term->val.num == 1);
+
 	return 0;
 }
 
@@ -1734,7 +1742,7 @@ struct terms_test {
 
 static struct terms_test test__terms[] = {
 	[0] = {
-		.str   = "config=10,config1,config2=3,umask=1",
+		.str   = "config=10,config1,config2=3,umask=1,coresum=1",
 		.check = test__checkterms_simple,
 	},
 };
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
  2019-03-15 13:34 ` Jiri Olsa
@ 2019-03-15 23:07   ` Jin, Yao
  2019-03-15 23:26     ` Andi Kleen
  0 siblings, 1 reply; 10+ messages in thread
From: Jin, Yao @ 2019-03-15 23:07 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 3/15/2019 9:34 PM, Jiri Olsa wrote:
> On Sat, Mar 16, 2019 at 12:04:13AM +0800, Jin Yao wrote:
>> The coresum event qualifier which sums up the event counts for both
>> hardware threads in a core. For example,
>>
>> perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/
>>
>> In this example, we count the event 'ref-cycles' per-core and per-CPU in
>> one perf stat command-line.
>>
>> We can already support per-core counting with --per-core, but it's
>> often useful to do this together with other metrics that are collected
>> per CPU (per hardware thread). So this patch series supports this
>> per-core counting on a event level.
> 
> seems useful, but perhaps we should follow the --per-core option
> we already have and call it 'per-core' instead of coresum
> 
> jirka
> 

Yes, the coresum's behavior is similar as --per-core option, just 
supports at the event level. I'm OK with calling it 'per-core'.

For example,
perf stat -e cpu/event=0,umask=0x3,per-core=1/

Thanks
Jin Yao

>>
>> Jin Yao (3):
>>    perf: Add a coresum event qualifier
>>    perf stat: Support coresum event qualifier
>>    perf test: Add a simple test for term coresum
>>
>>   tools/perf/Documentation/perf-stat.txt |   4 ++
>>   tools/perf/builtin-stat.c              |  21 +++++++
>>   tools/perf/tests/parse-events.c        |  10 ++-
>>   tools/perf/util/evsel.c                |   2 +
>>   tools/perf/util/evsel.h                |   3 +
>>   tools/perf/util/parse-events.c         |  27 +++++++++
>>   tools/perf/util/parse-events.h         |   1 +
>>   tools/perf/util/parse-events.l         |   1 +
>>   tools/perf/util/stat-display.c         | 108 ++++++++++++++++++++++++---------
>>   tools/perf/util/stat.c                 |   8 ++-
>>   10 files changed, 151 insertions(+), 34 deletions(-)
>>
>> -- 
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 2/3] perf stat: Support coresum event qualifier
  2019-03-15 13:34   ` Jiri Olsa
@ 2019-03-15 23:08     ` Jin, Yao
  0 siblings, 0 replies; 10+ messages in thread
From: Jin, Yao @ 2019-03-15 23:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 3/15/2019 9:34 PM, Jiri Olsa wrote:
> On Sat, Mar 16, 2019 at 12:04:15AM +0800, Jin Yao wrote:
> 
> SNIP
> 
>> +static void print_counter_aggrdata(struct perf_stat_config *config,
>> +				   struct perf_evsel *counter, int s,
>> +				   char *prefix, bool metric_only,
>> +				   bool *first)
>> +{
>> +	struct aggr_data ad;
>> +	FILE *output = config->output;
>> +	u64 ena, run, val;
>> +	int id, nr;
>> +	double uval;
>> +
>> +	ad.id = id = config->aggr_map->map[s];
>> +	ad.val = ad.ena = ad.run = 0;
>> +	ad.nr = 0;
>> +	if (!collect_data(config, counter, aggr_cb, &ad))
>> +		return;
>> +
>> +	nr = ad.nr;
>> +	ena = ad.ena;
>> +	run = ad.run;
>> +	val = ad.val;
>> +	if (*first && metric_only) {
>> +		*first = false;
>> +		aggr_printout(config, counter, id, nr);
>> +	}
>> +	if (prefix && !metric_only)
>> +		fprintf(output, "%s", prefix);
>> +
>> +	uval = val * counter->scale;
>> +	printout(config, id, nr, counter, uval, prefix,
>> +		 run, ena, 1.0, &rt_stat);
>> +	if (!metric_only)
>> +		fputc('\n', output);
>> +}
> 
> plese put the factoring out of print_counter_aggrdata function
> into separate patch
> 
> thanks,
> jirka
> 
> 

OK!

Thanks
Jin Yao

>> +
>>   static void print_aggr(struct perf_stat_config *config,
>>   		       struct perf_evlist *evlist,
>>   		       char *prefix)
>> @@ -606,9 +649,7 @@ static void print_aggr(struct perf_stat_config *config,
>>   	bool metric_only = config->metric_only;
>>   	FILE *output = config->output;
>>   	struct perf_evsel *counter;
>> -	int s, id, nr;
>> -	double uval;
>> -	u64 ena, run, val;
>> +	int s;
>>   	bool first;
>>   
>>   	if (!(config->aggr_map || config->aggr_get_id))
>> @@ -621,36 +662,16 @@ static void print_aggr(struct perf_stat_config *config,
>>   	 * Without each counter has its own line.
>>   	 */
>>   	for (s = 0; s < config->aggr_map->nr; s++) {
>> -		struct aggr_data ad;
>>   		if (prefix && metric_only)
>>   			fprintf(output, "%s", prefix);
>>   
>> -		ad.id = id = config->aggr_map->map[s];
>>   		first = true;
>>   		evlist__for_each_entry(evlist, counter) {
>>   			if (is_duration_time(counter))
>>   				continue;
>> -
>> -			ad.val = ad.ena = ad.run = 0;
>> -			ad.nr = 0;
>> -			if (!collect_data(config, counter, aggr_cb, &ad))
>> -				continue;
>> -			nr = ad.nr;
>> -			ena = ad.ena;
>> -			run = ad.run;
>> -			val = ad.val;
>> -			if (first && metric_only) {
>> -				first = false;
>> -				aggr_printout(config, counter, id, nr);
>> -			}
>> -			if (prefix && !metric_only)
>> -				fprintf(output, "%s", prefix);
>> -
>> -			uval = val * counter->scale;
>> -			printout(config, id, nr, counter, uval, prefix,
>> -				 run, ena, 1.0, &rt_stat);
>> -			if (!metric_only)
>> -				fputc('\n', output);
>> +			print_counter_aggrdata(config, counter, s,
>> +					       prefix, metric_only,
>> +					       &first);
> 
> SNIP
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
  2019-03-15 23:07   ` Jin, Yao
@ 2019-03-15 23:26     ` Andi Kleen
  2019-03-15 23:31       ` Jin, Yao
  0 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2019-03-15 23:26 UTC (permalink / raw)
  To: Jin, Yao
  Cc: Jiri Olsa, acme, jolsa, peterz, mingo, alexander.shishkin,
	Linux-kernel, kan.liang, yao.jin

> Yes, the coresum's behavior is similar as --per-core option, just supports
> at the event level. I'm OK with calling it 'per-core'.
> 
> For example,
> perf stat -e cpu/event=0,umask=0x3,per-core=1/

Please use percore, the - would need to be escaped in metric expressions.

-Andi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
  2019-03-15 23:26     ` Andi Kleen
@ 2019-03-15 23:31       ` Jin, Yao
  0 siblings, 0 replies; 10+ messages in thread
From: Jin, Yao @ 2019-03-15 23:31 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jiri Olsa, acme, jolsa, peterz, mingo, alexander.shishkin,
	Linux-kernel, kan.liang, yao.jin



On 3/16/2019 7:26 AM, Andi Kleen wrote:
>> Yes, the coresum's behavior is similar as --per-core option, just supports
>> at the event level. I'm OK with calling it 'per-core'.
>>
>> For example,
>> perf stat -e cpu/event=0,umask=0x3,per-core=1/
> 
> Please use percore, the - would need to be escaped in metric expressions.
> 
> -Andi
> 

Oh, yes, thanks for reminding. Will use 'percore' in next version.

Thanks
Jin Yao

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-03-15 23:31 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
2019-03-15 13:34 ` Jiri Olsa
2019-03-15 23:07   ` Jin, Yao
2019-03-15 23:26     ` Andi Kleen
2019-03-15 23:31       ` Jin, Yao
2019-03-15 16:04 ` [PATCH v1 1/3] perf: Add a " Jin Yao
2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
2019-03-15 13:34   ` Jiri Olsa
2019-03-15 23:08     ` Jin, Yao
2019-03-15 16:04 ` [PATCH v1 3/3] perf test: Add a simple test for term coresum Jin Yao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).