All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
  2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
@ 2019-03-15 13:34 ` Jiri Olsa
  2019-03-15 23:07   ` Jin, Yao
  2019-03-15 16:04 ` [PATCH v1 1/3] perf: Add a " Jin Yao
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2019-03-15 13:34 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Sat, Mar 16, 2019 at 12:04:13AM +0800, Jin Yao wrote:
> The coresum event qualifier which sums up the event counts for both
> hardware threads in a core. For example,
> 
> perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/
> 
> In this example, we count the event 'ref-cycles' per-core and per-CPU in
> one perf stat command-line.
> 
> We can already support per-core counting with --per-core, but it's
> often useful to do this together with other metrics that are collected
> per CPU (per hardware thread). So this patch series supports this
> per-core counting on a event level.

seems useful, but perhaps we should follow the --per-core option
we already have and call it 'per-core' instead of coresum

jirka

> 
> Jin Yao (3):
>   perf: Add a coresum event qualifier
>   perf stat: Support coresum event qualifier
>   perf test: Add a simple test for term coresum
> 
>  tools/perf/Documentation/perf-stat.txt |   4 ++
>  tools/perf/builtin-stat.c              |  21 +++++++
>  tools/perf/tests/parse-events.c        |  10 ++-
>  tools/perf/util/evsel.c                |   2 +
>  tools/perf/util/evsel.h                |   3 +
>  tools/perf/util/parse-events.c         |  27 +++++++++
>  tools/perf/util/parse-events.h         |   1 +
>  tools/perf/util/parse-events.l         |   1 +
>  tools/perf/util/stat-display.c         | 108 ++++++++++++++++++++++++---------
>  tools/perf/util/stat.c                 |   8 ++-
>  10 files changed, 151 insertions(+), 34 deletions(-)
> 
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 2/3] perf stat: Support coresum event qualifier
  2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
@ 2019-03-15 13:34   ` Jiri Olsa
  2019-03-15 23:08     ` Jin, Yao
  0 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2019-03-15 13:34 UTC (permalink / raw)
  To: Jin Yao
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin

On Sat, Mar 16, 2019 at 12:04:15AM +0800, Jin Yao wrote:

SNIP

> +static void print_counter_aggrdata(struct perf_stat_config *config,
> +				   struct perf_evsel *counter, int s,
> +				   char *prefix, bool metric_only,
> +				   bool *first)
> +{
> +	struct aggr_data ad;
> +	FILE *output = config->output;
> +	u64 ena, run, val;
> +	int id, nr;
> +	double uval;
> +
> +	ad.id = id = config->aggr_map->map[s];
> +	ad.val = ad.ena = ad.run = 0;
> +	ad.nr = 0;
> +	if (!collect_data(config, counter, aggr_cb, &ad))
> +		return;
> +
> +	nr = ad.nr;
> +	ena = ad.ena;
> +	run = ad.run;
> +	val = ad.val;
> +	if (*first && metric_only) {
> +		*first = false;
> +		aggr_printout(config, counter, id, nr);
> +	}
> +	if (prefix && !metric_only)
> +		fprintf(output, "%s", prefix);
> +
> +	uval = val * counter->scale;
> +	printout(config, id, nr, counter, uval, prefix,
> +		 run, ena, 1.0, &rt_stat);
> +	if (!metric_only)
> +		fputc('\n', output);
> +}

plese put the factoring out of print_counter_aggrdata function
into separate patch

thanks,
jirka


> +
>  static void print_aggr(struct perf_stat_config *config,
>  		       struct perf_evlist *evlist,
>  		       char *prefix)
> @@ -606,9 +649,7 @@ static void print_aggr(struct perf_stat_config *config,
>  	bool metric_only = config->metric_only;
>  	FILE *output = config->output;
>  	struct perf_evsel *counter;
> -	int s, id, nr;
> -	double uval;
> -	u64 ena, run, val;
> +	int s;
>  	bool first;
>  
>  	if (!(config->aggr_map || config->aggr_get_id))
> @@ -621,36 +662,16 @@ static void print_aggr(struct perf_stat_config *config,
>  	 * Without each counter has its own line.
>  	 */
>  	for (s = 0; s < config->aggr_map->nr; s++) {
> -		struct aggr_data ad;
>  		if (prefix && metric_only)
>  			fprintf(output, "%s", prefix);
>  
> -		ad.id = id = config->aggr_map->map[s];
>  		first = true;
>  		evlist__for_each_entry(evlist, counter) {
>  			if (is_duration_time(counter))
>  				continue;
> -
> -			ad.val = ad.ena = ad.run = 0;
> -			ad.nr = 0;
> -			if (!collect_data(config, counter, aggr_cb, &ad))
> -				continue;
> -			nr = ad.nr;
> -			ena = ad.ena;
> -			run = ad.run;
> -			val = ad.val;
> -			if (first && metric_only) {
> -				first = false;
> -				aggr_printout(config, counter, id, nr);
> -			}
> -			if (prefix && !metric_only)
> -				fprintf(output, "%s", prefix);
> -
> -			uval = val * counter->scale;
> -			printout(config, id, nr, counter, uval, prefix,
> -				 run, ena, 1.0, &rt_stat);
> -			if (!metric_only)
> -				fputc('\n', output);
> +			print_counter_aggrdata(config, counter, s,
> +					       prefix, metric_only,
> +					       &first);

SNIP

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 0/3] perf: Support a new coresum event qualifier
@ 2019-03-15 16:04 Jin Yao
  2019-03-15 13:34 ` Jiri Olsa
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

The coresum event qualifier which sums up the event counts for both
hardware threads in a core. For example,

perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/

In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line.

We can already support per-core counting with --per-core, but it's
often useful to do this together with other metrics that are collected
per CPU (per hardware thread). So this patch series supports this
per-core counting on a event level.

Jin Yao (3):
  perf: Add a coresum event qualifier
  perf stat: Support coresum event qualifier
  perf test: Add a simple test for term coresum

 tools/perf/Documentation/perf-stat.txt |   4 ++
 tools/perf/builtin-stat.c              |  21 +++++++
 tools/perf/tests/parse-events.c        |  10 ++-
 tools/perf/util/evsel.c                |   2 +
 tools/perf/util/evsel.h                |   3 +
 tools/perf/util/parse-events.c         |  27 +++++++++
 tools/perf/util/parse-events.h         |   1 +
 tools/perf/util/parse-events.l         |   1 +
 tools/perf/util/stat-display.c         | 108 ++++++++++++++++++++++++---------
 tools/perf/util/stat.c                 |   8 ++-
 10 files changed, 151 insertions(+), 34 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v1 1/3] perf: Add a coresum event qualifier
  2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
  2019-03-15 13:34 ` Jiri Olsa
@ 2019-03-15 16:04 ` Jin Yao
  2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
  2019-03-15 16:04 ` [PATCH v1 3/3] perf test: Add a simple test for term coresum Jin Yao
  3 siblings, 0 replies; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

Add a coresum event qualifier, like cpu/event=0,umask=0x3,coresum=1/,
that sums up the event counts for both hardware threads in a core.

We can already do this with --per-core, but it's often useful to do
this together with other metrics that are collected per hardware thread.
So we need to support this per-core counting on a event level.

This can be implemented in only the user tool, no kernel support needed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/util/evsel.c        |  2 ++
 tools/perf/util/evsel.h        |  3 +++
 tools/perf/util/parse-events.c | 27 +++++++++++++++++++++++++++
 tools/perf/util/parse-events.h |  1 +
 tools/perf/util/parse-events.l |  1 +
 5 files changed, 34 insertions(+)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 3bbf73e..978e10b 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -803,6 +803,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			break;
 		case PERF_EVSEL__CONFIG_TERM_DRV_CFG:
 			break;
+		case PERF_EVSEL__CONFIG_TERM_CORESUM:
+			break;
 		default:
 			break;
 		}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index cc578e0..25a0777 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -50,6 +50,7 @@ enum term_type {
 	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_DRV_CFG,
 	PERF_EVSEL__CONFIG_TERM_BRANCH,
+	PERF_EVSEL__CONFIG_TERM_CORESUM,
 };
 
 struct perf_evsel_config_term {
@@ -67,6 +68,7 @@ struct perf_evsel_config_term {
 		bool	overwrite;
 		char	*branch;
 		unsigned long max_events;
+		bool	coresum;
 	} val;
 	bool weak;
 };
@@ -150,6 +152,7 @@ struct perf_evsel {
 	struct perf_evsel	**metric_events;
 	bool			collect_stat;
 	bool			weak_group;
+	bool			coresum;
 	const char		*pmu_name;
 };
 
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4dcc01b..9c2808c 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -930,6 +930,7 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
 	[PARSE_EVENTS__TERM_TYPE_OVERWRITE]		= "overwrite",
 	[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE]		= "no-overwrite",
 	[PARSE_EVENTS__TERM_TYPE_DRV_CFG]		= "driver-config",
+	[PARSE_EVENTS__TERM_TYPE_CORESUM]		= "coresum",
 };
 
 static bool config_term_shrinked;
@@ -950,6 +951,7 @@ config_term_avail(int term_type, struct parse_events_error *err)
 	case PARSE_EVENTS__TERM_TYPE_CONFIG2:
 	case PARSE_EVENTS__TERM_TYPE_NAME:
 	case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
+	case PARSE_EVENTS__TERM_TYPE_CORESUM:
 		return true;
 	default:
 		if (!err)
@@ -1041,6 +1043,14 @@ do {									   \
 	case PARSE_EVENTS__TERM_TYPE_MAX_EVENTS:
 		CHECK_TYPE_VAL(NUM);
 		break;
+	case PARSE_EVENTS__TERM_TYPE_CORESUM:
+		CHECK_TYPE_VAL(NUM);
+		if ((unsigned int)term->val.num > 1) {
+			err->str = strdup("expected 0 or 1");
+			err->idx = term->err_val;
+			return -EINVAL;
+		}
+		break;
 	default:
 		err->str = strdup("unknown term");
 		err->idx = term->err_term;
@@ -1179,6 +1189,10 @@ do {								\
 		case PARSE_EVENTS__TERM_TYPE_DRV_CFG:
 			ADD_CONFIG_TERM(DRV_CFG, drv_cfg, term->val.str);
 			break;
+		case PARSE_EVENTS__TERM_TYPE_CORESUM:
+			ADD_CONFIG_TERM(CORESUM, coresum,
+					term->val.num ? 1 : 0);
+			break;
 		default:
 			break;
 		}
@@ -1233,6 +1247,18 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
 			 get_config_name(head_config), &config_terms);
 }
 
+static bool config_term_coresum(struct list_head *config_terms)
+{
+	struct perf_evsel_config_term *term;
+
+	list_for_each_entry(term, config_terms, list) {
+		if (term->type == PERF_EVSEL__CONFIG_TERM_CORESUM)
+			return term->val.coresum ? true : false;
+	}
+
+	return false;
+}
+
 int parse_events_add_pmu(struct parse_events_state *parse_state,
 			 struct list_head *list, char *name,
 			 struct list_head *head_config,
@@ -1305,6 +1331,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state,
 		evsel->metric_name = info.metric_name;
 		evsel->pmu_name = name;
 		evsel->use_uncore_alias = use_uncore_alias;
+		evsel->coresum = config_term_coresum(&evsel->config_terms);
 	}
 
 	return evsel ? 0 : -ENOMEM;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 5ed035c..8d259c5 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -75,6 +75,7 @@ enum {
 	PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
 	PARSE_EVENTS__TERM_TYPE_OVERWRITE,
 	PARSE_EVENTS__TERM_TYPE_DRV_CFG,
+	PARSE_EVENTS__TERM_TYPE_CORESUM,
 	__PARSE_EVENTS__TERM_TYPE_NR,
 };
 
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 7805c71..3410bc5 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -274,6 +274,7 @@ inherit			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
 no-inherit		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
 overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
 no-overwrite		{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
+coresum			{ return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CORESUM); }
 ,			{ return ','; }
 "/"			{ BEGIN(INITIAL); return '/'; }
 {name_minus}		{ return str(yyscanner, PE_NAME); }
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v1 2/3] perf stat: Support coresum event qualifier
  2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
  2019-03-15 13:34 ` Jiri Olsa
  2019-03-15 16:04 ` [PATCH v1 1/3] perf: Add a " Jin Yao
@ 2019-03-15 16:04 ` Jin Yao
  2019-03-15 13:34   ` Jiri Olsa
  2019-03-15 16:04 ` [PATCH v1 3/3] perf test: Add a simple test for term coresum Jin Yao
  3 siblings, 1 reply; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

With this patch, we can use the coresum event qualifier in perf-stat.

root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/ -a -A -I1000
 #           time CPU                  counts      unit events
     1.000816660 S0-C0             80,676,154      cpu/event=0,umask=0x3,coresum=1/                                     (50.04%)
     1.000816660 S0-C1            115,314,959      cpu/event=0,umask=0x3,coresum=1/                                     (50.04%)
     1.000816660 S0-C2            126,541,249      cpu/event=0,umask=0x3,coresum=1/                                     (50.04%)
     1.000816660 S0-C3            119,950,015      cpu/event=0,umask=0x3,coresum=1/                                     (50.04%)
     1.000816660 CPU0              52,439,489      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU1              53,431,155      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU2              91,192,070      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU3              90,852,159      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU4              29,715,956      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU5              63,289,507      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU6              29,036,120      cpu/event=0,umask=0x3/                                        (49.96%)
     1.000816660 CPU7              28,996,591      cpu/event=0,umask=0x3/                                        (49.97%)

In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line. From the output, we can see:

  S0-C0 = CPU0 + CPU4
  S0-C1 = CPU1 + CPU5
  S0-C2 = CPU2 + CPU6
  S0-C3 = CPU3 + CPU7

So the result is expected (tiny difference is ignored).

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/Documentation/perf-stat.txt |   4 ++
 tools/perf/builtin-stat.c              |  21 +++++++
 tools/perf/util/stat-display.c         | 108 ++++++++++++++++++++++++---------
 tools/perf/util/stat.c                 |   8 ++-
 4 files changed, 108 insertions(+), 33 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 4bc2085..cd5fbd9 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -43,6 +43,10 @@ report::
 	  param1 and param2 are defined as formats for the PMU in
 	  /sys/bus/event_source/devices/<pmu>/format/*
 
+	  'coresum' is a event qualifier that sums up the event counts for both
+	  hardware threads in a core. For example:
+	  perf stat -A -a -e cpu/event,coresum=1/,otherevent ...
+
 	- a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
 	  where M, N, K are numbers (in decimal, hex, octal format).
 	  Acceptable values for each of 'config', 'config1' and 'config2'
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 7b8f09b..8ce0168 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -832,6 +832,18 @@ static int perf_stat__get_core_cached(struct perf_stat_config *config,
 	return perf_stat__get_aggr(config, perf_stat__get_core, map, idx);
 }
 
+static bool term_coresum_set(void)
+{
+	struct perf_evsel *counter;
+
+	evlist__for_each_entry(evsel_list, counter) {
+		if (counter->coresum)
+			return true;
+	}
+
+	return false;
+}
+
 static int perf_stat_init_aggr_mode(void)
 {
 	int nr;
@@ -852,6 +864,15 @@ static int perf_stat_init_aggr_mode(void)
 		stat_config.aggr_get_id = perf_stat__get_core_cached;
 		break;
 	case AGGR_NONE:
+		if (term_coresum_set()) {
+			if (cpu_map__build_core_map(evsel_list->cpus,
+						    &stat_config.aggr_map)) {
+				perror("cannot build core map");
+				return -1;
+			}
+			stat_config.aggr_get_id = perf_stat__get_core_cached;
+		}
+		break;
 	case AGGR_GLOBAL:
 	case AGGR_THREAD:
 	case AGGR_UNSET:
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 6d043c7..23c9368 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -93,9 +93,17 @@ static void aggr_printout(struct perf_stat_config *config,
 			config->csv_sep);
 			break;
 	case AGGR_NONE:
-		fprintf(config->output, "CPU%*d%s",
-			config->csv_output ? 0 : -4,
-			perf_evsel__cpus(evsel)->map[id], config->csv_sep);
+		if (evsel->coresum) {
+			fprintf(config->output, "S%d-C%*d%s",
+				cpu_map__id_to_socket(id),
+				config->csv_output ? 0 : -5,
+				cpu_map__id_to_cpu(id), config->csv_sep);
+		} else {
+			fprintf(config->output, "CPU%*d%s ",
+				config->csv_output ? 0 : -5,
+				perf_evsel__cpus(evsel)->map[id],
+				config->csv_sep);
+		}
 		break;
 	case AGGR_THREAD:
 		fprintf(config->output, "%*s-%*d%s",
@@ -599,6 +607,41 @@ static void aggr_cb(struct perf_stat_config *config,
 	}
 }
 
+static void print_counter_aggrdata(struct perf_stat_config *config,
+				   struct perf_evsel *counter, int s,
+				   char *prefix, bool metric_only,
+				   bool *first)
+{
+	struct aggr_data ad;
+	FILE *output = config->output;
+	u64 ena, run, val;
+	int id, nr;
+	double uval;
+
+	ad.id = id = config->aggr_map->map[s];
+	ad.val = ad.ena = ad.run = 0;
+	ad.nr = 0;
+	if (!collect_data(config, counter, aggr_cb, &ad))
+		return;
+
+	nr = ad.nr;
+	ena = ad.ena;
+	run = ad.run;
+	val = ad.val;
+	if (*first && metric_only) {
+		*first = false;
+		aggr_printout(config, counter, id, nr);
+	}
+	if (prefix && !metric_only)
+		fprintf(output, "%s", prefix);
+
+	uval = val * counter->scale;
+	printout(config, id, nr, counter, uval, prefix,
+		 run, ena, 1.0, &rt_stat);
+	if (!metric_only)
+		fputc('\n', output);
+}
+
 static void print_aggr(struct perf_stat_config *config,
 		       struct perf_evlist *evlist,
 		       char *prefix)
@@ -606,9 +649,7 @@ static void print_aggr(struct perf_stat_config *config,
 	bool metric_only = config->metric_only;
 	FILE *output = config->output;
 	struct perf_evsel *counter;
-	int s, id, nr;
-	double uval;
-	u64 ena, run, val;
+	int s;
 	bool first;
 
 	if (!(config->aggr_map || config->aggr_get_id))
@@ -621,36 +662,16 @@ static void print_aggr(struct perf_stat_config *config,
 	 * Without each counter has its own line.
 	 */
 	for (s = 0; s < config->aggr_map->nr; s++) {
-		struct aggr_data ad;
 		if (prefix && metric_only)
 			fprintf(output, "%s", prefix);
 
-		ad.id = id = config->aggr_map->map[s];
 		first = true;
 		evlist__for_each_entry(evlist, counter) {
 			if (is_duration_time(counter))
 				continue;
-
-			ad.val = ad.ena = ad.run = 0;
-			ad.nr = 0;
-			if (!collect_data(config, counter, aggr_cb, &ad))
-				continue;
-			nr = ad.nr;
-			ena = ad.ena;
-			run = ad.run;
-			val = ad.val;
-			if (first && metric_only) {
-				first = false;
-				aggr_printout(config, counter, id, nr);
-			}
-			if (prefix && !metric_only)
-				fprintf(output, "%s", prefix);
-
-			uval = val * counter->scale;
-			printout(config, id, nr, counter, uval, prefix,
-				 run, ena, 1.0, &rt_stat);
-			if (!metric_only)
-				fputc('\n', output);
+			print_counter_aggrdata(config, counter, s,
+					       prefix, metric_only,
+					       &first);
 		}
 		if (metric_only)
 			fputc('\n', output);
@@ -1101,6 +1122,30 @@ static void print_footer(struct perf_stat_config *config)
 			"the same PMU. Try reorganizing the group.\n");
 }
 
+static void print_coresum(struct perf_stat_config *config,
+			  struct perf_evsel *counter, char *prefix)
+{
+	bool metric_only = config->metric_only;
+	FILE *output = config->output;
+	int s;
+	bool first = true;
+
+	if (!(config->aggr_map || config->aggr_get_id))
+		return;
+
+	for (s = 0; s < config->aggr_map->nr; s++) {
+		if (prefix && metric_only)
+			fprintf(output, "%s", prefix);
+
+		print_counter_aggrdata(config, counter, s,
+				       prefix, metric_only,
+				       &first);
+	}
+
+	if (metric_only)
+		fputc('\n', output);
+}
+
 void
 perf_evlist__print_counters(struct perf_evlist *evlist,
 			    struct perf_stat_config *config,
@@ -1157,7 +1202,10 @@ perf_evlist__print_counters(struct perf_evlist *evlist,
 			evlist__for_each_entry(evlist, counter) {
 				if (is_duration_time(counter))
 					continue;
-				print_counter(config, counter, prefix);
+				if (counter->coresum)
+					print_coresum(config, counter, prefix);
+				else
+					print_counter(config, counter, prefix);
 			}
 		}
 		break;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 4d40515..33a6e3c 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -277,9 +277,11 @@ process_counter_values(struct perf_stat_config *config, struct perf_evsel *evsel
 		if (!evsel->snapshot)
 			perf_evsel__compute_deltas(evsel, cpu, thread, count);
 		perf_counts_values__scale(count, config->scale, NULL);
-		if (config->aggr_mode == AGGR_NONE)
-			perf_stat__update_shadow_stats(evsel, count->val, cpu,
-						       &rt_stat);
+		if ((config->aggr_mode == AGGR_NONE) && (!evsel->coresum)) {
+			perf_stat__update_shadow_stats(evsel, count->val,
+						       cpu, &rt_stat);
+		}
+
 		if (config->aggr_mode == AGGR_THREAD) {
 			if (config->stats)
 				perf_stat__update_shadow_stats(evsel,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v1 3/3] perf test: Add a simple test for term coresum
  2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
                   ` (2 preceding siblings ...)
  2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
@ 2019-03-15 16:04 ` Jin Yao
  3 siblings, 0 replies; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
  To: acme, jolsa, peterz, mingo, alexander.shishkin
  Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao

It's a simple test which just checks if parser works.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
 tools/perf/tests/parse-events.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 4a69c07..8d741d4 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -605,6 +605,14 @@ static int test__checkterms_simple(struct list_head *terms)
 	TEST_ASSERT_VAL("wrong val", term->val.num == 1);
 	TEST_ASSERT_VAL("wrong config", !strcmp(term->config, "umask"));
 
+	/* coresum=1*/
+	term = list_entry(term->list.next, struct parse_events_term, list);
+	TEST_ASSERT_VAL("wrong type term",
+			term->type_term == PARSE_EVENTS__TERM_TYPE_CORESUM);
+	TEST_ASSERT_VAL("wrong type val",
+			term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+	TEST_ASSERT_VAL("wrong val", term->val.num == 1);
+
 	return 0;
 }
 
@@ -1734,7 +1742,7 @@ struct terms_test {
 
 static struct terms_test test__terms[] = {
 	[0] = {
-		.str   = "config=10,config1,config2=3,umask=1",
+		.str   = "config=10,config1,config2=3,umask=1,coresum=1",
 		.check = test__checkterms_simple,
 	},
 };
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
  2019-03-15 13:34 ` Jiri Olsa
@ 2019-03-15 23:07   ` Jin, Yao
  2019-03-15 23:26     ` Andi Kleen
  0 siblings, 1 reply; 10+ messages in thread
From: Jin, Yao @ 2019-03-15 23:07 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 3/15/2019 9:34 PM, Jiri Olsa wrote:
> On Sat, Mar 16, 2019 at 12:04:13AM +0800, Jin Yao wrote:
>> The coresum event qualifier which sums up the event counts for both
>> hardware threads in a core. For example,
>>
>> perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/
>>
>> In this example, we count the event 'ref-cycles' per-core and per-CPU in
>> one perf stat command-line.
>>
>> We can already support per-core counting with --per-core, but it's
>> often useful to do this together with other metrics that are collected
>> per CPU (per hardware thread). So this patch series supports this
>> per-core counting on a event level.
> 
> seems useful, but perhaps we should follow the --per-core option
> we already have and call it 'per-core' instead of coresum
> 
> jirka
> 

Yes, the coresum's behavior is similar as --per-core option, just 
supports at the event level. I'm OK with calling it 'per-core'.

For example,
perf stat -e cpu/event=0,umask=0x3,per-core=1/

Thanks
Jin Yao

>>
>> Jin Yao (3):
>>    perf: Add a coresum event qualifier
>>    perf stat: Support coresum event qualifier
>>    perf test: Add a simple test for term coresum
>>
>>   tools/perf/Documentation/perf-stat.txt |   4 ++
>>   tools/perf/builtin-stat.c              |  21 +++++++
>>   tools/perf/tests/parse-events.c        |  10 ++-
>>   tools/perf/util/evsel.c                |   2 +
>>   tools/perf/util/evsel.h                |   3 +
>>   tools/perf/util/parse-events.c         |  27 +++++++++
>>   tools/perf/util/parse-events.h         |   1 +
>>   tools/perf/util/parse-events.l         |   1 +
>>   tools/perf/util/stat-display.c         | 108 ++++++++++++++++++++++++---------
>>   tools/perf/util/stat.c                 |   8 ++-
>>   10 files changed, 151 insertions(+), 34 deletions(-)
>>
>> -- 
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 2/3] perf stat: Support coresum event qualifier
  2019-03-15 13:34   ` Jiri Olsa
@ 2019-03-15 23:08     ` Jin, Yao
  0 siblings, 0 replies; 10+ messages in thread
From: Jin, Yao @ 2019-03-15 23:08 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
	kan.liang, yao.jin



On 3/15/2019 9:34 PM, Jiri Olsa wrote:
> On Sat, Mar 16, 2019 at 12:04:15AM +0800, Jin Yao wrote:
> 
> SNIP
> 
>> +static void print_counter_aggrdata(struct perf_stat_config *config,
>> +				   struct perf_evsel *counter, int s,
>> +				   char *prefix, bool metric_only,
>> +				   bool *first)
>> +{
>> +	struct aggr_data ad;
>> +	FILE *output = config->output;
>> +	u64 ena, run, val;
>> +	int id, nr;
>> +	double uval;
>> +
>> +	ad.id = id = config->aggr_map->map[s];
>> +	ad.val = ad.ena = ad.run = 0;
>> +	ad.nr = 0;
>> +	if (!collect_data(config, counter, aggr_cb, &ad))
>> +		return;
>> +
>> +	nr = ad.nr;
>> +	ena = ad.ena;
>> +	run = ad.run;
>> +	val = ad.val;
>> +	if (*first && metric_only) {
>> +		*first = false;
>> +		aggr_printout(config, counter, id, nr);
>> +	}
>> +	if (prefix && !metric_only)
>> +		fprintf(output, "%s", prefix);
>> +
>> +	uval = val * counter->scale;
>> +	printout(config, id, nr, counter, uval, prefix,
>> +		 run, ena, 1.0, &rt_stat);
>> +	if (!metric_only)
>> +		fputc('\n', output);
>> +}
> 
> plese put the factoring out of print_counter_aggrdata function
> into separate patch
> 
> thanks,
> jirka
> 
> 

OK!

Thanks
Jin Yao

>> +
>>   static void print_aggr(struct perf_stat_config *config,
>>   		       struct perf_evlist *evlist,
>>   		       char *prefix)
>> @@ -606,9 +649,7 @@ static void print_aggr(struct perf_stat_config *config,
>>   	bool metric_only = config->metric_only;
>>   	FILE *output = config->output;
>>   	struct perf_evsel *counter;
>> -	int s, id, nr;
>> -	double uval;
>> -	u64 ena, run, val;
>> +	int s;
>>   	bool first;
>>   
>>   	if (!(config->aggr_map || config->aggr_get_id))
>> @@ -621,36 +662,16 @@ static void print_aggr(struct perf_stat_config *config,
>>   	 * Without each counter has its own line.
>>   	 */
>>   	for (s = 0; s < config->aggr_map->nr; s++) {
>> -		struct aggr_data ad;
>>   		if (prefix && metric_only)
>>   			fprintf(output, "%s", prefix);
>>   
>> -		ad.id = id = config->aggr_map->map[s];
>>   		first = true;
>>   		evlist__for_each_entry(evlist, counter) {
>>   			if (is_duration_time(counter))
>>   				continue;
>> -
>> -			ad.val = ad.ena = ad.run = 0;
>> -			ad.nr = 0;
>> -			if (!collect_data(config, counter, aggr_cb, &ad))
>> -				continue;
>> -			nr = ad.nr;
>> -			ena = ad.ena;
>> -			run = ad.run;
>> -			val = ad.val;
>> -			if (first && metric_only) {
>> -				first = false;
>> -				aggr_printout(config, counter, id, nr);
>> -			}
>> -			if (prefix && !metric_only)
>> -				fprintf(output, "%s", prefix);
>> -
>> -			uval = val * counter->scale;
>> -			printout(config, id, nr, counter, uval, prefix,
>> -				 run, ena, 1.0, &rt_stat);
>> -			if (!metric_only)
>> -				fputc('\n', output);
>> +			print_counter_aggrdata(config, counter, s,
>> +					       prefix, metric_only,
>> +					       &first);
> 
> SNIP
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
  2019-03-15 23:07   ` Jin, Yao
@ 2019-03-15 23:26     ` Andi Kleen
  2019-03-15 23:31       ` Jin, Yao
  0 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2019-03-15 23:26 UTC (permalink / raw)
  To: Jin, Yao
  Cc: Jiri Olsa, acme, jolsa, peterz, mingo, alexander.shishkin,
	Linux-kernel, kan.liang, yao.jin

> Yes, the coresum's behavior is similar as --per-core option, just supports
> at the event level. I'm OK with calling it 'per-core'.
> 
> For example,
> perf stat -e cpu/event=0,umask=0x3,per-core=1/

Please use percore, the - would need to be escaped in metric expressions.

-Andi

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
  2019-03-15 23:26     ` Andi Kleen
@ 2019-03-15 23:31       ` Jin, Yao
  0 siblings, 0 replies; 10+ messages in thread
From: Jin, Yao @ 2019-03-15 23:31 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Jiri Olsa, acme, jolsa, peterz, mingo, alexander.shishkin,
	Linux-kernel, kan.liang, yao.jin



On 3/16/2019 7:26 AM, Andi Kleen wrote:
>> Yes, the coresum's behavior is similar as --per-core option, just supports
>> at the event level. I'm OK with calling it 'per-core'.
>>
>> For example,
>> perf stat -e cpu/event=0,umask=0x3,per-core=1/
> 
> Please use percore, the - would need to be escaped in metric expressions.
> 
> -Andi
> 

Oh, yes, thanks for reminding. Will use 'percore' in next version.

Thanks
Jin Yao

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-03-15 23:31 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
2019-03-15 13:34 ` Jiri Olsa
2019-03-15 23:07   ` Jin, Yao
2019-03-15 23:26     ` Andi Kleen
2019-03-15 23:31       ` Jin, Yao
2019-03-15 16:04 ` [PATCH v1 1/3] perf: Add a " Jin Yao
2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
2019-03-15 13:34   ` Jiri Olsa
2019-03-15 23:08     ` Jin, Yao
2019-03-15 16:04 ` [PATCH v1 3/3] perf test: Add a simple test for term coresum Jin Yao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.