* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
@ 2019-03-15 13:34 ` Jiri Olsa
2019-03-15 23:07 ` Jin, Yao
2019-03-15 16:04 ` [PATCH v1 1/3] perf: Add a " Jin Yao
` (2 subsequent siblings)
3 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2019-03-15 13:34 UTC (permalink / raw)
To: Jin Yao
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
On Sat, Mar 16, 2019 at 12:04:13AM +0800, Jin Yao wrote:
> The coresum event qualifier which sums up the event counts for both
> hardware threads in a core. For example,
>
> perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/
>
> In this example, we count the event 'ref-cycles' per-core and per-CPU in
> one perf stat command-line.
>
> We can already support per-core counting with --per-core, but it's
> often useful to do this together with other metrics that are collected
> per CPU (per hardware thread). So this patch series supports this
> per-core counting on a event level.
seems useful, but perhaps we should follow the --per-core option
we already have and call it 'per-core' instead of coresum
jirka
>
> Jin Yao (3):
> perf: Add a coresum event qualifier
> perf stat: Support coresum event qualifier
> perf test: Add a simple test for term coresum
>
> tools/perf/Documentation/perf-stat.txt | 4 ++
> tools/perf/builtin-stat.c | 21 +++++++
> tools/perf/tests/parse-events.c | 10 ++-
> tools/perf/util/evsel.c | 2 +
> tools/perf/util/evsel.h | 3 +
> tools/perf/util/parse-events.c | 27 +++++++++
> tools/perf/util/parse-events.h | 1 +
> tools/perf/util/parse-events.l | 1 +
> tools/perf/util/stat-display.c | 108 ++++++++++++++++++++++++---------
> tools/perf/util/stat.c | 8 ++-
> 10 files changed, 151 insertions(+), 34 deletions(-)
>
> --
> 2.7.4
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1 2/3] perf stat: Support coresum event qualifier
2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
@ 2019-03-15 13:34 ` Jiri Olsa
2019-03-15 23:08 ` Jin, Yao
0 siblings, 1 reply; 10+ messages in thread
From: Jiri Olsa @ 2019-03-15 13:34 UTC (permalink / raw)
To: Jin Yao
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
On Sat, Mar 16, 2019 at 12:04:15AM +0800, Jin Yao wrote:
SNIP
> +static void print_counter_aggrdata(struct perf_stat_config *config,
> + struct perf_evsel *counter, int s,
> + char *prefix, bool metric_only,
> + bool *first)
> +{
> + struct aggr_data ad;
> + FILE *output = config->output;
> + u64 ena, run, val;
> + int id, nr;
> + double uval;
> +
> + ad.id = id = config->aggr_map->map[s];
> + ad.val = ad.ena = ad.run = 0;
> + ad.nr = 0;
> + if (!collect_data(config, counter, aggr_cb, &ad))
> + return;
> +
> + nr = ad.nr;
> + ena = ad.ena;
> + run = ad.run;
> + val = ad.val;
> + if (*first && metric_only) {
> + *first = false;
> + aggr_printout(config, counter, id, nr);
> + }
> + if (prefix && !metric_only)
> + fprintf(output, "%s", prefix);
> +
> + uval = val * counter->scale;
> + printout(config, id, nr, counter, uval, prefix,
> + run, ena, 1.0, &rt_stat);
> + if (!metric_only)
> + fputc('\n', output);
> +}
plese put the factoring out of print_counter_aggrdata function
into separate patch
thanks,
jirka
> +
> static void print_aggr(struct perf_stat_config *config,
> struct perf_evlist *evlist,
> char *prefix)
> @@ -606,9 +649,7 @@ static void print_aggr(struct perf_stat_config *config,
> bool metric_only = config->metric_only;
> FILE *output = config->output;
> struct perf_evsel *counter;
> - int s, id, nr;
> - double uval;
> - u64 ena, run, val;
> + int s;
> bool first;
>
> if (!(config->aggr_map || config->aggr_get_id))
> @@ -621,36 +662,16 @@ static void print_aggr(struct perf_stat_config *config,
> * Without each counter has its own line.
> */
> for (s = 0; s < config->aggr_map->nr; s++) {
> - struct aggr_data ad;
> if (prefix && metric_only)
> fprintf(output, "%s", prefix);
>
> - ad.id = id = config->aggr_map->map[s];
> first = true;
> evlist__for_each_entry(evlist, counter) {
> if (is_duration_time(counter))
> continue;
> -
> - ad.val = ad.ena = ad.run = 0;
> - ad.nr = 0;
> - if (!collect_data(config, counter, aggr_cb, &ad))
> - continue;
> - nr = ad.nr;
> - ena = ad.ena;
> - run = ad.run;
> - val = ad.val;
> - if (first && metric_only) {
> - first = false;
> - aggr_printout(config, counter, id, nr);
> - }
> - if (prefix && !metric_only)
> - fprintf(output, "%s", prefix);
> -
> - uval = val * counter->scale;
> - printout(config, id, nr, counter, uval, prefix,
> - run, ena, 1.0, &rt_stat);
> - if (!metric_only)
> - fputc('\n', output);
> + print_counter_aggrdata(config, counter, s,
> + prefix, metric_only,
> + &first);
SNIP
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v1 0/3] perf: Support a new coresum event qualifier
@ 2019-03-15 16:04 Jin Yao
2019-03-15 13:34 ` Jiri Olsa
` (3 more replies)
0 siblings, 4 replies; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
The coresum event qualifier which sums up the event counts for both
hardware threads in a core. For example,
perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/
In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line.
We can already support per-core counting with --per-core, but it's
often useful to do this together with other metrics that are collected
per CPU (per hardware thread). So this patch series supports this
per-core counting on a event level.
Jin Yao (3):
perf: Add a coresum event qualifier
perf stat: Support coresum event qualifier
perf test: Add a simple test for term coresum
tools/perf/Documentation/perf-stat.txt | 4 ++
tools/perf/builtin-stat.c | 21 +++++++
tools/perf/tests/parse-events.c | 10 ++-
tools/perf/util/evsel.c | 2 +
tools/perf/util/evsel.h | 3 +
tools/perf/util/parse-events.c | 27 +++++++++
tools/perf/util/parse-events.h | 1 +
tools/perf/util/parse-events.l | 1 +
tools/perf/util/stat-display.c | 108 ++++++++++++++++++++++++---------
tools/perf/util/stat.c | 8 ++-
10 files changed, 151 insertions(+), 34 deletions(-)
--
2.7.4
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH v1 1/3] perf: Add a coresum event qualifier
2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
2019-03-15 13:34 ` Jiri Olsa
@ 2019-03-15 16:04 ` Jin Yao
2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
2019-03-15 16:04 ` [PATCH v1 3/3] perf test: Add a simple test for term coresum Jin Yao
3 siblings, 0 replies; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
Add a coresum event qualifier, like cpu/event=0,umask=0x3,coresum=1/,
that sums up the event counts for both hardware threads in a core.
We can already do this with --per-core, but it's often useful to do
this together with other metrics that are collected per hardware thread.
So we need to support this per-core counting on a event level.
This can be implemented in only the user tool, no kernel support needed.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/util/evsel.c | 2 ++
tools/perf/util/evsel.h | 3 +++
tools/perf/util/parse-events.c | 27 +++++++++++++++++++++++++++
tools/perf/util/parse-events.h | 1 +
tools/perf/util/parse-events.l | 1 +
5 files changed, 34 insertions(+)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 3bbf73e..978e10b 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -803,6 +803,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
break;
case PERF_EVSEL__CONFIG_TERM_DRV_CFG:
break;
+ case PERF_EVSEL__CONFIG_TERM_CORESUM:
+ break;
default:
break;
}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index cc578e0..25a0777 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -50,6 +50,7 @@ enum term_type {
PERF_EVSEL__CONFIG_TERM_OVERWRITE,
PERF_EVSEL__CONFIG_TERM_DRV_CFG,
PERF_EVSEL__CONFIG_TERM_BRANCH,
+ PERF_EVSEL__CONFIG_TERM_CORESUM,
};
struct perf_evsel_config_term {
@@ -67,6 +68,7 @@ struct perf_evsel_config_term {
bool overwrite;
char *branch;
unsigned long max_events;
+ bool coresum;
} val;
bool weak;
};
@@ -150,6 +152,7 @@ struct perf_evsel {
struct perf_evsel **metric_events;
bool collect_stat;
bool weak_group;
+ bool coresum;
const char *pmu_name;
};
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 4dcc01b..9c2808c 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -930,6 +930,7 @@ static const char *config_term_names[__PARSE_EVENTS__TERM_TYPE_NR] = {
[PARSE_EVENTS__TERM_TYPE_OVERWRITE] = "overwrite",
[PARSE_EVENTS__TERM_TYPE_NOOVERWRITE] = "no-overwrite",
[PARSE_EVENTS__TERM_TYPE_DRV_CFG] = "driver-config",
+ [PARSE_EVENTS__TERM_TYPE_CORESUM] = "coresum",
};
static bool config_term_shrinked;
@@ -950,6 +951,7 @@ config_term_avail(int term_type, struct parse_events_error *err)
case PARSE_EVENTS__TERM_TYPE_CONFIG2:
case PARSE_EVENTS__TERM_TYPE_NAME:
case PARSE_EVENTS__TERM_TYPE_SAMPLE_PERIOD:
+ case PARSE_EVENTS__TERM_TYPE_CORESUM:
return true;
default:
if (!err)
@@ -1041,6 +1043,14 @@ do { \
case PARSE_EVENTS__TERM_TYPE_MAX_EVENTS:
CHECK_TYPE_VAL(NUM);
break;
+ case PARSE_EVENTS__TERM_TYPE_CORESUM:
+ CHECK_TYPE_VAL(NUM);
+ if ((unsigned int)term->val.num > 1) {
+ err->str = strdup("expected 0 or 1");
+ err->idx = term->err_val;
+ return -EINVAL;
+ }
+ break;
default:
err->str = strdup("unknown term");
err->idx = term->err_term;
@@ -1179,6 +1189,10 @@ do { \
case PARSE_EVENTS__TERM_TYPE_DRV_CFG:
ADD_CONFIG_TERM(DRV_CFG, drv_cfg, term->val.str);
break;
+ case PARSE_EVENTS__TERM_TYPE_CORESUM:
+ ADD_CONFIG_TERM(CORESUM, coresum,
+ term->val.num ? 1 : 0);
+ break;
default:
break;
}
@@ -1233,6 +1247,18 @@ int parse_events_add_numeric(struct parse_events_state *parse_state,
get_config_name(head_config), &config_terms);
}
+static bool config_term_coresum(struct list_head *config_terms)
+{
+ struct perf_evsel_config_term *term;
+
+ list_for_each_entry(term, config_terms, list) {
+ if (term->type == PERF_EVSEL__CONFIG_TERM_CORESUM)
+ return term->val.coresum ? true : false;
+ }
+
+ return false;
+}
+
int parse_events_add_pmu(struct parse_events_state *parse_state,
struct list_head *list, char *name,
struct list_head *head_config,
@@ -1305,6 +1331,7 @@ int parse_events_add_pmu(struct parse_events_state *parse_state,
evsel->metric_name = info.metric_name;
evsel->pmu_name = name;
evsel->use_uncore_alias = use_uncore_alias;
+ evsel->coresum = config_term_coresum(&evsel->config_terms);
}
return evsel ? 0 : -ENOMEM;
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 5ed035c..8d259c5 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -75,6 +75,7 @@ enum {
PARSE_EVENTS__TERM_TYPE_NOOVERWRITE,
PARSE_EVENTS__TERM_TYPE_OVERWRITE,
PARSE_EVENTS__TERM_TYPE_DRV_CFG,
+ PARSE_EVENTS__TERM_TYPE_CORESUM,
__PARSE_EVENTS__TERM_TYPE_NR,
};
diff --git a/tools/perf/util/parse-events.l b/tools/perf/util/parse-events.l
index 7805c71..3410bc5 100644
--- a/tools/perf/util/parse-events.l
+++ b/tools/perf/util/parse-events.l
@@ -274,6 +274,7 @@ inherit { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_INHERIT); }
no-inherit { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOINHERIT); }
overwrite { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_OVERWRITE); }
no-overwrite { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_NOOVERWRITE); }
+coresum { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_CORESUM); }
, { return ','; }
"/" { BEGIN(INITIAL); return '/'; }
{name_minus} { return str(yyscanner, PE_NAME); }
--
2.7.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v1 2/3] perf stat: Support coresum event qualifier
2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
2019-03-15 13:34 ` Jiri Olsa
2019-03-15 16:04 ` [PATCH v1 1/3] perf: Add a " Jin Yao
@ 2019-03-15 16:04 ` Jin Yao
2019-03-15 13:34 ` Jiri Olsa
2019-03-15 16:04 ` [PATCH v1 3/3] perf test: Add a simple test for term coresum Jin Yao
3 siblings, 1 reply; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
With this patch, we can use the coresum event qualifier in perf-stat.
root@skl:/tmp# perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/ -a -A -I1000
# time CPU counts unit events
1.000816660 S0-C0 80,676,154 cpu/event=0,umask=0x3,coresum=1/ (50.04%)
1.000816660 S0-C1 115,314,959 cpu/event=0,umask=0x3,coresum=1/ (50.04%)
1.000816660 S0-C2 126,541,249 cpu/event=0,umask=0x3,coresum=1/ (50.04%)
1.000816660 S0-C3 119,950,015 cpu/event=0,umask=0x3,coresum=1/ (50.04%)
1.000816660 CPU0 52,439,489 cpu/event=0,umask=0x3/ (49.96%)
1.000816660 CPU1 53,431,155 cpu/event=0,umask=0x3/ (49.96%)
1.000816660 CPU2 91,192,070 cpu/event=0,umask=0x3/ (49.96%)
1.000816660 CPU3 90,852,159 cpu/event=0,umask=0x3/ (49.96%)
1.000816660 CPU4 29,715,956 cpu/event=0,umask=0x3/ (49.96%)
1.000816660 CPU5 63,289,507 cpu/event=0,umask=0x3/ (49.96%)
1.000816660 CPU6 29,036,120 cpu/event=0,umask=0x3/ (49.96%)
1.000816660 CPU7 28,996,591 cpu/event=0,umask=0x3/ (49.97%)
In this example, we count the event 'ref-cycles' per-core and per-CPU in
one perf stat command-line. From the output, we can see:
S0-C0 = CPU0 + CPU4
S0-C1 = CPU1 + CPU5
S0-C2 = CPU2 + CPU6
S0-C3 = CPU3 + CPU7
So the result is expected (tiny difference is ignored).
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/Documentation/perf-stat.txt | 4 ++
tools/perf/builtin-stat.c | 21 +++++++
tools/perf/util/stat-display.c | 108 ++++++++++++++++++++++++---------
tools/perf/util/stat.c | 8 ++-
4 files changed, 108 insertions(+), 33 deletions(-)
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 4bc2085..cd5fbd9 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -43,6 +43,10 @@ report::
param1 and param2 are defined as formats for the PMU in
/sys/bus/event_source/devices/<pmu>/format/*
+ 'coresum' is a event qualifier that sums up the event counts for both
+ hardware threads in a core. For example:
+ perf stat -A -a -e cpu/event,coresum=1/,otherevent ...
+
- a symbolically formed event like 'pmu/config=M,config1=N,config2=K/'
where M, N, K are numbers (in decimal, hex, octal format).
Acceptable values for each of 'config', 'config1' and 'config2'
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 7b8f09b..8ce0168 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -832,6 +832,18 @@ static int perf_stat__get_core_cached(struct perf_stat_config *config,
return perf_stat__get_aggr(config, perf_stat__get_core, map, idx);
}
+static bool term_coresum_set(void)
+{
+ struct perf_evsel *counter;
+
+ evlist__for_each_entry(evsel_list, counter) {
+ if (counter->coresum)
+ return true;
+ }
+
+ return false;
+}
+
static int perf_stat_init_aggr_mode(void)
{
int nr;
@@ -852,6 +864,15 @@ static int perf_stat_init_aggr_mode(void)
stat_config.aggr_get_id = perf_stat__get_core_cached;
break;
case AGGR_NONE:
+ if (term_coresum_set()) {
+ if (cpu_map__build_core_map(evsel_list->cpus,
+ &stat_config.aggr_map)) {
+ perror("cannot build core map");
+ return -1;
+ }
+ stat_config.aggr_get_id = perf_stat__get_core_cached;
+ }
+ break;
case AGGR_GLOBAL:
case AGGR_THREAD:
case AGGR_UNSET:
diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 6d043c7..23c9368 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -93,9 +93,17 @@ static void aggr_printout(struct perf_stat_config *config,
config->csv_sep);
break;
case AGGR_NONE:
- fprintf(config->output, "CPU%*d%s",
- config->csv_output ? 0 : -4,
- perf_evsel__cpus(evsel)->map[id], config->csv_sep);
+ if (evsel->coresum) {
+ fprintf(config->output, "S%d-C%*d%s",
+ cpu_map__id_to_socket(id),
+ config->csv_output ? 0 : -5,
+ cpu_map__id_to_cpu(id), config->csv_sep);
+ } else {
+ fprintf(config->output, "CPU%*d%s ",
+ config->csv_output ? 0 : -5,
+ perf_evsel__cpus(evsel)->map[id],
+ config->csv_sep);
+ }
break;
case AGGR_THREAD:
fprintf(config->output, "%*s-%*d%s",
@@ -599,6 +607,41 @@ static void aggr_cb(struct perf_stat_config *config,
}
}
+static void print_counter_aggrdata(struct perf_stat_config *config,
+ struct perf_evsel *counter, int s,
+ char *prefix, bool metric_only,
+ bool *first)
+{
+ struct aggr_data ad;
+ FILE *output = config->output;
+ u64 ena, run, val;
+ int id, nr;
+ double uval;
+
+ ad.id = id = config->aggr_map->map[s];
+ ad.val = ad.ena = ad.run = 0;
+ ad.nr = 0;
+ if (!collect_data(config, counter, aggr_cb, &ad))
+ return;
+
+ nr = ad.nr;
+ ena = ad.ena;
+ run = ad.run;
+ val = ad.val;
+ if (*first && metric_only) {
+ *first = false;
+ aggr_printout(config, counter, id, nr);
+ }
+ if (prefix && !metric_only)
+ fprintf(output, "%s", prefix);
+
+ uval = val * counter->scale;
+ printout(config, id, nr, counter, uval, prefix,
+ run, ena, 1.0, &rt_stat);
+ if (!metric_only)
+ fputc('\n', output);
+}
+
static void print_aggr(struct perf_stat_config *config,
struct perf_evlist *evlist,
char *prefix)
@@ -606,9 +649,7 @@ static void print_aggr(struct perf_stat_config *config,
bool metric_only = config->metric_only;
FILE *output = config->output;
struct perf_evsel *counter;
- int s, id, nr;
- double uval;
- u64 ena, run, val;
+ int s;
bool first;
if (!(config->aggr_map || config->aggr_get_id))
@@ -621,36 +662,16 @@ static void print_aggr(struct perf_stat_config *config,
* Without each counter has its own line.
*/
for (s = 0; s < config->aggr_map->nr; s++) {
- struct aggr_data ad;
if (prefix && metric_only)
fprintf(output, "%s", prefix);
- ad.id = id = config->aggr_map->map[s];
first = true;
evlist__for_each_entry(evlist, counter) {
if (is_duration_time(counter))
continue;
-
- ad.val = ad.ena = ad.run = 0;
- ad.nr = 0;
- if (!collect_data(config, counter, aggr_cb, &ad))
- continue;
- nr = ad.nr;
- ena = ad.ena;
- run = ad.run;
- val = ad.val;
- if (first && metric_only) {
- first = false;
- aggr_printout(config, counter, id, nr);
- }
- if (prefix && !metric_only)
- fprintf(output, "%s", prefix);
-
- uval = val * counter->scale;
- printout(config, id, nr, counter, uval, prefix,
- run, ena, 1.0, &rt_stat);
- if (!metric_only)
- fputc('\n', output);
+ print_counter_aggrdata(config, counter, s,
+ prefix, metric_only,
+ &first);
}
if (metric_only)
fputc('\n', output);
@@ -1101,6 +1122,30 @@ static void print_footer(struct perf_stat_config *config)
"the same PMU. Try reorganizing the group.\n");
}
+static void print_coresum(struct perf_stat_config *config,
+ struct perf_evsel *counter, char *prefix)
+{
+ bool metric_only = config->metric_only;
+ FILE *output = config->output;
+ int s;
+ bool first = true;
+
+ if (!(config->aggr_map || config->aggr_get_id))
+ return;
+
+ for (s = 0; s < config->aggr_map->nr; s++) {
+ if (prefix && metric_only)
+ fprintf(output, "%s", prefix);
+
+ print_counter_aggrdata(config, counter, s,
+ prefix, metric_only,
+ &first);
+ }
+
+ if (metric_only)
+ fputc('\n', output);
+}
+
void
perf_evlist__print_counters(struct perf_evlist *evlist,
struct perf_stat_config *config,
@@ -1157,7 +1202,10 @@ perf_evlist__print_counters(struct perf_evlist *evlist,
evlist__for_each_entry(evlist, counter) {
if (is_duration_time(counter))
continue;
- print_counter(config, counter, prefix);
+ if (counter->coresum)
+ print_coresum(config, counter, prefix);
+ else
+ print_counter(config, counter, prefix);
}
}
break;
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 4d40515..33a6e3c 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -277,9 +277,11 @@ process_counter_values(struct perf_stat_config *config, struct perf_evsel *evsel
if (!evsel->snapshot)
perf_evsel__compute_deltas(evsel, cpu, thread, count);
perf_counts_values__scale(count, config->scale, NULL);
- if (config->aggr_mode == AGGR_NONE)
- perf_stat__update_shadow_stats(evsel, count->val, cpu,
- &rt_stat);
+ if ((config->aggr_mode == AGGR_NONE) && (!evsel->coresum)) {
+ perf_stat__update_shadow_stats(evsel, count->val,
+ cpu, &rt_stat);
+ }
+
if (config->aggr_mode == AGGR_THREAD) {
if (config->stats)
perf_stat__update_shadow_stats(evsel,
--
2.7.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* [PATCH v1 3/3] perf test: Add a simple test for term coresum
2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
` (2 preceding siblings ...)
2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
@ 2019-03-15 16:04 ` Jin Yao
3 siblings, 0 replies; 10+ messages in thread
From: Jin Yao @ 2019-03-15 16:04 UTC (permalink / raw)
To: acme, jolsa, peterz, mingo, alexander.shishkin
Cc: Linux-kernel, ak, kan.liang, yao.jin, Jin Yao
It's a simple test which just checks if parser works.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
---
tools/perf/tests/parse-events.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index 4a69c07..8d741d4 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -605,6 +605,14 @@ static int test__checkterms_simple(struct list_head *terms)
TEST_ASSERT_VAL("wrong val", term->val.num == 1);
TEST_ASSERT_VAL("wrong config", !strcmp(term->config, "umask"));
+ /* coresum=1*/
+ term = list_entry(term->list.next, struct parse_events_term, list);
+ TEST_ASSERT_VAL("wrong type term",
+ term->type_term == PARSE_EVENTS__TERM_TYPE_CORESUM);
+ TEST_ASSERT_VAL("wrong type val",
+ term->type_val == PARSE_EVENTS__TERM_TYPE_NUM);
+ TEST_ASSERT_VAL("wrong val", term->val.num == 1);
+
return 0;
}
@@ -1734,7 +1742,7 @@ struct terms_test {
static struct terms_test test__terms[] = {
[0] = {
- .str = "config=10,config1,config2=3,umask=1",
+ .str = "config=10,config1,config2=3,umask=1,coresum=1",
.check = test__checkterms_simple,
},
};
--
2.7.4
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
2019-03-15 13:34 ` Jiri Olsa
@ 2019-03-15 23:07 ` Jin, Yao
2019-03-15 23:26 ` Andi Kleen
0 siblings, 1 reply; 10+ messages in thread
From: Jin, Yao @ 2019-03-15 23:07 UTC (permalink / raw)
To: Jiri Olsa
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
On 3/15/2019 9:34 PM, Jiri Olsa wrote:
> On Sat, Mar 16, 2019 at 12:04:13AM +0800, Jin Yao wrote:
>> The coresum event qualifier which sums up the event counts for both
>> hardware threads in a core. For example,
>>
>> perf stat -e cpu/event=0,umask=0x3,coresum=1/,cpu/event=0,umask=0x3/
>>
>> In this example, we count the event 'ref-cycles' per-core and per-CPU in
>> one perf stat command-line.
>>
>> We can already support per-core counting with --per-core, but it's
>> often useful to do this together with other metrics that are collected
>> per CPU (per hardware thread). So this patch series supports this
>> per-core counting on a event level.
>
> seems useful, but perhaps we should follow the --per-core option
> we already have and call it 'per-core' instead of coresum
>
> jirka
>
Yes, the coresum's behavior is similar as --per-core option, just
supports at the event level. I'm OK with calling it 'per-core'.
For example,
perf stat -e cpu/event=0,umask=0x3,per-core=1/
Thanks
Jin Yao
>>
>> Jin Yao (3):
>> perf: Add a coresum event qualifier
>> perf stat: Support coresum event qualifier
>> perf test: Add a simple test for term coresum
>>
>> tools/perf/Documentation/perf-stat.txt | 4 ++
>> tools/perf/builtin-stat.c | 21 +++++++
>> tools/perf/tests/parse-events.c | 10 ++-
>> tools/perf/util/evsel.c | 2 +
>> tools/perf/util/evsel.h | 3 +
>> tools/perf/util/parse-events.c | 27 +++++++++
>> tools/perf/util/parse-events.h | 1 +
>> tools/perf/util/parse-events.l | 1 +
>> tools/perf/util/stat-display.c | 108 ++++++++++++++++++++++++---------
>> tools/perf/util/stat.c | 8 ++-
>> 10 files changed, 151 insertions(+), 34 deletions(-)
>>
>> --
>> 2.7.4
>>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1 2/3] perf stat: Support coresum event qualifier
2019-03-15 13:34 ` Jiri Olsa
@ 2019-03-15 23:08 ` Jin, Yao
0 siblings, 0 replies; 10+ messages in thread
From: Jin, Yao @ 2019-03-15 23:08 UTC (permalink / raw)
To: Jiri Olsa
Cc: acme, jolsa, peterz, mingo, alexander.shishkin, Linux-kernel, ak,
kan.liang, yao.jin
On 3/15/2019 9:34 PM, Jiri Olsa wrote:
> On Sat, Mar 16, 2019 at 12:04:15AM +0800, Jin Yao wrote:
>
> SNIP
>
>> +static void print_counter_aggrdata(struct perf_stat_config *config,
>> + struct perf_evsel *counter, int s,
>> + char *prefix, bool metric_only,
>> + bool *first)
>> +{
>> + struct aggr_data ad;
>> + FILE *output = config->output;
>> + u64 ena, run, val;
>> + int id, nr;
>> + double uval;
>> +
>> + ad.id = id = config->aggr_map->map[s];
>> + ad.val = ad.ena = ad.run = 0;
>> + ad.nr = 0;
>> + if (!collect_data(config, counter, aggr_cb, &ad))
>> + return;
>> +
>> + nr = ad.nr;
>> + ena = ad.ena;
>> + run = ad.run;
>> + val = ad.val;
>> + if (*first && metric_only) {
>> + *first = false;
>> + aggr_printout(config, counter, id, nr);
>> + }
>> + if (prefix && !metric_only)
>> + fprintf(output, "%s", prefix);
>> +
>> + uval = val * counter->scale;
>> + printout(config, id, nr, counter, uval, prefix,
>> + run, ena, 1.0, &rt_stat);
>> + if (!metric_only)
>> + fputc('\n', output);
>> +}
>
> plese put the factoring out of print_counter_aggrdata function
> into separate patch
>
> thanks,
> jirka
>
>
OK!
Thanks
Jin Yao
>> +
>> static void print_aggr(struct perf_stat_config *config,
>> struct perf_evlist *evlist,
>> char *prefix)
>> @@ -606,9 +649,7 @@ static void print_aggr(struct perf_stat_config *config,
>> bool metric_only = config->metric_only;
>> FILE *output = config->output;
>> struct perf_evsel *counter;
>> - int s, id, nr;
>> - double uval;
>> - u64 ena, run, val;
>> + int s;
>> bool first;
>>
>> if (!(config->aggr_map || config->aggr_get_id))
>> @@ -621,36 +662,16 @@ static void print_aggr(struct perf_stat_config *config,
>> * Without each counter has its own line.
>> */
>> for (s = 0; s < config->aggr_map->nr; s++) {
>> - struct aggr_data ad;
>> if (prefix && metric_only)
>> fprintf(output, "%s", prefix);
>>
>> - ad.id = id = config->aggr_map->map[s];
>> first = true;
>> evlist__for_each_entry(evlist, counter) {
>> if (is_duration_time(counter))
>> continue;
>> -
>> - ad.val = ad.ena = ad.run = 0;
>> - ad.nr = 0;
>> - if (!collect_data(config, counter, aggr_cb, &ad))
>> - continue;
>> - nr = ad.nr;
>> - ena = ad.ena;
>> - run = ad.run;
>> - val = ad.val;
>> - if (first && metric_only) {
>> - first = false;
>> - aggr_printout(config, counter, id, nr);
>> - }
>> - if (prefix && !metric_only)
>> - fprintf(output, "%s", prefix);
>> -
>> - uval = val * counter->scale;
>> - printout(config, id, nr, counter, uval, prefix,
>> - run, ena, 1.0, &rt_stat);
>> - if (!metric_only)
>> - fputc('\n', output);
>> + print_counter_aggrdata(config, counter, s,
>> + prefix, metric_only,
>> + &first);
>
> SNIP
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
2019-03-15 23:07 ` Jin, Yao
@ 2019-03-15 23:26 ` Andi Kleen
2019-03-15 23:31 ` Jin, Yao
0 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2019-03-15 23:26 UTC (permalink / raw)
To: Jin, Yao
Cc: Jiri Olsa, acme, jolsa, peterz, mingo, alexander.shishkin,
Linux-kernel, kan.liang, yao.jin
> Yes, the coresum's behavior is similar as --per-core option, just supports
> at the event level. I'm OK with calling it 'per-core'.
>
> For example,
> perf stat -e cpu/event=0,umask=0x3,per-core=1/
Please use percore, the - would need to be escaped in metric expressions.
-Andi
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v1 0/3] perf: Support a new coresum event qualifier
2019-03-15 23:26 ` Andi Kleen
@ 2019-03-15 23:31 ` Jin, Yao
0 siblings, 0 replies; 10+ messages in thread
From: Jin, Yao @ 2019-03-15 23:31 UTC (permalink / raw)
To: Andi Kleen
Cc: Jiri Olsa, acme, jolsa, peterz, mingo, alexander.shishkin,
Linux-kernel, kan.liang, yao.jin
On 3/16/2019 7:26 AM, Andi Kleen wrote:
>> Yes, the coresum's behavior is similar as --per-core option, just supports
>> at the event level. I'm OK with calling it 'per-core'.
>>
>> For example,
>> perf stat -e cpu/event=0,umask=0x3,per-core=1/
>
> Please use percore, the - would need to be escaped in metric expressions.
>
> -Andi
>
Oh, yes, thanks for reminding. Will use 'percore' in next version.
Thanks
Jin Yao
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2019-03-15 23:31 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-15 16:04 [PATCH v1 0/3] perf: Support a new coresum event qualifier Jin Yao
2019-03-15 13:34 ` Jiri Olsa
2019-03-15 23:07 ` Jin, Yao
2019-03-15 23:26 ` Andi Kleen
2019-03-15 23:31 ` Jin, Yao
2019-03-15 16:04 ` [PATCH v1 1/3] perf: Add a " Jin Yao
2019-03-15 16:04 ` [PATCH v1 2/3] perf stat: Support " Jin Yao
2019-03-15 13:34 ` Jiri Olsa
2019-03-15 23:08 ` Jin, Yao
2019-03-15 16:04 ` [PATCH v1 3/3] perf test: Add a simple test for term coresum Jin Yao
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).