linux-kernel.vger.kernel.org archive mirror
* [PATCH v6] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events
@ 2020-02-21 10:11 Kajol Jain
  2020-03-17  6:21 ` kajoljain
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Kajol Jain @ 2020-02-21 10:11 UTC
  To: acme
  Cc: linux-kernel, linux-perf-users, kjain, Jiri Olsa,
	Alexander Shishkin, Andi Kleen, Kan Liang, Peter Zijlstra,
	Jin Yao, Madhavan Srinivasan, Anju T Sudhakar, Ravi Bangoria

Commit f01642e4912b ("perf metricgroup: Support multiple
events for metricgroup") introduced support for multiple events
in a metric group. With current upstream, however, metric event
names are not printed properly when we run multiple metric
groups with overlapping events.

The problem is that the comparison logic always starts from the
beginning of the event list, so events that already matched some
metric group take part in the comparison again. As a result, when
events overlap, we end up matching the current metric group's event
with one that is already matched.

For example, on a Skylake machine we have the metric events CoreIPC
and Instructions. Both of them need the 'inst_retired.any' event value.
As the events in Instructions are a subset of the events in CoreIPC,
they end up pointing to the same 'inst_retired.any' value.

On the Skylake platform:

command:# ./perf stat -M CoreIPC,Instructions  -C 0 sleep 1

 Performance counter stats for 'CPU(s) 0':

     1,254,992,790      inst_retired.any          # 1254992790.0
                                                    Instructions
                                                  #      1.3 CoreIPC
       977,172,805      cycles
     1,254,992,756      inst_retired.any

       1.000802596 seconds time elapsed

command:# sudo ./perf stat -M UPI,IPC sleep 1

   Performance counter stats for 'sleep 1':
           948,650      uops_retired.retire_slots
           866,182      inst_retired.any          #      0.7 IPC
           866,182      inst_retired.any
         1,175,671      cpu_clk_unhalted.thread

The patch fixes the issue by adding a new bool pointer 'evlist_used' that
keeps track of events which already matched some group, by setting the
corresponding entry to true. We then skip all used events in the list when
we run the comparison logic. The patch also changes the comparison logic:
in case of a match miss, we discard the whole match and start again with
the first event id in the metric event.
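
The matching described above can be sketched in isolation as follows.
This is only a simplified model, not the perf code itself: plain event
names stand in for 'struct evsel', and the function name 'match_group'
and the 'matched' index array are hypothetical, introduced for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Simplified model of the matching done in find_evsel_group().
 * Event names stand in for struct evsel; matched[] (hypothetical)
 * receives the evlist[] index of each event picked for the group.
 * Returns true when all idnum ids were matched.
 */
static bool match_group(const char **evlist, size_t nr_entries,
			const char **ids, size_t idnum,
			bool *evlist_used, size_t *matched)
{
	size_t i = 0;

	for (size_t j = 0; j < nr_entries && i < idnum; j++) {
		/* Events already claimed by an earlier group are skipped. */
		if (evlist_used[j])
			continue;

		/* On a miss, discard the partial match, restart at ids[0]. */
		if (strcmp(evlist[j], ids[i]))
			i = 0;

		/* The current event may still start a fresh match. */
		if (!strcmp(evlist[j], ids[i]))
			matched[i++] = j;
	}

	if (i != idnum)
		return false;

	/* Mark the matched events as used so later groups skip them. */
	for (i = 0; i < idnum; i++)
		evlist_used[matched[i]] = true;

	return true;
}
```

With the event list { inst_retired.any, cycles, inst_retired.any } and
a zeroed 'evlist_used', matching CoreIPC first claims entries 0 and 1,
and matching Instructions afterwards skips those and claims entry 2.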

With this patch:
On the Skylake platform:

command:# ./perf stat -M CoreIPC,Instructions  -C 0 sleep 1

 Performance counter stats for 'CPU(s) 0':

         3,348,415      inst_retired.any          #      0.3 CoreIPC
        11,779,026      cycles
         3,348,381      inst_retired.any          # 3348381.0
                                                    Instructions

       1.001649056 seconds time elapsed

command:# ./perf stat -M UPI,IPC sleep 1

 Performance counter stats for 'sleep 1':

         1,023,148      uops_retired.retire_slots #      1.1 UPI
           924,976      inst_retired.any
           924,976      inst_retired.any          #      0.6 IPC
         1,489,414      cpu_clk_unhalted.thread

       1.003064672 seconds time elapsed

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>

Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
---
 tools/perf/util/metricgroup.c | 49 +++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 19 deletions(-)

Changelog:
v5 -> v6
- Remove bool cast
- Add Acked-by tag

v4 -> v5
- Made a small fix to return from function 'metricgroup__setup_events'
  in case calloc fails

v3 -> v4
- Make 'evlist_used' a bool pointer.

v2 -> v3
- Add an array in place of a variable to keep track of matched events.
  With the previous approach, in case of a match miss all events would
  be rolled over in the next condition. So instead we add an array and
  set the entry for an event once it has matched some group.
  - Suggested by Jiri Olsa

v1 -> v2
- Rather than adding a static variable in metricgroup.c,
  add a new variable in evlist itself with the name 'evlist_iter'

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 02aee946b6c1..33bb138f7902 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -93,13 +93,16 @@ struct egroup {
 static struct evsel *find_evsel_group(struct evlist *perf_evlist,
 				      const char **ids,
 				      int idnum,
-				      struct evsel **metric_events)
+				      struct evsel **metric_events,
+				      bool *evlist_used)
 {
 	struct evsel *ev;
-	int i = 0;
+	int i = 0, j = 0;
 	bool leader_found;
 
 	evlist__for_each_entry (perf_evlist, ev) {
+		if (evlist_used[j++])
+			continue;
 		if (!strcmp(ev->name, ids[i])) {
 			if (!metric_events[i])
 				metric_events[i] = ev;
@@ -107,22 +110,17 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
 			if (i == idnum)
 				break;
 		} else {
-			if (i + 1 == idnum) {
-				/* Discard the whole match and start again */
-				i = 0;
-				memset(metric_events, 0,
-				       sizeof(struct evsel *) * idnum);
-				continue;
-			}
-
-			if (!strcmp(ev->name, ids[i]))
-				metric_events[i] = ev;
-			else {
-				/* Discard the whole match and start again */
-				i = 0;
-				memset(metric_events, 0,
-				       sizeof(struct evsel *) * idnum);
-				continue;
+			/* Discard the whole match and start again */
+			i = 0;
+			memset(metric_events, 0,
+				sizeof(struct evsel *) * idnum);
+
+			if (!strcmp(ev->name, ids[i])) {
+				if (!metric_events[i])
+					metric_events[i] = ev;
+				i++;
+				if (i == idnum)
+					break;
 			}
 		}
 	}
@@ -144,7 +142,10 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
 			    !strcmp(ev->name, metric_events[i]->name)) {
 				ev->metric_leader = metric_events[i];
 			}
+			j++;
 		}
+		ev = metric_events[i];
+		evlist_used[ev->idx] = true;
 	}
 
 	return metric_events[0];
@@ -160,6 +161,13 @@ static int metricgroup__setup_events(struct list_head *groups,
 	int ret = 0;
 	struct egroup *eg;
 	struct evsel *evsel;
+	bool *evlist_used;
+
+	evlist_used = calloc(perf_evlist->core.nr_entries, sizeof(bool));
+	if (!evlist_used) {
+		ret = -ENOMEM;
+		return ret;
+	}
 
 	list_for_each_entry (eg, groups, nd) {
 		struct evsel **metric_events;
@@ -170,7 +178,7 @@ static int metricgroup__setup_events(struct list_head *groups,
 			break;
 		}
 		evsel = find_evsel_group(perf_evlist, eg->ids, eg->idnum,
-					 metric_events);
+					 metric_events, evlist_used);
 		if (!evsel) {
 			pr_debug("Cannot resolve %s: %s\n",
 					eg->metric_name, eg->metric_expr);
@@ -194,6 +202,9 @@ static int metricgroup__setup_events(struct list_head *groups,
 		expr->metric_events = metric_events;
 		list_add(&expr->nd, &me->head);
 	}
+
+	free(evlist_used);
+
 	return ret;
 }
 
-- 
2.21.0



* Re: [PATCH v6] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events
  2020-02-21 10:11 [PATCH v6] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events Kajol Jain
@ 2020-03-17  6:21 ` kajoljain
  2020-03-18 19:28 ` Arnaldo Carvalho de Melo
  2020-04-04  8:42 ` [tip: perf/urgent] perf metricgroup: " tip-bot2 for Kajol Jain
  2 siblings, 0 replies; 5+ messages in thread
From: kajoljain @ 2020-03-17  6:21 UTC
  To: acme
  Cc: linux-kernel, linux-perf-users, Jiri Olsa, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Jin Yao,
	Madhavan Srinivasan, Anju T Sudhakar, Ravi Bangoria

Hi Arnaldo,
	Could you pull this patch if it looks fine to you? Please let
me know if any changes are required.

Thanks,
Kajol



* Re: [PATCH v6] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events
  2020-02-21 10:11 [PATCH v6] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events Kajol Jain
  2020-03-17  6:21 ` kajoljain
@ 2020-03-18 19:28 ` Arnaldo Carvalho de Melo
  2020-03-20  6:44   ` kajoljain
  2020-04-04  8:42 ` [tip: perf/urgent] perf metricgroup: " tip-bot2 for Kajol Jain
  2 siblings, 1 reply; 5+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-03-18 19:28 UTC
  To: Kajol Jain
  Cc: linux-kernel, linux-perf-users, Jiri Olsa, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Jin Yao,
	Madhavan Srinivasan, Anju T Sudhakar, Ravi Bangoria

On Fri, Feb 21, 2020 at 03:41:21PM +0530, Kajol Jain wrote:
> Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
> Acked-by: Jiri Olsa <jolsa@kernel.org>

I think this is an area that needs some improvement. Look how it ends up
setting up inst_retired.any multiple times:

[root@seventh ~]# perf stat -vv -M CoreIPC,Instructions  -C 0 sleep 1
Using CPUID GenuineIntel-6-9E-9
metric expr inst_retired.any / cycles for CoreIPC
found event inst_retired.any
found event cycles
metric expr inst_retired.any for Instructions
found event inst_retired.any
adding {inst_retired.any,cycles}:W,{inst_retired.any}:W
intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
inst_retired.any -> cpu/event=0xc0,(null)=0x1e8483/
inst_retired.any -> cpu/event=0xc0,(null)=0x1e8483/
------------------------------------------------------------
perf_event_attr:
  type                             4
  size                             120
  config                           0xc0
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
  disabled                         1
  inherit                          1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
  size                             120
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
  inherit                          1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 0  group_fd 3  flags 0x8 = 4
------------------------------------------------------------
perf_event_attr:
  type                             4
  size                             120
  config                           0xc0
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
inst_retired.any: 0: 507070 1000948076 1000948076
cycles: 0: 1250258 1000948076 1000948076
inst_retired.any: 0: 507038 1000953052 1000953052
inst_retired.any: 507070 1000948076 1000948076
cycles: 1250258 1000948076 1000948076
inst_retired.any: 507038 1000953052 1000953052

 Performance counter stats for 'CPU(s) 0':

           507,070      inst_retired.any          #      0.4 CoreIPC
         1,250,258      cycles
           507,038      inst_retired.any          # 507038.0 Instructions

       1.000964961 seconds time elapsed

[root@seventh ~]#

And it ends up printing "inst_retired.any" multiple times, with
different values, since two events were allocated after all. Can't we
notice this, set up just one inst_retired.any, and then when calculating
the metrics just do something like:

  # perf stat -M CoreIPC,Instructions -C 0 sleep 1

 Performance counter stats for 'CPU(s) 0':

           507,070      inst_retired.any          #      0.4 CoreIPC,
						  #  507,070 Instructions
         1,250,258      cycles

       1.000964961 seconds time elapsed
#

?

Ditto for:

    command:# perf stat -M UPI,IPC sleep 1

     Performance counter stats for 'sleep 1':

             1,023,148      uops_retired.retire_slots #      1.1 UPI
               924,976      inst_retired.any
               924,976      inst_retired.any          #      0.6 IPC
             1,489,414      cpu_clk_unhalted.thread

           1.003064672 seconds time elapsed

Wouldn't this be better as:

    command:# perf stat -M UPI,IPC sleep 1

     Performance counter stats for 'sleep 1':

             1,023,148      uops_retired.retire_slots #      1.1 UPI
               924,976      inst_retired.any
             1,489,414      cpu_clk_unhalted.thread   #      0.6 IPC

           1.003064672 seconds time elapsed

This should help to look at many metrics at the same time by requiring
fewer counters to be allocated, etc, or am I missing something here?
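
One way to picture the suggestion is to collect the event ids of all
requested metrics into a single list without repeats, so that each
event only has to be set up once. The sketch below is only an
illustration of that idea under simplifying assumptions: the helper
name 'add_unique_ids' is hypothetical, it works on plain strings, and
it ignores the {...}:W grouping that perf applies per metric.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: append the ids of one metric to a shared list,
 * skipping names that are already present, so that every event is
 * requested only once. Returns the new number of unique names, or
 * SIZE_MAX if the list capacity 'cap' would be exceeded.
 */
static size_t add_unique_ids(const char **uniq, size_t nr, size_t cap,
			     const char **ids, size_t idnum)
{
	for (size_t i = 0; i < idnum; i++) {
		size_t k;

		for (k = 0; k < nr; k++)
			if (!strcmp(uniq[k], ids[i]))
				break;
		if (k < nr)
			continue;	/* already requested by another metric */
		if (nr == cap)
			return SIZE_MAX;
		uniq[nr++] = ids[i];
	}
	return nr;
}
```

Feeding it the ids of CoreIPC (inst_retired.any, cycles) and then of
Instructions (inst_retired.any) leaves two unique names in the list
rather than three.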

Since this went through multiple versions and Jiri is satisfied with it,
I'm applying the patch, but please consider this suggestion.

- Arnaldo


* Re: [PATCH v6] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events
  2020-03-18 19:28 ` Arnaldo Carvalho de Melo
@ 2020-03-20  6:44   ` kajoljain
  0 siblings, 0 replies; 5+ messages in thread
From: kajoljain @ 2020-03-20  6:44 UTC
  To: Arnaldo Carvalho de Melo
  Cc: linux-kernel, linux-perf-users, Jiri Olsa, Alexander Shishkin,
	Andi Kleen, Kan Liang, Peter Zijlstra, Jin Yao,
	Madhavan Srinivasan, Anju T Sudhakar, Ravi Bangoria



On 3/19/20 12:58 AM, Arnaldo Carvalho de Melo wrote:
> This should help to look at many metrics at the same time by requiring
> fewer counters to be allocated, etc, or am I missing something here?

Hi Arnaldo,
	Yes that will be better. I will look into it from my end.

Thanks,
Kajol

> 
> Since this went through multiple versions and Jiri is satisfied with it,
> I'm applying the patch, but please consider this suggestion.
> 
> - Arnaldo
> 


* [tip: perf/urgent] perf metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events
  2020-02-21 10:11 [PATCH v6] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events Kajol Jain
  2020-03-17  6:21 ` kajoljain
  2020-03-18 19:28 ` Arnaldo Carvalho de Melo
@ 2020-04-04  8:42 ` tip-bot2 for Kajol Jain
  2 siblings, 0 replies; 5+ messages in thread
From: tip-bot2 for Kajol Jain @ 2020-04-04  8:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kajol Jain, Jiri Olsa, Alexander Shishkin, Andi Kleen,
	Anju T Sudhakar, Jin Yao, Kan Liang, Madhavan Srinivasan,
	Peter Zijlstra, Ravi Bangoria, Arnaldo Carvalho de Melo, x86,
	LKML

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     58fc90fda0cc983c11c5290c7a9e992b08ac4a5c
Gitweb:        https://git.kernel.org/tip/58fc90fda0cc983c11c5290c7a9e992b08ac4a5c
Author:        Kajol Jain <kjain@linux.ibm.com>
AuthorDate:    Fri, 21 Feb 2020 15:41:21 +05:30
Committer:     Arnaldo Carvalho de Melo <acme@redhat.com>
CommitterDate: Tue, 24 Mar 2020 09:37:27 -03:00

perf metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events

Commit f01642e4912b ("perf metricgroup: Support multiple events for
metricgroup") introduced support for multiple events in a metric group.
But with the current upstream, metric event names are not printed
properly in case we try to run multiple metric groups with overlapping
events.

With the current upstream version, the problem with overlapping metric
events is that the comparison logic always starts from the beginning of
the event list.  So events which already matched some metric group also
take part in the comparison. Because of that, when we have overlapping
events, we end up matching the current metric group's events with
already matched ones.

For example, on a Skylake machine we have the metrics CoreIPC and
Instructions. Both of them need the 'inst_retired.any' event value.  As
the events in Instructions are a subset of the events in CoreIPC, they
end up pointing to the same 'inst_retired.any' value.
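
The overlap described above can be sketched in plain C. This is only a
simplified model, not the pre-patch perf code: events are plain strings
instead of struct evsel, the name find_group_from_start() is
hypothetical, and the handling of a mismatch is an assumption. The one
property it is meant to show is that a search which always starts at
the head of the list has no way of knowing that an event was already
taken by another metric.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical model of a lookup that always scans from the head of the
 * event list and keeps no record of earlier matches.
 * Returns the index of the first event of the matched consecutive run,
 * or -1 if the ids are not found.
 */
static int find_group_from_start(const char **events, size_t nr_events,
				 const char **ids, size_t idnum)
{
	size_t i = 0;

	if (idnum == 0)
		return -1;

	for (size_t j = 0; j < nr_events; j++) {
		if (strcmp(events[j], ids[i]) != 0) {
			/* Miss: drop the partial match, retry with ids[0]. */
			i = 0;
			if (strcmp(events[j], ids[0]) != 0)
				continue;
		}
		if (++i == idnum)
			return (int)(j + 1 - idnum);
	}
	return -1;
}
```

With the event order inst_retired.any, cycles, inst_retired.any (the
order in which the events are printed in the output in this message),
looking up CoreIPC and then Instructions returns index 0 both times, so
both metrics end up attached to the first 'inst_retired.any'.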

On the Skylake platform:

command:# ./perf stat -M CoreIPC,Instructions  -C 0 sleep 1

 Performance counter stats for 'CPU(s) 0':

     1,254,992,790      inst_retired.any          # 1254992790.0
                                                    Instructions
                                                  #      1.3 CoreIPC
       977,172,805      cycles
     1,254,992,756      inst_retired.any

       1.000802596 seconds time elapsed

command:# sudo ./perf stat -M UPI,IPC sleep 1

   Performance counter stats for 'sleep 1':
           948,650      uops_retired.retire_slots
           866,182      inst_retired.any          #      0.7 IPC
           866,182      inst_retired.any
         1,175,671      cpu_clk_unhalted.thread

The patch fixes the issue by adding a new bool pointer 'evlist_used' to
keep track of events which already matched some group, by setting the
corresponding entry to true.  So we skip all used events in the list
when we start the comparison logic.  The patch also changes the
comparison logic: in case of a match miss, we discard the whole match
and start again with the first event id in the metric event.
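
The idea can be sketched on plain strings as follows. This is a minimal
model under stated assumptions, not the code in metricgroup.c: the name
find_group(), the use of string arrays, and recording list indices
instead of struct evsel pointers are all hypothetical simplifications.
It keeps the two behaviours described above: used events are skipped,
and a miss discards the partial match and retries the current event
against the first id.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for the patched lookup.
 * Searches the not-yet-used events for a consecutive run matching ids[].
 * On success, fills match[0..idnum-1] with event indices, marks those
 * events as used and returns true. On failure, match[] may hold stale
 * partial results and used[] is left unchanged.
 */
static bool find_group(const char **events, size_t nr_events,
		       const char **ids, size_t idnum,
		       size_t *match, bool *used)
{
	size_t i = 0;

	if (idnum == 0)
		return false;

	for (size_t j = 0; j < nr_events; j++) {
		if (used[j])
			continue;	/* claimed by an earlier metric group */

		if (strcmp(events[j], ids[i]) != 0) {
			/* Miss: discard the whole match and start again. */
			i = 0;
			if (strcmp(events[j], ids[0]) != 0)
				continue;
		}

		match[i++] = j;
		if (i == idnum) {
			for (size_t k = 0; k < idnum; k++)
				used[match[k]] = true;
			return true;
		}
	}
	return false;
}
```

With the events inst_retired.any, cycles, inst_retired.any and a
zero-initialized used[] array, a lookup for CoreIPC claims indices 0 and
1, and the following lookup for Instructions skips them and claims
index 2. A further lookup for Instructions finds nothing left to match.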

With this patch:

On the Skylake platform:

command:# ./perf stat -M CoreIPC,Instructions  -C 0 sleep 1

 Performance counter stats for 'CPU(s) 0':

         3,348,415      inst_retired.any          #      0.3 CoreIPC
        11,779,026      cycles
         3,348,381      inst_retired.any          # 3348381.0
                                                    Instructions

       1.001649056 seconds time elapsed

command:# ./perf stat -M UPI,IPC sleep 1

 Performance counter stats for 'sleep 1':

         1,023,148      uops_retired.retire_slots #      1.1 UPI
           924,976      inst_retired.any
           924,976      inst_retired.any          #      0.6 IPC
         1,489,414      cpu_clk_unhalted.thread

       1.003064672 seconds time elapsed

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200221101121.28920-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/util/metricgroup.c | 49 ++++++++++++++++++++--------------
 1 file changed, 30 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index c3a8c70..926449a 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -95,13 +95,16 @@ struct egroup {
 static struct evsel *find_evsel_group(struct evlist *perf_evlist,
 				      const char **ids,
 				      int idnum,
-				      struct evsel **metric_events)
+				      struct evsel **metric_events,
+				      bool *evlist_used)
 {
 	struct evsel *ev;
-	int i = 0;
+	int i = 0, j = 0;
 	bool leader_found;
 
 	evlist__for_each_entry (perf_evlist, ev) {
+		if (evlist_used[j++])
+			continue;
 		if (!strcmp(ev->name, ids[i])) {
 			if (!metric_events[i])
 				metric_events[i] = ev;
@@ -109,22 +112,17 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
 			if (i == idnum)
 				break;
 		} else {
-			if (i + 1 == idnum) {
-				/* Discard the whole match and start again */
-				i = 0;
-				memset(metric_events, 0,
-				       sizeof(struct evsel *) * idnum);
-				continue;
-			}
-
-			if (!strcmp(ev->name, ids[i]))
-				metric_events[i] = ev;
-			else {
-				/* Discard the whole match and start again */
-				i = 0;
-				memset(metric_events, 0,
-				       sizeof(struct evsel *) * idnum);
-				continue;
+			/* Discard the whole match and start again */
+			i = 0;
+			memset(metric_events, 0,
+				sizeof(struct evsel *) * idnum);
+
+			if (!strcmp(ev->name, ids[i])) {
+				if (!metric_events[i])
+					metric_events[i] = ev;
+				i++;
+				if (i == idnum)
+					break;
 			}
 		}
 	}
@@ -146,7 +144,10 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
 			    !strcmp(ev->name, metric_events[i]->name)) {
 				ev->metric_leader = metric_events[i];
 			}
+			j++;
 		}
+		ev = metric_events[i];
+		evlist_used[ev->idx] = true;
 	}
 
 	return metric_events[0];
@@ -162,6 +163,13 @@ static int metricgroup__setup_events(struct list_head *groups,
 	int ret = 0;
 	struct egroup *eg;
 	struct evsel *evsel;
+	bool *evlist_used;
+
+	evlist_used = calloc(perf_evlist->core.nr_entries, sizeof(bool));
+	if (!evlist_used) {
+		ret = -ENOMEM;
+		return ret;
+	}
 
 	list_for_each_entry (eg, groups, nd) {
 		struct evsel **metric_events;
@@ -172,7 +180,7 @@ static int metricgroup__setup_events(struct list_head *groups,
 			break;
 		}
 		evsel = find_evsel_group(perf_evlist, eg->ids, eg->idnum,
-					 metric_events);
+					 metric_events, evlist_used);
 		if (!evsel) {
 			pr_debug("Cannot resolve %s: %s\n",
 					eg->metric_name, eg->metric_expr);
@@ -196,6 +204,9 @@ static int metricgroup__setup_events(struct list_head *groups,
 		expr->metric_events = metric_events;
 		list_add(&expr->nd, &me->head);
 	}
+
+	free(evlist_used);
+
 	return ret;
 }
 


end of thread, other threads:[~2020-04-04  8:42 UTC | newest]

Thread overview: 5+ messages
2020-02-21 10:11 [PATCH v6] tools/perf/metricgroup: Fix printing event names of metric group with multiple events incase of overlapping events Kajol Jain
2020-03-17  6:21 ` kajoljain
2020-03-18 19:28 ` Arnaldo Carvalho de Melo
2020-03-20  6:44   ` kajoljain
2020-04-04  8:42 ` [tip: perf/urgent] perf metricgroup: " tip-bot2 for Kajol Jain
