All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jin, Yao" <yao.jin@linux.intel.com>
To: Jiri Olsa <jolsa@redhat.com>
Cc: acme@kernel.org, jolsa@kernel.org, peterz@infradead.org,
	mingo@redhat.com, alexander.shishkin@linux.intel.com,
	Linux-kernel@vger.kernel.org, ak@linux.intel.com,
	kan.liang@intel.com, yao.jin@intel.com
Subject: Re: [PATCH v2 17/27] perf evsel: Adjust hybrid event and global event mixed group
Date: Tue, 16 Mar 2021 12:39:38 +0800	[thread overview]
Message-ID: <709fd95a-d379-da99-3816-83797d3d5411@linux.intel.com> (raw)
In-Reply-To: <YE/oF4pOkyQmm0rI@krava>

Hi Jiri,

On 3/16/2021 7:04 AM, Jiri Olsa wrote:
> On Thu, Mar 11, 2021 at 03:07:32PM +0800, Jin Yao wrote:
>> A group mixed with hybrid event and global event is allowed. For example,
>> group leader is 'cpu-clock' and the group member is 'cpu_atom/cycles/'.
>>
>> e.g.
>> perf stat -e '{cpu-clock,cpu_atom/cycles/}' -a
>>
>> The challenge is their available cpus are not fully matched.
>> For example, 'cpu-clock' is available on CPU0-CPU23, but 'cpu_core/cycles/'
>> is available on CPU16-CPU23.
>>
>> When getting the group id for group member, we must be very careful
>> because the cpu for 'cpu-clock' is not equal to the cpu for 'cpu_atom/cycles/'.
>> Actually the cpu here is the index of evsel->core.cpus, not the real CPU ID.
>> e.g. cpu0 for 'cpu-clock' is CPU0, but cpu0 for 'cpu_atom/cycles/' is CPU16.
>>
>> Another challenge is for group read. The events in group may be not
>> available on all cpus. For example the leader is a software event and
>> it's available on CPU0-CPU1, but the group member is a hybrid event and
>> it's only available on CPU1. For CPU0, we have only one event, but for CPU1
>> we have two events. So we need to change the read size according to
>> the real number of events on that cpu.
> 
> ugh, this is really bad.. do we really want to support it? ;-)
> I guess we need that for metrics..
> 

Yes, it's a bit of pain but the user case makes sense. Some metrics need the event group which 
consists of global event + hybrid event.

For example, CPU_Utilization = 'cpu_clk_unhalted.ref_tsc' / 'msr/tsc/'.

'msr/tsc/' is a global event. It's valid on all CPUs.

But 'cpu_clk_unhalted.ref' is hybrid event.
'cpu_core/cpu_clk_unhalted.ref/' is valid on core CPUs
'cpu_atom/cpu_clk_unhalted.ref/' is valid on atom CPUs.

So we have to support this usage. :)

> SNIP
> 
>>
>>     Performance counter stats for 'system wide':
>>
>>             24,059.14 msec cpu-clock                 #   23.994 CPUs utilized
>>         6,406,677,892      cpu_atom/cycles/          #  266.289 M/sec
>>
>>           1.002699058 seconds time elapsed
>>
>> For cpu_atom/cycles/, cpu16-cpu23 are set with valid group fd (cpu-clock's fd
>> on that cpu). For counting results, cpu-clock has 24 cpus aggregation and
>> cpu_atom/cycles/ has 8 cpus aggregation. That's expected.
>>
>> But if the event order is changed, e.g. '{cpu_atom/cycles/,cpu-clock}',
>> there leaves more works to do.
>>
>>    root@ssp-pwrt-002:~# ./perf stat -e '{cpu_atom/cycles/,cpu-clock}' -a -vvv -- sleep 1
> 
> what id you add the other hybrid pmu event? or just cycles?
>

Do you mean the config for cpu_atom/cycles/? Let's see the log.

root@ssp-pwrt-002:~# perf stat -e '{cpu_atom/cycles/,cpu-clock}' -a -vvv -- sleep 1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
   type                             6
   size                             120
   config                           0xa00000000
   sample_type                      IDENTIFIER
   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
   disabled                         1
   inherit                          1
   exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 16  group_fd -1  flags 0x8 = 3
sys_perf_event_open: pid -1  cpu 17  group_fd -1  flags 0x8 = 4
sys_perf_event_open: pid -1  cpu 18  group_fd -1  flags 0x8 = 5
sys_perf_event_open: pid -1  cpu 19  group_fd -1  flags 0x8 = 7
sys_perf_event_open: pid -1  cpu 20  group_fd -1  flags 0x8 = 8
sys_perf_event_open: pid -1  cpu 21  group_fd -1  flags 0x8 = 9
sys_perf_event_open: pid -1  cpu 22  group_fd -1  flags 0x8 = 10
sys_perf_event_open: pid -1  cpu 23  group_fd -1  flags 0x8 = 11
------------------------------------------------------------
perf_event_attr:
   type                             1
   size                             120
   sample_type                      IDENTIFIER
   read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
   inherit                          1
   exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 12
sys_perf_event_open: pid -1  cpu 1  group_fd -1  flags 0x8 = 13
sys_perf_event_open: pid -1  cpu 2  group_fd -1  flags 0x8 = 14
sys_perf_event_open: pid -1  cpu 3  group_fd -1  flags 0x8 = 15
sys_perf_event_open: pid -1  cpu 4  group_fd -1  flags 0x8 = 16
sys_perf_event_open: pid -1  cpu 5  group_fd -1  flags 0x8 = 17
sys_perf_event_open: pid -1  cpu 6  group_fd -1  flags 0x8 = 18
sys_perf_event_open: pid -1  cpu 7  group_fd -1  flags 0x8 = 19
sys_perf_event_open: pid -1  cpu 8  group_fd -1  flags 0x8 = 20
sys_perf_event_open: pid -1  cpu 9  group_fd -1  flags 0x8 = 21
sys_perf_event_open: pid -1  cpu 10  group_fd -1  flags 0x8 = 22
sys_perf_event_open: pid -1  cpu 11  group_fd -1  flags 0x8 = 23
sys_perf_event_open: pid -1  cpu 12  group_fd -1  flags 0x8 = 24
sys_perf_event_open: pid -1  cpu 13  group_fd -1  flags 0x8 = 25
sys_perf_event_open: pid -1  cpu 14  group_fd -1  flags 0x8 = 26
sys_perf_event_open: pid -1  cpu 15  group_fd -1  flags 0x8 = 27
sys_perf_event_open: pid -1  cpu 16  group_fd 3  flags 0x8 = 28
sys_perf_event_open: pid -1  cpu 17  group_fd 4  flags 0x8 = 29
sys_perf_event_open: pid -1  cpu 18  group_fd 5  flags 0x8 = 30
sys_perf_event_open: pid -1  cpu 19  group_fd 7  flags 0x8 = 31
sys_perf_event_open: pid -1  cpu 20  group_fd 8  flags 0x8 = 32
sys_perf_event_open: pid -1  cpu 21  group_fd 9  flags 0x8 = 33
sys_perf_event_open: pid -1  cpu 22  group_fd 10  flags 0x8 = 34
sys_perf_event_open: pid -1  cpu 23  group_fd 11  flags 0x8 = 35
cycles: 0: 800791792 1002389889 1002389889
cycles: 1: 800788198 1002383611 1002383611
cycles: 2: 800783491 1002377507 1002377507
cycles: 3: 800777752 1002371035 1002371035
cycles: 4: 800771559 1002363669 1002363669
cycles: 5: 800766391 1002356944 1002356944
cycles: 6: 800761593 1002350144 1002350144
cycles: 7: 800756258 1002343203 1002343203
WARNING: for cpu-clock, some CPU counts not read
cpu-clock: 0: 0 0 0
cpu-clock: 1: 0 0 0
cpu-clock: 2: 0 0 0
cpu-clock: 3: 0 0 0
cpu-clock: 4: 0 0 0
cpu-clock: 5: 0 0 0
cpu-clock: 6: 0 0 0
cpu-clock: 7: 0 0 0
cpu-clock: 8: 0 0 0
cpu-clock: 9: 0 0 0
cpu-clock: 10: 0 0 0
cpu-clock: 11: 0 0 0
cpu-clock: 12: 0 0 0
cpu-clock: 13: 0 0 0
cpu-clock: 14: 0 0 0
cpu-clock: 15: 0 0 0
cpu-clock: 16: 1002390566 1002389889 1002389889
cpu-clock: 17: 1002383263 1002383611 1002383611
cpu-clock: 18: 1002377257 1002377507 1002377507
cpu-clock: 19: 1002370895 1002371035 1002371035
cpu-clock: 20: 1002363611 1002363669 1002363669
cpu-clock: 21: 1002356623 1002356944 1002356944
cpu-clock: 22: 1002349562 1002350144 1002350144
cpu-clock: 23: 1002343089 1002343203 1002343203
cycles: 6406197034 8018936002 8018936002
cpu-clock: 8018934866 8018936002 8018936002

  Performance counter stats for 'system wide':

      6,406,197,034      cpu_atom/cycles/          #  798.884 M/sec
           8,018.93 msec cpu-clock                 #    7.999 CPUs utilized

        1.002475994 seconds time elapsed

'0xa00000000' is the config for cpu_atom/cycles/ and we can see this event is only enabled on CPU16-23.

'cpu-clock' is a global event and it's enable don CPU0-23 (but only have valid group fd on CPU16-23).

> 
> SNIP
>    
>> +static int hybrid_read_size(struct evsel *leader, int cpu, int *nr_members)
>> +{
>> +	struct evsel *pos;
>> +	int nr = 1, back, new_size = 0, idx;
>> +
>> +	for_each_group_member(pos, leader) {
>> +		idx = evsel_cpuid_match(leader, pos, cpu);
>> +		if (idx != -1)
>> +			nr++;
>> +	}
>> +
>> +	if (nr != leader->core.nr_members) {
>> +		back = leader->core.nr_members;
>> +		leader->core.nr_members = nr;
>> +		new_size = perf_evsel__read_size(&leader->core);
>> +		leader->core.nr_members = back;
>> +	}
>> +
>> +	*nr_members = nr;
>> +	return new_size;
>> +}
>> +
>>   static int evsel__read_group(struct evsel *leader, int cpu, int thread)
>>   {
>>   	struct perf_stat_evsel *ps = leader->stats;
>>   	u64 read_format = leader->core.attr.read_format;
>>   	int size = perf_evsel__read_size(&leader->core);
>> +	int new_size, nr_members;
>>   	u64 *data = ps->group_data;
>>   
>>   	if (!(read_format & PERF_FORMAT_ID))
>>   		return -EINVAL;
> 
> I wonder if we do not find some reasonable generic way to process
> this, porhaps we should make some early check that this evlist has
> hybrid event and the move the implementation in some separated
> hybrid-XXX object, so we don't confuse the code
> 
> jirka
> 

Agree. The code looks a bit tricky and hard to understand for non-hybrid platform.

Maybe it's a good idea to move this code to hybrid-xxx files.

Thanks
Jin Yao

  reply	other threads:[~2021-03-16  4:40 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-03-11  7:07 [PATCH v2 00/27] perf tool: AlderLake hybrid support series 1 Jin Yao
2021-03-11  7:07 ` [PATCH v2 01/27] tools headers uapi: Update tools's copy of linux/perf_event.h Jin Yao
2021-03-11  7:07 ` [PATCH v2 02/27] perf jevents: Support unit value "cpu_core" and "cpu_atom" Jin Yao
2021-03-11  7:07 ` [PATCH v2 03/27] perf pmu: Simplify arguments of __perf_pmu__new_alias Jin Yao
2021-03-11  7:07 ` [PATCH v2 04/27] perf pmu: Save pmu name Jin Yao
2021-03-15 23:03   ` Jiri Olsa
2021-03-16  0:59     ` Jin, Yao
2021-03-11  7:07 ` [PATCH v2 05/27] perf pmu: Save detected hybrid pmus to a global pmu list Jin Yao
2021-03-11  7:07 ` [PATCH v2 06/27] perf pmu: Add hybrid helper functions Jin Yao
2021-03-11  7:07 ` [PATCH v2 07/27] perf evlist: Hybrid event uses its own cpus Jin Yao
2021-03-12 19:15   ` Jiri Olsa
2021-03-15  1:25     ` Jin, Yao
2021-03-11  7:07 ` [PATCH v2 08/27] perf stat: Uniquify hybrid event name Jin Yao
2021-03-15 23:04   ` Jiri Olsa
2021-03-11  7:07 ` [PATCH v2 09/27] perf parse-events: Create two hybrid hardware events Jin Yao
2021-03-12 19:15   ` Jiri Olsa
2021-03-15  2:04     ` Jin, Yao
2021-03-15 17:34       ` Jiri Olsa
2021-03-15 23:05   ` Jiri Olsa
2021-03-16  3:13     ` Jin, Yao
2021-03-11  7:07 ` [PATCH v2 10/27] perf parse-events: Create two hybrid cache events Jin Yao
2021-03-15 23:05   ` Jiri Olsa
2021-03-16  3:51     ` Jin, Yao
2021-03-11  7:07 ` [PATCH v2 11/27] perf parse-events: Support hardware events inside PMU Jin Yao
2021-03-12 19:15   ` Jiri Olsa
2021-03-15  2:28     ` Jin, Yao
2021-03-15 17:37       ` Jiri Olsa
2021-03-16  1:49         ` Jin, Yao
2021-03-16 14:04           ` Jiri Olsa
2021-03-17  2:12             ` Jin, Yao
2021-03-17 10:06               ` Jiri Olsa
2021-03-17 12:17                 ` Jin, Yao
2021-03-17 13:42                   ` Arnaldo Carvalho de Melo
2021-03-18 12:16                     ` Jiri Olsa
2021-03-18 13:21                       ` Arnaldo Carvalho de Melo
2021-03-18 18:16                         ` Liang, Kan
2021-03-19  2:48                     ` Andi Kleen
2021-03-18 11:51                   ` Jiri Olsa
2021-03-11  7:07 ` [PATCH v2 12/27] perf parse-events: Support hybrid raw events Jin Yao
2021-03-11  7:07 ` [PATCH v2 13/27] perf evlist: Create two hybrid 'cycles' events by default Jin Yao
2021-03-11  7:07 ` [PATCH v2 14/27] perf stat: Add default hybrid events Jin Yao
2021-03-11  7:07 ` [PATCH v2 15/27] perf stat: Filter out unmatched aggregation for hybrid event Jin Yao
2021-03-11  7:07 ` [PATCH v2 16/27] perf evlist: Warn as events from different hybrid PMUs in a group Jin Yao
2021-03-15 23:03   ` Jiri Olsa
2021-03-16  5:25     ` Jin, Yao
2021-03-18 11:52       ` Jiri Olsa
2021-03-11  7:07 ` [PATCH v2 17/27] perf evsel: Adjust hybrid event and global event mixed group Jin Yao
2021-03-15 23:04   ` Jiri Olsa
2021-03-16  4:39     ` Jin, Yao [this message]
2021-03-11  7:07 ` [PATCH v2 18/27] perf script: Support PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU Jin Yao
2021-03-11  7:07 ` [PATCH v2 19/27] perf tests: Add hybrid cases for 'Parse event definition strings' test Jin Yao
2021-03-11  7:07 ` [PATCH v2 20/27] perf tests: Add hybrid cases for 'Roundtrip evsel->name' test Jin Yao
2021-03-11  7:07 ` [PATCH v2 21/27] perf tests: Skip 'Setup struct perf_event_attr' test for hybrid Jin Yao
2021-03-11  7:07 ` [PATCH v2 22/27] perf tests: Support 'Track with sched_switch' " Jin Yao
2021-03-11  7:07 ` [PATCH v2 23/27] perf tests: Support 'Parse and process metrics' " Jin Yao
2021-03-11  7:07 ` [PATCH v2 24/27] perf tests: Support 'Session topology' " Jin Yao
2021-03-11  7:07 ` [PATCH v2 25/27] perf tests: Support 'Convert perf time to TSC' " Jin Yao
2021-03-11  7:07 ` [PATCH v2 26/27] perf tests: Skip 'perf stat metrics (shadow stat) test' " Jin Yao
2021-03-11  7:07 ` [PATCH v2 27/27] perf Documentation: Document intel-hybrid support Jin Yao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=709fd95a-d379-da99-3816-83797d3d5411@linux.intel.com \
    --to=yao.jin@linux.intel.com \
    --cc=Linux-kernel@vger.kernel.org \
    --cc=acme@kernel.org \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=jolsa@kernel.org \
    --cc=jolsa@redhat.com \
    --cc=kan.liang@intel.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=yao.jin@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.