From: "Jin, Yao" <yao.jin@linux.intel.com>
To: Jiri Olsa <jolsa@redhat.com>
Cc: acme@kernel.org, jolsa@kernel.org, peterz@infradead.org,
mingo@redhat.com, alexander.shishkin@linux.intel.com,
Linux-kernel@vger.kernel.org, ak@linux.intel.com,
kan.liang@intel.com, yao.jin@intel.com
Subject: Re: [PATCH v2 17/27] perf evsel: Adjust hybrid event and global event mixed group
Date: Tue, 16 Mar 2021 12:39:38 +0800 [thread overview]
Message-ID: <709fd95a-d379-da99-3816-83797d3d5411@linux.intel.com> (raw)
In-Reply-To: <YE/oF4pOkyQmm0rI@krava>
Hi Jiri,
On 3/16/2021 7:04 AM, Jiri Olsa wrote:
> On Thu, Mar 11, 2021 at 03:07:32PM +0800, Jin Yao wrote:
>> A group mixed with hybrid event and global event is allowed. For example,
>> group leader is 'cpu-clock' and the group member is 'cpu_atom/cycles/'.
>>
>> e.g.
>> perf stat -e '{cpu-clock,cpu_atom/cycles/}' -a
>>
>> The challenge is their available cpus are not fully matched.
>> For example, 'cpu-clock' is available on CPU0-CPU23, but 'cpu_core/cycles/'
>> is available on CPU16-CPU23.
>>
>> When getting the group id for group member, we must be very careful
>> because the cpu for 'cpu-clock' is not equal to the cpu for 'cpu_atom/cycles/'.
>> Actually the cpu here is the index of evsel->core.cpus, not the real CPU ID.
>> e.g. cpu0 for 'cpu-clock' is CPU0, but cpu0 for 'cpu_atom/cycles/' is CPU16.
>>
>> Another challenge is for group read. The events in group may be not
>> available on all cpus. For example the leader is a software event and
>> it's available on CPU0-CPU1, but the group member is a hybrid event and
>> it's only available on CPU1. For CPU0, we have only one event, but for CPU1
>> we have two events. So we need to change the read size according to
>> the real number of events on that cpu.
>
> ugh, this is really bad.. do we really want to support it? ;-)
> I guess we need that for metrics..
>
Yes, it's a bit of pain but the user case makes sense. Some metrics need the event group which
consists of global event + hybrid event.
For example, CPU_Utilization = 'cpu_clk_unhalted.ref_tsc' / 'msr/tsc/'.
'msr/tsc/' is a global event. It's valid on all CPUs.
But 'cpu_clk_unhalted.ref' is hybrid event.
'cpu_core/cpu_clk_unhalted.ref/' is valid on core CPUs
'cpu_atom/cpu_clk_unhalted.ref/' is valid on atom CPUs.
So we have to support this usage. :)
> SNIP
>
>>
>> Performance counter stats for 'system wide':
>>
>> 24,059.14 msec cpu-clock # 23.994 CPUs utilized
>> 6,406,677,892 cpu_atom/cycles/ # 266.289 M/sec
>>
>> 1.002699058 seconds time elapsed
>>
>> For cpu_atom/cycles/, cpu16-cpu23 are set with valid group fd (cpu-clock's fd
>> on that cpu). For counting results, cpu-clock has 24 cpus aggregation and
>> cpu_atom/cycles/ has 8 cpus aggregation. That's expected.
>>
>> But if the event order is changed, e.g. '{cpu_atom/cycles/,cpu-clock}',
>> there leaves more works to do.
>>
>> root@ssp-pwrt-002:~# ./perf stat -e '{cpu_atom/cycles/,cpu-clock}' -a -vvv -- sleep 1
>
> what id you add the other hybrid pmu event? or just cycles?
>
Do you mean the config for cpu_atom/cycles/? Let's see the log.
root@ssp-pwrt-002:~# perf stat -e '{cpu_atom/cycles/,cpu-clock}' -a -vvv -- sleep 1
Control descriptor is not initialized
------------------------------------------------------------
perf_event_attr:
type 6
size 120
config 0xa00000000
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
disabled 1
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8 = 3
sys_perf_event_open: pid -1 cpu 17 group_fd -1 flags 0x8 = 4
sys_perf_event_open: pid -1 cpu 18 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid -1 cpu 19 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 20 group_fd -1 flags 0x8 = 8
sys_perf_event_open: pid -1 cpu 21 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 22 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 23 group_fd -1 flags 0x8 = 11
------------------------------------------------------------
perf_event_attr:
type 1
size 120
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
inherit 1
exclude_guest 1
------------------------------------------------------------
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 19
sys_perf_event_open: pid -1 cpu 8 group_fd -1 flags 0x8 = 20
sys_perf_event_open: pid -1 cpu 9 group_fd -1 flags 0x8 = 21
sys_perf_event_open: pid -1 cpu 10 group_fd -1 flags 0x8 = 22
sys_perf_event_open: pid -1 cpu 11 group_fd -1 flags 0x8 = 23
sys_perf_event_open: pid -1 cpu 12 group_fd -1 flags 0x8 = 24
sys_perf_event_open: pid -1 cpu 13 group_fd -1 flags 0x8 = 25
sys_perf_event_open: pid -1 cpu 14 group_fd -1 flags 0x8 = 26
sys_perf_event_open: pid -1 cpu 15 group_fd -1 flags 0x8 = 27
sys_perf_event_open: pid -1 cpu 16 group_fd 3 flags 0x8 = 28
sys_perf_event_open: pid -1 cpu 17 group_fd 4 flags 0x8 = 29
sys_perf_event_open: pid -1 cpu 18 group_fd 5 flags 0x8 = 30
sys_perf_event_open: pid -1 cpu 19 group_fd 7 flags 0x8 = 31
sys_perf_event_open: pid -1 cpu 20 group_fd 8 flags 0x8 = 32
sys_perf_event_open: pid -1 cpu 21 group_fd 9 flags 0x8 = 33
sys_perf_event_open: pid -1 cpu 22 group_fd 10 flags 0x8 = 34
sys_perf_event_open: pid -1 cpu 23 group_fd 11 flags 0x8 = 35
cycles: 0: 800791792 1002389889 1002389889
cycles: 1: 800788198 1002383611 1002383611
cycles: 2: 800783491 1002377507 1002377507
cycles: 3: 800777752 1002371035 1002371035
cycles: 4: 800771559 1002363669 1002363669
cycles: 5: 800766391 1002356944 1002356944
cycles: 6: 800761593 1002350144 1002350144
cycles: 7: 800756258 1002343203 1002343203
WARNING: for cpu-clock, some CPU counts not read
cpu-clock: 0: 0 0 0
cpu-clock: 1: 0 0 0
cpu-clock: 2: 0 0 0
cpu-clock: 3: 0 0 0
cpu-clock: 4: 0 0 0
cpu-clock: 5: 0 0 0
cpu-clock: 6: 0 0 0
cpu-clock: 7: 0 0 0
cpu-clock: 8: 0 0 0
cpu-clock: 9: 0 0 0
cpu-clock: 10: 0 0 0
cpu-clock: 11: 0 0 0
cpu-clock: 12: 0 0 0
cpu-clock: 13: 0 0 0
cpu-clock: 14: 0 0 0
cpu-clock: 15: 0 0 0
cpu-clock: 16: 1002390566 1002389889 1002389889
cpu-clock: 17: 1002383263 1002383611 1002383611
cpu-clock: 18: 1002377257 1002377507 1002377507
cpu-clock: 19: 1002370895 1002371035 1002371035
cpu-clock: 20: 1002363611 1002363669 1002363669
cpu-clock: 21: 1002356623 1002356944 1002356944
cpu-clock: 22: 1002349562 1002350144 1002350144
cpu-clock: 23: 1002343089 1002343203 1002343203
cycles: 6406197034 8018936002 8018936002
cpu-clock: 8018934866 8018936002 8018936002
Performance counter stats for 'system wide':
6,406,197,034 cpu_atom/cycles/ # 798.884 M/sec
8,018.93 msec cpu-clock # 7.999 CPUs utilized
1.002475994 seconds time elapsed
'0xa00000000' is the config for cpu_atom/cycles/ and we can see this event is only enabled on CPU16-23.
'cpu-clock' is a global event and it's enable don CPU0-23 (but only have valid group fd on CPU16-23).
>
> SNIP
>
>> +static int hybrid_read_size(struct evsel *leader, int cpu, int *nr_members)
>> +{
>> + struct evsel *pos;
>> + int nr = 1, back, new_size = 0, idx;
>> +
>> + for_each_group_member(pos, leader) {
>> + idx = evsel_cpuid_match(leader, pos, cpu);
>> + if (idx != -1)
>> + nr++;
>> + }
>> +
>> + if (nr != leader->core.nr_members) {
>> + back = leader->core.nr_members;
>> + leader->core.nr_members = nr;
>> + new_size = perf_evsel__read_size(&leader->core);
>> + leader->core.nr_members = back;
>> + }
>> +
>> + *nr_members = nr;
>> + return new_size;
>> +}
>> +
>> static int evsel__read_group(struct evsel *leader, int cpu, int thread)
>> {
>> struct perf_stat_evsel *ps = leader->stats;
>> u64 read_format = leader->core.attr.read_format;
>> int size = perf_evsel__read_size(&leader->core);
>> + int new_size, nr_members;
>> u64 *data = ps->group_data;
>>
>> if (!(read_format & PERF_FORMAT_ID))
>> return -EINVAL;
>
> I wonder if we do not find some reasonable generic way to process
> this, porhaps we should make some early check that this evlist has
> hybrid event and the move the implementation in some separated
> hybrid-XXX object, so we don't confuse the code
>
> jirka
>
Agree. The code looks a bit tricky and hard to understand for non-hybrid platform.
Maybe it's a good idea to move this code to hybrid-xxx files.
Thanks
Jin Yao
next prev parent reply other threads:[~2021-03-16 4:40 UTC|newest]
Thread overview: 59+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-03-11 7:07 [PATCH v2 00/27] perf tool: AlderLake hybrid support series 1 Jin Yao
2021-03-11 7:07 ` [PATCH v2 01/27] tools headers uapi: Update tools's copy of linux/perf_event.h Jin Yao
2021-03-11 7:07 ` [PATCH v2 02/27] perf jevents: Support unit value "cpu_core" and "cpu_atom" Jin Yao
2021-03-11 7:07 ` [PATCH v2 03/27] perf pmu: Simplify arguments of __perf_pmu__new_alias Jin Yao
2021-03-11 7:07 ` [PATCH v2 04/27] perf pmu: Save pmu name Jin Yao
2021-03-15 23:03 ` Jiri Olsa
2021-03-16 0:59 ` Jin, Yao
2021-03-11 7:07 ` [PATCH v2 05/27] perf pmu: Save detected hybrid pmus to a global pmu list Jin Yao
2021-03-11 7:07 ` [PATCH v2 06/27] perf pmu: Add hybrid helper functions Jin Yao
2021-03-11 7:07 ` [PATCH v2 07/27] perf evlist: Hybrid event uses its own cpus Jin Yao
2021-03-12 19:15 ` Jiri Olsa
2021-03-15 1:25 ` Jin, Yao
2021-03-11 7:07 ` [PATCH v2 08/27] perf stat: Uniquify hybrid event name Jin Yao
2021-03-15 23:04 ` Jiri Olsa
2021-03-11 7:07 ` [PATCH v2 09/27] perf parse-events: Create two hybrid hardware events Jin Yao
2021-03-12 19:15 ` Jiri Olsa
2021-03-15 2:04 ` Jin, Yao
2021-03-15 17:34 ` Jiri Olsa
2021-03-15 23:05 ` Jiri Olsa
2021-03-16 3:13 ` Jin, Yao
2021-03-11 7:07 ` [PATCH v2 10/27] perf parse-events: Create two hybrid cache events Jin Yao
2021-03-15 23:05 ` Jiri Olsa
2021-03-16 3:51 ` Jin, Yao
2021-03-11 7:07 ` [PATCH v2 11/27] perf parse-events: Support hardware events inside PMU Jin Yao
2021-03-12 19:15 ` Jiri Olsa
2021-03-15 2:28 ` Jin, Yao
2021-03-15 17:37 ` Jiri Olsa
2021-03-16 1:49 ` Jin, Yao
2021-03-16 14:04 ` Jiri Olsa
2021-03-17 2:12 ` Jin, Yao
2021-03-17 10:06 ` Jiri Olsa
2021-03-17 12:17 ` Jin, Yao
2021-03-17 13:42 ` Arnaldo Carvalho de Melo
2021-03-18 12:16 ` Jiri Olsa
2021-03-18 13:21 ` Arnaldo Carvalho de Melo
2021-03-18 18:16 ` Liang, Kan
2021-03-19 2:48 ` Andi Kleen
2021-03-18 11:51 ` Jiri Olsa
2021-03-11 7:07 ` [PATCH v2 12/27] perf parse-events: Support hybrid raw events Jin Yao
2021-03-11 7:07 ` [PATCH v2 13/27] perf evlist: Create two hybrid 'cycles' events by default Jin Yao
2021-03-11 7:07 ` [PATCH v2 14/27] perf stat: Add default hybrid events Jin Yao
2021-03-11 7:07 ` [PATCH v2 15/27] perf stat: Filter out unmatched aggregation for hybrid event Jin Yao
2021-03-11 7:07 ` [PATCH v2 16/27] perf evlist: Warn as events from different hybrid PMUs in a group Jin Yao
2021-03-15 23:03 ` Jiri Olsa
2021-03-16 5:25 ` Jin, Yao
2021-03-18 11:52 ` Jiri Olsa
2021-03-11 7:07 ` [PATCH v2 17/27] perf evsel: Adjust hybrid event and global event mixed group Jin Yao
2021-03-15 23:04 ` Jiri Olsa
2021-03-16 4:39 ` Jin, Yao [this message]
2021-03-11 7:07 ` [PATCH v2 18/27] perf script: Support PERF_TYPE_HARDWARE_PMU and PERF_TYPE_HW_CACHE_PMU Jin Yao
2021-03-11 7:07 ` [PATCH v2 19/27] perf tests: Add hybrid cases for 'Parse event definition strings' test Jin Yao
2021-03-11 7:07 ` [PATCH v2 20/27] perf tests: Add hybrid cases for 'Roundtrip evsel->name' test Jin Yao
2021-03-11 7:07 ` [PATCH v2 21/27] perf tests: Skip 'Setup struct perf_event_attr' test for hybrid Jin Yao
2021-03-11 7:07 ` [PATCH v2 22/27] perf tests: Support 'Track with sched_switch' " Jin Yao
2021-03-11 7:07 ` [PATCH v2 23/27] perf tests: Support 'Parse and process metrics' " Jin Yao
2021-03-11 7:07 ` [PATCH v2 24/27] perf tests: Support 'Session topology' " Jin Yao
2021-03-11 7:07 ` [PATCH v2 25/27] perf tests: Support 'Convert perf time to TSC' " Jin Yao
2021-03-11 7:07 ` [PATCH v2 26/27] perf tests: Skip 'perf stat metrics (shadow stat) test' " Jin Yao
2021-03-11 7:07 ` [PATCH v2 27/27] perf Documentation: Document intel-hybrid support Jin Yao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=709fd95a-d379-da99-3816-83797d3d5411@linux.intel.com \
--to=yao.jin@linux.intel.com \
--cc=Linux-kernel@vger.kernel.org \
--cc=acme@kernel.org \
--cc=ak@linux.intel.com \
--cc=alexander.shishkin@linux.intel.com \
--cc=jolsa@kernel.org \
--cc=jolsa@redhat.com \
--cc=kan.liang@intel.com \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=yao.jin@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.