linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Jin, Yao" <yao.jin@linux.intel.com>
To: Jiri Olsa <jolsa@redhat.com>
Cc: acme@kernel.org, jolsa@kernel.org, peterz@infradead.org,
	mingo@redhat.com, alexander.shishkin@linux.intel.com,
	Linux-kernel@vger.kernel.org, ak@linux.intel.com,
	kan.liang@intel.com, yao.jin@intel.com
Subject: Re: [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid
Date: Tue, 20 Jul 2021 15:07:02 +0800	[thread overview]
Message-ID: <ecf0e815-616f-0a08-cefd-baac93c0e47d@linux.intel.com> (raw)
In-Reply-To: <YPXUMTFbj2Tl3eBz@krava>

Hi Jiri,

On 7/20/2021 3:36 AM, Jiri Olsa wrote:
> On Mon, Jul 12, 2021 at 03:12:35PM +0800, Jin Yao wrote:
>> The perf-record and perf-stat have supported the option '-C/--cpus'
>> to count or collect only on the list of CPUs provided. This option
>> needs to be supported for hybrid as well.
>>
>> For hybrid support, it needs to check that the CPUs are available on
>> hybrid PMU. One example for AlderLake, cpu0-7 is 'cpu_core', cpu8-11
>> is 'cpu_atom'.
>>
>> Before:
>>
>>    # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
>>
>>     Performance counter stats for 'CPU(s) 11':
>>
>>       <not supported>      cpu_core/cycles/
>>
>>           1.006179431 seconds time elapsed
>>
>> The perf-stat silently returned "<not supported>" without any helpful
>> information. It should error out that cpu11 was not 'cpu_core'.
>>
>> After:
>>
>>    # perf stat -e cpu_core/cycles/ -C11 -- sleep 1
>>    WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
>>    failed to use cpu list 11
>>
>> We also need to support the events without pmu prefix specified.
>>
>>    # perf stat -e cycles -C11 -- sleep 1
>>    WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7)
>>
>>     Performance counter stats for 'CPU(s) 11':
>>
>>             1,067,373      cpu_atom/cycles/
>>
>>           1.005544738 seconds time elapsed
>>
>> The perf tool creates two cycles events automatically, cpu_core/cycles/ and
>> cpu_atom/cycles/. It checks that cpu11 is not 'cpu_core', then shows a warning
>> for cpu_core/cycles/ and only count the cpu_atom/cycles/.
>>
>> If part of cpus are 'cpu_core' and part of cpus are 'cpu_atom', the example,
>>
>>    # perf stat -e cycles -C0,11 -- sleep 1
>>    WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
>>    WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
>>
>>     Performance counter stats for 'CPU(s) 0,11':
>>
>>             1,914,704      cpu_core/cycles/
>>             2,036,983      cpu_atom/cycles/
>>
>>           1.005815641 seconds time elapsed
>>
>> It now automatically selects cpu0 for cpu_core/cycles/, selects cpu11 for
>> cpu_atom/cycles/, and output with some warnings.
>>
>> Some more complex examples,
>>
>>    # perf stat -e cycles,instructions -C0,11 -- sleep 1
>>    WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
>>    WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
>>    WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list.
>>    WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list.
>>
>>     Performance counter stats for 'CPU(s) 0,11':
>>
>>             2,780,387      cpu_core/cycles/
>>             1,583,432      cpu_atom/cycles/
>>             3,957,277      cpu_core/instructions/
>>             1,167,089      cpu_atom/instructions/
>>
>>           1.006005124 seconds time elapsed
>>
>>    # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1
>>    WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list.
>>    WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list.
>>    WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list.
>>
>>     Performance counter stats for 'CPU(s) 0,11':
>>
>>             3,290,301      cpu_core/cycles/
>>             1,953,073      cpu_atom/cycles/
>>             1,407,869      cpu_atom/instructions/
>>
>>           1.006260912 seconds time elapsed
>>
>> Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
>> ---
>> v3:
>>   - Rebase to perf/core.
>>   - No functional change.
>>
>> v2:
>>   - Automatically map to hybrid pmu.
>>
>>   tools/perf/builtin-record.c     |  7 ++++
>>   tools/perf/builtin-stat.c       |  6 +++
>>   tools/perf/util/evlist-hybrid.c | 65 +++++++++++++++++++++++++++++++++
>>   tools/perf/util/evlist-hybrid.h |  1 +
>>   tools/perf/util/evlist.c        |  1 +
>>   tools/perf/util/pmu.c           | 35 ++++++++++++++++++
>>   tools/perf/util/pmu.h           |  4 ++
>>   7 files changed, 119 insertions(+)
>>
>> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
>> index 671a21c9ee4d..9518b028b850 100644
>> --- a/tools/perf/builtin-record.c
>> +++ b/tools/perf/builtin-record.c
>> @@ -2884,6 +2884,13 @@ int cmd_record(int argc, const char **argv)
>>   	/* Enable ignoring missing threads when -u/-p option is defined. */
>>   	rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX || rec->opts.target.pid;
>>   
>> +	if (evlist__use_cpu_list(rec->evlist, rec->opts.target.cpu_list)) {
>> +		pr_err("failed to use cpu list %s\n",
>> +		       rec->opts.target.cpu_list);
>> +		goto out;
>> +	}
>> +
>> +	rec->opts.target.hybrid = perf_pmu__has_hybrid();
>>   	err = -ENOMEM;
>>   	if (evlist__create_maps(rec->evlist, &rec->opts.target) < 0)
>>   		usage_with_options(record_usage, record_options);
>> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
>> index d25cb8088e8c..f7067587008f 100644
>> --- a/tools/perf/builtin-stat.c
>> +++ b/tools/perf/builtin-stat.c
>> @@ -2430,6 +2430,12 @@ int cmd_stat(int argc, const char **argv)
>>   	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
>>   		target.per_thread = true;
>>   
>> +	if (evlist__use_cpu_list(evsel_list, target.cpu_list)) {
>> +		pr_err("failed to use cpu list %s\n", target.cpu_list);
>> +		goto out;
>> +	}
>> +
>> +	target.hybrid = perf_pmu__has_hybrid();
>>   	if (evlist__create_maps(evsel_list, &target) < 0) {
>>   		if (target__has_task(&target)) {
>>   			pr_err("Problems finding threads of monitor\n");
>> diff --git a/tools/perf/util/evlist-hybrid.c b/tools/perf/util/evlist-hybrid.c
>> index db3f5fbdebe1..13c9f3063dda 100644
>> --- a/tools/perf/util/evlist-hybrid.c
>> +++ b/tools/perf/util/evlist-hybrid.c
>> @@ -86,3 +86,68 @@ bool evlist__has_hybrid(struct evlist *evlist)
>>   
>>   	return false;
>>   }
>> +
>> +int evlist__use_cpu_list(struct evlist *evlist, const char *cpu_list)
> 
> 
> the name seems not to cover what it's doing, how about something
> like evlist__fix_cpus or such
> 

OK, evlist__fix_cpus() is better, use this name in v4.

>> +{
>> +	struct perf_cpu_map *cpus;
>> +	struct evsel *evsel, *tmp;
>> +	struct perf_pmu *pmu;
>> +	int ret, unmatched_count = 0, events_nr = 0;
>> +
>> +	if (!perf_pmu__has_hybrid() || !cpu_list)
>> +		return 0;
>> +
>> +	cpus = perf_cpu_map__new(cpu_list);
>> +	if (!cpus)
>> +		return -1;
>> +
>> +	evlist__for_each_entry_safe(evlist, tmp, evsel) {
>> +		struct perf_cpu_map *matched_cpus, *unmatched_cpus;
>> +		char buf1[128], buf2[128];
>> +
>> +		pmu = perf_pmu__find_hybrid_pmu(evsel->pmu_name);
>> +		if (!pmu)
>> +			continue;
>> +
>> +		ret = perf_pmu__cpus_match(pmu, cpus, &matched_cpus,
>> +					   &unmatched_cpus);
>> +		if (ret)
>> +			goto out;
>> +
>> +		events_nr++;
>> +
>> +		if (matched_cpus->nr > 0 && (unmatched_cpus->nr > 0 ||
>> +		    matched_cpus->nr < cpus->nr ||
>> +		    matched_cpus->nr < pmu->cpus->nr)) {
>> +			perf_cpu_map__put(evsel->core.cpus);
>> +			perf_cpu_map__put(evsel->core.own_cpus);
>> +			evsel->core.cpus = perf_cpu_map__get(matched_cpus);
>> +			evsel->core.own_cpus = perf_cpu_map__get(matched_cpus);
> 
> I'm bit confused in here.. AFAIUI there's 2 evsel objects create
> for hybrid 'cycles' ... should they have already proper cpus set?
> 

For 'cycles', yes two evsels are created automatically. One is for atom CPU (e.g. 8-11), the other 
is for core CPU (e.g. 0-7). In this example, these 2 evsels have already the cpus set.

While the 'cpus' here is just the user specified cpu list.
cpus = perf_cpu_map__new(cpu_list);

We need to check that the cpu in 'cpus' is available on hybrid pmu or not and adjust the 
evsel->core.cpus according the matching results.

>> +
>> +			if (unmatched_cpus->nr > 0) {
>> +				cpu_map__snprint(matched_cpus, buf1, sizeof(buf1));
>> +				pr_warning("WARNING: use %s in '%s' for '%s', skip other cpus in list.\n",
>> +					   buf1, pmu->name, evsel->name);
>> +			}
>> +		}
>> +
>> +		if (matched_cpus->nr == 0) {
>> +			evlist__remove(evlist, evsel);
>> +			evsel__delete(evsel);
>> +
>> +			cpu_map__snprint(cpus, buf1, sizeof(buf1));
>> +			cpu_map__snprint(pmu->cpus, buf2, sizeof(buf2));
>> +			pr_warning("WARNING: %s isn't a '%s', please use a CPU list in the '%s' range (%s)\n",
>> +				   buf1, pmu->name, pmu->name, buf2);
>> +			unmatched_count++;
>> +		}
> 
> hum, should we rather fail in here?
> 

perf stat -e cpu_core/cycles/,cpu_atom/instructions/ -C11

CPU11 is atom CPU so the evsel 'cpu_core/cycles/' is failed but cpu_atom/instructions/ is OK.

Don't we report the partially successful event?

Thanks
Jin Yao

> jirka
> 

  reply	other threads:[~2021-07-20  7:08 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-12  7:12 [PATCH v3 0/3] perf tool: Enable cpu list " Jin Yao
2021-07-12  7:12 ` [PATCH v3 1/3] libperf: Add perf_cpu_map__default_new() Jin Yao
2021-07-12  7:12 ` [PATCH v3 2/3] perf tools: Create hybrid flag in target Jin Yao
2021-07-12  7:12 ` [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid Jin Yao
2021-07-19 19:36   ` Jiri Olsa
2021-07-20  7:07     ` Jin, Yao [this message]
2021-07-20  9:16       ` Jiri Olsa
2021-07-21  4:30         ` Jin, Yao
2021-07-22 10:19           ` Jiri Olsa
2021-07-23  0:49             ` Jin, Yao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ecf0e815-616f-0a08-cefd-baac93c0e47d@linux.intel.com \
    --to=yao.jin@linux.intel.com \
    --cc=Linux-kernel@vger.kernel.org \
    --cc=acme@kernel.org \
    --cc=ak@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=jolsa@kernel.org \
    --cc=jolsa@redhat.com \
    --cc=kan.liang@intel.com \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=yao.jin@intel.com \
    --subject='Re: [PATCH v3 3/3] perf tools: Enable on a list of CPUs for hybrid' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).