From: Ian Rogers <irogers@google.com> To: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>, Mathieu Poirier <mathieu.poirier@linaro.org>, Suzuki K Poulose <suzuki.poulose@arm.com>, Mike Leach <mike.leach@linaro.org>, Leo Yan <leo.yan@linaro.org>, John Garry <john.garry@huawei.com>, Will Deacon <will@kernel.org>, Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, Andrii Nakryiko <andrii@kernel.org>, Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>, John Fastabend <john.fastabend@gmail.com>, KP Singh <kpsingh@kernel.org>, Kajol Jain <kjain@linux.ibm.com>, James Clark <james.clark@arm.com>, German Gomez <german.gomez@arm.com>, Adrian Hunter <adrian.hunter@intel.com>, Riccardo Mancini <rickyman7@gmail.com>, Andi Kleen <ak@linux.intel.com>, Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>, Alexander Antonov <alexander.antonov@linux.intel.com>, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org, bpf@vger.kernel.org Cc: Stephane Eranian <eranian@google.com>, Ian Rogers <irogers@google.com> Subject: [PATCH v3 4/5] perf evlist: Respect all_cpus when setting user_requested_cpus Date: Thu, 7 Apr 2022 20:56:15 -0700 [thread overview] Message-ID: <20220408035616.1356953-5-irogers@google.com> (raw) In-Reply-To: <20220408035616.1356953-1-irogers@google.com> If all_cpus is calculated it represents the merge/union of all evsel cpu maps. By default user_requested_cpus is computed to be the online CPUs. For uncore events, it is often the case currently that all_cpus is a subset of user_requested_cpus. Metrics printed without aggregation and with metric-only, in print_no_aggr_metric, iterate over user_requested_cpus assuming every CPU has a metric to print. For each CPU the prefix is printed, but then if the evsel's cpus doesn't contain anything you get an empty line like the following on a 2 socket 36 core SkylakeX: ``` $ perf stat -A -M DRAM_BW_Use -a --metric-only -I 1000 1.000453137 CPU0 0.00 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 CPU18 0.00 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 2.003717143 CPU0 0.00 ... ``` While it is possible to be lazier in printing the prefix and trailing newline, having user_requested_cpus not be a subset of all_cpus is preferential so that wasted work isn't done elsewhere user_requested_cpus is used. The change modifies user_requested_cpus to be the intersection of user specified CPUs, or default all online CPUs, with the CPUs computed through the merge of all evsel cpu maps. New behavior: ``` $ perf stat -A -M DRAM_BW_Use -a --metric-only -I 1000 1.001086325 CPU0 0.00 1.001086325 CPU18 0.00 2.003671291 CPU0 0.00 2.003671291 CPU18 0.00 ... ``` Signed-off-by: Ian Rogers <irogers@google.com> --- tools/perf/util/evlist.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index 52ea004ba01e..196d57b905a0 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -1036,6 +1036,13 @@ int evlist__create_maps(struct evlist *evlist, struct target *target) if (!cpus) goto out_delete_threads; + if (evlist->core.all_cpus) { + struct perf_cpu_map *tmp; + + tmp = perf_cpu_map__intersect(cpus, evlist->core.all_cpus); + perf_cpu_map__put(cpus); + cpus = tmp; + } evlist->core.has_user_cpus = !!target->cpu_list && !target->hybrid; perf_evlist__set_maps(&evlist->core, cpus, threads); -- 2.35.1.1178.g4f1659d476-goog
WARNING: multiple messages have this Message-ID (diff)
From: Ian Rogers <irogers@google.com> To: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>, Mathieu Poirier <mathieu.poirier@linaro.org>, Suzuki K Poulose <suzuki.poulose@arm.com>, Mike Leach <mike.leach@linaro.org>, Leo Yan <leo.yan@linaro.org>, John Garry <john.garry@huawei.com>, Will Deacon <will@kernel.org>, Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, Andrii Nakryiko <andrii@kernel.org>, Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>, John Fastabend <john.fastabend@gmail.com>, KP Singh <kpsingh@kernel.org>, Kajol Jain <kjain@linux.ibm.com>, James Clark <james.clark@arm.com>, German Gomez <german.gomez@arm.com>, Adrian Hunter <adrian.hunter@intel.com>, Riccardo Mancini <rickyman7@gmail.com>, Andi Kleen <ak@linux.intel.com>, Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>, Alexander Antonov <alexander.antonov@linux.intel.com>, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, coresight@lists.linaro.org, linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org, bpf@vger.kernel.org Cc: Stephane Eranian <eranian@google.com>, Ian Rogers <irogers@google.com> Subject: [PATCH v3 4/5] perf evlist: Respect all_cpus when setting user_requested_cpus Date: Thu, 7 Apr 2022 20:56:15 -0700 [thread overview] Message-ID: <20220408035616.1356953-5-irogers@google.com> (raw) In-Reply-To: <20220408035616.1356953-1-irogers@google.com> If all_cpus is calculated it represents the merge/union of all evsel cpu maps. By default user_requested_cpus is computed to be the online CPUs. For uncore events, it is often the case currently that all_cpus is a subset of user_requested_cpus. Metrics printed without aggregation and with metric-only, in print_no_aggr_metric, iterate over user_requested_cpus assuming every CPU has a metric to print. For each CPU the prefix is printed, but then if the evsel's cpus doesn't contain anything you get an empty line like the following on a 2 socket 36 core SkylakeX: ``` $ perf stat -A -M DRAM_BW_Use -a --metric-only -I 1000 1.000453137 CPU0 0.00 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 CPU18 0.00 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 1.000453137 2.003717143 CPU0 0.00 ... ``` While it is possible to be lazier in printing the prefix and trailing newline, having user_requested_cpus not be a subset of all_cpus is preferential so that wasted work isn't done elsewhere user_requested_cpus is used. The change modifies user_requested_cpus to be the intersection of user specified CPUs, or default all online CPUs, with the CPUs computed through the merge of all evsel cpu maps. New behavior: ``` $ perf stat -A -M DRAM_BW_Use -a --metric-only -I 1000 1.001086325 CPU0 0.00 1.001086325 CPU18 0.00 2.003671291 CPU0 0.00 2.003671291 CPU18 0.00 ... ``` Signed-off-by: Ian Rogers <irogers@google.com> --- tools/perf/util/evlist.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index 52ea004ba01e..196d57b905a0 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -1036,6 +1036,13 @@ int evlist__create_maps(struct evlist *evlist, struct target *target) if (!cpus) goto out_delete_threads; + if (evlist->core.all_cpus) { + struct perf_cpu_map *tmp; + + tmp = perf_cpu_map__intersect(cpus, evlist->core.all_cpus); + perf_cpu_map__put(cpus); + cpus = tmp; + } evlist->core.has_user_cpus = !!target->cpu_list && !target->hybrid; perf_evlist__set_maps(&evlist->core, cpus, threads); -- 2.35.1.1178.g4f1659d476-goog _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next prev parent reply other threads:[~2022-04-08 3:57 UTC|newest] Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-04-08 3:56 [PATCH v3 0/5] Make evlist CPUs more accurate Ian Rogers 2022-04-08 3:56 ` Ian Rogers 2022-04-08 3:56 ` [PATCH v3 1/5] perf cpumap: Don't decrement refcnt on args to merge Ian Rogers 2022-04-08 3:56 ` Ian Rogers 2022-04-08 3:56 ` [PATCH v3 2/5] perf tests: Additional cpumap merge tests Ian Rogers 2022-04-08 3:56 ` Ian Rogers 2022-04-08 3:56 ` [PATCH v3 3/5] perf cpumap: Add intersect function Ian Rogers 2022-04-08 3:56 ` Ian Rogers 2022-04-08 3:56 ` Ian Rogers [this message] 2022-04-08 3:56 ` [PATCH v3 4/5] perf evlist: Respect all_cpus when setting user_requested_cpus Ian Rogers 2022-04-28 20:15 ` Adrian Hunter 2022-04-28 20:15 ` Adrian Hunter [not found] ` <CAP-5=fVNuQDW+yge897RjaWfE3cfQTD4ufFws6PS2k99Qe05Uw@mail.gmail.com> 2022-04-29 11:34 ` Adrian Hunter 2022-04-29 11:34 ` Adrian Hunter 2022-04-30 1:06 ` Ian Rogers 2022-04-30 1:06 ` Ian Rogers 2022-04-08 3:56 ` [PATCH v3 5/5] perf test: Combine cpu map tests into 1 suite Ian Rogers 2022-04-08 3:56 ` Ian Rogers
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20220408035616.1356953-5-irogers@google.com \ --to=irogers@google.com \ --cc=acme@kernel.org \ --cc=adrian.hunter@intel.com \ --cc=ak@linux.intel.com \ --cc=alexander.antonov@linux.intel.com \ --cc=alexander.shishkin@linux.intel.com \ --cc=alexey.v.bayduraev@linux.intel.com \ --cc=andrii@kernel.org \ --cc=ast@kernel.org \ --cc=bpf@vger.kernel.org \ --cc=coresight@lists.linaro.org \ --cc=daniel@iogearbox.net \ --cc=eranian@google.com \ --cc=german.gomez@arm.com \ --cc=james.clark@arm.com \ --cc=john.fastabend@gmail.com \ --cc=john.garry@huawei.com \ --cc=jolsa@kernel.org \ --cc=kafai@fb.com \ --cc=kjain@linux.ibm.com \ --cc=kpsingh@kernel.org \ --cc=leo.yan@linaro.org \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-perf-users@vger.kernel.org \ --cc=mark.rutland@arm.com \ --cc=mathieu.poirier@linaro.org \ --cc=mike.leach@linaro.org \ --cc=mingo@redhat.com \ --cc=namhyung@kernel.org \ --cc=netdev@vger.kernel.org \ --cc=peterz@infradead.org \ --cc=rickyman7@gmail.com \ --cc=songliubraving@fb.com \ --cc=suzuki.poulose@arm.com \ --cc=will@kernel.org \ --cc=yhs@fb.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.