linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ian Rogers <irogers@google.com>
To: Adrian Hunter <adrian.hunter@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Namhyung Kim <namhyung@kernel.org>,
	Mathieu Poirier <mathieu.poirier@linaro.org>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Mike Leach <mike.leach@linaro.org>, Leo Yan <leo.yan@linaro.org>,
	John Garry <john.garry@huawei.com>, Will Deacon <will@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Martin KaFai Lau <kafai@fb.com>, Song Liu <songliubraving@fb.com>,
	Yonghong Song <yhs@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>, Kajol Jain <kjain@linux.ibm.com>,
	James Clark <james.clark@arm.com>,
	German Gomez <german.gomez@arm.com>,
	Riccardo Mancini <rickyman7@gmail.com>,
	Andi Kleen <ak@linux.intel.com>,
	Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>,
	Alexander Antonov <alexander.antonov@linux.intel.com>,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org,
	Stephane Eranian <eranian@google.com>
Subject: Re: [PATCH v5 4/6] perf cpumap: Handle dummy maps as empty in subset
Date: Wed, 4 May 2022 06:59:01 -0700	[thread overview]
Message-ID: <CAP-5=fWEdLV6Jf5q=MSQyVSL0Q3-KxSvCWgXGjhexezx9AJAdA@mail.gmail.com> (raw)
In-Reply-To: <1e13c738-3460-ef7f-8b4b-5169e16b0b06@intel.com>

On Wed, May 4, 2022 at 5:54 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
>
> On 3/05/22 17:03, Ian Rogers wrote:
> > On Tue, May 3, 2022 at 12:43 AM Adrian Hunter <adrian.hunter@intel.com> wrote:
> >>
> >> On 3/05/22 07:17, Ian Rogers wrote:
> >>> perf_cpu_map__empty is true for empty and dummy maps. Make is_subset
> >>> respect that.
> >>
> >> As I wrote before, I am not keen on this because it prevents -1, as a
> >> valid 3rd parameter to perf_event_open(), from being represented
> >> in merged evsel cpu maps.
> >>
> >> Why do you want this?
> >
> > Thanks Adrian, could you give me a test case (command line) where the
> > differing dummy and empty behavior matters?
>
> perf record --per-thread -e intel_pt// uname
>
> With patchset "perf intel-pt: Better support for perf record --cpu"
> the above will have (assuming 8-CPUs):
>         user_requested_cpus = {-1}
>         intel_pt evsel->cpus = {-1}
>         text_poke dummy evsel->cpus = {0-7}
> which when merged would result in:
>         before this patch: all_cpus = {-1-7}
>         after this patch:  all_cpus = {0-7}
>
> The absence of -1 will mean that the intel_pt event does not get
> mmapped.

Thanks, so what's the right fix? To make this work we should:
 - remove language of dummy being a singular cpu_map
 - change perf_cpu_map__empty to be something like
perf_cpu_map__empty_or_just_dummy
 - change cpu_map__is_dummy to be something llike cpu_map__singular_dummy
 - add tests on cpu map code for things like the evlist affiinity
iterator as I'm not clear what will happen when it encounters a -1 CPU
Note, I'm proposing changing function names rather than implementation
behavior, as we don't have enough tests to give me confidence that
changing the existing behavior wouldn't break something. For example,
--per-thread mode was recently broken:
https://lore.kernel.org/lkml/e1ce0d93-88cc-af79-e67e-d3c79d166ca6@gmail.com/
Do we also need to fix up parse events for software events (e.g.
faults) where the cpu map is empty but really should be dummy? This
will likely need a propagate fix as during the parsing propagation
user_requested_cpus is empty and we want to keep dummy cpu maps, not
overwrite them with empty.

I see a fair amount of clean up here, which isn't bad. My assumed
alternative was that the intel_pt code could look for dummy cpu maps
on the evsels, but then why have dummy cpu maps at all and just use
empty throughout the code base? We could also add a flag to the evlist
to say whether any evsel cpu maps contain a dummy/empty map. API wise
I'm tempted to say that the dummy CPU map should be removed and empty
just used in its place (less is more, keep it simple).

Something that would help with clarity, I think, would be to land the fix in:
https://lore.kernel.org/lkml/20220503041757.2365696-3-irogers@google.com/
as currently all_cpus contains cpus not in the evsel cpu maps, but are
residue from when the evsels were parsed.

Thanks,
Ian

> >                                             Normally cpus/own_cpus are
> > set to null during parsing. They may get replaced with
> > user_requested_cpus:
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/lib/perf/evlist.c?h=perf/core#n44
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/lib/perf/evlist.c?h=perf/core#n45
> > (should it be on line 45 that !empty is expected?)
> >
> > During merge the null/empty all_cpus drops this value, which doesn't
> > matter as the behavior with empty is the same as dummy:
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/lib/perf/evsel.c?h=perf/core#n119
> >
> > What's concerning me is the definition of empty:
> > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/lib/perf/cpumap.c?h=perf/core#n279
> > ```
> > return map ? map->map[0].cpu == -1 : true;
> > ```
> > If the first entry can be -1 and there can be other CPUs merged after
> > then that cpu map will be empty by the definition above. Perhaps it
> > should be:
> > ```
> > return map ? (map->nr == 1 && map->map[0].cpu == -1) : true;
> > ```
> > but it seems you prefer:
> > ```
> > return (map == NULL) ? true : false;
> > ```
> >
> > You'd asked what the behavior with a dummy is and clearly it is
> > somewhat muddy. That is what this patch and unit test is trying to
> > clean up.
> >
> > Thanks,
> > Ian
> >
> >>>
> >>> Signed-off-by: Ian Rogers <irogers@google.com>
> >>> ---
> >>>  tools/lib/perf/cpumap.c   |  4 ++--
> >>>  tools/perf/tests/cpumap.c | 10 +++++++++-
> >>>  2 files changed, 11 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
> >>> index 384d5e076ee4..9c83675788c2 100644
> >>> --- a/tools/lib/perf/cpumap.c
> >>> +++ b/tools/lib/perf/cpumap.c
> >>> @@ -322,9 +322,9 @@ struct perf_cpu perf_cpu_map__max(struct perf_cpu_map *map)
> >>>  /** Is 'b' a subset of 'a'. */
> >>>  bool perf_cpu_map__is_subset(const struct perf_cpu_map *a, const struct perf_cpu_map *b)
> >>>  {
> >>> -     if (a == b || !b)
> >>> +     if (a == b || perf_cpu_map__empty(b))
> >>>               return true;
> >>> -     if (!a || b->nr > a->nr)
> >>> +     if (perf_cpu_map__empty(a) || b->nr > a->nr)
> >>>               return false;
> >>>
> >>>       for (int i = 0, j = 0; i < a->nr; i++) {
> >>> diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
> >>> index f94929ebb54b..d52b58395385 100644
> >>> --- a/tools/perf/tests/cpumap.c
> >>> +++ b/tools/perf/tests/cpumap.c
> >>> @@ -128,13 +128,21 @@ static int test__cpu_map_merge(struct test_suite *test __maybe_unused, int subte
> >>>       struct perf_cpu_map *a = perf_cpu_map__new("4,2,1");
> >>>       struct perf_cpu_map *b = perf_cpu_map__new("4,5,7");
> >>>       struct perf_cpu_map *c = perf_cpu_map__merge(a, b);
> >>> +     struct perf_cpu_map *d = perf_cpu_map__dummy_new();
> >>> +     struct perf_cpu_map *e = perf_cpu_map__merge(b, d);
> >>>       char buf[100];
> >>>
> >>>       TEST_ASSERT_VAL("failed to merge map: bad nr", perf_cpu_map__nr(c) == 5);
> >>>       cpu_map__snprint(c, buf, sizeof(buf));
> >>>       TEST_ASSERT_VAL("failed to merge map: bad result", !strcmp(buf, "1-2,4-5,7"));
> >>> -     perf_cpu_map__put(b);
> >>> +
> >>> +     TEST_ASSERT_VAL("failed to merge map: bad nr", perf_cpu_map__nr(e) == 3);
> >>> +     cpu_map__snprint(e, buf, sizeof(buf));
> >>> +     TEST_ASSERT_VAL("failed to merge map: bad result", !strcmp(buf, "4-5,7"));
> >>> +
> >>>       perf_cpu_map__put(c);
> >>> +     perf_cpu_map__put(d);
> >>> +     perf_cpu_map__put(e);
> >>>       return 0;
> >>>  }
> >>>
> >>
>

  reply	other threads:[~2022-05-04 13:59 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-03  4:17 [PATCH v5 0/6] Make evlist CPUs more accurate Ian Rogers
2022-05-03  4:17 ` [PATCH v5 1/6] perf cpumap: Switch to using perf_cpu_map API Ian Rogers
2022-05-04 17:44   ` Namhyung Kim
2022-05-05 17:39     ` Arnaldo Carvalho de Melo
2022-05-03  4:17 ` [PATCH v5 2/6] perf evlist: Clear all_cpus before propagating Ian Rogers
2022-05-04 14:15   ` Adrian Hunter
2022-05-05 17:39     ` Arnaldo Carvalho de Melo
2022-05-03  4:17 ` [PATCH v5 3/6] perf stat: Avoid printing cpus with no counters Ian Rogers
2022-05-03 14:19   ` Arnaldo Carvalho de Melo
2022-05-03  4:17 ` [PATCH v5 4/6] perf cpumap: Handle dummy maps as empty in subset Ian Rogers
2022-05-03  7:43   ` Adrian Hunter
2022-05-03 14:03     ` Ian Rogers
2022-05-04 12:54       ` Adrian Hunter
2022-05-04 13:59         ` Ian Rogers [this message]
2022-05-04 16:33           ` Ian Rogers
2022-05-03  4:17 ` [PATCH v5 5/6] perf evlist: Add to user_requested_cpus documentation Ian Rogers
2022-05-03  4:17 ` [PATCH v5 6/6] perf evlist: Rename all_cpus Ian Rogers

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAP-5=fWEdLV6Jf5q=MSQyVSL0Q3-KxSvCWgXGjhexezx9AJAdA@mail.gmail.com' \
    --to=irogers@google.com \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=ak@linux.intel.com \
    --cc=alexander.antonov@linux.intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=alexey.v.bayduraev@linux.intel.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=eranian@google.com \
    --cc=german.gomez@arm.com \
    --cc=james.clark@arm.com \
    --cc=john.fastabend@gmail.com \
    --cc=john.garry@huawei.com \
    --cc=jolsa@kernel.org \
    --cc=kafai@fb.com \
    --cc=kjain@linux.ibm.com \
    --cc=kpsingh@kernel.org \
    --cc=leo.yan@linaro.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mathieu.poirier@linaro.org \
    --cc=mike.leach@linaro.org \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rickyman7@gmail.com \
    --cc=songliubraving@fb.com \
    --cc=suzuki.poulose@arm.com \
    --cc=will@kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).