* [PATCH 0/2] perf metricgroups: A couple of fixes
@ 2021-06-10 14:32 John Garry
2021-06-10 14:32 ` [PATCH 1/2] perf metricgroup: Fix find_evsel_group() event selector John Garry
2021-06-10 14:33 ` [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter() John Garry
0 siblings, 2 replies; 6+ messages in thread
From: John Garry @ 2021-06-10 14:32 UTC (permalink / raw)
To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
namhyung, irogers, kjain
Cc: linux-perf-users, linux-kernel, John Garry
This series fixes a couple of subtle issues.
The first fixes a segfault on my x86 Broadwell when running the 'stat'
command with a particular order of metrics.
As mentioned at [0], there is a pre-existing issue here which still needs
fixing, as this does not yet work properly; however, I think that is a
bigger job, and getting rid of the segfault is the best I can do for now.
The second fixes an issue of an uninitialized variable. As noted in the
commit message, gcc does not seem to do a good job of picking up on this.
[0] https://lore.kernel.org/lkml/49c6fccb-b716-1bf0-18a6-cace1cdb66b9@huawei.com/
John Garry (2):
perf metricgroup: Fix find_evsel_group() event selector
perf metricgroup: Return error code from
metricgroup__add_metric_sys_event_iter()
tools/perf/util/metricgroup.c | 14 ++++++++------
1 file changed, 8 insertions(+), 6 deletions(-)
--
2.26.2
* [PATCH 1/2] perf metricgroup: Fix find_evsel_group() event selector
2021-06-10 14:32 [PATCH 0/2] perf metricgroups: A couple of fixes John Garry
@ 2021-06-10 14:32 ` John Garry
2021-06-10 14:33 ` [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter() John Garry
1 sibling, 0 replies; 6+ messages in thread
From: John Garry @ 2021-06-10 14:32 UTC (permalink / raw)
To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
namhyung, irogers, kjain
Cc: linux-perf-users, linux-kernel, John Garry
The following command segfaults on my x86 broadwell:
$ ./perf stat -M frontend_bound,retiring,backend_bound,bad_speculation sleep 1
WARNING: grouped events cpus do not match, disabling group:
anon group { raw 0x10e }
anon group { raw 0x10e }
perf: util/evsel.c:1596: get_group_fd: Assertion `!(!leader->core.fd)' failed.
Aborted (core dumped)
The issue shows itself as a use-after-free in evlist__check_cpu_maps(),
whereby the leader of an event selector (evsel) has been deleted, yet we
still attempt to verify an evsel that refers to it.
Fundamentally the problem comes from metricgroup__setup_events() ->
find_evsel_group(), and has developed from the previous fix attempt in
commit 9c880c24cb0d ("perf metricgroup: Fix for metrics containing
duration_time").
The problem now is that the logic for checking whether an evsel is in the
same group is subtly broken for the "cycles" event. For the "cycles" event,
the pmu_name is NULL; however, the logic in find_evsel_group() may set an
event matched against "cycles" as used when it should not be.
This leads to a condition where an evsel is set, yet its leader is not.
Fix the check for evsel pmu_name by not matching evsels when either has a
NULL pmu_name.
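For illustration only (this is not part of the patch), here is a minimal
standalone sketch of how the changed return value affects the skip condition
in find_evsel_group(). The names struct toy_evsel, toy_same_pmu(),
toy_same_pmu_or_none() and toy_would_skip() are hypothetical stand-ins; only
the NULL-handling logic is taken from the diff below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct evsel; only the field used here. */
struct toy_evsel {
	const char *pmu_name;	/* NULL for events such as "cycles" */
};

/* Pre-patch shape: a NULL pmu_name on either side yields false. */
static inline bool toy_same_pmu(const struct toy_evsel *a,
				const struct toy_evsel *b)
{
	if (!a->pmu_name || !b->pmu_name)
		return false;
	return !strcmp(a->pmu_name, b->pmu_name);
}

/* Post-patch shape: a NULL pmu_name on either side yields true. */
static inline bool toy_same_pmu_or_none(const struct toy_evsel *a,
					const struct toy_evsel *b)
{
	if (!a->pmu_name || !b->pmu_name)
		return true;
	return !strcmp(a->pmu_name, b->pmu_name);
}

/*
 * Mirrors the shape of the condition in find_evsel_group(): when this is
 * true the caller would "break" and not mark the candidate evsel as used.
 */
static inline bool toy_would_skip(bool has_constraint, bool same_leader,
				  bool same_pmu)
{
	return !has_constraint && !same_leader && same_pmu;
}

/* Small self-check comparing the two variants for a NULL pmu_name. */
static inline void toy_pmu_selfcheck(void)
{
	struct toy_evsel cycles = { .pmu_name = NULL };
	struct toy_evsel cpu = { .pmu_name = "cpu" };

	/* Pre-patch: different leaders, one NULL name, candidate not skipped. */
	assert(!toy_would_skip(false, false, toy_same_pmu(&cycles, &cpu)));
	/* Post-patch: the same candidate is skipped. */
	assert(toy_would_skip(false, false,
			      toy_same_pmu_or_none(&cycles, &cpu)));
}
```

With two non-NULL names, both variants behave identically, so only events
without a pmu_name are affected by the change.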
There is still a pre-existing metric issue whereby the ordering of the
metrics may break the 'stat' function, as discussed at:
https://lore.kernel.org/lkml/49c6fccb-b716-1bf0-18a6-cace1cdb66b9@huawei.com/
Fixes: 9c880c24cb0d ("perf metricgroup: Fix for metrics containing duration_time")
Signed-off-by: John Garry <john.garry@huawei.com>
---
tools/perf/util/metricgroup.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 8336dd8e8098..c456fdeae06a 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -162,10 +162,10 @@ static bool contains_event(struct evsel **metric_events, int num_events,
return false;
}
-static bool evsel_same_pmu(struct evsel *ev1, struct evsel *ev2)
+static bool evsel_same_pmu_or_none(struct evsel *ev1, struct evsel *ev2)
{
if (!ev1->pmu_name || !ev2->pmu_name)
- return false;
+ return true;
return !strcmp(ev1->pmu_name, ev2->pmu_name);
}
@@ -288,7 +288,7 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
*/
if (!has_constraint &&
ev->leader != metric_events[i]->leader &&
- evsel_same_pmu(ev->leader, metric_events[i]->leader))
+ evsel_same_pmu_or_none(ev->leader, metric_events[i]->leader))
break;
if (!strcmp(metric_events[i]->name, ev->name)) {
set_bit(ev->idx, evlist_used);
--
2.26.2
* [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
2021-06-10 14:32 [PATCH 0/2] perf metricgroups: A couple of fixes John Garry
2021-06-10 14:32 ` [PATCH 1/2] perf metricgroup: Fix find_evsel_group() event selector John Garry
@ 2021-06-10 14:33 ` John Garry
2021-06-10 18:45 ` Ian Rogers
1 sibling, 1 reply; 6+ messages in thread
From: John Garry @ 2021-06-10 14:33 UTC (permalink / raw)
To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
namhyung, irogers, kjain
Cc: linux-perf-users, linux-kernel, John Garry
The error code is not set at all in the sys event iter function.
This may lead to an uninitialized value of "ret" in
metricgroup__add_metric() when no CPU metric is added.
Fix by properly setting the error code.
It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
if we have no CPU or sys event metric matching, then "has_match" should
be 0 and "ret" is set to -EINVAL.
However gcc cannot detect that it may not have been set after the
map_for_each_metric() loop for CPU metrics, which is strange.
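For illustration only (this is not part of the patch), the pattern can be
sketched standalone as follows. The names struct toy_iter_data,
toy_sys_event_iter() and toy_add_metric() are hypothetical and only loosely
modelled on the code in the diff below; toy_ok() and toy_fail() stand in for
add_metric() and resolve_metric().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical iterator context, loosely modelled on "d" in the patch. */
struct toy_iter_data {
	bool *has_match;
	int *ret;
};

/* Stand-ins for add_metric()/resolve_metric(): 0 or a negative errno. */
typedef int (*toy_step_fn)(void);

static inline int toy_ok(void)   { return 0; }
static inline int toy_fail(void) { return -ENOMEM; }

/* Post-patch shape: every exit path stores the result in *d->ret. */
static inline int toy_sys_event_iter(struct toy_iter_data *d,
				     toy_step_fn add, toy_step_fn resolve)
{
	int ret;

	ret = add();
	if (ret)
		goto out;

	ret = resolve();
	if (ret)
		goto out;

	*(d->has_match) = true;

out:
	*(d->ret) = ret;
	return ret;
}

/* Caller: "ret" is deliberately not initialized; the callback sets it. */
static inline int toy_add_metric(toy_step_fn add, toy_step_fn resolve)
{
	bool has_match = false;
	int ret;
	struct toy_iter_data d = { .has_match = &has_match, .ret = &ret };

	toy_sys_event_iter(&d, add, resolve);

	/* No matching metric at all is reported as -EINVAL. */
	if (!has_match)
		ret = -EINVAL;
	return ret;
}

/* Small self-check of the success and no-match paths. */
static inline void toy_iter_selfcheck(void)
{
	assert(toy_add_metric(toy_ok, toy_ok) == 0);
	assert(toy_add_metric(toy_fail, toy_ok) == -EINVAL);
}
```

In this sketch the caller's "ret" is always written before it is read, because
the callback stores its result on every path. Had the callback returned
*d->ret without ever assigning it, as in the pre-patch code, the success path
would read an indeterminate value.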
Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
Signed-off-by: John Garry <john.garry@huawei.com>
---
tools/perf/util/metricgroup.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index c456fdeae06a..d3cf2dee36c8 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -1073,16 +1073,18 @@ static int metricgroup__add_metric_sys_event_iter(struct pmu_event *pe,
ret = add_metric(d->metric_list, pe, d->metric_no_group, &m, NULL, d->ids);
if (ret)
- return ret;
+ goto out;
ret = resolve_metric(d->metric_no_group,
d->metric_list, NULL, d->ids);
if (ret)
- return ret;
+ goto out;
*(d->has_match) = true;
- return *d->ret;
+out:
+ *(d->ret) = ret;
+ return ret;
}
static int metricgroup__add_metric(const char *metric, bool metric_no_group,
--
2.26.2
* Re: [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
2021-06-10 14:33 ` [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter() John Garry
@ 2021-06-10 18:45 ` Ian Rogers
2021-06-14 14:56 ` Arnaldo Carvalho de Melo
0 siblings, 1 reply; 6+ messages in thread
From: Ian Rogers @ 2021-06-10 18:45 UTC (permalink / raw)
To: John Garry
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
kajoljain, linux-perf-users, LKML
On Thu, Jun 10, 2021 at 7:37 AM John Garry <john.garry@huawei.com> wrote:
>
> The error code is not set at all in the sys event iter function.
>
> This may lead to an uninitialized value of "ret" in
> metricgroup__add_metric() when no CPU metric is added.
>
> Fix by properly setting the error code.
>
> It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
> if we have no CPU or sys event metric matching, then "has_match" should
> be 0 and "ret" is set to -EINVAL.
>
> However gcc cannot detect that it may not have been set after the
> map_for_each_metric() loop for CPU metrics, which is strange.
>
> Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
> Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Thanks,
Ian
> ---
> tools/perf/util/metricgroup.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index c456fdeae06a..d3cf2dee36c8 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -1073,16 +1073,18 @@ static int metricgroup__add_metric_sys_event_iter(struct pmu_event *pe,
>
> ret = add_metric(d->metric_list, pe, d->metric_no_group, &m, NULL, d->ids);
> if (ret)
> - return ret;
> + goto out;
>
> ret = resolve_metric(d->metric_no_group,
> d->metric_list, NULL, d->ids);
> if (ret)
> - return ret;
> + goto out;
>
> *(d->has_match) = true;
>
> - return *d->ret;
> +out:
> + *(d->ret) = ret;
> + return ret;
> }
>
> static int metricgroup__add_metric(const char *metric, bool metric_no_group,
> --
> 2.26.2
>
* Re: [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
2021-06-10 18:45 ` Ian Rogers
@ 2021-06-14 14:56 ` Arnaldo Carvalho de Melo
2021-06-15 17:51 ` Ian Rogers
0 siblings, 1 reply; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-14 14:56 UTC (permalink / raw)
To: Ian Rogers
Cc: John Garry, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Namhyung Kim, kajoljain,
linux-perf-users, LKML
On Thu, Jun 10, 2021 at 11:45:17AM -0700, Ian Rogers wrote:
> On Thu, Jun 10, 2021 at 7:37 AM John Garry <john.garry@huawei.com> wrote:
> >
> > The error code is not set at all in the sys event iter function.
> >
> > This may lead to an uninitialized value of "ret" in
> > metricgroup__add_metric() when no CPU metric is added.
> >
> > Fix by properly setting the error code.
> >
> > It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
> > if we have no CPU or sys event metric matching, then "has_match" should
> > be 0 and "ret" is set to -EINVAL.
> >
> > However gcc cannot detect that it may not have been set after the
> > map_for_each_metric() loop for CPU metrics, which is strange.
> >
> > Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
> > Signed-off-by: John Garry <john.garry@huawei.com>
>
> Acked-by: Ian Rogers <irogers@google.com>
Does your Acked-by apply to both patches? Or just 2/2? I reproduced the
problem fixed by 1/2 on a Thinkpad T450S (Broadwell), and after applying
the patch it no longer segfaults.
Please clarify,
- Arnaldo
> Thanks,
> Ian
>
> > ---
> > tools/perf/util/metricgroup.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> > index c456fdeae06a..d3cf2dee36c8 100644
> > --- a/tools/perf/util/metricgroup.c
> > +++ b/tools/perf/util/metricgroup.c
> > @@ -1073,16 +1073,18 @@ static int metricgroup__add_metric_sys_event_iter(struct pmu_event *pe,
> >
> > ret = add_metric(d->metric_list, pe, d->metric_no_group, &m, NULL, d->ids);
> > if (ret)
> > - return ret;
> > + goto out;
> >
> > ret = resolve_metric(d->metric_no_group,
> > d->metric_list, NULL, d->ids);
> > if (ret)
> > - return ret;
> > + goto out;
> >
> > *(d->has_match) = true;
> >
> > - return *d->ret;
> > +out:
> > + *(d->ret) = ret;
> > + return ret;
> > }
> >
> > static int metricgroup__add_metric(const char *metric, bool metric_no_group,
> > --
> > 2.26.2
> >
--
- Arnaldo
* Re: [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
2021-06-14 14:56 ` Arnaldo Carvalho de Melo
@ 2021-06-15 17:51 ` Ian Rogers
0 siblings, 0 replies; 6+ messages in thread
From: Ian Rogers @ 2021-06-15 17:51 UTC (permalink / raw)
To: Arnaldo Carvalho de Melo
Cc: John Garry, Peter Zijlstra, Ingo Molnar, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Namhyung Kim, kajoljain,
linux-perf-users, LKML
On Mon, Jun 14, 2021 at 7:56 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Thu, Jun 10, 2021 at 11:45:17AM -0700, Ian Rogers wrote:
> > On Thu, Jun 10, 2021 at 7:37 AM John Garry <john.garry@huawei.com> wrote:
> > >
> > > The error code is not set at all in the sys event iter function.
> > >
> > > This may lead to an uninitialized value of "ret" in
> > > metricgroup__add_metric() when no CPU metric is added.
> > >
> > > Fix by properly setting the error code.
> > >
> > > It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
> > > if we have no CPU or sys event metric matching, then "has_match" should
> > > be 0 and "ret" is set to -EINVAL.
> > >
> > > However gcc cannot detect that it may not have been set after the
> > > map_for_each_metric() loop for CPU metrics, which is strange.
> > >
> > > Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
> > > Signed-off-by: John Garry <john.garry@huawei.com>
> >
> > Acked-by: Ian Rogers <irogers@google.com>
>
> Does your Acked-by apply to both patches? Or just 2/2? I reproduced the
> problem fixed by 1/2 on a Thinkpad T450S (Broadwell), and after applying
> the patch it no longer segfaults.
IIRC I need to look at what is going on with the names in patch 1/2
and didn't have a repro. I don't mind acking it given that you've
reproduced the problem and confirmed the fix. In general this logic
isn't working well (especially for --metric-no-group), so I plan to
take a stab at reorganizing it.
Thanks,
Ian
> Please clarify,
>
> - Arnaldo
>
>
> > Thanks,
> > Ian
> >
> > > ---
> > > tools/perf/util/metricgroup.c | 8 +++++---
> > > 1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> > > index c456fdeae06a..d3cf2dee36c8 100644
> > > --- a/tools/perf/util/metricgroup.c
> > > +++ b/tools/perf/util/metricgroup.c
> > > @@ -1073,16 +1073,18 @@ static int metricgroup__add_metric_sys_event_iter(struct pmu_event *pe,
> > >
> > > ret = add_metric(d->metric_list, pe, d->metric_no_group, &m, NULL, d->ids);
> > > if (ret)
> > > - return ret;
> > > + goto out;
> > >
> > > ret = resolve_metric(d->metric_no_group,
> > > d->metric_list, NULL, d->ids);
> > > if (ret)
> > > - return ret;
> > > + goto out;
> > >
> > > *(d->has_match) = true;
> > >
> > > - return *d->ret;
> > > +out:
> > > + *(d->ret) = ret;
> > > + return ret;
> > > }
> > >
> > > static int metricgroup__add_metric(const char *metric, bool metric_no_group,
> > > --
> > > 2.26.2
> > >
>
> --
>
> - Arnaldo