linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] perf metricgroups: A couple of fixes
@ 2021-06-10 14:32 John Garry
  2021-06-10 14:32 ` [PATCH 1/2] perf metricgroup: Fix find_evsel_group() event selector John Garry
  2021-06-10 14:33 ` [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter() John Garry
  0 siblings, 2 replies; 6+ messages in thread
From: John Garry @ 2021-06-10 14:32 UTC
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, kjain
  Cc: linux-perf-users, linux-kernel, John Garry

This series fixes a couple of subtle issues.

The first fixes a segfault on my x86 Broadwell when running the
'stat' command with a particular order of metrics.

As mentioned at [0], there is a pre-existing issue here which needs fixing,
as this still does not work properly; however, I think that is a bigger job,
and getting rid of the segfault is the best I can do for the moment.

The second fixes an issue of an uninitialized variable. As noted in the
commit message, gcc does not seem to do a good job of picking up on this.

[0] https://lore.kernel.org/lkml/49c6fccb-b716-1bf0-18a6-cace1cdb66b9@huawei.com/

John Garry (2):
  perf metricgroup: Fix find_evsel_group() event selector
  perf metricgroup: Return error code from
    metricgroup__add_metric_sys_event_iter()

 tools/perf/util/metricgroup.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

-- 
2.26.2



* [PATCH 1/2] perf metricgroup: Fix find_evsel_group() event selector
  2021-06-10 14:32 [PATCH 0/2] perf metricgroups: A couple of fixes John Garry
@ 2021-06-10 14:32 ` John Garry
  2021-06-10 14:33 ` [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter() John Garry
  1 sibling, 0 replies; 6+ messages in thread
From: John Garry @ 2021-06-10 14:32 UTC
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, kjain
  Cc: linux-perf-users, linux-kernel, John Garry

The following command segfaults on my x86 Broadwell:

$ ./perf stat  -M frontend_bound,retiring,backend_bound,bad_speculation sleep 1
WARNING: grouped events cpus do not match, disabling group:
  anon group { raw 0x10e }
  anon group { raw 0x10e }
perf: util/evsel.c:1596: get_group_fd: Assertion `!(!leader->core.fd)' failed.
Aborted (core dumped)

The issue shows itself as a use-after-free in evlist__check_cpu_maps(),
whereby the leader of an event selector (evsel) has been deleted, yet we
still attempt to check that leader for an evsel.

Fundamentally the problem comes from metricgroup__setup_events() ->
find_evsel_group(), and has developed from the previous fix attempt in
commit 9c880c24cb0d ("perf metricgroup: Fix for metrics containing
duration_time").

The problem now is that the logic in checking if an evsel is in the same
group is subtly broken for the "cycles" event. For the "cycles" event, the
pmu_name is NULL; however, the logic in find_evsel_group() may set an event
matched against "cycles" as used, when it should not be.

This leads to a condition where an evsel is set, yet its leader is not.

Fix the check for evsel pmu_name by not matching evsels when either has a
NULL pmu_name.
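
As a rough illustration only, the effect of the changed check can be
sketched in isolation. The struct evsel_sketch type and the must_skip()
helper below are hypothetical stand-ins, not perf's real struct evsel or
the real loop in find_evsel_group(); the PMU name strings used with them
are arbitrary. Only the body of same_pmu_or_none() follows the diff in
this patch. Since the caller breaks out when the helper returns true and
the leaders differ, returning true for a NULL pmu_name means such an
evsel is not matched.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct evsel; only pmu_name is modelled. */
struct evsel_sketch {
	const char *pmu_name;	/* NULL for e.g. the "cycles" event */
};

/* Same logic as evsel_same_pmu_or_none() in this patch. */
static bool same_pmu_or_none(const struct evsel_sketch *ev1,
			     const struct evsel_sketch *ev2)
{
	if (!ev1->pmu_name || !ev2->pmu_name)
		return true;

	return !strcmp(ev1->pmu_name, ev2->pmu_name);
}

/*
 * Hypothetical reduction of the condition in find_evsel_group(): returns
 * true when the candidate must be skipped (the real loop breaks there).
 */
static bool must_skip(const struct evsel_sketch *ev_leader,
		      const struct evsel_sketch *metric_leader,
		      bool has_constraint)
{
	return !has_constraint &&
	       ev_leader != metric_leader &&
	       same_pmu_or_none(ev_leader, metric_leader);
}
```

With the old helper (returning false for a NULL pmu_name), must_skip()
would have been false for a "cycles" leader, so the name comparison that
follows in the real loop could still mark the event as used.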

There is still a pre-existing metric issue whereby the ordering of the
metrics may break the 'stat' function, as discussed at:
https://lore.kernel.org/lkml/49c6fccb-b716-1bf0-18a6-cace1cdb66b9@huawei.com/

Fixes: 9c880c24cb0d ("perf metricgroup: Fix for metrics containing duration_time")
Signed-off-by: John Garry <john.garry@huawei.com>
---
 tools/perf/util/metricgroup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 8336dd8e8098..c456fdeae06a 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -162,10 +162,10 @@ static bool contains_event(struct evsel **metric_events, int num_events,
 	return false;
 }
 
-static bool evsel_same_pmu(struct evsel *ev1, struct evsel *ev2)
+static bool evsel_same_pmu_or_none(struct evsel *ev1, struct evsel *ev2)
 {
 	if (!ev1->pmu_name || !ev2->pmu_name)
-		return false;
+		return true;
 
 	return !strcmp(ev1->pmu_name, ev2->pmu_name);
 }
@@ -288,7 +288,7 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
 			 */
 			if (!has_constraint &&
 			    ev->leader != metric_events[i]->leader &&
-			    evsel_same_pmu(ev->leader, metric_events[i]->leader))
+			    evsel_same_pmu_or_none(ev->leader, metric_events[i]->leader))
 				break;
 			if (!strcmp(metric_events[i]->name, ev->name)) {
 				set_bit(ev->idx, evlist_used);
-- 
2.26.2



* [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
  2021-06-10 14:32 [PATCH 0/2] perf metricgroups: A couple of fixes John Garry
  2021-06-10 14:32 ` [PATCH 1/2] perf metricgroup: Fix find_evsel_group() event selector John Garry
@ 2021-06-10 14:33 ` John Garry
  2021-06-10 18:45   ` Ian Rogers
  1 sibling, 1 reply; 6+ messages in thread
From: John Garry @ 2021-06-10 14:33 UTC
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
	namhyung, irogers, kjain
  Cc: linux-perf-users, linux-kernel, John Garry

The error code is not set at all in the sys event iter function.

This may lead to an uninitialized value of "ret" in
metricgroup__add_metric() when no CPU metric is added.

Fix by properly setting the error code.

It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
if we have no CPU or sys event metric matching, then "has_match" should
be 0 and "ret" is set to -EINVAL.

However gcc cannot detect that it may not have been set after the
map_for_each_metric() loop for CPU metrics, which is strange.
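
As a minimal sketch of the pattern, assuming simplified types: the names
iter_data_sketch, step_sketch(), iter_cb_sketch() and add_metric_sketch()
are hypothetical and only mirror the shape of the code touched by this
patch. The callback writes its result through d->ret on every exit path,
and the caller only reads "ret" after either the callback has run or
"has_match" has forced -EINVAL.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical iterator data, loosely modelled on the fields in the diff. */
struct iter_data_sketch {
	bool *has_match;
	int *ret;
};

/* Hypothetical stand-in for add_metric()/resolve_metric(). */
static int step_sketch(int input)
{
	return input < 0 ? -EINVAL : 0;
}

/* Every exit path stores the result through d->ret, as in the fix. */
static int iter_cb_sketch(int input, struct iter_data_sketch *d)
{
	int ret;

	ret = step_sketch(input);
	if (ret)
		goto out;

	*(d->has_match) = true;

out:
	*(d->ret) = ret;
	return ret;
}

/* Caller: "ret" has no initializer, mirroring the description above. */
static int add_metric_sketch(const int *inputs, size_t n)
{
	bool has_match = false;
	int ret;
	struct iter_data_sketch d = { .has_match = &has_match, .ret = &ret };
	size_t i;

	/* The loop body may never run, so "ret" may still be unset here. */
	for (i = 0; i < n; i++) {
		if (iter_cb_sketch(inputs[i], &d))
			break;
	}

	/* No match at all: overwrite "ret" before it is ever read. */
	if (!has_match)
		ret = -EINVAL;

	return ret;
}
```

If the callback returned early without writing through d->ret, as the
old code did, the caller could read "ret" after a failed iteration that
followed an earlier match, and get whatever value happened to be there.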

Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
Signed-off-by: John Garry <john.garry@huawei.com>
---
 tools/perf/util/metricgroup.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index c456fdeae06a..d3cf2dee36c8 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -1073,16 +1073,18 @@ static int metricgroup__add_metric_sys_event_iter(struct pmu_event *pe,
 
 	ret = add_metric(d->metric_list, pe, d->metric_no_group, &m, NULL, d->ids);
 	if (ret)
-		return ret;
+		goto out;
 
 	ret = resolve_metric(d->metric_no_group,
 				     d->metric_list, NULL, d->ids);
 	if (ret)
-		return ret;
+		goto out;
 
 	*(d->has_match) = true;
 
-	return *d->ret;
+out:
+	*(d->ret) = ret;
+	return ret;
 }
 
 static int metricgroup__add_metric(const char *metric, bool metric_no_group,
-- 
2.26.2



* Re: [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
  2021-06-10 14:33 ` [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter() John Garry
@ 2021-06-10 18:45   ` Ian Rogers
  2021-06-14 14:56     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Rogers @ 2021-06-10 18:45 UTC
  To: John Garry
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	kajoljain, linux-perf-users, LKML

On Thu, Jun 10, 2021 at 7:37 AM John Garry <john.garry@huawei.com> wrote:
>
> The error code is not set at all in the sys event iter function.
>
> This may lead to an uninitialized value of "ret" in
> metricgroup__add_metric() when no CPU metric is added.
>
> Fix by properly setting the error code.
>
> It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
> if we have no CPU or sys event metric matching, then "has_match" should
> be 0 and "ret" is set to -EINVAL.
>
> However gcc cannot detect that it may not have been set after the
> map_for_each_metric() loop for CPU metrics, which is strange.
>
> Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
> Signed-off-by: John Garry <john.garry@huawei.com>

Acked-by: Ian Rogers <irogers@google.com>

Thanks,
Ian



* Re: [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
  2021-06-10 18:45   ` Ian Rogers
@ 2021-06-14 14:56     ` Arnaldo Carvalho de Melo
  2021-06-15 17:51       ` Ian Rogers
  0 siblings, 1 reply; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-14 14:56 UTC
  To: Ian Rogers
  Cc: John Garry, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, kajoljain,
	linux-perf-users, LKML

On Thu, Jun 10, 2021 at 11:45:17AM -0700, Ian Rogers wrote:
> On Thu, Jun 10, 2021 at 7:37 AM John Garry <john.garry@huawei.com> wrote:
> >
> > The error code is not set at all in the sys event iter function.
> >
> > This may lead to an uninitialized value of "ret" in
> > metricgroup__add_metric() when no CPU metric is added.
> >
> > Fix by properly setting the error code.
> >
> > It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
> > if we have no CPU or sys event metric matching, then "has_match" should
> > be 0 and "ret" is set to -EINVAL.
> >
> > However gcc cannot detect that it may not have been set after the
> > map_for_each_metric() loop for CPU metrics, which is strange.
> >
> > Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
> > Signed-off-by: John Garry <john.garry@huawei.com>
> 
> Acked-by: Ian Rogers <irogers@google.com>

Does your Acked-by apply to both patches, or just 2/2? I reproduced the
problem fixed by 1/2 on a ThinkPad T450s (Broadwell), and after applying
the patch it no longer segfaults.

Please clarify,

- Arnaldo

 
> Thanks,
> Ian
> 

-- 

- Arnaldo


* Re: [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
  2021-06-14 14:56     ` Arnaldo Carvalho de Melo
@ 2021-06-15 17:51       ` Ian Rogers
  0 siblings, 0 replies; 6+ messages in thread
From: Ian Rogers @ 2021-06-15 17:51 UTC
  To: Arnaldo Carvalho de Melo
  Cc: John Garry, Peter Zijlstra, Ingo Molnar, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, kajoljain,
	linux-perf-users, LKML

On Mon, Jun 14, 2021 at 7:56 AM Arnaldo Carvalho de Melo
<acme@kernel.org> wrote:
>
> On Thu, Jun 10, 2021 at 11:45:17AM -0700, Ian Rogers wrote:
> > On Thu, Jun 10, 2021 at 7:37 AM John Garry <john.garry@huawei.com> wrote:
> > >
> > > The error code is not set at all in the sys event iter function.
> > >
> > > This may lead to an uninitialized value of "ret" in
> > > metricgroup__add_metric() when no CPU metric is added.
> > >
> > > Fix by properly setting the error code.
> > >
> > > It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
> > > if we have no CPU or sys event metric matching, then "has_match" should
> > > be 0 and "ret" is set to -EINVAL.
> > >
> > > However gcc cannot detect that it may not have been set after the
> > > map_for_each_metric() loop for CPU metrics, which is strange.
> > >
> > > Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
> > > Signed-off-by: John Garry <john.garry@huawei.com>
> >
> > Acked-by: Ian Rogers <irogers@google.com>
>
> Does your Acked-by apply to both patches, or just 2/2? I reproduced the
> problem fixed by 1/2 on a ThinkPad T450s (Broadwell), and after applying
> the patch it no longer segfaults.

IIRC I needed to look at what is going on with the names in patch 1/2
and didn't have a repro. I don't mind acking it, given that you've
reproduced the problem and confirmed the fix. In general this logic
isn't working well (especially for --metric-no-group), so I plan to
take a stab at reorganizing it.

Thanks,
Ian

> Please clarify,
>
> - Arnaldo
>
>
> > Thanks,
> > Ian
> >
>
> --
>
> - Arnaldo

