* [PATCH 0/2] perf metricgroups: A couple of fixes
@ 2021-06-10 14:32 John Garry
  2021-06-10 14:32 ` [PATCH 1/2] perf metricgroup: Fix find_evsel_group() event selector John Garry
  2021-06-10 14:33 ` [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter() John Garry
  0 siblings, 2 replies; 6+ messages in thread
From: John Garry @ 2021-06-10 14:32 UTC
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
      namhyung, irogers, kjain
  Cc: linux-perf-users, linux-kernel, John Garry

This series fixes a couple of subtle issues.

The first fixes a segfault on my x86 broadwell when running the 'stat'
command with a particular order of metrics. As mentioned at [0], there
is a pre-existing issue here which needs fixing, as this still does not
work properly; however, I think that is a bigger job, and getting rid of
the segfault is the best I can do for the moment.

The second fixes an issue of an uninitialized variable. As noted in the
commit message, gcc does not seem to do a good job of picking up on
this.

[0] https://lore.kernel.org/lkml/49c6fccb-b716-1bf0-18a6-cace1cdb66b9@huawei.com/

John Garry (2):
  perf metricgroup: Fix find_evsel_group() event selector
  perf metricgroup: Return error code from
    metricgroup__add_metric_sys_event_iter()

 tools/perf/util/metricgroup.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

--
2.26.2

* [PATCH 1/2] perf metricgroup: Fix find_evsel_group() event selector
  2021-06-10 14:32 [PATCH 0/2] perf metricgroups: A couple of fixes John Garry
@ 2021-06-10 14:32 ` John Garry
  2021-06-10 14:33 ` [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter() John Garry
  1 sibling, 0 replies; 6+ messages in thread
From: John Garry @ 2021-06-10 14:32 UTC
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
      namhyung, irogers, kjain
  Cc: linux-perf-users, linux-kernel, John Garry

The following command segfaults on my x86 broadwell:

$ ./perf stat -M frontend_bound,retiring,backend_bound,bad_speculation sleep 1
    WARNING: grouped events cpus do not match, disabling group:
      anon group { raw 0x10e }
      anon group { raw 0x10e }
perf: util/evsel.c:1596: get_group_fd: Assertion `!(!leader->core.fd)' failed.
Aborted (core dumped)

The issue shows itself as a use-after-free in evlist__check_cpu_maps(),
whereby the leader of an event selector (evsel) has been deleted (yet we
still attempt to verify for an evsel).

Fundamentally the problem comes from metricgroup__setup_events() ->
find_evsel_group(), and has developed from the previous fix attempt in
commit 9c880c24cb0d ("perf metricgroup: Fix for metrics containing
duration_time").

The problem now is that the logic in checking if an evsel is in the same
group is subtly broken for the "cycles" event. For the "cycles" event,
the pmu_name is NULL; however the logic in find_evsel_group() may set an
event matched against "cycles" as used, when it should not be.

This leads to a condition where an evsel is set, yet its leader is not.

Fix the check for evsel pmu_name by not matching evsels when either has
a NULL pmu_name.

There is still a pre-existing metric issue whereby the ordering of the
metrics may break the 'stat' function, as discussed at:
https://lore.kernel.org/lkml/49c6fccb-b716-1bf0-18a6-cace1cdb66b9@huawei.com/

Fixes: 9c880c24cb0d ("perf metricgroup: Fix for metrics containing duration_time")
Signed-off-by: John Garry <john.garry@huawei.com>
---
 tools/perf/util/metricgroup.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 8336dd8e8098..c456fdeae06a 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -162,10 +162,10 @@ static bool contains_event(struct evsel **metric_events, int num_events,
 	return false;
 }
 
-static bool evsel_same_pmu(struct evsel *ev1, struct evsel *ev2)
+static bool evsel_same_pmu_or_none(struct evsel *ev1, struct evsel *ev2)
 {
 	if (!ev1->pmu_name || !ev2->pmu_name)
-		return false;
+		return true;
 
 	return !strcmp(ev1->pmu_name, ev2->pmu_name);
 }
@@ -288,7 +288,7 @@ static struct evsel *find_evsel_group(struct evlist *perf_evlist,
 			 */
 			if (!has_constraint &&
 			    ev->leader != metric_events[i]->leader &&
-			    evsel_same_pmu(ev->leader, metric_events[i]->leader))
+			    evsel_same_pmu_or_none(ev->leader, metric_events[i]->leader))
 				break;
 			if (!strcmp(metric_events[i]->name, ev->name)) {
 				set_bit(ev->idx, evlist_used);
--
2.26.2

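The effect of the helper change in patch 1/2 can be sketched in isolation. The following is a minimal, standalone illustration and not perf code: struct toy_evsel and every toy_* function are hypothetical stand-ins, and only the NULL pmu_name handling and the shape of the three-part condition are taken from the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for perf's struct evsel; only pmu_name is modelled. */
struct toy_evsel {
	const char *pmu_name;	/* NULL for an event such as "cycles" */
};

/* Helper as it was before the patch: a NULL pmu_name yields false. */
static bool toy_same_pmu(const struct toy_evsel *a, const struct toy_evsel *b)
{
	if (!a->pmu_name || !b->pmu_name)
		return false;

	return !strcmp(a->pmu_name, b->pmu_name);
}

/* Helper as it is after the patch: a NULL pmu_name on either side yields true. */
static bool toy_same_pmu_or_none(const struct toy_evsel *a,
				 const struct toy_evsel *b)
{
	if (!a->pmu_name || !b->pmu_name)
		return true;

	return !strcmp(a->pmu_name, b->pmu_name);
}

/*
 * Same shape as the condition in the second hunk: the search stops
 * ("break") when there is no constraint, the leaders differ, and the
 * helper returns true.
 */
static bool toy_stops_search(bool has_constraint, bool same_leader,
			     bool helper_result)
{
	return !has_constraint && !same_leader && helper_result;
}
```

With a NULL pmu_name on one side, the old helper returns false, so the condition never fires and the search carries on to the name comparison; the new helper returns true, so the search stops before the event can be marked as used. Two different non-NULL names still compare as different PMUs under both helpers.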
* [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
  2021-06-10 14:32 [PATCH 0/2] perf metricgroups: A couple of fixes John Garry
  2021-06-10 14:32 ` [PATCH 1/2] perf metricgroup: Fix find_evsel_group() event selector John Garry
@ 2021-06-10 14:33 ` John Garry
  2021-06-10 18:45   ` Ian Rogers
  1 sibling, 1 reply; 6+ messages in thread
From: John Garry @ 2021-06-10 14:33 UTC
  To: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
      namhyung, irogers, kjain
  Cc: linux-perf-users, linux-kernel, John Garry

The error code is not set at all in the sys event iter function.

This may lead to an uninitialized value of "ret" in
metricgroup__add_metric() when no CPU metric is added.

Fix by properly setting the error code.

It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
if we have no CPU or sys event metric matching, then "has_match" should
be 0 and "ret" is set to -EINVAL.

However gcc cannot detect that it may not have been set after the
map_for_each_metric() loop for CPU metrics, which is strange.

Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
Signed-off-by: John Garry <john.garry@huawei.com>
---
 tools/perf/util/metricgroup.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index c456fdeae06a..d3cf2dee36c8 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -1073,16 +1073,18 @@ static int metricgroup__add_metric_sys_event_iter(struct pmu_event *pe,
 
 	ret = add_metric(d->metric_list, pe, d->metric_no_group, &m, NULL, d->ids);
 	if (ret)
-		return ret;
+		goto out;
 
 	ret = resolve_metric(d->metric_no_group,
 				     d->metric_list, NULL, d->ids);
 	if (ret)
-		return ret;
+		goto out;
 
 	*(d->has_match) = true;
 
-	return *d->ret;
+out:
+	*(d->ret) = ret;
+	return ret;
 }
 
 static int metricgroup__add_metric(const char *metric, bool metric_no_group,
--
2.26.2

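The control flow introduced by patch 2/2 can be sketched on its own. This is a minimal illustration under stated assumptions, not the perf implementation: struct toy_iter_data, toy_step() and toy_iter() are hypothetical names, and the two injected error values stand in for the results of add_metric() and resolve_metric(). Only the "goto out" structure and the write through the caller's "ret" pointer follow the diff above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the iterator's private data. */
struct toy_iter_data {
	bool *has_match;	/* set to true only when every step succeeds */
	int *ret;		/* caller's error variable, written on every exit */
};

/* Hypothetical step that succeeds (0) or fails (negative errno) on demand. */
static int toy_step(int err)
{
	return err;
}

/*
 * Single-exit callback: every path, success or failure, passes through
 * "out", so the caller's variable behind d->ret is always written.
 */
static int toy_iter(struct toy_iter_data *d, int add_err, int resolve_err)
{
	int ret;

	ret = toy_step(add_err);
	if (ret)
		goto out;

	ret = toy_step(resolve_err);
	if (ret)
		goto out;

	*(d->has_match) = true;

out:
	*(d->ret) = ret;
	return ret;
}
```

Before the patch, the early "return ret" paths left the caller's variable untouched and the final "return *d->ret" read a value that nothing in the function had written. Funnelling every exit through one label is the usual kernel-style way to make sure the out-parameter is assigned exactly once per call.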
* Re: [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
  2021-06-10 14:33 ` [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter() John Garry
@ 2021-06-10 18:45   ` Ian Rogers
  2021-06-14 14:56     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 6+ messages in thread
From: Ian Rogers @ 2021-06-10 18:45 UTC
  To: John Garry
  Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
      Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
      kajoljain, linux-perf-users, LKML

On Thu, Jun 10, 2021 at 7:37 AM John Garry <john.garry@huawei.com> wrote:
>
> The error code is not set at all in the sys event iter function.
>
> This may lead to an uninitialized value of "ret" in
> metricgroup__add_metric() when no CPU metric is added.
>
> Fix by properly setting the error code.
>
> It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
> if we have no CPU or sys event metric matching, then "has_match" should
> be 0 and "ret" is set to -EINVAL.
>
> However gcc cannot detect that it may not have been set after the
> map_for_each_metric() loop for CPU metrics, which is strange.
>
> Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
> Signed-off-by: John Garry <john.garry@huawei.com>

Acked-by: Ian Rogers <irogers@google.com>

Thanks,
Ian

> ---
>  tools/perf/util/metricgroup.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index c456fdeae06a..d3cf2dee36c8 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -1073,16 +1073,18 @@ static int metricgroup__add_metric_sys_event_iter(struct pmu_event *pe,
>
>  	ret = add_metric(d->metric_list, pe, d->metric_no_group, &m, NULL, d->ids);
>  	if (ret)
> -		return ret;
> +		goto out;
>
>  	ret = resolve_metric(d->metric_no_group,
>  				     d->metric_list, NULL, d->ids);
>  	if (ret)
> -		return ret;
> +		goto out;
>
>  	*(d->has_match) = true;
>
> -	return *d->ret;
> +out:
> +	*(d->ret) = ret;
> +	return ret;
>  }
>
>  static int metricgroup__add_metric(const char *metric, bool metric_no_group,
> --
> 2.26.2
>

* Re: [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
  2021-06-10 18:45   ` Ian Rogers
@ 2021-06-14 14:56     ` Arnaldo Carvalho de Melo
  2021-06-15 17:51       ` Ian Rogers
  0 siblings, 1 reply; 6+ messages in thread
From: Arnaldo Carvalho de Melo @ 2021-06-14 14:56 UTC
  To: Ian Rogers
  Cc: John Garry, Peter Zijlstra, Ingo Molnar, Mark Rutland,
      Alexander Shishkin, Jiri Olsa, Namhyung Kim, kajoljain,
      linux-perf-users, LKML

On Thu, Jun 10, 2021 at 11:45:17AM -0700, Ian Rogers wrote:
> On Thu, Jun 10, 2021 at 7:37 AM John Garry <john.garry@huawei.com> wrote:
> >
> > The error code is not set at all in the sys event iter function.
> >
> > This may lead to an uninitialized value of "ret" in
> > metricgroup__add_metric() when no CPU metric is added.
> >
> > Fix by properly setting the error code.
> >
> > It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
> > if we have no CPU or sys event metric matching, then "has_match" should
> > be 0 and "ret" is set to -EINVAL.
> >
> > However gcc cannot detect that it may not have been set after the
> > map_for_each_metric() loop for CPU metrics, which is strange.
> >
> > Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
> > Signed-off-by: John Garry <john.garry@huawei.com>
>
> Acked-by: Ian Rogers <irogers@google.com>

Does your Acked-by apply to both patches? Or just 2/2? I reproduced the
problem fixed by 1/2 on a Thinkpad T450S (broadwell) and after applying
the patch it no longer segfaults.

Please clarify,

- Arnaldo

> Thanks,
> Ian
>
> > ---
> >  tools/perf/util/metricgroup.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> > index c456fdeae06a..d3cf2dee36c8 100644
> > --- a/tools/perf/util/metricgroup.c
> > +++ b/tools/perf/util/metricgroup.c
> > @@ -1073,16 +1073,18 @@ static int metricgroup__add_metric_sys_event_iter(struct pmu_event *pe,
> >
> >  	ret = add_metric(d->metric_list, pe, d->metric_no_group, &m, NULL, d->ids);
> >  	if (ret)
> > -		return ret;
> > +		goto out;
> >
> >  	ret = resolve_metric(d->metric_no_group,
> >  				     d->metric_list, NULL, d->ids);
> >  	if (ret)
> > -		return ret;
> > +		goto out;
> >
> >  	*(d->has_match) = true;
> >
> > -	return *d->ret;
> > +out:
> > +	*(d->ret) = ret;
> > +	return ret;
> >  }
> >
> >  static int metricgroup__add_metric(const char *metric, bool metric_no_group,
> > --
> > 2.26.2
> >

--

- Arnaldo

* Re: [PATCH 2/2] perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
  2021-06-14 14:56     ` Arnaldo Carvalho de Melo
@ 2021-06-15 17:51       ` Ian Rogers
  0 siblings, 0 replies; 6+ messages in thread
From: Ian Rogers @ 2021-06-15 17:51 UTC
  To: Arnaldo Carvalho de Melo
  Cc: John Garry, Peter Zijlstra, Ingo Molnar, Mark Rutland,
      Alexander Shishkin, Jiri Olsa, Namhyung Kim, kajoljain,
      linux-perf-users, LKML

On Mon, Jun 14, 2021 at 7:56 AM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> On Thu, Jun 10, 2021 at 11:45:17AM -0700, Ian Rogers wrote:
> > On Thu, Jun 10, 2021 at 7:37 AM John Garry <john.garry@huawei.com> wrote:
> > >
> > > The error code is not set at all in the sys event iter function.
> > >
> > > This may lead to an uninitialized value of "ret" in
> > > metricgroup__add_metric() when no CPU metric is added.
> > >
> > > Fix by properly setting the error code.
> > >
> > > It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
> > > if we have no CPU or sys event metric matching, then "has_match" should
> > > be 0 and "ret" is set to -EINVAL.
> > >
> > > However gcc cannot detect that it may not have been set after the
> > > map_for_each_metric() loop for CPU metrics, which is strange.
> > >
> > > Fixes: be335ec28efa8 ("perf metricgroup: Support adding metrics for system PMUs")
> > > Signed-off-by: John Garry <john.garry@huawei.com>
> >
> > Acked-by: Ian Rogers <irogers@google.com>
>
> Does your Acked-by apply to both patches? Or just 2/2? I reproduced the
> problem fixed by 1/2 on a Thinkpad T450S (broadwell) and after applying
> the patch it no longer segfaults.

IIRC I need to look at what is going on with the names in patch 1/2
and didn't have a repro. I don't mind acking it given that you've
repro-ed the problem and confirmed the fix. In general this logic
isn't working well (especially for --metric-no-group), so I plan to
take a stab at reorganizing it.

Thanks,
Ian

> Please clarify,
>
> - Arnaldo
>
>
> > Thanks,
> > Ian
> >
> > > ---
> > >  tools/perf/util/metricgroup.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> > > index c456fdeae06a..d3cf2dee36c8 100644
> > > --- a/tools/perf/util/metricgroup.c
> > > +++ b/tools/perf/util/metricgroup.c
> > > @@ -1073,16 +1073,18 @@ static int metricgroup__add_metric_sys_event_iter(struct pmu_event *pe,
> > >
> > >  	ret = add_metric(d->metric_list, pe, d->metric_no_group, &m, NULL, d->ids);
> > >  	if (ret)
> > > -		return ret;
> > > +		goto out;
> > >
> > >  	ret = resolve_metric(d->metric_no_group,
> > >  				     d->metric_list, NULL, d->ids);
> > >  	if (ret)
> > > -		return ret;
> > > +		goto out;
> > >
> > >  	*(d->has_match) = true;
> > >
> > > -	return *d->ret;
> > > +out:
> > > +	*(d->ret) = ret;
> > > +	return ret;
> > >  }
> > >
> > >  static int metricgroup__add_metric(const char *metric, bool metric_no_group,
> > > --
> > > 2.26.2
> > >
>
> --
>
> - Arnaldo
