From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8FDA4CCA47C for ; Tue, 7 Jun 2022 22:07:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1378735AbiFGWHO (ORCPT ); Tue, 7 Jun 2022 18:07:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37648 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1383595AbiFGWF6 (ORCPT ); Tue, 7 Jun 2022 18:05:58 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3801F251486; Tue, 7 Jun 2022 12:16:37 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id C202FB823CB; Tue, 7 Jun 2022 19:16:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id F12C9C385A2; Tue, 7 Jun 2022 19:16:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1654629394; bh=A74qveUecxfw6OpCDbn/9Bp9Pz+MgltAUiR0d4gktVI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nrnd4XO7p5lXZQZpQEt2gh1+EtXq2h/GfY4v/8v/nvfvqSjwS63PVCzrEBV/qhGn8 qzoT4W+v3mu5CHD/oHNMFDIkgpSbHCkIRwDnt7ffcwJh/3tr+d3B/004MhvgDw1s94 WESQ2ui5lUzbxlkRXG+J5GHVmA57MmE73IT2SVX8= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Kan Liang , Ian Rogers , Adrian Hunter , Alexander Shishkin , Andi Kleen , Florian Fischer , Ingo Molnar , James Clark , Jiri Olsa , John Garry , Kim Phillips , Madhavan Srinivasan , Mark Rutland , Namhyung Kim , Peter Zijlstra , Riccardo Mancini , Shunsuke Nakamura , Stephane Eranian , Xing Zhengjun , Arnaldo Carvalho de Melo , Sasha Levin Subject: [PATCH 5.18 635/879] perf evlist: Keep topdown counters in weak group Date: Tue, 7 Jun 2022 19:02:33 +0200 Message-Id: <20220607165021.279952657@linuxfoundation.org> X-Mailer: git-send-email 2.36.1 In-Reply-To: <20220607165002.659942637@linuxfoundation.org> References: <20220607165002.659942637@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Ian Rogers [ Upstream commit d98079c05b5a5411c6030c47b6256cbeeeff77d0 ] On Intel Icelake, topdown events must always be grouped with a slots event as leader. When a metric is parsed a weak group is formed and retried if perf_event_open fails. The retried events aren't grouped breaking the slots leader requirement. This change modifies the weak group "reset" behavior so that topdown events aren't broken from the group for the retry. $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1 Performance counter stats for 'system wide': 47,867,188,483 slots (92.27%) topdown-bad-spec topdown-be-bound topdown-fe-bound topdown-retiring 2,173,346,937 branch-instructions (92.27%) 10,540,253 branch-misses # 0.48% of all branches (92.29%) 96,291,140 bus-cycles (92.29%) 6,214,202 cache-misses # 20.120 % of all cache refs (92.29%) 30,886,082 cache-references (76.91%) 11,773,726,641 cpu-cycles (84.62%) 11,807,585,307 instructions # 1.00 insn per cycle (92.31%) 0 mem-loads (92.32%) 2,212,928,573 mem-stores (84.69%) 10,024,403,118 ref-cycles (92.35%) 16,232,978 baclears.any (92.35%) 23,832,633 ARITH.DIVIDER_ACTIVE (84.59%) 0.981070734 seconds time elapsed After: $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1 Performance counter stats for 'system wide': 31040189283 slots (92.27%) 8997514811 topdown-bad-spec # 28.2% bad speculation (92.27%) 10997536028 topdown-be-bound # 34.5% backend bound (92.27%) 4778060526 topdown-fe-bound # 15.0% frontend bound (92.27%) 7086628768 topdown-retiring # 22.2% retiring (92.27%) 1417611942 branch-instructions (92.26%) 5285529 branch-misses # 0.37% of all branches (92.28%) 62922469 bus-cycles (92.29%) 1440708 cache-misses # 8.292 % of all cache refs (92.30%) 17374098 cache-references (76.94%) 8040889520 cpu-cycles (84.63%) 7709992319 instructions # 0.96 insn per cycle (92.32%) 0 mem-loads (92.32%) 1515669558 mem-stores (84.68%) 6542411177 ref-cycles (92.35%) 4154149 baclears.any (92.35%) 20556152 ARITH.DIVIDER_ACTIVE (84.59%) 1.010799593 seconds time elapsed Reviewed-by: Kan Liang Signed-off-by: Ian Rogers Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Andi Kleen Cc: Florian Fischer Cc: Ingo Molnar Cc: James Clark Cc: Jiri Olsa Cc: John Garry Cc: Kim Phillips Cc: Madhavan Srinivasan Cc: Mark Rutland Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Riccardo Mancini Cc: Shunsuke Nakamura Cc: Stephane Eranian Cc: Xing Zhengjun Link: https://lore.kernel.org/r/20220517052724.283874-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Sasha Levin --- tools/perf/arch/x86/util/evsel.c | 12 ++++++++++++ tools/perf/util/evlist.c | 16 ++++++++++++++-- tools/perf/util/evsel.c | 10 ++++++++++ tools/perf/util/evsel.h | 3 +++ 4 files changed, 39 insertions(+), 2 deletions(-) diff --git a/tools/perf/arch/x86/util/evsel.c b/tools/perf/arch/x86/util/evsel.c index ac2899a25b7a..00cb4466b4ca 100644 --- a/tools/perf/arch/x86/util/evsel.c +++ b/tools/perf/arch/x86/util/evsel.c @@ -3,6 +3,7 @@ #include #include "util/evsel.h" #include "util/env.h" +#include "util/pmu.h" #include "linux/string.h" void arch_evsel__set_sample_weight(struct evsel *evsel) @@ -29,3 +30,14 @@ void arch_evsel__fixup_new_cycles(struct perf_event_attr *attr) free(env.cpuid); } + +bool arch_evsel__must_be_in_group(const struct evsel *evsel) +{ + if ((evsel->pmu_name && strcmp(evsel->pmu_name, "cpu")) || + !pmu_have_event("cpu", "slots")) + return false; + + return evsel->name && + (!strcasecmp(evsel->name, "slots") || + strcasestr(evsel->name, "topdown")); +} diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c index 52ea004ba01e..4804b52f2946 100644 --- a/tools/perf/util/evlist.c +++ b/tools/perf/util/evlist.c @@ -1790,8 +1790,17 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel * if (evsel__has_leader(c2, leader)) { if (is_open && close) perf_evsel__close(&c2->core); - evsel__set_leader(c2, c2); - c2->core.nr_members = 0; + /* + * We want to close all members of the group and reopen + * them. Some events, like Intel topdown, require being + * in a group and so keep these in the group. + */ + if (!evsel__must_be_in_group(c2) && c2 != leader) { + evsel__set_leader(c2, c2); + c2->core.nr_members = 0; + leader->core.nr_members--; + } + /* * Set this for all former members of the group * to indicate they get reopened. @@ -1799,6 +1808,9 @@ struct evsel *evlist__reset_weak_group(struct evlist *evsel_list, struct evsel * c2->reset_group = true; } } + /* Reset the leader count if all entries were removed. */ + if (leader->core.nr_members == 1) + leader->core.nr_members = 0; return leader; } diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 2a1729e7aee4..b98882cbb286 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -3077,3 +3077,13 @@ int evsel__source_count(const struct evsel *evsel) } return count; } + +bool __weak arch_evsel__must_be_in_group(const struct evsel *evsel __maybe_unused) +{ + return false; +} + +bool evsel__must_be_in_group(const struct evsel *evsel) +{ + return arch_evsel__must_be_in_group(evsel); +} diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h index 041b42d33bf5..a36172ed4cf6 100644 --- a/tools/perf/util/evsel.h +++ b/tools/perf/util/evsel.h @@ -483,6 +483,9 @@ bool evsel__has_leader(struct evsel *evsel, struct evsel *leader); bool evsel__is_leader(struct evsel *evsel); void evsel__set_leader(struct evsel *evsel, struct evsel *leader); int evsel__source_count(const struct evsel *evsel); +bool evsel__must_be_in_group(const struct evsel *evsel); + +bool arch_evsel__must_be_in_group(const struct evsel *evsel); /* * Macro to swap the bit-field postition and size. -- 2.35.1