From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Rogers
Subject: Re: Issue of metrics for multiple uncore PMUs (was Re: [RFC PATCH v2 23/23] perf metricgroup: remove duped metric group events)
Date: Fri, 2 Oct 2020 13:46:52 -0700
Message-ID:
References: <20200507140819.126960-1-irogers@google.com> <20200507140819.126960-24-irogers@google.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Return-path:
In-Reply-To:
To: John Garry
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim, Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko, John Fastabend, KP Singh, Kajol Jain, Andi Kleen, Jin Yao, Kan Liang, Cong Wang Kim
List-Id: linux-perf-users.vger.kernel.org

On Fri, Oct 2, 2020 at 5:00 AM John Garry wrote:
>
> On 07/05/2020 15:08, Ian Rogers wrote:
>
> Hi Ian,
>
> I was wondering if you ever tested commit 2440689d62e9 ("perf
> metricgroup: Remove duped metric group events") for when we have a
> metric which aliases multiple instances of the same uncore PMU in the
> system?

Sorry for this, I hadn't tested such a metric and wasn't aware of how
the aliasing worked. I sent a fix for this issue here:
https://lore.kernel.org/lkml/20200917201807.4090224-1-irogers@google.com/
Could you see if this addresses the issue for you? I don't see the
change in Arnaldo's trees yet.

Thanks,
Ian

> I have been rebasing some of my arm64 perf work to v5.9-rc7, and find an
> issue where find_evsel_group() fails for the uncore metrics under the
> condition mentioned above.
>
> Unfortunately I don't have an x86 machine to which this test applies.
> However, as an experiment, I added a test metric to my broadwell JSON:
>
> diff --git a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> index 8cdc7c13dc2a..fc6d9adf996a 100644
> --- a/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> +++ b/tools/perf/pmu-events/arch/x86/broadwell/bdw-metrics.json
> @@ -348,5 +348,11 @@
>          "MetricExpr": "(cstate_pkg@c7\\-residency@ / msr@tsc@) * 100",
>          "MetricGroup": "Power",
>          "MetricName": "C7_Pkg_Residency"
> +    },
> +    {
> +        "BriefDescription": "test metric",
> +        "MetricExpr": "UNC_CBO_XSNP_RESPONSE.MISS_XCORE * UNC_CBO_XSNP_RESPONSE.MISS_EVICTION",
> +        "MetricGroup": "Test",
> +        "MetricName": "test_metric_inc"
>      }
>  ]
>
>
> And get this:
>
> john@localhost:~/linux/tools/perf> sudo ./perf stat -v -M test_metric_inc sleep 1
> Using CPUID GenuineIntel-6-3D-4
> metric expr unc_cbo_xsnp_response.miss_xcore * unc_cbo_xsnp_response.miss_eviction for test_metric_inc
> found event unc_cbo_xsnp_response.miss_eviction
> found event unc_cbo_xsnp_response.miss_xcore
> adding {unc_cbo_xsnp_response.miss_eviction,unc_cbo_xsnp_response.miss_xcore}:W
> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_1/umask=0x81,event=0x22/
> unc_cbo_xsnp_response.miss_eviction -> uncore_cbox_0/umask=0x81,event=0x22/
> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_1/umask=0x41,event=0x22/
> unc_cbo_xsnp_response.miss_xcore -> uncore_cbox_0/umask=0x41,event=0x22/
> Cannot resolve test_metric_inc: unc_cbo_xsnp_response.miss_xcore * unc_cbo_xsnp_response.miss_eviction
> task-clock: 688876 688876 688876
> context-switches: 2 688876 688876
> cpu-migrations: 0 688876 688876
> page-faults: 69 688876 688876
> cycles: 2101719 695690 695690
> instructions: 1180534 695690 695690
> branches: 249450 695690 695690
> branch-misses: 10815 695690 695690
>
>  Performance counter stats for 'sleep 1':
>
>               0.69 msec task-clock                #    0.001 CPUs utilized
>                  2      context-switches          #    0.003 M/sec
>                  0      cpu-migrations            #    0.000 K/sec
>                 69      page-faults               #    0.100 M/sec
>          2,101,719      cycles                    #    3.051 GHz
>          1,180,534      instructions              #    0.56  insn per cycle
>            249,450      branches                  #  362.112 M/sec
>             10,815      branch-misses             #    4.34% of all branches
>
>        1.001177693 seconds time elapsed
>
>        0.001149000 seconds user
>        0.000000000 seconds sys
>
>
> john@localhost:~/linux/tools/perf>
>
>
> Any idea what is going wrong here, before I have to dive in? The issue
> seems to be this named commit.
>
> Thanks,
> John
>
> > A metric group contains multiple metrics. These metrics may use the same
> > events. If metrics use separate events then it leads to more
> > multiplexing and overall metric counts fail to sum to 100%.
> > Modify how metrics are associated with events so that if the events in
> > an earlier group satisfy the current metric, the same events are used.
> > A record of used events is kept and at the end of processing unnecessary
> > events are eliminated.
> >
> > Before: