linux-kernel.vger.kernel.org archive mirror
* [6.1.7][6.2-rc5] perf all metrics test: FAILED!
@ 2023-01-29  9:58 Sedat Dilek
  2023-01-29 23:21 ` Ian Rogers
  0 siblings, 1 reply; 13+ messages in thread
From: Sedat Dilek @ 2023-01-29  9:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Ian Rogers
  Cc: Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-perf-users, linux-kernel,
	Nick Desaulniers, Nathan Chancellor, llvm, Ben Hutchings

[-- Attachment #1: Type: text/plain, Size: 4876 bytes --]

[ CC LLVM linux folks + Ben from Debian kernel team ]

Hi,

I am playing with perf and LLVM version 16.0.0-rc1, which was released yesterday.

After building my selfmade LLVM toolchain, I built perf and ran some
perf tests here on my Intel SandyBridge CPU (see details below).

perf all metrics test: FAILED!

...with both Debian's perf version 6.1.7 and my selfmade version 6.2-rc5.

Just noticed:

Couldn't bump rlimit(MEMLOCK), failures may take place when creating
BPF maps, etc

Running the tests below with `sudo` made this warning go away, but the
test still FAILED.

But maybe I forgot to enable some sysfs/debug setting or something similar?
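
For reference, a minimal sketch of how the two knobs toggled in the
reproducer below could be inspected before running the tests. The helper
name `show_knob` is made up for illustration; the paths are the ones used
in this report.

```shell
#!/bin/sh
# Hypothetical helper: print a sysctl file's current value,
# or a note if the file cannot be read.
show_knob() {
    if [ -r "$1" ]; then
        printf '%s = %s\n' "$1" "$(cat "$1")"
    else
        printf '%s is not readable\n' "$1"
    fi
}

show_knob /proc/sys/kernel/kptr_restrict
show_knob /proc/sys/kernel/perf_event_paranoid
```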

Last perf version which was OK:

~/bin/perf -v
perf version 6.0.0

echo "linux-perf: Adjust limited access to performance monitoring and observability operations"
echo 0 | sudo tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid
0

~/bin/perf test 10 86 92 93 94 95
10: PMU events                                                      :
10.1: PMU event table sanity                                        : Ok
10.2: PMU event map aliases                                         : Ok
10.3: Parsing of PMU event table metrics                            : Ok
10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
86: perf record tests                                               : Ok
92: perf stat tests                                                 : Ok
93: perf all metricgroups test                                      : Ok
94: perf all metrics test                                           : Ok
95: perf all PMU test                                               : Ok

echo 1 | sudo tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid
echo "linux-perf: Reset limited access to performance monitoring and observability operations"

If you need further information, please let me know.

Thanks.

Regards,
-Sedat-

P.S. Instructions

[ REPRODUCER ]

LLVM_MVER="16"

# Debian LLVM
##LLVM_TOOLCHAIN_PATH="/usr/lib/llvm-${LLVM_MVER}/bin"
# Selfmade LLVM
LLVM_TOOLCHAIN_PATH="/opt/llvm/bin"
if [ -d "${LLVM_TOOLCHAIN_PATH}" ]; then
   export PATH="${LLVM_TOOLCHAIN_PATH}:${PATH}"
fi

PYTHON_VER="3.11"
MAKE="make"
MAKE_OPTS="V=1 -j1 HOSTCC=clang-$LLVM_MVER HOSTLD=ld.lld HOSTAR=llvm-ar CC=clang-$LLVM_MVER LD=ld.lld AR=llvm-ar STRIP=llvm-strip"

echo "LLVM MVER ........ $LLVM_MVER"
echo "Path settings .... $PATH"
echo "Python version ... $PYTHON_VER"
echo "make line ........ $MAKE $MAKE_OPTS"

LANG=C LC_ALL=C make -C tools/perf clean 2>&1 | tee ../make-log_perf-clean.txt

LANG=C LC_ALL=C $MAKE $MAKE_OPTS -C tools/perf \
    PYTHON=python${PYTHON_VER} install-bin 2>&1 | \
    tee ../make-log_perf-install_bin_python${PYTHON_VER}_llvm${LLVM_MVER}.txt


[ TESTS ]

[ TESTS - START ]

echo 0 | sudo tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid

[ TESTS - DEBIAN ]

/usr/bin/perf -v
perf version 6.1.7

/usr/bin/perf test 10 92 98 99 100 101

 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 92: perf record tests                                               : Ok
 98: perf stat tests                                                 : Ok
 99: perf all metricgroups test                                      : Ok
100: perf all metrics test                                           : FAILED!
101: perf all PMU test                                               : Ok

[ TESTS - DILEKS ]

~/bin/perf -v
perf version 6.2.0-rc5

~/bin/perf test 7 87 93 94 95 96

  7: PMU events                                                      :
  7.1: PMU event table sanity                                        : Ok
  7.2: PMU event map aliases                                         : Ok
  7.3: Parsing of PMU event table metrics                            : Ok
  7.4: Parsing of PMU event table metrics with fake PMUs             : Ok
 87: perf record tests                                               : Ok
 93: perf stat tests                                                 : Ok
 94: perf all metricgroups test                                      : Ok
 95: perf all metrics test                                           : FAILED!
 96: perf all PMU test                                               : Ok

[ TESTS - FAILED ]

/usr/bin/perf test --verbose 100 2>&1 | \
    tee perf-test-verbose-100-perf-all-metrics-test_debian-perf-6-1-7.txt

~/bin/perf test --verbose 95 2>&1 | \
    tee perf-test-verbose-95-perf-all-metrics-test_dileks-perf-6-2-rc5.txt

[ TESTS - STOP ]

echo 1 | sudo tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid

- EOT -

[-- Attachment #2: debian-perf-6-1-7_test-verbose-100-perf-all-metrics-test.txt --]
[-- Type: text/plain, Size: 3732 bytes --]

Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
100: perf all metrics test                                           :
--- start ---
test child forked, pid 39432
Testing Average_Frequency
Testing C2_Pkg_Residency
Testing C3_Core_Residency
Testing C3_Pkg_Residency
Testing C6_Core_Residency
Testing C6_Pkg_Residency
Testing C7_Core_Residency
Testing C7_Pkg_Residency
Testing CLKS
Testing CORE_CLKS
Testing CPI
Testing CPU_Utilization
Testing CoreIPC
Testing DRAM_BW_Use
Testing DSB_Coverage
Testing Execute_per_Issue
Testing FLOPc
Testing GFLOPs
Testing ILP
Testing IPC
Testing Instructions
Testing IpFarBranch
Testing Kernel_CPI
Testing Kernel_Utilization
Testing MEM_Parallel_Requests
Testing MEM_Request_Latency
Testing Retire
Testing SLOTS
Testing SMT_2T_Utilization
Testing Turbo_Utilization
Testing UPI
Testing tma_backend_bound
Testing tma_bad_speculation
Testing tma_branch_mispredicts
Testing tma_branch_resteers
Testing tma_core_bound
Testing tma_divider
Testing tma_dram_bound
Metric 'tma_dram_bound' not printed in:
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
  Average synthesis took: 203.922 usec (+- 0.191 usec)
  Average num. events: 30.000 (+- 0.000)
  Average time per event 6.797 usec
  Average data synthesis took: 219.730 usec (+- 0.216 usec)
  Average num. events: 159.000 (+- 0.000)
  Average time per event 1.382 usec

 Performance counter stats for 'perf bench internals synthesize':

     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT                                     (0,00%)
     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING                                     (0,00%)
     <not counted>      CPU_CLK_UNHALTED.THREAD                                       (0,00%)
     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS                                     (0,00%)

       4,456375532 seconds time elapsed

       1,415829000 seconds user
       3,027083000 seconds sys
Testing tma_dsb_switches
Testing tma_dtlb_load
Testing tma_fetch_bandwidth
Testing tma_fetch_latency
Testing tma_fp_arith
Testing tma_fp_scalar
Testing tma_fp_vector
Testing tma_frontend_bound
Testing tma_heavy_operations
Testing tma_itlb_misses
Testing tma_l3_bound
Metric 'tma_l3_bound' not printed in:
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
  Average synthesis took: 204.199 usec (+- 0.228 usec)
  Average num. events: 30.000 (+- 0.000)
  Average time per event 6.807 usec
  Average data synthesis took: 219.934 usec (+- 0.232 usec)
  Average num. events: 159.000 (+- 0.000)
  Average time per event 1.383 usec

 Performance counter stats for 'perf bench internals synthesize':

     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT                                     (0,00%)
     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING                                     (0,00%)
     <not counted>      CPU_CLK_UNHALTED.THREAD                                       (0,00%)
     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS                                     (0,00%)

       4,458943453 seconds time elapsed

       1,468251000 seconds user
       2,976400000 seconds sys
Testing tma_lcp
Testing tma_light_operations
Testing tma_machine_clears
Testing tma_mem_bandwidth
Testing tma_mem_latency
Testing tma_memory_bound
Testing tma_microcode_sequencer
Testing tma_ms_switches
Testing tma_ports_utilization
Testing tma_retiring
Testing tma_store_bound
Testing tma_x87_use
test child finished with -1
---- end ----
perf all metrics test: FAILED!

[-- Attachment #3: dileks-perf-6-2-rc5-test-verbose-95-perf-all-metrics-test.txt --]
[-- Type: text/plain, Size: 3816 bytes --]

Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
 95: perf all metrics test                                           :
--- start ---
test child forked, pid 39198
Testing ILP
Testing tma_core_bound
Testing tma_memory_bound
Testing tma_branch_mispredicts
Testing tma_machine_clears
Testing tma_itlb_misses
Testing IpFarBranch
Testing tma_l3_bound
Metric 'tma_l3_bound' not printed in:
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
  Average synthesis took: 208.033 usec (+- 0.214 usec)
  Average num. events: 30.000 (+- 0.000)
  Average time per event 6.934 usec
  Average data synthesis took: 216.728 usec (+- 0.182 usec)
  Average num. events: 162.000 (+- 0.000)
  Average time per event 1.338 usec

 Performance counter stats for 'perf bench internals synthesize':

     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT                                           (0,00%)
     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING                                        (0,00%)
     <not counted>      CPU_CLK_UNHALTED.THREAD                                                 (0,00%)
     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS                                        (0,00%)

       4,555228480 seconds time elapsed

       1,504137000 seconds user
       3,040193000 seconds sys
Testing tma_fp_scalar
Testing tma_fp_vector
Testing tma_x87_use
Testing Execute_per_Issue
Testing GFLOPs
Testing DSB_Coverage
Testing tma_dsb_switches
Testing tma_fetch_bandwidth
Testing tma_branch_resteers
Testing tma_lcp
Testing tma_ms_switches
Testing FLOPc
Testing tma_fetch_latency
Testing CPU_Utilization
Testing DRAM_BW_Use
Testing tma_fp_arith
Testing CPI
Testing MEM_Parallel_Requests
Testing MEM_Request_Latency
Testing tma_mem_bandwidth
Testing tma_dram_bound
Metric 'tma_dram_bound' not printed in:
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
  Average synthesis took: 207.680 usec (+- 0.176 usec)
  Average num. events: 30.000 (+- 0.000)
  Average time per event 6.923 usec
  Average data synthesis took: 217.833 usec (+- 0.202 usec)
  Average num. events: 161.000 (+- 0.000)
  Average time per event 1.353 usec

 Performance counter stats for 'perf bench internals synthesize':

     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT                                           (0,00%)
     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING                                        (0,00%)
     <not counted>      CPU_CLK_UNHALTED.THREAD                                                 (0,00%)
     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS                                        (0,00%)

       4,555698863 seconds time elapsed

       1,481769000 seconds user
       3,063387000 seconds sys
Testing tma_store_bound
Testing tma_mem_latency
Testing tma_dtlb_load
Testing tma_microcode_sequencer
Testing Kernel_CPI
Testing Kernel_Utilization
Testing tma_frontend_bound
Testing CLKS
Testing Retire
Testing UPI
Testing tma_ports_utilization
Testing Average_Frequency
Testing C2_Pkg_Residency
Testing C3_Core_Residency
Testing C3_Pkg_Residency
Testing C6_Core_Residency
Testing C6_Pkg_Residency
Testing C7_Core_Residency
Testing C7_Pkg_Residency
Testing Turbo_Utilization
Testing CoreIPC
Testing IPC
Testing tma_heavy_operations
Testing tma_light_operations
Testing CORE_CLKS
Testing SMT_2T_Utilization
Testing Socket_CLKS
Testing UNCORE_FREQ
Testing Instructions
Testing tma_backend_bound
Testing tma_bad_speculation
Testing tma_retiring
Testing tma_divider
Testing SLOTS
test child finished with -1
---- end ----
perf all metrics test: FAILED!


* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-01-29  9:58 [6.1.7][6.2-rc5] perf all metrics test: FAILED! Sedat Dilek
@ 2023-01-29 23:21 ` Ian Rogers
  2023-01-30  2:24   ` Sedat Dilek
  0 siblings, 1 reply; 13+ messages in thread
From: Ian Rogers @ 2023-01-29 23:21 UTC (permalink / raw)
  To: sedat.dilek
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-perf-users, linux-kernel, Nick Desaulniers,
	Nathan Chancellor, llvm, Ben Hutchings

On Sun, Jan 29, 2023 at 1:59 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
>
> [ CC LLVM linux folks + Ben from Debian kernel team ]
>
> Hi,
>
> I am playing with LLVM version 16.0.0-rc1 which was released yesterday and PERF.
>
> After building my selfmade LLVM toolchain, I built perf and run some
> perf tests here on my Intel SandyBridge CPU (details see below).
>
> perf all metrics test: FAILED!
>
> ...with both Debian's perf version 6.1.7 and my selfmade version 6.2-rc5.
>
> Just noticed:
>
> Couldn't bump rlimit(MEMLOCK), failures may take place when creating
> BPF maps, etc
>
> Run the below tests with `sudo` - made this go away - still FAILED.
>
> But maybe I am missing to activate some sysfs/debug or whatever other stuff?

Hi Sedat,

Things have been improving with respect to metrics, so this failure may
simply be due to the addition of a previously missing metric. The
rlimit warning shouldn't affect the result, but maybe file descriptors?
Looking at the test output, the issue is:

```
Metric 'tma_dram_bound' not printed in:
# Running 'internals/synthesize' benchmark:
Computing performance of single threaded perf event synthesis by
synthesizing events on the perf process itself:
  Average synthesis took: 207.680 usec (+- 0.176 usec)
  Average num. events: 30.000 (+- 0.000)
  Average time per event 6.923 usec
  Average data synthesis took: 217.833 usec (+- 0.202 usec)
  Average num. events: 161.000 (+- 0.000)
  Average time per event 1.353 usec

 Performance counter stats for 'perf bench internals synthesize':

     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT                                           (0,00%)
     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING                                        (0,00%)
     <not counted>      CPU_CLK_UNHALTED.THREAD                                                 (0,00%)
     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS                                        (0,00%)
```

So the test was checking whether the tma_dram_bound metric could be
computed on your Sandybridge, and it failed. The event counts in that
output show that every event came back "<not counted>", which usually
indicates a permissions problem. Given that, it is not surprising that
the metric wasn't computed. You could try repeating the command the
test runs with something like "perf stat -M tma_dram_bound -a sleep 1",
although running as root should already have resolved a permissions
issue. Does that give you enough to keep exploring?
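
As a rough aid for that kind of check, the sketch below filters perf stat
output for events reported as "<not counted>". The function name
`not_counted_events` is made up for illustration, and it assumes the
column layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read perf stat output on stdin and print the name
# of every event that was reported as "<not counted>", one per line.
not_counted_events() {
    awk '$1 == "<not" && $2 == "counted>" { print $3 }'
}

# Example use (perf stat prints its counters on stderr):
#   perf stat -M tma_dram_bound -a sleep 1 2>&1 | not_counted_events
```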

Thanks,
Ian

> Last perf version which was OK:
>
> ~/bin/perf -v
> perf version 6.0.0
>
> echo "linux-perf: Adjust limited access to performance monitoring and
> observability operations"
> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> /proc/sys/kernel/perf_event_paranoid
> 0
>
> ~/bin/perf test 10 86 92 93 94 95
> 10: PMU events                                                      :
> 10.1: PMU event table sanity                                        : Ok
> 10.2: PMU event map aliases                                         : Ok
> 10.3: Parsing of PMU event table metrics                            : Ok
> 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> 86: perf record tests                                               : Ok
> 92: perf stat tests                                                 : Ok
> 93: perf all metricgroups test                                      : Ok
> 94: perf all metrics test                                           : Ok
> 95: perf all PMU test                                               : Ok
>
> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> /proc/sys/kernel/perf_event_paranoid
> echo "linux-perf: Reset limited access to performance monitoring and
> observability operations"
>
> If you need further information, please let me know.
>
> Thanks.
>
> Regards,
> -Sedat-
>
> P.S. Instructions
>
> [ REPRODUCER ]
>
> LLVM_MVER="16"
>
> # Debian LLVM
> ##LLVM_TOOLCHAIN_PATH="/usr/lib/llvm-${LLVM_MVER}/bin"
> # Selfmade LLVM
> LLVM_TOOLCHAIN_PATH="/opt/llvm/bin"
> if [ -d ${LLVM_TOOLCHAIN_PATH} ]; then
>    export PATH="${LLVM_TOOLCHAIN_PATH}:${PATH}"
> fi
>
> PYTHON_VER="3.11"
> MAKE="make"
> MAKE_OPTS="V=1 -j1 HOSTCC=clang-$LLVM_MVER HOSTLD=ld.lld
> HOSTAR=llvm-ar CC=clang-$LLVM_MVER LD=ld.lld AR=llvm-ar
> STRIP=llvm-strip"
>
> echo "LLVM MVER ........ $LLVM_MVER"
> echo "Path settings .... $PATH"
> echo "Python version ... $PYTHON_VER"
> echo "make line ........ $MAKE $MAKE_OPTS"
>
> LANG=C LC_ALL=C make -C tools/perf clean 2>&1 | tee ../make-log_perf-clean.txt
>
> LANG=C LC_ALL=C $MAKE $MAKE_OPTS -C tools/perf
> PYTHON=python${PYTHON_VER} install-bin 2>&1 | tee
> ../make-log_perf-install_bin_python${PYTHON_VER}_llvm${LLVM_MVER}.txt
>
>
> [ TESTS ]
>
> [ TESTS - START ]
>
> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> /proc/sys/kernel/perf_event_paranoid
>
> [ TESTS - DEBIAN ]
>
> /usr/bin/perf -v
> perf version 6.1.7
>
> /usr/bin/perf test 10 92 98 99 100 101
>
>  10: PMU events                                                      :
>  10.1: PMU event table sanity                                        : Ok
>  10.2: PMU event map aliases                                         : Ok
>  10.3: Parsing of PMU event table metrics                            : Ok
>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
>  92: perf record tests                                               : Ok
>  98: perf stat tests                                                 : Ok
>  99: perf all metricgroups test                                      : Ok
> 100: perf all metrics test                                           : FAILED!
> 101: perf all PMU test                                               : Ok
>
> [ TESTS - DILEKS ]
>
> ~/bin/perf -v
> perf version 6.2.0-rc5
>
> ~/bin/perf test 7 87 93 94 95 96
>
>   7: PMU events                                                      :
>   7.1: PMU event table sanity                                        : Ok
>   7.2: PMU event map aliases                                         : Ok
>   7.3: Parsing of PMU event table metrics                            : Ok
>   7.4: Parsing of PMU event table metrics with fake PMUs             : Ok
>  87: perf record tests                                               : Ok
>  93: perf stat tests                                                 : Ok
>  94: perf all metricgroups test                                      : Ok
>  95: perf all metrics test                                           : FAILED!
>  96: perf all PMU test                                               : Ok
>
> [ TESTS - FAILED ]
>
> /usr/bin/perf test --verbose 100 2>&1 | tee
> perf-test-verbose-100-perf-all-metrics-test_debian-perf-6-1-7.txt
>
> ~/bin/perf test --verbose 95 2>&1 | tee
> perf-test-verbose-95-perf-all-metrics-test_dileks-perf-6-2-rc5.txt
>
> [ TESTS - STOP ]
>
> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> /proc/sys/kernel/perf_event_paranoid
>
> - EOT -


* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-01-29 23:21 ` Ian Rogers
@ 2023-01-30  2:24   ` Sedat Dilek
  2023-01-30 10:04     ` James Clark
  0 siblings, 1 reply; 13+ messages in thread
From: Sedat Dilek @ 2023-01-30  2:24 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-perf-users, linux-kernel, Nick Desaulniers,
	Nathan Chancellor, llvm, Ben Hutchings

On Mon, Jan 30, 2023 at 12:21 AM Ian Rogers <irogers@google.com> wrote:
>
> On Sun, Jan 29, 2023 at 1:59 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
> >
> > [ CC LLVM linux folks + Ben from Debian kernel team ]
> >
> > Hi,
> >
> > I am playing with LLVM version 16.0.0-rc1 which was released yesterday and PERF.
> >
> > After building my selfmade LLVM toolchain, I built perf and run some
> > perf tests here on my Intel SandyBridge CPU (details see below).
> >
> > perf all metrics test: FAILED!
> >
> > ...with both Debian's perf version 6.1.7 and my selfmade version 6.2-rc5.
> >
> > Just noticed:
> >
> > Couldn't bump rlimit(MEMLOCK), failures may take place when creating
> > BPF maps, etc
> >
> > Run the below tests with `sudo` - made this go away - still FAILED.
> >
> > But maybe I am missing to activate some sysfs/debug or whatever other stuff?
>
> Hi Sedat,
>
> things have been improving wrt metrics and so this failure may have
> just been because of the addition of a previously missing metric. The
> rlimit thing shouldn't affect things but maybe file descriptors?
> Looking at the test output the issue is:
>
> ```
> Metric 'tma_dram_bound' not printed in:
> # Running 'internals/synthesize' benchmark:
> Computing performance of single threaded perf event synthesis by
> synthesizing events on the perf process itself:
>   Average synthesis took: 207.680 usec (+- 0.176 usec)
>   Average num. events: 30.000 (+- 0.000)
>   Average time per event 6.923 usec
>   Average data synthesis took: 217.833 usec (+- 0.202 usec)
>   Average num. events: 161.000 (+- 0.000)
>   Average time per event 1.353 usec
>
>  Performance counter stats for 'perf bench internals synthesize':
>
>      <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
>                          (0,00%)
>      <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                          (0,00%)
>      <not counted>      CPU_CLK_UNHALTED.THREAD
>                          (0,00%)
>      <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
>                             (0,00%)
> ```
>
> So the test was checking to see whether the tma_dram_bound metric
> could be computed on your Sandybridge and it failed. The event counts
> below show that every event came back "<not counted>" which is usually
> indicative of a permissions problem - it is also not surprising given
> this that the metric wasn't computed. You could try repeating the
> command the test is trying with something like "perf stat -M
> tma_dram_bound -a sleep 1", but running as root should have resolved
> that issue. Does that give you enough to keep exploring?
>

Hi Ian,

Thanks for your feedback!

I booted into my Debian kernel - just to see what happens.

# cat /proc/version
Linux version 6.1.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-12
(Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1
SMP PREEMPT_DYNAMIC Debian 6.1.7-1 (2023-01-18)

Everything was run as root...

# echo 0 | tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid
0

# /usr/bin/perf test 10 92 98 99 100 101
10: PMU events                                                      :
10.1: PMU event table sanity                                        : Ok
10.2: PMU event map aliases                                         : Ok
10.3: Parsing of PMU event table metrics                            : Ok
10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
92: perf record tests                                               : Ok
98: perf stat tests                                                 : Ok
99: perf all metricgroups test                                      : Ok
100: perf all metrics test                                           : FAILED!
101: perf all PMU test                                               : Ok

# perf stat -M tma_dram_bound -a sleep 1

Performance counter stats for 'system wide':

    <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT                                     (0,00%)
    <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING                                     (0,00%)
    <not counted>      CPU_CLK_UNHALTED.THREAD                                       (0,00%)
    <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS                                     (0,00%)

      1,002148600 seconds time elapsed

Hmm... looking at... Metric 'tma_l3_bound' ...

Running...

# perf stat --verbose -M tma_l3_bound -a sleep 1
Using CPUID GenuineIntel-6-2A-7
metric expr (MEM_LOAD_UOPS_RETIRED.LLC_HIT / (MEM_LOAD_UOPS_RETIRED.LLC_HIT + 7 * MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS)) * CYCLE_ACTIVITY.STALLS_L2_PENDING / CLKS for tma_l3_bound
metric expr CPU_CLK_UNHALTED.THREAD for CLKS

found event MEM_LOAD_UOPS_RETIRED.LLC_HIT
found event CYCLE_ACTIVITY.STALLS_L2_PENDING
found event CPU_CLK_UNHALTED.THREAD
found event MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS

Parsing metric events
'{MEM_LOAD_UOPS_RETIRED.LLC_HIT/metric-id=MEM_LOAD_UOPS_RETIRED.LLC_HIT/,CYCLE_ACTIVITY.STALLS_L2_PENDING/metric-id=CYCLE_ACTIVITY.STALLS_L2_PENDING/,CPU_CLK_UNHALTED.THREAD/metric-id=CPU_CLK_UNHALTED.THREAD/,MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/metric-id=MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/}:W'
MEM_LOAD_UOPS_RETIRED.LLC_HIT -> cpu/event=0xd1,period=0xc365,umask=0x4/
CYCLE_ACTIVITY.STALLS_L2_PENDING -> cpu/event=0xa3,cmask=0x5,period=0x1e8483,umask=0x5/
CPU_CLK_UNHALTED.THREAD -> cpu/event=0x3c,period=0x1e8483/
MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS -> cpu/event=0xd4,period=0x186a7,umask=0x2/

Control descriptor is not initialized

MEM_LOAD_UOPS_RETIRED.LLC_HIT: 0 4007421228 0
CYCLE_ACTIVITY.STALLS_L2_PENDING: 0 4007421228 0
CPU_CLK_UNHALTED.THREAD: 0 4007421228 0
MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS: 0 4007421228 0

Performance counter stats for 'system wide':

    <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT                                     (0,00%)
    <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING                                     (0,00%)
    <not counted>      CPU_CLK_UNHALTED.THREAD                                       (0,00%)
    <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS                                     (0,00%)

      1,002310013 seconds time elapsed

So those events/metric-ids resulting in "<not counted>" are all found.
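
The per-event lines in the verbose output above can also be checked
mechanically. The sketch below assumes the three numbers are, in order,
the raw count, the time the event was enabled, and the time it was
actually running (an assumption about perf's verbose format, not
something confirmed in this thread); under that reading, a running time
of 0 would mean the event was never scheduled. The function name is made
up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read "EVENT: count enabled running" lines on stdin
# and print the events whose last column (assumed running time) is 0.
never_scheduled_events() {
    awk 'NF == 4 && $1 ~ /:$/ && $4 == 0 { sub(/:$/, "", $1); print $1 }'
}

# Lines copied from the verbose output above.
never_scheduled_events <<'EOF'
MEM_LOAD_UOPS_RETIRED.LLC_HIT: 0 4007421228 0
CYCLE_ACTIVITY.STALLS_L2_PENDING: 0 4007421228 0
CPU_CLK_UNHALTED.THREAD: 0 4007421228 0
MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS: 0 4007421228 0
EOF
```

For the lines above, this lists all four events.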

What does "Control descriptor is not initialized" mean?

To summarize:

These two metrics in "100: perf all metrics test" FAILED:

1. tma_dram_bound
2. tma_l3_bound
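
To make the tma_l3_bound expression from the "metric expr" lines above
concrete, here is a small sketch that evaluates it from four raw counts,
with CLKS taken as CPU_CLK_UNHALTED.THREAD. The function name and the
second set of counts are made up for illustration; printing "nan" for a
zero denominator is a choice of this sketch, not necessarily what perf
does.

```shell
#!/bin/sh
# Evaluate (HIT / (HIT + 7 * MISS)) * STALLS_L2_PENDING / CLKS.
# Arguments: LLC_HIT LLC_MISS STALLS_L2_PENDING CLKS (raw event counts).
# Prints "nan" when either denominator is zero.
tma_l3_bound() {
    LC_ALL=C awk -v hit="$1" -v miss="$2" -v stalls="$3" -v clks="$4" 'BEGIN {
        denom = hit + 7 * miss
        if (denom == 0 || clks == 0) { print "nan"; exit }
        printf "%.4f\n", (hit / denom) * stalls / clks
    }'
}

tma_l3_bound 0 0 0 0               # the all-zero counts reported in this thread
tma_l3_bound 300 100 2000 10000    # made-up counts for illustration
```

With the all-zero counts the helper prints nan; with the made-up counts
it prints 0.0600.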

Best regards,
-Sedat-

> Thanks,
> Ian
>
> > Last perf version which was OK:
> >
> > ~/bin/perf -v
> > perf version 6.0.0
> >
> > echo "linux-perf: Adjust limited access to performance monitoring and
> > observability operations"
> > echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > /proc/sys/kernel/perf_event_paranoid
> > 0
> >
> > ~/bin/perf test 10 86 92 93 94 95
> > 10: PMU events                                                      :
> > 10.1: PMU event table sanity                                        : Ok
> > 10.2: PMU event map aliases                                         : Ok
> > 10.3: Parsing of PMU event table metrics                            : Ok
> > 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > 86: perf record tests                                               : Ok
> > 92: perf stat tests                                                 : Ok
> > 93: perf all metricgroups test                                      : Ok
> > 94: perf all metrics test                                           : Ok
> > 95: perf all PMU test                                               : Ok
> >
> > echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> > /proc/sys/kernel/perf_event_paranoid
> > echo "linux-perf: Reset limited access to performance monitoring and
> > observability operations"
> >
> > If you need further information, please let me know.
> >
> > Thanks.
> >
> > Regards,
> > -Sedat-
> >
> > P.S. Instructions
> >
> > [ REPRODUCER ]
> >
> > LLVM_MVER="16"
> >
> > # Debian LLVM
> > ##LLVM_TOOLCHAIN_PATH="/usr/lib/llvm-${LLVM_MVER}/bin"
> > # Selfmade LLVM
> > LLVM_TOOLCHAIN_PATH="/opt/llvm/bin"
> > if [ -d ${LLVM_TOOLCHAIN_PATH} ]; then
> >    export PATH="${LLVM_TOOLCHAIN_PATH}:${PATH}"
> > fi
> >
> > PYTHON_VER="3.11"
> > MAKE="make"
> > MAKE_OPTS="V=1 -j1 HOSTCC=clang-$LLVM_MVER HOSTLD=ld.lld
> > HOSTAR=llvm-ar CC=clang-$LLVM_MVER LD=ld.lld AR=llvm-ar
> > STRIP=llvm-strip"
> >
> > echo "LLVM MVER ........ $LLVM_MVER"
> > echo "Path settings .... $PATH"
> > echo "Python version ... $PYTHON_VER"
> > echo "make line ........ $MAKE $MAKE_OPTS"
> >
> > LANG=C LC_ALL=C make -C tools/perf clean 2>&1 | tee ../make-log_perf-clean.txt
> >
> > LANG=C LC_ALL=C $MAKE $MAKE_OPTS -C tools/perf
> > PYTHON=python${PYTHON_VER} install-bin 2>&1 | tee
> > ../make-log_perf-install_bin_python${PYTHON_VER}_llvm${LLVM_MVER}.txt
> >
> >
> > [ TESTS ]
> >
> > [ TESTS - START ]
> >
> > echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > /proc/sys/kernel/perf_event_paranoid
> >
> > [ TESTS - DEBIAN ]
> >
> > /usr/bin/perf -v
> > perf version 6.1.7
> >
> > /usr/bin/perf test 10 92 98 99 100 101
> >
> >  10: PMU events                                                      :
> >  10.1: PMU event table sanity                                        : Ok
> >  10.2: PMU event map aliases                                         : Ok
> >  10.3: Parsing of PMU event table metrics                            : Ok
> >  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >  92: perf record tests                                               : Ok
> >  98: perf stat tests                                                 : Ok
> >  99: perf all metricgroups test                                      : Ok
> > 100: perf all metrics test                                           : FAILED!
> > 101: perf all PMU test                                               : Ok
> >
> > [ TESTS - DILEKS ]
> >
> > ~/bin/perf -v
> > perf version 6.2.0-rc5
> >
> > ~/bin/perf test 7 87 93 94 95 96
> >
> >   7: PMU events                                                      :
> >   7.1: PMU event table sanity                                        : Ok
> >   7.2: PMU event map aliases                                         : Ok
> >   7.3: Parsing of PMU event table metrics                            : Ok
> >   7.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >  87: perf record tests                                               : Ok
> >  93: perf stat tests                                                 : Ok
> >  94: perf all metricgroups test                                      : Ok
> >  95: perf all metrics test                                           : FAILED!
> >  96: perf all PMU test                                               : Ok
> >
> > [ TESTS - FAILED ]
> >
> > /usr/bin/perf test --verbose 100 2>&1 | tee
> > perf-test-verbose-100-perf-all-metrics-test_debian-perf-6-1-7.txt
> >
> > ~/bin/perf test --verbose 95 2>&1 | tee
> > perf-test-verbose-95-perf-all-metrics-test_dileks-perf-6-2-rc5.txt
> >
> > [ TESTS - STOP ]
> >
> > echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> > /proc/sys/kernel/perf_event_paranoid
> >
> > - EOT -

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-01-30  2:24   ` Sedat Dilek
@ 2023-01-30 10:04     ` James Clark
  2023-01-31  0:20       ` Ian Rogers
  0 siblings, 1 reply; 13+ messages in thread
From: James Clark @ 2023-01-30 10:04 UTC (permalink / raw)
  To: sedat.dilek, Ian Rogers
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-perf-users, linux-kernel, Nick Desaulniers,
	Nathan Chancellor, llvm, Ben Hutchings



On 30/01/2023 02:24, Sedat Dilek wrote:
>
> 
> On Mon, Jan 30, 2023 at 12:21 AM Ian Rogers <irogers@google.com> wrote:
>>
>> On Sun, Jan 29, 2023 at 1:59 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
>>>
>>> [ CC LLVM linux folks + Ben from Debian kernel team ]
>>>
>>> Hi,
>>>
>>> I am playing with LLVM version 16.0.0-rc1 which was released yesterday and PERF.
>>>
>>> After building my selfmade LLVM toolchain, I built perf and run some
>>> perf tests here on my Intel SandyBridge CPU (details see below).
>>>
>>> perf all metrics test: FAILED!
>>>
>>> ...with both Debian's perf version 6.1.7 and my selfmade version 6.2-rc5.
>>>
>>> Just noticed:
>>>
>>> Couldn't bump rlimit(MEMLOCK), failures may take place when creating
>>> BPF maps, etc
>>>
>>> Run the below tests with `sudo` - made this go away - still FAILED.
>>>
>>> But maybe I am missing to activate some sysfs/debug or whatever other stuff?
>>
>> Hi Sedat,
>>
>> things have been improving wrt metrics and so this failure may have
>> just been because of the addition of a previously missing metric. The
>> rlimit thing shouldn't affect things but maybe file descriptors?
>> Looking at the test output the issue is:
>>
>> ```
>> Metric 'tma_dram_bound' not printed in:
>> # Running 'internals/synthesize' benchmark:
>> Computing performance of single threaded perf event synthesis by
>> synthesizing events on the perf process itself:
>>   Average synthesis took: 207.680 usec (+- 0.176 usec)
>>   Average num. events: 30.000 (+- 0.000)
>>   Average time per event 6.923 usec
>>   Average data synthesis took: 217.833 usec (+- 0.202 usec)
>>   Average num. events: 161.000 (+- 0.000)
>>   Average time per event 1.353 usec
>>
>>  Performance counter stats for 'perf bench internals synthesize':
>>
>>      <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
>>                          (0,00%)
>>      <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
>>                          (0,00%)
>>      <not counted>      CPU_CLK_UNHALTED.THREAD
>>                          (0,00%)
>>      <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
>>                             (0,00%)
>> ```
>>
>> So the test was checking to see whether the tma_dram_bound metric
>> could be computed on your Sandybridge and it failed. The event counts
>> below show that every event came back "<not counted>" which is usually
>> indicative of a permissions problem - it is also not surprising given
>> this that the metric wasn't computed. You could try repeating the
>> command the test is trying with something like "perf stat -M
>> tma_dram_bound -a sleep 1", but running as root should have resolved
>> that issue. Does that give you enough to keep exploring?
>>
> 
> Hi Ian,
> 
> Thanks for your feedback!
> 
> I booted into my Debian kernel - just to see what happens.
> 
> # cat /proc/version
> Linux version 6.1.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-12
> (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1
> SMP PREEMPT_DYNAMIC Debian 6.1.7-1 (2023-01-18)
> 
> All things run as root...
> 
> # echo 0 | tee /proc/sys/kernel/kptr_restrict
> /proc/sys/kernel/perf_event_paranoid
> 0
> 
> # /usr/bin/perf test 10 92 98 99 100 101
> 10: PMU events                                                      :
> 10.1: PMU event table sanity                                        : Ok
> 10.2: PMU event map aliases                                         : Ok
> 10.3: Parsing of PMU event table metrics                            : Ok
> 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> 92: perf record tests                                               : Ok
> 98: perf stat tests                                                 : Ok
> 99: perf all metricgroups test                                      : Ok
> 100: perf all metrics test                                           : FAILED!
> 101: perf all PMU test                                               : Ok
> 
> # perf stat -M tma_dram_bound -a sleep 1
> 
> Performance counter stats for 'system wide':
> 
>     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
>                   (0,00%)
>     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                      (0,00%)
>     <not counted>      CPU_CLK_UNHALTED.THREAD
>               (0,00%)
>     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
>                         (0,00%)
> 

Hi Sedat,

I also had this failure and did a git bisect, but it led me to the
conclusion that it is a stale build issue rather than a regression.

There was a recent commit that renamed/removed some json PMU files which
the build system can't cope with. I think the tests end up iterating
over a different set of event names than were generated by the build system.

If you do a clean build the issue should go away. I don't know if there
is anything more we can do to stop this from happening.

James

>       1,002148600 seconds time elapsed
> 
> Hmm... looking at... Metric 'tma_l3_bound' ...
> 
> Running...
> 
> # perf stat --verbose -M tma_l3_bound -a sleep 1
> Using CPUID GenuineIntel-6-2A-7
> metric expr (MEM_LOAD_UOPS_RETIRED.LLC_HIT /
> (MEM_LOAD_UOPS_RETIRED.LLC_HIT + 7 *
> MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS)) *
> CYCLE_ACTIVITY.STALLS_L2_PENDING / CLKS for tma_l3_bound
> metric expr CPU_CLK_UNHALTED.THREAD for CLKS
> 
> found event MEM_LOAD_UOPS_RETIRED.LLC_HIT
> found event CYCLE_ACTIVITY.STALLS_L2_PENDING
> found event CPU_CLK_UNHALTED.THREAD
> found event MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> 
> Parsing metric events
> '{MEM_LOAD_UOPS_RETIRED.LLC_HIT/metric-id=MEM_LOAD_UOPS_RETIRED.LLC_HIT/,CYCLE_ACTIVITY.STALLS_L2_PENDING/metric-id=CYCLE_ACTIVITY.STALLS_L2_PEND
> ING/,CPU_CLK_UNHALTED.THREAD/metric-id=CPU_CLK_UNHALTED.THREAD/,MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/metric-id=MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/}:W'
> MEM_LOAD_UOPS_RETIRED.LLC_HIT -> cpu/event=0xd1,period=0xc365,umask=0x4/
> CYCLE_ACTIVITY.STALLS_L2_PENDING ->
> cpu/event=0xa3,cmask=0x5,period=0x1e8483,umask=0x5/
> CPU_CLK_UNHALTED.THREAD -> cpu/event=0x3c,period=0x1e8483/
> MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS -> cpu/event=0xd4,period=0x186a7,umask=0x2/
> 
> Control descriptor is not initialized
> 
> MEM_LOAD_UOPS_RETIRED.LLC_HIT: 0 4007421228 0
> CYCLE_ACTIVITY.STALLS_L2_PENDING: 0 4007421228 0
> CPU_CLK_UNHALTED.THREAD: 0 4007421228 0
> MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS: 0 4007421228 0
> 
> Performance counter stats for 'system wide':
> 
>     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
>                   (0,00%)
>     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                      (0,00%)
>     <not counted>      CPU_CLK_UNHALTED.THREAD
>               (0,00%)
>     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
>                         (0,00%)
> 
>       1,002310013 seconds time elapsed
> 
> So those events/metric-ids resulting in "<not counted>" are all found.
> 
> What means "Control descriptor is not initialized"?
> 
> To summarize:
> 
> Those two tests in "100: perf all metrics test" FAILED:
> 
> 1. tma_dram_bound
> 2. tma_l3_bound
> 
> Best regards,
> -Sedat-
> 
>> Thanks,
>> Ian
>>
>>> Last perf version which was OK:
>>>
>>> ~/bin/perf -v
>>> perf version 6.0.0
>>>
>>> echo "linux-perf: Adjust limited access to performance monitoring and
>>> observability operations"
>>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
>>> /proc/sys/kernel/perf_event_paranoid
>>> 0
>>>
>>> ~/bin/perf test 10 86 92 93 94 95
>>> 10: PMU events                                                      :
>>> 10.1: PMU event table sanity                                        : Ok
>>> 10.2: PMU event map aliases                                         : Ok
>>> 10.3: Parsing of PMU event table metrics                            : Ok
>>> 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
>>> 86: perf record tests                                               : Ok
>>> 92: perf stat tests                                                 : Ok
>>> 93: perf all metricgroups test                                      : Ok
>>> 94: perf all metrics test                                           : Ok
>>> 95: perf all PMU test                                               : Ok
>>>
>>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
>>> /proc/sys/kernel/perf_event_paranoid
>>> echo "linux-perf: Reset limited access to performance monitoring and
>>> observability operations"
>>>
>>> If you need further information, please let me know.
>>>
>>> Thanks.
>>>
>>> Regards,
>>> -Sedat-
>>>
>>> P.S. Instructions
>>>
>>> [ REPRODUCER ]
>>>
>>> LLVM_MVER="16"
>>>
>>> # Debian LLVM
>>> ##LLVM_TOOLCHAIN_PATH="/usr/lib/llvm-${LLVM_MVER}/bin"
>>> # Selfmade LLVM
>>> LLVM_TOOLCHAIN_PATH="/opt/llvm/bin"
>>> if [ -d ${LLVM_TOOLCHAIN_PATH} ]; then
>>>    export PATH="${LLVM_TOOLCHAIN_PATH}:${PATH}"
>>> fi
>>>
>>> PYTHON_VER="3.11"
>>> MAKE="make"
>>> MAKE_OPTS="V=1 -j1 HOSTCC=clang-$LLVM_MVER HOSTLD=ld.lld
>>> HOSTAR=llvm-ar CC=clang-$LLVM_MVER LD=ld.lld AR=llvm-ar
>>> STRIP=llvm-strip"
>>>
>>> echo "LLVM MVER ........ $LLVM_MVER"
>>> echo "Path settings .... $PATH"
>>> echo "Python version ... $PYTHON_VER"
>>> echo "make line ........ $MAKE $MAKE_OPTS"
>>>
>>> LANG=C LC_ALL=C make -C tools/perf clean 2>&1 | tee ../make-log_perf-clean.txt
>>>
>>> LANG=C LC_ALL=C $MAKE $MAKE_OPTS -C tools/perf
>>> PYTHON=python${PYTHON_VER} install-bin 2>&1 | tee
>>> ../make-log_perf-install_bin_python${PYTHON_VER}_llvm${LLVM_MVER}.txt
>>>
>>>
>>> [ TESTS ]
>>>
>>> [ TESTS - START ]
>>>
>>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
>>> /proc/sys/kernel/perf_event_paranoid
>>>
>>> [ TESTS - DEBIAN ]
>>>
>>> /usr/bin/perf -v
>>> perf version 6.1.7
>>>
>>> /usr/bin/perf test 10 92 98 99 100 101
>>>
>>>  10: PMU events                                                      :
>>>  10.1: PMU event table sanity                                        : Ok
>>>  10.2: PMU event map aliases                                         : Ok
>>>  10.3: Parsing of PMU event table metrics                            : Ok
>>>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
>>>  92: perf record tests                                               : Ok
>>>  98: perf stat tests                                                 : Ok
>>>  99: perf all metricgroups test                                      : Ok
>>> 100: perf all metrics test                                           : FAILED!
>>> 101: perf all PMU test                                               : Ok
>>>
>>> [ TESTS - DILEKS ]
>>>
>>> ~/bin/perf -v
>>> perf version 6.2.0-rc5
>>>
>>> ~/bin/perf test 7 87 93 94 95 96
>>>
>>>   7: PMU events                                                      :
>>>   7.1: PMU event table sanity                                        : Ok
>>>   7.2: PMU event map aliases                                         : Ok
>>>   7.3: Parsing of PMU event table metrics                            : Ok
>>>   7.4: Parsing of PMU event table metrics with fake PMUs             : Ok
>>>  87: perf record tests                                               : Ok
>>>  93: perf stat tests                                                 : Ok
>>>  94: perf all metricgroups test                                      : Ok
>>>  95: perf all metrics test                                           : FAILED!
>>>  96: perf all PMU test                                               : Ok
>>>
>>> [ TESTS - FAILED ]
>>>
>>> /usr/bin/perf test --verbose 100 2>&1 | tee
>>> perf-test-verbose-100-perf-all-metrics-test_debian-perf-6-1-7.txt
>>>
>>> ~/bin/perf test --verbose 95 2>&1 | tee
>>> perf-test-verbose-95-perf-all-metrics-test_dileks-perf-6-2-rc5.txt
>>>
>>> [ TESTS - STOP ]
>>>
>>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
>>> /proc/sys/kernel/perf_event_paranoid
>>>
>>> - EOT -

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-01-30 10:04     ` James Clark
@ 2023-01-31  0:20       ` Ian Rogers
  2023-01-31  3:45         ` Sedat Dilek
  2023-02-01  6:51         ` Ravi Bangoria
  0 siblings, 2 replies; 13+ messages in thread
From: Ian Rogers @ 2023-01-31  0:20 UTC (permalink / raw)
  To: Liang, Kan, Xing, Zhengjun, sedat.dilek
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-perf-users, linux-kernel, Nick Desaulniers,
	Nathan Chancellor, llvm, Ben Hutchings, James Clark,
	Stephane Eranian

On Mon, Jan 30, 2023 at 2:04 AM James Clark <james.clark@arm.com> wrote:
>
>
>
> On 30/01/2023 02:24, Sedat Dilek wrote:
> >
> >
> > On Mon, Jan 30, 2023 at 12:21 AM Ian Rogers <irogers@google.com> wrote:
> >>
> >> On Sun, Jan 29, 2023 at 1:59 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
> >>>
> >>> [ CC LLVM linux folks + Ben from Debian kernel team ]
> >>>
> >>> Hi,
> >>>
> >>> I am playing with LLVM version 16.0.0-rc1 which was released yesterday and PERF.
> >>>
> >>> After building my selfmade LLVM toolchain, I built perf and run some
> >>> perf tests here on my Intel SandyBridge CPU (details see below).
> >>>
> >>> perf all metrics test: FAILED!
> >>>
> >>> ...with both Debian's perf version 6.1.7 and my selfmade version 6.2-rc5.
> >>>
> >>> Just noticed:
> >>>
> >>> Couldn't bump rlimit(MEMLOCK), failures may take place when creating
> >>> BPF maps, etc
> >>>
> >>> Run the below tests with `sudo` - made this go away - still FAILED.
> >>>
> >>> But maybe I am missing to activate some sysfs/debug or whatever other stuff?
> >>
> >> Hi Sedat,
> >>
> >> things have been improving wrt metrics and so this failure may have
> >> just been because of the addition of a previously missing metric. The
> >> rlimit thing shouldn't affect things but maybe file descriptors?
> >> Looking at the test output the issue is:
> >>
> >> ```
> >> Metric 'tma_dram_bound' not printed in:
> >> # Running 'internals/synthesize' benchmark:
> >> Computing performance of single threaded perf event synthesis by
> >> synthesizing events on the perf process itself:
> >>   Average synthesis took: 207.680 usec (+- 0.176 usec)
> >>   Average num. events: 30.000 (+- 0.000)
> >>   Average time per event 6.923 usec
> >>   Average data synthesis took: 217.833 usec (+- 0.202 usec)
> >>   Average num. events: 161.000 (+- 0.000)
> >>   Average time per event 1.353 usec
> >>
> >>  Performance counter stats for 'perf bench internals synthesize':
> >>
> >>      <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> >>                          (0,00%)
> >>      <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> >>                          (0,00%)
> >>      <not counted>      CPU_CLK_UNHALTED.THREAD
> >>                          (0,00%)
> >>      <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> >>                             (0,00%)
> >> ```
> >>
> >> So the test was checking to see whether the tma_dram_bound metric
> >> could be computed on your Sandybridge and it failed. The event counts
> >> below show that every event came back "<not counted>" which is usually
> >> indicative of a permissions problem - it is also not surprising given
> >> this that the metric wasn't computed. You could try repeating the
> >> command the test is trying with something like "perf stat -M
> >> tma_dram_bound -a sleep 1", but running as root should have resolved
> >> that issue. Does that give you enough to keep exploring?
> >>
> >
> > Hi Ian,
> >
> > Thanks for your feedback!
> >
> > I booted into my Debian kernel - just to see what happens.
> >
> > # cat /proc/version
> > Linux version 6.1.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-12
> > (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1
> > SMP PREEMPT_DYNAMIC Debian 6.1.7-1 (2023-01-18)
> >
> > All things run as root...
> >
> > # echo 0 | tee /proc/sys/kernel/kptr_restrict
> > /proc/sys/kernel/perf_event_paranoid
> > 0
> >
> > # /usr/bin/perf test 10 92 98 99 100 101
> > 10: PMU events                                                      :
> > 10.1: PMU event table sanity                                        : Ok
> > 10.2: PMU event map aliases                                         : Ok
> > 10.3: Parsing of PMU event table metrics                            : Ok
> > 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > 92: perf record tests                                               : Ok
> > 98: perf stat tests                                                 : Ok
> > 99: perf all metricgroups test                                      : Ok
> > 100: perf all metrics test                                           : FAILED!
> > 101: perf all PMU test                                               : Ok
> >
> > # perf stat -M tma_dram_bound -a sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> >                   (0,00%)
> >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> >                      (0,00%)
> >     <not counted>      CPU_CLK_UNHALTED.THREAD
> >               (0,00%)
> >     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> >                         (0,00%)
> >
>
> Hi Sedat,
>
> I also had this failure and did a git bisect, but it led me to the
> conclusion that it is a stale build issue rather than a regression.
>
> There was a recent commit that renamed/removed some json PMU files which
> the build system can't cope with. I think the tests end up iterating
> over a different set of event names than were generated by the build system.
>
> If you do a clean build the issue should go away. I don't know if there
> is anything more we can do to stop this from happening.
>
> James

So I think this is a kernel bug triggering a perf tool bug. The kernel
bug can be worked around in the perf tool. I only had an Ivybridge to
test with (hence slightly different events) but what I see is both
tma_dram_bound and tma_l3_bound using the same 4 events. I could work
around the "<not counted>" by adding the --metric-no-group flag:

```
$ perf stat -M tma_l3_bound --metric-no-group -a sleep 1

Performance counter stats for 'system wide':

          400,404      MEM_LOAD_UOPS_RETIRED.LLC_HIT     #      4.3 %  tma_l3_bound   (74.99%)
      128,937,891      CYCLE_ACTIVITY.STALLS_L2_PENDING                               (87.46%)
          167,459      MEM_LOAD_UOPS_RETIRED.LLC_MISS                                 (74.99%)
      759,574,967      CPU_CLK_UNHALTED.THREAD                                        (87.47%)

      1.001526438 seconds time elapsed

$ perf stat -M tma_dram_bound -a --metric-no-group sleep 1

Performance counter stats for 'system wide':

          259,954      MEM_LOAD_UOPS_RETIRED.LLC_HIT     #     15.2 %  tma_dram_bound   (74.99%)
      118,807,043      CYCLE_ACTIVITY.STALLS_L2_PENDING                                 (87.46%)
          111,699      MEM_LOAD_UOPS_RETIRED.LLC_MISS                                   (74.95%)
      587,571,060      CPU_CLK_UNHALTED.THREAD                                          (87.45%)

      1.001518093 seconds time elapsed
```

The issue is that perf metrics use weak groups of events. A weak group
starts out as an ordinary group of events. We want to use groups of
events with metrics so that all the counters are scheduled in and out
at the same time, and not multiplexed independently. Imagine measuring
IPC with the counts for instructions and cycles taken over different
periods: the resulting IPC value would be unlikely to be accurate.
What makes the group weak is that, if perf_event_open fails, the perf
tool retries the events without the group. If I try just 3 of the
events in a weak group then the failure can be seen:

```
$ perf stat -e "{MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING}:W" -a sleep 1

Performance counter stats for 'system wide':

    <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT      (0.00%)
    <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_MISS     (0.00%)
    <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING   (0.00%)

      1.001458485 seconds time elapsed
```
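
To put numbers on the IPC example above, here is a small sketch with
made-up counts (the two time slices, the helper name ipc and all
values are assumptions for illustration, not measurements). It
compares the IPC obtained when both events are counted together with
the IPC obtained when each event is counted in only one slice and
scaled up to the full interval:

```shell
# Hypothetical workload with two equally long time slices:
#   slice 1: 1000 instructions, 1000 cycles (IPC 1.0)
#   slice 2: 4000 instructions, 1000 cycles (IPC 4.0)

# ipc INSTRUCTIONS CYCLES: print the ratio with two decimals.
ipc() {
    LC_ALL=C awk -v i="$1" -v c="$2" 'BEGIN { printf "%.2f\n", i / c }'
}

# Grouped: both events are counted during both slices.
ipc $((1000 + 4000)) $((1000 + 1000))    # prints 2.50

# Multiplexed independently: instructions are only counted in slice 1
# and cycles only in slice 2, each scaled by 2 to cover the interval.
ipc $((1000 * 2)) $((1000 * 2))          # prints 1.00
```

The second result is far from the real 2.50 even though each scaled
count looks plausible on its own, which is why metrics prefer groups.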

The kernel should have failed the perf_event_open on opening the third
event, so that the perf tool would then have measured without the
group, which works with multiplexing as in the following:

```
$ perf stat -e "MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING" -a sleep 1

Performance counter stats for 'system wide':

        1,239,397      MEM_LOAD_UOPS_RETIRED.LLC_HIT      (79.06%)
          174,826      MEM_LOAD_UOPS_RETIRED.LLC_MISS     (64.60%)
      124,026,024      CYCLE_ACTIVITY.STALLS_L2_PENDING   (81.16%)

      1.001483434 seconds time elapsed
```
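
The percentage in parentheses is the share of the measurement
interval during which the event was actually scheduled on a counter.
As a rough sketch, assuming the three numbers that perf stat --verbose
prints after each event name are the raw count, the time enabled and
the time running (the helper name running_share and the second sample
line are made up), that share can be recomputed like this:

```shell
# running_share: read "EVENT: count enabled running" lines on stdin and
# print, per event, the share of enabled time it was really running.
# An event whose running time is 0 never got a counter, which is the
# case reported as "<not counted>" in the thread.
running_share() {
    LC_ALL=C awk '
        NF == 4 && $1 ~ /:$/ {
            name = $1
            sub(/:$/, "", name)          # drop the trailing colon
            ena = $3
            run = $4
            if (ena == 0 || run == 0)
                printf "%s never-ran\n", name
            else
                printf "%s %.2f%%\n", name, 100 * run / ena
        }'
}

# First line is taken from the verbose output quoted in this thread,
# the second one is an invented multiplexed event.
printf '%s\n' \
    'CPU_CLK_UNHALTED.THREAD: 0 4007421228 0' \
    'EXAMPLE.EVENT: 500 1000 750' | running_share
```

With that reading, the failing runs show a running time of 0 for every
event in the group, i.e. the group was opened but never scheduled.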

When the --metric-no-group flag is given to perf then it doesn't
produce the initial weak group, which works around the bug of the
kernel not failing on the 3rd perf_event_open. I've added Kan and
Zhengjun to the e-mail as they work on the Intel kernel PMU code.

That leaves the question of what the perf test should do about this.
I have a few possible solutions:

1) try metric tests again with the --metric-no-group flag and don't
fail the test if this succeeds. This allows kernel bugs to hide, so
I'm not a huge fan.

2) add a new metric flag/constraint to say not to group, this way the
metric will automatically apply the "--metric-no-group" flag. It is a
bit of work to wire this up but this kind of failure is common enough
in PMUs that it is probably worthwhile. We also need to add the flag
to metrics and I'm not sure how to get a good list of the metrics that
currently fail and require it. This is okay but error prone.

3) fix the kernel bug and let the perf test fail until an adequate
kernel is installed. Probably the best option.
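
For illustration only, option 1 could look roughly like the sketch
below. This is not the existing test script: the function names, the
PERF variable and the check for the "<not counted>" string are
assumptions made for this example (the real test checks whether the
metric name is printed). Only the perf stat flags are the ones used
earlier in this mail.

```shell
# Which perf binary to run; hypothetical knob so a stub can be used.
PERF=${PERF:-perf}

# metric_counted METRIC [extra perf stat flags...]
# Returns 0 if no event came back "<not counted>", 1 if one did,
# and 2 if perf itself could not be run.
metric_counted() {
    metric=$1
    shift
    out=$("$PERF" stat -M "$metric" "$@" -a sleep 1 2>&1) || return 2
    case $out in
        *"<not counted>"*) return 1 ;;
    esac
    return 0
}

# check_metric METRIC: try the default weak group first, then retry
# with --metric-no-group and say which of the two attempts worked.
check_metric() {
    if metric_counted "$1"; then
        echo "$1: ok"
    elif metric_counted "$1" --metric-no-group; then
        echo "$1: ok only with --metric-no-group"
    else
        echo "$1: failed"
    fi
}
```

Calling check_metric tma_dram_bound on an affected machine would be
expected to report the second case, which is exactly how a kernel bug
could go unnoticed if the test treated that as a pass.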

Thanks,
Ian

> >       1,002148600 seconds time elapsed
> >
> > Hmm... looking at... Metric 'tma_l3_bound' ...
> >
> > Running...
> >
> > # perf stat --verbose -M tma_l3_bound -a sleep 1
> > Using CPUID GenuineIntel-6-2A-7
> > metric expr (MEM_LOAD_UOPS_RETIRED.LLC_HIT /
> > (MEM_LOAD_UOPS_RETIRED.LLC_HIT + 7 *
> > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS)) *
> > CYCLE_ACTIVITY.STALLS_L2_PENDING / CLKS for tma_l3_bound
> > metric expr CPU_CLK_UNHALTED.THREAD for CLKS
> >
> > found event MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > found event CYCLE_ACTIVITY.STALLS_L2_PENDING
> > found event CPU_CLK_UNHALTED.THREAD
> > found event MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> >
> > Parsing metric events
> > '{MEM_LOAD_UOPS_RETIRED.LLC_HIT/metric-id=MEM_LOAD_UOPS_RETIRED.LLC_HIT/,CYCLE_ACTIVITY.STALLS_L2_PENDING/metric-id=CYCLE_ACTIVITY.STALLS_L2_PEND
> > ING/,CPU_CLK_UNHALTED.THREAD/metric-id=CPU_CLK_UNHALTED.THREAD/,MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/metric-id=MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/}:W'
> > MEM_LOAD_UOPS_RETIRED.LLC_HIT -> cpu/event=0xd1,period=0xc365,umask=0x4/
> > CYCLE_ACTIVITY.STALLS_L2_PENDING ->
> > cpu/event=0xa3,cmask=0x5,period=0x1e8483,umask=0x5/
> > CPU_CLK_UNHALTED.THREAD -> cpu/event=0x3c,period=0x1e8483/
> > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS -> cpu/event=0xd4,period=0x186a7,umask=0x2/
> >
> > Control descriptor is not initialized
> >
> > MEM_LOAD_UOPS_RETIRED.LLC_HIT: 0 4007421228 0
> > CYCLE_ACTIVITY.STALLS_L2_PENDING: 0 4007421228 0
> > CPU_CLK_UNHALTED.THREAD: 0 4007421228 0
> > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS: 0 4007421228 0
> >
> > Performance counter stats for 'system wide':
> >
> >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> >                   (0,00%)
> >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> >                      (0,00%)
> >     <not counted>      CPU_CLK_UNHALTED.THREAD
> >               (0,00%)
> >     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> >                         (0,00%)
> >
> >       1,002310013 seconds time elapsed
> >
> > So those events/metric-ids resulting in "<not counted>" are all found.
> >
> > What means "Control descriptor is not initialized"?
> >
> > To summarize:
> >
> > Those two tests in "100: perf all metrics test" FAILED:
> >
> > 1. tma_dram_bound
> > 2. tma_l3_bound
> >
> > Best regards,
> > -Sedat-
> >
> >> Thanks,
> >> Ian
> >>
> >>> Last perf version which was OK:
> >>>
> >>> ~/bin/perf -v
> >>> perf version 6.0.0
> >>>
> >>> echo "linux-perf: Adjust limited access to performance monitoring and
> >>> observability operations"
> >>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> >>> /proc/sys/kernel/perf_event_paranoid
> >>> 0
> >>>
> >>> ~/bin/perf test 10 86 92 93 94 95
> >>> 10: PMU events                                                      :
> >>> 10.1: PMU event table sanity                                        : Ok
> >>> 10.2: PMU event map aliases                                         : Ok
> >>> 10.3: Parsing of PMU event table metrics                            : Ok
> >>> 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >>> 86: perf record tests                                               : Ok
> >>> 92: perf stat tests                                                 : Ok
> >>> 93: perf all metricgroups test                                      : Ok
> >>> 94: perf all metrics test                                           : Ok
> >>> 95: perf all PMU test                                               : Ok
> >>>
> >>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> >>> /proc/sys/kernel/perf_event_paranoid
> >>> echo "linux-perf: Reset limited access to performance monitoring and
> >>> observability operations"
> >>>
> >>> If you need further information, please let me know.
> >>>
> >>> Thanks.
> >>>
> >>> Regards,
> >>> -Sedat-
> >>>
> >>> P.S. Instructions
> >>>
> >>> [ REPRODUCER ]
> >>>
> >>> LLVM_MVER="16"
> >>>
> >>> # Debian LLVM
> >>> ##LLVM_TOOLCHAIN_PATH="/usr/lib/llvm-${LLVM_MVER}/bin"
> >>> # Selfmade LLVM
> >>> LLVM_TOOLCHAIN_PATH="/opt/llvm/bin"
> >>> if [ -d ${LLVM_TOOLCHAIN_PATH} ]; then
> >>>    export PATH="${LLVM_TOOLCHAIN_PATH}:${PATH}"
> >>> fi
> >>>
> >>> PYTHON_VER="3.11"
> >>> MAKE="make"
> >>> MAKE_OPTS="V=1 -j1 HOSTCC=clang-$LLVM_MVER HOSTLD=ld.lld
> >>> HOSTAR=llvm-ar CC=clang-$LLVM_MVER LD=ld.lld AR=llvm-ar
> >>> STRIP=llvm-strip"
> >>>
> >>> echo "LLVM MVER ........ $LLVM_MVER"
> >>> echo "Path settings .... $PATH"
> >>> echo "Python version ... $PYTHON_VER"
> >>> echo "make line ........ $MAKE $MAKE_OPTS"
> >>>
> >>> LANG=C LC_ALL=C make -C tools/perf clean 2>&1 | tee ../make-log_perf-clean.txt
> >>>
> >>> LANG=C LC_ALL=C $MAKE $MAKE_OPTS -C tools/perf
> >>> PYTHON=python${PYTHON_VER} install-bin 2>&1 | tee
> >>> ../make-log_perf-install_bin_python${PYTHON_VER}_llvm${LLVM_MVER}.txt
> >>>
> >>>
> >>> [ TESTS ]
> >>>
> >>> [ TESTS - START ]
> >>>
> >>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> >>> /proc/sys/kernel/perf_event_paranoid
> >>>
> >>> [ TESTS - DEBIAN ]
> >>>
> >>> /usr/bin/perf -v
> >>> perf version 6.1.7
> >>>
> >>> /usr/bin/perf test 10 92 98 99 100 101
> >>>
> >>>  10: PMU events                                                      :
> >>>  10.1: PMU event table sanity                                        : Ok
> >>>  10.2: PMU event map aliases                                         : Ok
> >>>  10.3: Parsing of PMU event table metrics                            : Ok
> >>>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >>>  92: perf record tests                                               : Ok
> >>>  98: perf stat tests                                                 : Ok
> >>>  99: perf all metricgroups test                                      : Ok
> >>> 100: perf all metrics test                                           : FAILED!
> >>> 101: perf all PMU test                                               : Ok
> >>>
> >>> [ TESTS - DILEKS ]
> >>>
> >>> ~/bin/perf -v
> >>> perf version 6.2.0-rc5
> >>>
> >>> ~/bin/perf test 7 87 93 94 95 96
> >>>
> >>>   7: PMU events                                                      :
> >>>   7.1: PMU event table sanity                                        : Ok
> >>>   7.2: PMU event map aliases                                         : Ok
> >>>   7.3: Parsing of PMU event table metrics                            : Ok
> >>>   7.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> >>>  87: perf record tests                                               : Ok
> >>>  93: perf stat tests                                                 : Ok
> >>>  94: perf all metricgroups test                                      : Ok
> >>>  95: perf all metrics test                                           : FAILED!
> >>>  96: perf all PMU test                                               : Ok
> >>>
> >>> [ TESTS - FAILED ]
> >>>
> >>> /usr/bin/perf test --verbose 100 2>&1 | tee
> >>> perf-test-verbose-100-perf-all-metrics-test_debian-perf-6-1-7.txt
> >>>
> >>> ~/bin/perf test --verbose 95 2>&1 | tee
> >>> perf-test-verbose-95-perf-all-metrics-test_dileks-perf-6-2-rc5.txt
> >>>
> >>> [ TESTS - STOP ]
> >>>
> >>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> >>> /proc/sys/kernel/perf_event_paranoid
> >>>
> >>> - EOT -

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-01-31  0:20       ` Ian Rogers
@ 2023-01-31  3:45         ` Sedat Dilek
  2023-01-31  3:55           ` Ian Rogers
  2023-02-01  6:51         ` Ravi Bangoria
  1 sibling, 1 reply; 13+ messages in thread
From: Sedat Dilek @ 2023-01-31  3:45 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Liang, Kan, Xing, Zhengjun, Arnaldo Carvalho de Melo,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-perf-users, linux-kernel,
	Nick Desaulniers, Nathan Chancellor, llvm, Ben Hutchings,
	James Clark, Stephane Eranian

On Tue, Jan 31, 2023 at 1:20 AM Ian Rogers <irogers@google.com> wrote:
>
> On Mon, Jan 30, 2023 at 2:04 AM James Clark <james.clark@arm.com> wrote:
> >
> >
> >
> > On 30/01/2023 02:24, Sedat Dilek wrote:
> > >
> > >
> > > On Mon, Jan 30, 2023 at 12:21 AM Ian Rogers <irogers@google.com> wrote:
> > >>
> > >> On Sun, Jan 29, 2023 at 1:59 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
> > >>>
> > >>> [ CC LLVM linux folks + Ben from Debian kernel team ]
> > >>>
> > >>> Hi,
> > >>>
> > >>> I am playing with LLVM version 16.0.0-rc1 which was released yesterday and PERF.
> > >>>
> > >>> After building my selfmade LLVM toolchain, I built perf and run some
> > >>> perf tests here on my Intel SandyBridge CPU (details see below).
> > >>>
> > >>> perf all metrics test: FAILED!
> > >>>
> > >>> ...with both Debian's perf version 6.1.7 and my selfmade version 6.2-rc5.
> > >>>
> > >>> Just noticed:
> > >>>
> > >>> Couldn't bump rlimit(MEMLOCK), failures may take place when creating
> > >>> BPF maps, etc
> > >>>
> > >>> Run the below tests with `sudo` - made this go away - still FAILED.
> > >>>
> > >>> But maybe I am missing to activate some sysfs/debug or whatever other stuff?
> > >>
> > >> Hi Sedat,
> > >>
> > >> things have been improving wrt metrics and so this failure may have
> > >> just been because of the addition of a previously missing metric. The
> > >> rlimit thing shouldn't affect things but maybe file descriptors?
> > >> Looking at the test output the issue is:
> > >>
> > >> ```
> > >> Metric 'tma_dram_bound' not printed in:
> > >> # Running 'internals/synthesize' benchmark:
> > >> Computing performance of single threaded perf event synthesis by
> > >> synthesizing events on the perf process itself:
> > >>   Average synthesis took: 207.680 usec (+- 0.176 usec)
> > >>   Average num. events: 30.000 (+- 0.000)
> > >>   Average time per event 6.923 usec
> > >>   Average data synthesis took: 217.833 usec (+- 0.202 usec)
> > >>   Average num. events: 161.000 (+- 0.000)
> > >>   Average time per event 1.353 usec
> > >>
> > >>  Performance counter stats for 'perf bench internals synthesize':
> > >>
> > >>      <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > >>                          (0,00%)
> > >>      <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > >>                          (0,00%)
> > >>      <not counted>      CPU_CLK_UNHALTED.THREAD
> > >>                          (0,00%)
> > >>      <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > >>                             (0,00%)
> > >> ```
> > >>
> > >> So the test was checking to see whether the tma_dram_bound metric
> > >> could be computed on your Sandybridge and it failed. The event counts
> > >> below show that every event came back "<not counted>" which is usually
> > >> indicative of a permissions problem - it is also not surprising given
> > >> this that the metric wasn't computed. You could try repeating the
> > >> command the test is trying with something like "perf stat -M
> > >> tma_dram_bound -a sleep 1", but running as root should have resolved
> > >> that issue. Does that give you enough to keep exploring?
> > >>
> > >
> > > Hi Ian,
> > >
> > > Thanks for your feedback!
> > >
> > > I booted into my Debian kernel - just to see what happens.
> > >
> > > # cat /proc/version
> > > Linux version 6.1.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-12
> > > (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1
> > > SMP PREEMPT_DYNAMIC Debian 6.1.7-1 (2023-01-18)
> > >
> > > All things run as root...
> > >
> > > # echo 0 | tee /proc/sys/kernel/kptr_restrict
> > > /proc/sys/kernel/perf_event_paranoid
> > > 0
> > >
> > > # /usr/bin/perf test 10 92 98 99 100 101
> > > 10: PMU events                                                      :
> > > 10.1: PMU event table sanity                                        : Ok
> > > 10.2: PMU event map aliases                                         : Ok
> > > 10.3: Parsing of PMU event table metrics                            : Ok
> > > 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > 92: perf record tests                                               : Ok
> > > 98: perf stat tests                                                 : Ok
> > > 99: perf all metricgroups test                                      : Ok
> > > 100: perf all metrics test                                           : FAILED!
> > > 101: perf all PMU test                                               : Ok
> > >
> > > # perf stat -M tma_dram_bound -a sleep 1
> > >
> > > Performance counter stats for 'system wide':
> > >
> > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > >                   (0,00%)
> > >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > >                      (0,00%)
> > >     <not counted>      CPU_CLK_UNHALTED.THREAD
> > >               (0,00%)
> > >     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > >                         (0,00%)
> > >
> >
> > Hi Sedat,
> >
> > I also had this failure and did a git bisect, but it led me to the
> > conclusion that it is a stale build issue rather than a regression.
> >
> > There was a recent commit that renamed/removed some json PMU files which
> > the build system can't cope with. I think the tests end up iterating
> > over a different set of event names than were generated by the build system.
> >
> > If you do a clean build the issue should go away. I don't know if there
> > is anything more we can do to stop this from happening.
> >
> > James
>
> So I think this is a kernel bug triggering a perf tool bug. The kernel
> bug can be worked around in the perf tool. I only had an Ivybridge to
> test with (hence slightly different events) but what I see is both
> tma_dram_bound and tma_l3_bound using the same 4 events. I could work
> around the "<not counted>" by adding the --metric-no-group flag:
>
> ```
> $ perf stat -M tma_l3_bound --metric-no-group -a sleep 1
>
> Performance counter stats for 'system wide':
>
>           400,404      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      4.3 %
> tma_l3_bound             (74.99%)
>       128,937,891      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                         (87.46%)
>           167,459      MEM_LOAD_UOPS_RETIRED.LLC_MISS
>                         (74.99%)
>       759,574,967      CPU_CLK_UNHALTED.THREAD
>                         (87.47%)
>
>       1.001526438 seconds time elapsed
>
> $ perf stat -M tma_dram_bound -a --metric-no-group sleep 1
>
> Performance counter stats for 'system wide':
>
>           259,954      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #     15.2 %
> tma_dram_bound           (74.99%)
>       118,807,043      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                         (87.46%)
>           111,699      MEM_LOAD_UOPS_RETIRED.LLC_MISS
>                         (74.95%)
>       587,571,060      CPU_CLK_UNHALTED.THREAD
>                         (87.45%)
>
>       1.001518093 seconds time elapsed
> ```
>
> The issue is that perf metrics use weak groups of events. A weak group
> is the same as a group of events initially. We want to use groups of
> events with metrics so that all the counters are scheduled in and out
> at the same time, and not multiplexed independently. Imagine measuring
> IPC but the counts for instructions and cycles are measured at
> different periods, the resultant IPC value would be unlikely to be
> accurate. If perf_event_open fails then the perf tool retries the
> events without the group. If I try just 3 of the events in a weak
> group then the failure can be seen:
>
> ```
> $ perf stat -e "{MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING}:W"
> -a sleep 1
>
> Performance counter stats for 'system wide':
>
>     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
>                         (0.00%)
>     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_MISS
>                         (0.00%)
>     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                         (0.00%)
>
>       1.001458485 seconds time elapsed
> ```
>
> The kernel should have failed the perf_event_open on opening the third
> event and then measured without the group, which it can do with
> multiplexing as in the following:
>
> ```
> $ perf stat -e "MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING"
> -a sleep 1
>
> Performance counter stats for 'system wide':
>
>         1,239,397      MEM_LOAD_UOPS_RETIRED.LLC_HIT
>                         (79.06%)
>           174,826      MEM_LOAD_UOPS_RETIRED.LLC_MISS
>                         (64.60%)
>       124,026,024      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                         (81.16%)
>
>       1.001483434 seconds time elapsed
> ```
>
> When the --metric-no-group flag is given to perf then it doesn't
> produce the initial weak group, which works around the bug of the
> kernel not failing on the 3rd perf_event_open. I've added Kan and
> Zhengjun to the e-mail as they work on the Intel kernel PMU code.
>
> There's a question about what we should do in the perf test about
> this? I have a few solutions:
>
> 1) try metric tests again with the --metric-no-group flag and don't
> fail the test if this succeeds. This allows kernel bugs to hide, so
> I'm not a huge fan.
>
> 2) add a new metric flag/constraint to say not to group, this way the
> metric will automatically apply the "--metric-no-group" flag. It is a
> bit of work to wire this up but this kind of failure is common enough
> in PMUs that it is probably worthwhile. We also need to add the flag
> to metrics and I'm not sure how to get a good list of the metrics that
> currently fail and require it. This is okay but error prone.
>
> 3) fix the kernel bug and let the perf test fail until an adequate
> kernel is installed. Probably the best option.
>

Hi Ian,

I can confirm:

$ echo 0 | sudo tee /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid
0

$ ~/bin/perf stat -M tma_l3_bound --metric-no-group -a sleep 1

Performance counter stats for 'system wide':

        2.058.892      MEM_LOAD_UOPS_RETIRED.LLC_HIT        #      1,5 %  tma_l3_bound  (99,30%)
      173.254.697      CYCLE_ACTIVITY.STALLS_L2_PENDING                                 (99,10%)
    2.396.130.501      CPU_CLK_UNHALTED.THREAD                                          (99,60%)
        1.110.486      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS                              (99,53%)

      1,001989022 seconds time elapsed

$ ~/bin/perf stat -M tma_dram_bound --metric-no-group -a sleep 1

Performance counter stats for 'system wide':

        1.729.208      MEM_LOAD_UOPS_RETIRED.LLC_HIT        #      1,2 %  tma_dram_bound  (99,50%)
       50.346.734      CYCLE_ACTIVITY.STALLS_L2_PENDING                                   (99,50%)
    2.354.963.862      CPU_CLK_UNHALTED.THREAD                                            (99,80%)
          306.500      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS                                (99,61%)

      1,001981392 seconds time elapsed
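
As a sanity check on the tma_l3_bound value, the four counts from the
first run above can be plugged into the metric expression that
`perf stat --verbose` printed earlier in this thread:
(LLC_HIT / (LLC_HIT + 7 * LLC_MISS)) * STALLS_L2_PENDING / CLKS.
A minimal sketch; the helper name tma_l3_bound_pct is made up for
illustration and is not part of perf:

```shell
# Hypothetical helper (not part of perf): evaluate the tma_l3_bound
# expression from raw counts and print it as a percentage.
# Args: LLC_HIT LLC_MISS STALLS_L2_PENDING CLKS
tma_l3_bound_pct() {
    # LC_ALL=C keeps the decimal point independent of the user's locale.
    LC_ALL=C awk -v hit="$1" -v miss="$2" -v stalls="$3" -v clks="$4" \
        'BEGIN { printf "%.1f\n", 100 * (hit / (hit + 7 * miss)) * stalls / clks }'
}

# Counts from the tma_l3_bound run above, thousands separators removed.
tma_l3_bound_pct 2058892 1110486 173254697 2396130501
```

This comes out at roughly the 1,5 % that perf printed, so the ungrouped
counts are consistent with the reported metric.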

Thanks!

BR,
-Sedat-

> Thanks,
> Ian
>
> > >       1,002148600 seconds time elapsed
> > >
> > > Hmm... looking at... Metric 'tma_l3_bound' ...
> > >
> > > Running...
> > >
> > > # perf stat --verbose -M tma_l3_bound -a sleep 1
> > > Using CPUID GenuineIntel-6-2A-7
> > > metric expr (MEM_LOAD_UOPS_RETIRED.LLC_HIT /
> > > (MEM_LOAD_UOPS_RETIRED.LLC_HIT + 7 *
> > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS)) *
> > > CYCLE_ACTIVITY.STALLS_L2_PENDING / CLKS for tma_l3_bound
> > > metric expr CPU_CLK_UNHALTED.THREAD for CLKS
> > >
> > > found event MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > found event CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > found event CPU_CLK_UNHALTED.THREAD
> > > found event MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > >
> > > Parsing metric events
> > > '{MEM_LOAD_UOPS_RETIRED.LLC_HIT/metric-id=MEM_LOAD_UOPS_RETIRED.LLC_HIT/,CYCLE_ACTIVITY.STALLS_L2_PENDING/metric-id=CYCLE_ACTIVITY.STALLS_L2_PEND
> > > ING/,CPU_CLK_UNHALTED.THREAD/metric-id=CPU_CLK_UNHALTED.THREAD/,MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/metric-id=MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/}:W'
> > > MEM_LOAD_UOPS_RETIRED.LLC_HIT -> cpu/event=0xd1,period=0xc365,umask=0x4/
> > > CYCLE_ACTIVITY.STALLS_L2_PENDING ->
> > > cpu/event=0xa3,cmask=0x5,period=0x1e8483,umask=0x5/
> > > CPU_CLK_UNHALTED.THREAD -> cpu/event=0x3c,period=0x1e8483/
> > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS -> cpu/event=0xd4,period=0x186a7,umask=0x2/
> > >
> > > Control descriptor is not initialized
> > >
> > > MEM_LOAD_UOPS_RETIRED.LLC_HIT: 0 4007421228 0
> > > CYCLE_ACTIVITY.STALLS_L2_PENDING: 0 4007421228 0
> > > CPU_CLK_UNHALTED.THREAD: 0 4007421228 0
> > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS: 0 4007421228 0
> > >
> > > Performance counter stats for 'system wide':
> > >
> > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > >                   (0,00%)
> > >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > >                      (0,00%)
> > >     <not counted>      CPU_CLK_UNHALTED.THREAD
> > >               (0,00%)
> > >     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > >                         (0,00%)
> > >
> > >       1,002310013 seconds time elapsed
> > >
> > > So those events/metric-ids resulting in "<not counted>" are all found.
> > >
> > > What means "Control descriptor is not initialized"?
> > >
> > > To summarize:
> > >
> > > Those two tests in "100: perf all metrics test" FAILED:
> > >
> > > 1. tma_dram_bound
> > > 2. tma_l3_bound
> > >
> > > Best regards,
> > > -Sedat-
> > >
> > >> Thanks,
> > >> Ian
> > >>
> > >>> Last perf version which was OK:
> > >>>
> > >>> ~/bin/perf -v
> > >>> perf version 6.0.0
> > >>>
> > >>> echo "linux-perf: Adjust limited access to performance monitoring and
> > >>> observability operations"
> > >>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > >>> /proc/sys/kernel/perf_event_paranoid
> > >>> 0
> > >>>
> > >>> ~/bin/perf test 10 86 92 93 94 95
> > >>> 10: PMU events                                                      :
> > >>> 10.1: PMU event table sanity                                        : Ok
> > >>> 10.2: PMU event map aliases                                         : Ok
> > >>> 10.3: Parsing of PMU event table metrics                            : Ok
> > >>> 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > >>> 86: perf record tests                                               : Ok
> > >>> 92: perf stat tests                                                 : Ok
> > >>> 93: perf all metricgroups test                                      : Ok
> > >>> 94: perf all metrics test                                           : Ok
> > >>> 95: perf all PMU test                                               : Ok
> > >>>
> > >>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> > >>> /proc/sys/kernel/perf_event_paranoid
> > >>> echo "linux-perf: Reset limited access to performance monitoring and
> > >>> observability operations"
> > >>>
> > >>> If you need further information, please let me know.
> > >>>
> > >>> Thanks.
> > >>>
> > >>> Regards,
> > >>> -Sedat-
> > >>>
> > >>> P.S. Instructions
> > >>>
> > >>> [ REPRODUCER ]
> > >>>
> > >>> LLVM_MVER="16"
> > >>>
> > >>> # Debian LLVM
> > >>> ##LLVM_TOOLCHAIN_PATH="/usr/lib/llvm-${LLVM_MVER}/bin"
> > >>> # Selfmade LLVM
> > >>> LLVM_TOOLCHAIN_PATH="/opt/llvm/bin"
> > >>> if [ -d ${LLVM_TOOLCHAIN_PATH} ]; then
> > >>>    export PATH="${LLVM_TOOLCHAIN_PATH}:${PATH}"
> > >>> fi
> > >>>
> > >>> PYTHON_VER="3.11"
> > >>> MAKE="make"
> > >>> MAKE_OPTS="V=1 -j1 HOSTCC=clang-$LLVM_MVER HOSTLD=ld.lld
> > >>> HOSTAR=llvm-ar CC=clang-$LLVM_MVER LD=ld.lld AR=llvm-ar
> > >>> STRIP=llvm-strip"
> > >>>
> > >>> echo "LLVM MVER ........ $LLVM_MVER"
> > >>> echo "Path settings .... $PATH"
> > >>> echo "Python version ... $PYTHON_VER"
> > >>> echo "make line ........ $MAKE $MAKE_OPTS"
> > >>>
> > >>> LANG=C LC_ALL=C make -C tools/perf clean 2>&1 | tee ../make-log_perf-clean.txt
> > >>>
> > >>> LANG=C LC_ALL=C $MAKE $MAKE_OPTS -C tools/perf
> > >>> PYTHON=python${PYTHON_VER} install-bin 2>&1 | tee
> > >>> ../make-log_perf-install_bin_python${PYTHON_VER}_llvm${LLVM_MVER}.txt
> > >>>
> > >>>
> > >>> [ TESTS ]
> > >>>
> > >>> [ TESTS - START ]
> > >>>
> > >>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > >>> /proc/sys/kernel/perf_event_paranoid
> > >>>
> > >>> [ TESTS - DEBIAN ]
> > >>>
> > >>> /usr/bin/perf -v
> > >>> perf version 6.1.7
> > >>>
> > >>> /usr/bin/perf test 10 92 98 99 100 101
> > >>>
> > >>>  10: PMU events                                                      :
> > >>>  10.1: PMU event table sanity                                        : Ok
> > >>>  10.2: PMU event map aliases                                         : Ok
> > >>>  10.3: Parsing of PMU event table metrics                            : Ok
> > >>>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > >>>  92: perf record tests                                               : Ok
> > >>>  98: perf stat tests                                                 : Ok
> > >>>  99: perf all metricgroups test                                      : Ok
> > >>> 100: perf all metrics test                                           : FAILED!
> > >>> 101: perf all PMU test                                               : Ok
> > >>>
> > >>> [ TESTS - DILEKS ]
> > >>>
> > >>> ~/bin/perf -v
> > >>> perf version 6.2.0-rc5
> > >>>
> > >>> ~/bin/perf test 7 87 93 94 95 96
> > >>>
> > >>>   7: PMU events                                                      :
> > >>>   7.1: PMU event table sanity                                        : Ok
> > >>>   7.2: PMU event map aliases                                         : Ok
> > >>>   7.3: Parsing of PMU event table metrics                            : Ok
> > >>>   7.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > >>>  87: perf record tests                                               : Ok
> > >>>  93: perf stat tests                                                 : Ok
> > >>>  94: perf all metricgroups test                                      : Ok
> > >>>  95: perf all metrics test                                           : FAILED!
> > >>>  96: perf all PMU test                                               : Ok
> > >>>
> > >>> [ TESTS - FAILED ]
> > >>>
> > >>> /usr/bin/perf test --verbose 100 2>&1 | tee
> > >>> perf-test-verbose-100-perf-all-metrics-test_debian-perf-6-1-7.txt
> > >>>
> > >>> ~/bin/perf test --verbose 95 2>&1 | tee
> > >>> perf-test-verbose-95-perf-all-metrics-test_dileks-perf-6-2-rc5.txt
> > >>>
> > >>> [ TESTS - STOP ]
> > >>>
> > >>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> > >>> /proc/sys/kernel/perf_event_paranoid
> > >>>
> > >>> - EOT -


* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-01-31  3:45         ` Sedat Dilek
@ 2023-01-31  3:55           ` Ian Rogers
  2023-01-31  6:14             ` Sedat Dilek
  2023-02-01 15:27             ` Liang, Kan
  0 siblings, 2 replies; 13+ messages in thread
From: Ian Rogers @ 2023-01-31  3:55 UTC (permalink / raw)
  To: sedat.dilek
  Cc: Liang, Kan, Xing, Zhengjun, Arnaldo Carvalho de Melo,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-perf-users, linux-kernel,
	Nick Desaulniers, Nathan Chancellor, llvm, Ben Hutchings,
	James Clark, Stephane Eranian

On Mon, Jan 30, 2023 at 7:45 PM Sedat Dilek <sedat.dilek@gmail.com> wrote:
>
> On Tue, Jan 31, 2023 at 1:20 AM Ian Rogers <irogers@google.com> wrote:
> >
> > On Mon, Jan 30, 2023 at 2:04 AM James Clark <james.clark@arm.com> wrote:
> > >
> > >
> > >
> > > On 30/01/2023 02:24, Sedat Dilek wrote:
> > > > ?
> > > >
> > > > On Mon, Jan 30, 2023 at 12:21 AM Ian Rogers <irogers@google.com> wrote:
> > > >>
> > > >> On Sun, Jan 29, 2023 at 1:59 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
> > > >>>
> > > >>> [ CC LLVM linux folks + Ben from Debian kernel team ]
> > > >>>
> > > >>> Hi,
> > > >>>
> > > >>> I am playing with LLVM version 16.0.0-rc1 which was released yesterday and PERF.
> > > >>>
> > > >>> After building my selfmade LLVM toolchain, I built perf and run some
> > > >>> perf tests here on my Intel SandyBridge CPU (details see below).
> > > >>>
> > > >>> perf all metrics test: FAILED!
> > > >>>
> > > >>> ...with both Debian's perf version 6.1.7 and my selfmade version 6.2-rc5.
> > > >>>
> > > >>> Just noticed:
> > > >>>
> > > >>> Couldn't bump rlimit(MEMLOCK), failures may take place when creating
> > > >>> BPF maps, etc
> > > >>>
> > > >>> Run the below tests with `sudo` - made this go away - still FAILED.
> > > >>>
> > > >>> But maybe I am missing to activate some sysfs/debug or whatever other stuff?
> > > >>
> > > >> Hi Sedat,
> > > >>
> > > >> things have been improving wrt metrics and so this failure may have
> > > >> just been because of the addition of a previously missing metric. The
> > > >> rlimit thing shouldn't affect things but maybe file descriptors?
> > > >> Looking at the test output the issue is:
> > > >>
> > > >> ```
> > > >> Metric 'tma_dram_bound' not printed in:
> > > >> # Running 'internals/synthesize' benchmark:
> > > >> Computing performance of single threaded perf event synthesis by
> > > >> synthesizing events on the perf process itself:
> > > >>   Average synthesis took: 207.680 usec (+- 0.176 usec)
> > > >>   Average num. events: 30.000 (+- 0.000)
> > > >>   Average time per event 6.923 usec
> > > >>   Average data synthesis took: 217.833 usec (+- 0.202 usec)
> > > >>   Average num. events: 161.000 (+- 0.000)
> > > >>   Average time per event 1.353 usec
> > > >>
> > > >>  Performance counter stats for 'perf bench internals synthesize':
> > > >>
> > > >>      <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > >>                          (0,00%)
> > > >>      <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > >>                          (0,00%)
> > > >>      <not counted>      CPU_CLK_UNHALTED.THREAD
> > > >>                          (0,00%)
> > > >>      <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > >>                             (0,00%)
> > > >> ```
> > > >>
> > > >> So the test was checking to see whether the tma_dram_bound metric
> > > >> could be computed on your Sandybridge and it failed. The event counts
> > > >> below show that every event came back "<not counted>" which is usually
> > > >> indicative of a permissions problem - it is also not surprising given
> > > >> this that the metric wasn't computed. You could try repeating the
> > > >> command the test is trying with something like "perf stat -M
> > > >> tma_dram_bound -a sleep 1", but running as root should have resolved
> > > >> that issue. Does that give you enough to keep exploring?
> > > >>
> > > >
> > > > Hi Ian,
> > > >
> > > > Thanks for your feedback!
> > > >
> > > > I booted into my Debian kernel - just to see what happens.
> > > >
> > > > # cat /proc/version
> > > > Linux version 6.1.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-12
> > > > (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1
> > > > SMP PREEMPT_DYNAMIC Debian 6.1.7-1 (2023-01-18)
> > > >
> > > > All things run as root...
> > > >
> > > > # echo 0 | tee /proc/sys/kernel/kptr_restrict
> > > > /proc/sys/kernel/perf_event_paranoid
> > > > 0
> > > >
> > > > # /usr/bin/perf test 10 92 98 99 100 101
> > > > 10: PMU events                                                      :
> > > > 10.1: PMU event table sanity                                        : Ok
> > > > 10.2: PMU event map aliases                                         : Ok
> > > > 10.3: Parsing of PMU event table metrics                            : Ok
> > > > 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > > 92: perf record tests                                               : Ok
> > > > 98: perf stat tests                                                 : Ok
> > > > 99: perf all metricgroups test                                      : Ok
> > > > 100: perf all metrics test                                           : FAILED!
> > > > 101: perf all PMU test                                               : Ok
> > > >
> > > > # perf stat -M tma_dram_bound -a sleep 1
> > > >
> > > > Performance counter stats for 'system wide':
> > > >
> > > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > >                   (0,00%)
> > > >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > >                      (0,00%)
> > > >     <not counted>      CPU_CLK_UNHALTED.THREAD
> > > >               (0,00%)
> > > >     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > >                         (0,00%)
> > > >
> > >
> > > Hi Sedat,
> > >
> > > I also had this failure and did a git bisect, but it led me to the
> > > conclusion that it is a stale build issue rather than a regression.
> > >
> > > There was a recent commit that renamed/removed some json PMU files which
> > > the build system can't cope with. I think the tests end up iterating
> > > over a different set of event names than were generated by the build system.
> > >
> > > If you do a clean build the issue should go away. I don't know if there
> > > is anything more we can do to stop this from happening.
> > >
> > > James
> >
> > So I think this is a kernel bug triggering a perf tool bug. The kernel
> > bug can be worked around in the perf tool. I only had an Ivybridge to
> > test with (hence slightly different events) but what I see is both
> > tma_dram_bound and tma_l3_bound using the same 4 events. I could work
> > around the "<not counted>" by adding the --metric-no-group flag:
> >
> > ```
> > $ perf stat -M tma_l3_bound --metric-no-group -a sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> >           400,404      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      4.3 %
> > tma_l3_bound             (74.99%)
> >       128,937,891      CYCLE_ACTIVITY.STALLS_L2_PENDING
> >                         (87.46%)
> >           167,459      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> >                         (74.99%)
> >       759,574,967      CPU_CLK_UNHALTED.THREAD
> >                         (87.47%)
> >
> >       1.001526438 seconds time elapsed
> >
> > $ perf stat -M tma_dram_bound -a --metric-no-group sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> >           259,954      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #     15.2 %
> > tma_dram_bound           (74.99%)
> >       118,807,043      CYCLE_ACTIVITY.STALLS_L2_PENDING
> >                         (87.46%)
> >           111,699      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> >                         (74.95%)
> >       587,571,060      CPU_CLK_UNHALTED.THREAD
> >                         (87.45%)
> >
> >       1.001518093 seconds time elapsed
> > ```
> >
> > The issue is that perf metrics use weak groups of events. A weak group
> > is the same as a group of events initially. We want to use groups of
> > events with metrics so that all the counters are scheduled in and out
> > at the same time, and not multiplexed independently. Imagine measuring
> > IPC but the counts for instructions and cycles are measured at
> > different periods, the resultant IPC value would be unlikely to be
> > accurate. If perf_event_open fails then the perf tool retries the
> > events without the group. If I try just 3 of the events in a weak
> > group then the failure can be seen:
> >
> > ```
> > $ perf stat -e "{MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING}:W"
> > -a sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> >                         (0.00%)
> >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> >                         (0.00%)
> >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> >                         (0.00%)
> >
> >       1.001458485 seconds time elapsed
> > ```
> >
> > The kernel should have failed the perf_event_open on opening the third
> > event and then measured without the group, which it can do with
> > multiplexing as in the following:
> >
> > ```
> > $ perf stat -e "MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING"
> > -a sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> >         1,239,397      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> >                         (79.06%)
> >           174,826      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> >                         (64.60%)
> >       124,026,024      CYCLE_ACTIVITY.STALLS_L2_PENDING
> >                         (81.16%)
> >
> >       1.001483434 seconds time elapsed
> > ```
> >
> > When the --metric-no-group flag is given to perf then it doesn't
> > produce the initial weak group, which works around the bug of the
> > kernel not failing on the 3rd perf_event_open. I've added Kan and
> > Zhengjun to the e-mail as they work on the Intel kernel PMU code.
> >
> > There's a question about what we should do in the perf test about
> > this? I have a few solutions:
> >
> > 1) try metric tests again with the --metric-no-group flag and don't
> > fail the test if this succeeds. This allows kernel bugs to hide, so
> > I'm not a huge fan.
> >
> > 2) add a new metric flag/constraint to say not to group, this way the
> > metric will automatically apply the "--metric-no-group" flag. It is a
> > bit of work to wire this up but this kind of failure is common enough
> > in PMUs that it is probably worthwhile. We also need to add the flag
> > to metrics and I'm not sure how to get a good list of the metrics that
> > currently fail and require it. This is okay but error prone.
> >
> > 3) fix the kernel bug and let the perf test fail until an adequate
> > kernel is installed. Probably the best option.
> >
>
> Hi Ian,
>
> I can confirm:
>
> $ echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> /proc/sys/kernel/perf_event_paranoid
> 0
>
> $ ~/bin/perf stat -M tma_l3_bound --metric-no-group -a sleep 1
>
> Performance counter stats for 'system wide':
>
>         2.058.892      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,5 %
> tma_l3_bound             (99,30%)
>       173.254.697      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                         (99,10%)
>     2.396.130.501      CPU_CLK_UNHALTED.THREAD
>                         (99,60%)
>         1.110.486      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
>                            (99,53%)
>
>       1,001989022 seconds time elapsed
>
> $ ~/bin/perf stat -M tma_dram_bound --metric-no-group -a sleep 1
>
> Performance counter stats for 'system wide':
>
>         1.729.208      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,2 %
> tma_dram_bound           (99,50%)
>        50.346.734      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                         (99,50%)
>     2.354.963.862      CPU_CLK_UNHALTED.THREAD
>                         (99,80%)
>           306.500      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
>                            (99,61%)
>
>       1,001981392 seconds time elapsed
>
> Thanks!

Thanks. Apparently this is a SandyBridge/IvyBridge issue: some counters
in use on one hyperthread limit which counters can be used on the other.
I believe that's what the comment about EXCL access refers to here:
https://github.com/torvalds/linux/blob/master/arch/x86/events/intel/core.c#L124
So you may have more success with the metric if you disable
hyperthreading, but I imagine that's not a popular option.
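
If you want to check whether hyperthreading (SMT) is currently on before
trying that, the kernel exposes it in sysfs. A minimal sketch, assuming
/sys/devices/system/cpu/smt/ is present on your kernel; the smt_state
helper is only for illustration:

```shell
# Illustrative helper: report SMT state from a sysfs "active" file.
# Prints "on", "off", or "unknown" (file missing or unexpected content).
smt_state() {
    if [ -r "$1" ]; then
        case "$(cat "$1")" in
            1) echo on ;;
            0) echo off ;;
            *) echo unknown ;;
        esac
    else
        echo unknown
    fi
}

smt_state /sys/devices/system/cpu/smt/active

# To switch SMT off until the next reboot (needs root), uncomment:
# echo off | sudo tee /sys/devices/system/cpu/smt/control
```

With SMT off, rerunning "perf stat -M tma_dram_bound -a sleep 1" without
--metric-no-group would show whether the weak group then gets counted.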

Thanks,
Ian

> BR,
> -Sedat-
>
> > Thanks,
> > Ian
> >
> > > >       1,002148600 seconds time elapsed
> > > >
> > > > Hmm... looking at... Metric 'tma_l3_bound' ...
> > > >
> > > > Running...
> > > >
> > > > # perf stat --verbose -M tma_l3_bound -a sleep 1
> > > > Using CPUID GenuineIntel-6-2A-7
> > > > metric expr (MEM_LOAD_UOPS_RETIRED.LLC_HIT /
> > > > (MEM_LOAD_UOPS_RETIRED.LLC_HIT + 7 *
> > > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS)) *
> > > > CYCLE_ACTIVITY.STALLS_L2_PENDING / CLKS for tma_l3_bound
> > > > metric expr CPU_CLK_UNHALTED.THREAD for CLKS
> > > >
> > > > found event MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > > found event CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > > found event CPU_CLK_UNHALTED.THREAD
> > > > found event MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > >
> > > > Parsing metric events
> > > > '{MEM_LOAD_UOPS_RETIRED.LLC_HIT/metric-id=MEM_LOAD_UOPS_RETIRED.LLC_HIT/,CYCLE_ACTIVITY.STALLS_L2_PENDING/metric-id=CYCLE_ACTIVITY.STALLS_L2_PEND
> > > > ING/,CPU_CLK_UNHALTED.THREAD/metric-id=CPU_CLK_UNHALTED.THREAD/,MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/metric-id=MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/}:W'
> > > > MEM_LOAD_UOPS_RETIRED.LLC_HIT -> cpu/event=0xd1,period=0xc365,umask=0x4/
> > > > CYCLE_ACTIVITY.STALLS_L2_PENDING ->
> > > > cpu/event=0xa3,cmask=0x5,period=0x1e8483,umask=0x5/
> > > > CPU_CLK_UNHALTED.THREAD -> cpu/event=0x3c,period=0x1e8483/
> > > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS -> cpu/event=0xd4,period=0x186a7,umask=0x2/
> > > >
> > > > Control descriptor is not initialized
> > > >
> > > > MEM_LOAD_UOPS_RETIRED.LLC_HIT: 0 4007421228 0
> > > > CYCLE_ACTIVITY.STALLS_L2_PENDING: 0 4007421228 0
> > > > CPU_CLK_UNHALTED.THREAD: 0 4007421228 0
> > > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS: 0 4007421228 0
> > > >
> > > > Performance counter stats for 'system wide':
> > > >
> > > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > >                   (0,00%)
> > > >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > >                      (0,00%)
> > > >     <not counted>      CPU_CLK_UNHALTED.THREAD
> > > >               (0,00%)
> > > >     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > >                         (0,00%)
> > > >
> > > >       1,002310013 seconds time elapsed
> > > >
> > > > So those events/metric-ids resulting in "<not counted>" are all found.
> > > >
> > > > What means "Control descriptor is not initialized"?
> > > >
> > > > To summarize:
> > > >
> > > > Those two tests in "100: perf all metrics test" FAILED:
> > > >
> > > > 1. tma_dram_bound
> > > > 2. tma_l3_bound
> > > >
> > > > Best regards,
> > > > -Sedat-
> > > >
> > > >> Thanks,
> > > >> Ian
> > > >>
> > > >>> Last perf version which was OK:
> > > >>>
> > > >>> ~/bin/perf -v
> > > >>> perf version 6.0.0
> > > >>>
> > > >>> echo "linux-perf: Adjust limited access to performance monitoring and
> > > >>> observability operations"
> > > >>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > > >>> /proc/sys/kernel/perf_event_paranoid
> > > >>> 0
> > > >>>
> > > >>> ~/bin/perf test 10 86 92 93 94 95
> > > >>> 10: PMU events                                                      :
> > > >>> 10.1: PMU event table sanity                                        : Ok
> > > >>> 10.2: PMU event map aliases                                         : Ok
> > > >>> 10.3: Parsing of PMU event table metrics                            : Ok
> > > >>> 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > >>> 86: perf record tests                                               : Ok
> > > >>> 92: perf stat tests                                                 : Ok
> > > >>> 93: perf all metricgroups test                                      : Ok
> > > >>> 94: perf all metrics test                                           : Ok
> > > >>> 95: perf all PMU test                                               : Ok
> > > >>>
> > > >>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> > > >>> /proc/sys/kernel/perf_event_paranoid
> > > >>> echo "linux-perf: Reset limited access to performance monitoring and
> > > >>> observability operations"
> > > >>>
> > > >>> If you need further information, please let me know.
> > > >>>
> > > >>> Thanks.
> > > >>>
> > > >>> Regards,
> > > >>> -Sedat-
> > > >>>
> > > >>> P.S. Instructions
> > > >>>
> > > >>> [ REPRODUCER ]
> > > >>>
> > > >>> LLVM_MVER="16"
> > > >>>
> > > >>> # Debian LLVM
> > > >>> ##LLVM_TOOLCHAIN_PATH="/usr/lib/llvm-${LLVM_MVER}/bin"
> > > >>> # Selfmade LLVM
> > > >>> LLVM_TOOLCHAIN_PATH="/opt/llvm/bin"
> > > >>> if [ -d ${LLVM_TOOLCHAIN_PATH} ]; then
> > > >>>    export PATH="${LLVM_TOOLCHAIN_PATH}:${PATH}"
> > > >>> fi
> > > >>>
> > > >>> PYTHON_VER="3.11"
> > > >>> MAKE="make"
> > > >>> MAKE_OPTS="V=1 -j1 HOSTCC=clang-$LLVM_MVER HOSTLD=ld.lld
> > > >>> HOSTAR=llvm-ar CC=clang-$LLVM_MVER LD=ld.lld AR=llvm-ar
> > > >>> STRIP=llvm-strip"
> > > >>>
> > > >>> echo "LLVM MVER ........ $LLVM_MVER"
> > > >>> echo "Path settings .... $PATH"
> > > >>> echo "Python version ... $PYTHON_VER"
> > > >>> echo "make line ........ $MAKE $MAKE_OPTS"
> > > >>>
> > > >>> LANG=C LC_ALL=C make -C tools/perf clean 2>&1 | tee ../make-log_perf-clean.txt
> > > >>>
> > > >>> LANG=C LC_ALL=C $MAKE $MAKE_OPTS -C tools/perf
> > > >>> PYTHON=python${PYTHON_VER} install-bin 2>&1 | tee
> > > >>> ../make-log_perf-install_bin_python${PYTHON_VER}_llvm${LLVM_MVER}.txt
> > > >>>
> > > >>>
> > > >>> [ TESTS ]
> > > >>>
> > > >>> [ TESTS - START ]
> > > >>>
> > > >>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > > >>> /proc/sys/kernel/perf_event_paranoid
> > > >>>
> > > >>> [ TESTS - DEBIAN ]
> > > >>>
> > > >>> /usr/bin/perf -v
> > > >>> perf version 6.1.7
> > > >>>
> > > >>> /usr/bin/perf test 10 92 98 99 100 101
> > > >>>
> > > >>>  10: PMU events                                                      :
> > > >>>  10.1: PMU event table sanity                                        : Ok
> > > >>>  10.2: PMU event map aliases                                         : Ok
> > > >>>  10.3: Parsing of PMU event table metrics                            : Ok
> > > >>>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > >>>  92: perf record tests                                               : Ok
> > > >>>  98: perf stat tests                                                 : Ok
> > > >>>  99: perf all metricgroups test                                      : Ok
> > > >>> 100: perf all metrics test                                           : FAILED!
> > > >>> 101: perf all PMU test                                               : Ok
> > > >>>
> > > >>> [ TESTS - DILEKS ]
> > > >>>
> > > >>> ~/bin/perf -v
> > > >>> perf version 6.2.0-rc5
> > > >>>
> > > >>> ~/bin/perf test 7 87 93 94 95 96
> > > >>>
> > > >>>   7: PMU events                                                      :
> > > >>>   7.1: PMU event table sanity                                        : Ok
> > > >>>   7.2: PMU event map aliases                                         : Ok
> > > >>>   7.3: Parsing of PMU event table metrics                            : Ok
> > > >>>   7.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > >>>  87: perf record tests                                               : Ok
> > > >>>  93: perf stat tests                                                 : Ok
> > > >>>  94: perf all metricgroups test                                      : Ok
> > > >>>  95: perf all metrics test                                           : FAILED!
> > > >>>  96: perf all PMU test                                               : Ok
> > > >>>
> > > >>> [ TESTS - FAILED ]
> > > >>>
> > > >>> /usr/bin/perf test --verbose 100 2>&1 | tee
> > > >>> perf-test-verbose-100-perf-all-metrics-test_debian-perf-6-1-7.txt
> > > >>>
> > > >>> ~/bin/perf test --verbose 95 2>&1 | tee
> > > >>> perf-test-verbose-95-perf-all-metrics-test_dileks-perf-6-2-rc5.txt
> > > >>>
> > > >>> [ TESTS - STOP ]
> > > >>>
> > > >>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> > > >>> /proc/sys/kernel/perf_event_paranoid
> > > >>>
> > > >>> - EOT -

* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-01-31  3:55           ` Ian Rogers
@ 2023-01-31  6:14             ` Sedat Dilek
  2023-01-31  6:20               ` Sedat Dilek
  2023-02-01 15:27             ` Liang, Kan
  1 sibling, 1 reply; 13+ messages in thread
From: Sedat Dilek @ 2023-01-31  6:14 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Liang, Kan, Xing, Zhengjun, Arnaldo Carvalho de Melo,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-perf-users, linux-kernel,
	Nick Desaulniers, Nathan Chancellor, llvm, Ben Hutchings,
	James Clark, Stephane Eranian

On Tue, Jan 31, 2023 at 4:55 AM Ian Rogers <irogers@google.com> wrote:
>
> On Mon, Jan 30, 2023 at 7:45 PM Sedat Dilek <sedat.dilek@gmail.com> wrote:
> >
> > On Tue, Jan 31, 2023 at 1:20 AM Ian Rogers <irogers@google.com> wrote:
> > >
> > > On Mon, Jan 30, 2023 at 2:04 AM James Clark <james.clark@arm.com> wrote:
> > > >
> > > >
> > > >
> > > > On 30/01/2023 02:24, Sedat Dilek wrote:
> > > > > ?
> > > > >
> > > > > On Mon, Jan 30, 2023 at 12:21 AM Ian Rogers <irogers@google.com> wrote:
> > > > >>
> > > > >> On Sun, Jan 29, 2023 at 1:59 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
> > > > >>>
> > > > >>> [ CC LLVM linux folks + Ben from Debian kernel team ]
> > > > >>>
> > > > >>> Hi,
> > > > >>>
> > > > >>> I am playing with LLVM version 16.0.0-rc1 which was released yesterday and PERF.
> > > > >>>
> > > > >>> After building my selfmade LLVM toolchain, I built perf and run some
> > > > >>> perf tests here on my Intel SandyBridge CPU (details see below).
> > > > >>>
> > > > >>> perf all metrics test: FAILED!
> > > > >>>
> > > > >>> ...with both Debian's perf version 6.1.7 and my selfmade version 6.2-rc5.
> > > > >>>
> > > > >>> Just noticed:
> > > > >>>
> > > > >>> Couldn't bump rlimit(MEMLOCK), failures may take place when creating
> > > > >>> BPF maps, etc
> > > > >>>
> > > > >>> Run the below tests with `sudo` - made this go away - still FAILED.
> > > > >>>
> > > > >>> But maybe I am missing to activate some sysfs/debug or whatever other stuff?
> > > > >>
> > > > >> Hi Sedat,
> > > > >>
> > > > >> things have been improving wrt metrics and so this failure may have
> > > > >> just been because of the addition of a previously missing metric. The
> > > > >> rlimit thing shouldn't affect things but maybe file descriptors?
> > > > >> Looking at the test output the issue is:
> > > > >>
> > > > >> ```
> > > > >> Metric 'tma_dram_bound' not printed in:
> > > > >> # Running 'internals/synthesize' benchmark:
> > > > >> Computing performance of single threaded perf event synthesis by
> > > > >> synthesizing events on the perf process itself:
> > > > >>   Average synthesis took: 207.680 usec (+- 0.176 usec)
> > > > >>   Average num. events: 30.000 (+- 0.000)
> > > > >>   Average time per event 6.923 usec
> > > > >>   Average data synthesis took: 217.833 usec (+- 0.202 usec)
> > > > >>   Average num. events: 161.000 (+- 0.000)
> > > > >>   Average time per event 1.353 usec
> > > > >>
> > > > >>  Performance counter stats for 'perf bench internals synthesize':
> > > > >>
> > > > >>      <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > > >>                          (0,00%)
> > > > >>      <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > > >>                          (0,00%)
> > > > >>      <not counted>      CPU_CLK_UNHALTED.THREAD
> > > > >>                          (0,00%)
> > > > >>      <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > > >>                             (0,00%)
> > > > >> ```
> > > > >>
> > > > >> So the test was checking to see whether the tma_dram_bound metric
> > > > >> could be computed on your Sandybridge and it failed. The event counts
> > > > >> below show that every event came back "<not counted>" which is usually
> > > > >> indicative of a permissions problem - it is also not surprising given
> > > > >> this that the metric wasn't computed. You could try repeating the
> > > > >> command the test is trying with something like "perf stat -M
> > > > >> tma_dram_bound -a sleep 1", but running as root should have resolved
> > > > >> that issue. Does that give you enough to keep exploring?
> > > > >>
> > > > >
> > > > > Hi Ian,
> > > > >
> > > > > Thanks for your feedback!
> > > > >
> > > > > I booted into my Debian kernel - just to see what happens.
> > > > >
> > > > > # cat /proc/version
> > > > > Linux version 6.1.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-12
> > > > > (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1
> > > > > SMP PREEMPT_DYNAMIC Debian 6.1.7-1 (2023-01-18)
> > > > >
> > > > > All things run as root...
> > > > >
> > > > > # echo 0 | tee /proc/sys/kernel/kptr_restrict
> > > > > /proc/sys/kernel/perf_event_paranoid
> > > > > 0
> > > > >
> > > > > # /usr/bin/perf test 10 92 98 99 100 101
> > > > > 10: PMU events                                                      :
> > > > > 10.1: PMU event table sanity                                        : Ok
> > > > > 10.2: PMU event map aliases                                         : Ok
> > > > > 10.3: Parsing of PMU event table metrics                            : Ok
> > > > > 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > > > 92: perf record tests                                               : Ok
> > > > > 98: perf stat tests                                                 : Ok
> > > > > 99: perf all metricgroups test                                      : Ok
> > > > > 100: perf all metrics test                                           : FAILED!
> > > > > 101: perf all PMU test                                               : Ok
> > > > >
> > > > > # perf stat -M tma_dram_bound -a sleep 1
> > > > >
> > > > > Performance counter stats for 'system wide':
> > > > >
> > > > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > > >                   (0,00%)
> > > > >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > > >                      (0,00%)
> > > > >     <not counted>      CPU_CLK_UNHALTED.THREAD
> > > > >               (0,00%)
> > > > >     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > > >                         (0,00%)
> > > > >
> > > >
> > > > Hi Sedat,
> > > >
> > > > I also had this failure and did a git bisect, but it led me to the
> > > > conclusion that it is a stale build issue rather than a regression.
> > > >
> > > > There was a recent commit that renamed/removed some json PMU files which
> > > > the build system can't cope with. I think the tests end up iterating
> > > > over a different set of event names than were generated by the build system.
> > > >
> > > > If you do a clean build the issue should go away. I don't know if there
> > > > is anything more we can do to stop this from happening.
> > > >
> > > > James
> > >
> > > So I think this is a kernel bug triggering a perf tool bug. The kernel
> > > bug can be worked around in the perf tool. I only had an Ivybridge to
> > > test with (hence slightly different events) but what I see is both
> > > tma_dram_bound and tma_l3_bound using the same 4 events. I could work
> > > around the "<not counted>" by adding the --metric-no-group flag:
> > >
> > > ```
> > > $ perf stat -M tma_l3_bound --metric-no-group -a sleep 1
> > >
> > > Performance counter stats for 'system wide':
> > >
> > >           400,404      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      4.3 %
> > > tma_l3_bound             (74.99%)
> > >       128,937,891      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > >                         (87.46%)
> > >           167,459      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> > >                         (74.99%)
> > >       759,574,967      CPU_CLK_UNHALTED.THREAD
> > >                         (87.47%)
> > >
> > >       1.001526438 seconds time elapsed
> > >
> > > $ perf stat -M tma_dram_bound -a --metric-no-group sleep 1
> > >
> > > Performance counter stats for 'system wide':
> > >
> > >           259,954      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #     15.2 %
> > > tma_dram_bound           (74.99%)
> > >       118,807,043      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > >                         (87.46%)
> > >           111,699      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> > >                         (74.95%)
> > >       587,571,060      CPU_CLK_UNHALTED.THREAD
> > >                         (87.45%)
> > >
> > >       1.001518093 seconds time elapsed
> > > ```
> > >
> > > The issue is that perf metrics use weak groups of events. A weak group
> > > is the same as a group of events initially. We want to use groups of
> > > events with metrics so that all the counters are scheduled in and out
> > > at the same time, and not multiplexed independently. Imagine measuring
> > > IPC but the counts for instructions and cycles are measured at
> > > different periods, the resultant IPC value would be unlikely to be
> > > accurate. If perf_event_open fails then the perf tool retries the
> > > events without the group. If I try just 3 of the events in a weak
> > > group then the failure can be seen:
> > >
> > > ```
> > > $ perf stat -e "{MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING}:W"
> > > -a sleep 1
> > >
> > > Performance counter stats for 'system wide':
> > >
> > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > >                         (0.00%)
> > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> > >                         (0.00%)
> > >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > >                         (0.00%)
> > >
> > >       1.001458485 seconds time elapsed
> > > ```
> > >
> > > The kernel should have failed the perf_event_open on opening the third
> > > event and then measured without the group, which it can do with
> > > multiplexing as in the following:
> > >
> > > ```
> > > $ perf stat -e "MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING"
> > > -a sleep 1
> > >
> > > Performance counter stats for 'system wide':
> > >
> > >         1,239,397      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > >                         (79.06%)
> > >           174,826      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> > >                         (64.60%)
> > >       124,026,024      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > >                         (81.16%)
> > >
> > >       1.001483434 seconds time elapsed
> > > ```
> > >
> > > When the --metric-no-group flag is given to perf then it doesn't
> > > produce the initial weak group, which works around the bug of the
> > > kernel not failing on the 3rd perf_event_open. I've added Kan and
> > > Zhengjun to the e-mail as they work on the Intel kernel PMU code.
> > >
> > > There's a question about what we should do in the perf test about
> > > this? I have a few solutions:
> > >
> > > 1) try metric tests again with the --metric-no-group flag and don't
> > > fail the test if this succeeds. This allows kernel bugs to hide, so
> > > I'm not a huge fan.
> > >
> > > 2) add a new metric flag/constraint to say not to group, this way the
> > > metric will automatically apply the "--metric-no-group" flag. It is a
> > > bit of work to wire this up but this kind of failure is common enough
> > > in PMUs that it is probably worthwhile. We also need to add the flag
> > > to metrics and I'm not sure how to get a good list of the metrics that
> > > currently fail and require it. This is okay but error prone.
> > >
> > > 3) fix the kernel bug and let the perf test fail until an adequate
> > > kernel is installed. Probably the best option.
> > >
> >
> > Hi Ian,
> >
> > I can confirm:
> >
> > $ echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > /proc/sys/kernel/perf_event_paranoid
> > 0
> >
> > $ ~/bin/perf stat -M tma_l3_bound --metric-no-group -a sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> >         2.058.892      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,5 %
> > tma_l3_bound             (99,30%)
> >       173.254.697      CYCLE_ACTIVITY.STALLS_L2_PENDING
> >                         (99,10%)
> >     2.396.130.501      CPU_CLK_UNHALTED.THREAD
> >                         (99,60%)
> >         1.110.486      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> >                            (99,53%)
> >
> >       1,001989022 seconds time elapsed
> >
> > $ ~/bin/perf stat -M tma_dram_bound --metric-no-group -a sleep 1
> >
> > Performance counter stats for 'system wide':
> >
> >         1.729.208      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,2 %
> > tma_dram_bound           (99,50%)
> >        50.346.734      CYCLE_ACTIVITY.STALLS_L2_PENDING
> >                         (99,50%)
> >     2.354.963.862      CPU_CLK_UNHALTED.THREAD
> >                         (99,80%)
> >           306.500      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> >                            (99,61%)
> >
> >       1,001981392 seconds time elapsed
> >
> > Thanks!
>
> Thanks, apparently it is an issue with SandyBridge/IvyBridge that some
> counters on one hyperthread will limit what can be on the other. I
> believe that's the comment related to EXCL access here:
> https://github.com/torvalds/linux/blob/master/arch/x86/events/intel/core.c#L124
> So you may have more success with the metric if you disable
> hyperthreading, but I imagine that's not a popular option.
>

Hi Ian,

LOL.

Yesterday, I played a bit with perf... and did some research.

I was rather thinking that the "formula" in the metrics calculation is
somehow BROKEN (not executed properly).
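
As a sanity check on the formula itself, the tma_l3_bound expression
printed by "perf stat --verbose" can be evaluated by hand against the
counts from the SandyBridge --metric-no-group run quoted above. This is
only a sketch: the tma_l3_bound shell function is a made-up helper and
not part of perf, and the counts are perf's scaled (multiplexed) values.

```shell
#!/bin/sh
# Hypothetical helper (not part of perf): evaluate the tma_l3_bound
# expression from the verbose output and print it as a percentage.
# Arguments: LLC_HIT LLC_MISS STALLS_L2_PENDING CLKS
tma_l3_bound() {
    LC_ALL=C awk -v hit="$1" -v miss="$2" -v stalls="$3" -v clks="$4" \
        'BEGIN { printf "%.1f\n", 100 * (hit / (hit + 7 * miss)) * stalls / clks }'
}

# Counts from the SandyBridge --metric-no-group run in this thread.
tma_l3_bound 2058892 1110486 173254697 2396130501
```

This prints 1.5, which agrees with the "1,5 %" perf reported, so the
expression appears to be evaluated correctly once the events are
actually counted.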

My first fix for the issue was to recompile perf after throwing
out the blocks containing the two tests from the snb-metrics JSON file.

AFAICS, on stackoverflow or wherever, I found something about haswell vs.
sandy bridge and the topic "SMT / HT (hyper-threading)".
That's why my initial LOL...

I am not in front of my Linux machine, I guess it was...

Link: https://stackoverflow.com/questions/33677367/understanding-cycle-activity-haswell-performance-monitoring-events

I hope I do not have to disable HT in my BIOS before running these tests.
Can I do this at runtime (proc / sysfs / etc.)?
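
On kernels that provide it (introduced around Linux 4.19), SMT can be
toggled at runtime through /sys/devices/system/cpu/smt/control, without
going through the BIOS; booting with "nosmt" is the boot-time
equivalent. A minimal sketch follows. The smt_state and smt_set function
names and the SMT_CONTROL override variable are invented here for
illustration, and writing to the real file needs root.

```shell
#!/bin/sh
# Kernel SMT control file; SMT_CONTROL can be overridden for a dry run.
SMT_CONTROL="${SMT_CONTROL:-/sys/devices/system/cpu/smt/control}"

# Print the current SMT state ("on", "off", "forceoff", ...),
# or "unknown" if the control file is not readable.
smt_state() {
    if [ -r "$SMT_CONTROL" ]; then
        cat "$SMT_CONTROL"
    else
        echo "unknown"
    fi
}

# Hypothetical helper: request SMT "on" or "off".
# Needs root when SMT_CONTROL points at the real sysfs file.
smt_set() {
    case "$1" in
        on|off) printf '%s\n' "$1" > "$SMT_CONTROL" ;;
        *)      echo "usage: smt_set on|off" >&2; return 1 ;;
    esac
}
```

Typical use would be "smt_state" to check, then "smt_set off" as root
before the perf run and "smt_set on" afterwards.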

Furthermore, I passed * mitigations=off * as a kernel boot parameter for
testing purposes.
( I had wanted to test it for a long time, independent of this issue. )

Of course, I played with perf -e
MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING
... (comma-separated, no "...", no {...}), passing -C 1 (number of
CPUs), etc.
Had to check my bash-history...
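
For reference, the two invocations discussed in this thread differ only
in the event specification: braces with a :W suffix request a weak
group, while a plain comma-separated list lets each event be scheduled
and multiplexed on its own. The sketch below only builds and prints the
two command lines. event_spec is a hypothetical helper, and the event
names are the IvyBridge ones from Ian's example. Note that -C takes a
list of CPU numbers, so "-C 1" restricts counting to CPU 1.

```shell
#!/bin/sh
# Event names taken from the IvyBridge example earlier in the thread.
EVENTS="MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING"

# Hypothetical helper: print the -e argument for a weak group or a plain list.
event_spec() {
    case "$1" in
        weak)  printf '{%s}:W\n' "$EVENTS" ;;
        plain) printf '%s\n' "$EVENTS" ;;
        *)     echo "usage: event_spec weak|plain" >&2; return 1 ;;
    esac
}

# Print (do not run) both variants so they can be compared side by side.
for mode in weak plain; do
    echo "perf stat -e \"$(event_spec "$mode")\" -a sleep 1"
done
```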

Again, as no one has explained it:

Control descriptor is not initialized

I have seen this in some tests (all run with -v or -vv for more
verbosity), but not in others.

Worth reading: Brendan Gregg's Homepage!

https://www.brendangregg.com/linuxperf.html

https://www.brendangregg.com/perf.html

Thanks Ian for digging into this!

If you have any news or anything for testing, please let me know.

BR,
-Sedat-

>
> > BR,
> > -Sedat-
> >
> > > Thanks,
> > > Ian
> > >
> > > > >       1,002148600 seconds time elapsed
> > > > >
> > > > > Hmm... looking at... Metric 'tma_l3_bound' ...
> > > > >
> > > > > Running...
> > > > >
> > > > > # perf stat --verbose -M tma_l3_bound -a sleep 1
> > > > > Using CPUID GenuineIntel-6-2A-7
> > > > > metric expr (MEM_LOAD_UOPS_RETIRED.LLC_HIT /
> > > > > (MEM_LOAD_UOPS_RETIRED.LLC_HIT + 7 *
> > > > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS)) *
> > > > > CYCLE_ACTIVITY.STALLS_L2_PENDING / CLKS for tma_l3_bound
> > > > > metric expr CPU_CLK_UNHALTED.THREAD for CLKS
> > > > >
> > > > > found event MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > > > found event CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > > > found event CPU_CLK_UNHALTED.THREAD
> > > > > found event MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > > >
> > > > > Parsing metric events
> > > > > '{MEM_LOAD_UOPS_RETIRED.LLC_HIT/metric-id=MEM_LOAD_UOPS_RETIRED.LLC_HIT/,CYCLE_ACTIVITY.STALLS_L2_PENDING/metric-id=CYCLE_ACTIVITY.STALLS_L2_PENDING/,CPU_CLK_UNHALTED.THREAD/metric-id=CPU_CLK_UNHALTED.THREAD/,MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/metric-id=MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/}:W'
> > > > > MEM_LOAD_UOPS_RETIRED.LLC_HIT -> cpu/event=0xd1,period=0xc365,umask=0x4/
> > > > > CYCLE_ACTIVITY.STALLS_L2_PENDING ->
> > > > > cpu/event=0xa3,cmask=0x5,period=0x1e8483,umask=0x5/
> > > > > CPU_CLK_UNHALTED.THREAD -> cpu/event=0x3c,period=0x1e8483/
> > > > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS -> cpu/event=0xd4,period=0x186a7,umask=0x2/
> > > > >
> > > > > Control descriptor is not initialized
> > > > >
> > > > > MEM_LOAD_UOPS_RETIRED.LLC_HIT: 0 4007421228 0
> > > > > CYCLE_ACTIVITY.STALLS_L2_PENDING: 0 4007421228 0
> > > > > CPU_CLK_UNHALTED.THREAD: 0 4007421228 0
> > > > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS: 0 4007421228 0
> > > > >
> > > > > Performance counter stats for 'system wide':
> > > > >
> > > > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > > >                   (0,00%)
> > > > >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > > >                      (0,00%)
> > > > >     <not counted>      CPU_CLK_UNHALTED.THREAD
> > > > >               (0,00%)
> > > > >     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > > >                         (0,00%)
> > > > >
> > > > >       1,002310013 seconds time elapsed
> > > > >
> > > > > So those events/metric-ids resulting in "<not counted>" are all found.
> > > > >
> > > > > What means "Control descriptor is not initialized"?
> > > > >
> > > > > To summarize:
> > > > >
> > > > > Those two tests in "100: perf all metrics test" FAILED:
> > > > >
> > > > > 1. tma_dram_bound
> > > > > 2. tma_l3_bound
> > > > >
> > > > > Best regards,
> > > > > -Sedat-
> > > > >
> > > > >> Thanks,
> > > > >> Ian
> > > > >>
> > > > >>> Last perf version which was OK:
> > > > >>>
> > > > >>> ~/bin/perf -v
> > > > >>> perf version 6.0.0
> > > > >>>
> > > > >>> echo "linux-perf: Adjust limited access to performance monitoring and
> > > > >>> observability operations"
> > > > >>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > > > >>> /proc/sys/kernel/perf_event_paranoid
> > > > >>> 0
> > > > >>>
> > > > >>> ~/bin/perf test 10 86 92 93 94 95
> > > > >>> 10: PMU events                                                      :
> > > > >>> 10.1: PMU event table sanity                                        : Ok
> > > > >>> 10.2: PMU event map aliases                                         : Ok
> > > > >>> 10.3: Parsing of PMU event table metrics                            : Ok
> > > > >>> 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > > >>> 86: perf record tests                                               : Ok
> > > > >>> 92: perf stat tests                                                 : Ok
> > > > >>> 93: perf all metricgroups test                                      : Ok
> > > > >>> 94: perf all metrics test                                           : Ok
> > > > >>> 95: perf all PMU test                                               : Ok
> > > > >>>
> > > > >>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> > > > >>> /proc/sys/kernel/perf_event_paranoid
> > > > >>> echo "linux-perf: Reset limited access to performance monitoring and
> > > > >>> observability operations"
> > > > >>>
> > > > >>> If you need further information, please let me know.
> > > > >>>
> > > > >>> Thanks.
> > > > >>>
> > > > >>> Regards,
> > > > >>> -Sedat-
> > > > >>>
> > > > >>> P.S. Instructions
> > > > >>>
> > > > >>> [ REPRODUCER ]
> > > > >>>
> > > > >>> LLVM_MVER="16"
> > > > >>>
> > > > >>> # Debian LLVM
> > > > >>> ##LLVM_TOOLCHAIN_PATH="/usr/lib/llvm-${LLVM_MVER}/bin"
> > > > >>> # Selfmade LLVM
> > > > >>> LLVM_TOOLCHAIN_PATH="/opt/llvm/bin"
> > > > >>> if [ -d ${LLVM_TOOLCHAIN_PATH} ]; then
> > > > >>>    export PATH="${LLVM_TOOLCHAIN_PATH}:${PATH}"
> > > > >>> fi
> > > > >>>
> > > > >>> PYTHON_VER="3.11"
> > > > >>> MAKE="make"
> > > > >>> MAKE_OPTS="V=1 -j1 HOSTCC=clang-$LLVM_MVER HOSTLD=ld.lld
> > > > >>> HOSTAR=llvm-ar CC=clang-$LLVM_MVER LD=ld.lld AR=llvm-ar
> > > > >>> STRIP=llvm-strip"
> > > > >>>
> > > > >>> echo "LLVM MVER ........ $LLVM_MVER"
> > > > >>> echo "Path settings .... $PATH"
> > > > >>> echo "Python version ... $PYTHON_VER"
> > > > >>> echo "make line ........ $MAKE $MAKE_OPTS"
> > > > >>>
> > > > >>> LANG=C LC_ALL=C make -C tools/perf clean 2>&1 | tee ../make-log_perf-clean.txt
> > > > >>>
> > > > >>> LANG=C LC_ALL=C $MAKE $MAKE_OPTS -C tools/perf
> > > > >>> PYTHON=python${PYTHON_VER} install-bin 2>&1 | tee
> > > > >>> ../make-log_perf-install_bin_python${PYTHON_VER}_llvm${LLVM_MVER}.txt
> > > > >>>
> > > > >>>
> > > > >>> [ TESTS ]
> > > > >>>
> > > > >>> [ TESTS - START ]
> > > > >>>
> > > > >>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > > > >>> /proc/sys/kernel/perf_event_paranoid
> > > > >>>
> > > > >>> [ TESTS - DEBIAN ]
> > > > >>>
> > > > >>> /usr/bin/perf -v
> > > > >>> perf version 6.1.7
> > > > >>>
> > > > >>> /usr/bin/perf test 10 92 98 99 100 101
> > > > >>>
> > > > >>>  10: PMU events                                                      :
> > > > >>>  10.1: PMU event table sanity                                        : Ok
> > > > >>>  10.2: PMU event map aliases                                         : Ok
> > > > >>>  10.3: Parsing of PMU event table metrics                            : Ok
> > > > >>>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > > >>>  92: perf record tests                                               : Ok
> > > > >>>  98: perf stat tests                                                 : Ok
> > > > >>>  99: perf all metricgroups test                                      : Ok
> > > > >>> 100: perf all metrics test                                           : FAILED!
> > > > >>> 101: perf all PMU test                                               : Ok
> > > > >>>
> > > > >>> [ TESTS - DILEKS ]
> > > > >>>
> > > > >>> ~/bin/perf -v
> > > > >>> perf version 6.2.0-rc5
> > > > >>>
> > > > >>> ~/bin/perf test 7 87 93 94 95 96
> > > > >>>
> > > > >>>   7: PMU events                                                      :
> > > > >>>   7.1: PMU event table sanity                                        : Ok
> > > > >>>   7.2: PMU event map aliases                                         : Ok
> > > > >>>   7.3: Parsing of PMU event table metrics                            : Ok
> > > > >>>   7.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > > >>>  87: perf record tests                                               : Ok
> > > > >>>  93: perf stat tests                                                 : Ok
> > > > >>>  94: perf all metricgroups test                                      : Ok
> > > > >>>  95: perf all metrics test                                           : FAILED!
> > > > >>>  96: perf all PMU test                                               : Ok
> > > > >>>
> > > > >>> [ TESTS - FAILED ]
> > > > >>>
> > > > >>> /usr/bin/perf test --verbose 100 2>&1 | tee
> > > > >>> perf-test-verbose-100-perf-all-metrics-test_debian-perf-6-1-7.txt
> > > > >>>
> > > > >>> ~/bin/perf test --verbose 95 2>&1 | tee
> > > > >>> perf-test-verbose-95-perf-all-metrics-test_dileks-perf-6-2-rc5.txt
> > > > >>>
> > > > >>> [ TESTS - STOP ]
> > > > >>>
> > > > >>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> > > > >>> /proc/sys/kernel/perf_event_paranoid
> > > > >>>
> > > > >>> - EOT -

* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-01-31  6:14             ` Sedat Dilek
@ 2023-01-31  6:20               ` Sedat Dilek
  0 siblings, 0 replies; 13+ messages in thread
From: Sedat Dilek @ 2023-01-31  6:20 UTC (permalink / raw)
  To: Ian Rogers
  Cc: Liang, Kan, Xing, Zhengjun, Arnaldo Carvalho de Melo,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-perf-users, linux-kernel,
	Nick Desaulniers, Nathan Chancellor, llvm, Ben Hutchings,
	James Clark, Stephane Eranian

On Tue, Jan 31, 2023 at 7:14 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
>
> On Tue, Jan 31, 2023 at 4:55 AM Ian Rogers <irogers@google.com> wrote:
> >
> > On Mon, Jan 30, 2023 at 7:45 PM Sedat Dilek <sedat.dilek@gmail.com> wrote:
> > >
> > > On Tue, Jan 31, 2023 at 1:20 AM Ian Rogers <irogers@google.com> wrote:
> > > >
> > > > On Mon, Jan 30, 2023 at 2:04 AM James Clark <james.clark@arm.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > On 30/01/2023 02:24, Sedat Dilek wrote:
> > > > > > ?
> > > > > >
> > > > > > On Mon, Jan 30, 2023 at 12:21 AM Ian Rogers <irogers@google.com> wrote:
> > > > > >>
> > > > > >> On Sun, Jan 29, 2023 at 1:59 AM Sedat Dilek <sedat.dilek@gmail.com> wrote:
> > > > > >>>
> > > > > >>> [ CC LLVM linux folks + Ben from Debian kernel team ]
> > > > > >>>
> > > > > >>> Hi,
> > > > > >>>
> > > > > >>> I am playing with LLVM version 16.0.0-rc1 which was released yesterday and PERF.
> > > > > >>>
> > > > > >>> After building my selfmade LLVM toolchain, I built perf and run some
> > > > > >>> perf tests here on my Intel SandyBridge CPU (details see below).
> > > > > >>>
> > > > > >>> perf all metrics test: FAILED!
> > > > > >>>
> > > > > >>> ...with both Debian's perf version 6.1.7 and my selfmade version 6.2-rc5.
> > > > > >>>
> > > > > >>> Just noticed:
> > > > > >>>
> > > > > >>> Couldn't bump rlimit(MEMLOCK), failures may take place when creating
> > > > > >>> BPF maps, etc
> > > > > >>>
> > > > > >>> Run the below tests with `sudo` - made this go away - still FAILED.
> > > > > >>>
> > > > > >>> But maybe I am missing to activate some sysfs/debug or whatever other stuff?
> > > > > >>
> > > > > >> Hi Sedat,
> > > > > >>
> > > > > >> things have been improving wrt metrics and so this failure may have
> > > > > >> just been because of the addition of a previously missing metric. The
> > > > > >> rlimit thing shouldn't affect things but maybe file descriptors?
> > > > > >> Looking at the test output the issue is:
> > > > > >>
> > > > > >> ```
> > > > > >> Metric 'tma_dram_bound' not printed in:
> > > > > >> # Running 'internals/synthesize' benchmark:
> > > > > >> Computing performance of single threaded perf event synthesis by
> > > > > >> synthesizing events on the perf process itself:
> > > > > >>   Average synthesis took: 207.680 usec (+- 0.176 usec)
> > > > > >>   Average num. events: 30.000 (+- 0.000)
> > > > > >>   Average time per event 6.923 usec
> > > > > >>   Average data synthesis took: 217.833 usec (+- 0.202 usec)
> > > > > >>   Average num. events: 161.000 (+- 0.000)
> > > > > >>   Average time per event 1.353 usec
> > > > > >>
> > > > > >>  Performance counter stats for 'perf bench internals synthesize':
> > > > > >>
> > > > > >>      <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > > > >>                          (0,00%)
> > > > > >>      <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > > > >>                          (0,00%)
> > > > > >>      <not counted>      CPU_CLK_UNHALTED.THREAD
> > > > > >>                          (0,00%)
> > > > > >>      <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > > > >>                             (0,00%)
> > > > > >> ```
> > > > > >>
> > > > > >> So the test was checking to see whether the tma_dram_bound metric
> > > > > >> could be computed on your Sandybridge and it failed. The event counts
> > > > > >> below show that every event came back "<not counted>" which is usually
> > > > > >> indicative of a permissions problem - it is also not surprising given
> > > > > >> this that the metric wasn't computed. You could try repeating the
> > > > > >> command the test is trying with something like "perf stat -M
> > > > > >> tma_dram_bound -a sleep 1", but running as root should have resolved
> > > > > >> that issue. Does that give you enough to keep exploring?
> > > > > >>
> > > > > >
> > > > > > Hi Ian,
> > > > > >
> > > > > > Thanks for your feedback!
> > > > > >
> > > > > > I booted into my Debian kernel - just to see what happens.
> > > > > >
> > > > > > # cat /proc/version
> > > > > > Linux version 6.1.0-2-amd64 (debian-kernel@lists.debian.org) (gcc-12
> > > > > > (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1
> > > > > > SMP PREEMPT_DYNAMIC Debian 6.1.7-1 (2023-01-18)
> > > > > >
> > > > > > All things run as root...
> > > > > >
> > > > > > # echo 0 | tee /proc/sys/kernel/kptr_restrict
> > > > > > /proc/sys/kernel/perf_event_paranoid
> > > > > > 0
> > > > > >
> > > > > > # /usr/bin/perf test 10 92 98 99 100 101
> > > > > > 10: PMU events                                                      :
> > > > > > 10.1: PMU event table sanity                                        : Ok
> > > > > > 10.2: PMU event map aliases                                         : Ok
> > > > > > 10.3: Parsing of PMU event table metrics                            : Ok
> > > > > > 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > > > > 92: perf record tests                                               : Ok
> > > > > > 98: perf stat tests                                                 : Ok
> > > > > > 99: perf all metricgroups test                                      : Ok
> > > > > > 100: perf all metrics test                                           : FAILED!
> > > > > > 101: perf all PMU test                                               : Ok
> > > > > >
> > > > > > # perf stat -M tma_dram_bound -a sleep 1
> > > > > >
> > > > > > Performance counter stats for 'system wide':
> > > > > >
> > > > > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > > > >                   (0,00%)
> > > > > >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > > > >                      (0,00%)
> > > > > >     <not counted>      CPU_CLK_UNHALTED.THREAD
> > > > > >               (0,00%)
> > > > > >     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > > > >                         (0,00%)
> > > > > >
> > > > >
> > > > > Hi Sedat,
> > > > >
> > > > > I also had this failure and did a git bisect, but it led me to the
> > > > > conclusion that it is a stale build issue rather than a regression.
> > > > >
> > > > > There was a recent commit that renamed/removed some json PMU files which
> > > > > the build system can't cope with. I think the tests end up iterating
> > > > > over a different set of event names than were generated by the build system.
> > > > >
> > > > > If you do a clean build the issue should go away. I don't know if there
> > > > > is anything more we can do to stop this from happening.
> > > > >
> > > > > James
> > > >
> > > > So I think this is a kernel bug triggering a perf tool bug. The kernel
> > > > bug can be worked around in the perf tool. I only had an Ivybridge to
> > > > test with (hence slightly different events) but what I see is both
> > > > tma_dram_bound and tma_l3_bound using the same 4 events. I could work
> > > > around the "<not counted>" by adding the --metric-no-group flag:
> > > >
> > > > ```
> > > > $ perf stat -M tma_l3_bound --metric-no-group -a sleep 1
> > > >
> > > > Performance counter stats for 'system wide':
> > > >
> > > >           400,404      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      4.3 %
> > > > tma_l3_bound             (74.99%)
> > > >       128,937,891      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > >                         (87.46%)
> > > >           167,459      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> > > >                         (74.99%)
> > > >       759,574,967      CPU_CLK_UNHALTED.THREAD
> > > >                         (87.47%)
> > > >
> > > >       1.001526438 seconds time elapsed
> > > >
> > > > $ perf stat -M tma_dram_bound -a --metric-no-group sleep 1
> > > >
> > > > Performance counter stats for 'system wide':
> > > >
> > > >           259,954      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #     15.2 %
> > > > tma_dram_bound           (74.99%)
> > > >       118,807,043      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > >                         (87.46%)
> > > >           111,699      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> > > >                         (74.95%)
> > > >       587,571,060      CPU_CLK_UNHALTED.THREAD
> > > >                         (87.45%)
> > > >
> > > >       1.001518093 seconds time elapsed
> > > > ```
> > > >
> > > > The issue is that perf metrics use weak groups of events. A weak group
> > > > is the same as a group of events initially. We want to use groups of
> > > > events with metrics so that all the counters are scheduled in and out
> > > > at the same time, and not multiplexed independently. Imagine measuring
> > > > IPC but the counts for instructions and cycles are measured at
> > > > different periods, the resultant IPC value would be unlikely to be
> > > > accurate. If perf_event_open fails then the perf tool retries the
> > > > events without the group. If I try just 3 of the events in a weak
> > > > group then the failure can be seen:
> > > >
> > > > ```
> > > > $ perf stat -e "{MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING}:W"
> > > > -a sleep 1
> > > >
> > > > Performance counter stats for 'system wide':
> > > >
> > > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > >                         (0.00%)
> > > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> > > >                         (0.00%)
> > > >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > >                         (0.00%)
> > > >
> > > >       1.001458485 seconds time elapsed
> > > > ```
> > > >
> > > > The kernel should have failed the perf_event_open on opening the third
> > > > event and then measured without the group, which it can do with
> > > > multiplexing as in the following:
> > > >
> > > > ```
> > > > $ perf stat -e "MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING"
> > > > -a sleep 1
> > > >
> > > > Performance counter stats for 'system wide':
> > > >
> > > >         1,239,397      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > >                         (79.06%)
> > > >           174,826      MEM_LOAD_UOPS_RETIRED.LLC_MISS
> > > >                         (64.60%)
> > > >       124,026,024      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > >                         (81.16%)
> > > >
> > > >       1.001483434 seconds time elapsed
> > > > ```
> > > >
> > > > When the --metric-no-group flag is given to perf then it doesn't
> > > > produce the initial weak group, which works around the bug of the
> > > > kernel not failing on the 3rd perf_event_open. I've added Kan and
> > > > Zhengjun to the e-mail as they work on the Intel kernel PMU code.
> > > >
> > > > There's a question about what we should do in the perf test about
> > > > this? I have a few solutions:
> > > >
> > > > 1) try metric tests again with the --metric-no-group flag and don't
> > > > fail the test if this succeeds. This allows kernel bugs to hide, so
> > > > I'm not a huge fan.
> > > >
> > > > 2) add a new metric flag/constraint to say not to group, this way the
> > > > metric will automatically apply the "--metric-no-group" flag. It is a
> > > > bit of work to wire this up but this kind of failure is common enough
> > > > in PMUs that it is probably worthwhile. We also need to add the flag
> > > > to metrics and I'm not sure how to get a good list of the metrics that
> > > > currently fail and require it. This is okay but error prone.
> > > >
> > > > 3) fix the kernel bug and let the perf test fail until an adequate
> > > > kernel is installed. Probably the best option.
> > > >
> > >
> > > Hi Ian,
> > >
> > > I can confirm:
> > >
> > > $ echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > > /proc/sys/kernel/perf_event_paranoid
> > > 0
> > >
> > > $ ~/bin/perf stat -M tma_l3_bound --metric-no-group -a sleep 1
> > >
> > > Performance counter stats for 'system wide':
> > >
> > >         2.058.892      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,5 %
> > > tma_l3_bound             (99,30%)
> > >       173.254.697      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > >                         (99,10%)
> > >     2.396.130.501      CPU_CLK_UNHALTED.THREAD
> > >                         (99,60%)
> > >         1.110.486      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > >                            (99,53%)
> > >
> > >       1,001989022 seconds time elapsed
> > >
> > > $ ~/bin/perf stat -M tma_dram_bound --metric-no-group -a sleep 1
> > >
> > > Performance counter stats for 'system wide':
> > >
> > >         1.729.208      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,2 %
> > > tma_dram_bound           (99,50%)
> > >        50.346.734      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > >                         (99,50%)
> > >     2.354.963.862      CPU_CLK_UNHALTED.THREAD
> > >                         (99,80%)
> > >           306.500      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > >                            (99,61%)
> > >
> > >       1,001981392 seconds time elapsed
> > >
> > > Thanks!
> >
> > Thanks, apparently it is an issue with SandyBridge/IvyBridge that some
> > counters on one hyperthread will limit what can be on the other. I
> > believe that's the comment related to EXCL access here:
> > https://github.com/torvalds/linux/blob/master/arch/x86/events/intel/core.c#L124
> > So you may have more success with the metric if you disable
> > hyperthreading, but I imagine that's not a popular option.
> >
>
> Hi Ian,
>
> LOL.
>
> Yesterday, I played a bit with perf... and did some research.
>
> I was thinking more of the "formula" in the metrics calculation is
> somehow BROKEN (not executed properly).
>
> The first thing to fix the issue was to recompile perf with throwing
> out the blocks in snb-metrics JSON file containing the two tests.
>
> AFFAICS, stackoverflow or wherever I found something about haswell vs.
> sandy bridge and topic "SMT / HT (hyper-threading)".
> That's why my initial LOL...
>
> I am not in front of my Linux machine, I guess it was...
>
> Link: https://stackoverflow.com/questions/33677367/understanding-cycle-activity-haswell-performance-monitoring-events
>

To quote from the above Link:

> There are three important differences:
>
> Some of the Haswell events are only valid when HT is disabled. All SNB events are valid even when HT is enabled.
>
> CYCLE_ACTIVITY.STALLS_L2_PENDING on HSW counts the number of load misses at L2, but on SNB, it counts the number of cycles during which there was at least one demand load miss at L2.
>
> The HSW events include all accesses, not just demand loads. In contrast, the SNB events only occur for demand loads.

-Sedat-

> Hope I have not to disable HT in my BIOS before running these tests.
> Can I do this in runtime (proc / sysfs / etc.)?
>
> Furthermore, I passed * mitigations=off * as kernel-boot-parameter for
> testing purposes.
> ( I wanted to test it for a long time - independent of this issue. )
>
> Of course, I played with perf -e
> MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING
> ... (comma-separated, no "...", no {...}), passing -C 1 (number of
> CPUs), etc.
> Had to check my bash-history...
>
> Again as noone explained:
>
> Control descriptor is not initialized
>
> In some tests I have seen this (all with passing -v or -vv for more
> verbosity) - in some not.
>
> Worth reading: Brendan Gregg's Homepage!
>
> https://www.brendangregg.com/linuxperf.html
>
> https://www.brendangregg.com/perf.html
>
> Thanks Ian for digging into this!
>
> If you have any news or anything for testing, please let me know.
>
> BR,
> -Sedat-
>
> >
> > > BR,
> > > -Sedat-
> > >
> > > > Thanks,
> > > > Ian
> > > >
> > > > > >       1,002148600 seconds time elapsed
> > > > > >
> > > > > > Hmm... looking at... Metric 'tma_l3_bound' ...
> > > > > >
> > > > > > Running...
> > > > > >
> > > > > > # perf stat --verbose -M tma_l3_bound -a sleep 1
> > > > > > Using CPUID GenuineIntel-6-2A-7
> > > > > > metric expr (MEM_LOAD_UOPS_RETIRED.LLC_HIT /
> > > > > > (MEM_LOAD_UOPS_RETIRED.LLC_HIT + 7 *
> > > > > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS)) *
> > > > > > CYCLE_ACTIVITY.STALLS_L2_PENDING / CLKS for tma_l3_bound
> > > > > > metric expr CPU_CLK_UNHALTED.THREAD for CLKS
> > > > > >
> > > > > > found event MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > > > > found event CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > > > > found event CPU_CLK_UNHALTED.THREAD
> > > > > > found event MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > > > >
> > > > > > Parsing metric events
> > > > > > '{MEM_LOAD_UOPS_RETIRED.LLC_HIT/metric-id=MEM_LOAD_UOPS_RETIRED.LLC_HIT/,CYCLE_ACTIVITY.STALLS_L2_PENDING/metric-id=CYCLE_ACTIVITY.STALLS_L2_PEND
> > > > > > ING/,CPU_CLK_UNHALTED.THREAD/metric-id=CPU_CLK_UNHALTED.THREAD/,MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/metric-id=MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/}:W'
> > > > > > MEM_LOAD_UOPS_RETIRED.LLC_HIT -> cpu/event=0xd1,period=0xc365,umask=0x4/
> > > > > > CYCLE_ACTIVITY.STALLS_L2_PENDING ->
> > > > > > cpu/event=0xa3,cmask=0x5,period=0x1e8483,umask=0x5/
> > > > > > CPU_CLK_UNHALTED.THREAD -> cpu/event=0x3c,period=0x1e8483/
> > > > > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS -> cpu/event=0xd4,period=0x186a7,umask=0x2/
> > > > > >
> > > > > > Control descriptor is not initialized
> > > > > >
> > > > > > MEM_LOAD_UOPS_RETIRED.LLC_HIT: 0 4007421228 0
> > > > > > CYCLE_ACTIVITY.STALLS_L2_PENDING: 0 4007421228 0
> > > > > > CPU_CLK_UNHALTED.THREAD: 0 4007421228 0
> > > > > > MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS: 0 4007421228 0
> > > > > >
> > > > > > Performance counter stats for 'system wide':
> > > > > >
> > > > > >     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
> > > > > >                   (0,00%)
> > > > > >     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
> > > > > >                      (0,00%)
> > > > > >     <not counted>      CPU_CLK_UNHALTED.THREAD
> > > > > >               (0,00%)
> > > > > >     <not counted>      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> > > > > >                         (0,00%)
> > > > > >
> > > > > >       1,002310013 seconds time elapsed
> > > > > >
> > > > > > So those events/metric-ids resulting in "<not counted>" are all found.
> > > > > >
> > > > > > What means "Control descriptor is not initialized"?
> > > > > >
> > > > > > To summarize:
> > > > > >
> > > > > > Those two tests in "100: perf all metrics test" FAILED:
> > > > > >
> > > > > > 1. tma_dram_bound
> > > > > > 2. tma_l3_bound
> > > > > >
> > > > > > Best regards,
> > > > > > -Sedat-
> > > > > >
> > > > > >> Thanks,
> > > > > >> Ian
> > > > > >>
> > > > > >>> Last perf version which was OK:
> > > > > >>>
> > > > > >>> ~/bin/perf -v
> > > > > >>> perf version 6.0.0
> > > > > >>>
> > > > > >>> echo "linux-perf: Adjust limited access to performance monitoring and
> > > > > >>> observability operations"
> > > > > >>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > > > > >>> /proc/sys/kernel/perf_event_paranoid
> > > > > >>> 0
> > > > > >>>
> > > > > >>> ~/bin/perf test 10 86 92 93 94 95
> > > > > >>> 10: PMU events                                                      :
> > > > > >>> 10.1: PMU event table sanity                                        : Ok
> > > > > >>> 10.2: PMU event map aliases                                         : Ok
> > > > > >>> 10.3: Parsing of PMU event table metrics                            : Ok
> > > > > >>> 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > > > >>> 86: perf record tests                                               : Ok
> > > > > >>> 92: perf stat tests                                                 : Ok
> > > > > >>> 93: perf all metricgroups test                                      : Ok
> > > > > >>> 94: perf all metrics test                                           : Ok
> > > > > >>> 95: perf all PMU test                                               : Ok
> > > > > >>>
> > > > > >>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> > > > > >>> /proc/sys/kernel/perf_event_paranoid
> > > > > >>> echo "linux-perf: Reset limited access to performance monitoring and
> > > > > >>> observability operations"
> > > > > >>>
> > > > > >>> If you need further information, please let me know.
> > > > > >>>
> > > > > >>> Thanks.
> > > > > >>>
> > > > > >>> Regards,
> > > > > >>> -Sedat-
> > > > > >>>
> > > > > >>> P.S. Instructions
> > > > > >>>
> > > > > >>> [ REPRODUCER ]
> > > > > >>>
> > > > > >>> LLVM_MVER="16"
> > > > > >>>
> > > > > >>> # Debian LLVM
> > > > > >>> ##LLVM_TOOLCHAIN_PATH="/usr/lib/llvm-${LLVM_MVER}/bin"
> > > > > >>> # Selfmade LLVM
> > > > > >>> LLVM_TOOLCHAIN_PATH="/opt/llvm/bin"
> > > > > >>> if [ -d ${LLVM_TOOLCHAIN_PATH} ]; then
> > > > > >>>    export PATH="${LLVM_TOOLCHAIN_PATH}:${PATH}"
> > > > > >>> fi
> > > > > >>>
> > > > > >>> PYTHON_VER="3.11"
> > > > > >>> MAKE="make"
> > > > > >>> MAKE_OPTS="V=1 -j1 HOSTCC=clang-$LLVM_MVER HOSTLD=ld.lld
> > > > > >>> HOSTAR=llvm-ar CC=clang-$LLVM_MVER LD=ld.lld AR=llvm-ar
> > > > > >>> STRIP=llvm-strip"
> > > > > >>>
> > > > > >>> echo "LLVM MVER ........ $LLVM_MVER"
> > > > > >>> echo "Path settings .... $PATH"
> > > > > >>> echo "Python version ... $PYTHON_VER"
> > > > > >>> echo "make line ........ $MAKE $MAKE_OPTS"
> > > > > >>>
> > > > > >>> LANG=C LC_ALL=C make -C tools/perf clean 2>&1 | tee ../make-log_perf-clean.txt
> > > > > >>>
> > > > > >>> LANG=C LC_ALL=C $MAKE $MAKE_OPTS -C tools/perf
> > > > > >>> PYTHON=python${PYTHON_VER} install-bin 2>&1 | tee
> > > > > >>> ../make-log_perf-install_bin_python${PYTHON_VER}_llvm${LLVM_MVER}.txt
> > > > > >>>
> > > > > >>>
> > > > > >>> [ TESTS ]
> > > > > >>>
> > > > > >>> [ TESTS - START ]
> > > > > >>>
> > > > > >>> echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> > > > > >>> /proc/sys/kernel/perf_event_paranoid
> > > > > >>>
> > > > > >>> [ TESTS - DEBIAN ]
> > > > > >>>
> > > > > >>> /usr/bin/perf -v
> > > > > >>> perf version 6.1.7
> > > > > >>>
> > > > > >>> /usr/bin/perf test 10 92 98 99 100 101
> > > > > >>>
> > > > > >>>  10: PMU events                                                      :
> > > > > >>>  10.1: PMU event table sanity                                        : Ok
> > > > > >>>  10.2: PMU event map aliases                                         : Ok
> > > > > >>>  10.3: Parsing of PMU event table metrics                            : Ok
> > > > > >>>  10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > > > >>>  92: perf record tests                                               : Ok
> > > > > >>>  98: perf stat tests                                                 : Ok
> > > > > >>>  99: perf all metricgroups test                                      : Ok
> > > > > >>> 100: perf all metrics test                                           : FAILED!
> > > > > >>> 101: perf all PMU test                                               : Ok
> > > > > >>>
> > > > > >>> [ TESTS - DILEKS ]
> > > > > >>>
> > > > > >>> ~/bin/perf -v
> > > > > >>> perf version 6.2.0-rc5
> > > > > >>>
> > > > > >>> ~/bin/perf test 7 87 93 94 95 96
> > > > > >>>
> > > > > >>>   7: PMU events                                                      :
> > > > > >>>   7.1: PMU event table sanity                                        : Ok
> > > > > >>>   7.2: PMU event map aliases                                         : Ok
> > > > > >>>   7.3: Parsing of PMU event table metrics                            : Ok
> > > > > >>>   7.4: Parsing of PMU event table metrics with fake PMUs             : Ok
> > > > > >>>  87: perf record tests                                               : Ok
> > > > > >>>  93: perf stat tests                                                 : Ok
> > > > > >>>  94: perf all metricgroups test                                      : Ok
> > > > > >>>  95: perf all metrics test                                           : FAILED!
> > > > > >>>  96: perf all PMU test                                               : Ok
> > > > > >>>
> > > > > >>> [ TESTS - FAILED ]
> > > > > >>>
> > > > > >>> /usr/bin/perf test --verbose 100 2>&1 | tee
> > > > > >>> perf-test-verbose-100-perf-all-metrics-test_debian-perf-6-1-7.txt
> > > > > >>>
> > > > > >>> ~/bin/perf test --verbose 95 2>&1 | tee
> > > > > >>> perf-test-verbose-95-perf-all-metrics-test_dileks-perf-6-2-rc5.txt
> > > > > >>>
> > > > > >>> [ TESTS - STOP ]
> > > > > >>>
> > > > > >>> echo 1 | sudo tee /proc/sys/kernel/kptr_restrict
> > > > > >>> /proc/sys/kernel/perf_event_paranoid
> > > > > >>>
> > > > > >>> - EOT -


* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-01-31  0:20       ` Ian Rogers
  2023-01-31  3:45         ` Sedat Dilek
@ 2023-02-01  6:51         ` Ravi Bangoria
  1 sibling, 0 replies; 13+ messages in thread
From: Ravi Bangoria @ 2023-02-01  6:51 UTC (permalink / raw)
  To: Ian Rogers, Liang, Kan, Xing, Zhengjun, sedat.dilek
  Cc: Arnaldo Carvalho de Melo, Peter Zijlstra, Ingo Molnar,
	Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
	linux-perf-users, linux-kernel, Nick Desaulniers,
	Nathan Chancellor, llvm, Ben Hutchings, James Clark,
	Stephane Eranian, Ravi Bangoria

Hi Ian,

> So I think this is a kernel bug triggering a perf tool bug. The kernel
> bug can be worked around in the perf tool. I only had an Ivybridge to
> test with (hence slightly different events) but what I see is both
> tma_dram_bound and tma_l3_bound using the same 4 events. I could work
> around the "<not counted>" by adding the --metric-no-group flag:
> 
> ```
> $ perf stat -M tma_l3_bound --metric-no-group -a sleep 1
> 
> Performance counter stats for 'system wide':
> 
>           400,404      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      4.3 %
> tma_l3_bound             (74.99%)
>       128,937,891      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                         (87.46%)
>           167,459      MEM_LOAD_UOPS_RETIRED.LLC_MISS
>                         (74.99%)
>       759,574,967      CPU_CLK_UNHALTED.THREAD
>                         (87.47%)
> 
>       1.001526438 seconds time elapsed
> 
> $ perf stat -M tma_dram_bound -a --metric-no-group sleep 1
> 
> Performance counter stats for 'system wide':
> 
>           259,954      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #     15.2 %
> tma_dram_bound           (74.99%)
>       118,807,043      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                         (87.46%)
>           111,699      MEM_LOAD_UOPS_RETIRED.LLC_MISS
>                         (74.95%)
>       587,571,060      CPU_CLK_UNHALTED.THREAD
>                         (87.45%)
> 
>       1.001518093 seconds time elapsed
> ```
> 
> The issue is that perf metrics use weak groups of events. A weak group
> is the same as a group of events initially. We want to use groups of
> events with metrics so that all the counters are scheduled in and out
> at the same time, and not multiplexed independently. Imagine measuring
> IPC but the counts for instructions and cycles are measured at
> different periods, the resultant IPC value would be unlikely to be
> accurate. If perf_event_open fails then the perf tool retries the
> events without the group. If I try just 3 of the events in a weak
> group then the failure can be seen:
> 
> ```
> $ perf stat -e "{MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING}:W"
> -a sleep 1
> 
> Performance counter stats for 'system wide':
> 
>     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_HIT
>                         (0.00%)
>     <not counted>      MEM_LOAD_UOPS_RETIRED.LLC_MISS
>                         (0.00%)
>     <not counted>      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                         (0.00%)
> 
>       1.001458485 seconds time elapsed
> ```
> 
> The kernel should have failed the perf_event_open on opening the third
> event and then measured without the group,

IIUC, the kernel should not fail opening of the 3rd event, because there
are 4 general purpose counters on Intel and all three events can be
scheduled on any of the 4 counters (I checked IvyBridge).

However, what I don't understand is why the kernel failed to schedule the
group. Unless someone has pre-occupied 2 or more GP counters, the group
should get scheduled fine.

> which it can do with
> multiplexing as in the following:
> 
> ```
> $ perf stat -e "MEM_LOAD_UOPS_RETIRED.LLC_HIT,MEM_LOAD_UOPS_RETIRED.LLC_MISS,CYCLE_ACTIVITY.STALLS_L2_PENDING"
> -a sleep 1
> 
> Performance counter stats for 'system wide':
> 
>         1,239,397      MEM_LOAD_UOPS_RETIRED.LLC_HIT
>                         (79.06%)
>           174,826      MEM_LOAD_UOPS_RETIRED.LLC_MISS
>                         (64.60%)
>       124,026,024      CYCLE_ACTIVITY.STALLS_L2_PENDING
>                         (81.16%)
> 
>       1.001483434 seconds time elapsed
> ```
> 
> When the --metric-no-group flag is given to perf then it doesn't
> produce the initial weak group, which works around the bug of the
> kernel not failing on the 3rd perf_event_open. I've added Kan and
> Zhengjun to the e-mail as they work on the Intel kernel PMU code.
> 
> There's a question about what we should do in the perf test about
> this? I have a few solutions:
> 
> 1) try metric tests again with the --metric-no-group flag and don't
> fail the test if this succeeds. This allows kernel bugs to hide, so
> I'm not a huge fan.
> 
> 2) add a new metric flag/constraint to say not to group, this way the
> metric will automatically apply the "--metric-no-group" flag. It is a
> bit of work to wire this up but this kind of failure is common enough
> in PMUs that it is probably worthwhile. We also need to add the flag
> to metrics and I'm not sure how to get a good list of the metrics that
> currently fail and require it. This is okay but error prone.
> 
> 3) fix the kernel bug and let the perf test fail until an adequate
> kernel is installed. Probably the best option.

Thanks,
Ravi



* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-01-31  3:55           ` Ian Rogers
  2023-01-31  6:14             ` Sedat Dilek
@ 2023-02-01 15:27             ` Liang, Kan
  2023-02-01 17:02               ` Ian Rogers
  1 sibling, 1 reply; 13+ messages in thread
From: Liang, Kan @ 2023-02-01 15:27 UTC (permalink / raw)
  To: Ian Rogers, sedat.dilek
  Cc: Xing, Zhengjun, Arnaldo Carvalho de Melo, Peter Zijlstra,
	Ingo Molnar, Mark Rutland, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, linux-perf-users, linux-kernel, Nick Desaulniers,
	Nathan Chancellor, llvm, Ben Hutchings, James Clark,
	Stephane Eranian

Hi Ian,

On 2023-01-30 10:55 p.m., Ian Rogers wrote:
>>> There's a question about what we should do in the perf test about
>>> this? I have a few solutions:
>>>
>>> 1) try metric tests again with the --metric-no-group flag and don't
>>> fail the test if this succeeds. This allows kernel bugs to hide, so
>>> I'm not a huge fan.
>>>
>>> 2) add a new metric flag/constraint to say not to group, this way the
>>> metric will automatically apply the "--metric-no-group" flag. It is a
>>> bit of work to wire this up but this kind of failure is common enough
>>> in PMUs that it is probably worthwhile. We also need to add the flag
>>> to metrics and I'm not sure how to get a good list of the metrics that
>>> currently fail and require it. This is okay but error prone.
>>>
>>> 3) fix the kernel bug and let the perf test fail until an adequate
>>> kernel is installed. Probably the best option.
>>>
>> Hi Ian,
>>
>> I can confirm:
>>
>> $ echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
>> /proc/sys/kernel/perf_event_paranoid
>> 0
>>
>> $ ~/bin/perf stat -M tma_l3_bound --metric-no-group -a sleep 1
>>
>> Performance counter stats for 'system wide':
>>
>>         2.058.892      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,5 %
>> tma_l3_bound             (99,30%)
>>       173.254.697      CYCLE_ACTIVITY.STALLS_L2_PENDING
>>                         (99,10%)
>>     2.396.130.501      CPU_CLK_UNHALTED.THREAD
>>                         (99,60%)
>>         1.110.486      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
>>                            (99,53%)
>>
>>       1,001989022 seconds time elapsed
>>
>> $ ~/bin/perf stat -M tma_dram_bound --metric-no-group -a sleep 1
>>
>> Performance counter stats for 'system wide':
>>
>>         1.729.208      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,2 %
>> tma_dram_bound           (99,50%)
>>        50.346.734      CYCLE_ACTIVITY.STALLS_L2_PENDING
>>                         (99,50%)
>>     2.354.963.862      CPU_CLK_UNHALTED.THREAD
>>                         (99,80%)
>>           306.500      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
>>                            (99,61%)
>>
>>       1,001981392 seconds time elapsed
>>
>> Thanks!
> Thanks, apparently it is an issue with SandyBridge/IvyBridge that some
> counters on one hyperthread will limit what can be on the other. I
> believe that's the comment related to EXCL access here:
> https://github.com/torvalds/linux/blob/master/arch/x86/events/intel/core.c#L124
> So you may have more success with the metric if you disable
> hyperthreading, but I imagine that's not a popular option.

Thanks for debugging the issue. Yes, it's caused by the HT workaround
for SNB/IVB/HSW.

The weak group check in the kernel is in validate_group(). It only does
a sanity check. It doesn't check all the workarounds and the current
status of counters (e.g., whether the fixed counter is occupied by NMI
watchdog.) It's possible that a false positive is returned to the perf
tool. I once tried to fix the NMI watchdog check in the kernel, but the
proposal was rejected, so the metric constraint was introduced instead.
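
To make the false positive concrete, here is a toy Python model. It is
not kernel code: the function names, the counter numbers and the idea of
a separate "usable counters" value are assumptions made purely for
illustration of the behaviour described above.

```python
# Toy model only. Nothing here is taken from the kernel sources.

def toy_validate_group(n_events: int, total_counters: int) -> bool:
    """Open-time sanity check: ignores workarounds and busy counters."""
    return n_events <= total_counters


def toy_can_schedule(n_events: int, usable_counters: int) -> bool:
    """Run-time view: only counters that are really usable count."""
    return n_events <= usable_counters


def open_weak_group(n_events: int, total_counters: int,
                    usable_counters: int) -> str:
    """Return how a weak group would end up being measured."""
    if not toy_validate_group(n_events, total_counters):
        # The open fails, so the tool breaks the group and retries
        # the events individually (they can then be multiplexed).
        return "ungrouped"
    if toy_can_schedule(n_events, usable_counters):
        return "grouped"
    # The open succeeded (false positive), so the tool keeps the
    # group, but the group never fits on the PMU at run time.
    return "not counted"
```

With 3 events, 4 counters in total but only 2 usable (hypothetical
numbers), the model ends in the "not counted" state, which is the
symptom reported in this thread.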

For this issue, I think option 2 above should be the better and more
practical choice. The issue is only observed on old machines, which
usually have a stable kernel running on them. I don't think users want
to update their kernel just to work around an issue with several metrics.
But it should be much easier for them to update the perf tool.

We know that the events below are the problematic ones.
/* MEM_UOPS_RETIRED.* */
/* MEM_LOAD_UOPS_RETIRED.* */
/* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
/* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
Can we update the converter script and apply the "--metric-no-group"
flag or add a new constraint if the above events are detected on
SNB/IVB/HSW?
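
The detection asked about above could be sketched roughly as follows.
This is only an illustration, not the actual converter script: the
function name, the short CPU names and the simple tokenizer are all
assumptions. Only the four event prefixes come from the list above.

```python
import re

# Event-name prefixes listed above as problematic (taken from this thread).
PROBLEM_EVENT_PREFIXES = (
    "MEM_UOPS_RETIRED.",
    "MEM_LOAD_UOPS_RETIRED.",
    "MEM_LOAD_UOPS_LLC_HIT_RETIRED.",
    "MEM_LOAD_UOPS_LLC_MISS_RETIRED.",
)

# Platforms affected by the HT workaround (the short names are an assumption).
AFFECTED_CPUS = {"SNB", "IVB", "HSW"}

# Hypothetical tokenizer: any run of name characters counts as an event name.
EVENT_TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_.]*")


def metric_needs_no_group(metric_expr: str, cpu: str) -> bool:
    """Return True if the expression uses a problematic event on an affected CPU."""
    if cpu.upper() not in AFFECTED_CPUS:
        return False
    return any(
        token.upper().startswith(PROBLEM_EVENT_PREFIXES)
        for token in EVENT_TOKEN.findall(metric_expr)
    )
```

A metric whose expression contains MEM_LOAD_UOPS_RETIRED.LLC_HIT would
be flagged on IVB, while one built only from CPU_CLK_UNHALTED.THREAD and
CYCLE_ACTIVITY.STALLS_L2_PENDING would not.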

Thanks,
Kan


* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-02-01 15:27             ` Liang, Kan
@ 2023-02-01 17:02               ` Ian Rogers
  2023-02-01 19:06                 ` Liang, Kan
  0 siblings, 1 reply; 13+ messages in thread
From: Ian Rogers @ 2023-02-01 17:02 UTC (permalink / raw)
  To: Liang, Kan
  Cc: sedat.dilek, Xing, Zhengjun, Arnaldo Carvalho de Melo,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-perf-users, linux-kernel,
	Nick Desaulniers, Nathan Chancellor, llvm, Ben Hutchings,
	James Clark, Stephane Eranian

On Wed, Feb 1, 2023 at 7:28 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>
> Hi Ian,
>
> On 2023-01-30 10:55 p.m., Ian Rogers wrote:
> >>> There's a question about what we should do in the perf test about
> >>> this? I have a few solutions:
> >>>
> >>> 1) try metric tests again with the --metric-no-group flag and don't
> >>> fail the test if this succeeds. This allows kernel bugs to hide, so
> >>> I'm not a huge fan.
> >>>
> >>> 2) add a new metric flag/constraint to say not to group, this way the
> >>> metric will automatically apply the "--metric-no-group" flag. It is a
> >>> bit of work to wire this up but this kind of failure is common enough
> >>> in PMUs that it is probably worthwhile. We also need to add the flag
> >>> to metrics and I'm not sure how to get a good list of the metrics that
> >>> currently fail and require it. This is okay but error prone.
> >>>
> >>> 3) fix the kernel bug and let the perf test fail until an adequate
> >>> kernel is installed. Probably the best option.
> >>>
> >> Hi Ian,
> >>
> >> I can confirm:
> >>
> >> $ echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
> >> /proc/sys/kernel/perf_event_paranoid
> >> 0
> >>
> >> $ ~/bin/perf stat -M tma_l3_bound --metric-no-group -a sleep 1
> >>
> >> Performance counter stats for 'system wide':
> >>
> >>         2.058.892      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,5 %
> >> tma_l3_bound             (99,30%)
> >>       173.254.697      CYCLE_ACTIVITY.STALLS_L2_PENDING
> >>                         (99,10%)
> >>     2.396.130.501      CPU_CLK_UNHALTED.THREAD
> >>                         (99,60%)
> >>         1.110.486      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> >>                            (99,53%)
> >>
> >>       1,001989022 seconds time elapsed
> >>
> >> $ ~/bin/perf stat -M tma_dram_bound --metric-no-group -a sleep 1
> >>
> >> Performance counter stats for 'system wide':
> >>
> >>         1.729.208      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,2 %
> >> tma_dram_bound           (99,50%)
> >>        50.346.734      CYCLE_ACTIVITY.STALLS_L2_PENDING
> >>                         (99,50%)
> >>     2.354.963.862      CPU_CLK_UNHALTED.THREAD
> >>                         (99,80%)
> >>           306.500      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
> >>                            (99,61%)
> >>
> >>       1,001981392 seconds time elapsed
> >>
> >> Thanks!
> > Thanks, apparently it is an issue with SandyBridge/IvyBridge that some
> > counters on one hyperthread will limit what can be on the other. I
> > believe that's the comment related to EXCL access here:
> > https://github.com/torvalds/linux/blob/master/arch/x86/events/intel/core.c#L124
> > So you may have more success with the metric if you disable
> > hyperthreading, but I imagine that's not a popular option.
>
> Thanks for debugging the issue. Yes, it's caused by the HT workaround
> for SNB/IVB/HSW.
>
> The weak group check in the kernel is in validate_group(). It only does
> a sanity check. It doesn't check all the workarounds or the current
> status of the counters (e.g., whether the fixed counter is occupied by
> the NMI watchdog). It's possible that a false positive is returned to
> the perf tool. I once tried to fix the NMI watchdog check in the kernel,
> but the proposal was rejected. So the metric constraint was introduced.
>
> For this issue, I think option 2 above is the better and more practical
> choice. The issue is only observed on old machines, which usually have
> a stable kernel running on them. I don't think users want to update
> their kernel just to work around an issue with several metrics. But it
> should be much easier for them to update the perf tool.
>
> We know that the events below are the problematic ones.
> /* MEM_UOPS_RETIRED.* */
> /* MEM_LOAD_UOPS_RETIRED.* */
> /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
> /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
> Can we update the converter script and apply the "--metric-no-group"
> flag or add a new constraint if the above events are detected on
> SNB/IVB/HSW?
>
> Thanks,
> Kan

Thanks Kan,

We absolutely can do that! In this case, should it be --metric-no-group
only when SMT is enabled? I can do some patches but would like to know
whether we need SMT and non-SMT versions of --metric-no-group.
Also, should we just have a list of metrics that need the flag or try
to automate detection? Some warts in detection are the names of the
events that vary between Ivybridge and Sandybridge, and how to
determine which events conflict. For example, the perfmon event data:

MEM_LOAD_UOPS_RETIRED.LLC_HIT
https://github.com/intel/perfmon/blob/main/IVB/events/ivybridge_core.json#L5368
MEM_LOAD_UOPS_RETIRED.LLC_MISS
https://github.com/intel/perfmon/blob/main/IVB/events/ivybridge_core.json#L5431
CYCLE_ACTIVITY.STALLS_L2_PENDING
https://github.com/intel/perfmon/blob/main/IVB/events/ivybridge_core.json#L3541

The events list all counters; there are no errata fields. Should the
event data be updated, with the converter script then handling that? If
I am shown an example, I can modify the script accordingly.

It is also hard for me to test anything other than SMT on Ivybridge.

Thanks,
Ian


* Re: [6.1.7][6.2-rc5] perf all metrics test: FAILED!
  2023-02-01 17:02               ` Ian Rogers
@ 2023-02-01 19:06                 ` Liang, Kan
  0 siblings, 0 replies; 13+ messages in thread
From: Liang, Kan @ 2023-02-01 19:06 UTC (permalink / raw)
  To: Ian Rogers
  Cc: sedat.dilek, Xing, Zhengjun, Arnaldo Carvalho de Melo,
	Peter Zijlstra, Ingo Molnar, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, linux-perf-users, linux-kernel,
	Nick Desaulniers, Nathan Chancellor, llvm, Ben Hutchings,
	James Clark, Stephane Eranian



On 2023-02-01 12:02 p.m., Ian Rogers wrote:
> On Wed, Feb 1, 2023 at 7:28 AM Liang, Kan <kan.liang@linux.intel.com> wrote:
>>
>> Hi Ian,
>>
>> On 2023-01-30 10:55 p.m., Ian Rogers wrote:
>>>>> There's a question about what we should do in the perf test about
>>>>> this? I have a few solutions:
>>>>>
>>>>> 1) try metric tests again with the --metric-no-group flag and don't
>>>>> fail the test if this succeeds. This allows kernel bugs to hide, so
>>>>> I'm not a huge fan.
>>>>>
>>>>> 2) add a new metric flag/constraint to say not to group, this way the
>>>>> metric will automatically apply the "--metric-no-group" flag. It is a
>>>>> bit of work to wire this up but this kind of failure is common enough
>>>>> in PMUs that it is probably worthwhile. We also need to add the flag
>>>>> to metrics and I'm not sure how to get a good list of the metrics that
>>>>> currently fail and require it. This is okay but error prone.
>>>>>
>>>>> 3) fix the kernel bug and let the perf test fail until an adequate
>>>>> kernel is installed. Probably the best option.
>>>>>
>>>> Hi Ian,
>>>>
>>>> I can confirm:
>>>>
>>>> $ echo 0 | sudo tee /proc/sys/kernel/kptr_restrict
>>>> /proc/sys/kernel/perf_event_paranoid
>>>> 0
>>>>
>>>> $ ~/bin/perf stat -M tma_l3_bound --metric-no-group -a sleep 1
>>>>
>>>> Performance counter stats for 'system wide':
>>>>
>>>>         2.058.892      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,5 %
>>>> tma_l3_bound             (99,30%)
>>>>       173.254.697      CYCLE_ACTIVITY.STALLS_L2_PENDING
>>>>                         (99,10%)
>>>>     2.396.130.501      CPU_CLK_UNHALTED.THREAD
>>>>                         (99,60%)
>>>>         1.110.486      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
>>>>                            (99,53%)
>>>>
>>>>       1,001989022 seconds time elapsed
>>>>
>>>> $ ~/bin/perf stat -M tma_dram_bound --metric-no-group -a sleep 1
>>>>
>>>> Performance counter stats for 'system wide':
>>>>
>>>>         1.729.208      MEM_LOAD_UOPS_RETIRED.LLC_HIT    #      1,2 %
>>>> tma_dram_bound           (99,50%)
>>>>        50.346.734      CYCLE_ACTIVITY.STALLS_L2_PENDING
>>>>                         (99,50%)
>>>>     2.354.963.862      CPU_CLK_UNHALTED.THREAD
>>>>                         (99,80%)
>>>>           306.500      MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS
>>>>                            (99,61%)
>>>>
>>>>       1,001981392 seconds time elapsed
>>>>
>>>> Thanks!
>>> Thanks, apparently it is an issue with SandyBridge/IvyBridge that some
>>> counters on one hyperthread will limit what can be on the other. I
>>> believe that's the comment related to EXCL access here:
>>> https://github.com/torvalds/linux/blob/master/arch/x86/events/intel/core.c#L124
>>> So you may have more success with the metric if you disable
>>> hyperthreading, but I imagine that's not a popular option.
>>
>> Thanks for debugging the issue. Yes, it's caused by the HT workaround
>> for SNB/IVB/HSW.
>>
>> The weak group check in the kernel is in validate_group(). It only does
>> a sanity check. It doesn't check all the workarounds or the current
>> status of the counters (e.g., whether the fixed counter is occupied by
>> the NMI watchdog). It's possible that a false positive is returned to
>> the perf tool. I once tried to fix the NMI watchdog check in the kernel,
>> but the proposal was rejected. So the metric constraint was introduced.
>>
>> For this issue, I think option 2 above is the better and more practical
>> choice. The issue is only observed on old machines, which usually have
>> a stable kernel running on them. I don't think users want to update
>> their kernel just to work around an issue with several metrics. But it
>> should be much easier for them to update the perf tool.
>>
>> We know that the events below are the problematic ones.
>> /* MEM_UOPS_RETIRED.* */
>> /* MEM_LOAD_UOPS_RETIRED.* */
>> /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
>> /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
>> Can we update the converter script and apply the "--metric-no-group"
>> flag or add a new constraint if the above events are detected on
>> SNB/IVB/HSW?
>>
>> Thanks,
>> Kan
> 
> Thanks Kan,
> 
> We absolutely can do that! In this case, should it be --metric-no-group
> only when SMT is enabled? I can do some patches but would like to know
> whether we need SMT and non-SMT versions of --metric-no-group.

The kernel workaround is disabled when SMT is off. So I think we only
need the SMT version of --metric-no-group.
https://lore.kernel.org/all/1416251225-17721-13-git-send-email-eranian@google.com/T/#u
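
As a rough sketch of the SMT check (not what the perf tool necessarily
does internally), user space can read the sysfs file
/sys/devices/system/cpu/smt/active. The helper name below is
hypothetical, and treating an unreadable file as "SMT off" is an
assumption made only for this example.

```python
from pathlib import Path

# Sysfs file that reports whether SMT is currently active ("1") or not ("0").
SMT_ACTIVE_PATH = "/sys/devices/system/cpu/smt/active"


def smt_is_active(path: str = SMT_ACTIVE_PATH) -> bool:
    """Return True if the sysfs file says SMT is active."""
    try:
        return Path(path).read_text().strip() == "1"
    except OSError:
        # Assumption: a missing or unreadable file is treated as "SMT off".
        return False
```

The path is a parameter so the helper can be exercised against a
temporary file on machines that lack the sysfs entry.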

> Also, should we just have a list of metrics that need the flag or try
> to automate detection? 

I don't think Intel will update the metrics or events for the old
SNB/IVB/HSW platforms. Hard-coding a list of metrics may be simpler than
automated detection.
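
A minimal sketch of the hard-coded approach, assuming a hypothetical
helper that builds the perf stat command line. The two metric names are
only the ones reported as failing in this thread, so the list is
illustrative and not complete. The short CPU names are assumptions too.

```python
# Metrics reported as failing in this thread; illustrative, not complete.
NO_GROUP_METRICS = {"tma_dram_bound", "tma_l3_bound"}

# Platforms with the HT workaround (the short names are an assumption).
AFFECTED_CPUS = {"SNB", "IVB", "HSW"}


def perf_stat_args(metric: str, cpu: str, smt_active: bool) -> list:
    """Build a perf stat command line, adding --metric-no-group when needed."""
    args = ["perf", "stat", "-M", metric]
    # The kernel workaround only applies with SMT on, so only then ungroup.
    if smt_active and cpu in AFFECTED_CPUS and metric in NO_GROUP_METRICS:
        args.append("--metric-no-group")
    args += ["-a", "sleep", "1"]
    return args
```

With SMT on and cpu "SNB", tma_l3_bound yields the same command line
that was used for the manual runs earlier in the thread.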

> Some warts in detection are the names of the
> events that vary between Ivybridge and Sandybridge, and how to
> determine which events conflict. For example, the perfmon event data:
> 
> MEM_LOAD_UOPS_RETIRED.LLC_HIT
> https://github.com/intel/perfmon/blob/main/IVB/events/ivybridge_core.json#L5368
> MEM_LOAD_UOPS_RETIRED.LLC_MISS
> https://github.com/intel/perfmon/blob/main/IVB/events/ivybridge_core.json#L5431
> CYCLE_ACTIVITY.STALLS_L2_PENDING
> https://github.com/intel/perfmon/blob/main/IVB/events/ivybridge_core.json#L3541
>

The problematic events should have the same name across platforms. If
matching by event name doesn't work, the event encoding can be used,
since it is exactly the same across those platforms.


> The events list all counters; there are no errata fields. Should the
> event data be updated, with the converter script then handling that? If
> I am shown an example, I can modify the script accordingly.

If it helps the converter script, I think we can update the errata
field.

Here is the errata information.
 * SNB: BJ122
 * IVB: BV98
 * HSW: HSD29

Here are the details regarding the issue (please search for BV98):
https://www.intel.com/content/www/us/en/content-details/619604/desktop-3rd-generation-intel-core-processor-family-specification-update.html
> 
> It is also hard for me to test anything other than SMT on Ivybridge.
> 

I think it's OK to only test on Ivybridge.
The original kernel patch indicates the issue is the same among SNB, IVB
and HSW.
https://lore.kernel.org/all/1416251225-17721-7-git-send-email-eranian@google.com/T/#u

Thanks,
Kan


end of thread, other threads:[~2023-02-01 19:07 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-29  9:58 [6.1.7][6.2-rc5] perf all metrics test: FAILED! Sedat Dilek
2023-01-29 23:21 ` Ian Rogers
2023-01-30  2:24   ` Sedat Dilek
2023-01-30 10:04     ` James Clark
2023-01-31  0:20       ` Ian Rogers
2023-01-31  3:45         ` Sedat Dilek
2023-01-31  3:55           ` Ian Rogers
2023-01-31  6:14             ` Sedat Dilek
2023-01-31  6:20               ` Sedat Dilek
2023-02-01 15:27             ` Liang, Kan
2023-02-01 17:02               ` Ian Rogers
2023-02-01 19:06                 ` Liang, Kan
2023-02-01  6:51         ` Ravi Bangoria
