* [V3 00/10] perf: New conditional branch filter
@ 2013-10-16 6:56 Anshuman Khandual
From: Anshuman Khandual @ 2013-10-16 6:56 UTC
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, michaele, eranian
This patchset is a re-spin of the original branch stack sampling
patchset, which introduced the new PERF_SAMPLE_BRANCH_COND branch filter. It
also enables SW based branch filtering for book3s powerpc platforms that
have PMU HW backed branch stack sampling support.
Summary of code changes in this patchset:
(1) Introduce a new PERF_SAMPLE_BRANCH_COND branch filter
(2) Add the "cond" branch filter option to the "perf record" tool
(3) Enable PERF_SAMPLE_BRANCH_COND on x86 platforms
(4) Enable PERF_SAMPLE_BRANCH_COND on the POWER8 platform
(5) Update the "perf record" documentation
(6) Add new powerpc instruction analysis functions to the code-patching library
(7) Enable SW based branch filter support for powerpc book3s
(8) Change the BHRB configuration on POWER8 to accommodate SW branch filters
With this new SW enablement, branch filter support for book3s platforms has
been extended to cover all the combinations shown below, exercised with a sample
test application program (included here).
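Each comma-separated name passed to "-j" in the runs below selects one
PERF_SAMPLE_BRANCH_* bit, and the bits are ORed into a single mask. The
following is only a minimal sketch of that mapping, not the perf tool's code:
parse_branch_filters(), br_names and the BR_* macros are names made up for this
illustration. The bit positions match include/uapi/linux/perf_event.h, with
COND taking the value proposed in patch 01/10.

```c
#include <stdint.h>
#include <string.h>

/* Bit positions as in include/uapi/linux/perf_event.h; COND per patch 01/10. */
#define BR_ANY_CALL (1U << 4)
#define BR_ANY_RET  (1U << 5)
#define BR_IND_CALL (1U << 6)
#define BR_COND     (1U << 10)

/* Hypothetical name table, covering only the filters used in this mail. */
struct br_name {
	const char *name;
	uint32_t bit;
};

static const struct br_name br_names[] = {
	{ "any_call", BR_ANY_CALL },
	{ "any_ret",  BR_ANY_RET },
	{ "ind_call", BR_IND_CALL },
	{ "cond",     BR_COND },
};

/* Returns 0 and stores the ORed mask on success, -1 on an unknown name. */
int parse_branch_filters(const char *list, uint32_t *mask)
{
	const size_t n = sizeof(br_names) / sizeof(br_names[0]);
	const char *p = list;
	uint32_t m = 0;

	while (*p) {
		size_t len = strcspn(p, ",");	/* length of the current token */
		size_t i;

		for (i = 0; i < n; i++) {
			if (strlen(br_names[i].name) == len &&
			    strncmp(p, br_names[i].name, len) == 0) {
				m |= br_names[i].bit;
				break;
			}
		}
		if (i == n)
			return -1;	/* unknown name, or empty token between commas */
		p += len;
		if (*p == ',')
			p++;		/* skip the separator */
	}
	*mask = m;
	return 0;
}
```

With this sketch, "any_call,cond,any_ret,ind_call" (test 11 below) yields a
mask with bits 4, 5, 6 and 10 set.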
Changes in V2
=============
(1) Enabled PPC64 SW branch filtering support
(2) Incorporated changes addressing all previous review comments
Changes in V3
=============
(1) Split the SW branch filter enablement into multiple patches
(2) Added PMU neutral SW branch filtering code and PMU specific HW branch filtering code
(3) Added new instruction analysis functionality into powerpc code-patching library
(4) Renamed some of the functions
(5) Fixed a couple of spelling mistakes
(6) Changed code documentation in multiple places
PMU HW branch filters
=====================
(1) perf record -j any_call -e branch-misses:u ./cprog
# Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
# ........ ....... .................... ..................... .................... ........................
#
7.00% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_2
6.99% cprog cprog [.] hw_1_1 cprog [.] symbol1
6.52% cprog cprog [.] sw_3_1 cprog [.] success_3_1_2
5.41% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_3
5.40% cprog cprog [.] hw_1_2 cprog [.] symbol2
5.40% cprog cprog [.] callme cprog [.] hw_1_2
5.40% cprog cprog [.] sw_3_1 cprog [.] success_3_1_1
5.40% cprog cprog [.] callme cprog [.] hw_1_1
5.39% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_1
5.39% cprog cprog [.] sw_4_2 cprog [.] lr_addr
5.39% cprog cprog [.] callme cprog [.] sw_4_2
5.37% cprog [unknown] [.] 00000000 cprog [.] ctr_addr
4.30% cprog cprog [.] callme cprog [.] hw_2_1
4.28% cprog cprog [.] callme cprog [.] sw_3_1
3.82% cprog cprog [.] sw_3_1 cprog [.] success_3_1_3
3.81% cprog cprog [.] callme cprog [.] hw_2_2
3.81% cprog cprog [.] callme cprog [.] sw_3_2
2.71% cprog [unknown] [.] 00000000 cprog [.] lr_addr
2.70% cprog cprog [.] main cprog [.] callme
2.70% cprog cprog [.] sw_4_1 cprog [.] ctr_addr
2.70% cprog cprog [.] callme cprog [.] sw_4_1
0.08% cprog [unknown] [.] 0xf78676c4 [unknown] [.] 0xf78522c0
0.02% cprog [unknown] [k] 00000000 cprog [k] ctr_addr
0.01% cprog [kernel.kallsyms] [.] .power_pmu_enable [kernel.kallsyms] [.] .power8_compute_mmcr
0.00% cprog ld-2.11.2.so [.] malloc [unknown] [.] 0xf786b380
0.00% cprog ld-2.11.2.so [.] calloc [unknown] [.] 0xf786b390
0.00% cprog cprog [.] main [unknown] [.] 0x10000950
0.00% cprog [unknown] [.] 00000000 [kernel.kallsyms] [.] .power_pmu_enable
(2) perf record -j cond -e branch-misses:u ./cprog
# Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
# ........ ....... .................... ....................... .................... .......................
#
27.73% cprog [unknown] [.] 00000000 cprog [.] callme
13.03% cprog cprog [.] sw_3_1 cprog [.] sw_3_1
5.64% cprog [unknown] [.] 00000000 cprog [.] main
5.62% cprog [unknown] [.] 00000000 cprog [.] sw_4_2
5.46% cprog cprog [.] sw_4_2 cprog [.] lr_addr
5.40% cprog [unknown] [.] 00000000 cprog [.] sw_4_1
3.72% cprog cprog [.] hw_2_1 cprog [.] callme
3.71% cprog cprog [.] main cprog [.] hw_1_1
3.71% cprog cprog [.] sw_3_1_2 cprog [.] sw_3_1
3.70% cprog cprog [.] sw_3_1_3 cprog [.] sw_3_1
3.70% cprog cprog [.] sw_4_1 cprog [.] ctr_addr
3.69% cprog cprog [.] hw_1_2 cprog [.] hw_1_2
3.69% cprog cprog [.] hw_2_2 cprog [.] callme
3.68% cprog cprog [.] sw_3_1_1 cprog [.] sw_3_1
1.93% cprog [unknown] [.] 00000000 cprog [.] lr_addr
1.78% cprog [unknown] [.] 00000000 cprog [.] hw_1_2
1.78% cprog [unknown] [.] 00000000 cprog [.] sw_3_1
1.76% cprog [unknown] [.] 00000000 cprog [.] hw_1_1
0.12% cprog [unknown] [.] 0xf7bb25dc [unknown] [.] 0xf7bb27e4
0.07% cprog [unknown] [k] 00000000 cprog [k] callme
0.07% cprog [unknown] [k] 00000000 cprog [k] sw_4_1
0.00% cprog libc-2.11.2.so [.] _IO_file_doallocate libc-2.11.2.so [.] _IO_file_doallocate
0.00% cprog libc-2.11.2.so [.] _IO_file_doallocate libc-2.11.2.so [.] isatty
0.00% cprog [unknown] [.] 00000000 libc-2.11.2.so [.] _IO_file_doallocate
SW based branch filters
=======================
(3) perf record -j any_ret -e branch-misses:u ./cprog
# Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
# ........ ....... .................... .................... .................... .....................
#
15.37% cprog [unknown] [.] 00000000 cprog [.] sw_3_1
6.46% cprog cprog [.] success_3_1_3 cprog [.] sw_3_1
6.45% cprog cprog [.] symbol1 cprog [.] hw_1_1
6.41% cprog [unknown] [.] 00000000 cprog [.] callme
6.39% cprog cprog [.] ctr_addr cprog [.] sw_4_1
6.37% cprog cprog [.] symbol2 cprog [.] hw_1_2
6.36% cprog cprog [.] sw_4_2 cprog [.] callme
6.35% cprog cprog [.] lr_addr cprog [.] sw_4_2
3.97% cprog cprog [.] back1 cprog [.] callme
3.93% cprog cprog [.] sw_3_1_2 cprog [.] sw_3_1
3.93% cprog cprog [.] sw_3_1 cprog [.] callme
3.86% cprog cprog [.] sw_3_1_3 cprog [.] sw_3_1
3.84% cprog cprog [.] sw_3_1_1 cprog [.] sw_3_1
2.54% cprog cprog [.] success_3_1_1 cprog [.] sw_3_1
2.54% cprog cprog [.] sw_4_1 cprog [.] callme
2.54% cprog cprog [.] hw_1_1 cprog [.] callme
2.53% cprog cprog [.] sw_3_2 cprog [.] callme
2.52% cprog cprog [.] callme cprog [.] main
2.51% cprog cprog [.] hw_1_2 cprog [.] callme
2.51% cprog cprog [.] back2 cprog [.] callme
2.51% cprog cprog [.] success_3_1_2 cprog [.] sw_3_1
0.07% cprog [unknown] [k] 00000000 cprog [k] callme
0.02% cprog [unknown] [.] 00000000 [unknown] [.] 0xf7e5c004
0.01% cprog libc-2.11.2.so [.] __errno_location libc-2.11.2.so [.] vfprintf
0.01% cprog [unknown] [.] 00000000 libc-2.11.2.so [.] _IO_file_overflow
(4) perf record -j ind_call -e branch-misses:u ./cprog
# Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
# ........ ....... .................... ................... .................... .....................
#
48.04% cprog [unknown] [.] 00000000 cprog [.] sw_3_1
19.96% cprog cprog [.] sw_4_2 cprog [.] lr_addr
19.69% cprog [unknown] [.] 00000000 cprog [.] callme
12.04% cprog cprog [.] sw_4_1 cprog [.] ctr_addr
0.18% cprog [unknown] [k] 00000000 cprog [k] callme
0.02% cprog libc-2.11.2.so [.] _IO_file_xsputn libc-2.11.2.so [.] _IO_file_overflow
0.02% cprog [unknown] [.] 00000000 libc-2.11.2.so [.] _IO_file_xsputn
0.02% cprog [unknown] [.] 00000000 ld-2.11.2.so [.] malloc
0.02% cprog [unknown] [k] 00000000 cprog [k] sw_3_1
(5) perf record -j any_call,any_ret -e branch-misses:u ./cprog
# Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
# ........ ....... .................... ....................... .................... .......................
#
10.36% cprog [unknown] [.] 00000000 cprog [.] sw_3_1
4.18% cprog cprog [.] symbol1 cprog [.] hw_1_1
4.18% cprog cprog [.] success_3_1_3 cprog [.] sw_3_1
4.17% cprog cprog [.] sw_4_2 cprog [.] lr_addr
4.16% cprog cprog [.] sw_4_2 cprog [.] callme
4.15% cprog cprog [.] ctr_addr cprog [.] sw_4_1
4.15% cprog cprog [.] lr_addr cprog [.] sw_4_2
4.14% cprog cprog [.] symbol2 cprog [.] hw_1_2
4.14% cprog [unknown] [.] 00000000 cprog [.] callme
2.15% cprog cprog [.] sw_3_1 cprog [.] callme
2.14% cprog cprog [.] hw_1_1 cprog [.] symbol1
2.14% cprog cprog [.] callme cprog [.] hw_1_1
2.14% cprog cprog [.] callme cprog [.] sw_4_2
2.13% cprog cprog [.] back1 cprog [.] callme
2.12% cprog cprog [.] sw_3_1_2 cprog [.] sw_3_1
2.12% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_2
2.11% cprog cprog [.] sw_3_1_3 cprog [.] sw_3_1
2.11% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_3
2.11% cprog cprog [.] sw_4_1 cprog [.] ctr_addr
2.10% cprog cprog [.] hw_1_2 cprog [.] symbol2
2.10% cprog cprog [.] sw_3_1_1 cprog [.] sw_3_1
2.10% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_1
2.10% cprog cprog [.] callme cprog [.] hw_1_2
2.10% cprog cprog [.] callme cprog [.] sw_3_1
2.05% cprog cprog [.] success_3_1_1 cprog [.] sw_3_1
2.05% cprog cprog [.] sw_3_1 cprog [.] success_3_1_1
2.05% cprog cprog [.] success_3_1_2 cprog [.] sw_3_1
2.05% cprog cprog [.] sw_3_1 cprog [.] success_3_1_2
2.04% cprog cprog [.] hw_1_1 cprog [.] callme
2.04% cprog cprog [.] back2 cprog [.] callme
2.04% cprog cprog [.] sw_4_1 cprog [.] callme
2.04% cprog cprog [.] callme cprog [.] main
2.04% cprog cprog [.] hw_1_2 cprog [.] callme
2.04% cprog cprog [.] sw_3_2 cprog [.] callme
2.04% cprog cprog [.] callme cprog [.] sw_3_2
2.03% cprog cprog [.] sw_3_1 cprog [.] success_3_1_3
0.03% cprog [unknown] [k] 00000000 cprog [k] callme
0.01% cprog [unknown] [.] 0xf7e79bb0 [unknown] [.] 0xf7e64088
0.00% cprog libc-2.11.2.so [.] _IO_file_doallocate libc-2.11.2.so [.] mmap
0.00% cprog libc-2.11.2.so [.] mmap libc-2.11.2.so [.] _IO_file_doallocate
0.00% cprog [unknown] [.] 0xf7e7589c libc-2.11.2.so [.] printf
0.00% cprog [unknown] [k] 00000000 cprog [k] sw_3_1
(6) perf record -j any_call,ind_call -e branch-misses:u ./cprog
# Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
# ........ ....... .................... .............. .................... .................
#
23.09% cprog [unknown] [.] 00000000 cprog [.] sw_3_1
8.99% cprog cprog [.] sw_4_2 cprog [.] lr_addr
8.92% cprog [unknown] [.] 00000000 cprog [.] callme
5.18% cprog cprog [.] sw_3_1 cprog [.] success_3_1_2
5.16% cprog cprog [.] sw_3_1 cprog [.] success_3_1_1
5.16% cprog cprog [.] callme cprog [.] sw_3_2
5.12% cprog cprog [.] sw_3_1 cprog [.] success_3_1_3
3.85% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_1
3.85% cprog cprog [.] callme cprog [.] sw_3_1
3.84% cprog cprog [.] sw_4_1 cprog [.] ctr_addr
3.82% cprog cprog [.] hw_1_1 cprog [.] symbol1
3.82% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_2
3.82% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_3
3.82% cprog cprog [.] callme cprog [.] hw_1_1
3.81% cprog cprog [.] hw_1_2 cprog [.] symbol2
3.81% cprog cprog [.] callme cprog [.] hw_1_2
3.81% cprog cprog [.] callme cprog [.] sw_4_2
0.05% cprog [unknown] [k] 00000000 cprog [k] callme
0.03% cprog [unknown] [.] 0xf7f7232c [unknown] [.] 0xf7f72334
0.01% cprog ld-2.11.2.so [.] malloc [unknown] [.] 0xf7f8b380
0.01% cprog cprog [.] main [unknown] [.] 0x10000950
0.01% cprog [unknown] [.] 00000000 ld-2.11.2.so [.] malloc
0.01% cprog [unknown] [.] 00000000 cprog [.] main
(7) perf record -j cond,any_ret -e branch-misses:u ./cprog
# Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
# ........ ....... .................... ..................... .................... .....................
#
12.18% cprog [unknown] [.] 00000000 cprog [.] sw_3_1
4.90% cprog cprog [.] sw_4_2 cprog [.] lr_addr
4.88% cprog [unknown] [.] 00000000 cprog [.] callme
4.88% cprog cprog [.] lr_addr cprog [.] sw_4_2
4.88% cprog cprog [.] sw_4_2 cprog [.] callme
4.86% cprog cprog [.] symbol1 cprog [.] hw_1_1
4.86% cprog cprog [.] success_3_1_3 cprog [.] sw_3_1
4.85% cprog cprog [.] symbol2 cprog [.] hw_1_2
4.85% cprog cprog [.] ctr_addr cprog [.] sw_4_1
2.47% cprog cprog [.] sw_3_1_3 cprog [.] sw_3_1
2.46% cprog cprog [.] back1 cprog [.] callme
2.45% cprog cprog [.] hw_1_1 cprog [.] callme
2.45% cprog cprog [.] hw_2_1 cprog [.] address1
2.44% cprog cprog [.] hw_1_2 cprog [.] symbol2
2.44% cprog cprog [.] sw_3_1_1 cprog [.] sw_3_1
2.44% cprog cprog [.] sw_3_2 cprog [.] callme
2.44% cprog cprog [.] success_3_1_1 cprog [.] sw_3_1
2.44% cprog cprog [.] sw_3_1 cprog [.] success_3_1_1
2.44% cprog cprog [.] sw_3_1 cprog [.] success_3_1_3
2.43% cprog cprog [.] callme cprog [.] main
2.43% cprog cprog [.] hw_2_2 cprog [.] address2
2.43% cprog cprog [.] sw_3_1_2 cprog [.] sw_3_1
2.43% cprog cprog [.] success_3_1_2 cprog [.] sw_3_1
2.43% cprog cprog [.] sw_3_1 cprog [.] success_3_1_2
2.43% cprog cprog [.] sw_4_1 cprog [.] callme
2.42% cprog cprog [.] sw_3_1 cprog [.] callme
2.42% cprog cprog [.] sw_4_1 cprog [.] ctr_addr
2.42% cprog cprog [.] back2 cprog [.] callme
2.40% cprog cprog [.] hw_1_2 cprog [.] callme
0.10% cprog [unknown] [.] 0xf78923e0 [unknown] [.] 0xf78923c0
0.03% cprog [unknown] [k] 00000000 cprog [k] callme
0.01% cprog [unknown] [k] 00000000 cprog [k] sw_3_1
0.01% cprog libc-2.11.2.so [.] vfprintf libc-2.11.2.so [.] vfprintf
0.01% cprog libc-2.11.2.so [.] _IO_file_overflow [unknown] [.] 0x0fee0100
0.01% cprog libc-2.11.2.so [.] strchrnul libc-2.11.2.so [.] vfprintf
0.01% cprog libc-2.11.2.so [.] strchrnul libc-2.11.2.so [.] strchrnul
0.01% cprog [unknown] [.] 00000000 libc-2.11.2.so [.] _IO_file_overflow
(8) perf record -j cond,ind_call -e branch-misses:u ./cprog
# Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
# ........ ....... .................... .............. .................... ...................
#
26.21% cprog [unknown] [.] 00000000 cprog [.] sw_3_1
10.50% cprog cprog [.] sw_4_2 cprog [.] lr_addr
10.38% cprog [unknown] [.] 00000000 cprog [.] callme
5.31% cprog cprog [.] sw_3_1_2 cprog [.] sw_3_1
5.30% cprog cprog [.] sw_3_1_1 cprog [.] sw_3_1
5.27% cprog cprog [.] sw_3_1 cprog [.] success_3_1_2
5.26% cprog cprog [.] hw_2_2 cprog [.] address2
5.25% cprog cprog [.] hw_1_2 cprog [.] symbol2
5.25% cprog cprog [.] sw_3_1 cprog [.] success_3_1_3
5.24% cprog cprog [.] hw_2_1 cprog [.] address1
5.23% cprog cprog [.] sw_4_1 cprog [.] ctr_addr
5.20% cprog cprog [.] sw_3_1_3 cprog [.] sw_3_1
5.19% cprog cprog [.] sw_3_1 cprog [.] success_3_1_1
0.24% cprog [unknown] [.] 0xf7cf23e0 [unknown] [.] 0xf7cf23c0
0.11% cprog [unknown] [k] 00000000 cprog [k] callme
0.01% cprog libc-2.11.2.so [.] vfprintf libc-2.11.2.so [.] vfprintf
0.01% cprog libc-2.11.2.so [.] vfprintf libc-2.11.2.so [.] _IO_file_xsputn
0.01% cprog [unknown] [.] 00000000 libc-2.11.2.so [.] vfprintf
0.01% cprog [unknown] [k] 00000000 cprog [k] sw_3_1
(9) perf record -j any_call,cond,any_ret -e branch-misses:u ./cprog
# Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
# ........ ....... .................... ................. .................... .....................
#
9.96% cprog [unknown] [.] 00000000 cprog [.] sw_3_1
4.06% cprog cprog [.] sw_4_2 cprog [.] lr_addr
4.04% cprog cprog [.] lr_addr cprog [.] sw_4_2
4.03% cprog cprog [.] symbol1 cprog [.] hw_1_1
4.02% cprog [unknown] [.] 00000000 cprog [.] callme
3.96% cprog cprog [.] ctr_addr cprog [.] sw_4_1
3.94% cprog cprog [.] symbol2 cprog [.] hw_1_2
3.94% cprog cprog [.] success_3_1_3 cprog [.] sw_3_1
3.93% cprog cprog [.] sw_4_2 cprog [.] callme
2.08% cprog cprog [.] sw_3_2 cprog [.] callme
2.08% cprog cprog [.] callme cprog [.] sw_3_2
2.07% cprog cprog [.] hw_2_2 cprog [.] address2
2.07% cprog cprog [.] success_3_1_2 cprog [.] sw_3_1
2.07% cprog cprog [.] sw_3_1 cprog [.] success_3_1_2
2.07% cprog cprog [.] back2 cprog [.] callme
2.06% cprog cprog [.] hw_1_1 cprog [.] callme
1.99% cprog cprog [.] sw_4_1 cprog [.] ctr_addr
1.98% cprog cprog [.] sw_3_1_3 cprog [.] sw_3_1
1.98% cprog cprog [.] success_3_1_1 cprog [.] sw_3_1
1.98% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_3
1.98% cprog cprog [.] sw_3_1 cprog [.] success_3_1_1
1.98% cprog cprog [.] callme cprog [.] sw_4_2
1.98% cprog cprog [.] back1 cprog [.] callme
1.97% cprog cprog [.] hw_1_1 cprog [.] symbol1
1.97% cprog cprog [.] hw_2_1 cprog [.] address1
1.97% cprog cprog [.] sw_3_1_1 cprog [.] sw_3_1
1.97% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_1
1.97% cprog cprog [.] sw_3_1 cprog [.] success_3_1_3
1.97% cprog cprog [.] callme cprog [.] hw_1_1
1.97% cprog cprog [.] callme cprog [.] sw_3_1
1.97% cprog cprog [.] hw_1_2 cprog [.] symbol2
1.97% cprog cprog [.] hw_1_2 cprog [.] callme
1.97% cprog cprog [.] sw_4_1 cprog [.] callme
1.97% cprog cprog [.] callme cprog [.] main
1.97% cprog cprog [.] callme cprog [.] hw_1_2
1.96% cprog cprog [.] sw_3_1 cprog [.] callme
1.96% cprog cprog [.] sw_3_1_2 cprog [.] sw_3_1
1.96% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_2
0.12% cprog [unknown] [.] 0xf7ab23e0 [unknown] [.] 0xf7ab23c0
0.04% cprog [unknown] [k] 00000000 cprog [k] callme
0.01% cprog [unknown] [k] 00000000 cprog [k] sw_3_1
0.00% cprog libc-2.11.2.so [.] vfprintf libc-2.11.2.so [.] vfprintf
0.00% cprog libc-2.11.2.so [.] _IO_do_write libc-2.11.2.so [.] _IO_do_write
0.00% cprog libc-2.11.2.so [.] _IO_do_write libc-2.11.2.so [.] _IO_file_overflow
0.00% cprog libc-2.11.2.so [.] strchrnul libc-2.11.2.so [.] vfprintf
0.00% cprog libc-2.11.2.so [.] strchrnul libc-2.11.2.so [.] strchrnul
0.00% cprog cprog [.] callme cprog [.] hw_2_2
0.00% cprog [unknown] [.] 00000000 libc-2.11.2.so [.] _IO_do_write
(10) perf record -j any_call,cond,ind_call -e branch-misses:u ./cprog
# Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
# ........ ....... .................... ..................... .................... .....................
#
17.81% cprog [unknown] [.] 00000000 cprog [.] sw_3_1
7.19% cprog cprog [.] sw_4_2 cprog [.] lr_addr
7.12% cprog [unknown] [.] 00000000 cprog [.] callme
3.71% cprog cprog [.] sw_3_1 cprog [.] success_3_1_2
3.68% cprog cprog [.] callme cprog [.] sw_3_2
3.67% cprog cprog [.] hw_2_2 cprog [.] address2
3.57% cprog cprog [.] hw_2_1 cprog [.] address1
3.55% cprog cprog [.] hw_1_1 cprog [.] symbol1
3.55% cprog cprog [.] sw_3_1 cprog [.] success_3_1_1
3.55% cprog cprog [.] callme cprog [.] hw_1_1
3.54% cprog cprog [.] sw_3_1_1 cprog [.] sw_3_1
3.54% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_1
3.54% cprog cprog [.] sw_4_1 cprog [.] ctr_addr
3.54% cprog cprog [.] callme cprog [.] sw_3_1
3.52% cprog cprog [.] sw_3_1_3 cprog [.] sw_3_1
3.52% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_3
3.52% cprog cprog [.] sw_3_1 cprog [.] success_3_1_3
3.52% cprog cprog [.] sw_3_1_2 cprog [.] sw_3_1
3.52% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_2
3.51% cprog cprog [.] hw_1_2 cprog [.] symbol2
3.51% cprog cprog [.] callme cprog [.] hw_1_2
3.49% cprog cprog [.] callme cprog [.] sw_4_2
0.22% cprog [unknown] [.] 0xf7ca23f4 [unknown] [.] 0xf7ca25d0
0.05% cprog [unknown] [k] 00000000 cprog [k] callme
0.01% cprog libc-2.11.2.so [.] vfprintf libc-2.11.2.so [.] vfprintf
0.01% cprog libc-2.11.2.so [.] vfprintf libc-2.11.2.so [.] strchrnul
0.01% cprog libc-2.11.2.so [.] _IO_file_overflow libc-2.11.2.so [.] _IO_file_overflow
0.01% cprog libc-2.11.2.so [.] strchrnul libc-2.11.2.so [.] strchrnul
0.01% cprog [unknown] [.] 00000000 libc-2.11.2.so [.] _IO_file_overflow
0.01% cprog [unknown] [k] 00000000 cprog [k] sw_3_1
(11) perf record -j any_call,cond,any_ret,ind_call -e branch-misses:u ./cprog
# Overhead Command Source Shared Object Source Symbol Target Shared Object Target Symbol
# ........ ....... .................... ................. .................... ...................
#
9.72% cprog [unknown] [.] 00000000 cprog [.] sw_3_1
3.99% cprog cprog [.] ctr_addr cprog [.] sw_4_1
3.98% cprog cprog [.] success_3_1_3 cprog [.] sw_3_1
3.98% cprog cprog [.] symbol1 cprog [.] hw_1_1
3.98% cprog cprog [.] symbol2 cprog [.] hw_1_2
3.98% cprog cprog [.] sw_4_2 cprog [.] lr_addr
3.98% cprog cprog [.] sw_4_2 cprog [.] callme
3.97% cprog cprog [.] lr_addr cprog [.] sw_4_2
3.91% cprog [unknown] [.] 00000000 cprog [.] callme
2.22% cprog cprog [.] sw_4_1 cprog [.] ctr_addr
2.22% cprog cprog [.] callme cprog [.] sw_4_2
2.22% cprog cprog [.] hw_2_1 cprog [.] address1
2.22% cprog cprog [.] back1 cprog [.] callme
2.21% cprog cprog [.] hw_1_2 cprog [.] symbol2
2.21% cprog cprog [.] sw_3_1 cprog [.] callme
2.21% cprog cprog [.] callme cprog [.] hw_1_2
2.21% cprog cprog [.] sw_3_1_1 cprog [.] sw_3_1
2.21% cprog cprog [.] sw_3_1_3 cprog [.] sw_3_1
2.21% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_1
2.21% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_3
2.21% cprog cprog [.] callme cprog [.] sw_3_1
2.20% cprog cprog [.] hw_1_1 cprog [.] symbol1
2.20% cprog cprog [.] sw_3_1_2 cprog [.] sw_3_1
2.20% cprog cprog [.] sw_3_1 cprog [.] sw_3_1_2
2.20% cprog cprog [.] callme cprog [.] hw_1_1
1.77% cprog cprog [.] hw_1_1 cprog [.] callme
1.77% cprog cprog [.] success_3_1_1 cprog [.] sw_3_1
1.77% cprog cprog [.] sw_3_1 cprog [.] success_3_1_1
1.77% cprog cprog [.] success_3_1_2 cprog [.] sw_3_1
1.77% cprog cprog [.] sw_3_1 cprog [.] success_3_1_2
1.77% cprog cprog [.] sw_3_1 cprog [.] success_3_1_3
1.76% cprog cprog [.] hw_1_2 cprog [.] callme
1.76% cprog cprog [.] sw_4_1 cprog [.] callme
1.76% cprog cprog [.] sw_3_2 cprog [.] callme
1.76% cprog cprog [.] callme cprog [.] main
1.76% cprog cprog [.] callme cprog [.] sw_3_2
1.75% cprog cprog [.] hw_2_2 cprog [.] address2
1.75% cprog cprog [.] back2 cprog [.] callme
0.13% cprog [unknown] [.] 0xf7dd23e0 [unknown] [.] 0xf7dd23c0
0.07% cprog [unknown] [k] 00000000 cprog [k] callme
0.00% cprog libc-2.11.2.so [.] vfprintf libc-2.11.2.so [.] vfprintf
0.00% cprog libc-2.11.2.so [.] vfprintf libc-2.11.2.so [.] _IO_file_xsputn
0.00% cprog [unknown] [.] 00000000 libc-2.11.2.so [.] vfprintf
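The SW filters above presumably work by decoding the branch instruction behind
each BHRB entry and checking it against the requested filter bits. The sketch
below is NOT the code from this series. classify_branch() and the BR_* macros
are hypothetical names. The decoding follows the Power ISA encodings (primary
opcode 18 for b/bl, 16 for bc, 19 with extended opcode 16 for bclr and 528 for
bcctr). The mapping to filters follows the annotations in the test program
further down: LK set means call, bclr without LK means return, bclrl/bcctrl
means indirect call. Whether the series classifies every case exactly this way
is an assumption. Newer forms such as bctar are ignored.

```c
#include <stdint.h>

/* Filter bits as in include/uapi/linux/perf_event.h (COND per patch 01/10). */
#define BR_ANY      (1U << 3)
#define BR_ANY_CALL (1U << 4)
#define BR_ANY_RET  (1U << 5)
#define BR_IND_CALL (1U << 6)
#define BR_COND     (1U << 10)

/*
 * Hypothetical helper: return the set of filter bits that one 32-bit
 * PowerPC instruction word would satisfy, or 0 if it is not a branch.
 */
uint32_t classify_branch(uint32_t insn)
{
	uint32_t opcode = insn >> 26;		/* primary opcode */
	uint32_t bo = (insn >> 21) & 0x1f;	/* branch options (bc forms) */
	uint32_t xo = (insn >> 1) & 0x3ff;	/* extended opcode (XL-form) */
	uint32_t lk = insn & 1;			/* LK=1: LR is updated */
	uint32_t type = BR_ANY;

	switch (opcode) {
	case 18:				/* b, ba, bl, bla */
		break;
	case 16:				/* bc, bca, bcl, bcla */
		if ((bo & 0x14) != 0x14)	/* not "branch always" */
			type |= BR_COND;
		break;
	case 19:				/* bclr[l] or bcctr[l] */
		if (xo != 16 && xo != 528)
			return 0;		/* some other opcode-19 insn */
		if ((bo & 0x14) != 0x14)
			type |= BR_COND;
		if (lk)
			type |= BR_IND_CALL;	/* bclrl, bcctrl */
		else if (xo == 16)
			type |= BR_ANY_RET;	/* blr, bclr */
		break;
	default:
		return 0;			/* not a branch */
	}
	if (lk)
		type |= BR_ANY_CALL;		/* any branch that sets LR */
	return type;
}
```

A sample would then be kept when (classify_branch(insn) & requested_mask) is
non-zero. For instance, blr (0x4e800020) satisfies any_ret but not cond, while
"bclr 12, 2" (0x4d820020) satisfies both, which matches the sw_3_1_x entries
showing up under both "-j any_ret" and "-j cond".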
Test application program
========================
(1) Makefile:
--------------------------------------------
all: sample.o cprog of.cprog of.sample

sample.o: sample.s
	as -o sample.o sample.s

cprog: cprog.c sample.o
	gcc -o cprog cprog.c sample.o

of.sample: sample.o
	objdump -d sample.o > of.sample

of.cprog: cprog
	objdump -d cprog > of.cprog

clean:
	rm sample.o cprog of.sample of.cprog
---------------------------------------------
(2) cprog.c
---------------------------------------------
#include <stdio.h>

#define LOOP_COUNT 10000

extern void callme(void);

int main(int argc, char *argv[])
{
	int i;

	for (i = 0; i < LOOP_COUNT; i++)
		callme();

	printf("end");
	return 0;
}
---------------------------------------------
(3) sample.s
---------------------------------------------
# r25, r26, r27 will be used as first level, second level
# and third level stack for LR. Registers r20, r21, r22, r23
# and r24 will be used as general purpose scratch registers.
.data
msg:
.string "BHRB filter tests\n"
len = . - msg
msg_1_1:
.string "Test: hw_1_1\n"
len_1_1 = 13
msg_1_2:
.string "Test: hw_1_2\n"
len_1_2 = 13
msg_2_1:
.string "Test: hw_2_1\n"
len_2_1 = 13
msg_2_2:
.string "Test: hw_2_2\n"
len_2_2 = 13
msg_3_1:
.string "Test: sw_3_1\n"
len_3_1 = 13
msg_3_1_1:
.string "Test: sw_3_1_1\n"
len_3_1_1 = 15
msg_3_1_2:
.string "Test: sw_3_1_2\n"
len_3_1_2 = 15
msg_3_1_3:
.string "Test: sw_3_1_3\n"
len_3_1_3 = 15
msg_3_2:
.string "Test: sw_3_2\n"
len_3_2 = 13
msg_4_1:
.string "Test: sw_4_1\n"
len_4_1 = 13
msg_4_2:
.string "Test: sw_4_2\n"
len_4_2 = 13
hw_3_1_1_passed:
.string "\thw_3_1_1_passed\n\n"
len_hw_3_1_1_passed = 18
hw_3_1_2_passed:
.string "\thw_3_1_2_passed\n\n"
len_hw_3_1_2_passed = 18
hw_3_1_3_passed:
.string "\thw_3_1_3_passed\n\n"
len_hw_3_1_3_passed = 18
hw_2_1_passed:
.string "\thw_2_1_passed\n\n"
len_hw_2_1_passed = 16
hw_2_2_passed:
.string "\thw_2_2_passed\n\n"
len_hw_2_2_passed = 16
hw_1_1_passed:
.string "\thw_1_1_passed\n\n"
len_hw_1_1_passed = 16
hw_1_2_passed:
.string "\thw_1_2_passed\n\n"
len_hw_1_2_passed = 16
hw_4_1_passed:
.string "\thw_4_1_passed\n\n"
len_hw_4_1_passed = 16
hw_4_2_passed:
.string "\thw_4_2_passed\n\n"
len_hw_4_2_passed = 16
msg_error:
.string "\tError\n"
len_error = 7
.text
.global callme
.global hw_1_1
.global hw_1_2
.global hw_2_1
.global hw_2_2
# HW filter test symbols
symbol1:
# Print "hw_1_1_passed"
li 0, 4
li 3, 1
lis 4, hw_1_1_passed@ha
addi 4, 4, hw_1_1_passed@l
li 5, len_hw_1_1_passed
sc
blr # PERF_SAMPLE_BRANCH_ANY_RET
hw_1_1:
# Save LR - second level
mflr 26
# Print "hw_1_1 called"
li 0, 4
li 3, 1
lis 4, msg_1_1@ha
addi 4, 4, msg_1_1@l
li 5, len_1_1
sc
bl symbol1 # PERF_SAMPLE_BRANCH_ANY_CALL
# Restore LR
mtlr 26
blr # PERF_SAMPLE_BRANCH_ANY_RET
symbol2:
# Print "Symbol2 taken"
li 0, 4
li 3, 1
lis 4, hw_1_2_passed@ha
addi 4, 4, hw_1_2_passed@l
li 5, len_hw_1_2_passed
sc
blr # PERF_SAMPLE_BRANCH_ANY_RET
hw_1_2:
# Save LR - second level
mflr 26
# Print "hw_1_2 called"
li 0, 4
li 3, 1
lis 4, msg_1_2@ha
addi 4, 4, msg_1_2@l
li 5, len_1_2
sc
li 4,20
cmpi 0,4,20
bcl 12, 4*cr0+2, symbol2 # PERF_SAMPLE_BRANCH_ANY_CALL | PERF_SAMPLE_BRANCH_COND
mtlr 26
blr # PERF_SAMPLE_BRANCH_ANY_RET
# HW filter test
address1:
# Print "hw_2_1_passed"
li 0, 4
li 3, 1
lis 4, hw_2_1_passed@ha
addi 4, 4, hw_2_1_passed@l
li 5, len_hw_2_1_passed
sc
b back1 # PERF_SAMPLE_BRANCH_ANY
hw_2_1:
# Print "hw_2_1 called"
li 0, 4
li 3, 1
lis 4, msg_2_1@ha
addi 4, 4, msg_2_1@l
li 5, len_2_1
sc
# Simple conditional branch (equal)
li 20, 12
cmpi 3, 20, 12
bc 12, 4*cr3+2, address1 # PERF_SAMPLE_BRANCH_COND
back1:
blr # PERF_SAMPLE_BRANCH_ANY_RET
address2:
# Print "hw_2_2_passed"
li 0, 4
li 3, 1
lis 4, hw_2_2_passed@ha
addi 4, 4, hw_2_2_passed@l
li 5, len_hw_2_2_passed
sc
b back2 # PERF_SAMPLE_BRANCH_ANY
hw_2_2:
# Print "hw_2_2 called"
li 0, 4
li 3, 1
lis 4, msg_2_2@ha
addi 4, 4, msg_2_2@l
li 5, len_2_2
sc
# Simple conditional branch (less than)
li 20, 12
cmpi 4, 20, 20
bc 12, 4*cr4+0, address2 # PERF_SAMPLE_BRANCH_COND
back2:
blr # PERF_SAMPLE_BRANCH_ANY_RET
# SW filter test symbols
sw_3_1_1:
# Print "Test: sw_3_1_1"
li 0, 4
li 3, 1
lis 4, msg_3_1_1@ha
addi 4, 4, msg_3_1_1@l
li 5, len_3_1_1
sc
li 22,0
# Test the condition and return
li 21, 10
cmpi 0, 21, 10
bclr 12, 2 # PERF_SAMPLE_BRANCH_ANY_RET | PERF_SAMPLE_BRANCH_COND
# Should not have come here
li 0, 4
li 3, 1
lis 4, msg_error@ha
addi 4, 4, msg_error@l
li 5, len_error
sc
# Mark the error
li 22, 1
# Safe fall back
blr # PERF_SAMPLE_BRANCH_ANY_RET
sw_3_1_2:
# Print "Test: sw_3_1_2"
li 0, 4
li 3, 1
lis 4, msg_3_1_2@ha
addi 4, 4, msg_3_1_2@l
li 5, len_3_1_2
sc
li 23, 0
# Test the condition and return
li 21, 10
cmpi 0, 21, 20
bclr 12, 0 # PERF_SAMPLE_BRANCH_ANY_RET | PERF_SAMPLE_BRANCH_COND
# Should not have come here
li 0, 4
li 3, 1
lis 4, msg_error@ha
addi 4, 4, msg_error@l
li 5, len_error
sc
# Mark the error
li 23, 1
# Safe fall back
blr # PERF_SAMPLE_BRANCH_ANY_RET
sw_3_1_3:
# Print "Test: sw_3_1_3"
li 0, 4
li 3, 1
lis 4, msg_3_1_3@ha
addi 4, 4, msg_3_1_3@l
li 5, len_3_1_3
sc
li 24, 0
# Test the condition and return
li 21, 10
cmpi 0, 21, 5
bclr 12, 1 # PERF_SAMPLE_BRANCH_ANY_RET | PERF_SAMPLE_BRANCH_COND
# Mark the error
li 24, 1
# Should not have come here
li 0, 4
li 3, 1
lis 4, msg_error@ha
addi 4, 4, msg_error@l
li 5, len_error
sc
# Safe fall back
blr # PERF_SAMPLE_BRANCH_ANY_RET
success_3_1_1:
li 0, 4
li 3, 1
lis 4, hw_3_1_1_passed@ha
addi 4, 4, hw_3_1_1_passed@l
li 5, len_hw_3_1_1_passed
sc
blr
success_3_1_2:
li 0, 4
li 3, 1
lis 4, hw_3_1_2_passed@ha
addi 4, 4, hw_3_1_2_passed@l
li 5, len_hw_3_1_2_passed
sc
blr
success_3_1_3:
li 0, 4
li 3, 1
lis 4, hw_3_1_3_passed@ha
addi 4, 4, hw_3_1_3_passed@l
li 5, len_hw_3_1_3_passed
sc
blr
sw_3_1:
# Save LR
mflr 26
# Print "Test: sw_3_1"
li 0, 4
li 3, 1
lis 4, msg_3_1@ha
addi 4, 4, msg_3_1@l
li 5, len_3_1
sc
# Equal comparison condition
bl sw_3_1_1 # PERF_SAMPLE_BRANCH_ANY_CALL
cmpi 0, 22, 0
bcl 12, 2, success_3_1_1 # PERF_SAMPLE_BRANCH_ANY_CALL | PERF_SAMPLE_BRANCH_COND
# LT comparison condition
bl sw_3_1_2 # PERF_SAMPLE_BRANCH_ANY_CALL
cmpi 0, 23, 0
bcl 12, 2, success_3_1_2 # PERF_SAMPLE_BRANCH_ANY_CALL | PERF_SAMPLE_BRANCH_COND
# GT comparison condition
bl sw_3_1_3 # PERF_SAMPLE_BRANCH_ANY_CALL
cmpi 0, 24, 0
bcl 12, 2, success_3_1_3 # PERF_SAMPLE_BRANCH_ANY_CALL | PERF_SAMPLE_BRANCH_COND
mtlr 26
blr # PERF_SAMPLE_BRANCH_ANY_RET
sw_3_2:
# Print "Test: sw_3_2"
li 0, 4
li 3, 1
lis 4, msg_3_2@ha
addi 4, 4, msg_3_2@l
li 5, len_3_1
sc
# FIXME: Anything more here ?
blr # PERF_SAMPLE_BRANCH_ANY_RET
# Indirect call tests
# CTR
ctr_addr:
# Print "bcctr taken"
li 0, 4
li 3, 1
lis 4, hw_4_1_passed@ha
addi 4, 4, hw_4_1_passed@l
li 5, len_hw_4_1_passed
sc
blr # PERF_SAMPLE_BRANCH_ANY_RET
sw_4_1:
# Save LR
mflr 26
# Print "sw_4_1 called"
li 0, 4
li 3, 1
lis 4, msg_4_1@ha
addi 4, 4, msg_4_1@l
li 5, len_4_1
sc
# Save address in CTR
lis 20, ctr_addr@ha
addi 20, 20, ctr_addr@l
mtctr 20
# Compare and jump to CTR
li 21, 10
cmpi 0, 21, 10
bcctrl 12, 4*cr0+2 # PERF_SAMPLE_BRANCH_IND_CALL
mtlr 26
blr # PERF_SAMPLE_BRANCH_ANY_RET
# LR
lr_addr:
# Print "bclrl taken"
li 0, 4
li 3, 1
lis 4, hw_4_2_passed@ha
addi 4, 4, hw_4_2_passed@l
li 5, len_hw_4_2_passed
sc
blr # PERF_SAMPLE_BRANCH_ANY_RET
sw_4_2:
# Save LR
mflr 26
# Print "Test: sw_4_2"
li 0, 4
li 3, 1
lis 4, msg_4_2@ha
addi 4, 4, msg_4_2@l
li 5, len_4_2
sc
# Save address in LR
lis 20, lr_addr@ha
addi 20, 20, lr_addr@l
mtlr 20
# Compare and jump to LR
li 21, 10
cmpi 0, 21, 10
bclrl 12, 4*cr0+2 # PERF_SAMPLE_BRANCH_IND_CALL
# Restore LR
mtlr 26
blr # PERF_SAMPLE_BRANCH_ANY_RET
callme:
# Save LR
mflr 25
# Print "Branch filter Test"
li 0, 4
li 3, 1
lis 4, msg@ha
addi 4, 4, msg@l
li 5, len
sc
# PERF_SAMPLE_BRANCH_ANY_CALL
bl hw_1_1 # PERF_SAMPLE_BRANCH_ANY_CALL
bl hw_1_2 # PERF_SAMPLE_BRANCH_ANY_CALL
# PERF_SAMPLE_BRANCH_COND
bl hw_2_1 # PERF_SAMPLE_BRANCH_ANY_CALL
bl hw_2_2 # PERF_SAMPLE_BRANCH_ANY_CALL
# PERF_SAMPLE_BRANCH_ANY_RET
bl sw_3_1 # PERF_SAMPLE_BRANCH_ANY_CALL
bl sw_3_2 # PERF_SAMPLE_BRANCH_ANY_CALL
# PERF_SAMPLE_BRANCH_IND_CALL
bl sw_4_1 # PERF_SAMPLE_BRANCH_ANY_CALL
bl sw_4_2 # PERF_SAMPLE_BRANCH_ANY_CALL
# Restore LR
mtlr 25
blr # PERF_SAMPLE_BRANCH_ANY_RET
--------------------------------------------------------------------
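The conditional branches in the listing above take a BO/BI operand pair, for
example "bc 12, 4*cr3+2, address1". BO=12 means "branch if the selected CR bit
is set", and BI = 4*crN + bit picks bit LT(0), GT(1), EQ(2) or SO(3) of CR
field N, so a bare "2" is the same as "4*cr0+2". A small sketch of how these
operands land in a B-form instruction word, per the Power ISA layout, is given
below. bi_operand() and encode_bc() are hypothetical helpers written only for
this illustration, and absolute addressing (AA) is left out.

```c
#include <stdint.h>

/* Bit positions within one 4-bit CR field. */
enum cr_bit { CR_LT = 0, CR_GT = 1, CR_EQ = 2, CR_SO = 3 };

/* BI operand as written in the assembly: 4*crN + bit. */
uint32_t bi_operand(uint32_t cr_field, enum cr_bit bit)
{
	return 4 * cr_field + (uint32_t)bit;
}

/*
 * Encode a relative B-form conditional branch (bc / bcl).
 * disp is the byte displacement to the target and must be a
 * multiple of 4; lk selects the LR-updating form (bcl).
 */
uint32_t encode_bc(uint32_t bo, uint32_t bi, int32_t disp, int lk)
{
	return (16u << 26) |			/* primary opcode 16 */
	       ((bo & 0x1f) << 21) |		/* branch options */
	       ((bi & 0x1f) << 16) |		/* CR bit to test */
	       ((uint32_t)disp & 0xfffc) |	/* 14-bit word displacement */
	       (lk ? 1u : 0u);			/* LK */
}
```

For instance, encode_bc(12, bi_operand(0, CR_EQ), 8, 0) gives 0x41820008,
which is the familiar "beq" to eight bytes ahead.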
Anshuman Khandual (10):
perf: New conditional branch filter criteria in branch stack sampling
powerpc, perf: Enable conditional branch filter for POWER8
perf, tool: Conditional branch filter 'cond' added to perf record
x86, perf: Add conditional branch filtering support
perf, documentation: Description for conditional branch filter
powerpc, perf: Change the name of HW PMU branch filter tracking variable
powerpc, lib: Add new branch instruction analysis support functions
powerpc, perf: Enable SW filtering in branch stack sampling framework
power8, perf: Change BHRB branch filter configuration
powerpc, perf: Cleanup SW branch filter list look up
arch/powerpc/include/asm/code-patching.h | 30 ++++
arch/powerpc/include/asm/perf_event_server.h | 6 +-
arch/powerpc/lib/code-patching.c | 54 +++++-
arch/powerpc/perf/core-book3s.c | 260 +++++++++++++++++++++++++--
arch/powerpc/perf/power8-pmu.c | 75 ++++++--
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 5 +
include/uapi/linux/perf_event.h | 3 +-
tools/perf/Documentation/perf-record.txt | 3 +-
tools/perf/builtin-record.c | 1 +
9 files changed, 404 insertions(+), 33 deletions(-)
--
1.7.11.7
* [V3 01/10] perf: New conditional branch filter criteria in branch stack sampling
From: Anshuman Khandual @ 2013-10-16 6:56 UTC
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, michaele, eranian
POWER8 PMU based BHRB supports filtering for conditional branches.
This patch introduces a new branch filter, PERF_SAMPLE_BRANCH_COND, which
extends the existing perf ABI. Other architectures can provide
this functionality with either HW filtering support (if present) or
with SW filtering of instructions.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
---
include/uapi/linux/perf_event.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 0b1df41..5da52b6 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -160,8 +160,9 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
PERF_SAMPLE_BRANCH_IN_TX = 1U << 8, /* in transaction */
PERF_SAMPLE_BRANCH_NO_TX = 1U << 9, /* not in transaction */
+ PERF_SAMPLE_BRANCH_COND = 1U << 10, /* conditional branches */
- PERF_SAMPLE_BRANCH_MAX = 1U << 10, /* non-ABI */
+ PERF_SAMPLE_BRANCH_MAX = 1U << 11, /* non-ABI */
};
#define PERF_SAMPLE_BRANCH_PLM_ALL \
--
1.7.11.7
^ permalink raw reply related [flat|nested] 17+ messages in thread
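The effect of the new bit on the valid ABI range in patch 01/10 can be sketched in userspace as below. This is a minimal illustration under stated assumptions, not kernel code: the `SK_` names are local stand-ins for the `PERF_SAMPLE_BRANCH_*` values shown in the diff, and `sk_branch_mask_in_range()` is a hypothetical helper showing how a sentinel `MAX` value bounds the accepted bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins mirroring the bit positions shown in the diff. */
enum sk_branch_sample_type {
	SK_BRANCH_NO_TX = 1U << 9,	/* last filter bit before this patch */
	SK_BRANCH_COND  = 1U << 10,	/* new: conditional branches */
	SK_BRANCH_MAX   = 1U << 11,	/* sentinel, not a filter itself */
};

/* Hypothetical helper: true if mask is non-empty and lies below the sentinel. */
static bool sk_branch_mask_in_range(uint64_t mask, uint64_t max)
{
	/* The sentinel must be a single bit for the range test to work. */
	assert(max != 0 && (max & (max - 1)) == 0);
	return mask != 0 && (mask & ~(max - 1)) == 0;
}
```

With the old sentinel (`1U << 10`) a request carrying the `cond` bit falls outside the range; with the sentinel moved to `1U << 11` it is accepted, which is why the patch has to bump `PERF_SAMPLE_BRANCH_MAX` together with adding the new flag.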
* [V3 02/10] powerpc, perf: Enable conditional branch filter for POWER8
2013-10-16 6:56 [V3 00/10] perf: New conditional branch filter Anshuman Khandual
2013-10-16 6:56 ` [V3 01/10] perf: New conditional branch filter criteria in branch stack sampling Anshuman Khandual
@ 2013-10-16 6:56 ` Anshuman Khandual
2013-11-26 6:06 ` mpe@ellerman.id.au
2013-10-16 6:56 ` [V3 03/10] perf, tool: Conditional branch filter 'cond' added to perf record Anshuman Khandual
` (8 subsequent siblings)
10 siblings, 1 reply; 17+ messages in thread
From: Anshuman Khandual @ 2013-10-16 6:56 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, michaele, eranian
Enables conditional branch filter support for POWER8
utilizing MMCRA register based filter and also invalidates
a BHRB branch filter combination involving conditional
branches.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/perf/power8-pmu.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 2ee4a70..6e28587 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -580,11 +580,21 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type)
if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
return -1;
+ /* Invalid branch filter combination - HW does not support */
+ if ((branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) &&
+ (branch_sample_type & PERF_SAMPLE_BRANCH_COND))
+ return -1;
+
if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
pmu_bhrb_filter |= POWER8_MMCRA_IFM1;
return pmu_bhrb_filter;
}
+ if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
+ pmu_bhrb_filter |= POWER8_MMCRA_IFM3;
+ return pmu_bhrb_filter;
+ }
+
/* Every thing else is unsupported */
return -1;
}
--
1.7.11.7
^ permalink raw reply related [flat|nested] 17+ messages in thread
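The decision order of power8_bhrb_filter_map() as of patch 02/10 can be sketched in userspace as below. This is an illustration only: the `SK_BR_*` values are local stand-ins for the perf flag bits, and `SK_IFM1`/`SK_IFM3` are placeholder encodings, since the real POWER8_MMCRA_IFM* values are not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the perf branch filter flag bits. */
#define SK_BR_ANY		(1ULL << 3)
#define SK_BR_ANY_CALL		(1ULL << 4)
#define SK_BR_ANY_RETURN	(1ULL << 5)
#define SK_BR_IND_CALL		(1ULL << 6)
#define SK_BR_COND		(1ULL << 10)

/* Placeholder register encodings (assumed, not the real MMCRA values). */
#define SK_IFM1			0x1ULL
#define SK_IFM3			0x3ULL

/* What "return -1" turns into for a u64 return type. */
#define SK_UNSUPPORTED		UINT64_MAX

static uint64_t sk_p8_filter_map(uint64_t type)
{
	if (type & SK_BR_ANY)
		return 0;			/* no filtering requested */
	if (type & (SK_BR_ANY_RETURN | SK_BR_IND_CALL))
		return SK_UNSUPPORTED;		/* no HW filter for these */
	if ((type & SK_BR_ANY_CALL) && (type & SK_BR_COND))
		return SK_UNSUPPORTED;		/* HW cannot OR two filters */
	if (type & SK_BR_ANY_CALL)
		return SK_IFM1;
	if (type & SK_BR_COND)
		return SK_IFM3;
	return SK_UNSUPPORTED;			/* everything else */
}
```

The combination check has to come before the two single-filter checks; otherwise a request for both filters would silently be served as `any_call` alone.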
* [V3 03/10] perf, tool: Conditional branch filter 'cond' added to perf record
2013-10-16 6:56 [V3 00/10] perf: New conditional branch filter Anshuman Khandual
2013-10-16 6:56 ` [V3 01/10] perf: New conditional branch filter criteria in branch stack sampling Anshuman Khandual
2013-10-16 6:56 ` [V3 02/10] powerpc, perf: Enable conditional branch filter for POWER8 Anshuman Khandual
@ 2013-10-16 6:56 ` Anshuman Khandual
2013-10-16 6:56 ` [V3 04/10] x86, perf: Add conditional branch filtering support Anshuman Khandual
` (7 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Anshuman Khandual @ 2013-10-16 6:56 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, michaele, eranian
Adding perf record support for new branch stack filter criteria
PERF_SAMPLE_BRANCH_COND.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
---
tools/perf/builtin-record.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index ecca62e..802d11d 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -625,6 +625,7 @@ static const struct branch_mode branch_modes[] = {
BRANCH_OPT("any_call", PERF_SAMPLE_BRANCH_ANY_CALL),
BRANCH_OPT("any_ret", PERF_SAMPLE_BRANCH_ANY_RETURN),
BRANCH_OPT("ind_call", PERF_SAMPLE_BRANCH_IND_CALL),
+ BRANCH_OPT("cond", PERF_SAMPLE_BRANCH_COND),
BRANCH_END
};
--
1.7.11.7
^ permalink raw reply related [flat|nested] 17+ messages in thread
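How a name table like the one extended in patch 03/10 turns a comma-separated filter string into a flag mask can be sketched as below. This is a simplified stand-in, not the perf tool's parser: `sk_parse_branch_filter()` is a hypothetical function, the `SK_BR_*` values are local copies of the flag bits, and the table lists only the branch types named in the documentation (privilege levels are left out).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SK_BR_ANY		(1ULL << 3)
#define SK_BR_ANY_CALL		(1ULL << 4)
#define SK_BR_ANY_RETURN	(1ULL << 5)
#define SK_BR_IND_CALL		(1ULL << 6)
#define SK_BR_COND		(1ULL << 10)

struct sk_branch_mode {
	const char *name;
	uint64_t mode;
};

static const struct sk_branch_mode sk_branch_modes[] = {
	{ "any",	SK_BR_ANY },
	{ "any_call",	SK_BR_ANY_CALL },
	{ "any_ret",	SK_BR_ANY_RETURN },
	{ "ind_call",	SK_BR_IND_CALL },
	{ "cond",	SK_BR_COND },	/* the entry this patch adds */
	{ NULL,		0 }		/* terminator, in the role of BRANCH_END */
};

/* Returns the OR of all named filters, or 0 if any name is unknown. */
static uint64_t sk_parse_branch_filter(const char *str)
{
	uint64_t mask = 0;

	assert(str != NULL);
	while (*str) {
		size_t len = strcspn(str, ",");	/* length of current token */
		const struct sk_branch_mode *m;

		for (m = sk_branch_modes; m->name; m++) {
			if (strlen(m->name) == len && !strncmp(m->name, str, len))
				break;
		}
		if (!m->name)
			return 0;		/* unknown or empty token */
		mask |= m->mode;
		str += len;
		if (*str == ',')
			str++;
	}
	return mask;
}
```

Comparing the token length before the characters keeps "any" from matching as a prefix of "any_call".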
* [V3 04/10] x86, perf: Add conditional branch filtering support
2013-10-16 6:56 [V3 00/10] perf: New conditional branch filter Anshuman Khandual
` (2 preceding siblings ...)
2013-10-16 6:56 ` [V3 03/10] perf, tool: Conditional branch filter 'cond' added to perf record Anshuman Khandual
@ 2013-10-16 6:56 ` Anshuman Khandual
2013-10-16 6:56 ` [V3 05/10] perf, documentation: Description for conditional branch filter Anshuman Khandual
` (6 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Anshuman Khandual @ 2013-10-16 6:56 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, michaele, eranian
This patch adds conditional branch filtering support,
enabling it for PERF_SAMPLE_BRANCH_COND in perf branch
stack sampling framework by utilizing an available
software filter X86_BR_JCC.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
---
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index d5be06a..9723773 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -371,6 +371,9 @@ static void intel_pmu_setup_sw_lbr_filter(struct perf_event *event)
if (br_type & PERF_SAMPLE_BRANCH_NO_TX)
mask |= X86_BR_NO_TX;
+ if (br_type & PERF_SAMPLE_BRANCH_COND)
+ mask |= X86_BR_JCC;
+
/*
* stash actual user request into reg, it may
* be used by fixup code for some CPU
@@ -665,6 +668,7 @@ static const int nhm_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
* NHM/WSM erratum: must include IND_JMP to capture IND_CALL
*/
[PERF_SAMPLE_BRANCH_IND_CALL] = LBR_IND_CALL | LBR_IND_JMP,
+ [PERF_SAMPLE_BRANCH_COND] = LBR_JCC,
};
static const int snb_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
@@ -676,6 +680,7 @@ static const int snb_lbr_sel_map[PERF_SAMPLE_BRANCH_MAX] = {
[PERF_SAMPLE_BRANCH_ANY_CALL] = LBR_REL_CALL | LBR_IND_CALL
| LBR_FAR,
[PERF_SAMPLE_BRANCH_IND_CALL] = LBR_IND_CALL,
+ [PERF_SAMPLE_BRANCH_COND] = LBR_JCC,
};
/* core */
--
1.7.11.7
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [V3 05/10] perf, documentation: Description for conditional branch filter
2013-10-16 6:56 [V3 00/10] perf: New conditional branch filter Anshuman Khandual
` (3 preceding siblings ...)
2013-10-16 6:56 ` [V3 04/10] x86, perf: Add conditional branch filtering support Anshuman Khandual
@ 2013-10-16 6:56 ` Anshuman Khandual
2013-10-16 6:56 ` [V3 06/10] powerpc, perf: Change the name of HW PMU branch filter tracking variable Anshuman Khandual
` (5 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Anshuman Khandual @ 2013-10-16 6:56 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, michaele, eranian
Adding documentation support for conditional branch filter.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
---
tools/perf/Documentation/perf-record.txt | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index e297b74..59ca8d0 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -163,12 +163,13 @@ following filters are defined:
- any_call: any function call or system call
- any_ret: any function return or system call return
- ind_call: any indirect branch
+ - cond: conditional branches
- u: only when the branch target is at the user level
- k: only when the branch target is in the kernel
- hv: only when the target is at the hypervisor level
+
-The option requires at least one branch type among any, any_call, any_ret, ind_call.
+The option requires at least one branch type among any, any_call, any_ret, ind_call, cond.
The privilege levels may be omitted, in which case, the privilege levels of the associated
event are applied to the branch filter. Both kernel (k) and hypervisor (hv) privilege
levels are subject to permissions. When sampling on multiple events, branch stack sampling
--
1.7.11.7
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [V3 06/10] powerpc, perf: Change the name of HW PMU branch filter tracking variable
2013-10-16 6:56 [V3 00/10] perf: New conditional branch filter Anshuman Khandual
` (4 preceding siblings ...)
2013-10-16 6:56 ` [V3 05/10] perf, documentation: Description for conditional branch filter Anshuman Khandual
@ 2013-10-16 6:56 ` Anshuman Khandual
2013-10-16 6:56 ` [V3 07/10] powerpc, lib: Add new branch instruction analysis support functions Anshuman Khandual
` (4 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Anshuman Khandual @ 2013-10-16 6:56 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, michaele, eranian
This patch simply changes the name of the variable from "bhrb_filter" to
"bhrb_hw_filter" in order to add one more variable which will track SW
filters in generic powerpc book3s code which will be implemented in the
subsequent patch.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/perf/core-book3s.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index eeae308..bc4dac7 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -47,7 +47,7 @@ struct cpu_hw_events {
int n_txn_start;
/* BHRB bits */
- u64 bhrb_filter; /* BHRB HW branch filter */
+ u64 bhrb_hw_filter; /* BHRB HW branch filter */
int bhrb_users;
void *bhrb_context;
struct perf_branch_stack bhrb_stack;
@@ -1159,7 +1159,7 @@ static void power_pmu_enable(struct pmu *pmu)
out:
if (cpuhw->bhrb_users)
- ppmu->config_bhrb(cpuhw->bhrb_filter);
+ ppmu->config_bhrb(cpuhw->bhrb_hw_filter);
local_irq_restore(flags);
}
@@ -1254,7 +1254,7 @@ nocheck:
out:
if (has_branch_stack(event)) {
power_pmu_bhrb_enable(event);
- cpuhw->bhrb_filter = ppmu->bhrb_filter_map(
+ cpuhw->bhrb_hw_filter = ppmu->bhrb_filter_map(
event->attr.branch_sample_type);
}
@@ -1637,10 +1637,10 @@ static int power_pmu_event_init(struct perf_event *event)
err = power_check_constraints(cpuhw, events, cflags, n + 1);
if (has_branch_stack(event)) {
- cpuhw->bhrb_filter = ppmu->bhrb_filter_map(
+ cpuhw->bhrb_hw_filter = ppmu->bhrb_filter_map(
event->attr.branch_sample_type);
- if(cpuhw->bhrb_filter == -1)
+ if(cpuhw->bhrb_hw_filter == -1)
return -EOPNOTSUPP;
}
--
1.7.11.7
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [V3 07/10] powerpc, lib: Add new branch instruction analysis support functions
2013-10-16 6:56 [V3 00/10] perf: New conditional branch filter Anshuman Khandual
` (5 preceding siblings ...)
2013-10-16 6:56 ` [V3 06/10] powerpc, perf: Change the name of HW PMU branch filter tracking variable Anshuman Khandual
@ 2013-10-16 6:56 ` Anshuman Khandual
2013-10-16 6:56 ` [V3 08/10] powerpc, perf: Enable SW filtering in branch stack sampling framework Anshuman Khandual
` (3 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Anshuman Khandual @ 2013-10-16 6:56 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, michaele, eranian
Generic powerpc branch instruction analysis support added in the code
patching library which will help the subsequent patch on SW based
filtering of branch records in perf. This patch also converts and
exports some of the existing local static functions through the header
file to be used elsewhere.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/code-patching.h | 30 ++++++++++++++++++
arch/powerpc/lib/code-patching.c | 54 ++++++++++++++++++++++++++++++--
2 files changed, 82 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
index a6f8c7a..8bab417 100644
--- a/arch/powerpc/include/asm/code-patching.h
+++ b/arch/powerpc/include/asm/code-patching.h
@@ -22,6 +22,36 @@
#define BRANCH_SET_LINK 0x1
#define BRANCH_ABSOLUTE 0x2
+#define XL_FORM_LR 0x4C000020
+#define XL_FORM_CTR 0x4C000420
+#define XL_FORM_TAR 0x4C000460
+
+#define BO_ALWAYS 0x02800000
+#define BO_CTR 0x02000000
+#define BO_CRBI_OFF 0x00800000
+#define BO_CRBI_ON 0x01800000
+#define BO_CRBI_HINT 0x00400000
+
+/* Forms of branch instruction */
+int instr_is_branch_iform(unsigned int instr);
+int instr_is_branch_bform(unsigned int instr);
+int instr_is_branch_xlform(unsigned int instr);
+
+/* Classification of XL-form instruction */
+int is_xlform_lr(unsigned int instr);
+int is_xlform_ctr(unsigned int instr);
+int is_xlform_tar(unsigned int instr);
+
+/* Branch instruction is a call */
+int is_branch_link_set(unsigned int instr);
+
+/* BO field analysis (B-form or XL-form) */
+int is_bo_always(unsigned int instr);
+int is_bo_ctr(unsigned int instr);
+int is_bo_crbi_off(unsigned int instr);
+int is_bo_crbi_on(unsigned int instr);
+int is_bo_crbi_hint(unsigned int instr);
+
unsigned int create_branch(const unsigned int *addr,
unsigned long target, int flags);
unsigned int create_cond_branch(const unsigned int *addr,
diff --git a/arch/powerpc/lib/code-patching.c b/arch/powerpc/lib/code-patching.c
index 17e5b23..cb62bd8 100644
--- a/arch/powerpc/lib/code-patching.c
+++ b/arch/powerpc/lib/code-patching.c
@@ -77,16 +77,66 @@ static unsigned int branch_opcode(unsigned int instr)
return (instr >> 26) & 0x3F;
}
-static int instr_is_branch_iform(unsigned int instr)
+int instr_is_branch_iform(unsigned int instr)
{
return branch_opcode(instr) == 18;
}
-static int instr_is_branch_bform(unsigned int instr)
+int instr_is_branch_bform(unsigned int instr)
{
return branch_opcode(instr) == 16;
}
+int instr_is_branch_xlform(unsigned int instr)
+{
+ return branch_opcode(instr) == 19;
+}
+
+int is_xlform_lr(unsigned int instr)
+{
+ return (instr & XL_FORM_LR) == XL_FORM_LR;
+}
+
+int is_xlform_ctr(unsigned int instr)
+{
+ return (instr & XL_FORM_CTR) == XL_FORM_CTR;
+}
+
+int is_xlform_tar(unsigned int instr)
+{
+ return (instr & XL_FORM_TAR) == XL_FORM_TAR;
+}
+
+int is_branch_link_set(unsigned int instr)
+{
+ return (instr & BRANCH_SET_LINK) == BRANCH_SET_LINK;
+}
+
+int is_bo_always(unsigned int instr)
+{
+ return (instr & BO_ALWAYS) == BO_ALWAYS;
+}
+
+int is_bo_ctr(unsigned int instr)
+{
+ return (instr & BO_CTR) == BO_CTR;
+}
+
+int is_bo_crbi_off(unsigned int instr)
+{
+ return (instr & BO_CRBI_OFF) == BO_CRBI_OFF;
+}
+
+int is_bo_crbi_on(unsigned int instr)
+{
+ return (instr & BO_CRBI_ON) == BO_CRBI_ON;
+}
+
+int is_bo_crbi_hint(unsigned int instr)
+{
+ return (instr & BO_CRBI_HINT) == BO_CRBI_HINT;
+}
+
int instr_is_relative_branch(unsigned int instr)
{
if (instr & BRANCH_ABSOLUTE)
--
1.7.11.7
^ permalink raw reply related [flat|nested] 17+ messages in thread
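The helpers added in patch 07/10 can be tried out in userspace on a few well-known instruction encodings. The sketch below copies the mask values from the diff and reimplements a subset of the helpers under `sk_` names (hypothetical names, chosen to avoid suggesting these are the kernel functions); the sample encodings are the standard ones for blr (0x4E800020), bctr (0x4E800420), bctrl (0x4E800421), bl (0x48000001) and beq (0x41820008).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask values copied from the code-patching.h hunk. */
#define BRANCH_SET_LINK	0x1
#define XL_FORM_LR	0x4C000020
#define XL_FORM_CTR	0x4C000420
#define BO_ALWAYS	0x02800000

/* Primary opcode lives in the top six bits of the instruction word. */
static unsigned int sk_branch_opcode(uint32_t instr)
{
	return (instr >> 26) & 0x3F;
}

static bool sk_is_iform(uint32_t instr)  { return sk_branch_opcode(instr) == 18; }
static bool sk_is_bform(uint32_t instr)  { return sk_branch_opcode(instr) == 16; }
static bool sk_is_xlform(uint32_t instr) { return sk_branch_opcode(instr) == 19; }

/* Each test below checks that all bits of the mask are set in instr. */
static bool sk_link_set(uint32_t instr)
{
	return (instr & BRANCH_SET_LINK) == BRANCH_SET_LINK;
}

static bool sk_xlform_lr(uint32_t instr)
{
	return (instr & XL_FORM_LR) == XL_FORM_LR;
}

static bool sk_xlform_ctr(uint32_t instr)
{
	return (instr & XL_FORM_CTR) == XL_FORM_CTR;
}

static bool sk_bo_always(uint32_t instr)
{
	return (instr & BO_ALWAYS) == BO_ALWAYS;
}
```

Because every helper is a "mask bits are a subset of the instruction bits" test, and the XL_FORM_CTR mask contains all bits of XL_FORM_LR, a bctr encoding also satisfies the LR test in this sketch. A stricter variant would isolate the extended-opcode field before comparing; that is an observation about the sketch and the mask values, not something stated in the patch.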
* [V3 08/10] powerpc, perf: Enable SW filtering in branch stack sampling framework
2013-10-16 6:56 [V3 00/10] perf: New conditional branch filter Anshuman Khandual
` (6 preceding siblings ...)
2013-10-16 6:56 ` [V3 07/10] powerpc, lib: Add new branch instruction analysis support functions Anshuman Khandual
@ 2013-10-16 6:56 ` Anshuman Khandual
2013-10-16 6:56 ` [V3 09/10] power8, perf: Change BHRB branch filter configuration Anshuman Khandual
` (2 subsequent siblings)
10 siblings, 0 replies; 17+ messages in thread
From: Anshuman Khandual @ 2013-10-16 6:56 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, michaele, eranian
This patch enables SW based post processing of BHRB captured branches
to be able to meet more user defined branch filtration criteria in perf
branch stack sampling framework. These changes increase the number of
branch filters and their valid combinations on any powerpc64 server
platform with BHRB support. Find the summary of code changes here.
(1) struct cpu_hw_events
Introduced two new variables to track various filter values and the mask
(a) bhrb_sw_filter Tracks SW implemented branch filter flags
(b) filter_mask Tracks both (SW and HW) branch filter flags
(2) Event creation
Kernel will figure out supported BHRB branch filters through a PMU callback,
'bhrb_filter_map'. This function will find out how many of the
requested branch filters can be supported in the PMU HW. It will not
try to invalidate any branch filter combinations. Event creation will not
error out because of lack of HW based branch filters. Meanwhile it will
track the overall supported branch filters in the "filter_mask" variable.
Once the PMU callback returns, the kernel will process the user branch filter
request against available SW filters while looking at the "filter_mask".
During this phase all the branch filters which are still pending from the
user requested list will have to be supported in SW, failing which the
event creation will error out.
(3) SW branch filter
During the BHRB data capture inside the PMU interrupt context, each
of the captured 'perf_branch_entry.from' will be checked for compliance
with applicable SW branch filters. If the entry does not conform to the
filter requirements, it will be discarded from the final perf branch
stack buffer.
(4) Supported SW based branch filters
(a) PERF_SAMPLE_BRANCH_ANY_RETURN
(b) PERF_SAMPLE_BRANCH_IND_CALL
(c) PERF_SAMPLE_BRANCH_ANY_CALL
(d) PERF_SAMPLE_BRANCH_COND
Please refer to the patch to understand the classification of instructions into
these branch filter categories.
(5) Multiple branch filter semantics
Book3S server implementation follows the same OR semantics (as implemented in
x86) while dealing with multiple branch filters at any point of time. SW
branch filter analysis is carried out on the data set captured in the PMU HW.
So the resulting set of data (after applying the SW filters) will inherently
be an AND with the HW captured set. Hence any combination of HW and SW branch
filters will be invalid. HW based branch filters are more efficient and faster
compared to SW implemented branch filters. So at first the PMU should decide
whether it can support all the requested branch filters itself or not. In case
it can support all the branch filters in an OR manner, we don't apply any SW
branch filter on top of the HW captured set (which is the final set). This
preserves the OR semantic of multiple branch filters as required. But in case
where the PMU cannot support all the requested branch filters in an OR manner,
it should not apply any of its filters and leave it up to the SW to handle them
all. It is the PMU code's responsibility to uphold this protocol to be able to
conform to the overall OR semantic of perf branch stack sampling framework.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/perf_event_server.h | 6 +-
arch/powerpc/perf/core-book3s.c | 266 ++++++++++++++++++++++++++-
arch/powerpc/perf/power8-pmu.c | 2 +-
3 files changed, 262 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/include/asm/perf_event_server.h b/arch/powerpc/include/asm/perf_event_server.h
index 8b24926..7314085 100644
--- a/arch/powerpc/include/asm/perf_event_server.h
+++ b/arch/powerpc/include/asm/perf_event_server.h
@@ -18,6 +18,10 @@
#define MAX_EVENT_ALTERNATIVES 8
#define MAX_LIMITED_HWCOUNTERS 2
+#define for_each_branch_sample_type(x) \
+ for ((x) = PERF_SAMPLE_BRANCH_USER; \
+ (x) < PERF_SAMPLE_BRANCH_MAX; (x) <<= 1)
+
/*
* This struct provides the constants and functions needed to
* describe the PMU on a particular POWER-family CPU.
@@ -34,7 +38,7 @@ struct power_pmu {
unsigned long *valp);
int (*get_alternatives)(u64 event_id, unsigned int flags,
u64 alt[]);
- u64 (*bhrb_filter_map)(u64 branch_sample_type);
+ u64 (*bhrb_filter_map)(u64 branch_sample_type, u64 *filter_mask);
void (*config_bhrb)(u64 pmu_bhrb_filter);
void (*disable_pmc)(unsigned int pmc, unsigned long mmcr[]);
int (*limited_pmc_event)(u64 event_id);
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index bc4dac7..f983334 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -48,6 +48,8 @@ struct cpu_hw_events {
/* BHRB bits */
u64 bhrb_hw_filter; /* BHRB HW branch filter */
+ u64 bhrb_sw_filter; /* BHRB SW branch filter */
+ u64 filter_mask; /* Branch filter mask */
int bhrb_users;
void *bhrb_context;
struct perf_branch_stack bhrb_stack;
@@ -400,6 +402,228 @@ static __u64 power_pmu_bhrb_to(u64 addr)
return target - (unsigned long)&instr + addr;
}
+/*
+ * Instruction opcode analysis
+ *
+ * Analyse instruction opcodes and classify them
+ * into various branch filter options available.
+ * This follows the standard semantics of OR which
+ * means that instructions which conforms to `any`
+ * of the requested branch filters get picked up.
+ */
+static bool validate_instruction(unsigned int *addr, u64 bhrb_sw_filter)
+{
+ bool result = false;
+
+ if (bhrb_sw_filter & PERF_SAMPLE_BRANCH_ANY_RETURN) {
+
+ /* XL-form instruction */
+ if (instr_is_branch_xlform(*addr)) {
+
+ /* LR should not be set */
+ if (!is_branch_link_set(*addr)) {
+ /*
+ * Conditional and unconditional
+ * branch to LR register.
+ */
+ if (is_xlform_lr(*addr))
+ result = true;
+ }
+ }
+ }
+
+ if (bhrb_sw_filter & PERF_SAMPLE_BRANCH_IND_CALL) {
+ /* XL-form instruction */
+ if (instr_is_branch_xlform(*addr)) {
+
+ /* LR should be set */
+ if (is_branch_link_set(*addr)) {
+ /*
+ * Conditional and unconditional
+ * branch to CTR.
+ */
+ if (is_xlform_ctr(*addr))
+ result = true;
+
+ /*
+ * Conditional and unconditional
+ * branch to LR.
+ */
+ if (is_xlform_lr(*addr))
+ result = true;
+
+ /*
+ * Conditional and unconditional
+ * branch to TAR.
+ */
+ if (is_xlform_tar(*addr))
+ result = true;
+ }
+ }
+ }
+
+ /* Any-form branch */
+ if (bhrb_sw_filter & PERF_SAMPLE_BRANCH_ANY_CALL) {
+ /* LR should be set */
+ if (is_branch_link_set(*addr))
+ result = true;
+ }
+
+ if (bhrb_sw_filter & PERF_SAMPLE_BRANCH_COND) {
+
+ /* I-form instruction - excluded */
+ if (instr_is_branch_iform(*addr))
+ goto out;
+
+ /* B-form or XL-form instruction */
+ if (instr_is_branch_bform(*addr) || instr_is_branch_xlform(*addr)) {
+
+ /* Not branch always */
+ if (!is_bo_always(*addr)) {
+
+ /* Conditional branch to CTR register */
+ if (is_bo_ctr(*addr))
+ goto out;
+
+ /* CR[BI] conditional branch with static hint */
+ if (is_bo_crbi_off(*addr) || is_bo_crbi_on(*addr)) {
+ if (is_bo_crbi_hint(*addr))
+ goto out;
+ }
+
+ result = true;
+ }
+ }
+ }
+out:
+ return result;
+}
+
+static bool check_instruction(u64 addr, u64 bhrb_sw_filter)
+{
+ unsigned int instr;
+ bool ret;
+
+ if (bhrb_sw_filter == 0)
+ return true;
+
+ if (is_kernel_addr(addr)) {
+ ret = validate_instruction((unsigned int *) addr, bhrb_sw_filter);
+ } else {
+ /*
+ * Userspace address needs to be
+ * copied first before analysis.
+ */
+ pagefault_disable();
+ ret = __get_user_inatomic(instr, (unsigned int __user *)addr);
+
+ /*
+ * If the instruction could not be accessible
+ * from user space, we still 'okay' the entry.
+ */
+ if (ret) {
+ pagefault_enable();
+ return true;
+ }
+ pagefault_enable();
+ ret = validate_instruction(&instr, bhrb_sw_filter);
+ }
+ return ret;
+}
+
+/*
+ * Validate whether all requested branch filters
+ * are getting processed either in the PMU or in SW.
+ */
+static int match_filters(u64 branch_sample_type, u64 filter_mask)
+{
+ u64 x;
+
+ if (filter_mask == PERF_SAMPLE_BRANCH_ANY)
+ return true;
+
+ for_each_branch_sample_type(x) {
+ if (!(branch_sample_type & x))
+ continue;
+ /*
+ * Privilege filter requests have been already
+ * taken care during the base PMU configuration.
+ */
+ if (x == PERF_SAMPLE_BRANCH_USER)
+ continue;
+ if (x == PERF_SAMPLE_BRANCH_KERNEL)
+ continue;
+ if (x == PERF_SAMPLE_BRANCH_HV)
+ continue;
+
+ /*
+ * Requested filter not available either
+ * in PMU or in SW.
+ */
+ if (!(filter_mask & x))
+ return false;
+ }
+ return true;
+}
+
+/*
+ * Required SW based branch filters
+ *
+ * This is called after figuring out what all branch filters the
+ * PMU HW supports for the requested branch filter set. Here we
+ * will go through all the SW implemented branch filters one by
+ * one and pick them up if its not already supported in the PMU.
+ */
+static u64 branch_filter_map(u64 branch_sample_type, u64 pmu_bhrb_filter,
+ u64 *filter_mask)
+{
+ u64 branch_sw_filter = 0;
+
+ /* No branch filter requested */
+ if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY) {
+ WARN_ON(pmu_bhrb_filter != 0);
+ WARN_ON(*filter_mask != PERF_SAMPLE_BRANCH_ANY);
+ return branch_sw_filter;
+ }
+
+ /*
+ * PMU supported branch filters must also be implemented in SW
+ * in the event when the PMU is unable to process them for some
+ * reason. This way all those branch filters can be satisfied with
+ * SW implemented filters. But right now, there is no way to
+ * intimate the user about this decision.
+ */
+ if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
+ if (!(pmu_bhrb_filter & PERF_SAMPLE_BRANCH_ANY_CALL)) {
+ branch_sw_filter |= PERF_SAMPLE_BRANCH_ANY_CALL;
+ *filter_mask |= PERF_SAMPLE_BRANCH_ANY_CALL;
+ }
+ }
+
+ if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
+ if (!(pmu_bhrb_filter & PERF_SAMPLE_BRANCH_COND)) {
+ branch_sw_filter |= PERF_SAMPLE_BRANCH_COND;
+ *filter_mask |= PERF_SAMPLE_BRANCH_COND;
+ }
+ }
+
+ if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN) {
+ if (!(pmu_bhrb_filter & PERF_SAMPLE_BRANCH_ANY_RETURN)) {
+ branch_sw_filter |= PERF_SAMPLE_BRANCH_ANY_RETURN;
+ *filter_mask |= PERF_SAMPLE_BRANCH_ANY_RETURN;
+ }
+ }
+
+ if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL) {
+ if (!(pmu_bhrb_filter & PERF_SAMPLE_BRANCH_IND_CALL)) {
+ branch_sw_filter |= PERF_SAMPLE_BRANCH_IND_CALL;
+ *filter_mask |= PERF_SAMPLE_BRANCH_IND_CALL;
+ }
+ }
+
+ return branch_sw_filter;
+}
+
/* Processing BHRB entries */
void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
{
@@ -459,17 +683,29 @@ void power_pmu_bhrb_read(struct cpu_hw_events *cpuhw)
addr = 0;
}
cpuhw->bhrb_entries[u_index].from = addr;
+
+ if (!check_instruction(cpuhw->
+ bhrb_entries[u_index].from,
+ cpuhw->bhrb_sw_filter))
+ u_index--;
} else {
/* Branches to immediate field
(ie I or B form) */
cpuhw->bhrb_entries[u_index].from = addr;
- cpuhw->bhrb_entries[u_index].to =
- power_pmu_bhrb_to(addr);
- cpuhw->bhrb_entries[u_index].mispred = pred;
- cpuhw->bhrb_entries[u_index].predicted = ~pred;
+ if (check_instruction(cpuhw->
+ bhrb_entries[u_index].from,
+ cpuhw->bhrb_sw_filter)) {
+ cpuhw->bhrb_entries[u_index].
+ to = power_pmu_bhrb_to(addr);
+ cpuhw->bhrb_entries[u_index].
+ mispred = pred;
+ cpuhw->bhrb_entries[u_index].
+ predicted = ~pred;
+ } else {
+ u_index--;
+ }
}
u_index++;
-
}
}
cpuhw->bhrb_stack.nr = u_index;
@@ -1255,7 +1491,11 @@ nocheck:
if (has_branch_stack(event)) {
power_pmu_bhrb_enable(event);
cpuhw->bhrb_hw_filter = ppmu->bhrb_filter_map(
- event->attr.branch_sample_type);
+ event->attr.branch_sample_type,
+ &cpuhw->filter_mask);
+ cpuhw->bhrb_sw_filter = branch_filter_map
+ (event->attr.branch_sample_type,
+ cpuhw->bhrb_hw_filter, &cpuhw->filter_mask);
}
perf_pmu_enable(event->pmu);
@@ -1637,10 +1877,16 @@ static int power_pmu_event_init(struct perf_event *event)
err = power_check_constraints(cpuhw, events, cflags, n + 1);
if (has_branch_stack(event)) {
- cpuhw->bhrb_hw_filter = ppmu->bhrb_filter_map(
- event->attr.branch_sample_type);
-
- if(cpuhw->bhrb_hw_filter == -1)
+ cpuhw->bhrb_hw_filter = ppmu->bhrb_filter_map
+ (event->attr.branch_sample_type,
+ &cpuhw->filter_mask);
+ cpuhw->bhrb_sw_filter = branch_filter_map
+ (event->attr.branch_sample_type,
+ cpuhw->bhrb_hw_filter,
+ &cpuhw->filter_mask);
+
+ if(!match_filters(event->attr.branch_sample_type,
+ cpuhw->filter_mask))
return -EOPNOTSUPP;
}
diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 6e28587..94460bc 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -558,7 +558,7 @@ static int power8_generic_events[] = {
[PERF_COUNT_HW_BRANCH_MISSES] = PM_BR_MPRED_CMPL,
};
-static u64 power8_bhrb_filter_map(u64 branch_sample_type)
+static u64 power8_bhrb_filter_map(u64 branch_sample_type, u64 *filter_mask)
{
u64 pmu_bhrb_filter = 0;
--
1.7.11.7
^ permalink raw reply related [flat|nested] 17+ messages in thread
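The event-creation flow described in items (2) and (5) of the 08/10 commit message can be sketched in userspace as below. This is a simplified model under stated assumptions, not the kernel code: the `sk_` functions are hypothetical, the `SK_BR_*` values are local stand-ins for the perf flag bits, and the filters handled by the PMU are passed in as perf flag bits (`hw_handled`) rather than as a register value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SK_BR_USER		(1ULL << 0)
#define SK_BR_KERNEL		(1ULL << 1)
#define SK_BR_HV		(1ULL << 2)
#define SK_BR_ANY		(1ULL << 3)
#define SK_BR_ANY_CALL		(1ULL << 4)
#define SK_BR_ANY_RETURN	(1ULL << 5)
#define SK_BR_IND_CALL		(1ULL << 6)
#define SK_BR_ABORT_TX		(1ULL << 7)
#define SK_BR_COND		(1ULL << 10)

#define SK_BR_PRIV	(SK_BR_USER | SK_BR_KERNEL | SK_BR_HV)

/* Filters the SW pass can evaluate, per item (4) of the commit message. */
#define SK_SW_CAPABLE	(SK_BR_ANY_CALL | SK_BR_ANY_RETURN | \
			 SK_BR_IND_CALL | SK_BR_COND)

/*
 * Pick the filters that must run in SW: requested, SW-capable and not
 * already handled by the PMU. Each pick is also recorded in *mask.
 */
static uint64_t sk_sw_filters(uint64_t req, uint64_t hw_handled, uint64_t *mask)
{
	uint64_t sw;

	assert(mask != NULL);
	if (req & SK_BR_ANY)
		return 0;		/* "any" means no filtering at all */
	sw = req & SK_SW_CAPABLE & ~hw_handled;
	*mask |= sw;
	return sw;
}

/* True when every non-privilege filter in req is covered by HW or SW. */
static bool sk_request_covered(uint64_t req, uint64_t mask)
{
	if (mask == SK_BR_ANY)
		return true;
	return ((req & ~SK_BR_PRIV) & ~mask) == 0;
}
```

In this model a request that ends up not covered corresponds to the `-EOPNOTSUPP` path in power_pmu_event_init(), for example a transaction filter that neither the PMU nor the SW pass implements.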
* [V3 09/10] power8, perf: Change BHRB branch filter configuration
2013-10-16 6:56 [V3 00/10] perf: New conditional branch filter Anshuman Khandual
` (7 preceding siblings ...)
2013-10-16 6:56 ` [V3 08/10] powerpc, perf: Enable SW filtering in branch stack sampling framework Anshuman Khandual
@ 2013-10-16 6:56 ` Anshuman Khandual
2013-10-16 6:56 ` [V3 10/10] powerpc, perf: Cleanup SW branch filter list look up Anshuman Khandual
2013-12-04 2:50 ` [V3 00/10] perf: New conditional branch filter Michael Ellerman
10 siblings, 0 replies; 17+ messages in thread
From: Anshuman Khandual @ 2013-10-16 6:56 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, michaele, eranian
Powerpc kernel now supports SW based branch filters for book3s systems with some
specific requirements while dealing with HW supported branch filters in order to
achieve overall OR semantics prevailing in perf branch stack sampling framework.
This patch adapts the BHRB branch filter configuration to meet those protocols.
POWER8 PMU does support 3 branch filters (out of which two are getting used in
perf branch stack) which are mutually exclusive and cannot be ORed with each
other. This implies that PMU can only handle one HW based branch filter request
at any point of time. For all other combinations PMU will pass it on to the SW.
Also the combination of PERF_SAMPLE_BRANCH_ANY_CALL and PERF_SAMPLE_BRANCH_COND
can now be handled in SW, hence we don't error them out anymore.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/perf/power8-pmu.c | 73 +++++++++++++++++++++++++++++++-----------
1 file changed, 54 insertions(+), 19 deletions(-)
diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index 94460bc..7b82725 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -560,7 +560,56 @@ static int power8_generic_events[] = {
static u64 power8_bhrb_filter_map(u64 branch_sample_type, u64 *filter_mask)
{
- u64 pmu_bhrb_filter = 0;
+ u64 x, tmp, pmu_bhrb_filter = 0;
+ *filter_mask = 0;
+
+ /* No branch filter requested */
+ if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY) {
+ *filter_mask = PERF_SAMPLE_BRANCH_ANY;
+ return pmu_bhrb_filter;
+ }
+
+ /*
+ * P8 does not support oring of PMU HW branch filters. Hence
+ * if multiple branch filters are requested which includes filters
+ * supported in PMU, still go ahead and clear the PMU based HW branch
+ * filter component as in this case all the filters will be processed
+ * in SW.
+ */
+ tmp = branch_sample_type;
+
+ /* Remove privilege filters before comparison */
+ tmp &= ~PERF_SAMPLE_BRANCH_USER;
+ tmp &= ~PERF_SAMPLE_BRANCH_KERNEL;
+ tmp &= ~PERF_SAMPLE_BRANCH_HV;
+
+ for_each_branch_sample_type(x) {
+ /* Ignore privilege requests */
+ if ((x == PERF_SAMPLE_BRANCH_USER) || (x == PERF_SAMPLE_BRANCH_KERNEL) || (x == PERF_SAMPLE_BRANCH_HV))
+ continue;
+
+ if (!(tmp & x))
+ continue;
+
+ /* Supported HW PMU filters */
+ if (tmp & PERF_SAMPLE_BRANCH_ANY_CALL) {
+ tmp &= ~PERF_SAMPLE_BRANCH_ANY_CALL;
+ if (tmp) {
+ pmu_bhrb_filter = 0;
+ *filter_mask = 0;
+ return pmu_bhrb_filter;
+ }
+ }
+
+ if (tmp & PERF_SAMPLE_BRANCH_COND) {
+ tmp &= ~PERF_SAMPLE_BRANCH_COND;
+ if (tmp) {
+ pmu_bhrb_filter = 0;
+ *filter_mask = 0;
+ return pmu_bhrb_filter;
+ }
+ }
+ }
/* BHRB and regular PMU events share the same privilege state
* filter configuration. BHRB is always recorded along with a
@@ -569,34 +618,20 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type, u64 *filter_mask)
* PMU event, we ignore any separate BHRB specific request.
*/
- /* No branch filter requested */
- if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY)
- return pmu_bhrb_filter;
-
- /* Invalid branch filter options - HW does not support */
- if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
- return -1;
-
- if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
- return -1;
-
- /* Invalid branch filter combination - HW does not support */
- if ((branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) &&
- (branch_sample_type & PERF_SAMPLE_BRANCH_COND))
- return -1;
-
+ /* Supported individual branch filters */
if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
pmu_bhrb_filter |= POWER8_MMCRA_IFM1;
+ *filter_mask |= PERF_SAMPLE_BRANCH_ANY_CALL;
return pmu_bhrb_filter;
}
if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
pmu_bhrb_filter |= POWER8_MMCRA_IFM3;
+ *filter_mask |= PERF_SAMPLE_BRANCH_COND;
return pmu_bhrb_filter;
}
- /* Every thing else is unsupported */
- return -1;
+ return pmu_bhrb_filter;
}
static void power8_config_bhrb(u64 pmu_bhrb_filter)
--
1.7.11.7
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [V3 10/10] powerpc, perf: Cleanup SW branch filter list look up
2013-10-16 6:56 [V3 00/10] perf: New conditional branch filter Anshuman Khandual
` (8 preceding siblings ...)
2013-10-16 6:56 ` [V3 09/10] power8, perf: Change BHRB branch filter configuration Anshuman Khandual
@ 2013-10-16 6:56 ` Anshuman Khandual
2013-12-04 2:50 ` [V3 00/10] perf: New conditional branch filter Michael Ellerman
10 siblings, 0 replies; 17+ messages in thread
From: Anshuman Khandual @ 2013-10-16 6:56 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, michaele, eranian
This patch adds an enumeration of all available SW branch filters
in powerpc book3s code and also streamlines the lookup of the
SW branch filter entries while figuring out which branch filters
can be supported in SW.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
arch/powerpc/perf/core-book3s.c | 38 +++++++++++++-------------------------
1 file changed, 13 insertions(+), 25 deletions(-)
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index f983334..ec2dd61 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -566,6 +566,12 @@ static int match_filters(u64 branch_sample_type, u64 filter_mask)
return true;
}
+/* SW implemented branch filters */
+static unsigned int power_sw_filter[] = { PERF_SAMPLE_BRANCH_ANY_CALL,
+ PERF_SAMPLE_BRANCH_COND,
+ PERF_SAMPLE_BRANCH_ANY_RETURN,
+ PERF_SAMPLE_BRANCH_IND_CALL };
+
/*
* Required SW based branch filters
*
@@ -578,6 +584,7 @@ static u64 branch_filter_map(u64 branch_sample_type, u64 pmu_bhrb_filter,
u64 *filter_mask)
{
u64 branch_sw_filter = 0;
+ unsigned int i;
/* No branch filter requested */
if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY) {
@@ -593,34 +600,15 @@ static u64 branch_filter_map(u64 branch_sample_type, u64 pmu_bhrb_filter,
* SW implemented filters. But right now, there is now way to
* initimate the user about this decision.
*/
- if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
- if (!(pmu_bhrb_filter & PERF_SAMPLE_BRANCH_ANY_CALL)) {
- branch_sw_filter |= PERF_SAMPLE_BRANCH_ANY_CALL;
- *filter_mask |= PERF_SAMPLE_BRANCH_ANY_CALL;
- }
- }
-
- if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
- if (!(pmu_bhrb_filter & PERF_SAMPLE_BRANCH_COND)) {
- branch_sw_filter |= PERF_SAMPLE_BRANCH_COND;
- *filter_mask |= PERF_SAMPLE_BRANCH_COND;
- }
- }
- if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN) {
- if (!(pmu_bhrb_filter & PERF_SAMPLE_BRANCH_ANY_RETURN)) {
- branch_sw_filter |= PERF_SAMPLE_BRANCH_ANY_RETURN;
- *filter_mask |= PERF_SAMPLE_BRANCH_ANY_RETURN;
- }
- }
-
- if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL) {
- if (!(pmu_bhrb_filter & PERF_SAMPLE_BRANCH_IND_CALL)) {
- branch_sw_filter |= PERF_SAMPLE_BRANCH_IND_CALL;
- *filter_mask |= PERF_SAMPLE_BRANCH_IND_CALL;
+ for (i = 0; i < ARRAY_SIZE(power_sw_filter); i++) {
+ if (branch_sample_type & power_sw_filter[i]) {
+ if (!(pmu_bhrb_filter & power_sw_filter[i])) {
+ branch_sw_filter |= power_sw_filter[i];
+ *filter_mask |= power_sw_filter[i];
+ }
}
}
-
return branch_sw_filter;
}
--
1.7.11.7
^ permalink raw reply related [flat|nested] 17+ messages in thread
* RE: [V3 01/10] perf: New conditional branch filter criteria in branch stack sampling
2013-10-16 6:56 ` [V3 01/10] perf: New conditional branch filter criteria in branch stack sampling Anshuman Khandual
@ 2013-11-26 6:06 ` mpe@ellerman.id.au
2013-11-26 10:15 ` Anshuman Khandual
0 siblings, 1 reply; 17+ messages in thread
From: mpe@ellerman.id.au @ 2013-11-26 6:06 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, eranian, michaele
Ideally your commit subject would contain a verb, preferably in the present
tense.
I think simply "perf: Add PERF_SAMPLE_BRANCH_COND" would be clearer.
On Wed, 2013-16-10 at 06:56:48 UTC, Anshuman Khandual wrote:
> POWER8 PMU based BHRB supports filtering for conditional branches.
> This patch introduces new branch filter PERF_SAMPLE_BRANCH_COND which
> will extend the existing perf ABI. Other architectures can provide
> this functionality with either HW filtering support (if present) or
> with SW filtering of instructions.
>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> Reviewed-by: Stephane Eranian <eranian@google.com>
> ---
> include/uapi/linux/perf_event.h | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 0b1df41..5da52b6 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -160,8 +160,9 @@ enum perf_branch_sample_type {
> PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
> PERF_SAMPLE_BRANCH_IN_TX = 1U << 8, /* in transaction */
> PERF_SAMPLE_BRANCH_NO_TX = 1U << 9, /* not in transaction */
> + PERF_SAMPLE_BRANCH_COND = 1U << 10, /* conditional branches */
>
> - PERF_SAMPLE_BRANCH_MAX = 1U << 10, /* non-ABI */
> + PERF_SAMPLE_BRANCH_MAX = 1U << 11, /* non-ABI */
> };
This no longer applies against Linus' tree, you'll need to rebase it.
cheers
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [V3 02/10] powerpc, perf: Enable conditional branch filter for POWER8
2013-10-16 6:56 ` [V3 02/10] powerpc, perf: Enable conditional branch filter for POWER8 Anshuman Khandual
@ 2013-11-26 6:06 ` mpe@ellerman.id.au
2013-11-26 10:40 ` Anshuman Khandual
0 siblings, 1 reply; 17+ messages in thread
From: mpe@ellerman.id.au @ 2013-11-26 6:06 UTC (permalink / raw)
To: linuxppc-dev, linux-kernel; +Cc: mikey, sukadev, eranian, michaele
On Wed, 2013-16-10 at 06:56:49 UTC, Anshuman Khandual wrote:
> Enables conditional branch filter support for POWER8
> utilizing MMCRA register based filter and also invalidates
> a BHRB branch filter combination involving conditional
> branches.
>
> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> ---
> arch/powerpc/perf/power8-pmu.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
> index 2ee4a70..6e28587 100644
> --- a/arch/powerpc/perf/power8-pmu.c
> +++ b/arch/powerpc/perf/power8-pmu.c
> @@ -580,11 +580,21 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type)
> if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
> return -1;
>
> + /* Invalid branch filter combination - HW does not support */
> + if ((branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) &&
> + (branch_sample_type & PERF_SAMPLE_BRANCH_COND))
> + return -1;
What this doesn't make obvious is that the hardware doesn't support any
combinations. It just happens that these are the only two possibilities we
allow, and so this is the only combination we have to disallow.
>
> if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
> pmu_bhrb_filter |= POWER8_MMCRA_IFM1;
> return pmu_bhrb_filter;
> }
>
> + if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
> + pmu_bhrb_filter |= POWER8_MMCRA_IFM3;
> + return pmu_bhrb_filter;
> + }
> +
> /* Every thing else is unsupported */
> return -1;
> }
I think it would be clearer if we actually checked for the possibilities we
allow and let everything else fall through, eg:
/* Ignore user/kernel/hv bits */
branch_sample_type &= ~PERF_SAMPLE_BRANCH_PLM_ALL;
if (branch_sample_type == PERF_SAMPLE_BRANCH_ANY)
return 0;
if (branch_sample_type == PERF_SAMPLE_BRANCH_ANY_CALL)
return POWER8_MMCRA_IFM1;
if (branch_sample_type == PERF_SAMPLE_BRANCH_COND)
return POWER8_MMCRA_IFM3;
return -1;
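For illustration only, the exact-match approach suggested above can be sketched as a standalone function. The filter bit positions mirror enum perf_branch_sample_type as quoted in this thread; the SKETCH_ names and the two IFM values are hypothetical placeholders, not the kernel's definitions.

```c
#include <stdint.h>

/* Stand-in filter bits (hypothetical SKETCH_ names, for illustration). */
#define SKETCH_BRANCH_USER     (1ULL << 0)
#define SKETCH_BRANCH_KERNEL   (1ULL << 1)
#define SKETCH_BRANCH_HV       (1ULL << 2)
#define SKETCH_BRANCH_ANY      (1ULL << 3)
#define SKETCH_BRANCH_ANY_CALL (1ULL << 4)
#define SKETCH_BRANCH_COND     (1ULL << 10)
#define SKETCH_BRANCH_PLM_ALL \
	(SKETCH_BRANCH_USER | SKETCH_BRANCH_KERNEL | SKETCH_BRANCH_HV)

/* Placeholder HW filter encodings; the real MMCRA values are not shown here. */
#define SKETCH_IFM1        0x1ULL
#define SKETCH_IFM3        0x3ULL
/* Mirrors "return -1" from a function returning an unsigned 64-bit value. */
#define SKETCH_UNSUPPORTED UINT64_MAX

uint64_t sketch_bhrb_filter_map(uint64_t branch_sample_type)
{
	/* Ignore user/kernel/hv bits */
	branch_sample_type &= ~SKETCH_BRANCH_PLM_ALL;

	/* Only exact single-filter requests are accepted */
	if (branch_sample_type == SKETCH_BRANCH_ANY)
		return 0;

	if (branch_sample_type == SKETCH_BRANCH_ANY_CALL)
		return SKETCH_IFM1;

	if (branch_sample_type == SKETCH_BRANCH_COND)
		return SKETCH_IFM3;

	/* Any combination, or any other filter, falls through */
	return SKETCH_UNSUPPORTED;
}
```

Because the comparisons use equality rather than a bit test, a request such as ANY_CALL together with COND is rejected without needing a dedicated check for that combination.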
cheers
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [V3 01/10] perf: New conditional branch filter criteria in branch stack sampling
2013-11-26 6:06 ` mpe@ellerman.id.au
@ 2013-11-26 10:15 ` Anshuman Khandual
2013-12-03 10:21 ` Anshuman Khandual
0 siblings, 1 reply; 17+ messages in thread
From: Anshuman Khandual @ 2013-11-26 10:15 UTC (permalink / raw)
To: mpe@ellerman.id.au
Cc: mikey, michaele, linux-kernel, eranian, linuxppc-dev, sukadev
On 11/26/2013 11:36 AM, mpe@ellerman.id.au wrote:
> Ideally your commit subject would contain a verb, preferably in the present
> tense.
>
> I think simply "perf: Add PERF_SAMPLE_BRANCH_COND" would be clearer.
Sure, will change it.
>
> On Wed, 2013-16-10 at 06:56:48 UTC, Anshuman Khandual wrote:
>> POWER8 PMU based BHRB supports filtering for conditional branches.
>> This patch introduces new branch filter PERF_SAMPLE_BRANCH_COND which
>> will extend the existing perf ABI. Other architectures can provide
>> this functionality with either HW filtering support (if present) or
>> with SW filtering of instructions.
>>
>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> Reviewed-by: Stephane Eranian <eranian@google.com>
>> ---
>> include/uapi/linux/perf_event.h | 3 ++-
>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>> index 0b1df41..5da52b6 100644
>> --- a/include/uapi/linux/perf_event.h
>> +++ b/include/uapi/linux/perf_event.h
>> @@ -160,8 +160,9 @@ enum perf_branch_sample_type {
>> PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
>> PERF_SAMPLE_BRANCH_IN_TX = 1U << 8, /* in transaction */
>> PERF_SAMPLE_BRANCH_NO_TX = 1U << 9, /* not in transaction */
>> + PERF_SAMPLE_BRANCH_COND = 1U << 10, /* conditional branches */
>>
>> - PERF_SAMPLE_BRANCH_MAX = 1U << 10, /* non-ABI */
>> + PERF_SAMPLE_BRANCH_MAX = 1U << 11, /* non-ABI */
>> };
>
> This no longer applies against Linus' tree, you'll need to rebase it.
Okay
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [V3 02/10] powerpc, perf: Enable conditional branch filter for POWER8
2013-11-26 6:06 ` mpe@ellerman.id.au
@ 2013-11-26 10:40 ` Anshuman Khandual
0 siblings, 0 replies; 17+ messages in thread
From: Anshuman Khandual @ 2013-11-26 10:40 UTC (permalink / raw)
To: mpe@ellerman.id.au
Cc: mikey, michaele, linux-kernel, eranian, linuxppc-dev, sukadev
On 11/26/2013 11:36 AM, mpe@ellerman.id.au wrote:
> On Wed, 2013-16-10 at 06:56:49 UTC, Anshuman Khandual wrote:
>> Enables conditional branch filter support for POWER8
>> utilizing MMCRA register based filter and also invalidates
>> a BHRB branch filter combination involving conditional
>> branches.
>>
>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> ---
>> arch/powerpc/perf/power8-pmu.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
>> index 2ee4a70..6e28587 100644
>> --- a/arch/powerpc/perf/power8-pmu.c
>> +++ b/arch/powerpc/perf/power8-pmu.c
>> @@ -580,11 +580,21 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type)
>> if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
>> return -1;
>>
>> + /* Invalid branch filter combination - HW does not support */
>> + if ((branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) &&
>> + (branch_sample_type & PERF_SAMPLE_BRANCH_COND))
>> + return -1;
>
> What this doesn't make obvious is that the hardware doesn't support any
> combinations. It just happens that these are the only two possibilities we
> allow, and so this is the only combination we have to disallow.
>
>>
>> if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
>> pmu_bhrb_filter |= POWER8_MMCRA_IFM1;
>> return pmu_bhrb_filter;
>> }
>>
>> + if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
>> + pmu_bhrb_filter |= POWER8_MMCRA_IFM3;
>> + return pmu_bhrb_filter;
>> + }
>> +
>> /* Every thing else is unsupported */
>> return -1;
>> }
>
> I think it would be clearer if we actually checked for the possibilities we
> allow and let everything else fall through, eg:
>
> /* Ignore user/kernel/hv bits */
> branch_sample_type &= ~PERF_SAMPLE_BRANCH_PLM_ALL;
>
> if (branch_sample_type == PERF_SAMPLE_BRANCH_ANY)
> return 0;
>
> if (branch_sample_type == PERF_SAMPLE_BRANCH_ANY_CALL)
> return POWER8_MMCRA_IFM1;
>
> if (branch_sample_type == PERF_SAMPLE_BRANCH_COND)
> return POWER8_MMCRA_IFM3;
>
> return -1;
>
Please look at the 9th patch (power8, perf: Change BHRB branch filter configuration).
All these issues are taken care of in that patch. It clearly indicates that no combination
of HW BHRB filters is supported in the PMU; hence it zeroes out the HW filter component
and processes all of those filters in SW.
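As a rough sketch of the split described above (loosely modeled on the 9th patch, not a copy of it): a single HW-supported filter goes to the PMU, while any combination leaves the HW filter at zero and hands every requested filter to SW. The SKETCH_ names, the struct, and the IFM values are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Stand-in filter bits (hypothetical SKETCH_ names, for illustration). */
#define SKETCH_BRANCH_USER       (1ULL << 0)
#define SKETCH_BRANCH_KERNEL     (1ULL << 1)
#define SKETCH_BRANCH_HV         (1ULL << 2)
#define SKETCH_BRANCH_ANY        (1ULL << 3)
#define SKETCH_BRANCH_ANY_CALL   (1ULL << 4)
#define SKETCH_BRANCH_ANY_RETURN (1ULL << 5)
#define SKETCH_BRANCH_IND_CALL   (1ULL << 6)
#define SKETCH_BRANCH_COND       (1ULL << 10)
#define SKETCH_BRANCH_PLM_ALL \
	(SKETCH_BRANCH_USER | SKETCH_BRANCH_KERNEL | SKETCH_BRANCH_HV)

/* Placeholder HW filter encodings, not the real MMCRA values. */
#define SKETCH_IFM1 0x1ULL
#define SKETCH_IFM3 0x3ULL

struct sketch_split {
	uint64_t hw_filter;	/* value that would be programmed into the PMU */
	uint64_t hw_mask;	/* requested filters covered by HW */
	uint64_t sw_mask;	/* requested filters left for SW */
};

struct sketch_split sketch_split_filters(uint64_t req)
{
	struct sketch_split out = { 0, 0, 0 };

	/* No branch filter requested: nothing to filter in HW or SW */
	if (req & SKETCH_BRANCH_ANY) {
		out.hw_mask = SKETCH_BRANCH_ANY;
		return out;
	}

	/* Privilege bits are handled elsewhere; drop them before comparing */
	req &= ~SKETCH_BRANCH_PLM_ALL;

	if (req == SKETCH_BRANCH_ANY_CALL) {
		out.hw_filter = SKETCH_IFM1;
		out.hw_mask = req;
	} else if (req == SKETCH_BRANCH_COND) {
		out.hw_filter = SKETCH_IFM3;
		out.hw_mask = req;
	} else {
		/* Combination or HW-unsupported filter: all of it goes to SW */
		out.sw_mask = req;
	}

	return out;
}
```

With this shape, ANY_CALL combined with COND yields a zero HW filter and both bits in sw_mask, which is the behaviour the reply describes.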
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [V3 01/10] perf: New conditional branch filter criteria in branch stack sampling
2013-11-26 10:15 ` Anshuman Khandual
@ 2013-12-03 10:21 ` Anshuman Khandual
0 siblings, 0 replies; 17+ messages in thread
From: Anshuman Khandual @ 2013-12-03 10:21 UTC (permalink / raw)
To: mpe@ellerman.id.au
Cc: mikey, michaele, linux-kernel, eranian, linuxppc-dev, sukadev
On 11/26/2013 03:45 PM, Anshuman Khandual wrote:
> On 11/26/2013 11:36 AM, mpe@ellerman.id.au wrote:
>> Ideally your commit subject would contain a verb, preferably in the present
>> tense.
>>
>> I think simply "perf: Add PERF_SAMPLE_BRANCH_COND" would be clearer.
>
>
> Sure, will change it.
>
>>
>> On Wed, 2013-16-10 at 06:56:48 UTC, Anshuman Khandual wrote:
>>> POWER8 PMU based BHRB supports filtering for conditional branches.
>>> This patch introduces new branch filter PERF_SAMPLE_BRANCH_COND which
>>> will extend the existing perf ABI. Other architectures can provide
>>> this functionality with either HW filtering support (if present) or
>>> with SW filtering of instructions.
>>>
>>> Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>> Reviewed-by: Stephane Eranian <eranian@google.com>
>>> ---
>>> include/uapi/linux/perf_event.h | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
>>> index 0b1df41..5da52b6 100644
>>> --- a/include/uapi/linux/perf_event.h
>>> +++ b/include/uapi/linux/perf_event.h
>>> @@ -160,8 +160,9 @@ enum perf_branch_sample_type {
>>> PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
>>> PERF_SAMPLE_BRANCH_IN_TX = 1U << 8, /* in transaction */
>>> PERF_SAMPLE_BRANCH_NO_TX = 1U << 9, /* not in transaction */
>>> + PERF_SAMPLE_BRANCH_COND = 1U << 10, /* conditional branches */
>>>
>>> - PERF_SAMPLE_BRANCH_MAX = 1U << 10, /* non-ABI */
>>> + PERF_SAMPLE_BRANCH_MAX = 1U << 11, /* non-ABI */
>>> };
>>
>> This no longer applies against Linus' tree, you'll need to rebase it.
>
> Okay
Hey Michael,
Looks like the patch still applies on top of Linus's tree. The modified patch with
a new commit subject line can be found here.
----------------------------------------------------------------------
From d368096fc51a8da65f2d80ed5090d43cbc269f62 Mon Sep 17 00:00:00 2001
From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Date: Mon, 22 Jul 2013 12:22:27 +0530
Subject: [PATCH] perf: Add PERF_SAMPLE_BRANCH_COND
POWER8 PMU based BHRB supports filtering for conditional branches.
This patch introduces new branch filter PERF_SAMPLE_BRANCH_COND which
will extend the existing perf ABI. Other architectures can provide
this functionality with either HW filtering support (if present) or
with SW filtering of instructions.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Reviewed-by: Stephane Eranian <eranian@google.com>
---
include/uapi/linux/perf_event.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index e1802d6..e2d8b8b 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -163,8 +163,9 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
PERF_SAMPLE_BRANCH_IN_TX = 1U << 8, /* in transaction */
PERF_SAMPLE_BRANCH_NO_TX = 1U << 9, /* not in transaction */
+ PERF_SAMPLE_BRANCH_COND = 1U << 10, /* conditional branches */
- PERF_SAMPLE_BRANCH_MAX = 1U << 10, /* non-ABI */
+ PERF_SAMPLE_BRANCH_MAX = 1U << 11, /* non-ABI */
};
#define PERF_SAMPLE_BRANCH_PLM_ALL \
--
1.7.11.7
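To illustrate why PERF_SAMPLE_BRANCH_MAX moves along with the new bit, here is a small sketch. It assumes the core rejects any branch_sample_type bit at or above PERF_SAMPLE_BRANCH_MAX; that check is not part of this thread, and the SKETCH_ names and the function are hypothetical.

```c
#include <stdint.h>

/* Bit positions taken from the enum in the patch above. */
#define SKETCH_BRANCH_COND    (1ULL << 10)	/* new filter bit */
#define SKETCH_BRANCH_MAX_OLD (1ULL << 10)	/* MAX before the patch */
#define SKETCH_BRANCH_MAX_NEW (1ULL << 11)	/* MAX after the patch */

/*
 * Returns 1 when every bit in mask lies below max, 0 otherwise.
 * max is expected to be a single power-of-two bit, like the MAX value.
 */
int sketch_branch_mask_valid(uint64_t mask, uint64_t max)
{
	return (mask & ~(max - 1)) == 0;
}
```

Under that assumption, a request carrying the COND bit fails validation against the old MAX and passes against the new one, which is why both enum lines change together.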
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [V3 00/10] perf: New conditional branch filter
2013-10-16 6:56 [V3 00/10] perf: New conditional branch filter Anshuman Khandual
` (9 preceding siblings ...)
2013-10-16 6:56 ` [V3 10/10] powerpc, perf: Cleanup SW branch filter list look up Anshuman Khandual
@ 2013-12-04 2:50 ` Michael Ellerman
10 siblings, 0 replies; 17+ messages in thread
From: Michael Ellerman @ 2013-12-04 2:50 UTC (permalink / raw)
To: Anshuman Khandual
Cc: mikey, linux-kernel, eranian, linuxppc-dev,
Arnaldo Carvalho de Melo, sukadev
On Wed, 2013-10-16 at 12:26 +0530, Anshuman Khandual wrote:
> This patchset is the re-spin of the original branch stack sampling
> patchset which introduced new PERF_SAMPLE_BRANCH_COND branch filter. This patchset
> also enables SW based branch filtering support for book3s powerpc platforms which
> have PMU HW backed branch stack sampling support.
>
> Summary of code changes in this patchset:
>
> (1) Introduces a new PERF_SAMPLE_BRANCH_COND branch filter
> (2) Add the "cond" branch filter options in the "perf record" tool
> (3) Enable PERF_SAMPLE_BRANCH_COND in X86 platforms
> (4) Enable PERF_SAMPLE_BRANCH_COND in POWER8 platform
> (5) Update the documentation regarding "perf record" tool
Can you please address my comments and then resend patches 1-5. And make sure
you send them to the perf maintainers.
Those three touch the generic code, powerpc and x86, so we'll get those merged
first, and then focus on the remaining patches, which are powerpc specific.
cheers
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2013-12-04 2:50 UTC | newest]
Thread overview: 17+ messages
2013-10-16 6:56 [V3 00/10] perf: New conditional branch filter Anshuman Khandual
2013-10-16 6:56 ` [V3 01/10] perf: New conditional branch filter criteria in branch stack sampling Anshuman Khandual
2013-11-26 6:06 ` mpe@ellerman.id.au
2013-11-26 10:15 ` Anshuman Khandual
2013-12-03 10:21 ` Anshuman Khandual
2013-10-16 6:56 ` [V3 02/10] powerpc, perf: Enable conditional branch filter for POWER8 Anshuman Khandual
2013-11-26 6:06 ` mpe@ellerman.id.au
2013-11-26 10:40 ` Anshuman Khandual
2013-10-16 6:56 ` [V3 03/10] perf, tool: Conditional branch filter 'cond' added to perf record Anshuman Khandual
2013-10-16 6:56 ` [V3 04/10] x86, perf: Add conditional branch filtering support Anshuman Khandual
2013-10-16 6:56 ` [V3 05/10] perf, documentation: Description for conditional branch filter Anshuman Khandual
2013-10-16 6:56 ` [V3 06/10] powerpc, perf: Change the name of HW PMU branch filter tracking variable Anshuman Khandual
2013-10-16 6:56 ` [V3 07/10] powerpc, lib: Add new branch instruction analysis support functions Anshuman Khandual
2013-10-16 6:56 ` [V3 08/10] powerpc, perf: Enable SW filtering in branch stack sampling framework Anshuman Khandual
2013-10-16 6:56 ` [V3 09/10] power8, perf: Change BHRB branch filter configuration Anshuman Khandual
2013-10-16 6:56 ` [V3 10/10] powerpc, perf: Cleanup SW branch filter list look up Anshuman Khandual
2013-12-04 2:50 ` [V3 00/10] perf: New conditional branch filter Michael Ellerman