linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH V2 0/7] arm64/perf: Enable branch stack sampling
@ 2022-09-08  5:10 Anshuman Khandual
  2022-09-08  5:10 ` [PATCH V2 1/7] arm64/perf: Add register definitions for BRBE Anshuman Khandual
                   ` (7 more replies)
  0 siblings, 8 replies; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-08  5:10 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas
  Cc: Anshuman Khandual, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar

This series enables perf branch stack sampling support on the arm64 platform
via a new arch feature called Branch Record Buffer Extension (BRBE). All
relevant register definitions can be found here.

https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers

This series applies on v6.0-rc4 after the BRBE related perf ABI changes series
(V7) that was posted earlier, and a branch sample filter helper patch.

https://lore.kernel.org/all/20220824044822.70230-1-anshuman.khandual@arm.com/

https://lore.kernel.org/all/20220906084414.396220-1-anshuman.khandual@arm.com/

The following issues have been resolved

- James's concerns regarding permission inadequacy related to perfmon_capable()
- James's concerns regarding using perf_event_paranoid along with perfmon_capable()

The following issues remain inconclusive

- Rob's concerns regarding the series structure, arm_pmu callbacks based framework

Changes in V2:

- Dropped branch sample filter helpers consolidation patch from this series
- Added new hw_perf_event.flags element ARMPMU_EVT_PRIV to cache perfmon_capable()
- Use cached perfmon_capable() while configuring BRBE branch record filters

Changes in V1:

https://lore.kernel.org/linux-arm-kernel/20220613100119.684673-1-anshuman.khandual@arm.com/

- Added CONFIG_PERF_EVENTS wrapper for all branch sample filter helpers
- Process new perf branch types via PERF_BR_EXTEND_ABI

Changes in RFC V2:

https://lore.kernel.org/linux-arm-kernel/20220412115455.293119-1-anshuman.khandual@arm.com/

- Added branch_sample_priv() while consolidating other branch sample filter helpers
- Changed all SYS_BRBXXXN_EL1 register definition encodings per Marc
- Changed the BRBE driver as per proposed BRBE related perf ABI changes (V5)
- Added documentation for struct arm_pmu changes, updated commit message
- Updated commit message for BRBE detection infrastructure patch
- PERF_SAMPLE_BRANCH_KERNEL gets checked during arm event init (outside the driver)
- Branch privilege state capture mechanism has now moved inside the driver

Changes in RFC V1:

https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Anshuman Khandual (7):
  arm64/perf: Add register definitions for BRBE
  arm64/perf: Update struct arm_pmu for BRBE
  arm64/perf: Update struct pmu_hw_events for BRBE
  driver/perf/arm_pmu_platform: Add support for BRBE attributes detection
  arm64/perf: Drive BRBE from perf event states
  arm64/perf: Add BRBE driver
  arm64/perf: Enable branch stack sampling

 arch/arm64/include/asm/sysreg.h | 222 ++++++++++++++++
 arch/arm64/kernel/perf_event.c  |  48 ++++
 drivers/perf/Kconfig            |  11 +
 drivers/perf/Makefile           |   1 +
 drivers/perf/arm_pmu.c          |  82 +++++-
 drivers/perf/arm_pmu_brbe.c     | 448 ++++++++++++++++++++++++++++++++
 drivers/perf/arm_pmu_brbe.h     | 259 ++++++++++++++++++
 drivers/perf/arm_pmu_platform.c |  34 +++
 include/linux/perf/arm_pmu.h    |  67 +++++
 9 files changed, 1169 insertions(+), 3 deletions(-)
 create mode 100644 drivers/perf/arm_pmu_brbe.c
 create mode 100644 drivers/perf/arm_pmu_brbe.h

-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH V2 1/7] arm64/perf: Add register definitions for BRBE
  2022-09-08  5:10 [PATCH V2 0/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
@ 2022-09-08  5:10 ` Anshuman Khandual
  2022-09-12  9:57   ` Mark Brown
  2022-09-08  5:10 ` [PATCH V2 2/7] arm64/perf: Update struct arm_pmu " Anshuman Khandual
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-08  5:10 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas
  Cc: Anshuman Khandual, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar

This adds BRBE related register definitions and the associated field
macros. These will be used by the BRBE driver that is added later in this
series.
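
As a reading aid, the per-record encoding rule quoted in the comments of this
patch (CRm = N<3:0>, op2 = N<4> x 4 + k, with k = 0, 1, 2 for BRBINF, BRBSRC
and BRBTGT) can be sketched in plain C as below. The names brbe_reg_kind,
brbe_crm() and brbe_op2() are hypothetical and not part of the patch. Note
that in C '+' binds tighter than '>>', so the shift has to be parenthesised
before the register kind is added.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only; not part of the patch. */
enum brbe_reg_kind {
	BRBE_KIND_INF = 0,	/* BRBINF<N>_EL1 */
	BRBE_KIND_SRC = 1,	/* BRBSRC<N>_EL1 */
	BRBE_KIND_TGT = 2,	/* BRBTGT<N>_EL1 */
};

/* CRm carries the low four bits of the record index N. */
static inline uint32_t brbe_crm(uint32_t n)
{
	return n & 0xf;
}

/*
 * op2 = N<4> * 4 + kind. The shift is parenthesised on purpose:
 * 'x >> 2 + 1' would parse as 'x >> 3'.
 */
static inline uint32_t brbe_op2(uint32_t n, enum brbe_reg_kind kind)
{
	return ((n & 0x10) >> 2) + (uint32_t)kind;
}
```

With this rule, record 17 of the source address registers would land at
CRm = 1 and op2 = 5, while record 3 of the target address registers would
land at CRm = 3 and op2 = 2.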

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/sysreg.h | 222 ++++++++++++++++++++++++++++++++
 1 file changed, 222 insertions(+)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 7c71358d44c4..66b031e6f671 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -161,6 +161,224 @@
 #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
 #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
 
+/*
+ * BRBINF<N>_EL1 Encoding: [2, 1, 8, CRm, op2]
+ *
+ * derived as <CRm> = c{N<3:0>} <op2> = (N<4>x4 + 0)
+ */
+#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), (((n) & 0x10)) >> 2)
+
+#define SYS_BRBINF0_EL1			__SYS_BRBINFO(0)
+#define SYS_BRBINF1_EL1			__SYS_BRBINFO(1)
+#define SYS_BRBINF2_EL1			__SYS_BRBINFO(2)
+#define SYS_BRBINF3_EL1			__SYS_BRBINFO(3)
+#define SYS_BRBINF4_EL1			__SYS_BRBINFO(4)
+#define SYS_BRBINF5_EL1			__SYS_BRBINFO(5)
+#define SYS_BRBINF6_EL1			__SYS_BRBINFO(6)
+#define SYS_BRBINF7_EL1			__SYS_BRBINFO(7)
+#define SYS_BRBINF8_EL1			__SYS_BRBINFO(8)
+#define SYS_BRBINF9_EL1			__SYS_BRBINFO(9)
+#define SYS_BRBINF10_EL1		__SYS_BRBINFO(10)
+#define SYS_BRBINF11_EL1		__SYS_BRBINFO(11)
+#define SYS_BRBINF12_EL1		__SYS_BRBINFO(12)
+#define SYS_BRBINF13_EL1		__SYS_BRBINFO(13)
+#define SYS_BRBINF14_EL1		__SYS_BRBINFO(14)
+#define SYS_BRBINF15_EL1		__SYS_BRBINFO(15)
+#define SYS_BRBINF16_EL1		__SYS_BRBINFO(16)
+#define SYS_BRBINF17_EL1		__SYS_BRBINFO(17)
+#define SYS_BRBINF18_EL1		__SYS_BRBINFO(18)
+#define SYS_BRBINF19_EL1		__SYS_BRBINFO(19)
+#define SYS_BRBINF20_EL1		__SYS_BRBINFO(20)
+#define SYS_BRBINF21_EL1		__SYS_BRBINFO(21)
+#define SYS_BRBINF22_EL1		__SYS_BRBINFO(22)
+#define SYS_BRBINF23_EL1		__SYS_BRBINFO(23)
+#define SYS_BRBINF24_EL1		__SYS_BRBINFO(24)
+#define SYS_BRBINF25_EL1		__SYS_BRBINFO(25)
+#define SYS_BRBINF26_EL1		__SYS_BRBINFO(26)
+#define SYS_BRBINF27_EL1		__SYS_BRBINFO(27)
+#define SYS_BRBINF28_EL1		__SYS_BRBINFO(28)
+#define SYS_BRBINF29_EL1		__SYS_BRBINFO(29)
+#define SYS_BRBINF30_EL1		__SYS_BRBINFO(30)
+#define SYS_BRBINF31_EL1		__SYS_BRBINFO(31)
+
+/*
+ * BRBSRC<N>_EL1 Encoding: [2, 1, 8, CRm, op2]
+ *
+ * derived as <CRm> = c{N<3:0>} <op2> = (N<4>x4 + 1)
+ */
+#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 1))
+
+#define SYS_BRBSRC0_EL1			__SYS_BRBSRC(0)
+#define SYS_BRBSRC1_EL1			__SYS_BRBSRC(1)
+#define SYS_BRBSRC2_EL1			__SYS_BRBSRC(2)
+#define SYS_BRBSRC3_EL1			__SYS_BRBSRC(3)
+#define SYS_BRBSRC4_EL1			__SYS_BRBSRC(4)
+#define SYS_BRBSRC5_EL1			__SYS_BRBSRC(5)
+#define SYS_BRBSRC6_EL1			__SYS_BRBSRC(6)
+#define SYS_BRBSRC7_EL1			__SYS_BRBSRC(7)
+#define SYS_BRBSRC8_EL1			__SYS_BRBSRC(8)
+#define SYS_BRBSRC9_EL1			__SYS_BRBSRC(9)
+#define SYS_BRBSRC10_EL1		__SYS_BRBSRC(10)
+#define SYS_BRBSRC11_EL1		__SYS_BRBSRC(11)
+#define SYS_BRBSRC12_EL1		__SYS_BRBSRC(12)
+#define SYS_BRBSRC13_EL1		__SYS_BRBSRC(13)
+#define SYS_BRBSRC14_EL1		__SYS_BRBSRC(14)
+#define SYS_BRBSRC15_EL1		__SYS_BRBSRC(15)
+#define SYS_BRBSRC16_EL1		__SYS_BRBSRC(16)
+#define SYS_BRBSRC17_EL1		__SYS_BRBSRC(17)
+#define SYS_BRBSRC18_EL1		__SYS_BRBSRC(18)
+#define SYS_BRBSRC19_EL1		__SYS_BRBSRC(19)
+#define SYS_BRBSRC20_EL1		__SYS_BRBSRC(20)
+#define SYS_BRBSRC21_EL1		__SYS_BRBSRC(21)
+#define SYS_BRBSRC22_EL1		__SYS_BRBSRC(22)
+#define SYS_BRBSRC23_EL1		__SYS_BRBSRC(23)
+#define SYS_BRBSRC24_EL1		__SYS_BRBSRC(24)
+#define SYS_BRBSRC25_EL1		__SYS_BRBSRC(25)
+#define SYS_BRBSRC26_EL1		__SYS_BRBSRC(26)
+#define SYS_BRBSRC27_EL1		__SYS_BRBSRC(27)
+#define SYS_BRBSRC28_EL1		__SYS_BRBSRC(28)
+#define SYS_BRBSRC29_EL1		__SYS_BRBSRC(29)
+#define SYS_BRBSRC30_EL1		__SYS_BRBSRC(30)
+#define SYS_BRBSRC31_EL1		__SYS_BRBSRC(31)
+
+/*
+ * BRBTGT<N>_EL1 Encoding: [2, 1, 8, CRm, op2]
+ *
+ * derived as <CRm> = c{N<3:0>} <op2> = (N<4>x4 + 2)
+ */
+#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10) >> 2) + 2))
+
+#define SYS_BRBTGT0_EL1			__SYS_BRBTGT(0)
+#define SYS_BRBTGT1_EL1			__SYS_BRBTGT(1)
+#define SYS_BRBTGT2_EL1			__SYS_BRBTGT(2)
+#define SYS_BRBTGT3_EL1			__SYS_BRBTGT(3)
+#define SYS_BRBTGT4_EL1			__SYS_BRBTGT(4)
+#define SYS_BRBTGT5_EL1			__SYS_BRBTGT(5)
+#define SYS_BRBTGT6_EL1			__SYS_BRBTGT(6)
+#define SYS_BRBTGT7_EL1			__SYS_BRBTGT(7)
+#define SYS_BRBTGT8_EL1			__SYS_BRBTGT(8)
+#define SYS_BRBTGT9_EL1			__SYS_BRBTGT(9)
+#define SYS_BRBTGT10_EL1		__SYS_BRBTGT(10)
+#define SYS_BRBTGT11_EL1		__SYS_BRBTGT(11)
+#define SYS_BRBTGT12_EL1		__SYS_BRBTGT(12)
+#define SYS_BRBTGT13_EL1		__SYS_BRBTGT(13)
+#define SYS_BRBTGT14_EL1		__SYS_BRBTGT(14)
+#define SYS_BRBTGT15_EL1		__SYS_BRBTGT(15)
+#define SYS_BRBTGT16_EL1		__SYS_BRBTGT(16)
+#define SYS_BRBTGT17_EL1		__SYS_BRBTGT(17)
+#define SYS_BRBTGT18_EL1		__SYS_BRBTGT(18)
+#define SYS_BRBTGT19_EL1		__SYS_BRBTGT(19)
+#define SYS_BRBTGT20_EL1		__SYS_BRBTGT(20)
+#define SYS_BRBTGT21_EL1		__SYS_BRBTGT(21)
+#define SYS_BRBTGT22_EL1		__SYS_BRBTGT(22)
+#define SYS_BRBTGT23_EL1		__SYS_BRBTGT(23)
+#define SYS_BRBTGT24_EL1		__SYS_BRBTGT(24)
+#define SYS_BRBTGT25_EL1		__SYS_BRBTGT(25)
+#define SYS_BRBTGT26_EL1		__SYS_BRBTGT(26)
+#define SYS_BRBTGT27_EL1		__SYS_BRBTGT(27)
+#define SYS_BRBTGT28_EL1		__SYS_BRBTGT(28)
+#define SYS_BRBTGT29_EL1		__SYS_BRBTGT(29)
+#define SYS_BRBTGT30_EL1		__SYS_BRBTGT(30)
+#define SYS_BRBTGT31_EL1		__SYS_BRBTGT(31)
+
+#define SYS_BRBIDR0_EL1			sys_reg(2, 1, 9, 2, 0)
+#define SYS_BRBCR_EL1			sys_reg(2, 1, 9, 0, 0)
+#define SYS_BRBFCR_EL1			sys_reg(2, 1, 9, 0, 1)
+#define SYS_BRBTS_EL1			sys_reg(2, 1, 9, 0, 2)
+#define SYS_BRBINFINJ_EL1		sys_reg(2, 1, 9, 1, 0)
+#define SYS_BRBSRCINJ_EL1		sys_reg(2, 1, 9, 1, 1)
+#define SYS_BRBTGTINJ_EL1		sys_reg(2, 1, 9, 1, 2)
+
+#define BRBIDR0_CC_SHIFT	12
+#define BRBIDR0_CC_MASK		GENMASK(3, 0)
+#define BRBIDR0_FORMAT_SHIFT	8
+#define BRBIDR0_FORMAT_MASK	GENMASK(3, 0)
+#define BRBIDR0_NUMREC_SHIFT	0
+#define BRBIDR0_NUMREC_MASK	GENMASK(7, 0)
+
+#define BRBIDR0_CC_20_BIT	0x5
+#define BRBIDR0_FORMAT_0	0x0
+
+#define BRBIDR0_NUMREC_8	0x08
+#define BRBIDR0_NUMREC_16	0x10
+#define BRBIDR0_NUMREC_32	0x20
+#define BRBIDR0_NUMREC_64	0x40
+
+#define BRBINF_VALID_SHIFT	0
+#define BRBINF_VALID_MASK	GENMASK(1, 0)
+#define BRBINF_MPRED		(1UL << 5)
+#define BRBINF_EL_SHIFT		6
+#define BRBINF_EL_MASK		GENMASK(1, 0)
+#define BRBINF_TYPE_SHIFT	8
+#define BRBINF_TYPE_MASK	GENMASK(5, 0)
+#define BRBINF_TX		(1UL << 16)
+#define BRBINF_LASTFAILED	(1UL << 17)
+#define BRBINF_CC_SHIFT		32
+#define BRBINF_CC_MASK		GENMASK(13, 0)
+#define BRBINF_CCU		(1UL << 46)
+
+#define BRBINF_EL_EL0		0x0
+#define BRBINF_EL_EL1		0x1
+#define BRBINF_EL_EL2		0x2
+
+#define BRBINF_VALID_INVALID	0x0
+#define BRBINF_VALID_TARGET	0x1
+#define BRBINF_VALID_SOURCE	0x2
+#define BRBINF_VALID_ALL	0x3
+
+#define BRBINF_TYPE_UNCOND_DIR	0x0
+#define BRBINF_TYPE_INDIR	0x1
+#define BRBINF_TYPE_DIR_LINK	0x2
+#define BRBINF_TYPE_INDIR_LINK	0x3
+#define BRBINF_TYPE_RET_SUB	0x5
+#define BRBINF_TYPE_RET_EXCPT	0x7
+#define BRBINF_TYPE_COND_DIR	0x8
+#define BRBINF_TYPE_DEBUG_HALT	0x21
+#define BRBINF_TYPE_CALL	0x22
+#define BRBINF_TYPE_TRAP	0x23
+#define BRBINF_TYPE_SERROR	0x24
+#define BRBINF_TYPE_INST_DEBUG	0x26
+#define BRBINF_TYPE_DATA_DEBUG	0x27
+#define BRBINF_TYPE_ALGN_FAULT	0x2A
+#define BRBINF_TYPE_INST_FAULT	0x2B
+#define BRBINF_TYPE_DATA_FAULT	0x2C
+#define BRBINF_TYPE_IRQ		0x2E
+#define BRBINF_TYPE_FIQ		0x2F
+#define BRBINF_TYPE_DEBUG_EXIT	0x39
+
+#define BRBCR_E0BRE		(1UL << 0)
+#define BRBCR_E1BRE		(1UL << 1)
+#define BRBCR_CC		(1UL << 3)
+#define BRBCR_MPRED		(1UL << 4)
+#define BRBCR_FZP		(1UL << 8)
+#define BRBCR_ERTN		(1UL << 22)
+#define BRBCR_EXCEPTION		(1UL << 23)
+#define BRBCR_TS_MASK		GENMASK(1, 0)
+#define BRBCR_TS_SHIFT		5
+
+#define BRBCR_TS_VIRTUAL	0x1
+#define BRBCR_TS_GST_PHYSICAL	0x2
+#define BRBCR_TS_PHYSICAL	0x3
+
+#define BRBFCR_LASTFAILED	(1UL << 6)
+#define BRBFCR_PAUSED		(1UL << 7)
+#define BRBFCR_ENL		(1UL << 16)
+#define BRBFCR_DIRECT		(1UL << 17)
+#define BRBFCR_INDIRECT		(1UL << 18)
+#define BRBFCR_RTN		(1UL << 19)
+#define BRBFCR_INDCALL		(1UL << 20)
+#define BRBFCR_DIRCALL		(1UL << 21)
+#define BRBFCR_CONDDIR		(1UL << 22)
+#define BRBFCR_BANK_MASK	GENMASK(1, 0)
+#define BRBFCR_BANK_SHIFT	28
+
+#define BRBFCR_BANK_FIRST	0x0
+#define BRBFCR_BANK_SECOND	0x1
+
+#define BRBFCR_BRANCH_ALL	(BRBFCR_DIRECT | BRBFCR_INDIRECT | \
+				 BRBFCR_RTN | BRBFCR_INDCALL | \
+				 BRBFCR_DIRCALL | BRBFCR_CONDDIR)
+
 #define SYS_MIDR_EL1			sys_reg(3, 0, 0, 0, 0)
 #define SYS_MPIDR_EL1			sys_reg(3, 0, 0, 0, 5)
 #define SYS_REVIDR_EL1			sys_reg(3, 0, 0, 0, 6)
@@ -826,6 +1044,7 @@
 #define ID_AA64MMFR2_CNP_SHIFT		0
 
 /* id_aa64dfr0 */
+#define ID_AA64DFR0_BRBE_SHIFT		52
 #define ID_AA64DFR0_MTPMU_SHIFT		48
 #define ID_AA64DFR0_TRBE_SHIFT		44
 #define ID_AA64DFR0_TRACE_FILT_SHIFT	40
@@ -848,6 +1067,9 @@
 #define ID_AA64DFR0_PMSVER_8_2		0x1
 #define ID_AA64DFR0_PMSVER_8_3		0x2
 
+#define ID_AA64DFR0_BRBE		0x1
+#define ID_AA64DFR0_BRBE_V1P1		0x2
+
 #define ID_DFR0_PERFMON_SHIFT		24
 
 #define ID_DFR0_PERFMON_8_0		0x3
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH V2 2/7] arm64/perf: Update struct arm_pmu for BRBE
  2022-09-08  5:10 [PATCH V2 0/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
  2022-09-08  5:10 ` [PATCH V2 1/7] arm64/perf: Add register definitions for BRBE Anshuman Khandual
@ 2022-09-08  5:10 ` Anshuman Khandual
  2022-09-08  5:10 ` [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events " Anshuman Khandual
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-08  5:10 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas
  Cc: Anshuman Khandual, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar

Although BRBE is an armv8 specific HW feature, abstracting out its various
function callbacks at the struct arm_pmu level is preferred, as it is
cleaner, easier to follow and maintain.

Besides, some helpers i.e. brbe_supported(), brbe_probe() and brbe_reset()
might not fit seamlessly when embedded via the existing arm_pmu helpers in
the armv8 implementation.

Update struct arm_pmu to include all required helpers that will drive
BRBE functionality for a given PMU implementation. These are the following.

- brbe_filter	: Convert perf event filters into BRBE HW filters
- brbe_probe	: Probe BRBE HW and capture its attributes
- brbe_enable	: Enable BRBE HW with a given config
- brbe_disable	: Disable BRBE HW
- brbe_read	: Read BRBE buffer for captured branch records
- brbe_reset	: Reset BRBE buffer
- brbe_supported: Whether BRBE is supported or not

A BRBE driver implementation needs to provide these functionalities.
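
The callback arrangement described above can be sketched outside the kernel
as follows. This is only an illustration under stated assumptions: the types
hw_events_sketch and pmu_sketch are hypothetical stand-ins for struct
pmu_hw_events and struct arm_pmu, only two of the seven callbacks are
modelled, and the record count of 32 is an arbitrary example value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU state (struct pmu_hw_events). */
struct hw_events_sketch {
	int brbe_nr;	/* number of branch records found by the probe */
};

/* Hypothetical, reduced callback table (struct arm_pmu has seven). */
struct pmu_sketch {
	void (*brbe_probe)(struct hw_events_sketch *hw);
	bool (*brbe_supported)(void);
};

/* Empty backend, in the spirit of the stubs this patch installs. */
static void stub_probe(struct hw_events_sketch *hw) { hw->brbe_nr = 0; }
static bool stub_supported(void) { return false; }

/* A backend that pretends to have found 32 records (assumed value). */
static void driver_probe(struct hw_events_sketch *hw) { hw->brbe_nr = 32; }
static bool driver_supported(void) { return true; }

/*
 * Generic code only goes through the table. It returns -1 when no
 * probe callback was installed, else the number of records found.
 */
static int probe_records(const struct pmu_sketch *pmu,
			 struct hw_events_sketch *hw)
{
	if (!pmu->brbe_probe)
		return -1;
	pmu->brbe_probe(hw);
	return hw->brbe_nr;
}
```

The point of the indirection is that the generic PMU code never needs to know
which backend is present; swapping the stub table for the driver table changes
behaviour without touching the caller.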

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/kernel/perf_event.c | 36 ++++++++++++++++++++++++++++++++++
 include/linux/perf/arm_pmu.h   | 21 ++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index cb69ff1e6138..e7013699171f 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1025,6 +1025,35 @@ static int armv8pmu_filter_match(struct perf_event *event)
 	return evtype != ARMV8_PMUV3_PERFCTR_CHAIN;
 }
 
+static void armv8pmu_brbe_filter(struct pmu_hw_events *hw_event, struct perf_event *event)
+{
+}
+
+static void armv8pmu_brbe_enable(struct pmu_hw_events *hw_event)
+{
+}
+
+static void armv8pmu_brbe_disable(struct pmu_hw_events *hw_event)
+{
+}
+
+static void armv8pmu_brbe_read(struct pmu_hw_events *hw_event, struct perf_event *event)
+{
+}
+
+static void armv8pmu_brbe_probe(struct pmu_hw_events *hw_event)
+{
+}
+
+static void armv8pmu_brbe_reset(struct pmu_hw_events *hw_event)
+{
+}
+
+static bool armv8pmu_brbe_supported(struct perf_event *event)
+{
+	return false;
+}
+
 static void armv8pmu_reset(void *info)
 {
 	struct arm_pmu *cpu_pmu = (struct arm_pmu *)info;
@@ -1257,6 +1286,13 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
 
 	cpu_pmu->pmu.event_idx		= armv8pmu_user_event_idx;
 
+	cpu_pmu->brbe_filter		= armv8pmu_brbe_filter;
+	cpu_pmu->brbe_enable		= armv8pmu_brbe_enable;
+	cpu_pmu->brbe_disable		= armv8pmu_brbe_disable;
+	cpu_pmu->brbe_read		= armv8pmu_brbe_read;
+	cpu_pmu->brbe_probe		= armv8pmu_brbe_probe;
+	cpu_pmu->brbe_reset		= armv8pmu_brbe_reset;
+	cpu_pmu->brbe_supported		= armv8pmu_brbe_supported;
 	cpu_pmu->name			= name;
 	cpu_pmu->map_event		= map_event;
 	cpu_pmu->attr_groups[ARMPMU_ATTR_GROUP_EVENTS] = events ?
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 0407a38b470a..3d427ac0ca45 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -100,6 +100,27 @@ struct arm_pmu {
 	void		(*reset)(void *);
 	int		(*map_event)(struct perf_event *event);
 	int		(*filter_match)(struct perf_event *event);
+
+	/* Convert perf event filters into BRBE HW filters */
+	void		(*brbe_filter)(struct pmu_hw_events *hw_events, struct perf_event *event);
+
+	/* Probe BRBE HW and capture its attributes */
+	void		(*brbe_probe)(struct pmu_hw_events *hw_events);
+
+	/* Enable BRBE HW with a given config */
+	void		(*brbe_enable)(struct pmu_hw_events *hw_events);
+
+	/* Disable BRBE HW */
+	void		(*brbe_disable)(struct pmu_hw_events *hw_events);
+
+	/* Process BRBE buffer for captured branch records */
+	void		(*brbe_read)(struct pmu_hw_events *hw_events, struct perf_event *event);
+
+	/* Reset BRBE buffer */
+	void		(*brbe_reset)(struct pmu_hw_events *hw_events);
+
+	/* Check whether BRBE is supported */
+	bool		(*brbe_supported)(struct perf_event *event);
 	int		num_events;
 	bool		secure_access; /* 32-bit ARM only */
 #define ARMV8_PMUV3_MAX_COMMON_EVENTS		0x40
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events for BRBE
  2022-09-08  5:10 [PATCH V2 0/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
  2022-09-08  5:10 ` [PATCH V2 1/7] arm64/perf: Add register definitions for BRBE Anshuman Khandual
  2022-09-08  5:10 ` [PATCH V2 2/7] arm64/perf: Update struct arm_pmu " Anshuman Khandual
@ 2022-09-08  5:10 ` Anshuman Khandual
       [not found]   ` <202209082022.6BPdyQn8-lkp@intel.com>
                     ` (2 more replies)
  2022-09-08  5:10 ` [PATCH V2 4/7] driver/perf/arm_pmu_platform: Add support for BRBE attributes detection Anshuman Khandual
                   ` (4 subsequent siblings)
  7 siblings, 3 replies; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-08  5:10 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas
  Cc: Anshuman Khandual, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar

BRBE related contexts and data for a single perf event instance will be
tracked in struct pmu_hw_events. Hence update the structure to accommodate
the required BRBE details.

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/perf/arm_pmu.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 3d427ac0ca45..18e519e4e658 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -43,6 +43,11 @@
 	},								\
 }
 
+/*
+ * Maximum branch records in BRBE
+ */
+#define BRBE_MAX_ENTRIES 64
+
 /* The events for a given PMU register set. */
 struct pmu_hw_events {
 	/*
@@ -69,6 +74,23 @@ struct pmu_hw_events {
 	struct arm_pmu		*percpu_pmu;
 
 	int irq;
+
+	/* Detected BRBE attributes */
+	bool				v1p1;
+	int				brbe_cc;
+	int				brbe_nr;
+
+	/* Evaluated BRBE configuration */
+	u64				brbfcr;
+	u64				brbcr;
+
+	/* Tracked BRBE context */
+	unsigned int			brbe_users;
+	void				*brbe_context;
+
+	/* Captured BRBE buffer - copied as is into perf_sample_data */
+	struct perf_branch_stack	brbe_stack;
+	struct perf_branch_entry	brbe_entries[BRBE_MAX_ENTRIES];
 };
 
 enum armpmu_attr_groups {
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH V2 4/7] driver/perf/arm_pmu_platform: Add support for BRBE attributes detection
  2022-09-08  5:10 [PATCH V2 0/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
                   ` (2 preceding siblings ...)
  2022-09-08  5:10 ` [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events " Anshuman Khandual
@ 2022-09-08  5:10 ` Anshuman Khandual
  2022-09-08  5:10 ` [PATCH V2 5/7] arm64/perf: Drive BRBE from perf event states Anshuman Khandual
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-08  5:10 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas
  Cc: Anshuman Khandual, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar

This adds arm pmu infrastructure to probe the BRBE implementation's
attributes via driver exported callbacks later. The actual BRBE feature
detection will be added by the driver itself.

CPU specific BRBE entries, cycle count and format support get detected during
PMU init. This information gets saved in the per-cpu struct pmu_hw_events,
which later helps in operating BRBE during a perf event context.
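
For illustration, decoding the three attributes from a BRBIDR0_EL1 value
could look like the sketch below. It reuses the shift and mask values from
the register definitions patch in this series (CC at bits [15:12], FORMAT at
bits [11:8], NUMREC at bits [7:0]). The names brbe_attrs_sketch,
decode_brbidr0() and valid_numrec() are hypothetical, and the sketch takes
the raw register value as an argument instead of reading the system register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the detected attributes. */
struct brbe_attrs_sketch {
	int cc;		/* cycle count field */
	int format;	/* record format field */
	int nr;		/* number of branch records */
};

/* Split a raw BRBIDR0_EL1 value into its three fields. */
static struct brbe_attrs_sketch decode_brbidr0(uint64_t val)
{
	struct brbe_attrs_sketch a;

	a.cc     = (int)((val >> 12) & 0xf);	/* BRBIDR0_CC_SHIFT/MASK */
	a.format = (int)((val >> 8) & 0xf);	/* BRBIDR0_FORMAT_SHIFT/MASK */
	a.nr     = (int)(val & 0xff);		/* BRBIDR0_NUMREC_SHIFT/MASK */
	return a;
}

/* Accept only the record counts that have BRBIDR0_NUMREC_* constants. */
static bool valid_numrec(int nr)
{
	return nr == 8 || nr == 16 || nr == 32 || nr == 64;
}
```

A raw value of 0x5020 would decode to cc = 0x5 (BRBIDR0_CC_20_BIT),
format = 0x0 (BRBIDR0_FORMAT_0) and nr = 0x20 (BRBIDR0_NUMREC_32).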

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 drivers/perf/arm_pmu_platform.c | 34 +++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/drivers/perf/arm_pmu_platform.c b/drivers/perf/arm_pmu_platform.c
index 513de1f54e2d..800e4a6e8bc3 100644
--- a/drivers/perf/arm_pmu_platform.c
+++ b/drivers/perf/arm_pmu_platform.c
@@ -172,6 +172,36 @@ static int armpmu_request_irqs(struct arm_pmu *armpmu)
 	return err;
 }
 
+static void arm_brbe_probe_cpu(void *info)
+{
+	struct pmu_hw_events *hw_events;
+	struct arm_pmu *armpmu = info;
+
+	/*
+	 * Return from here, if BRBE driver has not been
+	 * implemented for this PMU. This helps prevent
+	 * kernel crash later when brbe_probe() will be
+	 * called on the PMU.
+	 */
+	if (!armpmu->brbe_probe)
+		return;
+
+	hw_events = per_cpu_ptr(armpmu->hw_events, smp_processor_id());
+	armpmu->brbe_probe(hw_events);
+}
+
+static int armpmu_request_brbe(struct arm_pmu *armpmu)
+{
+	int cpu, err = 0;
+
+	for_each_cpu(cpu, &armpmu->supported_cpus) {
+		err = smp_call_function_single(cpu, arm_brbe_probe_cpu, armpmu, 1);
+		if (err)
+			return err;
+	}
+	return err;
+}
+
 static void armpmu_free_irqs(struct arm_pmu *armpmu)
 {
 	int cpu;
@@ -229,6 +259,10 @@ int arm_pmu_device_probe(struct platform_device *pdev,
 	if (ret)
 		goto out_free_irqs;
 
+	ret = armpmu_request_brbe(pmu);
+	if (ret)
+		goto out_free_irqs;
+
 	ret = armpmu_register(pmu);
 	if (ret) {
 		dev_err(dev, "failed to register PMU devices!\n");
-- 
2.25.1



^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH V2 5/7] arm64/perf: Drive BRBE from perf event states
  2022-09-08  5:10 [PATCH V2 0/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
                   ` (3 preceding siblings ...)
  2022-09-08  5:10 ` [PATCH V2 4/7] driver/perf/arm_pmu_platform: Add support for BRBE attributes detection Anshuman Khandual
@ 2022-09-08  5:10 ` Anshuman Khandual
  2022-09-08 15:31   ` kernel test robot
  2022-09-08  5:10 ` [PATCH V2 6/7] arm64/perf: Add BRBE driver Anshuman Khandual
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-08  5:10 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas
  Cc: Anshuman Khandual, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar

Branch stack sampling rides along the normal perf event and all the branch
records get captured during the PMU interrupt. This just changes perf event
handling on the arm64 platform to accommodate the BRBE operations required
to enable branch stack sampling support.

It adds a new 'hw_perf_event.flags' element i.e. ARMPMU_EVT_PRIV, which
caches the perf event privilege information required for capturing some
branch record types.
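
The user counting that this patch adds to armpmu_start() and armpmu_stop()
can be sketched in isolation as below. This is a simplified model, not the
kernel code: brbe_state_sketch, sketch_start() and sketch_stop() are
hypothetical names, the hardware is replaced by an 'enabled' flag and a
reset counter, and an unbalanced stop is ignored here whereas the patch
flags it with WARN_ON_ONCE().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the BRBE bookkeeping kept per CPU. */
struct brbe_state_sketch {
	unsigned int users;	/* events currently using BRBE */
	const void *context;	/* last task context that owned the buffer */
	bool enabled;		/* stands in for the HW enable state */
	unsigned int resets;	/* how many times the buffer was reset */
};

/* task_ctx is NULL for a CPU-bound event, non-NULL for a task-bound one. */
static void sketch_start(struct brbe_state_sketch *s, const void *task_ctx)
{
	/* A task event from a new context must not see stale records. */
	if (task_ctx && s->context != task_ctx) {
		s->resets++;
		s->context = task_ctx;
	}
	s->enabled = true;
	s->users++;
}

static void sketch_stop(struct brbe_state_sketch *s)
{
	if (!s->users)
		return;	/* unbalanced stop; ignored in this sketch */

	/* The last user going away drops the context and disables BRBE. */
	if (--s->users == 0) {
		s->context = NULL;
		s->enabled = false;
	}
}
```

The buffer is shared by all events on a CPU, so it is only disabled once the
last user stops, and it is reset whenever a task-bound event from a different
context starts using it.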

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/kernel/perf_event.c |  6 ++++
 drivers/perf/arm_pmu.c         | 50 ++++++++++++++++++++++++++++++++++
 include/linux/perf/arm_pmu.h   |  4 +++
 3 files changed, 60 insertions(+)

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index e7013699171f..5bfaba8edad1 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -874,6 +874,12 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
+		if (has_branch_stack(event)) {
+			cpu_pmu->brbe_read(cpuc, event);
+			data.br_stack = &cpuc->brbe_stack;
+			cpu_pmu->brbe_reset(cpuc);
+		}
+
 		/*
 		 * Perf event overflow will queue the processing of the event as
 		 * an irq_work which will be taken care of in the handling of
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 59d3980b8ca2..1fe5d6238b81 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -271,12 +271,22 @@ armpmu_stop(struct perf_event *event, int flags)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
+	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
 
 	/*
 	 * ARM pmu always has to update the counter, so ignore
 	 * PERF_EF_UPDATE, see comments in armpmu_start().
 	 */
 	if (!(hwc->state & PERF_HES_STOPPED)) {
+		if (has_branch_stack(event)) {
+			WARN_ON_ONCE(!hw_events->brbe_users);
+			hw_events->brbe_users--;
+			if (!hw_events->brbe_users) {
+				hw_events->brbe_context = NULL;
+				armpmu->brbe_disable(hw_events);
+			}
+		}
+
 		armpmu->disable(event);
 		armpmu_event_update(event);
 		hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
@@ -287,6 +297,7 @@ static void armpmu_start(struct perf_event *event, int flags)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
+	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
 
 	/*
 	 * ARM pmu always has to reprogram the period, so ignore
@@ -304,6 +315,14 @@ static void armpmu_start(struct perf_event *event, int flags)
 	 * happened since disabling.
 	 */
 	armpmu_event_set_period(event);
+	if (has_branch_stack(event)) {
+		if (event->ctx->task && hw_events->brbe_context != event->ctx) {
+			armpmu->brbe_reset(hw_events);
+			hw_events->brbe_context = event->ctx;
+		}
+		armpmu->brbe_enable(hw_events);
+		hw_events->brbe_users++;
+	}
 	armpmu->enable(event);
 }
 
@@ -349,6 +368,10 @@ armpmu_add(struct perf_event *event, int flags)
 	hw_events->events[idx] = event;
 
 	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+	if (has_branch_stack(event))
+		armpmu->brbe_filter(hw_events, event);
+
 	if (flags & PERF_EF_START)
 		armpmu_start(event, PERF_EF_RELOAD);
 
@@ -443,6 +466,7 @@ __hw_perf_event_init(struct perf_event *event)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
 	struct hw_perf_event *hwc = &event->hw;
+	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
 	int mapping;
 
 	hwc->flags = 0;
@@ -492,6 +516,19 @@ __hw_perf_event_init(struct perf_event *event)
 		local64_set(&hwc->period_left, hwc->sample_period);
 	}
 
+	if (has_branch_stack(event)) {
+		/*
+		 * Cache whether the perf event is allowed to capture exception
+		 * and exception return branch records. It allows us to perform
+		 * the privilege check via perfmon_capable(), in the context of
+		 * the event owner, just once, during the pmu->event_init().
+		 */
+		if (perfmon_capable())
+			event->hw.flags |= ARMPMU_EVT_PRIV;
+
+		armpmu->brbe_filter(hw_events, event);
+	}
+
 	return validate_group(event);
 }
 
@@ -520,6 +557,18 @@ static int armpmu_event_init(struct perf_event *event)
 	return __hw_perf_event_init(event);
 }
 
+static void armpmu_sched_task(struct perf_event_context *ctx, bool sched_in)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(ctx->pmu);
+	struct pmu_hw_events *hw_events = this_cpu_ptr(armpmu->hw_events);
+
+	if (!hw_events->brbe_users)
+		return;
+
+	if (sched_in)
+		armpmu->brbe_reset(hw_events);
+}
+
 static void armpmu_enable(struct pmu *pmu)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(pmu);
@@ -877,6 +926,7 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
 	}
 
 	pmu->pmu = (struct pmu) {
+		.sched_task	= armpmu_sched_task,
 		.pmu_enable	= armpmu_enable,
 		.pmu_disable	= armpmu_disable,
 		.event_init	= armpmu_event_init,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 18e519e4e658..67f44020a736 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -29,6 +29,10 @@
 /* Event uses a 47bit counter */
 #define ARMPMU_EVT_47BIT		2
 
+#define ARMPMU_EVT_PRIV			0x00004	/* Event is privileged */
+
+static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_PRIV) == ARMPMU_EVT_PRIV);
+
 #define HW_OP_UNSUPPORTED		0xFFFF
 #define C(_x)				PERF_COUNT_HW_CACHE_##_x
 #define CACHE_OP_UNSUPPORTED		0xFFFF
-- 
2.25.1
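
For readers following the ARMPMU_EVT_PRIV change above, here is a minimal
user space sketch of the "check once, cache in a flag bit" pattern. It is
an illustration only: struct demo_event, demo_event_init() and
demo_event_has_priv() are hypothetical names, and the owner_capable
argument stands in for the result of perfmon_capable() evaluated in the
event owner's context. Only the 0x4 flag value is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Same bit value as ARMPMU_EVT_PRIV in the patch; the name is hypothetical. */
#define DEMO_EVT_PRIV	0x00004u

/* Hypothetical stand-in for the hw_perf_event flags word. */
struct demo_event {
	unsigned int flags;
};

/*
 * Runs once, at event creation time, in the context of the event owner.
 * owner_capable models the result of the privilege check made there.
 */
static void demo_event_init(struct demo_event *ev, bool owner_capable)
{
	ev->flags = 0;
	if (owner_capable)
		ev->flags |= DEMO_EVT_PRIV;
}

/* Later consumers read the cached bit instead of repeating the check. */
static bool demo_event_has_priv(const struct demo_event *ev)
{
	return !!(ev->flags & DEMO_EVT_PRIV);
}
```

The point of caching is that later callers may run in a context other than
the event owner's, where repeating the check could give a different answer.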


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH V2 6/7] arm64/perf: Add BRBE driver
  2022-09-08  5:10 [PATCH V2 0/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
                   ` (4 preceding siblings ...)
  2022-09-08  5:10 ` [PATCH V2 5/7] arm64/perf: Drive BRBE from perf event states Anshuman Khandual
@ 2022-09-08  5:10 ` Anshuman Khandual
  2022-09-08  9:23   ` kernel test robot
  2022-09-13 10:39   ` James Clark
  2022-09-08  5:10 ` [PATCH V2 7/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
  2022-09-13 10:55 ` [PATCH V2 0/7] " James Clark
  7 siblings, 2 replies; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-08  5:10 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas
  Cc: Anshuman Khandual, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar

This adds a BRBE driver which implements all the required helper functions
for struct arm_pmu. The following functions are defined by this driver; they
probe, configure, enable, capture, reset and disable the BRBE buffer HW as
and when requested via the perf branch stack sampling framework.

- arm64_pmu_brbe_filter()
- arm64_pmu_brbe_enable()
- arm64_pmu_brbe_disable()
- arm64_pmu_brbe_read()
- arm64_pmu_brbe_probe()
- arm64_pmu_brbe_reset()
- arm64_pmu_brbe_supported()

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/kernel/perf_event.c |   8 +-
 drivers/perf/Kconfig           |  11 +
 drivers/perf/Makefile          |   1 +
 drivers/perf/arm_pmu_brbe.c    | 448 +++++++++++++++++++++++++++++++++
 drivers/perf/arm_pmu_brbe.h    | 259 +++++++++++++++++++
 include/linux/perf/arm_pmu.h   |  20 ++
 6 files changed, 746 insertions(+), 1 deletion(-)
 create mode 100644 drivers/perf/arm_pmu_brbe.c
 create mode 100644 drivers/perf/arm_pmu_brbe.h

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 5bfaba8edad1..76d409d9b5f3 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1033,31 +1033,37 @@ static int armv8pmu_filter_match(struct perf_event *event)
 
 static void armv8pmu_brbe_filter(struct pmu_hw_events *hw_event, struct perf_event *event)
 {
+	arm64_pmu_brbe_filter(hw_event, event);
 }
 
 static void armv8pmu_brbe_enable(struct pmu_hw_events *hw_event)
 {
+	arm64_pmu_brbe_enable(hw_event);
 }
 
 static void armv8pmu_brbe_disable(struct pmu_hw_events *hw_event)
 {
+	arm64_pmu_brbe_disable(hw_event);
 }
 
 static void armv8pmu_brbe_read(struct pmu_hw_events *hw_event, struct perf_event *event)
 {
+	arm64_pmu_brbe_read(hw_event, event);
 }
 
 static void armv8pmu_brbe_probe(struct pmu_hw_events *hw_event)
 {
+	arm64_pmu_brbe_probe(hw_event);
 }
 
 static void armv8pmu_brbe_reset(struct pmu_hw_events *hw_event)
 {
+	arm64_pmu_brbe_reset(hw_event);
 }
 
 static bool armv8pmu_brbe_supported(struct perf_event *event)
 {
-	return false;
+	return arm64_pmu_brbe_supported(event);
 }
 
 static void armv8pmu_reset(void *info)
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index 1e2d69453771..9fa34a1d3a23 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -183,6 +183,17 @@ config APPLE_M1_CPU_PMU
 	  Provides support for the non-architectural CPU PMUs present on
 	  the Apple M1 SoCs and derivatives.
 
+config ARM_BRBE_PMU
+	tristate "Enable support for Branch Record Buffer Extension (BRBE)"
+	depends on ARM64 && ARM_PMU
+	default y
+	help
+	  Enable perf support for Branch Record Buffer Extension (BRBE) which
+	  records all branches taken in an execution path. This supports some
+	  branch types and privilege based filtering. It captures additional
+	  relevant information such as cycle count, misprediction, branch type
+	  and branch privilege level.
+
 source "drivers/perf/hisilicon/Kconfig"
 
 config MARVELL_CN10K_DDR_PMU
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 57a279c61df5..b81fc134d95f 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -20,3 +20,4 @@ obj-$(CONFIG_ARM_DMC620_PMU) += arm_dmc620_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_TAD_PMU) += marvell_cn10k_tad_pmu.o
 obj-$(CONFIG_MARVELL_CN10K_DDR_PMU) += marvell_cn10k_ddr_pmu.o
 obj-$(CONFIG_APPLE_M1_CPU_PMU) += apple_m1_cpu_pmu.o
+obj-$(CONFIG_ARM_BRBE_PMU) += arm_pmu_brbe.o
diff --git a/drivers/perf/arm_pmu_brbe.c b/drivers/perf/arm_pmu_brbe.c
new file mode 100644
index 000000000000..d2d546a8eaab
--- /dev/null
+++ b/drivers/perf/arm_pmu_brbe.c
@@ -0,0 +1,448 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Branch Record Buffer Extension Driver.
+ *
+ * Copyright (C) 2021 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+#include "arm_pmu_brbe.h"
+
+#define BRBE_FCR_MASK (BRBFCR_BRANCH_ALL)
+#define BRBE_CR_MASK  (BRBCR_EXCEPTION | BRBCR_ERTN | BRBCR_CC | \
+		       BRBCR_MPRED | BRBCR_E1BRE | BRBCR_E0BRE)
+
+static bool arm64_pmu_brbe_has_priv(struct perf_event *event)
+{
+	return !!(event->hw.flags & ARMPMU_EVT_PRIV);
+}
+
+static void set_brbe_disabled(struct pmu_hw_events *cpuc)
+{
+	cpuc->brbe_nr = 0;
+}
+
+static bool brbe_disabled(struct pmu_hw_events *cpuc)
+{
+	return !cpuc->brbe_nr;
+}
+
+bool arm64_pmu_brbe_supported(struct perf_event *event)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	struct pmu_hw_events *hw_events = per_cpu_ptr(armpmu->hw_events, event->cpu);
+
+	/*
+	 * If the event does not have at least one of the privilege
+	 * branch filters as in PERF_SAMPLE_BRANCH_PLM_ALL, the core
+	 * perf will adjust its value based on perf event's existing
+	 * privilege level via attr.exclude_[user|kernel|hv].
+	 *
+	 * As event->attr.branch_sample_type might have been changed
+	 * when the event reaches here, it is not possible to figure
+	 * out whether the event originally had HV privilege request
+	 * or got added via the core perf. Just report this situation
+	 * once and continue ignoring if there are other instances.
+	 */
+	if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_HV)
+		pr_warn_once("does not support hypervisor privilege branch filter\n");
+
+	if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
+		pr_warn_once("does not support aborted transaction branch filter\n");
+		return false;
+	}
+
+	if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_NO_TX) {
+		pr_warn_once("does not support non transaction branch filter\n");
+		return false;
+	}
+
+	if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_IN_TX) {
+		pr_warn_once("does not support in transaction branch filter\n");
+		return false;
+	}
+	return !brbe_disabled(hw_events);
+}
+
+void arm64_pmu_brbe_probe(struct pmu_hw_events *cpuc)
+{
+	u64 aa64dfr0, brbidr;
+	unsigned int brbe, format, cpu = smp_processor_id();
+
+	aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
+	brbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_BRBE_SHIFT);
+	if (!brbe) {
+		pr_info("no implementation found on cpu %d\n", cpu);
+		set_brbe_disabled(cpuc);
+		return;
+	} else if (brbe == ID_AA64DFR0_BRBE) {
+		pr_info("implementation found on cpu %d\n", cpu);
+		cpuc->v1p1 = false;
+	} else if (brbe == ID_AA64DFR0_BRBE_V1P1) {
+		pr_info("implementation (v1p1) found on cpu %d\n", cpu);
+		cpuc->v1p1 = true;
+	}
+
+	brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
+	format = brbe_fetch_format(brbidr);
+	if (format != BRBIDR0_FORMAT_0) {
+		pr_warn("format 0 not implemented\n");
+		set_brbe_disabled(cpuc);
+		return;
+	}
+
+	cpuc->brbe_cc = brbe_fetch_cc_bits(brbidr);
+	if (cpuc->brbe_cc != BRBIDR0_CC_20_BIT) {
+		pr_warn("20-bit counter not implemented\n");
+		set_brbe_disabled(cpuc);
+		return;
+	}
+
+	cpuc->brbe_nr = brbe_fetch_numrec(brbidr);
+	if (!valid_brbe_nr(cpuc->brbe_nr)) {
+		pr_warn("invalid number of records\n");
+		set_brbe_disabled(cpuc);
+		return;
+	}
+}
+
+void arm64_pmu_brbe_enable(struct pmu_hw_events *cpuc)
+{
+	u64 brbfcr, brbcr;
+
+	if (brbe_disabled(cpuc))
+		return;
+
+	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+	brbfcr &= ~(BRBFCR_BANK_MASK << BRBFCR_BANK_SHIFT);
+	brbfcr &= ~(BRBFCR_ENL | BRBFCR_PAUSED | BRBE_FCR_MASK);
+	brbfcr |= (cpuc->brbfcr & BRBE_FCR_MASK);
+	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+	isb();
+
+	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
+	brbcr &= ~BRBE_CR_MASK;
+	brbcr |= BRBCR_FZP;
+	brbcr |= (BRBCR_TS_PHYSICAL << BRBCR_TS_SHIFT);
+	brbcr |= (cpuc->brbcr & BRBE_CR_MASK);
+	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+	isb();
+}
+
+void arm64_pmu_brbe_disable(struct pmu_hw_events *cpuc)
+{
+	u64 brbcr;
+
+	if (brbe_disabled(cpuc))
+		return;
+
+	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
+	brbcr &= ~(BRBCR_E0BRE | BRBCR_E1BRE);
+	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+	isb();
+}
+
+static void perf_branch_to_brbfcr(struct pmu_hw_events *cpuc, int branch_type)
+{
+	cpuc->brbfcr = 0;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+		cpuc->brbfcr |= BRBFCR_BRANCH_ALL;
+		return;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
+		cpuc->brbfcr |= (BRBFCR_INDCALL | BRBFCR_DIRCALL);
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+		cpuc->brbfcr |= BRBFCR_RTN;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
+		cpuc->brbfcr |= BRBFCR_INDCALL;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_COND)
+		cpuc->brbfcr |= BRBFCR_CONDDIR;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP)
+		cpuc->brbfcr |= BRBFCR_INDIRECT;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_CALL)
+		cpuc->brbfcr |= BRBFCR_DIRCALL;
+}
+
+static void perf_branch_to_brbcr(struct pmu_hw_events *cpuc, int branch_type, bool privilege)
+{
+	cpuc->brbcr = (BRBCR_CC | BRBCR_MPRED);
+
+	if (branch_type & PERF_SAMPLE_BRANCH_USER)
+		cpuc->brbcr |= BRBCR_E0BRE;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_KERNEL) {
+		/*
+		 * This should have been verified earlier.
+		 */
+		WARN_ON(!privilege);
+		cpuc->brbcr |= BRBCR_E1BRE;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_NO_CYCLES)
+		cpuc->brbcr &= ~BRBCR_CC;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_NO_FLAGS)
+		cpuc->brbcr &= ~BRBCR_MPRED;
+
+	if (!privilege)
+		return;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+		cpuc->brbcr |= BRBCR_EXCEPTION;
+		cpuc->brbcr |= BRBCR_ERTN;
+		return;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
+		cpuc->brbcr |= BRBCR_EXCEPTION;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+		cpuc->brbcr |= BRBCR_ERTN;
+}
+
+
+void arm64_pmu_brbe_filter(struct pmu_hw_events *cpuc, struct perf_event *event)
+{
+	u64 branch_type = event->attr.branch_sample_type;
+	bool privilege = arm64_pmu_brbe_has_priv(event);
+
+	if (brbe_disabled(cpuc))
+		return;
+
+	perf_branch_to_brbfcr(cpuc, branch_type);
+	perf_branch_to_brbcr(cpuc, branch_type, privilege);
+}
+
+static int brbe_fetch_perf_type(u64 brbinf, bool *new_branch_type)
+{
+	int brbe_type = brbe_fetch_type(brbinf);
+	*new_branch_type = false;
+
+	switch (brbe_type) {
+	case BRBINF_TYPE_UNCOND_DIR:
+		return PERF_BR_UNCOND;
+	case BRBINF_TYPE_INDIR:
+		return PERF_BR_IND;
+	case BRBINF_TYPE_DIR_LINK:
+		return PERF_BR_CALL;
+	case BRBINF_TYPE_INDIR_LINK:
+		return PERF_BR_IND_CALL;
+	case BRBINF_TYPE_RET_SUB:
+		return PERF_BR_RET;
+	case BRBINF_TYPE_COND_DIR:
+		return PERF_BR_COND;
+	case BRBINF_TYPE_CALL:
+		return PERF_BR_CALL;
+	case BRBINF_TYPE_TRAP:
+		return PERF_BR_SYSCALL;
+	case BRBINF_TYPE_RET_EXCPT:
+		return PERF_BR_ERET;
+	case BRBINF_TYPE_IRQ:
+		return PERF_BR_IRQ;
+	case BRBINF_TYPE_DEBUG_HALT:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_HALT;
+	case BRBINF_TYPE_SERROR:
+		return PERF_BR_SERROR;
+	case BRBINF_TYPE_INST_DEBUG:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_INST;
+	case BRBINF_TYPE_DATA_DEBUG:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_DATA;
+	case BRBINF_TYPE_ALGN_FAULT:
+		*new_branch_type = true;
+		return PERF_BR_NEW_FAULT_ALGN;
+	case BRBINF_TYPE_INST_FAULT:
+		*new_branch_type = true;
+		return PERF_BR_NEW_FAULT_INST;
+	case BRBINF_TYPE_DATA_FAULT:
+		*new_branch_type = true;
+		return PERF_BR_NEW_FAULT_DATA;
+	case BRBINF_TYPE_FIQ:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_FIQ;
+	case BRBINF_TYPE_DEBUG_EXIT:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_EXIT;
+	default:
+		pr_warn("unknown branch type captured\n");
+		return PERF_BR_UNKNOWN;
+	}
+}
+
+static int brbe_fetch_perf_priv(u64 brbinf)
+{
+	int brbe_el = brbe_fetch_el(brbinf);
+
+	switch (brbe_el) {
+	case BRBINF_EL_EL0:
+		return PERF_BR_PRIV_USER;
+	case BRBINF_EL_EL1:
+		return PERF_BR_PRIV_KERNEL;
+	case BRBINF_EL_EL2:
+		if (is_kernel_in_hyp_mode())
+			return PERF_BR_PRIV_KERNEL;
+		return PERF_BR_PRIV_HV;
+	default:
+		pr_warn("unknown branch privilege captured\n");
+		return -1;
+	}
+}
+
+static void capture_brbe_flags(struct pmu_hw_events *cpuc, struct perf_event *event,
+			       u64 brbinf, int idx)
+{
+	int branch_type, type = brbe_record_valid(brbinf);
+	bool new_branch_type;
+
+	if (!branch_sample_no_cycles(event))
+		cpuc->brbe_entries[idx].cycles = brbe_fetch_cycles(brbinf);
+
+	if (branch_sample_type(event)) {
+		branch_type = brbe_fetch_perf_type(brbinf, &new_branch_type);
+		if (new_branch_type) {
+			cpuc->brbe_entries[idx].type = PERF_BR_EXTEND_ABI;
+			cpuc->brbe_entries[idx].new_type = branch_type;
+		} else {
+			cpuc->brbe_entries[idx].type = branch_type;
+		}
+	}
+
+	if (!branch_sample_no_flags(event)) {
+		/*
+		 * BRBINF_LASTFAILED does not indicate that the last transaction
+		 * failed or was aborted during the current branch record itself.
+		 * Rather, it indicates that all the branch records which were in
+		 * a transaction until the current branch record have failed. So
+		 * the entire BRBE buffer needs to be processed later on to find
+		 * all branch records which might have failed.
+		 */
+		cpuc->brbe_entries[idx].abort = brbinf & BRBINF_LASTFAILED;
+
+		/*
+		 * This information (i.e. transaction state and mispredicts)
+		 * is not available for target only branch records.
+		 */
+		if (type != BRBINF_VALID_TARGET) {
+			cpuc->brbe_entries[idx].mispred = brbinf & BRBINF_MPRED;
+			cpuc->brbe_entries[idx].predicted = !(brbinf & BRBINF_MPRED);
+			cpuc->brbe_entries[idx].in_tx = brbinf & BRBINF_TX;
+		}
+	}
+
+	if (branch_sample_priv(event)) {
+		/*
+		 * This information (i.e. branch privilege level) is not
+		 * available for source only branch records.
+		 */
+		if (type != BRBINF_VALID_SOURCE)
+			cpuc->brbe_entries[idx].priv = brbe_fetch_perf_priv(brbinf);
+	}
+}
+
+/*
+ * A branch record with BRBINF_EL1.LASTFAILED set implies that all
+ * preceding consecutive branch records that were in a transaction
+ * (i.e. with their BRBINF_EL1.TX set) have been aborted.
+ *
+ * Similarly, BRBFCR_EL1.LASTFAILED being set indicates that all the
+ * consecutive branch records up to the last record, which were in a
+ * transaction (i.e. with their BRBINF_EL1.TX set), have been aborted.
+ *
+ * --------------------------------- -------------------
+ * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
+ * --------------------------------- -------------------
+ * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
+ * --------------------------------- -------------------
+ * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
+ * --------------------------------- -------------------
+ * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
+ * --------------------------------- -------------------
+ * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
+ * --------------------------------- -------------------
+ * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ *
+ * BRBFCR_EL1.LASTFAILED == 1
+ *
+ * Here BRBFCR_EL1.LASTFAILED fails all those consecutive, in
+ * transaction branches near the end of the BRBE buffer.
+ */
+static void process_branch_aborts(struct pmu_hw_events *cpuc)
+{
+	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+	bool lastfailed = !!(brbfcr & BRBFCR_LASTFAILED);
+	int idx = cpuc->brbe_nr - 1;
+
+	do {
+		if (cpuc->brbe_entries[idx].in_tx) {
+			cpuc->brbe_entries[idx].abort = lastfailed;
+		} else {
+			lastfailed = cpuc->brbe_entries[idx].abort;
+			cpuc->brbe_entries[idx].abort = false;
+		}
+	} while (idx--, idx >= 0);
+}
+
+void arm64_pmu_brbe_read(struct pmu_hw_events *cpuc, struct perf_event *event)
+{
+	u64 brbinf;
+	int idx;
+
+	if (brbe_disabled(cpuc))
+		return;
+
+	set_brbe_paused();
+	for (idx = 0; idx < cpuc->brbe_nr; idx++) {
+		select_brbe_bank_index(idx);
+		brbinf = get_brbinf_reg(idx);
+		/*
+		 * There are no more valid entries in the buffer.
+		 * Stop the branch record processing to save some
+		 * cycles and also reduce the capture/process load
+		 * for the user space.
+		 */
+		if (brbe_invalid(brbinf))
+			break;
+
+		if (brbe_valid(brbinf)) {
+			cpuc->brbe_entries[idx].from =  get_brbsrc_reg(idx);
+			cpuc->brbe_entries[idx].to =  get_brbtgt_reg(idx);
+		} else if (brbe_source(brbinf)) {
+			cpuc->brbe_entries[idx].from =  get_brbsrc_reg(idx);
+			cpuc->brbe_entries[idx].to = 0;
+		} else if (brbe_target(brbinf)) {
+			cpuc->brbe_entries[idx].from = 0;
+			cpuc->brbe_entries[idx].to =  get_brbtgt_reg(idx);
+		}
+		capture_brbe_flags(cpuc, event, brbinf, idx);
+	}
+	cpuc->brbe_stack.nr = idx;
+	cpuc->brbe_stack.hw_idx = -1ULL;
+	process_branch_aborts(cpuc);
+}
+
+void arm64_pmu_brbe_reset(struct pmu_hw_events *cpuc)
+{
+	if (brbe_disabled(cpuc))
+		return;
+
+	asm volatile(BRB_IALL);
+	isb();
+}
diff --git a/drivers/perf/arm_pmu_brbe.h b/drivers/perf/arm_pmu_brbe.h
new file mode 100644
index 000000000000..f04975cdc242
--- /dev/null
+++ b/drivers/perf/arm_pmu_brbe.h
@@ -0,0 +1,259 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Branch Record Buffer Extension Helpers.
+ *
+ * Copyright (C) 2021 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+#define pr_fmt(fmt) "brbe: " fmt
+
+#include <linux/perf/arm_pmu.h>
+
+/*
+ * BRBE Instructions
+ *
+ * BRB_IALL : Invalidate the entire buffer
+ * BRB_INJ  : Inject latest branch record derived from [BRBSRCINJ, BRBTGTINJ, BRBINFINJ]
+ */
+#define BRB_IALL __emit_inst(0xD5000000 | sys_insn(1, 1, 7, 2, 4) | (0x1f))
+#define BRB_INJ  __emit_inst(0xD5000000 | sys_insn(1, 1, 7, 2, 5) | (0x1f))
+
+/*
+ * BRBE Buffer Organization
+ *
+ * BRBE buffer is arranged as multiple banks of 32 branch record
+ * entries each. An individual branch record in a given bank can
+ * be accessed after selecting the bank in BRBFCR_EL1.BANK and
+ * then accessing the register set i.e. [BRBSRC, BRBTGT, BRBINF]
+ * with indices [0..31].
+ *
+ * Bank 0
+ *
+ *	---------------------------------	------
+ *	| 00 | BRBSRC | BRBTGT | BRBINF |	| 00 |
+ *	---------------------------------	------
+ *	| 01 | BRBSRC | BRBTGT | BRBINF |	| 01 |
+ *	---------------------------------	------
+ *	| .. | BRBSRC | BRBTGT | BRBINF |	| .. |
+ *	---------------------------------	------
+ *	| 31 | BRBSRC | BRBTGT | BRBINF |	| 31 |
+ *	---------------------------------	------
+ *
+ * Bank 1
+ *
+ *	---------------------------------	------
+ *	| 32 | BRBSRC | BRBTGT | BRBINF |	| 00 |
+ *	---------------------------------	------
+ *	| 33 | BRBSRC | BRBTGT | BRBINF |	| 01 |
+ *	---------------------------------	------
+ *	| .. | BRBSRC | BRBTGT | BRBINF |	| .. |
+ *	---------------------------------	------
+ *	| 63 | BRBSRC | BRBTGT | BRBINF |	| 31 |
+ *	---------------------------------	------
+ */
+#define BRBE_BANK0_IDX_MIN 0
+#define BRBE_BANK0_IDX_MAX 31
+#define BRBE_BANK1_IDX_MIN 32
+#define BRBE_BANK1_IDX_MAX 63
+
+#define RETURN_READ_BRBSRCN(n) \
+	read_sysreg_s(SYS_BRBSRC##n##_EL1)
+
+#define RETURN_READ_BRBTGTN(n) \
+	read_sysreg_s(SYS_BRBTGT##n##_EL1)
+
+#define RETURN_READ_BRBINFN(n) \
+	read_sysreg_s(SYS_BRBINF##n##_EL1)
+
+#define BRBE_REGN_CASE(n, case_macro) \
+	case n: return case_macro(n); break
+
+#define BRBE_REGN_SWITCH(x, case_macro)				\
+	do {							\
+		switch (x) {					\
+		BRBE_REGN_CASE(0, case_macro);			\
+		BRBE_REGN_CASE(1, case_macro);			\
+		BRBE_REGN_CASE(2, case_macro);			\
+		BRBE_REGN_CASE(3, case_macro);			\
+		BRBE_REGN_CASE(4, case_macro);			\
+		BRBE_REGN_CASE(5, case_macro);			\
+		BRBE_REGN_CASE(6, case_macro);			\
+		BRBE_REGN_CASE(7, case_macro);			\
+		BRBE_REGN_CASE(8, case_macro);			\
+		BRBE_REGN_CASE(9, case_macro);			\
+		BRBE_REGN_CASE(10, case_macro);			\
+		BRBE_REGN_CASE(11, case_macro);			\
+		BRBE_REGN_CASE(12, case_macro);			\
+		BRBE_REGN_CASE(13, case_macro);			\
+		BRBE_REGN_CASE(14, case_macro);			\
+		BRBE_REGN_CASE(15, case_macro);			\
+		BRBE_REGN_CASE(16, case_macro);			\
+		BRBE_REGN_CASE(17, case_macro);			\
+		BRBE_REGN_CASE(18, case_macro);			\
+		BRBE_REGN_CASE(19, case_macro);			\
+		BRBE_REGN_CASE(20, case_macro);			\
+		BRBE_REGN_CASE(21, case_macro);			\
+		BRBE_REGN_CASE(22, case_macro);			\
+		BRBE_REGN_CASE(23, case_macro);			\
+		BRBE_REGN_CASE(24, case_macro);			\
+		BRBE_REGN_CASE(25, case_macro);			\
+		BRBE_REGN_CASE(26, case_macro);			\
+		BRBE_REGN_CASE(27, case_macro);			\
+		BRBE_REGN_CASE(28, case_macro);			\
+		BRBE_REGN_CASE(29, case_macro);			\
+		BRBE_REGN_CASE(30, case_macro);			\
+		BRBE_REGN_CASE(31, case_macro);			\
+		default:					\
+			pr_warn("unknown register index\n");	\
+			return -1;				\
+		}						\
+	} while (0)
+
+static inline int buffer_to_brbe_idx(int buffer_idx)
+{
+	return buffer_idx % 32;
+}
+
+static inline u64 get_brbsrc_reg(int buffer_idx)
+{
+	int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+	BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBSRCN);
+}
+
+static inline u64 get_brbtgt_reg(int buffer_idx)
+{
+	int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+	BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBTGTN);
+}
+
+static inline u64 get_brbinf_reg(int buffer_idx)
+{
+	int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+	BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBINFN);
+}
+
+static inline u64 brbe_record_valid(u64 brbinf)
+{
+	return brbinf & (BRBINF_VALID_MASK << BRBINF_VALID_SHIFT);
+}
+
+static inline bool brbe_invalid(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_VALID_INVALID;
+}
+
+static inline bool brbe_valid(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_VALID_ALL;
+}
+
+static inline bool brbe_source(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_VALID_SOURCE;
+}
+
+static inline bool brbe_target(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_VALID_TARGET;
+}
+
+static inline int brbe_fetch_cycles(u64 brbinf)
+{
+	/*
+	 * Captured cycle count is unknown and hence
+	 * should not be passed on to the user space.
+	 */
+	if (brbinf & BRBINF_CCU)
+		return 0;
+
+	return (brbinf >> BRBINF_CC_SHIFT) & BRBINF_CC_MASK;
+}
+
+static inline int brbe_fetch_type(u64 brbinf)
+{
+	return (brbinf >> BRBINF_TYPE_SHIFT) & BRBINF_TYPE_MASK;
+}
+
+static inline int brbe_fetch_el(u64 brbinf)
+{
+	return (brbinf >> BRBINF_EL_SHIFT) & BRBINF_EL_MASK;
+}
+
+static inline int brbe_fetch_numrec(u64 brbidr)
+{
+	return (brbidr >> BRBIDR0_NUMREC_SHIFT) & BRBIDR0_NUMREC_MASK;
+}
+
+static inline int brbe_fetch_format(u64 brbidr)
+{
+	return (brbidr >> BRBIDR0_FORMAT_SHIFT) & BRBIDR0_FORMAT_MASK;
+}
+
+static inline int brbe_fetch_cc_bits(u64 brbidr)
+{
+	return (brbidr >> BRBIDR0_CC_SHIFT) & BRBIDR0_CC_MASK;
+}
+
+static inline void select_brbe_bank(int bank)
+{
+	static int brbe_current_bank = -1;
+	u64 brbfcr;
+
+	if (brbe_current_bank == bank)
+		return;
+
+	WARN_ON(bank > 1);
+	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+	brbfcr &= ~(BRBFCR_BANK_MASK << BRBFCR_BANK_SHIFT);
+	brbfcr |= ((bank & BRBFCR_BANK_MASK) << BRBFCR_BANK_SHIFT);
+	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+	isb();
+	brbe_current_bank = bank;
+}
+
+static inline void select_brbe_bank_index(int buffer_idx)
+{
+	switch (buffer_idx) {
+	case BRBE_BANK0_IDX_MIN ... BRBE_BANK0_IDX_MAX:
+		select_brbe_bank(0);
+		break;
+	case BRBE_BANK1_IDX_MIN ... BRBE_BANK1_IDX_MAX:
+		select_brbe_bank(1);
+		break;
+	default:
+		pr_warn("unsupported BRBE index\n");
+	}
+}
+
+static inline bool valid_brbe_nr(int brbe_nr)
+{
+	switch (brbe_nr) {
+	case BRBIDR0_NUMREC_8:
+	case BRBIDR0_NUMREC_16:
+	case BRBIDR0_NUMREC_32:
+	case BRBIDR0_NUMREC_64:
+		return true;
+	default:
+		pr_warn("unsupported BRBE entries\n");
+		return false;
+	}
+}
+
+static inline bool brbe_paused(void)
+{
+	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+
+	return brbfcr & BRBFCR_PAUSED;
+}
+
+static inline void set_brbe_paused(void)
+{
+	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+
+	write_sysreg_s(brbfcr | BRBFCR_PAUSED, SYS_BRBFCR_EL1);
+	isb();
+}
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 67f44020a736..3e7757d05146 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -166,6 +166,26 @@ struct arm_pmu {
 	unsigned long acpi_cpuid;
 };
 
+#ifdef CONFIG_ARM_BRBE_PMU
+void arm64_pmu_brbe_filter(struct pmu_hw_events *hw_events, struct perf_event *event);
+void arm64_pmu_brbe_read(struct pmu_hw_events *cpuc, struct perf_event *event);
+void arm64_pmu_brbe_disable(struct pmu_hw_events *cpuc);
+void arm64_pmu_brbe_enable(struct pmu_hw_events *cpuc);
+void arm64_pmu_brbe_probe(struct pmu_hw_events *cpuc);
+void arm64_pmu_brbe_reset(struct pmu_hw_events *cpuc);
+bool arm64_pmu_brbe_supported(struct perf_event *event);
+#else
+static inline void arm64_pmu_brbe_filter(struct pmu_hw_events *hw_events, struct perf_event *event)
+{
+}
+static inline void arm64_pmu_brbe_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
+static inline void arm64_pmu_brbe_disable(struct pmu_hw_events *cpuc) { }
+static inline void arm64_pmu_brbe_enable(struct pmu_hw_events *cpuc) { }
+static inline void arm64_pmu_brbe_probe(struct pmu_hw_events *cpuc) { }
+static inline void arm64_pmu_brbe_reset(struct pmu_hw_events *cpuc) { }
+static inline bool arm64_pmu_brbe_supported(struct perf_event *event) { return false; }
+#endif
+
 #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
 
 u64 armpmu_event_update(struct perf_event *event);
-- 
2.25.1
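
The LASTFAILED handling described in the comment above
process_branch_aborts() can be tried out in user space. What follows is a
rough sketch under stated assumptions, not the driver code: struct
demo_branch_entry and demo_process_aborts() are hypothetical names, the
entry struct keeps only the two fields the walk needs, and the value of
BRBFCR_EL1.LASTFAILED is passed in as a plain bool instead of being read
from the system register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for one captured branch record. */
struct demo_branch_entry {
	bool in_tx;	/* record was captured inside a transaction (TX) */
	bool abort;	/* in: the record's own LASTFAILED bit, out: aborted */
};

/*
 * Walk the records from the highest index down to index 0. Each record
 * that is not in a transaction hands its own LASTFAILED bit to the run
 * of in-transaction records just below it, and is itself never marked
 * as aborted. The run at the top of the buffer is covered by
 * fcr_lastfailed, which models BRBFCR_EL1.LASTFAILED.
 */
static void demo_process_aborts(struct demo_branch_entry *e, size_t nr,
				bool fcr_lastfailed)
{
	bool lastfailed = fcr_lastfailed;
	size_t i;

	for (i = nr; i-- > 0; ) {
		if (e[i].in_tx) {
			e[i].abort = lastfailed;
		} else {
			lastfailed = e[i].abort;
			e[i].abort = false;
		}
	}
}
```

Feeding it a shortened version of the table from that comment (two
successful in-transaction records, a non-transaction record, two failed
ones, a record with LF = 1, then a failed run at the end with
fcr_lastfailed set) marks exactly the records labelled [TX failed].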


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH V2 7/7] arm64/perf: Enable branch stack sampling
  2022-09-08  5:10 [PATCH V2 0/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
                   ` (5 preceding siblings ...)
  2022-09-08  5:10 ` [PATCH V2 6/7] arm64/perf: Add BRBE driver Anshuman Khandual
@ 2022-09-08  5:10 ` Anshuman Khandual
  2022-09-13 10:55 ` [PATCH V2 0/7] " James Clark
  7 siblings, 0 replies; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-08  5:10 UTC (permalink / raw)
  To: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas
  Cc: Anshuman Khandual, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar

Now that all the required pieces are in place, enable perf branch stack
sampling support on the arm64 platform by removing the gate which blocks
it in armpmu_event_init().

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 drivers/perf/arm_pmu.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 1fe5d6238b81..05848c6d955c 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -547,9 +547,35 @@ static int armpmu_event_init(struct perf_event *event)
 		!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
 		return -ENOENT;
 
-	/* does not support taken branch sampling */
-	if (has_branch_stack(event))
-		return -EOPNOTSUPP;
+	if (has_branch_stack(event)) {
+		/*
+		 * BRBE support is absent. Select CONFIG_ARM_BRBE_PMU
+		 * in the config, before branch stack sampling events
+		 * can be requested.
+		 */
+		if (!IS_ENABLED(CONFIG_ARM_BRBE_PMU)) {
+			pr_warn_once("BRBE is disabled, select CONFIG_ARM_BRBE_PMU\n");
+			return -EOPNOTSUPP;
+		}
+
+		if (event->attr.branch_sample_type & PERF_SAMPLE_BRANCH_KERNEL) {
+			if (!perfmon_capable()) {
+				pr_warn_once("does not have permission for kernel branch filter\n");
+				return -EPERM;
+			}
+		}
+
+		/*
+		 * Branch stack sampling event can not be supported in
+		 * case either the required driver itself is absent or
+		 * the BRBE buffer is not supported. Besides, checking
+		 * for the callback prevents a crash in case it's absent.
+		 */
+		if (!armpmu->brbe_supported || !armpmu->brbe_supported(event)) {
+			pr_warn_once("BRBE is not supported\n");
+			return -EOPNOTSUPP;
+		}
+	}
 
 	if (armpmu->map_event(event) == -ENOENT)
 		return -ENOENT;
-- 
2.25.1
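
The order of the checks added to armpmu_event_init() above decides which
errno user space sees. The sketch below models that ordering outside the
kernel. It makes several assumptions: struct demo_request and
demo_check_branch_stack() are hypothetical names, DEMO_BRANCH_KERNEL is a
stand-in bit for PERF_SAMPLE_BRANCH_KERNEL, and the three bool fields
replace IS_ENABLED(CONFIG_ARM_BRBE_PMU), perfmon_capable() and the
combined "callback present and brbe_supported() returned true" test.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PERF_SAMPLE_BRANCH_KERNEL filter bit. */
#define DEMO_BRANCH_KERNEL	(1u << 1)

/* Hypothetical model of the inputs consulted by the new checks. */
struct demo_request {
	unsigned int branch_sample_type;
	bool driver_built;	/* models IS_ENABLED(CONFIG_ARM_BRBE_PMU) */
	bool capable;		/* models perfmon_capable() */
	bool hw_supported;	/* models callback present and returning true */
};

/* Returns 0 when the branch stack event may proceed, else a negative errno. */
static int demo_check_branch_stack(const struct demo_request *r)
{
	if (!r->driver_built)
		return -EOPNOTSUPP;

	if ((r->branch_sample_type & DEMO_BRANCH_KERNEL) && !r->capable)
		return -EPERM;

	if (!r->hw_supported)
		return -EOPNOTSUPP;

	return 0;
}
```

One consequence of this ordering is that an unprivileged request for the
kernel branch filter gets -EPERM even on a CPU without BRBE, as long as
the driver is built in.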


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 6/7] arm64/perf: Add BRBE driver
  2022-09-08  5:10 ` [PATCH V2 6/7] arm64/perf: Add BRBE driver Anshuman Khandual
@ 2022-09-08  9:23   ` kernel test robot
  2022-09-08 10:16     ` Anshuman Khandual
  2022-09-13 10:39   ` James Clark
  1 sibling, 1 reply; 27+ messages in thread
From: kernel test robot @ 2022-09-08  9:23 UTC (permalink / raw)
  To: Anshuman Khandual, linux-kernel, linux-perf-users,
	linux-arm-kernel, peterz, acme, mark.rutland, will,
	catalin.marinas
  Cc: kbuild-all, Anshuman Khandual, James Clark, Rob Herring,
	Marc Zyngier, Ingo Molnar

Hi Anshuman,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on acme/perf/core]
[also build test ERROR on tip/perf/core arm64/for-next/core linus/master v6.0-rc4 next-20220907]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Anshuman-Khandual/arm64-perf-Enable-branch-stack-sampling/20220908-131425
base:   https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
config: arm64-buildonly-randconfig-r002-20220907 (https://download.01.org/0day-ci/archive/20220908/202209081717.00OiPpzm-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/5b70e42a715860504646cb5bd1788ddb823dd50b
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Anshuman-Khandual/arm64-perf-Enable-branch-stack-sampling/20220908-131425
        git checkout 5b70e42a715860504646cb5bd1788ddb823dd50b
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/perf/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   drivers/perf/arm_pmu_brbe.c: In function 'brbe_fetch_perf_type':
>> drivers/perf/arm_pmu_brbe.c:251:24: error: 'PERF_BR_ARM64_DEBUG_HALT' undeclared (first use in this function)
     251 |                 return PERF_BR_ARM64_DEBUG_HALT;
         |                        ^~~~~~~~~~~~~~~~~~~~~~~~
   drivers/perf/arm_pmu_brbe.c:251:24: note: each undeclared identifier is reported only once for each function it appears in
>> drivers/perf/arm_pmu_brbe.c:253:24: error: 'PERF_BR_SERROR' undeclared (first use in this function); did you mean 'PERF_BR_ERET'?
     253 |                 return PERF_BR_SERROR;
         |                        ^~~~~~~~~~~~~~
         |                        PERF_BR_ERET
>> drivers/perf/arm_pmu_brbe.c:256:24: error: 'PERF_BR_ARM64_DEBUG_INST' undeclared (first use in this function)
     256 |                 return PERF_BR_ARM64_DEBUG_INST;
         |                        ^~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/perf/arm_pmu_brbe.c:259:24: error: 'PERF_BR_ARM64_DEBUG_DATA' undeclared (first use in this function)
     259 |                 return PERF_BR_ARM64_DEBUG_DATA;
         |                        ^~~~~~~~~~~~~~~~~~~~~~~~
>> drivers/perf/arm_pmu_brbe.c:262:24: error: 'PERF_BR_NEW_FAULT_ALGN' undeclared (first use in this function)
     262 |                 return PERF_BR_NEW_FAULT_ALGN;
         |                        ^~~~~~~~~~~~~~~~~~~~~~
>> drivers/perf/arm_pmu_brbe.c:265:24: error: 'PERF_BR_NEW_FAULT_INST' undeclared (first use in this function)
     265 |                 return PERF_BR_NEW_FAULT_INST;
         |                        ^~~~~~~~~~~~~~~~~~~~~~
>> drivers/perf/arm_pmu_brbe.c:268:24: error: 'PERF_BR_NEW_FAULT_DATA' undeclared (first use in this function)
     268 |                 return PERF_BR_NEW_FAULT_DATA;
         |                        ^~~~~~~~~~~~~~~~~~~~~~
>> drivers/perf/arm_pmu_brbe.c:271:24: error: 'PERF_BR_ARM64_FIQ' undeclared (first use in this function); did you mean 'PERF_REG_ARM64_MAX'?
     271 |                 return PERF_BR_ARM64_FIQ;
         |                        ^~~~~~~~~~~~~~~~~
         |                        PERF_REG_ARM64_MAX
>> drivers/perf/arm_pmu_brbe.c:274:24: error: 'PERF_BR_ARM64_DEBUG_EXIT' undeclared (first use in this function)
     274 |                 return PERF_BR_ARM64_DEBUG_EXIT;
         |                        ^~~~~~~~~~~~~~~~~~~~~~~~
   drivers/perf/arm_pmu_brbe.c: In function 'brbe_fetch_perf_priv':
>> drivers/perf/arm_pmu_brbe.c:287:23: error: 'PERF_BR_PRIV_USER' undeclared (first use in this function)
     287 |                return PERF_BR_PRIV_USER;
         |                       ^~~~~~~~~~~~~~~~~
>> drivers/perf/arm_pmu_brbe.c:289:23: error: 'PERF_BR_PRIV_KERNEL' undeclared (first use in this function); did you mean 'PERF_SECURITY_KERNEL'?
     289 |                return PERF_BR_PRIV_KERNEL;
         |                       ^~~~~~~~~~~~~~~~~~~
         |                       PERF_SECURITY_KERNEL
>> drivers/perf/arm_pmu_brbe.c:293:23: error: 'PERF_BR_PRIV_HV' undeclared (first use in this function)
     293 |                return PERF_BR_PRIV_HV;
         |                       ^~~~~~~~~~~~~~~
   drivers/perf/arm_pmu_brbe.c: In function 'capture_brbe_flags':
>> drivers/perf/arm_pmu_brbe.c:306:14: error: implicit declaration of function 'branch_sample_no_cycles' [-Werror=implicit-function-declaration]
     306 |         if (!branch_sample_no_cycles(event))
         |              ^~~~~~~~~~~~~~~~~~~~~~~
>> drivers/perf/arm_pmu_brbe.c:309:13: error: implicit declaration of function 'branch_sample_type' [-Werror=implicit-function-declaration]
     309 |         if (branch_sample_type(event)) {
         |             ^~~~~~~~~~~~~~~~~~
>> drivers/perf/arm_pmu_brbe.c:312:56: error: 'PERF_BR_EXTEND_ABI' undeclared (first use in this function)
     312 |                         cpuc->brbe_entries[idx].type = PERF_BR_EXTEND_ABI;
         |                                                        ^~~~~~~~~~~~~~~~~~
>> drivers/perf/arm_pmu_brbe.c:313:48: error: 'struct perf_branch_entry' has no member named 'new_type'
     313 |                         cpuc->brbe_entries[idx].new_type = branch_type;
         |                                                ^
>> drivers/perf/arm_pmu_brbe.c:319:14: error: implicit declaration of function 'branch_sample_no_flags' [-Werror=implicit-function-declaration]
     319 |         if (!branch_sample_no_flags(event)) {
         |              ^~~~~~~~~~~~~~~~~~~~~~
>> drivers/perf/arm_pmu_brbe.c:341:13: error: implicit declaration of function 'branch_sample_priv' [-Werror=implicit-function-declaration]
     341 |         if (branch_sample_priv(event)) {
         |             ^~~~~~~~~~~~~~~~~~
>> drivers/perf/arm_pmu_brbe.c:347:48: error: 'struct perf_branch_entry' has no member named 'priv'
     347 |                         cpuc->brbe_entries[idx].priv = brbe_fetch_perf_priv(brbinf);
         |                                                ^
   drivers/perf/arm_pmu_brbe.c: In function 'brbe_fetch_perf_type':
>> drivers/perf/arm_pmu_brbe.c:250:34: warning: this statement may fall through [-Wimplicit-fallthrough=]
     250 |                 *new_branch_type = true;
         |                 ~~~~~~~~~~~~~~~~~^~~~~~
   drivers/perf/arm_pmu_brbe.c:252:9: note: here
     252 |         case BRBINF_TYPE_SERROR:
         |         ^~~~
   drivers/perf/arm_pmu_brbe.c:255:34: warning: this statement may fall through [-Wimplicit-fallthrough=]
     255 |                 *new_branch_type = true;
         |                 ~~~~~~~~~~~~~~~~~^~~~~~
   drivers/perf/arm_pmu_brbe.c:257:9: note: here
     257 |         case BRBINF_TYPE_DATA_DEBUG:
         |         ^~~~
   drivers/perf/arm_pmu_brbe.c:258:34: warning: this statement may fall through [-Wimplicit-fallthrough=]
     258 |                 *new_branch_type = true;
         |                 ~~~~~~~~~~~~~~~~~^~~~~~
   drivers/perf/arm_pmu_brbe.c:260:9: note: here
     260 |         case BRBINF_TYPE_ALGN_FAULT:
         |         ^~~~
   drivers/perf/arm_pmu_brbe.c:261:34: warning: this statement may fall through [-Wimplicit-fallthrough=]
     261 |                 *new_branch_type = true;
         |                 ~~~~~~~~~~~~~~~~~^~~~~~
   drivers/perf/arm_pmu_brbe.c:263:9: note: here
     263 |         case BRBINF_TYPE_INST_FAULT:
         |         ^~~~
   drivers/perf/arm_pmu_brbe.c:264:34: warning: this statement may fall through [-Wimplicit-fallthrough=]
     264 |                 *new_branch_type = true;
         |                 ~~~~~~~~~~~~~~~~~^~~~~~
   drivers/perf/arm_pmu_brbe.c:266:9: note: here
     266 |         case BRBINF_TYPE_DATA_FAULT:
         |         ^~~~
   drivers/perf/arm_pmu_brbe.c:267:34: warning: this statement may fall through [-Wimplicit-fallthrough=]
     267 |                 *new_branch_type = true;
         |                 ~~~~~~~~~~~~~~~~~^~~~~~
   drivers/perf/arm_pmu_brbe.c:269:9: note: here
     269 |         case BRBINF_TYPE_FIQ:
         |         ^~~~
   drivers/perf/arm_pmu_brbe.c:270:34: warning: this statement may fall through [-Wimplicit-fallthrough=]
     270 |                 *new_branch_type = true;
         |                 ~~~~~~~~~~~~~~~~~^~~~~~
   drivers/perf/arm_pmu_brbe.c:272:9: note: here
     272 |         case BRBINF_TYPE_DEBUG_EXIT:
         |         ^~~~
   drivers/perf/arm_pmu_brbe.c:273:34: warning: this statement may fall through [-Wimplicit-fallthrough=]
     273 |                 *new_branch_type = true;
         |                 ~~~~~~~~~~~~~~~~~^~~~~~
   drivers/perf/arm_pmu_brbe.c:275:9: note: here
     275 |         default:
         |         ^~~~~~~
   cc1: some warnings being treated as errors


vim +/PERF_BR_ARM64_DEBUG_HALT +251 drivers/perf/arm_pmu_brbe.c

   222	
   223	static int brbe_fetch_perf_type(u64 brbinf, bool *new_branch_type)
   224	{
   225		int brbe_type = brbe_fetch_type(brbinf);
   226		*new_branch_type = false;
   227	
   228		switch (brbe_type) {
   229		case BRBINF_TYPE_UNCOND_DIR:
   230			return PERF_BR_UNCOND;
   231		case BRBINF_TYPE_INDIR:
   232			return PERF_BR_IND;
   233		case BRBINF_TYPE_DIR_LINK:
   234			return PERF_BR_CALL;
   235		case BRBINF_TYPE_INDIR_LINK:
   236			return PERF_BR_IND_CALL;
   237		case BRBINF_TYPE_RET_SUB:
   238			return PERF_BR_RET;
   239		case BRBINF_TYPE_COND_DIR:
   240			return PERF_BR_COND;
   241		case BRBINF_TYPE_CALL:
   242			return PERF_BR_CALL;
   243		case BRBINF_TYPE_TRAP:
   244			return PERF_BR_SYSCALL;
   245		case BRBINF_TYPE_RET_EXCPT:
   246			return PERF_BR_ERET;
   247		case BRBINF_TYPE_IRQ:
   248			return PERF_BR_IRQ;
   249		case BRBINF_TYPE_DEBUG_HALT:
 > 250			*new_branch_type = true;
 > 251			return PERF_BR_ARM64_DEBUG_HALT;
   252		case BRBINF_TYPE_SERROR:
 > 253			return PERF_BR_SERROR;
   254		case BRBINF_TYPE_INST_DEBUG:
   255			*new_branch_type = true;
 > 256			return PERF_BR_ARM64_DEBUG_INST;
   257		case BRBINF_TYPE_DATA_DEBUG:
   258			*new_branch_type = true;
 > 259			return PERF_BR_ARM64_DEBUG_DATA;
   260		case BRBINF_TYPE_ALGN_FAULT:
   261			*new_branch_type = true;
 > 262			return PERF_BR_NEW_FAULT_ALGN;
   263		case BRBINF_TYPE_INST_FAULT:
   264			*new_branch_type = true;
 > 265			return PERF_BR_NEW_FAULT_INST;
   266		case BRBINF_TYPE_DATA_FAULT:
   267			*new_branch_type = true;
 > 268			return PERF_BR_NEW_FAULT_DATA;
   269		case BRBINF_TYPE_FIQ:
   270			*new_branch_type = true;
 > 271			return PERF_BR_ARM64_FIQ;
   272		case BRBINF_TYPE_DEBUG_EXIT:
   273			*new_branch_type = true;
 > 274			return PERF_BR_ARM64_DEBUG_EXIT;
   275		default:
   276			pr_warn("unknown branch type captured\n");
   277			return PERF_BR_UNKNOWN;
   278		}
   279	}
   280	
   281	static int brbe_fetch_perf_priv(u64 brbinf)
   282	{
   283	       int brbe_el = brbe_fetch_el(brbinf);
   284	
   285	       switch (brbe_el) {
   286	       case BRBINF_EL_EL0:
 > 287	               return PERF_BR_PRIV_USER;
   288	       case BRBINF_EL_EL1:
 > 289	               return PERF_BR_PRIV_KERNEL;
   290	       case BRBINF_EL_EL2:
   291	               if (is_kernel_in_hyp_mode())
   292	                       return PERF_BR_PRIV_KERNEL;
 > 293	               return PERF_BR_PRIV_HV;
   294	       default:
   295	               pr_warn("unknown branch privilege captured\n");
   296	               return -1;
   297	       }
   298	}
   299	
   300	static void capture_brbe_flags(struct pmu_hw_events *cpuc, struct perf_event *event,
   301				       u64 brbinf, int idx)
   302	{
   303		int branch_type, type = brbe_record_valid(brbinf);
   304		bool new_branch_type;
   305	
 > 306		if (!branch_sample_no_cycles(event))
   307			cpuc->brbe_entries[idx].cycles = brbe_fetch_cycles(brbinf);
   308	
 > 309		if (branch_sample_type(event)) {
   310			branch_type = brbe_fetch_perf_type(brbinf, &new_branch_type);
   311			if (new_branch_type) {
 > 312				cpuc->brbe_entries[idx].type = PERF_BR_EXTEND_ABI;
 > 313				cpuc->brbe_entries[idx].new_type = branch_type;
   314			} else {
   315				cpuc->brbe_entries[idx].type = branch_type;
   316			}
   317		}
   318	
 > 319		if (!branch_sample_no_flags(event)) {
   320			/*
   321			 * BRBINF_LASTFAILED does not indicate that the last transaction
   322			 * failed or was aborted during the current branch record itself.
   323			 * Rather, this indicates that all the branch records which were
   324			 * in transaction until the current branch record have failed. So
   325			 * the entire BRBE buffer needs to be processed later on to find
   326			 * all branch records which might have failed.
   327			 */
   328			cpuc->brbe_entries[idx].abort = brbinf & BRBINF_LASTFAILED;
   329	
   330			/*
   331			 * This information (i.e. transaction state and mispredicts)
   332			 * is not available for target-only branch records.
   333			 */
   334			if (type != BRBINF_VALID_TARGET) {
   335				cpuc->brbe_entries[idx].mispred = brbinf & BRBINF_MPRED;
   336				cpuc->brbe_entries[idx].predicted = !(brbinf & BRBINF_MPRED);
   337				cpuc->brbe_entries[idx].in_tx = brbinf & BRBINF_TX;
   338			}
   339		}
   340	
 > 341		if (branch_sample_priv(event)) {
   342			/*
   343			 * This information (i.e. branch privilege level) is not
   344			 * available for source-only branch records.
   345			 */
   346			if (type != BRBINF_VALID_SOURCE)
 > 347				cpuc->brbe_entries[idx].priv = brbe_fetch_perf_priv(brbinf);
   348		}
   349	}
   350	

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 6/7] arm64/perf: Add BRBE driver
  2022-09-08  9:23   ` kernel test robot
@ 2022-09-08 10:16     ` Anshuman Khandual
  0 siblings, 0 replies; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-08 10:16 UTC (permalink / raw)
  To: kernel test robot, linux-kernel, linux-perf-users,
	linux-arm-kernel, peterz, acme, mark.rutland, will,
	catalin.marinas
  Cc: kbuild-all, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar



On 9/8/22 14:53, kernel test robot wrote:
> Hi Anshuman,
> 
> Thank you for the patch! Yet something to improve:
> 
> [auto build test ERROR on acme/perf/core]
> [also build test ERROR on tip/perf/core arm64/for-next/core linus/master v6.0-rc4 next-20220907]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Anshuman-Khandual/arm64-perf-Enable-branch-stack-sampling/20220908-131425
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
> config: arm64-buildonly-randconfig-r002-20220907 (https://download.01.org/0day-ci/archive/20220908/202209081717.00OiPpzm-lkp@intel.com/config)
> compiler: aarch64-linux-gcc (GCC) 12.1.0
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # https://github.com/intel-lab-lkp/linux/commit/5b70e42a715860504646cb5bd1788ddb823dd50b
>         git remote add linux-review https://github.com/intel-lab-lkp/linux
>         git fetch --no-tags linux-review Anshuman-Khandual/arm64-perf-Enable-branch-stack-sampling/20220908-131425
>         git checkout 5b70e42a715860504646cb5bd1788ddb823dd50b
>         # save the config file
>         mkdir build_dir && cp config build_dir/.config
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/perf/
> 
> If you fix the issue, kindly add following tag where applicable
> Reported-by: kernel test robot <lkp@intel.com>

These build problems will not happen once the prerequisite patches mentioned
in the cover letter have been applied.

https://lore.kernel.org/all/20220824044822.70230-1-anshuman.khandual@arm.com/

https://lore.kernel.org/all/20220906084414.396220-1-anshuman.khandual@arm.com/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 5/7] arm64/perf: Drive BRBE from perf event states
  2022-09-08  5:10 ` [PATCH V2 5/7] arm64/perf: Drive BRBE from perf event states Anshuman Khandual
@ 2022-09-08 15:31   ` kernel test robot
  0 siblings, 0 replies; 27+ messages in thread
From: kernel test robot @ 2022-09-08 15:31 UTC (permalink / raw)
  To: Anshuman Khandual, linux-kernel, linux-perf-users,
	linux-arm-kernel, peterz, acme, mark.rutland, will,
	catalin.marinas
  Cc: llvm, kbuild-all, Anshuman Khandual, James Clark, Rob Herring,
	Marc Zyngier, Ingo Molnar

Hi Anshuman,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on acme/perf/core]
[also build test WARNING on tip/perf/core arm64/for-next/core linus/master v6.0-rc4 next-20220908]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Anshuman-Khandual/arm64-perf-Enable-branch-stack-sampling/20220908-131425
base:   https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
config: arm64-randconfig-r025-20220907 (https://download.01.org/0day-ci/archive/20220908/202209082350.lDY2EvGx-lkp@intel.com/config)
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 1546df49f5a6d09df78f569e4137ddb365a3e827)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm64 cross compiling tool for clang build
        # apt-get install binutils-aarch64-linux-gnu
        # https://github.com/intel-lab-lkp/linux/commit/5c7c07e050abb38b80d0c129fdef3a6f4b761017
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Anshuman-Khandual/arm64-perf-Enable-branch-stack-sampling/20220908-131425
        git checkout 5c7c07e050abb38b80d0c129fdef3a6f4b761017
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/perf/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> drivers/perf/arm_pmu.c:535:12: warning: stack frame size (2064) exceeds limit (2048) in 'armpmu_event_init' [-Wframe-larger-than]
   static int armpmu_event_init(struct perf_event *event)
              ^
   1 warning generated.


vim +/armpmu_event_init +535 drivers/perf/arm_pmu.c

1b8873a0c6ec51 arch/arm/kernel/perf_event.c Jamie Iles       2010-02-02  534  
b0a873ebbf87bf arch/arm/kernel/perf_event.c Peter Zijlstra   2010-06-11 @535  static int armpmu_event_init(struct perf_event *event)
1b8873a0c6ec51 arch/arm/kernel/perf_event.c Jamie Iles       2010-02-02  536  {
8a16b34e21199e arch/arm/kernel/perf_event.c Mark Rutland     2011-04-28  537  	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
1b8873a0c6ec51 arch/arm/kernel/perf_event.c Jamie Iles       2010-02-02  538  
cc88116da0d18b arch/arm/kernel/perf_event.c Mark Rutland     2015-05-13  539  	/*
cc88116da0d18b arch/arm/kernel/perf_event.c Mark Rutland     2015-05-13  540  	 * Reject CPU-affine events for CPUs that are of a different class to
cc88116da0d18b arch/arm/kernel/perf_event.c Mark Rutland     2015-05-13  541  	 * that which this PMU handles. Process-following events (where
cc88116da0d18b arch/arm/kernel/perf_event.c Mark Rutland     2015-05-13  542  	 * event->cpu == -1) can be migrated between CPUs, and thus we have to
cc88116da0d18b arch/arm/kernel/perf_event.c Mark Rutland     2015-05-13  543  	 * reject them later (in armpmu_add) if they're scheduled on a
cc88116da0d18b arch/arm/kernel/perf_event.c Mark Rutland     2015-05-13  544  	 * different class of CPU.
cc88116da0d18b arch/arm/kernel/perf_event.c Mark Rutland     2015-05-13  545  	 */
cc88116da0d18b arch/arm/kernel/perf_event.c Mark Rutland     2015-05-13  546  	if (event->cpu != -1 &&
cc88116da0d18b arch/arm/kernel/perf_event.c Mark Rutland     2015-05-13  547  		!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
cc88116da0d18b arch/arm/kernel/perf_event.c Mark Rutland     2015-05-13  548  		return -ENOENT;
cc88116da0d18b arch/arm/kernel/perf_event.c Mark Rutland     2015-05-13  549  
2481c5fa6db023 arch/arm/kernel/perf_event.c Stephane Eranian 2012-02-09  550  	/* does not support taken branch sampling */
2481c5fa6db023 arch/arm/kernel/perf_event.c Stephane Eranian 2012-02-09  551  	if (has_branch_stack(event))
2481c5fa6db023 arch/arm/kernel/perf_event.c Stephane Eranian 2012-02-09  552  		return -EOPNOTSUPP;
2481c5fa6db023 arch/arm/kernel/perf_event.c Stephane Eranian 2012-02-09  553  
e1f431b57ef9e4 arch/arm/kernel/perf_event.c Mark Rutland     2011-04-28  554  	if (armpmu->map_event(event) == -ENOENT)
b0a873ebbf87bf arch/arm/kernel/perf_event.c Peter Zijlstra   2010-06-11  555  		return -ENOENT;
b0a873ebbf87bf arch/arm/kernel/perf_event.c Peter Zijlstra   2010-06-11  556  
c09adab01e4aee drivers/perf/arm_pmu.c       Mark Rutland     2017-03-10  557  	return __hw_perf_event_init(event);
1b8873a0c6ec51 arch/arm/kernel/perf_event.c Jamie Iles       2010-02-02  558  }
1b8873a0c6ec51 arch/arm/kernel/perf_event.c Jamie Iles       2010-02-02  559  

-- 
0-DAY CI Kernel Test Service
https://01.org/lkp

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events for BRBE
       [not found]   ` <202209082022.6BPdyQn8-lkp@intel.com>
@ 2022-09-09  3:11     ` Anshuman Khandual
  0 siblings, 0 replies; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-09  3:11 UTC (permalink / raw)
  To: kernel test robot, linux-kernel, linux-perf-users,
	linux-arm-kernel, peterz, acme, mark.rutland, will,
	catalin.marinas
  Cc: kbuild-all, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar



On 9/8/22 18:02, kernel test robot wrote:
> Hi Anshuman,
> 
> Thank you for the patch! Perhaps something to improve:
> 
> [auto build test WARNING on acme/perf/core]
> [also build test WARNING on tip/perf/core arm64/for-next/core linus/master v6.0-rc4 next-20220908]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Anshuman-Khandual/arm64-perf-Enable-branch-stack-sampling/20220908-131425
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
> config: arm-defconfig
> compiler: arm-linux-gnueabi-gcc (GCC) 12.1.0
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # https://github.com/intel-lab-lkp/linux/commit/d3c4ba711027d312a2f4e16cdbdbec5fcbf2913b
>         git remote add linux-review https://github.com/intel-lab-lkp/linux
>         git fetch --no-tags linux-review Anshuman-Khandual/arm64-perf-Enable-branch-stack-sampling/20220908-131425
>         git checkout d3c4ba711027d312a2f4e16cdbdbec5fcbf2913b
>         # save the config file
>         mkdir build_dir && cp config build_dir/.config
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash drivers/perf/
> 
> If you fix the issue, kindly add following tag where applicable
> Reported-by: kernel test robot <lkp@intel.com>
> 
> All warnings (new ones prefixed by >>):
> 
>    drivers/perf/arm_pmu.c: In function 'validate_group':
>>> drivers/perf/arm_pmu.c:415:1: warning: the frame size of 1744 bytes is larger than 1024 bytes [-Wframe-larger-than=]
>      415 | }
>          | ^

This is probably complaining about the local variable 'fake_pmu', since
'struct pmu_hw_events' has now been expanded to accommodate BRBE related
elements. But these elements are essential for BRBE branch record processing.

static int
validate_group(struct perf_event *event)
{
        struct perf_event *sibling, *leader = event->group_leader;
        struct pmu_hw_events fake_pmu;
....

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events for BRBE
       [not found]   ` <202209082259.XDCTMY9g-lkp@intel.com>
@ 2022-09-09  3:14     ` Anshuman Khandual
  0 siblings, 0 replies; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-09  3:14 UTC (permalink / raw)
  To: kernel test robot, linux-kernel, linux-perf-users,
	linux-arm-kernel, peterz, acme, mark.rutland, will,
	catalin.marinas
  Cc: llvm, kbuild-all, James Clark, Rob Herring, Marc Zyngier, Ingo Molnar

On 9/8/22 19:44, kernel test robot wrote:
> Hi Anshuman,
> 
> Thank you for the patch! Perhaps something to improve:
> 
> [auto build test WARNING on acme/perf/core]
> [also build test WARNING on tip/perf/core arm64/for-next/core linus/master v6.0-rc4 next-20220908]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:    https://github.com/intel-lab-lkp/linux/commits/Anshuman-Khandual/arm64-perf-Enable-branch-stack-sampling/20220908-131425
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf/core
> config: arm-spitz_defconfig
> compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 1546df49f5a6d09df78f569e4137ddb365a3e827)
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # install arm cross compiling tool for clang build
>         # apt-get install binutils-arm-linux-gnueabi
>         # https://github.com/intel-lab-lkp/linux/commit/d3c4ba711027d312a2f4e16cdbdbec5fcbf2913b
>         git remote add linux-review https://github.com/intel-lab-lkp/linux
>         git fetch --no-tags linux-review Anshuman-Khandual/arm64-perf-Enable-branch-stack-sampling/20220908-131425
>         git checkout d3c4ba711027d312a2f4e16cdbdbec5fcbf2913b
>         # save the config file
>         mkdir build_dir && cp config build_dir/.config
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash drivers/perf/
> 
> If you fix the issue, kindly add following tag where applicable
> Reported-by: kernel test robot <lkp@intel.com>
> 
> All warnings (new ones prefixed by >>):
> 
>>> drivers/perf/arm_pmu.c:498:12: warning: stack frame size (1760) exceeds limit (1024) in 'armpmu_event_init' [-Wframe-larger-than]
>    static int armpmu_event_init(struct perf_event *event)
>               ^
>    1 warning generated.

This is similar to the previous problem: new elements are required
in struct pmu_hw_events for BRBE record processing.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 1/7] arm64/perf: Add register definitions for BRBE
  2022-09-08  5:10 ` [PATCH V2 1/7] arm64/perf: Add register definitions for BRBE Anshuman Khandual
@ 2022-09-12  9:57   ` Mark Brown
  2022-09-13  6:24     ` Anshuman Khandual
  0 siblings, 1 reply; 27+ messages in thread
From: Mark Brown @ 2022-09-12  9:57 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas, James Clark, Rob Herring,
	Marc Zyngier, Ingo Molnar


On Thu, Sep 08, 2022 at 10:40:40AM +0530, Anshuman Khandual wrote:

> ---
>  arch/arm64/include/asm/sysreg.h | 222 ++++++++++++++++++++++++++++++++
>  1 file changed, 222 insertions(+)

Rather than manually encoding register definitions in sysreg.h
can we add them to arch/arm64/tools/sysreg so that all the
#defines and so on are generated instead?

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events for BRBE
  2022-09-08  5:10 ` [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events " Anshuman Khandual
       [not found]   ` <202209082022.6BPdyQn8-lkp@intel.com>
       [not found]   ` <202209082259.XDCTMY9g-lkp@intel.com>
@ 2022-09-12 10:12   ` Mark Brown
  2022-09-13  5:33     ` Anshuman Khandual
  2 siblings, 1 reply; 27+ messages in thread
From: Mark Brown @ 2022-09-12 10:12 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas, James Clark, Rob Herring,
	Marc Zyngier, Ingo Molnar


On Thu, Sep 08, 2022 at 10:40:42AM +0530, Anshuman Khandual wrote:

> +	/* Captured BRBE buffer - copied as is into perf_sample_data */
> +	struct perf_branch_stack	brbe_stack;
> +	struct perf_branch_entry	brbe_entries[BRBE_MAX_ENTRIES];

It looks like perf_branch_entry is intended to be the variably
sized entries array at the end of perf_branch_stack?  That could
probably do with being called out if it's the case.  It feels
like it would be clearer and safer to allocate these dynamically
when BRBE is used if that's possible, I'd expect that should also
deal with the stack frame size issues as well.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events for BRBE
  2022-09-12 10:12   ` Mark Brown
@ 2022-09-13  5:33     ` Anshuman Khandual
  2022-09-13 11:43       ` Mark Brown
  0 siblings, 1 reply; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-13  5:33 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas, James Clark, Rob Herring,
	Marc Zyngier, Ingo Molnar



On 9/12/22 15:42, Mark Brown wrote:
> On Thu, Sep 08, 2022 at 10:40:42AM +0530, Anshuman Khandual wrote:
> 
>> +	/* Captured BRBE buffer - copied as is into perf_sample_data */
>> +	struct perf_branch_stack	brbe_stack;
>> +	struct perf_branch_entry	brbe_entries[BRBE_MAX_ENTRIES];
> 
> It looks like perf_branch_entry is intended to be the variably
> sized entries array at the end of perf_branch_stack?  That could

That is right. The maximum number of entries in the brbe_entries[] array
is platform dependent, i.e. BHRB_MAX_ENTRIES on powerpc, MAX_LBR_ENTRIES
on x86 and BRBE_MAX_ENTRIES on arm64.

The generic definition

struct perf_branch_stack {
        __u64                           nr;
        __u64                           hw_idx;
        struct perf_branch_entry        entries[];
};

On x86 platform

#define MAX_LBR_ENTRIES		32

struct cpu_hw_events {
	....
	struct perf_branch_stack	lbr_stack;
	struct perf_branch_entry	lbr_entries[MAX_LBR_ENTRIES];
	....
};

On powerpc platform

#define BHRB_MAX_ENTRIES	32

struct cpu_hw_events {
	....
	struct perf_branch_stack	bhrb_stack;
	struct perf_branch_entry	bhrb_entries[BHRB_MAX_ENTRIES];
	....
};

The same format is followed on the arm64 platform as well

#define BRBE_MAX_ENTRIES	64

struct pmu_hw_events {
	....
	....
	struct perf_branch_stack	brbe_stack;
	struct perf_branch_entry	brbe_entries[BRBE_MAX_ENTRIES];
	....
	....
};

> probably do with being called out if it's the case.  It feels

Right, we could add a comment in this regard.

> like it would be clearer and safer to allocate these dynamically
> when BRBE is used if that's possible, I'd expect that should also
> deal with the stack frame size issues as well.

That might not be possible because the generic 'struct perf_branch_stack'
expects 'perf_branch_stack.entries' to be a variable-length array that is
contiguous in memory with the other elements of 'perf_branch_stack'. Besides,
that would be a deviation from the similar implementations on x86 and powerpc
platforms.

The stack frame size came up because BRBE_MAX_ENTRIES is 64 compared to
just 32 on other platforms, which follow the exact same method.
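
For reference, the layout being relied on here can be checked in userspace.
This is only a sketch under stated assumptions: the types below are simplified
stand-ins for the kernel ones (the flags bitfield is modelled as one 64-bit
word), and struct hw_events_sketch, brbe_layout_is_contiguous() and
brbe_buffer_bytes() are hypothetical names used for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's perf_branch_entry; flags modelled as one word. */
struct perf_branch_entry {
	uint64_t from;
	uint64_t to;
	uint64_t flags;
};

/* Same shape as the generic definition quoted above. */
struct perf_branch_stack {
	uint64_t nr;
	uint64_t hw_idx;
	struct perf_branch_entry entries[];	/* flexible array member */
};

#define BRBE_MAX_ENTRIES 64

/*
 * Hypothetical cut-down pmu_hw_events holding only the BRBE buffer.
 * Embedding a struct that ends in a flexible array is a GNU C extension.
 */
struct hw_events_sketch {
	struct perf_branch_stack brbe_stack;
	struct perf_branch_entry brbe_entries[BRBE_MAX_ENTRIES];
};

/* Returns 1 when brbe_entries[] starts exactly where brbe_stack.entries[] does. */
static int brbe_layout_is_contiguous(void)
{
	return offsetof(struct hw_events_sketch, brbe_entries) ==
	       offsetof(struct hw_events_sketch, brbe_stack) +
	       offsetof(struct perf_branch_stack, entries);
}

/* Total footprint of the header plus the fixed-size entries array. */
static size_t brbe_buffer_bytes(void)
{
	return sizeof(struct hw_events_sketch);
}
```

With these stand-in types the pair occupies 16 + 64 * 24 bytes, which gives a
feel for the footprint behind the frame size discussion.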


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 1/7] arm64/perf: Add register definitions for BRBE
  2022-09-12  9:57   ` Mark Brown
@ 2022-09-13  6:24     ` Anshuman Khandual
  2022-09-13 11:30       ` Mark Brown
  0 siblings, 1 reply; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-13  6:24 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas, James Clark, Rob Herring,
	Marc Zyngier, Ingo Molnar


On 9/12/22 15:27, Mark Brown wrote:
> On Thu, Sep 08, 2022 at 10:40:40AM +0530, Anshuman Khandual wrote:
> 
>> ---
>>  arch/arm64/include/asm/sysreg.h | 222 ++++++++++++++++++++++++++++++++
>>  1 file changed, 222 insertions(+)
> 
> Rather than manually encoding register definitions in sysreg.h
> can we add them to arch/arm64/tools/sysreg so that all the
> #defines and so on are generated instead?

SYS_[BRBINF<N>|BRBSRC<N>|BRBTGT<N>]_EL1 registers are encoded as per three
distinct formulas where <CRm> and <op2> are derived from the corresponding <N>.
Just wondering if those could be accommodated in arch/arm64/tools/sysreg?
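
To make the three formulas concrete, here is a rough userspace sketch. It
assumes the encodings are op0=2, op1=1, CRn=8, CRm=N[3:0] and op2=N[4]:kind,
with kind being 0 for BRBINF, 1 for BRBSRC and 2 for BRBTGT, and it assumes
the field shifts used by sys_reg() in sysreg.h. The helper names are
hypothetical, and the values should be checked against the register
documentation before being relied upon.

```c
#include <stdint.h>

/* Field shifts assumed to match sys_reg() in arch/arm64/include/asm/sysreg.h. */
#define OP0_SHIFT	19
#define OP1_SHIFT	16
#define CRN_SHIFT	12
#define CRM_SHIFT	8
#define OP2_SHIFT	5

/* Hypothetical selector for the three per-record register banks. */
enum brbe_bank_kind {
	BRBE_KIND_INF = 0,
	BRBE_KIND_SRC = 1,
	BRBE_KIND_TGT = 2,
};

static uint32_t sys_reg_sketch(uint32_t op0, uint32_t op1, uint32_t crn,
			       uint32_t crm, uint32_t op2)
{
	return (op0 << OP0_SHIFT) | (op1 << OP1_SHIFT) | (crn << CRN_SHIFT) |
	       (crm << CRM_SHIFT) | (op2 << OP2_SHIFT);
}

/* CRm carries the low four bits of the record index N. */
static uint32_t brbe_crm(unsigned int n)
{
	return n & 0xf;
}

/* op2 carries bit 4 of N in its top bit and the bank kind in the low two bits. */
static uint32_t brbe_op2(unsigned int n, enum brbe_bank_kind kind)
{
	return ((n & 0x10) >> 2) | (uint32_t)kind;
}

/* Encoding for record N (0..31) in the chosen bank. */
static uint32_t brbe_sysreg(unsigned int n, enum brbe_bank_kind kind)
{
	return sys_reg_sketch(2, 1, 8, brbe_crm(n), brbe_op2(n, kind));
}
```

Since only the kind constant differs between the three banks, a single
parameterised pattern would seem to cover all of them.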

The system register description via arch/arm64/tools/sysreg seems a bit cryptic.
BTW, do we expect all existing sysreg definitions to move there? There are
still many registers and their fields present in sysreg.h.

Besides, there is also some benefit in being able to grep system registers
and their fields, across headers and implementations simultaneously.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 6/7] arm64/perf: Add BRBE driver
  2022-09-08  5:10 ` [PATCH V2 6/7] arm64/perf: Add BRBE driver Anshuman Khandual
  2022-09-08  9:23   ` kernel test robot
@ 2022-09-13 10:39   ` James Clark
  2022-09-13 11:38     ` Anshuman Khandual
  1 sibling, 1 reply; 27+ messages in thread
From: James Clark @ 2022-09-13 10:39 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Rob Herring, Marc Zyngier, Ingo Molnar, linux-kernel,
	linux-perf-users, linux-arm-kernel, peterz, acme, mark.rutland,
	will, catalin.marinas, German Gomez



On 08/09/2022 06:10, Anshuman Khandual wrote:
> This adds a BRBE driver which implements all the required helper functions
> for struct arm_pmu. Following functions are defined by this driver which
> will configure, enable, capture, reset and disable BRBE buffer HW as and
> when requested via perf branch stack sampling framework.
> 
> - arm64_pmu_brbe_filter()
> - arm64_pmu_brbe_enable()
> - arm64_pmu_brbe_disable()
> - arm64_pmu_brbe_read()
> - arm64_pmu_brbe_probe()
> - arm64_pmu_brbe_reset()
> - arm64_pmu_brbe_supported()
> 
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-perf-users@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/arm64/kernel/perf_event.c |   8 +-
>  drivers/perf/Kconfig           |  11 +
>  drivers/perf/Makefile          |   1 +
>  drivers/perf/arm_pmu_brbe.c    | 448 +++++++++++++++++++++++++++++++++
>  drivers/perf/arm_pmu_brbe.h    | 259 +++++++++++++++++++
>  include/linux/perf/arm_pmu.h   |  20 ++
>  6 files changed, 746 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/perf/arm_pmu_brbe.c
>  create mode 100644 drivers/perf/arm_pmu_brbe.h
> 
[...]
> +
> +static int brbe_fetch_perf_priv(u64 brbinf)
> +{
> +       int brbe_el = brbe_fetch_el(brbinf);
> +
> +       switch (brbe_el) {
> +       case BRBINF_EL_EL0:
> +               return PERF_BR_PRIV_USER;
> +       case BRBINF_EL_EL1:
> +               return PERF_BR_PRIV_KERNEL;
> +       case BRBINF_EL_EL2:
> +               if (is_kernel_in_hyp_mode())
> +                       return PERF_BR_PRIV_KERNEL;
> +               return PERF_BR_PRIV_HV;
> +       default:
> +               pr_warn("unknown branch privilege captured\n");
> +               return -1;

On V1 you said that you would change this to PERF_BR_PRIV_UNKNOWN, looks
like that was dropped. Unless it didn't work out?


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 0/7] arm64/perf: Enable branch stack sampling
  2022-09-08  5:10 [PATCH V2 0/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
                   ` (6 preceding siblings ...)
  2022-09-08  5:10 ` [PATCH V2 7/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
@ 2022-09-13 10:55 ` James Clark
  2022-09-13 12:12   ` Anshuman Khandual
  7 siblings, 1 reply; 27+ messages in thread
From: James Clark @ 2022-09-13 10:55 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Rob Herring, Marc Zyngier, Ingo Molnar, linux-kernel,
	linux-perf-users, linux-arm-kernel, peterz, acme, mark.rutland,
	will, catalin.marinas



On 08/09/2022 06:10, Anshuman Khandual wrote:
> This series enables perf branch stack sampling support on arm64 platform
> via a new arch feature called Branch Record Buffer Extension (BRBE). All
> relevant register definitions could be accessed here.
> 
> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
> 
> This series applies on v6.0-rc4 after the BRBE related perf ABI changes series
> (V7) that was posted earlier, and a branch sample filter helper patch.
> 
> https://lore.kernel.org/all/20220824044822.70230-1-anshuman.khandual@arm.com/
> 
> https://lore.kernel.org/all/20220906084414.396220-1-anshuman.khandual@arm.com/
> 
> Following issues have been resolved
> 
> - James's concerns regarding permission inadequacy related to perfmon_capable()
> - James's concerns regarding using perf_event_paranoid along with perfmon_capable()

I don't see the resolution to this one. I'm not 100% sure of the code
path used for LBR, but I think you just need to take perf_allow_kernel()
into account somewhere to make this command have the same result with
BRBE. Is there any contention that the permissions shouldn't behave in
the same way across platforms? This is when perf_event_paranoid < 2:

Intel:

  $ perf record -j any -- ls

  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.014 MB perf.data (16 samples) ]

Arm:

  $ perf record -j any -- ls

  Error:
  No permission to enable cycles event.

> 
> Following issues remain inconclusive
> 
> - Rob's concerns regarding the series structure, arm_pmu callbacks based framework
> 
> Changes in V2:
> 
> - Dropped branch sample filter helpers consolidation patch from this series 
> - Added new hw_perf_event.flags element ARMPMU_EVT_PRIV to cache perfmon_capable()
> - Use cached perfmon_capable() while configuring BRBE branch record filters
> 
> Changes in V1:
> 
> https://lore.kernel.org/linux-arm-kernel/20220613100119.684673-1-anshuman.khandual@arm.com/
> 
> - Added CONFIG_PERF_EVENTS wrapper for all branch sample filter helpers
> - Process new perf branch types via PERF_BR_EXTEND_ABI
> 
> Changes in RFC V2:
> 
> https://lore.kernel.org/linux-arm-kernel/20220412115455.293119-1-anshuman.khandual@arm.com/
> 
> - Added branch_sample_priv() while consolidating other branch sample filter helpers
> - Changed all SYS_BRBXXXN_EL1 register definition encodings per Marc
> - Changed the BRBE driver as per proposed BRBE related perf ABI changes (V5)
> - Added documentation for struct arm_pmu changes, updated commit message
> - Updated commit message for BRBE detection infrastructure patch
> - PERF_SAMPLE_BRANCH_KERNEL gets checked during arm event init (outside the driver)
> - Branch privilege state capture mechanism has now moved inside the driver
> 
> Changes in RFC V1:
> 
> https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: James Clark <james.clark@arm.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-perf-users@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> 
> Anshuman Khandual (7):
>   arm64/perf: Add register definitions for BRBE
>   arm64/perf: Update struct arm_pmu for BRBE
>   arm64/perf: Update struct pmu_hw_events for BRBE
>   driver/perf/arm_pmu_platform: Add support for BRBE attributes detection
>   arm64/perf: Drive BRBE from perf event states
>   arm64/perf: Add BRBE driver
>   arm64/perf: Enable branch stack sampling
> 
>  arch/arm64/include/asm/sysreg.h | 222 ++++++++++++++++
>  arch/arm64/kernel/perf_event.c  |  48 ++++
>  drivers/perf/Kconfig            |  11 +
>  drivers/perf/Makefile           |   1 +
>  drivers/perf/arm_pmu.c          |  82 +++++-
>  drivers/perf/arm_pmu_brbe.c     | 448 ++++++++++++++++++++++++++++++++
>  drivers/perf/arm_pmu_brbe.h     | 259 ++++++++++++++++++
>  drivers/perf/arm_pmu_platform.c |  34 +++
>  include/linux/perf/arm_pmu.h    |  67 +++++
>  9 files changed, 1169 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/perf/arm_pmu_brbe.c
>  create mode 100644 drivers/perf/arm_pmu_brbe.h
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 1/7] arm64/perf: Add register definitions for BRBE
  2022-09-13  6:24     ` Anshuman Khandual
@ 2022-09-13 11:30       ` Mark Brown
  0 siblings, 0 replies; 27+ messages in thread
From: Mark Brown @ 2022-09-13 11:30 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas, James Clark, Rob Herring,
	Marc Zyngier, Ingo Molnar



On Tue, Sep 13, 2022 at 11:54:09AM +0530, Anshuman Khandual wrote:
> On 9/12/22 15:27, Mark Brown wrote:
> > On Thu, Sep 08, 2022 at 10:40:40AM +0530, Anshuman Khandual wrote:

> >>  arch/arm64/include/asm/sysreg.h | 222 ++++++++++++++++++++++++++++++++
> >>  1 file changed, 222 insertions(+)

> > Rather than manually encoding register definitions in sysreg.h
> > can we add them to arch/arm64/tools/sysreg so that all the
> > #defines and so on are generated instead?

> SYS_[BRBINF<N>|BRBSRC<N>|BRBTGT<N>]_EL1 registers are encoded as per three
> distinct formulas where <CRm> and <op2> are derived from corresponding <N>
> Just wondering if those could be accommodated in arch/arm64/tools/sysreg ?

It'd need some work on the script but if there's a reusable
pattern there (I'd guess there might be) then adding some support
for generating patterns like that seems like a sensible thing.

> System register description via arch/arm64/tools/sysreg seems bit cryptic.

It's very close to the way they're defined in the ARM, and I'd
say a good bit easier than typing out all the individual defines
and making sure there's no typos.

> BTW, do we expect all existing sysreg definitions to move there ? Because
> still there are many registers and their fields present in sysreg.h

Yes, we expect all registers to be converted.  This process is
ongoing and if you don't do it now someone will just have to go
and convert it later.

> Besides, there is also some benefit in being able to grep system registers
> and their fields, across headers and implementations simultaneously.

That's true.  On the other hand having consistently generated
defines that are tied to the architecture means that people can
look at the ARM and know that there's a define already provided
without having to go check, and having everything generated in a
very consistent fashion means that we can write helpers which
take advantage of that fact - the SYSREG_FIELD_GET() and _PREP()
helpers are an example of that, once James' series for the 32 bit
ID registers lands we'll be able to start adding more for the
cpufeature stuff.

At the minute the major advantage is that you only have to cross
check the input file with the architecture when reviewing, and
that has a format that's very close to the architecture so is
much easier to validate.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 6/7] arm64/perf: Add BRBE driver
  2022-09-13 10:39   ` James Clark
@ 2022-09-13 11:38     ` Anshuman Khandual
  0 siblings, 0 replies; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-13 11:38 UTC (permalink / raw)
  To: James Clark
  Cc: Rob Herring, Marc Zyngier, Ingo Molnar, linux-kernel,
	linux-perf-users, linux-arm-kernel, peterz, acme, mark.rutland,
	will, catalin.marinas, German Gomez



On 9/13/22 16:09, James Clark wrote:
> 
> 
> On 08/09/2022 06:10, Anshuman Khandual wrote:
>> This adds a BRBE driver which implements all the required helper functions
>> for struct arm_pmu. Following functions are defined by this driver which
>> will configure, enable, capture, reset and disable BRBE buffer HW as and
>> when requested via perf branch stack sampling framework.
>>
>> - arm64_pmu_brbe_filter()
>> - arm64_pmu_brbe_enable()
>> - arm64_pmu_brbe_disable()
>> - arm64_pmu_brbe_read()
>> - arm64_pmu_brbe_probe()
>> - arm64_pmu_brbe_reset()
>> - arm64_pmu_brbe_supported()
>>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-perf-users@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/arm64/kernel/perf_event.c |   8 +-
>>  drivers/perf/Kconfig           |  11 +
>>  drivers/perf/Makefile          |   1 +
>>  drivers/perf/arm_pmu_brbe.c    | 448 +++++++++++++++++++++++++++++++++
>>  drivers/perf/arm_pmu_brbe.h    | 259 +++++++++++++++++++
>>  include/linux/perf/arm_pmu.h   |  20 ++
>>  6 files changed, 746 insertions(+), 1 deletion(-)
>>  create mode 100644 drivers/perf/arm_pmu_brbe.c
>>  create mode 100644 drivers/perf/arm_pmu_brbe.h
>>
> [...]
>> +
>> +static int brbe_fetch_perf_priv(u64 brbinf)
>> +{
>> +       int brbe_el = brbe_fetch_el(brbinf);
>> +
>> +       switch (brbe_el) {
>> +       case BRBINF_EL_EL0:
>> +               return PERF_BR_PRIV_USER;
>> +       case BRBINF_EL_EL1:
>> +               return PERF_BR_PRIV_KERNEL;
>> +       case BRBINF_EL_EL2:
>> +               if (is_kernel_in_hyp_mode())
>> +                       return PERF_BR_PRIV_KERNEL;
>> +               return PERF_BR_PRIV_HV;
>> +       default:
>> +               pr_warn("unknown branch privilege captured\n");
>> +               return -1;
> 
> On V1 you said that you would change this to PERF_BR_PRIV_UNKNOWN, looks
> like that was dropped. Unless it didn't work out?

Seems like it just got dropped unintentionally. Yes, PERF_BR_PRIV_UNKNOWN
can be returned here instead of "-1", similar to brbe_fetch_perf_type()
which returns PERF_BR_UNKNOWN.
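
A minimal sketch of what the updated helper could look like, written as
userspace C so it can be compiled standalone. The enum values are stand-ins
(assumed to mirror the perf ABI series and the BRBINF EL field), the exception
level is passed in directly instead of being extracted from brbinf, and
is_kernel_in_hyp_mode() is modelled by a bool parameter. The _sketch suffix
marks the function as hypothetical.

```c
#include <stdbool.h>

/* Stand-in values, assumed to mirror the perf ABI series. */
enum {
	PERF_BR_PRIV_UNKNOWN	= 0,
	PERF_BR_PRIV_USER	= 1,
	PERF_BR_PRIV_KERNEL	= 2,
	PERF_BR_PRIV_HV		= 3,
};

/* Stand-in values for the exception level field of BRBINF<N>_EL1. */
enum {
	BRBINF_EL_EL0	= 0,
	BRBINF_EL_EL1	= 1,
	BRBINF_EL_EL2	= 2,
};

/*
 * Map a captured exception level to a perf branch privilege.
 * kernel_in_hyp_mode models is_kernel_in_hyp_mode().
 */
static int brbe_fetch_perf_priv_sketch(int brbe_el, bool kernel_in_hyp_mode)
{
	switch (brbe_el) {
	case BRBINF_EL_EL0:
		return PERF_BR_PRIV_USER;
	case BRBINF_EL_EL1:
		return PERF_BR_PRIV_KERNEL;
	case BRBINF_EL_EL2:
		if (kernel_in_hyp_mode)
			return PERF_BR_PRIV_KERNEL;
		return PERF_BR_PRIV_HV;
	default:
		/* Unknown level: report it through the ABI value, not -1. */
		return PERF_BR_PRIV_UNKNOWN;
	}
}
```

Returning an ABI value from the default case keeps every result inside the
range that the privilege field is defined to carry.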


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events for BRBE
  2022-09-13  5:33     ` Anshuman Khandual
@ 2022-09-13 11:43       ` Mark Brown
  2022-09-14  3:39         ` Anshuman Khandual
  0 siblings, 1 reply; 27+ messages in thread
From: Mark Brown @ 2022-09-13 11:43 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas, James Clark, Rob Herring,
	Marc Zyngier, Ingo Molnar



On Tue, Sep 13, 2022 at 11:03:45AM +0530, Anshuman Khandual wrote:
> On 9/12/22 15:42, Mark Brown wrote:

> > like it would be clearer and safer to allocate these dynamically
> > when BRBE is used if that's possible, I'd expect that should also
> > deal with the stack frame size issues as well.

> That might not be possible because the generic 'struct perf_branch_stack'
> expects 'perf_branch_stack.entries' to be a variable array which is also
> contiguous in memory, with other elements in 'perf_branch_stack'. Besides
> that will be a deviation from similar implementations on x86 and powerpc
> platforms.

> The stack frame size came up because BRBE_MAX_ENTRIES is 64 compared to
> just 32 on other platforms, which follow the exact same method.

If this is a pattern used by other architectures and relied on by
the core that doesn't mean it's impossible to do anything, it
means that the existing code needs to be updated to allow the
larger number of entries for BRBE if we want to change things.
That is a lot of effort of course so something that moves the
allocation off the stack would be more expedient in the short
term.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 0/7] arm64/perf: Enable branch stack sampling
  2022-09-13 10:55 ` [PATCH V2 0/7] " James Clark
@ 2022-09-13 12:12   ` Anshuman Khandual
  2022-09-13 13:12     ` James Clark
  0 siblings, 1 reply; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-13 12:12 UTC (permalink / raw)
  To: James Clark
  Cc: Rob Herring, Marc Zyngier, Ingo Molnar, linux-kernel,
	linux-perf-users, linux-arm-kernel, peterz, acme, mark.rutland,
	will, catalin.marinas



On 9/13/22 16:25, James Clark wrote:
> 
> On 08/09/2022 06:10, Anshuman Khandual wrote:
>> This series enables perf branch stack sampling support on arm64 platform
>> via a new arch feature called Branch Record Buffer Extension (BRBE). All
>> relevant register definitions could be accessed here.
>>
>> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
>>
>> This series applies on v6.0-rc4 after the BRBE related perf ABI changes series
>> (V7) that was posted earlier, and a branch sample filter helper patch.
>>
>> https://lore.kernel.org/all/20220824044822.70230-1-anshuman.khandual@arm.com/
>>
>> https://lore.kernel.org/all/20220906084414.396220-1-anshuman.khandual@arm.com/
>>
>> Following issues have been resolved
>>
>> - James's concerns regarding permission inadequacy related to perfmon_capable()
>> - James's concerns regarding using perf_event_paranoid along with perfmon_capable()
> I don't see the resolution to this one. I'm not 100% sure of the code
> path used for LBR, but I think you just need to take perf_allow_kernel()
> into account somewhere to make this command have the same result with
> BRBE. Is there any contention that the permissions shouldn't behave in
> the same way across platforms? This is when perf_event_paranoid < 2:
> 
> Intel:
> 
>   $ perf record -j any -- ls
> 
>   [ perf record: Woken up 1 times to write data ]
>   [ perf record: Captured and wrote 0.014 MB perf.data (16 samples) ]
> 
> Arm:
> 
>   $ perf record -j any -- ls
> 
>   Error:
>   No permission to enable cycles event.
> 
The proposed solution here just follows what we did for the SPE driver recently.
I would not be surprised if permission checking semantics differ across the
various platform perf drivers. Ideally, permissions (either capability or
perf_event_paranoid) should not even be checked in platform drivers.

Unfortunately, changing the permission checking framework across generic perf
is beyond the scope of this BRBE proposal and might be taken up later via a
different series. That said, I am willing to accommodate any alternative
suggestions to improve permission checking here in the BRBE driver.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 0/7] arm64/perf: Enable branch stack sampling
  2022-09-13 12:12   ` Anshuman Khandual
@ 2022-09-13 13:12     ` James Clark
  2022-09-14  4:43       ` Anshuman Khandual
  0 siblings, 1 reply; 27+ messages in thread
From: James Clark @ 2022-09-13 13:12 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Rob Herring, Marc Zyngier, Ingo Molnar, linux-kernel,
	linux-perf-users, linux-arm-kernel, peterz, acme, mark.rutland,
	will, catalin.marinas



On 13/09/2022 13:12, Anshuman Khandual wrote:
> 
> 
> On 9/13/22 16:25, James Clark wrote:
>>
>> On 08/09/2022 06:10, Anshuman Khandual wrote:
>>> This series enables perf branch stack sampling support on arm64 platform
>>> via a new arch feature called Branch Record Buffer Extension (BRBE). All
>>> relevant register definitions could be accessed here.
>>>
>>> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
>>>
>>> This series applies on v6.0-rc4 after the BRBE related perf ABI changes series
>>> (V7) that was posted earlier, and a branch sample filter helper patch.
>>>
>>> https://lore.kernel.org/all/20220824044822.70230-1-anshuman.khandual@arm.com/
>>>
>>> https://lore.kernel.org/all/20220906084414.396220-1-anshuman.khandual@arm.com/
>>>
>>> Following issues have been resolved
>>>
>>> - James's concerns regarding permission inadequacy related to perfmon_capable()
>>> - James's concerns regarding using perf_event_paranoid along with perfmon_capable()
>> I don't see the resolution to this one. I'm not 100% sure of the code
>> path used for LBR, but I think you just need to take perf_allow_kernel()
>> into account somewhere to make this command have the same result with
>> BRBE. Is there any contention that the permissions shouldn't behave in
>> the same way across platforms? This is when perf_event_paranoid < 2:
>>
>> Intel:
>>
>>   $ perf record -j any -- ls
>>
>>   [ perf record: Woken up 1 times to write data ]
>>   [ perf record: Captured and wrote 0.014 MB perf.data (16 samples) ]
>>
>> Arm:
>>
>>   $ perf record -j any -- ls
>>
>>   Error:
>>   No permission to enable cycles event.
>>
> Proposed solution here just follows what we did for the SPE driver recently.
> I would not be surprised, if there is difference in semantics in permission
> checking across various platform perf drivers. 

SPE isn't too relevant because it's its own thing and there is no SPE
command that can be run on other platforms. There may be something like
perf c2c that uses SPE under the hood but if it works differently across
platforms I would also consider that a bug and not something to be copied.

> Ideally permission should not
> even be checked in platform drivers - either capability or perf_event_paranoid.

But it is currently. Users don't care about the code or how complicated
the implementation is, only that the behaviour is sane. We're not
helping Arm users or adoption of BRBE if the same command that someone
runs somewhere else fails inexplicably, without any justification other
than "the code didn't look right".

> 
> Unfortunately changing the permission checking framework across generic perf
> is beyond the scope for this BRBE proposal and might be taken up later via a

Permissions are definitely not beyond the scope of this proposal because
the code to check the permissions has been added right here:

  +		if (perfmon_capable())
  +			event->hw.flags |= ARMPMU_EVT_PRIV;

All it needs in addition is a check of perf_allow_kernel() or similar.
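
The gap between the two policies can be modelled in a few lines. This is a
userspace sketch, not kernel code: both function names are hypothetical, the
paranoid threshold assumes perf_allow_kernel() denies only when
perf_event_paranoid > 1 and the caller lacks the capability, and the LSM hook
that the real helper also consults is left out.

```c
#include <stdbool.h>

/*
 * Capability-only policy, as in the quoted hunk: kernel branch records
 * are permitted only when perfmon_capable() would return true.
 */
static bool priv_capability_only(bool perfmon_capable)
{
	return perfmon_capable;
}

/*
 * Policy modelled on perf_allow_kernel(): access is refused only when
 * perf_event_paranoid is above 1 and the caller is not capable.
 */
static bool priv_allow_kernel_like(int perf_event_paranoid, bool perfmon_capable)
{
	return perf_event_paranoid <= 1 || perfmon_capable;
}
```

For an unprivileged user with perf_event_paranoid set to 1, the first policy
refuses and the second permits, which is the case the perf record example
above exercises.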

> different series. Although I would be willing to accommodate any alternate
> suggestions to improve permission checking here in the BRBE driver.

I don't think planning to change it in the future is very user friendly
either, otherwise any help we give to people stuck will have to start
with an explanation about how we changed the permissions model across
versions, and their command or setup also depends on the kernel version.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events for BRBE
  2022-09-13 11:43       ` Mark Brown
@ 2022-09-14  3:39         ` Anshuman Khandual
  2022-09-14  9:35           ` Mark Brown
  0 siblings, 1 reply; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-14  3:39 UTC (permalink / raw)
  To: Mark Brown
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas, James Clark, Rob Herring,
	Marc Zyngier, Ingo Molnar



On 9/13/22 17:13, Mark Brown wrote:
> On Tue, Sep 13, 2022 at 11:03:45AM +0530, Anshuman Khandual wrote:
>> On 9/12/22 15:42, Mark Brown wrote:
> 
>>> like it would be clearer and safer to allocate these dynamically
>>> when BRBE is used if that's possible, I'd expect that should also
>>> deal with the stack frame size issues as well.
> 
>> That might not be possible because the generic 'struct perf_branch_stack'
>> expects 'perf_branch_stack.entries' to be a variable array which is also
>> contiguous in memory, with other elements in 'perf_branch_stack'. Besides
>> that will be a deviation from similar implementations on x86 and powerpc
>> platforms.
> 
>> The stack frame size came up because BRBE_MAX_ENTRIES is 64 compared to
>> just 32 on other platforms, which follow the exact same method.
> 
> If this is a pattern used by other architectures and relied on by
> the core that doesn't mean it's impossible to do anything, it
> means that the existing code needs to be updated to allow the
> larger number of entries for BRBE if we want to change things.
> That is a lot of effort of course so something that moves the
> allocation off the stack would be more expedient in the short
> term.

Something like the following change moves the buffer allocation off the stack,
although it requires updating the driver, and buffer assignment during a PMU
interrupt. But it does seem to work (will require some more testing).

diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 3e7757d05146..a3401122d855 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -52,6 +52,12 @@ static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_PRIV) == ARMPMU_EVT_PRIV);
  */
 #define BRBE_MAX_ENTRIES 64
 
+/* Captured BRBE buffer - copied as is into perf_sample_data */
+struct brbe_records {
+       struct perf_branch_stack        brbe_stack;
+       struct perf_branch_entry        brbe_entries[BRBE_MAX_ENTRIES];
+};
+
 /* The events for a given PMU register set. */
 struct pmu_hw_events {
        /*
@@ -92,9 +98,7 @@ struct pmu_hw_events {
        unsigned int                    brbe_users;
        void                            *brbe_context;
 
-       /* Captured BRBE buffer - copied as is into perf_sample_data */
-       struct perf_branch_stack        brbe_stack;
-       struct perf_branch_entry        brbe_entries[BRBE_MAX_ENTRIES];
+       struct brbe_records             *branch_records;
 };
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 05848c6d955c..2f0957519307 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -951,6 +951,13 @@ static struct arm_pmu *__armpmu_alloc(gfp_t flags)
                goto out_free_pmu;
        }
 
+       for_each_possible_cpu(cpu) {
+               struct pmu_hw_events *events = per_cpu_ptr(pmu->hw_events, cpu);
+
+               events->branch_records = kmalloc(sizeof(struct brbe_records), flags);
+               WARN_ON(!events->branch_records);
+       }
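
One thing the hunk above leaves open is what happens when an allocation
fails: as posted, WARN_ON() reports it but the loop appears to carry on with a
NULL branch_records pointer. A userspace sketch of the usual unwind pattern
follows. Here calloc() and free() stand in for the kernel allocator, the
buffer size is a placeholder, and struct hw_events_slot,
alloc_branch_records() and free_branch_records() are hypothetical names that
are not part of the patch.

```c
#include <stddef.h>
#include <stdlib.h>

/* Placeholder size standing in for sizeof(struct brbe_records). */
#define BRBE_RECORDS_BYTES 1552

/* Hypothetical per-CPU slot holding only the branch record pointer. */
struct hw_events_slot {
	void *branch_records;
};

/*
 * Allocate one buffer per slot. On failure, free everything allocated
 * so far and return -1 so the caller can abort the PMU setup.
 */
static int alloc_branch_records(struct hw_events_slot *slots, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		slots[i].branch_records = calloc(1, BRBE_RECORDS_BYTES);
		if (!slots[i].branch_records)
			goto unwind;
	}
	return 0;

unwind:
	/* Slot i failed and is already NULL; release slots 0..i-1. */
	while (i--) {
		free(slots[i].branch_records);
		slots[i].branch_records = NULL;
	}
	return -1;
}

/* Release every buffer; safe to call on slots that are already NULL. */
static void free_branch_records(struct hw_events_slot *slots, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		free(slots[i].branch_records);
		slots[i].branch_records = NULL;
	}
}
```

The matching free path would also be needed wherever the PMU itself is torn
down, so that the buffers do not outlive pmu->hw_events.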


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH V2 0/7] arm64/perf: Enable branch stack sampling
  2022-09-13 13:12     ` James Clark
@ 2022-09-14  4:43       ` Anshuman Khandual
  0 siblings, 0 replies; 27+ messages in thread
From: Anshuman Khandual @ 2022-09-14  4:43 UTC (permalink / raw)
  To: James Clark
  Cc: Rob Herring, Marc Zyngier, Ingo Molnar, linux-kernel,
	linux-perf-users, linux-arm-kernel, peterz, acme, mark.rutland,
	will, catalin.marinas



On 9/13/22 18:42, James Clark wrote:
> 
> 
> On 13/09/2022 13:12, Anshuman Khandual wrote:
>>
>>
>> On 9/13/22 16:25, James Clark wrote:
>>>
>>> On 08/09/2022 06:10, Anshuman Khandual wrote:
>>>> This series enables perf branch stack sampling support on arm64 platform
>>>> via a new arch feature called Branch Record Buffer Extension (BRBE). All
>>>> relevant register definitions could be accessed here.
>>>>
>>>> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
>>>>
>>>> This series applies on v6.0-rc4 after the BRBE related perf ABI changes series
>>>> (V7) that was posted earlier, and a branch sample filter helper patch.
>>>>
>>>> https://lore.kernel.org/all/20220824044822.70230-1-anshuman.khandual@arm.com/
>>>>
>>>> https://lore.kernel.org/all/20220906084414.396220-1-anshuman.khandual@arm.com/
>>>>
>>>> Following issues have been resolved
>>>>
>>>> - James's concerns regarding permission inadequacy related to perfmon_capable()
>>>> - James's concerns regarding using perf_event_paranoid along with perfmon_capable()
>>> I don't see the resolution to this one. I'm not 100% sure of the code
>>> path used for LBR, but I think you just need to take perf_allow_kernel()
>>> into account somewhere to make this command have the same result with
>>> BRBE. Is there any contention that the permissions shouldn't behave in
>>> the same way across platforms? This is when perf_event_paranoid < 2:
>>>
>>> Intel:
>>>
>>>   $ perf record -j any -- ls
>>>
>>>   [ perf record: Woken up 1 times to write data ]
>>>   [ perf record: Captured and wrote 0.014 MB perf.data (16 samples) ]
>>>
>>> Arm:
>>>
>>>   $ perf record -j any -- ls
>>>
>>>   Error:
>>>   No permission to enable cycles event.
>>>
>> The proposed solution here just follows what we did for the SPE driver recently.
>> I would not be surprised if there is a difference in permission checking
>> semantics across various platform perf drivers.
> 
> SPE isn't too relevant because it's its own thing and there is no SPE
> command that can be run on other platforms. There may be something like
> perf c2c that uses SPE under the hood but if it works differently across
> platforms I would also consider that a bug and not something to be copied.
> 
>> Ideally permission should not
>> even be checked in platform drivers - either capability or perf_event_paranoid.
> 
> But it is currently. Users don't care about the code or how complicated
> the implementation is, only that the behaviour is sane. We're not
> helping Arm users or adoption of BRBE if the same command that someone
> runs somewhere else fails inexplicably, without any justification other
> than "the code didn't look right".
> 
>>
>> Unfortunately changing the permission checking framework across generic perf
>> is beyond the scope for this BRBE proposal and might be taken up later via a
> 
> Permissions are definitely not beyond the scope of this proposal because
> the code to check the permissions has been added right here:
> 
>   +		if (perfmon_capable())
>   +			event->hw.flags |= ARMPMU_EVT_PRIV;
> 
> And all it needs in addition is a check of perf_allow_kernel() or similar.
> 
>> different series. Although I would be willing to accommodate any alternate
>> suggestions to improve permission checking here in the BRBE driver.
> 
> I don't think planning to change it in the future is very user friendly
> either, otherwise any help we give to people stuck will have to start
> with an explanation about how we changed the permissions model across
> versions, and their command or setup also depends on the kernel version.

I guess this discussion regarding perfmon_capable(), perf_event_paranoid,
and perf_allow_kernel() has been going around in circles :)
There are multiple approaches to solving this problem, both in the near and
long term, and there seems to be disagreement over which path is preferred.
Hence, I will just leave the decision up to the maintainers.
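
For readers following the permission argument, the two positions can be
compared with a small userspace model. This is only a sketch under stated
assumptions, not the driver code: struct perm_state and the brbe_priv_*()
helpers are hypothetical names, the model of perf_allow_kernel() covers only
its paranoid/capability test and omits the LSM hook, and how ARMPMU_EVT_PRIV
is consumed later in the driver is not modelled.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; not the kernel's own definitions. */
struct perm_state {
	int  paranoid;         /* models the perf_event_paranoid sysctl */
	bool perfmon_capable;  /* models the result of perfmon_capable() */
};

/*
 * Models only the core test of perf_allow_kernel(): access is denied when
 * paranoid > 1 and the caller lacks the capability. The security hook that
 * the real helper also calls is left out (assumption for illustration).
 */
static int model_perf_allow_kernel(const struct perm_state *s)
{
	if (s->paranoid > 1 && !s->perfmon_capable)
		return -EACCES;
	return 0;
}

/* V2 as quoted in this thread: the flag is set only if perfmon_capable(). */
static bool brbe_priv_v2(const struct perm_state *s)
{
	return s->perfmon_capable;
}

/* The suggested alternative: also honour perf_event_paranoid. */
static bool brbe_priv_suggested(const struct perm_state *s)
{
	return model_perf_allow_kernel(s) == 0;
}
```

With paranoid = 1 and no capability, the V2 helper reports unprivileged while
the suggested one reports privileged, which is the gap behind the differing
"perf record -j any" results quoted above.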



* Re: [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events for BRBE
  2022-09-14  3:39         ` Anshuman Khandual
@ 2022-09-14  9:35           ` Mark Brown
  0 siblings, 0 replies; 27+ messages in thread
From: Mark Brown @ 2022-09-14  9:35 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-kernel, linux-perf-users, linux-arm-kernel, peterz, acme,
	mark.rutland, will, catalin.marinas, James Clark, Rob Herring,
	Marc Zyngier, Ingo Molnar



On Wed, Sep 14, 2022 at 09:09:10AM +0530, Anshuman Khandual wrote:

> Something like the following change moves the buffer allocation off the stack,
> although it requires updating the driver, and buffer assignment during a PMU
> interrupt. But it does seem to work (will require some more testing).

Yeah, looks like it should do the trick.

> +
> +               events->branch_records = kmalloc(sizeof(struct brbe_records), flags);
> +               WARN_ON(!events->branch_records);

No need for the WARN_ON(): if we run out of memory, the memory
management code is already very loud about it.
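
A rough userspace sketch of the allocation loop without the WARN_ON() might
look like the following. Everything here is an assumption for illustration:
malloc()/free() stand in for kmalloc()/kfree(), a plain array stands in for
the per-CPU pmu_hw_events, the *_model types and NR_MODEL_CPUS are
hypothetical, and returning -ENOMEM with unwinding goes beyond what was
actually asked for above (only dropping the warning).

```c
#include <errno.h>
#include <stdlib.h>

#define NR_MODEL_CPUS 4	/* hypothetical CPU count for the model */

/* Placeholder layout; the real struct brbe_records is defined by the series. */
struct brbe_records_model {
	unsigned long entries[64];
};

/* Stand-in for the per-CPU struct pmu_hw_events. */
struct hw_events_model {
	struct brbe_records_model *branch_records;
};

/* Free the buffers of the first ncpus entries and clear the pointers. */
static void free_branch_records(struct hw_events_model *ev, int ncpus)
{
	for (int cpu = 0; cpu < ncpus; cpu++) {
		free(ev[cpu].branch_records);
		ev[cpu].branch_records = NULL;
	}
}

/*
 * Allocate one record buffer per CPU. On failure, release what was already
 * allocated and report the error to the caller instead of warning.
 */
static int alloc_branch_records(struct hw_events_model *ev, int ncpus)
{
	for (int cpu = 0; cpu < ncpus; cpu++) {
		ev[cpu].branch_records = malloc(sizeof(*ev[cpu].branch_records));
		if (!ev[cpu].branch_records) {
			free_branch_records(ev, cpu);
			return -ENOMEM;
		}
	}
	return 0;
}
```

Propagating the error means the caller never sees a NULL buffer at PMU
interrupt time, so no extra diagnostics are needed on top of the allocator's
own failure report.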



end of thread, other threads:[~2022-09-14  9:42 UTC | newest]

Thread overview: 27+ messages
2022-09-08  5:10 [PATCH V2 0/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
2022-09-08  5:10 ` [PATCH V2 1/7] arm64/perf: Add register definitions for BRBE Anshuman Khandual
2022-09-12  9:57   ` Mark Brown
2022-09-13  6:24     ` Anshuman Khandual
2022-09-13 11:30       ` Mark Brown
2022-09-08  5:10 ` [PATCH V2 2/7] arm64/perf: Update struct arm_pmu " Anshuman Khandual
2022-09-08  5:10 ` [PATCH V2 3/7] arm64/perf: Update struct pmu_hw_events " Anshuman Khandual
     [not found]   ` <202209082022.6BPdyQn8-lkp@intel.com>
2022-09-09  3:11     ` Anshuman Khandual
     [not found]   ` <202209082259.XDCTMY9g-lkp@intel.com>
2022-09-09  3:14     ` Anshuman Khandual
2022-09-12 10:12   ` Mark Brown
2022-09-13  5:33     ` Anshuman Khandual
2022-09-13 11:43       ` Mark Brown
2022-09-14  3:39         ` Anshuman Khandual
2022-09-14  9:35           ` Mark Brown
2022-09-08  5:10 ` [PATCH V2 4/7] driver/perf/arm_pmu_platform: Add support for BRBE attributes detection Anshuman Khandual
2022-09-08  5:10 ` [PATCH V2 5/7] arm64/perf: Drive BRBE from perf event states Anshuman Khandual
2022-09-08 15:31   ` kernel test robot
2022-09-08  5:10 ` [PATCH V2 6/7] arm64/perf: Add BRBE driver Anshuman Khandual
2022-09-08  9:23   ` kernel test robot
2022-09-08 10:16     ` Anshuman Khandual
2022-09-13 10:39   ` James Clark
2022-09-13 11:38     ` Anshuman Khandual
2022-09-08  5:10 ` [PATCH V2 7/7] arm64/perf: Enable branch stack sampling Anshuman Khandual
2022-09-13 10:55 ` [PATCH V2 0/7] " James Clark
2022-09-13 12:12   ` Anshuman Khandual
2022-09-13 13:12     ` James Clark
2022-09-14  4:43       ` Anshuman Khandual
