All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V7 0/6] arm64/perf: Enable branch stack sampling
@ 2023-01-05  3:10 ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Mark Brown,
	James Clark, Rob Herring, Marc Zyngier, Suzuki Poulose,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	linux-perf-users

This series enables perf branch stack sampling support on arm64 platform
via a new arch feature called Branch Record Buffer Extension (BRBE). All
relevant register definitions could be accessed here.

https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers

This series applies on v6.2-r2.

Changes in V7:

- Folded [PATCH 7/7] into [PATCH 3/7] which enables branch stack sampling event
- Defined BRBFCR_EL1_BRANCH_FILTERS, BRBCR_EL1_DEFAULT_CONFIG in the header
- Defined BRBFCR_EL1_DEFAULT_CONFIG in the header
- Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_FZP
- Defined BRBCR_EL1_DEFAULT_TS in the header
- Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_DEFAULT_TS
- Moved BRBCR_EL1_DEFAULT_CONFIG check inside branch_type_to_brbcr()
- Moved down BRBCR_EL1_CC, BRBCR_EL1_MPRED later in branch_type_to_brbcr()
- Also set BRBE in paused state in armv8pmu_branch_disable()
- Dropped brbe_paused(), set_brbe_paused() helpers
- Extracted error string via branch_filter_error_msg[] for armv8pmu_branch_valid()
- Replaced brbe_v1p1 with brbe_version in struct brbe_hw_attr
- Added valid_brbe_[cc, format, version]() helpers
- Split a separate brbe_attributes_probe() from armv8pmu_branch_probe()
- Capture event->attr.branch_sample_type earlier in armv8pmu_branch_valid()
- Defined enum brbe_bank_idx with possible values for BRBE bank indices
- Changed armpmu->hw_attr into armpmu->private
- Added missing space in stub definition for armv8pmu_branch_valid()
- Replaced both kmalloc() with kzalloc()
- Added BRBE_BANK_MAX_ENTRIES
- Updated comment for capture_brbe_flags()
- Updated comment for struct brbe_hw_attr
- Dropped space after type cast in couple of places
- Replaced inverse with negation for testing BRBCR_EL1_FZP in armv8pmu_branch_read()
- Captured cpuc->branches->branch_entries[idx] in a local variable
- Dropped saved_priv from armv8pmu_branch_read()
- Reorganize PERF_SAMPLE_BRANCH_NO_[CYCLES|NO_FLAGS] related configuration
- Replaced with FIELD_GET() and FIELD_PREP() wherever applicable
- Replaced BRBCR_EL1_TS_PHYSICAL with BRBCR_EL1_TS_VIRTUAL
- Moved valid_brbe_nr(), valid_brbe_cc(), valid_brbe_format(), valid_brbe_version()
  select_brbe_bank(), select_brbe_bank_index() helpers inside the C implementation
- Reorganized brbe_valid_nr() and dropped the pr_warn() message
- Changed probe sequence in brbe_attributes_probe()
- Added 'brbcr' argument into capture_brbe_flags() to ascertain correct state
- Disable BRBE before disabling the PMU event counter
- Enable PERF_SAMPLE_BRANCH_HV filters when is_kernel_in_hyp_mode()
- Guard armv8pmu_reset() & armv8pmu_sched_task() with arm_pmu_branch_stack_supported()

Changes in V6:

https://lore.kernel.org/linux-arm-kernel/20221208084402.863310-1-anshuman.khandual@arm.com/

- Restore the exception level privilege after reading the branch records
- Unpause the buffer after reading the branch records
- Decouple BRBCR_EL1_EXCEPTION/ERTN from perf event privilege level
- Reworked BRBE implementation and branch stack sampling support on arm pmu
- BRBE implementation is now part of overall ARMV8 PMU implementation
- BRBE implementation moved from drivers/perf/ to inside arch/arm64/kernel/
- CONFIG_ARM_BRBE_PMU renamed as CONFIG_ARM64_BRBE in arch/arm64/Kconfig
- File moved - drivers/perf/arm_pmu_brbe.c -> arch/arm64/kernel/brbe.c
- File moved - drivers/perf/arm_pmu_brbe.h -> arch/arm64/kernel/brbe.h
- BRBE name has been dropped from struct arm_pmu and struct hw_pmu_events
- BRBE name has been abstracted out as 'branches' in arm_pmu and hw_pmu_events
- BRBE name has been abstracted out as 'branches' in ARMV8 PMU implementation
- Added sched_task() callback into struct arm_pmu
- Added 'hw_attr' into struct arm_pmu encapsulating possible PMU HW attributes
- Dropped explicit attributes brbe_(v1p1, nr, cc, format) from struct arm_pmu
- Dropped brbfcr, brbcr, registers scratch area from struct hw_pmu_events
- Dropped brbe_users, brbe_context tracking in struct hw_pmu_events
- Added 'features' tracking into struct arm_pmu with ARM_PMU_BRANCH_STACK flag
- armpmu->hw_attr maps into 'struct brbe_hw_attr' inside BRBE implementation
- Set ARM_PMU_BRANCH_STACK in 'arm_pmu->features' after successful BRBE probe
- Added armv8pmu_branch_reset() inside armv8pmu_branch_enable()
- Dropped brbe_supported() as events will be rejected via ARM_PMU_BRANCH_STACK
- Dropped set_brbe_disabled() as well
- Reformated armv8pmu_branch_valid() warnings while rejecting unsupported events

Changes in V5:

https://lore.kernel.org/linux-arm-kernel/20221107062514.2851047-1-anshuman.khandual@arm.com/

- Changed BRBCR_EL1.VIRTUAL from 0b1 to 0b01
- Changed BRBFCR_EL1.EnL into BRBFCR_EL1.EnI
- Changed config ARM_BRBE_PMU from 'tristate' to 'bool'

Changes in V4:

https://lore.kernel.org/all/20221017055713.451092-1-anshuman.khandual@arm.com/

- Changed ../tools/sysreg declarations as suggested
- Set PERF_SAMPLE_BRANCH_STACK in data.sample_flags
- Dropped perfmon_capable() check in armpmu_event_init()
- s/pr_warn_once/pr_info in armpmu_event_init()
- Added brbe_format element into struct pmu_hw_events
- Changed v1p1 as brbe_v1p1 in struct pmu_hw_events
- Dropped pr_info() from arm64_pmu_brbe_probe(), solved LOCKDEP warning

Changes in V3:

https://lore.kernel.org/all/20220929075857.158358-1-anshuman.khandual@arm.com/

- Moved brbe_stack from the stack and now dynamically allocated
- Return PERF_BR_PRIV_UNKNOWN instead of -1 in brbe_fetch_perf_priv()
- Moved BRBIDR0, BRBCR, BRBFCR registers and fields into tools/sysreg
- Created dummy BRBINF_EL1 field definitions in tools/sysreg
- Dropped ARMPMU_EVT_PRIV framework which cached perfmon_capable()
- Both exception and exception return branche records are now captured
  only if the event has PERF_SAMPLE_BRANCH_KERNEL which would already
  been checked in generic perf via perf_allow_kernel()

Changes in V2:

https://lore.kernel.org/all/20220908051046.465307-1-anshuman.khandual@arm.com/

- Dropped branch sample filter helpers consolidation patch from this series 
- Added new hw_perf_event.flags element ARMPMU_EVT_PRIV to cache perfmon_capable()
- Use cached perfmon_capable() while configuring BRBE branch record filters

Changes in V1:

https://lore.kernel.org/linux-arm-kernel/20220613100119.684673-1-anshuman.khandual@arm.com/

- Added CONFIG_PERF_EVENTS wrapper for all branch sample filter helpers
- Process new perf branch types via PERF_BR_EXTEND_ABI

Changes in RFC V2:

https://lore.kernel.org/linux-arm-kernel/20220412115455.293119-1-anshuman.khandual@arm.com/

- Added branch_sample_priv() while consolidating other branch sample filter helpers
- Changed all SYS_BRBXXXN_EL1 register definition encodings per Marc
- Changed the BRBE driver as per proposed BRBE related perf ABI changes (V5)
- Added documentation for struct arm_pmu changes, updated commit message
- Updated commit message for BRBE detection infrastructure patch
- PERF_SAMPLE_BRANCH_KERNEL gets checked during arm event init (outside the driver)
- Branch privilege state capture mechanism has now moved inside the driver

Changes in RFC V1:

https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Anshuman Khandual (6):
  drivers: perf: arm_pmu: Add new sched_task() callback
  arm64/perf: Add BRBE registers and fields
  arm64/perf: Add branch stack support in struct arm_pmu
  arm64/perf: Add branch stack support in struct pmu_hw_events
  arm64/perf: Add branch stack support in ARMV8 PMU
  arm64/perf: Enable branch stack events via FEAT_BRBE

 arch/arm64/Kconfig                  |  11 +
 arch/arm64/include/asm/perf_event.h |  19 ++
 arch/arm64/include/asm/sysreg.h     | 103 ++++++
 arch/arm64/kernel/Makefile          |   1 +
 arch/arm64/kernel/brbe.c            | 512 ++++++++++++++++++++++++++++
 arch/arm64/kernel/brbe.h            | 257 ++++++++++++++
 arch/arm64/kernel/perf_event.c      |  35 ++
 arch/arm64/tools/sysreg             | 161 +++++++++
 drivers/perf/arm_pmu.c              |  12 +-
 include/linux/perf/arm_pmu.h        |  19 ++
 10 files changed, 1128 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm64/kernel/brbe.c
 create mode 100644 arch/arm64/kernel/brbe.h

-- 
2.25.1


^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH V7 0/6] arm64/perf: Enable branch stack sampling
@ 2023-01-05  3:10 ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Mark Brown,
	James Clark, Rob Herring, Marc Zyngier, Suzuki Poulose,
	Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	linux-perf-users

This series enables perf branch stack sampling support on arm64 platform
via a new arch feature called Branch Record Buffer Extension (BRBE). All
relevant register definitions could be accessed here.

https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers

This series applies on v6.2-r2.

Changes in V7:

- Folded [PATCH 7/7] into [PATCH 3/7] which enables branch stack sampling event
- Defined BRBFCR_EL1_BRANCH_FILTERS, BRBCR_EL1_DEFAULT_CONFIG in the header
- Defined BRBFCR_EL1_DEFAULT_CONFIG in the header
- Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_FZP
- Defined BRBCR_EL1_DEFAULT_TS in the header
- Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_DEFAULT_TS
- Moved BRBCR_EL1_DEFAULT_CONFIG check inside branch_type_to_brbcr()
- Moved down BRBCR_EL1_CC, BRBCR_EL1_MPRED later in branch_type_to_brbcr()
- Also set BRBE in paused state in armv8pmu_branch_disable()
- Dropped brbe_paused(), set_brbe_paused() helpers
- Extracted error string via branch_filter_error_msg[] for armv8pmu_branch_valid()
- Replaced brbe_v1p1 with brbe_version in struct brbe_hw_attr
- Added valid_brbe_[cc, format, version]() helpers
- Split a separate brbe_attributes_probe() from armv8pmu_branch_probe()
- Capture event->attr.branch_sample_type earlier in armv8pmu_branch_valid()
- Defined enum brbe_bank_idx with possible values for BRBE bank indices
- Changed armpmu->hw_attr into armpmu->private
- Added missing space in stub definition for armv8pmu_branch_valid()
- Replaced both kmalloc() with kzalloc()
- Added BRBE_BANK_MAX_ENTRIES
- Updated comment for capture_brbe_flags()
- Updated comment for struct brbe_hw_attr
- Dropped space after type cast in couple of places
- Replaced inverse with negation for testing BRBCR_EL1_FZP in armv8pmu_branch_read()
- Captured cpuc->branches->branch_entries[idx] in a local variable
- Dropped saved_priv from armv8pmu_branch_read()
- Reorganize PERF_SAMPLE_BRANCH_NO_[CYCLES|NO_FLAGS] related configuration
- Replaced with FIELD_GET() and FIELD_PREP() wherever applicable
- Replaced BRBCR_EL1_TS_PHYSICAL with BRBCR_EL1_TS_VIRTUAL
- Moved valid_brbe_nr(), valid_brbe_cc(), valid_brbe_format(), valid_brbe_version()
  select_brbe_bank(), select_brbe_bank_index() helpers inside the C implementation
- Reorganized brbe_valid_nr() and dropped the pr_warn() message
- Changed probe sequence in brbe_attributes_probe()
- Added 'brbcr' argument into capture_brbe_flags() to ascertain correct state
- Disable BRBE before disabling the PMU event counter
- Enable PERF_SAMPLE_BRANCH_HV filters when is_kernel_in_hyp_mode()
- Guard armv8pmu_reset() & armv8pmu_sched_task() with arm_pmu_branch_stack_supported()

Changes in V6:

https://lore.kernel.org/linux-arm-kernel/20221208084402.863310-1-anshuman.khandual@arm.com/

- Restore the exception level privilege after reading the branch records
- Unpause the buffer after reading the branch records
- Decouple BRBCR_EL1_EXCEPTION/ERTN from perf event privilege level
- Reworked BRBE implementation and branch stack sampling support on arm pmu
- BRBE implementation is now part of overall ARMV8 PMU implementation
- BRBE implementation moved from drivers/perf/ to inside arch/arm64/kernel/
- CONFIG_ARM_BRBE_PMU renamed as CONFIG_ARM64_BRBE in arch/arm64/Kconfig
- File moved - drivers/perf/arm_pmu_brbe.c -> arch/arm64/kernel/brbe.c
- File moved - drivers/perf/arm_pmu_brbe.h -> arch/arm64/kernel/brbe.h
- BRBE name has been dropped from struct arm_pmu and struct hw_pmu_events
- BRBE name has been abstracted out as 'branches' in arm_pmu and hw_pmu_events
- BRBE name has been abstracted out as 'branches' in ARMV8 PMU implementation
- Added sched_task() callback into struct arm_pmu
- Added 'hw_attr' into struct arm_pmu encapsulating possible PMU HW attributes
- Dropped explicit attributes brbe_(v1p1, nr, cc, format) from struct arm_pmu
- Dropped brbfcr, brbcr, registers scratch area from struct hw_pmu_events
- Dropped brbe_users, brbe_context tracking in struct hw_pmu_events
- Added 'features' tracking into struct arm_pmu with ARM_PMU_BRANCH_STACK flag
- armpmu->hw_attr maps into 'struct brbe_hw_attr' inside BRBE implementation
- Set ARM_PMU_BRANCH_STACK in 'arm_pmu->features' after successful BRBE probe
- Added armv8pmu_branch_reset() inside armv8pmu_branch_enable()
- Dropped brbe_supported() as events will be rejected via ARM_PMU_BRANCH_STACK
- Dropped set_brbe_disabled() as well
- Reformated armv8pmu_branch_valid() warnings while rejecting unsupported events

Changes in V5:

https://lore.kernel.org/linux-arm-kernel/20221107062514.2851047-1-anshuman.khandual@arm.com/

- Changed BRBCR_EL1.VIRTUAL from 0b1 to 0b01
- Changed BRBFCR_EL1.EnL into BRBFCR_EL1.EnI
- Changed config ARM_BRBE_PMU from 'tristate' to 'bool'

Changes in V4:

https://lore.kernel.org/all/20221017055713.451092-1-anshuman.khandual@arm.com/

- Changed ../tools/sysreg declarations as suggested
- Set PERF_SAMPLE_BRANCH_STACK in data.sample_flags
- Dropped perfmon_capable() check in armpmu_event_init()
- s/pr_warn_once/pr_info in armpmu_event_init()
- Added brbe_format element into struct pmu_hw_events
- Changed v1p1 as brbe_v1p1 in struct pmu_hw_events
- Dropped pr_info() from arm64_pmu_brbe_probe(), solved LOCKDEP warning

Changes in V3:

https://lore.kernel.org/all/20220929075857.158358-1-anshuman.khandual@arm.com/

- Moved brbe_stack from the stack and now dynamically allocated
- Return PERF_BR_PRIV_UNKNOWN instead of -1 in brbe_fetch_perf_priv()
- Moved BRBIDR0, BRBCR, BRBFCR registers and fields into tools/sysreg
- Created dummy BRBINF_EL1 field definitions in tools/sysreg
- Dropped ARMPMU_EVT_PRIV framework which cached perfmon_capable()
- Both exception and exception return branche records are now captured
  only if the event has PERF_SAMPLE_BRANCH_KERNEL which would already
  been checked in generic perf via perf_allow_kernel()

Changes in V2:

https://lore.kernel.org/all/20220908051046.465307-1-anshuman.khandual@arm.com/

- Dropped branch sample filter helpers consolidation patch from this series 
- Added new hw_perf_event.flags element ARMPMU_EVT_PRIV to cache perfmon_capable()
- Use cached perfmon_capable() while configuring BRBE branch record filters

Changes in V1:

https://lore.kernel.org/linux-arm-kernel/20220613100119.684673-1-anshuman.khandual@arm.com/

- Added CONFIG_PERF_EVENTS wrapper for all branch sample filter helpers
- Process new perf branch types via PERF_BR_EXTEND_ABI

Changes in RFC V2:

https://lore.kernel.org/linux-arm-kernel/20220412115455.293119-1-anshuman.khandual@arm.com/

- Added branch_sample_priv() while consolidating other branch sample filter helpers
- Changed all SYS_BRBXXXN_EL1 register definition encodings per Marc
- Changed the BRBE driver as per proposed BRBE related perf ABI changes (V5)
- Added documentation for struct arm_pmu changes, updated commit message
- Updated commit message for BRBE detection infrastructure patch
- PERF_SAMPLE_BRANCH_KERNEL gets checked during arm event init (outside the driver)
- Branch privilege state capture mechanism has now moved inside the driver

Changes in RFC V1:

https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-perf-users@vger.kernel.org
Cc: linux-kernel@vger.kernel.org

Anshuman Khandual (6):
  drivers: perf: arm_pmu: Add new sched_task() callback
  arm64/perf: Add BRBE registers and fields
  arm64/perf: Add branch stack support in struct arm_pmu
  arm64/perf: Add branch stack support in struct pmu_hw_events
  arm64/perf: Add branch stack support in ARMV8 PMU
  arm64/perf: Enable branch stack events via FEAT_BRBE

 arch/arm64/Kconfig                  |  11 +
 arch/arm64/include/asm/perf_event.h |  19 ++
 arch/arm64/include/asm/sysreg.h     | 103 ++++++
 arch/arm64/kernel/Makefile          |   1 +
 arch/arm64/kernel/brbe.c            | 512 ++++++++++++++++++++++++++++
 arch/arm64/kernel/brbe.h            | 257 ++++++++++++++
 arch/arm64/kernel/perf_event.c      |  35 ++
 arch/arm64/tools/sysreg             | 161 +++++++++
 drivers/perf/arm_pmu.c              |  12 +-
 include/linux/perf/arm_pmu.h        |  19 ++
 10 files changed, 1128 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm64/kernel/brbe.c
 create mode 100644 arch/arm64/kernel/brbe.h

-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* [PATCH V7 1/6] drivers: perf: arm_pmu: Add new sched_task() callback
  2023-01-05  3:10 ` Anshuman Khandual
@ 2023-01-05  3:10   ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon

This adds armpmu_sched_task(), as generic pmu's sched_task() override which
in turn can utilize a new arm_pmu.sched_task() callback when available from
the arm_pmu instance. This new callback will be used while enabling BRBE in
ARMV8 PMU.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 drivers/perf/arm_pmu.c       | 9 +++++++++
 include/linux/perf/arm_pmu.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 9b593f985805..14a3ed3bdb0b 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -517,6 +517,14 @@ static int armpmu_event_init(struct perf_event *event)
 	return __hw_perf_event_init(event);
 }
 
+static void armpmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
+
+	if (armpmu->sched_task)
+		armpmu->sched_task(pmu_ctx, sched_in);
+}
+
 static void armpmu_enable(struct pmu *pmu)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(pmu);
@@ -873,6 +881,7 @@ struct arm_pmu *armpmu_alloc(void)
 	}
 
 	pmu->pmu = (struct pmu) {
+		.sched_task	= armpmu_sched_task,
 		.pmu_enable	= armpmu_enable,
 		.pmu_disable	= armpmu_disable,
 		.event_init	= armpmu_event_init,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index ef914a600087..2a9d07cee927 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -101,6 +101,7 @@ struct arm_pmu {
 	void		(*reset)(void *);
 	int		(*map_event)(struct perf_event *event);
 	bool		(*filter)(struct pmu *pmu, int cpu);
+	void		(*sched_task)(struct perf_event_pmu_context *pmu_ctx, bool sched_in);
 	int		num_events;
 	bool		secure_access; /* 32-bit ARM only */
 #define ARMV8_PMUV3_MAX_COMMON_EVENTS		0x40
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH V7 1/6] drivers: perf: arm_pmu: Add new sched_task() callback
@ 2023-01-05  3:10   ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon

This adds armpmu_sched_task(), as generic pmu's sched_task() override which
in turn can utilize a new arm_pmu.sched_task() callback when available from
the arm_pmu instance. This new callback will be used while enabling BRBE in
ARMV8 PMU.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 drivers/perf/arm_pmu.c       | 9 +++++++++
 include/linux/perf/arm_pmu.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 9b593f985805..14a3ed3bdb0b 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -517,6 +517,14 @@ static int armpmu_event_init(struct perf_event *event)
 	return __hw_perf_event_init(event);
 }
 
+static void armpmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
+
+	if (armpmu->sched_task)
+		armpmu->sched_task(pmu_ctx, sched_in);
+}
+
 static void armpmu_enable(struct pmu *pmu)
 {
 	struct arm_pmu *armpmu = to_arm_pmu(pmu);
@@ -873,6 +881,7 @@ struct arm_pmu *armpmu_alloc(void)
 	}
 
 	pmu->pmu = (struct pmu) {
+		.sched_task	= armpmu_sched_task,
 		.pmu_enable	= armpmu_enable,
 		.pmu_disable	= armpmu_disable,
 		.event_init	= armpmu_event_init,
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index ef914a600087..2a9d07cee927 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -101,6 +101,7 @@ struct arm_pmu {
 	void		(*reset)(void *);
 	int		(*map_event)(struct perf_event *event);
 	bool		(*filter)(struct pmu *pmu, int cpu);
+	void		(*sched_task)(struct perf_event_pmu_context *pmu_ctx, bool sched_in);
 	int		num_events;
 	bool		secure_access; /* 32-bit ARM only */
 #define ARMV8_PMUV3_MAX_COMMON_EVENTS		0x40
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
  2023-01-05  3:10 ` Anshuman Khandual
@ 2023-01-05  3:10   ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Marc Zyngier,
	Mark Brown

This adds BRBE related register definitions and various other related field
macros there in. These will be used subsequently in a BRBE driver which is
being added later on.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/sysreg.h | 103 ++++++++++++++++++++
 arch/arm64/tools/sysreg         | 161 ++++++++++++++++++++++++++++++++
 2 files changed, 264 insertions(+)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 1312fb48f18b..05b70e8b7f83 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -165,6 +165,109 @@
 #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
 #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
 
+#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
+#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
+#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))
+
+#define SYS_BRBINF0_EL1			__SYS_BRBINFO(0)
+#define SYS_BRBINF1_EL1			__SYS_BRBINFO(1)
+#define SYS_BRBINF2_EL1			__SYS_BRBINFO(2)
+#define SYS_BRBINF3_EL1			__SYS_BRBINFO(3)
+#define SYS_BRBINF4_EL1			__SYS_BRBINFO(4)
+#define SYS_BRBINF5_EL1			__SYS_BRBINFO(5)
+#define SYS_BRBINF6_EL1			__SYS_BRBINFO(6)
+#define SYS_BRBINF7_EL1			__SYS_BRBINFO(7)
+#define SYS_BRBINF8_EL1			__SYS_BRBINFO(8)
+#define SYS_BRBINF9_EL1			__SYS_BRBINFO(9)
+#define SYS_BRBINF10_EL1		__SYS_BRBINFO(10)
+#define SYS_BRBINF11_EL1		__SYS_BRBINFO(11)
+#define SYS_BRBINF12_EL1		__SYS_BRBINFO(12)
+#define SYS_BRBINF13_EL1		__SYS_BRBINFO(13)
+#define SYS_BRBINF14_EL1		__SYS_BRBINFO(14)
+#define SYS_BRBINF15_EL1		__SYS_BRBINFO(15)
+#define SYS_BRBINF16_EL1		__SYS_BRBINFO(16)
+#define SYS_BRBINF17_EL1		__SYS_BRBINFO(17)
+#define SYS_BRBINF18_EL1		__SYS_BRBINFO(18)
+#define SYS_BRBINF19_EL1		__SYS_BRBINFO(19)
+#define SYS_BRBINF20_EL1		__SYS_BRBINFO(20)
+#define SYS_BRBINF21_EL1		__SYS_BRBINFO(21)
+#define SYS_BRBINF22_EL1		__SYS_BRBINFO(22)
+#define SYS_BRBINF23_EL1		__SYS_BRBINFO(23)
+#define SYS_BRBINF24_EL1		__SYS_BRBINFO(24)
+#define SYS_BRBINF25_EL1		__SYS_BRBINFO(25)
+#define SYS_BRBINF26_EL1		__SYS_BRBINFO(26)
+#define SYS_BRBINF27_EL1		__SYS_BRBINFO(27)
+#define SYS_BRBINF28_EL1		__SYS_BRBINFO(28)
+#define SYS_BRBINF29_EL1		__SYS_BRBINFO(29)
+#define SYS_BRBINF30_EL1		__SYS_BRBINFO(30)
+#define SYS_BRBINF31_EL1		__SYS_BRBINFO(31)
+
+#define SYS_BRBSRC0_EL1			__SYS_BRBSRC(0)
+#define SYS_BRBSRC1_EL1			__SYS_BRBSRC(1)
+#define SYS_BRBSRC2_EL1			__SYS_BRBSRC(2)
+#define SYS_BRBSRC3_EL1			__SYS_BRBSRC(3)
+#define SYS_BRBSRC4_EL1			__SYS_BRBSRC(4)
+#define SYS_BRBSRC5_EL1			__SYS_BRBSRC(5)
+#define SYS_BRBSRC6_EL1			__SYS_BRBSRC(6)
+#define SYS_BRBSRC7_EL1			__SYS_BRBSRC(7)
+#define SYS_BRBSRC8_EL1			__SYS_BRBSRC(8)
+#define SYS_BRBSRC9_EL1			__SYS_BRBSRC(9)
+#define SYS_BRBSRC10_EL1		__SYS_BRBSRC(10)
+#define SYS_BRBSRC11_EL1		__SYS_BRBSRC(11)
+#define SYS_BRBSRC12_EL1		__SYS_BRBSRC(12)
+#define SYS_BRBSRC13_EL1		__SYS_BRBSRC(13)
+#define SYS_BRBSRC14_EL1		__SYS_BRBSRC(14)
+#define SYS_BRBSRC15_EL1		__SYS_BRBSRC(15)
+#define SYS_BRBSRC16_EL1		__SYS_BRBSRC(16)
+#define SYS_BRBSRC17_EL1		__SYS_BRBSRC(17)
+#define SYS_BRBSRC18_EL1		__SYS_BRBSRC(18)
+#define SYS_BRBSRC19_EL1		__SYS_BRBSRC(19)
+#define SYS_BRBSRC20_EL1		__SYS_BRBSRC(20)
+#define SYS_BRBSRC21_EL1		__SYS_BRBSRC(21)
+#define SYS_BRBSRC22_EL1		__SYS_BRBSRC(22)
+#define SYS_BRBSRC23_EL1		__SYS_BRBSRC(23)
+#define SYS_BRBSRC24_EL1		__SYS_BRBSRC(24)
+#define SYS_BRBSRC25_EL1		__SYS_BRBSRC(25)
+#define SYS_BRBSRC26_EL1		__SYS_BRBSRC(26)
+#define SYS_BRBSRC27_EL1		__SYS_BRBSRC(27)
+#define SYS_BRBSRC28_EL1		__SYS_BRBSRC(28)
+#define SYS_BRBSRC29_EL1		__SYS_BRBSRC(29)
+#define SYS_BRBSRC30_EL1		__SYS_BRBSRC(30)
+#define SYS_BRBSRC31_EL1		__SYS_BRBSRC(31)
+
+#define SYS_BRBTGT0_EL1			__SYS_BRBTGT(0)
+#define SYS_BRBTGT1_EL1			__SYS_BRBTGT(1)
+#define SYS_BRBTGT2_EL1			__SYS_BRBTGT(2)
+#define SYS_BRBTGT3_EL1			__SYS_BRBTGT(3)
+#define SYS_BRBTGT4_EL1			__SYS_BRBTGT(4)
+#define SYS_BRBTGT5_EL1			__SYS_BRBTGT(5)
+#define SYS_BRBTGT6_EL1			__SYS_BRBTGT(6)
+#define SYS_BRBTGT7_EL1			__SYS_BRBTGT(7)
+#define SYS_BRBTGT8_EL1			__SYS_BRBTGT(8)
+#define SYS_BRBTGT9_EL1			__SYS_BRBTGT(9)
+#define SYS_BRBTGT10_EL1		__SYS_BRBTGT(10)
+#define SYS_BRBTGT11_EL1		__SYS_BRBTGT(11)
+#define SYS_BRBTGT12_EL1		__SYS_BRBTGT(12)
+#define SYS_BRBTGT13_EL1		__SYS_BRBTGT(13)
+#define SYS_BRBTGT14_EL1		__SYS_BRBTGT(14)
+#define SYS_BRBTGT15_EL1		__SYS_BRBTGT(15)
+#define SYS_BRBTGT16_EL1		__SYS_BRBTGT(16)
+#define SYS_BRBTGT17_EL1		__SYS_BRBTGT(17)
+#define SYS_BRBTGT18_EL1		__SYS_BRBTGT(18)
+#define SYS_BRBTGT19_EL1		__SYS_BRBTGT(19)
+#define SYS_BRBTGT20_EL1		__SYS_BRBTGT(20)
+#define SYS_BRBTGT21_EL1		__SYS_BRBTGT(21)
+#define SYS_BRBTGT22_EL1		__SYS_BRBTGT(22)
+#define SYS_BRBTGT23_EL1		__SYS_BRBTGT(23)
+#define SYS_BRBTGT24_EL1		__SYS_BRBTGT(24)
+#define SYS_BRBTGT25_EL1		__SYS_BRBTGT(25)
+#define SYS_BRBTGT26_EL1		__SYS_BRBTGT(26)
+#define SYS_BRBTGT27_EL1		__SYS_BRBTGT(27)
+#define SYS_BRBTGT28_EL1		__SYS_BRBTGT(28)
+#define SYS_BRBTGT29_EL1		__SYS_BRBTGT(29)
+#define SYS_BRBTGT30_EL1		__SYS_BRBTGT(30)
+#define SYS_BRBTGT31_EL1		__SYS_BRBTGT(31)
+
 #define SYS_MIDR_EL1			sys_reg(3, 0, 0, 0, 0)
 #define SYS_MPIDR_EL1			sys_reg(3, 0, 0, 0, 5)
 #define SYS_REVIDR_EL1			sys_reg(3, 0, 0, 0, 6)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 184e58fd5631..a7f9054bd84c 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -921,6 +921,167 @@ Enum	3:0	BT
 EndEnum
 EndSysreg
 
+
+# This is just a dummy register declaration to get all common field masks and
+# shifts for accessing given BRBINF contents.
+Sysreg	BRBINF_EL1	2	1	8	0	0
+Res0	63:47
+Field	46	CCU
+Field	45:32	CC
+Res0	31:18
+Field	17	LASTFAILED
+Field	16	T
+Res0	15:14
+Enum	13:8		TYPE
+	0b000000	UNCOND_DIR
+	0b000001	INDIR
+	0b000010	DIR_LINK
+	0b000011	INDIR_LINK
+	0b000101	RET_SUB
+	0b000111	RET_EXCPT
+	0b001000	COND_DIR
+	0b100001	DEBUG_HALT
+	0b100010	CALL
+	0b100011	TRAP
+	0b100100	SERROR
+	0b100110	INST_DEBUG
+	0b100111	DATA_DEBUG
+	0b101010	ALGN_FAULT
+	0b101011	INST_FAULT
+	0b101100	DATA_FAULT
+	0b101110	IRQ
+	0b101111	FIQ
+	0b111001	DEBUG_EXIT
+EndEnum
+Enum	7:6	EL
+	0b00	EL0
+	0b01	EL1
+	0b10	EL2
+	0b11	EL3
+EndEnum
+Field	5	MPRED
+Res0	4:2
+Enum	1:0	VALID
+	0b00	NONE
+	0b01	TARGET
+	0b10	SOURCE
+	0b11	FULL
+EndEnum
+EndSysreg
+
+Sysreg	BRBCR_EL1	2	1	9	0	0
+Res0	63:24
+Field	23 	EXCEPTION
+Field	22 	ERTN
+Res0	21:9
+Field	8 	FZP
+Res0	7
+Enum	6:5	TS
+	0b01	VIRTUAL
+	0b10	GST_PHYSICAL
+	0b11	PHYSICAL
+EndEnum
+Field	4	MPRED
+Field	3	CC
+Res0	2
+Field	1	E1BRE
+Field	0	E0BRE
+EndSysreg
+
+Sysreg	BRBFCR_EL1	2	1	9	0	1
+Res0	63:30
+Enum	29:28	BANK
+	0b0	FIRST
+	0b1	SECOND
+EndEnum
+Res0	27:23
+Field	22	CONDDIR
+Field	21	DIRCALL
+Field	20	INDCALL
+Field	19	RTN
+Field	18	INDIRECT
+Field	17	DIRECT
+Field	16	EnI
+Res0	15:8
+Field	7	PAUSED
+Field	6	LASTFAILED
+Res0	5:0
+EndSysreg
+
+Sysreg	BRBTS_EL1	2	1	9	0	2
+Field	63:0	TS
+EndSysreg
+
+Sysreg	BRBINFINJ_EL1	2	1	9	1	0
+Res0	63:47
+Field	46	CCU
+Field	45:32	CC
+Res0	31:18
+Field	17	LASTFAILED
+Field	16	T
+Res0	15:14
+Enum	13:8		TYPE
+	0b000000	UNCOND_DIR
+	0b000001	INDIR
+	0b000010	DIR_LINK
+	0b000011	INDIR_LINK
+	0b000100	RET_SUB
+	0b000100	RET_SUB
+	0b000111	RET_EXCPT
+	0b001000	COND_DIR
+	0b100001	DEBUG_HALT
+	0b100010	CALL
+	0b100011	TRAP
+	0b100100	SERROR
+	0b100110	INST_DEBUG
+	0b100111	DATA_DEBUG
+	0b101010	ALGN_FAULT
+	0b101011	INST_FAULT
+	0b101100	DATA_FAULT
+	0b101110	IRQ
+	0b101111	FIQ
+	0b111001	DEBUG_EXIT
+EndEnum
+Enum	7:6	EL
+	0b00	EL0
+	0b01	EL1
+	0b10	EL2
+	0b11	EL3
+EndEnum
+Field	5	MPRED
+Res0	4:2
+Enum	1:0	VALID
+	0b00	NONE
+	0b01	TARGET
+	0b10	SOURCE
+	0b00	FULL
+EndEnum
+EndSysreg
+
+Sysreg	BRBSRCINJ_EL1	2	1	9	1	1
+Field	63:0 ADDRESS
+EndSysreg
+
+Sysreg	BRBTGTINJ_EL1	2	1	9	1	2
+Field	63:0 ADDRESS
+EndSysreg
+
+Sysreg	BRBIDR0_EL1	2	1	9	2	0
+Res0	63:16
+Enum	15:12	CC
+	0b101	20_BIT
+EndEnum
+Enum	11:8	FORMAT
+	0b0	0
+EndEnum
+Enum	7:0		NUMREC
+	0b1000		8
+	0b10000		16
+	0b100000	32
+	0b1000000	64
+EndEnum
+EndSysreg
+
 Sysreg	ID_AA64ZFR0_EL1	3	0	0	4	4
 Res0	63:60
 Enum	59:56	F64MM
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
@ 2023-01-05  3:10   ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon, Marc Zyngier,
	Mark Brown

This adds BRBE related register definitions and various other related field
macros there in. These will be used subsequently in a BRBE driver which is
being added later on.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/sysreg.h | 103 ++++++++++++++++++++
 arch/arm64/tools/sysreg         | 161 ++++++++++++++++++++++++++++++++
 2 files changed, 264 insertions(+)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 1312fb48f18b..05b70e8b7f83 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -165,6 +165,109 @@
 #define SYS_DBGDTRTX_EL0		sys_reg(2, 3, 0, 5, 0)
 #define SYS_DBGVCR32_EL2		sys_reg(2, 4, 0, 7, 0)
 
+#define __SYS_BRBINFO(n)		sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 0))
+#define __SYS_BRBSRC(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 1))
+#define __SYS_BRBTGT(n)			sys_reg(2, 1, 8, ((n) & 0xf), ((((n) & 0x10)) >> 2 + 2))
+
+#define SYS_BRBINF0_EL1			__SYS_BRBINFO(0)
+#define SYS_BRBINF1_EL1			__SYS_BRBINFO(1)
+#define SYS_BRBINF2_EL1			__SYS_BRBINFO(2)
+#define SYS_BRBINF3_EL1			__SYS_BRBINFO(3)
+#define SYS_BRBINF4_EL1			__SYS_BRBINFO(4)
+#define SYS_BRBINF5_EL1			__SYS_BRBINFO(5)
+#define SYS_BRBINF6_EL1			__SYS_BRBINFO(6)
+#define SYS_BRBINF7_EL1			__SYS_BRBINFO(7)
+#define SYS_BRBINF8_EL1			__SYS_BRBINFO(8)
+#define SYS_BRBINF9_EL1			__SYS_BRBINFO(9)
+#define SYS_BRBINF10_EL1		__SYS_BRBINFO(10)
+#define SYS_BRBINF11_EL1		__SYS_BRBINFO(11)
+#define SYS_BRBINF12_EL1		__SYS_BRBINFO(12)
+#define SYS_BRBINF13_EL1		__SYS_BRBINFO(13)
+#define SYS_BRBINF14_EL1		__SYS_BRBINFO(14)
+#define SYS_BRBINF15_EL1		__SYS_BRBINFO(15)
+#define SYS_BRBINF16_EL1		__SYS_BRBINFO(16)
+#define SYS_BRBINF17_EL1		__SYS_BRBINFO(17)
+#define SYS_BRBINF18_EL1		__SYS_BRBINFO(18)
+#define SYS_BRBINF19_EL1		__SYS_BRBINFO(19)
+#define SYS_BRBINF20_EL1		__SYS_BRBINFO(20)
+#define SYS_BRBINF21_EL1		__SYS_BRBINFO(21)
+#define SYS_BRBINF22_EL1		__SYS_BRBINFO(22)
+#define SYS_BRBINF23_EL1		__SYS_BRBINFO(23)
+#define SYS_BRBINF24_EL1		__SYS_BRBINFO(24)
+#define SYS_BRBINF25_EL1		__SYS_BRBINFO(25)
+#define SYS_BRBINF26_EL1		__SYS_BRBINFO(26)
+#define SYS_BRBINF27_EL1		__SYS_BRBINFO(27)
+#define SYS_BRBINF28_EL1		__SYS_BRBINFO(28)
+#define SYS_BRBINF29_EL1		__SYS_BRBINFO(29)
+#define SYS_BRBINF30_EL1		__SYS_BRBINFO(30)
+#define SYS_BRBINF31_EL1		__SYS_BRBINFO(31)
+
+#define SYS_BRBSRC0_EL1			__SYS_BRBSRC(0)
+#define SYS_BRBSRC1_EL1			__SYS_BRBSRC(1)
+#define SYS_BRBSRC2_EL1			__SYS_BRBSRC(2)
+#define SYS_BRBSRC3_EL1			__SYS_BRBSRC(3)
+#define SYS_BRBSRC4_EL1			__SYS_BRBSRC(4)
+#define SYS_BRBSRC5_EL1			__SYS_BRBSRC(5)
+#define SYS_BRBSRC6_EL1			__SYS_BRBSRC(6)
+#define SYS_BRBSRC7_EL1			__SYS_BRBSRC(7)
+#define SYS_BRBSRC8_EL1			__SYS_BRBSRC(8)
+#define SYS_BRBSRC9_EL1			__SYS_BRBSRC(9)
+#define SYS_BRBSRC10_EL1		__SYS_BRBSRC(10)
+#define SYS_BRBSRC11_EL1		__SYS_BRBSRC(11)
+#define SYS_BRBSRC12_EL1		__SYS_BRBSRC(12)
+#define SYS_BRBSRC13_EL1		__SYS_BRBSRC(13)
+#define SYS_BRBSRC14_EL1		__SYS_BRBSRC(14)
+#define SYS_BRBSRC15_EL1		__SYS_BRBSRC(15)
+#define SYS_BRBSRC16_EL1		__SYS_BRBSRC(16)
+#define SYS_BRBSRC17_EL1		__SYS_BRBSRC(17)
+#define SYS_BRBSRC18_EL1		__SYS_BRBSRC(18)
+#define SYS_BRBSRC19_EL1		__SYS_BRBSRC(19)
+#define SYS_BRBSRC20_EL1		__SYS_BRBSRC(20)
+#define SYS_BRBSRC21_EL1		__SYS_BRBSRC(21)
+#define SYS_BRBSRC22_EL1		__SYS_BRBSRC(22)
+#define SYS_BRBSRC23_EL1		__SYS_BRBSRC(23)
+#define SYS_BRBSRC24_EL1		__SYS_BRBSRC(24)
+#define SYS_BRBSRC25_EL1		__SYS_BRBSRC(25)
+#define SYS_BRBSRC26_EL1		__SYS_BRBSRC(26)
+#define SYS_BRBSRC27_EL1		__SYS_BRBSRC(27)
+#define SYS_BRBSRC28_EL1		__SYS_BRBSRC(28)
+#define SYS_BRBSRC29_EL1		__SYS_BRBSRC(29)
+#define SYS_BRBSRC30_EL1		__SYS_BRBSRC(30)
+#define SYS_BRBSRC31_EL1		__SYS_BRBSRC(31)
+
+#define SYS_BRBTGT0_EL1			__SYS_BRBTGT(0)
+#define SYS_BRBTGT1_EL1			__SYS_BRBTGT(1)
+#define SYS_BRBTGT2_EL1			__SYS_BRBTGT(2)
+#define SYS_BRBTGT3_EL1			__SYS_BRBTGT(3)
+#define SYS_BRBTGT4_EL1			__SYS_BRBTGT(4)
+#define SYS_BRBTGT5_EL1			__SYS_BRBTGT(5)
+#define SYS_BRBTGT6_EL1			__SYS_BRBTGT(6)
+#define SYS_BRBTGT7_EL1			__SYS_BRBTGT(7)
+#define SYS_BRBTGT8_EL1			__SYS_BRBTGT(8)
+#define SYS_BRBTGT9_EL1			__SYS_BRBTGT(9)
+#define SYS_BRBTGT10_EL1		__SYS_BRBTGT(10)
+#define SYS_BRBTGT11_EL1		__SYS_BRBTGT(11)
+#define SYS_BRBTGT12_EL1		__SYS_BRBTGT(12)
+#define SYS_BRBTGT13_EL1		__SYS_BRBTGT(13)
+#define SYS_BRBTGT14_EL1		__SYS_BRBTGT(14)
+#define SYS_BRBTGT15_EL1		__SYS_BRBTGT(15)
+#define SYS_BRBTGT16_EL1		__SYS_BRBTGT(16)
+#define SYS_BRBTGT17_EL1		__SYS_BRBTGT(17)
+#define SYS_BRBTGT18_EL1		__SYS_BRBTGT(18)
+#define SYS_BRBTGT19_EL1		__SYS_BRBTGT(19)
+#define SYS_BRBTGT20_EL1		__SYS_BRBTGT(20)
+#define SYS_BRBTGT21_EL1		__SYS_BRBTGT(21)
+#define SYS_BRBTGT22_EL1		__SYS_BRBTGT(22)
+#define SYS_BRBTGT23_EL1		__SYS_BRBTGT(23)
+#define SYS_BRBTGT24_EL1		__SYS_BRBTGT(24)
+#define SYS_BRBTGT25_EL1		__SYS_BRBTGT(25)
+#define SYS_BRBTGT26_EL1		__SYS_BRBTGT(26)
+#define SYS_BRBTGT27_EL1		__SYS_BRBTGT(27)
+#define SYS_BRBTGT28_EL1		__SYS_BRBTGT(28)
+#define SYS_BRBTGT29_EL1		__SYS_BRBTGT(29)
+#define SYS_BRBTGT30_EL1		__SYS_BRBTGT(30)
+#define SYS_BRBTGT31_EL1		__SYS_BRBTGT(31)
+
 #define SYS_MIDR_EL1			sys_reg(3, 0, 0, 0, 0)
 #define SYS_MPIDR_EL1			sys_reg(3, 0, 0, 0, 5)
 #define SYS_REVIDR_EL1			sys_reg(3, 0, 0, 0, 6)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 184e58fd5631..a7f9054bd84c 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -921,6 +921,167 @@ Enum	3:0	BT
 EndEnum
 EndSysreg
 
+
+# This is just a dummy register declaration to get all common field masks and
+# shifts for accessing given BRBINF contents.
+Sysreg	BRBINF_EL1	2	1	8	0	0
+Res0	63:47
+Field	46	CCU
+Field	45:32	CC
+Res0	31:18
+Field	17	LASTFAILED
+Field	16	T
+Res0	15:14
+Enum	13:8		TYPE
+	0b000000	UNCOND_DIR
+	0b000001	INDIR
+	0b000010	DIR_LINK
+	0b000011	INDIR_LINK
+	0b000101	RET_SUB
+	0b000111	RET_EXCPT
+	0b001000	COND_DIR
+	0b100001	DEBUG_HALT
+	0b100010	CALL
+	0b100011	TRAP
+	0b100100	SERROR
+	0b100110	INST_DEBUG
+	0b100111	DATA_DEBUG
+	0b101010	ALGN_FAULT
+	0b101011	INST_FAULT
+	0b101100	DATA_FAULT
+	0b101110	IRQ
+	0b101111	FIQ
+	0b111001	DEBUG_EXIT
+EndEnum
+Enum	7:6	EL
+	0b00	EL0
+	0b01	EL1
+	0b10	EL2
+	0b11	EL3
+EndEnum
+Field	5	MPRED
+Res0	4:2
+Enum	1:0	VALID
+	0b00	NONE
+	0b01	TARGET
+	0b10	SOURCE
+	0b11	FULL
+EndEnum
+EndSysreg
+
+Sysreg	BRBCR_EL1	2	1	9	0	0
+Res0	63:24
+Field	23 	EXCEPTION
+Field	22 	ERTN
+Res0	21:9
+Field	8 	FZP
+Res0	7
+Enum	6:5	TS
+	0b01	VIRTUAL
+	0b10	GST_PHYSICAL
+	0b11	PHYSICAL
+EndEnum
+Field	4	MPRED
+Field	3	CC
+Res0	2
+Field	1	E1BRE
+Field	0	E0BRE
+EndSysreg
+
+Sysreg	BRBFCR_EL1	2	1	9	0	1
+Res0	63:30
+Enum	29:28	BANK
+	0b0	FIRST
+	0b1	SECOND
+EndEnum
+Res0	27:23
+Field	22	CONDDIR
+Field	21	DIRCALL
+Field	20	INDCALL
+Field	19	RTN
+Field	18	INDIRECT
+Field	17	DIRECT
+Field	16	EnI
+Res0	15:8
+Field	7	PAUSED
+Field	6	LASTFAILED
+Res0	5:0
+EndSysreg
+
+Sysreg	BRBTS_EL1	2	1	9	0	2
+Field	63:0	TS
+EndSysreg
+
+Sysreg	BRBINFINJ_EL1	2	1	9	1	0
+Res0	63:47
+Field	46	CCU
+Field	45:32	CC
+Res0	31:18
+Field	17	LASTFAILED
+Field	16	T
+Res0	15:14
+Enum	13:8		TYPE
+	0b000000	UNCOND_DIR
+	0b000001	INDIR
+	0b000010	DIR_LINK
+	0b000011	INDIR_LINK
+	0b000100	RET_SUB
+	0b000100	RET_SUB
+	0b000111	RET_EXCPT
+	0b001000	COND_DIR
+	0b100001	DEBUG_HALT
+	0b100010	CALL
+	0b100011	TRAP
+	0b100100	SERROR
+	0b100110	INST_DEBUG
+	0b100111	DATA_DEBUG
+	0b101010	ALGN_FAULT
+	0b101011	INST_FAULT
+	0b101100	DATA_FAULT
+	0b101110	IRQ
+	0b101111	FIQ
+	0b111001	DEBUG_EXIT
+EndEnum
+Enum	7:6	EL
+	0b00	EL0
+	0b01	EL1
+	0b10	EL2
+	0b11	EL3
+EndEnum
+Field	5	MPRED
+Res0	4:2
+Enum	1:0	VALID
+	0b00	NONE
+	0b01	TARGET
+	0b10	SOURCE
+	0b00	FULL
+EndEnum
+EndSysreg
+
+Sysreg	BRBSRCINJ_EL1	2	1	9	1	1
+Field	63:0 ADDRESS
+EndSysreg
+
+Sysreg	BRBTGTINJ_EL1	2	1	9	1	2
+Field	63:0 ADDRESS
+EndSysreg
+
+Sysreg	BRBIDR0_EL1	2	1	9	2	0
+Res0	63:16
+Enum	15:12	CC
+	0b101	20_BIT
+EndEnum
+Enum	11:8	FORMAT
+	0b0	0
+EndEnum
+Enum	7:0		NUMREC
+	0b1000		8
+	0b10000		16
+	0b100000	32
+	0b1000000	64
+EndEnum
+EndSysreg
+
 Sysreg	ID_AA64ZFR0_EL1	3	0	0	4	4
 Res0	63:60
 Enum	59:56	F64MM
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH V7 3/6] arm64/perf: Add branch stack support in struct arm_pmu
  2023-01-05  3:10 ` Anshuman Khandual
@ 2023-01-05  3:10   ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon

This updates 'struct arm_pmu' for branch stack sampling support later. This
adds a new 'features' element in the structure to track supported features,
and another 'private' element to encapsulate implementation attributes on a
given 'struct arm_pmu'. These updates here will help in tracking any branch
stack sampling support, which is being added later. This also adds a helper
arm_pmu_branch_stack_supported().

This also enables perf branch stack sampling event on all 'struct arm pmu',
supporting the feature but after removing the current gate that blocks such
events unconditionally in armpmu_event_init(). Instead a quick probe can be
initiated via arm_pmu_branch_stack_supported() to ascertain the support.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 drivers/perf/arm_pmu.c       | 3 +--
 include/linux/perf/arm_pmu.h | 9 +++++++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 14a3ed3bdb0b..a85b2d67022e 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -510,8 +510,7 @@ static int armpmu_event_init(struct perf_event *event)
 		!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
 		return -ENOENT;
 
-	/* does not support taken branch sampling */
-	if (has_branch_stack(event))
+	if (has_branch_stack(event) && !arm_pmu_branch_stack_supported(armpmu))
 		return -EOPNOTSUPP;
 
 	return __hw_perf_event_init(event);
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 2a9d07cee927..64e1b2594025 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -80,11 +80,14 @@ enum armpmu_attr_groups {
 	ARMPMU_NR_ATTR_GROUPS
 };
 
+#define ARM_PMU_BRANCH_STACK	BIT(0)
+
 struct arm_pmu {
 	struct pmu	pmu;
 	cpumask_t	supported_cpus;
 	char		*name;
 	int		pmuver;
+	int		features;
 	irqreturn_t	(*handle_irq)(struct arm_pmu *pmu);
 	void		(*enable)(struct perf_event *event);
 	void		(*disable)(struct perf_event *event);
@@ -119,8 +122,14 @@ struct arm_pmu {
 
 	/* Only to be used by ACPI probing code */
 	unsigned long acpi_cpuid;
+	void		*private;
 };
 
+static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
+{
+	return armpmu->features & ARM_PMU_BRANCH_STACK;
+}
+
 #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
 
 u64 armpmu_event_update(struct perf_event *event);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH V7 3/6] arm64/perf: Add branch stack support in struct arm_pmu
@ 2023-01-05  3:10   ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon

This updates 'struct arm_pmu' for branch stack sampling support later. This
adds a new 'features' element in the structure to track supported features,
and another 'private' element to encapsulate implementation attributes on a
given 'struct arm_pmu'. These updates here will help in tracking any branch
stack sampling support, which is being added later. This also adds a helper
arm_pmu_branch_stack_supported().

This also enables perf branch stack sampling event on all 'struct arm pmu',
supporting the feature but after removing the current gate that blocks such
events unconditionally in armpmu_event_init(). Instead a quick probe can be
initiated via arm_pmu_branch_stack_supported() to ascertain the support.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 drivers/perf/arm_pmu.c       | 3 +--
 include/linux/perf/arm_pmu.h | 9 +++++++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 14a3ed3bdb0b..a85b2d67022e 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -510,8 +510,7 @@ static int armpmu_event_init(struct perf_event *event)
 		!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
 		return -ENOENT;
 
-	/* does not support taken branch sampling */
-	if (has_branch_stack(event))
+	if (has_branch_stack(event) && !arm_pmu_branch_stack_supported(armpmu))
 		return -EOPNOTSUPP;
 
 	return __hw_perf_event_init(event);
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 2a9d07cee927..64e1b2594025 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -80,11 +80,14 @@ enum armpmu_attr_groups {
 	ARMPMU_NR_ATTR_GROUPS
 };
 
+#define ARM_PMU_BRANCH_STACK	BIT(0)
+
 struct arm_pmu {
 	struct pmu	pmu;
 	cpumask_t	supported_cpus;
 	char		*name;
 	int		pmuver;
+	int		features;
 	irqreturn_t	(*handle_irq)(struct arm_pmu *pmu);
 	void		(*enable)(struct perf_event *event);
 	void		(*disable)(struct perf_event *event);
@@ -119,8 +122,14 @@ struct arm_pmu {
 
 	/* Only to be used by ACPI probing code */
 	unsigned long acpi_cpuid;
+	void		*private;
 };
 
+static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
+{
+	return armpmu->features & ARM_PMU_BRANCH_STACK;
+}
+
 #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
 
 u64 armpmu_event_update(struct perf_event *event);
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH V7 4/6] arm64/perf: Add branch stack support in struct pmu_hw_events
  2023-01-05  3:10 ` Anshuman Khandual
@ 2023-01-05  3:10   ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon

This adds branch records buffer pointer in 'struct pmu_hw_events' which can
be used to capture branch records during PMU interrupt. This percpu pointer
here needs to be allocated first before usage.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/perf/arm_pmu.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 64e1b2594025..9184f9b33740 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -44,6 +44,13 @@ static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_47BIT) == ARMPMU_EVT_47BIT);
 	},								\
 }
 
+#define MAX_BRANCH_RECORDS 64
+
+struct branch_records {
+	struct perf_branch_stack	branch_stack;
+	struct perf_branch_entry	branch_entries[MAX_BRANCH_RECORDS];
+};
+
 /* The events for a given PMU register set. */
 struct pmu_hw_events {
 	/*
@@ -70,6 +77,8 @@ struct pmu_hw_events {
 	struct arm_pmu		*percpu_pmu;
 
 	int irq;
+
+	struct branch_records	*branches;
 };
 
 enum armpmu_attr_groups {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH V7 4/6] arm64/perf: Add branch stack support in struct pmu_hw_events
@ 2023-01-05  3:10   ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon

This adds branch records buffer pointer in 'struct pmu_hw_events' which can
be used to capture branch records during PMU interrupt. This percpu pointer
here needs to be allocated first before usage.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/perf/arm_pmu.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 64e1b2594025..9184f9b33740 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -44,6 +44,13 @@ static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_47BIT) == ARMPMU_EVT_47BIT);
 	},								\
 }
 
+#define MAX_BRANCH_RECORDS 64
+
+struct branch_records {
+	struct perf_branch_stack	branch_stack;
+	struct perf_branch_entry	branch_entries[MAX_BRANCH_RECORDS];
+};
+
 /* The events for a given PMU register set. */
 struct pmu_hw_events {
 	/*
@@ -70,6 +77,8 @@ struct pmu_hw_events {
 	struct arm_pmu		*percpu_pmu;
 
 	int irq;
+
+	struct branch_records	*branches;
 };
 
 enum armpmu_attr_groups {
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
  2023-01-05  3:10 ` Anshuman Khandual
@ 2023-01-05  3:10   ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon

This enables support for branch stack sampling event in ARMV8 PMU, checking
has_branch_stack() on the event inside 'struct arm_pmu' callbacks. Although
these branch stack helpers armv8pmu_branch_XXXXX() are just dummy functions
for now. While here, this also defines arm_pmu's sched_task() callback with
armv8pmu_sched_task(), which resets the branch record buffer on a sched_in.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/perf_event.h | 10 +++++++++
 arch/arm64/kernel/perf_event.c      | 35 +++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 3eaf462f5752..a038902d6874 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -273,4 +273,14 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 	(regs)->pstate = PSR_MODE_EL1h;	\
 }
 
+struct pmu_hw_events;
+struct arm_pmu;
+struct perf_event;
+
+static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
+static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
+static inline void armv8pmu_branch_enable(struct perf_event *event) { }
+static inline void armv8pmu_branch_disable(struct perf_event *event) { }
+static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
+static inline void armv8pmu_branch_reset(void) { }
 #endif
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index a5193f2146a6..8805b4516088 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -789,10 +789,22 @@ static void armv8pmu_enable_event(struct perf_event *event)
 	 * Enable counter
 	 */
 	armv8pmu_enable_event_counter(event);
+
+	/*
+	 * Enable BRBE
+	 */
+	if (has_branch_stack(event))
+		armv8pmu_branch_enable(event);
 }
 
 static void armv8pmu_disable_event(struct perf_event *event)
 {
+	/*
+	 * Disable BRBE
+	 */
+	if (has_branch_stack(event))
+		armv8pmu_branch_disable(event);
+
 	/*
 	 * Disable counter
 	 */
@@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
+		if (has_branch_stack(event)) {
+			WARN_ON(!cpuc->branches);
+			armv8pmu_branch_read(cpuc, event);
+			data.br_stack = &cpuc->branches->branch_stack;
+			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+		}
+
 		/*
 		 * Perf event overflow will queue the processing of the event as
 		 * an irq_work which will be taken care of in the handling of
@@ -976,6 +995,14 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
 	return event->hw.idx;
 }
 
+static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
+
+	if (sched_in && arm_pmu_branch_stack_supported(armpmu))
+		armv8pmu_branch_reset();
+}
+
 /*
  * Add an event filter to a given event.
  */
@@ -1052,6 +1079,9 @@ static void armv8pmu_reset(void *info)
 		pmcr |= ARMV8_PMU_PMCR_LP;
 
 	armv8pmu_pmcr_write(pmcr);
+
+	if (arm_pmu_branch_stack_supported(cpu_pmu))
+		armv8pmu_branch_reset();
 }
 
 static int __armv8_pmuv3_map_event(struct perf_event *event,
@@ -1069,6 +1099,9 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
 				       &armv8_pmuv3_perf_cache_map,
 				       ARMV8_PMU_EVTYPE_EVENT);
 
+	if (has_branch_stack(event) && !armv8pmu_branch_valid(event))
+		return -EOPNOTSUPP;
+
 	if (armv8pmu_event_is_64bit(event))
 		event->hw.flags |= ARMPMU_EVT_64BIT;
 
@@ -1181,6 +1214,7 @@ static void __armv8pmu_probe_pmu(void *info)
 		cpu_pmu->reg_pmmir = read_cpuid(PMMIR_EL1);
 	else
 		cpu_pmu->reg_pmmir = 0;
+	armv8pmu_branch_probe(cpu_pmu);
 }
 
 static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
@@ -1261,6 +1295,7 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
 	cpu_pmu->filter			= armv8pmu_filter;
 
 	cpu_pmu->pmu.event_idx		= armv8pmu_user_event_idx;
+	cpu_pmu->sched_task		= armv8pmu_sched_task;
 
 	cpu_pmu->name			= name;
 	cpu_pmu->map_event		= map_event;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
@ 2023-01-05  3:10   ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon

This enables support for branch stack sampling event in ARMV8 PMU, checking
has_branch_stack() on the event inside 'struct arm_pmu' callbacks. Although
these branch stack helpers armv8pmu_branch_XXXXX() are just dummy functions
for now. While here, this also defines arm_pmu's sched_task() callback with
armv8pmu_sched_task(), which resets the branch record buffer on a sched_in.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/perf_event.h | 10 +++++++++
 arch/arm64/kernel/perf_event.c      | 35 +++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index 3eaf462f5752..a038902d6874 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -273,4 +273,14 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 	(regs)->pstate = PSR_MODE_EL1h;	\
 }
 
+struct pmu_hw_events;
+struct arm_pmu;
+struct perf_event;
+
+static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
+static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
+static inline void armv8pmu_branch_enable(struct perf_event *event) { }
+static inline void armv8pmu_branch_disable(struct perf_event *event) { }
+static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
+static inline void armv8pmu_branch_reset(void) { }
 #endif
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index a5193f2146a6..8805b4516088 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -789,10 +789,22 @@ static void armv8pmu_enable_event(struct perf_event *event)
 	 * Enable counter
 	 */
 	armv8pmu_enable_event_counter(event);
+
+	/*
+	 * Enable BRBE
+	 */
+	if (has_branch_stack(event))
+		armv8pmu_branch_enable(event);
 }
 
 static void armv8pmu_disable_event(struct perf_event *event)
 {
+	/*
+	 * Disable BRBE
+	 */
+	if (has_branch_stack(event))
+		armv8pmu_branch_disable(event);
+
 	/*
 	 * Disable counter
 	 */
@@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
 		if (!armpmu_event_set_period(event))
 			continue;
 
+		if (has_branch_stack(event)) {
+			WARN_ON(!cpuc->branches);
+			armv8pmu_branch_read(cpuc, event);
+			data.br_stack = &cpuc->branches->branch_stack;
+			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
+		}
+
 		/*
 		 * Perf event overflow will queue the processing of the event as
 		 * an irq_work which will be taken care of in the handling of
@@ -976,6 +995,14 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
 	return event->hw.idx;
 }
 
+static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
+
+	if (sched_in && arm_pmu_branch_stack_supported(armpmu))
+		armv8pmu_branch_reset();
+}
+
 /*
  * Add an event filter to a given event.
  */
@@ -1052,6 +1079,9 @@ static void armv8pmu_reset(void *info)
 		pmcr |= ARMV8_PMU_PMCR_LP;
 
 	armv8pmu_pmcr_write(pmcr);
+
+	if (arm_pmu_branch_stack_supported(cpu_pmu))
+		armv8pmu_branch_reset();
 }
 
 static int __armv8_pmuv3_map_event(struct perf_event *event,
@@ -1069,6 +1099,9 @@ static int __armv8_pmuv3_map_event(struct perf_event *event,
 				       &armv8_pmuv3_perf_cache_map,
 				       ARMV8_PMU_EVTYPE_EVENT);
 
+	if (has_branch_stack(event) && !armv8pmu_branch_valid(event))
+		return -EOPNOTSUPP;
+
 	if (armv8pmu_event_is_64bit(event))
 		event->hw.flags |= ARMPMU_EVT_64BIT;
 
@@ -1181,6 +1214,7 @@ static void __armv8pmu_probe_pmu(void *info)
 		cpu_pmu->reg_pmmir = read_cpuid(PMMIR_EL1);
 	else
 		cpu_pmu->reg_pmmir = 0;
+	armv8pmu_branch_probe(cpu_pmu);
 }
 
 static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
@@ -1261,6 +1295,7 @@ static int armv8_pmu_init(struct arm_pmu *cpu_pmu, char *name,
 	cpu_pmu->filter			= armv8pmu_filter;
 
 	cpu_pmu->pmu.event_idx		= armv8pmu_user_event_idx;
+	cpu_pmu->sched_task		= armv8pmu_sched_task;
 
 	cpu_pmu->name			= name;
 	cpu_pmu->map_event		= map_event;
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
  2023-01-05  3:10 ` Anshuman Khandual
@ 2023-01-05  3:10   ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon

This enables branch stack sampling events in ARMV8 PMU, via an architecture
feature FEAT_BRBE aka branch record buffer extension. This defines required
branch helper functions pmuv8pmu_branch_XXXXX() and the implementation here
is wrapped with a new config option CONFIG_ARM64_BRBE.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/Kconfig                  |  11 +
 arch/arm64/include/asm/perf_event.h |   9 +
 arch/arm64/kernel/Makefile          |   1 +
 arch/arm64/kernel/brbe.c            | 512 ++++++++++++++++++++++++++++
 arch/arm64/kernel/brbe.h            | 257 ++++++++++++++
 5 files changed, 790 insertions(+)
 create mode 100644 arch/arm64/kernel/brbe.c
 create mode 100644 arch/arm64/kernel/brbe.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 03934808b2ed..915b12709a46 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1363,6 +1363,17 @@ config HW_PERF_EVENTS
 	def_bool y
 	depends on ARM_PMU
 
+config ARM64_BRBE
+	bool "Enable support for Branch Record Buffer Extension (BRBE)"
+	depends on PERF_EVENTS && ARM64 && ARM_PMU
+	default y
+	help
+	  Enable perf support for Branch Record Buffer Extension (BRBE) which
+	  records all branches taken in an execution path. This supports some
+	  branch types and privilege based filtering. It captured additional
+	  relevant information such as cycle count, misprediction and branch
+	  type, branch privilege level etc.
+
 # Supported by clang >= 7.0 or GCC >= 12.0.0
 config CC_HAVE_SHADOW_CALL_STACK
 	def_bool $(cc-option, -fsanitize=shadow-call-stack -ffixed-x18)
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index a038902d6874..cf2e88c7b707 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -277,6 +277,14 @@ struct pmu_hw_events;
 struct arm_pmu;
 struct perf_event;
 
+#ifdef CONFIG_ARM64_BRBE
+void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event);
+bool armv8pmu_branch_valid(struct perf_event *event);
+void armv8pmu_branch_enable(struct perf_event *event);
+void armv8pmu_branch_disable(struct perf_event *event);
+void armv8pmu_branch_probe(struct arm_pmu *arm_pmu);
+void armv8pmu_branch_reset(void);
+#else
 static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
 static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
 static inline void armv8pmu_branch_enable(struct perf_event *event) { }
@@ -284,3 +292,4 @@ static inline void armv8pmu_branch_disable(struct perf_event *event) { }
 static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
 static inline void armv8pmu_branch_reset(void) { }
 #endif
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index ceba6792f5b3..6ee7ccb61621 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)			+= module.o
 obj-$(CONFIG_ARM64_MODULE_PLTS)		+= module-plts.o
 obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o perf_callchain.o
 obj-$(CONFIG_HW_PERF_EVENTS)		+= perf_event.o
+obj-$(CONFIG_ARM64_BRBE)		+= brbe.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
 obj-$(CONFIG_CPU_PM)			+= sleep.o suspend.o
 obj-$(CONFIG_CPU_IDLE)			+= cpuidle.o
diff --git a/arch/arm64/kernel/brbe.c b/arch/arm64/kernel/brbe.c
new file mode 100644
index 000000000000..cd03d3531e04
--- /dev/null
+++ b/arch/arm64/kernel/brbe.c
@@ -0,0 +1,512 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Branch Record Buffer Extension Driver.
+ *
+ * Copyright (C) 2022 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+#include "brbe.h"
+
+static bool valid_brbe_nr(int brbe_nr)
+{
+	return brbe_nr == BRBIDR0_EL1_NUMREC_8 ||
+	       brbe_nr == BRBIDR0_EL1_NUMREC_16 ||
+	       brbe_nr == BRBIDR0_EL1_NUMREC_32 ||
+	       brbe_nr == BRBIDR0_EL1_NUMREC_64;
+}
+
+static bool valid_brbe_cc(int brbe_cc)
+{
+	return brbe_cc == BRBIDR0_EL1_CC_20_BIT;
+}
+
+static bool valid_brbe_format(int brbe_format)
+{
+	return brbe_format == BRBIDR0_EL1_FORMAT_0;
+}
+
+static bool valid_brbe_version(int brbe_version)
+{
+	return brbe_version == ID_AA64DFR0_EL1_BRBE_IMP ||
+	       brbe_version == ID_AA64DFR0_EL1_BRBE_BRBE_V1P1;
+}
+
+static void select_brbe_bank(int bank)
+{
+	static int brbe_current_bank = BRBE_BANK_IDX_INVALID;
+	u64 brbfcr;
+
+	if (brbe_current_bank == bank)
+		return;
+
+	WARN_ON(bank > BRBE_BANK_IDX_1);
+	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+	brbfcr &= ~BRBFCR_EL1_BANK_MASK;
+	brbfcr |= ((bank << BRBFCR_EL1_BANK_SHIFT) & BRBFCR_EL1_BANK_MASK);
+	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+	isb();
+	brbe_current_bank = bank;
+}
+
+static void select_brbe_bank_index(int buffer_idx)
+{
+	switch (buffer_idx) {
+	case BRBE_BANK0_IDX_MIN ... BRBE_BANK0_IDX_MAX:
+		select_brbe_bank(BRBE_BANK_IDX_0);
+		break;
+	case BRBE_BANK1_IDX_MIN ... BRBE_BANK1_IDX_MAX:
+		select_brbe_bank(BRBE_BANK_IDX_1);
+		break;
+	default:
+		pr_warn("unsupported BRBE index\n");
+	}
+}
+
+static const char branch_filter_error_msg[] = "branch filter not supported";
+
+bool armv8pmu_branch_valid(struct perf_event *event)
+{
+	u64 branch_type = event->attr.branch_sample_type;
+
+	/*
+	 * If the event does not have at least one of the privilege
+	 * branch filters as in PERF_SAMPLE_BRANCH_PLM_ALL, the core
+	 * perf will adjust its value based on perf event's existing
+	 * privilege level via attr.exclude_[user|kernel|hv].
+	 *
+	 * As event->attr.branch_sample_type might have been changed
+	 * when the event reaches here, it is not possible to figure
+	 * out whether the event originally had HV privilege request
+	 * or got added via the core perf. Just report this situation
+	 * once and continue ignoring if there are other instances.
+	 */
+	if ((branch_type & PERF_SAMPLE_BRANCH_HV) && !is_kernel_in_hyp_mode())
+		pr_warn_once("%s - hypervisor privilege\n", branch_filter_error_msg);
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
+		pr_warn_once("%s - aborted transaction\n", branch_filter_error_msg);
+		return false;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_NO_TX) {
+		pr_warn_once("%s - no transaction\n", branch_filter_error_msg);
+		return false;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_IN_TX) {
+		pr_warn_once("%s - in transaction\n", branch_filter_error_msg);
+		return false;
+	}
+	return true;
+}
+
+static void branch_records_alloc(struct arm_pmu *armpmu)
+{
+	struct pmu_hw_events *events;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		events = per_cpu_ptr(armpmu->hw_events, cpu);
+
+		events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
+		WARN_ON(!events->branches);
+	}
+}
+
+static int brbe_attributes_probe(struct arm_pmu *armpmu, u32 brbe)
+{
+	struct brbe_hw_attr *brbe_attr = kzalloc(sizeof(struct brbe_hw_attr), GFP_KERNEL);
+	u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
+
+	WARN_ON(!brbe_attr);
+	armpmu->private = brbe_attr;
+
+	brbe_attr->brbe_version = brbe;
+	brbe_attr->brbe_format = brbe_fetch_format(brbidr);
+	brbe_attr->brbe_cc = brbe_fetch_cc_bits(brbidr);
+	brbe_attr->brbe_nr = brbe_fetch_numrec(brbidr);
+
+	if (!valid_brbe_version(brbe_attr->brbe_version) ||
+	   !valid_brbe_format(brbe_attr->brbe_format) ||
+	   !valid_brbe_cc(brbe_attr->brbe_cc) ||
+	   !valid_brbe_nr(brbe_attr->brbe_nr))
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
+void armv8pmu_branch_probe(struct arm_pmu *armpmu)
+{
+	u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
+	u32 brbe;
+
+	brbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT);
+	if (!brbe)
+		return;
+
+	if (brbe_attributes_probe(armpmu, brbe))
+		return;
+
+	branch_records_alloc(armpmu);
+	armpmu->features |= ARM_PMU_BRANCH_STACK;
+}
+
+static u64 branch_type_to_brbfcr(int branch_type)
+{
+	u64 brbfcr = 0;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+		brbfcr |= BRBFCR_EL1_BRANCH_FILTERS;
+		return brbfcr;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
+		brbfcr |= BRBFCR_EL1_INDCALL;
+		brbfcr |= BRBFCR_EL1_DIRCALL;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+		brbfcr |= BRBFCR_EL1_RTN;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
+		brbfcr |= BRBFCR_EL1_INDCALL;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_COND)
+		brbfcr |= BRBFCR_EL1_CONDDIR;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP)
+		brbfcr |= BRBFCR_EL1_INDIRECT;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_CALL)
+		brbfcr |= BRBFCR_EL1_DIRCALL;
+
+	return brbfcr;
+}
+
+static u64 branch_type_to_brbcr(int branch_type)
+{
+	u64 brbcr = (BRBCR_EL1_FZP | BRBCR_EL1_DEFAULT_TS);
+
+	if (branch_type & PERF_SAMPLE_BRANCH_USER)
+		brbcr |= BRBCR_EL1_E0BRE;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_KERNEL)
+		brbcr |= BRBCR_EL1_E1BRE;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_HV) {
+		if (is_kernel_in_hyp_mode())
+			brbcr |= BRBCR_EL1_E1BRE;
+	}
+
+	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_CYCLES))
+		brbcr |= BRBCR_EL1_CC;
+
+	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_FLAGS))
+		brbcr |= BRBCR_EL1_MPRED;
+
+	/*
+	 * The exception and exception return branches could be
+	 * captured, irrespective of the perf event's privilege.
+	 * If the perf event does not have enough privilege for
+	 * a given exception level, then addresses which falls
+	 * under that exception level will be reported as zero
+	 * for the captured branch record, creating source only
+	 * or target only records.
+	 */
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+		brbcr |= BRBCR_EL1_EXCEPTION;
+		brbcr |= BRBCR_EL1_ERTN;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
+		brbcr |= BRBCR_EL1_EXCEPTION;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+		brbcr |= BRBCR_EL1_ERTN;
+
+	return brbcr & BRBCR_EL1_DEFAULT_CONFIG;
+}
+
+void armv8pmu_branch_enable(struct perf_event *event)
+{
+	u64 branch_type = event->attr.branch_sample_type;
+	u64 brbfcr, brbcr;
+
+	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+	brbfcr &= ~BRBFCR_EL1_DEFAULT_CONFIG;
+	brbfcr |= branch_type_to_brbfcr(branch_type);
+	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+	isb();
+
+	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
+	brbcr &= ~BRBCR_EL1_DEFAULT_CONFIG;
+	brbcr |= branch_type_to_brbcr(branch_type);
+	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+	isb();
+	armv8pmu_branch_reset();
+}
+
+void armv8pmu_branch_disable(struct perf_event *event)
+{
+	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+	u64 brbcr = read_sysreg_s(SYS_BRBCR_EL1);
+
+	brbcr &= ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE);
+	brbfcr |= BRBFCR_EL1_PAUSED;
+	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+	isb();
+}
+
+static int brbe_fetch_perf_type(u64 brbinf, bool *new_branch_type)
+{
+	int brbe_type = brbe_fetch_type(brbinf);
+	*new_branch_type = false;
+
+	switch (brbe_type) {
+	case BRBINF_EL1_TYPE_UNCOND_DIR:
+		return PERF_BR_UNCOND;
+	case BRBINF_EL1_TYPE_INDIR:
+		return PERF_BR_IND;
+	case BRBINF_EL1_TYPE_DIR_LINK:
+		return PERF_BR_CALL;
+	case BRBINF_EL1_TYPE_INDIR_LINK:
+		return PERF_BR_IND_CALL;
+	case BRBINF_EL1_TYPE_RET_SUB:
+		return PERF_BR_RET;
+	case BRBINF_EL1_TYPE_COND_DIR:
+		return PERF_BR_COND;
+	case BRBINF_EL1_TYPE_CALL:
+		return PERF_BR_CALL;
+	case BRBINF_EL1_TYPE_TRAP:
+		return PERF_BR_SYSCALL;
+	case BRBINF_EL1_TYPE_RET_EXCPT:
+		return PERF_BR_ERET;
+	case BRBINF_EL1_TYPE_IRQ:
+		return PERF_BR_IRQ;
+	case BRBINF_EL1_TYPE_DEBUG_HALT:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_HALT;
+	case BRBINF_EL1_TYPE_SERROR:
+		return PERF_BR_SERROR;
+	case BRBINF_EL1_TYPE_INST_DEBUG:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_INST;
+	case BRBINF_EL1_TYPE_DATA_DEBUG:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_DATA;
+	case BRBINF_EL1_TYPE_ALGN_FAULT:
+		*new_branch_type = true;
+		return PERF_BR_NEW_FAULT_ALGN;
+	case BRBINF_EL1_TYPE_INST_FAULT:
+		*new_branch_type = true;
+		return PERF_BR_NEW_FAULT_INST;
+	case BRBINF_EL1_TYPE_DATA_FAULT:
+		*new_branch_type = true;
+		return PERF_BR_NEW_FAULT_DATA;
+	case BRBINF_EL1_TYPE_FIQ:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_FIQ;
+	case BRBINF_EL1_TYPE_DEBUG_EXIT:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_EXIT;
+	default:
+		pr_warn("unknown branch type captured\n");
+		return PERF_BR_UNKNOWN;
+	}
+}
+
+static int brbe_fetch_perf_priv(u64 brbinf)
+{
+	int brbe_el = brbe_fetch_el(brbinf);
+
+	switch (brbe_el) {
+	case BRBINF_EL1_EL_EL0:
+		return PERF_BR_PRIV_USER;
+	case BRBINF_EL1_EL_EL1:
+		return PERF_BR_PRIV_KERNEL;
+	case BRBINF_EL1_EL_EL2:
+		if (is_kernel_in_hyp_mode())
+			return PERF_BR_PRIV_KERNEL;
+		return PERF_BR_PRIV_HV;
+	default:
+		pr_warn("unknown branch privilege captured\n");
+		return PERF_BR_PRIV_UNKNOWN;
+	}
+}
+
+static void capture_brbe_flags(struct pmu_hw_events *cpuc, struct perf_event *event,
+			       u64 brbinf, u64 brbcr, int idx)
+{
+	struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
+	bool new_branch_type;
+	int branch_type;
+
+	if (branch_sample_type(event)) {
+		branch_type = brbe_fetch_perf_type(brbinf, &new_branch_type);
+		if (new_branch_type) {
+			entry->type = PERF_BR_EXTEND_ABI;
+			entry->new_type = branch_type;
+		} else {
+			entry->type = branch_type;
+		}
+	}
+
+	if (!branch_sample_no_cycles(event)) {
+		WARN_ON_ONCE(!(brbcr & BRBCR_EL1_CC));
+		entry->cycles = brbe_fetch_cycles(brbinf);
+	}
+
+	if (!branch_sample_no_flags(event)) {
+		/*
+		 * BRBINF_LASTFAILED does not indicate whether last transaction
+		 * got failed or aborted during the current branch record itself.
+		 * Rather, this indicates that all the branch records which were
+		 * in transaction until the curret branch record have failed. So
+		 * the entire BRBE buffer needs to be processed later on to find
+		 * all branch records which might have failed.
+		 */
+		entry->abort = brbe_fetch_lastfailed(brbinf);
+
+		/*
+		 * All these information (i.e transaction state and mispredicts)
+		 * are not available for target only branch records.
+		 */
+		if (!brbe_target(brbinf)) {
+			WARN_ON_ONCE(!(brbcr & BRBCR_EL1_MPRED));
+			entry->mispred = brbe_fetch_mispredict(brbinf);
+			entry->predicted = !entry->mispred;
+			entry->in_tx = brbe_fetch_in_tx(brbinf);
+		}
+	}
+
+	if (branch_sample_priv(event)) {
+		/*
+		 * All these information (i.e branch privilege level) are not
+		 * available for source only branch records.
+		 */
+		if (!brbe_source(brbinf))
+			entry->priv = brbe_fetch_perf_priv(brbinf);
+	}
+}
+
+/*
+ * A branch record with BRBINF_EL1.LASTFAILED set, implies that all
+ * preceding consecutive branch records, that were in a transaction
+ * (i.e their BRBINF_EL1.TX set) have been aborted.
+ *
+ * Similarly BRBFCR_EL1.LASTFAILED set, indicate that all preceding
+ * consecutive branch records upto the last record, which were in a
+ * transaction (i.e their BRBINF_EL1.TX set) have been aborted.
+ *
+ * --------------------------------- -------------------
+ * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
+ * --------------------------------- -------------------
+ * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
+ * --------------------------------- -------------------
+ * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
+ * --------------------------------- -------------------
+ * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
+ * --------------------------------- -------------------
+ * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
+ * --------------------------------- -------------------
+ * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ *
+ * BRBFCR_EL1.LASTFAILED == 1
+ *
+ * Here BRBFCR_EL1.LASTFAILED failes all those consecutive and also
+ * in transaction branches near the end of the BRBE buffer.
+ */
+static void process_branch_aborts(struct pmu_hw_events *cpuc)
+{
+	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
+	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+	bool lastfailed = !!(brbfcr & BRBFCR_EL1_LASTFAILED);
+	int idx = brbe_attr->brbe_nr - 1;
+	struct perf_branch_entry *entry;
+
+	do {
+		entry = &cpuc->branches->branch_entries[idx];
+		if (entry->in_tx) {
+			entry->abort = lastfailed;
+		} else {
+			lastfailed = entry->abort;
+			entry->abort = false;
+		}
+	} while (idx--, idx >= 0);
+}
+
+void armv8pmu_branch_reset(void)
+{
+	asm volatile(BRB_IALL);
+	isb();
+}
+
+void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
+{
+	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
+	u64 brbinf, brbfcr, brbcr;
+	int idx;
+
+	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
+	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+
+	/* Ensure pause on PMU interrupt is enabled */
+	WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));
+
+	/* Save and clear the privilege */
+	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
+
+	/* Pause the buffer */
+	write_sysreg_s(brbfcr | BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
+	isb();
+
+	for (idx = 0; idx < brbe_attr->brbe_nr; idx++) {
+		struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
+
+		select_brbe_bank_index(idx);
+		brbinf = get_brbinf_reg(idx);
+		/*
+		 * There are no valid entries anymore on the buffer.
+		 * Abort the branch record processing to save some
+		 * cycles and also reduce the capture/process load
+		 * for the user space as well.
+		 */
+		if (brbe_invalid(brbinf))
+			break;
+
+		perf_clear_branch_entry_bitfields(entry);
+		if (brbe_valid(brbinf)) {
+			entry->from = get_brbsrc_reg(idx);
+			entry->to = get_brbtgt_reg(idx);
+		} else if (brbe_source(brbinf)) {
+			entry->from = get_brbsrc_reg(idx);
+			entry->to = 0;
+		} else if (brbe_target(brbinf)) {
+			entry->from = 0;
+			entry->to = get_brbtgt_reg(idx);
+		}
+		capture_brbe_flags(cpuc, event, brbinf, brbcr, idx);
+	}
+	cpuc->branches->branch_stack.nr = idx;
+	cpuc->branches->branch_stack.hw_idx = -1ULL;
+	process_branch_aborts(cpuc);
+
+	/* Restore privilege, enable pause on PMU interrupt */
+	write_sysreg_s(brbcr | BRBCR_EL1_FZP, SYS_BRBCR_EL1);
+
+	/* Unpause the buffer */
+	write_sysreg_s(brbfcr & ~BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
+	isb();
+	armv8pmu_branch_reset();
+}
diff --git a/arch/arm64/kernel/brbe.h b/arch/arm64/kernel/brbe.h
new file mode 100644
index 000000000000..ee5aa311f12c
--- /dev/null
+++ b/arch/arm64/kernel/brbe.h
@@ -0,0 +1,257 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Branch Record Buffer Extension Helpers.
+ *
+ * Copyright (C) 2022 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+#define pr_fmt(fmt) "brbe: " fmt
+
+#include <linux/perf/arm_pmu.h>
+
+#define BRBFCR_EL1_BRANCH_FILTERS (BRBFCR_EL1_DIRECT   | \
+				   BRBFCR_EL1_INDIRECT | \
+				   BRBFCR_EL1_RTN      | \
+				   BRBFCR_EL1_INDCALL  | \
+				   BRBFCR_EL1_DIRCALL  | \
+				   BRBFCR_EL1_CONDDIR)
+
+#define BRBFCR_EL1_DEFAULT_CONFIG (BRBFCR_EL1_BANK_MASK | \
+				   BRBFCR_EL1_PAUSED    | \
+				   BRBFCR_EL1_EnI       | \
+				   BRBFCR_EL1_BRANCH_FILTERS)
+
+/*
+ * BRBTS_EL1 is currently not used for branch stack implementation
+ * purpose but BRBCR_EL1.TS needs to have a valid value from all
+ * available options. BRBCR_EL1_TS_VIRTUAL is selected for this.
+ */
+#define BRBCR_EL1_DEFAULT_TS      FIELD_PREP(BRBCR_EL1_TS_MASK, BRBCR_EL1_TS_VIRTUAL)
+
+#define BRBCR_EL1_DEFAULT_CONFIG  (BRBCR_EL1_EXCEPTION | \
+				   BRBCR_EL1_ERTN      | \
+				   BRBCR_EL1_CC        | \
+				   BRBCR_EL1_MPRED     | \
+				   BRBCR_EL1_E1BRE     | \
+				   BRBCR_EL1_E0BRE     | \
+				   BRBCR_EL1_FZP       | \
+				   BRBCR_EL1_DEFAULT_TS)
+/*
+ * BRBE Instructions
+ *
+ * BRB_IALL : Invalidate the entire buffer
+ * BRB_INJ  : Inject latest branch record derived from [BRBSRCINJ, BRBTGTINJ, BRBINFINJ]
+ */
+#define BRB_IALL __emit_inst(0xD5000000 | sys_insn(1, 1, 7, 2, 4) | (0x1f))
+#define BRB_INJ  __emit_inst(0xD5000000 | sys_insn(1, 1, 7, 2, 5) | (0x1f))
+
+/*
+ * BRBE Buffer Organization
+ *
+ * BRBE buffer is arranged as multiple banks of 32 branch record
+ * entries each. An individual branch record in a given bank could
+ * be accessed, after selecting the bank in BRBFCR_EL1.BANK and
+ * accessing the registers i.e [BRBSRC, BRBTGT, BRBINF] set with
+ * indices [0..31].
+ *
+ * Bank 0
+ *
+ *	---------------------------------	------
+ *	| 00 | BRBSRC | BRBTGT | BRBINF |	| 00 |
+ *	---------------------------------	------
+ *	| 01 | BRBSRC | BRBTGT | BRBINF |	| 01 |
+ *	---------------------------------	------
+ *	| .. | BRBSRC | BRBTGT | BRBINF |	| .. |
+ *	---------------------------------	------
+ *	| 31 | BRBSRC | BRBTGT | BRBINF |	| 31 |
+ *	---------------------------------	------
+ *
+ * Bank 1
+ *
+ *	---------------------------------	------
+ *	| 32 | BRBSRC | BRBTGT | BRBINF |	| 00 |
+ *	---------------------------------	------
+ *	| 33 | BRBSRC | BRBTGT | BRBINF |	| 01 |
+ *	---------------------------------	------
+ *	| .. | BRBSRC | BRBTGT | BRBINF |	| .. |
+ *	---------------------------------	------
+ *	| 63 | BRBSRC | BRBTGT | BRBINF |	| 31 |
+ *	---------------------------------	------
+ */
+#define BRBE_BANK_MAX_ENTRIES 32
+
+#define BRBE_BANK0_IDX_MIN 0
+#define BRBE_BANK0_IDX_MAX 31
+#define BRBE_BANK1_IDX_MIN 32
+#define BRBE_BANK1_IDX_MAX 63
+
+struct brbe_hw_attr {
+	bool	brbe_version;
+	int	brbe_cc;
+	int	brbe_nr;
+	int	brbe_format;
+};
+
+enum brbe_bank_idx {
+	BRBE_BANK_IDX_INVALID = -1,
+	BRBE_BANK_IDX_0,
+	BRBE_BANK_IDX_1,
+	BRBE_BANK_IDX_MAX
+};
+
+#define RETURN_READ_BRBSRCN(n) \
+	read_sysreg_s(SYS_BRBSRC##n##_EL1)
+
+#define RETURN_READ_BRBTGTN(n) \
+	read_sysreg_s(SYS_BRBTGT##n##_EL1)
+
+#define RETURN_READ_BRBINFN(n) \
+	read_sysreg_s(SYS_BRBINF##n##_EL1)
+
+#define BRBE_REGN_CASE(n, case_macro) \
+	case n: return case_macro(n); break
+
+#define BRBE_REGN_SWITCH(x, case_macro)				\
+	do {							\
+		switch (x) {					\
+		BRBE_REGN_CASE(0, case_macro);			\
+		BRBE_REGN_CASE(1, case_macro);			\
+		BRBE_REGN_CASE(2, case_macro);			\
+		BRBE_REGN_CASE(3, case_macro);			\
+		BRBE_REGN_CASE(4, case_macro);			\
+		BRBE_REGN_CASE(5, case_macro);			\
+		BRBE_REGN_CASE(6, case_macro);			\
+		BRBE_REGN_CASE(7, case_macro);			\
+		BRBE_REGN_CASE(8, case_macro);			\
+		BRBE_REGN_CASE(9, case_macro);			\
+		BRBE_REGN_CASE(10, case_macro);			\
+		BRBE_REGN_CASE(11, case_macro);			\
+		BRBE_REGN_CASE(12, case_macro);			\
+		BRBE_REGN_CASE(13, case_macro);			\
+		BRBE_REGN_CASE(14, case_macro);			\
+		BRBE_REGN_CASE(15, case_macro);			\
+		BRBE_REGN_CASE(16, case_macro);			\
+		BRBE_REGN_CASE(17, case_macro);			\
+		BRBE_REGN_CASE(18, case_macro);			\
+		BRBE_REGN_CASE(19, case_macro);			\
+		BRBE_REGN_CASE(20, case_macro);			\
+		BRBE_REGN_CASE(21, case_macro);			\
+		BRBE_REGN_CASE(22, case_macro);			\
+		BRBE_REGN_CASE(23, case_macro);			\
+		BRBE_REGN_CASE(24, case_macro);			\
+		BRBE_REGN_CASE(25, case_macro);			\
+		BRBE_REGN_CASE(26, case_macro);			\
+		BRBE_REGN_CASE(27, case_macro);			\
+		BRBE_REGN_CASE(28, case_macro);			\
+		BRBE_REGN_CASE(29, case_macro);			\
+		BRBE_REGN_CASE(30, case_macro);			\
+		BRBE_REGN_CASE(31, case_macro);			\
+		default:					\
+			pr_warn("unknown register index\n");	\
+			return -1;				\
+		}						\
+	} while (0)
+
+static inline int buffer_to_brbe_idx(int buffer_idx)
+{
+	return buffer_idx % BRBE_BANK_MAX_ENTRIES;
+}
+
+static inline u64 get_brbsrc_reg(int buffer_idx)
+{
+	int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+	BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBSRCN);
+}
+
+static inline u64 get_brbtgt_reg(int buffer_idx)
+{
+	int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+	BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBTGTN);
+}
+
+static inline u64 get_brbinf_reg(int buffer_idx)
+{
+	int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+	BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBINFN);
+}
+
+static inline u64 brbe_record_valid(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_VALID_MASK, brbinf);
+}
+
+static inline bool brbe_invalid(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_NONE;
+}
+
+static inline bool brbe_valid(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_FULL;
+}
+
+static inline bool brbe_source(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_SOURCE;
+}
+
+static inline bool brbe_target(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_TARGET;
+}
+
+static inline int brbe_fetch_in_tx(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_T_MASK, brbinf);
+}
+
+static inline int brbe_fetch_mispredict(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_MPRED_MASK, brbinf);
+}
+
+static inline int brbe_fetch_lastfailed(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_LASTFAILED_MASK, brbinf);
+}
+
+static inline int brbe_fetch_cycles(u64 brbinf)
+{
+	/*
+	 * Captured cycle count is unknown and hence
+	 * should not be passed on to the user space.
+	 */
+	if (brbinf & BRBINF_EL1_CCU)
+		return 0;
+
+	return FIELD_GET(BRBINF_EL1_CC_MASK, brbinf);
+}
+
+static inline int brbe_fetch_type(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_TYPE_MASK, brbinf);
+}
+
+static inline int brbe_fetch_el(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_EL_MASK, brbinf);
+}
+
+static inline int brbe_fetch_numrec(u64 brbidr)
+{
+	return FIELD_GET(BRBIDR0_EL1_NUMREC_MASK, brbidr);
+}
+
+static inline int brbe_fetch_format(u64 brbidr)
+{
+	return FIELD_GET(BRBIDR0_EL1_FORMAT_MASK, brbidr);
+}
+
+static inline int brbe_fetch_cc_bits(u64 brbidr)
+{
+	return FIELD_GET(BRBIDR0_EL1_CC_MASK, brbidr);
+}
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
@ 2023-01-05  3:10   ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-05  3:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Anshuman Khandual, Catalin Marinas, Will Deacon

This enables branch stack sampling events in ARMV8 PMU, via an architecture
feature FEAT_BRBE aka branch record buffer extension. This defines required
branch helper functions pmuv8pmu_branch_XXXXX() and the implementation here
is wrapped with a new config option CONFIG_ARM64_BRBE.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/Kconfig                  |  11 +
 arch/arm64/include/asm/perf_event.h |   9 +
 arch/arm64/kernel/Makefile          |   1 +
 arch/arm64/kernel/brbe.c            | 512 ++++++++++++++++++++++++++++
 arch/arm64/kernel/brbe.h            | 257 ++++++++++++++
 5 files changed, 790 insertions(+)
 create mode 100644 arch/arm64/kernel/brbe.c
 create mode 100644 arch/arm64/kernel/brbe.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 03934808b2ed..915b12709a46 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1363,6 +1363,17 @@ config HW_PERF_EVENTS
 	def_bool y
 	depends on ARM_PMU
 
+config ARM64_BRBE
+	bool "Enable support for Branch Record Buffer Extension (BRBE)"
+	depends on PERF_EVENTS && ARM64 && ARM_PMU
+	default y
+	help
+	  Enable perf support for Branch Record Buffer Extension (BRBE) which
+	  records all branches taken in an execution path. This supports some
+	  branch types and privilege based filtering. It captured additional
+	  relevant information such as cycle count, misprediction and branch
+	  type, branch privilege level etc.
+
 # Supported by clang >= 7.0 or GCC >= 12.0.0
 config CC_HAVE_SHADOW_CALL_STACK
 	def_bool $(cc-option, -fsanitize=shadow-call-stack -ffixed-x18)
diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index a038902d6874..cf2e88c7b707 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -277,6 +277,14 @@ struct pmu_hw_events;
 struct arm_pmu;
 struct perf_event;
 
+#ifdef CONFIG_ARM64_BRBE
+void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event);
+bool armv8pmu_branch_valid(struct perf_event *event);
+void armv8pmu_branch_enable(struct perf_event *event);
+void armv8pmu_branch_disable(struct perf_event *event);
+void armv8pmu_branch_probe(struct arm_pmu *arm_pmu);
+void armv8pmu_branch_reset(void);
+#else
 static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
 static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
 static inline void armv8pmu_branch_enable(struct perf_event *event) { }
@@ -284,3 +292,4 @@ static inline void armv8pmu_branch_disable(struct perf_event *event) { }
 static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
 static inline void armv8pmu_branch_reset(void) { }
 #endif
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index ceba6792f5b3..6ee7ccb61621 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)			+= module.o
 obj-$(CONFIG_ARM64_MODULE_PLTS)		+= module-plts.o
 obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o perf_callchain.o
 obj-$(CONFIG_HW_PERF_EVENTS)		+= perf_event.o
+obj-$(CONFIG_ARM64_BRBE)		+= brbe.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
 obj-$(CONFIG_CPU_PM)			+= sleep.o suspend.o
 obj-$(CONFIG_CPU_IDLE)			+= cpuidle.o
diff --git a/arch/arm64/kernel/brbe.c b/arch/arm64/kernel/brbe.c
new file mode 100644
index 000000000000..cd03d3531e04
--- /dev/null
+++ b/arch/arm64/kernel/brbe.c
@@ -0,0 +1,512 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Branch Record Buffer Extension Driver.
+ *
+ * Copyright (C) 2022 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+#include "brbe.h"
+
+static bool valid_brbe_nr(int brbe_nr)
+{
+	return brbe_nr == BRBIDR0_EL1_NUMREC_8 ||
+	       brbe_nr == BRBIDR0_EL1_NUMREC_16 ||
+	       brbe_nr == BRBIDR0_EL1_NUMREC_32 ||
+	       brbe_nr == BRBIDR0_EL1_NUMREC_64;
+}
+
+static bool valid_brbe_cc(int brbe_cc)
+{
+	return brbe_cc == BRBIDR0_EL1_CC_20_BIT;
+}
+
+static bool valid_brbe_format(int brbe_format)
+{
+	return brbe_format == BRBIDR0_EL1_FORMAT_0;
+}
+
+static bool valid_brbe_version(int brbe_version)
+{
+	return brbe_version == ID_AA64DFR0_EL1_BRBE_IMP ||
+	       brbe_version == ID_AA64DFR0_EL1_BRBE_BRBE_V1P1;
+}
+
+static void select_brbe_bank(int bank)
+{
+	static int brbe_current_bank = BRBE_BANK_IDX_INVALID;
+	u64 brbfcr;
+
+	if (brbe_current_bank == bank)
+		return;
+
+	WARN_ON(bank > BRBE_BANK_IDX_1);
+	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+	brbfcr &= ~BRBFCR_EL1_BANK_MASK;
+	brbfcr |= ((bank << BRBFCR_EL1_BANK_SHIFT) & BRBFCR_EL1_BANK_MASK);
+	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+	isb();
+	brbe_current_bank = bank;
+}
+
+static void select_brbe_bank_index(int buffer_idx)
+{
+	switch (buffer_idx) {
+	case BRBE_BANK0_IDX_MIN ... BRBE_BANK0_IDX_MAX:
+		select_brbe_bank(BRBE_BANK_IDX_0);
+		break;
+	case BRBE_BANK1_IDX_MIN ... BRBE_BANK1_IDX_MAX:
+		select_brbe_bank(BRBE_BANK_IDX_1);
+		break;
+	default:
+		pr_warn("unsupported BRBE index\n");
+	}
+}
+
+static const char branch_filter_error_msg[] = "branch filter not supported";
+
+bool armv8pmu_branch_valid(struct perf_event *event)
+{
+	u64 branch_type = event->attr.branch_sample_type;
+
+	/*
+	 * If the event does not have at least one of the privilege
+	 * branch filters as in PERF_SAMPLE_BRANCH_PLM_ALL, the core
+	 * perf will adjust its value based on perf event's existing
+	 * privilege level via attr.exclude_[user|kernel|hv].
+	 *
+	 * As event->attr.branch_sample_type might have been changed
+	 * when the event reaches here, it is not possible to figure
+	 * out whether the event originally had HV privilege request
+	 * or got added via the core perf. Just report this situation
+	 * once and continue ignoring if there are other instances.
+	 */
+	if ((branch_type & PERF_SAMPLE_BRANCH_HV) && !is_kernel_in_hyp_mode())
+		pr_warn_once("%s - hypervisor privilege\n", branch_filter_error_msg);
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
+		pr_warn_once("%s - aborted transaction\n", branch_filter_error_msg);
+		return false;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_NO_TX) {
+		pr_warn_once("%s - no transaction\n", branch_filter_error_msg);
+		return false;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_IN_TX) {
+		pr_warn_once("%s - in transaction\n", branch_filter_error_msg);
+		return false;
+	}
+	return true;
+}
+
+static void branch_records_alloc(struct arm_pmu *armpmu)
+{
+	struct pmu_hw_events *events;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		events = per_cpu_ptr(armpmu->hw_events, cpu);
+
+		events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
+		WARN_ON(!events->branches);
+	}
+}
+
+static int brbe_attributes_probe(struct arm_pmu *armpmu, u32 brbe)
+{
+	struct brbe_hw_attr *brbe_attr = kzalloc(sizeof(struct brbe_hw_attr), GFP_KERNEL);
+	u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
+
+	WARN_ON(!brbe_attr);
+	armpmu->private = brbe_attr;
+
+	brbe_attr->brbe_version = brbe;
+	brbe_attr->brbe_format = brbe_fetch_format(brbidr);
+	brbe_attr->brbe_cc = brbe_fetch_cc_bits(brbidr);
+	brbe_attr->brbe_nr = brbe_fetch_numrec(brbidr);
+
+	if (!valid_brbe_version(brbe_attr->brbe_version) ||
+	   !valid_brbe_format(brbe_attr->brbe_format) ||
+	   !valid_brbe_cc(brbe_attr->brbe_cc) ||
+	   !valid_brbe_nr(brbe_attr->brbe_nr))
+		return -EOPNOTSUPP;
+
+	return 0;
+}
+
+void armv8pmu_branch_probe(struct arm_pmu *armpmu)
+{
+	u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
+	u32 brbe;
+
+	brbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT);
+	if (!brbe)
+		return;
+
+	if (brbe_attributes_probe(armpmu, brbe))
+		return;
+
+	branch_records_alloc(armpmu);
+	armpmu->features |= ARM_PMU_BRANCH_STACK;
+}
+
+static u64 branch_type_to_brbfcr(int branch_type)
+{
+	u64 brbfcr = 0;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+		brbfcr |= BRBFCR_EL1_BRANCH_FILTERS;
+		return brbfcr;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
+		brbfcr |= BRBFCR_EL1_INDCALL;
+		brbfcr |= BRBFCR_EL1_DIRCALL;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+		brbfcr |= BRBFCR_EL1_RTN;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
+		brbfcr |= BRBFCR_EL1_INDCALL;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_COND)
+		brbfcr |= BRBFCR_EL1_CONDDIR;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP)
+		brbfcr |= BRBFCR_EL1_INDIRECT;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_CALL)
+		brbfcr |= BRBFCR_EL1_DIRCALL;
+
+	return brbfcr;
+}
+
+static u64 branch_type_to_brbcr(int branch_type)
+{
+	u64 brbcr = (BRBCR_EL1_FZP | BRBCR_EL1_DEFAULT_TS);
+
+	if (branch_type & PERF_SAMPLE_BRANCH_USER)
+		brbcr |= BRBCR_EL1_E0BRE;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_KERNEL)
+		brbcr |= BRBCR_EL1_E1BRE;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_HV) {
+		if (is_kernel_in_hyp_mode())
+			brbcr |= BRBCR_EL1_E1BRE;
+	}
+
+	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_CYCLES))
+		brbcr |= BRBCR_EL1_CC;
+
+	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_FLAGS))
+		brbcr |= BRBCR_EL1_MPRED;
+
+	/*
+	 * The exception and exception return branches could be
+	 * captured, irrespective of the perf event's privilege.
+	 * If the perf event does not have enough privilege for
+	 * a given exception level, then addresses which falls
+	 * under that exception level will be reported as zero
+	 * for the captured branch record, creating source only
+	 * or target only records.
+	 */
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
+		brbcr |= BRBCR_EL1_EXCEPTION;
+		brbcr |= BRBCR_EL1_ERTN;
+	}
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
+		brbcr |= BRBCR_EL1_EXCEPTION;
+
+	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
+		brbcr |= BRBCR_EL1_ERTN;
+
+	return brbcr & BRBCR_EL1_DEFAULT_CONFIG;
+}
+
+void armv8pmu_branch_enable(struct perf_event *event)
+{
+	u64 branch_type = event->attr.branch_sample_type;
+	u64 brbfcr, brbcr;
+
+	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+	brbfcr &= ~BRBFCR_EL1_DEFAULT_CONFIG;
+	brbfcr |= branch_type_to_brbfcr(branch_type);
+	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+	isb();
+
+	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
+	brbcr &= ~BRBCR_EL1_DEFAULT_CONFIG;
+	brbcr |= branch_type_to_brbcr(branch_type);
+	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+	isb();
+	armv8pmu_branch_reset();
+}
+
+void armv8pmu_branch_disable(struct perf_event *event)
+{
+	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+	u64 brbcr = read_sysreg_s(SYS_BRBCR_EL1);
+
+	brbcr &= ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE);
+	brbfcr |= BRBFCR_EL1_PAUSED;
+	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
+	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
+	isb();
+}
+
+static int brbe_fetch_perf_type(u64 brbinf, bool *new_branch_type)
+{
+	int brbe_type = brbe_fetch_type(brbinf);
+	*new_branch_type = false;
+
+	switch (brbe_type) {
+	case BRBINF_EL1_TYPE_UNCOND_DIR:
+		return PERF_BR_UNCOND;
+	case BRBINF_EL1_TYPE_INDIR:
+		return PERF_BR_IND;
+	case BRBINF_EL1_TYPE_DIR_LINK:
+		return PERF_BR_CALL;
+	case BRBINF_EL1_TYPE_INDIR_LINK:
+		return PERF_BR_IND_CALL;
+	case BRBINF_EL1_TYPE_RET_SUB:
+		return PERF_BR_RET;
+	case BRBINF_EL1_TYPE_COND_DIR:
+		return PERF_BR_COND;
+	case BRBINF_EL1_TYPE_CALL:
+		return PERF_BR_CALL;
+	case BRBINF_EL1_TYPE_TRAP:
+		return PERF_BR_SYSCALL;
+	case BRBINF_EL1_TYPE_RET_EXCPT:
+		return PERF_BR_ERET;
+	case BRBINF_EL1_TYPE_IRQ:
+		return PERF_BR_IRQ;
+	case BRBINF_EL1_TYPE_DEBUG_HALT:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_HALT;
+	case BRBINF_EL1_TYPE_SERROR:
+		return PERF_BR_SERROR;
+	case BRBINF_EL1_TYPE_INST_DEBUG:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_INST;
+	case BRBINF_EL1_TYPE_DATA_DEBUG:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_DATA;
+	case BRBINF_EL1_TYPE_ALGN_FAULT:
+		*new_branch_type = true;
+		return PERF_BR_NEW_FAULT_ALGN;
+	case BRBINF_EL1_TYPE_INST_FAULT:
+		*new_branch_type = true;
+		return PERF_BR_NEW_FAULT_INST;
+	case BRBINF_EL1_TYPE_DATA_FAULT:
+		*new_branch_type = true;
+		return PERF_BR_NEW_FAULT_DATA;
+	case BRBINF_EL1_TYPE_FIQ:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_FIQ;
+	case BRBINF_EL1_TYPE_DEBUG_EXIT:
+		*new_branch_type = true;
+		return PERF_BR_ARM64_DEBUG_EXIT;
+	default:
+		pr_warn("unknown branch type captured\n");
+		return PERF_BR_UNKNOWN;
+	}
+}
+
+static int brbe_fetch_perf_priv(u64 brbinf)
+{
+	int brbe_el = brbe_fetch_el(brbinf);
+
+	switch (brbe_el) {
+	case BRBINF_EL1_EL_EL0:
+		return PERF_BR_PRIV_USER;
+	case BRBINF_EL1_EL_EL1:
+		return PERF_BR_PRIV_KERNEL;
+	case BRBINF_EL1_EL_EL2:
+		if (is_kernel_in_hyp_mode())
+			return PERF_BR_PRIV_KERNEL;
+		return PERF_BR_PRIV_HV;
+	default:
+		pr_warn("unknown branch privilege captured\n");
+		return PERF_BR_PRIV_UNKNOWN;
+	}
+}
+
+static void capture_brbe_flags(struct pmu_hw_events *cpuc, struct perf_event *event,
+			       u64 brbinf, u64 brbcr, int idx)
+{
+	struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
+	bool new_branch_type;
+	int branch_type;
+
+	if (branch_sample_type(event)) {
+		branch_type = brbe_fetch_perf_type(brbinf, &new_branch_type);
+		if (new_branch_type) {
+			entry->type = PERF_BR_EXTEND_ABI;
+			entry->new_type = branch_type;
+		} else {
+			entry->type = branch_type;
+		}
+	}
+
+	if (!branch_sample_no_cycles(event)) {
+		WARN_ON_ONCE(!(brbcr & BRBCR_EL1_CC));
+		entry->cycles = brbe_fetch_cycles(brbinf);
+	}
+
+	if (!branch_sample_no_flags(event)) {
+		/*
+		 * BRBINF_LASTFAILED does not indicate whether last transaction
+		 * got failed or aborted during the current branch record itself.
+		 * Rather, this indicates that all the branch records which were
+		 * in transaction until the curret branch record have failed. So
+		 * the entire BRBE buffer needs to be processed later on to find
+		 * all branch records which might have failed.
+		 */
+		entry->abort = brbe_fetch_lastfailed(brbinf);
+
+		/*
+		 * All these information (i.e transaction state and mispredicts)
+		 * are not available for target only branch records.
+		 */
+		if (!brbe_target(brbinf)) {
+			WARN_ON_ONCE(!(brbcr & BRBCR_EL1_MPRED));
+			entry->mispred = brbe_fetch_mispredict(brbinf);
+			entry->predicted = !entry->mispred;
+			entry->in_tx = brbe_fetch_in_tx(brbinf);
+		}
+	}
+
+	if (branch_sample_priv(event)) {
+		/*
+		 * All these information (i.e branch privilege level) are not
+		 * available for source only branch records.
+		 */
+		if (!brbe_source(brbinf))
+			entry->priv = brbe_fetch_perf_priv(brbinf);
+	}
+}
+
+/*
+ * A branch record with BRBINF_EL1.LASTFAILED set, implies that all
+ * preceding consecutive branch records, that were in a transaction
+ * (i.e their BRBINF_EL1.TX set) have been aborted.
+ *
+ * Similarly BRBFCR_EL1.LASTFAILED set, indicate that all preceding
+ * consecutive branch records upto the last record, which were in a
+ * transaction (i.e their BRBINF_EL1.TX set) have been aborted.
+ *
+ * --------------------------------- -------------------
+ * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
+ * --------------------------------- -------------------
+ * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
+ * --------------------------------- -------------------
+ * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
+ * --------------------------------- -------------------
+ * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
+ * --------------------------------- -------------------
+ * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
+ * --------------------------------- -------------------
+ * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
+ * --------------------------------- -------------------
+ *
+ * BRBFCR_EL1.LASTFAILED == 1
+ *
+ * Here BRBFCR_EL1.LASTFAILED failes all those consecutive and also
+ * in transaction branches near the end of the BRBE buffer.
+ */
+static void process_branch_aborts(struct pmu_hw_events *cpuc)
+{
+	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
+	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+	bool lastfailed = !!(brbfcr & BRBFCR_EL1_LASTFAILED);
+	int idx = brbe_attr->brbe_nr - 1;
+	struct perf_branch_entry *entry;
+
+	do {
+		entry = &cpuc->branches->branch_entries[idx];
+		if (entry->in_tx) {
+			entry->abort = lastfailed;
+		} else {
+			lastfailed = entry->abort;
+			entry->abort = false;
+		}
+	} while (idx--, idx >= 0);
+}
+
+void armv8pmu_branch_reset(void)
+{
+	asm volatile(BRB_IALL);
+	isb();
+}
+
+void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
+{
+	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
+	u64 brbinf, brbfcr, brbcr;
+	int idx;
+
+	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
+	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
+
+	/* Ensure pause on PMU interrupt is enabled */
+	WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));
+
+	/* Save and clear the privilege */
+	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
+
+	/* Pause the buffer */
+	write_sysreg_s(brbfcr | BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
+	isb();
+
+	for (idx = 0; idx < brbe_attr->brbe_nr; idx++) {
+		struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
+
+		select_brbe_bank_index(idx);
+		brbinf = get_brbinf_reg(idx);
+		/*
+		 * There are no valid entries anymore on the buffer.
+		 * Abort the branch record processing to save some
+		 * cycles and also reduce the capture/process load
+		 * for the user space as well.
+		 */
+		if (brbe_invalid(brbinf))
+			break;
+
+		perf_clear_branch_entry_bitfields(entry);
+		if (brbe_valid(brbinf)) {
+			entry->from = get_brbsrc_reg(idx);
+			entry->to = get_brbtgt_reg(idx);
+		} else if (brbe_source(brbinf)) {
+			entry->from = get_brbsrc_reg(idx);
+			entry->to = 0;
+		} else if (brbe_target(brbinf)) {
+			entry->from = 0;
+			entry->to = get_brbtgt_reg(idx);
+		}
+		capture_brbe_flags(cpuc, event, brbinf, brbcr, idx);
+	}
+	cpuc->branches->branch_stack.nr = idx;
+	cpuc->branches->branch_stack.hw_idx = -1ULL;
+	process_branch_aborts(cpuc);
+
+	/* Restore privilege, enable pause on PMU interrupt */
+	write_sysreg_s(brbcr | BRBCR_EL1_FZP, SYS_BRBCR_EL1);
+
+	/* Unpause the buffer */
+	write_sysreg_s(brbfcr & ~BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
+	isb();
+	armv8pmu_branch_reset();
+}
diff --git a/arch/arm64/kernel/brbe.h b/arch/arm64/kernel/brbe.h
new file mode 100644
index 000000000000..ee5aa311f12c
--- /dev/null
+++ b/arch/arm64/kernel/brbe.h
@@ -0,0 +1,257 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Branch Record Buffer Extension Helpers.
+ *
+ * Copyright (C) 2022 ARM Limited
+ *
+ * Author: Anshuman Khandual <anshuman.khandual@arm.com>
+ */
+#define pr_fmt(fmt) "brbe: " fmt
+
+#include <linux/perf/arm_pmu.h>
+
+#define BRBFCR_EL1_BRANCH_FILTERS (BRBFCR_EL1_DIRECT   | \
+				   BRBFCR_EL1_INDIRECT | \
+				   BRBFCR_EL1_RTN      | \
+				   BRBFCR_EL1_INDCALL  | \
+				   BRBFCR_EL1_DIRCALL  | \
+				   BRBFCR_EL1_CONDDIR)
+
+#define BRBFCR_EL1_DEFAULT_CONFIG (BRBFCR_EL1_BANK_MASK | \
+				   BRBFCR_EL1_PAUSED    | \
+				   BRBFCR_EL1_EnI       | \
+				   BRBFCR_EL1_BRANCH_FILTERS)
+
+/*
+ * BRBTS_EL1 is currently not used for branch stack implementation
+ * purpose but BRBCR_EL1.TS needs to have a valid value from all
+ * available options. BRBCR_EL1_TS_VIRTUAL is selected for this.
+ */
+#define BRBCR_EL1_DEFAULT_TS      FIELD_PREP(BRBCR_EL1_TS_MASK, BRBCR_EL1_TS_VIRTUAL)
+
+#define BRBCR_EL1_DEFAULT_CONFIG  (BRBCR_EL1_EXCEPTION | \
+				   BRBCR_EL1_ERTN      | \
+				   BRBCR_EL1_CC        | \
+				   BRBCR_EL1_MPRED     | \
+				   BRBCR_EL1_E1BRE     | \
+				   BRBCR_EL1_E0BRE     | \
+				   BRBCR_EL1_FZP       | \
+				   BRBCR_EL1_DEFAULT_TS)
+/*
+ * BRBE Instructions
+ *
+ * BRB_IALL : Invalidate the entire buffer
+ * BRB_INJ  : Inject latest branch record derived from [BRBSRCINJ, BRBTGTINJ, BRBINFINJ]
+ */
+#define BRB_IALL __emit_inst(0xD5000000 | sys_insn(1, 1, 7, 2, 4) | (0x1f))
+#define BRB_INJ  __emit_inst(0xD5000000 | sys_insn(1, 1, 7, 2, 5) | (0x1f))
+
+/*
+ * BRBE Buffer Organization
+ *
+ * BRBE buffer is arranged as multiple banks of 32 branch record
+ * entries each. An individual branch record in a given bank could
+ * be accessed, after selecting the bank in BRBFCR_EL1.BANK and
+ * accessing the registers i.e [BRBSRC, BRBTGT, BRBINF] set with
+ * indices [0..31].
+ *
+ * Bank 0
+ *
+ *	---------------------------------	------
+ *	| 00 | BRBSRC | BRBTGT | BRBINF |	| 00 |
+ *	---------------------------------	------
+ *	| 01 | BRBSRC | BRBTGT | BRBINF |	| 01 |
+ *	---------------------------------	------
+ *	| .. | BRBSRC | BRBTGT | BRBINF |	| .. |
+ *	---------------------------------	------
+ *	| 31 | BRBSRC | BRBTGT | BRBINF |	| 31 |
+ *	---------------------------------	------
+ *
+ * Bank 1
+ *
+ *	---------------------------------	------
+ *	| 32 | BRBSRC | BRBTGT | BRBINF |	| 00 |
+ *	---------------------------------	------
+ *	| 33 | BRBSRC | BRBTGT | BRBINF |	| 01 |
+ *	---------------------------------	------
+ *	| .. | BRBSRC | BRBTGT | BRBINF |	| .. |
+ *	---------------------------------	------
+ *	| 63 | BRBSRC | BRBTGT | BRBINF |	| 31 |
+ *	---------------------------------	------
+ */
+#define BRBE_BANK_MAX_ENTRIES 32
+
+#define BRBE_BANK0_IDX_MIN 0
+#define BRBE_BANK0_IDX_MAX 31
+#define BRBE_BANK1_IDX_MIN 32
+#define BRBE_BANK1_IDX_MAX 63
+
+struct brbe_hw_attr {
+	bool	brbe_version;
+	int	brbe_cc;
+	int	brbe_nr;
+	int	brbe_format;
+};
+
+enum brbe_bank_idx {
+	BRBE_BANK_IDX_INVALID = -1,
+	BRBE_BANK_IDX_0,
+	BRBE_BANK_IDX_1,
+	BRBE_BANK_IDX_MAX
+};
+
+#define RETURN_READ_BRBSRCN(n) \
+	read_sysreg_s(SYS_BRBSRC##n##_EL1)
+
+#define RETURN_READ_BRBTGTN(n) \
+	read_sysreg_s(SYS_BRBTGT##n##_EL1)
+
+#define RETURN_READ_BRBINFN(n) \
+	read_sysreg_s(SYS_BRBINF##n##_EL1)
+
+#define BRBE_REGN_CASE(n, case_macro) \
+	case n: return case_macro(n); break
+
+#define BRBE_REGN_SWITCH(x, case_macro)				\
+	do {							\
+		switch (x) {					\
+		BRBE_REGN_CASE(0, case_macro);			\
+		BRBE_REGN_CASE(1, case_macro);			\
+		BRBE_REGN_CASE(2, case_macro);			\
+		BRBE_REGN_CASE(3, case_macro);			\
+		BRBE_REGN_CASE(4, case_macro);			\
+		BRBE_REGN_CASE(5, case_macro);			\
+		BRBE_REGN_CASE(6, case_macro);			\
+		BRBE_REGN_CASE(7, case_macro);			\
+		BRBE_REGN_CASE(8, case_macro);			\
+		BRBE_REGN_CASE(9, case_macro);			\
+		BRBE_REGN_CASE(10, case_macro);			\
+		BRBE_REGN_CASE(11, case_macro);			\
+		BRBE_REGN_CASE(12, case_macro);			\
+		BRBE_REGN_CASE(13, case_macro);			\
+		BRBE_REGN_CASE(14, case_macro);			\
+		BRBE_REGN_CASE(15, case_macro);			\
+		BRBE_REGN_CASE(16, case_macro);			\
+		BRBE_REGN_CASE(17, case_macro);			\
+		BRBE_REGN_CASE(18, case_macro);			\
+		BRBE_REGN_CASE(19, case_macro);			\
+		BRBE_REGN_CASE(20, case_macro);			\
+		BRBE_REGN_CASE(21, case_macro);			\
+		BRBE_REGN_CASE(22, case_macro);			\
+		BRBE_REGN_CASE(23, case_macro);			\
+		BRBE_REGN_CASE(24, case_macro);			\
+		BRBE_REGN_CASE(25, case_macro);			\
+		BRBE_REGN_CASE(26, case_macro);			\
+		BRBE_REGN_CASE(27, case_macro);			\
+		BRBE_REGN_CASE(28, case_macro);			\
+		BRBE_REGN_CASE(29, case_macro);			\
+		BRBE_REGN_CASE(30, case_macro);			\
+		BRBE_REGN_CASE(31, case_macro);			\
+		default:					\
+			pr_warn("unknown register index\n");	\
+			return -1;				\
+		}						\
+	} while (0)
+
+static inline int buffer_to_brbe_idx(int buffer_idx)
+{
+	return buffer_idx % BRBE_BANK_MAX_ENTRIES;
+}
+
+static inline u64 get_brbsrc_reg(int buffer_idx)
+{
+	int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+	BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBSRCN);
+}
+
+static inline u64 get_brbtgt_reg(int buffer_idx)
+{
+	int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+	BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBTGTN);
+}
+
+static inline u64 get_brbinf_reg(int buffer_idx)
+{
+	int brbe_idx = buffer_to_brbe_idx(buffer_idx);
+
+	BRBE_REGN_SWITCH(brbe_idx, RETURN_READ_BRBINFN);
+}
+
+static inline u64 brbe_record_valid(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_VALID_MASK, brbinf);
+}
+
+static inline bool brbe_invalid(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_NONE;
+}
+
+static inline bool brbe_valid(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_FULL;
+}
+
+static inline bool brbe_source(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_SOURCE;
+}
+
+static inline bool brbe_target(u64 brbinf)
+{
+	return brbe_record_valid(brbinf) == BRBINF_EL1_VALID_TARGET;
+}
+
+static inline int brbe_fetch_in_tx(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_T_MASK, brbinf);
+}
+
+static inline int brbe_fetch_mispredict(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_MPRED_MASK, brbinf);
+}
+
+static inline int brbe_fetch_lastfailed(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_LASTFAILED_MASK, brbinf);
+}
+
+static inline int brbe_fetch_cycles(u64 brbinf)
+{
+	/*
+	 * Captured cycle count is unknown and hence
+	 * should not be passed on to the user space.
+	 */
+	if (brbinf & BRBINF_EL1_CCU)
+		return 0;
+
+	return FIELD_GET(BRBINF_EL1_CC_MASK, brbinf);
+}
+
+static inline int brbe_fetch_type(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_TYPE_MASK, brbinf);
+}
+
+static inline int brbe_fetch_el(u64 brbinf)
+{
+	return FIELD_GET(BRBINF_EL1_EL_MASK, brbinf);
+}
+
+static inline int brbe_fetch_numrec(u64 brbidr)
+{
+	return FIELD_GET(BRBIDR0_EL1_NUMREC_MASK, brbidr);
+}
+
+static inline int brbe_fetch_format(u64 brbidr)
+{
+	return FIELD_GET(BRBIDR0_EL1_FORMAT_MASK, brbidr);
+}
+
+static inline int brbe_fetch_cc_bits(u64 brbidr)
+{
+	return FIELD_GET(BRBIDR0_EL1_CC_MASK, brbidr);
+}
-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 0/6] arm64/perf: Enable branch stack sampling
  2023-01-05  3:10 ` Anshuman Khandual
@ 2023-01-06 10:23   ` James Clark
  -1 siblings, 0 replies; 62+ messages in thread
From: James Clark @ 2023-01-06 10:23 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Catalin Marinas, Will Deacon, Mark Brown, Rob Herring,
	Marc Zyngier, Suzuki Poulose, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, linux-perf-users, linux-arm-kernel,
	linux-kernel, mark.rutland



On 05/01/2023 03:10, Anshuman Khandual wrote:
> This series enables perf branch stack sampling support on arm64 platform
> via a new arch feature called Branch Record Buffer Extension (BRBE). All
> relevant register definitions could be accessed here.
> 

Hi Anshuman,

The missing cc for linux-perf-users@vger.kernel.org on the other patches
means that this looks incomplete on the lore page for linux-perf-users.
b4 still picks up the full set, so it's probably fine. But it might be
worth adding the same cc for all patches next time.

Thanks
James

> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
> 
> This series applies on v6.2-r2.
> 
> Changes in V7:
> 
> - Folded [PATCH 7/7] into [PATCH 3/7] which enables branch stack sampling event
> - Defined BRBFCR_EL1_BRANCH_FILTERS, BRBCR_EL1_DEFAULT_CONFIG in the header
> - Defined BRBFCR_EL1_DEFAULT_CONFIG in the header
> - Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_FZP
> - Defined BRBCR_EL1_DEFAULT_TS in the header
> - Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_DEFAULT_TS
> - Moved BRBCR_EL1_DEFAULT_CONFIG check inside branch_type_to_brbcr()
> - Moved down BRBCR_EL1_CC, BRBCR_EL1_MPRED later in branch_type_to_brbcr()
> - Also set BRBE in paused state in armv8pmu_branch_disable()
> - Dropped brbe_paused(), set_brbe_paused() helpers
> - Extracted error string via branch_filter_error_msg[] for armv8pmu_branch_valid()
> - Replaced brbe_v1p1 with brbe_version in struct brbe_hw_attr
> - Added valid_brbe_[cc, format, version]() helpers
> - Split a separate brbe_attributes_probe() from armv8pmu_branch_probe()
> - Capture event->attr.branch_sample_type earlier in armv8pmu_branch_valid()
> - Defined enum brbe_bank_idx with possible values for BRBE bank indices
> - Changed armpmu->hw_attr into armpmu->private
> - Added missing space in stub definition for armv8pmu_branch_valid()
> - Replaced both kmalloc() with kzalloc()
> - Added BRBE_BANK_MAX_ENTRIES
> - Updated comment for capture_brbe_flags()
> - Updated comment for struct brbe_hw_attr
> - Dropped space after type cast in couple of places
> - Replaced inverse with negation for testing BRBCR_EL1_FZP in armv8pmu_branch_read()
> - Captured cpuc->branches->branch_entries[idx] in a local variable
> - Dropped saved_priv from armv8pmu_branch_read()
> - Reorganize PERF_SAMPLE_BRANCH_NO_[CYCLES|NO_FLAGS] related configuration
> - Replaced with FIELD_GET() and FIELD_PREP() wherever applicable
> - Replaced BRBCR_EL1_TS_PHYSICAL with BRBCR_EL1_TS_VIRTUAL
> - Moved valid_brbe_nr(), valid_brbe_cc(), valid_brbe_format(), valid_brbe_version()
>   select_brbe_bank(), select_brbe_bank_index() helpers inside the C implementation
> - Reorganized brbe_valid_nr() and dropped the pr_warn() message
> - Changed probe sequence in brbe_attributes_probe()
> - Added 'brbcr' argument into capture_brbe_flags() to ascertain correct state
> - Disable BRBE before disabling the PMU event counter
> - Enable PERF_SAMPLE_BRANCH_HV filters when is_kernel_in_hyp_mode()
> - Guard armv8pmu_reset() & armv8pmu_sched_task() with arm_pmu_branch_stack_supported()
> 
> Changes in V6:
> 
> https://lore.kernel.org/linux-arm-kernel/20221208084402.863310-1-anshuman.khandual@arm.com/
> 
> - Restore the exception level privilege after reading the branch records
> - Unpause the buffer after reading the branch records
> - Decouple BRBCR_EL1_EXCEPTION/ERTN from perf event privilege level
> - Reworked BRBE implementation and branch stack sampling support on arm pmu
> - BRBE implementation is now part of overall ARMV8 PMU implementation
> - BRBE implementation moved from drivers/perf/ to inside arch/arm64/kernel/
> - CONFIG_ARM_BRBE_PMU renamed as CONFIG_ARM64_BRBE in arch/arm64/Kconfig
> - File moved - drivers/perf/arm_pmu_brbe.c -> arch/arm64/kernel/brbe.c
> - File moved - drivers/perf/arm_pmu_brbe.h -> arch/arm64/kernel/brbe.h
> - BRBE name has been dropped from struct arm_pmu and struct hw_pmu_events
> - BRBE name has been abstracted out as 'branches' in arm_pmu and hw_pmu_events
> - BRBE name has been abstracted out as 'branches' in ARMV8 PMU implementation
> - Added sched_task() callback into struct arm_pmu
> - Added 'hw_attr' into struct arm_pmu encapsulating possible PMU HW attributes
> - Dropped explicit attributes brbe_(v1p1, nr, cc, format) from struct arm_pmu
> - Dropped brbfcr, brbcr, registers scratch area from struct hw_pmu_events
> - Dropped brbe_users, brbe_context tracking in struct hw_pmu_events
> - Added 'features' tracking into struct arm_pmu with ARM_PMU_BRANCH_STACK flag
> - armpmu->hw_attr maps into 'struct brbe_hw_attr' inside BRBE implementation
> - Set ARM_PMU_BRANCH_STACK in 'arm_pmu->features' after successful BRBE probe
> - Added armv8pmu_branch_reset() inside armv8pmu_branch_enable()
> - Dropped brbe_supported() as events will be rejected via ARM_PMU_BRANCH_STACK
> - Dropped set_brbe_disabled() as well
> - Reformated armv8pmu_branch_valid() warnings while rejecting unsupported events
> 
> Changes in V5:
> 
> https://lore.kernel.org/linux-arm-kernel/20221107062514.2851047-1-anshuman.khandual@arm.com/
> 
> - Changed BRBCR_EL1.VIRTUAL from 0b1 to 0b01
> - Changed BRBFCR_EL1.EnL into BRBFCR_EL1.EnI
> - Changed config ARM_BRBE_PMU from 'tristate' to 'bool'
> 
> Changes in V4:
> 
> https://lore.kernel.org/all/20221017055713.451092-1-anshuman.khandual@arm.com/
> 
> - Changed ../tools/sysreg declarations as suggested
> - Set PERF_SAMPLE_BRANCH_STACK in data.sample_flags
> - Dropped perfmon_capable() check in armpmu_event_init()
> - s/pr_warn_once/pr_info in armpmu_event_init()
> - Added brbe_format element into struct pmu_hw_events
> - Changed v1p1 as brbe_v1p1 in struct pmu_hw_events
> - Dropped pr_info() from arm64_pmu_brbe_probe(), solved LOCKDEP warning
> 
> Changes in V3:
> 
> https://lore.kernel.org/all/20220929075857.158358-1-anshuman.khandual@arm.com/
> 
> - Moved brbe_stack from the stack and now dynamically allocated
> - Return PERF_BR_PRIV_UNKNOWN instead of -1 in brbe_fetch_perf_priv()
> - Moved BRBIDR0, BRBCR, BRBFCR registers and fields into tools/sysreg
> - Created dummy BRBINF_EL1 field definitions in tools/sysreg
> - Dropped ARMPMU_EVT_PRIV framework which cached perfmon_capable()
> - Both exception and exception return branche records are now captured
>   only if the event has PERF_SAMPLE_BRANCH_KERNEL which would already
>   been checked in generic perf via perf_allow_kernel()
> 
> Changes in V2:
> 
> https://lore.kernel.org/all/20220908051046.465307-1-anshuman.khandual@arm.com/
> 
> - Dropped branch sample filter helpers consolidation patch from this series 
> - Added new hw_perf_event.flags element ARMPMU_EVT_PRIV to cache perfmon_capable()
> - Use cached perfmon_capable() while configuring BRBE branch record filters
> 
> Changes in V1:
> 
> https://lore.kernel.org/linux-arm-kernel/20220613100119.684673-1-anshuman.khandual@arm.com/
> 
> - Added CONFIG_PERF_EVENTS wrapper for all branch sample filter helpers
> - Process new perf branch types via PERF_BR_EXTEND_ABI
> 
> Changes in RFC V2:
> 
> https://lore.kernel.org/linux-arm-kernel/20220412115455.293119-1-anshuman.khandual@arm.com/
> 
> - Added branch_sample_priv() while consolidating other branch sample filter helpers
> - Changed all SYS_BRBXXXN_EL1 register definition encodings per Marc
> - Changed the BRBE driver as per proposed BRBE related perf ABI changes (V5)
> - Added documentation for struct arm_pmu changes, updated commit message
> - Updated commit message for BRBE detection infrastructure patch
> - PERF_SAMPLE_BRANCH_KERNEL gets checked during arm event init (outside the driver)
> - Branch privilege state capture mechanism has now moved inside the driver
> 
> Changes in RFC V1:
> 
> https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: James Clark <james.clark@arm.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Suzuki Poulose <suzuki.poulose@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-perf-users@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> 
> Anshuman Khandual (6):
>   drivers: perf: arm_pmu: Add new sched_task() callback
>   arm64/perf: Add BRBE registers and fields
>   arm64/perf: Add branch stack support in struct arm_pmu
>   arm64/perf: Add branch stack support in struct pmu_hw_events
>   arm64/perf: Add branch stack support in ARMV8 PMU
>   arm64/perf: Enable branch stack events via FEAT_BRBE
> 
>  arch/arm64/Kconfig                  |  11 +
>  arch/arm64/include/asm/perf_event.h |  19 ++
>  arch/arm64/include/asm/sysreg.h     | 103 ++++++
>  arch/arm64/kernel/Makefile          |   1 +
>  arch/arm64/kernel/brbe.c            | 512 ++++++++++++++++++++++++++++
>  arch/arm64/kernel/brbe.h            | 257 ++++++++++++++
>  arch/arm64/kernel/perf_event.c      |  35 ++
>  arch/arm64/tools/sysreg             | 161 +++++++++
>  drivers/perf/arm_pmu.c              |  12 +-
>  include/linux/perf/arm_pmu.h        |  19 ++
>  10 files changed, 1128 insertions(+), 2 deletions(-)
>  create mode 100644 arch/arm64/kernel/brbe.c
>  create mode 100644 arch/arm64/kernel/brbe.h
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 0/6] arm64/perf: Enable branch stack sampling
@ 2023-01-06 10:23   ` James Clark
  0 siblings, 0 replies; 62+ messages in thread
From: James Clark @ 2023-01-06 10:23 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Catalin Marinas, Will Deacon, Mark Brown, Rob Herring,
	Marc Zyngier, Suzuki Poulose, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, linux-perf-users, linux-arm-kernel,
	linux-kernel, mark.rutland



On 05/01/2023 03:10, Anshuman Khandual wrote:
> This series enables perf branch stack sampling support on arm64 platform
> via a new arch feature called Branch Record Buffer Extension (BRBE). All
> relevant register definitions could be accessed here.
> 

Hi Anshuman,

The missing cc for linux-perf-users@vger.kernel.org on the other patches
means that this looks incomplete on the lore page for linux-perf-users.
b4 still picks up the full set, so it's probably fine. But it might be
worth adding the same cc for all patches next time.

Thanks
James

> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
> 
> This series applies on v6.2-r2.
> 
> Changes in V7:
> 
> - Folded [PATCH 7/7] into [PATCH 3/7] which enables branch stack sampling event
> - Defined BRBFCR_EL1_BRANCH_FILTERS, BRBCR_EL1_DEFAULT_CONFIG in the header
> - Defined BRBFCR_EL1_DEFAULT_CONFIG in the header
> - Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_FZP
> - Defined BRBCR_EL1_DEFAULT_TS in the header
> - Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_DEFAULT_TS
> - Moved BRBCR_EL1_DEFAULT_CONFIG check inside branch_type_to_brbcr()
> - Moved down BRBCR_EL1_CC, BRBCR_EL1_MPRED later in branch_type_to_brbcr()
> - Also set BRBE in paused state in armv8pmu_branch_disable()
> - Dropped brbe_paused(), set_brbe_paused() helpers
> - Extracted error string via branch_filter_error_msg[] for armv8pmu_branch_valid()
> - Replaced brbe_v1p1 with brbe_version in struct brbe_hw_attr
> - Added valid_brbe_[cc, format, version]() helpers
> - Split a separate brbe_attributes_probe() from armv8pmu_branch_probe()
> - Capture event->attr.branch_sample_type earlier in armv8pmu_branch_valid()
> - Defined enum brbe_bank_idx with possible values for BRBE bank indices
> - Changed armpmu->hw_attr into armpmu->private
> - Added missing space in stub definition for armv8pmu_branch_valid()
> - Replaced both kmalloc() with kzalloc()
> - Added BRBE_BANK_MAX_ENTRIES
> - Updated comment for capture_brbe_flags()
> - Updated comment for struct brbe_hw_attr
> - Dropped space after type cast in couple of places
> - Replaced inverse with negation for testing BRBCR_EL1_FZP in armv8pmu_branch_read()
> - Captured cpuc->branches->branch_entries[idx] in a local variable
> - Dropped saved_priv from armv8pmu_branch_read()
> - Reorganize PERF_SAMPLE_BRANCH_NO_[CYCLES|NO_FLAGS] related configuration
> - Replaced with FIELD_GET() and FIELD_PREP() wherever applicable
> - Replaced BRBCR_EL1_TS_PHYSICAL with BRBCR_EL1_TS_VIRTUAL
> - Moved valid_brbe_nr(), valid_brbe_cc(), valid_brbe_format(), valid_brbe_version()
>   select_brbe_bank(), select_brbe_bank_index() helpers inside the C implementation
> - Reorganized brbe_valid_nr() and dropped the pr_warn() message
> - Changed probe sequence in brbe_attributes_probe()
> - Added 'brbcr' argument into capture_brbe_flags() to ascertain correct state
> - Disable BRBE before disabling the PMU event counter
> - Enable PERF_SAMPLE_BRANCH_HV filters when is_kernel_in_hyp_mode()
> - Guard armv8pmu_reset() & armv8pmu_sched_task() with arm_pmu_branch_stack_supported()
> 
> Changes in V6:
> 
> https://lore.kernel.org/linux-arm-kernel/20221208084402.863310-1-anshuman.khandual@arm.com/
> 
> - Restore the exception level privilege after reading the branch records
> - Unpause the buffer after reading the branch records
> - Decouple BRBCR_EL1_EXCEPTION/ERTN from perf event privilege level
> - Reworked BRBE implementation and branch stack sampling support on arm pmu
> - BRBE implementation is now part of overall ARMV8 PMU implementation
> - BRBE implementation moved from drivers/perf/ to inside arch/arm64/kernel/
> - CONFIG_ARM_BRBE_PMU renamed as CONFIG_ARM64_BRBE in arch/arm64/Kconfig
> - File moved - drivers/perf/arm_pmu_brbe.c -> arch/arm64/kernel/brbe.c
> - File moved - drivers/perf/arm_pmu_brbe.h -> arch/arm64/kernel/brbe.h
> - BRBE name has been dropped from struct arm_pmu and struct hw_pmu_events
> - BRBE name has been abstracted out as 'branches' in arm_pmu and hw_pmu_events
> - BRBE name has been abstracted out as 'branches' in ARMV8 PMU implementation
> - Added sched_task() callback into struct arm_pmu
> - Added 'hw_attr' into struct arm_pmu encapsulating possible PMU HW attributes
> - Dropped explicit attributes brbe_(v1p1, nr, cc, format) from struct arm_pmu
> - Dropped brbfcr, brbcr, registers scratch area from struct hw_pmu_events
> - Dropped brbe_users, brbe_context tracking in struct hw_pmu_events
> - Added 'features' tracking into struct arm_pmu with ARM_PMU_BRANCH_STACK flag
> - armpmu->hw_attr maps into 'struct brbe_hw_attr' inside BRBE implementation
> - Set ARM_PMU_BRANCH_STACK in 'arm_pmu->features' after successful BRBE probe
> - Added armv8pmu_branch_reset() inside armv8pmu_branch_enable()
> - Dropped brbe_supported() as events will be rejected via ARM_PMU_BRANCH_STACK
> - Dropped set_brbe_disabled() as well
> - Reformated armv8pmu_branch_valid() warnings while rejecting unsupported events
> 
> Changes in V5:
> 
> https://lore.kernel.org/linux-arm-kernel/20221107062514.2851047-1-anshuman.khandual@arm.com/
> 
> - Changed BRBCR_EL1.VIRTUAL from 0b1 to 0b01
> - Changed BRBFCR_EL1.EnL into BRBFCR_EL1.EnI
> - Changed config ARM_BRBE_PMU from 'tristate' to 'bool'
> 
> Changes in V4:
> 
> https://lore.kernel.org/all/20221017055713.451092-1-anshuman.khandual@arm.com/
> 
> - Changed ../tools/sysreg declarations as suggested
> - Set PERF_SAMPLE_BRANCH_STACK in data.sample_flags
> - Dropped perfmon_capable() check in armpmu_event_init()
> - s/pr_warn_once/pr_info in armpmu_event_init()
> - Added brbe_format element into struct pmu_hw_events
> - Changed v1p1 as brbe_v1p1 in struct pmu_hw_events
> - Dropped pr_info() from arm64_pmu_brbe_probe(), solved LOCKDEP warning
> 
> Changes in V3:
> 
> https://lore.kernel.org/all/20220929075857.158358-1-anshuman.khandual@arm.com/
> 
> - Moved brbe_stack from the stack and now dynamically allocated
> - Return PERF_BR_PRIV_UNKNOWN instead of -1 in brbe_fetch_perf_priv()
> - Moved BRBIDR0, BRBCR, BRBFCR registers and fields into tools/sysreg
> - Created dummy BRBINF_EL1 field definitions in tools/sysreg
> - Dropped ARMPMU_EVT_PRIV framework which cached perfmon_capable()
> - Both exception and exception return branche records are now captured
>   only if the event has PERF_SAMPLE_BRANCH_KERNEL which would already
>   been checked in generic perf via perf_allow_kernel()
> 
> Changes in V2:
> 
> https://lore.kernel.org/all/20220908051046.465307-1-anshuman.khandual@arm.com/
> 
> - Dropped branch sample filter helpers consolidation patch from this series 
> - Added new hw_perf_event.flags element ARMPMU_EVT_PRIV to cache perfmon_capable()
> - Use cached perfmon_capable() while configuring BRBE branch record filters
> 
> Changes in V1:
> 
> https://lore.kernel.org/linux-arm-kernel/20220613100119.684673-1-anshuman.khandual@arm.com/
> 
> - Added CONFIG_PERF_EVENTS wrapper for all branch sample filter helpers
> - Process new perf branch types via PERF_BR_EXTEND_ABI
> 
> Changes in RFC V2:
> 
> https://lore.kernel.org/linux-arm-kernel/20220412115455.293119-1-anshuman.khandual@arm.com/
> 
> - Added branch_sample_priv() while consolidating other branch sample filter helpers
> - Changed all SYS_BRBXXXN_EL1 register definition encodings per Marc
> - Changed the BRBE driver as per proposed BRBE related perf ABI changes (V5)
> - Added documentation for struct arm_pmu changes, updated commit message
> - Updated commit message for BRBE detection infrastructure patch
> - PERF_SAMPLE_BRANCH_KERNEL gets checked during arm event init (outside the driver)
> - Branch privilege state capture mechanism has now moved inside the driver
> 
> Changes in RFC V1:
> 
> https://lore.kernel.org/all/1642998653-21377-1-git-send-email-anshuman.khandual@arm.com/
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: James Clark <james.clark@arm.com>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Suzuki Poulose <suzuki.poulose@arm.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-perf-users@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> 
> Anshuman Khandual (6):
>   drivers: perf: arm_pmu: Add new sched_task() callback
>   arm64/perf: Add BRBE registers and fields
>   arm64/perf: Add branch stack support in struct arm_pmu
>   arm64/perf: Add branch stack support in struct pmu_hw_events
>   arm64/perf: Add branch stack support in ARMV8 PMU
>   arm64/perf: Enable branch stack events via FEAT_BRBE
> 
>  arch/arm64/Kconfig                  |  11 +
>  arch/arm64/include/asm/perf_event.h |  19 ++
>  arch/arm64/include/asm/sysreg.h     | 103 ++++++
>  arch/arm64/kernel/Makefile          |   1 +
>  arch/arm64/kernel/brbe.c            | 512 ++++++++++++++++++++++++++++
>  arch/arm64/kernel/brbe.h            | 257 ++++++++++++++
>  arch/arm64/kernel/perf_event.c      |  35 ++
>  arch/arm64/tools/sysreg             | 161 +++++++++
>  drivers/perf/arm_pmu.c              |  12 +-
>  include/linux/perf/arm_pmu.h        |  19 ++
>  10 files changed, 1128 insertions(+), 2 deletions(-)
>  create mode 100644 arch/arm64/kernel/brbe.c
>  create mode 100644 arch/arm64/kernel/brbe.h
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 0/6] arm64/perf: Enable branch stack sampling
  2023-01-06 10:23   ` James Clark
@ 2023-01-06 11:13     ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-06 11:13 UTC (permalink / raw)
  To: James Clark
  Cc: Catalin Marinas, Will Deacon, Mark Brown, Rob Herring,
	Marc Zyngier, Suzuki Poulose, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, linux-perf-users, linux-arm-kernel,
	linux-kernel, mark.rutland



On 1/6/23 15:53, James Clark wrote:
> 
> On 05/01/2023 03:10, Anshuman Khandual wrote:
>> This series enables perf branch stack sampling support on arm64 platform
>> via a new arch feature called Branch Record Buffer Extension (BRBE). All
>> relevant register definitions could be accessed here.
>>
> Hi Anshuman,
> 
> The missing cc for linux-perf-users@vger.kernel.org on the other patches
> means that this looks incomplete on the lore page for linux-perf-users.
> b4 still picks up the full set, so it's probably fine. But it might be
> worth adding the same cc for all patches next time.

Right, actually forgot to add cc-cover option while sending via git.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 0/6] arm64/perf: Enable branch stack sampling
@ 2023-01-06 11:13     ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-06 11:13 UTC (permalink / raw)
  To: James Clark
  Cc: Catalin Marinas, Will Deacon, Mark Brown, Rob Herring,
	Marc Zyngier, Suzuki Poulose, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, linux-perf-users, linux-arm-kernel,
	linux-kernel, mark.rutland



On 1/6/23 15:53, James Clark wrote:
> 
> On 05/01/2023 03:10, Anshuman Khandual wrote:
>> This series enables perf branch stack sampling support on arm64 platform
>> via a new arch feature called Branch Record Buffer Extension (BRBE). All
>> relevant register definitions could be accessed here.
>>
> Hi Anshuman,
> 
> The missing cc for linux-perf-users@vger.kernel.org on the other patches
> means that this looks incomplete on the lore page for linux-perf-users.
> b4 still picks up the full set, so it's probably fine. But it might be
> worth adding the same cc for all patches next time.

Right, actually forgot to add cc-cover option while sending via git.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 0/6] arm64/perf: Enable branch stack sampling
  2023-01-05  3:10 ` Anshuman Khandual
@ 2023-01-11  5:05   ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-11  5:05 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Catalin Marinas, Will Deacon, Mark Brown, James Clark,
	Rob Herring, Marc Zyngier, Suzuki Poulose, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, linux-perf-users



On 1/5/23 08:40, Anshuman Khandual wrote:
> This series enables perf branch stack sampling support on arm64 platform
> via a new arch feature called Branch Record Buffer Extension (BRBE). All
> relevant register definitions could be accessed here.
> 
> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
> 
> This series applies on v6.2-r2.
> 
> Changes in V7:
> 
> - Folded [PATCH 7/7] into [PATCH 3/7] which enables branch stack sampling event
> - Defined BRBFCR_EL1_BRANCH_FILTERS, BRBCR_EL1_DEFAULT_CONFIG in the header
> - Defined BRBFCR_EL1_DEFAULT_CONFIG in the header
> - Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_FZP
> - Defined BRBCR_EL1_DEFAULT_TS in the header
> - Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_DEFAULT_TS
> - Moved BRBCR_EL1_DEFAULT_CONFIG check inside branch_type_to_brbcr()
> - Moved down BRBCR_EL1_CC, BRBCR_EL1_MPRED later in branch_type_to_brbcr()
> - Also set BRBE in paused state in armv8pmu_branch_disable()
> - Dropped brbe_paused(), set_brbe_paused() helpers
> - Extracted error string via branch_filter_error_msg[] for armv8pmu_branch_valid()
> - Replaced brbe_v1p1 with brbe_version in struct brbe_hw_attr
> - Added valid_brbe_[cc, format, version]() helpers
> - Split a separate brbe_attributes_probe() from armv8pmu_branch_probe()
> - Capture event->attr.branch_sample_type earlier in armv8pmu_branch_valid()
> - Defined enum brbe_bank_idx with possible values for BRBE bank indices
> - Changed armpmu->hw_attr into armpmu->private
> - Added missing space in stub definition for armv8pmu_branch_valid()
> - Replaced both kmalloc() with kzalloc()
> - Added BRBE_BANK_MAX_ENTRIES
> - Updated comment for capture_brbe_flags()
> - Updated comment for struct brbe_hw_attr
> - Dropped space after type cast in couple of places
> - Replaced inverse with negation for testing BRBCR_EL1_FZP in armv8pmu_branch_read()
> - Captured cpuc->branches->branch_entries[idx] in a local variable
> - Dropped saved_priv from armv8pmu_branch_read()
> - Reorganize PERF_SAMPLE_BRANCH_NO_[CYCLES|NO_FLAGS] related configuration
> - Replaced with FIELD_GET() and FIELD_PREP() wherever applicable
> - Replaced BRBCR_EL1_TS_PHYSICAL with BRBCR_EL1_TS_VIRTUAL
> - Moved valid_brbe_nr(), valid_brbe_cc(), valid_brbe_format(), valid_brbe_version()
>   select_brbe_bank(), select_brbe_bank_index() helpers inside the C implementation
> - Reorganized brbe_valid_nr() and dropped the pr_warn() message
> - Changed probe sequence in brbe_attributes_probe()
> - Added 'brbcr' argument into capture_brbe_flags() to ascertain correct state
> - Disable BRBE before disabling the PMU event counter
> - Enable PERF_SAMPLE_BRANCH_HV filters when is_kernel_in_hyp_mode()
> - Guard armv8pmu_reset() & armv8pmu_sched_task() with arm_pmu_branch_stack_supported()

Gentle ping, any updates on this series ?

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 0/6] arm64/perf: Enable branch stack sampling
@ 2023-01-11  5:05   ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-11  5:05 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, mark.rutland
  Cc: Catalin Marinas, Will Deacon, Mark Brown, James Clark,
	Rob Herring, Marc Zyngier, Suzuki Poulose, Peter Zijlstra,
	Ingo Molnar, Arnaldo Carvalho de Melo, linux-perf-users



On 1/5/23 08:40, Anshuman Khandual wrote:
> This series enables perf branch stack sampling support on arm64 platform
> via a new arch feature called Branch Record Buffer Extension (BRBE). All
> relevant register definitions could be accessed here.
> 
> https://developer.arm.com/documentation/ddi0601/2021-12/AArch64-Registers
> 
> This series applies on v6.2-r2.
> 
> Changes in V7:
> 
> - Folded [PATCH 7/7] into [PATCH 3/7] which enables branch stack sampling event
> - Defined BRBFCR_EL1_BRANCH_FILTERS, BRBCR_EL1_DEFAULT_CONFIG in the header
> - Defined BRBFCR_EL1_DEFAULT_CONFIG in the header
> - Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_FZP
> - Defined BRBCR_EL1_DEFAULT_TS in the header
> - Updated BRBCR_EL1_DEFAULT_CONFIG with BRBCR_EL1_DEFAULT_TS
> - Moved BRBCR_EL1_DEFAULT_CONFIG check inside branch_type_to_brbcr()
> - Moved down BRBCR_EL1_CC, BRBCR_EL1_MPRED later in branch_type_to_brbcr()
> - Also set BRBE in paused state in armv8pmu_branch_disable()
> - Dropped brbe_paused(), set_brbe_paused() helpers
> - Extracted error string via branch_filter_error_msg[] for armv8pmu_branch_valid()
> - Replaced brbe_v1p1 with brbe_version in struct brbe_hw_attr
> - Added valid_brbe_[cc, format, version]() helpers
> - Split a separate brbe_attributes_probe() from armv8pmu_branch_probe()
> - Capture event->attr.branch_sample_type earlier in armv8pmu_branch_valid()
> - Defined enum brbe_bank_idx with possible values for BRBE bank indices
> - Changed armpmu->hw_attr into armpmu->private
> - Added missing space in stub definition for armv8pmu_branch_valid()
> - Replaced both kmalloc() with kzalloc()
> - Added BRBE_BANK_MAX_ENTRIES
> - Updated comment for capture_brbe_flags()
> - Updated comment for struct brbe_hw_attr
> - Dropped space after type cast in couple of places
> - Replaced inverse with negation for testing BRBCR_EL1_FZP in armv8pmu_branch_read()
> - Captured cpuc->branches->branch_entries[idx] in a local variable
> - Dropped saved_priv from armv8pmu_branch_read()
> - Reorganize PERF_SAMPLE_BRANCH_NO_[CYCLES|NO_FLAGS] related configuration
> - Replaced with FIELD_GET() and FIELD_PREP() wherever applicable
> - Replaced BRBCR_EL1_TS_PHYSICAL with BRBCR_EL1_TS_VIRTUAL
> - Moved valid_brbe_nr(), valid_brbe_cc(), valid_brbe_format(), valid_brbe_version()
>   select_brbe_bank(), select_brbe_bank_index() helpers inside the C implementation
> - Reorganized brbe_valid_nr() and dropped the pr_warn() message
> - Changed probe sequence in brbe_attributes_probe()
> - Added 'brbcr' argument into capture_brbe_flags() to ascertain correct state
> - Disable BRBE before disabling the PMU event counter
> - Enable PERF_SAMPLE_BRANCH_HV filters when is_kernel_in_hyp_mode()
> - Guard armv8pmu_reset() & armv8pmu_sched_task() with arm_pmu_branch_stack_supported()

Gentle ping, any updates on this series ?

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
  2023-01-05  3:10   ` Anshuman Khandual
@ 2023-01-12 13:24     ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-01-12 13:24 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Marc Zyngier, Mark Brown

Hi Anshuman,

On Thu, Jan 05, 2023 at 08:40:35AM +0530, Anshuman Khandual wrote:
> This adds BRBE related register definitions and various other related field
> macros there in. These will be used subsequently in a BRBE driver which is
> being added later on.

I haven't verified the specific values, but this looks good to me aside from
one minor nit below.

[...]

> +# This is just a dummy register declaration to get all common field masks and
> +# shifts for accessing given BRBINF contents.
> +Sysreg	BRBINF_EL1	2	1	8	0	0

We don't need a dummy declaration, as we have 'SysregFields' that can be used
for this, e.g.

  SysregFields BRBINFx_EL1
  ...
  EndSysregFields

... which will avoid accidental usage of the register encoding. Note that I've
also added an 'x' there in place of the index, which we do for other registers,
e.g. TTBRx_EL1.

Could you please update to that?

With that:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
@ 2023-01-12 13:24     ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-01-12 13:24 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Marc Zyngier, Mark Brown

Hi Anshuman,

On Thu, Jan 05, 2023 at 08:40:35AM +0530, Anshuman Khandual wrote:
> This adds BRBE related register definitions and various other related field
> macros there in. These will be used subsequently in a BRBE driver which is
> being added later on.

I haven't verified the specific values, but this looks good to me aside from
one minor nit below.

[...]

> +# This is just a dummy register declaration to get all common field masks and
> +# shifts for accessing given BRBINF contents.
> +Sysreg	BRBINF_EL1	2	1	8	0	0

We don't need a dummy declaration, as we have 'SysregFields' that can be used
for this, e.g.

  SysregFields BRBINFx_EL1
  ...
  EndSysregFields

... which will avoid accidental usage of the register encoding. Note that I've
also added an 'x' there in place of the index, which we do for other registers,
e.g. TTBRx_EL1.

Could you please update to that?

With that:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 3/6] arm64/perf: Add branch stack support in struct arm_pmu
  2023-01-05  3:10   ` Anshuman Khandual
@ 2023-01-12 13:54     ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-01-12 13:54 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Thu, Jan 05, 2023 at 08:40:36AM +0530, Anshuman Khandual wrote:
> This updates 'struct arm_pmu' for branch stack sampling support later. This
> adds a new 'features' element in the structure to track supported features,
> and another 'private' element to encapsulate implementation attributes on a
> given 'struct arm_pmu'. These updates here will help in tracking any branch
> stack sampling support, which is being added later. This also adds a helper
> arm_pmu_branch_stack_supported().
> 
> This also enables perf branch stack sampling event on all 'struct arm pmu',
> supporting the feature but after removing the current gate that blocks such
> events unconditionally in armpmu_event_init(). Instead a quick probe can be
> initiated via arm_pmu_branch_stack_supported() to ascertain the support.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  drivers/perf/arm_pmu.c       | 3 +--
>  include/linux/perf/arm_pmu.h | 9 +++++++++
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 14a3ed3bdb0b..a85b2d67022e 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -510,8 +510,7 @@ static int armpmu_event_init(struct perf_event *event)
>  		!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
>  		return -ENOENT;
>  
> -	/* does not support taken branch sampling */
> -	if (has_branch_stack(event))
> +	if (has_branch_stack(event) && !arm_pmu_branch_stack_supported(armpmu))
>  		return -EOPNOTSUPP;
>  
>  	return __hw_perf_event_init(event);
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 2a9d07cee927..64e1b2594025 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -80,11 +80,14 @@ enum armpmu_attr_groups {
>  	ARMPMU_NR_ATTR_GROUPS
>  };
>  
> +#define ARM_PMU_BRANCH_STACK	BIT(0)
> +
>  struct arm_pmu {
>  	struct pmu	pmu;
>  	cpumask_t	supported_cpus;
>  	char		*name;
>  	int		pmuver;
> +	int		features;
>  	irqreturn_t	(*handle_irq)(struct arm_pmu *pmu);
>  	void		(*enable)(struct perf_event *event);
>  	void		(*disable)(struct perf_event *event);

Hmm, we already have the secure_access field separately. How about we fold that
in and go with:

	unsigned int	secure_access    : 1,
			has_branch_stack : 1;

... that way we have one way to manage flags, we don't need to allocate the
bits, and the bulk of the existing code for secure_access can stay as-is.

> @@ -119,8 +122,14 @@ struct arm_pmu {
>  
>  	/* Only to be used by ACPI probing code */
>  	unsigned long acpi_cpuid;
> +	void		*private;

Does this need to be on the end of struct arm_pmu, or can it be placed earlier?

The line spacing makes it look like the ACPI comment applies to 'private',
which isn't the case.

>  };
>  
> +static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
> +{
> +	return armpmu->features & ARM_PMU_BRANCH_STACK;
> +}

With the above, this would become:

static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
{
	return armpmu->has_branch_stack;
}

Thanks,
Mark.

> +
>  #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
>  
>  u64 armpmu_event_update(struct perf_event *event);
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 3/6] arm64/perf: Add branch stack support in struct arm_pmu
@ 2023-01-12 13:54     ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-01-12 13:54 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Thu, Jan 05, 2023 at 08:40:36AM +0530, Anshuman Khandual wrote:
> This updates 'struct arm_pmu' for branch stack sampling support later. This
> adds a new 'features' element in the structure to track supported features,
> and another 'private' element to encapsulate implementation attributes on a
> given 'struct arm_pmu'. These updates here will help in tracking any branch
> stack sampling support, which is being added later. This also adds a helper
> arm_pmu_branch_stack_supported().
> 
> This also enables perf branch stack sampling event on all 'struct arm pmu',
> supporting the feature but after removing the current gate that blocks such
> events unconditionally in armpmu_event_init(). Instead a quick probe can be
> initiated via arm_pmu_branch_stack_supported() to ascertain the support.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  drivers/perf/arm_pmu.c       | 3 +--
>  include/linux/perf/arm_pmu.h | 9 +++++++++
>  2 files changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 14a3ed3bdb0b..a85b2d67022e 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -510,8 +510,7 @@ static int armpmu_event_init(struct perf_event *event)
>  		!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
>  		return -ENOENT;
>  
> -	/* does not support taken branch sampling */
> -	if (has_branch_stack(event))
> +	if (has_branch_stack(event) && !arm_pmu_branch_stack_supported(armpmu))
>  		return -EOPNOTSUPP;
>  
>  	return __hw_perf_event_init(event);
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 2a9d07cee927..64e1b2594025 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -80,11 +80,14 @@ enum armpmu_attr_groups {
>  	ARMPMU_NR_ATTR_GROUPS
>  };
>  
> +#define ARM_PMU_BRANCH_STACK	BIT(0)
> +
>  struct arm_pmu {
>  	struct pmu	pmu;
>  	cpumask_t	supported_cpus;
>  	char		*name;
>  	int		pmuver;
> +	int		features;
>  	irqreturn_t	(*handle_irq)(struct arm_pmu *pmu);
>  	void		(*enable)(struct perf_event *event);
>  	void		(*disable)(struct perf_event *event);

Hmm, we already have the secure_access field separately. How about we fold that
in and go with:

	unsigned int	secure_access    : 1,
			has_branch_stack : 1;

... that way we have one way to manage flags, we don't need to allocate the
bits, and the bulk of the existing code for secure_access can stay as-is.

> @@ -119,8 +122,14 @@ struct arm_pmu {
>  
>  	/* Only to be used by ACPI probing code */
>  	unsigned long acpi_cpuid;
> +	void		*private;

Does this need to be on the end of struct arm_pmu, or can it be placed earlier?

The line spacing makes it look like the ACPI comment applies to 'private',
which isn't the case.

>  };
>  
> +static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
> +{
> +	return armpmu->features & ARM_PMU_BRANCH_STACK;
> +}

With the above, this would become:

static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
{
	return armpmu->has_branch_stack;
}

Thanks,
Mark.

> +
>  #define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
>  
>  u64 armpmu_event_update(struct perf_event *event);
> -- 
> 2.25.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 4/6] arm64/perf: Add branch stack support in struct pmu_hw_events
  2023-01-05  3:10   ` Anshuman Khandual
@ 2023-01-12 13:59     ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-01-12 13:59 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Thu, Jan 05, 2023 at 08:40:37AM +0530, Anshuman Khandual wrote:
> This adds branch records buffer pointer in 'struct pmu_hw_events' which can
> be used to capture branch records during PMU interrupt. This percpu pointer
> here needs to be allocated first before usage.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>

This looks fine, but it's difficult to review this without seeing how its
lifetime is managed, and it feels like this should be folded into a subsequent
patch which manages that.

Thanks,
Mark.

> ---
>  include/linux/perf/arm_pmu.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 64e1b2594025..9184f9b33740 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -44,6 +44,13 @@ static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_47BIT) == ARMPMU_EVT_47BIT);
>  	},								\
>  }
>  
> +#define MAX_BRANCH_RECORDS 64
> +
> +struct branch_records {
> +	struct perf_branch_stack	branch_stack;
> +	struct perf_branch_entry	branch_entries[MAX_BRANCH_RECORDS];
> +};
> +
>  /* The events for a given PMU register set. */
>  struct pmu_hw_events {
>  	/*
> @@ -70,6 +77,8 @@ struct pmu_hw_events {
>  	struct arm_pmu		*percpu_pmu;
>  
>  	int irq;
> +
> +	struct branch_records	*branches;
>  };
>  
>  enum armpmu_attr_groups {
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 4/6] arm64/perf: Add branch stack support in struct pmu_hw_events
@ 2023-01-12 13:59     ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-01-12 13:59 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Thu, Jan 05, 2023 at 08:40:37AM +0530, Anshuman Khandual wrote:
> This adds branch records buffer pointer in 'struct pmu_hw_events' which can
> be used to capture branch records during PMU interrupt. This percpu pointer
> here needs to be allocated first before usage.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>

This looks fine, but it's difficult to review this without seeing how its
lifetime is managed, and it feels like this should be folded into a subsequent
patch which manages that.

Thanks,
Mark.

> ---
>  include/linux/perf/arm_pmu.h | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 64e1b2594025..9184f9b33740 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -44,6 +44,13 @@ static_assert((PERF_EVENT_FLAG_ARCH & ARMPMU_EVT_47BIT) == ARMPMU_EVT_47BIT);
>  	},								\
>  }
>  
> +#define MAX_BRANCH_RECORDS 64
> +
> +struct branch_records {
> +	struct perf_branch_stack	branch_stack;
> +	struct perf_branch_entry	branch_entries[MAX_BRANCH_RECORDS];
> +};
> +
>  /* The events for a given PMU register set. */
>  struct pmu_hw_events {
>  	/*
> @@ -70,6 +77,8 @@ struct pmu_hw_events {
>  	struct arm_pmu		*percpu_pmu;
>  
>  	int irq;
> +
> +	struct branch_records	*branches;
>  };
>  
>  enum armpmu_attr_groups {
> -- 
> 2.25.1
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
  2023-01-05  3:10   ` Anshuman Khandual
@ 2023-01-12 14:29     ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-01-12 14:29 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
> This enables support for branch stack sampling event in ARMV8 PMU, checking
> has_branch_stack() on the event inside 'struct arm_pmu' callbacks. Although
> these branch stack helpers armv8pmu_branch_XXXXX() are just dummy functions
> for now. While here, this also defines arm_pmu's sched_task() callback with
> armv8pmu_sched_task(), which resets the branch record buffer on a sched_in.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/arm64/include/asm/perf_event.h | 10 +++++++++
>  arch/arm64/kernel/perf_event.c      | 35 +++++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
> index 3eaf462f5752..a038902d6874 100644
> --- a/arch/arm64/include/asm/perf_event.h
> +++ b/arch/arm64/include/asm/perf_event.h
> @@ -273,4 +273,14 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
>  	(regs)->pstate = PSR_MODE_EL1h;	\
>  }
>  
> +struct pmu_hw_events;
> +struct arm_pmu;
> +struct perf_event;
> +
> +static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
> +static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
> +static inline void armv8pmu_branch_enable(struct perf_event *event) { }
> +static inline void armv8pmu_branch_disable(struct perf_event *event) { }
> +static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
> +static inline void armv8pmu_branch_reset(void) { }

As far as I can tell, these are not supposed to be called when
!has_branch_stack(), so it would be good if these had a WARN() or similar to
spot buggy usage.

>  #endif
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index a5193f2146a6..8805b4516088 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -789,10 +789,22 @@ static void armv8pmu_enable_event(struct perf_event *event)
>  	 * Enable counter
>  	 */
>  	armv8pmu_enable_event_counter(event);
> +
> +	/*
> +	 * Enable BRBE
> +	 */
> +	if (has_branch_stack(event))
> +		armv8pmu_branch_enable(event);
>  }

This looks fine, but tbh I think we should delete the existing comments above
each function call as they're blindingly obvious and just waste space.

>  static void armv8pmu_disable_event(struct perf_event *event)
>  {
> +	/*
> +	 * Disable BRBE
> +	 */
> +	if (has_branch_stack(event))
> +		armv8pmu_branch_disable(event);
> +

Likewise here.

>  	/*
>  	 * Disable counter
>  	 */
> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>  		if (!armpmu_event_set_period(event))
>  			continue;
>  
> +		if (has_branch_stack(event)) {
> +			WARN_ON(!cpuc->branches);
> +			armv8pmu_branch_read(cpuc, event);
> +			data.br_stack = &cpuc->branches->branch_stack;
> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> +		}

How do we ensure the data we're getting isn't changed under our feet? Is BRBE
disabled at this point?

Is this going to have branches after taking the exception, or does BRBE stop
automatically at that point? If so we presumably need to take special care as
to when we read this relative to enabling/disabling and/or manipulating the
overflow bits.

> +
>  		/*
>  		 * Perf event overflow will queue the processing of the event as
>  		 * an irq_work which will be taken care of in the handling of
> @@ -976,6 +995,14 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
>  	return event->hw.idx;
>  }
>  
> +static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
> +{
> +	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
> +
> +	if (sched_in && arm_pmu_branch_stack_supported(armpmu))
> +		armv8pmu_branch_reset();
> +}

When scheduling out, shouldn't we save what we have so far?

It seems odd that we just throw that away rather than placing it into a FIFO.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
@ 2023-01-12 14:29     ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-01-12 14:29 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
> This enables support for branch stack sampling event in ARMV8 PMU, checking
> has_branch_stack() on the event inside 'struct arm_pmu' callbacks. Although
> these branch stack helpers armv8pmu_branch_XXXXX() are just dummy functions
> for now. While here, this also defines arm_pmu's sched_task() callback with
> armv8pmu_sched_task(), which resets the branch record buffer on a sched_in.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/arm64/include/asm/perf_event.h | 10 +++++++++
>  arch/arm64/kernel/perf_event.c      | 35 +++++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
> index 3eaf462f5752..a038902d6874 100644
> --- a/arch/arm64/include/asm/perf_event.h
> +++ b/arch/arm64/include/asm/perf_event.h
> @@ -273,4 +273,14 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
>  	(regs)->pstate = PSR_MODE_EL1h;	\
>  }
>  
> +struct pmu_hw_events;
> +struct arm_pmu;
> +struct perf_event;
> +
> +static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
> +static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
> +static inline void armv8pmu_branch_enable(struct perf_event *event) { }
> +static inline void armv8pmu_branch_disable(struct perf_event *event) { }
> +static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
> +static inline void armv8pmu_branch_reset(void) { }

As far as I can tell, these are not supposed to be called when
!has_branch_stack(), so it would be good if these had a WARN() or similar to
spot buggy usage.

>  #endif
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index a5193f2146a6..8805b4516088 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -789,10 +789,22 @@ static void armv8pmu_enable_event(struct perf_event *event)
>  	 * Enable counter
>  	 */
>  	armv8pmu_enable_event_counter(event);
> +
> +	/*
> +	 * Enable BRBE
> +	 */
> +	if (has_branch_stack(event))
> +		armv8pmu_branch_enable(event);
>  }

This looks fine, but tbh I think we should delete the existing comments above
each function call as they're blindingly obvious and just waste space.

>  static void armv8pmu_disable_event(struct perf_event *event)
>  {
> +	/*
> +	 * Disable BRBE
> +	 */
> +	if (has_branch_stack(event))
> +		armv8pmu_branch_disable(event);
> +

Likewise here.

>  	/*
>  	 * Disable counter
>  	 */
> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>  		if (!armpmu_event_set_period(event))
>  			continue;
>  
> +		if (has_branch_stack(event)) {
> +			WARN_ON(!cpuc->branches);
> +			armv8pmu_branch_read(cpuc, event);
> +			data.br_stack = &cpuc->branches->branch_stack;
> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> +		}

How do we ensure the data we're getting isn't changed under our feet? Is BRBE
disabled at this point?

Is this going to have branches after taking the exception, or does BRBE stop
automatically at that point? If so we presumably need to take special care as
to when we read this relative to enabling/disabling and/or manipulating the
overflow bits.

> +
>  		/*
>  		 * Perf event overflow will queue the processing of the event as
>  		 * an irq_work which will be taken care of in the handling of
> @@ -976,6 +995,14 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
>  	return event->hw.idx;
>  }
>  
> +static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
> +{
> +	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
> +
> +	if (sched_in && arm_pmu_branch_stack_supported(armpmu))
> +		armv8pmu_branch_reset();
> +}

When scheduling out, shouldn't we save what we have so far?

It seems odd that we just throw that away rather than placing it into a FIFO.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
  2023-01-05  3:10   ` Anshuman Khandual
@ 2023-01-12 16:51     ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-01-12 16:51 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Thu, Jan 05, 2023 at 08:40:39AM +0530, Anshuman Khandual wrote:
> This enables branch stack sampling events in ARMV8 PMU, via an architecture
> feature FEAT_BRBE aka branch record buffer extension. This defines required
> branch helper functions pmuv8pmu_branch_XXXXX() and the implementation here
> is wrapped with a new config option CONFIG_ARM64_BRBE.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/arm64/Kconfig                  |  11 +
>  arch/arm64/include/asm/perf_event.h |   9 +
>  arch/arm64/kernel/Makefile          |   1 +
>  arch/arm64/kernel/brbe.c            | 512 ++++++++++++++++++++++++++++
>  arch/arm64/kernel/brbe.h            | 257 ++++++++++++++
>  5 files changed, 790 insertions(+)
>  create mode 100644 arch/arm64/kernel/brbe.c
>  create mode 100644 arch/arm64/kernel/brbe.h
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 03934808b2ed..915b12709a46 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1363,6 +1363,17 @@ config HW_PERF_EVENTS
>  	def_bool y
>  	depends on ARM_PMU
>  
> +config ARM64_BRBE
> +	bool "Enable support for Branch Record Buffer Extension (BRBE)"
> +	depends on PERF_EVENTS && ARM64 && ARM_PMU
> +	default y
> +	help
> +	  Enable perf support for Branch Record Buffer Extension (BRBE) which
> +	  records all branches taken in an execution path. This supports some
> +	  branch types and privilege based filtering. It captured additional
> +	  relevant information such as cycle count, misprediction and branch
> +	  type, branch privilege level etc.
> +
>  # Supported by clang >= 7.0 or GCC >= 12.0.0
>  config CC_HAVE_SHADOW_CALL_STACK
>  	def_bool $(cc-option, -fsanitize=shadow-call-stack -ffixed-x18)
> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
> index a038902d6874..cf2e88c7b707 100644
> --- a/arch/arm64/include/asm/perf_event.h
> +++ b/arch/arm64/include/asm/perf_event.h
> @@ -277,6 +277,14 @@ struct pmu_hw_events;
>  struct arm_pmu;
>  struct perf_event;
>  
> +#ifdef CONFIG_ARM64_BRBE
> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event);
> +bool armv8pmu_branch_valid(struct perf_event *event);
> +void armv8pmu_branch_enable(struct perf_event *event);
> +void armv8pmu_branch_disable(struct perf_event *event);
> +void armv8pmu_branch_probe(struct arm_pmu *arm_pmu);
> +void armv8pmu_branch_reset(void);
> +#else
>  static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
>  static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
>  static inline void armv8pmu_branch_enable(struct perf_event *event) { }
> @@ -284,3 +292,4 @@ static inline void armv8pmu_branch_disable(struct perf_event *event) { }
>  static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
>  static inline void armv8pmu_branch_reset(void) { }
>  #endif
> +#endif
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index ceba6792f5b3..6ee7ccb61621 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)			+= module.o
>  obj-$(CONFIG_ARM64_MODULE_PLTS)		+= module-plts.o
>  obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o perf_callchain.o
>  obj-$(CONFIG_HW_PERF_EVENTS)		+= perf_event.o
> +obj-$(CONFIG_ARM64_BRBE)		+= brbe.o
>  obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
>  obj-$(CONFIG_CPU_PM)			+= sleep.o suspend.o
>  obj-$(CONFIG_CPU_IDLE)			+= cpuidle.o
> diff --git a/arch/arm64/kernel/brbe.c b/arch/arm64/kernel/brbe.c
> new file mode 100644
> index 000000000000..cd03d3531e04
> --- /dev/null
> +++ b/arch/arm64/kernel/brbe.c
> @@ -0,0 +1,512 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Branch Record Buffer Extension Driver.
> + *
> + * Copyright (C) 2022 ARM Limited
> + *
> + * Author: Anshuman Khandual <anshuman.khandual@arm.com>
> + */
> +#include "brbe.h"
> +
> +static bool valid_brbe_nr(int brbe_nr)
> +{
> +	return brbe_nr == BRBIDR0_EL1_NUMREC_8 ||
> +	       brbe_nr == BRBIDR0_EL1_NUMREC_16 ||
> +	       brbe_nr == BRBIDR0_EL1_NUMREC_32 ||
> +	       brbe_nr == BRBIDR0_EL1_NUMREC_64;
> +}
> +
> +static bool valid_brbe_cc(int brbe_cc)
> +{
> +	return brbe_cc == BRBIDR0_EL1_CC_20_BIT;
> +}
> +
> +static bool valid_brbe_format(int brbe_format)
> +{
> +	return brbe_format == BRBIDR0_EL1_FORMAT_0;
> +}
> +
> +static bool valid_brbe_version(int brbe_version)
> +{
> +	return brbe_version == ID_AA64DFR0_EL1_BRBE_IMP ||
> +	       brbe_version == ID_AA64DFR0_EL1_BRBE_BRBE_V1P1;
> +}
> +
> +static void select_brbe_bank(int bank)
> +{
> +	static int brbe_current_bank = BRBE_BANK_IDX_INVALID;

This is a per-cpu peroperty, so I don't understand how this can safely be
stored in a static variable. If this is necessary it needs to go in a per-cpu
variable, but I suspect we don't actually need it.

> +	u64 brbfcr;
> +
> +	if (brbe_current_bank == bank)
> +		return;

It looks like this is just for the same of optimizing redundant changes when
armv8pmu_branch_read() iterates over the records?

It'd be simpler to have armv8pmu_branch_read() iterate over each bank, then
within that iterate over each record within that bank.

> +	WARN_ON(bank > BRBE_BANK_IDX_1);
> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> +	brbfcr &= ~BRBFCR_EL1_BANK_MASK;
> +	brbfcr |= ((bank << BRBFCR_EL1_BANK_SHIFT) & BRBFCR_EL1_BANK_MASK);

You can use SYS_FIELD_PREP() for this:

	brbfcr &= ~BRBFCR_EL1_BANK_MASK;
	brbfcr |= SYS_FIELD_PREP(BRBFCR_EL1, BANK, bank);

Please use FIELD_PREP for this.

> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
> +	isb();
> +	brbe_current_bank = bank;
> +}
> +
> +static void select_brbe_bank_index(int buffer_idx)
> +{
> +	switch (buffer_idx) {
> +	case BRBE_BANK0_IDX_MIN ... BRBE_BANK0_IDX_MAX:
> +		select_brbe_bank(BRBE_BANK_IDX_0);
> +		break;
> +	case BRBE_BANK1_IDX_MIN ... BRBE_BANK1_IDX_MAX:
> +		select_brbe_bank(BRBE_BANK_IDX_1);
> +		break;
> +	default:
> +		pr_warn("unsupported BRBE index\n");

It would be worth logging the specific index in case we ever have to debug
this. It's probably worth also making this a WARN_ONCE() or WARN_RATELIMITED().

> +	}
> +}
> +
> +static const char branch_filter_error_msg[] = "branch filter not supported";
> +
> +bool armv8pmu_branch_valid(struct perf_event *event)
> +{
> +	u64 branch_type = event->attr.branch_sample_type;
> +
> +	/*
> +	 * If the event does not have at least one of the privilege
> +	 * branch filters as in PERF_SAMPLE_BRANCH_PLM_ALL, the core
> +	 * perf will adjust its value based on perf event's existing
> +	 * privilege level via attr.exclude_[user|kernel|hv].
> +	 *
> +	 * As event->attr.branch_sample_type might have been changed
> +	 * when the event reaches here, it is not possible to figure
> +	 * out whether the event originally had HV privilege request
> +	 * or got added via the core perf. Just report this situation
> +	 * once and continue ignoring if there are other instances.
> +	 */
> +	if ((branch_type & PERF_SAMPLE_BRANCH_HV) && !is_kernel_in_hyp_mode())
> +		pr_warn_once("%s - hypervisor privilege\n", branch_filter_error_msg);
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
> +		pr_warn_once("%s - aborted transaction\n", branch_filter_error_msg);
> +		return false;
> +	}
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_NO_TX) {
> +		pr_warn_once("%s - no transaction\n", branch_filter_error_msg);
> +		return false;
> +	}
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_IN_TX) {
> +		pr_warn_once("%s - in transaction\n", branch_filter_error_msg);
> +		return false;
> +	}
> +	return true;
> +}

Is this called when validating user input? If so, NAK to printing anything to a
higher leval than debug. If there are constraints the user needs to be aware of
we should expose the relevant information under sysfs, but it seems that these
are just generic perf options that BRBE doesn't support.

It would be better to whitelist what we do support rather than blacklisting
what we don't.

> +
> +static void branch_records_alloc(struct arm_pmu *armpmu)
> +{
> +	struct pmu_hw_events *events;
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		events = per_cpu_ptr(armpmu->hw_events, cpu);
> +
> +		events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
> +		WARN_ON(!events->branches);
> +	}
> +}

It would be simpler for this to be a percpu allocation.

If the allocation fails, we should propogate that error rather than just
WARNing, and fail probing the PMU.

Also, if the generic allocator fails it will print a warning (unless
__GFP_NOWARN was used), so we don't need the warning here.

> +
> +static int brbe_attributes_probe(struct arm_pmu *armpmu, u32 brbe)
> +{
> +	struct brbe_hw_attr *brbe_attr = kzalloc(sizeof(struct brbe_hw_attr), GFP_KERNEL);

Same comments as for the failure path in branch_records_alloc().

> +	u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);

Which context is this run in? Unless this is affine to a relevant CPU we can't
read the sysreg safely, and if we're in a cross-call we cannot allocate memory,
so this doesn't look right to me.

I suspect CONFIG_DEBUG_ATOMIC_SLEEP=y and/or CONFIG_PROVE_LOCKING=y will complain here.

Please follow the approach of armv8pmu_probe_pmu(), where we use a probe_info
structure that the callee can fill with information. Then we can do the
allocation in the main thread from a non-atomic context.

> +
> +	WARN_ON(!brbe_attr);
> +	armpmu->private = brbe_attr;
> +
> +	brbe_attr->brbe_version = brbe;
> +	brbe_attr->brbe_format = brbe_fetch_format(brbidr);
> +	brbe_attr->brbe_cc = brbe_fetch_cc_bits(brbidr);
> +	brbe_attr->brbe_nr = brbe_fetch_numrec(brbidr);

As a minor thing, could we please s/fetch/get/ ? To me, 'fetch' sounds like a
memory operation, and elsewhere we use 'get' for this sort of getter function.

> +
> +	if (!valid_brbe_version(brbe_attr->brbe_version) ||
> +	   !valid_brbe_format(brbe_attr->brbe_format) ||
> +	   !valid_brbe_cc(brbe_attr->brbe_cc) ||
> +	   !valid_brbe_nr(brbe_attr->brbe_nr))
> +		return -EOPNOTSUPP;
> +
> +	return 0;
> +}
> +
> +void armv8pmu_branch_probe(struct arm_pmu *armpmu)
> +{
> +	u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
> +	u32 brbe;
> +
> +	brbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT);
> +	if (!brbe)
> +		return;
> +
> +	if (brbe_attributes_probe(armpmu, brbe))
> +		return;
> +
> +	branch_records_alloc(armpmu);
> +	armpmu->features |= ARM_PMU_BRANCH_STACK;
> +}
> +
> +static u64 branch_type_to_brbfcr(int branch_type)
> +{
> +	u64 brbfcr = 0;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
> +		brbfcr |= BRBFCR_EL1_BRANCH_FILTERS;
> +		return brbfcr;
> +	}
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
> +		brbfcr |= BRBFCR_EL1_INDCALL;
> +		brbfcr |= BRBFCR_EL1_DIRCALL;
> +	}
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
> +		brbfcr |= BRBFCR_EL1_RTN;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
> +		brbfcr |= BRBFCR_EL1_INDCALL;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_COND)
> +		brbfcr |= BRBFCR_EL1_CONDDIR;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP)
> +		brbfcr |= BRBFCR_EL1_INDIRECT;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_CALL)
> +		brbfcr |= BRBFCR_EL1_DIRCALL;
> +
> +	return brbfcr;
> +}
> +
> +static u64 branch_type_to_brbcr(int branch_type)
> +{
> +	u64 brbcr = (BRBCR_EL1_FZP | BRBCR_EL1_DEFAULT_TS);
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_USER)
> +		brbcr |= BRBCR_EL1_E0BRE;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_KERNEL)
> +		brbcr |= BRBCR_EL1_E1BRE;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_HV) {
> +		if (is_kernel_in_hyp_mode())
> +			brbcr |= BRBCR_EL1_E1BRE;
> +	}

I assume that in that case we're actually writing to BRBCR_EL2, and this is
actually the E2BRE bit, which is at the same position? If so, I think that's
worth a comment above the USER/KERNEL/HV bits here.

How do the BRB* control registers work with E2H? Is BRBCR_EL1 rewritten to
BRBCR_EL2 by the hardware?

> +
> +	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_CYCLES))
> +		brbcr |= BRBCR_EL1_CC;
> +
> +	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_FLAGS))
> +		brbcr |= BRBCR_EL1_MPRED;
> +
> +	/*
> +	 * The exception and exception return branches could be
> +	 * captured, irrespective of the perf event's privilege.
> +	 * If the perf event does not have enough privilege for
> +	 * a given exception level, then addresses which falls
> +	 * under that exception level will be reported as zero
> +	 * for the captured branch record, creating source only
> +	 * or target only records.
> +	 */
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
> +		brbcr |= BRBCR_EL1_EXCEPTION;
> +		brbcr |= BRBCR_EL1_ERTN;
> +	}
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
> +		brbcr |= BRBCR_EL1_EXCEPTION;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
> +		brbcr |= BRBCR_EL1_ERTN;
> +
> +	return brbcr & BRBCR_EL1_DEFAULT_CONFIG;
> +}
> +
> +void armv8pmu_branch_enable(struct perf_event *event)
> +{
> +	u64 branch_type = event->attr.branch_sample_type;
> +	u64 brbfcr, brbcr;
> +
> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> +	brbfcr &= ~BRBFCR_EL1_DEFAULT_CONFIG;
> +	brbfcr |= branch_type_to_brbfcr(branch_type);
> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
> +	isb();
> +
> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
> +	brbcr &= ~BRBCR_EL1_DEFAULT_CONFIG;
> +	brbcr |= branch_type_to_brbcr(branch_type);
> +	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
> +	isb();
> +	armv8pmu_branch_reset();
> +}
> +
> +void armv8pmu_branch_disable(struct perf_event *event)
> +{
> +	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> +	u64 brbcr = read_sysreg_s(SYS_BRBCR_EL1);
> +
> +	brbcr &= ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE);
> +	brbfcr |= BRBFCR_EL1_PAUSED;
> +	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
> +	isb();
> +}
> +
> +static int brbe_fetch_perf_type(u64 brbinf, bool *new_branch_type)

It's a bit confusing to return the type and new_type fields in this way.

I think this would be clearer as a setter function, even if that results in it
being a bit longer, since it keeps all the type and new_type relationships in
one place and has a single path for returning the value:

static void brbe_set_perf_entry_type(struct perf_branch_entry *entry,
				     u64 brbinf)
{
	int brbe_type = brbe_fetch_type(brbinf);

	switch (brbe_type) {
	case BRBINF_EL1_TYPE_UNCOND_DIR;
		entry->type = PERF_BR_UNCOND;
		break;
	...
	case BRBINF_EL1_TYPE_DEBUG_HALT;
		entry->type = PERF_BR_EXTEND_ABI;
		entry->new_type = PERF_BR_ARM64_DEBUG_HALT;
		break;
	...
	default:
		...
	}
}

... and in theory that makes it easier to propogate an error in future if we
want to.

> +{
> +	int brbe_type = brbe_fetch_type(brbinf);
> +	*new_branch_type = false;
> +
> +	switch (brbe_type) {
> +	case BRBINF_EL1_TYPE_UNCOND_DIR:
> +		return PERF_BR_UNCOND;
> +	case BRBINF_EL1_TYPE_INDIR:
> +		return PERF_BR_IND;
> +	case BRBINF_EL1_TYPE_DIR_LINK:
> +		return PERF_BR_CALL;
> +	case BRBINF_EL1_TYPE_INDIR_LINK:
> +		return PERF_BR_IND_CALL;
> +	case BRBINF_EL1_TYPE_RET_SUB:
> +		return PERF_BR_RET;
> +	case BRBINF_EL1_TYPE_COND_DIR:
> +		return PERF_BR_COND;
> +	case BRBINF_EL1_TYPE_CALL:
> +		return PERF_BR_CALL;
> +	case BRBINF_EL1_TYPE_TRAP:
> +		return PERF_BR_SYSCALL;
> +	case BRBINF_EL1_TYPE_RET_EXCPT:
> +		return PERF_BR_ERET;
> +	case BRBINF_EL1_TYPE_IRQ:
> +		return PERF_BR_IRQ;
> +	case BRBINF_EL1_TYPE_DEBUG_HALT:
> +		*new_branch_type = true;
> +		return PERF_BR_ARM64_DEBUG_HALT;
> +	case BRBINF_EL1_TYPE_SERROR:
> +		return PERF_BR_SERROR;
> +	case BRBINF_EL1_TYPE_INST_DEBUG:
> +		*new_branch_type = true;
> +		return PERF_BR_ARM64_DEBUG_INST;
> +	case BRBINF_EL1_TYPE_DATA_DEBUG:
> +		*new_branch_type = true;
> +		return PERF_BR_ARM64_DEBUG_DATA;
> +	case BRBINF_EL1_TYPE_ALGN_FAULT:
> +		*new_branch_type = true;
> +		return PERF_BR_NEW_FAULT_ALGN;
> +	case BRBINF_EL1_TYPE_INST_FAULT:
> +		*new_branch_type = true;
> +		return PERF_BR_NEW_FAULT_INST;
> +	case BRBINF_EL1_TYPE_DATA_FAULT:
> +		*new_branch_type = true;
> +		return PERF_BR_NEW_FAULT_DATA;
> +	case BRBINF_EL1_TYPE_FIQ:
> +		*new_branch_type = true;
> +		return PERF_BR_ARM64_FIQ;
> +	case BRBINF_EL1_TYPE_DEBUG_EXIT:
> +		*new_branch_type = true;
> +		return PERF_BR_ARM64_DEBUG_EXIT;
> +	default:
> +		pr_warn("unknown branch type captured\n");
> +		return PERF_BR_UNKNOWN;

It would be worth logging the specific value in case we ever have to debug
this. This should also be marked as _ratelimited or _once.

> +	}
> +}
> +
> +static int brbe_fetch_perf_priv(u64 brbinf)
> +{
> +	int brbe_el = brbe_fetch_el(brbinf);
> +
> +	switch (brbe_el) {
> +	case BRBINF_EL1_EL_EL0:
> +		return PERF_BR_PRIV_USER;
> +	case BRBINF_EL1_EL_EL1:
> +		return PERF_BR_PRIV_KERNEL;
> +	case BRBINF_EL1_EL_EL2:
> +		if (is_kernel_in_hyp_mode())
> +			return PERF_BR_PRIV_KERNEL;
> +		return PERF_BR_PRIV_HV;
> +	default:
> +		pr_warn("unknown branch privilege captured\n");
> +		return PERF_BR_PRIV_UNKNOWN;

It would be worth logging the specific value in case we ever have to debug
this. This should also be marked as _ratelimited or _once.

> +	}
> +}
> +
> +static void capture_brbe_flags(struct pmu_hw_events *cpuc, struct perf_event *event,
> +			       u64 brbinf, u64 brbcr, int idx)
> +{
> +	struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
> +	bool new_branch_type;
> +	int branch_type;
> +
> +	if (branch_sample_type(event)) {
> +		branch_type = brbe_fetch_perf_type(brbinf, &new_branch_type);
> +		if (new_branch_type) {
> +			entry->type = PERF_BR_EXTEND_ABI;
> +			entry->new_type = branch_type;
> +		} else {
> +			entry->type = branch_type;
> +		}
> +	}

With the suggestions bove, this would become:

	if (branch_sample_type(event))
		brbe_set_perf_entry_type(entry, brbinf);

> +	if (!branch_sample_no_cycles(event)) {
> +		WARN_ON_ONCE(!(brbcr & BRBCR_EL1_CC));
> +		entry->cycles = brbe_fetch_cycles(brbinf);
> +	}
> +
> +	if (!branch_sample_no_flags(event)) {
> +		/*
> +		 * BRBINF_LASTFAILED does not indicate whether last transaction
> +		 * got failed or aborted during the current branch record itself.
> +		 * Rather, this indicates that all the branch records which were
> +		 * in transaction until the curret branch record have failed. So
> +		 * the entire BRBE buffer needs to be processed later on to find
> +		 * all branch records which might have failed.
> +		 */

This is quite difficult to follow.

I took in the ARM ARM, and it looks like this is all about TME transactions
(which Linux doesn't currently support). Per ARM DDI 0487I.a, page D15-5506:

| R_GVCJH
|   When an entire transaction is executed in a BRBE Non-prohibited region and
|   the transaction fails or is canceled then BRBFCR_EL1.LASTFAILED is set to
|   1.

| R_KBSZM
|   When a Branch record is generated, other than through the injection
|   mechanism, the value of BRBFCR_EL1.LASTFAILED is copied to the LASTFAILED
|   field in the Branch record and BRBFCR_EL1.LASTFAILED is set to 0.

| I_JBPHS
|   When a transaction fails or is canceled, Branch records generated in the
|   transaction are not removed from the Branch record buffer.

I think what this is saying is:

	/*
	 * BRBINFx_EL1.LASTFAILED indicates that a TME transaction failed (or
	 * was cancelled) prior to this record, and some number of records
	 * prior to this one may have been generated during an attempt to
	 * execute the transaction.
	 *
	 * We will remove such entries later in process_branch_aborts().
	 */

Is that right?

> +
> +		/*
> +		 * All these information (i.e transaction state and mispredicts)
> +		 * are not available for target only branch records.
> +		 */
> +		if (!brbe_target(brbinf)) {

Could we rename these heleprs for clarity, e.g.
brbe_record_is_{target_only,source_only,complete}()

With that, it would also be clearer to have:

	/*
	 * These fields only exist for complete and source-only records.
	 */
	if (brbe_record_is_complete(brbinf) ||
	    brbe_record_is_source_only()) {

... and explicilty match the cases we care about[


> +			WARN_ON_ONCE(!(brbcr & BRBCR_EL1_MPRED));

Huh? Why does the value of BRBCR matter here?

> +			entry->mispred = brbe_fetch_mispredict(brbinf);
> +			entry->predicted = !entry->mispred;
> +			entry->in_tx = brbe_fetch_in_tx(brbinf);
> +		}
> +	}
> +
> +	if (branch_sample_priv(event)) {
> +		/*
> +		 * All these information (i.e branch privilege level) are not
> +		 * available for source only branch records.
> +		 */
> +		if (!brbe_source(brbinf))
> +			entry->priv = brbe_fetch_perf_priv(brbinf);

Same style comment as above.

> +	}
> +}
> +
> +/*
> + * A branch record with BRBINF_EL1.LASTFAILED set, implies that all
> + * preceding consecutive branch records, that were in a transaction
> + * (i.e their BRBINF_EL1.TX set) have been aborted.
> + *
> + * Similarly BRBFCR_EL1.LASTFAILED set, indicate that all preceding
> + * consecutive branch records upto the last record, which were in a
> + * transaction (i.e their BRBINF_EL1.TX set) have been aborted.
> + *
> + * --------------------------------- -------------------
> + * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
> + * --------------------------------- -------------------
> + * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
> + * --------------------------------- -------------------
> + * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
> + * --------------------------------- -------------------
> + * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> + * --------------------------------- -------------------
> + * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> + * --------------------------------- -------------------
> + * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
> + * --------------------------------- -------------------
> + * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
> + * --------------------------------- -------------------
> + * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> + * --------------------------------- -------------------
> + * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> + * --------------------------------- -------------------
> + * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> + * --------------------------------- -------------------

Are we guaranteed to have a record between two transactions with TX = 0?

AFAICT you could have a sequence where a TCOMMIT is immediately followed by a
TSTART, and IIUC in that case you could have back-to-back records for distinct
transactions all with TX = 1, where the first transaction could be commited,
and the second might fail/cancel.

... or do TCOMMIT/TCANCEL/TSTART get handled specially?

> + *
> + * BRBFCR_EL1.LASTFAILED == 1
> + *
> + * Here BRBFCR_EL1.LASTFAILED failes all those consecutive and also
> + * in transaction branches near the end of the BRBE buffer.
> + */
> +static void process_branch_aborts(struct pmu_hw_events *cpuc)
> +{
> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
> +	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> +	bool lastfailed = !!(brbfcr & BRBFCR_EL1_LASTFAILED);
> +	int idx = brbe_attr->brbe_nr - 1;
> +	struct perf_branch_entry *entry;
> +
> +	do {
> +		entry = &cpuc->branches->branch_entries[idx];
> +		if (entry->in_tx) {
> +			entry->abort = lastfailed;
> +		} else {
> +			lastfailed = entry->abort;
> +			entry->abort = false;
> +		}
> +	} while (idx--, idx >= 0);
> +}
> +
> +void armv8pmu_branch_reset(void)
> +{
> +	asm volatile(BRB_IALL);
> +	isb();
> +}
> +
> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
> +{
> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
> +	u64 brbinf, brbfcr, brbcr;
> +	int idx;
> +
> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> +
> +	/* Ensure pause on PMU interrupt is enabled */
> +	WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));

As above, I think this needs commentary in the interrupt handler, since this
presumably needs us to keep the IRQ asserted until we're done
reading/manipulating records in the IRQ handler.

Do we ever read this outside of the IRQ handler? AFAICT we don't, and that
makes it seem like some of this is redundant.

> +
> +	/* Save and clear the privilege */
> +	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);

Why? Later on we restore this, and AFAICT we don't modify it.

If it's paused, why do we care about the privilege?

> +
> +	/* Pause the buffer */
> +	write_sysreg_s(brbfcr | BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
> +	isb();

Why? If we're in the IRQ handler it's already paused, and if we're not in the
IRQ handler what prevents us racing with an IRQ?

> +
> +	for (idx = 0; idx < brbe_attr->brbe_nr; idx++) {
> +		struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
> +
> +		select_brbe_bank_index(idx);
> +		brbinf = get_brbinf_reg(idx);
> +		/*
> +		 * There are no valid entries anymore on the buffer.
> +		 * Abort the branch record processing to save some
> +		 * cycles and also reduce the capture/process load
> +		 * for the user space as well.
> +		 */
> +		if (brbe_invalid(brbinf))
> +			break;
> +
> +		perf_clear_branch_entry_bitfields(entry);
> +		if (brbe_valid(brbinf)) {
> +			entry->from = get_brbsrc_reg(idx);
> +			entry->to = get_brbtgt_reg(idx);
> +		} else if (brbe_source(brbinf)) {
> +			entry->from = get_brbsrc_reg(idx);
> +			entry->to = 0;
> +		} else if (brbe_target(brbinf)) {
> +			entry->from = 0;
> +			entry->to = get_brbtgt_reg(idx);
> +		}
> +		capture_brbe_flags(cpuc, event, brbinf, brbcr, idx);
> +	}
> +	cpuc->branches->branch_stack.nr = idx;
> +	cpuc->branches->branch_stack.hw_idx = -1ULL;
> +	process_branch_aborts(cpuc);
> +
> +	/* Restore privilege, enable pause on PMU interrupt */
> +	write_sysreg_s(brbcr | BRBCR_EL1_FZP, SYS_BRBCR_EL1);

Why do we have to save/restore this?

> +
> +	/* Unpause the buffer */
> +	write_sysreg_s(brbfcr & ~BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
> +	isb();
> +	armv8pmu_branch_reset();
> +}

Why do we enable it before we reset it?

Surely it would make sense to reset it first, and ammortize the cost of the ISB?

That said, as above, do we actually need to pause/unpause it? Or is it already
paused by virtue of the IRQ?

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
@ 2023-01-12 16:51     ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-01-12 16:51 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Thu, Jan 05, 2023 at 08:40:39AM +0530, Anshuman Khandual wrote:
> This enables branch stack sampling events in ARMV8 PMU, via an architecture
> feature FEAT_BRBE aka branch record buffer extension. This defines required
> branch helper functions pmuv8pmu_branch_XXXXX() and the implementation here
> is wrapped with a new config option CONFIG_ARM64_BRBE.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: linux-arm-kernel@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/arm64/Kconfig                  |  11 +
>  arch/arm64/include/asm/perf_event.h |   9 +
>  arch/arm64/kernel/Makefile          |   1 +
>  arch/arm64/kernel/brbe.c            | 512 ++++++++++++++++++++++++++++
>  arch/arm64/kernel/brbe.h            | 257 ++++++++++++++
>  5 files changed, 790 insertions(+)
>  create mode 100644 arch/arm64/kernel/brbe.c
>  create mode 100644 arch/arm64/kernel/brbe.h
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 03934808b2ed..915b12709a46 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -1363,6 +1363,17 @@ config HW_PERF_EVENTS
>  	def_bool y
>  	depends on ARM_PMU
>  
> +config ARM64_BRBE
> +	bool "Enable support for Branch Record Buffer Extension (BRBE)"
> +	depends on PERF_EVENTS && ARM64 && ARM_PMU
> +	default y
> +	help
> +	  Enable perf support for Branch Record Buffer Extension (BRBE) which
> +	  records all branches taken in an execution path. This supports some
> +	  branch types and privilege based filtering. It captured additional
> +	  relevant information such as cycle count, misprediction and branch
> +	  type, branch privilege level etc.
> +
>  # Supported by clang >= 7.0 or GCC >= 12.0.0
>  config CC_HAVE_SHADOW_CALL_STACK
>  	def_bool $(cc-option, -fsanitize=shadow-call-stack -ffixed-x18)
> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
> index a038902d6874..cf2e88c7b707 100644
> --- a/arch/arm64/include/asm/perf_event.h
> +++ b/arch/arm64/include/asm/perf_event.h
> @@ -277,6 +277,14 @@ struct pmu_hw_events;
>  struct arm_pmu;
>  struct perf_event;
>  
> +#ifdef CONFIG_ARM64_BRBE
> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event);
> +bool armv8pmu_branch_valid(struct perf_event *event);
> +void armv8pmu_branch_enable(struct perf_event *event);
> +void armv8pmu_branch_disable(struct perf_event *event);
> +void armv8pmu_branch_probe(struct arm_pmu *arm_pmu);
> +void armv8pmu_branch_reset(void);
> +#else
>  static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
>  static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
>  static inline void armv8pmu_branch_enable(struct perf_event *event) { }
> @@ -284,3 +292,4 @@ static inline void armv8pmu_branch_disable(struct perf_event *event) { }
>  static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
>  static inline void armv8pmu_branch_reset(void) { }
>  #endif
> +#endif
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index ceba6792f5b3..6ee7ccb61621 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)			+= module.o
>  obj-$(CONFIG_ARM64_MODULE_PLTS)		+= module-plts.o
>  obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o perf_callchain.o
>  obj-$(CONFIG_HW_PERF_EVENTS)		+= perf_event.o
> +obj-$(CONFIG_ARM64_BRBE)		+= brbe.o
>  obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
>  obj-$(CONFIG_CPU_PM)			+= sleep.o suspend.o
>  obj-$(CONFIG_CPU_IDLE)			+= cpuidle.o
> diff --git a/arch/arm64/kernel/brbe.c b/arch/arm64/kernel/brbe.c
> new file mode 100644
> index 000000000000..cd03d3531e04
> --- /dev/null
> +++ b/arch/arm64/kernel/brbe.c
> @@ -0,0 +1,512 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Branch Record Buffer Extension Driver.
> + *
> + * Copyright (C) 2022 ARM Limited
> + *
> + * Author: Anshuman Khandual <anshuman.khandual@arm.com>
> + */
> +#include "brbe.h"
> +
> +static bool valid_brbe_nr(int brbe_nr)
> +{
> +	return brbe_nr == BRBIDR0_EL1_NUMREC_8 ||
> +	       brbe_nr == BRBIDR0_EL1_NUMREC_16 ||
> +	       brbe_nr == BRBIDR0_EL1_NUMREC_32 ||
> +	       brbe_nr == BRBIDR0_EL1_NUMREC_64;
> +}
> +
> +static bool valid_brbe_cc(int brbe_cc)
> +{
> +	return brbe_cc == BRBIDR0_EL1_CC_20_BIT;
> +}
> +
> +static bool valid_brbe_format(int brbe_format)
> +{
> +	return brbe_format == BRBIDR0_EL1_FORMAT_0;
> +}
> +
> +static bool valid_brbe_version(int brbe_version)
> +{
> +	return brbe_version == ID_AA64DFR0_EL1_BRBE_IMP ||
> +	       brbe_version == ID_AA64DFR0_EL1_BRBE_BRBE_V1P1;
> +}
> +
> +static void select_brbe_bank(int bank)
> +{
> +	static int brbe_current_bank = BRBE_BANK_IDX_INVALID;

This is a per-cpu peroperty, so I don't understand how this can safely be
stored in a static variable. If this is necessary it needs to go in a per-cpu
variable, but I suspect we don't actually need it.

> +	u64 brbfcr;
> +
> +	if (brbe_current_bank == bank)
> +		return;

It looks like this is just for the same of optimizing redundant changes when
armv8pmu_branch_read() iterates over the records?

It'd be simpler to have armv8pmu_branch_read() iterate over each bank, then
within that iterate over each record within that bank.

> +	WARN_ON(bank > BRBE_BANK_IDX_1);
> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> +	brbfcr &= ~BRBFCR_EL1_BANK_MASK;
> +	brbfcr |= ((bank << BRBFCR_EL1_BANK_SHIFT) & BRBFCR_EL1_BANK_MASK);

You can use SYS_FIELD_PREP() for this:

	brbfcr &= ~BRBFCR_EL1_BANK_MASK;
	brbfcr |= SYS_FIELD_PREP(BRBFCR_EL1, BANK, bank);

Please use FIELD_PREP for this.

> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
> +	isb();
> +	brbe_current_bank = bank;
> +}
> +
> +static void select_brbe_bank_index(int buffer_idx)
> +{
> +	switch (buffer_idx) {
> +	case BRBE_BANK0_IDX_MIN ... BRBE_BANK0_IDX_MAX:
> +		select_brbe_bank(BRBE_BANK_IDX_0);
> +		break;
> +	case BRBE_BANK1_IDX_MIN ... BRBE_BANK1_IDX_MAX:
> +		select_brbe_bank(BRBE_BANK_IDX_1);
> +		break;
> +	default:
> +		pr_warn("unsupported BRBE index\n");

It would be worth logging the specific index in case we ever have to debug
this. It's probably worth also making this a WARN_ONCE() or WARN_RATELIMITED().

> +	}
> +}
> +
> +static const char branch_filter_error_msg[] = "branch filter not supported";
> +
> +bool armv8pmu_branch_valid(struct perf_event *event)
> +{
> +	u64 branch_type = event->attr.branch_sample_type;
> +
> +	/*
> +	 * If the event does not have at least one of the privilege
> +	 * branch filters as in PERF_SAMPLE_BRANCH_PLM_ALL, the core
> +	 * perf will adjust its value based on perf event's existing
> +	 * privilege level via attr.exclude_[user|kernel|hv].
> +	 *
> +	 * As event->attr.branch_sample_type might have been changed
> +	 * when the event reaches here, it is not possible to figure
> +	 * out whether the event originally had HV privilege request
> +	 * or got added via the core perf. Just report this situation
> +	 * once and continue ignoring if there are other instances.
> +	 */
> +	if ((branch_type & PERF_SAMPLE_BRANCH_HV) && !is_kernel_in_hyp_mode())
> +		pr_warn_once("%s - hypervisor privilege\n", branch_filter_error_msg);
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
> +		pr_warn_once("%s - aborted transaction\n", branch_filter_error_msg);
> +		return false;
> +	}
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_NO_TX) {
> +		pr_warn_once("%s - no transaction\n", branch_filter_error_msg);
> +		return false;
> +	}
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_IN_TX) {
> +		pr_warn_once("%s - in transaction\n", branch_filter_error_msg);
> +		return false;
> +	}
> +	return true;
> +}

Is this called when validating user input? If so, NAK to printing anything to a
higher leval than debug. If there are constraints the user needs to be aware of
we should expose the relevant information under sysfs, but it seems that these
are just generic perf options that BRBE doesn't support.

It would be better to whitelist what we do support rather than blacklisting
what we don't.

> +
> +static void branch_records_alloc(struct arm_pmu *armpmu)
> +{
> +	struct pmu_hw_events *events;
> +	int cpu;
> +
> +	for_each_possible_cpu(cpu) {
> +		events = per_cpu_ptr(armpmu->hw_events, cpu);
> +
> +		events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
> +		WARN_ON(!events->branches);
> +	}
> +}

It would be simpler for this to be a percpu allocation.

If the allocation fails, we should propogate that error rather than just
WARNing, and fail probing the PMU.

Also, if the generic allocator fails it will print a warning (unless
__GFP_NOWARN was used), so we don't need the warning here.

> +
> +static int brbe_attributes_probe(struct arm_pmu *armpmu, u32 brbe)
> +{
> +	struct brbe_hw_attr *brbe_attr = kzalloc(sizeof(struct brbe_hw_attr), GFP_KERNEL);

Same comments as for the failure path in branch_records_alloc().

> +	u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);

Which context is this run in? Unless this is affine to a relevant CPU we can't
read the sysreg safely, and if we're in a cross-call we cannot allocate memory,
so this doesn't look right to me.

I suspect CONFIG_DEBUG_ATOMIC_SLEEP=y and/or CONFIG_PROVE_LOCKING=y will complain here.

Please follow the approach of armv8pmu_probe_pmu(), where we use a probe_info
structure that the callee can fill with information. Then we can do the
allocation in the main thread from a non-atomic context.

> +
> +	WARN_ON(!brbe_attr);
> +	armpmu->private = brbe_attr;
> +
> +	brbe_attr->brbe_version = brbe;
> +	brbe_attr->brbe_format = brbe_fetch_format(brbidr);
> +	brbe_attr->brbe_cc = brbe_fetch_cc_bits(brbidr);
> +	brbe_attr->brbe_nr = brbe_fetch_numrec(brbidr);

As a minor thing, could we please s/fetch/get/ ? To me, 'fetch' sounds like a
memory operation, and elsewhere we use 'get' for this sort of getter function.

> +
> +	if (!valid_brbe_version(brbe_attr->brbe_version) ||
> +	   !valid_brbe_format(brbe_attr->brbe_format) ||
> +	   !valid_brbe_cc(brbe_attr->brbe_cc) ||
> +	   !valid_brbe_nr(brbe_attr->brbe_nr))
> +		return -EOPNOTSUPP;
> +
> +	return 0;
> +}
> +
> +void armv8pmu_branch_probe(struct arm_pmu *armpmu)
> +{
> +	u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
> +	u32 brbe;
> +
> +	brbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT);
> +	if (!brbe)
> +		return;
> +
> +	if (brbe_attributes_probe(armpmu, brbe))
> +		return;
> +
> +	branch_records_alloc(armpmu);
> +	armpmu->features |= ARM_PMU_BRANCH_STACK;
> +}
> +
> +static u64 branch_type_to_brbfcr(int branch_type)
> +{
> +	u64 brbfcr = 0;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
> +		brbfcr |= BRBFCR_EL1_BRANCH_FILTERS;
> +		return brbfcr;
> +	}
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
> +		brbfcr |= BRBFCR_EL1_INDCALL;
> +		brbfcr |= BRBFCR_EL1_DIRCALL;
> +	}
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
> +		brbfcr |= BRBFCR_EL1_RTN;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
> +		brbfcr |= BRBFCR_EL1_INDCALL;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_COND)
> +		brbfcr |= BRBFCR_EL1_CONDDIR;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP)
> +		brbfcr |= BRBFCR_EL1_INDIRECT;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_CALL)
> +		brbfcr |= BRBFCR_EL1_DIRCALL;
> +
> +	return brbfcr;
> +}
> +
> +static u64 branch_type_to_brbcr(int branch_type)
> +{
> +	u64 brbcr = (BRBCR_EL1_FZP | BRBCR_EL1_DEFAULT_TS);
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_USER)
> +		brbcr |= BRBCR_EL1_E0BRE;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_KERNEL)
> +		brbcr |= BRBCR_EL1_E1BRE;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_HV) {
> +		if (is_kernel_in_hyp_mode())
> +			brbcr |= BRBCR_EL1_E1BRE;
> +	}

I assume that in that case we're actually writing to BRBCR_EL2, and this is
actually the E2BRE bit, which is at the same position? If so, I think that's
worth a comment above the USER/KERNEL/HV bits here.

How do the BRB* control registers work with E2H? Is BRBCR_EL1 rewritten to
BRBCR_EL2 by the hardware?

> +
> +	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_CYCLES))
> +		brbcr |= BRBCR_EL1_CC;
> +
> +	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_FLAGS))
> +		brbcr |= BRBCR_EL1_MPRED;
> +
> +	/*
> +	 * The exception and exception return branches could be
> +	 * captured, irrespective of the perf event's privilege.
> +	 * If the perf event does not have enough privilege for
> +	 * a given exception level, then addresses which falls
> +	 * under that exception level will be reported as zero
> +	 * for the captured branch record, creating source only
> +	 * or target only records.
> +	 */
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
> +		brbcr |= BRBCR_EL1_EXCEPTION;
> +		brbcr |= BRBCR_EL1_ERTN;
> +	}
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
> +		brbcr |= BRBCR_EL1_EXCEPTION;
> +
> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
> +		brbcr |= BRBCR_EL1_ERTN;
> +
> +	return brbcr & BRBCR_EL1_DEFAULT_CONFIG;
> +}
> +
> +void armv8pmu_branch_enable(struct perf_event *event)
> +{
> +	u64 branch_type = event->attr.branch_sample_type;
> +	u64 brbfcr, brbcr;
> +
> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> +	brbfcr &= ~BRBFCR_EL1_DEFAULT_CONFIG;
> +	brbfcr |= branch_type_to_brbfcr(branch_type);
> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
> +	isb();
> +
> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
> +	brbcr &= ~BRBCR_EL1_DEFAULT_CONFIG;
> +	brbcr |= branch_type_to_brbcr(branch_type);
> +	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
> +	isb();
> +	armv8pmu_branch_reset();
> +}
> +
> +void armv8pmu_branch_disable(struct perf_event *event)
> +{
> +	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> +	u64 brbcr = read_sysreg_s(SYS_BRBCR_EL1);
> +
> +	brbcr &= ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE);
> +	brbfcr |= BRBFCR_EL1_PAUSED;
> +	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
> +	isb();
> +}
> +
> +static int brbe_fetch_perf_type(u64 brbinf, bool *new_branch_type)

It's a bit confusing to return the type and new_type fields in this way.

I think this would be clearer as a setter function, even if that results in it
being a bit longer, since it keeps all the type and new_type relationships in
one place and has a single path for returning the value:

static void brbe_set_perf_entry_type(struct perf_branch_entry *entry,
				     u64 brbinf)
{
	int brbe_type = brbe_fetch_type(brbinf);

	switch (brbe_type) {
	case BRBINF_EL1_TYPE_UNCOND_DIR;
		entry->type = PERF_BR_UNCOND;
		break;
	...
	case BRBINF_EL1_TYPE_DEBUG_HALT;
		entry->type = PERF_BR_EXTEND_ABI;
		entry->new_type = PERF_BR_ARM64_DEBUG_HALT;
		break;
	...
	default:
		...
	}
}

... and in theory that makes it easier to propogate an error in future if we
want to.

> +{
> +	int brbe_type = brbe_fetch_type(brbinf);
> +	*new_branch_type = false;
> +
> +	switch (brbe_type) {
> +	case BRBINF_EL1_TYPE_UNCOND_DIR:
> +		return PERF_BR_UNCOND;
> +	case BRBINF_EL1_TYPE_INDIR:
> +		return PERF_BR_IND;
> +	case BRBINF_EL1_TYPE_DIR_LINK:
> +		return PERF_BR_CALL;
> +	case BRBINF_EL1_TYPE_INDIR_LINK:
> +		return PERF_BR_IND_CALL;
> +	case BRBINF_EL1_TYPE_RET_SUB:
> +		return PERF_BR_RET;
> +	case BRBINF_EL1_TYPE_COND_DIR:
> +		return PERF_BR_COND;
> +	case BRBINF_EL1_TYPE_CALL:
> +		return PERF_BR_CALL;
> +	case BRBINF_EL1_TYPE_TRAP:
> +		return PERF_BR_SYSCALL;
> +	case BRBINF_EL1_TYPE_RET_EXCPT:
> +		return PERF_BR_ERET;
> +	case BRBINF_EL1_TYPE_IRQ:
> +		return PERF_BR_IRQ;
> +	case BRBINF_EL1_TYPE_DEBUG_HALT:
> +		*new_branch_type = true;
> +		return PERF_BR_ARM64_DEBUG_HALT;
> +	case BRBINF_EL1_TYPE_SERROR:
> +		return PERF_BR_SERROR;
> +	case BRBINF_EL1_TYPE_INST_DEBUG:
> +		*new_branch_type = true;
> +		return PERF_BR_ARM64_DEBUG_INST;
> +	case BRBINF_EL1_TYPE_DATA_DEBUG:
> +		*new_branch_type = true;
> +		return PERF_BR_ARM64_DEBUG_DATA;
> +	case BRBINF_EL1_TYPE_ALGN_FAULT:
> +		*new_branch_type = true;
> +		return PERF_BR_NEW_FAULT_ALGN;
> +	case BRBINF_EL1_TYPE_INST_FAULT:
> +		*new_branch_type = true;
> +		return PERF_BR_NEW_FAULT_INST;
> +	case BRBINF_EL1_TYPE_DATA_FAULT:
> +		*new_branch_type = true;
> +		return PERF_BR_NEW_FAULT_DATA;
> +	case BRBINF_EL1_TYPE_FIQ:
> +		*new_branch_type = true;
> +		return PERF_BR_ARM64_FIQ;
> +	case BRBINF_EL1_TYPE_DEBUG_EXIT:
> +		*new_branch_type = true;
> +		return PERF_BR_ARM64_DEBUG_EXIT;
> +	default:
> +		pr_warn("unknown branch type captured\n");
> +		return PERF_BR_UNKNOWN;

It would be worth logging the specific value in case we ever have to debug
this. This should also be marked as _ratelimited or _once.

> +	}
> +}
> +
> +static int brbe_fetch_perf_priv(u64 brbinf)
> +{
> +	int brbe_el = brbe_fetch_el(brbinf);
> +
> +	switch (brbe_el) {
> +	case BRBINF_EL1_EL_EL0:
> +		return PERF_BR_PRIV_USER;
> +	case BRBINF_EL1_EL_EL1:
> +		return PERF_BR_PRIV_KERNEL;
> +	case BRBINF_EL1_EL_EL2:
> +		if (is_kernel_in_hyp_mode())
> +			return PERF_BR_PRIV_KERNEL;
> +		return PERF_BR_PRIV_HV;
> +	default:
> +		pr_warn("unknown branch privilege captured\n");
> +		return PERF_BR_PRIV_UNKNOWN;

It would be worth logging the specific value in case we ever have to debug
this. This should also be marked as _ratelimited or _once.

> +	}
> +}
> +
> +static void capture_brbe_flags(struct pmu_hw_events *cpuc, struct perf_event *event,
> +			       u64 brbinf, u64 brbcr, int idx)
> +{
> +	struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
> +	bool new_branch_type;
> +	int branch_type;
> +
> +	if (branch_sample_type(event)) {
> +		branch_type = brbe_fetch_perf_type(brbinf, &new_branch_type);
> +		if (new_branch_type) {
> +			entry->type = PERF_BR_EXTEND_ABI;
> +			entry->new_type = branch_type;
> +		} else {
> +			entry->type = branch_type;
> +		}
> +	}

With the suggestions bove, this would become:

	if (branch_sample_type(event))
		brbe_set_perf_entry_type(entry, brbinf);

> +	if (!branch_sample_no_cycles(event)) {
> +		WARN_ON_ONCE(!(brbcr & BRBCR_EL1_CC));
> +		entry->cycles = brbe_fetch_cycles(brbinf);
> +	}
> +
> +	if (!branch_sample_no_flags(event)) {
> +		/*
> +		 * BRBINF_LASTFAILED does not indicate whether last transaction
> +		 * got failed or aborted during the current branch record itself.
> +		 * Rather, this indicates that all the branch records which were
> +		 * in transaction until the curret branch record have failed. So
> +		 * the entire BRBE buffer needs to be processed later on to find
> +		 * all branch records which might have failed.
> +		 */

This is quite difficult to follow.

I took in the ARM ARM, and it looks like this is all about TME transactions
(which Linux doesn't currently support). Per ARM DDI 0487I.a, page D15-5506:

| R_GVCJH
|   When an entire transaction is executed in a BRBE Non-prohibited region and
|   the transaction fails or is canceled then BRBFCR_EL1.LASTFAILED is set to
|   1.

| R_KBSZM
|   When a Branch record is generated, other than through the injection
|   mechanism, the value of BRBFCR_EL1.LASTFAILED is copied to the LASTFAILED
|   field in the Branch record and BRBFCR_EL1.LASTFAILED is set to 0.

| I_JBPHS
|   When a transaction fails or is canceled, Branch records generated in the
|   transaction are not removed from the Branch record buffer.

I think what this is saying is:

	/*
	 * BRBINFx_EL1.LASTFAILED indicates that a TME transaction failed (or
	 * was cancelled) prior to this record, and some number of records
	 * prior to this one may have been generated during an attempt to
	 * execute the transaction.
	 *
	 * We will remove such entries later in process_branch_aborts().
	 */

Is that right?

> +
> +		/*
> +		 * All these information (i.e transaction state and mispredicts)
> +		 * are not available for target only branch records.
> +		 */
> +		if (!brbe_target(brbinf)) {

Could we rename these heleprs for clarity, e.g.
brbe_record_is_{target_only,source_only,complete}()

With that, it would also be clearer to have:

	/*
	 * These fields only exist for complete and source-only records.
	 */
	if (brbe_record_is_complete(brbinf) ||
	    brbe_record_is_source_only()) {

... and explicilty match the cases we care about[


> +			WARN_ON_ONCE(!(brbcr & BRBCR_EL1_MPRED));

Huh? Why does the value of BRBCR matter here?

> +			entry->mispred = brbe_fetch_mispredict(brbinf);
> +			entry->predicted = !entry->mispred;
> +			entry->in_tx = brbe_fetch_in_tx(brbinf);
> +		}
> +	}
> +
> +	if (branch_sample_priv(event)) {
> +		/*
> +		 * All these information (i.e branch privilege level) are not
> +		 * available for source only branch records.
> +		 */
> +		if (!brbe_source(brbinf))
> +			entry->priv = brbe_fetch_perf_priv(brbinf);

Same style comment as above.

> +	}
> +}
> +
> +/*
> + * A branch record with BRBINF_EL1.LASTFAILED set, implies that all
> + * preceding consecutive branch records, that were in a transaction
> + * (i.e their BRBINF_EL1.TX set) have been aborted.
> + *
> + * Similarly BRBFCR_EL1.LASTFAILED set, indicate that all preceding
> + * consecutive branch records upto the last record, which were in a
> + * transaction (i.e their BRBINF_EL1.TX set) have been aborted.
> + *
> + * --------------------------------- -------------------
> + * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
> + * --------------------------------- -------------------
> + * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
> + * --------------------------------- -------------------
> + * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
> + * --------------------------------- -------------------
> + * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> + * --------------------------------- -------------------
> + * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> + * --------------------------------- -------------------
> + * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
> + * --------------------------------- -------------------
> + * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
> + * --------------------------------- -------------------
> + * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> + * --------------------------------- -------------------
> + * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> + * --------------------------------- -------------------
> + * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> + * --------------------------------- -------------------

Are we guaranteed to have a record between two transactions with TX = 0?

AFAICT you could have a sequence where a TCOMMIT is immediately followed by a
TSTART, and IIUC in that case you could have back-to-back records for distinct
transactions all with TX = 1, where the first transaction could be commited,
and the second might fail/cancel.

... or do TCOMMIT/TCANCEL/TSTART get handled specially?

> + *
> + * BRBFCR_EL1.LASTFAILED == 1
> + *
> + * Here BRBFCR_EL1.LASTFAILED failes all those consecutive and also
> + * in transaction branches near the end of the BRBE buffer.
> + */
> +static void process_branch_aborts(struct pmu_hw_events *cpuc)
> +{
> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
> +	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> +	bool lastfailed = !!(brbfcr & BRBFCR_EL1_LASTFAILED);
> +	int idx = brbe_attr->brbe_nr - 1;
> +	struct perf_branch_entry *entry;
> +
> +	do {
> +		entry = &cpuc->branches->branch_entries[idx];
> +		if (entry->in_tx) {
> +			entry->abort = lastfailed;
> +		} else {
> +			lastfailed = entry->abort;
> +			entry->abort = false;
> +		}
> +	} while (idx--, idx >= 0);
> +}
> +
> +void armv8pmu_branch_reset(void)
> +{
> +	asm volatile(BRB_IALL);
> +	isb();
> +}
> +
> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
> +{
> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
> +	u64 brbinf, brbfcr, brbcr;
> +	int idx;
> +
> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> +
> +	/* Ensure pause on PMU interrupt is enabled */
> +	WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));

As above, I think this needs commentary in the interrupt handler, since this
presumably needs us to keep the IRQ asserted until we're done
reading/manipulating records in the IRQ handler.

Do we ever read this outside of the IRQ handler? AFAICT we don't, and that
makes it seem like some of this is redundant.

> +
> +	/* Save and clear the privilege */
> +	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);

Why? Later on we restore this, and AFAICT we don't modify it.

If it's paused, why do we care about the privilege?

> +
> +	/* Pause the buffer */
> +	write_sysreg_s(brbfcr | BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
> +	isb();

Why? If we're in the IRQ handler it's already paused, and if we're not in the
IRQ handler what prevents us racing with an IRQ?

> +
> +	for (idx = 0; idx < brbe_attr->brbe_nr; idx++) {
> +		struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
> +
> +		select_brbe_bank_index(idx);
> +		brbinf = get_brbinf_reg(idx);
> +		/*
> +		 * There are no valid entries anymore on the buffer.
> +		 * Abort the branch record processing to save some
> +		 * cycles and also reduce the capture/process load
> +		 * for the user space as well.
> +		 */
> +		if (brbe_invalid(brbinf))
> +			break;
> +
> +		perf_clear_branch_entry_bitfields(entry);
> +		if (brbe_valid(brbinf)) {
> +			entry->from = get_brbsrc_reg(idx);
> +			entry->to = get_brbtgt_reg(idx);
> +		} else if (brbe_source(brbinf)) {
> +			entry->from = get_brbsrc_reg(idx);
> +			entry->to = 0;
> +		} else if (brbe_target(brbinf)) {
> +			entry->from = 0;
> +			entry->to = get_brbtgt_reg(idx);
> +		}
> +		capture_brbe_flags(cpuc, event, brbinf, brbcr, idx);
> +	}
> +	cpuc->branches->branch_stack.nr = idx;
> +	cpuc->branches->branch_stack.hw_idx = -1ULL;
> +	process_branch_aborts(cpuc);
> +
> +	/* Restore privilege, enable pause on PMU interrupt */
> +	write_sysreg_s(brbcr | BRBCR_EL1_FZP, SYS_BRBCR_EL1);

Why do we have to save/restore this?

> +
> +	/* Unpause the buffer */
> +	write_sysreg_s(brbfcr & ~BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
> +	isb();
> +	armv8pmu_branch_reset();
> +}

Why do we enable it before we reset it?

Surely it would make sense to reset it first, and ammortize the cost of the ISB?

That said, as above, do we actually need to pause/unpause it? Or is it already
paused by virtue of the IRQ?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
  2023-01-12 13:24     ` Mark Rutland
@ 2023-01-13  3:02       ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-13  3:02 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Marc Zyngier, Mark Brown

On 1/12/23 18:54, Mark Rutland wrote:
> Hi Anshuman,
> 
> On Thu, Jan 05, 2023 at 08:40:35AM +0530, Anshuman Khandual wrote:
>> This adds BRBE related register definitions and various other related field
>> macros there in. These will be used subsequently in a BRBE driver which is
>> being added later on.
> 
> I haven't verified the specific values, but this looks good to me aside from
> one minor nit below.
> 
> [...]
> 
>> +# This is just a dummy register declaration to get all common field masks and
>> +# shifts for accessing given BRBINF contents.
>> +Sysreg	BRBINF_EL1	2	1	8	0	0
> 
> We don't need a dummy declaration, as we have 'SysregFields' that can be used
> for this, e.g.
> 
>   SysregFields BRBINFx_EL1
>   ...
>   EndSysregFields
> 
> ... which will avoid accidental usage of the register encoding. Note that I've
> also added an 'x' there in place of the index, which we do for other registers,
> e.g. TTBRx_EL1.
> 
> Could you please update to that?

There is a problem in defining SysregFields (which I did explore earlier as well).
SysregFields unfortunately does not support enums fields. Following build failure
comes up, while trying to convert BRBINFx_EL1 into a SysregFields definition.

Error at 932: unexpected Enum (inside SysregFields)

===============================================================================
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index a7f9054bd84c..519c4f080898 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -921,10 +921,7 @@ Enum       3:0     BT
 EndEnum
 EndSysreg
 
-
-# This is just a dummy register declaration to get all common field masks and
-# shifts for accessing given BRBINF contents.
-Sysreg BRBINF_EL1      2       1       8       0       0
+SysregFields BRBINFx_EL1
 Res0   63:47
 Field  46      CCU
 Field  45:32   CC
@@ -967,7 +964,7 @@ Enum        1:0     VALID
        0b10    SOURCE
        0b11    FULL
 EndEnum
-EndSysreg
+EndSysregFields
 
 Sysreg BRBCR_EL1       2       1       9       0       0
 Res0   63:24
===============================================================================

There are three enum fields in BRBINFx_EL1 as listed here.

Enum    13:8            TYPE
Enum    7:6		EL
Enum    1:0     	VALID

However, BRBINF_EL1 can be changed as BRBINFx_EL1, indicating its more generic
nature with a potential to be used for any index value register thereafter.

> 
> With that:
> 
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> 
> Mark.

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
@ 2023-01-13  3:02       ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-13  3:02 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Marc Zyngier, Mark Brown

On 1/12/23 18:54, Mark Rutland wrote:
> Hi Anshuman,
> 
> On Thu, Jan 05, 2023 at 08:40:35AM +0530, Anshuman Khandual wrote:
>> This adds BRBE related register definitions and various other related field
>> macros there in. These will be used subsequently in a BRBE driver which is
>> being added later on.
> 
> I haven't verified the specific values, but this looks good to me aside from
> one minor nit below.
> 
> [...]
> 
>> +# This is just a dummy register declaration to get all common field masks and
>> +# shifts for accessing given BRBINF contents.
>> +Sysreg	BRBINF_EL1	2	1	8	0	0
> 
> We don't need a dummy declaration, as we have 'SysregFields' that can be used
> for this, e.g.
> 
>   SysregFields BRBINFx_EL1
>   ...
>   EndSysregFields
> 
> ... which will avoid accidental usage of the register encoding. Note that I've
> also added an 'x' there in place of the index, which we do for other registers,
> e.g. TTBRx_EL1.
> 
> Could you please update to that?

There is a problem in defining SysregFields (which I did explore earlier as well).
SysregFields unfortunately does not support enums fields. Following build failure
comes up, while trying to convert BRBINFx_EL1 into a SysregFields definition.

Error at 932: unexpected Enum (inside SysregFields)

===============================================================================
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index a7f9054bd84c..519c4f080898 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -921,10 +921,7 @@ Enum       3:0     BT
 EndEnum
 EndSysreg
 
-
-# This is just a dummy register declaration to get all common field masks and
-# shifts for accessing given BRBINF contents.
-Sysreg BRBINF_EL1      2       1       8       0       0
+SysregFields BRBINFx_EL1
 Res0   63:47
 Field  46      CCU
 Field  45:32   CC
@@ -967,7 +964,7 @@ Enum        1:0     VALID
        0b10    SOURCE
        0b11    FULL
 EndEnum
-EndSysreg
+EndSysregFields
 
 Sysreg BRBCR_EL1       2       1       9       0       0
 Res0   63:24
===============================================================================

There are three enum fields in BRBINFx_EL1 as listed here.

Enum    13:8            TYPE
Enum    7:6		EL
Enum    1:0     	VALID

However, BRBINF_EL1 can be changed as BRBINFx_EL1, indicating its more generic
nature with a potential to be used for any index value register thereafter.

> 
> With that:
> 
> Acked-by: Mark Rutland <mark.rutland@arm.com>
> 
> Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 3/6] arm64/perf: Add branch stack support in struct arm_pmu
  2023-01-12 13:54     ` Mark Rutland
@ 2023-01-13  4:15       ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-13  4:15 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon


On 1/12/23 19:24, Mark Rutland wrote:
> On Thu, Jan 05, 2023 at 08:40:36AM +0530, Anshuman Khandual wrote:
>> This updates 'struct arm_pmu' for branch stack sampling support later. This
>> adds a new 'features' element in the structure to track supported features,
>> and another 'private' element to encapsulate implementation attributes on a
>> given 'struct arm_pmu'. These updates here will help in tracking any branch
>> stack sampling support, which is being added later. This also adds a helper
>> arm_pmu_branch_stack_supported().
>>
>> This also enables perf branch stack sampling event on all 'struct arm pmu',
>> supporting the feature but after removing the current gate that blocks such
>> events unconditionally in armpmu_event_init(). Instead a quick probe can be
>> initiated via arm_pmu_branch_stack_supported() to ascertain the support.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  drivers/perf/arm_pmu.c       | 3 +--
>>  include/linux/perf/arm_pmu.h | 9 +++++++++
>>  2 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
>> index 14a3ed3bdb0b..a85b2d67022e 100644
>> --- a/drivers/perf/arm_pmu.c
>> +++ b/drivers/perf/arm_pmu.c
>> @@ -510,8 +510,7 @@ static int armpmu_event_init(struct perf_event *event)
>>  		!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
>>  		return -ENOENT;
>>  
>> -	/* does not support taken branch sampling */
>> -	if (has_branch_stack(event))
>> +	if (has_branch_stack(event) && !arm_pmu_branch_stack_supported(armpmu))
>>  		return -EOPNOTSUPP;
>>  
>>  	return __hw_perf_event_init(event);
>> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
>> index 2a9d07cee927..64e1b2594025 100644
>> --- a/include/linux/perf/arm_pmu.h
>> +++ b/include/linux/perf/arm_pmu.h
>> @@ -80,11 +80,14 @@ enum armpmu_attr_groups {
>>  	ARMPMU_NR_ATTR_GROUPS
>>  };
>>  
>> +#define ARM_PMU_BRANCH_STACK	BIT(0)
>> +
>>  struct arm_pmu {
>>  	struct pmu	pmu;
>>  	cpumask_t	supported_cpus;
>>  	char		*name;
>>  	int		pmuver;
>> +	int		features;
>>  	irqreturn_t	(*handle_irq)(struct arm_pmu *pmu);
>>  	void		(*enable)(struct perf_event *event);
>>  	void		(*disable)(struct perf_event *event);
> 
> Hmm, we already have the secure_access field separately. How about we fold that
> in and go with:
> 
> 	unsigned int	secure_access    : 1,
> 			has_branch_stack : 1;

Something like this would work, but should we use __u32 instead of unsigned int
to ensure 32 bit width ?

-       bool            secure_access; /* 32-bit ARM only */
+       unsigned int    secure_access   : 1, /* 32-bit ARM only */
+                       has_branch_stack: 1,
+                       reserved        : 31;

> 
> ... that way we have one way to manage flags, we don't need to allocate the
> bits, and the bulk of the existing code for secure_access can stay as-is.

Right, the changed code also builds on arm32 without any code change.

> 
>> @@ -119,8 +122,14 @@ struct arm_pmu {
>>  
>>  	/* Only to be used by ACPI probing code */
>>  	unsigned long acpi_cpuid;
>> +	void		*private;
> 
> Does this need to be on the end of struct arm_pmu, or can it be placed earlier?

This additional 'private' attribute structure sticking out from struct arm_pmu
should be at the end. But is there any benefit moving this earlier ?

> 
> The line spacing makes it look like the ACPI comment applies to 'private',
> which isn't the case.

Sure, will add the following comment, and a space in between.

diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index f60f7e01acae..c0a090ff7991 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -130,6 +130,8 @@ struct arm_pmu {
 
        /* Only to be used by ACPI probing code */
        unsigned long acpi_cpuid;
+
+       /* Implementation specific attributes */
        void            *private;
 };

> 
>>  };
>>  
>> +static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
>> +{
>> +	return armpmu->features & ARM_PMU_BRANCH_STACK;
>> +}
> 
> With the above, this would become:
> 
> static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
> {
> 	return armpmu->has_branch_stack;
> }

Right, will change this helper as required.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 3/6] arm64/perf: Add branch stack support in struct arm_pmu
@ 2023-01-13  4:15       ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-13  4:15 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon


On 1/12/23 19:24, Mark Rutland wrote:
> On Thu, Jan 05, 2023 at 08:40:36AM +0530, Anshuman Khandual wrote:
>> This updates 'struct arm_pmu' for branch stack sampling support later. This
>> adds a new 'features' element in the structure to track supported features,
>> and another 'private' element to encapsulate implementation attributes on a
>> given 'struct arm_pmu'. These updates here will help in tracking any branch
>> stack sampling support, which is being added later. This also adds a helper
>> arm_pmu_branch_stack_supported().
>>
>> This also enables perf branch stack sampling event on all 'struct arm pmu',
>> supporting the feature but after removing the current gate that blocks such
>> events unconditionally in armpmu_event_init(). Instead a quick probe can be
>> initiated via arm_pmu_branch_stack_supported() to ascertain the support.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  drivers/perf/arm_pmu.c       | 3 +--
>>  include/linux/perf/arm_pmu.h | 9 +++++++++
>>  2 files changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
>> index 14a3ed3bdb0b..a85b2d67022e 100644
>> --- a/drivers/perf/arm_pmu.c
>> +++ b/drivers/perf/arm_pmu.c
>> @@ -510,8 +510,7 @@ static int armpmu_event_init(struct perf_event *event)
>>  		!cpumask_test_cpu(event->cpu, &armpmu->supported_cpus))
>>  		return -ENOENT;
>>  
>> -	/* does not support taken branch sampling */
>> -	if (has_branch_stack(event))
>> +	if (has_branch_stack(event) && !arm_pmu_branch_stack_supported(armpmu))
>>  		return -EOPNOTSUPP;
>>  
>>  	return __hw_perf_event_init(event);
>> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
>> index 2a9d07cee927..64e1b2594025 100644
>> --- a/include/linux/perf/arm_pmu.h
>> +++ b/include/linux/perf/arm_pmu.h
>> @@ -80,11 +80,14 @@ enum armpmu_attr_groups {
>>  	ARMPMU_NR_ATTR_GROUPS
>>  };
>>  
>> +#define ARM_PMU_BRANCH_STACK	BIT(0)
>> +
>>  struct arm_pmu {
>>  	struct pmu	pmu;
>>  	cpumask_t	supported_cpus;
>>  	char		*name;
>>  	int		pmuver;
>> +	int		features;
>>  	irqreturn_t	(*handle_irq)(struct arm_pmu *pmu);
>>  	void		(*enable)(struct perf_event *event);
>>  	void		(*disable)(struct perf_event *event);
> 
> Hmm, we already have the secure_access field separately. How about we fold that
> in and go with:
> 
> 	unsigned int	secure_access    : 1,
> 			has_branch_stack : 1;

Something like this would work, but should we use __u32 instead of unsigned int
to ensure 32 bit width ?

-       bool            secure_access; /* 32-bit ARM only */
+       unsigned int    secure_access   : 1, /* 32-bit ARM only */
+                       has_branch_stack: 1,
+                       reserved        : 31;

> 
> ... that way we have one way to manage flags, we don't need to allocate the
> bits, and the bulk of the existing code for secure_access can stay as-is.

Right, the changed code also builds on arm32 without any code change.

> 
>> @@ -119,8 +122,14 @@ struct arm_pmu {
>>  
>>  	/* Only to be used by ACPI probing code */
>>  	unsigned long acpi_cpuid;
>> +	void		*private;
> 
> Does this need to be on the end of struct arm_pmu, or can it be placed earlier?

This additional 'private' attribute structure sticking out from struct arm_pmu
should be at the end. But is there any benefit moving this earlier ?

> 
> The line spacing makes it look like the ACPI comment applies to 'private',
> which isn't the case.

Sure, will add the following comment, and a space in between.

diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index f60f7e01acae..c0a090ff7991 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -130,6 +130,8 @@ struct arm_pmu {
 
        /* Only to be used by ACPI probing code */
        unsigned long acpi_cpuid;
+
+       /* Implementation specific attributes */
        void            *private;
 };

> 
>>  };
>>  
>> +static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
>> +{
>> +	return armpmu->features & ARM_PMU_BRANCH_STACK;
>> +}
> 
> With the above, this would become:
> 
> static inline bool arm_pmu_branch_stack_supported(struct arm_pmu *armpmu)
> {
> 	return armpmu->has_branch_stack;
> }

Right, will change this helper as required.

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
  2023-01-12 14:29     ` Mark Rutland
@ 2023-01-13  5:11       ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-13  5:11 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon



On 1/12/23 19:59, Mark Rutland wrote:
> On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
>> This enables support for branch stack sampling event in ARMV8 PMU, checking
>> has_branch_stack() on the event inside 'struct arm_pmu' callbacks. Although
>> these branch stack helpers armv8pmu_branch_XXXXX() are just dummy functions
>> for now. While here, this also defines arm_pmu's sched_task() callback with
>> armv8pmu_sched_task(), which resets the branch record buffer on a sched_in.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/arm64/include/asm/perf_event.h | 10 +++++++++
>>  arch/arm64/kernel/perf_event.c      | 35 +++++++++++++++++++++++++++++
>>  2 files changed, 45 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
>> index 3eaf462f5752..a038902d6874 100644
>> --- a/arch/arm64/include/asm/perf_event.h
>> +++ b/arch/arm64/include/asm/perf_event.h
>> @@ -273,4 +273,14 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
>>  	(regs)->pstate = PSR_MODE_EL1h;	\
>>  }
>>  
>> +struct pmu_hw_events;
>> +struct arm_pmu;
>> +struct perf_event;
>> +
>> +static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
>> +static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
>> +static inline void armv8pmu_branch_enable(struct perf_event *event) { }
>> +static inline void armv8pmu_branch_disable(struct perf_event *event) { }
>> +static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
>> +static inline void armv8pmu_branch_reset(void) { }
> 
> As far as I can tell, these are not supposed to be called when
> !has_branch_stack(), so it would be good if these had a WARN() or similar to
> spot buggy usage.

This is actually true except the last two helper functions, which get called in the
generic PMU context i.e while probing or resetting the PMU. While probing it is not
yet known whether the PMU supports branch stack or not, but while resetting the PMU
arm_pmu_branch_stack_supported() is checked to ensure there is a buffer to be reset
via a special instruction. Will change the first four functions to add warnings in
case the event is not a branch stack one.

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index cf2e88c7b707..ab1d180e17a6 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -285,10 +285,27 @@ void armv8pmu_branch_disable(struct perf_event *event);
 void armv8pmu_branch_probe(struct arm_pmu *arm_pmu);
 void armv8pmu_branch_reset(void);
 #else
-static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
-static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
-static inline void armv8pmu_branch_enable(struct perf_event *event) { }
-static inline void armv8pmu_branch_disable(struct perf_event *event) { }
+static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
+{
+       WARN_ON_ONCE(!has_branch_stack(event));
+}
+
+static inline bool armv8pmu_branch_valid(struct perf_event *event)
+{
+       WARN_ON_ONCE(!has_branch_stack(event));
+       return false;
+}
+
+static inline void armv8pmu_branch_enable(struct perf_event *event)
+{
+       WARN_ON_ONCE(!has_branch_stack(event));
+}
+
+static inline void armv8pmu_branch_disable(struct perf_event *event)
+{
+       WARN_ON_ONCE(!has_branch_stack(event));
+}
+
 static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
 static inline void armv8pmu_branch_reset(void) { }
 #endif



> 
>>  #endif
>> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
>> index a5193f2146a6..8805b4516088 100644
>> --- a/arch/arm64/kernel/perf_event.c
>> +++ b/arch/arm64/kernel/perf_event.c
>> @@ -789,10 +789,22 @@ static void armv8pmu_enable_event(struct perf_event *event)
>>  	 * Enable counter
>>  	 */
>>  	armv8pmu_enable_event_counter(event);
>> +
>> +	/*
>> +	 * Enable BRBE
>> +	 */
>> +	if (has_branch_stack(event))
>> +		armv8pmu_branch_enable(event);
>>  }
> 
> This looks fine, but tbh I think we should delete the existing comments above
> each function call as they're blindingly obvious and just waste space.
> 
>>  static void armv8pmu_disable_event(struct perf_event *event)
>>  {
>> +	/*
>> +	 * Disable BRBE
>> +	 */
>> +	if (has_branch_stack(event))
>> +		armv8pmu_branch_disable(event);
>> +
> 
> Likewise here.

Dropped all the comments in armv8pmu_enable_event() and in armv8pmu_disable_event()
and removed the redundant interleaving lines as well.

> 
>>  	/*
>>  	 * Disable counter
>>  	 */
>> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>>  		if (!armpmu_event_set_period(event))
>>  			continue;
>>  
>> +		if (has_branch_stack(event)) {
>> +			WARN_ON(!cpuc->branches);
>> +			armv8pmu_branch_read(cpuc, event);
>> +			data.br_stack = &cpuc->branches->branch_stack;
>> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
>> +		}
> 
> How do we ensure the data we're getting isn't changed under our feet? Is BRBE
> disabled at this point?

Right, BRBE is paused after a PMU IRQ. We also ensure the buffer is disabled for
all exception levels, i.e removing BRBCR_EL1_E0BRE/E1BRE from the configuration,
before initiating the actual read, which eventually populates the data.br_stack.

> 
> Is this going to have branches after taking the exception, or does BRBE stop
> automatically at that point? If so we presumably need to take special care as
> to when we read this relative to enabling/disabling and/or manipulating the
> overflow bits.

The default BRBE configuration includes setting BRBCR_EL1.FZP, enabling BRBE to
be paused automatically, right after a PMU IRQ. Regardless, before reading the
buffer, BRBE is paused (BRBFCR_EL1.PAUSED) and disabled for all privilege levels
~(BRBCR_EL1.E0BRE/E1BRE) which ensures that no new branch record is getting into
the buffer, while it is being read for perf right buffer.

> 
>> +
>>  		/*
>>  		 * Perf event overflow will queue the processing of the event as
>>  		 * an irq_work which will be taken care of in the handling of
>> @@ -976,6 +995,14 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
>>  	return event->hw.idx;
>>  }
>>  
>> +static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
>> +{
>> +	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
>> +
>> +	if (sched_in && arm_pmu_branch_stack_supported(armpmu))
>> +		armv8pmu_branch_reset();
>> +}
> 
> When scheduling out, shouldn't we save what we have so far?
> 
> It seems odd that we just throw that away rather than placing it into a FIFO.

IIRC we had discussed this earlier, save and restore mechanism will be added
later, not during this enablement patch series. For now resetting the buffer
ensures that branch records from one session does not get into another. Note
that these branches cannot be pushed into perf ring buffer either, as there
was no corresponding PMU interrupt to be associated with.

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
@ 2023-01-13  5:11       ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-13  5:11 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon



On 1/12/23 19:59, Mark Rutland wrote:
> On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
>> This enables support for branch stack sampling event in ARMV8 PMU, checking
>> has_branch_stack() on the event inside 'struct arm_pmu' callbacks. Although
>> these branch stack helpers armv8pmu_branch_XXXXX() are just dummy functions
>> for now. While here, this also defines arm_pmu's sched_task() callback with
>> armv8pmu_sched_task(), which resets the branch record buffer on a sched_in.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/arm64/include/asm/perf_event.h | 10 +++++++++
>>  arch/arm64/kernel/perf_event.c      | 35 +++++++++++++++++++++++++++++
>>  2 files changed, 45 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
>> index 3eaf462f5752..a038902d6874 100644
>> --- a/arch/arm64/include/asm/perf_event.h
>> +++ b/arch/arm64/include/asm/perf_event.h
>> @@ -273,4 +273,14 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
>>  	(regs)->pstate = PSR_MODE_EL1h;	\
>>  }
>>  
>> +struct pmu_hw_events;
>> +struct arm_pmu;
>> +struct perf_event;
>> +
>> +static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
>> +static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
>> +static inline void armv8pmu_branch_enable(struct perf_event *event) { }
>> +static inline void armv8pmu_branch_disable(struct perf_event *event) { }
>> +static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
>> +static inline void armv8pmu_branch_reset(void) { }
> 
> As far as I can tell, these are not supposed to be called when
> !has_branch_stack(), so it would be good if these had a WARN() or similar to
> spot buggy usage.

This is actually true except the last two helper functions, which get called in the
generic PMU context i.e while probing or resetting the PMU. While probing it is not
yet known whether the PMU supports branch stack or not, but while resetting the PMU
arm_pmu_branch_stack_supported() is checked to ensure there is a buffer to be reset
via a special instruction. Will change the first four functions to add warnings in
case the event is not a branch stack one.

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
index cf2e88c7b707..ab1d180e17a6 100644
--- a/arch/arm64/include/asm/perf_event.h
+++ b/arch/arm64/include/asm/perf_event.h
@@ -285,10 +285,27 @@ void armv8pmu_branch_disable(struct perf_event *event);
 void armv8pmu_branch_probe(struct arm_pmu *arm_pmu);
 void armv8pmu_branch_reset(void);
 #else
-static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
-static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
-static inline void armv8pmu_branch_enable(struct perf_event *event) { }
-static inline void armv8pmu_branch_disable(struct perf_event *event) { }
+static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
+{
+       WARN_ON_ONCE(!has_branch_stack(event));
+}
+
+static inline bool armv8pmu_branch_valid(struct perf_event *event)
+{
+       WARN_ON_ONCE(!has_branch_stack(event));
+       return false;
+}
+
+static inline void armv8pmu_branch_enable(struct perf_event *event)
+{
+       WARN_ON_ONCE(!has_branch_stack(event));
+}
+
+static inline void armv8pmu_branch_disable(struct perf_event *event)
+{
+       WARN_ON_ONCE(!has_branch_stack(event));
+}
+
 static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
 static inline void armv8pmu_branch_reset(void) { }
 #endif



> 
>>  #endif
>> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
>> index a5193f2146a6..8805b4516088 100644
>> --- a/arch/arm64/kernel/perf_event.c
>> +++ b/arch/arm64/kernel/perf_event.c
>> @@ -789,10 +789,22 @@ static void armv8pmu_enable_event(struct perf_event *event)
>>  	 * Enable counter
>>  	 */
>>  	armv8pmu_enable_event_counter(event);
>> +
>> +	/*
>> +	 * Enable BRBE
>> +	 */
>> +	if (has_branch_stack(event))
>> +		armv8pmu_branch_enable(event);
>>  }
> 
> This looks fine, but tbh I think we should delete the existing comments above
> each function call as they're blindingly obvious and just waste space.
> 
>>  static void armv8pmu_disable_event(struct perf_event *event)
>>  {
>> +	/*
>> +	 * Disable BRBE
>> +	 */
>> +	if (has_branch_stack(event))
>> +		armv8pmu_branch_disable(event);
>> +
> 
> Likewise here.

Dropped all the comments in armv8pmu_enable_event() and in armv8pmu_disable_event()
and removed the redundant interleaving lines as well.

> 
>>  	/*
>>  	 * Disable counter
>>  	 */
>> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>>  		if (!armpmu_event_set_period(event))
>>  			continue;
>>  
>> +		if (has_branch_stack(event)) {
>> +			WARN_ON(!cpuc->branches);
>> +			armv8pmu_branch_read(cpuc, event);
>> +			data.br_stack = &cpuc->branches->branch_stack;
>> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
>> +		}
> 
> How do we ensure the data we're getting isn't changed under our feet? Is BRBE
> disabled at this point?

Right, BRBE is paused after a PMU IRQ. We also ensure the buffer is disabled for
all exception levels, i.e removing BRBCR_EL1_E0BRE/E1BRE from the configuration,
before initiating the actual read, which eventually populates the data.br_stack.

> 
> Is this going to have branches after taking the exception, or does BRBE stop
> automatically at that point? If so we presumably need to take special care as
> to when we read this relative to enabling/disabling and/or manipulating the
> overflow bits.

The default BRBE configuration includes setting BRBCR_EL1.FZP, enabling BRBE to
be paused automatically, right after a PMU IRQ. Regardless, before reading the
buffer, BRBE is paused (BRBFCR_EL1.PAUSED) and disabled for all privilege levels
~(BRBCR_EL1.E0BRE/E1BRE) which ensures that no new branch record is getting into
the buffer, while it is being read for perf right buffer.

> 
>> +
>>  		/*
>>  		 * Perf event overflow will queue the processing of the event as
>>  		 * an irq_work which will be taken care of in the handling of
>> @@ -976,6 +995,14 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
>>  	return event->hw.idx;
>>  }
>>  
>> +static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
>> +{
>> +	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
>> +
>> +	if (sched_in && arm_pmu_branch_stack_supported(armpmu))
>> +		armv8pmu_branch_reset();
>> +}
> 
> When scheduling out, shouldn't we save what we have so far?
> 
> It seems odd that we just throw that away rather than placing it into a FIFO.

IIRC we had discussed this earlier, save and restore mechanism will be added
later, not during this enablement patch series. For now resetting the buffer
ensures that branch records from one session does not get into another. Note
that these branches cannot be pushed into perf ring buffer either, as there
was no corresponding PMU interrupt to be associated with.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
  2023-01-12 16:51     ` Mark Rutland
@ 2023-01-19  2:48       ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-19  2:48 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On 1/12/23 22:21, Mark Rutland wrote:
> On Thu, Jan 05, 2023 at 08:40:39AM +0530, Anshuman Khandual wrote:
>> This enables branch stack sampling events in ARMV8 PMU, via an architecture
>> feature FEAT_BRBE aka branch record buffer extension. This defines required
>> branch helper functions pmuv8pmu_branch_XXXXX() and the implementation here
>> is wrapped with a new config option CONFIG_ARM64_BRBE.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/arm64/Kconfig                  |  11 +
>>  arch/arm64/include/asm/perf_event.h |   9 +
>>  arch/arm64/kernel/Makefile          |   1 +
>>  arch/arm64/kernel/brbe.c            | 512 ++++++++++++++++++++++++++++
>>  arch/arm64/kernel/brbe.h            | 257 ++++++++++++++
>>  5 files changed, 790 insertions(+)
>>  create mode 100644 arch/arm64/kernel/brbe.c
>>  create mode 100644 arch/arm64/kernel/brbe.h
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 03934808b2ed..915b12709a46 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -1363,6 +1363,17 @@ config HW_PERF_EVENTS
>>  	def_bool y
>>  	depends on ARM_PMU
>>  
>> +config ARM64_BRBE
>> +	bool "Enable support for Branch Record Buffer Extension (BRBE)"
>> +	depends on PERF_EVENTS && ARM64 && ARM_PMU
>> +	default y
>> +	help
>> +	  Enable perf support for Branch Record Buffer Extension (BRBE) which
>> +	  records all branches taken in an execution path. This supports some
>> +	  branch types and privilege based filtering. It captured additional
>> +	  relevant information such as cycle count, misprediction and branch
>> +	  type, branch privilege level etc.
>> +
>>  # Supported by clang >= 7.0 or GCC >= 12.0.0
>>  config CC_HAVE_SHADOW_CALL_STACK
>>  	def_bool $(cc-option, -fsanitize=shadow-call-stack -ffixed-x18)
>> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
>> index a038902d6874..cf2e88c7b707 100644
>> --- a/arch/arm64/include/asm/perf_event.h
>> +++ b/arch/arm64/include/asm/perf_event.h
>> @@ -277,6 +277,14 @@ struct pmu_hw_events;
>>  struct arm_pmu;
>>  struct perf_event;
>>  
>> +#ifdef CONFIG_ARM64_BRBE
>> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event);
>> +bool armv8pmu_branch_valid(struct perf_event *event);
>> +void armv8pmu_branch_enable(struct perf_event *event);
>> +void armv8pmu_branch_disable(struct perf_event *event);
>> +void armv8pmu_branch_probe(struct arm_pmu *arm_pmu);
>> +void armv8pmu_branch_reset(void);
>> +#else
>>  static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
>>  static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
>>  static inline void armv8pmu_branch_enable(struct perf_event *event) { }
>> @@ -284,3 +292,4 @@ static inline void armv8pmu_branch_disable(struct perf_event *event) { }
>>  static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
>>  static inline void armv8pmu_branch_reset(void) { }
>>  #endif
>> +#endif
>> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
>> index ceba6792f5b3..6ee7ccb61621 100644
>> --- a/arch/arm64/kernel/Makefile
>> +++ b/arch/arm64/kernel/Makefile
>> @@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)			+= module.o
>>  obj-$(CONFIG_ARM64_MODULE_PLTS)		+= module-plts.o
>>  obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o perf_callchain.o
>>  obj-$(CONFIG_HW_PERF_EVENTS)		+= perf_event.o
>> +obj-$(CONFIG_ARM64_BRBE)		+= brbe.o
>>  obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
>>  obj-$(CONFIG_CPU_PM)			+= sleep.o suspend.o
>>  obj-$(CONFIG_CPU_IDLE)			+= cpuidle.o
>> diff --git a/arch/arm64/kernel/brbe.c b/arch/arm64/kernel/brbe.c
>> new file mode 100644
>> index 000000000000..cd03d3531e04
>> --- /dev/null
>> +++ b/arch/arm64/kernel/brbe.c
>> @@ -0,0 +1,512 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Branch Record Buffer Extension Driver.
>> + *
>> + * Copyright (C) 2022 ARM Limited
>> + *
>> + * Author: Anshuman Khandual <anshuman.khandual@arm.com>
>> + */
>> +#include "brbe.h"
>> +
>> +static bool valid_brbe_nr(int brbe_nr)
>> +{
>> +	return brbe_nr == BRBIDR0_EL1_NUMREC_8 ||
>> +	       brbe_nr == BRBIDR0_EL1_NUMREC_16 ||
>> +	       brbe_nr == BRBIDR0_EL1_NUMREC_32 ||
>> +	       brbe_nr == BRBIDR0_EL1_NUMREC_64;
>> +}
>> +
>> +static bool valid_brbe_cc(int brbe_cc)
>> +{
>> +	return brbe_cc == BRBIDR0_EL1_CC_20_BIT;
>> +}
>> +
>> +static bool valid_brbe_format(int brbe_format)
>> +{
>> +	return brbe_format == BRBIDR0_EL1_FORMAT_0;
>> +}
>> +
>> +static bool valid_brbe_version(int brbe_version)
>> +{
>> +	return brbe_version == ID_AA64DFR0_EL1_BRBE_IMP ||
>> +	       brbe_version == ID_AA64DFR0_EL1_BRBE_BRBE_V1P1;
>> +}
>> +
>> +static void select_brbe_bank(int bank)
>> +{
>> +	static int brbe_current_bank = BRBE_BANK_IDX_INVALID;
> 
> This is a per-cpu peroperty, so I don't understand how this can safely be
> stored in a static variable. If this is necessary it needs to go in a per-cpu
> variable, but I suspect we don't actually need it.

You are right, we dont need it.

> 
>> +	u64 brbfcr;
>> +
>> +	if (brbe_current_bank == bank)
>> +		return;
> 
> It looks like this is just for the same of optimizing redundant changes when
> armv8pmu_branch_read() iterates over the records?

Right, it is.

> 
> It'd be simpler to have armv8pmu_branch_read() iterate over each bank, then
> within that iterate over each record within that bank.

Sure, will drop this optimization completely. I will split the iteration into two
separate loops, one each for bank 0 and other for bank 1.

> 
>> +	WARN_ON(bank > BRBE_BANK_IDX_1);
>> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>> +	brbfcr &= ~BRBFCR_EL1_BANK_MASK;
>> +	brbfcr |= ((bank << BRBFCR_EL1_BANK_SHIFT) & BRBFCR_EL1_BANK_MASK);
> 
> You can use SYS_FIELD_PREP() for this:

Sure, will do.

> 
> 	brbfcr &= ~BRBFCR_EL1_BANK_MASK;
> 	brbfcr |= SYS_FIELD_PREP(BRBFCR_EL1, BANK, bank);
> 
> Please use FIELD_PREP for this.

Done.

> 
>> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
>> +	isb();
>> +	brbe_current_bank = bank;
>> +}
>> +
>> +static void select_brbe_bank_index(int buffer_idx)
>> +{
>> +	switch (buffer_idx) {
>> +	case BRBE_BANK0_IDX_MIN ... BRBE_BANK0_IDX_MAX:
>> +		select_brbe_bank(BRBE_BANK_IDX_0);
>> +		break;
>> +	case BRBE_BANK1_IDX_MIN ... BRBE_BANK1_IDX_MAX:
>> +		select_brbe_bank(BRBE_BANK_IDX_1);
>> +		break;
>> +	default:
>> +		pr_warn("unsupported BRBE index\n");
> 
> It would be worth logging the specific index in case we ever have to debug
> this. It's probably worth also making this a WARN_ONCE() or WARN_RATELIMITED().

This function will not be required once individual loops based read for each
BRBE bank is implemented, thus reducing number of times select_brbe_bank()
gets called i.e just two times once for bank 0 and other for bank 1.

> 
>> +	}
>> +}
>> +
>> +static const char branch_filter_error_msg[] = "branch filter not supported";
>> +
>> +bool armv8pmu_branch_valid(struct perf_event *event)
>> +{
>> +	u64 branch_type = event->attr.branch_sample_type;
>> +
>> +	/*
>> +	 * If the event does not have at least one of the privilege
>> +	 * branch filters as in PERF_SAMPLE_BRANCH_PLM_ALL, the core
>> +	 * perf will adjust its value based on perf event's existing
>> +	 * privilege level via attr.exclude_[user|kernel|hv].
>> +	 *
>> +	 * As event->attr.branch_sample_type might have been changed
>> +	 * when the event reaches here, it is not possible to figure
>> +	 * out whether the event originally had HV privilege request
>> +	 * or got added via the core perf. Just report this situation
>> +	 * once and continue ignoring if there are other instances.
>> +	 */
>> +	if ((branch_type & PERF_SAMPLE_BRANCH_HV) && !is_kernel_in_hyp_mode())
>> +		pr_warn_once("%s - hypervisor privilege\n", branch_filter_error_msg);
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
>> +		pr_warn_once("%s - aborted transaction\n", branch_filter_error_msg);
>> +		return false;
>> +	}
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_NO_TX) {
>> +		pr_warn_once("%s - no transaction\n", branch_filter_error_msg);
>> +		return false;
>> +	}
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_IN_TX) {
>> +		pr_warn_once("%s - in transaction\n", branch_filter_error_msg);
>> +		return false;
>> +	}
>> +	return true;
>> +}
> 
> Is this called when validating user input? If so, NAK to printing anything to a
> higher leval than debug. If there are constraints the user needs to be aware of

You mean pr_debug() based prints ?

> we should expose the relevant information under sysfs, but it seems that these
> are just generic perf options that BRBE doesn't support.

Right, these are generic perf options. As you mentioned, will replace these with
pr_debug() instead.

> 
> It would be better to whitelist what we do support rather than blacklisting
> what we don't.

But with a negative list, user would know what is not supported via these pr_debug()
based output when enabled ? But I dont have a strong opinion either way.

> 
>> +
>> +static void branch_records_alloc(struct arm_pmu *armpmu)
>> +{
>> +	struct pmu_hw_events *events;
>> +	int cpu;
>> +
>> +	for_each_possible_cpu(cpu) {
>> +		events = per_cpu_ptr(armpmu->hw_events, cpu);
>> +
>> +		events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
>> +		WARN_ON(!events->branches);
>> +	}
>> +}
> 
> It would be simpler for this to be a percpu allocation.

Could you please be more specific ? alloc_percpu_gfp() cannot be used here
because 'events->branches' is not a __percpu variable unlike its parent
'events' which is derived from armpmu.

> 
> If the allocation fails, we should propogate that error rather than just
> WARNing, and fail probing the PMU.

Sure, will change that.

> 
> Also, if the generic allocator fails it will print a warning (unless
> __GFP_NOWARN was used), so we don't need the warning here.

Sure, understood.

> 
>> +
>> +static int brbe_attributes_probe(struct arm_pmu *armpmu, u32 brbe)
>> +{
>> +	struct brbe_hw_attr *brbe_attr = kzalloc(sizeof(struct brbe_hw_attr), GFP_KERNEL);
> 
> Same comments as for the failure path in branch_records_alloc().
> 
>> +	u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
> 
> Which context is this run in? Unless this is affine to a relevant CPU we can't
> read the sysreg safely, and if we're in a cross-call we cannot allocate memory,
> so this doesn't look right to me.

Called from smp_call_function_any() context via __armv8pmu_probe_pmu().

> 
> I suspect CONFIG_DEBUG_ATOMIC_SLEEP=y and/or CONFIG_PROVE_LOCKING=y will complain here.

Right, it does. Remember dropping pr_info() during BRBE probe for the exact same
reason here but did not realize we will run into the same problem again.

> 
> Please follow the approach of armv8pmu_probe_pmu(), where we use a probe_info
> structure that the callee can fill with information. Then we can do the
> allocation in the main thread from a non-atomic context.

Right, will do that. The only problem being 'struct brbe_hw_attr' which will not be
visible in the main thread, might need an abstraction function to do the allocation
in BRBE implementation. Regardless, a successful BRBE in the preceding function can
be ascertained from armpmu->has_branch_stack().

> 
>> +
>> +	WARN_ON(!brbe_attr);
>> +	armpmu->private = brbe_attr;
>> +
>> +	brbe_attr->brbe_version = brbe;
>> +	brbe_attr->brbe_format = brbe_fetch_format(brbidr);
>> +	brbe_attr->brbe_cc = brbe_fetch_cc_bits(brbidr);
>> +	brbe_attr->brbe_nr = brbe_fetch_numrec(brbidr);
> 
> As a minor thing, could we please s/fetch/get/ ? To me, 'fetch' sounds like a
> memory operation, and elsewhere we use 'get' for this sort of getter function.

Sure, but shall we change fetch as get for entire BRBE implementation (where ever
there is a determination of field from a register value) or just the above function ?
Default, will change all places.

> 
>> +
>> +	if (!valid_brbe_version(brbe_attr->brbe_version) ||
>> +	   !valid_brbe_format(brbe_attr->brbe_format) ||
>> +	   !valid_brbe_cc(brbe_attr->brbe_cc) ||
>> +	   !valid_brbe_nr(brbe_attr->brbe_nr))
>> +		return -EOPNOTSUPP;
>> +
>> +	return 0;
>> +}
>> +
>> +void armv8pmu_branch_probe(struct arm_pmu *armpmu)
>> +{
>> +	u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
>> +	u32 brbe;
>> +
>> +	brbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT);
>> +	if (!brbe)
>> +		return;
>> +
>> +	if (brbe_attributes_probe(armpmu, brbe))
>> +		return;
>> +
>> +	branch_records_alloc(armpmu);
>> +	armpmu->features |= ARM_PMU_BRANCH_STACK;
>> +}
>> +
>> +static u64 branch_type_to_brbfcr(int branch_type)
>> +{
>> +	u64 brbfcr = 0;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
>> +		brbfcr |= BRBFCR_EL1_BRANCH_FILTERS;
>> +		return brbfcr;
>> +	}
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
>> +		brbfcr |= BRBFCR_EL1_INDCALL;
>> +		brbfcr |= BRBFCR_EL1_DIRCALL;
>> +	}
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
>> +		brbfcr |= BRBFCR_EL1_RTN;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
>> +		brbfcr |= BRBFCR_EL1_INDCALL;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_COND)
>> +		brbfcr |= BRBFCR_EL1_CONDDIR;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP)
>> +		brbfcr |= BRBFCR_EL1_INDIRECT;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_CALL)
>> +		brbfcr |= BRBFCR_EL1_DIRCALL;
>> +
>> +	return brbfcr;
>> +}
>> +
>> +static u64 branch_type_to_brbcr(int branch_type)
>> +{
>> +	u64 brbcr = (BRBCR_EL1_FZP | BRBCR_EL1_DEFAULT_TS);
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_USER)
>> +		brbcr |= BRBCR_EL1_E0BRE;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_KERNEL)
>> +		brbcr |= BRBCR_EL1_E1BRE;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_HV) {
>> +		if (is_kernel_in_hyp_mode())
>> +			brbcr |= BRBCR_EL1_E1BRE;
>> +	}
> 
> I assume that in that case we're actually writing to BRBCR_EL2, and this is
> actually the E2BRE bit, which is at the same position? If so, I think that's
> worth a comment above the USER/KERNEL/HV bits here.

That is right, will add a comment.

> 
> How do the BRB* control registers work with E2H? Is BRBCR_EL1 rewritten to
> BRBCR_EL2 by the hardware?

Right, that is my understanding as well.

With FEAT_VHE and HCR_EL2.E2H = 1, access to BRBCR_EL1 at EL2, accesses BRBCR_EL2
without FEAT_VHE or HCR_EL2.E2H = 0, access to BRBCR_EL1 at EL2, accesses BRBCR_EL1

> 
>> +
>> +	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_CYCLES))
>> +		brbcr |= BRBCR_EL1_CC;
>> +
>> +	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_FLAGS))
>> +		brbcr |= BRBCR_EL1_MPRED;
>> +
>> +	/*
>> +	 * The exception and exception return branches could be
>> +	 * captured, irrespective of the perf event's privilege.
>> +	 * If the perf event does not have enough privilege for
>> +	 * a given exception level, then addresses which falls
>> +	 * under that exception level will be reported as zero
>> +	 * for the captured branch record, creating source only
>> +	 * or target only records.
>> +	 */
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
>> +		brbcr |= BRBCR_EL1_EXCEPTION;
>> +		brbcr |= BRBCR_EL1_ERTN;
>> +	}
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
>> +		brbcr |= BRBCR_EL1_EXCEPTION;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
>> +		brbcr |= BRBCR_EL1_ERTN;
>> +
>> +	return brbcr & BRBCR_EL1_DEFAULT_CONFIG;
>> +}
>> +
>> +void armv8pmu_branch_enable(struct perf_event *event)
>> +{
>> +	u64 branch_type = event->attr.branch_sample_type;
>> +	u64 brbfcr, brbcr;
>> +
>> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>> +	brbfcr &= ~BRBFCR_EL1_DEFAULT_CONFIG;
>> +	brbfcr |= branch_type_to_brbfcr(branch_type);
>> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
>> +	isb();
>> +
>> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
>> +	brbcr &= ~BRBCR_EL1_DEFAULT_CONFIG;
>> +	brbcr |= branch_type_to_brbcr(branch_type);
>> +	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
>> +	isb();
>> +	armv8pmu_branch_reset();
>> +}
>> +
>> +void armv8pmu_branch_disable(struct perf_event *event)
>> +{
>> +	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>> +	u64 brbcr = read_sysreg_s(SYS_BRBCR_EL1);
>> +
>> +	brbcr &= ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE);
>> +	brbfcr |= BRBFCR_EL1_PAUSED;
>> +	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
>> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
>> +	isb();
>> +}
>> +
>> +static int brbe_fetch_perf_type(u64 brbinf, bool *new_branch_type)
> 
> It's a bit confusing to return the type and new_type fields in this way.
> 
> I think this would be clearer as a setter function, even if that results in it
> being a bit longer, since it keeps all the type and new_type relationships in
> one place and has a single path for returning the value:

Makes sense.

> 
> static void brbe_set_perf_entry_type(struct perf_branch_entry *entry,
> 				     u64 brbinf)
> {
> 	int brbe_type = brbe_fetch_type(brbinf);
> 
> 	switch (brbe_type) {
> 	case BRBINF_EL1_TYPE_UNCOND_DIR;
> 		entry->type = PERF_BR_UNCOND;
> 		break;
> 	...
> 	case BRBINF_EL1_TYPE_DEBUG_HALT;
> 		entry->type = PERF_BR_EXTEND_ABI;
> 		entry->new_type = PERF_BR_ARM64_DEBUG_HALT;
> 		break;
> 	...
> 	default:
> 		...
> 	}
> }
> 
> ... and in theory that makes it easier to propogate an error in future if we
> want to.

Sure, will convert this function into brbe_set_perf_entry_type() as suggested.

> 
>> +{
>> +	int brbe_type = brbe_fetch_type(brbinf);
>> +	*new_branch_type = false;
>> +
>> +	switch (brbe_type) {
>> +	case BRBINF_EL1_TYPE_UNCOND_DIR:
>> +		return PERF_BR_UNCOND;
>> +	case BRBINF_EL1_TYPE_INDIR:
>> +		return PERF_BR_IND;
>> +	case BRBINF_EL1_TYPE_DIR_LINK:
>> +		return PERF_BR_CALL;
>> +	case BRBINF_EL1_TYPE_INDIR_LINK:
>> +		return PERF_BR_IND_CALL;
>> +	case BRBINF_EL1_TYPE_RET_SUB:
>> +		return PERF_BR_RET;
>> +	case BRBINF_EL1_TYPE_COND_DIR:
>> +		return PERF_BR_COND;
>> +	case BRBINF_EL1_TYPE_CALL:
>> +		return PERF_BR_CALL;
>> +	case BRBINF_EL1_TYPE_TRAP:
>> +		return PERF_BR_SYSCALL;
>> +	case BRBINF_EL1_TYPE_RET_EXCPT:
>> +		return PERF_BR_ERET;
>> +	case BRBINF_EL1_TYPE_IRQ:
>> +		return PERF_BR_IRQ;
>> +	case BRBINF_EL1_TYPE_DEBUG_HALT:
>> +		*new_branch_type = true;
>> +		return PERF_BR_ARM64_DEBUG_HALT;
>> +	case BRBINF_EL1_TYPE_SERROR:
>> +		return PERF_BR_SERROR;
>> +	case BRBINF_EL1_TYPE_INST_DEBUG:
>> +		*new_branch_type = true;
>> +		return PERF_BR_ARM64_DEBUG_INST;
>> +	case BRBINF_EL1_TYPE_DATA_DEBUG:
>> +		*new_branch_type = true;
>> +		return PERF_BR_ARM64_DEBUG_DATA;
>> +	case BRBINF_EL1_TYPE_ALGN_FAULT:
>> +		*new_branch_type = true;
>> +		return PERF_BR_NEW_FAULT_ALGN;
>> +	case BRBINF_EL1_TYPE_INST_FAULT:
>> +		*new_branch_type = true;
>> +		return PERF_BR_NEW_FAULT_INST;
>> +	case BRBINF_EL1_TYPE_DATA_FAULT:
>> +		*new_branch_type = true;
>> +		return PERF_BR_NEW_FAULT_DATA;
>> +	case BRBINF_EL1_TYPE_FIQ:
>> +		*new_branch_type = true;
>> +		return PERF_BR_ARM64_FIQ;
>> +	case BRBINF_EL1_TYPE_DEBUG_EXIT:
>> +		*new_branch_type = true;
>> +		return PERF_BR_ARM64_DEBUG_EXIT;
>> +	default:
>> +		pr_warn("unknown branch type captured\n");
>> +		return PERF_BR_UNKNOWN;
> 
> It would be worth logging the specific value in case we ever have to debug
> this. This should also be marked as _ratelimited or _once.

Sure, will replace with a pr_warn_once() printing 'branch_type'.

> 
>> +	}
>> +}
>> +
>> +static int brbe_fetch_perf_priv(u64 brbinf)
>> +{
>> +	int brbe_el = brbe_fetch_el(brbinf);
>> +
>> +	switch (brbe_el) {
>> +	case BRBINF_EL1_EL_EL0:
>> +		return PERF_BR_PRIV_USER;
>> +	case BRBINF_EL1_EL_EL1:
>> +		return PERF_BR_PRIV_KERNEL;
>> +	case BRBINF_EL1_EL_EL2:
>> +		if (is_kernel_in_hyp_mode())
>> +			return PERF_BR_PRIV_KERNEL;
>> +		return PERF_BR_PRIV_HV;
>> +	default:
>> +		pr_warn("unknown branch privilege captured\n");
>> +		return PERF_BR_PRIV_UNKNOWN;
> 
> It would be worth logging the specific value in case we ever have to debug
> this. This should also be marked as _ratelimited or _once.

Sure, will replace with a pr_warn_once() printing 'brbe_el.

> 
>> +	}
>> +}
>> +
>> +static void capture_brbe_flags(struct pmu_hw_events *cpuc, struct perf_event *event,
>> +			       u64 brbinf, u64 brbcr, int idx)
>> +{
>> +	struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
>> +	bool new_branch_type;
>> +	int branch_type;
>> +
>> +	if (branch_sample_type(event)) {
>> +		branch_type = brbe_fetch_perf_type(brbinf, &new_branch_type);
>> +		if (new_branch_type) {
>> +			entry->type = PERF_BR_EXTEND_ABI;
>> +			entry->new_type = branch_type;
>> +		} else {
>> +			entry->type = branch_type;
>> +		}
>> +	}
> 
> With the suggestions bove, this would become:
> 
> 	if (branch_sample_type(event))
> 		brbe_set_perf_entry_type(entry, brbinf);

That's right, will change.

> 
>> +	if (!branch_sample_no_cycles(event)) {
>> +		WARN_ON_ONCE(!(brbcr & BRBCR_EL1_CC));
>> +		entry->cycles = brbe_fetch_cycles(brbinf);
>> +	}
>> +
>> +	if (!branch_sample_no_flags(event)) {
>> +		/*
>> +		 * BRBINF_LASTFAILED does not indicate whether last transaction
>> +		 * got failed or aborted during the current branch record itself.
>> +		 * Rather, this indicates that all the branch records which were
>> +		 * in transaction until the curret branch record have failed. So
>> +		 * the entire BRBE buffer needs to be processed later on to find
>> +		 * all branch records which might have failed.
>> +		 */
> 
> This is quite difficult to follow.
> 
> I took in the ARM ARM, and it looks like this is all about TME transactions
> (which Linux doesn't currently support). Per ARM DDI 0487I.a, page D15-5506:
> 
> | R_GVCJH
> |   When an entire transaction is executed in a BRBE Non-prohibited region and
> |   the transaction fails or is canceled then BRBFCR_EL1.LASTFAILED is set to
> |   1.
> 
> | R_KBSZM
> |   When a Branch record is generated, other than through the injection
> |   mechanism, the value of BRBFCR_EL1.LASTFAILED is copied to the LASTFAILED
> |   field in the Branch record and BRBFCR_EL1.LASTFAILED is set to 0.
> 
> | I_JBPHS
> |   When a transaction fails or is canceled, Branch records generated in the
> |   transaction are not removed from the Branch record buffer.
> 
> I think what this is saying is:
> 
> 	/*
> 	 * BRBINFx_EL1.LASTFAILED indicates that a TME transaction failed (or
> 	 * was cancelled) prior to this record, and some number of records
> 	 * prior to this one may have been generated during an attempt to
> 	 * execute the transaction.
> 	 *
> 	 * We will remove such entries later in process_branch_aborts().
> 	 */
> 
> Is that right?

Right, will update the comment here.

> 
>> +
>> +		/*
>> +		 * All these information (i.e transaction state and mispredicts)
>> +		 * are not available for target only branch records.
>> +		 */
>> +		if (!brbe_target(brbinf)) {
> 
> Could we rename these heleprs for clarity, e.g.
> brbe_record_is_{target_only,source_only,complete}()

Sure, will do.

> 
> With that, it would also be clearer to have:
> 
> 	/*
> 	 * These fields only exist for complete and source-only records.
> 	 */
> 	if (brbe_record_is_complete(brbinf) ||
> 	    brbe_record_is_source_only()) {
> 
> ... and explicilty match the cases we care about[

Sure, will invert the check and update the comment here.

> 
> 
>> +			WARN_ON_ONCE(!(brbcr & BRBCR_EL1_MPRED));
> 
> Huh? Why does the value of BRBCR matter here?

This is just a code hardening measure here. Before recording branch record
cycles or its flags, ensure BRBCR_EL1 was configured correctly to produce
these additional information along with the branch records.

> 
>> +			entry->mispred = brbe_fetch_mispredict(brbinf);
>> +			entry->predicted = !entry->mispred;
>> +			entry->in_tx = brbe_fetch_in_tx(brbinf);
>> +		}
>> +	}
>> +
>> +	if (branch_sample_priv(event)) {
>> +		/*
>> +		 * All these information (i.e branch privilege level) are not
>> +		 * available for source only branch records.
>> +		 */
>> +		if (!brbe_source(brbinf))
>> +			entry->priv = brbe_fetch_perf_priv(brbinf);
> 
> Same style comment as above.

Sure, will do.

> 
>> +	}
>> +}
>> +
>> +/*
>> + * A branch record with BRBINF_EL1.LASTFAILED set, implies that all
>> + * preceding consecutive branch records, that were in a transaction
>> + * (i.e their BRBINF_EL1.TX set) have been aborted.
>> + *
>> + * Similarly BRBFCR_EL1.LASTFAILED set, indicate that all preceding
>> + * consecutive branch records upto the last record, which were in a
>> + * transaction (i.e their BRBINF_EL1.TX set) have been aborted.
>> + *
>> + * --------------------------------- -------------------
>> + * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
>> + * --------------------------------- -------------------
>> + * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
>> + * --------------------------------- -------------------
>> + * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
>> + * --------------------------------- -------------------
>> + * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>> + * --------------------------------- -------------------
>> + * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>> + * --------------------------------- -------------------
>> + * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
>> + * --------------------------------- -------------------
>> + * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
>> + * --------------------------------- -------------------
>> + * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>> + * --------------------------------- -------------------
>> + * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>> + * --------------------------------- -------------------
>> + * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>> + * --------------------------------- -------------------
> 
> Are we guaranteed to have a record between two transactions with TX = 0?

With TX = 0 i.e no transaction was active, indicates normal sequence of branches
creating their own branch records. How can there be a transaction with TX = 0 ?
Could you please be more specific here ?

> 
> AFAICT you could have a sequence where a TCOMMIT is immediately followed by a
> TSTART, and IIUC in that case you could have back-to-back records for distinct
> transactions all with TX = 1, where the first transaction could be commited,
> and the second might fail/cancel.
> 
> ... or do TCOMMIT/TCANCEL/TSTART get handled specially?

I guess these are micro-architectural implementation details which unfortunately
BRBINF_EL1/BRBCR_EL1 specifications do not capture in detail. But all it says is
that upon encountering BRBINF_EL1.LASTFAILED or BRBFCR_EL1.LASTFAILED (just for
the last record) all previous in-transaction branch records (BRBINF_EL1.TX = 1)
should be considered aborted for branch record reporting purpose.

> 
>> + *
>> + * BRBFCR_EL1.LASTFAILED == 1
>> + *
>> + * Here BRBFCR_EL1.LASTFAILED failes all those consecutive and also
>> + * in transaction branches near the end of the BRBE buffer.
>> + */
>> +static void process_branch_aborts(struct pmu_hw_events *cpuc)
>> +{
>> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
>> +	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>> +	bool lastfailed = !!(brbfcr & BRBFCR_EL1_LASTFAILED);
>> +	int idx = brbe_attr->brbe_nr - 1;
>> +	struct perf_branch_entry *entry;
>> +
>> +	do {
>> +		entry = &cpuc->branches->branch_entries[idx];
>> +		if (entry->in_tx) {
>> +			entry->abort = lastfailed;
>> +		} else {
>> +			lastfailed = entry->abort;
>> +			entry->abort = false;
>> +		}
>> +	} while (idx--, idx >= 0);
>> +}
>> +
>> +void armv8pmu_branch_reset(void)
>> +{
>> +	asm volatile(BRB_IALL);
>> +	isb();
>> +}
>> +
>> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
>> +{
>> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
>> +	u64 brbinf, brbfcr, brbcr;
>> +	int idx;
>> +
>> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
>> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>> +
>> +	/* Ensure pause on PMU interrupt is enabled */
>> +	WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));
> 
> As above, I think this needs commentary in the interrupt handler, since this
> presumably needs us to keep the IRQ asserted until we're done
> reading/manipulating records in the IRQ handler.

The base IRQ handler armv8pmu_handle_irq() is in ARMV8 PMU code inside perf_event.c
which could/should not access BRBE specific details without adding an another new
abstraction function. But I guess adding a comment should be fine.

> 
> Do we ever read this outside of the IRQ handler? AFAICT we don't, and that
> makes it seem like some of this is redundant.


> 
>> +
>> +	/* Save and clear the privilege */
>> +	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
> 
> Why? Later on we restore this, and AFAICT we don't modify it.
> 
> If it's paused, why do we care about the privilege?

This disables BRBE completely (not only pause) providing confidence that no
branch record can come in while the existing records are being processed.

> 
>> +
>> +	/* Pause the buffer */
>> +	write_sysreg_s(brbfcr | BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
>> +	isb();
> 
> Why? If we're in the IRQ handler it's already paused, and if we're not in the
> IRQ handler what prevents us racing with an IRQ?

armv8pmu_branch_read() always gets called from an IRQ context only. The point
here is to force a pause (and also disable, as I had explained earlier) before
reading the buffer.

> 
>> +
>> +	for (idx = 0; idx < brbe_attr->brbe_nr; idx++) {
>> +		struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
>> +
>> +		select_brbe_bank_index(idx);
>> +		brbinf = get_brbinf_reg(idx);
>> +		/*
>> +		 * There are no valid entries anymore on the buffer.
>> +		 * Abort the branch record processing to save some
>> +		 * cycles and also reduce the capture/process load
>> +		 * for the user space as well.
>> +		 */
>> +		if (brbe_invalid(brbinf))
>> +			break;
>> +
>> +		perf_clear_branch_entry_bitfields(entry);
>> +		if (brbe_valid(brbinf)) {
>> +			entry->from = get_brbsrc_reg(idx);
>> +			entry->to = get_brbtgt_reg(idx);
>> +		} else if (brbe_source(brbinf)) {
>> +			entry->from = get_brbsrc_reg(idx);
>> +			entry->to = 0;
>> +		} else if (brbe_target(brbinf)) {
>> +			entry->from = 0;
>> +			entry->to = get_brbtgt_reg(idx);
>> +		}
>> +		capture_brbe_flags(cpuc, event, brbinf, brbcr, idx);
>> +	}
>> +	cpuc->branches->branch_stack.nr = idx;
>> +	cpuc->branches->branch_stack.hw_idx = -1ULL;
>> +	process_branch_aborts(cpuc);
>> +
>> +	/* Restore privilege, enable pause on PMU interrupt */
>> +	write_sysreg_s(brbcr | BRBCR_EL1_FZP, SYS_BRBCR_EL1);
> 
> Why do we have to save/restore this?

Yes, this guarantees (more so than the paused state) that BRBE remains disabled in
privilege levels that are relevant, while the contents are being read. 

> 
>> +
>> +	/* Unpause the buffer */
>> +	write_sysreg_s(brbfcr & ~BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
>> +	isb();
>> +	armv8pmu_branch_reset();
>> +}
> 
> Why do we enable it before we reset it?

This is the last opportunity for a clean slate start for BRBE buffer before it is
back recording the branches. Basically helps in ensuring a clean start.

> 
> Surely it would make sense to reset it first, and ammortize the cost of the ISB?
> 
> That said, as above, do we actually need to pause/unpause it? Or is it already
> paused by virtue of the IRQ?

Yes, it should be paused after an IRQ but it is also enforced before reading along
with privilege level disable. Regardless the buffer needs to be un-paused and also
enabled for required privilege levels before exiting from here.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
@ 2023-01-19  2:48       ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-01-19  2:48 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On 1/12/23 22:21, Mark Rutland wrote:
> On Thu, Jan 05, 2023 at 08:40:39AM +0530, Anshuman Khandual wrote:
>> This enables branch stack sampling events in ARMV8 PMU, via an architecture
>> feature FEAT_BRBE aka branch record buffer extension. This defines required
>> branch helper functions pmuv8pmu_branch_XXXXX() and the implementation here
>> is wrapped with a new config option CONFIG_ARM64_BRBE.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Mark Rutland <mark.rutland@arm.com>
>> Cc: linux-arm-kernel@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/arm64/Kconfig                  |  11 +
>>  arch/arm64/include/asm/perf_event.h |   9 +
>>  arch/arm64/kernel/Makefile          |   1 +
>>  arch/arm64/kernel/brbe.c            | 512 ++++++++++++++++++++++++++++
>>  arch/arm64/kernel/brbe.h            | 257 ++++++++++++++
>>  5 files changed, 790 insertions(+)
>>  create mode 100644 arch/arm64/kernel/brbe.c
>>  create mode 100644 arch/arm64/kernel/brbe.h
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 03934808b2ed..915b12709a46 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -1363,6 +1363,17 @@ config HW_PERF_EVENTS
>>  	def_bool y
>>  	depends on ARM_PMU
>>  
>> +config ARM64_BRBE
>> +	bool "Enable support for Branch Record Buffer Extension (BRBE)"
>> +	depends on PERF_EVENTS && ARM64 && ARM_PMU
>> +	default y
>> +	help
>> +	  Enable perf support for Branch Record Buffer Extension (BRBE) which
>> +	  records all branches taken in an execution path. This supports some
>> +	  branch types and privilege based filtering. It captured additional
>> +	  relevant information such as cycle count, misprediction and branch
>> +	  type, branch privilege level etc.
>> +
>>  # Supported by clang >= 7.0 or GCC >= 12.0.0
>>  config CC_HAVE_SHADOW_CALL_STACK
>>  	def_bool $(cc-option, -fsanitize=shadow-call-stack -ffixed-x18)
>> diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
>> index a038902d6874..cf2e88c7b707 100644
>> --- a/arch/arm64/include/asm/perf_event.h
>> +++ b/arch/arm64/include/asm/perf_event.h
>> @@ -277,6 +277,14 @@ struct pmu_hw_events;
>>  struct arm_pmu;
>>  struct perf_event;
>>  
>> +#ifdef CONFIG_ARM64_BRBE
>> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event);
>> +bool armv8pmu_branch_valid(struct perf_event *event);
>> +void armv8pmu_branch_enable(struct perf_event *event);
>> +void armv8pmu_branch_disable(struct perf_event *event);
>> +void armv8pmu_branch_probe(struct arm_pmu *arm_pmu);
>> +void armv8pmu_branch_reset(void);
>> +#else
>>  static inline void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event) { }
>>  static inline bool armv8pmu_branch_valid(struct perf_event *event) { return false; }
>>  static inline void armv8pmu_branch_enable(struct perf_event *event) { }
>> @@ -284,3 +292,4 @@ static inline void armv8pmu_branch_disable(struct perf_event *event) { }
>>  static inline void armv8pmu_branch_probe(struct arm_pmu *arm_pmu) { }
>>  static inline void armv8pmu_branch_reset(void) { }
>>  #endif
>> +#endif
>> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
>> index ceba6792f5b3..6ee7ccb61621 100644
>> --- a/arch/arm64/kernel/Makefile
>> +++ b/arch/arm64/kernel/Makefile
>> @@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)			+= module.o
>>  obj-$(CONFIG_ARM64_MODULE_PLTS)		+= module-plts.o
>>  obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o perf_callchain.o
>>  obj-$(CONFIG_HW_PERF_EVENTS)		+= perf_event.o
>> +obj-$(CONFIG_ARM64_BRBE)		+= brbe.o
>>  obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
>>  obj-$(CONFIG_CPU_PM)			+= sleep.o suspend.o
>>  obj-$(CONFIG_CPU_IDLE)			+= cpuidle.o
>> diff --git a/arch/arm64/kernel/brbe.c b/arch/arm64/kernel/brbe.c
>> new file mode 100644
>> index 000000000000..cd03d3531e04
>> --- /dev/null
>> +++ b/arch/arm64/kernel/brbe.c
>> @@ -0,0 +1,512 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * Branch Record Buffer Extension Driver.
>> + *
>> + * Copyright (C) 2022 ARM Limited
>> + *
>> + * Author: Anshuman Khandual <anshuman.khandual@arm.com>
>> + */
>> +#include "brbe.h"
>> +
>> +static bool valid_brbe_nr(int brbe_nr)
>> +{
>> +	return brbe_nr == BRBIDR0_EL1_NUMREC_8 ||
>> +	       brbe_nr == BRBIDR0_EL1_NUMREC_16 ||
>> +	       brbe_nr == BRBIDR0_EL1_NUMREC_32 ||
>> +	       brbe_nr == BRBIDR0_EL1_NUMREC_64;
>> +}
>> +
>> +static bool valid_brbe_cc(int brbe_cc)
>> +{
>> +	return brbe_cc == BRBIDR0_EL1_CC_20_BIT;
>> +}
>> +
>> +static bool valid_brbe_format(int brbe_format)
>> +{
>> +	return brbe_format == BRBIDR0_EL1_FORMAT_0;
>> +}
>> +
>> +static bool valid_brbe_version(int brbe_version)
>> +{
>> +	return brbe_version == ID_AA64DFR0_EL1_BRBE_IMP ||
>> +	       brbe_version == ID_AA64DFR0_EL1_BRBE_BRBE_V1P1;
>> +}
>> +
>> +static void select_brbe_bank(int bank)
>> +{
>> +	static int brbe_current_bank = BRBE_BANK_IDX_INVALID;
> 
> This is a per-cpu peroperty, so I don't understand how this can safely be
> stored in a static variable. If this is necessary it needs to go in a per-cpu
> variable, but I suspect we don't actually need it.

You are right, we dont need it.

> 
>> +	u64 brbfcr;
>> +
>> +	if (brbe_current_bank == bank)
>> +		return;
> 
> It looks like this is just for the same of optimizing redundant changes when
> armv8pmu_branch_read() iterates over the records?

Right, it is.

> 
> It'd be simpler to have armv8pmu_branch_read() iterate over each bank, then
> within that iterate over each record within that bank.

Sure, will drop this optimization completely. I will split the iteration into two
separate loops, one each for bank 0 and other for bank 1.

> 
>> +	WARN_ON(bank > BRBE_BANK_IDX_1);
>> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>> +	brbfcr &= ~BRBFCR_EL1_BANK_MASK;
>> +	brbfcr |= ((bank << BRBFCR_EL1_BANK_SHIFT) & BRBFCR_EL1_BANK_MASK);
> 
> You can use SYS_FIELD_PREP() for this:

Sure, will do.

> 
> 	brbfcr &= ~BRBFCR_EL1_BANK_MASK;
> 	brbfcr |= SYS_FIELD_PREP(BRBFCR_EL1, BANK, bank);
> 
> Please use FIELD_PREP for this.

Done.

> 
>> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
>> +	isb();
>> +	brbe_current_bank = bank;
>> +}
>> +
>> +static void select_brbe_bank_index(int buffer_idx)
>> +{
>> +	switch (buffer_idx) {
>> +	case BRBE_BANK0_IDX_MIN ... BRBE_BANK0_IDX_MAX:
>> +		select_brbe_bank(BRBE_BANK_IDX_0);
>> +		break;
>> +	case BRBE_BANK1_IDX_MIN ... BRBE_BANK1_IDX_MAX:
>> +		select_brbe_bank(BRBE_BANK_IDX_1);
>> +		break;
>> +	default:
>> +		pr_warn("unsupported BRBE index\n");
> 
> It would be worth logging the specific index in case we ever have to debug
> this. It's probably worth also making this a WARN_ONCE() or WARN_RATELIMITED().

This function will not be required once individual loops based read for each
BRBE bank is implemented, thus reducing number of times select_brbe_bank()
gets called i.e just two times once for bank 0 and other for bank 1.

> 
>> +	}
>> +}
>> +
>> +static const char branch_filter_error_msg[] = "branch filter not supported";
>> +
>> +bool armv8pmu_branch_valid(struct perf_event *event)
>> +{
>> +	u64 branch_type = event->attr.branch_sample_type;
>> +
>> +	/*
>> +	 * If the event does not have at least one of the privilege
>> +	 * branch filters as in PERF_SAMPLE_BRANCH_PLM_ALL, the core
>> +	 * perf will adjust its value based on perf event's existing
>> +	 * privilege level via attr.exclude_[user|kernel|hv].
>> +	 *
>> +	 * As event->attr.branch_sample_type might have been changed
>> +	 * when the event reaches here, it is not possible to figure
>> +	 * out whether the event originally had HV privilege request
>> +	 * or got added via the core perf. Just report this situation
>> +	 * once and continue ignoring if there are other instances.
>> +	 */
>> +	if ((branch_type & PERF_SAMPLE_BRANCH_HV) && !is_kernel_in_hyp_mode())
>> +		pr_warn_once("%s - hypervisor privilege\n", branch_filter_error_msg);
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
>> +		pr_warn_once("%s - aborted transaction\n", branch_filter_error_msg);
>> +		return false;
>> +	}
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_NO_TX) {
>> +		pr_warn_once("%s - no transaction\n", branch_filter_error_msg);
>> +		return false;
>> +	}
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_IN_TX) {
>> +		pr_warn_once("%s - in transaction\n", branch_filter_error_msg);
>> +		return false;
>> +	}
>> +	return true;
>> +}
> 
> Is this called when validating user input? If so, NAK to printing anything to a
> higher leval than debug. If there are constraints the user needs to be aware of

You mean pr_debug() based prints ?

> we should expose the relevant information under sysfs, but it seems that these
> are just generic perf options that BRBE doesn't support.

Right, these are generic perf options. As you mentioned, will replace these with
pr_debug() instead.

> 
> It would be better to whitelist what we do support rather than blacklisting
> what we don't.

But with a negative list, user would know what is not supported via these pr_debug()
based output when enabled ? But I dont have a strong opinion either way.

> 
>> +
>> +static void branch_records_alloc(struct arm_pmu *armpmu)
>> +{
>> +	struct pmu_hw_events *events;
>> +	int cpu;
>> +
>> +	for_each_possible_cpu(cpu) {
>> +		events = per_cpu_ptr(armpmu->hw_events, cpu);
>> +
>> +		events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
>> +		WARN_ON(!events->branches);
>> +	}
>> +}
> 
> It would be simpler for this to be a percpu allocation.

Could you please be more specific ? alloc_percpu_gfp() cannot be used here
because 'events->branches' is not a __percpu variable unlike its parent
'events' which is derived from armpmu.

> 
> If the allocation fails, we should propogate that error rather than just
> WARNing, and fail probing the PMU.

Sure, will change that.

> 
> Also, if the generic allocator fails it will print a warning (unless
> __GFP_NOWARN was used), so we don't need the warning here.

Sure, understood.

> 
>> +
>> +static int brbe_attributes_probe(struct arm_pmu *armpmu, u32 brbe)
>> +{
>> +	struct brbe_hw_attr *brbe_attr = kzalloc(sizeof(struct brbe_hw_attr), GFP_KERNEL);
> 
> Same comments as for the failure path in branch_records_alloc().
> 
>> +	u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
> 
> Which context is this run in? Unless this is affine to a relevant CPU we can't
> read the sysreg safely, and if we're in a cross-call we cannot allocate memory,
> so this doesn't look right to me.

Called from smp_call_function_any() context via __armv8pmu_probe_pmu().

> 
> I suspect CONFIG_DEBUG_ATOMIC_SLEEP=y and/or CONFIG_PROVE_LOCKING=y will complain here.

Right, it does. Remember dropping pr_info() during BRBE probe for the exact same
reason here but did not realize we will run into the same problem again.

> 
> Please follow the approach of armv8pmu_probe_pmu(), where we use a probe_info
> structure that the callee can fill with information. Then we can do the
> allocation in the main thread from a non-atomic context.

Right, will do that. The only problem being 'struct brbe_hw_attr' which will not be
visible in the main thread, might need an abstraction function to do the allocation
in BRBE implementation. Regardless, a successful BRBE in the preceding function can
be ascertained from armpmu->has_branch_stack().

> 
>> +
>> +	WARN_ON(!brbe_attr);
>> +	armpmu->private = brbe_attr;
>> +
>> +	brbe_attr->brbe_version = brbe;
>> +	brbe_attr->brbe_format = brbe_fetch_format(brbidr);
>> +	brbe_attr->brbe_cc = brbe_fetch_cc_bits(brbidr);
>> +	brbe_attr->brbe_nr = brbe_fetch_numrec(brbidr);
> 
> As a minor thing, could we please s/fetch/get/ ? To me, 'fetch' sounds like a
> memory operation, and elsewhere we use 'get' for this sort of getter function.

Sure, but shall we change fetch as get for entire BRBE implementation (where ever
there is a determination of field from a register value) or just the above function ?
Default, will change all places.

> 
>> +
>> +	if (!valid_brbe_version(brbe_attr->brbe_version) ||
>> +	   !valid_brbe_format(brbe_attr->brbe_format) ||
>> +	   !valid_brbe_cc(brbe_attr->brbe_cc) ||
>> +	   !valid_brbe_nr(brbe_attr->brbe_nr))
>> +		return -EOPNOTSUPP;
>> +
>> +	return 0;
>> +}
>> +
>> +void armv8pmu_branch_probe(struct arm_pmu *armpmu)
>> +{
>> +	u64 aa64dfr0 = read_sysreg_s(SYS_ID_AA64DFR0_EL1);
>> +	u32 brbe;
>> +
>> +	brbe = cpuid_feature_extract_unsigned_field(aa64dfr0, ID_AA64DFR0_EL1_BRBE_SHIFT);
>> +	if (!brbe)
>> +		return;
>> +
>> +	if (brbe_attributes_probe(armpmu, brbe))
>> +		return;
>> +
>> +	branch_records_alloc(armpmu);
>> +	armpmu->features |= ARM_PMU_BRANCH_STACK;
>> +}
>> +
>> +static u64 branch_type_to_brbfcr(int branch_type)
>> +{
>> +	u64 brbfcr = 0;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
>> +		brbfcr |= BRBFCR_EL1_BRANCH_FILTERS;
>> +		return brbfcr;
>> +	}
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
>> +		brbfcr |= BRBFCR_EL1_INDCALL;
>> +		brbfcr |= BRBFCR_EL1_DIRCALL;
>> +	}
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
>> +		brbfcr |= BRBFCR_EL1_RTN;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_IND_CALL)
>> +		brbfcr |= BRBFCR_EL1_INDCALL;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_COND)
>> +		brbfcr |= BRBFCR_EL1_CONDDIR;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_IND_JUMP)
>> +		brbfcr |= BRBFCR_EL1_INDIRECT;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_CALL)
>> +		brbfcr |= BRBFCR_EL1_DIRCALL;
>> +
>> +	return brbfcr;
>> +}
>> +
>> +static u64 branch_type_to_brbcr(int branch_type)
>> +{
>> +	u64 brbcr = (BRBCR_EL1_FZP | BRBCR_EL1_DEFAULT_TS);
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_USER)
>> +		brbcr |= BRBCR_EL1_E0BRE;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_KERNEL)
>> +		brbcr |= BRBCR_EL1_E1BRE;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_HV) {
>> +		if (is_kernel_in_hyp_mode())
>> +			brbcr |= BRBCR_EL1_E1BRE;
>> +	}
> 
> I assume that in that case we're actually writing to BRBCR_EL2, and this is
> actually the E2BRE bit, which is at the same position? If so, I think that's
> worth a comment above the USER/KERNEL/HV bits here.

That is right, will add a comment.

> 
> How do the BRB* control registers work with E2H? Is BRBCR_EL1 rewritten to
> BRBCR_EL2 by the hardware?

Right, that is my understanding as well.

With FEAT_VHE and HCR_EL2.E2H = 1, access to BRBCR_EL1 at EL2, accesses BRBCR_EL2
without FEAT_VHE or HCR_EL2.E2H = 0, access to BRBCR_EL1 at EL2, accesses BRBCR_EL1

> 
>> +
>> +	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_CYCLES))
>> +		brbcr |= BRBCR_EL1_CC;
>> +
>> +	if (!(branch_type & PERF_SAMPLE_BRANCH_NO_FLAGS))
>> +		brbcr |= BRBCR_EL1_MPRED;
>> +
>> +	/*
>> +	 * The exception and exception return branches could be
>> +	 * captured, irrespective of the perf event's privilege.
>> +	 * If the perf event does not have enough privilege for
>> +	 * a given exception level, then addresses which falls
>> +	 * under that exception level will be reported as zero
>> +	 * for the captured branch record, creating source only
>> +	 * or target only records.
>> +	 */
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY) {
>> +		brbcr |= BRBCR_EL1_EXCEPTION;
>> +		brbcr |= BRBCR_EL1_ERTN;
>> +	}
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_CALL)
>> +		brbcr |= BRBCR_EL1_EXCEPTION;
>> +
>> +	if (branch_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
>> +		brbcr |= BRBCR_EL1_ERTN;
>> +
>> +	return brbcr & BRBCR_EL1_DEFAULT_CONFIG;
>> +}
>> +
>> +void armv8pmu_branch_enable(struct perf_event *event)
>> +{
>> +	u64 branch_type = event->attr.branch_sample_type;
>> +	u64 brbfcr, brbcr;
>> +
>> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>> +	brbfcr &= ~BRBFCR_EL1_DEFAULT_CONFIG;
>> +	brbfcr |= branch_type_to_brbfcr(branch_type);
>> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
>> +	isb();
>> +
>> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
>> +	brbcr &= ~BRBCR_EL1_DEFAULT_CONFIG;
>> +	brbcr |= branch_type_to_brbcr(branch_type);
>> +	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
>> +	isb();
>> +	armv8pmu_branch_reset();
>> +}
>> +
>> +void armv8pmu_branch_disable(struct perf_event *event)
>> +{
>> +	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>> +	u64 brbcr = read_sysreg_s(SYS_BRBCR_EL1);
>> +
>> +	brbcr &= ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE);
>> +	brbfcr |= BRBFCR_EL1_PAUSED;
>> +	write_sysreg_s(brbcr, SYS_BRBCR_EL1);
>> +	write_sysreg_s(brbfcr, SYS_BRBFCR_EL1);
>> +	isb();
>> +}
>> +
>> +static int brbe_fetch_perf_type(u64 brbinf, bool *new_branch_type)
> 
> It's a bit confusing to return the type and new_type fields in this way.
> 
> I think this would be clearer as a setter function, even if that results in it
> being a bit longer, since it keeps all the type and new_type relationships in
> one place and has a single path for returning the value:

Makes sense.

> 
> static void brbe_set_perf_entry_type(struct perf_branch_entry *entry,
> 				     u64 brbinf)
> {
> 	int brbe_type = brbe_fetch_type(brbinf);
> 
> 	switch (brbe_type) {
> 	case BRBINF_EL1_TYPE_UNCOND_DIR;
> 		entry->type = PERF_BR_UNCOND;
> 		break;
> 	...
> 	case BRBINF_EL1_TYPE_DEBUG_HALT;
> 		entry->type = PERF_BR_EXTEND_ABI;
> 		entry->new_type = PERF_BR_ARM64_DEBUG_HALT;
> 		break;
> 	...
> 	default:
> 		...
> 	}
> }
> 
> ... and in theory that makes it easier to propogate an error in future if we
> want to.

Sure, will convert this function into brbe_set_perf_entry_type() as suggested.

> 
>> +{
>> +	int brbe_type = brbe_fetch_type(brbinf);
>> +	*new_branch_type = false;
>> +
>> +	switch (brbe_type) {
>> +	case BRBINF_EL1_TYPE_UNCOND_DIR:
>> +		return PERF_BR_UNCOND;
>> +	case BRBINF_EL1_TYPE_INDIR:
>> +		return PERF_BR_IND;
>> +	case BRBINF_EL1_TYPE_DIR_LINK:
>> +		return PERF_BR_CALL;
>> +	case BRBINF_EL1_TYPE_INDIR_LINK:
>> +		return PERF_BR_IND_CALL;
>> +	case BRBINF_EL1_TYPE_RET_SUB:
>> +		return PERF_BR_RET;
>> +	case BRBINF_EL1_TYPE_COND_DIR:
>> +		return PERF_BR_COND;
>> +	case BRBINF_EL1_TYPE_CALL:
>> +		return PERF_BR_CALL;
>> +	case BRBINF_EL1_TYPE_TRAP:
>> +		return PERF_BR_SYSCALL;
>> +	case BRBINF_EL1_TYPE_RET_EXCPT:
>> +		return PERF_BR_ERET;
>> +	case BRBINF_EL1_TYPE_IRQ:
>> +		return PERF_BR_IRQ;
>> +	case BRBINF_EL1_TYPE_DEBUG_HALT:
>> +		*new_branch_type = true;
>> +		return PERF_BR_ARM64_DEBUG_HALT;
>> +	case BRBINF_EL1_TYPE_SERROR:
>> +		return PERF_BR_SERROR;
>> +	case BRBINF_EL1_TYPE_INST_DEBUG:
>> +		*new_branch_type = true;
>> +		return PERF_BR_ARM64_DEBUG_INST;
>> +	case BRBINF_EL1_TYPE_DATA_DEBUG:
>> +		*new_branch_type = true;
>> +		return PERF_BR_ARM64_DEBUG_DATA;
>> +	case BRBINF_EL1_TYPE_ALGN_FAULT:
>> +		*new_branch_type = true;
>> +		return PERF_BR_NEW_FAULT_ALGN;
>> +	case BRBINF_EL1_TYPE_INST_FAULT:
>> +		*new_branch_type = true;
>> +		return PERF_BR_NEW_FAULT_INST;
>> +	case BRBINF_EL1_TYPE_DATA_FAULT:
>> +		*new_branch_type = true;
>> +		return PERF_BR_NEW_FAULT_DATA;
>> +	case BRBINF_EL1_TYPE_FIQ:
>> +		*new_branch_type = true;
>> +		return PERF_BR_ARM64_FIQ;
>> +	case BRBINF_EL1_TYPE_DEBUG_EXIT:
>> +		*new_branch_type = true;
>> +		return PERF_BR_ARM64_DEBUG_EXIT;
>> +	default:
>> +		pr_warn("unknown branch type captured\n");
>> +		return PERF_BR_UNKNOWN;
> 
> It would be worth logging the specific value in case we ever have to debug
> this. This should also be marked as _ratelimited or _once.

Sure, will replace with a pr_warn_once() printing 'branch_type'.

> 
>> +	}
>> +}
>> +
>> +static int brbe_fetch_perf_priv(u64 brbinf)
>> +{
>> +	int brbe_el = brbe_fetch_el(brbinf);
>> +
>> +	switch (brbe_el) {
>> +	case BRBINF_EL1_EL_EL0:
>> +		return PERF_BR_PRIV_USER;
>> +	case BRBINF_EL1_EL_EL1:
>> +		return PERF_BR_PRIV_KERNEL;
>> +	case BRBINF_EL1_EL_EL2:
>> +		if (is_kernel_in_hyp_mode())
>> +			return PERF_BR_PRIV_KERNEL;
>> +		return PERF_BR_PRIV_HV;
>> +	default:
>> +		pr_warn("unknown branch privilege captured\n");
>> +		return PERF_BR_PRIV_UNKNOWN;
> 
> It would be worth logging the specific value in case we ever have to debug
> this. This should also be marked as _ratelimited or _once.

Sure, will replace with a pr_warn_once() printing 'brbe_el.

> 
>> +	}
>> +}
>> +
>> +static void capture_brbe_flags(struct pmu_hw_events *cpuc, struct perf_event *event,
>> +			       u64 brbinf, u64 brbcr, int idx)
>> +{
>> +	struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
>> +	bool new_branch_type;
>> +	int branch_type;
>> +
>> +	if (branch_sample_type(event)) {
>> +		branch_type = brbe_fetch_perf_type(brbinf, &new_branch_type);
>> +		if (new_branch_type) {
>> +			entry->type = PERF_BR_EXTEND_ABI;
>> +			entry->new_type = branch_type;
>> +		} else {
>> +			entry->type = branch_type;
>> +		}
>> +	}
> 
> With the suggestions bove, this would become:
> 
> 	if (branch_sample_type(event))
> 		brbe_set_perf_entry_type(entry, brbinf);

That's right, will change.

> 
>> +	if (!branch_sample_no_cycles(event)) {
>> +		WARN_ON_ONCE(!(brbcr & BRBCR_EL1_CC));
>> +		entry->cycles = brbe_fetch_cycles(brbinf);
>> +	}
>> +
>> +	if (!branch_sample_no_flags(event)) {
>> +		/*
>> +		 * BRBINF_LASTFAILED does not indicate whether last transaction
>> +		 * got failed or aborted during the current branch record itself.
>> +		 * Rather, this indicates that all the branch records which were
>> +		 * in transaction until the curret branch record have failed. So
>> +		 * the entire BRBE buffer needs to be processed later on to find
>> +		 * all branch records which might have failed.
>> +		 */
> 
> This is quite difficult to follow.
> 
> I took in the ARM ARM, and it looks like this is all about TME transactions
> (which Linux doesn't currently support). Per ARM DDI 0487I.a, page D15-5506:
> 
> | R_GVCJH
> |   When an entire transaction is executed in a BRBE Non-prohibited region and
> |   the transaction fails or is canceled then BRBFCR_EL1.LASTFAILED is set to
> |   1.
> 
> | R_KBSZM
> |   When a Branch record is generated, other than through the injection
> |   mechanism, the value of BRBFCR_EL1.LASTFAILED is copied to the LASTFAILED
> |   field in the Branch record and BRBFCR_EL1.LASTFAILED is set to 0.
> 
> | I_JBPHS
> |   When a transaction fails or is canceled, Branch records generated in the
> |   transaction are not removed from the Branch record buffer.
> 
> I think what this is saying is:
> 
> 	/*
> 	 * BRBINFx_EL1.LASTFAILED indicates that a TME transaction failed (or
> 	 * was cancelled) prior to this record, and some number of records
> 	 * prior to this one may have been generated during an attempt to
> 	 * execute the transaction.
> 	 *
> 	 * We will remove such entries later in process_branch_aborts().
> 	 */
> 
> Is that right?

Right, will update the comment here.

> 
>> +
>> +		/*
>> +		 * All these information (i.e transaction state and mispredicts)
>> +		 * are not available for target only branch records.
>> +		 */
>> +		if (!brbe_target(brbinf)) {
> 
> Could we rename these heleprs for clarity, e.g.
> brbe_record_is_{target_only,source_only,complete}()

Sure, will do.

> 
> With that, it would also be clearer to have:
> 
> 	/*
> 	 * These fields only exist for complete and source-only records.
> 	 */
> 	if (brbe_record_is_complete(brbinf) ||
> 	    brbe_record_is_source_only()) {
> 
> ... and explicilty match the cases we care about[

Sure, will invert the check and update the comment here.

> 
> 
>> +			WARN_ON_ONCE(!(brbcr & BRBCR_EL1_MPRED));
> 
> Huh? Why does the value of BRBCR matter here?

This is just a code hardening measure here. Before recording branch record
cycles or its flags, ensure BRBCR_EL1 was configured correctly to produce
these additional information along with the branch records.

> 
>> +			entry->mispred = brbe_fetch_mispredict(brbinf);
>> +			entry->predicted = !entry->mispred;
>> +			entry->in_tx = brbe_fetch_in_tx(brbinf);
>> +		}
>> +	}
>> +
>> +	if (branch_sample_priv(event)) {
>> +		/*
>> +		 * All these information (i.e branch privilege level) are not
>> +		 * available for source only branch records.
>> +		 */
>> +		if (!brbe_source(brbinf))
>> +			entry->priv = brbe_fetch_perf_priv(brbinf);
> 
> Same style comment as above.

Sure, will do.

> 
>> +	}
>> +}
>> +
>> +/*
>> + * A branch record with BRBINF_EL1.LASTFAILED set, implies that all
>> + * preceding consecutive branch records, that were in a transaction
>> + * (i.e their BRBINF_EL1.TX set) have been aborted.
>> + *
>> + * Similarly BRBFCR_EL1.LASTFAILED set, indicate that all preceding
>> + * consecutive branch records upto the last record, which were in a
>> + * transaction (i.e their BRBINF_EL1.TX set) have been aborted.
>> + *
>> + * --------------------------------- -------------------
>> + * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
>> + * --------------------------------- -------------------
>> + * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
>> + * --------------------------------- -------------------
>> + * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
>> + * --------------------------------- -------------------
>> + * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>> + * --------------------------------- -------------------
>> + * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>> + * --------------------------------- -------------------
>> + * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
>> + * --------------------------------- -------------------
>> + * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
>> + * --------------------------------- -------------------
>> + * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>> + * --------------------------------- -------------------
>> + * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>> + * --------------------------------- -------------------
>> + * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>> + * --------------------------------- -------------------
> 
> Are we guaranteed to have a record between two transactions with TX = 0?

With TX = 0 i.e no transaction was active, indicates normal sequence of branches
creating their own branch records. How can there be a transaction with TX = 0 ?
Could you please be more specific here ?

> 
> AFAICT you could have a sequence where a TCOMMIT is immediately followed by a
> TSTART, and IIUC in that case you could have back-to-back records for distinct
> transactions all with TX = 1, where the first transaction could be commited,
> and the second might fail/cancel.
> 
> ... or do TCOMMIT/TCANCEL/TSTART get handled specially?

I guess these are micro-architectural implementation details which unfortunately
BRBINF_EL1/BRBCR_EL1 specifications do not capture in detail. But all it says is
that upon encountering BRBINF_EL1.LASTFAILED or BRBFCR_EL1.LASTFAILED (just for
the last record) all previous in-transaction branch records (BRBINF_EL1.TX = 1)
should be considered aborted for branch record reporting purpose.

> 
>> + *
>> + * BRBFCR_EL1.LASTFAILED == 1
>> + *
>> + * Here BRBFCR_EL1.LASTFAILED failes all those consecutive and also
>> + * in transaction branches near the end of the BRBE buffer.
>> + */
>> +static void process_branch_aborts(struct pmu_hw_events *cpuc)
>> +{
>> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
>> +	u64 brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>> +	bool lastfailed = !!(brbfcr & BRBFCR_EL1_LASTFAILED);
>> +	int idx = brbe_attr->brbe_nr - 1;
>> +	struct perf_branch_entry *entry;
>> +
>> +	do {
>> +		entry = &cpuc->branches->branch_entries[idx];
>> +		if (entry->in_tx) {
>> +			entry->abort = lastfailed;
>> +		} else {
>> +			lastfailed = entry->abort;
>> +			entry->abort = false;
>> +		}
>> +	} while (idx--, idx >= 0);
>> +}
>> +
>> +void armv8pmu_branch_reset(void)
>> +{
>> +	asm volatile(BRB_IALL);
>> +	isb();
>> +}
>> +
>> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
>> +{
>> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
>> +	u64 brbinf, brbfcr, brbcr;
>> +	int idx;
>> +
>> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
>> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>> +
>> +	/* Ensure pause on PMU interrupt is enabled */
>> +	WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));
> 
> As above, I think this needs commentary in the interrupt handler, since this
> presumably needs us to keep the IRQ asserted until we're done
> reading/manipulating records in the IRQ handler.

The base IRQ handler armv8pmu_handle_irq() is in ARMV8 PMU code inside perf_event.c
which could/should not access BRBE specific details without adding an another new
abstraction function. But I guess adding a comment should be fine.

> 
> Do we ever read this outside of the IRQ handler? AFAICT we don't, and that
> makes it seem like some of this is redundant.


> 
>> +
>> +	/* Save and clear the privilege */
>> +	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
> 
> Why? Later on we restore this, and AFAICT we don't modify it.
> 
> If it's paused, why do we care about the privilege?

This disables BRBE completely (not only pause) providing confidence that no
branch record can come in while the existing records are being processed.

> 
>> +
>> +	/* Pause the buffer */
>> +	write_sysreg_s(brbfcr | BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
>> +	isb();
> 
> Why? If we're in the IRQ handler it's already paused, and if we're not in the
> IRQ handler what prevents us racing with an IRQ?

armv8pmu_branch_read() always gets called from an IRQ context only. The point
here is to force a pause (and also disable, as I had explained earlier) before
reading the buffer.

> 
>> +
>> +	for (idx = 0; idx < brbe_attr->brbe_nr; idx++) {
>> +		struct perf_branch_entry *entry = &cpuc->branches->branch_entries[idx];
>> +
>> +		select_brbe_bank_index(idx);
>> +		brbinf = get_brbinf_reg(idx);
>> +		/*
>> +		 * There are no valid entries anymore on the buffer.
>> +		 * Abort the branch record processing to save some
>> +		 * cycles and also reduce the capture/process load
>> +		 * for the user space as well.
>> +		 */
>> +		if (brbe_invalid(brbinf))
>> +			break;
>> +
>> +		perf_clear_branch_entry_bitfields(entry);
>> +		if (brbe_valid(brbinf)) {
>> +			entry->from = get_brbsrc_reg(idx);
>> +			entry->to = get_brbtgt_reg(idx);
>> +		} else if (brbe_source(brbinf)) {
>> +			entry->from = get_brbsrc_reg(idx);
>> +			entry->to = 0;
>> +		} else if (brbe_target(brbinf)) {
>> +			entry->from = 0;
>> +			entry->to = get_brbtgt_reg(idx);
>> +		}
>> +		capture_brbe_flags(cpuc, event, brbinf, brbcr, idx);
>> +	}
>> +	cpuc->branches->branch_stack.nr = idx;
>> +	cpuc->branches->branch_stack.hw_idx = -1ULL;
>> +	process_branch_aborts(cpuc);
>> +
>> +	/* Restore privilege, enable pause on PMU interrupt */
>> +	write_sysreg_s(brbcr | BRBCR_EL1_FZP, SYS_BRBCR_EL1);
> 
> Why do we have to save/restore this?

Yes, this guarantees (more so than the paused state) that BRBE remains disabled in
privilege levels that are relevant, while the contents are being read. 

> 
>> +
>> +	/* Unpause the buffer */
>> +	write_sysreg_s(brbfcr & ~BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
>> +	isb();
>> +	armv8pmu_branch_reset();
>> +}
> 
> Why do we enable it before we reset it?

This is the last opportunity for a clean slate start for BRBE buffer before it is
back recording the branches. Basically helps in ensuring a clean start.

> 
> Surely it would make sense to reset it first, and ammortize the cost of the ISB?
> 
> That said, as above, do we actually need to pause/unpause it? Or is it already
> paused by virtue of the IRQ?

Yes, it should be paused after an IRQ but it is also enforced before reading along
with privilege level disable. Regardless the buffer needs to be un-paused and also
enabled for required privilege levels before exiting from here.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
  2023-01-13  3:02       ` Anshuman Khandual
@ 2023-02-08 19:22         ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-08 19:22 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Marc Zyngier, Mark Brown

On Fri, Jan 13, 2023 at 08:32:47AM +0530, Anshuman Khandual wrote:
> On 1/12/23 18:54, Mark Rutland wrote:
> > Hi Anshuman,
> > 
> > On Thu, Jan 05, 2023 at 08:40:35AM +0530, Anshuman Khandual wrote:
> >> This adds BRBE related register definitions and various other related field
> >> macros there in. These will be used subsequently in a BRBE driver which is
> >> being added later on.
> > 
> > I haven't verified the specific values, but this looks good to me aside from
> > one minor nit below.
> > 
> > [...]
> > 
> >> +# This is just a dummy register declaration to get all common field masks and
> >> +# shifts for accessing given BRBINF contents.
> >> +Sysreg	BRBINF_EL1	2	1	8	0	0
> > 
> > We don't need a dummy declaration, as we have 'SysregFields' that can be used
> > for this, e.g.
> > 
> >   SysregFields BRBINFx_EL1
> >   ...
> >   EndSysregFields
> > 
> > ... which will avoid accidental usage of the register encoding. Note that I've
> > also added an 'x' there in place of the index, which we do for other registers,
> > e.g. TTBRx_EL1.
> > 
> > Could you please update to that?
> 
> There is a problem in defining SysregFields (which I did explore earlier as well).
> SysregFields unfortunately does not support enums fields. Following build failure
> comes up, while trying to convert BRBINFx_EL1 into a SysregFields definition.
> 
> Error at 932: unexpected Enum (inside SysregFields)

This is a problem, but it's one that we can solve. We're in control of
gen-sysreg.awk and the language it parses, so we can make this an expected and
supported case -- see below.

> ===============================================================================
> diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
> index a7f9054bd84c..519c4f080898 100644
> --- a/arch/arm64/tools/sysreg
> +++ b/arch/arm64/tools/sysreg
> @@ -921,10 +921,7 @@ Enum       3:0     BT
>  EndEnum
>  EndSysreg
>  
> -
> -# This is just a dummy register declaration to get all common field masks and
> -# shifts for accessing given BRBINF contents.
> -Sysreg BRBINF_EL1      2       1       8       0       0
> +SysregFields BRBINFx_EL1
>  Res0   63:47
>  Field  46      CCU
>  Field  45:32   CC
> @@ -967,7 +964,7 @@ Enum        1:0     VALID
>         0b10    SOURCE
>         0b11    FULL
>  EndEnum
> -EndSysreg
> +EndSysregFields
>  
>  Sysreg BRBCR_EL1       2       1       9       0       0
>  Res0   63:24
> ===============================================================================
> 
> There are three enum fields in BRBINFx_EL1 as listed here.
> 
> Enum    13:8            TYPE
> Enum    7:6		EL
> Enum    1:0     	VALID
> 
> However, BRBINF_EL1 can be changed as BRBINFx_EL1, indicating its more generic
> nature with a potential to be used for any index value register thereafter.

It's certainly better to use the BRBINFx_EL1 name, but my main concern here is
to avoid the dummy values used above to satisfy the tools, so that those cannot
be accidentally misused.

I'd prefer that we fix gen-sysreg.awk to support Enum blocks within
SysregFields blocks (patch below), then use SysregFields as described above.

Thanks,
Mark.

---->8----
From 0c194d92b0b9ff3b32f666a4610b077fdf1b4b93 Mon Sep 17 00:00:00 2001
From: Mark Rutland <mark.rutland@arm.com>
Date: Wed, 8 Feb 2023 17:55:08 +0000
Subject: [PATCH] arm64/sysreg: allow *Enum blocks in SysregFields blocks

We'd like to support Enum/SignedEnum/UnsignedEnum blocks within
SysregFields blocks, so that we can define enumerations for sets of
registers. This isn't currently supported by gen-sysreg.awk due to the
way we track the active block, which can't handle more than a single
layer of nesting, which imposes an awkward requirement that when ending
a block we know what the parent block is when calling change_block()

Make this nicer by using a stack of active blocks, with block_push() to
start a block, and block_pop() to end a block. Doing so means hat when
ending a block we don't need to know the parent block type, and checks
of the active block become more consistent. On top of that, it's easy to
permit *Enum blocks within both Sysreg and SysregFields blocks.

To aid debugging, the stack of active blocks is reported for fatal
errors, and an error is raised if the file is terminated without ending
the active block. For clarity I've renamed the top-level element from
"None" to "Root".

The Fields element it intended only for use within Systeg blocks, and
does not make sense within SysregFields blocks, and so remains forbidden
within a SysregFields block.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/tools/gen-sysreg.awk | 93 ++++++++++++++++++++-------------
 1 file changed, 57 insertions(+), 36 deletions(-)

diff --git a/arch/arm64/tools/gen-sysreg.awk b/arch/arm64/tools/gen-sysreg.awk
index 7f27d66a17e1..066ebf5410fa 100755
--- a/arch/arm64/tools/gen-sysreg.awk
+++ b/arch/arm64/tools/gen-sysreg.awk
@@ -4,23 +4,35 @@
 #
 # Usage: awk -f gen-sysreg.awk sysregs.txt
 
+function block_current() {
+	return __current_block[__current_block_depth];
+}
+
 # Log an error and terminate
 function fatal(msg) {
 	print "Error at " NR ": " msg > "/dev/stderr"
+
+	printf "Current block nesting:"
+
+	for (i = 0; i <= __current_block_depth; i++) {
+		printf " " __current_block[i]
+	}
+	printf "\n"
+
 	exit 1
 }
 
-# Sanity check that the start or end of a block makes sense at this point in
-# the file. If not, produce an error and terminate.
-#
-# @this - the $Block or $EndBlock
-# @prev - the only valid block to already be in (value of @block)
-# @new - the new value of @block
-function change_block(this, prev, new) {
-	if (block != prev)
-		fatal("unexpected " this " (inside " block ")")
-
-	block = new
+# Enter a new block, setting the active block to @block
+function block_push(block) {
+	__current_block[++__current_block_depth] = block
+}
+
+# Exit a block, setting the active block to the parent block
+function block_pop() {
+	if (__current_block_depth == 0)
+		fatal("error: block_pop() in root block")
+
+	__current_block_depth--;
 }
 
 # Sanity check the number of records for a field makes sense. If not, produce
@@ -84,10 +96,14 @@ BEGIN {
 	print "/* Generated file - do not edit */"
 	print ""
 
-	block = "None"
+	__current_block_depth = 0
+	__current_block[__current_block_depth] = "Root"
 }
 
 END {
+	if (__current_block_depth != 0)
+		fatal("Missing terminator for " block_current() " block")
+
 	print "#endif /* __ASM_SYSREG_DEFS_H */"
 }
 
@@ -95,8 +111,9 @@ END {
 /^$/ { next }
 /^[\t ]*#/ { next }
 
-/^SysregFields/ {
-	change_block("SysregFields", "None", "SysregFields")
+/^SysregFields/ && block_current() == "Root" {
+	block_push("SysregFields")
+
 	expect_fields(2)
 
 	reg = $2
@@ -109,12 +126,10 @@ END {
 	next
 }
 
-/^EndSysregFields/ {
+/^EndSysregFields/ && block_current() == "SysregFields" {
 	if (next_bit > 0)
 		fatal("Unspecified bits in " reg)
 
-	change_block("EndSysregFields", "SysregFields", "None")
-
 	define(reg "_RES0", "(" res0 ")")
 	define(reg "_RES1", "(" res1 ")")
 	print ""
@@ -123,11 +138,13 @@ END {
 	res0 = null
 	res1 = null
 
+	block_pop()
 	next
 }
 
-/^Sysreg/ {
-	change_block("Sysreg", "None", "Sysreg")
+/^Sysreg/ && block_current() == "Root" {
+	block_push("Sysreg")
+
 	expect_fields(7)
 
 	reg = $2
@@ -156,12 +173,10 @@ END {
 	next
 }
 
-/^EndSysreg/ {
+/^EndSysreg/ && block_current() == "Sysreg" {
 	if (next_bit > 0)
 		fatal("Unspecified bits in " reg)
 
-	change_block("EndSysreg", "Sysreg", "None")
-
 	if (res0 != null)
 		define(reg "_RES0", "(" res0 ")")
 	if (res1 != null)
@@ -178,12 +193,13 @@ END {
 	res0 = null
 	res1 = null
 
+	block_pop()
 	next
 }
 
 # Currently this is effectivey a comment, in future we may want to emit
 # defines for the fields.
-/^Fields/ && (block == "Sysreg") {
+/^Fields/ && block_current() == "Sysreg" {
 	expect_fields(2)
 
 	if (next_bit != 63)
@@ -200,7 +216,7 @@ END {
 }
 
 
-/^Res0/ && (block == "Sysreg" || block == "SysregFields") {
+/^Res0/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(2)
 	parse_bitdef(reg, "RES0", $2)
 	field = "RES0_" msb "_" lsb
@@ -210,7 +226,7 @@ END {
 	next
 }
 
-/^Res1/ && (block == "Sysreg" || block == "SysregFields") {
+/^Res1/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(2)
 	parse_bitdef(reg, "RES1", $2)
 	field = "RES1_" msb "_" lsb
@@ -220,7 +236,7 @@ END {
 	next
 }
 
-/^Field/ && (block == "Sysreg" || block == "SysregFields") {
+/^Field/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(3)
 	field = $3
 	parse_bitdef(reg, field, $2)
@@ -231,15 +247,16 @@ END {
 	next
 }
 
-/^Raz/ && (block == "Sysreg" || block == "SysregFields") {
+/^Raz/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(2)
 	parse_bitdef(reg, field, $2)
 
 	next
 }
 
-/^SignedEnum/ {
-	change_block("Enum<", "Sysreg", "Enum")
+/^SignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+	block_push("Enum")
+
 	expect_fields(3)
 	field = $3
 	parse_bitdef(reg, field, $2)
@@ -250,8 +267,9 @@ END {
 	next
 }
 
-/^UnsignedEnum/ {
-	change_block("Enum<", "Sysreg", "Enum")
+/^UnsignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+	block_push("Enum")
+
 	expect_fields(3)
 	field = $3
 	parse_bitdef(reg, field, $2)
@@ -262,8 +280,9 @@ END {
 	next
 }
 
-/^Enum/ {
-	change_block("Enum", "Sysreg", "Enum")
+/^Enum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+	block_push("Enum")
+
 	expect_fields(3)
 	field = $3
 	parse_bitdef(reg, field, $2)
@@ -273,16 +292,18 @@ END {
 	next
 }
 
-/^EndEnum/ {
-	change_block("EndEnum", "Enum", "Sysreg")
+/^EndEnum/ && block_current() == "Enum" {
+
 	field = null
 	msb = null
 	lsb = null
 	print ""
+
+	block_pop()
 	next
 }
 
-/0b[01]+/ && block == "Enum" {
+/0b[01]+/ && block_current() == "Enum" {
 	expect_fields(2)
 	val = $1
 	name = $2
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
@ 2023-02-08 19:22         ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-08 19:22 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Marc Zyngier, Mark Brown

On Fri, Jan 13, 2023 at 08:32:47AM +0530, Anshuman Khandual wrote:
> On 1/12/23 18:54, Mark Rutland wrote:
> > Hi Anshuman,
> > 
> > On Thu, Jan 05, 2023 at 08:40:35AM +0530, Anshuman Khandual wrote:
> >> This adds BRBE related register definitions and various other related field
> >> macros there in. These will be used subsequently in a BRBE driver which is
> >> being added later on.
> > 
> > I haven't verified the specific values, but this looks good to me aside from
> > one minor nit below.
> > 
> > [...]
> > 
> >> +# This is just a dummy register declaration to get all common field masks and
> >> +# shifts for accessing given BRBINF contents.
> >> +Sysreg	BRBINF_EL1	2	1	8	0	0
> > 
> > We don't need a dummy declaration, as we have 'SysregFields' that can be used
> > for this, e.g.
> > 
> >   SysregFields BRBINFx_EL1
> >   ...
> >   EndSysregFields
> > 
> > ... which will avoid accidental usage of the register encoding. Note that I've
> > also added an 'x' there in place of the index, which we do for other registers,
> > e.g. TTBRx_EL1.
> > 
> > Could you please update to that?
> 
> There is a problem in defining SysregFields (which I did explore earlier as well).
> SysregFields unfortunately does not support enums fields. Following build failure
> comes up, while trying to convert BRBINFx_EL1 into a SysregFields definition.
> 
> Error at 932: unexpected Enum (inside SysregFields)

This is a problem, but it's one that we can solve. We're in control of
gen-sysreg.awk and the language it parses, so we can make this an expected and
supported case -- see below.

> ===============================================================================
> diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
> index a7f9054bd84c..519c4f080898 100644
> --- a/arch/arm64/tools/sysreg
> +++ b/arch/arm64/tools/sysreg
> @@ -921,10 +921,7 @@ Enum       3:0     BT
>  EndEnum
>  EndSysreg
>  
> -
> -# This is just a dummy register declaration to get all common field masks and
> -# shifts for accessing given BRBINF contents.
> -Sysreg BRBINF_EL1      2       1       8       0       0
> +SysregFields BRBINFx_EL1
>  Res0   63:47
>  Field  46      CCU
>  Field  45:32   CC
> @@ -967,7 +964,7 @@ Enum        1:0     VALID
>         0b10    SOURCE
>         0b11    FULL
>  EndEnum
> -EndSysreg
> +EndSysregFields
>  
>  Sysreg BRBCR_EL1       2       1       9       0       0
>  Res0   63:24
> ===============================================================================
> 
> There are three enum fields in BRBINFx_EL1 as listed here.
> 
> Enum    13:8            TYPE
> Enum    7:6		EL
> Enum    1:0     	VALID
> 
> However, BRBINF_EL1 can be changed as BRBINFx_EL1, indicating its more generic
> nature with a potential to be used for any index value register thereafter.

It's certainly better to use the BRBINFx_EL1 name, but my main concern here is
to avoid the dummy values used above to satisfy the tools, so that those cannot
be accidentally misused.

I'd prefer that we fix gen-sysreg.awk to support Enum blocks within
SysregFields blocks (patch below), then use SysregFields as described above.

Thanks,
Mark.

---->8----
From 0c194d92b0b9ff3b32f666a4610b077fdf1b4b93 Mon Sep 17 00:00:00 2001
From: Mark Rutland <mark.rutland@arm.com>
Date: Wed, 8 Feb 2023 17:55:08 +0000
Subject: [PATCH] arm64/sysreg: allow *Enum blocks in SysregFields blocks

We'd like to support Enum/SignedEnum/UnsignedEnum blocks within
SysregFields blocks, so that we can define enumerations for sets of
registers. This isn't currently supported by gen-sysreg.awk due to the
way we track the active block, which can't handle more than a single
layer of nesting, which imposes an awkward requirement that when ending
a block we know what the parent block is when calling change_block()

Make this nicer by using a stack of active blocks, with block_push() to
start a block, and block_pop() to end a block. Doing so means hat when
ending a block we don't need to know the parent block type, and checks
of the active block become more consistent. On top of that, it's easy to
permit *Enum blocks within both Sysreg and SysregFields blocks.

To aid debugging, the stack of active blocks is reported for fatal
errors, and an error is raised if the file is terminated without ending
the active block. For clarity I've renamed the top-level element from
"None" to "Root".

The Fields element it intended only for use within Systeg blocks, and
does not make sense within SysregFields blocks, and so remains forbidden
within a SysregFields block.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Will Deacon <will@kernel.org>
---
 arch/arm64/tools/gen-sysreg.awk | 93 ++++++++++++++++++++-------------
 1 file changed, 57 insertions(+), 36 deletions(-)

diff --git a/arch/arm64/tools/gen-sysreg.awk b/arch/arm64/tools/gen-sysreg.awk
index 7f27d66a17e1..066ebf5410fa 100755
--- a/arch/arm64/tools/gen-sysreg.awk
+++ b/arch/arm64/tools/gen-sysreg.awk
@@ -4,23 +4,35 @@
 #
 # Usage: awk -f gen-sysreg.awk sysregs.txt
 
+function block_current() {
+	return __current_block[__current_block_depth];
+}
+
 # Log an error and terminate
 function fatal(msg) {
 	print "Error at " NR ": " msg > "/dev/stderr"
+
+	printf "Current block nesting:"
+
+	for (i = 0; i <= __current_block_depth; i++) {
+		printf " " __current_block[i]
+	}
+	printf "\n"
+
 	exit 1
 }
 
-# Sanity check that the start or end of a block makes sense at this point in
-# the file. If not, produce an error and terminate.
-#
-# @this - the $Block or $EndBlock
-# @prev - the only valid block to already be in (value of @block)
-# @new - the new value of @block
-function change_block(this, prev, new) {
-	if (block != prev)
-		fatal("unexpected " this " (inside " block ")")
-
-	block = new
+# Enter a new block, setting the active block to @block
+function block_push(block) {
+	__current_block[++__current_block_depth] = block
+}
+
+# Exit a block, setting the active block to the parent block
+function block_pop() {
+	if (__current_block_depth == 0)
+		fatal("error: block_pop() in root block")
+
+	__current_block_depth--;
 }
 
 # Sanity check the number of records for a field makes sense. If not, produce
@@ -84,10 +96,14 @@ BEGIN {
 	print "/* Generated file - do not edit */"
 	print ""
 
-	block = "None"
+	__current_block_depth = 0
+	__current_block[__current_block_depth] = "Root"
 }
 
 END {
+	if (__current_block_depth != 0)
+		fatal("Missing terminator for " block_current() " block")
+
 	print "#endif /* __ASM_SYSREG_DEFS_H */"
 }
 
@@ -95,8 +111,9 @@ END {
 /^$/ { next }
 /^[\t ]*#/ { next }
 
-/^SysregFields/ {
-	change_block("SysregFields", "None", "SysregFields")
+/^SysregFields/ && block_current() == "Root" {
+	block_push("SysregFields")
+
 	expect_fields(2)
 
 	reg = $2
@@ -109,12 +126,10 @@ END {
 	next
 }
 
-/^EndSysregFields/ {
+/^EndSysregFields/ && block_current() == "SysregFields" {
 	if (next_bit > 0)
 		fatal("Unspecified bits in " reg)
 
-	change_block("EndSysregFields", "SysregFields", "None")
-
 	define(reg "_RES0", "(" res0 ")")
 	define(reg "_RES1", "(" res1 ")")
 	print ""
@@ -123,11 +138,13 @@ END {
 	res0 = null
 	res1 = null
 
+	block_pop()
 	next
 }
 
-/^Sysreg/ {
-	change_block("Sysreg", "None", "Sysreg")
+/^Sysreg/ && block_current() == "Root" {
+	block_push("Sysreg")
+
 	expect_fields(7)
 
 	reg = $2
@@ -156,12 +173,10 @@ END {
 	next
 }
 
-/^EndSysreg/ {
+/^EndSysreg/ && block_current() == "Sysreg" {
 	if (next_bit > 0)
 		fatal("Unspecified bits in " reg)
 
-	change_block("EndSysreg", "Sysreg", "None")
-
 	if (res0 != null)
 		define(reg "_RES0", "(" res0 ")")
 	if (res1 != null)
@@ -178,12 +193,13 @@ END {
 	res0 = null
 	res1 = null
 
+	block_pop()
 	next
 }
 
 # Currently this is effectivey a comment, in future we may want to emit
 # defines for the fields.
-/^Fields/ && (block == "Sysreg") {
+/^Fields/ && block_current() == "Sysreg" {
 	expect_fields(2)
 
 	if (next_bit != 63)
@@ -200,7 +216,7 @@ END {
 }
 
 
-/^Res0/ && (block == "Sysreg" || block == "SysregFields") {
+/^Res0/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(2)
 	parse_bitdef(reg, "RES0", $2)
 	field = "RES0_" msb "_" lsb
@@ -210,7 +226,7 @@ END {
 	next
 }
 
-/^Res1/ && (block == "Sysreg" || block == "SysregFields") {
+/^Res1/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(2)
 	parse_bitdef(reg, "RES1", $2)
 	field = "RES1_" msb "_" lsb
@@ -220,7 +236,7 @@ END {
 	next
 }
 
-/^Field/ && (block == "Sysreg" || block == "SysregFields") {
+/^Field/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(3)
 	field = $3
 	parse_bitdef(reg, field, $2)
@@ -231,15 +247,16 @@ END {
 	next
 }
 
-/^Raz/ && (block == "Sysreg" || block == "SysregFields") {
+/^Raz/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
 	expect_fields(2)
 	parse_bitdef(reg, field, $2)
 
 	next
 }
 
-/^SignedEnum/ {
-	change_block("Enum<", "Sysreg", "Enum")
+/^SignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+	block_push("Enum")
+
 	expect_fields(3)
 	field = $3
 	parse_bitdef(reg, field, $2)
@@ -250,8 +267,9 @@ END {
 	next
 }
 
-/^UnsignedEnum/ {
-	change_block("Enum<", "Sysreg", "Enum")
+/^UnsignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+	block_push("Enum")
+
 	expect_fields(3)
 	field = $3
 	parse_bitdef(reg, field, $2)
@@ -262,8 +280,9 @@ END {
 	next
 }
 
-/^Enum/ {
-	change_block("Enum", "Sysreg", "Enum")
+/^Enum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
+	block_push("Enum")
+
 	expect_fields(3)
 	field = $3
 	parse_bitdef(reg, field, $2)
@@ -273,16 +292,18 @@ END {
 	next
 }
 
-/^EndEnum/ {
-	change_block("EndEnum", "Enum", "Sysreg")
+/^EndEnum/ && block_current() == "Enum" {
+
 	field = null
 	msb = null
 	lsb = null
 	print ""
+
+	block_pop()
 	next
 }
 
-/0b[01]+/ && block == "Enum" {
+/0b[01]+/ && block_current() == "Enum" {
 	expect_fields(2)
 	val = $1
 	name = $2
-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 3/6] arm64/perf: Add branch stack support in struct arm_pmu
  2023-01-13  4:15       ` Anshuman Khandual
@ 2023-02-08 19:26         ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-08 19:26 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Fri, Jan 13, 2023 at 09:45:22AM +0530, Anshuman Khandual wrote:
> 
> On 1/12/23 19:24, Mark Rutland wrote:
> > On Thu, Jan 05, 2023 at 08:40:36AM +0530, Anshuman Khandual wrote:
> >>  struct arm_pmu {
> >>  	struct pmu	pmu;
> >>  	cpumask_t	supported_cpus;
> >>  	char		*name;
> >>  	int		pmuver;
> >> +	int		features;
> >>  	irqreturn_t	(*handle_irq)(struct arm_pmu *pmu);
> >>  	void		(*enable)(struct perf_event *event);
> >>  	void		(*disable)(struct perf_event *event);
> > 
> > Hmm, we already have the secure_access field separately. How about we fold that
> > in and go with:
> > 
> > 	unsigned int	secure_access    : 1,
> > 			has_branch_stack : 1;
> 
> Something like this would work, but should we use __u32 instead of unsigned int
> to ensure 32 bit width ?

I don't think that's necessary; the exact size doesn't really matter, and
unsigned int is 32-bits on all targets suppropted by Linux, not just arm and
arm64.

I do agree that if this were a userspace ABI detail, it might be preferable to
use __u32. However, I think using it here gives the misleading impression that
there is an ABI concern when there is not, and as above it's not necessary, so
I'd prefer unsigned int here.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 3/6] arm64/perf: Add branch stack support in struct arm_pmu
@ 2023-02-08 19:26         ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-08 19:26 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Fri, Jan 13, 2023 at 09:45:22AM +0530, Anshuman Khandual wrote:
> 
> On 1/12/23 19:24, Mark Rutland wrote:
> > On Thu, Jan 05, 2023 at 08:40:36AM +0530, Anshuman Khandual wrote:
> >>  struct arm_pmu {
> >>  	struct pmu	pmu;
> >>  	cpumask_t	supported_cpus;
> >>  	char		*name;
> >>  	int		pmuver;
> >> +	int		features;
> >>  	irqreturn_t	(*handle_irq)(struct arm_pmu *pmu);
> >>  	void		(*enable)(struct perf_event *event);
> >>  	void		(*disable)(struct perf_event *event);
> > 
> > Hmm, we already have the secure_access field separately. How about we fold that
> > in and go with:
> > 
> > 	unsigned int	secure_access    : 1,
> > 			has_branch_stack : 1;
> 
> Something like this would work, but should we use __u32 instead of unsigned int
> to ensure 32 bit width ?

I don't think that's necessary; the exact size doesn't really matter, and
unsigned int is 32-bits on all targets suppropted by Linux, not just arm and
arm64.

I do agree that if this were a userspace ABI detail, it might be preferable to
use __u32. However, I think using it here gives the misleading impression that
there is an ABI concern when there is not, and as above it's not necessary, so
I'd prefer unsigned int here.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
  2023-01-13  5:11       ` Anshuman Khandual
@ 2023-02-08 19:36         ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-08 19:36 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Fri, Jan 13, 2023 at 10:41:51AM +0530, Anshuman Khandual wrote:
> 
> 
> On 1/12/23 19:59, Mark Rutland wrote:
> > On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
> >> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
> >>  		if (!armpmu_event_set_period(event))
> >>  			continue;
> >>  
> >> +		if (has_branch_stack(event)) {
> >> +			WARN_ON(!cpuc->branches);
> >> +			armv8pmu_branch_read(cpuc, event);
> >> +			data.br_stack = &cpuc->branches->branch_stack;
> >> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> >> +		}
> > 
> > How do we ensure the data we're getting isn't changed under our feet? Is BRBE
> > disabled at this point?
> 
> Right, BRBE is paused after a PMU IRQ. We also ensure the buffer is disabled for
> all exception levels, i.e removing BRBCR_EL1_E0BRE/E1BRE from the configuration,
> before initiating the actual read, which eventually populates the data.br_stack.

Ok; just to confirm, what exactly is the condition that enforces that BRBE is
disabled? Is that *while* there's an overflow asserted, or does something else
get set at the instant the overflow occurs?

What exactly is necessary for it to start again?

> > Is this going to have branches after taking the exception, or does BRBE stop
> > automatically at that point? If so we presumably need to take special care as
> > to when we read this relative to enabling/disabling and/or manipulating the
> > overflow bits.
> 
> The default BRBE configuration includes setting BRBCR_EL1.FZP, enabling BRBE to
> be paused automatically, right after a PMU IRQ. Regardless, before reading the
> buffer, BRBE is paused (BRBFCR_EL1.PAUSED) and disabled for all privilege levels
> ~(BRBCR_EL1.E0BRE/E1BRE) which ensures that no new branch record is getting into
> the buffer, while it is being read for perf right buffer.

Ok; I think we could do with some comments as to this.

> 
> > 
> >> +
> >>  		/*
> >>  		 * Perf event overflow will queue the processing of the event as
> >>  		 * an irq_work which will be taken care of in the handling of
> >> @@ -976,6 +995,14 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
> >>  	return event->hw.idx;
> >>  }
> >>  
> >> +static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
> >> +{
> >> +	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
> >> +
> >> +	if (sched_in && arm_pmu_branch_stack_supported(armpmu))
> >> +		armv8pmu_branch_reset();
> >> +}
> > 
> > When scheduling out, shouldn't we save what we have so far?
> > 
> > It seems odd that we just throw that away rather than placing it into a FIFO.
> 
> IIRC we had discussed this earlier, save and restore mechanism will be added
> later, not during this enablement patch series. 

Sorry, but why?

I don't understand why it's acceptable to non-deterministically throw away data
for now. At the least that's going to confuse users, especially as the
observable behaviour may change if and when that's added later.

I assume that there's some reason that it's painful to do that? Could you
please elaborate on that?

> For now resetting the buffer ensures that branch records from one session
> does not get into another. 

I agree that it's necessary to do that, but as above I don't believe it's
sufficient.

> Note that these branches cannot be pushed into perf ring buffer either, as
> there was no corresponding PMU interrupt to be associated with.

I'm not suggesting we put it in the perf ring buffer; I'm suggesting that we
snapshot it into *some* kernel-internal storage, then later reconcile that.

Maybe that's far more painful than I expect?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
@ 2023-02-08 19:36         ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-08 19:36 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Fri, Jan 13, 2023 at 10:41:51AM +0530, Anshuman Khandual wrote:
> 
> 
> On 1/12/23 19:59, Mark Rutland wrote:
> > On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
> >> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
> >>  		if (!armpmu_event_set_period(event))
> >>  			continue;
> >>  
> >> +		if (has_branch_stack(event)) {
> >> +			WARN_ON(!cpuc->branches);
> >> +			armv8pmu_branch_read(cpuc, event);
> >> +			data.br_stack = &cpuc->branches->branch_stack;
> >> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> >> +		}
> > 
> > How do we ensure the data we're getting isn't changed under our feet? Is BRBE
> > disabled at this point?
> 
> Right, BRBE is paused after a PMU IRQ. We also ensure the buffer is disabled for
> all exception levels, i.e removing BRBCR_EL1_E0BRE/E1BRE from the configuration,
> before initiating the actual read, which eventually populates the data.br_stack.

Ok; just to confirm, what exactly is the condition that enforces that BRBE is
disabled? Is that *while* there's an overflow asserted, or does something else
get set at the instant the overflow occurs?

What exactly is necessary for it to start again?

> > Is this going to have branches after taking the exception, or does BRBE stop
> > automatically at that point? If so we presumably need to take special care as
> > to when we read this relative to enabling/disabling and/or manipulating the
> > overflow bits.
> 
> The default BRBE configuration includes setting BRBCR_EL1.FZP, enabling BRBE to
> be paused automatically, right after a PMU IRQ. Regardless, before reading the
> buffer, BRBE is paused (BRBFCR_EL1.PAUSED) and disabled for all privilege levels
> ~(BRBCR_EL1.E0BRE/E1BRE) which ensures that no new branch record is getting into
> the buffer, while it is being read for perf right buffer.

Ok; I think we could do with some comments as to this.

> 
> > 
> >> +
> >>  		/*
> >>  		 * Perf event overflow will queue the processing of the event as
> >>  		 * an irq_work which will be taken care of in the handling of
> >> @@ -976,6 +995,14 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
> >>  	return event->hw.idx;
> >>  }
> >>  
> >> +static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
> >> +{
> >> +	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
> >> +
> >> +	if (sched_in && arm_pmu_branch_stack_supported(armpmu))
> >> +		armv8pmu_branch_reset();
> >> +}
> > 
> > When scheduling out, shouldn't we save what we have so far?
> > 
> > It seems odd that we just throw that away rather than placing it into a FIFO.
> 
> IIRC we had discussed this earlier, save and restore mechanism will be added
> later, not during this enablement patch series. 

Sorry, but why?

I don't understand why it's acceptable to non-deterministically throw away data
for now. At the least that's going to confuse users, especially as the
observable behaviour may change if and when that's added later.

I assume that there's some reason that it's painful to do that? Could you
please elaborate on that?

> For now resetting the buffer ensures that branch records from one session
> does not get into another. 

I agree that it's necessary to do that, but as above I don't believe it's
sufficient.

> Note that these branches cannot be pushed into perf ring buffer either, as
> there was no corresponding PMU interrupt to be associated with.

I'm not suggesting we put it in the perf ring buffer; I'm suggesting that we
snapshot it into *some* kernel-internal storage, then later reconcile that.

Maybe that's far more painful than I expect?

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
  2023-01-19  2:48       ` Anshuman Khandual
@ 2023-02-08 20:03         ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-08 20:03 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Thu, Jan 19, 2023 at 08:18:47AM +0530, Anshuman Khandual wrote:
> On 1/12/23 22:21, Mark Rutland wrote:
> > On Thu, Jan 05, 2023 at 08:40:39AM +0530, Anshuman Khandual wrote:
> >> +bool armv8pmu_branch_valid(struct perf_event *event)
> >> +{
> >> +	u64 branch_type = event->attr.branch_sample_type;
> >> +
> >> +	/*
> >> +	 * If the event does not have at least one of the privilege
> >> +	 * branch filters as in PERF_SAMPLE_BRANCH_PLM_ALL, the core
> >> +	 * perf will adjust its value based on perf event's existing
> >> +	 * privilege level via attr.exclude_[user|kernel|hv].
> >> +	 *
> >> +	 * As event->attr.branch_sample_type might have been changed
> >> +	 * when the event reaches here, it is not possible to figure
> >> +	 * out whether the event originally had HV privilege request
> >> +	 * or got added via the core perf. Just report this situation
> >> +	 * once and continue ignoring if there are other instances.
> >> +	 */
> >> +	if ((branch_type & PERF_SAMPLE_BRANCH_HV) && !is_kernel_in_hyp_mode())
> >> +		pr_warn_once("%s - hypervisor privilege\n", branch_filter_error_msg);
> >> +
> >> +	if (branch_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
> >> +		pr_warn_once("%s - aborted transaction\n", branch_filter_error_msg);
> >> +		return false;
> >> +	}
> >> +
> >> +	if (branch_type & PERF_SAMPLE_BRANCH_NO_TX) {
> >> +		pr_warn_once("%s - no transaction\n", branch_filter_error_msg);
> >> +		return false;
> >> +	}
> >> +
> >> +	if (branch_type & PERF_SAMPLE_BRANCH_IN_TX) {
> >> +		pr_warn_once("%s - in transaction\n", branch_filter_error_msg);
> >> +		return false;
> >> +	}
> >> +	return true;
> >> +}
> > 
> > Is this called when validating user input? If so, NAK to printing anything to a
> > higher leval than debug. If there are constraints the user needs to be aware of
> 
> You mean pr_debug() based prints ?

Yes.

> > It would be better to whitelist what we do support rather than blacklisting
> > what we don't.
> 
> But with a negative list, user would know what is not supported via these pr_debug()
> based output when enabled ? But I dont have a strong opinion either way.

With a negative list, when new options are added the driver will erroneously
and silently accept them, which is worse.

> 
> > 
> >> +
> >> +static void branch_records_alloc(struct arm_pmu *armpmu)
> >> +{
> >> +	struct pmu_hw_events *events;
> >> +	int cpu;
> >> +
> >> +	for_each_possible_cpu(cpu) {
> >> +		events = per_cpu_ptr(armpmu->hw_events, cpu);
> >> +
> >> +		events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
> >> +		WARN_ON(!events->branches);
> >> +	}
> >> +}
> > 
> > It would be simpler for this to be a percpu allocation.
> 
> Could you please be more specific ? alloc_percpu_gfp() cannot be used here
> because 'events->branches' is not a __percpu variable unlike its parent
> 'events' which is derived from armpmu.

You can allocate it per-cpu, then grab each of the cpu's pointers using
per_cpu() and place those into events->branches.

That way you only make one allocation which can fail, which makes the error
path much simpler.

[...]

> >> +static int brbe_attributes_probe(struct arm_pmu *armpmu, u32 brbe)
> >> +{
> >> +	struct brbe_hw_attr *brbe_attr = kzalloc(sizeof(struct brbe_hw_attr), GFP_KERNEL);
> > 
> > Same comments as for the failure path in branch_records_alloc().
> > 
> >> +	u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
> > 
> > Which context is this run in? Unless this is affine to a relevant CPU we can't
> > read the sysreg safely, and if we're in a cross-call we cannot allocate memory,
> > so this doesn't look right to me.
> 
> Called from smp_call_function_any() context via __armv8pmu_probe_pmu().

Ok; so the read is safe, but the allocation is not.

[...]

> >> +	WARN_ON(!brbe_attr);
> >> +	armpmu->private = brbe_attr;
> >> +
> >> +	brbe_attr->brbe_version = brbe;
> >> +	brbe_attr->brbe_format = brbe_fetch_format(brbidr);
> >> +	brbe_attr->brbe_cc = brbe_fetch_cc_bits(brbidr);
> >> +	brbe_attr->brbe_nr = brbe_fetch_numrec(brbidr);
> > 
> > As a minor thing, could we please s/fetch/get/ ? To me, 'fetch' sounds like a
> > memory operation, and elsewhere we use 'get' for this sort of getter function.
> 
> Sure, but shall we change fetch as get for entire BRBE implementation (where ever
> there is a determination of field from a register value) or just the above function ?
> Default, will change all places.

I had meant in all cases, so that's perfect, thanks.


[...]

> >> +			WARN_ON_ONCE(!(brbcr & BRBCR_EL1_MPRED));
> > 
> > Huh? Why does the value of BRBCR matter here?
> 
> This is just a code hardening measure here. Before recording branch record
> cycles or its flags, ensure BRBCR_EL1 was configured correctly to produce
> these additional information along with the branch records.

I don't think that's necessary. Where is brbcr written such that this could be
misconfigured?

At the least, this needs a comment as to why we need to check, and what we're
checking for.

[...]

> >> +/*
> >> + * A branch record with BRBINF_EL1.LASTFAILED set, implies that all
> >> + * preceding consecutive branch records, that were in a transaction
> >> + * (i.e their BRBINF_EL1.TX set) have been aborted.
> >> + *
> >> + * Similarly BRBFCR_EL1.LASTFAILED set, indicate that all preceding
> >> + * consecutive branch records upto the last record, which were in a
> >> + * transaction (i.e their BRBINF_EL1.TX set) have been aborted.
> >> + *
> >> + * --------------------------------- -------------------
> >> + * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
> >> + * --------------------------------- -------------------
> >> + * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
> >> + * --------------------------------- -------------------
> >> + * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
> >> + * --------------------------------- -------------------
> >> + * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> >> + * --------------------------------- -------------------
> >> + * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> >> + * --------------------------------- -------------------
> >> + * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
> >> + * --------------------------------- -------------------
> >> + * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
> >> + * --------------------------------- -------------------
> >> + * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> >> + * --------------------------------- -------------------
> >> + * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> >> + * --------------------------------- -------------------
> >> + * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> >> + * --------------------------------- -------------------
> > 
> > Are we guaranteed to have a record between two transactions with TX = 0?
> 
> With TX = 0 i.e no transaction was active, indicates normal sequence of branches
> creating their own branch records. How can there be a transaction with TX = 0 ?
> Could you please be more specific here ?

Consider a sequence of a successful transaction followed by a cancelled
transation, with not branches between the first transation being commited and
the second transaction starting:

	TSTART	// TX=1
	...     // TX=1
	TCOMMIT // TX=1
	TSTART  // TX=1
	...     // TX=1
	<failure>
		// TX=0, LF=1

AFAICT, we are not guaranteed to have a record with TX=0 between that
successful TCOMMIT and the subsequent TSTART, and so the LASTFAILED field
doesn't indicate that *all* preceding records with TX set are part of the
failed transaction.

Am I missing something? e.g. does the TCOMMIT get records with TX=0?

> > AFAICT you could have a sequence where a TCOMMIT is immediately followed by a
> > TSTART, and IIUC in that case you could have back-to-back records for distinct
> > transactions all with TX = 1, where the first transaction could be commited,
> > and the second might fail/cancel.
> > 
> > ... or do TCOMMIT/TCANCEL/TSTART get handled specially?
> 
> I guess these are micro-architectural implementation details which unfortunately
> BRBINF_EL1/BRBCR_EL1 specifications do not capture in detail. But all it says is
> that upon encountering BRBINF_EL1.LASTFAILED or BRBFCR_EL1.LASTFAILED (just for
> the last record) all previous in-transaction branch records (BRBINF_EL1.TX = 1)
> should be considered aborted for branch record reporting purpose.

Ok, so we're throwing away data?

If we're going to do that, it would be good to at least have a comment
explaining why we're forced to do so. Ideally we'd get the architecture
clarified/fixed, since AFAIK no-one has actually built TME yet, and it might be
a simple fix (as above).

[...]

> >> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
> >> +{
> >> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
> >> +	u64 brbinf, brbfcr, brbcr;
> >> +	int idx;
> >> +
> >> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
> >> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> >> +
> >> +	/* Ensure pause on PMU interrupt is enabled */
> >> +	WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));
> > 
> > As above, I think this needs commentary in the interrupt handler, since this
> > presumably needs us to keep the IRQ asserted until we're done
> > reading/manipulating records in the IRQ handler.
> 
> The base IRQ handler armv8pmu_handle_irq() is in ARMV8 PMU code inside perf_event.c
> which could/should not access BRBE specific details without adding an another new
> abstraction function. But I guess adding a comment should be fine.

I think it's fine to have a comment there saying that we *must not* do
something that woukd break BRBE.

> >> +
> >> +	/* Save and clear the privilege */
> >> +	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
> > 
> > Why? Later on we restore this, and AFAICT we don't modify it.
> > 
> > If it's paused, why do we care about the privilege?
> 
> This disables BRBE completely (not only pause) providing confidence that no
> branch record can come in while the existing records are being processed.

I thought from earlier that it was automatically paused by HW upon raising the
IRQ. Have I misunderstood, and we *must* stop it, or is this a belt-and-braces
additional disable?

Is that not the case, or do we not trust the pause for some reason?

Regardless, the comment should expalin *why* we're doing this (i.e. that this
is about ensuring BRBE does not create new records while we're manipulating
it).

> >> +	/* Unpause the buffer */
> >> +	write_sysreg_s(brbfcr & ~BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
> >> +	isb();
> >> +	armv8pmu_branch_reset();
> >> +}
> > 
> > Why do we enable it before we reset it?
> 
> This is the last opportunity for a clean slate start for BRBE buffer before it is
> back recording the branches. Basically helps in ensuring a clean start.

My point is why do we start if *before* resetting it, rather than restting it
first? Why give it the opportunity to create records that we're going to
discard immediately thereafter?

> > Surely it would make sense to reset it first, and ammortize the cost of the ISB?
> > 
> > That said, as above, do we actually need to pause/unpause it? Or is it already
> > paused by virtue of the IRQ?
> 
> Yes, it should be paused after an IRQ but it is also enforced before reading along
> with privilege level disable.

I'm very confused as to why we're not trusting the HW to remain paused. Why do
we need to enforce what th e hardware should already be doing?

> Regardless the buffer needs to be un-paused and also
> enabled for required privilege levels before exiting from here.

I agree this needs to be balanced, it just seems to me that we're doing
redundant work here.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
@ 2023-02-08 20:03         ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-08 20:03 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Thu, Jan 19, 2023 at 08:18:47AM +0530, Anshuman Khandual wrote:
> On 1/12/23 22:21, Mark Rutland wrote:
> > On Thu, Jan 05, 2023 at 08:40:39AM +0530, Anshuman Khandual wrote:
> >> +bool armv8pmu_branch_valid(struct perf_event *event)
> >> +{
> >> +	u64 branch_type = event->attr.branch_sample_type;
> >> +
> >> +	/*
> >> +	 * If the event does not have at least one of the privilege
> >> +	 * branch filters as in PERF_SAMPLE_BRANCH_PLM_ALL, the core
> >> +	 * perf will adjust its value based on perf event's existing
> >> +	 * privilege level via attr.exclude_[user|kernel|hv].
> >> +	 *
> >> +	 * As event->attr.branch_sample_type might have been changed
> >> +	 * when the event reaches here, it is not possible to figure
> >> +	 * out whether the event originally had HV privilege request
> >> +	 * or got added via the core perf. Just report this situation
> >> +	 * once and continue ignoring if there are other instances.
> >> +	 */
> >> +	if ((branch_type & PERF_SAMPLE_BRANCH_HV) && !is_kernel_in_hyp_mode())
> >> +		pr_warn_once("%s - hypervisor privilege\n", branch_filter_error_msg);
> >> +
> >> +	if (branch_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
> >> +		pr_warn_once("%s - aborted transaction\n", branch_filter_error_msg);
> >> +		return false;
> >> +	}
> >> +
> >> +	if (branch_type & PERF_SAMPLE_BRANCH_NO_TX) {
> >> +		pr_warn_once("%s - no transaction\n", branch_filter_error_msg);
> >> +		return false;
> >> +	}
> >> +
> >> +	if (branch_type & PERF_SAMPLE_BRANCH_IN_TX) {
> >> +		pr_warn_once("%s - in transaction\n", branch_filter_error_msg);
> >> +		return false;
> >> +	}
> >> +	return true;
> >> +}
> > 
> > Is this called when validating user input? If so, NAK to printing anything to a
> > higher leval than debug. If there are constraints the user needs to be aware of
> 
> You mean pr_debug() based prints ?

Yes.

> > It would be better to whitelist what we do support rather than blacklisting
> > what we don't.
> 
> But with a negative list, user would know what is not supported via these pr_debug()
> based output when enabled ? But I dont have a strong opinion either way.

With a negative list, when new options are added the driver will erroneously
and silently accept them, which is worse.

> 
> > 
> >> +
> >> +static void branch_records_alloc(struct arm_pmu *armpmu)
> >> +{
> >> +	struct pmu_hw_events *events;
> >> +	int cpu;
> >> +
> >> +	for_each_possible_cpu(cpu) {
> >> +		events = per_cpu_ptr(armpmu->hw_events, cpu);
> >> +
> >> +		events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
> >> +		WARN_ON(!events->branches);
> >> +	}
> >> +}
> > 
> > It would be simpler for this to be a percpu allocation.
> 
> Could you please be more specific ? alloc_percpu_gfp() cannot be used here
> because 'events->branches' is not a __percpu variable unlike its parent
> 'events' which is derived from armpmu.

You can allocate it per-cpu, then grab each of the cpu's pointers using
per_cpu() and place those into events->branches.

That way you only make one allocation which can fail, which makes the error
path much simpler.

[...]

> >> +static int brbe_attributes_probe(struct arm_pmu *armpmu, u32 brbe)
> >> +{
> >> +	struct brbe_hw_attr *brbe_attr = kzalloc(sizeof(struct brbe_hw_attr), GFP_KERNEL);
> > 
> > Same comments as for the failure path in branch_records_alloc().
> > 
> >> +	u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
> > 
> > Which context is this run in? Unless this is affine to a relevant CPU we can't
> > read the sysreg safely, and if we're in a cross-call we cannot allocate memory,
> > so this doesn't look right to me.
> 
> Called from smp_call_function_any() context via __armv8pmu_probe_pmu().

Ok; so the read is safe, but the allocation is not.

[...]

> >> +	WARN_ON(!brbe_attr);
> >> +	armpmu->private = brbe_attr;
> >> +
> >> +	brbe_attr->brbe_version = brbe;
> >> +	brbe_attr->brbe_format = brbe_fetch_format(brbidr);
> >> +	brbe_attr->brbe_cc = brbe_fetch_cc_bits(brbidr);
> >> +	brbe_attr->brbe_nr = brbe_fetch_numrec(brbidr);
> > 
> > As a minor thing, could we please s/fetch/get/ ? To me, 'fetch' sounds like a
> > memory operation, and elsewhere we use 'get' for this sort of getter function.
> 
> Sure, but shall we change fetch as get for entire BRBE implementation (where ever
> there is a determination of field from a register value) or just the above function ?
> Default, will change all places.

I had meant in all cases, so that's perfect, thanks.


[...]

> >> +			WARN_ON_ONCE(!(brbcr & BRBCR_EL1_MPRED));
> > 
> > Huh? Why does the value of BRBCR matter here?
> 
> This is just a code hardening measure here. Before recording branch record
> cycles or its flags, ensure BRBCR_EL1 was configured correctly to produce
> these additional information along with the branch records.

I don't think that's necessary. Where is brbcr written such that this could be
misconfigured?

At the least, this needs a comment as to why we need to check, and what we're
checking for.

[...]

> >> +/*
> >> + * A branch record with BRBINF_EL1.LASTFAILED set, implies that all
> >> + * preceding consecutive branch records, that were in a transaction
> >> + * (i.e their BRBINF_EL1.TX set) have been aborted.
> >> + *
> >> + * Similarly BRBFCR_EL1.LASTFAILED set, indicate that all preceding
> >> + * consecutive branch records upto the last record, which were in a
> >> + * transaction (i.e their BRBINF_EL1.TX set) have been aborted.
> >> + *
> >> + * --------------------------------- -------------------
> >> + * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
> >> + * --------------------------------- -------------------
> >> + * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
> >> + * --------------------------------- -------------------
> >> + * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
> >> + * --------------------------------- -------------------
> >> + * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> >> + * --------------------------------- -------------------
> >> + * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> >> + * --------------------------------- -------------------
> >> + * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
> >> + * --------------------------------- -------------------
> >> + * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
> >> + * --------------------------------- -------------------
> >> + * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> >> + * --------------------------------- -------------------
> >> + * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> >> + * --------------------------------- -------------------
> >> + * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
> >> + * --------------------------------- -------------------
> > 
> > Are we guaranteed to have a record between two transactions with TX = 0?
> 
> With TX = 0 i.e no transaction was active, indicates normal sequence of branches
> creating their own branch records. How can there be a transaction with TX = 0 ?
> Could you please be more specific here ?

Consider a sequence of a successful transaction followed by a cancelled
transation, with not branches between the first transation being commited and
the second transaction starting:

	TSTART	// TX=1
	...     // TX=1
	TCOMMIT // TX=1
	TSTART  // TX=1
	...     // TX=1
	<failure>
		// TX=0, LF=1

AFAICT, we are not guaranteed to have a record with TX=0 between that
successful TCOMMIT and the subsequent TSTART, and so the LASTFAILED field
doesn't indicate that *all* preceding records with TX set are part of the
failed transaction.

Am I missing something? e.g. does the TCOMMIT get records with TX=0?

> > AFAICT you could have a sequence where a TCOMMIT is immediately followed by a
> > TSTART, and IIUC in that case you could have back-to-back records for distinct
> > transactions all with TX = 1, where the first transaction could be commited,
> > and the second might fail/cancel.
> > 
> > ... or do TCOMMIT/TCANCEL/TSTART get handled specially?
> 
> I guess these are micro-architectural implementation details which unfortunately
> BRBINF_EL1/BRBCR_EL1 specifications do not capture in detail. But all it says is
> that upon encountering BRBINF_EL1.LASTFAILED or BRBFCR_EL1.LASTFAILED (just for
> the last record) all previous in-transaction branch records (BRBINF_EL1.TX = 1)
> should be considered aborted for branch record reporting purpose.

Ok, so we're throwing away data?

If we're going to do that, it would be good to at least have a comment
explaining why we're forced to do so. Ideally we'd get the architecture
clarified/fixed, since AFAIK no-one has actually built TME yet, and it might be
a simple fix (as above).

[...]

> >> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
> >> +{
> >> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
> >> +	u64 brbinf, brbfcr, brbcr;
> >> +	int idx;
> >> +
> >> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
> >> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
> >> +
> >> +	/* Ensure pause on PMU interrupt is enabled */
> >> +	WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));
> > 
> > As above, I think this needs commentary in the interrupt handler, since this
> > presumably needs us to keep the IRQ asserted until we're done
> > reading/manipulating records in the IRQ handler.
> 
> The base IRQ handler armv8pmu_handle_irq() is in ARMV8 PMU code inside perf_event.c
> which could/should not access BRBE specific details without adding an another new
> abstraction function. But I guess adding a comment should be fine.

I think it's fine to have a comment there saying that we *must not* do
something that woukd break BRBE.

> >> +
> >> +	/* Save and clear the privilege */
> >> +	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
> > 
> > Why? Later on we restore this, and AFAICT we don't modify it.
> > 
> > If it's paused, why do we care about the privilege?
> 
> This disables BRBE completely (not only pause) providing confidence that no
> branch record can come in while the existing records are being processed.

I thought from earlier that it was automatically paused by HW upon raising the
IRQ. Have I misunderstood, and we *must* stop it, or is this a belt-and-braces
additional disable?

Is that not the case, or do we not trust the pause for some reason?

Regardless, the comment should expalin *why* we're doing this (i.e. that this
is about ensuring BRBE does not create new records while we're manipulating
it).

> >> +	/* Unpause the buffer */
> >> +	write_sysreg_s(brbfcr & ~BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
> >> +	isb();
> >> +	armv8pmu_branch_reset();
> >> +}
> > 
> > Why do we enable it before we reset it?
> 
> This is the last opportunity for a clean slate start for BRBE buffer before it is
> back recording the branches. Basically helps in ensuring a clean start.

My point is why do we start if *before* resetting it, rather than restting it
first? Why give it the opportunity to create records that we're going to
discard immediately thereafter?

> > Surely it would make sense to reset it first, and ammortize the cost of the ISB?
> > 
> > That said, as above, do we actually need to pause/unpause it? Or is it already
> > paused by virtue of the IRQ?
> 
> Yes, it should be paused after an IRQ but it is also enforced before reading along
> with privilege level disable.

I'm very confused as to why we're not trusting the HW to remain paused. Why do
we need to enforce what th e hardware should already be doing?

> Regardless the buffer needs to be un-paused and also
> enabled for required privilege levels before exiting from here.

I agree this needs to be balanced, it just seems to me that we're doing
redundant work here.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 3/6] arm64/perf: Add branch stack support in struct arm_pmu
  2023-02-08 19:26         ` Mark Rutland
@ 2023-02-09  3:40           ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-02-09  3:40 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon



On 2/9/23 00:56, Mark Rutland wrote:
> On Fri, Jan 13, 2023 at 09:45:22AM +0530, Anshuman Khandual wrote:
>>
>> On 1/12/23 19:24, Mark Rutland wrote:
>>> On Thu, Jan 05, 2023 at 08:40:36AM +0530, Anshuman Khandual wrote:
>>>>  struct arm_pmu {
>>>>  	struct pmu	pmu;
>>>>  	cpumask_t	supported_cpus;
>>>>  	char		*name;
>>>>  	int		pmuver;
>>>> +	int		features;
>>>>  	irqreturn_t	(*handle_irq)(struct arm_pmu *pmu);
>>>>  	void		(*enable)(struct perf_event *event);
>>>>  	void		(*disable)(struct perf_event *event);
>>>
>>> Hmm, we already have the secure_access field separately. How about we fold that
>>> in and go with:
>>>
>>> 	unsigned int	secure_access    : 1,
>>> 			has_branch_stack : 1;
>>
>> Something like this would work, but should we use __u32 instead of unsigned int
>> to ensure 32 bit width ?
> 
> I don't think that's necessary; the exact size doesn't really matter, and
> unsigned int is 32-bits on all targets suppropted by Linux, not just arm and
> arm64.
> 
> I do agree that if this were a userspace ABI detail, it might be preferable to
> use __u32. However, I think using it here gives the misleading impression that
> there is an ABI concern when there is not, and as above it's not necessary, so
> I'd prefer unsigned int here.

Makes sense, will this as unsigned int.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 3/6] arm64/perf: Add branch stack support in struct arm_pmu
@ 2023-02-09  3:40           ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-02-09  3:40 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon



On 2/9/23 00:56, Mark Rutland wrote:
> On Fri, Jan 13, 2023 at 09:45:22AM +0530, Anshuman Khandual wrote:
>>
>> On 1/12/23 19:24, Mark Rutland wrote:
>>> On Thu, Jan 05, 2023 at 08:40:36AM +0530, Anshuman Khandual wrote:
>>>>  struct arm_pmu {
>>>>  	struct pmu	pmu;
>>>>  	cpumask_t	supported_cpus;
>>>>  	char		*name;
>>>>  	int		pmuver;
>>>> +	int		features;
>>>>  	irqreturn_t	(*handle_irq)(struct arm_pmu *pmu);
>>>>  	void		(*enable)(struct perf_event *event);
>>>>  	void		(*disable)(struct perf_event *event);
>>>
>>> Hmm, we already have the secure_access field separately. How about we fold that
>>> in and go with:
>>>
>>> 	unsigned int	secure_access    : 1,
>>> 			has_branch_stack : 1;
>>
>> Something like this would work, but should we use __u32 instead of unsigned int
>> to ensure 32 bit width ?
> 
> I don't think that's necessary; the exact size doesn't really matter, and
> unsigned int is 32-bits on all targets suppropted by Linux, not just arm and
> arm64.
> 
> I do agree that if this were a userspace ABI detail, it might be preferable to
> use __u32. However, I think using it here gives the misleading impression that
> there is an ABI concern when there is not, and as above it's not necessary, so
> I'd prefer unsigned int here.

Makes sense, will this as unsigned int.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
  2023-02-08 19:22         ` Mark Rutland
@ 2023-02-09  5:49           ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-02-09  5:49 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Marc Zyngier, Mark Brown



On 2/9/23 00:52, Mark Rutland wrote:
> On Fri, Jan 13, 2023 at 08:32:47AM +0530, Anshuman Khandual wrote:
>> On 1/12/23 18:54, Mark Rutland wrote:
>>> Hi Anshuman,
>>>
>>> On Thu, Jan 05, 2023 at 08:40:35AM +0530, Anshuman Khandual wrote:
>>>> This adds BRBE related register definitions and various other related field
>>>> macros there in. These will be used subsequently in a BRBE driver which is
>>>> being added later on.
>>>
>>> I haven't verified the specific values, but this looks good to me aside from
>>> one minor nit below.
>>>
>>> [...]
>>>
>>>> +# This is just a dummy register declaration to get all common field masks and
>>>> +# shifts for accessing given BRBINF contents.
>>>> +Sysreg	BRBINF_EL1	2	1	8	0	0
>>>
>>> We don't need a dummy declaration, as we have 'SysregFields' that can be used
>>> for this, e.g.
>>>
>>>   SysregFields BRBINFx_EL1
>>>   ...
>>>   EndSysregFields
>>>
>>> ... which will avoid accidental usage of the register encoding. Note that I've
>>> also added an 'x' there in place of the index, which we do for other registers,
>>> e.g. TTBRx_EL1.
>>>
>>> Could you please update to that?
>>
>> There is a problem in defining SysregFields (which I did explore earlier as well).
>> SysregFields unfortunately does not support enums fields. Following build failure
>> comes up, while trying to convert BRBINFx_EL1 into a SysregFields definition.
>>
>> Error at 932: unexpected Enum (inside SysregFields)
> 
> This is a problem, but it's one that we can solve. We're in control of
> gen-sysreg.awk and the language it parses, so we can make this an expected and
> supported case -- see below.
> 
>> ===============================================================================
>> diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
>> index a7f9054bd84c..519c4f080898 100644
>> --- a/arch/arm64/tools/sysreg
>> +++ b/arch/arm64/tools/sysreg
>> @@ -921,10 +921,7 @@ Enum       3:0     BT
>>  EndEnum
>>  EndSysreg
>>  
>> -
>> -# This is just a dummy register declaration to get all common field masks and
>> -# shifts for accessing given BRBINF contents.
>> -Sysreg BRBINF_EL1      2       1       8       0       0
>> +SysregFields BRBINFx_EL1
>>  Res0   63:47
>>  Field  46      CCU
>>  Field  45:32   CC
>> @@ -967,7 +964,7 @@ Enum        1:0     VALID
>>         0b10    SOURCE
>>         0b11    FULL
>>  EndEnum
>> -EndSysreg
>> +EndSysregFields
>>  
>>  Sysreg BRBCR_EL1       2       1       9       0       0
>>  Res0   63:24
>> ===============================================================================
>>
>> There are three enum fields in BRBINFx_EL1 as listed here.
>>
>> Enum    13:8            TYPE
>> Enum    7:6		EL
>> Enum    1:0     	VALID
>>
>> However, BRBINF_EL1 can be changed as BRBINFx_EL1, indicating its more generic
>> nature with a potential to be used for any index value register thereafter.
> 
> It's certainly better to use the BRBINFx_EL1 name, but my main concern here is
> to avoid the dummy values used above to satisfy the tools, so that those cannot
> be accidentally misused.
> 
> I'd prefer that we fix gen-sysreg.awk to support Enum blocks within
> SysregFields blocks (patch below), then use SysregFields as described above.

The following patch did not apply cleanly on v6.2-rc7 but eventually did so after
some changes. Is the patch against mainline or arm64-next ? Nonetheless, it does
solve the enum problem for SysregFields. With this patch in place, I was able to

- Change Sysreg BRBINF_EL1 as SysregFields BRBINFx_EL1
- Change BRBINF_EL1_XXX fields usage as BRBINFx_EL1_XXX fields

Should I take this patch with this series as an initial prerequisite patch or you
would like to post this now for current merge window ?

> 
> Thanks,
> Mark.
> 
> ---->8----
>>From 0c194d92b0b9ff3b32f666a4610b077fdf1b4b93 Mon Sep 17 00:00:00 2001
> From: Mark Rutland <mark.rutland@arm.com>
> Date: Wed, 8 Feb 2023 17:55:08 +0000
> Subject: [PATCH] arm64/sysreg: allow *Enum blocks in SysregFields blocks
> 
> We'd like to support Enum/SignedEnum/UnsignedEnum blocks within
> SysregFields blocks, so that we can define enumerations for sets of
> registers. This isn't currently supported by gen-sysreg.awk due to the
> way we track the active block, which can't handle more than a single
> layer of nesting, which imposes an awkward requirement that when ending
> a block we know what the parent block is when calling change_block()
> 
> Make this nicer by using a stack of active blocks, with block_push() to
> start a block, and block_pop() to end a block. Doing so means hat when
> ending a block we don't need to know the parent block type, and checks
> of the active block become more consistent. On top of that, it's easy to
> permit *Enum blocks within both Sysreg and SysregFields blocks.
> 
> To aid debugging, the stack of active blocks is reported for fatal
> errors, and an error is raised if the file is terminated without ending
> the active block. For clarity I've renamed the top-level element from
> "None" to "Root".
> 
> The Fields element it intended only for use within Systeg blocks, and
> does not make sense within SysregFields blocks, and so remains forbidden
> within a SysregFields block.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/tools/gen-sysreg.awk | 93 ++++++++++++++++++++-------------
>  1 file changed, 57 insertions(+), 36 deletions(-)
> 
> diff --git a/arch/arm64/tools/gen-sysreg.awk b/arch/arm64/tools/gen-sysreg.awk
> index 7f27d66a17e1..066ebf5410fa 100755
> --- a/arch/arm64/tools/gen-sysreg.awk
> +++ b/arch/arm64/tools/gen-sysreg.awk
> @@ -4,23 +4,35 @@
>  #
>  # Usage: awk -f gen-sysreg.awk sysregs.txt
>  
> +function block_current() {
> +	return __current_block[__current_block_depth];
> +}
> +
>  # Log an error and terminate
>  function fatal(msg) {
>  	print "Error at " NR ": " msg > "/dev/stderr"
> +
> +	printf "Current block nesting:"
> +
> +	for (i = 0; i <= __current_block_depth; i++) {
> +		printf " " __current_block[i]
> +	}
> +	printf "\n"
> +
>  	exit 1
>  }
>  
> -# Sanity check that the start or end of a block makes sense at this point in
> -# the file. If not, produce an error and terminate.
> -#
> -# @this - the $Block or $EndBlock
> -# @prev - the only valid block to already be in (value of @block)
> -# @new - the new value of @block
> -function change_block(this, prev, new) {
> -	if (block != prev)
> -		fatal("unexpected " this " (inside " block ")")
> -
> -	block = new
> +# Enter a new block, setting the active block to @block
> +function block_push(block) {
> +	__current_block[++__current_block_depth] = block
> +}
> +
> +# Exit a block, setting the active block to the parent block
> +function block_pop() {
> +	if (__current_block_depth == 0)
> +		fatal("error: block_pop() in root block")
> +
> +	__current_block_depth--;
>  }
>  
>  # Sanity check the number of records for a field makes sense. If not, produce
> @@ -84,10 +96,14 @@ BEGIN {
>  	print "/* Generated file - do not edit */"
>  	print ""
>  
> -	block = "None"
> +	__current_block_depth = 0
> +	__current_block[__current_block_depth] = "Root"
>  }
>  
>  END {
> +	if (__current_block_depth != 0)
> +		fatal("Missing terminator for " block_current() " block")
> +
>  	print "#endif /* __ASM_SYSREG_DEFS_H */"
>  }
>  
> @@ -95,8 +111,9 @@ END {
>  /^$/ { next }
>  /^[\t ]*#/ { next }
>  
> -/^SysregFields/ {
> -	change_block("SysregFields", "None", "SysregFields")
> +/^SysregFields/ && block_current() == "Root" {
> +	block_push("SysregFields")
> +
>  	expect_fields(2)
>  
>  	reg = $2
> @@ -109,12 +126,10 @@ END {
>  	next
>  }
>  
> -/^EndSysregFields/ {
> +/^EndSysregFields/ && block_current() == "SysregFields" {
>  	if (next_bit > 0)
>  		fatal("Unspecified bits in " reg)
>  
> -	change_block("EndSysregFields", "SysregFields", "None")
> -
>  	define(reg "_RES0", "(" res0 ")")
>  	define(reg "_RES1", "(" res1 ")")
>  	print ""
> @@ -123,11 +138,13 @@ END {
>  	res0 = null
>  	res1 = null
>  
> +	block_pop()
>  	next
>  }
>  
> -/^Sysreg/ {
> -	change_block("Sysreg", "None", "Sysreg")
> +/^Sysreg/ && block_current() == "Root" {
> +	block_push("Sysreg")
> +
>  	expect_fields(7)
>  
>  	reg = $2
> @@ -156,12 +173,10 @@ END {
>  	next
>  }
>  
> -/^EndSysreg/ {
> +/^EndSysreg/ && block_current() == "Sysreg" {
>  	if (next_bit > 0)
>  		fatal("Unspecified bits in " reg)
>  
> -	change_block("EndSysreg", "Sysreg", "None")
> -
>  	if (res0 != null)
>  		define(reg "_RES0", "(" res0 ")")
>  	if (res1 != null)
> @@ -178,12 +193,13 @@ END {
>  	res0 = null
>  	res1 = null
>  
> +	block_pop()
>  	next
>  }
>  
>  # Currently this is effectivey a comment, in future we may want to emit
>  # defines for the fields.
> -/^Fields/ && (block == "Sysreg") {
> +/^Fields/ && block_current() == "Sysreg" {
>  	expect_fields(2)
>  
>  	if (next_bit != 63)
> @@ -200,7 +216,7 @@ END {
>  }
>  
>  
> -/^Res0/ && (block == "Sysreg" || block == "SysregFields") {
> +/^Res0/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(2)
>  	parse_bitdef(reg, "RES0", $2)
>  	field = "RES0_" msb "_" lsb
> @@ -210,7 +226,7 @@ END {
>  	next
>  }
>  
> -/^Res1/ && (block == "Sysreg" || block == "SysregFields") {
> +/^Res1/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(2)
>  	parse_bitdef(reg, "RES1", $2)
>  	field = "RES1_" msb "_" lsb
> @@ -220,7 +236,7 @@ END {
>  	next
>  }
>  
> -/^Field/ && (block == "Sysreg" || block == "SysregFields") {
> +/^Field/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(3)
>  	field = $3
>  	parse_bitdef(reg, field, $2)
> @@ -231,15 +247,16 @@ END {
>  	next
>  }
>  
> -/^Raz/ && (block == "Sysreg" || block == "SysregFields") {
> +/^Raz/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(2)
>  	parse_bitdef(reg, field, $2)
>  
>  	next
>  }
>  
> -/^SignedEnum/ {
> -	change_block("Enum<", "Sysreg", "Enum")
> +/^SignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +	block_push("Enum")
> +
>  	expect_fields(3)
>  	field = $3
>  	parse_bitdef(reg, field, $2)
> @@ -250,8 +267,9 @@ END {
>  	next
>  }
>  
> -/^UnsignedEnum/ {
> -	change_block("Enum<", "Sysreg", "Enum")
> +/^UnsignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +	block_push("Enum")
> +
>  	expect_fields(3)
>  	field = $3
>  	parse_bitdef(reg, field, $2)
> @@ -262,8 +280,9 @@ END {
>  	next
>  }
>  
> -/^Enum/ {
> -	change_block("Enum", "Sysreg", "Enum")
> +/^Enum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +	block_push("Enum")
> +
>  	expect_fields(3)
>  	field = $3
>  	parse_bitdef(reg, field, $2)
> @@ -273,16 +292,18 @@ END {
>  	next
>  }
>  
> -/^EndEnum/ {
> -	change_block("EndEnum", "Enum", "Sysreg")
> +/^EndEnum/ && block_current() == "Enum" {
> +
>  	field = null
>  	msb = null
>  	lsb = null
>  	print ""
> +
> +	block_pop()
>  	next
>  }
>  
> -/0b[01]+/ && block == "Enum" {
> +/0b[01]+/ && block_current() == "Enum" {
>  	expect_fields(2)
>  	val = $1
>  	name = $2

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
@ 2023-02-09  5:49           ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-02-09  5:49 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Marc Zyngier, Mark Brown



On 2/9/23 00:52, Mark Rutland wrote:
> On Fri, Jan 13, 2023 at 08:32:47AM +0530, Anshuman Khandual wrote:
>> On 1/12/23 18:54, Mark Rutland wrote:
>>> Hi Anshuman,
>>>
>>> On Thu, Jan 05, 2023 at 08:40:35AM +0530, Anshuman Khandual wrote:
>>>> This adds BRBE related register definitions and various other related field
>>>> macros there in. These will be used subsequently in a BRBE driver which is
>>>> being added later on.
>>>
>>> I haven't verified the specific values, but this looks good to me aside from
>>> one minor nit below.
>>>
>>> [...]
>>>
>>>> +# This is just a dummy register declaration to get all common field masks and
>>>> +# shifts for accessing given BRBINF contents.
>>>> +Sysreg	BRBINF_EL1	2	1	8	0	0
>>>
>>> We don't need a dummy declaration, as we have 'SysregFields' that can be used
>>> for this, e.g.
>>>
>>>   SysregFields BRBINFx_EL1
>>>   ...
>>>   EndSysregFields
>>>
>>> ... which will avoid accidental usage of the register encoding. Note that I've
>>> also added an 'x' there in place of the index, which we do for other registers,
>>> e.g. TTBRx_EL1.
>>>
>>> Could you please update to that?
>>
>> There is a problem in defining SysregFields (which I did explore earlier as well).
>> SysregFields unfortunately does not support enums fields. Following build failure
>> comes up, while trying to convert BRBINFx_EL1 into a SysregFields definition.
>>
>> Error at 932: unexpected Enum (inside SysregFields)
> 
> This is a problem, but it's one that we can solve. We're in control of
> gen-sysreg.awk and the language it parses, so we can make this an expected and
> supported case -- see below.
> 
>> ===============================================================================
>> diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
>> index a7f9054bd84c..519c4f080898 100644
>> --- a/arch/arm64/tools/sysreg
>> +++ b/arch/arm64/tools/sysreg
>> @@ -921,10 +921,7 @@ Enum       3:0     BT
>>  EndEnum
>>  EndSysreg
>>  
>> -
>> -# This is just a dummy register declaration to get all common field masks and
>> -# shifts for accessing given BRBINF contents.
>> -Sysreg BRBINF_EL1      2       1       8       0       0
>> +SysregFields BRBINFx_EL1
>>  Res0   63:47
>>  Field  46      CCU
>>  Field  45:32   CC
>> @@ -967,7 +964,7 @@ Enum        1:0     VALID
>>         0b10    SOURCE
>>         0b11    FULL
>>  EndEnum
>> -EndSysreg
>> +EndSysregFields
>>  
>>  Sysreg BRBCR_EL1       2       1       9       0       0
>>  Res0   63:24
>> ===============================================================================
>>
>> There are three enum fields in BRBINFx_EL1 as listed here.
>>
>> Enum    13:8            TYPE
>> Enum    7:6		EL
>> Enum    1:0     	VALID
>>
>> However, BRBINF_EL1 can be changed as BRBINFx_EL1, indicating its more generic
>> nature with a potential to be used for any index value register thereafter.
> 
> It's certainly better to use the BRBINFx_EL1 name, but my main concern here is
> to avoid the dummy values used above to satisfy the tools, so that those cannot
> be accidentally misused.
> 
> I'd prefer that we fix gen-sysreg.awk to support Enum blocks within
> SysregFields blocks (patch below), then use SysregFields as described above.

The following patch did not apply cleanly on v6.2-rc7 but eventually did so after
some changes. Is the patch against mainline or arm64-next ? Nonetheless, it does
solve the enum problem for SysregFields. With this patch in place, I was able to

- Change Sysreg BRBINF_EL1 as SysregFields BRBINFx_EL1
- Change BRBINF_EL1_XXX fields usage as BRBINFx_EL1_XXX fields

Should I take this patch with this series as an initial prerequisite patch or you
would like to post this now for current merge window ?

> 
> Thanks,
> Mark.
> 
> ---->8----
>>From 0c194d92b0b9ff3b32f666a4610b077fdf1b4b93 Mon Sep 17 00:00:00 2001
> From: Mark Rutland <mark.rutland@arm.com>
> Date: Wed, 8 Feb 2023 17:55:08 +0000
> Subject: [PATCH] arm64/sysreg: allow *Enum blocks in SysregFields blocks
> 
> We'd like to support Enum/SignedEnum/UnsignedEnum blocks within
> SysregFields blocks, so that we can define enumerations for sets of
> registers. This isn't currently supported by gen-sysreg.awk due to the
> way we track the active block, which can't handle more than a single
> layer of nesting, which imposes an awkward requirement that when ending
> a block we know what the parent block is when calling change_block()
> 
> Make this nicer by using a stack of active blocks, with block_push() to
> start a block, and block_pop() to end a block. Doing so means hat when
> ending a block we don't need to know the parent block type, and checks
> of the active block become more consistent. On top of that, it's easy to
> permit *Enum blocks within both Sysreg and SysregFields blocks.
> 
> To aid debugging, the stack of active blocks is reported for fatal
> errors, and an error is raised if the file is terminated without ending
> the active block. For clarity I've renamed the top-level element from
> "None" to "Root".
> 
> The Fields element it intended only for use within Systeg blocks, and
> does not make sense within SysregFields blocks, and so remains forbidden
> within a SysregFields block.
> 
> Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mark Brown <broonie@kernel.org>
> Cc: Will Deacon <will@kernel.org>
> ---
>  arch/arm64/tools/gen-sysreg.awk | 93 ++++++++++++++++++++-------------
>  1 file changed, 57 insertions(+), 36 deletions(-)
> 
> diff --git a/arch/arm64/tools/gen-sysreg.awk b/arch/arm64/tools/gen-sysreg.awk
> index 7f27d66a17e1..066ebf5410fa 100755
> --- a/arch/arm64/tools/gen-sysreg.awk
> +++ b/arch/arm64/tools/gen-sysreg.awk
> @@ -4,23 +4,35 @@
>  #
>  # Usage: awk -f gen-sysreg.awk sysregs.txt
>  
> +function block_current() {
> +	return __current_block[__current_block_depth];
> +}
> +
>  # Log an error and terminate
>  function fatal(msg) {
>  	print "Error at " NR ": " msg > "/dev/stderr"
> +
> +	printf "Current block nesting:"
> +
> +	for (i = 0; i <= __current_block_depth; i++) {
> +		printf " " __current_block[i]
> +	}
> +	printf "\n"
> +
>  	exit 1
>  }
>  
> -# Sanity check that the start or end of a block makes sense at this point in
> -# the file. If not, produce an error and terminate.
> -#
> -# @this - the $Block or $EndBlock
> -# @prev - the only valid block to already be in (value of @block)
> -# @new - the new value of @block
> -function change_block(this, prev, new) {
> -	if (block != prev)
> -		fatal("unexpected " this " (inside " block ")")
> -
> -	block = new
> +# Enter a new block, setting the active block to @block
> +function block_push(block) {
> +	__current_block[++__current_block_depth] = block
> +}
> +
> +# Exit a block, setting the active block to the parent block
> +function block_pop() {
> +	if (__current_block_depth == 0)
> +		fatal("error: block_pop() in root block")
> +
> +	__current_block_depth--;
>  }
>  
>  # Sanity check the number of records for a field makes sense. If not, produce
> @@ -84,10 +96,14 @@ BEGIN {
>  	print "/* Generated file - do not edit */"
>  	print ""
>  
> -	block = "None"
> +	__current_block_depth = 0
> +	__current_block[__current_block_depth] = "Root"
>  }
>  
>  END {
> +	if (__current_block_depth != 0)
> +		fatal("Missing terminator for " block_current() " block")
> +
>  	print "#endif /* __ASM_SYSREG_DEFS_H */"
>  }
>  
> @@ -95,8 +111,9 @@ END {
>  /^$/ { next }
>  /^[\t ]*#/ { next }
>  
> -/^SysregFields/ {
> -	change_block("SysregFields", "None", "SysregFields")
> +/^SysregFields/ && block_current() == "Root" {
> +	block_push("SysregFields")
> +
>  	expect_fields(2)
>  
>  	reg = $2
> @@ -109,12 +126,10 @@ END {
>  	next
>  }
>  
> -/^EndSysregFields/ {
> +/^EndSysregFields/ && block_current() == "SysregFields" {
>  	if (next_bit > 0)
>  		fatal("Unspecified bits in " reg)
>  
> -	change_block("EndSysregFields", "SysregFields", "None")
> -
>  	define(reg "_RES0", "(" res0 ")")
>  	define(reg "_RES1", "(" res1 ")")
>  	print ""
> @@ -123,11 +138,13 @@ END {
>  	res0 = null
>  	res1 = null
>  
> +	block_pop()
>  	next
>  }
>  
> -/^Sysreg/ {
> -	change_block("Sysreg", "None", "Sysreg")
> +/^Sysreg/ && block_current() == "Root" {
> +	block_push("Sysreg")
> +
>  	expect_fields(7)
>  
>  	reg = $2
> @@ -156,12 +173,10 @@ END {
>  	next
>  }
>  
> -/^EndSysreg/ {
> +/^EndSysreg/ && block_current() == "Sysreg" {
>  	if (next_bit > 0)
>  		fatal("Unspecified bits in " reg)
>  
> -	change_block("EndSysreg", "Sysreg", "None")
> -
>  	if (res0 != null)
>  		define(reg "_RES0", "(" res0 ")")
>  	if (res1 != null)
> @@ -178,12 +193,13 @@ END {
>  	res0 = null
>  	res1 = null
>  
> +	block_pop()
>  	next
>  }
>  
>  # Currently this is effectivey a comment, in future we may want to emit
>  # defines for the fields.
> -/^Fields/ && (block == "Sysreg") {
> +/^Fields/ && block_current() == "Sysreg" {
>  	expect_fields(2)
>  
>  	if (next_bit != 63)
> @@ -200,7 +216,7 @@ END {
>  }
>  
>  
> -/^Res0/ && (block == "Sysreg" || block == "SysregFields") {
> +/^Res0/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(2)
>  	parse_bitdef(reg, "RES0", $2)
>  	field = "RES0_" msb "_" lsb
> @@ -210,7 +226,7 @@ END {
>  	next
>  }
>  
> -/^Res1/ && (block == "Sysreg" || block == "SysregFields") {
> +/^Res1/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(2)
>  	parse_bitdef(reg, "RES1", $2)
>  	field = "RES1_" msb "_" lsb
> @@ -220,7 +236,7 @@ END {
>  	next
>  }
>  
> -/^Field/ && (block == "Sysreg" || block == "SysregFields") {
> +/^Field/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(3)
>  	field = $3
>  	parse_bitdef(reg, field, $2)
> @@ -231,15 +247,16 @@ END {
>  	next
>  }
>  
> -/^Raz/ && (block == "Sysreg" || block == "SysregFields") {
> +/^Raz/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
>  	expect_fields(2)
>  	parse_bitdef(reg, field, $2)
>  
>  	next
>  }
>  
> -/^SignedEnum/ {
> -	change_block("Enum<", "Sysreg", "Enum")
> +/^SignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +	block_push("Enum")
> +
>  	expect_fields(3)
>  	field = $3
>  	parse_bitdef(reg, field, $2)
> @@ -250,8 +267,9 @@ END {
>  	next
>  }
>  
> -/^UnsignedEnum/ {
> -	change_block("Enum<", "Sysreg", "Enum")
> +/^UnsignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +	block_push("Enum")
> +
>  	expect_fields(3)
>  	field = $3
>  	parse_bitdef(reg, field, $2)
> @@ -262,8 +280,9 @@ END {
>  	next
>  }
>  
> -/^Enum/ {
> -	change_block("Enum", "Sysreg", "Enum")
> +/^Enum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> +	block_push("Enum")
> +
>  	expect_fields(3)
>  	field = $3
>  	parse_bitdef(reg, field, $2)
> @@ -273,16 +292,18 @@ END {
>  	next
>  }
>  
> -/^EndEnum/ {
> -	change_block("EndEnum", "Enum", "Sysreg")
> +/^EndEnum/ && block_current() == "Enum" {
> +
>  	field = null
>  	msb = null
>  	lsb = null
>  	print ""
> +
> +	block_pop()
>  	next
>  }
>  
> -/0b[01]+/ && block == "Enum" {
> +/0b[01]+/ && block_current() == "Enum" {
>  	expect_fields(2)
>  	val = $1
>  	name = $2

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
  2023-02-09  5:49           ` Anshuman Khandual
@ 2023-02-09 10:08             ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-09 10:08 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Marc Zyngier, Mark Brown

On Thu, Feb 09, 2023 at 11:19:04AM +0530, Anshuman Khandual wrote:
> On 2/9/23 00:52, Mark Rutland wrote:
> > I'd prefer that we fix gen-sysreg.awk to support Enum blocks within
> > SysregFields blocks (patch below), then use SysregFields as described above.
> 
> The following patch did not apply cleanly on v6.2-rc7 but eventually did so after
> some changes. Is the patch against mainline or arm64-next ? 

Sorry I forgot to say: that needs the arm64 for-next/sysreg-hwcaps branch
(which is merged into arm64 for-next/core).

> Nonetheless, it does
> solve the enum problem for SysregFields. With this patch in place, I was able to
> 
> - Change Sysreg BRBINF_EL1 as SysregFields BRBINFx_EL1
> - Change BRBINF_EL1_XXX fields usage as BRBINFx_EL1_XXX fields

Nice!

> Should I take this patch with this series as an initial prerequisite patch or you
> would like to post this now for current merge window ?

I think for now it's best to add it to this series as a prerequisite.

Thanks,
Mark.

> 
> > 
> > Thanks,
> > Mark.
> > 
> > ---->8----
> >>From 0c194d92b0b9ff3b32f666a4610b077fdf1b4b93 Mon Sep 17 00:00:00 2001
> > From: Mark Rutland <mark.rutland@arm.com>
> > Date: Wed, 8 Feb 2023 17:55:08 +0000
> > Subject: [PATCH] arm64/sysreg: allow *Enum blocks in SysregFields blocks
> > 
> > We'd like to support Enum/SignedEnum/UnsignedEnum blocks within
> > SysregFields blocks, so that we can define enumerations for sets of
> > registers. This isn't currently supported by gen-sysreg.awk due to the
> > way we track the active block, which can't handle more than a single
> > layer of nesting, which imposes an awkward requirement that when ending
> > a block we know what the parent block is when calling change_block()
> > 
> > Make this nicer by using a stack of active blocks, with block_push() to
> > start a block, and block_pop() to end a block. Doing so means hat when
> > ending a block we don't need to know the parent block type, and checks
> > of the active block become more consistent. On top of that, it's easy to
> > permit *Enum blocks within both Sysreg and SysregFields blocks.
> > 
> > To aid debugging, the stack of active blocks is reported for fatal
> > errors, and an error is raised if the file is terminated without ending
> > the active block. For clarity I've renamed the top-level element from
> > "None" to "Root".
> > 
> > The Fields element it intended only for use within Systeg blocks, and
> > does not make sense within SysregFields blocks, and so remains forbidden
> > within a SysregFields block.
> > 
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Mark Brown <broonie@kernel.org>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >  arch/arm64/tools/gen-sysreg.awk | 93 ++++++++++++++++++++-------------
> >  1 file changed, 57 insertions(+), 36 deletions(-)
> > 
> > diff --git a/arch/arm64/tools/gen-sysreg.awk b/arch/arm64/tools/gen-sysreg.awk
> > index 7f27d66a17e1..066ebf5410fa 100755
> > --- a/arch/arm64/tools/gen-sysreg.awk
> > +++ b/arch/arm64/tools/gen-sysreg.awk
> > @@ -4,23 +4,35 @@
> >  #
> >  # Usage: awk -f gen-sysreg.awk sysregs.txt
> >  
> > +function block_current() {
> > +	return __current_block[__current_block_depth];
> > +}
> > +
> >  # Log an error and terminate
> >  function fatal(msg) {
> >  	print "Error at " NR ": " msg > "/dev/stderr"
> > +
> > +	printf "Current block nesting:"
> > +
> > +	for (i = 0; i <= __current_block_depth; i++) {
> > +		printf " " __current_block[i]
> > +	}
> > +	printf "\n"
> > +
> >  	exit 1
> >  }
> >  
> > -# Sanity check that the start or end of a block makes sense at this point in
> > -# the file. If not, produce an error and terminate.
> > -#
> > -# @this - the $Block or $EndBlock
> > -# @prev - the only valid block to already be in (value of @block)
> > -# @new - the new value of @block
> > -function change_block(this, prev, new) {
> > -	if (block != prev)
> > -		fatal("unexpected " this " (inside " block ")")
> > -
> > -	block = new
> > +# Enter a new block, setting the active block to @block
> > +function block_push(block) {
> > +	__current_block[++__current_block_depth] = block
> > +}
> > +
> > +# Exit a block, setting the active block to the parent block
> > +function block_pop() {
> > +	if (__current_block_depth == 0)
> > +		fatal("error: block_pop() in root block")
> > +
> > +	__current_block_depth--;
> >  }
> >  
> >  # Sanity check the number of records for a field makes sense. If not, produce
> > @@ -84,10 +96,14 @@ BEGIN {
> >  	print "/* Generated file - do not edit */"
> >  	print ""
> >  
> > -	block = "None"
> > +	__current_block_depth = 0
> > +	__current_block[__current_block_depth] = "Root"
> >  }
> >  
> >  END {
> > +	if (__current_block_depth != 0)
> > +		fatal("Missing terminator for " block_current() " block")
> > +
> >  	print "#endif /* __ASM_SYSREG_DEFS_H */"
> >  }
> >  
> > @@ -95,8 +111,9 @@ END {
> >  /^$/ { next }
> >  /^[\t ]*#/ { next }
> >  
> > -/^SysregFields/ {
> > -	change_block("SysregFields", "None", "SysregFields")
> > +/^SysregFields/ && block_current() == "Root" {
> > +	block_push("SysregFields")
> > +
> >  	expect_fields(2)
> >  
> >  	reg = $2
> > @@ -109,12 +126,10 @@ END {
> >  	next
> >  }
> >  
> > -/^EndSysregFields/ {
> > +/^EndSysregFields/ && block_current() == "SysregFields" {
> >  	if (next_bit > 0)
> >  		fatal("Unspecified bits in " reg)
> >  
> > -	change_block("EndSysregFields", "SysregFields", "None")
> > -
> >  	define(reg "_RES0", "(" res0 ")")
> >  	define(reg "_RES1", "(" res1 ")")
> >  	print ""
> > @@ -123,11 +138,13 @@ END {
> >  	res0 = null
> >  	res1 = null
> >  
> > +	block_pop()
> >  	next
> >  }
> >  
> > -/^Sysreg/ {
> > -	change_block("Sysreg", "None", "Sysreg")
> > +/^Sysreg/ && block_current() == "Root" {
> > +	block_push("Sysreg")
> > +
> >  	expect_fields(7)
> >  
> >  	reg = $2
> > @@ -156,12 +173,10 @@ END {
> >  	next
> >  }
> >  
> > -/^EndSysreg/ {
> > +/^EndSysreg/ && block_current() == "Sysreg" {
> >  	if (next_bit > 0)
> >  		fatal("Unspecified bits in " reg)
> >  
> > -	change_block("EndSysreg", "Sysreg", "None")
> > -
> >  	if (res0 != null)
> >  		define(reg "_RES0", "(" res0 ")")
> >  	if (res1 != null)
> > @@ -178,12 +193,13 @@ END {
> >  	res0 = null
> >  	res1 = null
> >  
> > +	block_pop()
> >  	next
> >  }
> >  
> >  # Currently this is effectivey a comment, in future we may want to emit
> >  # defines for the fields.
> > -/^Fields/ && (block == "Sysreg") {
> > +/^Fields/ && block_current() == "Sysreg" {
> >  	expect_fields(2)
> >  
> >  	if (next_bit != 63)
> > @@ -200,7 +216,7 @@ END {
> >  }
> >  
> >  
> > -/^Res0/ && (block == "Sysreg" || block == "SysregFields") {
> > +/^Res0/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> >  	expect_fields(2)
> >  	parse_bitdef(reg, "RES0", $2)
> >  	field = "RES0_" msb "_" lsb
> > @@ -210,7 +226,7 @@ END {
> >  	next
> >  }
> >  
> > -/^Res1/ && (block == "Sysreg" || block == "SysregFields") {
> > +/^Res1/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> >  	expect_fields(2)
> >  	parse_bitdef(reg, "RES1", $2)
> >  	field = "RES1_" msb "_" lsb
> > @@ -220,7 +236,7 @@ END {
> >  	next
> >  }
> >  
> > -/^Field/ && (block == "Sysreg" || block == "SysregFields") {
> > +/^Field/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> >  	expect_fields(3)
> >  	field = $3
> >  	parse_bitdef(reg, field, $2)
> > @@ -231,15 +247,16 @@ END {
> >  	next
> >  }
> >  
> > -/^Raz/ && (block == "Sysreg" || block == "SysregFields") {
> > +/^Raz/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> >  	expect_fields(2)
> >  	parse_bitdef(reg, field, $2)
> >  
> >  	next
> >  }
> >  
> > -/^SignedEnum/ {
> > -	change_block("Enum<", "Sysreg", "Enum")
> > +/^SignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> > +	block_push("Enum")
> > +
> >  	expect_fields(3)
> >  	field = $3
> >  	parse_bitdef(reg, field, $2)
> > @@ -250,8 +267,9 @@ END {
> >  	next
> >  }
> >  
> > -/^UnsignedEnum/ {
> > -	change_block("Enum<", "Sysreg", "Enum")
> > +/^UnsignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> > +	block_push("Enum")
> > +
> >  	expect_fields(3)
> >  	field = $3
> >  	parse_bitdef(reg, field, $2)
> > @@ -262,8 +280,9 @@ END {
> >  	next
> >  }
> >  
> > -/^Enum/ {
> > -	change_block("Enum", "Sysreg", "Enum")
> > +/^Enum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> > +	block_push("Enum")
> > +
> >  	expect_fields(3)
> >  	field = $3
> >  	parse_bitdef(reg, field, $2)
> > @@ -273,16 +292,18 @@ END {
> >  	next
> >  }
> >  
> > -/^EndEnum/ {
> > -	change_block("EndEnum", "Enum", "Sysreg")
> > +/^EndEnum/ && block_current() == "Enum" {
> > +
> >  	field = null
> >  	msb = null
> >  	lsb = null
> >  	print ""
> > +
> > +	block_pop()
> >  	next
> >  }
> >  
> > -/0b[01]+/ && block == "Enum" {
> > +/0b[01]+/ && block_current() == "Enum" {
> >  	expect_fields(2)
> >  	val = $1
> >  	name = $2

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields
@ 2023-02-09 10:08             ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-09 10:08 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon,
	Marc Zyngier, Mark Brown

On Thu, Feb 09, 2023 at 11:19:04AM +0530, Anshuman Khandual wrote:
> On 2/9/23 00:52, Mark Rutland wrote:
> > I'd prefer that we fix gen-sysreg.awk to support Enum blocks within
> > SysregFields blocks (patch below), then use SysregFields as described above.
> 
> The following patch did not apply cleanly on v6.2-rc7 but eventually did so after
> some changes. Is the patch against mainline or arm64-next ? 

Sorry I forgot to say: that needs the arm64 for-next/sysreg-hwcaps branch
(which is merged into arm64 for-next/core).

> Nonetheless, it does
> solve the enum problem for SysregFields. With this patch in place, I was able to
> 
> - Change Sysreg BRBINF_EL1 as SysregFields BRBINFx_EL1
> - Change BRBINF_EL1_XXX fields usage as BRBINFx_EL1_XXX fields

Nice!

> Should I take this patch with this series as an initial prerequisite patch or you
> would like to post this now for current merge window ?

I think for now it's best to add it to this series as a prerequisite.

Thanks,
Mark.

> 
> > 
> > Thanks,
> > Mark.
> > 
> > ---->8----
> >>From 0c194d92b0b9ff3b32f666a4610b077fdf1b4b93 Mon Sep 17 00:00:00 2001
> > From: Mark Rutland <mark.rutland@arm.com>
> > Date: Wed, 8 Feb 2023 17:55:08 +0000
> > Subject: [PATCH] arm64/sysreg: allow *Enum blocks in SysregFields blocks
> > 
> > We'd like to support Enum/SignedEnum/UnsignedEnum blocks within
> > SysregFields blocks, so that we can define enumerations for sets of
> > registers. This isn't currently supported by gen-sysreg.awk due to the
> > way we track the active block, which can't handle more than a single
> > layer of nesting, which imposes an awkward requirement that when ending
> > a block we know what the parent block is when calling change_block()
> > 
> > Make this nicer by using a stack of active blocks, with block_push() to
> > start a block, and block_pop() to end a block. Doing so means hat when
> > ending a block we don't need to know the parent block type, and checks
> > of the active block become more consistent. On top of that, it's easy to
> > permit *Enum blocks within both Sysreg and SysregFields blocks.
> > 
> > To aid debugging, the stack of active blocks is reported for fatal
> > errors, and an error is raised if the file is terminated without ending
> > the active block. For clarity I've renamed the top-level element from
> > "None" to "Root".
> > 
> > The Fields element it intended only for use within Systeg blocks, and
> > does not make sense within SysregFields blocks, and so remains forbidden
> > within a SysregFields block.
> > 
> > Signed-off-by: Mark Rutland <mark.rutland@arm.com>
> > Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Mark Brown <broonie@kernel.org>
> > Cc: Will Deacon <will@kernel.org>
> > ---
> >  arch/arm64/tools/gen-sysreg.awk | 93 ++++++++++++++++++++-------------
> >  1 file changed, 57 insertions(+), 36 deletions(-)
> > 
> > diff --git a/arch/arm64/tools/gen-sysreg.awk b/arch/arm64/tools/gen-sysreg.awk
> > index 7f27d66a17e1..066ebf5410fa 100755
> > --- a/arch/arm64/tools/gen-sysreg.awk
> > +++ b/arch/arm64/tools/gen-sysreg.awk
> > @@ -4,23 +4,35 @@
> >  #
> >  # Usage: awk -f gen-sysreg.awk sysregs.txt
> >  
> > +function block_current() {
> > +	return __current_block[__current_block_depth];
> > +}
> > +
> >  # Log an error and terminate
> >  function fatal(msg) {
> >  	print "Error at " NR ": " msg > "/dev/stderr"
> > +
> > +	printf "Current block nesting:"
> > +
> > +	for (i = 0; i <= __current_block_depth; i++) {
> > +		printf " " __current_block[i]
> > +	}
> > +	printf "\n"
> > +
> >  	exit 1
> >  }
> >  
> > -# Sanity check that the start or end of a block makes sense at this point in
> > -# the file. If not, produce an error and terminate.
> > -#
> > -# @this - the $Block or $EndBlock
> > -# @prev - the only valid block to already be in (value of @block)
> > -# @new - the new value of @block
> > -function change_block(this, prev, new) {
> > -	if (block != prev)
> > -		fatal("unexpected " this " (inside " block ")")
> > -
> > -	block = new
> > +# Enter a new block, setting the active block to @block
> > +function block_push(block) {
> > +	__current_block[++__current_block_depth] = block
> > +}
> > +
> > +# Exit a block, setting the active block to the parent block
> > +function block_pop() {
> > +	if (__current_block_depth == 0)
> > +		fatal("error: block_pop() in root block")
> > +
> > +	__current_block_depth--;
> >  }
> >  
> >  # Sanity check the number of records for a field makes sense. If not, produce
> > @@ -84,10 +96,14 @@ BEGIN {
> >  	print "/* Generated file - do not edit */"
> >  	print ""
> >  
> > -	block = "None"
> > +	__current_block_depth = 0
> > +	__current_block[__current_block_depth] = "Root"
> >  }
> >  
> >  END {
> > +	if (__current_block_depth != 0)
> > +		fatal("Missing terminator for " block_current() " block")
> > +
> >  	print "#endif /* __ASM_SYSREG_DEFS_H */"
> >  }
> >  
> > @@ -95,8 +111,9 @@ END {
> >  /^$/ { next }
> >  /^[\t ]*#/ { next }
> >  
> > -/^SysregFields/ {
> > -	change_block("SysregFields", "None", "SysregFields")
> > +/^SysregFields/ && block_current() == "Root" {
> > +	block_push("SysregFields")
> > +
> >  	expect_fields(2)
> >  
> >  	reg = $2
> > @@ -109,12 +126,10 @@ END {
> >  	next
> >  }
> >  
> > -/^EndSysregFields/ {
> > +/^EndSysregFields/ && block_current() == "SysregFields" {
> >  	if (next_bit > 0)
> >  		fatal("Unspecified bits in " reg)
> >  
> > -	change_block("EndSysregFields", "SysregFields", "None")
> > -
> >  	define(reg "_RES0", "(" res0 ")")
> >  	define(reg "_RES1", "(" res1 ")")
> >  	print ""
> > @@ -123,11 +138,13 @@ END {
> >  	res0 = null
> >  	res1 = null
> >  
> > +	block_pop()
> >  	next
> >  }
> >  
> > -/^Sysreg/ {
> > -	change_block("Sysreg", "None", "Sysreg")
> > +/^Sysreg/ && block_current() == "Root" {
> > +	block_push("Sysreg")
> > +
> >  	expect_fields(7)
> >  
> >  	reg = $2
> > @@ -156,12 +173,10 @@ END {
> >  	next
> >  }
> >  
> > -/^EndSysreg/ {
> > +/^EndSysreg/ && block_current() == "Sysreg" {
> >  	if (next_bit > 0)
> >  		fatal("Unspecified bits in " reg)
> >  
> > -	change_block("EndSysreg", "Sysreg", "None")
> > -
> >  	if (res0 != null)
> >  		define(reg "_RES0", "(" res0 ")")
> >  	if (res1 != null)
> > @@ -178,12 +193,13 @@ END {
> >  	res0 = null
> >  	res1 = null
> >  
> > +	block_pop()
> >  	next
> >  }
> >  
> >  # Currently this is effectivey a comment, in future we may want to emit
> >  # defines for the fields.
> > -/^Fields/ && (block == "Sysreg") {
> > +/^Fields/ && block_current() == "Sysreg" {
> >  	expect_fields(2)
> >  
> >  	if (next_bit != 63)
> > @@ -200,7 +216,7 @@ END {
> >  }
> >  
> >  
> > -/^Res0/ && (block == "Sysreg" || block == "SysregFields") {
> > +/^Res0/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> >  	expect_fields(2)
> >  	parse_bitdef(reg, "RES0", $2)
> >  	field = "RES0_" msb "_" lsb
> > @@ -210,7 +226,7 @@ END {
> >  	next
> >  }
> >  
> > -/^Res1/ && (block == "Sysreg" || block == "SysregFields") {
> > +/^Res1/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> >  	expect_fields(2)
> >  	parse_bitdef(reg, "RES1", $2)
> >  	field = "RES1_" msb "_" lsb
> > @@ -220,7 +236,7 @@ END {
> >  	next
> >  }
> >  
> > -/^Field/ && (block == "Sysreg" || block == "SysregFields") {
> > +/^Field/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> >  	expect_fields(3)
> >  	field = $3
> >  	parse_bitdef(reg, field, $2)
> > @@ -231,15 +247,16 @@ END {
> >  	next
> >  }
> >  
> > -/^Raz/ && (block == "Sysreg" || block == "SysregFields") {
> > +/^Raz/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> >  	expect_fields(2)
> >  	parse_bitdef(reg, field, $2)
> >  
> >  	next
> >  }
> >  
> > -/^SignedEnum/ {
> > -	change_block("Enum<", "Sysreg", "Enum")
> > +/^SignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> > +	block_push("Enum")
> > +
> >  	expect_fields(3)
> >  	field = $3
> >  	parse_bitdef(reg, field, $2)
> > @@ -250,8 +267,9 @@ END {
> >  	next
> >  }
> >  
> > -/^UnsignedEnum/ {
> > -	change_block("Enum<", "Sysreg", "Enum")
> > +/^UnsignedEnum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> > +	block_push("Enum")
> > +
> >  	expect_fields(3)
> >  	field = $3
> >  	parse_bitdef(reg, field, $2)
> > @@ -262,8 +280,9 @@ END {
> >  	next
> >  }
> >  
> > -/^Enum/ {
> > -	change_block("Enum", "Sysreg", "Enum")
> > +/^Enum/ && (block_current() == "Sysreg" || block_current() == "SysregFields") {
> > +	block_push("Enum")
> > +
> >  	expect_fields(3)
> >  	field = $3
> >  	parse_bitdef(reg, field, $2)
> > @@ -273,16 +292,18 @@ END {
> >  	next
> >  }
> >  
> > -/^EndEnum/ {
> > -	change_block("EndEnum", "Enum", "Sysreg")
> > +/^EndEnum/ && block_current() == "Enum" {
> > +
> >  	field = null
> >  	msb = null
> >  	lsb = null
> >  	print ""
> > +
> > +	block_pop()
> >  	next
> >  }
> >  
> > -/0b[01]+/ && block == "Enum" {
> > +/0b[01]+/ && block_current() == "Enum" {
> >  	expect_fields(2)
> >  	val = $1
> >  	name = $2

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
  2023-02-08 19:36         ` Mark Rutland
@ 2023-02-13  8:23           ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-02-13  8:23 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon



On 2/9/23 01:06, Mark Rutland wrote:
> On Fri, Jan 13, 2023 at 10:41:51AM +0530, Anshuman Khandual wrote:
>>
>>
>> On 1/12/23 19:59, Mark Rutland wrote:
>>> On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
>>>> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>>>>  		if (!armpmu_event_set_period(event))
>>>>  			continue;
>>>>  
>>>> +		if (has_branch_stack(event)) {
>>>> +			WARN_ON(!cpuc->branches);
>>>> +			armv8pmu_branch_read(cpuc, event);
>>>> +			data.br_stack = &cpuc->branches->branch_stack;
>>>> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
>>>> +		}
>>>
>>> How do we ensure the data we're getting isn't changed under our feet? Is BRBE
>>> disabled at this point?
>>
>> Right, BRBE is paused after a PMU IRQ. We also ensure the buffer is disabled for
>> all exception levels, i.e removing BRBCR_EL1_E0BRE/E1BRE from the configuration,
>> before initiating the actual read, which eventually populates the data.br_stack.
> 
> Ok; just to confirm, what exactly is the condition that enforces that BRBE is
> disabled? Is that *while* there's an overflow asserted, or does something else
> get set at the instant the overflow occurs?

- BRBE can be disabled completely via BRBCR_EL1_E0BRE/E1BRE irrespective of PMU interrupt
- But with PMU interrupt, it just pauses if BRBCR_EL1_FZP is enabled

> 
> What exactly is necessary for it to start again?

- Unpause via clearing BRBFCR_EL1_PAUSED
- Enable for applicable privilege levels via setting BRBCR_EL1_E0BRE/E1BRE

> 
>>> Is this going to have branches after taking the exception, or does BRBE stop
>>> automatically at that point? If so we presumably need to take special care as
>>> to when we read this relative to enabling/disabling and/or manipulating the
>>> overflow bits.
>>
>> The default BRBE configuration includes setting BRBCR_EL1.FZP, enabling BRBE to
>> be paused automatically, right after a PMU IRQ. Regardless, before reading the
>> buffer, BRBE is paused (BRBFCR_EL1.PAUSED) and disabled for all privilege levels
>> ~(BRBCR_EL1.E0BRE/E1BRE) which ensures that no new branch record is getting into
>> the buffer, while it is being read for perf right buffer.
> 
> Ok; I think we could do with some comments as to this.

Sure, will add a comment, something like this.

diff --git a/arch/arm64/kernel/brbe.c b/arch/arm64/kernel/brbe.c
index 17562d3fb33d..0d6e54e92ee2 100644
--- a/arch/arm64/kernel/brbe.c
+++ b/arch/arm64/kernel/brbe.c
@@ -480,6 +480,13 @@ static bool capture_branch_entry(struct pmu_hw_events *cpuc,
        return true;
 }
 
+/*
+ * This gets called in PMU IRQ handler context. BRBE is configured (BRBCR_EL1.FZP)
+ * to be paused, right after a PMU IRQ. Regardless, before reading branch records,
+ * BRBE explicitly paused (BRBFCR_EL1.PAUSED) and also disabled for all applicable
+ * privilege levels (BRBCR_EL1.E0/E1BRBE) ensuring that no branch record could get
+ * in the BRBE buffer while it is being read.
+ */
 void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
 {
        struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;


> 
>>
>>>
>>>> +
>>>>  		/*
>>>>  		 * Perf event overflow will queue the processing of the event as
>>>>  		 * an irq_work which will be taken care of in the handling of
>>>> @@ -976,6 +995,14 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
>>>>  	return event->hw.idx;
>>>>  }
>>>>  
>>>> +static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
>>>> +{
>>>> +	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
>>>> +
>>>> +	if (sched_in && arm_pmu_branch_stack_supported(armpmu))
>>>> +		armv8pmu_branch_reset();
>>>> +}
>>>
>>> When scheduling out, shouldn't we save what we have so far?
>>>
>>> It seems odd that we just throw that away rather than placing it into a FIFO.
>>
>> IIRC we had discussed this earlier, save and restore mechanism will be added
>> later, not during this enablement patch series. 
> 
> Sorry, but why?
> 
> I don't understand why it's acceptable to non-deterministically throw away data
> for now. At the least that's going to confuse users, especially as the
> observable behaviour may change if and when that's added later.

Each set of branch records captured in BRBE is part of broader statistical
sample which goes into the perf ring buffer. So in absolute terms throwing
away some branch records should not be a problem for the end result. Would
the relative instances of possible task switches be more than all those PMU
interrupts that can be generated between those switches ? I am not sure if
that will create much loss of samples for the overall perf session.

For implementation, we could follow x86 intel_pmu_lbr_sched_task(), which
saves and restores branch records via perf_event_pmu_context->task_ctx_data
with some more changes to CPU specific structure. But restoration involves
writing back the saved branch records into the HW (via BRB INJ instructions
in BRBE case) recreating the buffer state before task switch out happened.
This save/restore mechanism will increase switch out and switch in latency
for tasks being traced for branch stack samples.

Just wondering are those potential lost branch samples worth the increase
in task switch latency ? Should this save/restore be auto enabled for all
tasks ? Should this be done part of the base enablement series itself ?

> 
> I assume that there's some reason that it's painful to do that? Could you
> please elaborate on that?
> 
>> For now resetting the buffer ensures that branch records from one session
>> does not get into another. 
> 
> I agree that it's necessary to do that, but as above I don't believe it's
> sufficient.
> 
>> Note that these branches cannot be pushed into perf ring buffer either, as
>> there was no corresponding PMU interrupt to be associated with.
> 
> I'm not suggesting we put it in the perf ring buffer; I'm suggesting that we
> snapshot it into *some* kernel-internal storage, then later reconcile that.

Only feasible reconciliation would be restore them into the BRBE HW buffer
ensuring their continuity after the task starts back again on the CPU, and
continue capturing records for perf ring buffer during future PMU interrupts.

Saved branch records cannot be pushed into the perf ring buffer, along side
the newer ones, because continuity of branch branch records leading upto the
PMU interrupt will be broken. It might also happen that all restored branch
records might just get overridden by newer branch records, before the next
PMU interrupt, wasting the entire restoration process.

> 
> Maybe that's far more painful than I expect?

I could try and implement save/restore mechanism for BRBE as explained above.

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
@ 2023-02-13  8:23           ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-02-13  8:23 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon



On 2/9/23 01:06, Mark Rutland wrote:
> On Fri, Jan 13, 2023 at 10:41:51AM +0530, Anshuman Khandual wrote:
>>
>>
>> On 1/12/23 19:59, Mark Rutland wrote:
>>> On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
>>>> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>>>>  		if (!armpmu_event_set_period(event))
>>>>  			continue;
>>>>  
>>>> +		if (has_branch_stack(event)) {
>>>> +			WARN_ON(!cpuc->branches);
>>>> +			armv8pmu_branch_read(cpuc, event);
>>>> +			data.br_stack = &cpuc->branches->branch_stack;
>>>> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
>>>> +		}
>>>
>>> How do we ensure the data we're getting isn't changed under our feet? Is BRBE
>>> disabled at this point?
>>
>> Right, BRBE is paused after a PMU IRQ. We also ensure the buffer is disabled for
>> all exception levels, i.e removing BRBCR_EL1_E0BRE/E1BRE from the configuration,
>> before initiating the actual read, which eventually populates the data.br_stack.
> 
> Ok; just to confirm, what exactly is the condition that enforces that BRBE is
> disabled? Is that *while* there's an overflow asserted, or does something else
> get set at the instant the overflow occurs?

- BRBE can be disabled completely via BRBCR_EL1_E0BRE/E1BRE irrespective of PMU interrupt
- But with PMU interrupt, it just pauses if BRBCR_EL1_FZP is enabled

> 
> What exactly is necessary for it to start again?

- Unpause via clearing BRBFCR_EL1_PAUSED
- Enable for applicable privilege levels via setting BRBCR_EL1_E0BRE/E1BRE

> 
>>> Is this going to have branches after taking the exception, or does BRBE stop
>>> automatically at that point? If so we presumably need to take special care as
>>> to when we read this relative to enabling/disabling and/or manipulating the
>>> overflow bits.
>>
>> The default BRBE configuration includes setting BRBCR_EL1.FZP, enabling BRBE to
>> be paused automatically, right after a PMU IRQ. Regardless, before reading the
>> buffer, BRBE is paused (BRBFCR_EL1.PAUSED) and disabled for all privilege levels
>> ~(BRBCR_EL1.E0BRE/E1BRE) which ensures that no new branch record is getting into
>> the buffer, while it is being read for perf right buffer.
> 
> Ok; I think we could do with some comments as to this.

Sure, will add a comment, something like this.

diff --git a/arch/arm64/kernel/brbe.c b/arch/arm64/kernel/brbe.c
index 17562d3fb33d..0d6e54e92ee2 100644
--- a/arch/arm64/kernel/brbe.c
+++ b/arch/arm64/kernel/brbe.c
@@ -480,6 +480,13 @@ static bool capture_branch_entry(struct pmu_hw_events *cpuc,
        return true;
 }
 
+/*
+ * This gets called in PMU IRQ handler context. BRBE is configured (BRBCR_EL1.FZP)
+ * to be paused, right after a PMU IRQ. Regardless, before reading branch records,
+ * BRBE explicitly paused (BRBFCR_EL1.PAUSED) and also disabled for all applicable
+ * privilege levels (BRBCR_EL1.E0/E1BRBE) ensuring that no branch record could get
+ * in the BRBE buffer while it is being read.
+ */
 void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
 {
        struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;


> 
>>
>>>
>>>> +
>>>>  		/*
>>>>  		 * Perf event overflow will queue the processing of the event as
>>>>  		 * an irq_work which will be taken care of in the handling of
>>>> @@ -976,6 +995,14 @@ static int armv8pmu_user_event_idx(struct perf_event *event)
>>>>  	return event->hw.idx;
>>>>  }
>>>>  
>>>> +static void armv8pmu_sched_task(struct perf_event_pmu_context *pmu_ctx, bool sched_in)
>>>> +{
>>>> +	struct arm_pmu *armpmu = to_arm_pmu(pmu_ctx->pmu);
>>>> +
>>>> +	if (sched_in && arm_pmu_branch_stack_supported(armpmu))
>>>> +		armv8pmu_branch_reset();
>>>> +}
>>>
>>> When scheduling out, shouldn't we save what we have so far?
>>>
>>> It seems odd that we just throw that away rather than placing it into a FIFO.
>>
>> IIRC we had discussed this earlier, save and restore mechanism will be added
>> later, not during this enablement patch series. 
> 
> Sorry, but why?
> 
> I don't understand why it's acceptable to non-deterministically throw away data
> for now. At the least that's going to confuse users, especially as the
> observable behaviour may change if and when that's added later.

Each set of branch records captured in BRBE is part of broader statistical
sample which goes into the perf ring buffer. So in absolute terms throwing
away some branch records should not be a problem for the end result. Would
the relative instances of possible task switches be more than all those PMU
interrupts that can be generated between those switches ? I am not sure if
that will create much loss of samples for the overall perf session.

For implementation, we could follow x86 intel_pmu_lbr_sched_task(), which
saves and restores branch records via perf_event_pmu_context->task_ctx_data
with some more changes to CPU specific structure. But restoration involves
writing back the saved branch records into the HW (via BRB INJ instructions
in BRBE case) recreating the buffer state before task switch out happened.
This save/restore mechanism will increase switch out and switch in latency
for tasks being traced for branch stack samples.

Just wondering are those potential lost branch samples worth the increase
in task switch latency ? Should this save/restore be auto enabled for all
tasks ? Should this be done part of the base enablement series itself ?

> 
> I assume that there's some reason that it's painful to do that? Could you
> please elaborate on that?
> 
>> For now resetting the buffer ensures that branch records from one session
>> does not get into another. 
> 
> I agree that it's necessary to do that, but as above I don't believe it's
> sufficient.
> 
>> Note that these branches cannot be pushed into perf ring buffer either, as
>> there was no corresponding PMU interrupt to be associated with.
> 
> I'm not suggesting we put it in the perf ring buffer; I'm suggesting that we
> snapshot it into *some* kernel-internal storage, then later reconcile that.

Only feasible reconciliation would be restore them into the BRBE HW buffer
ensuring their continuity after the task starts back again on the CPU, and
continue capturing records for perf ring buffer during future PMU interrupts.

Saved branch records cannot be pushed into the perf ring buffer, along side
the newer ones, because continuity of branch branch records leading upto the
PMU interrupt will be broken. It might also happen that all restored branch
records might just get overridden by newer branch records, before the next
PMU interrupt, wasting the entire restoration process.

> 
> Maybe that's far more painful than I expect?

I could try and implement save/restore mechanism for BRBE as explained above.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
  2023-02-08 20:03         ` Mark Rutland
@ 2023-02-20  8:38           ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-02-20  8:38 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On 2/9/23 01:33, Mark Rutland wrote:
> On Thu, Jan 19, 2023 at 08:18:47AM +0530, Anshuman Khandual wrote:
>> On 1/12/23 22:21, Mark Rutland wrote:
>>> On Thu, Jan 05, 2023 at 08:40:39AM +0530, Anshuman Khandual wrote:
>>>> +bool armv8pmu_branch_valid(struct perf_event *event)
>>>> +{
>>>> +	u64 branch_type = event->attr.branch_sample_type;
>>>> +
>>>> +	/*
>>>> +	 * If the event does not have at least one of the privilege
>>>> +	 * branch filters as in PERF_SAMPLE_BRANCH_PLM_ALL, the core
>>>> +	 * perf will adjust its value based on perf event's existing
>>>> +	 * privilege level via attr.exclude_[user|kernel|hv].
>>>> +	 *
>>>> +	 * As event->attr.branch_sample_type might have been changed
>>>> +	 * when the event reaches here, it is not possible to figure
>>>> +	 * out whether the event originally had HV privilege request
>>>> +	 * or got added via the core perf. Just report this situation
>>>> +	 * once and continue ignoring if there are other instances.
>>>> +	 */
>>>> +	if ((branch_type & PERF_SAMPLE_BRANCH_HV) && !is_kernel_in_hyp_mode())
>>>> +		pr_warn_once("%s - hypervisor privilege\n", branch_filter_error_msg);
>>>> +
>>>> +	if (branch_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
>>>> +		pr_warn_once("%s - aborted transaction\n", branch_filter_error_msg);
>>>> +		return false;
>>>> +	}
>>>> +
>>>> +	if (branch_type & PERF_SAMPLE_BRANCH_NO_TX) {
>>>> +		pr_warn_once("%s - no transaction\n", branch_filter_error_msg);
>>>> +		return false;
>>>> +	}
>>>> +
>>>> +	if (branch_type & PERF_SAMPLE_BRANCH_IN_TX) {
>>>> +		pr_warn_once("%s - in transaction\n", branch_filter_error_msg);
>>>> +		return false;
>>>> +	}
>>>> +	return true;
>>>> +}
>>>
>>> Is this called when validating user input? If so, NAK to printing anything to a
>>> higher leval than debug. If there are constraints the user needs to be aware of
>>
>> You mean pr_debug() based prints ?
> 
> Yes.
> 
>>> It would be better to whitelist what we do support rather than blacklisting
>>> what we don't.
>>
>> But with a negative list, user would know what is not supported via these pr_debug()
>> based output when enabled ? But I dont have a strong opinion either way.
> 
> With a negative list, when new options are added the driver will erroneously
> and silently accept them, which is worse.

Okay, will rather convert this into a positive list instead.

> 
>>
>>>
>>>> +
>>>> +static void branch_records_alloc(struct arm_pmu *armpmu)
>>>> +{
>>>> +	struct pmu_hw_events *events;
>>>> +	int cpu;
>>>> +
>>>> +	for_each_possible_cpu(cpu) {
>>>> +		events = per_cpu_ptr(armpmu->hw_events, cpu);
>>>> +
>>>> +		events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
>>>> +		WARN_ON(!events->branches);
>>>> +	}
>>>> +}
>>>
>>> It would be simpler for this to be a percpu allocation.
>>
>> Could you please be more specific ? alloc_percpu_gfp() cannot be used here
>> because 'events->branches' is not a __percpu variable unlike its parent
>> 'events' which is derived from armpmu.
> 
> You can allocate it per-cpu, then grab each of the cpu's pointers using
> per_cpu() and place those into events->branches.
> 
> That way you only make one allocation which can fail, which makes the error
> path much simpler.

I guess you are suggesting something like this.

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index f0689c84530b..3f0a9d1df5e8 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1190,14 +1190,19 @@ static void __armv8pmu_probe_pmu(void *info)
 
 static int branch_records_alloc(struct arm_pmu *armpmu)
 {
+       struct branch_records __percpu *tmp;
        struct pmu_hw_events *events;
+       struct branch_records *records;
        int cpu;
 
+       tmp = alloc_percpu_gfp(struct branch_records, GFP_KERNEL);
+       if (!tmp)
+               return -ENOMEM;
+
        for_each_possible_cpu(cpu) {
                events = per_cpu_ptr(armpmu->hw_events, cpu);
-               events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
-               if (!events->branches)
-                       return -ENOMEM;
+               records = per_cpu_ptr(tmp, cpu);
+               events->branches = records;
        }
        return 0;
 }

This should be all okay, as these branch records never really get freed up. But otherwise
this local 'tmp' will have to saved some where for later, to be used with free_percpu().
Else access to this percpu allocation handle is lost.

> 
> [...]
> 
>>>> +static int brbe_attributes_probe(struct arm_pmu *armpmu, u32 brbe)
>>>> +{
>>>> +	struct brbe_hw_attr *brbe_attr = kzalloc(sizeof(struct brbe_hw_attr), GFP_KERNEL);
>>>
>>> Same comments as for the failure path in branch_records_alloc().
>>>
>>>> +	u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
>>>
>>> Which context is this run in? Unless this is affine to a relevant CPU we can't
>>> read the sysreg safely, and if we're in a cross-call we cannot allocate memory,
>>> so this doesn't look right to me.
>>
>> Called from smp_call_function_any() context via __armv8pmu_probe_pmu().
> 
> Ok; so the read is safe, but the allocation is not.

This problem has already been solved in latest V8 series.

> 
> [...]
> 
>>>> +	WARN_ON(!brbe_attr);
>>>> +	armpmu->private = brbe_attr;
>>>> +
>>>> +	brbe_attr->brbe_version = brbe;
>>>> +	brbe_attr->brbe_format = brbe_fetch_format(brbidr);
>>>> +	brbe_attr->brbe_cc = brbe_fetch_cc_bits(brbidr);
>>>> +	brbe_attr->brbe_nr = brbe_fetch_numrec(brbidr);
>>>
>>> As a minor thing, could we please s/fetch/get/ ? To me, 'fetch' sounds like a
>>> memory operation, and elsewhere we use 'get' for this sort of getter function.
>>
>> Sure, but shall we change fetch as get for entire BRBE implementation (where ever
>> there is a determination of field from a register value) or just the above function ?
>> Default, will change all places.
> 
> I had meant in all cases, so that's perfect, thanks.
> 
> 
> [...]
> 
>>>> +			WARN_ON_ONCE(!(brbcr & BRBCR_EL1_MPRED));
>>>
>>> Huh? Why does the value of BRBCR matter here?
>>
>> This is just a code hardening measure here. Before recording branch record
>> cycles or its flags, ensure BRBCR_EL1 was configured correctly to produce
>> these additional information along with the branch records.
> 
> I don't think that's necessary. Where is brbcr written such that this could be
> misconfigured?

BRBCR cannot be mis-configured but this is just for code hardening purpose both
for BRBCR_EL1_MPRED and BRBCR_EL1_CC.

> 
> At the least, this needs a comment as to why we need to check, and what we're
> checking for.

Sure, will probably add a comment explaining these checks.

> 
> [...]
> 
>>>> +/*
>>>> + * A branch record with BRBINF_EL1.LASTFAILED set, implies that all
>>>> + * preceding consecutive branch records, that were in a transaction
>>>> + * (i.e their BRBINF_EL1.TX set) have been aborted.
>>>> + *
>>>> + * Similarly BRBFCR_EL1.LASTFAILED set, indicate that all preceding
>>>> + * consecutive branch records upto the last record, which were in a
>>>> + * transaction (i.e their BRBINF_EL1.TX set) have been aborted.
>>>> + *
>>>> + * --------------------------------- -------------------
>>>> + * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
>>>> + * --------------------------------- -------------------
>>>> + * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
>>>> + * --------------------------------- -------------------
>>>> + * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
>>>> + * --------------------------------- -------------------
>>>> + * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>>>> + * --------------------------------- -------------------
>>>> + * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>>>> + * --------------------------------- -------------------
>>>> + * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
>>>> + * --------------------------------- -------------------
>>>> + * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
>>>> + * --------------------------------- -------------------
>>>> + * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>>>> + * --------------------------------- -------------------
>>>> + * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>>>> + * --------------------------------- -------------------
>>>> + * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>>>> + * --------------------------------- -------------------
>>>
>>> Are we guaranteed to have a record between two transactions with TX = 0?
>>
>> With TX = 0 i.e no transaction was active, indicates normal sequence of branches
>> creating their own branch records. How can there be a transaction with TX = 0 ?
>> Could you please be more specific here ?
> 
> Consider a sequence of a successful transaction followed by a cancelled
> transation, with not branches between the first transation being commited and
> the second transaction starting:
> 
> 	TSTART	// TX=1
> 	...     // TX=1
> 	TCOMMIT // TX=1
> 	TSTART  // TX=1
> 	...     // TX=1
> 	<failure>
> 		// TX=0, LF=1
> 
> AFAICT, we are not guaranteed to have a record with TX=0 between that
> successful TCOMMIT and the subsequent TSTART, and so the LASTFAILED field
> doesn't indicate that *all* preceding records with TX set are part of the
> failed transaction.
> 
> Am I missing something? e.g. does the TCOMMIT get records with TX=0?

I would assume so. Otherwise there will be no marker between the branch records
generated from two different set of transactions, either successful or failed.
Although I will get this clarified from the HW folks.

> 
>>> AFAICT you could have a sequence where a TCOMMIT is immediately followed by a
>>> TSTART, and IIUC in that case you could have back-to-back records for distinct
>>> transactions all with TX = 1, where the first transaction could be commited,
>>> and the second might fail/cancel.
>>>
>>> ... or do TCOMMIT/TCANCEL/TSTART get handled specially?
>>
>> I guess these are micro-architectural implementation details which unfortunately
>> BRBINF_EL1/BRBCR_EL1 specifications do not capture in detail. But all it says is
>> that upon encountering BRBINF_EL1.LASTFAILED or BRBFCR_EL1.LASTFAILED (just for
>> the last record) all previous in-transaction branch records (BRBINF_EL1.TX = 1)
>> should be considered aborted for branch record reporting purpose.
> 
> Ok, so we're throwing away data?

No, we just mark branch record abort field (entry->abort) appropriately.

> 
> If we're going to do that, it would be good to at least have a comment
> explaining why we're forced to do so. Ideally we'd get the architecture
> clarified/fixed, since AFAIK no-one has actually built TME yet, and it might be
> a simple fix (as above).

I will get this clarified from the HW folks.

> 
> [...]
> 
>>>> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
>>>> +{
>>>> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
>>>> +	u64 brbinf, brbfcr, brbcr;
>>>> +	int idx;
>>>> +
>>>> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
>>>> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>>>> +
>>>> +	/* Ensure pause on PMU interrupt is enabled */
>>>> +	WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));
>>>
>>> As above, I think this needs commentary in the interrupt handler, since this
>>> presumably needs us to keep the IRQ asserted until we're done
>>> reading/manipulating records in the IRQ handler.
>>
>> The base IRQ handler armv8pmu_handle_irq() is in ARMV8 PMU code inside perf_event.c
>> which could/should not access BRBE specific details without adding an another new
>> abstraction function. But I guess adding a comment should be fine.
> 
> I think it's fine to have a comment there saying that we *must not* do
> something that woukd break BRBE.

Will add a comment here, something like this.

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 3f0a9d1df5e8..727c4806f18c 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -861,6 +861,10 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
                if (!armpmu_event_set_period(event))
                        continue;
 
+               /*
+                * PMU IRQ should remain asserted until all branch records
+                * are captured and processed into struct perf_sample_data.
+                */
                if (has_branch_stack(event)) {
                        WARN_ON(!cpuc->branches);
                        armv8pmu_branch_read(cpuc, event);


> 
>>>> +
>>>> +	/* Save and clear the privilege */
>>>> +	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
>>>
>>> Why? Later on we restore this, and AFAICT we don't modify it.
>>>
>>> If it's paused, why do we care about the privilege?
>>
>> This disables BRBE completely (not only pause) providing confidence that no
>> branch record can come in while the existing records are being processed.
> 
> I thought from earlier that it was automatically paused by HW upon raising the
> IRQ. Have I misunderstood, and we *must* stop it, or is this a belt-and-braces
> additional disable?
> 
> Is that not the case, or do we not trust the pause for some reason?

Yes, this is a belt-and-braces additional disable i.e putting the BRBE in prohibited
region, which is more effective than a pause.

> 
> Regardless, the comment should expalin *why* we're doing this (i.e. that this
> is about ensuring BRBE does not create new records while we're manipulating
> it).

Will update the comment, something like this.

diff --git a/arch/arm64/kernel/brbe.c b/arch/arm64/kernel/brbe.c
index a2de0b5a941c..35e11d0f41fa 100644
--- a/arch/arm64/kernel/brbe.c
+++ b/arch/arm64/kernel/brbe.c
@@ -504,7 +504,13 @@ void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
        /* Ensure pause on PMU interrupt is enabled */
        WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));
 
-       /* Save and clear the privilege */
+       /*
+        * Save and clear the privilege
+        *
+        * Clearing both privilege levels put the BRBE in prohibited state
+        * thus preventing new branch records being created while existing
+        * ones get captured and processed.
+        */
        write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
 
        /* Pause the buffer */


> 
>>>> +	/* Unpause the buffer */
>>>> +	write_sysreg_s(brbfcr & ~BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
>>>> +	isb();
>>>> +	armv8pmu_branch_reset();
>>>> +}
>>>
>>> Why do we enable it before we reset it?
>>
>> This is the last opportunity for a clean slate start for BRBE buffer before it is
>> back recording the branches. Basically helps in ensuring a clean start.
> 
> My point is why do we start if *before* resetting it, rather than restting it
> first? Why give it the opportunity to create records that we're going to
> discard immediately thereafter?
> 
>>> Surely it would make sense to reset it first, and ammortize the cost of the ISB?
>>>
>>> That said, as above, do we actually need to pause/unpause it? Or is it already
>>> paused by virtue of the IRQ?
>>
>> Yes, it should be paused after an IRQ but it is also enforced before reading along
>> with privilege level disable.
> 
> I'm very confused as to why we're not trusting the HW to remain paused. Why do
> we need to enforce what the hardware should already be doing?

As have learned from the HW folks, there might be situations where the BRBE buffer has
been actually paused, ready for extraction in principle, but without BRBFCR_EL1_PAUSED
being set. Setting the bit here explicitly creates consistency across scenarios before
capturing the branch records. But please do note, that putting the BRBE in prohibited
region via clearing BRBCR_EL1_E0BRE/E1BRE is the primary construct which ensures that
no new branch records will make into the buffer while it's being processed.

> 
>> Regardless the buffer needs to be un-paused and also
>> enabled for required privilege levels before exiting from here.
> 
> I agree this needs to be balanced, it just seems to me that we're doing
> redundant work here.

Extracting branch records for an user space only profile session might be faster as it
would require lesser context synchronization, might not even require prohibited region
change mechanism (will be already in there, upon a PMU interrupt) etc. I could try and
update IRQ branch records handling, based on whether current perf session was profiling
only the user space or not.

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
@ 2023-02-20  8:38           ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-02-20  8:38 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On 2/9/23 01:33, Mark Rutland wrote:
> On Thu, Jan 19, 2023 at 08:18:47AM +0530, Anshuman Khandual wrote:
>> On 1/12/23 22:21, Mark Rutland wrote:
>>> On Thu, Jan 05, 2023 at 08:40:39AM +0530, Anshuman Khandual wrote:
>>>> +bool armv8pmu_branch_valid(struct perf_event *event)
>>>> +{
>>>> +	u64 branch_type = event->attr.branch_sample_type;
>>>> +
>>>> +	/*
>>>> +	 * If the event does not have at least one of the privilege
>>>> +	 * branch filters as in PERF_SAMPLE_BRANCH_PLM_ALL, the core
>>>> +	 * perf will adjust its value based on perf event's existing
>>>> +	 * privilege level via attr.exclude_[user|kernel|hv].
>>>> +	 *
>>>> +	 * As event->attr.branch_sample_type might have been changed
>>>> +	 * when the event reaches here, it is not possible to figure
>>>> +	 * out whether the event originally had HV privilege request
>>>> +	 * or got added via the core perf. Just report this situation
>>>> +	 * once and continue ignoring if there are other instances.
>>>> +	 */
>>>> +	if ((branch_type & PERF_SAMPLE_BRANCH_HV) && !is_kernel_in_hyp_mode())
>>>> +		pr_warn_once("%s - hypervisor privilege\n", branch_filter_error_msg);
>>>> +
>>>> +	if (branch_type & PERF_SAMPLE_BRANCH_ABORT_TX) {
>>>> +		pr_warn_once("%s - aborted transaction\n", branch_filter_error_msg);
>>>> +		return false;
>>>> +	}
>>>> +
>>>> +	if (branch_type & PERF_SAMPLE_BRANCH_NO_TX) {
>>>> +		pr_warn_once("%s - no transaction\n", branch_filter_error_msg);
>>>> +		return false;
>>>> +	}
>>>> +
>>>> +	if (branch_type & PERF_SAMPLE_BRANCH_IN_TX) {
>>>> +		pr_warn_once("%s - in transaction\n", branch_filter_error_msg);
>>>> +		return false;
>>>> +	}
>>>> +	return true;
>>>> +}
>>>
>>> Is this called when validating user input? If so, NAK to printing anything to a
>>> higher leval than debug. If there are constraints the user needs to be aware of
>>
>> You mean pr_debug() based prints ?
> 
> Yes.
> 
>>> It would be better to whitelist what we do support rather than blacklisting
>>> what we don't.
>>
>> But with a negative list, user would know what is not supported via these pr_debug()
>> based output when enabled ? But I dont have a strong opinion either way.
> 
> With a negative list, when new options are added the driver will erroneously
> and silently accept them, which is worse.

Okay, will rather convert this into a positive list instead.

> 
>>
>>>
>>>> +
>>>> +static void branch_records_alloc(struct arm_pmu *armpmu)
>>>> +{
>>>> +	struct pmu_hw_events *events;
>>>> +	int cpu;
>>>> +
>>>> +	for_each_possible_cpu(cpu) {
>>>> +		events = per_cpu_ptr(armpmu->hw_events, cpu);
>>>> +
>>>> +		events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
>>>> +		WARN_ON(!events->branches);
>>>> +	}
>>>> +}
>>>
>>> It would be simpler for this to be a percpu allocation.
>>
>> Could you please be more specific ? alloc_percpu_gfp() cannot be used here
>> because 'events->branches' is not a __percpu variable unlike its parent
>> 'events' which is derived from armpmu.
> 
> You can allocate it per-cpu, then grab each of the cpu's pointers using
> per_cpu() and place those into events->branches.
> 
> That way you only make one allocation which can fail, which makes the error
> path much simpler.

I guess you are suggesting something like this.

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index f0689c84530b..3f0a9d1df5e8 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1190,14 +1190,19 @@ static void __armv8pmu_probe_pmu(void *info)
 
 static int branch_records_alloc(struct arm_pmu *armpmu)
 {
+       struct branch_records __percpu *tmp;
        struct pmu_hw_events *events;
+       struct branch_records *records;
        int cpu;
 
+       tmp = alloc_percpu_gfp(struct branch_records, GFP_KERNEL);
+       if (!tmp)
+               return -ENOMEM;
+
        for_each_possible_cpu(cpu) {
                events = per_cpu_ptr(armpmu->hw_events, cpu);
-               events->branches = kzalloc(sizeof(struct branch_records), GFP_KERNEL);
-               if (!events->branches)
-                       return -ENOMEM;
+               records = per_cpu_ptr(tmp, cpu);
+               events->branches = records;
        }
        return 0;
 }

This should be all okay, as these branch records never really get freed up. But otherwise
this local 'tmp' will have to saved some where for later, to be used with free_percpu().
Else access to this percpu allocation handle is lost.

> 
> [...]
> 
>>>> +static int brbe_attributes_probe(struct arm_pmu *armpmu, u32 brbe)
>>>> +{
>>>> +	struct brbe_hw_attr *brbe_attr = kzalloc(sizeof(struct brbe_hw_attr), GFP_KERNEL);
>>>
>>> Same comments as for the failure path in branch_records_alloc().
>>>
>>>> +	u64 brbidr = read_sysreg_s(SYS_BRBIDR0_EL1);
>>>
>>> Which context is this run in? Unless this is affine to a relevant CPU we can't
>>> read the sysreg safely, and if we're in a cross-call we cannot allocate memory,
>>> so this doesn't look right to me.
>>
>> Called from smp_call_function_any() context via __armv8pmu_probe_pmu().
> 
> Ok; so the read is safe, but the allocation is not.

This problem has already been solved in latest V8 series.

> 
> [...]
> 
>>>> +	WARN_ON(!brbe_attr);
>>>> +	armpmu->private = brbe_attr;
>>>> +
>>>> +	brbe_attr->brbe_version = brbe;
>>>> +	brbe_attr->brbe_format = brbe_fetch_format(brbidr);
>>>> +	brbe_attr->brbe_cc = brbe_fetch_cc_bits(brbidr);
>>>> +	brbe_attr->brbe_nr = brbe_fetch_numrec(brbidr);
>>>
>>> As a minor thing, could we please s/fetch/get/ ? To me, 'fetch' sounds like a
>>> memory operation, and elsewhere we use 'get' for this sort of getter function.
>>
>> Sure, but shall we change fetch as get for entire BRBE implementation (where ever
>> there is a determination of field from a register value) or just the above function ?
>> Default, will change all places.
> 
> I had meant in all cases, so that's perfect, thanks.
> 
> 
> [...]
> 
>>>> +			WARN_ON_ONCE(!(brbcr & BRBCR_EL1_MPRED));
>>>
>>> Huh? Why does the value of BRBCR matter here?
>>
>> This is just a code hardening measure here. Before recording branch record
>> cycles or its flags, ensure BRBCR_EL1 was configured correctly to produce
>> these additional information along with the branch records.
> 
> I don't think that's necessary. Where is brbcr written such that this could be
> misconfigured?

BRBCR cannot be mis-configured but this is just for code hardening purpose both
for BRBCR_EL1_MPRED and BRBCR_EL1_CC.

> 
> At the least, this needs a comment as to why we need to check, and what we're
> checking for.

Sure, will probably add a comment explaining these checks.

> 
> [...]
> 
>>>> +/*
>>>> + * A branch record with BRBINF_EL1.LASTFAILED set, implies that all
>>>> + * preceding consecutive branch records, that were in a transaction
>>>> + * (i.e their BRBINF_EL1.TX set) have been aborted.
>>>> + *
>>>> + * Similarly BRBFCR_EL1.LASTFAILED set, indicate that all preceding
>>>> + * consecutive branch records upto the last record, which were in a
>>>> + * transaction (i.e their BRBINF_EL1.TX set) have been aborted.
>>>> + *
>>>> + * --------------------------------- -------------------
>>>> + * | 00 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
>>>> + * --------------------------------- -------------------
>>>> + * | 01 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX success]
>>>> + * --------------------------------- -------------------
>>>> + * | 02 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
>>>> + * --------------------------------- -------------------
>>>> + * | 03 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>>>> + * --------------------------------- -------------------
>>>> + * | 04 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>>>> + * --------------------------------- -------------------
>>>> + * | 05 | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 1 |
>>>> + * --------------------------------- -------------------
>>>> + * | .. | BRBSRC | BRBTGT | BRBINF | | TX = 0 | LF = 0 |
>>>> + * --------------------------------- -------------------
>>>> + * | 61 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>>>> + * --------------------------------- -------------------
>>>> + * | 62 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>>>> + * --------------------------------- -------------------
>>>> + * | 63 | BRBSRC | BRBTGT | BRBINF | | TX = 1 | LF = 0 | [TX failed]
>>>> + * --------------------------------- -------------------
>>>
>>> Are we guaranteed to have a record between two transactions with TX = 0?
>>
>> With TX = 0 i.e no transaction was active, indicates normal sequence of branches
>> creating their own branch records. How can there be a transaction with TX = 0 ?
>> Could you please be more specific here ?
> 
> Consider a sequence of a successful transaction followed by a cancelled
> transation, with not branches between the first transation being commited and
> the second transaction starting:
> 
> 	TSTART	// TX=1
> 	...     // TX=1
> 	TCOMMIT // TX=1
> 	TSTART  // TX=1
> 	...     // TX=1
> 	<failure>
> 		// TX=0, LF=1
> 
> AFAICT, we are not guaranteed to have a record with TX=0 between that
> successful TCOMMIT and the subsequent TSTART, and so the LASTFAILED field
> doesn't indicate that *all* preceding records with TX set are part of the
> failed transaction.
> 
> Am I missing something? e.g. does the TCOMMIT get records with TX=0?

I would assume so. Otherwise there will be no marker between the branch records
generated from two different set of transactions, either successful or failed.
Although I will get this clarified from the HW folks.

> 
>>> AFAICT you could have a sequence where a TCOMMIT is immediately followed by a
>>> TSTART, and IIUC in that case you could have back-to-back records for distinct
>>> transactions all with TX = 1, where the first transaction could be commited,
>>> and the second might fail/cancel.
>>>
>>> ... or do TCOMMIT/TCANCEL/TSTART get handled specially?
>>
>> I guess these are micro-architectural implementation details which unfortunately
>> BRBINF_EL1/BRBCR_EL1 specifications do not capture in detail. But all it says is
>> that upon encountering BRBINF_EL1.LASTFAILED or BRBFCR_EL1.LASTFAILED (just for
>> the last record) all previous in-transaction branch records (BRBINF_EL1.TX = 1)
>> should be considered aborted for branch record reporting purpose.
> 
> Ok, so we're throwing away data?

No, we just mark branch record abort field (entry->abort) appropriately.

> 
> If we're going to do that, it would be good to at least have a comment
> explaining why we're forced to do so. Ideally we'd get the architecture
> clarified/fixed, since AFAIK no-one has actually built TME yet, and it might be
> a simple fix (as above).

I will get this clarified from the HW folks.

> 
> [...]
> 
>>>> +void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
>>>> +{
>>>> +	struct brbe_hw_attr *brbe_attr = (struct brbe_hw_attr *)cpuc->percpu_pmu->private;
>>>> +	u64 brbinf, brbfcr, brbcr;
>>>> +	int idx;
>>>> +
>>>> +	brbcr = read_sysreg_s(SYS_BRBCR_EL1);
>>>> +	brbfcr = read_sysreg_s(SYS_BRBFCR_EL1);
>>>> +
>>>> +	/* Ensure pause on PMU interrupt is enabled */
>>>> +	WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));
>>>
>>> As above, I think this needs commentary in the interrupt handler, since this
>>> presumably needs us to keep the IRQ asserted until we're done
>>> reading/manipulating records in the IRQ handler.
>>
>> The base IRQ handler armv8pmu_handle_irq() is in ARMV8 PMU code inside perf_event.c
>> which could/should not access BRBE specific details without adding an another new
>> abstraction function. But I guess adding a comment should be fine.
> 
> I think it's fine to have a comment there saying that we *must not* do
> something that woukd break BRBE.

Will add a comment here, something like this.

diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 3f0a9d1df5e8..727c4806f18c 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -861,6 +861,10 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
                if (!armpmu_event_set_period(event))
                        continue;
 
+               /*
+                * PMU IRQ should remain asserted until all branch records
+                * are captured and processed into struct perf_sample_data.
+                */
                if (has_branch_stack(event)) {
                        WARN_ON(!cpuc->branches);
                        armv8pmu_branch_read(cpuc, event);


> 
>>>> +
>>>> +	/* Save and clear the privilege */
>>>> +	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
>>>
>>> Why? Later on we restore this, and AFAICT we don't modify it.
>>>
>>> If it's paused, why do we care about the privilege?
>>
>> This disables BRBE completely (not only pause) providing confidence that no
>> branch record can come in while the existing records are being processed.
> 
> I thought from earlier that it was automatically paused by HW upon raising the
> IRQ. Have I misunderstood, and we *must* stop it, or is this a belt-and-braces
> additional disable?
> 
> Is that not the case, or do we not trust the pause for some reason?

Yes, this is a belt-and-braces additional disable i.e putting the BRBE in prohibited
region, which is more effective than a pause.

> 
> Regardless, the comment should expalin *why* we're doing this (i.e. that this
> is about ensuring BRBE does not create new records while we're manipulating
> it).

Will update the comment, something like this.

diff --git a/arch/arm64/kernel/brbe.c b/arch/arm64/kernel/brbe.c
index a2de0b5a941c..35e11d0f41fa 100644
--- a/arch/arm64/kernel/brbe.c
+++ b/arch/arm64/kernel/brbe.c
@@ -504,7 +504,13 @@ void armv8pmu_branch_read(struct pmu_hw_events *cpuc, struct perf_event *event)
        /* Ensure pause on PMU interrupt is enabled */
        WARN_ON_ONCE(!(brbcr & BRBCR_EL1_FZP));
 
-       /* Save and clear the privilege */
+       /*
+        * Save and clear the privilege
+        *
+        * Clearing both privilege levels put the BRBE in prohibited state
+        * thus preventing new branch records being created while existing
+        * ones get captured and processed.
+        */
        write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
 
        /* Pause the buffer */


> 
>>>> +	/* Unpause the buffer */
>>>> +	write_sysreg_s(brbfcr & ~BRBFCR_EL1_PAUSED, SYS_BRBFCR_EL1);
>>>> +	isb();
>>>> +	armv8pmu_branch_reset();
>>>> +}
>>>
>>> Why do we enable it before we reset it?
>>
>> This is the last opportunity for a clean slate start for BRBE buffer before it is
>> back recording the branches. Basically helps in ensuring a clean start.
> 
> My point is why do we start if *before* resetting it, rather than restting it
> first? Why give it the opportunity to create records that we're going to
> discard immediately thereafter?
> 
>>> Surely it would make sense to reset it first, and ammortize the cost of the ISB?
>>>
>>> That said, as above, do we actually need to pause/unpause it? Or is it already
>>> paused by virtue of the IRQ?
>>
>> Yes, it should be paused after an IRQ but it is also enforced before reading along
>> with privilege level disable.
> 
> I'm very confused as to why we're not trusting the HW to remain paused. Why do
> we need to enforce what the hardware should already be doing?

As have learned from the HW folks, there might be situations where the BRBE buffer has
been actually paused, ready for extraction in principle, but without BRBFCR_EL1_PAUSED
being set. Setting the bit here explicitly creates consistency across scenarios before
capturing the branch records. But please do note, that putting the BRBE in prohibited
region via clearing BRBCR_EL1_E0BRE/E1BRE is the primary construct which ensures that
no new branch records will make into the buffer while it's being processed.

> 
>> Regardless the buffer needs to be un-paused and also
>> enabled for required privilege levels before exiting from here.
> 
> I agree this needs to be balanced, it just seems to me that we're doing
> redundant work here.

Extracting branch records for an user space only profile session might be faster as it
would require lesser context synchronization, might not even require prohibited region
change mechanism (will be already in there, upon a PMU interrupt) etc. I could try and
update IRQ branch records handling, based on whether current perf session was profiling
only the user space or not.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
  2023-02-20  8:38           ` Anshuman Khandual
@ 2023-02-23 13:38             ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-23 13:38 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

Hi Anshuman,

Following up on some of the bits below, I've tried to read the BRBE section in
the ARM ARM (ARM DDI 0487I.a), and explain my understanding below. Please let
me know if I've misunderstood or missed something.

On Mon, Feb 20, 2023 at 02:08:39PM +0530, Anshuman Khandual wrote:
> On 2/9/23 01:33, Mark Rutland wrote:
> > On Thu, Jan 19, 2023 at 08:18:47AM +0530, Anshuman Khandual wrote:
> >> On 1/12/23 22:21, Mark Rutland wrote:
> >>> On Thu, Jan 05, 2023 at 08:40:39AM +0530, Anshuman Khandual wrote:

> >>>> +	/* Save and clear the privilege */
> >>>> +	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
> >>>
> >>> Why? Later on we restore this, and AFAICT we don't modify it.
> >>>
> >>> If it's paused, why do we care about the privilege?
> >>
> >> This disables BRBE completely (not only pause) providing confidence that no
> >> branch record can come in while the existing records are being processed.
> > 
> > I thought from earlier that it was automatically paused by HW upon raising the
> > IRQ. Have I misunderstood, and we *must* stop it, or is this a belt-and-braces
> > additional disable?
> > 
> > Is that not the case, or do we not trust the pause for some reason?
> 
> Yes, this is a belt-and-braces additional disable i.e putting the BRBE in prohibited
> region, which is more effective than a pause.

I'm afraid I don't understand what you mean by "more effective than a pause";
AFAICT the pause should be sufficient for what we're doing.

If there's a particular property that a prohibited region ensures but pausing
does not, could you please say what that property is specifically? e.g. as
below I note some differences w.r.t. the BRB_FILTRATE PMU event, but I'm not
sure if that's what you're referring to.

Per ARM DDI 0487I.a, section D15.3 on pages D15-5511 and D15-5512, we have:

  R_PYBRZ:
  
    Generation of Branch records is paused when BRBFCR_EL1.PAUSED is 1.

  R_SRJND

    If a direct read of BRBFCR_EL1.PAUSED returns 1, then no operations ordered
    after the direct read will generate further Branch records until
    BRBFCR_EL1.PAUSED is cleared by software.

    Note: The subsequent operations can be ordered by a Context synchronization
    event.

So if we read BRBFCR_EL1 and the PAUSED bit is set, then all we need is an ISB
to ensure that no further records will be generated.

Rules R_NXCWF, R_GXGWY, R_RPKTXQ mean that a freeze event is generated when
PMOVSCLR_EL0 bits become set (i.e. when an overflow occurs), and we have:

  R_BHYTD

    On a BRBE freeze event:
    * BRBFCR_EL1.PAUSED is set to 1.
    * The current timestamp is captured in BRBTS_EL1.

So any counter overflow will indirectly set BRBFCR_EL1.PAUSED, and stop the
generation of records. The note in R_SRJND tells us that remains the case until
we explicitly clear BRBFCR_EL1.PAUSED.

The only thing that I can see that potentially justifies placing the BRBE into
a prohibited region is the notes about BRB_FILTRATE, but I don't think that's
all that useful anyway since it's not manipulated atomically w.r.t. the actual
BRBE record management, and there are larger windows where BRBE will be paused
but counters running (e.g. between overflow occurring and poking the BRBE in
the overflow handler). So I think it'd be pointless to do that *just* for
BRB_FILTRATE.

Practically speaking, I expect that if we read PMOVSCLR and find any bits are
set, then issue an ISB, then after that ISB all of the following should be
true:

 (a) BRBFCR_EL1.PAUSED will be set 
 (b) No further records will be generated
 (c) We can safely manipulate the existing records

[...]

> >>> That said, as above, do we actually need to pause/unpause it? Or is it already
> >>> paused by virtue of the IRQ?
> >>
> >> Yes, it should be paused after an IRQ but it is also enforced before reading along
> >> with privilege level disable.
> > 
> > I'm very confused as to why we're not trusting the HW to remain paused. Why do
> > we need to enforce what the hardware should already be doing?
> 
> As have learned from the HW folks, there might be situations where the BRBE buffer has
> been actually paused, ready for extraction in principle, but without BRBFCR_EL1_PAUSED
> being set. 

The ARM ARM is pretty clear that paused means BRBFCR_EL1.PAUSED==1, so I assume
you mean there's a different scenario where it won't generate records (e.g.
such as being in a prohibited region).

Can you please give an example here?

I'm happy to go talk with the HW folk with you for this.

> Setting the bit here explicitly creates consistency across scenarios before
> capturing the branch records. But please do note, that putting the BRBE in prohibited
> region via clearing BRBCR_EL1_E0BRE/E1BRE is the primary construct which ensures that
> no new branch records will make into the buffer while it's being processed.

I agree that's sufficient, but as above I don't believe it's necessary, and all
that we actually require is that no new records are generated.

> >> Regardless the buffer needs to be un-paused and also
> >> enabled for required privilege levels before exiting from here.
> > 
> > I agree this needs to be balanced, it just seems to me that we're doing
> > redundant work here.
> 
> Extracting branch records for an user space only profile session might be faster as it
> would require lesser context synchronization, might not even require prohibited region
> change mechanism (will be already in there, upon a PMU interrupt) etc. I could try and
> update IRQ branch records handling, based on whether current perf session was profiling
> only the user space or not.

For the IRQ handler I do not believe it matters which exception level(s) are
being monitored; if BRBFCR_EL1.PAUSED is set no new records will be generated
regardless. So I don't think that needs any special care.

For the context-switch between tasks I believe we'll need to transiently enter
a prohibited region, but that's a different path.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE
@ 2023-02-23 13:38             ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-23 13:38 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

Hi Anshuman,

Following up on some of the bits below, I've tried to read the BRBE section in
the ARM ARM (ARM DDI 0487I.a), and explain my understanding below. Please let
me know if I've misunderstood or missed something.

On Mon, Feb 20, 2023 at 02:08:39PM +0530, Anshuman Khandual wrote:
> On 2/9/23 01:33, Mark Rutland wrote:
> > On Thu, Jan 19, 2023 at 08:18:47AM +0530, Anshuman Khandual wrote:
> >> On 1/12/23 22:21, Mark Rutland wrote:
> >>> On Thu, Jan 05, 2023 at 08:40:39AM +0530, Anshuman Khandual wrote:

> >>>> +	/* Save and clear the privilege */
> >>>> +	write_sysreg_s(brbcr & ~(BRBCR_EL1_E0BRE | BRBCR_EL1_E1BRE), SYS_BRBCR_EL1);
> >>>
> >>> Why? Later on we restore this, and AFAICT we don't modify it.
> >>>
> >>> If it's paused, why do we care about the privilege?
> >>
> >> This disables BRBE completely (not only pause) providing confidence that no
> >> branch record can come in while the existing records are being processed.
> > 
> > I thought from earlier that it was automatically paused by HW upon raising the
> > IRQ. Have I misunderstood, and we *must* stop it, or is this a belt-and-braces
> > additional disable?
> > 
> > Is that not the case, or do we not trust the pause for some reason?
> 
> Yes, this is a belt-and-braces additional disable i.e putting the BRBE in prohibited
> region, which is more effective than a pause.

I'm afraid I don't understand what you mean by "more effective than a pause";
AFAICT the pause should be sufficient for what we're doing.

If there's a particular property that a prohibited region ensures but pausing
does not, could you please say what that property is specifically? e.g. as
below I note some differences w.r.t. the BRB_FILTRATE PMU event, but I'm not
sure if that's what you're referring to.

Per ARM DDI 0487I.a, section D15.3 on pages D15-5511 and D15-5512, we have:

  R_PYBRZ:
  
    Generation of Branch records is paused when BRBFCR_EL1.PAUSED is 1.

  R_SRJND

    If a direct read of BRBFCR_EL1.PAUSED returns 1, then no operations ordered
    after the direct read will generate further Branch records until
    BRBFCR_EL1.PAUSED is cleared by software.

    Note: The subsequent operations can be ordered by a Context synchronization
    event.

So if we read BRBFCR_EL1 and the PAUSED bit is set, then all we need is an ISB
to ensure that no further records will be generated.

Rules R_NXCWF, R_GXGWY, R_RPKTXQ mean that a freeze event is generated when
PMOVSCLR_EL0 bits become set (i.e. when an overflow occurs), and we have:

  R_BHYTD

    On a BRBE freeze event:
    * BRBFCR_EL1.PAUSED is set to 1.
    * The current timestamp is captured in BRBTS_EL1.

So any counter overflow will indirectly set BRBFCR_EL1.PAUSED, and stop the
generation of records. The note in R_SRJND tells us that remains the case until
we explicitly clear BRBFCR_EL1.PAUSED.

The only thing that I can see that potentially justifies placing the BRBE into
a prohibited region is the notes about BRB_FILTRATE, but I don't think that's
all that useful anyway since it's not manipulated atomically w.r.t. the actual
BRBE record management, and there are larger windows where BRBE will be paused
but counters running (e.g. between overflow occurring and poking the BRBE in
the overflow handler). So I think it'd be pointless to do that *just* for
BRB_FILTRATE.

Practically speaking, I expect that if we read PMOVSCLR and find any bits are
set, then issue an ISB, then after that ISB all of the following should be
true:

 (a) BRBFCR_EL1.PAUSED will be set 
 (b) No further records will be generated
 (c) We can safely manipulate the existing records

[...]

> >>> That said, as above, do we actually need to pause/unpause it? Or is it already
> >>> paused by virtue of the IRQ?
> >>
> >> Yes, it should be paused after an IRQ but it is also enforced before reading along
> >> with privilege level disable.
> > 
> > I'm very confused as to why we're not trusting the HW to remain paused. Why do
> > we need to enforce what the hardware should already be doing?
> 
> As have learned from the HW folks, there might be situations where the BRBE buffer has
> been actually paused, ready for extraction in principle, but without BRBFCR_EL1_PAUSED
> being set. 

The ARM ARM is pretty clear that paused means BRBFCR_EL1.PAUSED==1, so I assume
you mean there's a different scenario where it won't generate records (e.g.
such as being in a prohibited region).

Can you please give an example here?

I'm happy to go talk with the HW folk with you for this.

> Setting the bit here explicitly creates consistency across scenarios before
> capturing the branch records. But please do note, that putting the BRBE in prohibited
> region via clearing BRBCR_EL1_E0BRE/E1BRE is the primary construct which ensures that
> no new branch records will make into the buffer while it's being processed.

I agree that's sufficient, but as above I don't believe it's necessary, and all
that we actually require is that no new records are generated.

> >> Regardless the buffer needs to be un-paused and also
> >> enabled for required privilege levels before exiting from here.
> > 
> > I agree this needs to be balanced, it just seems to me that we're doing
> > redundant work here.
> 
> Extracting branch records for an user space only profile session might be faster as it
> would require lesser context synchronization, might not even require prohibited region
> change mechanism (will be already in there, upon a PMU interrupt) etc. I could try and
> update IRQ branch records handling, based on whether current perf session was profiling
> only the user space or not.

For the IRQ handler I do not believe it matters which exception level(s) are
being monitored; if BRBFCR_EL1.PAUSED is set no new records will be generated
regardless. So I don't think that needs any special care.

For the context-switch between tasks I believe we'll need to transiently enter
a prohibited region, but that's a different path.

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
  2023-02-13  8:23           ` Anshuman Khandual
@ 2023-02-23 13:47             ` Mark Rutland
  -1 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-23 13:47 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Mon, Feb 13, 2023 at 01:53:56PM +0530, Anshuman Khandual wrote:
> 
> 
> On 2/9/23 01:06, Mark Rutland wrote:
> > On Fri, Jan 13, 2023 at 10:41:51AM +0530, Anshuman Khandual wrote:
> >>
> >>
> >> On 1/12/23 19:59, Mark Rutland wrote:
> >>> On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
> >>>> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
> >>>>  		if (!armpmu_event_set_period(event))
> >>>>  			continue;
> >>>>  
> >>>> +		if (has_branch_stack(event)) {
> >>>> +			WARN_ON(!cpuc->branches);
> >>>> +			armv8pmu_branch_read(cpuc, event);
> >>>> +			data.br_stack = &cpuc->branches->branch_stack;
> >>>> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> >>>> +		}
> >>>
> >>> How do we ensure the data we're getting isn't changed under our feet? Is BRBE
> >>> disabled at this point?
> >>
> >> Right, BRBE is paused after a PMU IRQ. We also ensure the buffer is disabled for
> >> all exception levels, i.e removing BRBCR_EL1_E0BRE/E1BRE from the configuration,
> >> before initiating the actual read, which eventually populates the data.br_stack.
> > 
> > Ok; just to confirm, what exactly is the condition that enforces that BRBE is
> > disabled? Is that *while* there's an overflow asserted, or does something else
> > get set at the instant the overflow occurs?
> 
> - BRBE can be disabled completely via BRBCR_EL1_E0BRE/E1BRE irrespective of PMU interrupt
> - But with PMU interrupt, it just pauses if BRBCR_EL1_FZP is enabled

IIUC the distinction between "disabled completely" and "just pauses" doesn't
really matter to us, and a pause is sufficient for use to be able to read and
manipulate the records.

I also note that we always set BRBCR_EL1.FZP.

Am I missing something?

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
@ 2023-02-23 13:47             ` Mark Rutland
  0 siblings, 0 replies; 62+ messages in thread
From: Mark Rutland @ 2023-02-23 13:47 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon

On Mon, Feb 13, 2023 at 01:53:56PM +0530, Anshuman Khandual wrote:
> 
> 
> On 2/9/23 01:06, Mark Rutland wrote:
> > On Fri, Jan 13, 2023 at 10:41:51AM +0530, Anshuman Khandual wrote:
> >>
> >>
> >> On 1/12/23 19:59, Mark Rutland wrote:
> >>> On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
> >>>> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
> >>>>  		if (!armpmu_event_set_period(event))
> >>>>  			continue;
> >>>>  
> >>>> +		if (has_branch_stack(event)) {
> >>>> +			WARN_ON(!cpuc->branches);
> >>>> +			armv8pmu_branch_read(cpuc, event);
> >>>> +			data.br_stack = &cpuc->branches->branch_stack;
> >>>> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
> >>>> +		}
> >>>
> >>> How do we ensure the data we're getting isn't changed under our feet? Is BRBE
> >>> disabled at this point?
> >>
> >> Right, BRBE is paused after a PMU IRQ. We also ensure the buffer is disabled for
> >> all exception levels, i.e removing BRBCR_EL1_E0BRE/E1BRE from the configuration,
> >> before initiating the actual read, which eventually populates the data.br_stack.
> > 
> > Ok; just to confirm, what exactly is the condition that enforces that BRBE is
> > disabled? Is that *while* there's an overflow asserted, or does something else
> > get set at the instant the overflow occurs?
> 
> - BRBE can be disabled completely via BRBCR_EL1_E0BRE/E1BRE irrespective of PMU interrupt
> - But with PMU interrupt, it just pauses if BRBCR_EL1_FZP is enabled

IIUC the distinction between "disabled completely" and "just pauses" doesn't
really matter to us, and a pause is sufficient for use to be able to read and
manipulate the records.

I also note that we always set BRBCR_EL1.FZP.

Am I missing something?

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
  2023-02-23 13:47             ` Mark Rutland
@ 2023-03-06  7:59               ` Anshuman Khandual
  -1 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-03-06  7:59 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon



On 2/23/23 19:17, Mark Rutland wrote:
> On Mon, Feb 13, 2023 at 01:53:56PM +0530, Anshuman Khandual wrote:
>>
>>
>> On 2/9/23 01:06, Mark Rutland wrote:
>>> On Fri, Jan 13, 2023 at 10:41:51AM +0530, Anshuman Khandual wrote:
>>>>
>>>>
>>>> On 1/12/23 19:59, Mark Rutland wrote:
>>>>> On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
>>>>>> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>>>>>>  		if (!armpmu_event_set_period(event))
>>>>>>  			continue;
>>>>>>  
>>>>>> +		if (has_branch_stack(event)) {
>>>>>> +			WARN_ON(!cpuc->branches);
>>>>>> +			armv8pmu_branch_read(cpuc, event);
>>>>>> +			data.br_stack = &cpuc->branches->branch_stack;
>>>>>> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
>>>>>> +		}
>>>>>
>>>>> How do we ensure the data we're getting isn't changed under our feet? Is BRBE
>>>>> disabled at this point?
>>>>
>>>> Right, BRBE is paused after a PMU IRQ. We also ensure the buffer is disabled for
>>>> all exception levels, i.e removing BRBCR_EL1_E0BRE/E1BRE from the configuration,
>>>> before initiating the actual read, which eventually populates the data.br_stack.
>>>
>>> Ok; just to confirm, what exactly is the condition that enforces that BRBE is
>>> disabled? Is that *while* there's an overflow asserted, or does something else
>>> get set at the instant the overflow occurs?
>>
>> - BRBE can be disabled completely via BRBCR_EL1_E0BRE/E1BRE irrespective of PMU interrupt
>> - But with PMU interrupt, it just pauses if BRBCR_EL1_FZP is enabled
> 
> IIUC the distinction between "disabled completely" and "just pauses" doesn't
> really matter to us, and a pause is sufficient for use to be able to read and
> manipulate the records.
As I learned from the HW folks earlier, but seems like we might have to revisit
this understanding once again.

'Pause' state ensures that no new branch records could not get into the buffer
which is necessary, but not sufficient enough condition before all the branch
records could be processed in software. BRBE "disabled completely" via putting
in prohibited region (implicitly during PMU interrupt while tracing user only
sessions, explicitly while tracing user/kernel/hv sessions) is still necessary.

> 
> I also note that we always set BRBCR_EL1.FZP.
> 
> Am I missing something?

We always set BRBCR_EL1.FZP, but during PMU interrupt while processing branch
records, there are certain distinctions. 

user only traces:

	- Ensuring BRBFCR_EL1_PAUSED being set is not necessary
	- BRBE is already in prohibited region (IRQ handler in EL1)
	- Exception transition serve as a context synchronization event
	- Branch records can be read and processed right away
	- Return after clearing BRBFCR_EL1_PAUSED followed by BRB_IALL
	- isb() is not even necessary before returning 
	- ERET from EL1 will ensure a context a synchronization event

privilege traces:
	
	- Ensuring BRBFCR_EL1_PAUSED is necessary
	- Ensuring BRBE is in prohibited state - SW clears BRBCR_EL1_E1BR
	- isb() is required to ensure BRBE is prohibited state before reading
	- Return after clearing BRBFCR_EL1_PAUSED followed by BRB_IALL
	- isb() is required while returning from IRQ handler

I had suggested differentiating user only sessions, because it can save multiple
isb() instances and register write accesses which is not possible for privilege
trace sessions.

- Anshuman

^ permalink raw reply	[flat|nested] 62+ messages in thread

* Re: [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU
@ 2023-03-06  7:59               ` Anshuman Khandual
  0 siblings, 0 replies; 62+ messages in thread
From: Anshuman Khandual @ 2023-03-06  7:59 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel, linux-kernel, Catalin Marinas, Will Deacon



On 2/23/23 19:17, Mark Rutland wrote:
> On Mon, Feb 13, 2023 at 01:53:56PM +0530, Anshuman Khandual wrote:
>>
>>
>> On 2/9/23 01:06, Mark Rutland wrote:
>>> On Fri, Jan 13, 2023 at 10:41:51AM +0530, Anshuman Khandual wrote:
>>>>
>>>>
>>>> On 1/12/23 19:59, Mark Rutland wrote:
>>>>> On Thu, Jan 05, 2023 at 08:40:38AM +0530, Anshuman Khandual wrote:
>>>>>> @@ -878,6 +890,13 @@ static irqreturn_t armv8pmu_handle_irq(struct arm_pmu *cpu_pmu)
>>>>>>  		if (!armpmu_event_set_period(event))
>>>>>>  			continue;
>>>>>>  
>>>>>> +		if (has_branch_stack(event)) {
>>>>>> +			WARN_ON(!cpuc->branches);
>>>>>> +			armv8pmu_branch_read(cpuc, event);
>>>>>> +			data.br_stack = &cpuc->branches->branch_stack;
>>>>>> +			data.sample_flags |= PERF_SAMPLE_BRANCH_STACK;
>>>>>> +		}
>>>>>
>>>>> How do we ensure the data we're getting isn't changed under our feet? Is BRBE
>>>>> disabled at this point?
>>>>
>>>> Right, BRBE is paused after a PMU IRQ. We also ensure the buffer is disabled for
>>>> all exception levels, i.e removing BRBCR_EL1_E0BRE/E1BRE from the configuration,
>>>> before initiating the actual read, which eventually populates the data.br_stack.
>>>
>>> Ok; just to confirm, what exactly is the condition that enforces that BRBE is
>>> disabled? Is that *while* there's an overflow asserted, or does something else
>>> get set at the instant the overflow occurs?
>>
>> - BRBE can be disabled completely via BRBCR_EL1_E0BRE/E1BRE irrespective of PMU interrupt
>> - But with PMU interrupt, it just pauses if BRBCR_EL1_FZP is enabled
> 
> IIUC the distinction between "disabled completely" and "just pauses" doesn't
> really matter to us, and a pause is sufficient for use to be able to read and
> manipulate the records.
As I learned from the HW folks earlier, but seems like we might have to revisit
this understanding once again.

'Pause' state ensures that no new branch records could not get into the buffer
which is necessary, but not sufficient enough condition before all the branch
records could be processed in software. BRBE "disabled completely" via putting
in prohibited region (implicitly during PMU interrupt while tracing user only
sessions, explicitly while tracing user/kernel/hv sessions) is still necessary.

> 
> I also note that we always set BRBCR_EL1.FZP.
> 
> Am I missing something?

We always set BRBCR_EL1.FZP, but during PMU interrupt while processing branch
records, there are certain distinctions. 

user only traces:

	- Ensuring BRBFCR_EL1_PAUSED being set is not necessary
	- BRBE is already in prohibited region (IRQ handler in EL1)
	- Exception transition serve as a context synchronization event
	- Branch records can be read and processed right away
	- Return after clearing BRBFCR_EL1_PAUSED followed by BRB_IALL
	- isb() is not even necessary before returning 
	- ERET from EL1 will ensure a context a synchronization event

privilege traces:
	
	- Ensuring BRBFCR_EL1_PAUSED is necessary
	- Ensuring BRBE is in prohibited state - SW clears BRBCR_EL1_E1BR
	- isb() is required to ensure BRBE is prohibited state before reading
	- Return after clearing BRBFCR_EL1_PAUSED followed by BRB_IALL
	- isb() is required while returning from IRQ handler

I had suggested differentiating user only sessions, because it can save multiple
isb() instances and register write accesses which is not possible for privilege
trace sessions.

- Anshuman

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 62+ messages in thread

end of thread, other threads:[~2023-03-06  8:11 UTC | newest]

Thread overview: 62+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-01-05  3:10 [PATCH V7 0/6] arm64/perf: Enable branch stack sampling Anshuman Khandual
2023-01-05  3:10 ` Anshuman Khandual
2023-01-05  3:10 ` [PATCH V7 1/6] drivers: perf: arm_pmu: Add new sched_task() callback Anshuman Khandual
2023-01-05  3:10   ` Anshuman Khandual
2023-01-05  3:10 ` [PATCH V7 2/6] arm64/perf: Add BRBE registers and fields Anshuman Khandual
2023-01-05  3:10   ` Anshuman Khandual
2023-01-12 13:24   ` Mark Rutland
2023-01-12 13:24     ` Mark Rutland
2023-01-13  3:02     ` Anshuman Khandual
2023-01-13  3:02       ` Anshuman Khandual
2023-02-08 19:22       ` Mark Rutland
2023-02-08 19:22         ` Mark Rutland
2023-02-09  5:49         ` Anshuman Khandual
2023-02-09  5:49           ` Anshuman Khandual
2023-02-09 10:08           ` Mark Rutland
2023-02-09 10:08             ` Mark Rutland
2023-01-05  3:10 ` [PATCH V7 3/6] arm64/perf: Add branch stack support in struct arm_pmu Anshuman Khandual
2023-01-05  3:10   ` Anshuman Khandual
2023-01-12 13:54   ` Mark Rutland
2023-01-12 13:54     ` Mark Rutland
2023-01-13  4:15     ` Anshuman Khandual
2023-01-13  4:15       ` Anshuman Khandual
2023-02-08 19:26       ` Mark Rutland
2023-02-08 19:26         ` Mark Rutland
2023-02-09  3:40         ` Anshuman Khandual
2023-02-09  3:40           ` Anshuman Khandual
2023-01-05  3:10 ` [PATCH V7 4/6] arm64/perf: Add branch stack support in struct pmu_hw_events Anshuman Khandual
2023-01-05  3:10   ` Anshuman Khandual
2023-01-12 13:59   ` Mark Rutland
2023-01-12 13:59     ` Mark Rutland
2023-01-05  3:10 ` [PATCH V7 5/6] arm64/perf: Add branch stack support in ARMV8 PMU Anshuman Khandual
2023-01-05  3:10   ` Anshuman Khandual
2023-01-12 14:29   ` Mark Rutland
2023-01-12 14:29     ` Mark Rutland
2023-01-13  5:11     ` Anshuman Khandual
2023-01-13  5:11       ` Anshuman Khandual
2023-02-08 19:36       ` Mark Rutland
2023-02-08 19:36         ` Mark Rutland
2023-02-13  8:23         ` Anshuman Khandual
2023-02-13  8:23           ` Anshuman Khandual
2023-02-23 13:47           ` Mark Rutland
2023-02-23 13:47             ` Mark Rutland
2023-03-06  7:59             ` Anshuman Khandual
2023-03-06  7:59               ` Anshuman Khandual
2023-01-05  3:10 ` [PATCH V7 6/6] arm64/perf: Enable branch stack events via FEAT_BRBE Anshuman Khandual
2023-01-05  3:10   ` Anshuman Khandual
2023-01-12 16:51   ` Mark Rutland
2023-01-12 16:51     ` Mark Rutland
2023-01-19  2:48     ` Anshuman Khandual
2023-01-19  2:48       ` Anshuman Khandual
2023-02-08 20:03       ` Mark Rutland
2023-02-08 20:03         ` Mark Rutland
2023-02-20  8:38         ` Anshuman Khandual
2023-02-20  8:38           ` Anshuman Khandual
2023-02-23 13:38           ` Mark Rutland
2023-02-23 13:38             ` Mark Rutland
2023-01-06 10:23 ` [PATCH V7 0/6] arm64/perf: Enable branch stack sampling James Clark
2023-01-06 10:23   ` James Clark
2023-01-06 11:13   ` Anshuman Khandual
2023-01-06 11:13     ` Anshuman Khandual
2023-01-11  5:05 ` Anshuman Khandual
2023-01-11  5:05   ` Anshuman Khandual

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.