linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds
@ 2021-10-14 22:31 Suzuki K Poulose
  2021-10-14 22:31 ` [PATCH v5 01/15] arm64: Add Neoverse-N2, Cortex-A710 CPU part definition Suzuki K Poulose
                   ` (14 more replies)
  0 siblings, 15 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose


This series adds CPU erratum work arounds related to the self-hosted
tracing. The list of affected errata handled in this series are :

 * TRBE may overwrite trace in FILL mode
   - Arm Neoverse-N2	#2139208
   - Cortex-A710	#211985

 * A TSB instruction may not flush the trace completely when executed
   in trace prohibited region.

   - Arm Neoverse-N2	#2067961
   - Cortex-A710	#2054223

 * TRBE may write to out-of-range address
   - Arm Neoverse-N2	#2253138
   - Cortex-A710	#2224489

The series applies on coresight/next. The series has been reordered
to make it easier to merge the patches via arm64 tree and the coresight
tree.

Patches 1-4 are could be picked up via arm64 tree. The rest can go via
the coresight tree. The Kconfig items for the TRBE errata are initially
dropped in with dependency on COMPILE_TEST. These are dropped only after
the driver is equipped with the work around in later patches.


A tree is available here :

git@git.gitlab.arm.com:linux-arm/linux-skp.git  coresight/errata/trbe-tsb-n2-a710/v5

Changes since v4:
 - Fix WARN on trbe driver probe on a hotplugged CPU, by making
   sure that the arm_trbe_probe_cpu() is called from non-premptible
   context. this_cpu_has_cap() doesn't like to be called from a
   preemptible() context.

 - Fix Kconfig text issues pointed out by Randy

Changes since v3:
 
 - Fix missing Kconfig selection for TSB flush failure erratum (Will)
   Merged the Kconfig changes to the core patch for TSB.
 - Use COMPILE_TEST dependency for the TRBE work arounds instead of
   delaying the Kconfig entries.

Changes since v2:
 * https://lkml.kernel.org/r/20210921134121.2423546-1-suzuki.poulose@arm.com 
 - Dropped patch adding a helper to reach cpudata from perf handle
 - Split the TSB erratum work around patch to split the Kconfig/erratum
   list update changes(pushed to the end of the series).
 - Added wrappers to check the erratum :
    trbe_has_erratum(cpudata, TRBE_ERRATUM_<TITLE>) -> trbe_may_<title>
 - More ASCII art explanation on workaround.

Changes since v1:
 * https://lkml.kernel.org/r/20210728135217.591173-1-suzuki.poulose@arm.com
 - Added a fix to the TRBE driver handling of sink_specific data
 - Added more description and ASCII art for overwrite in FILL mode
   work around 
 - Added another TRBE erratum to the list.
  "TRBE may write to out-of-range address"
  Patches from 12-17
 - Added comment to list the expectations around TSB erratum workaround.


Suzuki K Poulose (15):
  arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
  arm64: errata: Add detection for TRBE overwrite in FILL mode
  arm64: errata: Add workaround for TSB flush failures
  arm64: errata: Add detection for TRBE write to out-of-range
  coresight: trbe: Add a helper to calculate the trace generated
  coresight: trbe: Add a helper to pad a given buffer area
  coresight: trbe: Decouple buffer base from the hardware base
  coresight: trbe: Allow driver to choose a different alignment
  coresight: trbe: Add infrastructure for Errata handling
  coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  coresight: trbe: Add a helper to determine the minimum buffer size
  coresight: trbe: Make sure we have enough space
  coresight: trbe: Work around write to out of range
  arm64: errata: Enable workaround for TRBE overwrite in FILL mode
  arm64: errata: Enable TRBE workaround for write to out-of-range
    address

 Documentation/arm64/silicon-errata.rst       |  12 +
 arch/arm64/Kconfig                           | 111 ++++++
 arch/arm64/include/asm/barrier.h             |  16 +-
 arch/arm64/include/asm/cputype.h             |   4 +
 arch/arm64/kernel/cpu_errata.c               |  64 +++
 arch/arm64/tools/cpucaps                     |   3 +
 drivers/hwtracing/coresight/coresight-trbe.c | 395 +++++++++++++++++--
 7 files changed, 567 insertions(+), 38 deletions(-)

-- 
2.25.4


^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH v5 01/15] arm64: Add Neoverse-N2, Cortex-A710 CPU part definition
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-14 22:31 ` [PATCH v5 02/15] arm64: errata: Add detection for TRBE overwrite in FILL mode Suzuki K Poulose
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose,
	Mark Rutland

Add the CPU Partnumbers for the new Arm designs.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/cputype.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index 6231e1f0abe7..19b8441aa8f2 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -73,6 +73,8 @@
 #define ARM_CPU_PART_CORTEX_A76		0xD0B
 #define ARM_CPU_PART_NEOVERSE_N1	0xD0C
 #define ARM_CPU_PART_CORTEX_A77		0xD0D
+#define ARM_CPU_PART_CORTEX_A710	0xD47
+#define ARM_CPU_PART_NEOVERSE_N2	0xD49
 
 #define APM_CPU_PART_POTENZA		0x000
 
@@ -113,6 +115,8 @@
 #define MIDR_CORTEX_A76	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A76)
 #define MIDR_NEOVERSE_N1 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N1)
 #define MIDR_CORTEX_A77	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A77)
+#define MIDR_CORTEX_A710 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A710)
+#define MIDR_NEOVERSE_N2 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_NEOVERSE_N2)
 #define MIDR_THUNDERX	MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
 #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
 #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 02/15] arm64: errata: Add detection for TRBE overwrite in FILL mode
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
  2021-10-14 22:31 ` [PATCH v5 01/15] arm64: Add Neoverse-N2, Cortex-A710 CPU part definition Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-19 11:04   ` Will Deacon
  2021-10-14 22:31 ` [PATCH v5 03/15] arm64: errata: Add workaround for TSB flush failures Suzuki K Poulose
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose,
	Mark Rutland

Arm Neoverse-N2 and the Cortex-A710 cores are affected
by a CPU erratum where the TRBE will overwrite the trace buffer
in FILL mode. The TRBE doesn't stop (as expected in FILL mode)
when it reaches the limit and wraps to the base to continue
writing upto 3 cache lines. This will overwrite any trace that
was written previously.

Add the Neoverse-N2 erratumi(#2139208) and Cortex-A710 erratum
(#2119858) to the  detection logic.

This will be used by the TRBE driver in later patches to work
around the issue. The detection has been kept with the core
arm64 errata framework list to make sure :
  - We don't duplicate the framework in TRBE driver
  - The errata detection is advertised like the rest
    of the CPU errata.

Note that the Kconfig entries are not fully active until the
TRBE driver implements the work around.

Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
cc: Leo Yan <leo.yan@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 Documentation/arm64/silicon-errata.rst |  4 +++
 arch/arm64/Kconfig                     | 41 ++++++++++++++++++++++++++
 arch/arm64/kernel/cpu_errata.c         | 25 ++++++++++++++++
 arch/arm64/tools/cpucaps               |  1 +
 4 files changed, 71 insertions(+)

diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index d410a47ffa57..2f99229d993c 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -92,12 +92,16 @@ stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A77      | #1508412        | ARM64_ERRATUM_1508412       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A710     | #2119858        | ARM64_ERRATUM_2119858       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N1     | #1188873,1418040| ARM64_ERRATUM_1418040       |
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N1     | #1349291        | N/A                         |
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N1     | #1542419        | ARM64_ERRATUM_1542419       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Neoverse-N2     | #2139208        | ARM64_ERRATUM_2139208       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | MMU-500         | #841119,826419  | N/A                         |
 +----------------+-----------------+-----------------+-----------------------------+
 +----------------+-----------------+-----------------+-----------------------------+
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 077f2ec4eeb2..b452181d3638 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -666,6 +666,47 @@ config ARM64_ERRATUM_1508412
 
 	  If unsure, say Y.
 
+config ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
+	bool
+
+config ARM64_ERRATUM_2119858
+	bool "Cortex-A710: 2119858: workaround TRBE overwriting trace data in FILL mode"
+	default y
+	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
+	depends on CORESIGHT_TRBE
+	select ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
+	help
+	  This option adds the workaround for ARM Cortex-A710 erratum 2119858.
+
+	  Affected Cortex-A710 cores could overwrite up to 3 cache lines of trace
+	  data at the base of the buffer (pointed to by TRBASER_EL1) in FILL mode in
+	  the event of a WRAP event.
+
+	  Work around the issue by always making sure we move the TRBPTR_EL1 by
+	  256 bytes before enabling the buffer and filling the first 256 bytes of
+	  the buffer with ETM ignore packets upon disabling.
+
+	  If unsure, say Y.
+
+config ARM64_ERRATUM_2139208
+	bool "Neoverse-N2: 2139208: workaround TRBE overwriting trace data in FILL mode"
+	default y
+	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
+	depends on CORESIGHT_TRBE
+	select ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
+	help
+	  This option adds the workaround for ARM Neoverse-N2 erratum 2139208.
+
+	  Affected Neoverse-N2 cores could overwrite up to 3 cache lines of trace
+	  data at the base of the buffer (pointed to by TRBASER_EL1) in FILL mode in
+	  the event of a WRAP event.
+
+	  Work around the issue by always making sure we move the TRBPTR_EL1 by
+	  256 bytes before enabling the buffer and filling the first 256 bytes of
+	  the buffer with ETM ignore packets upon disabling.
+
+	  If unsure, say Y.
+
 config CAVIUM_ERRATUM_22375
 	bool "Cavium erratum 22375, 24313"
 	default y
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index e2c20c036442..ccd757373f36 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -340,6 +340,18 @@ static const struct midr_range erratum_1463225[] = {
 };
 #endif
 
+#ifdef CONFIG_ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
+static const struct midr_range trbe_overwrite_fill_mode_cpus[] = {
+#ifdef CONFIG_ARM64_ERRATUM_2139208
+	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
+#endif
+#ifdef CONFIG_ARM64_ERRATUM_2119858
+	MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
+#endif
+	{},
+};
+#endif	/* CONFIG_ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE */
+
 const struct arm64_cpu_capabilities arm64_errata[] = {
 #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
 	{
@@ -533,6 +545,19 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 		.capability = ARM64_WORKAROUND_NVIDIA_CARMEL_CNP,
 		ERRATA_MIDR_ALL_VERSIONS(MIDR_NVIDIA_CARMEL),
 	},
+#endif
+#ifdef CONFIG_ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
+	{
+		/*
+		 * The erratum work around is handled within the TRBE
+		 * driver and can be applied per-cpu. So, we can allow
+		 * a late CPU to come online with this erratum.
+		 */
+		.desc = "ARM erratum 2119858 or 2139208",
+		.capability = ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE,
+		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+		CAP_MIDR_RANGE_LIST(trbe_overwrite_fill_mode_cpus),
+	},
 #endif
 	{
 	}
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 49305c2e6dfd..1ccb92165bd8 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -53,6 +53,7 @@ WORKAROUND_1418040
 WORKAROUND_1463225
 WORKAROUND_1508412
 WORKAROUND_1542419
+WORKAROUND_TRBE_OVERWRITE_FILL_MODE
 WORKAROUND_CAVIUM_23154
 WORKAROUND_CAVIUM_27456
 WORKAROUND_CAVIUM_30115
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 03/15] arm64: errata: Add workaround for TSB flush failures
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
  2021-10-14 22:31 ` [PATCH v5 01/15] arm64: Add Neoverse-N2, Cortex-A710 CPU part definition Suzuki K Poulose
  2021-10-14 22:31 ` [PATCH v5 02/15] arm64: errata: Add detection for TRBE overwrite in FILL mode Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-19  5:06   ` Anshuman Khandual
  2021-10-19 11:02   ` Will Deacon
  2021-10-14 22:31 ` [PATCH v5 04/15] arm64: errata: Add detection for TRBE write to out-of-range Suzuki K Poulose
                   ` (11 subsequent siblings)
  14 siblings, 2 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose,
	Mark Rutland

Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffers
from errata, where a TSB (trace synchronization barrier)
fails to flush the trace data completely, when executed from
a trace prohibited region. In Linux we always execute it
after we have moved the PE to trace prohibited region. So,
we can apply the workaround every time a TSB is executed.

The work around is to issue two TSB consecutively.

NOTE: This errata is defined as LOCAL_CPU_ERRATUM, implying
that a late CPU could be blocked from booting if it is the
first CPU that requires the workaround. This is because we
do not allow setting a cpu_hwcaps after the SMP boot. The
other alternative is to use "this_cpu_has_cap()" instead
of the faster system wide check, which may be a bit of an
overhead, given we may have to do this in nvhe KVM host
before a guest entry.

Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
Changes since v3:
 - Merge the Kconfig changes back.
---
 Documentation/arm64/silicon-errata.rst |  4 ++++
 arch/arm64/Kconfig                     | 33 ++++++++++++++++++++++++++
 arch/arm64/include/asm/barrier.h       | 16 ++++++++++++-
 arch/arm64/kernel/cpu_errata.c         | 19 +++++++++++++++
 arch/arm64/tools/cpucaps               |  1 +
 5 files changed, 72 insertions(+), 1 deletion(-)

diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index 2f99229d993c..569a92411dcd 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -94,6 +94,8 @@ stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A710     | #2119858        | ARM64_ERRATUM_2119858       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A710     | #2054223        | ARM64_ERRATUM_2054223       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N1     | #1188873,1418040| ARM64_ERRATUM_1418040       |
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N1     | #1349291        | N/A                         |
@@ -102,6 +104,8 @@ stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N2     | #2139208        | ARM64_ERRATUM_2139208       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Neoverse-N2     | #2067961        | ARM64_ERRATUM_2067961       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | MMU-500         | #841119,826419  | N/A                         |
 +----------------+-----------------+-----------------+-----------------------------+
 +----------------+-----------------+-----------------+-----------------------------+
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b452181d3638..39b78460b9d0 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -707,6 +707,39 @@ config ARM64_ERRATUM_2139208
 
 	  If unsure, say Y.
 
+config ARM64_WORKAROUND_TSB_FLUSH_FAILURE
+	bool
+
+config ARM64_ERRATUM_2054223
+	bool "Cortex-A710: 2054223: workaround TSB instruction failing to flush trace"
+	default y
+	select ARM64_WORKAROUND_TSB_FLUSH_FAILURE
+	help
+	  Enable workaround for ARM Cortex-A710 erratum 2054223
+
+	  Affected cores may fail to flush the trace data on a TSB instruction, when
+	  the PE is in trace prohibited state. This will cause losing a few bytes
+	  of the trace cached.
+
+	  Workaround is to issue two TSB consecutively on affected cores.
+
+	  If unsure, say Y.
+
+config ARM64_ERRATUM_2067961
+	bool "Neoverse-N2: 2067961: workaround TSB instruction failing to flush trace"
+	default y
+	select ARM64_WORKAROUND_TSB_FLUSH_FAILURE
+	help
+	  Enable workaround for ARM Neoverse-N2 erratum 2067961
+
+	  Affected cores may fail to flush the trace data on a TSB instruction, when
+	  the PE is in trace prohibited state. This will cause losing a few bytes
+	  of the trace cached.
+
+	  Workaround is to issue two TSB consecutively on affected cores.
+
+	  If unsure, say Y.
+
 config CAVIUM_ERRATUM_22375
 	bool "Cavium erratum 22375, 24313"
 	default y
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index 451e11e5fd23..1c5a00598458 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -23,7 +23,7 @@
 #define dsb(opt)	asm volatile("dsb " #opt : : : "memory")
 
 #define psb_csync()	asm volatile("hint #17" : : : "memory")
-#define tsb_csync()	asm volatile("hint #18" : : : "memory")
+#define __tsb_csync()	asm volatile("hint #18" : : : "memory")
 #define csdb()		asm volatile("hint #20" : : : "memory")
 
 #ifdef CONFIG_ARM64_PSEUDO_NMI
@@ -46,6 +46,20 @@
 #define dma_rmb()	dmb(oshld)
 #define dma_wmb()	dmb(oshst)
 
+
+#define tsb_csync()								\
+	do {									\
+		/*								\
+		 * CPUs affected by Arm Erratum 2054223 or 2067961 needs	\
+		 * another TSB to ensure the trace is flushed. The barriers	\
+		 * don't have to be strictly back to back, as long as the	\
+		 * CPU is in trace prohibited state.				\
+		 */								\
+		if (cpus_have_final_cap(ARM64_WORKAROUND_TSB_FLUSH_FAILURE))	\
+			__tsb_csync();						\
+		__tsb_csync();							\
+	} while (0)
+
 /*
  * Generate a mask for array_index__nospec() that is ~0UL when 0 <= idx < sz
  * and 0 otherwise.
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index ccd757373f36..bdbeac75ead6 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -352,6 +352,18 @@ static const struct midr_range trbe_overwrite_fill_mode_cpus[] = {
 };
 #endif	/* CONFIG_ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE */
 
+#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE
+static const struct midr_range tsb_flush_fail_cpus[] = {
+#ifdef CONFIG_ARM64_ERRATUM_2067961
+	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
+#endif
+#ifdef CONFIG_ARM64_ERRATUM_2054223
+	MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
+#endif
+	{},
+};
+#endif	/* CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE */
+
 const struct arm64_cpu_capabilities arm64_errata[] = {
 #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
 	{
@@ -558,6 +570,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
 		CAP_MIDR_RANGE_LIST(trbe_overwrite_fill_mode_cpus),
 	},
+#endif
+#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILRE
+	{
+		.desc = "ARM erratum 2067961 or 2054223",
+		.capability = ARM64_WORKAROUND_TSB_FLUSH_FAILURE,
+		ERRATA_MIDR_RANGE_LIST(tsb_flush_fail_cpus),
+	},
 #endif
 	{
 	}
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 1ccb92165bd8..2102e15af43d 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -54,6 +54,7 @@ WORKAROUND_1463225
 WORKAROUND_1508412
 WORKAROUND_1542419
 WORKAROUND_TRBE_OVERWRITE_FILL_MODE
+WORKAROUND_TSB_FLUSH_FAILURE
 WORKAROUND_CAVIUM_23154
 WORKAROUND_CAVIUM_27456
 WORKAROUND_CAVIUM_30115
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 04/15] arm64: errata: Add detection for TRBE write to out-of-range
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (2 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 03/15] arm64: errata: Add workaround for TSB flush failures Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-18 15:50   ` Mathieu Poirier
  2021-10-14 22:31 ` [PATCH v5 05/15] coresight: trbe: Add a helper to calculate the trace generated Suzuki K Poulose
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose,
	Mark Rutland

Arm Neoverse-N2 and Cortex-A710 cores are affected by an erratum where the
trbe, under some circumstances, might write upto 64bytes to an address after
the Limit as programmed by the TRBLIMITR_EL1.LIMIT. This might -

  - Corrupt a page in the ring buffer, which may corrupt trace from a
    previous session, consumed by userspace.
  - Hit the guard page at the end of the vmalloc area and raise a fault.

To keep the handling simpler, we always leave the last page from the
range, which TRBE is allowed to write. This can be achieved by ensuring
that we always have more than a PAGE worth space in the range, while
calculating the LIMIT for TRBE. And then the LIMIT pointer can be adjusted
to leave the PAGE (TRBLIMITR.LIMIT -= PAGE_SIZE), out of the TRBE range
while enabling it. This makes sure that the TRBE will only write to an area
within its allowed limit (i.e, [head-head+size]) and we do not have to handle
address faults within the driver.

Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 Documentation/arm64/silicon-errata.rst |  4 +++
 arch/arm64/Kconfig                     | 41 ++++++++++++++++++++++++++
 arch/arm64/kernel/cpu_errata.c         | 20 +++++++++++++
 arch/arm64/tools/cpucaps               |  1 +
 4 files changed, 66 insertions(+)

diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index 569a92411dcd..5342e895fb60 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -96,6 +96,8 @@ stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Cortex-A710     | #2054223        | ARM64_ERRATUM_2054223       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Cortex-A710     | #2224489        | ARM64_ERRATUM_2224489       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N1     | #1188873,1418040| ARM64_ERRATUM_1418040       |
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N1     | #1349291        | N/A                         |
@@ -106,6 +108,8 @@ stable kernels.
 +----------------+-----------------+-----------------+-----------------------------+
 | ARM            | Neoverse-N2     | #2067961        | ARM64_ERRATUM_2067961       |
 +----------------+-----------------+-----------------+-----------------------------+
+| ARM            | Neoverse-N2     | #2253138        | ARM64_ERRATUM_2253138       |
++----------------+-----------------+-----------------+-----------------------------+
 | ARM            | MMU-500         | #841119,826419  | N/A                         |
 +----------------+-----------------+-----------------+-----------------------------+
 +----------------+-----------------+-----------------+-----------------------------+
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 39b78460b9d0..f30029f4a9f9 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -740,6 +740,47 @@ config ARM64_ERRATUM_2067961
 
 	  If unsure, say Y.
 
+config ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
+	bool
+
+config ARM64_ERRATUM_2253138
+	bool "Neoverse-N2: 2253138: workaround TRBE writing to address out-of-range"
+	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
+	depends on CORESIGHT_TRBE
+	default y
+	select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
+	help
+	  This option adds the workaround for ARM Neoverse-N2 erratum 2253138.
+
+	  Affected Neoverse-N2 cores might write to an out-of-range address, not reserved
+	  for TRBE. Under some conditions, the TRBE might generate a write to the next
+	  virtually addressed page following the last page of the TRBE address space
+	  (i.e., the TRBLIMITR_EL1.LIMIT), instead of wrapping around to the base.
+
+	  Work around this in the driver by always making sure that there is a
+	  page beyond the TRBLIMITR_EL1.LIMIT, within the space allowed for the TRBE.
+
+	  If unsure, say Y.
+
+config ARM64_ERRATUM_2224489
+	bool "Cortex-A710: 2224489: workaround TRBE writing to address out-of-range"
+	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
+	depends on CORESIGHT_TRBE
+	default y
+	select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
+	help
+	  This option adds the workaround for ARM Cortex-A710 erratum 2224489.
+
+	  Affected Cortex-A710 cores might write to an out-of-range address, not reserved
+	  for TRBE. Under some conditions, the TRBE might generate a write to the next
+	  virtually addressed page following the last page of the TRBE address space
+	  (i.e., the TRBLIMITR_EL1.LIMIT), instead of wrapping around to the base.
+
+	  Work around this in the driver by always making sure that there is a
+	  page beyond the TRBLIMITR_EL1.LIMIT, within the space allowed for the TRBE.
+
+	  If unsure, say Y.
+
 config CAVIUM_ERRATUM_22375
 	bool "Cavium erratum 22375, 24313"
 	default y
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index bdbeac75ead6..e2978b89d4b8 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -364,6 +364,18 @@ static const struct midr_range tsb_flush_fail_cpus[] = {
 };
 #endif	/* CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE */
 
+#ifdef CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
+static struct midr_range trbe_write_out_of_range_cpus[] = {
+#ifdef CONFIG_ARM64_ERRATUM_2253138
+	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
+#endif
+#ifdef CONFIG_ARM64_ERRATUM_2224489
+	MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
+#endif
+	{},
+};
+#endif /* CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE */
+
 const struct arm64_cpu_capabilities arm64_errata[] = {
 #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
 	{
@@ -577,6 +589,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 		.capability = ARM64_WORKAROUND_TSB_FLUSH_FAILURE,
 		ERRATA_MIDR_RANGE_LIST(tsb_flush_fail_cpus),
 	},
+#endif
+#ifdef CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
+	{
+		.desc = "ARM erratum 2253138 or 2224489",
+		.capability = ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE,
+		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+		CAP_MIDR_RANGE_LIST(trbe_write_out_of_range_cpus),
+	},
 #endif
 	{
 	}
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 2102e15af43d..90628638e0f9 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -55,6 +55,7 @@ WORKAROUND_1508412
 WORKAROUND_1542419
 WORKAROUND_TRBE_OVERWRITE_FILL_MODE
 WORKAROUND_TSB_FLUSH_FAILURE
+WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
 WORKAROUND_CAVIUM_23154
 WORKAROUND_CAVIUM_27456
 WORKAROUND_CAVIUM_30115
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 05/15] coresight: trbe: Add a helper to calculate the trace generated
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (3 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 04/15] arm64: errata: Add detection for TRBE write to out-of-range Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-14 22:31 ` [PATCH v5 06/15] coresight: trbe: Add a helper to pad a given buffer area Suzuki K Poulose
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose

We collect the trace from the TRBE on FILL event from IRQ context
and via update_buffer(), when the event is stopped. Let us
consolidate how we calculate the trace generated into a helper.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
Changes since v2:
  - Fix code style issues
  - Read base pointer directly now. Switch to using cached value
    of the base of the ring buffer, when this changes.
---
 drivers/hwtracing/coresight/coresight-trbe.c | 47 ++++++++++++--------
 1 file changed, 29 insertions(+), 18 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 2825ccb0cf39..5902ef41bfb8 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -499,6 +499,29 @@ static enum trbe_fault_action trbe_get_fault_act(u64 trbsr)
 	return TRBE_FAULT_ACT_SPURIOUS;
 }
 
+static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
+					 struct trbe_buf *buf, bool wrap)
+{
+	u64 write;
+	u64 start_off, end_off;
+
+	/*
+	 * If the TRBE has wrapped around the write pointer has
+	 * wrapped and should be treated as limit.
+	 */
+	if (wrap)
+		write = get_trbe_limit_pointer();
+	else
+		write = get_trbe_write_pointer();
+
+	end_off = write - get_trbe_base_pointer();
+	start_off = PERF_IDX2OFF(handle->head, buf);
+
+	if (WARN_ON_ONCE(end_off < start_off))
+		return 0;
+	return (end_off - start_off);
+}
+
 static void *arm_trbe_alloc_buffer(struct coresight_device *csdev,
 				   struct perf_event *event, void **pages,
 				   int nr_pages, bool snapshot)
@@ -560,9 +583,9 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
 	struct trbe_cpudata *cpudata = dev_get_drvdata(&csdev->dev);
 	struct trbe_buf *buf = config;
 	enum trbe_fault_action act;
-	unsigned long size, offset;
-	unsigned long write, base, status;
+	unsigned long size, status;
 	unsigned long flags;
+	bool wrap = false;
 
 	WARN_ON(buf->cpudata != cpudata);
 	WARN_ON(cpudata->cpu != smp_processor_id());
@@ -602,8 +625,6 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
 	 * handle gets freed in etm_event_stop().
 	 */
 	trbe_drain_and_disable_local();
-	write = get_trbe_write_pointer();
-	base = get_trbe_base_pointer();
 
 	/* Check if there is a pending interrupt and handle it here */
 	status = read_sysreg_s(SYS_TRBSR_EL1);
@@ -627,20 +648,11 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
 			goto done;
 		}
 
-		/*
-		 * Otherwise, the buffer is full and the write pointer
-		 * has reached base. Adjust this back to the Limit pointer
-		 * for correct size. Also, mark the buffer truncated.
-		 */
-		write = get_trbe_limit_pointer();
 		trbe_report_wrap_event(handle);
+		wrap = true;
 	}
 
-	offset = write - base;
-	if (WARN_ON_ONCE(offset < PERF_IDX2OFF(handle->head, buf)))
-		size = 0;
-	else
-		size = offset - PERF_IDX2OFF(handle->head, buf);
+	size = trbe_get_trace_size(handle, buf, wrap);
 
 done:
 	local_irq_restore(flags);
@@ -721,11 +733,10 @@ static int trbe_handle_overflow(struct perf_output_handle *handle)
 {
 	struct perf_event *event = handle->event;
 	struct trbe_buf *buf = etm_perf_sink_config(handle);
-	unsigned long offset, size;
+	unsigned long size;
 	struct etm_event_data *event_data;
 
-	offset = get_trbe_limit_pointer() - get_trbe_base_pointer();
-	size = offset - PERF_IDX2OFF(handle->head, buf);
+	size = trbe_get_trace_size(handle, buf, true);
 	if (buf->snapshot)
 		handle->head += size;
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 06/15] coresight: trbe: Add a helper to pad a given buffer area
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (4 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 05/15] coresight: trbe: Add a helper to calculate the trace generated Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-14 22:31 ` [PATCH v5 07/15] coresight: trbe: Decouple buffer base from the hardware base Suzuki K Poulose
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose

Refactor the helper to pad a given AUX buffer area to allow
"filling" ignore packets, without moving any handle pointers.
This will be useful in working around errata, where we may
have to fill the buffer after a session.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-trbe.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 5902ef41bfb8..3de03136285f 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -199,12 +199,18 @@ static void trbe_stop_and_truncate_event(struct perf_output_handle *handle)
  * consumed from the user space. The enabled TRBE buffer area is a moving subset of
  * the allocated perf auxiliary buffer.
  */
+
+static void __trbe_pad_buf(struct trbe_buf *buf, u64 offset, int len)
+{
+	memset((void *)buf->trbe_base + offset, ETE_IGNORE_PACKET, len);
+}
+
 static void trbe_pad_buf(struct perf_output_handle *handle, int len)
 {
 	struct trbe_buf *buf = etm_perf_sink_config(handle);
 	u64 head = PERF_IDX2OFF(handle->head, buf);
 
-	memset((void *)buf->trbe_base + head, ETE_IGNORE_PACKET, len);
+	__trbe_pad_buf(buf, head, len);
 	if (!buf->snapshot)
 		perf_aux_output_skip(handle, len);
 }
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 07/15] coresight: trbe: Decouple buffer base from the hardware base
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (5 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 06/15] coresight: trbe: Add a helper to pad a given buffer area Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-14 22:31 ` [PATCH v5 08/15] coresight: trbe: Allow driver to choose a different alignment Suzuki K Poulose
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose

We always set the TRBBASER_EL1 to the base of the virtual ring
buffer. We are about to change this for working around an erratum.
So, in preparation to that, allow the driver to choose a different
base for the TRBBASER_EL1 (which is within the buffer range).

Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-trbe.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 3de03136285f..58796ff425a4 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -57,6 +57,8 @@ struct trbe_buf {
 	 * trbe_limit sibling pointers.
 	 */
 	unsigned long trbe_base;
+	/* The base programmed into the TRBE */
+	unsigned long trbe_hw_base;
 	unsigned long trbe_limit;
 	unsigned long trbe_write;
 	int nr_pages;
@@ -470,12 +472,13 @@ static void set_trbe_limit_pointer_enabled(unsigned long addr)
 
 static void trbe_enable_hw(struct trbe_buf *buf)
 {
-	WARN_ON(buf->trbe_write < buf->trbe_base);
+	WARN_ON(buf->trbe_hw_base < buf->trbe_base);
+	WARN_ON(buf->trbe_write < buf->trbe_hw_base);
 	WARN_ON(buf->trbe_write >= buf->trbe_limit);
 	set_trbe_disabled();
 	isb();
 	clr_trbe_status();
-	set_trbe_base_pointer(buf->trbe_base);
+	set_trbe_base_pointer(buf->trbe_hw_base);
 	set_trbe_write_pointer(buf->trbe_write);
 
 	/*
@@ -520,7 +523,12 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
 	else
 		write = get_trbe_write_pointer();
 
-	end_off = write - get_trbe_base_pointer();
+	/*
+	 * TRBE may use a different base address than the base
+	 * of the ring buffer. Thus use the beginning of the ring
+	 * buffer to compute the offsets.
+	 */
+	end_off = write - buf->trbe_base;
 	start_off = PERF_IDX2OFF(handle->head, buf);
 
 	if (WARN_ON_ONCE(end_off < start_off))
@@ -678,6 +686,8 @@ static int __arm_trbe_enable(struct trbe_buf *buf,
 		trbe_stop_and_truncate_event(handle);
 		return -ENOSPC;
 	}
+	/* Set the base of the TRBE to the buffer base */
+	buf->trbe_hw_base = buf->trbe_base;
 	*this_cpu_ptr(buf->cpudata->drvdata->handle) = handle;
 	trbe_enable_hw(buf);
 	return 0;
@@ -771,7 +781,7 @@ static bool is_perf_trbe(struct perf_output_handle *handle)
 	struct trbe_drvdata *drvdata = cpudata->drvdata;
 	int cpu = smp_processor_id();
 
-	WARN_ON(buf->trbe_base != get_trbe_base_pointer());
+	WARN_ON(buf->trbe_hw_base != get_trbe_base_pointer());
 	WARN_ON(buf->trbe_limit != get_trbe_limit_pointer());
 
 	if (cpudata->mode != CS_MODE_PERF)
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 08/15] coresight: trbe: Allow driver to choose a different alignment
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (6 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 07/15] coresight: trbe: Decouple buffer base from the hardware base Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-14 22:31 ` [PATCH v5 09/15] coresight: trbe: Add infrastructure for Errata handling Suzuki K Poulose
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose

The TRBE hardware mandates a minimum alignment for the TRBPTR_EL1,
advertised via the TRBIDR_EL1. This is used by the driver to
align the buffer write head. This patch allows the driver to
choose a different alignment from that of the hardware, by
decoupling the alignment tracking. This will be useful for
working around errata.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-trbe.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 58796ff425a4..f8c04c428780 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -67,8 +67,18 @@ struct trbe_buf {
 	struct trbe_cpudata *cpudata;
 };
 
+/*
+ * struct trbe_cpudata: TRBE instance specific data
+ * @trbe_flag		- TRBE dirty/access flag support
+ * @trbe_hw_align	- Actual TRBE alignment required for TRBPTR_EL1.
+ * @trbe_align		- Software alignment used for the TRBPTR_EL1,
+ * @cpu			- CPU this TRBE belongs to.
+ * @mode		- Mode of current operation. (perf/disabled)
+ * @drvdata		- TRBE specific drvdata
+ */
 struct trbe_cpudata {
 	bool trbe_flag;
+	u64 trbe_hw_align;
 	u64 trbe_align;
 	int cpu;
 	enum cs_mode mode;
@@ -875,7 +885,7 @@ static ssize_t align_show(struct device *dev, struct device_attribute *attr, cha
 {
 	struct trbe_cpudata *cpudata = dev_get_drvdata(dev);
 
-	return sprintf(buf, "%llx\n", cpudata->trbe_align);
+	return sprintf(buf, "%llx\n", cpudata->trbe_hw_align);
 }
 static DEVICE_ATTR_RO(align);
 
@@ -963,11 +973,13 @@ static void arm_trbe_probe_cpu(void *info)
 		goto cpu_clear;
 	}
 
-	cpudata->trbe_align = 1ULL << get_trbe_address_align(trbidr);
-	if (cpudata->trbe_align > SZ_2K) {
+	cpudata->trbe_hw_align = 1ULL << get_trbe_address_align(trbidr);
+	if (cpudata->trbe_hw_align > SZ_2K) {
 		pr_err("Unsupported alignment on cpu %d\n", cpu);
 		goto cpu_clear;
 	}
+
+	cpudata->trbe_align = cpudata->trbe_hw_align;
 	cpudata->trbe_flag = get_trbe_flag_update(trbidr);
 	cpudata->cpu = cpu;
 	cpudata->drvdata = drvdata;
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 09/15] coresight: trbe: Add infrastructure for Errata handling
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (7 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 08/15] coresight: trbe: Allow driver to choose a different alignment Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-19  5:25   ` Anshuman Khandual
  2021-10-29 10:31   ` Arnd Bergmann
  2021-10-14 22:31 ` [PATCH v5 10/15] coresight: trbe: Workaround TRBE errata overwrite in FILL mode Suzuki K Poulose
                   ` (5 subsequent siblings)
  14 siblings, 2 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose

Add a minimal infrastructure to keep track of the errata
affecting the given TRBE instance. Given that we have
heterogeneous CPUs, we have to manage the list per-TRBE
instance to be able to apply the work around as needed.
Thus we will need to check if individual CPUs are affected
by the erratum.

We rely on the arm64 errata framework for the actual
description and the discovery of a given erratum, to
keep the Erratum work around at a central place and
benefit from the code and the advertisement from the
kernel. Though we could reuse the "this_cpu_has_cap()"
to apply an erratum work around, it is a bit of a heavy
operation, as it must go through the "erratum" detection
check on the CPU every time it is called (e.g, scanning
through a table of affected MIDRs). Since we need
to do this check for every session, may be multiple
times (depending on the wrok around), we could save
the cycles by caching the affected errata per-CPU
instance in the per-CPU struct trbe_cpudata.

Since we are only interested in the errata affecting
the TRBE driver, we only need to track a very few of them
per-CPU.  Thus we use a local mapping of the CPUCAP for the
erratum to avoid bloating up a bitmap for trbe_cpudata.

i.e, each arm64 TRBE erratum bit is assigned a "index"
within the driver to track. Each trbe instance updates
the list of affected erratum at probe time on the CPU.
This makes sure that we can easily access the list of
errata on a given TRBE instance without much overhead.

Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
Changes since v4:
  - Ensure the arm_trbe_probe_cpu() is called from non preemptible
    context for hotplugged CPUs
Changes since v2:
  - Automatically define TRBE_ERRATA_MAX
  - Add some basic sanity check to make sure the new entries
    are added in order.
  - Describe the design choice of caching CPU local errata
    in trbe_cpudata instead of using this_cpu_has_cap()
Changes since v1:
  - Flip the order of args for trbe_has_erratum()
  - Move erratum detection further down in the sequence
---
 drivers/hwtracing/coresight/coresight-trbe.c | 71 +++++++++++++++++++-
 1 file changed, 68 insertions(+), 3 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index f8c04c428780..314e5e7374c7 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -16,6 +16,7 @@
 #define pr_fmt(fmt) DRVNAME ": " fmt
 
 #include <asm/barrier.h>
+
 #include "coresight-self-hosted-trace.h"
 #include "coresight-trbe.h"
 
@@ -67,14 +68,43 @@ struct trbe_buf {
 	struct trbe_cpudata *cpudata;
 };
 
+/*
+ * TRBE erratum list
+ *
+ * The errata are defined in arm64 generic cpu_errata framework.
+ * Since the errata work arounds could be applied individually
+ * to the affected CPUs inside the TRBE driver, we need to know if
+ * a given CPU is affected by the erratum. Unlike the other erratum
+ * work arounds, TRBE driver needs to check multiple times during
+ * a trace session. Thus we need a quicker access to per-CPU
+ * errata and not issue costly this_cpu_has_cap() everytime.
+ * We keep a set of the affected errata in trbe_cpudata, per TRBE.
+ *
+ * We rely on the corresponding cpucaps to be defined for a given
+ * TRBE erratum. We map the given cpucap into a TRBE internal number
+ * to make the tracking of the errata lean.
+ *
+ * This helps in :
+ *   - Not duplicating the detection logic
+ *   - Streamlined detection of erratum across the system
+ */
+
+static int trbe_errata_cpucaps[] = {
+	-1,		/* Sentinel, must be the last entry */
+};
+
+/* The total number of listed errata in trbe_errata_cpucaps */
+#define TRBE_ERRATA_MAX			(ARRAY_SIZE(trbe_errata_cpucaps) - 1)
+
 /*
  * struct trbe_cpudata: TRBE instance specific data
  * @trbe_flag		- TRBE dirty/access flag support
  * @trbe_hw_align	- Actual TRBE alignment required for TRBPTR_EL1.
- * @trbe_align		- Software alignment used for the TRBPTR_EL1,
+ * @trbe_align		- Software alignment used for the TRBPTR_EL1
  * @cpu			- CPU this TRBE belongs to.
  * @mode		- Mode of current operation. (perf/disabled)
  * @drvdata		- TRBE specific drvdata
+ * @errata		- Bit map for the errata on this TRBE.
  */
 struct trbe_cpudata {
 	bool trbe_flag;
@@ -84,6 +114,7 @@ struct trbe_cpudata {
 	enum cs_mode mode;
 	struct trbe_buf *buf;
 	struct trbe_drvdata *drvdata;
+	DECLARE_BITMAP(errata, TRBE_ERRATA_MAX);
 };
 
 struct trbe_drvdata {
@@ -96,6 +127,25 @@ struct trbe_drvdata {
 	struct platform_device *pdev;
 };
 
+static void trbe_check_errata(struct trbe_cpudata *cpudata)
+{
+	int i;
+
+	for (i = 0; i < TRBE_ERRATA_MAX; i++) {
+		int cap = trbe_errata_cpucaps[i];
+
+		if (WARN_ON_ONCE(cap < 0))
+			return;
+		if (this_cpu_has_cap(cap))
+			set_bit(i, cpudata->errata);
+	}
+}
+
+static inline bool trbe_has_erratum(struct trbe_cpudata *cpudata, int i)
+{
+	return (i < TRBE_ERRATA_MAX) && test_bit(i, cpudata->errata);
+}
+
 static int trbe_alloc_node(struct perf_event *event)
 {
 	if (event->cpu == -1)
@@ -952,6 +1002,9 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
 	cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
 }
 
+/*
+ * Must be called with preemption disabled, for trbe_check_errata().
+ */
 static void arm_trbe_probe_cpu(void *info)
 {
 	struct trbe_drvdata *drvdata = info;
@@ -979,6 +1032,12 @@ static void arm_trbe_probe_cpu(void *info)
 		goto cpu_clear;
 	}
 
+	/*
+	 * Run the TRBE erratum checks, now that we know
+	 * this instance is about to be registered.
+	 */
+	trbe_check_errata(cpudata);
+
 	cpudata->trbe_align = cpudata->trbe_hw_align;
 	cpudata->trbe_flag = get_trbe_flag_update(trbidr);
 	cpudata->cpu = cpu;
@@ -1032,18 +1091,24 @@ static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
 	return 0;
 }
 
+static void arm_trbe_probe_hotplugged_cpu(struct trbe_drvdata *drvdata)
+{
+	preempt_disable();
+	arm_trbe_probe_cpu(drvdata);
+	preempt_enable();
+}
+
 static int arm_trbe_cpu_startup(unsigned int cpu, struct hlist_node *node)
 {
 	struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
 
 	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
-
 		/*
 		 * If this CPU was not probed for TRBE,
 		 * initialize it now.
 		 */
 		if (!coresight_get_percpu_sink(cpu)) {
-			arm_trbe_probe_cpu(drvdata);
+			arm_trbe_probe_hotplugged_cpu(drvdata);
 			if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
 				arm_trbe_register_coresight_cpu(drvdata, cpu);
 			if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 10/15] coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (8 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 09/15] coresight: trbe: Add infrastructure for Errata handling Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-18 15:51   ` Mathieu Poirier
  2021-10-19  5:42   ` Anshuman Khandual
  2021-10-14 22:31 ` [PATCH v5 11/15] coresight: trbe: Add a helper to determine the minimum buffer size Suzuki K Poulose
                   ` (4 subsequent siblings)
  14 siblings, 2 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose

ARM Neoverse-N2 (#2139208) and Cortex-A710(##2119858) suffers from
an erratum, which when triggered, might cause the TRBE to overwrite
the trace data already collected in FILL mode, in the event of a WRAP.
i.e, the TRBE doesn't stop writing the data, instead wraps to the base
and could write upto 3 cache line size worth trace. Thus, this could
corrupt the trace at the "BASE" pointer.

The workaround is to program the write pointer 256bytes from the
base, such that if the erratum is triggered, it doesn't overwrite
the trace data that was captured. This skipped region could be
padded with ignore packets at the end of the session, so that
the decoder sees a continuous buffer with some padding at the
beginning. The trace data written at the base is considered
lost as the limit could have been in the middle of the perf
ring buffer, and jumping to the "base" is not acceptable.
We set the flags already to indicate that some amount of trace
was lost during the FILL event IRQ. So this is fine.

One important change with the work around is, we program the
TRBBASER_EL1 to current page where we are allowed to write.
Otherwise, it could overwrite a region that may be consumed
by the perf. Towards this, we always make sure that the
"handle->head" and thus the trbe_write is PAGE_SIZE aligned,
so that we can set the BASE to the PAGE base and move the
TRBPTR to the 256bytes offset.

Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
Changes since v2:
 - Updated the ASCII art to include better description of
   all the steps in the work around
Change since v1:
 - Updated comment with ASCII art
 - Add _BYTES suffix for the space to skip for the work around.
---
 drivers/hwtracing/coresight/coresight-trbe.c | 169 +++++++++++++++++--
 1 file changed, 158 insertions(+), 11 deletions(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 314e5e7374c7..b56b166b2dec 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -16,6 +16,7 @@
 #define pr_fmt(fmt) DRVNAME ": " fmt
 
 #include <asm/barrier.h>
+#include <asm/cpufeature.h>
 
 #include "coresight-self-hosted-trace.h"
 #include "coresight-trbe.h"
@@ -88,14 +89,22 @@ struct trbe_buf {
  *   - Not duplicating the detection logic
  *   - Streamlined detection of erratum across the system
  */
+#define TRBE_WORKAROUND_OVERWRITE_FILL_MODE	0
 
 static int trbe_errata_cpucaps[] = {
+	[TRBE_WORKAROUND_OVERWRITE_FILL_MODE] = ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE,
 	-1,		/* Sentinel, must be the last entry */
 };
 
 /* The total number of listed errata in trbe_errata_cpucaps */
 #define TRBE_ERRATA_MAX			(ARRAY_SIZE(trbe_errata_cpucaps) - 1)
 
+/*
+ * Safe limit for the number of bytes that may be overwritten
+ * when ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE is triggered.
+ */
+#define TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES	256
+
 /*
  * struct trbe_cpudata: TRBE instance specific data
  * @trbe_flag		- TRBE dirty/access flag support
@@ -146,6 +155,11 @@ static inline bool trbe_has_erratum(struct trbe_cpudata *cpudata, int i)
 	return (i < TRBE_ERRATA_MAX) && test_bit(i, cpudata->errata);
 }
 
+static inline bool trbe_may_overwrite_in_fill_mode(struct trbe_cpudata *cpudata)
+{
+	return trbe_has_erratum(cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE);
+}
+
 static int trbe_alloc_node(struct perf_event *event)
 {
 	if (event->cpu == -1)
@@ -549,10 +563,13 @@ static void trbe_enable_hw(struct trbe_buf *buf)
 	set_trbe_limit_pointer_enabled(buf->trbe_limit);
 }
 
-static enum trbe_fault_action trbe_get_fault_act(u64 trbsr)
+static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *handle,
+						 u64 trbsr)
 {
 	int ec = get_trbe_ec(trbsr);
 	int bsc = get_trbe_bsc(trbsr);
+	struct trbe_buf *buf = etm_perf_sink_config(handle);
+	struct trbe_cpudata *cpudata = buf->cpudata;
 
 	WARN_ON(is_trbe_running(trbsr));
 	if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
@@ -561,10 +578,16 @@ static enum trbe_fault_action trbe_get_fault_act(u64 trbsr)
 	if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
 		return TRBE_FAULT_ACT_FATAL;
 
-	if (is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) {
-		if (get_trbe_write_pointer() == get_trbe_base_pointer())
-			return TRBE_FAULT_ACT_WRAP;
-	}
+	/*
+	 * If the trbe is affected by TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
+	 * it might write data after a WRAP event in the fill mode.
+	 * Thus the check TRBPTR == TRBBASER will not be honored.
+	 */
+	if ((is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) &&
+	    (trbe_may_overwrite_in_fill_mode(cpudata) ||
+	     get_trbe_write_pointer() == get_trbe_base_pointer()))
+		return TRBE_FAULT_ACT_WRAP;
+
 	return TRBE_FAULT_ACT_SPURIOUS;
 }
 
@@ -573,6 +596,8 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
 {
 	u64 write;
 	u64 start_off, end_off;
+	u64 size;
+	u64 overwrite_skip = TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES;
 
 	/*
 	 * If the TRBE has wrapped around the write pointer has
@@ -593,7 +618,18 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
 
 	if (WARN_ON_ONCE(end_off < start_off))
 		return 0;
-	return (end_off - start_off);
+
+	size = end_off - start_off;
+	/*
+	 * If the TRBE is affected by the following erratum, we must fill
+	 * the space we skipped with IGNORE packets. And we are always
+	 * guaranteed to have at least a PAGE_SIZE space in the buffer.
+	 */
+	if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE) &&
+	    !WARN_ON(size < overwrite_skip))
+		__trbe_pad_buf(buf, start_off, overwrite_skip);
+
+	return size;
 }
 
 static void *arm_trbe_alloc_buffer(struct coresight_device *csdev,
@@ -712,7 +748,7 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
 		clr_trbe_irq();
 		isb();
 
-		act = trbe_get_fault_act(status);
+		act = trbe_get_fault_act(handle, status);
 		/*
 		 * If this was not due to a WRAP event, we have some
 		 * errors and as such buffer is empty.
@@ -736,21 +772,117 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
 	return size;
 }
 
+
+static int trbe_apply_work_around_before_enable(struct trbe_buf *buf)
+{
+	/*
+	 * TRBE_WORKAROUND_OVERWRITE_FILL_MODE causes the TRBE to overwrite a few cache
+	 * line size from the "TRBBASER_EL1" in the event of a "FILL".
+	 * Thus, we could loose some amount of the trace at the base.
+	 *
+	 * Before Fix:
+	 *
+	 *  normal-BASE     head (normal-TRBPTR)         tail (normal-LIMIT)
+	 *  |                   \/                       /
+	 *   -------------------------------------------------------------
+	 *  |   Pg0      |   Pg1       |           |          |  PgN     |
+	 *   -------------------------------------------------------------
+	 *
+	 * In the normal course of action, we would set the TRBBASER to the
+	 * beginning of the ring-buffer (normal-BASE). But with the erratum,
+	 * the TRBE could overwrite the contents at the "normal-BASE", after
+	 * hitting the "normal-LIMIT", since it doesn't stop as expected. And
+	 * this is wrong. This could result in overwriting trace collected in
+	 * one of the previous runs, being consumed by the user. So we must
+	 * always make sure that the TRBBASER is within the region
+	 * [head, head+size]. Note that TRBBASER must be PAGE aligned,
+	 *
+	 *  After moving the BASE:
+	 *
+	 *  normal-BASE     head (normal-TRBPTR)         tail (normal-LIMIT)
+	 *  |                   \/                       /
+	 *   -------------------------------------------------------------
+	 *  |         |          |xyzdef.     |..   tuvw|                |
+	 *   -------------------------------------------------------------
+	 *                      /
+	 *              New-BASER
+	 *
+	 * Also, we would set the TRBPTR to head (after adjusting for
+	 * alignment) at normal-PTR. This would mean that the last few bytes
+	 * of the trace (say, "xyz") might overwrite the first few bytes of
+	 * trace written ("abc"). More importantly they will appear in what
+	 * userspace sees as the beginning of the trace, which is wrong. We may
+	 * not always have space to move the latest trace "xyz" to the correct
+	 * order as it must appear beyond the LIMIT. (i.e, [head..head+size]).
+	 * Thus it is easier to ignore those bytes than to complicate the
+	 * driver to move it, assuming that the erratum was triggered and
+	 * doing additional checks to see if there is indeed allowed space at
+	 * TRBLIMITR.LIMIT.
+	 *
+	 *  Thus the full workaround will move the BASE and the PTR and would
+	 *  look like (after padding at the skipped bytes at the end of
+	 *  session) :
+	 *
+	 *  normal-BASE     head (normal-TRBPTR)         tail (normal-LIMIT)
+	 *  |                   \/                       /
+	 *   -------------------------------------------------------------
+	 *  |         |          |///abc..     |..  rst|                |
+	 *   -------------------------------------------------------------
+	 *                      /    |
+	 *              New-BASER    New-TRBPTR
+	 *
+	 * To summarize, with the work around:
+	 *
+	 *  - We always align the offset for the next session to PAGE_SIZE
+	 *    (This is to ensure we can program the TRBBASER to this offset
+	 *    within the region [head...head+size]).
+	 *
+	 *  - At TRBE enable:
+	 *     - Set the TRBBASER to the page aligned offset of the current
+	 *       proposed write offset. (which is guaranteed to be aligned
+	 *       as above)
+	 *     - Move the TRBPTR to skip first 256bytes (that might be
+	 *       overwritten with the erratum). This ensures that the trace
+	 *       generated in the session is not re-written.
+	 *
+	 *  - At trace collection:
+	 *     - Pad the 256bytes skipped above again with IGNORE packets.
+	 */
+	if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE)) {
+		if (WARN_ON(!IS_ALIGNED(buf->trbe_write, PAGE_SIZE)))
+			return -EINVAL;
+		buf->trbe_hw_base = buf->trbe_write;
+		buf->trbe_write += TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES;
+	}
+
+	return 0;
+}
+
 static int __arm_trbe_enable(struct trbe_buf *buf,
 			     struct perf_output_handle *handle)
 {
+	int ret = 0;
+
 	perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
 	buf->trbe_limit = compute_trbe_buffer_limit(handle);
 	buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
 	if (buf->trbe_limit == buf->trbe_base) {
-		trbe_stop_and_truncate_event(handle);
-		return -ENOSPC;
+		ret = -ENOSPC;
+		goto err;
 	}
 	/* Set the base of the TRBE to the buffer base */
 	buf->trbe_hw_base = buf->trbe_base;
+
+	ret = trbe_apply_work_around_before_enable(buf);
+	if (ret)
+		goto err;
+
 	*this_cpu_ptr(buf->cpudata->drvdata->handle) = handle;
 	trbe_enable_hw(buf);
 	return 0;
+err:
+	trbe_stop_and_truncate_event(handle);
+	return ret;
 }
 
 static int arm_trbe_enable(struct coresight_device *csdev, u32 mode, void *data)
@@ -890,7 +1022,7 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
 	if (!is_perf_trbe(handle))
 		return IRQ_NONE;
 
-	act = trbe_get_fault_act(status);
+	act = trbe_get_fault_act(handle, status);
 	switch (act) {
 	case TRBE_FAULT_ACT_WRAP:
 		truncated = !!trbe_handle_overflow(handle);
@@ -1038,7 +1170,22 @@ static void arm_trbe_probe_cpu(void *info)
 	 */
 	trbe_check_errata(cpudata);
 
-	cpudata->trbe_align = cpudata->trbe_hw_align;
+	/*
+	 * If the TRBE is affected by erratum TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
+	 * we must always program the TBRPTR_EL1, 256bytes from a page
+	 * boundary, with TRBBASER_EL1 set to the page, to prevent
+	 * TRBE over-writing 256bytes at TRBBASER_EL1 on FILL event.
+	 *
+	 * Thus make sure we always align our write pointer to a PAGE_SIZE,
+	 * which also guarantees that we have at least a PAGE_SIZE space in
+	 * the buffer (TRBLIMITR is PAGE aligned) and thus we can skip
+	 * the required bytes at the base.
+	 */
+	if (trbe_may_overwrite_in_fill_mode(cpudata))
+		cpudata->trbe_align = PAGE_SIZE;
+	else
+		cpudata->trbe_align = cpudata->trbe_hw_align;
+
 	cpudata->trbe_flag = get_trbe_flag_update(trbidr);
 	cpudata->cpu = cpu;
 	cpudata->drvdata = drvdata;
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 11/15] coresight: trbe: Add a helper to determine the minimum buffer size
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (9 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 10/15] coresight: trbe: Workaround TRBE errata overwrite in FILL mode Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-14 22:31 ` [PATCH v5 12/15] coresight: trbe: Make sure we have enough space Suzuki K Poulose
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose

For the TRBE to operate, we need a minimum space available to collect
meaningful trace session. This is currently a few bytes, but we may need
to extend this for working around errata. So, abstract this into a helper
function.

Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-trbe.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index b56b166b2dec..4a50309a892d 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -303,6 +303,11 @@ static unsigned long trbe_snapshot_offset(struct perf_output_handle *handle)
 	return buf->nr_pages * PAGE_SIZE;
 }
 
+static u64 trbe_min_trace_buf_size(struct perf_output_handle *handle)
+{
+	return TRBE_TRACE_MIN_BUF_SIZE;
+}
+
 /*
  * TRBE Limit Calculation
  *
@@ -473,7 +478,7 @@ static unsigned long trbe_normal_offset(struct perf_output_handle *handle)
 	 * have space for a meaningful run, we rather pad it
 	 * and start fresh.
 	 */
-	if (limit && (limit - head < TRBE_TRACE_MIN_BUF_SIZE)) {
+	if (limit && ((limit - head) < trbe_min_trace_buf_size(handle))) {
 		trbe_pad_buf(handle, limit - head);
 		limit = __trbe_normal_offset(handle);
 	}
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 12/15] coresight: trbe: Make sure we have enough space
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (10 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 11/15] coresight: trbe: Add a helper to determine the minimum buffer size Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-19  5:55   ` Anshuman Khandual
  2021-10-14 22:31 ` [PATCH v5 13/15] coresight: trbe: Work around write to out of range Suzuki K Poulose
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose

The TRBE driver makes sure that there is enough space for a meaningful
run, otherwise pads the given space and restarts the offset calculation
once. But there is no guarantee that we may find space or hit "no space".
Make sure that we repeat the step until, either :
  - We have the minimum space
   OR
  - There is NO space at all.

Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-trbe.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 4a50309a892d..5fb9f49eab33 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -477,10 +477,14 @@ static unsigned long trbe_normal_offset(struct perf_output_handle *handle)
 	 * If the head is too close to the limit and we don't
 	 * have space for a meaningful run, we rather pad it
 	 * and start fresh.
+	 *
+	 * We might have to do this more than once to make sure
+	 * we have enough required space.
 	 */
-	if (limit && ((limit - head) < trbe_min_trace_buf_size(handle))) {
+	while (limit && ((limit - head) < trbe_min_trace_buf_size(handle))) {
 		trbe_pad_buf(handle, limit - head);
 		limit = __trbe_normal_offset(handle);
+		head = PERF_IDX2OFF(handle->head, buf);
 	}
 	return limit;
 }
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 13/15] coresight: trbe: Work around write to out of range
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (11 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 12/15] coresight: trbe: Make sure we have enough space Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-19  5:57   ` Anshuman Khandual
  2021-10-14 22:31 ` [PATCH v5 14/15] arm64: errata: Enable workaround for TRBE overwrite in FILL mode Suzuki K Poulose
  2021-10-14 22:31 ` [PATCH v5 15/15] arm64: errata: Enable TRBE workaround for write to out-of-range address Suzuki K Poulose
  14 siblings, 1 reply; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose

TRBE implementations affected by Arm erratum (2253138 or 2224489), could
write to the next address after the TRBLIMITR.LIMIT, instead of wrapping
to the TRBBASER. This implies that the TRBE could potentially corrupt :

  - A page used by the rest of the kernel/user (if the LIMIT = end of
    perf ring buffer)
  - A page within the ring buffer, but outside the driver's range.
    [head, head + size]. This may contain some trace data, may be
    consumed by the userspace.

We workaround this erratum by :
  - Making sure that there is at least an extra PAGE space left in the
    TRBE's range than we normally assign. This will be additional to other
    restrictions (e.g, the TRBE alignment for working around
    TRBE_WORKAROUND_OVERWRITE_IN_FILL_MODE, where there is a minimum of
    PAGE_SIZE. Thus we would have 2 * PAGE_SIZE)

  - Adjust the LIMIT to leave the last PAGE_SIZE out of the TRBE's allowed
    range (i.e, TRBEBASER...TRBLIMITR.LIMIT), by :

        TRBLIMITR.LIMIT -= PAGE_SIZE

Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 drivers/hwtracing/coresight/coresight-trbe.c | 63 +++++++++++++++++++-
 1 file changed, 62 insertions(+), 1 deletion(-)

diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
index 5fb9f49eab33..eca2e7e94079 100644
--- a/drivers/hwtracing/coresight/coresight-trbe.c
+++ b/drivers/hwtracing/coresight/coresight-trbe.c
@@ -90,9 +90,11 @@ struct trbe_buf {
  *   - Streamlined detection of erratum across the system
  */
 #define TRBE_WORKAROUND_OVERWRITE_FILL_MODE	0
+#define TRBE_WORKAROUND_WRITE_OUT_OF_RANGE	1
 
 static int trbe_errata_cpucaps[] = {
 	[TRBE_WORKAROUND_OVERWRITE_FILL_MODE] = ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE,
+	[TRBE_WORKAROUND_WRITE_OUT_OF_RANGE] = ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE,
 	-1,		/* Sentinel, must be the last entry */
 };
 
@@ -160,6 +162,11 @@ static inline bool trbe_may_overwrite_in_fill_mode(struct trbe_cpudata *cpudata)
 	return trbe_has_erratum(cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE);
 }
 
+static inline bool trbe_may_write_out_of_range(struct trbe_cpudata *cpudata)
+{
+	return trbe_has_erratum(cpudata, TRBE_WORKAROUND_WRITE_OUT_OF_RANGE);
+}
+
 static int trbe_alloc_node(struct perf_event *event)
 {
 	if (event->cpu == -1)
@@ -305,7 +312,21 @@ static unsigned long trbe_snapshot_offset(struct perf_output_handle *handle)
 
 static u64 trbe_min_trace_buf_size(struct perf_output_handle *handle)
 {
-	return TRBE_TRACE_MIN_BUF_SIZE;
+	u64 size = TRBE_TRACE_MIN_BUF_SIZE;
+	struct trbe_buf *buf = etm_perf_sink_config(handle);
+	struct trbe_cpudata *cpudata = buf->cpudata;
+
+	/*
+	 * When the TRBE is affected by an erratum that could make it
+	 * write to the next "virtually addressed" page beyond the LIMIT.
+	 * We need to make sure there is always a PAGE after the LIMIT,
+	 * within the buffer. Thus we ensure there is at least an extra
+	 * page than normal. With this we could then adjust the LIMIT
+	 * pointer down by a PAGE later.
+	 */
+	if (trbe_may_write_out_of_range(cpudata))
+		size += PAGE_SIZE;
+	return size;
 }
 
 /*
@@ -611,6 +632,17 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
 	/*
 	 * If the TRBE has wrapped around the write pointer has
 	 * wrapped and should be treated as limit.
+	 *
+	 * When the TRBE is affected by TRBE_WORKAROUND_WRITE_OUT_OF_RANGE,
+	 * it may write upto 64bytes beyond the "LIMIT". The driver already
+	 * keeps a valid page next to the LIMIT and we could potentially
+	 * consume the trace data that may have been collected there. But we
+	 * cannot be really sure it is available, and the TRBPTR may not
+	 * indicate the same. Also, affected cores are also affected by another
+	 * erratum which forces the PAGE_SIZE alignment on the TRBPTR, and thus
+	 * could potentially pad an entire PAGE_SIZE - 64bytes, to get those
+	 * 64bytes. Thus we ignore the potential triggering of the erratum
+	 * on WRAP and limit the data to LIMIT.
 	 */
 	if (wrap)
 		write = get_trbe_limit_pointer();
@@ -864,6 +896,35 @@ static int trbe_apply_work_around_before_enable(struct trbe_buf *buf)
 		buf->trbe_write += TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES;
 	}
 
+	/*
+	 * TRBE_WORKAROUND_WRITE_OUT_OF_RANGE could cause the TRBE to write to
+	 * the next page after the TRBLIMITR.LIMIT. For perf, the "next page"
+	 * may be:
+	 *     - The page beyond the ring buffer. This could mean, TRBE could
+	 *       corrupt another entity (kernel / user)
+	 *     - A portion of the "ring buffer" consumed by the userspace.
+	 *       i.e, a page outisde [head, head + size].
+	 *
+	 * We work around this by:
+	 *     - Making sure that we have at least an extra space of PAGE left
+	 *       in the ring buffer [head, head + size], than we normally do
+	 *       without the erratum. See trbe_min_trace_buf_size().
+	 *
+	 *     - Adjust the TRBLIMITR.LIMIT to leave the extra PAGE outside
+	 *       the TRBE's range (i.e [TRBBASER, TRBLIMITR.LIMI] ).
+	 */
+	if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_WRITE_OUT_OF_RANGE)) {
+		s64 space = buf->trbe_limit - buf->trbe_write;
+		/*
+		 * We must have more than a PAGE_SIZE worth space in the proposed
+		 * range for the TRBE.
+		 */
+		if (WARN_ON(space <= PAGE_SIZE ||
+			    !IS_ALIGNED(buf->trbe_limit, PAGE_SIZE)))
+			return -EINVAL;
+		buf->trbe_limit -= PAGE_SIZE;
+	}
+
 	return 0;
 }
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 14/15] arm64: errata: Enable workaround for TRBE overwrite in FILL mode
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (12 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 13/15] coresight: trbe: Work around write to out of range Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-18 15:54   ` Mathieu Poirier
                     ` (2 more replies)
  2021-10-14 22:31 ` [PATCH v5 15/15] arm64: errata: Enable TRBE workaround for write to out-of-range address Suzuki K Poulose
  14 siblings, 3 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose

With the workaround enabled in TRBE, enable the config entries
to be built without COMPILE_TEST

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/Kconfig | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f30029f4a9f9..f72fa44d6182 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -672,7 +672,6 @@ config ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
 config ARM64_ERRATUM_2119858
 	bool "Cortex-A710: 2119858: workaround TRBE overwriting trace data in FILL mode"
 	default y
-	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
 	depends on CORESIGHT_TRBE
 	select ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
 	help
@@ -691,7 +690,6 @@ config ARM64_ERRATUM_2119858
 config ARM64_ERRATUM_2139208
 	bool "Neoverse-N2: 2139208: workaround TRBE overwriting trace data in FILL mode"
 	default y
-	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
 	depends on CORESIGHT_TRBE
 	select ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
 	help
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* [PATCH v5 15/15] arm64: errata: Enable TRBE workaround for write to out-of-range address
  2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
                   ` (13 preceding siblings ...)
  2021-10-14 22:31 ` [PATCH v5 14/15] arm64: errata: Enable workaround for TRBE overwrite in FILL mode Suzuki K Poulose
@ 2021-10-14 22:31 ` Suzuki K Poulose
  2021-10-18 15:54   ` Mathieu Poirier
                     ` (2 more replies)
  14 siblings, 3 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-14 22:31 UTC (permalink / raw)
  To: will, mathieu.poirier
  Cc: catalin.marinas, anshuman.khandual, mike.leach, leo.yan, maz,
	coresight, linux-arm-kernel, linux-kernel, Suzuki K Poulose

With the TRBE driver workaround available, enable the config symbols
to be built without COMPILE_TEST

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/Kconfig | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f72fa44d6182..d6383ef05871 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -743,7 +743,6 @@ config ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
 
 config ARM64_ERRATUM_2253138
 	bool "Neoverse-N2: 2253138: workaround TRBE writing to address out-of-range"
-	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
 	depends on CORESIGHT_TRBE
 	default y
 	select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
@@ -762,7 +761,6 @@ config ARM64_ERRATUM_2253138
 
 config ARM64_ERRATUM_2224489
 	bool "Cortex-A710: 2224489: workaround TRBE writing to address out-of-range"
-	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
 	depends on CORESIGHT_TRBE
 	default y
 	select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 04/15] arm64: errata: Add detection for TRBE write to out-of-range
  2021-10-14 22:31 ` [PATCH v5 04/15] arm64: errata: Add detection for TRBE write to out-of-range Suzuki K Poulose
@ 2021-10-18 15:50   ` Mathieu Poirier
  2021-10-19 13:29     ` Suzuki K Poulose
  0 siblings, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2021-10-18 15:50 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: will, catalin.marinas, anshuman.khandual, mike.leach, leo.yan,
	maz, coresight, linux-arm-kernel, linux-kernel, Mark Rutland

Good morning,

On Thu, Oct 14, 2021 at 11:31:14PM +0100, Suzuki K Poulose wrote:
> Arm Neoverse-N2 and Cortex-A710 cores are affected by an erratum where the
> trbe, under some circumstances, might write upto 64bytes to an address after

Checkpatch gives me a warning about this line...

> the Limit as programmed by the TRBLIMITR_EL1.LIMIT. This might -
> 
>   - Corrupt a page in the ring buffer, which may corrupt trace from a
>     previous session, consumed by userspace.
>   - Hit the guard page at the end of the vmalloc area and raise a fault.
> 
> To keep the handling simpler, we always leave the last page from the
> range, which TRBE is allowed to write. This can be achieved by ensuring
> that we always have more than a PAGE worth space in the range, while
> calculating the LIMIT for TRBE. And then the LIMIT pointer can be adjusted
> to leave the PAGE (TRBLIMITR.LIMIT -= PAGE_SIZE), out of the TRBE range
> while enabling it. This makes sure that the TRBE will only write to an area
> within its allowed limit (i.e, [head-head+size]) and we do not have to handle

I'm pretty sure this line will also be flagged.

> address faults within the driver.
> 
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Mike Leach <mike.leach@linaro.org>
> Cc: Leo Yan <leo.yan@linaro.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  Documentation/arm64/silicon-errata.rst |  4 +++
>  arch/arm64/Kconfig                     | 41 ++++++++++++++++++++++++++
>  arch/arm64/kernel/cpu_errata.c         | 20 +++++++++++++
>  arch/arm64/tools/cpucaps               |  1 +
>  4 files changed, 66 insertions(+)
> 
> diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
> index 569a92411dcd..5342e895fb60 100644
> --- a/Documentation/arm64/silicon-errata.rst
> +++ b/Documentation/arm64/silicon-errata.rst
> @@ -96,6 +96,8 @@ stable kernels.
>  +----------------+-----------------+-----------------+-----------------------------+
>  | ARM            | Cortex-A710     | #2054223        | ARM64_ERRATUM_2054223       |
>  +----------------+-----------------+-----------------+-----------------------------+
> +| ARM            | Cortex-A710     | #2224489        | ARM64_ERRATUM_2224489       |
> ++----------------+-----------------+-----------------+-----------------------------+
>  | ARM            | Neoverse-N1     | #1188873,1418040| ARM64_ERRATUM_1418040       |
>  +----------------+-----------------+-----------------+-----------------------------+
>  | ARM            | Neoverse-N1     | #1349291        | N/A                         |
> @@ -106,6 +108,8 @@ stable kernels.
>  +----------------+-----------------+-----------------+-----------------------------+
>  | ARM            | Neoverse-N2     | #2067961        | ARM64_ERRATUM_2067961       |
>  +----------------+-----------------+-----------------+-----------------------------+
> +| ARM            | Neoverse-N2     | #2253138        | ARM64_ERRATUM_2253138       |
> ++----------------+-----------------+-----------------+-----------------------------+
>  | ARM            | MMU-500         | #841119,826419  | N/A                         |
>  +----------------+-----------------+-----------------+-----------------------------+
>  +----------------+-----------------+-----------------+-----------------------------+
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 39b78460b9d0..f30029f4a9f9 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -740,6 +740,47 @@ config ARM64_ERRATUM_2067961
>  
>  	  If unsure, say Y.
>  
> +config ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
> +	bool
> +
> +config ARM64_ERRATUM_2253138
> +	bool "Neoverse-N2: 2253138: workaround TRBE writing to address out-of-range"
> +	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
> +	depends on CORESIGHT_TRBE
> +	default y
> +	select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
> +	help
> +	  This option adds the workaround for ARM Neoverse-N2 erratum 2253138.
> +
> +	  Affected Neoverse-N2 cores might write to an out-of-range address, not reserved
> +	  for TRBE. Under some conditions, the TRBE might generate a write to the next
> +	  virtually addressed page following the last page of the TRBE address space
> +	  (i.e., the TRBLIMITR_EL1.LIMIT), instead of wrapping around to the base.
> +
> +	  Work around this in the driver by always making sure that there is a
> +	  page beyond the TRBLIMITR_EL1.LIMIT, within the space allowed for the TRBE.
> +
> +	  If unsure, say Y.
> +
> +config ARM64_ERRATUM_2224489
> +	bool "Cortex-A710: 2224489: workaround TRBE writing to address out-of-range"
> +	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
> +	depends on CORESIGHT_TRBE
> +	default y
> +	select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
> +	help
> +	  This option adds the workaround for ARM Cortex-A710 erratum 2224489.
> +
> +	  Affected Cortex-A710 cores might write to an out-of-range address, not reserved
> +	  for TRBE. Under some conditions, the TRBE might generate a write to the next
> +	  virtually addressed page following the last page of the TRBE address space
> +	  (i.e., the TRBLIMITR_EL1.LIMIT), instead of wrapping around to the base.
> +
> +	  Work around this in the driver by always making sure that there is a
> +	  page beyond the TRBLIMITR_EL1.LIMIT, within the space allowed for the TRBE.
> +
> +	  If unsure, say Y.
> +
>  config CAVIUM_ERRATUM_22375
>  	bool "Cavium erratum 22375, 24313"
>  	default y
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index bdbeac75ead6..e2978b89d4b8 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -364,6 +364,18 @@ static const struct midr_range tsb_flush_fail_cpus[] = {
>  };
>  #endif	/* CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE */
>  
> +#ifdef CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
> +static struct midr_range trbe_write_out_of_range_cpus[] = {
> +#ifdef CONFIG_ARM64_ERRATUM_2253138
> +	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
> +#endif
> +#ifdef CONFIG_ARM64_ERRATUM_2224489
> +	MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
> +#endif
> +	{},
> +};
> +#endif /* CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE */
> +
>  const struct arm64_cpu_capabilities arm64_errata[] = {
>  #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
>  	{
> @@ -577,6 +589,14 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>  		.capability = ARM64_WORKAROUND_TSB_FLUSH_FAILURE,
>  		ERRATA_MIDR_RANGE_LIST(tsb_flush_fail_cpus),
>  	},
> +#endif
> +#ifdef CONFIG_ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
> +	{
> +		.desc = "ARM erratum 2253138 or 2224489",
> +		.capability = ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE,
> +		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
> +		CAP_MIDR_RANGE_LIST(trbe_write_out_of_range_cpus),
> +	},
>  #endif
>  	{
>  	}
> diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
> index 2102e15af43d..90628638e0f9 100644
> --- a/arch/arm64/tools/cpucaps
> +++ b/arch/arm64/tools/cpucaps
> @@ -55,6 +55,7 @@ WORKAROUND_1508412
>  WORKAROUND_1542419
>  WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>  WORKAROUND_TSB_FLUSH_FAILURE
> +WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
>  WORKAROUND_CAVIUM_23154
>  WORKAROUND_CAVIUM_27456
>  WORKAROUND_CAVIUM_30115
> -- 
> 2.25.4
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 10/15] coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  2021-10-14 22:31 ` [PATCH v5 10/15] coresight: trbe: Workaround TRBE errata overwrite in FILL mode Suzuki K Poulose
@ 2021-10-18 15:51   ` Mathieu Poirier
  2021-10-18 21:15     ` Suzuki K Poulose
  2021-10-19  5:42   ` Anshuman Khandual
  1 sibling, 1 reply; 41+ messages in thread
From: Mathieu Poirier @ 2021-10-18 15:51 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: will, catalin.marinas, anshuman.khandual, mike.leach, leo.yan,
	maz, coresight, linux-arm-kernel, linux-kernel

On Thu, Oct 14, 2021 at 11:31:20PM +0100, Suzuki K Poulose wrote:
> ARM Neoverse-N2 (#2139208) and Cortex-A710(##2119858) suffers from
> an erratum, which when triggered, might cause the TRBE to overwrite
> the trace data already collected in FILL mode, in the event of a WRAP.
> i.e, the TRBE doesn't stop writing the data, instead wraps to the base
> and could write upto 3 cache line size worth trace. Thus, this could
> corrupt the trace at the "BASE" pointer.
> 
> The workaround is to program the write pointer 256bytes from the
> base, such that if the erratum is triggered, it doesn't overwrite
> the trace data that was captured. This skipped region could be
> padded with ignore packets at the end of the session, so that
> the decoder sees a continuous buffer with some padding at the
> beginning. The trace data written at the base is considered
> lost as the limit could have been in the middle of the perf
> ring buffer, and jumping to the "base" is not acceptable.
> We set the flags already to indicate that some amount of trace
> was lost during the FILL event IRQ. So this is fine.
> 
> One important change with the work around is, we program the
> TRBBASER_EL1 to current page where we are allowed to write.
> Otherwise, it could overwrite a region that may be consumed
> by the perf. Towards this, we always make sure that the
> "handle->head" and thus the trbe_write is PAGE_SIZE aligned,
> so that we can set the BASE to the PAGE base and move the
> TRBPTR to the 256bytes offset.
> 
> Cc: Mike Leach <mike.leach@linaro.org>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Leo Yan <leo.yan@linaro.org>
> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
> Changes since v2:
>  - Updated the ASCII art to include better description of
>    all the steps in the work around
> Change since v1:
>  - Updated comment with ASCII art
>  - Add _BYTES suffix for the space to skip for the work around.
> ---
>  drivers/hwtracing/coresight/coresight-trbe.c | 169 +++++++++++++++++--
>  1 file changed, 158 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 314e5e7374c7..b56b166b2dec 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -16,6 +16,7 @@
>  #define pr_fmt(fmt) DRVNAME ": " fmt
>  
>  #include <asm/barrier.h>
> +#include <asm/cpufeature.h>

Here too I get a checkpatch warning...

>  
>  #include "coresight-self-hosted-trace.h"
>  #include "coresight-trbe.h"
> @@ -88,14 +89,22 @@ struct trbe_buf {
>   *   - Not duplicating the detection logic
>   *   - Streamlined detection of erratum across the system
>   */
> +#define TRBE_WORKAROUND_OVERWRITE_FILL_MODE	0
>  
>  static int trbe_errata_cpucaps[] = {
> +	[TRBE_WORKAROUND_OVERWRITE_FILL_MODE] = ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE,
>  	-1,		/* Sentinel, must be the last entry */
>  };
>  
>  /* The total number of listed errata in trbe_errata_cpucaps */
>  #define TRBE_ERRATA_MAX			(ARRAY_SIZE(trbe_errata_cpucaps) - 1)
>  
> +/*
> + * Safe limit for the number of bytes that may be overwritten
> + * when ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE is triggered.
> + */
> +#define TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES	256
> +
>  /*
>   * struct trbe_cpudata: TRBE instance specific data
>   * @trbe_flag		- TRBE dirty/access flag support
> @@ -146,6 +155,11 @@ static inline bool trbe_has_erratum(struct trbe_cpudata *cpudata, int i)
>  	return (i < TRBE_ERRATA_MAX) && test_bit(i, cpudata->errata);
>  }
>  
> +static inline bool trbe_may_overwrite_in_fill_mode(struct trbe_cpudata *cpudata)
> +{
> +	return trbe_has_erratum(cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE);
> +}
> +
>  static int trbe_alloc_node(struct perf_event *event)
>  {
>  	if (event->cpu == -1)
> @@ -549,10 +563,13 @@ static void trbe_enable_hw(struct trbe_buf *buf)
>  	set_trbe_limit_pointer_enabled(buf->trbe_limit);
>  }
>  
> -static enum trbe_fault_action trbe_get_fault_act(u64 trbsr)
> +static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *handle,
> +						 u64 trbsr)
>  {
>  	int ec = get_trbe_ec(trbsr);
>  	int bsc = get_trbe_bsc(trbsr);
> +	struct trbe_buf *buf = etm_perf_sink_config(handle);
> +	struct trbe_cpudata *cpudata = buf->cpudata;
>  
>  	WARN_ON(is_trbe_running(trbsr));
>  	if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
> @@ -561,10 +578,16 @@ static enum trbe_fault_action trbe_get_fault_act(u64 trbsr)
>  	if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
>  		return TRBE_FAULT_ACT_FATAL;
>  
> -	if (is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) {
> -		if (get_trbe_write_pointer() == get_trbe_base_pointer())
> -			return TRBE_FAULT_ACT_WRAP;
> -	}
> +	/*
> +	 * If the trbe is affected by TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
> +	 * it might write data after a WRAP event in the fill mode.
> +	 * Thus the check TRBPTR == TRBBASER will not be honored.
> +	 */
> +	if ((is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) &&
> +	    (trbe_may_overwrite_in_fill_mode(cpudata) ||
> +	     get_trbe_write_pointer() == get_trbe_base_pointer()))
> +		return TRBE_FAULT_ACT_WRAP;
> +
>  	return TRBE_FAULT_ACT_SPURIOUS;
>  }
>  
> @@ -573,6 +596,8 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
>  {
>  	u64 write;
>  	u64 start_off, end_off;
> +	u64 size;
> +	u64 overwrite_skip = TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES;
>  
>  	/*
>  	 * If the TRBE has wrapped around the write pointer has
> @@ -593,7 +618,18 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
>  
>  	if (WARN_ON_ONCE(end_off < start_off))
>  		return 0;
> -	return (end_off - start_off);
> +
> +	size = end_off - start_off;
> +	/*
> +	 * If the TRBE is affected by the following erratum, we must fill
> +	 * the space we skipped with IGNORE packets. And we are always
> +	 * guaranteed to have at least a PAGE_SIZE space in the buffer.
> +	 */
> +	if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE) &&
> +	    !WARN_ON(size < overwrite_skip))
> +		__trbe_pad_buf(buf, start_off, overwrite_skip);
> +
> +	return size;
>  }
>  
>  static void *arm_trbe_alloc_buffer(struct coresight_device *csdev,
> @@ -712,7 +748,7 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
>  		clr_trbe_irq();
>  		isb();
>  
> -		act = trbe_get_fault_act(status);
> +		act = trbe_get_fault_act(handle, status);
>  		/*
>  		 * If this was not due to a WRAP event, we have some
>  		 * errors and as such buffer is empty.
> @@ -736,21 +772,117 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
>  	return size;
>  }
>  
> +
> +static int trbe_apply_work_around_before_enable(struct trbe_buf *buf)
> +{
> +	/*
> +	 * TRBE_WORKAROUND_OVERWRITE_FILL_MODE causes the TRBE to overwrite a few cache
> +	 * line size from the "TRBBASER_EL1" in the event of a "FILL".
> +	 * Thus, we could loose some amount of the trace at the base.
> +	 *
> +	 * Before Fix:
> +	 *
> +	 *  normal-BASE     head (normal-TRBPTR)         tail (normal-LIMIT)
> +	 *  |                   \/                       /
> +	 *   -------------------------------------------------------------
> +	 *  |   Pg0      |   Pg1       |           |          |  PgN     |
> +	 *   -------------------------------------------------------------
> +	 *
> +	 * In the normal course of action, we would set the TRBBASER to the
> +	 * beginning of the ring-buffer (normal-BASE). But with the erratum,
> +	 * the TRBE could overwrite the contents at the "normal-BASE", after
> +	 * hitting the "normal-LIMIT", since it doesn't stop as expected. And
> +	 * this is wrong. This could result in overwriting trace collected in
> +	 * one of the previous runs, being consumed by the user. So we must
> +	 * always make sure that the TRBBASER is within the region
> +	 * [head, head+size]. Note that TRBBASER must be PAGE aligned,
> +	 *
> +	 *  After moving the BASE:
> +	 *
> +	 *  normal-BASE     head (normal-TRBPTR)         tail (normal-LIMIT)
> +	 *  |                   \/                       /
> +	 *   -------------------------------------------------------------
> +	 *  |         |          |xyzdef.     |..   tuvw|                |
> +	 *   -------------------------------------------------------------
> +	 *                      /
> +	 *              New-BASER
> +	 *
> +	 * Also, we would set the TRBPTR to head (after adjusting for
> +	 * alignment) at normal-PTR. This would mean that the last few bytes
> +	 * of the trace (say, "xyz") might overwrite the first few bytes of
> +	 * trace written ("abc"). More importantly they will appear in what
> +	 * userspace sees as the beginning of the trace, which is wrong. We may
> +	 * not always have space to move the latest trace "xyz" to the correct
> +	 * order as it must appear beyond the LIMIT. (i.e, [head..head+size]).
> +	 * Thus it is easier to ignore those bytes than to complicate the
> +	 * driver to move it, assuming that the erratum was triggered and
> +	 * doing additional checks to see if there is indeed allowed space at
> +	 * TRBLIMITR.LIMIT.
> +	 *
> +	 *  Thus the full workaround will move the BASE and the PTR and would
> +	 *  look like (after padding at the skipped bytes at the end of
> +	 *  session) :
> +	 *
> +	 *  normal-BASE     head (normal-TRBPTR)         tail (normal-LIMIT)
> +	 *  |                   \/                       /
> +	 *   -------------------------------------------------------------
> +	 *  |         |          |///abc..     |..  rst|                |
> +	 *   -------------------------------------------------------------
> +	 *                      /    |
> +	 *              New-BASER    New-TRBPTR
> +	 *
> +	 * To summarize, with the work around:
> +	 *
> +	 *  - We always align the offset for the next session to PAGE_SIZE
> +	 *    (This is to ensure we can program the TRBBASER to this offset
> +	 *    within the region [head...head+size]).
> +	 *
> +	 *  - At TRBE enable:
> +	 *     - Set the TRBBASER to the page aligned offset of the current
> +	 *       proposed write offset. (which is guaranteed to be aligned
> +	 *       as above)
> +	 *     - Move the TRBPTR to skip first 256bytes (that might be
> +	 *       overwritten with the erratum). This ensures that the trace
> +	 *       generated in the session is not re-written.
> +	 *
> +	 *  - At trace collection:
> +	 *     - Pad the 256bytes skipped above again with IGNORE packets.
> +	 */
> +	if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE)) {
> +		if (WARN_ON(!IS_ALIGNED(buf->trbe_write, PAGE_SIZE)))
> +			return -EINVAL;
> +		buf->trbe_hw_base = buf->trbe_write;
> +		buf->trbe_write += TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES;
> +	}
> +
> +	return 0;
> +}
> +
>  static int __arm_trbe_enable(struct trbe_buf *buf,
>  			     struct perf_output_handle *handle)
>  {
> +	int ret = 0;
> +
>  	perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
>  	buf->trbe_limit = compute_trbe_buffer_limit(handle);
>  	buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
>  	if (buf->trbe_limit == buf->trbe_base) {
> -		trbe_stop_and_truncate_event(handle);
> -		return -ENOSPC;
> +		ret = -ENOSPC;
> +		goto err;
>  	}
>  	/* Set the base of the TRBE to the buffer base */
>  	buf->trbe_hw_base = buf->trbe_base;
> +
> +	ret = trbe_apply_work_around_before_enable(buf);
> +	if (ret)
> +		goto err;
> +
>  	*this_cpu_ptr(buf->cpudata->drvdata->handle) = handle;
>  	trbe_enable_hw(buf);
>  	return 0;
> +err:
> +	trbe_stop_and_truncate_event(handle);
> +	return ret;
>  }
>  
>  static int arm_trbe_enable(struct coresight_device *csdev, u32 mode, void *data)
> @@ -890,7 +1022,7 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
>  	if (!is_perf_trbe(handle))
>  		return IRQ_NONE;
>  
> -	act = trbe_get_fault_act(status);
> +	act = trbe_get_fault_act(handle, status);
>  	switch (act) {
>  	case TRBE_FAULT_ACT_WRAP:
>  		truncated = !!trbe_handle_overflow(handle);
> @@ -1038,7 +1170,22 @@ static void arm_trbe_probe_cpu(void *info)
>  	 */
>  	trbe_check_errata(cpudata);
>  
> -	cpudata->trbe_align = cpudata->trbe_hw_align;
> +	/*
> +	 * If the TRBE is affected by erratum TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
> +	 * we must always program the TBRPTR_EL1, 256bytes from a page
> +	 * boundary, with TRBBASER_EL1 set to the page, to prevent
> +	 * TRBE over-writing 256bytes at TRBBASER_EL1 on FILL event.
> +	 *
> +	 * Thus make sure we always align our write pointer to a PAGE_SIZE,
> +	 * which also guarantees that we have at least a PAGE_SIZE space in
> +	 * the buffer (TRBLIMITR is PAGE aligned) and thus we can skip
> +	 * the required bytes at the base.
> +	 */
> +	if (trbe_may_overwrite_in_fill_mode(cpudata))
> +		cpudata->trbe_align = PAGE_SIZE;
> +	else
> +		cpudata->trbe_align = cpudata->trbe_hw_align;
> +
>  	cpudata->trbe_flag = get_trbe_flag_update(trbidr);
>  	cpudata->cpu = cpu;
>  	cpudata->drvdata = drvdata;
> -- 
> 2.25.4
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 14/15] arm64: errata: Enable workaround for TRBE overwrite in FILL mode
  2021-10-14 22:31 ` [PATCH v5 14/15] arm64: errata: Enable workaround for TRBE overwrite in FILL mode Suzuki K Poulose
@ 2021-10-18 15:54   ` Mathieu Poirier
  2021-10-19  5:59   ` Anshuman Khandual
  2021-10-19 10:42   ` Will Deacon
  2 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2021-10-18 15:54 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: will, catalin.marinas, anshuman.khandual, mike.leach, leo.yan,
	maz, coresight, linux-arm-kernel, linux-kernel

On Thu, Oct 14, 2021 at 11:31:24PM +0100, Suzuki K Poulose wrote:
> With the workaround enabled in TRBE, enable the config entries
> to be built without COMPILE_TEST
> 
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/Kconfig | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index f30029f4a9f9..f72fa44d6182 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -672,7 +672,6 @@ config ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>  config ARM64_ERRATUM_2119858
>  	bool "Cortex-A710: 2119858: workaround TRBE overwriting trace data in FILL mode"
>  	default y
> -	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
>  	depends on CORESIGHT_TRBE
>  	select ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>  	help
> @@ -691,7 +690,6 @@ config ARM64_ERRATUM_2119858
>  config ARM64_ERRATUM_2139208
>  	bool "Neoverse-N2: 2139208: workaround TRBE overwriting trace data in FILL mode"
>  	default y
> -	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
>  	depends on CORESIGHT_TRBE

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>

>  	select ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>  	help
> -- 
> 2.25.4
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 15/15] arm64: errata: Enable TRBE workaround for write to out-of-range address
  2021-10-14 22:31 ` [PATCH v5 15/15] arm64: errata: Enable TRBE workaround for write to out-of-range address Suzuki K Poulose
@ 2021-10-18 15:54   ` Mathieu Poirier
  2021-10-19  6:00   ` Anshuman Khandual
  2021-10-19 10:42   ` Will Deacon
  2 siblings, 0 replies; 41+ messages in thread
From: Mathieu Poirier @ 2021-10-18 15:54 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: will, catalin.marinas, anshuman.khandual, mike.leach, leo.yan,
	maz, coresight, linux-arm-kernel, linux-kernel

On Thu, Oct 14, 2021 at 11:31:25PM +0100, Suzuki K Poulose wrote:
> With the TRBE driver workaround available, enable the config symbols
> to be built without COMPILE_TEST
> 
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/Kconfig | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index f72fa44d6182..d6383ef05871 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -743,7 +743,6 @@ config ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
>  
>  config ARM64_ERRATUM_2253138
>  	bool "Neoverse-N2: 2253138: workaround TRBE writing to address out-of-range"
> -	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
>  	depends on CORESIGHT_TRBE
>  	default y
>  	select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
> @@ -762,7 +761,6 @@ config ARM64_ERRATUM_2253138
>  
>  config ARM64_ERRATUM_2224489
>  	bool "Cortex-A710: 2224489: workaround TRBE writing to address out-of-range"
> -	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
>  	depends on CORESIGHT_TRBE

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>

>  	default y
>  	select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
> -- 
> 2.25.4
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 10/15] coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  2021-10-18 15:51   ` Mathieu Poirier
@ 2021-10-18 21:15     ` Suzuki K Poulose
  2021-10-19  4:36       ` Anshuman Khandual
  0 siblings, 1 reply; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-18 21:15 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: will, catalin.marinas, anshuman.khandual, mike.leach, leo.yan,
	maz, coresight, linux-arm-kernel, linux-kernel

On 18/10/2021 16:51, Mathieu Poirier wrote:
> On Thu, Oct 14, 2021 at 11:31:20PM +0100, Suzuki K Poulose wrote:
>> ARM Neoverse-N2 (#2139208) and Cortex-A710(##2119858) suffers from
>> an erratum, which when triggered, might cause the TRBE to overwrite
>> the trace data already collected in FILL mode, in the event of a WRAP.
>> i.e, the TRBE doesn't stop writing the data, instead wraps to the base
>> and could write upto 3 cache line size worth trace. Thus, this could
>> corrupt the trace at the "BASE" pointer.
>>
>> The workaround is to program the write pointer 256bytes from the
>> base, such that if the erratum is triggered, it doesn't overwrite
>> the trace data that was captured. This skipped region could be
>> padded with ignore packets at the end of the session, so that
>> the decoder sees a continuous buffer with some padding at the
>> beginning. The trace data written at the base is considered
>> lost as the limit could have been in the middle of the perf
>> ring buffer, and jumping to the "base" is not acceptable.
>> We set the flags already to indicate that some amount of trace
>> was lost during the FILL event IRQ. So this is fine.
>>
>> One important change with the work around is, we program the
>> TRBBASER_EL1 to current page where we are allowed to write.
>> Otherwise, it could overwrite a region that may be consumed
>> by the perf. Towards this, we always make sure that the
>> "handle->head" and thus the trbe_write is PAGE_SIZE aligned,
>> so that we can set the BASE to the PAGE base and move the
>> TRBPTR to the 256bytes offset.
>>
>> Cc: Mike Leach <mike.leach@linaro.org>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
>> Cc: Leo Yan <leo.yan@linaro.org>
>> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>> Changes since v2:
>>   - Updated the ASCII art to include better description of
>>     all the steps in the work around
>> Change since v1:
>>   - Updated comment with ASCII art
>>   - Add _BYTES suffix for the space to skip for the work around.
>> ---
>>   drivers/hwtracing/coresight/coresight-trbe.c | 169 +++++++++++++++++--
>>   1 file changed, 158 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>> index 314e5e7374c7..b56b166b2dec 100644
>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>> @@ -16,6 +16,7 @@
>>   #define pr_fmt(fmt) DRVNAME ": " fmt
>>   
>>   #include <asm/barrier.h>
>> +#include <asm/cpufeature.h>
> 
> Here too I get a checkpatch warning...
> 

That is a false alarm. I guess that warns for including
linux/cpufeature.h? It is a bit odd, we include this
for the arm64 cpucaps, not the generic linux feature
checks. (They are used for "loading modules" based
on "features" which are more like ELF HWCAPs).

As such I chose to ignore it.

Suzuki

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 10/15] coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  2021-10-18 21:15     ` Suzuki K Poulose
@ 2021-10-19  4:36       ` Anshuman Khandual
  2021-10-19  8:37         ` Suzuki K Poulose
  0 siblings, 1 reply; 41+ messages in thread
From: Anshuman Khandual @ 2021-10-19  4:36 UTC (permalink / raw)
  To: Suzuki K Poulose, Mathieu Poirier
  Cc: will, catalin.marinas, mike.leach, leo.yan, maz, coresight,
	linux-arm-kernel, linux-kernel



On 10/19/21 2:45 AM, Suzuki K Poulose wrote:
> On 18/10/2021 16:51, Mathieu Poirier wrote:
>> On Thu, Oct 14, 2021 at 11:31:20PM +0100, Suzuki K Poulose wrote:
>>> ARM Neoverse-N2 (#2139208) and Cortex-A710(##2119858) suffers from
>>> an erratum, which when triggered, might cause the TRBE to overwrite
>>> the trace data already collected in FILL mode, in the event of a WRAP.
>>> i.e, the TRBE doesn't stop writing the data, instead wraps to the base
>>> and could write upto 3 cache line size worth trace. Thus, this could
>>> corrupt the trace at the "BASE" pointer.
>>>
>>> The workaround is to program the write pointer 256bytes from the
>>> base, such that if the erratum is triggered, it doesn't overwrite
>>> the trace data that was captured. This skipped region could be
>>> padded with ignore packets at the end of the session, so that
>>> the decoder sees a continuous buffer with some padding at the
>>> beginning. The trace data written at the base is considered
>>> lost as the limit could have been in the middle of the perf
>>> ring buffer, and jumping to the "base" is not acceptable.
>>> We set the flags already to indicate that some amount of trace
>>> was lost during the FILL event IRQ. So this is fine.
>>>
>>> One important change with the work around is, we program the
>>> TRBBASER_EL1 to current page where we are allowed to write.
>>> Otherwise, it could overwrite a region that may be consumed
>>> by the perf. Towards this, we always make sure that the
>>> "handle->head" and thus the trbe_write is PAGE_SIZE aligned,
>>> so that we can set the BASE to the PAGE base and move the
>>> TRBPTR to the 256bytes offset.
>>>
>>> Cc: Mike Leach <mike.leach@linaro.org>
>>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>>> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
>>> Cc: Leo Yan <leo.yan@linaro.org>
>>> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>> ---
>>> Changes since v2:
>>>   - Updated the ASCII art to include better description of
>>>     all the steps in the work around
>>> Change since v1:
>>>   - Updated comment with ASCII art
>>>   - Add _BYTES suffix for the space to skip for the work around.
>>> ---
>>>   drivers/hwtracing/coresight/coresight-trbe.c | 169 +++++++++++++++++--
>>>   1 file changed, 158 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>>> index 314e5e7374c7..b56b166b2dec 100644
>>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>>> @@ -16,6 +16,7 @@
>>>   #define pr_fmt(fmt) DRVNAME ": " fmt
>>>     #include <asm/barrier.h>
>>> +#include <asm/cpufeature.h>
>>
>> Here too I get a checkpatch warning...
>>
> 
> That is a false alarm. I guess that warns for including
> linux/cpufeature.h? It is a bit odd, we include this
> for the arm64 cpucaps, not the generic linux feature

It is a bit odd, I saw that too.

> checks. (They are used for "loading modules" based
> on "features" which are more like ELF HWCAPs).

Should <asm/cpufeature.h> be renamed as <asm/arm64_cpufeature.h>
or something similar instead to differentiate it from the generic
<linux/cpufeature.h> as they are not related. Also, probably this
warning could be avoided.

> 
> As such I chose to ignore it.
> 
> Suzuki

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 03/15] arm64: errata: Add workaround for TSB flush failures
  2021-10-14 22:31 ` [PATCH v5 03/15] arm64: errata: Add workaround for TSB flush failures Suzuki K Poulose
@ 2021-10-19  5:06   ` Anshuman Khandual
  2021-10-19 11:02   ` Will Deacon
  1 sibling, 0 replies; 41+ messages in thread
From: Anshuman Khandual @ 2021-10-19  5:06 UTC (permalink / raw)
  To: Suzuki K Poulose, will, mathieu.poirier
  Cc: catalin.marinas, mike.leach, leo.yan, maz, coresight,
	linux-arm-kernel, linux-kernel, Mark Rutland



On 10/15/21 4:01 AM, Suzuki K Poulose wrote:
> Arm Neoverse-N2 (#2067961) and Cortex-A710 (#2054223) suffers
> from errata, where a TSB (trace synchronization barrier)
> fails to flush the trace data completely, when executed from
> a trace prohibited region. In Linux we always execute it
> after we have moved the PE to trace prohibited region. So,
> we can apply the workaround every time a TSB is executed.
> 
> The work around is to issue two TSB consecutively.
> 
> NOTE: This errata is defined as LOCAL_CPU_ERRATUM, implying
> that a late CPU could be blocked from booting if it is the
> first CPU that requires the workaround. This is because we
> do not allow setting a cpu_hwcaps after the SMP boot. The
> other alternative is to use "this_cpu_has_cap()" instead
> of the faster system wide check, which may be a bit of an
> overhead, given we may have to do this in nvhe KVM host
> before a guest entry.
> 
> Cc: Will Deacon <will@kernel.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Mike Leach <mike.leach@linaro.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Marc Zyngier <maz@kernel.org>
> Acked-by: Catalin Marinas <catalin.marinas@arm.com>
> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
> Changes since v3:
>  - Merge the Kconfig changes back.
> ---
>  Documentation/arm64/silicon-errata.rst |  4 ++++
>  arch/arm64/Kconfig                     | 33 ++++++++++++++++++++++++++
>  arch/arm64/include/asm/barrier.h       | 16 ++++++++++++-
>  arch/arm64/kernel/cpu_errata.c         | 19 +++++++++++++++
>  arch/arm64/tools/cpucaps               |  1 +
>  5 files changed, 72 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
> index 2f99229d993c..569a92411dcd 100644
> --- a/Documentation/arm64/silicon-errata.rst
> +++ b/Documentation/arm64/silicon-errata.rst
> @@ -94,6 +94,8 @@ stable kernels.
>  +----------------+-----------------+-----------------+-----------------------------+
>  | ARM            | Cortex-A710     | #2119858        | ARM64_ERRATUM_2119858       |
>  +----------------+-----------------+-----------------+-----------------------------+
> +| ARM            | Cortex-A710     | #2054223        | ARM64_ERRATUM_2054223       |
> ++----------------+-----------------+-----------------+-----------------------------+
>  | ARM            | Neoverse-N1     | #1188873,1418040| ARM64_ERRATUM_1418040       |
>  +----------------+-----------------+-----------------+-----------------------------+
>  | ARM            | Neoverse-N1     | #1349291        | N/A                         |
> @@ -102,6 +104,8 @@ stable kernels.
>  +----------------+-----------------+-----------------+-----------------------------+
>  | ARM            | Neoverse-N2     | #2139208        | ARM64_ERRATUM_2139208       |
>  +----------------+-----------------+-----------------+-----------------------------+
> +| ARM            | Neoverse-N2     | #2067961        | ARM64_ERRATUM_2067961       |
> ++----------------+-----------------+-----------------+-----------------------------+
>  | ARM            | MMU-500         | #841119,826419  | N/A                         |
>  +----------------+-----------------+-----------------+-----------------------------+
>  +----------------+-----------------+-----------------+-----------------------------+
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b452181d3638..39b78460b9d0 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -707,6 +707,39 @@ config ARM64_ERRATUM_2139208
>  
>  	  If unsure, say Y.
>  
> +config ARM64_WORKAROUND_TSB_FLUSH_FAILURE
> +	bool
> +
> +config ARM64_ERRATUM_2054223
> +	bool "Cortex-A710: 2054223: workaround TSB instruction failing to flush trace"
> +	default y
> +	select ARM64_WORKAROUND_TSB_FLUSH_FAILURE
> +	help
> +	  Enable workaround for ARM Cortex-A710 erratum 2054223
> +
> +	  Affected cores may fail to flush the trace data on a TSB instruction, when
> +	  the PE is in trace prohibited state. This will cause losing a few bytes
> +	  of the trace cached.
> +
> +	  Workaround is to issue two TSB consecutively on affected cores.
> +
> +	  If unsure, say Y.
> +
> +config ARM64_ERRATUM_2067961
> +	bool "Neoverse-N2: 2067961: workaround TSB instruction failing to flush trace"
> +	default y
> +	select ARM64_WORKAROUND_TSB_FLUSH_FAILURE
> +	help
> +	  Enable workaround for ARM Neoverse-N2 erratum 2067961
> +
> +	  Affected cores may fail to flush the trace data on a TSB instruction, when
> +	  the PE is in trace prohibited state. This will cause losing a few bytes
> +	  of the trace cached.
> +
> +	  Workaround is to issue two TSB consecutively on affected cores.
> +
> +	  If unsure, say Y.
> +
>  config CAVIUM_ERRATUM_22375
>  	bool "Cavium erratum 22375, 24313"
>  	default y
> diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
> index 451e11e5fd23..1c5a00598458 100644
> --- a/arch/arm64/include/asm/barrier.h
> +++ b/arch/arm64/include/asm/barrier.h
> @@ -23,7 +23,7 @@
>  #define dsb(opt)	asm volatile("dsb " #opt : : : "memory")
>  
>  #define psb_csync()	asm volatile("hint #17" : : : "memory")
> -#define tsb_csync()	asm volatile("hint #18" : : : "memory")
> +#define __tsb_csync()	asm volatile("hint #18" : : : "memory")
>  #define csdb()		asm volatile("hint #20" : : : "memory")
>  
>  #ifdef CONFIG_ARM64_PSEUDO_NMI
> @@ -46,6 +46,20 @@
>  #define dma_rmb()	dmb(oshld)
>  #define dma_wmb()	dmb(oshst)
>  
> +
> +#define tsb_csync()								\
> +	do {									\
> +		/*								\
> +		 * CPUs affected by Arm Erratum 2054223 or 2067961 needs	\
> +		 * another TSB to ensure the trace is flushed. The barriers	\
> +		 * don't have to be strictly back to back, as long as the	\
> +		 * CPU is in trace prohibited state.				\
> +		 */								\
> +		if (cpus_have_final_cap(ARM64_WORKAROUND_TSB_FLUSH_FAILURE))	\
> +			__tsb_csync();						\
> +		__tsb_csync();							\
> +	} while (0)
> +
>  /*
>   * Generate a mask for array_index__nospec() that is ~0UL when 0 <= idx < sz
>   * and 0 otherwise.
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index ccd757373f36..bdbeac75ead6 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -352,6 +352,18 @@ static const struct midr_range trbe_overwrite_fill_mode_cpus[] = {
>  };
>  #endif	/* CONFIG_ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE */
>  
> +#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE
> +static const struct midr_range tsb_flush_fail_cpus[] = {
> +#ifdef CONFIG_ARM64_ERRATUM_2067961
> +	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
> +#endif
> +#ifdef CONFIG_ARM64_ERRATUM_2054223
> +	MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
> +#endif
> +	{},
> +};
> +#endif	/* CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE */
> +
>  const struct arm64_cpu_capabilities arm64_errata[] = {
>  #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
>  	{
> @@ -558,6 +570,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>  		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
>  		CAP_MIDR_RANGE_LIST(trbe_overwrite_fill_mode_cpus),
>  	},
> +#endif
> +#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILRE
> +	{
> +		.desc = "ARM erratum 2067961 or 2054223",
> +		.capability = ARM64_WORKAROUND_TSB_FLUSH_FAILURE,
> +		ERRATA_MIDR_RANGE_LIST(tsb_flush_fail_cpus),
> +	},
>  #endif
>  	{
>  	}
> diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
> index 1ccb92165bd8..2102e15af43d 100644
> --- a/arch/arm64/tools/cpucaps
> +++ b/arch/arm64/tools/cpucaps
> @@ -54,6 +54,7 @@ WORKAROUND_1463225
>  WORKAROUND_1508412
>  WORKAROUND_1542419
>  WORKAROUND_TRBE_OVERWRITE_FILL_MODE
> +WORKAROUND_TSB_FLUSH_FAILURE
>  WORKAROUND_CAVIUM_23154
>  WORKAROUND_CAVIUM_27456
>  WORKAROUND_CAVIUM_30115
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 09/15] coresight: trbe: Add infrastructure for Errata handling
  2021-10-14 22:31 ` [PATCH v5 09/15] coresight: trbe: Add infrastructure for Errata handling Suzuki K Poulose
@ 2021-10-19  5:25   ` Anshuman Khandual
  2021-10-29 10:31   ` Arnd Bergmann
  1 sibling, 0 replies; 41+ messages in thread
From: Anshuman Khandual @ 2021-10-19  5:25 UTC (permalink / raw)
  To: Suzuki K Poulose, will, mathieu.poirier
  Cc: catalin.marinas, mike.leach, leo.yan, maz, coresight,
	linux-arm-kernel, linux-kernel



On 10/15/21 4:01 AM, Suzuki K Poulose wrote:
> Add a minimal infrastructure to keep track of the errata
> affecting the given TRBE instance. Given that we have
> heterogeneous CPUs, we have to manage the list per-TRBE
> instance to be able to apply the work around as needed.
> Thus we will need to check if individual CPUs are affected
> by the erratum.
> 
> We rely on the arm64 errata framework for the actual
> description and the discovery of a given erratum, to
> keep the Erratum work around at a central place and
> benefit from the code and the advertisement from the
> kernel. Though we could reuse the "this_cpu_has_cap()"
> to apply an erratum work around, it is a bit of a heavy
> operation, as it must go through the "erratum" detection
> check on the CPU every time it is called (e.g, scanning
> through a table of affected MIDRs). Since we need
> to do this check for every session, may be multiple
> times (depending on the wrok around), we could save
> the cycles by caching the affected errata per-CPU
> instance in the per-CPU struct trbe_cpudata.
> 
> Since we are only interested in the errata affecting
> the TRBE driver, we only need to track a very few of them
> per-CPU.  Thus we use a local mapping of the CPUCAP for the
> erratum to avoid bloating up a bitmap for trbe_cpudata.
> 
> i.e, each arm64 TRBE erratum bit is assigned a "index"
> within the driver to track. Each trbe instance updates
> the list of affected erratum at probe time on the CPU.
> This makes sure that we can easily access the list of
> errata on a given TRBE instance without much overhead.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Mike Leach <mike.leach@linaro.org>
> Cc: Leo Yan <leo.yan@linaro.org>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
> Changes since v4:
>   - Ensure the arm_trbe_probe_cpu() is called from non preemptible
>     context for hotplugged CPUs

This (both renaming and non preemptible context) makes sense.

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

A small nit though.

Should the array sentinel value '-1' be bit formalized with a
macro and check like this != instead.

if (WARN_ON_ONCE(cap != TRBE_ERRATA_SENTINEL))
 

> Changes since v2:
>   - Automatically define TRBE_ERRATA_MAX
>   - Add some basic sanity check to make sure the new entries
>     are added in order.
>   - Describe the design choice of caching CPU local errata
>     in trbe_cpudata instead of using this_cpu_has_cap()
> Changes since v1:
>   - Flip the order of args for trbe_has_erratum()
>   - Move erratum detection further down in the sequence
> ---
>  drivers/hwtracing/coresight/coresight-trbe.c | 71 +++++++++++++++++++-
>  1 file changed, 68 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index f8c04c428780..314e5e7374c7 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -16,6 +16,7 @@
>  #define pr_fmt(fmt) DRVNAME ": " fmt
>  
>  #include <asm/barrier.h>
> +
>  #include "coresight-self-hosted-trace.h"
>  #include "coresight-trbe.h"
>  
> @@ -67,14 +68,43 @@ struct trbe_buf {
>  	struct trbe_cpudata *cpudata;
>  };
>  
> +/*
> + * TRBE erratum list
> + *
> + * The errata are defined in arm64 generic cpu_errata framework.
> + * Since the errata work arounds could be applied individually
> + * to the affected CPUs inside the TRBE driver, we need to know if
> + * a given CPU is affected by the erratum. Unlike the other erratum
> + * work arounds, TRBE driver needs to check multiple times during
> + * a trace session. Thus we need a quicker access to per-CPU
> + * errata and not issue costly this_cpu_has_cap() everytime.
> + * We keep a set of the affected errata in trbe_cpudata, per TRBE.
> + *
> + * We rely on the corresponding cpucaps to be defined for a given
> + * TRBE erratum. We map the given cpucap into a TRBE internal number
> + * to make the tracking of the errata lean.
> + *
> + * This helps in :
> + *   - Not duplicating the detection logic
> + *   - Streamlined detection of erratum across the system
> + */
> +
> +static int trbe_errata_cpucaps[] = {
> +	-1,		/* Sentinel, must be the last entry */
> +};
> +
> +/* The total number of listed errata in trbe_errata_cpucaps */
> +#define TRBE_ERRATA_MAX			(ARRAY_SIZE(trbe_errata_cpucaps) - 1)
> +
>  /*
>   * struct trbe_cpudata: TRBE instance specific data
>   * @trbe_flag		- TRBE dirty/access flag support
>   * @trbe_hw_align	- Actual TRBE alignment required for TRBPTR_EL1.
> - * @trbe_align		- Software alignment used for the TRBPTR_EL1,
> + * @trbe_align		- Software alignment used for the TRBPTR_EL1
>   * @cpu			- CPU this TRBE belongs to.
>   * @mode		- Mode of current operation. (perf/disabled)
>   * @drvdata		- TRBE specific drvdata
> + * @errata		- Bit map for the errata on this TRBE.
>   */
>  struct trbe_cpudata {
>  	bool trbe_flag;
> @@ -84,6 +114,7 @@ struct trbe_cpudata {
>  	enum cs_mode mode;
>  	struct trbe_buf *buf;
>  	struct trbe_drvdata *drvdata;
> +	DECLARE_BITMAP(errata, TRBE_ERRATA_MAX);
>  };
>  
>  struct trbe_drvdata {
> @@ -96,6 +127,25 @@ struct trbe_drvdata {
>  	struct platform_device *pdev;
>  };
>  
> +static void trbe_check_errata(struct trbe_cpudata *cpudata)
> +{
> +	int i;
> +
> +	for (i = 0; i < TRBE_ERRATA_MAX; i++) {
> +		int cap = trbe_errata_cpucaps[i];
> +
> +		if (WARN_ON_ONCE(cap < 0))
> +			return;
> +		if (this_cpu_has_cap(cap))
> +			set_bit(i, cpudata->errata);
> +	}
> +}
> +
> +static inline bool trbe_has_erratum(struct trbe_cpudata *cpudata, int i)
> +{
> +	return (i < TRBE_ERRATA_MAX) && test_bit(i, cpudata->errata);
> +}
> +
>  static int trbe_alloc_node(struct perf_event *event)
>  {
>  	if (event->cpu == -1)
> @@ -952,6 +1002,9 @@ static void arm_trbe_register_coresight_cpu(struct trbe_drvdata *drvdata, int cp
>  	cpumask_clear_cpu(cpu, &drvdata->supported_cpus);
>  }
>  
> +/*
> + * Must be called with preemption disabled, for trbe_check_errata().
> + */
>  static void arm_trbe_probe_cpu(void *info)
>  {
>  	struct trbe_drvdata *drvdata = info;
> @@ -979,6 +1032,12 @@ static void arm_trbe_probe_cpu(void *info)
>  		goto cpu_clear;
>  	}
>  
> +	/*
> +	 * Run the TRBE erratum checks, now that we know
> +	 * this instance is about to be registered.
> +	 */
> +	trbe_check_errata(cpudata);
> +
>  	cpudata->trbe_align = cpudata->trbe_hw_align;
>  	cpudata->trbe_flag = get_trbe_flag_update(trbidr);
>  	cpudata->cpu = cpu;
> @@ -1032,18 +1091,24 @@ static int arm_trbe_remove_coresight(struct trbe_drvdata *drvdata)
>  	return 0;
>  }
>  
> +static void arm_trbe_probe_hotplugged_cpu(struct trbe_drvdata *drvdata)
> +{
> +	preempt_disable();
> +	arm_trbe_probe_cpu(drvdata);
> +	preempt_enable();
> +}
> +
>  static int arm_trbe_cpu_startup(unsigned int cpu, struct hlist_node *node)
>  {
>  	struct trbe_drvdata *drvdata = hlist_entry_safe(node, struct trbe_drvdata, hotplug_node);
>  
>  	if (cpumask_test_cpu(cpu, &drvdata->supported_cpus)) {
> -
>  		/*
>  		 * If this CPU was not probed for TRBE,
>  		 * initialize it now.
>  		 */
>  		if (!coresight_get_percpu_sink(cpu)) {
> -			arm_trbe_probe_cpu(drvdata);
> +			arm_trbe_probe_hotplugged_cpu(drvdata);
>  			if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
>  				arm_trbe_register_coresight_cpu(drvdata, cpu);
>  			if (cpumask_test_cpu(cpu, &drvdata->supported_cpus))
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 10/15] coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  2021-10-14 22:31 ` [PATCH v5 10/15] coresight: trbe: Workaround TRBE errata overwrite in FILL mode Suzuki K Poulose
  2021-10-18 15:51   ` Mathieu Poirier
@ 2021-10-19  5:42   ` Anshuman Khandual
  1 sibling, 0 replies; 41+ messages in thread
From: Anshuman Khandual @ 2021-10-19  5:42 UTC (permalink / raw)
  To: Suzuki K Poulose, will, mathieu.poirier
  Cc: catalin.marinas, mike.leach, leo.yan, maz, coresight,
	linux-arm-kernel, linux-kernel



On 10/15/21 4:01 AM, Suzuki K Poulose wrote:
> ARM Neoverse-N2 (#2139208) and Cortex-A710(##2119858) suffers from
> an erratum, which when triggered, might cause the TRBE to overwrite
> the trace data already collected in FILL mode, in the event of a WRAP.
> i.e, the TRBE doesn't stop writing the data, instead wraps to the base
> and could write upto 3 cache line size worth trace. Thus, this could
> corrupt the trace at the "BASE" pointer.
> 
> The workaround is to program the write pointer 256bytes from the
> base, such that if the erratum is triggered, it doesn't overwrite
> the trace data that was captured. This skipped region could be
> padded with ignore packets at the end of the session, so that
> the decoder sees a continuous buffer with some padding at the
> beginning. The trace data written at the base is considered
> lost as the limit could have been in the middle of the perf
> ring buffer, and jumping to the "base" is not acceptable.
> We set the flags already to indicate that some amount of trace
> was lost during the FILL event IRQ. So this is fine.
> 
> One important change with the work around is, we program the
> TRBBASER_EL1 to current page where we are allowed to write.
> Otherwise, it could overwrite a region that may be consumed
> by the perf. Towards this, we always make sure that the
> "handle->head" and thus the trbe_write is PAGE_SIZE aligned,
> so that we can set the BASE to the PAGE base and move the
> TRBPTR to the 256bytes offset.
> 
> Cc: Mike Leach <mike.leach@linaro.org>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Leo Yan <leo.yan@linaro.org>
> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
> Changes since v2:
>  - Updated the ASCII art to include better description of
>    all the steps in the work around
> Change since v1:
>  - Updated comment with ASCII art
>  - Add _BYTES suffix for the space to skip for the work around.
> ---
>  drivers/hwtracing/coresight/coresight-trbe.c | 169 +++++++++++++++++--
>  1 file changed, 158 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 314e5e7374c7..b56b166b2dec 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -16,6 +16,7 @@
>  #define pr_fmt(fmt) DRVNAME ": " fmt
>  
>  #include <asm/barrier.h>
> +#include <asm/cpufeature.h>
>  
>  #include "coresight-self-hosted-trace.h"
>  #include "coresight-trbe.h"
> @@ -88,14 +89,22 @@ struct trbe_buf {
>   *   - Not duplicating the detection logic
>   *   - Streamlined detection of erratum across the system
>   */
> +#define TRBE_WORKAROUND_OVERWRITE_FILL_MODE	0
>  
>  static int trbe_errata_cpucaps[] = {
> +	[TRBE_WORKAROUND_OVERWRITE_FILL_MODE] = ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE,
>  	-1,		/* Sentinel, must be the last entry */
>  };
>  
>  /* The total number of listed errata in trbe_errata_cpucaps */
>  #define TRBE_ERRATA_MAX			(ARRAY_SIZE(trbe_errata_cpucaps) - 1)
>  
> +/*
> + * Safe limit for the number of bytes that may be overwritten
> + * when ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE is triggered.
> + */
> +#define TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES	256
> +
>  /*
>   * struct trbe_cpudata: TRBE instance specific data
>   * @trbe_flag		- TRBE dirty/access flag support
> @@ -146,6 +155,11 @@ static inline bool trbe_has_erratum(struct trbe_cpudata *cpudata, int i)
>  	return (i < TRBE_ERRATA_MAX) && test_bit(i, cpudata->errata);
>  }
>  
> +static inline bool trbe_may_overwrite_in_fill_mode(struct trbe_cpudata *cpudata)
> +{
> +	return trbe_has_erratum(cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE);
> +}
> +
>  static int trbe_alloc_node(struct perf_event *event)
>  {
>  	if (event->cpu == -1)
> @@ -549,10 +563,13 @@ static void trbe_enable_hw(struct trbe_buf *buf)
>  	set_trbe_limit_pointer_enabled(buf->trbe_limit);
>  }
>  
> -static enum trbe_fault_action trbe_get_fault_act(u64 trbsr)
> +static enum trbe_fault_action trbe_get_fault_act(struct perf_output_handle *handle,
> +						 u64 trbsr)
>  {
>  	int ec = get_trbe_ec(trbsr);
>  	int bsc = get_trbe_bsc(trbsr);
> +	struct trbe_buf *buf = etm_perf_sink_config(handle);
> +	struct trbe_cpudata *cpudata = buf->cpudata;
>  
>  	WARN_ON(is_trbe_running(trbsr));
>  	if (is_trbe_trg(trbsr) || is_trbe_abort(trbsr))
> @@ -561,10 +578,16 @@ static enum trbe_fault_action trbe_get_fault_act(u64 trbsr)
>  	if ((ec == TRBE_EC_STAGE1_ABORT) || (ec == TRBE_EC_STAGE2_ABORT))
>  		return TRBE_FAULT_ACT_FATAL;
>  
> -	if (is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) {
> -		if (get_trbe_write_pointer() == get_trbe_base_pointer())
> -			return TRBE_FAULT_ACT_WRAP;
> -	}
> +	/*
> +	 * If the trbe is affected by TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
> +	 * it might write data after a WRAP event in the fill mode.
> +	 * Thus the check TRBPTR == TRBBASER will not be honored.
> +	 */
> +	if ((is_trbe_wrap(trbsr) && (ec == TRBE_EC_OTHERS) && (bsc == TRBE_BSC_FILLED)) &&
> +	    (trbe_may_overwrite_in_fill_mode(cpudata) ||
> +	     get_trbe_write_pointer() == get_trbe_base_pointer()))
> +		return TRBE_FAULT_ACT_WRAP;
> +
>  	return TRBE_FAULT_ACT_SPURIOUS;
>  }
>  
> @@ -573,6 +596,8 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
>  {
>  	u64 write;
>  	u64 start_off, end_off;
> +	u64 size;
> +	u64 overwrite_skip = TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES;
>  
>  	/*
>  	 * If the TRBE has wrapped around the write pointer has
> @@ -593,7 +618,18 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
>  
>  	if (WARN_ON_ONCE(end_off < start_off))
>  		return 0;
> -	return (end_off - start_off);
> +
> +	size = end_off - start_off;
> +	/*
> +	 * If the TRBE is affected by the following erratum, we must fill
> +	 * the space we skipped with IGNORE packets. And we are always
> +	 * guaranteed to have at least a PAGE_SIZE space in the buffer.
> +	 */
> +	if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE) &&
> +	    !WARN_ON(size < overwrite_skip))
> +		__trbe_pad_buf(buf, start_off, overwrite_skip);
> +
> +	return size;
>  }
>  
>  static void *arm_trbe_alloc_buffer(struct coresight_device *csdev,
> @@ -712,7 +748,7 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
>  		clr_trbe_irq();
>  		isb();
>  
> -		act = trbe_get_fault_act(status);
> +		act = trbe_get_fault_act(handle, status);
>  		/*
>  		 * If this was not due to a WRAP event, we have some
>  		 * errors and as such buffer is empty.
> @@ -736,21 +772,117 @@ static unsigned long arm_trbe_update_buffer(struct coresight_device *csdev,
>  	return size;
>  }
>  
> +
> +static int trbe_apply_work_around_before_enable(struct trbe_buf *buf)
> +{
> +	/*
> +	 * TRBE_WORKAROUND_OVERWRITE_FILL_MODE causes the TRBE to overwrite a few cache
> +	 * line size from the "TRBBASER_EL1" in the event of a "FILL".
> +	 * Thus, we could loose some amount of the trace at the base.
> +	 *
> +	 * Before Fix:
> +	 *
> +	 *  normal-BASE     head (normal-TRBPTR)         tail (normal-LIMIT)
> +	 *  |                   \/                       /
> +	 *   -------------------------------------------------------------
> +	 *  |   Pg0      |   Pg1       |           |          |  PgN     |
> +	 *   -------------------------------------------------------------
> +	 *
> +	 * In the normal course of action, we would set the TRBBASER to the
> +	 * beginning of the ring-buffer (normal-BASE). But with the erratum,
> +	 * the TRBE could overwrite the contents at the "normal-BASE", after
> +	 * hitting the "normal-LIMIT", since it doesn't stop as expected. And
> +	 * this is wrong. This could result in overwriting trace collected in
> +	 * one of the previous runs, being consumed by the user. So we must
> +	 * always make sure that the TRBBASER is within the region
> +	 * [head, head+size]. Note that TRBBASER must be PAGE aligned,
> +	 *
> +	 *  After moving the BASE:
> +	 *
> +	 *  normal-BASE     head (normal-TRBPTR)         tail (normal-LIMIT)
> +	 *  |                   \/                       /
> +	 *   -------------------------------------------------------------
> +	 *  |         |          |xyzdef.     |..   tuvw|                |
> +	 *   -------------------------------------------------------------
> +	 *                      /
> +	 *              New-BASER
> +	 *
> +	 * Also, we would set the TRBPTR to head (after adjusting for
> +	 * alignment) at normal-PTR. This would mean that the last few bytes
> +	 * of the trace (say, "xyz") might overwrite the first few bytes of
> +	 * trace written ("abc"). More importantly they will appear in what
> +	 * userspace sees as the beginning of the trace, which is wrong. We may
> +	 * not always have space to move the latest trace "xyz" to the correct
> +	 * order as it must appear beyond the LIMIT. (i.e, [head..head+size]).
> +	 * Thus it is easier to ignore those bytes than to complicate the
> +	 * driver to move it, assuming that the erratum was triggered and
> +	 * doing additional checks to see if there is indeed allowed space at
> +	 * TRBLIMITR.LIMIT.
> +	 *
> +	 *  Thus the full workaround will move the BASE and the PTR and would
> +	 *  look like (after padding at the skipped bytes at the end of
> +	 *  session) :
> +	 *
> +	 *  normal-BASE     head (normal-TRBPTR)         tail (normal-LIMIT)
> +	 *  |                   \/                       /
> +	 *   -------------------------------------------------------------
> +	 *  |         |          |///abc..     |..  rst|                |
> +	 *   -------------------------------------------------------------
> +	 *                      /    |
> +	 *              New-BASER    New-TRBPTR
> +	 *
> +	 * To summarize, with the work around:
> +	 *
> +	 *  - We always align the offset for the next session to PAGE_SIZE
> +	 *    (This is to ensure we can program the TRBBASER to this offset
> +	 *    within the region [head...head+size]).
> +	 *
> +	 *  - At TRBE enable:
> +	 *     - Set the TRBBASER to the page aligned offset of the current
> +	 *       proposed write offset. (which is guaranteed to be aligned
> +	 *       as above)
> +	 *     - Move the TRBPTR to skip first 256bytes (that might be
> +	 *       overwritten with the erratum). This ensures that the trace
> +	 *       generated in the session is not re-written.
> +	 *
> +	 *  - At trace collection:
> +	 *     - Pad the 256bytes skipped above again with IGNORE packets.
> +	 */
> +	if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE)) {
> +		if (WARN_ON(!IS_ALIGNED(buf->trbe_write, PAGE_SIZE)))
> +			return -EINVAL;
> +		buf->trbe_hw_base = buf->trbe_write;
> +		buf->trbe_write += TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES;
> +	}
> +
> +	return 0;
> +}
> +
>  static int __arm_trbe_enable(struct trbe_buf *buf,
>  			     struct perf_output_handle *handle)
>  {
> +	int ret = 0;
> +
>  	perf_aux_output_flag(handle, PERF_AUX_FLAG_CORESIGHT_FORMAT_RAW);
>  	buf->trbe_limit = compute_trbe_buffer_limit(handle);
>  	buf->trbe_write = buf->trbe_base + PERF_IDX2OFF(handle->head, buf);
>  	if (buf->trbe_limit == buf->trbe_base) {
> -		trbe_stop_and_truncate_event(handle);
> -		return -ENOSPC;
> +		ret = -ENOSPC;
> +		goto err;
>  	}
>  	/* Set the base of the TRBE to the buffer base */
>  	buf->trbe_hw_base = buf->trbe_base;
> +
> +	ret = trbe_apply_work_around_before_enable(buf);
> +	if (ret)
> +		goto err;
> +
>  	*this_cpu_ptr(buf->cpudata->drvdata->handle) = handle;
>  	trbe_enable_hw(buf);
>  	return 0;
> +err:
> +	trbe_stop_and_truncate_event(handle);
> +	return ret;
>  }
>  
>  static int arm_trbe_enable(struct coresight_device *csdev, u32 mode, void *data)
> @@ -890,7 +1022,7 @@ static irqreturn_t arm_trbe_irq_handler(int irq, void *dev)
>  	if (!is_perf_trbe(handle))
>  		return IRQ_NONE;
>  
> -	act = trbe_get_fault_act(status);
> +	act = trbe_get_fault_act(handle, status);
>  	switch (act) {
>  	case TRBE_FAULT_ACT_WRAP:
>  		truncated = !!trbe_handle_overflow(handle);
> @@ -1038,7 +1170,22 @@ static void arm_trbe_probe_cpu(void *info)
>  	 */
>  	trbe_check_errata(cpudata);
>  
> -	cpudata->trbe_align = cpudata->trbe_hw_align;
> +	/*
> +	 * If the TRBE is affected by erratum TRBE_WORKAROUND_OVERWRITE_FILL_MODE,
> +	 * we must always program the TBRPTR_EL1, 256bytes from a page
> +	 * boundary, with TRBBASER_EL1 set to the page, to prevent
> +	 * TRBE over-writing 256bytes at TRBBASER_EL1 on FILL event.
> +	 *
> +	 * Thus make sure we always align our write pointer to a PAGE_SIZE,
> +	 * which also guarantees that we have at least a PAGE_SIZE space in
> +	 * the buffer (TRBLIMITR is PAGE aligned) and thus we can skip
> +	 * the required bytes at the base.
> +	 */
> +	if (trbe_may_overwrite_in_fill_mode(cpudata))
> +		cpudata->trbe_align = PAGE_SIZE;
> +	else
> +		cpudata->trbe_align = cpudata->trbe_hw_align;
> +
>  	cpudata->trbe_flag = get_trbe_flag_update(trbidr);
>  	cpudata->cpu = cpu;
>  	cpudata->drvdata = drvdata;
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 12/15] coresight: trbe: Make sure we have enough space
  2021-10-14 22:31 ` [PATCH v5 12/15] coresight: trbe: Make sure we have enough space Suzuki K Poulose
@ 2021-10-19  5:55   ` Anshuman Khandual
  0 siblings, 0 replies; 41+ messages in thread
From: Anshuman Khandual @ 2021-10-19  5:55 UTC (permalink / raw)
  To: Suzuki K Poulose, will, mathieu.poirier
  Cc: catalin.marinas, mike.leach, leo.yan, maz, coresight,
	linux-arm-kernel, linux-kernel



On 10/15/21 4:01 AM, Suzuki K Poulose wrote:
> The TRBE driver makes sure that there is enough space for a meaningful
> run, otherwise pads the given space and restarts the offset calculation
> once. But there is no guarantee that we may find space or hit "no space".
> Make sure that we repeat the step until, either :
>   - We have the minimum space
>    OR
>   - There is NO space at all.
> 
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Mike Leach <mike.leach@linaro.org>
> Cc: Leo Yan <leo.yan@linaro.org>
> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
>  drivers/hwtracing/coresight/coresight-trbe.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 4a50309a892d..5fb9f49eab33 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -477,10 +477,14 @@ static unsigned long trbe_normal_offset(struct perf_output_handle *handle)
>  	 * If the head is too close to the limit and we don't
>  	 * have space for a meaningful run, we rather pad it
>  	 * and start fresh.
> +	 *
> +	 * We might have to do this more than once to make sure
> +	 * we have enough required space.
>  	 */
> -	if (limit && ((limit - head) < trbe_min_trace_buf_size(handle))) {
> +	while (limit && ((limit - head) < trbe_min_trace_buf_size(handle))) {
>  		trbe_pad_buf(handle, limit - head);
>  		limit = __trbe_normal_offset(handle);
> +		head = PERF_IDX2OFF(handle->head, buf);
>  	}
>  	return limit;
>  }
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 13/15] coresight: trbe: Work around write to out of range
  2021-10-14 22:31 ` [PATCH v5 13/15] coresight: trbe: Work around write to out of range Suzuki K Poulose
@ 2021-10-19  5:57   ` Anshuman Khandual
  0 siblings, 0 replies; 41+ messages in thread
From: Anshuman Khandual @ 2021-10-19  5:57 UTC (permalink / raw)
  To: Suzuki K Poulose, will, mathieu.poirier
  Cc: catalin.marinas, mike.leach, leo.yan, maz, coresight,
	linux-arm-kernel, linux-kernel



On 10/15/21 4:01 AM, Suzuki K Poulose wrote:
> TRBE implementations affected by Arm erratum (2253138 or 2224489), could
> write to the next address after the TRBLIMITR.LIMIT, instead of wrapping
> to the TRBBASER. This implies that the TRBE could potentially corrupt :
> 
>   - A page used by the rest of the kernel/user (if the LIMIT = end of
>     perf ring buffer)
>   - A page within the ring buffer, but outside the driver's range.
>     [head, head + size]. This may contain some trace data, may be
>     consumed by the userspace.
> 
> We workaround this erratum by :
>   - Making sure that there is at least an extra PAGE space left in the
>     TRBE's range than we normally assign. This will be additional to other
>     restrictions (e.g, the TRBE alignment for working around
>     TRBE_WORKAROUND_OVERWRITE_IN_FILL_MODE, where there is a minimum of
>     PAGE_SIZE. Thus we would have 2 * PAGE_SIZE)
> 
>   - Adjust the LIMIT to leave the last PAGE_SIZE out of the TRBE's allowed
>     range (i.e, TRBEBASER...TRBLIMITR.LIMIT), by :
> 
>         TRBLIMITR.LIMIT -= PAGE_SIZE
> 
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Mike Leach <mike.leach@linaro.org>
> Cc: Leo Yan <leo.yan@linaro.org>
> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
>  drivers/hwtracing/coresight/coresight-trbe.c | 63 +++++++++++++++++++-
>  1 file changed, 62 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
> index 5fb9f49eab33..eca2e7e94079 100644
> --- a/drivers/hwtracing/coresight/coresight-trbe.c
> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
> @@ -90,9 +90,11 @@ struct trbe_buf {
>   *   - Streamlined detection of erratum across the system
>   */
>  #define TRBE_WORKAROUND_OVERWRITE_FILL_MODE	0
> +#define TRBE_WORKAROUND_WRITE_OUT_OF_RANGE	1
>  
>  static int trbe_errata_cpucaps[] = {
>  	[TRBE_WORKAROUND_OVERWRITE_FILL_MODE] = ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE,
> +	[TRBE_WORKAROUND_WRITE_OUT_OF_RANGE] = ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE,
>  	-1,		/* Sentinel, must be the last entry */
>  };
>  
> @@ -160,6 +162,11 @@ static inline bool trbe_may_overwrite_in_fill_mode(struct trbe_cpudata *cpudata)
>  	return trbe_has_erratum(cpudata, TRBE_WORKAROUND_OVERWRITE_FILL_MODE);
>  }
>  
> +static inline bool trbe_may_write_out_of_range(struct trbe_cpudata *cpudata)
> +{
> +	return trbe_has_erratum(cpudata, TRBE_WORKAROUND_WRITE_OUT_OF_RANGE);
> +}
> +
>  static int trbe_alloc_node(struct perf_event *event)
>  {
>  	if (event->cpu == -1)
> @@ -305,7 +312,21 @@ static unsigned long trbe_snapshot_offset(struct perf_output_handle *handle)
>  
>  static u64 trbe_min_trace_buf_size(struct perf_output_handle *handle)
>  {
> -	return TRBE_TRACE_MIN_BUF_SIZE;
> +	u64 size = TRBE_TRACE_MIN_BUF_SIZE;
> +	struct trbe_buf *buf = etm_perf_sink_config(handle);
> +	struct trbe_cpudata *cpudata = buf->cpudata;
> +
> +	/*
> +	 * When the TRBE is affected by an erratum that could make it
> +	 * write to the next "virtually addressed" page beyond the LIMIT.
> +	 * We need to make sure there is always a PAGE after the LIMIT,
> +	 * within the buffer. Thus we ensure there is at least an extra
> +	 * page than normal. With this we could then adjust the LIMIT
> +	 * pointer down by a PAGE later.
> +	 */
> +	if (trbe_may_write_out_of_range(cpudata))
> +		size += PAGE_SIZE;
> +	return size;
>  }
>  
>  /*
> @@ -611,6 +632,17 @@ static unsigned long trbe_get_trace_size(struct perf_output_handle *handle,
>  	/*
>  	 * If the TRBE has wrapped around the write pointer has
>  	 * wrapped and should be treated as limit.
> +	 *
> +	 * When the TRBE is affected by TRBE_WORKAROUND_WRITE_OUT_OF_RANGE,
> +	 * it may write upto 64bytes beyond the "LIMIT". The driver already
> +	 * keeps a valid page next to the LIMIT and we could potentially
> +	 * consume the trace data that may have been collected there. But we
> +	 * cannot be really sure it is available, and the TRBPTR may not
> +	 * indicate the same. Also, affected cores are also affected by another
> +	 * erratum which forces the PAGE_SIZE alignment on the TRBPTR, and thus
> +	 * could potentially pad an entire PAGE_SIZE - 64bytes, to get those
> +	 * 64bytes. Thus we ignore the potential triggering of the erratum
> +	 * on WRAP and limit the data to LIMIT.
>  	 */
>  	if (wrap)
>  		write = get_trbe_limit_pointer();
> @@ -864,6 +896,35 @@ static int trbe_apply_work_around_before_enable(struct trbe_buf *buf)
>  		buf->trbe_write += TRBE_WORKAROUND_OVERWRITE_FILL_MODE_SKIP_BYTES;
>  	}
>  
> +	/*
> +	 * TRBE_WORKAROUND_WRITE_OUT_OF_RANGE could cause the TRBE to write to
> +	 * the next page after the TRBLIMITR.LIMIT. For perf, the "next page"
> +	 * may be:
> +	 *     - The page beyond the ring buffer. This could mean, TRBE could
> +	 *       corrupt another entity (kernel / user)
> +	 *     - A portion of the "ring buffer" consumed by the userspace.
> +	 *       i.e, a page outisde [head, head + size].
> +	 *
> +	 * We work around this by:
> +	 *     - Making sure that we have at least an extra space of PAGE left
> +	 *       in the ring buffer [head, head + size], than we normally do
> +	 *       without the erratum. See trbe_min_trace_buf_size().
> +	 *
> +	 *     - Adjust the TRBLIMITR.LIMIT to leave the extra PAGE outside
> +	 *       the TRBE's range (i.e [TRBBASER, TRBLIMITR.LIMI] ).
> +	 */
> +	if (trbe_has_erratum(buf->cpudata, TRBE_WORKAROUND_WRITE_OUT_OF_RANGE)) {
> +		s64 space = buf->trbe_limit - buf->trbe_write;
> +		/*
> +		 * We must have more than a PAGE_SIZE worth space in the proposed
> +		 * range for the TRBE.
> +		 */
> +		if (WARN_ON(space <= PAGE_SIZE ||
> +			    !IS_ALIGNED(buf->trbe_limit, PAGE_SIZE)))
> +			return -EINVAL;
> +		buf->trbe_limit -= PAGE_SIZE;
> +	}
> +
>  	return 0;
>  }
>  
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 14/15] arm64: errata: Enable workaround for TRBE overwrite in FILL mode
  2021-10-14 22:31 ` [PATCH v5 14/15] arm64: errata: Enable workaround for TRBE overwrite in FILL mode Suzuki K Poulose
  2021-10-18 15:54   ` Mathieu Poirier
@ 2021-10-19  5:59   ` Anshuman Khandual
  2021-10-19 10:42   ` Will Deacon
  2 siblings, 0 replies; 41+ messages in thread
From: Anshuman Khandual @ 2021-10-19  5:59 UTC (permalink / raw)
  To: Suzuki K Poulose, will, mathieu.poirier
  Cc: catalin.marinas, mike.leach, leo.yan, maz, coresight,
	linux-arm-kernel, linux-kernel



On 10/15/21 4:01 AM, Suzuki K Poulose wrote:
> With the workaround enabled in TRBE, enable the config entries
> to be built without COMPILE_TEST
> 
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
>  arch/arm64/Kconfig | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index f30029f4a9f9..f72fa44d6182 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -672,7 +672,6 @@ config ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>  config ARM64_ERRATUM_2119858
>  	bool "Cortex-A710: 2119858: workaround TRBE overwriting trace data in FILL mode"
>  	default y
> -	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
>  	depends on CORESIGHT_TRBE
>  	select ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>  	help
> @@ -691,7 +690,6 @@ config ARM64_ERRATUM_2119858
>  config ARM64_ERRATUM_2139208
>  	bool "Neoverse-N2: 2139208: workaround TRBE overwriting trace data in FILL mode"
>  	default y
> -	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
>  	depends on CORESIGHT_TRBE
>  	select ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE
>  	help
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 15/15] arm64: errata: Enable TRBE workaround for write to out-of-range address
  2021-10-14 22:31 ` [PATCH v5 15/15] arm64: errata: Enable TRBE workaround for write to out-of-range address Suzuki K Poulose
  2021-10-18 15:54   ` Mathieu Poirier
@ 2021-10-19  6:00   ` Anshuman Khandual
  2021-10-19 10:42   ` Will Deacon
  2 siblings, 0 replies; 41+ messages in thread
From: Anshuman Khandual @ 2021-10-19  6:00 UTC (permalink / raw)
  To: Suzuki K Poulose, will, mathieu.poirier
  Cc: catalin.marinas, mike.leach, leo.yan, maz, coresight,
	linux-arm-kernel, linux-kernel



On 10/15/21 4:01 AM, Suzuki K Poulose wrote:
> With the TRBE driver workaround available, enable the config symbols
> to be built without COMPILE_TEST
> 
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>

> ---
>  arch/arm64/Kconfig | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index f72fa44d6182..d6383ef05871 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -743,7 +743,6 @@ config ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
>  
>  config ARM64_ERRATUM_2253138
>  	bool "Neoverse-N2: 2253138: workaround TRBE writing to address out-of-range"
> -	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
>  	depends on CORESIGHT_TRBE
>  	default y
>  	select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
> @@ -762,7 +761,6 @@ config ARM64_ERRATUM_2253138
>  
>  config ARM64_ERRATUM_2224489
>  	bool "Cortex-A710: 2224489: workaround TRBE writing to address out-of-range"
> -	depends on COMPILE_TEST # Until the CoreSight TRBE driver changes are in
>  	depends on CORESIGHT_TRBE
>  	default y
>  	select ARM64_WORKAROUND_TRBE_WRITE_OUT_OF_RANGE
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 10/15] coresight: trbe: Workaround TRBE errata overwrite in FILL mode
  2021-10-19  4:36       ` Anshuman Khandual
@ 2021-10-19  8:37         ` Suzuki K Poulose
  0 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-19  8:37 UTC (permalink / raw)
  To: Anshuman Khandual, Mathieu Poirier
  Cc: will, catalin.marinas, mike.leach, leo.yan, maz, coresight,
	linux-arm-kernel, linux-kernel

On 19/10/2021 05:36, Anshuman Khandual wrote:
> 
> 
> On 10/19/21 2:45 AM, Suzuki K Poulose wrote:
>> On 18/10/2021 16:51, Mathieu Poirier wrote:
>>> On Thu, Oct 14, 2021 at 11:31:20PM +0100, Suzuki K Poulose wrote:
>>>> ARM Neoverse-N2 (#2139208) and Cortex-A710(##2119858) suffers from
>>>> an erratum, which when triggered, might cause the TRBE to overwrite
>>>> the trace data already collected in FILL mode, in the event of a WRAP.
>>>> i.e, the TRBE doesn't stop writing the data, instead wraps to the base
>>>> and could write upto 3 cache line size worth trace. Thus, this could
>>>> corrupt the trace at the "BASE" pointer.
>>>>
>>>> The workaround is to program the write pointer 256bytes from the
>>>> base, such that if the erratum is triggered, it doesn't overwrite
>>>> the trace data that was captured. This skipped region could be
>>>> padded with ignore packets at the end of the session, so that
>>>> the decoder sees a continuous buffer with some padding at the
>>>> beginning. The trace data written at the base is considered
>>>> lost as the limit could have been in the middle of the perf
>>>> ring buffer, and jumping to the "base" is not acceptable.
>>>> We set the flags already to indicate that some amount of trace
>>>> was lost during the FILL event IRQ. So this is fine.
>>>>
>>>> One important change with the work around is, we program the
>>>> TRBBASER_EL1 to current page where we are allowed to write.
>>>> Otherwise, it could overwrite a region that may be consumed
>>>> by the perf. Towards this, we always make sure that the
>>>> "handle->head" and thus the trbe_write is PAGE_SIZE aligned,
>>>> so that we can set the BASE to the PAGE base and move the
>>>> TRBPTR to the 256bytes offset.
>>>>
>>>> Cc: Mike Leach <mike.leach@linaro.org>
>>>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>>>> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
>>>> Cc: Leo Yan <leo.yan@linaro.org>
>>>> Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
>>>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>>>> ---
>>>> Changes since v2:
>>>>    - Updated the ASCII art to include better description of
>>>>      all the steps in the work around
>>>> Change since v1:
>>>>    - Updated comment with ASCII art
>>>>    - Add _BYTES suffix for the space to skip for the work around.
>>>> ---
>>>>    drivers/hwtracing/coresight/coresight-trbe.c | 169 +++++++++++++++++--
>>>>    1 file changed, 158 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-trbe.c b/drivers/hwtracing/coresight/coresight-trbe.c
>>>> index 314e5e7374c7..b56b166b2dec 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-trbe.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-trbe.c
>>>> @@ -16,6 +16,7 @@
>>>>    #define pr_fmt(fmt) DRVNAME ": " fmt
>>>>      #include <asm/barrier.h>
>>>> +#include <asm/cpufeature.h>
>>>
>>> Here too I get a checkpatch warning...
>>>
>>
>> That is a false alarm. I guess that warns for including
>> linux/cpufeature.h? It is a bit odd, we include this
>> for the arm64 cpucaps, not the generic linux feature
> 
> It is a bit odd, I saw that too.
> 
>> checks. (They are used for "loading modules" based
>> on "features" which are more like ELF HWCAPs).
> 
> Should <asm/cpufeature.h> be renamed as <asm/arm64_cpufeature.h>
> or something similar instead to differentiate it from the generic
> <linux/cpufeature.h> as they are not related. Also, probably this
> warning could be avoided.

It is not that simple. asm/cpufeature.h on arm64 provides :

  * arch backend for linux/cpufeature.h
  * CPU feature sanity check infrastructure
  * CPU capability infrastructure ( features & errata )

So ideally it should be split into 3 for better cleanup and
is something that could be pursued outside this series.

Suzuki

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 14/15] arm64: errata: Enable workaround for TRBE overwrite in FILL mode
  2021-10-14 22:31 ` [PATCH v5 14/15] arm64: errata: Enable workaround for TRBE overwrite in FILL mode Suzuki K Poulose
  2021-10-18 15:54   ` Mathieu Poirier
  2021-10-19  5:59   ` Anshuman Khandual
@ 2021-10-19 10:42   ` Will Deacon
  2 siblings, 0 replies; 41+ messages in thread
From: Will Deacon @ 2021-10-19 10:42 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: mathieu.poirier, catalin.marinas, anshuman.khandual, mike.leach,
	leo.yan, maz, coresight, linux-arm-kernel, linux-kernel

On Thu, Oct 14, 2021 at 11:31:24PM +0100, Suzuki K Poulose wrote:
> With the workaround enabled in TRBE, enable the config entries
> to be built without COMPILE_TEST
> 
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/Kconfig | 2 --
>  1 file changed, 2 deletions(-)

Acked-by: Will Deacon <will@kernel.org>

Will

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 15/15] arm64: errata: Enable TRBE workaround for write to out-of-range address
  2021-10-14 22:31 ` [PATCH v5 15/15] arm64: errata: Enable TRBE workaround for write to out-of-range address Suzuki K Poulose
  2021-10-18 15:54   ` Mathieu Poirier
  2021-10-19  6:00   ` Anshuman Khandual
@ 2021-10-19 10:42   ` Will Deacon
  2 siblings, 0 replies; 41+ messages in thread
From: Will Deacon @ 2021-10-19 10:42 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: mathieu.poirier, catalin.marinas, anshuman.khandual, mike.leach,
	leo.yan, maz, coresight, linux-arm-kernel, linux-kernel

On Thu, Oct 14, 2021 at 11:31:25PM +0100, Suzuki K Poulose wrote:
> With the TRBE driver workaround available, enable the config symbols
> to be built without COMPILE_TEST
> 
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  arch/arm64/Kconfig | 2 --
>  1 file changed, 2 deletions(-)

Acked-by: Will Deacon <will@kernel.org>

Will

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 03/15] arm64: errata: Add workaround for TSB flush failures
  2021-10-14 22:31 ` [PATCH v5 03/15] arm64: errata: Add workaround for TSB flush failures Suzuki K Poulose
  2021-10-19  5:06   ` Anshuman Khandual
@ 2021-10-19 11:02   ` Will Deacon
  2021-10-19 11:36     ` Suzuki K Poulose
  1 sibling, 1 reply; 41+ messages in thread
From: Will Deacon @ 2021-10-19 11:02 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: mathieu.poirier, catalin.marinas, anshuman.khandual, mike.leach,
	leo.yan, maz, coresight, linux-arm-kernel, linux-kernel,
	Mark Rutland

On Thu, Oct 14, 2021 at 11:31:13PM +0100, Suzuki K Poulose wrote:
> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
> index ccd757373f36..bdbeac75ead6 100644
> --- a/arch/arm64/kernel/cpu_errata.c
> +++ b/arch/arm64/kernel/cpu_errata.c
> @@ -352,6 +352,18 @@ static const struct midr_range trbe_overwrite_fill_mode_cpus[] = {
>  };
>  #endif	/* CONFIG_ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE */
>  
> +#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE
> +static const struct midr_range tsb_flush_fail_cpus[] = {
> +#ifdef CONFIG_ARM64_ERRATUM_2067961
> +	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
> +#endif
> +#ifdef CONFIG_ARM64_ERRATUM_2054223
> +	MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
> +#endif
> +	{},
> +};
> +#endif	/* CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE */
> +
>  const struct arm64_cpu_capabilities arm64_errata[] = {
>  #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
>  	{
> @@ -558,6 +570,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>  		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
>  		CAP_MIDR_RANGE_LIST(trbe_overwrite_fill_mode_cpus),
>  	},
> +#endif
> +#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILRE

You still haven't fixed this typo...

Seriously, I get compile warnings from this -- are you not seeing them?

Will

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 02/15] arm64: errata: Add detection for TRBE overwrite in FILL mode
  2021-10-14 22:31 ` [PATCH v5 02/15] arm64: errata: Add detection for TRBE overwrite in FILL mode Suzuki K Poulose
@ 2021-10-19 11:04   ` Will Deacon
  2021-10-19 11:15     ` Suzuki K Poulose
  0 siblings, 1 reply; 41+ messages in thread
From: Will Deacon @ 2021-10-19 11:04 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: mathieu.poirier, catalin.marinas, anshuman.khandual, mike.leach,
	leo.yan, maz, coresight, linux-arm-kernel, linux-kernel,
	Mark Rutland

On Thu, Oct 14, 2021 at 11:31:12PM +0100, Suzuki K Poulose wrote:
> Arm Neoverse-N2 and the Cortex-A710 cores are affected
> by a CPU erratum where the TRBE will overwrite the trace buffer
> in FILL mode. The TRBE doesn't stop (as expected in FILL mode)
> when it reaches the limit and wraps to the base to continue
> writing upto 3 cache lines. This will overwrite any trace that
> was written previously.
> 
> Add the Neoverse-N2 erratumi(#2139208) and Cortex-A710 erratum
> (#2119858) to the  detection logic.

Weird typo and double space in this sentence.

Will

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 02/15] arm64: errata: Add detection for TRBE overwrite in FILL mode
  2021-10-19 11:04   ` Will Deacon
@ 2021-10-19 11:15     ` Suzuki K Poulose
  0 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-19 11:15 UTC (permalink / raw)
  To: Will Deacon
  Cc: mathieu.poirier, catalin.marinas, anshuman.khandual, mike.leach,
	leo.yan, maz, coresight, linux-arm-kernel, linux-kernel,
	Mark Rutland

On 19/10/2021 12:04, Will Deacon wrote:
> On Thu, Oct 14, 2021 at 11:31:12PM +0100, Suzuki K Poulose wrote:
>> Arm Neoverse-N2 and the Cortex-A710 cores are affected
>> by a CPU erratum where the TRBE will overwrite the trace buffer
>> in FILL mode. The TRBE doesn't stop (as expected in FILL mode)
>> when it reaches the limit and wraps to the base to continue
>> writing upto 3 cache lines. This will overwrite any trace that
>> was written previously.
>>
>> Add the Neoverse-N2 erratumi(#2139208) and Cortex-A710 erratum
>> (#2119858) to the  detection logic.
> 
> Weird typo and double space in this sentence.

I have fixed this now.

Thanks
Suzuki


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 03/15] arm64: errata: Add workaround for TSB flush failures
  2021-10-19 11:02   ` Will Deacon
@ 2021-10-19 11:36     ` Suzuki K Poulose
  2021-10-19 11:42       ` Will Deacon
  0 siblings, 1 reply; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-19 11:36 UTC (permalink / raw)
  To: Will Deacon
  Cc: mathieu.poirier, catalin.marinas, anshuman.khandual, mike.leach,
	leo.yan, maz, coresight, linux-arm-kernel, linux-kernel,
	Mark Rutland

On 19/10/2021 12:02, Will Deacon wrote:
> On Thu, Oct 14, 2021 at 11:31:13PM +0100, Suzuki K Poulose wrote:
>> diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
>> index ccd757373f36..bdbeac75ead6 100644
>> --- a/arch/arm64/kernel/cpu_errata.c
>> +++ b/arch/arm64/kernel/cpu_errata.c
>> @@ -352,6 +352,18 @@ static const struct midr_range trbe_overwrite_fill_mode_cpus[] = {
>>   };
>>   #endif	/* CONFIG_ARM64_WORKAROUND_TRBE_OVERWRITE_FILL_MODE */
>>   
>> +#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE
>> +static const struct midr_range tsb_flush_fail_cpus[] = {
>> +#ifdef CONFIG_ARM64_ERRATUM_2067961
>> +	MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2),
>> +#endif
>> +#ifdef CONFIG_ARM64_ERRATUM_2054223
>> +	MIDR_ALL_VERSIONS(MIDR_CORTEX_A710),
>> +#endif
>> +	{},
>> +};
>> +#endif	/* CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE */
>> +
>>   const struct arm64_cpu_capabilities arm64_errata[] = {
>>   #ifdef CONFIG_ARM64_WORKAROUND_CLEAN_CACHE
>>   	{
>> @@ -558,6 +570,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>>   		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
>>   		CAP_MIDR_RANGE_LIST(trbe_overwrite_fill_mode_cpus),
>>   	},
>> +#endif
>> +#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILRE
> 
> You still haven't fixed this typo...
> 

Sorry about that. I thought it was about selecting the
Kconfig entries, which was fixed. I will fix this.

> Seriously, I get compile warnings from this -- are you not seeing them?

No, I don't get any warnings. Is there something that I am missing ?

--8>--

suzuki@ewhatever:coresight$ grep "WERROR\|TSB" .config
CONFIG_WERROR=y
CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE=y
suzuki@ewhatever:coresight$ grep TSB arch/arm64/kernel/cpu_errata.c
#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE
#endif  /* CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE */
#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILRE
                 .capability = ARM64_WORKAROUND_TSB_FLUSH_FAILURE,


suzuki@ewhatever:coresight$ touch arch/arm64/kernel/cpu_errata.c
suzuki@ewhatever:coresight$ make -j16
   CALL    scripts/atomic/check-atomics.sh
   CALL    scripts/checksyscalls.sh
   CHK     include/generated/compile.h
   CC      arch/arm64/kernel/cpu_errata.o
   AR      arch/arm64/kernel/built-in.a
   AR      arch/arm64/built-in.a
   GEN     .version
   CHK     include/generated/compile.h
   UPD     include/generated/compile.h
   CC      init/version.o
   AR      init/built-in.a
   LD      vmlinux.o
   MODPOST vmlinux.symvers
   MODINFO modules.builtin.modinfo
   GEN     modules.builtin
   LD      .tmp_vmlinux.kallsyms1
   KSYMS   .tmp_vmlinux.kallsyms1.S
   AS      .tmp_vmlinux.kallsyms1.S
   LD      .tmp_vmlinux.kallsyms2
   KSYMS   .tmp_vmlinux.kallsyms2.S
   AS      .tmp_vmlinux.kallsyms2.S
   LD      vmlinux
   SORTTAB vmlinux
   SYSMAP  System.map
   MODPOST modules-only.symvers
   OBJCOPY arch/arm64/boot/Image
   GEN     Module.symvers
   GZIP    arch/arm64/boot/Image.gz

Suzuki
> 
> Will
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 03/15] arm64: errata: Add workaround for TSB flush failures
  2021-10-19 11:36     ` Suzuki K Poulose
@ 2021-10-19 11:42       ` Will Deacon
  2021-10-19 12:06         ` Suzuki K Poulose
  0 siblings, 1 reply; 41+ messages in thread
From: Will Deacon @ 2021-10-19 11:42 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: mathieu.poirier, catalin.marinas, anshuman.khandual, mike.leach,
	leo.yan, maz, coresight, linux-arm-kernel, linux-kernel,
	Mark Rutland

On Tue, Oct 19, 2021 at 12:36:48PM +0100, Suzuki K Poulose wrote:
> On 19/10/2021 12:02, Will Deacon wrote:
> > On Thu, Oct 14, 2021 at 11:31:13PM +0100, Suzuki K Poulose wrote:
> > > @@ -558,6 +570,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
> > >   		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
> > >   		CAP_MIDR_RANGE_LIST(trbe_overwrite_fill_mode_cpus),
> > >   	},
> > > +#endif
> > > +#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILRE
> > 
> > You still haven't fixed this typo...
> > 
> 
> Sorry about that. I thought it was about selecting the
> Kconfig entries, which was fixed. I will fix this.

Sorry, I thought it was such a howler that it would've jumped out ;)
That's what made me think we should make sure the series compiles without
the coresight changes, so we can catch these problems early.

> > Seriously, I get compile warnings from this -- are you not seeing them?
> 
> No, I don't get any warnings. Is there something that I am missing ?

Interesting. I see the warning below in my bisection testing, since the typo
means that the midr lookup table isn't used. Maybe you're only compiling the
end result?

Will

--->8

+arch/arm64/kernel/cpu_errata.c:356:32: warning: ‘tsb_flush_fail_cpus’ defined but not used [-Wunused-const-variable=]
+  356 | static const struct midr_range tsb_flush_fail_cpus[] = {
+      |                                ^~~~~~~~~~~~~~~~~~~

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 03/15] arm64: errata: Add workaround for TSB flush failures
  2021-10-19 11:42       ` Will Deacon
@ 2021-10-19 12:06         ` Suzuki K Poulose
  0 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-19 12:06 UTC (permalink / raw)
  To: Will Deacon
  Cc: mathieu.poirier, catalin.marinas, anshuman.khandual, mike.leach,
	leo.yan, maz, coresight, linux-arm-kernel, linux-kernel,
	Mark Rutland

On 19/10/2021 12:42, Will Deacon wrote:
> On Tue, Oct 19, 2021 at 12:36:48PM +0100, Suzuki K Poulose wrote:
>> On 19/10/2021 12:02, Will Deacon wrote:
>>> On Thu, Oct 14, 2021 at 11:31:13PM +0100, Suzuki K Poulose wrote:
>>>> @@ -558,6 +570,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
>>>>    		.type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
>>>>    		CAP_MIDR_RANGE_LIST(trbe_overwrite_fill_mode_cpus),
>>>>    	},
>>>> +#endif
>>>> +#ifdef CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILRE
>>>
>>> You still haven't fixed this typo...
>>>
>>
>> Sorry about that. I thought it was about selecting the
>> Kconfig entries, which was fixed. I will fix this.
> 
> Sorry, I thought it was such a howler that it would've jumped out ;)
> That's what made me think we should make sure the series compiles without
> the coresight changes, so we can catch these problems early.
> 
>>> Seriously, I get compile warnings from this -- are you not seeing them?
>>
>> No, I don't get any warnings. Is there something that I am missing ?
> 
> Interesting. I see the warning below in my bisection testing, since the typo
> means that the midr lookup table isn't used. Maybe you're only compiling the
> end result?

No, I was compiling this at the commit. Also, please note that the
TSB flush failure config is enabled with the patch, unlike the TRBE
errata ones.

My GCC is :

gcc version 9.3.1 20200408 (Red Hat 9.3.1-2) (GCC)


$ grep TSB arch/arm64/Kconfig arch/arm64/kernel/cpu_errata.c .config
arch/arm64/Kconfig:config ARM64_WORKAROUND_TSB_FLUSH_FAILURE
arch/arm64/Kconfig:     bool "Cortex-A710: 2054223: workaround TSB 
instruction failing to flush trace"
arch/arm64/Kconfig:     select ARM64_WORKAROUND_TSB_FLUSH_FAILURE
arch/arm64/Kconfig:       Affected cores may fail to flush the trace 
data on a TSB instruction, when
arch/arm64/Kconfig:       Workaround is to issue two TSB consecutively 
on affected cores.
arch/arm64/Kconfig:     bool "Neoverse-N2: 2067961: workaround TSB 
instruction failing to flush trace"
arch/arm64/Kconfig:     select ARM64_WORKAROUND_TSB_FLUSH_FAILURE
arch/arm64/Kconfig:       Affected cores may fail to flush the trace 
data on a TSB instruction, when
arch/arm64/Kconfig:       Workaround is to issue two TSB consecutively 
on affected cores.
arch/arm64/kernel/cpu_errata.c:#ifdef 
CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE
arch/arm64/kernel/cpu_errata.c:#endif   /* 
CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE */
arch/arm64/kernel/cpu_errata.c:#ifdef 
CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILRE
arch/arm64/kernel/cpu_errata.c:         .capability = 
ARM64_WORKAROUND_TSB_FLUSH_FAILURE,
.config:CONFIG_ARM64_WORKAROUND_TSB_FLUSH_FAILURE=y
suzuki@ewhatever:coresight$ git log --oneline -1
89e0c94bd734 (HEAD) arm64: errata: Add workaround for TSB flush failures


> 
> Will
> 
> --->8
> 
> +arch/arm64/kernel/cpu_errata.c:356:32: warning: ‘tsb_flush_fail_cpus’ defined but not used [-Wunused-const-variable=]
> +  356 | static const struct midr_range tsb_flush_fail_cpus[] = {
> +      |                                ^~~~~~~~~~~~~~~~~~~
> 

That looks a valid warning. Hmm, strange.

It does complain for an unused function though.

$ make -j16
   CALL    scripts/atomic/check-atomics.sh
   CALL    scripts/checksyscalls.sh
   CHK     include/generated/compile.h
   CC      arch/arm64/kernel/cpu_errata.o
arch/arm64/kernel/cpu_errata.c:90:13: error: 
‘here_is_an_unused_function’ defined but not used [-Werror=unused-function]
  static void here_is_an_unused_function(void)
              ^~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make[2]: *** [scripts/Makefile.build:277: 
arch/arm64/kernel/cpu_errata.o] Error 1
make[1]: *** [scripts/Makefile.build:540: arch/arm64/kernel] Error 2
make: *** [Makefile:1874: arch/arm64] Error 2
make: *** Waiting for unfinished jobs....

--8>--

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index aaa66c9eee24..57c83e84b274 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -87,12 +87,20 @@ has_mismatched_cache_type(const struct 
arm64_cpu_capabilities *entry,
         return (ctr_real != sys) && (ctr_raw != sys);
  }

+static void here_is_an_unused_function(void)
+{
+       pr_crit("I am unused\n");
+}
+
  static void
  cpu_enable_trap_ctr_access(const struct arm64_cpu_capabilities *cap)
  {
         u64 mask = arm64_ftr_reg_ctrel0.strict_mask;
         bool enable_uct_trap = false;

+#ifdef CONFIG_UNUSED_FUNCTION
+       here_is_an_unused_function();
+#endif

Cheers
Suzuki

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 04/15] arm64: errata: Add detection for TRBE write to out-of-range
  2021-10-18 15:50   ` Mathieu Poirier
@ 2021-10-19 13:29     ` Suzuki K Poulose
  0 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-19 13:29 UTC (permalink / raw)
  To: Mathieu Poirier
  Cc: will, catalin.marinas, anshuman.khandual, mike.leach, leo.yan,
	maz, coresight, linux-arm-kernel, linux-kernel, Mark Rutland

On 18/10/2021 16:50, Mathieu Poirier wrote:
> Good morning,
> 
> On Thu, Oct 14, 2021 at 11:31:14PM +0100, Suzuki K Poulose wrote:
>> Arm Neoverse-N2 and Cortex-A710 cores are affected by an erratum where the
>> trbe, under some circumstances, might write upto 64bytes to an address after
> 
> Checkpatch gives me a warning about this line...
> 
>> the Limit as programmed by the TRBLIMITR_EL1.LIMIT. This might -
>>
>>    - Corrupt a page in the ring buffer, which may corrupt trace from a
>>      previous session, consumed by userspace.
>>    - Hit the guard page at the end of the vmalloc area and raise a fault.
>>
>> To keep the handling simpler, we always leave the last page from the
>> range, which TRBE is allowed to write. This can be achieved by ensuring
>> that we always have more than a PAGE worth space in the range, while
>> calculating the LIMIT for TRBE. And then the LIMIT pointer can be adjusted
>> to leave the PAGE (TRBLIMITR.LIMIT -= PAGE_SIZE), out of the TRBE range
>> while enabling it. This makes sure that the TRBE will only write to an area
>> within its allowed limit (i.e, [head-head+size]) and we do not have to handle
> 
> I'm pretty sure this line will also be flagged.
> 

Thanks for pointing them out. I have fixed it now.

Suzuki

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 09/15] coresight: trbe: Add infrastructure for Errata handling
  2021-10-14 22:31 ` [PATCH v5 09/15] coresight: trbe: Add infrastructure for Errata handling Suzuki K Poulose
  2021-10-19  5:25   ` Anshuman Khandual
@ 2021-10-29 10:31   ` Arnd Bergmann
  2021-10-29 13:00     ` Suzuki K Poulose
  1 sibling, 1 reply; 41+ messages in thread
From: Arnd Bergmann @ 2021-10-29 10:31 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: Will Deacon, Mathieu Poirier, Catalin Marinas, Anshuman Khandual,
	Mike Leach, Leo Yan, Marc Zyngier, coresight, Linux ARM,
	Linux Kernel Mailing List

On Fri, Oct 15, 2021 at 12:31 AM Suzuki K Poulose
<suzuki.poulose@arm.com> wrote:
>
> +static void trbe_check_errata(struct trbe_cpudata *cpudata)
> +{
> +       int i;
> +
> +       for (i = 0; i < TRBE_ERRATA_MAX; i++) {
> +               int cap = trbe_errata_cpucaps[i];
> +
> +               if (WARN_ON_ONCE(cap < 0))
> +                       return;
> +               if (this_cpu_has_cap(cap))
> +                       set_bit(i, cpudata->errata);
> +       }
> +}

this_cpu_has_cap() is private to arch/arm64 and not exported, so this causes
a build failure when used from a loadable module:

ERROR: modpost: "this_cpu_has_cap"
[drivers/hwtracing/coresight/coresight-trbe.ko] undefined!

Should this symbol be exported or do we need a different workaround?

       Arnd

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH v5 09/15] coresight: trbe: Add infrastructure for Errata handling
  2021-10-29 10:31   ` Arnd Bergmann
@ 2021-10-29 13:00     ` Suzuki K Poulose
  0 siblings, 0 replies; 41+ messages in thread
From: Suzuki K Poulose @ 2021-10-29 13:00 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Will Deacon, Mathieu Poirier, Catalin Marinas, Anshuman Khandual,
	Mike Leach, Leo Yan, Marc Zyngier, coresight, Linux ARM,
	Linux Kernel Mailing List

Hi Arnd

Thanks for the report.

On 29/10/2021 11:31, Arnd Bergmann wrote:
> On Fri, Oct 15, 2021 at 12:31 AM Suzuki K Poulose
> <suzuki.poulose@arm.com> wrote:
>>
>> +static void trbe_check_errata(struct trbe_cpudata *cpudata)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < TRBE_ERRATA_MAX; i++) {
>> +               int cap = trbe_errata_cpucaps[i];
>> +
>> +               if (WARN_ON_ONCE(cap < 0))
>> +                       return;
>> +               if (this_cpu_has_cap(cap))
>> +                       set_bit(i, cpudata->errata);
>> +       }
>> +}
> 
> this_cpu_has_cap() is private to arch/arm64 and not exported, so this causes
> a build failure when used from a loadable module:
> 
> ERROR: modpost: "this_cpu_has_cap"
> [drivers/hwtracing/coresight/coresight-trbe.ko] undefined!
> 
> Should this symbol be exported or do we need a different workaround?

This should be exported. I can send in a patch.

Suzuki

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, other threads:[~2021-10-29 13:00 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-14 22:31 [PATCH v5 00/15] arm64: Self-hosted trace related errata workarounds Suzuki K Poulose
2021-10-14 22:31 ` [PATCH v5 01/15] arm64: Add Neoverse-N2, Cortex-A710 CPU part definition Suzuki K Poulose
2021-10-14 22:31 ` [PATCH v5 02/15] arm64: errata: Add detection for TRBE overwrite in FILL mode Suzuki K Poulose
2021-10-19 11:04   ` Will Deacon
2021-10-19 11:15     ` Suzuki K Poulose
2021-10-14 22:31 ` [PATCH v5 03/15] arm64: errata: Add workaround for TSB flush failures Suzuki K Poulose
2021-10-19  5:06   ` Anshuman Khandual
2021-10-19 11:02   ` Will Deacon
2021-10-19 11:36     ` Suzuki K Poulose
2021-10-19 11:42       ` Will Deacon
2021-10-19 12:06         ` Suzuki K Poulose
2021-10-14 22:31 ` [PATCH v5 04/15] arm64: errata: Add detection for TRBE write to out-of-range Suzuki K Poulose
2021-10-18 15:50   ` Mathieu Poirier
2021-10-19 13:29     ` Suzuki K Poulose
2021-10-14 22:31 ` [PATCH v5 05/15] coresight: trbe: Add a helper to calculate the trace generated Suzuki K Poulose
2021-10-14 22:31 ` [PATCH v5 06/15] coresight: trbe: Add a helper to pad a given buffer area Suzuki K Poulose
2021-10-14 22:31 ` [PATCH v5 07/15] coresight: trbe: Decouple buffer base from the hardware base Suzuki K Poulose
2021-10-14 22:31 ` [PATCH v5 08/15] coresight: trbe: Allow driver to choose a different alignment Suzuki K Poulose
2021-10-14 22:31 ` [PATCH v5 09/15] coresight: trbe: Add infrastructure for Errata handling Suzuki K Poulose
2021-10-19  5:25   ` Anshuman Khandual
2021-10-29 10:31   ` Arnd Bergmann
2021-10-29 13:00     ` Suzuki K Poulose
2021-10-14 22:31 ` [PATCH v5 10/15] coresight: trbe: Workaround TRBE errata overwrite in FILL mode Suzuki K Poulose
2021-10-18 15:51   ` Mathieu Poirier
2021-10-18 21:15     ` Suzuki K Poulose
2021-10-19  4:36       ` Anshuman Khandual
2021-10-19  8:37         ` Suzuki K Poulose
2021-10-19  5:42   ` Anshuman Khandual
2021-10-14 22:31 ` [PATCH v5 11/15] coresight: trbe: Add a helper to determine the minimum buffer size Suzuki K Poulose
2021-10-14 22:31 ` [PATCH v5 12/15] coresight: trbe: Make sure we have enough space Suzuki K Poulose
2021-10-19  5:55   ` Anshuman Khandual
2021-10-14 22:31 ` [PATCH v5 13/15] coresight: trbe: Work around write to out of range Suzuki K Poulose
2021-10-19  5:57   ` Anshuman Khandual
2021-10-14 22:31 ` [PATCH v5 14/15] arm64: errata: Enable workaround for TRBE overwrite in FILL mode Suzuki K Poulose
2021-10-18 15:54   ` Mathieu Poirier
2021-10-19  5:59   ` Anshuman Khandual
2021-10-19 10:42   ` Will Deacon
2021-10-14 22:31 ` [PATCH v5 15/15] arm64: errata: Enable TRBE workaround for write to out-of-range address Suzuki K Poulose
2021-10-18 15:54   ` Mathieu Poirier
2021-10-19  6:00   ` Anshuman Khandual
2021-10-19 10:42   ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).