linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0
@ 2018-10-09 13:47 Suzuki K Poulose
  2018-10-09 13:47 ` Suzuki K Poulose
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Suzuki K Poulose @ 2018-10-09 13:47 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, will.deacon, mark.rutland, pelcan,
	shankerd, suzuki.poulose

This series makes sure that we handle the CTR_EL0 field mismatches
properly, especially for the IDC field. Also, skip trapping CTR
accesses on a CPU if it matches the safe value.

Applies on arm64 for-next/core.

Changes since v1
 - Fix wrong hunk in has_cache_idc()
 - Allow a late secondary CPU with raw CTR_EL0.IDC = 0 and effective
   CTR_EL0.IDC = 1, to boot on a system without IDC available.

Suzuki K Poulose (3):
  arm64: cpufeature: ctr: Fix cpu capability check for late CPUs
  arm64: cpufeature: Fix handling of CTR_EL0.IDC field
  arm64: cpufeature: Trap CTR_EL0 access only where it is necessary

 arch/arm64/include/asm/cache.h | 40 ++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/cpu_errata.c | 34 +++++++++++++++++++++++++----
 arch/arm64/kernel/cpufeature.c | 35 +++++++++++++++++++++++++----
 arch/arm64/kernel/cpuinfo.c    | 10 ++++++++-
 4 files changed, 110 insertions(+), 9 deletions(-)

-- 
2.19.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0
  2018-10-09 13:47 [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0 Suzuki K Poulose
@ 2018-10-09 13:47 ` Suzuki K Poulose
  2018-10-09 13:47 ` [PATCH v2 1/3] arm64: cpufeature: ctr: Fix cpu capability check for late CPUs Suzuki K Poulose
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Suzuki K Poulose @ 2018-10-09 13:47 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, will.deacon, mark.rutland, pelcan,
	shankerd, suzuki.poulose

This series makes sure that we handle the CTR_EL0 field mismatches
properly, especially for the IDC field. Also, skip trapping CTR
accesses on a CPU if it matches the safe value.

Applies on arm64 for-next/core.

Changes since v1
 - Fix wrong hunk in has_cache_idc()
 - Allow a late secondary CPU with raw CTR_EL0.IDC = 0 and effective
   CTR_EL0.IDC = 1, to boot on a system without IDC available.

Suzuki K Poulose (3):
  arm64: cpufeature: ctr: Fix cpu capability check for late CPUs
  arm64: cpufeature: Fix handling of CTR_EL0.IDC field
  arm64: cpufeature: Trap CTR_EL0 access only where it is necessary

 arch/arm64/include/asm/cache.h | 40 ++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/cpu_errata.c | 34 +++++++++++++++++++++++++----
 arch/arm64/kernel/cpufeature.c | 35 +++++++++++++++++++++++++----
 arch/arm64/kernel/cpuinfo.c    | 10 ++++++++-
 4 files changed, 110 insertions(+), 9 deletions(-)

-- 
2.19.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/3] arm64: cpufeature: ctr: Fix cpu capability check for late CPUs
  2018-10-09 13:47 [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0 Suzuki K Poulose
  2018-10-09 13:47 ` Suzuki K Poulose
@ 2018-10-09 13:47 ` Suzuki K Poulose
  2018-10-09 13:47 ` [PATCH v2 2/3] arm64: cpufeature: Fix handling of CTR_EL0.IDC field Suzuki K Poulose
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Suzuki K Poulose @ 2018-10-09 13:47 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, will.deacon, mark.rutland, pelcan,
	shankerd, suzuki.poulose

The matches() routine for a capability must honor the "scope"
passed to it and return the proper results.
i.e, when passed with SCOPE_LOCAL_CPU, it should check the
status of the capability on the current CPU. This is used by
verify_local_cpu_capabilities() on a late secondary CPU to make
sure that it's compliant with the established system features.
However, ARM64_HAS_CACHE_{IDC/DIC} always checks the system wide
registers and this could mean that a late secondary CPU could return
"true" (since the CPU hasn't updated the system wide registers yet)
and thus lead the system in an inconsistent state, where
the system assumes it has IDC/DIC feature, while the new CPU
doesn't.

Fixes: commit 6ae4b6e0578886eb36 ("arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC")
Cc: Philip Elcan <pelcan@codeaurora.org>
Cc: Shanker Donthineni <shankerd@codeaurora.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/kernel/cpufeature.c | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 00e7c313f088..ba16bb7762ca 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -854,15 +854,29 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
 }
 
 static bool has_cache_idc(const struct arm64_cpu_capabilities *entry,
-			  int __unused)
+			  int scope)
 {
-	return read_sanitised_ftr_reg(SYS_CTR_EL0) & BIT(CTR_IDC_SHIFT);
+	u64 ctr;
+
+	if (scope == SCOPE_SYSTEM)
+		ctr = arm64_ftr_reg_ctrel0.sys_val;
+	else
+		ctr = read_cpuid_cachetype();
+
+	return ctr & BIT(CTR_IDC_SHIFT);
 }
 
 static bool has_cache_dic(const struct arm64_cpu_capabilities *entry,
-			  int __unused)
+			  int scope)
 {
-	return read_sanitised_ftr_reg(SYS_CTR_EL0) & BIT(CTR_DIC_SHIFT);
+	u64 ctr;
+
+	if (scope == SCOPE_SYSTEM)
+		ctr = arm64_ftr_reg_ctrel0.sys_val;
+	else
+		ctr = read_cpuid_cachetype();
+
+	return ctr & BIT(CTR_DIC_SHIFT);
 }
 
 static bool __maybe_unused
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/3] arm64: cpufeature: Fix handling of CTR_EL0.IDC field
  2018-10-09 13:47 [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0 Suzuki K Poulose
  2018-10-09 13:47 ` Suzuki K Poulose
  2018-10-09 13:47 ` [PATCH v2 1/3] arm64: cpufeature: ctr: Fix cpu capability check for late CPUs Suzuki K Poulose
@ 2018-10-09 13:47 ` Suzuki K Poulose
  2018-10-09 13:47 ` [PATCH v2 3/3] arm64: cpufeature: Trap CTR_EL0 access only where it is necessary Suzuki K Poulose
  2018-10-16 11:00 ` [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0 Catalin Marinas
  4 siblings, 0 replies; 6+ messages in thread
From: Suzuki K Poulose @ 2018-10-09 13:47 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, will.deacon, mark.rutland, pelcan,
	shankerd, suzuki.poulose

CTR_EL0.IDC reports the data cache clean requirements for instruction
to data coherence. However, if the field is 0, we need to check the
CLIDR_EL1 fields to detect the status of the feature. Currently we
don't do this and generate a warning with tainting the kernel, when
there is a mismatch in the field among the CPUs. Also the userspace
doesn't have a reliable way to check the CLIDR_EL1 register to check
the status.

This patch fixes the problem by checking the CLIDR_EL1 fields, when
(CTR_EL0.IDC == 0) and updates the kernel's copy of the CTR_EL0 for
the CPU with the actual status of the feature. This would allow the
sanity check infrastructure to do the proper checking of the fields
and also allow the CTR_EL0 emulation code to supply the real status
of the feature.

Now, if a CPU has raw CTR_EL0.IDC == 0 and effective IDC == 1 (with
overall system wide IDC == 1), we need to expose the real value to
the user. So, we trap CTR_EL0 access on the CPU which reports incorrect
CTR_EL0.IDC.

Fixes: commit 6ae4b6e057888 ("arm64: Add support for new control bits CTR_EL0.DIC and CTR_EL0.IDC")
Cc: Shanker Donthineni <shankerd@codeaurora.org>
Cc: Philip Elcan <pelcan@codeaurora.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/cache.h | 40 ++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/cpu_errata.c | 27 ++++++++++++++++++++---
 arch/arm64/kernel/cpufeature.c | 15 ++++++++++++-
 arch/arm64/kernel/cpuinfo.c    | 10 ++++++++-
 4 files changed, 87 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
index 5ee5bca8c24b..13dd42c3ad4e 100644
--- a/arch/arm64/include/asm/cache.h
+++ b/arch/arm64/include/asm/cache.h
@@ -40,6 +40,15 @@
 #define L1_CACHE_SHIFT		(6)
 #define L1_CACHE_BYTES		(1 << L1_CACHE_SHIFT)
 
+
+#define CLIDR_LOUU_SHIFT	27
+#define CLIDR_LOC_SHIFT		24
+#define CLIDR_LOUIS_SHIFT	21
+
+#define CLIDR_LOUU(clidr)	(((clidr) >> CLIDR_LOUU_SHIFT) & 0x7)
+#define CLIDR_LOC(clidr)	(((clidr) >> CLIDR_LOC_SHIFT) & 0x7)
+#define CLIDR_LOUIS(clidr)	(((clidr) >> CLIDR_LOUIS_SHIFT) & 0x7)
+
 /*
  * Memory returned by kmalloc() may be used for DMA, so we must make
  * sure that all such allocations are cache aligned. Otherwise,
@@ -84,6 +93,37 @@ static inline int cache_line_size(void)
 	return cwg ? 4 << cwg : ARCH_DMA_MINALIGN;
 }
 
+/*
+ * Read the effective value of CTR_EL0.
+ *
+ * According to ARM ARM for ARMv8-A (ARM DDI 0487C.a),
+ * section D10.2.33 "CTR_EL0, Cache Type Register" :
+ *
+ * CTR_EL0.IDC reports the data cache clean requirements for
+ * instruction to data coherence.
+ *
+ *  0 - dcache clean to PoU is required unless :
+ *     (CLIDR_EL1.LoC == 0) || (CLIDR_EL1.LoUIS == 0 && CLIDR_EL1.LoUU == 0)
+ *  1 - dcache clean to PoU is not required for i-to-d coherence.
+ *
+ * This routine provides the CTR_EL0 with the IDC field updated to the
+ * effective state.
+ */
+static inline u32 __attribute_const__ read_cpuid_effective_cachetype(void)
+{
+	u32 ctr = read_cpuid_cachetype();
+
+	if (!(ctr & BIT(CTR_IDC_SHIFT))) {
+		u64 clidr = read_sysreg(clidr_el1);
+
+		if (CLIDR_LOC(clidr) == 0 ||
+		    (CLIDR_LOUIS(clidr) == 0 && CLIDR_LOUU(clidr) == 0))
+			ctr |= BIT(CTR_IDC_SHIFT);
+	}
+
+	return ctr;
+}
+
 #endif	/* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index cde948991d68..3deb01c6ed49 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -68,11 +68,32 @@ static bool
 has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
 			  int scope)
 {
-	u64 mask = arm64_ftr_reg_ctrel0.strict_mask;;
+	u64 mask = arm64_ftr_reg_ctrel0.strict_mask;
+	u64 sys = arm64_ftr_reg_ctrel0.sys_val & mask;
+	u64 ctr_raw, ctr_real;
 
 	WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
-	return (read_cpuid_cachetype() & mask) !=
-	       (arm64_ftr_reg_ctrel0.sys_val & mask);
+
+	/*
+	 * We want to make sure that all the CPUs in the system expose
+	 * a consistent CTR_EL0 to make sure that applications behaves
+	 * correctly with migration.
+	 *
+	 * If a CPU has CTR_EL0.IDC but does not advertise it via CTR_EL0 :
+	 *
+	 * 1) It is safe if the system doesn't support IDC, as CPU anyway
+	 *    reports IDC = 0, consistent with the rest.
+	 *
+	 * 2) If the system has IDC, it is still safe as we trap CTR_EL0
+	 *    access on this CPU via the ARM64_HAS_CACHE_IDC capability.
+	 *
+	 * So, we need to make sure either the raw CTR_EL0 or the effective
+	 * CTR_EL0 matches the system's copy to allow a secondary CPU to boot.
+	 */
+	ctr_raw = read_cpuid_cachetype() & mask;
+	ctr_real = read_cpuid_effective_cachetype() & mask;
+
+	return (ctr_real != sys) && (ctr_raw != sys);
 }
 
 static void
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index ba16bb7762ca..af50064dea51 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -861,11 +861,23 @@ static bool has_cache_idc(const struct arm64_cpu_capabilities *entry,
 	if (scope == SCOPE_SYSTEM)
 		ctr = arm64_ftr_reg_ctrel0.sys_val;
 	else
-		ctr = read_cpuid_cachetype();
+		ctr = read_cpuid_effective_cachetype();
 
 	return ctr & BIT(CTR_IDC_SHIFT);
 }
 
+static void cpu_emulate_effective_ctr(const struct arm64_cpu_capabilities *__unused)
+{
+	/*
+	 * If the CPU exposes raw CTR_EL0.IDC = 0, while effectively
+	 * CTR_EL0.IDC = 1 (from CLIDR values), we need to trap accesses
+	 * to the CTR_EL0 on this CPU and emulate it with the real/safe
+	 * value.
+	 */
+	if (!(read_cpuid_cachetype() & BIT(CTR_IDC_SHIFT)))
+		sysreg_clear_set(sctlr_el1, SCTLR_EL1_UCT, 0);
+}
+
 static bool has_cache_dic(const struct arm64_cpu_capabilities *entry,
 			  int scope)
 {
@@ -1282,6 +1294,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.capability = ARM64_HAS_CACHE_IDC,
 		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
 		.matches = has_cache_idc,
+		.cpu_enable = cpu_emulate_effective_ctr,
 	},
 	{
 		.desc = "Instruction cache invalidation not required for I/D coherence",
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index dce971f2c167..bcc2831399cb 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -325,7 +325,15 @@ static void cpuinfo_detect_icache_policy(struct cpuinfo_arm64 *info)
 static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
 {
 	info->reg_cntfrq = arch_timer_get_cntfrq();
-	info->reg_ctr = read_cpuid_cachetype();
+	/*
+	 * Use the effective value of the CTR_EL0 than the raw value
+	 * exposed by the CPU. CTR_E0.IDC field value must be interpreted
+	 * with the CLIDR_EL1 fields to avoid triggering false warnings
+	 * when there is a mismatch across the CPUs. Keep track of the
+	 * effective value of the CTR_EL0 in our internal records for
+	 * acurate sanity check and feature enablement.
+	 */
+	info->reg_ctr = read_cpuid_effective_cachetype();
 	info->reg_dczid = read_cpuid(DCZID_EL0);
 	info->reg_midr = read_cpuid_id();
 	info->reg_revidr = read_cpuid(REVIDR_EL1);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 3/3] arm64: cpufeature: Trap CTR_EL0 access only where it is necessary
  2018-10-09 13:47 [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0 Suzuki K Poulose
                   ` (2 preceding siblings ...)
  2018-10-09 13:47 ` [PATCH v2 2/3] arm64: cpufeature: Fix handling of CTR_EL0.IDC field Suzuki K Poulose
@ 2018-10-09 13:47 ` Suzuki K Poulose
  2018-10-16 11:00 ` [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0 Catalin Marinas
  4 siblings, 0 replies; 6+ messages in thread
From: Suzuki K Poulose @ 2018-10-09 13:47 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, will.deacon, mark.rutland, pelcan,
	shankerd, suzuki.poulose

When there is a mismatch in the CTR_EL0 field, we trap
access to CTR from EL0 on all CPUs to expose the safe
value. However, we could skip trapping on a CPU which
matches the safe value.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/kernel/cpu_errata.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 3deb01c6ed49..9af8df96ec49 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -99,7 +99,12 @@ has_mismatched_cache_type(const struct arm64_cpu_capabilities *entry,
 static void
 cpu_enable_trap_ctr_access(const struct arm64_cpu_capabilities *__unused)
 {
-	sysreg_clear_set(sctlr_el1, SCTLR_EL1_UCT, 0);
+	u64 mask = arm64_ftr_reg_ctrel0.strict_mask;
+
+	/* Trap CTR_EL0 access on this CPU, only if it has a mismatch */
+	if ((read_cpuid_cachetype() & mask) !=
+	    (arm64_ftr_reg_ctrel0.sys_val & mask))
+		sysreg_clear_set(sctlr_el1, SCTLR_EL1_UCT, 0);
 }
 
 atomic_t arm64_el2_vector_last_slot = ATOMIC_INIT(-1);
-- 
2.19.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0
  2018-10-09 13:47 [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0 Suzuki K Poulose
                   ` (3 preceding siblings ...)
  2018-10-09 13:47 ` [PATCH v2 3/3] arm64: cpufeature: Trap CTR_EL0 access only where it is necessary Suzuki K Poulose
@ 2018-10-16 11:00 ` Catalin Marinas
  4 siblings, 0 replies; 6+ messages in thread
From: Catalin Marinas @ 2018-10-16 11:00 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, mark.rutland, pelcan, will.deacon,
	linux-kernel, shankerd

On Tue, Oct 09, 2018 at 02:47:03PM +0100, Suzuki K. Poulose wrote:
> This series makes sure that we handle the CTR_EL0 field mismatches
> properly, especially for the IDC field. Also, skip trapping CTR
> accesses on a CPU if it matches the safe value.
> 
> Applies on arm64 for-next/core.
> 
> Changes since v1
>  - Fix wrong hunk in has_cache_idc()
>  - Allow a late secondary CPU with raw CTR_EL0.IDC = 0 and effective
>    CTR_EL0.IDC = 1, to boot on a system without IDC available.
> 
> Suzuki K Poulose (3):
>   arm64: cpufeature: ctr: Fix cpu capability check for late CPUs
>   arm64: cpufeature: Fix handling of CTR_EL0.IDC field
>   arm64: cpufeature: Trap CTR_EL0 access only where it is necessary

Queued for 4.20. Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-10-16 11:00 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-09 13:47 [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0 Suzuki K Poulose
2018-10-09 13:47 ` Suzuki K Poulose
2018-10-09 13:47 ` [PATCH v2 1/3] arm64: cpufeature: ctr: Fix cpu capability check for late CPUs Suzuki K Poulose
2018-10-09 13:47 ` [PATCH v2 2/3] arm64: cpufeature: Fix handling of CTR_EL0.IDC field Suzuki K Poulose
2018-10-09 13:47 ` [PATCH v2 3/3] arm64: cpufeature: Trap CTR_EL0 access only where it is necessary Suzuki K Poulose
2018-10-16 11:00 ` [PATCH v2 0/3] arm64: cpufeature: Fix handling of CTR_EL0 Catalin Marinas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).