* [PATCH 0/8]  arm64: Work around for mismatched cache line size
@ 2016-07-08 11:37 Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 1/8] arm64: Set the safe value for L1 icache policy Suzuki K Poulose
                   ` (9 more replies)
  0 siblings, 10 replies; 13+ messages in thread
From: Suzuki K Poulose @ 2016-07-08 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon,
	Suzuki K Poulose

This series adds a work around for systems with mismatched {I,D}-cache
line sizes. When a thread of execution gets migrated to a different CPU,
the cache line size it read (and cached) on the old CPU could be larger
than that of the new CPU, which can lead to data corruption (a sketch of
the failure mode follows the list below). We work around this by

 - Dynamically patching the kernel to use the smallest line size on the
   system (from the CPU feature infrastructure)
 - Trapping the userspace access to CTR_EL0 (by clearing SCTLR_EL1.UCT) and
   emulating it with the system wide safe value of CTR.
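
As an illustration, consider a minimal, hypothetical user-space flush
routine (not from this series) that caches the line size read from
CTR_EL0. If the thread reads CTR_EL0 on a CPU with 128-byte lines and
is then migrated to a CPU with 64-byte lines, the loop strides over
every other line and leaves stale data behind:

#include <stdint.h>

static inline uint64_t read_ctr_el0(void)
{
	uint64_t ctr;

	asm volatile("mrs %0, ctr_el0" : "=r" (ctr));
	return ctr;
}

static void flush_for_jit(char *start, char *end)
{
	/* CTR_EL0.DminLine, bits [19:16]: log2 of line size in words */
	uint64_t line = 4UL << ((read_ctr_el0() >> 16) & 0xf);
	char *p = (char *)((uintptr_t)start & ~(line - 1));

	for (; p < end; p += line)
		asm volatile("dc cvau, %0" : : "r" (p));
	asm volatile("dsb ish" : : : "memory");
}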

The series also adds support for alternative code patching of adrp
instructions by adjusting the PC-relative address offset to reflect
the new PC.

The series has been tested on Juno with a hack that forces the
capability on.

Applies on aarch64: for-next/core. The tree is available at :

	git://linux-arm.org/linux-skp.git ctr-emulation


Suzuki K Poulose (8):
  arm64: Set the safe value for L1 icache policy
  arm64: Use consistent naming for errata handling
  arm64: Rearrange CPU errata workaround checks
  arm64/insn: Add helpers for pc relative address offsets
  arm64: alternative: Add support for patching adrp instructions
  arm64: Introduce raw_{d,i}cache_line_size
  arm64: Refactor sysinstr exception handling
  arm64: Work around systems with mismatched cache line sizes

 arch/arm64/include/asm/assembler.h  | 45 +++++++++++++++++--
 arch/arm64/include/asm/cpufeature.h | 14 +++---
 arch/arm64/include/asm/esr.h        | 56 ++++++++++++++++++++++++
 arch/arm64/include/asm/insn.h       |  5 +++
 arch/arm64/include/asm/sysreg.h     |  1 +
 arch/arm64/kernel/alternative.c     | 13 ++++++
 arch/arm64/kernel/asm-offsets.c     |  2 +
 arch/arm64/kernel/cpu_errata.c      | 26 ++++++++++-
 arch/arm64/kernel/cpufeature.c      | 44 ++++++++++++++-----
 arch/arm64/kernel/cpuinfo.c         |  2 -
 arch/arm64/kernel/hibernate-asm.S   |  2 +-
 arch/arm64/kernel/insn.c            | 23 ++++++++++
 arch/arm64/kernel/relocate_kernel.S |  2 +-
 arch/arm64/kernel/smp.c             |  8 +++-
 arch/arm64/kernel/traps.c           | 87 ++++++++++++++++++++++++++-----------
 15 files changed, 275 insertions(+), 55 deletions(-)

-- 
2.7.4


* [PATCH 1/8] arm64: Set the safe value for L1 icache policy
  2016-07-08 11:37 [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
@ 2016-07-08 11:37 ` Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 2/8] arm64: Use consistent naming for errata handling Suzuki K Poulose
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Suzuki K Poulose @ 2016-07-08 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon,
	Suzuki K Poulose

Right now we use 0 as the safe value for CTR_EL0:L1Ip, which is not a
defined I-cache policy encoding. The safe value for L1Ip should be the
weakest of the policies, which happens to be AIVIVT. While at it, fix
the comment about safe_val.
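
For context, safe_val only matters for FTR_EXACT fields: when CPUs
disagree on such a field, the feature code falls back to it. In outline
(a paraphrase, not the kernel's exact code):

static s64 exact_safe_value(const struct arm64_ftr_bits *ftrp,
			    s64 cur, s64 new)
{
	/* FTR_EXACT: any mismatch across CPUs degrades to the safe value */
	return (cur == new) ? cur : ftrp->safe_val;
}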

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/cpufeature.h | 2 +-
 arch/arm64/kernel/cpufeature.c      | 5 +++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 49dd1bd..2926c7d 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -62,7 +62,7 @@ struct arm64_ftr_bits {
 	enum ftr_type	type;
 	u8		shift;
 	u8		width;
-	s64		safe_val; /* safe value for discrete features */
+	s64		safe_val; /* safe value for FTR_EXACT features */
 };
 
 /*
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 916d27a..3566b6d 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -147,9 +147,10 @@ static struct arm64_ftr_bits ftr_ctr[] = {
 	ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 16, 4, 1),	/* DminLine */
 	/*
 	 * Linux can handle differing I-cache policies. Userspace JITs will
-	 * make use of *minLine
+	 * make use of *minLine.
+	 * If we have differing I-cache policies, report it as the weakest - AIVIVT.
 	 */
-	ARM64_FTR_BITS(FTR_NONSTRICT, FTR_EXACT, 14, 2, 0),	/* L1Ip */
+	ARM64_FTR_BITS(FTR_NONSTRICT, FTR_EXACT, 14, 2, ICACHE_POLICY_AIVIVT),	/* L1Ip */
 	ARM64_FTR_BITS(FTR_STRICT, FTR_EXACT, 4, 10, 0),	/* RAZ */
 	ARM64_FTR_BITS(FTR_STRICT, FTR_LOWER_SAFE, 0, 4, 0),	/* IminLine */
 	ARM64_FTR_END,
-- 
2.7.4


* [PATCH 2/8] arm64: Use consistent naming for errata handling
  2016-07-08 11:37 [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 1/8] arm64: Set the safe value for L1 icache policy Suzuki K Poulose
@ 2016-07-08 11:37 ` Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 3/8] arm64: Rearrange CPU errata workaround checks Suzuki K Poulose
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Suzuki K Poulose @ 2016-07-08 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon,
	Suzuki K Poulose

This is a cosmetic change: rename the functions dealing with the
errata workarounds so that their names reflect what they actually do.

1) check_local_cpu_errata() => update_cpu_errata_workarounds()
check_local_cpu_errata() actually updates the system's errata
workarounds, so rename it to reflect that.

2) verify_local_cpu_errata() => verify_local_cpu_errata_workarounds()
Use errata_workarounds instead of _errata.

Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/cpufeature.h | 4 ++--
 arch/arm64/kernel/cpu_errata.c      | 4 ++--
 arch/arm64/kernel/cpufeature.c      | 2 +-
 arch/arm64/kernel/cpuinfo.c         | 2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 2926c7d..0aab3ec 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -192,10 +192,10 @@ void __init setup_cpu_features(void);
 void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
 			    const char *info);
 void enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps);
-void check_local_cpu_errata(void);
+void update_cpu_errata_workarounds(void);
 void __init enable_errata_workarounds(void);
 
-void verify_local_cpu_errata(void);
+void verify_local_cpu_errata_workarounds(void);
 void verify_local_cpu_capabilities(void);
 
 u64 read_system_reg(u32 id);
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index af647d2..f0fd6a2c 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -110,7 +110,7 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
  * and the related information is freed soon after. If the new CPU requires
  * an errata not detected at boot, fail this CPU.
  */
-void verify_local_cpu_errata(void)
+void verify_local_cpu_errata_workarounds(void)
 {
 	const struct arm64_cpu_capabilities *caps = arm64_errata;
 
@@ -125,7 +125,7 @@ void verify_local_cpu_errata(void)
 		}
 }
 
-void check_local_cpu_errata(void)
+void update_cpu_errata_workarounds(void)
 {
 	update_cpu_capabilities(arm64_errata, "enabling workaround for");
 }
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 3566b6d..7b8bb7f 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -998,7 +998,7 @@ void verify_local_cpu_capabilities(void)
 	if (!sys_caps_initialised)
 		return;
 
-	verify_local_cpu_errata();
+	verify_local_cpu_errata_workarounds();
 	verify_local_cpu_features(arm64_features);
 	verify_local_elf_hwcaps(arm64_elf_hwcaps);
 	if (system_supports_32bit_el0())
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index c173d32..2de8767 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -246,7 +246,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
 
 	cpuinfo_detect_icache_policy(info);
 
-	check_local_cpu_errata();
+	update_cpu_errata_workarounds();
 }
 
 void cpuinfo_store_cpu(void)
-- 
2.7.4


* [PATCH 3/8] arm64: Rearrange CPU errata workaround checks
  2016-07-08 11:37 [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 1/8] arm64: Set the safe value for L1 icache policy Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 2/8] arm64: Use consistent naming for errata handling Suzuki K Poulose
@ 2016-07-08 11:37 ` Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 4/8] arm64/insn: Add helpers for pc relative address offsets Suzuki K Poulose
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Suzuki K Poulose @ 2016-07-08 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon,
	Suzuki K Poulose, Andre Przywara

Right now we run the workaround checks on a CPU from
__cpuinfo_store_cpu(). There are some problems with that:

1) We initialise the system wide CPU feature registers only after the
boot CPU updates its cpuinfo. So if a workaround depends on how a CPU
ID feature varies across CPUs (e.g. a check for cache line size
mismatch), we have no way of performing it cleanly for the boot CPU.

2) It is out of place: __cpuinfo_store_cpu() in cpuinfo.c is not an
obvious home for a capability check.

This patch rearranges the CPU specific capability (a.k.a. workaround)
checks; the resulting flow is sketched after the list below.

1) At the moment we use verify_local_cpu_capabilities() to check that a
new CPU has all the system advertised features. Use this for the
secondary CPUs to perform the workaround check. For that we rename
  verify_local_cpu_capabilities() => check_local_cpu_capabilities()
which:

   If the system wide capabilities haven't been initialised (i.e. the
   CPU is brought up during boot), updates the system wide detected
   workarounds.

   Otherwise (i.e. a CPU hotplugged in later), verifies that this CPU
   conforms to the system wide capabilities.

2) The boot CPU updates the workarounds from smp_prepare_boot_cpu(),
after we have initialised the system wide CPU feature values.
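
In outline, the resulting flow (a summary of the diff below, written
as a C comment):

/*
 * Boot CPU:
 *   smp_prepare_boot_cpu()
 *     -> cpuinfo_store_boot_cpu()         // system wide ftr regs set up
 *     -> update_cpu_errata_workarounds()  // boot CPU workaround checks
 *
 * Secondary / hotplugged CPU:
 *   secondary_start_kernel()
 *     -> check_local_cpu_capabilities()
 *          if (!sys_caps_initialised)
 *                  update_cpu_errata_workarounds();  // early secondaries
 *          else
 *                  verify_local_cpu_capabilities();  // hotplug: must conform
 */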

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/cpufeature.h |  4 ++--
 arch/arm64/kernel/cpufeature.c      | 30 ++++++++++++++++++++----------
 arch/arm64/kernel/cpuinfo.c         |  2 --
 arch/arm64/kernel/smp.c             |  8 +++++++-
 4 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 0aab3ec..1442b45 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -192,11 +192,11 @@ void __init setup_cpu_features(void);
 void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
 			    const char *info);
 void enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps);
+void check_local_cpu_capabilities(void);
+
 void update_cpu_errata_workarounds(void);
 void __init enable_errata_workarounds(void);
-
 void verify_local_cpu_errata_workarounds(void);
-void verify_local_cpu_capabilities(void);
 
 u64 read_system_reg(u32 id);
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 7b8bb7f..fb57c99 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -986,23 +986,33 @@ verify_local_cpu_features(const struct arm64_cpu_capabilities *caps)
  * cannot do anything to fix it up and could cause unexpected failures. So
  * we park the CPU.
  */
-void verify_local_cpu_capabilities(void)
+static void verify_local_cpu_capabilities(void)
 {
+	verify_local_cpu_errata_workarounds();
+	verify_local_cpu_features(arm64_features);
+	verify_local_elf_hwcaps(arm64_elf_hwcaps);
+	if (system_supports_32bit_el0())
+		verify_local_elf_hwcaps(compat_elf_hwcaps);
+}
 
+void check_local_cpu_capabilities(void)
+{
+	/*
+	 * All secondary CPUs should conform to the early CPU features
+	 * in use by the kernel based on boot CPU.
+	 */
 	check_early_cpu_features();
 
 	/*
-	 * If we haven't computed the system capabilities, there is nothing
-	 * to verify.
+	 * If we haven't finalised the system capabilities, this CPU gets
+	 * a chance to update the errata work arounds.
+	 * Otherwise, this CPU should verify that it has all the system
+	 * advertised capabilities.
 	 */
 	if (!sys_caps_initialised)
-		return;
-
-	verify_local_cpu_errata_workarounds();
-	verify_local_cpu_features(arm64_features);
-	verify_local_elf_hwcaps(arm64_elf_hwcaps);
-	if (system_supports_32bit_el0())
-		verify_local_elf_hwcaps(compat_elf_hwcaps);
+		update_cpu_errata_workarounds();
+	else
+		verify_local_cpu_capabilities();
 }
 
 static void __init setup_feature_capabilities(void)
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 2de8767..ddd2f36 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -245,8 +245,6 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
 	}
 
 	cpuinfo_detect_icache_policy(info);
-
-	update_cpu_errata_workarounds();
 }
 
 void cpuinfo_store_cpu(void)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 62ff3c0..20d630f 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -239,7 +239,7 @@ asmlinkage void secondary_start_kernel(void)
 	 * this CPU ticks all of those. If it doesn't, the CPU will
 	 * fail to come online.
 	 */
-	verify_local_cpu_capabilities();
+	check_local_cpu_capabilities();
 
 	if (cpu_ops[cpu]->cpu_postboot)
 		cpu_ops[cpu]->cpu_postboot();
@@ -440,6 +440,12 @@ void __init smp_prepare_boot_cpu(void)
 	cpuinfo_store_boot_cpu();
 	save_boot_cpu_run_el();
 	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
+	/*
+	 * Run the errata work around checks on the boot CPU, once we have
+	 * initialised the cpu feature infrastructure from
+	 * cpuinfo_store_boot_cpu() above.
+	 */
+	update_cpu_errata_workarounds();
 }
 
 static u64 __init of_get_cpu_mpidr(struct device_node *dn)
-- 
2.7.4


* [PATCH 4/8] arm64/insn: Add helpers for pc relative address offsets
  2016-07-08 11:37 [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
                   ` (2 preceding siblings ...)
  2016-07-08 11:37 ` [PATCH 3/8] arm64: Rearrange CPU errata workaround checks Suzuki K Poulose
@ 2016-07-08 11:37 ` Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 4/8] arm64/insn: Add helpers for pc relative data processing instructions Suzuki K Poulose
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Suzuki K Poulose @ 2016-07-08 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon,
	Suzuki K Poulose

Add helpers for decoding/encoding the PC-relative addresses used by
data processing instructions (i.e. adr and adrp). This will be used
for handling dynamic patching of adr/adrp instructions in alternative
code patching.
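
As a usage sketch, a hypothetical caller (relocate_adrp() is made up
for illustration) that round-trips an adrp immediate when the
instruction moves from address src to dst:

u32 relocate_adrp(u32 insn, unsigned long src, unsigned long dst)
{
	/* absolute 4K page the original adrp points at */
	unsigned long target = (src & ~0xfffUL) +
			       aarch64_get_addr_offset(insn);

	/* re-encode the page offset relative to the new PC */
	return aarch64_set_addr_offset(insn, target - (dst & ~0xfffUL));
}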

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/insn.h |  5 +++++
 arch/arm64/kernel/insn.c      | 23 +++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 30e50eb..03dc4c2 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -277,6 +277,8 @@ __AARCH64_INSN_FUNCS(hint,	0xFFFFF01F, 0xD503201F)
 __AARCH64_INSN_FUNCS(br,	0xFFFFFC1F, 0xD61F0000)
 __AARCH64_INSN_FUNCS(blr,	0xFFFFFC1F, 0xD63F0000)
 __AARCH64_INSN_FUNCS(ret,	0xFFFFFC1F, 0xD65F0000)
+__AARCH64_INSN_FUNCS(adrp,	0x9F000000, 0x90000000)
+__AARCH64_INSN_FUNCS(adr,	0x9F000000, 0x10000000)
 
 #undef	__AARCH64_INSN_FUNCS
 
@@ -355,6 +357,9 @@ u32 aarch64_insn_gen_logical_shifted_reg(enum aarch64_insn_register dst,
 s32 aarch64_get_branch_offset(u32 insn);
 u32 aarch64_set_branch_offset(u32 insn, s32 offset);
 
+s32 aarch64_get_addr_offset(u32 insn);
+u32 aarch64_set_addr_offset(u32 insn, s32 offset);
+
 bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
 
 int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 368c082..3edd741 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -1175,6 +1175,29 @@ u32 aarch64_set_branch_offset(u32 insn, s32 offset)
 	BUG();
 }
 
+s32 aarch64_get_addr_offset(u32 insn)
+{
+	if (aarch64_insn_is_adr(insn))
+		return aarch64_insn_decode_immediate(AARCH64_INSN_IMM_ADR, insn);
+	if (aarch64_insn_is_adrp(insn))
+		return aarch64_insn_decode_immediate(AARCH64_INSN_IMM_ADR, insn) << 12;
+
+	/* Unhandled instruction */
+	BUG();
+}
+
+u32 aarch64_set_addr_offset(u32 insn, s32 offset)
+{
+	if (aarch64_insn_is_adr(insn))
+		return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_ADR, insn,
+									offset);
+	if (aarch64_insn_is_adrp(insn))
+		return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_ADR, insn,
+									offset >> 12);
+	/* Unhandled instruction */
+	BUG();
+}
+
 bool aarch32_insn_is_wide(u32 insn)
 {
 	return insn >= 0xe800;
-- 
2.7.4


* [PATCH 4/8] arm64/insn: Add helpers for pc relative data processing instructions
  2016-07-08 11:37 [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
                   ` (3 preceding siblings ...)
  2016-07-08 11:37 ` [PATCH 4/8] arm64/insn: Add helpers for pc relative address offsets Suzuki K Poulose
@ 2016-07-08 11:37 ` Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 5/8] arm64: alternative: Add support for patching adrp instructions Suzuki K Poulose
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Suzuki K Poulose @ 2016-07-08 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon,
	Suzuki K Poulose

Add helpers for decoding/encoding the PC-relative addresses used by
data processing instructions (i.e. adr and adrp). This will be used
for handling dynamic patching of adr/adrp instructions in alternative
code patching.

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/insn.h |  5 +++++
 arch/arm64/kernel/insn.c      | 23 +++++++++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 30e50eb..03dc4c2 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -277,6 +277,8 @@ __AARCH64_INSN_FUNCS(hint,	0xFFFFF01F, 0xD503201F)
 __AARCH64_INSN_FUNCS(br,	0xFFFFFC1F, 0xD61F0000)
 __AARCH64_INSN_FUNCS(blr,	0xFFFFFC1F, 0xD63F0000)
 __AARCH64_INSN_FUNCS(ret,	0xFFFFFC1F, 0xD65F0000)
+__AARCH64_INSN_FUNCS(adrp,	0x9F000000, 0x90000000)
+__AARCH64_INSN_FUNCS(adr,	0x9F000000, 0x10000000)
 
 #undef	__AARCH64_INSN_FUNCS
 
@@ -355,6 +357,9 @@ u32 aarch64_insn_gen_logical_shifted_reg(enum aarch64_insn_register dst,
 s32 aarch64_get_branch_offset(u32 insn);
 u32 aarch64_set_branch_offset(u32 insn, s32 offset);
 
+s32 aarch64_get_addr_offset(u32 insn);
+u32 aarch64_set_addr_offset(u32 insn, s32 offset);
+
 bool aarch64_insn_hotpatch_safe(u32 old_insn, u32 new_insn);
 
 int aarch64_insn_patch_text_nosync(void *addr, u32 insn);
diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
index 368c082..3edd741 100644
--- a/arch/arm64/kernel/insn.c
+++ b/arch/arm64/kernel/insn.c
@@ -1175,6 +1175,29 @@ u32 aarch64_set_branch_offset(u32 insn, s32 offset)
 	BUG();
 }
 
+s32 aarch64_get_addr_offset(u32 insn)
+{
+	if (aarch64_insn_is_adr(insn))
+		return aarch64_insn_decode_immediate(AARCH64_INSN_IMM_ADR, insn);
+	if (aarch64_insn_is_adrp(insn))
+		return aarch64_insn_decode_immediate(AARCH64_INSN_IMM_ADR, insn) << 12;
+
+	/* Unhandled instruction */
+	BUG();
+}
+
+u32 aarch64_set_addr_offset(u32 insn, s32 offset)
+{
+	if (aarch64_insn_is_adr(insn))
+		return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_ADR, insn,
+									offset);
+	if (aarch64_insn_is_adrp(insn))
+		return aarch64_insn_encode_immediate(AARCH64_INSN_IMM_ADR, insn,
+									offset >> 12);
+	/* Unhandled instruction */
+	BUG();
+}
+
 bool aarch32_insn_is_wide(u32 insn)
 {
 	return insn >= 0xe800;
-- 
2.7.4


* [PATCH 5/8] arm64: alternative: Add support for patching adrp instructions
  2016-07-08 11:37 [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
                   ` (4 preceding siblings ...)
  2016-07-08 11:37 ` [PATCH 4/8] arm64/insn: Add helpers for pc relative data processing instructions Suzuki K Poulose
@ 2016-07-08 11:37 ` Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 6/8] arm64: Introduce raw_{d,i}cache_line_size Suzuki K Poulose
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 13+ messages in thread
From: Suzuki K Poulose @ 2016-07-08 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon,
	Suzuki K Poulose, Marc Zyngier, Andre Przywara

adrp encodes a PC-relative offset to the 4K page containing a symbol.
If it appears in alternative code that gets patched in, we should
adjust the offset to reflect the address the instruction will actually
run from. This patch adds support for fixing up the offset of adrp
instructions.
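
A worked example with made-up addresses (not taken from a real
kernel image):

/*
 * altinsnptr = 0xffff000008951230, orig_offset = 0x3000
 *   target     = (altinsnptr & ~0xfff) + orig_offset
 *              = 0xffff000008954000
 * insnptr    = 0xffff000008083418
 *   new_offset = target - (insnptr & ~0xfff) = 0x8d1000
 * The patched adrp reaches the same 4K page from its new PC.
 */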

Cc: Will Deacon <will.deacon@arm.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/kernel/alternative.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index d2ee1b2..cd36950 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -80,6 +80,19 @@ static u32 get_alt_insn(struct alt_instr *alt, u32 *insnptr, u32 *altinsnptr)
 			offset = target - (unsigned long)insnptr;
 			insn = aarch64_set_branch_offset(insn, offset);
 		}
+	} else if (aarch64_insn_is_adrp(insn)) {
+		s32 orig_offset, new_offset;
+		unsigned long target;
+
+		/*
+		 * If we're replacing an adrp instruction, which uses PC-relative
+		 * immediate addressing, adjust the offset to reflect the new
+		 * PC. adrp operates on 4K aligned addresses.
+		 */
+		orig_offset  = aarch64_get_addr_offset(insn);
+		target = ((unsigned long)altinsnptr & ~0xfffUL) + orig_offset;
+		new_offset = target - ((unsigned long)insnptr & ~0xfffUL);
+		insn = aarch64_set_addr_offset(insn, new_offset);
 	}
 
 	return insn;
-- 
2.7.4


* [PATCH 6/8] arm64: Introduce raw_{d,i}cache_line_size
  2016-07-08 11:37 [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
                   ` (5 preceding siblings ...)
  2016-07-08 11:37 ` [PATCH 5/8] arm64: alternative: Add support for patching adrp instructions Suzuki K Poulose
@ 2016-07-08 11:37 ` Suzuki K Poulose
  2016-07-11 21:55   ` Geoff Levand
  2016-08-09 16:22   ` James Morse
  2016-07-08 11:37 ` [PATCH 7/8] arm64: Refactor sysinstr exception handling Suzuki K Poulose
                   ` (2 subsequent siblings)
  9 siblings, 2 replies; 13+ messages in thread
From: Suzuki K Poulose @ 2016-07-08 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon,
	Suzuki K Poulose, James Morse, Geoff Levand

On systems with mismatched I/D cache min line sizes, we need to use
the smallest size possible across all CPUs. This will be done by
fetching the system wide safe value from the CPU feature
infrastructure. However, some special users (e.g. kexec, hibernate)
need the line size of the current CPU (rather than the system wide
value) at points where the system wide value may not be accessible.
Provide another helper which fetches the cache line size on the
current CPU.
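
For reference, a C analogue (a sketch, not part of the patch) of what
raw_dcache_line_size computes:

static inline unsigned long raw_dcache_line_size_c(void)
{
	/* CTR_EL0.DminLine, bits [19:16]: log2 of line size in words */
	unsigned long ctr = read_cpuid_cachetype();

	return 4UL << ((ctr >> 16) & 0xf);
}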

Cc: James Morse <james.morse@arm.com>
Cc: Geoff Levand <geoff@infradead.org>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/assembler.h  | 24 ++++++++++++++++++++----
 arch/arm64/kernel/hibernate-asm.S   |  2 +-
 arch/arm64/kernel/relocate_kernel.S |  2 +-
 3 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index d5025c6..a4bb3f5 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -218,9 +218,10 @@ lr	.req	x30		// link register
 	.endm
 
 /*
- * dcache_line_size - get the minimum D-cache line size from the CTR register.
+ * raw_dcache_line_size - get the minimum D-cache line size on this CPU
+ * from the CTR register.
  */
-	.macro	dcache_line_size, reg, tmp
+	.macro	raw_dcache_line_size, reg, tmp
 	mrs	\tmp, ctr_el0			// read CTR
 	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
 	mov	\reg, #4			// bytes per word
@@ -228,9 +229,17 @@ lr	.req	x30		// link register
 	.endm
 
 /*
- * icache_line_size - get the minimum I-cache line size from the CTR register.
+ * dcache_line_size - get the safe D-cache line size across all CPUs
  */
-	.macro	icache_line_size, reg, tmp
+	.macro	dcache_line_size, reg, tmp
+	raw_dcache_line_size	\reg, \tmp
+	.endm
+
+/*
+ * raw_icache_line_size - get the minimum I-cache line size on this CPU
+ * from the CTR register.
+ */
+	.macro	raw_icache_line_size, reg, tmp
 	mrs	\tmp, ctr_el0			// read CTR
 	and	\tmp, \tmp, #0xf		// cache line size encoding
 	mov	\reg, #4			// bytes per word
@@ -238,6 +247,13 @@ lr	.req	x30		// link register
 	.endm
 
 /*
+ * icache_line_size - get the safe I-cache line size across all CPUs
+ */
+	.macro	icache_line_size, reg, tmp
+	raw_icache_line_size	\reg, \tmp
+	.endm
+
+/*
  * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
  */
 	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
diff --git a/arch/arm64/kernel/hibernate-asm.S b/arch/arm64/kernel/hibernate-asm.S
index 46f29b6..4ebc6a1 100644
--- a/arch/arm64/kernel/hibernate-asm.S
+++ b/arch/arm64/kernel/hibernate-asm.S
@@ -96,7 +96,7 @@ ENTRY(swsusp_arch_suspend_exit)
 
 	add	x1, x10, #PAGE_SIZE
 	/* Clean the copied page to PoU - based on flush_icache_range() */
-	dcache_line_size x2, x3
+	raw_dcache_line_size x2, x3
 	sub	x3, x2, #1
 	bic	x4, x10, x3
 2:	dc	cvau, x4	/* clean D line / unified line */
diff --git a/arch/arm64/kernel/relocate_kernel.S b/arch/arm64/kernel/relocate_kernel.S
index 51b73cd..ce704a4 100644
--- a/arch/arm64/kernel/relocate_kernel.S
+++ b/arch/arm64/kernel/relocate_kernel.S
@@ -34,7 +34,7 @@ ENTRY(arm64_relocate_new_kernel)
 	/* Setup the list loop variables. */
 	mov	x17, x1				/* x17 = kimage_start */
 	mov	x16, x0				/* x16 = kimage_head */
-	dcache_line_size x15, x0		/* x15 = dcache line size */
+	raw_dcache_line_size x15, x0		/* x15 = dcache line size */
 	mov	x14, xzr			/* x14 = entry ptr */
 	mov	x13, xzr			/* x13 = copy dest */
 
-- 
2.7.4


* [PATCH 7/8] arm64: Refactor sysinstr exception handling
  2016-07-08 11:37 [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
                   ` (6 preceding siblings ...)
  2016-07-08 11:37 ` [PATCH 6/8] arm64: Introduce raw_{d,i}cache_line_size Suzuki K Poulose
@ 2016-07-08 11:37 ` Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 8/8] arm64: Work around systems with mismatched cache line sizes Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
  9 siblings, 0 replies; 13+ messages in thread
From: Suzuki K Poulose @ 2016-07-08 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon,
	Suzuki K Poulose, Andre Przywara

Right now we trap some of the user space data cache operations based
on a few errata (ARM 819472, 826319, 827319 and 824069). We also need
to trap userspace access to CTR_EL0 if we detect a mismatched cache
line size. Since both traps share the same exception class (EC),
refactor the handler a little to make it more reader friendly.
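
For orientation, how one of the trapped instructions decodes under the
new ISS macros (illustrative; the field values follow the comment added
to esr.h below):

/*
 * "dc civac, x2" executed at EL0 traps with EC = ESR_ELx_EC_SYS64 and
 * an ISS decoding to:
 *   Op0=1, Op1=3, CRn=7, CRm=14 (DC CIVAC), Op2=1, Rt=2, DIR=write
 * which matches ESR_ELx_SYS64_ISS_U_CACHE_OP_{MASK,VAL}, so
 * do_sysinstr() dispatches it to user_cache_maint_handler().
 */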

Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/esr.h | 48 +++++++++++++++++++++++++++++
 arch/arm64/kernel/traps.c    | 73 ++++++++++++++++++++++++++++----------------
 2 files changed, 95 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index f772e15..2a8f6c3 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -109,6 +109,54 @@
 	((ESR_ELx_EC_BRK64 << ESR_ELx_EC_SHIFT) | ESR_ELx_IL |	\
 	 ((imm) & 0xffff))
 
+/* ISS field definitions for System instruction traps */
+#define ESR_ELx_SYS64_ISS_RES0_SHIFT	22
+#define ESR_ELx_SYS64_ISS_RES0_MASK	(UL(0x7) << ESR_ELx_SYS64_ISS_RES0_SHIFT)
+#define ESR_ELx_SYS64_ISS_DIR_MASK	0x1
+#define ESR_ELx_SYS64_ISS_DIR_READ	0x1
+#define ESR_ELx_SYS64_ISS_DIR_WRITE	0x0
+
+#define ESR_ELx_SYS64_ISS_RT_SHIFT	5
+#define ESR_ELx_SYS64_ISS_RT_MASK	(UL(0x1f) << ESR_ELx_SYS64_ISS_RT_SHIFT)
+#define ESR_ELx_SYS64_ISS_CRm_SHIFT	1
+#define ESR_ELx_SYS64_ISS_CRm_MASK	(UL(0xf) << ESR_ELx_SYS64_ISS_CRm_SHIFT)
+#define ESR_ELx_SYS64_ISS_CRn_SHIFT	10
+#define ESR_ELx_SYS64_ISS_CRn_MASK	(UL(0xf) << ESR_ELx_SYS64_ISS_CRn_SHIFT)
+#define ESR_ELx_SYS64_ISS_Op1_SHIFT	14
+#define ESR_ELx_SYS64_ISS_Op1_MASK	(UL(0x7) << ESR_ELx_SYS64_ISS_Op1_SHIFT)
+#define ESR_ELx_SYS64_ISS_Op2_SHIFT	17
+#define ESR_ELx_SYS64_ISS_Op2_MASK	(UL(0x7) << ESR_ELx_SYS64_ISS_Op2_SHIFT)
+#define ESR_ELx_SYS64_ISS_Op0_SHIFT	20
+#define ESR_ELx_SYS64_ISS_Op0_MASK	(UL(0x3) << ESR_ELx_SYS64_ISS_Op0_SHIFT)
+#define ESR_ELx_SYS64_ISS_SYS_MASK	(ESR_ELx_SYS64_ISS_Op0_MASK | \
+					 ESR_ELx_SYS64_ISS_Op1_MASK | \
+					 ESR_ELx_SYS64_ISS_Op2_MASK | \
+					 ESR_ELx_SYS64_ISS_CRn_MASK | \
+					 ESR_ELx_SYS64_ISS_CRm_MASK)
+#define ESR_ELx_SYS64_ISS_SYS_VAL(Op0, Op1, Op2, CRn, CRm) \
+					(((Op0) << ESR_ELx_SYS64_ISS_Op0_SHIFT) | \
+					 ((Op1) << ESR_ELx_SYS64_ISS_Op1_SHIFT) | \
+					 ((Op2) << ESR_ELx_SYS64_ISS_Op2_SHIFT) | \
+					 ((CRn) << ESR_ELx_SYS64_ISS_CRn_SHIFT) | \
+					 ((CRm) << ESR_ELx_SYS64_ISS_CRm_SHIFT))
+/*
+ * User space cache operations have the following sysreg encoding
+ * in System instructions.
+ * Op0=1, Op1=3, Op2=1, CRn=7, CRm={ 5, 10, 11, 14 }, WRITE (L=0)
+ */
+#define ESR_ELx_SYS64_ISS_CRm_DC_CIVAC	14
+#define ESR_ELx_SYS64_ISS_CRm_DC_CVAU	11
+#define ESR_ELx_SYS64_ISS_CRm_DC_CVAC	10
+#define ESR_ELx_SYS64_ISS_CRm_IC_IVAU	5
+
+#define ESR_ELx_SYS64_ISS_U_CACHE_OP_MASK	(ESR_ELx_SYS64_ISS_Op0_MASK | \
+						 ESR_ELx_SYS64_ISS_Op1_MASK | \
+						 ESR_ELx_SYS64_ISS_Op2_MASK | \
+						 ESR_ELx_SYS64_ISS_CRn_MASK | \
+						 ESR_ELx_SYS64_ISS_DIR_MASK)
+#define ESR_ELx_SYS64_ISS_U_CACHE_OP_VAL \
+				(ESR_ELx_SYS64_ISS_SYS_VAL(1, 3, 1, 7, 0) | \
+				 ESR_ELx_SYS64_ISS_DIR_WRITE)
 #ifndef __ASSEMBLY__
 #include <asm/types.h>
 
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index e04f838..93c5287 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -447,36 +447,29 @@ void cpu_enable_cache_maint_trap(void *__unused)
 		: "=r" (res)					\
 		: "r" (address), "i" (-EFAULT) )
 
-asmlinkage void __exception do_sysinstr(unsigned int esr, struct pt_regs *regs)
+static void user_cache_maint_handler(unsigned int esr, struct pt_regs *regs)
 {
 	unsigned long address;
-	int ret;
+	int rt = (esr & ESR_ELx_SYS64_ISS_RT_MASK) >> ESR_ELx_SYS64_ISS_RT_SHIFT;
+	int crm = (esr & ESR_ELx_SYS64_ISS_CRm_MASK) >> ESR_ELx_SYS64_ISS_CRm_SHIFT;
+	int ret = 0;
 
-	/* if this is a write with: Op0=1, Op2=1, Op1=3, CRn=7 */
-	if ((esr & 0x01fffc01) == 0x0012dc00) {
-		int rt = (esr >> 5) & 0x1f;
-		int crm = (esr >> 1) & 0x0f;
+	address = (rt == 31) ? 0 : regs->regs[rt];
 
-		address = (rt == 31) ? 0 : regs->regs[rt];
-
-		switch (crm) {
-		case 11:		/* DC CVAU, gets promoted */
-			__user_cache_maint("dc civac", address, ret);
-			break;
-		case 10:		/* DC CVAC, gets promoted */
-			__user_cache_maint("dc civac", address, ret);
-			break;
-		case 14:		/* DC CIVAC */
-			__user_cache_maint("dc civac", address, ret);
-			break;
-		case 5:			/* IC IVAU */
-			__user_cache_maint("ic ivau", address, ret);
-			break;
-		default:
-			force_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);
-			return;
-		}
-	} else {
+	switch (crm) {
+	case ESR_ELx_SYS64_ISS_CRm_DC_CVAU:	/* DC CVAU, gets promoted */
+		__user_cache_maint("dc civac", address, ret);
+		break;
+	case ESR_ELx_SYS64_ISS_CRm_DC_CVAC:	/* DC CVAC, gets promoted */
+		__user_cache_maint("dc civac", address, ret);
+		break;
+	case ESR_ELx_SYS64_ISS_CRm_DC_CIVAC:	/* DC CIVAC */
+		__user_cache_maint("dc civac", address, ret);
+		break;
+	case ESR_ELx_SYS64_ISS_CRm_IC_IVAU:	/* IC IVAU */
+		__user_cache_maint("ic ivau", address, ret);
+		break;
+	default:
 		force_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);
 		return;
 	}
@@ -487,6 +480,34 @@ asmlinkage void __exception do_sysinstr(unsigned int esr, struct pt_regs *regs)
 		regs->pc += 4;
 }
 
+struct sys64_hook {
+	unsigned int esr_mask;
+	unsigned int esr_val;
+	void (*handler)(unsigned int esr, struct pt_regs *regs);
+};
+
+static struct sys64_hook sys64_hooks[] = {
+	{
+		.esr_mask = ESR_ELx_SYS64_ISS_U_CACHE_OP_MASK,
+		.esr_val = ESR_ELx_SYS64_ISS_U_CACHE_OP_VAL,
+		.handler = user_cache_maint_handler,
+	},
+	{},
+};
+
+asmlinkage void __exception do_sysinstr(unsigned int esr, struct pt_regs *regs)
+{
+	struct sys64_hook *hook;
+
+	for (hook = sys64_hooks; hook->handler; hook++)
+		if ((hook->esr_mask & esr) == hook->esr_val) {
+			hook->handler(esr, regs);
+			return;
+		}
+
+	force_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);
+}
+
 long compat_arm_syscall(struct pt_regs *regs);
 
 asmlinkage long do_ni_syscall(struct pt_regs *regs)
-- 
2.7.4


* [PATCH 8/8] arm64: Work around systems with mismatched cache line sizes
  2016-07-08 11:37 [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
                   ` (7 preceding siblings ...)
  2016-07-08 11:37 ` [PATCH 7/8] arm64: Refactor sysinstr exception handling Suzuki K Poulose
@ 2016-07-08 11:37 ` Suzuki K Poulose
  2016-07-08 11:37 ` [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
  9 siblings, 0 replies; 13+ messages in thread
From: Suzuki K Poulose @ 2016-07-08 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon,
	Suzuki K Poulose, Andre Przywara

Systems whose CPUs have differing i-cache/d-cache line sizes can
break software cache management when execution migrates from one CPU
to another. Typically, an application reads the cache line size on one
CPU and then uses that length to perform cache operations. However, if
it gets migrated to a CPU with a smaller cache line size, the
maintenance can skip lines entirely. To prevent such cases, always use
the smallest cache line size among the CPUs. The kernel CPU feature
infrastructure already keeps track of the safe value for all CPUID
registers, including CTR. This patch works around the problem by:

For the kernel, dynamically patching it to read the cache size from
the system wide copy of CTR_EL0 (the user-visible emulation path is
sketched below).

For applications, trapping read accesses to CTR_EL0 (by clearing
SCTLR.UCT) and emulating the mrs instruction to return the system wide
safe value of CTR_EL0.

For faster access (i.e. to avoid looking up the system wide value of
CTR_EL0 via read_system_reg()), we keep track of a pointer to the
table entry for CTR_EL0 in the CPU feature infrastructure.
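
In outline, the emulation path for userspace (a summary of the diff
below):

/*
 *   EL0: mrs x1, ctr_el0        // SCTLR_EL1.UCT clear => trap, EC 0x18
 *   EL1: do_sysinstr()
 *          -> matches ESR_ELx_SYS64_ISS_SYS_CTR_READ
 *          -> ctr_read_handler():
 *                 regs->regs[rt] = sys_ctr_ftr->sys_val;  // safe CTR
 *                 regs->pc += 4;                          // skip the mrs
 */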

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
---
 arch/arm64/include/asm/assembler.h  | 25 +++++++++++++++++++++++--
 arch/arm64/include/asm/cpufeature.h |  4 +++-
 arch/arm64/include/asm/esr.h        |  8 ++++++++
 arch/arm64/include/asm/sysreg.h     |  1 +
 arch/arm64/kernel/asm-offsets.c     |  2 ++
 arch/arm64/kernel/cpu_errata.c      | 22 ++++++++++++++++++++++
 arch/arm64/kernel/cpufeature.c      |  9 +++++++++
 arch/arm64/kernel/traps.c           | 14 ++++++++++++++
 8 files changed, 82 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index a4bb3f5..1902ee7 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -216,6 +216,21 @@ lr	.req	x30		// link register
 	.macro	mmid, rd, rn
 	ldr	\rd, [\rn, #MM_CONTEXT_ID]
 	.endm
+/*
+ * read_safe_ctr - read system wide safe CTR_EL0. If the system has
+ * mismatched cache line sizes, provide the system wide safe value.
+ */
+	.macro	read_safe_ctr, reg
+alternative_if_not ARM64_MISMATCHED_CACHE_LINE_SIZE
+	mrs	\reg, ctr_el0			// read CTR
+	nop
+	nop
+alternative_else
+	ldr_l	\reg, sys_ctr_ftr		// Read system wide safe CTR value
+	ldr	\reg, [\reg, #ARM64_FTR_SYSVAL] // from sys_ctr_ftr->sys_val
+alternative_endif
+	.endm
+
 
 /*
  * raw_dcache_line_size - get the minimum D-cache line size on this CPU
@@ -232,7 +247,10 @@ lr	.req	x30		// link register
  * dcache_line_size - get the safe D-cache line size across all CPUs
  */
 	.macro	dcache_line_size, reg, tmp
-	raw_dcache_line_size	\reg, \tmp
+	read_safe_ctr	\tmp
+	ubfm		\tmp, \tmp, #16, #19	// cache line size encoding
+	mov		\reg, #4		// bytes per word
+	lsl		\reg, \reg, \tmp	// actual cache line size
 	.endm
 
 /*
@@ -250,7 +268,10 @@ lr	.req	x30		// link register
  * icache_line_size - get the safe I-cache line size across all CPUs
  */
 	.macro	icache_line_size, reg, tmp
-	raw_icache_line_size	\reg, \tmp
+	read_safe_ctr	\tmp
+	and		\tmp, \tmp, #0xf	// cache line size encoding
+	mov		\reg, #4		// bytes per word
+	lsl		\reg, \reg, \tmp	// actual cache line size
 	.endm
 
 /*
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 1442b45..b1f963c 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -36,8 +36,9 @@
 #define ARM64_HAS_VIRT_HOST_EXTN		11
 #define ARM64_WORKAROUND_CAVIUM_27456		12
 #define ARM64_HAS_32BIT_EL0			13
+#define ARM64_MISMATCHED_CACHE_LINE_SIZE	14
 
-#define ARM64_NCAPS				14
+#define ARM64_NCAPS				15
 
 #ifndef __ASSEMBLY__
 
@@ -108,6 +109,7 @@ struct arm64_cpu_capabilities {
 };
 
 extern DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
+extern struct arm64_ftr_reg *sys_ctr_ftr;
 
 bool this_cpu_has_cap(unsigned int cap);
 
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 2a8f6c3..51aea89 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -139,6 +139,9 @@
 					 ((Op2) << ESR_ELx_SYS64_ISS_Op2_SHIFT) | \
 					 ((CRn) << ESR_ELx_SYS64_ISS_CRn_SHIFT) | \
 					 ((CRm) << ESR_ELx_SYS64_ISS_CRm_SHIFT))
+
+#define ESR_ELx_SYS64_ISS_SYS_OP_MASK	(ESR_ELx_SYS64_ISS_SYS_MASK | \
+					 ESR_ELx_SYS64_ISS_DIR_MASK)
 /*
  * User space cache operations have the following sysreg encoding
  * in System instructions.
@@ -157,6 +160,11 @@
 #define ESR_ELx_SYS64_ISS_U_CACHE_OP_VAL \
 				(ESR_ELx_SYS64_ISS_SYS_VAL(1, 3, 1, 7, 0) | \
 				 ESR_ELx_SYS64_ISS_DIR_WRITE)
+
+#define ESR_ELx_SYS64_ISS_SYS_CTR	ESR_ELx_SYS64_ISS_SYS_VAL(3, 3, 1, 0, 0)
+#define ESR_ELx_SYS64_ISS_SYS_CTR_READ	(ESR_ELx_SYS64_ISS_SYS_CTR | \
+					 ESR_ELx_SYS64_ISS_DIR_READ)
+
 #ifndef __ASSEMBLY__
 #include <asm/types.h>
 
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index cc06794..4f26ad5 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -100,6 +100,7 @@
 /* SCTLR_EL1 specific flags. */
 #define SCTLR_EL1_UCI		(1 << 26)
 #define SCTLR_EL1_SPAN		(1 << 23)
+#define SCTLR_EL1_UCT		(1 << 15)
 #define SCTLR_EL1_SED		(1 << 8)
 #define SCTLR_EL1_CP15BEN	(1 << 5)
 
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index f8e5d47..3a9d48e 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -23,6 +23,7 @@
 #include <linux/dma-mapping.h>
 #include <linux/kvm_host.h>
 #include <linux/suspend.h>
+#include <asm/cpufeature.h>
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/smp_plat.h>
@@ -129,5 +130,6 @@ int main(void)
   DEFINE(HIBERN_PBE_ORIG,	offsetof(struct pbe, orig_address));
   DEFINE(HIBERN_PBE_ADDR,	offsetof(struct pbe, address));
   DEFINE(HIBERN_PBE_NEXT,	offsetof(struct pbe, next));
+  DEFINE(ARM64_FTR_SYSVAL,	offsetof(struct arm64_ftr_reg, sys_val));
   return 0;
 }
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index f0fd6a2c..f182190 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -30,6 +30,21 @@ is_affected_midr_range(const struct arm64_cpu_capabilities *entry, int scope)
 				       entry->midr_range_max);
 }
 
+static bool
+has_mismatched_cache_line_size(const struct arm64_cpu_capabilities *entry,
+				int scope)
+{
+	WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
+	return (read_cpuid_cachetype() & sys_ctr_ftr->strict_mask) !=
+		 (sys_ctr_ftr->sys_val & sys_ctr_ftr->strict_mask);
+}
+
+static void cpu_enable_trap_ctr_access(void *__unused)
+{
+	/* Clear SCTLR_EL1.UCT */
+	config_sctlr_el1(SCTLR_EL1_UCT, 0);
+}
+
 #define MIDR_RANGE(model, min, max) \
 	.def_scope = SCOPE_LOCAL_CPU, \
 	.matches = is_affected_midr_range, \
@@ -102,6 +117,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 	},
 #endif
 	{
+		.desc = "Mismatched cache line size",
+		.capability = ARM64_MISMATCHED_CACHE_LINE_SIZE,
+		.matches = has_mismatched_cache_line_size,
+		.def_scope = SCOPE_LOCAL_CPU,
+		.enable = cpu_enable_trap_ctr_access,
+	},
+	{
 	}
 };
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index fb57c99..f428655 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -46,6 +46,13 @@ unsigned int compat_elf_hwcap2 __read_mostly;
 
 DECLARE_BITMAP(cpu_hwcaps, ARM64_NCAPS);
 
+/*
+ * If we have mismatched cache line sizes, we need to emulate access to
+ * CTR_EL0. To avoid a lookup every time (via read_system_reg()), cache
+ * the table entry for CTR_EL0.
+ */
+struct arm64_ftr_reg *sys_ctr_ftr;
+
 #define __ARM64_FTR_BITS(SIGNED, STRICT, TYPE, SHIFT, WIDTH, SAFE_VAL) \
 	{						\
 		.sign = SIGNED,				\
@@ -461,6 +468,8 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
 		init_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2);
 	}
 
+	/* Keep track of the CTR feature register */
+	sys_ctr_ftr = get_arm64_ftr_reg(SYS_CTR_EL0);
 }
 
 static void update_cpu_ftr_reg(struct arm64_ftr_reg *reg, u64 new)
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 93c5287..db2d6cb 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -480,6 +480,14 @@ static void user_cache_maint_handler(unsigned int esr, struct pt_regs *regs)
 		regs->pc += 4;
 }
 
+static void ctr_read_handler(unsigned int esr, struct pt_regs *regs)
+{
+	int rt = (esr & ESR_ELx_SYS64_ISS_RT_MASK) >> ESR_ELx_SYS64_ISS_RT_SHIFT;
+
+	regs->regs[rt] = sys_ctr_ftr->sys_val;
+	regs->pc += 4;
+}
+
 struct sys64_hook {
 	unsigned int esr_mask;
 	unsigned int esr_val;
@@ -492,6 +500,12 @@ static struct sys64_hook sys64_hooks[] = {
 		.esr_val = ESR_ELx_SYS64_ISS_U_CACHE_OP_VAL,
 		.handler = user_cache_maint_handler,
 	},
+	{
+		/* Trap read access to CTR_EL0 */
+		.esr_mask = ESR_ELx_SYS64_ISS_SYS_OP_MASK,
+		.esr_val = ESR_ELx_SYS64_ISS_SYS_CTR_READ,
+		.handler = ctr_read_handler,
+	},
 	{},
 };
 
-- 
2.7.4


* [PATCH 0/8]  arm64: Work around for mismatched cache line size
  2016-07-08 11:37 [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
                   ` (8 preceding siblings ...)
  2016-07-08 11:37 ` [PATCH 8/8] arm64: Work around systems with mismatched cache line sizes Suzuki K Poulose
@ 2016-07-08 11:37 ` Suzuki K Poulose
  9 siblings, 0 replies; 13+ messages in thread
From: Suzuki K Poulose @ 2016-07-08 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon,
	Suzuki K Poulose

This series adds a work around for systems with mismatched {I,D}-cache
line sizes. When a thread of execution gets migrated to a different CPU,
the cache line size it read (and cached) on the old CPU could be larger
than that of the new CPU, which can lead to data corruption. We work
around this by

 - Dynamically patching the kernel to use the smallest line size on the
   system (from the CPU feature infrastructure)
 - Trapping the userspace access to CTR_EL0 (by clearing SCTLR_EL1.UCT) and
   emulating it with the system wide safe value of CTR.

The series also adds support for alternative code patching of adrp
instructions by adjusting the PC-relative address offset to reflect
the new PC.

The series has been tested on Juno with a hack that forces the
capability on.

Applies on aarch64: for-next/core. The tree is available at :

	git://linux-arm.org/linux-skp.git ctr-emulation


Suzuki K Poulose (8):
  arm64: Set the safe value for L1 icache policy
  arm64: Use consistent naming for errata handling
  arm64: Rearrange CPU errata workaround checks
  arm64/insn: Add helpers for pc relative address offsets
  arm64: alternative: Add support for patching adrp instructions
  arm64: Introduce raw_{d,i}cache_line_size
  arm64: Refactor sysinstr exception handling
  arm64: Work around systems with mismatched cache line sizes

 arch/arm64/include/asm/assembler.h  | 45 +++++++++++++++++--
 arch/arm64/include/asm/cpufeature.h | 14 +++---
 arch/arm64/include/asm/esr.h        | 56 ++++++++++++++++++++++++
 arch/arm64/include/asm/insn.h       |  5 +++
 arch/arm64/include/asm/sysreg.h     |  1 +
 arch/arm64/kernel/alternative.c     | 13 ++++++
 arch/arm64/kernel/asm-offsets.c     |  2 +
 arch/arm64/kernel/cpu_errata.c      | 26 ++++++++++-
 arch/arm64/kernel/cpufeature.c      | 44 ++++++++++++++-----
 arch/arm64/kernel/cpuinfo.c         |  2 -
 arch/arm64/kernel/hibernate-asm.S   |  2 +-
 arch/arm64/kernel/insn.c            | 23 ++++++++++
 arch/arm64/kernel/relocate_kernel.S |  2 +-
 arch/arm64/kernel/smp.c             |  8 +++-
 arch/arm64/kernel/traps.c           | 87 ++++++++++++++++++++++++++-----------
 15 files changed, 275 insertions(+), 55 deletions(-)

-- 
2.7.4


* Re: [PATCH 6/8] arm64: Introduce raw_{d,i}cache_line_size
  2016-07-08 11:37 ` [PATCH 6/8] arm64: Introduce raw_{d,i}cache_line_size Suzuki K Poulose
@ 2016-07-11 21:55   ` Geoff Levand
  2016-08-09 16:22   ` James Morse
  1 sibling, 0 replies; 13+ messages in thread
From: Geoff Levand @ 2016-07-11 21:55 UTC (permalink / raw)
  To: Suzuki K Poulose, linux-arm-kernel
  Cc: linux-kernel, catalin.marinas, mark.rutland, will.deacon, James Morse

On Fri, 2016-07-08 at 12:37 +0100, Suzuki K Poulose wrote:
> On systems with mismatched I/D cache min line sizes, we need to use
> the smallest size possible across all CPUs. This will be done by
> fetching the system wide safe value from the CPU feature
> infrastructure. However, some special users (e.g. kexec, hibernate)
> need the line size of the current CPU (rather than the system wide
> value) at points where the system wide value may not be accessible.
> Provide another helper which fetches the cache line size on the
> current CPU.
> 

Looks OK for what kexec needs.

Acked-by: Geoff Levand <geoff@infradead.org>


* Re: [PATCH 6/8] arm64: Introduce raw_{d,i}cache_line_size
  2016-07-08 11:37 ` [PATCH 6/8] arm64: Introduce raw_{d,i}cache_line_size Suzuki K Poulose
  2016-07-11 21:55   ` Geoff Levand
@ 2016-08-09 16:22   ` James Morse
  1 sibling, 0 replies; 13+ messages in thread
From: James Morse @ 2016-08-09 16:22 UTC (permalink / raw)
  To: Suzuki K Poulose
  Cc: linux-arm-kernel, linux-kernel, catalin.marinas, mark.rutland,
	will.deacon, Geoff Levand

Hi Suzuki,

Sorry this fell through the cracks...

On 08/07/16 12:37, Suzuki K Poulose wrote:
> On systems with mismatched I/D cache min line sizes, we need to use
> the smallest size possible across all CPUs. This will be done by
> fetching the system wide safe value from the CPU feature
> infrastructure. However, some special users (e.g. kexec, hibernate)
> need the line size of the current CPU (rather than the system wide
> value) at points where the system wide value may not be accessible.
> Provide another helper which fetches the cache line size on the
> current CPU.

> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index d5025c6..a4bb3f5 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -218,9 +218,10 @@ lr	.req	x30		// link register
>  	.endm
>  
>  /*
> - * dcache_line_size - get the minimum D-cache line size from the CTR register.
> + * raw_dcache_line_size - get the minimum D-cache line size on this CPU
> + * from the CTR register.
>   */
> -	.macro	dcache_line_size, reg, tmp
> +	.macro	raw_dcache_line_size, reg, tmp
>  	mrs	\tmp, ctr_el0			// read CTR
>  	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
>  	mov	\reg, #4			// bytes per word
> @@ -228,9 +229,17 @@ lr	.req	x30		// link register
>  	.endm
>  
>  /*
> - * icache_line_size - get the minimum I-cache line size from the CTR register.
> + * dcache_line_size - get the safe D-cache line size across all CPUs
>   */
> -	.macro	icache_line_size, reg, tmp
> +	.macro	dcache_line_size, reg, tmp
> +	raw_dcache_line_size	\reg, \tmp
> +	.endm
> +
> +/*
> + * raw_icache_line_size - get the minimum I-cache line size on this CPU
> + * from the CTR register.
> + */
> +	.macro	raw_icache_line_size, reg, tmp
>  	mrs	\tmp, ctr_el0			// read CTR
>  	and	\tmp, \tmp, #0xf		// cache line size encoding
>  	mov	\reg, #4			// bytes per word
> @@ -238,6 +247,13 @@ lr	.req	x30		// link register
>  	.endm
>  
>  /*
> + * icache_line_size - get the safe I-cache line size across all CPUs
> + */
> +	.macro	icache_line_size, reg, tmp
> +	raw_icache_line_size	\reg, \tmp
> +	.endm
> +
> +/*
>   * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
>   */
>  	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
> diff --git a/arch/arm64/kernel/hibernate-asm.S b/arch/arm64/kernel/hibernate-asm.S
> index 46f29b6..4ebc6a1 100644
> --- a/arch/arm64/kernel/hibernate-asm.S
> +++ b/arch/arm64/kernel/hibernate-asm.S
> @@ -96,7 +96,7 @@ ENTRY(swsusp_arch_suspend_exit)
>  
>  	add	x1, x10, #PAGE_SIZE
>  	/* Clean the copied page to PoU - based on flush_icache_range() */
> -	dcache_line_size x2, x3
> +	raw_dcache_line_size x2, x3
>  	sub	x3, x2, #1
>  	bic	x4, x10, x3
>  2:	dc	cvau, x4	/* clean D line / unified line */


Looks like no-change to me!

If you think you need it:
Acked-by: James Morse <james.morse@arm.com>


Thanks,

James


Thread overview: 13+ messages
2016-07-08 11:37 [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
2016-07-08 11:37 ` [PATCH 1/8] arm64: Set the safe value for L1 icache policy Suzuki K Poulose
2016-07-08 11:37 ` [PATCH 2/8] arm64: Use consistent naming for errata handling Suzuki K Poulose
2016-07-08 11:37 ` [PATCH 3/8] arm64: Rearrange CPU errata workaround checks Suzuki K Poulose
2016-07-08 11:37 ` [PATCH 4/8] arm64/insn: Add helpers for pc relative address offsets Suzuki K Poulose
2016-07-08 11:37 ` [PATCH 4/8] arm64/insn: Add helpers for pc relative data processing instructions Suzuki K Poulose
2016-07-08 11:37 ` [PATCH 5/8] arm64: alternative: Add support for patching adrp instructions Suzuki K Poulose
2016-07-08 11:37 ` [PATCH 6/8] arm64: Introduce raw_{d,i}cache_line_size Suzuki K Poulose
2016-07-11 21:55   ` Geoff Levand
2016-08-09 16:22   ` James Morse
2016-07-08 11:37 ` [PATCH 7/8] arm64: Refactor sysinstr exception handling Suzuki K Poulose
2016-07-08 11:37 ` [PATCH 8/8] arm64: Work around systems with mismatched cache line sizes Suzuki K Poulose
2016-07-08 11:37 ` [PATCH 0/8] arm64: Work around for mismatched cache line size Suzuki K Poulose
