linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 0/4] Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO
@ 2020-11-03 12:17 Will Deacon
  2020-11-03 12:17 ` [PATCH v4 1/4] arm64: alternatives: Split up alternative.h Will Deacon
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Will Deacon @ 2020-11-03 12:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Kees Cook, Catalin Marinas, Sami Tolvanen,
	Masahiro Yamada, Peter Zijlstra, linux-kernel

Hi all,

These patches were previously posted as part of a larger series enabling
architectures to override __READ_ONCE():

  v3: https://lore.kernel.org/lkml/20200710165203.31284-1-will@kernel.org/

With the bulk of that merged, the four patches here override READ_ONCE()
so that it gains RCpc acquire semantics on arm64 when LTO is enabled. We
can revisit this as and when the compiler provides a means for us to reason
about the result of dependency-breaking optimisations. In the meantime,
this unblocks LTO for arm64, which I would really like to see merged so
that we can focus on enabling CFI.
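
As a rough sketch of the net effect (illustrative only; the real
implementation is the __READ_ONCE() in patch four), a load such as:

	p = READ_ONCE(gp);

compiles to a plain LDR today, but with CONFIG_LTO=y emits LDAR (an
RCsc acquire) as the default sequence, patched at runtime to LDAPR
(RCpc) on CPUs that implement it.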

I plan to queue these on their own branch in the arm64 tree for 5.11,
based on -rc3.

Cheers,

Will

Cc: Kees Cook <keescook@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org

--->8

Will Deacon (4):
  arm64: alternatives: Split up alternative.h
  arm64: cpufeatures: Add capability for LDAPR instruction
  arm64: alternatives: Remove READ_ONCE() usage during patch operation
  arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y

 arch/arm64/Kconfig                          |   3 +
 arch/arm64/include/asm/alternative-macros.h | 276 ++++++++++++++++++++
 arch/arm64/include/asm/alternative.h        | 267 +------------------
 arch/arm64/include/asm/cpucaps.h            |   3 +-
 arch/arm64/include/asm/insn.h               |   3 +-
 arch/arm64/include/asm/rwonce.h             |  63 +++++
 arch/arm64/kernel/alternative.c             |   7 +-
 arch/arm64/kernel/cpufeature.c              |  10 +
 arch/arm64/kernel/vdso/Makefile             |   2 +-
 arch/arm64/kernel/vdso32/Makefile           |   2 +-
 arch/arm64/kernel/vmlinux.lds.S             |   2 +-
 11 files changed, 364 insertions(+), 274 deletions(-)
 create mode 100644 arch/arm64/include/asm/alternative-macros.h
 create mode 100644 arch/arm64/include/asm/rwonce.h

-- 
2.29.1.341.ge80a0c044ae-goog



* [PATCH v4 1/4] arm64: alternatives: Split up alternative.h
  2020-11-03 12:17 [PATCH v4 0/4] Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO Will Deacon
@ 2020-11-03 12:17 ` Will Deacon
  2020-11-03 12:40   ` Mark Rutland
  2020-11-03 12:17 ` [PATCH v4 2/4] arm64: cpufeatures: Add capability for LDAPR instruction Will Deacon
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2020-11-03 12:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Kees Cook, Catalin Marinas, Sami Tolvanen,
	Masahiro Yamada, Peter Zijlstra, linux-kernel

asm/alternative.h contains both the macros needed to use alternatives,
as well as the type definitions and function prototypes for applying them.

Split the header in two, so that alternatives can be used from core
header files such as linux/compiler.h without the risk of circular
includes.
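
For illustration, this allows a core header to pull in the alternative
macros without dragging in the prototypes; a (hypothetical) include
chain, as used by the READ_ONCE() rework later in this series:

	linux/compiler.h
	  -> asm/rwonce.h
	    -> asm/alternative-macros.h	/* macros only, no prototypes */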

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/alternative-macros.h | 276 ++++++++++++++++++++
 arch/arm64/include/asm/alternative.h        | 267 +------------------
 arch/arm64/include/asm/insn.h               |   3 +-
 3 files changed, 279 insertions(+), 267 deletions(-)
 create mode 100644 arch/arm64/include/asm/alternative-macros.h

diff --git a/arch/arm64/include/asm/alternative-macros.h b/arch/arm64/include/asm/alternative-macros.h
new file mode 100644
index 000000000000..c959377f9790
--- /dev/null
+++ b/arch/arm64/include/asm/alternative-macros.h
@@ -0,0 +1,276 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_ALTERNATIVE_MACROS_H
+#define __ASM_ALTERNATIVE_MACROS_H
+
+#include <asm/cpucaps.h>
+
+#define ARM64_CB_PATCH ARM64_NCAPS
+
+/* A64 instructions are always 32 bits. */
+#define	AARCH64_INSN_SIZE		4
+
+#ifndef __ASSEMBLY__
+
+#include <linux/stringify.h>
+
+#define ALTINSTR_ENTRY(feature)					              \
+	" .word 661b - .\n"				/* label           */ \
+	" .word 663f - .\n"				/* new instruction */ \
+	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
+	" .byte 662b-661b\n"				/* source len      */ \
+	" .byte 664f-663f\n"				/* replacement len */
+
+#define ALTINSTR_ENTRY_CB(feature, cb)					      \
+	" .word 661b - .\n"				/* label           */ \
+	" .word " __stringify(cb) "- .\n"		/* callback */	      \
+	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
+	" .byte 662b-661b\n"				/* source len      */ \
+	" .byte 664f-663f\n"				/* replacement len */
+
+/*
+ * alternative assembly primitive:
+ *
+ * If any of these .org directive fail, it means that insn1 and insn2
+ * don't have the same length. This used to be written as
+ *
+ * .if ((664b-663b) != (662b-661b))
+ * 	.error "Alternatives instruction length mismatch"
+ * .endif
+ *
+ * but most assemblers die if insn1 or insn2 have a .inst. This should
+ * be fixed in a binutils release posterior to 2.25.51.0.2 (anything
+ * containing commit 4e4d08cf7399b606 or c1baaddf8861).
+ *
+ * Alternatives with callbacks do not generate replacement instructions.
+ */
+#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled)	\
+	".if "__stringify(cfg_enabled)" == 1\n"				\
+	"661:\n\t"							\
+	oldinstr "\n"							\
+	"662:\n"							\
+	".pushsection .altinstructions,\"a\"\n"				\
+	ALTINSTR_ENTRY(feature)						\
+	".popsection\n"							\
+	".subsection 1\n"						\
+	"663:\n\t"							\
+	newinstr "\n"							\
+	"664:\n\t"							\
+	".org	. - (664b-663b) + (662b-661b)\n\t"			\
+	".org	. - (662b-661b) + (664b-663b)\n\t"			\
+	".previous\n"							\
+	".endif\n"
+
+#define __ALTERNATIVE_CFG_CB(oldinstr, feature, cfg_enabled, cb)	\
+	".if "__stringify(cfg_enabled)" == 1\n"				\
+	"661:\n\t"							\
+	oldinstr "\n"							\
+	"662:\n"							\
+	".pushsection .altinstructions,\"a\"\n"				\
+	ALTINSTR_ENTRY_CB(feature, cb)					\
+	".popsection\n"							\
+	"663:\n\t"							\
+	"664:\n\t"							\
+	".endif\n"
+
+#define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...)	\
+	__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg))
+
+#define ALTERNATIVE_CB(oldinstr, cb) \
+	__ALTERNATIVE_CFG_CB(oldinstr, ARM64_CB_PATCH, 1, cb)
+#else
+
+#include <asm/assembler.h>
+
+.macro altinstruction_entry orig_offset alt_offset feature orig_len alt_len
+	.word \orig_offset - .
+	.word \alt_offset - .
+	.hword \feature
+	.byte \orig_len
+	.byte \alt_len
+.endm
+
+.macro alternative_insn insn1, insn2, cap, enable = 1
+	.if \enable
+661:	\insn1
+662:	.pushsection .altinstructions, "a"
+	altinstruction_entry 661b, 663f, \cap, 662b-661b, 664f-663f
+	.popsection
+	.subsection 1
+663:	\insn2
+664:	.previous
+	.org	. - (664b-663b) + (662b-661b)
+	.org	. - (662b-661b) + (664b-663b)
+	.endif
+.endm
+
+/*
+ * Alternative sequences
+ *
+ * The code for the case where the capability is not present will be
+ * assembled and linked as normal. There are no restrictions on this
+ * code.
+ *
+ * The code for the case where the capability is present will be
+ * assembled into a special section to be used for dynamic patching.
+ * Code for that case must:
+ *
+ * 1. Be exactly the same length (in bytes) as the default code
+ *    sequence.
+ *
+ * 2. Not contain a branch target that is used outside of the
+ *    alternative sequence it is defined in (branches into an
+ *    alternative sequence are not fixed up).
+ */
+
+/*
+ * Begin an alternative code sequence.
+ */
+.macro alternative_if_not cap
+	.set .Lasm_alt_mode, 0
+	.pushsection .altinstructions, "a"
+	altinstruction_entry 661f, 663f, \cap, 662f-661f, 664f-663f
+	.popsection
+661:
+.endm
+
+.macro alternative_if cap
+	.set .Lasm_alt_mode, 1
+	.pushsection .altinstructions, "a"
+	altinstruction_entry 663f, 661f, \cap, 664f-663f, 662f-661f
+	.popsection
+	.subsection 1
+	.align 2	/* So GAS knows label 661 is suitably aligned */
+661:
+.endm
+
+.macro alternative_cb cb
+	.set .Lasm_alt_mode, 0
+	.pushsection .altinstructions, "a"
+	altinstruction_entry 661f, \cb, ARM64_CB_PATCH, 662f-661f, 0
+	.popsection
+661:
+.endm
+
+/*
+ * Provide the other half of the alternative code sequence.
+ */
+.macro alternative_else
+662:
+	.if .Lasm_alt_mode==0
+	.subsection 1
+	.else
+	.previous
+	.endif
+663:
+.endm
+
+/*
+ * Complete an alternative code sequence.
+ */
+.macro alternative_endif
+664:
+	.if .Lasm_alt_mode==0
+	.previous
+	.endif
+	.org	. - (664b-663b) + (662b-661b)
+	.org	. - (662b-661b) + (664b-663b)
+.endm
+
+/*
+ * Callback-based alternative epilogue
+ */
+.macro alternative_cb_end
+662:
+.endm
+
+/*
+ * Provides a trivial alternative or default sequence consisting solely
+ * of NOPs. The number of NOPs is chosen automatically to match the
+ * previous case.
+ */
+.macro alternative_else_nop_endif
+alternative_else
+	nops	(662b-661b) / AARCH64_INSN_SIZE
+alternative_endif
+.endm
+
+#define _ALTERNATIVE_CFG(insn1, insn2, cap, cfg, ...)	\
+	alternative_insn insn1, insn2, cap, IS_ENABLED(cfg)
+
+.macro user_alt, label, oldinstr, newinstr, cond
+9999:	alternative_insn "\oldinstr", "\newinstr", \cond
+	_asm_extable 9999b, \label
+.endm
+
+/*
+ * Generate the assembly for UAO alternatives with exception table entries.
+ * This is complicated as there is no post-increment or pair versions of the
+ * unprivileged instructions, and USER() only works for single instructions.
+ */
+#ifdef CONFIG_ARM64_UAO
+	.macro uao_ldp l, reg1, reg2, addr, post_inc
+		alternative_if_not ARM64_HAS_UAO
+8888:			ldp	\reg1, \reg2, [\addr], \post_inc;
+8889:			nop;
+			nop;
+		alternative_else
+			ldtr	\reg1, [\addr];
+			ldtr	\reg2, [\addr, #8];
+			add	\addr, \addr, \post_inc;
+		alternative_endif
+
+		_asm_extable	8888b,\l;
+		_asm_extable	8889b,\l;
+	.endm
+
+	.macro uao_stp l, reg1, reg2, addr, post_inc
+		alternative_if_not ARM64_HAS_UAO
+8888:			stp	\reg1, \reg2, [\addr], \post_inc;
+8889:			nop;
+			nop;
+		alternative_else
+			sttr	\reg1, [\addr];
+			sttr	\reg2, [\addr, #8];
+			add	\addr, \addr, \post_inc;
+		alternative_endif
+
+		_asm_extable	8888b,\l;
+		_asm_extable	8889b,\l;
+	.endm
+
+	.macro uao_user_alternative l, inst, alt_inst, reg, addr, post_inc
+		alternative_if_not ARM64_HAS_UAO
+8888:			\inst	\reg, [\addr], \post_inc;
+			nop;
+		alternative_else
+			\alt_inst	\reg, [\addr];
+			add		\addr, \addr, \post_inc;
+		alternative_endif
+
+		_asm_extable	8888b,\l;
+	.endm
+#else
+	.macro uao_ldp l, reg1, reg2, addr, post_inc
+		USER(\l, ldp \reg1, \reg2, [\addr], \post_inc)
+	.endm
+	.macro uao_stp l, reg1, reg2, addr, post_inc
+		USER(\l, stp \reg1, \reg2, [\addr], \post_inc)
+	.endm
+	.macro uao_user_alternative l, inst, alt_inst, reg, addr, post_inc
+		USER(\l, \inst \reg, [\addr], \post_inc)
+	.endm
+#endif
+
+#endif  /*  __ASSEMBLY__  */
+
+/*
+ * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature));
+ *
+ * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature, CONFIG_FOO));
+ * N.B. If CONFIG_FOO is specified, but not selected, the whole block
+ *      will be omitted, including oldinstr.
+ */
+#define ALTERNATIVE(oldinstr, newinstr, ...)   \
+	_ALTERNATIVE_CFG(oldinstr, newinstr, __VA_ARGS__, 1)
+
+#endif /* __ASM_ALTERNATIVE_MACROS_H */
diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index 619db9b4c9d5..a38b92e11811 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -2,17 +2,13 @@
 #ifndef __ASM_ALTERNATIVE_H
 #define __ASM_ALTERNATIVE_H
 
-#include <asm/cpucaps.h>
-#include <asm/insn.h>
-
-#define ARM64_CB_PATCH ARM64_NCAPS
+#include <asm/alternative-macros.h>
 
 #ifndef __ASSEMBLY__
 
 #include <linux/init.h>
 #include <linux/types.h>
 #include <linux/stddef.h>
-#include <linux/stringify.h>
 
 struct alt_instr {
 	s32 orig_offset;	/* offset to original instruction */
@@ -35,264 +31,5 @@ void apply_alternatives_module(void *start, size_t length);
 static inline void apply_alternatives_module(void *start, size_t length) { }
 #endif
 
-#define ALTINSTR_ENTRY(feature)					              \
-	" .word 661b - .\n"				/* label           */ \
-	" .word 663f - .\n"				/* new instruction */ \
-	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
-	" .byte 662b-661b\n"				/* source len      */ \
-	" .byte 664f-663f\n"				/* replacement len */
-
-#define ALTINSTR_ENTRY_CB(feature, cb)					      \
-	" .word 661b - .\n"				/* label           */ \
-	" .word " __stringify(cb) "- .\n"		/* callback */	      \
-	" .hword " __stringify(feature) "\n"		/* feature bit     */ \
-	" .byte 662b-661b\n"				/* source len      */ \
-	" .byte 664f-663f\n"				/* replacement len */
-
-/*
- * alternative assembly primitive:
- *
- * If any of these .org directive fail, it means that insn1 and insn2
- * don't have the same length. This used to be written as
- *
- * .if ((664b-663b) != (662b-661b))
- * 	.error "Alternatives instruction length mismatch"
- * .endif
- *
- * but most assemblers die if insn1 or insn2 have a .inst. This should
- * be fixed in a binutils release posterior to 2.25.51.0.2 (anything
- * containing commit 4e4d08cf7399b606 or c1baaddf8861).
- *
- * Alternatives with callbacks do not generate replacement instructions.
- */
-#define __ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg_enabled)	\
-	".if "__stringify(cfg_enabled)" == 1\n"				\
-	"661:\n\t"							\
-	oldinstr "\n"							\
-	"662:\n"							\
-	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY(feature)						\
-	".popsection\n"							\
-	".subsection 1\n"						\
-	"663:\n\t"							\
-	newinstr "\n"							\
-	"664:\n\t"							\
-	".org	. - (664b-663b) + (662b-661b)\n\t"			\
-	".org	. - (662b-661b) + (664b-663b)\n\t"			\
-	".previous\n"							\
-	".endif\n"
-
-#define __ALTERNATIVE_CFG_CB(oldinstr, feature, cfg_enabled, cb)	\
-	".if "__stringify(cfg_enabled)" == 1\n"				\
-	"661:\n\t"							\
-	oldinstr "\n"							\
-	"662:\n"							\
-	".pushsection .altinstructions,\"a\"\n"				\
-	ALTINSTR_ENTRY_CB(feature, cb)					\
-	".popsection\n"							\
-	"663:\n\t"							\
-	"664:\n\t"							\
-	".endif\n"
-
-#define _ALTERNATIVE_CFG(oldinstr, newinstr, feature, cfg, ...)	\
-	__ALTERNATIVE_CFG(oldinstr, newinstr, feature, IS_ENABLED(cfg))
-
-#define ALTERNATIVE_CB(oldinstr, cb) \
-	__ALTERNATIVE_CFG_CB(oldinstr, ARM64_CB_PATCH, 1, cb)
-#else
-
-#include <asm/assembler.h>
-
-.macro altinstruction_entry orig_offset alt_offset feature orig_len alt_len
-	.word \orig_offset - .
-	.word \alt_offset - .
-	.hword \feature
-	.byte \orig_len
-	.byte \alt_len
-.endm
-
-.macro alternative_insn insn1, insn2, cap, enable = 1
-	.if \enable
-661:	\insn1
-662:	.pushsection .altinstructions, "a"
-	altinstruction_entry 661b, 663f, \cap, 662b-661b, 664f-663f
-	.popsection
-	.subsection 1
-663:	\insn2
-664:	.previous
-	.org	. - (664b-663b) + (662b-661b)
-	.org	. - (662b-661b) + (664b-663b)
-	.endif
-.endm
-
-/*
- * Alternative sequences
- *
- * The code for the case where the capability is not present will be
- * assembled and linked as normal. There are no restrictions on this
- * code.
- *
- * The code for the case where the capability is present will be
- * assembled into a special section to be used for dynamic patching.
- * Code for that case must:
- *
- * 1. Be exactly the same length (in bytes) as the default code
- *    sequence.
- *
- * 2. Not contain a branch target that is used outside of the
- *    alternative sequence it is defined in (branches into an
- *    alternative sequence are not fixed up).
- */
-
-/*
- * Begin an alternative code sequence.
- */
-.macro alternative_if_not cap
-	.set .Lasm_alt_mode, 0
-	.pushsection .altinstructions, "a"
-	altinstruction_entry 661f, 663f, \cap, 662f-661f, 664f-663f
-	.popsection
-661:
-.endm
-
-.macro alternative_if cap
-	.set .Lasm_alt_mode, 1
-	.pushsection .altinstructions, "a"
-	altinstruction_entry 663f, 661f, \cap, 664f-663f, 662f-661f
-	.popsection
-	.subsection 1
-	.align 2	/* So GAS knows label 661 is suitably aligned */
-661:
-.endm
-
-.macro alternative_cb cb
-	.set .Lasm_alt_mode, 0
-	.pushsection .altinstructions, "a"
-	altinstruction_entry 661f, \cb, ARM64_CB_PATCH, 662f-661f, 0
-	.popsection
-661:
-.endm
-
-/*
- * Provide the other half of the alternative code sequence.
- */
-.macro alternative_else
-662:
-	.if .Lasm_alt_mode==0
-	.subsection 1
-	.else
-	.previous
-	.endif
-663:
-.endm
-
-/*
- * Complete an alternative code sequence.
- */
-.macro alternative_endif
-664:
-	.if .Lasm_alt_mode==0
-	.previous
-	.endif
-	.org	. - (664b-663b) + (662b-661b)
-	.org	. - (662b-661b) + (664b-663b)
-.endm
-
-/*
- * Callback-based alternative epilogue
- */
-.macro alternative_cb_end
-662:
-.endm
-
-/*
- * Provides a trivial alternative or default sequence consisting solely
- * of NOPs. The number of NOPs is chosen automatically to match the
- * previous case.
- */
-.macro alternative_else_nop_endif
-alternative_else
-	nops	(662b-661b) / AARCH64_INSN_SIZE
-alternative_endif
-.endm
-
-#define _ALTERNATIVE_CFG(insn1, insn2, cap, cfg, ...)	\
-	alternative_insn insn1, insn2, cap, IS_ENABLED(cfg)
-
-.macro user_alt, label, oldinstr, newinstr, cond
-9999:	alternative_insn "\oldinstr", "\newinstr", \cond
-	_asm_extable 9999b, \label
-.endm
-
-/*
- * Generate the assembly for UAO alternatives with exception table entries.
- * This is complicated as there is no post-increment or pair versions of the
- * unprivileged instructions, and USER() only works for single instructions.
- */
-#ifdef CONFIG_ARM64_UAO
-	.macro uao_ldp l, reg1, reg2, addr, post_inc
-		alternative_if_not ARM64_HAS_UAO
-8888:			ldp	\reg1, \reg2, [\addr], \post_inc;
-8889:			nop;
-			nop;
-		alternative_else
-			ldtr	\reg1, [\addr];
-			ldtr	\reg2, [\addr, #8];
-			add	\addr, \addr, \post_inc;
-		alternative_endif
-
-		_asm_extable	8888b,\l;
-		_asm_extable	8889b,\l;
-	.endm
-
-	.macro uao_stp l, reg1, reg2, addr, post_inc
-		alternative_if_not ARM64_HAS_UAO
-8888:			stp	\reg1, \reg2, [\addr], \post_inc;
-8889:			nop;
-			nop;
-		alternative_else
-			sttr	\reg1, [\addr];
-			sttr	\reg2, [\addr, #8];
-			add	\addr, \addr, \post_inc;
-		alternative_endif
-
-		_asm_extable	8888b,\l;
-		_asm_extable	8889b,\l;
-	.endm
-
-	.macro uao_user_alternative l, inst, alt_inst, reg, addr, post_inc
-		alternative_if_not ARM64_HAS_UAO
-8888:			\inst	\reg, [\addr], \post_inc;
-			nop;
-		alternative_else
-			\alt_inst	\reg, [\addr];
-			add		\addr, \addr, \post_inc;
-		alternative_endif
-
-		_asm_extable	8888b,\l;
-	.endm
-#else
-	.macro uao_ldp l, reg1, reg2, addr, post_inc
-		USER(\l, ldp \reg1, \reg2, [\addr], \post_inc)
-	.endm
-	.macro uao_stp l, reg1, reg2, addr, post_inc
-		USER(\l, stp \reg1, \reg2, [\addr], \post_inc)
-	.endm
-	.macro uao_user_alternative l, inst, alt_inst, reg, addr, post_inc
-		USER(\l, \inst \reg, [\addr], \post_inc)
-	.endm
-#endif
-
-#endif  /*  __ASSEMBLY__  */
-
-/*
- * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature));
- *
- * Usage: asm(ALTERNATIVE(oldinstr, newinstr, feature, CONFIG_FOO));
- * N.B. If CONFIG_FOO is specified, but not selected, the whole block
- *      will be omitted, including oldinstr.
- */
-#define ALTERNATIVE(oldinstr, newinstr, ...)   \
-	_ALTERNATIVE_CFG(oldinstr, newinstr, __VA_ARGS__, 1)
-
+#endif /* __ASSEMBLY__ */
 #endif /* __ASM_ALTERNATIVE_H */
diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 4b39293d0f72..4ebb9c054ccc 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -10,8 +10,7 @@
 #include <linux/build_bug.h>
 #include <linux/types.h>
 
-/* A64 instructions are always 32 bits. */
-#define	AARCH64_INSN_SIZE		4
+#include <asm/alternative.h>
 
 #ifndef __ASSEMBLY__
 /*
-- 
2.29.1.341.ge80a0c044ae-goog


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 2/4] arm64: cpufeatures: Add capability for LDAPR instruction
  2020-11-03 12:17 [PATCH v4 0/4] Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO Will Deacon
  2020-11-03 12:17 ` [PATCH v4 1/4] arm64: alternatives: Split up alternative.h Will Deacon
@ 2020-11-03 12:17 ` Will Deacon
  2020-11-03 12:44   ` Mark Rutland
  2020-11-03 12:17 ` [PATCH v4 3/4] arm64: alternatives: Remove READ_ONCE() usage during patch operation Will Deacon
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2020-11-03 12:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Kees Cook, Catalin Marinas, Sami Tolvanen,
	Masahiro Yamada, Peter Zijlstra, linux-kernel

Armv8.3 introduced the LDAPR instruction, which provides weaker memory
ordering semantics than LDAR (RCpc vs RCsc). Generally, we provide an
RCsc implementation when implementing the Linux memory model, but LDAPR
can be used as a useful alternative to dependency ordering, particularly
when the compiler is capable of breaking the dependencies.

Since LDAPR is not available on all CPUs, add a cpufeature to detect it at
runtime and allow the instruction to be used with alternative code
patching.
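
As an illustration, a (hypothetical) caller could then emit LDAPR via
the usual C alternative macros, falling back to LDAR on CPUs without
the feature ('val' and 'ptr' are placeholder names, and this assumes
assembler support, i.e. CONFIG_AS_HAS_LDAPR):

	unsigned long val;

	/* Patched to LDAPR once ARM64_HAS_LDAPR is detected */
	asm volatile(ALTERNATIVE("ldar	%0, %1",
				 ".arch_extension rcpc\n"
				 "ldapr	%0, %1",
				 ARM64_HAS_LDAPR)
		     : "=r" (val) : "Q" (*ptr) : "memory");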

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/Kconfig               |  3 +++
 arch/arm64/include/asm/cpucaps.h |  3 ++-
 arch/arm64/kernel/cpufeature.c   | 10 ++++++++++
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1d466addb078..356c50b0447f 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1388,6 +1388,9 @@ config ARM64_PAN
 	 The feature is detected at runtime, and will remain as a 'nop'
 	 instruction if the cpu does not implement the feature.
 
+config AS_HAS_LDAPR
+	def_bool $(as-instr,.arch_extension rcpc)
+
 config ARM64_LSE_ATOMICS
 	bool
 	default ARM64_USE_LSE_ATOMICS
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index e7d98997c09c..64ea0bb9f420 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -66,7 +66,8 @@
 #define ARM64_HAS_TLB_RANGE			56
 #define ARM64_MTE				57
 #define ARM64_WORKAROUND_1508412		58
+#define ARM64_HAS_LDAPR				59
 
-#define ARM64_NCAPS				59
+#define ARM64_NCAPS				60
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index dcc165b3fc04..b7b6804cb931 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2136,6 +2136,16 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.cpu_enable = cpu_enable_mte,
 	},
 #endif /* CONFIG_ARM64_MTE */
+	{
+		.desc = "RCpc load-acquire (LDAPR)",
+		.capability = ARM64_HAS_LDAPR,
+		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
+		.sys_reg = SYS_ID_AA64ISAR1_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64ISAR1_LRCPC_SHIFT,
+		.matches = has_cpuid_feature,
+		.min_field_value = 1,
+	},
 	{},
 };
 
-- 
2.29.1.341.ge80a0c044ae-goog



* [PATCH v4 3/4] arm64: alternatives: Remove READ_ONCE() usage during patch operation
  2020-11-03 12:17 [PATCH v4 0/4] Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO Will Deacon
  2020-11-03 12:17 ` [PATCH v4 1/4] arm64: alternatives: Split up alternative.h Will Deacon
  2020-11-03 12:17 ` [PATCH v4 2/4] arm64: cpufeatures: Add capability for LDAPR instruction Will Deacon
@ 2020-11-03 12:17 ` Will Deacon
  2020-11-03 12:46   ` Mark Rutland
  2020-11-03 12:17 ` [PATCH v4 4/4] arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y Will Deacon
  2020-11-09 23:25 ` [PATCH v4 0/4] Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO Will Deacon
  4 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2020-11-03 12:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Kees Cook, Catalin Marinas, Sami Tolvanen,
	Masahiro Yamada, Peter Zijlstra, linux-kernel

In preparation for patching the internals of READ_ONCE() itself, replace
its usage on the alternatives patching path with a volatile variable
instead.

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/kernel/alternative.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index 73039949b5ce..a57cffb752e8 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -21,7 +21,8 @@
 #define ALT_ORIG_PTR(a)		__ALT_PTR(a, orig_offset)
 #define ALT_REPL_PTR(a)		__ALT_PTR(a, alt_offset)
 
-static int all_alternatives_applied;
+/* Volatile, as we may be patching the guts of READ_ONCE() */
+static volatile int all_alternatives_applied;
 
 static DECLARE_BITMAP(applied_alternatives, ARM64_NCAPS);
 
@@ -205,7 +206,7 @@ static int __apply_alternatives_multi_stop(void *unused)
 
 	/* We always have a CPU 0 at this point (__init) */
 	if (smp_processor_id()) {
-		while (!READ_ONCE(all_alternatives_applied))
+		while (!all_alternatives_applied)
 			cpu_relax();
 		isb();
 	} else {
@@ -217,7 +218,7 @@ static int __apply_alternatives_multi_stop(void *unused)
 		BUG_ON(all_alternatives_applied);
 		__apply_alternatives(&region, false, remaining_capabilities);
 		/* Barriers provided by the cache flushing */
-		WRITE_ONCE(all_alternatives_applied, 1);
+		all_alternatives_applied = 1;
 	}
 
 	return 0;
-- 
2.29.1.341.ge80a0c044ae-goog



* [PATCH v4 4/4] arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y
  2020-11-03 12:17 [PATCH v4 0/4] Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO Will Deacon
                   ` (2 preceding siblings ...)
  2020-11-03 12:17 ` [PATCH v4 3/4] arm64: alternatives: Remove READ_ONCE() usage during patch operation Will Deacon
@ 2020-11-03 12:17 ` Will Deacon
  2020-11-03 12:58   ` Mark Rutland
  2020-11-09 23:25 ` [PATCH v4 0/4] Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO Will Deacon
  4 siblings, 1 reply; 12+ messages in thread
From: Will Deacon @ 2020-11-03 12:17 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Kees Cook, Catalin Marinas, Sami Tolvanen,
	Masahiro Yamada, Peter Zijlstra, linux-kernel

When building with LTO, there is an increased risk of the compiler
converting an address dependency headed by a READ_ONCE() invocation
into a control dependency and consequently allowing for harmful
reordering by the CPU.

Ensure that such transformations are harmless by overriding the generic
READ_ONCE() definition with one that provides acquire semantics when
building with LTO.
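
As a (hypothetical) illustration of the hazard:

	int *p = READ_ONCE(gp);
	return *p;

If whole-program analysis concludes that gp can only ever point at 'a'
or 'b', the compiler is entitled to emit the equivalent of:

	int *p = READ_ONCE(gp);
	return (p == &a) ? a : b;

turning the address dependency into a control dependency, which does
not order subsequent loads: the CPU may then speculate the loads of
'a' and 'b' before the load of gp, breaking the ordering the address
dependency used to provide.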

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Will Deacon <will@kernel.org>
---
 arch/arm64/include/asm/rwonce.h   | 63 +++++++++++++++++++++++++++++++
 arch/arm64/kernel/vdso/Makefile   |  2 +-
 arch/arm64/kernel/vdso32/Makefile |  2 +-
 arch/arm64/kernel/vmlinux.lds.S   |  2 +-
 4 files changed, 66 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm64/include/asm/rwonce.h

diff --git a/arch/arm64/include/asm/rwonce.h b/arch/arm64/include/asm/rwonce.h
new file mode 100644
index 000000000000..d78eb4cb795b
--- /dev/null
+++ b/arch/arm64/include/asm/rwonce.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2020 Google LLC.
+ */
+#ifndef __ASM_RWONCE_H
+#define __ASM_RWONCE_H
+
+#ifdef CONFIG_LTO
+
+#include <linux/compiler_types.h>
+#include <asm/alternative-macros.h>
+
+#ifndef BUILD_VDSO
+
+#ifdef CONFIG_AS_HAS_LDAPR
+#define __LOAD_RCPC(sfx, regs...)					\
+	ALTERNATIVE(							\
+		"ldar"	#sfx "\t" #regs,				\
+		".arch_extension rcpc\n"				\
+		"ldapr"	#sfx "\t" #regs,				\
+	ARM64_HAS_LDAPR)
+#else
+#define __LOAD_RCPC(sfx, regs...)	"ldar" #sfx "\t" #regs
+#endif /* CONFIG_AS_HAS_LDAPR */
+
+#define __READ_ONCE(x)							\
+({									\
+	typeof(&(x)) __x = &(x);					\
+	int atomic = 1;							\
+	union { __unqual_scalar_typeof(*__x) __val; char __c[1]; } __u;	\
+	switch (sizeof(x)) {						\
+	case 1:								\
+		asm volatile(__LOAD_RCPC(b, %w0, %1)			\
+			: "=r" (*(__u8 *)__u.__c)			\
+			: "Q" (*__x) : "memory");			\
+		break;							\
+	case 2:								\
+		asm volatile(__LOAD_RCPC(h, %w0, %1)			\
+			: "=r" (*(__u16 *)__u.__c)			\
+			: "Q" (*__x) : "memory");			\
+		break;							\
+	case 4:								\
+		asm volatile(__LOAD_RCPC(, %w0, %1)			\
+			: "=r" (*(__u32 *)__u.__c)			\
+			: "Q" (*__x) : "memory");			\
+		break;							\
+	case 8:								\
+		asm volatile(__LOAD_RCPC(, %0, %1)			\
+			: "=r" (*(__u64 *)__u.__c)			\
+			: "Q" (*__x) : "memory");			\
+		break;							\
+	default:							\
+		atomic = 0;						\
+	}								\
+	atomic ? (typeof(*__x))__u.__val : (*(volatile typeof(__x))__x);\
+})
+
+#endif	/* !BUILD_VDSO */
+#endif	/* CONFIG_LTO */
+
+#include <asm-generic/rwonce.h>
+
+#endif	/* __ASM_RWONCE_H */
diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
index d65f52264aba..a8f8e409e2bf 100644
--- a/arch/arm64/kernel/vdso/Makefile
+++ b/arch/arm64/kernel/vdso/Makefile
@@ -28,7 +28,7 @@ ldflags-y := -shared -nostdlib -soname=linux-vdso.so.1 --hash-style=sysv	\
 	     $(btildflags-y) -T
 
 ccflags-y := -fno-common -fno-builtin -fno-stack-protector -ffixed-x18
-ccflags-y += -DDISABLE_BRANCH_PROFILING
+ccflags-y += -DDISABLE_BRANCH_PROFILING -DBUILD_VDSO
 
 CFLAGS_REMOVE_vgettimeofday.o = $(CC_FLAGS_FTRACE) -Os $(CC_FLAGS_SCS) $(GCC_PLUGINS_CFLAGS)
 KASAN_SANITIZE			:= n
diff --git a/arch/arm64/kernel/vdso32/Makefile b/arch/arm64/kernel/vdso32/Makefile
index 79280c53b9a6..a1e0f91e6cea 100644
--- a/arch/arm64/kernel/vdso32/Makefile
+++ b/arch/arm64/kernel/vdso32/Makefile
@@ -48,7 +48,7 @@ cc32-as-instr = $(call try-run,\
 # As a result we set our own flags here.
 
 # KBUILD_CPPFLAGS and NOSTDINC_FLAGS from top-level Makefile
-VDSO_CPPFLAGS := -D__KERNEL__ -nostdinc -isystem $(shell $(CC_COMPAT) -print-file-name=include)
+VDSO_CPPFLAGS := -DBUILD_VDSO -D__KERNEL__ -nostdinc -isystem $(shell $(CC_COMPAT) -print-file-name=include)
 VDSO_CPPFLAGS += $(LINUXINCLUDE)
 
 # Common C and assembly flags
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 1bda604f4c70..d6cdcf4aa6a5 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -201,7 +201,7 @@ SECTIONS
 		INIT_CALLS
 		CON_INITCALL
 		INIT_RAM_FS
-		*(.init.rodata.* .init.bss)	/* from the EFI stub */
+		*(.init.altinstructions .init.rodata.* .init.bss)	/* from the EFI stub */
 	}
 	.exit.data : {
 		EXIT_DATA
-- 
2.29.1.341.ge80a0c044ae-goog



* Re: [PATCH v4 1/4] arm64: alternatives: Split up alternative.h
  2020-11-03 12:17 ` [PATCH v4 1/4] arm64: alternatives: Split up alternative.h Will Deacon
@ 2020-11-03 12:40   ` Mark Rutland
  2020-11-03 12:42     ` Will Deacon
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Rutland @ 2020-11-03 12:40 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, Kees Cook, Catalin Marinas, Sami Tolvanen,
	Masahiro Yamada, Peter Zijlstra, linux-kernel

On Tue, Nov 03, 2020 at 12:17:18PM +0000, Will Deacon wrote:
> asm/alternative.h contains both the macros needed to use alternatives,
> as well as the type definitions and function prototypes for applying them.
> 
> Split the header in two, so that alternatives can be used from core
> header files such as linux/compiler.h without the risk of circular
> includes.
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

As a heads-up, the uaccess macro move will end up conflicting with my
uaccess rework. I have a patch moving those out into asm/asm-uaccess.h:

https://lore.kernel.org/r/20201006144642.12195-9-mark.rutland@arm.com

... would you be happy to take that as a prep patch? Then in this
patch you'd need to modify asm/asm-uaccess.h to include
asm/alternative-macros.h.

That way I can also carry that prep patch in the uaccess series, and
avoid nasty merge conflicts, and it seems to make sense to factor out
the uaccess bits anyway since they're not common alternative macros.

The patch itself looks fine to me, so FWIW (ideally with the above):

Acked-by: Mark Rutland <mark.rutland@arm.com>

Thanks,
Mark.


* Re: [PATCH v4 1/4] arm64: alternatives: Split up alternative.h
  2020-11-03 12:40   ` Mark Rutland
@ 2020-11-03 12:42     ` Will Deacon
  0 siblings, 0 replies; 12+ messages in thread
From: Will Deacon @ 2020-11-03 12:42 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, Kees Cook, Catalin Marinas, Sami Tolvanen,
	Masahiro Yamada, Peter Zijlstra, linux-kernel

On Tue, Nov 03, 2020 at 12:40:18PM +0000, Mark Rutland wrote:
> On Tue, Nov 03, 2020 at 12:17:18PM +0000, Will Deacon wrote:
> > asm/alternative.h contains both the macros needed to use alternatives,
> > as well as the type definitions and function prototypes for applying them.
> > 
> > Split the header in two, so that alternatives can be used from core
> > header files such as linux/compiler.h without the risk of circular
> > includes.
> > 
> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Signed-off-by: Will Deacon <will@kernel.org>
> 
> As a heads-up, the uaccess macro move will end up conflicting with my
> uaccess rework. I have a patch moving those out into asm/asm-uaccess.h:
> 
> https://lore.kernel.org/r/20201006144642.12195-9-mark.rutland@arm.com
> 
> ... would you be happy to take that as a prep patch? Then in this
> patch you'd need to modify asm/asm-uaccess.h to include
> asm/alternative-macros.h.

Sure thing, I'll do that when I put the branch together.

> That way I can also carry that prep patch in the uaccess series, and
> avoid nasty merge conflicts, and it seems to make sense to factor out
> the uaccess bits anyway since they're not common alternative macros.
> 
> The patch itself looks fine to me, so FWIW (ideally with the above):
> 
> Acked-by: Mark Rutland <mark.rutland@arm.com>

Cheers!

Will


* Re: [PATCH v4 2/4] arm64: cpufeatures: Add capability for LDAPR instruction
  2020-11-03 12:17 ` [PATCH v4 2/4] arm64: cpufeatures: Add capability for LDAPR instruction Will Deacon
@ 2020-11-03 12:44   ` Mark Rutland
  0 siblings, 0 replies; 12+ messages in thread
From: Mark Rutland @ 2020-11-03 12:44 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, Kees Cook, Catalin Marinas, Sami Tolvanen,
	Masahiro Yamada, Peter Zijlstra, linux-kernel

On Tue, Nov 03, 2020 at 12:17:19PM +0000, Will Deacon wrote:
> Armv8.3 introduced the LDAPR instruction, which provides weaker memory
> ordering semantics than LDARi (RCpc vs RCsc). Generally, we provide an
> RCsc implementation when implementing the Linux memory model, but LDAPR
> can be used as a useful alternative to dependency ordering, particularly
> when the compiler is capable of breaking the dependencies.
> 
> Since LDAPR is not available on all CPUs, add a cpufeature to detect it at
> runtime and allow the instruction to be used with alternative code
> patching.
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.


* Re: [PATCH v4 3/4] arm64: alternatives: Remove READ_ONCE() usage during patch operation
  2020-11-03 12:17 ` [PATCH v4 3/4] arm64: alternatives: Remove READ_ONCE() usage during patch operation Will Deacon
@ 2020-11-03 12:46   ` Mark Rutland
  0 siblings, 0 replies; 12+ messages in thread
From: Mark Rutland @ 2020-11-03 12:46 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, Kees Cook, Catalin Marinas, Sami Tolvanen,
	Masahiro Yamada, Peter Zijlstra, linux-kernel

On Tue, Nov 03, 2020 at 12:17:20PM +0000, Will Deacon wrote:
> In preparation for patching the internals of READ_ONCE() itself, replace
> its usage on the alternatives patching path with a volatile variable
> instead.
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  arch/arm64/kernel/alternative.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
> index 73039949b5ce..a57cffb752e8 100644
> --- a/arch/arm64/kernel/alternative.c
> +++ b/arch/arm64/kernel/alternative.c
> @@ -21,7 +21,8 @@
>  #define ALT_ORIG_PTR(a)		__ALT_PTR(a, orig_offset)
>  #define ALT_REPL_PTR(a)		__ALT_PTR(a, alt_offset)
>  
> -static int all_alternatives_applied;
> +/* Volatile, as we may be patching the guts of READ_ONCE() */
> +static volatile int all_alternatives_applied;
>  
>  static DECLARE_BITMAP(applied_alternatives, ARM64_NCAPS);
>  
> @@ -205,7 +206,7 @@ static int __apply_alternatives_multi_stop(void *unused)
>  
>  	/* We always have a CPU 0 at this point (__init) */
>  	if (smp_processor_id()) {
> -		while (!READ_ONCE(all_alternatives_applied))
> +		while (!all_alternatives_applied)
>  			cpu_relax();
>  		isb();
>  	} else {
> @@ -217,7 +218,7 @@ static int __apply_alternatives_multi_stop(void *unused)
>  		BUG_ON(all_alternatives_applied);
>  		__apply_alternatives(&region, false, remaining_capabilities);
>  		/* Barriers provided by the cache flushing */
> -		WRITE_ONCE(all_alternatives_applied, 1);
> +		all_alternatives_applied = 1;
>  	}
>  
>  	return 0;
> -- 
> 2.29.1.341.ge80a0c044ae-goog
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread
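
For context on why a plain volatile variable is an adequate stand-in
here: the generic READ_ONCE() of this era is itself essentially a
volatile access plus type checking, so an explicitly volatile int gives
the same single access guarantee without depending on the macro that is
about to be patched. Roughly (a simplified sketch of the v5.10-era
include/asm-generic/rwonce.h, not a verbatim copy):

	#define __READ_ONCE(x)						\
		(*(const volatile __unqual_scalar_typeof(x) *)&(x))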

* Re: [PATCH v4 4/4] arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y
  2020-11-03 12:17 ` [PATCH v4 4/4] arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y Will Deacon
@ 2020-11-03 12:58   ` Mark Rutland
  2020-11-03 13:13     ` Will Deacon
  0 siblings, 1 reply; 12+ messages in thread
From: Mark Rutland @ 2020-11-03 12:58 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-arm-kernel, Kees Cook, Catalin Marinas, Sami Tolvanen,
	Masahiro Yamada, Peter Zijlstra, linux-kernel

On Tue, Nov 03, 2020 at 12:17:21PM +0000, Will Deacon wrote:
> When building with LTO, there is an increased risk of the compiler
> converting an address dependency headed by a READ_ONCE() invocation
> into a control dependency and consequently allowing for harmful
> reordering by the CPU.
> 
> Ensure that such transformations are harmless by overriding the generic
> READ_ONCE() definition with one that provides acquire semantics when
> building with LTO.
> 
> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Will Deacon <will@kernel.org>

[...]

Could we add a note above __READ_ONCE() along the lines of the commit
message, e.g.

/*
 * With LTO a compiler might convert an address dependency headed by a
 * READ_ONCE() into a control dependency, allowing for harmful
 * reordering by the CPU.
 *
 * To prevent this, upgrade READ_OONCE() to provide acquire semantics
 * when building with LTO.
 */
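
To make the hazard concrete, a sketch of the transformation being
guarded against (all names here are illustrative, not kernel symbols):

	struct foo { int val; };
	struct foo a, b, *gp;

	int reader(void)
	{
		struct foo *p = READ_ONCE(gp);

		/*
		 * Address dependency: the load of p->val is ordered
		 * after the load of gp. If whole-program analysis
		 * proves gp only ever points to &a or &b, the compiler
		 * may instead emit the equivalent of:
		 *
		 *	return (p == &a) ? a.val : b.val;
		 *
		 * turning the address dependency into a control
		 * dependency, which no longer orders the second load
		 * on a weakly ordered CPU.
		 */
		return p->val;
	}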

Either way:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark

> +#define __READ_ONCE(x)							\
> +({									\
> +	typeof(&(x)) __x = &(x);					\
> +	int atomic = 1;							\
> +	union { __unqual_scalar_typeof(*__x) __val; char __c[1]; } __u;	\
> +	switch (sizeof(x)) {						\
> +	case 1:								\
> +		asm volatile(__LOAD_RCPC(b, %w0, %1)			\
> +			: "=r" (*(__u8 *)__u.__c)			\
> +			: "Q" (*__x) : "memory");			\
> +		break;							\
> +	case 2:								\
> +		asm volatile(__LOAD_RCPC(h, %w0, %1)			\
> +			: "=r" (*(__u16 *)__u.__c)			\
> +			: "Q" (*__x) : "memory");			\
> +		break;							\
> +	case 4:								\
> +		asm volatile(__LOAD_RCPC(, %w0, %1)			\
> +			: "=r" (*(__u32 *)__u.__c)			\
> +			: "Q" (*__x) : "memory");			\
> +		break;							\
> +	case 8:								\
> +		asm volatile(__LOAD_RCPC(, %0, %1)			\
> +			: "=r" (*(__u64 *)__u.__c)			\
> +			: "Q" (*__x) : "memory");			\
> +		break;							\
> +	default:							\
> +		atomic = 0;						\
> +	}								\
> +	atomic ? (typeof(*__x))__u.__val : (*(volatile typeof(__x))__x);\
> +})
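
For readers without the full patch to hand, __LOAD_RCPC in the hunk
above is an alternatives-patched load: it defaults to LDAR (an RCsc
acquire, available everywhere) and is rewritten to LDAPR once
ARM64_HAS_LDAPR is established. A sketch consistent with the quoted
code (reconstructed here, so treat the exact spelling as approximate):

	/* Default to LDAR; patch in LDAPR when the CPU supports it. */
	#define __LOAD_RCPC(sfx, regs...)				\
		ALTERNATIVE(						\
			"ldar" #sfx "\t" #regs,				\
		".arch_extension rcpc\n"				\
		"ldapr" #sfx "\t" #regs,				\
		ARM64_HAS_LDAPR)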

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 4/4] arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y
  2020-11-03 12:58   ` Mark Rutland
@ 2020-11-03 13:13     ` Will Deacon
  0 siblings, 0 replies; 12+ messages in thread
From: Will Deacon @ 2020-11-03 13:13 UTC (permalink / raw)
  To: Mark Rutland
  Cc: linux-arm-kernel, Kees Cook, Catalin Marinas, Sami Tolvanen,
	Masahiro Yamada, Peter Zijlstra, linux-kernel

On Tue, Nov 03, 2020 at 12:58:45PM +0000, Mark Rutland wrote:
> On Tue, Nov 03, 2020 at 12:17:21PM +0000, Will Deacon wrote:
> > When building with LTO, there is an increased risk of the compiler
> > converting an address dependency headed by a READ_ONCE() invocation
> > into a control dependency and consequently allowing for harmful
> > reordering by the CPU.
> > 
> > Ensure that such transformations are harmless by overriding the generic
> > READ_ONCE() definition with one that provides acquire semantics when
> > building with LTO.
> > 
> > Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > Signed-off-by: Will Deacon <will@kernel.org>
> 
> [...]
> 
> Could we add a note above __READ_ONCE() along the lines of the commit
> message, e.g.
> 
> /*
>  * With LTO a compiler might convert an address dependency headed by a
>  * READ_ONCE() into a control dependency, allowing for harmful
>  * reordering by the CPU.
>  *
>  * To prevent this, upgrade READ_OONCE() to provide acquire semantics
>  * when building with LTO.

It's not halloween any moooore :)

But yes, I'll add something to that effect, cheers.

Will

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 0/4] Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO
  2020-11-03 12:17 [PATCH v4 0/4] Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO Will Deacon
                   ` (3 preceding siblings ...)
  2020-11-03 12:17 ` [PATCH v4 4/4] arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y Will Deacon
@ 2020-11-09 23:25 ` Will Deacon
  4 siblings, 0 replies; 12+ messages in thread
From: Will Deacon @ 2020-11-09 23:25 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Kees Cook, Catalin Marinas, Sami Tolvanen, Masahiro Yamada,
	Peter Zijlstra, linux-kernel

On Tue, Nov 03, 2020 at 12:17:17PM +0000, Will Deacon wrote:
> Hi all,
> 
> These patches were previously posted as part of a larger series enabling
> architectures to override __READ_ONCE():
> 
>   v3: https://lore.kernel.org/lkml/20200710165203.31284-1-will@kernel.org/
> 
> With the bulk of that merged, the four patches here override READ_ONCE()
> so that it gains RCpc acquire semantics on arm64 when LTO is enabled. We
> can revisit this as and when the compiler provides a means for us to reason
> about the result of dependency-breaking optimisations. In the meantime,
> this unblocks LTO for arm64, which I would really like to see merged so
> that we can focus on enabling CFI.
> 
> I plan to queue these on their own branch in the arm64 tree for 5.11 at
> -rc3.

Now pushed to for-next/lto:

https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/log/?h=for-next/lto

with Mark's comments addressed.

Will

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-11-09 23:25 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-03 12:17 [PATCH v4 0/4] Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO Will Deacon
2020-11-03 12:17 ` [PATCH v4 1/4] arm64: alternatives: Split up alternative.h Will Deacon
2020-11-03 12:40   ` Mark Rutland
2020-11-03 12:42     ` Will Deacon
2020-11-03 12:17 ` [PATCH v4 2/4] arm64: cpufeatures: Add capability for LDAPR instruction Will Deacon
2020-11-03 12:44   ` Mark Rutland
2020-11-03 12:17 ` [PATCH v4 3/4] arm64: alternatives: Remove READ_ONCE() usage during patch operation Will Deacon
2020-11-03 12:46   ` Mark Rutland
2020-11-03 12:17 ` [PATCH v4 4/4] arm64: lto: Strengthen READ_ONCE() to acquire when CONFIG_LTO=y Will Deacon
2020-11-03 12:58   ` Mark Rutland
2020-11-03 13:13     ` Will Deacon
2020-11-09 23:25 ` [PATCH v4 0/4] Upgrade READ_ONCE() to RCpc acquire on arm64 with LTO Will Deacon
