linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/22] x86: Remove anonymous out-of-line fixups
@ 2021-11-05 17:10 Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 01/22] bitfield.h: Fix "type of reg too small for mask" test Peter Zijlstra
                   ` (21 more replies)
  0 siblings, 22 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Hi,

Direct counterpart to the arm64 series from Mark:

  https://lkml.kernel.org/r/20211019160219.5202-1-mark.rutland@arm.com

Since he already put it rather well:

"We recently realised that out-of-line extable fixups cause a number of problems
for backtracing (mattering both for developers and for RELIABLE_STACKTRACE and
LIVEPATCH). Dmitry spotted a confusing backtrace, which we identified was due
to problems with unwinding fixups, as summarized in:

  https://lore.kernel.org/linux-arm-kernel/20210927171812.GB9201@C02TD0UTHF1T.local/

The gist is that while backtracing through a fixup, the fixup gets symbolized
as an offset from the nearest prior symbol (which happens to be
`__entry_tramp_text_end`), and so the backtrace misses the function that was
being fixed up (because the fixup handling adjusts the PC, then the fixup does
a direct branch back to the original function). We can't reliably map from an
arbitrary PC in the fixup text back to the original function.

The way we create fixups is a bit unfortunate: most fixups are generated from
common templates, and only differ in the register to be poked and the address to
branch back to, leading to redundant copies of the same logic that pollute the
kernel text.

Since the fixups are all written in assembly, and duplicated for each fixup
site, we can only perform very simple fixups, and can't handle any complex
triage that we might need for some exceptions (e.g. MTE faults)."


This time things have been build tested for both i386 and x86_64
(defconfig, allyesconfig), boot tested on x86_64, and a guest was even started
inside of that.

Also available here:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git x86/wip.extable

Changes since RFC:

 - rebase to origin/master
 - Fixup missing mmx prefetch and use 3DNOWPREFETCH feature
 - renamed POP_SEG to POP_ZERO, changed size, added comment
 - added found to extable_type_reg voodoo
 - used insn-eval.c copy of pt_regs indexing
 - renamed exception_table_entry::type to ::data
 - renamed macro magic
 - removed ltype from __get_user_asm()
 - dropped ftrace patch
 - simpler kvm patch
 - rewrote all of load_unaligned_zeropad()
 - removed .fixup from objtool

---
 arch/x86/entry/entry_32.S                  |  28 ++-----
 arch/x86/entry/entry_64.S                  |  13 ++-
 arch/x86/entry/vdso/vdso-layout.lds.S      |   1 -
 arch/x86/include/asm/asm.h                 |  33 ++++++++
 arch/x86/include/asm/extable.h             |   6 +-
 arch/x86/include/asm/extable_fixup_types.h |  46 +++++++++--
 arch/x86/include/asm/futex.h               |  28 ++-----
 arch/x86/include/asm/insn-eval.h           |   2 +
 arch/x86/include/asm/msr.h                 |  26 ++----
 arch/x86/include/asm/segment.h             |   9 +--
 arch/x86/include/asm/sgx.h                 |  18 +++++
 arch/x86/include/asm/uaccess.h             |  39 ++++-----
 arch/x86/include/asm/word-at-a-time.h      |  67 +++++++++++-----
 arch/x86/include/asm/xen/page.h            |  12 +--
 arch/x86/kernel/cpu/sgx/encls.h            |  36 ++-------
 arch/x86/kernel/fpu/legacy.h               |   6 +-
 arch/x86/kernel/fpu/xstate.h               |   6 +-
 arch/x86/kernel/vmlinux.lds.S              |   1 -
 arch/x86/kvm/emulate.c                     |  16 +---
 arch/x86/kvm/vmx/vmx_ops.h                 |  14 ++--
 arch/x86/lib/checksum_32.S                 |  19 +----
 arch/x86/lib/copy_mc_64.S                  |  12 +--
 arch/x86/lib/copy_user_64.S                |  32 +++-----
 arch/x86/lib/insn-eval.c                   |  66 +++++++++------
 arch/x86/lib/mmx_32.c                      |  86 +++++++-------------
 arch/x86/lib/usercopy_32.c                 |  66 ++++++---------
 arch/x86/lib/usercopy_64.c                 |   8 +-
 arch/x86/mm/extable.c                      | 124 ++++++++++++++++++++++-------
 arch/x86/net/bpf_jit_comp.c                |   2 +-
 include/linux/bitfield.h                   |  19 ++++-
 tools/objtool/check.c                      |   8 +-
 31 files changed, 445 insertions(+), 404 deletions(-)


^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 01/22] bitfield.h: Fix "type of reg too small for mask" test
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 02/22] x86,mmx_32: Remove .fixup usage Peter Zijlstra
                   ` (20 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

The test: 'mask > (typeof(_reg))~0ull' only works correctly when both
sides are unsigned; consider:

 - 0xff000000 vs (int)~0ull
 - 0x000000ff vs (int)~0ull
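
A minimal user-space sketch of the second case (illustrative only, not part
of the patch): with a signed 32-bit register type, (int)~0ull evaluates to
-1, so even a mask that trivially fits makes the "type of reg too small for
mask" check fire.

	#include <stdio.h>

	int main(void)
	{
		int mask = 0x000000ff;	/* easily fits in a signed 32-bit reg */

		/* old check: mask > (typeof(_reg))~0ull with _reg an int:
		 * 255 > -1 is true, so the build error fires spuriously */
		printf("%d\n", mask > (int)~0ull);
		return 0;
	}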

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 include/linux/bitfield.h |   19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

--- a/include/linux/bitfield.h
+++ b/include/linux/bitfield.h
@@ -41,6 +41,22 @@
 
 #define __bf_shf(x) (__builtin_ffsll(x) - 1)
 
+#define __scalar_type_to_unsigned_cases(type)				\
+		unsigned type:	(unsigned type)0,			\
+		signed type:	(unsigned type)0
+
+#define __unsigned_scalar_typeof(x) typeof(				\
+		_Generic((x),						\
+			char:	(unsigned char)0,			\
+			__scalar_type_to_unsigned_cases(char),		\
+			__scalar_type_to_unsigned_cases(short),		\
+			__scalar_type_to_unsigned_cases(int),		\
+			__scalar_type_to_unsigned_cases(long),		\
+			__scalar_type_to_unsigned_cases(long long),	\
+			default: (x)))
+
+#define __bf_cast_unsigned(type, x)	((__unsigned_scalar_typeof(type))(x))
+
 #define __BF_FIELD_CHECK(_mask, _reg, _val, _pfx)			\
 	({								\
 		BUILD_BUG_ON_MSG(!__builtin_constant_p(_mask),		\
@@ -49,7 +65,8 @@
 		BUILD_BUG_ON_MSG(__builtin_constant_p(_val) ?		\
 				 ~((_mask) >> __bf_shf(_mask)) & (_val) : 0, \
 				 _pfx "value too large for the field"); \
-		BUILD_BUG_ON_MSG((_mask) > (typeof(_reg))~0ull,		\
+		BUILD_BUG_ON_MSG(__bf_cast_unsigned(_mask, _mask) >	\
+				 __bf_cast_unsigned(_reg, ~0ull),	\
 				 _pfx "type of reg too small for mask"); \
 		__BUILD_BUG_ON_NOT_POWER_OF_2((_mask) +			\
 					      (1ULL << __bf_shf(_mask))); \



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 02/22] x86,mmx_32: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 01/22] bitfield.h: Fix "type of reg too small for mask" test Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 03/22] x86,copy_user_64: " Peter Zijlstra
                   ` (19 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

This code puts an exception table entry on the PREFETCH instruction to
overwrite it with a jmp.d8 when it triggers an exception. Except of
course, kernel code is no longer writable at runtime, and patching it from
an exception handler was never SMP safe to begin with.

Replace it with ALTERNATIVE() keyed off X86_FEATURE_3DNOWPREFETCH, so the
prefetch instructions are only patched in on CPUs that advertise the feature.

XXX: arguably we should just delete this code

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/lib/mmx_32.c |   86 +++++++++++++++++---------------------------------
 1 file changed, 30 insertions(+), 56 deletions(-)

--- a/arch/x86/lib/mmx_32.c
+++ b/arch/x86/lib/mmx_32.c
@@ -50,23 +50,18 @@ void *_mmx_memcpy(void *to, const void *
 	kernel_fpu_begin_mask(KFPU_387);
 
 	__asm__ __volatile__ (
-		"1: prefetch (%0)\n"		/* This set is 28 bytes */
-		"   prefetch 64(%0)\n"
-		"   prefetch 128(%0)\n"
-		"   prefetch 192(%0)\n"
-		"   prefetch 256(%0)\n"
-		"2:  \n"
-		".section .fixup, \"ax\"\n"
-		"3: movw $0x1AEB, 1b\n"	/* jmp on 26 bytes */
-		"   jmp 2b\n"
-		".previous\n"
-			_ASM_EXTABLE(1b, 3b)
-			: : "r" (from));
+		ALTERNATIVE "",
+			    "prefetch (%0)\n"
+			    "prefetch 64(%0)\n"
+			    "prefetch 128(%0)\n"
+			    "prefetch 192(%0)\n"
+			    "prefetch 256(%0)\n", X86_FEATURE_3DNOWPREFETCH
+		: : "r" (from));
 
 	for ( ; i > 5; i--) {
 		__asm__ __volatile__ (
-		"1:  prefetch 320(%0)\n"
-		"2:  movq (%0), %%mm0\n"
+		ALTERNATIVE "", "prefetch 320(%0)\n", X86_FEATURE_3DNOWPREFETCH
+		"  movq (%0), %%mm0\n"
 		"  movq 8(%0), %%mm1\n"
 		"  movq 16(%0), %%mm2\n"
 		"  movq 24(%0), %%mm3\n"
@@ -82,11 +77,6 @@ void *_mmx_memcpy(void *to, const void *
 		"  movq %%mm1, 40(%1)\n"
 		"  movq %%mm2, 48(%1)\n"
 		"  movq %%mm3, 56(%1)\n"
-		".section .fixup, \"ax\"\n"
-		"3: movw $0x05EB, 1b\n"	/* jmp on 5 bytes */
-		"   jmp 2b\n"
-		".previous\n"
-			_ASM_EXTABLE(1b, 3b)
 			: : "r" (from), "r" (to) : "memory");
 
 		from += 64;
@@ -177,22 +167,18 @@ static void fast_copy_page(void *to, voi
 	 * but that is for later. -AV
 	 */
 	__asm__ __volatile__(
-		"1: prefetch (%0)\n"
-		"   prefetch 64(%0)\n"
-		"   prefetch 128(%0)\n"
-		"   prefetch 192(%0)\n"
-		"   prefetch 256(%0)\n"
-		"2:  \n"
-		".section .fixup, \"ax\"\n"
-		"3: movw $0x1AEB, 1b\n"	/* jmp on 26 bytes */
-		"   jmp 2b\n"
-		".previous\n"
-			_ASM_EXTABLE(1b, 3b) : : "r" (from));
+		ALTERNATIVE "",
+			    "prefetch (%0)\n"
+			    "prefetch 64(%0)\n"
+			    "prefetch 128(%0)\n"
+			    "prefetch 192(%0)\n"
+			    "prefetch 256(%0)\n", X86_FEATURE_3DNOWPREFETCH
+		: : "r" (from));
 
 	for (i = 0; i < (4096-320)/64; i++) {
 		__asm__ __volatile__ (
-		"1: prefetch 320(%0)\n"
-		"2: movq (%0), %%mm0\n"
+		ALTERNATIVE "", "prefetch 320(%0)\n", X86_FEATURE_3DNOWPREFETCH
+		"   movq (%0), %%mm0\n"
 		"   movntq %%mm0, (%1)\n"
 		"   movq 8(%0), %%mm1\n"
 		"   movntq %%mm1, 8(%1)\n"
@@ -208,11 +194,7 @@ static void fast_copy_page(void *to, voi
 		"   movntq %%mm6, 48(%1)\n"
 		"   movq 56(%0), %%mm7\n"
 		"   movntq %%mm7, 56(%1)\n"
-		".section .fixup, \"ax\"\n"
-		"3: movw $0x05EB, 1b\n"	/* jmp on 5 bytes */
-		"   jmp 2b\n"
-		".previous\n"
-		_ASM_EXTABLE(1b, 3b) : : "r" (from), "r" (to) : "memory");
+			: : "r" (from), "r" (to) : "memory");
 
 		from += 64;
 		to += 64;
@@ -220,7 +202,7 @@ static void fast_copy_page(void *to, voi
 
 	for (i = (4096-320)/64; i < 4096/64; i++) {
 		__asm__ __volatile__ (
-		"2: movq (%0), %%mm0\n"
+		"   movq (%0), %%mm0\n"
 		"   movntq %%mm0, (%1)\n"
 		"   movq 8(%0), %%mm1\n"
 		"   movntq %%mm1, 8(%1)\n"
@@ -237,6 +219,7 @@ static void fast_copy_page(void *to, voi
 		"   movq 56(%0), %%mm7\n"
 		"   movntq %%mm7, 56(%1)\n"
 			: : "r" (from), "r" (to) : "memory");
+
 		from += 64;
 		to += 64;
 	}
@@ -295,22 +278,18 @@ static void fast_copy_page(void *to, voi
 	kernel_fpu_begin_mask(KFPU_387);
 
 	__asm__ __volatile__ (
-		"1: prefetch (%0)\n"
-		"   prefetch 64(%0)\n"
-		"   prefetch 128(%0)\n"
-		"   prefetch 192(%0)\n"
-		"   prefetch 256(%0)\n"
-		"2:  \n"
-		".section .fixup, \"ax\"\n"
-		"3: movw $0x1AEB, 1b\n"	/* jmp on 26 bytes */
-		"   jmp 2b\n"
-		".previous\n"
-			_ASM_EXTABLE(1b, 3b) : : "r" (from));
+		ALTERNATIVE "",
+			    "prefetch (%0)\n"
+			    "prefetch 64(%0)\n"
+			    "prefetch 128(%0)\n"
+			    "prefetch 192(%0)\n"
+			    "prefetch 256(%0)\n", X86_FEATURE_3DNOWPREFETCH
+		: : "r" (from));
 
 	for (i = 0; i < 4096/64; i++) {
 		__asm__ __volatile__ (
-		"1: prefetch 320(%0)\n"
-		"2: movq (%0), %%mm0\n"
+		ALTERNATIVE "", "prefetch 320(%0)\n", X86_FEATURE_3DNOWPREFETCH
+		"   movq (%0), %%mm0\n"
 		"   movq 8(%0), %%mm1\n"
 		"   movq 16(%0), %%mm2\n"
 		"   movq 24(%0), %%mm3\n"
@@ -326,11 +305,6 @@ static void fast_copy_page(void *to, voi
 		"   movq %%mm1, 40(%1)\n"
 		"   movq %%mm2, 48(%1)\n"
 		"   movq %%mm3, 56(%1)\n"
-		".section .fixup, \"ax\"\n"
-		"3: movw $0x05EB, 1b\n"	/* jmp on 5 bytes */
-		"   jmp 2b\n"
-		".previous\n"
-			_ASM_EXTABLE(1b, 3b)
 			: : "r" (from), "r" (to) : "memory");
 
 		from += 64;



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 03/22] x86,copy_user_64: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 01/22] bitfield.h: Fix "type of reg too small for mask" test Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 02/22] x86,mmx_32: Remove .fixup usage Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 04/22] x86,copy_mc_64: " Peter Zijlstra
                   ` (18 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Place the anonymous .fixup code at the tail of the regular functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/lib/copy_user_64.S |   32 +++++++++++---------------------
 1 file changed, 11 insertions(+), 21 deletions(-)

--- a/arch/x86/lib/copy_user_64.S
+++ b/arch/x86/lib/copy_user_64.S
@@ -32,14 +32,10 @@
 	decl %ecx
 	jnz 100b
 102:
-	.section .fixup,"ax"
-103:	addl %ecx,%edx			/* ecx is zerorest also */
-	jmp .Lcopy_user_handle_tail
-	.previous
 
-	_ASM_EXTABLE_CPY(100b, 103b)
-	_ASM_EXTABLE_CPY(101b, 103b)
-	.endm
+	_ASM_EXTABLE_CPY(100b, .Lcopy_user_handle_align)
+	_ASM_EXTABLE_CPY(101b, .Lcopy_user_handle_align)
+.endm
 
 /*
  * copy_user_generic_unrolled - memory copy with exception handling.
@@ -107,7 +103,6 @@ SYM_FUNC_START(copy_user_generic_unrolle
 	ASM_CLAC
 	ret
 
-	.section .fixup,"ax"
 30:	shll $6,%ecx
 	addl %ecx,%edx
 	jmp 60f
@@ -115,7 +110,6 @@ SYM_FUNC_START(copy_user_generic_unrolle
 	jmp 60f
 50:	movl %ecx,%edx
 60:	jmp .Lcopy_user_handle_tail /* ecx is zerorest also */
-	.previous
 
 	_ASM_EXTABLE_CPY(1b, 30b)
 	_ASM_EXTABLE_CPY(2b, 30b)
@@ -166,20 +160,16 @@ SYM_FUNC_START(copy_user_generic_string)
 	movl %edx,%ecx
 	shrl $3,%ecx
 	andl $7,%edx
-1:	rep
-	movsq
+1:	rep movsq
 2:	movl %edx,%ecx
-3:	rep
-	movsb
+3:	rep movsb
 	xorl %eax,%eax
 	ASM_CLAC
 	ret
 
-	.section .fixup,"ax"
 11:	leal (%rdx,%rcx,8),%ecx
 12:	movl %ecx,%edx		/* ecx is zerorest also */
 	jmp .Lcopy_user_handle_tail
-	.previous
 
 	_ASM_EXTABLE_CPY(1b, 11b)
 	_ASM_EXTABLE_CPY(3b, 12b)
@@ -203,16 +193,13 @@ SYM_FUNC_START(copy_user_enhanced_fast_s
 	cmpl $64,%edx
 	jb .L_copy_short_string	/* less then 64 bytes, avoid the costly 'rep' */
 	movl %edx,%ecx
-1:	rep
-	movsb
+1:	rep movsb
 	xorl %eax,%eax
 	ASM_CLAC
 	ret
 
-	.section .fixup,"ax"
 12:	movl %ecx,%edx		/* ecx is zerorest also */
 	jmp .Lcopy_user_handle_tail
-	.previous
 
 	_ASM_EXTABLE_CPY(1b, 12b)
 SYM_FUNC_END(copy_user_enhanced_fast_string)
@@ -240,6 +227,11 @@ SYM_CODE_START_LOCAL(.Lcopy_user_handle_
 	ret
 
 	_ASM_EXTABLE_CPY(1b, 2b)
+
+.Lcopy_user_handle_align:
+	addl %ecx,%edx			/* ecx is zerorest also */
+	jmp .Lcopy_user_handle_tail
+
 SYM_CODE_END(.Lcopy_user_handle_tail)
 
 /*
@@ -350,7 +342,6 @@ SYM_FUNC_START(__copy_user_nocache)
 	sfence
 	ret
 
-	.section .fixup,"ax"
 .L_fixup_4x8b_copy:
 	shll $6,%ecx
 	addl %ecx,%edx
@@ -366,7 +357,6 @@ SYM_FUNC_START(__copy_user_nocache)
 .L_fixup_handle_tail:
 	sfence
 	jmp .Lcopy_user_handle_tail
-	.previous
 
 	_ASM_EXTABLE_CPY(1b, .L_fixup_4x8b_copy)
 	_ASM_EXTABLE_CPY(2b, .L_fixup_4x8b_copy)



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 04/22] x86,copy_mc_64: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (2 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 03/22] x86,copy_user_64: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 05/22] x86,entry_64: " Peter Zijlstra
                   ` (17 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Place the anonymous .fixup code at the tail of the regular functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/lib/copy_mc_64.S |   12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

--- a/arch/x86/lib/copy_mc_64.S
+++ b/arch/x86/lib/copy_mc_64.S
@@ -78,9 +78,7 @@ SYM_FUNC_START(copy_mc_fragile)
 	xorl %eax, %eax
 .L_done:
 	ret
-SYM_FUNC_END(copy_mc_fragile)
 
-	.section .fixup, "ax"
 	/*
 	 * Return number of bytes not copied for any failure. Note that
 	 * there is no "tail" handling since the source buffer is 8-byte
@@ -105,14 +103,14 @@ SYM_FUNC_END(copy_mc_fragile)
 	movl	%ecx, %edx
 	jmp copy_mc_fragile_handle_tail
 
-	.previous
-
 	_ASM_EXTABLE_TYPE(.L_read_leading_bytes, .E_leading_bytes, EX_TYPE_DEFAULT_MCE_SAFE)
 	_ASM_EXTABLE_TYPE(.L_read_words, .E_read_words, EX_TYPE_DEFAULT_MCE_SAFE)
 	_ASM_EXTABLE_TYPE(.L_read_trailing_bytes, .E_trailing_bytes, EX_TYPE_DEFAULT_MCE_SAFE)
 	_ASM_EXTABLE(.L_write_leading_bytes, .E_leading_bytes)
 	_ASM_EXTABLE(.L_write_words, .E_write_words)
 	_ASM_EXTABLE(.L_write_trailing_bytes, .E_trailing_bytes)
+
+SYM_FUNC_END(copy_mc_fragile)
 #endif /* CONFIG_X86_MCE */
 
 /*
@@ -133,9 +131,7 @@ SYM_FUNC_START(copy_mc_enhanced_fast_str
 	/* Copy successful. Return zero */
 	xorl %eax, %eax
 	ret
-SYM_FUNC_END(copy_mc_enhanced_fast_string)
 
-	.section .fixup, "ax"
 .E_copy:
 	/*
 	 * On fault %rcx is updated such that the copy instruction could
@@ -147,7 +143,7 @@ SYM_FUNC_END(copy_mc_enhanced_fast_strin
 	movq %rcx, %rax
 	ret
 
-	.previous
-
 	_ASM_EXTABLE_TYPE(.L_copy, .E_copy, EX_TYPE_DEFAULT_MCE_SAFE)
+
+SYM_FUNC_END(copy_mc_enhanced_fast_string)
 #endif /* !CONFIG_UML */



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 05/22] x86,entry_64: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (3 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 04/22] x86,copy_mc_64: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 06/22] x86,entry_32: " Peter Zijlstra
                   ` (16 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Place the anonymous .fixup code at the tail of the regular functions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/entry_64.S |   13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -735,13 +735,9 @@ SYM_FUNC_START(asm_load_gs_index)
 	swapgs
 	FRAME_END
 	ret
-SYM_FUNC_END(asm_load_gs_index)
-EXPORT_SYMBOL(asm_load_gs_index)
 
-	_ASM_EXTABLE(.Lgs_change, .Lbad_gs)
-	.section .fixup, "ax"
 	/* running with kernelgs */
-SYM_CODE_START_LOCAL_NOALIGN(.Lbad_gs)
+.Lbad_gs:
 	swapgs					/* switch back to user gs */
 .macro ZAP_GS
 	/* This can't be a string because the preprocessor needs to see it. */
@@ -752,8 +748,11 @@ SYM_CODE_START_LOCAL_NOALIGN(.Lbad_gs)
 	xorl	%eax, %eax
 	movl	%eax, %gs
 	jmp	2b
-SYM_CODE_END(.Lbad_gs)
-	.previous
+
+	_ASM_EXTABLE(.Lgs_change, .Lbad_gs)
+
+SYM_FUNC_END(asm_load_gs_index)
+EXPORT_SYMBOL(asm_load_gs_index)
 
 #ifdef CONFIG_XEN_PV
 /*



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 06/22] x86,entry_32: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (4 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 05/22] x86,entry_64: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 07/22] x86,extable: Extend extable functionality Peter Zijlstra
                   ` (15 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Where possible, push the .fixup into code, at the tail of functions.

This is hard for macros since they're used in multiple functions;
therefore, introduce a new extable handler for the pop-segment case.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/entry_32.S                  |   28 ++++++++--------------------
 arch/x86/include/asm/extable_fixup_types.h |    2 ++
 arch/x86/mm/extable.c                      |   14 ++++++++++++++
 3 files changed, 24 insertions(+), 20 deletions(-)

--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -270,17 +270,9 @@
 3:	popl	%fs
 	addl	$(4 + \pop), %esp	/* pop the unused "gs" slot */
 	IRET_FRAME
-.pushsection .fixup, "ax"
-4:	movl	$0, (%esp)
-	jmp	1b
-5:	movl	$0, (%esp)
-	jmp	2b
-6:	movl	$0, (%esp)
-	jmp	3b
-.popsection
-	_ASM_EXTABLE(1b, 4b)
-	_ASM_EXTABLE(2b, 5b)
-	_ASM_EXTABLE(3b, 6b)
+	_ASM_EXTABLE_TYPE(1b, 1b, EX_TYPE_POP_ZERO)
+	_ASM_EXTABLE_TYPE(2b, 2b, EX_TYPE_POP_ZERO)
+	_ASM_EXTABLE_TYPE(3b, 3b, EX_TYPE_POP_ZERO)
 .endm
 
 .macro RESTORE_ALL_NMI cr3_reg:req pop=0
@@ -925,10 +917,8 @@ SYM_FUNC_START(entry_SYSENTER_32)
 	sti
 	sysexit
 
-.pushsection .fixup, "ax"
-2:	movl	$0, PT_FS(%esp)
-	jmp	1b
-.popsection
+2:	movl    $0, PT_FS(%esp)
+	jmp     1b
 	_ASM_EXTABLE(1b, 2b)
 
 .Lsysenter_fix_flags:
@@ -996,8 +986,7 @@ SYM_FUNC_START(entry_INT80_32)
 	 */
 	iret
 
-.section .fixup, "ax"
-SYM_CODE_START(asm_iret_error)
+.Lasm_iret_error:
 	pushl	$0				# no error code
 	pushl	$iret_error
 
@@ -1014,9 +1003,8 @@ SYM_CODE_START(asm_iret_error)
 #endif
 
 	jmp	handle_exception
-SYM_CODE_END(asm_iret_error)
-.previous
-	_ASM_EXTABLE(.Lirq_return, asm_iret_error)
+
+	_ASM_EXTABLE(.Lirq_return, .Lasm_iret_error)
 SYM_FUNC_END(entry_INT80_32)
 
 .macro FIXUP_ESPFIX_STACK
--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -19,4 +19,6 @@
 #define	EX_TYPE_DEFAULT_MCE_SAFE	12
 #define	EX_TYPE_FAULT_MCE_SAFE		13
 
+#define	EX_TYPE_POP_ZERO		14
+
 #endif
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -99,6 +99,18 @@ static bool ex_handler_clear_fs(const st
 	return ex_handler_default(fixup, regs);
 }
 
+static bool ex_handler_pop_zero(const struct exception_table_entry *fixup,
+				struct pt_regs *regs)
+{
+	/*
+	 * Typically used for when "pop %seg" traps, in which case we'll clear
+	 * the stack slot and re-try the instruction, which will then succeed
+	 * to pop zero.
+	 */
+	*((unsigned long *)regs->sp) = 0;
+	return ex_handler_default(fixup, regs);
+}
+
 int ex_get_fixup_type(unsigned long ip)
 {
 	const struct exception_table_entry *e = search_exception_tables(ip);
@@ -156,6 +168,8 @@ int fixup_exception(struct pt_regs *regs
 	case EX_TYPE_WRMSR_IN_MCE:
 		ex_handler_msr_mce(regs, true);
 		break;
+	case EX_TYPE_POP_ZERO:
+		return ex_handler_pop_zero(e, regs);
 	}
 	BUG();
 }



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 07/22] x86,extable: Extend extable functionality
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (5 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 06/22] x86,entry_32: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 08/22] x86,msr: Remove .fixup usage Peter Zijlstra
                   ` (14 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

In order to remove further .fixup usage, extend the extable
infrastructure to take additional information from the extable entry
sites.

Specifically add _ASM_EXTABLE_TYPE_REG() and EX_TYPE_IMM_REG that
extend the existing _ASM_EXTABLE_TYPE() by taking an additional
register argument and encoding that, together with an s16 immediate, into the
existing s32 type field. This limits the actual types to the first
byte; 255 seems plenty.

Also add a few flags into the type word, specifically CLEAR_AX and
CLEAR_DX which clear the return and extended return register.
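
As an aside (an illustrative sketch, not part of the patch), the resulting
32-bit data word packs as [31:16] s16 imm | [15:12] flags | [11:8] reg |
[7:0] type, and can be pulled apart again much like fixup_exception() below
does with FIELD_GET(); the masks are copied from the hunk below:

	#include <stdio.h>

	#define EX_DATA_TYPE_MASK	0x000000FF
	#define EX_DATA_REG_MASK	0x00000F00
	#define EX_DATA_FLAG_MASK	0x0000F000
	#define EX_DATA_IMM_MASK	0xFFFF0000

	int main(void)
	{
		/* imm = -EFAULT (-14), no flags, reg = 1 (cx), type = 15 (EX_TYPE_IMM_REG) */
		unsigned int data = 0xFFF2010F;

		printf("type=%u reg=%u imm=%d\n",
		       data & EX_DATA_TYPE_MASK,
		       (data & EX_DATA_REG_MASK) >> 8,
		       (short)((data & EX_DATA_IMM_MASK) >> 16));	/* sign-extend the s16 */
		return 0;
	}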

Notes:
 - due to the % in our register names it's hard to make this more
   generally usable, as arm64 did.
 - the s16 is far larger than needed in these patches; future extensions
   can easily shrink it to get more bits.
 - without the bitfield fix this will not compile, because: 0xFF > -1
   and we can't even extract the TYPE field.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/asm.h                 |   33 ++++++++++++++
 arch/x86/include/asm/extable.h             |    6 +-
 arch/x86/include/asm/extable_fixup_types.h |   19 ++++++++
 arch/x86/include/asm/insn-eval.h           |    2 
 arch/x86/lib/insn-eval.c                   |   66 ++++++++++++++++++-----------
 arch/x86/mm/extable.c                      |   40 +++++++++++++++--
 arch/x86/net/bpf_jit_comp.c                |    2 
 7 files changed, 136 insertions(+), 32 deletions(-)

--- a/arch/x86/include/asm/asm.h
+++ b/arch/x86/include/asm/asm.h
@@ -152,6 +152,31 @@
 
 #else /* ! __ASSEMBLY__ */
 
+asm(
+"	.macro extable_type_reg type:req reg:req\n"
+"	.set found, 0\n"
+"	.set regnr, 0\n"
+"	.irp rs,rax,rcx,rdx,rbx,rsp,rbp,rsi,rdi,r8,r9,r10,r11,r12,r13,r14,r15\n"
+"	.ifc \\reg, %\\rs\n"
+"	.set found, found+1\n"
+"	.long \\type + (regnr << 8)\n"
+"	.endif\n"
+"	.set regnr, regnr+1\n"
+"	.endr\n"
+"	.set regnr, 0\n"
+"	.irp rs,eax,ecx,edx,ebx,esp,ebp,esi,edi,r8d,r9d,r10d,r11d,r12d,r13d,r14d,r15d\n"
+"	.ifc \\reg, %\\rs\n"
+"	.set found, found+1\n"
+"	.long \\type + (regnr << 8)\n"
+"	.endif\n"
+"	.set regnr, regnr+1\n"
+"	.endr\n"
+"	.if (found != 1)\n"
+"	.error \"extable_type_reg: bad register argument\"\n"
+"	.endif\n"
+"	.endm\n"
+);
+
 # define _ASM_EXTABLE_TYPE(from, to, type)			\
 	" .pushsection \"__ex_table\",\"a\"\n"			\
 	" .balign 4\n"						\
@@ -160,6 +185,14 @@
 	" .long " __stringify(type) " \n"			\
 	" .popsection\n"
 
+# define _ASM_EXTABLE_TYPE_REG(from, to, type, reg)				\
+	" .pushsection \"__ex_table\",\"a\"\n"					\
+	" .balign 4\n"								\
+	" .long (" #from ") - .\n"						\
+	" .long (" #to ") - .\n"						\
+	"extable_type_reg reg=" __stringify(reg) ", type=" __stringify(type) " \n"\
+	" .popsection\n"
+
 /* For C file, we already have NOKPROBE_SYMBOL macro */
 
 /*
--- a/arch/x86/include/asm/extable.h
+++ b/arch/x86/include/asm/extable.h
@@ -21,7 +21,7 @@
  */
 
 struct exception_table_entry {
-	int insn, fixup, type;
+	int insn, fixup, data;
 };
 struct pt_regs;
 
@@ -31,8 +31,8 @@ struct pt_regs;
 	do {							\
 		(a)->fixup = (b)->fixup + (delta);		\
 		(b)->fixup = (tmp).fixup - (delta);		\
-		(a)->type = (b)->type;				\
-		(b)->type = (tmp).type;				\
+		(a)->data = (b)->data;				\
+		(b)->data = (tmp).data;				\
 	} while (0)
 
 extern int fixup_exception(struct pt_regs *regs, int trapnr,
--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -2,6 +2,24 @@
 #ifndef _ASM_X86_EXTABLE_FIXUP_TYPES_H
 #define _ASM_X86_EXTABLE_FIXUP_TYPES_H
 
+#define EX_DATA_TYPE_MASK		0x000000FF
+#define EX_DATA_REG_MASK		0x00000F00
+#define EX_DATA_FLAG_MASK		0x0000F000
+#define EX_DATA_IMM_MASK		0xFFFF0000
+
+#define EX_DATA_REG_SHIFT		8
+#define EX_DATA_FLAG_SHIFT		12
+#define EX_DATA_IMM_SHIFT		16
+
+#define EX_DATA_FLAG(flag)		((flag) << EX_DATA_FLAG_SHIFT)
+#define EX_DATA_IMM(imm)		((imm) << EX_DATA_IMM_SHIFT)
+
+/* flags */
+#define EX_FLAG_CLEAR_AX		EX_DATA_FLAG(1)
+#define EX_FLAG_CLEAR_DX		EX_DATA_FLAG(2)
+#define EX_FLAG_CLEAR_AX_DX		EX_DATA_FLAG(3)
+
+/* types */
 #define	EX_TYPE_NONE			 0
 #define	EX_TYPE_DEFAULT			 1
 #define	EX_TYPE_FAULT			 2
@@ -20,5 +38,6 @@
 #define	EX_TYPE_FAULT_MCE_SAFE		13
 
 #define	EX_TYPE_POP_ZERO		14
+#define	EX_TYPE_IMM_REG			15 /* reg := (long)imm */
 
 #endif
--- a/arch/x86/include/asm/insn-eval.h
+++ b/arch/x86/include/asm/insn-eval.h
@@ -15,6 +15,8 @@
 #define INSN_CODE_SEG_OPND_SZ(params) (params & 0xf)
 #define INSN_CODE_SEG_PARAMS(oper_sz, addr_sz) (oper_sz | (addr_sz << 4))
 
+int pt_regs_offset(struct pt_regs *regs, int regno);
+
 bool insn_has_rep_prefix(struct insn *insn);
 void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs);
 int insn_get_modrm_rm_off(struct insn *insn, struct pt_regs *regs);
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -412,32 +412,39 @@ static short get_segment_selector(struct
 #endif /* CONFIG_X86_64 */
 }
 
-static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
-			  enum reg_type type)
+static const int pt_regoff[] = {
+	offsetof(struct pt_regs, ax),
+	offsetof(struct pt_regs, cx),
+	offsetof(struct pt_regs, dx),
+	offsetof(struct pt_regs, bx),
+	offsetof(struct pt_regs, sp),
+	offsetof(struct pt_regs, bp),
+	offsetof(struct pt_regs, si),
+	offsetof(struct pt_regs, di),
+#ifdef CONFIG_X86_64
+	offsetof(struct pt_regs, r8),
+	offsetof(struct pt_regs, r9),
+	offsetof(struct pt_regs, r10),
+	offsetof(struct pt_regs, r11),
+	offsetof(struct pt_regs, r12),
+	offsetof(struct pt_regs, r13),
+	offsetof(struct pt_regs, r14),
+	offsetof(struct pt_regs, r15),
+#endif
+};
+
+int pt_regs_offset(struct pt_regs *regs, int regno)
 {
+	if ((unsigned)regno < ARRAY_SIZE(pt_regoff))
+		return pt_regoff[regno];
+	return -EDOM;
+}
+
+static int get_regno(struct insn *insn, enum reg_type type)
+{
+	int nr_registers = ARRAY_SIZE(pt_regoff);
 	int regno = 0;
 
-	static const int regoff[] = {
-		offsetof(struct pt_regs, ax),
-		offsetof(struct pt_regs, cx),
-		offsetof(struct pt_regs, dx),
-		offsetof(struct pt_regs, bx),
-		offsetof(struct pt_regs, sp),
-		offsetof(struct pt_regs, bp),
-		offsetof(struct pt_regs, si),
-		offsetof(struct pt_regs, di),
-#ifdef CONFIG_X86_64
-		offsetof(struct pt_regs, r8),
-		offsetof(struct pt_regs, r9),
-		offsetof(struct pt_regs, r10),
-		offsetof(struct pt_regs, r11),
-		offsetof(struct pt_regs, r12),
-		offsetof(struct pt_regs, r13),
-		offsetof(struct pt_regs, r14),
-		offsetof(struct pt_regs, r15),
-#endif
-	};
-	int nr_registers = ARRAY_SIZE(regoff);
 	/*
 	 * Don't possibly decode a 32-bit instructions as
 	 * reading a 64-bit-only register.
@@ -505,7 +512,18 @@ static int get_reg_offset(struct insn *i
 		WARN_ONCE(1, "decoded an instruction with an invalid register");
 		return -EINVAL;
 	}
-	return regoff[regno];
+	return regno;
+}
+
+static int get_reg_offset(struct insn *insn, struct pt_regs *regs,
+			  enum reg_type type)
+{
+	int regno = get_regno(insn, type);
+
+	if (regno < 0)
+		return regno;
+
+	return pt_regs_offset(regs, regno);
 }
 
 /**
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -2,12 +2,25 @@
 #include <linux/extable.h>
 #include <linux/uaccess.h>
 #include <linux/sched/debug.h>
+#include <linux/bitfield.h>
 #include <xen/xen.h>
 
 #include <asm/fpu/api.h>
 #include <asm/sev.h>
 #include <asm/traps.h>
 #include <asm/kdebug.h>
+#include <asm/insn-eval.h>
+
+static inline unsigned long *pt_regs_nr(struct pt_regs *regs, int nr)
+{
+	int reg_offset = pt_regs_offset(regs, nr);
+	static unsigned long __dummy;
+
+	if (WARN_ON_ONCE(reg_offset < 0))
+		return &__dummy;
+
+	return (unsigned long *)((unsigned long)regs + reg_offset);
+}
 
 static inline unsigned long
 ex_fixup_addr(const struct exception_table_entry *x)
@@ -15,10 +28,15 @@ ex_fixup_addr(const struct exception_tab
 	return (unsigned long)&x->fixup + x->fixup;
 }
 
-static bool ex_handler_default(const struct exception_table_entry *fixup,
+static bool ex_handler_default(const struct exception_table_entry *e,
 			       struct pt_regs *regs)
 {
-	regs->ip = ex_fixup_addr(fixup);
+	if (e->data & EX_FLAG_CLEAR_AX)
+		regs->ax = 0;
+	if (e->data & EX_FLAG_CLEAR_DX)
+		regs->dx = 0;
+
+	regs->ip = ex_fixup_addr(e);
 	return true;
 }
 
@@ -111,17 +129,25 @@ static bool ex_handler_pop_zero(const st
 	return ex_handler_default(fixup, regs);
 }
 
+static bool ex_handler_imm_reg(const struct exception_table_entry *fixup,
+			       struct pt_regs *regs, int reg, int imm)
+{
+	*pt_regs_nr(regs, reg) = (long)imm;
+	return ex_handler_default(fixup, regs);
+}
+
 int ex_get_fixup_type(unsigned long ip)
 {
 	const struct exception_table_entry *e = search_exception_tables(ip);
 
-	return e ? e->type : EX_TYPE_NONE;
+	return e ? FIELD_GET(EX_DATA_TYPE_MASK, e->data) : EX_TYPE_NONE;
 }
 
 int fixup_exception(struct pt_regs *regs, int trapnr, unsigned long error_code,
 		    unsigned long fault_addr)
 {
 	const struct exception_table_entry *e;
+	int type, reg, imm;
 
 #ifdef CONFIG_PNPBIOS
 	if (unlikely(SEGMENT_IS_PNP_CODE(regs->cs))) {
@@ -141,7 +167,11 @@ int fixup_exception(struct pt_regs *regs
 	if (!e)
 		return 0;
 
-	switch (e->type) {
+	type = FIELD_GET(EX_DATA_TYPE_MASK, e->data);
+	reg  = FIELD_GET(EX_DATA_REG_MASK,  e->data);
+	imm  = FIELD_GET(EX_DATA_IMM_MASK,  e->data);
+
+	switch (type) {
 	case EX_TYPE_DEFAULT:
 	case EX_TYPE_DEFAULT_MCE_SAFE:
 		return ex_handler_default(e, regs);
@@ -170,6 +200,8 @@ int fixup_exception(struct pt_regs *regs
 		break;
 	case EX_TYPE_POP_ZERO:
 		return ex_handler_pop_zero(e, regs);
+	case EX_TYPE_IMM_REG:
+		return ex_handler_imm_reg(e, regs, reg, imm);
 	}
 	BUG();
 }
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1286,7 +1286,7 @@ st:			if (is_imm8(insn->off))
 				}
 				ex->insn = delta;
 
-				ex->type = EX_TYPE_BPF;
+				ex->data = EX_TYPE_BPF;
 
 				if (dst_reg > BPF_REG_9) {
 					pr_err("verifier error\n");



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 08/22] x86,msr: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (6 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 07/22] x86,extable: Extend extable functionality Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 09/22] x86,futex: " Peter Zijlstra
                   ` (13 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Rework the MSR accessors to remove .fixup usage. Add two new extable
types (to the 4 already existing MSR ones) using the new register
infrastructure to record which register should get the error value.
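
A hedged usage sketch (kernel-context fragment, not part of the patch): on a
faulting RDMSR the new EX_TYPE_RDMSR_SAFE handler writes -EIO into whichever
register backs %[err], zeroes the (fake) result, and resumes past the access,
so callers keep doing what they always did:

	int err;
	u64 val;

	val = native_read_msr_safe(MSR_IA32_UCODE_REV, &err);
	if (err)		/* -EIO if the RDMSR faulted */
		pr_warn("MSR 0x%x not readable\n", MSR_IA32_UCODE_REV);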

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/extable_fixup_types.h |   23 ++++++-------
 arch/x86/include/asm/msr.h                 |   26 ++++----------
 arch/x86/mm/extable.c                      |   51 +++++++++++++++--------------
 3 files changed, 47 insertions(+), 53 deletions(-)

--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -27,17 +27,16 @@
 #define	EX_TYPE_COPY			 4
 #define	EX_TYPE_CLEAR_FS		 5
 #define	EX_TYPE_FPU_RESTORE		 6
-#define	EX_TYPE_WRMSR			 7
-#define	EX_TYPE_RDMSR			 8
-#define	EX_TYPE_BPF			 9
-
-#define	EX_TYPE_WRMSR_IN_MCE		10
-#define	EX_TYPE_RDMSR_IN_MCE		11
-
-#define	EX_TYPE_DEFAULT_MCE_SAFE	12
-#define	EX_TYPE_FAULT_MCE_SAFE		13
-
-#define	EX_TYPE_POP_ZERO		14
-#define	EX_TYPE_IMM_REG			15 /* reg := (long)imm */
+#define	EX_TYPE_BPF			 7
+#define	EX_TYPE_WRMSR			 8
+#define	EX_TYPE_RDMSR			 9
+#define	EX_TYPE_WRMSR_SAFE		10 /* reg := -EIO */
+#define	EX_TYPE_RDMSR_SAFE		11 /* reg := -EIO */
+#define	EX_TYPE_WRMSR_IN_MCE		12
+#define	EX_TYPE_RDMSR_IN_MCE		13
+#define	EX_TYPE_DEFAULT_MCE_SAFE	14
+#define	EX_TYPE_FAULT_MCE_SAFE		15
+#define	EX_TYPE_POP_ZERO		16
+#define	EX_TYPE_IMM_REG			17 /* reg := (long)imm */
 
 #endif
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -137,17 +137,11 @@ static inline unsigned long long native_
 {
 	DECLARE_ARGS(val, low, high);
 
-	asm volatile("2: rdmsr ; xor %[err],%[err]\n"
-		     "1:\n\t"
-		     ".section .fixup,\"ax\"\n\t"
-		     "3: mov %[fault],%[err]\n\t"
-		     "xorl %%eax, %%eax\n\t"
-		     "xorl %%edx, %%edx\n\t"
-		     "jmp 1b\n\t"
-		     ".previous\n\t"
-		     _ASM_EXTABLE(2b, 3b)
+	asm volatile("1: rdmsr ; xor %[err],%[err]\n"
+		     "2:\n\t"
+		     _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_RDMSR_SAFE, %[err])
 		     : [err] "=r" (*err), EAX_EDX_RET(val, low, high)
-		     : "c" (msr), [fault] "i" (-EIO));
+		     : "c" (msr));
 	if (tracepoint_enabled(read_msr))
 		do_trace_read_msr(msr, EAX_EDX_VAL(val, low, high), *err);
 	return EAX_EDX_VAL(val, low, high);
@@ -169,15 +163,11 @@ native_write_msr_safe(unsigned int msr,
 {
 	int err;
 
-	asm volatile("2: wrmsr ; xor %[err],%[err]\n"
-		     "1:\n\t"
-		     ".section .fixup,\"ax\"\n\t"
-		     "3:  mov %[fault],%[err] ; jmp 1b\n\t"
-		     ".previous\n\t"
-		     _ASM_EXTABLE(2b, 3b)
+	asm volatile("1: wrmsr ; xor %[err],%[err]\n"
+		     "2:\n\t"
+		     _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_WRMSR_SAFE, %[err])
 		     : [err] "=a" (err)
-		     : "c" (msr), "0" (low), "d" (high),
-		       [fault] "i" (-EIO)
+		     : "c" (msr), "0" (low), "d" (high)
 		     : "memory");
 	if (tracepoint_enabled(write_msr))
 		do_trace_write_msr(msr, ((u64)high << 32 | low), err);
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -83,28 +83,29 @@ static bool ex_handler_copy(const struct
 	return ex_handler_fault(fixup, regs, trapnr);
 }
 
-static bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup,
-				    struct pt_regs *regs)
+static bool ex_handler_msr(const struct exception_table_entry *fixup,
+			   struct pt_regs *regs, bool wrmsr, bool safe, int reg)
 {
-	if (pr_warn_once("unchecked MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n",
+	if (!safe && wrmsr &&
+	    pr_warn_once("unchecked MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pS)\n",
+			 (unsigned int)regs->cx, (unsigned int)regs->dx,
+			 (unsigned int)regs->ax,  regs->ip, (void *)regs->ip))
+		show_stack_regs(regs);
+
+	if (!safe && !wrmsr &&
+	    pr_warn_once("unchecked MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pS)\n",
 			 (unsigned int)regs->cx, regs->ip, (void *)regs->ip))
 		show_stack_regs(regs);
 
-	/* Pretend that the read succeeded and returned 0. */
-	regs->ax = 0;
-	regs->dx = 0;
-	return ex_handler_default(fixup, regs);
-}
+	if (!wrmsr) {
+		/* Pretend that the read succeeded and returned 0. */
+		regs->ax = 0;
+		regs->dx = 0;
+	}
 
-static bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup,
-				    struct pt_regs *regs)
-{
-	if (pr_warn_once("unchecked MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pS)\n",
-			 (unsigned int)regs->cx, (unsigned int)regs->dx,
-			 (unsigned int)regs->ax,  regs->ip, (void *)regs->ip))
-		show_stack_regs(regs);
+	if (safe)
+		*pt_regs_nr(regs, reg) = -EIO;
 
-	/* Pretend that the write succeeded. */
 	return ex_handler_default(fixup, regs);
 }
 
@@ -186,18 +187,22 @@ int fixup_exception(struct pt_regs *regs
 		return ex_handler_clear_fs(e, regs);
 	case EX_TYPE_FPU_RESTORE:
 		return ex_handler_fprestore(e, regs);
-	case EX_TYPE_RDMSR:
-		return ex_handler_rdmsr_unsafe(e, regs);
-	case EX_TYPE_WRMSR:
-		return ex_handler_wrmsr_unsafe(e, regs);
 	case EX_TYPE_BPF:
 		return ex_handler_bpf(e, regs);
-	case EX_TYPE_RDMSR_IN_MCE:
-		ex_handler_msr_mce(regs, false);
-		break;
+	case EX_TYPE_WRMSR:
+		return ex_handler_msr(e, regs, true, false, reg);
+	case EX_TYPE_RDMSR:
+		return ex_handler_msr(e, regs, false, false, reg);
+	case EX_TYPE_WRMSR_SAFE:
+		return ex_handler_msr(e, regs, true, true, reg);
+	case EX_TYPE_RDMSR_SAFE:
+		return ex_handler_msr(e, regs, false, true, reg);
 	case EX_TYPE_WRMSR_IN_MCE:
 		ex_handler_msr_mce(regs, true);
 		break;
+	case EX_TYPE_RDMSR_IN_MCE:
+		ex_handler_msr_mce(regs, false);
+		break;
 	case EX_TYPE_POP_ZERO:
 		return ex_handler_pop_zero(e, regs);
 	case EX_TYPE_IMM_REG:



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 09/22] x86,futex: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (7 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 08/22] x86,msr: Remove .fixup usage Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 10/22] x86,uaccess: " Peter Zijlstra
                   ` (12 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Use the new EX_TYPE_IMM_REG to store -EFAULT into the designated 'ret'
register; this removes the need for anonymous .fixup code.
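
As a worked example of the encoding (illustrative): EX_TYPE_IMM_REG is type 17
(0x11) and EX_DATA_IMM(-EFAULT) places -14 in bits 31:16, so EX_TYPE_EFAULT_REG
expands to the data word 0xFFF20011; the extable_type_reg assembler macro then
adds the register number (regnr << 8) into bits 11:8 at each use site.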

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/extable_fixup_types.h |    2 ++
 arch/x86/include/asm/futex.h               |   28 ++++++++--------------------
 2 files changed, 10 insertions(+), 20 deletions(-)

--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -32,6 +32,8 @@
 #define	EX_TYPE_DEFAULT_MCE_SAFE	14
 #define	EX_TYPE_FAULT_MCE_SAFE		15
 #define	EX_TYPE_POP_ZERO		16
+
 #define	EX_TYPE_IMM_REG			17 /* reg := (long)imm */
+#define	EX_TYPE_EFAULT_REG		(EX_TYPE_IMM_REG | EX_DATA_IMM(-EFAULT))
 
 #endif
--- a/arch/x86/include/asm/futex.h
+++ b/arch/x86/include/asm/futex.h
@@ -17,13 +17,9 @@ do {								\
 	int oldval = 0, ret;					\
 	asm volatile("1:\t" insn "\n"				\
 		     "2:\n"					\
-		     "\t.section .fixup,\"ax\"\n"		\
-		     "3:\tmov\t%3, %1\n"			\
-		     "\tjmp\t2b\n"				\
-		     "\t.previous\n"				\
-		     _ASM_EXTABLE_UA(1b, 3b)			\
+		     _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_EFAULT_REG, %1) \
 		     : "=r" (oldval), "=r" (ret), "+m" (*uaddr)	\
-		     : "i" (-EFAULT), "0" (oparg), "1" (0));	\
+		     : "0" (oparg), "1" (0));	\
 	if (ret)						\
 		goto label;					\
 	*oval = oldval;						\
@@ -39,15 +35,11 @@ do {								\
 		     "3:\t" LOCK_PREFIX "cmpxchgl %3, %2\n"	\
 		     "\tjnz\t2b\n"				\
 		     "4:\n"					\
-		     "\t.section .fixup,\"ax\"\n"		\
-		     "5:\tmov\t%5, %1\n"			\
-		     "\tjmp\t4b\n"				\
-		     "\t.previous\n"				\
-		     _ASM_EXTABLE_UA(1b, 5b)			\
-		     _ASM_EXTABLE_UA(3b, 5b)			\
+		     _ASM_EXTABLE_TYPE_REG(1b, 4b, EX_TYPE_EFAULT_REG, %1) \
+		     _ASM_EXTABLE_TYPE_REG(3b, 4b, EX_TYPE_EFAULT_REG, %1) \
 		     : "=&a" (oldval), "=&r" (ret),		\
 		       "+m" (*uaddr), "=&r" (tem)		\
-		     : "r" (oparg), "i" (-EFAULT), "1" (0));	\
+		     : "r" (oparg), "1" (0));			\
 	if (ret)						\
 		goto label;					\
 	*oval = oldval;						\
@@ -95,15 +87,11 @@ static inline int futex_atomic_cmpxchg_i
 	if (!user_access_begin(uaddr, sizeof(u32)))
 		return -EFAULT;
 	asm volatile("\n"
-		"1:\t" LOCK_PREFIX "cmpxchgl %4, %2\n"
+		"1:\t" LOCK_PREFIX "cmpxchgl %3, %2\n"
 		"2:\n"
-		"\t.section .fixup, \"ax\"\n"
-		"3:\tmov     %3, %0\n"
-		"\tjmp     2b\n"
-		"\t.previous\n"
-		_ASM_EXTABLE_UA(1b, 3b)
+		_ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_EFAULT_REG, %0) \
 		: "+r" (ret), "=a" (oldval), "+m" (*uaddr)
-		: "i" (-EFAULT), "r" (newval), "1" (oldval)
+		: "r" (newval), "1" (oldval)
 		: "memory"
 	);
 	user_access_end();



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 10/22] x86,uaccess: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (8 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 09/22] x86,futex: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 11/22] x86,xen: " Peter Zijlstra
                   ` (11 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

For the !CC_HAS_ASM_GOTO_OUTPUT case (aka. the legacy codepath), remove the
.fixup usage by employing both EX_TYPE_EFAULT_REG and EX_FLAG_CLEAR_AX{,_DX}.
As was already done for X86_32's version of __get_user_asm_u64(),
use the "a" register for output, specifically so that EX_FLAG_CLEAR_AX zeroes
the correct register on fault.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/uaccess.h |   39 +++++++++++++++++----------------------
 1 file changed, 17 insertions(+), 22 deletions(-)

--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -351,24 +351,22 @@ do {									\
 		     "1:	movl %[lowbits],%%eax\n"		\
 		     "2:	movl %[highbits],%%edx\n"		\
 		     "3:\n"						\
-		     ".section .fixup,\"ax\"\n"				\
-		     "4:	mov %[efault],%[errout]\n"		\
-		     "	xorl %%eax,%%eax\n"				\
-		     "	xorl %%edx,%%edx\n"				\
-		     "	jmp 3b\n"					\
-		     ".previous\n"					\
-		     _ASM_EXTABLE_UA(1b, 4b)				\
-		     _ASM_EXTABLE_UA(2b, 4b)				\
+		     _ASM_EXTABLE_TYPE_REG(1b, 3b, EX_TYPE_EFAULT_REG |	\
+					   EX_FLAG_CLEAR_AX_DX,		\
+					   %[errout])			\
+		     _ASM_EXTABLE_TYPE_REG(2b, 3b, EX_TYPE_EFAULT_REG |	\
+					   EX_FLAG_CLEAR_AX_DX,		\
+					   %[errout])			\
 		     : [errout] "=r" (retval),				\
 		       [output] "=&A"(x)				\
 		     : [lowbits] "m" (__m(__ptr)),			\
 		       [highbits] "m" __m(((u32 __user *)(__ptr)) + 1),	\
-		       [efault] "i" (-EFAULT), "0" (retval));		\
+		       "0" (retval));					\
 })
 
 #else
 #define __get_user_asm_u64(x, ptr, retval) \
-	 __get_user_asm(x, ptr, retval, "q", "=r")
+	 __get_user_asm(x, ptr, retval, "q")
 #endif
 
 #define __get_user_size(x, ptr, size, retval)				\
@@ -379,14 +377,14 @@ do {									\
 	__chk_user_ptr(ptr);						\
 	switch (size) {							\
 	case 1:								\
-		__get_user_asm(x_u8__, ptr, retval, "b", "=q");		\
+		__get_user_asm(x_u8__, ptr, retval, "b");		\
 		(x) = x_u8__;						\
 		break;							\
 	case 2:								\
-		__get_user_asm(x, ptr, retval, "w", "=r");		\
+		__get_user_asm(x, ptr, retval, "w");			\
 		break;							\
 	case 4:								\
-		__get_user_asm(x, ptr, retval, "l", "=r");		\
+		__get_user_asm(x, ptr, retval, "l");			\
 		break;							\
 	case 8:								\
 		__get_user_asm_u64(x, ptr, retval);			\
@@ -396,20 +394,17 @@ do {									\
 	}								\
 } while (0)
 
-#define __get_user_asm(x, addr, err, itype, ltype)			\
+#define __get_user_asm(x, addr, err, itype)				\
 	asm volatile("\n"						\
 		     "1:	mov"itype" %[umem],%[output]\n"		\
 		     "2:\n"						\
-		     ".section .fixup,\"ax\"\n"				\
-		     "3:	mov %[efault],%[errout]\n"		\
-		     "	xorl %k[output],%k[output]\n"			\
-		     "	jmp 2b\n"					\
-		     ".previous\n"					\
-		     _ASM_EXTABLE_UA(1b, 3b)				\
+		     _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_EFAULT_REG | \
+					   EX_FLAG_CLEAR_AX,		\
+					   %[errout])			\
 		     : [errout] "=r" (err),				\
-		       [output] ltype(x)				\
+		       [output] "=a" (x)				\
 		     : [umem] "m" (__m(addr)),				\
-		       [efault] "i" (-EFAULT), "0" (err))
+		       "0" (err))
 
 #endif // CONFIG_CC_HAS_ASM_GOTO_OUTPUT
 



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 11/22] x86,xen: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (9 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 10/22] x86,uaccess: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 12/22] x86,fpu: " Peter Zijlstra
                   ` (10 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Employ the fancy new EX_TYPE_IMM_REG to create EX_TYPE_NEG_REG to
store '-1' into the designated register and use this to remove some
Xen .fixup usage.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/extable_fixup_types.h |    1 +
 arch/x86/include/asm/xen/page.h            |   12 ++----------
 2 files changed, 3 insertions(+), 10 deletions(-)

--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -35,5 +35,6 @@
 
 #define	EX_TYPE_IMM_REG			17 /* reg := (long)imm */
 #define	EX_TYPE_EFAULT_REG		(EX_TYPE_IMM_REG | EX_DATA_IMM(-EFAULT))
+#define	EX_TYPE_NEG_REG			(EX_TYPE_IMM_REG | EX_DATA_IMM(-1))
 
 #endif
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -96,11 +96,7 @@ static inline int xen_safe_write_ulong(u
 
 	asm volatile("1: mov %[val], %[ptr]\n"
 		     "2:\n"
-		     ".section .fixup, \"ax\"\n"
-		     "3: sub $1, %[ret]\n"
-		     "   jmp 2b\n"
-		     ".previous\n"
-		     _ASM_EXTABLE(1b, 3b)
+		     _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_NEG_REG, %[ret])
 		     : [ret] "+r" (ret), [ptr] "=m" (*addr)
 		     : [val] "r" (val));
 
@@ -115,11 +111,7 @@ static inline int xen_safe_read_ulong(co
 
 	asm volatile("1: mov %[ptr], %[rval]\n"
 		     "2:\n"
-		     ".section .fixup, \"ax\"\n"
-		     "3: sub $1, %[ret]\n"
-		     "   jmp 2b\n"
-		     ".previous\n"
-		     _ASM_EXTABLE(1b, 3b)
+		     _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_NEG_REG, %[ret])
 		     : [ret] "+r" (ret), [rval] "+r" (rval)
 		     : [ptr] "m" (*addr));
 	*val = rval;



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 12/22] x86,fpu: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (10 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 11/22] x86,xen: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 13/22] x86,segment: " Peter Zijlstra
                   ` (9 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Employ EX_TYPE_NEG_REG to store '-1' into the %[err] register on
exception.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/fpu/legacy.h |    6 +-----
 arch/x86/kernel/fpu/xstate.h |    6 +-----
 2 files changed, 2 insertions(+), 10 deletions(-)

--- a/arch/x86/kernel/fpu/legacy.h
+++ b/arch/x86/kernel/fpu/legacy.h
@@ -35,11 +35,7 @@ static inline void ldmxcsr(u32 mxcsr)
 	int err;							\
 	asm volatile("1:" #insn "\n\t"					\
 		     "2:\n"						\
-		     ".section .fixup,\"ax\"\n"				\
-		     "3:  movl $-1,%[err]\n"				\
-		     "    jmp  2b\n"					\
-		     ".previous\n"					\
-		     _ASM_EXTABLE(1b, 3b)				\
+		     _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_NEG_REG, %[err]) \
 		     : [err] "=r" (err), output				\
 		     : "0"(0), input);					\
 	err;								\
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -107,11 +107,7 @@ static inline u64 xfeatures_mask_indepen
 		     "\n"						\
 		     "xor %[err], %[err]\n"				\
 		     "3:\n"						\
-		     ".pushsection .fixup,\"ax\"\n"			\
-		     "4: movl $-2, %[err]\n"				\
-		     "jmp 3b\n"						\
-		     ".popsection\n"					\
-		     _ASM_EXTABLE(661b, 4b)				\
+		     _ASM_EXTABLE_TYPE_REG(661b, 3b, EX_TYPE_NEG_REG, %[err]) \
 		     : [err] "=r" (err)					\
 		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
 		     : "memory")



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 13/22] x86,segment: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (11 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 12/22] x86,fpu: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 14/22] x86,vmx: " Peter Zijlstra
                   ` (8 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Create and use EX_TYPE_ZERO_REG to clear the register and retry the
segment load on exception.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/extable_fixup_types.h |    1 +
 arch/x86/include/asm/segment.h             |    9 +--------
 2 files changed, 2 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -36,5 +36,6 @@
 #define	EX_TYPE_IMM_REG			17 /* reg := (long)imm */
 #define	EX_TYPE_EFAULT_REG		(EX_TYPE_IMM_REG | EX_DATA_IMM(-EFAULT))
 #define	EX_TYPE_NEG_REG			(EX_TYPE_IMM_REG | EX_DATA_IMM(-1))
+#define	EX_TYPE_ZERO_REG		(EX_TYPE_IMM_REG | EX_DATA_IMM(0))
 
 #endif
--- a/arch/x86/include/asm/segment.h
+++ b/arch/x86/include/asm/segment.h
@@ -307,14 +307,7 @@ do {									\
 									\
 	asm volatile("						\n"	\
 		     "1:	movl %k0,%%" #seg "		\n"	\
-									\
-		     ".section .fixup,\"ax\"			\n"	\
-		     "2:	xorl %k0,%k0			\n"	\
-		     "		jmp 1b				\n"	\
-		     ".previous					\n"	\
-									\
-		     _ASM_EXTABLE(1b, 2b)				\
-									\
+		     _ASM_EXTABLE_TYPE_REG(1b, 1b, EX_TYPE_ZERO_REG, %k0)\
 		     : "+r" (__val) : : "memory");			\
 } while (0)
 



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 14/22] x86,vmx: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (12 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 13/22] x86,segment: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 15/22] x86,checksum_32: " Peter Zijlstra
                   ` (7 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

In the vmread exception path, use the thus far unused output register
to push the @fault argument onto the stack. This, in turn, enables the
exception handler to avoid any pushes and to only modify that register when
an exception does occur.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/extable_fixup_types.h |    1 +
 arch/x86/kvm/vmx/vmx_ops.h                 |   14 ++++++--------
 2 files changed, 7 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -42,5 +42,6 @@
 #define	EX_TYPE_EFAULT_REG		(EX_TYPE_IMM_REG | EX_DATA_IMM(-EFAULT))
 #define	EX_TYPE_NEG_REG			(EX_TYPE_IMM_REG | EX_DATA_IMM(-1))
 #define	EX_TYPE_ZERO_REG		(EX_TYPE_IMM_REG | EX_DATA_IMM(0))
+#define	EX_TYPE_ONE_REG			(EX_TYPE_IMM_REG | EX_DATA_IMM(1))
 
 #endif
--- a/arch/x86/kvm/vmx/vmx_ops.h
+++ b/arch/x86/kvm/vmx/vmx_ops.h
@@ -80,9 +80,11 @@ static __always_inline unsigned long __v
 		      * @field, and bounce through the trampoline to preserve
 		      * volatile registers.
 		      */
-		     "push $0\n\t"
+		     "xorl %k1, %k1\n\t"
+		     "2:\n\t"
+		     "push %1\n\t"
 		     "push %2\n\t"
-		     "2:call vmread_error_trampoline\n\t"
+		     "call vmread_error_trampoline\n\t"
 
 		     /*
 		      * Unwind the stack.  Note, the trampoline zeros out the
@@ -93,12 +95,8 @@ static __always_inline unsigned long __v
 		     "3:\n\t"
 
 		     /* VMREAD faulted.  As above, except push '1' for @fault. */
-		     ".pushsection .fixup, \"ax\"\n\t"
-		     "4: push $1\n\t"
-		     "push %2\n\t"
-		     "jmp 2b\n\t"
-		     ".popsection\n\t"
-		     _ASM_EXTABLE(1b, 4b)
+		     _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_ONE_REG, %1)
+
 		     : ASM_CALL_CONSTRAINT, "=r"(value) : "r"(field) : "cc");
 	return value;
 }



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 15/22] x86,checksum_32: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (13 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 14/22] x86,vmx: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 16/22] x86,sgx: " Peter Zijlstra
                   ` (6 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Simply add EX_FLAG_CLEAR_AX to do what the .fixup used to do.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/lib/checksum_32.S |   19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

--- a/arch/x86/lib/checksum_32.S
+++ b/arch/x86/lib/checksum_32.S
@@ -260,9 +260,9 @@ unsigned int csum_partial_copy_generic (
  * Copy from ds while checksumming, otherwise like csum_partial
  */
 
-#define EXC(y...)			\
-	9999: y;			\
-	_ASM_EXTABLE_UA(9999b, 6001f)
+#define EXC(y...)						\
+	9999: y;						\
+	_ASM_EXTABLE_TYPE(9999b, 7f, EX_TYPE_UACCESS | EX_FLAG_CLEAR_AX)
 
 #ifndef CONFIG_X86_USE_PPRO_CHECKSUM
 
@@ -358,15 +358,6 @@ EXC(	movb %cl, (%edi)	)
 	adcl $0, %eax
 7:
 
-# Exception handler:
-.section .fixup, "ax"							
-
-6001:
-	xorl %eax, %eax
-	jmp 7b
-
-.previous
-
 	popl %ebx
 	popl %esi
 	popl %edi
@@ -439,10 +430,6 @@ EXC(	movb %dl, (%edi)         )
 6:	addl %edx, %eax
 	adcl $0, %eax
 7:
-.section .fixup, "ax"
-6001:	xorl %eax, %eax
-	jmp  7b			
-.previous				
 
 	popl %esi
 	popl %edi



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 16/22] x86,sgx: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (14 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 15/22] x86,checksum_32: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 17/22] x86,kvm: " Peter Zijlstra
                   ` (5 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Create EX_TYPE_FAULT_SGX, which does what EX_TYPE_FAULT does, except it
also sets this extra bit that SGX really fancies having.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/extable_fixup_types.h |    2 +
 arch/x86/include/asm/sgx.h                 |   18 ++++++++++++++
 arch/x86/kernel/cpu/sgx/encls.h            |   36 ++++-------------------------
 arch/x86/mm/extable.c                      |   10 ++++++++
 4 files changed, 35 insertions(+), 31 deletions(-)

--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -44,4 +44,6 @@
 #define	EX_TYPE_ZERO_REG		(EX_TYPE_IMM_REG | EX_DATA_IMM(0))
 #define	EX_TYPE_ONE_REG			(EX_TYPE_IMM_REG | EX_DATA_IMM(1))
 
+#define	EX_TYPE_FAULT_SGX		18
+
 #endif
--- a/arch/x86/include/asm/sgx.h
+++ b/arch/x86/include/asm/sgx.h
@@ -46,6 +46,24 @@ enum sgx_encls_function {
 };
 
 /**
+ * SGX_ENCLS_FAULT_FLAG - flag signifying an ENCLS return code is a trapnr
+ *
+ * ENCLS has its own (positive value) error codes and also generates
+ * ENCLS specific #GP and #PF faults.  And the ENCLS values get munged
+ * with system error codes as everything percolates back up the stack.
+ * Unfortunately (for us), we need to precisely identify each unique
+ * error code, e.g. the action taken if EWB fails varies based on the
+ * type of fault and on the exact SGX error code, i.e. we can't simply
+ * convert all faults to -EFAULT.
+ *
+ * To make all three error types coexist, we set bit 30 to identify an
+ * ENCLS fault.  Bit 31 (technically bits N:31) is used to differentiate
+ * between positive (faults and SGX error codes) and negative (system
+ * error codes) values.
+ */
+#define SGX_ENCLS_FAULT_FLAG 0x40000000
+
+/**
  * enum sgx_return_code - The return code type for ENCLS, ENCLU and ENCLV
  * %SGX_NOT_TRACKED:		Previous ETRACK's shootdown sequence has not
  *				been completed yet.
--- a/arch/x86/kernel/cpu/sgx/encls.h
+++ b/arch/x86/kernel/cpu/sgx/encls.h
@@ -11,26 +11,8 @@
 #include <asm/traps.h>
 #include "sgx.h"
 
-/**
- * ENCLS_FAULT_FLAG - flag signifying an ENCLS return code is a trapnr
- *
- * ENCLS has its own (positive value) error codes and also generates
- * ENCLS specific #GP and #PF faults.  And the ENCLS values get munged
- * with system error codes as everything percolates back up the stack.
- * Unfortunately (for us), we need to precisely identify each unique
- * error code, e.g. the action taken if EWB fails varies based on the
- * type of fault and on the exact SGX error code, i.e. we can't simply
- * convert all faults to -EFAULT.
- *
- * To make all three error types coexist, we set bit 30 to identify an
- * ENCLS fault.  Bit 31 (technically bits N:31) is used to differentiate
- * between positive (faults and SGX error codes) and negative (system
- * error codes) values.
- */
-#define ENCLS_FAULT_FLAG 0x40000000
-
 /* Retrieve the encoded trapnr from the specified return code. */
-#define ENCLS_TRAPNR(r) ((r) & ~ENCLS_FAULT_FLAG)
+#define ENCLS_TRAPNR(r) ((r) & ~SGX_ENCLS_FAULT_FLAG)
 
 /* Issue a WARN() about an ENCLS function. */
 #define ENCLS_WARN(r, name) {						  \
@@ -50,7 +32,7 @@
  */
 static inline bool encls_faulted(int ret)
 {
-	return ret & ENCLS_FAULT_FLAG;
+	return ret & SGX_ENCLS_FAULT_FLAG;
 }
 
 /**
@@ -88,11 +70,7 @@ static inline bool encls_failed(int ret)
 	asm volatile(						\
 	"1: .byte 0x0f, 0x01, 0xcf;\n\t"			\
 	"2:\n"							\
-	".section .fixup,\"ax\"\n"				\
-	"3: orl $"__stringify(ENCLS_FAULT_FLAG)",%%eax\n"	\
-	"   jmp 2b\n"						\
-	".previous\n"						\
-	_ASM_EXTABLE_FAULT(1b, 3b)				\
+	_ASM_EXTABLE_TYPE(1b, 2b, EX_TYPE_FAULT_SGX)		\
 	: "=a"(ret)						\
 	: "a"(rax), inputs					\
 	: "memory", "cc");					\
@@ -127,7 +105,7 @@ static inline bool encls_failed(int ret)
  *
  * Return:
  *   0 on success,
- *   trapnr with ENCLS_FAULT_FLAG set on fault
+ *   trapnr with SGX_ENCLS_FAULT_FLAG set on fault
  */
 #define __encls_N(rax, rbx_out, inputs...)			\
 	({							\
@@ -136,11 +114,7 @@ static inline bool encls_failed(int ret)
 	"1: .byte 0x0f, 0x01, 0xcf;\n\t"			\
 	"   xor %%eax,%%eax;\n"					\
 	"2:\n"							\
-	".section .fixup,\"ax\"\n"				\
-	"3: orl $"__stringify(ENCLS_FAULT_FLAG)",%%eax\n"	\
-	"   jmp 2b\n"						\
-	".previous\n"						\
-	_ASM_EXTABLE_FAULT(1b, 3b)				\
+	_ASM_EXTABLE_TYPE(1b, 2b, EX_TYPE_FAULT_SGX)		\
 	: "=a"(ret), "=b"(rbx_out)				\
 	: "a"(rax), inputs					\
 	: "memory");						\
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -10,6 +10,7 @@
 #include <asm/traps.h>
 #include <asm/kdebug.h>
 #include <asm/insn-eval.h>
+#include <asm/sgx.h>
 
 static inline unsigned long *pt_regs_nr(struct pt_regs *regs, int nr)
 {
@@ -47,6 +48,13 @@ static bool ex_handler_fault(const struc
 	return ex_handler_default(fixup, regs);
 }
 
+static bool ex_handler_sgx(const struct exception_table_entry *fixup,
+			   struct pt_regs *regs, int trapnr)
+{
+	regs->ax = trapnr | SGX_ENCLS_FAULT_FLAG;
+	return ex_handler_default(fixup, regs);
+}
+
 /*
  * Handler for when we fail to restore a task's FPU state.  We should never get
  * here because the FPU state of a task using the FPU (task->thread.fpu.state)
@@ -207,6 +215,8 @@ int fixup_exception(struct pt_regs *regs
 		return ex_handler_pop_zero(e, regs);
 	case EX_TYPE_IMM_REG:
 		return ex_handler_imm_reg(e, regs, reg, imm);
+	case EX_TYPE_FAULT_SGX:
+		return ex_handler_sgx(e, regs, trapnr);
 	}
 	BUG();
 }



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 17/22] x86,kvm: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (15 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 16/22] x86,sgx: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 18/22] x86,usercopy_32: Simplify __copy_user_intel_nocache() Peter Zijlstra
                   ` (4 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

KVM instruction emulation has a gnarly hack where the .fixup does a
return; however, there's already a ret right after the 10b label, so
mark that as 11 and have the exception clear %esi to remove the
.fixup.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kvm/emulate.c |   16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -315,7 +315,7 @@ static int fastop(struct x86_emulate_ctx
 	__FOP_FUNC(#name)
 
 #define __FOP_RET(name) \
-	"ret \n\t" \
+	"11: ret \n\t" \
 	".size " name ", .-" name "\n\t"
 
 #define FOP_RET(name) \
@@ -344,7 +344,7 @@ static int fastop(struct x86_emulate_ctx
 	__FOP_RET(#op "_" #dst)
 
 #define FOP1EEX(op,  dst) \
-	FOP1E(op, dst) _ASM_EXTABLE(10b, kvm_fastop_exception)
+	FOP1E(op, dst) _ASM_EXTABLE_TYPE_REG(10b, 11b, EX_TYPE_ZERO_REG, %esi)
 
 #define FASTOP1(op) \
 	FOP_START(op) \
@@ -434,10 +434,6 @@ static int fastop(struct x86_emulate_ctx
 	#op " %al \n\t" \
 	__FOP_RET(#op)
 
-asm(".pushsection .fixup, \"ax\"\n"
-    "kvm_fastop_exception: xor %esi, %esi; ret\n"
-    ".popsection");
-
 FOP_START(setcc)
 FOP_SETCC(seto)
 FOP_SETCC(setno)
@@ -473,12 +469,8 @@ FOP_END;
  \
 	asm volatile("1:" insn "\n" \
 	             "2:\n" \
-	             ".pushsection .fixup, \"ax\"\n" \
-	             "3: movl $1, %[_fault]\n" \
-	             "   jmp  2b\n" \
-	             ".popsection\n" \
-	             _ASM_EXTABLE(1b, 3b) \
-	             : [_fault] "+qm"(_fault) inoutclob ); \
+		     _ASM_EXTABLE_TYPE_REG(1b, 2b, EX_TYPE_ONE_REG, %[_fault]) \
+	             : [_fault] "+r"(_fault) inoutclob ); \
  \
 	_fault ? X86EMUL_UNHANDLEABLE : X86EMUL_CONTINUE; \
 })



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 18/22] x86,usercopy_32: Simplify __copy_user_intel_nocache()
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (16 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 17/22] x86,kvm: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 19/22] x86,usercopy: Remove .fixup usage Peter Zijlstra
                   ` (3 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Having an exception jump to a .fixup that only immediately jumps back
out is daft; jump to the right place in one go.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/lib/usercopy_32.c |   40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

--- a/arch/x86/lib/usercopy_32.c
+++ b/arch/x86/lib/usercopy_32.c
@@ -257,28 +257,28 @@ static unsigned long __copy_user_intel_n
 	       "8:\n"
 	       ".section .fixup,\"ax\"\n"
 	       "9:      lea 0(%%eax,%0,4),%0\n"
-	       "16:     jmp 8b\n"
+	       "        jmp 8b\n"
 	       ".previous\n"
-	       _ASM_EXTABLE_UA(0b, 16b)
-	       _ASM_EXTABLE_UA(1b, 16b)
-	       _ASM_EXTABLE_UA(2b, 16b)
-	       _ASM_EXTABLE_UA(21b, 16b)
-	       _ASM_EXTABLE_UA(3b, 16b)
-	       _ASM_EXTABLE_UA(31b, 16b)
-	       _ASM_EXTABLE_UA(4b, 16b)
-	       _ASM_EXTABLE_UA(41b, 16b)
-	       _ASM_EXTABLE_UA(10b, 16b)
-	       _ASM_EXTABLE_UA(51b, 16b)
-	       _ASM_EXTABLE_UA(11b, 16b)
-	       _ASM_EXTABLE_UA(61b, 16b)
-	       _ASM_EXTABLE_UA(12b, 16b)
-	       _ASM_EXTABLE_UA(71b, 16b)
-	       _ASM_EXTABLE_UA(13b, 16b)
-	       _ASM_EXTABLE_UA(81b, 16b)
-	       _ASM_EXTABLE_UA(14b, 16b)
-	       _ASM_EXTABLE_UA(91b, 16b)
+	       _ASM_EXTABLE_UA(0b, 8b)
+	       _ASM_EXTABLE_UA(1b, 8b)
+	       _ASM_EXTABLE_UA(2b, 8b)
+	       _ASM_EXTABLE_UA(21b, 8b)
+	       _ASM_EXTABLE_UA(3b, 8b)
+	       _ASM_EXTABLE_UA(31b, 8b)
+	       _ASM_EXTABLE_UA(4b, 8b)
+	       _ASM_EXTABLE_UA(41b, 8b)
+	       _ASM_EXTABLE_UA(10b, 8b)
+	       _ASM_EXTABLE_UA(51b, 8b)
+	       _ASM_EXTABLE_UA(11b, 8b)
+	       _ASM_EXTABLE_UA(61b, 8b)
+	       _ASM_EXTABLE_UA(12b, 8b)
+	       _ASM_EXTABLE_UA(71b, 8b)
+	       _ASM_EXTABLE_UA(13b, 8b)
+	       _ASM_EXTABLE_UA(81b, 8b)
+	       _ASM_EXTABLE_UA(14b, 8b)
+	       _ASM_EXTABLE_UA(91b, 8b)
 	       _ASM_EXTABLE_UA(6b, 9b)
-	       _ASM_EXTABLE_UA(7b, 16b)
+	       _ASM_EXTABLE_UA(7b, 8b)
 	       : "=&c"(size), "=&D" (d0), "=&S" (d1)
 	       :  "1"(to), "2"(from), "0"(size)
 	       : "eax", "edx", "memory");



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 19/22] x86,usercopy: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (17 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 18/22] x86,usercopy_32: Simplify __copy_user_intel_nocache() Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 20/22] x86,word-at-a-time: " Peter Zijlstra
                   ` (2 subsequent siblings)
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Typically usercopy does whole word copies followed by a number of byte
copies to finish the tail. This means that on exception it needs to
compute the remaining length as: words*sizeof(long) + bytes.

Create a new extable handler to do just this.
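
Concretely, the handler added in the extable.c hunk below boils down to
the following, with %ecx holding the word count, @reg naming the
register with the byte remainder and @imm the word size encoded in the
extable entry:

	regs->cx = imm * regs->cx + *pt_regs_nr(regs, reg);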

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/extable_fixup_types.h |    5 +++++
 arch/x86/lib/usercopy_32.c                 |   28 +++++-----------------------
 arch/x86/lib/usercopy_64.c                 |    8 +++-----
 arch/x86/mm/extable.c                      |    9 +++++++++
 4 files changed, 22 insertions(+), 28 deletions(-)

--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -46,4 +46,9 @@
 
 #define	EX_TYPE_FAULT_SGX		18
 
+#define	EX_TYPE_UCOPY_LEN		19 /* cx := reg + imm*cx */
+#define	EX_TYPE_UCOPY_LEN1		(EX_TYPE_UCOPY_LEN | EX_DATA_IMM(1))
+#define	EX_TYPE_UCOPY_LEN4		(EX_TYPE_UCOPY_LEN | EX_DATA_IMM(4))
+#define	EX_TYPE_UCOPY_LEN8		(EX_TYPE_UCOPY_LEN | EX_DATA_IMM(8))
+
 #endif
--- a/arch/x86/lib/usercopy_32.c
+++ b/arch/x86/lib/usercopy_32.c
@@ -43,11 +43,7 @@ do {									\
 		"	movl %2,%0\n"					\
 		"1:	rep; stosb\n"					\
 		"2: " ASM_CLAC "\n"					\
-		".section .fixup,\"ax\"\n"				\
-		"3:	lea 0(%2,%0,4),%0\n"				\
-		"	jmp 2b\n"					\
-		".previous\n"						\
-		_ASM_EXTABLE_UA(0b, 3b)					\
+		_ASM_EXTABLE_TYPE_REG(0b, 2b, EX_TYPE_UCOPY_LEN4, %2)	\
 		_ASM_EXTABLE_UA(1b, 2b)					\
 		: "=&c"(size), "=&D" (__d0)				\
 		: "r"(size & 3), "0"(size / 4), "1"(addr), "a"(0));	\
@@ -149,10 +145,6 @@ __copy_user_intel(void __user *to, const
 		       "36:    movl %%eax, %0\n"
 		       "37:    rep; movsb\n"
 		       "100:\n"
-		       ".section .fixup,\"ax\"\n"
-		       "101:   lea 0(%%eax,%0,4),%0\n"
-		       "       jmp 100b\n"
-		       ".previous\n"
 		       _ASM_EXTABLE_UA(1b, 100b)
 		       _ASM_EXTABLE_UA(2b, 100b)
 		       _ASM_EXTABLE_UA(3b, 100b)
@@ -190,7 +182,7 @@ __copy_user_intel(void __user *to, const
 		       _ASM_EXTABLE_UA(35b, 100b)
 		       _ASM_EXTABLE_UA(36b, 100b)
 		       _ASM_EXTABLE_UA(37b, 100b)
-		       _ASM_EXTABLE_UA(99b, 101b)
+		       _ASM_EXTABLE_TYPE_REG(99b, 100b, EX_TYPE_UCOPY_LEN4, %%eax)
 		       : "=&c"(size), "=&D" (d0), "=&S" (d1)
 		       :  "1"(to), "2"(from), "0"(size)
 		       : "eax", "edx", "memory");
@@ -255,10 +247,6 @@ static unsigned long __copy_user_intel_n
 	       "        movl %%eax,%0\n"
 	       "7:      rep; movsb\n"
 	       "8:\n"
-	       ".section .fixup,\"ax\"\n"
-	       "9:      lea 0(%%eax,%0,4),%0\n"
-	       "        jmp 8b\n"
-	       ".previous\n"
 	       _ASM_EXTABLE_UA(0b, 8b)
 	       _ASM_EXTABLE_UA(1b, 8b)
 	       _ASM_EXTABLE_UA(2b, 8b)
@@ -277,7 +265,7 @@ static unsigned long __copy_user_intel_n
 	       _ASM_EXTABLE_UA(81b, 8b)
 	       _ASM_EXTABLE_UA(14b, 8b)
 	       _ASM_EXTABLE_UA(91b, 8b)
-	       _ASM_EXTABLE_UA(6b, 9b)
+	       _ASM_EXTABLE_TYPE_REG(6b, 8b, EX_TYPE_UCOPY_LEN4, %%eax)
 	       _ASM_EXTABLE_UA(7b, 8b)
 	       : "=&c"(size), "=&D" (d0), "=&S" (d1)
 	       :  "1"(to), "2"(from), "0"(size)
@@ -315,14 +303,8 @@ do {									\
 		"	movl %3,%0\n"					\
 		"1:	rep; movsb\n"					\
 		"2:\n"							\
-		".section .fixup,\"ax\"\n"				\
-		"5:	addl %3,%0\n"					\
-		"	jmp 2b\n"					\
-		"3:	lea 0(%3,%0,4),%0\n"				\
-		"	jmp 2b\n"					\
-		".previous\n"						\
-		_ASM_EXTABLE_UA(4b, 5b)					\
-		_ASM_EXTABLE_UA(0b, 3b)					\
+		_ASM_EXTABLE_TYPE_REG(4b, 2b, EX_TYPE_UCOPY_LEN1, %3)	\
+		_ASM_EXTABLE_TYPE_REG(0b, 2b, EX_TYPE_UCOPY_LEN4, %3)	\
 		_ASM_EXTABLE_UA(1b, 2b)					\
 		: "=&c"(size), "=&D" (__d0), "=&S" (__d1), "=r"(__d2)	\
 		: "3"(size), "0"(size), "1"(to), "2"(from)		\
--- a/arch/x86/lib/usercopy_64.c
+++ b/arch/x86/lib/usercopy_64.c
@@ -35,12 +35,10 @@ unsigned long __clear_user(void __user *
 		"	incq   %[dst]\n"
 		"	decl %%ecx ; jnz  1b\n"
 		"2:\n"
-		".section .fixup,\"ax\"\n"
-		"3:	lea 0(%[size1],%[size8],8),%[size8]\n"
-		"	jmp 2b\n"
-		".previous\n"
-		_ASM_EXTABLE_UA(0b, 3b)
+
+		_ASM_EXTABLE_TYPE_REG(0b, 2b, EX_TYPE_UCOPY_LEN8, %[size1])
 		_ASM_EXTABLE_UA(1b, 2b)
+
 		: [size8] "=&c"(size), [dst] "=&D" (__d0)
 		: [size1] "r"(size & 7), "[size8]" (size / 8), "[dst]"(addr));
 	clac();
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -145,6 +145,13 @@ static bool ex_handler_imm_reg(const str
 	return ex_handler_default(fixup, regs);
 }
 
+static bool ex_handler_ucopy_leng(const struct exception_table_entry *fixup,
+				   struct pt_regs *regs, int trapnr, int reg, int imm)
+{
+	regs->cx = imm * regs->cx + *pt_regs_nr(regs, reg);
+	return ex_handler_uaccess(fixup, regs, trapnr);
+}
+
 int ex_get_fixup_type(unsigned long ip)
 {
 	const struct exception_table_entry *e = search_exception_tables(ip);
@@ -217,6 +224,8 @@ int fixup_exception(struct pt_regs *regs
 		return ex_handler_imm_reg(e, regs, reg, imm);
 	case EX_TYPE_FAULT_SGX:
 		return ex_handler_sgx(e, regs, trapnr);
+	case EX_TYPE_UCOPY_LEN:
+		return ex_handler_ucopy_leng(e, regs, trapnr, reg, imm);
 	}
 	BUG();
 }



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (18 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 19/22] x86,usercopy: Remove .fixup usage Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 18:01   ` Josh Poimboeuf
  2021-11-08 16:47   ` Josh Poimboeuf
  2021-11-05 17:10 ` [PATCH 21/22] x86: Remove .fixup section Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 22/22] objtool: Remove .fixup handling Peter Zijlstra
  21 siblings, 2 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

Rewrite load_unaligned_zeropad() to not require .fixup text.

This is easiest done using asm_goto_output, where we can stick a C
label in the exception table entry. The fallback version isn't nearly
so nice but should work.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/word-at-a-time.h |   67 ++++++++++++++++++++++++----------
 1 file changed, 48 insertions(+), 19 deletions(-)

--- a/arch/x86/include/asm/word-at-a-time.h
+++ b/arch/x86/include/asm/word-at-a-time.h
@@ -77,30 +77,59 @@ static inline unsigned long find_zero(un
  * and the next page not being mapped, take the exception and
  * return zeroes in the non-existing part.
  */
+#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
+
+static inline unsigned long load_unaligned_zeropad(const void *addr)
+{
+	unsigned long offset, data;
+	unsigned long ret;
+
+	asm_volatile_goto(
+		"1:	mov %[mem], %[ret]\n"
+
+		_ASM_EXTABLE(1b, %l[do_exception])
+
+		: [ret] "=&r" (ret)
+		: [mem] "m" (*(unsigned long *)addr)
+		: : do_exception);
+
+out:
+	return ret;
+
+do_exception: __cold;
+
+	offset = (unsigned long)addr & (sizeof(long) - 1);
+	addr = (void *)((unsigned long)addr & ~(sizeof(long) - 1));
+	data = *(unsigned long *)addr;
+	ret = data >> offset * 8;
+	goto out;
+}
+
+#else /* !CONFIG_CC_HAS_ASM_GOTO_OUTPUT */
+
 static inline unsigned long load_unaligned_zeropad(const void *addr)
 {
-	unsigned long ret, dummy;
+	unsigned long offset, data;
+	unsigned long ret, err = 0;
 
-	asm(
-		"1:\tmov %2,%0\n"
+	asm(	"1:	mov %[mem], %[ret]\n"
 		"2:\n"
-		".section .fixup,\"ax\"\n"
-		"3:\t"
-		"lea %2,%1\n\t"
-		"and %3,%1\n\t"
-		"mov (%1),%0\n\t"
-		"leal %2,%%ecx\n\t"
-		"andl %4,%%ecx\n\t"
-		"shll $3,%%ecx\n\t"
-		"shr %%cl,%0\n\t"
-		"jmp 2b\n"
-		".previous\n"
-		_ASM_EXTABLE(1b, 3b)
-		:"=&r" (ret),"=&c" (dummy)
-		:"m" (*(unsigned long *)addr),
-		 "i" (-sizeof(unsigned long)),
-		 "i" (sizeof(unsigned long)-1));
+
+		_ASM_EXTABLE_FAULT(1b, 2b)
+
+		: [ret] "=&r" (ret), "+a" (err)
+		: [mem] "m" (*(unsigned long *)addr));
+
+	if (unlikely(err)) {
+		offset = (unsigned long)addr & (sizeof(long) - 1);
+		addr = (void *)((unsigned long)addr & ~(sizeof(long) - 1));
+		data = *(unsigned long *)addr;
+		ret = data >> offset * 8;
+	}
+
 	return ret;
 }
 
+#endif /* CONFIG_CC_HAS_ASM_GOTO_OUTPUT */
+
 #endif /* _ASM_WORD_AT_A_TIME_H */



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 21/22] x86: Remove .fixup section
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (19 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 20/22] x86,word-at-a-time: " Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  2021-11-05 17:10 ` [PATCH 22/22] objtool: Remove .fixup handling Peter Zijlstra
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

No moar users, kill it dead.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/entry/vdso/vdso-layout.lds.S |    1 -
 arch/x86/kernel/vmlinux.lds.S         |    1 -
 2 files changed, 2 deletions(-)

--- a/arch/x86/entry/vdso/vdso-layout.lds.S
+++ b/arch/x86/entry/vdso/vdso-layout.lds.S
@@ -77,7 +77,6 @@ SECTIONS
 
 	.text		: {
 		*(.text*)
-		*(.fixup)
 	}						:text	=0x90909090,
 
 
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -137,7 +137,6 @@ SECTIONS
 		ALIGN_ENTRY_TEXT_END
 		SOFTIRQENTRY_TEXT
 		STATIC_CALL_TEXT
-		*(.fixup)
 		*(.gnu.warning)
 
 #ifdef CONFIG_RETPOLINE



^ permalink raw reply	[flat|nested] 54+ messages in thread

* [PATCH 22/22] objtool: Remove .fixup handling
  2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
                   ` (20 preceding siblings ...)
  2021-11-05 17:10 ` [PATCH 21/22] x86: Remove .fixup section Peter Zijlstra
@ 2021-11-05 17:10 ` Peter Zijlstra
  21 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 17:10 UTC (permalink / raw)
  To: x86
  Cc: linux-kernel, peterz, jpoimboe, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

The .fixup section has gone the way of the Dodo; that test will always
be false.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 tools/objtool/check.c |    8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -3296,14 +3296,10 @@ static bool ignore_unreachable_insn(stru
 		return true;
 
 	/*
-	 * Ignore any unused exceptions.  This can happen when a whitelisted
-	 * function has an exception table entry.
-	 *
-	 * Also ignore alternative replacement instructions.  This can happen
+	 * Ignore alternative replacement instructions.  This can happen
 	 * when a whitelisted function uses one of the ALTERNATIVE macros.
 	 */
-	if (!strcmp(insn->sec->name, ".fixup") ||
-	    !strcmp(insn->sec->name, ".altinstr_replacement") ||
+	if (!strcmp(insn->sec->name, ".altinstr_replacement") ||
 	    !strcmp(insn->sec->name, ".altinstr_aux"))
 		return true;
 



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-05 17:10 ` [PATCH 20/22] x86,word-at-a-time: " Peter Zijlstra
@ 2021-11-05 18:01   ` Josh Poimboeuf
  2021-11-05 18:07     ` Peter Zijlstra
  2021-11-08 16:47   ` Josh Poimboeuf
  1 sibling, 1 reply; 54+ messages in thread
From: Josh Poimboeuf @ 2021-11-05 18:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini, mbenes

On Fri, Nov 05, 2021 at 06:10:43PM +0100, Peter Zijlstra wrote:
> +#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
> +
> +static inline unsigned long load_unaligned_zeropad(const void *addr)
> +{
> +	unsigned long offset, data;
> +	unsigned long ret;
> +
> +	asm_volatile_goto(
> +		"1:	mov %[mem], %[ret]\n"
> +
> +		_ASM_EXTABLE(1b, %l[do_exception])
> +
> +		: [ret] "=&r" (ret)
> +		: [mem] "m" (*(unsigned long *)addr)
> +		: : do_exception);
> +
> +out:
> +	return ret;
> +
> +do_exception: __cold;
> +
> +	offset = (unsigned long)addr & (sizeof(long) - 1);
> +	addr = (void *)((unsigned long)addr & ~(sizeof(long) - 1));
> +	data = *(unsigned long *)addr;
> +	ret = data >> offset * 8;
> +	goto out;

Superfluous goto, can just return here?

>  static inline unsigned long load_unaligned_zeropad(const void *addr)
>  {
> -	unsigned long ret, dummy;
> +	unsigned long offset, data;
> +	unsigned long ret, err = 0;
>  
> -	asm(
> -		"1:\tmov %2,%0\n"
> +	asm(	"1:	mov %[mem], %[ret]\n"
>  		"2:\n"
> -		".section .fixup,\"ax\"\n"
> -		"3:\t"
> -		"lea %2,%1\n\t"
> -		"and %3,%1\n\t"
> -		"mov (%1),%0\n\t"
> -		"leal %2,%%ecx\n\t"
> -		"andl %4,%%ecx\n\t"
> -		"shll $3,%%ecx\n\t"
> -		"shr %%cl,%0\n\t"
> -		"jmp 2b\n"
> -		".previous\n"
> -		_ASM_EXTABLE(1b, 3b)
> -		:"=&r" (ret),"=&c" (dummy)
> -		:"m" (*(unsigned long *)addr),
> -		 "i" (-sizeof(unsigned long)),
> -		 "i" (sizeof(unsigned long)-1));
> +
> +		_ASM_EXTABLE_FAULT(1b, 2b)
> +
> +		: [ret] "=&r" (ret), "+a" (err)
> +		: [mem] "m" (*(unsigned long *)addr));
> +
> +	if (unlikely(err)) {
> +		offset = (unsigned long)addr & (sizeof(long) - 1);
> +		addr = (void *)((unsigned long)addr & ~(sizeof(long) - 1));
> +		data = *(unsigned long *)addr;
> +		ret = data >> offset * 8;
> +	}
> +
>  	return ret;

This adds a (normally not taken) conditional jump, would a straight jmp
over the fixup not be better?

i.e.

	1: mov %[mem], %[ret]
	jmp 2
	... fixup code ...
	2:

-- 
Josh


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-05 18:01   ` Josh Poimboeuf
@ 2021-11-05 18:07     ` Peter Zijlstra
  0 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-05 18:07 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini, mbenes

On Fri, Nov 05, 2021 at 11:01:52AM -0700, Josh Poimboeuf wrote:
> On Fri, Nov 05, 2021 at 06:10:43PM +0100, Peter Zijlstra wrote:
> > +#ifdef CONFIG_CC_HAS_ASM_GOTO_OUTPUT
> > +
> > +static inline unsigned long load_unaligned_zeropad(const void *addr)
> > +{
> > +	unsigned long offset, data;
> > +	unsigned long ret;
> > +
> > +	asm_volatile_goto(
> > +		"1:	mov %[mem], %[ret]\n"
> > +
> > +		_ASM_EXTABLE(1b, %l[do_exception])
> > +
> > +		: [ret] "=&r" (ret)
> > +		: [mem] "m" (*(unsigned long *)addr)
> > +		: : do_exception);
> > +
> > +out:
> > +	return ret;
> > +
> > +do_exception: __cold;
> > +
> > +	offset = (unsigned long)addr & (sizeof(long) - 1);
> > +	addr = (void *)((unsigned long)addr & ~(sizeof(long) - 1));
> > +	data = *(unsigned long *)addr;
> > +	ret = data >> offset * 8;
> > +	goto out;
> 
> Superfluous goto, can just return here?

Can do I suppose.

> >  static inline unsigned long load_unaligned_zeropad(const void *addr)
> >  {
> > -	unsigned long ret, dummy;
> > +	unsigned long offset, data;
> > +	unsigned long ret, err = 0;
> >  
> > -	asm(
> > -		"1:\tmov %2,%0\n"
> > +	asm(	"1:	mov %[mem], %[ret]\n"
> >  		"2:\n"
> > -		".section .fixup,\"ax\"\n"
> > -		"3:\t"
> > -		"lea %2,%1\n\t"
> > -		"and %3,%1\n\t"
> > -		"mov (%1),%0\n\t"
> > -		"leal %2,%%ecx\n\t"
> > -		"andl %4,%%ecx\n\t"
> > -		"shll $3,%%ecx\n\t"
> > -		"shr %%cl,%0\n\t"
> > -		"jmp 2b\n"
> > -		".previous\n"
> > -		_ASM_EXTABLE(1b, 3b)
> > -		:"=&r" (ret),"=&c" (dummy)
> > -		:"m" (*(unsigned long *)addr),
> > -		 "i" (-sizeof(unsigned long)),
> > -		 "i" (sizeof(unsigned long)-1));
> > +
> > +		_ASM_EXTABLE_FAULT(1b, 2b)
> > +
> > +		: [ret] "=&r" (ret), "+a" (err)
> > +		: [mem] "m" (*(unsigned long *)addr));
> > +
> > +	if (unlikely(err)) {
> > +		offset = (unsigned long)addr & (sizeof(long) - 1);
> > +		addr = (void *)((unsigned long)addr & ~(sizeof(long) - 1));
> > +		data = *(unsigned long *)addr;
> > +		ret = data >> offset * 8;
> > +	}
> > +
> >  	return ret;
> 
> This adds a (normally not taken) conditional jump, would a straight jmp
> over the fixup not be better?
> 
> i.e.
> 
> 	1: mov %[mem], %[ret]
> 	jmp 2
> 	... fixup code ...
> 	2:

I really don't know, this way we can get the bulk of the text
out-of-line.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-05 17:10 ` [PATCH 20/22] x86,word-at-a-time: " Peter Zijlstra
  2021-11-05 18:01   ` Josh Poimboeuf
@ 2021-11-08 16:47   ` Josh Poimboeuf
  2021-11-08 18:29     ` Peter Zijlstra
  1 sibling, 1 reply; 54+ messages in thread
From: Josh Poimboeuf @ 2021-11-08 16:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini, mbenes

On Fri, Nov 05, 2021 at 06:10:43PM +0100, Peter Zijlstra wrote:
> +static inline unsigned long load_unaligned_zeropad(const void *addr)
> +{
> +	unsigned long offset, data;
> +	unsigned long ret;
> +
> +	asm_volatile_goto(
> +		"1:	mov %[mem], %[ret]\n"
> +
> +		_ASM_EXTABLE(1b, %l[do_exception])
> +
> +		: [ret] "=&r" (ret)
> +		: [mem] "m" (*(unsigned long *)addr)
> +		: : do_exception);
> +
> +out:
> +	return ret;
> +
> +do_exception: __cold;

Clang doesn't approve of this label annotation:

In file included from fs/dcache.c:186:
./arch/x86/include/asm/word-at-a-time.h:99:15: warning: '__cold__' attribute only applies to functions [-Wignored-attributes]
do_exception: __cold;

-- 
Josh


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-08 16:47   ` Josh Poimboeuf
@ 2021-11-08 18:29     ` Peter Zijlstra
  2021-11-08 18:53       ` Nick Desaulniers
  0 siblings, 1 reply; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-08 18:29 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: x86, linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini,
	mbenes, ndesaulniers

On Mon, Nov 08, 2021 at 08:47:11AM -0800, Josh Poimboeuf wrote:
> On Fri, Nov 05, 2021 at 06:10:43PM +0100, Peter Zijlstra wrote:
> > +static inline unsigned long load_unaligned_zeropad(const void *addr)
> > +{
> > +	unsigned long offset, data;
> > +	unsigned long ret;
> > +
> > +	asm_volatile_goto(
> > +		"1:	mov %[mem], %[ret]\n"
> > +
> > +		_ASM_EXTABLE(1b, %l[do_exception])
> > +
> > +		: [ret] "=&r" (ret)
> > +		: [mem] "m" (*(unsigned long *)addr)
> > +		: : do_exception);
> > +
> > +out:
> > +	return ret;
> > +
> > +do_exception: __cold;
> 
> Clang doesn't approve of this label annotation:
> 
> In file included from fs/dcache.c:186:
> ./arch/x86/include/asm/word-at-a-time.h:99:15: warning: '__cold__' attribute only applies to functions [-Wignored-attributes]
> do_exception: __cold;

/me mutters something best left unsaid these days...

Nick, how come?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-08 18:29     ` Peter Zijlstra
@ 2021-11-08 18:53       ` Nick Desaulniers
  2021-11-09  8:23         ` Peter Zijlstra
  0 siblings, 1 reply; 54+ messages in thread
From: Nick Desaulniers @ 2021-11-08 18:53 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Josh Poimboeuf, x86, linux-kernel, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

On Mon, Nov 8, 2021 at 10:29 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Nov 08, 2021 at 08:47:11AM -0800, Josh Poimboeuf wrote:
> > On Fri, Nov 05, 2021 at 06:10:43PM +0100, Peter Zijlstra wrote:
> > > +static inline unsigned long load_unaligned_zeropad(const void *addr)
> > > +{
> > > +   unsigned long offset, data;
> > > +   unsigned long ret;
> > > +
> > > +   asm_volatile_goto(
> > > +           "1:     mov %[mem], %[ret]\n"
> > > +
> > > +           _ASM_EXTABLE(1b, %l[do_exception])
> > > +
> > > +           : [ret] "=&r" (ret)
> > > +           : [mem] "m" (*(unsigned long *)addr)
> > > +           : : do_exception);
> > > +
> > > +out:
> > > +   return ret;
> > > +
> > > +do_exception: __cold;
> >
> > Clang doesn't approve of this label annotation:
> >
> > In file included from fs/dcache.c:186:
> > ./arch/x86/include/asm/word-at-a-time.h:99:15: warning: '__cold__' attribute only applies to functions [-Wignored-attributes]
> > do_exception: __cold;
>
> /me mutters something best left unsaid these days...
>
> Nick, how come?

Looks like https://bugs.llvm.org/show_bug.cgi?id=47487.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-08 18:53       ` Nick Desaulniers
@ 2021-11-09  8:23         ` Peter Zijlstra
  2021-11-09 19:22           ` Nick Desaulniers
  0 siblings, 1 reply; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-09  8:23 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Josh Poimboeuf, x86, linux-kernel, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes

On Mon, Nov 08, 2021 at 10:53:31AM -0800, Nick Desaulniers wrote:
> On Mon, Nov 8, 2021 at 10:29 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Mon, Nov 08, 2021 at 08:47:11AM -0800, Josh Poimboeuf wrote:
> > > On Fri, Nov 05, 2021 at 06:10:43PM +0100, Peter Zijlstra wrote:
> > > > +static inline unsigned long load_unaligned_zeropad(const void *addr)
> > > > +{
> > > > +   unsigned long offset, data;
> > > > +   unsigned long ret;
> > > > +
> > > > +   asm_volatile_goto(
> > > > +           "1:     mov %[mem], %[ret]\n"
> > > > +
> > > > +           _ASM_EXTABLE(1b, %l[do_exception])
> > > > +
> > > > +           : [ret] "=&r" (ret)
> > > > +           : [mem] "m" (*(unsigned long *)addr)
> > > > +           : : do_exception);
> > > > +
> > > > +out:
> > > > +   return ret;
> > > > +
> > > > +do_exception: __cold;
> > >
> > > Clang doesn't approve of this label annotation:
> > >
> > > In file included from fs/dcache.c:186:
> > > ./arch/x86/include/asm/word-at-a-time.h:99:15: warning: '__cold__' attribute only applies to functions [-Wignored-attributes]
> > > do_exception: __cold;
> >
> > /me mutters something best left unsaid these days...
> >
> > Nick, how come?
> 
> Looks like https://bugs.llvm.org/show_bug.cgi?id=47487.

Indeed it does. So what do we do? Keep the attribute and ignore the warn
on clang for now? Even if technically useless, I do like its
descriptive nature.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-09  8:23         ` Peter Zijlstra
@ 2021-11-09 19:22           ` Nick Desaulniers
  2021-11-09 20:59             ` Bill Wendling
  2021-11-09 21:07             ` Peter Zijlstra
  0 siblings, 2 replies; 54+ messages in thread
From: Nick Desaulniers @ 2021-11-09 19:22 UTC (permalink / raw)
  To: Peter Zijlstra, Bill Wendling, Josh Poimboeuf
  Cc: x86, linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini,
	mbenes, llvm, linux-toolchains

On Tue, Nov 9, 2021 at 12:23 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Mon, Nov 08, 2021 at 10:53:31AM -0800, Nick Desaulniers wrote:
> > On Mon, Nov 8, 2021 at 10:29 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Mon, Nov 08, 2021 at 08:47:11AM -0800, Josh Poimboeuf wrote:
> > > > On Fri, Nov 05, 2021 at 06:10:43PM +0100, Peter Zijlstra wrote:
> > > > > +static inline unsigned long load_unaligned_zeropad(const void *addr)
> > > > > +{
> > > > > +   unsigned long offset, data;
> > > > > +   unsigned long ret;
> > > > > +
> > > > > +   asm_volatile_goto(
> > > > > +           "1:     mov %[mem], %[ret]\n"
> > > > > +
> > > > > +           _ASM_EXTABLE(1b, %l[do_exception])
> > > > > +
> > > > > +           : [ret] "=&r" (ret)
> > > > > +           : [mem] "m" (*(unsigned long *)addr)
> > > > > +           : : do_exception);
> > > > > +
> > > > > +out:
> > > > > +   return ret;
> > > > > +
> > > > > +do_exception: __cold;
> > > >
> > > > Clang doesn't approve of this label annotation:
> > > >
> > > > In file included from fs/dcache.c:186:
> > > > ./arch/x86/include/asm/word-at-a-time.h:99:15: warning: '__cold__' attribute only applies to functions [-Wignored-attributes]
> > > > do_exception: __cold;
> > >
> > > /me mutters something best left unsaid these days...
> > >
> > > Nick, how come?
> >
> > Looks like https://bugs.llvm.org/show_bug.cgi?id=47487.
>
> Indeed it does. So what do we do? Keep the attribute and ignore the warn
> on clang for now? Even if techinically useless, I do like it's
> descriptive nature.

I think the feature of label-attributes is generally useful, for asm
goto (without outputs) and probably computed-goto, so I think we
should implement support for these in clang.  I suspect the machinery
for hot/cold labels was added to clang and LLVM before asm goto was;
LLVM likely has all the machinery needed and we probably just need to
relax or adjust clang's semantic analysis of where the attribute may
occur.

With the above patch, we'd still have issues though with released
versions of clang, and -Werror would complicate things further.

I think the use of this feature (label-attributes) here isn't
necessary though; because of the use of outputs, the "fallthrough"
basic block needs to be placed immediately after the basic block
terminated by the asm goto, at least in LLVM.  Was different ordering
of basic blocks observed with GCC without this label attribute?

_Without_ outputs, I can see being able to specify which target of an
asm-goto with multiple labels is relatively hot as useful, but _with_
outputs I suspect specifying the indirect targets as cold provides
little to no utility.  Unless the cold attribute is helping move
("shrink-wrap"?) the basic block to a whole other section
(.text.cold.)?
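
For reference, a minimal stand-alone illustration of a label attribute
on an asm-goto target without outputs (illustrative only, not kernel
code):

	static int check(long x)
	{
		/* Branch to the cold path when x is negative. */
		asm goto("test %0, %0\n\t"
			 "js %l[negative]"
			 : : "r" (x) : "cc" : negative);
		return 0;

	negative: __attribute__((cold));	/* GCC accepts; clang currently warns */
		return -1;
	}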
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-09 19:22           ` Nick Desaulniers
@ 2021-11-09 20:59             ` Bill Wendling
  2021-11-09 21:21               ` Peter Zijlstra
  2021-11-09 21:07             ` Peter Zijlstra
  1 sibling, 1 reply; 54+ messages in thread
From: Bill Wendling @ 2021-11-09 20:59 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Peter Zijlstra, Josh Poimboeuf, x86, linux-kernel, mark.rutland,
	dvyukov, seanjc, pbonzini, mbenes, llvm, linux-toolchains

On Tue, Nov 9, 2021 at 11:22 AM Nick Desaulniers
<ndesaulniers@google.com> wrote:
>
> On Tue, Nov 9, 2021 at 12:23 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Mon, Nov 08, 2021 at 10:53:31AM -0800, Nick Desaulniers wrote:
> > > On Mon, Nov 8, 2021 at 10:29 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >
> > > > On Mon, Nov 08, 2021 at 08:47:11AM -0800, Josh Poimboeuf wrote:
> > > > > On Fri, Nov 05, 2021 at 06:10:43PM +0100, Peter Zijlstra wrote:
> > > > > > +static inline unsigned long load_unaligned_zeropad(const void *addr)
> > > > > > +{
> > > > > > +   unsigned long offset, data;
> > > > > > +   unsigned long ret;
> > > > > > +
> > > > > > +   asm_volatile_goto(
> > > > > > +           "1:     mov %[mem], %[ret]\n"
> > > > > > +
> > > > > > +           _ASM_EXTABLE(1b, %l[do_exception])
> > > > > > +
> > > > > > +           : [ret] "=&r" (ret)
> > > > > > +           : [mem] "m" (*(unsigned long *)addr)
> > > > > > +           : : do_exception);
> > > > > > +
> > > > > > +out:
> > > > > > +   return ret;
> > > > > > +
> > > > > > +do_exception: __cold;
> > > > >
> > > > > Clang doesn't approve of this label annotation:
> > > > >
> > > > > In file included from fs/dcache.c:186:
> > > > > ./arch/x86/include/asm/word-at-a-time.h:99:15: warning: '__cold__' attribute only applies to functions [-Wignored-attributes]
> > > > > do_exception: __cold;
> > > >
> > > > /me mutters something best left unsaid these days...
> > > >
> > > > Nick, how come?
> > >
> > > Looks like https://bugs.llvm.org/show_bug.cgi?id=47487.
> >
> > Indeed it does. So what do we do? Keep the attribute and ignore the warn
> > on clang for now? Even if techinically useless, I do like it's
> > descriptive nature.
>
> I think the feature of label-attributes is generally useful, for asm
> goto (without outputs) and probably computed-goto, so I think we
> should implement support for these in clang.  I suspect the machinery
> for hot/cold labels was added to clang and LLVM before asm goto was;
> LLVM likely has all the machinery needed and we probably just need to
> relax or adjust clang's semantic analysis of where the attribute may
> occur.
>
> With the above patch, we'd still have issues though with released
> versions of clang, and -Werror would complicate things further.
>
> I think the use of this feature (label-attributes) here isn't
> necessary though; because of the use of outputs, the "fallthrough"
> basic block needs to be placed immediately after the basic block
> terminated by the asm goto, at least in LLVM.  Was different ordering
> of basic blocks observed with GCC without this label attribute?
>
> _Without_ outputs, I can see being able to specify which target of an
> asm-goto with multiple labels is relatively hot as useful, but _with_
> outputs I suspect specifying the indirect targets as cold provides
> little to no utility.  Unless the cold attribute is helping move
> ("shrink-wrap"?) the basic block to a whole other section
> (.text.cold.)?

Adding attributes to labels shouldn't be difficult, as you mention. In
the case of cold/hot, it's adjusting some of the metadata that already
exists on some basic blocks. It might be enough to allow the normal
block placement algorithms to move the hot and cold blocks around for
us. The question becomes how many attributes does GCC allow on labels?

-bw

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-09 19:22           ` Nick Desaulniers
  2021-11-09 20:59             ` Bill Wendling
@ 2021-11-09 21:07             ` Peter Zijlstra
  2021-11-10 10:18               ` Peter Zijlstra
                                 ` (2 more replies)
  1 sibling, 3 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-09 21:07 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Bill Wendling, Josh Poimboeuf, x86, linux-kernel, mark.rutland,
	dvyukov, seanjc, pbonzini, mbenes, llvm, linux-toolchains

On Tue, Nov 09, 2021 at 11:22:44AM -0800, Nick Desaulniers wrote:

> I think the use of this feature (label-attributes) here isn't
> necessary though; because of the use of outputs, the "fallthrough"
> basic block needs to be placed immediately after the basic block
> terminated by the asm goto, at least in LLVM.  Was different ordering
> of basic blocks observed with GCC without this label attribute?

GCC does the same, but I wanted to have the exception stuff be in
.text.cold, but alas it doesn't do that. I left the attribute because of
its descriptive value.

>  Unless the cold attribute is helping move
> ("shrink-wrap"?) the basic block to a whole other section
> (.text.cold.)?

I was hoping it would do that, but it doesn't on gcc-11.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-09 20:59             ` Bill Wendling
@ 2021-11-09 21:21               ` Peter Zijlstra
  2021-11-09 21:25                 ` Nick Desaulniers
  0 siblings, 1 reply; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-09 21:21 UTC (permalink / raw)
  To: Bill Wendling
  Cc: Nick Desaulniers, Josh Poimboeuf, x86, linux-kernel,
	mark.rutland, dvyukov, seanjc, pbonzini, mbenes, llvm,
	linux-toolchains

On Tue, Nov 09, 2021 at 12:59:12PM -0800, Bill Wendling wrote:
> 
> Adding attributes to labels shouldn't be difficult, as you mention. In
> the case of cold/hot, it's adjusting some of the metadata that already
> exists on some basic blocks. It might be enough to allow the normal
> block placement algorithms to move the hot and cold blocks around for
> us. The question becomes how many attributes does GCC allow on labels?

I'm aware of 3: unused, hot, cold. Also:

  https://gcc.gnu.org/onlinedocs/gcc/Label-Attributes.html

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-09 21:21               ` Peter Zijlstra
@ 2021-11-09 21:25                 ` Nick Desaulniers
  2021-11-09 22:11                   ` Peter Zijlstra
  0 siblings, 1 reply; 54+ messages in thread
From: Nick Desaulniers @ 2021-11-09 21:25 UTC (permalink / raw)
  To: Bill Wendling
  Cc: Josh Poimboeuf, x86, linux-kernel, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes, llvm, linux-toolchains, Peter Zijlstra

On Tue, Nov 9, 2021 at 1:21 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Nov 09, 2021 at 12:59:12PM -0800, Bill Wendling wrote:
> >
> > Adding attributes to labels shouldn't be difficult, as you mention. In
> > the case of cold/hot, it's adjusting some of the metadata that already
> > exists on some basic blocks. It might be enough to allow the normal
> > block placement algorithms to move the hot and cold blocks around for
> > us. The question becomes how many attributes does GCC allow on labels?
>
> I'm aware of 3: unused, hot, cold. Also:
>
>   https://gcc.gnu.org/onlinedocs/gcc/Label-Attributes.html

Re: unused:
Being able to selectively disable -Wunused-label via
__attribute__((unused)); seems useful, too.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-09 21:25                 ` Nick Desaulniers
@ 2021-11-09 22:11                   ` Peter Zijlstra
  2021-11-09 22:15                     ` Nick Desaulniers
  0 siblings, 1 reply; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-09 22:11 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Bill Wendling, Josh Poimboeuf, x86, linux-kernel, mark.rutland,
	dvyukov, seanjc, pbonzini, mbenes, llvm, linux-toolchains

On Tue, Nov 09, 2021 at 01:25:47PM -0800, Nick Desaulniers wrote:
> On Tue, Nov 9, 2021 at 1:21 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Tue, Nov 09, 2021 at 12:59:12PM -0800, Bill Wendling wrote:
> > >
> > > Adding attributes to labels shouldn't be difficult, as you mention. In
> > > the case of cold/hot, it's adjusting some of the metadata that already
> > > exists on some basic blocks. It might be enough to allow the normal
> > > block placement algorithms to move the hot and cold blocks around for
> > > us. The question becomes how many attributes does GCC allow on labels?
> >
> > I'm aware of 3: unused, hot, cold. Also:
> >
> >   https://gcc.gnu.org/onlinedocs/gcc/Label-Attributes.html
> 
> Re: unused:
> Being able to selectively disable -Wunused-label via
> __attribute__((unused)); seems useful, too.

kernel/sched/fair.c:done: __maybe_unused;

Yes it is ;-)

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-09 22:11                   ` Peter Zijlstra
@ 2021-11-09 22:15                     ` Nick Desaulniers
  0 siblings, 0 replies; 54+ messages in thread
From: Nick Desaulniers @ 2021-11-09 22:15 UTC (permalink / raw)
  To: Peter Zijlstra, Bill Wendling
  Cc: Josh Poimboeuf, x86, linux-kernel, mark.rutland, dvyukov, seanjc,
	pbonzini, mbenes, llvm, linux-toolchains

On Tue, Nov 9, 2021 at 2:11 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Tue, Nov 09, 2021 at 01:25:47PM -0800, Nick Desaulniers wrote:
> > On Tue, Nov 9, 2021 at 1:21 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Tue, Nov 09, 2021 at 12:59:12PM -0800, Bill Wendling wrote:
> > > >
> > > > Adding attributes to labels shouldn't be difficult, as you mention. In
> > > > the case of cold/hot, it's adjusting some of the metadata that already
> > > > exists on some basic blocks. It might be enough to allow the normal
> > > > block placement algorithms to move the hot and cold blocks around for
> > > > us. The question becomes how many attributes does GCC allow on labels?
> > >
> > > I'm aware of 3: unused, hot, cold. Also:
> > >
> > >   https://gcc.gnu.org/onlinedocs/gcc/Label-Attributes.html
> >
> > Re: unused:
> > Being able to selectively disable -Wunused-label via
> > __attribute__((unused)); seems useful, too.
>
> kernel/sched/fair.c:done: __maybe_unused;
>
> Yes it is ;-)

Ah, that's already supported: https://godbolt.org/z/aa76aexnv, so it's
just hot/cold that could be implemented next.
-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-09 21:07             ` Peter Zijlstra
@ 2021-11-10 10:18               ` Peter Zijlstra
  2021-11-10 10:46               ` David Laight
  2021-11-10 12:14               ` Segher Boessenkool
  2 siblings, 0 replies; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-10 10:18 UTC (permalink / raw)
  To: Nick Desaulniers
  Cc: Bill Wendling, Josh Poimboeuf, x86, linux-kernel, mark.rutland,
	dvyukov, seanjc, pbonzini, mbenes, llvm, linux-toolchains

On Tue, Nov 09, 2021 at 10:07:36PM +0100, Peter Zijlstra wrote:
> On Tue, Nov 09, 2021 at 11:22:44AM -0800, Nick Desaulniers wrote:
> 
> > I think the use of this feature (label-attributes) here isn't
> > necessary though; because of the use of outputs, the "fallthrough"
> > basic block needs to be placed immediately after the basic block
> > terminated by the asm goto, at least in LLVM.  Was different ordering
> > of basic blocks observed with GCC without this label attribute?
> 
> GCC does the same, but I wanted to have the exception stuff be in
> .text.cold, but alas it doesn't do that. I left the attribute because of
> it's descriptive value.
> 
> >  Unless the cold attribute is helping move
> > ("shrink-wrap"?) the basic block to a whole other section
> > (.text.cold.)?
> 
> I was hoping it would do that, but it doesn't on gcc-11.

I've removed the __cold on labels in the latest posting.

  https://lkml.kernel.org/r/20211110100102.250793167@infradead.org

^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-09 21:07             ` Peter Zijlstra
  2021-11-10 10:18               ` Peter Zijlstra
@ 2021-11-10 10:46               ` David Laight
  2021-11-10 11:09                 ` Peter Zijlstra
  2021-11-10 12:14               ` Segher Boessenkool
  2 siblings, 1 reply; 54+ messages in thread
From: David Laight @ 2021-11-10 10:46 UTC (permalink / raw)
  To: 'Peter Zijlstra', Nick Desaulniers
  Cc: Bill Wendling, Josh Poimboeuf, x86, linux-kernel, mark.rutland,
	dvyukov, seanjc, pbonzini, mbenes, llvm, linux-toolchains

From: Peter Zijlstra
> Sent: 09 November 2021 21:08
...
> 
> GCC does the same, but I wanted to have the exception stuff be in
> .text.cold, but alas it doesn't do that. I left the attribute because of
> it's descriptive value.
> 
> >  Unless the cold attribute is helping move
> > ("shrink-wrap"?) the basic block to a whole other section
> > (.text.cold.)?
> 
> I was hoping it would do that, but it doesn't on gcc-11.

Wouldn't moving part of a function to .text.cold (or .text.unlikely)
generate the same problems with the stack backtrace code as the
.text.fixup section you are removing had??

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-10 10:46               ` David Laight
@ 2021-11-10 11:09                 ` Peter Zijlstra
  2021-11-10 12:20                   ` David Laight
  0 siblings, 1 reply; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-10 11:09 UTC (permalink / raw)
  To: David Laight
  Cc: Nick Desaulniers, Bill Wendling, Josh Poimboeuf, x86,
	linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini, mbenes,
	llvm, linux-toolchains

On Wed, Nov 10, 2021 at 10:46:42AM +0000, David Laight wrote:
> From: Peter Zijlstra
> > Sent: 09 November 2021 21:08
> ...
> > 
> > GCC does the same, but I wanted to have the exception stuff be in
> > .text.cold, but alas it doesn't do that. I left the attribute because of
> > it's descriptive value.
> > 
> > >  Unless the cold attribute is helping move
> > > ("shrink-wrap"?) the basic block to a whole other section
> > > (.text.cold.)?
> > 
> > I was hoping it would do that, but it doesn't on gcc-11.
> 
> Wouldn't moving part of a function to .text.cold (or .text.unlikely)
> generate the same problems with the stack backtrace code as the
> .text.fixup section you are removing had??

GCC can already split a function into func and func.cold today (or
worse: func, func.isra.N, func.cold, func.isra.N.cold etc..).

I'm assuming reliable unwind and livepatch know how to deal with this.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-09 21:07             ` Peter Zijlstra
  2021-11-10 10:18               ` Peter Zijlstra
  2021-11-10 10:46               ` David Laight
@ 2021-11-10 12:14               ` Segher Boessenkool
  2 siblings, 0 replies; 54+ messages in thread
From: Segher Boessenkool @ 2021-11-10 12:14 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Nick Desaulniers, Bill Wendling, Josh Poimboeuf, x86,
	linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini, mbenes,
	llvm, linux-toolchains

Hi!

On Tue, Nov 09, 2021 at 10:07:36PM +0100, Peter Zijlstra wrote:
> On Tue, Nov 09, 2021 at 11:22:44AM -0800, Nick Desaulniers wrote:
> > I think the use of this feature (label-attributes) here isn't
> > necessary though; because of the use of outputs, the "fallthrough"
> > basic block needs to be placed immediately after the basic block
> > terminated by the asm goto, at least in LLVM.  Was different ordering
> > of basic blocks observed with GCC without this label attribute?
> 
> GCC does the same, but I wanted to have the exception stuff be in
> .text.cold, but alas it doesn't do that. I left the attribute because of
> it's descriptive value.
> 
> >  Unless the cold attribute is helping move
> > ("shrink-wrap"?)

Shrink-wrapping is something else entirely.

>>  the basic block to a whole other section
> > (.text.cold.)?
> 
> I was hoping it would do that, but it doesn't on gcc-11.

A cold basic block can never dominate a non-cold basic block.  GCC will
fix things up when it notices this property is violated, so marking
random blocks as "cold" will not be very effective.


Segher

^ permalink raw reply	[flat|nested] 54+ messages in thread

* RE: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-10 11:09                 ` Peter Zijlstra
@ 2021-11-10 12:20                   ` David Laight
  2021-11-12  1:50                     ` Josh Poimboeuf
  0 siblings, 1 reply; 54+ messages in thread
From: David Laight @ 2021-11-10 12:20 UTC (permalink / raw)
  To: 'Peter Zijlstra'
  Cc: Nick Desaulniers, Bill Wendling, Josh Poimboeuf, x86,
	linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini, mbenes,
	llvm, linux-toolchains

From: Peter Zijlstra
> Sent: 10 November 2021 11:10
> 
> On Wed, Nov 10, 2021 at 10:46:42AM +0000, David Laight wrote:
> > From: Peter Zijlstra
> > > Sent: 09 November 2021 21:08
> > ...
> > >
> > > GCC does the same, but I wanted to have the exception stuff be in
> > > .text.cold, but alas it doesn't do that. I left the attribute because of
> > > it's descriptive value.
> > >
> > > >  Unless the cold attribute is helping move
> > > > ("shrink-wrap"?) the basic block to a whole other section
> > > > (.text.cold.)?
> > >
> > > I was hoping it would do that, but it doesn't on gcc-11.
> >
> > Wouldn't moving part of a function to .text.cold (or .text.unlikely)
> > generate the same problems with the stack backtrace code as the
> > .text.fixup section you are removing had??
> 
> GCC can already split a function into func and func.cold today (or
> worse: func, func.isra.N, func.cold, func.isra.N.cold etc..).
> 
> I'm assuming reliable unwind and livepatch know how to deal with this.

They'll have 'proper' function labels at the top - so backtrace
stands a chance.
Indeed you (probably) want it to output "func.isra.n.cold" rather
than just "func" to help show which copy it is in.

I guess that livepatch will need separate patches for each
version of the function - which might be 'interesting' if
all the copies actually need patching at the same time.
You'd certainly want a warning if there seemed to be multiple
copies of the function.

I'm waiting for the side-channel attack caused by detecting
timing differences caused by TLB misses when speculatively
executing code in the .cold/.unlikely sections.
ISTR recent x86 CPUs speculate unknown conditional branches
'randomly' - rather than (say) assuming backwards taken.
So you can't (easily) stop speculative execution into the 'cold'
text.

I don't know if speculative execution will load TLB,
it would speed a lot of code up - with the same downsides
as evicting code from the L1-cache.
A 'half-way house' would be to do the page table walk, but
hold the read value 'pending' the code being needed.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-10 12:20                   ` David Laight
@ 2021-11-12  1:50                     ` Josh Poimboeuf
  2021-11-12  9:33                       ` Peter Zijlstra
  2021-11-22 17:46                       ` Petr Mladek
  0 siblings, 2 replies; 54+ messages in thread
From: Josh Poimboeuf @ 2021-11-12  1:50 UTC (permalink / raw)
  To: David Laight
  Cc: 'Peter Zijlstra',
	Nick Desaulniers, Bill Wendling, x86, linux-kernel, mark.rutland,
	dvyukov, seanjc, pbonzini, mbenes, llvm, linux-toolchains,
	live-patching

On Wed, Nov 10, 2021 at 12:20:47PM +0000, David Laight wrote:
> > > Wouldn't moving part of a function to .text.cold (or .text.unlikely)
> > > generate the same problems with the stack backtrace code as the
> > > .text.fixup section you are removing had??
> > 
> > GCC can already split a function into func and func.cold today (or
> > worse: func, func.isra.N, func.cold, func.isra.N.cold etc..).
> > 
> > I'm assuming reliable unwind and livepatch know how to deal with this.
> 
> They'll have 'proper' function labels at the top - so backtrace
> stands a chance.
> Indeed you (probably) want it to output "func.isra.n.cold" rather
> than just "func" to help show which copy it is in.
>
> I guess that livepatch will need separate patches for each
> version of the function - which might be 'interesting' if
> all the copies actually need patching at the same time.
> You'd certainly want a warning if there seemed to be multiple
> copies of the function.

Hm, I think there is actually a livepatch problem here.

If the .cold (aka "child") function actually had a fentry hook then we'd
be fine.  Then we could just patch both "parent" and "child" functions
at the same time.  We already have the ability to patch multiple
functions having dependent interface changes.

But there's no fentry hook in the child, so we can only patch the
parent.

If the child schedules out, and then the parent gets patched, things can
go off-script if the child later jumps back to the unpatched version of
the parent, and then for example the old parent tries to call another
patched function with a since-changed ABI.

Granted, it's like three nested edge cases, so it may not be all that
likely to happen.

Some ideas to fix:

a) Add a field to 'klp_func' which allows the patch module to specify a
   function's .cold counterpart?

b) Detect such cold counterparts in klp_enable_patch()?  Presumably it
   would require searching kallsyms for "<func>.cold", which is somewhat
   problematic as there might be duplicates.

c) Update the reliable stacktrace code to mark the stack unreliable if
   it has a function with ".cold" in the name?

d) Don't patch functions with .cold counterparts? (Probably not a viable
   long-term solution, there are a ton of .cold functions because calls
   to printk are marked cold)

e) Disable .cold optimization?

f) Add fentry hooks to .cold functions?


I'm thinking a) seems do-able, and less disruptive / more precise than
most others, but it requires more due diligence on the part of the patch
creation tooling.  It sounds pretty easy for kpatch-build to handle, at least.
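
For concreteness, a rough sketch of what a) could look like (everything here
is illustrative, not the current livepatch API; the cold_* names are just a
strawman):

/*
 * Sketch: hypothetical fields on struct klp_func describing the .cold
 * counterpart, so the patch author can tie both halves together and the
 * transition stack check can treat them as one function.  Existing
 * fields abbreviated.
 */
struct klp_func {
	const char *old_name;		/* existing: function to be patched */
	void *new_func;			/* existing: replacement */
	unsigned long old_sympos;	/* existing: disambiguation */
	/* ... other existing fields ... */

	/* proposed: the split-out cold part, if the compiler created one */
	const char *cold_name;		/* e.g. "func.cold" */
	unsigned long cold_sympos;	/* duplicate .cold symbols do exist */
};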

-- 
Josh


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-12  1:50                     ` Josh Poimboeuf
@ 2021-11-12  9:33                       ` Peter Zijlstra
  2021-11-13  5:35                         ` Josh Poimboeuf
  2021-11-22 17:46                       ` Petr Mladek
  1 sibling, 1 reply; 54+ messages in thread
From: Peter Zijlstra @ 2021-11-12  9:33 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: David Laight, Nick Desaulniers, Bill Wendling, x86, linux-kernel,
	mark.rutland, dvyukov, seanjc, pbonzini, mbenes, llvm,
	linux-toolchains, live-patching

On Thu, Nov 11, 2021 at 05:50:03PM -0800, Josh Poimboeuf wrote:

> Hm, I think there is actually a livepatch problem here.

I suspected as much, because I couldn't find any code dealing with it
when I looked in a hurry.. :/

> Some ideas to fix:

> c) Update the reliable stacktrace code to mark the stack unreliable if
>    it has a function with ".cold" in the name?

Why not simply match func.cold as func in the transition thing? Then
func won't get patched as long as it (or its .cold part) is in use.
This seems like the natural thing to do.

If there are enough .cold functions, always reporting stacktrace as
unreliable will make progress hard, even though it might be perfectly
safe.

> e) Disable .cold optimization?

Yeah, let's not do that. That'll have me lobbying to kill KLP again
because it generates crap code.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-12  9:33                       ` Peter Zijlstra
@ 2021-11-13  5:35                         ` Josh Poimboeuf
  2021-11-15 12:36                           ` Miroslav Benes
  2021-11-15 12:59                           ` Miroslav Benes
  0 siblings, 2 replies; 54+ messages in thread
From: Josh Poimboeuf @ 2021-11-13  5:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: David Laight, Nick Desaulniers, Bill Wendling, x86, linux-kernel,
	mark.rutland, dvyukov, seanjc, pbonzini, mbenes, llvm,
	linux-toolchains, live-patching

On Fri, Nov 12, 2021 at 10:33:36AM +0100, Peter Zijlstra wrote:
> On Thu, Nov 11, 2021 at 05:50:03PM -0800, Josh Poimboeuf wrote:
> 
> > Hm, I think there is actually a livepatch problem here.
> 
> I suspected as much, because I couldn't find any code dealing with it
> when I looked in a hurry.. :/
> 
> > Some ideas to fix:
> 
> > c) Update the reliable stacktrace code to mark the stack unreliable if
> >    it has a function with ".cold" in the name?
> 
> Why not simply match func.cold as func in the transition thing? Then
> func won't get patched as long as it (or its .cold part) is in use.
> This seems like the natural thing to do.

Well yes, you're basically hinting at my first two options a and b:

a) Add a field to 'klp_func' which allows the patch module to specify a
   function's .cold counterpart?

b) Detect such cold counterparts in klp_enable_patch()?  Presumably it
   would require searching kallsyms for "<func>.cold", which is somewhat
   problematic as there might be duplicates.

It's basically a two-step process:  1) match func to .cold if it exists;
2) check for both in klp_check_stack_func().  The above two options are
proposals for the 1st step.  The 2nd step was implied.

I think something like that is probably the way to go, but the question
is where to match func to .cold:

  - patch creation;
  - klp_init_object_loaded(); or
  - klp_check_stack_func().

I think the main problem with matching them in the kernel is that you
can't disambiguate duplicate ".cold" symbols without disassembling the
function.  Duplicates are rare but they do exist.

Matching them at patch creation time (option a) may work best.  At least
with kpatch-build, the tooling already knows about .cold functions, so
explicitly matching them in new klp_func.{cold_name,cold_sympos} fields
would be trivial.

I don't know about other patch creation tooling, but I'd imagine they
also need to know about .cold functions, unless they have that
optimization disabled.  Because the func and its .cold counterpart
always need to be patched together.
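
For the 2nd step, the extra check could look something like the sketch below
(cold_addr/cold_size are invented fields, resolved from the cold symbol when
the patch is loaded; the real klp_check_stack_func() only knows the parent's
range today):

/*
 * Sketch: treat an address inside the .cold counterpart as being inside
 * the parent for the purposes of the transition stack check.
 */
static bool klp_addr_in_func(struct klp_func *func, unsigned long addr)
{
	unsigned long start = (unsigned long)func->old_func;

	if (addr >= start && addr < start + func->old_size)
		return true;

	return func->cold_addr &&
	       addr >= func->cold_addr &&
	       addr < func->cold_addr + func->cold_size;
}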

-- 
Josh


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-13  5:35                         ` Josh Poimboeuf
@ 2021-11-15 12:36                           ` Miroslav Benes
  2021-11-15 13:01                             ` Joe Lawrence
  2021-11-15 12:59                           ` Miroslav Benes
  1 sibling, 1 reply; 54+ messages in thread
From: Miroslav Benes @ 2021-11-15 12:36 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, David Laight, Nick Desaulniers, Bill Wendling,
	x86, linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini, llvm,
	linux-toolchains, live-patching

On Fri, 12 Nov 2021, Josh Poimboeuf wrote:

> On Fri, Nov 12, 2021 at 10:33:36AM +0100, Peter Zijlstra wrote:
> > On Thu, Nov 11, 2021 at 05:50:03PM -0800, Josh Poimboeuf wrote:
> > 
> > > Hm, I think there is actually a livepatch problem here.
> > 
> > I suspected as much, because I couldn't find any code dealing with it
> > when I looked in a hurry.. :/
> > 
> > > Some ideas to fix:
> > 
> > > c) Update the reliable stacktrace code to mark the stack unreliable if
> > >    it has a function with ".cold" in the name?
> > 
> > Why not simply match func.cold as func in the transition thing? Then
> > func won't get patched as long as it (or its .cold part) is in use.
> > This seems like the natural thing to do.
> 
> Well yes, you're basically hinting at my first two options a and b:
> 
> a) Add a field to 'klp_func' which allows the patch module to specify a
>    function's .cold counterpart?
> 
> b) Detect such cold counterparts in klp_enable_patch()?  Presumably it
>    would require searching kallsyms for "<func>.cold", which is somewhat
>    problematic as there might be duplicates.
> 
> It's basically a two-step process:  1) match func to .cold if it exists;
> 2) check for both in klp_check_stack_func().  The above two options are
> proposals for the 1st step.  The 2nd step was implied.

This reminded me... one of the things I have on my todo list for a long 
time is to add an option for a live patch creator to specify functions 
which are not contained in the live patch but their presence on stacks 
should be checked for. It could save some space in the final live patch 
when one would add functions (callers) just because the consistency 
requires it.

I took it as a convenience feature with a low priority and forgot about
it. The problem above changes it. So should we take the opportunity and
implement both in one step? I wanted to include a list of functions
on a patch level (klp_patch structure) and klp_check_stack() would just
have more to check.
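
A rough sketch of the idea (names invented; nothing like this exists in
livepatch today):

/*
 * Sketch: a patch-level list of symbols that are not patched themselves
 * but must not appear on any task's stack during the transition.
 */
struct klp_stack_only_func {
	const char *name;	/* e.g. a non-patched caller, or "func.cold" */
	unsigned long sympos;	/* 0 if the symbol is unique */
};

struct klp_patch {
	/* ... existing fields: mod, objs, replace, ... */

	/* proposed: extra symbols for klp_check_stack() to look for */
	struct klp_stack_only_func *stack_only_funcs;	/* terminated by a NULL name */
};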

Miroslav

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-13  5:35                         ` Josh Poimboeuf
  2021-11-15 12:36                           ` Miroslav Benes
@ 2021-11-15 12:59                           ` Miroslav Benes
  2021-11-16 21:27                             ` Josh Poimboeuf
  1 sibling, 1 reply; 54+ messages in thread
From: Miroslav Benes @ 2021-11-15 12:59 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, David Laight, Nick Desaulniers, Bill Wendling,
	x86, linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini, llvm,
	linux-toolchains, live-patching

On Fri, 12 Nov 2021, Josh Poimboeuf wrote:

> If the child schedules out, and then the parent gets patched, things can
> go off-script if the child later jumps back to the unpatched version of
> the parent, and then for example the old parent tries to call another
> patched function with a since-changed ABI.

...
 
> I don't know about other patch creation tooling, but I'd imagine they
> also need to know about .cold functions, unless they have that
> optimization disabled.  Because the func and its .cold counterpart
> always need to be patched together.

We, at SUSE, solve the issue differently... the new patched parent would
call that other patched function with the changed ABI statically, in the
live patch itself. So in that example, the .cold child would jump back to
the unpatched parent, which would then also call the unpatched function.

The situation would change if we ever were to have some notion of global
consistency. Then it would be really fragile, so it is worth improving
this, I think.
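
As an invented illustration of that approach (no real kernel functions; the
helper with the changed interface ships inside the livepatch module and is
called directly):

/*
 * Sketch: the function whose interface changed is not patched via
 * ftrace at all.  The livepatched caller calls the new variant that is
 * built into the patch module; unpatched callers keep calling the old,
 * untouched helper.
 */
static int new_helper(struct foo *f, int extra_arg)
{
	return old_helper(f) + extra_arg;	/* the changed interface */
}

int livepatched_parent_func(struct foo *f)
{
	return new_helper(f, 0);	/* static call into the patch module */
}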

Miroslav

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-15 12:36                           ` Miroslav Benes
@ 2021-11-15 13:01                             ` Joe Lawrence
  2021-11-15 23:40                               ` Josh Poimboeuf
  0 siblings, 1 reply; 54+ messages in thread
From: Joe Lawrence @ 2021-11-15 13:01 UTC (permalink / raw)
  To: Miroslav Benes, Josh Poimboeuf
  Cc: Peter Zijlstra, David Laight, Nick Desaulniers, Bill Wendling,
	x86, linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini, llvm,
	linux-toolchains, live-patching

On 11/15/21 7:36 AM, Miroslav Benes wrote:
> On Fri, 12 Nov 2021, Josh Poimboeuf wrote:
> 
>> On Fri, Nov 12, 2021 at 10:33:36AM +0100, Peter Zijlstra wrote:
>>> On Thu, Nov 11, 2021 at 05:50:03PM -0800, Josh Poimboeuf wrote:
>>>
>>>> Hm, I think there is actually a livepatch problem here.
>>>
>>> I suspected as much, because I couldn't find any code dealing with it
>>> when I looked in a hurry.. :/
>>>
>>>> Some ideas to fix:
>>>
>>>> c) Update the reliable stacktrace code to mark the stack unreliable if
>>>>    it has a function with ".cold" in the name?
>>>
>>> Why not simply match func.cold as func in the transition thing? Then
>>> func won't get patched as long as it (or its .cold part) is in use.
>>> This seems like the natural thing to do.
>>
>> Well yes, you're basically hinting at my first two options a and b:
>>
>> a) Add a field to 'klp_func' which allows the patch module to specify a
>>    function's .cold counterpart?
>>
>> b) Detect such cold counterparts in klp_enable_patch()?  Presumably it
>>    would require searching kallsyms for "<func>.cold", which is somewhat
>>    problematic as there might be duplicates.
>>
>> It's basically a two-step process:  1) match func to .cold if it exists;
>> 2) check for both in klp_check_stack_func().  The above two options are
>> proposals for the 1st step.  The 2nd step was implied.
> 
> This reminded me... one of the things I have on my todo list for a long 
> time is to add an option for a live patch creator to specify functions 
> which are not contained in the live patch but their presence on stacks 
> should be checked for. It could save some space in the final live patch 
> when one would add functions (callers) just because the consistency 
> requires it.
> 

Yea, I've used this technique once (adding a nop to a function so
kpatch-build would detect it and include it in klp_funcs[]) to make a set
of changes safer with respect to the consistency model.  Making it simpler
for the livepatch author to say, "I'm not changing foo(), but I don't
want it doing anything while patching a task" sounds reasonable.

> I took it as a convenience feature with a low priority and forgot about
> it. The problem above changes it. So should we take the opportunity and
> implement both in one step? I wanted to include a list of functions
> on a patch level (klp_patch structure) and klp_check_stack() would just
> have more to check.
> 

As far as I read the original problem, I think it would solve for that,
too, so I would say go for it.

-- 
Joe


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-15 13:01                             ` Joe Lawrence
@ 2021-11-15 23:40                               ` Josh Poimboeuf
  2021-11-16  7:25                                 ` Miroslav Benes
  0 siblings, 1 reply; 54+ messages in thread
From: Josh Poimboeuf @ 2021-11-15 23:40 UTC (permalink / raw)
  To: Joe Lawrence
  Cc: Miroslav Benes, Peter Zijlstra, David Laight, Nick Desaulniers,
	Bill Wendling, x86, linux-kernel, mark.rutland, dvyukov, seanjc,
	pbonzini, llvm, linux-toolchains, live-patching

On Mon, Nov 15, 2021 at 08:01:13AM -0500, Joe Lawrence wrote:
> > This reminded me... one of the things I have on my todo list for a long 
> > time is to add an option for a live patch creator to specify functions 
> > which are not contained in the live patch but their presence on stacks 
> > should be checked for. It could save some space in the final live patch 
> > when one would add functions (callers) just because the consistency 
> > requires it.
> > 
> 
> Yea, I've used this technique once (adding a nop to a function so
> kpatch-build would detect it and include it in klp_funcs[]) to make a set
> of changes safer with respect to the consistency model.  Making it simpler
> for the livepatch author to say, "I'm not changing foo(), but I don't
> want it doing anything while patching a task" sounds reasonable.
> 
> > I took it as a convenience feature with a low priority and forgot about
> > it. The problem above changes it. So should we take the opportunity and
> > implement both in one step? I wanted to include a list of functions
> > on a patch level (klp_patch structure) and klp_check_stack() would just
> > have more to check.
> > 
> 
> As far as I read the original problem, I think it would solve for that,
> too, so I would say go for it.

Sounds good to me.

Miroslav, do I understand correctly that you're volunteering to make
this change? ;-)

-- 
Josh


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-15 23:40                               ` Josh Poimboeuf
@ 2021-11-16  7:25                                 ` Miroslav Benes
  0 siblings, 0 replies; 54+ messages in thread
From: Miroslav Benes @ 2021-11-16  7:25 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Joe Lawrence, Peter Zijlstra, David Laight, Nick Desaulniers,
	Bill Wendling, x86, linux-kernel, mark.rutland, dvyukov, seanjc,
	pbonzini, llvm, linux-toolchains, live-patching

On Mon, 15 Nov 2021, Josh Poimboeuf wrote:

> On Mon, Nov 15, 2021 at 08:01:13AM -0500, Joe Lawrence wrote:
> > > This reminded me... one of the things I have on my todo list for a long 
> > > time is to add an option for a live patch creator to specify functions 
> > > which are not contained in the live patch but their presence on stacks 
> > > should be checked for. It could save some space in the final live patch 
> > > when one would add functions (callers) just because the consistency 
> > > requires it.
> > > 
> > 
> > Yea, I've used this technique once (adding a nop to a function so
> > kpatch-build would detect it and include it in klp_funcs[]) to make a set
> > of changes safer with respect to the consistency model.  Making it simpler
> > for the livepatch author to say, "I'm not changing foo(), but I don't
> > want it doing anything while patching a task" sounds reasonable.
> > 
> > > I took it as a convenience feature with a low priority and forgot about
> > > it. The problem above changes it. So should we take the opportunity and
> > > implement both in one step? I wanted to include a list of functions
> > > on a patch level (klp_patch structure) and klp_check_stack() would just
> > > have more to check.
> > > 
> > 
> > As far as I read the original problem, I think it would solve for that,
> > too, so I would say go for it.
> 
> Sounds good to me.
> 
> Miroslav, do I understand correctly that you're volunteering to make
> this change? ;-)

Yes, you do. I am not superfast peterz, so it will take some time, but 
I'll be happy to do it :).

Miroslav

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-15 12:59                           ` Miroslav Benes
@ 2021-11-16 21:27                             ` Josh Poimboeuf
  2021-11-18  7:15                               ` Miroslav Benes
  0 siblings, 1 reply; 54+ messages in thread
From: Josh Poimboeuf @ 2021-11-16 21:27 UTC (permalink / raw)
  To: Miroslav Benes
  Cc: Peter Zijlstra, David Laight, Nick Desaulniers, Bill Wendling,
	x86, linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini, llvm,
	linux-toolchains, live-patching

On Mon, Nov 15, 2021 at 01:59:36PM +0100, Miroslav Benes wrote:
> On Fri, 12 Nov 2021, Josh Poimboeuf wrote:
> 
> > If the child schedules out, and then the parent gets patched, things can
> > go off-script if the child later jumps back to the unpatched version of
> > the parent, and then for example the old parent tries to call another
> > patched function with a since-changed ABI.
> 
> ...
>  
> > I don't know about other patch creation tooling, but I'd imagine they
> > also need to know about .cold functions, unless they have that
> > optimization disabled.  Because the func and its .cold counterpart
> > always need to be patched together.
> 
> > We, at SUSE, solve the issue differently... the new patched parent would
> > call that other patched function with the changed ABI statically, in the
> > live patch itself. So in that example, the .cold child would jump back to
> > the unpatched parent, which would then also call the unpatched function.

So if I understand correctly, if a function's ABI changes then you don't
patch it directly, but only patch its callers to call the replacement?

Is there a reason for that?

-- 
Josh


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-16 21:27                             ` Josh Poimboeuf
@ 2021-11-18  7:15                               ` Miroslav Benes
  0 siblings, 0 replies; 54+ messages in thread
From: Miroslav Benes @ 2021-11-18  7:15 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Peter Zijlstra, David Laight, Nick Desaulniers, Bill Wendling,
	x86, linux-kernel, mark.rutland, dvyukov, seanjc, pbonzini, llvm,
	linux-toolchains, live-patching, nstange

On Tue, 16 Nov 2021, Josh Poimboeuf wrote:

> On Mon, Nov 15, 2021 at 01:59:36PM +0100, Miroslav Benes wrote:
> > On Fri, 12 Nov 2021, Josh Poimboeuf wrote:
> > 
> > > If the child schedules out, and then the parent gets patched, things can
> > > go off-script if the child later jumps back to the unpatched version of
> > > the parent, and then for example the old parent tries to call another
> > > patched function with a since-changed ABI.
> > 
> > ...
> >  
> > > I don't know about other patch creation tooling, but I'd imagine they
> > > also need to know about .cold functions, unless they have that
> > > optimization disabled.  Because the func and its .cold counterpart
> > > always need to be patched together.
> > 
> > > We, at SUSE, solve the issue differently... the new patched parent would
> > > call that other patched function with the changed ABI statically, in the
> > > live patch itself. So in that example, the .cold child would jump back to
> > > the unpatched parent, which would then also call the unpatched function.
> 
> So if I understand correctly, if a function's ABI changes then you don't
> patch it directly, but only patch its callers to call the replacement?

Yes.
 
> Is there a reason for that?

Not really. There were a couple of cases in the past where this was safer
to implement and then it became a habit, I guess.

[ Nicolai CCed, if he wants to add more details ]

Miroslav

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-12  1:50                     ` Josh Poimboeuf
  2021-11-12  9:33                       ` Peter Zijlstra
@ 2021-11-22 17:46                       ` Petr Mladek
  2021-11-24 17:42                         ` Josh Poimboeuf
  1 sibling, 1 reply; 54+ messages in thread
From: Petr Mladek @ 2021-11-22 17:46 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: David Laight, 'Peter Zijlstra',
	Nick Desaulniers, Bill Wendling, x86, linux-kernel, mark.rutland,
	dvyukov, seanjc, pbonzini, mbenes, llvm, linux-toolchains,
	live-patching

On Thu 2021-11-11 17:50:03, Josh Poimboeuf wrote:
> On Wed, Nov 10, 2021 at 12:20:47PM +0000, David Laight wrote:
> > > > Wouldn't moving part of a function to .text.cold (or .text.unlikely)
> > > > generate the same problems with the stack backtrace code as the
> > > > .text.fixup section you are removing had??
> > > 
> > > GCC can already split a function into func and func.cold today (or
> > > worse: func, func.isra.N, func.cold, func.isra.N.cold etc..).
> > > 
> > > I'm assuming reliable unwind and livepatch know how to deal with this.
> > 
> > They'll have 'proper' function labels at the top - so backtrace
> > stands a chance.
> > Indeed you (probably) want it to output "func.isra.n.cold" rather
> > than just "func" to help show which copy it is in.
> >
> > I guess that livepatch will need separate patches for each
> > version of the function - which might be 'interesting' if
> > all the copies actually need patching at the same time.
> > You'd certainly want a warning if there seemed to be multiple
> > copies of the function.
> 
> Hm, I think there is actually a livepatch problem here.
> 
> If the .cold (aka "child") function actually had a fentry hook then we'd
> be fine.  Then we could just patch both "parent" and "child" functions
> at the same time.  We already have the ability to patch multiple
> functions having dependent interface changes.
> 
> But there's no fentry hook in the child, so we can only patch the
> parent.
> 
> If the child schedules out, and then the parent gets patched, things can
> go off-script if the child later jumps back to the unpatched version of
> the parent, and then for example the old parent tries to call another
> patched function with a since-changed ABI.

This thread seems to be motivation for the patchset
https://lore.kernel.org/all/20211119090327.12811-1-mbenes@suse.cz/
I am trying to understand the problem here, first. And I am
a bit lost.

How exactly is child called in the above scenario, please?
How could parent get livepatched when child is sleeping?

I imagine it the following way:

    parent_func()
       fentry

       /* some parent code */
       jmp child
	   /* child code */
	   jmp back_to_parent
       /* more parent code */
       ret

In the above example, parent_func() would be on stack and could not
get livepatched even when the process is sleeping in the child code.

The livepatching is done via ftrace. Only code with fentry could be
livepatched. And code called via fentry must be visible on stack.


Anyway, this looks to me like yet another compiler optimization where
we need to livepatch the callers. The compiler might produce completely
different optimizations for the new code. I do not see a reasonable
way to create compatible func, func.isra.N, func.cold,
func.isra.N.cold variants.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-22 17:46                       ` Petr Mladek
@ 2021-11-24 17:42                         ` Josh Poimboeuf
  2021-11-25  8:18                           ` Petr Mladek
  0 siblings, 1 reply; 54+ messages in thread
From: Josh Poimboeuf @ 2021-11-24 17:42 UTC (permalink / raw)
  To: Petr Mladek
  Cc: David Laight, 'Peter Zijlstra',
	Nick Desaulniers, Bill Wendling, x86, linux-kernel, mark.rutland,
	dvyukov, seanjc, pbonzini, mbenes, llvm, linux-toolchains,
	live-patching

On Mon, Nov 22, 2021 at 06:46:44PM +0100, Petr Mladek wrote:
> On Thu 2021-11-11 17:50:03, Josh Poimboeuf wrote:
> > On Wed, Nov 10, 2021 at 12:20:47PM +0000, David Laight wrote:
> > > > > Wouldn't moving part of a function to .text.cold (or .text.unlikely)
> > > > > generate the same problems with the stack backtrace code as the
> > > > > .text.fixup section you are removing had??
> > > > 
> > > > GCC can already split a function into func and func.cold today (or
> > > > worse: func, func.isra.N, func.cold, func.isra.N.cold etc..).
> > > > 
> > > > I'm assuming reliable unwind and livepatch know how to deal with this.
> > > 
> > > They'll have 'proper' function labels at the top - so backtrace
> > > stands a chance.
> > > Indeed you (probably) want it to output "func.isra.n.cold" rather
> > > than just "func" to help show which copy it is in.
> > >
> > > I guess that livepatch will need separate patches for each
> > > version of the function - which might be 'interesting' if
> > > all the copies actually need patching at the same time.
> > > You'd certainly want a warning if there seemed to be multiple
> > > copies of the function.
> > 
> > Hm, I think there is actually a livepatch problem here.
> > 
> > If the .cold (aka "child") function actually had a fentry hook then we'd
> > be fine.  Then we could just patch both "parent" and "child" functions
> > at the same time.  We already have the ability to patch multiple
> > functions having dependent interface changes.
> > 
> > But there's no fentry hook in the child, so we can only patch the
> > parent.
> > 
> > If the child schedules out, and then the parent gets patched, things can
> > go off-script if the child later jumps back to the unpatched version of
> > the parent, and then for example the old parent tries to call another
> > patched function with a since-changed ABI.
> 
> This thread seems to be motivation for the patchset
> https://lore.kernel.org/all/20211119090327.12811-1-mbenes@suse.cz/
> I am trying to understand the problem here, first. And I am
> a bit lost.
> 
> How exactly is child called in the above scenario, please?
> How could parent get livepatched when child is sleeping?
> 
> I imagine it the following way:
> 
>     parent_func()
>        fentry
> 
>        /* some parent code */
>        jmp child
> 	   /* child code */
> 	   jmp back_to_parent
>        /* more parent code */
>        ret

Right.

> In the above example, parent_func() would be on stack and could not
> get livepatched even when the process is sleeping in the child code.
> 
> The livepatching is done via ftrace. Only code with fentry could be
> livepatched. And code called via fentry must be visible on stack.

How would parent_func() be on the stack?  If it jumps to the child then
it leaves no trace on the stack.
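
To make that concrete with an invented example (struct foo, fast_path() and
slow_path() are made up; the pr_warn() is there because printk being cold is
what typically triggers the split):

/*
 * The unlikely branch below can be emitted as parent_func.cold in
 * .text.unlikely and entered with a plain jmp.  While the task sleeps
 * in the slow path, the only text address from this function on its
 * stack is a return address inside parent_func.cold, which is outside
 * the parent_func symbol range that the livepatch stack check compares
 * against, so the transition is not blocked even though the task will
 * later resume in the old parent.
 */
static noinline int parent_func(struct foo *f)
{
	int ret;

	if (likely(f->ready))
		return fast_path(f);		/* hot part, stays in parent_func */

	/* unlikely part: candidate for parent_func.cold */
	pr_warn("foo %p not ready\n", f);
	mutex_lock(&f->lock);			/* may schedule out here */
	ret = slow_path(f);
	mutex_unlock(&f->lock);
	return ret;
}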

-- 
Josh


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH 20/22] x86,word-at-a-time: Remove .fixup usage
  2021-11-24 17:42                         ` Josh Poimboeuf
@ 2021-11-25  8:18                           ` Petr Mladek
  0 siblings, 0 replies; 54+ messages in thread
From: Petr Mladek @ 2021-11-25  8:18 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: David Laight, 'Peter Zijlstra',
	Nick Desaulniers, Bill Wendling, x86, linux-kernel, mark.rutland,
	dvyukov, seanjc, pbonzini, mbenes, llvm, linux-toolchains,
	live-patching

On Wed 2021-11-24 09:42:13, Josh Poimboeuf wrote:
> On Mon, Nov 22, 2021 at 06:46:44PM +0100, Petr Mladek wrote:
> > On Thu 2021-11-11 17:50:03, Josh Poimboeuf wrote:
> > > On Wed, Nov 10, 2021 at 12:20:47PM +0000, David Laight wrote:
> > > > > > Wouldn't moving part of a function to .text.cold (or .text.unlikely)
> > > > > > generate the same problems with the stack backtrace code as the
> > > > > > .text.fixup section you are removing had??
> > > > > 
> > > > > GCC can already split a function into func and func.cold today (or
> > > > > worse: func, func.isra.N, func.cold, func.isra.N.cold etc..).
> > > > > 
> > > > > I'm assuming reliable unwind and livepatch know how to deal with this.
> > > > 
> > > > They'll have 'proper' function labels at the top - so backtrace
> > > > stands a chance.
> > > > Indeed you (probably) want it to output "func.isra.n.cold" rather
> > > > than just "func" to help show which copy it is in.
> > > >
> > > > I guess that livepatch will need separate patches for each
> > > > version of the function - which might be 'interesting' if
> > > > all the copies actually need patching at the same time.
> > > > You'd certainly want a warning if there seemed to be multiple
> > > > copies of the function.
> > > 
> > > Hm, I think there is actually a livepatch problem here.
> > > 
> > > If the .cold (aka "child") function actually had a fentry hook then we'd
> > > be fine.  Then we could just patch both "parent" and "child" functions
> > > at the same time.  We already have the ability to patch multiple
> > > functions having dependent interface changes.
> > > 
> > > But there's no fentry hook in the child, so we can only patch the
> > > parent.
> > > 
> > > If the child schedules out, and then the parent gets patched, things can
> > > go off-script if the child later jumps back to the unpatched version of
> > > the parent, and then for example the old parent tries to call another
> > > patched function with a since-changed ABI.
> > 
> > This thread seems to be motivation for the patchset
> > https://lore.kernel.org/all/20211119090327.12811-1-mbenes@suse.cz/
> > I am trying to understand the problem here, first. And I am
> > a bit lost.
> > 
> > How exactly is child called in the above scenario, please?
> > How could parent get livepatched when child is sleeping?
> > 
> > I imagine it the following way:
> > 
> >     parent_func()
> >        fentry
> > 
> >        /* some parent code */
> >        jmp child
> > 	   /* child code */
> > 	   jmp back_to_parent
> >        /* more parent code */
> >        ret
> 
> Right.
> 
> > In the above example, parent_func() would be on stack and could not
> > get livepatched even when the process is sleeping in the child code.
> > 
> > The livepatching is done via ftrace. Only code with fentry could be
> > livepatched. And code called via fentry must be visible on stack.
> 
> How would parent_func() be on the stack?  If it jumps to the child then
> it leaves no trace on the stack.

Grr, sure. It was an off-by-one error on my side. /o\

Thanks for the explanation.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2021-11-25  8:20 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-05 17:10 [PATCH 00/22] x86: Remove anonymous out-of-line fixups Peter Zijlstra
2021-11-05 17:10 ` [PATCH 01/22] bitfield.h: Fix "type of reg too small for mask" test Peter Zijlstra
2021-11-05 17:10 ` [PATCH 02/22] x86,mmx_32: Remove .fixup usage Peter Zijlstra
2021-11-05 17:10 ` [PATCH 03/22] x86,copy_user_64: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 04/22] x86,copy_mc_64: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 05/22] x86,entry_64: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 06/22] x86,entry_32: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 07/22] x86,extable: Extend extable functionality Peter Zijlstra
2021-11-05 17:10 ` [PATCH 08/22] x86,msr: Remove .fixup usage Peter Zijlstra
2021-11-05 17:10 ` [PATCH 09/22] x86,futex: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 10/22] x86,uaccess: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 11/22] x86,xen: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 12/22] x86,fpu: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 13/22] x86,segment: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 14/22] x86,vmx: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 15/22] x86,checksum_32: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 16/22] x86,sgx: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 17/22] x86,kvm: " Peter Zijlstra
2021-11-05 17:10 ` [PATCH 18/22] x86,usercopy_32: Simplify __copy_user_intel_nocache() Peter Zijlstra
2021-11-05 17:10 ` [PATCH 19/22] x86,usercopy: Remove .fixup usage Peter Zijlstra
2021-11-05 17:10 ` [PATCH 20/22] x86,word-at-a-time: " Peter Zijlstra
2021-11-05 18:01   ` Josh Poimboeuf
2021-11-05 18:07     ` Peter Zijlstra
2021-11-08 16:47   ` Josh Poimboeuf
2021-11-08 18:29     ` Peter Zijlstra
2021-11-08 18:53       ` Nick Desaulniers
2021-11-09  8:23         ` Peter Zijlstra
2021-11-09 19:22           ` Nick Desaulniers
2021-11-09 20:59             ` Bill Wendling
2021-11-09 21:21               ` Peter Zijlstra
2021-11-09 21:25                 ` Nick Desaulniers
2021-11-09 22:11                   ` Peter Zijlstra
2021-11-09 22:15                     ` Nick Desaulniers
2021-11-09 21:07             ` Peter Zijlstra
2021-11-10 10:18               ` Peter Zijlstra
2021-11-10 10:46               ` David Laight
2021-11-10 11:09                 ` Peter Zijlstra
2021-11-10 12:20                   ` David Laight
2021-11-12  1:50                     ` Josh Poimboeuf
2021-11-12  9:33                       ` Peter Zijlstra
2021-11-13  5:35                         ` Josh Poimboeuf
2021-11-15 12:36                           ` Miroslav Benes
2021-11-15 13:01                             ` Joe Lawrence
2021-11-15 23:40                               ` Josh Poimboeuf
2021-11-16  7:25                                 ` Miroslav Benes
2021-11-15 12:59                           ` Miroslav Benes
2021-11-16 21:27                             ` Josh Poimboeuf
2021-11-18  7:15                               ` Miroslav Benes
2021-11-22 17:46                       ` Petr Mladek
2021-11-24 17:42                         ` Josh Poimboeuf
2021-11-25  8:18                           ` Petr Mladek
2021-11-10 12:14               ` Segher Boessenkool
2021-11-05 17:10 ` [PATCH 21/22] x86: Remove .fixup section Peter Zijlstra
2021-11-05 17:10 ` [PATCH 22/22] objtool: Remove .fixup handling Peter Zijlstra
