xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/6] x86: some assembler macro rework
@ 2020-09-28 12:28 Jan Beulich
  2020-09-28 12:29 ` [PATCH v2 1/6] x86: replace __ASM_{CL,ST}AC Jan Beulich
                   ` (6 more replies)
  0 siblings, 7 replies; 21+ messages in thread
From: Jan Beulich @ 2020-09-28 12:28 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Parts of this were discussed in the context of Andrew's CET-SS work.
Further parts simply fit the underlying picture. And the two final
patches get attached here simply because of their dependency: Patch
4 was sent standalone already as v2, and is unchanged from that,
while patch 6 is new.

1: replace __ASM_{CL,ST}AC
2: reduce CET-SS related #ifdef-ary
3: drop ASM_{CL,ST}AC
4: fold indirect_thunk_asm.h into asm-defns.h
5: guard against straight-line speculation past RET
6: limit amount of INT3 in IND_THUNK_*

Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 1/6] x86: replace __ASM_{CL,ST}AC
  2020-09-28 12:28 [PATCH v2 0/6] x86: some assembler macro rework Jan Beulich
@ 2020-09-28 12:29 ` Jan Beulich
  2020-10-13 11:04   ` Andrew Cooper
  2020-09-28 12:30 ` [PATCH v2 2/6] x86: reduce CET-SS related #ifdef-ary Jan Beulich
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2020-09-28 12:29 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Introduce proper assembler macros instead, enabled only when the
assembler itself doesn't support the insns. To avoid duplicating the
macros for assembly and C files, have them processed into asm-macros.h.
This in turn requires adding a multiple inclusion guard when generating
that header.

No change to generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -243,7 +243,10 @@ $(BASEDIR)/include/asm-x86/asm-macros.h:
 	echo '#if 0' >$@.new
 	echo '.if 0' >>$@.new
 	echo '#endif' >>$@.new
+	echo '#ifndef __ASM_MACROS_H__' >>$@.new
+	echo '#define __ASM_MACROS_H__' >>$@.new
 	echo 'asm ( ".include \"$@\"" );' >>$@.new
+	echo '#endif /* __ASM_MACROS_H__ */' >>$@.new
 	echo '#if 0' >>$@.new
 	echo '.endif' >>$@.new
 	cat $< >>$@.new
--- a/xen/arch/x86/arch.mk
+++ b/xen/arch/x86/arch.mk
@@ -20,6 +20,7 @@ $(call as-option-add,CFLAGS,CC,"rdrand %
 $(call as-option-add,CFLAGS,CC,"rdfsbase %rax",-DHAVE_AS_FSGSBASE)
 $(call as-option-add,CFLAGS,CC,"xsaveopt (%rax)",-DHAVE_AS_XSAVEOPT)
 $(call as-option-add,CFLAGS,CC,"rdseed %eax",-DHAVE_AS_RDSEED)
+$(call as-option-add,CFLAGS,CC,"clac",-DHAVE_AS_CLAC_STAC)
 $(call as-option-add,CFLAGS,CC,"clwb (%rax)",-DHAVE_AS_CLWB)
 $(call as-option-add,CFLAGS,CC,".equ \"x\"$$(comma)1",-DHAVE_AS_QUOTED_SYM)
 $(call as-option-add,CFLAGS,CC,"invpcid (%rax)$$(comma)%rax",-DHAVE_AS_INVPCID)
--- a/xen/arch/x86/asm-macros.c
+++ b/xen/arch/x86/asm-macros.c
@@ -1 +1,2 @@
+#include <asm/asm-defns.h>
 #include <asm/alternative-asm.h>
--- /dev/null
+++ b/xen/include/asm-x86/asm-defns.h
@@ -0,0 +1,9 @@
+#ifndef HAVE_AS_CLAC_STAC
+.macro clac
+    .byte 0x0f, 0x01, 0xca
+.endm
+
+.macro stac
+    .byte 0x0f, 0x01, 0xcb
+.endm
+#endif
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -13,10 +13,12 @@
 #include <asm/alternative.h>
 
 #ifdef __ASSEMBLY__
+#include <asm/asm-defns.h>
 #ifndef CONFIG_INDIRECT_THUNK
 .equ CONFIG_INDIRECT_THUNK, 0
 #endif
 #else
+#include <asm/asm-macros.h>
 asm ( "\t.equ CONFIG_INDIRECT_THUNK, "
       __stringify(IS_ENABLED(CONFIG_INDIRECT_THUNK)) );
 #endif
@@ -200,34 +202,27 @@ register unsigned long current_stack_poi
 
 #endif
 
-/* "Raw" instruction opcodes */
-#define __ASM_CLAC      ".byte 0x0f,0x01,0xca"
-#define __ASM_STAC      ".byte 0x0f,0x01,0xcb"
-
 #ifdef __ASSEMBLY__
 .macro ASM_STAC
-    ALTERNATIVE "", __ASM_STAC, X86_FEATURE_XEN_SMAP
+    ALTERNATIVE "", stac, X86_FEATURE_XEN_SMAP
 .endm
 .macro ASM_CLAC
-    ALTERNATIVE "", __ASM_CLAC, X86_FEATURE_XEN_SMAP
+    ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
 .endm
 #else
 static always_inline void clac(void)
 {
     /* Note: a barrier is implicit in alternative() */
-    alternative("", __ASM_CLAC, X86_FEATURE_XEN_SMAP);
+    alternative("", "clac", X86_FEATURE_XEN_SMAP);
 }
 
 static always_inline void stac(void)
 {
     /* Note: a barrier is implicit in alternative() */
-    alternative("", __ASM_STAC, X86_FEATURE_XEN_SMAP);
+    alternative("", "stac", X86_FEATURE_XEN_SMAP);
 }
 #endif
 
-#undef __ASM_STAC
-#undef __ASM_CLAC
-
 #ifdef __ASSEMBLY__
 .macro SAVE_ALL op, compat=0
 .ifeqs "\op", "CLAC"



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 2/6] x86: reduce CET-SS related #ifdef-ary
  2020-09-28 12:28 [PATCH v2 0/6] x86: some assembler macro rework Jan Beulich
  2020-09-28 12:29 ` [PATCH v2 1/6] x86: replace __ASM_{CL,ST}AC Jan Beulich
@ 2020-09-28 12:30 ` Jan Beulich
  2020-09-28 12:37   ` Jan Beulich
  2020-10-13 11:20   ` Andrew Cooper
  2020-09-28 12:30 ` [PATCH v2 3/6] x86: drop ASM_{CL,ST}AC Jan Beulich
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 21+ messages in thread
From: Jan Beulich @ 2020-09-28 12:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Commit b586a81b7a90 ("x86/CET: Fix build following c/s 43b98e7190") had
to introduce a number of #ifdef-s to make the build work with older tool
chains. Introduce an assembler macro covering for tool chains not
knowing of CET-SS, allowing some conditionals where just SETSSBSY is the
problem to be dropped again.

No change to generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
Now that I've done this I'm no longer sure which direction is better to
follow: On one hand this introduces dead code (even if just NOPs) into
CET-SS-disabled builds. Otoh this is a step towards breaking the tool
chain version dependency of the feature.

I've also dropped conditionals around bigger chunks of code; while I
think that's preferable, I'm open to undo those parts.

--- a/xen/arch/x86/boot/x86_64.S
+++ b/xen/arch/x86/boot/x86_64.S
@@ -31,7 +31,6 @@ ENTRY(__high_start)
         jz      .L_bsp
 
         /* APs.  Set up shadow stacks before entering C. */
-#ifdef CONFIG_XEN_SHSTK
         testl   $cpufeat_mask(X86_FEATURE_XEN_SHSTK), \
                 CPUINFO_FEATURE_OFFSET(X86_FEATURE_XEN_SHSTK) + boot_cpu_data(%rip)
         je      .L_ap_shstk_done
@@ -55,7 +54,6 @@ ENTRY(__high_start)
         mov     $XEN_MINIMAL_CR4 | X86_CR4_CET, %ecx
         mov     %rcx, %cr4
         setssbsy
-#endif
 
 .L_ap_shstk_done:
         call    start_secondary
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -668,7 +668,7 @@ static void __init noreturn reinit_bsp_s
     stack_base[0] = stack;
     memguard_guard_stack(stack);
 
-    if ( IS_ENABLED(CONFIG_XEN_SHSTK) && cpu_has_xen_shstk )
+    if ( cpu_has_xen_shstk )
     {
         wrmsrl(MSR_PL0_SSP,
                (unsigned long)stack + (PRIMARY_SHSTK_SLOT + 1) * PAGE_SIZE - 8);
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -197,9 +197,7 @@ ENTRY(cr4_pv32_restore)
 
 /* See lstar_enter for entry register state. */
 ENTRY(cstar_enter)
-#ifdef CONFIG_XEN_SHSTK
         ALTERNATIVE "", "setssbsy", X86_FEATURE_XEN_SHSTK
-#endif
         /* sti could live here when we don't switch page tables below. */
         CR4_PV32_RESTORE
         movq  8(%rsp),%rax /* Restore %rax. */
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -236,9 +236,7 @@ iret_exit_to_guest:
  * %ss must be saved into the space left by the trampoline.
  */
 ENTRY(lstar_enter)
-#ifdef CONFIG_XEN_SHSTK
         ALTERNATIVE "", "setssbsy", X86_FEATURE_XEN_SHSTK
-#endif
         /* sti could live here when we don't switch page tables below. */
         movq  8(%rsp),%rax /* Restore %rax. */
         movq  $FLAT_KERNEL_SS,8(%rsp)
@@ -272,9 +270,7 @@ ENTRY(lstar_enter)
         jmp   test_all_events
 
 ENTRY(sysenter_entry)
-#ifdef CONFIG_XEN_SHSTK
         ALTERNATIVE "", "setssbsy", X86_FEATURE_XEN_SHSTK
-#endif
         /* sti could live here when we don't switch page tables below. */
         pushq $FLAT_USER_SS
         pushq $0
--- a/xen/include/asm-x86/asm-defns.h
+++ b/xen/include/asm-x86/asm-defns.h
@@ -7,3 +7,9 @@
     .byte 0x0f, 0x01, 0xcb
 .endm
 #endif
+
+#ifndef CONFIG_HAS_AS_CET_SS
+.macro setssbsy
+    .byte 0xf3, 0x0f, 0x01, 0xe8
+.endm
+#endif



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 3/6] x86: drop ASM_{CL,ST}AC
  2020-09-28 12:28 [PATCH v2 0/6] x86: some assembler macro rework Jan Beulich
  2020-09-28 12:29 ` [PATCH v2 1/6] x86: replace __ASM_{CL,ST}AC Jan Beulich
  2020-09-28 12:30 ` [PATCH v2 2/6] x86: reduce CET-SS related #ifdef-ary Jan Beulich
@ 2020-09-28 12:30 ` Jan Beulich
  2020-09-28 12:30 ` [PATCH v2 4/6] x86: fold indirect_thunk_asm.h into asm-defns.h Jan Beulich
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2020-09-28 12:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Use ALTERNATIVE directly, such that at the use sites it is visible that
alternative code patching is in use. Similarly avoid hiding the fact in
SAVE_ALL.

No change to generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
v2: Further adjust comment in asm_domain_crash_synchronous().

--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -2165,9 +2165,8 @@ void activate_debugregs(const struct vcp
 void asm_domain_crash_synchronous(unsigned long addr)
 {
     /*
-     * We need clear AC bit here because in entry.S AC is set
-     * by ASM_STAC to temporarily allow accesses to user pages
-     * which is prevented by SMAP by default.
+     * We need to clear the AC bit here because the exception fixup logic
+     * may leave user accesses enabled.
      *
      * For some code paths, where this function is called, clac()
      * is not needed, but adding clac() here instead of each place
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -12,7 +12,7 @@
 #include <irq_vectors.h>
 
 ENTRY(entry_int82)
-        ASM_CLAC
+        ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
         pushq $0
         movl  $HYPERCALL_VECTOR, 4(%rsp)
         SAVE_ALL compat=1 /* DPL1 gate, restricted to 32bit PV guests only. */
@@ -284,7 +284,7 @@ ENTRY(compat_int80_direct_trap)
 compat_create_bounce_frame:
         ASSERT_INTERRUPTS_ENABLED
         mov   %fs,%edi
-        ASM_STAC
+        ALTERNATIVE "", stac, X86_FEATURE_XEN_SMAP
         testb $2,UREGS_cs+8(%rsp)
         jz    1f
         /* Push new frame at registered guest-OS stack base. */
@@ -331,7 +331,7 @@ compat_create_bounce_frame:
         movl  TRAPBOUNCE_error_code(%rdx),%eax
 .Lft8:  movl  %eax,%fs:(%rsi)           # ERROR CODE
 1:
-        ASM_CLAC
+        ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
         /* Rewrite our stack frame and return to guest-OS mode. */
         /* IA32 Ref. Vol. 3: TF, VM, RF and NT flags are cleared on trap. */
         andl  $~(X86_EFLAGS_VM|X86_EFLAGS_RF|\
@@ -377,7 +377,7 @@ compat_crash_page_fault_4:
         addl  $4,%esi
 compat_crash_page_fault:
 .Lft14: mov   %edi,%fs
-        ASM_CLAC
+        ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
         movl  %esi,%edi
         call  show_page_walk
         jmp   dom_crash_sync_extable
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -276,7 +276,7 @@ ENTRY(sysenter_entry)
         pushq $0
         pushfq
 GLOBAL(sysenter_eflags_saved)
-        ASM_CLAC
+        ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
         pushq $3 /* ring 3 null cs */
         pushq $0 /* null rip */
         pushq $0
@@ -329,7 +329,7 @@ UNLIKELY_END(sysenter_gpf)
         jmp   .Lbounce_exception
 
 ENTRY(int80_direct_trap)
-        ASM_CLAC
+        ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
         pushq $0
         movl  $0x80, 4(%rsp)
         SAVE_ALL
@@ -448,7 +448,7 @@ __UNLIKELY_END(create_bounce_frame_bad_s
 
         subq  $7*8,%rsi
         movq  UREGS_ss+8(%rsp),%rax
-        ASM_STAC
+        ALTERNATIVE "", stac, X86_FEATURE_XEN_SMAP
         movq  VCPU_domain(%rbx),%rdi
         STORE_GUEST_STACK(rax,6)        # SS
         movq  UREGS_rsp+8(%rsp),%rax
@@ -486,7 +486,7 @@ __UNLIKELY_END(create_bounce_frame_bad_s
         STORE_GUEST_STACK(rax,1)        # R11
         movq  UREGS_rcx+8(%rsp),%rax
         STORE_GUEST_STACK(rax,0)        # RCX
-        ASM_CLAC
+        ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
 
 #undef STORE_GUEST_STACK
 
@@ -528,11 +528,11 @@ domain_crash_page_fault_2x8:
 domain_crash_page_fault_1x8:
         addq  $8,%rsi
 domain_crash_page_fault_0x8:
-        ASM_CLAC
+        ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
         movq  %rsi,%rdi
         call  show_page_walk
 ENTRY(dom_crash_sync_extable)
-        ASM_CLAC
+        ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
         # Get out of the guest-save area of the stack.
         GET_STACK_END(ax)
         leaq  STACK_CPUINFO_FIELD(guest_cpu_user_regs)(%rax),%rsp
@@ -590,7 +590,8 @@ UNLIKELY_END(exit_cr3)
         iretq
 
 ENTRY(common_interrupt)
-        SAVE_ALL CLAC
+        ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
+        SAVE_ALL
 
         GET_STACK_END(14)
 
@@ -622,7 +623,8 @@ ENTRY(page_fault)
         movl  $TRAP_page_fault,4(%rsp)
 /* No special register assumptions. */
 GLOBAL(handle_exception)
-        SAVE_ALL CLAC
+        ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
+        SAVE_ALL
 
         GET_STACK_END(14)
 
@@ -827,7 +829,8 @@ ENTRY(entry_CP)
 ENTRY(double_fault)
         movl  $TRAP_double_fault,4(%rsp)
         /* Set AC to reduce chance of further SMAP faults */
-        SAVE_ALL STAC
+        ALTERNATIVE "", stac, X86_FEATURE_XEN_SMAP
+        SAVE_ALL
 
         GET_STACK_END(14)
 
@@ -860,7 +863,8 @@ ENTRY(nmi)
         pushq $0
         movl  $TRAP_nmi,4(%rsp)
 handle_ist_exception:
-        SAVE_ALL CLAC
+        ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
+        SAVE_ALL
 
         GET_STACK_END(14)
 
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -200,16 +200,6 @@ register unsigned long current_stack_poi
         UNLIKELY_END_SECTION "\n"          \
         ".Llikely." #tag ".%=:"
 
-#endif
-
-#ifdef __ASSEMBLY__
-.macro ASM_STAC
-    ALTERNATIVE "", stac, X86_FEATURE_XEN_SMAP
-.endm
-.macro ASM_CLAC
-    ALTERNATIVE "", clac, X86_FEATURE_XEN_SMAP
-.endm
-#else
 static always_inline void clac(void)
 {
     /* Note: a barrier is implicit in alternative() */
@@ -224,18 +214,7 @@ static always_inline void stac(void)
 #endif
 
 #ifdef __ASSEMBLY__
-.macro SAVE_ALL op, compat=0
-.ifeqs "\op", "CLAC"
-        ASM_CLAC
-.else
-.ifeqs "\op", "STAC"
-        ASM_STAC
-.else
-.ifnb \op
-        .err
-.endif
-.endif
-.endif
+.macro SAVE_ALL compat=0
         addq  $-(UREGS_error_code-UREGS_r15), %rsp
         cld
         movq  %rdi,UREGS_rdi(%rsp)



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 4/6] x86: fold indirect_thunk_asm.h into asm-defns.h
  2020-09-28 12:28 [PATCH v2 0/6] x86: some assembler macro rework Jan Beulich
                   ` (2 preceding siblings ...)
  2020-09-28 12:30 ` [PATCH v2 3/6] x86: drop ASM_{CL,ST}AC Jan Beulich
@ 2020-09-28 12:30 ` Jan Beulich
  2020-09-28 12:31 ` [PATCH v2 5/6] x86: guard against straight-line speculation past RET Jan Beulich
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2020-09-28 12:30 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

There's little point in having two separate headers both getting
included by asm_defns.h. This in particular reduces the number of
instances of guarding asm(".include ...") suitably in such dual use
headers.

No change to generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

--- a/xen/Makefile
+++ b/xen/Makefile
@@ -139,7 +139,7 @@ ifeq ($(TARGET_ARCH),x86)
 t1 = $(call as-insn,$(CC),".L0: .L1: .skip (.L1 - .L0)",,-no-integrated-as)
 
 # Check whether clang asm()-s support .include.
-t2 = $(call as-insn,$(CC) -I$(BASEDIR)/include,".include \"asm-x86/indirect_thunk_asm.h\"",,-no-integrated-as)
+t2 = $(call as-insn,$(CC) -I$(BASEDIR)/include,".include \"asm-x86/asm-defns.h\"",,-no-integrated-as)
 
 # Check whether clang keeps .macro-s between asm()-s:
 # https://bugs.llvm.org/show_bug.cgi?id=36110
--- a/xen/include/asm-x86/asm-defns.h
+++ b/xen/include/asm-x86/asm-defns.h
@@ -13,3 +13,40 @@
     .byte 0xf3, 0x0f, 0x01, 0xe8
 .endm
 #endif
+
+.macro INDIRECT_BRANCH insn:req arg:req
+/*
+ * Create an indirect branch.  insn is one of call/jmp, arg is a single
+ * register.
+ *
+ * With no compiler support, this degrades into a plain indirect call/jmp.
+ * With compiler support, dispatch to the correct __x86_indirect_thunk_*
+ */
+    .if CONFIG_INDIRECT_THUNK == 1
+
+        $done = 0
+        .irp reg, ax, cx, dx, bx, bp, si, di, 8, 9, 10, 11, 12, 13, 14, 15
+        .ifeqs "\arg", "%r\reg"
+            \insn __x86_indirect_thunk_r\reg
+            $done = 1
+           .exitm
+        .endif
+        .endr
+
+        .if $done != 1
+            .error "Bad register arg \arg"
+        .endif
+
+    .else
+        \insn *\arg
+    .endif
+.endm
+
+/* Convenience wrappers. */
+.macro INDIRECT_CALL arg:req
+    INDIRECT_BRANCH call \arg
+.endm
+
+.macro INDIRECT_JMP arg:req
+    INDIRECT_BRANCH jmp \arg
+.endm
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -22,7 +22,6 @@
 asm ( "\t.equ CONFIG_INDIRECT_THUNK, "
       __stringify(IS_ENABLED(CONFIG_INDIRECT_THUNK)) );
 #endif
-#include <asm/indirect_thunk_asm.h>
 
 #ifndef __ASSEMBLY__
 void ret_from_intr(void);
--- a/xen/include/asm-x86/indirect_thunk_asm.h
+++ /dev/null
@@ -1,53 +0,0 @@
-/*
- * Trickery to allow this header to be included at the C level, to permit
- * proper dependency tracking in .*.o.d files, while still having it contain
- * assembler only macros.
- */
-#ifndef __ASSEMBLY__
-# if 0
-  .if 0
-# endif
-asm ( "\t.include \"asm/indirect_thunk_asm.h\"" );
-# if 0
-  .endif
-# endif
-#else
-
-.macro INDIRECT_BRANCH insn:req arg:req
-/*
- * Create an indirect branch.  insn is one of call/jmp, arg is a single
- * register.
- *
- * With no compiler support, this degrades into a plain indirect call/jmp.
- * With compiler support, dispatch to the correct __x86_indirect_thunk_*
- */
-    .if CONFIG_INDIRECT_THUNK == 1
-
-        $done = 0
-        .irp reg, ax, cx, dx, bx, bp, si, di, 8, 9, 10, 11, 12, 13, 14, 15
-        .ifeqs "\arg", "%r\reg"
-            \insn __x86_indirect_thunk_r\reg
-            $done = 1
-           .exitm
-        .endif
-        .endr
-
-        .if $done != 1
-            .error "Bad register arg \arg"
-        .endif
-
-    .else
-        \insn *\arg
-    .endif
-.endm
-
-/* Convenience wrappers. */
-.macro INDIRECT_CALL arg:req
-    INDIRECT_BRANCH call \arg
-.endm
-
-.macro INDIRECT_JMP arg:req
-    INDIRECT_BRANCH jmp \arg
-.endm
-
-#endif



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 5/6] x86: guard against straight-line speculation past RET
  2020-09-28 12:28 [PATCH v2 0/6] x86: some assembler macro rework Jan Beulich
                   ` (3 preceding siblings ...)
  2020-09-28 12:30 ` [PATCH v2 4/6] x86: fold indirect_thunk_asm.h into asm-defns.h Jan Beulich
@ 2020-09-28 12:31 ` Jan Beulich
  2020-10-08 16:28   ` Roger Pau Monné
  2020-10-13 12:00   ` Andrew Cooper
  2020-09-28 12:32 ` [PATCH v2 6/6] x86: limit amount of INT3 in IND_THUNK_* Jan Beulich
  2020-09-28 12:34 ` [PATCH v2 0/6] x86: some assembler macro rework Jan Beulich
  6 siblings, 2 replies; 21+ messages in thread
From: Jan Beulich @ 2020-09-28 12:31 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

Under certain conditions CPUs can speculate into the instruction stream
past a RET instruction. Guard against this just like 3b7dab93f240
("x86/spec-ctrl: Protect against CALL/JMP straight-line speculation")
did - by inserting an "INT $3" insn. It's merely the mechanics of how to
achieve this that differ: A set of macros gets introduced to post-
process RET insns issued by the compiler (or living in assembly files).

Unfortunately for clang this requires further features their built-in
assembler doesn't support: We need to be able to override insn mnemonics
produced by the compiler (which may be impossible, if internally
assembly mnemonics never get generated), and we want to use \(text)
escaping / quoting in the auxiliary macro.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
TBD: Should this depend on CONFIG_SPECULATIVE_HARDEN_BRANCH?
TBD: Would be nice to avoid the additions in .init.text, but a query to
     the binutils folks regarding the ability to identify the section
     stuff is in (by Peter Zijlstra over a year ago:
     https://sourceware.org/pipermail/binutils/2019-July/107528.html)
     has been left without helpful replies.
---
v2: Fix build with newer clang. Use int3 mnemonic. Also override retq.

--- a/xen/Makefile
+++ b/xen/Makefile
@@ -145,7 +145,15 @@ t2 = $(call as-insn,$(CC) -I$(BASEDIR)/i
 # https://bugs.llvm.org/show_bug.cgi?id=36110
 t3 = $(call as-insn,$(CC),".macro FOO;.endm"$(close); asm volatile $(open)".macro FOO;.endm",-no-integrated-as)
 
-CLANG_FLAGS += $(call or,$(t1),$(t2),$(t3))
+# Check whether \(text) escaping in macro bodies is supported.
+t4 = $(call as-insn,$(CC),".macro m ret:req; \\(ret) $$\\ret; .endm; m 8",,-no-integrated-as)
+
+# Check whether macros can override insn mnemonics in inline assembly.
+t5 = $(call as-insn,$(CC),".macro ret; .error; .endm; .macro retq; .error; .endm",-no-integrated-as)
+
+acc1 := $(call or,$(t1),$(t2),$(t3),$(t4))
+
+CLANG_FLAGS += $(call or,$(acc1),$(t5))
 endif
 
 CLANG_FLAGS += -Werror=unknown-warning-option
--- a/xen/include/asm-x86/asm-defns.h
+++ b/xen/include/asm-x86/asm-defns.h
@@ -50,3 +50,22 @@
 .macro INDIRECT_JMP arg:req
     INDIRECT_BRANCH jmp \arg
 .endm
+
+/*
+ * To guard against speculation past RET, insert a breakpoint insn
+ * immediately after them.
+ */
+.macro ret operand:vararg
+    ret$ \operand
+.endm
+.macro retq operand:vararg
+    ret$ \operand
+.endm
+.macro ret$ operand:vararg
+    .purgem ret
+    ret \operand
+    int3
+    .macro ret operand:vararg
+        ret$ \\(operand)
+    .endm
+.endm



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 6/6] x86: limit amount of INT3 in IND_THUNK_*
  2020-09-28 12:28 [PATCH v2 0/6] x86: some assembler macro rework Jan Beulich
                   ` (4 preceding siblings ...)
  2020-09-28 12:31 ` [PATCH v2 5/6] x86: guard against straight-line speculation past RET Jan Beulich
@ 2020-09-28 12:32 ` Jan Beulich
  2020-10-08 16:35   ` Roger Pau Monné
  2020-10-13 12:58   ` Andrew Cooper
  2020-09-28 12:34 ` [PATCH v2 0/6] x86: some assembler macro rework Jan Beulich
  6 siblings, 2 replies; 21+ messages in thread
From: Jan Beulich @ 2020-09-28 12:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

There's no point having every replacement variant to also specify the
INT3 - just have it once in the base macro. When patching, NOPs will get
inserted, which are fine to speculate through (until reaching the INT3).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
I also wonder whether the LFENCE in IND_THUNK_RETPOLINE couldn't be
replaced by INT3 as well. Of course the effect will be marginal, as the
size of the thunk will still be 16 bytes when including tail padding
resulting from alignment.
---
v2: New.

--- a/xen/arch/x86/indirect-thunk.S
+++ b/xen/arch/x86/indirect-thunk.S
@@ -11,6 +11,8 @@
 
 #include <asm/asm_defns.h>
 
+.purgem ret
+
 .macro IND_THUNK_RETPOLINE reg:req
         call 2f
 1:
@@ -24,12 +26,10 @@
 .macro IND_THUNK_LFENCE reg:req
         lfence
         jmp *%\reg
-        int3 /* Halt straight-line speculation */
 .endm
 
 .macro IND_THUNK_JMP reg:req
         jmp *%\reg
-        int3 /* Halt straight-line speculation */
 .endm
 
 /*
@@ -44,6 +44,8 @@ ENTRY(__x86_indirect_thunk_\reg)
         __stringify(IND_THUNK_LFENCE \reg), X86_FEATURE_IND_THUNK_LFENCE, \
         __stringify(IND_THUNK_JMP \reg),    X86_FEATURE_IND_THUNK_JMP
 
+        int3 /* Halt straight-line speculation */
+
         .size __x86_indirect_thunk_\reg, . - __x86_indirect_thunk_\reg
         .type __x86_indirect_thunk_\reg, @function
 .endm



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 0/6] x86: some assembler macro rework
  2020-09-28 12:28 [PATCH v2 0/6] x86: some assembler macro rework Jan Beulich
                   ` (5 preceding siblings ...)
  2020-09-28 12:32 ` [PATCH v2 6/6] x86: limit amount of INT3 in IND_THUNK_* Jan Beulich
@ 2020-09-28 12:34 ` Jan Beulich
  6 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2020-09-28 12:34 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper, Wei Liu, Roger Pau Monné

On 28.09.2020 14:28, Jan Beulich wrote:
> Parts of this were discussed in the context of Andrew's CET-SS work.
> Further parts simply fit the underlying picture. And the two final
> patches get attached here simply because of their dependency: Patch
> 4 was sent standalone already as v2, and is unchanged from that,
> while patch 6 is new.

And I should perhaps clarify: I'm resending the initial part of this
mainly to revive the discussion. There was some file name disagreement,
which is why I didn't commit at least the 2st patch here so far. But
there were no alternative suggestions that I would consider acceptable,
and hence I've kept the name as is.

Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/6] x86: reduce CET-SS related #ifdef-ary
  2020-09-28 12:30 ` [PATCH v2 2/6] x86: reduce CET-SS related #ifdef-ary Jan Beulich
@ 2020-09-28 12:37   ` Jan Beulich
  2020-10-13 11:40     ` Andrew Cooper
  2020-10-13 11:20   ` Andrew Cooper
  1 sibling, 1 reply; 21+ messages in thread
From: Jan Beulich @ 2020-09-28 12:37 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On 28.09.2020 14:30, Jan Beulich wrote:
> Commit b586a81b7a90 ("x86/CET: Fix build following c/s 43b98e7190") had
> to introduce a number of #ifdef-s to make the build work with older tool
> chains. Introduce an assembler macro covering for tool chains not
> knowing of CET-SS, allowing some conditionals where just SETSSBSY is the
> problem to be dropped again.
> 
> No change to generated code.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Now that I've done this I'm no longer sure which direction is better to
> follow: On one hand this introduces dead code (even if just NOPs) into
> CET-SS-disabled builds.

A possible compromise here might be to ...

> --- a/xen/include/asm-x86/asm-defns.h
> +++ b/xen/include/asm-x86/asm-defns.h
> @@ -7,3 +7,9 @@
>      .byte 0x0f, 0x01, 0xcb
>  .endm
>  #endif
> +
> +#ifndef CONFIG_HAS_AS_CET_SS
> +.macro setssbsy
> +    .byte 0xf3, 0x0f, 0x01, 0xe8
> +.endm
> +#endif

... comment out this macro's body. If we went this route, incssp
and wrssp could be dealt with in similar ways, to allow dropping
further #ifdef-s.

Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 5/6] x86: guard against straight-line speculation past RET
  2020-09-28 12:31 ` [PATCH v2 5/6] x86: guard against straight-line speculation past RET Jan Beulich
@ 2020-10-08 16:28   ` Roger Pau Monné
  2020-10-13 10:33     ` Jan Beulich
  2020-10-13 12:00   ` Andrew Cooper
  1 sibling, 1 reply; 21+ messages in thread
From: Roger Pau Monné @ 2020-10-08 16:28 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Andrew Cooper, Wei Liu

On Mon, Sep 28, 2020 at 02:31:49PM +0200, Jan Beulich wrote:
> Under certain conditions CPUs can speculate into the instruction stream
> past a RET instruction. Guard against this just like 3b7dab93f240
> ("x86/spec-ctrl: Protect against CALL/JMP straight-line speculation")
> did - by inserting an "INT $3" insn. It's merely the mechanics of how to
> achieve this that differ: A set of macros gets introduced to post-
> process RET insns issued by the compiler (or living in assembly files).
> 
> Unfortunately for clang this requires further features their built-in
> assembler doesn't support: We need to be able to override insn mnemonics
> produced by the compiler (which may be impossible, if internally
> assembly mnemonics never get generated), and we want to use \(text)
> escaping / quoting in the auxiliary macro.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Code LGTM.

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

See below for the TBD.

> ---
> TBD: Should this depend on CONFIG_SPECULATIVE_HARDEN_BRANCH?

I don't see the additions done in 3b7dab93f240 being guarded by
CONFIG_SPECULATIVE_HARDEN_BRANCH, so in that regard I would say no.
However those are already guarded by CONFIG_INDIRECT_THUNK so it's
slightly weird that the addition of such protections cannot be turned
off in any way.

I would be fine with having the additions done in 3b7dab93f240
protected by CONFIG_SPECULATIVE_HARDEN_BRANCH, and then the additions
done here also.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 6/6] x86: limit amount of INT3 in IND_THUNK_*
  2020-09-28 12:32 ` [PATCH v2 6/6] x86: limit amount of INT3 in IND_THUNK_* Jan Beulich
@ 2020-10-08 16:35   ` Roger Pau Monné
  2020-10-13 12:58   ` Andrew Cooper
  1 sibling, 0 replies; 21+ messages in thread
From: Roger Pau Monné @ 2020-10-08 16:35 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: xen-devel, Wei Liu

On Mon, Sep 28, 2020 at 02:32:24PM +0200, Jan Beulich wrote:
> There's no point having every replacement variant to also specify the
> INT3 - just have it once in the base macro. When patching, NOPs will get
> inserted, which are fine to speculate through (until reaching the INT3).
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> I also wonder whether the LFENCE in IND_THUNK_RETPOLINE couldn't be
> replaced by INT3 as well. Of course the effect will be marginal, as the
> size of the thunk will still be 16 bytes when including tail padding
> resulting from alignment.

I think Andrew is the best one to have an opinion on this.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 5/6] x86: guard against straight-line speculation past RET
  2020-10-08 16:28   ` Roger Pau Monné
@ 2020-10-13 10:33     ` Jan Beulich
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2020-10-13 10:33 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Andrew Cooper, Wei Liu

On 08.10.2020 18:28, Roger Pau Monné wrote:
> On Mon, Sep 28, 2020 at 02:31:49PM +0200, Jan Beulich wrote:
>> Under certain conditions CPUs can speculate into the instruction stream
>> past a RET instruction. Guard against this just like 3b7dab93f240
>> ("x86/spec-ctrl: Protect against CALL/JMP straight-line speculation")
>> did - by inserting an "INT $3" insn. It's merely the mechanics of how to
>> achieve this that differ: A set of macros gets introduced to post-
>> process RET insns issued by the compiler (or living in assembly files).
>>
>> Unfortunately for clang this requires further features their built-in
>> assembler doesn't support: We need to be able to override insn mnemonics
>> produced by the compiler (which may be impossible, if internally
>> assembly mnemonics never get generated), and we want to use \(text)
>> escaping / quoting in the auxiliary macro.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Code LGTM.
> 
> Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> ---
>> TBD: Should this depend on CONFIG_SPECULATIVE_HARDEN_BRANCH?
> 
> I don't see the additions done in 3b7dab93f240 being guarded by
> CONFIG_SPECULATIVE_HARDEN_BRANCH, so in that regard I would say no.
> However those are already guarded by CONFIG_INDIRECT_THUNK so it's
> slightly weird that the addition of such protections cannot be turned
> off in any way.
> 
> I would be fine with having the additions done in 3b7dab93f240
> protected by CONFIG_SPECULATIVE_HARDEN_BRANCH, and then the additions
> done here also.

Okay, perhaps I'll make a separate patch then to add the conditional
at all respective places.

Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/6] x86: replace __ASM_{CL,ST}AC
  2020-09-28 12:29 ` [PATCH v2 1/6] x86: replace __ASM_{CL,ST}AC Jan Beulich
@ 2020-10-13 11:04   ` Andrew Cooper
  2020-10-13 11:21     ` Jan Beulich
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Cooper @ 2020-10-13 11:04 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné

On 28/09/2020 13:29, Jan Beulich wrote:
> --- a/xen/arch/x86/arch.mk
> +++ b/xen/arch/x86/arch.mk
> @@ -20,6 +20,7 @@ $(call as-option-add,CFLAGS,CC,"rdrand %
>  $(call as-option-add,CFLAGS,CC,"rdfsbase %rax",-DHAVE_AS_FSGSBASE)
>  $(call as-option-add,CFLAGS,CC,"xsaveopt (%rax)",-DHAVE_AS_XSAVEOPT)
>  $(call as-option-add,CFLAGS,CC,"rdseed %eax",-DHAVE_AS_RDSEED)
> +$(call as-option-add,CFLAGS,CC,"clac",-DHAVE_AS_CLAC_STAC)

Kconfig

> --- /dev/null
> +++ b/xen/include/asm-x86/asm-defns.h
> @@ -0,0 +1,9 @@
> +#ifndef HAVE_AS_CLAC_STAC
> +.macro clac
> +    .byte 0x0f, 0x01, 0xca
> +.endm
> +
> +.macro stac
> +    .byte 0x0f, 0x01, 0xcb
> +.endm
> +#endif
> --- a/xen/include/asm-x86/asm_defns.h
> +++ b/xen/include/asm-x86/asm_defns.h

We cannot have two files which differ by the vertical positioning of a dash.

How about asm-insn.h for the former, seeing as that is what it contains.

~Andrew


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/6] x86: reduce CET-SS related #ifdef-ary
  2020-09-28 12:30 ` [PATCH v2 2/6] x86: reduce CET-SS related #ifdef-ary Jan Beulich
  2020-09-28 12:37   ` Jan Beulich
@ 2020-10-13 11:20   ` Andrew Cooper
  2020-10-13 11:26     ` Jan Beulich
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Cooper @ 2020-10-13 11:20 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné

On 28/09/2020 13:30, Jan Beulich wrote:
> Commit b586a81b7a90 ("x86/CET: Fix build following c/s 43b98e7190") had
> to introduce a number of #ifdef-s to make the build work with older tool
> chains. Introduce an assembler macro covering for tool chains not
> knowing of CET-SS, allowing some conditionals where just SETSSBSY is the
> problem to be dropped again.
>
> No change to generated code.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Now that I've done this I'm no longer sure which direction is better to
> follow: On one hand this introduces dead code (even if just NOPs) into
> CET-SS-disabled builds. Otoh this is a step towards breaking the tool
> chain version dependency of the feature.

I've said before.  You cannot break the toolchain dependency without
hardcoding memory operands.  I'm not prepared to let that happen.

There is no problem requiring newer toolchains for newer features
(you're definitely not having CET-IBT, for example), and there is a
(unacceptably, IMO) large cost to this work.

~Andrew


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/6] x86: replace __ASM_{CL,ST}AC
  2020-10-13 11:04   ` Andrew Cooper
@ 2020-10-13 11:21     ` Jan Beulich
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2020-10-13 11:21 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On 13.10.2020 13:04, Andrew Cooper wrote:
> On 28/09/2020 13:29, Jan Beulich wrote:
>> --- a/xen/arch/x86/arch.mk
>> +++ b/xen/arch/x86/arch.mk
>> @@ -20,6 +20,7 @@ $(call as-option-add,CFLAGS,CC,"rdrand %
>>  $(call as-option-add,CFLAGS,CC,"rdfsbase %rax",-DHAVE_AS_FSGSBASE)
>>  $(call as-option-add,CFLAGS,CC,"xsaveopt (%rax)",-DHAVE_AS_XSAVEOPT)
>>  $(call as-option-add,CFLAGS,CC,"rdseed %eax",-DHAVE_AS_RDSEED)
>> +$(call as-option-add,CFLAGS,CC,"clac",-DHAVE_AS_CLAC_STAC)
> 
> Kconfig

I know that's your view, and you know I disagree. I don't see the
thread I had started to have lead to any consensus.

>> --- /dev/null
>> +++ b/xen/include/asm-x86/asm-defns.h
>> @@ -0,0 +1,9 @@
>> +#ifndef HAVE_AS_CLAC_STAC
>> +.macro clac
>> +    .byte 0x0f, 0x01, 0xca
>> +.endm
>> +
>> +.macro stac
>> +    .byte 0x0f, 0x01, 0xcb
>> +.endm
>> +#endif
>> --- a/xen/include/asm-x86/asm_defns.h
>> +++ b/xen/include/asm-x86/asm_defns.h
> 
> We cannot have two files which differ by the vertical positioning of a dash.

Why "cannot"? One is the helper of the other, so them being named almost
identically is quite sensible imo (and no-one is supposed to include the
new one directly). In any event I'd at most see this be "we don't want to".

> How about asm-insn.h for the former, seeing as that is what it contains.

Until "x86: fold indirect_thunk_asm.h into asm-defns.h", where it starts
to be more than just plain insn replacements. And I suspect more non-insn
macros will appear over time. I'd have suggested asm-macros.h in case the
present name really can't be reached consensus on, but we have a
(generated) file of this name already.

Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/6] x86: reduce CET-SS related #ifdef-ary
  2020-10-13 11:20   ` Andrew Cooper
@ 2020-10-13 11:26     ` Jan Beulich
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2020-10-13 11:26 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On 13.10.2020 13:20, Andrew Cooper wrote:
> On 28/09/2020 13:30, Jan Beulich wrote:
>> Commit b586a81b7a90 ("x86/CET: Fix build following c/s 43b98e7190") had
>> to introduce a number of #ifdef-s to make the build work with older tool
>> chains. Introduce an assembler macro covering for tool chains not
>> knowing of CET-SS, allowing some conditionals where just SETSSBSY is the
>> problem to be dropped again.
>>
>> No change to generated code.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>> ---
>> Now that I've done this I'm no longer sure which direction is better to
>> follow: On one hand this introduces dead code (even if just NOPs) into
>> CET-SS-disabled builds. Otoh this is a step towards breaking the tool
>> chain version dependency of the feature.
> 
> I've said before.  You cannot break the toolchain dependency without
> hardcoding memory operands.  I'm not prepared to let that happen.
> 
> There is no problem requiring newer toolchains for newer features
> (you're definitely not having CET-IBT, for example), and there is a
> (unacceptably, IMO) large cost to this work.

I'm aware of your view. What remains unclear to me is whether your
reply is merely a remark on this post-commit-message comment, or
whether it is an objection to the tidying (as I view it) the patch
does.

Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/6] x86: reduce CET-SS related #ifdef-ary
  2020-09-28 12:37   ` Jan Beulich
@ 2020-10-13 11:40     ` Andrew Cooper
  2020-10-13 13:24       ` Jan Beulich
  0 siblings, 1 reply; 21+ messages in thread
From: Andrew Cooper @ 2020-10-13 11:40 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On 28/09/2020 13:37, Jan Beulich wrote:
> On 28.09.2020 14:30, Jan Beulich wrote:
>> Commit b586a81b7a90 ("x86/CET: Fix build following c/s 43b98e7190") had
>> to introduce a number of #ifdef-s to make the build work with older tool
>> chains. Introduce an assembler macro covering for tool chains not
>> knowing of CET-SS, allowing some conditionals where just SETSSBSY is the
>> problem to be dropped again.
>>
>> No change to generated code.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>> ---
>> Now that I've done this I'm no longer sure which direction is better to
>> follow: On one hand this introduces dead code (even if just NOPs) into
>> CET-SS-disabled builds.
> A possible compromise here might be to ...
>
>> --- a/xen/include/asm-x86/asm-defns.h
>> +++ b/xen/include/asm-x86/asm-defns.h
>> @@ -7,3 +7,9 @@
>>      .byte 0x0f, 0x01, 0xcb
>>  .endm
>>  #endif
>> +
>> +#ifndef CONFIG_HAS_AS_CET_SS
>> +.macro setssbsy
>> +    .byte 0xf3, 0x0f, 0x01, 0xe8
>> +.endm
>> +#endif
> ... comment out this macro's body. If we went this route, incssp
> and wrssp could be dealt with in similar ways, to allow dropping
> further #ifdef-s.

No, because how you've got something which reads as a real instruction
which sometimes disappears into nothing.  (Interestingly, zero length
alternatives do appear to compile, and this is clearly a bug.)

The thing which matters is the clarity of code in its surrounding
context, and this isn't an improvement.

~Andrew


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 5/6] x86: guard against straight-line speculation past RET
  2020-09-28 12:31 ` [PATCH v2 5/6] x86: guard against straight-line speculation past RET Jan Beulich
  2020-10-08 16:28   ` Roger Pau Monné
@ 2020-10-13 12:00   ` Andrew Cooper
  2020-10-13 13:34     ` Jan Beulich
  1 sibling, 1 reply; 21+ messages in thread
From: Andrew Cooper @ 2020-10-13 12:00 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné

On 28/09/2020 13:31, Jan Beulich wrote:
> Under certain conditions CPUs can speculate into the instruction stream
> past a RET instruction. Guard against this just like 3b7dab93f240
> ("x86/spec-ctrl: Protect against CALL/JMP straight-line speculation")
> did - by inserting an "INT $3" insn. It's merely the mechanics of how to
> achieve this that differ: A set of macros gets introduced to post-
> process RET insns issued by the compiler (or living in assembly files).
>
> Unfortunately for clang this requires further features their built-in
> assembler doesn't support: We need to be able to override insn mnemonics
> produced by the compiler (which may be impossible, if internally
> assembly mnemonics never get generated), and we want to use \(text)
> escaping / quoting in the auxiliary macro.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> TBD: Should this depend on CONFIG_SPECULATIVE_HARDEN_BRANCH?
> TBD: Would be nice to avoid the additions in .init.text, but a query to
>      the binutils folks regarding the ability to identify the section
>      stuff is in (by Peter Zijlstra over a year ago:
>      https://sourceware.org/pipermail/binutils/2019-July/107528.html)
>      has been left without helpful replies.
> ---
> v2: Fix build with newer clang. Use int3 mnemonic. Also override retq.
>
> --- a/xen/Makefile
> +++ b/xen/Makefile
> @@ -145,7 +145,15 @@ t2 = $(call as-insn,$(CC) -I$(BASEDIR)/i
>  # https://bugs.llvm.org/show_bug.cgi?id=36110
>  t3 = $(call as-insn,$(CC),".macro FOO;.endm"$(close); asm volatile $(open)".macro FOO;.endm",-no-integrated-as)
>  
> -CLANG_FLAGS += $(call or,$(t1),$(t2),$(t3))
> +# Check whether \(text) escaping in macro bodies is supported.
> +t4 = $(call as-insn,$(CC),".macro m ret:req; \\(ret) $$\\ret; .endm; m 8",,-no-integrated-as)
> +
> +# Check whether macros can override insn mnemonics in inline assembly.
> +t5 = $(call as-insn,$(CC),".macro ret; .error; .endm; .macro retq; .error; .endm",-no-integrated-as)
> +
> +acc1 := $(call or,$(t1),$(t2),$(t3),$(t4))
> +
> +CLANG_FLAGS += $(call or,$(acc1),$(t5))

I'm not happy taking this until there is toolchain support visible in
the future.

We *cannot* rule out the use of IAS forever more, because there are
features far more important than ret speculation which depend on it.

>  endif
>  
>  CLANG_FLAGS += -Werror=unknown-warning-option
> --- a/xen/include/asm-x86/asm-defns.h
> +++ b/xen/include/asm-x86/asm-defns.h
> @@ -50,3 +50,22 @@
>  .macro INDIRECT_JMP arg:req
>      INDIRECT_BRANCH jmp \arg
>  .endm
> +
> +/*
> + * To guard against speculation past RET, insert a breakpoint insn
> + * immediately after them.
> + */
> +.macro ret operand:vararg
> +    ret$ \operand
> +.endm
> +.macro retq operand:vararg
> +    ret$ \operand
> +.endm
> +.macro ret$ operand:vararg
> +    .purgem ret
> +    ret \operand

You're substituting retq for ret, which defeats the purpose of unwrapping.

I will repeat my previous feedback.  Do away with this
wrapping/unwrapping and just use a raw .byte.  Its simpler and faster.

~Andrew


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 6/6] x86: limit amount of INT3 in IND_THUNK_*
  2020-09-28 12:32 ` [PATCH v2 6/6] x86: limit amount of INT3 in IND_THUNK_* Jan Beulich
  2020-10-08 16:35   ` Roger Pau Monné
@ 2020-10-13 12:58   ` Andrew Cooper
  1 sibling, 0 replies; 21+ messages in thread
From: Andrew Cooper @ 2020-10-13 12:58 UTC (permalink / raw)
  To: Jan Beulich, xen-devel; +Cc: Wei Liu, Roger Pau Monné

On 28/09/2020 13:32, Jan Beulich wrote:
> There's no point having every replacement variant to also specify the
> INT3 - just have it once in the base macro. When patching, NOPs will get
> inserted, which are fine to speculate through (until reaching the INT3).
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> I also wonder whether the LFENCE in IND_THUNK_RETPOLINE couldn't be
> replaced by INT3 as well. Of course the effect will be marginal, as the
> size of the thunk will still be 16 bytes when including tail padding
> resulting from alignment.

There are surprising performance implications from the choice of
speculation blocker.  RSB filling in particular had a benefit (up to 6%
iirc) from unrolling the loop.

Any differences here are likely to be marginal, whereas for inline
retpoline, the code volume reduction might easily be the winning factor.

> ---
> v2: New.
>
> --- a/xen/arch/x86/indirect-thunk.S
> +++ b/xen/arch/x86/indirect-thunk.S
> @@ -11,6 +11,8 @@
>  
>  #include <asm/asm_defns.h>
>  
> +.purgem ret

This needs a comment.

~Andrew


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/6] x86: reduce CET-SS related #ifdef-ary
  2020-10-13 11:40     ` Andrew Cooper
@ 2020-10-13 13:24       ` Jan Beulich
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2020-10-13 13:24 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On 13.10.2020 13:40, Andrew Cooper wrote:
> (Interestingly, zero length
> alternatives do appear to compile, and this is clearly a bug.)

Why? The replacement code may be intended to be all NOPs.

Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 5/6] x86: guard against straight-line speculation past RET
  2020-10-13 12:00   ` Andrew Cooper
@ 2020-10-13 13:34     ` Jan Beulich
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Beulich @ 2020-10-13 13:34 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel, Wei Liu, Roger Pau Monné

On 13.10.2020 14:00, Andrew Cooper wrote:
> On 28/09/2020 13:31, Jan Beulich wrote:
>> --- a/xen/Makefile
>> +++ b/xen/Makefile
>> @@ -145,7 +145,15 @@ t2 = $(call as-insn,$(CC) -I$(BASEDIR)/i
>>  # https://bugs.llvm.org/show_bug.cgi?id=36110
>>  t3 = $(call as-insn,$(CC),".macro FOO;.endm"$(close); asm volatile $(open)".macro FOO;.endm",-no-integrated-as)
>>  
>> -CLANG_FLAGS += $(call or,$(t1),$(t2),$(t3))
>> +# Check whether \(text) escaping in macro bodies is supported.
>> +t4 = $(call as-insn,$(CC),".macro m ret:req; \\(ret) $$\\ret; .endm; m 8",,-no-integrated-as)
>> +
>> +# Check whether macros can override insn mnemonics in inline assembly.
>> +t5 = $(call as-insn,$(CC),".macro ret; .error; .endm; .macro retq; .error; .endm",-no-integrated-as)
>> +
>> +acc1 := $(call or,$(t1),$(t2),$(t3),$(t4))
>> +
>> +CLANG_FLAGS += $(call or,$(acc1),$(t5))
> 
> I'm not happy taking this until there is toolchain support visible in
> the future.
> 
> We *cannot* rule out the use of IAS forever more, because there are
> features far more important than ret speculation which depend on it.

So what do you suggest? We can't have both, afaics, so we need to
pick either being able to use the integrated assembler or being
able to guard RET.

>> --- a/xen/include/asm-x86/asm-defns.h
>> +++ b/xen/include/asm-x86/asm-defns.h
>> @@ -50,3 +50,22 @@
>>  .macro INDIRECT_JMP arg:req
>>      INDIRECT_BRANCH jmp \arg
>>  .endm
>> +
>> +/*
>> + * To guard against speculation past RET, insert a breakpoint insn
>> + * immediately after them.
>> + */
>> +.macro ret operand:vararg
>> +    ret$ \operand
>> +.endm
>> +.macro retq operand:vararg
>> +    ret$ \operand
>> +.endm
>> +.macro ret$ operand:vararg
>> +    .purgem ret
>> +    ret \operand
> 
> You're substituting retq for ret, which defeats the purpose of unwrapping.

I'm afraid I don't understand the "defeats" aspect.

> I will repeat my previous feedback.  Do away with this
> wrapping/unwrapping and just use a raw .byte.  Its simpler and faster.

Well, I could now also repeat my prior response to your prior
feedback, but since iirc you didn't reply back then I don't
expect you would now. Instead I'll - once again - give in despite
my believe that this is the cleaner approach, and that in cases
like this one - when there are pros and cons to either approach -
it should be the author's choice rather than the reviewer's.

Jan


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2020-10-13 13:35 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-28 12:28 [PATCH v2 0/6] x86: some assembler macro rework Jan Beulich
2020-09-28 12:29 ` [PATCH v2 1/6] x86: replace __ASM_{CL,ST}AC Jan Beulich
2020-10-13 11:04   ` Andrew Cooper
2020-10-13 11:21     ` Jan Beulich
2020-09-28 12:30 ` [PATCH v2 2/6] x86: reduce CET-SS related #ifdef-ary Jan Beulich
2020-09-28 12:37   ` Jan Beulich
2020-10-13 11:40     ` Andrew Cooper
2020-10-13 13:24       ` Jan Beulich
2020-10-13 11:20   ` Andrew Cooper
2020-10-13 11:26     ` Jan Beulich
2020-09-28 12:30 ` [PATCH v2 3/6] x86: drop ASM_{CL,ST}AC Jan Beulich
2020-09-28 12:30 ` [PATCH v2 4/6] x86: fold indirect_thunk_asm.h into asm-defns.h Jan Beulich
2020-09-28 12:31 ` [PATCH v2 5/6] x86: guard against straight-line speculation past RET Jan Beulich
2020-10-08 16:28   ` Roger Pau Monné
2020-10-13 10:33     ` Jan Beulich
2020-10-13 12:00   ` Andrew Cooper
2020-10-13 13:34     ` Jan Beulich
2020-09-28 12:32 ` [PATCH v2 6/6] x86: limit amount of INT3 in IND_THUNK_* Jan Beulich
2020-10-08 16:35   ` Roger Pau Monné
2020-10-13 12:58   ` Andrew Cooper
2020-09-28 12:34 ` [PATCH v2 0/6] x86: some assembler macro rework Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).