* [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction
@ 2018-02-07 16:05 Jan Beulich
  2018-02-07 16:12 ` [PATCH v2 1/7] x86: slightly reduce Meltdown band-aid overhead Jan Beulich
                   ` (10 more replies)
  0 siblings, 11 replies; 24+ messages in thread
From: Jan Beulich @ 2018-02-07 16:05 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

1: slightly reduce Meltdown band-aid overhead
2: remove CR reads from exit-to-guest path
3: introduce altinstruction_nop assembler macro
4: NOP out most XPTI entry/exit code when it's not in use
5: avoid double CR3 reload when switching to guest user mode
6: disable XPTI when RDCL_NO
7: x86: log XPTI enabled status

I won't mind if some of them are deemed pointless, but I think
1 (because of a measurable improvement of 1-3%), 4 (helping the
"xpti=no" case, even if only a little; taking 3 as prereq), and
6+7 should be considered seriously.

Signed-off-by: Jan Beulich <jbeulich@suse.com>



* [PATCH v2 1/7] x86: slightly reduce Meltdown band-aid overhead
  2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
@ 2018-02-07 16:12 ` Jan Beulich
  2018-02-07 19:35   ` Andrew Cooper
  2018-02-07 16:12 ` [PATCH v2 2/7] x86: remove CR reads from exit-to-guest path Jan Beulich
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-02-07 16:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

I'm not sure why I didn't do this right away: By avoiding the use of
global PTEs in the cloned directmap, there's no need to fiddle with
CR4.PGE on any of the entry paths. Only the exit paths need to flush
global mappings.

The reduced flushing, however, implies that we now need to have
interrupts off on all entry paths until after the page table switch, so
that flush IPIs can't arrive with the restricted page tables still
active, but only a non-global flush happening with the CR3 loads. Along
those lines the "sync" IPI after L4 entry updates now needs to become a
real (and global) flush IPI, so that inside Xen we'll also pick up such
changes.
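
As a purely illustrative aside (not part of the patch, and not the
actual Xen code - the helper names are made up), the distinction the
description relies on can be summarised in C roughly as:

#define X86_CR4_PGE (1UL << 7)

/*
 * Entry side: a plain CR3 write flushes non-global TLB entries only,
 * which is sufficient once the cloned directmap carries no global
 * PTEs - but it must happen with interrupts still disabled, so that
 * no flush IPI can be serviced while the restricted tables are active.
 */
static inline void switch_to_restricted_pt(unsigned long cr3)
{
    asm volatile ( "mov %0, %%cr3" : : "r" (cr3) : "memory" );
}

/*
 * Exit side: global (guest) mappings need dropping as well, hence the
 * CR3 write is bracketed by clearing and restoring CR4.PGE.
 */
static inline void switch_to_full_pt(unsigned long cr3)
{
    unsigned long cr4;

    asm volatile ( "mov %%cr4, %0" : "=r" (cr4) );
    asm volatile ( "mov %0, %%cr4" : : "r" (cr4 & ~X86_CR4_PGE) );
    asm volatile ( "mov %0, %%cr3" : : "r" (cr3) : "memory" );
    asm volatile ( "mov %0, %%cr4" : : "r" (cr4) );
}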

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Re-phrase description. Re-base.

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3826,18 +3826,14 @@ long do_mmu_update(
     {
         /*
          * Force other vCPU-s of the affected guest to pick up L4 entry
-         * changes (if any). Issue a flush IPI with empty operation mask to
-         * facilitate this (including ourselves waiting for the IPI to
-         * actually have arrived). Utilize the fact that FLUSH_VA_VALID is
-         * meaningless without FLUSH_CACHE, but will allow to pass the no-op
-         * check in flush_area_mask().
+         * changes (if any).
          */
         unsigned int cpu = smp_processor_id();
         cpumask_t *mask = per_cpu(scratch_cpumask, cpu);
 
         cpumask_andnot(mask, pt_owner->dirty_cpumask, cpumask_of(cpu));
         if ( !cpumask_empty(mask) )
-            flush_area_mask(mask, ZERO_BLOCK_PTR, FLUSH_VA_VALID);
+            flush_mask(mask, FLUSH_TLB_GLOBAL);
     }
 
     perfc_add(num_page_updates, i);
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -728,6 +728,7 @@ static int clone_mapping(const void *ptr
     }
 
     pl1e += l1_table_offset(linear);
+    flags &= ~_PAGE_GLOBAL;
 
     if ( l1e_get_flags(*pl1e) & _PAGE_PRESENT )
     {
@@ -1009,8 +1010,17 @@ void __init smp_prepare_cpus(unsigned in
     if ( rc )
         panic("Error %d setting up PV root page table\n", rc);
     if ( per_cpu(root_pgt, 0) )
+    {
         get_cpu_info()->pv_cr3 = __pa(per_cpu(root_pgt, 0));
 
+        /*
+         * All entry points which may need to switch page tables have to start
+         * with interrupts off. Re-write what pv_trap_init() has put there.
+         */
+        _set_gate(idt_table + LEGACY_SYSCALL_VECTOR, SYS_DESC_irq_gate, 3,
+                  &int80_direct_trap);
+    }
+
     set_nr_sockets();
 
     socket_cpumask = xzalloc_array(cpumask_t *, nr_sockets);
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -200,7 +200,7 @@ ENTRY(compat_post_handle_exception)
 
 /* See lstar_enter for entry register state. */
 ENTRY(cstar_enter)
-        sti
+        /* sti could live here when we don't switch page tables below. */
         CR4_PV32_RESTORE
         movq  8(%rsp),%rax /* Restore %rax. */
         movq  $FLAT_KERNEL_SS,8(%rsp)
@@ -220,9 +220,10 @@ ENTRY(cstar_enter)
         jz    .Lcstar_cr3_okay
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
         neg   %rcx
-        write_cr3 rcx, rdi, rsi
+        mov   %rcx, %cr3
         movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Lcstar_cr3_okay:
+        sti
 
         movq  STACK_CPUINFO_FIELD(current_vcpu)(%rbx), %rbx
         movq  VCPU_domain(%rbx),%rcx
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -148,7 +148,7 @@ UNLIKELY_END(exit_cr3)
  * %ss must be saved into the space left by the trampoline.
  */
 ENTRY(lstar_enter)
-        sti
+        /* sti could live here when we don't switch page tables below. */
         movq  8(%rsp),%rax /* Restore %rax. */
         movq  $FLAT_KERNEL_SS,8(%rsp)
         pushq %r11
@@ -167,9 +167,10 @@ ENTRY(lstar_enter)
         jz    .Llstar_cr3_okay
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
         neg   %rcx
-        write_cr3 rcx, rdi, rsi
+        mov   %rcx, %cr3
         movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Llstar_cr3_okay:
+        sti
 
         movq  STACK_CPUINFO_FIELD(current_vcpu)(%rbx), %rbx
         testb $TF_kernel_mode,VCPU_thread_flags(%rbx)
@@ -252,7 +253,7 @@ process_trap:
         jmp  test_all_events
 
 ENTRY(sysenter_entry)
-        sti
+        /* sti could live here when we don't switch page tables below. */
         pushq $FLAT_USER_SS
         pushq $0
         pushfq
@@ -273,9 +274,10 @@ GLOBAL(sysenter_eflags_saved)
         jz    .Lsyse_cr3_okay
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
         neg   %rcx
-        write_cr3 rcx, rdi, rsi
+        mov   %rcx, %cr3
         movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Lsyse_cr3_okay:
+        sti
 
         movq  STACK_CPUINFO_FIELD(current_vcpu)(%rbx), %rbx
         cmpb  $0,VCPU_sysenter_disables_events(%rbx)
@@ -322,9 +324,10 @@ ENTRY(int80_direct_trap)
         jz    .Lint80_cr3_okay
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
         neg   %rcx
-        write_cr3 rcx, rdi, rsi
+        mov   %rcx, %cr3
         movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Lint80_cr3_okay:
+        sti
 
         cmpb  $0,untrusted_msi(%rip)
 UNLIKELY_START(ne, msi_check)
@@ -503,7 +506,7 @@ ENTRY(common_interrupt)
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
         neg   %rcx
 .Lintr_cr3_load:
-        write_cr3 rcx, rdi, rsi
+        mov   %rcx, %cr3
         xor   %ecx, %ecx
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
         testb $3, UREGS_cs(%rsp)
@@ -545,7 +548,7 @@ GLOBAL(handle_exception)
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
         neg   %rcx
 .Lxcpt_cr3_load:
-        write_cr3 rcx, rdi, rsi
+        mov   %rcx, %cr3
         xor   %ecx, %ecx
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
         testb $3, UREGS_cs(%rsp)
@@ -741,7 +744,7 @@ ENTRY(double_fault)
         jns   .Ldblf_cr3_load
         neg   %rbx
 .Ldblf_cr3_load:
-        write_cr3 rbx, rdi, rsi
+        mov   %rbx, %cr3
 .Ldblf_cr3_okay:
 
         movq  %rsp,%rdi
@@ -776,7 +779,7 @@ handle_ist_exception:
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
         neg   %rcx
 .List_cr3_load:
-        write_cr3 rcx, rdi, rsi
+        mov   %rcx, %cr3
         movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
 .List_cr3_okay:
 




* [PATCH v2 2/7] x86: remove CR reads from exit-to-guest path
  2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
  2018-02-07 16:12 ` [PATCH v2 1/7] x86: slightly reduce Meltdown band-aid overhead Jan Beulich
@ 2018-02-07 16:12 ` Jan Beulich
  2018-03-01 19:23   ` Andrew Cooper
  2018-02-07 16:13 ` [PATCH v2 3/7] x86: introduce altinstruction_nop assembler macro Jan Beulich
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-02-07 16:12 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

CR3 is - during normal operation - only ever loaded from v->arch.cr3,
so there's no need to read the actual control register. For CR4 we can
generally use the cached value on all synchronous entry and exit paths.
Drop the write_cr3 macro, as the two use sites are probably easier to
follow without its use.
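
As a side note, the VCPU_cr3 constant used by the assembly below comes
from the usual asm-offsets technique; a minimal sketch of that
mechanism (simplified structures, not the real Xen definitions) looks
like this:

#include <stddef.h>

struct vcpu_arch { unsigned long cr3; };
struct vcpu { struct vcpu_arch arch; };

/*
 * The compiler emits the member offset as an immediate into the
 * generated assembly; a build step then extracts it into a
 * "#define VCPU_cr3 <N>" header, so entry.S can say VCPU_cr3(%rbx)
 * instead of reading %cr3.
 */
#define OFFSET(sym, str, mem) \
    asm volatile ( "\n.ascii \"==> #define " #sym " %0 <==\"" \
                   : : "i" (offsetof(str, mem)) )

void __dummy__(void)
{
    OFFSET(VCPU_cr3, struct vcpu, arch.cr3);
}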

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Drop write_cr3 macro. Re-base.

--- a/xen/arch/x86/x86_64/asm-offsets.c
+++ b/xen/arch/x86/x86_64/asm-offsets.c
@@ -88,6 +88,7 @@ void __dummy__(void)
     OFFSET(VCPU_kernel_ss, struct vcpu, arch.pv_vcpu.kernel_ss);
     OFFSET(VCPU_iopl, struct vcpu, arch.pv_vcpu.iopl);
     OFFSET(VCPU_guest_context_flags, struct vcpu, arch.vgc_flags);
+    OFFSET(VCPU_cr3, struct vcpu, arch.cr3);
     OFFSET(VCPU_arch_msr, struct vcpu, arch.msr);
     OFFSET(VCPU_nmi_pending, struct vcpu, nmi_pending);
     OFFSET(VCPU_mce_pending, struct vcpu, mce_pending);
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -43,7 +43,7 @@ restore_all_guest:
         mov VCPUMSR_spec_ctrl_raw(%rdx), %r15d
 
         /* Copy guest mappings and switch to per-CPU root page table. */
-        mov   %cr3, %r9
+        mov   VCPU_cr3(%rbx), %r9
         GET_STACK_END(dx)
         mov   STACK_CPUINFO_FIELD(pv_cr3)(%rdx), %rdi
         movabs $PADDR_MASK & PAGE_MASK, %rsi
@@ -65,8 +65,13 @@ restore_all_guest:
         sub   $(ROOT_PAGETABLE_FIRST_XEN_SLOT - \
                 ROOT_PAGETABLE_LAST_XEN_SLOT - 1) * 8, %rdi
         rep movsq
+        mov   STACK_CPUINFO_FIELD(cr4)(%rdx), %rdi
         mov   %r9, STACK_CPUINFO_FIELD(xen_cr3)(%rdx)
-        write_cr3 rax, rdi, rsi
+        mov   %rdi, %rsi
+        and   $~X86_CR4_PGE, %rdi
+        mov   %rdi, %cr4
+        mov   %rax, %cr3
+        mov   %rsi, %cr4
 .Lrag_keep_cr3:
 
         /* Restore stashed SPEC_CTRL value. */
@@ -122,7 +127,12 @@ restore_all_xen:
          * so "g" will have to do.
          */
 UNLIKELY_START(g, exit_cr3)
-        write_cr3 rax, rdi, rsi
+        mov   %cr4, %rdi
+        mov   %rdi, %rsi
+        and   $~X86_CR4_PGE, %rdi
+        mov   %rdi, %cr4
+        mov   %rax, %cr3
+        mov   %rsi, %cr4
 UNLIKELY_END(exit_cr3)
 
         /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -205,15 +205,6 @@ void ret_from_intr(void);
 #define ASM_STAC ASM_AC(STAC)
 #define ASM_CLAC ASM_AC(CLAC)
 
-.macro write_cr3 val:req, tmp1:req, tmp2:req
-        mov   %cr4, %\tmp1
-        mov   %\tmp1, %\tmp2
-        and   $~X86_CR4_PGE, %\tmp1
-        mov   %\tmp1, %cr4
-        mov   %\val, %cr3
-        mov   %\tmp2, %cr4
-.endm
-
 #define CR4_PV32_RESTORE                                           \
         667: ASM_NOP5;                                             \
         .pushsection .altinstr_replacement, "ax";                  \





* [PATCH v2 3/7] x86: introduce altinstruction_nop assembler macro
  2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
  2018-02-07 16:12 ` [PATCH v2 1/7] x86: slightly reduce Meltdown band-aid overhead Jan Beulich
  2018-02-07 16:12 ` [PATCH v2 2/7] x86: remove CR reads from exit-to-guest path Jan Beulich
@ 2018-02-07 16:13 ` Jan Beulich
  2018-03-01 19:25   ` Andrew Cooper
  2018-02-07 16:13 ` [PATCH v2 4/7] x86: NOP out most XPTI entry/exit code when it's not in use Jan Beulich
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-02-07 16:13 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

This allows shortening (and making more obvious what they do) some
altinstruction_entry uses.
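
For reference, the five directives emitted by the new macro mirror the
layout of struct alt_instr; the field names below are illustrative
guesses based on the encoding rather than a copy of the Xen header:

#include <stdint.h>

struct alt_instr {
    int32_t  instr_offset;   /* .long \start - .    : original code, relative */
    int32_t  repl_offset;    /* .long \start - .    : "replacement" points back
                                                       at the original */
    uint16_t cpuid;          /* .word \feature      : feature bit to test */
    uint8_t  instrlen;       /* .byte \end - \start : bytes to patch */
    uint8_t  replacementlen; /* .byte 0             : nothing to copy, so the
                                                       whole range is NOP-filled */
};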

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -135,8 +135,7 @@ ENTRY(compat_restore_all_guest)
         jne   1b
 .Lcr4_alt_end:
         .section .altinstructions, "a"
-        altinstruction_entry .Lcr4_orig, .Lcr4_orig, X86_FEATURE_ALWAYS, \
-                             (.Lcr4_orig_end - .Lcr4_orig), 0
+        altinstruction_nop .Lcr4_orig, .Lcr4_orig_end, X86_FEATURE_ALWAYS
         altinstruction_entry .Lcr4_orig, .Lcr4_alt, X86_FEATURE_XEN_SMEP, \
                              (.Lcr4_orig_end - .Lcr4_orig), \
                              (.Lcr4_alt_end - .Lcr4_alt)
--- a/xen/include/asm-x86/alternative-asm.h
+++ b/xen/include/asm-x86/alternative-asm.h
@@ -17,6 +17,15 @@
     .byte \alt_len
 .endm
 
+/* As above, but to replace the entire range by suitable NOPs. */
+.macro altinstruction_nop start end feature
+    .long \start - .
+    .long \start - .
+    .word \feature
+    .byte \end - \start
+    .byte 0
+.endm
+
 .macro ALTERNATIVE oldinstr, newinstr, feature
 .Lold_start_\@:
     \oldinstr
--- a/xen/include/asm-x86/asm_defns.h
+++ b/xen/include/asm-x86/asm_defns.h
@@ -193,24 +193,24 @@ void ret_from_intr(void);
 
 #ifdef __ASSEMBLY__
 #define ASM_AC(op)                                                     \
-        661: ASM_NOP3;                                                 \
+        661: ASM_NOP3; 660:;                                           \
         .pushsection .altinstr_replacement, "ax";                      \
         662: __ASM_##op;                                               \
         .popsection;                                                   \
         .pushsection .altinstructions, "a";                            \
-        altinstruction_entry 661b, 661b, X86_FEATURE_ALWAYS, 3, 0;     \
-        altinstruction_entry 661b, 662b, X86_FEATURE_XEN_SMAP, 3, 3;       \
+        altinstruction_nop 661b, 660b, X86_FEATURE_ALWAYS;             \
+        altinstruction_entry 661b, 662b, X86_FEATURE_XEN_SMAP, 3, 3;   \
         .popsection
 
 #define ASM_STAC ASM_AC(STAC)
 #define ASM_CLAC ASM_AC(CLAC)
 
 #define CR4_PV32_RESTORE                                           \
-        667: ASM_NOP5;                                             \
+        667: ASM_NOP5; 669:;                                       \
         .pushsection .altinstr_replacement, "ax";                  \
         668: call cr4_pv32_restore;                                \
         .section .altinstructions, "a";                            \
-        altinstruction_entry 667b, 667b, X86_FEATURE_ALWAYS, 5, 0; \
+        altinstruction_nop 667b, 669b, X86_FEATURE_ALWAYS;         \
         altinstruction_entry 667b, 668b, X86_FEATURE_XEN_SMEP, 5, 5;   \
         altinstruction_entry 667b, 668b, X86_FEATURE_XEN_SMAP, 5, 5;   \
         .popsection





* [PATCH v2 4/7] x86: NOP out most XPTI entry/exit code when it's not in use
  2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
                   ` (2 preceding siblings ...)
  2018-02-07 16:13 ` [PATCH v2 3/7] x86: introduce altinstruction_nop assembler macro Jan Beulich
@ 2018-02-07 16:13 ` Jan Beulich
  2018-03-01 19:42   ` Andrew Cooper
  2018-02-07 16:14 ` [PATCH v2 5/7] x86: avoid double CR3 reload when switching to guest user mode Jan Beulich
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-02-07 16:13 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

Introduce a synthetic feature flag to use alternative instruction
patching to NOP out all code on entry/exit paths other than those
involved in NMI/#MC handling (the patching logic can't properly handle
those paths yet). Having NOPs here is generally better than using
conditional branches.

Also change the limit on the number of bytes we can patch in one go to
that resulting from the encoding in struct alt_instr - there's no point
reducing it below that limit, and without a check being in place that
the limit isn't actually exceeded, such an artificial boundary is a
latent risk.
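
To make the effect of a zero replacement length concrete, here is a
rough sketch of the patching step (not the actual apply_alternatives()
code; boot_cpu_has(), add_nops() and text_poke() are stand-ins for the
real helpers):

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct alt_instr {
    int32_t instr_offset, repl_offset;
    uint16_t cpuid;
    uint8_t instrlen, replacementlen;
};

extern bool boot_cpu_has(unsigned int feature);                 /* stand-in */
extern void add_nops(void *dst, unsigned int len);              /* stand-in */
extern void text_poke(void *addr, const void *src, size_t len); /* stand-in */

static void apply_one(struct alt_instr *a)
{
    uint8_t *orig = (uint8_t *)&a->instr_offset + a->instr_offset;
    const uint8_t *repl = (const uint8_t *)&a->repl_offset + a->repl_offset;
    uint8_t buf[255];                       /* MAX_PATCH_LEN */

    if ( !boot_cpu_has(a->cpuid) )
        return;

    memcpy(buf, repl, a->replacementlen);
    /* replacementlen == 0 (altinstruction_nop) => pure NOP fill. */
    add_nops(buf + a->replacementlen, a->instrlen - a->replacementlen);
    text_poke(orig, buf, a->instrlen);
}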

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Introduce and use ALTERNATIVE_NOP. Re-base.

--- a/xen/arch/x86/alternative.c
+++ b/xen/arch/x86/alternative.c
@@ -24,7 +24,7 @@
 #include <asm/nmi.h>
 #include <xen/livepatch.h>
 
-#define MAX_PATCH_LEN (255-1)
+#define MAX_PATCH_LEN 255
 
 extern struct alt_instr __alt_instructions[], __alt_instructions_end[];
 
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3711,7 +3711,7 @@ long do_mmu_update(
                      * to the page lock we hold, its pinned status, and uses on
                      * this (v)CPU.
                      */
-                    if ( !rc && this_cpu(root_pgt) &&
+                    if ( !rc && !cpu_has_no_xpti &&
                          ((page->u.inuse.type_info & PGT_count_mask) >
                           (1 + !!(page->u.inuse.type_info & PGT_pinned) +
                            (pagetable_get_pfn(curr->arch.guest_table) == mfn) +
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -169,6 +169,9 @@ static int __init parse_smap_param(const
 }
 custom_param("smap", parse_smap_param);
 
+static int8_t __initdata opt_xpti = -1;
+boolean_param("xpti", opt_xpti);
+
 bool __read_mostly acpi_disabled;
 bool __initdata acpi_force;
 static char __initdata acpi_param[10] = "";
@@ -1541,6 +1544,13 @@ void __init noreturn __start_xen(unsigne
 
     cr4_pv32_mask = mmu_cr4_features & XEN_CR4_PV32_BITS;
 
+    if ( opt_xpti < 0 )
+        opt_xpti = boot_cpu_data.x86_vendor != X86_VENDOR_AMD;
+    if ( opt_xpti )
+        setup_clear_cpu_cap(X86_FEATURE_NO_XPTI);
+    else
+        setup_force_cpu_cap(X86_FEATURE_NO_XPTI);
+
     if ( cpu_has_fsgsbase )
         set_in_cr4(X86_CR4_FSGSBASE);
 
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -741,8 +741,6 @@ static int clone_mapping(const void *ptr
     return 0;
 }
 
-static __read_mostly int8_t opt_xpti = -1;
-boolean_param("xpti", opt_xpti);
 DEFINE_PER_CPU(root_pgentry_t *, root_pgt);
 
 static int setup_cpu_root_pgt(unsigned int cpu)
@@ -751,7 +749,7 @@ static int setup_cpu_root_pgt(unsigned i
     unsigned int off;
     int rc;
 
-    if ( !opt_xpti )
+    if ( cpu_has_no_xpti )
         return 0;
 
     rpt = alloc_xen_pagetable();
@@ -1003,9 +1001,6 @@ void __init smp_prepare_cpus(unsigned in
 
     stack_base[0] = stack_start;
 
-    if ( opt_xpti < 0 )
-        opt_xpti = boot_cpu_data.x86_vendor != X86_VENDOR_AMD;
-
     rc = setup_cpu_root_pgt(0);
     if ( rc )
         panic("Error %d setting up PV root page table\n", rc);
--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -199,7 +199,7 @@ ENTRY(compat_post_handle_exception)
 
 /* See lstar_enter for entry register state. */
 ENTRY(cstar_enter)
-        /* sti could live here when we don't switch page tables below. */
+        ALTERNATIVE nop, sti, X86_FEATURE_NO_XPTI
         CR4_PV32_RESTORE
         movq  8(%rsp),%rax /* Restore %rax. */
         movq  $FLAT_KERNEL_SS,8(%rsp)
@@ -214,6 +214,7 @@ ENTRY(cstar_enter)
         /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
 
         GET_STACK_END(bx)
+.Lcstar_cr3_start:
         mov   STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
         neg   %rcx
         jz    .Lcstar_cr3_okay
@@ -223,6 +224,8 @@ ENTRY(cstar_enter)
         movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Lcstar_cr3_okay:
         sti
+.Lcstar_cr3_end:
+        ALTERNATIVE_NOP .Lcstar_cr3_start, .Lcstar_cr3_end, X86_FEATURE_NO_XPTI
 
         movq  STACK_CPUINFO_FIELD(current_vcpu)(%rbx), %rbx
         movq  VCPU_domain(%rbx),%rcx
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -43,6 +43,7 @@ restore_all_guest:
         mov VCPUMSR_spec_ctrl_raw(%rdx), %r15d
 
         /* Copy guest mappings and switch to per-CPU root page table. */
+.Lrag_cr3_start:
         mov   VCPU_cr3(%rbx), %r9
         GET_STACK_END(dx)
         mov   STACK_CPUINFO_FIELD(pv_cr3)(%rdx), %rdi
@@ -50,7 +51,6 @@ restore_all_guest:
         movabs $DIRECTMAP_VIRT_START, %rcx
         mov   %rdi, %rax
         and   %rsi, %rdi
-        jz    .Lrag_keep_cr3
         and   %r9, %rsi
         add   %rcx, %rdi
         add   %rcx, %rsi
@@ -72,7 +72,8 @@ restore_all_guest:
         mov   %rdi, %cr4
         mov   %rax, %cr3
         mov   %rsi, %cr4
-.Lrag_keep_cr3:
+.Lrag_cr3_end:
+        ALTERNATIVE_NOP .Lrag_cr3_start, .Lrag_cr3_end, X86_FEATURE_NO_XPTI
 
         /* Restore stashed SPEC_CTRL value. */
         mov   %r15d, %eax
@@ -158,7 +159,7 @@ UNLIKELY_END(exit_cr3)
  * %ss must be saved into the space left by the trampoline.
  */
 ENTRY(lstar_enter)
-        /* sti could live here when we don't switch page tables below. */
+        ALTERNATIVE nop, sti, X86_FEATURE_NO_XPTI
         movq  8(%rsp),%rax /* Restore %rax. */
         movq  $FLAT_KERNEL_SS,8(%rsp)
         pushq %r11
@@ -172,6 +173,7 @@ ENTRY(lstar_enter)
         /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
 
         GET_STACK_END(bx)
+.Llstar_cr3_start:
         mov   STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
         neg   %rcx
         jz    .Llstar_cr3_okay
@@ -181,6 +183,8 @@ ENTRY(lstar_enter)
         movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Llstar_cr3_okay:
         sti
+.Llstar_cr3_end:
+        ALTERNATIVE_NOP .Llstar_cr3_start, .Llstar_cr3_end, X86_FEATURE_NO_XPTI
 
         movq  STACK_CPUINFO_FIELD(current_vcpu)(%rbx), %rbx
         testb $TF_kernel_mode,VCPU_thread_flags(%rbx)
@@ -263,7 +267,7 @@ process_trap:
         jmp  test_all_events
 
 ENTRY(sysenter_entry)
-        /* sti could live here when we don't switch page tables below. */
+        ALTERNATIVE nop, sti, X86_FEATURE_NO_XPTI
         pushq $FLAT_USER_SS
         pushq $0
         pushfq
@@ -279,6 +283,7 @@ GLOBAL(sysenter_eflags_saved)
         /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
 
         GET_STACK_END(bx)
+.Lsyse_cr3_start:
         mov   STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
         neg   %rcx
         jz    .Lsyse_cr3_okay
@@ -288,6 +293,8 @@ GLOBAL(sysenter_eflags_saved)
         movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Lsyse_cr3_okay:
         sti
+.Lsyse_cr3_end:
+        ALTERNATIVE_NOP .Lsyse_cr3_start, .Lsyse_cr3_end, X86_FEATURE_NO_XPTI
 
         movq  STACK_CPUINFO_FIELD(current_vcpu)(%rbx), %rbx
         cmpb  $0,VCPU_sysenter_disables_events(%rbx)
@@ -329,6 +336,7 @@ ENTRY(int80_direct_trap)
         /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
 
         GET_STACK_END(bx)
+.Lint80_cr3_start:
         mov   STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
         neg   %rcx
         jz    .Lint80_cr3_okay
@@ -338,6 +346,8 @@ ENTRY(int80_direct_trap)
         movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Lint80_cr3_okay:
         sti
+.Lint80_cr3_end:
+        ALTERNATIVE_NOP .Lint80_cr3_start, .Lint80_cr3_end, X86_FEATURE_NO_XPTI
 
         cmpb  $0,untrusted_msi(%rip)
 UNLIKELY_START(ne, msi_check)
@@ -508,6 +518,7 @@ ENTRY(common_interrupt)
         SPEC_CTRL_ENTRY_FROM_INTR /* Req: %rsp=regs, %r14=end, Clob: acd */
         /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
 
+.Lintr_cr3_start:
         mov   STACK_CPUINFO_FIELD(xen_cr3)(%r14), %rcx
         mov   %rcx, %r15
         neg   %rcx
@@ -526,9 +537,14 @@ ENTRY(common_interrupt)
         CR4_PV32_RESTORE
         movq %rsp,%rdi
         callq do_IRQ
+.Lintr_cr3_restore:
         mov   %r15, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
+.Lintr_cr3_end:
         jmp ret_from_intr
 
+        ALTERNATIVE_NOP .Lintr_cr3_restore, .Lintr_cr3_end, X86_FEATURE_NO_XPTI
+        ALTERNATIVE_NOP .Lintr_cr3_start, .Lintr_cr3_okay, X86_FEATURE_NO_XPTI
+
 /* No special register assumptions. */
 ENTRY(ret_from_intr)
         GET_CURRENT(bx)
@@ -550,6 +566,7 @@ GLOBAL(handle_exception)
         SPEC_CTRL_ENTRY_FROM_INTR /* Req: %rsp=regs, %r14=end, Clob: acd */
         /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
 
+.Lxcpt_cr3_start:
         mov   STACK_CPUINFO_FIELD(xen_cr3)(%r14), %rcx
         mov   %rcx, %r15
         neg   %rcx
@@ -630,7 +647,9 @@ handle_exception_saved:
         PERFC_INCR(exceptions, %rax, %rbx)
         mov   (%rdx, %rax, 8), %rdx
         INDIRECT_CALL %rdx
+.Lxcpt_cr3_restore1:
         mov   %r15, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
+.Lxcpt_cr3_end1:
         testb $3,UREGS_cs(%rsp)
         jz    restore_all_xen
         leaq  VCPU_trap_bounce(%rbx),%rdx
@@ -663,9 +682,17 @@ exception_with_ints_disabled:
         rep;  movsq                     # make room for ec/ev
 1:      movq  UREGS_error_code(%rsp),%rax # ec/ev
         movq  %rax,UREGS_kernel_sizeof(%rsp)
+.Lxcpt_cr3_restore2:
         mov   %r15, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
+.Lxcpt_cr3_end2:
         jmp   restore_all_xen           # return to fixup code
 
+        ALTERNATIVE_NOP .Lxcpt_cr3_restore1, .Lxcpt_cr3_end1, \
+                        X86_FEATURE_NO_XPTI
+        ALTERNATIVE_NOP .Lxcpt_cr3_restore2, .Lxcpt_cr3_end2, \
+                        X86_FEATURE_NO_XPTI
+        ALTERNATIVE_NOP .Lxcpt_cr3_start, .Lxcpt_cr3_okay, X86_FEATURE_NO_XPTI
+
 /* No special register assumptions. */
 FATAL_exception_with_ints_disabled:
         xorl  %esi,%esi
--- a/xen/include/asm-x86/alternative-asm.h
+++ b/xen/include/asm-x86/alternative-asm.h
@@ -72,6 +72,12 @@
     .popsection
 .endm
 
+.macro ALTERNATIVE_NOP start, end, feature
+        .pushsection .altinstructions, "a", @progbits
+        altinstruction_nop \start, \end, \feature
+        .popsection
+.endm
+
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_X86_ALTERNATIVE_ASM_H_ */
 
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -105,6 +105,7 @@
 #define cpu_has_cpuid_faulting  boot_cpu_has(X86_FEATURE_CPUID_FAULTING)
 #define cpu_has_aperfmperf      boot_cpu_has(X86_FEATURE_APERFMPERF)
 #define cpu_has_lfence_dispatch boot_cpu_has(X86_FEATURE_LFENCE_DISPATCH)
+#define cpu_has_no_xpti         boot_cpu_has(X86_FEATURE_NO_XPTI)
 
 enum _cache_type {
     CACHE_TYPE_NULL = 0,
--- a/xen/include/asm-x86/cpufeatures.h
+++ b/xen/include/asm-x86/cpufeatures.h
@@ -29,4 +29,5 @@ XEN_CPUFEATURE(XEN_IBPB,        (FSCAPIN
 XEN_CPUFEATURE(XEN_IBRS_SET,    (FSCAPINTS+0)*32+16) /* IBRSB && IRBS set in Xen */
 XEN_CPUFEATURE(XEN_IBRS_CLEAR,  (FSCAPINTS+0)*32+17) /* IBRSB && IBRS clear in Xen */
 XEN_CPUFEATURE(RSB_NATIVE,      (FSCAPINTS+0)*32+18) /* RSB overwrite needed for native */
-XEN_CPUFEATURE(RSB_VMEXIT,      (FSCAPINTS+0)*32+20) /* RSB overwrite needed for vmexit */
+XEN_CPUFEATURE(RSB_VMEXIT,      (FSCAPINTS+0)*32+19) /* RSB overwrite needed for vmexit */
+XEN_CPUFEATURE(NO_XPTI,         (FSCAPINTS+0)*32+20) /* XPTI mitigation not in use */




* [PATCH v2 5/7] x86: avoid double CR3 reload when switching to guest user mode
  2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
                   ` (3 preceding siblings ...)
  2018-02-07 16:13 ` [PATCH v2 4/7] x86: NOP out most XPTI entry/exit code when it's not in use Jan Beulich
@ 2018-02-07 16:14 ` Jan Beulich
  2018-02-07 16:14 ` [PATCH v2 6/7] x86: disable XPTI when RDCL_NO Jan Beulich
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-02-07 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

When XPTI is active, the CR3 load in restore_all_guest is sufficient
when switching to user mode, improving in particular system call and
page fault exit paths for the guest.
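
Put differently (as a reading aid only; this helper is not part of the
patch), the condition _toggle_guest_pt() now applies boils down to:

#include <stdbool.h>

static bool need_immediate_cr3_load(bool force_cr3, bool now_kernel_mode)
{
    /*
     * Switching to kernel mode always needs the new tables right away,
     * since a bounce frame may have to be written onto the kernel
     * stack.  Switching to user mode can rely on restore_all_guest
     * loading CR3 on the way out, unless the caller forces the load
     * (as toggle_guest_pt() and, without XPTI, toggle_guest_mode() do).
     */
    return force_cr3 || now_kernel_mode;
}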

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Add ASSERT(!in_irq()).

--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -220,10 +220,22 @@ int pv_domain_initialise(struct domain *
     return rc;
 }
 
-static void _toggle_guest_pt(struct vcpu *v)
+static void _toggle_guest_pt(struct vcpu *v, bool force_cr3)
 {
+    ASSERT(!in_irq());
+
     v->arch.flags ^= TF_kernel_mode;
     update_cr3(v);
+
+    /*
+     * There's no need to load CR3 here when it is going to be loaded on the
+     * way out to guest mode again anyway, and when the page tables we're
+     * currently on are the kernel ones (whereas when switching to kernel
+     * mode we need to be able to write a bounce frame onto the kernel stack).
+     */
+    if ( !force_cr3 && !(v->arch.flags & TF_kernel_mode) )
+        return;
+
     /* Don't flush user global mappings from the TLB. Don't tick TLB clock. */
     asm volatile ( "mov %0, %%cr3" : : "r" (v->arch.cr3) : "memory" );
 
@@ -253,13 +265,13 @@ void toggle_guest_mode(struct vcpu *v)
     }
     asm volatile ( "swapgs" );
 
-    _toggle_guest_pt(v);
+    _toggle_guest_pt(v, cpu_has_no_xpti);
 }
 
 void toggle_guest_pt(struct vcpu *v)
 {
     if ( !is_pv_32bit_vcpu(v) )
-        _toggle_guest_pt(v);
+        _toggle_guest_pt(v, true);
 }
 
 /*





* [PATCH v2 6/7] x86: disable XPTI when RDCL_NO
  2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
                   ` (4 preceding siblings ...)
  2018-02-07 16:14 ` [PATCH v2 5/7] x86: avoid double CR3 reload when switching to guest user mode Jan Beulich
@ 2018-02-07 16:14 ` Jan Beulich
  2018-02-07 16:15 ` [PATCH v2 7/7] x86: log XPTI enabled status Jan Beulich
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-02-07 16:14 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

Use the respective ARCH_CAPABILITIES MSR bit, but don't expose the MSR
to guests yet.
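
The default computation (a sketch only; the helper names are stand-ins,
and the AMD special casing visible in the actual hunk below is omitted
here) amounts to:

#include <stdbool.h>
#include <stdint.h>

#define ARCH_CAPABILITIES_RDCL_NO (1ULL << 0)

extern bool cpu_has_arch_caps(void);            /* stand-in */
extern uint64_t rdmsr_arch_capabilities(void);  /* stand-in */

static bool xpti_wanted_by_default(void)
{
    uint64_t caps = cpu_has_arch_caps() ? rdmsr_arch_capabilities() : 0;

    /* RDCL_NO: CPU not susceptible to rogue data cache load (Meltdown). */
    return !(caps & ARCH_CAPABILITIES_RDCL_NO);
}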

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: New.

--- a/tools/libxl/libxl_cpuid.c
+++ b/tools/libxl/libxl_cpuid.c
@@ -204,6 +204,7 @@ int libxl_cpuid_parse_config(libxl_cpuid
         {"avx512-4fmaps",0x00000007,  0, CPUID_REG_EDX,  3,  1},
         {"ibrsb",        0x00000007,  0, CPUID_REG_EDX, 26,  1},
         {"stibp",        0x00000007,  0, CPUID_REG_EDX, 27,  1},
+        {"arch-caps",    0x00000007,  0, CPUID_REG_EDX, 29,  1},
 
         {"lahfsahf",     0x80000001, NA, CPUID_REG_ECX,  0,  1},
         {"cmplegacy",    0x80000001, NA, CPUID_REG_ECX,  1,  1},
--- a/tools/misc/xen-cpuid.c
+++ b/tools/misc/xen-cpuid.c
@@ -166,7 +166,9 @@ static const char *str_7d0[32] =
 
     [26] = "ibrsb",         [27] = "stibp",
 
-    [28 ... 31] = "REZ",
+    [28] = "REZ",           [29] = "arch_caps",
+
+    [30 ... 31] = "REZ",
 };
 
 static struct {
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1545,7 +1545,16 @@ void __init noreturn __start_xen(unsigne
     cr4_pv32_mask = mmu_cr4_features & XEN_CR4_PV32_BITS;
 
     if ( opt_xpti < 0 )
-        opt_xpti = boot_cpu_data.x86_vendor != X86_VENDOR_AMD;
+    {
+        uint64_t caps = 0;
+
+        if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+            caps = ARCH_CAPABILITIES_RDCL_NO;
+        else if ( boot_cpu_has(X86_FEATURE_ARCH_CAPS) )
+            rdmsrl(MSR_ARCH_CAPABILITIES, caps);
+
+        opt_xpti = !(caps & ARCH_CAPABILITIES_RDCL_NO);
+    }
     if ( opt_xpti )
         setup_clear_cpu_cap(X86_FEATURE_NO_XPTI);
     else
--- a/xen/include/asm-x86/msr-index.h
+++ b/xen/include/asm-x86/msr-index.h
@@ -40,6 +40,8 @@
 #define PRED_CMD_IBPB			(_AC(1, ULL) << 0)
 
 #define MSR_ARCH_CAPABILITIES		0x0000010a
+#define ARCH_CAPABILITIES_RDCL_NO	(_AC(1, ULL) << 0)
+#define ARCH_CAPABILITIES_IBRS_ALL	(_AC(1, ULL) << 1)
 
 /* Intel MSRs. Some also available on other CPUs */
 #define MSR_IA32_PERFCTR0		0x000000c1
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -244,6 +244,7 @@ XEN_CPUFEATURE(AVX512_4VNNIW, 9*32+ 2) /
 XEN_CPUFEATURE(AVX512_4FMAPS, 9*32+ 3) /*A  AVX512 Multiply Accumulation Single Precision */
 XEN_CPUFEATURE(IBRSB,         9*32+26) /*A  IBRS and IBPB support (used by Intel) */
 XEN_CPUFEATURE(STIBP,         9*32+27) /*A! STIBP */
+XEN_CPUFEATURE(ARCH_CAPS,     9*32+29) /*   IA32_ARCH_CAPABILITIES MSR */
 
 #endif /* XEN_CPUFEATURE */
 





* [PATCH v2 7/7] x86: log XPTI enabled status
  2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
                   ` (5 preceding siblings ...)
  2018-02-07 16:14 ` [PATCH v2 6/7] x86: disable XPTI when RDCL_NO Jan Beulich
@ 2018-02-07 16:15 ` Jan Beulich
  2018-02-13 18:39 ` [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Rich Persaud
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-02-07 16:15 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

At the same time also report the state of the two defined
ARCH_CAPABILITIES MSR bits. To avoid further complicating the
conditional around that printk(), drop the conditional (the printk() is
a debug level one anyway).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Re-base over split off earlier patch. Drop MSR_ from
    MSR_ARCH_CAPABILITIES_*. Drop conditional around debug printk().

--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -21,7 +21,7 @@
 #include <xen/lib.h>
 
 #include <asm/microcode.h>
-#include <asm/msr-index.h>
+#include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/spec_ctrl.h>
 #include <asm/spec_ctrl_asm.h>
@@ -100,23 +100,25 @@ custom_param("bti", parse_bti);
 static void __init print_details(enum ind_thunk thunk)
 {
     unsigned int _7d0 = 0, e8b = 0, tmp;
+    uint64_t caps = 0;
 
     /* Collect diagnostics about available mitigations. */
     if ( boot_cpu_data.cpuid_level >= 7 )
         cpuid_count(7, 0, &tmp, &tmp, &tmp, &_7d0);
     if ( boot_cpu_data.extended_cpuid_level >= 0x80000008 )
         cpuid(0x80000008, &tmp, &e8b, &tmp, &tmp);
+    if ( _7d0 & cpufeat_mask(X86_FEATURE_ARCH_CAPS) )
+        rdmsrl(MSR_ARCH_CAPABILITIES, caps);
 
     printk(XENLOG_DEBUG "Speculative mitigation facilities:\n");
 
     /* Hardware features which pertain to speculative mitigations. */
-    if ( (_7d0 & (cpufeat_mask(X86_FEATURE_IBRSB) |
-                  cpufeat_mask(X86_FEATURE_STIBP))) ||
-         (e8b & cpufeat_mask(X86_FEATURE_IBPB)) )
-        printk(XENLOG_DEBUG "  Hardware features:%s%s%s\n",
-               (_7d0 & cpufeat_mask(X86_FEATURE_IBRSB)) ? " IBRS/IBPB" : "",
-               (_7d0 & cpufeat_mask(X86_FEATURE_STIBP)) ? " STIBP"     : "",
-               (e8b  & cpufeat_mask(X86_FEATURE_IBPB))  ? " IBPB"      : "");
+    printk(XENLOG_DEBUG "  Hardware features:%s%s%s%s%s\n",
+           (_7d0 & cpufeat_mask(X86_FEATURE_IBRSB)) ? " IBRS/IBPB" : "",
+           (_7d0 & cpufeat_mask(X86_FEATURE_STIBP)) ? " STIBP"     : "",
+           (e8b  & cpufeat_mask(X86_FEATURE_IBPB))  ? " IBPB"      : "",
+           (caps & ARCH_CAPABILITIES_IBRS_ALL)      ? " IBRS_ALL"  : "",
+           (caps & ARCH_CAPABILITIES_RDCL_NO)       ? " RDCL_NO"   : "");
 
     /* Compiled-in support which pertains to BTI mitigations. */
     if ( IS_ENABLED(CONFIG_INDIRECT_THUNK) )
@@ -133,6 +135,9 @@ static void __init print_details(enum in
            opt_ibpb                                  ? " IBPB"       : "",
            boot_cpu_has(X86_FEATURE_RSB_NATIVE)      ? " RSB_NATIVE" : "",
            boot_cpu_has(X86_FEATURE_RSB_VMEXIT)      ? " RSB_VMEXIT" : "");
+
+    printk(XENLOG_INFO "XPTI: %s\n",
+           boot_cpu_has(X86_FEATURE_NO_XPTI) ? "disabled" : "enabled");
 }
 
 /* Calculate whether Retpoline is known-safe on this CPU. */





* Re: [PATCH v2 1/7] x86: slightly reduce Meltdown band-aid overhead
  2018-02-07 16:12 ` [PATCH v2 1/7] x86: slightly reduce Meltdown band-aid overhead Jan Beulich
@ 2018-02-07 19:35   ` Andrew Cooper
  2018-02-08  9:20     ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2018-02-07 19:35 UTC (permalink / raw)
  To: Jan Beulich, xen-devel

On 07/02/18 16:12, Jan Beulich wrote:
> I'm not sure why I didn't do this right away: By avoiding the use of
> global PTEs in the cloned directmap, there's no need to fiddle with
> CR4.PGE on any of the entry paths. Only the exit paths need to flush
> global mappings.
>
> The reduced flushing, however, implies that we now need to have
> interrupts off on all entry paths until after the page table switch, so
> that flush IPIs can't arrive with the restricted page tables still
> active, but only a non-global flush happening with the CR3 loads. Along
> those lines the "sync" IPI after L4 entry updates now needs to become a
> real (and global) flush IPI, so that inside Xen we'll also pick up such
> changes.

Actually, on second consideration, why does reenabling interrupts need
to be deferred?

The safety of the sync_guest path (which previously entered Xen, did
nothing, and exited again) relied on the entry part flushing global
mappings for safety, as the return-to-xen path doesn't necessarily
switch mappings.

However, the first hunk upgrading the "do nothing" to a proper global
flush, covers that case.

I don't see anything else which affects the safety of taking TLB flush
IPIs early in the entry-from-guest path.  What am I missing?

~Andrew


* Re: [PATCH v2 1/7] x86: slightly reduce Meltdown band-aid overhead
  2018-02-07 19:35   ` Andrew Cooper
@ 2018-02-08  9:20     ` Jan Beulich
  2018-03-01 19:21       ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-02-08  9:20 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 07.02.18 at 20:35, <andrew.cooper3@citrix.com> wrote:
> On 07/02/18 16:12, Jan Beulich wrote:
>> I'm not sure why I didn't do this right away: By avoiding the use of
>> global PTEs in the cloned directmap, there's no need to fiddle with
>> CR4.PGE on any of the entry paths. Only the exit paths need to flush
>> global mappings.
>>
>> The reduced flushing, however, implies that we now need to have
>> interrupts off on all entry paths until after the page table switch, so
>> that flush IPIs can't arrive with the restricted page tables still
>> active, but only a non-global flush happening with the CR3 loads. Along
>> those lines the "sync" IPI after L4 entry updates now needs to become a
>> real (and global) flush IPI, so that inside Xen we'll also pick up such
>> changes.
> 
> Actually, on second consideration, why does reenabling interrupts need
> to be deferred?
> 
> The safety of the sync_guest path (which previously entered Xen, did
> nothing, and exited again) relied on the entry part flushing global
> mappings for safety, as the return-to-xen path doesn't necessarily
> switch mappings.
> 
> However, the first hunk upgrading the "do nothing" to a proper global
> flush, covers that case.
> 
> I don't see anything else which affects the safety of taking TLB flush
> IPIs early in the entry-from-guest path.  What am I missing?

If a sync IPI arrives before we switch away from the restricted page
tables, the processor may re-fetch a global entry from those tables
through an L4 which the sync IPI is supposed to tell the processor to
get rid of (or modify). The subsequent CR3 write won't invalidate such
a TLB entry, and hence whatever we do internally may reference a
stale mapping.

Jan



* Re: [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction
  2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
                   ` (6 preceding siblings ...)
  2018-02-07 16:15 ` [PATCH v2 7/7] x86: log XPTI enabled status Jan Beulich
@ 2018-02-13 18:39 ` Rich Persaud
  2018-02-14  8:04   ` Jan Beulich
  2018-02-28 11:52 ` Ping: " Jan Beulich
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Rich Persaud @ 2018-02-13 18:39 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: xen-devel


On Feb 7, 2018, at 11:05, Jan Beulich <JBeulich@suse.com> wrote:
> 
> 1: slightly reduce Meltdown band-aid overhead
> 2: remove CR reads from exit-to-guest path
> 3: introduce altinstruction_nop assembler macro
> 4: NOP out most XPTI entry/exit code when it's not in use
> 5: avoid double CR3 reload when switching to guest user mode
> 6: disable XPTI when RDCL_NO
> 7: x86: log XPTI enabled status

Since work on XPTI is ongoing, will these improvements to XPTI-stage-1 be published via http://xenbits.xen.org/xsa/xsa254/README.pti?

Rich


* Re: [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction
  2018-02-13 18:39 ` [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Rich Persaud
@ 2018-02-14  8:04   ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-02-14  8:04 UTC (permalink / raw)
  To: Rich Persaud; +Cc: Andrew Cooper, xen-devel

>>> On 13.02.18 at 19:39, <persaur@gmail.com> wrote:
> On Feb 7, 2018, at 11:05, Jan Beulich <JBeulich@suse.com> wrote:
>> 
>> 1: slightly reduce Meltdown band-aid overhead
>> 2: remove CR reads from exit-to-guest path
>> 3: introduce altinstruction_nop assembler macro
>> 4: NOP out most XPTI entry/exit code when it's not in use
>> 5: avoid double CR3 reload when switching to guest user mode
>> 6: disable XPTI when RDCL_NO
>> 7: x86: log XPTI enabled status
> 
> Since work on XPTI is ongoing, will these improvements to XPTI-stage-1 be 
> published via http://xenbits.xen.org/xsa/xsa254/README.pti? 

They first of all need to go in (or whatever sub-portion of the series
is deemed acceptable). And then I'm not sure we should further
clutter the XSA with performance improvements - I'd be
inclined to add further commits to the list only if we find issues
with the current approach.

Jan



* Ping: [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction
  2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
                   ` (7 preceding siblings ...)
  2018-02-13 18:39 ` [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Rich Persaud
@ 2018-02-28 11:52 ` Jan Beulich
  2018-02-28 12:00   ` Juergen Gross
  2018-03-08 11:32 ` [PATCH v2 8/7] x86/XPTI: use %r12 to write zero into xen_cr3 Jan Beulich
  2018-03-08 11:33 ` [PATCH v2 9/7] x86/XPTI: reduce .text.entry Jan Beulich
  10 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-02-28 11:52 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 07.02.18 at 17:05, <JBeulich@suse.com> wrote:
> 1: slightly reduce Meltdown band-aid overhead
> 2: remove CR reads from exit-to-guest path
> 3: introduce altinstruction_nop assembler macro
> 4: NOP out most XPTI entry/exit code when it's not in use

I've updated this one to patch all paths, but I'd like to avoid sending
v3 of the series without having had any feedback on v2 (leaving
aside a tiny bit on patch 1).

Jan

> 5: avoid double CR3 reload when switching to guest user mode
> 6: disable XPTI when RDCL_NO
> 7: x86: log XPTI enabled status
> 
> I won't mind if some of them are deemed pointless, but I think
> 1 (because of a measurable improvement of 1-3%), 4 (helping the
> "xpti=no" case, even if only a little; taking 3 as prereq), and
> 6+7 should be considered seriously.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>




* Re: Ping: [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction
  2018-02-28 11:52 ` Ping: " Jan Beulich
@ 2018-02-28 12:00   ` Juergen Gross
  0 siblings, 0 replies; 24+ messages in thread
From: Juergen Gross @ 2018-02-28 12:00 UTC (permalink / raw)
  To: Jan Beulich, Andrew Cooper; +Cc: xen-devel

On 28/02/18 12:52, Jan Beulich wrote:
>>>> On 07.02.18 at 17:05, <JBeulich@suse.com> wrote:
>> 1: slightly reduce Meltdown band-aid overhead
>> 2: remove CR reads from exit-to-guest path
>> 3: introduce altinstruction_nop assembler macro
>> 4: NOP out most XPTI entry/exit code when it's not in use
> 
> I've updated this one to patch all paths, but I'd like to avoid sending
> v3 of the series without having had any feedback on v2 (leaving
> aside a tiny bit on patch 1).
> 
> Jan
> 
>> 5: avoid double CR3 reload when switching to guest user mode
>> 6: disable XPTI when RDCL_NO
>> 7: x86: log XPTI enabled status
>>
>> I won't mind if some of them are deemed pointless, but I think
>> 1 (because of a measurable improvement of 1-3%), 4 (helping the
>> "xpti=no" case, even if only a little; taking 3 as prereq), and
>> 6+7 should be considered seriously.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>

You can add to all of them:

Tested-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>


Juergen


* Re: [PATCH v2 1/7] x86: slightly reduce Meltdown band-aid overhead
  2018-02-08  9:20     ` Jan Beulich
@ 2018-03-01 19:21       ` Andrew Cooper
  2018-03-02 11:34         ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2018-03-01 19:21 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On 08/02/18 09:20, Jan Beulich wrote:
>>>> On 07.02.18 at 20:35, <andrew.cooper3@citrix.com> wrote:
>> On 07/02/18 16:12, Jan Beulich wrote:
>>> I'm not sure why I didn't do this right away: By avoiding the use of
>>> global PTEs in the cloned directmap, there's no need to fiddle with
>>> CR4.PGE on any of the entry paths. Only the exit paths need to flush
>>> global mappings.
>>>
>>> The reduced flushing, however, implies that we now need to have
>>> interrupts off on all entry paths until after the page table switch, so
>>> that flush IPIs can't arrive with the restricted page tables still
>>> active, but only a non-global flush happening with the CR3 loads. Along
>>> those lines the "sync" IPI after L4 entry updates now needs to become a
>>> real (and global) flush IPI, so that inside Xen we'll also pick up such
>>> changes.
>> Actually, on second consideration, why does reenabling interrupts need
>> to be deferred?
>>
>> The safety of the sync_guest path (which previously entered Xen, did
>> nothing, and exited again) relied on the entry part flushing global
>> mappings for safety, as the return-to-xen path doesn't necessarily
>> switch mappings.
>>
>> However, the first hunk upgrading the "do nothing" to a proper global
>> flush, covers that case.
>>
>> I don't see anything else which affects the safety of taking TLB flush
>> IPIs early in the entry-from-guest path.  What am I missing?
> If a sync IPI arrives before we switch away from the restricted page
> tables, the processor may re-fetch a global entry from those tables
> through an L4 which the sync IPI is supposed to tell the processor to
> get rid of (or modify). The subsequent CR3 write won't invalidate such
> a TLB entry, and hence whatever we do internally may reference a
> stale mapping.

In which case, can I propose that the commit message reads:

The reduced flushing, however, requires that we now have
interrupts off on all entry paths until after the page table
switch, so that flush IPIs can't be serviced while on the
restricted pagetables, leaving a window where a potentially stale
guest global mapping can be brought into the TLB.  Along those
lines the "sync" IPI after L4 entry updates now needs to become a
real (and global) flush IPI, so that inside Xen we'll also pick
up such changes.

Or something similar?

Also, you've got a bugfix needed in clone_mapping() as found by Juergen,
asserting that the flags are the same after clobbering PAGE_GLOBAL.

With both of these suitably addressed, Reviewed-by: Andrew Cooper
<andrew.cooper3@citrix.com>


* Re: [PATCH v2 2/7] x86: remove CR reads from exit-to-guest path
  2018-02-07 16:12 ` [PATCH v2 2/7] x86: remove CR reads from exit-to-guest path Jan Beulich
@ 2018-03-01 19:23   ` Andrew Cooper
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Cooper @ 2018-03-01 19:23 UTC (permalink / raw)
  To: Jan Beulich, xen-devel

On 07/02/18 16:12, Jan Beulich wrote:
> CR3 is - during normal operation - only ever loaded from v->arch.cr3,
> so there's no need to read the actual control register. For CR4 we can
> generally use the cached value on all synchronous entry and exit paths.
> Drop the write_cr3 macro, as the two use sites are probably easier to
> follow without its use.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>


* Re: [PATCH v2 3/7] x86: introduce altinstruction_nop assembler macro
  2018-02-07 16:13 ` [PATCH v2 3/7] x86: introduce altinstruction_nop assembler macro Jan Beulich
@ 2018-03-01 19:25   ` Andrew Cooper
  2018-03-02  7:24     ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2018-03-01 19:25 UTC (permalink / raw)
  To: Jan Beulich, xen-devel

On 07/02/18 16:13, Jan Beulich wrote:
> This allows shortening (and making more obvious what they do) some
> altinstruction_entry uses.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

In principle, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>,
but I'd prefer if you held it back until my nop calculation series is
in, which will drop this patch down to the single hunk in alternative-asm.h


* Re: [PATCH v2 4/7] x86: NOP out most XPTI entry/exit code when it's not in use
  2018-02-07 16:13 ` [PATCH v2 4/7] x86: NOP out most XPTI entry/exit code when it's not in use Jan Beulich
@ 2018-03-01 19:42   ` Andrew Cooper
  2018-03-02  7:38     ` Jan Beulich
  0 siblings, 1 reply; 24+ messages in thread
From: Andrew Cooper @ 2018-03-01 19:42 UTC (permalink / raw)
  To: Jan Beulich, xen-devel

On 07/02/18 16:13, Jan Beulich wrote:
> --- a/xen/arch/x86/x86_64/compat/entry.S
> +++ b/xen/arch/x86/x86_64/compat/entry.S
> @@ -199,7 +199,7 @@ ENTRY(compat_post_handle_exception)
>  
>  /* See lstar_enter for entry register state. */
>  ENTRY(cstar_enter)
> -        /* sti could live here when we don't switch page tables below. */
> +        ALTERNATIVE nop, sti, X86_FEATURE_NO_XPTI
>          CR4_PV32_RESTORE
>          movq  8(%rsp),%rax /* Restore %rax. */
>          movq  $FLAT_KERNEL_SS,8(%rsp)
> @@ -214,6 +214,7 @@ ENTRY(cstar_enter)
>          /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
>  
>          GET_STACK_END(bx)
> +.Lcstar_cr3_start:
>          mov   STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
>          neg   %rcx
>          jz    .Lcstar_cr3_okay
> @@ -223,6 +224,8 @@ ENTRY(cstar_enter)
>          movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
>  .Lcstar_cr3_okay:
>          sti
> +.Lcstar_cr3_end:
> +        ALTERNATIVE_NOP .Lcstar_cr3_start, .Lcstar_cr3_end, X86_FEATURE_NO_XPTI

This is much clearer with the nop infrastructure abstracted away.

However, I remain unconvinced that this dynamic handling of interrupt
re-enabling is worth the hoops you've jumped through to make it happen. 
It might be interesting to hear others' thoughts on the matter.

In particular, we've got a race window depending on the order in which
the alternatives list is traversed where we might be unsafe.

On a tangent (which probably won't affect the result of this patch),
given your thoughts on allowing guests to notice and extend their own
featureset, what about Xen?  If so, we're going to need something more
clever than simply nopping out the code.

~Andrew


* Re: [PATCH v2 3/7] x86: introduce altinstruction_nop assembler macro
  2018-03-01 19:25   ` Andrew Cooper
@ 2018-03-02  7:24     ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-03-02  7:24 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 01.03.18 at 20:25, <andrew.cooper3@citrix.com> wrote:
> On 07/02/18 16:13, Jan Beulich wrote:
>> This allows shortening (and making more obvious what they do) some
>> altinstruction_entry uses.
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> In principle, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>,
> but I'd prefer if you held it back until my nop calculation series is
> in, which will drop this patch down to the single hunk in alternative-asm.h

I was certainly hoping to re-base this on top of your work. I don't
see though why that single hunk would then still be wanted as a
standalone patch - I'd simply move it into the next one.

Jan



* Re: [PATCH v2 4/7] x86: NOP out most XPTI entry/exit code when it's not in use
  2018-03-01 19:42   ` Andrew Cooper
@ 2018-03-02  7:38     ` Jan Beulich
  0 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-03-02  7:38 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 01.03.18 at 20:42, <andrew.cooper3@citrix.com> wrote:
> On 07/02/18 16:13, Jan Beulich wrote:
>> --- a/xen/arch/x86/x86_64/compat/entry.S
>> +++ b/xen/arch/x86/x86_64/compat/entry.S
>> @@ -199,7 +199,7 @@ ENTRY(compat_post_handle_exception)
>>  
>>  /* See lstar_enter for entry register state. */
>>  ENTRY(cstar_enter)
>> -        /* sti could live here when we don't switch page tables below. */
>> +        ALTERNATIVE nop, sti, X86_FEATURE_NO_XPTI
>>          CR4_PV32_RESTORE
>>          movq  8(%rsp),%rax /* Restore %rax. */
>>          movq  $FLAT_KERNEL_SS,8(%rsp)
>> @@ -214,6 +214,7 @@ ENTRY(cstar_enter)
>>          /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
>>  
>>          GET_STACK_END(bx)
>> +.Lcstar_cr3_start:
>>          mov   STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
>>          neg   %rcx
>>          jz    .Lcstar_cr3_okay
>> @@ -223,6 +224,8 @@ ENTRY(cstar_enter)
>>          movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
>>  .Lcstar_cr3_okay:
>>          sti
>> +.Lcstar_cr3_end:
>> +        ALTERNATIVE_NOP .Lcstar_cr3_start, .Lcstar_cr3_end, X86_FEATURE_NO_XPTI
> 
> This is much clearer with the nop infrastructure abstracted away.
> 
> However, I remain unconvinced that this dynamic handling of interrupt
> re-enabling is worth the hoops you've jumped through to make it happen. 
> It might be interesting to hear others thoughts on the matter.
> 
> In particular, we've got a race window depending on the order in which
> the alternatives list is traversed where we might be unsafe.

But that ordering dependency is not just an issue with the interrupt
enabling: Note how certain pairs of ALTERNATIVE_NOP are carefully
ordered to first NOP out a later piece of code, and only then the
earlier one. There's an argument to be made that the solitary writing
back of %r15 could be left in place: with the other pieces of code
patched out, plus the earlier zeroing of registers, %r15 will then only
ever be zero, which is safe to write back. That zeroing code,
however, wasn't in the tree yet when this patch was first submitted.

Bottom line though - right now processing in order is a strict
requirement. Note that multiple patches to the same patch site
also depend on such ordering.
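
To make that concrete, here is the pair from common_interrupt as it
looks after this patch (it also shows up as context in patch 9/7 further
down). The range covering the later piece of code - the restore of %r15
into xen_cr3 on the way out, if I'm reading the labels right - is
listed, and hence patched, first:

        ALTERNATIVE_NOP .Lintr_cr3_restore, .Lintr_cr3_end, X86_FEATURE_NO_XPTI
        ALTERNATIVE_NOP .Lintr_cr3_start, .Lintr_cr3_okay, X86_FEATURE_NO_XPTI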

> On a tangent (which probably won't affect the result of this patch),
> given your thoughts to allow guests to notice and extend their own
> featureset, what about Xen?  If so, we're going to need something more
> clever than simply nopping out the code.

Well, if we want a runtime disable of xpti (like has been requested
by one of our customers, and ISTR you saying something like this
as well), then I don't think we should do this patching. But that
could be achieved by simply not setting the NO_XPTI feature (and
dropping cpu_has_no_xpti and its use), e.g. via an "xpti=dynamic"
command line option extension. As per the feature request we've
got, this would need to be no more than the option of a one-time
disable; anything more involved (like flipping between modes)
would certainly be more complicated than it is probably worth.
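
Purely as a sketch of the idea (the names below are made up for
illustration - only setup_force_cpu_cap() and the X86_FEATURE_NO_XPTI
synthetic feature from patch 4 are real), the option handling would
then merely decide whether the feature, and with it the patching, ever
gets activated:

    /* Illustrative only: XPTI_{ON,OFF,DYNAMIC} and opt_xpti_mode are
     * hypothetical names, not existing code. */
    switch ( opt_xpti_mode )
    {
    case XPTI_OFF:
        /* Not using XPTI at all: NOP out the entry/exit code. */
        setup_force_cpu_cap(X86_FEATURE_NO_XPTI);
        break;
    case XPTI_DYNAMIC:
        /* Leave the feature unset, keeping the code in place so XPTI
         * can still be disabled (once) at runtime. */
        break;
    case XPTI_ON:
        break;
    }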

Of course then there's this other consideration that Jürgen has
had with his series: If we want to make the page table switching
domain dependent (and in particular don't do so for Dom0), then
this patch either needs to go away altogether, or there would
need to be further restriction on when to set NO_XPTI (and
trigger patching).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/7] x86: slightly reduce Meltdown band-aid overhead
  2018-03-01 19:21       ` Andrew Cooper
@ 2018-03-02 11:34         ` Jan Beulich
  2018-03-02 11:38           ` Andrew Cooper
  0 siblings, 1 reply; 24+ messages in thread
From: Jan Beulich @ 2018-03-02 11:34 UTC (permalink / raw)
  To: Andrew Cooper; +Cc: xen-devel

>>> On 01.03.18 at 20:21, <andrew.cooper3@citrix.com> wrote:
> On 08/02/18 09:20, Jan Beulich wrote:
>>>>> On 07.02.18 at 20:35, <andrew.cooper3@citrix.com> wrote:
>>> On 07/02/18 16:12, Jan Beulich wrote:
>>>> I'm not sure why I didn't do this right away: By avoiding the use of
>>>> global PTEs in the cloned directmap, there's no need to fiddle with
>>>> CR4.PGE on any of the entry paths. Only the exit paths need to flush
>>>> global mappings.
>>>>
>>>> The reduced flushing, however, implies that we now need to have
>>>> interrupts off on all entry paths until after the page table switch, so
>>>> that flush IPIs can't arrive with the restricted page tables still
>>>> active, but only a non-global flush happening with the CR3 loads. Along
>>>> those lines the "sync" IPI after L4 entry updates now needs to become a
>>>> real (and global) flush IPI, so that inside Xen we'll also pick up such
>>>> changes.
>>> Actually, on second consideration, why does reenabling interrupts need
>>> to be deferred?
>>>
>>> The safety of the sync_guest path (which previously entered Xen, did
>>> nothing, and exited again) relied on the entry part flushing global
>>> mappings for safety, as the return-to-xen path doesn't necessarily
>>> switch mappings.
>>>
>>> However, the first hunk upgrading the "do nothing" to a proper global
>>> flush, covers that case.
>>>
>>> I don't see anything else which affects the safety of taking TLB flush
>>> IPIs early in the entry-from-guest path.  What am I missing?
>> If a sync IPI arrives before we switch away from the restricted page
>> tables, the processor may re-fetch a global entry from those tables
>> through an L4 which the sync IPI is supposed to tell the processor to
>> get rid of (or modify). The subsequent CR3 write won't invalidate such
>> a TLB entry, and hence whatever we do internally may reference a
>> stale mapping.
> 
> In which case, can I propose that the commit message reads:
> 
> The reduced flushing, however, requires that we now have
> interrupts off on all entry paths until after the page table
> switch, so that flush IPIs can't be serviced while on the
> restricted pagetables, leaving a window where a potentially stale
> guest global mapping can be brought into the TLB.  Along those
> lines the "sync" IPI after L4 entry updates now needs to become a
> real (and global) flush IPI, so that inside Xen we'll also pick
> up such changes.
> 
> Or something similar?

I've used the above.

> Also, you've got a bugfix needed in clone_mapping() as found by Juergen,
> asserting that the flags are the same after clobbering PAGE_GLOBAL.
> 
> With both of these suitably addressed, Reviewed-by: Andrew Cooper
> <andrew.cooper3@citrix.com>

As you will likely have seen from the patch just sent, the fix is not
to be made here. Please confirm that I can apply the R-b with just
the description change. Of course this change should then go in
only after that other bug fix has (in whatever shape).

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 1/7] x86: slightly reduce Meltdown band-aid overhead
  2018-03-02 11:34         ` Jan Beulich
@ 2018-03-02 11:38           ` Andrew Cooper
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Cooper @ 2018-03-02 11:38 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel

On 02/03/18 11:34, Jan Beulich wrote:
>>>> On 01.03.18 at 20:21, <andrew.cooper3@citrix.com> wrote:
>> On 08/02/18 09:20, Jan Beulich wrote:
>>>>>> On 07.02.18 at 20:35, <andrew.cooper3@citrix.com> wrote:
>>>> On 07/02/18 16:12, Jan Beulich wrote:
>>>>> I'm not sure why I didn't do this right away: By avoiding the use of
>>>>> global PTEs in the cloned directmap, there's no need to fiddle with
>>>>> CR4.PGE on any of the entry paths. Only the exit paths need to flush
>>>>> global mappings.
>>>>>
>>>>> The reduced flushing, however, implies that we now need to have
>>>>> interrupts off on all entry paths until after the page table switch, so
>>>>> that flush IPIs can't arrive with the restricted page tables still
>>>>> active, but only a non-global flush happening with the CR3 loads. Along
>>>>> those lines the "sync" IPI after L4 entry updates now needs to become a
>>>>> real (and global) flush IPI, so that inside Xen we'll also pick up such
>>>>> changes.
>>>> Actually, on second consideration, why does reenabling interrupts need
>>>> to be deferred?
>>>>
>>>> The safety of the sync_guest path (which previously entered Xen, did
>>>> nothing, and exited again) relied on the entry part flushing global
>>>> mappings for safety, as the return-to-xen path doesn't necessarily
>>>> switch mappings.
>>>>
>>>> However, the first hunk upgrading the "do nothing" to a proper global
>>>> flush, covers that case.
>>>>
>>>> I don't see anything else which affects the safety of taking TLB flush
>>>> IPIs early in the entry-from-guest path.  What am I missing?
>>> If a sync IPI arrives before we switch away from the restricted page
>>> tables, the processor may re-fetch a global entry from those tables
>>> through an L4 which the sync IPI is supposed to tell the processor to
>>> get rid of (or modify). The subsequent CR3 write won't invalidate such
>>> a TLB entry, and hence whatever we do internally may reference a
>>> stale mapping.
>> In which case, can I propose that the commit message reads:
>>
>> The reduced flushing, however, requires that we now have
>> interrupts off on all entry paths until after the page table
>> switch, so that flush IPIs can't be serviced while on the
>> restricted pagetables, leaving a window where a potentially stale
>> guest global mapping can be brought into the TLB.  Along those
>> lines the "sync" IPI after L4 entry updates now needs to become a
>> real (and global) flush IPI, so that inside Xen we'll also pick
>> up such changes.
>>
>> Or something similar?
> I've used the above.
>
>> Also, you've got a bugfix needed in clone_mapping() as found by Juergen,
>> asserting that the flags are the same after clobbering PAGE_GLOBAL.
>>
>> With both of these suitably addressed, Reviewed-by: Andrew Cooper
>> <andrew.cooper3@citrix.com>
> As you will likely have seen from the patch just sent, the fix is not
> to be made here. Please confirm that I can apply the R-b with just
> the description change. Of course this change should then go in
> only after that other bug fix has (in whatever shape).

I haven't caught up with mail yet. If this isn't the right fix, then my
R-b stands so long as there is a fix somewhere.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2 8/7] x86/XPTI: use %r12 to write zero into xen_cr3
  2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
                   ` (8 preceding siblings ...)
  2018-02-28 11:52 ` Ping: " Jan Beulich
@ 2018-03-08 11:32 ` Jan Beulich
  2018-03-08 11:33 ` [PATCH v2 9/7] x86/XPTI: reduce .text.entry Jan Beulich
  10 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-03-08 11:32 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

Now that we zero all registers early on all entry paths, use that to
avoid a couple of immediates here.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
We may want to consider eliminating a few more $0 immediates this way.
But especially for the byte-sized ones I'm not sure it's worth it, due
to the REX prefix that the use of %r12 would incur.
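
For reference, the encodings as I work them out by hand (assuming a
disp8 form off %rbx; other bases or displacements change the totals,
but not the difference):

    movq  $0, 8(%rbx)       # 48 C7 43 08 00 00 00 00 - 8 bytes
    mov   %r12, 8(%rbx)     # 4C 89 63 08             - 4 bytes

    movb  $0, 8(%rbx)       # C6 43 08 00             - 4 bytes
    mov   %r12b, 8(%rbx)    # 44 88 63 08             - 4 bytes, no gain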

--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -223,7 +223,7 @@ ENTRY(cstar_enter)
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
         neg   %rcx
         mov   %rcx, %cr3
-        movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
+        mov   %r12, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Lcstar_cr3_okay:
         sti
 .Lcstar_cr3_end:
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -185,7 +185,7 @@ ENTRY(lstar_enter)
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
         neg   %rcx
         mov   %rcx, %cr3
-        movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
+        mov   %r12, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Llstar_cr3_okay:
         sti
 .Llstar_cr3_end:
@@ -295,7 +295,7 @@ GLOBAL(sysenter_eflags_saved)
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
         neg   %rcx
         mov   %rcx, %cr3
-        movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
+        mov   %r12, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Lsyse_cr3_okay:
         sti
 .Lsyse_cr3_end:
@@ -348,7 +348,7 @@ ENTRY(int80_direct_trap)
         mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
         neg   %rcx
         mov   %rcx, %cr3
-        movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
+        mov   %r12, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
 .Lint80_cr3_okay:
         sti
 .Lint80_cr3_end:
@@ -538,10 +538,9 @@ ENTRY(common_interrupt)
         neg   %rcx
 .Lintr_cr3_load:
         mov   %rcx, %cr3
-        xor   %ecx, %ecx
-        mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
+        mov   %r12, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
         testb $3, UREGS_cs(%rsp)
-        cmovnz %rcx, %r15
+        cmovnz %r12, %r15
 .Lintr_cr3_okay:
 
         CR4_PV32_RESTORE
@@ -586,10 +585,9 @@ GLOBAL(handle_exception)
         neg   %rcx
 .Lxcpt_cr3_load:
         mov   %rcx, %cr3
-        xor   %ecx, %ecx
-        mov   %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
+        mov   %r12, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
         testb $3, UREGS_cs(%rsp)
-        cmovnz %rcx, %r15
+        cmovnz %r12, %r15
 .Lxcpt_cr3_okay:
 
 handle_exception_saved:
@@ -828,7 +826,7 @@ handle_ist_exception:
         neg   %rcx
 .List_cr3_load:
         mov   %rcx, %cr3
-        movq  $0, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
+        mov   %r12, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
 .List_cr3_okay:
 
         CR4_PV32_RESTORE




_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2 9/7] x86/XPTI: reduce .text.entry
  2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
                   ` (9 preceding siblings ...)
  2018-03-08 11:32 ` [PATCH v2 8/7] x86/XPTI: use %r12 to write zero into xen_cr3 Jan Beulich
@ 2018-03-08 11:33 ` Jan Beulich
  10 siblings, 0 replies; 24+ messages in thread
From: Jan Beulich @ 2018-03-08 11:33 UTC (permalink / raw)
  To: xen-devel; +Cc: Andrew Cooper

This exposes fewer pieces of code and at the same time reduces the range
covered from slightly above 3 pages to a little below 2 of them.

The code being moved is entirely unchanged, except for the removal of
trailing blanks and a pointless q suffix from "retq".

A few more small pieces could be moved, but it seems better to me to
leave them where they are to not make it overly hard to follow code
paths.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/x86_64/compat/entry.S
+++ b/xen/arch/x86/x86_64/compat/entry.S
@@ -13,8 +13,6 @@
 #include <public/xen.h>
 #include <irq_vectors.h>
 
-        .section .text.entry, "ax", @progbits
-
 ENTRY(entry_int82)
         ASM_CLAC
         pushq $0
@@ -199,6 +197,8 @@ ENTRY(compat_post_handle_exception)
         movb  $0,TRAPBOUNCE_flags(%rdx)
         jmp   compat_test_all_events
 
+        .section .text.entry, "ax", @progbits
+
 /* See lstar_enter for entry register state. */
 ENTRY(cstar_enter)
         ALTERNATIVE nop, sti, X86_FEATURE_NO_XPTI
@@ -256,6 +256,8 @@ UNLIKELY_END(compat_syscall_gpf)
         movb  %cl,TRAPBOUNCE_flags(%rdx)
         jmp   .Lcompat_bounce_exception
 
+        .text
+
 ENTRY(compat_sysenter)
         CR4_PV32_RESTORE
         movq  VCPU_trap_ctxt(%rbx),%rcx
@@ -275,9 +277,6 @@ ENTRY(compat_int80_direct_trap)
         call  compat_create_bounce_frame
         jmp   compat_test_all_events
 
-        /* compat_create_bounce_frame & helpers don't need to be in .text.entry */
-        .text
-
 /* CREATE A BASIC EXCEPTION FRAME ON GUEST OS (RING-1) STACK:            */
 /*   {[ERRCODE,] EIP, CS, EFLAGS, [ESP, SS]}                             */
 /* %rdx: trap_bounce, %rbx: struct vcpu                                  */
--- a/xen/arch/x86/x86_64/entry.S
+++ b/xen/arch/x86/x86_64/entry.S
@@ -14,8 +14,6 @@
 #include <public/xen.h>
 #include <irq_vectors.h>
 
-        .section .text.entry, "ax", @progbits
-
 /* %rbx: struct vcpu */
 ENTRY(switch_to_kernel)
         leaq  VCPU_trap_bounce(%rbx),%rdx
@@ -34,8 +32,107 @@ ENTRY(switch_to_kernel)
         movb  %cl,TRAPBOUNCE_flags(%rdx)
         call  create_bounce_frame
         andl  $~X86_EFLAGS_DF,UREGS_eflags(%rsp)
+/* %rbx: struct vcpu */
+test_all_events:
+        ASSERT_NOT_IN_ATOMIC
+        cli                             # tests must not race interrupts
+/*test_softirqs:*/
+        movl  VCPU_processor(%rbx),%eax
+        shll  $IRQSTAT_shift,%eax
+        leaq  irq_stat+IRQSTAT_softirq_pending(%rip),%rcx
+        cmpl  $0,(%rcx,%rax,1)
+        jne   process_softirqs
+        testb $1,VCPU_mce_pending(%rbx)
+        jnz   process_mce
+.Ltest_guest_nmi:
+        testb $1,VCPU_nmi_pending(%rbx)
+        jnz   process_nmi
+test_guest_events:
+        movq  VCPU_vcpu_info(%rbx),%rax
+        movzwl VCPUINFO_upcall_pending(%rax),%eax
+        decl  %eax
+        cmpl  $0xfe,%eax
+        ja    restore_all_guest
+/*process_guest_events:*/
+        sti
+        leaq  VCPU_trap_bounce(%rbx),%rdx
+        movq  VCPU_event_addr(%rbx),%rax
+        movq  %rax,TRAPBOUNCE_eip(%rdx)
+        movb  $TBF_INTERRUPT,TRAPBOUNCE_flags(%rdx)
+        call  create_bounce_frame
         jmp   test_all_events
 
+        ALIGN
+/* %rbx: struct vcpu */
+process_softirqs:
+        sti
+        call do_softirq
+        jmp  test_all_events
+
+        ALIGN
+/* %rbx: struct vcpu */
+process_mce:
+        testb $1 << VCPU_TRAP_MCE,VCPU_async_exception_mask(%rbx)
+        jnz  .Ltest_guest_nmi
+        sti
+        movb $0,VCPU_mce_pending(%rbx)
+        call set_guest_machinecheck_trapbounce
+        test %eax,%eax
+        jz   test_all_events
+        movzbl VCPU_async_exception_mask(%rbx),%edx # save mask for the
+        movb %dl,VCPU_mce_old_mask(%rbx)            # iret hypercall
+        orl  $1 << VCPU_TRAP_MCE,%edx
+        movb %dl,VCPU_async_exception_mask(%rbx)
+        jmp  process_trap
+
+        ALIGN
+/* %rbx: struct vcpu */
+process_nmi:
+        testb $1 << VCPU_TRAP_NMI,VCPU_async_exception_mask(%rbx)
+        jnz  test_guest_events
+        sti
+        movb $0,VCPU_nmi_pending(%rbx)
+        call set_guest_nmi_trapbounce
+        test %eax,%eax
+        jz   test_all_events
+        movzbl VCPU_async_exception_mask(%rbx),%edx # save mask for the
+        movb %dl,VCPU_nmi_old_mask(%rbx)            # iret hypercall
+        orl  $1 << VCPU_TRAP_NMI,%edx
+        movb %dl,VCPU_async_exception_mask(%rbx)
+        /* FALLTHROUGH */
+process_trap:
+        leaq VCPU_trap_bounce(%rbx),%rdx
+        call create_bounce_frame
+        jmp  test_all_events
+
+/* No special register assumptions. */
+ENTRY(ret_from_intr)
+        GET_CURRENT(bx)
+        testb $3,UREGS_cs(%rsp)
+        jz    restore_all_xen
+        movq  VCPU_domain(%rbx),%rax
+        testb $1,DOMAIN_is_32bit_pv(%rax)
+        jz    test_all_events
+        jmp   compat_test_all_events
+
+/* Enable NMIs.  No special register assumptions. Only %rax is not preserved. */
+ENTRY(enable_nmis)
+        movq  %rsp, %rax /* Grab RSP before pushing */
+
+        /* Set up stack frame */
+        pushq $0               /* SS */
+        pushq %rax             /* RSP */
+        pushfq                 /* RFLAGS */
+        pushq $__HYPERVISOR_CS /* CS */
+        leaq  1f(%rip),%rax
+        pushq %rax             /* RIP */
+
+        iretq /* Disable the hardware NMI latch */
+1:
+        ret
+
+        .section .text.entry, "ax", @progbits
+
 /* %rbx: struct vcpu, interrupts disabled */
 restore_all_guest:
         ASSERT_INTERRUPTS_DISABLED
@@ -197,80 +294,8 @@ ENTRY(lstar_enter)
 
         mov   %rsp, %rdi
         call  pv_hypercall
-
-/* %rbx: struct vcpu */
-test_all_events:
-        ASSERT_NOT_IN_ATOMIC
-        cli                             # tests must not race interrupts
-/*test_softirqs:*/  
-        movl  VCPU_processor(%rbx),%eax
-        shll  $IRQSTAT_shift,%eax
-        leaq  irq_stat+IRQSTAT_softirq_pending(%rip),%rcx
-        cmpl  $0,(%rcx,%rax,1)
-        jne   process_softirqs
-        testb $1,VCPU_mce_pending(%rbx)
-        jnz   process_mce
-.Ltest_guest_nmi:
-        testb $1,VCPU_nmi_pending(%rbx)
-        jnz   process_nmi
-test_guest_events:
-        movq  VCPU_vcpu_info(%rbx),%rax
-        movzwl VCPUINFO_upcall_pending(%rax),%eax
-        decl  %eax
-        cmpl  $0xfe,%eax
-        ja    restore_all_guest
-/*process_guest_events:*/
-        sti
-        leaq  VCPU_trap_bounce(%rbx),%rdx
-        movq  VCPU_event_addr(%rbx),%rax
-        movq  %rax,TRAPBOUNCE_eip(%rdx)
-        movb  $TBF_INTERRUPT,TRAPBOUNCE_flags(%rdx)
-        call  create_bounce_frame
         jmp   test_all_events
 
-        ALIGN
-/* %rbx: struct vcpu */
-process_softirqs:
-        sti       
-        call do_softirq
-        jmp  test_all_events
-
-        ALIGN
-/* %rbx: struct vcpu */
-process_mce:
-        testb $1 << VCPU_TRAP_MCE,VCPU_async_exception_mask(%rbx)
-        jnz  .Ltest_guest_nmi
-        sti
-        movb $0,VCPU_mce_pending(%rbx)
-        call set_guest_machinecheck_trapbounce
-        test %eax,%eax
-        jz   test_all_events
-        movzbl VCPU_async_exception_mask(%rbx),%edx # save mask for the
-        movb %dl,VCPU_mce_old_mask(%rbx)            # iret hypercall
-        orl  $1 << VCPU_TRAP_MCE,%edx
-        movb %dl,VCPU_async_exception_mask(%rbx)
-        jmp  process_trap
-
-        ALIGN
-/* %rbx: struct vcpu */
-process_nmi:
-        testb $1 << VCPU_TRAP_NMI,VCPU_async_exception_mask(%rbx)
-        jnz  test_guest_events
-        sti
-        movb $0,VCPU_nmi_pending(%rbx)
-        call set_guest_nmi_trapbounce
-        test %eax,%eax
-        jz   test_all_events
-        movzbl VCPU_async_exception_mask(%rbx),%edx # save mask for the
-        movb %dl,VCPU_nmi_old_mask(%rbx)            # iret hypercall
-        orl  $1 << VCPU_TRAP_NMI,%edx
-        movb %dl,VCPU_async_exception_mask(%rbx)
-        /* FALLTHROUGH */
-process_trap:
-        leaq VCPU_trap_bounce(%rbx),%rdx
-        call create_bounce_frame
-        jmp  test_all_events
-
 ENTRY(sysenter_entry)
         ALTERNATIVE nop, sti, X86_FEATURE_NO_XPTI
         pushq $FLAT_USER_SS
@@ -554,16 +579,6 @@ ENTRY(common_interrupt)
         ALTERNATIVE_NOP .Lintr_cr3_restore, .Lintr_cr3_end, X86_FEATURE_NO_XPTI
         ALTERNATIVE_NOP .Lintr_cr3_start, .Lintr_cr3_okay, X86_FEATURE_NO_XPTI
 
-/* No special register assumptions. */
-ENTRY(ret_from_intr)
-        GET_CURRENT(bx)
-        testb $3,UREGS_cs(%rsp)
-        jz    restore_all_xen
-        movq  VCPU_domain(%rbx),%rax
-        testb $1,DOMAIN_is_32bit_pv(%rax)
-        jz    test_all_events
-        jmp   compat_test_all_events
-
 ENTRY(page_fault)
         movl  $TRAP_page_fault,4(%rsp)
 /* No special register assumptions. */
@@ -878,22 +893,6 @@ ENTRY(machine_check)
         movl  $TRAP_machine_check,4(%rsp)
         jmp   handle_ist_exception
 
-/* Enable NMIs.  No special register assumptions. Only %rax is not preserved. */
-ENTRY(enable_nmis)
-        movq  %rsp, %rax /* Grab RSP before pushing */
-
-        /* Set up stack frame */
-        pushq $0               /* SS */
-        pushq %rax             /* RSP */
-        pushfq                 /* RFLAGS */
-        pushq $__HYPERVISOR_CS /* CS */
-        leaq  1f(%rip),%rax
-        pushq %rax             /* RIP */
-
-        iretq /* Disable the hardware NMI latch */
-1:
-        retq
-
 /* No op trap handler.  Required for kexec crash path. */
 GLOBAL(trap_nop)
         iretq



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2018-03-08 11:33 UTC | newest]

Thread overview: 24+ messages
2018-02-07 16:05 [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Jan Beulich
2018-02-07 16:12 ` [PATCH v2 1/7] x86: slightly reduce Meltdown band-aid overhead Jan Beulich
2018-02-07 19:35   ` Andrew Cooper
2018-02-08  9:20     ` Jan Beulich
2018-03-01 19:21       ` Andrew Cooper
2018-03-02 11:34         ` Jan Beulich
2018-03-02 11:38           ` Andrew Cooper
2018-02-07 16:12 ` [PATCH v2 2/7] x86: remove CR reads from exit-to-guest path Jan Beulich
2018-03-01 19:23   ` Andrew Cooper
2018-02-07 16:13 ` [PATCH v2 3/7] x86: introduce altinstruction_nop assembler macro Jan Beulich
2018-03-01 19:25   ` Andrew Cooper
2018-03-02  7:24     ` Jan Beulich
2018-02-07 16:13 ` [PATCH v2 4/7] x86: NOP out most XPTI entry/exit code when it's not in use Jan Beulich
2018-03-01 19:42   ` Andrew Cooper
2018-03-02  7:38     ` Jan Beulich
2018-02-07 16:14 ` [PATCH v2 5/7] x86: avoid double CR3 reload when switching to guest user mode Jan Beulich
2018-02-07 16:14 ` [PATCH v2 6/7] x86: disable XPTI when RDCL_NO Jan Beulich
2018-02-07 16:15 ` [PATCH v2 7/7] x86: log XPTI enabled status Jan Beulich
2018-02-13 18:39 ` [PATCH v2 0/7] x86: Meltdown band-aid overhead reduction Rich Persaud
2018-02-14  8:04   ` Jan Beulich
2018-02-28 11:52 ` Ping: " Jan Beulich
2018-02-28 12:00   ` Juergen Gross
2018-03-08 11:32 ` [PATCH v2 8/7] x86/XPTI: use %r12 to write zero into xen_cr3 Jan Beulich
2018-03-08 11:33 ` [PATCH v2 9/7] x86/XPTI: reduce .text.entry Jan Beulich
