* [patch 0/2] x86/idt: Consolidate IDT/TSS setup @ 2021-05-07 11:02 Thomas Gleixner 2021-05-07 11:02 ` [patch 1/2] x86/cpu: Init exception handling from cpu_init_secondary() Thomas Gleixner 2021-05-07 11:02 ` [patch 2/2] x86/idt: Rework IDT setup for boot CPU Thomas Gleixner 0 siblings, 2 replies; 14+ messages in thread From: Thomas Gleixner @ 2021-05-07 11:02 UTC (permalink / raw) To: LKML; +Cc: x86, Lai Jiangshan, Joerg Roedel, Borislav Petkov The IDT/TSS setup for the boot CPU on 64-bit is split into two parts: 1) Setup IDT without IST before cpu_init() 2) Setup IDT with IST after cpu_init() Lai noticed [1] that the NMI setup in #1 is wrong because it uses the IST aware entry point but if an NMI happens there it would run on the kernel stack which can cause the IST aware code to malfunction. That's not a real problem because a NMI hitting during early boot before the IDT is fully set up is fatal anyway. The intermediate issue with the split setup is just making that window marginally wider. Though the setup logic is more convoluted than necessary. There is another oddity regarding secondary CPUs. The recently added SEV support requires #VC to be functional when invoking cpu_init() and therefore added a separate function which initializes TSS before that. Now cpu_init() itself does the same initialization again, which is pointless and confusing at best. Borislav posted a patch [2] which moves the TSS initialization to the start of cpu_init(), but when looking at the boot CPU setup, this is not helpful. So I kept the separate function and made use of it in trap_init() so that the ordering is now TSS setup, IDT setup, cpu_init(). That allows to get rid of the separate IST setup step and makes the overall code simpler. Thanks, tglx [1] https://lore.kernel.org/r/20210426230949.3561-3-jiangshanlai@gmail.com [2] https://lore.kernel.org/r/20210504171745.2249-1-bp@alien8.de --- include/asm/desc.h | 2 -- include/asm/processor.h | 1 + kernel/cpu/common.c | 24 +++++++++++------------- kernel/idt.c | 40 ++++++++++++---------------------------- kernel/traps.c | 9 +++------ 5 files changed, 27 insertions(+), 49 deletions(-) ^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 1/2] x86/cpu: Init exception handling from cpu_init_secondary() 2021-05-07 11:02 [patch 0/2] x86/idt: Consolidate IDT/TSS setup Thomas Gleixner @ 2021-05-07 11:02 ` Thomas Gleixner 2021-05-08 23:40 ` Lai Jiangshan 2021-05-07 11:02 ` [patch 2/2] x86/idt: Rework IDT setup for boot CPU Thomas Gleixner 1 sibling, 1 reply; 14+ messages in thread From: Thomas Gleixner @ 2021-05-07 11:02 UTC (permalink / raw) To: LKML; +Cc: x86, Lai Jiangshan, Joerg Roedel, Borislav Petkov From: Borislav Petkov <bp@suse.de> SEV-ES guests require properly setup task register with which the TSS descriptor in the GDT can be located so that the IST-type #VC exception handler which they need to function properly, can be executed. This setup needs to happen before attempting to load microcode in ucode_cpu_init() on secondary CPUs which can cause such #VC exceptions. Simplify the machinery by running that exception setup from a new function cpu_init_secondary() and explicitly call cpu_init_exception_handling() for the boot CPU before cpu_init(). The latter prepares for fixing and simplifying the exception/IST setup on the boot CPU. There should be no functional changes resulting from this patch. [ tglx: Reworked it so cpu_init_exception_handling() stays separate ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/include/asm/processor.h | 1 + arch/x86/kernel/cpu/common.c | 24 +++++++++++------------- arch/x86/kernel/traps.c | 4 +--- 3 files changed, 13 insertions(+), 16 deletions(-) --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -663,6 +663,7 @@ extern void load_direct_gdt(int); extern void load_fixmap_gdt(int); extern void load_percpu_segment(int); extern void cpu_init(void); +extern void cpu_init_secondary(void); extern void cpu_init_exception_handling(void); extern void cr4_init(void); --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1938,13 +1938,12 @@ void cpu_init_exception_handling(void) /* * cpu_init() initializes state that is per-CPU. Some data is already - * initialized (naturally) in the bootstrap process, such as the GDT - * and IDT. We reload them nevertheless, this function acts as a - * 'CPU state barrier', nothing should get across. + * initialized (naturally) in the bootstrap process, such as the GDT. We + * reload it nevertheless, this function acts as a 'CPU state barrier', + * nothing should get across. */ void cpu_init(void) { - struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw); struct task_struct *cur = current; int cpu = raw_smp_processor_id(); @@ -1957,8 +1956,6 @@ void cpu_init(void) early_cpu_to_node(cpu) != NUMA_NO_NODE) set_numa_node(early_cpu_to_node(cpu)); #endif - setup_getcpu(cpu); - pr_debug("Initializing CPU#%d\n", cpu); if (IS_ENABLED(CONFIG_X86_64) || cpu_feature_enabled(X86_FEATURE_VME) || @@ -1970,7 +1967,6 @@ void cpu_init(void) * and set up the GDT descriptor: */ switch_to_new_gdt(cpu); - load_current_idt(); if (IS_ENABLED(CONFIG_X86_64)) { loadsegment(fs, 0); @@ -1990,12 +1986,6 @@ void cpu_init(void) initialize_tlbstate_and_flush(); enter_lazy_tlb(&init_mm, cur); - /* Initialize the TSS. */ - tss_setup_ist(tss); - tss_setup_io_bitmap(tss); - set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); - - load_TR_desc(); /* * sp0 points to the entry trampoline stack regardless of what task * is running. @@ -2017,6 +2007,14 @@ void cpu_init(void) load_fixmap_gdt(cpu); } +#ifdef CONFIG_SMP +void cpu_init_secondary(void) +{ + cpu_init_exception_handling(); + cpu_init(); +} +#endif + /* * The microcode loader calls this upon late microcode load to recheck features, * only when microcode has been updated. Caller holds microcode_mutex and CPU --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1162,9 +1162,7 @@ void __init trap_init(void) idt_setup_traps(); - /* - * Should be a barrier for any external CPU state: - */ + cpu_init_exception_handling(); cpu_init(); idt_setup_ist_traps(); ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch 1/2] x86/cpu: Init exception handling from cpu_init_secondary() 2021-05-07 11:02 ` [patch 1/2] x86/cpu: Init exception handling from cpu_init_secondary() Thomas Gleixner @ 2021-05-08 23:40 ` Lai Jiangshan 2021-05-09 13:55 ` Thomas Gleixner 0 siblings, 1 reply; 14+ messages in thread From: Lai Jiangshan @ 2021-05-08 23:40 UTC (permalink / raw) To: Thomas Gleixner, LKML; +Cc: x86, Joerg Roedel, Borislav Petkov On 2021/5/7 19:02, Thomas Gleixner wrote: > From: Borislav Petkov <bp@suse.de> > > SEV-ES guests require properly setup task register with which the TSS > descriptor in the GDT can be located so that the IST-type #VC exception > handler which they need to function properly, can be executed. > > This setup needs to happen before attempting to load microcode in > ucode_cpu_init() on secondary CPUs which can cause such #VC exceptions. > > Simplify the machinery by running that exception setup from a new function > cpu_init_secondary() and explicitly call cpu_init_exception_handling() for > the boot CPU before cpu_init(). The latter prepares for fixing and > simplifying the exception/IST setup on the boot CPU. > > There should be no functional changes resulting from this patch. > > [ tglx: Reworked it so cpu_init_exception_handling() stays separate ] > > Signed-off-by: Borislav Petkov <bp@suse.de> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > > --- > arch/x86/include/asm/processor.h | 1 + > arch/x86/kernel/cpu/common.c | 24 +++++++++++------------- > arch/x86/kernel/traps.c | 4 +--- > 3 files changed, 13 insertions(+), 16 deletions(-) > > --- a/arch/x86/include/asm/processor.h > +++ b/arch/x86/include/asm/processor.h > @@ -663,6 +663,7 @@ extern void load_direct_gdt(int); > extern void load_fixmap_gdt(int); > extern void load_percpu_segment(int); > extern void cpu_init(void); > +extern void cpu_init_secondary(void); > extern void cpu_init_exception_handling(void); > extern void cr4_init(void); > > --- a/arch/x86/kernel/cpu/common.c > +++ b/arch/x86/kernel/cpu/common.c > @@ -1938,13 +1938,12 @@ void cpu_init_exception_handling(void) > > /* > * cpu_init() initializes state that is per-CPU. Some data is already > - * initialized (naturally) in the bootstrap process, such as the GDT > - * and IDT. We reload them nevertheless, this function acts as a > - * 'CPU state barrier', nothing should get across. > + * initialized (naturally) in the bootstrap process, such as the GDT. We > + * reload it nevertheless, this function acts as a 'CPU state barrier', > + * nothing should get across. > */ > void cpu_init(void) > { > - struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw); > struct task_struct *cur = current; > int cpu = raw_smp_processor_id(); > > @@ -1957,8 +1956,6 @@ void cpu_init(void) > early_cpu_to_node(cpu) != NUMA_NO_NODE) > set_numa_node(early_cpu_to_node(cpu)); > #endif > - setup_getcpu(cpu); > - > pr_debug("Initializing CPU#%d\n", cpu); > > if (IS_ENABLED(CONFIG_X86_64) || cpu_feature_enabled(X86_FEATURE_VME) || > @@ -1970,7 +1967,6 @@ void cpu_init(void) > * and set up the GDT descriptor: > */ > switch_to_new_gdt(cpu); > - load_current_idt(); > > if (IS_ENABLED(CONFIG_X86_64)) { > loadsegment(fs, 0); > @@ -1990,12 +1986,6 @@ void cpu_init(void) > initialize_tlbstate_and_flush(); > enter_lazy_tlb(&init_mm, cur); > > - /* Initialize the TSS. */ > - tss_setup_ist(tss); > - tss_setup_io_bitmap(tss); > - set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); > - > - load_TR_desc(); > /* > * sp0 points to the entry trampoline stack regardless of what task > * is running. > @@ -2017,6 +2007,14 @@ void cpu_init(void) > load_fixmap_gdt(cpu); > } > > +#ifdef CONFIG_SMP > +void cpu_init_secondary(void) > +{ > + cpu_init_exception_handling(); > + cpu_init(); > +} > +#endif Hello No code invokes this function in this patch. Forgot to invoke it from start_secondary() or somewhere? Thanks Lai > + > /* > * The microcode loader calls this upon late microcode load to recheck features, > * only when microcode has been updated. Caller holds microcode_mutex and CPU > --- a/arch/x86/kernel/traps.c > +++ b/arch/x86/kernel/traps.c > @@ -1162,9 +1162,7 @@ void __init trap_init(void) > > idt_setup_traps(); > > - /* > - * Should be a barrier for any external CPU state: > - */ > + cpu_init_exception_handling(); > cpu_init(); > > idt_setup_ist_traps(); > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch 1/2] x86/cpu: Init exception handling from cpu_init_secondary() 2021-05-08 23:40 ` Lai Jiangshan @ 2021-05-09 13:55 ` Thomas Gleixner 2021-05-10 21:29 ` [patch 1/2 v2] x86/cpu: Init AP " Thomas Gleixner 0 siblings, 1 reply; 14+ messages in thread From: Thomas Gleixner @ 2021-05-09 13:55 UTC (permalink / raw) To: Lai Jiangshan, LKML; +Cc: x86, Joerg Roedel, Borislav Petkov On Sun, May 09 2021 at 07:40, Lai Jiangshan wrote: > On 2021/5/7 19:02, Thomas Gleixner wrote: >> +#ifdef CONFIG_SMP >> +void cpu_init_secondary(void) >> +{ >> + cpu_init_exception_handling(); >> + cpu_init(); >> +} >> +#endif > > Hello > > No code invokes this function in this patch. > > Forgot to invoke it from start_secondary() or somewhere? Yes. Stupid me. ^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch 1/2 v2] x86/cpu: Init AP exception handling from cpu_init_secondary() 2021-05-09 13:55 ` Thomas Gleixner @ 2021-05-10 21:29 ` Thomas Gleixner 2021-05-11 9:25 ` Lai Jiangshan ` (3 more replies) 0 siblings, 4 replies; 14+ messages in thread From: Thomas Gleixner @ 2021-05-10 21:29 UTC (permalink / raw) To: Lai Jiangshan, LKML; +Cc: x86, Joerg Roedel, Borislav Petkov From: Borislav Petkov <bp@suse.de> SEV-ES guests require properly setup task register with which the TSS descriptor in the GDT can be located so that the IST-type #VC exception handler which they need to function properly, can be executed. This setup needs to happen before attempting to load microcode in ucode_cpu_init() on secondary CPUs which can cause such #VC exceptions. Simplify the machinery by running that exception setup from a new function cpu_init_secondary() and explicitly call cpu_init_exception_handling() for the boot CPU before cpu_init(). The latter prepares for fixing and simplifying the exception/IST setup on the boot CPU. There should be no functional changes resulting from this patch. [ tglx: Reworked it so cpu_init_exception_handling() stays seperate ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- V2: Utilize cpu_init_secondary() - Lai --- arch/x86/include/asm/processor.h | 1 + arch/x86/kernel/cpu/common.c | 24 +++++++++++------------- arch/x86/kernel/smpboot.c | 3 +-- arch/x86/kernel/traps.c | 4 +--- 4 files changed, 14 insertions(+), 18 deletions(-) --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -663,6 +663,7 @@ extern void load_direct_gdt(int); extern void load_fixmap_gdt(int); extern void load_percpu_segment(int); extern void cpu_init(void); +extern void cpu_init_secondary(void); extern void cpu_init_exception_handling(void); extern void cr4_init(void); --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1938,13 +1938,12 @@ void cpu_init_exception_handling(void) /* * cpu_init() initializes state that is per-CPU. Some data is already - * initialized (naturally) in the bootstrap process, such as the GDT - * and IDT. We reload them nevertheless, this function acts as a - * 'CPU state barrier', nothing should get across. + * initialized (naturally) in the bootstrap process, such as the GDT. We + * reload it nevertheless, this function acts as a 'CPU state barrier', + * nothing should get across. */ void cpu_init(void) { - struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw); struct task_struct *cur = current; int cpu = raw_smp_processor_id(); @@ -1957,8 +1956,6 @@ void cpu_init(void) early_cpu_to_node(cpu) != NUMA_NO_NODE) set_numa_node(early_cpu_to_node(cpu)); #endif - setup_getcpu(cpu); - pr_debug("Initializing CPU#%d\n", cpu); if (IS_ENABLED(CONFIG_X86_64) || cpu_feature_enabled(X86_FEATURE_VME) || @@ -1970,7 +1967,6 @@ void cpu_init(void) * and set up the GDT descriptor: */ switch_to_new_gdt(cpu); - load_current_idt(); if (IS_ENABLED(CONFIG_X86_64)) { loadsegment(fs, 0); @@ -1990,12 +1986,6 @@ void cpu_init(void) initialize_tlbstate_and_flush(); enter_lazy_tlb(&init_mm, cur); - /* Initialize the TSS. */ - tss_setup_ist(tss); - tss_setup_io_bitmap(tss); - set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); - - load_TR_desc(); /* * sp0 points to the entry trampoline stack regardless of what task * is running. @@ -2017,6 +2007,14 @@ void cpu_init(void) load_fixmap_gdt(cpu); } +#ifdef CONFIG_SMP +void cpu_init_secondary(void) +{ + cpu_init_exception_handling(); + cpu_init(); +} +#endif + /* * The microcode loader calls this upon late microcode load to recheck features, * only when microcode has been updated. Caller holds microcode_mutex and CPU --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -232,8 +232,7 @@ static void notrace start_secondary(void load_cr3(swapper_pg_dir); __flush_tlb_all(); #endif - cpu_init_exception_handling(); - cpu_init(); + cpu_init_secondary(); rcu_cpu_starting(raw_smp_processor_id()); x86_cpuinit.early_percpu_clock_init(); preempt_disable(); --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1162,9 +1162,7 @@ void __init trap_init(void) idt_setup_traps(); - /* - * Should be a barrier for any external CPU state: - */ + cpu_init_exception_handling(); cpu_init(); idt_setup_ist_traps(); ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch 1/2 v2] x86/cpu: Init AP exception handling from cpu_init_secondary() 2021-05-10 21:29 ` [patch 1/2 v2] x86/cpu: Init AP " Thomas Gleixner @ 2021-05-11 9:25 ` Lai Jiangshan 2021-05-12 8:37 ` Peter Zijlstra 2021-05-12 8:49 ` Peter Zijlstra ` (2 subsequent siblings) 3 siblings, 1 reply; 14+ messages in thread From: Lai Jiangshan @ 2021-05-11 9:25 UTC (permalink / raw) To: Thomas Gleixner, LKML; +Cc: x86, Joerg Roedel, Borislav Petkov On 2021/5/11 05:29, Thomas Gleixner wrote: > From: Borislav Petkov <bp@suse.de> > > SEV-ES guests require properly setup task register with which the TSS > descriptor in the GDT can be located so that the IST-type #VC exception > handler which they need to function properly, can be executed. > > This setup needs to happen before attempting to load microcode in > ucode_cpu_init() on secondary CPUs which can cause such #VC exceptions. > > Simplify the machinery by running that exception setup from a new function > cpu_init_secondary() and explicitly call cpu_init_exception_handling() for > the boot CPU before cpu_init(). The latter prepares for fixing and > simplifying the exception/IST setup on the boot CPU. > > There should be no functional changes resulting from this patch. > > [ tglx: Reworked it so cpu_init_exception_handling() stays seperate ] > > Signed-off-by: Borislav Petkov <bp@suse.de> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> For both patches: Reviewed-by: Lai Jiangshan <laijs@linux.alibaba.com> > --- > V2: Utilize cpu_init_secondary() - Lai > --- > arch/x86/include/asm/processor.h | 1 + > arch/x86/kernel/cpu/common.c | 24 +++++++++++------------- > arch/x86/kernel/smpboot.c | 3 +-- > arch/x86/kernel/traps.c | 4 +--- > 4 files changed, 14 insertions(+), 18 deletions(-) > > --- a/arch/x86/include/asm/processor.h > +++ b/arch/x86/include/asm/processor.h > @@ -663,6 +663,7 @@ extern void load_direct_gdt(int); > extern void load_fixmap_gdt(int); > extern void load_percpu_segment(int); > extern void cpu_init(void); > +extern void cpu_init_secondary(void); > extern void cpu_init_exception_handling(void); > extern void cr4_init(void); > > --- a/arch/x86/kernel/cpu/common.c > +++ b/arch/x86/kernel/cpu/common.c > @@ -1938,13 +1938,12 @@ void cpu_init_exception_handling(void) > > /* > * cpu_init() initializes state that is per-CPU. Some data is already > - * initialized (naturally) in the bootstrap process, such as the GDT > - * and IDT. We reload them nevertheless, this function acts as a > - * 'CPU state barrier', nothing should get across. > + * initialized (naturally) in the bootstrap process, such as the GDT. We > + * reload it nevertheless, this function acts as a 'CPU state barrier', > + * nothing should get across. > */ > void cpu_init(void) > { > - struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw); > struct task_struct *cur = current; > int cpu = raw_smp_processor_id(); > > @@ -1957,8 +1956,6 @@ void cpu_init(void) > early_cpu_to_node(cpu) != NUMA_NO_NODE) > set_numa_node(early_cpu_to_node(cpu)); > #endif > - setup_getcpu(cpu); > - > pr_debug("Initializing CPU#%d\n", cpu); > > if (IS_ENABLED(CONFIG_X86_64) || cpu_feature_enabled(X86_FEATURE_VME) || > @@ -1970,7 +1967,6 @@ void cpu_init(void) > * and set up the GDT descriptor: > */ > switch_to_new_gdt(cpu); > - load_current_idt(); > > if (IS_ENABLED(CONFIG_X86_64)) { > loadsegment(fs, 0); > @@ -1990,12 +1986,6 @@ void cpu_init(void) > initialize_tlbstate_and_flush(); > enter_lazy_tlb(&init_mm, cur); > > - /* Initialize the TSS. */ > - tss_setup_ist(tss); > - tss_setup_io_bitmap(tss); > - set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); > - > - load_TR_desc(); > /* > * sp0 points to the entry trampoline stack regardless of what task > * is running. > @@ -2017,6 +2007,14 @@ void cpu_init(void) > load_fixmap_gdt(cpu); > } > > +#ifdef CONFIG_SMP > +void cpu_init_secondary(void) > +{ > + cpu_init_exception_handling(); > + cpu_init(); > +} > +#endif > + > /* > * The microcode loader calls this upon late microcode load to recheck features, > * only when microcode has been updated. Caller holds microcode_mutex and CPU > --- a/arch/x86/kernel/smpboot.c > +++ b/arch/x86/kernel/smpboot.c > @@ -232,8 +232,7 @@ static void notrace start_secondary(void > load_cr3(swapper_pg_dir); > __flush_tlb_all(); > #endif > - cpu_init_exception_handling(); > - cpu_init(); > + cpu_init_secondary(); > rcu_cpu_starting(raw_smp_processor_id()); > x86_cpuinit.early_percpu_clock_init(); > preempt_disable(); > --- a/arch/x86/kernel/traps.c > +++ b/arch/x86/kernel/traps.c > @@ -1162,9 +1162,7 @@ void __init trap_init(void) > > idt_setup_traps(); > > - /* > - * Should be a barrier for any external CPU state: > - */ > + cpu_init_exception_handling(); > cpu_init(); > > idt_setup_ist_traps(); > ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch 1/2 v2] x86/cpu: Init AP exception handling from cpu_init_secondary() 2021-05-11 9:25 ` Lai Jiangshan @ 2021-05-12 8:37 ` Peter Zijlstra 0 siblings, 0 replies; 14+ messages in thread From: Peter Zijlstra @ 2021-05-12 8:37 UTC (permalink / raw) To: Lai Jiangshan; +Cc: Thomas Gleixner, LKML, x86, Joerg Roedel, Borislav Petkov On Tue, May 11, 2021 at 05:25:35PM +0800, Lai Jiangshan wrote: > > > On 2021/5/11 05:29, Thomas Gleixner wrote: > > From: Borislav Petkov <bp@suse.de> > > > > SEV-ES guests require properly setup task register with which the TSS > > descriptor in the GDT can be located so that the IST-type #VC exception > > handler which they need to function properly, can be executed. > > > > This setup needs to happen before attempting to load microcode in > > ucode_cpu_init() on secondary CPUs which can cause such #VC exceptions. > > > > Simplify the machinery by running that exception setup from a new function > > cpu_init_secondary() and explicitly call cpu_init_exception_handling() for > > the boot CPU before cpu_init(). The latter prepares for fixing and > > simplifying the exception/IST setup on the boot CPU. > > > > There should be no functional changes resulting from this patch. > > > > [ tglx: Reworked it so cpu_init_exception_handling() stays seperate ] > > > > Signed-off-by: Borislav Petkov <bp@suse.de> > > Signed-off-by: Thomas Gleixner <tglx@linutronix.de> > > > For both patches: > > Reviewed-by: Lai Jiangshan <laijs@linux.alibaba.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch 1/2 v2] x86/cpu: Init AP exception handling from cpu_init_secondary() 2021-05-10 21:29 ` [patch 1/2 v2] x86/cpu: Init AP " Thomas Gleixner 2021-05-11 9:25 ` Lai Jiangshan @ 2021-05-12 8:49 ` Peter Zijlstra 2021-05-12 9:52 ` Thomas Gleixner 2021-05-18 12:40 ` [tip: x86/apic] x86_cpu_Init_AP_exception_handling_from_cpu_init_secondary_ tip-bot2 for Borislav Petkov 2021-05-18 12:52 ` [tip: x86/apic] x86/cpu: Init AP exception handling from cpu_init_secondary() tip-bot2 for Borislav Petkov 3 siblings, 1 reply; 14+ messages in thread From: Peter Zijlstra @ 2021-05-12 8:49 UTC (permalink / raw) To: Thomas Gleixner; +Cc: Lai Jiangshan, LKML, x86, Joerg Roedel, Borislav Petkov On Mon, May 10, 2021 at 11:29:25PM +0200, Thomas Gleixner wrote: > +#ifdef CONFIG_SMP > +void cpu_init_secondary(void) > +{ /* * Relies on the BP having set-up the IDT tables, which * are loaded on this CPU in the below * cpu_init_exception_handling(). */ > + cpu_init_exception_handling(); > + cpu_init(); > +} > +#endif Or something along those lines? It took me a little to figure out why start_secondary() didn't have idt_setup_traps(), hopefully something like this will avoid a little future confusion. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch 1/2 v2] x86/cpu: Init AP exception handling from cpu_init_secondary() 2021-05-12 8:49 ` Peter Zijlstra @ 2021-05-12 9:52 ` Thomas Gleixner 0 siblings, 0 replies; 14+ messages in thread From: Thomas Gleixner @ 2021-05-12 9:52 UTC (permalink / raw) To: Peter Zijlstra; +Cc: Lai Jiangshan, LKML, x86, Joerg Roedel, Borislav Petkov On Wed, May 12 2021 at 10:49, Peter Zijlstra wrote: > On Mon, May 10, 2021 at 11:29:25PM +0200, Thomas Gleixner wrote: >> +#ifdef CONFIG_SMP >> +void cpu_init_secondary(void) >> +{ > /* > * Relies on the BP having set-up the IDT tables, which > * are loaded on this CPU in the below > * cpu_init_exception_handling(). > */ >> + cpu_init_exception_handling(); >> + cpu_init(); >> +} >> +#endif > > Or something along those lines? It took me a little to figure out why > start_secondary() didn't have idt_setup_traps(), hopefully something > like this will avoid a little future confusion. I'll add something to that effect. ^ permalink raw reply [flat|nested] 14+ messages in thread
* [tip: x86/apic] x86_cpu_Init_AP_exception_handling_from_cpu_init_secondary_ 2021-05-10 21:29 ` [patch 1/2 v2] x86/cpu: Init AP " Thomas Gleixner 2021-05-11 9:25 ` Lai Jiangshan 2021-05-12 8:49 ` Peter Zijlstra @ 2021-05-18 12:40 ` tip-bot2 for Borislav Petkov 2021-05-18 12:52 ` [tip: x86/apic] x86/cpu: Init AP exception handling from cpu_init_secondary() tip-bot2 for Borislav Petkov 3 siblings, 0 replies; 14+ messages in thread From: tip-bot2 for Borislav Petkov @ 2021-05-18 12:40 UTC (permalink / raw) To: linux-tip-commits Cc: Borislav Petkov, Thomas Gleixner, Lai Jiangshan, Peter Zijlstra (Intel), x86, linux-kernel The following commit has been merged into the x86/apic branch of tip: Commit-ID: 14508594acb7606c10f89e79f3f73e8203295f8b Gitweb: https://git.kernel.org/tip/14508594acb7606c10f89e79f3f73e8203295f8b Author: Borislav Petkov <bp@suse.de> AuthorDate: Mon, 10 May 2021 23:29:25 +02:00 Committer: Thomas Gleixner <tglx@linutronix.de> CommitterDate: Tue, 18 May 2021 14:33:19 +02:00 x86_cpu_Init_AP_exception_handling_from_cpu_init_secondary_ SEV-ES guests require properly setup task register with which the TSS descriptor in the GDT can be located so that the IST-type #VC exception handler which they need to function properly, can be executed. This setup needs to happen before attempting to load microcode in ucode_cpu_init() on secondary CPUs which can cause such #VC exceptions. Simplify the machinery by running that exception setup from a new function cpu_init_secondary() and explicitly call cpu_init_exception_handling() for the boot CPU before cpu_init(). The latter prepares for fixing and simplifying the exception/IST setup on the boot CPU. There should be no functional changes resulting from this patch. [ tglx: Reworked it so cpu_init_exception_handling() stays seperate ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lai Jiangshan <laijs@linux.alibaba.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/87k0o6gtvu.ffs@nanos.tec.linutronix.de --- arch/x86/include/asm/processor.h | 1 + arch/x86/kernel/cpu/common.c | 28 +++++++++++++++------------- arch/x86/kernel/smpboot.c | 3 +-- arch/x86/kernel/traps.c | 4 +--- 4 files changed, 18 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 556b2b1..364d0e4 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -663,6 +663,7 @@ extern void load_direct_gdt(int); extern void load_fixmap_gdt(int); extern void load_percpu_segment(int); extern void cpu_init(void); +extern void cpu_init_secondary(void); extern void cpu_init_exception_handling(void); extern void cr4_init(void); diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index a1b756c..212e8bc 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1938,13 +1938,12 @@ void cpu_init_exception_handling(void) /* * cpu_init() initializes state that is per-CPU. Some data is already - * initialized (naturally) in the bootstrap process, such as the GDT - * and IDT. We reload them nevertheless, this function acts as a - * 'CPU state barrier', nothing should get across. + * initialized (naturally) in the bootstrap process, such as the GDT. We + * reload it nevertheless, this function acts as a 'CPU state barrier', + * nothing should get across. */ void cpu_init(void) { - struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw); struct task_struct *cur = current; int cpu = raw_smp_processor_id(); @@ -1957,8 +1956,6 @@ void cpu_init(void) early_cpu_to_node(cpu) != NUMA_NO_NODE) set_numa_node(early_cpu_to_node(cpu)); #endif - setup_getcpu(cpu); - pr_debug("Initializing CPU#%d\n", cpu); if (IS_ENABLED(CONFIG_X86_64) || cpu_feature_enabled(X86_FEATURE_VME) || @@ -1970,7 +1967,6 @@ void cpu_init(void) * and set up the GDT descriptor: */ switch_to_new_gdt(cpu); - load_current_idt(); if (IS_ENABLED(CONFIG_X86_64)) { loadsegment(fs, 0); @@ -1990,12 +1986,6 @@ void cpu_init(void) initialize_tlbstate_and_flush(); enter_lazy_tlb(&init_mm, cur); - /* Initialize the TSS. */ - tss_setup_ist(tss); - tss_setup_io_bitmap(tss); - set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); - - load_TR_desc(); /* * sp0 points to the entry trampoline stack regardless of what task * is running. @@ -2017,6 +2007,18 @@ void cpu_init(void) load_fixmap_gdt(cpu); } +#ifdef CONFIG_SMP +void cpu_init_secondary(void) +{ + /* + * Relies on the BP having set-up the IDT tables, which are loaded + * on this CPU in cpu_init_exception_handling(). + */ + cpu_init_exception_handling(); + cpu_init(); +} +#endif + /* * The microcode loader calls this upon late microcode load to recheck features, * only when microcode has been updated. Caller holds microcode_mutex and CPU diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 7770245..2ed45b0 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -232,8 +232,7 @@ static void notrace start_secondary(void *unused) load_cr3(swapper_pg_dir); __flush_tlb_all(); #endif - cpu_init_exception_handling(); - cpu_init(); + cpu_init_secondary(); rcu_cpu_starting(raw_smp_processor_id()); x86_cpuinit.early_percpu_clock_init(); preempt_disable(); diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 853ea7a..41f7dc4 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1162,9 +1162,7 @@ void __init trap_init(void) idt_setup_traps(); - /* - * Should be a barrier for any external CPU state: - */ + cpu_init_exception_handling(); cpu_init(); idt_setup_ist_traps(); ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [tip: x86/apic] x86/cpu: Init AP exception handling from cpu_init_secondary() 2021-05-10 21:29 ` [patch 1/2 v2] x86/cpu: Init AP " Thomas Gleixner ` (2 preceding siblings ...) 2021-05-18 12:40 ` [tip: x86/apic] x86_cpu_Init_AP_exception_handling_from_cpu_init_secondary_ tip-bot2 for Borislav Petkov @ 2021-05-18 12:52 ` tip-bot2 for Borislav Petkov 3 siblings, 0 replies; 14+ messages in thread From: tip-bot2 for Borislav Petkov @ 2021-05-18 12:52 UTC (permalink / raw) To: linux-tip-commits Cc: Borislav Petkov, Thomas Gleixner, Lai Jiangshan, Peter Zijlstra (Intel), x86, linux-kernel The following commit has been merged into the x86/apic branch of tip: Commit-ID: b1efd0ff4bd16e8bb8607ba566b03f2024a830bb Gitweb: https://git.kernel.org/tip/b1efd0ff4bd16e8bb8607ba566b03f2024a830bb Author: Borislav Petkov <bp@suse.de> AuthorDate: Mon, 10 May 2021 23:29:25 +02:00 Committer: Thomas Gleixner <tglx@linutronix.de> CommitterDate: Tue, 18 May 2021 14:49:21 +02:00 x86/cpu: Init AP exception handling from cpu_init_secondary() SEV-ES guests require properly setup task register with which the TSS descriptor in the GDT can be located so that the IST-type #VC exception handler which they need to function properly, can be executed. This setup needs to happen before attempting to load microcode in ucode_cpu_init() on secondary CPUs which can cause such #VC exceptions. Simplify the machinery by running that exception setup from a new function cpu_init_secondary() and explicitly call cpu_init_exception_handling() for the boot CPU before cpu_init(). The latter prepares for fixing and simplifying the exception/IST setup on the boot CPU. There should be no functional changes resulting from this patch. [ tglx: Reworked it so cpu_init_exception_handling() stays seperate ] Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lai Jiangshan <laijs@linux.alibaba.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/87k0o6gtvu.ffs@nanos.tec.linutronix.de --- arch/x86/include/asm/processor.h | 1 + arch/x86/kernel/cpu/common.c | 28 +++++++++++++++------------- arch/x86/kernel/smpboot.c | 3 +-- arch/x86/kernel/traps.c | 4 +--- 4 files changed, 18 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h index 556b2b1..364d0e4 100644 --- a/arch/x86/include/asm/processor.h +++ b/arch/x86/include/asm/processor.h @@ -663,6 +663,7 @@ extern void load_direct_gdt(int); extern void load_fixmap_gdt(int); extern void load_percpu_segment(int); extern void cpu_init(void); +extern void cpu_init_secondary(void); extern void cpu_init_exception_handling(void); extern void cr4_init(void); diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index a1b756c..212e8bc 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1938,13 +1938,12 @@ void cpu_init_exception_handling(void) /* * cpu_init() initializes state that is per-CPU. Some data is already - * initialized (naturally) in the bootstrap process, such as the GDT - * and IDT. We reload them nevertheless, this function acts as a - * 'CPU state barrier', nothing should get across. + * initialized (naturally) in the bootstrap process, such as the GDT. We + * reload it nevertheless, this function acts as a 'CPU state barrier', + * nothing should get across. */ void cpu_init(void) { - struct tss_struct *tss = this_cpu_ptr(&cpu_tss_rw); struct task_struct *cur = current; int cpu = raw_smp_processor_id(); @@ -1957,8 +1956,6 @@ void cpu_init(void) early_cpu_to_node(cpu) != NUMA_NO_NODE) set_numa_node(early_cpu_to_node(cpu)); #endif - setup_getcpu(cpu); - pr_debug("Initializing CPU#%d\n", cpu); if (IS_ENABLED(CONFIG_X86_64) || cpu_feature_enabled(X86_FEATURE_VME) || @@ -1970,7 +1967,6 @@ void cpu_init(void) * and set up the GDT descriptor: */ switch_to_new_gdt(cpu); - load_current_idt(); if (IS_ENABLED(CONFIG_X86_64)) { loadsegment(fs, 0); @@ -1990,12 +1986,6 @@ void cpu_init(void) initialize_tlbstate_and_flush(); enter_lazy_tlb(&init_mm, cur); - /* Initialize the TSS. */ - tss_setup_ist(tss); - tss_setup_io_bitmap(tss); - set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss); - - load_TR_desc(); /* * sp0 points to the entry trampoline stack regardless of what task * is running. @@ -2017,6 +2007,18 @@ void cpu_init(void) load_fixmap_gdt(cpu); } +#ifdef CONFIG_SMP +void cpu_init_secondary(void) +{ + /* + * Relies on the BP having set-up the IDT tables, which are loaded + * on this CPU in cpu_init_exception_handling(). + */ + cpu_init_exception_handling(); + cpu_init(); +} +#endif + /* * The microcode loader calls this upon late microcode load to recheck features, * only when microcode has been updated. Caller holds microcode_mutex and CPU diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 7770245..2ed45b0 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -232,8 +232,7 @@ static void notrace start_secondary(void *unused) load_cr3(swapper_pg_dir); __flush_tlb_all(); #endif - cpu_init_exception_handling(); - cpu_init(); + cpu_init_secondary(); rcu_cpu_starting(raw_smp_processor_id()); x86_cpuinit.early_percpu_clock_init(); preempt_disable(); diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 853ea7a..41f7dc4 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1162,9 +1162,7 @@ void __init trap_init(void) idt_setup_traps(); - /* - * Should be a barrier for any external CPU state: - */ + cpu_init_exception_handling(); cpu_init(); idt_setup_ist_traps(); ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [patch 2/2] x86/idt: Rework IDT setup for boot CPU 2021-05-07 11:02 [patch 0/2] x86/idt: Consolidate IDT/TSS setup Thomas Gleixner 2021-05-07 11:02 ` [patch 1/2] x86/cpu: Init exception handling from cpu_init_secondary() Thomas Gleixner @ 2021-05-07 11:02 ` Thomas Gleixner 2021-05-18 12:40 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner 2021-05-18 12:52 ` tip-bot2 for Thomas Gleixner 1 sibling, 2 replies; 14+ messages in thread From: Thomas Gleixner @ 2021-05-07 11:02 UTC (permalink / raw) To: LKML; +Cc: x86, Lai Jiangshan, Joerg Roedel, Borislav Petkov A basic IDT setup for the boot CPU has to be done before invoking cpu_init() because that might trigger #GP when accessing certain MSRs. This setup cannot install the IST variants on 64-bit because the TSS setup which is required for ISTs to work happens in cpu_init(). That leaves a theoretical window where a NMI would invoke the ASM entry point which relies on IST being enabled on the kernel stack which is undefined behaviour. This setup logic has never worked correctly, but on the other hand a NMI hitting the boot CPU before it has fully set up the IDT would be fatal anyway. So the small window between the wrong NMI gate and the IST based NMI gate is not really adding a substantial amount of risk. But the setup logic is nevertheless more convoluted than necessary. The recent separation of the TSS setup into a separate function to ensure that #VC is working on secondary CPUs early on, allows to rework the boot CPU setup so it can setup TSS first, then initialize IDT with the IST variants before invoking cpu_init() and get rid of the post cpu_init() IST setup. Move the invocation of cpu_init_exception_handling() ahead of idt_setup_traps() and merge the IST setup into the default setup table. Reported-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> --- arch/x86/include/asm/desc.h | 2 -- arch/x86/kernel/idt.c | 40 ++++++++++++---------------------------- arch/x86/kernel/traps.c | 7 +++---- 3 files changed, 15 insertions(+), 34 deletions(-) --- a/arch/x86/include/asm/desc.h +++ b/arch/x86/include/asm/desc.h @@ -421,10 +421,8 @@ extern bool idt_is_f00f_address(unsigned #ifdef CONFIG_X86_64 extern void idt_setup_early_pf(void); -extern void idt_setup_ist_traps(void); #else static inline void idt_setup_early_pf(void) { } -static inline void idt_setup_ist_traps(void) { } #endif extern void idt_invalidate(void *addr); --- a/arch/x86/kernel/idt.c +++ b/arch/x86/kernel/idt.c @@ -35,12 +35,16 @@ #define SYSG(_vector, _addr) \ G(_vector, _addr, DEFAULT_STACK, GATE_INTERRUPT, DPL3, __KERNEL_CS) +#ifdef CONFIG_X86_64 /* * Interrupt gate with interrupt stack. The _ist index is the index in * the tss.ist[] array, but for the descriptor it needs to start at 1. */ #define ISTG(_vector, _addr, _ist) \ G(_vector, _addr, _ist + 1, GATE_INTERRUPT, DPL0, __KERNEL_CS) +#else +#define ISTG(_vector, _addr, _ist) INTG(_vector, _addr) +#endif /* Task gate */ #define TSKG(_vector, _gdt) \ @@ -74,7 +78,7 @@ static const __initconst struct idt_data */ static const __initconst struct idt_data def_idts[] = { INTG(X86_TRAP_DE, asm_exc_divide_error), - INTG(X86_TRAP_NMI, asm_exc_nmi), + ISTG(X86_TRAP_NMI, asm_exc_nmi, IST_INDEX_NMI), INTG(X86_TRAP_BR, asm_exc_bounds), INTG(X86_TRAP_UD, asm_exc_invalid_op), INTG(X86_TRAP_NM, asm_exc_device_not_available), @@ -91,12 +95,16 @@ static const __initconst struct idt_data #ifdef CONFIG_X86_32 TSKG(X86_TRAP_DF, GDT_ENTRY_DOUBLEFAULT_TSS), #else - INTG(X86_TRAP_DF, asm_exc_double_fault), + ISTG(X86_TRAP_DF, asm_exc_double_fault, IST_INDEX_DF), #endif - INTG(X86_TRAP_DB, asm_exc_debug), + ISTG(X86_TRAP_DB, asm_exc_debug, IST_INDEX_DB), #ifdef CONFIG_X86_MCE - INTG(X86_TRAP_MC, asm_exc_machine_check), + ISTG(X86_TRAP_MC, asm_exc_machine_check, IST_INDEX_MCE), +#endif + +#ifdef CONFIG_AMD_MEM_ENCRYPT + ISTG(X86_TRAP_VC, asm_exc_vmm_communication, IST_INDEX_VC), #endif SYSG(X86_TRAP_OF, asm_exc_overflow), @@ -221,22 +229,6 @@ static const __initconst struct idt_data INTG(X86_TRAP_PF, asm_exc_page_fault), }; -/* - * The exceptions which use Interrupt stacks. They are setup after - * cpu_init() when the TSS has been initialized. - */ -static const __initconst struct idt_data ist_idts[] = { - ISTG(X86_TRAP_DB, asm_exc_debug, IST_INDEX_DB), - ISTG(X86_TRAP_NMI, asm_exc_nmi, IST_INDEX_NMI), - ISTG(X86_TRAP_DF, asm_exc_double_fault, IST_INDEX_DF), -#ifdef CONFIG_X86_MCE - ISTG(X86_TRAP_MC, asm_exc_machine_check, IST_INDEX_MCE), -#endif -#ifdef CONFIG_AMD_MEM_ENCRYPT - ISTG(X86_TRAP_VC, asm_exc_vmm_communication, IST_INDEX_VC), -#endif -}; - /** * idt_setup_early_pf - Initialize the idt table with early pagefault handler * @@ -254,14 +246,6 @@ void __init idt_setup_early_pf(void) idt_setup_from_table(idt_table, early_pf_idts, ARRAY_SIZE(early_pf_idts), true); } - -/** - * idt_setup_ist_traps - Initialize the idt table with traps using IST - */ -void __init idt_setup_ist_traps(void) -{ - idt_setup_from_table(idt_table, ist_idts, ARRAY_SIZE(ist_idts), true); -} #endif static void __init idt_map_in_cea(void) --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1160,10 +1160,9 @@ void __init trap_init(void) /* Init GHCB memory pages when running as an SEV-ES guest */ sev_es_init_vc_handling(); - idt_setup_traps(); - + /* Initialize TSS before setting up traps so ISTs work */ cpu_init_exception_handling(); + /* Setup traps as cpu_init() might #GP */ + idt_setup_traps(); cpu_init(); - - idt_setup_ist_traps(); } ^ permalink raw reply [flat|nested] 14+ messages in thread
* [tip: x86/apic] x86/idt: Rework IDT setup for boot CPU 2021-05-07 11:02 ` [patch 2/2] x86/idt: Rework IDT setup for boot CPU Thomas Gleixner @ 2021-05-18 12:40 ` tip-bot2 for Thomas Gleixner 2021-05-18 12:52 ` tip-bot2 for Thomas Gleixner 1 sibling, 0 replies; 14+ messages in thread From: tip-bot2 for Thomas Gleixner @ 2021-05-18 12:40 UTC (permalink / raw) To: linux-tip-commits Cc: Lai Jiangshan, Thomas Gleixner, Peter Zijlstra (Intel), x86, linux-kernel The following commit has been merged into the x86/apic branch of tip: Commit-ID: 1f2b9e4acea8e6a39a7c2159d9cbd04583a729f5 Gitweb: https://git.kernel.org/tip/1f2b9e4acea8e6a39a7c2159d9cbd04583a729f5 Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Fri, 07 May 2021 13:02:12 +02:00 Committer: Thomas Gleixner <tglx@linutronix.de> CommitterDate: Tue, 18 May 2021 14:33:54 +02:00 x86/idt: Rework IDT setup for boot CPU A basic IDT setup for the boot CPU has to be done before invoking cpu_init() because that might trigger #GP when accessing certain MSRs. This setup cannot install the IST variants on 64-bit because the TSS setup which is required for ISTs to work happens in cpu_init(). That leaves a theoretical window where a NMI would invoke the ASM entry point which relies on IST being enabled on the kernel stack which is undefined behaviour. This setup logic has never worked correctly, but on the other hand a NMI hitting the boot CPU before it has fully set up the IDT would be fatal anyway. So the small window between the wrong NMI gate and the IST based NMI gate is not really adding a substantial amount of risk. But the setup logic is nevertheless more convoluted than necessary. The recent separation of the TSS setup into a separate function to ensure that setup so it can setup TSS first, then initialize IDT with the IST variants before invoking cpu_init() and get rid of the post cpu_init() IST setup. Move the invocation of cpu_init_exception_handling() ahead of idt_setup_traps() and merge the IST setup into the default setup table. Reported-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lai Jiangshan <laijs@linux.alibaba.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210507114000.569244755@linutronix.de --- arch/x86/include/asm/desc.h | 2 +-- arch/x86/kernel/idt.c | 40 ++++++++++-------------------------- arch/x86/kernel/traps.c | 7 ++---- 3 files changed, 15 insertions(+), 34 deletions(-) diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h index 476082a..96021e9 100644 --- a/arch/x86/include/asm/desc.h +++ b/arch/x86/include/asm/desc.h @@ -421,10 +421,8 @@ extern bool idt_is_f00f_address(unsigned long address); #ifdef CONFIG_X86_64 extern void idt_setup_early_pf(void); -extern void idt_setup_ist_traps(void); #else static inline void idt_setup_early_pf(void) { } -static inline void idt_setup_ist_traps(void) { } #endif extern void idt_invalidate(void *addr); diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c index d552f17..6cce604 100644 --- a/arch/x86/kernel/idt.c +++ b/arch/x86/kernel/idt.c @@ -35,12 +35,16 @@ #define SYSG(_vector, _addr) \ G(_vector, _addr, DEFAULT_STACK, GATE_INTERRUPT, DPL3, __KERNEL_CS) +#ifdef CONFIG_X86_64 /* * Interrupt gate with interrupt stack. The _ist index is the index in * the tss.ist[] array, but for the descriptor it needs to start at 1. */ #define ISTG(_vector, _addr, _ist) \ G(_vector, _addr, _ist + 1, GATE_INTERRUPT, DPL0, __KERNEL_CS) +#else +#define ISTG(_vector, _addr, _ist) INTG(_vector, _addr) +#endif /* Task gate */ #define TSKG(_vector, _gdt) \ @@ -74,7 +78,7 @@ static const __initconst struct idt_data early_idts[] = { */ static const __initconst struct idt_data def_idts[] = { INTG(X86_TRAP_DE, asm_exc_divide_error), - INTG(X86_TRAP_NMI, asm_exc_nmi), + ISTG(X86_TRAP_NMI, asm_exc_nmi, IST_INDEX_NMI), INTG(X86_TRAP_BR, asm_exc_bounds), INTG(X86_TRAP_UD, asm_exc_invalid_op), INTG(X86_TRAP_NM, asm_exc_device_not_available), @@ -91,12 +95,16 @@ static const __initconst struct idt_data def_idts[] = { #ifdef CONFIG_X86_32 TSKG(X86_TRAP_DF, GDT_ENTRY_DOUBLEFAULT_TSS), #else - INTG(X86_TRAP_DF, asm_exc_double_fault), + ISTG(X86_TRAP_DF, asm_exc_double_fault, IST_INDEX_DF), #endif - INTG(X86_TRAP_DB, asm_exc_debug), + ISTG(X86_TRAP_DB, asm_exc_debug, IST_INDEX_DB), #ifdef CONFIG_X86_MCE - INTG(X86_TRAP_MC, asm_exc_machine_check), + ISTG(X86_TRAP_MC, asm_exc_machine_check, IST_INDEX_MCE), +#endif + +#ifdef CONFIG_AMD_MEM_ENCRYPT + ISTG(X86_TRAP_VC, asm_exc_vmm_communication, IST_INDEX_VC), #endif SYSG(X86_TRAP_OF, asm_exc_overflow), @@ -221,22 +229,6 @@ static const __initconst struct idt_data early_pf_idts[] = { INTG(X86_TRAP_PF, asm_exc_page_fault), }; -/* - * The exceptions which use Interrupt stacks. They are setup after - * cpu_init() when the TSS has been initialized. - */ -static const __initconst struct idt_data ist_idts[] = { - ISTG(X86_TRAP_DB, asm_exc_debug, IST_INDEX_DB), - ISTG(X86_TRAP_NMI, asm_exc_nmi, IST_INDEX_NMI), - ISTG(X86_TRAP_DF, asm_exc_double_fault, IST_INDEX_DF), -#ifdef CONFIG_X86_MCE - ISTG(X86_TRAP_MC, asm_exc_machine_check, IST_INDEX_MCE), -#endif -#ifdef CONFIG_AMD_MEM_ENCRYPT - ISTG(X86_TRAP_VC, asm_exc_vmm_communication, IST_INDEX_VC), -#endif -}; - /** * idt_setup_early_pf - Initialize the idt table with early pagefault handler * @@ -254,14 +246,6 @@ void __init idt_setup_early_pf(void) idt_setup_from_table(idt_table, early_pf_idts, ARRAY_SIZE(early_pf_idts), true); } - -/** - * idt_setup_ist_traps - Initialize the idt table with traps using IST - */ -void __init idt_setup_ist_traps(void) -{ - idt_setup_from_table(idt_table, ist_idts, ARRAY_SIZE(ist_idts), true); -} #endif static void __init idt_map_in_cea(void) diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 41f7dc4..ed540e0 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1160,10 +1160,9 @@ void __init trap_init(void) /* Init GHCB memory pages when running as an SEV-ES guest */ sev_es_init_vc_handling(); - idt_setup_traps(); - + /* Initialize TSS before setting up traps so ISTs work */ cpu_init_exception_handling(); + /* Setup traps as cpu_init() might #GP */ + idt_setup_traps(); cpu_init(); - - idt_setup_ist_traps(); } ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [tip: x86/apic] x86/idt: Rework IDT setup for boot CPU 2021-05-07 11:02 ` [patch 2/2] x86/idt: Rework IDT setup for boot CPU Thomas Gleixner 2021-05-18 12:40 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner @ 2021-05-18 12:52 ` tip-bot2 for Thomas Gleixner 1 sibling, 0 replies; 14+ messages in thread From: tip-bot2 for Thomas Gleixner @ 2021-05-18 12:52 UTC (permalink / raw) To: linux-tip-commits Cc: Lai Jiangshan, Thomas Gleixner, Peter Zijlstra (Intel), x86, linux-kernel The following commit has been merged into the x86/apic branch of tip: Commit-ID: 1dcc917a0eed934c522d93bb05a9a7bb3c54f96c Gitweb: https://git.kernel.org/tip/1dcc917a0eed934c522d93bb05a9a7bb3c54f96c Author: Thomas Gleixner <tglx@linutronix.de> AuthorDate: Fri, 07 May 2021 13:02:12 +02:00 Committer: Thomas Gleixner <tglx@linutronix.de> CommitterDate: Tue, 18 May 2021 14:49:21 +02:00 x86/idt: Rework IDT setup for boot CPU A basic IDT setup for the boot CPU has to be done before invoking cpu_init() because that might trigger #GP when accessing certain MSRs. This setup cannot install the IST variants on 64-bit because the TSS setup which is required for ISTs to work happens in cpu_init(). That leaves a theoretical window where a NMI would invoke the ASM entry point which relies on IST being enabled on the kernel stack which is undefined behaviour. This setup logic has never worked correctly, but on the other hand a NMI hitting the boot CPU before it has fully set up the IDT would be fatal anyway. So the small window between the wrong NMI gate and the IST based NMI gate is not really adding a substantial amount of risk. But the setup logic is nevertheless more convoluted than necessary. The recent separation of the TSS setup into a separate function to ensure that setup so it can setup TSS first, then initialize IDT with the IST variants before invoking cpu_init() and get rid of the post cpu_init() IST setup. Move the invocation of cpu_init_exception_handling() ahead of idt_setup_traps() and merge the IST setup into the default setup table. Reported-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Lai Jiangshan <laijs@linux.alibaba.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20210507114000.569244755@linutronix.de --- arch/x86/include/asm/desc.h | 2 +-- arch/x86/kernel/idt.c | 40 ++++++++++-------------------------- arch/x86/kernel/traps.c | 7 ++---- 3 files changed, 15 insertions(+), 34 deletions(-) diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h index 476082a..96021e9 100644 --- a/arch/x86/include/asm/desc.h +++ b/arch/x86/include/asm/desc.h @@ -421,10 +421,8 @@ extern bool idt_is_f00f_address(unsigned long address); #ifdef CONFIG_X86_64 extern void idt_setup_early_pf(void); -extern void idt_setup_ist_traps(void); #else static inline void idt_setup_early_pf(void) { } -static inline void idt_setup_ist_traps(void) { } #endif extern void idt_invalidate(void *addr); diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c index d552f17..6cce604 100644 --- a/arch/x86/kernel/idt.c +++ b/arch/x86/kernel/idt.c @@ -35,12 +35,16 @@ #define SYSG(_vector, _addr) \ G(_vector, _addr, DEFAULT_STACK, GATE_INTERRUPT, DPL3, __KERNEL_CS) +#ifdef CONFIG_X86_64 /* * Interrupt gate with interrupt stack. The _ist index is the index in * the tss.ist[] array, but for the descriptor it needs to start at 1. */ #define ISTG(_vector, _addr, _ist) \ G(_vector, _addr, _ist + 1, GATE_INTERRUPT, DPL0, __KERNEL_CS) +#else +#define ISTG(_vector, _addr, _ist) INTG(_vector, _addr) +#endif /* Task gate */ #define TSKG(_vector, _gdt) \ @@ -74,7 +78,7 @@ static const __initconst struct idt_data early_idts[] = { */ static const __initconst struct idt_data def_idts[] = { INTG(X86_TRAP_DE, asm_exc_divide_error), - INTG(X86_TRAP_NMI, asm_exc_nmi), + ISTG(X86_TRAP_NMI, asm_exc_nmi, IST_INDEX_NMI), INTG(X86_TRAP_BR, asm_exc_bounds), INTG(X86_TRAP_UD, asm_exc_invalid_op), INTG(X86_TRAP_NM, asm_exc_device_not_available), @@ -91,12 +95,16 @@ static const __initconst struct idt_data def_idts[] = { #ifdef CONFIG_X86_32 TSKG(X86_TRAP_DF, GDT_ENTRY_DOUBLEFAULT_TSS), #else - INTG(X86_TRAP_DF, asm_exc_double_fault), + ISTG(X86_TRAP_DF, asm_exc_double_fault, IST_INDEX_DF), #endif - INTG(X86_TRAP_DB, asm_exc_debug), + ISTG(X86_TRAP_DB, asm_exc_debug, IST_INDEX_DB), #ifdef CONFIG_X86_MCE - INTG(X86_TRAP_MC, asm_exc_machine_check), + ISTG(X86_TRAP_MC, asm_exc_machine_check, IST_INDEX_MCE), +#endif + +#ifdef CONFIG_AMD_MEM_ENCRYPT + ISTG(X86_TRAP_VC, asm_exc_vmm_communication, IST_INDEX_VC), #endif SYSG(X86_TRAP_OF, asm_exc_overflow), @@ -221,22 +229,6 @@ static const __initconst struct idt_data early_pf_idts[] = { INTG(X86_TRAP_PF, asm_exc_page_fault), }; -/* - * The exceptions which use Interrupt stacks. They are setup after - * cpu_init() when the TSS has been initialized. - */ -static const __initconst struct idt_data ist_idts[] = { - ISTG(X86_TRAP_DB, asm_exc_debug, IST_INDEX_DB), - ISTG(X86_TRAP_NMI, asm_exc_nmi, IST_INDEX_NMI), - ISTG(X86_TRAP_DF, asm_exc_double_fault, IST_INDEX_DF), -#ifdef CONFIG_X86_MCE - ISTG(X86_TRAP_MC, asm_exc_machine_check, IST_INDEX_MCE), -#endif -#ifdef CONFIG_AMD_MEM_ENCRYPT - ISTG(X86_TRAP_VC, asm_exc_vmm_communication, IST_INDEX_VC), -#endif -}; - /** * idt_setup_early_pf - Initialize the idt table with early pagefault handler * @@ -254,14 +246,6 @@ void __init idt_setup_early_pf(void) idt_setup_from_table(idt_table, early_pf_idts, ARRAY_SIZE(early_pf_idts), true); } - -/** - * idt_setup_ist_traps - Initialize the idt table with traps using IST - */ -void __init idt_setup_ist_traps(void) -{ - idt_setup_from_table(idt_table, ist_idts, ARRAY_SIZE(ist_idts), true); -} #endif static void __init idt_map_in_cea(void) diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c index 41f7dc4..ed540e0 100644 --- a/arch/x86/kernel/traps.c +++ b/arch/x86/kernel/traps.c @@ -1160,10 +1160,9 @@ void __init trap_init(void) /* Init GHCB memory pages when running as an SEV-ES guest */ sev_es_init_vc_handling(); - idt_setup_traps(); - + /* Initialize TSS before setting up traps so ISTs work */ cpu_init_exception_handling(); + /* Setup traps as cpu_init() might #GP */ + idt_setup_traps(); cpu_init(); - - idt_setup_ist_traps(); } ^ permalink raw reply related [flat|nested] 14+ messages in thread
end of thread, other threads:[~2021-05-18 12:52 UTC | newest] Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-05-07 11:02 [patch 0/2] x86/idt: Consolidate IDT/TSS setup Thomas Gleixner 2021-05-07 11:02 ` [patch 1/2] x86/cpu: Init exception handling from cpu_init_secondary() Thomas Gleixner 2021-05-08 23:40 ` Lai Jiangshan 2021-05-09 13:55 ` Thomas Gleixner 2021-05-10 21:29 ` [patch 1/2 v2] x86/cpu: Init AP " Thomas Gleixner 2021-05-11 9:25 ` Lai Jiangshan 2021-05-12 8:37 ` Peter Zijlstra 2021-05-12 8:49 ` Peter Zijlstra 2021-05-12 9:52 ` Thomas Gleixner 2021-05-18 12:40 ` [tip: x86/apic] x86_cpu_Init_AP_exception_handling_from_cpu_init_secondary_ tip-bot2 for Borislav Petkov 2021-05-18 12:52 ` [tip: x86/apic] x86/cpu: Init AP exception handling from cpu_init_secondary() tip-bot2 for Borislav Petkov 2021-05-07 11:02 ` [patch 2/2] x86/idt: Rework IDT setup for boot CPU Thomas Gleixner 2021-05-18 12:40 ` [tip: x86/apic] " tip-bot2 for Thomas Gleixner 2021-05-18 12:52 ` tip-bot2 for Thomas Gleixner
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).