On Tue, 2021-12-28 at 12:34 +0100, Paul Menzel wrote:
> Same on the ASUS F2A85-M PRO with AMD A6-6400K. Without serial console,
> the messages below are printed below to the monitor after nine seconds.
>
>     [    1.078879] smp: Bringing up secondary CPUs ...
>     [    1.080950] x86: Booting SMP configuration:
>
> Please find the serial log attached.
>

Thanks for testing. That looks like the same triple-fault on bringup
that we have been seeing, and that I reproduced without my patches
using kexec all the way back to a 5.0 kernel.

Out of interest, are you also able to reproduce it with kexec and
without the parallel bringup?

And with that patch I sent Tom in
https://lore.kernel.org/lkml/721484e0fa719e99f9b8f13e67de05033dd7cc86.camel@infradead.org/
to expand the bitlock exclusion and stop the bringup being truly in
parallel at all?

Or the one in
https://lore.kernel.org/lkml/d4cde50b4aab24612823714dfcbe69bc4bb63b60.camel@infradead.org
which makes it do nothing except prepare all the CPUs before bringing
them up one at a time?

My current theory (not that I've spent that much time thinking about it
in the last week) is that there's something about the existing CPU
bringup, possibly a CPU bug or something special about the AMD CPUs,
which is triggered by just making it a little bit *faster*. That would
explain why bringing them up from kexec (especially in qemu) can cause
it too.

Tom seemed to find that it was in load_TR_desc(), so if you could try
this hack on a machine that doesn't magically wink out of existence on
a triple-fault before even flushing its serial output, that would be
much appreciated...

diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h
index ab97b22ac04a..cc6590712ff4 100644
--- a/arch/x86/include/asm/desc.h
+++ b/arch/x86/include/asm/desc.h
@@ -8,7 +8,7 @@
 #include
 #include
 #include
-
+#include
 #include
 #include
 #include
@@ -265,11 +265,16 @@ static inline void native_load_tr_desc(void)
 	 * If the current GDT is the read-only fixmap, swap to the original
 	 * writeable version. Swap back at the end.
 	 */
+	outb('d', 0x3f8);
 	if (gdt.address == (unsigned long)fixmap_gdt) {
+		outb('e', 0x3f8);
 		load_direct_gdt(cpu);
 		restore = 1;
+		outb('f', 0x3f8);
 	}
+	outb('g', 0x3f8);
 	asm volatile("ltr %w0"::"q" (GDT_ENTRY_TSS*8));
+	outb('h', 0x3f8);
 	if (restore)
 		load_fixmap_gdt(cpu);
 }
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 0083464de5e3..5bc8f30c3283 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1716,7 +1716,9 @@ void identify_secondary_cpu(struct cpuinfo_x86 *c)
 	enable_sep_cpu();
 #endif
 	mtrr_ap_init();
+outb('A', 0x3f8);
 	validate_apic_and_package_id(c);
+outb('B', 0x3f8);
 	x86_spec_ctrl_setup_ap();
 	update_srbds_msr();
 }
@@ -1957,6 +1959,7 @@ static inline void tss_setup_io_bitmap(struct tss_struct *tss)
 	tss->io_bitmap.mapall[IO_BITMAP_LONGS] = ~0UL;
 #endif
 }
+#include
 
 /*
  * Setup everything needed to handle exceptions from the IDT, including the IST
@@ -1969,16 +1972,24 @@ void cpu_init_exception_handling(void)
 
 	/* paranoid_entry() gets the CPU number from the GDT */
 	setup_getcpu(cpu);
-
+	outb('\n', 0x3f8);
+	outb('0' + cpu / 100, 0x3f8);
+	outb('0' + (cpu % 100) / 10, 0x3f8);
+	outb('0' + (cpu % 10), 0x3f8);
+
 	/* IST vectors need TSS to be set up. */
 	tss_setup_ist(tss);
+	outb('a', 0x3f8);
 	tss_setup_io_bitmap(tss);
 	set_tss_desc(cpu, &get_cpu_entry_area(cpu)->tss.x86_tss);
-
+	outb('b', 0x3f8);
 	load_TR_desc();
+	outb('c', 0x3f8);
 
 	/* Finally load the IDT */
 	load_current_idt();
+	outb('z', 0x3f8);
+
 }
 
 /*
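
For reading the resulting serial output, here is a rough userspace sketch of how one trace line could be decoded. It is not part of the hack above. It assumes each CPU's line is the newline, three decimal digits and marker letters exactly as emitted by the patch, that load_TR_desc() ends up in native_load_tr_desc(), and that nothing else (other CPUs, printk output) is interleaved on the port. The names `struct marker`, `markers` and `stalled_step()` are made up for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Each marker from the debug patch, paired with the step that runs next. */
struct marker {
	char c;
	const char *next_step;
};

static const struct marker markers[] = {
	{ 'a', "tss_setup_io_bitmap() / set_tss_desc()" },
	{ 'b', "entry to load_TR_desc()" },
	{ 'd', "fixmap GDT check" },
	{ 'e', "load_direct_gdt()" },
	{ 'f', "end of the GDT swap block" },
	{ 'g', "ltr" },
	{ 'h', "load_fixmap_gdt() / return from native_load_tr_desc()" },
	{ 'c', "load_current_idt()" },
	{ 'z', "return from cpu_init_exception_handling()" },
	{ 'A', "validate_apic_and_package_id()" },
	{ 'B', "x86_spec_ctrl_setup_ap()" },
};

/*
 * Decode one trace line of the form "NNN<markers>" (leading '\n' already
 * stripped). Stores the CPU number in *cpu and returns the step that was
 * about to run after the last marker seen, i.e. the first candidate for
 * where the CPU died. Returns NULL if the line does not start with three
 * decimal digits.
 */
static const char *stalled_step(const char *line, int *cpu)
{
	/* With no marker after the CPU number, tss_setup_ist() is next. */
	const char *step = "tss_setup_ist()";
	size_t i, m;

	if (!isdigit((unsigned char)line[0]) ||
	    !isdigit((unsigned char)line[1]) ||
	    !isdigit((unsigned char)line[2]))
		return NULL;

	*cpu = (line[0] - '0') * 100 + (line[1] - '0') * 10 + (line[2] - '0');

	/* Keep the step belonging to the last recognised marker. */
	for (i = 3; line[i] != '\0'; i++)
		for (m = 0; m < sizeof(markers) / sizeof(markers[0]); m++)
			if (line[i] == markers[m].c)
				step = markers[m].next_step;

	return step;
}
```

With a made-up line such as "002abdefg", stalled_step() sets the CPU number to 2 and returns "ltr", which is what one would expect to see if the fault really is on the ltr in native_load_tr_desc().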