From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andy Lutomirski
Subject: Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table until per-cpu is set up
Date: Mon, 8 May 2017 04:21:29 -0700
Message-ID:
References: <0c4d6d04-7038-fb82-87b3-343784550d0a@siemens.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Return-path:
In-Reply-To:
Sender: linux-efi-owner-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
To: Andy Shevchenko
Cc: Jan Kiszka, Andy Lutomirski, Ingo Molnar, x86, linux-efi, "linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org", "Peter Zijlstra (Intel)", Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov, "H. Peter Anvin", Matt Fleming, Thomas Gleixner, Brian Gerst, Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel
List-Id: linux-efi@vger.kernel.org

On Mon, May 8, 2017 at 2:32 AM, Andy Shevchenko wrote:
> On Mon, May 8, 2017 at 9:31 AM, Jan Kiszka wrote:
>> On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
>>> The x86 smpboot trampoline expects initial_page_table to have the
>>> GDT mapped. If the GDT ends up in a virtually mapped per-cpu page,
>>> then it won't be in the page tables at all until per-cpu areas are
>>> set up. The result will be a triple fault the first time that the
>>> CPU attempts to access the GDT after LGDT loads the per-cpu GDT.
>>>
>>> This appears to be an old bug, but somehow the GDT fixmap rework
>>> is triggering it. This seems to have something to do with the
>>> memory layout.
>
>> This breaks the boot on our Intel Quark platform (IOT2000, similar to
>> Galileo Gen2). Reverting it over master makes it work again. Any idea
>> what goes wrong? Let me know how I can help debugging this.
>
> JFYI: As of today linux-next when _kexec:ed_ works fine for me
>
> Perhaps I can test this later with direct boot from SD card.
>

The most likely explanation is that there's some code that needs the
page table synced and runs before setup_per_cpu_areas().
The relevant init code is:

    setup_arch(&command_line);
    mm_init_cpumask(&init_mm);
    setup_command_line(command_line);
    setup_nr_cpu_ids();
    setup_per_cpu_areas();

so I didn't move it very far. It would be awesome if we could get a
backtrace when the failure happens, but it's likely to be a triple
fault.

Is this an EFI boot? I bet the failure is in efi_init().

Could you try reverting just the deletions in the patch? I.e. try a
kernel with both the old and the new copies of the code I moved.

--Andy