From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752150AbdEIAEQ (ORCPT ); Mon, 8 May 2017 20:04:16 -0400
Received: from mail.kernel.org ([198.145.29.136]:60944 "EHLO mail.kernel.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751636AbdEIAEO (ORCPT ); Mon, 8 May 2017 20:04:14 -0400
MIME-Version: 1.0
In-Reply-To: <7ce941e5-5a9b-acd7-c7b6-7be464572de5@siemens.com>
References: <0c4d6d04-7038-fb82-87b3-343784550d0a@siemens.com>
        <7f5916b5-01c0-52d5-9f44-dee4bf355212@siemens.com>
        <7ce941e5-5a9b-acd7-c7b6-7be464572de5@siemens.com>
From: Andy Lutomirski
Date: Mon, 8 May 2017 17:03:44 -0700
X-Gmail-Original-Message-ID:
Message-ID:
Subject: Re: [tip:x86/mm] x86/boot/32: Defer resyncing initial_page_table
        until per-cpu is set up
To: Jan Kiszka
Cc: Andy Lutomirski, Andy Shevchenko, Ingo Molnar, x86, linux-efi,
        "linux-kernel@vger.kernel.org", "Peter Zijlstra (Intel)",
        Linus Torvalds, Josh Poimboeuf, Boris Ostrovsky, Borislav Petkov,
        "H. Peter Anvin", Matt Fleming, Thomas Gleixner, Brian Gerst,
        Thomas Garnier, Denys Vlasenko, Juergen Gross, Ard Biesheuvel
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, May 8, 2017 at 10:53 AM, Jan Kiszka wrote:
> On 2017-05-08 14:34, Jan Kiszka wrote:
>> On 2017-05-08 13:21, Andy Lutomirski wrote:
>>> On Mon, May 8, 2017 at 2:32 AM, Andy Shevchenko wrote:
>>>> On Mon, May 8, 2017 at 9:31 AM, Jan Kiszka wrote:
>>>>> On 2017-03-23 10:14, tip-bot for Andy Lutomirski wrote:
>>>>>> The x86 smpboot trampoline expects initial_page_table to have the
>>>>>> GDT mapped. If the GDT ends up in a virtually mapped per-cpu page,
>>>>>> then it won't be in the page tables at all until per-cpu areas are
>>>>>> set up. The result will be a triple fault the first time that the
>>>>>> CPU attempts to access the GDT after LGDT loads the per-cpu GDT.
>>>>>>
>>>>>> This appears to be an old bug, but somehow the GDT fixmap rework
>>>>>> is triggering it. This seems to have something to do with the
>>>>>> memory layout.
>>>>
>>>>> This breaks the boot on our Intel Quark platform (IOT2000, similar to
>>>>> Galileo Gen2). Reverting it over master makes it work again. Any idea
>>>>> what goes wrong? Let me know how I can help debugging this.
>>>>
>>>> JFYI: As of today linux-next when _kexec:ed_ works fine to me
>>>>
>>>> Perhaps I can test this later with direct boot from SD card.
>>>>
>>>
>>> The most likely explanation is that there's some code that needs the
>>> page table synced and runs before setup_per_cpu_areas(). The relevant
>>> init code is:
>>>
>>>         setup_arch(&command_line);
>>>         mm_init_cpumask(&init_mm);
>>>         setup_command_line(command_line);
>>>         setup_nr_cpu_ids();
>>>         setup_per_cpu_areas();
>>>
>>> so I didn't move it very far. It would be awesome if we could get a
>>> backtrace when the failure happens, but it's likely to be a triple
>>> fault. Is this an EFI boot? I bet the failure is in efi_init().
>>
>> Yes, it's an EFI thing. Unfortunately, I didn't make
>> earlycon/earlyprintk work yet.
>>
>>>
>>> Could you try reverting just the deletions in the patch? I.e. try a
>>> kernel with both the old and the new copies of the code I moved.
>>
>> Let me try that later. I can also move the new code around to nail down
>> the dependency.
>>
>
> I found the reason: your patch is very discriminating! Not the whole
> world is multicore yet. ;)
>
> setup_per_cpu_areas() is taken from mm/percpu.c in case of !CONFIG_SMP.
> So the new home for the resync is not even built. D'oh!
>
> Any suggestions how to refactor things instead?

efi_init() seems okay, but it makes me nervous. I think the partial
revert is the right fix. Patch coming.

>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT RDA ITP SES-DE
> Corporate Competence Center Embedded Linux
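
For readers following the thread, the failure mode Jan describes (the resync was moved into a setup_per_cpu_areas() that is only built for SMP) can be sketched in user-space C. This is only an illustration under stated assumptions, not the kernel code: DEMO_SMP stands in for CONFIG_SMP, and demo_setup_per_cpu_areas(), demo_boot() and page_table_synced are hypothetical names invented for this sketch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_SMP; defaults to a uniprocessor build. */
#ifndef DEMO_SMP
#define DEMO_SMP 0
#endif

/* Models whether initial_page_table has been resynced. */
static bool page_table_synced;

#if DEMO_SMP
/* Models the arch-specific variant, which is only compiled for SMP.
 * The resync was moved here, so it only exists in SMP builds. */
static void demo_setup_per_cpu_areas(void)
{
	/* ... per-cpu area setup would happen here ... */
	page_table_synced = true;	/* the relocated resync */
}
#else
/* Models the generic uniprocessor variant: same name, no resync. */
static void demo_setup_per_cpu_areas(void)
{
	/* ... per-cpu area setup would happen here ... */
}
#endif

/* Runs the modelled init step and reports whether the resync happened. */
bool demo_boot(void)
{
	page_table_synced = false;
	demo_setup_per_cpu_areas();
	return page_table_synced;
}
```

With the default DEMO_SMP=0, demo_boot() returns false, because the only copy of the resync sits in a function that is never compiled. Building with -DDEMO_SMP=1 selects the first variant and demo_boot() returns true. Keeping the old copy of the code as well (the partial revert) avoids depending on which variant gets built.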