From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S934032AbdCUVL6 (ORCPT ); Tue, 21 Mar 2017 17:11:58 -0400
Received: from mail-it0-f65.google.com ([209.85.214.65]:34938 "EHLO
	mail-it0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S934000AbdCUVL4 (ORCPT ); Tue, 21 Mar 2017 17:11:56 -0400
MIME-Version: 1.0
In-Reply-To:
References: <20170321045713.GE23490@yexl-desktop>
From: Linus Torvalds
Date: Tue, 21 Mar 2017 14:11:25 -0700
X-Google-Sender-Auth: q7YP8EYSmql1j3zpLkdDlOOB82U
Message-ID:
Subject: Re: [lkp-robot] [x86] 69218e4799: BUG:kernel_hang_in_boot_stage
To: Thomas Garnier
Cc: kernel test robot, Ingo Molnar, Alexander Potapenko, Andrew Morton,
	Andrey Ryabinin, Andy Lutomirski, Ard Biesheuvel, Boris Ostrovsky,
	Borislav Petkov, Chris Wilson, Christian Borntraeger, Dmitry Vyukov,
	Frederic Weisbecker, Jiri Kosina, Joerg Roedel, Jonathan Corbet,
	Josh Poimboeuf, Juergen Gross, Kees Cook, Len Brown, Lorenzo Stoakes,
	"Luis R . Rodriguez", Matt Fleming, Michal Hocko, Paolo Bonzini,
	Paul Gortmaker, Pavel Machek, Peter Zijlstra,
	=?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?=, "Rafael J . Wysocki", Rusty Russell,
	Stanislaw Gruszka, Thomas Gleixner, Tim Chen, Vitaly Kuznetsov,
	zijun_hu, LKML, Stephen Rothwell, LKP
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 21, 2017 at 1:25 PM, Thomas Garnier wrote:
> The issue seems to be related to exceptions happening in close pages
> to the fixmap GDT remapping.
>
> The original page fault happen in do_test_wp_bit which set a fixmap
> entry to test WP flag. If I grow the number of processors supported
> increasing the distance between the remapped GDT page and the WP test
> page, the error does not reproduce.
>
> I am still looking at the exact distance between repro and no-repro as
> well as the exact root cause.

Hmm. Have we set the GDT limit incorrectly, somehow?

The GDT *can* cover 8k entries, which at 8 bytes each would be 64kB.

So somebody trying to load an invalid segment (say, 0xffff) might end
up causing an access to the GDT base + 64k - 8.

It is also possible that the CPU might do a page table writability
check *before* it does the limit check. That would sound odd, though.
Might be a CPU errata.

                Linus
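
For reference, the arithmetic behind the "GDT base + 64k - 8" figure can be sketched in C. This is only an illustration of the selector layout (bits 0-1 RPL, bit 2 table indicator, bits 3-15 index) and of the limit check as the architecture manuals describe it; the helper names below are made up for this sketch and are not kernel functions. Note that 0xffff has the table-indicator bit set, so strictly it names an LDT entry; 0xfff8 is the selector with the same index that names the GDT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural sizes for legacy 8-byte segment descriptors. */
#define DESC_SIZE        8u      /* bytes per descriptor */
#define MAX_GDT_ENTRIES  8192u   /* the selector index field is 13 bits wide */

/* 8k entries at 8 bytes each is the 64kB quoted in the mail. */
static_assert(MAX_GDT_ENTRIES * DESC_SIZE == 64u * 1024u, "GDT spans 64kB");

/* Hypothetical helper: bits 3..15 of a selector hold the table index. */
static inline uint32_t selector_index(uint16_t sel)
{
	return sel >> 3;
}

/* Hypothetical helper: bit 2 (TI) set means LDT, clear means GDT. */
static inline bool selector_uses_ldt(uint16_t sel)
{
	return (sel & 0x4u) != 0;
}

/* Hypothetical helper: byte offset of the descriptor from the table base. */
static inline uint32_t selector_offset(uint16_t sel)
{
	return selector_index(sel) * DESC_SIZE;
}

/*
 * Hypothetical helper: the limit is the offset of the last valid byte,
 * so the whole 8-byte descriptor must end at or before it.
 */
static inline bool within_limit(uint16_t sel, uint16_t limit)
{
	return selector_offset(sel) + (DESC_SIZE - 1) <= limit;
}
```

With these helpers, selector_offset(0xffff) evaluates to 0xfff8, which is 64k - 8. If the limit check works as described, within_limit(0xfff8, limit) is false for any limit smaller than 0xffff, so the descriptor at that offset should never be fetched; that is the check the mail suggests may be set up incorrectly or may be ordered after the page table check.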