From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 24 Nov 2017 23:53:11 +0100
From: Ingo Molnar
To: Andy Lutomirski
Cc: linux-kernel@vger.kernel.org, Andy Lutomirski, Dave Hansen,
	Thomas Gleixner, "H . Peter Anvin", Peter Zijlstra,
	Borislav Petkov, Linus Torvalds
Subject: Re: [crash] PANIC: double fault, error_code: 0x0
Message-ID: <20171124225311.zpbgsejobpzxm7tb@gmail.com>
References: <20171124172411.19476-1-mingo@kernel.org>
	<20171124202237.oytdkqq25s3ak2ul@gmail.com>
	<20171124220934.q7ovq4yzaihevqls@gmail.com>
	<464B14E7-EC38-4A5A-8BF6-B086F437C6D1@amacapital.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <464B14E7-EC38-4A5A-8BF6-B086F437C6D1@amacapital.net>
User-Agent: NeoMutt/20170609 (1.8.3)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

* Andy Lutomirski wrote:

> > Note that if *any* of those 4 padding sequences is removed, the kernel starts
> > crashing again. Also note that the exact size of the padding appears to be not
> > material - it could be larger as well, i.e. it's not an alignment bug I think.
> >
> > In any case it's not a problem in the actual assembly code paths itself it
> > appears.
> >
> > One guess would be that it's some sort of sizing bug: maybe the padding forces a
> > key piece of data or code on another page - but I'm too tired to root cause it
> > right now.
> >
> > Any ideas?
>
> This smells like a pagetable setup bug.
> Either the pagetables are a bit broken or they're totally busted and the
> padding gets something in a more TLB-friendly place.

Also note that the delta patch below keeps it working too, i.e. doubling the
first padding and eliminating the second padding.

I.e. it's the total per IRQ entry padding that matters, not the exact placement
of the padding. So it looks like some sort of sizing bug - IDT and/or the
pagetables.

(Also note that in my config NR_CPUS is at 128 - defconfigs are 64.)

Thanks,

	Ingo

---
 arch/x86/entry/entry_64.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Index: linux/arch/x86/entry/entry_64.S
===================================================================
--- linux.orig/arch/x86/entry/entry_64.S
+++ linux/arch/x86/entry/entry_64.S
@@ -548,7 +548,7 @@ END(irq_entries_start)
 .Lokay_\@:
 	addq $8, %rsp
 #else
-	.rep 16; nop; .endr
+	.rep 32; nop; .endr
 #endif
 .endm
 
@@ -600,7 +600,7 @@ END(irq_entries_start)
 	ud2
 .Lirq_stack_okay\@:
 #else
-	.rep 16; nop; .endr
+//	.rep 16; nop; .endr
 #endif
 
 .Lirq_stack_push_old_rsp_\@:
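
To make the "total padding, not placement" observation concrete, here is a rough Python sketch of the arithmetic. It only assumes that a plain `nop` is one byte and that pages are 4 KiB. The helper names and the offset 4080 are made up for illustration and are not taken from any real kernel image. It shows that the delta patch keeps the same 32 bytes across the two sites in the diff, and how 32 bytes could, in principle, push something that follows the code onto the next page.

```python
PAGE_SIZE = 4096  # x86-64 base page size in bytes
NOP_BYTES = 1     # a plain `nop` assembles to the single byte 0x90


def padding_bytes(*rep_counts: int) -> int:
    """Total NOP padding, in bytes, for a set of `.rep N; nop; .endr` counts."""
    return sum(rep_counts) * NOP_BYTES


def page_index(offset: int) -> int:
    """Index of the 4 KiB page that contains the given byte offset."""
    return offset // PAGE_SIZE


# Padding before the delta patch: 16 NOPs at each of the two sites in the diff.
original = padding_bytes(16, 16)
# Delta patch: first site doubled to 32 NOPs, second site commented out.
delta = padding_bytes(32, 0)

# Hypothetical offset of some object located after the padded code.
# This number is invented for the example, not measured.
hypothetical_offset = 4080

if __name__ == "__main__":
    print("original padding:", original, "bytes")
    print("delta padding:", delta, "bytes")
    print("page without padding:", page_index(hypothetical_offset))
    print("page with padding:", page_index(hypothetical_offset + delta))
```

Whether anything in the real image actually crosses a page boundary this way is exactly the open question; the sketch only illustrates why the sum of the padding would matter while its placement would not.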