Date: Fri, 5 Apr 2019 13:55:49 -0700
From: Sean Christopherson
To: Thomas Gleixner
Cc: LKML, x86@kernel.org, Andy Lutomirski, Josh Poimboeuf
Subject: Re: [patch V2 19/29] x86/exceptions: Split debug IST stack
Message-ID: <20190405205549.GE15808@linux.intel.com>
References: <20190405150658.237064784@linutronix.de>
 <20190405150930.129884669@linutronix.de>
In-Reply-To: <20190405150930.129884669@linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 05, 2019 at 05:07:17PM +0200, Thomas Gleixner wrote:
> The debug IST stack is actually two separate debug stacks to handle #DB
> recursion. This is required because the CPU starts always at top of stack
> on exception entry, which means on #DB recursion the second #DB would
> overwrite the stack of the first.
>
> The low level entry code therefore adjusts the top of stack on entry so a
> secondary #DB starts from a different stack page. But the stack pages are
> adjacent without a guard page between them.
>
> Split the debug stack into 3 stacks which are separated by guard pages. The
> 3rd stack is never mapped into the cpu_entry_area and is only there to
> catch triple #DB nesting:
>
>       --- top of DB_stack       <- Initial stack
>       --- end of DB_stack
>           guard page
>
>       --- top of DB1_stack      <- Top of stack after entering first #DB
>       --- end of DB1_stack
>           guard page
>
>       --- top of DB2_stack      <- Top of stack after entering second #DB
>       --- end of DB2_stack
>           guard page
>
> If DB2 would not act as the final guard hole, a second #DB would point the
> top of #DB stack to the stack below #DB1 which would be valid and not catch
> the not so desired triple nesting.
>
> The backing store does not allocate any memory for DB2 and its guard page
> as it is not going to be mapped into the cpu_entry_area.
>
>  - Adjust the low level entry code so it adjusts top of #DB with the offset
>    between the stacks instead of exception stack size.
>
>  - Make the dumpstack code aware of the new stacks.
>
>  - Adjust the in_debug_stack() implementation and move it into the NMI code
>    where it belongs. As this is NMI hotpath code, it just checks the full
>    area between top of DB_stack and bottom of DB1_stack without checking
>    for the guard page. That's correct because the NMI cannot hit a
>    stackpointer pointing to the guard page between DB and DB1 stack.
>    Even if it would, then the NMI operation still is unaffected, but the
>    resume of the debug exception on the topmost DB stack will crash by
>    touching the guard page.
>
> Suggested-by: Andy Lutomirski
> Signed-off-by: Thomas Gleixner
> ---

...

> +static bool notrace is_debug_stack(unsigned long addr)
> +{
> +	struct cea_exception_stacks *cs = __this_cpu_read(cea_exception_stacks);
> +	unsigned long top = CEA_ESTACK_TOP(cs, DB);
> +	unsigned long bot = CEA_ESTACK_BOT(cs, DB1);
> +
> +	if (__this_cpu_read(debug_stack_usage))
> +		return true;
> +	/*
> +	 * Note, this covers the guard page between DB and DB1 as well to
> +	 * avoid two checks. But by all means @addr can never point into
> +	 * the guard page.
> +	 */
> +	return addr > bot && addr < top;

Isn't this an off by one error?  I.e. "return addr >= bot && addr < top".
%rsp == bot is technically still in the DB1 stack even though the next
PUSH/CALL will explode on the guard page.

> +}
> +NOKPROBE_SYMBOL(is_debug_stack);
>  #endif
>
>  dotraplinkage notrace void
> --- a/arch/x86/mm/cpu_entry_area.c
> +++ b/arch/x86/mm/cpu_entry_area.c
> @@ -98,10 +98,12 @@ static void __init percpu_setup_exceptio
>
>  	/*
>  	 * The exceptions stack mappings in the per cpu area are protected
> -	 * by guard pages so each stack must be mapped separately.
> +	 * by guard pages so each stack must be mapped separately. DB2 is
> +	 * not mapped; it just exists to catch triple nesting of #DB.
>  	 */
>  	cea_map_stack(DF);
>  	cea_map_stack(NMI);
> +	cea_map_stack(DB1);
>  	cea_map_stack(DB);
>  	cea_map_stack(MCE);
>  }
>
>
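
To make the boundary case in the is_debug_stack() comment above concrete,
here is a minimal user-space sketch. The addresses, the one-page stack size
and all helper names are made up for illustration; only the two comparisons
are taken from the patch and from the suggested fix. It is not kernel code
and says nothing about the real cpu_entry_area layout.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout: DB1 page, guard page, DB page. Not real addresses. */
#define PAGE_SZ	0x1000UL
#define DB1_BOT	0x10000UL			/* lowest valid byte of DB1 */
#define DB_TOP	(DB1_BOT + 3 * PAGE_SZ)		/* one past the last DB byte */

/* Comparison as posted in the patch: addr == bot is rejected. */
static bool in_range_exclusive(unsigned long addr, unsigned long bot,
			       unsigned long top)
{
	return addr > bot && addr < top;
}

/* Comparison as suggested in the reply: addr == bot is accepted. */
static bool in_range_inclusive(unsigned long addr, unsigned long bot,
			       unsigned long top)
{
	return addr >= bot && addr < top;
}

/* Walk the interesting addresses and check where the two variants differ. */
static void boundary_demo(void)
{
	/* The only address where the two checks disagree is the bottom. */
	assert(!in_range_exclusive(DB1_BOT, DB1_BOT, DB_TOP));
	assert(in_range_inclusive(DB1_BOT, DB1_BOT, DB_TOP));

	/* One slot above the bottom, both agree that it is on the stack. */
	assert(in_range_exclusive(DB1_BOT + 8, DB1_BOT, DB_TOP));
	assert(in_range_inclusive(DB1_BOT + 8, DB1_BOT, DB_TOP));

	/* Both reject the top address and anything below the bottom. */
	assert(!in_range_exclusive(DB_TOP, DB1_BOT, DB_TOP));
	assert(!in_range_inclusive(DB_TOP, DB1_BOT, DB_TOP));
	assert(!in_range_inclusive(DB1_BOT - 8, DB1_BOT, DB_TOP));
}
```

Calling boundary_demo() from a test harness trips no assertion, which shows
the two variants differ for exactly one value: a stack pointer sitting on the
last usable slot of DB1.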