From: Andy Lutomirski
Date: Tue, 17 Nov 2020 07:49:53 -0800
Subject: Re: [RFC][PATCH v2 11/21] x86/pti: Extend PTI user mappings
To: Alexandre Chartre
Cc: Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 "H. Peter Anvin", X86 ML, Dave Hansen, Peter Zijlstra, LKML,
 Tom Lendacky, Joerg Roedel, Konrad Rzeszutek Wilk,
 jan.setjeeilers@oracle.com, Junaid Shahid, oweisse@google.com,
 Mike Rapoport, Alexander Graf, mgross@linux.intel.com, kuzuno@gmail.com
References: <20201116144757.1920077-1-alexandre.chartre@oracle.com>
 <20201116144757.1920077-12-alexandre.chartre@oracle.com>
 <820278dc-5f8e-6224-71b4-7c61819f68d1@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 17, 2020 at 12:42 AM Alexandre Chartre wrote:
>
>
> On 11/17/20 12:06 AM, Andy Lutomirski wrote:
> > On Mon, Nov 16, 2020 at 12:18 PM Alexandre Chartre wrote:
> >>
> >>
> >> On 11/16/20 8:48 PM, Andy Lutomirski wrote:
> >>> On Mon, Nov 16, 2020 at 6:49 AM Alexandre Chartre wrote:
> >>>>
> >>>> Extend PTI user mappings so that more kernel entry code can be executed
> >>>> with the user page-table. To do so, we need to map syscall and interrupt
> >>>> entry code, per cpu offsets (__per_cpu_offset, which is used some in
> >>>> entry code), the stack canary, and the PTI stack (which is defined per
> >>>> task).
> >>>
> >>> Does anything unmap the PTI stack? Mapping is easy, and unmapping
> >>> could be a pretty big mess.
> >>>
> >>
> >> No, there's no unmap.
> >> The mapping exists as long as the task page-table
> >> does (i.e. as long as the task mm exists). I assume that the task stack
> >> and mm are freed at the same time but that's not something I have checked.
> >>
> >
> > Nope. A multi-threaded mm will free task stacks when the task exits,
> > but the mm may outlive the individual tasks. Additionally, if you
> > allocate page tables as part of mapping PTI stacks, you need to make
> > sure the pagetables are freed.
>
> So I think I just need to unmap the PTI stack from the user page-table
> when the task exits. Everything else is handled because the kernel and
> PTI stack are allocated in a single chunk (referenced by task->stack).
>
>
> > Finally, you need to make sure that
> > the PTI stacks have appropriate guard pages -- just doubling the
> > allocation is not safe enough.
>
> The PTI stack does have guard pages because it maps only a part of the task
> stack into the user page-table, so pages around the PTI stack are not mapped
> into the user-pagetable (the page below is the task stack guard, and the page
> above is part of the kernel-only stack so it's never mapped into the user
> page-table).
>
> + * +-------------+
> + * |             | ^                       ^
> + * | kernel-only | | KERNEL_STACK_SIZE     |
> + * |    stack    | |                       |
> + * |             | V                       |
> + * +-------------+ <- top of kernel stack  | THREAD_SIZE
> + * |             | ^                       |
> + * | kernel and  | | KERNEL_STACK_SIZE     |
> + * | PTI stack   | |                       |
> + * |             | V                       v
> + * +-------------+ <- top of stack

There's no guard page between the stacks. That seems unfortunate.

>
> > My intuition is that this is going to be far more complexity than is justified.
>
> Sounds like only the PTI stack unmap is missing, which is hopefully not
> that bad. I will check that.
>
> alex.
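
The two guard-page claims in this exchange can be put side by side with a small model. What follows is only a sketch of one reading of the thread, not code from the patch: the page counts, the single guard page on each side of the allocation, the placement of the PTI half at the lower indices, and all helper names (`mapped_in_user`, `mapped_in_kernel`, `pti_stack_fenced`) are assumptions made for illustration. It treats the task stack as a run of pages and asks, for each page-table view, whether the pages immediately around the PTI stack are unmapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes in pages. The thread only states that
 * THREAD_SIZE covers two regions of KERNEL_STACK_SIZE each. */
#define KERNEL_STACK_PAGES ((size_t)4)
#define THREAD_PAGES       (2 * KERNEL_STACK_PAGES)
#define GUARD_PAGES        ((size_t)1)  /* assumed guard on each side */
#define SPAN_PAGES         (THREAD_PAGES + 2 * GUARD_PAGES)

/* Assumed page layout of the modelled span:
 *   [0]            guard page, unmapped in every page-table
 *   [1 .. K]       kernel and PTI stack
 *   [K+1 .. 2K]    kernel-only stack
 *   [2K+1]         guard page
 */
static bool is_guard(size_t page)
{
    return page < GUARD_PAGES || page >= GUARD_PAGES + THREAD_PAGES;
}

static bool is_pti_stack(size_t page)
{
    return page >= GUARD_PAGES && page < GUARD_PAGES + KERNEL_STACK_PAGES;
}

/* Kernel page-table view: the whole allocation is mapped. */
static bool mapped_in_kernel(size_t page)
{
    assert(page < SPAN_PAGES);
    return !is_guard(page);
}

/* User page-table view: only the PTI part of the stack is mapped. */
static bool mapped_in_user(size_t page)
{
    assert(page < SPAN_PAGES);
    return is_pti_stack(page);
}

typedef bool (*mapped_fn)(size_t page);

/* True if the pages immediately before and after the PTI stack are
 * unmapped in the given view, i.e. an overrun in either direction
 * would fault instead of touching a neighbouring mapping. */
static bool pti_stack_fenced(mapped_fn mapped)
{
    size_t first = GUARD_PAGES;
    size_t last = GUARD_PAGES + KERNEL_STACK_PAGES - 1;

    return !mapped(first - 1) && !mapped(last + 1);
}
```

In this model `pti_stack_fenced(mapped_in_user)` is true, which matches the argument that the PTI stack is surrounded by unmapped pages while the user page-table is active. `pti_stack_fenced(mapped_in_kernel)` is false, because the two stacks are adjacent in a single allocation and nothing unmapped separates them once the kernel page-table is in use, which is the objection raised in the reply.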