Date: Tue, 25 Feb 2020 12:14:34 -0800
From: Kees Cook
To: Yu-cheng Yu
Cc: x86@kernel.org, "H. Peter Anvin", Thomas Gleixner, Ingo Molnar,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, linux-api@vger.kernel.org, Arnd Bergmann,
 Andy Lutomirski, Balbir Singh, Borislav Petkov, Cyrill Gorcunov,
 Dave Hansen, Eugene Syromiatnikov, Florian Weimer, "H.J. Lu", Jann Horn,
 Jonathan Corbet, Mike Kravetz, Nadav Amit, Oleg Nesterov, Pavel Machek,
 Peter Zijlstra, Randy Dunlap,
Shankar" , Vedvyas Shanbhogue , Dave Martin , x86-patch-review@intel.com Subject: Re: [RFC PATCH v9 12/27] x86/mm: Modify ptep_set_wrprotect and pmdp_set_wrprotect for _PAGE_DIRTY_SW Message-ID: <202002251214.8B2063AA87@keescook> References: <20200205181935.3712-1-yu-cheng.yu@intel.com> <20200205181935.3712-13-yu-cheng.yu@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200205181935.3712-13-yu-cheng.yu@intel.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Feb 05, 2020 at 10:19:20AM -0800, Yu-cheng Yu wrote: > When Shadow Stack (SHSTK) is enabled, the [R/O + PAGE_DIRTY_HW] setting is > reserved only for SHSTK. Non-Shadow Stack R/O PTEs are > [R/O + PAGE_DIRTY_SW]. > > When a PTE goes from [R/W + PAGE_DIRTY_HW] to [R/O + PAGE_DIRTY_SW], it > could become a transient SHSTK PTE in two cases. > > The first case is that some processors can start a write but end up seeing > a read-only PTE by the time they get to the Dirty bit, creating a transient > SHSTK PTE. However, this will not occur on processors supporting SHSTK > therefore we don't need a TLB flush here. > > The second case is that when the software, without atomic, tests & replaces > PAGE_DIRTY_HW with PAGE_DIRTY_SW, a transient SHSTK PTE can exist. This is > prevented with cmpxchg. > > Dave Hansen, Jann Horn, Andy Lutomirski, and Peter Zijlstra provided many > insights to the issue. Jann Horn provided the cmpxchg solution. > > v9: > - Change compile-time conditionals to runtime checks. > - Fix parameters of try_cmpxchg(): change pte_t/pmd_t to > pte_t.pte/pmd_t.pmd. > > v4: > - Implement try_cmpxchg(). > > Signed-off-by: Yu-cheng Yu Reviewed-by: Kees Cook -Kees > --- > arch/x86/include/asm/pgtable.h | 66 ++++++++++++++++++++++++++++++++++ > 1 file changed, 66 insertions(+) > > diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h > index 2733e7ec16b3..43cb27379208 100644 > --- a/arch/x86/include/asm/pgtable.h > +++ b/arch/x86/include/asm/pgtable.h > @@ -1253,6 +1253,39 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm, > static inline void ptep_set_wrprotect(struct mm_struct *mm, > unsigned long addr, pte_t *ptep) > { > + /* > + * Some processors can start a write, but end up seeing a read-only > + * PTE by the time they get to the Dirty bit. In this case, they > + * will set the Dirty bit, leaving a read-only, Dirty PTE which > + * looks like a Shadow Stack PTE. > + * > + * However, this behavior has been improved and will not occur on > + * processors supporting Shadow Stack. Without this guarantee, a > + * transition to a non-present PTE and flush the TLB would be > + * needed. > + * > + * When changing a writable PTE to read-only and if the PTE has > + * _PAGE_DIRTY_HW set, we move that bit to _PAGE_DIRTY_SW so that > + * the PTE is not a valid Shadow Stack PTE. > + */ > +#ifdef CONFIG_X86_64 > + if (static_cpu_has(X86_FEATURE_SHSTK)) { > + pte_t new_pte, pte = READ_ONCE(*ptep); > + > + do { > + /* > + * This is the same as moving _PAGE_DIRTY_HW > + * to _PAGE_DIRTY_SW. 
> +                         */
> +                        new_pte = pte_wrprotect(pte);
> +                        new_pte.pte |= (new_pte.pte & _PAGE_DIRTY_HW) >>
> +                                       _PAGE_BIT_DIRTY_HW << _PAGE_BIT_DIRTY_SW;
> +                        new_pte.pte &= ~_PAGE_DIRTY_HW;
> +                } while (!try_cmpxchg(&ptep->pte, &pte.pte, new_pte.pte));
> +
> +                return;
> +        }
> +#endif
>          clear_bit(_PAGE_BIT_RW, (unsigned long *)&ptep->pte);
>  }
>
> @@ -1303,6 +1336,39 @@ static inline pud_t pudp_huge_get_and_clear(struct mm_struct *mm,
>  static inline void pmdp_set_wrprotect(struct mm_struct *mm,
>                                        unsigned long addr, pmd_t *pmdp)
>  {
> +        /*
> +         * Some processors can start a write, but end up seeing a read-only
> +         * PMD by the time they get to the Dirty bit.  In this case, they
> +         * will set the Dirty bit, leaving a read-only, Dirty PMD which
> +         * looks like a Shadow Stack PMD.
> +         *
> +         * However, this behavior has been improved and will not occur on
> +         * processors supporting Shadow Stack.  Without this guarantee, a
> +         * transition to a non-present PMD and flush the TLB would be
> +         * needed.
> +         *
> +         * When changing a writable PMD to read-only and if the PMD has
> +         * _PAGE_DIRTY_HW set, we move that bit to _PAGE_DIRTY_SW so that
> +         * the PMD is not a valid Shadow Stack PMD.
> +         */
> +#ifdef CONFIG_X86_64
> +        if (static_cpu_has(X86_FEATURE_SHSTK)) {
> +                pmd_t new_pmd, pmd = READ_ONCE(*pmdp);
> +
> +                do {
> +                        /*
> +                         * This is the same as moving _PAGE_DIRTY_HW
> +                         * to _PAGE_DIRTY_SW.
> +                         */
> +                        new_pmd = pmd_wrprotect(pmd);
> +                        new_pmd.pmd |= (new_pmd.pmd & _PAGE_DIRTY_HW) >>
> +                                       _PAGE_BIT_DIRTY_HW << _PAGE_BIT_DIRTY_SW;
> +                        new_pmd.pmd &= ~_PAGE_DIRTY_HW;
> +                } while (!try_cmpxchg(&pmdp->pmd, &pmd.pmd, new_pmd.pmd));
> +
> +                return;
> +        }
> +#endif
>          clear_bit(_PAGE_BIT_RW, (unsigned long *)pmdp);
>  }
>
> --
> 2.21.0
>

-- 
Kees Cook
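
[Editor's note] As a rough illustration of the transition the patch performs in
ptep_set_wrprotect(), the user-space C sketch below replays the same steps:
clear the write bit, move a set hardware-dirty bit into a software-dirty bit,
and publish the result with a compare-and-exchange retry loop. The bit
positions, the wrprotect_shstk() helper, and the GCC/Clang
__atomic_compare_exchange_n() builtin are stand-ins chosen for the sketch; they
are not the kernel's _PAGE_* definitions or its try_cmpxchg().

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative bit positions only; the real values live in the kernel headers. */
#define PAGE_BIT_RW        1
#define PAGE_BIT_DIRTY_HW  6
#define PAGE_BIT_DIRTY_SW  58

#define PAGE_RW        (1ULL << PAGE_BIT_RW)
#define PAGE_DIRTY_HW  (1ULL << PAGE_BIT_DIRTY_HW)
#define PAGE_DIRTY_SW  (1ULL << PAGE_BIT_DIRTY_SW)

/* Clear RW and move a set DIRTY_HW bit to DIRTY_SW in one atomic update. */
static void wrprotect_shstk(uint64_t *ptep)
{
        uint64_t old_pte = __atomic_load_n(ptep, __ATOMIC_RELAXED);
        uint64_t new_pte;

        do {
                new_pte = old_pte & ~PAGE_RW;   /* like pte_wrprotect() */
                new_pte |= ((new_pte & PAGE_DIRTY_HW) >> PAGE_BIT_DIRTY_HW)
                           << PAGE_BIT_DIRTY_SW;
                new_pte &= ~PAGE_DIRTY_HW;
                /* On failure, old_pte is reloaded with the current value
                 * and the loop retries, mirroring the try_cmpxchg() loop. */
        } while (!__atomic_compare_exchange_n(ptep, &old_pte, new_pte, false,
                                              __ATOMIC_RELAXED, __ATOMIC_RELAXED));
}

int main(void)
{
        uint64_t pte = PAGE_RW | PAGE_DIRTY_HW;

        wrprotect_shstk(&pte);
        /* Expect RW and DIRTY_HW clear, DIRTY_SW set: the PTE is never
         * observable as read-only + hardware-dirty (a shadow stack PTE). */
        printf("pte = %#llx\n", (unsigned long long)pte);
        return 0;
}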