Date: Thu, 10 Feb 2022 11:27:34 -0800
From: Dave Hansen
To: Rick Edgecombe, x86@kernel.org, "H . Peter Anvin", Thomas Gleixner,
	Ingo Molnar, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, linux-arch@vger.kernel.org,
	linux-api@vger.kernel.org, Arnd Bergmann, Andy Lutomirski,
	Balbir Singh, Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, Florian Weimer, "H . J . Lu", Jann Horn,
	Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov,
	Pavel Machek, Peter Zijlstra, Randy Dunlap, "Ravi V . Shankar",
	Dave Martin, Weijiang Yang, "Kirill A . Shutemov",
	joao.moreira@intel.com, John Allen, kcc@google.com, eranian@google.com
Cc: Yu-cheng Yu
Subject: Re: [PATCH 21/35] mm/mprotect: Exclude shadow stack from preserve_write
References: <20220130211838.8382-1-rick.p.edgecombe@intel.com>
	<20220130211838.8382-22-rick.p.edgecombe@intel.com>
In-Reply-To: <20220130211838.8382-22-rick.p.edgecombe@intel.com>

On 1/30/22 13:18, Rick Edgecombe wrote:
> In change_pte_range(), when a PTE is changed for prot_numa, _PAGE_RW is
> preserved to avoid the additional write fault after the NUMA hinting fault.
> However, pte_write() now includes both normal writable and shadow stack
> (RW=0, Dirty=1) PTEs, but the latter does not have _PAGE_RW and has no need
> to preserve it.

This series creates an interesting situation: it causes a logical
disconnection between things that were tightly coupled before.

For instance, before this series, _PAGE_RW=1 and "writable" really were
synonyms.  They meant the same thing.
One of the complexities in this series is differentiating the two.
For instance, a shadow stack page can be written to, even though it
has _PAGE_RW=0.

This particular patch seems to be hacking around the problem that a
p*_mkwrite() doesn't work on shadow stack PTE/PMDs.

First, that makes me wonder what *actually* happens if we do a plain
pte_mkwrite() on a shadow stack PTE.  I *think* it will take the
[Write=0,Dirty=1] PTE and

	pte = pte_set_flags(pte, _PAGE_RW);

so we'll end up with [Write=1,Dirty=1], which is bad.

Let's say pte_mkwrite() can't be fixed.  We should probably make it
VM_BUG_ON() if it's ever asked to muck with a shadow stack PTE.

It's also weird because we have this pte_write()==1 PTE in a !VM_WRITE
VMA.  Then, we're trying to pte_mkwrite() under this !VM_WRITE VMA.

	pte_write()   <-- returns true for a shadow stack PTE!
	pte_mkwrite() <-- illegal on a shadow stack PTE

I need to think about this a little more.  I don't have a solution.
But, as-is, it seems untenable.  The rules are just too counterintuitive
to live.
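
To make the [Write=0,Dirty=1] -> [Write=1,Dirty=1] concern concrete,
here is a minimal user-space sketch of the bit manipulation.  It is a
model, not the kernel code: pte_t is reduced to a bare 64-bit value,
pte_shstk() and pte_write() are written the way this thread describes
the series (an assumption on my part), and pte_mkwrite_checked() is a
hypothetical name with assert() standing in for VM_BUG_ON().  Only the
x86 bit positions (R/W is bit 1, Dirty is bit 6) are taken as given.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 PTE bit positions: bit 1 is R/W, bit 6 is Dirty. */
#define _PAGE_RW    (UINT64_C(1) << 1)
#define _PAGE_DIRTY (UINT64_C(1) << 6)

/* Simplified stand-in for the kernel's pte_t. */
typedef struct { uint64_t pte; } pte_t;

static inline pte_t pte_set_flags(pte_t pte, uint64_t set)
{
	pte.pte |= set;
	return pte;
}

/* Shadow stack encoding discussed in the thread: Write=0, Dirty=1. */
static inline bool pte_shstk(pte_t pte)
{
	return (pte.pte & (_PAGE_RW | _PAGE_DIRTY)) == _PAGE_DIRTY;
}

/* Assumed post-series meaning: normal writable OR shadow stack. */
static inline bool pte_write(pte_t pte)
{
	return (pte.pte & _PAGE_RW) || pte_shstk(pte);
}

/* Plain pte_mkwrite(): sets _PAGE_RW unconditionally. */
static inline pte_t pte_mkwrite(pte_t pte)
{
	return pte_set_flags(pte, _PAGE_RW);
}

/*
 * Hypothetical guarded variant: refuse to touch a shadow stack PTE.
 * assert() plays the role VM_BUG_ON() would play in the kernel.
 */
static inline pte_t pte_mkwrite_checked(pte_t pte)
{
	assert(!pte_shstk(pte));
	return pte_mkwrite(pte);
}
```

In this model, a PTE holding only _PAGE_DIRTY already reports
pte_write() as true, and running it through the plain pte_mkwrite()
leaves both bits set, so it no longer matches the shadow stack
encoding.  The checked variant trips instead of silently converting
the PTE.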