From: Rick Edgecombe
To: x86@kernel.org, "H . Peter Anvin", Thomas Gleixner, Ingo Molnar,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, linux-arch@vger.kernel.org,
 linux-api@vger.kernel.org, Arnd Bergmann, Andy Lutomirski,
 Balbir Singh, Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
 Eugene Syromiatnikov, Florian Weimer, "H . J . Lu", Jann Horn,
 Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov,
 Pavel Machek, Peter Zijlstra, Randy Dunlap, Weijiang Yang,
 "Kirill A . Shutemov", John Allen, kcc@google.com, eranian@google.com,
 rppt@kernel.org, jamorris@linux.microsoft.com, dethoma@microsoft.com,
 akpm@linux-foundation.org, Andrew.Cooper3@citrix.com,
 christina.schimpe@intel.com
Cc: rick.p.edgecombe@intel.com, David Hildenbrand, Yu-cheng Yu
Subject: [PATCH v5 18/39] mm: Handle faultless write upgrades for shstk
Date: Thu, 19 Jan 2023 13:22:56 -0800
Message-Id: <20230119212317.8324-19-rick.p.edgecombe@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230119212317.8324-1-rick.p.edgecombe@intel.com>
References: <20230119212317.8324-1-rick.p.edgecombe@intel.com>
X-Mailing-List: linux-doc@vger.kernel.org

The x86 Control-flow Enforcement Technology (CET) feature includes a new
type of memory called shadow stack. This shadow stack memory has some
unusual properties, which require some core mm changes to function
properly.

Since shadow stack memory can be changed from userspace, it is both
VM_SHADOW_STACK and VM_WRITE. But it should not be made conventionally
writable (i.e. pte_mkwrite()). So some code that calls pte_mkwrite()
needs to be adjusted.

One such case is when memory is made writable without an actual write
fault. This happens in some mprotect operations, and also in prot_numa
faults.
In both cases, the code checks whether the memory should be made
(conventionally) writable by calling
vma_wants_manual_pte_write_upgrade().

One way to fix this would be to have the code actually check whether the
memory is also VM_SHADOW_STACK and, in that case, call
pte_mkwrite_shstk(). But since most memory won't be shadow stack, keep
the logic simple and skip this optimization for shadow stack by changing
vma_wants_manual_pte_write_upgrade() to not return true for
VM_SHADOW_STACK memory. This handles all cases of this type with a
single change.

Cc: David Hildenbrand
Tested-by: Pengfei Xu
Tested-by: John Allen
Signed-off-by: Yu-cheng Yu
Reviewed-by: Kirill A. Shutemov
Signed-off-by: Rick Edgecombe
---
v5:
 - Update solution after the recent removal of pte_savedwrite()

v4:
 - Add "why" to comments in code (Peterz)

Yu-cheng v25:
 - Move is_shadow_stack_mapping() to a separate line.

Yu-cheng v24:
 - Change arch_shadow_stack_mapping() to is_shadow_stack_mapping().

 include/linux/mm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index e15d2fc04007..139a682d243b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2181,7 +2181,7 @@ static inline bool vma_wants_manual_pte_write_upgrade(struct vm_area_struct *vma
 	 */
 	if (vma->vm_flags & VM_SHARED)
 		return vma_wants_writenotify(vma, vma->vm_page_prot);
-	return !!(vma->vm_flags & VM_WRITE);
+	return (vma->vm_flags & VM_WRITE) && !(vma->vm_flags & VM_SHADOW_STACK);
 }
 
 bool can_change_pte_writable(struct vm_area_struct *vma, unsigned long addr,
-- 
2.17.1
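
For readers who want to see the effect of the one-line change in
isolation, here is a minimal userspace sketch of the predicate before
and after the patch. It is not kernel code. struct vma_model, its
wants_writenotify field (a stand-in for the result of
vma_wants_writenotify()), the two wants_upgrade_*() helper names and the
bit chosen for VM_SHADOW_STACK are all assumptions made for
illustration; the real VM_SHADOW_STACK value is defined by the kernel.

```c
#include <stdbool.h>

/* Illustrative flag values only; the kernel headers define the real ones. */
#define VM_WRITE        0x00000002UL
#define VM_SHARED       0x00000008UL
#define VM_SHADOW_STACK 0x80000000UL /* hypothetical bit for this sketch */

/* Hypothetical stand-in for the parts of struct vm_area_struct used here. */
struct vma_model {
	unsigned long vm_flags;
	bool wants_writenotify; /* models the vma_wants_writenotify() result */
};

/* Predicate as it was before the patch. */
static bool wants_upgrade_old(const struct vma_model *vma)
{
	if (vma->vm_flags & VM_SHARED)
		return vma->wants_writenotify;
	return !!(vma->vm_flags & VM_WRITE);
}

/* Predicate after the patch: private shadow stack VMAs are excluded. */
static bool wants_upgrade_new(const struct vma_model *vma)
{
	if (vma->vm_flags & VM_SHARED)
		return vma->wants_writenotify;
	return (vma->vm_flags & VM_WRITE) && !(vma->vm_flags & VM_SHADOW_STACK);
}
```

With vm_flags set to VM_WRITE | VM_SHADOW_STACK, the old helper returns
true and the new one returns false, so callers would no longer take the
faultless pte_mkwrite() path for such a mapping. An ordinary private
VM_WRITE mapping gets true from both helpers, and the VM_SHARED branch
is the same in both.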