Date: Tue, 22 Sep 2020 15:58:01 +0300
From: Jarkko Sakkinen
To: Andy Lutomirski
Cc: Sean Christopherson, Andy Lutomirski, X86 ML, linux-sgx@vger.kernel.org, LKML, Linux-MM, Andrew Morton, Matthew Wilcox, Jethro Beekman, Darren Kenny, Andy Shevchenko, asapek@google.com, Borislav Petkov, "Xing, Cedric", chenalexchen@google.com, Conrad Parker, cyhanish@google.com, Dave Hansen, "Huang, Haitao", Josh Triplett, "Huang, Kai", "Svahn, Kai", Keith Moyer, Christian Ludloff, Neil Horman, Nathaniel McCallum, Patrick Uiterwijk, David Rientjes, Thomas Gleixner, yaozhangx@google.com
Subject: Re: [PATCH v38 10/24] mm: Add vm_ops->mprotect()
Message-ID: <20200922125801.GA133710@linux.intel.com>
References: <20200918235337.GA21189@sjchrist-ice> <1B23E216-0229-4BDD-8B09-807256A54AF5@amacapital.net>
In-Reply-To: <1B23E216-0229-4BDD-8B09-807256A54AF5@amacapital.net>
Organization: Intel Finland Oy - BIC
0357606-4 - Westendinkatu 7, 02160 Espoo

On Fri, Sep 18, 2020 at 05:15:32PM -0700, Andy Lutomirski wrote:
>
> > On Sep 18, 2020, at 4:53 PM, Sean Christopherson wrote:
> >
> > On Fri, Sep 18, 2020 at 08:09:04AM -0700, Andy Lutomirski wrote:
> >>> On Tue, Sep 15, 2020 at 4:28 AM Jarkko Sakkinen wrote:
> >>>
> >>> From: Sean Christopherson
> >>>
> >>> Add vm_ops()->mprotect() for additional constraints for a VMA.
> >>>
> >>> Intel Software Guard eXtensions (SGX) will use this callback to add two
> >>> constraints:
> >>>
> >>> 1. Verify that the address range does not have holes: each page address
> >>>    must be filled with an enclave page.
> >>> 2. Verify that VMA permissions won't surpass the permissions of any enclave
> >>>    page within the address range. Enclave cryptographically sealed
> >>>    permissions for each page address that set the upper limit for possible
> >>>    VMA permissions. Not respecting this can cause #GP's to be emitted.
> >
> > Side note, #GP is wrong. EPCM violations are #PFs. Skylake CPUs #GP, but
> > that's technically an errata. But this isn't the real motivation, e.g.
> > userspace can already trigger #GP/#PF by reading/writing a bad address, SGX
> > simply adds another flavor.
> >
> >> It's been awhile since I looked at this. Can you remind us: is this
> >> just preventing userspace from shooting itself in the foot or is this
> >> something more important?
> >
> > Something more important, it's used to prevent userspace from circumventing
> > a noexec filesystem by loading code into an enclave, and to give the kernel the
> > option of adding enclave specific LSM policies in the future.
> >
> > The source file (if one exists) for the enclave is long gone when the enclave
> > is actually mmap()'d and mprotect()'d. To enforce noexec, the requested
> > permissions for a given page are snapshotted when the page is added to the
> > enclave, i.e. when the enclave is built. Enclave pages that will be executable
> > must originate from a MAYEXEC VMA, e.g. the source page can't come from a
> > noexec file system.
> >
> > The ->mprotect() hook allows SGX to reject mprotect() if userspace is declaring
> > permissions beyond what are allowed, e.g. trying to map an enclave page with
> > EXEC permissions when the page was added to the enclave without EXEC.
> >
> > Future LSM policies have a similar need due to vm_file always pointing at
> > /dev/sgx/enclave, e.g. policies couldn't be attached to a specific enclave.
> > ->mprotect() again allows enforcing permissions at map time that were checked
> > at enclave build time, e.g. via an LSM hook.
> >
> > Deferring ->mprotect() until LSM support is added (if it ever is) would be
> > problematic due to SGX2. With SGX2, userspace can extend permissions of an
> > enclave page (for the CPU's EPC Map entry, not the kernel's page tables)
> > without bouncing through the kernel. Without ->mprotect() enforcement,
> > userspace could do EADD(RW) -> mprotect(RWX) -> EMODPE(X) to gain W+X. We
> > want to disallow such a flow now, i.e. force userspace to do EADD(RW,X), so
> > that the hypothetical LSM hook would have all information at EADD(), i.e.
> > would be aware of the EXEC permission, without creating divergent behavior
> > based on whether or not an LSM is active.
>
> That’s what I thought. Can we get this in the changelog?

I rewrote the commit message.

"
mm: Add 'mprotect' callback to vm_ops

Intel Software Guard eXtensions (SGX) allows creation of executable blobs
called enclaves, whose page permissions are defined when the enclave is
first loaded.
Once an enclave is loaded and initialized, it can be mapped to the process
address space.

Enclave permissions can be dynamically modified by using the ENCLU[EMODPE]
instruction. We want to limit its use so that it cannot grant higher
permissions than the ones defined when the enclave was first created.

Add 'mprotect' hook to vm_ops, so that we can implement a callback for SGX
that will check that {mmap, mprotect}() permissions do not surpass any of
the page permissions in the address range defined.

This is required in order to be able to make any access control decisions
when enclave pages are loaded.
"

/Jarkko
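
For illustration only, the two checks discussed in this thread (no holes in
the range, and requested permissions not exceeding what any page was added
with) could be modelled in plain userspace C roughly as below. This is a
sketch under stated assumptions, not the SGX driver code: struct enclave,
enclave_add_page(), enclave_may_map(), the P_* bits, ENCLAVE_PAGES and the
returned error values are all hypothetical names chosen for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical permission bits, standing in for PROT_READ/WRITE/EXEC. */
#define P_R 0x1u
#define P_W 0x2u
#define P_X 0x4u

/* Hypothetical fixed enclave size, in pages, to keep the model small. */
#define ENCLAVE_PAGES 8

/* Hypothetical enclave state: permissions snapshotted when a page is added. */
struct enclave {
	bool present[ENCLAVE_PAGES];
	unsigned int max_prot[ENCLAVE_PAGES];
};

/* Record a page together with the permissions it was added with. */
int enclave_add_page(struct enclave *e, size_t idx, unsigned int prot)
{
	assert(e != NULL);
	if (idx >= ENCLAVE_PAGES || e->present[idx])
		return -EINVAL;
	e->present[idx] = true;
	e->max_prot[idx] = prot;
	return 0;
}

/*
 * Model of what an mmap()/mprotect() callback would verify for the page
 * range [first, end): every page exists, and the requested permissions
 * are a subset of each page's snapshotted permissions.
 */
int enclave_may_map(const struct enclave *e, size_t first, size_t end,
		    unsigned int prot)
{
	assert(e != NULL);
	if (first >= end || end > ENCLAVE_PAGES)
		return -EINVAL;
	for (size_t i = first; i < end; i++) {
		if (!e->present[i])
			return -EACCES;	/* hole in the address range */
		if (prot & ~e->max_prot[i])
			return -EACCES;	/* asks for more than was sealed */
	}
	return 0;
}
```

In this model a page added as RW can be mapped RW but a later request for
RWX on it is refused, which is the EADD(RW) -> mprotect(RWX) flow described
above.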