Subject: Re: [PATCH v38 10/24] mm: Add vm_ops->mprotect()
To: Jarkko Sakkinen
Cc: Sean Christopherson, Haitao Huang, Andy Lutomirski, X86 ML,
    linux-sgx@vger.kernel.org, LKML, Linux-MM, Andrew Morton,
    Matthew Wilcox, Jethro Beekman, Darren Kenny, Andy Shevchenko,
    asapek@google.com, Borislav Petkov, "Xing, Cedric",
    chenalexchen@google.com, Conrad Parker, cyhanish@google.com,
    "Huang, Haitao", Josh Triplett, "Huang, Kai", "Svahn, Kai",
    Keith Moyer, Christian Ludloff, Neil Horman, Nathaniel McCallum,
    Patrick Uiterwijk, David Rientjes, Thomas Gleixner,
    yaozhangx@google.com
From: Dave Hansen
Date: Mon, 28 Sep 2020 09:48:10 -0700
In-Reply-To: <20200928161954.GB92669@linux.intel.com>

On 9/28/20 9:19 AM, Jarkko Sakkinen wrote:
> On Mon, Sep 28, 2020 at 07:04:38AM -0700, Dave Hansen wrote:
>> EMODPE is virtually irrelevant for this whole thing.  The x86 PTE
>> permissions still specify the most restrictive permissions, which is
>> what matters the most.
>>
>> We care about the _worst_ the enclave can do, not what it imposes on
>> itself on top of that.
>
> AFAIK it is not, or what we are protecting against with this anyway
> then?
>
> Let say an LSM makes decision for the permissions based on origin. If we
> do not have this you can:
>
> 1. EMODPE
> 2. mprotect

The thing that matters is that the enclave needs relaxed permissions
from the kernel.  What it *ALSO* needs to do to *ITSELF* to get those
permissions is entirely irrelevant to the kernel.

> I.e. whatever LSM decides, won't matter.
>
> The other case, noexec, is now unconditionally denied.

>>> I think other important thing to underline is that an LSM or any other
>>> security measure can only do a sane decision when the enclave is loaded.
>>> At that point we know the source (vm_file).
>>
>> Right, you know the source, but it can be anonymous or a file.
>
> They are both origin, the point being that you know what you're dealing
> with when you build the enclave, not when you map it.
>
> This is my current rewrite of the commit message in my master branch:
>
> "
>     mm: Add 'mprotect' callback to vm_ops
>
>     Intel Software Guard eXtensions (SGX) allows creation of blobs called
>     enclaves, for which page permissions are defined when the enclave is first
>     loaded. Once an enclave is loaded and initialized, it can be mapped to the
>     process address space.
>
>     There is no standard file format for enclaves. They are dynamically built
>     and the ways how enclaves are deployed differ greatly. For an app you might
>     want to have a simple static binary, but on the other hand for a container
>     you might want to dynamically create the whole thing at run-time. Also, the
>     existing ecosystem for SGX is already large, which would make the task very
>     hard.

I'm sorry I ever mentioned the file format.  Please remove any mention
of it.  It's irrelevant.  This entire paragraph is irrelevant.

>     Finally, even if there was a standard format, one would still want a
>     dynamic way to add pages to the enclave. One big reason for this is that
>     enclaves have load time defined pages that represent entry points to the
>     enclave. Each entry point can service one hardware thread at a time and
>     you might want to run-time parametrize this depending on your environment.

I also don't know what this paragraph has to do with the mprotect()
hook.  Please remove it.

>     The consequence is that enclaves are best created with an ioctl API and the
>     access control can be based only to the origin of the source file for the
>     enclave data, i.e. on VMA file pointer and page permissions. For example,

It's not strictly page permissions, though.  It's actually VMA
permissions.  The thing you copy from might be the zero page, and even
though it has Write=0 page permissions, apps are completely OK to write
to the address.  This is the WRITE vs. MAY_WRITE semantic in the VMA
flags.

It's also not just about *files*.  Anonymous memory might or might not
be a valid source for enclave data based on LSM hooks.

>     this could be done with LSM hooks that are triggered in the appropriate
>     ioctl's and they could make the access control decision based on this
>     information.

This "appropriate ioctl's" is not good changelog material.  Please use
those bytes to convey actual information.

	... this could be done with LSM hooks which restrict the source
	of enclave page data

I don't care that it's an ioctl(), really.  What matters is what the
ioctl() does: copy data into enclave pages.

>     Unfortunately, there is ENCLS[EMODPE] that a running enclave can use to
>     upgrade its permissions. If we do not limit mmap() and mprotect(), enclave
>     could upgrade its permissions by using EMODPE followed by an appropriate
>     mprotect() call. This would be completely hidden from the kernel.

There's too much irrelevant info.
I'll say it again: all that matters is that enclaves can legitimately,
safely, and securely have a need for the kernel to change page
permissions.  That's *IT*.  EMODPE just happens to be part of the
mechanism that makes these permission changes safe for enclaves.  It's
a side show.

>     Add 'mprotect' hook to vm_ops, so that a callback can be implemented for SGX
>     that will ensure that {mmap, mprotect}() permissions do not surpass any of
>     the original page permissions. This feature allows to maintain and refine
>     sane access control for enclaves.

Instead of "original", I'd stick to the "source" page nomenclature.
There are also "original" permissions with mprotect().

Also, it's literally OK for the enclave page permissions to surpass the
original (source) page permissions.  That sentence is incorrect, or at
least misleadingly imprecise.

> I'm mostly happy with this but am open for change suggestions.

I wrote a pretty nice description of this.  It was about 90% correct,
shorter, and conveyed more information.  I'd suggest starting with that.
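
To make the WRITE vs. MAY_WRITE point concrete, here is a minimal
userspace sketch of the kind of check an mprotect() hook could perform.
It is not code from the series.  Every name in it (the SK_* bits,
sk_source_vma, sk_enclave_page, sk_add_page, sk_may_mprotect) is
hypothetical, and it assumes that the limit recorded for an enclave page
comes from what the source VMA *may* be mapped with, not from how the
source page happens to be mapped right now.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical permission bits for this sketch (not the kernel's VM_* values). */
#define SK_READ  0x1u
#define SK_WRITE 0x2u
#define SK_EXEC  0x4u

/* Hypothetical model of the VMA that enclave page data is copied from. */
struct sk_source_vma {
	unsigned int cur_prot;	/* analogous to VM_READ/VM_WRITE/VM_EXEC */
	unsigned int may_prot;	/* analogous to VM_MAYREAD/VM_MAYWRITE/VM_MAYEXEC */
};

/* Hypothetical per-page record kept when data is copied into the enclave. */
struct sk_enclave_page {
	unsigned int max_prot;	/* upper bound for later mmap()/mprotect() */
};

/*
 * Assumption: the bound is taken from the source VMA's "may" bits, so a
 * source that is currently read-only but may become writable still
 * yields a page that can later be mapped writable.
 */
static void sk_add_page(struct sk_enclave_page *page,
			const struct sk_source_vma *src)
{
	page->max_prot = src->may_prot;
}

/* Allow the request only if no page in the range forbids a requested bit. */
static bool sk_may_mprotect(const struct sk_enclave_page *pages, size_t n,
			    unsigned int req)
{
	for (size_t i = 0; i < n; i++) {
		if (req & ~pages[i].max_prot)
			return false;
	}
	return true;
}
```

With a source VMA whose cur_prot is SK_READ and whose may_prot is
SK_READ | SK_WRITE, a later request for SK_READ | SK_WRITE passes even
though it exceeds what the source currently has, while a request for
SK_EXEC is refused.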