From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 85F0DC433FE for ; Sat, 5 Mar 2022 01:03:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230212AbiCEBEH (ORCPT ); Fri, 4 Mar 2022 20:04:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52388 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229478AbiCEBEF (ORCPT ); Fri, 4 Mar 2022 20:04:05 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BCE831CE9B9; Fri, 4 Mar 2022 17:03:15 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 4198161F51; Sat, 5 Mar 2022 01:03:15 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 26902C340E9; Sat, 5 Mar 2022 01:03:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1646442194; bh=zYtuIkRNR31j199FK8vxS8K91wZhoT2cjKlItlsB5BI=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=YRkmV9jbquG3Ad4ozCAUiZRLKfv5Hl2E+L0NewuVd9kpGx7PJL6pfem128cjTF9Q5 bX9rSfSCjMce5mqzyYELGruLXf+JL4LcLg374s6hJRmnTStTp/ygfohTe1BRecFdlY dlijgwCFK9zUb78kV2Yau56Wx2vYl+WBr+w1P87kTU27PZTo4iaNWoVeYGNvsSxfwB qSNCye7ay0dXWieBVmnOv/AivGBfbAgfwb+kUz0k0QUimVpZUhhzxYjlQgRzFpnNTt GbAXFoXUStg5HpoxV0+YtiqoWyxjbg5qKROQq+4HHInWjgnGUpbJtjDPP9AWwC2O16 FCFSUc2FjRNLA== Date: Sat, 5 Mar 2022 03:02:28 +0200 From: Jarkko Sakkinen To: Haitao Huang Cc: Reinette Chatre , Dave Hansen , "Dhanraj, Vijay" , "dave.hansen@linux.intel.com" , "tglx@linutronix.de" , "bp@alien8.de" , "Lutomirski, Andy" , "mingo@redhat.com" , "linux-sgx@vger.kernel.org" , "x86@kernel.org" , "Christopherson,, Sean" , "Huang, Kai" , "Zhang, Cathy" , "Xing, Cedric" , "Huang, Haitao" , "Shanahan, Mark" , "hpa@zytor.com" , "linux-kernel@vger.kernel.org" Subject: Re: [PATCH V2 16/32] x86/sgx: Support restricting of enclave page permissions Message-ID: References: <2d2d3471-78ce-9faa-daf6-138078f5ffaa@intel.com> <6f65287a-f69c-971e-be2c-929e327e7ff9@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Mar 04, 2022 at 09:51:22AM -0600, Haitao Huang wrote: > Hi Jarkko > > On Fri, 04 Mar 2022 02:30:22 -0600, Jarkko Sakkinen > wrote: > > > On Thu, Mar 03, 2022 at 10:03:30PM -0600, Haitao Huang wrote: > > > > > > On Thu, 03 Mar 2022 17:18:33 -0600, Jarkko Sakkinen > > > wrote: > > > > > > > On Thu, Mar 03, 2022 at 10:08:14AM -0600, Haitao Huang wrote: > > > > > Hi all, > > > > > > > > > > On Wed, 02 Mar 2022 16:57:45 -0600, Reinette Chatre > > > > > wrote: > > > > > > > > > > > Hi Jarkko, > > > > > > > > > > > > On 3/1/2022 6:05 PM, Jarkko Sakkinen wrote: > > > > > > > On Tue, Mar 01, 2022 at 09:48:48AM -0800, Reinette Chatre wrote: > > > > > > > > Hi Jarkko, > > > > > > > > > > > > > > > > On 3/1/2022 5:42 AM, Jarkko Sakkinen wrote: > > > > > > > > > > With EACCEPTCOPY (kudos to Mark S. for reminding me of > > > > > > > > > > this version of > > > > > > > > > > EACCEPT @ chat.enarx.dev) it is possible to make R and RX > > > > > pages but > > > > > > > > > > obviously new RX pages are now out of the picture: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > /* > > > > > > > > > > * Adding a regular page that is architecturally allowed > > > > > to only > > > > > > > > > > * be created with RW permissions. > > > > > > > > > > * TBD: Interface with user space policy to support max > > > > > permissions > > > > > > > > > > * of RWX. > > > > > > > > > > */ > > > > > > > > > > prot = PROT_READ | PROT_WRITE; > > > > > > > > > > encl_page->vm_run_prot_bits = calc_vm_prot_bits(prot, 0); > > > > > > > > > > encl_page->vm_max_prot_bits = > > > encl_page->vm_run_prot_bits; > > > > > > > > > > > > > > > > > > > > If that TBD is left out to the final version the page > > > > > > > > > > augmentation has a > > > > > > > > > > risk of a API bottleneck, and that risk can realize then > > > > > > > > > > also in the page > > > > > > > > > > permission ioctls. > > > > > > > > > > > > > > > > > > > > I.e. now any review comment is based on not fully known > > > > > > > > > > territory, we have > > > > > > > > > > one known unknown, and some unknown unknowns from > > > > > > > > > > unpredictable effect to > > > > > > > > > > future API changes. > > > > > > > > > > > > > > > > The plan to complete the "TBD" in the above snippet was to > > > > > > > > follow this work > > > > > > > > with user policy integration at this location. On a high level > > > > > > > > the plan was > > > > > > > > for this to look something like: > > > > > > > > > > > > > > > > > > > > > > > > /* > > > > > > > > * Adding a regular page that is architecturally allowed > > > to only > > > > > > > > * be created with RW permissions. > > > > > > > > * Interface with user space policy to support max > > > permissions > > > > > > > > * of RWX. > > > > > > > > */ > > > > > > > > prot = PROT_READ | PROT_WRITE; > > > > > > > > encl_page->vm_run_prot_bits = calc_vm_prot_bits(prot, 0); > > > > > > > > > > > > > > > > if (user space policy allows RWX on dynamically added > > > > > pages) > > > > > > > > encl_page->vm_max_prot_bits = calc_vm_prot_bits(PROT_READ | > > > > > > > > PROT_WRITE | PROT_EXEC, 0); > > > > > > > > else > > > > > > > > encl_page->vm_max_prot_bits = calc_vm_prot_bits(PROT_READ | > > > > > > > > PROT_WRITE, 0); > > > > > > > > > > > > > > > > The work that follows this series aimed to do the integration > > > > > with user > > > > > > > > space policy. > > > > > > > > > > > > > > What do you mean by "user space policy" anyway exactly? I'm > > > > > sorry but I > > > > > > > just don't fully understand this. > > > > > > > > > > > > My apologies - I just assumed that you would need no reminder > > > > > about this > > > > > > contentious > > > > > > part of SGX history. Essentially it means that, yes, the > > > kernel could > > > > > > theoretically > > > > > > permit any kind of access to any file/page, but some accesses are > > > > > known > > > > > > to generally > > > > > > be a bad idea - like making memory executable as well as writable > > > > > - and > > > > > > thus there > > > > > > are additional checks based on what user space permits before the > > > > > kernel > > > > > > allows > > > > > > such accesses. > > > > > > > > > > > > For example, > > > > > > mm/mprotect.c:do_mprotect_pkey()->security_file_mprotect() > > > > > > > > > > > > User policy and SGX has seen significant discussion. Some notable > > > > > > threads: > > > > > > https://lore.kernel.org/linux-security-module/CALCETrXf8mSK45h7sTK5Wf+pXLVn=Bjsc_RLpgO-h-qdzBRo5Q@mail.gmail.com/ > > > > > > https://lore.kernel.org/linux-security-module/20190619222401.14942-1-sean.j.christopherson@intel.com/ > > > > > > > > > > > > > It's too big of a risk to accept this series without X taken > > > care > > > > > > > of. Patch > > > > > > > series should neither have TODO nor TBD comments IMHO. I > > > don't want > > > > > > > to ack > > > > > > > a series based on speculation what might happen in the future. > > > > > > > > > > > > ok > > > > > > > > > > > > > > > > > > > > > > I think the best way to move forward would be to do EAUG's > > > > > > > > > explicitly with > > > > > > > > > an ioctl that could also include secinfo for permissions. > > > > > Then you can > > > > > > > > > easily do the rest with EACCEPTCOPY inside the enclave. > > > > > > > > > > > > > > > > SGX_IOC_ENCLAVE_ADD_PAGES already exists and could possibly be > > > > > used for > > > > > > > > this purpose. It already includes SECINFO which may also be > > > > > useful if > > > > > > > > needing to later support EAUG of PT_SS* pages. > > > > > > > > > > > > > > You could also simply add SGX_IOC_ENCLAVE_AUGMENT_PAGES and > > > call it > > > > > > > a day. > > > > > > > > > > > > I could, yes. > > > > > > > > > > > > > And if there is plan to extend SGX_IOC_ENCLAVE_ADD_PAGES what is > > > > > > > this weird > > > > > > > thing added to the #PF handler? Why is it added at all then? > > > > > > > > > > > > I was just speculating in my response, there is no plan to extend > > > > > > SGX_IOC_ENCLAVE_ADD_PAGES (that I am aware of). > > > > > > > > > > > > > > How this could work is user space calls > > > SGX_IOC_ENCLAVE_ADD_PAGES > > > > > > > > after enclave initialization on any memory region within the > > > > > > > > enclave where > > > > > > > > pages are planned to be added dynamically. This ioctl() calls > > > > > > > > EAUG to add the > > > > > > > > new pages with RW permissions and their vm_max_prot_bits > > > can be > > > > > > > > set to the > > > > > > > > permissions found in the included SECINFO. This will support > > > > > > > > later EACCEPTCOPY > > > > > > > > as well as SGX_IOC_ENCLAVE_RELAX_PERMISSIONS > > > > > > > > > > > > > > I don't like this type of re-use of the existing API. > > > > > > > > > > > > I could proceed with SGX_IOC_ENCLAVE_AUGMENT_PAGES if there is > > > > > consensus > > > > > > after > > > > > > considering the user policy question (above) and performance > > > trade-off > > > > > > (more below). > > > > > > > > > > > > > > > > > > > > > The big question is whether communicating user policy after > > > > > > > > enclave initialization > > > > > > > > via the SECINFO within SGX_IOC_ENCLAVE_ADD_PAGES is acceptable > > > > > > > > to all? I would > > > > > > > > appreciate a confirmation on this direction considering the > > > > > > > > significant history > > > > > > > > behind this topic. > > > > > > > > > > > > > > I have no idea because I don't know what is user space policy. > > > > > > > > > > > > This discussion is about some enclave usages needing RWX > > > permissions > > > > > > on dynamically added enclave pages. RWX permissions on dynamically > > > > > added > > > > > > pages is > > > > > > not something that should blindly be allowed for all SGX > > > enclaves but > > > > > > instead the user > > > > > > needs to explicitly allow specific enclaves to have such > > > ability. This > > > > > > is equivalent > > > > > > to (but not the same as) what exists in Linux today with LSM. As > > > > > seen in > > > > > > mm/mprotect.c:do_mprotect_pkey()->security_file_mprotect() Linux > > > > > is able > > > > > > to make > > > > > > files and memory be both writable and executable, but it would > > > only do > > > > > > so for those > > > > > > files and memory that the LSM (which is how user policy is > > > > > communicated, > > > > > > like SELinux) > > > > > > indicates it is allowed, not blindly do so for all files and all > > > > > memory. > > > > > > > > > > > > > > > Putting EAUG to the #PF handler and implicitly call it just > > > > > > > > > too flakky and > > > > > > > > > hard to make deterministic for e.g. JIT compiler in our use > > > > > > > > > case (not to > > > > > > > > > mention that JIT is not possible at all because inability to > > > > > > > > > do RX pages). > > > > > > > > > > > > I understand how SGX_IOC_ENCLAVE_AUGMENT_PAGES can be more > > > > > deterministic > > > > > > but from > > > > > > what I understand it would have a performance impact since it > > > would > > > > > > require all memory > > > > > > that may be needed by the enclave be pre-allocated from > > > outside the > > > > > > enclave and not > > > > > > just dynamically allocated from within the enclave at the time > > > it is > > > > > > needed. > > > > > > > > > > > > Would such a performance impact be acceptable? > > > > > > > > > > > > > > > > User space won't always have enough info to decide whether the pages > > > > > to be > > > > > EAUG'd immediately. In some cases (shared libraries, JVM for > > > > > example) lots > > > > > of code/data pages can be mapped but never actually touched. One > > > > > enclave/process does not know if any other more important > > > > > enclave/process > > > > > would need the EPC. > > > > > > > > > > It should be for kernel to make the final decision as it has overall > > > > > picture > > > > > of the system EPC usage and availability. > > > > > > > > EAUG ioctl does not give better capabilities for user space to waste > > > > EPC given that EADD ioctl already exists, i.e. your argument is > > > logically > > > > incorrect. > > > > > > The point of adding EAUG is to allow more efficient use of EPC pages. > > > Without EAUG, enclaves have to EADD everything upfront into EPC, > > > consuming > > > predetermined number of EPC pages, some of which may not be used at all. > > > With EAUG, enclaves should be able to load minimal pages to get started, > > > pages added on #PF as they are actually accessed. > > > > > > Obviously as you pointed out, some usages make more sense to > > > pre-EAUG (EAUG > > > before #PF). But your proposal of supporting only pre-EAUG here > > > essentially > > > makes EAUG behave almost the same as EADD. If the current > > > implementation > > > with EAUG on #PF can also use MAP_POPULATE for pre-EAUG (seems possible > > > based on Dave's comments), then it is flxible to cover all cases and > > > allow > > > kernel to optimize allocation of EPC pages. > > > > There is no even a working #PF based implementation in existance, and > > your > > argument has too many if's for my taste. > > 1) if you mean no user space is implementing this kind of solution, read > this section, otherwise, skip to 2) below which is only couple of sentences. > > If you are willing to look, there is already implementation in our SDK to do > heap and stack expansion on demand on #PF. Enclaves may not know heap/stack > size up front, we have implemented these features to make EPC usage more > efficient. I don't know why normal processes can add RAM on #PF, but > enclaves adding EPC on #PF becomes so unacceptable concept to you. And the > kernel does that for EPC swapping already when #PF happens on a swapped out > EPC page. In adds O(n) round-trips for a mmap() emulation, which can be done in O(1) round-trips with a ioctl. > Our implementation has gone through several rounds, the latest is > here:https://github.com/intel/linux-sgx/tree/edmm_v2/sdk/emm. It was also > implemented in original OOT driver based SDK implementation. Customers are > using it and found them useful. I think this is a critical feature that many > other runtimes will also need. I'm not sure what the common sense argument here is. > 2) > It's OK for you to request additional support for your usage and I agree it > is needed. But IMHO, totally getting rid of EAUG on #PF is bad and > unnecessary. Current implementation can be extended to support your usage. > What's the reason you think MAP_POPULATE won't work for you? I do not recall taking stand on MAP_POPULATE. > BR > Haitao BR, Jarkko