From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 22 Mar 2019 12:50:58 +0200
From: Jarkko Sakkinen
To: Sean Christopherson
Cc: x86@kernel.org, linux-sgx@vger.kernel.org, akpm@linux-foundation.org,
        dave.hansen@intel.com, nhorman@redhat.com, npmccallum@redhat.com,
        serge.ayoun@intel.com, shay.katz-zamir@intel.com,
        haitao.huang@intel.com, andriy.shevchenko@linux.intel.com,
        tglx@linutronix.de, kai.svahn@intel.com, bp@alien8.de,
        josh@joshtriplett.org, luto@kernel.org, kai.huang@intel.com,
        rientjes@google.com, Suresh Siddha
Subject: Re: [PATCH v19 12/27] x86/sgx: Enumerate and track EPC sections
Message-ID: <20190322105058.GE3122@linux.intel.com>
References: <20190317211456.13927-1-jarkko.sakkinen@linux.intel.com>
 <20190317211456.13927-13-jarkko.sakkinen@linux.intel.com>
 <20190318195043.GA20298@linux.intel.com>
 <20190321144056.GM4603@linux.intel.com>
 <20190321152810.GC6519@linux.intel.com>
 <20190322101940.GC3122@linux.intel.com>
In-Reply-To: <20190322101940.GC3122@linux.intel.com>

On Fri, Mar 22, 2019 at 12:19:40PM +0200, Jarkko Sakkinen wrote:
> On Thu, Mar 21, 2019 at 08:28:10AM -0700, Sean Christopherson wrote:
> > On Thu, Mar 21, 2019 at 04:40:56PM +0200, Jarkko Sakkinen wrote:
> > > On Mon, Mar 18, 2019 at 12:50:43PM -0700, Sean Christopherson wrote:
> > > > On Sun, Mar 17, 2019 at 11:14:41PM +0200, Jarkko Sakkinen wrote:
> > > > Dynamically allocating sgx_epc_sections isn't exactly difficult, and
> > > > AFAICT the static allocation is the primary motivation for capping
> > > > SGX_MAX_EPC_SECTIONS at such a low value (8).  I still think it makes
> > > > sense to define SGX_MAX_EPC_SECTIONS so that the section number can
> > > > be embedded in the offset, along with flags.  But the max can be
> > > > significantly higher, e.g. using 7 bits to support 128 sections.
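
To put numbers on that: the patch stores a page-aligned address in
page->desc, so the low 12 bits are free for metadata, and 7 of them
already cover 128 sections.  Something along these lines, as an
untested sketch (the mask and helper names below are made up for
illustration, they are not from the patch):

/*
 * Sketch only: the patch sets page->desc to
 * "(addr + (i << PAGE_SHIFT)) | index", so with 4K pages the low
 * 12 bits are available.  Seven bits encode the section, the
 * remaining low bits are left for flags.
 */
#define SGX_MAX_EPC_SECTIONS	128
#define SGX_EPC_SECTION_MASK	GENMASK_ULL(6, 0)

static inline unsigned int sgx_epc_page_section(struct sgx_epc_page *page)
{
	return page->desc & SGX_EPC_SECTION_MASK;
}

static inline unsigned long sgx_epc_page_pa(struct sgx_epc_page *page)
{
	return page->desc & PAGE_MASK;
}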
> > > 
> > > I don't disagree with you but I think for the existing and foreseeable
> > > hardware this is good enough. Can be refined if there is ever a need.
> > 
> > My concern is that there may be virtualization use cases that want to
> > expose more than 8 EPC sections to a guest.  I have no idea if this is
> > anything more than paranoia, but at the same time the cost to increase
> > support to 128+ sections is quite low.
> > 
> > > > I realize hardware is highly unlikely to have more than 8 sections, at
> > > > least for the near future, but IMO the small amount of extra complexity
> > > > is worth having a bit of breathing room.
> > > 
> > > Yup.
> > > 
> > > > > +static __init int sgx_init_epc_section(u64 addr, u64 size, unsigned long index,
> > > > > +				       struct sgx_epc_section *section)
> > > > > +{
> > > > > +	unsigned long nr_pages = size >> PAGE_SHIFT;
> > > > > +	struct sgx_epc_page *page;
> > > > > +	unsigned long i;
> > > > > +
> > > > > +	section->va = memremap(addr, size, MEMREMAP_WB);
> > > > > +	if (!section->va)
> > > > > +		return -ENOMEM;
> > > > > +
> > > > > +	section->pa = addr;
> > > > > +	spin_lock_init(&section->lock);
> > > > > +	INIT_LIST_HEAD(&section->page_list);
> > > > > +
> > > > > +	for (i = 0; i < nr_pages; i++) {
> > > > > +		page = kzalloc(sizeof(*page), GFP_KERNEL);
> > > > > +		if (!page)
> > > > > +			goto out;
> > > > > +		page->desc = (addr + (i << PAGE_SHIFT)) | index;
> > > > > +		sgx_section_put_page(section, page);
> > > > > +	}
> > > > 
> > > > Not sure if this is the correct location, but at some point the kernel
> > > > needs to sanitize the EPC during init.  EPC pages may be in an unknown
> > > > state, e.g. after kexec(), which will cause all manner of faults and
> > > > warnings.  Maybe the best approach is to sanitize on-demand, e.g. suppress
> > > > the first WARN due to unexpected ENCLS failure and purge the EPC at that
> > > > time.  The downside of that approach is that exposing EPC to a guest would
> > > > need to implement its own sanitization flow.
> > > 
> > > Hmm... Let's think this through. I'm just thinking how sanitization on
> > > demand would actually work given the parent-child relationships.
> > 
> > It's ugly.
> > 
> >  1. Temporarily disable EPC allocation and enclave fault handling
> >  2. Zap all TCS PTEs in all enclaves
> >  3. Flush all logical CPUs from enclaves via IPI
> >  4. Forcefully reclaim all EPC pages from enclaves
> >  5. EREMOVE all "free" EPC pages, track pages that fail with SGX_CHILD_PRESENT
> >  6. EREMOVE all EPC pages that failed with SGX_CHILD_PRESENT
> >  7. Disable SGX if any EREMOVE failed in step 6
> >  8. Re-enable EPC allocation and enclave fault handling
> > 
> > Exposing EPC to a VM would still require sanitization.
> > 
> > Sanitizing during boot is a lot cleaner, the primary concern is that it
> > will significantly increase boot time on systems with large EPCs.  If we
> > can somehow limit this to kexec() and that's the only scenario where the
> > EPC needs to be sanitized, then that would mitigate the boot time concern.
> > 
> > We might also be able to get away with unconditionally sanitizing the EPC
> > post-boot, e.g. via worker threads, returning -EBUSY for everything until
> > the EPC is good to go.
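
Side note on steps 5-6: the two-pass EREMOVE could look roughly like
the sketch below.  __eremove() and sgx_epc_addr() stand in for the
ENCLS[EREMOVE] wrapper and the desc-to-address helper, and the list
handling is illustrative rather than lifted from the patch:

static int sgx_sanitize_section(struct sgx_epc_section *section)
{
	struct sgx_epc_page *page, *tmp;
	LIST_HEAD(secs_pages);	/* pages that failed with SGX_CHILD_PRESENT */
	int ret;

	/* Pass 1: SECS pages with children fail, so defer them. */
	list_for_each_entry_safe(page, tmp, &section->page_list, list) {
		ret = __eremove(sgx_epc_addr(page));
		if (ret == SGX_CHILD_PRESENT)
			list_move_tail(&page->list, &secs_pages);
		else if (ret)
			return ret;
	}

	/* Pass 2: the children are gone, the deferred SECS pages follow. */
	list_for_each_entry_safe(page, tmp, &secs_pages, list) {
		ret = __eremove(sgx_epc_addr(page));
		if (ret)
			return ret;	/* step 7: caller disables SGX */
		list_move_tail(&page->list, &section->page_list);
	}

	return 0;
}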
> 
> I like the worker threads approach better. It is something that is
> maintainable. I don't see any better solution given the hierarchical
> nature of enclaves. It is also fairly easy to implement without making
> major changes to the other parts of the implementation.
> 
> I.e. every time the driver initializes:
> 
> 1. Move all EPC first to a bad pool.
> 2. Let worker threads move EPC to the real allocation pool.
> 
> Then the OS can immediately start to use EPC.
> 
> Is this roughly along the lines of what you had in mind?

We could even simplify this by using the already existing reclaimer
thread for the purpose.

/Jarkko
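
PS. Roughly what I have in mind for the init flow, as an untested
sketch: every page starts on a per-section "bad" (dirty) list and a
worker launders it into the real free list with EREMOVE.  The
dirty_page_list and sanitize_work names are made up here, not from
the patch:

static void sgx_sanitize_worker(struct work_struct *work)
{
	struct sgx_epc_section *section =
		container_of(work, struct sgx_epc_section, sanitize_work);
	struct sgx_epc_page *page, *tmp;

	list_for_each_entry_safe(page, tmp, &section->dirty_page_list, list) {
		/* e.g. SGX_CHILD_PRESENT leaves the page dirty for now. */
		if (__eremove(sgx_epc_addr(page)))
			continue;

		spin_lock(&section->lock);
		list_move_tail(&page->list, &section->page_list);
		spin_unlock(&section->lock);
	}

	/* Re-run until every page has been laundered. */
	if (!list_empty(&section->dirty_page_list))
		schedule_work(&section->sanitize_work);
}

Allocation would then take from section->page_list exactly as it does
now, and could return -EBUSY while that list is still empty.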