From: Dave Hansen <dave.hansen@intel.com>
Subject: Re: [PATCH, RFC 45/62] mm: Add the encrypt_mprotect() system call for MKTME
Date: Mon, 17 Jun 2019 11:27:59 -0700
To: Andy Lutomirski
Cc: "Kirill A. Shutemov", Andrew Morton, X86 ML, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin", Borislav Petkov, Peter Zijlstra, David Howells, Kees Cook, Kai Huang, Jacob Pan, Alison Schofield, Linux-MM, kvm list, keyrings@vger.kernel.org, LKML, Tom Lendacky
Message-ID: <5cbfa2da-ba2e-ed91-d0e8-add67753fc12@intel.com>
References: <20190508144422.13171-1-kirill.shutemov@linux.intel.com> <20190508144422.13171-46-kirill.shutemov@linux.intel.com> <3c658cce-7b7e-7d45-59a0-e17dae986713@intel.com>

Tom Lendacky, could you
take a look down in the message to the talk of SEV? I want to make sure
I'm not misrepresenting what it does today.

...
>> I actually don't care all that much which one we end up with. It's not
>> like the extra syscall in the second option means much.
>
> The benefit of the second one is that, if sys_encrypt is absent, it
> just works. In the first model, programs need a fallback because
> they'll segfault if mprotect_encrypt() gets ENOSYS.

Well, by the time they get here, they would have already had to allocate
and set up the encryption key. I don't think this would really be the
"normal" malloc() path, for instance.

>> How do we
>> eventually stack it on top of persistent memory filesystems or Device
>> DAX?
>
> How do we stack anonymous memory on top of persistent memory or Device
> DAX? I'm confused.

If our interface to MKTME is:

	fd = open("/dev/mktme");
	ptr = mmap(fd);

Then it's hard to combine with an interface which is:

	fd = open("/dev/dax123");
	ptr = mmap(fd);

Where if we have something like mprotect() (or madvise() or something
else taking a pointer), we can just do:

	fd = open("/dev/anything987");
	ptr = mmap(fd);
	sys_encrypt(ptr);

Now, we might not *do* it that way for dax, for instance, but I'm just
saying that if we go the /dev/mktme route, we never get a choice.

> I think that, in the long run, we're going to have to either expand
> the core mm's concept of what "memory" is or just have a whole
> parallel set of mechanisms for memory that doesn't work like memory.
...
> I expect that some day normal memory will be able to be repurposed as
> SGX pages on the fly, and that will also look a lot more like SEV or
> XPFO than like this model of MKTME.

I think you're drawing the line at pages where the kernel can manage
contents vs. not manage contents. I'm not sure that's the right
distinction to make, though. The thing that is important is whether the
kernel can manage the lifetime and location of the data in the page.
Basically: Can the kernel choose where the page comes from and get the
page back when it wants?

I really don't like the current state of things like with SEV or with
KVM direct device assignment where the physical location is quite locked
down and the kernel really can't manage the memory. I'm trying really
hard to make sure future hardware is more permissive about such things.
My hope is that these are a temporary blip and not the new normal.

> So, if we upstream MKTME as anonymous memory with a magic config
> syscall, I predict that, in a few years, it will end up inheriting
> all downsides of both approaches with few of the upsides. Programs
> like QEMU will need to learn to manipulate pages that can't be
> accessed outside the VM without special VM buy-in, so the fact that
> MKTME pages are fully functional and can be GUP-ed won't be very
> useful. And the VM will learn about all these things, but MKTME won't
> really fit in.

Kai Huang (who is on cc) has been doing the QEMU enabling and might want
to weigh in. I'd also love to hear from the AMD folks in case I'm not
grokking some aspect of SEV.

But, my understanding is that, even today, neither QEMU nor the kernel
can see SEV-encrypted guest memory. So QEMU should already understand
how to not interact with guest memory. I _assume_ it's also already
doing this with anonymous memory, without needing /dev/sme or something.

> And, one of these days, someone will come up with a version of XPFO
> that could actually be upstreamed, and it seems entirely plausible
> that it will be totally incompatible with MKTME-as-anonymous-memory
> and that users of MKTME will actually get *worse* security.

I'm not following here. XPFO just means that we don't keep the direct
map around all the time for all memory. If XPFO and
MKTME-as-anonymous-memory were both in play, I think we'd just be
creating/destroying the MKTME-enlightened direct map instead of a
vanilla one.
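
To make the pointer-based option above concrete, here is a minimal
userspace sketch, not code from the patch set. It maps anonymous memory
first and then tries to encrypt it by pointer, treating ENOSYS as "not
supported, carry on unencrypted". The names sys_encrypt_stub() and
alloc_maybe_encrypted() and the keyid argument are all hypothetical, and
the stub always fails with ENOSYS because no real syscall number is
assumed.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * HYPOTHETICAL stand-in for the sys_encrypt(ptr) call discussed in this
 * thread. No real syscall is assumed, so this stub always fails with
 * ENOSYS, mimicking a kernel that lacks the call.
 */
static int sys_encrypt_stub(void *ptr, size_t len, int keyid)
{
	(void)ptr;
	(void)len;
	(void)keyid;
	errno = ENOSYS;
	return -1;
}

/*
 * Map anonymous memory first, then try to encrypt it by pointer.
 * Returns the mapping, or NULL on failure. *encrypted is set to 1 if
 * the encrypt step succeeded and 0 if the call is not supported.
 */
static void *alloc_maybe_encrypted(size_t len, int keyid, int *encrypted)
{
	void *ptr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (ptr == MAP_FAILED)
		return NULL;

	*encrypted = (sys_encrypt_stub(ptr, len, keyid) == 0);
	if (!*encrypted && errno != ENOSYS) {
		/* A real error, as opposed to "not supported". */
		munmap(ptr, len);
		return NULL;
	}

	/* The mapping is usable either way; only its encryption differs. */
	return ptr;
}
```

A caller would check the returned pointer for NULL and then look at
*encrypted to decide whether the mapping is acceptable for sensitive
data.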