From: "Reshetova, Elena"
To: Mike Rapoport, "linux-kernel@vger.kernel.org"
CC: Alexey Dobriyan, Andrew Morton, Andy Lutomirski, Arnd Bergmann,
 Borislav Petkov, Dave Hansen, James Bottomley, Peter Zijlstra,
 Steven Rostedt, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin",
 "linux-api@vger.kernel.org", "linux-mm@kvack.org", "x86@kernel.org",
 Mike Rapoport, Tycho Andersen, Alan Cox
Subject: RE: [PATCH RFC] mm: add MAP_EXCLUSIVE to create exclusive user mappings
Date: Tue, 29 Oct 2019 11:25:12 +0000
Message-ID: <2236FBA76BA1254E88B949DDB74E612BA4EEC0CE@IRSMSX102.ger.corp.intel.com>
In-Reply-To: <1572171452-7958-1-git-send-email-rppt@kernel.org>

> The patch below aims to allow applications to create mappings that have
> pages visible only to the owning process. Such mappings could be used to
> store secrets so that these secrets are visible neither to other
> processes nor to the kernel.

Hi Mike,

I have actually been looking into a closely related problem for the past
couple of weeks (on and off). What is common here is the need for userspace
to indicate to the kernel that some pages contain secrets. There are then
a number of things the kernel can do to protect these secrets better.
Unmapping them from the direct map is one of them. Another is to map such
pages as non-cached, which can help us prevent or considerably restrict
speculation on such pages. The initial proof of concept for marking pages as
"UNCACHED" that I got from Dave Hansen was actually based on mlock2()
and a new flag for it for this purpose.
Since then I have been thinking about which interface suits the use case
better, and I actually chose to go with a new madvise() flag instead because
of all the possible implications for fragmentation and performance.
My logic was that we had better allocate the secret data explicitly (using
mmap()) to make sure that no other process data accidentally suffers.
Imagine I allocate a buffer to hold a secret key and signal with mlock()
to protect it, and suddenly my other high-throughput non-secret buffer
(which happened to live on the same page by chance) becomes very slow,
and I don't even have an easy way (apart from mmap()ing it!) to guarantee
that it won't be affected.

So, I ended up with something like:

  secret_buffer = mmap(NULL, PAGE_SIZE, ...)
  madvise(secret_buffer, PAGE_SIZE, MADV_SECRET)

I have work-in-progress code here:

  https://github.com/ereshetova/linux/commits/madvise

I haven't sent it for review because it is not ready yet, and I am now
working on adding the page wiping functionality.
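
The allocate-explicitly and wipe-before-release steps can be sketched in
userspace. This is only a minimal sketch, assuming Linux with a POSIX C
library: secret_page_alloc(), secret_wipe() and secret_page_free() are
hypothetical helper names, not functions from the work-in-progress branch,
and the proposed MADV_SECRET flag appears only as a comment because it is
a proposal from this thread rather than an existing flag.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map one private anonymous page so that the secret
 * shares its page with nothing else. Stores the page size in *len_out and
 * returns NULL on failure. */
static void *secret_page_alloc(size_t *len_out)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;

    void *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* The proposed madvise(p, page, MADV_SECRET) call would go here.
     * MADV_SECRET is an assumption taken from the proposal, so it is
     * deliberately left out of the compiled code. */
    *len_out = (size_t)page;
    return p;
}

/* Zero the buffer through a volatile pointer so the compiler does not
 * drop the stores as dead writes just before the unmap. */
static void secret_wipe(void *p, size_t len)
{
    volatile unsigned char *v = p;
    while (len--)
        *v++ = 0;
}

/* Wipe first, then release the page. Returns the result of munmap(). */
static int secret_page_free(void *p, size_t len)
{
    secret_wipe(p, len);
    return munmap(p, len);
}
```

Because the secret gets a mapping of its own, any per-page treatment applied
later cannot slow down an unrelated buffer that would otherwise have landed
on the same page. The userspace wipe only helps when the process reaches
secret_page_free(); it does nothing for a process that crashes first.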
Without wiping, it would be useless to protect the page while it is used in
userspace, but then allow it to be reused by a different process after it
has been released back and userspace was stupid enough not to wipe the
contents (or was crashed on purpose before it was able to wipe anything).

We have also had some discussions with Tycho about XPFO also being applied
selectively to such "SECRET"-marked pages, and I know that he has also
done some initial prototyping on this, so I think it would be great to decide
on the userspace interface first and then see how we can assemble all
these features together.

The *very* far-reaching goal for all of this would be something that Alan Cox
suggested when I started looking into this: to actually have a new libc
function to allocate memory in a secure way, which can hide all the dancing
with mmap()/madvise() (and/or potentially the interaction with a chardev that
Andy was also suggesting) and implement an efficient allocator for such
secret pages. OpenSSL has its own version of a "secure heap", which is
essentially an mmap area with additional MLOCK_ONFAULT and MADV_DONTDUMP
flags for protection. Some other apps or libs must use something similar if
they want additional protection, which makes them reimplement the same
concept again and again. Sadly or surprisingly, other major libs like
BoringSSL and mbedTLS, or clients like OpenSSH, do not use any
mlock()/madvise() flags for additional protection of the secrets they hold
in memory. Maybe if all of it were behind a single secure API, the situation
in userspace would start to change for the better.

Best Regards,
Elena.
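
For illustration, the kind of region described above (an mmap area with
MLOCK_ONFAULT and MADV_DONTDUMP applied) might look roughly like the sketch
below. It is not OpenSSL's implementation and not a proposed libc API:
struct secure_region and secure_region_create() are hypothetical names, and
the code assumes Linux with a C library that exposes mlock2(), falling back
to plain mlock() where MLOCK_ONFAULT is not defined.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical descriptor recording which protections were applied. */
struct secure_region {
    void  *addr;    /* start of the mapping */
    size_t len;     /* length, rounded up to whole pages */
    int    locked;  /* 1 if the pages were locked against swap-out */
    int    nodump;  /* 1 if the pages were excluded from core dumps */
};

/* Map at least min_len bytes of private anonymous memory and try to apply
 * both protections. Returns 0 if the mapping exists, -1 otherwise; the
 * caller checks r->locked and r->nodump to see what actually succeeded. */
static int secure_region_create(struct secure_region *r, size_t min_len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || min_len == 0)
        return -1;

    /* Round the request up to a whole number of pages. */
    size_t psz = (size_t)page;
    size_t len = (min_len + psz - 1) / psz * psz;

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    r->addr = p;
    r->len = len;

#ifdef MLOCK_ONFAULT
    /* Lock each page when it is first touched instead of up front. */
    r->locked = (mlock2(p, len, MLOCK_ONFAULT) == 0);
#else
    /* Fallback assumption: lock everything immediately. */
    r->locked = (mlock(p, len) == 0);
#endif

#ifdef MADV_DONTDUMP
    r->nodump = (madvise(p, len, MADV_DONTDUMP) == 0);
#else
    r->nodump = 0;
#endif
    return 0;
}
```

Locking can fail under a low RLIMIT_MEMLOCK, which is why the sketch reports
it through a flag rather than failing the whole allocation; a real secure
allocator would have to decide whether an unlocked region is acceptable.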