From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.6 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,MAILING_LIST_MULTI,SPF_HELO_NONE, SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 45DD2C433DB for ; Mon, 8 Feb 2021 21:26:26 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id A52D964E8C for ; Mon, 8 Feb 2021 21:26:25 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org A52D964E8C Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 0738E6B006C; Mon, 8 Feb 2021 16:26:25 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 049D66B006E; Mon, 8 Feb 2021 16:26:25 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E30F36B0070; Mon, 8 Feb 2021 16:26:24 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0004.hostedemail.com [216.40.44.4]) by kanga.kvack.org (Postfix) with ESMTP id C161F6B006C for ; Mon, 8 Feb 2021 16:26:24 -0500 (EST) Received: from smtpin13.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 7DA4E1802587E for ; Mon, 8 Feb 2021 21:26:24 +0000 (UTC) X-FDA: 77796384288.13.key51_5f151a027601 Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin13.hostedemail.com (Postfix) with ESMTP id 5D30E1846F5F1 for ; Mon, 8 Feb 2021 21:26:24 +0000 (UTC) X-HE-Tag: key51_5f151a027601 X-Filterd-Recvd-Size: 6014 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf50.hostedemail.com (Postfix) with ESMTP for ; Mon, 8 Feb 2021 21:26:23 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id E27F164E6C; Mon, 8 Feb 2021 21:26:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1612819582; bh=Y8G7gf2IwGjBXBF5ErxQfWKVOLSdVIOpGEIG67JA2AY=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=Eerdn/7TBUJNWh5JSQGQanhkCbXO6r8i+07ORvc0kNlXTv/gMu3WAJsIb2RYRWrDl NzMmvcDn3iTmjaUTOACFOeq+fWRq6saXTmy51YQtoTk38Gte4b+P8L19Xbwx0L6ml+ Ea3Z7Qd3D+MldYIlVvyh5Yb26npCORZwTE+3wJTir9k1lgaS6eZYPXQ4zFo0WChRGs fUiJgsx/cCeor6w8T2f3XtMihtpacVS0+tPYSqs4uM0k3MR4YWV+vaapoyzoPPxAl+ yFpPIEmRq4D/vlPxshoMK56Hm/B9URPT2mlstPmtBfgreYHioAYDlS64KLgxO+QeoG FSPZy/knOHzIQ== Date: Mon, 8 Feb 2021 23:26:05 +0200 From: Mike Rapoport To: Michal Hocko Cc: Andrew Morton , Alexander Viro , Andy Lutomirski , Arnd Bergmann , Borislav Petkov , Catalin Marinas , Christopher Lameter , Dan Williams , Dave Hansen , David Hildenbrand , Elena Reshetova , "H. Peter Anvin" , Ingo Molnar , James Bottomley , "Kirill A. Shutemov" , Matthew Wilcox , Mark Rutland , Mike Rapoport , Michael Kerrisk , Palmer Dabbelt , Paul Walmsley , Peter Zijlstra , Rick Edgecombe , Roman Gushchin , Shakeel Butt , Shuah Khan , Thomas Gleixner , Tycho Andersen , Will Deacon , linux-api@vger.kernel.org, linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-nvdimm@lists.01.org, linux-riscv@lists.infradead.org, x86@kernel.org, Hagen Paul Pfeifer , Palmer Dabbelt Subject: Re: [PATCH v17 07/10] mm: introduce memfd_secret system call to create "secret" memory areas Message-ID: <20210208212605.GX242749@kernel.org> References: <20210208084920.2884-1-rppt@kernel.org> <20210208084920.2884-8-rppt@kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon, Feb 08, 2021 at 11:49:22AM +0100, Michal Hocko wrote: > On Mon 08-02-21 10:49:17, Mike Rapoport wrote: > > From: Mike Rapoport > > > > Introduce "memfd_secret" system call with the ability to create memory > > areas visible only in the context of the owning process and not mapped not > > only to other processes but in the kernel page tables as well. > > > > The secretmem feature is off by default and the user must explicitly enable > > it at the boot time. > > > > Once secretmem is enabled, the user will be able to create a file > > descriptor using the memfd_secret() system call. The memory areas created > > by mmap() calls from this file descriptor will be unmapped from the kernel > > direct map and they will be only mapped in the page table of the owning mm. > > Is this really true? I guess you meant to say that the memory will > visible only via page tables to anybody who can mmap the respective file > descriptor. There is nothing like an owning mm as the fd is inherently a > shareable resource and the ownership becomes a very vague and hard to > define term. Hmm, it seems I've been dragging this paragraph from the very first mmap(MAP_EXCLUSIVE) rfc and nobody (including myself) noticed the inconsistency. > > The file descriptor based memory has several advantages over the > > "traditional" mm interfaces, such as mlock(), mprotect(), madvise(). It > > paves the way for VMMs to remove the secret memory range from the process; > > I do not understand how it helps to remove the memory from the process > as the interface explicitly allows to add a memory that is removed from > all other processes via direct map. The current implementation does not help to remove the memory from the process, but using fd-backed memory seems a better interface to remove guest memory from host mappings than mmap. As Andy nicely put it: "Getting fd-backed memory into a guest will take some possibly major work in the kernel, but getting vma-backed memory into a guest without mapping it in the host user address space seems much, much worse." > > As secret memory implementation is not an extension of tmpfs or hugetlbfs, > > usage of a dedicated system call rather than hooking new functionality into > > memfd_create(2) emphasises that memfd_secret(2) has different semantics and > > allows better upwards compatibility. > > What is this supposed to mean? What are differences? Well, the phrasing could be better indeed. That supposed to mean that they differ in the semantics behind the file descriptor: memfd_create implements sealing for shmem and hugetlbfs while memfd_secret implements memory hidden from the kernel. > > The secretmem mappings are locked in memory so they cannot exceed > > RLIMIT_MEMLOCK. Since these mappings are already locked an attempt to > > mlock() secretmem range would fail and mlockall() will ignore secretmem > > mappings. > > What about munlock? Isn't this implied? ;-) I'll add a sentence about it. -- Sincerely yours, Mike.