From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3BA57CCA48B for ; Thu, 21 Jul 2022 15:05:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230020AbiGUPFT (ORCPT ); Thu, 21 Jul 2022 11:05:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52128 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229446AbiGUPFS (ORCPT ); Thu, 21 Jul 2022 11:05:18 -0400 Received: from mail-pf1-x42c.google.com (mail-pf1-x42c.google.com [IPv6:2607:f8b0:4864:20::42c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C307362ED for ; Thu, 21 Jul 2022 08:05:14 -0700 (PDT) Received: by mail-pf1-x42c.google.com with SMTP id y141so1991964pfb.7 for ; Thu, 21 Jul 2022 08:05:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=+qHP7skd3Cv7XJJygZVSwGskr21EGyWaFJgDdfd1xqA=; b=oQSFR9+MjkkY1ur4gDW1MTlCrXf3uwlJLfESqNwsf2awYRDW0+8aavy8VzAuy9DB0G JDCIQp1vt5BvWbaqkABHrCn9h2qVHzO/SxEDKgSBlbay4FQdznJ6OWnc4Hc0C/769sBL aq7cj8hP0AUHXUTvnue16TNcqCwcMJHJ1Gz4cj7NXbvObCQklTY4pdJ1ZwEWRYUotwWt BRQpaLf82OEJ9tYyxOPCUOLwadWEY7ln+vmIVrCj1MGptoVCZg53AGjaczk8CTfjFYcd x+IeQO2qzB9zjTUGUovD73lL5DdL7DCL3eyEpBmg3MXYdMfYAAsyQOy2T2yk808kuO5O Wxgw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=+qHP7skd3Cv7XJJygZVSwGskr21EGyWaFJgDdfd1xqA=; b=eRIsvfVFxdIt5Y7YoNpMNS9ap8JXQpXMk9Wt04hKsANfrGnP+oGBdJZIroVzdpzPsm 6K6nxmSkm2HpczgVamXylfr07d2SOGBtEk2FWY+M6jiEdZXCExDXoZMwfInH72hlIDls fkMLNYCYD6p0R2Wc7xuhNyBf4WT3t3AnhEEVEHc94KMaNiTWDGElhaN5x7dYU1LAsqZC uDWbXP+DELAEP+qK5GQ8SRfFeA3L/x8qlzi7/DBWySRRPS7PvgvPrHORZJK4kyAizMhx W93aWMzCI+/dZme99DwQDuC7rkwJwcF66TtSwFtN7Jv9ypVHjOT/Bg8gtznTKgzUUKto OGlw== X-Gm-Message-State: AJIora9MWnHzMOY1f/hI7jbSEHMQSk6h/xlNpU3kPJYF2VPxYH467M0R 8AFWuDNUpMJZOoecBxNfMABV1A== X-Google-Smtp-Source: AGRyM1uCSCT3ljh7PToWudc0PHcaebaKp/GEubhDiVG89ItWbH0HuD55vNtJx6Xaam1ITOer8uSeSA== X-Received: by 2002:a63:b56:0:b0:41a:495a:2a26 with SMTP id a22-20020a630b56000000b0041a495a2a26mr13293961pgl.411.1658415913932; Thu, 21 Jul 2022 08:05:13 -0700 (PDT) Received: from google.com (123.65.230.35.bc.googleusercontent.com. [35.230.65.123]) by smtp.gmail.com with ESMTPSA id a17-20020aa78e91000000b005254e44b748sm1872610pfr.84.2022.07.21.08.05.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 21 Jul 2022 08:05:13 -0700 (PDT) Date: Thu, 21 Jul 2022 15:05:09 +0000 From: Sean Christopherson To: David Hildenbrand Cc: Chao Peng , kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org, linux-doc@vger.kernel.org, qemu-devel@nongnu.org, linux-kselftest@vger.kernel.org, Paolo Bonzini , Jonathan Corbet , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H . Peter Anvin" , Hugh Dickins , Jeff Layton , "J . Bruce Fields" , Andrew Morton , Shuah Khan , Mike Rapoport , Steven Price , "Maciej S . Szmigiero" , Vlastimil Babka , Vishal Annapurve , Yu Zhang , "Kirill A . Shutemov" , luto@kernel.org, jun.nakajima@intel.com, dave.hansen@intel.com, ak@linux.intel.com, aarcange@redhat.com, ddutile@redhat.com, dhildenb@redhat.com, Quentin Perret , Michael Roth , mhocko@suse.com, Muchun Song Subject: Re: [PATCH v7 01/14] mm: Add F_SEAL_AUTO_ALLOCATE seal to memfd Message-ID: References: <20220706082016.2603916-1-chao.p.peng@linux.intel.com> <20220706082016.2603916-2-chao.p.peng@linux.intel.com> <39067d09-b32b-23a6-ae0e-00ac2fe0466c@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <39067d09-b32b-23a6-ae0e-00ac2fe0466c@redhat.com> Precedence: bulk List-ID: X-Mailing-List: linux-api@vger.kernel.org On Thu, Jul 21, 2022, David Hildenbrand wrote: > On 21.07.22 11:44, David Hildenbrand wrote: > > On 06.07.22 10:20, Chao Peng wrote: > >> Normally, a write to unallocated space of a file or the hole of a sparse > >> file automatically causes space allocation, for memfd, this equals to > >> memory allocation. This new seal prevents such automatically allocating, > >> either this is from a direct write() or a write on the previously > >> mmap-ed area. The seal does not prevent fallocate() so an explicit > >> fallocate() can still cause allocating and can be used to reserve > >> memory. > >> > >> This is used to prevent unintentional allocation from userspace on a > >> stray or careless write and any intentional allocation should use an > >> explicit fallocate(). One of the main usecases is to avoid memory double > >> allocation for confidential computing usage where we use two memfds to > >> back guest memory and at a single point only one memfd is alive and we > >> want to prevent memory allocation for the other memfd which may have > >> been mmap-ed previously. More discussion can be found at: > >> > >> https://lkml.org/lkml/2022/6/14/1255 > >> > >> Suggested-by: Sean Christopherson > >> Signed-off-by: Chao Peng > >> --- > >> include/uapi/linux/fcntl.h | 1 + > >> mm/memfd.c | 3 ++- > >> mm/shmem.c | 16 ++++++++++++++-- > >> 3 files changed, 17 insertions(+), 3 deletions(-) > >> > >> diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h > >> index 2f86b2ad6d7e..98bdabc8e309 100644 > >> --- a/include/uapi/linux/fcntl.h > >> +++ b/include/uapi/linux/fcntl.h > >> @@ -43,6 +43,7 @@ > >> #define F_SEAL_GROW 0x0004 /* prevent file from growing */ > >> #define F_SEAL_WRITE 0x0008 /* prevent writes */ > >> #define F_SEAL_FUTURE_WRITE 0x0010 /* prevent future writes while mapped */ > >> +#define F_SEAL_AUTO_ALLOCATE 0x0020 /* prevent allocation for writes */ > > > > Why only "on writes" and not "on reads". IIRC, shmem doesn't support the > > shared zeropage, so you'll simply allocate a new page via read() or on > > read faults. > > Correction: on read() we don't allocate a fresh page. But on read faults > we would. So this comment here needs clarification. Not just the comment, the code too. The intent of F_SEAL_AUTO_ALLOCATE is very much to block _all_ implicit allocations (or maybe just fault-based allocations if "implicit" is too broad of a description).