Date: Thu, 21 Jul 2022 15:05:09 +0000
From: Sean Christopherson
To: David Hildenbrand
Cc: Chao Peng, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-api@vger.kernel.org, linux-doc@vger.kernel.org,
    qemu-devel@nongnu.org, linux-kselftest@vger.kernel.org, Paolo Bonzini,
    Jonathan Corbet, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
    Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    x86@kernel.org, "H . Peter Anvin", Hugh Dickins, Jeff Layton,
    "J . Bruce Fields", Andrew Morton, Shuah Khan, Mike Rapoport,
    Steven Price, "Maciej S . Szmigiero", Vlastimil Babka,
    Vishal Annapurve, Yu Zhang, "Kirill A . Shutemov", luto@kernel.org,
    jun.nakajima@intel.com, dave.hansen@intel.com, ak@linux.intel.com,
    aarcange@redhat.com, ddutile@redhat.com, dhildenb@redhat.com,
    Quentin Perret, Michael Roth, mhocko@suse.com, Muchun Song
Subject: Re: [PATCH v7 01/14] mm: Add F_SEAL_AUTO_ALLOCATE seal to memfd
References: <20220706082016.2603916-1-chao.p.peng@linux.intel.com>
    <20220706082016.2603916-2-chao.p.peng@linux.intel.com>
    <39067d09-b32b-23a6-ae0e-00ac2fe0466c@redhat.com>
In-Reply-To: <39067d09-b32b-23a6-ae0e-00ac2fe0466c@redhat.com>

On Thu, Jul 21, 2022, David Hildenbrand wrote:
> On 21.07.22 11:44, David Hildenbrand wrote:
> > On 06.07.22 10:20, Chao Peng wrote:
> >> Normally, a write to unallocated space of a file or the hole of a sparse
> >> file automatically causes space allocation, for memfd, this equals to
> >> memory allocation. This new seal prevents such automatically allocating,
> >> either this is from a direct write() or a write on the previously
> >> mmap-ed area. The seal does not prevent fallocate() so an explicit
> >> fallocate() can still cause allocating and can be used to reserve
> >> memory.
> >>
> >> This is used to prevent unintentional allocation from userspace on a
> >> stray or careless write and any intentional allocation should use an
> >> explicit fallocate(). One of the main usecases is to avoid memory double
> >> allocation for confidential computing usage where we use two memfds to
> >> back guest memory and at a single point only one memfd is alive and we
> >> want to prevent memory allocation for the other memfd which may have
> >> been mmap-ed previously. More discussion can be found at:
> >>
> >>   https://lkml.org/lkml/2022/6/14/1255
> >>
> >> Suggested-by: Sean Christopherson
> >> Signed-off-by: Chao Peng
> >> ---
> >>  include/uapi/linux/fcntl.h |  1 +
> >>  mm/memfd.c                 |  3 ++-
> >>  mm/shmem.c                 | 16 ++++++++++++++--
> >>  3 files changed, 17 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
> >> index 2f86b2ad6d7e..98bdabc8e309 100644
> >> --- a/include/uapi/linux/fcntl.h
> >> +++ b/include/uapi/linux/fcntl.h
> >> @@ -43,6 +43,7 @@
> >>  #define F_SEAL_GROW		0x0004	/* prevent file from growing */
> >>  #define F_SEAL_WRITE		0x0008	/* prevent writes */
> >>  #define F_SEAL_FUTURE_WRITE	0x0010	/* prevent future writes while mapped */
> >> +#define F_SEAL_AUTO_ALLOCATE	0x0020	/* prevent allocation for writes */
> >
> > Why only "on writes" and not "on reads". IIRC, shmem doesn't support the
> > shared zeropage, so you'll simply allocate a new page via read() or on
> > read faults.
>
> Correction: on read() we don't allocate a fresh page. But on read faults
> we would. So this comment here needs clarification.

Not just the comment, the code too.  The intent of F_SEAL_AUTO_ALLOCATE is
very much to block _all_ implicit allocations (or maybe just fault-based
allocations if "implicit" is too broad of a description).
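
For anyone who wants to see the read() vs. read-fault difference discussed
above, here is a minimal userspace sketch.  It uses only existing interfaces
(memfd_create(), pread(), mmap(), fstat()) and does not touch the proposed
F_SEAL_AUTO_ALLOCATE flag.  The helper names allocated_blocks() and
probe_implicit_alloc() are made up for illustration, and the sketch assumes a
Linux system whose libc provides memfd_create() and that st_blocks tracks the
pages shmem has allocated for the file.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Number of 512-byte blocks currently backing fd, or -1 on error. */
long allocated_blocks(int fd)
{
	struct stat st;

	if (fstat(fd, &st) != 0)
		return -1;
	return (long)st.st_blocks;
}

/*
 * Hypothetical helper, not part of the patch.  Creates a one-page sparse
 * memfd and samples st_blocks at three points: right after ftruncate(),
 * after a read() of the hole, and after a read fault through a shared
 * mapping.  Returns 0 on success, -1 on failure.
 */
int probe_implicit_alloc(long *initial, long *after_read, long *after_fault)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile char *map;
	char buf[16];
	char c;
	int fd;

	assert(page > 0);

	fd = memfd_create("alloc-probe", MFD_CLOEXEC);
	if (fd < 0)
		return -1;

	/* Size the file without allocating anything: one page of hole. */
	if (ftruncate(fd, page) != 0)
		goto fail;
	*initial = allocated_blocks(fd);

	/* read() of a hole returns zeroes. */
	if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
		goto fail;
	*after_read = allocated_blocks(fd);

	/* A read fault through a shared mapping of the same hole. */
	map = mmap(NULL, page, PROT_READ, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto fail;
	c = map[0];
	(void)c;
	*after_fault = allocated_blocks(fd);

	munmap((void *)map, page);
	close(fd);
	return 0;

fail:
	close(fd);
	return -1;
}
```

Calling probe_implicit_alloc() and printing the three values should show the
block count staying at zero across the read() and becoming non-zero only after
the read fault, which is the case the "prevent allocation for writes" wording
misses.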