Date: Tue, 29 Nov 2022 21:58:44 +0800
From: Chao Peng
To: "Kirill A. Shutemov"
Cc: Michael Roth, kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-arch@vger.kernel.org, linux-api@vger.kernel.org,
 linux-doc@vger.kernel.org, qemu-devel@nongnu.org, Paolo Bonzini,
 Jonathan Corbet, Sean Christopherson, Vitaly Kuznetsov, Wanpeng Li,
 Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, x86@kernel.org, "H . Peter Anvin", Hugh Dickins,
 Jeff Layton, "J . Bruce Fields", Andrew Morton, Shuah Khan,
 Mike Rapoport, Steven Price, "Maciej S . Szmigiero", Vlastimil Babka,
 Vishal Annapurve, Yu Zhang, "Kirill A . Shutemov", luto@kernel.org,
 jun.nakajima@intel.com, dave.hansen@intel.com, ak@linux.intel.com,
 david@redhat.com, aarcange@redhat.com, ddutile@redhat.com,
 dhildenb@redhat.com, Quentin Perret, tabba@google.com, mhocko@suse.com,
 Muchun Song, wei.w.wang@intel.com
Subject: Re: [PATCH v9 1/8] mm: Introduce memfd_restricted system call to
 create restricted user memory
Message-ID: <20221129135844.GA902164@chaop.bj.intel.com>
Reply-To: Chao Peng
References: <20221025151344.3784230-1-chao.p.peng@linux.intel.com>
 <20221025151344.3784230-2-chao.p.peng@linux.intel.com>
 <20221129000632.sz6pobh6p7teouiu@amd.com>
 <20221129112139.usp6dqhbih47qpjl@box.shutemov.name>
In-Reply-To: <20221129112139.usp6dqhbih47qpjl@box.shutemov.name>

On Tue, Nov 29, 2022 at 02:21:39PM +0300, Kirill A. Shutemov wrote:
> On Mon, Nov 28, 2022 at 06:06:32PM -0600, Michael Roth wrote:
> > On Tue, Oct 25, 2022 at 11:13:37PM +0800, Chao Peng wrote:
> > > From: "Kirill A. Shutemov"
> > >
> >
> > > +static struct file *restrictedmem_file_create(struct file *memfd)
> > > +{
> > > +	struct restrictedmem_data *data;
> > > +	struct address_space *mapping;
> > > +	struct inode *inode;
> > > +	struct file *file;
> > > +
> > > +	data = kzalloc(sizeof(*data), GFP_KERNEL);
> > > +	if (!data)
> > > +		return ERR_PTR(-ENOMEM);
> > > +
> > > +	data->memfd = memfd;
> > > +	mutex_init(&data->lock);
> > > +	INIT_LIST_HEAD(&data->notifiers);
> > > +
> > > +	inode = alloc_anon_inode(restrictedmem_mnt->mnt_sb);
> > > +	if (IS_ERR(inode)) {
> > > +		kfree(data);
> > > +		return ERR_CAST(inode);
> > > +	}
> > > +
> > > +	inode->i_mode |= S_IFREG;
> > > +	inode->i_op = &restrictedmem_iops;
> > > +	inode->i_mapping->private_data = data;
> > > +
> > > +	file = alloc_file_pseudo(inode, restrictedmem_mnt,
> > > +				 "restrictedmem", O_RDWR,
> > > +				 &restrictedmem_fops);
> > > +	if (IS_ERR(file)) {
> > > +		iput(inode);
> > > +		kfree(data);
> > > +		return ERR_CAST(file);
> > > +	}
> > > +
> > > +	file->f_flags |= O_LARGEFILE;
> > > +
> > > +	mapping = memfd->f_mapping;
> > > +	mapping_set_unevictable(mapping);
> > > +	mapping_set_gfp_mask(mapping,
> > > +			     mapping_gfp_mask(mapping) & ~__GFP_MOVABLE);
> >
> > Is this supposed to prevent migration of pages being used for
> > restrictedmem/shmem backend?
>
> Yes, my bad. I expected it to prevent migration, but it is not true.
>
> Looks like we need to bump refcount in restrictedmem_get_page() and reduce
> it back when KVM is no longer use it.

restrictedmem_get_page() already takes a reference, but KVM later drops
it with put_page(), through kvm_release_pfn_clean(), after populating
the secondary page table entry. One option would be to let the consuming
feature (e.g. TDX/SEV) do its own get_page()/put_page() when populating
the secondary page table entry. AFAICS, this requirement also comes from
these features.

Chao
>
> Chao, could you adjust it?
>
> --
> Kiryl Shutsemau / Kirill A. Shutemov
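
To illustrate the reference-count flow discussed in the reply above, here is a minimal userspace sketch. It is not kernel code: every toy_* name is hypothetical, the counter models only the extra references taken on this path (a real struct page also carries other references, e.g. from the page cache), and the "feature pin" stands in for whatever TDX/SEV-specific code would take its own reference.

```c
#include <assert.h>

/* Toy stand-in for struct page; only the extra reference count is modelled. */
struct toy_page {
	int refcount;
};

static void toy_get_page(struct toy_page *page)
{
	page->refcount++;
}

static void toy_put_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/* Models restrictedmem_get_page(): hands the page out with a reference held. */
static struct toy_page *toy_restrictedmem_get_page(struct toy_page *backing)
{
	toy_get_page(backing);
	return backing;
}

/* Models kvm_release_pfn_clean(): drops the reference taken by the lookup. */
static void toy_kvm_release_pfn_clean(struct toy_page *page)
{
	toy_put_page(page);
}

/*
 * Models populating a secondary page table entry. When feature_pins is
 * non-zero, the (hypothetical) feature-specific code takes its own
 * reference before KVM drops the lookup reference.
 * Returns the number of references still held once the entry is in place.
 */
static int toy_map_page(struct toy_page *backing, int feature_pins)
{
	struct toy_page *page = toy_restrictedmem_get_page(backing);

	if (feature_pins)
		toy_get_page(page);

	toy_kvm_release_pfn_clean(page);
	return page->refcount;
}

/* Models tearing the entry down: the feature drops its own reference. */
static int toy_unmap_page(struct toy_page *page, int feature_pins)
{
	if (feature_pins)
		toy_put_page(page);
	return page->refcount;
}
```

With a zero-initialized toy_page, toy_map_page(&page, 0) leaves no reference behind after the entry is populated, which is the situation described in the reply. With feature_pins set, one reference survives until toy_unmap_page() drops it.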