All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Chao Peng <chao.p.peng@linux.intel.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-api@vger.kernel.org, linux-doc@vger.kernel.org,
	qemu-devel@nongnu.org, Paolo Bonzini <pbonzini@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	x86@kernel.org, "H . Peter Anvin" <hpa@zytor.com>,
	Hugh Dickins <hughd@google.com>, Jeff Layton <jlayton@kernel.org>,
	"J . Bruce Fields" <bfields@fieldses.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Shuah Khan <shuah@kernel.org>, Mike Rapoport <rppt@kernel.org>,
	Steven Price <steven.price@arm.com>,
	"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
	Vishal Annapurve <vannapurve@google.com>,
	Yu Zhang <yu.c.zhang@linux.intel.com>,
	luto@kernel.org, jun.nakajima@intel.com, dave.hansen@intel.com,
	ak@linux.intel.com, david@redhat.com, aarcange@redhat.com,
	ddutile@redhat.com, dhildenb@redhat.com,
	Quentin Perret <qperret@google.com>,
	Michael Roth <michael.roth@amd.com>,
	mhocko@suse.com, Muchun Song <songmuchun@bytedance.com>,
	wei.w.wang@intel.com
Subject: Re: [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd
Date: Mon, 17 Oct 2022 19:19:55 +0300	[thread overview]
Message-ID: <20221017161955.t4gditaztbwijgcn@box.shutemov.name> (raw)
In-Reply-To: <de680280-f6b1-9337-2ae4-4b2faf2b823b@suse.cz>

On Mon, Oct 17, 2022 at 03:00:21PM +0200, Vlastimil Babka wrote:
> On 9/15/22 16:29, Chao Peng wrote:
> > From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> > 
> > KVM can use memfd-provided memory for guest memory. For normal userspace
> > accessible memory, KVM userspace (e.g. QEMU) mmaps the memfd into its
> > virtual address space and then tells KVM to use the virtual address to
> > setup the mapping in the secondary page table (e.g. EPT).
> > 
> > With confidential computing technologies like Intel TDX, the
> > memfd-provided memory may be encrypted with special key for special
> > software domain (e.g. KVM guest) and is not expected to be directly
> > accessed by userspace. Precisely, userspace access to such encrypted
> > memory may lead to host crash so it should be prevented.
> > 
> > This patch introduces userspace inaccessible memfd (created with
> > MFD_INACCESSIBLE). Its memory is inaccessible from userspace through
> > ordinary MMU access (e.g. read/write/mmap) but can be accessed via
> > in-kernel interface so KVM can directly interact with core-mm without
> > the need to map the memory into KVM userspace.
> > 
> > It provides semantics required for KVM guest private(encrypted) memory
> > support that a file descriptor with this flag set is going to be used as
> > the source of guest memory in confidential computing environments such
> > as Intel TDX/AMD SEV.
> > 
> > KVM userspace is still in charge of the lifecycle of the memfd. It
> > should pass the opened fd to KVM. KVM uses the kernel APIs newly added
> > in this patch to obtain the physical memory address and then populate
> > the secondary page table entries.
> > 
> > The userspace inaccessible memfd can be fallocate-ed and hole-punched
> > from userspace. When hole-punching happens, KVM can get notified through
> > inaccessible_notifier it then gets chance to remove any mapped entries
> > of the range in the secondary page tables.
> > 
> > The userspace inaccessible memfd itself is implemented as a shim layer
> > on top of real memory file systems like tmpfs/hugetlbfs but this patch
> > only implemented tmpfs. The allocated memory is currently marked as
> > unmovable and unevictable, this is required for current confidential
> > usage. But in future this might be changed.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
> > ---
> 
> ...
> 
> > +static long inaccessible_fallocate(struct file *file, int mode,
> > +				   loff_t offset, loff_t len)
> > +{
> > +	struct inaccessible_data *data = file->f_mapping->private_data;
> > +	struct file *memfd = data->memfd;
> > +	int ret;
> > +
> > +	if (mode & FALLOC_FL_PUNCH_HOLE) {
> > +		if (!PAGE_ALIGNED(offset) || !PAGE_ALIGNED(len))
> > +			return -EINVAL;
> > +	}
> > +
> > +	ret = memfd->f_op->fallocate(memfd, mode, offset, len);
> > +	inaccessible_notifier_invalidate(data, offset, offset + len);
> 
> Wonder if invalidate should precede the actual hole punch, otherwise we open
> a window where the page tables point to memory no longer valid?

Yes, you are right. Thanks for catching this.

> > +	return ret;
> > +}
> > +
> 
> ...
> 
> > +
> > +static struct file_system_type inaccessible_fs = {
> > +	.owner		= THIS_MODULE,
> > +	.name		= "[inaccessible]",
> 
> Dunno where exactly is this name visible, but shouldn't it better be
> "[memfd:inaccessible]"?

Maybe. And skip brackets.

> 
> > +	.init_fs_context = inaccessible_init_fs_context,
> > +	.kill_sb	= kill_anon_super,
> > +};
> > +
> 

-- 
  Kiryl Shutsemau / Kirill A. Shutemov

  reply	other threads:[~2022-10-17 16:20 UTC|newest]

Thread overview: 97+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-15 14:29 [PATCH v8 0/8] KVM: mm: fd-based approach for supporting KVM Chao Peng
2022-09-15 14:29 ` [PATCH v8 1/8] mm/memfd: Introduce userspace inaccessible memfd Chao Peng
2022-09-19  9:12   ` David Hildenbrand
2022-09-19 19:10     ` Sean Christopherson
2022-09-21 21:10       ` Andy Lutomirski
2022-09-22 13:23         ` Wang, Wei W
2022-09-23 15:20         ` Fuad Tabba
2022-09-23 15:19       ` Fuad Tabba
2022-09-26 14:23         ` Chao Peng
2022-09-26 15:51           ` Fuad Tabba
2022-09-27 22:47             ` Sean Christopherson
2022-09-30 16:19               ` Fuad Tabba
2022-10-13 13:34                 ` Chao Peng
2022-10-17 10:31                   ` Fuad Tabba
2022-10-17 14:58                     ` Chao Peng
2022-10-17 19:05                       ` Fuad Tabba
2022-10-19 13:30                         ` Chao Peng
2022-10-18  0:33                 ` Sean Christopherson
2022-10-19 15:04                   ` Fuad Tabba
2022-09-23  0:58     ` Kirill A . Shutemov
2022-09-26 10:35       ` David Hildenbrand
2022-09-26 14:48         ` Kirill A. Shutemov
2022-09-26 14:53           ` David Hildenbrand
2022-09-27 23:23             ` Sean Christopherson
2022-09-28 13:36               ` Kirill A. Shutemov
2022-09-22 13:26   ` Wang, Wei W
2022-09-22 19:49     ` Sean Christopherson
2022-09-23  0:53       ` Kirill A . Shutemov
2022-09-23 15:20         ` Fuad Tabba
2022-09-30 16:14   ` Fuad Tabba
2022-09-30 16:23     ` Kirill A . Shutemov
2022-10-03  7:33       ` Fuad Tabba
2022-10-03 11:01         ` Kirill A. Shutemov
2022-10-04 15:39           ` Fuad Tabba
2022-10-06  8:50   ` Fuad Tabba
2022-10-06 13:04     ` Kirill A. Shutemov
2022-10-17 13:00   ` Vlastimil Babka
2022-10-17 16:19     ` Kirill A . Shutemov [this message]
2022-10-17 16:39       ` Gupta, Pankaj
2022-10-17 21:56         ` Kirill A . Shutemov
2022-10-18 13:42           ` Vishal Annapurve
2022-10-19 15:32             ` Kirill A . Shutemov
2022-10-20 10:50               ` Vishal Annapurve
2022-10-21 13:54                 ` Chao Peng
2022-10-21 16:53                   ` Sean Christopherson
2022-10-19 12:23   ` Vishal Annapurve
2022-10-21 13:47     ` Chao Peng
2022-10-21 16:18       ` Sean Christopherson
2022-10-24 14:59         ` Kirill A . Shutemov
2022-10-24 15:26           ` David Hildenbrand
2022-11-03 16:27           ` Vishal Annapurve
2022-09-15 14:29 ` [PATCH v8 2/8] KVM: Extend the memslot to support fd-based private memory Chao Peng
2022-09-16  9:14   ` Bagas Sanjaya
2022-09-16  9:53     ` Chao Peng
2022-09-26 10:26   ` Fuad Tabba
2022-09-26 14:04     ` Chao Peng
2022-09-29 22:45   ` Isaku Yamahata
2022-09-29 23:22     ` Sean Christopherson
2022-10-05 13:04   ` Jarkko Sakkinen
2022-10-05 22:05     ` Jarkko Sakkinen
2022-10-06  9:00   ` Fuad Tabba
2022-10-06 14:58   ` Jarkko Sakkinen
2022-10-06 15:07     ` Jarkko Sakkinen
2022-10-06 15:34       ` Sean Christopherson
2022-10-07 11:14         ` Jarkko Sakkinen
2022-10-07 14:58           ` Sean Christopherson
2022-10-07 21:54             ` Jarkko Sakkinen
2022-10-08 16:15               ` Jarkko Sakkinen
2022-10-08 17:35                 ` Jarkko Sakkinen
2022-10-10  8:25                   ` Chao Peng
2022-10-12  8:14                     ` Jarkko Sakkinen
2022-09-15 14:29 ` [PATCH v8 3/8] KVM: Add KVM_EXIT_MEMORY_FAULT exit Chao Peng
2022-09-16  9:17   ` Bagas Sanjaya
2022-09-16  9:54     ` Chao Peng
2022-09-15 14:29 ` [PATCH v8 4/8] KVM: Use gfn instead of hva for mmu_notifier_retry Chao Peng
2022-09-15 14:29 ` [PATCH v8 5/8] KVM: Register/unregister the guest private memory regions Chao Peng
2022-09-26 10:36   ` Fuad Tabba
2022-09-26 14:07     ` Chao Peng
2022-10-11  9:48   ` Fuad Tabba
2022-10-12  2:35     ` Chao Peng
2022-10-17 10:15       ` Fuad Tabba
2022-10-17 22:17         ` Sean Christopherson
2022-10-19 13:23           ` Chao Peng
2022-10-19 15:02             ` Fuad Tabba
2022-10-19 16:09               ` Sean Christopherson
2022-10-19 18:32                 ` Fuad Tabba
2022-09-15 14:29 ` [PATCH v8 6/8] KVM: Update lpage info when private/shared memory are mixed Chao Peng
2022-09-29 16:52   ` Isaku Yamahata
2022-09-30  8:59     ` Chao Peng
2022-09-15 14:29 ` [PATCH v8 7/8] KVM: Handle page fault for private memory Chao Peng
2022-10-14 18:57   ` Sean Christopherson
2022-10-17 14:48     ` Chao Peng
2022-09-15 14:29 ` [PATCH v8 8/8] KVM: Enable and expose KVM_MEM_PRIVATE Chao Peng
2022-10-04 14:55   ` Jarkko Sakkinen
2022-10-10  8:31     ` Chao Peng
2022-10-06  8:55   ` Fuad Tabba
2022-10-10  8:33     ` Chao Peng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20221017161955.t4gditaztbwijgcn@box.shutemov.name \
    --to=kirill.shutemov@linux.intel.com \
    --cc=aarcange@redhat.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=bfields@fieldses.org \
    --cc=bp@alien8.de \
    --cc=chao.p.peng@linux.intel.com \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@intel.com \
    --cc=david@redhat.com \
    --cc=ddutile@redhat.com \
    --cc=dhildenb@redhat.com \
    --cc=hpa@zytor.com \
    --cc=hughd@google.com \
    --cc=jlayton@kernel.org \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=jun.nakajima@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mail@maciej.szmigiero.name \
    --cc=mhocko@suse.com \
    --cc=michael.roth@amd.com \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qperret@google.com \
    --cc=rppt@kernel.org \
    --cc=seanjc@google.com \
    --cc=shuah@kernel.org \
    --cc=songmuchun@bytedance.com \
    --cc=steven.price@arm.com \
    --cc=tglx@linutronix.de \
    --cc=vannapurve@google.com \
    --cc=vbabka@suse.cz \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=wei.w.wang@intel.com \
    --cc=x86@kernel.org \
    --cc=yu.c.zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.