kvm.vger.kernel.org archive mirror
From: Sean Christopherson <seanjc@google.com>
To: Vishal Annapurve <vannapurve@google.com>
Cc: Ackerley Tng <ackerleytng@google.com>,
	pbonzini@redhat.com, maz@kernel.org, oliver.upton@linux.dev,
	chenhuacai@kernel.org, mpe@ellerman.id.au, anup@brainfault.org,
	paul.walmsley@sifive.com, palmer@dabbelt.com,
	aou@eecs.berkeley.edu, willy@infradead.org,
	akpm@linux-foundation.org, paul@paul-moore.com,
	jmorris@namei.org, serge@hallyn.com, kvm@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-security-module@vger.kernel.org,
	linux-kernel@vger.kernel.org, chao.p.peng@linux.intel.com,
	tabba@google.com, jarkko@kernel.org, yu.c.zhang@linux.intel.com,
	mail@maciej.szmigiero.name, vbabka@suse.cz, david@redhat.com,
	qperret@google.com, michael.roth@amd.com, wei.w.wang@intel.com,
	liam.merwick@oracle.com, isaku.yamahata@gmail.com,
	kirill.shutemov@linux.intel.com
Subject: Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory
Date: Fri, 11 Aug 2023 10:44:11 -0700
Message-ID: <ZNZza/emWldkJC6X@google.com>
In-Reply-To: <CAGtprH9YE50RtqhW-U+wK0Vv6aKfqqtOPn8q4s8or=UZwPXZoA@mail.gmail.com>

On Thu, Aug 10, 2023, Vishal Annapurve wrote:
> On Tue, Aug 8, 2023 at 2:13 PM Sean Christopherson <seanjc@google.com> wrote:
> > ...
> 
> > > + When binding a memslot to the file, if a kvm pointer exists, it must
> > >   be the same kvm as the one in this binding
> > > + When the binding to the last memslot is removed from a file, NULL the
> > >   kvm pointer.
> >
> > Nullifying the KVM pointer isn't sufficient, because without additional actions
> > userspace could extract data from a VM by deleting its memslots and then binding
> > the guest_memfd to an attacker controlled VM.  Or more likely with TDX and SNP,
> > induce badness by coercing KVM into mapping memory into a guest with the wrong
> > ASID/HKID.
> >
> 
> TDX/SNP have mechanisms i.e. PAMT/RMP tables to ensure that the same
> memory is not assigned to two different VMs.

One of the main reasons we pivoted away from using a flag in "struct page" to
indicate that a page was private was so that KVM could enforce 1:1 VM:page ownership
*without* relying on hardware.
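
To make the ownership rule concrete, here is a minimal userspace sketch of one
way to enforce it in software.  Everything in it (struct vm, struct gmem_file,
gmem_create(), gmem_bind(), gmem_unbind()) is a hypothetical stand-in for
illustration, not the actual KVM code: the file records its owning VM once, at
creation, and never clears it, so unbinding the last memslot doesn't open a
window for binding the file to a different VM.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real KVM structures. */
struct vm {
	int id;
};

struct gmem_file {
	const struct vm *owner;	/* set once at creation, never cleared */
	int nr_bindings;	/* number of memslots currently bound */
};

/* Tie the file to exactly one VM for its entire lifetime. */
static void gmem_create(struct gmem_file *f, const struct vm *owner)
{
	assert(owner != NULL);
	f->owner = owner;
	f->nr_bindings = 0;
}

/* Binding succeeds only for the VM that owns the file. */
static int gmem_bind(struct gmem_file *f, const struct vm *vm)
{
	if (f->owner != vm)
		return -EINVAL;

	f->nr_bindings++;
	return 0;
}

/* Dropping a binding, even the last one, leaves ownership untouched. */
static void gmem_unbind(struct gmem_file *f)
{
	assert(f->nr_bindings > 0);
	f->nr_bindings--;
	/* Deliberately do NOT reset f->owner when nr_bindings hits zero. */
}
```

With this shape, removing every binding and then calling gmem_bind() with a
different VM still fails with -EINVAL, which is the property that simply
NULLing the pointer would lose.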

And FWIW, the PAMT provides no protection in this specific case because KVM does
TDH.MEM.PAGE.REMOVE when zapping S-EPT entries, and that marks the page clear in
the PAMT.  The danger there is that physical memory is still encrypted with the
guest's HKID, and so mapping the memory into a different VM (which might not be
a TDX guest!) could lead to corruption and/or poison #MCs.

The HKID issues wouldn't be a problem if v15 is merged as-is, because zapping
S-EPT entries also fully purges and reclaims the page, but as we discussed in
one of the many threads, reclaiming physical memory should be tied to the inode,
i.e. to memory truly being freed, and not to S-EPTs being zapped.  And there is
a very good reason for wanting to do that, as it allows KVM to do the expensive
cache flush + clear outside of mmu_lock.

> Deleting memslots should also clear out the contents of the memory as the EPT
> tables will be zapped in the process

No, deleting a memslot should not clear memory.  As I said in my previous response,
the fact that zapping S-EPT entries is destructive is a limitation of TDX, not a
feature we want to apply to other VM types.  And that's not even a fundamental
property of TDX, e.g. TDX could remove the limitation, at the cost of consuming
quite a bit more memory, by tracking the exact owner by HKID in the PAMT and
decoupling S-EPT entries from page ownership.

Or in theory, KVM could work around the limitation by only doing TDH.MEM.RANGE.BLOCK
when zapping S-EPTs.  Hmm, that might actually be worth looking at.

> and the host will reclaim the memory.

There are no guarantees that the host will reclaim the memory.  E.g. QEMU will
delete and re-create memslots for "regular" VMs when emulating option ROMs.  Even
if that use case is nonsensical for confidential VMs (and it probably is nonsensical),
I don't want to define KVM's ABI based on what we *think* userspace will do.
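
As a rough illustration of the semantics argued for above, the toy model below
separates "zap the mapping" (what deleting a memslot does) from "release the
backing memory" (what freeing the file's memory does).  The names (struct
backing, backing_init(), zap_mapping(), map_again(), release_backing()) are
made up for this sketch and don't correspond to real KVM functions.  Only the
release path clears contents.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SZ 16	/* tiny "page" to keep the example short */

/* Hypothetical model of one page of guest backing memory. */
struct backing {
	unsigned char data[PAGE_SZ];
	bool mapped;	/* is the page currently mapped into the guest? */
};

static void backing_init(struct backing *b, unsigned char fill)
{
	assert(b != NULL);
	memset(b->data, fill, PAGE_SZ);
	b->mapped = true;
}

/* Memslot deletion: drop the mapping, leave the contents alone. */
static void zap_mapping(struct backing *b)
{
	b->mapped = false;
}

/* Memslot re-creation: the guest sees the same contents again. */
static void map_again(struct backing *b)
{
	b->mapped = true;
}

/* Memory truly being freed: this is where the clear belongs. */
static void release_backing(struct backing *b)
{
	b->mapped = false;
	memset(b->data, 0, PAGE_SZ);
}
```

In this model, a zap_mapping() followed by map_again(), e.g. the QEMU
delete/re-create pattern, preserves the data, and only release_backing()
destroys it.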

