From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Dave Hansen <dave.hansen@linux.intel.com>,
Andy Lutomirski <luto@kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <sean.j.christopherson@intel.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Wanpeng Li <wanpengli@tencent.com>,
Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
David Rientjes <rientjes@google.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Kees Cook <keescook@chromium.org>, Will Drewry <wad@chromium.org>,
"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
"Kleen, Andi" <andi.kleen@intel.com>,
x86@kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, Mike Rapoport <rppt@linux.ibm.com>,
Alexandre Chartre <alexandre.chartre@oracle.com>,
Marius Hillenbrand <mhillenb@amazon.de>
Subject: Re: [RFC 00/16] KVM protected memory extension
Date: Mon, 25 May 2020 08:27:04 +0300
Message-ID: <20200525052704.phyk5olkykncj3bj@black.fi.intel.com>
In-Reply-To: <20200522125214.31348-1-kirill.shutemov@linux.intel.com>
On Fri, May 22, 2020 at 03:51:58PM +0300, Kirill A. Shutemov wrote:
> == Background / Problem ==
>
> There are a number of hardware features (MKTME, SEV) which protect guest
> memory from some unauthorized host access. The patchset proposes a purely
> software feature that mitigates some of the same host-side read-only
> attacks.
CC people who worked on the related patchsets.
> == What does this set mitigate? ==
>
> - Host kernel "accidental" access to guest data (think speculation)
>
> - Host kernel induced access to guest data (write(fd, &guest_data_ptr, len))
>
> - Host userspace access to guest data (compromised qemu)
>
> == What does this set NOT mitigate? ==
>
> - Full host kernel compromise. Kernel will just map the pages again.
>
> - Hardware attacks
>
>
> The patchset is RFC-quality: it works but has known issues that must be
> addressed before it can be considered for applying.
>
> We are looking for high-level feedback on the concept. Some open
> questions:
>
> - This protects from some kernel and host userspace read-only attacks,
> but does not place the host kernel outside the trust boundary. Is it
> still valuable?
>
> - Can this approach be used to avoid cache-coherency problems with
> hardware encryption schemes that repurpose physical bits?
>
> - The guest kernel must be modified for this to work. Is that a deal
> breaker, especially for public clouds?
>
> - Are the costs of removing pages from the direct map too high to be
> feasible?
>
> == Series Overview ==
>
> The hardware features protect guest data by encrypting it and then
> ensuring that only the right guest can decrypt it. This has the
> side-effect of making the kernel direct map and userspace mapping
> (QEMU et al) useless. But, this teaches us something very useful:
> neither the kernel nor the userspace mappings are really necessary for normal
> guest operations.
>
> Instead of using encryption, this series simply unmaps the memory. One
> advantage compared to allowing access to ciphertext is that it allows bad
> accesses to be caught instead of simply reading garbage.
>
> Protection from physical attacks needs to be provided by some other means.
> On Intel platforms, (single-key) Total Memory Encryption (TME) provides
> mitigation against physical attacks, such as DIMM interposers sniffing
> memory bus traffic.
>
> The patchset modifies both host and guest kernel. The guest OS must enable
> the feature via hypercall and mark any memory range that has to be shared
> with the host: DMA regions, bounce buffers, etc. SEV does this marking via a
> bit in the guest’s page table while this approach uses a hypercall.
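To illustrate the enable/share flow described above, here is a minimal
userspace sketch. It is not code from the patchset: the struct, the
function names and the fixed guest size are all hypothetical, and the
"hypercalls" are modeled as plain function calls on a per-page table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_PAGES 64	/* hypothetical guest size, in pages */

/* Hypothetical model of the host-side view of one guest. */
struct guest_model {
	bool protection_enabled;	/* set by the "enable" hypercall */
	bool shared[MODEL_NR_PAGES];	/* true: host may access the page */
};

/* Models the guest enabling the feature: all memory becomes private. */
static void model_enable_protection(struct guest_model *g)
{
	g->protection_enabled = true;
	for (size_t i = 0; i < MODEL_NR_PAGES; i++)
		g->shared[i] = false;
}

/* Models the "share" hypercall for [gfn, gfn + npages). Returns 0 or -1. */
static int model_share_range(struct guest_model *g, uint64_t gfn,
			     uint64_t npages)
{
	if (gfn >= MODEL_NR_PAGES || npages > MODEL_NR_PAGES - gfn)
		return -1;
	for (uint64_t i = gfn; i < gfn + npages; i++)
		g->shared[i] = true;
	return 0;
}

/* Host may touch a page if protection is off or the guest shared it. */
static bool model_host_may_access(const struct guest_model *g, uint64_t gfn)
{
	if (gfn >= MODEL_NR_PAGES)
		return false;
	return !g->protection_enabled || g->shared[gfn];
}
```

In this model a guest would call model_enable_protection() once and then
model_share_range() for each DMA region or bounce buffer; everything else
stays inaccessible to the host.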
>
> For removing the userspace mapping, use a trick similar to what NUMA
> balancing does: convert memory that belongs to KVM memory slots to
> PROT_NONE: all existing entries are converted to PROT_NONE with mprotect() and
> newly faulted-in pages get PROT_NONE from the updated vm_page_prot.
> The new VMA flag -- VM_KVM_PROTECTED -- indicates that the pages in the
> VMA must be treated in a special way in the GUP and fault paths. The flag
> allows GUP to return the page even though it is mapped with PROT_NONE, but
> only if the new GUP flag -- FOLL_KVM -- is specified. Any userspace access
> to the memory would result in SIGBUS. Any GUP access without FOLL_KVM
> would result in -EFAULT.
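The access rules in the paragraph above can be summarized as a small
decision table. The sketch below is a userspace model only, assuming
hypothetical MODEL_* flag values; it does not reproduce the real GUP or
fault-path code, just the three outcomes the text names.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>

/* Hypothetical stand-ins for the flags named in the cover letter. */
#define MODEL_VM_KVM_PROTECTED	0x1u
#define MODEL_FOLL_KVM		0x1u

/* Outcome of a direct userspace load/store: 0 on success, else a signal. */
static int model_user_access(unsigned int vma_flags)
{
	return (vma_flags & MODEL_VM_KVM_PROTECTED) ? SIGBUS : 0;
}

/* Outcome of a GUP call: 0 if the page is returned, else a negative errno. */
static int model_gup(unsigned int vma_flags, unsigned int gup_flags)
{
	/* Unprotected VMAs behave as before. */
	if (!(vma_flags & MODEL_VM_KVM_PROTECTED))
		return 0;
	/* Protected VMAs hand out pages only to callers passing FOLL_KVM. */
	return (gup_flags & MODEL_FOLL_KVM) ? 0 : -EFAULT;
}
```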
>
> Any anonymous page faulted into the VM_KVM_PROTECTED VMA gets removed from
> the direct mapping with kernel_map_pages(). Note that kernel_map_pages() only
> flushes the local TLB. I think it's a reasonable compromise between security
> and performance.
>
> Zapping the PTE would bring the page back to the direct mapping after clearing.
> At least for now, we don't remove file-backed pages from the direct mapping.
> File-backed pages can be accessed via read/write syscalls, so unmapping them
> adds complexity.
>
> Occasionally, the host kernel has to access guest memory that was not made
> shared by the guest. For instance, this happens for instruction emulation.
> Normally, it's done via copy_to/from_user(), which would now fail with
> -EFAULT. We introduced a new pair of helpers: copy_to/from_guest(). The new
> helpers acquire the page via GUP, map it into kernel address space with a
> kmap_atomic()-style mechanism and only then copy the data.
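The acquire, map, copy sequence can be sketched as a per-page loop. This
is a userspace model, not the patchset's copy_from_guest(): guest memory
is a plain array, and model_map_guest_page() / model_unmap_guest_page()
are hypothetical stand-ins for "GUP plus temporary mapping" and its
teardown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE	4096u
#define MODEL_NR_PAGES	4u

/* Hypothetical guest memory: pages with no permanent host mapping. */
static uint8_t model_guest_pages[MODEL_NR_PAGES][MODEL_PAGE_SIZE];

/* Stand-in for "GUP + temporary map": returns the page or NULL. */
static uint8_t *model_map_guest_page(uint64_t gfn)
{
	return gfn < MODEL_NR_PAGES ? model_guest_pages[gfn] : NULL;
}

/* Stand-in for dropping the temporary mapping and the page reference. */
static void model_unmap_guest_page(uint8_t *page)
{
	(void)page;
}

/* Copy len bytes from guest physical address gpa, one page at a time. */
static int model_copy_from_guest(void *dst, uint64_t gpa, size_t len)
{
	uint8_t *out = dst;

	while (len > 0) {
		uint64_t gfn = gpa / MODEL_PAGE_SIZE;
		size_t offset = gpa % MODEL_PAGE_SIZE;
		size_t chunk = MODEL_PAGE_SIZE - offset;
		uint8_t *page;

		/* Never copy past the end of the current page. */
		if (chunk > len)
			chunk = len;

		page = model_map_guest_page(gfn);
		if (!page)
			return -EFAULT;
		memcpy(out, page + offset, chunk);
		model_unmap_guest_page(page);

		out += chunk;
		gpa += chunk;
		len -= chunk;
	}
	return 0;
}
```

The point of the loop is that each page is mapped only for the duration
of its own memcpy(), so a copy that crosses a page boundary maps and
unmaps two pages in turn rather than relying on a contiguous mapping.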
>
> For some instruction emulation, copying is not good enough: cmpxchg
> emulation has to have direct access to the guest memory. __kvm_map_gfn()
> is modified to accommodate the case.
>
> The patchset is on top of v5.7-rc6 plus this patch:
>
> https://lkml.kernel.org/r/20200402172507.2786-1-jimmyassarsson@gmail.com
>
> == Open Issues ==
>
> Unmapping the pages from the direct mapping brings a few issues that have
> not been rectified yet:
>
> - Touching direct mapping leads to fragmentation. We need to be able to
> recover from it. I have a buggy patch that aims at recovering 2M/1G pages.
> It has to be fixed and tested properly.
>
> - Page migration and KSM are not supported yet.
>
> - Live migration of a guest would require a new flow. Not sure yet what it
> would look like.
>
> - The feature interferes with NUMA balancing. Not sure yet if it's
> possible to make them work together.
>
> - Guests have no mechanism to ensure that even a well-behaving host has
> unmapped its private data. With SEV, for instance, the guest only has
> to trust the hardware to encrypt a page after the C bit is set in a
> guest PTE. A mechanism for a guest to query the host mapping state, or
> to constantly assert the intent for a page to be Private would be
> valuable.
--
Kirill A. Shutemov