From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: David Rientjes <rientjes@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Kees Cook <keescook@chromium.org>, Will Drewry <wad@chromium.org>,
	"Edgecombe\, Rick P" <rick.p.edgecombe@intel.com>, "Kleen\,
	Andi" <andi.kleen@intel.com>, Liran Alon <liran.alon@oracle.com>,
	Mike Rapoport <rppt@kernel.org>,
	x86@kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <sean.j.christopherson@intel.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>
Subject: Re: [RFCv2 00/16] KVM protected memory extension
Date: Tue, 20 Oct 2020 09:46:11 +0200
Message-ID: <87ft6949x8.fsf@vitty.brq.redhat.com>
In-Reply-To: <20201020061859.18385-1-kirill.shutemov@linux.intel.com>

"Kirill A. Shutemov" <kirill@shutemov.name> writes:

> == Background / Problem ==
>
> There are a number of hardware features (MKTME, SEV) which protect guest
> memory from some unauthorized host access. The patchset proposes a purely
> software feature that mitigates some of the same host-side read-only
> attacks.
>
>
> == What does this set mitigate? ==
>
>  - Host kernel "accidental" access to guest data (think speculation)
>
>  - Host kernel induced access to guest data (write(fd, &guest_data_ptr, len))
>
>  - Host userspace access to guest data (compromised qemu)
>
>  - Guest privilege escalation via compromised QEMU device emulation
>
> == What does this set NOT mitigate? ==
>
>  - Full host kernel compromise.  Kernel will just map the pages again.
>
>  - Hardware attacks
>
>
> The second RFC revision addresses /most/ of the feedback.
>
> I still haven't found a good solution for reboot and kexec. Unprotecting all
> the memory on such operations defeats the goal of the feature. Clearing up
> most of the memory before unprotecting what is required for reboot (or
> kexec) is tedious and error-prone.
> Maybe we should just declare them unsupported?

Making reboot unsupported is a hard sell. Could you please elaborate on
why you think that an "unprotect all" hypercall (or rather a single
hypercall supporting both protecting and unprotecting) defeats the purpose
of the feature?

(Leaving kexec aside for a while) Yes, it is not easy for a guest to
clean up *all* its memory upon reboot, however:
- It may only clean up the most sensitive parts. This should probably be
done even without this new feature and even on bare metal (think about
the next boot target being malicious).
- The attack window shrinks significantly. "Speculative" bugs require
time to exploit, and the window will only remain open until the guest
boots up again (a few seconds).
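
To make the idea concrete, here is a rough sketch of what such a reboot
path could look like. It is a plain C simulation, not code from the
series: struct mem_region, prepare_for_reboot() and
hypothetical_set_mem_protection() are names made up for illustration,
and the "hypercall" is just a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a guest memory range. */
struct mem_region {
	void *addr;
	size_t len;
	int sensitive;		/* non-zero: must be wiped before unprotecting */
};

/* Simulated protection state; a real guest would ask the host instead. */
static int memory_protected = 1;

/* Hypothetical stand-in for a single protect/unprotect hypercall. */
static void hypothetical_set_mem_protection(int enable)
{
	memory_protected = enable;
}

/* Zero a range through a volatile pointer so the stores are not elided. */
static void wipe(void *addr, size_t len)
{
	volatile unsigned char *p = addr;

	while (len--)
		*p++ = 0;
}

/*
 * Wipe only the regions marked sensitive, then drop protection for
 * everything in one call. Returns the number of bytes wiped.
 */
static size_t prepare_for_reboot(struct mem_region *regions, size_t count)
{
	size_t wiped = 0;

	assert(regions != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (!regions[i].sensitive)
			continue;
		wipe(regions[i].addr, regions[i].len);
		wiped += regions[i].len;
	}

	hypothetical_set_mem_protection(0);
	return wiped;
}
```

The point is the ordering: sensitive data is gone before the
unprotect call, so whatever remains readable during the short reboot
window is the less interesting part.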

>
> == Series Overview ==
>
> The hardware features protect guest data by encrypting it and then
> ensuring that only the right guest can decrypt it.  This has the
> side-effect of making the kernel direct map and userspace mapping
> (QEMU et al) useless.  But, this teaches us something very useful:
> neither the kernel nor userspace mappings are really necessary for normal
> guest operations.
>
> Instead of using encryption, this series simply unmaps the memory. One
> advantage compared to allowing access to ciphertext is that it allows bad
> accesses to be caught instead of simply reading garbage.
>
> Protection from physical attacks needs to be provided by some other means.
> On Intel platforms, (single-key) Total Memory Encryption (TME) provides
> mitigation against physical attacks, such as DIMM interposers sniffing
> memory bus traffic.
>
> The patchset modifies both host and guest kernel. The guest OS must enable
> the feature via hypercall and mark any memory range that has to be shared
> with the host: DMA regions, bounce buffers, etc. SEV does this marking via a
> bit in the guest’s page table while this approach uses a hypercall.
>
> For removing the userspace mapping, use a trick similar to what NUMA
> balancing does: convert memory that belongs to KVM memory slots to
> PROT_NONE: all existing entries are converted to PROT_NONE with mprotect() and
> the newly faulted in pages get PROT_NONE from the updated vm_page_prot.
> The new VMA flag -- VM_KVM_PROTECTED -- indicates that the pages in the
> VMA must be treated in a special way in the GUP and fault paths. The flag
> allows GUP to return the page even though it is mapped with PROT_NONE, but
> only if the new GUP flag -- FOLL_KVM -- is specified. Any userspace access
> to the memory would result in SIGBUS. Any GUP access without FOLL_KVM
> would result in -EFAULT.
>
> Removing userspace mapping of the guest memory from QEMU process can help
> to address some guest privilege escalation attacks. Consider the case when
> an unprivileged guest user exploits a bug in QEMU device emulation to gain
> access to data it cannot normally access within the guest.
>
> Any anonymous page faulted into the VM_KVM_PROTECTED VMA gets removed from
> the direct mapping with kernel_map_pages(). Note that kernel_map_pages() only
> flushes the local TLB. I think it's a reasonable compromise between security and
> performance.
>
> Zapping the PTE would bring the page back to the direct mapping after clearing.
> At least for now, we don't remove file-backed pages from the direct mapping.
> File-backed pages could be accessed via read/write syscalls, which adds
> complexity.
>
> Occasionally, the host kernel has to access guest memory that was not made
> shared by the guest. For instance, it happens for instruction emulation.
> Normally, it's done via copy_to/from_user() which would fail with -EFAULT
> now. We introduced a new pair of helpers: copy_to/from_guest(). The new
> helpers acquire the page via GUP, map it into kernel address space with
> a kmap_atomic()-style mechanism and only then copy the data.
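
The "look up the page, map it briefly, copy, unmap" pattern can be
sketched in userspace as follows. This is only an analogy under stated
assumptions, not the helpers from the series: guest memory is modelled
as an unlinked temporary file with no permanent mapping, a per-page
mmap()/munmap() stands in for GUP plus the kmap_atomic()-style mapping,
and struct guest_mem, guest_mem_init() and the *_sketch() functions are
hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical model of guest memory that is never mapped permanently. */
struct guest_mem {
	int fd;
	size_t size;
	size_t page_size;
};

static int guest_mem_init(struct guest_mem *gm, size_t pages)
{
	long ps = sysconf(_SC_PAGESIZE);
	FILE *f = tmpfile();

	if (ps <= 0 || f == NULL) {
		if (f)
			fclose(f);
		return -1;
	}
	gm->fd = dup(fileno(f));	/* keeps the unlinked file alive */
	fclose(f);
	gm->page_size = (size_t)ps;
	gm->size = pages * (size_t)ps;
	if (gm->fd < 0)
		return -1;
	if (ftruncate(gm->fd, (off_t)gm->size) != 0) {
		close(gm->fd);
		return -1;
	}
	return 0;
}

/* Copy page by page; each page is mapped only for the duration of the copy. */
static int guest_copy(struct guest_mem *gm, size_t off, void *buf,
		      size_t len, int to_guest)
{
	char *p = buf;

	assert(gm != NULL && buf != NULL);
	if (off > gm->size || len > gm->size - off)
		return -1;		/* out of range, the -EFAULT case */

	while (len > 0) {
		size_t page_off = off % gm->page_size;
		size_t chunk = gm->page_size - page_off;
		char *map;

		if (chunk > len)
			chunk = len;

		/* Temporary mapping of the single page that holds 'off'. */
		map = mmap(NULL, gm->page_size, PROT_READ | PROT_WRITE,
			   MAP_SHARED, gm->fd, (off_t)(off - page_off));
		if (map == MAP_FAILED)
			return -1;

		if (to_guest)
			memcpy(map + page_off, p, chunk);
		else
			memcpy(p, map + page_off, chunk);

		munmap(map, gm->page_size);	/* drop the mapping right away */
		off += chunk;
		p += chunk;
		len -= chunk;
	}
	return 0;
}

static int copy_to_guest_sketch(struct guest_mem *gm, size_t off,
				const void *src, size_t len)
{
	return guest_copy(gm, off, (void *)src, len, 1);
}

static int copy_from_guest_sketch(struct guest_mem *gm, size_t off,
				  void *dst, size_t len)
{
	return guest_copy(gm, off, dst, len, 0);
}
```

Copies that straddle a page boundary are split, so at most one page is
mapped at any time.
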
>
> For some instruction emulation copying is not good enough: cmpxchg
> emulation has to have direct access to the guest memory. __kvm_map_gfn()
> is modified to accommodate the case.
>
> The patchset is on top of v5.9
>
> Kirill A. Shutemov (16):
>   x86/mm: Move force_dma_unencrypted() to common code
>   x86/kvm: Introduce KVM memory protection feature
>   x86/kvm: Make DMA pages shared
>   x86/kvm: Use bounce buffers for KVM memory protection
>   x86/kvm: Make VirtIO use DMA API in KVM guest
>   x86/kvmclock: Share hvclock memory with the host
>   x86/realmode: Share trampoline area if KVM memory protection enabled
>   KVM: Use GUP instead of copy_from/to_user() to access guest memory
>   KVM: mm: Introduce VM_KVM_PROTECTED
>   KVM: x86: Use GUP for page walk instead of __get_user()
>   KVM: Protected memory extension
>   KVM: x86: Enabled protected memory extension
>   KVM: Rework copy_to/from_guest() to avoid direct mapping
>   KVM: Handle protected memory in __kvm_map_gfn()/__kvm_unmap_gfn()
>   KVM: Unmap protected pages from direct mapping
>   mm: Do not use zero page for VM_KVM_PROTECTED VMAs
>
>  arch/powerpc/kvm/book3s_64_mmu_hv.c    |   2 +-
>  arch/powerpc/kvm/book3s_64_mmu_radix.c |   2 +-
>  arch/s390/include/asm/pgtable.h        |   2 +-
>  arch/x86/Kconfig                       |  11 +-
>  arch/x86/include/asm/cpufeatures.h     |   1 +
>  arch/x86/include/asm/io.h              |   6 +-
>  arch/x86/include/asm/kvm_para.h        |   5 +
>  arch/x86/include/asm/pgtable_types.h   |   1 +
>  arch/x86/include/uapi/asm/kvm_para.h   |   3 +-
>  arch/x86/kernel/kvm.c                  |  20 +++
>  arch/x86/kernel/kvmclock.c             |   2 +-
>  arch/x86/kernel/pci-swiotlb.c          |   3 +-
>  arch/x86/kvm/Kconfig                   |   1 +
>  arch/x86/kvm/cpuid.c                   |   3 +-
>  arch/x86/kvm/mmu/mmu.c                 |   6 +-
>  arch/x86/kvm/mmu/paging_tmpl.h         |  10 +-
>  arch/x86/kvm/x86.c                     |   9 +
>  arch/x86/mm/Makefile                   |   2 +
>  arch/x86/mm/ioremap.c                  |  16 +-
>  arch/x86/mm/mem_encrypt.c              |  51 ------
>  arch/x86/mm/mem_encrypt_common.c       |  62 +++++++
>  arch/x86/mm/pat/set_memory.c           |   7 +
>  arch/x86/realmode/init.c               |   7 +-
>  drivers/virtio/virtio_ring.c           |   4 +
>  include/linux/kvm_host.h               |  11 +-
>  include/linux/kvm_types.h              |   1 +
>  include/linux/mm.h                     |  21 ++-
>  include/uapi/linux/kvm_para.h          |   5 +-
>  mm/gup.c                               |  20 ++-
>  mm/huge_memory.c                       |  31 +++-
>  mm/ksm.c                               |   2 +
>  mm/memory.c                            |  18 +-
>  mm/mmap.c                              |   3 +
>  mm/rmap.c                              |   4 +
>  virt/kvm/Kconfig                       |   3 +
>  virt/kvm/async_pf.c                    |   2 +-
>  virt/kvm/kvm_main.c                    | 238 +++++++++++++++++++++----
>  virt/lib/Makefile                      |   1 +
>  virt/lib/mem_protected.c               | 193 ++++++++++++++++++++
>  39 files changed, 666 insertions(+), 123 deletions(-)
>  create mode 100644 arch/x86/mm/mem_encrypt_common.c
>  create mode 100644 virt/lib/mem_protected.c

-- 
Vitaly

