linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "peterz@infradead.org" <peterz@infradead.org>,
	"jmattson@google.com" <jmattson@google.com>,
	"Christopherson, Sean J" <sean.j.christopherson@intel.com>,
	"vkuznets@redhat.com" <vkuznets@redhat.com>,
	"david@redhat.com" <david@redhat.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"joro@8bytes.org" <joro@8bytes.org>,
	"luto@kernel.org" <luto@kernel.org>,
	"kirill@shutemov.name" <kirill@shutemov.name>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"rppt@linux.ibm.com" <rppt@linux.ibm.com>,
	"wanpengli@tencent.com" <wanpengli@tencent.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"keescook@chromium.org" <keescook@chromium.org>,
	"rppt@kernel.org" <rppt@kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"aarcange@redhat.com" <aarcange@redhat.com>,
	"liran.alon@oracle.com" <liran.alon@oracle.com>,
	"wad@chromium.org" <wad@chromium.org>,
	"rientjes@google.com" <rientjes@google.com>,
	"Kleen, Andi" <andi.kleen@intel.com>
Subject: Re: [RFCv2 15/16] KVM: Unmap protected pages from direct mapping
Date: Wed, 21 Oct 2020 01:20:18 +0000	[thread overview]
Message-ID: <fac0a35ff7c7543467c5704767f9815ff719566b.camel@intel.com> (raw)
In-Reply-To: <2759b4bf-e1e3-d006-7d86-78a40348269d@redhat.com>

On Tue, 2020-10-20 at 15:20 +0200, David Hildenbrand wrote:
> On 20.10.20 14:18, David Hildenbrand wrote:
> > On 20.10.20 08:18, Kirill A. Shutemov wrote:
> > > If the protected memory feature enabled, unmap guest memory from
> > > kernel's direct mappings.
> > 
> > Gah, ugly. I guess this also defeats compaction, swapping, ... oh
> > gosh.
> > As if all of the encrypted VM implementations didn't bring us
> > enough
> > ugliness already (SEV extensions also don't support reboots, but
> > can at
> > least kexec() IIRC).
> > 
> > Something similar is done with secretmem [1]. And people don't seem
> > to
> > like fragmenting the direct mapping (including me).
> > 
> > [1] https://lkml.kernel.org/r/20200924132904.1391-1-rppt@kernel.org
> > 
> 
> I just thought "hey, we might have to replace pud/pmd mappings by
> page
> tables when calling kernel_map_pages", this can fail with -ENOMEM,
> why
> isn't there proper error handling.
> 
> Then I dived into __kernel_map_pages() which states:
> 
> "The return value is ignored as the calls cannot fail. Large pages
> for
> identity mappings are not used at boot time and hence no memory
> allocations during large page split."
> 
> I am probably missing something important, but how is calling
> kernel_map_pages() safe *after* booting?! I know we use it for
> debug_pagealloc(), but using it in a production-ready feature feels
> completely irresponsible. What am I missing?
> 

My understanding was that DEBUG_PAGEALLOC maps the direct map at 4k
post-boot as well so that kernel_map_pages() can't fail for it, and to
avoid having to take locks for splits in interrupts. I think you are on
to something though. That function is essentially a set_memory_()
functions on x86 at least, and many callers do not check the error
codes. Kees Cook has been pushing to fix this actually: 
https://github.com/KSPP/linux/issues/7

As long as a page has been broken to 4k, it should be able to be re-
mapped without failure (like debug page alloc relies on). If an initial
call to restrict permissions needs and fails to break a page, its remap
does not need to succeed either before freeing the page, since the page
never got set with a lower permission. That's my understanding from
x86's cpa at least. The problem becomes that the page silently doesn't
have its expected protections during use. Unchecked set_memory_()
caching property change failures might result in functional problems
though.

So there is some background if you wanted it, but yea, I agree this
feature should handle if the unmap failed. Probably return an error on
the IOCTL and maybe the hypercall. kernel_map_pages() also doesn't do a
shootdown though, so this direct map protection is more in the best
effort category in its current state.


For unmapping causing direct map fragmentation. The two techniques
floating around, besides breaking indiscriminately, seem to be:
1. Group and cache unmapped pages to minimize the damage (like secret
mem)
2. Somehow repair them back to large pages when reset RW (as Kirill had
tried earlier this year in another series)

I had imagined this usage would want something like that eventually,
but neither helps with the other limitations you mentioned (migration,
etc).

  reply	other threads:[~2020-10-21  1:20 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-20  6:18 [RFCv2 00/16] KVM protected memory extension Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 01/16] x86/mm: Move force_dma_unencrypted() to common code Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 02/16] x86/kvm: Introduce KVM memory protection feature Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 03/16] x86/kvm: Make DMA pages shared Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 04/16] x86/kvm: Use bounce buffers for KVM memory protection Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 05/16] x86/kvm: Make VirtIO use DMA API in KVM guest Kirill A. Shutemov
2020-10-20  8:06   ` Christoph Hellwig
2020-10-20 12:47     ` Kirill A. Shutemov
2020-10-22  3:31   ` Halil Pasic
2020-10-20  6:18 ` [RFCv2 06/16] x86/kvmclock: Share hvclock memory with the host Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 07/16] x86/realmode: Share trampoline area if KVM memory protection enabled Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 08/16] KVM: Use GUP instead of copy_from/to_user() to access guest memory Kirill A. Shutemov
2020-10-20  8:25   ` John Hubbard
2020-10-20 12:51     ` Kirill A. Shutemov
2020-10-22 11:49     ` Matthew Wilcox
2020-10-22 19:58       ` John Hubbard
2020-10-26  4:21         ` Matthew Wilcox
2020-10-26  4:44           ` John Hubbard
2020-10-26 13:28             ` Matthew Wilcox
2020-10-26 14:16               ` Jason Gunthorpe
2020-10-26 20:52               ` John Hubbard
2020-10-20 17:29   ` Ira Weiny
2020-10-22 11:37     ` Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 09/16] KVM: mm: Introduce VM_KVM_PROTECTED Kirill A. Shutemov
2020-10-21 18:47   ` Edgecombe, Rick P
2020-10-22 12:01     ` Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 10/16] KVM: x86: Use GUP for page walk instead of __get_user() Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 11/16] KVM: Protected memory extension Kirill A. Shutemov
2020-10-20  7:17   ` Peter Zijlstra
2020-10-20 12:55     ` Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 12/16] KVM: x86: Enabled protected " Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 13/16] KVM: Rework copy_to/from_guest() to avoid direct mapping Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 14/16] KVM: Handle protected memory in __kvm_map_gfn()/__kvm_unmap_gfn() Kirill A. Shutemov
2020-10-21 18:50   ` Edgecombe, Rick P
2020-10-22 12:06     ` Kirill A. Shutemov
2020-10-22 16:59       ` Edgecombe, Rick P
2020-10-23 10:36         ` Kirill A. Shutemov
2020-10-22  3:26   ` Halil Pasic
2020-10-22 12:07     ` Kirill A. Shutemov
2020-10-20  6:18 ` [RFCv2 15/16] KVM: Unmap protected pages from direct mapping Kirill A. Shutemov
2020-10-20  7:12   ` Peter Zijlstra
2020-10-20 12:18   ` David Hildenbrand
2020-10-20 13:20     ` David Hildenbrand
2020-10-21  1:20       ` Edgecombe, Rick P [this message]
2020-10-26 19:55     ` Tom Lendacky
2020-10-21 18:49   ` Edgecombe, Rick P
2020-10-23 12:37   ` Mike Rapoport
2020-10-23 16:32     ` Sean Christopherson
2020-10-20  6:18 ` [RFCv2 16/16] mm: Do not use zero page for VM_KVM_PROTECTED VMAs Kirill A. Shutemov
2020-10-20  7:46 ` [RFCv2 00/16] KVM protected memory extension Vitaly Kuznetsov
2020-10-20 13:49   ` Kirill A. Shutemov
2020-10-21 14:46     ` Vitaly Kuznetsov
2020-10-23 11:35       ` Kirill A. Shutemov
2020-10-23 12:01         ` Vitaly Kuznetsov
2020-10-21 18:20 ` Andy Lutomirski
2020-10-26 15:29   ` Kirill A. Shutemov
2020-10-26 23:58     ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fac0a35ff7c7543467c5704767f9815ff719566b.camel@intel.com \
    --to=rick.p.edgecombe@intel.com \
    --cc=aarcange@redhat.com \
    --cc=andi.kleen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=keescook@chromium.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=liran.alon@oracle.com \
    --cc=luto@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rientjes@google.com \
    --cc=rppt@kernel.org \
    --cc=rppt@linux.ibm.com \
    --cc=sean.j.christopherson@intel.com \
    --cc=vkuznets@redhat.com \
    --cc=wad@chromium.org \
    --cc=wanpengli@tencent.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).