From: Mike Rapoport <rppt@kernel.org>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Joerg Roedel <jroedel@suse.de>,
David Rientjes <rientjes@google.com>,
Borislav Petkov <bp@alien8.de>, Andy Lutomirski <luto@kernel.org>,
Sean Christopherson <seanjc@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andi Kleen <ak@linux.intel.com>,
Brijesh Singh <brijesh.singh@amd.com>,
Tom Lendacky <thomas.lendacky@amd.com>,
Jon Grimm <jon.grimm@amd.com>,
Thomas Gleixner <tglx@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Ingo Molnar <mingo@redhat.com>,
"Kaplan, David" <David.Kaplan@amd.com>,
Varad Gautam <varad.gautam@suse.com>,
Dario Faggioli <dfaggioli@suse.com>,
x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev
Subject: Re: Runtime Memory Validation in Intel-TDX and AMD-SNP
Date: Wed, 21 Jul 2021 13:22:56 +0300
Message-ID: <YPf1gNs1OoyS6dUt@kernel.org>
In-Reply-To: <20210721100206.mfldptiwiothowpz@box>
On Wed, Jul 21, 2021 at 01:02:06PM +0300, Kirill A. Shutemov wrote:
> On Wed, Jul 21, 2021 at 12:20:17PM +0300, Mike Rapoport wrote:
> > On Tue, Jul 20, 2021 at 08:30:04PM +0300, Kirill A. Shutemov wrote:
> > > On Mon, Jul 19, 2021 at 02:58:22PM +0200, Joerg Roedel wrote:
> > > > Hi,
> > > >
> > > > I'd like to get some movement again into the discussion around how to
> > > > implement runtime memory validation for confidential guests and wrote up
> > > > some thoughts on it.
> > > > Below are the results in the form of a proposal I put together. Please let
> > > > me know your thoughts on it and whether it fits everyone's requirements.
> > >
> > > Thanks for bringing it up. I'm working on the topic for Intel TDX. See
> > > comments below.
> > >
> > > >
> > > > Thanks,
> > > >
> > > > Joerg
> > > >
> > > > Proposal for Runtime Memory Validation in Secure Guests on x86
> > > > ==============================================================
> >
> > [ snip ]
> >
> > > > 8. When memory is returned to the memblock or page allocators,
> > > > it is _not_ invalidated. In fact, all memory which is freed
> > > > needs to be valid. If it was marked invalid in the meantime
> > > > (e.g. if the memory was used for DMA buffers), the code
> > > > owning the memory needs to validate it again before freeing
> > > > it.
> > > >
> > > > The benefit of doing memory validation at allocation time is
> > > > that it keeps the exception handler for invalid memory
> > > > simple, because no exceptions of this kind are expected under
> > > > normal operation.
> > >
> > > During early boot I treat unaccepted memory as usable RAM. It only
> > > requires special treatment on memblock_reserve(), which is used for early
> > > memory allocation: unaccepted usable RAM has to be accepted before
> > > reserving.
> >
> > memblock_reserve() is not always used for early allocations and some of the
> > early allocations on x86 don't use memblock at all.
>
> Do you mean any codepath in particular?

I don't have examples handy, but in general there are calls to
e820__range_update() that make memory !RAM, and such memory never gets into
memblock. On the other hand, memblock_reserve() can be called to reserve
memory owned by firmware that may already be accepted.
> > Hooking
> > validation/acceptance to memblock_reserve() should be fine for PoC but I
> > suspect there will be caveats for production.
>
> That's why I do PoC. Will see. So far so good. Maybe it will be visible
> with smaller pre-accepted memory size.

Maybe some of my concerns only apply to systems with BIOSes weirder than
usual and everything will be fine for VMs.

I'd suggest experimenting with "memmap=" to manually assign various e820
types to memory chunks and see if there are any strange effects.

> > > For fine-grained accepting/validation tracking I use PageOffline() flags
> > > (it's encoded into mapcount): before adding an unaccepted page to free
> > > list I set the PageOffline() to indicate that the page has to be accepted
> > > before returning from the page allocator. Currently, we never have
> > > PageOffline() set for pages on free lists, so we won't have confusion with
> > > ballooning or memory hotplug.
> > >
> > > I try to keep pages accepted in 2M or 4M chunks (pageblock_order or
> > > MAX_ORDER). It is a reasonable compromise on speed/latency.
> >
> > Keeping fine-grained accepting/validation information in the memory map
> > means it cannot be reused across reboots/kexec and there should be an
> > additional data structure to carry this information. It could be the same
> > structure that is used by firmware to inform the kernel about usable memory;
> > it just needs to live on after boot and get updates about new (in)validations.
> > Doing those in 2M/4M chunks will help to prevent this structure from
> > exploding.
>
> Yeah, we would need to reconstruct the EFI map somehow. Or we can give
> most of memory back to the host and accept/validate the memory again after
> reboot/kexec. I donno.
>
> > BTW, as Dave mentioned, the deferred struct page init can also take care of
> > the validation.
>
> That was my first thought too and I tried it, just to realize that it is
> not what we want. If we accepted pages on struct page init, it would mean
> making the host allocate all memory assigned to the guest on boot, even if
> the guest actually uses only a small portion of it.
Yep, you are right.
> Also, deferred page init only allows scaling validation across multiple
> CPUs, but doesn't let us get to userspace before we are done with it. See
> wait_for_completion(&pgdat_init_all_done_comp).
True.
--
Sincerely yours,
Mike.