From: "Andy Lutomirski" <luto@kernel.org>
To: "Erdem Aktas" <erdemaktas@google.com>
Cc: "Joerg Roedel" <jroedel@suse.de>,
	"David Rientjes" <rientjes@google.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"Sean Christopherson" <seanjc@google.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	"Andi Kleen" <ak@linux.intel.com>,
	"Brijesh Singh" <brijesh.singh@amd.com>,
	"Tom Lendacky" <thomas.lendacky@amd.com>,
	"Jon Grimm" <jon.grimm@amd.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Peter Zijlstra (Intel)" <peterz@infradead.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Kaplan, David" <David.Kaplan@amd.com>,
	"Varad Gautam" <varad.gautam@suse.com>,
	"Dario Faggioli" <dfaggioli@suse.com>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	linux-mm@kvack.org, linux-coco@lists.linux.dev
Subject: Re: Runtime Memory Validation in Intel-TDX and AMD-SNP
Date: Mon, 19 Jul 2021 20:30:15 -0700
Message-ID: <eacb9c1f-2c61-4a7f-b5a3-7bf579e6cbf6@www.fastmail.com>
In-Reply-To: <CAAYXXYwFzrf8uY-PFkMRSG28+HztfGdJft8kB3Y3keWCx9K8TQ@mail.gmail.com>

On Mon, Jul 19, 2021, at 6:51 PM, Erdem Aktas wrote:
> With the new UEFI memory type, option 2 seems like a better option to me.
> 
> I was thinking that, since the new UEFI memory type is not supported yet, option 3 could be implemented as a temporary solution. IMO, this is crucial for reasonable boot performance.
> 
> > There's one exception to this, which is the previous memory view in
> > crash kernels. But that's a relatively obscure case and there might be
> > other solutions for this.
> 
> I think this is an important angle. It might cause reliability issues: if the kexec kernel does not know which pages are shared or private, it can use a previously shared page as a code page, which will not work. It is also a security concern. Hosts can always cause crashes, which forces guests to kexec for a crash dump. If the kexec kernel does not know which pages were validated before, it might be compromised by page replay attacks.

What’s the attack you have in mind?  With TDX, the guest using the wrong shared vs secure type should, at worst, cause crashes.  With SEV, I can imagine it’s possible for a guest to read or write the ciphertext of a private page, but actually turning that into an attack seems like it would require convincing a guest to use the same page with both modes.

> 
> Also, kexec is not only for crash dumps. For warm resets, the kexec kernel needs to know the valid page map.
> 
> >> Also in general I don't think it will really happen, at least initially.
> >> All the shared buffers we use are allocated and never freed. So such a
> >> problem could be deferred.
> 
> Does it not depend on kernel configs? Currently, there is a valid control path in dma_alloc_coherent which might allocate and free shared pages.
> 
> >> At the risk of asking a potentially silly question, would it be
> >> reasonable to treat non-validated memory as not-present for kernel
> >> purposes and hot-add it in a thread as it gets validated? 
> 
> My concern with this is that it assumes all present memory is private. UEFI might have some pages which are shared and therefore also present.

Why is this a problem?  In TDX, I don’t think shared pages need any sort of validation. The private memory needs acceptance, but only DoS should be possible by getting it wrong. If EFI passes in a messy map with shared and private transitions all over, there will be a lot of extents in the map, but what actually goes wrong?

> -Erdem
> 
> On Mon, Jul 19, 2021 at 5:26 PM Andy Lutomirski <luto@kernel.org> wrote:
>> On 7/19/21 5:58 AM, Joerg Roedel wrote:
>> 
>> > Memory Validation through the Boot Process and in the Running System
>> > --------------------------------------------------------------------
>> > 
>> > The memory is validated throughout the boot process as described below.
>> > These steps assume a firmware is present, but this proposal does not
>> > strictly require a firmware. The tasks done by the firmware can also be
>> > done by the hypervisor before starting the guest. The steps are:
>> > 
>> >       1. The firmware validates all memory which will not be owned by
>> >          the boot loader or the OS.
>> > 
>> >       2. The firmware also validates the first X MB of memory, just
>> >          enough to run a boot loader and to load the compressed Linux
>> >          kernel image. X is not expected to be very large, 64 or 128
>> >          MB should be enough. This pre-validation should not cause
>> >          significant delays in the boot process.
>> > 
>> >       3. The validated memory is marked E820-Usable in struct
>> >          boot_params for the Linux decompressor. The rest of the
>> >          memory is also passed to Linux via new special E820 entries
>> >          which mark the memory as Usable-but-Invalid.
>> > 
>> >       4. When the Linux decompressor takes over control, it evaluates
>> >          the E820 table and calculates the total amount of memory
>> >          available to Linux (valid and invalid memory).
>> > 
>> >          The decompressor allocates a physically contiguous data
>> >          structure at a random memory location which is big enough to
>> >          hold the validation states of all 4KB pages available to
>> >          the guest. This data structure will be called the Validation
>> >          Bitmap through the rest of this document. The Validation
>> >          Bitmap is indexed by page frame numbers. 
>> 
>> At the risk of asking a potentially silly question, would it be
>> reasonable to treat non-validated memory as not-present for kernel
>> purposes and hot-add it in a thread as it gets validated?  Or would this
>> result in poor system behavior before enough memory is validated?
>> Perhaps we should block instead of failing allocations if we want more
>> memory than is currently validated?
>> 
>> --Andy
>> 
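
For reference, the Validation Bitmap from step 4 of the quoted proposal could look roughly like the user-space C sketch below. The struct and function names (validation_bitmap, vbm_init, vbm_set_range, vbm_test) are made up for illustration and come from neither the proposal nor the kernel. The sketch assumes one bit per 4 KiB page frame, and it uses calloc() where the decompressor would have to pick a physically contiguous region itself.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT 12	/* 4 KiB pages, as in the proposal */

/* Hypothetical stand-in for the proposed bitmap: one bit per page frame. */
struct validation_bitmap {
	uint64_t nr_pages;	/* number of page frames tracked */
	uint8_t *bits;		/* bit set => page has been validated */
};

/* Size the bitmap for mem_bytes of guest memory; all pages start invalid. */
static inline int vbm_init(struct validation_bitmap *vbm, uint64_t mem_bytes)
{
	vbm->nr_pages = mem_bytes >> PAGE_SHIFT;
	vbm->bits = calloc((vbm->nr_pages + 7) / 8, 1);
	return vbm->bits ? 0 : -1;
}

/* Mark nr page frames starting at start_pfn as validated. */
static inline void vbm_set_range(struct validation_bitmap *vbm,
				 uint64_t start_pfn, uint64_t nr)
{
	uint64_t pfn;

	for (pfn = start_pfn; pfn < start_pfn + nr && pfn < vbm->nr_pages; pfn++)
		vbm->bits[pfn / 8] |= (uint8_t)(1u << (pfn % 8));
}

/* Look up the validation state of a page frame number. */
static inline bool vbm_test(const struct validation_bitmap *vbm, uint64_t pfn)
{
	if (pfn >= vbm->nr_pages)
		return false;	/* outside tracked memory: treat as invalid */
	return (vbm->bits[pfn / 8] >> (pfn % 8)) & 1;
}
```

With a 1 GiB guest the bitmap tracks 262144 page frames in 32 KiB, and a firmware that pre-validated the first 128 MB would correspond to vbm_set_range(&vbm, 0, 32768). How such a bitmap would be handed to a kexec kernel is not covered here.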
