linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Dave Hansen <dave.hansen@intel.com>,
	"Kirill A. Shutemov" <kirill@shutemov.name>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Sean Christopherson <seanjc@google.com>,
	Jim Mattson <jmattson@google.com>
Cc: David Rientjes <rientjes@google.com>,
	"Edgecombe, Rick P" <rick.p.edgecombe@intel.com>,
	"Kleen, Andi" <andi.kleen@intel.com>,
	"Yamahata, Isaku" <isaku.yamahata@intel.com>,
	x86@kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [RFCv1 7/7] KVM: unmap guest memory using poisoned pages
Date: Tue, 6 Apr 2021 16:57:46 +0200	[thread overview]
Message-ID: <d94d3042-098a-8df7-9ef6-b869851a4134@redhat.com> (raw)
In-Reply-To: <52518f09-7350-ebe9-7ddb-29095cd3a4d9@intel.com>

On 06.04.21 16:33, Dave Hansen wrote:
> On 4/6/21 12:44 AM, David Hildenbrand wrote:
>> On 02.04.21 17:26, Kirill A. Shutemov wrote:
>>> TDX architecture aims to provide resiliency against confidentiality and
>>> integrity attacks. Towards this goal, the TDX architecture helps enforce
>>> the enabling of memory integrity for all TD-private memory.
>>>
>>> The CPU memory controller computes the integrity check value (MAC) for
>>> the data (cache line) during writes, and it stores the MAC with the
>>> memory as meta-data. A 28-bit MAC is stored in the ECC bits.
>>>
>>> Checking of memory integrity is performed during memory reads. If
>>> integrity check fails, CPU poisones cache line.
>>>
>>> On a subsequent consumption (read) of the poisoned data by software,
>>> there are two possible scenarios:
>>>
>>>    - Core determines that the execution can continue and it treats
>>>      poison with exception semantics signaled as a #MCE
>>>
>>>    - Core determines execution cannot continue,and it does an unbreakable
>>>      shutdown
>>>
>>> For more details, see Chapter 14 of Intel TDX Module EAS[1]
>>>
>>> As some of integrity check failures may lead to system shutdown host
>>> kernel must not allow any writes to TD-private memory. This requirment
>>> clashes with KVM design: KVM expects the guest memory to be mapped into
>>> host userspace (e.g. QEMU).
>>
>> So what you are saying is that if QEMU would write to such memory, it
>> could crash the kernel? What a broken design.
> 
> IMNHO, the broken design is mapping the memory to userspace in the first
> place.  Why the heck would you actually expose something with the MMU to
> a context that can't possibly meaningfully access or safely write to it?

I'd say the broken design is being able to crash the machine via a 
simple memory write, instead of only crashing a single process in case 
you're doing something nasty. From the evaluation of the problem it 
feels like this was a CPU design workaround: instead of properly 
cleaning up when it gets tricky within the core, just crash the machine. 
And that's a CPU "feature", not a kernel "feature". Now we have to fix 
broken HW in the kernel - once again.

However, you raise a valid point: it does not make too much sense to to 
map this into user space. Not arguing against that; but crashing the 
machine is just plain ugly.

I wonder: why do we even *want* a VMA/mmap describing that memory? 
Sounds like: for hacking support for that memory type into QEMU/KVM.

This all feels wrong, but I cannot really tell how it could be better. 
That memory can really only be used (right now?) with hardware 
virtualization from some point on. From that point on (right from the 
start?), there should be no VMA/mmap/page tables for user space anymore.

Or am I missing something? Is there still valid user space access?

> 
> This started with SEV.  QEMU creates normal memory mappings with the SEV
> C-bit (encryption) disabled.  The kernel plumbs those into NPT, but when
> those are instantiated, they have the C-bit set.  So, we have mismatched
> mappings.  Where does that lead?  The two mappings not only differ in
> the encryption bit, causing one side to read gibberish if the other
> writes: they're not even cache coherent.
> 
> That's the situation *TODAY*, even ignoring TDX.
> 
> BTW, I'm pretty sure I know the answer to the "why would you expose this
> to userspace" question: it's what QEMU/KVM did alreadhy for
> non-encrypted memory, so this was the quickest way to get SEV working.
> 

Yes, I guess so. It was the fastest way to "hack" it into QEMU.

Would we ever even want a VMA/mmap/process page tables for that memory? 
How could user space ever do something *not so nasty* with that memory 
(in the current context of VMs)?

-- 
Thanks,

David / dhildenb



  reply	other threads:[~2021-04-06 14:58 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-02 15:26 [RFCv1 0/7] TDX and guest memory unmapping Kirill A. Shutemov
2021-04-02 15:26 ` [RFCv1 1/7] x86/mm: Move force_dma_unencrypted() to common code Kirill A. Shutemov
2021-04-02 15:26 ` [RFCv1 2/7] x86/kvm: Introduce KVM memory protection feature Kirill A. Shutemov
2021-04-08  9:52   ` Borislav Petkov
2021-04-09 13:36     ` Kirill A. Shutemov
2021-04-09 14:37       ` Borislav Petkov
2021-04-02 15:26 ` [RFCv1 3/7] x86/kvm: Make DMA pages shared Kirill A. Shutemov
2021-04-02 15:26 ` [RFCv1 4/7] x86/kvm: Use bounce buffers for KVM memory protection Kirill A. Shutemov
2021-04-02 15:26 ` [RFCv1 5/7] x86/kvmclock: Share hvclock memory with the host Kirill A. Shutemov
2021-04-02 15:26 ` [RFCv1 6/7] x86/realmode: Share trampoline area if KVM memory protection enabled Kirill A. Shutemov
2021-04-02 15:26 ` [RFCv1 7/7] KVM: unmap guest memory using poisoned pages Kirill A. Shutemov
2021-04-06  7:44   ` David Hildenbrand
2021-04-06 10:50     ` Kirill A. Shutemov
2021-04-06 14:33     ` Dave Hansen
2021-04-06 14:57       ` David Hildenbrand [this message]
2021-04-07 13:16         ` Kirill A. Shutemov
2021-04-07 13:31           ` Christophe de Dinechin
2021-04-07 14:09             ` Andi Kleen
2021-04-07 14:09           ` David Hildenbrand
2021-04-07 14:36             ` Kirill A. Shutemov
2021-04-06 17:52       ` Tom Lendacky
2021-04-07 14:55   ` David Hildenbrand
2021-04-07 15:10     ` Andi Kleen
2021-04-09 13:33     ` Kirill A. Shutemov
2021-04-09 13:50       ` David Hildenbrand
2021-04-09 14:12         ` Kirill A. Shutemov
2021-04-09 14:18           ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d94d3042-098a-8df7-9ef6-b869851a4134@redhat.com \
    --to=david@redhat.com \
    --cc=andi.kleen@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=isaku.yamahata@intel.com \
    --cc=jmattson@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=rientjes@google.com \
    --cc=seanjc@google.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).