From: David Hildenbrand <david@redhat.com>
To: Yu Zhang <yu.c.zhang@linux.intel.com>
Cc: Sean Christopherson <seanjc@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	Borislav Petkov <bp@alien8.de>, Andy Lutomirski <luto@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Joerg Roedel <jroedel@suse.de>, Andi Kleen <ak@linux.intel.com>,
	David Rientjes <rientjes@google.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	Varad Gautam <varad.gautam@suse.com>,
	Dario Faggioli <dfaggioli@suse.com>,
	x86@kernel.org, linux-mm@kvack.org, linux-coco@lists.linux.dev,
	"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
	"Kirill A . Shutemov" <kirill@shutemov.name>,
	Kuppuswamy Sathyanarayanan 
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	Dave Hansen <dave.hansen@intel.com>
Subject: Re: [RFC] KVM: mm: fd-based approach for supporting KVM guest private memory
Date: Tue, 31 Aug 2021 21:08:33 +0200
Message-ID: <243bc6a3-b43b-cd18-9cbb-1f42a5de802f@redhat.com>
In-Reply-To: <20210827023150.jotwvom7mlsawjh4@linux.intel.com>

On 27.08.21 04:31, Yu Zhang wrote:
> On Thu, Aug 26, 2021 at 12:15:48PM +0200, David Hildenbrand wrote:
>> On 24.08.21 02:52, Sean Christopherson wrote:
>>> The goal of this RFC is to try and align KVM, mm, and anyone else with skin in the
>>> game, on an acceptable direction for supporting guest private memory, e.g. for
>>> Intel's TDX.  The TDX architecture effectively allows KVM guests to crash the
>>> host if guest private memory is accessible to host userspace, and thus does not
>>> play nice with KVM's existing approach of pulling the pfn and mapping level from
>>> the host page tables.
>>>
>>> This is by no means a complete patch; it's a rough sketch of the KVM changes that
>>> would be needed.  The kernel side of things is completely omitted from the patch;
>>> the design concept is below.
>>>
>>> There's also a fair bit of hand waving on implementation details that shouldn't
>>> fundamentally change the overall ABI, e.g. how the backing store will ensure
>>> there are no mappings when "converting" to guest private.
>>>
>>
>> This is a lot of complexity and rather advanced approaches (not saying they
>> are bad, just that we try to teach the whole stack something completely
>> new).
>>
>>
>> What I think would really help is a list of requirements, such that
>> everybody is aware of what we actually want to achieve. Let me start:
>>
>> GFN: Guest Frame Number
>> EPFN: Encrypted Physical Frame Number
>>
>>
>> 1) An EPFN must not get mapped into more than one VM: it belongs to exactly
>> one VM. It must be shared neither between VMs in different processes nor
>> between VMs within a single process.
>>
>>
>> 2) User space (well, and actually the kernel) must never access an EPFN:
>>
>> - If we go for an fd, essentially all operations (read/write) have to
>>    fail.
>> - If we have to map an EPFN into user space page tables (e.g., to
>>    simplify KVM), we could only allow fake swap entries such that "there
>>    is something" but it cannot be accessed and is flagged accordingly.
>> - /proc/kcore and friends have to be careful as well and should not read
>>    this memory. So there has to be a way to flag these pages.
>>
>> 3) We need a way to express the GFN<->EPFN mapping and essentially assign an
>> EPFN to a GFN.
>>
>>
>> 4) Once we have assigned an EPFN to a GFN, that assignment must no longer change.
>> Further, an EPFN must not get assigned to multiple GFNs.
>>
>>
>> 5) There has to be a way to "replace" encrypted parts by "shared" parts
>>     and the other way around.
>>
>> What else?
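
To make requirements 1, 3 and 4 concrete, here is a toy bookkeeping model.
It is only a sketch: `EpfnRegistry`, `PrivateMemoryError`, and the integer
`vm_id` are hypothetical names made up for illustration, not anything from
KVM or the RFC patch. Requirements 2 (no host access) and 5 (conversion
between encrypted and shared) are deliberately left out.

```python
class PrivateMemoryError(Exception):
    """Raised when an operation would violate one of the listed requirements."""


class EpfnRegistry:
    """Toy bookkeeping for requirements 1, 3 and 4 (hypothetical, not KVM code)."""

    def __init__(self):
        self._owner = {}   # epfn -> (vm_id, gfn)
        self._by_gfn = {}  # (vm_id, gfn) -> epfn

    def assign(self, vm_id, gfn, epfn):
        # Requirements 1 and 4: an EPFN belongs to exactly one VM and
        # must not be assigned to more than one GFN.
        if epfn in self._owner:
            raise PrivateMemoryError(f"EPFN {epfn:#x} is already assigned")
        # Requirement 4: an existing GFN assignment must no longer change.
        if (vm_id, gfn) in self._by_gfn:
            raise PrivateMemoryError(f"GFN {gfn:#x} already has an EPFN")
        self._owner[epfn] = (vm_id, gfn)
        self._by_gfn[(vm_id, gfn)] = epfn

    def lookup(self, vm_id, gfn):
        # Requirement 3: express the GFN<->EPFN mapping.
        # Returns None when the GFN has no EPFN assigned.
        return self._by_gfn.get((vm_id, gfn))
```

In this model, a second `assign()` of the same EPFN fails whether it comes
from another VM or from another GFN in the same VM, and re-assigning an
already populated GFN fails as well.
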
> 
> Thanks a lot for this summary. A question about the requirements: do we or
> do we not plan to support assigning devices to the protected VM?

Good question. I assume that is something for the far, far future.

> 
> If yes, the fd-based solution may need to change the VFIO interface as well
> (though the fake swap entry solution needs to mess with VFIO too), because:
> 
> 1> KVM uses VFIO when assigning devices into a VM.
> 
> 2> Not knowing which GPA ranges may be used by the VM as DMA buffers, all
> guest pages will have to be mapped in the host IOMMU page table to host pages,
> which are pinned during the whole life cycle of the VM.
> 
> 3> IOMMU mapping is done at VM creation time by VFIO and the IOMMU driver,
> in vfio_dma_do_map().
> 
> 4> However, vfio_dma_do_map() needs the HVA to perform a GUP to get the HPA
> and pin the page.
> 
> But if we are using the fd-based solution, not every GPA can have an HVA, so
> the current VFIO interface to map and pin the GPA (IOVA) won't work. And I
> doubt that VFIO can be modified to support this easily.

I fully agree. Maybe Intel folks have some idea of what that's supposed to
look like in the future.
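
The dependency described in steps 1-4 can be sketched as a toy model. This
assumes a much simplified picture: `ToyHost`, `dma_map()`, and the flat
dictionaries are hypothetical stand-ins, not the real vfio_dma_do_map()
interface. The only point is that the map-and-pin step takes an HVA as
input, so a page that exists only behind an fd has nothing to pass in.

```python
class ToyHost:
    """Hypothetical stand-in for the HVA -> GUP -> pin -> IOMMU map path."""

    def __init__(self):
        self.host_page_table = {}  # hva -> hpa, what a GUP-style lookup walks
        self.iommu = {}            # iova -> hpa
        self.pinned = set()        # hpas pinned for the lifetime of the VM

    def dma_map(self, iova, hva):
        # The caller has to supply an HVA; there is no other way in.
        if hva is None:
            raise ValueError("no HVA: cannot look up and pin the backing page")
        hpa = self.host_page_table[hva]  # "GUP": resolve the HVA to a host page
        self.pinned.add(hpa)             # pin it
        self.iommu[iova] = hpa           # install the IOVA -> HPA translation
        return hpa


host = ToyHost()

# Shared page: mapped into host userspace, so it has an HVA and can be pinned.
host.host_page_table[0x7F0000000000] = 0x1000
host.dma_map(iova=0x0, hva=0x7F0000000000)

# Private page in the fd-based scheme: no userspace mapping, hence no HVA.
try:
    host.dma_map(iova=0x1000, hva=None)
except ValueError as err:
    failure = str(err)
```

Supporting private pages in this model would need a second entry point
keyed by something other than an HVA (for example an fd and offset), which
is the kind of interface change being questioned above.
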

-- 
Thanks,

David / dhildenb

