linux-kernel.vger.kernel.org archive mirror
From: "Huang, Kai" <kai.huang@intel.com>
To: "kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"mhkelley58@gmail.com" <mhkelley58@gmail.com>,
	"Cui, Dexuan" <decui@microsoft.com>,
	"jpiotrowski@linux.microsoft.com"
	<jpiotrowski@linux.microsoft.com>
Cc: "cascardo@canonical.com" <cascardo@canonical.com>,
	"tim.gardner@canonical.com" <tim.gardner@canonical.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"thomas.lendacky@amd.com" <thomas.lendacky@amd.com>,
	"roxana.nicolescu@canonical.com" <roxana.nicolescu@canonical.com>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>,
	"haiyangz@microsoft.com" <haiyangz@microsoft.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"stefan.bader@canonical.com" <stefan.bader@canonical.com>,
	"nik.borisov@suse.com" <nik.borisov@suse.com>,
	"kys@microsoft.com" <kys@microsoft.com>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	"sashal@kernel.org" <sashal@kernel.org>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"bp@alien8.de" <bp@alien8.de>, "x86@kernel.org" <x86@kernel.org>
Subject: Re: [PATCH v1 1/3] x86/tdx: Check for TDX partitioning during early TDX init
Date: Tue, 5 Dec 2023 13:26:04 +0000
Message-ID: <7b725783f1f9102c176737667bfec12f75099961.camel@intel.com>
In-Reply-To: <02e079e8-cc72-49d8-9191-8a753526eb18@linux.microsoft.com>

> 
> > > > 
> > > > Hm. Okay.
> > > > 
> > > > Can we take a step back? What is bigger picture here? What enlightenment
> > > > do you expect from the guest when everything is in-place?
> > > > 
> > > 
> > > All the functional enlightenments are already in place in the kernel and
> > > everything works (correct me if I'm wrong Dexuan/Michael). The enlightenments
> > > are that TDX VMCALLs are needed for MSR manipulation and vmbus operations,
> > > encrypted bit needs to be manipulated in the page tables and page
> > > visibility propagated to VMM.
> > 
> > Not quite familiar with hyperv enlightenments, but are these enlightenments TDX
> > guest specific?  Because if they are not, then they should be able to be
> > emulated by the normal hyperv, thus the hyperv as L1 (which is TDX guest) can
> > emulate them w/o letting the L2 know the hypervisor it runs on is actually a TDX
> > guest.
> 
> I would say that these hyperv enlightenments are confidential guest specific
> (TDX/SNP) when running with TD-partitioning/VMPL. In both cases there are TDX/SNP
> specific ways to exit directly to L0 (when needed) and native privileged instructions
> trap to the paravisor.
> 
> L1 is not hyperv and no one wants to emulate the I/O path. The L2 guest knows that
> it's confidential so that it can explicitly use swiotlb, toggle page visibility
> and notify the host (L0) on the I/O path without incurring additional emulation
> overhead.
> 
> > 
> > Btw, even if there's performance concern here, as you mentioned the TDVMCALL is
> > actually made to the L0 which means L0 must be aware such VMCALL is from L2 and
> > needs to be injected to L1 to handle, which IMHO not only complicates the L0 but
> > also may not have any performance benefits.
> 
> The TDVMCALLs are related to the I/O path (networking/block io) into the L2 guest, and
> so they intentionally go straight to L0 and are never injected to L1. L1 is not
> involved in that path at all.
> 
> Using something different than TDVMCALLs here would lead to additional traps to L1 and
> just add latency/complexity.

It looks like you assume by default that TDX partitioning is used as "paravisor
L1" + "L0 device I/O emulation".

I think we are lacking background on this usage model and how it works.  For
instance, typically L2 is created by L1, and L1 is responsible for L2's device
I/O emulation.  I don't quite understand how L0 could emulate L2's device I/O.

Can you provide more information?

> 
> > 
> > > 
> > > What's missing is the tdx_guest flag is not exposed to userspace in /proc/cpuinfo,
> > > and as a result dmesg does not currently display:
> > > "Memory Encryption Features active: Intel TDX".
> > > 
> > > That's what I set out to correct.
> > > 
> > > > So far I see that you try to get the kernel to think that it runs as TDX guest,
> > > > but not really. This is not very convincing model.
> > > > 
> > > 
> > > No that's not accurate at all. The kernel is running as a TDX guest so I
> > > want the kernel to know that. 
> > > 
> > 
> > But it isn't.  It runs on a hypervisor which is a TDX guest, but this doesn't
> > make itself a TDX guest.
> 
> That depends on your definition of "TDX guest". The TDX 1.5 TD partitioning spec
> talks of TDX-enlightened L1 VMM, (optionally) TDX-enlightened L2 VM and Unmodified
> Legacy L2 VM. Here we're dealing with a TDX-enlightened L2 VM.
> 
> If a guest runs inside an Intel TDX protected TD, is aware of memory encryption and
> issues TDVMCALLs - to me that makes it a TDX guest.

The thing I don't quite understand is which enlightenment(s) require L2 to issue
TDVMCALLs and to know the "encryption bit".

The reason that I can think of is:

If device I/O emulation of L2 is done by L0 then I guess it's reasonable to make
L2 aware of the "encryption bit", because L0 can only write emulated data to a
shared buffer.  The shared buffer must first be converted by L2 using the
MAP_GPA TDVMCALL to L0 (to zap private pages in S-EPT etc), and L2 needs to know
the "encryption bit" to set up its page table properly.  L1 must be aware of
such private <-> shared conversions too, to set up its page table properly, so
L1 must also be notified.
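
To make the "encryption bit" part concrete, here is a rough sketch of the
address arithmetic only.  It assumes L2 sees the same convention as a regular
TD, where the shared bit is the top bit of the guest physical address
(GPA width 48 or 52); whether that holds for an enlightened L2 is exactly what
is being discussed.  The helper names are hypothetical and not kernel APIs, and
the MAP_GPA TDVMCALL itself and the notification to L1 are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only.  The one fact relied on is
 * that the TDX shared bit is GPA bit (gpa_width - 1).
 */

/* Mask of the shared bit for a guest with the given GPA width. */
static inline uint64_t shared_mask(unsigned int gpa_width)
{
	assert(gpa_width == 48 || gpa_width == 52);
	return UINT64_C(1) << (gpa_width - 1);
}

/* Address the guest would map (and pass to MAP_GPA) for a shared page. */
static inline uint64_t gpa_mk_shared(uint64_t gpa, unsigned int gpa_width)
{
	return gpa | shared_mask(gpa_width);
}

/* Address of the same page once converted back to private. */
static inline uint64_t gpa_mk_private(uint64_t gpa, unsigned int gpa_width)
{
	return gpa & ~shared_mask(gpa_width);
}

/* True if the address has the shared bit set. */
static inline bool gpa_is_shared(uint64_t gpa, unsigned int gpa_width)
{
	return (gpa & shared_mask(gpa_width)) != 0;
}
```

With a 48-bit GPA width, gpa_mk_shared(0x1000, 48) sets bit 47, and
gpa_mk_private() on the result gives back 0x1000.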

The concern I have is whether there are other usage models that we need to
consider.  For instance, running both unmodified L2 and enlightened L2.  Or some
L2 that only needs the TDVMCALL enlightenment but not the "encryption bit".

In other words, this seems largely specific to the L1 hypervisor/paravisor
implementation.  I am wondering whether we can confine the enlightenment logic
entirely to hypervisor/paravisor specific code, rather than generically marking
L2 as a TDX guest and then still having to disable things like TDCALL.

Hope we are getting closer to being on the same page.

