From: Kai Huang <kai.huang@intel.com>
To: x86@kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@intel.com, luto@kernel.org, kvm@vger.kernel.org,
	pbonzini@redhat.com, seanjc@google.com, hpa@zytor.com,
	peterz@infradead.org, kirill.shutemov@linux.intel.com,
	sathyanarayanan.kuppuswamy@linux.intel.com, tony.luck@intel.com,
	ak@linux.intel.com, dan.j.williams@intel.com,
	chang.seok.bae@intel.com, keescook@chromium.org,
	hengqi.arch@bytedance.com, laijs@linux.alibaba.com,
	metze@samba.org, linux-kernel@vger.kernel.org,
	kai.huang@intel.com
Subject: [RFC PATCH 19/21] x86: Flush cache of TDX private memory during kexec()
Date: Mon, 28 Feb 2022 15:13:07 +1300
Message-ID: <64bb89cf1108e85057f4b426406fbb5ec5172273.1646007267.git.kai.huang@intel.com>
In-Reply-To: <cover.1646007267.git.kai.huang@intel.com>

If TDX has ever been enabled or used to run TD guests, the cachelines
of TDX private memory used by the TDX module, including PAMTs, must be
flushed before transitioning to the new kernel.  Otherwise they may
silently corrupt the new kernel.

The TDX module can be initialized only once during its lifetime.  TDX
has no interface to reset the module to an uninitialized state so that
it could be initialized again.  If the old kernel has enabled TDX, the
new kernel won't be able to use TDX again.  Therefore, ideally the old
kernel should shut down the TDX module if it was ever initialized, so
that no further SEAMCALLs can be made to it.

However, SEAMCALL requires the CPU to be in VMX operation (VMXON has
been done).  Currently only KVM handles VMXON, and when KVM is unloaded
all CPUs leave VMX operation.  Theoretically, there is no guarantee
during kexec() that all CPUs are in VMX operation.  Adding VMXON
handling to the core kernel isn't trivial, so this implementation
depends on the caller of TDX to guarantee that CPUs are in VMX
operation.  This makes it hard to shut down the TDX module during
kexec().  Therefore this implementation doesn't shut down the TDX
module; it only flushes the cache and leaves the TDX module open.

Leaving the module open is fine.  If the new kernel wants to use TDX,
it has to go through the initialization process, which will fail at the
first SEAMCALL because the TDX module is not in the uninitialized
state.  If the new kernel doesn't want to use TDX, the TDX module won't
run at all.

Following the implementation of SME support, use wbinvd() to flush the
cache in stop_this_cpu().  Introduce a new function, platform_has_tdx(),
which only checks whether the platform is TDX-capable, and do wbinvd()
when it returns true.  platform_has_tdx() returns true when SEAMRR is
enabled and there are enough TDX private KeyIDs to run at least one TD
guest (both are detected at boot time).  TDX is enabled on demand at
runtime, and its state machine uses a mutex to prevent multiple callers
from initializing TDX in parallel.  Getting the TDX module state
requires holding the mutex, but stop_this_cpu() runs in interrupt
context, so just check whether the platform supports TDX and flush the
cache.

Signed-off-by: Kai Huang <kai.huang@intel.com>
---
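
Not part of the patch: a minimal userspace sketch of the flush decision
described in the commit message, for readers who want to see the
predicate in isolation.  struct platform_caps and the model_* functions
are hypothetical names introduced only for illustration.  The three
booleans stand in for boot_cpu_has(X86_FEATURE_SME), seamrr_enabled()
and tdx_keyid_sufficient(), whose real implementations live elsewhere
in this series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for boot-time detection results; not kernel code. */
struct platform_caps {
	bool sme;              /* models boot_cpu_has(X86_FEATURE_SME) */
	bool seamrr_enabled;   /* models seamrr_enabled() */
	bool keyid_sufficient; /* models tdx_keyid_sufficient() */
};

/* Models platform_has_tdx(): both boot-time conditions must hold. */
bool model_platform_has_tdx(const struct platform_caps *c)
{
	return c->seamrr_enabled && c->keyid_sufficient;
}

/*
 * Models the condition guarding native_wbinvd() in stop_this_cpu():
 * flush when either SME or TDX may have left dirty cachelines behind.
 * Note that no runtime TDX state is consulted, only boot-time facts.
 */
bool model_needs_wbinvd(const struct platform_caps *c)
{
	return c->sme || model_platform_has_tdx(c);
}
```

The point of the sketch is that the predicate depends only on values
fixed at boot, so it can be evaluated in interrupt context without
taking the mutex that protects the TDX state machine.  The cost is a
possibly unnecessary flush on TDX-capable platforms that never enabled
TDX.
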
 arch/x86/include/asm/tdx.h |  2 ++
 arch/x86/kernel/process.c  | 26 +++++++++++++++++++++++++-
 arch/x86/virt/vmx/tdx.c    | 14 ++++++++++++++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index b526d41c4bbf..24f2b7e8b280 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -85,10 +85,12 @@ static inline long tdx_kvm_hypercall(unsigned int nr, unsigned long p1,
 void tdx_detect_cpu(struct cpuinfo_x86 *c);
 int tdx_detect(void);
 int tdx_init(void);
+bool platform_has_tdx(void);
 #else
 static inline void tdx_detect_cpu(struct cpuinfo_x86 *c) { }
 static inline int tdx_detect(void) { return -ENODEV; }
 static inline int tdx_init(void) { return -ENODEV; }
+static inline bool platform_has_tdx(void) { return false; }
 #endif /* CONFIG_INTEL_TDX_HOST */
 
 #endif /* !__ASSEMBLY__ */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 71aa12082370..70eea43d1f32 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -766,8 +766,32 @@ void stop_this_cpu(void *dummy)
 	 * without the encryption bit, they don't race each other when flushed
 	 * and potentially end up with the wrong entry being committed to
 	 * memory.
+	 *
+	 * In case of kexec, similar to SME, if TDX is ever enabled, the
+	 * cachelines of TDX private memory (including PAMTs) used by TDX
+	 * module need to be flushed before transiting to the new kernel,
+	 * otherwise they may silently corrupt the new kernel.
+	 *
+	 * Note TDX is enabled on demand at runtime, and enabling TDX has a
+	 * state machine protected with a mutex to prevent concurrent calls
+	 * from multiple callers.  Holding the mutex is required to get the
+	 * TDX enabling status, but this function runs in interrupt context.
+	 * So to make it simple, always flush cache when platform supports
+	 * TDX (detected at boot time), regardless of whether TDX is truly
+	 * enabled by kernel.
+	 *
+	 * TDX module can only be initialized once during its lifetime. So
+	 * if TDX is enabled in old kernel, the new kernel won't be able to
+	 * use TDX again, because when new kernel goes through the TDX module
+	 * initialization process, it will fail immediately at the first
+	 * SEAMCALL.  Ideally, it's better to shut down TDX module, but this
+	 * requires SEAMCALL, which requires CPU already being in VMX
+	 * operation.  It's not trivial to do VMXON here so to keep it simple
+	 * just leave the module open.  And leaving TDX module open is OK.
+	 * The new kernel cannot use TDX anyway.  The TDX module won't run
+	 * at all in the new kernel.
 	 */
-	if (boot_cpu_has(X86_FEATURE_SME))
+	if (boot_cpu_has(X86_FEATURE_SME) || platform_has_tdx())
 		native_wbinvd();
 	for (;;) {
 		/*
diff --git a/arch/x86/virt/vmx/tdx.c b/arch/x86/virt/vmx/tdx.c
index 2760c10a430a..f704fddc9dfc 100644
--- a/arch/x86/virt/vmx/tdx.c
+++ b/arch/x86/virt/vmx/tdx.c
@@ -1602,3 +1602,17 @@ int tdx_init(void)
 	return ret;
 }
 EXPORT_SYMBOL_GPL(tdx_init);
+
+/**
+ * platform_has_tdx - Whether platform supports TDX
+ *
+ * Check whether platform supports TDX (i.e. TDX is enabled in BIOS),
+ * regardless of whether TDX is truly enabled by kernel.
+ *
+ * Return true if SEAMRR is enabled, and there are sufficient TDX private
+ * KeyIDs to run TD guests.
+ */
+bool platform_has_tdx(void)
+{
+	return seamrr_enabled() && tdx_keyid_sufficient();
+}
-- 
2.33.1
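
Not part of the patch: a toy userspace model of why leaving the TDX
module open across kexec() is harmless, as argued in the commit
message.  Every name below (model_tdx_module, model_tdx_init,
model_kexec) is hypothetical, and the two-state enum is a deliberate
simplification of whatever states the real module tracks.  The only
property modeled is that initialization succeeds solely from the
uninitialized state and that kexec() does not reset that state.

```c
#include <assert.h>

/* Simplified, assumed module states; the real module has more. */
enum model_tdx_state {
	MODEL_TDX_UNINITIALIZED,
	MODEL_TDX_INITIALIZED,
};

struct model_tdx_module {
	enum model_tdx_state state;
};

/*
 * Models the first initialization SEAMCALL: returns 0 on success and
 * -1 when the module is not in the uninitialized state.
 */
int model_tdx_init(struct model_tdx_module *m)
{
	if (m->state != MODEL_TDX_UNINITIALIZED)
		return -1;
	m->state = MODEL_TDX_INITIALIZED;
	return 0;
}

/*
 * Models kexec() as seen by the module: caches are flushed by the old
 * kernel, but the module is not shut down or reset, so its state
 * carries over unchanged into the new kernel.
 */
void model_kexec(struct model_tdx_module *m)
{
	(void)m; /* intentionally no state change */
}
```

In this model a second model_tdx_init() after model_kexec() fails
immediately, which mirrors the claim that the new kernel's
initialization attempt fails at the first SEAMCALL and the module never
runs.
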


Thread overview: 22+ messages
2022-02-28  2:12 [RFC PATCH 00/21] TDX host kernel support Kai Huang
2022-02-28  2:12 ` [RFC PATCH 01/21] x86/virt/tdx: Detect SEAM Kai Huang
2022-02-28  2:12 ` [RFC PATCH 02/21] x86/virt/tdx: Detect TDX private KeyIDs Kai Huang
2022-02-28  2:12 ` [RFC PATCH 03/21] x86/virt/tdx: Implement the SEAMCALL base function Kai Huang
2022-02-28  2:12 ` [RFC PATCH 04/21] x86/virt/tdx: Add skeleton for detecting and initializing TDX on demand Kai Huang
2022-02-28  2:12 ` [RFC PATCH 05/21] x86/virt/tdx: Detect P-SEAMLDR and TDX module Kai Huang
2022-02-28  2:12 ` [RFC PATCH 06/21] x86/virt/tdx: Shut down TDX module in case of error Kai Huang
2022-02-28  2:12 ` [RFC PATCH 07/21] x86/virt/tdx: Do TDX module global initialization Kai Huang
2022-02-28  2:12 ` [RFC PATCH 08/21] x86/virt/tdx: Do logical-cpu scope TDX module initialization Kai Huang
2022-02-28  2:12 ` [RFC PATCH 09/21] x86/virt/tdx: Get information about TDX module and convertible memory Kai Huang
2022-02-28  2:12 ` [RFC PATCH 10/21] x86/virt/tdx: Add placeholder to coveret all system RAM as TDX memory Kai Huang
2022-02-28  2:12 ` [RFC PATCH 11/21] x86/virt/tdx: Choose to use " Kai Huang
2022-02-28  2:13 ` [RFC PATCH 12/21] x86/virt/tdx: Create TDMRs to cover all system RAM Kai Huang
2022-02-28  2:13 ` [RFC PATCH 13/21] x86/virt/tdx: Allocate and set up PAMTs for TDMRs Kai Huang
2022-02-28  2:13 ` [RFC PATCH 14/21] x86/virt/tdx: Set up reserved areas for all TDMRs Kai Huang
2022-02-28  2:13 ` [RFC PATCH 15/21] x86/virt/tdx: Reserve TDX module global KeyID Kai Huang
2022-02-28  2:13 ` [RFC PATCH 16/21] x86/virt/tdx: Configure TDX module with TDMRs and " Kai Huang
2022-02-28  2:13 ` [RFC PATCH 17/21] x86/virt/tdx: Configure global KeyID on all packages Kai Huang
2022-02-28  2:13 ` [RFC PATCH 18/21] x86/virt/tdx: Initialize all TDMRs Kai Huang
2022-02-28  2:13 ` Kai Huang [this message]
2022-02-28  2:13 ` [RFC PATCH 20/21] x86/virt/tdx: Add kernel command line to opt-in TDX host support Kai Huang
2022-02-28  2:13 ` [RFC PATCH 21/21] Documentation/x86: Add documentation for " Kai Huang
