linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kuppuswamy Sathyanarayanan  <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Kai Huang <kai.huang@intel.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: x86@kernel.org, dave.hansen@intel.com,
	kirill.shutemov@linux.intel.com, peterz@infradead.org,
	tony.luck@intel.com, tglx@linutronix.de, bp@alien8.de,
	mingo@redhat.com, hpa@zytor.com, seanjc@google.com,
	pbonzini@redhat.com, rafael@kernel.org, david@redhat.com,
	dan.j.williams@intel.com, len.brown@intel.com,
	ak@linux.intel.com, isaku.yamahata@intel.com,
	ying.huang@intel.com, chao.gao@intel.com, nik.borisov@suse.com,
	bagasdotme@gmail.com, sagis@google.com, imammedo@redhat.com
Subject: Re: [PATCH v14 07/23] x86/virt/tdx: Add skeleton to enable TDX on demand
Date: Tue, 17 Oct 2023 07:24:29 -0700	[thread overview]
Message-ID: <c3622d89-28d8-4917-9385-67b4cabaccd0@linux.intel.com> (raw)
In-Reply-To: <4fd10771907ae276548140cf7f8746e2eb38821c.1697532085.git.kai.huang@intel.com>



On 10/17/2023 3:14 AM, Kai Huang wrote:
> To enable TDX the kernel needs to initialize TDX from two perspectives:
> 1) Do a set of SEAMCALLs to initialize the TDX module to make it ready
> to create and run TDX guests; 2) Do the per-cpu initialization SEAMCALL
> on one logical cpu before the kernel wants to make any other SEAMCALLs
> on that cpu (including those involved during module initialization and
> running TDX guests).
> 
> The TDX module can be initialized only once in its lifetime.  Instead
> of always initializing it at boot time, this implementation chooses an
> "on demand" approach to initialize TDX until there is a real need (e.g
> when requested by KVM).  This approach has below pros:
> 
> 1) It avoids consuming the memory that must be allocated by kernel and
> given to the TDX module as metadata (~1/256th of the TDX-usable memory),
> and also saves the CPU cycles of initializing the TDX module (and the
> metadata) when TDX is not used at all.
> 
> 2) The TDX module design allows it to be updated while the system is
> running.  The update procedure shares quite a few steps with this "on
> demand" initialization mechanism.  The hope is that much of "on demand"
> mechanism can be shared with a future "update" mechanism.  A boot-time
> TDX module implementation would not be able to share much code with the
> update mechanism.
> 
> 3) Making SEAMCALL requires VMX to be enabled.  Currently, only the KVM
> code mucks with VMX enabling.  If the TDX module were to be initialized
> separately from KVM (like at boot), the boot code would need to be
> taught how to muck with VMX enabling and KVM would need to be taught how
> to cope with that.  Making KVM itself responsible for TDX initialization
> lets the rest of the kernel stay blissfully unaware of VMX.
> 
> Similar to module initialization, also make the per-cpu initialization
> "on demand" as it also depends on VMX being enabled.
> 
> Add two functions, tdx_enable() and tdx_cpu_enable(), to enable the TDX
> module and enable TDX on local cpu respectively.  For now tdx_enable()
> is a placeholder.  The TODO list will be pared down as functionality is
> added.
> 
> Export both tdx_cpu_enable() and tdx_enable() for KVM use.
> 
> In tdx_enable() use a state machine protected by mutex to make sure the
> initialization will only be done once, as tdx_enable() can be called
> multiple times (i.e. KVM module can be reloaded) and may be called
> concurrently by other kernel components in the future.
> 
> The per-cpu initialization on each cpu can only be done once during the
> module's life time.  Use a per-cpu variable to track its status to make
> sure it is only done once in tdx_cpu_enable().
> 
> Also, a SEAMCALL to do TDX module global initialization must be done
> once on any logical cpu before any per-cpu initialization SEAMCALL.  Do
> it inside tdx_cpu_enable() too (if hasn't been done).
> 
> tdx_enable() can potentially invoke SEAMCALLs on any online cpus.  The
> per-cpu initialization must be done before those SEAMCALLs are invoked
> on some cpu.  To keep things simple, in tdx_cpu_enable(), always do the
> per-cpu initialization regardless of whether the TDX module has been
> initialized or not.  And in tdx_enable(), don't call tdx_cpu_enable()
> but assume the caller has disabled CPU hotplug, done VMXON and
> tdx_cpu_enable() on all online cpus before calling tdx_enable().
> 
> Signed-off-by: Kai Huang <kai.huang@intel.com>
> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
> 
> v13 -> v14:
>  - Use lockdep_assert_irqs_off() in try_init_model_global() (Nikolay),
>    but still keep the comment (Kirill).
>  - Add code to print "module not loaded" in the first SEAMCALL.
>  - If SYS.INIT fails, stop calling LP.INIT in other tdx_cpu_enable()s.
>  - Added Kirill's tag
> 
> v12 -> v13:
>  - Made tdx_cpu_enable() always be called with IRQ disabled via IPI
>    funciton call (Peter, Kirill).
> 
> v11 -> v12:
>  - Simplified TDX module global init and lp init status tracking (David).
>  - Added comment around try_init_module_global() for using
>    raw_spin_lock() (Dave).
>  - Added one sentence to changelog to explain why to expose tdx_enable()
>    and tdx_cpu_enable() (Dave).
>  - Simplifed comments around tdx_enable() and tdx_cpu_enable() to use
>    lockdep_assert_*() instead. (Dave)
>  - Removed redundent "TDX" in error message (Dave).
> 
> v10 -> v11:
>  - Return -NODEV instead of -EINVAL when CONFIG_INTEL_TDX_HOST is off.
>  - Return the actual error code for tdx_enable() instead of -EINVAL.
>  - Added Isaku's Reviewed-by.
> 
> v9 -> v10:
>  - Merged the patch to handle per-cpu initialization to this patch to
>    tell the story better.
>  - Changed how to handle the per-cpu initialization to only provide a
>    tdx_cpu_enable() function to let the user of TDX to do it when the
>    user wants to run TDX code on a certain cpu.
>  - Changed tdx_enable() to not call cpus_read_lock() explicitly, but
>    call lockdep_assert_cpus_held() to assume the caller has done that.
>  - Improved comments around tdx_enable() and tdx_cpu_enable().
>  - Improved changelog to tell the story better accordingly.
> 
> v8 -> v9:
>  - Removed detailed TODO list in the changelog (Dave).
>  - Added back steps to do module global initialization and per-cpu
>    initialization in the TODO list comment.
>  - Moved the 'enum tdx_module_status_t' from tdx.c to local tdx.h
> 
> v7 -> v8:
>  - Refined changelog (Dave).
>  - Removed "all BIOS-enabled cpus" related code (Peter/Thomas/Dave).
>  - Add a "TODO list" comment in init_tdx_module() to list all steps of
>    initializing the TDX Module to tell the story (Dave).
>  - Made tdx_enable() unverisally return -EINVAL, and removed nonsense
>    comments (Dave).
>  - Simplified __tdx_enable() to only handle success or failure.
>  - TDX_MODULE_SHUTDOWN -> TDX_MODULE_ERROR
>  - Removed TDX_MODULE_NONE (not loaded) as it is not necessary.
>  - Improved comments (Dave).
>  - Pointed out 'tdx_module_status' is software thing (Dave).
> 
>  ...
> 
> ---
>  arch/x86/include/asm/tdx.h  |   4 +
>  arch/x86/virt/vmx/tdx/tdx.c | 167 ++++++++++++++++++++++++++++++++++++
>  arch/x86/virt/vmx/tdx/tdx.h |  30 +++++++
>  3 files changed, 201 insertions(+)
>  create mode 100644 arch/x86/virt/vmx/tdx/tdx.h
> 
> diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
> index 984efd3114ed..3cfd64f8a2b5 100644
> --- a/arch/x86/include/asm/tdx.h
> +++ b/arch/x86/include/asm/tdx.h
> @@ -110,8 +110,12 @@ static inline u64 sc_retry(sc_func_t func, u64 fn,
>  #define seamcall_saved_ret(_fn, _args)	sc_retry(__seamcall_saved_ret, (_fn), (_args))
>  
>  bool platform_tdx_enabled(void);
> +int tdx_cpu_enable(void);
> +int tdx_enable(void);
>  #else
>  static inline bool platform_tdx_enabled(void) { return false; }
> +static inline int tdx_cpu_enable(void) { return -ENODEV; }
> +static inline int tdx_enable(void)  { return -ENODEV; }
>  #endif	/* CONFIG_INTEL_TDX_HOST */
>  
>  #endif /* !__ASSEMBLY__ */
> diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
> index 52fb14e0195f..36db33133cd5 100644
> --- a/arch/x86/virt/vmx/tdx/tdx.c
> +++ b/arch/x86/virt/vmx/tdx/tdx.c
> @@ -12,14 +12,24 @@
>  #include <linux/init.h>
>  #include <linux/errno.h>
>  #include <linux/printk.h>
> +#include <linux/cpu.h>
> +#include <linux/spinlock.h>
> +#include <linux/percpu-defs.h>
> +#include <linux/mutex.h>
>  #include <asm/msr-index.h>
>  #include <asm/msr.h>
>  #include <asm/tdx.h>
> +#include "tdx.h"
>  
>  static u32 tdx_global_keyid __ro_after_init;
>  static u32 tdx_guest_keyid_start __ro_after_init;
>  static u32 tdx_nr_guest_keyids __ro_after_init;
>  
> +static DEFINE_PER_CPU(bool, tdx_lp_initialized);
> +
> +static enum tdx_module_status_t tdx_module_status;
> +static DEFINE_MUTEX(tdx_module_lock);
> +
>  typedef void (*sc_err_func_t)(u64 fn, u64 err, struct tdx_module_args *args);
>  
>  static inline void seamcall_err(u64 fn, u64 err, struct tdx_module_args *args)
> @@ -72,6 +82,163 @@ static inline int sc_retry_prerr(sc_func_t func, sc_err_func_t err_func,
>  #define seamcall_prerr_ret(__fn, __args)					\
>  	sc_retry_prerr(__seamcall_ret, seamcall_err_ret, (__fn), (__args))
>  
> +/*
> + * Do the module global initialization once and return its result.
> + * It can be done on any cpu.  It's always called with interrupts
> + * disabled.
> + */
> +static int try_init_module_global(void)
> +{
> +	struct tdx_module_args args = {};
> +	static DEFINE_RAW_SPINLOCK(sysinit_lock);
> +	static bool sysinit_done;
> +	static int sysinit_ret;
> +
> +	lockdep_assert_irqs_disabled();
> +
> +	raw_spin_lock(&sysinit_lock);
> +
> +	if (sysinit_done)
> +		goto out;
> +
> +	/* RCX is module attributes and all bits are reserved */
> +	args.rcx = 0;

Isn't args.rcx already initialized to 0, why explixitly set it?

> +	sysinit_ret = seamcall_prerr(TDH_SYS_INIT, &args);
> +
> +	/*
> +	 * The first SEAMCALL also detects the TDX module, thus
> +	 * it can fail due to the TDX module is not loaded.
> +	 * Dump message to let the user know.
> +	 */
> +	if (sysinit_ret == -ENODEV)
> +		pr_err("module not loaded\n");
> +
> +	sysinit_done = true;
> +out:
> +	raw_spin_unlock(&sysinit_lock);
> +	return sysinit_ret;
> +}
> +
> +/**
> + * tdx_cpu_enable - Enable TDX on local cpu
> + *
> + * Do one-time TDX module per-cpu initialization SEAMCALL (and TDX module
> + * global initialization SEAMCALL if not done) on local cpu to make this
> + * cpu be ready to run any other SEAMCALLs.
> + *
> + * Always call this function via IPI function calls.
> + *
> + * Return 0 on success, otherwise errors.
> + */
> +int tdx_cpu_enable(void)
> +{
> +	struct tdx_module_args args = {};
> +	int ret;
> +
> +	if (!platform_tdx_enabled())
> +		return -ENODEV;
> +
> +	lockdep_assert_irqs_disabled();
> +
> +	if (__this_cpu_read(tdx_lp_initialized))
> +		return 0;
> +
> +	/*
> +	 * The TDX module global initialization is the very first step
> +	 * to enable TDX.  Need to do it first (if hasn't been done)
> +	 * before the per-cpu initialization.
> +	 */
> +	ret = try_init_module_global();
> +	if (ret)
> +		return ret;
> +
> +	ret = seamcall_prerr(TDH_SYS_LP_INIT, &args);

If you have more usage of zero intialized args variant, do you want to add a helper for it?

> +	if (ret)
> +		return ret;
> +
> +	__this_cpu_write(tdx_lp_initialized, true);
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(tdx_cpu_enable);
> +
> +static int init_tdx_module(void)
> +{
> +	/*
> +	 * TODO:
> +	 *
> +	 *  - Get TDX module information and TDX-capable memory regions.
> +	 *  - Build the list of TDX-usable memory regions.
> +	 *  - Construct a list of "TD Memory Regions" (TDMRs) to cover
> +	 *    all TDX-usable memory regions.
> +	 *  - Configure the TDMRs and the global KeyID to the TDX module.
> +	 *  - Configure the global KeyID on all packages.
> +	 *  - Initialize all TDMRs.
> +	 *
> +	 *  Return error before all steps are done.
> +	 */
> +	return -EINVAL;
> +}
> +
> +static int __tdx_enable(void)
> +{
> +	int ret;
> +
> +	ret = init_tdx_module();
> +	if (ret) {
> +		pr_err("module initialization failed (%d)\n", ret);
> +		tdx_module_status = TDX_MODULE_ERROR;
> +		return ret;
> +	}
> +
> +	pr_info("module initialized\n");
> +	tdx_module_status = TDX_MODULE_INITIALIZED;
> +
> +	return 0;
> +}
> +
> +/**
> + * tdx_enable - Enable TDX module to make it ready to run TDX guests
> + *
> + * This function assumes the caller has: 1) held read lock of CPU hotplug
> + * lock to prevent any new cpu from becoming online; 2) done both VMXON
> + * and tdx_cpu_enable() on all online cpus.
> + *
> + * This function can be called in parallel by multiple callers.
> + *
> + * Return 0 if TDX is enabled successfully, otherwise error.
> + */
> +int tdx_enable(void)
> +{
> +	int ret;
> +
> +	if (!platform_tdx_enabled())
> +		return -ENODEV;
> +
> +	lockdep_assert_cpus_held();
> +
> +	mutex_lock(&tdx_module_lock);
> +
> +	switch (tdx_module_status) {
> +	case TDX_MODULE_UNINITIALIZED:
> +		ret = __tdx_enable();
> +		break;
> +	case TDX_MODULE_INITIALIZED:
> +		/* Already initialized, great, tell the caller. */
> +		ret = 0;
> +		break;
> +	default:
> +		/* Failed to initialize in the previous attempts */
> +		ret = -EINVAL;
> +		break;
> +	}
> +
> +	mutex_unlock(&tdx_module_lock);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(tdx_enable);
> +
>  static int __init record_keyid_partitioning(u32 *tdx_keyid_start,
>  					    u32 *nr_tdx_keyids)
>  {
> diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
> new file mode 100644
> index 000000000000..a3c52270df5b
> --- /dev/null
> +++ b/arch/x86/virt/vmx/tdx/tdx.h
> @@ -0,0 +1,30 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _X86_VIRT_TDX_H
> +#define _X86_VIRT_TDX_H
> +
> +/*
> + * This file contains both macros and data structures defined by the TDX
> + * architecture and Linux defined software data structures and functions.
> + * The two should not be mixed together for better readability.  The
> + * architectural definitions come first.
> + */
> +
> +/*
> + * TDX module SEAMCALL leaf functions
> + */
> +#define TDH_SYS_INIT		33
> +#define TDH_SYS_LP_INIT		35
> +
> +/*
> + * Do not put any hardware-defined TDX structure representations below
> + * this comment!
> + */
> +
> +/* Kernel defined TDX module status during module initialization. */
> +enum tdx_module_status_t {
> +	TDX_MODULE_UNINITIALIZED,
> +	TDX_MODULE_INITIALIZED,
> +	TDX_MODULE_ERROR
> +};
> +
> +#endif

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

  reply	other threads:[~2023-10-17 14:24 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-17 10:14 [PATCH v14 00/23] TDX host kernel support Kai Huang
2023-10-17 10:14 ` [PATCH v14 01/23] x86/virt/tdx: Detect TDX during kernel boot Kai Huang
2023-10-17 13:24   ` Kuppuswamy Sathyanarayanan
2023-10-17 10:14 ` [PATCH v14 02/23] x86/tdx: Define TDX supported page sizes as macros Kai Huang
2023-10-17 10:14 ` [PATCH v14 03/23] x86/virt/tdx: Make INTEL_TDX_HOST depend on X86_X2APIC Kai Huang
2023-10-17 10:14 ` [PATCH v14 04/23] x86/cpu: Detect TDX partial write machine check erratum Kai Huang
2023-10-17 10:14 ` [PATCH v14 05/23] x86/virt/tdx: Handle SEAMCALL no entropy error in common code Kai Huang
2023-10-17 13:34   ` Kuppuswamy Sathyanarayanan
2023-10-17 10:14 ` [PATCH v14 06/23] x86/virt/tdx: Add SEAMCALL error printing for module initialization Kai Huang
2023-10-17 13:37   ` Kuppuswamy Sathyanarayanan
2023-10-18  6:27     ` Huang, Kai
2023-10-18  7:40   ` Nikolay Borisov
2023-10-18  8:26     ` Huang, Kai
2023-10-18 14:17   ` Kuppuswamy Sathyanarayanan
2023-10-17 10:14 ` [PATCH v14 07/23] x86/virt/tdx: Add skeleton to enable TDX on demand Kai Huang
2023-10-17 14:24   ` Kuppuswamy Sathyanarayanan [this message]
2023-10-18  6:51     ` Huang, Kai
2023-10-18 13:56       ` Dave Hansen
2023-10-18 19:55         ` Huang, Kai
2023-10-18  7:57   ` Nikolay Borisov
2023-10-18  8:29     ` Huang, Kai
2023-10-18  8:39       ` Nikolay Borisov
2023-10-18  8:57         ` Huang, Kai
2023-10-18  9:14   ` Nikolay Borisov
2023-10-18  9:17     ` Huang, Kai
2023-10-17 10:14 ` [PATCH v14 08/23] x86/virt/tdx: Get information about TDX module and TDX-capable memory Kai Huang
2023-10-17 10:14 ` [PATCH v14 09/23] x86/virt/tdx: Use all system memory when initializing TDX module as TDX memory Kai Huang
2023-10-17 10:14 ` [PATCH v14 10/23] x86/virt/tdx: Add placeholder to construct TDMRs to cover all TDX memory regions Kai Huang
2023-10-17 10:14 ` [PATCH v14 11/23] x86/virt/tdx: Fill out " Kai Huang
2023-10-17 10:14 ` [PATCH v14 12/23] x86/virt/tdx: Allocate and set up PAMTs for TDMRs Kai Huang
2023-10-24  5:53   ` Nikolay Borisov
2023-10-24 10:49     ` Huang, Kai
2023-10-24 13:31       ` Nikolay Borisov
2023-10-17 10:14 ` [PATCH v14 13/23] x86/virt/tdx: Designate reserved areas for all TDMRs Kai Huang
2023-10-17 10:14 ` [PATCH v14 14/23] x86/virt/tdx: Configure TDX module with the TDMRs and global KeyID Kai Huang
2023-10-17 10:14 ` [PATCH v14 15/23] x86/virt/tdx: Configure global KeyID on all packages Kai Huang
2023-10-17 10:14 ` [PATCH v14 16/23] x86/virt/tdx: Initialize all TDMRs Kai Huang
2023-10-17 10:14 ` [PATCH v14 17/23] x86/kexec: Flush cache of TDX private memory Kai Huang
2023-10-17 10:14 ` [PATCH v14 18/23] x86/virt/tdx: Keep TDMRs when module initialization is successful Kai Huang
2023-10-17 10:14 ` [PATCH v14 19/23] x86/virt/tdx: Improve readability of module initialization error handling Kai Huang
2023-10-17 10:14 ` [PATCH v14 20/23] x86/kexec(): Reset TDX private memory on platforms with TDX erratum Kai Huang
2023-10-17 10:14 ` [PATCH v14 21/23] x86/virt/tdx: Handle TDX interaction with ACPI S3 and deeper states Kai Huang
2023-10-17 10:53   ` Rafael J. Wysocki
2023-10-18  3:22     ` Huang, Kai
2023-10-18 10:15       ` Rafael J. Wysocki
2023-10-18 10:51         ` Huang, Kai
2023-10-18 10:53           ` Rafael J. Wysocki
2023-10-19 20:45             ` Huang, Kai
2023-10-24 10:46               ` Huang, Kai
2023-10-17 10:14 ` [PATCH v14 22/23] x86/mce: Improve error log of kernel space TDX #MC due to erratum Kai Huang
2023-10-17 10:14 ` [PATCH v14 23/23] Documentation/x86: Add documentation for TDX host support Kai Huang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c3622d89-28d8-4917-9385-67b4cabaccd0@linux.intel.com \
    --to=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=ak@linux.intel.com \
    --cc=bagasdotme@gmail.com \
    --cc=bp@alien8.de \
    --cc=chao.gao@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=david@redhat.com \
    --cc=hpa@zytor.com \
    --cc=imammedo@redhat.com \
    --cc=isaku.yamahata@intel.com \
    --cc=kai.huang@intel.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=nik.borisov@suse.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rafael@kernel.org \
    --cc=sagis@google.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).