From: Kuppuswamy Sathyanarayanan  <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@kernel.org>
Cc: Peter H Anvin <hpa@zytor.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Tony Luck <tony.luck@intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Andi Kleen <ak@linux.intel.com>,
	Kirill Shutemov <kirill.shutemov@linux.intel.com>,
	Sean Christopherson <seanjc@google.com>,
	Kuppuswamy Sathyanarayanan <knsathya@kernel.org>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 07/12] x86/traps: Add #VE support for TDX guest
Date: Wed,  4 Aug 2021 11:13:24 -0700
Message-ID: <20210804181329.2899708-8-sathyanarayanan.kuppuswamy@linux.intel.com>
In-Reply-To: <20210804181329.2899708-1-sathyanarayanan.kuppuswamy@linux.intel.com>

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Virtualization Exceptions (#VE) are delivered to TDX guests due to
specific guest actions which may happen in either user space or the kernel:

 * Specific instructions (WBINVD, for example)
 * Specific MSR accesses
 * Specific CPUID leaf accesses
 * Access to TD-shared memory, which includes MMIO

In the configurations that Linux will run in, virtualization
exceptions are never generated on accesses to normal, TD-private
memory that has been accepted.
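
As a rough illustrative sketch (not part of this patch), later patches
in this series dispatch on the exit reason reported for these triggers
along roughly the following lines; the function name is a placeholder,
the EXIT_REASON_* constants are the standard VMX ones from
<asm/vmx.h>, and the exact reasons used may differ:

  /* Illustrative only: per-reason handling is added by later patches. */
  static int tdg_dispatch_ve(struct pt_regs *regs, struct ve_info *ve)
  {
  	switch (ve->exit_reason) {
  	case EXIT_REASON_HLT:		/* HLT instruction */
  	case EXIT_REASON_MSR_READ:	/* RDMSR */
  	case EXIT_REASON_MSR_WRITE:	/* WRMSR */
  	case EXIT_REASON_CPUID:		/* CPUID leaves handled by the guest */
  	case EXIT_REASON_EPT_VIOLATION:	/* TD-shared memory, including MMIO */
  		return 0;	/* handled by later patches in this series */
  	default:
  		return -EFAULT;	/* unknown exit reason */
  	}
  }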

The entry paths do not access TD-shared memory or MMIO regions, and do
not use the specific MSRs, instructions, or CPUID leaves that might
generate #VE. In addition, all interrupts, including NMIs, are blocked
by the hardware from #VE delivery until TDGETVEINFO is called. This
eliminates the chance of a #VE during the syscall gap or the paranoid
entry paths and simplifies #VE handling.

After TDGETVEINFO, a #VE could in theory happen again (e.g. through an
NMI), but it is not expected to, because TDX expects that NMIs do not
trigger #VEs. The other case where one could happen is if the #VE
handler itself panics, but at that point there are no guarantees about
anything anyway.

If a guest kernel action that would normally cause a #VE occurs in the
interrupt-disabled region before TDGETVEINFO, a #DF is delivered to the
guest instead, which results in an oops (and should eventually be a
panic, as panic_on_oops is expected to be set to 1 for TDX guests).
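
Purely as an illustrative sketch (not part of this patch, and the
placement is an assumption), forcing that could be as small as setting
the existing panic_on_oops knob from the TDX guest init path:

  	/*
  	 * Illustrative only: make any oops (e.g. from a #DF taken in the
  	 * NMI-blocked window) fatal in a TDX guest.
  	 */
  	if (cpuid_has_tdx_guest())
  		panic_on_oops = 1;	/* declared in <linux/kernel.h> */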

Add basic infrastructure to handle any #VE which occurs in the kernel or
userspace.  Later patches will add handling for specific #VE scenarios.

Convert unhandled #VEs (everything, until later in this series) so that
they appear just like a #GP by calling ve_raise_fault() directly.
ve_raise_fault() is similar to the #GP handler: it is responsible for
sending SIGSEGV to userspace, dying in the kernel, and notifying
debuggers and other die-chain users.

Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---

Changes since v4:
 * Since ve_raise_fault() is used only by TDX code, moved it
   within #ifdef CONFIG_INTEL_TDX_GUEST.

Changes since v3:
 * None

 arch/x86/include/asm/idtentry.h |  4 ++
 arch/x86/include/asm/tdx.h      | 19 +++++++++
 arch/x86/kernel/idt.c           |  6 +++
 arch/x86/kernel/tdx.c           | 36 +++++++++++++++++
 arch/x86/kernel/traps.c         | 69 +++++++++++++++++++++++++++++++++
 5 files changed, 134 insertions(+)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 1345088e9902..8ccc81d653b3 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -625,6 +625,10 @@ DECLARE_IDTENTRY_XENCB(X86_TRAP_OTHER,	exc_xen_hypervisor_callback);
 DECLARE_IDTENTRY_RAW(X86_TRAP_OTHER,	exc_xen_unknown_trap);
 #endif
 
+#ifdef CONFIG_INTEL_TDX_GUEST
+DECLARE_IDTENTRY(X86_TRAP_VE,		exc_virtualization_exception);
+#endif
+
 /* Device interrupts common/spurious */
 DECLARE_IDTENTRY_IRQ(X86_TRAP_OTHER,	common_interrupt);
 #ifdef CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 72a6a719ce37..846fe58f0426 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -39,6 +39,20 @@ struct tdx_hypercall_output {
 	u64 r15;
 };
 
+/*
+ * Used by the #VE exception handler to gather the #VE exception
+ * info from the TDX module. This is a software-only structure
+ * and is not part of the TDX module/VMM ABI.
+ */
+struct ve_info {
+	u64 exit_reason;
+	u64 exit_qual;
+	u64 gla;	/* Guest Linear (virtual) Address */
+	u64 gpa;	/* Guest Physical Address */
+	u32 instr_len;
+	u32 instr_info;
+};
+
 #ifdef CONFIG_INTEL_TDX_GUEST
 
 void __init tdx_early_init(void);
@@ -53,6 +67,11 @@ u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
 u64 __tdx_hypercall(u64 type, u64 fn, u64 r12, u64 r13, u64 r14,
 		    u64 r15, struct tdx_hypercall_output *out);
 
+unsigned long tdg_get_ve_info(struct ve_info *ve);
+
+int tdg_handle_virtualization_exception(struct pt_regs *regs,
+					struct ve_info *ve);
+
 #else
 
 static inline void tdx_early_init(void) { };
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index df0fa695bb09..a5eaae8e6c44 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -68,6 +68,9 @@ static const __initconst struct idt_data early_idts[] = {
 	 */
 	INTG(X86_TRAP_PF,		asm_exc_page_fault),
 #endif
+#ifdef CONFIG_INTEL_TDX_GUEST
+	INTG(X86_TRAP_VE,		asm_exc_virtualization_exception),
+#endif
 };
 
 /*
@@ -91,6 +94,9 @@ static const __initconst struct idt_data def_idts[] = {
 	INTG(X86_TRAP_MF,		asm_exc_coprocessor_error),
 	INTG(X86_TRAP_AC,		asm_exc_alignment_check),
 	INTG(X86_TRAP_XF,		asm_exc_simd_coprocessor_error),
+#ifdef CONFIG_INTEL_TDX_GUEST
+	INTG(X86_TRAP_VE,		asm_exc_virtualization_exception),
+#endif
 
 #ifdef CONFIG_X86_32
 	TSKG(X86_TRAP_DF,		GDT_ENTRY_DOUBLEFAULT_TSS),
diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index 3973e81751ba..6169f9c740b2 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -10,6 +10,7 @@
 
 /* TDX Module call Leaf IDs */
 #define TDINFO				1
+#define TDGETVEINFO			3
 
 static struct {
 	unsigned int gpa_width;
@@ -75,6 +76,41 @@ static void tdg_get_info(void)
 	td_info.attributes = out.rdx;
 }
 
+unsigned long tdg_get_ve_info(struct ve_info *ve)
+{
+	u64 ret;
+	struct tdx_module_output out = {0};
+
+	/*
+	 * NMIs and machine checks are suppressed. Before this point any
+	 * #VE is fatal. After this point (TDGETVEINFO call), NMIs and
+	 * additional #VEs are permitted (but they are not expected to
+	 * happen unless the kernel panics).
+	 */
+	ret = __tdx_module_call(TDGETVEINFO, 0, 0, 0, 0, &out);
+
+	ve->exit_reason = out.rcx;
+	ve->exit_qual   = out.rdx;
+	ve->gla         = out.r8;
+	ve->gpa         = out.r9;
+	ve->instr_len   = out.r10 & UINT_MAX;
+	ve->instr_info  = out.r10 >> 32;
+
+	return ret;
+}
+
+int tdg_handle_virtualization_exception(struct pt_regs *regs,
+					struct ve_info *ve)
+{
+	/*
+	 * TODO: Add handler support for various #VE exit
+	 * reasons. It will be added by other patches in
+	 * the series.
+	 */
+	pr_warn("Unexpected #VE: %llu\n", ve->exit_reason);
+	return -EFAULT;
+}
+
 void __init tdx_early_init(void)
 {
 	if (!cpuid_has_tdx_guest())
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index a58800973aed..be56f0281cb5 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -61,6 +61,7 @@
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
 #include <asm/vdso.h>
+#include <asm/tdx.h>
 
 #ifdef CONFIG_X86_64
 #include <asm/x86_init.h>
@@ -1140,6 +1141,74 @@ DEFINE_IDTENTRY(exc_device_not_available)
 	}
 }
 
+#ifdef CONFIG_INTEL_TDX_GUEST
+#define VEFSTR "VE fault"
+static void ve_raise_fault(struct pt_regs *regs, long error_code)
+{
+	struct task_struct *tsk = current;
+
+	if (user_mode(regs)) {
+		tsk->thread.error_code = error_code;
+		tsk->thread.trap_nr = X86_TRAP_VE;
+
+		/*
+		 * Unlike the #GP handler, VDSO exceptions are not fixed
+		 * up here because the VDSO is not expected to trigger #VE.
+		 */
+		show_signal(tsk, SIGSEGV, "", VEFSTR, regs, error_code);
+		force_sig(SIGSEGV);
+		return;
+	}
+
+	if (fixup_exception(regs, X86_TRAP_VE, error_code, 0))
+		return;
+
+	tsk->thread.error_code = error_code;
+	tsk->thread.trap_nr = X86_TRAP_VE;
+
+	/*
+	 * To be potentially processing a kprobe fault and to trust the result
+	 * from kprobe_running(), we have to be non-preemptible.
+	 */
+	if (!preemptible() &&
+	    kprobe_running() &&
+	    kprobe_fault_handler(regs, X86_TRAP_VE))
+		return;
+
+	notify_die(DIE_GPF, VEFSTR, regs, error_code, X86_TRAP_VE, SIGSEGV);
+
+	die_addr(VEFSTR, regs, error_code, 0);
+}
+
+DEFINE_IDTENTRY(exc_virtualization_exception)
+{
+	struct ve_info ve;
+	int ret;
+
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
+
+	/*
+	 * NMIs, machine checks and interrupts remain disabled until
+	 * the TDGETVEINFO TDCALL is executed. This prevents #VE
+	 * nesting issues.
+	 */
+	ret = tdg_get_ve_info(&ve);
+
+	cond_local_irq_enable(regs);
+
+	if (!ret)
+		ret = tdg_handle_virtualization_exception(regs, &ve);
+	/*
+	 * If tdg_handle_virtualization_exception() could not process
+	 * it successfully, treat it as #GP(0) and handle it.
+	 */
+	if (ret)
+		ve_raise_fault(regs, 0);
+
+	cond_local_irq_disable(regs);
+}
+#endif
+
 #ifdef CONFIG_X86_32
 DEFINE_IDTENTRY_SW(iret_error)
 {
-- 
2.25.1

