linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Kuppuswamy Sathyanarayanan  <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@kernel.org>
Cc: Peter H Anvin <hpa@zytor.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Tony Luck <tony.luck@intel.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Andi Kleen <ak@linux.intel.com>,
	Kirill Shutemov <kirill.shutemov@linux.intel.com>,
	Sean Christopherson <seanjc@google.com>,
	Kuppuswamy Sathyanarayanan <knsathya@kernel.org>,
	x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 07/12] x86/traps: Add #VE support for TDX guest
Date: Wed,  4 Aug 2021 11:13:24 -0700	[thread overview]
Message-ID: <20210804181329.2899708-8-sathyanarayanan.kuppuswamy@linux.intel.com> (raw)
In-Reply-To: <20210804181329.2899708-1-sathyanarayanan.kuppuswamy@linux.intel.com>

From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>

Virtualization Exceptions (#VE) are delivered to TDX guests due to
specific guest actions which may happen in either user space or the kernel:

 * Specific instructions (WBINVD, for example)
 * Specific MSR accesses
 * Specific CPUID leaf accesses
 * Access to TD-shared memory, which includes MMIO

In the settings that Linux will run in, virtual exceptions are never
generated on accesses to normal, TD-private memory that has been
accepted.

The entry paths do not access TD-shared memory, MMIO regions or use
those specific MSRs, instructions, CPUID leaves that might generate #VE.
In addition, all interrupts including NMIs are blocked by the hardware
starting with #VE delivery until TDGETVEINFO is called.  This eliminates
the chance of a #VE during the syscall gap or paranoid entry paths and
simplifies #VE handling.

After TDGETVEINFO #VE could happen in theory (e.g. through an NMI),
but it is expected not to happen because TDX expects NMIs not to
trigger #VEs. Another case where they could happen is if the #VE
exception panics, but in this case there are no guarantees on anything
anyways.

If a guest kernel action which would normally cause a #VE occurs in the
interrupt-disabled region before TDGETVEINFO, a #DF is delivered to the
guest which will result in an oops (and should eventually be a panic, as
we would like to set panic_on_oops to 1 for TDX guests).

Add basic infrastructure to handle any #VE which occurs in the kernel or
userspace.  Later patches will add handling for specific #VE scenarios.

Convert unhandled #VE's (everything, until later in this series) so that
they appear just like a #GP by calling ve_raise_fault() directly.
ve_raise_fault() is similar to #GP handler and is responsible for
sending SIGSEGV to userspace and cpu die and notifying debuggers and
other die chain users.  

Co-developed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---

Changes since v4:
 * Since ve_raise_fault() is used only by TDX code, moved it
   within #ifdef CONFIG_INTEL_TDX_GUEST.

Changes since v3:
 * None

 arch/x86/include/asm/idtentry.h |  4 ++
 arch/x86/include/asm/tdx.h      | 19 +++++++++
 arch/x86/kernel/idt.c           |  6 +++
 arch/x86/kernel/tdx.c           | 36 +++++++++++++++++
 arch/x86/kernel/traps.c         | 69 +++++++++++++++++++++++++++++++++
 5 files changed, 134 insertions(+)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 1345088e9902..8ccc81d653b3 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -625,6 +625,10 @@ DECLARE_IDTENTRY_XENCB(X86_TRAP_OTHER,	exc_xen_hypervisor_callback);
 DECLARE_IDTENTRY_RAW(X86_TRAP_OTHER,	exc_xen_unknown_trap);
 #endif
 
+#ifdef CONFIG_INTEL_TDX_GUEST
+DECLARE_IDTENTRY(X86_TRAP_VE,		exc_virtualization_exception);
+#endif
+
 /* Device interrupts common/spurious */
 DECLARE_IDTENTRY_IRQ(X86_TRAP_OTHER,	common_interrupt);
 #ifdef CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 72a6a719ce37..846fe58f0426 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -39,6 +39,20 @@ struct tdx_hypercall_output {
 	u64 r15;
 };
 
+/*
+ * Used by #VE exception handler to gather the #VE exception
+ * info from the TDX module. This is software only structure
+ * and not related to TDX module/VMM.
+ */
+struct ve_info {
+	u64 exit_reason;
+	u64 exit_qual;
+	u64 gla;	/* Guest Linear (virtual) Address */
+	u64 gpa;	/* Guest Physical (virtual) Address */
+	u32 instr_len;
+	u32 instr_info;
+};
+
 #ifdef CONFIG_INTEL_TDX_GUEST
 
 void __init tdx_early_init(void);
@@ -53,6 +67,11 @@ u64 __tdx_module_call(u64 fn, u64 rcx, u64 rdx, u64 r8, u64 r9,
 u64 __tdx_hypercall(u64 type, u64 fn, u64 r12, u64 r13, u64 r14,
 		    u64 r15, struct tdx_hypercall_output *out);
 
+unsigned long tdg_get_ve_info(struct ve_info *ve);
+
+int tdg_handle_virtualization_exception(struct pt_regs *regs,
+					struct ve_info *ve);
+
 #else
 
 static inline void tdx_early_init(void) { };
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index df0fa695bb09..a5eaae8e6c44 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -68,6 +68,9 @@ static const __initconst struct idt_data early_idts[] = {
 	 */
 	INTG(X86_TRAP_PF,		asm_exc_page_fault),
 #endif
+#ifdef CONFIG_INTEL_TDX_GUEST
+	INTG(X86_TRAP_VE,		asm_exc_virtualization_exception),
+#endif
 };
 
 /*
@@ -91,6 +94,9 @@ static const __initconst struct idt_data def_idts[] = {
 	INTG(X86_TRAP_MF,		asm_exc_coprocessor_error),
 	INTG(X86_TRAP_AC,		asm_exc_alignment_check),
 	INTG(X86_TRAP_XF,		asm_exc_simd_coprocessor_error),
+#ifdef CONFIG_INTEL_TDX_GUEST
+	INTG(X86_TRAP_VE,		asm_exc_virtualization_exception),
+#endif
 
 #ifdef CONFIG_X86_32
 	TSKG(X86_TRAP_DF,		GDT_ENTRY_DOUBLEFAULT_TSS),
diff --git a/arch/x86/kernel/tdx.c b/arch/x86/kernel/tdx.c
index 3973e81751ba..6169f9c740b2 100644
--- a/arch/x86/kernel/tdx.c
+++ b/arch/x86/kernel/tdx.c
@@ -10,6 +10,7 @@
 
 /* TDX Module call Leaf IDs */
 #define TDINFO				1
+#define TDGETVEINFO			3
 
 static struct {
 	unsigned int gpa_width;
@@ -75,6 +76,41 @@ static void tdg_get_info(void)
 	td_info.attributes = out.rdx;
 }
 
+unsigned long tdg_get_ve_info(struct ve_info *ve)
+{
+	u64 ret;
+	struct tdx_module_output out = {0};
+
+	/*
+	 * NMIs and machine checks are suppressed. Before this point any
+	 * #VE is fatal. After this point (TDGETVEINFO call), NMIs and
+	 * additional #VEs are permitted (but we don't expect them to
+	 * happen unless you panic).
+	 */
+	ret = __tdx_module_call(TDGETVEINFO, 0, 0, 0, 0, &out);
+
+	ve->exit_reason = out.rcx;
+	ve->exit_qual   = out.rdx;
+	ve->gla         = out.r8;
+	ve->gpa         = out.r9;
+	ve->instr_len   = out.r10 & UINT_MAX;
+	ve->instr_info  = out.r10 >> 32;
+
+	return ret;
+}
+
+int tdg_handle_virtualization_exception(struct pt_regs *regs,
+					struct ve_info *ve)
+{
+	/*
+	 * TODO: Add handler support for various #VE exit
+	 * reasons. It will be added by other patches in
+	 * the series.
+	 */
+	pr_warn("Unexpected #VE: %lld\n", ve->exit_reason);
+	return -EFAULT;
+}
+
 void __init tdx_early_init(void)
 {
 	if (!cpuid_has_tdx_guest())
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index a58800973aed..be56f0281cb5 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -61,6 +61,7 @@
 #include <asm/insn.h>
 #include <asm/insn-eval.h>
 #include <asm/vdso.h>
+#include <asm/tdx.h>
 
 #ifdef CONFIG_X86_64
 #include <asm/x86_init.h>
@@ -1140,6 +1141,74 @@ DEFINE_IDTENTRY(exc_device_not_available)
 	}
 }
 
+#ifdef CONFIG_INTEL_TDX_GUEST
+#define VEFSTR "VE fault"
+static void ve_raise_fault(struct pt_regs *regs, long error_code)
+{
+	struct task_struct *tsk = current;
+
+	if (user_mode(regs)) {
+		tsk->thread.error_code = error_code;
+		tsk->thread.trap_nr = X86_TRAP_VE;
+
+		/*
+		 * Not fixing up VDSO exceptions similar to #GP handler
+		 * because we don't expect the VDSO to trigger #VE.
+		 */
+		show_signal(tsk, SIGSEGV, "", VEFSTR, regs, error_code);
+		force_sig(SIGSEGV);
+		return;
+	}
+
+	if (fixup_exception(regs, X86_TRAP_VE, error_code, 0))
+		return;
+
+	tsk->thread.error_code = error_code;
+	tsk->thread.trap_nr = X86_TRAP_VE;
+
+	/*
+	 * To be potentially processing a kprobe fault and to trust the result
+	 * from kprobe_running(), we have to be non-preemptible.
+	 */
+	if (!preemptible() &&
+	    kprobe_running() &&
+	    kprobe_fault_handler(regs, X86_TRAP_VE))
+		return;
+
+	notify_die(DIE_GPF, VEFSTR, regs, error_code, X86_TRAP_VE, SIGSEGV);
+
+	die_addr(VEFSTR, regs, error_code, 0);
+}
+
+DEFINE_IDTENTRY(exc_virtualization_exception)
+{
+	struct ve_info ve;
+	int ret;
+
+	RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
+
+	/*
+	 * NMIs/Machine-checks/Interrupts will be in a disabled state
+	 * till TDGETVEINFO TDCALL is executed. This prevents #VE
+	 * nesting issue.
+	 */
+	ret = tdg_get_ve_info(&ve);
+
+	cond_local_irq_enable(regs);
+
+	if (!ret)
+		ret = tdg_handle_virtualization_exception(regs, &ve);
+	/*
+	 * If tdg_handle_virtualization_exception() could not process
+	 * it successfully, treat it as #GP(0) and handle it.
+	 */
+	if (ret)
+		ve_raise_fault(regs, 0);
+
+	cond_local_irq_disable(regs);
+}
+#endif
+
 #ifdef CONFIG_X86_32
 DEFINE_IDTENTRY_SW(iret_error)
 {
-- 
2.25.1


  parent reply	other threads:[~2021-08-04 18:15 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-04 18:13 [PATCH v5 00/12] Add TDX Guest Support (Initial support) Kuppuswamy Sathyanarayanan
2021-08-04 18:13 ` [PATCH v5 01/12] x86/paravirt: Move halt paravirt calls under CONFIG_PARAVIRT Kuppuswamy Sathyanarayanan
2021-08-12  7:18   ` Borislav Petkov
2021-08-12 17:17     ` Kuppuswamy, Sathyanarayanan
2021-08-17 12:50     ` Juergen Gross
2021-08-17 13:16       ` Kuppuswamy, Sathyanarayanan
2021-08-17 13:28         ` Juergen Gross
2021-08-17 13:39           ` Kuppuswamy, Sathyanarayanan
2021-08-17 13:47             ` Juergen Gross
2021-08-17 13:50               ` Kuppuswamy, Sathyanarayanan
2021-08-04 18:13 ` [PATCH v5 02/12] x86/tdx: Introduce INTEL_TDX_GUEST config option Kuppuswamy Sathyanarayanan
2021-08-04 18:13 ` [PATCH v5 03/12] x86/cpufeatures: Add TDX Guest CPU feature Kuppuswamy Sathyanarayanan
2021-08-12  7:39   ` Borislav Petkov
2021-08-04 18:13 ` [PATCH v5 04/12] x86/tdx: Add protected guest support for TDX guest Kuppuswamy Sathyanarayanan
2021-08-04 21:59   ` Sean Christopherson
2021-08-04 22:03     ` Dave Hansen
2021-08-04 22:26       ` Kuppuswamy, Sathyanarayanan
2021-08-04 22:42         ` Sean Christopherson
2021-08-04 23:00           ` Kuppuswamy, Sathyanarayanan
2021-08-12  7:53             ` Borislav Petkov
2021-08-12 17:18               ` Kuppuswamy, Sathyanarayanan
2021-08-20 14:28                 ` Borislav Petkov
2021-08-20 16:42                   ` Kuppuswamy, Sathyanarayanan
2021-08-20 16:59                     ` Borislav Petkov
2021-08-20 17:11                       ` Kuppuswamy, Sathyanarayanan
2021-08-04 18:13 ` [PATCH v5 05/12] x86/tdx: Add __tdx_module_call() and __tdx_hypercall() helper functions Kuppuswamy Sathyanarayanan
2021-08-20 15:16   ` Borislav Petkov
2021-08-04 18:13 ` [PATCH v5 06/12] x86/tdx: Get TD execution environment information via TDINFO Kuppuswamy Sathyanarayanan
2021-08-04 22:38   ` Sean Christopherson
2021-08-20 17:13   ` Borislav Petkov
2021-08-20 17:31     ` Kuppuswamy, Sathyanarayanan
2021-08-20 17:35       ` Borislav Petkov
2021-08-20 18:29         ` Kuppuswamy, Sathyanarayanan
2021-08-20 18:58           ` Andi Kleen
2021-08-20 19:01             ` Kuppuswamy, Sathyanarayanan
2021-08-04 18:13 ` Kuppuswamy Sathyanarayanan [this message]
2021-08-24 10:17   ` [PATCH v5 07/12] x86/traps: Add #VE support for TDX guest Borislav Petkov
2021-08-24 17:32     ` Kuppuswamy, Sathyanarayanan
2021-08-24 17:36       ` Dave Hansen
2021-08-24 17:46       ` Borislav Petkov
2021-09-02 15:24         ` Kuppuswamy, Sathyanarayanan
2021-09-03 10:17           ` Borislav Petkov
2021-08-04 18:13 ` [PATCH v5 08/12] x86/tdx: Add HLT " Kuppuswamy Sathyanarayanan
2021-08-24 16:10   ` Borislav Petkov
2021-08-24 17:06     ` Sean Christopherson
2021-08-24 17:25       ` Andi Kleen
2021-08-24 17:27       ` Borislav Petkov
2021-08-24 17:47         ` Sean Christopherson
2021-08-24 17:50           ` Borislav Petkov
2021-08-31 20:49         ` Kuppuswamy, Sathyanarayanan
2021-09-01  7:42           ` Borislav Petkov
2021-08-24 18:18       ` Kuppuswamy, Sathyanarayanan
2021-08-24 18:28         ` Andi Kleen
2021-08-24 17:35     ` Kuppuswamy, Sathyanarayanan
2021-08-24 17:48       ` Borislav Petkov
2021-08-04 18:13 ` [PATCH v5 09/12] x86/tdx: Wire up KVM hypercalls Kuppuswamy Sathyanarayanan
2021-08-24 16:34   ` Borislav Petkov
2021-08-24 18:11     ` Kuppuswamy, Sathyanarayanan
2021-08-24 18:29       ` Borislav Petkov
2021-08-24 19:11         ` Kuppuswamy, Sathyanarayanan
2021-08-24 19:39           ` Borislav Petkov
2021-08-04 18:13 ` [PATCH v5 10/12] x86/tdx: Add MSR support for TDX guest Kuppuswamy Sathyanarayanan
2021-08-24 16:55   ` Borislav Petkov
2021-08-24 18:12     ` Kuppuswamy, Sathyanarayanan
2021-08-04 18:13 ` [PATCH v5 11/12] x86/tdx: Don't write CSTAR MSR on Intel Kuppuswamy Sathyanarayanan
2021-08-04 18:31   ` Sean Christopherson
2021-08-04 21:03     ` Kuppuswamy, Sathyanarayanan
2021-08-04 21:44       ` Sean Christopherson
2021-08-04 21:48       ` Dave Hansen
2021-08-04 22:23         ` Kuppuswamy, Sathyanarayanan
2021-08-04 18:13 ` [PATCH v5 12/12] x86/tdx: Handle CPUID via #VE Kuppuswamy Sathyanarayanan
2021-08-24 17:48   ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210804181329.2899708-8-sathyanarayanan.kuppuswamy@linux.intel.com \
    --to=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=ak@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=hpa@zytor.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=knsathya@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).