* [PATCHv4 0/3] An updated version of the load_unaligned_zeropad() fix
@ 2022-06-14 12:01 Kirill A. Shutemov
  2022-06-14 12:01 ` [PATCHv4 1/3] x86/tdx: Fix early #VE handling Kirill A. Shutemov
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2022-06-14 12:01 UTC (permalink / raw)
  To: tglx, mingo, bp, dave.hansen, luto, peterz
  Cc: ak, dan.j.williams, david, hpa, linux-kernel,
	sathyanarayanan.kuppuswamy, seanjc, thomas.lendacky, x86,
	Kirill A. Shutemov

An updated version of the load_unaligned_zeropad() fix.

v4:
 - Allow unaligned MMIO access, but fail split page accesses;
 - Explicit switch() in ve_instr_len() helper;
 - Update comments;
v3:
 - Rework load_unaligned_zeropad() fix: fail misaligned MMIO accesses
   instead of upfront exception fixups;
 - Add ve_instr_len() helper to wrap access to ve->instr_len;
 - Add Reviewed-by from Sathya;
v2:
 - Clarify RIP adjustments during #VE handling;
 - Fix early #VE handling;
 - Fix comment and commit message;

Kirill A. Shutemov (3):
  x86/tdx: Fix early #VE handling
  x86/tdx: Clarify RIP adjustments in #VE handler
  x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page

 arch/x86/coco/tdx/tdx.c | 187 +++++++++++++++++++++++++++++-----------
 1 file changed, 136 insertions(+), 51 deletions(-)

-- 
2.35.1



* [PATCHv4 1/3] x86/tdx: Fix early #VE handling
  2022-06-14 12:01 [PATCHv4 0/3] An updated version of the load_unaligned_zeropad() fix Kirill A. Shutemov
@ 2022-06-14 12:01 ` Kirill A. Shutemov
  2022-06-17 21:53   ` [tip: x86/urgent] " tip-bot2 for Kirill A. Shutemov
  2022-06-14 12:01 ` [PATCHv4 2/3] x86/tdx: Clarify RIP adjustments in #VE handler Kirill A. Shutemov
  2022-06-14 12:01 ` [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page Kirill A. Shutemov
  2 siblings, 1 reply; 22+ messages in thread
From: Kirill A. Shutemov @ 2022-06-14 12:01 UTC (permalink / raw)
  To: tglx, mingo, bp, dave.hansen, luto, peterz
  Cc: ak, dan.j.williams, david, hpa, linux-kernel,
	sathyanarayanan.kuppuswamy, seanjc, thomas.lendacky, x86,
	Kirill A. Shutemov

Move RIP in tdx_early_handle_ve() after handling the exception. Failure
to do that leads to an infinite loop of exceptions.
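A minimal user-space sketch of the failure mode (illustrative only, with
made-up names, not kernel code): if the emulated instruction succeeds but
RIP is not advanced, the CPU re-executes the same instruction and takes
the same exception again:

	#include <stdio.h>

	struct regs { unsigned long ip; };

	/* stand-in for handle_io(): emulate the insn at regs->ip */
	static int emulate(struct regs *regs) { (void)regs; return 1; }

	int main(void)
	{
		struct regs regs = { .ip = 0x1000 };
		int i;

		for (i = 0; i < 3; i++) {
			emulate(&regs);
			/* the fix adds "regs.ip += ve.instr_len;" here;
			   without it the insn at 0x1000 traps again --
			   an endless loop of exceptions */
			printf("exception %d handled, ip=%#lx\n", i, regs.ip);
		}
		return 0;
	}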

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: 32e72854fa5f ("x86/tdx: Port I/O: Add early boot support")
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
---
 arch/x86/coco/tdx/tdx.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 03deb4d6920d..faae53f8d559 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -447,13 +447,17 @@ static bool handle_io(struct pt_regs *regs, u32 exit_qual)
 __init bool tdx_early_handle_ve(struct pt_regs *regs)
 {
 	struct ve_info ve;
+	bool ret;
 
 	tdx_get_ve_info(&ve);
 
 	if (ve.exit_reason != EXIT_REASON_IO_INSTRUCTION)
 		return false;
 
-	return handle_io(regs, ve.exit_qual);
+	ret = handle_io(regs, ve.exit_qual);
+	if (ret)
+		regs->ip += ve.instr_len;
+	return ret;
 }
 
 void tdx_get_ve_info(struct ve_info *ve)
-- 
2.35.1



* [PATCHv4 2/3] x86/tdx: Clarify RIP adjustments in #VE handler
  2022-06-14 12:01 [PATCHv4 0/3] An updated version of the load_unaligned_zeropad() fix Kirill A. Shutemov
  2022-06-14 12:01 ` [PATCHv4 1/3] x86/tdx: Fix early #VE handling Kirill A. Shutemov
@ 2022-06-14 12:01 ` Kirill A. Shutemov
  2022-06-17 21:53   ` [tip: x86/urgent] " tip-bot2 for Kirill A. Shutemov
  2022-06-14 12:01 ` [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page Kirill A. Shutemov
  2 siblings, 1 reply; 22+ messages in thread
From: Kirill A. Shutemov @ 2022-06-14 12:01 UTC (permalink / raw)
  To: tglx, mingo, bp, dave.hansen, luto, peterz
  Cc: ak, dan.j.williams, david, hpa, linux-kernel,
	sathyanarayanan.kuppuswamy, seanjc, thomas.lendacky, x86,
	Kirill A. Shutemov

After successful #VE handling, tdx_handle_virt_exception() has to move
RIP to the next instruction. The handler needs to know the length of the
instruction.

If the #VE happened due to instruction execution, the GET_VEINFO TDX
module call provides info on the instruction in R10, including its length.

For a #VE due to an EPT violation, the info in R10 is not usable and the
kernel has to decode the instruction manually to find out its length.

Restructure the code to make it explicit that the instruction length
depends on the type of #VE. The handler of an exit reason now returns the
instruction length on success or -errno on failure.
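A minimal sketch of the new contract (illustrative, made-up names, not
the actual kernel code): a handler reports how many bytes to skip, and
the dispatcher moves RIP only on success:

	#include <errno.h>
	#include <stdio.h>

	struct regs { unsigned long ip; };

	/* returns insn length (>= 0) on success, -errno on failure */
	static int handle_toy(int ok)
	{
		return ok ? 2 : -EIO;
	}

	static int dispatch(struct regs *regs, int ok)
	{
		int insn_len = handle_toy(ok);

		if (insn_len < 0)
			return 0;	/* failed: RIP untouched */

		regs->ip += insn_len;	/* skip the emulated insn */
		return 1;
	}

	int main(void)
	{
		struct regs regs = { .ip = 0x1000 };

		printf("ok:   %d, ip=%#lx\n", dispatch(&regs, 1), regs.ip);
		printf("fail: %d, ip=%#lx\n", dispatch(&regs, 0), regs.ip);
		return 0;
	}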

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/coco/tdx/tdx.c | 178 +++++++++++++++++++++++++++-------------
 1 file changed, 123 insertions(+), 55 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index faae53f8d559..7d6d484a6d28 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -124,6 +124,51 @@ static u64 get_cc_mask(void)
 	return BIT_ULL(gpa_width - 1);
 }
 
+/*
+ * TDX module spec states that #VE may be injected by the Intel TDX module in
+ * several cases:
+ *
+ *  - Emulation of the architectural #VE injection on EPT violation;
+ *
+ *  - As a result of guest TD execution of a disallowed instruction,
+ *    a disallowed MSR access, or CPUID virtualization;
+ *
+ *  - A notification to the guest TD about anomalous behavior;
+ *
+ * The last one is opt-in and is not used by the kernel.
+ *
+ * Intel Software Developer's Manual describes cases when instruction length
+ * field can be used in section "Information for VM Exits Due to Instruction
+ * Execution".
+ *
+ * For TDX, it ultimately means GET_VEINFO provides reliable instruction length
+ * information if #VE occurred due to instruction execution, but not for EPT
+ * violations.
+ */
+static int ve_instr_len(struct ve_info *ve)
+{
+	switch (ve->exit_reason) {
+	case EXIT_REASON_HLT:
+	case EXIT_REASON_MSR_READ:
+	case EXIT_REASON_MSR_WRITE:
+	case EXIT_REASON_CPUID:
+	case EXIT_REASON_IO_INSTRUCTION:
+		/* It is safe to use ve->instr_len for #VE due to instructions */
+		return ve->instr_len;
+	case EXIT_REASON_EPT_VIOLATION:
+		/*
+		 * For EPT violations, ve->instr_len is not defined. For those,
+		 * the kernel must decode instructions manually and should not
+		 * be using this function.
+		 */
+		WARN_ONCE(1, "ve->instr_len is not defined for EPT violations");
+		return 0;
+	default:
+		WARN_ONCE(1, "Unexpected #VE-type: %lld\n", ve->exit_reason);
+		return ve->instr_len;
+	}
+}
+
 static u64 __cpuidle __halt(const bool irq_disabled, const bool do_sti)
 {
 	struct tdx_hypercall_args args = {
@@ -147,7 +192,7 @@ static u64 __cpuidle __halt(const bool irq_disabled, const bool do_sti)
 	return __tdx_hypercall(&args, do_sti ? TDX_HCALL_ISSUE_STI : 0);
 }
 
-static bool handle_halt(void)
+static int handle_halt(struct ve_info *ve)
 {
 	/*
 	 * Since non safe halt is mainly used in CPU offlining
@@ -158,9 +203,9 @@ static bool handle_halt(void)
 	const bool do_sti = false;
 
 	if (__halt(irq_disabled, do_sti))
-		return false;
+		return -EIO;
 
-	return true;
+	return ve_instr_len(ve);
 }
 
 void __cpuidle tdx_safe_halt(void)
@@ -180,7 +225,7 @@ void __cpuidle tdx_safe_halt(void)
 		WARN_ONCE(1, "HLT instruction emulation failed\n");
 }
 
-static bool read_msr(struct pt_regs *regs)
+static int read_msr(struct pt_regs *regs, struct ve_info *ve)
 {
 	struct tdx_hypercall_args args = {
 		.r10 = TDX_HYPERCALL_STANDARD,
@@ -194,14 +239,14 @@ static bool read_msr(struct pt_regs *regs)
 	 * (GHCI), section titled "TDG.VP.VMCALL<Instruction.RDMSR>".
 	 */
 	if (__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT))
-		return false;
+		return -EIO;
 
 	regs->ax = lower_32_bits(args.r11);
 	regs->dx = upper_32_bits(args.r11);
-	return true;
+	return ve_instr_len(ve);
 }
 
-static bool write_msr(struct pt_regs *regs)
+static int write_msr(struct pt_regs *regs, struct ve_info *ve)
 {
 	struct tdx_hypercall_args args = {
 		.r10 = TDX_HYPERCALL_STANDARD,
@@ -215,10 +260,13 @@ static bool write_msr(struct pt_regs *regs)
 	 * can be found in TDX Guest-Host-Communication Interface
 	 * (GHCI) section titled "TDG.VP.VMCALL<Instruction.WRMSR>".
 	 */
-	return !__tdx_hypercall(&args, 0);
+	if (__tdx_hypercall(&args, 0))
+		return -EIO;
+
+	return ve_instr_len(ve);
 }
 
-static bool handle_cpuid(struct pt_regs *regs)
+static int handle_cpuid(struct pt_regs *regs, struct ve_info *ve)
 {
 	struct tdx_hypercall_args args = {
 		.r10 = TDX_HYPERCALL_STANDARD,
@@ -236,7 +284,7 @@ static bool handle_cpuid(struct pt_regs *regs)
 	 */
 	if (regs->ax < 0x40000000 || regs->ax > 0x4FFFFFFF) {
 		regs->ax = regs->bx = regs->cx = regs->dx = 0;
-		return true;
+		return ve_instr_len(ve);
 	}
 
 	/*
@@ -245,7 +293,7 @@ static bool handle_cpuid(struct pt_regs *regs)
 	 * (GHCI), section titled "VP.VMCALL<Instruction.CPUID>".
 	 */
 	if (__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT))
-		return false;
+		return -EIO;
 
 	/*
 	 * As per TDX GHCI CPUID ABI, r12-r15 registers contain contents of
@@ -257,7 +305,7 @@ static bool handle_cpuid(struct pt_regs *regs)
 	regs->cx = args.r14;
 	regs->dx = args.r15;
 
-	return true;
+	return ve_instr_len(ve);
 }
 
 static bool mmio_read(int size, unsigned long addr, unsigned long *val)
@@ -283,7 +331,7 @@ static bool mmio_write(int size, unsigned long addr, unsigned long val)
 			       EPT_WRITE, addr, val);
 }
 
-static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
+static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 {
 	char buffer[MAX_INSN_SIZE];
 	unsigned long *reg, val;
@@ -294,34 +342,36 @@ static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 
 	/* Only in-kernel MMIO is supported */
 	if (WARN_ON_ONCE(user_mode(regs)))
-		return false;
+		return -EFAULT;
 
 	if (copy_from_kernel_nofault(buffer, (void *)regs->ip, MAX_INSN_SIZE))
-		return false;
+		return -EFAULT;
 
 	if (insn_decode(&insn, buffer, MAX_INSN_SIZE, INSN_MODE_64))
-		return false;
+		return -EINVAL;
 
 	mmio = insn_decode_mmio(&insn, &size);
 	if (WARN_ON_ONCE(mmio == MMIO_DECODE_FAILED))
-		return false;
+		return -EINVAL;
 
 	if (mmio != MMIO_WRITE_IMM && mmio != MMIO_MOVS) {
 		reg = insn_get_modrm_reg_ptr(&insn, regs);
 		if (!reg)
-			return false;
+			return -EINVAL;
 	}
 
-	ve->instr_len = insn.length;
-
 	/* Handle writes first */
 	switch (mmio) {
 	case MMIO_WRITE:
 		memcpy(&val, reg, size);
-		return mmio_write(size, ve->gpa, val);
+		if (!mmio_write(size, ve->gpa, val))
+			return -EIO;
+		return insn.length;
 	case MMIO_WRITE_IMM:
 		val = insn.immediate.value;
-		return mmio_write(size, ve->gpa, val);
+		if (!mmio_write(size, ve->gpa, val))
+			return -EIO;
+		return insn.length;
 	case MMIO_READ:
 	case MMIO_READ_ZERO_EXTEND:
 	case MMIO_READ_SIGN_EXTEND:
@@ -334,15 +384,15 @@ static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 		 * decoded or handled properly. It was likely not using io.h
 		 * helpers or accessed MMIO accidentally.
 		 */
-		return false;
+		return -EINVAL;
 	default:
 		WARN_ONCE(1, "Unknown insn_decode_mmio() decode value?");
-		return false;
+		return -EINVAL;
 	}
 
 	/* Handle reads */
 	if (!mmio_read(size, ve->gpa, &val))
-		return false;
+		return -EIO;
 
 	switch (mmio) {
 	case MMIO_READ:
@@ -364,13 +414,13 @@ static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 	default:
 		/* All other cases has to be covered with the first switch() */
 		WARN_ON_ONCE(1);
-		return false;
+		return -EINVAL;
 	}
 
 	if (extend_size)
 		memset(reg, extend_val, extend_size);
 	memcpy(reg, &val, size);
-	return true;
+	return insn.length;
 }
 
 static bool handle_in(struct pt_regs *regs, int size, int port)
@@ -421,13 +471,14 @@ static bool handle_out(struct pt_regs *regs, int size, int port)
  *
  * Return True on success or False on failure.
  */
-static bool handle_io(struct pt_regs *regs, u32 exit_qual)
+static int handle_io(struct pt_regs *regs, struct ve_info *ve)
 {
+	u32 exit_qual = ve->exit_qual;
 	int size, port;
-	bool in;
+	bool in, ret;
 
 	if (VE_IS_IO_STRING(exit_qual))
-		return false;
+		return -EIO;
 
 	in   = VE_IS_IO_IN(exit_qual);
 	size = VE_GET_IO_SIZE(exit_qual);
@@ -435,9 +486,13 @@ static bool handle_io(struct pt_regs *regs, u32 exit_qual)
 
 
 	if (in)
-		return handle_in(regs, size, port);
+		ret = handle_in(regs, size, port);
 	else
-		return handle_out(regs, size, port);
+		ret = handle_out(regs, size, port);
+	if (!ret)
+		return -EIO;
+
+	return ve_instr_len(ve);
 }
 
 /*
@@ -447,17 +502,19 @@ static bool handle_io(struct pt_regs *regs, u32 exit_qual)
 __init bool tdx_early_handle_ve(struct pt_regs *regs)
 {
 	struct ve_info ve;
-	bool ret;
+	int insn_len;
 
 	tdx_get_ve_info(&ve);
 
 	if (ve.exit_reason != EXIT_REASON_IO_INSTRUCTION)
 		return false;
 
-	ret = handle_io(regs, ve.exit_qual);
-	if (ret)
-		regs->ip += ve.instr_len;
-	return ret;
+	insn_len = handle_io(regs, &ve);
+	if (insn_len < 0)
+		return false;
+
+	regs->ip += insn_len;
+	return true;
 }
 
 void tdx_get_ve_info(struct ve_info *ve)
@@ -490,54 +547,65 @@ void tdx_get_ve_info(struct ve_info *ve)
 	ve->instr_info  = upper_32_bits(out.r10);
 }
 
-/* Handle the user initiated #VE */
-static bool virt_exception_user(struct pt_regs *regs, struct ve_info *ve)
+/*
+ * Handle the user initiated #VE.
+ *
+ * On success, returns the number of bytes RIP should be incremented (>=0)
+ * or -errno on error.
+ */
+static int virt_exception_user(struct pt_regs *regs, struct ve_info *ve)
 {
 	switch (ve->exit_reason) {
 	case EXIT_REASON_CPUID:
-		return handle_cpuid(regs);
+		return handle_cpuid(regs, ve);
 	default:
 		pr_warn("Unexpected #VE: %lld\n", ve->exit_reason);
-		return false;
+		return -EIO;
 	}
 }
 
-/* Handle the kernel #VE */
-static bool virt_exception_kernel(struct pt_regs *regs, struct ve_info *ve)
+/*
+ * Handle the kernel #VE.
+ *
+ * On success, returns the number of bytes RIP should be incremented (>=0)
+ * or -errno on error.
+ */
+static int virt_exception_kernel(struct pt_regs *regs, struct ve_info *ve)
 {
 	switch (ve->exit_reason) {
 	case EXIT_REASON_HLT:
-		return handle_halt();
+		return handle_halt(ve);
 	case EXIT_REASON_MSR_READ:
-		return read_msr(regs);
+		return read_msr(regs, ve);
 	case EXIT_REASON_MSR_WRITE:
-		return write_msr(regs);
+		return write_msr(regs, ve);
 	case EXIT_REASON_CPUID:
-		return handle_cpuid(regs);
+		return handle_cpuid(regs, ve);
 	case EXIT_REASON_EPT_VIOLATION:
 		return handle_mmio(regs, ve);
 	case EXIT_REASON_IO_INSTRUCTION:
-		return handle_io(regs, ve->exit_qual);
+		return handle_io(regs, ve);
 	default:
 		pr_warn("Unexpected #VE: %lld\n", ve->exit_reason);
-		return false;
+		return -EIO;
 	}
 }
 
 bool tdx_handle_virt_exception(struct pt_regs *regs, struct ve_info *ve)
 {
-	bool ret;
+	int insn_len;
 
 	if (user_mode(regs))
-		ret = virt_exception_user(regs, ve);
+		insn_len = virt_exception_user(regs, ve);
 	else
-		ret = virt_exception_kernel(regs, ve);
+		insn_len = virt_exception_kernel(regs, ve);
+	if (insn_len < 0)
+		return false;
 
 	/* After successful #VE handling, move the IP */
-	if (ret)
-		regs->ip += ve->instr_len;
+	regs->ip += insn_len;
 
-	return ret;
+	return true;
 }
 
 static bool tdx_tlb_flush_required(bool private)
-- 
2.35.1



* [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-14 12:01 [PATCHv4 0/3] An updated version of the load_unaligned_zeropad() fix Kirill A. Shutemov
  2022-06-14 12:01 ` [PATCHv4 1/3] x86/tdx: Fix early #VE handling Kirill A. Shutemov
  2022-06-14 12:01 ` [PATCHv4 2/3] x86/tdx: Clarify RIP adjustments in #VE handler Kirill A. Shutemov
@ 2022-06-14 12:01 ` Kirill A. Shutemov
  2022-06-15 15:27   ` Dave Hansen
                     ` (4 more replies)
  2 siblings, 5 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2022-06-14 12:01 UTC (permalink / raw)
  To: tglx, mingo, bp, dave.hansen, luto, peterz
  Cc: ak, dan.j.williams, david, hpa, linux-kernel,
	sathyanarayanan.kuppuswamy, seanjc, thomas.lendacky, x86,
	Kirill A. Shutemov

load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
The unwanted loads are typically harmless. But, they might be made to
totally unrelated or even unmapped memory. load_unaligned_zeropad()
relies on exception fixup (#PF, #GP and now #VE) to recover from these
unwanted loads.
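The cross-page mechanics can be sketched from user space (illustrative
only; the real helper is a single 8-byte load with an exception-table
fixup, not a memcpy()):

	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int main(void)
	{
		long page = sysconf(_SC_PAGESIZE);
		/* two adjacent pages; think of the second one as the
		   page that may be shared (or unmapped) in a TDX guest */
		char *buf = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		char *s;
		unsigned long v;

		if (buf == MAP_FAILED)
			return 1;

		/* a 3-byte string ending exactly at the page boundary */
		s = buf + page - 3;
		memcpy(s, "hi", 3);

		/* word-at-a-time load: 5 of the 8 bytes are pulled from
		   the next page even though the string needs only 3 */
		memcpy(&v, s, sizeof(v));
		printf("loaded %#lx\n", v);
		return 0;
	}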

In TDX guests, the second page can be shared page and VMM may configure
it to trigger #VE.

Kernel assumes that #VE on a shared page is MMIO access and tries to
decode instruction to handle it. In case of load_unaligned_zeropad() it
may result in confusion as it is not MMIO access.

Fix it by detecting split page MMIO accesses and fail them.
load_unaligned_zeropad() will recover using exception fixups.

The issue was discovered by analysis. It was not triggered during the
testing.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/coco/tdx/tdx.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 7d6d484a6d28..3bcaf2170ede 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -333,8 +333,8 @@ static bool mmio_write(int size, unsigned long addr, unsigned long val)
 
 static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 {
+	unsigned long *reg, val, vaddr;
 	char buffer[MAX_INSN_SIZE];
-	unsigned long *reg, val;
 	struct insn insn = {};
 	enum mmio_type mmio;
 	int size, extend_size;
@@ -360,6 +360,19 @@ static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 			return -EINVAL;
 	}
 
+	/*
+	 * Reject EPT violation #VEs that split pages.
+	 *
+	 * MMIO accesses suppose to be naturally aligned and therefore never
+	 * cross a page boundary. Seeing split page accesses indicates a bug
+	 * or load_unaligned_zeropad() that steps into unmapped shared page.
+	 *
+	 * load_unaligned_zeropad() will recover using exception fixups.
+	 */
+	vaddr = (unsigned long)insn_get_addr_ref(&insn, regs);
+	if (vaddr / PAGE_SIZE != (vaddr + size) / PAGE_SIZE)
+		return -EFAULT;
+
 	/* Handle writes first */
 	switch (mmio) {
 	case MMIO_WRITE:
-- 
2.35.1



* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-14 12:01 ` [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page Kirill A. Shutemov
@ 2022-06-15 15:27   ` Dave Hansen
  2022-06-15 17:10     ` Kirill A. Shutemov
  2022-06-15 18:12   ` Dave Hansen
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2022-06-15 15:27 UTC (permalink / raw)
  To: Kirill A. Shutemov, tglx, mingo, bp, luto, peterz
  Cc: ak, dan.j.williams, david, hpa, linux-kernel,
	sathyanarayanan.kuppuswamy, seanjc, thomas.lendacky, x86

On 6/14/22 05:01, Kirill A. Shutemov wrote:
> load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
> The unwanted loads are typically harmless. But, they might be made to
> totally unrelated or even unmapped memory. load_unaligned_zeropad()
> relies on exception fixup (#PF, #GP and now #VE) to recover from these
> unwanted loads.
> 
> In TDX guests, the second page can be shared page and VMM may configure
> it to trigger #VE.
> 
> Kernel assumes that #VE on a shared page is MMIO access and tries to
> decode instruction to handle it. In case of load_unaligned_zeropad() it
> may result in confusion as it is not MMIO access.
> 
> Fix it by detecting split page MMIO accesses and fail them.
> load_unaligned_zeropad() will recover using exception fixups.
> 
> The issue was discovered by analysis. It was not triggered during the
> testing.

I thought this whole exercise was kicked off by hitting this in testing.
 Am I remembering this wrong?

> https://lore.kernel.org/all/20220517153444.11195-10-kirill.shutemov@linux.intel.com/

Says:

> This is an actual, real-world problem which was discovered during TDX
> testing.

Or were you considering this a different problem somehow?


* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-15 15:27   ` Dave Hansen
@ 2022-06-15 17:10     ` Kirill A. Shutemov
  2022-06-15 21:55       ` Dave Hansen
  0 siblings, 1 reply; 22+ messages in thread
From: Kirill A. Shutemov @ 2022-06-15 17:10 UTC (permalink / raw)
  To: Dave Hansen
  Cc: tglx, mingo, bp, luto, peterz, ak, dan.j.williams, david, hpa,
	linux-kernel, sathyanarayanan.kuppuswamy, seanjc,
	thomas.lendacky, x86

On Wed, Jun 15, 2022 at 08:27:52AM -0700, Dave Hansen wrote:
> On 6/14/22 05:01, Kirill A. Shutemov wrote:
> > load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
> > The unwanted loads are typically harmless. But, they might be made to
> > totally unrelated or even unmapped memory. load_unaligned_zeropad()
> > relies on exception fixup (#PF, #GP and now #VE) to recover from these
> > unwanted loads.
> > 
> > In TDX guests, the second page can be shared page and VMM may configure
> > it to trigger #VE.
> > 
> > Kernel assumes that #VE on a shared page is MMIO access and tries to
> > decode instruction to handle it. In case of load_unaligned_zeropad() it
> > may result in confusion as it is not MMIO access.
> > 
> > Fix it by detecting split page MMIO accesses and fail them.
> > load_unaligned_zeropad() will recover using exception fixups.
> > 
> > The issue was discovered by analysis. It was not triggered during the
> > testing.
> 
> I thought this whole exercise was kicked off by hitting this in testing.
>  Am I remembering this wrong?
> 
> > https://lore.kernel.org/all/20220517153444.11195-10-kirill.shutemov@linux.intel.com/
> 
> Says:
> 
> > This is an actual, real-world problem which was discovered during TDX
> > testing.
> 
> Or were you considering this a different problem somehow?

They are different.

The patch at the link addresses the issue of load_unaligned_zeropad() stepping
onto unaccepted memory. This was triggered in practice.

This patch addresses stepping onto MMIO shared memory. I had to force the
situation manually, as MMIO memory is mapped with ioremap() and is not next
to normally allocated memory used by load_unaligned_zeropad() (such as the
dentry cache).

Although any shared memory (the SWIOTLB buffer, for instance) can generate
an EPT violation #VE if the VMM is malicious enough.

-- 
 Kirill A. Shutemov


* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-14 12:01 ` [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page Kirill A. Shutemov
  2022-06-15 15:27   ` Dave Hansen
@ 2022-06-15 18:12   ` Dave Hansen
  2022-06-15 22:32     ` Kirill A. Shutemov
  2022-06-15 22:52   ` Kirill A. Shutemov
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2022-06-15 18:12 UTC (permalink / raw)
  To: Kirill A. Shutemov, tglx, mingo, bp, luto, peterz
  Cc: ak, dan.j.williams, david, hpa, linux-kernel,
	sathyanarayanan.kuppuswamy, seanjc, thomas.lendacky, x86

On 6/14/22 05:01, Kirill A. Shutemov wrote:
> load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
> The unwanted loads are typically harmless. But, they might be made to
> totally unrelated or even unmapped memory. load_unaligned_zeropad()
> relies on exception fixup (#PF, #GP and now #VE) to recover from these
> unwanted loads.
> 
> In TDX guests, the second page can be shared page and VMM may configure
> it to trigger #VE.
> 
> Kernel assumes that #VE on a shared page is MMIO access and tries to
> decode instruction to handle it. In case of load_unaligned_zeropad() it
> may result in confusion as it is not MMIO access.
> 
> Fix it by detecting split page MMIO accesses and fail them.
> load_unaligned_zeropad() will recover using exception fixups.
> 
> The issue was discovered by analysis. It was not triggered during the
> testing.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  arch/x86/coco/tdx/tdx.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 7d6d484a6d28..3bcaf2170ede 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -333,8 +333,8 @@ static bool mmio_write(int size, unsigned long addr, unsigned long val)
>  
>  static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
>  {
> +	unsigned long *reg, val, vaddr;
>  	char buffer[MAX_INSN_SIZE];
> -	unsigned long *reg, val;
>  	struct insn insn = {};
>  	enum mmio_type mmio;
>  	int size, extend_size;
> @@ -360,6 +360,19 @@ static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
>  			return -EINVAL;
>  	}
>  
> +	/*
> +	 * Reject EPT violation #VEs that split pages.
> +	 *
> +	 * MMIO accesses suppose to be naturally aligned and therefore never
> +	 * cross a page boundary. Seeing split page accesses indicates a bug
> +	 * or load_unaligned_zeropad() that steps into unmapped shared page.

Isn't this "unmapped" thing a rather superfluous implementation detail?

For the guest, it just needs to know that it *CAN* #VE on access to MMIO
and that it needs to be prepared.  The fact that MMIO is implemented
with TDX shared memory *AND* that "unmapped shared pages" can cause
#VE's seems like too much detail.

Also, is this all precise?  Are literal unmapped shared pages the *ONLY*
thing that a hypervisor can do to cause a #VE?  What about, say, reserved
bits being set in a shared EPT entry?

I was thinking a comment like this might be better:

>         /*
>          * Reject EPT violation #VEs that split pages.
>          *
>          * MMIO accesses are supposed to be naturally aligned and therefore
>          * never cross page boundaries. Seeing split page accesses indicates
>          * a bug or a load_unaligned_zeropad() that stepped into an MMIO page.
>          *
>          * load_unaligned_zeropad() will recover using exception fixups.
>          */




* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-15 17:10     ` Kirill A. Shutemov
@ 2022-06-15 21:55       ` Dave Hansen
  2022-06-15 22:43         ` Kirill A. Shutemov
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2022-06-15 21:55 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: tglx, mingo, bp, luto, peterz, ak, dan.j.williams, david, hpa,
	linux-kernel, sathyanarayanan.kuppuswamy, seanjc,
	thomas.lendacky, x86

On 6/15/22 10:10, Kirill A. Shutemov wrote:
>> I thought this whole exercise was kicked off by hitting this in testing.
>>  Am I remembering this wrong?
>>
>>> https://lore.kernel.org/all/20220517153444.11195-10-kirill.shutemov@linux.intel.com/
>> Says:
>>
>>> This is an actual, real-world problem which was discovered during TDX
>>> testing.
>> Or were you considering this a different problem somehow?
> They are different.
> 
> The patch at the link addresses the issue of load_unaligned_zeropad() stepping
> onto unaccepted memory. This was triggered in practice.

OK, so we've got two problems both triggered by
load_unaligned_zeropad(), but where the fixes are different.  We
actually hit the "unaccepted memory one" in testing, but that made us
think about other problems in the area and that's where this one came up.



* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-15 18:12   ` Dave Hansen
@ 2022-06-15 22:32     ` Kirill A. Shutemov
  2022-06-15 22:36       ` Dave Hansen
  0 siblings, 1 reply; 22+ messages in thread
From: Kirill A. Shutemov @ 2022-06-15 22:32 UTC (permalink / raw)
  To: Dave Hansen
  Cc: tglx, mingo, bp, luto, peterz, ak, dan.j.williams, david, hpa,
	linux-kernel, sathyanarayanan.kuppuswamy, seanjc,
	thomas.lendacky, x86

On Wed, Jun 15, 2022 at 11:12:35AM -0700, Dave Hansen wrote:
> On 6/14/22 05:01, Kirill A. Shutemov wrote:
> > load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
> > The unwanted loads are typically harmless. But, they might be made to
> > totally unrelated or even unmapped memory. load_unaligned_zeropad()
> > relies on exception fixup (#PF, #GP and now #VE) to recover from these
> > unwanted loads.
> > 
> > In TDX guests, the second page can be shared page and VMM may configure
> > it to trigger #VE.
> > 
> > Kernel assumes that #VE on a shared page is MMIO access and tries to
> > decode instruction to handle it. In case of load_unaligned_zeropad() it
> > may result in confusion as it is not MMIO access.
> > 
> > Fix it by detecting split page MMIO accesses and fail them.
> > load_unaligned_zeropad() will recover using exception fixups.
> > 
> > The issue was discovered by analysis. It was not triggered during the
> > testing.
> > 
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > ---
> >  arch/x86/coco/tdx/tdx.c | 15 ++++++++++++++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> > index 7d6d484a6d28..3bcaf2170ede 100644
> > --- a/arch/x86/coco/tdx/tdx.c
> > +++ b/arch/x86/coco/tdx/tdx.c
> > @@ -333,8 +333,8 @@ static bool mmio_write(int size, unsigned long addr, unsigned long val)
> >  
> >  static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
> >  {
> > +	unsigned long *reg, val, vaddr;
> >  	char buffer[MAX_INSN_SIZE];
> > -	unsigned long *reg, val;
> >  	struct insn insn = {};
> >  	enum mmio_type mmio;
> >  	int size, extend_size;
> > @@ -360,6 +360,19 @@ static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
> >  			return -EINVAL;
> >  	}
> >  
> > +	/*
> > +	 * Reject EPT violation #VEs that split pages.
> > +	 *
> > +	 * MMIO accesses suppose to be naturally aligned and therefore never
> > +	 * cross a page boundary. Seeing split page accesses indicates a bug
> > +	 * or load_unaligned_zeropad() that steps into unmapped shared page.
> 
> Isn't this "unmapped" thing a rather superfluous implementation detail?
> 
> For the guest, it just needs to know that it *CAN* #VE on access to MMIO
> and that it needs to be prepared.  The fact that MMIO is implemented
> with TDX shared memory *AND* that "unmapped shared pages" can cause
> #VE's seems like too much detail.

Okay, fair enough.

> Also, is this all precise?  Are literal unmapped shared pages the *ONLY*
> thing that a hypervisor can do to cause a #VE?  What about, say, reserved
> bits being set in a shared EPT entry?

Right, it is analogous to a page fault. So, yes, it can be triggered for
a number of reasons, not only an unmapped page.

> I was thinking a comment like this might be better:
> 
> >         /*
> >          * Reject EPT violation #VEs that split pages.
> >          *
> >          * MMIO accesses are supposed to be naturally aligned and therefore
> >          * never cross page boundaries. Seeing split page accesses indicates
> >          * a bug or a load_unaligned_zeropad() that stepped into an MMIO page.
> >          *
> >          * load_unaligned_zeropad() will recover using exception fixups.
> >          */

Looks good, thanks.

-- 
 Kirill A. Shutemov


* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-15 22:32     ` Kirill A. Shutemov
@ 2022-06-15 22:36       ` Dave Hansen
  2022-06-15 23:16         ` Kirill A. Shutemov
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2022-06-15 22:36 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: tglx, mingo, bp, luto, peterz, ak, dan.j.williams, david, hpa,
	linux-kernel, sathyanarayanan.kuppuswamy, seanjc,
	thomas.lendacky, x86

On 6/15/22 15:32, Kirill A. Shutemov wrote:
>>>         /*
>>>          * Reject EPT violation #VEs that split pages.
>>>          *
>>>          * MMIO accesses are supposed to be naturally aligned and therefore
>>>          * never cross page boundaries. Seeing split page accesses indicates
>>>          * a bug or a load_unaligned_zeropad() that stepped into an MMIO page.
>>>          *
>>>          * load_unaligned_zeropad() will recover using exception fixups.
>>>          */
> Looks good, thanks.

I've got that snippet and a few other things staged here:

> https://git.kernel.org/pub/scm/linux/kernel/git/daveh/devel.git/log/?h=testme

Could you take a quick look through before I push them somewhere for real?


* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-15 21:55       ` Dave Hansen
@ 2022-06-15 22:43         ` Kirill A. Shutemov
  2022-06-15 22:49           ` Dave Hansen
  0 siblings, 1 reply; 22+ messages in thread
From: Kirill A. Shutemov @ 2022-06-15 22:43 UTC (permalink / raw)
  To: Dave Hansen
  Cc: tglx, mingo, bp, luto, peterz, ak, dan.j.williams, david, hpa,
	linux-kernel, sathyanarayanan.kuppuswamy, seanjc,
	thomas.lendacky, x86

On Wed, Jun 15, 2022 at 02:55:13PM -0700, Dave Hansen wrote:
> On 6/15/22 10:10, Kirill A. Shutemov wrote:
> >> I thought this whole exercise was kicked off by hitting this in testing.
> >>  Am I remembering this wrong?
> >>
> >>> https://lore.kernel.org/all/20220517153444.11195-10-kirill.shutemov@linux.intel.com/
> >> Says:
> >>
> >>> This is an actual, real-world problem which was discovered during TDX
> >>> testing.
> >> Or were you considering this a different problem somehow?
> > They are different.
> > 
> > The patch at the link addresses the issue of load_unaligned_zeropad() stepping
> > onto unaccepted memory. This was triggered in practice.
> 
> OK, so we've got two problems both triggered by
> load_unaligned_zeropad(), but where the fixes are different.  We
> actually hit the "unaccepted memory one" in testing, but that made us
> think about other problems in the area and that's where this one came up.

I will reword it like this:

	The issue was discovered by analysis after triggering another issue with
	load_unaligned_zeropad().

Works for you?

-- 
 Kirill A. Shutemov


* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-15 22:43         ` Kirill A. Shutemov
@ 2022-06-15 22:49           ` Dave Hansen
  2022-06-15 23:16             ` Kirill A. Shutemov
  0 siblings, 1 reply; 22+ messages in thread
From: Dave Hansen @ 2022-06-15 22:49 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: tglx, mingo, bp, luto, peterz, ak, dan.j.williams, david, hpa,
	linux-kernel, sathyanarayanan.kuppuswamy, seanjc,
	thomas.lendacky, x86

On 6/15/22 15:43, Kirill A. Shutemov wrote:
> I will reword it like this:
> 
> 	The issue was discovered by analysis after triggering another issue with
> 	load_unaligned_zeropad().

Yeah, that sounds sane.  I'm also happy to shove this into the commit
message before I push it out.


* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-14 12:01 ` [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page Kirill A. Shutemov
  2022-06-15 15:27   ` Dave Hansen
  2022-06-15 18:12   ` Dave Hansen
@ 2022-06-15 22:52   ` Kirill A. Shutemov
  2022-06-15 23:34     ` Dave Hansen
  2022-06-17 21:53   ` [tip: x86/urgent] " tip-bot2 for Kirill A. Shutemov
  2022-06-17 22:41   ` tip-bot2 for Kirill A. Shutemov
  4 siblings, 1 reply; 22+ messages in thread
From: Kirill A. Shutemov @ 2022-06-15 22:52 UTC (permalink / raw)
  To: tglx, mingo, bp, dave.hansen, luto, peterz
  Cc: ak, dan.j.williams, david, hpa, linux-kernel,
	sathyanarayanan.kuppuswamy, seanjc, thomas.lendacky, x86

On Tue, Jun 14, 2022 at 03:01:35PM +0300, Kirill A. Shutemov wrote:
> load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
> The unwanted loads are typically harmless. But, they might be made to
> totally unrelated or even unmapped memory. load_unaligned_zeropad()
> relies on exception fixup (#PF, #GP and now #VE) to recover from these
> unwanted loads.
> 
> In TDX guests, the second page can be shared page and VMM may configure
> it to trigger #VE.
> 
> Kernel assumes that #VE on a shared page is MMIO access and tries to
> decode instruction to handle it. In case of load_unaligned_zeropad() it
> may result in confusion as it is not MMIO access.
> 
> Fix it by detecting split page MMIO accesses and fail them.
> load_unaligned_zeropad() will recover using exception fixups.
> 
> The issue was discovered by analysis. It was not triggered during the
> testing.
> 
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> ---
>  arch/x86/coco/tdx/tdx.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
> index 7d6d484a6d28..3bcaf2170ede 100644
> --- a/arch/x86/coco/tdx/tdx.c
> +++ b/arch/x86/coco/tdx/tdx.c
> @@ -333,8 +333,8 @@ static bool mmio_write(int size, unsigned long addr, unsigned long val)
>  
>  static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
>  {
> +	unsigned long *reg, val, vaddr;
>  	char buffer[MAX_INSN_SIZE];
> -	unsigned long *reg, val;
>  	struct insn insn = {};
>  	enum mmio_type mmio;
>  	int size, extend_size;
> @@ -360,6 +360,19 @@ static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
>  			return -EINVAL;
>  	}
>  
> +	/*
> +	 * Reject EPT violation #VEs that split pages.
> +	 *
> +	 * MMIO accesses suppose to be naturally aligned and therefore never
> +	 * cross a page boundary. Seeing split page accesses indicates a bug
> +	 * or load_unaligned_zeropad() that steps into unmapped shared page.
> +	 *
> +	 * load_unaligned_zeropad() will recover using exception fixups.
> +	 */
> +	vaddr = (unsigned long)insn_get_addr_ref(&insn, regs);
> +	if (vaddr / PAGE_SIZE != (vaddr + size) / PAGE_SIZE)

Oops. I just realized it has an off-by-one. It is supposed to be:

	if (vaddr / PAGE_SIZE != (vaddr + size - 1) / PAGE_SIZE)

> +		return -EFAULT;
> +
>  	/* Handle writes first */
>  	switch (mmio) {
>  	case MMIO_WRITE:
> -- 
> 2.35.1
> 

-- 
 Kirill A. Shutemov


* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-15 22:36       ` Dave Hansen
@ 2022-06-15 23:16         ` Kirill A. Shutemov
  0 siblings, 0 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2022-06-15 23:16 UTC (permalink / raw)
  To: Dave Hansen
  Cc: tglx, mingo, bp, luto, peterz, ak, dan.j.williams, david, hpa,
	linux-kernel, sathyanarayanan.kuppuswamy, seanjc,
	thomas.lendacky, x86

On Wed, Jun 15, 2022 at 03:36:02PM -0700, Dave Hansen wrote:
> On 6/15/22 15:32, Kirill A. Shutemov wrote:
> >>>         /*
> >>>          * Reject EPT violation #VEs that split pages.
> >>>          *
> >>>          * MMIO accesses are supposed to be naturally aligned and therefore
> >>>          * never cross page boundaries. Seeing split page accesses indicates
> >>>          * a bug or a load_unaligned_zeropad() that stepped into an MMIO page.
> >>>          *
> >>>          * load_unaligned_zeropad() will recover using exception fixups.
> >>>          */
> > Looks good, thanks.
> 
> I've got that snippet and a few other things staged here:
> 
> > https://git.kernel.org/pub/scm/linux/kernel/git/daveh/devel.git/log/?h=testme
> 
> Could you take a quick look through before I push them somewhere for real?

Looks good to me. Could you fold in the off-by-one fix? 

-- 
 Kirill A. Shutemov


* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-15 22:49           ` Dave Hansen
@ 2022-06-15 23:16             ` Kirill A. Shutemov
  0 siblings, 0 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2022-06-15 23:16 UTC (permalink / raw)
  To: Dave Hansen
  Cc: tglx, mingo, bp, luto, peterz, ak, dan.j.williams, david, hpa,
	linux-kernel, sathyanarayanan.kuppuswamy, seanjc,
	thomas.lendacky, x86

On Wed, Jun 15, 2022 at 03:49:49PM -0700, Dave Hansen wrote:
> On 6/15/22 15:43, Kirill A. Shutemov wrote:
> > I will reword it like this:
> > 
> > 	The issue was discovered by analysis after triggering other issue with
> > 	load_unaligned_zeropad().
> 
> Yeah, that sounds sane.  I'm also happy to shove this into the commit
> message before I push it out.

It is fine as is in your tree.

-- 
 Kirill A. Shutemov


* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-15 22:52   ` Kirill A. Shutemov
@ 2022-06-15 23:34     ` Dave Hansen
  2022-06-16  1:19       ` Kirill A. Shutemov
  2022-06-16  6:35       ` David Laight
  0 siblings, 2 replies; 22+ messages in thread
From: Dave Hansen @ 2022-06-15 23:34 UTC (permalink / raw)
  To: Kirill A. Shutemov, tglx, mingo, bp, luto, peterz
  Cc: ak, dan.j.williams, david, hpa, linux-kernel,
	sathyanarayanan.kuppuswamy, seanjc, thomas.lendacky, x86

On 6/15/22 15:52, Kirill A. Shutemov wrote:
>> +	vaddr = (unsigned long)insn_get_addr_ref(&insn, regs);
>> +	if (vaddr / PAGE_SIZE != (vaddr + size) / PAGE_SIZE)
> Oops. I just realized it has an off-by-one. It is supposed to be:
> 
> 	if (vaddr / PAGE_SIZE != (vaddr + size - 1) / PAGE_SIZE)

That was bugging me.  Glad you caught this.

Wouldn't this be more obviously correct?

	if (ALIGN_DOWN(vaddr,        PAGE_SIZE) !=
	    ALIGN_DOWN(vaddr + size, PAGE_SIZE))
		...

I don't think we have a PAGE_ALIGN_DOWN().


* Re: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-15 23:34     ` Dave Hansen
@ 2022-06-16  1:19       ` Kirill A. Shutemov
  2022-06-16  6:35       ` David Laight
  1 sibling, 0 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2022-06-16  1:19 UTC (permalink / raw)
  To: Dave Hansen
  Cc: tglx, mingo, bp, luto, peterz, ak, dan.j.williams, david, hpa,
	linux-kernel, sathyanarayanan.kuppuswamy, seanjc,
	thomas.lendacky, x86

On Wed, Jun 15, 2022 at 04:34:48PM -0700, Dave Hansen wrote:
> On 6/15/22 15:52, Kirill A. Shutemov wrote:
> >> +	vaddr = (unsigned long)insn_get_addr_ref(&insn, regs);
> >> +	if (vaddr / PAGE_SIZE != (vaddr + size) / PAGE_SIZE)
> > Oops. I just realized it has an off-by-one. It is supposed to be:
> > 
> > 	if (vaddr / PAGE_SIZE != (vaddr + size - 1) / PAGE_SIZE)
> 
> That was bugging me.  Glad you caught this.
> 
> Wouldn't this be more obviously correct?
> 
> 	if (ALIGN_DOWN(vaddr,        PAGE_SIZE) !=
> 	    ALIGN_DOWN(vaddr + size, PAGE_SIZE))
> 		...

I don't think it fixes anything.

Consider the case when vaddr is 4092 and size is 4. This is a legitimate
access -- aligned and not a page-crosser.

The left side of the check will evaluate to 0 and the right to 1,
which is wrong and will get us -EFAULT.

I cannot think of a helper that would fit the need.
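For what it's worth, a quick user-space check of the three variants
against both boundary cases (a sketch, PAGE_SIZE hardcoded to 4096):

	#include <stdio.h>

	#define PAGE_SIZE	4096UL
	#define ALIGN_DOWN(x, a)	((x) & ~((a) - 1))

	static int orig(unsigned long vaddr, int size)
	{
		return vaddr / PAGE_SIZE != (vaddr + size) / PAGE_SIZE;
	}

	static int fixed(unsigned long vaddr, int size)
	{
		return vaddr / PAGE_SIZE != (vaddr + size - 1) / PAGE_SIZE;
	}

	static int align_down(unsigned long vaddr, int size)
	{
		return ALIGN_DOWN(vaddr, PAGE_SIZE) !=
		       ALIGN_DOWN(vaddr + size, PAGE_SIZE);
	}

	int main(void)
	{
		/* 4092: aligned access ending at the boundary (legal);
		   4094: a genuine page-crosser */
		unsigned long v[] = { 4092, 4094 };
		int i;

		for (i = 0; i < 2; i++)
			printf("vaddr=%lu: orig=%d fixed=%d align_down=%d\n",
			       v[i], orig(v[i], 4), fixed(v[i], 4),
			       align_down(v[i], 4));
		return 0;
	}

Only the "- 1" form accepts the aligned access at 4092 while still
rejecting the crosser at 4094; the ALIGN_DOWN() variant rejects both,
matching the analysis above.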

-- 
 Kirill A. Shutemov


* RE: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-15 23:34     ` Dave Hansen
  2022-06-16  1:19       ` Kirill A. Shutemov
@ 2022-06-16  6:35       ` David Laight
  1 sibling, 0 replies; 22+ messages in thread
From: David Laight @ 2022-06-16  6:35 UTC (permalink / raw)
  To: 'Dave Hansen', Kirill A. Shutemov, tglx, mingo, bp, luto, peterz
  Cc: ak, dan.j.williams, david, hpa, linux-kernel,
	sathyanarayanan.kuppuswamy, seanjc, thomas.lendacky, x86

From: Dave Hansen
> Sent: 16 June 2022 00:35
> 
> On 6/15/22 15:52, Kirill A. Shutemov wrote:
> >> +	vaddr = (unsigned long)insn_get_addr_ref(&insn, regs);
> >> +	if (vaddr / PAGE_SIZE != (vaddr + size) / PAGE_SIZE)
> > Oops. I just realized it has an off-by-one. It is supposed to be:
> >
> > 	if (vaddr / PAGE_SIZE != (vaddr + size - 1) / PAGE_SIZE)
> 
> That was bugging me.  Glad you caught this.
> 
> Wouldn't this be more obviously correct?
> 
> 	if (ALIGN_DOWN(vaddr,        PAGE_SIZE) !=
> 	    ALIGN_DOWN(vaddr + size, PAGE_SIZE))
> 		...
> 
> I don't think we have a PAGE_ALIGN_DOWN().

Or:
	if ((vaddr ^ (vaddr + size - 1)) >> PAGE_SHIFT)
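This variant passes the same boundary cases, since two byte addresses lie
on the same page exactly when they agree in all bits at and above
PAGE_SHIFT. A quick check (sketch, PAGE_SHIFT hardcoded to 12):

	#include <stdio.h>

	#define PAGE_SHIFT	12

	static int crosses(unsigned long vaddr, int size)
	{
		return ((vaddr ^ (vaddr + size - 1)) >> PAGE_SHIFT) != 0;
	}

	int main(void)
	{
		printf("4092/4 -> %d\n", crosses(4092, 4));	/* 0 */
		printf("4094/4 -> %d\n", crosses(4094, 4));	/* 1 */
		return 0;
	}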

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


* [tip: x86/urgent] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-14 12:01 ` [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page Kirill A. Shutemov
                     ` (2 preceding siblings ...)
  2022-06-15 22:52   ` Kirill A. Shutemov
@ 2022-06-17 21:53   ` tip-bot2 for Kirill A. Shutemov
  2022-06-17 22:41   ` tip-bot2 for Kirill A. Shutemov
  4 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Kirill A. Shutemov @ 2022-06-17 21:53 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Kirill A. Shutemov, Dave Hansen, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     ceba767b943de2128eaef95e19880809274ac35d
Gitweb:        https://git.kernel.org/tip/ceba767b943de2128eaef95e19880809274ac35d
Author:        Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate:    Tue, 14 Jun 2022 15:01:35 +03:00
Committer:     Dave Hansen <dave.hansen@linux.intel.com>
CommitterDate: Fri, 17 Jun 2022 14:30:20 -07:00

x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page

load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
The unwanted loads are typically harmless. But, they might be made to
totally unrelated or even unmapped memory. load_unaligned_zeropad()
relies on exception fixup (#PF, #GP and now #VE) to recover from these
unwanted loads.

In TDX guests, the second page can be a shared page and a VMM may configure
it to trigger #VE.

The kernel assumes that #VE on a shared page is an MMIO access and tries to
decode the instruction to handle it. In the case of load_unaligned_zeropad()
this may result in confusion as it is not an MMIO access.

Fix it by detecting split page MMIO accesses and failing them.
load_unaligned_zeropad() will recover using exception fixups.

The issue was discovered by analysis and reproduced artificially. It was
not triggered during testing.

[ dhansen: fix up changelogs and comments for grammar and clarity,
	   plus incorporate Kirill's off-by-one fix]

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220614120135.14812-4-kirill.shutemov@linux.intel.com
---
 arch/x86/coco/tdx/tdx.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index c8d44f4..d5c51c9 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -333,8 +333,8 @@ static bool mmio_write(int size, unsigned long addr, unsigned long val)
 
 static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 {
+	unsigned long *reg, val, vaddr;
 	char buffer[MAX_INSN_SIZE];
-	unsigned long *reg, val;
 	struct insn insn = {};
 	enum mmio_type mmio;
 	int size, extend_size;
@@ -360,6 +360,19 @@ static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 			return -EINVAL;
 	}
 
+	/*
+	 * Reject EPT violation #VEs that split pages.
+	 *
+	 * MMIO accesses are supposed to be naturally aligned and therefore
+	 * never cross page boundaries. Seeing split page accesses indicates
+	 * a bug or a load_unaligned_zeropad() that stepped into an MMIO page.
+	 *
+	 * load_unaligned_zeropad() will recover using exception fixups.
+	 */
+	vaddr = (unsigned long)insn_get_addr_ref(&insn, regs);
+	if (vaddr / PAGE_SIZE != (vaddr + size) / PAGE_SIZE)
+		return -EFAULT;
+
 	/* Handle writes first */
 	switch (mmio) {
 	case MMIO_WRITE:


* [tip: x86/urgent] x86/tdx: Clarify RIP adjustments in #VE handler
  2022-06-14 12:01 ` [PATCHv4 2/3] x86/tdx: Clarify RIP adjustments in #VE handler Kirill A. Shutemov
@ 2022-06-17 21:53   ` tip-bot2 for Kirill A. Shutemov
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Kirill A. Shutemov @ 2022-06-17 21:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Dave Hansen, Kirill A. Shutemov, Dave Hansen, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     cdd85786f4b3b9273e4376e69aa95a2d71722764
Gitweb:        https://git.kernel.org/tip/cdd85786f4b3b9273e4376e69aa95a2d71722764
Author:        Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate:    Tue, 14 Jun 2022 15:01:34 +03:00
Committer:     Dave Hansen <dave.hansen@linux.intel.com>
CommitterDate: Wed, 15 Jun 2022 11:05:16 -07:00

x86/tdx: Clarify RIP adjustments in #VE handler

After successful #VE handling, tdx_handle_virt_exception() has to move
RIP to the next instruction. The handler needs to know the length of the
instruction.

If the #VE happened due to instruction execution, the GET_VEINFO TDX
module call provides info on the instruction in R10, including its length.

For a #VE due to an EPT violation, the info in R10 is not populated and the
kernel must decode the instruction manually to find out its length.

Restructure the code to make it explicit that the instruction length
depends on the type of #VE. Make individual #VE handlers return
the instruction length on success or -errno on failure.

[ dhansen: fix up changelog and comments ]

Suggested-by: Dave Hansen <dave.hansen@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220614120135.14812-3-kirill.shutemov@linux.intel.com
---
 arch/x86/coco/tdx/tdx.c | 178 ++++++++++++++++++++++++++-------------
 1 file changed, 123 insertions(+), 55 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index faae53f..c8d44f4 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -124,6 +124,51 @@ static u64 get_cc_mask(void)
 	return BIT_ULL(gpa_width - 1);
 }
 
+/*
+ * The TDX module spec states that #VE may be injected for a limited set of
+ * reasons:
+ *
+ *  - Emulation of the architectural #VE injection on EPT violation;
+ *
+ *  - As a result of guest TD execution of a disallowed instruction,
+ *    a disallowed MSR access, or CPUID virtualization;
+ *
+ *  - A notification to the guest TD about anomalous behavior;
+ *
+ * The last one is opt-in and is not used by the kernel.
+ *
+ * The Intel Software Developer's Manual describes cases when instruction
+ * length field can be used in section "Information for VM Exits Due to
+ * Instruction Execution".
+ *
+ * For TDX, it ultimately means GET_VEINFO provides reliable instruction length
+ * information if #VE occurred due to instruction execution, but not for EPT
+ * violations.
+ */
+static int ve_instr_len(struct ve_info *ve)
+{
+	switch (ve->exit_reason) {
+	case EXIT_REASON_HLT:
+	case EXIT_REASON_MSR_READ:
+	case EXIT_REASON_MSR_WRITE:
+	case EXIT_REASON_CPUID:
+	case EXIT_REASON_IO_INSTRUCTION:
+		/* It is safe to use ve->instr_len for #VE due to instructions */
+		return ve->instr_len;
+	case EXIT_REASON_EPT_VIOLATION:
+		/*
+		 * For EPT violations, ve->instr_len is not defined. For those,
+		 * the kernel must decode instructions manually and should not
+		 * be using this function.
+		 */
+		WARN_ONCE(1, "ve->instr_len is not defined for EPT violations");
+		return 0;
+	default:
+		WARN_ONCE(1, "Unexpected #VE-type: %lld\n", ve->exit_reason);
+		return ve->instr_len;
+	}
+}
+
 static u64 __cpuidle __halt(const bool irq_disabled, const bool do_sti)
 {
 	struct tdx_hypercall_args args = {
@@ -147,7 +192,7 @@ static u64 __cpuidle __halt(const bool irq_disabled, const bool do_sti)
 	return __tdx_hypercall(&args, do_sti ? TDX_HCALL_ISSUE_STI : 0);
 }
 
-static bool handle_halt(void)
+static int handle_halt(struct ve_info *ve)
 {
 	/*
 	 * Since non safe halt is mainly used in CPU offlining
@@ -158,9 +203,9 @@ static bool handle_halt(void)
 	const bool do_sti = false;
 
 	if (__halt(irq_disabled, do_sti))
-		return false;
+		return -EIO;
 
-	return true;
+	return ve_instr_len(ve);
 }
 
 void __cpuidle tdx_safe_halt(void)
@@ -180,7 +225,7 @@ void __cpuidle tdx_safe_halt(void)
 		WARN_ONCE(1, "HLT instruction emulation failed\n");
 }
 
-static bool read_msr(struct pt_regs *regs)
+static int read_msr(struct pt_regs *regs, struct ve_info *ve)
 {
 	struct tdx_hypercall_args args = {
 		.r10 = TDX_HYPERCALL_STANDARD,
@@ -194,14 +239,14 @@ static bool read_msr(struct pt_regs *regs)
 	 * (GHCI), section titled "TDG.VP.VMCALL<Instruction.RDMSR>".
 	 */
 	if (__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT))
-		return false;
+		return -EIO;
 
 	regs->ax = lower_32_bits(args.r11);
 	regs->dx = upper_32_bits(args.r11);
-	return true;
+	return ve_instr_len(ve);
 }
 
-static bool write_msr(struct pt_regs *regs)
+static int write_msr(struct pt_regs *regs, struct ve_info *ve)
 {
 	struct tdx_hypercall_args args = {
 		.r10 = TDX_HYPERCALL_STANDARD,
@@ -215,10 +260,13 @@ static bool write_msr(struct pt_regs *regs)
 	 * can be found in TDX Guest-Host-Communication Interface
 	 * (GHCI) section titled "TDG.VP.VMCALL<Instruction.WRMSR>".
 	 */
-	return !__tdx_hypercall(&args, 0);
+	if (__tdx_hypercall(&args, 0))
+		return -EIO;
+
+	return ve_instr_len(ve);
 }
 
-static bool handle_cpuid(struct pt_regs *regs)
+static int handle_cpuid(struct pt_regs *regs, struct ve_info *ve)
 {
 	struct tdx_hypercall_args args = {
 		.r10 = TDX_HYPERCALL_STANDARD,
@@ -236,7 +284,7 @@ static bool handle_cpuid(struct pt_regs *regs)
 	 */
 	if (regs->ax < 0x40000000 || regs->ax > 0x4FFFFFFF) {
 		regs->ax = regs->bx = regs->cx = regs->dx = 0;
-		return true;
+		return ve_instr_len(ve);
 	}
 
 	/*
@@ -245,7 +293,7 @@ static bool handle_cpuid(struct pt_regs *regs)
 	 * (GHCI), section titled "VP.VMCALL<Instruction.CPUID>".
 	 */
 	if (__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT))
-		return false;
+		return -EIO;
 
 	/*
 	 * As per TDX GHCI CPUID ABI, r12-r15 registers contain contents of
@@ -257,7 +305,7 @@ static bool handle_cpuid(struct pt_regs *regs)
 	regs->cx = args.r14;
 	regs->dx = args.r15;
 
-	return true;
+	return ve_instr_len(ve);
 }
 
 static bool mmio_read(int size, unsigned long addr, unsigned long *val)
@@ -283,7 +331,7 @@ static bool mmio_write(int size, unsigned long addr, unsigned long val)
 			       EPT_WRITE, addr, val);
 }
 
-static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
+static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 {
 	char buffer[MAX_INSN_SIZE];
 	unsigned long *reg, val;
@@ -294,34 +342,36 @@ static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 
 	/* Only in-kernel MMIO is supported */
 	if (WARN_ON_ONCE(user_mode(regs)))
-		return false;
+		return -EFAULT;
 
 	if (copy_from_kernel_nofault(buffer, (void *)regs->ip, MAX_INSN_SIZE))
-		return false;
+		return -EFAULT;
 
 	if (insn_decode(&insn, buffer, MAX_INSN_SIZE, INSN_MODE_64))
-		return false;
+		return -EINVAL;
 
 	mmio = insn_decode_mmio(&insn, &size);
 	if (WARN_ON_ONCE(mmio == MMIO_DECODE_FAILED))
-		return false;
+		return -EINVAL;
 
 	if (mmio != MMIO_WRITE_IMM && mmio != MMIO_MOVS) {
 		reg = insn_get_modrm_reg_ptr(&insn, regs);
 		if (!reg)
-			return false;
+			return -EINVAL;
 	}
 
-	ve->instr_len = insn.length;
-
 	/* Handle writes first */
 	switch (mmio) {
 	case MMIO_WRITE:
 		memcpy(&val, reg, size);
-		return mmio_write(size, ve->gpa, val);
+		if (!mmio_write(size, ve->gpa, val))
+			return -EIO;
+		return insn.length;
 	case MMIO_WRITE_IMM:
 		val = insn.immediate.value;
-		return mmio_write(size, ve->gpa, val);
+		if (!mmio_write(size, ve->gpa, val))
+			return -EIO;
+		return insn.length;
 	case MMIO_READ:
 	case MMIO_READ_ZERO_EXTEND:
 	case MMIO_READ_SIGN_EXTEND:
@@ -334,15 +384,15 @@ static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 		 * decoded or handled properly. It was likely not using io.h
 		 * helpers or accessed MMIO accidentally.
 		 */
-		return false;
+		return -EINVAL;
 	default:
 		WARN_ONCE(1, "Unknown insn_decode_mmio() decode value?");
-		return false;
+		return -EINVAL;
 	}
 
 	/* Handle reads */
 	if (!mmio_read(size, ve->gpa, &val))
-		return false;
+		return -EIO;
 
 	switch (mmio) {
 	case MMIO_READ:
@@ -364,13 +414,13 @@ static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 	default:
-		/* All other cases has to be covered with the first switch() */
+		/* All other cases have to be covered by the first switch() */
 		WARN_ON_ONCE(1);
-		return false;
+		return -EINVAL;
 	}
 
 	if (extend_size)
 		memset(reg, extend_val, extend_size);
 	memcpy(reg, &val, size);
-	return true;
+	return insn.length;
 }
 
 static bool handle_in(struct pt_regs *regs, int size, int port)
@@ -421,13 +471,14 @@ static bool handle_out(struct pt_regs *regs, int size, int port)
  *
- * Return True on success or False on failure.
+ * Return the instruction length on success or -errno on failure.
  */
-static bool handle_io(struct pt_regs *regs, u32 exit_qual)
+static int handle_io(struct pt_regs *regs, struct ve_info *ve)
 {
+	u32 exit_qual = ve->exit_qual;
 	int size, port;
-	bool in;
+	bool in, ret;
 
 	if (VE_IS_IO_STRING(exit_qual))
-		return false;
+		return -EIO;
 
 	in   = VE_IS_IO_IN(exit_qual);
 	size = VE_GET_IO_SIZE(exit_qual);
@@ -435,9 +486,13 @@ static bool handle_io(struct pt_regs *regs, u32 exit_qual)
 
 
 	if (in)
-		return handle_in(regs, size, port);
+		ret = handle_in(regs, size, port);
 	else
-		return handle_out(regs, size, port);
+		ret = handle_out(regs, size, port);
+	if (!ret)
+		return -EIO;
+
+	return ve_instr_len(ve);
 }
 
 /*
@@ -447,17 +502,19 @@ static bool handle_io(struct pt_regs *regs, u32 exit_qual)
 __init bool tdx_early_handle_ve(struct pt_regs *regs)
 {
 	struct ve_info ve;
-	bool ret;
+	int insn_len;
 
 	tdx_get_ve_info(&ve);
 
 	if (ve.exit_reason != EXIT_REASON_IO_INSTRUCTION)
 		return false;
 
-	ret = handle_io(regs, ve.exit_qual);
-	if (ret)
-		regs->ip += ve.instr_len;
-	return ret;
+	insn_len = handle_io(regs, &ve);
+	if (insn_len < 0)
+		return false;
+
+	regs->ip += insn_len;
+	return true;
 }
 
 void tdx_get_ve_info(struct ve_info *ve)
@@ -490,54 +547,65 @@ void tdx_get_ve_info(struct ve_info *ve)
 	ve->instr_info  = upper_32_bits(out.r10);
 }
 
-/* Handle the user initiated #VE */
-static bool virt_exception_user(struct pt_regs *regs, struct ve_info *ve)
+/*
+ * Handle the user initiated #VE.
+ *
+ * On success, returns the number of bytes RIP should be incremented (>=0)
+ * or -errno on error.
+ */
+static int virt_exception_user(struct pt_regs *regs, struct ve_info *ve)
 {
 	switch (ve->exit_reason) {
 	case EXIT_REASON_CPUID:
-		return handle_cpuid(regs);
+		return handle_cpuid(regs, ve);
 	default:
 		pr_warn("Unexpected #VE: %lld\n", ve->exit_reason);
-		return false;
+		return -EIO;
 	}
 }
 
-/* Handle the kernel #VE */
-static bool virt_exception_kernel(struct pt_regs *regs, struct ve_info *ve)
+/*
+ * Handle the kernel #VE.
+ *
+ * On success, returns the number of bytes RIP should be incremented (>=0)
+ * or -errno on error.
+ */
+static int virt_exception_kernel(struct pt_regs *regs, struct ve_info *ve)
 {
 	switch (ve->exit_reason) {
 	case EXIT_REASON_HLT:
-		return handle_halt();
+		return handle_halt(ve);
 	case EXIT_REASON_MSR_READ:
-		return read_msr(regs);
+		return read_msr(regs, ve);
 	case EXIT_REASON_MSR_WRITE:
-		return write_msr(regs);
+		return write_msr(regs, ve);
 	case EXIT_REASON_CPUID:
-		return handle_cpuid(regs);
+		return handle_cpuid(regs, ve);
 	case EXIT_REASON_EPT_VIOLATION:
 		return handle_mmio(regs, ve);
 	case EXIT_REASON_IO_INSTRUCTION:
-		return handle_io(regs, ve->exit_qual);
+		return handle_io(regs, ve);
 	default:
 		pr_warn("Unexpected #VE: %lld\n", ve->exit_reason);
-		return false;
+		return -EIO;
 	}
 }
 
 bool tdx_handle_virt_exception(struct pt_regs *regs, struct ve_info *ve)
 {
-	bool ret;
+	int insn_len;
 
 	if (user_mode(regs))
-		ret = virt_exception_user(regs, ve);
+		insn_len = virt_exception_user(regs, ve);
 	else
-		ret = virt_exception_kernel(regs, ve);
+		insn_len = virt_exception_kernel(regs, ve);
+	if (insn_len < 0)
+		return false;
 
 	/* After successful #VE handling, move the IP */
-	if (ret)
-		regs->ip += ve->instr_len;
+	regs->ip += insn_len;
 
-	return ret;
+	return true;
 }
 
 static bool tdx_tlb_flush_required(bool private)
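
A note on the conversion above: every #VE handler now returns the
number of bytes to advance RIP (>= 0) on success, or -errno on failure,
and only the top-level dispatchers touch regs->ip. A minimal sketch of
how a hypothetical new handler would slot into this convention follows;
handle_foo() and its hypercall arguments are made up for illustration,
while ve_instr_len() and __tdx_hypercall() are the helpers used in the
diff:

	static int handle_foo(struct pt_regs *regs, struct ve_info *ve)
	{
		struct tdx_hypercall_args args = {
			.r10 = TDX_HYPERCALL_STANDARD,
			/* exit-reason specific arguments go here */
		};

		if (__tdx_hypercall(&args, 0))
			return -EIO;	/* dispatcher maps any -errno to "false" */

		/* bytes the dispatcher will add to regs->ip */
		return ve_instr_len(ve);
	}

handle_mmio() is the one exception: it returns the software-decoded
insn.length instead of ve_instr_len(ve), since it has already decoded
the instruction from memory.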

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: x86/urgent] x86/tdx: Fix early #VE handling
  2022-06-14 12:01 ` [PATCHv4 1/3] x86/tdx: Fix early #VE handling Kirill A. Shutemov
@ 2022-06-17 21:53   ` tip-bot2 for Kirill A. Shutemov
  0 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Kirill A. Shutemov @ 2022-06-17 21:53 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Kirill A. Shutemov, Dave Hansen, Kuppuswamy Sathyanarayanan, x86,
	linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     60428d8bc27f52e8f1540f98e1b6ef0156d43f0d
Gitweb:        https://git.kernel.org/tip/60428d8bc27f52e8f1540f98e1b6ef0156d43f0d
Author:        Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate:    Tue, 14 Jun 2022 15:01:33 +03:00
Committer:     Dave Hansen <dave.hansen@linux.intel.com>
CommitterDate: Wed, 15 Jun 2022 10:52:59 -07:00

x86/tdx: Fix early #VE handling

tdx_early_handle_ve() does not increment RIP after successfully
handling the exception. Since execution resumes at the same port I/O
instruction, that leads to an infinite loop of exceptions.

Move RIP forward when the exception is successfully handled.

[ dhansen: make problem statement more clear ]

Fixes: 32e72854fa5f ("x86/tdx: Port I/O: Add early boot support")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://lkml.kernel.org/r/20220614120135.14812-2-kirill.shutemov@linux.intel.com
---
 arch/x86/coco/tdx/tdx.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 03deb4d..faae53f 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -447,13 +447,17 @@ static bool handle_io(struct pt_regs *regs, u32 exit_qual)
 __init bool tdx_early_handle_ve(struct pt_regs *regs)
 {
 	struct ve_info ve;
+	bool ret;
 
 	tdx_get_ve_info(&ve);
 
 	if (ve.exit_reason != EXIT_REASON_IO_INSTRUCTION)
 		return false;
 
-	return handle_io(regs, ve.exit_qual);
+	ret = handle_io(regs, ve.exit_qual);
+	if (ret)
+		regs->ip += ve.instr_len;
+	return ret;
 }
 
 void tdx_get_ve_info(struct ve_info *ve)

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [tip: x86/urgent] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
  2022-06-14 12:01 ` [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page Kirill A. Shutemov
                     ` (3 preceding siblings ...)
  2022-06-17 21:53   ` [tip: x86/urgent] " tip-bot2 for Kirill A. Shutemov
@ 2022-06-17 22:41   ` tip-bot2 for Kirill A. Shutemov
  4 siblings, 0 replies; 22+ messages in thread
From: tip-bot2 for Kirill A. Shutemov @ 2022-06-17 22:41 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: Kirill A. Shutemov, Dave Hansen, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     1e7769653b06b56b7ea7d56911d2d5b2957750cd
Gitweb:        https://git.kernel.org/tip/1e7769653b06b56b7ea7d56911d2d5b2957750cd
Author:        Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
AuthorDate:    Tue, 14 Jun 2022 15:01:35 +03:00
Committer:     Dave Hansen <dave.hansen@linux.intel.com>
CommitterDate: Fri, 17 Jun 2022 15:37:33 -07:00

x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page

load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
The unwanted loads are typically harmless. But they might be made to
totally unrelated or even unmapped memory. load_unaligned_zeropad()
relies on exception fixup (#PF, #GP and now #VE) to recover from these
unwanted loads.

In TDX guests, the second page can be a shared page and a VMM may
configure it to trigger #VE.

The kernel assumes that a #VE on a shared page is an MMIO access and
tries to decode the instruction to handle it. In the case of
load_unaligned_zeropad() this leads to confusion, as it is not an MMIO
access.

Fix it by detecting split-page MMIO accesses and failing them.
load_unaligned_zeropad() will recover using exception fixups.

The issue was discovered by analysis and reproduced artificially. It was
not triggered during testing.

[ dhansen: fix up changelogs and comments for grammar and clarity,
	   plus incorporate Kirill's off-by-one fix ]

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220614120135.14812-4-kirill.shutemov@linux.intel.com
---
 arch/x86/coco/tdx/tdx.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index c8d44f4..928dcf7 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -333,8 +333,8 @@ static bool mmio_write(int size, unsigned long addr, unsigned long val)
 
 static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 {
+	unsigned long *reg, val, vaddr;
 	char buffer[MAX_INSN_SIZE];
-	unsigned long *reg, val;
 	struct insn insn = {};
 	enum mmio_type mmio;
 	int size, extend_size;
@@ -360,6 +360,19 @@ static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 			return -EINVAL;
 	}
 
+	/*
+	 * Reject EPT violation #VEs that split pages.
+	 *
+	 * MMIO accesses are supposed to be naturally aligned and therefore
+	 * never cross page boundaries. Seeing split page accesses indicates
+	 * a bug or a load_unaligned_zeropad() that stepped into an MMIO page.
+	 *
+	 * load_unaligned_zeropad() will recover using exception fixups.
+	 */
+	vaddr = (unsigned long)insn_get_addr_ref(&insn, regs);
+	if (vaddr / PAGE_SIZE != (vaddr + size - 1) / PAGE_SIZE)
+		return -EFAULT;
+
 	/* Handle writes first */
 	switch (mmio) {
 	case MMIO_WRITE:
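
The "size - 1" in the check above matters: an access of size bytes
starting at vaddr touches bytes [vaddr, vaddr + size - 1], so testing
the page of vaddr + size would wrongly reject an access that ends
exactly on a page boundary (presumably the off-by-one mentioned in the
committer note). A standalone sketch, assuming 4096-byte pages;
crosses_page() is a made-up name for illustration, not a kernel helper:

	#define PAGE_SIZE 4096UL

	static int crosses_page(unsigned long vaddr, int size)
	{
		return vaddr / PAGE_SIZE != (vaddr + size - 1) / PAGE_SIZE;
	}

	/*
	 * crosses_page(4088, 8) == 0: the load ends at byte 4095, still
	 * within the first page; with "vaddr + size" it would compare
	 * page 0 against page 1 and falsely flag a split.
	 * crosses_page(4089, 8) == 1: bytes 4089..4096 span two pages.
	 */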

^ permalink raw reply related	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2022-06-17 22:41 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-14 12:01 [PATCHv4 0/3] An updated version of the load_unaligned_zeropad() fix Kirill A. Shutemov
2022-06-14 12:01 ` [PATCHv4 1/3] x86/tdx: Fix early #VE handling Kirill A. Shutemov
2022-06-17 21:53   ` [tip: x86/urgent] " tip-bot2 for Kirill A. Shutemov
2022-06-14 12:01 ` [PATCHv4 2/3] x86/tdx: Clarify RIP adjustments in #VE handler Kirill A. Shutemov
2022-06-17 21:53   ` [tip: x86/urgent] " tip-bot2 for Kirill A. Shutemov
2022-06-14 12:01 ` [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page Kirill A. Shutemov
2022-06-15 15:27   ` Dave Hansen
2022-06-15 17:10     ` Kirill A. Shutemov
2022-06-15 21:55       ` Dave Hansen
2022-06-15 22:43         ` Kirill A. Shutemov
2022-06-15 22:49           ` Dave Hansen
2022-06-15 23:16             ` Kirill A. Shutemov
2022-06-15 18:12   ` Dave Hansen
2022-06-15 22:32     ` Kirill A. Shutemov
2022-06-15 22:36       ` Dave Hansen
2022-06-15 23:16         ` Kirill A. Shutemov
2022-06-15 22:52   ` Kirill A. Shutemov
2022-06-15 23:34     ` Dave Hansen
2022-06-16  1:19       ` Kirill A. Shutemov
2022-06-16  6:35       ` David Laight
2022-06-17 21:53   ` [tip: x86/urgent] " tip-bot2 for Kirill A. Shutemov
2022-06-17 22:41   ` tip-bot2 for Kirill A. Shutemov
