kvm.vger.kernel.org archive mirror
* [PATCH v3 0/7] x86/sev-es: Fixes for SEV-ES Guest Support
@ 2021-06-08  9:54 Joerg Roedel
  2021-06-08  9:54 ` [PATCH v3 1/7] x86/ioremap: Map efi_mem_reserve() memory as encrypted for SEV Joerg Roedel
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Joerg Roedel @ 2021-06-08  9:54 UTC (permalink / raw)
  To: x86
  Cc: Joerg Roedel, Joerg Roedel, hpa, Andy Lutomirski, Dave Hansen,
	Peter Zijlstra, Jiri Slaby, Dan Williams, Tom Lendacky,
	Juergen Gross, Kees Cook, David Rientjes, Cfir Cohen,
	Erdem Aktas, Masami Hiramatsu, Mike Stunes, Sean Christopherson,
	Martin Radev, Arvind Sankar, linux-coco, linux-kernel, kvm,
	virtualization

From: Joerg Roedel <jroedel@suse.de>

Hi,

here is the next revision of my pending fixes for SEV-ES guest
support. Changes to the previous version are:

	- Removed the patches already merged
	- Added a new fix to map the EFI MOKVar table encrypted
	- Disabled IRQs when GHCB is active
	- Relaxed state tracking by using irqentry_enter()/exit()
	  instead of irqentry_nmi_enter()/exit()
	- Changed error reporting from insn_fetch_from_user*() as
	  requested by Boris

Changes are based on tip/x86/urgent. Please review.

Thanks,

	Joerg

Joerg Roedel (6):
  x86/sev-es: Fix error message in runtime #VC handler
  x86/sev-es: Disable IRQs while GHCB is active
  x86/sev-es: Run #VC handler in plain IRQ state
  x86/insn-eval: Make 0 a valid RIP for insn_get_effective_ip()
  x86/insn: Extend error reporting from
    insn_fetch_from_user[_inatomic]()
  x86/sev-es: Propagate #GP if getting linear instruction address failed

Tom Lendacky (1):
  x86/ioremap: Map efi_mem_reserve() memory as encrypted for SEV

 arch/x86/kernel/sev.c    | 61 +++++++++++++++++++++++++---------------
 arch/x86/kernel/umip.c   | 10 +++----
 arch/x86/lib/insn-eval.c | 22 +++++++++------
 arch/x86/mm/ioremap.c    |  4 ++-
 4 files changed, 59 insertions(+), 38 deletions(-)


base-commit: 009767dbf42ac0dbe3cf48c1ee224f6b778aa85a
-- 
2.31.1



* [PATCH v3 1/7] x86/ioremap: Map efi_mem_reserve() memory as encrypted for SEV
  2021-06-08  9:54 [PATCH v3 0/7] x86/sev-es: Fixes for SEV-ES Guest Support Joerg Roedel
@ 2021-06-08  9:54 ` Joerg Roedel
  2021-06-08  9:54 ` [PATCH v3 2/7] x86/sev-es: Fix error message in runtime #VC handler Joerg Roedel
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2021-06-08  9:54 UTC (permalink / raw)
  To: x86
  Cc: Joerg Roedel, Joerg Roedel, Tom Lendacky, hpa, Andy Lutomirski,
	Dave Hansen, Peter Zijlstra, Jiri Slaby, Dan Williams,
	Juergen Gross, Kees Cook, David Rientjes, Cfir Cohen,
	Erdem Aktas, Masami Hiramatsu, Mike Stunes, Sean Christopherson,
	Martin Radev, Arvind Sankar, linux-coco, linux-kernel, kvm,
	virtualization

From: Tom Lendacky <thomas.lendacky@amd.com>

Some drivers require memory that is marked as EFI boot services data. So that
this memory is not re-used by the kernel after ExitBootServices(),
efi_mem_reserve() is used to preserve it by inserting a new EFI memory
descriptor and marking it with the EFI_MEMORY_RUNTIME attribute.

Under SEV, memory marked with the EFI_MEMORY_RUNTIME attribute needs to
be mapped encrypted by Linux, otherwise the kernel might crash at boot
like below:

 EFI Variables Facility v0.08 2004-May-17
 general protection fault, probably for non-canonical address 0x3597688770a868b2: 0000 [#1] SMP NOPTI
 CPU: 13 PID: 1 Comm: swapper/0 Not tainted 5.12.4-2-default #1 openSUSE Tumbleweed
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
 RIP: 0010:efi_mokvar_entry_next+0x34/0x40
 Code: c5 01 48 8b 17 48 c7 07 00 00 00 00 48 85 c0 74 24 48 85 d2 74 14 80 3a 00 74 18 48 8b 82 00 01 00 00 48 8d 84 02 08 01 00 00 <80> 38 00 74 04 48 89 07 c3 31 c0 c3 0f 1f 44 00 00 41 54 4c 8b 25
 [...]
 Call Trace:
  efi_mokvar_sysfs_init
  ? efi_mokvar_table_init
  do_one_initcall
  ? __kmalloc
  kernel_init_freeable
  ? rest_init
  kernel_init
  ret_from_fork
 Modules linked in:
 ---[ end trace 0de27ecc25d41b73 ]---

Expand the __ioremap_check_other() function to additionally check for this
other type of "runtime" data and indicate that it should be mapped encrypted
for an SEV guest.

Fixes: 58c909022a5a ("efi: Support for MOK variable config table")
Reported-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/mm/ioremap.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
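
Note (illustration only, not part of the patch): the extended check boils
down to a small predicate over the EFI memory type and attributes. The
sketch below is a userspace approximation; every sk_* name is a made-up
stand-in for the kernel's efi_mem_type()/efi_mem_attributes() results, and
only the EFI_MEMORY_RUNTIME bit value is taken from the UEFI definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the EFI memory types relevant here. */
enum sk_efi_type {
	SK_BOOT_SERVICES_DATA,
	SK_RUNTIME_SERVICES_DATA,
	SK_CONVENTIONAL_MEMORY,
};

/* Same bit position as EFI_MEMORY_RUNTIME in the UEFI memory map. */
#define SK_EFI_MEMORY_RUNTIME (1ULL << 63)

/* Hypothetical summary of what the EFI memory map says about an address. */
struct sk_region {
	enum sk_efi_type type;
	uint64_t attr;
};

/*
 * Returns true if the region should be mapped encrypted in an SEV guest:
 * either genuine runtime services data, or boot services data that was
 * preserved via efi_mem_reserve() and therefore carries the runtime
 * attribute.
 */
bool sk_map_encrypted(const struct sk_region *r)
{
	assert(r);

	if (r->type == SK_RUNTIME_SERVICES_DATA)
		return true;

	return r->type == SK_BOOT_SERVICES_DATA &&
	       (r->attr & SK_EFI_MEMORY_RUNTIME) != 0;
}
```

Plain boot services data without the runtime attribute is deliberately not
matched, which mirrors the condition added in the hunk below.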

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 12c686c65ea9..60ade7dd71bd 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -118,7 +118,9 @@ static void __ioremap_check_other(resource_size_t addr, struct ioremap_desc *des
 	if (!IS_ENABLED(CONFIG_EFI))
 		return;
 
-	if (efi_mem_type(addr) == EFI_RUNTIME_SERVICES_DATA)
+	if (efi_mem_type(addr) == EFI_RUNTIME_SERVICES_DATA ||
+	    (efi_mem_type(addr) == EFI_BOOT_SERVICES_DATA &&
+	     efi_mem_attributes(addr) & EFI_MEMORY_RUNTIME))
 		desc->flags |= IORES_MAP_ENCRYPTED;
 }
 
-- 
2.31.1



* [PATCH v3 2/7] x86/sev-es: Fix error message in runtime #VC handler
  2021-06-08  9:54 [PATCH v3 0/7] x86/sev-es: Fixes for SEV-ES Guest Support Joerg Roedel
  2021-06-08  9:54 ` [PATCH v3 1/7] x86/ioremap: Map efi_mem_reserve() memory as encrypted for SEV Joerg Roedel
@ 2021-06-08  9:54 ` Joerg Roedel
  2021-06-08  9:54 ` [PATCH v3 3/7] x86/sev-es: Disable IRQs while GHCB is active Joerg Roedel
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2021-06-08  9:54 UTC (permalink / raw)
  To: x86
  Cc: Joerg Roedel, Joerg Roedel, hpa, Andy Lutomirski, Dave Hansen,
	Peter Zijlstra, Jiri Slaby, Dan Williams, Tom Lendacky,
	Juergen Gross, Kees Cook, David Rientjes, Cfir Cohen,
	Erdem Aktas, Masami Hiramatsu, Mike Stunes, Sean Christopherson,
	Martin Radev, Arvind Sankar, linux-coco, linux-kernel, kvm,
	virtualization

From: Joerg Roedel <jroedel@suse.de>

The runtime #VC handler is not "early" anymore. Fix the copy&paste error
and remove that word from the error message.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/kernel/sev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 651b81cd648e..4fd997bbf059 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1369,7 +1369,7 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 		vc_finish_insn(&ctxt);
 		break;
 	case ES_UNSUPPORTED:
-		pr_err_ratelimited("Unsupported exit-code 0x%02lx in early #VC exception (IP: 0x%lx)\n",
+		pr_err_ratelimited("Unsupported exit-code 0x%02lx in #VC exception (IP: 0x%lx)\n",
 				   error_code, regs->ip);
 		goto fail;
 	case ES_VMM_ERROR:
-- 
2.31.1



* [PATCH v3 3/7] x86/sev-es: Disable IRQs while GHCB is active
  2021-06-08  9:54 [PATCH v3 0/7] x86/sev-es: Fixes for SEV-ES Guest Support Joerg Roedel
  2021-06-08  9:54 ` [PATCH v3 1/7] x86/ioremap: Map efi_mem_reserve() memory as encrypted for SEV Joerg Roedel
  2021-06-08  9:54 ` [PATCH v3 2/7] x86/sev-es: Fix error message in runtime #VC handler Joerg Roedel
@ 2021-06-08  9:54 ` Joerg Roedel
  2021-06-08  9:54 ` [PATCH v3 4/7] x86/sev-es: Run #VC handler in plain IRQ state Joerg Roedel
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2021-06-08  9:54 UTC (permalink / raw)
  To: x86
  Cc: Joerg Roedel, Joerg Roedel, hpa, Andy Lutomirski, Dave Hansen,
	Peter Zijlstra, Jiri Slaby, Dan Williams, Tom Lendacky,
	Juergen Gross, Kees Cook, David Rientjes, Cfir Cohen,
	Erdem Aktas, Masami Hiramatsu, Mike Stunes, Sean Christopherson,
	Martin Radev, Arvind Sankar, linux-coco, linux-kernel, kvm,
	virtualization

From: Joerg Roedel <jroedel@suse.de>

The #VC handler only cares about IRQs being disabled while the GHCB is
active, as it must not be interrupted by something which could cause
another #VC while it holds the GHCB (NMI is the exception for which the
backup GHCB is there).

Make sure nothing interrupts the code path while the GHCB is active by
disabling IRQs in sev_es_get_ghcb() and restoring the previous irq state
in sev_es_put_ghcb().

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/kernel/sev.c | 39 +++++++++++++++++++++++++--------------
 1 file changed, 25 insertions(+), 14 deletions(-)
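
Note (illustration only, not part of the patch): the calling convention
this patch introduces is "get saves and disables, put restores". A minimal
userspace sketch of that pairing, assuming a simulated interrupt flag in
place of local_irq_save()/local_irq_restore(); all sk_* names are
hypothetical and the backup GHCB used for NMIs is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

static bool sk_irqs_enabled = true;	/* simulated CPU interrupt flag */
static bool sk_ghcb_active;		/* simulated per-cpu GHCB state */

/* Simulated local_irq_save(): remember the old state, then disable. */
unsigned long sk_irq_save(void)
{
	unsigned long flags = sk_irqs_enabled ? 1UL : 0UL;

	sk_irqs_enabled = false;
	return flags;
}

/* Simulated local_irq_restore(): put back whatever was saved. */
void sk_irq_restore(unsigned long flags)
{
	sk_irqs_enabled = (flags != 0);
}

bool sk_irqs_on(void)
{
	return sk_irqs_enabled;
}

/* Acquire the GHCB; IRQs stay off until the matching put. */
void sk_get_ghcb(unsigned long *flags)
{
	*flags = sk_irq_save();

	/* The sketch has no backup GHCB, so nesting is not expected. */
	assert(!sk_ghcb_active);
	sk_ghcb_active = true;
}

/* Release the GHCB and restore the caller's previous IRQ state. */
void sk_put_ghcb(unsigned long flags)
{
	sk_ghcb_active = false;
	sk_irq_restore(flags);
}
```

Because the previous state is restored rather than IRQs being
unconditionally re-enabled, a caller that already runs with IRQs disabled
keeps them disabled after the put.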

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 4fd997bbf059..2a922d1b03c8 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -192,14 +192,23 @@ void noinstr __sev_es_ist_exit(void)
 	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], *(unsigned long *)ist);
 }
 
-static __always_inline struct ghcb *sev_es_get_ghcb(struct ghcb_state *state)
+static __always_inline struct ghcb *sev_es_get_ghcb(struct ghcb_state *state,
+						    unsigned long *flags)
 {
 	struct sev_es_runtime_data *data;
 	struct ghcb *ghcb;
 
+	/*
+	 * Nothing shall interrupt this code path while holding the per-cpu
+	 * GHCB. The backup GHCB is only for NMIs interrupting this path.
+	 */
+	local_irq_save(*flags);
+
 	data = this_cpu_read(runtime_data);
 	ghcb = &data->ghcb_page;
 
+
+
 	if (unlikely(data->ghcb_active)) {
 		/* GHCB is already in use - save its contents */
 
@@ -479,7 +488,8 @@ static enum es_result vc_slow_virt_to_phys(struct ghcb *ghcb, struct es_em_ctxt
 /* Include code shared with pre-decompression boot stage */
 #include "sev-shared.c"
 
-static __always_inline void sev_es_put_ghcb(struct ghcb_state *state)
+static __always_inline void sev_es_put_ghcb(struct ghcb_state *state,
+					    unsigned long flags)
 {
 	struct sev_es_runtime_data *data;
 	struct ghcb *ghcb;
@@ -500,14 +510,17 @@ static __always_inline void sev_es_put_ghcb(struct ghcb_state *state)
 		vc_ghcb_invalidate(ghcb);
 		data->ghcb_active = false;
 	}
+
+	local_irq_restore(flags);
 }
 
 void noinstr __sev_es_nmi_complete(void)
 {
 	struct ghcb_state state;
+	unsigned long flags;
 	struct ghcb *ghcb;
 
-	ghcb = sev_es_get_ghcb(&state);
+	ghcb = sev_es_get_ghcb(&state, &flags);
 
 	vc_ghcb_invalidate(ghcb);
 	ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_NMI_COMPLETE);
@@ -517,7 +530,7 @@ void noinstr __sev_es_nmi_complete(void)
 	sev_es_wr_ghcb_msr(__pa_nodebug(ghcb));
 	VMGEXIT();
 
-	sev_es_put_ghcb(&state);
+	sev_es_put_ghcb(&state, flags);
 }
 
 static u64 get_jump_table_addr(void)
@@ -527,9 +540,7 @@ static u64 get_jump_table_addr(void)
 	struct ghcb *ghcb;
 	u64 ret = 0;
 
-	local_irq_save(flags);
-
-	ghcb = sev_es_get_ghcb(&state);
+	ghcb = sev_es_get_ghcb(&state, &flags);
 
 	vc_ghcb_invalidate(ghcb);
 	ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_JUMP_TABLE);
@@ -543,9 +554,7 @@ static u64 get_jump_table_addr(void)
 	    ghcb_sw_exit_info_2_is_valid(ghcb))
 		ret = ghcb->save.sw_exit_info_2;
 
-	sev_es_put_ghcb(&state);
-
-	local_irq_restore(flags);
+	sev_es_put_ghcb(&state, flags);
 
 	return ret;
 }
@@ -666,9 +675,10 @@ static bool __init sev_es_setup_ghcb(void)
 static void sev_es_ap_hlt_loop(void)
 {
 	struct ghcb_state state;
+	unsigned long flags;
 	struct ghcb *ghcb;
 
-	ghcb = sev_es_get_ghcb(&state);
+	ghcb = sev_es_get_ghcb(&state, &flags);
 
 	while (true) {
 		vc_ghcb_invalidate(ghcb);
@@ -685,7 +695,7 @@ static void sev_es_ap_hlt_loop(void)
 			break;
 	}
 
-	sev_es_put_ghcb(&state);
+	sev_es_put_ghcb(&state, flags);
 }
 
 /*
@@ -1333,6 +1343,7 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 	struct ghcb_state state;
 	struct es_em_ctxt ctxt;
 	enum es_result result;
+	unsigned long flags;
 	struct ghcb *ghcb;
 
 	/*
@@ -1353,7 +1364,7 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 	 * keep the IRQs disabled to protect us against concurrent TLB flushes.
 	 */
 
-	ghcb = sev_es_get_ghcb(&state);
+	ghcb = sev_es_get_ghcb(&state, &flags);
 
 	vc_ghcb_invalidate(ghcb);
 	result = vc_init_em_ctxt(&ctxt, regs, error_code);
@@ -1361,7 +1372,7 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 	if (result == ES_OK)
 		result = vc_handle_exitcode(&ctxt, ghcb, error_code);
 
-	sev_es_put_ghcb(&state);
+	sev_es_put_ghcb(&state, flags);
 
 	/* Done - now check the result */
 	switch (result) {
-- 
2.31.1



* [PATCH v3 4/7] x86/sev-es: Run #VC handler in plain IRQ state
  2021-06-08  9:54 [PATCH v3 0/7] x86/sev-es: Fixes for SEV-ES Guest Support Joerg Roedel
                   ` (2 preceding siblings ...)
  2021-06-08  9:54 ` [PATCH v3 3/7] x86/sev-es: Disable IRQs while GHCB is active Joerg Roedel
@ 2021-06-08  9:54 ` Joerg Roedel
  2021-06-08 11:58   ` Peter Zijlstra
  2021-06-08  9:54 ` [PATCH v3 5/7] x86/insn-eval: Make 0 a valid RIP for insn_get_effective_ip() Joerg Roedel
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 10+ messages in thread
From: Joerg Roedel @ 2021-06-08  9:54 UTC (permalink / raw)
  To: x86
  Cc: Joerg Roedel, Joerg Roedel, hpa, Andy Lutomirski, Dave Hansen,
	Peter Zijlstra, Jiri Slaby, Dan Williams, Tom Lendacky,
	Juergen Gross, Kees Cook, David Rientjes, Cfir Cohen,
	Erdem Aktas, Masami Hiramatsu, Mike Stunes, Sean Christopherson,
	Martin Radev, Arvind Sankar, linux-coco, linux-kernel, kvm,
	virtualization

From: Joerg Roedel <jroedel@suse.de>

Use irqentry_enter() and irqentry_exit() to track the runtime state of
the #VC handler. The reason it ran in NMI mode was solely to make sure
nothing interrupts the handler while the GHCB is in use.

This is handled now in sev_es_get/put_ghcb() directly, so there is no
reason the #VC handler can not run in normal IRQ mode and enjoy the
benefits like being able to send signals.

Fixes: 62441a1fb532 ("x86/sev-es: Correctly track IRQ states in runtime #VC handler")
Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/kernel/sev.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 2a922d1b03c8..b563fb747aed 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -1354,8 +1354,7 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 		return;
 	}
 
-	irq_state = irqentry_nmi_enter(regs);
-	lockdep_assert_irqs_disabled();
+	irq_state = irqentry_enter(regs);
 	instrumentation_begin();
 
 	/*
@@ -1408,7 +1407,7 @@ DEFINE_IDTENTRY_VC_SAFE_STACK(exc_vmm_communication)
 
 out:
 	instrumentation_end();
-	irqentry_nmi_exit(regs, irq_state);
+	irqentry_exit(regs, irq_state);
 
 	return;
 
-- 
2.31.1



* [PATCH v3 5/7] x86/insn-eval: Make 0 a valid RIP for insn_get_effective_ip()
  2021-06-08  9:54 [PATCH v3 0/7] x86/sev-es: Fixes for SEV-ES Guest Support Joerg Roedel
                   ` (3 preceding siblings ...)
  2021-06-08  9:54 ` [PATCH v3 4/7] x86/sev-es: Run #VC handler in plain IRQ state Joerg Roedel
@ 2021-06-08  9:54 ` Joerg Roedel
  2021-06-08  9:54 ` [PATCH v3 6/7] x86/insn: Extend error reporting from insn_fetch_from_user[_inatomic]() Joerg Roedel
  2021-06-08  9:54 ` [PATCH v3 7/7] x86/sev-es: Propagate #GP if getting linear instruction address failed Joerg Roedel
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2021-06-08  9:54 UTC (permalink / raw)
  To: x86
  Cc: Joerg Roedel, Joerg Roedel, hpa, Andy Lutomirski, Dave Hansen,
	Peter Zijlstra, Jiri Slaby, Dan Williams, Tom Lendacky,
	Juergen Gross, Kees Cook, David Rientjes, Cfir Cohen,
	Erdem Aktas, Masami Hiramatsu, Mike Stunes, Sean Christopherson,
	Martin Radev, Arvind Sankar, linux-coco, linux-kernel, kvm,
	virtualization

From: Joerg Roedel <jroedel@suse.de>

In theory 0 is a valid value for the instruction pointer, so don't use
it as the error return value from insn_get_effective_ip().

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/lib/insn-eval.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)
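
Note (illustration only, not part of the patch): the change moves the
result into an out-parameter so that the return value carries only the
error status, which leaves 0 free to be a legitimate instruction pointer.
A userspace sketch of the same shape, assuming a simplified register
structure; struct sk_regs and its fields are hypothetical, with
cs_base == -1UL standing in for a failed segment base lookup.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical, heavily reduced stand-in for struct pt_regs. */
struct sk_regs {
	unsigned long ip;	/* instruction pointer within the segment */
	int is_64bit;		/* non-zero: 64-bit user mode, CS base ignored */
	unsigned long cs_base;	/* -1UL models a failed segment base lookup */
};

/*
 * Compute the linear instruction address. Returns 0 on success and
 * stores the address in *ip, or -EINVAL if the segment base is unknown.
 * *ip is left untouched on failure.
 */
int sk_effective_ip(const struct sk_regs *regs, unsigned long *ip)
{
	unsigned long seg_base = 0;

	assert(regs && ip);

	if (!regs->is_64bit) {
		seg_base = regs->cs_base;
		if (seg_base == -1UL)
			return -EINVAL;
	}

	*ip = seg_base + regs->ip;

	return 0;
}
```

With this shape a caller can tell "address is 0" apart from "address
could not be computed", which the old return-0-on-error convention could
not express.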

diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index a67afd74232c..4eecb9c7c6a0 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -1417,7 +1417,7 @@ void __user *insn_get_addr_ref(struct insn *insn, struct pt_regs *regs)
 	}
 }
 
-static unsigned long insn_get_effective_ip(struct pt_regs *regs)
+static int insn_get_effective_ip(struct pt_regs *regs, unsigned long *ip)
 {
 	unsigned long seg_base = 0;
 
@@ -1430,10 +1430,12 @@ static unsigned long insn_get_effective_ip(struct pt_regs *regs)
 	if (!user_64bit_mode(regs)) {
 		seg_base = insn_get_seg_base(regs, INAT_SEG_REG_CS);
 		if (seg_base == -1L)
-			return 0;
+			return -EINVAL;
 	}
 
-	return seg_base + regs->ip;
+	*ip = seg_base + regs->ip;
+
+	return 0;
 }
 
 /**
@@ -1455,8 +1457,7 @@ int insn_fetch_from_user(struct pt_regs *regs, unsigned char buf[MAX_INSN_SIZE])
 	unsigned long ip;
 	int not_copied;
 
-	ip = insn_get_effective_ip(regs);
-	if (!ip)
+	if (insn_get_effective_ip(regs, &ip))
 		return 0;
 
 	not_copied = copy_from_user(buf, (void __user *)ip, MAX_INSN_SIZE);
@@ -1484,8 +1485,7 @@ int insn_fetch_from_user_inatomic(struct pt_regs *regs, unsigned char buf[MAX_IN
 	unsigned long ip;
 	int not_copied;
 
-	ip = insn_get_effective_ip(regs);
-	if (!ip)
+	if (insn_get_effective_ip(regs, &ip))
 		return 0;
 
 	not_copied = __copy_from_user_inatomic(buf, (void __user *)ip, MAX_INSN_SIZE);
-- 
2.31.1



* [PATCH v3 6/7] x86/insn: Extend error reporting from insn_fetch_from_user[_inatomic]()
  2021-06-08  9:54 [PATCH v3 0/7] x86/sev-es: Fixes for SEV-ES Guest Support Joerg Roedel
                   ` (4 preceding siblings ...)
  2021-06-08  9:54 ` [PATCH v3 5/7] x86/insn-eval: Make 0 a valid RIP for insn_get_effective_ip() Joerg Roedel
@ 2021-06-08  9:54 ` Joerg Roedel
  2021-06-08  9:54 ` [PATCH v3 7/7] x86/sev-es: Propagate #GP if getting linear instruction address failed Joerg Roedel
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2021-06-08  9:54 UTC (permalink / raw)
  To: x86
  Cc: Joerg Roedel, Joerg Roedel, hpa, Andy Lutomirski, Dave Hansen,
	Peter Zijlstra, Jiri Slaby, Dan Williams, Tom Lendacky,
	Juergen Gross, Kees Cook, David Rientjes, Cfir Cohen,
	Erdem Aktas, Masami Hiramatsu, Mike Stunes, Sean Christopherson,
	Martin Radev, Arvind Sankar, linux-coco, linux-kernel, kvm,
	virtualization

From: Joerg Roedel <jroedel@suse.de>

The error reporting from the insn_fetch_from_user*() functions is not
very verbose. Extend it to include information on whether the linear
RIP could not be calculated or whether the memory access faulted.

This will be used in the SEV-ES code to propagate the correct
exception depending on what went wrong during instruction fetch.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/kernel/sev.c    |  8 ++++----
 arch/x86/kernel/umip.c   | 10 ++++------
 arch/x86/lib/insn-eval.c |  8 ++++++--
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index b563fb747aed..2b499affb2fb 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -267,17 +267,17 @@ static int vc_fetch_insn_kernel(struct es_em_ctxt *ctxt,
 static enum es_result __vc_decode_user_insn(struct es_em_ctxt *ctxt)
 {
 	char buffer[MAX_INSN_SIZE];
-	int res;
+	int insn_bytes;
 
-	res = insn_fetch_from_user_inatomic(ctxt->regs, buffer);
-	if (!res) {
+	insn_bytes = insn_fetch_from_user_inatomic(ctxt->regs, buffer);
+	if (insn_bytes <= 0) {
 		ctxt->fi.vector     = X86_TRAP_PF;
 		ctxt->fi.error_code = X86_PF_INSTR | X86_PF_USER;
 		ctxt->fi.cr2        = ctxt->regs->ip;
 		return ES_EXCEPTION;
 	}
 
-	if (!insn_decode_from_regs(&ctxt->insn, ctxt->regs, buffer, res))
+	if (!insn_decode_from_regs(&ctxt->insn, ctxt->regs, buffer, insn_bytes))
 		return ES_DECODE_FAILED;
 
 	if (ctxt->insn.immediate.got)
diff --git a/arch/x86/kernel/umip.c b/arch/x86/kernel/umip.c
index 8daa70b0d2da..337178809c89 100644
--- a/arch/x86/kernel/umip.c
+++ b/arch/x86/kernel/umip.c
@@ -346,14 +346,12 @@ bool fixup_umip_exception(struct pt_regs *regs)
 	if (!regs)
 		return false;
 
-	nr_copied = insn_fetch_from_user(regs, buf);
-
 	/*
-	 * The insn_fetch_from_user above could have failed if user code
-	 * is protected by a memory protection key. Give up on emulation
-	 * in such a case.  Should we issue a page fault?
+	 * Give up on emulation if fetching the instruction failed. Should we
+	 * issue a page fault or a #GP?
 	 */
-	if (!nr_copied)
+	nr_copied = insn_fetch_from_user(regs, buf);
+	if (nr_copied <= 0)
 		return false;
 
 	if (!insn_decode_from_regs(&insn, regs, buf, nr_copied))
diff --git a/arch/x86/lib/insn-eval.c b/arch/x86/lib/insn-eval.c
index 4eecb9c7c6a0..1b5cdf8b7a4e 100644
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -1451,6 +1451,8 @@ static int insn_get_effective_ip(struct pt_regs *regs, unsigned long *ip)
  * Number of instruction bytes copied.
  *
  * 0 if nothing was copied.
+ *
+ * -EINVAL if the linear address of the instruction could not be calculated
  */
 int insn_fetch_from_user(struct pt_regs *regs, unsigned char buf[MAX_INSN_SIZE])
 {
@@ -1458,7 +1460,7 @@ int insn_fetch_from_user(struct pt_regs *regs, unsigned char buf[MAX_INSN_SIZE])
 	int not_copied;
 
 	if (insn_get_effective_ip(regs, &ip))
-		return 0;
+		return -EINVAL;
 
 	not_copied = copy_from_user(buf, (void __user *)ip, MAX_INSN_SIZE);
 
@@ -1479,6 +1481,8 @@ int insn_fetch_from_user(struct pt_regs *regs, unsigned char buf[MAX_INSN_SIZE])
  * Number of instruction bytes copied.
  *
  * 0 if nothing was copied.
+ *
+ * -EINVAL if the linear address of the instruction could not be calculated
  */
 int insn_fetch_from_user_inatomic(struct pt_regs *regs, unsigned char buf[MAX_INSN_SIZE])
 {
@@ -1486,7 +1490,7 @@ int insn_fetch_from_user_inatomic(struct pt_regs *regs, unsigned char buf[MAX_IN
 	int not_copied;
 
 	if (insn_get_effective_ip(regs, &ip))
-		return 0;
+		return -EINVAL;
 
 	not_copied = __copy_from_user_inatomic(buf, (void __user *)ip, MAX_INSN_SIZE);
 
-- 
2.31.1



* [PATCH v3 7/7] x86/sev-es: Propagate #GP if getting linear instruction address failed
  2021-06-08  9:54 [PATCH v3 0/7] x86/sev-es: Fixes for SEV-ES Guest Support Joerg Roedel
                   ` (5 preceding siblings ...)
  2021-06-08  9:54 ` [PATCH v3 6/7] x86/insn: Extend error reporting from insn_fetch_from_user[_inatomic]() Joerg Roedel
@ 2021-06-08  9:54 ` Joerg Roedel
  6 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2021-06-08  9:54 UTC (permalink / raw)
  To: x86
  Cc: Joerg Roedel, Joerg Roedel, hpa, Andy Lutomirski, Dave Hansen,
	Peter Zijlstra, Jiri Slaby, Dan Williams, Tom Lendacky,
	Juergen Gross, Kees Cook, David Rientjes, Cfir Cohen,
	Erdem Aktas, Masami Hiramatsu, Mike Stunes, Sean Christopherson,
	Martin Radev, Arvind Sankar, linux-coco, linux-kernel, kvm,
	virtualization

From: Joerg Roedel <jroedel@suse.de>

When an instruction is fetched from user-space, segmentation needs to
be taken into account. This means that getting the linear address of
an instruction can fail. Hardware would raise a #GP
exception in that case, but the #VC exception handler would emulate it
as a page-fault.

The insn_fetch_from_user*() functions now provide the relevant
information in case of a failure. Use that and propagate a #GP when
the linear address of an instruction to fetch could not be calculated.

Signed-off-by: Joerg Roedel <jroedel@suse.de>
---
 arch/x86/kernel/sev.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
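
Note (illustration only, not part of the patch): the resulting decision in
__vc_decode_user_insn() is a mapping from the fetch result to the
exception that gets propagated. A userspace sketch of that mapping; the
sk_* names are hypothetical, and only the two failure cases named in this
patch are classified. Any other value is passed on to the decoder, as in
the hunk below.

```c
#include <assert.h>
#include <errno.h>

/* x86 instructions are at most 15 bytes long. */
#define SK_MAX_INSN_SIZE 15

/* Hypothetical result type: which exception, if any, to propagate. */
enum sk_fault {
	SK_FAULT_NONE,	/* continue with instruction decoding */
	SK_FAULT_PF,	/* nothing could be copied: page fault */
	SK_FAULT_GP,	/* linear RIP could not be calculated: #GP */
};

/* Classify the return value of an insn_fetch_from_user*()-style helper. */
enum sk_fault sk_classify_fetch(int insn_bytes)
{
	assert(insn_bytes <= SK_MAX_INSN_SIZE);

	if (insn_bytes == 0)
		return SK_FAULT_PF;
	else if (insn_bytes == -EINVAL)
		return SK_FAULT_GP;

	return SK_FAULT_NONE;
}
```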

diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 2b499affb2fb..737d7198aab1 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -270,11 +270,18 @@ static enum es_result __vc_decode_user_insn(struct es_em_ctxt *ctxt)
 	int insn_bytes;
 
 	insn_bytes = insn_fetch_from_user_inatomic(ctxt->regs, buffer);
-	if (insn_bytes <= 0) {
+	if (insn_bytes == 0) {
+		/* Nothing could be copied */
 		ctxt->fi.vector     = X86_TRAP_PF;
 		ctxt->fi.error_code = X86_PF_INSTR | X86_PF_USER;
 		ctxt->fi.cr2        = ctxt->regs->ip;
 		return ES_EXCEPTION;
+	} else if (insn_bytes == -EINVAL) {
+		/* Effective RIP could not be calculated */
+		ctxt->fi.vector     = X86_TRAP_GP;
+		ctxt->fi.error_code = 0;
+		ctxt->fi.cr2        = 0;
+		return ES_EXCEPTION;
 	}
 
 	if (!insn_decode_from_regs(&ctxt->insn, ctxt->regs, buffer, insn_bytes))
-- 
2.31.1



* Re: [PATCH v3 4/7] x86/sev-es: Run #VC handler in plain IRQ state
  2021-06-08  9:54 ` [PATCH v3 4/7] x86/sev-es: Run #VC handler in plain IRQ state Joerg Roedel
@ 2021-06-08 11:58   ` Peter Zijlstra
  2021-06-08 13:25     ` Joerg Roedel
  0 siblings, 1 reply; 10+ messages in thread
From: Peter Zijlstra @ 2021-06-08 11:58 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: x86, Joerg Roedel, hpa, Andy Lutomirski, Dave Hansen, Jiri Slaby,
	Dan Williams, Tom Lendacky, Juergen Gross, Kees Cook,
	David Rientjes, Cfir Cohen, Erdem Aktas, Masami Hiramatsu,
	Mike Stunes, Sean Christopherson, Martin Radev, Arvind Sankar,
	linux-coco, linux-kernel, kvm, virtualization

On Tue, Jun 08, 2021 at 11:54:36AM +0200, Joerg Roedel wrote:
> From: Joerg Roedel <jroedel@suse.de>
> 
> Use irqentry_enter() and irqentry_exit() to track the runtime state of
> the #VC handler. The reason it ran in NMI mode was solely to make sure
> nothing interrupts the handler while the GHCB is in use.
> 
> This is handled now in sev_es_get/put_ghcb() directly, so there is no
> reason the #VC handler can not run in normal IRQ mode and enjoy the
> benefits like being able to send signals.

You sure?

So #VC cannot happen with IRQs disabled?

	raw_spin_lock_irq(&my_lock);
	<#VC>
		raw_spin_lock_irqsave(&my_lock); // whoopsie

Every exception that can happen with IRQs disabled must be NMI like.

Again, what you seem to want is to split the handler in a from-user and
from-kernel way, just like we did with #DB and MCE. See how
exc_debug_user() is IRQ-like and can send signals, while
exc_debug_kernel() is NMI like and can not.
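
To picture the split, here is a rough userspace sketch. It is illustrative
only and not code from this thread: every sk_* name is made up, and the
real decision would be taken on user_mode(regs) in the entry code. It only
shows that the entry origin selects the handling style, and that signals
are tied to the IRQ-like path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduced register state: only the entry origin matters. */
struct sk_regs {
	bool from_user;
};

enum sk_path {
	SK_IRQ_LIKE,	/* from user mode: normal IRQ-style entry */
	SK_NMI_LIKE,	/* from kernel mode: may hit IRQs-off regions */
};

/* Pick the handling style based on where the exception came from. */
enum sk_path sk_vc_entry_path(const struct sk_regs *regs)
{
	assert(regs);

	return regs->from_user ? SK_IRQ_LIKE : SK_NMI_LIKE;
}

/* Only the IRQ-like path is allowed to send signals. */
bool sk_may_send_signal(enum sk_path path)
{
	return path == SK_IRQ_LIKE;
}
```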




* Re: [PATCH v3 4/7] x86/sev-es: Run #VC handler in plain IRQ state
  2021-06-08 11:58   ` Peter Zijlstra
@ 2021-06-08 13:25     ` Joerg Roedel
  0 siblings, 0 replies; 10+ messages in thread
From: Joerg Roedel @ 2021-06-08 13:25 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: x86, Joerg Roedel, hpa, Andy Lutomirski, Dave Hansen, Jiri Slaby,
	Dan Williams, Tom Lendacky, Juergen Gross, Kees Cook,
	David Rientjes, Cfir Cohen, Erdem Aktas, Masami Hiramatsu,
	Mike Stunes, Sean Christopherson, Martin Radev, Arvind Sankar,
	linux-coco, linux-kernel, kvm, virtualization

Hi Peter,

On Tue, Jun 08, 2021 at 01:58:47PM +0200, Peter Zijlstra wrote:
> So #VC cannot happen with IRQs disabled?
> 
> 	raw_spin_lock_irq(&my_lock);
> 	<#VC>
> 		raw_spin_lock_irqsave(&my_lock); // whoopsie
> 
> Every exception that can happen with IRQs disabled must be NMI like.
> 
> Again, what you seem to want is to split the handler in a from-user and
> from-kernel way, just like we did with #DB and MCE. See how
> exc_debug_user() is IRQ-like and can send signals, while
> exc_debug_kernel() is NMI like and can not.

You are right, thanks for pointing this out. I replaced that patch with
one that splits the handler into a from-user and a from-kernel part.
Initial testing looks good; I will send it out later this week.

Thanks,

	Joerg



Thread overview: 10+ messages
2021-06-08  9:54 [PATCH v3 0/7] x86/sev-es: Fixes for SEV-ES Guest Support Joerg Roedel
2021-06-08  9:54 ` [PATCH v3 1/7] x86/ioremap: Map efi_mem_reserve() memory as encrypted for SEV Joerg Roedel
2021-06-08  9:54 ` [PATCH v3 2/7] x86/sev-es: Fix error message in runtime #VC handler Joerg Roedel
2021-06-08  9:54 ` [PATCH v3 3/7] x86/sev-es: Disable IRQs while GHCB is active Joerg Roedel
2021-06-08  9:54 ` [PATCH v3 4/7] x86/sev-es: Run #VC handler in plain IRQ state Joerg Roedel
2021-06-08 11:58   ` Peter Zijlstra
2021-06-08 13:25     ` Joerg Roedel
2021-06-08  9:54 ` [PATCH v3 5/7] x86/insn-eval: Make 0 a valid RIP for insn_get_effective_ip() Joerg Roedel
2021-06-08  9:54 ` [PATCH v3 6/7] x86/insn: Extend error reporting from insn_fetch_from_user[_inatomic]() Joerg Roedel
2021-06-08  9:54 ` [PATCH v3 7/7] x86/sev-es: Propagate #GP if getting linear instruction address failed Joerg Roedel
