* [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv
From: Tianyu Lan @ 2023-01-22  2:45 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

This patchset adds AMD SEV-SNP enlightened guest support
on Hyper-V. Hyper-V uses Linux direct boot mode to boot
the Linux kernel, so the kernel needs to pvalidate
system memory by itself.

In the Hyper-V case there is no boot loader, so the cc blob
is prepared by the hypervisor. In this series, the hypervisor
sets the cc blob address directly in the boot parameters
of the Linux kernel. If the magic number at the cc blob
address is valid, the kernel reads the cc blob.
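
The magic check described above can be sketched in plain C as follows.
This is only an illustration: the magic value is assumed to match
CC_BLOB_SEV_HDR_MAGIC in arch/x86/include/asm/sev.h, the header layout
is reduced to its leading fields, and cc_blob_is_valid() is a
hypothetical helper name that does not appear in this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed value, mirroring CC_BLOB_SEV_HDR_MAGIC ("AMDE" in little endian). */
#define CC_BLOB_MAGIC 0x45444d41u

/* Assumed layout; only the leading magic field matters for this sketch. */
struct cc_blob_hdr {
	uint32_t magic;
	uint16_t version;
	uint16_t reserved;
};

/* Hypothetical helper: return 1 if the buffer starts with a valid magic. */
int cc_blob_is_valid(const void *addr, size_t len)
{
	struct cc_blob_hdr hdr;

	if (!addr || len < sizeof(hdr))
		return 0;

	/* Copy the header out instead of dereferencing: addr may be unaligned. */
	memcpy(&hdr, addr, sizeof(hdr));
	return hdr.magic == CC_BLOB_MAGIC;
}
```

A caller would treat a zero return as "no cc blob present" and skip
reading the rest of the structure.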

Memory shared between the guest and the hypervisor should be
decrypted and then zeroed. The data at the target address may
be smeared after decryption, so zero the memory to avoid using
smeared data.

Introduce #HV exception support in the AMD SEV-SNP code and
add a #HV handler.

Changes since v2:
       - Remove the code that validates kernel memory at boot stage
       - Split the #HV page patch into two parts
       - Remove the HV-APIC change because x2apic is enabled from
         the host side
       - Rework vmbus code to handle page decryption errors
       - Split the memory and cpu initialization patch

Changes since v1:
       - Remove boot param changes for the cc blob address and
         use the setup header to pass cc blob info
       - Remove unnecessary WARN and BUG checks
       - Add system vector table map in the #HV exception
       - Fix interrupt exit issue when using the #HV exception
Ashish Kalra (2):
  x86/sev: optimize system vector processing invoked from #HV exception
  x86/sev: Fix interrupt exit code paths from #HV exception

Tianyu Lan (14):
  x86/hyperv: Add sev-snp enlightened guest specific config
  x86/hyperv: Decrypt hv vp assist page in sev-snp enlightened guest
  x86/hyperv: Set Virtual Trust Level in vmbus init message
  x86/hyperv: Use vmmcall to implement Hyper-V hypercall in sev-snp
    enlightened guest
  clocksource/drivers/hyper-v: decrypt hyperv tsc page in sev-snp
    enlightened guest
  x86/hyperv: decrypt vmbus pages for sev-snp enlightened guest
  drivers: hv: Decrypt percpu hvcall input arg page in sev-snp
    enlightened guest
  x86/hyperv: Initialize cpu and memory for sev-snp enlightened guest
  x86/hyperv: SEV-SNP enlightened guest don't support legacy rtc
  x86/hyperv: Add smp support for sev-snp guest
  x86/hyperv: Add hyperv-specific hadling for VMMCALL under SEV-ES
  x86/sev: Add a #HV exception handler
  x86/sev: Add Check of #HV event in path
  x86/sev: Initialize #HV doorbell and handle interrupt requests

 arch/x86/entry/entry_64.S             |  82 ++++++
 arch/x86/hyperv/hv_init.c             |  43 +++
 arch/x86/hyperv/ivm.c                 |  10 +
 arch/x86/include/asm/cpu_entry_area.h |   6 +
 arch/x86/include/asm/hyperv-tlfs.h    |   4 +
 arch/x86/include/asm/idtentry.h       | 105 ++++++-
 arch/x86/include/asm/irqflags.h       |  10 +
 arch/x86/include/asm/mem_encrypt.h    |   2 +
 arch/x86/include/asm/mshyperv.h       |  56 +++-
 arch/x86/include/asm/msr-index.h      |   6 +
 arch/x86/include/asm/page_64_types.h  |   1 +
 arch/x86/include/asm/sev.h            |  13 +
 arch/x86/include/asm/svm.h            |  59 +++-
 arch/x86/include/asm/trapnr.h         |   1 +
 arch/x86/include/asm/traps.h          |   1 +
 arch/x86/include/asm/x86_init.h       |   2 +
 arch/x86/include/uapi/asm/svm.h       |   4 +
 arch/x86/kernel/cpu/common.c          |   1 +
 arch/x86/kernel/cpu/mshyperv.c        | 228 ++++++++++++++-
 arch/x86/kernel/dumpstack_64.c        |   9 +-
 arch/x86/kernel/idt.c                 |   1 +
 arch/x86/kernel/sev.c                 | 395 ++++++++++++++++++++++----
 arch/x86/kernel/traps.c               |  42 +++
 arch/x86/kernel/vmlinux.lds.S         |   7 +
 arch/x86/kernel/x86_init.c            |   4 +-
 arch/x86/mm/cpu_entry_area.c          |   2 +
 drivers/clocksource/hyperv_timer.c    |   2 +-
 drivers/hv/connection.c               |   1 +
 drivers/hv/hv.c                       |  33 ++-
 drivers/hv/hv_common.c                |  26 +-
 include/asm-generic/hyperv-tlfs.h     |  19 ++
 include/asm-generic/mshyperv.h        |   2 +
 include/linux/hyperv.h                |   4 +-
 33 files changed, 1102 insertions(+), 79 deletions(-)

-- 
2.25.1



* [RFC PATCH V3 01/16] x86/hyperv: Add sev-snp enlightened guest specific config
From: Tianyu Lan @ 2023-01-22  2:45 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

Introduce the static key isolation_type_en_snp for the enlightened
guest check and add some SEV-SNP specific options in
ms_hyperv_init_platform().

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
 arch/x86/hyperv/ivm.c           | 10 ++++++++++
 arch/x86/include/asm/mshyperv.h |  3 +++
 arch/x86/kernel/cpu/mshyperv.c  | 16 +++++++++++++++-
 drivers/hv/hv_common.c          |  6 ++++++
 4 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
index abca9431d068..8c5dd8e4eb1e 100644
--- a/arch/x86/hyperv/ivm.c
+++ b/arch/x86/hyperv/ivm.c
@@ -386,6 +386,16 @@ bool hv_is_isolation_supported(void)
 }
 
 DEFINE_STATIC_KEY_FALSE(isolation_type_snp);
+DEFINE_STATIC_KEY_FALSE(isolation_type_en_snp);
+
+/*
+ * hv_isolation_type_en_snp - Check system runs in the AMD SEV-SNP based
+ * isolation enlightened VM.
+ */
+bool hv_isolation_type_en_snp(void)
+{
+	return static_branch_unlikely(&isolation_type_en_snp);
+}
 
 /*
  * hv_isolation_type_snp - Check system runs in the AMD SEV-SNP based
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 010768d40155..285df71150e4 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -14,6 +14,7 @@
 union hv_ghcb;
 
 DECLARE_STATIC_KEY_FALSE(isolation_type_snp);
+DECLARE_STATIC_KEY_FALSE(isolation_type_en_snp);
 
 typedef int (*hyperv_fill_flush_list_func)(
 		struct hv_guest_mapping_flush_list *flush,
@@ -28,6 +29,8 @@ extern void *hv_hypercall_pg;
 
 extern u64 hv_current_partition_id;
 
+extern bool hv_isolation_type_en_snp(void);
+
 extern union hv_ghcb * __percpu *hv_ghcb_pg;
 
 int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages);
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 8f83ceec45dc..ace5901ba0fc 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -273,6 +273,18 @@ static void __init ms_hyperv_init_platform(void)
 
 	hv_max_functions_eax = cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS);
 
+	/*
+	 * Add custom configuration for SEV-SNP Enlightened guest
+	 */
+	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
+		ms_hyperv.features |= HV_ACCESS_FREQUENCY_MSRS;
+		ms_hyperv.misc_features |= HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
+		ms_hyperv.misc_features &= ~HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
+		ms_hyperv.hints |= HV_DEPRECATING_AEOI_RECOMMENDED;
+		ms_hyperv.hints |= HV_X64_APIC_ACCESS_RECOMMENDED;
+		ms_hyperv.hints |= HV_X64_CLUSTER_IPI_RECOMMENDED;
+	}
+
 	pr_info("Hyper-V: privilege flags low 0x%x, high 0x%x, hints 0x%x, misc 0x%x\n",
 		ms_hyperv.features, ms_hyperv.priv_high, ms_hyperv.hints,
 		ms_hyperv.misc_features);
@@ -331,7 +343,9 @@ static void __init ms_hyperv_init_platform(void)
 		pr_info("Hyper-V: Isolation Config: Group A 0x%x, Group B 0x%x\n",
 			ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b);
 
-		if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP)
+		if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+			static_branch_enable(&isolation_type_en_snp);
+		else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP)
 			static_branch_enable(&isolation_type_snp);
 	}
 
diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index 566735f35c28..f788c64de0bd 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -268,6 +268,12 @@ bool __weak hv_isolation_type_snp(void)
 }
 EXPORT_SYMBOL_GPL(hv_isolation_type_snp);
 
+bool __weak hv_isolation_type_en_snp(void)
+{
+	return false;
+}
+EXPORT_SYMBOL_GPL(hv_isolation_type_en_snp);
+
 void __weak hv_setup_vmbus_handler(void (*handler)(void))
 {
 }
-- 
2.25.1
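
The second mshyperv.c hunk above makes the two static keys mutually
exclusive: the enlightened check is tested first, and the existing
isolation_type_snp key is enabled only when it does not match. A
minimal userspace sketch of that selection order follows. struct
iso_keys and select_isolation_keys() are hypothetical stand-ins for
the static keys, and the value 2 for HV_ISOLATION_TYPE_SNP is an
assumption taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the kernel's HV_ISOLATION_TYPE_SNP value (assumed to be 2). */
#define ISOLATION_TYPE_SNP 2

/* Hypothetical stand-ins for the two static keys. */
struct iso_keys {
	bool snp;
	bool en_snp;
};

/* Pick at most one key, giving the enlightened case precedence. */
struct iso_keys select_isolation_keys(bool guest_sev_snp, int hv_isolation_type)
{
	struct iso_keys keys = { false, false };

	if (guest_sev_snp)
		keys.en_snp = true;
	else if (hv_isolation_type == ISOLATION_TYPE_SNP)
		keys.snp = true;

	return keys;
}
```

With this ordering, code guarded by hv_isolation_type_snp() is not
taken in an enlightened guest even if the hypervisor also reports the
SNP isolation type.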



* [RFC PATCH V3 02/16] x86/hyperv: Decrypt hv vp assist page in sev-snp enlightened guest
From: Tianyu Lan @ 2023-01-22  2:45 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

The hv vp assist page is shared between the SEV-SNP guest and Hyper-V.
Decrypt the page before using it.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
 arch/x86/hyperv/hv_init.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index a5f9474f08e1..24154c1ee12b 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -29,6 +29,7 @@
 #include <linux/syscore_ops.h>
 #include <clocksource/hyperv_timer.h>
 #include <linux/highmem.h>
+#include <linux/set_memory.h>
 
 int hyperv_init_cpuhp;
 u64 hv_current_partition_id = ~0ull;
@@ -113,6 +114,11 @@ static int hv_cpu_init(unsigned int cpu)
 
 	}
 	if (!WARN_ON(!(*hvp))) {
+		if (hv_isolation_type_en_snp()) {
+			WARN_ON_ONCE(set_memory_decrypted((unsigned long)(*hvp), 1));
+			memset(*hvp, 0, PAGE_SIZE);
+		}
+
 		msr.enable = 1;
 		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, msr.as_uint64);
 	}
-- 
2.25.1



* [RFC PATCH V3 03/16] x86/hyperv: Set Virtual Trust Level in vmbus init message
From: Tianyu Lan @ 2023-01-22  2:45 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

A SEV-SNP guest provides a VTL (Virtual Trust Level). Get it
from Hyper-V with the HVCALL_GET_VP_REGISTERS hypercall and
set the target VTL in the vmbus init message.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
Change since RFC v2:
       * Rename get_current_vtl() to get_vtl()
       * Fix some coding style issues
---
 arch/x86/hyperv/hv_init.c          | 37 ++++++++++++++++++++++++++++++
 arch/x86/include/asm/hyperv-tlfs.h |  4 ++++
 drivers/hv/connection.c            |  1 +
 include/asm-generic/mshyperv.h     |  2 ++
 include/linux/hyperv.h             |  4 ++--
 5 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 24154c1ee12b..9e9757049915 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -384,6 +384,40 @@ static void __init hv_get_partition_id(void)
 	local_irq_restore(flags);
 }
 
+static u8 __init get_vtl(void)
+{
+	u64 control = HV_HYPERCALL_REP_COMP_1 | HVCALL_GET_VP_REGISTERS;
+	struct hv_get_vp_registers_input *input = NULL;
+	struct hv_get_vp_registers_output *output = NULL;
+	u64 vtl = 0;
+	int ret;
+	unsigned long flags;
+
+	local_irq_save(flags);
+	input = *(struct hv_get_vp_registers_input **)this_cpu_ptr(hyperv_pcpu_input_arg);
+	output = (struct hv_get_vp_registers_output *)input;
+	if (!input || !output) {
+		local_irq_restore(flags);
+		goto done;
+	}
+
+	memset(input, 0, sizeof(*input) + sizeof(input->element[0]));
+	input->header.partitionid = HV_PARTITION_ID_SELF;
+	input->header.vpindex = HV_VP_INDEX_SELF;
+	input->header.inputvtl = 0;
+	input->element[0].name0 = HV_X64_REGISTER_VSM_VP_STATUS;
+
+	ret = hv_do_hypercall(control, input, output);
+	if (ret == 0)
+		vtl = output->as64.low & HV_X64_VTL_MASK;
+	else
+		pr_err("Hyper-V: failed to get VTL!");
+	local_irq_restore(flags);
+
+done:
+	return vtl;
+}
+
 /*
  * This function is to be invoked early in the boot sequence after the
  * hypervisor has been detected.
@@ -512,6 +546,9 @@ void __init hyperv_init(void)
 	/* Query the VMs extended capability once, so that it can be cached. */
 	hv_query_ext_cap(0);
 
+	/* Find the VTL */
+	ms_hyperv.vtl = get_vtl();
+
 	return;
 
 clean_guest_os_id:
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index db2202d985bd..6dcbb21aac2b 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -36,6 +36,10 @@
 #define HYPERV_CPUID_MIN			0x40000005
 #define HYPERV_CPUID_MAX			0x4000ffff
 
+/* Support for HVCALL_GET_VP_REGISTERS hvcall */
+#define	HV_X64_REGISTER_VSM_VP_STATUS	0x000D0003
+#define HV_X64_VTL_MASK			GENMASK(3, 0)
+
 /*
  * Group D Features.  The bit assignments are custom to each architecture.
  * On x86/x64 these are HYPERV_CPUID_FEATURES.EDX bits.
diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
index f670cfd2e056..e4c39f4016ad 100644
--- a/drivers/hv/connection.c
+++ b/drivers/hv/connection.c
@@ -98,6 +98,7 @@ int vmbus_negotiate_version(struct vmbus_channel_msginfo *msginfo, u32 version)
 	 */
 	if (version >= VERSION_WIN10_V5) {
 		msg->msg_sint = VMBUS_MESSAGE_SINT;
+		msg->msg_vtl = ms_hyperv.vtl;
 		vmbus_connection.msg_conn_id = VMBUS_MESSAGE_CONNECTION_ID_4;
 	} else {
 		msg->interrupt_page = virt_to_phys(vmbus_connection.int_page);
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index f2c0856f1797..44e56777fea7 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -48,6 +48,7 @@ struct ms_hyperv_info {
 		};
 	};
 	u64 shared_gpa_boundary;
+	u8 vtl;
 };
 extern struct ms_hyperv_info ms_hyperv;
 
@@ -57,6 +58,7 @@ extern void * __percpu *hyperv_pcpu_output_arg;
 extern u64 hv_do_hypercall(u64 control, void *inputaddr, void *outputaddr);
 extern u64 hv_do_fast_hypercall8(u16 control, u64 input8);
 extern bool hv_isolation_type_snp(void);
+extern bool hv_isolation_type_en_snp(void);
 
 /* Helper functions that provide a consistent pattern for checking Hyper-V hypercall status. */
 static inline int hv_result(u64 status)
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 85f7c5a63aa6..65121b21b0af 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -665,8 +665,8 @@ struct vmbus_channel_initiate_contact {
 		u64 interrupt_page;
 		struct {
 			u8	msg_sint;
-			u8	padding1[3];
-			u32	padding2;
+			u8	msg_vtl;
+			u8	reserved[6];
 		};
 	};
 	u64 monitor_page1;
-- 
2.25.1
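
Two details of the patch above can be checked in isolation: the VTL is
taken from the low four bits of the returned register value, and the
new msg_vtl field reuses padding so the anonymous struct still
overlays the 8-byte interrupt_page. The sketch below uses userspace
types; vtl_from_status() and the two struct names are hypothetical,
and VTL_MASK stands in for GENMASK(3, 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's GENMASK(3, 0). */
#define VTL_MASK 0xFull

/* Extract the VTL from the low 64 bits of the VSM VP status register. */
uint8_t vtl_from_status(uint64_t vsm_vp_status_low)
{
	return (uint8_t)(vsm_vp_status_low & VTL_MASK);
}

/* Layout before the patch: one used byte followed by 7 bytes of padding. */
struct contact_v5_old {
	uint8_t  msg_sint;
	uint8_t  padding1[3];
	uint32_t padding2;
};

/* Layout after the patch: msg_vtl takes the first former padding byte. */
struct contact_v5_new {
	uint8_t msg_sint;
	uint8_t msg_vtl;
	uint8_t reserved[6];
};

/* Both layouts must keep the size of the u64 they share a union with. */
_Static_assert(sizeof(struct contact_v5_old) == sizeof(uint64_t), "old size");
_Static_assert(sizeof(struct contact_v5_new) == sizeof(uint64_t), "new size");
_Static_assert(offsetof(struct contact_v5_new, msg_vtl) == 1, "vtl offset");
```

Because the overall size is unchanged, a host that ignores the VTL
byte still sees the same message layout as before.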



* [RFC PATCH V3 04/16] x86/hyperv: Use vmmcall to implement Hyper-V hypercall in sev-snp enlightened guest
From: Tianyu Lan @ 2023-01-22  2:45 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

In a SEV-SNP enlightened guest, a Hyper-V hypercall needs
to use vmmcall to trigger a vmexit and notify the hypervisor
to handle the hypercall request.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
Change since RFC V2:
       * Fix indentation style
---
 arch/x86/include/asm/mshyperv.h | 44 ++++++++++++++++++++++++---------
 1 file changed, 33 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 285df71150e4..1a4af0a4f29a 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -44,16 +44,25 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 	u64 hv_status;
 
 #ifdef CONFIG_X86_64
-	if (!hv_hypercall_pg)
-		return U64_MAX;
+	if (hv_isolation_type_en_snp()) {
+		__asm__ __volatile__("mov %4, %%r8\n"
+				     "vmmcall"
+				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
+				       "+c" (control), "+d" (input_address)
+				     :  "r" (output_address)
+				     : "cc", "memory", "r8", "r9", "r10", "r11");
+	} else {
+		if (!hv_hypercall_pg)
+			return U64_MAX;
 
-	__asm__ __volatile__("mov %4, %%r8\n"
-			     CALL_NOSPEC
-			     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
-			       "+c" (control), "+d" (input_address)
-			     :  "r" (output_address),
-				THUNK_TARGET(hv_hypercall_pg)
-			     : "cc", "memory", "r8", "r9", "r10", "r11");
+		__asm__ __volatile__("mov %4, %%r8\n"
+				     CALL_NOSPEC
+				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
+				       "+c" (control), "+d" (input_address)
+				     :  "r" (output_address),
+					THUNK_TARGET(hv_hypercall_pg)
+				     : "cc", "memory", "r8", "r9", "r10", "r11");
+	}
 #else
 	u32 input_address_hi = upper_32_bits(input_address);
 	u32 input_address_lo = lower_32_bits(input_address);
@@ -81,7 +90,13 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
 	u64 hv_status, control = (u64)code | HV_HYPERCALL_FAST_BIT;
 
 #ifdef CONFIG_X86_64
-	{
+	if (hv_isolation_type_en_snp()) {
+		__asm__ __volatile__(
+				"vmmcall"
+				: "=a" (hv_status), ASM_CALL_CONSTRAINT,
+				"+c" (control), "+d" (input1)
+				:: "cc", "r8", "r9", "r10", "r11");
+	} else {
 		__asm__ __volatile__(CALL_NOSPEC
 				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
 				       "+c" (control), "+d" (input1)
@@ -112,7 +127,14 @@ static inline u64 hv_do_fast_hypercall16(u16 code, u64 input1, u64 input2)
 	u64 hv_status, control = (u64)code | HV_HYPERCALL_FAST_BIT;
 
 #ifdef CONFIG_X86_64
-	{
+	if (hv_isolation_type_en_snp()) {
+		__asm__ __volatile__("mov %4, %%r8\n"
+				     "vmmcall"
+				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
+				       "+c" (control), "+d" (input1)
+				     : "r" (input2)
+				     : "cc", "r8", "r9", "r10", "r11");
+	} else {
 		__asm__ __volatile__("mov %4, %%r8\n"
 				     CALL_NOSPEC
 				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
-- 
2.25.1



* [RFC PATCH V3 05/16] clocksource/drivers/hyper-v: decrypt hyperv tsc page in sev-snp enlightened guest
From: Tianyu Lan @ 2023-01-22  2:45 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

The Hyper-V tsc page is shared with the hypervisor, so it must
be decrypted in a SEV-SNP enlightened guest before it is used.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
Change since RFC V2:
       * Change the Subject line prefix
---
 drivers/clocksource/hyperv_timer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index c0cef92b12b8..44da16ca203c 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -365,7 +365,7 @@ EXPORT_SYMBOL_GPL(hv_stimer_global_cleanup);
 static union {
 	struct ms_hyperv_tsc_page page;
 	u8 reserved[PAGE_SIZE];
-} tsc_pg __aligned(PAGE_SIZE);
+} tsc_pg __bss_decrypted __aligned(PAGE_SIZE);
 
 static struct ms_hyperv_tsc_page *tsc_page = &tsc_pg.page;
 static unsigned long tsc_pfn;
-- 
2.25.1



* [RFC PATCH V3 06/16] x86/hyperv: decrypt vmbus pages for sev-snp enlightened guest
From: Tianyu Lan @ 2023-01-22  2:45 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

The vmbus post msg, synic event and synic message pages are shared
with the hypervisor, so decrypt these pages in the SEV-SNP guest.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
Change since RFC V2:
       * Fix error in the error code path and encrypt
       	 pages correctly when decryption failure happens.
---
 drivers/hv/hv.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 410e6c4e80d3..52edc54c8172 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -20,6 +20,7 @@
 #include <linux/interrupt.h>
 #include <clocksource/hyperv_timer.h>
 #include <asm/mshyperv.h>
+#include <linux/set_memory.h>
 #include "hyperv_vmbus.h"
 
 /* The one and only */
@@ -117,7 +118,7 @@ int hv_post_message(union hv_connection_id connection_id,
 
 int hv_synic_alloc(void)
 {
-	int cpu;
+	int cpu, ret;
 	struct hv_per_cpu_context *hv_cpu;
 
 	/*
@@ -168,9 +169,39 @@ int hv_synic_alloc(void)
 			pr_err("Unable to allocate post msg page\n");
 			goto err;
 		}
+
+		if (hv_isolation_type_en_snp()) {
+			ret = set_memory_decrypted((unsigned long)
+				hv_cpu->synic_message_page, 1);
+			if (ret)
+				goto err;
+
+			ret = set_memory_decrypted((unsigned long)
+				hv_cpu->synic_event_page, 1);
+			if (ret)
+				goto err_decrypt_event_page;
+
+			ret = set_memory_decrypted((unsigned long)
+				hv_cpu->post_msg_page, 1);
+			if (ret)
+				goto err_decrypt_msg_page;
+
+			memset(hv_cpu->synic_message_page, 0, PAGE_SIZE);
+			memset(hv_cpu->synic_event_page, 0, PAGE_SIZE);
+			memset(hv_cpu->post_msg_page, 0, PAGE_SIZE);
+		}
 	}
 
 	return 0;
+
+err_decrypt_msg_page:
+	set_memory_encrypted((unsigned long)
+		hv_cpu->synic_event_page, 1);
+
+err_decrypt_event_page:
+	set_memory_encrypted((unsigned long)
+		hv_cpu->synic_message_page, 1);
+
 err:
 	/*
 	 * Any memory allocations that succeeded will be freed when
-- 
2.25.1
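
The error labels added to hv_synic_alloc() above follow the usual
goto-unwind pattern: each failure jumps to a label that re-encrypts
only the pages already decrypted. The userspace sketch below models
one loop iteration with three pages. mock_set_decrypted(),
mock_set_encrypted(), fail_at and share_three_pages() are all
hypothetical; the real code calls set_memory_decrypted() and
set_memory_encrypted().

```c
#include <assert.h>
#include <stdbool.h>

#define NPAGES 3

/* Mock state: true means the page is currently shared (decrypted). */
bool page_shared[NPAGES];
/* Index of the page whose decryption should fail, or -1 for none. */
int fail_at = -1;

int mock_set_decrypted(int idx)
{
	if (idx == fail_at)
		return -1;
	page_shared[idx] = true;
	return 0;
}

int mock_set_encrypted(int idx)
{
	page_shared[idx] = false;
	return 0;
}

/* Decrypt three pages in order; on failure undo the earlier ones. */
int share_three_pages(void)
{
	int ret;

	ret = mock_set_decrypted(0);
	if (ret)
		goto err;

	ret = mock_set_decrypted(1);
	if (ret)
		goto err_page1;

	ret = mock_set_decrypted(2);
	if (ret)
		goto err_page2;

	return 0;

err_page2:
	mock_set_encrypted(1);	/* page 2 failed: undo page 1, then fall through */
err_page1:
	mock_set_encrypted(0);	/* page 1 or later failed: undo page 0 */
err:
	return ret;
}
```

The sketch covers a single iteration only. How pages decrypted for
earlier CPUs are treated on the shared error path is outside its
scope.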



* [RFC PATCH V3 07/16] drivers: hv: Decrypt percpu hvcall input arg page in sev-snp enlightened guest
From: Tianyu Lan @ 2023-01-22  2:45 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

The hypervisor needs to access the input arg page, so the guest
should decrypt the page.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
Change since RFC V2:
	* Set inputarg to NULL after kfree()
	* Do not free mem when encrypting it fails in hv_common_cpu_die().
---
 drivers/hv/hv_common.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index f788c64de0bd..205b6380d794 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -21,6 +21,7 @@
 #include <linux/ptrace.h>
 #include <linux/slab.h>
 #include <linux/dma-map-ops.h>
+#include <linux/set_memory.h>
 #include <asm/hyperv-tlfs.h>
 #include <asm/mshyperv.h>
 
@@ -125,6 +126,7 @@ int hv_common_cpu_init(unsigned int cpu)
 	u64 msr_vp_index;
 	gfp_t flags;
 	int pgcount = hv_root_partition ? 2 : 1;
+	int ret;
 
 	/* hv_cpu_init() can be called with IRQs disabled from hv_resume() */
 	flags = irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL;
@@ -134,6 +136,17 @@ int hv_common_cpu_init(unsigned int cpu)
 	if (!(*inputarg))
 		return -ENOMEM;
 
+	if (hv_isolation_type_en_snp()) {
+		ret = set_memory_decrypted((unsigned long)*inputarg, pgcount);
+		if (ret) {
+			kfree(*inputarg);
+			*inputarg = NULL;
+			return ret;
+		}
+
+		memset(*inputarg, 0x00, PAGE_SIZE);
+	}
+
 	if (hv_root_partition) {
 		outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
 		*outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE;
@@ -168,7 +181,12 @@ int hv_common_cpu_die(unsigned int cpu)
 
 	local_irq_restore(flags);
 
-	kfree(mem);
+	if (hv_isolation_type_en_snp()) {
+		if (!set_memory_encrypted((unsigned long)mem, 1))
+			kfree(mem);
+	} else {
+		kfree(mem);
+	}
 
 	return 0;
 }
-- 
2.25.1
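
The hv_common_cpu_die() hunk above frees the page only when it could
be switched back to encrypted, presumably so that memory still shared
with the hypervisor is not handed back to the allocator. A userspace
sketch of that decision follows. mock_reencrypt(),
reencrypt_should_fail and release_shared_page() are hypothetical
names, and free() stands in for kfree().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Test knob: make the mocked re-encryption fail. */
bool reencrypt_should_fail;

/* Mock of set_memory_encrypted(): 0 on success, non-zero on failure. */
int mock_reencrypt(void *mem)
{
	(void)mem;
	return reencrypt_should_fail ? -1 : 0;
}

/*
 * Release a page that may have been shared with the hypervisor.
 * Returns true if the memory was freed, false if it was leaked on purpose.
 */
bool release_shared_page(void *mem, bool was_shared)
{
	if (was_shared && mock_reencrypt(mem) != 0)
		return false;	/* still shared: leaking is safer than reuse */

	free(mem);
	return true;
}
```

Leaking one page on this rare failure path trades a small amount of
memory for not reusing a page whose encryption state is unknown.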



* [RFC PATCH V3 08/16] x86/hyperv: Initialize cpu and memory for sev-snp enlightened guest
From: Tianyu Lan @ 2023-01-22  2:45 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

Read processor and memory info from a specific address that is
populated by Hyper-V. Initialize smp cpu related ops, pvalidate
system memory and add it to the e820 table.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
 arch/x86/kernel/cpu/mshyperv.c | 85 ++++++++++++++++++++++++++++++++++
 1 file changed, 85 insertions(+)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index ace5901ba0fc..b1871a7bb4c9 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -32,6 +32,12 @@
 #include <asm/nmi.h>
 #include <clocksource/hyperv_timer.h>
 #include <asm/numa.h>
+#include <asm/coco.h>
+#include <asm/io_apic.h>
+#include <asm/svm.h>
+#include <asm/sev.h>
+#include <asm/realmode.h>
+#include <asm/e820/api.h>
 
 /* Is Linux running as the root partition? */
 bool hv_root_partition;
@@ -251,6 +257,30 @@ static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
 }
 #endif
 
+static u32 processor_count;
+
+static __init void hv_snp_get_smp_config(unsigned int early)
+{
+	if (!early) {
+		while (num_processors < processor_count) {
+			early_per_cpu(x86_cpu_to_apicid, num_processors) = num_processors;
+			early_per_cpu(x86_bios_cpu_apicid, num_processors) = num_processors;
+			physid_set(num_processors, phys_cpu_present_map);
+			set_cpu_possible(num_processors, true);
+			set_cpu_present(num_processors, true);
+			num_processors++;
+		}
+	}
+}
+
+struct memory_map_entry {
+	u64 starting_gpn;
+	u64 numpages;
+	u16 type;
+	u16 flags;
+	u32 reserved;
+};
+
 static void __init ms_hyperv_init_platform(void)
 {
 	int hv_max_functions_eax;
@@ -258,6 +288,11 @@ static void __init ms_hyperv_init_platform(void)
 	int hv_host_info_ebx;
 	int hv_host_info_ecx;
 	int hv_host_info_edx;
+	struct memory_map_entry *entry;
+	struct e820_entry *e820_entry;
+	u64 e820_end;
+	u64 ram_end;
+	u64 page;
 
 #ifdef CONFIG_PARAVIRT
 	pv_info.name = "Hyper-V";
@@ -466,6 +501,56 @@ static void __init ms_hyperv_init_platform(void)
 	if (!(ms_hyperv.features & HV_ACCESS_TSC_INVARIANT))
 		mark_tsc_unstable("running on Hyper-V");
 
+	if (isolation_type_en_snp()) {
+		/*
+		 * Hyper-V enlightened snp guest boots kernel
+		 * directly without bootloader and so roms,
+		 * bios regions and reserve resources are not
+		 * available. Set these callback to NULL.
+		 */
+		x86_platform.legacy.reserve_bios_regions = x86_init_noop;
+		x86_init.resources.probe_roms = x86_init_noop;
+		x86_init.resources.reserve_resources = x86_init_noop;
+		x86_init.mpparse.find_smp_config = x86_init_noop;
+		x86_init.mpparse.get_smp_config = hv_snp_get_smp_config;
+
+		/*
+		 * Hyper-V SEV-SNP enlightened guest doesn't support ioapic
+		 * and legacy APIC page read/write. Switch to hv apic here.
+		 */
+		disable_ioapic_support();
+
+		/* Read processor number and memory layout. */
+		processor_count = *(u32 *)__va(EN_SEV_SNP_PROCESSOR_INFO_ADDR);
+		entry = (struct memory_map_entry *)(__va(EN_SEV_SNP_PROCESSOR_INFO_ADDR)
+				+ sizeof(struct memory_map_entry));
+
+		/*
+		 * The E820 table in memory describes only the memory for
+		 * the kernel, ACPI tables, cmdline, boot params and ramdisk.
+		 * Hyper-V populates the rest of the memory layout at
+		 * EN_SEV_SNP_PROCESSOR_INFO_ADDR.
+		 */
+		for (; entry->numpages != 0; entry++) {
+			e820_entry = &e820_table->entries[
+					e820_table->nr_entries - 1];
+			e820_end = e820_entry->addr + e820_entry->size;
+			ram_end = (entry->starting_gpn +
+				   entry->numpages) * PAGE_SIZE;
+
+			if (e820_end < entry->starting_gpn * PAGE_SIZE)
+				e820_end = entry->starting_gpn * PAGE_SIZE;
+
+			if (e820_end < ram_end) {
+				pr_info("Hyper-V: add e820 entry [mem %#018Lx-%#018Lx]\n", e820_end, ram_end - 1);
+				e820__range_add(e820_end, ram_end - e820_end,
+						E820_TYPE_RAM);
+				for (page = e820_end; page < ram_end; page += PAGE_SIZE)
+					pvalidate((unsigned long)__va(page), RMP_PG_SIZE_4K, true);
+			}
+		}
+	}
+
 	hardlockup_detector_disable();
 }
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [RFC PATCH V3 09/16] x86/hyperv: SEV-SNP enlightened guest don't support legacy rtc
  2023-01-22  2:45 [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
                   ` (7 preceding siblings ...)
  2023-01-22  2:45 ` [RFC PATCH V3 08/16] x86/hyperv: Initialize cpu and memory for " Tianyu Lan
@ 2023-01-22  2:45 ` Tianyu Lan
  2023-01-31 14:03   ` Wei Liu
  2023-01-22  2:46 ` [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest Tianyu Lan
                   ` (8 subsequent siblings)
  17 siblings, 1 reply; 60+ messages in thread
From: Tianyu Lan @ 2023-01-22  2:45 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

An SEV-SNP enlightened guest doesn't support the legacy RTC. Set
x86_platform.legacy.rtc to 0 and point x86_platform.set_wallclock
and get_wallclock at the noop helpers. Make set_rtc_noop() and
get_rtc_noop() public so that ms_hyperv_init_platform() can reuse
them.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
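For readers unfamiliar with the x86_platform ops pattern, here is a
minimal user-space sketch of what the override amounts to: a table of
function pointers whose RTC entries are replaced by stubs. Every name
prefixed with sketch_ is hypothetical and only stands in for the real
kernel structures.

```c
#include <errno.h>

/* Hypothetical stand-in for struct timespec64. */
struct sketch_time {
	long long sec;
};

/* Hypothetical, reduced stand-in for x86_platform. */
struct sketch_platform_ops {
	int legacy_rtc;
	int (*set_wallclock)(const struct sketch_time *now);
	void (*get_wallclock)(struct sketch_time *now);
};

/* Setting the clock is rejected, as set_rtc_noop() does in the patch. */
static int sketch_set_rtc_noop(const struct sketch_time *now)
{
	(void)now;
	return -EINVAL;
}

/* Reading the clock leaves the caller's value untouched. */
static void sketch_get_rtc_noop(struct sketch_time *now)
{
	(void)now;
}

/* Mark the legacy RTC as absent and install the stubs. */
static void sketch_disable_rtc(struct sketch_platform_ops *ops)
{
	ops->legacy_rtc = 0;
	ops->set_wallclock = sketch_set_rtc_noop;
	ops->get_wallclock = sketch_get_rtc_noop;
}
```

Callers keep invoking the pointers unconditionally; they simply get
-EINVAL on set and an unchanged timestamp on get.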
 arch/x86/include/asm/mshyperv.h | 7 ++++++-
 arch/x86/include/asm/x86_init.h | 2 ++
 arch/x86/kernel/cpu/mshyperv.c  | 3 +++
 arch/x86/kernel/x86_init.c      | 4 ++--
 4 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 1a4af0a4f29a..7266d71d30d6 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -33,6 +33,12 @@ extern bool hv_isolation_type_en_snp(void);
 
 extern union hv_ghcb * __percpu *hv_ghcb_pg;
 
+/*
+ * Hyper-V puts processor and memory layout info
+ * to this address in SEV-SNP enlightened guest.
+ */
+#define EN_SEV_SNP_PROCESSOR_INFO_ADDR	0x802000
+
 int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages);
 int hv_call_add_logical_proc(int node, u32 lp_index, u32 acpi_id);
 int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags);
@@ -267,7 +273,6 @@ static inline void hv_set_register(unsigned int reg, u64 value) { }
 static inline u64 hv_get_register(unsigned int reg) { return 0; }
 #endif /* CONFIG_HYPERV */
 
-
 #include <asm-generic/mshyperv.h>
 
 #endif
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index c1c8c581759d..d8fb3a1639e9 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -326,5 +326,7 @@ extern void x86_init_uint_noop(unsigned int unused);
 extern bool bool_x86_init_noop(void);
 extern void x86_op_int_noop(int cpu);
 extern bool x86_pnpbios_disabled(void);
+extern int set_rtc_noop(const struct timespec64 *now);
+extern void get_rtc_noop(struct timespec64 *now);
 
 #endif
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index b1871a7bb4c9..197c8f2ec4eb 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -508,6 +508,9 @@ static void __init ms_hyperv_init_platform(void)
 		 * bios regions and reserve resources are not
 		 * available. Set these callback to NULL.
 		 */
+		x86_platform.legacy.rtc = 0;
+		x86_platform.set_wallclock = set_rtc_noop;
+		x86_platform.get_wallclock = get_rtc_noop;
 		x86_platform.legacy.reserve_bios_regions = x86_init_noop;
 		x86_init.resources.probe_roms = x86_init_noop;
 		x86_init.resources.reserve_resources = x86_init_noop;
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index ef80d361b463..d93aeffec19b 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -33,8 +33,8 @@ static int __init iommu_init_noop(void) { return 0; }
 static void iommu_shutdown_noop(void) { }
 bool __init bool_x86_init_noop(void) { return false; }
 void x86_op_int_noop(int cpu) { }
-static __init int set_rtc_noop(const struct timespec64 *now) { return -EINVAL; }
-static __init void get_rtc_noop(struct timespec64 *now) { }
+int set_rtc_noop(const struct timespec64 *now) { return -EINVAL; }
+void get_rtc_noop(struct timespec64 *now) { }
 
 static __initconst const struct of_device_id of_cmos_match[] = {
 	{ .compatible = "motorola,mc146818" },
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest
  2023-01-22  2:45 [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
                   ` (8 preceding siblings ...)
  2023-01-22  2:45 ` [RFC PATCH V3 09/16] x86/hyperv: SEV-SNP enlightened guest don't support legacy rtc Tianyu Lan
@ 2023-01-22  2:46 ` Tianyu Lan
  2023-01-23 15:30   ` Tom Lendacky
  2023-01-31 18:34   ` Michael Kelley (LINUX)
  2023-01-22  2:46 ` [RFC PATCH V3 11/16] x86/hyperv: Add hyperv-specific handling for VMMCALL under SEV-ES Tianyu Lan
                   ` (7 subsequent siblings)
  17 siblings, 2 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-01-22  2:46 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

The wakeup_secondary_cpu callback was populated with
wakeup_cpu_via_vmgexit(), which doesn't work on Hyper-V. Override
it with a Hyper-V specific hook that uses the
HVCALL_START_VIRTUAL_PROCESSOR hypercall to start an AP with a
VMSA data structure.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
Change since RFC v2:
       * Add helper function to initialize segment
       * Fix some coding style
---
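A note on hv_populate_vmcb_seg(): the macro reads bytes 5 and 6 of a GDT
descriptor as one 16-bit value and packs it into the 12-bit attribute
format used by the VMSA. The low byte (access byte) is kept as is, and
the flags nibble in bits 12..15 is shifted down to bits 8..11, dropping
the limit[19:16] nibble in between. The user-space sketch below
reproduces only that bit manipulation; sketch_pack_seg_attrib is a
made-up name.

```c
#include <stdint.h>

/*
 * Pack the attribute bits of an 8-byte GDT descriptor into the
 * 12-bit VMSA layout: bits 0..7 = access byte, bits 8..11 = flags
 * (G, D/B, L, AVL).
 */
static uint16_t sketch_pack_seg_attrib(const uint8_t desc[8])
{
	/* Bytes 5 and 6, assembled little-endian as the macro reads them. */
	uint16_t raw = (uint16_t)(desc[5] | (desc[6] << 8));

	return (uint16_t)((raw & 0xFF) | ((raw >> 4) & 0xF00));
}
```

For the usual 64-bit kernel code descriptor 0x00af9b000000ffff this
gives 0xa9b, and for the flat data descriptor 0x00cf93000000ffff it
gives 0xc93.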
 arch/x86/include/asm/mshyperv.h   |   2 +
 arch/x86/include/asm/sev.h        |  13 ++++
 arch/x86/include/asm/svm.h        |  47 +++++++++++++
 arch/x86/kernel/cpu/mshyperv.c    | 112 ++++++++++++++++++++++++++++--
 include/asm-generic/hyperv-tlfs.h |  19 +++++
 5 files changed, 189 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 7266d71d30d6..c69051eec0e1 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -203,6 +203,8 @@ struct irq_domain *hv_create_pci_msi_domain(void);
 int hv_map_ioapic_interrupt(int ioapic_id, bool level, int vcpu, int vector,
 		struct hv_interrupt_entry *entry);
 int hv_unmap_ioapic_interrupt(int ioapic_id, struct hv_interrupt_entry *entry);
+int hv_set_mem_host_visibility(unsigned long addr, int numpages, bool visible);
+int hv_snp_boot_ap(int cpu, unsigned long start_ip);
 
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 void hv_ghcb_msr_write(u64 msr, u64 value);
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index ebc271bb6d8e..e34aaf730220 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -86,6 +86,19 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 
 #define RMPADJUST_VMSA_PAGE_BIT		BIT(16)
 
+union sev_rmp_adjust {
+	u64 as_uint64;
+	struct {
+		unsigned long target_vmpl : 8;
+		unsigned long enable_read : 1;
+		unsigned long enable_write : 1;
+		unsigned long enable_user_execute : 1;
+		unsigned long enable_kernel_execute : 1;
+		unsigned long reserved1 : 4;
+		unsigned long vmsa : 1;
+	};
+};
+
 /* SNP Guest message request */
 struct snp_req_data {
 	unsigned long req_gpa;
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index cb1ee53ad3b1..f8b321a11ee4 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -336,6 +336,53 @@ struct vmcb_save_area {
 	u64 last_excp_to;
 	u8 reserved_0x298[72];
 	u32 spec_ctrl;		/* Guest version of SPEC_CTRL at 0x2E0 */
+	u8 reserved_7b[4];
+	u32 pkru;
+	u8 reserved_7a[20];
+	u64 reserved_8;		/* rax already available at 0x01f8 */
+	u64 rcx;
+	u64 rdx;
+	u64 rbx;
+	u64 reserved_9;		/* rsp already available at 0x01d8 */
+	u64 rbp;
+	u64 rsi;
+	u64 rdi;
+	u64 r8;
+	u64 r9;
+	u64 r10;
+	u64 r11;
+	u64 r12;
+	u64 r13;
+	u64 r14;
+	u64 r15;
+	u8 reserved_10[16];
+	u64 sw_exit_code;
+	u64 sw_exit_info_1;
+	u64 sw_exit_info_2;
+	u64 sw_scratch;
+	union {
+		u64 sev_features;
+		struct {
+			u64 sev_feature_snp			: 1;
+			u64 sev_feature_vtom			: 1;
+			u64 sev_feature_reflectvc		: 1;
+			u64 sev_feature_restrict_injection	: 1;
+			u64 sev_feature_alternate_injection	: 1;
+			u64 sev_feature_full_debug		: 1;
+			u64 sev_feature_reserved1		: 1;
+			u64 sev_feature_snpbtb_isolation	: 1;
+			u64 sev_feature_resrved2		: 56;
+		};
+	};
+	u64 vintr_ctrl;
+	u64 guest_error_code;
+	u64 virtual_tom;
+	u64 tlb_id;
+	u64 pcpu_id;
+	u64 event_inject;
+	u64 xcr0;
+	u8 valid_bitmap[16];
+	u64 x87_state_gpa;
 } __packed;
 
 /* Save area definition for SEV-ES and SEV-SNP guests */
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 197c8f2ec4eb..9d547751a1a7 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -39,6 +39,13 @@
 #include <asm/realmode.h>
 #include <asm/e820/api.h>
 
+/*
+ * DEFAULT INIT GPAT and SEGMENT LIMIT value in struct VMSA
+ * to start AP in enlightened SEV guest.
+ */
+#define HV_AP_INIT_GPAT_DEFAULT		0x0007040600070406ULL
+#define HV_AP_SEGMENT_LIMIT		0xffffffff
+
 /* Is Linux running as the root partition? */
 bool hv_root_partition;
 struct ms_hyperv_info ms_hyperv;
@@ -230,6 +237,94 @@ static void __init hv_smp_prepare_boot_cpu(void)
 #endif
 }
 
+static u8 ap_start_input_arg[PAGE_SIZE] __bss_decrypted __aligned(PAGE_SIZE);
+static u8 ap_start_stack[PAGE_SIZE] __aligned(PAGE_SIZE);
+
+#define hv_populate_vmcb_seg(seg, gdtr_base)			\
+do {								\
+	if (seg.selector) {					\
+		seg.base = 0;					\
+		seg.limit = HV_AP_SEGMENT_LIMIT;		\
+		seg.attrib = *(u16 *)(gdtr_base + seg.selector + 5);	\
+		seg.attrib = (seg.attrib & 0xFF) | ((seg.attrib >> 4) & 0xF00); \
+	}							\
+} while (0)
+
+int hv_snp_boot_ap(int cpu, unsigned long start_ip)
+{
+	struct vmcb_save_area *vmsa = (struct vmcb_save_area *)
+		__get_free_page(GFP_KERNEL | __GFP_ZERO);
+	struct desc_ptr gdtr;
+	u64 ret, retry = 5;
+	struct hv_start_virtual_processor_input *start_vp_input;
+	union sev_rmp_adjust rmp_adjust;
+	unsigned long flags;
+
+	native_store_gdt(&gdtr);
+
+	vmsa->gdtr.base = gdtr.address;
+	vmsa->gdtr.limit = gdtr.size;
+
+	asm volatile("movl %%es, %%eax;" : "=a" (vmsa->es.selector));
+	hv_populate_vmcb_seg(vmsa->es, vmsa->gdtr.base);
+
+	asm volatile("movl %%cs, %%eax;" : "=a" (vmsa->cs.selector));
+	hv_populate_vmcb_seg(vmsa->cs, vmsa->gdtr.base);
+
+	asm volatile("movl %%ss, %%eax;" : "=a" (vmsa->ss.selector));
+	hv_populate_vmcb_seg(vmsa->ss, vmsa->gdtr.base);
+
+	asm volatile("movl %%ds, %%eax;" : "=a" (vmsa->ds.selector));
+	hv_populate_vmcb_seg(vmsa->ds, vmsa->gdtr.base);
+
+	vmsa->efer = native_read_msr(MSR_EFER);
+
+	asm volatile("movq %%cr4, %%rax;" : "=a" (vmsa->cr4));
+	asm volatile("movq %%cr3, %%rax;" : "=a" (vmsa->cr3));
+	asm volatile("movq %%cr0, %%rax;" : "=a" (vmsa->cr0));
+
+	vmsa->xcr0 = 1;
+	vmsa->g_pat = HV_AP_INIT_GPAT_DEFAULT;
+	vmsa->rip = (u64)secondary_startup_64_no_verify;
+	vmsa->rsp = (u64)&ap_start_stack[PAGE_SIZE];
+
+	vmsa->sev_feature_snp = 1;
+	vmsa->sev_feature_restrict_injection = 1;
+
+	rmp_adjust.as_uint64 = 0;
+	rmp_adjust.target_vmpl = 1;
+	rmp_adjust.vmsa = 1;
+	ret = rmpadjust((unsigned long)vmsa, RMP_PG_SIZE_4K,
+			rmp_adjust.as_uint64);
+	if (ret != 0) {
+		pr_err("RMPADJUST(%llx) failed: %llx\n", (u64)vmsa, ret);
+		return ret;
+	}
+
+	local_irq_save(flags);
+	start_vp_input =
+		(struct hv_start_virtual_processor_input *)ap_start_input_arg;
+	memset(start_vp_input, 0, sizeof(*start_vp_input));
+	start_vp_input->partitionid = -1;
+	start_vp_input->vpindex = cpu;
+	start_vp_input->targetvtl = ms_hyperv.vtl;
+	*(u64 *)&start_vp_input->context[0] = __pa(vmsa) | 1;
+
+	do {
+		ret = hv_do_hypercall(HVCALL_START_VIRTUAL_PROCESSOR,
+				      start_vp_input, NULL);
+	} while (hv_result(ret) == HV_STATUS_TIME_OUT && retry--);
+
+	if (!hv_result_success(ret)) {
+		pr_err("HvCallStartVirtualProcessor failed: %llx\n", ret);
+		goto done;
+	}
+
+done:
+	local_irq_restore(flags);
+	return ret;
+}
+
 static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
 {
 #ifdef CONFIG_X86_64
@@ -239,6 +334,16 @@ static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
 
 	native_smp_prepare_cpus(max_cpus);
 
+	/*
+	 *  Override wakeup_secondary_cpu callback for SEV-SNP
+	 *  enlightened guest.
+	 */
+	if (hv_isolation_type_en_snp())
+		apic->wakeup_secondary_cpu = hv_snp_boot_ap;
+
+	if (!hv_root_partition)
+		return;
+
 #ifdef CONFIG_X86_64
 	for_each_present_cpu(i) {
 		if (i == 0)
@@ -475,8 +580,7 @@ static void __init ms_hyperv_init_platform(void)
 
 # ifdef CONFIG_SMP
 	smp_ops.smp_prepare_boot_cpu = hv_smp_prepare_boot_cpu;
-	if (hv_root_partition)
-		smp_ops.smp_prepare_cpus = hv_smp_prepare_cpus;
+	smp_ops.smp_prepare_cpus = hv_smp_prepare_cpus;
 # endif
 
 	/*
@@ -501,7 +605,7 @@ static void __init ms_hyperv_init_platform(void)
 	if (!(ms_hyperv.features & HV_ACCESS_TSC_INVARIANT))
 		mark_tsc_unstable("running on Hyper-V");
 
-	if (isolation_type_en_snp()) {
+	if (hv_isolation_type_en_snp()) {
 		/*
 		 * Hyper-V enlightened snp guest boots kernel
 		 * directly without bootloader and so roms,
@@ -511,7 +615,7 @@ static void __init ms_hyperv_init_platform(void)
 		x86_platform.legacy.rtc = 0;
 		x86_platform.set_wallclock = set_rtc_noop;
 		x86_platform.get_wallclock = get_rtc_noop;
-		x86_platform.legacy.reserve_bios_regions = x86_init_noop;
+		x86_platform.legacy.reserve_bios_regions = 0;
 		x86_init.resources.probe_roms = x86_init_noop;
 		x86_init.resources.reserve_resources = x86_init_noop;
 		x86_init.mpparse.find_smp_config = x86_init_noop;
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index c1cc3ec36ad5..3d7c67be9f56 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -148,6 +148,7 @@ union hv_reference_tsc_msr {
 #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST	0x0003
 #define HVCALL_NOTIFY_LONG_SPIN_WAIT		0x0008
 #define HVCALL_SEND_IPI				0x000b
+#define HVCALL_ENABLE_VP_VTL			0x000f
 #define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX	0x0013
 #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX	0x0014
 #define HVCALL_SEND_IPI_EX			0x0015
@@ -165,6 +166,7 @@ union hv_reference_tsc_msr {
 #define HVCALL_MAP_DEVICE_INTERRUPT		0x007c
 #define HVCALL_UNMAP_DEVICE_INTERRUPT		0x007d
 #define HVCALL_RETARGET_INTERRUPT		0x007e
+#define HVCALL_START_VIRTUAL_PROCESSOR		0x0099
 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af
 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0
 #define HVCALL_MODIFY_SPARSE_GPA_PAGE_HOST_VISIBILITY 0x00db
@@ -219,6 +221,7 @@ enum HV_GENERIC_SET_FORMAT {
 #define HV_STATUS_INVALID_PORT_ID		17
 #define HV_STATUS_INVALID_CONNECTION_ID		18
 #define HV_STATUS_INSUFFICIENT_BUFFERS		19
+#define HV_STATUS_TIME_OUT                     0x78
 
 /*
  * The Hyper-V TimeRefCount register and the TSC
@@ -778,6 +781,22 @@ struct hv_input_unmap_device_interrupt {
 	struct hv_interrupt_entry interrupt_entry;
 } __packed;
 
+struct hv_enable_vp_vtl_input {
+	u64 partitionid;
+	u32 vpindex;
+	u8 targetvtl;
+	u8 padding[3];
+	u8 context[0xe0];
+} __packed;
+
+struct hv_start_virtual_processor_input {
+	u64 partitionid;
+	u32 vpindex;
+	u8 targetvtl;
+	u8 padding[3];
+	u8 context[0xe0];
+} __packed;
+
 #define HV_SOURCE_SHADOW_NONE               0x0
 #define HV_SOURCE_SHADOW_BRIDGE_BUS_RANGE   0x1
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [RFC PATCH V3 11/16] x86/hyperv: Add hyperv-specific handling for VMMCALL under SEV-ES
  2023-01-22  2:45 [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
                   ` (9 preceding siblings ...)
  2023-01-22  2:46 ` [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest Tianyu Lan
@ 2023-01-22  2:46 ` Tianyu Lan
  2023-01-22  2:46 ` [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler Tianyu Lan
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-01-22  2:46 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

Add Hyper-V specific handling for faults caused by VMMCALL
instructions.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
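Background for the prepare hook: in the Hyper-V x64 hypercall ABI, RCX
carries the hypercall control value, RDX the input parameter address and
R8 the output parameter address, which is why exactly these three
registers are copied into the GHCB. The sketch below models that copy in
user space. It assumes, as I understand the kernel's ghcb_set_*()
accessors, that setting a field also marks it valid; the sketch_ types
are hypothetical and far simpler than the real struct ghcb.

```c
#include <stdint.h>

enum sketch_reg { SKETCH_RCX, SKETCH_RDX, SKETCH_R8, SKETCH_NREGS };

/* Hypothetical, heavily reduced GHCB: values plus a valid bitmap. */
struct sketch_ghcb {
	uint64_t regs[SKETCH_NREGS];
	uint32_t valid;		/* bit n set => regs[n] is exposed */
};

/* Hypothetical subset of struct pt_regs. */
struct sketch_pt_regs {
	uint64_t cx, dx, r8;
};

/* Store a value and mark the field as valid for the hypervisor. */
static void sketch_ghcb_set(struct sketch_ghcb *g, enum sketch_reg r,
			    uint64_t v)
{
	g->regs[r] = v;
	g->valid |= 1u << r;
}

/* Expose only the registers the hypercall ABI needs. */
static void sketch_hcall_prepare(struct sketch_ghcb *g,
				 const struct sketch_pt_regs *regs)
{
	sketch_ghcb_set(g, SKETCH_RCX, regs->cx);
	sketch_ghcb_set(g, SKETCH_RDX, regs->dx);
	sketch_ghcb_set(g, SKETCH_R8, regs->r8);
}
```

Anything not set this way stays hidden from the hypervisor, which is
the point of routing the call through the GHCB.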
 arch/x86/kernel/cpu/mshyperv.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 9d547751a1a7..9f37b51ba94b 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -694,6 +694,20 @@ static bool __init ms_hyperv_msi_ext_dest_id(void)
 	return eax & HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE;
 }
 
+static void hv_sev_es_hcall_prepare(struct ghcb *ghcb, struct pt_regs *regs)
+{
+	/* RAX and CPL are already in the GHCB */
+	ghcb_set_rcx(ghcb, regs->cx);
+	ghcb_set_rdx(ghcb, regs->dx);
+	ghcb_set_r8(ghcb, regs->r8);
+}
+
+static bool hv_sev_es_hcall_finish(struct ghcb *ghcb, struct pt_regs *regs)
+{
+	/* No checking of the return state needed */
+	return true;
+}
+
 const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.name			= "Microsoft Hyper-V",
 	.detect			= ms_hyperv_platform,
@@ -701,4 +715,6 @@ const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.init.x2apic_available	= ms_hyperv_x2apic_available,
 	.init.msi_ext_dest_id	= ms_hyperv_msi_ext_dest_id,
 	.init.init_platform	= ms_hyperv_init_platform,
+	.runtime.sev_es_hcall_prepare = hv_sev_es_hcall_prepare,
+	.runtime.sev_es_hcall_finish = hv_sev_es_hcall_finish,
 };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 60+ messages in thread

* [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler
  2023-01-22  2:45 [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
                   ` (10 preceding siblings ...)
  2023-01-22  2:46 ` [RFC PATCH V3 11/16] x86/hyperv: Add hyperv-specific handling for VMMCALL under SEV-ES Tianyu Lan
@ 2023-01-22  2:46 ` Tianyu Lan
  2023-01-23  7:33   ` Gupta, Pankaj
                     ` (2 more replies)
  2023-01-22  2:46 ` [RFC PATCH V3 13/16] x86/sev: Add Check of #HV event in path Tianyu Lan
                   ` (5 subsequent siblings)
  17 siblings, 3 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-01-22  2:46 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

Add a #HV exception handler that uses an IST stack.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
Change since RFC V2:
       * Remove unnecessary line in the change log.
---
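A note on on_hv_fallback_stack(): the check is a half-open interval
test, i.e. the bottom address belongs to the stack and the top address
does not. A minimal user-space sketch, with sketch_on_stack as a made-up
name and plain integers in place of the per-CPU IST addresses:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if sp lies in [bottom, top), the same test the patch applies. */
static bool sketch_on_stack(uintptr_t sp, uintptr_t bottom, uintptr_t top)
{
	return sp >= bottom && sp < top;
}
```

With bottom = 0x1000 and top = 0x2000, sp = 0x1fff is on the stack and
sp = 0x2000 is not.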
 arch/x86/entry/entry_64.S             | 58 +++++++++++++++++++++++++++
 arch/x86/include/asm/cpu_entry_area.h |  6 +++
 arch/x86/include/asm/idtentry.h       | 39 +++++++++++++++++-
 arch/x86/include/asm/page_64_types.h  |  1 +
 arch/x86/include/asm/trapnr.h         |  1 +
 arch/x86/include/asm/traps.h          |  1 +
 arch/x86/kernel/cpu/common.c          |  1 +
 arch/x86/kernel/dumpstack_64.c        |  9 ++++-
 arch/x86/kernel/idt.c                 |  1 +
 arch/x86/kernel/sev.c                 | 53 ++++++++++++++++++++++++
 arch/x86/kernel/traps.c               | 40 ++++++++++++++++++
 arch/x86/mm/cpu_entry_area.c          |  2 +
 12 files changed, 209 insertions(+), 3 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 15739a2c0983..6baec7653f19 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -563,6 +563,64 @@ SYM_CODE_START(\asmsym)
 .Lfrom_usermode_switch_stack_\@:
 	idtentry_body user_\cfunc, has_error_code=1
 
+_ASM_NOKPROBE(\asmsym)
+SYM_CODE_END(\asmsym)
+.endm
+/*
+ * idtentry_hv - Macro to generate entry stub for #HV
+ * @vector:		Vector number
+ * @asmsym:		ASM symbol for the entry point
+ * @cfunc:		C function to be called
+ *
+ * The macro emits code to set up the kernel context for #HV. The #HV handler
+ * runs on an IST stack and needs to be able to support nested #HV exceptions.
+ *
+ * To make this work the #HV entry code tries its best to pretend it doesn't use
+ * an IST stack by switching to the task stack if coming from user-space (which
+ * includes early SYSCALL entry path) or back to the stack in the IRET frame if
+ * entered from kernel-mode.
+ *
+ * If entered from kernel-mode the return stack is validated first, and if it is
+ * not safe to use (e.g. because it points to the entry stack) the #HV handler
+ * will switch to a fall-back stack (HV2) and call a special handler function.
+ *
+ * The macro is only used for one vector, but it is planned to be extended in
+ * the future for the #HV exception.
+ */
+.macro idtentry_hv vector asmsym cfunc
+SYM_CODE_START(\asmsym)
+	UNWIND_HINT_IRET_REGS
+	ASM_CLAC
+	pushq	$-1			/* ORIG_RAX: no syscall to restart */
+
+	testb	$3, CS-ORIG_RAX(%rsp)
+	jnz	.Lfrom_usermode_switch_stack_\@
+
+	call	paranoid_entry
+
+	UNWIND_HINT_REGS
+
+	/*
+	 * Switch off the IST stack to make it free for nested exceptions.
+	 */
+	movq	%rsp, %rdi		/* pt_regs pointer */
+	call	hv_switch_off_ist
+	movq	%rax, %rsp		/* Switch to new stack */
+
+	UNWIND_HINT_REGS
+
+	/* Update pt_regs */
+	movq	ORIG_RAX(%rsp), %rsi	/* get error code into 2nd argument*/
+	movq	$-1, ORIG_RAX(%rsp)	/* no syscall to restart */
+
+	movq	%rsp, %rdi		/* pt_regs pointer */
+	call	kernel_\cfunc
+
+	jmp	paranoid_exit
+
+.Lfrom_usermode_switch_stack_\@:
+	idtentry_body user_\cfunc, has_error_code=1
+
 _ASM_NOKPROBE(\asmsym)
 SYM_CODE_END(\asmsym)
 .endm
diff --git a/arch/x86/include/asm/cpu_entry_area.h b/arch/x86/include/asm/cpu_entry_area.h
index 462fc34f1317..2186ed601b4a 100644
--- a/arch/x86/include/asm/cpu_entry_area.h
+++ b/arch/x86/include/asm/cpu_entry_area.h
@@ -30,6 +30,10 @@
 	char	VC_stack[optional_stack_size];			\
 	char	VC2_stack_guard[guardsize];			\
 	char	VC2_stack[optional_stack_size];			\
+	char	HV_stack_guard[guardsize];			\
+	char	HV_stack[optional_stack_size];			\
+	char	HV2_stack_guard[guardsize];			\
+	char	HV2_stack[optional_stack_size];			\
 	char	IST_top_guard[guardsize];			\
 
 /* The exception stacks' physical storage. No guard pages required */
@@ -52,6 +56,8 @@ enum exception_stack_ordering {
 	ESTACK_MCE,
 	ESTACK_VC,
 	ESTACK_VC2,
+	ESTACK_HV,
+	ESTACK_HV2,
 	N_EXCEPTION_STACKS
 };
 
diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 72184b0b2219..652fea10d377 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -317,6 +317,19 @@ static __always_inline void __##func(struct pt_regs *regs)
 	__visible noinstr void kernel_##func(struct pt_regs *regs, unsigned long error_code);	\
 	__visible noinstr void   user_##func(struct pt_regs *regs, unsigned long error_code)
 
+
+/**
+ * DECLARE_IDTENTRY_HV - Declare functions for the HV entry point
+ * @vector:	Vector number (ignored for C)
+ * @func:	Function name of the entry point
+ *
+ * Maps to DECLARE_IDTENTRY_RAW, but declares also the user C handler.
+ */
+#define DECLARE_IDTENTRY_HV(vector, func)				\
+	DECLARE_IDTENTRY_RAW_ERRORCODE(vector, func);			\
+	__visible noinstr void kernel_##func(struct pt_regs *regs);	\
+	__visible noinstr void   user_##func(struct pt_regs *regs)
+
 /**
  * DEFINE_IDTENTRY_IST - Emit code for IST entry points
  * @func:	Function name of the entry point
@@ -376,6 +389,26 @@ static __always_inline void __##func(struct pt_regs *regs)
 #define DEFINE_IDTENTRY_VC_USER(func)				\
 	DEFINE_IDTENTRY_RAW_ERRORCODE(user_##func)
 
+/**
+ * DEFINE_IDTENTRY_HV_KERNEL - Emit code for HV injection handler
+ *			       when raised from kernel mode
+ * @func:	Function name of the entry point
+ *
+ * Maps to DEFINE_IDTENTRY_RAW
+ */
+#define DEFINE_IDTENTRY_HV_KERNEL(func)					\
+	DEFINE_IDTENTRY_RAW(kernel_##func)
+
+/**
+ * DEFINE_IDTENTRY_HV_USER - Emit code for HV injection handler
+ *			     when raised from user mode
+ * @func:	Function name of the entry point
+ *
+ * Maps to DEFINE_IDTENTRY_RAW
+ */
+#define DEFINE_IDTENTRY_HV_USER(func)					\
+	DEFINE_IDTENTRY_RAW(user_##func)
+
 #else	/* CONFIG_X86_64 */
 
 /**
@@ -465,6 +498,9 @@ __visible noinstr void func(struct pt_regs *regs,			\
 # define DECLARE_IDTENTRY_VC(vector, func)				\
 	idtentry_vc vector asm_##func func
 
+# define DECLARE_IDTENTRY_HV(vector, func)				\
+	idtentry_hv vector asm_##func func
+
 #else
 # define DECLARE_IDTENTRY_MCE(vector, func)				\
 	DECLARE_IDTENTRY(vector, func)
@@ -622,9 +658,10 @@ DECLARE_IDTENTRY_RAW_ERRORCODE(X86_TRAP_DF,	xenpv_exc_double_fault);
 DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_CP,	exc_control_protection);
 #endif
 
-/* #VC */
+/* #VC & #HV */
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 DECLARE_IDTENTRY_VC(X86_TRAP_VC,	exc_vmm_communication);
+DECLARE_IDTENTRY_HV(X86_TRAP_HV,	exc_hv_injection);
 #endif
 
 #ifdef CONFIG_XEN_PV
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index e9e2c3ba5923..0bd7dab676c5 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -29,6 +29,7 @@
 #define	IST_INDEX_DB		2
 #define	IST_INDEX_MCE		3
 #define	IST_INDEX_VC		4
+#define	IST_INDEX_HV		5
 
 /*
  * Set __PAGE_OFFSET to the most negative possible address +
diff --git a/arch/x86/include/asm/trapnr.h b/arch/x86/include/asm/trapnr.h
index f5d2325aa0b7..c6583631cecb 100644
--- a/arch/x86/include/asm/trapnr.h
+++ b/arch/x86/include/asm/trapnr.h
@@ -26,6 +26,7 @@
 #define X86_TRAP_XF		19	/* SIMD Floating-Point Exception */
 #define X86_TRAP_VE		20	/* Virtualization Exception */
 #define X86_TRAP_CP		21	/* Control Protection Exception */
+#define X86_TRAP_HV		28	/* HV injected exception in SNP restricted mode */
 #define X86_TRAP_VC		29	/* VMM Communication Exception */
 #define X86_TRAP_IRET		32	/* IRET Exception */
 
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 47ecfff2c83d..6795d3e517d6 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -16,6 +16,7 @@ asmlinkage __visible notrace
 struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs);
 void __init trap_init(void);
 asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *eregs);
+asmlinkage __visible noinstr struct pt_regs *hv_switch_off_ist(struct pt_regs *eregs);
 #endif
 
 extern bool ibt_selftest(void);
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 9cfca3d7d0e2..e48a489777ec 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -2162,6 +2162,7 @@ static inline void tss_setup_ist(struct tss_struct *tss)
 	tss->x86_tss.ist[IST_INDEX_MCE] = __this_cpu_ist_top_va(MCE);
 	/* Only mapped when SEV-ES is active */
 	tss->x86_tss.ist[IST_INDEX_VC] = __this_cpu_ist_top_va(VC);
+	tss->x86_tss.ist[IST_INDEX_HV] = __this_cpu_ist_top_va(HV);
 }
 
 #else /* CONFIG_X86_64 */
diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
index f05339fee778..6d8f8864810c 100644
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -26,11 +26,14 @@ static const char * const exception_stack_names[] = {
 		[ ESTACK_MCE	]	= "#MC",
 		[ ESTACK_VC	]	= "#VC",
 		[ ESTACK_VC2	]	= "#VC2",
+		[ ESTACK_HV	]	= "#HV",
+		[ ESTACK_HV2	]	= "#HV2",
+		
 };
 
 const char *stack_type_name(enum stack_type type)
 {
-	BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
+	BUILD_BUG_ON(N_EXCEPTION_STACKS != 8);
 
 	if (type == STACK_TYPE_TASK)
 		return "TASK";
@@ -89,6 +92,8 @@ struct estack_pages estack_pages[CEA_ESTACK_PAGES] ____cacheline_aligned = {
 	EPAGERANGE(MCE),
 	EPAGERANGE(VC),
 	EPAGERANGE(VC2),
+	EPAGERANGE(HV),
+	EPAGERANGE(HV2),
 };
 
 static __always_inline bool in_exception_stack(unsigned long *stack, struct stack_info *info)
@@ -98,7 +103,7 @@ static __always_inline bool in_exception_stack(unsigned long *stack, struct stac
 	struct pt_regs *regs;
 	unsigned int k;
 
-	BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
+	BUILD_BUG_ON(N_EXCEPTION_STACKS != 8);
 
 	begin = (unsigned long)__this_cpu_read(cea_exception_stacks);
 	/*
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index a58c6bc1cd68..48c0a7e1dbcb 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -113,6 +113,7 @@ static const __initconst struct idt_data def_idts[] = {
 
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 	ISTG(X86_TRAP_VC,		asm_exc_vmm_communication, IST_INDEX_VC),
+	ISTG(X86_TRAP_HV,		asm_exc_hv_injection, IST_INDEX_HV),
 #endif
 
 	SYSG(X86_TRAP_OF,		asm_exc_overflow),
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 679026a640ef..a8862a2eff67 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2004,6 +2004,59 @@ DEFINE_IDTENTRY_VC_USER(exc_vmm_communication)
 	irqentry_exit_to_user_mode(regs);
 }
 
+static bool hv_raw_handle_exception(struct pt_regs *regs)
+{
+	return false;
+}
+
+static __always_inline bool on_hv_fallback_stack(struct pt_regs *regs)
+{
+	unsigned long sp = (unsigned long)regs;
+
+	return (sp >= __this_cpu_ist_bottom_va(HV2) && sp < __this_cpu_ist_top_va(HV2));
+}
+
+DEFINE_IDTENTRY_HV_USER(exc_hv_injection)
+{
+	irqentry_enter_from_user_mode(regs);
+	instrumentation_begin();
+
+	if (!hv_raw_handle_exception(regs)) {
+		/*
+		 * Do not kill the machine if user-space triggered the
+		 * exception. Send SIGBUS instead and let user-space deal
+		 * with it.
+		 */
+		force_sig_fault(SIGBUS, BUS_OBJERR, (void __user *)0);
+	}
+
+	instrumentation_end();
+	irqentry_exit_to_user_mode(regs);
+}
+
+DEFINE_IDTENTRY_HV_KERNEL(exc_hv_injection)
+{
+	irqentry_state_t irq_state;
+
+	irq_state = irqentry_nmi_enter(regs);
+	instrumentation_begin();
+
+	if (!hv_raw_handle_exception(regs)) {
+		pr_emerg("PANIC: Unhandled #HV exception in kernel space\n");
+
+		/* Show some debug info */
+		show_regs(regs);
+
+		/* Ask hypervisor to sev_es_terminate */
+		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
+
+		panic("Returned from Terminate-Request to Hypervisor\n");
+	}
+
+	instrumentation_end();
+	irqentry_nmi_exit(regs, irq_state);
+}
+
 bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
 {
 	unsigned long exit_code = regs->orig_ax;
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index d317dc3d06a3..d29debec8134 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -905,6 +905,46 @@ asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *r
 
 	return regs_ret;
 }
+
+asmlinkage __visible noinstr struct pt_regs *hv_switch_off_ist(struct pt_regs *regs)
+{
+	unsigned long sp, *stack;
+	struct stack_info info;
+	struct pt_regs *regs_ret;
+
+	/*
+	 * In the SYSCALL entry path the RSP value comes from user-space - don't
+	 * trust it and switch to the current kernel stack
+	 */
+	if (ip_within_syscall_gap(regs)) {
+		sp = this_cpu_read(pcpu_hot.top_of_stack);
+		goto sync;
+	}
+
+	/*
+	 * From here on the RSP value is trusted. Now check whether entry
+	 * happened from a safe stack. Not safe are the entry or unknown stacks,
+	 * use the fall-back stack instead in this case.
+	 */
+	sp    = regs->sp;
+	stack = (unsigned long *)sp;
+
+	if (!get_stack_info_noinstr(stack, current, &info) || info.type == STACK_TYPE_ENTRY ||
+	    info.type > STACK_TYPE_EXCEPTION_LAST)
+		sp = __this_cpu_ist_top_va(HV2);
+sync:
+	/*
+	 * Found a safe stack - switch to it as if the entry didn't happen via
+	 * IST stack. The code below only copies pt_regs, the real switch happens
+	 * in assembly code.
+	 */
+	sp = ALIGN_DOWN(sp, 8) - sizeof(*regs_ret);
+
+	regs_ret = (struct pt_regs *)sp;
+	*regs_ret = *regs;
+
+	return regs_ret;
+}
 #endif
 
 asmlinkage __visible noinstr struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs)
diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
index 7316a8224259..3ec844cef652 100644
--- a/arch/x86/mm/cpu_entry_area.c
+++ b/arch/x86/mm/cpu_entry_area.c
@@ -153,6 +153,8 @@ static void __init percpu_setup_exception_stacks(unsigned int cpu)
 		if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) {
 			cea_map_stack(VC);
 			cea_map_stack(VC2);
+			cea_map_stack(HV);
+			cea_map_stack(HV2);
 		}
 	}
 }
-- 
2.25.1
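
The pt_regs copy at the end of hv_switch_off_ist() above can be modelled
in userspace. This is a minimal sketch, not the kernel code: struct
frame_model, MODEL_ALIGN_DOWN() and model_switch_off_ist() are
hypothetical names, and a plain array stands in for the target stack. It
only shows the arithmetic: align the chosen stack pointer down to 8
bytes, reserve room for one register frame below it, and copy the frame
there.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pt_regs; only a few fields. */
struct frame_model {
	uint64_t ip, cs, flags, sp, ss;
};

/* Round x down to a multiple of a (a must be a power of two). */
#define MODEL_ALIGN_DOWN(x, a) ((x) & ~((uintptr_t)(a) - 1))

/*
 * Copy the frame that was saved on the IST stack to just below
 * target_sp and return the new location. As in the patch, only the
 * copy happens here; the real stack switch is left to the caller.
 */
static struct frame_model *
model_switch_off_ist(const struct frame_model *ist_frame, uintptr_t target_sp)
{
	uintptr_t sp = MODEL_ALIGN_DOWN(target_sp, 8) - sizeof(struct frame_model);
	struct frame_model *copy = (struct frame_model *)sp;

	*copy = *ist_frame;
	return copy;
}
```

The returned pointer is 8-byte aligned even if target_sp was not, which
is the property the ALIGN_DOWN() in the patch provides.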


* [RFC PATCH V3 13/16] x86/sev: Add Check of #HV event in path
  2023-01-22  2:45 [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
                   ` (11 preceding siblings ...)
  2023-01-22  2:46 ` [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler Tianyu Lan
@ 2023-01-22  2:46 ` Tianyu Lan
  2023-03-01 11:11   ` Gupta, Pankaj
  2023-01-22  2:46 ` [RFC PATCH V3 14/16] x86/sev: Initialize #HV doorbell and handle interrupt requests Tianyu Lan
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 60+ messages in thread
From: Tianyu Lan @ 2023-01-22  2:46 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

Add check_hv_pending() and check_hv_pending_irq_enable() to handle
#HV events that were queued while interrupts were disabled.

Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
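
A minimal userspace sketch of the gating logic in check_hv_pending(),
assuming a simplified struct regs_model in place of struct pt_regs. The
names model_check_hv_pending(), model_do_exc_hv(), model_is_snp_guest
and handled_count are hypothetical and exist only for illustration. The
point is that queued events are processed only when the interrupted
context had interrupts enabled (EFLAGS.IF set).

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_EFLAGS_IF (1UL << 9)	/* EFLAGS.IF is bit 9 on x86 */

/* Hypothetical stand-in for struct pt_regs. */
struct regs_model {
	unsigned long flags;		/* saved EFLAGS of the interrupted context */
};

static bool model_is_snp_guest = true;	/* stands in for cc_platform_has() */
static unsigned int handled_count;	/* how often the handler ran */

/* Stand-in for do_exc_hv(): just records that it was called. */
static void model_do_exc_hv(struct regs_model *regs)
{
	(void)regs;
	handled_count++;
}

static void model_check_hv_pending(struct regs_model *regs)
{
	/* Not an SEV-SNP guest: nothing to do. */
	if (!model_is_snp_guest)
		return;

	/* Interrupted context had interrupts off: leave events queued. */
	if ((regs->flags & MODEL_EFLAGS_IF) == 0)
		return;

	model_do_exc_hv(regs);
}
```

Calling model_check_hv_pending() with flags == 0 leaves handled_count
unchanged; calling it with MODEL_EFLAGS_IF set increments it once.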
 arch/x86/entry/entry_64.S       | 18 +++++++++++++++
 arch/x86/include/asm/irqflags.h | 10 +++++++++
 arch/x86/kernel/sev.c           | 39 +++++++++++++++++++++++++++++++++
 3 files changed, 67 insertions(+)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 6baec7653f19..aec8dc4443d1 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -1064,6 +1064,15 @@ SYM_CODE_END(paranoid_entry)
  * R15 - old SPEC_CTRL
  */
 SYM_CODE_START_LOCAL(paranoid_exit)
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	/*
+	 * If a #HV was delivered during execution and interrupts were
+	 * disabled, then check if it can be handled before the iret
+	 * (which may re-enable interrupts).
+	 */
+	mov     %rsp, %rdi
+	call    check_hv_pending
+#endif
 	UNWIND_HINT_REGS
 
 	/*
@@ -1188,6 +1197,15 @@ SYM_CODE_START(error_entry)
 SYM_CODE_END(error_entry)
 
 SYM_CODE_START_LOCAL(error_return)
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	/*
+	 * If a #HV was delivered during execution and interrupts were
+	 * disabled, then check if it can be handled before the iret
+	 * (which may re-enable interrupts).
+	 */
+	mov     %rsp, %rdi
+	call    check_hv_pending
+#endif
 	UNWIND_HINT_REGS
 	DEBUG_ENTRY_ASSERT_IRQS_OFF
 	testb	$3, CS(%rsp)
diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index 7793e52d6237..fe46e59168dd 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -14,6 +14,10 @@
 /*
  * Interrupt control:
  */
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+void check_hv_pending(struct pt_regs *regs);
+void check_hv_pending_irq_enable(void);
+#endif
 
 /* Declaration required for gcc < 4.9 to prevent -Werror=missing-prototypes */
 extern inline unsigned long native_save_fl(void);
@@ -43,12 +47,18 @@ static __always_inline void native_irq_disable(void)
 static __always_inline void native_irq_enable(void)
 {
 	asm volatile("sti": : :"memory");
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	check_hv_pending_irq_enable();
+#endif
 }
 
 static inline __cpuidle void native_safe_halt(void)
 {
 	mds_idle_clear_cpu_buffers();
 	asm volatile("sti; hlt": : :"memory");
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+	check_hv_pending_irq_enable();
+#endif
 }
 
 static inline __cpuidle void native_halt(void)
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index a8862a2eff67..fe5e5e41433d 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -179,6 +179,45 @@ void noinstr __sev_es_ist_enter(struct pt_regs *regs)
 	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], new_ist);
 }
 
+static void do_exc_hv(struct pt_regs *regs)
+{
+	/* Handle #HV exception. */
+}
+
+void check_hv_pending(struct pt_regs *regs)
+{
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	if ((regs->flags & X86_EFLAGS_IF) == 0)
+		return;
+
+	do_exc_hv(regs);
+}
+
+void check_hv_pending_irq_enable(void)
+{
+	unsigned long flags;
+	struct pt_regs regs;
+
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+		return;
+
+	memset(&regs, 0, sizeof(struct pt_regs));
+	asm volatile("movl %%cs, %%eax;" : "=a" (regs.cs));
+	asm volatile("movl %%ss, %%eax;" : "=a" (regs.ss));
+	regs.orig_ax = 0xffffffff;
+	regs.flags = native_save_fl();
+
+	/*
+	 * Disable interrupts while handling pending #HV events
+	 * after interrupts have been re-enabled.
+	 */
+	asm volatile("cli" : : : "memory");
+	do_exc_hv(&regs);
+	asm volatile("sti" : : : "memory");
+}
+
 void noinstr __sev_es_ist_exit(void)
 {
 	unsigned long ist;
-- 
2.25.1


* [RFC PATCH V3 14/16] x86/sev: Initialize #HV doorbell and handle interrupt requests
  2023-01-22  2:45 [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
                   ` (12 preceding siblings ...)
  2023-01-22  2:46 ` [RFC PATCH V3 13/16] x86/sev: Add Check of #HV event in path Tianyu Lan
@ 2023-01-22  2:46 ` Tianyu Lan
  2023-02-16 14:46   ` Gupta, Pankaj
                     ` (2 more replies)
  2023-01-22  2:46 ` [RFC PATCH V3 15/16] x86/sev: optimize system vector processing invoked from #HV exception Tianyu Lan
                   ` (3 subsequent siblings)
  17 siblings, 3 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-01-22  2:46 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <tiala@microsoft.com>

Enable the #HV exception to handle interrupt requests from the hypervisor.

Co-developed-by: Lendacky Thomas <thomas.lendacky@amd.com>
Co-developed-by: Kalra Ashish <ashish.kalra@amd.com>
Signed-off-by: Tianyu Lan <tiala@microsoft.com>
---
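
A minimal userspace sketch of how the 16-bit pending_events word in the
doorbell page is laid out and drained, assuming a little-endian layout
as on x86. The names struct doorbell_model, struct drained_model,
model_drain() and the HV_EVT_* masks are hypothetical; C11 atomics stand
in for the kernel's xchg(). The bit positions follow union
hv_pending_events in this patch: vector in bits 0-7, nmi in bit 8, mc in
bit 9 and no_further_signal in bit 15.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define HV_EVT_VECTOR_MASK		0x00ffu
#define HV_EVT_NMI			(1u << 8)
#define HV_EVT_MC			(1u << 9)
#define HV_EVT_NO_FURTHER_SIGNAL	(1u << 15)

/* Hypothetical stand-in for the first field of the doorbell page. */
struct doorbell_model {
	_Atomic uint16_t pending_events;
};

/* What the drain loop observed; replaces the real handler calls. */
struct drained_model {
	unsigned int nmi_count;
	unsigned int mc_count;
	uint8_t last_vector;
};

static void model_drain(struct doorbell_model *db, struct drained_model *out)
{
	/* Clear no_further_signal first (the 0x7fff mask in the patch). */
	atomic_fetch_and(&db->pending_events, (uint16_t)0x7fff);

	for (;;) {
		/* Take all queued events and reset the word in one step. */
		uint16_t ev = atomic_exchange(&db->pending_events, 0);

		if (!ev)
			break;
		if (ev & HV_EVT_NMI)
			out->nmi_count++;
		if (ev & HV_EVT_MC)
			out->mc_count++;
		if (ev & HV_EVT_VECTOR_MASK)
			out->last_vector = (uint8_t)(ev & HV_EVT_VECTOR_MASK);
	}
}
```

Because the word is 16 bits wide, a helper that reports whether anything
is pending has to look at all 16 bits; checking only the low byte would
miss a lone NMI or machine-check bit.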
 arch/x86/include/asm/mem_encrypt.h |   2 +
 arch/x86/include/asm/msr-index.h   |   6 +
 arch/x86/include/asm/svm.h         |  12 +-
 arch/x86/include/uapi/asm/svm.h    |   4 +
 arch/x86/kernel/sev.c              | 307 +++++++++++++++++++++++------
 arch/x86/kernel/traps.c            |   2 +
 6 files changed, 272 insertions(+), 61 deletions(-)

diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index 72ca90552b6a..7264ca5f5b2d 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -50,6 +50,7 @@ void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages,
 void __init mem_encrypt_free_decrypted_mem(void);
 
 void __init sev_es_init_vc_handling(void);
+void __init sev_snp_init_hv_handling(void);
 
 #define __bss_decrypted __section(".bss..decrypted")
 
@@ -72,6 +73,7 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline void sev_es_init_vc_handling(void) { }
+static inline void sev_snp_init_hv_handling(void) { }
 
 static inline int __init
 early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; }
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 6a6e70e792a4..70af0ce5f2c4 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -562,11 +562,17 @@
 #define MSR_AMD64_SEV_ENABLED_BIT	0
 #define MSR_AMD64_SEV_ES_ENABLED_BIT	1
 #define MSR_AMD64_SEV_SNP_ENABLED_BIT	2
+#define MSR_AMD64_SEV_REFLECTVC_ENABLED_BIT		4
+#define MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED_BIT	5
+#define MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED_BIT	6
 #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
 #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
 #define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
 #define MSR_AMD64_SNP_VTOM_ENABLED	BIT_ULL(3)
 
+#define MSR_AMD64_SEV_REFLECTVC_ENABLED			BIT_ULL(MSR_AMD64_SEV_REFLECTVC_ENABLED_BIT)
+#define MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED	BIT_ULL(MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED_BIT)
+#define MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED	BIT_ULL(MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED_BIT)
 #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
 
 /* AMD Collaborative Processor Performance Control MSRs */
diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index f8b321a11ee4..911c991fec78 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -568,12 +568,12 @@ static inline void __unused_size_checks(void)
 
 	/* Check offsets of reserved fields */
 
-	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xa0);
-	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xcc);
-	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xd8);
-	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x180);
-	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x248);
-	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x298);
+//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xa0);
+//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xcc);
+//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xd8);
+//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x180);
+//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x248);
+//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x298);
 
 	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0xc8);
 	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0xcc);
diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
index f69c168391aa..85d6882262e7 100644
--- a/arch/x86/include/uapi/asm/svm.h
+++ b/arch/x86/include/uapi/asm/svm.h
@@ -115,6 +115,10 @@
 #define SVM_VMGEXIT_AP_CREATE_ON_INIT		0
 #define SVM_VMGEXIT_AP_CREATE			1
 #define SVM_VMGEXIT_AP_DESTROY			2
+#define SVM_VMGEXIT_HV_DOORBELL_PAGE		0x80000014
+#define SVM_VMGEXIT_GET_PREFERRED_HV_DOORBELL_PAGE	0
+#define SVM_VMGEXIT_SET_HV_DOORBELL_PAGE		1
+#define SVM_VMGEXIT_QUERY_HV_DOORBELL_PAGE		2
 #define SVM_VMGEXIT_HV_FEATURES			0x8000fffd
 #define SVM_VMGEXIT_UNSUPPORTED_EVENT		0x8000ffff
 
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index fe5e5e41433d..03d99fad9e76 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -122,6 +122,150 @@ struct sev_config {
 
 static struct sev_config sev_cfg __read_mostly;
 
+static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state);
+static noinstr void __sev_put_ghcb(struct ghcb_state *state);
+static int vmgexit_hv_doorbell_page(struct ghcb *ghcb, u64 op, u64 pa);
+static void sev_snp_setup_hv_doorbell_page(struct ghcb *ghcb);
+
+union hv_pending_events {
+	u16 events;
+	struct {
+		u8 vector;
+		u8 nmi : 1;
+		u8 mc : 1;
+		u8 reserved1 : 5;
+		u8 no_further_signal : 1;
+	};
+};
+
+struct sev_hv_doorbell_page {
+	union hv_pending_events pending_events;
+	u8 no_eoi_required;
+	u8 reserved2[61];
+	u8 padding[4032];
+};
+
+struct sev_snp_runtime_data {
+	struct sev_hv_doorbell_page hv_doorbell_page;
+};
+
+static DEFINE_PER_CPU(struct sev_snp_runtime_data*, snp_runtime_data);
+
+static inline u64 sev_es_rd_ghcb_msr(void)
+{
+	return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
+}
+
+static __always_inline void sev_es_wr_ghcb_msr(u64 val)
+{
+	u32 low, high;
+
+	low  = (u32)(val);
+	high = (u32)(val >> 32);
+
+	native_wrmsr(MSR_AMD64_SEV_ES_GHCB, low, high);
+}
+
+struct sev_hv_doorbell_page *sev_snp_current_doorbell_page(void)
+{
+	return &this_cpu_read(snp_runtime_data)->hv_doorbell_page;
+}
+
+static u16 sev_hv_pending(void)
+{
+	return sev_snp_current_doorbell_page()->pending_events.events;
+}
+
+static void hv_doorbell_apic_eoi_write(u32 reg, u32 val)
+{
+	if (xchg(&sev_snp_current_doorbell_page()->no_eoi_required, 0) & 0x1)
+		return;
+
+	BUG_ON(reg != APIC_EOI);
+	apic->write(reg, val);
+}
+
+static void do_exc_hv(struct pt_regs *regs)
+{
+	union hv_pending_events pending_events;
+	u8 vector;
+
+	while (sev_hv_pending()) {
+		pending_events.events = xchg(
+			&sev_snp_current_doorbell_page()->pending_events.events,
+			0);
+
+		if (pending_events.nmi)
+			exc_nmi(regs);
+
+#ifdef CONFIG_X86_MCE
+		if (pending_events.mc)
+			exc_machine_check(regs);
+#endif
+
+		if (!pending_events.vector)
+			return;
+
+		if (pending_events.vector < FIRST_EXTERNAL_VECTOR) {
+			/* Exception vectors */
+			WARN(1, "exception shouldn't happen\n");
+		} else if (pending_events.vector == FIRST_EXTERNAL_VECTOR) {
+			sysvec_irq_move_cleanup(regs);
+		} else if (pending_events.vector == IA32_SYSCALL_VECTOR) {
+			WARN(1, "syscall shouldn't happen\n");
+		} else if (pending_events.vector >= FIRST_SYSTEM_VECTOR) {
+			switch (pending_events.vector) {
+#if IS_ENABLED(CONFIG_HYPERV)
+			case HYPERV_STIMER0_VECTOR:
+				sysvec_hyperv_stimer0(regs);
+				break;
+			case HYPERVISOR_CALLBACK_VECTOR:
+				sysvec_hyperv_callback(regs);
+				break;
+#endif
+#ifdef CONFIG_SMP
+			case RESCHEDULE_VECTOR:
+				sysvec_reschedule_ipi(regs);
+				break;
+			case IRQ_MOVE_CLEANUP_VECTOR:
+				sysvec_irq_move_cleanup(regs);
+				break;
+			case REBOOT_VECTOR:
+				sysvec_reboot(regs);
+				break;
+			case CALL_FUNCTION_SINGLE_VECTOR:
+				sysvec_call_function_single(regs);
+				break;
+			case CALL_FUNCTION_VECTOR:
+				sysvec_call_function(regs);
+				break;
+#endif
+#ifdef CONFIG_X86_LOCAL_APIC
+			case ERROR_APIC_VECTOR:
+				sysvec_error_interrupt(regs);
+				break;
+			case SPURIOUS_APIC_VECTOR:
+				sysvec_spurious_apic_interrupt(regs);
+				break;
+			case LOCAL_TIMER_VECTOR:
+				sysvec_apic_timer_interrupt(regs);
+				break;
+			case X86_PLATFORM_IPI_VECTOR:
+				sysvec_x86_platform_ipi(regs);
+				break;
+#endif
+			case 0x0:
+				break;
+			default:
+				panic("Unexpected vector %d\n", vector);
+				unreachable();
+			}
+		} else {
+			common_interrupt(regs, pending_events.vector);
+		}
+	}
+}
+
 static __always_inline bool on_vc_stack(struct pt_regs *regs)
 {
 	unsigned long sp = regs->sp;
@@ -179,11 +323,6 @@ void noinstr __sev_es_ist_enter(struct pt_regs *regs)
 	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], new_ist);
 }
 
-static void do_exc_hv(struct pt_regs *regs)
-{
-	/* Handle #HV exception. */
-}
-
 void check_hv_pending(struct pt_regs *regs)
 {
 	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
@@ -232,68 +371,38 @@ void noinstr __sev_es_ist_exit(void)
 	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], *(unsigned long *)ist);
 }
 
-/*
- * Nothing shall interrupt this code path while holding the per-CPU
- * GHCB. The backup GHCB is only for NMIs interrupting this path.
- *
- * Callers must disable local interrupts around it.
- */
-static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
+static bool sev_restricted_injection_enabled(void)
+{
+	return sev_status & MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED;
+}
+
+void __init sev_snp_init_hv_handling(void)
 {
+	struct sev_snp_runtime_data *snp_data;
 	struct sev_es_runtime_data *data;
+	struct ghcb_state state;
 	struct ghcb *ghcb;
+	unsigned long flags;
+	int cpu;
+	int err;
 
 	WARN_ON(!irqs_disabled());
+	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP) || !sev_restricted_injection_enabled())
+		return;
 
 	data = this_cpu_read(runtime_data);
-	ghcb = &data->ghcb_page;
 
-	if (unlikely(data->ghcb_active)) {
-		/* GHCB is already in use - save its contents */
-
-		if (unlikely(data->backup_ghcb_active)) {
-			/*
-			 * Backup-GHCB is also already in use. There is no way
-			 * to continue here so just kill the machine. To make
-			 * panic() work, mark GHCBs inactive so that messages
-			 * can be printed out.
-			 */
-			data->ghcb_active        = false;
-			data->backup_ghcb_active = false;
-
-			instrumentation_begin();
-			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
-			instrumentation_end();
-		}
-
-		/* Mark backup_ghcb active before writing to it */
-		data->backup_ghcb_active = true;
-
-		state->ghcb = &data->backup_ghcb;
+	local_irq_save(flags);
 
-		/* Backup GHCB content */
-		*state->ghcb = *ghcb;
-	} else {
-		state->ghcb = NULL;
-		data->ghcb_active = true;
-	}
+	ghcb = __sev_get_ghcb(&state);
 
-	return ghcb;
-}
+	sev_snp_setup_hv_doorbell_page(ghcb);
 
-static inline u64 sev_es_rd_ghcb_msr(void)
-{
-	return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
-}
-
-static __always_inline void sev_es_wr_ghcb_msr(u64 val)
-{
-	u32 low, high;
+	__sev_put_ghcb(&state);
 
-	low  = (u32)(val);
-	high = (u32)(val >> 32);
+	apic_set_eoi_write(hv_doorbell_apic_eoi_write);
 
-	native_wrmsr(MSR_AMD64_SEV_ES_GHCB, low, high);
+	local_irq_restore(flags);
 }
 
 static int vc_fetch_insn_kernel(struct es_em_ctxt *ctxt,
@@ -554,6 +663,69 @@ static enum es_result vc_slow_virt_to_phys(struct ghcb *ghcb, struct es_em_ctxt
 /* Include code shared with pre-decompression boot stage */
 #include "sev-shared.c"
 
+/*
+ * Nothing shall interrupt this code path while holding the per-CPU
+ * GHCB. The backup GHCB is only for NMIs interrupting this path.
+ *
+ * Callers must disable local interrupts around it.
+ */
+static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
+{
+	struct sev_es_runtime_data *data;
+	struct ghcb *ghcb;
+
+	WARN_ON(!irqs_disabled());
+
+	data = this_cpu_read(runtime_data);
+	ghcb = &data->ghcb_page;
+
+	if (unlikely(data->ghcb_active)) {
+		/* GHCB is already in use - save its contents */
+
+		if (unlikely(data->backup_ghcb_active)) {
+			/*
+			 * Backup-GHCB is also already in use. There is no way
+			 * to continue here so just kill the machine. To make
+			 * panic() work, mark GHCBs inactive so that messages
+			 * can be printed out.
+			 */
+			data->ghcb_active        = false;
+			data->backup_ghcb_active = false;
+
+			instrumentation_begin();
+			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
+			instrumentation_end();
+		}
+
+		/* Mark backup_ghcb active before writing to it */
+		data->backup_ghcb_active = true;
+
+		state->ghcb = &data->backup_ghcb;
+
+		/* Backup GHCB content */
+		*state->ghcb = *ghcb;
+	} else {
+		state->ghcb = NULL;
+		data->ghcb_active = true;
+	}
+
+	return ghcb;
+}
+
+static void sev_snp_setup_hv_doorbell_page(struct ghcb *ghcb)
+{
+	u64 pa;
+	enum es_result ret;
+
+	pa = __pa(sev_snp_current_doorbell_page());
+	vc_ghcb_invalidate(ghcb);
+	ret = vmgexit_hv_doorbell_page(ghcb,
+				       SVM_VMGEXIT_SET_HV_DOORBELL_PAGE,
+				       pa);
+	if (ret != ES_OK)
+		panic("SEV-SNP: failed to set up #HV doorbell page");
+}
+
 static noinstr void __sev_put_ghcb(struct ghcb_state *state)
 {
 	struct sev_es_runtime_data *data;
@@ -1282,6 +1454,7 @@ static void snp_register_per_cpu_ghcb(void)
 	ghcb = &data->ghcb_page;
 
 	snp_register_ghcb_early(__pa(ghcb));
+	sev_snp_setup_hv_doorbell_page(ghcb);
 }
 
 void setup_ghcb(void)
@@ -1321,6 +1494,11 @@ void setup_ghcb(void)
 		snp_register_ghcb_early(__pa(&boot_ghcb_page));
 }
 
+int vmgexit_hv_doorbell_page(struct ghcb *ghcb, u64 op, u64 pa)
+{
+	return sev_es_ghcb_hv_call(ghcb, NULL, SVM_VMGEXIT_HV_DOORBELL_PAGE, op, pa);
+}
+
 #ifdef CONFIG_HOTPLUG_CPU
 static void sev_es_ap_hlt_loop(void)
 {
@@ -1394,6 +1572,7 @@ static void __init alloc_runtime_data(int cpu)
 static void __init init_ghcb(int cpu)
 {
 	struct sev_es_runtime_data *data;
+	struct sev_snp_runtime_data *snp_data;
 	int err;
 
 	data = per_cpu(runtime_data, cpu);
@@ -1405,6 +1584,19 @@ static void __init init_ghcb(int cpu)
 
 	memset(&data->ghcb_page, 0, sizeof(data->ghcb_page));
 
+	snp_data = memblock_alloc(sizeof(*snp_data), PAGE_SIZE);
+	if (!snp_data)
+		panic("Can't allocate SEV-SNP runtime data");
+
+	err = early_set_memory_decrypted((unsigned long)&snp_data->hv_doorbell_page,
+					 sizeof(snp_data->hv_doorbell_page));
+	if (err)
+		panic("Can't map #HV doorbell pages unencrypted");
+
+	memset(&snp_data->hv_doorbell_page, 0, sizeof(snp_data->hv_doorbell_page));
+
+	per_cpu(snp_runtime_data, cpu) = snp_data;
+
 	data->ghcb_active = false;
 	data->backup_ghcb_active = false;
 }
@@ -2045,7 +2237,12 @@ DEFINE_IDTENTRY_VC_USER(exc_vmm_communication)
 
 static bool hv_raw_handle_exception(struct pt_regs *regs)
 {
-	return false;
+	/* Clear the no_further_signal bit */
+	sev_snp_current_doorbell_page()->pending_events.events &= 0x7fff;
+
+	check_hv_pending(regs);
+
+	return true;
 }
 
 static __always_inline bool on_hv_fallback_stack(struct pt_regs *regs)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index d29debec8134..1aa6cab2394b 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -1503,5 +1503,7 @@ void __init trap_init(void)
 	cpu_init_exception_handling();
 	/* Setup traps as cpu_init() might #GP */
 	idt_setup_traps();
+	sev_snp_init_hv_handling();
+
 	cpu_init();
 }
-- 
2.25.1


* [RFC PATCH V3 15/16] x86/sev: optimize system vector processing invoked from #HV exception
  2023-01-22  2:45 [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
                   ` (13 preceding siblings ...)
  2023-01-22  2:46 ` [RFC PATCH V3 14/16] x86/sev: Initialize #HV doorbell and handle interrupt requests Tianyu Lan
@ 2023-01-22  2:46 ` Tianyu Lan
  2023-01-22  2:46 ` [RFC PATCH V3 16/16] x86/sev: Fix interrupt exit code paths " Tianyu Lan
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-01-22  2:46 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Ashish Kalra <ashish.kalra@amd.com>

Construct a system vector table and dispatch system vectors through
sysvec_table from the #HV exception handler instead of calling each
system vector handler explicitly. The table is built at boot from
(vector, handler) entries that the entry code emits into a new named
ELF section, .system_vectors.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
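
A minimal userspace sketch of the table construction and dispatch
described above. All model_* names are hypothetical, the handlers take
no arguments instead of a struct pt_regs pointer, and
MODEL_FIRST_SYSTEM_VECTOR is an assumed value chosen for the example. An
ordinary array stands in for the .system_vectors section; the packed
attribute (a GCC/Clang extension) mirrors the one-byte vector followed
by an eight-byte pointer that the assembly macro emits.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_VECTORS		256
#define MODEL_FIRST_SYSTEM_VECTOR	0xec	/* assumed for this sketch */

typedef void (*sysvec_fn)(void);

/* One record as laid out in the section: .byte vector, .quad handler. */
struct sysvec_entry_model {
	unsigned char vector;
	sysvec_fn func;
} __attribute__((packed));

static sysvec_fn model_table[MODEL_NR_VECTORS - MODEL_FIRST_SYSTEM_VECTOR];

static unsigned int timer_hits, resched_hits;
static void model_timer(void) { timer_hits++; }
static void model_resched(void) { resched_hits++; }

/* Walk the records between start and end and fill the lookup table. */
static void model_construct_table(const struct sysvec_entry_model *start,
				  const struct sysvec_entry_model *end)
{
	const struct sysvec_entry_model *p;

	for (p = start; p < end; p++) {
		/* Defensive range check; the patch filters in the macro. */
		if (p->vector < MODEL_FIRST_SYSTEM_VECTOR)
			continue;
		model_table[p->vector - MODEL_FIRST_SYSTEM_VECTOR] = p->func;
	}
}

/* Returns 0 if a handler ran, -1 if none is registered for the vector. */
static int model_dispatch(unsigned char vector)
{
	sysvec_fn fn;

	if (vector < MODEL_FIRST_SYSTEM_VECTOR)
		return -1;

	fn = model_table[vector - MODEL_FIRST_SYSTEM_VECTOR];
	if (!fn)
		return -1;	/* the patch issues a WARN() here */

	fn();
	return 0;
}
```

Dispatch becomes one bounds check and one indexed load, and a newly
declared system vector is picked up without touching the #HV handler.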
 arch/x86/entry/entry_64.S     |  6 +++
 arch/x86/kernel/sev.c         | 70 +++++++++++++----------------------
 arch/x86/kernel/vmlinux.lds.S |  7 ++++
 3 files changed, 38 insertions(+), 45 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index aec8dc4443d1..03af871f08e9 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -412,6 +412,12 @@ SYM_CODE_START(\asmsym)
 
 _ASM_NOKPROBE(\asmsym)
 SYM_CODE_END(\asmsym)
+	.if \vector >= FIRST_SYSTEM_VECTOR && \vector < NR_VECTORS
+		.section .system_vectors, "aw"
+		.byte \vector
+		.quad \cfunc
+		.previous
+	.endif
 .endm
 
 /*
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 03d99fad9e76..b1a98c2a52f8 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -151,6 +151,16 @@ struct sev_snp_runtime_data {
 
 static DEFINE_PER_CPU(struct sev_snp_runtime_data*, snp_runtime_data);
 
+static void (*sysvec_table[NR_VECTORS - FIRST_SYSTEM_VECTOR])
+		(struct pt_regs *regs) __ro_after_init;
+
+struct sysvec_entry {
+	unsigned char vector;
+	void (*sysvec_func)(struct pt_regs *regs);
+} __packed;
+
+extern struct sysvec_entry __system_vectors[], __system_vectors_end[];
+
 static inline u64 sev_es_rd_ghcb_msr(void)
 {
 	return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
@@ -214,51 +224,11 @@ static void do_exc_hv(struct pt_regs *regs)
 		} else if (pending_events.vector == IA32_SYSCALL_VECTOR) {
 			WARN(1, "syscall shouldn't happen\n");
 		} else if (pending_events.vector >= FIRST_SYSTEM_VECTOR) {
-			switch (pending_events.vector) {
-#if IS_ENABLED(CONFIG_HYPERV)
-			case HYPERV_STIMER0_VECTOR:
-				sysvec_hyperv_stimer0(regs);
-				break;
-			case HYPERVISOR_CALLBACK_VECTOR:
-				sysvec_hyperv_callback(regs);
-				break;
-#endif
-#ifdef CONFIG_SMP
-			case RESCHEDULE_VECTOR:
-				sysvec_reschedule_ipi(regs);
-				break;
-			case IRQ_MOVE_CLEANUP_VECTOR:
-				sysvec_irq_move_cleanup(regs);
-				break;
-			case REBOOT_VECTOR:
-				sysvec_reboot(regs);
-				break;
-			case CALL_FUNCTION_SINGLE_VECTOR:
-				sysvec_call_function_single(regs);
-				break;
-			case CALL_FUNCTION_VECTOR:
-				sysvec_call_function(regs);
-				break;
-#endif
-#ifdef CONFIG_X86_LOCAL_APIC
-			case ERROR_APIC_VECTOR:
-				sysvec_error_interrupt(regs);
-				break;
-			case SPURIOUS_APIC_VECTOR:
-				sysvec_spurious_apic_interrupt(regs);
-				break;
-			case LOCAL_TIMER_VECTOR:
-				sysvec_apic_timer_interrupt(regs);
-				break;
-			case X86_PLATFORM_IPI_VECTOR:
-				sysvec_x86_platform_ipi(regs);
-				break;
-#endif
-			case 0x0:
-				break;
-			default:
-				panic("Unexpected vector %d\n", vector);
-				unreachable();
+			if (!(sysvec_table[pending_events.vector - FIRST_SYSTEM_VECTOR])) {
+				WARN(1, "system vector entry 0x%x is NULL\n",
+				     pending_events.vector);
+			} else {
+				(*sysvec_table[pending_events.vector - FIRST_SYSTEM_VECTOR])(regs);
 			}
 		} else {
 			common_interrupt(regs, pending_events.vector);
@@ -376,6 +346,14 @@ static bool sev_restricted_injection_enabled(void)
 	return sev_status & MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED;
 }
 
+static void __init construct_sysvec_table(void)
+{
+	struct sysvec_entry *p;
+
+	for (p = __system_vectors; p < __system_vectors_end; p++)
+		sysvec_table[p->vector - FIRST_SYSTEM_VECTOR] = p->sysvec_func;
+}
+
 void __init sev_snp_init_hv_handling(void)
 {
 	struct sev_snp_runtime_data *snp_data;
@@ -403,6 +381,8 @@ void __init sev_snp_init_hv_handling(void)
 	apic_set_eoi_write(hv_doorbell_apic_eoi_write);
 
 	local_irq_restore(flags);
+
+	construct_sysvec_table();
 }
 
 static int vc_fetch_insn_kernel(struct es_em_ctxt *ctxt,
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 2e0ee14229bf..aeadb4754b00 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -339,6 +339,13 @@ SECTIONS
 		*(.altinstr_replacement)
 	}
 
+	. = ALIGN(8);
+	.system_vectors : AT(ADDR(.system_vectors) - LOAD_OFFSET) {
+		__system_vectors = .;
+		*(.system_vectors)
+		__system_vectors_end = .;
+	}
+
 	. = ALIGN(8);
 	.apicdrivers : AT(ADDR(.apicdrivers) - LOAD_OFFSET) {
 		__apicdrivers = .;
-- 
2.25.1


* [RFC PATCH V3 16/16] x86/sev: Fix interrupt exit code paths from #HV exception
  2023-01-22  2:45 [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
                   ` (14 preceding siblings ...)
  2023-01-22  2:46 ` [RFC PATCH V3 15/16] x86/sev: optimize system vector processing invoked from #HV exception Tianyu Lan
@ 2023-01-22  2:46 ` Tianyu Lan
  2023-02-02 23:20   ` Zhi Wang
  2023-02-21 16:44   ` Gupta, Pankaj
  2023-02-02 23:00 ` [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Zhi Wang
  2023-02-09 11:36 ` Gupta, Pankaj
  17 siblings, 2 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-01-22  2:46 UTC (permalink / raw)
  To: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Ashish Kalra <ashish.kalra@amd.com>

Add checks to the interrupt exit code paths for the return-to-user-mode
case. If the #HV handler is currently executing, do not follow the
irqentry_exit_to_user_mode() path, because that path can cause the
#HV handler to be preempted and rescheduled on another CPU. A
rescheduled #HV handler would process interrupts on a different CPU
than the one they were injected on, causing invalid EOIs, missed or
lost guest interrupts with the corresponding hangs, and/or per-CPU
IRQs being handled on an unintended CPU.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
---
 arch/x86/include/asm/idtentry.h | 66 +++++++++++++++++++++++++++++++++
 arch/x86/kernel/sev.c           | 30 +++++++++++++++
 2 files changed, 96 insertions(+)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 652fea10d377..45b47132be7c 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -13,6 +13,10 @@
 
 #include <asm/irq_stack.h>
 
+#ifdef CONFIG_AMD_MEM_ENCRYPT
+noinstr void irqentry_exit_hv_cond(struct pt_regs *regs, irqentry_state_t state);
+#endif
+
 /**
  * DECLARE_IDTENTRY - Declare functions for simple IDT entry points
  *		      No error code pushed by hardware
@@ -176,6 +180,7 @@ __visible noinstr void func(struct pt_regs *regs, unsigned long error_code)
 #define DECLARE_IDTENTRY_IRQ(vector, func)				\
 	DECLARE_IDTENTRY_ERRORCODE(vector, func)
 
+#ifndef CONFIG_AMD_MEM_ENCRYPT
 /**
  * DEFINE_IDTENTRY_IRQ - Emit code for device interrupt IDT entry points
  * @func:	Function name of the entry point
@@ -205,6 +210,26 @@ __visible noinstr void func(struct pt_regs *regs,			\
 }									\
 									\
 static noinline void __##func(struct pt_regs *regs, u32 vector)
+#else
+
+#define DEFINE_IDTENTRY_IRQ(func)					\
+static void __##func(struct pt_regs *regs, u32 vector);		\
+									\
+__visible noinstr void func(struct pt_regs *regs,			\
+			    unsigned long error_code)			\
+{									\
+	irqentry_state_t state = irqentry_enter(regs);			\
+	u32 vector = (u32)(u8)error_code;				\
+									\
+	instrumentation_begin();					\
+	kvm_set_cpu_l1tf_flush_l1d();					\
+	run_irq_on_irqstack_cond(__##func, regs, vector);		\
+	instrumentation_end();						\
+	irqentry_exit_hv_cond(regs, state);				\
+}									\
+									\
+static noinline void __##func(struct pt_regs *regs, u32 vector)
+#endif
 
 /**
  * DECLARE_IDTENTRY_SYSVEC - Declare functions for system vector entry points
@@ -221,6 +246,7 @@ static noinline void __##func(struct pt_regs *regs, u32 vector)
 #define DECLARE_IDTENTRY_SYSVEC(vector, func)				\
 	DECLARE_IDTENTRY(vector, func)
 
+#ifndef CONFIG_AMD_MEM_ENCRYPT
 /**
  * DEFINE_IDTENTRY_SYSVEC - Emit code for system vector IDT entry points
  * @func:	Function name of the entry point
@@ -245,6 +271,26 @@ __visible noinstr void func(struct pt_regs *regs)			\
 }									\
 									\
 static noinline void __##func(struct pt_regs *regs)
+#else
+
+#define DEFINE_IDTENTRY_SYSVEC(func)					\
+static void __##func(struct pt_regs *regs);				\
+									\
+__visible noinstr void func(struct pt_regs *regs)			\
+{									\
+	irqentry_state_t state = irqentry_enter(regs);			\
+									\
+	instrumentation_begin();					\
+	kvm_set_cpu_l1tf_flush_l1d();					\
+	run_sysvec_on_irqstack_cond(__##func, regs);			\
+	instrumentation_end();						\
+	irqentry_exit_hv_cond(regs, state);				\
+}									\
+									\
+static noinline void __##func(struct pt_regs *regs)
+#endif
+
+#ifndef CONFIG_AMD_MEM_ENCRYPT
 
 /**
  * DEFINE_IDTENTRY_SYSVEC_SIMPLE - Emit code for simple system vector IDT
@@ -274,6 +320,26 @@ __visible noinstr void func(struct pt_regs *regs)			\
 }									\
 									\
 static __always_inline void __##func(struct pt_regs *regs)
+#else
+
+#define DEFINE_IDTENTRY_SYSVEC_SIMPLE(func)				\
+static __always_inline void __##func(struct pt_regs *regs);		\
+									\
+__visible noinstr void func(struct pt_regs *regs)			\
+{									\
+	irqentry_state_t state = irqentry_enter(regs);			\
+									\
+	instrumentation_begin();					\
+	__irq_enter_raw();						\
+	kvm_set_cpu_l1tf_flush_l1d();					\
+	__##func(regs);						\
+	__irq_exit_raw();						\
+	instrumentation_end();						\
+	irqentry_exit_hv_cond(regs, state);				\
+}									\
+									\
+static __always_inline void __##func(struct pt_regs *regs)
+#endif
 
 /**
  * DECLARE_IDTENTRY_XENCB - Declare functions for XEN HV callback entry point
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index b1a98c2a52f8..23f15e95838b 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -147,6 +147,10 @@ struct sev_hv_doorbell_page {
 
 struct sev_snp_runtime_data {
 	struct sev_hv_doorbell_page hv_doorbell_page;
+	/*
+	 * Indication that we are currently handling #HV events.
+	 */
+	bool hv_handling_events;
 };
 
 static DEFINE_PER_CPU(struct sev_snp_runtime_data*, snp_runtime_data);
@@ -200,6 +204,8 @@ static void do_exc_hv(struct pt_regs *regs)
 	union hv_pending_events pending_events;
 	u8 vector;
 
+	this_cpu_read(snp_runtime_data)->hv_handling_events = true;
+
 	while (sev_hv_pending()) {
 		pending_events.events = xchg(
 			&sev_snp_current_doorbell_page()->pending_events.events,
@@ -234,6 +240,8 @@ static void do_exc_hv(struct pt_regs *regs)
 			common_interrupt(regs, pending_events.vector);
 		}
 	}
+
+	this_cpu_read(snp_runtime_data)->hv_handling_events = false;
 }
 
 static __always_inline bool on_vc_stack(struct pt_regs *regs)
@@ -2529,3 +2537,25 @@ static int __init snp_init_platform_device(void)
 	return 0;
 }
 device_initcall(snp_init_platform_device);
+
+noinstr void irqentry_exit_hv_cond(struct pt_regs *regs, irqentry_state_t state)
+{
+	/*
+	 * Check whether this returns to user mode, if so and if
+	 * we are currently executing the #HV handler then we don't
+	 * want to follow the irqentry_exit_to_user_mode path as
+	 * that can potentially cause the #HV handler to be
+	 * preempted and rescheduled on another CPU. Rescheduled #HV
+	 * handler on another cpu will cause interrupts to be handled
+	 * on a different cpu than the injected one, causing
+	 * invalid EOIs and missed/lost guest interrupts and
+	 * corresponding hangs and/or per-cpu IRQs handled on
+	 * non-intended cpu.
+	 */
+	if (user_mode(regs) &&
+	    this_cpu_read(snp_runtime_data)->hv_handling_events)
+		return;
+
+	/* follow normal interrupt return/exit path */
+	irqentry_exit(regs, state);
+}
-- 
2.25.1
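
The exit decision in irqentry_exit_hv_cond() can be modelled in user space
as a pure function of two inputs: whether the interrupted context was user
mode, and whether the per-CPU hv_handling_events flag is set. The sketch
below is only a model under those assumptions; struct cpu_state and the
enum are hypothetical names, and the real function calls irqentry_exit()
instead of returning a value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state; mirrors only the flag the patch adds. */
struct cpu_state {
	bool hv_handling_events;
};

enum exit_path {
	EXIT_SKIPPED,	/* return early, no exit-to-user-mode work */
	EXIT_NORMAL,	/* take the regular irqentry_exit() path */
};

/*
 * Skip the exit work only when both conditions hold: the frame returns
 * to user mode and this CPU is still inside the #HV handler loop.
 */
static enum exit_path exit_hv_cond(const struct cpu_state *cpu,
				   bool returns_to_user)
{
	assert(cpu != NULL);

	if (returns_to_user && cpu->hv_handling_events)
		return EXIT_SKIPPED;

	return EXIT_NORMAL;
}
```

Only one of the four input combinations skips the normal path, which is
why the flag must be cleared again at the end of do_exc_hv().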


* Re: [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler
  2023-01-22  2:46 ` [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler Tianyu Lan
@ 2023-01-23  7:33   ` Gupta, Pankaj
  2023-02-03  7:27     ` Tianyu Lan
  2023-03-09 11:48   ` Gupta, Pankaj
  2023-03-31 15:57   ` Borislav Petkov
  2 siblings, 1 reply; 60+ messages in thread
From: Gupta, Pankaj @ 2023-01-23  7:33 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

Hi Tianyu,

Just skimming over what changed in this version.

> From: Tianyu Lan <tiala@microsoft.com>
> 
> Add a #HV exception handler that uses IST stack.
> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
> Change since RFC V2:
>         * Remove unnecessary line in the change log.
> ---
>   arch/x86/entry/entry_64.S             | 58 +++++++++++++++++++++++++++
>   arch/x86/include/asm/cpu_entry_area.h |  6 +++
>   arch/x86/include/asm/idtentry.h       | 39 +++++++++++++++++-
>   arch/x86/include/asm/page_64_types.h  |  1 +
>   arch/x86/include/asm/trapnr.h         |  1 +
>   arch/x86/include/asm/traps.h          |  1 +
>   arch/x86/kernel/cpu/common.c          |  1 +
>   arch/x86/kernel/dumpstack_64.c        |  9 ++++-
>   arch/x86/kernel/idt.c                 |  1 +
>   arch/x86/kernel/sev.c                 | 53 ++++++++++++++++++++++++
>   arch/x86/kernel/traps.c               | 40 ++++++++++++++++++
>   arch/x86/mm/cpu_entry_area.c          |  2 +
>   12 files changed, 209 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 15739a2c0983..6baec7653f19 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -563,6 +563,64 @@ SYM_CODE_START(\asmsym)
>   .Lfrom_usermode_switch_stack_\@:
>   	idtentry_body user_\cfunc, has_error_code=1
>   
> +_ASM_NOKPROBE(\asmsym)
> +SYM_CODE_END(\asmsym)
> +.endm
> +/*
> + * idtentry_hv - Macro to generate entry stub for #HV
> + * @vector:		Vector number
> + * @asmsym:		ASM symbol for the entry point
> + * @cfunc:		C function to be called
> + *
> + * The macro emits code to set up the kernel context for #HV. The #HV handler
> + * runs on an IST stack and needs to be able to support nested #HV exceptions.
> + *
> + * To make this work the #HV entry code tries its best to pretend it doesn't use
> + * an IST stack by switching to the task stack if coming from user-space (which
> + * includes early SYSCALL entry path) or back to the stack in the IRET frame if
> + * entered from kernel-mode.
> + *
> + * If entered from kernel-mode the return stack is validated first, and if it is
> + * not safe to use (e.g. because it points to the entry stack) the #HV handler
> + * will switch to a fall-back stack (HV2) and call a special handler function.
> + *
> + * The macro is only used for one vector, but it is planned to be extended in
> + * the future for the #HV exception.

It seems this line was not removed from the comment.

> + */
> +.macro idtentry_hv vector asmsym cfunc
> +SYM_CODE_START(\asmsym)
> +	UNWIND_HINT_IRET_REGS
> +	ASM_CLAC

Did you get a chance to review the new instructions
added at the start, similar to idtentry_vc, and the comments
added as suggested here?

https://lore.kernel.org/lkml/16e50239-39b2-4fb4-5110-18f13ba197fe@amd.com/

> +	pushq	$-1			/* ORIG_RAX: no syscall to restart */
> +
> +	testb	$3, CS-ORIG_RAX(%rsp)
> +	jnz	.Lfrom_usermode_switch_stack_\@
> +
> +	call	paranoid_entry
> +
> +	UNWIND_HINT_REGS
> +
> +	/*
> +	 * Switch off the IST stack to make it free for nested exceptions.
> +	 */
> +	movq	%rsp, %rdi		/* pt_regs pointer */
> +	call	hv_switch_off_ist
> +	movq	%rax, %rsp		/* Switch to new stack */
> +
> +	UNWIND_HINT_REGS
> +
> +	/* Update pt_regs */
> +	movq	ORIG_RAX(%rsp), %rsi	/* get error code into 2nd argument*/
> +	movq	$-1, ORIG_RAX(%rsp)	/* no syscall to restart */
> +
> +	movq	%rsp, %rdi		/* pt_regs pointer */
> +	call	kernel_\cfunc
> +
> +	jmp	paranoid_exit
> +
> +.Lfrom_usermode_switch_stack_\@:
> +	idtentry_body user_\cfunc, has_error_code=1
> +
>   _ASM_NOKPROBE(\asmsym)
>   SYM_CODE_END(\asmsym)
>   .endm
> diff --git a/arch/x86/include/asm/cpu_entry_area.h b/arch/x86/include/asm/cpu_entry_area.h
> index 462fc34f1317..2186ed601b4a 100644
> --- a/arch/x86/include/asm/cpu_entry_area.h
> +++ b/arch/x86/include/asm/cpu_entry_area.h
> @@ -30,6 +30,10 @@
>   	char	VC_stack[optional_stack_size];			\
>   	char	VC2_stack_guard[guardsize];			\
>   	char	VC2_stack[optional_stack_size];			\
> +	char	HV_stack_guard[guardsize];			\
> +	char	HV_stack[optional_stack_size];			\
> +	char	HV2_stack_guard[guardsize];			\
> +	char	HV2_stack[optional_stack_size];			\
>   	char	IST_top_guard[guardsize];			\
>   
>   /* The exception stacks' physical storage. No guard pages required */
> @@ -52,6 +56,8 @@ enum exception_stack_ordering {
>   	ESTACK_MCE,
>   	ESTACK_VC,
>   	ESTACK_VC2,
> +	ESTACK_HV,
> +	ESTACK_HV2,
>   	N_EXCEPTION_STACKS
>   };
>   
> diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
> index 72184b0b2219..652fea10d377 100644
> --- a/arch/x86/include/asm/idtentry.h
> +++ b/arch/x86/include/asm/idtentry.h
> @@ -317,6 +317,19 @@ static __always_inline void __##func(struct pt_regs *regs)
>   	__visible noinstr void kernel_##func(struct pt_regs *regs, unsigned long error_code);	\
>   	__visible noinstr void   user_##func(struct pt_regs *regs, unsigned long error_code)
>   
> +
> +/**
> + * DECLARE_IDTENTRY_HV - Declare functions for the HV entry point
> + * @vector:	Vector number (ignored for C)
> + * @func:	Function name of the entry point
> + *
> + * Maps to DECLARE_IDTENTRY_RAW, but declares also the user C handler.
> + */
> +#define DECLARE_IDTENTRY_HV(vector, func)				\
> +	DECLARE_IDTENTRY_RAW_ERRORCODE(vector, func);			\
> +	__visible noinstr void kernel_##func(struct pt_regs *regs);	\
> +	__visible noinstr void   user_##func(struct pt_regs *regs)
> +
>   /**
>    * DEFINE_IDTENTRY_IST - Emit code for IST entry points
>    * @func:	Function name of the entry point
> @@ -376,6 +389,26 @@ static __always_inline void __##func(struct pt_regs *regs)
>   #define DEFINE_IDTENTRY_VC_USER(func)				\
>   	DEFINE_IDTENTRY_RAW_ERRORCODE(user_##func)
>   
> +/**
> + * DEFINE_IDTENTRY_HV_KERNEL - Emit code for HV injection handler
> + *			       when raised from kernel mode
> + * @func:	Function name of the entry point
> + *
> + * Maps to DEFINE_IDTENTRY_RAW
> + */
> +#define DEFINE_IDTENTRY_HV_KERNEL(func)					\
> +	DEFINE_IDTENTRY_RAW(kernel_##func)
> +
> +/**
> + * DEFINE_IDTENTRY_HV_USER - Emit code for HV injection handler
> + *			     when raised from user mode
> + * @func:	Function name of the entry point
> + *
> + * Maps to DEFINE_IDTENTRY_RAW
> + */
> +#define DEFINE_IDTENTRY_HV_USER(func)					\
> +	DEFINE_IDTENTRY_RAW(user_##func)
> +
>   #else	/* CONFIG_X86_64 */
>   
>   /**
> @@ -465,6 +498,9 @@ __visible noinstr void func(struct pt_regs *regs,			\
>   # define DECLARE_IDTENTRY_VC(vector, func)				\
>   	idtentry_vc vector asm_##func func
>   
> +# define DECLARE_IDTENTRY_HV(vector, func)				\
> +	idtentry_hv vector asm_##func func
> +
>   #else
>   # define DECLARE_IDTENTRY_MCE(vector, func)				\
>   	DECLARE_IDTENTRY(vector, func)
> @@ -622,9 +658,10 @@ DECLARE_IDTENTRY_RAW_ERRORCODE(X86_TRAP_DF,	xenpv_exc_double_fault);
>   DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_CP,	exc_control_protection);
>   #endif
>   
> -/* #VC */
> +/* #VC & #HV */
>   #ifdef CONFIG_AMD_MEM_ENCRYPT
>   DECLARE_IDTENTRY_VC(X86_TRAP_VC,	exc_vmm_communication);
> +DECLARE_IDTENTRY_HV(X86_TRAP_HV,	exc_hv_injection);
>   #endif
>   
>   #ifdef CONFIG_XEN_PV
> diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> index e9e2c3ba5923..0bd7dab676c5 100644
> --- a/arch/x86/include/asm/page_64_types.h
> +++ b/arch/x86/include/asm/page_64_types.h
> @@ -29,6 +29,7 @@
>   #define	IST_INDEX_DB		2
>   #define	IST_INDEX_MCE		3
>   #define	IST_INDEX_VC		4
> +#define	IST_INDEX_HV		5
>   
>   /*
>    * Set __PAGE_OFFSET to the most negative possible address +
> diff --git a/arch/x86/include/asm/trapnr.h b/arch/x86/include/asm/trapnr.h
> index f5d2325aa0b7..c6583631cecb 100644
> --- a/arch/x86/include/asm/trapnr.h
> +++ b/arch/x86/include/asm/trapnr.h
> @@ -26,6 +26,7 @@
>   #define X86_TRAP_XF		19	/* SIMD Floating-Point Exception */
>   #define X86_TRAP_VE		20	/* Virtualization Exception */
>   #define X86_TRAP_CP		21	/* Control Protection Exception */
> +#define X86_TRAP_HV		28	/* HV injected exception in SNP restricted mode */
>   #define X86_TRAP_VC		29	/* VMM Communication Exception */
>   #define X86_TRAP_IRET		32	/* IRET Exception */
>   
> diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
> index 47ecfff2c83d..6795d3e517d6 100644
> --- a/arch/x86/include/asm/traps.h
> +++ b/arch/x86/include/asm/traps.h
> @@ -16,6 +16,7 @@ asmlinkage __visible notrace
>   struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs);
>   void __init trap_init(void);
>   asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *eregs);
> +asmlinkage __visible noinstr struct pt_regs *hv_switch_off_ist(struct pt_regs *eregs);
>   #endif
>   
>   extern bool ibt_selftest(void);
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 9cfca3d7d0e2..e48a489777ec 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -2162,6 +2162,7 @@ static inline void tss_setup_ist(struct tss_struct *tss)
>   	tss->x86_tss.ist[IST_INDEX_MCE] = __this_cpu_ist_top_va(MCE);
>   	/* Only mapped when SEV-ES is active */
>   	tss->x86_tss.ist[IST_INDEX_VC] = __this_cpu_ist_top_va(VC);
> +	tss->x86_tss.ist[IST_INDEX_HV] = __this_cpu_ist_top_va(HV);
>   }
>   
>   #else /* CONFIG_X86_64 */
> diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
> index f05339fee778..6d8f8864810c 100644
> --- a/arch/x86/kernel/dumpstack_64.c
> +++ b/arch/x86/kernel/dumpstack_64.c
> @@ -26,11 +26,14 @@ static const char * const exception_stack_names[] = {
>   		[ ESTACK_MCE	]	= "#MC",
>   		[ ESTACK_VC	]	= "#VC",
>   		[ ESTACK_VC2	]	= "#VC2",
> +		[ ESTACK_HV	]	= "#HV",
> +		[ ESTACK_HV2	]	= "#HV2",
> +		
>   };
>   
>   const char *stack_type_name(enum stack_type type)
>   {
> -	BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
> +	BUILD_BUG_ON(N_EXCEPTION_STACKS != 8);
>   
>   	if (type == STACK_TYPE_TASK)
>   		return "TASK";
> @@ -89,6 +92,8 @@ struct estack_pages estack_pages[CEA_ESTACK_PAGES] ____cacheline_aligned = {
>   	EPAGERANGE(MCE),
>   	EPAGERANGE(VC),
>   	EPAGERANGE(VC2),
> +	EPAGERANGE(HV),
> +	EPAGERANGE(HV2),
>   };
>   
>   static __always_inline bool in_exception_stack(unsigned long *stack, struct stack_info *info)
> @@ -98,7 +103,7 @@ static __always_inline bool in_exception_stack(unsigned long *stack, struct stac
>   	struct pt_regs *regs;
>   	unsigned int k;
>   
> -	BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
> +	BUILD_BUG_ON(N_EXCEPTION_STACKS != 8);
>   
>   	begin = (unsigned long)__this_cpu_read(cea_exception_stacks);
>   	/*
> diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
> index a58c6bc1cd68..48c0a7e1dbcb 100644
> --- a/arch/x86/kernel/idt.c
> +++ b/arch/x86/kernel/idt.c
> @@ -113,6 +113,7 @@ static const __initconst struct idt_data def_idts[] = {
>   
>   #ifdef CONFIG_AMD_MEM_ENCRYPT
>   	ISTG(X86_TRAP_VC,		asm_exc_vmm_communication, IST_INDEX_VC),
> +	ISTG(X86_TRAP_HV,		asm_exc_hv_injection, IST_INDEX_HV),
>   #endif
>   
>   	SYSG(X86_TRAP_OF,		asm_exc_overflow),
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 679026a640ef..a8862a2eff67 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -2004,6 +2004,59 @@ DEFINE_IDTENTRY_VC_USER(exc_vmm_communication)
>   	irqentry_exit_to_user_mode(regs);
>   }
>   
> +static bool hv_raw_handle_exception(struct pt_regs *regs)
> +{
> +	return false;
> +}
> +
> +static __always_inline bool on_hv_fallback_stack(struct pt_regs *regs)
> +{
> +	unsigned long sp = (unsigned long)regs;
> +
> +	return (sp >= __this_cpu_ist_bottom_va(HV2) && sp < __this_cpu_ist_top_va(HV2));
> +}
> +
> +DEFINE_IDTENTRY_HV_USER(exc_hv_injection)
> +{
> +	irqentry_enter_from_user_mode(regs);
> +	instrumentation_begin();
> +
> +	if (!hv_raw_handle_exception(regs)) {
> +		/*
> +		 * Do not kill the machine if user-space triggered the
> +		 * exception. Send SIGBUS instead and let user-space deal
> +		 * with it.
> +		 */
> +		force_sig_fault(SIGBUS, BUS_OBJERR, (void __user *)0);
> +	}
> +
> +	instrumentation_end();
> +	irqentry_exit_to_user_mode(regs);
> +}
> +
> +DEFINE_IDTENTRY_HV_KERNEL(exc_hv_injection)
> +{
> +	irqentry_state_t irq_state;
> +
> +	irq_state = irqentry_nmi_enter(regs);
> +	instrumentation_begin();
> +
> +	if (!hv_raw_handle_exception(regs)) {
> +		pr_emerg("PANIC: Unhandled #HV exception in kernel space\n");
> +
> +		/* Show some debug info */
> +		show_regs(regs);
> +
> +		/* Ask hypervisor to sev_es_terminate */
> +		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
> +
> +		panic("Returned from Terminate-Request to Hypervisor\n");
> +	}
> +
> +	instrumentation_end();
> +	irqentry_nmi_exit(regs, irq_state);
> +}
> +
>   bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
>   {
>   	unsigned long exit_code = regs->orig_ax;
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index d317dc3d06a3..d29debec8134 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -905,6 +905,46 @@ asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *r
>   
>   	return regs_ret;
>   }
> +
> +asmlinkage __visible noinstr struct pt_regs *hv_switch_off_ist(struct pt_regs *regs)
> +{
> +	unsigned long sp, *stack;
> +	struct stack_info info;
> +	struct pt_regs *regs_ret;
> +
> +	/*
> +	 * In the SYSCALL entry path the RSP value comes from user-space - don't
> +	 * trust it and switch to the current kernel stack
> +	 */
> +	if (ip_within_syscall_gap(regs)) {
> +		sp = this_cpu_read(pcpu_hot.top_of_stack);
> +		goto sync;
> +	}
> +
> +	/*
> +	 * From here on the RSP value is trusted. Now check whether entry
> +	 * happened from a safe stack. Not safe are the entry or unknown stacks,
> +	 * use the fall-back stack instead in this case.
> +	 */
> +	sp    = regs->sp;
> +	stack = (unsigned long *)sp;
> +
> +	if (!get_stack_info_noinstr(stack, current, &info) || info.type == STACK_TYPE_ENTRY ||
> +	    info.type > STACK_TYPE_EXCEPTION_LAST)
> +		sp = __this_cpu_ist_top_va(HV2);
> +sync:
> +	/*
> +	 * Found a safe stack - switch to it as if the entry didn't happen via
> +	 * IST stack. The code below only copies pt_regs, the real switch happens
> +	 * in assembly code.
> +	 */
> +	sp = ALIGN_DOWN(sp, 8) - sizeof(*regs_ret);
> +
> +	regs_ret = (struct pt_regs *)sp;
> +	*regs_ret = *regs;
> +
> +	return regs_ret;
> +}
>   #endif
>   
>   asmlinkage __visible noinstr struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs)
> diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
> index 7316a8224259..3ec844cef652 100644
> --- a/arch/x86/mm/cpu_entry_area.c
> +++ b/arch/x86/mm/cpu_entry_area.c
> @@ -153,6 +153,8 @@ static void __init percpu_setup_exception_stacks(unsigned int cpu)
>   		if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) {
>   			cea_map_stack(VC);
>   			cea_map_stack(VC2);
> +			cea_map_stack(HV);
> +			cea_map_stack(HV2);
>   		}
>   	}
>   }
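
The pointer arithmetic at the end of hv_switch_off_ist() can be tried out
in user space. The sketch below aligns a candidate stack pointer down to 8
bytes, reserves room for one register frame below it, and copies the frame
there. struct frame is a hypothetical, shortened stand-in for struct
pt_regs, and copy_frame_below() is an illustrative name, not a kernel
function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Round x down to a multiple of a; a must be a power of two. */
#define ALIGN_DOWN(x, a)	((x) & ~((uintptr_t)(a) - 1))

/* Hypothetical, shortened register frame standing in for struct pt_regs. */
struct frame {
	uint64_t ip;
	uint64_t sp;
	uint64_t flags;
};

/*
 * Place a copy of *src directly below the aligned-down value of sp and
 * return its address. As in the patch, only the copy happens here; the
 * caller is responsible for actually switching stacks.
 */
static struct frame *copy_frame_below(uintptr_t sp, const struct frame *src)
{
	struct frame *dst;

	assert(sp >= sizeof(*dst));

	sp = ALIGN_DOWN(sp, 8) - sizeof(*dst);
	dst = (struct frame *)sp;
	memcpy(dst, src, sizeof(*dst));

	return dst;
}
```

With a stack top that is 3 bytes short of an 8-byte boundary, the copy
lands 8 + sizeof(struct frame) bytes below that boundary, so the frame
itself always starts on an aligned address.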


* Re: [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest
  2023-01-22  2:46 ` [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest Tianyu Lan
@ 2023-01-23 15:30   ` Tom Lendacky
  2023-02-03  7:00     ` Tianyu Lan
  2023-01-31 18:34   ` Michael Kelley (LINUX)
  1 sibling, 1 reply; 60+ messages in thread
From: Tom Lendacky @ 2023-01-23 15:30 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, venu.busireddy, sterritt, tony.luck, samitolvanen,
	fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 1/21/23 20:46, Tianyu Lan wrote:
> From: Tianyu Lan <tiala@microsoft.com>
> 
> The wakeup_secondary_cpu callback was populated with wakeup_
> cpu_via_vmgexit() which doesn't work for Hyper-V. Override it

An explanation of why it doesn't work would be nice here.

> with Hyper-V specific hook which uses HVCALL_START_VIRTUAL_
> PROCESSOR hvcall to start AP with vmsa data structure.
> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
> Change since RFC v2:
>         * Add helper function to initialize segment
>         * Fix some coding style
> ---
>   arch/x86/include/asm/mshyperv.h   |   2 +
>   arch/x86/include/asm/sev.h        |  13 ++++
>   arch/x86/include/asm/svm.h        |  47 +++++++++++++
>   arch/x86/kernel/cpu/mshyperv.c    | 112 ++++++++++++++++++++++++++++--
>   include/asm-generic/hyperv-tlfs.h |  19 +++++
>   5 files changed, 189 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 7266d71d30d6..c69051eec0e1 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -203,6 +203,8 @@ struct irq_domain *hv_create_pci_msi_domain(void);
>   int hv_map_ioapic_interrupt(int ioapic_id, bool level, int vcpu, int vector,
>   		struct hv_interrupt_entry *entry);
>   int hv_unmap_ioapic_interrupt(int ioapic_id, struct hv_interrupt_entry *entry);
> +int hv_set_mem_host_visibility(unsigned long addr, int numpages, bool visible);
> +int hv_snp_boot_ap(int cpu, unsigned long start_ip);
>   
>   #ifdef CONFIG_AMD_MEM_ENCRYPT
>   void hv_ghcb_msr_write(u64 msr, u64 value);
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index ebc271bb6d8e..e34aaf730220 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -86,6 +86,19 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
>   
>   #define RMPADJUST_VMSA_PAGE_BIT		BIT(16)
>   
> +union sev_rmp_adjust {
> +	u64 as_uint64;
> +	struct {
> +		unsigned long target_vmpl : 8;
> +		unsigned long enable_read : 1;
> +		unsigned long enable_write : 1;
> +		unsigned long enable_user_execute : 1;
> +		unsigned long enable_kernel_execute : 1;
> +		unsigned long reserved1 : 4;
> +		unsigned long vmsa : 1;
> +	};
> +};
> +
>   /* SNP Guest message request */
>   struct snp_req_data {
>   	unsigned long req_gpa;
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index cb1ee53ad3b1..f8b321a11ee4 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -336,6 +336,53 @@ struct vmcb_save_area {

Please don't update the vmcb_save_area; you should be using/updating the
sev_es_save_area structure for SNP.

>   	u64 last_excp_to;
>   	u8 reserved_0x298[72];
>   	u32 spec_ctrl;		/* Guest version of SPEC_CTRL at 0x2E0 */
> +	u8 reserved_7b[4];
> +	u32 pkru;
> +	u8 reserved_7a[20];
> +	u64 reserved_8;		/* rax already available at 0x01f8 */
> +	u64 rcx;
> +	u64 rdx;
> +	u64 rbx;
> +	u64 reserved_9;		/* rsp already available at 0x01d8 */
> +	u64 rbp;
> +	u64 rsi;
> +	u64 rdi;
> +	u64 r8;
> +	u64 r9;
> +	u64 r10;
> +	u64 r11;
> +	u64 r12;
> +	u64 r13;
> +	u64 r14;
> +	u64 r15;
> +	u8 reserved_10[16];
> +	u64 sw_exit_code;
> +	u64 sw_exit_info_1;
> +	u64 sw_exit_info_2;
> +	u64 sw_scratch;
> +	union {
> +		u64 sev_features;
> +		struct {
> +			u64 sev_feature_snp			: 1;
> +			u64 sev_feature_vtom			: 1;
> +			u64 sev_feature_reflectvc		: 1;
> +			u64 sev_feature_restrict_injection	: 1;
> +			u64 sev_feature_alternate_injection	: 1;
> +			u64 sev_feature_full_debug		: 1;
> +			u64 sev_feature_reserved1		: 1;
> +			u64 sev_feature_snpbtb_isolation	: 1;
> +			u64 sev_feature_reserved2		: 56;

For the bits definition, use:

			u64 sev_feature_snp			: 1,
			    sev_feature_vtom			: 1,
			    sev_feature_reflectvc		: 1,
			    ...

Thanks,
Tom
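
As a stand-alone illustration of the comma-separated declaration style
suggested above, the sketch below declares a trimmed-down feature word and
sets the two bits that hv_snp_boot_ap() enables. The union and function
names are hypothetical and the field names are shortened. Bit-field layout
is implementation-defined; the sketch assumes the common GCC/Clang x86-64
behaviour of allocating fields starting from bit 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical, trimmed-down copy of the SEV feature word. One type
 * specifier, then a comma-separated list of bit-fields.
 */
union sev_features_word {
	uint64_t sev_features;
	struct {
		uint64_t snp			: 1,
			 vtom			: 1,
			 reflectvc		: 1,
			 restrict_injection	: 1,
			 alternate_injection	: 1,
			 full_debug		: 1,
			 reserved1		: 1,
			 snpbtb_isolation	: 1,
			 reserved2		: 56;
	};
};

static_assert(sizeof(union sev_features_word) == sizeof(uint64_t),
	      "feature word must stay 64 bits wide");

/* Build the feature word with SNP and restricted injection enabled. */
static uint64_t ap_features(void)
{
	union sev_features_word w = { .sev_features = 0 };

	w.snp = 1;
	w.restrict_injection = 1;

	return w.sev_features;
}
```

Both declaration styles produce the same layout; the comma form only
avoids repeating the type on every line.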

> +		};
> +	};
> +	u64 vintr_ctrl;
> +	u64 guest_error_code;
> +	u64 virtual_tom;
> +	u64 tlb_id;
> +	u64 pcpu_id;
> +	u64 event_inject;
> +	u64 xcr0;
> +	u8 valid_bitmap[16];
> +	u64 x87_state_gpa;
>   } __packed;
>   
>   /* Save area definition for SEV-ES and SEV-SNP guests */
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 197c8f2ec4eb..9d547751a1a7 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -39,6 +39,13 @@
>   #include <asm/realmode.h>
>   #include <asm/e820/api.h>
>   
> +/*
> + * DEFAULT INIT GPAT and SEGMENT LIMIT value in struct VMSA
> + * to start AP in enlightened SEV guest.
> + */
> +#define HV_AP_INIT_GPAT_DEFAULT		0x0007040600070406ULL
> +#define HV_AP_SEGMENT_LIMIT		0xffffffff
> +
>   /* Is Linux running as the root partition? */
>   bool hv_root_partition;
>   struct ms_hyperv_info ms_hyperv;
> @@ -230,6 +237,94 @@ static void __init hv_smp_prepare_boot_cpu(void)
>   #endif
>   }
>   
> +static u8 ap_start_input_arg[PAGE_SIZE] __bss_decrypted __aligned(PAGE_SIZE);
> +static u8 ap_start_stack[PAGE_SIZE] __aligned(PAGE_SIZE);
> +
> +#define hv_populate_vmcb_seg(seg, gdtr_base)			\
> +do {								\
> +	if (seg.selector) {					\
> +		seg.base = 0;					\
> +		seg.limit = HV_AP_SEGMENT_LIMIT;		\
> +		seg.attrib = *(u16 *)(gdtr_base + seg.selector + 5);	\
> +		seg.attrib = (seg.attrib & 0xFF) | ((seg.attrib >> 4) & 0xF00); \
> +	}							\
> +} while (0)
> +
> +int hv_snp_boot_ap(int cpu, unsigned long start_ip)
> +{
> +	struct vmcb_save_area *vmsa = (struct vmcb_save_area *)
> +		__get_free_page(GFP_KERNEL | __GFP_ZERO);
> +	struct desc_ptr gdtr;
> +	u64 ret, retry = 5;
> +	struct hv_start_virtual_processor_input *start_vp_input;
> +	union sev_rmp_adjust rmp_adjust;
> +	unsigned long flags;
> +
> +	native_store_gdt(&gdtr);
> +
> +	vmsa->gdtr.base = gdtr.address;
> +	vmsa->gdtr.limit = gdtr.size;
> +
> +	asm volatile("movl %%es, %%eax;" : "=a" (vmsa->es.selector));
> +	hv_populate_vmcb_seg(vmsa->es, vmsa->gdtr.base);
> +
> +	asm volatile("movl %%cs, %%eax;" : "=a" (vmsa->cs.selector));
> +	hv_populate_vmcb_seg(vmsa->cs, vmsa->gdtr.base);
> +
> +	asm volatile("movl %%ss, %%eax;" : "=a" (vmsa->ss.selector));
> +	hv_populate_vmcb_seg(vmsa->ss, vmsa->gdtr.base);
> +
> +	asm volatile("movl %%ds, %%eax;" : "=a" (vmsa->ds.selector));
> +	hv_populate_vmcb_seg(vmsa->ds, vmsa->gdtr.base);
> +
> +	vmsa->efer = native_read_msr(MSR_EFER);
> +
> +	asm volatile("movq %%cr4, %%rax;" : "=a" (vmsa->cr4));
> +	asm volatile("movq %%cr3, %%rax;" : "=a" (vmsa->cr3));
> +	asm volatile("movq %%cr0, %%rax;" : "=a" (vmsa->cr0));
> +
> +	vmsa->xcr0 = 1;
> +	vmsa->g_pat = HV_AP_INIT_GPAT_DEFAULT;
> +	vmsa->rip = (u64)secondary_startup_64_no_verify;
> +	vmsa->rsp = (u64)&ap_start_stack[PAGE_SIZE];
> +
> +	vmsa->sev_feature_snp = 1;
> +	vmsa->sev_feature_restrict_injection = 1;
> +
> +	rmp_adjust.as_uint64 = 0;
> +	rmp_adjust.target_vmpl = 1;
> +	rmp_adjust.vmsa = 1;
> +	ret = rmpadjust((unsigned long)vmsa, RMP_PG_SIZE_4K,
> +			rmp_adjust.as_uint64);
> +	if (ret != 0) {
> +		pr_err("RMPADJUST(%llx) failed: %llx\n", (u64)vmsa, ret);
> +		return ret;
> +	}
> +
> +	local_irq_save(flags);
> +	start_vp_input =
> +		(struct hv_start_virtual_processor_input *)ap_start_input_arg;
> +	memset(start_vp_input, 0, sizeof(*start_vp_input));
> +	start_vp_input->partitionid = -1;
> +	start_vp_input->vpindex = cpu;
> +	start_vp_input->targetvtl = ms_hyperv.vtl;
> +	*(u64 *)&start_vp_input->context[0] = __pa(vmsa) | 1;
> +
> +	do {
> +		ret = hv_do_hypercall(HVCALL_START_VIRTUAL_PROCESSOR,
> +				      start_vp_input, NULL);
> +	} while (hv_result(ret) == HV_STATUS_TIME_OUT && retry--);
> +
> +	if (!hv_result_success(ret)) {
> +		pr_err("HvCallStartVirtualProcessor failed: %llx\n", ret);
> +		goto done;
> +	}
> +
> +done:
> +	local_irq_restore(flags);
> +	return ret;
> +}
> +
>   static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
>   {
>   #ifdef CONFIG_X86_64
> @@ -239,6 +334,16 @@ static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
>   
>   	native_smp_prepare_cpus(max_cpus);
>   
> +	/*
> +	 *  Override wakeup_secondary_cpu callback for SEV-SNP
> +	 *  enlightened guest.
> +	 */
> +	if (hv_isolation_type_en_snp())
> +		apic->wakeup_secondary_cpu = hv_snp_boot_ap;
> +
> +	if (!hv_root_partition)
> +		return;
> +
>   #ifdef CONFIG_X86_64
>   	for_each_present_cpu(i) {
>   		if (i == 0)
> @@ -475,8 +580,7 @@ static void __init ms_hyperv_init_platform(void)
>   
>   # ifdef CONFIG_SMP
>   	smp_ops.smp_prepare_boot_cpu = hv_smp_prepare_boot_cpu;
> -	if (hv_root_partition)
> -		smp_ops.smp_prepare_cpus = hv_smp_prepare_cpus;
> +	smp_ops.smp_prepare_cpus = hv_smp_prepare_cpus;
>   # endif
>   
>   	/*
> @@ -501,7 +605,7 @@ static void __init ms_hyperv_init_platform(void)
>   	if (!(ms_hyperv.features & HV_ACCESS_TSC_INVARIANT))
>   		mark_tsc_unstable("running on Hyper-V");
>   
> -	if (isolation_type_en_snp()) {
> +	if (hv_isolation_type_en_snp()) {
>   		/*
>   		 * Hyper-V enlightened snp guest boots kernel
>   		 * directly without bootloader and so roms,
> @@ -511,7 +615,7 @@ static void __init ms_hyperv_init_platform(void)
>   		x86_platform.legacy.rtc = 0;
>   		x86_platform.set_wallclock = set_rtc_noop;
>   		x86_platform.get_wallclock = get_rtc_noop;
> -		x86_platform.legacy.reserve_bios_regions = x86_init_noop;
> +		x86_platform.legacy.reserve_bios_regions = 0;
>   		x86_init.resources.probe_roms = x86_init_noop;
>   		x86_init.resources.reserve_resources = x86_init_noop;
>   		x86_init.mpparse.find_smp_config = x86_init_noop;
> diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
> index c1cc3ec36ad5..3d7c67be9f56 100644
> --- a/include/asm-generic/hyperv-tlfs.h
> +++ b/include/asm-generic/hyperv-tlfs.h
> @@ -148,6 +148,7 @@ union hv_reference_tsc_msr {
>   #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST	0x0003
>   #define HVCALL_NOTIFY_LONG_SPIN_WAIT		0x0008
>   #define HVCALL_SEND_IPI				0x000b
> +#define HVCALL_ENABLE_VP_VTL			0x000f
>   #define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX	0x0013
>   #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX	0x0014
>   #define HVCALL_SEND_IPI_EX			0x0015
> @@ -165,6 +166,7 @@ union hv_reference_tsc_msr {
>   #define HVCALL_MAP_DEVICE_INTERRUPT		0x007c
>   #define HVCALL_UNMAP_DEVICE_INTERRUPT		0x007d
>   #define HVCALL_RETARGET_INTERRUPT		0x007e
> +#define HVCALL_START_VIRTUAL_PROCESSOR		0x0099
>   #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af
>   #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0
>   #define HVCALL_MODIFY_SPARSE_GPA_PAGE_HOST_VISIBILITY 0x00db
> @@ -219,6 +221,7 @@ enum HV_GENERIC_SET_FORMAT {
>   #define HV_STATUS_INVALID_PORT_ID		17
>   #define HV_STATUS_INVALID_CONNECTION_ID		18
>   #define HV_STATUS_INSUFFICIENT_BUFFERS		19
> +#define HV_STATUS_TIME_OUT                     0x78
>   
>   /*
>    * The Hyper-V TimeRefCount register and the TSC
> @@ -778,6 +781,22 @@ struct hv_input_unmap_device_interrupt {
>   	struct hv_interrupt_entry interrupt_entry;
>   } __packed;
>   
> +struct hv_enable_vp_vtl_input {
> +	u64 partitionid;
> +	u32 vpindex;
> +	u8 targetvtl;
> +	u8 padding[3];
> +	u8 context[0xe0];
> +} __packed;
> +
> +struct hv_start_virtual_processor_input {
> +	u64 partitionid;
> +	u32 vpindex;
> +	u8 targetvtl;
> +	u8 padding[3];
> +	u8 context[0xe0];
> +} __packed;
> +
>   #define HV_SOURCE_SHADOW_NONE               0x0
>   #define HV_SOURCE_SHADOW_BRIDGE_BUS_RANGE   0x1
>   


* Re: [RFC PATCH V3 09/16] x86/hyperv: SEV-SNP enlightened guest don't support legacy rtc
  2023-01-22  2:45 ` [RFC PATCH V3 09/16] x86/hyperv: SEV-SNP enlightened guest don't support legacy rtc Tianyu Lan
@ 2023-01-31 14:03   ` Wei Liu
  2023-02-02  3:43     ` Tianyu Lan
  0 siblings, 1 reply; 60+ messages in thread
From: Wei Liu @ 2023-01-31 14:03 UTC (permalink / raw)
  To: Tianyu Lan
  Cc: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu,
	linux-kernel, kvm, linux-hyperv, linux-arch, Wei Liu

On Sat, Jan 21, 2023 at 09:45:59PM -0500, Tianyu Lan wrote:
> From: Tianyu Lan <tiala@microsoft.com>
> 
> SEV-SNP enlightened guest doesn't support legacy rtc. Set
> legacy.rtc, x86_platform.set_wallclock and get_wallclock to
> 0 or noop(). Make get/set_rtc_noop() to be public and reuse
> them in the ms_hyperv_init_platform().
> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
>  arch/x86/include/asm/mshyperv.h | 7 ++++++-
>  arch/x86/include/asm/x86_init.h | 2 ++
>  arch/x86/kernel/cpu/mshyperv.c  | 3 +++
>  arch/x86/kernel/x86_init.c      | 4 ++--
>  4 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 1a4af0a4f29a..7266d71d30d6 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -33,6 +33,12 @@ extern bool hv_isolation_type_en_snp(void);
>  
>  extern union hv_ghcb * __percpu *hv_ghcb_pg;
>  
> +/*
> + * Hyper-V puts processor and memory layout info
> + * to this address in SEV-SNP enlightened guest.
> + */
> +#define EN_SEV_SNP_PROCESSOR_INFO_ADDR	0x802000

This hunk should be moved to the previous patch. It is not needed in
this patch.

Thanks,
Wei.


* RE: [RFC PATCH V3 01/16] x86/hyperv: Add sev-snp enlightened guest specific config
  2023-01-22  2:45 ` [RFC PATCH V3 01/16] x86/hyperv: Add sev-snp enlightened guest specific config Tianyu Lan
@ 2023-01-31 17:34   ` Michael Kelley (LINUX)
  2023-02-02  4:01     ` Tianyu Lan
  0 siblings, 1 reply; 60+ messages in thread
From: Michael Kelley (LINUX) @ 2023-01-31 17:34 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <ltykernel@gmail.com> Sent: Saturday, January 21, 2023 6:46 PM
> 
> Introduce static key isolation_type_en_snp for enlightened
> guest check and add some specific options in ms_hyperv_init_
> platform().
> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
>  arch/x86/hyperv/ivm.c           | 10 ++++++++++
>  arch/x86/include/asm/mshyperv.h |  3 +++
>  arch/x86/kernel/cpu/mshyperv.c  | 16 +++++++++++++++-
>  drivers/hv/hv_common.c          |  6 ++++++
>  4 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/hyperv/ivm.c b/arch/x86/hyperv/ivm.c
> index abca9431d068..8c5dd8e4eb1e 100644
> --- a/arch/x86/hyperv/ivm.c
> +++ b/arch/x86/hyperv/ivm.c
> @@ -386,6 +386,16 @@ bool hv_is_isolation_supported(void)
>  }
> 
>  DEFINE_STATIC_KEY_FALSE(isolation_type_snp);
> +DEFINE_STATIC_KEY_FALSE(isolation_type_en_snp);
> +
> +/*
> + * hv_isolation_type_en_snp - Check system runs in the AMD SEV-SNP based
> + * isolation enlightened VM.
> + */
> +bool hv_isolation_type_en_snp(void)
> +{
> +	return static_branch_unlikely(&isolation_type_en_snp);
> +}
> 
>  /*
>   * hv_isolation_type_snp - Check system runs in the AMD SEV-SNP based
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 010768d40155..285df71150e4 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -14,6 +14,7 @@
>  union hv_ghcb;
> 
>  DECLARE_STATIC_KEY_FALSE(isolation_type_snp);
> +DECLARE_STATIC_KEY_FALSE(isolation_type_en_snp);
> 
>  typedef int (*hyperv_fill_flush_list_func)(
>  		struct hv_guest_mapping_flush_list *flush,
> @@ -28,6 +29,8 @@ extern void *hv_hypercall_pg;
> 
>  extern u64 hv_current_partition_id;
> 
> +extern bool hv_isolation_type_en_snp(void);
> +
>  extern union hv_ghcb * __percpu *hv_ghcb_pg;
> 
>  int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages);
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 8f83ceec45dc..ace5901ba0fc 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -273,6 +273,18 @@ static void __init ms_hyperv_init_platform(void)
> 
>  	hv_max_functions_eax = cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS);
> 
> +	/*
> +	 * Add custom configuration for SEV-SNP Enlightened guest
> +	 */
> +	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
> +		ms_hyperv.features |= HV_ACCESS_FREQUENCY_MSRS;
> +		ms_hyperv.misc_features |= HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
> +		ms_hyperv.misc_features &= ~HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
> +		ms_hyperv.hints |= HV_DEPRECATING_AEOI_RECOMMENDED;
> +		ms_hyperv.hints |= HV_X64_APIC_ACCESS_RECOMMENDED;
> +		ms_hyperv.hints |= HV_X64_CLUSTER_IPI_RECOMMENDED;

Two different things are happening in changing the above flags:

1)  Disabling certain features that Hyper-V might offer to a guest, such
as the crash MSRs and Auto EOI.  (In some cases disabling the feature
means removing the flag.  In other cases it means adding the flag.  But
the net result is the same -- other Hyper-V specific code will not use the
feature.)  This category is OK.

2)  Forcing certain features to be treated as enabled.  This category
is somewhat concerning.  Assuming that Hyper-V is accurately indicating
which features are available, it seems better to check that the flags
required by SNP are present, and refuse to boot in SNP mode if not.
Or is this code handling a different problem, where Hyper-V is not
indicating that the feature is available, even though it really is?
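
A minimal userspace sketch of the "check and refuse" alternative, in
case it helps the discussion.  Everything here is hypothetical: the
struct, the MODEL_* bit positions and the function name are stand-ins
for illustration (the real definitions live in hyperv-tlfs.h), and
which flags count as "required" is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPUID-derived fields of ms_hyperv. */
struct hv_feature_model {
	uint32_t features;
	uint32_t misc_features;
	uint32_t hints;
};

/* Illustrative bit positions only, not taken from the TLFS headers. */
#define MODEL_ACCESS_FREQUENCY_MSRS		(1u << 11)
#define MODEL_FREQUENCY_MSRS_AVAILABLE		(1u << 8)
#define MODEL_GUEST_CRASH_MSR_AVAILABLE		(1u << 10)
#define MODEL_DEPRECATING_AEOI_RECOMMENDED	(1u << 9)
#define MODEL_APIC_ACCESS_RECOMMENDED		(1u << 3)
#define MODEL_CLUSTER_IPI_RECOMMENDED		(1u << 10)

/*
 * Category 1: turn off features the SNP guest must not use.
 * Category 2: verify (rather than force) the features it depends on.
 * Returns false if a required feature is missing, so the caller can
 * refuse to continue in SNP mode.
 */
static bool model_snp_check_and_restrict(struct hv_feature_model *f)
{
	const uint32_t need_hints = MODEL_APIC_ACCESS_RECOMMENDED |
				    MODEL_CLUSTER_IPI_RECOMMENDED;

	/* Category 1: disabling is always safe. */
	f->misc_features &= ~MODEL_GUEST_CRASH_MSR_AVAILABLE;
	f->hints |= MODEL_DEPRECATING_AEOI_RECOMMENDED;

	/* Category 2: trust what the hypervisor reported. */
	if (!(f->features & MODEL_ACCESS_FREQUENCY_MSRS))
		return false;
	if (!(f->misc_features & MODEL_FREQUENCY_MSRS_AVAILABLE))
		return false;
	if ((f->hints & need_hints) != need_hints)
		return false;

	return true;
}
```

In the kernel the failure path would presumably log the missing flag
and stop, but that policy is a separate decision.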

> +	}
> +
>  	pr_info("Hyper-V: privilege flags low 0x%x, high 0x%x, hints 0x%x, misc 0x%x\n",
>  		ms_hyperv.features, ms_hyperv.priv_high, ms_hyperv.hints,
>  		ms_hyperv.misc_features);
> @@ -331,7 +343,9 @@ static void __init ms_hyperv_init_platform(void)
>  		pr_info("Hyper-V: Isolation Config: Group A 0x%x, Group B 0x%x\n",
>  			ms_hyperv.isolation_config_a, ms_hyperv.isolation_config_b);
> 
> -		if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP)
> +		if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> +			static_branch_enable(&isolation_type_en_snp);
> +		else if (hv_get_isolation_type() == HV_ISOLATION_TYPE_SNP)
>  			static_branch_enable(&isolation_type_snp);
>  	}
> 
> diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> index 566735f35c28..f788c64de0bd 100644
> --- a/drivers/hv/hv_common.c
> +++ b/drivers/hv/hv_common.c
> @@ -268,6 +268,12 @@ bool __weak hv_isolation_type_snp(void)
>  }
>  EXPORT_SYMBOL_GPL(hv_isolation_type_snp);
> 
> +bool __weak hv_isolation_type_en_snp(void)
> +{
> +	return false;
> +}
> +EXPORT_SYMBOL_GPL(hv_isolation_type_en_snp);
> +
>  void __weak hv_setup_vmbus_handler(void (*handler)(void))
>  {
>  }
> --
> 2.25.1



* RE: [RFC PATCH V3 03/16] x86/hyperv: Set Virtual Trust Level in vmbus init message
  2023-01-22  2:45 ` [RFC PATCH V3 03/16] x86/hyperv: Set Virtual Trust Level in vmbus init message Tianyu Lan
@ 2023-01-31 17:55   ` Michael Kelley (LINUX)
  2023-02-03  3:32     ` Tianyu Lan
  0 siblings, 1 reply; 60+ messages in thread
From: Michael Kelley (LINUX) @ 2023-01-31 17:55 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <ltykernel@gmail.com> Sent: Saturday, January 21, 2023 6:46 PM
> 
> sev-snp guest provides vtl(Virtual Trust Level) and
> get it from hyperv hvcall via HVCALL_GET_VP_REGISTERS.
> Set target vtl in the vmbus init message.

I'm still wondering why this is necessary in an SNP VM, vs.
just assuming VTL 0.

Also, I had several comments on v2 of this patch that don't appear
to have been taken into account.  I strongly think the code should
use the standard helper functions for checking hypercall results.
Some of my other code comments are more nit-picky and could
perhaps be ignored. :-)
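
To illustrate why the helpers matter, here is a small userspace model
of what hv_result() and hv_result_success() do.  The model_* names and
MODEL_* constants are stand-ins written for this note, and the status
layout (result code in the low 16 bits, reps-completed count starting
at bit 32) is stated from my reading of the TLFS, so treat it as an
assumption.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_HV_STATUS_SUCCESS		0
#define MODEL_HV_RESULT_MASK		0xffffull	/* low 16 bits: status */
#define MODEL_HV_REP_COMP_SHIFT		32		/* reps completed */
#define MODEL_HV_VTL_MASK		0xfull

/* Extract only the status code from the 64-bit hypercall return value. */
static inline int model_hv_result(uint64_t status)
{
	return (int)(status & MODEL_HV_RESULT_MASK);
}

static inline bool model_hv_result_success(uint64_t status)
{
	return model_hv_result(status) == MODEL_HV_STATUS_SUCCESS;
}

/*
 * Model of the tail of get_vtl(): keep the full 64-bit status and let
 * the helper decide.  A successful rep hypercall carries a non-zero
 * reps-completed count, so comparing the whole value with 0 would
 * only work by accident (for example via truncation to int).
 */
static inline uint8_t model_vtl_from_result(uint64_t status, uint64_t reg_low)
{
	if (!model_hv_result_success(status))
		return 0;

	return (uint8_t)(reg_low & MODEL_HV_VTL_MASK);
}
```

With that shape, "ret" in get_vtl() would be a u64 and the
success test would be hv_result_success(ret).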

> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
> Change since RFC v2:
>        * Rename get_current_vtl() to get_vtl()
>        * Fix some coding style issues
> ---
>  arch/x86/hyperv/hv_init.c          | 37 ++++++++++++++++++++++++++++++
>  arch/x86/include/asm/hyperv-tlfs.h |  4 ++++
>  drivers/hv/connection.c            |  1 +
>  include/asm-generic/mshyperv.h     |  2 ++
>  include/linux/hyperv.h             |  4 ++--
>  5 files changed, 46 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 24154c1ee12b..9e9757049915 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -384,6 +384,40 @@ static void __init hv_get_partition_id(void)
>  	local_irq_restore(flags);
>  }
> 
> +static u8 __init get_vtl(void)
> +{
> +	u64 control = HV_HYPERCALL_REP_COMP_1 | HVCALL_GET_VP_REGISTERS;
> +	struct hv_get_vp_registers_input *input = NULL;
> +	struct hv_get_vp_registers_output *output = NULL;
> +	u64 vtl = 0;
> +	int ret;
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	input = *(struct hv_get_vp_registers_input **)this_cpu_ptr(hyperv_pcpu_input_arg);
> +	output = (struct hv_get_vp_registers_output *)input;
> +	if (!input || !output) {
> +		local_irq_restore(flags);
> +		goto done;
> +	}
> +
> +	memset(input, 0, sizeof(*input) + sizeof(input->element[0]));
> +	input->header.partitionid = HV_PARTITION_ID_SELF;
> +	input->header.vpindex = HV_VP_INDEX_SELF;
> +	input->header.inputvtl = 0;
> +	input->element[0].name0 = HV_X64_REGISTER_VSM_VP_STATUS;
> +
> +	ret = hv_do_hypercall(control, input, output);
> +	if (ret == 0)
> +		vtl = output->as64.low & HV_X64_VTL_MASK;
> +	else
> +		pr_err("Hyper-V: failed to get VTL!");
> +	local_irq_restore(flags);
> +
> +done:
> +	return vtl;
> +}
> +
>  /*
>   * This function is to be invoked early in the boot sequence after the
>   * hypervisor has been detected.
> @@ -512,6 +546,9 @@ void __init hyperv_init(void)
>  	/* Query the VMs extended capability once, so that it can be cached. */
>  	hv_query_ext_cap(0);
> 
> +	/* Find the VTL */
> +	ms_hyperv.vtl = get_vtl();
> +
>  	return;
> 
>  clean_guest_os_id:
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index db2202d985bd..6dcbb21aac2b 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -36,6 +36,10 @@
>  #define HYPERV_CPUID_MIN			0x40000005
>  #define HYPERV_CPUID_MAX			0x4000ffff
> 
> +/* Support for HVCALL_GET_VP_REGISTERS hvcall */

The above comment isn't really right, in that these definitions
aren't for the hypercall.  They are for the specific synthetic register.

> +#define	HV_X64_REGISTER_VSM_VP_STATUS	0x000D0003
> +#define HV_X64_VTL_MASK			GENMASK(3, 0)

Hyper-V synthetic registers have two different numbering schemes.
For registers that have synthetic MSR equivalents, there's a full list
starting with HV_X64_MSR_GUEST_OS_ID, which defines the MSR
address.  But these registers also have register numbers that are
not the same as the MSR address.  These register numbers
aren't defined anywhere in x86 Linux code because we don't access
them using the register number.   (The register numbers *are*
defined in ARM64 code since ARM64 doesn't have MSRs.)  But this
register is an exception on x86.  There's no MSR equivalent so we
must use a hypercall to fetch the value.

I'd suggest starting a separate list after the definition of
HV_X64_MSR_REFERENCE_TSC and make clear in a comment
about the list that this is a list of register numbers, not MSR addresses.
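
Something along these lines is what I have in mind for the header
layout.  This is only a sketch: the comment wording is mine, the
mask is written out as a literal because GENMASK() is kernel-only,
and the HV_X64_MSR_REFERENCE_TSC value is quoted from memory.

```c
/* Existing list: synthetic MSR addresses (accessed with rdmsr/wrmsr). */
#define HV_X64_MSR_REFERENCE_TSC	0x40000021u

/*
 * Separate list: Hyper-V register numbers, *not* MSR addresses.
 * These have no MSR equivalent on x86 and are read with the
 * HVCALL_GET_VP_REGISTERS hypercall.
 */
#define HV_X64_REGISTER_VSM_VP_STATUS	0x000D0003u
#define HV_X64_VTL_MASK			0xfu	/* same bits as GENMASK(3, 0) */
```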

> +
>  /*
>   * Group D Features.  The bit assignments are custom to each architecture.
>   * On x86/x64 these are HYPERV_CPUID_FEATURES.EDX bits.
> diff --git a/drivers/hv/connection.c b/drivers/hv/connection.c
> index f670cfd2e056..e4c39f4016ad 100644
> --- a/drivers/hv/connection.c
> +++ b/drivers/hv/connection.c
> @@ -98,6 +98,7 @@ int vmbus_negotiate_version(struct vmbus_channel_msginfo
> *msginfo, u32 version)
>  	 */
>  	if (version >= VERSION_WIN10_V5) {
>  		msg->msg_sint = VMBUS_MESSAGE_SINT;
> +		msg->msg_vtl = ms_hyperv.vtl;
>  		vmbus_connection.msg_conn_id = VMBUS_MESSAGE_CONNECTION_ID_4;
>  	} else {
>  		msg->interrupt_page = virt_to_phys(vmbus_connection.int_page);
> diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
> index f2c0856f1797..44e56777fea7 100644
> --- a/include/asm-generic/mshyperv.h
> +++ b/include/asm-generic/mshyperv.h
> @@ -48,6 +48,7 @@ struct ms_hyperv_info {
>  		};
>  	};
>  	u64 shared_gpa_boundary;
> +	u8 vtl;
>  };
>  extern struct ms_hyperv_info ms_hyperv;
> 
> @@ -57,6 +58,7 @@ extern void * __percpu *hyperv_pcpu_output_arg;
>  extern u64 hv_do_hypercall(u64 control, void *inputaddr, void *outputaddr);
>  extern u64 hv_do_fast_hypercall8(u16 control, u64 input8);
>  extern bool hv_isolation_type_snp(void);
> +extern bool hv_isolation_type_en_snp(void);
> 
>  /* Helper functions that provide a consistent pattern for checking Hyper-V hypercall
> status. */
>  static inline int hv_result(u64 status)
> diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
> index 85f7c5a63aa6..65121b21b0af 100644
> --- a/include/linux/hyperv.h
> +++ b/include/linux/hyperv.h
> @@ -665,8 +665,8 @@ struct vmbus_channel_initiate_contact {
>  		u64 interrupt_page;
>  		struct {
>  			u8	msg_sint;
> -			u8	padding1[3];
> -			u32	padding2;
> +			u8	msg_vtl;
> +			u8	reserved[6];
>  		};
>  	};
>  	u64 monitor_page1;
> --
> 2.25.1



* RE: [RFC PATCH V3 06/16] x86/hyperv: decrypt vmbus pages for sev-snp enlightened guest
  2023-01-22  2:45 ` [RFC PATCH V3 06/16] x86/hyperv: decrypt vmbus pages for " Tianyu Lan
@ 2023-01-31 17:58   ` Michael Kelley (LINUX)
  2023-02-03  4:11     ` Tianyu Lan
  0 siblings, 1 reply; 60+ messages in thread
From: Michael Kelley (LINUX) @ 2023-01-31 17:58 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <ltykernel@gmail.com> Sent: Saturday, January 21, 2023 6:46 PM
> 

As I commented on v2 of this patch, the Subject prefix should be
"Drivers: hv: vmbus:"

> Vmbus post msg, synic event and message pages are shared

s/Vmbus/VMBus/

We're trying to be consistent about the capitalization of "VMBus"
in comments and other text. :-)

> with hypervisor and so decrypt these pages in the sev-snp guest.
> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
> Change since RFC V2:
>        * Fix error in the error code path and encrypt
>        	 pages correctly when decryption failure happens.
> ---
>  drivers/hv/hv.c | 33 ++++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
> index 410e6c4e80d3..52edc54c8172 100644
> --- a/drivers/hv/hv.c
> +++ b/drivers/hv/hv.c
> @@ -20,6 +20,7 @@
>  #include <linux/interrupt.h>
>  #include <clocksource/hyperv_timer.h>
>  #include <asm/mshyperv.h>
> +#include <linux/set_memory.h>
>  #include "hyperv_vmbus.h"
> 
>  /* The one and only */
> @@ -117,7 +118,7 @@ int hv_post_message(union hv_connection_id connection_id,
> 
>  int hv_synic_alloc(void)
>  {
> -	int cpu;
> +	int cpu, ret;
>  	struct hv_per_cpu_context *hv_cpu;
> 
>  	/*
> @@ -168,9 +169,39 @@ int hv_synic_alloc(void)
>  			pr_err("Unable to allocate post msg page\n");
>  			goto err;
>  		}
> +
> +		if (hv_isolation_type_en_snp()) {
> +			ret = set_memory_decrypted((unsigned long)
> +				hv_cpu->synic_message_page, 1);
> +			if (ret)
> +				goto err;
> +
> +			ret = set_memory_decrypted((unsigned long)
> +				hv_cpu->synic_event_page, 1);
> +			if (ret)
> +				goto err_decrypt_event_page;
> +
> +			ret = set_memory_decrypted((unsigned long)
> +				hv_cpu->post_msg_page, 1);
> +			if (ret)
> +				goto err_decrypt_msg_page;
> +
> +			memset(hv_cpu->synic_message_page, 0, PAGE_SIZE);
> +			memset(hv_cpu->synic_event_page, 0, PAGE_SIZE);
> +			memset(hv_cpu->post_msg_page, 0, PAGE_SIZE);
> +		}

Having decrypted the pages here in hv_synic_alloc(), shouldn't
there be corresponding re-encryption in hv_synic_free()?

>  	}
> 
>  	return 0;
> +
> +err_decrypt_msg_page:
> +	set_memory_encrypted((unsigned long)
> +		hv_cpu->synic_event_page, 1);
> +
> +err_decrypt_event_page:
> +	set_memory_encrypted((unsigned long)
> +		hv_cpu->synic_message_page, 1);
> +
>  err:
>  	/*
>  	 * Any memory allocations that succeeded will be freed when
> --
> 2.25.1



* RE: [RFC PATCH V3 07/16] drivers: hv: Decrypt percpu hvcall input arg page in sev-snp enlightened guest
  2023-01-22  2:45 ` [RFC PATCH V3 07/16] drivers: hv: Decrypt percpu hvcall input arg page in " Tianyu Lan
@ 2023-01-31 18:02   ` Michael Kelley (LINUX)
  2023-02-03  5:23     ` Tianyu Lan
  0 siblings, 1 reply; 60+ messages in thread
From: Michael Kelley (LINUX) @ 2023-01-31 18:02 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <ltykernel@gmail.com> Sent: Saturday, January 21, 2023 6:46 PM
> 
> Hypervisor needs to access input arg page and guest should decrypt
> the page.
> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
> Change since RFC V2:
> 	* Set inputarg to be zero after kfree()
> 	* Not free mem when fail to encrypt mem in the hv_common_cpu_die().
> ---
>  drivers/hv/hv_common.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
> index f788c64de0bd..205b6380d794 100644
> --- a/drivers/hv/hv_common.c
> +++ b/drivers/hv/hv_common.c
> @@ -21,6 +21,7 @@
>  #include <linux/ptrace.h>
>  #include <linux/slab.h>
>  #include <linux/dma-map-ops.h>
> +#include <linux/set_memory.h>
>  #include <asm/hyperv-tlfs.h>
>  #include <asm/mshyperv.h>
> 
> @@ -125,6 +126,7 @@ int hv_common_cpu_init(unsigned int cpu)
>  	u64 msr_vp_index;
>  	gfp_t flags;
>  	int pgcount = hv_root_partition ? 2 : 1;
> +	int ret;
> 
>  	/* hv_cpu_init() can be called with IRQs disabled from hv_resume() */
>  	flags = irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL;
> @@ -134,6 +136,17 @@ int hv_common_cpu_init(unsigned int cpu)
>  	if (!(*inputarg))
>  		return -ENOMEM;
> 
> +	if (hv_isolation_type_en_snp()) {
> +		ret = set_memory_decrypted((unsigned long)*inputarg, pgcount);

You used "pgcount" here in response to a comment on v2 of the
patch.  But the corresponding re-encryption in hv_common_cpu_die()
uses a fixed value of "1".   The two cases should be consistent.  Either
assert that hv_root_partition will never be true in an SNP VM, in which
case hard coding "1" is OK.  Or properly calculate the number of pages
in both cases so they are consistent.
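
One way to keep the two paths in step is a single helper that both
call.  The following is a userspace model only: model_hv_arg_pgcount()
and the bookkeeping struct are hypothetical names invented for this
sketch, and the counter stands in for the set_memory_decrypted() /
set_memory_encrypted() calls.

```c
#include <stdbool.h>

/* Single source of truth for the size of the per-CPU argument area. */
static inline int model_hv_arg_pgcount(bool root_partition)
{
	return root_partition ? 2 : 1;
}

/* Tracks how many pages are currently shared with the hypervisor. */
struct model_cpu_arg {
	int decrypted_pages;
};

/* Models the decrypt step in hv_common_cpu_init(). */
static inline void model_cpu_init(struct model_cpu_arg *arg, bool root)
{
	arg->decrypted_pages += model_hv_arg_pgcount(root);
}

/* Models the re-encrypt step in hv_common_cpu_die(). */
static inline void model_cpu_die(struct model_cpu_arg *arg, bool root)
{
	arg->decrypted_pages -= model_hv_arg_pgcount(root);
}
```

With a hard-coded "1" on the die path, the root-partition case would
leave one page decrypted when the memory is freed.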

> +		if (ret) {
> +			kfree(*inputarg);
> +			*inputarg = NULL;
> +			return ret;
> +		}
> +
> +		memset(*inputarg, 0x00, PAGE_SIZE);
> +	}
> +
>  	if (hv_root_partition) {
>  		outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
>  		*outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE;
> @@ -168,7 +181,12 @@ int hv_common_cpu_die(unsigned int cpu)
> 
>  	local_irq_restore(flags);
> 
> -	kfree(mem);
> +	if (hv_isolation_type_en_snp()) {
> +		if (!set_memory_encrypted((unsigned long)mem, 1))
> +			kfree(mem);
> +	} else {
> +		kfree(mem);
> +	}
> 
>  	return 0;
>  }
> --
> 2.25.1



* RE: [RFC PATCH V3 08/16] x86/hyperv: Initialize cpu and memory for sev-snp enlightened guest
  2023-01-22  2:45 ` [RFC PATCH V3 08/16] x86/hyperv: Initialize cpu and memory for " Tianyu Lan
@ 2023-01-31 18:20   ` Michael Kelley (LINUX)
  2023-02-03  5:58     ` Tianyu Lan
  0 siblings, 1 reply; 60+ messages in thread
From: Michael Kelley (LINUX) @ 2023-01-31 18:20 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <ltykernel@gmail.com> Sent: Saturday, January 21, 2023 6:46 PM
> 
> Read processor and memory info from specific address which are
> populated by Hyper-V. Initialize smp cpu related ops, pvalidate
> system memory and add it into e820 table.
> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
>  arch/x86/kernel/cpu/mshyperv.c | 85 ++++++++++++++++++++++++++++++++++
>  1 file changed, 85 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index ace5901ba0fc..b1871a7bb4c9 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -32,6 +32,12 @@
>  #include <asm/nmi.h>
>  #include <clocksource/hyperv_timer.h>
>  #include <asm/numa.h>
> +#include <asm/coco.h>
> +#include <asm/io_apic.h>
> +#include <asm/svm.h>
> +#include <asm/sev.h>
> +#include <asm/realmode.h>
> +#include <asm/e820/api.h>
> 
>  /* Is Linux running as the root partition? */
>  bool hv_root_partition;
> @@ -251,6 +257,30 @@ static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
>  }
>  #endif
> 
> +static u32 processor_count;
> +
> +static __init void hv_snp_get_smp_config(unsigned int early)
> +{
> +	if (!early) {
> +		while (num_processors < processor_count) {
> +			early_per_cpu(x86_cpu_to_apicid, num_processors) = num_processors;
> +			early_per_cpu(x86_bios_cpu_apicid, num_processors) = num_processors;
> +			physid_set(num_processors, phys_cpu_present_map);
> +			set_cpu_possible(num_processors, true);
> +			set_cpu_present(num_processors, true);
> +			num_processors++;
> +		}
> +	}
> +}
> +
> +struct memory_map_entry {
> +	u64 starting_gpn;
> +	u64 numpages;
> +	u16 type;
> +	u16 flags;
> +	u32 reserved;
> +};

Am I correct that this structure is defined by Hyper-V?  If so, it seems
like it should go in hyperv-tlfs.h, along with the definition of
EN_SEV_SNP_PROCESSOR_INFO_ADDR (which is also defined by
Hyper-V?)

> +
>  static void __init ms_hyperv_init_platform(void)
>  {
>  	int hv_max_functions_eax;
> @@ -258,6 +288,11 @@ static void __init ms_hyperv_init_platform(void)
>  	int hv_host_info_ebx;
>  	int hv_host_info_ecx;
>  	int hv_host_info_edx;
> +	struct memory_map_entry *entry;
> +	struct e820_entry *e820_entry;
> +	u64 e820_end;
> +	u64 ram_end;
> +	u64 page;
> 
>  #ifdef CONFIG_PARAVIRT
>  	pv_info.name = "Hyper-V";
> @@ -466,6 +501,56 @@ static void __init ms_hyperv_init_platform(void)
>  	if (!(ms_hyperv.features & HV_ACCESS_TSC_INVARIANT))
>  		mark_tsc_unstable("running on Hyper-V");
> 
> +	if (isolation_type_en_snp()) {

The above doesn't compile.  The function name is hv_isolation_type_en_snp().

> +		/*
> +		 * Hyper-V enlightened snp guest boots kernel
> +		 * directly without bootloader and so roms,
> +		 * bios regions and reserve resources are not
> +		 * available. Set these callback to NULL.
> +		 */
> +		x86_platform.legacy.reserve_bios_regions = x86_init_noop;
> +		x86_init.resources.probe_roms = x86_init_noop;
> +		x86_init.resources.reserve_resources = x86_init_noop;
> +		x86_init.mpparse.find_smp_config = x86_init_noop;
> +		x86_init.mpparse.get_smp_config = hv_snp_get_smp_config;
> +
> +		/*
> +		 * Hyper-V SEV-SNP enlightened guest doesn't support ioapic
> +		 * and legacy APIC page read/write. Switch to hv apic here.
> +		 */
> +		disable_ioapic_support();
> +
> +		/* Read processor number and memory layout. */
> +		processor_count = *(u32 *)__va(EN_SEV_SNP_PROCESSOR_INFO_ADDR);
> +		entry = (struct memory_map_entry *)(__va(EN_SEV_SNP_PROCESSOR_INFO_ADDR)
> +				+ sizeof(struct memory_map_entry));
> +
> +		/*
> +		 * E820 table in the memory just describes memory for
> +		 * kernel, ACPI table, cmdline, boot params and ramdisk.
> +		 * Hyper-V popoulates the rest memory layout in the EN_SEV_
> +		 * SNP_PROCESSOR_INFO_ADDR.
> +		 */
> +		for (; entry->numpages != 0; entry++) {
> +			e820_entry = &e820_table->entries[
> +					e820_table->nr_entries - 1];
> +			e820_end = e820_entry->addr + e820_entry->size;
> +			ram_end = (entry->starting_gpn +
> +				   entry->numpages) * PAGE_SIZE;
> +
> +			if (e820_end < entry->starting_gpn * PAGE_SIZE)
> +				e820_end = entry->starting_gpn * PAGE_SIZE;
> +
> +			if (e820_end < ram_end) {
> +				pr_info("Hyper-V: add e820 entry [mem %#018Lx-%#018Lx]\n", e820_end, ram_end - 1);
> +				e820__range_add(e820_end, ram_end - e820_end,
> +						E820_TYPE_RAM);
> +				for (page = e820_end; page < ram_end; page += PAGE_SIZE)
> +					pvalidate((unsigned long)__va(page), RMP_PG_SIZE_4K, true);
> +			}
> +		}
> +	}
> +

For SNP vTOM mode, most of the supporting code is placed in
arch/x86/hyperv/ivm.c, which is built only if CONFIG_HYPERV
is defined.  arch/x86/kernel/cpu/mshyperv.c is built for *any*
flavor of guest (i.e., CONFIG_HYPERVISOR_GUEST).  I'm thinking
all this code should go as a supporting function in ivm.c, to
avoid overloading mshyperv.c.  Take a look at how hv_vtom_init()
is handled in my patch set.

Breaking it out as a separate supporting function might also
help reduce the deep indentation problem a bit. :-)
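
As a rough idea of how the loop body could be factored out, here is a
userspace sketch of just the range computation.  The names
(model_range, model_uncovered_range, MODEL_PAGE_SIZE) are made up for
this example, and the e820 update and pvalidate() calls are left to
the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE	4096ull

/* Half-open physical address range [start, end). */
struct model_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Given the end of the last e820 entry and one memory map entry from
 * Hyper-V, compute the part of that entry not yet described by e820.
 * Returns false when the entry is already fully covered.
 */
static bool model_uncovered_range(uint64_t e820_end, uint64_t starting_gpn,
				  uint64_t numpages, struct model_range *out)
{
	uint64_t ram_start = starting_gpn * MODEL_PAGE_SIZE;
	uint64_t ram_end = (starting_gpn + numpages) * MODEL_PAGE_SIZE;

	/* Skip any hole between the e820 end and this entry. */
	if (e820_end < ram_start)
		e820_end = ram_start;

	if (e820_end >= ram_end)
		return false;

	out->start = e820_end;
	out->end = ram_end;
	return true;
}
```

The caller in ivm.c would then only need one level of nesting: loop
over the entries, and for each returned range add it to e820 and
pvalidate its pages.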

>  	hardlockup_detector_disable();
>  }
> 
> --
> 2.25.1



* RE: [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest
  2023-01-22  2:46 ` [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest Tianyu Lan
  2023-01-23 15:30   ` Tom Lendacky
@ 2023-01-31 18:34   ` Michael Kelley (LINUX)
  2023-02-03  6:10     ` Tianyu Lan
  1 sibling, 1 reply; 60+ messages in thread
From: Michael Kelley (LINUX) @ 2023-01-31 18:34 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

From: Tianyu Lan <ltykernel@gmail.com> Sent: Saturday, January 21, 2023 6:46 PM
> 
> The wakeup_secondary_cpu callback was populated with wakeup_
> cpu_via_vmgexit() which doesn't work for Hyper-V. Override it
> with Hyper-V specific hook which uses HVCALL_START_VIRTUAL_
> PROCESSOR hvcall to start AP with vmsa data structure.
> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
> Change since RFC v2:
>        * Add helper function to initialize segment
>        * Fix some coding style
> ---
>  arch/x86/include/asm/mshyperv.h   |   2 +
>  arch/x86/include/asm/sev.h        |  13 ++++
>  arch/x86/include/asm/svm.h        |  47 +++++++++++++
>  arch/x86/kernel/cpu/mshyperv.c    | 112 ++++++++++++++++++++++++++++--
>  include/asm-generic/hyperv-tlfs.h |  19 +++++
>  5 files changed, 189 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 7266d71d30d6..c69051eec0e1 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -203,6 +203,8 @@ struct irq_domain *hv_create_pci_msi_domain(void);
>  int hv_map_ioapic_interrupt(int ioapic_id, bool level, int vcpu, int vector,
>  		struct hv_interrupt_entry *entry);
>  int hv_unmap_ioapic_interrupt(int ioapic_id, struct hv_interrupt_entry *entry);
> +int hv_set_mem_host_visibility(unsigned long addr, int numpages, bool visible);
> +int hv_snp_boot_ap(int cpu, unsigned long start_ip);
> 
>  #ifdef CONFIG_AMD_MEM_ENCRYPT
>  void hv_ghcb_msr_write(u64 msr, u64 value);
> diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
> index ebc271bb6d8e..e34aaf730220 100644
> --- a/arch/x86/include/asm/sev.h
> +++ b/arch/x86/include/asm/sev.h
> @@ -86,6 +86,19 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
> 
>  #define RMPADJUST_VMSA_PAGE_BIT		BIT(16)
> 
> +union sev_rmp_adjust {
> +	u64 as_uint64;
> +	struct {
> +		unsigned long target_vmpl : 8;
> +		unsigned long enable_read : 1;
> +		unsigned long enable_write : 1;
> +		unsigned long enable_user_execute : 1;
> +		unsigned long enable_kernel_execute : 1;
> +		unsigned long reserved1 : 4;
> +		unsigned long vmsa : 1;
> +	};
> +};
> +
>  /* SNP Guest message request */
>  struct snp_req_data {
>  	unsigned long req_gpa;
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index cb1ee53ad3b1..f8b321a11ee4 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -336,6 +336,53 @@ struct vmcb_save_area {
>  	u64 last_excp_to;
>  	u8 reserved_0x298[72];
>  	u32 spec_ctrl;		/* Guest version of SPEC_CTRL at 0x2E0 */
> +	u8 reserved_7b[4];
> +	u32 pkru;
> +	u8 reserved_7a[20];
> +	u64 reserved_8;		/* rax already available at 0x01f8 */
> +	u64 rcx;
> +	u64 rdx;
> +	u64 rbx;
> +	u64 reserved_9;		/* rsp already available at 0x01d8 */
> +	u64 rbp;
> +	u64 rsi;
> +	u64 rdi;
> +	u64 r8;
> +	u64 r9;
> +	u64 r10;
> +	u64 r11;
> +	u64 r12;
> +	u64 r13;
> +	u64 r14;
> +	u64 r15;
> +	u8 reserved_10[16];
> +	u64 sw_exit_code;
> +	u64 sw_exit_info_1;
> +	u64 sw_exit_info_2;
> +	u64 sw_scratch;
> +	union {
> +		u64 sev_features;
> +		struct {
> +			u64 sev_feature_snp			: 1;
> +			u64 sev_feature_vtom			: 1;
> +			u64 sev_feature_reflectvc		: 1;
> +			u64 sev_feature_restrict_injection	: 1;
> +			u64 sev_feature_alternate_injection	: 1;
> +			u64 sev_feature_full_debug		: 1;
> +			u64 sev_feature_reserved1		: 1;
> +			u64 sev_feature_snpbtb_isolation	: 1;
> +			u64 sev_feature_resrved2		: 56;
> +		};
> +	};
> +	u64 vintr_ctrl;
> +	u64 guest_error_code;
> +	u64 virtual_tom;
> +	u64 tlb_id;
> +	u64 pcpu_id;
> +	u64 event_inject;
> +	u64 xcr0;
> +	u8 valid_bitmap[16];
> +	u64 x87_state_gpa;
>  } __packed;
> 
>  /* Save area definition for SEV-ES and SEV-SNP guests */
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 197c8f2ec4eb..9d547751a1a7 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -39,6 +39,13 @@
>  #include <asm/realmode.h>
>  #include <asm/e820/api.h>
> 
> +/*
> + * DEFAULT INIT GPAT and SEGMENT LIMIT value in struct VMSA
> + * to start AP in enlightened SEV guest.
> + */
> +#define HV_AP_INIT_GPAT_DEFAULT		0x0007040600070406ULL
> +#define HV_AP_SEGMENT_LIMIT		0xffffffff

If these values are defined by Hyper-V, they should probably go in
hyperv-tlfs.h.
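
For reference, the HV_AP_INIT_GPAT_DEFAULT value in the hunk above is the
same as the architectural reset value of the x86 PAT MSR, so it can be read
one byte per entry. The following is a minimal user-space sketch, not kernel
code; pat_entry() and pat_type_name() are illustrative names that do not
appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Value copied from the quoted patch. */
#define HV_AP_INIT_GPAT_DEFAULT 0x0007040600070406ULL

/* Return PAT entry 'index' (0..7); each entry occupies one byte of the MSR. */
static uint8_t pat_entry(uint64_t pat, unsigned int index)
{
	assert(index < 8);
	/* Only the low three bits of each byte encode the memory type. */
	return (uint8_t)((pat >> (index * 8)) & 0x07);
}

/* Map an architectural PAT memory-type encoding to its usual name. */
static const char *pat_type_name(uint8_t type)
{
	switch (type) {
	case 0x00: return "UC";
	case 0x01: return "WC";
	case 0x04: return "WT";
	case 0x05: return "WP";
	case 0x06: return "WB";
	case 0x07: return "UC-";
	default:   return "reserved";
	}
}
```

Under this decoding, entries 0 to 3 read WB, WT, UC-, UC, and entries 4 to 7
repeat the same pattern.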

> +
>  /* Is Linux running as the root partition? */
>  bool hv_root_partition;
>  struct ms_hyperv_info ms_hyperv;
> @@ -230,6 +237,94 @@ static void __init hv_smp_prepare_boot_cpu(void)
>  #endif
>  }
> 
> +static u8 ap_start_input_arg[PAGE_SIZE] __bss_decrypted __aligned(PAGE_SIZE);
> +static u8 ap_start_stack[PAGE_SIZE] __aligned(PAGE_SIZE);
> +
> +#define hv_populate_vmcb_seg(seg, gdtr_base)			\
> +do {								\
> +	if (seg.selector) {					\
> +		seg.base = 0;					\
> +		seg.limit = HV_AP_SEGMENT_LIMIT;		\
> +		seg.attrib = *(u16 *)(gdtr_base + seg.selector + 5);	\
> +		seg.attrib = (seg.attrib & 0xFF) | ((seg.attrib >> 4) & 0xF00); \
> +	}							\
> +} while (0)							\
> +
> +int hv_snp_boot_ap(int cpu, unsigned long start_ip)
> +{
> +	struct vmcb_save_area *vmsa = (struct vmcb_save_area *)
> +		__get_free_page(GFP_KERNEL | __GFP_ZERO);
> +	struct desc_ptr gdtr;
> +	u64 ret, retry = 5;
> +	struct hv_start_virtual_processor_input *start_vp_input;
> +	union sev_rmp_adjust rmp_adjust;
> +	unsigned long flags;
> +
> +	native_store_gdt(&gdtr);
> +
> +	vmsa->gdtr.base = gdtr.address;
> +	vmsa->gdtr.limit = gdtr.size;
> +
> +	asm volatile("movl %%es, %%eax;" : "=a" (vmsa->es.selector));
> +	hv_populate_vmcb_seg(vmsa->es, vmsa->gdtr.base);
> +
> +	asm volatile("movl %%cs, %%eax;" : "=a" (vmsa->cs.selector));
> +	hv_populate_vmcb_seg(vmsa->cs, vmsa->gdtr.base);
> +
> +	asm volatile("movl %%ss, %%eax;" : "=a" (vmsa->ss.selector));
> +	hv_populate_vmcb_seg(vmsa->ss, vmsa->gdtr.base);
> +
> +	asm volatile("movl %%ds, %%eax;" : "=a" (vmsa->ds.selector));
> +	hv_populate_vmcb_seg(vmsa->ds, vmsa->gdtr.base);
> +
> +	vmsa->efer = native_read_msr(MSR_EFER);
> +
> +	asm volatile("movq %%cr4, %%rax;" : "=a" (vmsa->cr4));
> +	asm volatile("movq %%cr3, %%rax;" : "=a" (vmsa->cr3));
> +	asm volatile("movq %%cr0, %%rax;" : "=a" (vmsa->cr0));
> +
> +	vmsa->xcr0 = 1;
> +	vmsa->g_pat = HV_AP_INIT_GPAT_DEFAULT;
> +	vmsa->rip = (u64)secondary_startup_64_no_verify;
> +	vmsa->rsp = (u64)&ap_start_stack[PAGE_SIZE];
> +
> +	vmsa->sev_feature_snp = 1;
> +	vmsa->sev_feature_restrict_injection = 1;
> +
> +	rmp_adjust.as_uint64 = 0;
> +	rmp_adjust.target_vmpl = 1;
> +	rmp_adjust.vmsa = 1;
> +	ret = rmpadjust((unsigned long)vmsa, RMP_PG_SIZE_4K,
> +			rmp_adjust.as_uint64);
> +	if (ret != 0) {
> +		pr_err("RMPADJUST(%llx) failed: %llx\n", (u64)vmsa, ret);
> +		return ret;
> +	}
> +
> +	local_irq_save(flags);
> +	start_vp_input =
> +		(struct hv_start_virtual_processor_input *)ap_start_input_arg;
> +	memset(start_vp_input, 0, sizeof(*start_vp_input));
> +	start_vp_input->partitionid = -1;
> +	start_vp_input->vpindex = cpu;
> +	start_vp_input->targetvtl = ms_hyperv.vtl;
> +	*(u64 *)&start_vp_input->context[0] = __pa(vmsa) | 1;
> +
> +	do {
> +		ret = hv_do_hypercall(HVCALL_START_VIRTUAL_PROCESSOR,
> +				      start_vp_input, NULL);
> +	} while (hv_result(ret) == HV_STATUS_TIME_OUT && retry--);
> +
> +	if (!hv_result_success(ret)) {
> +		pr_err("HvCallStartVirtualProcessor failed: %llx\n", ret);
> +		goto done;
> +	}
> +
> +done:
> +	local_irq_restore(flags);
> +	return ret;
> +}
> +

Like a comment in an earlier patch, I'm wondering if the bulk of
this code could move to ivm.c, to avoid overloading mshyperv.c.
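
Wherever the code ends up, the attribute arithmetic in the quoted
hv_populate_vmcb_seg() macro may be easier to review in isolation. The
sketch below is a user-space model of that arithmetic only; the function
name vmcb_seg_attrib() is made up for illustration, and like the macro it
assumes the selector has zero RPL and TI bits so that it can be used
directly as a byte offset into the GDT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pack the attribute bits of an 8-byte GDT descriptor into a 12-bit
 * value: descriptor byte 5 (access byte) lands in bits 0..7 and the
 * high nibble of byte 6 (G, D/B, L, AVL) lands in bits 8..11.
 */
static uint16_t vmcb_seg_attrib(const uint8_t *gdt, uint16_t selector)
{
	const uint8_t *desc;
	uint16_t raw;

	assert(gdt != NULL);
	assert((selector & 0x7) == 0);	/* assumption: RPL = 0, TI = 0 */

	desc = gdt + selector;
	/* Same 16-bit little-endian load as *(u16 *)(base + sel + 5). */
	raw = (uint16_t)(desc[5] | (desc[6] << 8));

	/* Drop the limit[19:16] nibble that sits between the two fields. */
	return (uint16_t)((raw & 0xFF) | ((raw >> 4) & 0xF00));
}
```

For example, a 64-bit code descriptor of 0x00af9b000000ffff placed at
selector 0x10 packs to 0xa9b with this function.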

>  static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
>  {
>  #ifdef CONFIG_X86_64
> @@ -239,6 +334,16 @@ static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
> 
>  	native_smp_prepare_cpus(max_cpus);
> 
> +	/*
> +	 *  Override wakeup_secondary_cpu callback for SEV-SNP
> +	 *  enlightened guest.
> +	 */
> +	if (hv_isolation_type_en_snp())
> +		apic->wakeup_secondary_cpu = hv_snp_boot_ap;
> +
> +	if (!hv_root_partition)
> +		return;
> +
>  #ifdef CONFIG_X86_64
>  	for_each_present_cpu(i) {
>  		if (i == 0)
> @@ -475,8 +580,7 @@ static void __init ms_hyperv_init_platform(void)
> 
>  # ifdef CONFIG_SMP
>  	smp_ops.smp_prepare_boot_cpu = hv_smp_prepare_boot_cpu;
> -	if (hv_root_partition)
> -		smp_ops.smp_prepare_cpus = hv_smp_prepare_cpus;
> +	smp_ops.smp_prepare_cpus = hv_smp_prepare_cpus;
>  # endif
> 
>  	/*
> @@ -501,7 +605,7 @@ static void __init ms_hyperv_init_platform(void)
>  	if (!(ms_hyperv.features & HV_ACCESS_TSC_INVARIANT))
>  		mark_tsc_unstable("running on Hyper-V");
> 
> -	if (isolation_type_en_snp()) {
> +	if (hv_isolation_type_en_snp()) {

Also a bug fix to an earlier patch in this series.

>  		/*
>  		 * Hyper-V enlightened snp guest boots kernel
>  		 * directly without bootloader and so roms,
> @@ -511,7 +615,7 @@ static void __init ms_hyperv_init_platform(void)
>  		x86_platform.legacy.rtc = 0;
>  		x86_platform.set_wallclock = set_rtc_noop;
>  		x86_platform.get_wallclock = get_rtc_noop;
> -		x86_platform.legacy.reserve_bios_regions = x86_init_noop;
> +		x86_platform.legacy.reserve_bios_regions = 0;

This looks like a bug fix to Patch 8 of the series.  It should be fixed
in patch 8.

>  		x86_init.resources.probe_roms = x86_init_noop;
>  		x86_init.resources.reserve_resources = x86_init_noop;
>  		x86_init.mpparse.find_smp_config = x86_init_noop;
> diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
> index c1cc3ec36ad5..3d7c67be9f56 100644
> --- a/include/asm-generic/hyperv-tlfs.h
> +++ b/include/asm-generic/hyperv-tlfs.h
> @@ -148,6 +148,7 @@ union hv_reference_tsc_msr {
>  #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST	0x0003
>  #define HVCALL_NOTIFY_LONG_SPIN_WAIT		0x0008
>  #define HVCALL_SEND_IPI				0x000b
> +#define HVCALL_ENABLE_VP_VTL			0x000f
>  #define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX	0x0013
>  #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX	0x0014
>  #define HVCALL_SEND_IPI_EX			0x0015
> @@ -165,6 +166,7 @@ union hv_reference_tsc_msr {
>  #define HVCALL_MAP_DEVICE_INTERRUPT		0x007c
>  #define HVCALL_UNMAP_DEVICE_INTERRUPT		0x007d
>  #define HVCALL_RETARGET_INTERRUPT		0x007e
> +#define HVCALL_START_VIRTUAL_PROCESSOR		0x0099
>  #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af
>  #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0
>  #define HVCALL_MODIFY_SPARSE_GPA_PAGE_HOST_VISIBILITY 0x00db
> @@ -219,6 +221,7 @@ enum HV_GENERIC_SET_FORMAT {
>  #define HV_STATUS_INVALID_PORT_ID		17
>  #define HV_STATUS_INVALID_CONNECTION_ID		18
>  #define HV_STATUS_INSUFFICIENT_BUFFERS		19
> +#define HV_STATUS_TIME_OUT                     0x78
> 
>  /*
>   * The Hyper-V TimeRefCount register and the TSC
> @@ -778,6 +781,22 @@ struct hv_input_unmap_device_interrupt {
>  	struct hv_interrupt_entry interrupt_entry;
>  } __packed;
> 
> +struct hv_enable_vp_vtl_input {
> +	u64 partitionid;
> +	u32 vpindex;
> +	u8 targetvtl;
> +	u8 padding[3];
> +	u8 context[0xe0];
> +} __packed;
> +
> +struct hv_start_virtual_processor_input {
> +	u64 partitionid;
> +	u32 vpindex;
> +	u8 targetvtl;
> +	u8 padding[3];
> +	u8 context[0xe0];
> +} __packed;
> +
>  #define HV_SOURCE_SHADOW_NONE               0x0
>  #define HV_SOURCE_SHADOW_BRIDGE_BUS_RANGE   0x1
> 
> --
> 2.25.1


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 09/16] x86/hyperv: SEV-SNP enlightened guest don't support legacy rtc
  2023-01-31 14:03   ` Wei Liu
@ 2023-02-02  3:43     ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-02-02  3:43 UTC (permalink / raw)
  To: Wei Liu
  Cc: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu,
	linux-kernel, kvm, linux-hyperv, linux-arch

On 1/31/2023 10:03 PM, Wei Liu wrote:
>> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
>> index 1a4af0a4f29a..7266d71d30d6 100644
>> --- a/arch/x86/include/asm/mshyperv.h
>> +++ b/arch/x86/include/asm/mshyperv.h
>> @@ -33,6 +33,12 @@ extern bool hv_isolation_type_en_snp(void);
>>   
>>   extern union hv_ghcb * __percpu *hv_ghcb_pg;
>>   
>> +/*
>> + * Hyper-V puts processor and memory layout info
>> + * to this address in SEV-SNP enlightened guest.
>> + */
>> +#define EN_SEV_SNP_PROCESSOR_INFO_ADDR	0x802000
> This hunk should be moved to the previous patch. It is not needed in
> this patch.

Nice catch. Sorry for the noise. Will fix in the next version.

Thanks.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 01/16] x86/hyperv: Add sev-snp enlightened guest specific config
  2023-01-31 17:34   ` Michael Kelley (LINUX)
@ 2023-02-02  4:01     ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-02-02  4:01 UTC (permalink / raw)
  To: Michael Kelley (LINUX),
	luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 2/1/2023 1:34 AM, Michael Kelley (LINUX) wrote:
>> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
>> index 8f83ceec45dc..ace5901ba0fc 100644
>> --- a/arch/x86/kernel/cpu/mshyperv.c
>> +++ b/arch/x86/kernel/cpu/mshyperv.c
>> @@ -273,6 +273,18 @@ static void __init ms_hyperv_init_platform(void)
>>
>>   	hv_max_functions_eax = cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS);
>>
>> +	/*
>> +	 * Add custom configuration for SEV-SNP Enlightened guest
>> +	 */
>> +	if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) {
>> +		ms_hyperv.features |= HV_ACCESS_FREQUENCY_MSRS;
>> +		ms_hyperv.misc_features |= HV_FEATURE_FREQUENCY_MSRS_AVAILABLE;
>> +		ms_hyperv.misc_features &= ~HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE;
>> +		ms_hyperv.hints |= HV_DEPRECATING_AEOI_RECOMMENDED;
>> +		ms_hyperv.hints |= HV_X64_APIC_ACCESS_RECOMMENDED;
>> +		ms_hyperv.hints |= HV_X64_CLUSTER_IPI_RECOMMENDED;
> Two different things are happening in changing the above flags:
> 
> 1)  Disabling certain feature that Hyper-V might offer to a guest, such
> as the crash MSRs and Auto EOI.  (In some cases disabling the feature
> means removing the flag.  In other cases in means adding the flag.  But
> the net result is same -- other Hyper-V specific code will not use the
> feature.)  This category is OK.
> 
> 2)  Forcing certain features to be treated as enabled.  This category
> is somewhat concerning.  Assuming that Hyper-V is accurately indicating
> which features are available, it seems better to check that the flags
> required by SNP are present, and refuse to boot in SNP mode if not.
> Or is this code handling a different problem, where Hyper-V is not
> indicating that the feature is available, even though it really is?
> 

Agree. The CPUID emulation in an SEV-SNP guest may be controlled by the
CPUID table, which is passed to the kernel by the EFI bootloader or the
hypervisor. In the Hyper-V case, the CPUID table is passed by Hyper-V
directly and is built when the guest image is created. To avoid confusion
here, I will try hiding the change in the CPUID table and double-check
whether these features are enabled or disabled on different machine or
VM types.

Thanks.
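
The "check and refuse" alternative raised in the review could look roughly
like the sketch below. This is a user-space model under stated assumptions:
the EX_* bit positions are placeholders rather than the real hyperv-tlfs.h
values, and struct ex_hv_info and ex_snp_required_features_present() are
hypothetical names, not existing kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit positions for illustration only. */
#define EX_ACCESS_FREQUENCY_MSRS		(1u << 0)
#define EX_FEATURE_FREQUENCY_MSRS_AVAILABLE	(1u << 1)

/* Hypothetical stand-in for the relevant ms_hyperv fields. */
struct ex_hv_info {
	uint32_t features;
	uint32_t misc_features;
};

/*
 * Return true only if the hypervisor already advertises every feature
 * the SNP path depends on, instead of forcing the flags on.
 */
static bool ex_snp_required_features_present(const struct ex_hv_info *info)
{
	assert(info != NULL);

	return (info->features & EX_ACCESS_FREQUENCY_MSRS) &&
	       (info->misc_features & EX_FEATURE_FREQUENCY_MSRS_AVAILABLE);
}
```

A caller would test the result early in platform init and bail out of SNP
mode when it is false, rather than OR-ing the bits into the feature words.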


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv
  2023-01-22  2:45 [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
                   ` (15 preceding siblings ...)
  2023-01-22  2:46 ` [RFC PATCH V3 16/16] x86/sev: Fix interrupt exit code paths " Tianyu Lan
@ 2023-02-02 23:00 ` Zhi Wang
  2023-02-03  4:04   ` Michael Kelley (LINUX)
  2023-02-09 11:36 ` Gupta, Pankaj
  17 siblings, 1 reply; 60+ messages in thread
From: Zhi Wang @ 2023-02-02 23:00 UTC (permalink / raw)
  To: Tianyu Lan
  Cc: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu,
	linux-kernel, kvm, linux-hyperv, linux-arch

On Sat, 21 Jan 2023 21:45:50 -0500
Tianyu Lan <ltykernel@gmail.com> wrote:

1) I am wondering whether this is a good time to organize a common code
path for enlightened VMs on hyper-v.

Wouldn't it be better to have a common flag for enlightened VM? 
Like bool hv_isolation_type_enlightened()

Many of the changes, such as decrypting the post msg page, are also
required in the enlightened TDX guest; they are not AMD-specific.

Then in the "TDX guest on hyper-V" patch set, Dexuan can save some LOCs instead
of ending up with if (hv_isolation_type_en_snp() ||
hv_isolation_type_en_tdx())...
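
A common helper along these lines might be sketched as follows. This is a
stand-alone model, not a proposal for the actual kernel interface: the enum
and the ex_* functions are invented for illustration, and only the helper
name mirrors the hv_isolation_type_enlightened() idea above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical isolation types for the model. */
enum ex_isolation {
	EX_ISOLATION_NONE,
	EX_ISOLATION_VBS,
	EX_ISOLATION_EN_SNP,
	EX_ISOLATION_EN_TDX,
	EX_ISOLATION_MAX
};

static bool ex_isolation_type_en_snp(enum ex_isolation t)
{
	return t == EX_ISOLATION_EN_SNP;
}

static bool ex_isolation_type_en_tdx(enum ex_isolation t)
{
	return t == EX_ISOLATION_EN_TDX;
}

/* Single predicate that callers can use instead of OR-ing both checks. */
static bool ex_isolation_type_enlightened(enum ex_isolation t)
{
	assert(t >= EX_ISOLATION_NONE && t < EX_ISOLATION_MAX);

	return ex_isolation_type_en_snp(t) || ex_isolation_type_en_tdx(t);
}
```

Call sites that are not vendor-specific would then test the single
predicate, and vendor-specific paths would keep using the narrower checks.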

2) It seems the AMD SEV-SNP enlightened guest on hyper-V is implemented as
CC_VENDOR_AMD, while TDX enlightened guest is still implemented as
CC_VENDOR_HYPERV. I am curious about the reason.

> From: Tianyu Lan <tiala@microsoft.com>
> 
> This patchset is to add AMD sev-snp enlightened guest
> support on hyperv. Hyperv uses Linux direct boot mode
> to boot up Linux kernel and so it needs to pvalidate
> system memory by itself.
> 
> In hyperv case, there is no boot loader and so cc blob
> is prepared by hypervisor. In this series, hypervisor
> set the cc blob address directly into boot parameter
> of Linux kernel. If the magic number on cc blob address
> is valid, kernel will read cc blob.
> 
> Shared memory between guests and hypervisor should be
> decrypted and zero memory after decrypt memory. The data
> in the target address. It maybe smearedto avoid smearing
> data.
> 
> Introduce #HV exception support in AMD sev snp code and
> #HV handler.
> 
> Change since v2:
>        - Remove validate kernel memory code at boot stage
>        - Split #HV page patch into two parts
>        - Remove HV-APIC change due to enable x2apic from
>        	 host side
>        - Rework vmbus code to handle error of decrypt page
>        - Spilt memory and cpu initialization patch. 
> 
> Change since v1:
>        - Remove boot param changes for cc blob address and
>        use setup head to pass cc blob info
>        - Remove unnessary WARN and BUG check
>        - Add system vector table map in the #HV exception
>        - Fix interrupt exit issue when use #HV exception
> 
> Ashish Kalra (2):
>   x86/sev: optimize system vector processing invoked from #HV exception
>   x86/sev: Fix interrupt exit code paths from #HV exception
> 
> Tianyu Lan (14):
>   x86/hyperv: Add sev-snp enlightened guest specific config
>   x86/hyperv: Decrypt hv vp assist page in sev-snp enlightened guest
>   x86/hyperv: Set Virtual Trust Level in vmbus init message
>   x86/hyperv: Use vmmcall to implement Hyper-V hypercall in sev-snp
>     enlightened guest
>   clocksource/drivers/hyper-v: decrypt hyperv tsc page in sev-snp
>     enlightened guest
>   x86/hyperv: decrypt vmbus pages for sev-snp enlightened guest
>   drivers: hv: Decrypt percpu hvcall input arg page in sev-snp
>     enlightened guest
>   x86/hyperv: Initialize cpu and memory for sev-snp enlightened guest
>   x86/hyperv: SEV-SNP enlightened guest don't support legacy rtc
>   x86/hyperv: Add smp support for sev-snp guest
>   x86/hyperv: Add hyperv-specific hadling for VMMCALL under SEV-ES
>   x86/sev: Add a #HV exception handler
>   x86/sev: Add Check of #HV event in path
>   x86/sev: Initialize #HV doorbell and handle interrupt requests
> 
>  arch/x86/entry/entry_64.S             |  82 ++++++
>  arch/x86/hyperv/hv_init.c             |  43 +++
>  arch/x86/hyperv/ivm.c                 |  10 +
>  arch/x86/include/asm/cpu_entry_area.h |   6 +
>  arch/x86/include/asm/hyperv-tlfs.h    |   4 +
>  arch/x86/include/asm/idtentry.h       | 105 ++++++-
>  arch/x86/include/asm/irqflags.h       |  10 +
>  arch/x86/include/asm/mem_encrypt.h    |   2 +
>  arch/x86/include/asm/mshyperv.h       |  56 +++-
>  arch/x86/include/asm/msr-index.h      |   6 +
>  arch/x86/include/asm/page_64_types.h  |   1 +
>  arch/x86/include/asm/sev.h            |  13 +
>  arch/x86/include/asm/svm.h            |  59 +++-
>  arch/x86/include/asm/trapnr.h         |   1 +
>  arch/x86/include/asm/traps.h          |   1 +
>  arch/x86/include/asm/x86_init.h       |   2 +
>  arch/x86/include/uapi/asm/svm.h       |   4 +
>  arch/x86/kernel/cpu/common.c          |   1 +
>  arch/x86/kernel/cpu/mshyperv.c        | 228 ++++++++++++++-
>  arch/x86/kernel/dumpstack_64.c        |   9 +-
>  arch/x86/kernel/idt.c                 |   1 +
>  arch/x86/kernel/sev.c                 | 395 ++++++++++++++++++++++----
>  arch/x86/kernel/traps.c               |  42 +++
>  arch/x86/kernel/vmlinux.lds.S         |   7 +
>  arch/x86/kernel/x86_init.c            |   4 +-
>  arch/x86/mm/cpu_entry_area.c          |   2 +
>  drivers/clocksource/hyperv_timer.c    |   2 +-
>  drivers/hv/connection.c               |   1 +
>  drivers/hv/hv.c                       |  33 ++-
>  drivers/hv/hv_common.c                |  26 +-
>  include/asm-generic/hyperv-tlfs.h     |  19 ++
>  include/asm-generic/mshyperv.h        |   2 +
>  include/linux/hyperv.h                |   4 +-
>  33 files changed, 1102 insertions(+), 79 deletions(-)
> 


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 16/16] x86/sev: Fix interrupt exit code paths from #HV exception
  2023-01-22  2:46 ` [RFC PATCH V3 16/16] x86/sev: Fix interrupt exit code paths " Tianyu Lan
@ 2023-02-02 23:20   ` Zhi Wang
  2023-02-08 23:53     ` Kalra, Ashish
  2023-02-21 16:44   ` Gupta, Pankaj
  1 sibling, 1 reply; 60+ messages in thread
From: Zhi Wang @ 2023-02-02 23:20 UTC (permalink / raw)
  To: Tianyu Lan
  Cc: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu,
	linux-kernel, kvm, linux-hyperv, linux-arch

On Sat, 21 Jan 2023 21:46:06 -0500
Tianyu Lan <ltykernel@gmail.com> wrote:

> From: Ashish Kalra <ashish.kalra@amd.com>
> 
> Add checks in interrupt exit code paths in case of returns
> to user mode to check if currently executing the #HV handler
> then don't follow the irqentry_exit_to_user_mode path as
> that can potentially cause the #HV handler to be
> preempted and rescheduled on another CPU. Rescheduled #HV
> handler on another cpu will cause interrupts to be handled
> on a different cpu than the injected one, causing
> invalid EOIs and missed/lost guest interrupts and
> corresponding hangs and/or per-cpu IRQs handled on
> non-intended cpu.
> 

Why doesn't this problem happen in the #VC handler? The #VC handler doesn't
have this special handling.
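
To make the question concrete, the decision taken by the proposed
irqentry_exit_hv_cond() can be modelled in user space as below. This is only
a sketch of the control flow described in the commit message; struct
ex_cpu_state, enum ex_exit_path and ex_irqentry_exit_hv_cond() are
illustrative names, and the real function calls irqentry_exit() rather than
returning an enum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the per-CPU sev_snp_runtime_data flag in the patch. */
struct ex_cpu_state {
	bool hv_handling_events;
};

enum ex_exit_path {
	EX_EXIT_SKIP_USER_MODE_WORK,	/* return without irqentry_exit() */
	EX_EXIT_NORMAL			/* follow the usual irqentry_exit() path */
};

/*
 * Skip the exit-to-user-mode work only when both conditions hold:
 * the interrupted context was user mode and this CPU is still inside
 * the #HV handler.
 */
static enum ex_exit_path
ex_irqentry_exit_hv_cond(const struct ex_cpu_state *cpu, bool returning_to_user)
{
	assert(cpu != NULL);

	if (returning_to_user && cpu->hv_handling_events)
		return EX_EXIT_SKIP_USER_MODE_WORK;

	return EX_EXIT_NORMAL;
}
```

In this model the early return is taken in exactly one of the four
combinations, which is the case the commit message says could otherwise let
the handler be rescheduled on another CPU.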

> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>  arch/x86/include/asm/idtentry.h | 66 +++++++++++++++++++++++++++++++++
>  arch/x86/kernel/sev.c           | 30 +++++++++++++++
>  2 files changed, 96 insertions(+)
> 
> diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
> index 652fea10d377..45b47132be7c 100644
> --- a/arch/x86/include/asm/idtentry.h
> +++ b/arch/x86/include/asm/idtentry.h
> @@ -13,6 +13,10 @@
>  
>  #include <asm/irq_stack.h>
>  
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +noinstr void irqentry_exit_hv_cond(struct pt_regs *regs, irqentry_state_t state);
> +#endif
> +
>  /**
>   * DECLARE_IDTENTRY - Declare functions for simple IDT entry points
>   *		      No error code pushed by hardware
> @@ -176,6 +180,7 @@ __visible noinstr void func(struct pt_regs *regs, unsigned long error_code)
>  #define DECLARE_IDTENTRY_IRQ(vector, func)				\
>  	DECLARE_IDTENTRY_ERRORCODE(vector, func)
>  
> +#ifndef CONFIG_AMD_MEM_ENCRYPT
>  /**
>   * DEFINE_IDTENTRY_IRQ - Emit code for device interrupt IDT entry points
>   * @func:	Function name of the entry point
> @@ -205,6 +210,26 @@ __visible noinstr void func(struct pt_regs *regs,			\
>  }									\
>  									\
>  static noinline void __##func(struct pt_regs *regs, u32 vector)
> +#else
> +
> +#define DEFINE_IDTENTRY_IRQ(func)					\
> +static void __##func(struct pt_regs *regs, u32 vector);		\
> +									\
> +__visible noinstr void func(struct pt_regs *regs,			\
> +			    unsigned long error_code)			\
> +{									\
> +	irqentry_state_t state = irqentry_enter(regs);			\
> +	u32 vector = (u32)(u8)error_code;				\
> +									\
> +	instrumentation_begin();					\
> +	kvm_set_cpu_l1tf_flush_l1d();					\
> +	run_irq_on_irqstack_cond(__##func, regs, vector);		\
> +	instrumentation_end();						\
> +	irqentry_exit_hv_cond(regs, state);				\
> +}									\
> +									\
> +static noinline void __##func(struct pt_regs *regs, u32 vector)
> +#endif
>  
>  /**
>   * DECLARE_IDTENTRY_SYSVEC - Declare functions for system vector entry points
> @@ -221,6 +246,7 @@ static noinline void __##func(struct pt_regs *regs, u32 vector)
>  #define DECLARE_IDTENTRY_SYSVEC(vector, func)				\
>  	DECLARE_IDTENTRY(vector, func)
>  
> +#ifndef CONFIG_AMD_MEM_ENCRYPT
>  /**
>   * DEFINE_IDTENTRY_SYSVEC - Emit code for system vector IDT entry points
>   * @func:	Function name of the entry point
> @@ -245,6 +271,26 @@ __visible noinstr void func(struct pt_regs *regs)			\
>  }									\
>  									\
>  static noinline void __##func(struct pt_regs *regs)
> +#else
> +
> +#define DEFINE_IDTENTRY_SYSVEC(func)					\
> +static void __##func(struct pt_regs *regs);				\
> +									\
> +__visible noinstr void func(struct pt_regs *regs)			\
> +{									\
> +	irqentry_state_t state = irqentry_enter(regs);			\
> +									\
> +	instrumentation_begin();					\
> +	kvm_set_cpu_l1tf_flush_l1d();					\
> +	run_sysvec_on_irqstack_cond(__##func, regs);			\
> +	instrumentation_end();						\
> +	irqentry_exit_hv_cond(regs, state);				\
> +}									\
> +									\
> +static noinline void __##func(struct pt_regs *regs)
> +#endif
> +
> +#ifndef CONFIG_AMD_MEM_ENCRYPT
>  
>  /**
>   * DEFINE_IDTENTRY_SYSVEC_SIMPLE - Emit code for simple system vector IDT
> @@ -274,6 +320,26 @@ __visible noinstr void func(struct pt_regs *regs)			\
>  }									\
>  									\
>  static __always_inline void __##func(struct pt_regs *regs)
> +#else
> +
> +#define DEFINE_IDTENTRY_SYSVEC_SIMPLE(func)				\
> +static __always_inline void __##func(struct pt_regs *regs);		\
> +									\
> +__visible noinstr void func(struct pt_regs *regs)			\
> +{									\
> +	irqentry_state_t state = irqentry_enter(regs);			\
> +									\
> +	instrumentation_begin();					\
> +	__irq_enter_raw();						\
> +	kvm_set_cpu_l1tf_flush_l1d();					\
> +	__##func(regs);						\
> +	__irq_exit_raw();						\
> +	instrumentation_end();						\
> +	irqentry_exit_hv_cond(regs, state);				\
> +}									\
> +									\
> +static __always_inline void __##func(struct pt_regs *regs)
> +#endif
>  
>  /**
>   * DECLARE_IDTENTRY_XENCB - Declare functions for XEN HV callback entry point
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index b1a98c2a52f8..23f15e95838b 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -147,6 +147,10 @@ struct sev_hv_doorbell_page {
>  
>  struct sev_snp_runtime_data {
>  	struct sev_hv_doorbell_page hv_doorbell_page;
> +	/*
> +	 * Indication that we are currently handling #HV events.
> +	 */
> +	bool hv_handling_events;
>  };
>  
>  static DEFINE_PER_CPU(struct sev_snp_runtime_data*, snp_runtime_data);
> @@ -200,6 +204,8 @@ static void do_exc_hv(struct pt_regs *regs)
>  	union hv_pending_events pending_events;
>  	u8 vector;
>  
> +	this_cpu_read(snp_runtime_data)->hv_handling_events = true;
> +
>  	while (sev_hv_pending()) {
>  		pending_events.events = xchg(
>  			&sev_snp_current_doorbell_page()->pending_events.events,
> @@ -234,6 +240,8 @@ static void do_exc_hv(struct pt_regs *regs)
>  			common_interrupt(regs, pending_events.vector);
>  		}
>  	}
> +
> +	this_cpu_read(snp_runtime_data)->hv_handling_events = false;
>  }
>  
>  static __always_inline bool on_vc_stack(struct pt_regs *regs)
> @@ -2529,3 +2537,25 @@ static int __init snp_init_platform_device(void)
>  	return 0;
>  }
>  device_initcall(snp_init_platform_device);
> +
> +noinstr void irqentry_exit_hv_cond(struct pt_regs *regs, irqentry_state_t state)
> +{
> +	/*
> +	 * Check whether this returns to user mode, if so and if
> +	 * we are currently executing the #HV handler then we don't
> +	 * want to follow the irqentry_exit_to_user_mode path as
> +	 * that can potentially cause the #HV handler to be
> +	 * preempted and rescheduled on another CPU. Rescheduled #HV
> +	 * handler on another cpu will cause interrupts to be handled
> +	 * on a different cpu than the injected one, causing
> +	 * invalid EOIs and missed/lost guest interrupts and
> +	 * corresponding hangs and/or per-cpu IRQs handled on
> +	 * non-intended cpu.
> +	 */
> +	if (user_mode(regs) &&
> +	    this_cpu_read(snp_runtime_data)->hv_handling_events)
> +		return;
> +
> +	/* follow normal interrupt return/exit path */
> +	irqentry_exit(regs, state);
> +}


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 03/16] x86/hyperv: Set Virtual Trust Level in vmbus init message
  2023-01-31 17:55   ` Michael Kelley (LINUX)
@ 2023-02-03  3:32     ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-02-03  3:32 UTC (permalink / raw)
  To: Michael Kelley (LINUX),
	luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 2/1/2023 1:55 AM, Michael Kelley (LINUX) wrote:
>> index db2202d985bd..6dcbb21aac2b 100644
>> --- a/arch/x86/include/asm/hyperv-tlfs.h
>> +++ b/arch/x86/include/asm/hyperv-tlfs.h
>> @@ -36,6 +36,10 @@
>>   #define HYPERV_CPUID_MIN			0x40000005
>>   #define HYPERV_CPUID_MAX			0x4000ffff
>>
>> +/* Support for HVCALL_GET_VP_REGISTERS hvcall */
> The above comment isn't really right, in that these definitions
> aren't for the hypercall.  They are for the specific synthetic register.
> 
>> +#define	HV_X64_REGISTER_VSM_VP_STATUS	0x000D0003
>> +#define HV_X64_VTL_MASK			GENMASK(3, 0)
> Hyper-V synthetic registers have two different numbering schemes.
> For registers that have synthetic MSR equivalents, there's a full list
> starting with HV_X64_MSR_GUEST_OS_ID, which defines the MSR
> address.  But these registers also have register numbers that are
> not the same as the MSR address.  These register numbers
> aren't defined anywhere in x86 Linux code because we don't access
> them using the register number.   (The register numbers *are*
> defined in ARM64 code since ARM64 doesn't have MSRs.)  But this
> register is an exception on x86.  There's no MSR equivalent so we
> must use a hypercall to fetch the value.
> 
> I'd suggest starting a separate list after the definition of
> HV_X64_MSR_REFERENCE_TSC and make clear in a comment
> about the list that this is a list of register numbers, not MSR addresses.
> 

Agree. Will update in the next version.
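
For readers less familiar with GENMASK(), the way a VTL would be pulled out
of the register value with the quoted HV_X64_VTL_MASK can be sketched in
user space as follows. The ex_* names are illustrative, and the function
form of GENMASK is a simplified stand-in for the kernel macro, assuming
64-bit values.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask with bits l..h set, like the kernel's GENMASK(h, l). */
static uint64_t ex_genmask(unsigned int h, unsigned int l)
{
	assert(l <= h && h < 64);

	return ((~0ULL) >> (63 - h)) & ((~0ULL) << l);
}

/*
 * Extract the VTL from a VSM VP status register value, assuming (as the
 * quoted definition does) that the VTL occupies bits 3..0.
 */
static uint8_t ex_vtl_from_vsm_vp_status(uint64_t reg_value)
{
	return (uint8_t)(reg_value & ex_genmask(3, 0));
}
```

The register value itself would still have to come from the get-VP-registers
hypercall, since, as noted above, there is no MSR equivalent on x86.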

^ permalink raw reply	[flat|nested] 60+ messages in thread

* RE: [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv
  2023-02-02 23:00 ` [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Zhi Wang
@ 2023-02-03  4:04   ` Michael Kelley (LINUX)
  0 siblings, 0 replies; 60+ messages in thread
From: Michael Kelley (LINUX) @ 2023-02-03  4:04 UTC (permalink / raw)
  To: Zhi Wang, Tianyu Lan
  Cc: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu,
	linux-kernel, kvm, linux-hyperv, linux-arch

From: Zhi Wang <zhi.wang.linux@gmail.com> Sent: Thursday, February 2, 2023 3:01 PM
> 
> On Sat, 21 Jan 2023 21:45:50 -0500
> Tianyu Lan <ltykernel@gmail.com> wrote:
> 
> 1) I am wondering whether this is a good time to organize a common code
> path for enlightened VMs on hyper-v.
> 
> Wouldn't it be better to have a common flag for enlightened VM?
> Like bool hv_isolation_type_enlightened()
> 
> Many of the decryption of the post msg page... are also required
> in the enlightened TDX guest, they are not AMD-specific.
> 
> Then in the "TDX guest on hyper-V" patch set, Dexuan can save some LOCs instead
> of ending up with if (hv_isolation_type_en_snp() ||
> hv_isolation_type_en_tdx())...

I've had the same thought, and have briefly discussed the
idea with Dexuan and Tianyu.  But there's some code coming
for a non-confidential VM scenario that hasn't yet been posted
upstream, and it adds yet more cases to consider.   We were
thinking to wait a bit until all the cases were evident, and then
find the right simplification.  If we try to do the simplification
now, we may need to do it again.

> 
> 2) It seems the AMD SEV-SNP enlightened guest on hyper-V is implemented as
> CC_VENDOR_AMD, while TDX enlightened guest is still implemented as
> CC_VENDOR_HYPERV. I am curious about the reason.

Patch set [1] makes CC_VENDOR_HYPERV go away.  Once that
happens, the TDX enlightened guest uses CC_VENDOR_INTEL.

Michael

[1] https://lore.kernel.org/linux-hyperv/1673559753-94403-1-git-send-email-mikelley@microsoft.com/T/#m4639d697e9a6619edfcdceffc1b0613a9016f601



> 
> > From: Tianyu Lan <tiala@microsoft.com>
> >
> > This patchset is to add AMD sev-snp enlightened guest
> > support on hyperv. Hyperv uses Linux direct boot mode
> > to boot up Linux kernel and so it needs to pvalidate
> > system memory by itself.
> >
> > In hyperv case, there is no boot loader and so cc blob
> > is prepared by hypervisor. In this series, hypervisor
> > set the cc blob address directly into boot parameter
> > of Linux kernel. If the magic number on cc blob address
> > is valid, kernel will read cc blob.
> >
> > Shared memory between guests and hypervisor should be
> > decrypted, and the memory should be zeroed after decryption
> > to avoid smearing data at the target address.
> >
> > Introduce #HV exception support in AMD sev snp code and
> > #HV handler.
> >
> > Change since v2:
> >        - Remove validate kernel memory code at boot stage
> >        - Split #HV page patch into two parts
> >        - Remove HV-APIC change due to enable x2apic from
> >        	 host side
> >        - Rework vmbus code to handle error of decrypt page
> >        - Split memory and cpu initialization patch.
> >
> > Change since v1:
> >        - Remove boot param changes for cc blob address and
> >        use setup head to pass cc blob info
> >        - Remove unnecessary WARN and BUG check
> >        - Add system vector table map in the #HV exception
> >        - Fix interrupt exit issue when use #HV exception
> >
> > Ashish Kalra (2):
> >   x86/sev: optimize system vector processing invoked from #HV exception
> >   x86/sev: Fix interrupt exit code paths from #HV exception
> >
> > Tianyu Lan (14):
> >   x86/hyperv: Add sev-snp enlightened guest specific config
> >   x86/hyperv: Decrypt hv vp assist page in sev-snp enlightened guest
> >   x86/hyperv: Set Virtual Trust Level in vmbus init message
> >   x86/hyperv: Use vmmcall to implement Hyper-V hypercall in sev-snp
> >     enlightened guest
> >   clocksource/drivers/hyper-v: decrypt hyperv tsc page in sev-snp
> >     enlightened guest
> >   x86/hyperv: decrypt vmbus pages for sev-snp enlightened guest
> >   drivers: hv: Decrypt percpu hvcall input arg page in sev-snp
> >     enlightened guest
> >   x86/hyperv: Initialize cpu and memory for sev-snp enlightened guest
> >   x86/hyperv: SEV-SNP enlightened guest don't support legacy rtc
> >   x86/hyperv: Add smp support for sev-snp guest
> >   x86/hyperv: Add hyperv-specific hadling for VMMCALL under SEV-ES
> >   x86/sev: Add a #HV exception handler
> >   x86/sev: Add Check of #HV event in path
> >   x86/sev: Initialize #HV doorbell and handle interrupt requests
> >
> >  arch/x86/entry/entry_64.S             |  82 ++++++
> >  arch/x86/hyperv/hv_init.c             |  43 +++
> >  arch/x86/hyperv/ivm.c                 |  10 +
> >  arch/x86/include/asm/cpu_entry_area.h |   6 +
> >  arch/x86/include/asm/hyperv-tlfs.h    |   4 +
> >  arch/x86/include/asm/idtentry.h       | 105 ++++++-
> >  arch/x86/include/asm/irqflags.h       |  10 +
> >  arch/x86/include/asm/mem_encrypt.h    |   2 +
> >  arch/x86/include/asm/mshyperv.h       |  56 +++-
> >  arch/x86/include/asm/msr-index.h      |   6 +
> >  arch/x86/include/asm/page_64_types.h  |   1 +
> >  arch/x86/include/asm/sev.h            |  13 +
> >  arch/x86/include/asm/svm.h            |  59 +++-
> >  arch/x86/include/asm/trapnr.h         |   1 +
> >  arch/x86/include/asm/traps.h          |   1 +
> >  arch/x86/include/asm/x86_init.h       |   2 +
> >  arch/x86/include/uapi/asm/svm.h       |   4 +
> >  arch/x86/kernel/cpu/common.c          |   1 +
> >  arch/x86/kernel/cpu/mshyperv.c        | 228 ++++++++++++++-
> >  arch/x86/kernel/dumpstack_64.c        |   9 +-
> >  arch/x86/kernel/idt.c                 |   1 +
> >  arch/x86/kernel/sev.c                 | 395 ++++++++++++++++++++++----
> >  arch/x86/kernel/traps.c               |  42 +++
> >  arch/x86/kernel/vmlinux.lds.S         |   7 +
> >  arch/x86/kernel/x86_init.c            |   4 +-
> >  arch/x86/mm/cpu_entry_area.c          |   2 +
> >  drivers/clocksource/hyperv_timer.c    |   2 +-
> >  drivers/hv/connection.c               |   1 +
> >  drivers/hv/hv.c                       |  33 ++-
> >  drivers/hv/hv_common.c                |  26 +-
> >  include/asm-generic/hyperv-tlfs.h     |  19 ++
> >  include/asm-generic/mshyperv.h        |   2 +
> >  include/linux/hyperv.h                |   4 +-
> >  33 files changed, 1102 insertions(+), 79 deletions(-)
> >


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 06/16] x86/hyperv: decrypt vmbus pages for sev-snp enlightened guest
  2023-01-31 17:58   ` Michael Kelley (LINUX)
@ 2023-02-03  4:11     ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-02-03  4:11 UTC (permalink / raw)
  To: Michael Kelley (LINUX),
	luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 2/1/2023 1:58 AM, Michael Kelley (LINUX) wrote:
>> +
>> +			ret = set_memory_decrypted((unsigned long)
>> +				hv_cpu->post_msg_page, 1);
>> +			if (ret)
>> +				goto err_decrypt_msg_page;
>> +
>> +			memset(hv_cpu->synic_message_page, 0, PAGE_SIZE);
>> +			memset(hv_cpu->synic_event_page, 0, PAGE_SIZE);
>> +			memset(hv_cpu->post_msg_page, 0, PAGE_SIZE);
>> +		}
> Having decrypted the pages here in hv_synic_alloc(), shouldn't
> there be corresponding re-encryption in hv_synic_free()?
> 

Yes, I missed that in this version and will fix it.

Thanks.
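
A minimal sketch of the symmetric alloc/free pairing discussed above.
Everything here is a user-space stand-in: struct page_state and the
*_stub helpers are hypothetical and only model whether a page is
currently decrypted; the real set_memory_decrypted() and
set_memory_encrypted() take a virtual address and a page count.

```c
#include <stdbool.h>

/* Hypothetical model of one page: only tracks its encryption state. */
struct page_state {
	bool decrypted;
};

/* Hypothetical stand-ins for set_memory_decrypted()/set_memory_encrypted(). */
static int set_memory_decrypted_stub(struct page_state *p)
{
	p->decrypted = true;
	return 0;
}

static int set_memory_encrypted_stub(struct page_state *p)
{
	p->decrypted = false;
	return 0;
}

/* The three per-cpu pages touched by the quoted hunk. */
struct hv_cpu_pages {
	struct page_state synic_message_page;
	struct page_state synic_event_page;
	struct page_state post_msg_page;
};

/* Decrypt all three pages, undoing earlier steps if a later one fails. */
static int synic_alloc_sketch(struct hv_cpu_pages *c)
{
	int ret;

	ret = set_memory_decrypted_stub(&c->synic_message_page);
	if (ret)
		return ret;

	ret = set_memory_decrypted_stub(&c->synic_event_page);
	if (ret)
		goto err_event;

	ret = set_memory_decrypted_stub(&c->post_msg_page);
	if (ret)
		goto err_post;

	return 0;

err_post:
	set_memory_encrypted_stub(&c->synic_event_page);
err_event:
	set_memory_encrypted_stub(&c->synic_message_page);
	return ret;
}

/* Mirror of the alloc path: re-encrypt every page that was decrypted. */
static void synic_free_sketch(struct hv_cpu_pages *c)
{
	if (c->post_msg_page.decrypted)
		set_memory_encrypted_stub(&c->post_msg_page);
	if (c->synic_event_page.decrypted)
		set_memory_encrypted_stub(&c->synic_event_page);
	if (c->synic_message_page.decrypted)
		set_memory_encrypted_stub(&c->synic_message_page);
}
```

The point of the sketch is only the pairing: each page decrypted in the
alloc path has a matching re-encryption in the free path.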

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 07/16] drivers: hv: Decrypt percpu hvcall input arg page in sev-snp enlightened guest
  2023-01-31 18:02   ` Michael Kelley (LINUX)
@ 2023-02-03  5:23     ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-02-03  5:23 UTC (permalink / raw)
  To: Michael Kelley (LINUX),
	luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 2/1/2023 2:02 AM, Michael Kelley (LINUX) wrote:
>> @@ -134,6 +136,17 @@ int hv_common_cpu_init(unsigned int cpu)
>>   	if (!(*inputarg))
>>   		return -ENOMEM;
>>
>> +	if (hv_isolation_type_en_snp()) {
>> +		ret = set_memory_decrypted((unsigned long)*inputarg, pgcount);
> You used "pgcount" here in response to a comment on v2 of the
> patch.  But the corresponding re-encryption in hv_common_cpu_die()
> uses a fixed value of "1".   The two cases should be consistent.  Either
> assert that hv_root_partition will never be true in an SNP VM, in which
> case hard coding "1" is OK.  Or properly calculate the number of pages
> in both cases so they are consistent.
> 

Agreed. We should keep the page-count logic consistent in both
hv_common_cpu_init() and hv_common_cpu_die(). Will fix it.
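
A minimal sketch of one way to keep the two paths consistent, assuming
the page count depends only on whether the kernel runs as the root
partition. The helper name inputarg_pgcount() and the *_stub functions
are hypothetical user-space stand-ins, not kernel APIs:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's hv_root_partition flag. */
static bool hv_root_partition_sketch;

/* Running total of decrypted pages, kept only so the pairing can be checked. */
static int decrypted_pages;

/* Single source of truth for the page count, used by both paths. */
static int inputarg_pgcount(void)
{
	return hv_root_partition_sketch ? 2 : 1;
}

/* Hypothetical stand-ins for set_memory_decrypted()/set_memory_encrypted(). */
static int set_memory_decrypted_stub(unsigned long addr, int numpages)
{
	(void)addr;
	decrypted_pages += numpages;
	return 0;
}

static int set_memory_encrypted_stub(unsigned long addr, int numpages)
{
	(void)addr;
	decrypted_pages -= numpages;
	return 0;
}

/* Models the decrypt step in hv_common_cpu_init(). */
static int cpu_init_sketch(unsigned long inputarg)
{
	return set_memory_decrypted_stub(inputarg, inputarg_pgcount());
}

/* Models the re-encrypt step in hv_common_cpu_die(). */
static int cpu_die_sketch(unsigned long inputarg)
{
	return set_memory_encrypted_stub(inputarg, inputarg_pgcount());
}
```

With a shared helper, a hard-coded "1" in one path cannot drift from a
computed count in the other.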

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 08/16] x86/hyperv: Initialize cpu and memory for sev-snp enlightened guest
  2023-01-31 18:20   ` Michael Kelley (LINUX)
@ 2023-02-03  5:58     ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-02-03  5:58 UTC (permalink / raw)
  To: Michael Kelley (LINUX),
	luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 2/1/2023 2:20 AM, Michael Kelley (LINUX) wrote:
>> +struct memory_map_entry {
>> +	u64 starting_gpn;
>> +	u64 numpages;
>> +	u16 type;
>> +	u16 flags;
>> +	u32 reserved;
>> +};
> Am I correct that this structure is defined by Hyper-V?  If so, it seems
> like it should go in hyperv-tlfs.h, along with the definition of
> EN_SEV_SNP_PROCESSOR_INFO_ADDR (which is also defined by
> Hyper-V?)
>

Yes, it's a Hyper-V data structure; I will move it to hyperv-tlfs.h.


>> +			if (e820_end < ram_end) {
>> +				pr_info("Hyper-V: add e820 entry [mem %#018Lx-%#018Lx]\n", e820_end, ram_end - 1);
>> +				e820__range_add(e820_end, ram_end - e820_end,
>> +						E820_TYPE_RAM);
>> +				for (page = e820_end; page < ram_end; page += PAGE_SIZE)
>> +					pvalidate((unsigned long)__va(page), RMP_PG_SIZE_4K, true);
>> +			}
>> +		}
>> +	}
>> +
> For SNP vTOM mode, most of the supporting code is placed in
> arch/x86/hyperv/ivm.c, which is built only if CONFIG_HYPERV
> is defined.  arch/x86/kernel/cpu/mshyperv.c is built for *any*
> flavor of guest (i.e., CONFIG_HYPERVISOR_GUEST).  I'm thinking
> all this code should go as a supporting function in ivm.c, to
> avoid overloading mshyperv.c.  Take a look at how hv_vtom_init()
> is handled in my patch set.
> 
> Breaking it out as a separate supporting function might also
> help reduce the deep indentation problem a bit. 😄

Good idea. Will update in the next version.
Thanks for your suggestion.
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest
  2023-01-31 18:34   ` Michael Kelley (LINUX)
@ 2023-02-03  6:10     ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-02-03  6:10 UTC (permalink / raw)
  To: Michael Kelley (LINUX),
	luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, Tianyu Lan, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 2/1/2023 2:34 AM, Michael Kelley (LINUX) wrote:
>> +		pr_err("HvCallStartVirtualProcessor failed: %llx\n", ret);
>> +		goto done;
>> +	}
>> +
>> +done:
>> +	local_irq_restore(flags);
>> +	return ret;
>> +}
>> +
> Like a comment in an earlier patch, I'm wondering if the bulk of
> this code could move to ivm.c, to avoid overloading mshyperv.c.

Sure. Will update in the next version.
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest
  2023-01-23 15:30   ` Tom Lendacky
@ 2023-02-03  7:00     ` Tianyu Lan
  2023-02-06 20:11       ` Borislav Petkov
  0 siblings, 1 reply; 60+ messages in thread
From: Tianyu Lan @ 2023-02-03  7:00 UTC (permalink / raw)
  To: Tom Lendacky, luto, tglx, mingo, bp, dave.hansen, x86, hpa,
	seanjc, pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, venu.busireddy, sterritt, tony.luck, samitolvanen,
	fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 1/23/2023 11:30 PM, Tom Lendacky wrote:
> On 1/21/23 20:46, Tianyu Lan wrote:
>> From: Tianyu Lan <tiala@microsoft.com>
>>
>> The wakeup_secondary_cpu callback was populated with wakeup_
>> cpu_via_vmgexit() which doesn't work for Hyper-V. Override it
> 
> An explanation as to why it doesn't work would be nice here.

Hi Thomas:
	Thanks for your review. Good idea. Will update.

>> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
>> index cb1ee53ad3b1..f8b321a11ee4 100644
>> --- a/arch/x86/include/asm/svm.h
>> +++ b/arch/x86/include/asm/svm.h
>> @@ -336,6 +336,53 @@ struct vmcb_save_area {
> 
> Please don't update the vmcb_save_area, you should be using/updating the 
> sev_es_save_area structure for SNP.

OK. Will update in the next version.

>>             u64 sev_feature_restrict_injection    : 1;
>> +            u64 sev_feature_alternate_injection    : 1;
>> +            u64 sev_feature_full_debug        : 1;
>> +            u64 sev_feature_reserved1        : 1;
>> +            u64 sev_feature_snpbtb_isolation    : 1;
>> +            u64 sev_feature_resrved2        : 56;
> 
> For the bits definition, use:
> 
>              u64 sev_feature_snp            : 1,
>                  sev_feature_vtom            : 1,
>                  sev_feature_reflectvc        : 1,
>                  ...
> 

Good suggestion. Thanks.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler
  2023-01-23  7:33   ` Gupta, Pankaj
@ 2023-02-03  7:27     ` Tianyu Lan
  2023-02-16 13:50       ` Gupta, Pankaj
  0 siblings, 1 reply; 60+ messages in thread
From: Tianyu Lan @ 2023-02-03  7:27 UTC (permalink / raw)
  To: Gupta, Pankaj, luto, tglx, mingo, bp, dave.hansen, x86, hpa,
	seanjc, pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 1/23/2023 3:33 PM, Gupta, Pankaj wrote:
> 
>> + */
>> +.macro idtentry_hv vector asmsym cfunc
>> +SYM_CODE_START(\asmsym)
>> +    UNWIND_HINT_IRET_REGS
>> +    ASM_CLAC
> 
> Did you get a chance to review the new instructions
> added at the start similar to idtentry_vc and comments
> added as suggested here?
> 
> https://lore.kernel.org/lkml/16e50239-39b2-4fb4-5110-18f13ba197fe@amd.com/

Hi Pankaj:
	Thanks for your reminder. Yes, CLD should be added after ASM_CLAC.
Will fix it.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest
  2023-02-03  7:00     ` Tianyu Lan
@ 2023-02-06 20:11       ` Borislav Petkov
  2023-02-07 13:49         ` Tianyu Lan
  0 siblings, 1 reply; 60+ messages in thread
From: Borislav Petkov @ 2023-02-06 20:11 UTC (permalink / raw)
  To: Tianyu Lan
  Cc: Tom Lendacky, luto, tglx, mingo, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, venu.busireddy, sterritt, tony.luck, samitolvanen,
	fenghua.yu, linux-kernel, kvm, linux-hyperv, linux-arch

On Fri, Feb 03, 2023 at 03:00:44PM +0800, Tianyu Lan wrote:
> > For the bits definition, use:
> > 
> >              u64 sev_feature_snp            : 1,
> >                  sev_feature_vtom            : 1,
> >                  sev_feature_reflectvc        : 1,
> >                  ...
> > 
> 
> Good suggestion. Thanks.

Actually, I'd prefer if you used a named union and dropped all these
"sev_feature_" prefixes everywhere:

        union {
                struct {
                        u64 snp                     : 1;
                        u64 vtom                    : 1;
                        u64 reflectvc               : 1;
                        u64 restrict_injection      : 1;
                        u64 alternate_injection     : 1;
                        u64 full_debug              : 1;
                        u64 reserved1               : 1;
                        u64 snpbtb_isolation        : 1;
                        u64 resrved2                : 56;
                };
                u64 val;
        } sev_features;



so that you can do in code:

	struct sev_es_save_area *sev;

	...

	sev->sev_features.snp = ...

and so on.
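
A self-contained user-space sketch of the same idea, using uint64_t in
place of the kernel's u64. The type name sev_features_sketch is
hypothetical, the anonymous inner struct needs C11 or the GNU extension,
and the order in which bit-fields are packed into val is
implementation-defined:

```c
#include <stdint.h>

/*
 * One 64-bit word viewed two ways: as individual feature bits and as a
 * raw value. The bit-fields add up to exactly 64 bits.
 */
union sev_features_sketch {
	struct {
		uint64_t snp			: 1;
		uint64_t vtom			: 1;
		uint64_t reflectvc		: 1;
		uint64_t restrict_injection	: 1;
		uint64_t alternate_injection	: 1;
		uint64_t full_debug		: 1;
		uint64_t reserved1		: 1;
		uint64_t snpbtb_isolation	: 1;
		uint64_t reserved2		: 56;
	};
	uint64_t val;
};

/* Hypothetical container, standing in for the real save area structure. */
struct save_area_sketch {
	union sev_features_sketch sev_features;
};

/* Clear every feature through the raw view, then enable SNP by name. */
static void enable_snp_only(struct save_area_sketch *sev)
{
	sev->sev_features.val = 0;
	sev->sev_features.snp = 1;
}
```

The raw val member is convenient for copying or comparing the whole
word, while the named bits keep call sites short.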

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest
  2023-02-06 20:11       ` Borislav Petkov
@ 2023-02-07 13:49         ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-02-07 13:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tom Lendacky, luto, tglx, mingo, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, venu.busireddy, sterritt, tony.luck, samitolvanen,
	fenghua.yu, linux-kernel, kvm, linux-hyperv, linux-arch

On 2/7/2023 4:11 AM, Borislav Petkov wrote:
> On Fri, Feb 03, 2023 at 03:00:44PM +0800, Tianyu Lan wrote:
>>> For the bits definition, use:
>>>
>>>               u64 sev_feature_snp            : 1,
>>>                   sev_feature_vtom            : 1,
>>>                   sev_feature_reflectvc        : 1,
>>>                   ...
>>>
>>
>> Good suggestion. Thanks.
> 
> Actually, I'd prefer if you used a named union and dropped all these
> "sev_feature_" prefixes everywhere:
> 
>          union {
>                  struct {
>                          u64 snp                     : 1;
>                          u64 vtom                    : 1;
>                          u64 reflectvc               : 1;
>                          u64 restrict_injection      : 1;
>                          u64 alternate_injection     : 1;
>                          u64 full_debug              : 1;
>                          u64 reserved1               : 1;
>                          u64 snpbtb_isolation        : 1;
>                          u64 resrved2                : 56;
>                  };
>                  u64 val;
>          } sev_features;
> 
> 
> 
> so that you can do in code:
> 
> 	struct sev_es_save_area *sev;
> 
> 	...
> 
> 	sev->sev_features.snp = ...
> 
> and so on.

Hi Boris:
	Thanks a lot for your suggestion. Will update.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 16/16] x86/sev: Fix interrupt exit code paths from #HV exception
  2023-02-02 23:20   ` Zhi Wang
@ 2023-02-08 23:53     ` Kalra, Ashish
  0 siblings, 0 replies; 60+ messages in thread
From: Kalra, Ashish @ 2023-02-08 23:53 UTC (permalink / raw)
  To: Zhi Wang, Tianyu Lan
  Cc: luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, srutherford, akpm,
	anshuman.khandual, pawan.kumar.gupta, adrian.hunter,
	daniel.sneddon, alexander.shishkin, sandipan.das, ray.huang,
	brijesh.singh, michael.roth, thomas.lendacky, venu.busireddy,
	sterritt, tony.luck, samitolvanen, fenghua.yu, linux-kernel, kvm,
	linux-hyperv, linux-arch

On 2/2/2023 5:20 PM, Zhi Wang wrote:
> On Sat, 21 Jan 2023 21:46:06 -0500
> Tianyu Lan <ltykernel@gmail.com> wrote:
> 
>> From: Ashish Kalra <ashish.kalra@amd.com>
>>
>> Add checks in interrupt exit code paths in case of returns
>> to user mode to check if currently executing the #HV handler
>> then don't follow the irqentry_exit_to_user_mode path as
>> that can potentially cause the #HV handler to be
>> preempted and rescheduled on another CPU. Rescheduled #HV
>> handler on another cpu will cause interrupts to be handled
>> on a different cpu than the injected one, causing
>> invalid EOIs and missed/lost guest interrupts and
>> corresponding hangs and/or per-cpu IRQs handled on
>> non-intended cpu.
>>
> 
> Why doesn't this problem happen in #VC handler? As #VC handler doesn't have
> this special handling.
> 

Because the #VC handler does not invoke common_interrupt() handler to do 
IRQ processing. Doing IRQ handling is specific to #HV exception handler 
as all guest interrupt handling is invoked from #HV exception handler 
once restricted interrupt injection support is enabled.

Thanks,
Ashish

>> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
>> ---
>>   arch/x86/include/asm/idtentry.h | 66 +++++++++++++++++++++++++++++++++
>>   arch/x86/kernel/sev.c           | 30 +++++++++++++++
>>   2 files changed, 96 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
>> index 652fea10d377..45b47132be7c 100644
>> --- a/arch/x86/include/asm/idtentry.h
>> +++ b/arch/x86/include/asm/idtentry.h
>> @@ -13,6 +13,10 @@
>>   
>>   #include <asm/irq_stack.h>
>>   
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +noinstr void irqentry_exit_hv_cond(struct pt_regs *regs, irqentry_state_t state);
>> +#endif
>> +
>>   /**
>>    * DECLARE_IDTENTRY - Declare functions for simple IDT entry points
>>    *		      No error code pushed by hardware
>> @@ -176,6 +180,7 @@ __visible noinstr void func(struct pt_regs *regs, unsigned long error_code)
>>   #define DECLARE_IDTENTRY_IRQ(vector, func)				\
>>   	DECLARE_IDTENTRY_ERRORCODE(vector, func)
>>   
>> +#ifndef CONFIG_AMD_MEM_ENCRYPT
>>   /**
>>    * DEFINE_IDTENTRY_IRQ - Emit code for device interrupt IDT entry points
>>    * @func:	Function name of the entry point
>> @@ -205,6 +210,26 @@ __visible noinstr void func(struct pt_regs *regs,			\
>>   }									\
>>   									\
>>   static noinline void __##func(struct pt_regs *regs, u32 vector)
>> +#else
>> +
>> +#define DEFINE_IDTENTRY_IRQ(func)					\
>> +static void __##func(struct pt_regs *regs, u32 vector);		\
>> +									\
>> +__visible noinstr void func(struct pt_regs *regs,			\
>> +			    unsigned long error_code)			\
>> +{									\
>> +	irqentry_state_t state = irqentry_enter(regs);			\
>> +	u32 vector = (u32)(u8)error_code;				\
>> +									\
>> +	instrumentation_begin();					\
>> +	kvm_set_cpu_l1tf_flush_l1d();					\
>> +	run_irq_on_irqstack_cond(__##func, regs, vector);		\
>> +	instrumentation_end();						\
>> +	irqentry_exit_hv_cond(regs, state);				\
>> +}									\
>> +									\
>> +static noinline void __##func(struct pt_regs *regs, u32 vector)
>> +#endif
>>   
>>   /**
>>    * DECLARE_IDTENTRY_SYSVEC - Declare functions for system vector entry points
>> @@ -221,6 +246,7 @@ static noinline void __##func(struct pt_regs *regs, u32 vector)
>>   #define DECLARE_IDTENTRY_SYSVEC(vector, func)				\
>>   	DECLARE_IDTENTRY(vector, func)
>>   
>> +#ifndef CONFIG_AMD_MEM_ENCRYPT
>>   /**
>>    * DEFINE_IDTENTRY_SYSVEC - Emit code for system vector IDT entry points
>>    * @func:	Function name of the entry point
>> @@ -245,6 +271,26 @@ __visible noinstr void func(struct pt_regs *regs)			\
>>   }									\
>>   									\
>>   static noinline void __##func(struct pt_regs *regs)
>> +#else
>> +
>> +#define DEFINE_IDTENTRY_SYSVEC(func)					\
>> +static void __##func(struct pt_regs *regs);				\
>> +									\
>> +__visible noinstr void func(struct pt_regs *regs)			\
>> +{									\
>> +	irqentry_state_t state = irqentry_enter(regs);			\
>> +									\
>> +	instrumentation_begin();					\
>> +	kvm_set_cpu_l1tf_flush_l1d();					\
>> +	run_sysvec_on_irqstack_cond(__##func, regs);			\
>> +	instrumentation_end();						\
>> +	irqentry_exit_hv_cond(regs, state);				\
>> +}									\
>> +									\
>> +static noinline void __##func(struct pt_regs *regs)
>> +#endif
>> +
>> +#ifndef CONFIG_AMD_MEM_ENCRYPT
>>   
>>   /**
>>    * DEFINE_IDTENTRY_SYSVEC_SIMPLE - Emit code for simple system vector IDT
>> @@ -274,6 +320,26 @@ __visible noinstr void func(struct pt_regs *regs)			\
>>   }									\
>>   									\
>>   static __always_inline void __##func(struct pt_regs *regs)
>> +#else
>> +
>> +#define DEFINE_IDTENTRY_SYSVEC_SIMPLE(func)				\
>> +static __always_inline void __##func(struct pt_regs *regs);		\
>> +									\
>> +__visible noinstr void func(struct pt_regs *regs)			\
>> +{									\
>> +	irqentry_state_t state = irqentry_enter(regs);			\
>> +									\
>> +	instrumentation_begin();					\
>> +	__irq_enter_raw();						\
>> +	kvm_set_cpu_l1tf_flush_l1d();					\
>> +	__##func(regs);						\
>> +	__irq_exit_raw();						\
>> +	instrumentation_end();						\
>> +	irqentry_exit_hv_cond(regs, state);				\
>> +}									\
>> +									\
>> +static __always_inline void __##func(struct pt_regs *regs)
>> +#endif
>>   
>>   /**
>>    * DECLARE_IDTENTRY_XENCB - Declare functions for XEN HV callback entry point
>> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
>> index b1a98c2a52f8..23f15e95838b 100644
>> --- a/arch/x86/kernel/sev.c
>> +++ b/arch/x86/kernel/sev.c
>> @@ -147,6 +147,10 @@ struct sev_hv_doorbell_page {
>>   
>>   struct sev_snp_runtime_data {
>>   	struct sev_hv_doorbell_page hv_doorbell_page;
>> +	/*
>> +	 * Indication that we are currently handling #HV events.
>> +	 */
>> +	bool hv_handling_events;
>>   };
>>   
>>   static DEFINE_PER_CPU(struct sev_snp_runtime_data*, snp_runtime_data);
>> @@ -200,6 +204,8 @@ static void do_exc_hv(struct pt_regs *regs)
>>   	union hv_pending_events pending_events;
>>   	u8 vector;
>>   
>> +	this_cpu_read(snp_runtime_data)->hv_handling_events = true;
>> +
>>   	while (sev_hv_pending()) {
>>   		pending_events.events = xchg(
>>   			&sev_snp_current_doorbell_page()->pending_events.events,
>> @@ -234,6 +240,8 @@ static void do_exc_hv(struct pt_regs *regs)
>>   			common_interrupt(regs, pending_events.vector);
>>   		}
>>   	}
>> +
>> +	this_cpu_read(snp_runtime_data)->hv_handling_events = false;
>>   }
>>   
>>   static __always_inline bool on_vc_stack(struct pt_regs *regs)
>> @@ -2529,3 +2537,25 @@ static int __init snp_init_platform_device(void)
>>   	return 0;
>>   }
>>   device_initcall(snp_init_platform_device);
>> +
>> +noinstr void irqentry_exit_hv_cond(struct pt_regs *regs, irqentry_state_t state)
>> +{
>> +	/*
>> +	 * Check whether this returns to user mode, if so and if
>> +	 * we are currently executing the #HV handler then we don't
>> +	 * want to follow the irqentry_exit_to_user_mode path as
>> +	 * that can potentially cause the #HV handler to be
>> +	 * preempted and rescheduled on another CPU. Rescheduled #HV
>> +	 * handler on another cpu will cause interrupts to be handled
>> +	 * on a different cpu than the injected one, causing
>> +	 * invalid EOIs and missed/lost guest interrupts and
>> +	 * corresponding hangs and/or per-cpu IRQs handled on
>> +	 * non-intended cpu.
>> +	 */
>> +	if (user_mode(regs) &&
>> +	    this_cpu_read(snp_runtime_data)->hv_handling_events)
>> +		return;
>> +
>> +	/* follow normal interrupt return/exit path */
>> +	irqentry_exit(regs, state);
>> +}
> 

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv
  2023-01-22  2:45 [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
                   ` (16 preceding siblings ...)
  2023-02-02 23:00 ` [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Zhi Wang
@ 2023-02-09 11:36 ` Gupta, Pankaj
  2023-02-17 12:47   ` Gupta, Pankaj
  17 siblings, 1 reply; 60+ messages in thread
From: Gupta, Pankaj @ 2023-02-09 11:36 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

Hi Tianyu,

> This patchset is to add AMD sev-snp enlightened guest
> support on hyperv. Hyperv uses Linux direct boot mode
> to boot up Linux kernel and so it needs to pvalidate
> system memory by itself.
> 
> In hyperv case, there is no boot loader and so cc blob
> is prepared by hypervisor. In this series, hypervisor
> set the cc blob address directly into boot parameter
> of Linux kernel. If the magic number on cc blob address
> is valid, kernel will read cc blob.
> 
> Shared memory between guests and hypervisor should be
> decrypted, and the memory should be zeroed after decryption
> to avoid smearing data at the target address.
> 
> Introduce #HV exception support in AMD sev snp code and
> #HV handler.

I am interested in testing the Linux guest #HV exception handling
(patches 12-16 in this series) for restricted interrupt injection with
a Linux/KVM host.

Do you have a git tree or a base commit on which I can apply these
patches?

Thank You,
Pankaj

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler
  2023-02-03  7:27     ` Tianyu Lan
@ 2023-02-16 13:50       ` Gupta, Pankaj
  0 siblings, 0 replies; 60+ messages in thread
From: Gupta, Pankaj @ 2023-02-16 13:50 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 2/3/2023 8:27 AM, Tianyu Lan wrote:
> On 1/23/2023 3:33 PM, Gupta, Pankaj wrote:
>>
>>> + */
>>> +.macro idtentry_hv vector asmsym cfunc
>>> +SYM_CODE_START(\asmsym)
>>> +    UNWIND_HINT_IRET_REGS
>>> +    ASM_CLAC
>>
>> Did you get a chance to review the new instructions
>> added at the start similar to idtentry_vc and comments
>> added as suggested here?
>>
>> https://lore.kernel.org/lkml/16e50239-39b2-4fb4-5110-18f13ba197fe@amd.com/
> 
> Hi Pankaj:
>      Thanks for your reminder. Yes, CLD should be added after ASM_CLAC.
> Will fix it.

Also, it looks like ENDBR needs to be added before ASM_CLAC, as I get
these objtool warnings:

vmlinux.o: warning: objtool: asm_exc_hv_injection+0x0: UNWIND_HINT_IRET_REGS without ENDBR
vmlinux.o: warning: objtool: ibt_selftest+0x11: sibling call from callable instruction with modified stack frame
vmlinux.o: warning: objtool: ibt_selftest+0x1e: return with modified stack frame
vmlinux.o: warning: objtool: def_idts+0x1d8: data relocation to !ENDBR: asm_exc_hv_injection+0x0

Thanks,
Pankaj


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 14/16] x86/sev: Initialize #HV doorbell and handle interrupt requests
  2023-01-22  2:46 ` [RFC PATCH V3 14/16] x86/sev: Initialize #HV doorbell and handle interrupt requests Tianyu Lan
@ 2023-02-16 14:46   ` Gupta, Pankaj
  2023-02-17 12:45   ` Gupta, Pankaj
  2023-03-01 19:34   ` Gupta, Pankaj
  2 siblings, 0 replies; 60+ messages in thread
From: Gupta, Pankaj @ 2023-02-16 14:46 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 1/22/2023 3:46 AM, Tianyu Lan wrote:
> From: Tianyu Lan <tiala@microsoft.com>
> 
> Enable #HV exception to handle interrupt requests from hypervisor.
> 
> Co-developed-by: Lendacky Thomas <thomas.lendacky@amd.com>
> Co-developed-by: Kalra Ashish <ashish.kalra@amd.com>
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
>   arch/x86/include/asm/mem_encrypt.h |   2 +
>   arch/x86/include/asm/msr-index.h   |   6 +
>   arch/x86/include/asm/svm.h         |  12 +-
>   arch/x86/include/uapi/asm/svm.h    |   4 +
>   arch/x86/kernel/sev.c              | 307 +++++++++++++++++++++++------
>   arch/x86/kernel/traps.c            |   2 +
>   6 files changed, 272 insertions(+), 61 deletions(-)
> 
> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
> index 72ca90552b6a..7264ca5f5b2d 100644
> --- a/arch/x86/include/asm/mem_encrypt.h
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -50,6 +50,7 @@ void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages,
>   void __init mem_encrypt_free_decrypted_mem(void);
>   
>   void __init sev_es_init_vc_handling(void);
> +void __init sev_snp_init_hv_handling(void);
>   
>   #define __bss_decrypted __section(".bss..decrypted")
>   
> @@ -72,6 +73,7 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { }
>   static inline void __init sme_enable(struct boot_params *bp) { }
>   
>   static inline void sev_es_init_vc_handling(void) { }
> +static inline void sev_snp_init_hv_handling(void) { }
>   
>   static inline int __init
>   early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; }
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 6a6e70e792a4..70af0ce5f2c4 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -562,11 +562,17 @@
>   #define MSR_AMD64_SEV_ENABLED_BIT	0
>   #define MSR_AMD64_SEV_ES_ENABLED_BIT	1
>   #define MSR_AMD64_SEV_SNP_ENABLED_BIT	2
> +#define MSR_AMD64_SEV_REFLECTVC_ENABLED_BIT		4
> +#define MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED_BIT	5
> +#define MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED_BIT	6
>   #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
>   #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
>   #define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
>   #define MSR_AMD64_SNP_VTOM_ENABLED	BIT_ULL(3)
>   
> +#define MSR_AMD64_SEV_REFLECTVC_ENABLED			BIT_ULL(MSR_AMD64_SEV_REFLECTVC_ENABLED_BIT)
> +#define MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED	BIT_ULL(MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED_BIT)
> +#define MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED	BIT_ULL(MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED_BIT)
>   #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
>   
>   /* AMD Collaborative Processor Performance Control MSRs */
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index f8b321a11ee4..911c991fec78 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -568,12 +568,12 @@ static inline void __unused_size_checks(void)
>   
>   	/* Check offsets of reserved fields */
>   
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xa0);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xcc);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xd8);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x180);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x248);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x298);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xa0);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xcc);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xd8);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x180);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x248);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x298);
>   
>   	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0xc8);
>   	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0xcc);
> diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
> index f69c168391aa..85d6882262e7 100644
> --- a/arch/x86/include/uapi/asm/svm.h
> +++ b/arch/x86/include/uapi/asm/svm.h
> @@ -115,6 +115,10 @@
>   #define SVM_VMGEXIT_AP_CREATE_ON_INIT		0
>   #define SVM_VMGEXIT_AP_CREATE			1
>   #define SVM_VMGEXIT_AP_DESTROY			2
> +#define SVM_VMGEXIT_HV_DOORBELL_PAGE		0x80000014
> +#define SVM_VMGEXIT_GET_PREFERRED_HV_DOORBELL_PAGE	0
> +#define SVM_VMGEXIT_SET_HV_DOORBELL_PAGE		1
> +#define SVM_VMGEXIT_QUERY_HV_DOORBELL_PAGE		2
>   #define SVM_VMGEXIT_HV_FEATURES			0x8000fffd
>   #define SVM_VMGEXIT_UNSUPPORTED_EVENT		0x8000ffff
>   
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index fe5e5e41433d..03d99fad9e76 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -122,6 +122,150 @@ struct sev_config {
>   
>   static struct sev_config sev_cfg __read_mostly;
>   
> +static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state);
> +static noinstr void __sev_put_ghcb(struct ghcb_state *state);
> +static int vmgexit_hv_doorbell_page(struct ghcb *ghcb, u64 op, u64 pa);
> +static void sev_snp_setup_hv_doorbell_page(struct ghcb *ghcb);
> +
> +union hv_pending_events {
> +	u16 events;
> +	struct {
> +		u8 vector;
> +		u8 nmi : 1;
> +		u8 mc : 1;
> +		u8 reserved1 : 5;
> +		u8 no_further_signal : 1;
> +	};
> +};
> +
> +struct sev_hv_doorbell_page {
> +	union hv_pending_events pending_events;
> +	u8 no_eoi_required;
> +	u8 reserved2[61];
> +	u8 padding[4032];
> +};
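
For reference, the layout above can be checked in userspace. The following
is a minimal sketch, not part of the patch: it mirrors the two definitions
with <stdint.h> types, and hv_events_raw() is a hypothetical helper added
only for illustration. It assumes a little-endian x86-64 build with GCC or
Clang, where bit-fields are allocated starting from the least significant bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the layout quoted above (uint8_t/uint16_t stand in
 * for the kernel's u8/u16). */
union hv_pending_events {
	uint16_t events;
	struct {
		uint8_t vector;
		uint8_t nmi : 1;
		uint8_t mc : 1;
		uint8_t reserved1 : 5;
		uint8_t no_further_signal : 1;
	};
};

struct sev_hv_doorbell_page {
	union hv_pending_events pending_events;
	uint8_t no_eoi_required;
	uint8_t reserved2[61];
	uint8_t padding[4032];
};

/* 2 + 1 + 61 + 4032 bytes: the structure fills exactly one 4 KiB page. */
static_assert(sizeof(struct sev_hv_doorbell_page) == 4096,
	      "doorbell page must be 4 KiB");
static_assert(offsetof(struct sev_hv_doorbell_page, no_eoi_required) == 2,
	      "no_eoi_required follows the 16-bit event word");

/* Hypothetical helper: build an event word from its fields and return
 * the raw 16-bit value, to show which bit each field lands in. */
static uint16_t hv_events_raw(uint8_t vector, int nmi, int mc, int nfs)
{
	union hv_pending_events ev = { .events = 0 };

	ev.vector = vector;
	ev.nmi = nmi ? 1 : 0;
	ev.mc = mc ? 1 : 0;
	ev.no_further_signal = nfs ? 1 : 0;
	return ev.events;
}
```

Under those assumptions the vector occupies bits 0-7, nmi is bit 8, mc is
bit 9 and no_further_signal is bit 15, which is why the 0x7fff mask used
in hv_raw_handle_exception() clears only no_further_signal.
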
> +
> +struct sev_snp_runtime_data {
> +	struct sev_hv_doorbell_page hv_doorbell_page;
> +};
> +
> +static DEFINE_PER_CPU(struct sev_snp_runtime_data*, snp_runtime_data);
> +
> +static inline u64 sev_es_rd_ghcb_msr(void)
> +{
> +	return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
> +}
> +
> +static __always_inline void sev_es_wr_ghcb_msr(u64 val)
> +{
> +	u32 low, high;
> +
> +	low  = (u32)(val);
> +	high = (u32)(val >> 32);
> +
> +	native_wrmsr(MSR_AMD64_SEV_ES_GHCB, low, high);
> +}
> +
> +struct sev_hv_doorbell_page *sev_snp_current_doorbell_page(void)
> +{
> +	return &this_cpu_read(snp_runtime_data)->hv_doorbell_page;
> +}
> +
> +static u8 sev_hv_pending(void)
> +{
> +	return sev_snp_current_doorbell_page()->pending_events.events;
> +}
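
A note on the return type: sev_hv_pending() returns u8 while
pending_events.events is u16, so the implicit conversion keeps only the
vector byte. Below is a minimal userspace sketch of that effect, not part of
the patch. The function and macro names are hypothetical, and it assumes nmi
sits in bit 8 of the event word, as it does on little-endian x86-64 with GCC
or Clang.

```c
#include <assert.h>
#include <stdint.h>

#define HV_EVENT_NMI	0x0100u	/* hypothetical name: bit 8 of the event word */

/* Mirrors the quoted return type: a 16-bit value returned as 8 bits. */
static uint8_t hv_pending_as_u8(uint16_t events)
{
	return (uint8_t)events;	/* only the vector byte survives */
}

/* Alternative that keeps the NMI/MC bits visible to the caller. */
static uint16_t hv_pending_as_u16(uint16_t events)
{
	return events;
}
```

With only an NMI or machine check pending and no vector, the u8 variant
reports nothing pending, so it appears the while loop in do_exc_hv() would
not be entered in that case. Returning u16 (or bool) would avoid that.
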
> +
> +static void hv_doorbell_apic_eoi_write(u32 reg, u32 val)
> +{
> +	if (xchg(&sev_snp_current_doorbell_page()->no_eoi_required, 0) & 0x1)
> +		return;
> +
> +	BUG_ON(reg != APIC_EOI);
> +	apic->write(reg, val);
> +}
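
The EOI logic above reads and clears no_eoi_required in a single exchange
and skips the APIC write when the flag was set. A minimal userspace sketch of
that decision, not part of the patch, using C11 atomics in place of the
kernel's xchg() (hv_eoi_needed() is a hypothetical name):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the caller still has to write APIC_EOI.
 * The flag is fetched and cleared in one atomic step, so a value set
 * by the other side is consumed exactly once. */
static bool hv_eoi_needed(_Atomic uint8_t *no_eoi_required)
{
	return (atomic_exchange(no_eoi_required, 0) & 0x1) == 0;
}
```

The first call after the flag has been set returns false (no EOI write),
and later calls return true until the flag is set again.
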
> +
> +static void do_exc_hv(struct pt_regs *regs)
> +{
> +	union hv_pending_events pending_events;
> +	u8 vector;

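Unused variable: 'vector' is never assigned, and its only use is the
panic() below, which probably should print pending_events.vector instead.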

> +
> +	while (sev_hv_pending()) {
> +		pending_events.events = xchg(
> +			&sev_snp_current_doorbell_page()->pending_events.events,
> +			0);
> +
> +		if (pending_events.nmi)
> +			exc_nmi(regs);
> +
> +#ifdef CONFIG_X86_MCE
> +		if (pending_events.mc)
> +			exc_machine_check(regs);
> +#endif
> +
> +		if (!pending_events.vector)
> +			return;
> +
> +		if (pending_events.vector < FIRST_EXTERNAL_VECTOR) {
> +			/* Exception vectors */
> +			WARN(1, "exception shouldn't happen\n");
> +		} else if (pending_events.vector == FIRST_EXTERNAL_VECTOR) {
> +			sysvec_irq_move_cleanup(regs);
> +		} else if (pending_events.vector == IA32_SYSCALL_VECTOR) {
> +			WARN(1, "syscall shouldn't happen\n");
> +		} else if (pending_events.vector >= FIRST_SYSTEM_VECTOR) {
> +			switch (pending_events.vector) {
> +#if IS_ENABLED(CONFIG_HYPERV)
> +			case HYPERV_STIMER0_VECTOR:
> +				sysvec_hyperv_stimer0(regs);
> +				break;
> +			case HYPERVISOR_CALLBACK_VECTOR:
> +				sysvec_hyperv_callback(regs);
> +				break;
> +#endif
> +#ifdef CONFIG_SMP
> +			case RESCHEDULE_VECTOR:
> +				sysvec_reschedule_ipi(regs);
> +				break;
> +			case IRQ_MOVE_CLEANUP_VECTOR:
> +				sysvec_irq_move_cleanup(regs);
> +				break;
> +			case REBOOT_VECTOR:
> +				sysvec_reboot(regs);
> +				break;
> +			case CALL_FUNCTION_SINGLE_VECTOR:
> +				sysvec_call_function_single(regs);
> +				break;
> +			case CALL_FUNCTION_VECTOR:
> +				sysvec_call_function(regs);
> +				break;
> +#endif
> +#ifdef CONFIG_X86_LOCAL_APIC
> +			case ERROR_APIC_VECTOR:
> +				sysvec_error_interrupt(regs);
> +				break;
> +			case SPURIOUS_APIC_VECTOR:
> +				sysvec_spurious_apic_interrupt(regs);
> +				break;
> +			case LOCAL_TIMER_VECTOR:
> +				sysvec_apic_timer_interrupt(regs);
> +				break;
> +			case X86_PLATFORM_IPI_VECTOR:
> +				sysvec_x86_platform_ipi(regs);
> +				break;
> +#endif
> +			case 0x0:
> +				break;
> +			default:
> +				panic("Unexpected vector %d\n", vector);
> +				unreachable();
> +			}
> +		} else {
> +			common_interrupt(regs, pending_events.vector);
> +		}
> +	}
> +}
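
The drain loop above can be summarised as: fetch and clear the event word
atomically, then dispatch whatever was in the snapshot, and repeat until
nothing is left. The following is a simplified userspace sketch of that
pattern, not part of the patch. It uses C11 atomic_exchange() in place of
xchg(), counters in place of the real handlers, and hypothetical names
throughout; it assumes the bit positions of a little-endian x86-64 build
(vector in bits 0-7, nmi in bit 8, mc in bit 9).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define HV_EVENT_VECTOR_MASK	0x00ffu	/* hypothetical names */
#define HV_EVENT_NMI		0x0100u
#define HV_EVENT_MC		0x0200u

/* Stand-in for the real handlers: just record what was dispatched. */
struct hv_dispatch_log {
	unsigned int nmi;
	unsigned int mc;
	unsigned int irq;
	uint8_t last_vector;
};

static void hv_drain_pending(_Atomic uint16_t *events,
			     struct hv_dispatch_log *log)
{
	uint16_t snap;

	/* Fetch-and-clear in one step: an event posted concurrently is
	 * either part of this snapshot or picked up by the next pass. */
	while ((snap = atomic_exchange(events, 0)) != 0) {
		if (snap & HV_EVENT_NMI)
			log->nmi++;
		if (snap & HV_EVENT_MC)
			log->mc++;
		if (snap & HV_EVENT_VECTOR_MASK) {
			/* Dispatch on the vector taken from the snapshot. */
			log->last_vector = (uint8_t)(snap & HV_EVENT_VECTOR_MASK);
			log->irq++;
		}
	}
}
```

Unlike the quoted code, this sketch tests the full 16-bit snapshot and
always uses the vector from that snapshot when reporting or dispatching.
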
> +
>   static __always_inline bool on_vc_stack(struct pt_regs *regs)
>   {
>   	unsigned long sp = regs->sp;
> @@ -179,11 +323,6 @@ void noinstr __sev_es_ist_enter(struct pt_regs *regs)
>   	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], new_ist);
>   }
>   
> -static void do_exc_hv(struct pt_regs *regs)
> -{
> -	/* Handle #HV exception. */
> -}
> -
>   void check_hv_pending(struct pt_regs *regs)
>   {
>   	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> @@ -232,68 +371,38 @@ void noinstr __sev_es_ist_exit(void)
>   	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], *(unsigned long *)ist);
>   }
>   
> -/*
> - * Nothing shall interrupt this code path while holding the per-CPU
> - * GHCB. The backup GHCB is only for NMIs interrupting this path.
> - *
> - * Callers must disable local interrupts around it.
> - */
> -static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
> +static bool sev_restricted_injection_enabled(void)
> +{
> +	return sev_status & MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED;
> +}
> +
> +void __init sev_snp_init_hv_handling(void)
>   {
> +	struct sev_snp_runtime_data *snp_data;

Unused variable.

>   	struct sev_es_runtime_data *data;
> +	struct ghcb_state state;
>   	struct ghcb *ghcb;
> +	unsigned long flags;
> +	int cpu;

Unused variable.

> +	int err;

Unused variable.

>   
>   	WARN_ON(!irqs_disabled());
> +	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP) || !sev_restricted_injection_enabled())
> +		return;
>   
>   	data = this_cpu_read(runtime_data);
> -	ghcb = &data->ghcb_page;
>   
> -	if (unlikely(data->ghcb_active)) {
> -		/* GHCB is already in use - save its contents */
> -
> -		if (unlikely(data->backup_ghcb_active)) {
> -			/*
> -			 * Backup-GHCB is also already in use. There is no way
> -			 * to continue here so just kill the machine. To make
> -			 * panic() work, mark GHCBs inactive so that messages
> -			 * can be printed out.
> -			 */
> -			data->ghcb_active        = false;
> -			data->backup_ghcb_active = false;
> -
> -			instrumentation_begin();
> -			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
> -			instrumentation_end();
> -		}
> -
> -		/* Mark backup_ghcb active before writing to it */
> -		data->backup_ghcb_active = true;
> -
> -		state->ghcb = &data->backup_ghcb;
> +	local_irq_save(flags);
>   
> -		/* Backup GHCB content */
> -		*state->ghcb = *ghcb;
> -	} else {
> -		state->ghcb = NULL;
> -		data->ghcb_active = true;
> -	}
> +	ghcb = __sev_get_ghcb(&state);
>   
> -	return ghcb;
> -}
> +	sev_snp_setup_hv_doorbell_page(ghcb);
>   
> -static inline u64 sev_es_rd_ghcb_msr(void)
> -{
> -	return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
> -}
> -
> -static __always_inline void sev_es_wr_ghcb_msr(u64 val)
> -{
> -	u32 low, high;
> +	__sev_put_ghcb(&state);
>   
> -	low  = (u32)(val);
> -	high = (u32)(val >> 32);
> +	apic_set_eoi_write(hv_doorbell_apic_eoi_write);
>   
> -	native_wrmsr(MSR_AMD64_SEV_ES_GHCB, low, high);
> +	local_irq_restore(flags);
>   }
>   
>   static int vc_fetch_insn_kernel(struct es_em_ctxt *ctxt,
> @@ -554,6 +663,69 @@ static enum es_result vc_slow_virt_to_phys(struct ghcb *ghcb, struct es_em_ctxt
>   /* Include code shared with pre-decompression boot stage */
>   #include "sev-shared.c"
>   
> +/*
> + * Nothing shall interrupt this code path while holding the per-CPU
> + * GHCB. The backup GHCB is only for NMIs interrupting this path.
> + *
> + * Callers must disable local interrupts around it.
> + */
> +static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
> +{
> +	struct sev_es_runtime_data *data;
> +	struct ghcb *ghcb;
> +
> +	WARN_ON(!irqs_disabled());
> +
> +	data = this_cpu_read(runtime_data);
> +	ghcb = &data->ghcb_page;
> +
> +	if (unlikely(data->ghcb_active)) {
> +		/* GHCB is already in use - save its contents */
> +
> +		if (unlikely(data->backup_ghcb_active)) {
> +			/*
> +			 * Backup-GHCB is also already in use. There is no way
> +			 * to continue here so just kill the machine. To make
> +			 * panic() work, mark GHCBs inactive so that messages
> +			 * can be printed out.
> +			 */
> +			data->ghcb_active        = false;
> +			data->backup_ghcb_active = false;
> +
> +			instrumentation_begin();
> +			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
> +			instrumentation_end();
> +		}
> +
> +		/* Mark backup_ghcb active before writing to it */
> +		data->backup_ghcb_active = true;
> +
> +		state->ghcb = &data->backup_ghcb;
> +
> +		/* Backup GHCB content */
> +		*state->ghcb = *ghcb;
> +	} else {
> +		state->ghcb = NULL;
> +		data->ghcb_active = true;
> +	}
> +
> +	return ghcb;
> +}
> +
> +static void sev_snp_setup_hv_doorbell_page(struct ghcb *ghcb)
> +{
> +	u64 pa;
> +	enum es_result ret;
> +
> +	pa = __pa(sev_snp_current_doorbell_page());
> +	vc_ghcb_invalidate(ghcb);
> +	ret = vmgexit_hv_doorbell_page(ghcb,
> +				       SVM_VMGEXIT_SET_HV_DOORBELL_PAGE,
> +				       pa);
> +	if (ret != ES_OK)
> +		panic("SEV-SNP: failed to set up #HV doorbell page");
> +}
> +
>   static noinstr void __sev_put_ghcb(struct ghcb_state *state)
>   {
>   	struct sev_es_runtime_data *data;
> @@ -1282,6 +1454,7 @@ static void snp_register_per_cpu_ghcb(void)
>   	ghcb = &data->ghcb_page;
>   
>   	snp_register_ghcb_early(__pa(ghcb));
> +	sev_snp_setup_hv_doorbell_page(ghcb);
>   }
>   
>   void setup_ghcb(void)
> @@ -1321,6 +1494,11 @@ void setup_ghcb(void)
>   		snp_register_ghcb_early(__pa(&boot_ghcb_page));
>   }
>   
> +int vmgexit_hv_doorbell_page(struct ghcb *ghcb, u64 op, u64 pa)
> +{
> +	return sev_es_ghcb_hv_call(ghcb, NULL, SVM_VMGEXIT_HV_DOORBELL_PAGE, op, pa);
> +}
> +
>   #ifdef CONFIG_HOTPLUG_CPU
>   static void sev_es_ap_hlt_loop(void)
>   {
> @@ -1394,6 +1572,7 @@ static void __init alloc_runtime_data(int cpu)
>   static void __init init_ghcb(int cpu)
>   {
>   	struct sev_es_runtime_data *data;
> +	struct sev_snp_runtime_data *snp_data;
>   	int err;
>   
>   	data = per_cpu(runtime_data, cpu);
> @@ -1405,6 +1584,19 @@ static void __init init_ghcb(int cpu)
>   
>   	memset(&data->ghcb_page, 0, sizeof(data->ghcb_page));
>   
> +	snp_data = memblock_alloc(sizeof(*snp_data), PAGE_SIZE);
> +	if (!snp_data)
> +		panic("Can't allocate SEV-SNP runtime data");
> +
> +	err = early_set_memory_decrypted((unsigned long)&snp_data->hv_doorbell_page,
> +					 sizeof(snp_data->hv_doorbell_page));
> +	if (err)
> +		panic("Can't map #HV doorbell pages unencrypted");
> +
> +	memset(&snp_data->hv_doorbell_page, 0, sizeof(snp_data->hv_doorbell_page));
> +
> +	per_cpu(snp_runtime_data, cpu) = snp_data;
> +
>   	data->ghcb_active = false;
>   	data->backup_ghcb_active = false;
>   }
> @@ -2045,7 +2237,12 @@ DEFINE_IDTENTRY_VC_USER(exc_vmm_communication)
>   
>   static bool hv_raw_handle_exception(struct pt_regs *regs)
>   {
> -	return false;
> +	/* Clear the no_further_signal bit */
> +	sev_snp_current_doorbell_page()->pending_events.events &= 0x7fff;
> +
> +	check_hv_pending(regs);
> +
> +	return true;
>   }
>   
>   static __always_inline bool on_hv_fallback_stack(struct pt_regs *regs)
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index d29debec8134..1aa6cab2394b 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -1503,5 +1503,7 @@ void __init trap_init(void)
>   	cpu_init_exception_handling();
>   	/* Setup traps as cpu_init() might #GP */
>   	idt_setup_traps();
> +	sev_snp_init_hv_handling();
> +
>   	cpu_init();
>   }



* Re: [RFC PATCH V3 14/16] x86/sev: Initialize #HV doorbell and handle interrupt requests
  2023-01-22  2:46 ` [RFC PATCH V3 14/16] x86/sev: Initialize #HV doorbell and handle interrupt requests Tianyu Lan
  2023-02-16 14:46   ` Gupta, Pankaj
@ 2023-02-17 12:45   ` Gupta, Pankaj
  2023-03-01 19:34   ` Gupta, Pankaj
  2 siblings, 0 replies; 60+ messages in thread
From: Gupta, Pankaj @ 2023-02-17 12:45 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch


> Enable #HV exception to handle interrupt requests from hypervisor.
> 
> Co-developed-by: Lendacky Thomas <thomas.lendacky@amd.com>
> Co-developed-by: Kalra Ashish <ashish.kalra@amd.com>
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
>   arch/x86/include/asm/mem_encrypt.h |   2 +
>   arch/x86/include/asm/msr-index.h   |   6 +
>   arch/x86/include/asm/svm.h         |  12 +-
>   arch/x86/include/uapi/asm/svm.h    |   4 +
>   arch/x86/kernel/sev.c              | 307 +++++++++++++++++++++++------
>   arch/x86/kernel/traps.c            |   2 +
>   6 files changed, 272 insertions(+), 61 deletions(-)
> 
> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
> index 72ca90552b6a..7264ca5f5b2d 100644
> --- a/arch/x86/include/asm/mem_encrypt.h
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -50,6 +50,7 @@ void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages,
>   void __init mem_encrypt_free_decrypted_mem(void);
>   
>   void __init sev_es_init_vc_handling(void);
> +void __init sev_snp_init_hv_handling(void);
>   
>   #define __bss_decrypted __section(".bss..decrypted")
>   
> @@ -72,6 +73,7 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { }
>   static inline void __init sme_enable(struct boot_params *bp) { }
>   
>   static inline void sev_es_init_vc_handling(void) { }
> +static inline void sev_snp_init_hv_handling(void) { }
>   
>   static inline int __init
>   early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; }
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 6a6e70e792a4..70af0ce5f2c4 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -562,11 +562,17 @@
>   #define MSR_AMD64_SEV_ENABLED_BIT	0
>   #define MSR_AMD64_SEV_ES_ENABLED_BIT	1
>   #define MSR_AMD64_SEV_SNP_ENABLED_BIT	2
> +#define MSR_AMD64_SEV_REFLECTVC_ENABLED_BIT		4
> +#define MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED_BIT	5
> +#define MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED_BIT	6

These are already committed as part of:
8c29f0165405 ("x86/sev: Add SEV-SNP guest feature negotiation support")

Thanks,
Pankaj
>   #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
>   #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
>   #define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
>   #define MSR_AMD64_SNP_VTOM_ENABLED	BIT_ULL(3)
>   
> +#define MSR_AMD64_SEV_REFLECTVC_ENABLED			BIT_ULL(MSR_AMD64_SEV_REFLECTVC_ENABLED_BIT)
> +#define MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED	BIT_ULL(MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED_BIT)
> +#define MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED	BIT_ULL(MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED_BIT)
>   #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
>   
>   /* AMD Collaborative Processor Performance Control MSRs */
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index f8b321a11ee4..911c991fec78 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -568,12 +568,12 @@ static inline void __unused_size_checks(void)
>   
>   	/* Check offsets of reserved fields */
>   
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xa0);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xcc);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xd8);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x180);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x248);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x298);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xa0);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xcc);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xd8);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x180);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x248);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x298);
>   
>   	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0xc8);
>   	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0xcc);
> diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
> index f69c168391aa..85d6882262e7 100644
> --- a/arch/x86/include/uapi/asm/svm.h
> +++ b/arch/x86/include/uapi/asm/svm.h
> @@ -115,6 +115,10 @@
>   #define SVM_VMGEXIT_AP_CREATE_ON_INIT		0
>   #define SVM_VMGEXIT_AP_CREATE			1
>   #define SVM_VMGEXIT_AP_DESTROY			2
> +#define SVM_VMGEXIT_HV_DOORBELL_PAGE		0x80000014
> +#define SVM_VMGEXIT_GET_PREFERRED_HV_DOORBELL_PAGE	0
> +#define SVM_VMGEXIT_SET_HV_DOORBELL_PAGE		1
> +#define SVM_VMGEXIT_QUERY_HV_DOORBELL_PAGE		2
>   #define SVM_VMGEXIT_HV_FEATURES			0x8000fffd
>   #define SVM_VMGEXIT_UNSUPPORTED_EVENT		0x8000ffff
>   
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index fe5e5e41433d..03d99fad9e76 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -122,6 +122,150 @@ struct sev_config {
>   
>   static struct sev_config sev_cfg __read_mostly;
>   
> +static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state);
> +static noinstr void __sev_put_ghcb(struct ghcb_state *state);
> +static int vmgexit_hv_doorbell_page(struct ghcb *ghcb, u64 op, u64 pa);
> +static void sev_snp_setup_hv_doorbell_page(struct ghcb *ghcb);
> +
> +union hv_pending_events {
> +	u16 events;
> +	struct {
> +		u8 vector;
> +		u8 nmi : 1;
> +		u8 mc : 1;
> +		u8 reserved1 : 5;
> +		u8 no_further_signal : 1;
> +	};
> +};
> +
> +struct sev_hv_doorbell_page {
> +	union hv_pending_events pending_events;
> +	u8 no_eoi_required;
> +	u8 reserved2[61];
> +	u8 padding[4032];
> +};
> +
> +struct sev_snp_runtime_data {
> +	struct sev_hv_doorbell_page hv_doorbell_page;
> +};
> +
> +static DEFINE_PER_CPU(struct sev_snp_runtime_data*, snp_runtime_data);
> +
> +static inline u64 sev_es_rd_ghcb_msr(void)
> +{
> +	return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
> +}
> +
> +static __always_inline void sev_es_wr_ghcb_msr(u64 val)
> +{
> +	u32 low, high;
> +
> +	low  = (u32)(val);
> +	high = (u32)(val >> 32);
> +
> +	native_wrmsr(MSR_AMD64_SEV_ES_GHCB, low, high);
> +}
> +
> +struct sev_hv_doorbell_page *sev_snp_current_doorbell_page(void)
> +{
> +	return &this_cpu_read(snp_runtime_data)->hv_doorbell_page;
> +}
> +
> +static u8 sev_hv_pending(void)
> +{
> +	return sev_snp_current_doorbell_page()->pending_events.events;
> +}
> +
> +static void hv_doorbell_apic_eoi_write(u32 reg, u32 val)
> +{
> +	if (xchg(&sev_snp_current_doorbell_page()->no_eoi_required, 0) & 0x1)
> +		return;
> +
> +	BUG_ON(reg != APIC_EOI);
> +	apic->write(reg, val);
> +}
> +
> +static void do_exc_hv(struct pt_regs *regs)
> +{
> +	union hv_pending_events pending_events;
> +	u8 vector;
> +
> +	while (sev_hv_pending()) {
> +		pending_events.events = xchg(
> +			&sev_snp_current_doorbell_page()->pending_events.events,
> +			0);
> +
> +		if (pending_events.nmi)
> +			exc_nmi(regs);
> +
> +#ifdef CONFIG_X86_MCE
> +		if (pending_events.mc)
> +			exc_machine_check(regs);
> +#endif
> +
> +		if (!pending_events.vector)
> +			return;
> +
> +		if (pending_events.vector < FIRST_EXTERNAL_VECTOR) {
> +			/* Exception vectors */
> +			WARN(1, "exception shouldn't happen\n");
> +		} else if (pending_events.vector == FIRST_EXTERNAL_VECTOR) {
> +			sysvec_irq_move_cleanup(regs);
> +		} else if (pending_events.vector == IA32_SYSCALL_VECTOR) {
> +			WARN(1, "syscall shouldn't happen\n");
> +		} else if (pending_events.vector >= FIRST_SYSTEM_VECTOR) {
> +			switch (pending_events.vector) {
> +#if IS_ENABLED(CONFIG_HYPERV)
> +			case HYPERV_STIMER0_VECTOR:
> +				sysvec_hyperv_stimer0(regs);
> +				break;
> +			case HYPERVISOR_CALLBACK_VECTOR:
> +				sysvec_hyperv_callback(regs);
> +				break;
> +#endif
> +#ifdef CONFIG_SMP
> +			case RESCHEDULE_VECTOR:
> +				sysvec_reschedule_ipi(regs);
> +				break;
> +			case IRQ_MOVE_CLEANUP_VECTOR:
> +				sysvec_irq_move_cleanup(regs);
> +				break;
> +			case REBOOT_VECTOR:
> +				sysvec_reboot(regs);
> +				break;
> +			case CALL_FUNCTION_SINGLE_VECTOR:
> +				sysvec_call_function_single(regs);
> +				break;
> +			case CALL_FUNCTION_VECTOR:
> +				sysvec_call_function(regs);
> +				break;
> +#endif
> +#ifdef CONFIG_X86_LOCAL_APIC
> +			case ERROR_APIC_VECTOR:
> +				sysvec_error_interrupt(regs);
> +				break;
> +			case SPURIOUS_APIC_VECTOR:
> +				sysvec_spurious_apic_interrupt(regs);
> +				break;
> +			case LOCAL_TIMER_VECTOR:
> +				sysvec_apic_timer_interrupt(regs);
> +				break;
> +			case X86_PLATFORM_IPI_VECTOR:
> +				sysvec_x86_platform_ipi(regs);
> +				break;
> +#endif
> +			case 0x0:
> +				break;
> +			default:
> +				panic("Unexpected vector %d\n", vector);
> +				unreachable();
> +			}
> +		} else {
> +			common_interrupt(regs, pending_events.vector);
> +		}
> +	}
> +}
> +
>   static __always_inline bool on_vc_stack(struct pt_regs *regs)
>   {
>   	unsigned long sp = regs->sp;
> @@ -179,11 +323,6 @@ void noinstr __sev_es_ist_enter(struct pt_regs *regs)
>   	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], new_ist);
>   }
>   
> -static void do_exc_hv(struct pt_regs *regs)
> -{
> -	/* Handle #HV exception. */
> -}
> -
>   void check_hv_pending(struct pt_regs *regs)
>   {
>   	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> @@ -232,68 +371,38 @@ void noinstr __sev_es_ist_exit(void)
>   	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], *(unsigned long *)ist);
>   }
>   
> -/*
> - * Nothing shall interrupt this code path while holding the per-CPU
> - * GHCB. The backup GHCB is only for NMIs interrupting this path.
> - *
> - * Callers must disable local interrupts around it.
> - */
> -static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
> +static bool sev_restricted_injection_enabled(void)
> +{
> +	return sev_status & MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED;
> +}
> +
> +void __init sev_snp_init_hv_handling(void)
>   {
> +	struct sev_snp_runtime_data *snp_data;
>   	struct sev_es_runtime_data *data;
> +	struct ghcb_state state;
>   	struct ghcb *ghcb;
> +	unsigned long flags;
> +	int cpu;
> +	int err;
>   
>   	WARN_ON(!irqs_disabled());
> +	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP) || !sev_restricted_injection_enabled())
> +		return;
>   
>   	data = this_cpu_read(runtime_data);
> -	ghcb = &data->ghcb_page;
>   
> -	if (unlikely(data->ghcb_active)) {
> -		/* GHCB is already in use - save its contents */
> -
> -		if (unlikely(data->backup_ghcb_active)) {
> -			/*
> -			 * Backup-GHCB is also already in use. There is no way
> -			 * to continue here so just kill the machine. To make
> -			 * panic() work, mark GHCBs inactive so that messages
> -			 * can be printed out.
> -			 */
> -			data->ghcb_active        = false;
> -			data->backup_ghcb_active = false;
> -
> -			instrumentation_begin();
> -			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
> -			instrumentation_end();
> -		}
> -
> -		/* Mark backup_ghcb active before writing to it */
> -		data->backup_ghcb_active = true;
> -
> -		state->ghcb = &data->backup_ghcb;
> +	local_irq_save(flags);
>   
> -		/* Backup GHCB content */
> -		*state->ghcb = *ghcb;
> -	} else {
> -		state->ghcb = NULL;
> -		data->ghcb_active = true;
> -	}
> +	ghcb = __sev_get_ghcb(&state);
>   
> -	return ghcb;
> -}
> +	sev_snp_setup_hv_doorbell_page(ghcb);
>   
> -static inline u64 sev_es_rd_ghcb_msr(void)
> -{
> -	return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
> -}
> -
> -static __always_inline void sev_es_wr_ghcb_msr(u64 val)
> -{
> -	u32 low, high;
> +	__sev_put_ghcb(&state);
>   
> -	low  = (u32)(val);
> -	high = (u32)(val >> 32);
> +	apic_set_eoi_write(hv_doorbell_apic_eoi_write);
>   
> -	native_wrmsr(MSR_AMD64_SEV_ES_GHCB, low, high);
> +	local_irq_restore(flags);
>   }
>   
>   static int vc_fetch_insn_kernel(struct es_em_ctxt *ctxt,
> @@ -554,6 +663,69 @@ static enum es_result vc_slow_virt_to_phys(struct ghcb *ghcb, struct es_em_ctxt
>   /* Include code shared with pre-decompression boot stage */
>   #include "sev-shared.c"
>   
> +/*
> + * Nothing shall interrupt this code path while holding the per-CPU
> + * GHCB. The backup GHCB is only for NMIs interrupting this path.
> + *
> + * Callers must disable local interrupts around it.
> + */
> +static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
> +{
> +	struct sev_es_runtime_data *data;
> +	struct ghcb *ghcb;
> +
> +	WARN_ON(!irqs_disabled());
> +
> +	data = this_cpu_read(runtime_data);
> +	ghcb = &data->ghcb_page;
> +
> +	if (unlikely(data->ghcb_active)) {
> +		/* GHCB is already in use - save its contents */
> +
> +		if (unlikely(data->backup_ghcb_active)) {
> +			/*
> +			 * Backup-GHCB is also already in use. There is no way
> +			 * to continue here so just kill the machine. To make
> +			 * panic() work, mark GHCBs inactive so that messages
> +			 * can be printed out.
> +			 */
> +			data->ghcb_active        = false;
> +			data->backup_ghcb_active = false;
> +
> +			instrumentation_begin();
> +			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
> +			instrumentation_end();
> +		}
> +
> +		/* Mark backup_ghcb active before writing to it */
> +		data->backup_ghcb_active = true;
> +
> +		state->ghcb = &data->backup_ghcb;
> +
> +		/* Backup GHCB content */
> +		*state->ghcb = *ghcb;
> +	} else {
> +		state->ghcb = NULL;
> +		data->ghcb_active = true;
> +	}
> +
> +	return ghcb;
> +}
> +
> +static void sev_snp_setup_hv_doorbell_page(struct ghcb *ghcb)
> +{
> +	u64 pa;
> +	enum es_result ret;
> +
> +	pa = __pa(sev_snp_current_doorbell_page());
> +	vc_ghcb_invalidate(ghcb);
> +	ret = vmgexit_hv_doorbell_page(ghcb,
> +				       SVM_VMGEXIT_SET_HV_DOORBELL_PAGE,
> +				       pa);
> +	if (ret != ES_OK)
> +		panic("SEV-SNP: failed to set up #HV doorbell page");
> +}
> +
>   static noinstr void __sev_put_ghcb(struct ghcb_state *state)
>   {
>   	struct sev_es_runtime_data *data;
> @@ -1282,6 +1454,7 @@ static void snp_register_per_cpu_ghcb(void)
>   	ghcb = &data->ghcb_page;
>   
>   	snp_register_ghcb_early(__pa(ghcb));
> +	sev_snp_setup_hv_doorbell_page(ghcb);
>   }
>   
>   void setup_ghcb(void)
> @@ -1321,6 +1494,11 @@ void setup_ghcb(void)
>   		snp_register_ghcb_early(__pa(&boot_ghcb_page));
>   }
>   
> +int vmgexit_hv_doorbell_page(struct ghcb *ghcb, u64 op, u64 pa)
> +{
> +	return sev_es_ghcb_hv_call(ghcb, NULL, SVM_VMGEXIT_HV_DOORBELL_PAGE, op, pa);
> +}
> +
>   #ifdef CONFIG_HOTPLUG_CPU
>   static void sev_es_ap_hlt_loop(void)
>   {
> @@ -1394,6 +1572,7 @@ static void __init alloc_runtime_data(int cpu)
>   static void __init init_ghcb(int cpu)
>   {
>   	struct sev_es_runtime_data *data;
> +	struct sev_snp_runtime_data *snp_data;
>   	int err;
>   
>   	data = per_cpu(runtime_data, cpu);
> @@ -1405,6 +1584,19 @@ static void __init init_ghcb(int cpu)
>   
>   	memset(&data->ghcb_page, 0, sizeof(data->ghcb_page));
>   
> +	snp_data = memblock_alloc(sizeof(*snp_data), PAGE_SIZE);
> +	if (!snp_data)
> +		panic("Can't allocate SEV-SNP runtime data");
> +
> +	err = early_set_memory_decrypted((unsigned long)&snp_data->hv_doorbell_page,
> +					 sizeof(snp_data->hv_doorbell_page));
> +	if (err)
> +		panic("Can't map #HV doorbell pages unencrypted");
> +
> +	memset(&snp_data->hv_doorbell_page, 0, sizeof(snp_data->hv_doorbell_page));
> +
> +	per_cpu(snp_runtime_data, cpu) = snp_data;
> +
>   	data->ghcb_active = false;
>   	data->backup_ghcb_active = false;
>   }
> @@ -2045,7 +2237,12 @@ DEFINE_IDTENTRY_VC_USER(exc_vmm_communication)
>   
>   static bool hv_raw_handle_exception(struct pt_regs *regs)
>   {
> -	return false;
> +	/* Clear the no_further_signal bit */
> +	sev_snp_current_doorbell_page()->pending_events.events &= 0x7fff;
> +
> +	check_hv_pending(regs);
> +
> +	return true;
>   }
>   
>   static __always_inline bool on_hv_fallback_stack(struct pt_regs *regs)
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index d29debec8134..1aa6cab2394b 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -1503,5 +1503,7 @@ void __init trap_init(void)
>   	cpu_init_exception_handling();
>   	/* Setup traps as cpu_init() might #GP */
>   	idt_setup_traps();
> +	sev_snp_init_hv_handling();
> +
>   	cpu_init();
>   }


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv
  2023-02-09 11:36 ` Gupta, Pankaj
@ 2023-02-17 12:47   ` Gupta, Pankaj
  2023-02-18  7:15     ` Tianyu Lan
  2023-03-10 15:35     ` Gupta, Pankaj
  0 siblings, 2 replies; 60+ messages in thread
From: Gupta, Pankaj @ 2023-02-17 12:47 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 2/9/2023 12:36 PM, Gupta, Pankaj wrote:
> Hi Tianyu,
> 
>> This patchset is to add AMD sev-snp enlightened guest
>> support on hyperv. Hyperv uses Linux direct boot mode
>> to boot up Linux kernel and so it needs to pvalidate
>> system memory by itself.
>>
>> In hyperv case, there is no boot loader and so cc blob
>> is prepared by hypervisor. In this series, hypervisor
>> set the cc blob address directly into boot parameter
>> of Linux kernel. If the magic number on cc blob address
>> is valid, kernel will read cc blob.
>>
>> Shared memory between guests and hypervisor should be
>> decrypted and zero memory after decrypt memory. The data
>> in the target address. It maybe smearedto avoid smearing
>> data.
>>
>> Introduce #HV exception support in AMD sev snp code and
>> #HV handler.
> 
> I am interested to test the Linux guest #HV exception handling (patches 
> 12-16 in this series) for the restricted interrupt injection with the 
> Linux/KVM host.
> 
> Do you have a git tree which or any base commit on which
> I can use to apply these patches?

Never mind. I was able to apply patches 12-16 on master (with a minor 
tweak to patch 14). I will try testing them now.

Thanks,
Pankaj


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv
  2023-02-17 12:47   ` Gupta, Pankaj
@ 2023-02-18  7:15     ` Tianyu Lan
  2023-03-10 15:35     ` Gupta, Pankaj
  1 sibling, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-02-18  7:15 UTC (permalink / raw)
  To: Gupta, Pankaj, luto, tglx, mingo, bp, dave.hansen, x86, hpa,
	seanjc, pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 2/17/2023 8:47 PM, Gupta, Pankaj wrote:
> On 2/9/2023 12:36 PM, Gupta, Pankaj wrote:
>> Hi Tianyu,
>>
>>> This patchset is to add AMD sev-snp enlightened guest
>>> support on hyperv. Hyperv uses Linux direct boot mode
>>> to boot up Linux kernel and so it needs to pvalidate
>>> system memory by itself.
>>>
>>> In hyperv case, there is no boot loader and so cc blob
>>> is prepared by hypervisor. In this series, hypervisor
>>> set the cc blob address directly into boot parameter
>>> of Linux kernel. If the magic number on cc blob address
>>> is valid, kernel will read cc blob.
>>>
>>> Shared memory between guests and hypervisor should be
>>> decrypted and zero memory after decrypt memory. The data
>>> in the target address. It maybe smearedto avoid smearing
>>> data.
>>>
>>> Introduce #HV exception support in AMD sev snp code and
>>> #HV handler.
>>
>> I am interested to test the Linux guest #HV exception handling 
>> (patches 12-16 in this series) for the restricted interrupt injection 
>> with the Linux/KVM host.
>>
>> Do you have a git tree which or any base commit on which
>> I can use to apply these patches?
> 
> Never mind. I could apply the patches 12-16 on master (except minor 
> tweak in patch 14). Now, will try to test.
> 

Hi Pankaj:
	Sorry, I missed your first mail. Please let me know if you hit any 
issues on the KVM side. Thanks in advance.

^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 16/16] x86/sev: Fix interrupt exit code paths from #HV exception
  2023-01-22  2:46 ` [RFC PATCH V3 16/16] x86/sev: Fix interrupt exit code paths " Tianyu Lan
  2023-02-02 23:20   ` Zhi Wang
@ 2023-02-21 16:44   ` Gupta, Pankaj
  2023-03-10 16:02     ` Tianyu Lan
  1 sibling, 1 reply; 60+ messages in thread
From: Gupta, Pankaj @ 2023-02-21 16:44 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 1/22/2023 3:46 AM, Tianyu Lan wrote:
> From: Ashish Kalra <ashish.kalra@amd.com>
> 
> Add checks in interrupt exit code paths in case of returns
> to user mode to check if currently executing the #HV handler
> then don't follow the irqentry_exit_to_user_mode path as
> that can potentially cause the #HV handler to be
> preempted and rescheduled on another CPU. Rescheduled #HV
> handler on another cpu will cause interrupts to be handled
> on a different cpu than the injected one, causing
> invalid EOIs and missed/lost guest interrupts and
> corresponding hangs and/or per-cpu IRQs handled on
> non-intended cpu.
> 
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> ---
>   arch/x86/include/asm/idtentry.h | 66 +++++++++++++++++++++++++++++++++
>   arch/x86/kernel/sev.c           | 30 +++++++++++++++
>   2 files changed, 96 insertions(+)
> 
> diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
> index 652fea10d377..45b47132be7c 100644
> --- a/arch/x86/include/asm/idtentry.h
> +++ b/arch/x86/include/asm/idtentry.h
> @@ -13,6 +13,10 @@
>   
>   #include <asm/irq_stack.h>
>   
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +noinstr void irqentry_exit_hv_cond(struct pt_regs *regs, irqentry_state_t state);
> +#endif
> +
>   /**
>    * DECLARE_IDTENTRY - Declare functions for simple IDT entry points
>    *		      No error code pushed by hardware
> @@ -176,6 +180,7 @@ __visible noinstr void func(struct pt_regs *regs, unsigned long error_code)
>   #define DECLARE_IDTENTRY_IRQ(vector, func)				\
>   	DECLARE_IDTENTRY_ERRORCODE(vector, func)
>   
> +#ifndef CONFIG_AMD_MEM_ENCRYPT
>   /**
>    * DEFINE_IDTENTRY_IRQ - Emit code for device interrupt IDT entry points
>    * @func:	Function name of the entry point
> @@ -205,6 +210,26 @@ __visible noinstr void func(struct pt_regs *regs,			\
>   }									\
>   									\
>   static noinline void __##func(struct pt_regs *regs, u32 vector)
> +#else
> +
> +#define DEFINE_IDTENTRY_IRQ(func)					\
> +static void __##func(struct pt_regs *regs, u32 vector);		\
> +									\
> +__visible noinstr void func(struct pt_regs *regs,			\
> +			    unsigned long error_code)			\
> +{									\
> +	irqentry_state_t state = irqentry_enter(regs);			\
> +	u32 vector = (u32)(u8)error_code;				\
> +									\
> +	instrumentation_begin();					\
> +	kvm_set_cpu_l1tf_flush_l1d();					\
> +	run_irq_on_irqstack_cond(__##func, regs, vector);		\
> +	instrumentation_end();						\
> +	irqentry_exit_hv_cond(regs, state);				\
> +}									\
> +									\
> +static noinline void __##func(struct pt_regs *regs, u32 vector)
> +#endif
>   
>   /**
>    * DECLARE_IDTENTRY_SYSVEC - Declare functions for system vector entry points
> @@ -221,6 +246,7 @@ static noinline void __##func(struct pt_regs *regs, u32 vector)
>   #define DECLARE_IDTENTRY_SYSVEC(vector, func)				\
>   	DECLARE_IDTENTRY(vector, func)
>   
> +#ifndef CONFIG_AMD_MEM_ENCRYPT
>   /**
>    * DEFINE_IDTENTRY_SYSVEC - Emit code for system vector IDT entry points
>    * @func:	Function name of the entry point
> @@ -245,6 +271,26 @@ __visible noinstr void func(struct pt_regs *regs)			\
>   }									\
>   									\
>   static noinline void __##func(struct pt_regs *regs)
> +#else
> +
> +#define DEFINE_IDTENTRY_SYSVEC(func)					\
> +static void __##func(struct pt_regs *regs);				\
> +									\
> +__visible noinstr void func(struct pt_regs *regs)			\
> +{									\
> +	irqentry_state_t state = irqentry_enter(regs);			\
> +									\
> +	instrumentation_begin();					\
> +	kvm_set_cpu_l1tf_flush_l1d();					\
> +	run_sysvec_on_irqstack_cond(__##func, regs);			\
> +	instrumentation_end();						\
> +	irqentry_exit_hv_cond(regs, state);				\
> +}									\
> +									\
> +static noinline void __##func(struct pt_regs *regs)
> +#endif
> +
> +#ifndef CONFIG_AMD_MEM_ENCRYPT
>   
>   /**
>    * DEFINE_IDTENTRY_SYSVEC_SIMPLE - Emit code for simple system vector IDT
> @@ -274,6 +320,26 @@ __visible noinstr void func(struct pt_regs *regs)			\
>   }									\
>   									\
>   static __always_inline void __##func(struct pt_regs *regs)
> +#else
> +
> +#define DEFINE_IDTENTRY_SYSVEC_SIMPLE(func)				\
> +static __always_inline void __##func(struct pt_regs *regs);		\
> +									\
> +__visible noinstr void func(struct pt_regs *regs)			\
> +{									\
> +	irqentry_state_t state = irqentry_enter(regs);			\
> +									\
> +	instrumentation_begin();					\
> +	__irq_enter_raw();						\
> +	kvm_set_cpu_l1tf_flush_l1d();					\
> +	__##func(regs);						\
> +	__irq_exit_raw();						\
> +	instrumentation_end();						\
> +	irqentry_exit_hv_cond(regs, state);				\
> +}									\
> +									\
> +static __always_inline void __##func(struct pt_regs *regs)
> +#endif
>   
>   /**
>    * DECLARE_IDTENTRY_XENCB - Declare functions for XEN HV callback entry point
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index b1a98c2a52f8..23f15e95838b 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -147,6 +147,10 @@ struct sev_hv_doorbell_page {
>   
>   struct sev_snp_runtime_data {
>   	struct sev_hv_doorbell_page hv_doorbell_page;
> +	/*
> +	 * Indication that we are currently handling #HV events.
> +	 */
> +	bool hv_handling_events;
>   };
>   
>   static DEFINE_PER_CPU(struct sev_snp_runtime_data*, snp_runtime_data);
> @@ -200,6 +204,8 @@ static void do_exc_hv(struct pt_regs *regs)
>   	union hv_pending_events pending_events;
>   	u8 vector;
>   
> +	this_cpu_read(snp_runtime_data)->hv_handling_events = true;
> +
>   	while (sev_hv_pending()) {
>   		pending_events.events = xchg(
>   			&sev_snp_current_doorbell_page()->pending_events.events,
> @@ -234,6 +240,8 @@ static void do_exc_hv(struct pt_regs *regs)
>   			common_interrupt(regs, pending_events.vector);
>   		}
>   	}
> +
> +	this_cpu_read(snp_runtime_data)->hv_handling_events = false;
>   }
>   
>   static __always_inline bool on_vc_stack(struct pt_regs *regs)
> @@ -2529,3 +2537,25 @@ static int __init snp_init_platform_device(void)
>   	return 0;
>   }
>   device_initcall(snp_init_platform_device);
> +
> +noinstr void irqentry_exit_hv_cond(struct pt_regs *regs, irqentry_state_t state)
> +{

This code path is called even for guests without SNP. I ran a SEV
guest and it crashed in this code path. Adding the check-and-return
below let the (non-SNP) guest boot, though with some call traces. This
branch needs to be avoided for non-SNP guests, and for the host as well.

Thanks,
Pankaj

+++ b/arch/x86/kernel/sev.c
@@ -2540,6 +2540,9 @@ device_initcall(snp_init_platform_device);

  noinstr void irqentry_exit_hv_cond(struct pt_regs *regs, irqentry_state_t state)
  {
+
+       if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
+               return;
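
For illustration only, here is a small user-space model of the guarded
exit decision. It is a sketch, not kernel code: struct snp_runtime_model,
enum exit_path and exit_hv_cond_model() are hypothetical names, is_snp
stands in for cc_platform_has(CC_ATTR_GUEST_SEV_SNP), and data stands in
for this_cpu_read(snp_runtime_data). It assumes (not confirmed in this
thread) that the per-CPU pointer may be NULL on a non-SNP guest, and that
a non-SNP guest should still take the normal irqentry_exit() path rather
than return early.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sev_snp_runtime_data. */
struct snp_runtime_model {
	bool hv_handling_events;
};

enum exit_path {
	EXIT_NORMAL,	/* would call irqentry_exit(regs, state) */
	EXIT_SKIP_USER,	/* #HV handler active: skip the exit-to-user work */
};

/*
 * Model of irqentry_exit_hv_cond() with a guard in front of the
 * per-CPU dereference. Assumption: without SNP (or without allocated
 * runtime data) the normal exit path is the intended behaviour.
 */
static enum exit_path exit_hv_cond_model(bool is_snp, bool user_mode,
					 const struct snp_runtime_model *data)
{
	/* Never touch data unless this is an SNP guest and data exists. */
	if (!is_snp || data == NULL)
		return EXIT_NORMAL;

	/* Returning to user mode from inside the #HV handler. */
	if (user_mode && data->hv_handling_events)
		return EXIT_SKIP_USER;

	return EXIT_NORMAL;
}
```

In this model a non-SNP guest always ends up on EXIT_NORMAL, whereas the
quick check above returns without calling irqentry_exit() at all, which
might be related to the remaining call traces.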

> +	/*
> +	 * Check whether this returns to user mode, if so and if
> +	 * we are currently executing the #HV handler then we don't
> +	 * want to follow the irqentry_exit_to_user_mode path as
> +	 * that can potentially cause the #HV handler to be
> +	 * preempted and rescheduled on another CPU. Rescheduled #HV
> +	 * handler on another cpu will cause interrupts to be handled
> +	 * on a different cpu than the injected one, causing
> +	 * invalid EOIs and missed/lost guest interrupts and
> +	 * corresponding hangs and/or per-cpu IRQs handled on
> +	 * non-intended cpu.
> +	 */
> +	if (user_mode(regs) &&
> +	    this_cpu_read(snp_runtime_data)->hv_handling_events)
> +		return;
> +
> +	/* follow normal interrupt return/exit path */
> +	irqentry_exit(regs, state);
> +}


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 13/16] x86/sev: Add Check of #HV event in path
  2023-01-22  2:46 ` [RFC PATCH V3 13/16] x86/sev: Add Check of #HV event in path Tianyu Lan
@ 2023-03-01 11:11   ` Gupta, Pankaj
  2023-03-08 16:18     ` Gupta, Pankaj
  0 siblings, 1 reply; 60+ messages in thread
From: Gupta, Pankaj @ 2023-03-01 11:11 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 1/22/2023 3:46 AM, Tianyu Lan wrote:
> From: Tianyu Lan <tiala@microsoft.com>
> 
> Add check_hv_pending() and check_hv_pending_after_irq() to
> check queued #HV event when irq is disabled.
> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
>   arch/x86/entry/entry_64.S       | 18 +++++++++++++++
>   arch/x86/include/asm/irqflags.h | 10 +++++++++
>   arch/x86/kernel/sev.c           | 39 +++++++++++++++++++++++++++++++++
>   3 files changed, 67 insertions(+)
> 
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 6baec7653f19..aec8dc4443d1 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -1064,6 +1064,15 @@ SYM_CODE_END(paranoid_entry)
>    * R15 - old SPEC_CTRL
>    */
>   SYM_CODE_START_LOCAL(paranoid_exit)
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +	/*
> +	 * If a #HV was delivered during execution and interrupts were
> +	 * disabled, then check if it can be handled before the iret
> +	 * (which may re-enable interrupts).
> +	 */
> +	mov     %rsp, %rdi
> +	call    check_hv_pending
> +#endif
>   	UNWIND_HINT_REGS
>   
>   	/*
> @@ -1188,6 +1197,15 @@ SYM_CODE_START(error_entry)
>   SYM_CODE_END(error_entry)
>   
>   SYM_CODE_START_LOCAL(error_return)
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +	/*
> +	 * If a #HV was delivered during execution and interrupts were
> +	 * disabled, then check if it can be handled before the iret
> +	 * (which may re-enable interrupts).
> +	 */
> +	mov     %rsp, %rdi
> +	call    check_hv_pending
> +#endif
>   	UNWIND_HINT_REGS
>   	DEBUG_ENTRY_ASSERT_IRQS_OFF
>   	testb	$3, CS(%rsp)
> diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
> index 7793e52d6237..fe46e59168dd 100644
> --- a/arch/x86/include/asm/irqflags.h
> +++ b/arch/x86/include/asm/irqflags.h
> @@ -14,6 +14,10 @@
>   /*
>    * Interrupt control:
>    */
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +void check_hv_pending(struct pt_regs *regs);
> +void check_hv_pending_irq_enable(void);
> +#endif
>   
>   /* Declaration required for gcc < 4.9 to prevent -Werror=missing-prototypes */
>   extern inline unsigned long native_save_fl(void);
> @@ -43,12 +47,18 @@ static __always_inline void native_irq_disable(void)
>   static __always_inline void native_irq_enable(void)
>   {
>   	asm volatile("sti": : :"memory");
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +	check_hv_pending_irq_enable();
> +#endif
>   }
>   
>   static inline __cpuidle void native_safe_halt(void)
>   {
>   	mds_idle_clear_cpu_buffers();
>   	asm volatile("sti; hlt": : :"memory");
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +	check_hv_pending_irq_enable();
> +#endif
>   }
>   
>   static inline __cpuidle void native_halt(void)
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index a8862a2eff67..fe5e5e41433d 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -179,6 +179,45 @@ void noinstr __sev_es_ist_enter(struct pt_regs *regs)
>   	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], new_ist);
>   }
>   
> +static void do_exc_hv(struct pt_regs *regs)
> +{
> +	/* Handle #HV exception. */
> +}
> +
> +void check_hv_pending(struct pt_regs *regs)
> +{
> +	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> +		return;
> +
> +	if ((regs->flags & X86_EFLAGS_IF) == 0)
> +		return;

Will this early return prevent the guest from handling NMIs
while IRQs are disabled?
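
To make the concern concrete, here is a user-space sketch of the check.
The names (model_regs, check_hv_pending_model, MODEL_HV_NMI) are
hypothetical. The NMI bit position assumes the little-endian layout of
union hv_pending_events from patch 14 (vector in bits 0-7, nmi in bit 8),
and MODEL_EFLAGS_IF uses the same bit as X86_EFLAGS_IF. It only models
the early return, not the real dispatch in do_exc_hv().

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_EFLAGS_IF	(1UL << 9)	/* same bit as X86_EFLAGS_IF */
#define MODEL_HV_NMI	(1U << 8)	/* assumed position of the nmi bit */

/* Hypothetical stand-in for the part of struct pt_regs used here. */
struct model_regs {
	unsigned long flags;
};

/*
 * Mirrors the IF test in check_hv_pending(): when the interrupted
 * context had interrupts disabled, nothing is dispatched, including
 * a pending NMI. Returns true if the pending events were consumed.
 */
static bool check_hv_pending_model(const struct model_regs *regs,
				   uint16_t *pending)
{
	if ((regs->flags & MODEL_EFLAGS_IF) == 0)
		return false;	/* early return: events stay queued */

	if (*pending == 0)
		return false;	/* nothing to do */

	*pending = 0;		/* stand-in for do_exc_hv() draining events */
	return true;
}
```

With flags == 0 and MODEL_HV_NMI set, the model leaves the NMI bit
queued until a later call runs with IF set.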

Thanks,
Pankaj

> +
> +	do_exc_hv(regs);
> +}
> +
> +void check_hv_pending_irq_enable(void)
> +{
> +	unsigned long flags;
> +	struct pt_regs regs;
> +
> +	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> +		return;
> +
> +	memset(&regs, 0, sizeof(struct pt_regs));
> +	asm volatile("movl %%cs, %%eax;" : "=a" (regs.cs));
> +	asm volatile("movl %%ss, %%eax;" : "=a" (regs.ss));
> +	regs.orig_ax = 0xffffffff;
> +	regs.flags = native_save_fl();
> +
> +	/*
> +	 * Disable irq when handle pending #HV events after
> +	 * re-enabling irq.
> +	 */
> +	asm volatile("cli" : : : "memory");
> +	do_exc_hv(&regs);
> +	asm volatile("sti" : : : "memory");
> +}
> +
>   void noinstr __sev_es_ist_exit(void)
>   {
>   	unsigned long ist;


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [RFC PATCH V3 14/16] x86/sev: Initialize #HV doorbell and handle interrupt requests
  2023-01-22  2:46 ` [RFC PATCH V3 14/16] x86/sev: Initialize #HV doorbell and handle interrupt requests Tianyu Lan
  2023-02-16 14:46   ` Gupta, Pankaj
  2023-02-17 12:45   ` Gupta, Pankaj
@ 2023-03-01 19:34   ` Gupta, Pankaj
  2 siblings, 0 replies; 60+ messages in thread
From: Gupta, Pankaj @ 2023-03-01 19:34 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 1/22/2023 3:46 AM, Tianyu Lan wrote:
> From: Tianyu Lan <tiala@microsoft.com>
> 
> Enable #HV exception to handle interrupt requests from hypervisor.
> 
> Co-developed-by: Lendacky Thomas <thomas.lendacky@amd.com>
> Co-developed-by: Kalra Ashish <ashish.kalra@amd.com>
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
>   arch/x86/include/asm/mem_encrypt.h |   2 +
>   arch/x86/include/asm/msr-index.h   |   6 +
>   arch/x86/include/asm/svm.h         |  12 +-
>   arch/x86/include/uapi/asm/svm.h    |   4 +
>   arch/x86/kernel/sev.c              | 307 +++++++++++++++++++++++------
>   arch/x86/kernel/traps.c            |   2 +
>   6 files changed, 272 insertions(+), 61 deletions(-)
> 
> diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
> index 72ca90552b6a..7264ca5f5b2d 100644
> --- a/arch/x86/include/asm/mem_encrypt.h
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -50,6 +50,7 @@ void __init early_set_mem_enc_dec_hypercall(unsigned long vaddr, int npages,
>   void __init mem_encrypt_free_decrypted_mem(void);
>   
>   void __init sev_es_init_vc_handling(void);
> +void __init sev_snp_init_hv_handling(void);
>   
>   #define __bss_decrypted __section(".bss..decrypted")
>   
> @@ -72,6 +73,7 @@ static inline void __init sme_encrypt_kernel(struct boot_params *bp) { }
>   static inline void __init sme_enable(struct boot_params *bp) { }
>   
>   static inline void sev_es_init_vc_handling(void) { }
> +static inline void sev_snp_init_hv_handling(void) { }
>   
>   static inline int __init
>   early_set_memory_decrypted(unsigned long vaddr, unsigned long size) { return 0; }
> diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
> index 6a6e70e792a4..70af0ce5f2c4 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -562,11 +562,17 @@
>   #define MSR_AMD64_SEV_ENABLED_BIT	0
>   #define MSR_AMD64_SEV_ES_ENABLED_BIT	1
>   #define MSR_AMD64_SEV_SNP_ENABLED_BIT	2
> +#define MSR_AMD64_SEV_REFLECTVC_ENABLED_BIT		4
> +#define MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED_BIT	5
> +#define MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED_BIT	6
>   #define MSR_AMD64_SEV_ENABLED		BIT_ULL(MSR_AMD64_SEV_ENABLED_BIT)
>   #define MSR_AMD64_SEV_ES_ENABLED	BIT_ULL(MSR_AMD64_SEV_ES_ENABLED_BIT)
>   #define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
>   #define MSR_AMD64_SNP_VTOM_ENABLED	BIT_ULL(3)
>   
> +#define MSR_AMD64_SEV_REFLECTVC_ENABLED			BIT_ULL(MSR_AMD64_SEV_REFLECTVC_ENABLED_BIT)
> +#define MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED	BIT_ULL(MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED_BIT)
> +#define MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED	BIT_ULL(MSR_AMD64_SEV_ALTERNATE_INJECTION_ENABLED_BIT)
>   #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
>   
>   /* AMD Collaborative Processor Performance Control MSRs */
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index f8b321a11ee4..911c991fec78 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -568,12 +568,12 @@ static inline void __unused_size_checks(void)
>   
>   	/* Check offsets of reserved fields */
>   
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xa0);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xcc);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xd8);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x180);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x248);
> -	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x298);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xa0);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xcc);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0xd8);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x180);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x248);
> +//	BUILD_BUG_RESERVED_OFFSET(vmcb_save_area, 0x298);
>   
>   	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0xc8);
>   	BUILD_BUG_RESERVED_OFFSET(sev_es_save_area, 0xcc);
> diff --git a/arch/x86/include/uapi/asm/svm.h b/arch/x86/include/uapi/asm/svm.h
> index f69c168391aa..85d6882262e7 100644
> --- a/arch/x86/include/uapi/asm/svm.h
> +++ b/arch/x86/include/uapi/asm/svm.h
> @@ -115,6 +115,10 @@
>   #define SVM_VMGEXIT_AP_CREATE_ON_INIT		0
>   #define SVM_VMGEXIT_AP_CREATE			1
>   #define SVM_VMGEXIT_AP_DESTROY			2
> +#define SVM_VMGEXIT_HV_DOORBELL_PAGE		0x80000014
> +#define SVM_VMGEXIT_GET_PREFERRED_HV_DOORBELL_PAGE	0
> +#define SVM_VMGEXIT_SET_HV_DOORBELL_PAGE		1
> +#define SVM_VMGEXIT_QUERY_HV_DOORBELL_PAGE		2
>   #define SVM_VMGEXIT_HV_FEATURES			0x8000fffd
>   #define SVM_VMGEXIT_UNSUPPORTED_EVENT		0x8000ffff
>   
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index fe5e5e41433d..03d99fad9e76 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -122,6 +122,150 @@ struct sev_config {
>   
>   static struct sev_config sev_cfg __read_mostly;
>   
> +static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state);
> +static noinstr void __sev_put_ghcb(struct ghcb_state *state);
> +static int vmgexit_hv_doorbell_page(struct ghcb *ghcb, u64 op, u64 pa);
> +static void sev_snp_setup_hv_doorbell_page(struct ghcb *ghcb);
> +
> +union hv_pending_events {
> +	u16 events;
> +	struct {
> +		u8 vector;
> +		u8 nmi : 1;
> +		u8 mc : 1;
> +		u8 reserved1 : 5;
> +		u8 no_further_signal : 1;
> +	};
> +};
> +
> +struct sev_hv_doorbell_page {
> +	union hv_pending_events pending_events;
> +	u8 no_eoi_required;
> +	u8 reserved2[61];
> +	u8 padding[4032];
> +};
> +
> +struct sev_snp_runtime_data {
> +	struct sev_hv_doorbell_page hv_doorbell_page;
> +};
> +
> +static DEFINE_PER_CPU(struct sev_snp_runtime_data*, snp_runtime_data);
> +
> +static inline u64 sev_es_rd_ghcb_msr(void)
> +{
> +	return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
> +}
> +
> +static __always_inline void sev_es_wr_ghcb_msr(u64 val)
> +{
> +	u32 low, high;
> +
> +	low  = (u32)(val);
> +	high = (u32)(val >> 32);
> +
> +	native_wrmsr(MSR_AMD64_SEV_ES_GHCB, low, high);
> +}
> +
> +struct sev_hv_doorbell_page *sev_snp_current_doorbell_page(void)
> +{
> +	return &this_cpu_read(snp_runtime_data)->hv_doorbell_page;
> +}
> +
> +static u8 sev_hv_pending(void)
> +{
> +	return sev_snp_current_doorbell_page()->pending_events.events;
> +}
> +
> +static void hv_doorbell_apic_eoi_write(u32 reg, u32 val)
> +{
> +	if (xchg(&sev_snp_current_doorbell_page()->no_eoi_required, 0) & 0x1)
> +		return;
> +
> +	BUG_ON(reg != APIC_EOI);
> +	apic->write(reg, val);
> +}
> +
> +static void do_exc_hv(struct pt_regs *regs)
> +{
> +	union hv_pending_events pending_events;
> +	u8 vector;
> +
> +	while (sev_hv_pending()) {
> +		pending_events.events = xchg(
> +			&sev_snp_current_doorbell_page()->pending_events.events,
> +			0);
> +
> +		if (pending_events.nmi)
> +			exc_nmi(regs);
> +
> +#ifdef CONFIG_X86_MCE
> +		if (pending_events.mc)
> +			exc_machine_check(regs);
> +#endif
> +
> +		if (!pending_events.vector)
> +			return;
> +
> +		if (pending_events.vector < FIRST_EXTERNAL_VECTOR) {
> +			/* Exception vectors */
> +			WARN(1, "exception shouldn't happen\n");
> +		} else if (pending_events.vector == FIRST_EXTERNAL_VECTOR) {
> +			sysvec_irq_move_cleanup(regs);
> +		} else if (pending_events.vector == IA32_SYSCALL_VECTOR) {
> +			WARN(1, "syscall shouldn't happen\n");
> +		} else if (pending_events.vector >= FIRST_SYSTEM_VECTOR) {
> +			switch (pending_events.vector) {
> +#if IS_ENABLED(CONFIG_HYPERV)
> +			case HYPERV_STIMER0_VECTOR:
> +				sysvec_hyperv_stimer0(regs);
> +				break;
> +			case HYPERVISOR_CALLBACK_VECTOR:
> +				sysvec_hyperv_callback(regs);
> +				break;
> +#endif
> +#ifdef CONFIG_SMP
> +			case RESCHEDULE_VECTOR:
> +				sysvec_reschedule_ipi(regs);
> +				break;
> +			case IRQ_MOVE_CLEANUP_VECTOR:
> +				sysvec_irq_move_cleanup(regs);
> +				break;
> +			case REBOOT_VECTOR:
> +				sysvec_reboot(regs);
> +				break;
> +			case CALL_FUNCTION_SINGLE_VECTOR:
> +				sysvec_call_function_single(regs);
> +				break;
> +			case CALL_FUNCTION_VECTOR:
> +				sysvec_call_function(regs);
> +				break;
> +#endif
> +#ifdef CONFIG_X86_LOCAL_APIC
> +			case ERROR_APIC_VECTOR:
> +				sysvec_error_interrupt(regs);
> +				break;
> +			case SPURIOUS_APIC_VECTOR:
> +				sysvec_spurious_apic_interrupt(regs);
> +				break;
> +			case LOCAL_TIMER_VECTOR:
> +				sysvec_apic_timer_interrupt(regs);
> +				break;
> +			case X86_PLATFORM_IPI_VECTOR:
> +				sysvec_x86_platform_ipi(regs);
> +				break;
> +#endif
> +			case 0x0:
> +				break;
> +			default:
> +				panic("Unexpected vector %d\n", vector);
> +				unreachable();
> +			}
> +		} else {
> +			common_interrupt(regs, pending_events.vector);
> +		}
> +	}
> +}
> +
>   static __always_inline bool on_vc_stack(struct pt_regs *regs)
>   {
>   	unsigned long sp = regs->sp;
> @@ -179,11 +323,6 @@ void noinstr __sev_es_ist_enter(struct pt_regs *regs)
>   	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], new_ist);
>   }
>   
> -static void do_exc_hv(struct pt_regs *regs)
> -{
> -	/* Handle #HV exception. */
> -}
> -
>   void check_hv_pending(struct pt_regs *regs)
>   {
>   	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
> @@ -232,68 +371,38 @@ void noinstr __sev_es_ist_exit(void)
>   	this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], *(unsigned long *)ist);
>   }
>   
> -/*
> - * Nothing shall interrupt this code path while holding the per-CPU
> - * GHCB. The backup GHCB is only for NMIs interrupting this path.
> - *
> - * Callers must disable local interrupts around it.
> - */
> -static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
> +static bool sev_restricted_injection_enabled(void)
> +{
> +	return sev_status & MSR_AMD64_SEV_RESTRICTED_INJECTION_ENABLED;
> +}
> +
> +void __init sev_snp_init_hv_handling(void)
>   {
> +	struct sev_snp_runtime_data *snp_data;
>   	struct sev_es_runtime_data *data;
> +	struct ghcb_state state;
>   	struct ghcb *ghcb;
> +	unsigned long flags;
> +	int cpu;
> +	int err;
>   
>   	WARN_ON(!irqs_disabled());
> +	if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP) || !sev_restricted_injection_enabled())
> +		return;
>   
>   	data = this_cpu_read(runtime_data);
> -	ghcb = &data->ghcb_page;
>   
> -	if (unlikely(data->ghcb_active)) {
> -		/* GHCB is already in use - save its contents */
> -
> -		if (unlikely(data->backup_ghcb_active)) {
> -			/*
> -			 * Backup-GHCB is also already in use. There is no way
> -			 * to continue here so just kill the machine. To make
> -			 * panic() work, mark GHCBs inactive so that messages
> -			 * can be printed out.
> -			 */
> -			data->ghcb_active        = false;
> -			data->backup_ghcb_active = false;
> -
> -			instrumentation_begin();
> -			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
> -			instrumentation_end();
> -		}
> -
> -		/* Mark backup_ghcb active before writing to it */
> -		data->backup_ghcb_active = true;
> -
> -		state->ghcb = &data->backup_ghcb;
> +	local_irq_save(flags);
>   
> -		/* Backup GHCB content */
> -		*state->ghcb = *ghcb;
> -	} else {
> -		state->ghcb = NULL;
> -		data->ghcb_active = true;
> -	}
> +	ghcb = __sev_get_ghcb(&state);
>   
> -	return ghcb;
> -}
> +	sev_snp_setup_hv_doorbell_page(ghcb);
>   
> -static inline u64 sev_es_rd_ghcb_msr(void)
> -{
> -	return __rdmsr(MSR_AMD64_SEV_ES_GHCB);
> -}
> -
> -static __always_inline void sev_es_wr_ghcb_msr(u64 val)
> -{
> -	u32 low, high;
> +	__sev_put_ghcb(&state);
>   
> -	low  = (u32)(val);
> -	high = (u32)(val >> 32);
> +	apic_set_eoi_write(hv_doorbell_apic_eoi_write);
>   
> -	native_wrmsr(MSR_AMD64_SEV_ES_GHCB, low, high);
> +	local_irq_restore(flags);
>   }
>   
>   static int vc_fetch_insn_kernel(struct es_em_ctxt *ctxt,
> @@ -554,6 +663,69 @@ static enum es_result vc_slow_virt_to_phys(struct ghcb *ghcb, struct es_em_ctxt
>   /* Include code shared with pre-decompression boot stage */
>   #include "sev-shared.c"
>   
> +/*
> + * Nothing shall interrupt this code path while holding the per-CPU
> + * GHCB. The backup GHCB is only for NMIs interrupting this path.
> + *
> + * Callers must disable local interrupts around it.
> + */
> +static noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state)
> +{
> +	struct sev_es_runtime_data *data;
> +	struct ghcb *ghcb;
> +
> +	WARN_ON(!irqs_disabled());
> +
> +	data = this_cpu_read(runtime_data);
> +	ghcb = &data->ghcb_page;
> +
> +	if (unlikely(data->ghcb_active)) {
> +		/* GHCB is already in use - save its contents */
> +
> +		if (unlikely(data->backup_ghcb_active)) {
> +			/*
> +			 * Backup-GHCB is also already in use. There is no way
> +			 * to continue here so just kill the machine. To make
> +			 * panic() work, mark GHCBs inactive so that messages
> +			 * can be printed out.
> +			 */
> +			data->ghcb_active        = false;
> +			data->backup_ghcb_active = false;
> +
> +			instrumentation_begin();
> +			panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use");
> +			instrumentation_end();
> +		}
> +
> +		/* Mark backup_ghcb active before writing to it */
> +		data->backup_ghcb_active = true;
> +
> +		state->ghcb = &data->backup_ghcb;
> +
> +		/* Backup GHCB content */
> +		*state->ghcb = *ghcb;
> +	} else {
> +		state->ghcb = NULL;
> +		data->ghcb_active = true;
> +	}
> +
> +	return ghcb;
> +}
> +
> +static void sev_snp_setup_hv_doorbell_page(struct ghcb *ghcb)
> +{
> +	u64 pa;
> +	enum es_result ret;
> +
> +	pa = __pa(sev_snp_current_doorbell_page());
> +	vc_ghcb_invalidate(ghcb);
> +	ret = vmgexit_hv_doorbell_page(ghcb,
> +				       SVM_VMGEXIT_SET_HV_DOORBELL_PAGE,
> +				       pa);
> +	if (ret != ES_OK)
> +		panic("SEV-SNP: failed to set up #HV doorbell page");
> +}
> +
>   static noinstr void __sev_put_ghcb(struct ghcb_state *state)
>   {
>   	struct sev_es_runtime_data *data;
> @@ -1282,6 +1454,7 @@ static void snp_register_per_cpu_ghcb(void)
>   	ghcb = &data->ghcb_page;
>   
>   	snp_register_ghcb_early(__pa(ghcb));
> +	sev_snp_setup_hv_doorbell_page(ghcb);
>   }
>   
>   void setup_ghcb(void)
> @@ -1321,6 +1494,11 @@ void setup_ghcb(void)
>   		snp_register_ghcb_early(__pa(&boot_ghcb_page));
>   }
>   
> +int vmgexit_hv_doorbell_page(struct ghcb *ghcb, u64 op, u64 pa)
> +{
> +	return sev_es_ghcb_hv_call(ghcb, NULL, SVM_VMGEXIT_HV_DOORBELL_PAGE, op, pa);
> +}
> +
>   #ifdef CONFIG_HOTPLUG_CPU
>   static void sev_es_ap_hlt_loop(void)
>   {
> @@ -1394,6 +1572,7 @@ static void __init alloc_runtime_data(int cpu)
>   static void __init init_ghcb(int cpu)
>   {
>   	struct sev_es_runtime_data *data;
> +	struct sev_snp_runtime_data *snp_data;
>   	int err;
>   
>   	data = per_cpu(runtime_data, cpu);
> @@ -1405,6 +1584,19 @@ static void __init init_ghcb(int cpu)
>   
>   	memset(&data->ghcb_page, 0, sizeof(data->ghcb_page));
>   
> +	snp_data = memblock_alloc(sizeof(*snp_data), PAGE_SIZE);
> +	if (!snp_data)
> +		panic("Can't allocate SEV-SNP runtime data");
> +
> +	err = early_set_memory_decrypted((unsigned long)&snp_data->hv_doorbell_page,
> +					 sizeof(snp_data->hv_doorbell_page));
> +	if (err)
> +		panic("Can't map #HV doorbell pages unencrypted");
> +
> +	memset(&snp_data->hv_doorbell_page, 0, sizeof(snp_data->hv_doorbell_page));
> +
> +	per_cpu(snp_runtime_data, cpu) = snp_data;
> +
>   	data->ghcb_active = false;
>   	data->backup_ghcb_active = false;
>   }
> @@ -2045,7 +2237,12 @@ DEFINE_IDTENTRY_VC_USER(exc_vmm_communication)
>   
>   static bool hv_raw_handle_exception(struct pt_regs *regs)
>   {
> -	return false;
> +	/* Clear the no_further_signal bit */
> +	sev_snp_current_doorbell_page()->pending_events.events &= 0x7fff;

Do we need to clear "no_further_signal" here, given that we also reset
it in the handler (do_exc_hv())?
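
For reference, the 0x7fff mask in the hunk above clears only bit 15 of
the 16-bit events word and leaves the other bits alone. A minimal
sketch, assuming bit 15 is the "no_further_signal" flag (as the patch
comment implies); the macro and helper names are hypothetical and do
not appear in the series:

```c
#include <stdint.h>

/* Assumption: bit 15 of the events word is no_further_signal. */
#define HV_EVENTS_NO_FURTHER_SIGNAL	0x8000u

/* Hypothetical helper: same effect as "events &= 0x7fff" in the patch. */
static inline uint16_t hv_clear_no_further_signal(uint16_t events)
{
	return (uint16_t)(events & ~HV_EVENTS_NO_FURTHER_SIGNAL);
}
```

Any pending vector or flag in bits 14:0 survives the operation, so
clearing the bit twice (here and in the handler) is redundant rather
than harmful in this model.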

Thanks,
Pankaj

> +
> +	check_hv_pending(regs);
> +
> +	return true;
>   }
>   
>   static __always_inline bool on_hv_fallback_stack(struct pt_regs *regs)
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index d29debec8134..1aa6cab2394b 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -1503,5 +1503,7 @@ void __init trap_init(void)
>   	cpu_init_exception_handling();
>   	/* Setup traps as cpu_init() might #GP */
>   	idt_setup_traps();
> +	sev_snp_init_hv_handling();
> +
>   	cpu_init();
>   }



* Re: [RFC PATCH V3 13/16] x86/sev: Add Check of #HV event in path
  2023-03-01 11:11   ` Gupta, Pankaj
@ 2023-03-08 16:18     ` Gupta, Pankaj
  2023-03-10 15:59       ` Tianyu Lan
  0 siblings, 1 reply; 60+ messages in thread
From: Gupta, Pankaj @ 2023-03-08 16:18 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 3/1/2023 12:11 PM, Gupta, Pankaj wrote:
> On 1/22/2023 3:46 AM, Tianyu Lan wrote:
>> From: Tianyu Lan <tiala@microsoft.com>
>>
>> Add check_hv_pending() and check_hv_pending_irq_enable() to
>> check for queued #HV events when irqs are disabled.
>>
>> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
>> ---
>>   arch/x86/entry/entry_64.S       | 18 +++++++++++++++
>>   arch/x86/include/asm/irqflags.h | 10 +++++++++
>>   arch/x86/kernel/sev.c           | 39 +++++++++++++++++++++++++++++++++
>>   3 files changed, 67 insertions(+)
>>
>> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
>> index 6baec7653f19..aec8dc4443d1 100644
>> --- a/arch/x86/entry/entry_64.S
>> +++ b/arch/x86/entry/entry_64.S
>> @@ -1064,6 +1064,15 @@ SYM_CODE_END(paranoid_entry)
>>    * R15 - old SPEC_CTRL
>>    */
>>   SYM_CODE_START_LOCAL(paranoid_exit)
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +    /*
>> +     * If a #HV was delivered during execution and interrupts were
>> +     * disabled, then check if it can be handled before the iret
>> +     * (which may re-enable interrupts).
>> +     */
>> +    mov     %rsp, %rdi
>> +    call    check_hv_pending
>> +#endif
>>       UNWIND_HINT_REGS
>>       /*
>> @@ -1188,6 +1197,15 @@ SYM_CODE_START(error_entry)
>>   SYM_CODE_END(error_entry)
>>   SYM_CODE_START_LOCAL(error_return)
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +    /*
>> +     * If a #HV was delivered during execution and interrupts were
>> +     * disabled, then check if it can be handled before the iret
>> +     * (which may re-enable interrupts).
>> +     */
>> +    mov     %rsp, %rdi
>> +    call    check_hv_pending
>> +#endif
>>       UNWIND_HINT_REGS
>>       DEBUG_ENTRY_ASSERT_IRQS_OFF
>>       testb    $3, CS(%rsp)
>> diff --git a/arch/x86/include/asm/irqflags.h 
>> b/arch/x86/include/asm/irqflags.h
>> index 7793e52d6237..fe46e59168dd 100644
>> --- a/arch/x86/include/asm/irqflags.h
>> +++ b/arch/x86/include/asm/irqflags.h
>> @@ -14,6 +14,10 @@
>>   /*
>>    * Interrupt control:
>>    */
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +void check_hv_pending(struct pt_regs *regs);
>> +void check_hv_pending_irq_enable(void);
>> +#endif
>>   /* Declaration required for gcc < 4.9 to prevent 
>> -Werror=missing-prototypes */
>>   extern inline unsigned long native_save_fl(void);
>> @@ -43,12 +47,18 @@ static __always_inline void native_irq_disable(void)
>>   static __always_inline void native_irq_enable(void)
>>   {
>>       asm volatile("sti": : :"memory");
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +    check_hv_pending_irq_enable();
>> +#endif
>>   }
>>   static inline __cpuidle void native_safe_halt(void)
>>   {
>>       mds_idle_clear_cpu_buffers();
>>       asm volatile("sti; hlt": : :"memory");
>> +#ifdef CONFIG_AMD_MEM_ENCRYPT
>> +    check_hv_pending_irq_enable();
>> +#endif
>>   }
>>   static inline __cpuidle void native_halt(void)
>> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
>> index a8862a2eff67..fe5e5e41433d 100644
>> --- a/arch/x86/kernel/sev.c
>> +++ b/arch/x86/kernel/sev.c
>> @@ -179,6 +179,45 @@ void noinstr __sev_es_ist_enter(struct pt_regs 
>> *regs)
>>       this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], new_ist);
>>   }
>> +static void do_exc_hv(struct pt_regs *regs)
>> +{
>> +    /* Handle #HV exception. */
>> +}
>> +
>> +void check_hv_pending(struct pt_regs *regs)
>> +{
>> +    if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
>> +        return;
>> +
>> +    if ((regs->flags & X86_EFLAGS_IF) == 0)
>> +        return;
> 
> Will this return prevent the guest from handling NMIs
> while irqs are disabled?

I think we need to handle NMIs even when irqs are disabled.

Since we reset "no_further_signal" in hv_raw_handle_exception()
and return early from check_hv_pending() when irqs are disabled,
couldn't this lose or delay an NMI event?
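
To make the concern concrete, here is a minimal sketch of the decision
I have in mind. It is not code from the series: the helper name is
hypothetical, and the bit positions (vector in bits 7:0, NMI in bit 8)
are my assumption about the pending-events word, not something shown in
the quoted hunks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout of the 16-bit pending-events word. */
#define HV_PENDING_VECTOR_MASK	0x00ffu	/* bits 7:0: pending vector */
#define HV_PENDING_NMI		0x0100u	/* bit 8: NMI pending */

/* Hypothetical helper: may the pending word be processed right now? */
static bool hv_event_deliverable(uint16_t events, bool irqs_enabled)
{
	/* NMIs are not gated by EFLAGS.IF, so never defer them. */
	if (events & HV_PENDING_NMI)
		return true;

	/* Maskable vectors wait until interrupts are enabled again. */
	return irqs_enabled && (events & HV_PENDING_VECTOR_MASK) != 0;
}
```

With a check of this shape, the early return on a clear X86_EFLAGS_IF
would apply only to maskable vectors.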

Thanks,
Pankaj


* Re: [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler
  2023-01-22  2:46 ` [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler Tianyu Lan
  2023-01-23  7:33   ` Gupta, Pankaj
@ 2023-03-09 11:48   ` Gupta, Pankaj
  2023-03-10 15:48     ` Tianyu Lan
  2023-03-31 15:57   ` Borislav Petkov
  2 siblings, 1 reply; 60+ messages in thread
From: Gupta, Pankaj @ 2023-03-09 11:48 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 1/22/2023 3:46 AM, Tianyu Lan wrote:
> From: Tianyu Lan <tiala@microsoft.com>
> 
> Add a #HV exception handler that uses IST stack.
> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
> Change since RFC V2:
>         * Remove unnecessary line in the change log.
> ---
>   arch/x86/entry/entry_64.S             | 58 +++++++++++++++++++++++++++
>   arch/x86/include/asm/cpu_entry_area.h |  6 +++
>   arch/x86/include/asm/idtentry.h       | 39 +++++++++++++++++-
>   arch/x86/include/asm/page_64_types.h  |  1 +
>   arch/x86/include/asm/trapnr.h         |  1 +
>   arch/x86/include/asm/traps.h          |  1 +
>   arch/x86/kernel/cpu/common.c          |  1 +
>   arch/x86/kernel/dumpstack_64.c        |  9 ++++-
>   arch/x86/kernel/idt.c                 |  1 +
>   arch/x86/kernel/sev.c                 | 53 ++++++++++++++++++++++++
>   arch/x86/kernel/traps.c               | 40 ++++++++++++++++++
>   arch/x86/mm/cpu_entry_area.c          |  2 +
>   12 files changed, 209 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 15739a2c0983..6baec7653f19 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -563,6 +563,64 @@ SYM_CODE_START(\asmsym)
>   .Lfrom_usermode_switch_stack_\@:
>   	idtentry_body user_\cfunc, has_error_code=1
>   
> +_ASM_NOKPROBE(\asmsym)
> +SYM_CODE_END(\asmsym)
> +.endm
> +/*
> + * idtentry_hv - Macro to generate entry stub for #HV
> + * @vector:		Vector number
> + * @asmsym:		ASM symbol for the entry point
> + * @cfunc:		C function to be called
> + *
> + * The macro emits code to set up the kernel context for #HV. The #HV handler
> + * runs on an IST stack and needs to be able to support nested #HV exceptions.
> + *
> + * To make this work the #HV entry code tries its best to pretend it doesn't use
> + * an IST stack by switching to the task stack if coming from user-space (which
> + * includes early SYSCALL entry path) or back to the stack in the IRET frame if
> + * entered from kernel-mode.
> + *
> + * If entered from kernel-mode the return stack is validated first, and if it is
> + * not safe to use (e.g. because it points to the entry stack) the #HV handler
> + * will switch to a fall-back stack (HV2) and call a special handler function.
> + *
> + * The macro is only used for one vector, but it is planned to be extended in
> + * the future for the #HV exception.
> + */
> +.macro idtentry_hv vector asmsym cfunc
> +SYM_CODE_START(\asmsym)
> +	UNWIND_HINT_IRET_REGS
> +	ASM_CLAC
> +	pushq	$-1			/* ORIG_RAX: no syscall to restart */
> +
> +	testb	$3, CS-ORIG_RAX(%rsp)
> +	jnz	.Lfrom_usermode_switch_stack_\@
> +
> +	call	paranoid_entry
> +
> +	UNWIND_HINT_REGS
> +
> +	/*
> +	 * Switch off the IST stack to make it free for nested exceptions.
> +	 */
> +	movq	%rsp, %rdi		/* pt_regs pointer */
> +	call	hv_switch_off_ist
> +	movq	%rax, %rsp		/* Switch to new stack */
> +

Do we need "ENCODE_FRAME_POINTER" here, as in the "vc_switch_off_ist"
path, since we are switching stacks?

> +	UNWIND_HINT_REGS
> +
> +	/* Update pt_regs */
> +	movq	ORIG_RAX(%rsp), %rsi	/* get error code into 2nd argument*/
> +	movq	$-1, ORIG_RAX(%rsp)	/* no syscall to restart */
> +
> +	movq	%rsp, %rdi		/* pt_regs pointer */
> +	call	kernel_\cfunc
> +
> +	jmp	paranoid_exit
> +
> +.Lfrom_usermode_switch_stack_\@:
> +	idtentry_body user_\cfunc, has_error_code=1
> +
>   _ASM_NOKPROBE(\asmsym)
>   SYM_CODE_END(\asmsym)
>   .endm
> diff --git a/arch/x86/include/asm/cpu_entry_area.h b/arch/x86/include/asm/cpu_entry_area.h
> index 462fc34f1317..2186ed601b4a 100644
> --- a/arch/x86/include/asm/cpu_entry_area.h
> +++ b/arch/x86/include/asm/cpu_entry_area.h
> @@ -30,6 +30,10 @@
>   	char	VC_stack[optional_stack_size];			\
>   	char	VC2_stack_guard[guardsize];			\
>   	char	VC2_stack[optional_stack_size];			\
> +	char	HV_stack_guard[guardsize];			\
> +	char	HV_stack[optional_stack_size];			\
> +	char	HV2_stack_guard[guardsize];			\
> +	char	HV2_stack[optional_stack_size];			\
>   	char	IST_top_guard[guardsize];			\
>   
>   /* The exception stacks' physical storage. No guard pages required */
> @@ -52,6 +56,8 @@ enum exception_stack_ordering {
>   	ESTACK_MCE,
>   	ESTACK_VC,
>   	ESTACK_VC2,
> +	ESTACK_HV,
> +	ESTACK_HV2,
>   	N_EXCEPTION_STACKS
>   };
>   
> diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
> index 72184b0b2219..652fea10d377 100644
> --- a/arch/x86/include/asm/idtentry.h
> +++ b/arch/x86/include/asm/idtentry.h
> @@ -317,6 +317,19 @@ static __always_inline void __##func(struct pt_regs *regs)
>   	__visible noinstr void kernel_##func(struct pt_regs *regs, unsigned long error_code);	\
>   	__visible noinstr void   user_##func(struct pt_regs *regs, unsigned long error_code)
>   
> +
> +/**
> + * DECLARE_IDTENTRY_HV - Declare functions for the HV entry point
> + * @vector:	Vector number (ignored for C)
> + * @func:	Function name of the entry point
> + *
> + * Maps to DECLARE_IDTENTRY_RAW, but declares also the user C handler.
> + */
> +#define DECLARE_IDTENTRY_HV(vector, func)				\
> +	DECLARE_IDTENTRY_RAW_ERRORCODE(vector, func);			\
> +	__visible noinstr void kernel_##func(struct pt_regs *regs);	\
> +	__visible noinstr void   user_##func(struct pt_regs *regs)
> +
>   /**
>    * DEFINE_IDTENTRY_IST - Emit code for IST entry points
>    * @func:	Function name of the entry point
> @@ -376,6 +389,26 @@ static __always_inline void __##func(struct pt_regs *regs)
>   #define DEFINE_IDTENTRY_VC_USER(func)				\
>   	DEFINE_IDTENTRY_RAW_ERRORCODE(user_##func)
>   
> +/**
> + * DEFINE_IDTENTRY_HV_KERNEL - Emit code for HV injection handler
> + *			       when raised from kernel mode
> + * @func:	Function name of the entry point
> + *
> + * Maps to DEFINE_IDTENTRY_RAW
> + */
> +#define DEFINE_IDTENTRY_HV_KERNEL(func)					\
> +	DEFINE_IDTENTRY_RAW(kernel_##func)
> +
> +/**
> + * DEFINE_IDTENTRY_HV_USER - Emit code for HV injection handler
> + *			     when raised from user mode
> + * @func:	Function name of the entry point
> + *
> + * Maps to DEFINE_IDTENTRY_RAW
> + */
> +#define DEFINE_IDTENTRY_HV_USER(func)					\
> +	DEFINE_IDTENTRY_RAW(user_##func)
> +
>   #else	/* CONFIG_X86_64 */
>   
>   /**
> @@ -465,6 +498,9 @@ __visible noinstr void func(struct pt_regs *regs,			\
>   # define DECLARE_IDTENTRY_VC(vector, func)				\
>   	idtentry_vc vector asm_##func func
>   
> +# define DECLARE_IDTENTRY_HV(vector, func)				\
> +	idtentry_hv vector asm_##func func
> +
>   #else
>   # define DECLARE_IDTENTRY_MCE(vector, func)				\
>   	DECLARE_IDTENTRY(vector, func)
> @@ -622,9 +658,10 @@ DECLARE_IDTENTRY_RAW_ERRORCODE(X86_TRAP_DF,	xenpv_exc_double_fault);
>   DECLARE_IDTENTRY_ERRORCODE(X86_TRAP_CP,	exc_control_protection);
>   #endif
>   
> -/* #VC */
> +/* #VC & #HV */
>   #ifdef CONFIG_AMD_MEM_ENCRYPT
>   DECLARE_IDTENTRY_VC(X86_TRAP_VC,	exc_vmm_communication);
> +DECLARE_IDTENTRY_HV(X86_TRAP_HV,	exc_hv_injection);
>   #endif
>   
>   #ifdef CONFIG_XEN_PV
> diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
> index e9e2c3ba5923..0bd7dab676c5 100644
> --- a/arch/x86/include/asm/page_64_types.h
> +++ b/arch/x86/include/asm/page_64_types.h
> @@ -29,6 +29,7 @@
>   #define	IST_INDEX_DB		2
>   #define	IST_INDEX_MCE		3
>   #define	IST_INDEX_VC		4
> +#define	IST_INDEX_HV		5
>   
>   /*
>    * Set __PAGE_OFFSET to the most negative possible address +
> diff --git a/arch/x86/include/asm/trapnr.h b/arch/x86/include/asm/trapnr.h
> index f5d2325aa0b7..c6583631cecb 100644
> --- a/arch/x86/include/asm/trapnr.h
> +++ b/arch/x86/include/asm/trapnr.h
> @@ -26,6 +26,7 @@
>   #define X86_TRAP_XF		19	/* SIMD Floating-Point Exception */
>   #define X86_TRAP_VE		20	/* Virtualization Exception */
>   #define X86_TRAP_CP		21	/* Control Protection Exception */
> +#define X86_TRAP_HV		28	/* HV injected exception in SNP restricted mode */
>   #define X86_TRAP_VC		29	/* VMM Communication Exception */
>   #define X86_TRAP_IRET		32	/* IRET Exception */
>   
> diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
> index 47ecfff2c83d..6795d3e517d6 100644
> --- a/arch/x86/include/asm/traps.h
> +++ b/arch/x86/include/asm/traps.h
> @@ -16,6 +16,7 @@ asmlinkage __visible notrace
>   struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs);
>   void __init trap_init(void);
>   asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *eregs);
> +asmlinkage __visible noinstr struct pt_regs *hv_switch_off_ist(struct pt_regs *eregs);
>   #endif
>   
>   extern bool ibt_selftest(void);
> diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
> index 9cfca3d7d0e2..e48a489777ec 100644
> --- a/arch/x86/kernel/cpu/common.c
> +++ b/arch/x86/kernel/cpu/common.c
> @@ -2162,6 +2162,7 @@ static inline void tss_setup_ist(struct tss_struct *tss)
>   	tss->x86_tss.ist[IST_INDEX_MCE] = __this_cpu_ist_top_va(MCE);
>   	/* Only mapped when SEV-ES is active */
>   	tss->x86_tss.ist[IST_INDEX_VC] = __this_cpu_ist_top_va(VC);
> +	tss->x86_tss.ist[IST_INDEX_HV] = __this_cpu_ist_top_va(HV);
>   }
>   
>   #else /* CONFIG_X86_64 */
> diff --git a/arch/x86/kernel/dumpstack_64.c b/arch/x86/kernel/dumpstack_64.c
> index f05339fee778..6d8f8864810c 100644
> --- a/arch/x86/kernel/dumpstack_64.c
> +++ b/arch/x86/kernel/dumpstack_64.c
> @@ -26,11 +26,14 @@ static const char * const exception_stack_names[] = {
>   		[ ESTACK_MCE	]	= "#MC",
>   		[ ESTACK_VC	]	= "#VC",
>   		[ ESTACK_VC2	]	= "#VC2",
> +		[ ESTACK_HV	]	= "#HV",
> +		[ ESTACK_HV2	]	= "#HV2",
> +		
>   };
>   
>   const char *stack_type_name(enum stack_type type)
>   {
> -	BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
> +	BUILD_BUG_ON(N_EXCEPTION_STACKS != 8);
>   
>   	if (type == STACK_TYPE_TASK)
>   		return "TASK";
> @@ -89,6 +92,8 @@ struct estack_pages estack_pages[CEA_ESTACK_PAGES] ____cacheline_aligned = {
>   	EPAGERANGE(MCE),
>   	EPAGERANGE(VC),
>   	EPAGERANGE(VC2),
> +	EPAGERANGE(HV),
> +	EPAGERANGE(HV2),
>   };
>   
>   static __always_inline bool in_exception_stack(unsigned long *stack, struct stack_info *info)
> @@ -98,7 +103,7 @@ static __always_inline bool in_exception_stack(unsigned long *stack, struct stac
>   	struct pt_regs *regs;
>   	unsigned int k;
>   
> -	BUILD_BUG_ON(N_EXCEPTION_STACKS != 6);
> +	BUILD_BUG_ON(N_EXCEPTION_STACKS != 8);
>   
>   	begin = (unsigned long)__this_cpu_read(cea_exception_stacks);
>   	/*
> diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
> index a58c6bc1cd68..48c0a7e1dbcb 100644
> --- a/arch/x86/kernel/idt.c
> +++ b/arch/x86/kernel/idt.c
> @@ -113,6 +113,7 @@ static const __initconst struct idt_data def_idts[] = {
>   
>   #ifdef CONFIG_AMD_MEM_ENCRYPT
>   	ISTG(X86_TRAP_VC,		asm_exc_vmm_communication, IST_INDEX_VC),
> +	ISTG(X86_TRAP_HV,		asm_exc_hv_injection, IST_INDEX_HV),
>   #endif
>   
>   	SYSG(X86_TRAP_OF,		asm_exc_overflow),
> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
> index 679026a640ef..a8862a2eff67 100644
> --- a/arch/x86/kernel/sev.c
> +++ b/arch/x86/kernel/sev.c
> @@ -2004,6 +2004,59 @@ DEFINE_IDTENTRY_VC_USER(exc_vmm_communication)
>   	irqentry_exit_to_user_mode(regs);
>   }
>   
> +static bool hv_raw_handle_exception(struct pt_regs *regs)
> +{
> +	return false;
> +}
> +
> +static __always_inline bool on_hv_fallback_stack(struct pt_regs *regs)
> +{
> +	unsigned long sp = (unsigned long)regs;
> +
> +	return (sp >= __this_cpu_ist_bottom_va(HV2) && sp < __this_cpu_ist_top_va(HV2));
> +}
> +
> +DEFINE_IDTENTRY_HV_USER(exc_hv_injection)
> +{
> +	irqentry_enter_from_user_mode(regs);
> +	instrumentation_begin();
> +
> +	if (!hv_raw_handle_exception(regs)) {
> +		/*
> +		 * Do not kill the machine if user-space triggered the
> +		 * exception. Send SIGBUS instead and let user-space deal
> +		 * with it.
> +		 */
> +		force_sig_fault(SIGBUS, BUS_OBJERR, (void __user *)0);
> +	}
> +
> +	instrumentation_end();
> +	irqentry_exit_to_user_mode(regs);
> +}
> +
> +DEFINE_IDTENTRY_HV_KERNEL(exc_hv_injection)
> +{
> +	irqentry_state_t irq_state;
> +
> +	irq_state = irqentry_nmi_enter(regs);
> +	instrumentation_begin();
> +
> +	if (!hv_raw_handle_exception(regs)) {
> +		pr_emerg("PANIC: Unhandled #HV exception in kernel space\n");
> +
> +		/* Show some debug info */
> +		show_regs(regs);
> +
> +		/* Ask hypervisor to sev_es_terminate */
> +		sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ);
> +
> +		panic("Returned from Terminate-Request to Hypervisor\n");
> +	}
> +
> +	instrumentation_end();
> +	irqentry_nmi_exit(regs, irq_state);
> +}
> +
>   bool __init handle_vc_boot_ghcb(struct pt_regs *regs)
>   {
>   	unsigned long exit_code = regs->orig_ax;
> diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
> index d317dc3d06a3..d29debec8134 100644
> --- a/arch/x86/kernel/traps.c
> +++ b/arch/x86/kernel/traps.c
> @@ -905,6 +905,46 @@ asmlinkage __visible noinstr struct pt_regs *vc_switch_off_ist(struct pt_regs *r
>   
>   	return regs_ret;
>   }
> +
> +asmlinkage __visible noinstr struct pt_regs *hv_switch_off_ist(struct pt_regs *regs)
> +{
> +	unsigned long sp, *stack;
> +	struct stack_info info;
> +	struct pt_regs *regs_ret;
> +
> +	/*
> +	 * In the SYSCALL entry path the RSP value comes from user-space - don't
> +	 * trust it and switch to the current kernel stack
> +	 */
> +	if (ip_within_syscall_gap(regs)) {
> +		sp = this_cpu_read(pcpu_hot.top_of_stack);
> +		goto sync;
> +	}
> +
> +	/*
> +	 * From here on the RSP value is trusted. Now check whether entry
> +	 * happened from a safe stack. Not safe are the entry or unknown stacks,
> +	 * use the fall-back stack instead in this case.
> +	 */
> +	sp    = regs->sp;
> +	stack = (unsigned long *)sp;
> +
> +	if (!get_stack_info_noinstr(stack, current, &info) || info.type == STACK_TYPE_ENTRY ||
> +	    info.type > STACK_TYPE_EXCEPTION_LAST)
> +		sp = __this_cpu_ist_top_va(HV2);
> +sync:
> +	/*
> +	 * Found a safe stack - switch to it as if the entry didn't happen via
> +	 * IST stack. The code below only copies pt_regs, the real switch happens
> +	 * in assembly code.
> +	 */
> +	sp = ALIGN_DOWN(sp, 8) - sizeof(*regs_ret);
> +
> +	regs_ret = (struct pt_regs *)sp;
> +	*regs_ret = *regs;
> +
> +	return regs_ret;
> +}
>   #endif
>   
>   asmlinkage __visible noinstr struct pt_regs *fixup_bad_iret(struct pt_regs *bad_regs)
> diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c
> index 7316a8224259..3ec844cef652 100644
> --- a/arch/x86/mm/cpu_entry_area.c
> +++ b/arch/x86/mm/cpu_entry_area.c
> @@ -153,6 +153,8 @@ static void __init percpu_setup_exception_stacks(unsigned int cpu)
>   		if (cc_platform_has(CC_ATTR_GUEST_STATE_ENCRYPT)) {
>   			cea_map_stack(VC);
>   			cea_map_stack(VC2);
> +			cea_map_stack(HV);
> +			cea_map_stack(HV2);
>   		}
>   	}
>   }



* Re: [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv
  2023-02-17 12:47   ` Gupta, Pankaj
  2023-02-18  7:15     ` Tianyu Lan
@ 2023-03-10 15:35     ` Gupta, Pankaj
  2023-03-10 16:19       ` Tianyu Lan
  1 sibling, 1 reply; 60+ messages in thread
From: Gupta, Pankaj @ 2023-03-10 15:35 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch



Hi Tianyu,

While testing the guest patches on a KVM host, my guest kernel gets
stuck in early boot. It does not look like a hang but rather a loop in
which interrupts are processed repeatedly from the
"pv_native_irq_enable" path, preventing the boot process from making
progress, IIUC. Did you hit any such scenario in your testing?

It seems to me that "native_irq_enable" enables interrupts and
"check_hv_pending_irq_enable" then starts handling the pending
interrupts (after disabling irqs). But can
"check_hv_pending_irq_enable=>do_exc_hv" call "pv_native_irq_enable"
again in the interrupt handling path and re-enter the same loop?
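
One way to picture the re-entry, and a possible guard against it, is
the small user-space model below. This is only a sketch under my own
assumptions: every name in it is hypothetical, the "draining" flag is
not part of the series, and the real fix may look quite different.

```c
#include <stdbool.h>

/* Hypothetical per-CPU model; none of these names exist in the series. */
struct hv_model {
	bool draining;		/* set while pending events are drained */
	unsigned pending;	/* events still queued by the hypervisor */
	unsigned handled;	/* events processed so far */
	unsigned depth;		/* current nesting of the drain loop */
	unsigned max_depth;	/* deepest nesting observed */
};

static void model_irq_enable(struct hv_model *m);

/* Stands in for do_exc_hv(): the handler path re-enables interrupts. */
static void model_do_exc_hv(struct hv_model *m)
{
	m->handled++;
	model_irq_enable(m);
}

/* Stands in for check_hv_pending_irq_enable(), plus a re-entry guard. */
static void model_check_pending(struct hv_model *m)
{
	if (m->draining)
		return;		/* the outer invocation drains the queue */

	m->draining = true;
	m->depth++;
	if (m->depth > m->max_depth)
		m->max_depth = m->depth;

	while (m->pending) {
		m->pending--;
		model_do_exc_hv(m);
	}

	m->depth--;
	m->draining = false;
}

static void model_irq_enable(struct hv_model *m)
{
	model_check_pending(m);
}

/* Drain n queued events and report the deepest nesting reached. */
static unsigned model_max_depth(unsigned n)
{
	struct hv_model m = { .pending = n };

	model_irq_enable(&m);
	return m.handled == n ? m.max_depth : 0;
}
```

In this model every queued event is still handled, but the drain loop
never nests: the inner "irq enable" sees the flag and returns, leaving
the remaining events to the outer loop.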

Also pasting below the stack dump [1].

Thanks,
Pankaj

[1]
[   20.530786] Call Trace:
[   20.531099]  <IRQ>
[   20.531360]  dump_stack_lvl+0x4d/0x67
[   20.531820]  dump_stack+0x14/0x1a
[   20.532235]  do_exc_hv.cold+0x11/0xec
[   20.532792]  check_hv_pending_irq_enable+0x64/0x80
[   20.533390]  pv_native_irq_enable+0xe/0x20   ====> here
[   20.533902]  __do_softirq+0x89/0x2f3
[   20.534352]  __irq_exit_rcu+0x9f/0x110
[   20.534825]  irq_exit_rcu+0x12/0x20
[   20.535267]  common_interrupt+0xca/0xf0
[   20.535745]  </IRQ>
[   20.536014]  <TASK>
[   20.536286]  do_exc_hv.cold+0xda/0xec
[   20.536826]  check_hv_pending_irq_enable+0x64/0x80
[   20.537429]  pv_native_irq_enable+0xe/0x20    ====> here
[   20.537942]  _raw_spin_unlock_irqrestore+0x21/0x50
[   20.538539]  __setup_irq+0x3be/0x740
[   20.538990]  request_threaded_irq+0x116/0x180
[   20.539533]  hpet_time_init+0x35/0x56
[   20.539994]  x86_late_time_init+0x1f/0x3d
[   20.540556]  start_kernel+0x8af/0x970
[   20.541033]  x86_64_start_reservations+0x28/0x2e
[   20.541607]  x86_64_start_kernel+0x96/0xa0
[   20.542126]  secondary_startup_64_no_verify+0xe5/0xeb
[   20.542757]  </TASK>


* Re: [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler
  2023-03-09 11:48   ` Gupta, Pankaj
@ 2023-03-10 15:48     ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-03-10 15:48 UTC (permalink / raw)
  To: Gupta, Pankaj, luto, tglx, mingo, bp, dave.hansen, x86, hpa,
	seanjc, pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch


On 3/9/2023 7:48 PM, Gupta, Pankaj wrote:
> On 1/22/2023 3:46 AM, Tianyu Lan wrote:
>> From: Tianyu Lan <tiala@microsoft.com>
>> +    UNWIND_HINT_IRET_REGS
>> +    ASM_CLAC
>> +    pushq    $-1            /* ORIG_RAX: no syscall to restart */
>> +
>> +    testb    $3, CS-ORIG_RAX(%rsp)
>> +    jnz    .Lfrom_usermode_switch_stack_\@
>> +
>> +    call    paranoid_entry
>> +
>> +    UNWIND_HINT_REGS
>> +
>> +    /*
>> +     * Switch off the IST stack to make it free for nested exceptions.
>> +     */
>> +    movq    %rsp, %rdi        /* pt_regs pointer */
>> +    call    hv_switch_off_ist
>> +    movq    %rax, %rsp        /* Switch to new stack */
>> +
> 
> Do we need "ENCODE_FRAME_POINTER" here, as in the "vc_switch_off_ist"
> path, since we are switching stacks?
> 

Agreed. I will add it in the next version. Thanks.


* Re: [RFC PATCH V3 13/16] x86/sev: Add Check of #HV event in path
  2023-03-08 16:18     ` Gupta, Pankaj
@ 2023-03-10 15:59       ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-03-10 15:59 UTC (permalink / raw)
  To: Gupta, Pankaj, luto, tglx, mingo, bp, dave.hansen, x86, hpa,
	seanjc, pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 3/9/2023 12:18 AM, Gupta, Pankaj wrote:
> On 3/1/2023 12:11 PM, Gupta, Pankaj wrote:
>> On 1/22/2023 3:46 AM, Tianyu Lan wrote:

>>> diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
>>> index a8862a2eff67..fe5e5e41433d 100644
>>> --- a/arch/x86/kernel/sev.c
>>> +++ b/arch/x86/kernel/sev.c
>>> @@ -179,6 +179,45 @@ void noinstr __sev_es_ist_enter(struct pt_regs 
>>> *regs)
>>>       this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], new_ist);
>>>   }
>>> +static void do_exc_hv(struct pt_regs *regs)
>>> +{
>>> +    /* Handle #HV exception. */
>>> +}
>>> +
>>> +void check_hv_pending(struct pt_regs *regs)
>>> +{
>>> +    if (!cc_platform_has(CC_ATTR_GUEST_SEV_SNP))
>>> +        return;
>>> +
>>> +    if ((regs->flags & X86_EFLAGS_IF) == 0)
>>> +        return;
>>
>> Will this return prevent the guest from handling NMIs
>> while irqs are disabled?
> 
> I think we need to handle NMIs even when irqs are disabled.
> 

Yes, nice catch!

> Since we reset "no_further_signal" in hv_raw_handle_exception()
> and return early from check_hv_pending() when irqs are disabled,
> couldn't this lose or delay an NMI event?

I will fix this in the next version.

Thanks.


* Re: [RFC PATCH V3 16/16] x86/sev: Fix interrupt exit code paths from #HV exception
  2023-02-21 16:44   ` Gupta, Pankaj
@ 2023-03-10 16:02     ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-03-10 16:02 UTC (permalink / raw)
  To: Gupta, Pankaj, luto, tglx, mingo, bp, dave.hansen, x86, hpa,
	seanjc, pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

On 2/22/2023 12:44 AM, Gupta, Pankaj wrote:
>> @@ -2529,3 +2537,25 @@ static int __init snp_init_platform_device(void)
>>       return 0;
>>   }
>>   device_initcall(snp_init_platform_device);
>> +
>> +noinstr void irqentry_exit_hv_cond(struct pt_regs *regs, 
>> irqentry_state_t state)
>> +{
> 
> This code path is also called for guests without SNP. I ran a SEV
> guest and it crashed in this code path. Adding a check and returning
> early let the (non-SNP) guest boot, with some call traces. But this
> branch needs to be avoided for non-SNP guests and for the host as well.
> 

Nice catch! I will fix it in the next version.

Thanks.
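
A rough userspace sketch in C of the kind of guard being asked for
above; it is not the actual fix. The body of irqentry_exit_hv_cond() is
not shown in this thread, so the model only counts which path ran. The
names exit_trace_model and irqentry_exit_hv_cond_model, the
guest_is_snp flag, and the assumption that the SNP path is followed by
the generic exit work are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical counters recording which exit path was taken. */
struct exit_trace_model {
	int hv_checks;      /* times the #HV-specific exit work ran */
	int generic_exits;  /* times the generic exit work ran */
};

static void irqentry_exit_hv_cond_model(bool guest_is_snp,
					struct exit_trace_model *t)
{
	/* Non-SNP guests and the host never touch the #HV-specific path. */
	if (!guest_is_snp) {
		t->generic_exits++;
		return;
	}

	/* Assumption: SNP guests do the #HV work, then the generic exit. */
	t->hv_checks++;
	t->generic_exits++;
}
```

In the kernel the flag would presumably come from something like the
cc_platform_has(CC_ATTR_GUEST_SEV_SNP) check quoted earlier for
check_hv_pending(), but that is an assumption here.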


* Re: [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv
  2023-03-10 15:35     ` Gupta, Pankaj
@ 2023-03-10 16:19       ` Tianyu Lan
  2023-03-15  6:40         ` Gupta, Pankaj
  0 siblings, 1 reply; 60+ messages in thread
From: Tianyu Lan @ 2023-03-10 16:19 UTC (permalink / raw)
  To: Gupta, Pankaj, luto, tglx, mingo, bp, dave.hansen, x86, hpa,
	seanjc, pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch


On 3/10/2023 11:35 PM, Gupta, Pankaj wrote:
> 
> 
> Hi Tianyu,
> 
> While testing the guest patches on a KVM host, my guest kernel gets
> stuck at early bootup. It does not look like a hang but rather a loop
> in which interrupts are processed repeatedly from the
> "pv_native_irq_enable" path, preventing the boot process from making
> progress, IIUC. Did you hit any such scenario in your testing?
> 
> It seems to me that "native_irq_enable" enables interrupts and
> "check_hv_pending_irq_enable" starts handling the interrupts (after
> disabling irqs). But "check_hv_pending_irq_enable=>do_exc_hv" can then
> call "pv_native_irq_enable" again in the interrupt handling path and
> execute the same loop?


I have not hit this issue. Thanks for the report. I will double-check
and report back.

> Also pasting below the stack dump [1].
> 
> Thanks,
> Pankaj
> 
> [1]
> [   20.530786] Call Trace:
> [   20.531099]  <IRQ>
> [   20.531360]  dump_stack_lvl+0x4d/0x67
> [   20.531820]  dump_stack+0x14/0x1a
> [   20.532235]  do_exc_hv.cold+0x11/0xec
> [   20.532792]  check_hv_pending_irq_enable+0x64/0x80
> [   20.533390]  pv_native_irq_enable+0xe/0x20   ====> here
> [   20.533902]  __do_softirq+0x89/0x2f3
> [   20.534352]  __irq_exit_rcu+0x9f/0x110
> [   20.534825]  irq_exit_rcu+0x12/0x20
> [   20.535267]  common_interrupt+0xca/0xf0
> [   20.535745]  </IRQ>
> [   20.536014]  <TASK>
> [   20.536286]  do_exc_hv.cold+0xda/0xec
> [   20.536826]  check_hv_pending_irq_enable+0x64/0x80
> [   20.537429]  pv_native_irq_enable+0xe/0x20   ====> here
> [   20.537942]  _raw_spin_unlock_irqrestore+0x21/0x50
> [   20.538539]  __setup_irq+0x3be/0x740
> [   20.538990]  request_threaded_irq+0x116/0x180
> [   20.539533]  hpet_time_init+0x35/0x56
> [   20.539994]  x86_late_time_init+0x1f/0x3d
> [   20.540556]  start_kernel+0x8af/0x970
> [   20.541033]  x86_64_start_reservations+0x28/0x2e
> [   20.541607]  x86_64_start_kernel+0x96/0xa0
> [   20.542126]  secondary_startup_64_no_verify+0xe5/0xeb
> [   20.542757]  </TASK>
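
The trace above shows pv_native_irq_enable() being re-entered from
inside do_exc_hv(). As a minimal userspace sketch in C, one common way
to keep such a path from nesting is a re-entrancy flag, so that only the
outermost enable drains pending events. This is only an illustration of
the pattern, not the fix chosen for this series; cpu_model, the draining
flag, and the *_model functions are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical per-CPU state for the sketch. */
struct cpu_model {
	bool irqs_on;
	bool draining;   /* set while the outermost drain loop runs */
	int pending;     /* queued #HV events */
	int handled;     /* events handled so far */
	int depth;       /* current handler nesting */
	int max_depth;   /* deepest nesting observed */
};

static void irq_enable_model(struct cpu_model *c);

/* Models a handler that, like softirq processing, re-enables irqs. */
static void handle_event_model(struct cpu_model *c)
{
	c->handled++;
	c->depth++;
	if (c->depth > c->max_depth)
		c->max_depth = c->depth;

	irq_enable_model(c);  /* nested enable from the handler */

	c->depth--;
}

static void irq_enable_model(struct cpu_model *c)
{
	c->irqs_on = true;

	/* A nested call leaves the remaining events to the outer loop. */
	if (c->draining)
		return;

	c->draining = true;
	while (c->pending > 0) {
		c->pending--;
		c->irqs_on = false;  /* handler starts with irqs disabled */
		handle_event_model(c);
	}
	c->draining = false;
	c->irqs_on = true;
}
```

With three events pending, the model handles all three while the handler
nesting never exceeds one level. Without the draining check, each nested
enable would start its own drain loop from inside the handler.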


* Re: [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv
  2023-03-10 16:19       ` Tianyu Lan
@ 2023-03-15  6:40         ` Gupta, Pankaj
  0 siblings, 0 replies; 60+ messages in thread
From: Gupta, Pankaj @ 2023-03-15  6:40 UTC (permalink / raw)
  To: Tianyu Lan, luto, tglx, mingo, bp, dave.hansen, x86, hpa, seanjc,
	pbonzini, jgross, tiala, kirill, jiangshan.ljs, peterz,
	ashish.kalra, srutherford, akpm, anshuman.khandual,
	pawan.kumar.gupta, adrian.hunter, daniel.sneddon,
	alexander.shishkin, sandipan.das, ray.huang, brijesh.singh,
	michael.roth, thomas.lendacky, venu.busireddy, sterritt,
	tony.luck, samitolvanen, fenghua.yu
  Cc: linux-kernel, kvm, linux-hyperv, linux-arch

Hi Tianyu,

>> Hi Tianyu,
>>
>> While testing the guest patches on a KVM host, my guest kernel gets
>> stuck at early bootup. It does not look like a hang but rather a loop
>> in which interrupts are processed repeatedly from the
>> "pv_native_irq_enable" path, preventing the boot process from making
>> progress, IIUC. Did you hit any such scenario in your testing?
>>
>> It seems to me that "native_irq_enable" enables interrupts and
>> "check_hv_pending_irq_enable" starts handling the interrupts (after
>> disabling irqs). But "check_hv_pending_irq_enable=>do_exc_hv" can then
>> call "pv_native_irq_enable" again in the interrupt handling path and
>> execute the same loop?
> 
> 
> I have not hit this issue. Thanks for the report. I will double-check
> and report back.

Thank you!

More testing with the patches: after I commented out "do_exc_hv" in the
pv_native_irq_enable()->check_hv_pending_irq_enable() code path, I now
get the stack trace below [2] repeatedly when I dump the stack.

It seems to me that, after the IST stack return from #VC handling for
"native_cpuid", paranoid_exit => "do_exc_hv" is handling interrupts.
As we don't disable interrupts in check_hv_pending()=>do_exc_hv(),
interrupts are handled continuously here. This also prevents the boot
processor from making progress, and it gets stuck here.

Thoughts, please? I might be missing some important details here.

Thanks,
Pankaj

[2]

[   59.845396] Call Trace:
[   59.845703]  <TASK>
[   59.845980]  dump_stack_lvl+0x4d/0x67
[   59.846432]  dump_stack+0x14/0x1a
[   59.846842]  do_exc_hv.cold+0x22/0xfd
[   59.847301]  check_hv_pending+0x38/0x50
[   59.847773]  paranoid_exit+0x8/0x70
[   59.848205] RIP: 0010:native_cpuid+0x19/0x30
[   59.848729] Code: 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 55 49 89 f8 49 89 c9 48 89 d7 41 8b 00 48 89 e5 53 8b 0a 0f a2 <41> 89 00 89 1e 48 8b 5d f8 89 0f 41 89 11 c9 e9 f7 bc df 00 0f 1f
[   59.850995] RSP: 0000:ffffffffbd403e48 EFLAGS: 00010202
[   59.851636] RAX: 000000000100007b RBX: 0000000000000000 RCX: 0000000000000000
[   59.852498] RDX: 0000000000000000 RSI: ffffffffbd403e64 RDI: ffffffffbd403e68
[   59.853361] RBP: ffffffffbd403e50 R08: ffffffffbd403e60 R09: ffffffffbd403e6c
[   59.854240] R10: ffffffffbd403d10 R11: ffff9af5bff3cfe8 R12: 0000000000000056
[   59.855111] R13: ffff9af5bffc8e40 R14: 0000000000000000 R15: ffffffffbd41a120
[   59.855976]  kvm_arch_para_features+0x4e/0x80
[   59.856511]  pv_ipi_supported+0xe/0x34
[   59.856973]  kvm_apic_init+0x12/0x3f
[   59.857414]  apic_intr_mode_init+0x8d/0x10d
[   59.857939]  x86_late_time_init+0x28/0x3d
[   59.858435]  start_kernel+0x8af/0x970
[   59.858894]  x86_64_start_reservations+0x28/0x2e
[   59.859461]  x86_64_start_kernel+0x96/0xa0
[   59.859965]  secondary_startup_64_no_verify+0xe5/0xeb
[   59.860583]  </TASK>


* Re: [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler
  2023-01-22  2:46 ` [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler Tianyu Lan
  2023-01-23  7:33   ` Gupta, Pankaj
  2023-03-09 11:48   ` Gupta, Pankaj
@ 2023-03-31 15:57   ` Borislav Petkov
  2023-04-03 18:09     ` Tianyu Lan
  2 siblings, 1 reply; 60+ messages in thread
From: Borislav Petkov @ 2023-03-31 15:57 UTC (permalink / raw)
  To: Tianyu Lan
  Cc: luto, tglx, mingo, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu,
	linux-kernel, kvm, linux-hyperv, linux-arch

On Sat, Jan 21, 2023 at 09:46:02PM -0500, Tianyu Lan wrote:
> From: Tianyu Lan <tiala@microsoft.com>
> 
> Add a #HV exception handler that uses IST stack.
> 
> Signed-off-by: Tianyu Lan <tiala@microsoft.com>
> ---
> Change since RFC V2:
>        * Remove unnecessary line in the change log.
> ---
>  arch/x86/entry/entry_64.S             | 58 +++++++++++++++++++++++++++
>  arch/x86/include/asm/cpu_entry_area.h |  6 +++
>  arch/x86/include/asm/idtentry.h       | 39 +++++++++++++++++-
>  arch/x86/include/asm/page_64_types.h  |  1 +
>  arch/x86/include/asm/trapnr.h         |  1 +
>  arch/x86/include/asm/traps.h          |  1 +
>  arch/x86/kernel/cpu/common.c          |  1 +
>  arch/x86/kernel/dumpstack_64.c        |  9 ++++-
>  arch/x86/kernel/idt.c                 |  1 +
>  arch/x86/kernel/sev.c                 | 53 ++++++++++++++++++++++++
>  arch/x86/kernel/traps.c               | 40 ++++++++++++++++++
>  arch/x86/mm/cpu_entry_area.c          |  2 +
>  12 files changed, 209 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> index 15739a2c0983..6baec7653f19 100644
> --- a/arch/x86/entry/entry_64.S
> +++ b/arch/x86/entry/entry_64.S
> @@ -563,6 +563,64 @@ SYM_CODE_START(\asmsym)
>  .Lfrom_usermode_switch_stack_\@:
>  	idtentry_body user_\cfunc, has_error_code=1
>  
> +_ASM_NOKPROBE(\asmsym)
> +SYM_CODE_END(\asmsym)
> +.endm
> +/*
> + * idtentry_hv - Macro to generate entry stub for #HV
> + * @vector:		Vector number
> + * @asmsym:		ASM symbol for the entry point
> + * @cfunc:		C function to be called
> + *
> + * The macro emits code to set up the kernel context for #HV. The #HV handler
> + * runs on an IST stack and needs to be able to support nested #HV exceptions.
> + *
> + * To make this work the #HV entry code tries its best to pretend it doesn't use
> + * an IST stack by switching to the task stack if coming from user-space (which
> + * includes early SYSCALL entry path) or back to the stack in the IRET frame if
> + * entered from kernel-mode.
> + *
> + * If entered from kernel-mode the return stack is validated first, and if it is
> + * not safe to use (e.g. because it points to the entry stack) the #HV handler
> + * will switch to a fall-back stack (HV2) and call a special handler function.
> + *
> + * The macro is only used for one vector, but it is planned to be extended in
> + * the future for the #HV exception.
> + */
> +.macro idtentry_hv vector asmsym cfunc
> +SYM_CODE_START(\asmsym)

...

Why is there so much duplicated code here instead of sharing it with
idtentry_vc and all the facilities it provides?

> +	UNWIND_HINT_IRET_REGS
> +	ASM_CLAC
> +	pushq	$-1			/* ORIG_RAX: no syscall to restart */
> +
> +	testb	$3, CS-ORIG_RAX(%rsp)
> +	jnz	.Lfrom_usermode_switch_stack_\@
> +
> +	call	paranoid_entry
> +
> +	UNWIND_HINT_REGS
> +
> +	/*
> +	 * Switch off the IST stack to make it free for nested exceptions.
> +	 */
> +	movq	%rsp, %rdi		/* pt_regs pointer */
> +	call	hv_switch_off_ist
> +	movq	%rax, %rsp		/* Switch to new stack */
> +
> +	UNWIND_HINT_REGS
> +
> +	/* Update pt_regs */
> +	movq	ORIG_RAX(%rsp), %rsi	/* get error code into 2nd argument*/
> +	movq	$-1, ORIG_RAX(%rsp)	/* no syscall to restart */
> +
> +	movq	%rsp, %rdi		/* pt_regs pointer */
> +	call	kernel_\cfunc
> +
> +	jmp	paranoid_exit
> +
> +.Lfrom_usermode_switch_stack_\@:
> +	idtentry_body user_\cfunc, has_error_code=1
> +
>  _ASM_NOKPROBE(\asmsym)
>  SYM_CODE_END(\asmsym)
>  .endm
> diff --git a/arch/x86/include/asm/cpu_entry_area.h b/arch/x86/include/asm/cpu_entry_area.h
> index 462fc34f1317..2186ed601b4a 100644
> --- a/arch/x86/include/asm/cpu_entry_area.h
> +++ b/arch/x86/include/asm/cpu_entry_area.h
> @@ -30,6 +30,10 @@
>  	char	VC_stack[optional_stack_size];			\
>  	char	VC2_stack_guard[guardsize];			\
>  	char	VC2_stack[optional_stack_size];			\
> +	char	HV_stack_guard[guardsize];			\
> +	char	HV_stack[optional_stack_size];			\
> +	char	HV2_stack_guard[guardsize];			\
> +	char	HV2_stack[optional_stack_size];			\
>  	char	IST_top_guard[guardsize];			\
>  
>  /* The exception stacks' physical storage. No guard pages required */
> @@ -52,6 +56,8 @@ enum exception_stack_ordering {
>  	ESTACK_MCE,
>  	ESTACK_VC,
>  	ESTACK_VC2,
> +	ESTACK_HV,
> +	ESTACK_HV2,
>  	N_EXCEPTION_STACKS

Ditto.

And so on...

Please share code - not duplicate.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


* Re: [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler
  2023-03-31 15:57   ` Borislav Petkov
@ 2023-04-03 18:09     ` Tianyu Lan
  0 siblings, 0 replies; 60+ messages in thread
From: Tianyu Lan @ 2023-04-03 18:09 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: luto, tglx, mingo, dave.hansen, x86, hpa, seanjc, pbonzini,
	jgross, tiala, kirill, jiangshan.ljs, peterz, ashish.kalra,
	srutherford, akpm, anshuman.khandual, pawan.kumar.gupta,
	adrian.hunter, daniel.sneddon, alexander.shishkin, sandipan.das,
	ray.huang, brijesh.singh, michael.roth, thomas.lendacky,
	venu.busireddy, sterritt, tony.luck, samitolvanen, fenghua.yu,
	linux-kernel, kvm, linux-hyperv, linux-arch

On 3/31/2023 11:57 PM, Borislav Petkov wrote:
>> + *
>> + * If entered from kernel-mode the return stack is validated first, and if it is
>> + * not safe to use (e.g. because it points to the entry stack) the #HV handler
>> + * will switch to a fall-back stack (HV2) and call a special handler function.
>> + *
>> + * The macro is only used for one vector, but it is planned to be extended in
>> + * the future for the #HV exception.
>> + */
>> +.macro idtentry_hv vector asmsym cfunc
>> +SYM_CODE_START(\asmsym)
> ...
> 
> why is this so much duplicated code instead of sharing it with
> idtentry_vc and all the facilities it does?
> 

Hi Boris:
	#VC and #HV use different stacks. I tried reusing the #VC code path
for #HV, but it doesn't work. I will continue working in this direction
and report back later. In RFC v4, I will still keep the old version so
that the other patches can be reviewed in parallel.

Thanks.


end of thread, other threads:[~2023-04-03 18:10 UTC | newest]

Thread overview: 60+ messages
2023-01-22  2:45 [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Tianyu Lan
2023-01-22  2:45 ` [RFC PATCH V3 01/16] x86/hyperv: Add sev-snp enlightened guest specific config Tianyu Lan
2023-01-31 17:34   ` Michael Kelley (LINUX)
2023-02-02  4:01     ` Tianyu Lan
2023-01-22  2:45 ` [RFC PATCH V3 02/16] x86/hyperv: Decrypt hv vp assist page in sev-snp enlightened guest Tianyu Lan
2023-01-22  2:45 ` [RFC PATCH V3 03/16] x86/hyperv: Set Virtual Trust Level in vmbus init message Tianyu Lan
2023-01-31 17:55   ` Michael Kelley (LINUX)
2023-02-03  3:32     ` Tianyu Lan
2023-01-22  2:45 ` [RFC PATCH V3 04/16] x86/hyperv: Use vmmcall to implement Hyper-V hypercall in sev-snp enlightened guest Tianyu Lan
2023-01-22  2:45 ` [RFC PATCH V3 05/16] clocksource/drivers/hyper-v: decrypt hyperv tsc page " Tianyu Lan
2023-01-22  2:45 ` [RFC PATCH V3 06/16] x86/hyperv: decrypt vmbus pages for " Tianyu Lan
2023-01-31 17:58   ` Michael Kelley (LINUX)
2023-02-03  4:11     ` Tianyu Lan
2023-01-22  2:45 ` [RFC PATCH V3 07/16] drivers: hv: Decrypt percpu hvcall input arg page in " Tianyu Lan
2023-01-31 18:02   ` Michael Kelley (LINUX)
2023-02-03  5:23     ` Tianyu Lan
2023-01-22  2:45 ` [RFC PATCH V3 08/16] x86/hyperv: Initialize cpu and memory for " Tianyu Lan
2023-01-31 18:20   ` Michael Kelley (LINUX)
2023-02-03  5:58     ` Tianyu Lan
2023-01-22  2:45 ` [RFC PATCH V3 09/16] x86/hyperv: SEV-SNP enlightened guest don't support legacy rtc Tianyu Lan
2023-01-31 14:03   ` Wei Liu
2023-02-02  3:43     ` Tianyu Lan
2023-01-22  2:46 ` [RFC PATCH V3 10/16] x86/hyperv: Add smp support for sev-snp guest Tianyu Lan
2023-01-23 15:30   ` Tom Lendacky
2023-02-03  7:00     ` Tianyu Lan
2023-02-06 20:11       ` Borislav Petkov
2023-02-07 13:49         ` Tianyu Lan
2023-01-31 18:34   ` Michael Kelley (LINUX)
2023-02-03  6:10     ` Tianyu Lan
2023-01-22  2:46 ` [RFC PATCH V3 11/16] x86/hyperv: Add hyperv-specific hadling for VMMCALL under SEV-ES Tianyu Lan
2023-01-22  2:46 ` [RFC PATCH V3 12/16] x86/sev: Add a #HV exception handler Tianyu Lan
2023-01-23  7:33   ` Gupta, Pankaj
2023-02-03  7:27     ` Tianyu Lan
2023-02-16 13:50       ` Gupta, Pankaj
2023-03-09 11:48   ` Gupta, Pankaj
2023-03-10 15:48     ` Tianyu Lan
2023-03-31 15:57   ` Borislav Petkov
2023-04-03 18:09     ` Tianyu Lan
2023-01-22  2:46 ` [RFC PATCH V3 13/16] x86/sev: Add Check of #HV event in path Tianyu Lan
2023-03-01 11:11   ` Gupta, Pankaj
2023-03-08 16:18     ` Gupta, Pankaj
2023-03-10 15:59       ` Tianyu Lan
2023-01-22  2:46 ` [RFC PATCH V3 14/16] x86/sev: Initialize #HV doorbell and handle interrupt requests Tianyu Lan
2023-02-16 14:46   ` Gupta, Pankaj
2023-02-17 12:45   ` Gupta, Pankaj
2023-03-01 19:34   ` Gupta, Pankaj
2023-01-22  2:46 ` [RFC PATCH V3 15/16] x86/sev: optimize system vector processing invoked from #HV exception Tianyu Lan
2023-01-22  2:46 ` [RFC PATCH V3 16/16] x86/sev: Fix interrupt exit code paths " Tianyu Lan
2023-02-02 23:20   ` Zhi Wang
2023-02-08 23:53     ` Kalra, Ashish
2023-02-21 16:44   ` Gupta, Pankaj
2023-03-10 16:02     ` Tianyu Lan
2023-02-02 23:00 ` [RFC PATCH V3 00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv Zhi Wang
2023-02-03  4:04   ` Michael Kelley (LINUX)
2023-02-09 11:36 ` Gupta, Pankaj
2023-02-17 12:47   ` Gupta, Pankaj
2023-02-18  7:15     ` Tianyu Lan
2023-03-10 15:35     ` Gupta, Pankaj
2023-03-10 16:19       ` Tianyu Lan
2023-03-15  6:40         ` Gupta, Pankaj
