linux-hyperv.vger.kernel.org archive mirror
* [PATCH 0/6] Add support running nested Microsoft Hypervisor
@ 2022-11-02 14:00 Jinank Jain
  2022-11-02 14:00 ` [PATCH 1/6] mshv: Add support for detecting nested hypervisor Jinank Jain
                   ` (6 more replies)
  0 siblings, 7 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-02 14:00 UTC (permalink / raw)
  Cc: kys, haiyangz, sthemmin, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, x86, hpa, arnd, peterz, jpoimboe, jinankjain,
	seanjc, kirill.shutemov, ak, sathyanarayanan.kuppuswamy,
	linux-hyperv, linux-kernel, linux-arch

From: Jinank Jain <jinankjain@microsoft.com>

This patch series adds support for running a nested Microsoft
Hypervisor. In the nested case, a few privileged hypercalls need to go
to the L0 hypervisor instead of the L1 hypervisor. This series
identifies those hypercalls and replaces them with nested hypercalls.

Jinank Jain (6):
  mshv: Add support for detecting nested hypervisor
  hv: Setup synic registers in case of nested root partition
  hv: Set the correct EOM register in case of nested hypervisor
  hv: Add an interface to do nested hypercalls
  hv: Enable vmbus driver for nested root partition
  hv, mshv : Change interrupt vector for nested root partition

 arch/x86/include/asm/hyperv-tlfs.h | 17 ++++++++-
 arch/x86/include/asm/idtentry.h    |  2 ++
 arch/x86/include/asm/irq_vectors.h |  6 ++++
 arch/x86/include/asm/mshyperv.h    | 42 ++++++++++++++++++++---
 arch/x86/kernel/cpu/mshyperv.c     | 22 ++++++++++++
 arch/x86/kernel/idt.c              |  9 +++++
 drivers/hv/hv.c                    | 55 ++++++++++++++++++------------
 drivers/hv/vmbus_drv.c             |  5 +--
 include/asm-generic/hyperv-tlfs.h  |  1 +
 include/asm-generic/mshyperv.h     |  7 +++-
 10 files changed, 137 insertions(+), 29 deletions(-)

-- 
2.25.1



* [PATCH 1/6] mshv: Add support for detecting nested hypervisor
  2022-11-02 14:00 [PATCH 0/6] Add support running nested Microsoft Hypervisor Jinank Jain
@ 2022-11-02 14:00 ` Jinank Jain
  2022-11-02 14:00 ` [PATCH 2/6] hv: Setup synic registers in case of nested root partition Jinank Jain
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-02 14:00 UTC (permalink / raw)
  Cc: kys, haiyangz, sthemmin, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, x86, hpa, arnd, peterz, jpoimboe, jinankjain,
	seanjc, kirill.shutemov, ak, sathyanarayanan.kuppuswamy,
	linux-hyperv, linux-kernel, linux-arch

When Linux runs as a root partition for Microsoft Hypervisor, it can
detect whether it is running on a nested hypervisor using hints exposed
by mshv. While at it, expose a new variable called hv_nested, which can
be used later for making decisions specific to the nested use case.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
---
 arch/x86/include/asm/hyperv-tlfs.h | 3 +++
 arch/x86/kernel/cpu/mshyperv.c     | 7 +++++++
 include/asm-generic/mshyperv.h     | 2 ++
 3 files changed, 12 insertions(+)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 3089ec352743..d9a611565859 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -114,6 +114,9 @@
 /* Recommend using the newer ExProcessorMasks interface */
 #define HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED		BIT(11)
 
+/* Indicates that the hypervisor is nested within a Hyper-V partition. */
+#define HV_X64_HYPERV_NESTED				BIT(12)
+
 /* Recommend using enlightened VMCS */
 #define HV_X64_ENLIGHTENED_VMCS_RECOMMENDED		BIT(14)
 
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 831613959a92..2555535f5237 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -37,6 +37,8 @@
 
 /* Is Linux running as the root partition? */
 bool hv_root_partition;
+/* Is Linux running on nested Microsoft Hypervisor */
+bool hv_nested;
 struct ms_hyperv_info ms_hyperv;
 
 #if IS_ENABLED(CONFIG_HYPERV)
@@ -301,6 +303,11 @@ static void __init ms_hyperv_init_platform(void)
 		pr_info("Hyper-V: running as root partition\n");
 	}
 
+	if (ms_hyperv.hints & HV_X64_HYPERV_NESTED) {
+		hv_nested = true;
+		pr_info("Hyper-V: Linux running on a nested hypervisor\n");
+	}
+
 	/*
 	 * Extract host information.
 	 */
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index bfb9eb9d7215..49d2e9274379 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -115,6 +115,8 @@ static inline u64 hv_generate_guest_id(u64 kernel_version)
 	return guest_id;
 }
 
+extern bool hv_nested;
+
 /* Free the message slot and signal end-of-message if required */
 static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
 {
-- 
2.25.1
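
The detection logic in this patch comes down to a single bit test on the
hints word. A minimal userspace sketch of that test follows; `struct
hv_state`, `HINT_NESTED` and `hv_detect_nested()` are hypothetical names
used only for illustration, with the bit position taken from
HV_X64_HYPERV_NESTED above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 12 of the hints word, as HV_X64_HYPERV_NESTED is defined in the patch. */
#define HINT_NESTED (UINT32_C(1) << 12)

/* Hypothetical stand-in for the kernel state touched by the patch. */
struct hv_state {
	uint32_t hints;	/* enlightenment hints word */
	bool nested;	/* mirrors the new hv_nested variable */
};

/* Derive the nested flag from the hints word and return the new value. */
static bool hv_detect_nested(struct hv_state *st)
{
	assert(st != NULL);
	st->nested = (st->hints & HINT_NESTED) != 0;
	return st->nested;
}
```

With `hints = 0x1000` the flag is set; any hints value with bit 12 clear
leaves it false.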



* [PATCH 2/6] hv: Setup synic registers in case of nested root partition
  2022-11-02 14:00 [PATCH 0/6] Add support running nested Microsoft Hypervisor Jinank Jain
  2022-11-02 14:00 ` [PATCH 1/6] mshv: Add support for detecting nested hypervisor Jinank Jain
@ 2022-11-02 14:00 ` Jinank Jain
  2022-11-02 14:58   ` Vitaly Kuznetsov
  2022-11-02 14:00 ` [PATCH 3/6] hv: Set the correct EOM register in case of nested hypervisor Jinank Jain
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 16+ messages in thread
From: Jinank Jain @ 2022-11-02 14:00 UTC (permalink / raw)
  Cc: kys, haiyangz, sthemmin, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, x86, hpa, arnd, peterz, jpoimboe, jinankjain,
	seanjc, kirill.shutemov, ak, sathyanarayanan.kuppuswamy,
	linux-hyperv, linux-kernel, linux-arch

Child partitions are free to allocate the SynIC message and event pages,
but a root partition must use the pages allocated by the Microsoft
Hypervisor (MSHV). The base addresses of these pages can be read from
synthetic MSRs exposed by MSHV. These MSRs differ slightly between
nested and non-nested root partitions.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
---
 arch/x86/include/asm/hyperv-tlfs.h | 11 ++++++
 drivers/hv/hv.c                    | 55 ++++++++++++++++++------------
 2 files changed, 45 insertions(+), 21 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index d9a611565859..0319091e2019 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -225,6 +225,17 @@ enum hv_isolation_type {
 #define HV_REGISTER_SINT14			0x4000009E
 #define HV_REGISTER_SINT15			0x4000009F
 
+/*
+ * Define synthetic interrupt controller model specific registers for
+ * nested hypervisor.
+ */
+#define HV_REGISTER_NESTED_SCONTROL            0x40001080
+#define HV_REGISTER_NESTED_SVERSION            0x40001081
+#define HV_REGISTER_NESTED_SIEFP               0x40001082
+#define HV_REGISTER_NESTED_SIMP                0x40001083
+#define HV_REGISTER_NESTED_EOM                 0x40001084
+#define HV_REGISTER_NESTED_SINT0               0x40001090
+
 /*
  * Synthetic Timer MSRs. Four timers per vcpu.
  */
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 4d6480d57546..92ee910561c4 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -25,6 +25,11 @@
 /* The one and only */
 struct hv_context hv_context;
 
+#define REG_SIMP (hv_nested ? HV_REGISTER_NESTED_SIMP : HV_REGISTER_SIMP)
+#define REG_SIEFP (hv_nested ? HV_REGISTER_NESTED_SIEFP : HV_REGISTER_SIEFP)
+#define REG_SCTRL (hv_nested ? HV_REGISTER_NESTED_SCONTROL : HV_REGISTER_SCONTROL)
+#define REG_SINT0 (hv_nested ? HV_REGISTER_NESTED_SINT0 : HV_REGISTER_SINT0)
+
 /*
  * hv_init - Main initialization routine.
  *
@@ -147,7 +152,7 @@ int hv_synic_alloc(void)
 		 * Synic message and event pages are allocated by paravisor.
 		 * Skip these pages allocation here.
 		 */
-		if (!hv_isolation_type_snp()) {
+		if (!hv_isolation_type_snp() && !hv_root_partition) {
 			hv_cpu->synic_message_page =
 				(void *)get_zeroed_page(GFP_ATOMIC);
 			if (hv_cpu->synic_message_page == NULL) {
@@ -188,8 +193,16 @@ void hv_synic_free(void)
 		struct hv_per_cpu_context *hv_cpu
 			= per_cpu_ptr(hv_context.cpu_context, cpu);
 
-		free_page((unsigned long)hv_cpu->synic_event_page);
-		free_page((unsigned long)hv_cpu->synic_message_page);
+		if (hv_root_partition) {
+			if (hv_cpu->synic_event_page != NULL)
+				memunmap(hv_cpu->synic_event_page);
+
+			if (hv_cpu->synic_message_page != NULL)
+				memunmap(hv_cpu->synic_message_page);
+		} else {
+			free_page((unsigned long)hv_cpu->synic_event_page);
+			free_page((unsigned long)hv_cpu->synic_message_page);
+		}
 		free_page((unsigned long)hv_cpu->post_msg_page);
 	}
 
@@ -213,10 +226,10 @@ void hv_synic_enable_regs(unsigned int cpu)
 	union hv_synic_scontrol sctrl;
 
 	/* Setup the Synic's message page */
-	simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
+	simp.as_uint64 = hv_get_register(REG_SIMP);
 	simp.simp_enabled = 1;
 
-	if (hv_isolation_type_snp()) {
+	if (hv_isolation_type_snp() || hv_root_partition) {
 		hv_cpu->synic_message_page
 			= memremap(simp.base_simp_gpa << HV_HYP_PAGE_SHIFT,
 				   HV_HYP_PAGE_SIZE, MEMREMAP_WB);
@@ -227,13 +240,13 @@ void hv_synic_enable_regs(unsigned int cpu)
 			>> HV_HYP_PAGE_SHIFT;
 	}
 
-	hv_set_register(HV_REGISTER_SIMP, simp.as_uint64);
+	hv_set_register(REG_SIMP, simp.as_uint64);
 
 	/* Setup the Synic's event page */
-	siefp.as_uint64 = hv_get_register(HV_REGISTER_SIEFP);
+	siefp.as_uint64 = hv_get_register(REG_SIEFP);
 	siefp.siefp_enabled = 1;
 
-	if (hv_isolation_type_snp()) {
+	if (hv_isolation_type_snp() || hv_root_partition) {
 		hv_cpu->synic_event_page =
 			memremap(siefp.base_siefp_gpa << HV_HYP_PAGE_SHIFT,
 				 HV_HYP_PAGE_SIZE, MEMREMAP_WB);
@@ -245,12 +258,12 @@ void hv_synic_enable_regs(unsigned int cpu)
 			>> HV_HYP_PAGE_SHIFT;
 	}
 
-	hv_set_register(HV_REGISTER_SIEFP, siefp.as_uint64);
+	hv_set_register(REG_SIEFP, siefp.as_uint64);
 
 	/* Setup the shared SINT. */
 	if (vmbus_irq != -1)
 		enable_percpu_irq(vmbus_irq, 0);
-	shared_sint.as_uint64 = hv_get_register(HV_REGISTER_SINT0 +
+	shared_sint.as_uint64 = hv_get_register(REG_SINT0 +
 					VMBUS_MESSAGE_SINT);
 
 	shared_sint.vector = vmbus_interrupt;
@@ -266,14 +279,14 @@ void hv_synic_enable_regs(unsigned int cpu)
 #else
 	shared_sint.auto_eoi = 0;
 #endif
-	hv_set_register(HV_REGISTER_SINT0 + VMBUS_MESSAGE_SINT,
+	hv_set_register(REG_SINT0 + VMBUS_MESSAGE_SINT,
 				shared_sint.as_uint64);
 
 	/* Enable the global synic bit */
-	sctrl.as_uint64 = hv_get_register(HV_REGISTER_SCONTROL);
+	sctrl.as_uint64 = hv_get_register(REG_SCTRL);
 	sctrl.enable = 1;
 
-	hv_set_register(HV_REGISTER_SCONTROL, sctrl.as_uint64);
+	hv_set_register(REG_SCTRL, sctrl.as_uint64);
 }
 
 int hv_synic_init(unsigned int cpu)
@@ -297,17 +310,17 @@ void hv_synic_disable_regs(unsigned int cpu)
 	union hv_synic_siefp siefp;
 	union hv_synic_scontrol sctrl;
 
-	shared_sint.as_uint64 = hv_get_register(HV_REGISTER_SINT0 +
+	shared_sint.as_uint64 = hv_get_register(REG_SINT0 +
 					VMBUS_MESSAGE_SINT);
 
 	shared_sint.masked = 1;
 
 	/* Need to correctly cleanup in the case of SMP!!! */
 	/* Disable the interrupt */
-	hv_set_register(HV_REGISTER_SINT0 + VMBUS_MESSAGE_SINT,
+	hv_set_register(REG_SINT0 + VMBUS_MESSAGE_SINT,
 				shared_sint.as_uint64);
 
-	simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
+	simp.as_uint64 = hv_get_register(REG_SIMP);
 	/*
 	 * In Isolation VM, sim and sief pages are allocated by
 	 * paravisor. These pages also will be used by kdump
@@ -320,9 +333,9 @@ void hv_synic_disable_regs(unsigned int cpu)
 	else
 		simp.base_simp_gpa = 0;
 
-	hv_set_register(HV_REGISTER_SIMP, simp.as_uint64);
+	hv_set_register(REG_SIMP, simp.as_uint64);
 
-	siefp.as_uint64 = hv_get_register(HV_REGISTER_SIEFP);
+	siefp.as_uint64 = hv_get_register(REG_SIEFP);
 	siefp.siefp_enabled = 0;
 
 	if (hv_isolation_type_snp())
@@ -330,12 +343,12 @@ void hv_synic_disable_regs(unsigned int cpu)
 	else
 		siefp.base_siefp_gpa = 0;
 
-	hv_set_register(HV_REGISTER_SIEFP, siefp.as_uint64);
+	hv_set_register(REG_SIEFP, siefp.as_uint64);
 
 	/* Disable the global synic bit */
-	sctrl.as_uint64 = hv_get_register(HV_REGISTER_SCONTROL);
+	sctrl.as_uint64 = hv_get_register(REG_SCTRL);
 	sctrl.enable = 0;
-	hv_set_register(HV_REGISTER_SCONTROL, sctrl.as_uint64);
+	hv_set_register(REG_SCTRL, sctrl.as_uint64);
 
 	if (vmbus_irq != -1)
 		disable_percpu_irq(vmbus_irq);
-- 
2.25.1
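
The two paths in hv_synic_enable_regs() differ in who supplies the page
address: a root partition reads it out of the SIMP/SIEFP value, while a
child partition writes its own page into that value. The sketch below
shows the arithmetic only. It assumes the register layout used by the
kernel unions (enable bit at bit 0, page frame number from bit 12 up)
and a 4 KiB hypervisor page; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors HV_HYP_PAGE_SHIFT (4 KiB hypervisor pages). */
#define HYP_PAGE_SHIFT 12
#define HYP_PAGE_MASK  ((UINT64_C(1) << HYP_PAGE_SHIFT) - 1)

/* Root partition path: recover the page address chosen by the hypervisor. */
static uint64_t synic_page_gpa(uint64_t reg_value)
{
	uint64_t pfn = reg_value >> HYP_PAGE_SHIFT;	/* base_*_gpa field */

	return pfn << HYP_PAGE_SHIFT;
}

/* Child partition path: install a page the guest allocated itself. */
static uint64_t synic_set_base(uint64_t reg_value, uint64_t gpa)
{
	assert((gpa & HYP_PAGE_MASK) == 0);	/* must be page aligned */
	return gpa | (reg_value & HYP_PAGE_MASK);
}

/* Both paths then set the enable bit before writing the register back. */
static uint64_t synic_enable(uint64_t reg_value)
{
	return reg_value | UINT64_C(1);
}
```

In the patch, the address from the first helper is what gets passed to
memremap(), which is why hv_synic_free() has to call memunmap() rather
than free_page() for a root partition.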



* [PATCH 3/6] hv: Set the correct EOM register in case of nested hypervisor
  2022-11-02 14:00 [PATCH 0/6] Add support running nested Microsoft Hypervisor Jinank Jain
  2022-11-02 14:00 ` [PATCH 1/6] mshv: Add support for detecting nested hypervisor Jinank Jain
  2022-11-02 14:00 ` [PATCH 2/6] hv: Setup synic registers in case of nested root partition Jinank Jain
@ 2022-11-02 14:00 ` Jinank Jain
  2022-11-02 14:00 ` [PATCH 4/6] hv: Add an interface to do nested hypercalls Jinank Jain
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-02 14:00 UTC (permalink / raw)
  Cc: kys, haiyangz, sthemmin, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, x86, hpa, arnd, peterz, jpoimboe, jinankjain,
	seanjc, kirill.shutemov, ak, sathyanarayanan.kuppuswamy,
	linux-hyperv, linux-kernel, linux-arch

Currently we use the default EOM register. This needs to change when
running under a nested MSHV setup.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
---
 include/asm-generic/mshyperv.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index 49d2e9274379..7256e2cb7b67 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -117,6 +117,8 @@ static inline u64 hv_generate_guest_id(u64 kernel_version)
 
 extern bool hv_nested;
 
+#define REG_EOM (hv_nested ? HV_REGISTER_NESTED_EOM : HV_REGISTER_EOM)
+
 /* Free the message slot and signal end-of-message if required */
 static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
 {
@@ -148,7 +150,7 @@ static inline void vmbus_signal_eom(struct hv_message *msg, u32 old_msg_type)
 		 * possibly deliver another msg from the
 		 * hypervisor
 		 */
-		hv_set_register(HV_REGISTER_EOM, 0);
+		hv_set_register(REG_EOM, 0);
 	}
 }
 
-- 
2.25.1



* [PATCH 4/6] hv: Add an interface to do nested hypercalls
  2022-11-02 14:00 [PATCH 0/6] Add support running nested Microsoft Hypervisor Jinank Jain
                   ` (2 preceding siblings ...)
  2022-11-02 14:00 ` [PATCH 3/6] hv: Set the correct EOM register in case of nested hypervisor Jinank Jain
@ 2022-11-02 14:00 ` Jinank Jain
  2022-11-02 14:00 ` [PATCH 5/6] hv: Enable vmbus driver for nested root partition Jinank Jain
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-02 14:00 UTC (permalink / raw)
  Cc: kys, haiyangz, sthemmin, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, x86, hpa, arnd, peterz, jpoimboe, jinankjain,
	seanjc, kirill.shutemov, ak, sathyanarayanan.kuppuswamy,
	linux-hyperv, linux-kernel, linux-arch

According to the TLFS, communicating with the L0 hypervisor requires an
additional bit to be set in the hypercall control value. This
communication is required to perform privileged operations which can
only be performed by the L0 hypervisor. One example is setting up the
VMBus infrastructure.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
---
 arch/x86/include/asm/hyperv-tlfs.h |  3 ++-
 arch/x86/include/asm/mshyperv.h    | 42 +++++++++++++++++++++++++++---
 include/asm-generic/hyperv-tlfs.h  |  1 +
 include/asm-generic/mshyperv.h     |  1 +
 4 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 0319091e2019..fd066226f12b 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -380,7 +380,8 @@ struct hv_nested_enlightenments_control {
 		__u32 reserved:31;
 	} features;
 	struct {
-		__u32 reserved;
+		__u32 inter_partition_comm:1;
+		__u32 reserved:31;
 	} hypercallControls;
 } __packed;
 
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 61f0c206bff0..4ce9c3c9025d 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -74,10 +74,16 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 	return hv_status;
 }
 
+/* Hypercall to the L0 hypervisor */
+static inline u64 hv_do_nested_hypercall(u64 control, void *input, void *output)
+{
+	return hv_do_hypercall(control | HV_HYPERCALL_NESTED, input, output);
+}
+
 /* Fast hypercall with 8 bytes of input and no output */
-static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
+static inline u64 _hv_do_fast_hypercall8(u64 control, u16 code, u64 input1)
 {
-	u64 hv_status, control = (u64)code | HV_HYPERCALL_FAST_BIT;
+	u64 hv_status;
 
 #ifdef CONFIG_X86_64
 	{
@@ -105,10 +111,24 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
 		return hv_status;
 }
 
+static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
+{
+	u64 control = (u64)code | HV_HYPERCALL_FAST_BIT;
+
+	return _hv_do_fast_hypercall8(control, code, input1);
+}
+
+static inline u64 hv_do_fast_nested_hypercall8(u16 code, u64 input1)
+{
+	u64 control = (u64)code | HV_HYPERCALL_FAST_BIT | HV_HYPERCALL_NESTED;
+
+	return _hv_do_fast_hypercall8(control, code, input1);
+}
+
 /* Fast hypercall with 16 bytes of input */
-static inline u64 hv_do_fast_hypercall16(u16 code, u64 input1, u64 input2)
+static inline u64 _hv_do_fast_hypercall16(u64 control, u16 code, u64 input1, u64 input2)
 {
-	u64 hv_status, control = (u64)code | HV_HYPERCALL_FAST_BIT;
+	u64 hv_status;
 
 #ifdef CONFIG_X86_64
 	{
@@ -139,6 +159,20 @@ static inline u64 hv_do_fast_hypercall16(u16 code, u64 input1, u64 input2)
 	return hv_status;
 }
 
+static inline u64 hv_do_fast_hypercall16(u16 code, u64 input1, u64 input2)
+{
+	u64 control = (u64)code | HV_HYPERCALL_FAST_BIT;
+
+	return _hv_do_fast_hypercall16(control, code, input1, input2);
+}
+
+static inline u64 hv_do_fast_nested_hypercall16(u16 code, u64 input1, u64 input2)
+{
+	u64 control = (u64)code | HV_HYPERCALL_FAST_BIT | HV_HYPERCALL_NESTED;
+
+	return _hv_do_fast_hypercall16(control, code, input1, input2);
+}
+
 extern struct hv_vp_assist_page **hv_vp_assist_page;
 
 static inline struct hv_vp_assist_page *hv_get_vp_assist_page(unsigned int cpu)
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index fdce7a4cfc6f..c67836dd8468 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -185,6 +185,7 @@ enum HV_GENERIC_SET_FORMAT {
 #define HV_HYPERCALL_VARHEAD_OFFSET	17
 #define HV_HYPERCALL_VARHEAD_MASK	GENMASK_ULL(26, 17)
 #define HV_HYPERCALL_RSVD0_MASK		GENMASK_ULL(31, 27)
+#define HV_HYPERCALL_NESTED		BIT(31)
 #define HV_HYPERCALL_REP_COMP_OFFSET	32
 #define HV_HYPERCALL_REP_COMP_1		BIT_ULL(32)
 #define HV_HYPERCALL_REP_COMP_MASK	GENMASK_ULL(43, 32)
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index 7256e2cb7b67..86297ca74399 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -53,6 +53,7 @@ extern void * __percpu *hyperv_pcpu_input_arg;
 extern void * __percpu *hyperv_pcpu_output_arg;
 
 extern u64 hv_do_hypercall(u64 control, void *inputaddr, void *outputaddr);
+extern u64 hv_do_nested_hypercall(u64 control, void *inputaddr, void *outputaddr);
 extern u64 hv_do_fast_hypercall8(u16 control, u64 input8);
 extern bool hv_isolation_type_snp(void);
 
-- 
2.25.1
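
How the control value for a nested fast hypercall is composed can be
shown outside the kernel. This is a sketch under assumptions:
`HC_FAST_BIT` is assumed to match HV_HYPERCALL_FAST_BIT (bit 16),
`HC_NESTED_BIT` follows HV_HYPERCALL_NESTED from this patch (bit 31),
and the helper names are hypothetical. The assert reflects that bit 31
was reserved before this series, so existing callers should not have it
set already.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HC_FAST_BIT   (UINT64_C(1) << 16)	/* assumption: HV_HYPERCALL_FAST_BIT */
#define HC_NESTED_BIT (UINT64_C(1) << 31)	/* HV_HYPERCALL_NESTED in this patch */

/* Route an already-built control value to the L0 hypervisor. */
static uint64_t hc_make_nested(uint64_t control)
{
	assert((control & HC_NESTED_BIT) == 0);
	return control | HC_NESTED_BIT;
}

/* Build the control value for a fast hypercall, optionally nested. */
static uint64_t hc_fast_control(uint16_t code, bool nested)
{
	uint64_t control = (uint64_t)code | HC_FAST_BIT;

	return nested ? hc_make_nested(control) : control;
}
```

For an arbitrary call code of 0x005c, the non-nested control value is
0x1005c and the nested one is 0x8001005c; only bit 31 differs.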



* [PATCH 5/6] hv: Enable vmbus driver for nested root partition
  2022-11-02 14:00 [PATCH 0/6] Add support running nested Microsoft Hypervisor Jinank Jain
                   ` (3 preceding siblings ...)
  2022-11-02 14:00 ` [PATCH 4/6] hv: Add an interface to do nested hypercalls Jinank Jain
@ 2022-11-02 14:00 ` Jinank Jain
  2022-11-02 14:00 ` [PATCH 6/6] hv, mshv : Change interrupt vector " Jinank Jain
  2022-11-24  5:53 ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
  6 siblings, 0 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-02 14:00 UTC (permalink / raw)
  Cc: kys, haiyangz, sthemmin, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, x86, hpa, arnd, peterz, jpoimboe, jinankjain,
	seanjc, kirill.shutemov, ak, sathyanarayanan.kuppuswamy,
	linux-hyperv, linux-kernel, linux-arch

Currently the VMBus driver is not initialized for a root partition, but
it needs to be enabled for a nested root partition. This is required to
expose VMBus devices to the L2 guest in the nested setup.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
---
 drivers/hv/vmbus_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 8b2e413bf19c..2f0cf75e811b 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2723,7 +2723,7 @@ static int __init hv_acpi_init(void)
 	if (!hv_is_hyperv_initialized())
 		return -ENODEV;
 
-	if (hv_root_partition)
+	if (hv_root_partition && !hv_nested)
 		return 0;
 
 	/*
-- 
2.25.1



* [PATCH 6/6] hv, mshv : Change interrupt vector for nested root partition
  2022-11-02 14:00 [PATCH 0/6] Add support running nested Microsoft Hypervisor Jinank Jain
                   ` (4 preceding siblings ...)
  2022-11-02 14:00 ` [PATCH 5/6] hv: Enable vmbus driver for nested root partition Jinank Jain
@ 2022-11-02 14:00 ` Jinank Jain
  2022-11-02 15:46   ` Wei Liu
  2022-11-24  5:53 ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
  6 siblings, 1 reply; 16+ messages in thread
From: Jinank Jain @ 2022-11-02 14:00 UTC (permalink / raw)
  Cc: kys, haiyangz, sthemmin, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, x86, hpa, arnd, peterz, jpoimboe, jinankjain,
	seanjc, kirill.shutemov, ak, sathyanarayanan.kuppuswamy,
	linux-hyperv, linux-kernel, linux-arch

Traditionally we have used HYPERVISOR_CALLBACK_VECTOR to relay the VMBus
interrupt, but this does not work with a nested hypervisor. Microsoft
Hypervisor reserves 0x31 to 0x34 as the interrupt vector range for
VMBus, so we have to use one of the vectors from that range and set up
the IDT accordingly.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
---
 arch/x86/include/asm/idtentry.h    |  2 ++
 arch/x86/include/asm/irq_vectors.h |  6 ++++++
 arch/x86/kernel/cpu/mshyperv.c     | 15 +++++++++++++++
 arch/x86/kernel/idt.c              |  9 +++++++++
 drivers/hv/vmbus_drv.c             |  3 ++-
 5 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 72184b0b2219..c0648e3e4d4a 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -686,6 +686,8 @@ DECLARE_IDTENTRY_SYSVEC(POSTED_INTR_NESTED_VECTOR,	sysvec_kvm_posted_intr_nested
 DECLARE_IDTENTRY_SYSVEC(HYPERVISOR_CALLBACK_VECTOR,	sysvec_hyperv_callback);
 DECLARE_IDTENTRY_SYSVEC(HYPERV_REENLIGHTENMENT_VECTOR,	sysvec_hyperv_reenlightenment);
 DECLARE_IDTENTRY_SYSVEC(HYPERV_STIMER0_VECTOR,	sysvec_hyperv_stimer0);
+DECLARE_IDTENTRY_SYSVEC(HYPERV_INTR_NESTED_VMBUS_VECTOR,
+			sysvec_hyperv_nested_vmbus_intr);
 #endif
 
 #if IS_ENABLED(CONFIG_ACRN_GUEST)
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 43dcb9284208..729d19eab7f5 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -102,6 +102,12 @@
 #if IS_ENABLED(CONFIG_HYPERV)
 #define HYPERV_REENLIGHTENMENT_VECTOR	0xee
 #define HYPERV_STIMER0_VECTOR		0xed
+/*
+ * FIXME: Change this, once Microsoft Hypervisor changes its assumption
+ * around VMBus interrupt vector allocation for nested root partition.
+ * Or provides a better interface to detect this instead of hardcoding.
+ */
+#define HYPERV_INTR_NESTED_VMBUS_VECTOR	0x31
 #endif
 
 #define LOCAL_TIMER_VECTOR		0xec
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 2555535f5237..83aab88bf298 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -61,6 +61,21 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_hyperv_callback)
 	set_irq_regs(old_regs);
 }
 
+DEFINE_IDTENTRY_SYSVEC(sysvec_hyperv_nested_vmbus_intr)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	inc_irq_stat(irq_hv_callback_count);
+
+	if (vmbus_handler)
+		vmbus_handler();
+
+	if (ms_hyperv.hints & HV_DEPRECATING_AEOI_RECOMMENDED)
+		ack_APIC_irq();
+
+	set_irq_regs(old_regs);
+}
+
 void hv_setup_vmbus_handler(void (*handler)(void))
 {
 	vmbus_handler = handler;
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index a58c6bc1cd68..ace648856a0b 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -160,6 +160,15 @@ static const __initconst struct idt_data apic_idts[] = {
 # endif
 	INTG(SPURIOUS_APIC_VECTOR,		asm_sysvec_spurious_apic_interrupt),
 	INTG(ERROR_APIC_VECTOR,			asm_sysvec_error_interrupt),
+#ifdef CONFIG_HYPERV
+	/*
+	 * This is a hack: this handler cannot be installed via alloc_intr_gate()
+	 * because that function rejects vectors below FIRST_SYSTEM_VECTOR, and
+	 * Hyper-V does not accept anything other than 0x31-0x34 as the VMBus
+	 * interrupt vector in a nested setup.
+	 */
+	INTG(HYPERV_INTR_NESTED_VMBUS_VECTOR, asm_sysvec_hyperv_nested_vmbus_intr),
+#endif
 #endif
 };
 
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 2f0cf75e811b..e6fb77fb44b9 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2746,7 +2746,8 @@ static int __init hv_acpi_init(void)
 	 * normal Linux IRQ mechanism is not used in this case.
 	 */
 #ifdef HYPERVISOR_CALLBACK_VECTOR
-	vmbus_interrupt = HYPERVISOR_CALLBACK_VECTOR;
+	vmbus_interrupt = hv_nested ? HYPERV_INTR_NESTED_VMBUS_VECTOR :
+					    HYPERVISOR_CALLBACK_VECTOR;
 	vmbus_irq = -1;
 #endif
 
-- 
2.25.1
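
Taken together, patches 5 and 6 make two decisions in hv_acpi_init():
whether VMBus is initialized at all, and which vector it uses. A minimal
sketch of both decisions follows. The function names are hypothetical,
0xf3 is assumed to be the x86 value of HYPERVISOR_CALLBACK_VECTOR, and
the 0x31 to 0x34 range comes from the commit message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VEC_CALLBACK     0xf3	/* assumption: HYPERVISOR_CALLBACK_VECTOR on x86 */
#define VEC_NESTED_VMBUS 0x31	/* HYPERV_INTR_NESTED_VMBUS_VECTOR in this patch */
#define VEC_NESTED_FIRST 0x31	/* range reserved for VMBus by the hypervisor */
#define VEC_NESTED_LAST  0x34

/* Patch 5: a root partition skips VMBus unless it is nested. */
static bool vmbus_should_init(bool root_partition, bool nested)
{
	return !root_partition || nested;
}

/* Patch 6: nested setups must use a vector from the reserved range. */
static uint8_t vmbus_pick_vector(bool nested)
{
	uint8_t vector = nested ? VEC_NESTED_VMBUS : VEC_CALLBACK;

	if (nested)
		assert(vector >= VEC_NESTED_FIRST && vector <= VEC_NESTED_LAST);
	return vector;
}
```

Only the combination "root partition, not nested" skips VMBus; every
other combination initializes it, and only nested setups move away from
the callback vector.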



* Re: [PATCH 2/6] hv: Setup synic registers in case of nested root partition
  2022-11-02 14:00 ` [PATCH 2/6] hv: Setup synic registers in case of nested root partition Jinank Jain
@ 2022-11-02 14:58   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 16+ messages in thread
From: Vitaly Kuznetsov @ 2022-11-02 14:58 UTC (permalink / raw)
  To: Jinank Jain
  Cc: kys, haiyangz, sthemmin, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, x86, hpa, arnd, peterz, jpoimboe, jinankjain,
	seanjc, kirill.shutemov, ak, sathyanarayanan.kuppuswamy,
	linux-hyperv, linux-kernel, linux-arch

Jinank Jain <jinankjain@linux.microsoft.com> writes:

> Child partitions are free to allocate SynIC message and event page but in
> case of root partition it must use the pages allocated by Microsoft
> Hypervisor (MSHV). Base address for these pages can be found using
> synthetic MSRs exposed by MSHV. There is a slight difference in those MSRs
> for nested vs non-nested root partition.
>
> Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
> ---
>  arch/x86/include/asm/hyperv-tlfs.h | 11 ++++++
>  drivers/hv/hv.c                    | 55 ++++++++++++++++++------------
>  2 files changed, 45 insertions(+), 21 deletions(-)
>
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index d9a611565859..0319091e2019 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -225,6 +225,17 @@ enum hv_isolation_type {
>  #define HV_REGISTER_SINT14			0x4000009E
>  #define HV_REGISTER_SINT15			0x4000009F
>  
> +/*
> + * Define synthetic interrupt controller model specific registers for
> + * nested hypervisor.
> + */
> +#define HV_REGISTER_NESTED_SCONTROL            0x40001080
> +#define HV_REGISTER_NESTED_SVERSION            0x40001081
> +#define HV_REGISTER_NESTED_SIEFP               0x40001082
> +#define HV_REGISTER_NESTED_SIMP                0x40001083
> +#define HV_REGISTER_NESTED_EOM                 0x40001084
> +#define HV_REGISTER_NESTED_SINT0               0x40001090
> +
>  /*
>   * Synthetic Timer MSRs. Four timers per vcpu.
>   */
> diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
> index 4d6480d57546..92ee910561c4 100644
> --- a/drivers/hv/hv.c
> +++ b/drivers/hv/hv.c
> @@ -25,6 +25,11 @@
>  /* The one and only */
>  struct hv_context hv_context;
>  
> +#define REG_SIMP (hv_nested ? HV_REGISTER_NESTED_SIMP : HV_REGISTER_SIMP)
> +#define REG_SIEFP (hv_nested ? HV_REGISTER_NESTED_SIEFP : HV_REGISTER_SIEFP)
> +#define REG_SCTRL (hv_nested ? HV_REGISTER_NESTED_SCONTROL : HV_REGISTER_SCONTROL)
> +#define REG_SINT0 (hv_nested ? HV_REGISTER_NESTED_SINT0 : HV_REGISTER_SINT0)
> +
>  /*
>   * hv_init - Main initialization routine.
>   *
> @@ -147,7 +152,7 @@ int hv_synic_alloc(void)
>  		 * Synic message and event pages are allocated by paravisor.
>  		 * Skip these pages allocation here.
>  		 */
> -		if (!hv_isolation_type_snp()) {
> +		if (!hv_isolation_type_snp() && !hv_root_partition) {
>  			hv_cpu->synic_message_page =
>  				(void *)get_zeroed_page(GFP_ATOMIC);
>  			if (hv_cpu->synic_message_page == NULL) {
> @@ -188,8 +193,16 @@ void hv_synic_free(void)
>  		struct hv_per_cpu_context *hv_cpu
>  			= per_cpu_ptr(hv_context.cpu_context, cpu);
>  
> -		free_page((unsigned long)hv_cpu->synic_event_page);
> -		free_page((unsigned long)hv_cpu->synic_message_page);
> +		if (hv_root_partition) {
> +			if (hv_cpu->synic_event_page != NULL)
> +				memunmap(hv_cpu->synic_event_page);
> +
> +			if (hv_cpu->synic_message_page != NULL)
> +				memunmap(hv_cpu->synic_message_page);
> +		} else {
> +			free_page((unsigned long)hv_cpu->synic_event_page);
> +			free_page((unsigned long)hv_cpu->synic_message_page);
> +		}
>  		free_page((unsigned long)hv_cpu->post_msg_page);
>  	}
>  
> @@ -213,10 +226,10 @@ void hv_synic_enable_regs(unsigned int cpu)
>  	union hv_synic_scontrol sctrl;
>  
>  	/* Setup the Synic's message page */
> -	simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
> +	simp.as_uint64 = hv_get_register(REG_SIMP);

To avoid all this code churn (here and in the next patch dealing with
EOM), would it make sense to move the logic picking nested/non-nested
register into hv_{get,set}_register() instead?

E.g. something like (untested, incomplete):

static inline u32 hv_get_nested_reg(u32 reg) {
	switch (reg) {
	case HV_REGISTER_SIMP: return HV_REGISTER_NESTED_SIMP;
	case HV_REGISTER_SVERSION: return HV_REGISTER_NESTED_SVERSION;
	...
	default: return reg;
	}

}

static inline u64 hv_get_register(unsigned int reg)
{
	u64 value;

	if (hv_nested)
		reg = hv_get_nested_reg(reg);

	if (hv_is_synic_reg(reg) && hv_isolation_type_snp())
		hv_ghcb_msr_read(reg, &value);
	else
		rdmsrl(reg, value);
	return value;
}

static inline void hv_set_register(unsigned int reg, u64 value)
{
	if (hv_nested)
		reg = hv_get_nested_reg(reg);

	if (hv_is_synic_reg(reg) && hv_isolation_type_snp()) {
		hv_ghcb_msr_write(reg, value);

		/* Write proxy bit via wrmsl instruction */
		if (reg >= HV_REGISTER_SINT0 &&
		    reg <= HV_REGISTER_SINT15)
			wrmsrl(reg, value | 1 << 20);
	} else {
		wrmsrl(reg, value);
	}
}
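
For reference, the translation helper suggested above could be filled in
along these lines. This is a standalone sketch, not the code that was
eventually merged: the non-nested register numbers are assumptions taken
from the TLFS layout, the nested SINT registers are assumed to be
contiguous (as the REG_SINT0 + VMBUS_MESSAGE_SINT arithmetic in the
patch implies), and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed non-nested SynIC register numbers (TLFS layout). */
#define REG_SCONTROL 0x40000080u
#define REG_SVERSION 0x40000081u
#define REG_SIEFP    0x40000082u
#define REG_SIMP     0x40000083u
#define REG_EOM      0x40000084u
#define REG_SINT0    0x40000090u
#define REG_SINT15   0x4000009Fu

/* Each nested register defined in the patch is its counterpart + 0x1000. */
#define NESTED_OFFSET 0x1000u

/* Map a SynIC register to its nested counterpart; others pass through. */
static uint32_t hv_nested_reg(uint32_t reg)
{
	if (reg >= REG_SINT0 && reg <= REG_SINT15)
		return reg + NESTED_OFFSET;

	switch (reg) {
	case REG_SCONTROL:
	case REG_SVERSION:
	case REG_SIEFP:
	case REG_SIMP:
	case REG_EOM:
		return reg + NESTED_OFFSET;
	default:
		return reg;
	}
}

/* What an accessor would call before touching the MSR. */
static uint32_t hv_effective_reg(uint32_t reg, bool nested)
{
	return nested ? hv_nested_reg(reg) : reg;
}

/* Usage example: spot-check the mapping against values from the patch. */
static void hv_nested_reg_selfcheck(void)
{
	assert(hv_effective_reg(REG_SIMP, true) == 0x40001083u);
	assert(hv_effective_reg(REG_SIMP, false) == REG_SIMP);
	assert(hv_effective_reg(REG_SINT0 + 2, true) == 0x40001092u);
}
```

Keeping the translation in one place means callers keep passing the
familiar HV_REGISTER_* names, which is the code-churn reduction the
suggestion is after.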


>  	simp.simp_enabled = 1;
>  
> -	if (hv_isolation_type_snp()) {
> +	if (hv_isolation_type_snp() || hv_root_partition) {
>  		hv_cpu->synic_message_page
>  			= memremap(simp.base_simp_gpa << HV_HYP_PAGE_SHIFT,
>  				   HV_HYP_PAGE_SIZE, MEMREMAP_WB);
> @@ -227,13 +240,13 @@ void hv_synic_enable_regs(unsigned int cpu)
>  			>> HV_HYP_PAGE_SHIFT;
>  	}
>  
> -	hv_set_register(HV_REGISTER_SIMP, simp.as_uint64);
> +	hv_set_register(REG_SIMP, simp.as_uint64);
>  
>  	/* Setup the Synic's event page */
> -	siefp.as_uint64 = hv_get_register(HV_REGISTER_SIEFP);
> +	siefp.as_uint64 = hv_get_register(REG_SIEFP);
>  	siefp.siefp_enabled = 1;
>  
> -	if (hv_isolation_type_snp()) {
> +	if (hv_isolation_type_snp() || hv_root_partition) {
>  		hv_cpu->synic_event_page =
>  			memremap(siefp.base_siefp_gpa << HV_HYP_PAGE_SHIFT,
>  				 HV_HYP_PAGE_SIZE, MEMREMAP_WB);
> @@ -245,12 +258,12 @@ void hv_synic_enable_regs(unsigned int cpu)
>  			>> HV_HYP_PAGE_SHIFT;
>  	}
>  
> -	hv_set_register(HV_REGISTER_SIEFP, siefp.as_uint64);
> +	hv_set_register(REG_SIEFP, siefp.as_uint64);
>  
>  	/* Setup the shared SINT. */
>  	if (vmbus_irq != -1)
>  		enable_percpu_irq(vmbus_irq, 0);
> -	shared_sint.as_uint64 = hv_get_register(HV_REGISTER_SINT0 +
> +	shared_sint.as_uint64 = hv_get_register(REG_SINT0 +
>  					VMBUS_MESSAGE_SINT);
>  
>  	shared_sint.vector = vmbus_interrupt;
> @@ -266,14 +279,14 @@ void hv_synic_enable_regs(unsigned int cpu)
>  #else
>  	shared_sint.auto_eoi = 0;
>  #endif
> -	hv_set_register(HV_REGISTER_SINT0 + VMBUS_MESSAGE_SINT,
> +	hv_set_register(REG_SINT0 + VMBUS_MESSAGE_SINT,
>  				shared_sint.as_uint64);
>  
>  	/* Enable the global synic bit */
> -	sctrl.as_uint64 = hv_get_register(HV_REGISTER_SCONTROL);
> +	sctrl.as_uint64 = hv_get_register(REG_SCTRL);
>  	sctrl.enable = 1;
>  
> -	hv_set_register(HV_REGISTER_SCONTROL, sctrl.as_uint64);
> +	hv_set_register(REG_SCTRL, sctrl.as_uint64);
>  }
>  
>  int hv_synic_init(unsigned int cpu)
> @@ -297,17 +310,17 @@ void hv_synic_disable_regs(unsigned int cpu)
>  	union hv_synic_siefp siefp;
>  	union hv_synic_scontrol sctrl;
>  
> -	shared_sint.as_uint64 = hv_get_register(HV_REGISTER_SINT0 +
> +	shared_sint.as_uint64 = hv_get_register(REG_SINT0 +
>  					VMBUS_MESSAGE_SINT);
>  
>  	shared_sint.masked = 1;
>  
>  	/* Need to correctly cleanup in the case of SMP!!! */
>  	/* Disable the interrupt */
> -	hv_set_register(HV_REGISTER_SINT0 + VMBUS_MESSAGE_SINT,
> +	hv_set_register(REG_SINT0 + VMBUS_MESSAGE_SINT,
>  				shared_sint.as_uint64);
>  
> -	simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
> +	simp.as_uint64 = hv_get_register(REG_SIMP);
>  	/*
>  	 * In Isolation VM, sim and sief pages are allocated by
>  	 * paravisor. These pages also will be used by kdump
> @@ -320,9 +333,9 @@ void hv_synic_disable_regs(unsigned int cpu)
>  	else
>  		simp.base_simp_gpa = 0;
>  
> -	hv_set_register(HV_REGISTER_SIMP, simp.as_uint64);
> +	hv_set_register(REG_SIMP, simp.as_uint64);
>  
> -	siefp.as_uint64 = hv_get_register(HV_REGISTER_SIEFP);
> +	siefp.as_uint64 = hv_get_register(REG_SIEFP);
>  	siefp.siefp_enabled = 0;
>  
>  	if (hv_isolation_type_snp())
> @@ -330,12 +343,12 @@ void hv_synic_disable_regs(unsigned int cpu)
>  	else
>  		siefp.base_siefp_gpa = 0;
>  
> -	hv_set_register(HV_REGISTER_SIEFP, siefp.as_uint64);
> +	hv_set_register(REG_SIEFP, siefp.as_uint64);
>  
>  	/* Disable the global synic bit */
> -	sctrl.as_uint64 = hv_get_register(HV_REGISTER_SCONTROL);
> +	sctrl.as_uint64 = hv_get_register(REG_SCTRL);
>  	sctrl.enable = 0;
> -	hv_set_register(HV_REGISTER_SCONTROL, sctrl.as_uint64);
> +	hv_set_register(REG_SCTRL, sctrl.as_uint64);
>  
>  	if (vmbus_irq != -1)
>  		disable_percpu_irq(vmbus_irq);

-- 
Vitaly



* Re: [PATCH 6/6] hv, mshv : Change interrupt vector for nested root partition
  2022-11-02 14:00 ` [PATCH 6/6] hv, mshv : Change interrupt vector " Jinank Jain
@ 2022-11-02 15:46   ` Wei Liu
  0 siblings, 0 replies; 16+ messages in thread
From: Wei Liu @ 2022-11-02 15:46 UTC (permalink / raw)
  To: Jinank Jain
  Cc: kys, haiyangz, sthemmin, wei.liu, decui, tglx, mingo, bp,
	dave.hansen, x86, hpa, arnd, peterz, jpoimboe, seanjc,
	kirill.shutemov, ak, sathyanarayanan.kuppuswamy, linux-hyperv,
	linux-kernel, linux-arch

On Wed, Nov 02, 2022 at 02:00:17PM +0000, Jinank Jain wrote:
> Traditionally, HYPERVISOR_CALLBACK_VECTOR has been used to relay the
> VMBus interrupt, but this does not work under a nested hypervisor.
> Microsoft Hypervisor reserves 0x31 to 0x34 as the interrupt vector
> range for VMBus, so we have to use one of the vectors from that range
> and set up the IDT accordingly.
> 
> Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
> ---
>  arch/x86/include/asm/idtentry.h    |  2 ++
>  arch/x86/include/asm/irq_vectors.h |  6 ++++++
[...]
>  #if IS_ENABLED(CONFIG_ACRN_GUEST)
> diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
> index 43dcb9284208..729d19eab7f5 100644
> --- a/arch/x86/include/asm/irq_vectors.h
> +++ b/arch/x86/include/asm/irq_vectors.h
> @@ -102,6 +102,12 @@
>  #if IS_ENABLED(CONFIG_HYPERV)
>  #define HYPERV_REENLIGHTENMENT_VECTOR	0xee
>  #define HYPERV_STIMER0_VECTOR		0xed
> +/*
> + * FIXME: Change this once Microsoft Hypervisor changes its assumption
> + * around VMBus interrupt vector allocation for nested root partition,
> + * or provides a better interface to detect this instead of hardcoding.
> + */
> +#define HYPERV_INTR_NESTED_VMBUS_VECTOR	0x31

I would like to hear the x86 maintainers' opinion on this.

Thanks,
Wei.


* [PATCH v5 0/5] Add support running nested Microsoft Hypervisor
  2022-11-02 14:00 [PATCH 0/6] Add support running nested Microsoft Hypervisor Jinank Jain
                   ` (5 preceding siblings ...)
  2022-11-02 14:00 ` [PATCH 6/6] hv, mshv : Change interrupt vector " Jinank Jain
@ 2022-11-24  5:53 ` Jinank Jain
  2022-11-24  5:53   ` [PATCH v5 1/5] x86/hyperv: Add support for detecting nested hypervisor Jinank Jain
                     ` (5 more replies)
  6 siblings, 6 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-24  5:53 UTC (permalink / raw)
  To: jinankjain
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, x86,
	hpa, arnd, peterz, jpoimboe, jinankjain, seanjc, kirill.shutemov,
	ak, sathyanarayanan.kuppuswamy, linux-hyperv, linux-kernel,
	linux-arch, anrayabh, mikelley

This patch series adds support for running on a nested Microsoft
Hypervisor. In the nested case, a few privileged hypercalls need to go
to the L0 hypervisor instead of the L1 hypervisor. This series
identifies such hypercalls and replaces them with nested hypercalls.

Jinank Jain (5):
  x86/hyperv: Add support for detecting nested hypervisor
  Drivers: hv: Setup synic registers in case of nested root partition
  x86/hyperv: Add an interface to do nested hypercalls
  Drivers: hv: Enable vmbus driver for nested root partition
  x86/hyperv: Change interrupt vector for nested root partition

[v4]
- Fix ARM64 compilation

[v5]
- Address comments from Michael Kelley

 arch/arm64/hyperv/mshyperv.c       |  6 +++
 arch/x86/include/asm/hyperv-tlfs.h | 17 ++++++-
 arch/x86/include/asm/idtentry.h    |  2 +
 arch/x86/include/asm/irq_vectors.h |  6 +++
 arch/x86/include/asm/mshyperv.h    | 68 ++++++++++++++++------------
 arch/x86/kernel/cpu/mshyperv.c     | 71 ++++++++++++++++++++++++++++++
 arch/x86/kernel/idt.c              |  9 ++++
 drivers/hv/hv.c                    | 18 +++++---
 drivers/hv/hv_common.c             |  7 ++-
 drivers/hv/vmbus_drv.c             |  5 ++-
 include/asm-generic/hyperv-tlfs.h  |  1 +
 include/asm-generic/mshyperv.h     |  1 +
 12 files changed, 173 insertions(+), 38 deletions(-)

-- 
2.25.1



* [PATCH v5 1/5] x86/hyperv: Add support for detecting nested hypervisor
  2022-11-24  5:53 ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
@ 2022-11-24  5:53   ` Jinank Jain
  2022-11-24  5:53   ` [PATCH v5 2/5] Drivers: hv: Setup synic registers in case of nested root partition Jinank Jain
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-24  5:53 UTC (permalink / raw)
  To: jinankjain
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, x86,
	hpa, arnd, peterz, jpoimboe, jinankjain, seanjc, kirill.shutemov,
	ak, sathyanarayanan.kuppuswamy, linux-hyperv, linux-kernel,
	linux-arch, anrayabh, mikelley

When Linux runs as a root partition for Microsoft Hypervisor, it is
possible to detect whether it is running on a nested hypervisor using
hints exposed by mshv. While at it, expose a new variable called
hv_nested, which can be used later for making decisions specific to
the nested use case.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
---
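A standalone sketch of the detection described above, for readers without
the kernel tree at hand. HV_X64_HYPERV_NESTED is bit 12 as defined in this
patch; hints_indicate_nested() is a hypothetical name, and the hints value
is assumed to be what ms_hyperv.hints holds (the enlightenment-information
CPUID leaf).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* BIT(12) of the hints word, as defined in this patch. */
#define HV_X64_HYPERV_NESTED	(1u << 12)

/* Hypothetical helper: true if the hints word reports a nested hypervisor. */
bool hints_indicate_nested(uint32_t hints)
{
	return (hints & HV_X64_HYPERV_NESTED) != 0;
}

/* Usage: only bit 12 decides the result; other bits are ignored. */
void hints_indicate_nested_example(void)
{
	assert(hints_indicate_nested(0x00001000u));
	assert(hints_indicate_nested(0x00001800u));	/* bits 11 and 12 */
	assert(!hints_indicate_nested(0x00004800u));	/* bits 11 and 14 */
}
```
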
 arch/arm64/hyperv/mshyperv.c       | 6 ++++++
 arch/x86/include/asm/hyperv-tlfs.h | 3 +++
 arch/x86/kernel/cpu/mshyperv.c     | 7 +++++++
 drivers/hv/hv_common.c             | 7 +++++--
 include/asm-generic/mshyperv.h     | 1 +
 5 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/hyperv/mshyperv.c b/arch/arm64/hyperv/mshyperv.c
index a406454578f0..2024b19dc514 100644
--- a/arch/arm64/hyperv/mshyperv.c
+++ b/arch/arm64/hyperv/mshyperv.c
@@ -19,6 +19,9 @@
 
 static bool hyperv_initialized;
 
+/* Is Linux running on nested Microsoft Hypervisor */
+bool hv_nested;
+
 static int __init hyperv_init(void)
 {
 	struct hv_get_vp_registers_output	result;
@@ -63,6 +66,9 @@ static int __init hyperv_init(void)
 	pr_info("Hyper-V: Host Build %d.%d.%d.%d-%d-%d\n",
 		b >> 16, b & 0xFFFF, a,	d & 0xFFFFFF, c, d >> 24);
 
+	/* ARM64 does not support nested virtualization */
+	hv_nested = false;
+
 	ret = hv_common_init();
 	if (ret)
 		return ret;
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 6d9368ea3701..58c03d18c235 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -114,6 +114,9 @@
 /* Recommend using the newer ExProcessorMasks interface */
 #define HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED		BIT(11)
 
+/* Indicates that the hypervisor is nested within a Hyper-V partition. */
+#define HV_X64_HYPERV_NESTED				BIT(12)
+
 /* Recommend using enlightened VMCS */
 #define HV_X64_ENLIGHTENED_VMCS_RECOMMENDED		BIT(14)
 
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 831613959a92..9a4204139490 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -37,6 +37,8 @@
 
 /* Is Linux running as the root partition? */
 bool hv_root_partition;
+/* Is Linux running on nested Microsoft Hypervisor */
+bool hv_nested;
 struct ms_hyperv_info ms_hyperv;
 
 #if IS_ENABLED(CONFIG_HYPERV)
@@ -301,6 +303,11 @@ static void __init ms_hyperv_init_platform(void)
 		pr_info("Hyper-V: running as root partition\n");
 	}
 
+	if (ms_hyperv.hints & HV_X64_HYPERV_NESTED) {
+		hv_nested = true;
+		pr_info("Hyper-V: running on a nested hypervisor\n");
+	}
+
 	/*
 	 * Extract host information.
 	 */
diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c
index ae68298c0dca..dcb336ce374f 100644
--- a/drivers/hv/hv_common.c
+++ b/drivers/hv/hv_common.c
@@ -25,8 +25,8 @@
 #include <asm/mshyperv.h>
 
 /*
- * hv_root_partition and ms_hyperv are defined here with other Hyper-V
- * specific globals so they are shared across all architectures and are
+ * hv_root_partition, ms_hyperv and hv_nested are defined here with other
+ * Hyper-V specific globals so they are shared across all architectures and are
  * built only when CONFIG_HYPERV is defined.  But on x86,
  * ms_hyperv_init_platform() is built even when CONFIG_HYPERV is not
  * defined, and it uses these two variables.  So mark them as __weak
@@ -36,6 +36,9 @@
 bool __weak hv_root_partition;
 EXPORT_SYMBOL_GPL(hv_root_partition);
 
+bool __weak hv_nested;
+EXPORT_SYMBOL_GPL(hv_nested);
+
 struct ms_hyperv_info __weak ms_hyperv;
 EXPORT_SYMBOL_GPL(ms_hyperv);
 
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index bfb9eb9d7215..5df6e944e6a9 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -164,6 +164,7 @@ extern int vmbus_interrupt;
 extern int vmbus_irq;
 
 extern bool hv_root_partition;
+extern bool hv_nested;
 
 #if IS_ENABLED(CONFIG_HYPERV)
 /*
-- 
2.25.1



* [PATCH v5 2/5] Drivers: hv: Setup synic registers in case of nested root partition
  2022-11-24  5:53 ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
  2022-11-24  5:53   ` [PATCH v5 1/5] x86/hyperv: Add support for detecting nested hypervisor Jinank Jain
@ 2022-11-24  5:53   ` Jinank Jain
  2022-11-24  5:53   ` [PATCH v5 3/5] x86/hyperv: Add an interface to do nested hypercalls Jinank Jain
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-24  5:53 UTC (permalink / raw)
  To: jinankjain
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, x86,
	hpa, arnd, peterz, jpoimboe, jinankjain, seanjc, kirill.shutemov,
	ak, sathyanarayanan.kuppuswamy, linux-hyperv, linux-kernel,
	linux-arch, anrayabh, mikelley

Child partitions are free to allocate their SynIC message and event
pages, but a root partition must use the pages allocated by Microsoft
Hypervisor (MSHV). The base addresses of these pages can be found using
synthetic MSRs exposed by MSHV. These MSRs differ slightly between a
nested and a non-nested root partition.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
---
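A sketch of how the page address falls out of the SIMP (or SIEFP) value
that the root partition reads back, which is what the memremap() calls
below rely on. It assumes the register layout of union hv_synic_simp
(enable flag in bit 0, page frame number in bits 63:12) and a 4 KiB
hypervisor page; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HV_HYP_PAGE_SHIFT	12	/* assumed 4 KiB hypervisor page */

/* Bit 0 of SIMP/SIEFP is the enable flag. */
bool synic_page_enabled(uint64_t reg_value)
{
	return (reg_value & 1u) != 0;
}

/*
 * Bits 63:12 hold the page frame number (base_simp_gpa); shifting it
 * back up by HV_HYP_PAGE_SHIFT gives the guest physical address.
 */
uint64_t synic_page_gpa(uint64_t reg_value)
{
	uint64_t pfn = reg_value >> HV_HYP_PAGE_SHIFT;

	return pfn << HV_HYP_PAGE_SHIFT;
}

/* Usage: an enabled page at GPA 0x12345000. */
void synic_page_example(void)
{
	uint64_t simp = 0x12345001ull;

	assert(synic_page_enabled(simp));
	assert(synic_page_gpa(simp) == 0x12345000ull);
}
```
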
 arch/x86/include/asm/hyperv-tlfs.h | 11 +++++++
 arch/x86/include/asm/mshyperv.h    | 26 ++--------------
 arch/x86/kernel/cpu/mshyperv.c     | 49 ++++++++++++++++++++++++++++++
 drivers/hv/hv.c                    | 18 ++++++++---
 4 files changed, 75 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 58c03d18c235..b5019becb618 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -225,6 +225,17 @@ enum hv_isolation_type {
 #define HV_REGISTER_SINT14			0x4000009E
 #define HV_REGISTER_SINT15			0x4000009F
 
+/*
+ * Define synthetic interrupt controller model specific registers for
+ * nested hypervisor.
+ */
+#define HV_REGISTER_NESTED_SCONTROL            0x40001080
+#define HV_REGISTER_NESTED_SVERSION            0x40001081
+#define HV_REGISTER_NESTED_SIEFP               0x40001082
+#define HV_REGISTER_NESTED_SIMP                0x40001083
+#define HV_REGISTER_NESTED_EOM                 0x40001084
+#define HV_REGISTER_NESTED_SINT0               0x40001090
+
 /*
  * Synthetic Timer MSRs. Four timers per vcpu.
  */
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 61f0c206bff0..326d699b30d5 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -198,30 +198,8 @@ static inline bool hv_is_synic_reg(unsigned int reg)
 	return false;
 }
 
-static inline u64 hv_get_register(unsigned int reg)
-{
-	u64 value;
-
-	if (hv_is_synic_reg(reg) && hv_isolation_type_snp())
-		hv_ghcb_msr_read(reg, &value);
-	else
-		rdmsrl(reg, value);
-	return value;
-}
-
-static inline void hv_set_register(unsigned int reg, u64 value)
-{
-	if (hv_is_synic_reg(reg) && hv_isolation_type_snp()) {
-		hv_ghcb_msr_write(reg, value);
-
-		/* Write proxy bit via wrmsl instruction */
-		if (reg >= HV_REGISTER_SINT0 &&
-		    reg <= HV_REGISTER_SINT15)
-			wrmsrl(reg, value | 1 << 20);
-	} else {
-		wrmsrl(reg, value);
-	}
-}
+u64 hv_get_register(unsigned int reg);
+void hv_set_register(unsigned int reg, u64 value);
 
 #else /* CONFIG_HYPERV */
 static inline void hyperv_init(void) {}
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 9a4204139490..3e6711a6af6b 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -41,6 +41,55 @@ bool hv_root_partition;
 bool hv_nested;
 struct ms_hyperv_info ms_hyperv;
 
+static inline unsigned int hv_get_nested_reg(unsigned int reg)
+{
+	switch (reg) {
+	case HV_REGISTER_SIMP:
+		return HV_REGISTER_NESTED_SIMP;
+	case HV_REGISTER_SIEFP:
+		return HV_REGISTER_NESTED_SIEFP;
+	case HV_REGISTER_SCONTROL:
+		return HV_REGISTER_NESTED_SCONTROL;
+	case HV_REGISTER_SINT0:
+		return HV_REGISTER_NESTED_SINT0;
+	case HV_REGISTER_EOM:
+		return HV_REGISTER_NESTED_EOM;
+	default:
+		return reg;
+	}
+}
+
+inline u64 hv_get_register(unsigned int reg)
+{
+	u64 value;
+
+	if (hv_nested)
+		reg = hv_get_nested_reg(reg);
+
+	if (hv_is_synic_reg(reg) && hv_isolation_type_snp())
+		hv_ghcb_msr_read(reg, &value);
+	else
+		rdmsrl(reg, value);
+	return value;
+}
+
+inline void hv_set_register(unsigned int reg, u64 value)
+{
+	if (hv_nested)
+		reg = hv_get_nested_reg(reg);
+
+	if (hv_is_synic_reg(reg) && hv_isolation_type_snp()) {
+		hv_ghcb_msr_write(reg, value);
+
+		/* Write proxy bit via wrmsrl instruction */
+		if (reg >= HV_REGISTER_SINT0 &&
+		    reg <= HV_REGISTER_SINT15)
+			wrmsrl(reg, value | 1 << 20);
+	} else {
+		wrmsrl(reg, value);
+	}
+}
+
 #if IS_ENABLED(CONFIG_HYPERV)
 static void (*vmbus_handler)(void);
 static void (*hv_stimer0_handler)(void);
diff --git a/drivers/hv/hv.c b/drivers/hv/hv.c
index 4d6480d57546..9e1eb50cc76f 100644
--- a/drivers/hv/hv.c
+++ b/drivers/hv/hv.c
@@ -147,7 +147,7 @@ int hv_synic_alloc(void)
 		 * Synic message and event pages are allocated by paravisor.
 		 * Skip these pages allocation here.
 		 */
-		if (!hv_isolation_type_snp()) {
+		if (!hv_isolation_type_snp() && !hv_root_partition) {
 			hv_cpu->synic_message_page =
 				(void *)get_zeroed_page(GFP_ATOMIC);
 			if (hv_cpu->synic_message_page == NULL) {
@@ -188,8 +188,16 @@ void hv_synic_free(void)
 		struct hv_per_cpu_context *hv_cpu
 			= per_cpu_ptr(hv_context.cpu_context, cpu);
 
-		free_page((unsigned long)hv_cpu->synic_event_page);
-		free_page((unsigned long)hv_cpu->synic_message_page);
+		if (hv_root_partition) {
+			if (hv_cpu->synic_event_page != NULL)
+				memunmap(hv_cpu->synic_event_page);
+
+			if (hv_cpu->synic_message_page != NULL)
+				memunmap(hv_cpu->synic_message_page);
+		} else {
+			free_page((unsigned long)hv_cpu->synic_event_page);
+			free_page((unsigned long)hv_cpu->synic_message_page);
+		}
 		free_page((unsigned long)hv_cpu->post_msg_page);
 	}
 
@@ -216,7 +224,7 @@ void hv_synic_enable_regs(unsigned int cpu)
 	simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
 	simp.simp_enabled = 1;
 
-	if (hv_isolation_type_snp()) {
+	if (hv_isolation_type_snp() || hv_root_partition) {
 		hv_cpu->synic_message_page
 			= memremap(simp.base_simp_gpa << HV_HYP_PAGE_SHIFT,
 				   HV_HYP_PAGE_SIZE, MEMREMAP_WB);
@@ -233,7 +241,7 @@ void hv_synic_enable_regs(unsigned int cpu)
 	siefp.as_uint64 = hv_get_register(HV_REGISTER_SIEFP);
 	siefp.siefp_enabled = 1;
 
-	if (hv_isolation_type_snp()) {
+	if (hv_isolation_type_snp() || hv_root_partition) {
 		hv_cpu->synic_event_page =
 			memremap(siefp.base_siefp_gpa << HV_HYP_PAGE_SHIFT,
 				 HV_HYP_PAGE_SIZE, MEMREMAP_WB);
-- 
2.25.1



* [PATCH v5 3/5] x86/hyperv: Add an interface to do nested hypercalls
  2022-11-24  5:53 ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
  2022-11-24  5:53   ` [PATCH v5 1/5] x86/hyperv: Add support for detecting nested hypervisor Jinank Jain
  2022-11-24  5:53   ` [PATCH v5 2/5] Drivers: hv: Setup synic registers in case of nested root partition Jinank Jain
@ 2022-11-24  5:53   ` Jinank Jain
  2022-11-24  5:53   ` [PATCH v5 4/5] Drivers: hv: Enable vmbus driver for nested root partition Jinank Jain
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-24  5:53 UTC (permalink / raw)
  To: jinankjain
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, x86,
	hpa, arnd, peterz, jpoimboe, jinankjain, seanjc, kirill.shutemov,
	ak, sathyanarayanan.kuppuswamy, linux-hyperv, linux-kernel,
	linux-arch, anrayabh, mikelley

According to the TLFS, communicating with the L0 hypervisor requires an
additional bit to be set in the hypercall control value. This
communication is needed to perform privileged operations that only the
L0 hypervisor can perform, for example setting up the VMBus
infrastructure.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
---
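A small sketch of how the control value is composed for the fast and
nested variants added below. HV_HYPERCALL_NESTED is bit 31 as defined in
this patch and HV_HYPERCALL_FAST_BIT is bit 16; hv_make_control() is a
hypothetical helper, and the call code 0x005c is only an illustrative
value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HV_HYPERCALL_FAST_BIT	(1ull << 16)
#define HV_HYPERCALL_NESTED	(1ull << 31)	/* from this patch */

/* Hypothetical helper: build the 64-bit hypercall control value. */
uint64_t hv_make_control(uint16_t code, bool fast, bool nested)
{
	uint64_t control = code;	/* call code lives in bits 15:0 */

	if (fast)
		control |= HV_HYPERCALL_FAST_BIT;
	if (nested)
		control |= HV_HYPERCALL_NESTED;	/* route to the L0 hypervisor */
	return control;
}

/* Usage: the same call code in its plain, fast and fast-nested forms. */
void hv_make_control_example(void)
{
	assert(hv_make_control(0x005c, false, false) == 0x0000005cull);
	assert(hv_make_control(0x005c, true, false) == 0x0001005cull);
	assert(hv_make_control(0x005c, true, true) == 0x8001005cull);
}
```
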
 arch/x86/include/asm/hyperv-tlfs.h |  3 ++-
 arch/x86/include/asm/mshyperv.h    | 42 +++++++++++++++++++++++++++---
 include/asm-generic/hyperv-tlfs.h  |  1 +
 3 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index b5019becb618..7758c495541d 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -380,7 +380,8 @@ struct hv_nested_enlightenments_control {
 		__u32 reserved:31;
 	} features;
 	struct {
-		__u32 reserved;
+		__u32 inter_partition_comm:1;
+		__u32 reserved:31;
 	} hypercallControls;
 } __packed;
 
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 326d699b30d5..42e42cea0384 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -74,10 +74,16 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 	return hv_status;
 }
 
+/* Hypercall to the L0 hypervisor */
+static inline u64 hv_do_nested_hypercall(u64 control, void *input, void *output)
+{
+	return hv_do_hypercall(control | HV_HYPERCALL_NESTED, input, output);
+}
+
 /* Fast hypercall with 8 bytes of input and no output */
-static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
+static inline u64 _hv_do_fast_hypercall8(u64 control, u16 code, u64 input1)
 {
-	u64 hv_status, control = (u64)code | HV_HYPERCALL_FAST_BIT;
+	u64 hv_status;
 
 #ifdef CONFIG_X86_64
 	{
@@ -105,10 +111,24 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
 		return hv_status;
 }
 
+static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
+{
+	u64 control = (u64)code | HV_HYPERCALL_FAST_BIT;
+
+	return _hv_do_fast_hypercall8(control, code, input1);
+}
+
+static inline u64 hv_do_fast_nested_hypercall8(u16 code, u64 input1)
+{
+	u64 control = (u64)code | HV_HYPERCALL_FAST_BIT | HV_HYPERCALL_NESTED;
+
+	return _hv_do_fast_hypercall8(control, code, input1);
+}
+
 /* Fast hypercall with 16 bytes of input */
-static inline u64 hv_do_fast_hypercall16(u16 code, u64 input1, u64 input2)
+static inline u64 _hv_do_fast_hypercall16(u64 control, u16 code, u64 input1, u64 input2)
 {
-	u64 hv_status, control = (u64)code | HV_HYPERCALL_FAST_BIT;
+	u64 hv_status;
 
 #ifdef CONFIG_X86_64
 	{
@@ -139,6 +159,20 @@ static inline u64 hv_do_fast_hypercall16(u16 code, u64 input1, u64 input2)
 	return hv_status;
 }
 
+static inline u64 hv_do_fast_hypercall16(u16 code, u64 input1, u64 input2)
+{
+	u64 control = (u64)code | HV_HYPERCALL_FAST_BIT;
+
+	return _hv_do_fast_hypercall16(control, code, input1, input2);
+}
+
+static inline u64 hv_do_fast_nested_hypercall16(u16 code, u64 input1, u64 input2)
+{
+	u64 control = (u64)code | HV_HYPERCALL_FAST_BIT | HV_HYPERCALL_NESTED;
+
+	return _hv_do_fast_hypercall16(control, code, input1, input2);
+}
+
 extern struct hv_vp_assist_page **hv_vp_assist_page;
 
 static inline struct hv_vp_assist_page *hv_get_vp_assist_page(unsigned int cpu)
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index b17c6eeb9afa..e61ee461c4fc 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -194,6 +194,7 @@ enum HV_GENERIC_SET_FORMAT {
 #define HV_HYPERCALL_VARHEAD_OFFSET	17
 #define HV_HYPERCALL_VARHEAD_MASK	GENMASK_ULL(26, 17)
 #define HV_HYPERCALL_RSVD0_MASK		GENMASK_ULL(31, 27)
+#define HV_HYPERCALL_NESTED		BIT_ULL(31)
 #define HV_HYPERCALL_REP_COMP_OFFSET	32
 #define HV_HYPERCALL_REP_COMP_1		BIT_ULL(32)
 #define HV_HYPERCALL_REP_COMP_MASK	GENMASK_ULL(43, 32)
-- 
2.25.1



* [PATCH v5 4/5] Drivers: hv: Enable vmbus driver for nested root partition
  2022-11-24  5:53 ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
                     ` (2 preceding siblings ...)
  2022-11-24  5:53   ` [PATCH v5 3/5] x86/hyperv: Add an interface to do nested hypercalls Jinank Jain
@ 2022-11-24  5:53   ` Jinank Jain
  2022-11-24  5:53   ` [PATCH v5 5/5] x86/hyperv: Change interrupt vector " Jinank Jain
  2022-11-24  6:01   ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
  5 siblings, 0 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-24  5:53 UTC (permalink / raw)
  To: jinankjain
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, x86,
	hpa, arnd, peterz, jpoimboe, jinankjain, seanjc, kirill.shutemov,
	ak, sathyanarayanan.kuppuswamy, linux-hyperv, linux-kernel,
	linux-arch, anrayabh, mikelley

Currently the VMBus driver is not initialized for a root partition, but
it needs to be enabled for a nested root partition. This is required so
that the L2 root can use VMBus devices.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
---
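The resulting decision, written out as a sketch. vmbus_should_init() is a
hypothetical name; it is simply the negation of the early-return
condition in the hunk below.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: hv_acpi_init() bails out early only for a
 * non-nested root partition, so VMBus setup proceeds otherwise.
 */
bool vmbus_should_init(bool root_partition, bool nested)
{
	return !(root_partition && !nested);
}

/* Usage: the full truth table. */
void vmbus_should_init_example(void)
{
	assert(vmbus_should_init(false, false));	/* ordinary guest */
	assert(vmbus_should_init(false, true));
	assert(!vmbus_should_init(true, false));	/* non-nested root */
	assert(vmbus_should_init(true, true));		/* nested root */
}
```
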
 drivers/hv/vmbus_drv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index db00d20c726d..0937877eade9 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2744,7 +2744,7 @@ static int __init hv_acpi_init(void)
 	if (!hv_is_hyperv_initialized())
 		return -ENODEV;
 
-	if (hv_root_partition)
+	if (hv_root_partition && !hv_nested)
 		return 0;
 
 	/*
-- 
2.25.1



* [PATCH v5 5/5] x86/hyperv: Change interrupt vector for nested root partition
  2022-11-24  5:53 ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
                     ` (3 preceding siblings ...)
  2022-11-24  5:53   ` [PATCH v5 4/5] Drivers: hv: Enable vmbus driver for nested root partition Jinank Jain
@ 2022-11-24  5:53   ` Jinank Jain
  2022-11-24  6:01   ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
  5 siblings, 0 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-24  5:53 UTC (permalink / raw)
  To: jinankjain
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, x86,
	hpa, arnd, peterz, jpoimboe, jinankjain, seanjc, kirill.shutemov,
	ak, sathyanarayanan.kuppuswamy, linux-hyperv, linux-kernel,
	linux-arch, anrayabh, mikelley

Traditionally, HYPERVISOR_CALLBACK_VECTOR has been used to relay the
VMBus interrupt, but this does not work under a nested hypervisor.
Microsoft Hypervisor reserves 0x31 to 0x34 as the interrupt vector
range for VMBus, so we have to use one of the vectors from that range
and set up the IDT accordingly.

Signed-off-by: Jinank Jain <jinankjain@linux.microsoft.com>
---
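A sketch of the vector choice made in vmbus_drv.c below, plus a range
check against the 0x31 to 0x34 window mentioned above. The 0xf3 value for
HYPERVISOR_CALLBACK_VECTOR is the x86 definition; the NESTED_VMBUS_*
range macros and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define HYPERVISOR_CALLBACK_VECTOR	0xf3	/* x86 irq_vectors.h */
#define HYPERV_INTR_NESTED_VMBUS_VECTOR	0x31	/* from this patch */

/* Hypothetical names for the range reserved by Microsoft Hypervisor. */
#define NESTED_VMBUS_VECTOR_FIRST	0x31
#define NESTED_VMBUS_VECTOR_LAST	0x34

/* Mirrors the ternary used to set vmbus_interrupt. */
int pick_vmbus_vector(bool nested)
{
	return nested ? HYPERV_INTR_NESTED_VMBUS_VECTOR
		      : HYPERVISOR_CALLBACK_VECTOR;
}

/* True if the vector lies in the window reserved for nested VMBus. */
bool nested_vmbus_vector_valid(int vector)
{
	return vector >= NESTED_VMBUS_VECTOR_FIRST &&
	       vector <= NESTED_VMBUS_VECTOR_LAST;
}

/* Usage: the nested choice stays inside the reserved window. */
void pick_vmbus_vector_example(void)
{
	assert(pick_vmbus_vector(false) == 0xf3);
	assert(pick_vmbus_vector(true) == 0x31);
	assert(nested_vmbus_vector_valid(pick_vmbus_vector(true)));
	assert(!nested_vmbus_vector_valid(pick_vmbus_vector(false)));
}
```
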
 arch/x86/include/asm/idtentry.h    |  2 ++
 arch/x86/include/asm/irq_vectors.h |  6 ++++++
 arch/x86/kernel/cpu/mshyperv.c     | 15 +++++++++++++++
 arch/x86/kernel/idt.c              |  9 +++++++++
 drivers/hv/vmbus_drv.c             |  3 ++-
 5 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/idtentry.h b/arch/x86/include/asm/idtentry.h
index 72184b0b2219..c0648e3e4d4a 100644
--- a/arch/x86/include/asm/idtentry.h
+++ b/arch/x86/include/asm/idtentry.h
@@ -686,6 +686,8 @@ DECLARE_IDTENTRY_SYSVEC(POSTED_INTR_NESTED_VECTOR,	sysvec_kvm_posted_intr_nested
 DECLARE_IDTENTRY_SYSVEC(HYPERVISOR_CALLBACK_VECTOR,	sysvec_hyperv_callback);
 DECLARE_IDTENTRY_SYSVEC(HYPERV_REENLIGHTENMENT_VECTOR,	sysvec_hyperv_reenlightenment);
 DECLARE_IDTENTRY_SYSVEC(HYPERV_STIMER0_VECTOR,	sysvec_hyperv_stimer0);
+DECLARE_IDTENTRY_SYSVEC(HYPERV_INTR_NESTED_VMBUS_VECTOR,
+			sysvec_hyperv_nested_vmbus_intr);
 #endif
 
 #if IS_ENABLED(CONFIG_ACRN_GUEST)
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 43dcb9284208..729d19eab7f5 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -102,6 +102,12 @@
 #if IS_ENABLED(CONFIG_HYPERV)
 #define HYPERV_REENLIGHTENMENT_VECTOR	0xee
 #define HYPERV_STIMER0_VECTOR		0xed
+/*
+ * FIXME: Change this once Microsoft Hypervisor changes its assumption
+ * around VMBus interrupt vector allocation for nested root partition,
+ * or provides a better interface to detect this instead of hardcoding.
+ */
+#define HYPERV_INTR_NESTED_VMBUS_VECTOR	0x31
 #endif
 
 #define LOCAL_TIMER_VECTOR		0xec
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 3e6711a6af6b..ec7fef43e03b 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -110,6 +110,21 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_hyperv_callback)
 	set_irq_regs(old_regs);
 }
 
+DEFINE_IDTENTRY_SYSVEC(sysvec_hyperv_nested_vmbus_intr)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	inc_irq_stat(irq_hv_callback_count);
+
+	if (vmbus_handler)
+		vmbus_handler();
+
+	if (ms_hyperv.hints & HV_DEPRECATING_AEOI_RECOMMENDED)
+		ack_APIC_irq();
+
+	set_irq_regs(old_regs);
+}
+
 void hv_setup_vmbus_handler(void (*handler)(void))
 {
 	vmbus_handler = handler;
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index a58c6bc1cd68..ace648856a0b 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -160,6 +160,15 @@ static const __initconst struct idt_data apic_idts[] = {
 # endif
 	INTG(SPURIOUS_APIC_VECTOR,		asm_sysvec_spurious_apic_interrupt),
 	INTG(ERROR_APIC_VECTOR,			asm_sysvec_error_interrupt),
+#ifdef CONFIG_HYPERV
+	/*
+	 * This is a hack: the handler cannot be installed via alloc_intr_gate()
+	 * because that does not allow interrupt vectors below
+	 * FIRST_SYSTEM_VECTOR, and Hyper-V accepts only 0x31-0x34 as the VMBus
+	 * interrupt vector in a nested setup.
+	 */
+	INTG(HYPERV_INTR_NESTED_VMBUS_VECTOR, asm_sysvec_hyperv_nested_vmbus_intr),
+#endif
 #endif
 };
 
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 0937877eade9..c1477f3a08dd 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2767,7 +2767,8 @@ static int __init hv_acpi_init(void)
 	 * normal Linux IRQ mechanism is not used in this case.
 	 */
 #ifdef HYPERVISOR_CALLBACK_VECTOR
-	vmbus_interrupt = HYPERVISOR_CALLBACK_VECTOR;
+	vmbus_interrupt = hv_nested ? HYPERV_INTR_NESTED_VMBUS_VECTOR :
+					    HYPERVISOR_CALLBACK_VECTOR;
 	vmbus_irq = -1;
 #endif
 
-- 
2.25.1



* Re: [PATCH v5 0/5] Add support running nested Microsoft Hypervisor
  2022-11-24  5:53 ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
                     ` (4 preceding siblings ...)
  2022-11-24  5:53   ` [PATCH v5 5/5] x86/hyperv: Change interrupt vector " Jinank Jain
@ 2022-11-24  6:01   ` Jinank Jain
  5 siblings, 0 replies; 16+ messages in thread
From: Jinank Jain @ 2022-11-24  6:01 UTC (permalink / raw)
  To: jinankjain
  Cc: kys, haiyangz, wei.liu, decui, tglx, mingo, bp, dave.hansen, x86,
	hpa, arnd, peterz, jpoimboe, seanjc, kirill.shutemov, ak,
	sathyanarayanan.kuppuswamy, linux-hyperv, linux-kernel,
	linux-arch, anrayabh, mikelley

Please ignore the v5 patch series; I posted the wrong set of patches.

Regards,

Jinank

On 11/24/2022 11:23 AM, Jinank Jain wrote:
> This patch series adds support for running on a nested Microsoft
> Hypervisor. In the nested case, a few privileged hypercalls need to go
> to the L0 hypervisor instead of the L1 hypervisor. This series
> identifies such hypercalls and replaces them with nested hypercalls.
>
> Jinank Jain (5):
>    x86/hyperv: Add support for detecting nested hypervisor
>    Drivers: hv: Setup synic registers in case of nested root partition
>    x86/hyperv: Add an interface to do nested hypercalls
>    Drivers: hv: Enable vmbus driver for nested root partition
>    x86/hyperv: Change interrupt vector for nested root partition
>
> [v4]
> - Fix ARM64 compilation
>
> [v5]
> - Address comments from Michael Kelley
>
>   arch/arm64/hyperv/mshyperv.c       |  6 +++
>   arch/x86/include/asm/hyperv-tlfs.h | 17 ++++++-
>   arch/x86/include/asm/idtentry.h    |  2 +
>   arch/x86/include/asm/irq_vectors.h |  6 +++
>   arch/x86/include/asm/mshyperv.h    | 68 ++++++++++++++++------------
>   arch/x86/kernel/cpu/mshyperv.c     | 71 ++++++++++++++++++++++++++++++
>   arch/x86/kernel/idt.c              |  9 ++++
>   drivers/hv/hv.c                    | 18 +++++---
>   drivers/hv/hv_common.c             |  7 ++-
>   drivers/hv/vmbus_drv.c             |  5 ++-
>   include/asm-generic/hyperv-tlfs.h  |  1 +
>   include/asm-generic/mshyperv.h     |  1 +
>   12 files changed, 173 insertions(+), 38 deletions(-)
>


Thread overview: 16+ messages
2022-11-02 14:00 [PATCH 0/6] Add support running nested Microsoft Hypervisor Jinank Jain
2022-11-02 14:00 ` [PATCH 1/6] mshv: Add support for detecting nested hypervisor Jinank Jain
2022-11-02 14:00 ` [PATCH 2/6] hv: Setup synic registers in case of nested root partition Jinank Jain
2022-11-02 14:58   ` Vitaly Kuznetsov
2022-11-02 14:00 ` [PATCH 3/6] hv: Set the correct EOM register in case of nested hypervisor Jinank Jain
2022-11-02 14:00 ` [PATCH 4/6] hv: Add an interface to do nested hypercalls Jinank Jain
2022-11-02 14:00 ` [PATCH 5/6] hv: Enable vmbus driver for nested root partition Jinank Jain
2022-11-02 14:00 ` [PATCH 6/6] hv, mshv : Change interrupt vector " Jinank Jain
2022-11-02 15:46   ` Wei Liu
2022-11-24  5:53 ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
2022-11-24  5:53   ` [PATCH v5 1/5] x86/hyperv: Add support for detecting nested hypervisor Jinank Jain
2022-11-24  5:53   ` [PATCH v5 2/5] Drivers: hv: Setup synic registers in case of nested root partition Jinank Jain
2022-11-24  5:53   ` [PATCH v5 3/5] x86/hyperv: Add an interface to do nested hypercalls Jinank Jain
2022-11-24  5:53   ` [PATCH v5 4/5] Drivers: hv: Enable vmbus driver for nested root partition Jinank Jain
2022-11-24  5:53   ` [PATCH v5 5/5] x86/hyperv: Change interrupt vector " Jinank Jain
2022-11-24  6:01   ` [PATCH v5 0/5] Add support running nested Microsoft Hypervisor Jinank Jain
