* [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface
@ 2021-09-28 18:30 Nuno Das Neves
  2021-09-28 18:30 ` [PATCH v3 01/19] x86/hyperv: convert hyperv statuses to linux error codes Nuno Das Neves
                   ` (18 more replies)
  0 siblings, 19 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:30 UTC
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

This patch series provides a userspace interface for creating and running guest
virtual machines while running on the Microsoft Hypervisor [0].

Since managing guest machines can only be done when Linux is the root partition,
this series depends on Wei Liu's patch series merged in 5.12:
https://lore.kernel.org/linux-hyperv/20210203150435.27941-1-wei.liu@kernel.org/

The first two patches provide some helpers for converting hypervisor status
codes to linux error codes, and printing hypervisor status codes to dmesg for
debugging.

Hyper-V related headers asm-generic/hyperv-tlfs.h and x86/asm/hyperv-tlfs.h are
split into uapi and non-uapi. The uapi versions contain structures used in both
the ioctl interface and the kernel.

The mshv API is introduced in drivers/hv/mshv_main.c. As each interface is
introduced, documentation is added in Documentation/virt/mshv/api.rst.
The API is file-descriptor based, like KVM. The entry point is /dev/mshv.

/dev/mshv ioctls:
MSHV_CHECK_EXTENSION
MSHV_CREATE_PARTITION

Partition (vm) ioctls:
MSHV_MAP_GUEST_MEMORY, MSHV_UNMAP_GUEST_MEMORY
MSHV_INSTALL_INTERCEPT
MSHV_ASSERT_INTERRUPT
MSHV_GET_PARTITION_PROPERTY, MSHV_SET_PARTITION_PROPERTY
MSHV_CREATE_VP

Vp (vcpu) ioctls:
MSHV_GET_VP_REGISTERS, MSHV_SET_VP_REGISTERS
MSHV_RUN_VP
MSHV_GET_VP_STATE, MSHV_SET_VP_STATE
MSHV_VP_TRANSLATE_GVA
mmap() (register page)

[0] Hyper-V is more well-known, but it really refers to the whole stack
    including the hypervisor and other components that run in Windows kernel
    and userspace.

Changes since v2:
1. Fix kernel test robot issues
2. Bugfix in GVA to GPA patch provided by Anatol Belski

Changes since v1:
1. Correct mshv_dev mode to octal 0600
2. Fix bug in mshv_vp_ioctl_run - correctly set suspend registers on early exit
3. Address comments from Wei Liu, Sunil Muthuswamy, and Vitaly Kuznetsov
4. Run checkpatch.pl - fix whitespace and other style issues

Changes since RFC:
1. Moved code from virt/mshv to drivers/hv
2. Split hypercall helper functions and synic code to hv_call.c and hv_synic.c
3. MSHV_REQUEST_VERSION ioctl replaced with MSHV_CHECK_EXTENSION
4. Numerous suggestions, fixes, style changes, etc. from Michael Kelley, Vitaly
   Kuznetsov, Wei Liu, and Vineeth Pillai
5. Added patch to enable hypervisor enlightenments on partition creation
6. Added Wei Liu's patch for GVA to GPA translation

Nuno Das Neves (18):
  x86/hyperv: convert hyperv statuses to linux error codes
  x86/hyperv: convert hyperv statuses to strings
  drivers/hv: minimal mshv module (/dev/mshv/)
  drivers/hv: check extension ioctl
  drivers/hv: create partition ioctl
  drivers/hv: create, initialize, finalize, delete partition hypercalls
  drivers/hv: withdraw memory hypercall
  drivers/hv: map and unmap guest memory
  drivers/hv: create vcpu ioctl
  drivers/hv: get and set vcpu registers ioctls
  drivers/hv: set up synic pages for intercept messages
  drivers/hv: run vp ioctl and isr
  drivers/hv: install intercept ioctl
  drivers/hv: assert interrupt ioctl
  drivers/hv: get and set vp state ioctls
  drivers/hv: mmap vp register page
  drivers/hv: get and set partition property ioctls
  drivers/hv: Add enlightenment bits to create partition ioctl

Wei Liu (1):
  drivers/hv: Translate GVA to GPA

 .../userspace-api/ioctl/ioctl-number.rst      |    2 +
 Documentation/virt/mshv/api.rst               |  173 +++
 arch/x86/hyperv/Makefile                      |    1 +
 arch/x86/hyperv/hv_init.c                     |    2 +-
 arch/x86/hyperv/hv_proc.c                     |   51 +-
 arch/x86/include/asm/hyperv-tlfs.h            |   15 +-
 arch/x86/include/asm/mshyperv.h               |    1 +
 arch/x86/include/uapi/asm/hyperv-tlfs.h       | 1274 +++++++++++++++++
 arch/x86/kernel/cpu/mshyperv.c                |   16 +
 drivers/hv/Kconfig                            |   18 +
 drivers/hv/Makefile                           |    4 +
 drivers/hv/hv_call.c                          |  742 ++++++++++
 drivers/hv/hv_synic.c                         |  181 +++
 drivers/hv/mshv.h                             |  120 ++
 drivers/hv/mshv_main.c                        | 1166 +++++++++++++++
 include/asm-generic/hyperv-tlfs.h             |  354 +++--
 include/asm-generic/mshyperv.h                |    4 +
 include/linux/mshv.h                          |   61 +
 include/uapi/asm-generic/hyperv-tlfs.h        |  242 ++++
 include/uapi/linux/mshv.h                     |  117 ++
 20 files changed, 4399 insertions(+), 145 deletions(-)
 create mode 100644 Documentation/virt/mshv/api.rst
 create mode 100644 arch/x86/include/uapi/asm/hyperv-tlfs.h
 create mode 100644 drivers/hv/hv_call.c
 create mode 100644 drivers/hv/hv_synic.c
 create mode 100644 drivers/hv/mshv.h
 create mode 100644 drivers/hv/mshv_main.c
 create mode 100644 include/linux/mshv.h
 create mode 100644 include/uapi/asm-generic/hyperv-tlfs.h
 create mode 100644 include/uapi/linux/mshv.h

-- 
2.23.4



* [PATCH v3 01/19] x86/hyperv: convert hyperv statuses to linux error codes
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
@ 2021-09-28 18:30 ` Nuno Das Neves
  2021-09-28 18:30 ` [PATCH v3 02/19] x86/hyperv: convert hyperv statuses to strings Nuno Das Neves
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:30 UTC
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Return linux-friendly error codes from hypercall wrapper functions.
This will be needed in the mshv module.

Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Reviewed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
---
 arch/x86/hyperv/hv_proc.c         | 30 ++++++++++++++++++++++++++---
 arch/x86/include/asm/mshyperv.h   |  1 +
 include/asm-generic/hyperv-tlfs.h | 32 +++++++++++++++++++++----------
 3 files changed, 50 insertions(+), 13 deletions(-)

diff --git a/arch/x86/hyperv/hv_proc.c b/arch/x86/hyperv/hv_proc.c
index 68a0843d4750..59cf9a9e0975 100644
--- a/arch/x86/hyperv/hv_proc.c
+++ b/arch/x86/hyperv/hv_proc.c
@@ -14,6 +14,30 @@
 
 #include <asm/trace/hyperv.h>
 
+int hv_status_to_errno(u64 hv_status)
+{
+	switch (hv_result(hv_status)) {
+	case HV_STATUS_SUCCESS:
+		return 0;
+	case HV_STATUS_INVALID_PARAMETER:
+	case HV_STATUS_UNKNOWN_PROPERTY:
+	case HV_STATUS_PROPERTY_VALUE_OUT_OF_RANGE:
+	case HV_STATUS_INVALID_VP_INDEX:
+	case HV_STATUS_INVALID_REGISTER_VALUE:
+	case HV_STATUS_INVALID_LP_INDEX:
+		return -EINVAL;
+	case HV_STATUS_ACCESS_DENIED:
+	case HV_STATUS_OPERATION_DENIED:
+		return -EACCES;
+	case HV_STATUS_NOT_ACKNOWLEDGED:
+	case HV_STATUS_INVALID_VP_STATE:
+	case HV_STATUS_INVALID_PARTITION_STATE:
+		return -EBADFD;
+	}
+	return -ENOTRECOVERABLE;
+}
+EXPORT_SYMBOL_GPL(hv_status_to_errno);
+
 /*
  * See struct hv_deposit_memory. The first u64 is partition ID, the rest
  * are GPAs.
@@ -94,7 +118,7 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
 	local_irq_restore(flags);
 	if (!hv_result_success(status)) {
 		pr_err("Failed to deposit pages: %lld\n", status);
-		ret = hv_result(status);
+		ret = hv_status_to_errno(status);
 		goto err_free_allocations;
 	}
 
@@ -150,7 +174,7 @@ int hv_call_add_logical_proc(int node, u32 lp_index, u32 apic_id)
 			if (!hv_result_success(status)) {
 				pr_err("%s: cpu %u apic ID %u, %lld\n", __func__,
 				       lp_index, apic_id, status);
-				ret = hv_result(status);
+				ret = hv_status_to_errno(status);
 			}
 			break;
 		}
@@ -200,7 +224,7 @@ int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
 			if (!hv_result_success(status)) {
 				pr_err("%s: vcpu %u, lp %u, %lld\n", __func__,
 				       vp_index, flags, status);
-				ret = hv_result(status);
+				ret = hv_status_to_errno(status);
 			}
 			break;
 		}
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 67ff0d637e55..c6eb01f3864d 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -169,6 +169,7 @@ int hyperv_flush_guest_mapping_range(u64 as,
 int hyperv_fill_flush_guest_mapping_list(
 		struct hv_guest_mapping_flush_list *flush,
 		u64 start_gfn, u64 end_gfn);
+int hv_status_to_errno(u64 hv_status);
 
 extern bool hv_root_partition;
 
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 515c3fb06ab3..fe6d41d0b114 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -189,16 +189,28 @@ enum HV_GENERIC_SET_FORMAT {
 #define HV_HYPERCALL_REP_START_MASK	GENMASK_ULL(59, 48)
 
 /* hypercall status code */
-#define HV_STATUS_SUCCESS			0
-#define HV_STATUS_INVALID_HYPERCALL_CODE	2
-#define HV_STATUS_INVALID_HYPERCALL_INPUT	3
-#define HV_STATUS_INVALID_ALIGNMENT		4
-#define HV_STATUS_INVALID_PARAMETER		5
-#define HV_STATUS_OPERATION_DENIED		8
-#define HV_STATUS_INSUFFICIENT_MEMORY		11
-#define HV_STATUS_INVALID_PORT_ID		17
-#define HV_STATUS_INVALID_CONNECTION_ID		18
-#define HV_STATUS_INSUFFICIENT_BUFFERS		19
+#define HV_STATUS_SUCCESS			0x0
+#define HV_STATUS_INVALID_HYPERCALL_CODE	0x2
+#define HV_STATUS_INVALID_HYPERCALL_INPUT	0x3
+#define HV_STATUS_INVALID_ALIGNMENT		0x4
+#define HV_STATUS_INVALID_PARAMETER		0x5
+#define HV_STATUS_ACCESS_DENIED			0x6
+#define HV_STATUS_INVALID_PARTITION_STATE	0x7
+#define HV_STATUS_OPERATION_DENIED		0x8
+#define HV_STATUS_UNKNOWN_PROPERTY		0x9
+#define HV_STATUS_PROPERTY_VALUE_OUT_OF_RANGE	0xA
+#define HV_STATUS_INSUFFICIENT_MEMORY		0xB
+#define HV_STATUS_INVALID_PARTITION_ID		0xD
+#define HV_STATUS_INVALID_VP_INDEX		0xE
+#define HV_STATUS_NOT_FOUND			0x10
+#define HV_STATUS_INVALID_PORT_ID		0x11
+#define HV_STATUS_INVALID_CONNECTION_ID		0x12
+#define HV_STATUS_INSUFFICIENT_BUFFERS		0x13
+#define HV_STATUS_NOT_ACKNOWLEDGED		0x14
+#define HV_STATUS_INVALID_VP_STATE		0x15
+#define HV_STATUS_NO_RESOURCES			0x1D
+#define HV_STATUS_INVALID_LP_INDEX		0x41
+#define HV_STATUS_INVALID_REGISTER_VALUE	0x50
 
 /*
  * The Hyper-V TimeRefCount register and the TSC
-- 
2.23.4
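
The status-to-errno mapping added in patch 01 can be tried outside the kernel
with a small userspace sketch like the one below. It is illustrative only. The
status values are copied from the hyperv-tlfs.h hunk above. The sketch assumes
that hv_result() keeps just the low 16 bits of the 64-bit hypercall status
(that helper is not part of this patch), and that the C library provides the
Linux-specific EBADFD and ENOTRECOVERABLE.

```c
#include <errno.h>
#include <stdint.h>

/* Status values copied from the hyperv-tlfs.h hunk in patch 01. */
#define HV_STATUS_SUCCESS                     0x0
#define HV_STATUS_INVALID_PARAMETER           0x5
#define HV_STATUS_ACCESS_DENIED               0x6
#define HV_STATUS_INVALID_PARTITION_STATE     0x7
#define HV_STATUS_OPERATION_DENIED            0x8
#define HV_STATUS_UNKNOWN_PROPERTY            0x9
#define HV_STATUS_PROPERTY_VALUE_OUT_OF_RANGE 0xA
#define HV_STATUS_INSUFFICIENT_MEMORY         0xB
#define HV_STATUS_INVALID_VP_INDEX            0xE
#define HV_STATUS_NOT_ACKNOWLEDGED            0x14
#define HV_STATUS_INVALID_VP_STATE            0x15
#define HV_STATUS_INVALID_LP_INDEX            0x41
#define HV_STATUS_INVALID_REGISTER_VALUE      0x50

/* Assumption: the result code lives in the low 16 bits of the status. */
unsigned int hv_result(uint64_t hv_status)
{
	return (unsigned int)(hv_status & 0xFFFF);
}

/* Userspace mirror of the switch added to arch/x86/hyperv/hv_proc.c. */
int hv_status_to_errno(uint64_t hv_status)
{
	switch (hv_result(hv_status)) {
	case HV_STATUS_SUCCESS:
		return 0;
	case HV_STATUS_INVALID_PARAMETER:
	case HV_STATUS_UNKNOWN_PROPERTY:
	case HV_STATUS_PROPERTY_VALUE_OUT_OF_RANGE:
	case HV_STATUS_INVALID_VP_INDEX:
	case HV_STATUS_INVALID_REGISTER_VALUE:
	case HV_STATUS_INVALID_LP_INDEX:
		return -EINVAL;
	case HV_STATUS_ACCESS_DENIED:
	case HV_STATUS_OPERATION_DENIED:
		return -EACCES;
	case HV_STATUS_NOT_ACKNOWLEDGED:
	case HV_STATUS_INVALID_VP_STATE:
	case HV_STATUS_INVALID_PARTITION_STATE:
		return -EBADFD;
	}
	/* Every status not listed above, including INSUFFICIENT_MEMORY. */
	return -ENOTRECOVERABLE;
}
```

Note that HV_STATUS_INSUFFICIENT_MEMORY has no case of its own and therefore
maps to -ENOTRECOVERABLE. In the callers touched by this patch that status
appears to be checked separately before the conversion is reached, so the
fallthrough should only matter for callers that do not do so.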



* [PATCH v3 02/19] x86/hyperv: convert hyperv statuses to strings
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
  2021-09-28 18:30 ` [PATCH v3 01/19] x86/hyperv: convert hyperv statuses to linux error codes Nuno Das Neves
@ 2021-09-28 18:30 ` Nuno Das Neves
  2021-09-28 18:30 ` [PATCH v3 03/19] drivers/hv: minimal mshv module (/dev/mshv/) Nuno Das Neves
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:30 UTC
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Allow hyperv hypercall failures to be debugged more easily with dmesg.
This will be used in the mshv module.

Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
 arch/x86/hyperv/hv_init.c         |  2 +-
 arch/x86/hyperv/hv_proc.c         | 19 ++++++++---
 include/asm-generic/hyperv-tlfs.h | 52 ++++++++++++++++++-------------
 include/asm-generic/mshyperv.h    |  1 +
 4 files changed, 46 insertions(+), 28 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index bb0ae4b5c00f..722bafdb2225 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -349,7 +349,7 @@ static void __init hv_get_partition_id(void)
 	status = hv_do_hypercall(HVCALL_GET_PARTITION_ID, NULL, output_page);
 	if (!hv_result_success(status)) {
 		/* No point in proceeding if this failed */
-		pr_err("Failed to get partition ID: %lld\n", status);
+		pr_err("Failed to get partition ID: %s\n", hv_status_to_string(status));
 		BUG();
 	}
 	hv_current_partition_id = output_page->partition_id;
diff --git a/arch/x86/hyperv/hv_proc.c b/arch/x86/hyperv/hv_proc.c
index 59cf9a9e0975..e75c78a243e7 100644
--- a/arch/x86/hyperv/hv_proc.c
+++ b/arch/x86/hyperv/hv_proc.c
@@ -38,6 +38,15 @@ int hv_status_to_errno(u64 hv_status)
 }
 EXPORT_SYMBOL_GPL(hv_status_to_errno);
 
+const char *hv_status_to_string(u64 hv_status)
+{
+	switch (hv_result(hv_status)) {
+	__HV_STATUS_DEF(__HV_MAKE_HV_STATUS_CASE)
+	default : return "Unknown";
+	}
+}
+EXPORT_SYMBOL_GPL(hv_status_to_string);
+
 /*
  * See struct hv_deposit_memory. The first u64 is partition ID, the rest
  * are GPAs.
@@ -117,7 +126,7 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
 				     page_count, 0, input_page, NULL);
 	local_irq_restore(flags);
 	if (!hv_result_success(status)) {
-		pr_err("Failed to deposit pages: %lld\n", status);
+		pr_err("Failed to deposit pages: %s\n", hv_status_to_string(status));
 		ret = hv_status_to_errno(status);
 		goto err_free_allocations;
 	}
@@ -172,8 +181,8 @@ int hv_call_add_logical_proc(int node, u32 lp_index, u32 apic_id)
 
 		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
 			if (!hv_result_success(status)) {
-				pr_err("%s: cpu %u apic ID %u, %lld\n", __func__,
-				       lp_index, apic_id, status);
+				pr_err("%s: cpu %u apic ID %u, %s\n", __func__,
+				       lp_index, apic_id, hv_status_to_string(status));
 				ret = hv_status_to_errno(status);
 			}
 			break;
@@ -222,8 +231,8 @@ int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
 
 		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
 			if (!hv_result_success(status)) {
-				pr_err("%s: vcpu %u, lp %u, %lld\n", __func__,
-				       vp_index, flags, status);
+				pr_err("%s: vcpu %u, lp %u, %s\n", __func__,
+				       vp_index, flags, hv_status_to_string(status));
 				ret = hv_status_to_errno(status);
 			}
 			break;
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index fe6d41d0b114..40ff7cdd4a2b 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -189,28 +189,36 @@ enum HV_GENERIC_SET_FORMAT {
 #define HV_HYPERCALL_REP_START_MASK	GENMASK_ULL(59, 48)
 
 /* hypercall status code */
-#define HV_STATUS_SUCCESS			0x0
-#define HV_STATUS_INVALID_HYPERCALL_CODE	0x2
-#define HV_STATUS_INVALID_HYPERCALL_INPUT	0x3
-#define HV_STATUS_INVALID_ALIGNMENT		0x4
-#define HV_STATUS_INVALID_PARAMETER		0x5
-#define HV_STATUS_ACCESS_DENIED			0x6
-#define HV_STATUS_INVALID_PARTITION_STATE	0x7
-#define HV_STATUS_OPERATION_DENIED		0x8
-#define HV_STATUS_UNKNOWN_PROPERTY		0x9
-#define HV_STATUS_PROPERTY_VALUE_OUT_OF_RANGE	0xA
-#define HV_STATUS_INSUFFICIENT_MEMORY		0xB
-#define HV_STATUS_INVALID_PARTITION_ID		0xD
-#define HV_STATUS_INVALID_VP_INDEX		0xE
-#define HV_STATUS_NOT_FOUND			0x10
-#define HV_STATUS_INVALID_PORT_ID		0x11
-#define HV_STATUS_INVALID_CONNECTION_ID		0x12
-#define HV_STATUS_INSUFFICIENT_BUFFERS		0x13
-#define HV_STATUS_NOT_ACKNOWLEDGED		0x14
-#define HV_STATUS_INVALID_VP_STATE		0x15
-#define HV_STATUS_NO_RESOURCES			0x1D
-#define HV_STATUS_INVALID_LP_INDEX		0x41
-#define HV_STATUS_INVALID_REGISTER_VALUE	0x50
+#define __HV_STATUS_DEF(OP) \
+	OP(HV_STATUS_SUCCESS,				0x0) \
+	OP(HV_STATUS_INVALID_HYPERCALL_CODE,		0x2) \
+	OP(HV_STATUS_INVALID_HYPERCALL_INPUT,		0x3) \
+	OP(HV_STATUS_INVALID_ALIGNMENT,			0x4) \
+	OP(HV_STATUS_INVALID_PARAMETER,			0x5) \
+	OP(HV_STATUS_ACCESS_DENIED,			0x6) \
+	OP(HV_STATUS_INVALID_PARTITION_STATE,		0x7) \
+	OP(HV_STATUS_OPERATION_DENIED,			0x8) \
+	OP(HV_STATUS_UNKNOWN_PROPERTY,			0x9) \
+	OP(HV_STATUS_PROPERTY_VALUE_OUT_OF_RANGE,	0xA) \
+	OP(HV_STATUS_INSUFFICIENT_MEMORY,		0xB) \
+	OP(HV_STATUS_INVALID_PARTITION_ID,		0xD) \
+	OP(HV_STATUS_INVALID_VP_INDEX,			0xE) \
+	OP(HV_STATUS_NOT_FOUND,				0x10) \
+	OP(HV_STATUS_INVALID_PORT_ID,			0x11) \
+	OP(HV_STATUS_INVALID_CONNECTION_ID,		0x12) \
+	OP(HV_STATUS_INSUFFICIENT_BUFFERS,		0x13) \
+	OP(HV_STATUS_NOT_ACKNOWLEDGED,			0x14) \
+	OP(HV_STATUS_INVALID_VP_STATE,			0x15) \
+	OP(HV_STATUS_NO_RESOURCES,			0x1D) \
+	OP(HV_STATUS_INVALID_LP_INDEX,			0x41) \
+	OP(HV_STATUS_INVALID_REGISTER_VALUE,		0x50)
+
+#define __HV_MAKE_HV_STATUS_ENUM(NAME, VAL) NAME = (VAL),
+#define __HV_MAKE_HV_STATUS_CASE(NAME, VAL) case (NAME): return (#NAME);
+
+enum hv_status {
+	__HV_STATUS_DEF(__HV_MAKE_HV_STATUS_ENUM)
+};
 
 /*
  * The Hyper-V TimeRefCount register and the TSC
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index 9a000ba2bb75..672b08f79dae 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -219,6 +219,7 @@ static inline int cpumask_to_vpset(struct hv_vpset *vpset,
 	return nr_bank;
 }
 
+const char *hv_status_to_string(u64 hv_status);
 void hyperv_report_panic(struct pt_regs *regs, long err, bool in_die);
 bool hv_is_hyperv_initialized(void);
 bool hv_is_hibernation_supported(void);
-- 
2.23.4
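
Patch 02 relies on the "X-macro" technique: a single list macro is expanded
once to build the enum and once to build the switch cases, so names and
strings cannot drift apart. The standalone sketch below shows the same idea
with a reduced list. The DEMO_ macro names are hypothetical and chosen to
avoid the reserved double-underscore prefix in userspace, and masking with
0xFFFF stands in for hv_result() as an assumption.

```c
#include <stdint.h>

/* Reduced list for illustration; the patch defines the full set. */
#define DEMO_HV_STATUS_DEF(OP) \
	OP(HV_STATUS_SUCCESS,			0x0) \
	OP(HV_STATUS_INVALID_PARAMETER,		0x5) \
	OP(HV_STATUS_INSUFFICIENT_MEMORY,	0xB) \
	OP(HV_STATUS_INVALID_VP_STATE,		0x15)

/* Expands one list entry into an enumerator. */
#define DEMO_MAKE_ENUM(NAME, VAL) NAME = (VAL),
/* Expands one list entry into a case that returns the stringified name. */
#define DEMO_MAKE_CASE(NAME, VAL) case (NAME): return (#NAME);

enum hv_status {
	DEMO_HV_STATUS_DEF(DEMO_MAKE_ENUM)
};

const char *hv_status_to_string(uint64_t hv_status)
{
	/* Assumption: the result code is the low 16 bits of the status. */
	switch (hv_status & 0xFFFF) {
	DEMO_HV_STATUS_DEF(DEMO_MAKE_CASE)
	default:
		return "Unknown";
	}
}
```

Adding a status then means adding one OP(...) line to the list; both the enum
and the string table pick it up on the next build.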



* [PATCH v3 03/19] drivers/hv: minimal mshv module (/dev/mshv/)
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
  2021-09-28 18:30 ` [PATCH v3 01/19] x86/hyperv: convert hyperv statuses to linux error codes Nuno Das Neves
  2021-09-28 18:30 ` [PATCH v3 02/19] x86/hyperv: convert hyperv statuses to strings Nuno Das Neves
@ 2021-09-28 18:30 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 04/19] drivers/hv: check extension ioctl Nuno Das Neves
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:30 UTC
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Introduce a barebones module file for the mshv API.
Introduce CONFIG_HYPERV_VMM_API for controlling compilation of mshv.

Co-developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
 arch/x86/hyperv/Makefile |  1 +
 drivers/hv/Kconfig       | 18 ++++++++++
 drivers/hv/Makefile      |  3 ++
 drivers/hv/mshv_main.c   | 77 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 99 insertions(+)
 create mode 100644 drivers/hv/mshv_main.c

diff --git a/arch/x86/hyperv/Makefile b/arch/x86/hyperv/Makefile
index 48e2c51464e8..4b5a8a96ba01 100644
--- a/arch/x86/hyperv/Makefile
+++ b/arch/x86/hyperv/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_X86_64)	+= hv_apic.o hv_proc.o
 ifdef CONFIG_X86_64
 obj-$(CONFIG_PARAVIRT_SPINLOCKS)	+= hv_spinlock.o
 endif
+
diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index 66c794d92391..0fc3fcce3cf7 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -27,4 +27,22 @@ config HYPERV_BALLOON
 	help
 	  Select this option to enable Hyper-V Balloon driver.
 
+config HYPERV_VMM_API
+	tristate "Microsoft Hypervisor root partition interfaces: /dev/mshv"
+	depends on HYPERV
+	help
+	  Provides access to interfaces for managing guest virtual machines
+	  running under the Microsoft Hypervisor.
+
+	  These interfaces will only work when Linux is running as root
+	  partition on the Microsoft Hypervisor.
+
+	  The interfaces are provided via a device named /dev/mshv.
+
+	  To compile this as a module, choose M here.
+	  The module is named mshv.
+
+	  If unsure, say N.
+
+
 endmenu
diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
index 94daf8240c95..84e7fe2022a0 100644
--- a/drivers/hv/Makefile
+++ b/drivers/hv/Makefile
@@ -2,6 +2,7 @@
 obj-$(CONFIG_HYPERV)		+= hv_vmbus.o
 obj-$(CONFIG_HYPERV_UTILS)	+= hv_utils.o
 obj-$(CONFIG_HYPERV_BALLOON)	+= hv_balloon.o
+obj-$(CONFIG_HYPERV_VMM_API)	+= mshv.o
 
 CFLAGS_hv_trace.o = -I$(src)
 CFLAGS_hv_balloon.o = -I$(src)
@@ -11,3 +12,5 @@ hv_vmbus-y := vmbus_drv.o \
 		 channel_mgmt.o ring_buffer.o hv_trace.o
 hv_vmbus-$(CONFIG_HYPERV_TESTING)	+= hv_debugfs.o
 hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_fcopy.o hv_utils_transport.o
+
+mshv-y				+= mshv_main.o
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
new file mode 100644
index 000000000000..e44adf91f660
--- /dev/null
+++ b/drivers/hv/mshv_main.c
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2020, Microsoft Corporation.
+ *
+ * Authors:
+ *   Nuno Das Neves <nudasnev@microsoft.com>
+ *   Lillian Grassin-Drake <ligrassi@microsoft.com>
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/miscdevice.h>
+
+MODULE_AUTHOR("Microsoft");
+MODULE_LICENSE("GPL");
+
+static int mshv_dev_open(struct inode *inode, struct file *filp);
+static int mshv_dev_release(struct inode *inode, struct file *filp);
+static long mshv_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg);
+
+static const struct file_operations mshv_dev_fops = {
+	.owner = THIS_MODULE,
+	.open = mshv_dev_open,
+	.release = mshv_dev_release,
+	.unlocked_ioctl = mshv_dev_ioctl,
+	.llseek = noop_llseek,
+};
+
+static struct miscdevice mshv_dev = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = "mshv",
+	.fops = &mshv_dev_fops,
+	.mode = 0600,
+};
+
+static long
+mshv_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
+{
+	return -ENOTTY;
+}
+
+static int
+mshv_dev_open(struct inode *inode, struct file *filp)
+{
+	return 0;
+}
+
+static int
+mshv_dev_release(struct inode *inode, struct file *filp)
+{
+	return 0;
+}
+
+static int
+__init mshv_init(void)
+{
+	int ret;
+
+	if (!hv_is_hyperv_initialized())
+		return -ENODEV;
+
+	ret = misc_register(&mshv_dev);
+	if (ret)
+		pr_err("%s: misc device register failed\n", __func__);
+
+	return ret;
+}
+
+static void
+__exit mshv_exit(void)
+{
+	misc_deregister(&mshv_dev);
+}
+
+module_init(mshv_init);
+module_exit(mshv_exit);
-- 
2.23.4



* [PATCH v3 04/19] drivers/hv: check extension ioctl
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (2 preceding siblings ...)
  2021-09-28 18:30 ` [PATCH v3 03/19] drivers/hv: minimal mshv module (/dev/mshv/) Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 05/19] drivers/hv: create partition ioctl Nuno Das Neves
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Reserve ioctl number in userspace-api/ioctl/ioctl-number.rst.
Introduce MSHV_CHECK_EXTENSION ioctl.
Introduce documentation for /dev/mshv in Documentation/virt/mshv

Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
 .../userspace-api/ioctl/ioctl-number.rst      |  2 +
 Documentation/virt/mshv/api.rst               | 60 +++++++++++++++++++
 drivers/hv/mshv_main.c                        | 23 +++++++
 include/linux/mshv.h                          | 11 ++++
 include/uapi/linux/mshv.h                     | 20 +++++++
 5 files changed, 116 insertions(+)
 create mode 100644 Documentation/virt/mshv/api.rst
 create mode 100644 include/linux/mshv.h
 create mode 100644 include/uapi/linux/mshv.h

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 9bfc2b510c64..585d9cc42a5a 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -350,6 +350,8 @@ Code  Seq#    Include File                                           Comments
 0xB6  all    linux/fpga-dfl.h
 0xB7  all    uapi/linux/remoteproc_cdev.h                            <mailto:linux-remoteproc@vger.kernel.org>
 0xB7  all    uapi/linux/nsfs.h                                       <mailto:Andrei Vagin <avagin@openvz.org>>
+0xB8  all    uapi/linux/mshv.h                                       Microsoft Hypervisor root partition APIs
+                                                                     <mailto:linux-hyperv@vger.kernel.org>
 0xC0  00-0F  linux/usb/iowarrior.h
 0xCA  00-0F  uapi/misc/cxl.h
 0xCA  10-2F  uapi/misc/ocxl.h
diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
new file mode 100644
index 000000000000..75c5e073ecc0
--- /dev/null
+++ b/Documentation/virt/mshv/api.rst
@@ -0,0 +1,60 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=====================================================
+Microsoft Hypervisor Root Partition API Documentation
+=====================================================
+
+1. Overview
+===========
+
+This document describes APIs for creating and managing guest virtual machines
+when running Linux as the root partition on the Microsoft Hypervisor.
+
+Note that this API is not yet stable!
+
+2. Glossary/Terms
+=================
+
+hv
+--
+Short for Hyper-V. This name is used in the kernel to describe interfaces to
+the Microsoft Hypervisor.
+
+mshv
+----
+Short for Microsoft Hypervisor. This is the name of the userland API module
+described in this document.
+
+Partition
+---------
+A virtual machine running on the Microsoft Hypervisor.
+
+Root Partition
+--------------
+The partition that is created and assumes control when the machine boots. The
+root partition can use mshv APIs to create guest partitions.
+
+3. API description
+==================
+
+The module is named mshv and can be configured with CONFIG_HYPERV_ROOT_API.
+
+Mshv is file descriptor-based, following a similar pattern to KVM.
+
+To get a handle to the mshv driver, use open("/dev/mshv").
+
+3.1 MSHV_CHECK_EXTENSION
+------------------------
+:Type: /dev/mshv ioctl
+:Parameters: pointer to a u32
+:Returns: 0 if extension unsupported, positive number if supported
+
+This ioctl takes a single argument corresponding to an API extension to check
+support for.
+
+If the extension is supported, MSHV_CHECK_EXTENSION will return a positive
+number. If not, it will return 0.
+
+The first extension that can be checked is MSHV_CAP_CORE_API_STABLE. This
+will be supported when the core API is stable.
+
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index e44adf91f660..d73b64ea1448 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -11,6 +11,8 @@
 #include <linux/module.h>
 #include <linux/fs.h>
 #include <linux/miscdevice.h>
+#include <linux/slab.h>
+#include <linux/mshv.h>
 
 MODULE_AUTHOR("Microsoft");
 MODULE_LICENSE("GPL");
@@ -34,9 +36,30 @@ static struct miscdevice mshv_dev = {
 	.mode = 0600,
 };
 
+static long
+mshv_ioctl_check_extension(void __user *user_arg)
+{
+	u32 arg;
+
+	if (copy_from_user(&arg, user_arg, sizeof(arg)))
+		return -EFAULT;
+
+	switch (arg) {
+	case MSHV_CAP_CORE_API_STABLE:
+		return 0;
+	}
+
+	return -EOPNOTSUPP;
+}
+
 static long
 mshv_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
+	switch (ioctl) {
+	case MSHV_CHECK_EXTENSION:
+		return mshv_ioctl_check_extension((void __user *)arg);
+	}
+
 	return -ENOTTY;
 }
 
diff --git a/include/linux/mshv.h b/include/linux/mshv.h
new file mode 100644
index 000000000000..a0982fe2c0b8
--- /dev/null
+++ b/include/linux/mshv.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _LINUX_MSHV_H
+#define _LINUX_MSHV_H
+
+/*
+ * Microsoft Hypervisor root partition driver for /dev/mshv
+ */
+
+#include <uapi/linux/mshv.h>
+
+#endif
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
new file mode 100644
index 000000000000..3b84e3ea97be
--- /dev/null
+++ b/include/uapi/linux/mshv.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_LINUX_MSHV_H
+#define _UAPI_LINUX_MSHV_H
+
+/*
+ * Userspace interface for /dev/mshv
+ * Microsoft Hypervisor root partition APIs
+ * NOTE: This API is not yet stable!
+ */
+
+#include <linux/types.h>
+
+#define MSHV_CAP_CORE_API_STABLE    0x0
+
+#define MSHV_IOCTL 0xB8
+
+/* mshv device */
+#define MSHV_CHECK_EXTENSION    _IOW(MSHV_IOCTL, 0x00, __u32)
+
+#endif
-- 
2.23.4
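
A userspace caller of MSHV_CHECK_EXTENSION could look roughly like the sketch
below. The three defines are copied from include/uapi/linux/mshv.h in patch
04, with uint32_t standing in for __u32 so that the example builds without the
new header installed. The wrapper name and its return convention are
hypothetical. api.rst documents 0 as "unsupported" and a positive value as
"supported", while the handler in this version returns 0 for
MSHV_CAP_CORE_API_STABLE and -EOPNOTSUPP for anything else, so the wrapper
treats both 0 and EOPNOTSUPP as "not supported".

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Copied from include/uapi/linux/mshv.h (uint32_t in place of __u32). */
#define MSHV_CAP_CORE_API_STABLE 0x0
#define MSHV_IOCTL               0xB8
#define MSHV_CHECK_EXTENSION     _IOW(MSHV_IOCTL, 0x00, uint32_t)

/*
 * Hypothetical wrapper around MSHV_CHECK_EXTENSION.
 * Returns 1 if the extension is supported, 0 if it is not, and a negative
 * errno value for any other failure (bad descriptor, fault, and so on).
 */
int mshv_has_extension(int mshv_fd, uint32_t cap)
{
	int ret = ioctl(mshv_fd, MSHV_CHECK_EXTENSION, &cap);

	if (ret > 0)
		return 1;
	if (ret == 0 || errno == EOPNOTSUPP)
		return 0;
	return -errno;
}
```

With a descriptor obtained from open("/dev/mshv"), a VMM could call
mshv_has_extension(fd, MSHV_CAP_CORE_API_STABLE) at startup and refuse to rely
on a stable ABI when the result is 0.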



* [PATCH v3 05/19] drivers/hv: create partition ioctl
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (3 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 04/19] drivers/hv: check extension ioctl Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 06/19] drivers/hv: create, initialize, finalize, delete partition hypercalls Nuno Das Neves
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Add MSHV_CREATE_PARTITION, which creates an fd to track a new partition.
Partition is not yet created in the hypervisor itself.
Introduce header files for userspace-facing hyperv structures.

Co-developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
 Documentation/virt/mshv/api.rst         |  12 ++
 arch/x86/include/asm/hyperv-tlfs.h      |   1 +
 arch/x86/include/uapi/asm/hyperv-tlfs.h | 124 +++++++++++++++++++++
 drivers/hv/mshv_main.c                  | 141 ++++++++++++++++++++++++
 include/asm-generic/hyperv-tlfs.h       |   1 +
 include/linux/mshv.h                    |  16 +++
 include/uapi/asm-generic/hyperv-tlfs.h  |  15 +++
 include/uapi/linux/mshv.h               |   8 ++
 8 files changed, 318 insertions(+)
 create mode 100644 arch/x86/include/uapi/asm/hyperv-tlfs.h
 create mode 100644 include/uapi/asm-generic/hyperv-tlfs.h

diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
index 75c5e073ecc0..f92892b27ccc 100644
--- a/Documentation/virt/mshv/api.rst
+++ b/Documentation/virt/mshv/api.rst
@@ -39,6 +39,9 @@ root partition can use mshv APIs to create guest partitions.
 
 The module is named mshv and can be configured with CONFIG_HYPERV_ROOT_API.
 
+The uapi header files you need are linux/mshv.h, asm/hyperv-tlfs.h, and
+asm-generic/hyperv-tlfs.h.
+
 Mshv is file descriptor-based, following a similar pattern to KVM.
 
 To get a handle to the mshv driver, use open("/dev/mshv").
@@ -58,3 +61,12 @@ number. If not, it will return 0.
 The first extension that can be checked is MSHV_CAP_CORE_API_STABLE. This
 will be supported when the core API is stable.
 
+3.2 MSHV_CREATE_PARTITION
+-------------------------
+:Type: /dev/mshv ioctl
+:Parameters: struct mshv_create_partition
+:Returns: partition file descriptor, or -1 on failure
+
+This ioctl creates a guest partition, returning a file descriptor to use as a
+handle for partition ioctls.
+
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 606f5cc579b2..2b6f7dca79e6 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -11,6 +11,7 @@
 
 #include <linux/types.h>
 #include <asm/page.h>
+#include <uapi/asm/hyperv-tlfs.h>
 /*
  * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent
  * is set by CPUID(HvCpuIdFunctionVersionAndFeatures).
diff --git a/arch/x86/include/uapi/asm/hyperv-tlfs.h b/arch/x86/include/uapi/asm/hyperv-tlfs.h
new file mode 100644
index 000000000000..8a5fc59bb33a
--- /dev/null
+++ b/arch/x86/include/uapi/asm/hyperv-tlfs.h
@@ -0,0 +1,124 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_ASM_X86_HYPERV_TLFS_USER_H
+#define _UAPI_ASM_X86_HYPERV_TLFS_USER_H
+
+#include <linux/types.h>
+
+#define HV_PARTITION_PROCESSOR_FEATURE_BANKS 2
+
+union hv_partition_processor_features {
+	struct {
+		__u64 sse3_support:1;
+		__u64 lahf_sahf_support:1;
+		__u64 ssse3_support:1;
+		__u64 sse4_1_support:1;
+		__u64 sse4_2_support:1;
+		__u64 sse4a_support:1;
+		__u64 xop_support:1;
+		__u64 pop_cnt_support:1;
+		__u64 cmpxchg16b_support:1;
+		__u64 altmovcr8_support:1;
+		__u64 lzcnt_support:1;
+		__u64 mis_align_sse_support:1;
+		__u64 mmx_ext_support:1;
+		__u64 amd3dnow_support:1;
+		__u64 extended_amd3dnow_support:1;
+		__u64 page_1gb_support:1;
+		__u64 aes_support:1;
+		__u64 pclmulqdq_support:1;
+		__u64 pcid_support:1;
+		__u64 fma4_support:1;
+		__u64 f16c_support:1;
+		__u64 rd_rand_support:1;
+		__u64 rd_wr_fs_gs_support:1;
+		__u64 smep_support:1;
+		__u64 enhanced_fast_string_support:1;
+		__u64 bmi1_support:1;
+		__u64 bmi2_support:1;
+		__u64 hle_support_deprecated:1;
+		__u64 rtm_support_deprecated:1;
+		__u64 movbe_support:1;
+		__u64 npiep1_support:1;
+		__u64 dep_x87_fpu_save_support:1;
+		__u64 rd_seed_support:1;
+		__u64 adx_support:1;
+		__u64 intel_prefetch_support:1;
+		__u64 smap_support:1;
+		__u64 hle_support:1;
+		__u64 rtm_support:1;
+		__u64 rdtscp_support:1;
+		__u64 clflushopt_support:1;
+		__u64 clwb_support:1;
+		__u64 sha_support:1;
+		__u64 x87_pointers_saved_support:1;
+		__u64 invpcid_support:1;
+		__u64 ibrs_support:1;
+		__u64 stibp_support:1;
+		__u64 ibpb_support:1;
+		__u64 unrestricted_guest_support:1;
+		__u64 mdd_support:1;
+		__u64 fast_short_rep_mov_support:1;
+		__u64 l1dcache_flush_support:1;
+		__u64 rdcl_no_support:1;
+		__u64 ibrs_all_support:1;
+		__u64 skip_l1df_support:1;
+		__u64 ssb_no_support:1;
+		__u64 rsb_a_no_support:1;
+		__u64 virt_spec_ctrl_support:1;
+		__u64 rd_pid_support:1;
+		__u64 umip_support:1;
+		__u64 mbs_no_support:1;
+		__u64 mb_clear_support:1;
+		__u64 taa_no_support:1;
+		__u64 tsx_ctrl_support:1;
+		/*
+		 * N.B. The final processor feature bit in bank 0 is reserved to
+		 * simplify potential downlevel backports.
+		 */
+		__u64 reserved_bank0:1;
+
+		/* N.B. Begin bank 1 processor features. */
+		__u64 acount_mcount_support:1;
+		__u64 tsc_invariant_support:1;
+		__u64 cl_zero_support:1;
+		__u64 rdpru_support:1;
+		__u64 la57_support:1;
+		__u64 mbec_support:1;
+		__u64 nested_virt_support:1;
+		__u64 psfd_support:1;
+		__u64 cet_ss_support:1;
+		__u64 cet_ibt_support:1;
+		__u64 vmx_exception_inject_support:1;
+		__u64 enqcmd_support:1;
+		__u64 umwait_tpause_support:1;
+		__u64 movdiri_support:1;
+		__u64 movdir64b_support:1;
+		__u64 cldemote_support:1;
+		__u64 serialize_support:1;
+		__u64 tsc_deadline_tmr_support:1;
+		__u64 tsc_adjust_support:1;
+		__u64 fzlrep_movsb:1;
+		__u64 fsrep_stosb:1;
+		__u64 fsrep_cmpsb:1;
+		__u64 reserved_bank1:42;
+	} __packed;
+	__u64 as_uint64[HV_PARTITION_PROCESSOR_FEATURE_BANKS];
+};
+
+union hv_partition_processor_xsave_features {
+	struct {
+		__u64 xsave_support : 1;
+		__u64 xsaveopt_support : 1;
+		__u64 avx_support : 1;
+		__u64 reserved1 : 61;
+	} __packed;
+	__u64 as_uint64;
+};
+
+struct hv_partition_creation_properties {
+	union hv_partition_processor_features disabled_processor_features;
+	union hv_partition_processor_xsave_features
+		disabled_processor_xsave_features;
+} __packed;
+
+#endif
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index d73b64ea1448..378bced4044c 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -12,15 +12,28 @@
 #include <linux/fs.h>
 #include <linux/miscdevice.h>
 #include <linux/slab.h>
+#include <linux/file.h>
+#include <linux/anon_inodes.h>
 #include <linux/mshv.h>
 
 MODULE_AUTHOR("Microsoft");
 MODULE_LICENSE("GPL");
 
+struct mshv mshv = {};
+
+static void mshv_partition_put(struct mshv_partition *partition);
+static int mshv_partition_release(struct inode *inode, struct file *filp);
+static long mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg);
 static int mshv_dev_open(struct inode *inode, struct file *filp);
 static int mshv_dev_release(struct inode *inode, struct file *filp);
 static long mshv_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg);
 
+static const struct file_operations mshv_partition_fops = {
+	.release = mshv_partition_release,
+	.unlocked_ioctl = mshv_partition_ioctl,
+	.llseek = noop_llseek,
+};
+
 static const struct file_operations mshv_dev_fops = {
 	.owner = THIS_MODULE,
 	.open = mshv_dev_open,
@@ -36,6 +49,130 @@ static struct miscdevice mshv_dev = {
 	.mode = 0600,
 };
 
+static long
+mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
+{
+	return -ENOTTY;
+}
+
+static void
+destroy_partition(struct mshv_partition *partition)
+{
+	unsigned long flags;
+	int i;
+
+	/* Remove from list of partitions */
+	spin_lock_irqsave(&mshv.partitions.lock, flags);
+
+	for (i = 0; i < MSHV_MAX_PARTITIONS; ++i) {
+		if (mshv.partitions.array[i] == partition)
+			break;
+	}
+
+	if (i == MSHV_MAX_PARTITIONS) {
+		pr_err("%s: failed to locate partition in array\n", __func__);
+	} else {
+		mshv.partitions.count--;
+		mshv.partitions.array[i] = NULL;
+	}
+
+	spin_unlock_irqrestore(&mshv.partitions.lock, flags);
+
+	kfree(partition);
+}
+
+static void
+mshv_partition_put(struct mshv_partition *partition)
+{
+	if (refcount_dec_and_test(&partition->ref_count))
+		destroy_partition(partition);
+}
+
+static int
+mshv_partition_release(struct inode *inode, struct file *filp)
+{
+	struct mshv_partition *partition = filp->private_data;
+
+	mshv_partition_put(partition);
+
+	return 0;
+}
+
+static int
+add_partition(struct mshv_partition *partition)
+{
+	unsigned long flags;
+	int i, ret = 0;
+
+	spin_lock_irqsave(&mshv.partitions.lock, flags);
+
+	if (mshv.partitions.count >= MSHV_MAX_PARTITIONS) {
+		pr_err("%s: too many partitions\n", __func__);
+		ret = -ENOSPC;
+		goto out_unlock;
+	}
+
+	for (i = 0; i < MSHV_MAX_PARTITIONS; ++i) {
+		if (!mshv.partitions.array[i])
+			break;
+	}
+
+	mshv.partitions.count++;
+	mshv.partitions.array[i] = partition;
+
+out_unlock:
+	spin_unlock_irqrestore(&mshv.partitions.lock, flags);
+
+	return ret;
+}
+
+static long
+mshv_ioctl_create_partition(void __user *user_arg)
+{
+	struct mshv_create_partition args;
+	struct mshv_partition *partition;
+	struct file *file;
+	int fd;
+	long ret;
+
+	if (copy_from_user(&args, user_arg, sizeof(args)))
+		return -EFAULT;
+
+	partition = kzalloc(sizeof(*partition), GFP_KERNEL);
+	if (!partition)
+		return -ENOMEM;
+
+	fd = get_unused_fd_flags(O_CLOEXEC);
+	if (fd < 0) {
+		ret = fd;
+		goto free_partition;
+	}
+
+	file = anon_inode_getfile("mshv_partition", &mshv_partition_fops,
+				  partition, O_RDWR);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		goto put_fd;
+	}
+	refcount_set(&partition->ref_count, 1);
+
+	ret = add_partition(partition);
+	if (ret)
+		goto release_file;
+
+	fd_install(fd, file);
+
+	return fd;
+
+release_file:
+	file->f_op->release(file->f_inode, file);
+put_fd:
+	put_unused_fd(fd);
+free_partition:
+	kfree(partition);
+	return ret;
+}
+
 static long
 mshv_ioctl_check_extension(void __user *user_arg)
 {
@@ -58,6 +195,8 @@ mshv_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 	switch (ioctl) {
 	case MSHV_CHECK_EXTENSION:
 		return mshv_ioctl_check_extension((void __user *)arg);
+	case MSHV_CREATE_PARTITION:
+		return mshv_ioctl_create_partition((void __user *)arg);
 	}
 
 	return -ENOTTY;
@@ -87,6 +226,8 @@ __init mshv_init(void)
 	if (ret)
 		pr_err("%s: misc device register failed\n", __func__);
 
+	spin_lock_init(&mshv.partitions.lock);
+
 	return ret;
 }
 
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 40ff7cdd4a2b..50dc6eafb6a6 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -12,6 +12,7 @@
 #include <linux/types.h>
 #include <linux/bits.h>
 #include <linux/time64.h>
+#include <uapi/asm-generic/hyperv-tlfs.h>
 
 /*
  * While not explicitly listed in the TLFS, Hyper-V always runs with a page size
diff --git a/include/linux/mshv.h b/include/linux/mshv.h
index a0982fe2c0b8..fc4f35089b2c 100644
--- a/include/linux/mshv.h
+++ b/include/linux/mshv.h
@@ -6,6 +6,22 @@
  * Microsoft Hypervisor root partition driver for /dev/mshv
  */
 
+#include <linux/spinlock.h>
 #include <uapi/linux/mshv.h>
 
+#define MSHV_MAX_PARTITIONS		128
+
+struct mshv_partition {
+	u64 id;
+	refcount_t ref_count;
+};
+
+struct mshv {
+	struct {
+		spinlock_t lock;
+		u64 count;
+		struct mshv_partition *array[MSHV_MAX_PARTITIONS];
+	} partitions;
+};
+
 #endif
diff --git a/include/uapi/asm-generic/hyperv-tlfs.h b/include/uapi/asm-generic/hyperv-tlfs.h
new file mode 100644
index 000000000000..7a858226a9c5
--- /dev/null
+++ b/include/uapi/asm-generic/hyperv-tlfs.h
@@ -0,0 +1,15 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_ASM_GENERIC_HYPERV_TLFS_USER_H
+#define _UAPI_ASM_GENERIC_HYPERV_TLFS_USER_H
+
+#ifndef BIT
+#define BIT(X)	(1ULL << (X))
+#endif
+
+/* Userspace-visible partition creation flags */
+#define HV_PARTITION_CREATION_FLAG_SMT_ENABLED_GUEST                BIT(0)
+#define HV_PARTITION_CREATION_FLAG_GPA_LARGE_PAGES_DISABLED         BIT(3)
+#define HV_PARTITION_CREATION_FLAG_GPA_SUPER_PAGES_ENABLED          BIT(4)
+#define HV_PARTITION_CREATION_FLAG_LAPIC_ENABLED                    BIT(13)
+
+#endif
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index 3b84e3ea97be..03b1ed66245d 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -9,12 +9,20 @@
  */
 
 #include <linux/types.h>
+#include <asm/hyperv-tlfs.h>
+#include <asm-generic/hyperv-tlfs.h>
 
 #define MSHV_CAP_CORE_API_STABLE    0x0
 
+struct mshv_create_partition {
+	__u64 flags;
+	struct hv_partition_creation_properties partition_creation_properties;
+};
+
 #define MSHV_IOCTL 0xB8
 
 /* mshv device */
 #define MSHV_CHECK_EXTENSION    _IOW(MSHV_IOCTL, 0x00, __u32)
+#define MSHV_CREATE_PARTITION	_IOW(MSHV_IOCTL, 0x01, struct mshv_create_partition)
 
 #endif
-- 
2.23.4



* [PATCH v3 06/19] drivers/hv: create, initialize, finalize, delete partition hypercalls
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (4 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 05/19] drivers/hv: create partition ioctl Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 07/19] drivers/hv: withdraw memory hypercall Nuno Das Neves
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Add hypercalls for fully setting up and mostly tearing down a guest
partition.
Export hv_call_deposit_pages and hv_call_create_vp.
The teardown operation will generate an error as the deposited
memory has not been withdrawn.
This is fixed in the next patch.
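
To illustrate the deposit-and-retry pattern the new hypercall helpers
follow, here is a minimal userspace sketch. It is not the driver code:
the sketch_* names, the status enum and the page budget are hypothetical
stand-ins, and the "hypervisor" is just a counter that succeeds once
enough pages have been deposited.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch; not the real TLFS values. */
enum sketch_status {
	SKETCH_SUCCESS,
	SKETCH_INSUFFICIENT_MEMORY,
	SKETCH_INVALID_PARAMETER,
};

/* Stand-in for the hypervisor: the call succeeds once enough pages exist. */
struct sketch_hv {
	uint64_t deposited_pages;
	uint64_t pages_needed;
	unsigned int calls;
};

static enum sketch_status sketch_hypercall(struct sketch_hv *hv)
{
	hv->calls++;
	if (hv->deposited_pages < hv->pages_needed)
		return SKETCH_INSUFFICIENT_MEMORY;
	return SKETCH_SUCCESS;
}

/* Stand-in for a page deposit; returns 0 or a negative errno. */
static int sketch_deposit(struct sketch_hv *hv, uint64_t pages,
			  uint64_t *budget)
{
	if (*budget < pages)
		return -ENOMEM;
	*budget -= pages;
	hv->deposited_pages += pages;
	return 0;
}

/*
 * Retry the call, depositing one page each time the hypervisor reports
 * insufficient memory. Stops on success, on any other status, or when a
 * deposit fails.
 */
static int sketch_call_with_deposit(struct sketch_hv *hv, uint64_t *budget)
{
	int ret;

	assert(hv && budget);

	do {
		enum sketch_status status = sketch_hypercall(hv);

		if (status != SKETCH_INSUFFICIENT_MEMORY)
			return status == SKETCH_SUCCESS ? 0 : -EINVAL;
		ret = sketch_deposit(hv, 1, budget);
	} while (!ret);

	return ret;
}
```

The loop has the same shape as the do/while in hv_call_create_partition():
any status other than "insufficient memory" ends it, and a failed deposit
is returned to the caller.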

Co-developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com>
---
 arch/x86/hyperv/hv_proc.c         |   2 +
 drivers/hv/Makefile               |   3 +-
 drivers/hv/hv_call.c              | 129 ++++++++++++++++++++++++++++++
 drivers/hv/mshv.h                 |  28 +++++++
 drivers/hv/mshv_main.c            |  34 +++++++-
 include/asm-generic/hyperv-tlfs.h |  49 ++++++++++++
 6 files changed, 243 insertions(+), 2 deletions(-)
 create mode 100644 drivers/hv/hv_call.c
 create mode 100644 drivers/hv/mshv.h

diff --git a/arch/x86/hyperv/hv_proc.c b/arch/x86/hyperv/hv_proc.c
index e75c78a243e7..ae52dd5fab6d 100644
--- a/arch/x86/hyperv/hv_proc.c
+++ b/arch/x86/hyperv/hv_proc.c
@@ -146,6 +146,7 @@ int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
 	kfree(counts);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(hv_call_deposit_pages);
 
 int hv_call_add_logical_proc(int node, u32 lp_index, u32 apic_id)
 {
@@ -243,4 +244,5 @@ int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(hv_call_create_vp);
 
diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
index 84e7fe2022a0..d20761b5df80 100644
--- a/drivers/hv/Makefile
+++ b/drivers/hv/Makefile
@@ -13,4 +13,5 @@ hv_vmbus-y := vmbus_drv.o \
 hv_vmbus-$(CONFIG_HYPERV_TESTING)	+= hv_debugfs.o
 hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_fcopy.o hv_utils_transport.o
 
-mshv-y				+= mshv_main.o
+mshv-y				+= mshv_main.o hv_call.o
+
diff --git a/drivers/hv/hv_call.c b/drivers/hv/hv_call.c
new file mode 100644
index 000000000000..a96809792d63
--- /dev/null
+++ b/drivers/hv/hv_call.c
@@ -0,0 +1,129 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2021, Microsoft Corporation.
+ *
+ * Authors:
+ *   Nuno Das Neves <nudasnev@microsoft.com>
+ *   Lillian Grassin-Drake <ligrassi@microsoft.com>
+ *   Vineeth Pillai <viremana@linux.microsoft.com>
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <asm/mshyperv.h>
+
+#include "mshv.h"
+
+int hv_call_create_partition(
+		u64 flags,
+		struct hv_partition_creation_properties creation_properties,
+		u64 *partition_id)
+{
+	struct hv_create_partition_in *input;
+	struct hv_create_partition_out *output;
+	u64 status;
+	int ret;
+	unsigned long irq_flags;
+	int i;
+
+	do {
+		local_irq_save(irq_flags);
+		input = (struct hv_create_partition_in *)(*this_cpu_ptr(
+			hyperv_pcpu_input_arg));
+		output = (struct hv_create_partition_out *)(*this_cpu_ptr(
+			hyperv_pcpu_output_arg));
+
+		input->flags = flags;
+		input->proximity_domain_info.as_uint64 = 0;
+		input->compatibility_version = HV_COMPATIBILITY_20_H1;
+		for (i = 0; i < HV_PARTITION_PROCESSOR_FEATURE_BANKS; ++i)
+			input->partition_creation_properties
+				.disabled_processor_features.as_uint64[i] = 0;
+		input->partition_creation_properties
+			.disabled_processor_xsave_features.as_uint64 = 0;
+		input->isolation_properties.as_uint64 = 0;
+
+		status = hv_do_hypercall(HVCALL_CREATE_PARTITION,
+					 input, output);
+
+		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
+			if (hv_result_success(status))
+				*partition_id = output->partition_id;
+			else
+				pr_err("%s: %s\n",
+				       __func__, hv_status_to_string(status));
+			local_irq_restore(irq_flags);
+			ret = hv_status_to_errno(status);
+			break;
+		}
+		local_irq_restore(irq_flags);
+		ret = hv_call_deposit_pages(NUMA_NO_NODE,
+					    hv_current_partition_id, 1);
+	} while (!ret);
+
+	return ret;
+}
+
+int hv_call_initialize_partition(u64 partition_id)
+{
+	struct hv_initialize_partition input;
+	u64 status;
+	int ret;
+
+	input.partition_id = partition_id;
+
+	ret = hv_call_deposit_pages(
+				NUMA_NO_NODE,
+				partition_id,
+				HV_INIT_PARTITION_DEPOSIT_PAGES);
+	if (ret)
+		return ret;
+
+	do {
+		status = hv_do_fast_hypercall8(
+				HVCALL_INITIALIZE_PARTITION,
+				*(u64 *)&input);
+
+		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
+			if (!hv_result_success(status))
+				pr_err("%s: %s\n",
+				       __func__, hv_status_to_string(status));
+			ret = hv_status_to_errno(status);
+			break;
+		}
+		ret = hv_call_deposit_pages(NUMA_NO_NODE, partition_id, 1);
+	} while (!ret);
+
+	return ret;
+}
+
+int hv_call_finalize_partition(u64 partition_id)
+{
+	struct hv_finalize_partition input;
+	u64 status;
+
+	input.partition_id = partition_id;
+	status = hv_do_fast_hypercall8(
+			HVCALL_FINALIZE_PARTITION,
+			*(u64 *)&input);
+
+	if (!hv_result_success(status))
+		pr_err("%s: %s\n", __func__, hv_status_to_string(status));
+
+	return hv_status_to_errno(status);
+}
+
+int hv_call_delete_partition(u64 partition_id)
+{
+	struct hv_delete_partition input;
+	u64 status;
+
+	input.partition_id = partition_id;
+	status = hv_do_fast_hypercall8(HVCALL_DELETE_PARTITION, *(u64 *)&input);
+
+	if (!hv_result_success(status))
+		pr_err("%s: %s\n", __func__, hv_status_to_string(status));
+
+	return hv_status_to_errno(status);
+}
+
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
new file mode 100644
index 000000000000..46121bd30592
--- /dev/null
+++ b/drivers/hv/mshv.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) 2021, Microsoft Corporation.
+ *
+ * Authors:
+ *   Nuno Das Neves <nudasnev@microsoft.com>
+ *   Lillian Grassin-Drake <ligrassi@microsoft.com>
+ *   Vineeth Pillai <viremana@linux.microsoft.com>
+ */
+
+#ifndef _MSHV_H_
+#define _MSHV_H_
+
+/* Determined empirically */
+#define HV_INIT_PARTITION_DEPOSIT_PAGES 208
+
+/*
+ * Hyper-V hypercalls
+ */
+int hv_call_create_partition(
+		u64 flags,
+		struct hv_partition_creation_properties creation_properties,
+		u64 *partition_id);
+int hv_call_initialize_partition(u64 partition_id);
+int hv_call_finalize_partition(u64 partition_id);
+int hv_call_delete_partition(u64 partition_id);
+
+#endif /* _MSHV_H_ */
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index 378bced4044c..a76d1c8c21b1 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -15,6 +15,9 @@
 #include <linux/file.h>
 #include <linux/anon_inodes.h>
 #include <linux/mshv.h>
+#include <asm/mshyperv.h>
+
+#include "mshv.h"
 
 MODULE_AUTHOR("Microsoft");
 MODULE_LICENSE("GPL");
@@ -49,6 +52,7 @@ static struct miscdevice mshv_dev = {
 	.mode = 0600,
 };
 
+
 static long
 mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
@@ -78,6 +82,17 @@ destroy_partition(struct mshv_partition *partition)
 
 	spin_unlock_irqrestore(&mshv.partitions.lock, flags);
 
+	/*
+	 * There are no remaining references to the partition,
+	 * so the remaining cleanup can be lockless
+	 */
+
+	/* Deallocates and unmaps everything including vcpus, GPA mappings etc */
+	hv_call_finalize_partition(partition->id);
+	/* TODO: Withdraw and free all pages we deposited */
+
+	hv_call_delete_partition(partition->id);
+
 	kfree(partition);
 }
 
@@ -138,6 +153,9 @@ mshv_ioctl_create_partition(void __user *user_arg)
 	if (copy_from_user(&args, user_arg, sizeof(args)))
 		return -EFAULT;
 
+	/* Only support EXO partitions */
+	args.flags |= HV_PARTITION_CREATION_FLAG_EXO_PARTITION;
+
 	partition = kzalloc(sizeof(*partition), GFP_KERNEL);
 	if (!partition)
 		return -ENOMEM;
@@ -148,11 +166,21 @@ mshv_ioctl_create_partition(void __user *user_arg)
 		goto free_partition;
 	}
 
+	ret = hv_call_create_partition(args.flags,
+				       args.partition_creation_properties,
+				       &partition->id);
+	if (ret)
+		goto put_fd;
+
+	ret = hv_call_initialize_partition(partition->id);
+	if (ret)
+		goto delete_partition;
+
 	file = anon_inode_getfile("mshv_partition", &mshv_partition_fops,
 				  partition, O_RDWR);
 	if (IS_ERR(file)) {
 		ret = PTR_ERR(file);
-		goto put_fd;
+		goto finalize_partition;
 	}
 	refcount_set(&partition->ref_count, 1);
 
@@ -166,6 +194,10 @@ mshv_ioctl_create_partition(void __user *user_arg)
 
 release_file:
 	file->f_op->release(file->f_inode, file);
+finalize_partition:
+	hv_call_finalize_partition(partition->id);
+delete_partition:
+	hv_call_delete_partition(partition->id);
 put_fd:
 	put_unused_fd(fd);
 free_partition:
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 50dc6eafb6a6..49099b7e0f71 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -143,6 +143,10 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX	0x0013
 #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX	0x0014
 #define HVCALL_SEND_IPI_EX			0x0015
+#define HVCALL_CREATE_PARTITION			0x0040
+#define HVCALL_INITIALIZE_PARTITION		0x0041
+#define HVCALL_FINALIZE_PARTITION		0x0042
+#define HVCALL_DELETE_PARTITION			0x0043
 #define HVCALL_GET_PARTITION_ID			0x0046
 #define HVCALL_DEPOSIT_MEMORY			0x0048
 #define HVCALL_CREATE_VP			0x004e
@@ -826,4 +830,49 @@ struct hv_memory_hint {
 	union hv_gpa_page_range ranges[];
 } __packed;
 
+union hv_partition_isolation_properties {
+	u64 as_uint64;
+	struct {
+		u64 isolation_type: 5;
+		u64 rsvd_z: 7;
+		u64 shared_gpa_boundary_page_number: 52;
+	};
+} __packed;
+
+/* Non-userspace-visible partition creation flags */
+#define HV_PARTITION_CREATION_FLAG_EXO_PARTITION                    BIT(8)
+
+#define HV_MAKE_COMPATIBILITY_VERSION(major_, minor_)	\
+	((u32)((major_) << 8 | (minor_)))
+
+#define HV_COMPATIBILITY_19_H1		HV_MAKE_COMPATIBILITY_VERSION(0X6, 0X5)
+#define HV_COMPATIBILITY_20_H1		HV_MAKE_COMPATIBILITY_VERSION(0X6, 0X7)
+#define HV_COMPATIBILITY_PRERELEASE	HV_MAKE_COMPATIBILITY_VERSION(0XFE, 0X0)
+#define HV_COMPATIBILITY_EXPERIMENT	HV_MAKE_COMPATIBILITY_VERSION(0XFF, 0X0)
+
+struct hv_create_partition_in {
+	u64 flags;
+	union hv_proximity_domain_info proximity_domain_info;
+	u32 compatibility_version;
+	u32 padding;
+	struct hv_partition_creation_properties partition_creation_properties;
+	union hv_partition_isolation_properties isolation_properties;
+} __packed;
+
+struct hv_create_partition_out {
+	u64 partition_id;
+} __packed;
+
+struct hv_initialize_partition {
+	u64 partition_id;
+} __packed;
+
+struct hv_finalize_partition {
+	u64 partition_id;
+} __packed;
+
+struct hv_delete_partition {
+	u64 partition_id;
+} __packed;
+
 #endif
-- 
2.23.4



* [PATCH v3 07/19] drivers/hv: withdraw memory hypercall
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (5 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 06/19] drivers/hv: create, initialize, finalize, delete partition hypercalls Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 08/19] drivers/hv: map and unmap guest memory Nuno Das Neves
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Withdraw the memory from a finalized partition and free the pages.
The partition is now cleaned up correctly when the fd is released.
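
As a rough illustration of the batching, the sketch below models the
withdraw loop in userspace. The sketch_* names and the page pool are
hypothetical; the only things taken from the patch are the batch size
(one 4096-byte page of u64 entries, i.e. 512) and the rule that running
out of pages to withdraw ends the loop without being an error.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u
/* Mirrors HV_WITHDRAW_BATCH_SIZE: one output page of u64 entries. */
#define SKETCH_BATCH_SIZE	((uint64_t)(SKETCH_PAGE_SIZE / sizeof(uint64_t)))

/* Hypothetical stand-in for the pages deposited into a partition. */
struct sketch_pool {
	uint64_t pages_left;
};

/*
 * Stand-in for one rep hypercall: hands back up to 'want' pages and
 * returns how many it completed. *exhausted is set when the pool ran
 * dry, the role HV_STATUS_NO_RESOURCES plays in the patch.
 */
static uint64_t sketch_withdraw_batch(struct sketch_pool *pool,
				      uint64_t want, int *exhausted)
{
	uint64_t done = want < pool->pages_left ? want : pool->pages_left;

	pool->pages_left -= done;
	*exhausted = done < want;
	return done;
}

/* Withdraw up to 'count' pages in batches; returns the number reclaimed. */
static uint64_t sketch_withdraw_all(struct sketch_pool *pool, uint64_t count)
{
	uint64_t remaining = count, reclaimed = 0;

	assert(pool);

	while (remaining) {
		uint64_t want = remaining < SKETCH_BATCH_SIZE ?
				remaining : SKETCH_BATCH_SIZE;
		int exhausted;
		uint64_t done = sketch_withdraw_batch(pool, want, &exhausted);

		reclaimed += done;	/* the patch frees each page here */
		if (exhausted)
			break;		/* nothing left: not an error */
		remaining -= done;
	}

	return reclaimed;
}
```

Since destroy_partition() passes U64_MAX as the count, the "pool
exhausted" exit is the one the loop is expected to take in practice.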

Co-developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
 drivers/hv/hv_call.c              | 53 +++++++++++++++++++++++++++++++
 drivers/hv/mshv.h                 |  6 ++++
 drivers/hv/mshv_main.c            |  5 +--
 include/asm-generic/hyperv-tlfs.h | 11 +++++++
 4 files changed, 73 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/hv_call.c b/drivers/hv/hv_call.c
index a96809792d63..a22b1cfb3563 100644
--- a/drivers/hv/hv_call.c
+++ b/drivers/hv/hv_call.c
@@ -14,6 +14,59 @@
 
 #include "mshv.h"
 
+int hv_call_withdraw_memory(u64 count, int node, u64 partition_id)
+{
+	struct hv_withdraw_memory_in *input_page;
+	struct hv_withdraw_memory_out *output_page;
+	struct page *page;
+	u16 completed;
+	unsigned long remaining = count;
+	u64 status;
+	int i;
+	unsigned long flags;
+
+	page = alloc_page(GFP_KERNEL);
+	if (!page)
+		return -ENOMEM;
+	output_page = page_address(page);
+
+	while (remaining) {
+		local_irq_save(flags);
+
+		input_page = (struct hv_withdraw_memory_in *)(*this_cpu_ptr(
+			hyperv_pcpu_input_arg));
+
+		input_page->partition_id = partition_id;
+		input_page->proximity_domain_info.as_uint64 = 0;
+		status = hv_do_rep_hypercall(
+			HVCALL_WITHDRAW_MEMORY,
+			min(remaining, HV_WITHDRAW_BATCH_SIZE), 0, input_page,
+			output_page);
+
+		local_irq_restore(flags);
+
+		completed = hv_repcomp(status);
+
+		for (i = 0; i < completed; i++)
+			__free_page(pfn_to_page(output_page->gpa_page_list[i]));
+
+		if (!hv_result_success(status)) {
+			if (hv_result(status) == HV_STATUS_NO_RESOURCES)
+				status = HV_STATUS_SUCCESS;
+			else
+				pr_err("%s: %s\n", __func__,
+				       hv_status_to_string(status));
+			break;
+		}
+
+		remaining -= completed;
+	}
+	free_page((unsigned long)output_page);
+
+	return hv_status_to_errno(status);
+}
+
+
 int hv_call_create_partition(
 		u64 flags,
 		struct hv_partition_creation_properties creation_properties,
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
index 46121bd30592..cf48ec5840b7 100644
--- a/drivers/hv/mshv.h
+++ b/drivers/hv/mshv.h
@@ -11,12 +11,18 @@
 #ifndef _MSHV_H_
 #define _MSHV_H_
 
+#include <asm/hyperv-tlfs.h>
+
 /* Determined empirically */
 #define HV_INIT_PARTITION_DEPOSIT_PAGES 208
 
+#define HV_WITHDRAW_BATCH_SIZE	(HV_HYP_PAGE_SIZE / sizeof(u64))
+
 /*
  * Hyper-V hypercalls
  */
+
+int hv_call_withdraw_memory(u64 count, int node, u64 partition_id);
 int hv_call_create_partition(
 		u64 flags,
 		struct hv_partition_creation_properties creation_properties,
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index a76d1c8c21b1..f49666502ba7 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -14,6 +14,7 @@
 #include <linux/slab.h>
 #include <linux/file.h>
 #include <linux/anon_inodes.h>
+#include <linux/mm.h>
 #include <linux/mshv.h>
 #include <asm/mshyperv.h>
 
@@ -52,7 +53,6 @@ static struct miscdevice mshv_dev = {
 	.mode = 0600,
 };
 
-
 static long
 mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
@@ -89,7 +89,8 @@ destroy_partition(struct mshv_partition *partition)
 
 	/* Deallocates and unmaps everything including vcpus, GPA mappings etc */
 	hv_call_finalize_partition(partition->id);
-	/* TODO: Withdraw and free all pages we deposited */
+	/* Withdraw and free all pages we deposited */
+	hv_call_withdraw_memory(U64_MAX, NUMA_NO_NODE, partition->id);
 
 	hv_call_delete_partition(partition->id);
 
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 49099b7e0f71..2e1573978569 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -149,6 +149,7 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_DELETE_PARTITION			0x0043
 #define HVCALL_GET_PARTITION_ID			0x0046
 #define HVCALL_DEPOSIT_MEMORY			0x0048
+#define HVCALL_WITHDRAW_MEMORY			0x0049
 #define HVCALL_CREATE_VP			0x004e
 #define HVCALL_GET_VP_REGISTERS			0x0050
 #define HVCALL_SET_VP_REGISTERS			0x0051
@@ -515,6 +516,16 @@ union hv_proximity_domain_info {
 	u64 as_uint64;
 } __packed;
 
+struct hv_withdraw_memory_in {
+	u64 partition_id;
+	union hv_proximity_domain_info proximity_domain_info;
+} __packed;
+
+struct hv_withdraw_memory_out {
+	/* Hack - compiler doesn't like empty array size in struct with no other members */
+	u64 gpa_page_list[0];
+} __packed;
+
 struct hv_lp_startup_status {
 	u64 hv_status;
 	u64 substatus1;
-- 
2.23.4



* [PATCH v3 08/19] drivers/hv: map and unmap guest memory
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (6 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 07/19] drivers/hv: withdraw memory hypercall Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 21:27     ` Olaf Hering
  2021-09-28 18:31 ` [PATCH v3 09/19] drivers/hv: create vcpu ioctl Nuno Das Neves
                   ` (10 subsequent siblings)
  18 siblings, 1 reply; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Introduce ioctls for mapping and unmapping regions of guest memory.

The implementation uses a table of memory 'slots' similar to KVM's, but
the slot number is not visible to userspace.

For now, this simple implementation requires each new mapping to be
disjoint - the underlying hypercalls have no such restriction, and
implicitly overwrite any mappings on the pages in the specified regions.
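
To make the disjointness rule concrete, here is a small userspace sketch
of the check. The struct and its field names are hypothetical (they are
not the uapi struct from this patch), a slot with size 0 is treated as
empty, and address overflow is ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12

/* Hypothetical region descriptor; field names are illustrative only. */
struct sketch_region {
	uint64_t guest_pfn;		/* first guest page frame number */
	uint64_t userspace_addr;	/* start of the backing user memory */
	uint64_t size;			/* length in bytes, page multiple */
};

/* True when the half-open ranges [a, a+alen) and [b, b+blen) intersect. */
static bool sketch_ranges_overlap(uint64_t a, uint64_t alen,
				  uint64_t b, uint64_t blen)
{
	return a < b + blen && b < a + alen;
}

/* Regions clash if they overlap in guest space or in userspace. */
static bool sketch_regions_overlap(const struct sketch_region *x,
				   const struct sketch_region *y)
{
	uint64_t xpages = x->size >> SKETCH_PAGE_SHIFT;
	uint64_t ypages = y->size >> SKETCH_PAGE_SHIFT;

	return sketch_ranges_overlap(x->guest_pfn, xpages,
				     y->guest_pfn, ypages) ||
	       sketch_ranges_overlap(x->userspace_addr, x->size,
				     y->userspace_addr, y->size);
}

/* Returns true if 'candidate' is disjoint from every occupied slot. */
static bool sketch_can_map(const struct sketch_region *slots, size_t nslots,
			   const struct sketch_region *candidate)
{
	assert(slots || nslots == 0);

	for (size_t i = 0; i < nslots; i++) {
		if (slots[i].size &&
		    sketch_regions_overlap(&slots[i], candidate))
			return false;
	}
	return true;
}
```

A new mapping would only be accepted when sketch_can_map() returns true,
which is stricter than the hypercalls themselves require.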

Co-developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
 Documentation/virt/mshv/api.rst        |  17 +++
 drivers/hv/hv_call.c                   | 114 +++++++++++++++
 drivers/hv/mshv.h                      |  13 ++
 drivers/hv/mshv_main.c                 | 194 ++++++++++++++++++++++++-
 include/asm-generic/hyperv-tlfs.h      |  23 ++-
 include/linux/mshv.h                   |  14 ++
 include/uapi/asm-generic/hyperv-tlfs.h |   9 ++
 include/uapi/linux/mshv.h              |  15 ++
 8 files changed, 394 insertions(+), 5 deletions(-)

diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
index f92892b27ccc..71c93b73e999 100644
--- a/Documentation/virt/mshv/api.rst
+++ b/Documentation/virt/mshv/api.rst
@@ -70,3 +70,20 @@ will be supported when the core API is stable.
 This ioctl creates a guest partition, returning a file descriptor to use as a
 handle for partition ioctls.
 
+3.3 MSHV_MAP_GUEST_MEMORY and MSHV_UNMAP_GUEST_MEMORY
+-----------------------------------------------------
+:Type: partition ioctl
+:Parameters: struct mshv_user_mem_region
+:Returns: 0 on success
+
+Create a mapping from memory in the user space of the calling process (running
+in the root partition) to a region of guest physical memory in a guest partition.
+
+Mappings must be disjoint from each other in both userspace and guest physical
+address space.
+
+Note: In the current implementation, this memory is pinned to real physical
+memory. Pinning stops Linux in the root partition from moving the pages, which
+would otherwise leave them open to being clobbered by the hypervisor. The
+region is therefore always backed by real physical memory.
+
diff --git a/drivers/hv/hv_call.c b/drivers/hv/hv_call.c
index a22b1cfb3563..31d59de4a7f7 100644
--- a/drivers/hv/hv_call.c
+++ b/drivers/hv/hv_call.c
@@ -180,3 +180,117 @@ int hv_call_delete_partition(u64 partition_id)
 	return hv_status_to_errno(status);
 }
 
+int hv_call_map_gpa_pages(
+		u64 partition_id,
+		u64 gpa_target,
+		u64 page_count, u32 flags,
+		struct page **pages)
+{
+	struct hv_map_gpa_pages *input_page;
+	u64 status;
+	int i;
+	struct page **p;
+	u32 completed = 0;
+	unsigned long remaining = page_count;
+	int rep_count;
+	unsigned long irq_flags;
+	int ret = 0;
+
+	while (remaining) {
+
+		rep_count = min(remaining, HV_MAP_GPA_BATCH_SIZE);
+
+		local_irq_save(irq_flags);
+		input_page = (struct hv_map_gpa_pages *)(*this_cpu_ptr(
+			hyperv_pcpu_input_arg));
+
+		input_page->target_partition_id = partition_id;
+		input_page->target_gpa_base = gpa_target;
+		input_page->map_flags = flags;
+
+		for (i = 0, p = pages; i < rep_count; i++, p++)
+			input_page->source_gpa_page_list[i] = page_to_pfn(*p);
+		status = hv_do_rep_hypercall(
+			HVCALL_MAP_GPA_PAGES, rep_count, 0, input_page, NULL);
+		local_irq_restore(irq_flags);
+
+		completed = hv_repcomp(status);
+
+		if (hv_result(status) == HV_STATUS_INSUFFICIENT_MEMORY) {
+			ret = hv_call_deposit_pages(NUMA_NO_NODE,
+						    partition_id,
+						    HV_MAP_GPA_DEPOSIT_PAGES);
+			if (ret)
+				break;
+		} else if (!hv_result_success(status)) {
+			pr_err("%s: completed %llu out of %llu, %s\n",
+			       __func__,
+			       page_count - remaining, page_count,
+			       hv_status_to_string(status));
+			ret = hv_status_to_errno(status);
+			break;
+		}
+
+		pages += completed;
+		remaining -= completed;
+		gpa_target += completed;
+	}
+
+	if (ret && remaining < page_count) {
+		pr_err("%s: Partially succeeded; mapped regions may be in invalid state\n",
+		       __func__);
+		ret = -EBADFD;
+	}
+
+	return ret;
+}
+
+int hv_call_unmap_gpa_pages(
+		u64 partition_id,
+		u64 gpa_target,
+		u64 page_count, u32 flags)
+{
+	struct hv_unmap_gpa_pages *input_page;
+	u64 status;
+	int ret = 0;
+	u32 completed = 0;
+	unsigned long remaining = page_count;
+	int rep_count;
+	unsigned long irq_flags;
+
+	while (remaining) {
+		local_irq_save(irq_flags);
+		input_page = (struct hv_unmap_gpa_pages *)(*this_cpu_ptr(
+			hyperv_pcpu_input_arg));
+
+		input_page->target_partition_id = partition_id;
+		input_page->target_gpa_base = gpa_target;
+		input_page->unmap_flags = flags;
+		rep_count = min(remaining, HV_MAP_GPA_BATCH_SIZE);
+		status = hv_do_rep_hypercall(
+			HVCALL_UNMAP_GPA_PAGES, rep_count, 0, input_page, NULL);
+		local_irq_restore(irq_flags);
+
+		completed = hv_repcomp(status);
+		if (!hv_result_success(status)) {
+			pr_err("%s: completed %llu out of %llu, %s\n",
+			       __func__,
+			       page_count - remaining, page_count,
+			       hv_status_to_string(status));
+			ret = hv_status_to_errno(status);
+			break;
+		}
+
+		remaining -= completed;
+		gpa_target += completed;
+	}
+
+	if (ret && remaining < page_count) {
+		pr_err("%s: Partially succeeded; mapped regions may be in invalid state\n",
+		       __func__);
+		ret = -EBADFD;
+	}
+
+	return ret;
+}
+
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
index cf48ec5840b7..13d9df7c3e0d 100644
--- a/drivers/hv/mshv.h
+++ b/drivers/hv/mshv.h
@@ -15,8 +15,12 @@
 
 /* Determined empirically */
 #define HV_INIT_PARTITION_DEPOSIT_PAGES 208
+#define HV_MAP_GPA_DEPOSIT_PAGES	256
 
 #define HV_WITHDRAW_BATCH_SIZE	(HV_HYP_PAGE_SIZE / sizeof(u64))
+#define HV_MAP_GPA_BATCH_SIZE	\
+		((HV_HYP_PAGE_SIZE - sizeof(struct hv_map_gpa_pages)) / sizeof(u64))
+#define PIN_PAGES_BATCH_SIZE	(0x10000000 / HV_HYP_PAGE_SIZE)
 
 /*
  * Hyper-V hypercalls
@@ -30,5 +34,14 @@ int hv_call_create_partition(
 int hv_call_initialize_partition(u64 partition_id);
 int hv_call_finalize_partition(u64 partition_id);
 int hv_call_delete_partition(u64 partition_id);
+int hv_call_map_gpa_pages(
+		u64 partition_id,
+		u64 gpa_target,
+		u64 page_count, u32 flags,
+		struct page **pages);
+int hv_call_unmap_gpa_pages(
+		u64 partition_id,
+		u64 gpa_target,
+		u64 page_count, u32 flags);
 
 #endif /* _MSHV_H */
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index f49666502ba7..c5c18826a38f 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -53,16 +53,194 @@ static struct miscdevice mshv_dev = {
 	.mode = 0600,
 };
 
+static long
+mshv_partition_ioctl_map_memory(struct mshv_partition *partition,
+				struct mshv_user_mem_region __user *user_mem)
+{
+	struct mshv_user_mem_region mem;
+	struct mshv_mem_region *region;
+	int completed;
+	unsigned long remaining, batch_size;
+	int i;
+	struct page **pages;
+	u64 page_count, user_start, user_end, gpfn_start, gpfn_end;
+	u64 region_page_count, region_user_start, region_user_end;
+	u64 region_gpfn_start, region_gpfn_end;
+	long ret = 0;
+
+	/* Check we have enough slots */
+	if (partition->regions.count == MSHV_MAX_MEM_REGIONS) {
+		pr_err("%s: not enough memory region slots\n", __func__);
+		return -ENOSPC;
+	}
+
+	if (copy_from_user(&mem, user_mem, sizeof(mem)))
+		return -EFAULT;
+
+	if (!mem.size ||
+	    !PAGE_ALIGNED(mem.size) ||
+	    !PAGE_ALIGNED(mem.userspace_addr) ||
+	    !access_ok(mem.userspace_addr, mem.size))
+		return -EINVAL;
+
+	/* Reject overlapping regions */
+	page_count = mem.size >> HV_HYP_PAGE_SHIFT;
+	user_start = mem.userspace_addr;
+	user_end = mem.userspace_addr + mem.size;
+	gpfn_start = mem.guest_pfn;
+	gpfn_end = mem.guest_pfn + page_count;
+	for (i = 0; i < MSHV_MAX_MEM_REGIONS; ++i) {
+		region = &partition->regions.slots[i];
+		if (!region->size)
+			continue;
+		region_page_count = region->size >> HV_HYP_PAGE_SHIFT;
+		region_user_start = region->userspace_addr;
+		region_user_end = region->userspace_addr + region->size;
+		region_gpfn_start = region->guest_pfn;
+		region_gpfn_end = region->guest_pfn + region_page_count;
+
+		if (!(user_end <= region_user_start) &&
+		    !(region_user_end <= user_start)) {
+			return -EEXIST;
+		}
+		if (!(gpfn_end <= region_gpfn_start) &&
+		    !(region_gpfn_end <= gpfn_start)) {
+			return -EEXIST;
+		}
+	}
+
+	/* Pin the userspace pages */
+	pages = vzalloc(sizeof(struct page *) * page_count);
+	if (!pages)
+		return -ENOMEM;
+
+	remaining = page_count;
+	while (remaining) {
+		/*
+		 * We need to batch this, as pin_user_pages_fast with the
+		 * FOLL_LONGTERM flag does a big temporary allocation
+		 * of contiguous memory
+		 */
+		batch_size = min(remaining, PIN_PAGES_BATCH_SIZE);
+		completed = pin_user_pages_fast(
+				mem.userspace_addr + (page_count - remaining) * HV_HYP_PAGE_SIZE,
+				batch_size,
+				FOLL_WRITE | FOLL_LONGTERM,
+				&pages[page_count - remaining]);
+		if (completed < 0) {
+			pr_err("%s: failed to pin user pages error %i\n",
+			       __func__,
+			       completed);
+			ret = completed;
+			goto err_unpin_pages;
+		}
+		remaining -= completed;
+	}
+
+	/* Map the pages to GPA pages */
+	ret = hv_call_map_gpa_pages(partition->id, mem.guest_pfn,
+				    page_count, mem.flags, pages);
+	if (ret)
+		goto err_unpin_pages;
+
+	/* Install the new region */
+	for (i = 0; i < MSHV_MAX_MEM_REGIONS; ++i) {
+		if (!partition->regions.slots[i].size) {
+			region = &partition->regions.slots[i];
+			break;
+		}
+	}
+	region->pages = pages;
+	region->size = mem.size;
+	region->guest_pfn = mem.guest_pfn;
+	region->userspace_addr = mem.userspace_addr;
+
+	partition->regions.count++;
+
+	return 0;
+
+err_unpin_pages:
+	unpin_user_pages(pages, page_count - remaining);
+	vfree(pages);
+
+	return ret;
+}
+
+static long
+mshv_partition_ioctl_unmap_memory(struct mshv_partition *partition,
+				  struct mshv_user_mem_region __user *user_mem)
+{
+	struct mshv_user_mem_region mem;
+	struct mshv_mem_region *region_ptr;
+	int i;
+	u64 page_count;
+	long ret;
+
+	if (!partition->regions.count)
+		return -EINVAL;
+
+	if (copy_from_user(&mem, user_mem, sizeof(mem)))
+		return -EFAULT;
+
+	/* Find matching region */
+	for (i = 0; i < MSHV_MAX_MEM_REGIONS; ++i) {
+		if (!partition->regions.slots[i].size)
+			continue;
+		region_ptr = &partition->regions.slots[i];
+		if (region_ptr->userspace_addr == mem.userspace_addr &&
+		    region_ptr->size == mem.size &&
+		    region_ptr->guest_pfn == mem.guest_pfn)
+			break;
+	}
+
+	if (i == MSHV_MAX_MEM_REGIONS)
+		return -EINVAL;
+
+	page_count = region_ptr->size >> HV_HYP_PAGE_SHIFT;
+	ret = hv_call_unmap_gpa_pages(partition->id, region_ptr->guest_pfn,
+				      page_count, 0);
+	if (ret)
+		return ret;
+
+	unpin_user_pages(region_ptr->pages, page_count);
+	vfree(region_ptr->pages);
+	memset(region_ptr, 0, sizeof(*region_ptr));
+	partition->regions.count--;
+
+	return 0;
+}
+
 static long
 mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
-	return -ENOTTY;
+	struct mshv_partition *partition = filp->private_data;
+	long ret;
+
+	if (mutex_lock_killable(&partition->mutex))
+		return -EINTR;
+
+	switch (ioctl) {
+	case MSHV_MAP_GUEST_MEMORY:
+		ret = mshv_partition_ioctl_map_memory(partition,
+							(void __user *)arg);
+		break;
+	case MSHV_UNMAP_GUEST_MEMORY:
+		ret = mshv_partition_ioctl_unmap_memory(partition,
+							(void __user *)arg);
+		break;
+	default:
+		ret = -ENOTTY;
+	}
+
+	mutex_unlock(&partition->mutex);
+	return ret;
 }
 
 static void
 destroy_partition(struct mshv_partition *partition)
 {
-	unsigned long flags;
+	unsigned long flags, page_count;
+	struct mshv_mem_region *region;
 	int i;
 
 	/* Remove from list of partitions */
@@ -94,6 +272,16 @@ destroy_partition(struct mshv_partition *partition)
 
 	hv_call_delete_partition(partition->id);
 
+	/* Remove regions and unpin the pages */
+	for (i = 0; i < MSHV_MAX_MEM_REGIONS; ++i) {
+		region = &partition->regions.slots[i];
+		if (!region->size)
+			continue;
+		page_count = region->size >> HV_HYP_PAGE_SHIFT;
+		unpin_user_pages(region->pages, page_count);
+		vfree(region->pages);
+	}
+
 	kfree(partition);
 }
 
@@ -161,6 +349,8 @@ mshv_ioctl_create_partition(void __user *user_arg)
 	if (!partition)
 		return -ENOMEM;
 
+	mutex_init(&partition->mutex);
+
 	fd = get_unused_fd_flags(O_CLOEXEC);
 	if (fd < 0) {
 		ret = fd;
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 2e1573978569..8684e7f9ec5b 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -20,9 +20,9 @@
  * guest physical pages and guest physical page addresses, since the guest page
  * size may not be 4096 on all architectures.
  */
-#define HV_HYP_PAGE_SHIFT      12
-#define HV_HYP_PAGE_SIZE       BIT(HV_HYP_PAGE_SHIFT)
-#define HV_HYP_PAGE_MASK       (~(HV_HYP_PAGE_SIZE - 1))
+#define HV_HYP_PAGE_SHIFT		12
+#define HV_HYP_PAGE_SIZE		BIT(HV_HYP_PAGE_SHIFT)
+#define HV_HYP_PAGE_MASK		(~(HV_HYP_PAGE_SIZE - 1))
 
 /*
  * Hyper-V provides two categories of flags relevant to guest VMs.  The
@@ -150,6 +150,8 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_GET_PARTITION_ID			0x0046
 #define HVCALL_DEPOSIT_MEMORY			0x0048
 #define HVCALL_WITHDRAW_MEMORY			0x0049
+#define HVCALL_MAP_GPA_PAGES			0x004b
+#define HVCALL_UNMAP_GPA_PAGES			0x004c
 #define HVCALL_CREATE_VP			0x004e
 #define HVCALL_GET_VP_REGISTERS			0x0050
 #define HVCALL_SET_VP_REGISTERS			0x0051
@@ -886,4 +888,19 @@ struct hv_delete_partition {
 	u64 partition_id;
 } __packed;
 
+struct hv_map_gpa_pages {
+	u64 target_partition_id;
+	u64 target_gpa_base;
+	u32 map_flags;
+	u32 padding;
+	u64 source_gpa_page_list[];
+} __packed;
+
+struct hv_unmap_gpa_pages {
+	u64 target_partition_id;
+	u64 target_gpa_base;
+	u32 unmap_flags;
+	u32 padding;
+} __packed;
+
 #endif
diff --git a/include/linux/mshv.h b/include/linux/mshv.h
index fc4f35089b2c..91a742f37440 100644
--- a/include/linux/mshv.h
+++ b/include/linux/mshv.h
@@ -7,13 +7,27 @@
  */
 
 #include <linux/spinlock.h>
+#include <linux/mutex.h>
 #include <uapi/linux/mshv.h>
 
 #define MSHV_MAX_PARTITIONS		128
+#define MSHV_MAX_MEM_REGIONS		64
+
+struct mshv_mem_region {
+	u64 size; /* bytes */
+	u64 guest_pfn;
+	u64 userspace_addr; /* start of the userspace allocated memory */
+	struct page **pages;
+};
 
 struct mshv_partition {
 	u64 id;
 	refcount_t ref_count;
+	struct mutex mutex;
+	struct {
+		u32 count;
+		struct mshv_mem_region slots[MSHV_MAX_MEM_REGIONS];
+	} regions;
 };
 
 struct mshv {
diff --git a/include/uapi/asm-generic/hyperv-tlfs.h b/include/uapi/asm-generic/hyperv-tlfs.h
index 7a858226a9c5..e7b09b9f00de 100644
--- a/include/uapi/asm-generic/hyperv-tlfs.h
+++ b/include/uapi/asm-generic/hyperv-tlfs.h
@@ -12,4 +12,13 @@
 #define HV_PARTITION_CREATION_FLAG_GPA_SUPER_PAGES_ENABLED          BIT(4)
 #define HV_PARTITION_CREATION_FLAG_LAPIC_ENABLED                    BIT(13)
 
+/* HV Map GPA (Guest Physical Address) Flags */
+#define HV_MAP_GPA_PERMISSIONS_NONE     0x0
+#define HV_MAP_GPA_READABLE             0x1
+#define HV_MAP_GPA_WRITABLE             0x2
+#define HV_MAP_GPA_KERNEL_EXECUTABLE    0x4
+#define HV_MAP_GPA_USER_EXECUTABLE      0x8
+#define HV_MAP_GPA_EXECUTABLE           0xC
+#define HV_MAP_GPA_PERMISSIONS_MASK     0xF
+
 #endif
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index 03b1ed66245d..7ead5f1c8b14 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -19,10 +19,25 @@ struct mshv_create_partition {
 	struct hv_partition_creation_properties partition_creation_properties;
 };
 
+/*
+ * Mappings can't overlap in GPA space or userspace.
+ * To unmap, these fields must match an existing mapping.
+ */
+struct mshv_user_mem_region {
+	__u64 size;		/* bytes */
+	__u64 guest_pfn;
+	__u64 userspace_addr;	/* start of the userspace allocated memory */
+	__u32 flags;		/* ignored on unmap */
+};
+
 #define MSHV_IOCTL 0xB8
 
 /* mshv device */
 #define MSHV_CHECK_EXTENSION    _IOW(MSHV_IOCTL, 0x00, __u32)
 #define MSHV_CREATE_PARTITION	_IOW(MSHV_IOCTL, 0x01, struct mshv_create_partition)
 
+/* partition device */
+#define MSHV_MAP_GUEST_MEMORY	_IOW(MSHV_IOCTL, 0x02, struct mshv_user_mem_region)
+#define MSHV_UNMAP_GUEST_MEMORY	_IOW(MSHV_IOCTL, 0x03, struct mshv_user_mem_region)
+
 #endif
-- 
2.23.4
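
The overlap rejection in mshv_partition_ioctl_map_memory() treats every region
as a half-open interval [start, end), once in userspace addresses and once in
guest page frames. A minimal standalone sketch of that test follows. The helper
name ranges_overlap() is hypothetical and does not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when the half-open ranges [a_start, a_end)
 * and [b_start, b_end) share at least one element. This is the same
 * condition as !(a_end <= b_start) && !(b_end <= a_start) in the patch.
 */
static bool ranges_overlap(uint64_t a_start, uint64_t a_end,
			   uint64_t b_start, uint64_t b_end)
{
	return a_start < b_end && b_start < a_end;
}
```

Because the end is exclusive, two regions that merely touch are accepted:
ranges_overlap(0x1000, 0x2000, 0x2000, 0x3000) is false, while a region that
lies inside another is rejected.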

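hv_call_map_gpa_pages() issues the rep hypercall in batches sized to fit one
hypervisor input page, then advances by however many reps the hypervisor
reports as completed. The sketch below models only that control flow in plain
C. struct fake_hv and fake_rep_call() are hypothetical stand-ins for the
hypervisor, and the deposit-and-retry path for HV_STATUS_INSUFFICIENT_MEMORY
is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define HYP_PAGE_SIZE 4096u

/* Local mirror of the fixed part of struct hv_map_gpa_pages (24 bytes). */
struct map_gpa_header {
	uint64_t target_partition_id;
	uint64_t target_gpa_base;
	uint32_t map_flags;
	uint32_t padding;
};

/* Same arithmetic as HV_MAP_GPA_BATCH_SIZE: PFN slots left in one input page. */
#define MAP_GPA_BATCH_SIZE \
	((HYP_PAGE_SIZE - sizeof(struct map_gpa_header)) / sizeof(uint64_t))

/* Hypothetical stand-in for the hypervisor; not part of the patch. */
struct fake_hv {
	size_t max_per_call;	/* upper bound on reps completed per call */
	size_t calls;		/* number of calls issued so far */
	uint64_t next_gpfn;	/* guest page frame the next call must start at */
};

/* Completes at most max_per_call reps and reports how many were done. */
static size_t fake_rep_call(struct fake_hv *hv, uint64_t gpfn, size_t rep_count)
{
	hv->calls++;
	if (gpfn != hv->next_gpfn)
		return 0;	/* caller resumed at the wrong place */
	if (rep_count > hv->max_per_call)
		rep_count = hv->max_per_call;
	hv->next_gpfn += rep_count;
	return rep_count;
}

/* Returns the number of pages mapped; equals page_count on full success. */
static size_t map_in_batches(struct fake_hv *hv, uint64_t gpfn, size_t page_count)
{
	size_t remaining = page_count;

	while (remaining) {
		size_t rep_count = remaining < MAP_GPA_BATCH_SIZE ?
				   remaining : MAP_GPA_BATCH_SIZE;
		size_t completed = fake_rep_call(hv, gpfn, rep_count);

		if (!completed)
			break;	/* no progress: stop instead of spinning */
		gpfn += completed;
		remaining -= completed;
	}
	return page_count - remaining;
}
```

With a 4096-byte page and a 24-byte header, the batch size works out to 509
page frame numbers per call. Stopping when a call completes nothing is a
choice made for this sketch, so that a stalled call cannot loop forever.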

* [PATCH v3 09/19] drivers/hv: create vcpu ioctl
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Introduce ioctl for creating a virtual processor in a partition.

Co-developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
 Documentation/virt/mshv/api.rst |   9 +++
 drivers/hv/mshv_main.c          | 119 +++++++++++++++++++++++++++++++-
 include/linux/mshv.h            |  10 +++
 include/uapi/linux/mshv.h       |   5 ++
 4 files changed, 142 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
index 71c93b73e999..2538756bc86b 100644
--- a/Documentation/virt/mshv/api.rst
+++ b/Documentation/virt/mshv/api.rst
@@ -87,3 +87,12 @@ memory to stop the pages being moved by Linux in the root partition,
 and subsequently being clobbered by the hypervisor. So the region is backed
 by real physical memory.
 
+3.4 MSHV_CREATE_VP
+------------------
+:Type: partition ioctl
+:Parameters: struct mshv_create_vp
+:Returns: vp file descriptor, or -1 on failure
+
+Create a virtual processor in a guest partition, returning a file descriptor to
+represent the vp and perform ioctls on.
+
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index c5c18826a38f..c3ac8c371d0f 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -25,6 +25,9 @@ MODULE_LICENSE("GPL");
 
 struct mshv mshv = {};
 
+static int mshv_vp_release(struct inode *inode, struct file *filp);
+static long mshv_vp_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg);
+static struct mshv_partition *mshv_partition_get(struct mshv_partition *partition);
 static void mshv_partition_put(struct mshv_partition *partition);
 static int mshv_partition_release(struct inode *inode, struct file *filp);
 static long mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg);
@@ -32,6 +35,12 @@ static int mshv_dev_open(struct inode *inode, struct file *filp);
 static int mshv_dev_release(struct inode *inode, struct file *filp);
 static long mshv_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg);
 
+static const struct file_operations mshv_vp_fops = {
+	.release = mshv_vp_release,
+	.unlocked_ioctl = mshv_vp_ioctl,
+	.llseek = noop_llseek,
+};
+
 static const struct file_operations mshv_partition_fops = {
 	.release = mshv_partition_release,
 	.unlocked_ioctl = mshv_partition_ioctl,
@@ -53,6 +62,94 @@ static struct miscdevice mshv_dev = {
 	.mode = 0600,
 };
 
+static long
+mshv_vp_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
+{
+	return -ENOTTY;
+}
+
+static int
+mshv_vp_release(struct inode *inode, struct file *filp)
+{
+	struct mshv_vp *vp = filp->private_data;
+
+	/* Rest of VP cleanup happens in destroy_partition() */
+	mshv_partition_put(vp->partition);
+	return 0;
+}
+
+static long
+mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
+			       void __user *arg)
+{
+	struct mshv_create_vp args;
+	struct mshv_vp *vp;
+	struct file *file;
+	int fd;
+	long ret;
+
+	if (copy_from_user(&args, arg, sizeof(args)))
+		return -EFAULT;
+
+	if (args.vp_index >= MSHV_MAX_VPS)
+		return -EINVAL;
+
+	if (partition->vps.array[args.vp_index])
+		return -EEXIST;
+
+	vp = kzalloc(sizeof(*vp), GFP_KERNEL);
+
+	if (!vp)
+		return -ENOMEM;
+
+	vp->index = args.vp_index;
+	vp->partition = mshv_partition_get(partition);
+	if (!vp->partition) {
+		ret = -EBADF;
+		goto free_vp;
+	}
+
+	fd = get_unused_fd_flags(O_CLOEXEC);
+	if (fd < 0) {
+		ret = fd;
+		goto put_partition;
+	}
+
+	file = anon_inode_getfile("mshv_vp", &mshv_vp_fops, vp, O_RDWR);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		goto put_fd;
+	}
+
+	ret = hv_call_create_vp(
+			NUMA_NO_NODE,
+			partition->id,
+			args.vp_index,
+			0 /* Only valid for root partition VPs */
+			);
+	if (ret)
+		goto release_file;
+
+	/* already exclusive with the partition mutex for all ioctls */
+	partition->vps.count++;
+	partition->vps.array[args.vp_index] = vp;
+
+	fd_install(fd, file);
+
+	return fd;
+
+release_file:
+	file->f_op->release(file->f_inode, file);
+put_fd:
+	put_unused_fd(fd);
+put_partition:
+	mshv_partition_put(partition);
+free_vp:
+	kfree(vp);
+
+	return ret;
+}
+
 static long
 mshv_partition_ioctl_map_memory(struct mshv_partition *partition,
 				struct mshv_user_mem_region __user *user_mem)
@@ -228,6 +325,10 @@ mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 		ret = mshv_partition_ioctl_unmap_memory(partition,
 							(void __user *)arg);
 		break;
+	case MSHV_CREATE_VP:
+		ret = mshv_partition_ioctl_create_vp(partition,
+							(void __user *)arg);
+		break;
 	default:
 		ret = -ENOTTY;
 	}
@@ -240,6 +341,7 @@ static void
 destroy_partition(struct mshv_partition *partition)
 {
 	unsigned long flags, page_count;
+	struct mshv_vp *vp;
 	struct mshv_mem_region *region;
 	int i;
 
@@ -269,9 +371,16 @@ destroy_partition(struct mshv_partition *partition)
 	hv_call_finalize_partition(partition->id);
 	/* Withdraw and free all pages we deposited */
 	hv_call_withdraw_memory(U64_MAX, NUMA_NO_NODE, partition->id);
-
 	hv_call_delete_partition(partition->id);
 
+	/* Remove vps */
+	for (i = 0; i < MSHV_MAX_VPS; ++i) {
+		vp = partition->vps.array[i];
+		if (!vp)
+			continue;
+		kfree(vp);
+	}
+
 	/* Remove regions and unpin the pages */
 	for (i = 0; i < MSHV_MAX_MEM_REGIONS; ++i) {
 		region = &partition->regions.slots[i];
@@ -285,6 +394,14 @@ destroy_partition(struct mshv_partition *partition)
 	kfree(partition);
 }
 
+static struct
+mshv_partition *mshv_partition_get(struct mshv_partition *partition)
+{
+	if (refcount_inc_not_zero(&partition->ref_count))
+		return partition;
+	return NULL;
+}
+
 static void
 mshv_partition_put(struct mshv_partition *partition)
 {
diff --git a/include/linux/mshv.h b/include/linux/mshv.h
index 91a742f37440..50521c5f7948 100644
--- a/include/linux/mshv.h
+++ b/include/linux/mshv.h
@@ -12,6 +12,12 @@
 
 #define MSHV_MAX_PARTITIONS		128
 #define MSHV_MAX_MEM_REGIONS		64
+#define MSHV_MAX_VPS			256
+
+struct mshv_vp {
+	u32 index;
+	struct mshv_partition *partition;
+};
 
 struct mshv_mem_region {
 	u64 size; /* bytes */
@@ -28,6 +34,10 @@ struct mshv_partition {
 		u32 count;
 		struct mshv_mem_region slots[MSHV_MAX_MEM_REGIONS];
 	} regions;
+	struct {
+		u32 count;
+		struct mshv_vp *array[MSHV_MAX_VPS];
+	} vps;
 };
 
 struct mshv {
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index 7ead5f1c8b14..251976348441 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -30,6 +30,10 @@ struct mshv_user_mem_region {
 	__u32 flags;		/* ignored on unmap */
 };
 
+struct mshv_create_vp {
+	__u32 vp_index;
+};
+
 #define MSHV_IOCTL 0xB8
 
 /* mshv device */
@@ -39,5 +43,6 @@ struct mshv_user_mem_region {
 /* partition device */
 #define MSHV_MAP_GUEST_MEMORY	_IOW(MSHV_IOCTL, 0x02, struct mshv_user_mem_region)
 #define MSHV_UNMAP_GUEST_MEMORY	_IOW(MSHV_IOCTL, 0x03, struct mshv_user_mem_region)
+#define MSHV_CREATE_VP		_IOW(MSHV_IOCTL, 0x04, struct mshv_create_vp)
 
 #endif
-- 
2.23.4
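
mshv_partition_get() takes a reference only if the count has not already
dropped to zero, so a partition that is being torn down cannot be revived by
a late caller. A rough userspace analogue using C11 atomics is sketched below.
The names ref_get_unless_zero() and ref_put() are hypothetical, and the
kernel's refcount_t additionally saturates on overflow, which this sketch does
not model.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical analogue of refcount_inc_not_zero(). */
static bool ref_get_unless_zero(atomic_uint *ref)
{
	unsigned int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return true;
	}
	return false;	/* object is already on its way out */
}

/* Returns true when the caller dropped the last reference. */
static bool ref_put(atomic_uint *ref)
{
	return atomic_fetch_sub(ref, 1) == 1;
}
```

A caller that gets false from ref_get_unless_zero() must not touch the object,
which corresponds to the -EBADF path in mshv_partition_ioctl_create_vp().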


* [PATCH v3 10/19] drivers/hv: get and set vcpu registers ioctls
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Add ioctls for getting and setting virtual processor registers.

Co-developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
 Documentation/virt/mshv/api.rst         |  11 +
 arch/x86/include/uapi/asm/hyperv-tlfs.h | 602 ++++++++++++++++++++++++
 drivers/hv/hv_call.c                    | 100 ++++
 drivers/hv/mshv.h                       |  15 +
 drivers/hv/mshv_main.c                  |  96 +++-
 include/asm-generic/hyperv-tlfs.h       |  47 +-
 include/linux/mshv.h                    |   1 +
 include/uapi/asm-generic/hyperv-tlfs.h  |   7 +
 include/uapi/linux/mshv.h               |  11 +
 9 files changed, 859 insertions(+), 31 deletions(-)
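
In the enum hv_register_name values added by this patch, the upper 16 bits of
each value happen to follow the section comments in the header (user-mode
registers at 0x0002xxxx, control registers at 0x0004xxxx, and so on). The
sketch below classifies a register name on that basis. This grouping is an
observation from the listed values, not a documented contract, and both
enum reg_group and hv_register_group() are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical grouping, inferred from the values listed in the enum. */
enum reg_group {
	REG_GROUP_HYPERVISOR,	/* suspend, version, features, crash, power */
	REG_GROUP_PENDING,	/* pending events, interrupt state */
	REG_GROUP_X64_USER,
	REG_GROUP_X64_FP_VECTOR,
	REG_GROUP_X64_CONTROL,
	REG_GROUP_X64_DEBUG,
	REG_GROUP_X64_SEGMENT,
	REG_GROUP_X64_TABLE,
	REG_GROUP_X64_MSR,
	REG_GROUP_MISC,		/* VP runtime, statistics, reference TSC */
	REG_GROUP_SYNIC,
	REG_GROUP_STIMER,
	REG_GROUP_VSM,
	REG_GROUP_UNKNOWN,
};

/* Classify a register name by the upper 16 bits of its value. */
static enum reg_group hv_register_group(uint32_t name)
{
	switch (name >> 16) {
	case 0x0000: return REG_GROUP_HYPERVISOR;
	case 0x0001: return REG_GROUP_PENDING;
	case 0x0002: return REG_GROUP_X64_USER;
	case 0x0003: return REG_GROUP_X64_FP_VECTOR;
	case 0x0004: return REG_GROUP_X64_CONTROL;
	case 0x0005: return REG_GROUP_X64_DEBUG;
	case 0x0006: return REG_GROUP_X64_SEGMENT;
	case 0x0007: return REG_GROUP_X64_TABLE;
	case 0x0008: return REG_GROUP_X64_MSR;
	case 0x0009: return REG_GROUP_MISC;
	case 0x000A: return REG_GROUP_SYNIC;
	case 0x000B: return REG_GROUP_STIMER;
	case 0x000D: return REG_GROUP_VSM;
	default:     return REG_GROUP_UNKNOWN;
	}
}
```

For example, HV_X64_REGISTER_RIP (0x00020010) lands in the user-mode group and
HV_X64_REGISTER_EFER (0x00080001) in the MSR group.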

diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
index 2538756bc86b..f0631236c063 100644
--- a/Documentation/virt/mshv/api.rst
+++ b/Documentation/virt/mshv/api.rst
@@ -96,3 +96,14 @@ by real physical memory.
 Create a virtual processor in a guest partition, returning a file descriptor to
 represent the vp and perform ioctls on.
 
+3.5 MSHV_GET_VP_REGISTERS and MSHV_SET_VP_REGISTERS
+---------------------------------------------------
+:Type: vp ioctl
+:Parameters: struct mshv_vp_registers
+:Returns: 0 on success
+
+Get/set vp registers. See asm/hyperv-tlfs.h for the complete set, which includes
+general-purpose platform registers, MSRs, and virtual registers that are part of
+the Microsoft Hypervisor platform and are not directly exposed to the guest.
+
+
diff --git a/arch/x86/include/uapi/asm/hyperv-tlfs.h b/arch/x86/include/uapi/asm/hyperv-tlfs.h
index 8a5fc59bb33a..a42c63001055 100644
--- a/arch/x86/include/uapi/asm/hyperv-tlfs.h
+++ b/arch/x86/include/uapi/asm/hyperv-tlfs.h
@@ -121,4 +121,606 @@ struct hv_partition_creation_properties {
 		disabled_processor_xsave_features;
 } __packed;
 
+enum hv_register_name {
+	/* Suspend Registers */
+	HV_REGISTER_EXPLICIT_SUSPEND		= 0x00000000,
+	HV_REGISTER_INTERCEPT_SUSPEND		= 0x00000001,
+	HV_REGISTER_INSTRUCTION_EMULATION_HINTS	= 0x00000002,
+	HV_REGISTER_DISPATCH_SUSPEND		= 0x00000003,
+	HV_REGISTER_INTERNAL_ACTIVITY_STATE	= 0x00000004,
+
+	/* Version */
+	HV_REGISTER_HYPERVISOR_VERSION	= 0x00000100, /* 128-bit result same as CPUID 0x40000002 */
+
+	/* Feature Access (registers are 128 bits) - same as CPUID 0x40000003 - 0x4000000B */
+	HV_REGISTER_PRIVILEGES_AND_FEATURES_INFO	= 0x00000200,
+	HV_REGISTER_FEATURES_INFO			= 0x00000201,
+	HV_REGISTER_IMPLEMENTATION_LIMITS_INFO		= 0x00000202,
+	HV_REGISTER_HARDWARE_FEATURES_INFO		= 0x00000203,
+	HV_REGISTER_CPU_MANAGEMENT_FEATURES_INFO	= 0x00000204,
+	HV_REGISTER_SVM_FEATURES_INFO			= 0x00000205,
+	HV_REGISTER_SKIP_LEVEL_FEATURES_INFO		= 0x00000206,
+	HV_REGISTER_NESTED_VIRT_FEATURES_INFO		= 0x00000207,
+	HV_REGISTER_IPT_FEATURES_INFO			= 0x00000208,
+
+	/* Guest Crash Registers */
+	HV_REGISTER_GUEST_CRASH_P0	= 0x00000210,
+	HV_REGISTER_GUEST_CRASH_P1	= 0x00000211,
+	HV_REGISTER_GUEST_CRASH_P2	= 0x00000212,
+	HV_REGISTER_GUEST_CRASH_P3	= 0x00000213,
+	HV_REGISTER_GUEST_CRASH_P4	= 0x00000214,
+	HV_REGISTER_GUEST_CRASH_CTL	= 0x00000215,
+
+	/* Power State Configuration */
+	HV_REGISTER_POWER_STATE_CONFIG_C1	= 0x00000220,
+	HV_REGISTER_POWER_STATE_TRIGGER_C1	= 0x00000221,
+	HV_REGISTER_POWER_STATE_CONFIG_C2	= 0x00000222,
+	HV_REGISTER_POWER_STATE_TRIGGER_C2	= 0x00000223,
+	HV_REGISTER_POWER_STATE_CONFIG_C3	= 0x00000224,
+	HV_REGISTER_POWER_STATE_TRIGGER_C3	= 0x00000225,
+
+	/* Frequency Registers */
+	HV_REGISTER_PROCESSOR_CLOCK_FREQUENCY	= 0x00000240,
+	HV_REGISTER_INTERRUPT_CLOCK_FREQUENCY	= 0x00000241,
+
+	/* Idle Register */
+	HV_REGISTER_GUEST_IDLE	= 0x00000250,
+
+	/* Guest Debug */
+	HV_REGISTER_DEBUG_DEVICE_OPTIONS	= 0x00000260,
+
+	/* Memory Zeroing Control Register */
+	HV_REGISTER_MEMORY_ZEROING_CONTROL	= 0x00000270,
+
+	/* Pending Event Register */
+	HV_REGISTER_PENDING_EVENT0	= 0x00010004,
+	HV_REGISTER_PENDING_EVENT1	= 0x00010005,
+
+	/* Misc */
+	HV_REGISTER_VP_RUNTIME			= 0x00090000,
+	HV_REGISTER_GUEST_OS_ID			= 0x00090002,
+	HV_REGISTER_VP_INDEX			= 0x00090003,
+	HV_REGISTER_TIME_REF_COUNT		= 0x00090004,
+	HV_REGISTER_CPU_MANAGEMENT_VERSION	= 0x00090007,
+	HV_REGISTER_VP_ASSIST_PAGE		= 0x00090013,
+	HV_REGISTER_VP_ROOT_SIGNAL_COUNT	= 0x00090014,
+	HV_REGISTER_REFERENCE_TSC		= 0x00090017,
+
+	/* Performance statistics Registers */
+	HV_REGISTER_STATS_PARTITION_RETAIL	= 0x00090020,
+	HV_REGISTER_STATS_PARTITION_INTERNAL	= 0x00090021,
+	HV_REGISTER_STATS_VP_RETAIL		= 0x00090022,
+	HV_REGISTER_STATS_VP_INTERNAL		= 0x00090023,
+
+	HV_REGISTER_NESTED_VP_INDEX	= 0x00091003,
+
+	/* Hypervisor-defined Registers (Synic) */
+	HV_REGISTER_SINT0	= 0x000A0000,
+	HV_REGISTER_SINT1	= 0x000A0001,
+	HV_REGISTER_SINT2	= 0x000A0002,
+	HV_REGISTER_SINT3	= 0x000A0003,
+	HV_REGISTER_SINT4	= 0x000A0004,
+	HV_REGISTER_SINT5	= 0x000A0005,
+	HV_REGISTER_SINT6	= 0x000A0006,
+	HV_REGISTER_SINT7	= 0x000A0007,
+	HV_REGISTER_SINT8	= 0x000A0008,
+	HV_REGISTER_SINT9	= 0x000A0009,
+	HV_REGISTER_SINT10	= 0x000A000A,
+	HV_REGISTER_SINT11	= 0x000A000B,
+	HV_REGISTER_SINT12	= 0x000A000C,
+	HV_REGISTER_SINT13	= 0x000A000D,
+	HV_REGISTER_SINT14	= 0x000A000E,
+	HV_REGISTER_SINT15	= 0x000A000F,
+	HV_REGISTER_SCONTROL	= 0x000A0010,
+	HV_REGISTER_SVERSION	= 0x000A0011,
+	HV_REGISTER_SIFP	= 0x000A0012,
+	HV_REGISTER_SIPP	= 0x000A0013,
+	HV_REGISTER_EOM		= 0x000A0014,
+	HV_REGISTER_SIRBP	= 0x000A0015,
+
+	HV_REGISTER_NESTED_SINT0	= 0x000A1000,
+	HV_REGISTER_NESTED_SINT1	= 0x000A1001,
+	HV_REGISTER_NESTED_SINT2	= 0x000A1002,
+	HV_REGISTER_NESTED_SINT3	= 0x000A1003,
+	HV_REGISTER_NESTED_SINT4	= 0x000A1004,
+	HV_REGISTER_NESTED_SINT5	= 0x000A1005,
+	HV_REGISTER_NESTED_SINT6	= 0x000A1006,
+	HV_REGISTER_NESTED_SINT7	= 0x000A1007,
+	HV_REGISTER_NESTED_SINT8	= 0x000A1008,
+	HV_REGISTER_NESTED_SINT9	= 0x000A1009,
+	HV_REGISTER_NESTED_SINT10	= 0x000A100A,
+	HV_REGISTER_NESTED_SINT11	= 0x000A100B,
+	HV_REGISTER_NESTED_SINT12	= 0x000A100C,
+	HV_REGISTER_NESTED_SINT13	= 0x000A100D,
+	HV_REGISTER_NESTED_SINT14	= 0x000A100E,
+	HV_REGISTER_NESTED_SINT15	= 0x000A100F,
+	HV_REGISTER_NESTED_SCONTROL	= 0x000A1010,
+	HV_REGISTER_NESTED_SVERSION	= 0x000A1011,
+	HV_REGISTER_NESTED_SIFP		= 0x000A1012,
+	HV_REGISTER_NESTED_SIPP		= 0x000A1013,
+	HV_REGISTER_NESTED_EOM		= 0x000A1014,
+	HV_REGISTER_NESTED_SIRBP	= 0x000a1015,
+
+
+	/* Hypervisor-defined Registers (Synthetic Timers) */
+	HV_REGISTER_STIMER0_CONFIG		= 0x000B0000,
+	HV_REGISTER_STIMER0_COUNT		= 0x000B0001,
+	HV_REGISTER_STIMER1_CONFIG		= 0x000B0002,
+	HV_REGISTER_STIMER1_COUNT		= 0x000B0003,
+	HV_REGISTER_STIMER2_CONFIG		= 0x000B0004,
+	HV_REGISTER_STIMER2_COUNT		= 0x000B0005,
+	HV_REGISTER_STIMER3_CONFIG		= 0x000B0006,
+	HV_REGISTER_STIMER3_COUNT		= 0x000B0007,
+	HV_REGISTER_STIME_UNHALTED_TIMER_CONFIG	= 0x000B0100,
+	HV_REGISTER_STIME_UNHALTED_TIMER_COUNT	= 0x000b0101,
+
+	/* Synthetic VSM registers */
+
+	/* 0x000D0000-1 are available for future use. */
+	HV_REGISTER_VSM_CODE_PAGE_OFFSETS	= 0x000D0002,
+	HV_REGISTER_VSM_VP_STATUS		= 0x000D0003,
+	HV_REGISTER_VSM_PARTITION_STATUS	= 0x000D0004,
+	HV_REGISTER_VSM_VINA			= 0x000D0005,
+	HV_REGISTER_VSM_CAPABILITIES		= 0x000D0006,
+	HV_REGISTER_VSM_PARTITION_CONFIG	= 0x000D0007,
+
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL0	= 0x000D0010,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL1	= 0x000D0011,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL2	= 0x000D0012,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL3	= 0x000D0013,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL4	= 0x000D0014,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL5	= 0x000D0015,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL6	= 0x000D0016,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL7	= 0x000D0017,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL8	= 0x000D0018,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL9	= 0x000D0019,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL10	= 0x000D001A,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL11	= 0x000D001B,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL12	= 0x000D001C,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL13	= 0x000D001D,
+	HV_REGISTER_VSM_VP_SECURE_CONFIG_VTL14	= 0x000D001E,
+
+	HV_REGISTER_VSM_VP_WAIT_FOR_TLB_LOCK	= 0x000D0020,
+
+	HV_REGISTER_ISOLATION_CAPABILITIES	= 0x000D0100,
+
+	/* Pending Interruption Register */
+	HV_REGISTER_PENDING_INTERRUPTION	= 0x00010002,
+
+	/* Interrupt State register */
+	HV_REGISTER_INTERRUPT_STATE	= 0x00010003,
+
+	/* Interruptible notification register */
+	HV_X64_REGISTER_DELIVERABILITY_NOTIFICATIONS	= 0x00010006,
+
+	/* X64 User-Mode Registers */
+	HV_X64_REGISTER_RAX	= 0x00020000,
+	HV_X64_REGISTER_RCX	= 0x00020001,
+	HV_X64_REGISTER_RDX	= 0x00020002,
+	HV_X64_REGISTER_RBX	= 0x00020003,
+	HV_X64_REGISTER_RSP	= 0x00020004,
+	HV_X64_REGISTER_RBP	= 0x00020005,
+	HV_X64_REGISTER_RSI	= 0x00020006,
+	HV_X64_REGISTER_RDI	= 0x00020007,
+	HV_X64_REGISTER_R8	= 0x00020008,
+	HV_X64_REGISTER_R9	= 0x00020009,
+	HV_X64_REGISTER_R10	= 0x0002000A,
+	HV_X64_REGISTER_R11	= 0x0002000B,
+	HV_X64_REGISTER_R12	= 0x0002000C,
+	HV_X64_REGISTER_R13	= 0x0002000D,
+	HV_X64_REGISTER_R14	= 0x0002000E,
+	HV_X64_REGISTER_R15	= 0x0002000F,
+	HV_X64_REGISTER_RIP	= 0x00020010,
+	HV_X64_REGISTER_RFLAGS	= 0x00020011,
+
+	/* X64 Floating Point and Vector Registers */
+	HV_X64_REGISTER_XMM0			= 0x00030000,
+	HV_X64_REGISTER_XMM1			= 0x00030001,
+	HV_X64_REGISTER_XMM2			= 0x00030002,
+	HV_X64_REGISTER_XMM3			= 0x00030003,
+	HV_X64_REGISTER_XMM4			= 0x00030004,
+	HV_X64_REGISTER_XMM5			= 0x00030005,
+	HV_X64_REGISTER_XMM6			= 0x00030006,
+	HV_X64_REGISTER_XMM7			= 0x00030007,
+	HV_X64_REGISTER_XMM8			= 0x00030008,
+	HV_X64_REGISTER_XMM9			= 0x00030009,
+	HV_X64_REGISTER_XMM10			= 0x0003000A,
+	HV_X64_REGISTER_XMM11			= 0x0003000B,
+	HV_X64_REGISTER_XMM12			= 0x0003000C,
+	HV_X64_REGISTER_XMM13			= 0x0003000D,
+	HV_X64_REGISTER_XMM14			= 0x0003000E,
+	HV_X64_REGISTER_XMM15			= 0x0003000F,
+	HV_X64_REGISTER_FP_MMX0			= 0x00030010,
+	HV_X64_REGISTER_FP_MMX1			= 0x00030011,
+	HV_X64_REGISTER_FP_MMX2			= 0x00030012,
+	HV_X64_REGISTER_FP_MMX3			= 0x00030013,
+	HV_X64_REGISTER_FP_MMX4			= 0x00030014,
+	HV_X64_REGISTER_FP_MMX5			= 0x00030015,
+	HV_X64_REGISTER_FP_MMX6			= 0x00030016,
+	HV_X64_REGISTER_FP_MMX7			= 0x00030017,
+	HV_X64_REGISTER_FP_CONTROL_STATUS	= 0x00030018,
+	HV_X64_REGISTER_XMM_CONTROL_STATUS	= 0x00030019,
+
+	/* X64 Control Registers */
+	HV_X64_REGISTER_CR0	= 0x00040000,
+	HV_X64_REGISTER_CR2	= 0x00040001,
+	HV_X64_REGISTER_CR3	= 0x00040002,
+	HV_X64_REGISTER_CR4	= 0x00040003,
+	HV_X64_REGISTER_CR8	= 0x00040004,
+	HV_X64_REGISTER_XFEM	= 0x00040005,
+
+	/* X64 Intermediate Control Registers */
+	HV_X64_REGISTER_INTERMEDIATE_CR0	= 0x00041000,
+	HV_X64_REGISTER_INTERMEDIATE_CR4	= 0x00041003,
+	HV_X64_REGISTER_INTERMEDIATE_CR8	= 0x00041004,
+
+	/* X64 Debug Registers */
+	HV_X64_REGISTER_DR0	= 0x00050000,
+	HV_X64_REGISTER_DR1	= 0x00050001,
+	HV_X64_REGISTER_DR2	= 0x00050002,
+	HV_X64_REGISTER_DR3	= 0x00050003,
+	HV_X64_REGISTER_DR6	= 0x00050004,
+	HV_X64_REGISTER_DR7	= 0x00050005,
+
+	/* X64 Segment Registers */
+	HV_X64_REGISTER_ES	= 0x00060000,
+	HV_X64_REGISTER_CS	= 0x00060001,
+	HV_X64_REGISTER_SS	= 0x00060002,
+	HV_X64_REGISTER_DS	= 0x00060003,
+	HV_X64_REGISTER_FS	= 0x00060004,
+	HV_X64_REGISTER_GS	= 0x00060005,
+	HV_X64_REGISTER_LDTR	= 0x00060006,
+	HV_X64_REGISTER_TR	= 0x00060007,
+
+	/* X64 Table Registers */
+	HV_X64_REGISTER_IDTR	= 0x00070000,
+	HV_X64_REGISTER_GDTR	= 0x00070001,
+
+	/* X64 Virtualized MSRs */
+	HV_X64_REGISTER_TSC		= 0x00080000,
+	HV_X64_REGISTER_EFER		= 0x00080001,
+	HV_X64_REGISTER_KERNEL_GS_BASE	= 0x00080002,
+	HV_X64_REGISTER_APIC_BASE	= 0x00080003,
+	HV_X64_REGISTER_PAT		= 0x00080004,
+	HV_X64_REGISTER_SYSENTER_CS	= 0x00080005,
+	HV_X64_REGISTER_SYSENTER_EIP	= 0x00080006,
+	HV_X64_REGISTER_SYSENTER_ESP	= 0x00080007,
+	HV_X64_REGISTER_STAR		= 0x00080008,
+	HV_X64_REGISTER_LSTAR		= 0x00080009,
+	HV_X64_REGISTER_CSTAR		= 0x0008000A,
+	HV_X64_REGISTER_SFMASK		= 0x0008000B,
+	HV_X64_REGISTER_INITIAL_APIC_ID	= 0x0008000C,
+
+	/* X64 Cache control MSRs */
+	HV_X64_REGISTER_MSR_MTRR_CAP		= 0x0008000D,
+	HV_X64_REGISTER_MSR_MTRR_DEF_TYPE	= 0x0008000E,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASE0	= 0x00080010,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASE1	= 0x00080011,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASE2	= 0x00080012,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASE3	= 0x00080013,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASE4	= 0x00080014,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASE5	= 0x00080015,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASE6	= 0x00080016,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASE7	= 0x00080017,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASE8	= 0x00080018,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASE9	= 0x00080019,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASEA	= 0x0008001A,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASEB	= 0x0008001B,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASEC	= 0x0008001C,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASED	= 0x0008001D,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASEE	= 0x0008001E,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_BASEF	= 0x0008001F,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASK0	= 0x00080040,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASK1	= 0x00080041,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASK2	= 0x00080042,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASK3	= 0x00080043,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASK4	= 0x00080044,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASK5	= 0x00080045,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASK6	= 0x00080046,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASK7	= 0x00080047,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASK8	= 0x00080048,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASK9	= 0x00080049,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASKA	= 0x0008004A,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASKB	= 0x0008004B,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASKC	= 0x0008004C,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASKD	= 0x0008004D,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASKE	= 0x0008004E,
+	HV_X64_REGISTER_MSR_MTRR_PHYS_MASKF	= 0x0008004F,
+	HV_X64_REGISTER_MSR_MTRR_FIX64K00000	= 0x00080070,
+	HV_X64_REGISTER_MSR_MTRR_FIX16K80000	= 0x00080071,
+	HV_X64_REGISTER_MSR_MTRR_FIX16KA0000	= 0x00080072,
+	HV_X64_REGISTER_MSR_MTRR_FIX4KC0000	= 0x00080073,
+	HV_X64_REGISTER_MSR_MTRR_FIX4KC8000	= 0x00080074,
+	HV_X64_REGISTER_MSR_MTRR_FIX4KD0000	= 0x00080075,
+	HV_X64_REGISTER_MSR_MTRR_FIX4KD8000	= 0x00080076,
+	HV_X64_REGISTER_MSR_MTRR_FIX4KE0000	= 0x00080077,
+	HV_X64_REGISTER_MSR_MTRR_FIX4KE8000	= 0x00080078,
+	HV_X64_REGISTER_MSR_MTRR_FIX4KF0000	= 0x00080079,
+	HV_X64_REGISTER_MSR_MTRR_FIX4KF8000	= 0x0008007A,
+
+	HV_X64_REGISTER_TSC_AUX		= 0x0008007B,
+	HV_X64_REGISTER_BNDCFGS		= 0x0008007C,
+	HV_X64_REGISTER_DEBUG_CTL	= 0x0008007D,
+
+	/* Available */
+	HV_X64_REGISTER_AVAILABLE0008007E	= 0x0008007E,
+	HV_X64_REGISTER_AVAILABLE0008007F	= 0x0008007F,
+
+	HV_X64_REGISTER_SGX_LAUNCH_CONTROL0	= 0x00080080,
+	HV_X64_REGISTER_SGX_LAUNCH_CONTROL1	= 0x00080081,
+	HV_X64_REGISTER_SGX_LAUNCH_CONTROL2	= 0x00080082,
+	HV_X64_REGISTER_SGX_LAUNCH_CONTROL3	= 0x00080083,
+	HV_X64_REGISTER_SPEC_CTRL		= 0x00080084,
+	HV_X64_REGISTER_PRED_CMD		= 0x00080085,
+	HV_X64_REGISTER_VIRT_SPEC_CTRL		= 0x00080086,
+
+	/* Other MSRs */
+	HV_X64_REGISTER_MSR_IA32_MISC_ENABLE		= 0x000800A0,
+	HV_X64_REGISTER_IA32_FEATURE_CONTROL		= 0x000800A1,
+	HV_X64_REGISTER_IA32_VMX_BASIC			= 0x000800A2,
+	HV_X64_REGISTER_IA32_VMX_PINBASED_CTLS		= 0x000800A3,
+	HV_X64_REGISTER_IA32_VMX_PROCBASED_CTLS		= 0x000800A4,
+	HV_X64_REGISTER_IA32_VMX_EXIT_CTLS		= 0x000800A5,
+	HV_X64_REGISTER_IA32_VMX_ENTRY_CTLS		= 0x000800A6,
+	HV_X64_REGISTER_IA32_VMX_MISC			= 0x000800A7,
+	HV_X64_REGISTER_IA32_VMX_CR0_FIXED0		= 0x000800A8,
+	HV_X64_REGISTER_IA32_VMX_CR0_FIXED1		= 0x000800A9,
+	HV_X64_REGISTER_IA32_VMX_CR4_FIXED0		= 0x000800AA,
+	HV_X64_REGISTER_IA32_VMX_CR4_FIXED1		= 0x000800AB,
+	HV_X64_REGISTER_IA32_VMX_VMCS_ENUM		= 0x000800AC,
+	HV_X64_REGISTER_IA32_VMX_PROCBASED_CTLS2	= 0x000800AD,
+	HV_X64_REGISTER_IA32_VMX_EPT_VPID_CAP		= 0x000800AE,
+	HV_X64_REGISTER_IA32_VMX_TRUE_PINBASED_CTLS	= 0x000800AF,
+	HV_X64_REGISTER_IA32_VMX_TRUE_PROCBASED_CTLS	= 0x000800B0,
+	HV_X64_REGISTER_IA32_VMX_TRUE_EXIT_CTLS		= 0x000800B1,
+	HV_X64_REGISTER_IA32_VMX_TRUE_ENTRY_CTLS	= 0x000800B2,
+
+	/* Performance monitoring MSRs */
+	HV_X64_REGISTER_PERF_GLOBAL_CTRL	= 0x00081000,
+	HV_X64_REGISTER_PERF_GLOBAL_STATUS	= 0x00081001,
+	HV_X64_REGISTER_PERF_GLOBAL_IN_USE	= 0x00081002,
+	HV_X64_REGISTER_FIXED_CTR_CTRL		= 0x00081003,
+	HV_X64_REGISTER_DS_AREA			= 0x00081004,
+	HV_X64_REGISTER_PEBS_ENABLE		= 0x00081005,
+	HV_X64_REGISTER_PEBS_LD_LAT		= 0x00081006,
+	HV_X64_REGISTER_PEBS_FRONTEND		= 0x00081007,
+	HV_X64_REGISTER_PERF_EVT_SEL0		= 0x00081100,
+	HV_X64_REGISTER_PMC0			= 0x00081200,
+	HV_X64_REGISTER_FIXED_CTR0		= 0x00081300,
+
+	HV_X64_REGISTER_LBR_TOS		= 0x00082000,
+	HV_X64_REGISTER_LBR_SELECT	= 0x00082001,
+	HV_X64_REGISTER_LER_FROM_LIP	= 0x00082002,
+	HV_X64_REGISTER_LER_TO_LIP	= 0x00082003,
+	HV_X64_REGISTER_LBR_FROM0	= 0x00082100,
+	HV_X64_REGISTER_LBR_TO0		= 0x00082200,
+	HV_X64_REGISTER_LBR_INFO0	= 0x00083300,
+
+	/* Intel processor trace MSRs */
+	HV_X64_REGISTER_RTIT_CTL		= 0x00081008,
+	HV_X64_REGISTER_RTIT_STATUS		= 0x00081009,
+	HV_X64_REGISTER_RTIT_OUTPUT_BASE	= 0x0008100A,
+	HV_X64_REGISTER_RTIT_OUTPUT_MASK_PTRS	= 0x0008100B,
+	HV_X64_REGISTER_RTIT_CR3_MATCH		= 0x0008100C,
+	HV_X64_REGISTER_RTIT_ADDR0A		= 0x00081400,
+
+	/* RtitAddr0A/B - RtitAddr3A/B occupy 0x00081400-0x00081407. */
+
+	/* X64 Apic registers. These match the equivalent x2APIC MSR offsets. */
+	HV_X64_REGISTER_APIC_ID		= 0x00084802,
+	HV_X64_REGISTER_APIC_VERSION	= 0x00084803,
+
+	/* Hypervisor-defined registers (Misc) */
+	HV_X64_REGISTER_HYPERCALL	= 0x00090001,
+
+	/* X64 Virtual APIC registers synthetic MSRs */
+	HV_X64_REGISTER_SYNTHETIC_EOI	= 0x00090010,
+	HV_X64_REGISTER_SYNTHETIC_ICR	= 0x00090011,
+	HV_X64_REGISTER_SYNTHETIC_TPR	= 0x00090012,
+
+	/* Partition Timer Assist Registers */
+	HV_X64_REGISTER_EMULATED_TIMER_PERIOD	= 0x00090030,
+	HV_X64_REGISTER_EMULATED_TIMER_CONTROL	= 0x00090031,
+	HV_X64_REGISTER_PM_TIMER_ASSIST		= 0x00090032,
+
+	/* Intercept Control Registers */
+	HV_X64_REGISTER_CR_INTERCEPT_CONTROL			= 0x000E0000,
+	HV_X64_REGISTER_CR_INTERCEPT_CR0_MASK			= 0x000E0001,
+	HV_X64_REGISTER_CR_INTERCEPT_CR4_MASK			= 0x000E0002,
+	HV_X64_REGISTER_CR_INTERCEPT_IA32_MISC_ENABLE_MASK	= 0x000E0003,
+};
+
+struct hv_u128 {
+	__u64 high_part;
+	__u64 low_part;
+} __packed;
+
+union hv_x64_fp_register {
+	struct hv_u128 as_uint128;
+	struct {
+		__u64 mantissa;
+		__u64 biased_exponent : 15;
+		__u64 sign : 1;
+		__u64 reserved : 48;
+	} __packed;
+} __packed;
+
+union hv_x64_fp_control_status_register {
+	struct hv_u128 as_uint128;
+	struct {
+		__u16 fp_control;
+		__u16 fp_status;
+		__u8 fp_tag;
+		__u8 reserved;
+		__u16 last_fp_op;
+		union {
+			/* long mode */
+			__u64 last_fp_rip;
+			/* 32 bit mode */
+			struct {
+				__u32 last_fp_eip;
+				__u16 last_fp_cs;
+				__u16 padding;
+			} __packed;
+		};
+	} __packed;
+} __packed;
+
+union hv_x64_xmm_control_status_register {
+	struct hv_u128 as_uint128;
+	struct {
+		union {
+			/* long mode */
+			__u64 last_fp_rdp;
+			/* 32 bit mode */
+			struct {
+				__u32 last_fp_dp;
+				__u16 last_fp_ds;
+				__u16 padding;
+			} __packed;
+		};
+		__u32 xmm_status_control;
+		__u32 xmm_status_control_mask;
+	} __packed;
+} __packed;
+
+struct hv_x64_segment_register {
+	__u64 base;
+	__u32 limit;
+	__u16 selector;
+	union {
+		struct {
+			__u16 segment_type : 4;
+			__u16 non_system_segment : 1;
+			__u16 descriptor_privilege_level : 2;
+			__u16 present : 1;
+			__u16 reserved : 4;
+			__u16 available : 1;
+			__u16 _long : 1;
+			__u16 _default : 1;
+			__u16 granularity : 1;
+		} __packed;
+		__u16 attributes;
+	};
+} __packed;
+
+struct hv_x64_table_register {
+	__u16 pad[3];
+	__u16 limit;
+	__u64 base;
+} __packed;
+
+union hv_explicit_suspend_register {
+	__u64 as_uint64;
+	struct {
+		__u64 suspended : 1;
+		__u64 reserved : 63;
+	} __packed;
+};
+
+union hv_intercept_suspend_register {
+	__u64 as_uint64;
+	struct {
+		__u64 suspended : 1;
+		__u64 reserved : 63;
+	} __packed;
+};
+
+union hv_dispatch_suspend_register {
+	__u64 as_uint64;
+	struct {
+		__u64 suspended : 1;
+		__u64 reserved : 63;
+	} __packed;
+};
+
+union hv_x64_interrupt_state_register {
+	__u64 as_uint64;
+	struct {
+		__u64 interrupt_shadow : 1;
+		__u64 nmi_masked : 1;
+		__u64 reserved : 62;
+	} __packed;
+};
+
+union hv_x64_pending_interruption_register {
+	__u64 as_uint64;
+	struct {
+		__u32 interruption_pending : 1;
+		__u32 interruption_type : 3;
+		__u32 deliver_error_code : 1;
+		__u32 instruction_length : 4;
+		__u32 nested_event : 1;
+		__u32 reserved : 6;
+		__u32 interruption_vector : 16;
+		__u32 error_code;
+	} __packed;
+};
+
+union hv_x64_msr_npiep_config_contents {
+	__u64 as_uint64;
+	struct {
+		/*
+		 * These bits enable instruction execution prevention for
+		 * specific instructions.
+		 */
+		__u64 prevents_gdt : 1;
+		__u64 prevents_idt : 1;
+		__u64 prevents_ldt : 1;
+		__u64 prevents_tr : 1;
+
+		/* The reserved bits must always be 0. */
+		__u64 reserved : 60;
+	} __packed;
+};
+
+union hv_x64_pending_exception_event {
+	__u64 as_uint64[2];
+	struct {
+		__u32 event_pending : 1;
+		__u32 event_type : 3;
+		__u32 reserved0 : 4;
+		__u32 deliver_error_code : 1;
+		__u32 reserved1 : 7;
+		__u32 vector : 16;
+		__u32 error_code;
+		__u64 exception_parameter;
+	} __packed;
+};
+
+union hv_x64_pending_virtualization_fault_event {
+	__u64 as_uint64[2];
+	struct {
+		__u32 event_pending : 1;
+		__u32 event_type : 3;
+		__u32 reserved0 : 4;
+		__u32 reserved1 : 8;
+		__u32 parameter0 : 16;
+		__u32 code;
+		__u64 parameter1;
+	} __packed;
+};
+
+union hv_register_value {
+	struct hv_u128 reg128;
+	__u64 reg64;
+	__u32 reg32;
+	__u16 reg16;
+	__u8 reg8;
+	union hv_x64_fp_register fp;
+	union hv_x64_fp_control_status_register fp_control_status;
+	union hv_x64_xmm_control_status_register xmm_control_status;
+	struct hv_x64_segment_register segment;
+	struct hv_x64_table_register table;
+	union hv_explicit_suspend_register explicit_suspend;
+	union hv_intercept_suspend_register intercept_suspend;
+	union hv_dispatch_suspend_register dispatch_suspend;
+	union hv_x64_interrupt_state_register interrupt_state;
+	union hv_x64_pending_interruption_register pending_interruption;
+	union hv_x64_msr_npiep_config_contents npiep_config;
+	union hv_x64_pending_exception_event pending_exception_event;
+	union hv_x64_pending_virtualization_fault_event
+		pending_virtualization_fault_event;
+};
+
 #endif
diff --git a/drivers/hv/hv_call.c b/drivers/hv/hv_call.c
index 31d59de4a7f7..37dcd6c636a7 100644
--- a/drivers/hv/hv_call.c
+++ b/drivers/hv/hv_call.c
@@ -294,3 +294,103 @@ int hv_call_unmap_gpa_pages(
 	return ret;
 }
 
+int hv_call_get_vp_registers(
+		u32 vp_index,
+		u64 partition_id,
+		u16 count,
+		struct hv_register_assoc *registers)
+{
+	struct hv_get_vp_registers *input_page;
+	union hv_register_value *output_page;
+	u16 completed = 0;
+	unsigned long remaining = count;
+	int rep_count, i;
+	u64 status = HV_STATUS_SUCCESS;
+	unsigned long flags;
+
+	local_irq_save(flags);
+
+	input_page = (struct hv_get_vp_registers *)(*this_cpu_ptr(
+		hyperv_pcpu_input_arg));
+	output_page = (union hv_register_value *)(*this_cpu_ptr(
+		hyperv_pcpu_output_arg));
+
+	input_page->partition_id = partition_id;
+	input_page->vp_index = vp_index;
+	input_page->input_vtl = 0;
+	input_page->rsvd_z8 = 0;
+	input_page->rsvd_z16 = 0;
+
+	while (remaining) {
+		rep_count = min(remaining, HV_GET_REGISTER_BATCH_SIZE);
+		for (i = 0; i < rep_count; ++i)
+			input_page->names[i] = registers[i].name;
+
+		status = hv_do_rep_hypercall(HVCALL_GET_VP_REGISTERS, rep_count,
+					     0, input_page, output_page);
+		if (!hv_result_success(status)) {
+			pr_err("%s: completed %lu out of %u, %s\n",
+			       __func__,
+			       count - remaining, count,
+			       hv_status_to_string(status));
+			break;
+		}
+		completed = hv_repcomp(status);
+		for (i = 0; i < completed; ++i)
+			registers[i].value = output_page[i];
+
+		registers += completed;
+		remaining -= completed;
+	}
+	local_irq_restore(flags);
+
+	return hv_status_to_errno(status);
+}
+
+int hv_call_set_vp_registers(
+		u32 vp_index,
+		u64 partition_id,
+		u16 count,
+		struct hv_register_assoc *registers)
+{
+	struct hv_set_vp_registers *input_page;
+	u16 completed = 0;
+	unsigned long remaining = count;
+	int rep_count;
+	u64 status = HV_STATUS_SUCCESS;
+	unsigned long flags;
+
+	local_irq_save(flags);
+	input_page = (struct hv_set_vp_registers *)(*this_cpu_ptr(
+		hyperv_pcpu_input_arg));
+
+	input_page->partition_id = partition_id;
+	input_page->vp_index = vp_index;
+	input_page->input_vtl = 0;
+	input_page->rsvd_z8 = 0;
+	input_page->rsvd_z16 = 0;
+
+	while (remaining) {
+		rep_count = min(remaining, HV_SET_REGISTER_BATCH_SIZE);
+		memcpy(input_page->elements, registers,
+			sizeof(struct hv_register_assoc) * rep_count);
+
+		status = hv_do_rep_hypercall(HVCALL_SET_VP_REGISTERS, rep_count,
+					     0, input_page, NULL);
+		if (!hv_result_success(status)) {
+			pr_err("%s: completed %lu out of %u, %s\n",
+			       __func__,
+			       count - remaining, count,
+			       hv_status_to_string(status));
+			break;
+		}
+		completed = hv_repcomp(status);
+		registers += completed;
+		remaining -= completed;
+	}
+
+	local_irq_restore(flags);
+
+	return hv_status_to_errno(status);
+}
+
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
index 13d9df7c3e0d..9e63d2fabc74 100644
--- a/drivers/hv/mshv.h
+++ b/drivers/hv/mshv.h
@@ -21,6 +21,11 @@
 #define HV_MAP_GPA_BATCH_SIZE	\
 		((HV_HYP_PAGE_SIZE - sizeof(struct hv_map_gpa_pages)) / sizeof(u64))
 #define PIN_PAGES_BATCH_SIZE	(0x10000000 / HV_HYP_PAGE_SIZE)
+#define HV_GET_REGISTER_BATCH_SIZE	\
+	(HV_HYP_PAGE_SIZE / sizeof(union hv_register_value))
+#define HV_SET_REGISTER_BATCH_SIZE	\
+	((HV_HYP_PAGE_SIZE - sizeof(struct hv_set_vp_registers)) \
+		/ sizeof(struct hv_register_assoc))
 
 /*
  * Hyper-V hypercalls
@@ -43,5 +48,15 @@ int hv_call_unmap_gpa_pages(
 		u64 partition_id,
 		u64 gpa_target,
 		u64 page_count, u32 flags);
+int hv_call_get_vp_registers(
+		u32 vp_index,
+		u64 partition_id,
+		u16 count,
+		struct hv_register_assoc *registers);
+int hv_call_set_vp_registers(
+		u32 vp_index,
+		u64 partition_id,
+		u16 count,
+		struct hv_register_assoc *registers);
 
 #endif /* _MSHV_H */
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index c3ac8c371d0f..f66644d0dca5 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -62,10 +62,102 @@ static struct miscdevice mshv_dev = {
 	.mode = 0600,
 };
 
+static long
+mshv_vp_ioctl_get_regs(struct mshv_vp *vp, void __user *user_args)
+{
+	struct mshv_vp_registers args;
+	struct hv_register_assoc *registers;
+	long ret;
+
+	if (copy_from_user(&args, user_args, sizeof(args)))
+		return -EFAULT;
+
+	if (args.count > MSHV_VP_MAX_REGISTERS)
+		return -EINVAL;
+
+	registers = kmalloc_array(args.count,
+				  sizeof(*registers),
+				  GFP_KERNEL);
+	if (!registers)
+		return -ENOMEM;
+
+	if (copy_from_user(registers, args.regs,
+			   sizeof(*registers) * args.count)) {
+		ret = -EFAULT;
+		goto free_return;
+	}
+
+	ret = hv_call_get_vp_registers(vp->index, vp->partition->id,
+				       args.count, registers);
+	if (ret)
+		goto free_return;
+
+	if (copy_to_user(args.regs, registers,
+			 sizeof(*registers) * args.count)) {
+		ret = -EFAULT;
+	}
+
+free_return:
+	kfree(registers);
+	return ret;
+}
+
+static long
+mshv_vp_ioctl_set_regs(struct mshv_vp *vp, void __user *user_args)
+{
+	struct mshv_vp_registers args;
+	struct hv_register_assoc *registers;
+	long ret;
+
+	if (copy_from_user(&args, user_args, sizeof(args)))
+		return -EFAULT;
+
+	if (args.count > MSHV_VP_MAX_REGISTERS)
+		return -EINVAL;
+
+	registers = kmalloc_array(args.count,
+				  sizeof(*registers),
+				  GFP_KERNEL);
+	if (!registers)
+		return -ENOMEM;
+
+	if (copy_from_user(registers, args.regs,
+			   sizeof(*registers) * args.count)) {
+		ret = -EFAULT;
+		goto free_return;
+	}
+
+	ret = hv_call_set_vp_registers(vp->index, vp->partition->id,
+				       args.count, registers);
+
+free_return:
+	kfree(registers);
+	return ret;
+}
+
 static long
 mshv_vp_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
-	return -ENOTTY;
+	struct mshv_vp *vp = filp->private_data;
+	long r = 0;
+
+	if (mutex_lock_killable(&vp->mutex))
+		return -EINTR;
+
+	switch (ioctl) {
+	case MSHV_GET_VP_REGISTERS:
+		r = mshv_vp_ioctl_get_regs(vp, (void __user *)arg);
+		break;
+	case MSHV_SET_VP_REGISTERS:
+		r = mshv_vp_ioctl_set_regs(vp, (void __user *)arg);
+		break;
+	default:
+		r = -ENOTTY;
+		break;
+	}
+	mutex_unlock(&vp->mutex);
+
+	return r;
 }
 
 static int
@@ -102,6 +194,8 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
 	if (!vp)
 		return -ENOMEM;
 
+	mutex_init(&vp->mutex);
+
 	vp->index = args.vp_index;
 	vp->partition = mshv_partition_get(partition);
 	if (!vp->partition) {
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 8684e7f9ec5b..8679c39181a2 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -665,23 +665,17 @@ struct hv_retarget_device_interrupt {
 	struct hv_device_interrupt_target int_target;
 } __packed __aligned(8);
 
-
-/* HvGetVpRegisters hypercall input with variable size reg name list*/
-struct hv_get_vp_registers_input {
-	struct {
-		u64 partitionid;
-		u32 vpindex;
-		u8  inputvtl;
-		u8  padding[3];
-	} header;
-	struct input {
-		u32 name0;
-		u32 name1;
-	} element[];
+/* HvGetVpRegisters hypercall with variable size reg name list */
+struct hv_get_vp_registers {
+	u64 partition_id;
+	u32 vp_index;
+	u8  input_vtl;
+	u8  rsvd_z8;
+	u16 rsvd_z16;
+	u32 names[];
 } __packed;
 
-
-/* HvGetVpRegisters returns an array of these output elements */
+/* HvGetVpRegisters returns an array of register values */
 struct hv_get_vp_registers_output {
 	union {
 		struct {
@@ -695,23 +689,16 @@ struct hv_get_vp_registers_output {
 			u64 high;
 		} as64 __packed;
 	};
-};
+} __packed;
 
 /* HvSetVpRegisters hypercall with variable size reg name/value list*/
-struct hv_set_vp_registers_input {
-	struct {
-		u64 partitionid;
-		u32 vpindex;
-		u8  inputvtl;
-		u8  padding[3];
-	} header;
-	struct {
-		u32 name;
-		u32 padding1;
-		u64 padding2;
-		u64 valuelow;
-		u64 valuehigh;
-	} element[];
+struct hv_set_vp_registers {
+	u64 partition_id;
+	u32 vp_index;
+	u8  input_vtl;
+	u8  rsvd_z8;
+	u16 rsvd_z16;
+	struct hv_register_assoc elements[];
 } __packed;
 
 enum hv_device_type {
diff --git a/include/linux/mshv.h b/include/linux/mshv.h
index 50521c5f7948..dfe469f573f9 100644
--- a/include/linux/mshv.h
+++ b/include/linux/mshv.h
@@ -17,6 +17,7 @@
 struct mshv_vp {
 	u32 index;
 	struct mshv_partition *partition;
+	struct mutex mutex;
 };
 
 struct mshv_mem_region {
diff --git a/include/uapi/asm-generic/hyperv-tlfs.h b/include/uapi/asm-generic/hyperv-tlfs.h
index e7b09b9f00de..f49099d1f894 100644
--- a/include/uapi/asm-generic/hyperv-tlfs.h
+++ b/include/uapi/asm-generic/hyperv-tlfs.h
@@ -21,4 +21,11 @@
 #define HV_MAP_GPA_EXECUTABLE           0xC
 #define HV_MAP_GPA_PERMISSIONS_MASK     0xF
 
+struct hv_register_assoc {
+	__u32 name;			/* enum hv_register_name */
+	__u32 reserved1;
+	__u64 reserved2;
+	union hv_register_value value;
+} __packed;
+
 #endif
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index 251976348441..7a4e0c340dd4 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -34,6 +34,13 @@ struct mshv_create_vp {
 	__u32 vp_index;
 };
 
+#define MSHV_VP_MAX_REGISTERS	128
+
+struct mshv_vp_registers {
+	int count; /* at most MSHV_VP_MAX_REGISTERS */
+	struct hv_register_assoc *regs;
+};
+
 #define MSHV_IOCTL 0xB8
 
 /* mshv device */
@@ -45,4 +52,8 @@ struct mshv_create_vp {
 #define MSHV_UNMAP_GUEST_MEMORY	_IOW(MSHV_IOCTL, 0x03, struct mshv_user_mem_region)
 #define MSHV_CREATE_VP		_IOW(MSHV_IOCTL, 0x04, struct mshv_create_vp)
 
+/* vp device */
+#define MSHV_GET_VP_REGISTERS   _IOWR(MSHV_IOCTL, 0x05, struct mshv_vp_registers)
+#define MSHV_SET_VP_REGISTERS   _IOW(MSHV_IOCTL, 0x06, struct mshv_vp_registers)
+
 #endif
-- 
2.23.4
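
For reference, the two batch-size macros added to drivers/hv/mshv.h in
this patch can be checked with a small standalone sketch. This is only
an illustration under assumptions: the sketch_* names are hypothetical
local stand-ins, the page size is assumed to be 4096 bytes, and the
structure sizes are assumed to match the layouts in this patch (16-byte
register value, 32-byte name/value association, 16-byte fixed header).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: HV_HYP_PAGE_SIZE is 4096 bytes. */
#define SKETCH_PAGE_SIZE 4096u

/* Stand-in for union hv_register_value; widest member is 128 bits. */
struct sketch_reg_value {
	uint64_t part[2];
};

/* Stand-in for struct hv_register_assoc (name, reserved, value). */
struct sketch_reg_assoc {
	uint32_t name;
	uint32_t reserved1;
	uint64_t reserved2;
	struct sketch_reg_value value;
};

/* Stand-in for the fixed part of struct hv_set_vp_registers. */
struct sketch_set_header {
	uint64_t partition_id;
	uint32_t vp_index;
	uint8_t  input_vtl;
	uint8_t  rsvd_z8;
	uint16_t rsvd_z16;
};

/* Mirrors HV_GET_REGISTER_BATCH_SIZE: values that fit in one output page. */
size_t sketch_get_batch_size(void)
{
	return SKETCH_PAGE_SIZE / sizeof(struct sketch_reg_value);
}

/* Mirrors HV_SET_REGISTER_BATCH_SIZE: elements that fit after the header. */
size_t sketch_set_batch_size(void)
{
	return (SKETCH_PAGE_SIZE - sizeof(struct sketch_set_header)) /
	       sizeof(struct sketch_reg_assoc);
}

/* Rep hypercalls needed if every call completes a full batch. */
size_t sketch_calls_needed(size_t count, size_t batch)
{
	assert(batch > 0);
	return (count + batch - 1) / batch;
}
```

If those size assumptions hold, a request for MSHV_VP_MAX_REGISTERS (128)
registers fits in a single get batch but needs a second iteration of the
loop in hv_call_set_vp_registers().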


* [PATCH v3 11/19] drivers/hv: set up synic pages for intercept messages
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (9 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 10/19] drivers/hv: get and set vcpu registers ioctls Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 21:38   ` Olaf Hering
  2021-09-28 18:31 ` [PATCH v3 12/19] drivers/hv: run vp ioctl and isr Nuno Das Neves
                   ` (7 subsequent siblings)
  18 siblings, 1 reply; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Set up the SynIC message page and the intercept SINT on each CPU, using
the same approach as hv_synic_enable_regs() and hv_synic_disable_regs()
in drivers/hv/hv.c.

Setting up the SynIC registers from both the vmbus driver and mshv would
clobber them, but the vmbus driver will not run in the root partition,
so this is safe.

Move struct hv_message and related definitions to uapi.

Co-developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
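Note: mshv_synic_init() and mshv_synic_cleanup() below update each SynIC
register by reading it into a union, changing one bit-field, and writing
the raw value back. A minimal standalone sketch of that pattern follows.
It is an illustration only: the sketch_* names are hypothetical, the
field layout of the SIMP register (enable bit, 11 preserved bits, 52-bit
page number) and the 12-bit page shift are assumptions, and the bit-field
ordering relies on GCC/Clang behaviour on little-endian targets.

```c
#include <stdint.h>

/* Assumption: HV_HYP_PAGE_SHIFT is 12. */
#define SKETCH_PAGE_SHIFT 12

/* Local stand-in for union hv_synic_simp; layout is assumed. */
union sketch_simp {
	uint64_t as_uint64;
	struct {
		uint64_t simp_enabled : 1;
		uint64_t preserved : 11;
		uint64_t base_simp_gpa : 52;
	};
};

/* Hypothetical helper: set the enable bit and keep every other bit. */
uint64_t sketch_simp_enable(uint64_t raw)
{
	union sketch_simp simp;

	simp.as_uint64 = raw;
	simp.simp_enabled = 1;
	return simp.as_uint64;
}

/* Hypothetical helper: clear the enable bit and keep every other bit. */
uint64_t sketch_simp_disable(uint64_t raw)
{
	union sketch_simp simp;

	simp.as_uint64 = raw;
	simp.simp_enabled = 0;
	return simp.as_uint64;
}

/* Hypothetical helper: physical address of the message page. */
uint64_t sketch_simp_page_addr(uint64_t raw)
{
	union sketch_simp simp;

	simp.as_uint64 = raw;
	/* Widen to 64 bits before shifting the page number. */
	return (uint64_t)simp.base_simp_gpa << SKETCH_PAGE_SHIFT;
}
```

In the driver the raw value comes from hv_get_register() and goes back
through hv_set_register(); the sketch leaves those out so it can be
compiled on its own.
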
 arch/x86/include/uapi/asm/hyperv-tlfs.h | 225 ++++++++++++++++++++++++
 drivers/hv/Makefile                     |   2 +-
 drivers/hv/hv_synic.c                   |  85 +++++++++
 drivers/hv/mshv.h                       |   5 +
 drivers/hv/mshv_main.c                  |  30 +++-
 include/asm-generic/hyperv-tlfs.h       |  81 +--------
 include/linux/mshv.h                    |   1 +
 include/uapi/asm-generic/hyperv-tlfs.h  |  75 ++++++++
 8 files changed, 422 insertions(+), 82 deletions(-)
 create mode 100644 drivers/hv/hv_synic.c

diff --git a/arch/x86/include/uapi/asm/hyperv-tlfs.h b/arch/x86/include/uapi/asm/hyperv-tlfs.h
index a42c63001055..4ffa7e1cd185 100644
--- a/arch/x86/include/uapi/asm/hyperv-tlfs.h
+++ b/arch/x86/include/uapi/asm/hyperv-tlfs.h
@@ -723,4 +723,229 @@ union hv_register_value {
 		pending_virtualization_fault_event;
 };
 
+union hv_x64_vp_execution_state {
+	__u16 as_uint16;
+	struct {
+		__u16 cpl:2;
+		__u16 cr0_pe:1;
+		__u16 cr0_am:1;
+		__u16 efer_lma:1;
+		__u16 debug_active:1;
+		__u16 interruption_pending:1;
+		__u16 vtl:4;
+		__u16 enclave_mode:1;
+		__u16 interrupt_shadow:1;
+		__u16 virtualization_fault_active:1;
+		__u16 reserved:2;
+	} __packed;
+};
+
+/* Values for intercept_access_type field */
+#define HV_INTERCEPT_ACCESS_READ	0
+#define HV_INTERCEPT_ACCESS_WRITE	1
+#define HV_INTERCEPT_ACCESS_EXECUTE	2
+
+struct hv_x64_intercept_message_header {
+	__u32 vp_index;
+	__u8 instruction_length:4;
+	__u8 cr8:4; /* only set for exo partitions */
+	__u8 intercept_access_type;
+	union hv_x64_vp_execution_state execution_state;
+	struct hv_x64_segment_register cs_segment;
+	__u64 rip;
+	__u64 rflags;
+} __packed;
+
+#define HV_HYPERCALL_INTERCEPT_MAX_XMM_REGISTERS 6
+
+struct hv_x64_hypercall_intercept_message {
+	struct hv_x64_intercept_message_header header;
+	__u64 rax;
+	__u64 rbx;
+	__u64 rcx;
+	__u64 rdx;
+	__u64 r8;
+	__u64 rsi;
+	__u64 rdi;
+	struct hv_u128 xmmregisters[HV_HYPERCALL_INTERCEPT_MAX_XMM_REGISTERS];
+	struct {
+		__u32 isolated:1;
+		__u32 reserved:31;
+	} __packed;
+} __packed;
+
+union hv_x64_register_access_info {
+	union hv_register_value source_value;
+	__u32 destination_register;
+	__u64 source_address;
+	__u64 destination_address;
+};
+
+struct hv_x64_register_intercept_message {
+	struct hv_x64_intercept_message_header header;
+	struct {
+		__u8 is_memory_op:1;
+		__u8 reserved:7;
+	} __packed;
+	__u8 reserved8;
+	__u16 reserved16;
+	__u32 register_name;
+	union hv_x64_register_access_info access_info;
+} __packed;
+
+union hv_x64_memory_access_info {
+	__u8 as_uint8;
+	struct {
+		__u8 gva_valid:1;
+		__u8 gva_gpa_valid:1;
+		__u8 hypercall_output_pending:1;
+		__u8 tlb_locked_no_overlay:1;
+		__u8 reserved:4;
+	} __packed;
+};
+
+union hv_x64_io_port_access_info {
+	__u8 as_uint8;
+	struct {
+		__u8 access_size:3;
+		__u8 string_op:1;
+		__u8 rep_prefix:1;
+		__u8 reserved:3;
+	} __packed;
+};
+
+union hv_x64_exception_info {
+	__u8 as_uint8;
+	struct {
+		__u8 error_code_valid:1;
+		__u8 software_exception:1;
+		__u8 reserved:6;
+	} __packed;
+};
+
+#define HV_CACHE_TYPE_UNCACHED		0
+#define HV_CACHE_TYPE_WRITE_COMBINING	1
+#define HV_CACHE_TYPE_WRITE_THROUGH	4
+#define HV_CACHE_TYPE_WRITE_PROTECTED	5
+#define HV_CACHE_TYPE_WRITE_BACK	6
+
+struct hv_x64_memory_intercept_message {
+	struct hv_x64_intercept_message_header header;
+	__u32 cache_type;
+	__u8 instruction_byte_count;
+	union hv_x64_memory_access_info memory_access_info;
+	__u8 tpr_priority;
+	__u8 reserved1;
+	__u64 guest_virtual_address;
+	__u64 guest_physical_address;
+	__u8 instruction_bytes[16];
+} __packed;
+
+struct hv_x64_cpuid_intercept_message {
+	struct hv_x64_intercept_message_header header;
+	__u64 rax;
+	__u64 rcx;
+	__u64 rdx;
+	__u64 rbx;
+	__u64 default_result_rax;
+	__u64 default_result_rcx;
+	__u64 default_result_rdx;
+	__u64 default_result_rbx;
+} __packed;
+
+struct hv_x64_msr_intercept_message {
+	struct hv_x64_intercept_message_header header;
+	__u32 msr_number;
+	__u32 reserved;
+	__u64 rdx;
+	__u64 rax;
+} __packed;
+
+struct hv_x64_io_port_intercept_message {
+	struct hv_x64_intercept_message_header header;
+	__u16 port_number;
+	union hv_x64_io_port_access_info access_info;
+	__u8 instruction_byte_count;
+	__u32 reserved;
+	__u64 rax;
+	__u8 instruction_bytes[16];
+	struct hv_x64_segment_register ds_segment;
+	struct hv_x64_segment_register es_segment;
+	__u64 rcx;
+	__u64 rsi;
+	__u64 rdi;
+} __packed;
+
+struct hv_x64_exception_intercept_message {
+	struct hv_x64_intercept_message_header header;
+	__u16 exception_vector;
+	union hv_x64_exception_info exception_info;
+	__u8 instruction_byte_count;
+	__u32 error_code;
+	__u64 exception_parameter;
+	__u64 reserved;
+	__u8 instruction_bytes[16];
+	struct hv_x64_segment_register ds_segment;
+	struct hv_x64_segment_register ss_segment;
+	__u64 rax;
+	__u64 rcx;
+	__u64 rdx;
+	__u64 rbx;
+	__u64 rsp;
+	__u64 rbp;
+	__u64 rsi;
+	__u64 rdi;
+	__u64 r8;
+	__u64 r9;
+	__u64 r10;
+	__u64 r11;
+	__u64 r12;
+	__u64 r13;
+	__u64 r14;
+	__u64 r15;
+} __packed;
+
+struct hv_x64_invalid_vp_register_message {
+	__u32 vp_index;
+	__u32 reserved;
+} __packed;
+
+struct hv_x64_unrecoverable_exception_message {
+	struct hv_x64_intercept_message_header header;
+} __packed;
+
+#define HV_UNSUPPORTED_FEATURE_INTERCEPT	1
+#define HV_UNSUPPORTED_FEATURE_TASK_SWITCH_TSS	2
+
+struct hv_x64_unsupported_feature_message {
+	__u32 vp_index;
+	__u32 feature_code;
+	__u64 feature_parameter;
+} __packed;
+
+struct hv_x64_halt_message {
+	struct hv_x64_intercept_message_header header;
+} __packed;
+
+#define HV_X64_PENDING_INTERRUPT	0
+#define HV_X64_PENDING_NMI		2
+#define HV_X64_PENDING_EXCEPTION	3
+
+struct hv_x64_interruption_deliverable_message {
+	struct hv_x64_intercept_message_header header;
+	__u32 deliverable_type; /* pending interruption type */
+	__u32 rsvd;
+} __packed;
+
+struct hv_x64_sipi_intercept_message {
+	struct hv_x64_intercept_message_header header;
+	__u32 target_vp_index;
+	__u32 interrupt_vector;
+} __packed;
+
+struct hv_x64_apic_eoi_message {
+	__u32 vp_index;
+	__u32 interrupt_vector;
+} __packed;
+
 #endif
diff --git a/drivers/hv/Makefile b/drivers/hv/Makefile
index d20761b5df80..df2825ceb3a6 100644
--- a/drivers/hv/Makefile
+++ b/drivers/hv/Makefile
@@ -13,5 +13,5 @@ hv_vmbus-y := vmbus_drv.o \
 hv_vmbus-$(CONFIG_HYPERV_TESTING)	+= hv_debugfs.o
 hv_utils-y := hv_util.o hv_kvp.o hv_snapshot.o hv_fcopy.o hv_utils_transport.o
 
-mshv-y				+= mshv_main.o hv_call.o
+mshv-y				+= mshv_main.o hv_call.o hv_synic.o
 
diff --git a/drivers/hv/hv_synic.c b/drivers/hv/hv_synic.c
new file mode 100644
index 000000000000..c6546ae54ea9
--- /dev/null
+++ b/drivers/hv/hv_synic.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2021, Microsoft Corporation.
+ *
+ * Authors:
+ *   Nuno Das Neves <nudasnev@microsoft.com>
+ *   Lillian Grassin-Drake <ligrassi@microsoft.com>
+ *   Vineeth Remanan Pillai <viremana@linux.microsoft.com>
+ */
+
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/io.h>
+#include <linux/mshv.h>
+#include <asm/mshyperv.h>
+
+#include "mshv.h"
+
+int mshv_synic_init(unsigned int cpu)
+{
+	union hv_synic_simp simp;
+	union hv_synic_sint sint;
+	union hv_synic_scontrol sctrl;
+	struct hv_message_page **msg_page =
+			this_cpu_ptr(mshv.synic_message_page);
+
+	/* Setup the Synic's message page */
+	simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
+	simp.simp_enabled = true;
+	*msg_page = memremap(simp.base_simp_gpa << HV_HYP_PAGE_SHIFT,
+			     HV_HYP_PAGE_SIZE,
+			     MEMREMAP_WB);
+	if (!(*msg_page)) {
+		pr_err("%s: memremap failed\n", __func__);
+		return -EFAULT;
+	}
+	hv_set_register(HV_REGISTER_SIMP, simp.as_uint64);
+
+	/* Enable intercepts */
+	sint.as_uint64 = 0;
+	sint.vector = HYPERVISOR_CALLBACK_VECTOR;
+	sint.masked = false;
+#ifdef HV_DEPRECATING_AEOI_RECOMMENDED
+	sint.auto_eoi = !(ms_hyperv.hints & HV_DEPRECATING_AEOI_RECOMMENDED);
+#else
+	sint.auto_eoi = 0;
+#endif
+	hv_set_register(HV_REGISTER_SINT0 + HV_SYNIC_INTERCEPTION_SINT_INDEX,
+			sint.as_uint64);
+
+	/* Enable global synic bit */
+	sctrl.as_uint64 = hv_get_register(HV_REGISTER_SCONTROL);
+	sctrl.enable = 1;
+	hv_set_register(HV_REGISTER_SCONTROL, sctrl.as_uint64);
+
+	return 0;
+}
+
+int mshv_synic_cleanup(unsigned int cpu)
+{
+	union hv_synic_sint sint;
+	union hv_synic_simp simp;
+	union hv_synic_scontrol sctrl;
+	struct hv_message_page **msg_page =
+			this_cpu_ptr(mshv.synic_message_page);
+
+	/* Disable the interrupt */
+	sint.as_uint64 = hv_get_register(HV_REGISTER_SINT0 + HV_SYNIC_INTERCEPTION_SINT_INDEX);
+	sint.masked = true;
+	hv_set_register(HV_REGISTER_SINT0 + HV_SYNIC_INTERCEPTION_SINT_INDEX,
+			sint.as_uint64);
+
+	/* Disable Synic's message page */
+	simp.as_uint64 = hv_get_register(HV_REGISTER_SIMP);
+	simp.simp_enabled = false;
+	hv_set_register(HV_REGISTER_SIMP, simp.as_uint64);
+	memunmap(*msg_page);
+
+	/* Disable global synic bit */
+	sctrl.as_uint64 = hv_get_register(HV_REGISTER_SCONTROL);
+	sctrl.enable = 0;
+	hv_set_register(HV_REGISTER_SCONTROL, sctrl.as_uint64);
+
+	return 0;
+}
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
index 9e63d2fabc74..b8fece9fe80d 100644
--- a/drivers/hv/mshv.h
+++ b/drivers/hv/mshv.h
@@ -27,6 +27,11 @@
 	((HV_HYP_PAGE_SIZE - sizeof(struct hv_set_vp_registers)) \
 		/ sizeof(struct hv_register_assoc))
 
+extern struct mshv mshv;
+
+int mshv_synic_init(unsigned int cpu);
+int mshv_synic_cleanup(unsigned int cpu);
+
 /*
  * Hyper-V hypercalls
  */
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index f66644d0dca5..1b32cf7ad9f3 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -15,6 +15,8 @@
 #include <linux/file.h>
 #include <linux/anon_inodes.h>
 #include <linux/mm.h>
+#include <linux/io.h>
+#include <linux/cpuhotplug.h>
 #include <linux/mshv.h>
 #include <asm/mshyperv.h>
 
@@ -648,6 +650,8 @@ mshv_dev_release(struct inode *inode, struct file *filp)
 	return 0;
 }
 
+static int mshv_cpuhp_online;
+
 static int
 __init mshv_init(void)
 {
@@ -657,17 +661,39 @@ __init mshv_init(void)
 		return -ENODEV;
 
 	ret = misc_register(&mshv_dev);
-	if (ret)
+	if (ret) {
 		pr_err("%s: misc device register failed\n", __func__);
+		return ret;
+	}
+
+	mshv.synic_message_page = alloc_percpu(struct hv_message_page *);
+	if (!mshv.synic_message_page) {
+		pr_err("%s: failed to allocate percpu synic page\n", __func__);
+		misc_deregister(&mshv_dev);
+		return -ENOMEM;
+	}
 
+	ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mshv_synic",
+				mshv_synic_init,
+				mshv_synic_cleanup);
+	if (ret < 0) {
+		pr_err("%s: failed to setup cpu hotplug state: %i\n",
+		       __func__, ret);
+		return ret;
+	}
+
+	mshv_cpuhp_online = ret;
 	spin_lock_init(&mshv.partitions.lock);
 
-	return ret;
+	return 0;
 }
 
 static void
 __exit mshv_exit(void)
 {
+	cpuhp_remove_state(mshv_cpuhp_online);
+	free_percpu(mshv.synic_message_page);
+
 	misc_deregister(&mshv_dev);
 }
 
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 8679c39181a2..ace8fca88f66 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -241,6 +241,8 @@ enum hv_status {
 /* Valid SynIC vectors are 16-255. */
 #define HV_SYNIC_FIRST_VALID_VECTOR	(16)
 
+#define HV_SYNIC_INTERCEPTION_SINT_INDEX 0x00000000
+
 #define HV_SYNIC_CONTROL_ENABLE		(1ULL << 0)
 #define HV_SYNIC_SIMP_ENABLE		(1ULL << 0)
 #define HV_SYNIC_SIEFP_ENABLE		(1ULL << 0)
@@ -250,84 +252,6 @@ enum hv_status {
 
 #define HV_SYNIC_STIMER_COUNT		(4)
 
-/* Define synthetic interrupt controller message constants. */
-#define HV_MESSAGE_SIZE			(256)
-#define HV_MESSAGE_PAYLOAD_BYTE_COUNT	(240)
-#define HV_MESSAGE_PAYLOAD_QWORD_COUNT	(30)
-
-/*
- * Define hypervisor message types. Some of the message types
- * are x86/x64 specific, but there's no good way to separate
- * them out into the arch-specific version of hyperv-tlfs.h
- * because C doesn't provide a way to extend enum types.
- * Keeping them all in the arch neutral hyperv-tlfs.h seems
- * the least messy compromise.
- */
-enum hv_message_type {
-	HVMSG_NONE			= 0x00000000,
-
-	/* Memory access messages. */
-	HVMSG_UNMAPPED_GPA		= 0x80000000,
-	HVMSG_GPA_INTERCEPT		= 0x80000001,
-
-	/* Timer notification messages. */
-	HVMSG_TIMER_EXPIRED		= 0x80000010,
-
-	/* Error messages. */
-	HVMSG_INVALID_VP_REGISTER_VALUE	= 0x80000020,
-	HVMSG_UNRECOVERABLE_EXCEPTION	= 0x80000021,
-	HVMSG_UNSUPPORTED_FEATURE	= 0x80000022,
-
-	/* Trace buffer complete messages. */
-	HVMSG_EVENTLOG_BUFFERCOMPLETE	= 0x80000040,
-
-	/* Platform-specific processor intercept messages. */
-	HVMSG_X64_IOPORT_INTERCEPT	= 0x80010000,
-	HVMSG_X64_MSR_INTERCEPT		= 0x80010001,
-	HVMSG_X64_CPUID_INTERCEPT	= 0x80010002,
-	HVMSG_X64_EXCEPTION_INTERCEPT	= 0x80010003,
-	HVMSG_X64_APIC_EOI		= 0x80010004,
-	HVMSG_X64_LEGACY_FP_ERROR	= 0x80010005
-};
-
-/* Define synthetic interrupt controller message flags. */
-union hv_message_flags {
-	__u8 asu8;
-	struct {
-		__u8 msg_pending:1;
-		__u8 reserved:7;
-	} __packed;
-};
-
-/* Define port identifier type. */
-union hv_port_id {
-	__u32 asu32;
-	struct {
-		__u32 id:24;
-		__u32 reserved:8;
-	} __packed u;
-};
-
-/* Define synthetic interrupt controller message header. */
-struct hv_message_header {
-	__u32 message_type;
-	__u8 payload_size;
-	union hv_message_flags message_flags;
-	__u8 reserved[2];
-	union {
-		__u64 sender;
-		union hv_port_id port;
-	};
-} __packed;
-
-/* Define synthetic interrupt controller message format. */
-struct hv_message {
-	struct hv_message_header header;
-	union {
-		__u64 payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
-	} u;
-} __packed;
-
 /* Define the synthetic interrupt message page layout. */
 struct hv_message_page {
 	struct hv_message sint_message[HV_SYNIC_SINT_COUNT];
@@ -341,7 +265,6 @@ struct hv_timer_message_payload {
 	__u64 delivery_time;	/* When the message was delivered */
 } __packed;
 
-
 /* Define synthetic interrupt controller flag constants. */
 #define HV_EVENT_FLAGS_COUNT		(256 * 8)
 #define HV_EVENT_FLAGS_LONG_COUNT	(256 / sizeof(unsigned long))
diff --git a/include/linux/mshv.h b/include/linux/mshv.h
index dfe469f573f9..7709aaa1e064 100644
--- a/include/linux/mshv.h
+++ b/include/linux/mshv.h
@@ -42,6 +42,7 @@ struct mshv_partition {
 };
 
 struct mshv {
+	struct hv_message_page __percpu **synic_message_page;
 	struct {
 		spinlock_t lock;
 		u64 count;
diff --git a/include/uapi/asm-generic/hyperv-tlfs.h b/include/uapi/asm-generic/hyperv-tlfs.h
index f49099d1f894..4ecb29fe1a0e 100644
--- a/include/uapi/asm-generic/hyperv-tlfs.h
+++ b/include/uapi/asm-generic/hyperv-tlfs.h
@@ -6,6 +6,81 @@
 #define BIT(X)	(1ULL << (X))
 #endif
 
+/* Define synthetic interrupt controller message constants. */
+#define HV_MESSAGE_SIZE			(256)
+#define HV_MESSAGE_PAYLOAD_BYTE_COUNT	(240)
+#define HV_MESSAGE_PAYLOAD_QWORD_COUNT	(30)
+
+/* Define hypervisor message types. */
+enum hv_message_type {
+	HVMSG_NONE				= 0x00000000,
+
+	/* Memory access messages. */
+	HVMSG_UNMAPPED_GPA			= 0x80000000,
+	HVMSG_GPA_INTERCEPT			= 0x80000001,
+
+	/* Timer notification messages. */
+	HVMSG_TIMER_EXPIRED			= 0x80000010,
+
+	/* Error messages. */
+	HVMSG_INVALID_VP_REGISTER_VALUE		= 0x80000020,
+	HVMSG_UNRECOVERABLE_EXCEPTION		= 0x80000021,
+	HVMSG_UNSUPPORTED_FEATURE		= 0x80000022,
+
+	/* Trace buffer complete messages. */
+	HVMSG_EVENTLOG_BUFFERCOMPLETE		= 0x80000040,
+
+	/* Platform-specific processor intercept messages. */
+	HVMSG_X64_IO_PORT_INTERCEPT		= 0x80010000,
+	HVMSG_X64_MSR_INTERCEPT			= 0x80010001,
+	HVMSG_X64_CPUID_INTERCEPT		= 0x80010002,
+	HVMSG_X64_EXCEPTION_INTERCEPT		= 0x80010003,
+	HVMSG_X64_APIC_EOI			= 0x80010004,
+	HVMSG_X64_LEGACY_FP_ERROR		= 0x80010005,
+	HVMSG_X64_IOMMU_PRQ			= 0x80010006,
+	HVMSG_X64_HALT				= 0x80010007,
+	HVMSG_X64_INTERRUPTION_DELIVERABLE	= 0x80010008,
+	HVMSG_X64_SIPI_INTERCEPT		= 0x80010009,
+};
+
+/* Define synthetic interrupt controller message flags. */
+union hv_message_flags {
+	__u8 asu8;
+	struct {
+		__u8 msg_pending:1;
+		__u8 reserved:7;
+	} __packed;
+};
+
+/* Define port identifier type. */
+union hv_port_id {
+	__u32 asu32;
+	struct {
+		__u32 id:24;
+		__u32 reserved:8;
+	} __packed u;
+};
+
+/* Define synthetic interrupt controller message header. */
+struct hv_message_header {
+	__u32 message_type;
+	__u8 payload_size;
+	union hv_message_flags message_flags;
+	__u8 reserved[2];
+	union {
+		__u64 sender;
+		union hv_port_id port;
+	};
+} __packed;
+
+/* Define synthetic interrupt controller message format. */
+struct hv_message {
+	struct hv_message_header header;
+	union {
+		__u64 payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
+	} u;
+} __packed;
+
 /* Userspace-visible partition creation flags */
 #define HV_PARTITION_CREATION_FLAG_SMT_ENABLED_GUEST                BIT(0)
 #define HV_PARTITION_CREATION_FLAG_GPA_LARGE_PAGES_DISABLED         BIT(3)
-- 
2.23.4



* [PATCH v3 12/19] drivers/hv: run vp ioctl and isr
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (10 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 11/19] drivers/hv: set up synic pages for intercept messages Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 13/19] drivers/hv: install intercept ioctl Nuno Das Neves
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Introduce an ioctl for running a vp and an isr to copy messages from
the synic page to the vp data structure.

Add synchronization primitives to ensure that the isr is finished
when the run vp ioctl is entered.

Co-developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Vineeth Pillai <viremana@linux.microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
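The entry checks in mshv_vp_ioctl_run_vp() can be summarised by a small
userspace model. This is only a sketch of the decision made from the two
suspend registers; the enum and function names are hypothetical and do not
appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; the patch open-codes these checks. */
enum run_vp_action {
	RUN_VP_REJECT,		/* vp is not suspended: fail with -EBADFD */
	RUN_VP_DELIVER_MISSED,	/* wait on run.sem, copy out the saved message */
	RUN_VP_START,		/* clear both suspend bits and sleep */
};

/* Model of the checks done on the two suspend bits at ioctl entry. */
enum run_vp_action run_vp_entry_action(bool explicit_suspended,
				       bool intercept_suspended)
{
	if (!explicit_suspended && !intercept_suspended)
		return RUN_VP_REJECT;
	if (intercept_suspended)
		return RUN_VP_DELIVER_MISSED;
	return RUN_VP_START;
}
```

The intercept-suspended case is the one reached when an earlier MSHV_RUN_VP
call was interrupted by a signal before it could return the message.
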
 Documentation/virt/mshv/api.rst |  14 +++
 arch/x86/kernel/cpu/mshyperv.c  |  16 +++
 drivers/hv/hv_synic.c           |  96 ++++++++++++++++++
 drivers/hv/mshv.h               |   1 +
 drivers/hv/mshv_main.c          | 173 +++++++++++++++++++++++++++++++-
 include/asm-generic/mshyperv.h  |   3 +
 include/linux/mshv.h            |   7 ++
 include/uapi/linux/mshv.h       |   1 +
 8 files changed, 310 insertions(+), 1 deletion(-)
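
A caller of MSHV_RUN_VP would typically dispatch on header.message_type of
the returned struct hv_message. The sketch below uses simplified local copies
of the uapi structures (the unions are flattened) and a hypothetical helper,
hvmsg_name(), that is not part of this series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HV_MESSAGE_SIZE			256
#define HV_MESSAGE_PAYLOAD_QWORD_COUNT	30

/* Simplified local copies: the unions in the uapi header are flattened. */
struct hv_message_header {
	uint32_t message_type;
	uint8_t payload_size;
	uint8_t message_flags;
	uint8_t reserved[2];
	uint64_t sender;	/* mshv_isr() reads the partition id from here */
} __attribute__((packed));

struct hv_message {
	struct hv_message_header header;
	uint64_t payload[HV_MESSAGE_PAYLOAD_QWORD_COUNT];
} __attribute__((packed));

_Static_assert(sizeof(struct hv_message) == HV_MESSAGE_SIZE,
	       "hv_message must fill one 256-byte message slot");

/* Hypothetical helper: name some of the message types from the enum. */
const char *hvmsg_name(uint32_t type)
{
	switch (type) {
	case 0x00000000: return "HVMSG_NONE";
	case 0x80000000: return "HVMSG_UNMAPPED_GPA";
	case 0x80000001: return "HVMSG_GPA_INTERCEPT";
	case 0x80010000: return "HVMSG_X64_IO_PORT_INTERCEPT";
	case 0x80010001: return "HVMSG_X64_MSR_INTERCEPT";
	case 0x80010002: return "HVMSG_X64_CPUID_INTERCEPT";
	case 0x80010007: return "HVMSG_X64_HALT";
	default:	 return "unknown";
	}
}
```

The numeric values are copied from enum hv_message_type in
include/uapi/asm-generic/hyperv-tlfs.h as moved by the previous patch.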

diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
index f0631236c063..9deddcd7de54 100644
--- a/Documentation/virt/mshv/api.rst
+++ b/Documentation/virt/mshv/api.rst
@@ -106,4 +106,18 @@ Get/set vp registers. See asm/hyperv-tlfs.h for the complete set of registers.
 Includes general purpose platform registers, MSRs, and virtual registers that
 are part of Microsoft Hypervisor platform and not directly exposed to the guest.
 
+3.6 MSHV_RUN_VP
+---------------
+:Type: vp ioctl
+:Parameters: struct hv_message
+:Returns: 0 on success
+
+Run the vp, returning when it triggers an intercept, or if the calling thread
+is interrupted by a signal. In the latter case, errno is set to EINTR.
+
+On return, the vp will be suspended.
+This ioctl will fail on any vp that's already running (not suspended).
+
+Information about the intercept is returned in the hv_message struct.
+
 
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 22f13343b5da..3efc4c3dbb7e 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -41,6 +41,7 @@ struct ms_hyperv_info ms_hyperv;
 EXPORT_SYMBOL_GPL(ms_hyperv);
 
 #if IS_ENABLED(CONFIG_HYPERV)
+static void (*mshv_handler)(void);
 static void (*vmbus_handler)(void);
 static void (*hv_stimer0_handler)(void);
 static void (*hv_kexec_handler)(void);
@@ -51,6 +52,9 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_hyperv_callback)
 	struct pt_regs *old_regs = set_irq_regs(regs);
 
 	inc_irq_stat(irq_hv_callback_count);
+	if (mshv_handler)
+		mshv_handler();
+
 	if (vmbus_handler)
 		vmbus_handler();
 
@@ -60,6 +64,18 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_hyperv_callback)
 	set_irq_regs(old_regs);
 }
 
+void hv_setup_mshv_irq(void (*handler)(void))
+{
+	mshv_handler = handler;
+}
+EXPORT_SYMBOL_GPL(hv_setup_mshv_irq);
+
+void hv_remove_mshv_irq(void)
+{
+	mshv_handler = NULL;
+}
+EXPORT_SYMBOL_GPL(hv_remove_mshv_irq);
+
 void hv_setup_vmbus_handler(void (*handler)(void))
 {
 	vmbus_handler = handler;
diff --git a/drivers/hv/hv_synic.c b/drivers/hv/hv_synic.c
index c6546ae54ea9..27d52c480043 100644
--- a/drivers/hv/hv_synic.c
+++ b/drivers/hv/hv_synic.c
@@ -11,11 +11,107 @@
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/io.h>
+#include <linux/random.h>
 #include <linux/mshv.h>
 #include <asm/mshyperv.h>
 
 #include "mshv.h"
 
+void mshv_isr(void)
+{
+	struct hv_message_page **msg_page =
+			this_cpu_ptr(mshv.synic_message_page);
+	struct hv_message *msg;
+	u32 message_type;
+	struct mshv_partition *partition;
+	struct mshv_vp *vp;
+	u64 partition_id;
+	u32 vp_index;
+	int i;
+	unsigned long flags;
+	struct task_struct *task;
+
+	if (unlikely(!(*msg_page))) {
+		pr_err("%s: Missing synic page!\n", __func__);
+		return;
+	}
+
+	msg = &((*msg_page)->sint_message[HV_SYNIC_INTERCEPTION_SINT_INDEX]);
+
+	/*
+	 * If the type isn't set, there isn't really a message;
+	 * it may be some other hyperv interrupt
+	 */
+	message_type = msg->header.message_type;
+	if (message_type == HVMSG_NONE)
+		return;
+
+	/* Look for the partition */
+	partition_id = msg->header.sender;
+
+	/*
+	 * Hold this lock for the rest of the isr, because the partition could
+	 * be released at any time, e.g. the MSHV_RUN_VP thread could wake on
+	 * another cpu and release the partition unless we hold this lock.
+	 */
+	spin_lock_irqsave(&mshv.partitions.lock, flags);
+
+	for (i = 0; i < MSHV_MAX_PARTITIONS; i++) {
+		partition = mshv.partitions.array[i];
+		if (partition && partition->id == partition_id)
+			break;
+	}
+
+	if (unlikely(i == MSHV_MAX_PARTITIONS)) {
+		pr_err("%s: failed to find partition\n", __func__);
+		goto unlock_out;
+	}
+
+	/*
+	 * Since we directly index the vp, and it has to exist for us to be here
+	 * (because the vp is only deleted when the partition is), no additional
+	 * locking is needed here
+	 */
+	vp_index = ((struct hv_x64_intercept_message_header *)msg->u.payload)->vp_index;
+	vp = partition->vps.array[vp_index];
+	if (unlikely(!vp)) {
+		pr_err("%s: failed to find vp\n", __func__);
+		goto unlock_out;
+	}
+
+	memcpy(vp->run.intercept_message, msg, sizeof(struct hv_message));
+
+	if (unlikely(!vp->run.task)) {
+		pr_err("%s: vp run task not set\n", __func__);
+		goto unlock_out;
+	}
+
+	/* Save the task and reset it so we can wake without racing */
+	task = vp->run.task;
+	vp->run.task = NULL;
+
+	/*
+	 * up the semaphore before waking so that we don't race with
+	 * down_trylock
+	 */
+	up(&vp->run.sem);
+
+	/*
+	 * Finally, wake the process. If it wakes the vp and generates
+	 * another intercept then the message will be queued by the hypervisor
+	 */
+	wake_up_process(task);
+
+unlock_out:
+	spin_unlock_irqrestore(&mshv.partitions.lock, flags);
+
+	/* Acknowledge message with hypervisor */
+	msg->header.message_type = HVMSG_NONE;
+	wrmsrl(HV_X64_MSR_EOM, 0);
+
+	add_interrupt_randomness(HYPERVISOR_CALLBACK_VECTOR, 0);
+}
+
 int mshv_synic_init(unsigned int cpu)
 {
 	union hv_synic_simp simp;
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
index b8fece9fe80d..014352a11190 100644
--- a/drivers/hv/mshv.h
+++ b/drivers/hv/mshv.h
@@ -29,6 +29,7 @@
 
 extern struct mshv mshv;
 
+void mshv_isr(void);
 int mshv_synic_init(unsigned int cpu);
 int mshv_synic_cleanup(unsigned int cpu);
 
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index 1b32cf7ad9f3..390ccf893dd1 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -17,6 +17,7 @@
 #include <linux/mm.h>
 #include <linux/io.h>
 #include <linux/cpuhotplug.h>
+#include <linux/random.h>
 #include <linux/mshv.h>
 #include <asm/mshyperv.h>
 
@@ -64,6 +65,141 @@ static struct miscdevice mshv_dev = {
 	.mode = 0600,
 };
 
+static long
+mshv_vp_ioctl_run_vp(struct mshv_vp *vp, void __user *ret_message)
+{
+	long ret;
+	u32 msg_type;
+	struct hv_register_assoc suspend_registers[2] = {
+		{ .name = HV_REGISTER_EXPLICIT_SUSPEND },
+		{ .name = HV_REGISTER_INTERCEPT_SUSPEND }
+	};
+	/* Pointers to values for convenience */
+	union hv_explicit_suspend_register *explicit_suspend =
+				&suspend_registers[0].value.explicit_suspend;
+	union hv_intercept_suspend_register *intercept_suspend =
+				&suspend_registers[1].value.intercept_suspend;
+
+	/* Check that the VP is suspended */
+	ret = hv_call_get_vp_registers(
+			vp->index,
+			vp->partition->id,
+			2,
+			suspend_registers);
+	if (ret)
+		return ret;
+
+	if (!explicit_suspend->suspended &&
+	    !intercept_suspend->suspended) {
+		pr_err("%s: vp not suspended!\n", __func__);
+		return -EBADFD;
+	}
+
+	/*
+	 * If intercept_suspend is set, we missed a message and need to
+	 * wait for mshv_isr to complete
+	 */
+	if (intercept_suspend->suspended) {
+		if (down_interruptible(&vp->run.sem))
+			return -EINTR;
+		if (copy_to_user(ret_message, vp->run.intercept_message,
+				 sizeof(struct hv_message)))
+			return -EFAULT;
+		intercept_suspend->suspended = 0;
+		explicit_suspend->suspended = 1;
+		ret = hv_call_set_vp_registers(
+				vp->index,
+				vp->partition->id,
+				2,
+				suspend_registers);
+		if (ret) {
+			pr_err("%s: failed to set suspend bits\n", __func__);
+			return ret;
+		}
+		return 0;
+	}
+
+	/*
+	 * At this point the semaphore ensures that mshv_isr is done,
+	 * and the mutex ensures that no other threads are touching this vp
+	 */
+	vp->run.task = current;
+	set_current_state(TASK_INTERRUPTIBLE);
+
+	/* Now actually start the vp running */
+	explicit_suspend->suspended = 0;
+	intercept_suspend->suspended = 0;
+	ret = hv_call_set_vp_registers(
+			vp->index,
+			vp->partition->id,
+			2,
+			suspend_registers);
+	if (ret) {
+		pr_err("%s: failed to clear suspend bits\n", __func__);
+		set_current_state(TASK_RUNNING);
+		vp->run.task = NULL;
+		return ret;
+	}
+
+	schedule();
+
+	/* Explicitly suspend the vp to make sure it's stopped */
+	explicit_suspend->suspended = 1;
+	ret = hv_call_set_vp_registers(
+		vp->index,
+		vp->partition->id,
+		1,
+		&suspend_registers[0]);
+	if (ret) {
+		pr_err("%s: failed to set explicit suspend bit\n", __func__);
+		return -EBADFD;
+	}
+
+	/*
+	 * Check if woken up by a signal
+	 * Note that if the signal came after being woken by mshv_isr(),
+	 * we will still get the message correctly on re-entry
+	 */
+	if (signal_pending(current)) {
+		pr_debug("%s: woke up, received signal\n", __func__);
+		return -EINTR;
+	}
+
+	/*
+	 * No signal pending, so we were woken by mshv_isr()
+	 * The isr can't be running now, and the intercept_suspend bit is set
+	 * We use it as a flag to tell if we missed a message due to a signal,
+	 * so we must clear it here and reset the semaphore
+	 */
+	intercept_suspend->suspended = 0;
+	ret = hv_call_set_vp_registers(
+		vp->index,
+		vp->partition->id,
+		1,
+		&suspend_registers[1]);
+	if (ret) {
+		pr_err("%s: failed to clear intercept suspend bit\n", __func__);
+		return -EBADFD;
+	}
+	if (down_trylock(&vp->run.sem)) {
+		pr_err("%s: semaphore in unexpected state\n", __func__);
+		return -EBADFD;
+	}
+
+	msg_type = vp->run.intercept_message->header.message_type;
+
+	if (msg_type == HVMSG_NONE) {
+		pr_err("%s: woke up, but no message\n", __func__);
+		return -ENOMSG;
+	}
+
+	if (copy_to_user(ret_message, vp->run.intercept_message,
+			 sizeof(struct hv_message)))
+		return -EFAULT;
+
+	return 0;
+}
+
 static long
 mshv_vp_ioctl_get_regs(struct mshv_vp *vp, void __user *user_args)
 {
@@ -110,6 +246,7 @@ mshv_vp_ioctl_set_regs(struct mshv_vp *vp, void __user *user_args)
 	struct mshv_vp_registers args;
 	struct hv_register_assoc *registers;
 	long ret;
+	int i;
 
 	if (copy_from_user(&args, user_args, sizeof(args)))
 		return -EFAULT;
@@ -129,6 +266,20 @@ mshv_vp_ioctl_set_regs(struct mshv_vp *vp, void __user *user_args)
 		goto free_return;
 	}
 
+	for (i = 0; i < args.count; i++) {
+		/*
+		 * Disallow setting suspend registers to ensure run vp state
+		 * is consistent
+		 */
+		if (registers[i].name == HV_REGISTER_EXPLICIT_SUSPEND ||
+		    registers[i].name == HV_REGISTER_INTERCEPT_SUSPEND) {
+			pr_err("%s: not allowed to set suspend registers\n",
+			       __func__);
+			ret = -EINVAL;
+			goto free_return;
+		}
+	}
+
 	ret = hv_call_set_vp_registers(vp->index, vp->partition->id,
 				       args.count, registers);
 
@@ -147,6 +298,9 @@ mshv_vp_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 		return -EINTR;
 
 	switch (ioctl) {
+	case MSHV_RUN_VP:
+		r = mshv_vp_ioctl_run_vp(vp, (void __user *)arg);
+		break;
 	case MSHV_GET_VP_REGISTERS:
 		r = mshv_vp_ioctl_get_regs(vp, (void __user *)arg);
 		break;
@@ -197,12 +351,20 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
 		return -ENOMEM;
 
 	mutex_init(&vp->mutex);
+	sema_init(&vp->run.sem, 0);
+
+	vp->run.intercept_message =
+		(struct hv_message *)get_zeroed_page(GFP_KERNEL);
+	if (!vp->run.intercept_message) {
+		ret = -ENOMEM;
+		goto free_vp;
+	}
 
 	vp->index = args.vp_index;
 	vp->partition = mshv_partition_get(partition);
 	if (!vp->partition) {
 		ret = -EBADF;
-		goto free_vp;
+		goto free_message;
 	}
 
 	fd = get_unused_fd_flags(O_CLOEXEC);
@@ -240,6 +402,8 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
 	put_unused_fd(fd);
 put_partition:
 	mshv_partition_put(partition);
+free_message:
+	free_page((unsigned long)vp->run.intercept_message);
 free_vp:
 	kfree(vp);
 
@@ -456,6 +620,9 @@ destroy_partition(struct mshv_partition *partition)
 		mshv.partitions.array[i] = NULL;
 	}
 
+	if (!mshv.partitions.count)
+		hv_remove_mshv_irq();
+
 	spin_unlock_irqrestore(&mshv.partitions.lock, flags);
 
 	/*
@@ -474,6 +641,7 @@ destroy_partition(struct mshv_partition *partition)
 		vp = partition->vps.array[i];
 		if (!vp)
 			continue;
+		free_page((unsigned long)vp->run.intercept_message);
 		kfree(vp);
 	}
 
@@ -537,6 +705,9 @@ add_partition(struct mshv_partition *partition)
 	mshv.partitions.count++;
 	mshv.partitions.array[i] = partition;
 
+	if (mshv.partitions.count == 1)
+		hv_setup_mshv_irq(mshv_isr);
+
 out_unlock:
 	spin_unlock_irqrestore(&mshv.partitions.lock, flags);
 
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h
index 672b08f79dae..36412d27ecc7 100644
--- a/include/asm-generic/mshyperv.h
+++ b/include/asm-generic/mshyperv.h
@@ -143,6 +143,9 @@ void hv_remove_vmbus_handler(void);
 void hv_setup_stimer0_handler(void (*handler)(void));
 void hv_remove_stimer0_handler(void);
 
+void hv_setup_mshv_irq(void (*handler)(void));
+void hv_remove_mshv_irq(void);
+
 void hv_setup_kexec_handler(void (*handler)(void));
 void hv_remove_kexec_handler(void);
 void hv_setup_crash_handler(void (*handler)(struct pt_regs *regs));
diff --git a/include/linux/mshv.h b/include/linux/mshv.h
index 7709aaa1e064..3933d80294f1 100644
--- a/include/linux/mshv.h
+++ b/include/linux/mshv.h
@@ -8,6 +8,8 @@
 
 #include <linux/spinlock.h>
 #include <linux/mutex.h>
+#include <linux/semaphore.h>
+#include <linux/sched.h>
 #include <uapi/linux/mshv.h>
 
 #define MSHV_MAX_PARTITIONS		128
@@ -18,6 +20,11 @@ struct mshv_vp {
 	u32 index;
 	struct mshv_partition *partition;
 	struct mutex mutex;
+	struct {
+		struct semaphore sem;
+		struct task_struct *task;
+		struct hv_message *intercept_message;
+	} run;
 };
 
 struct mshv_mem_region {
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index 7a4e0c340dd4..229abac9502f 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -55,5 +55,6 @@ struct mshv_vp_registers {
 /* vp device */
 #define MSHV_GET_VP_REGISTERS   _IOWR(MSHV_IOCTL, 0x05, struct mshv_vp_registers)
 #define MSHV_SET_VP_REGISTERS   _IOW(MSHV_IOCTL, 0x06, struct mshv_vp_registers)
+#define MSHV_RUN_VP		_IOR(MSHV_IOCTL, 0x07, struct hv_message)
 
 #endif
-- 
2.23.4



* [PATCH v3 13/19] drivers/hv: install intercept ioctl
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (11 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 12/19] drivers/hv: run vp ioctl and isr Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 14/19] drivers/hv: assert interrupt ioctl Nuno Das Neves
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Introduce ioctl for configuring intercept messages from a guest partition.

Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
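As an illustration of how userspace might fill in the ioctl argument, here is
a minimal sketch that builds a request for an I/O port intercept. The
structures are local copies of the ones added by this patch, and
make_io_port_intercept() is a hypothetical helper, not part of the series.
The choice of read and write access for an I/O port is an assumption made for
the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local copies of a subset of the definitions added by this patch. */
enum hv_intercept_type {
	HV_INTERCEPT_TYPE_X64_IO_PORT	= 0x00000000,
	HV_INTERCEPT_TYPE_X64_MSR	= 0x00000001,
	HV_INTERCEPT_TYPE_X64_CPUID	= 0x00000002,
};

union hv_intercept_parameters {
	uint64_t as_uint64;
	uint16_t io_port;
	uint32_t cpuid_index;
	uint32_t apic_write_mask;
	uint16_t exception_vector;
};

#define HV_INTERCEPT_ACCESS_MASK_READ	0x01
#define HV_INTERCEPT_ACCESS_MASK_WRITE	0x02

struct mshv_install_intercept {
	uint32_t access_type_mask;
	enum hv_intercept_type intercept_type;
	union hv_intercept_parameters intercept_parameter;
};

/* Hypothetical helper: intercept guest reads and writes of one I/O port. */
struct mshv_install_intercept make_io_port_intercept(uint16_t port)
{
	struct mshv_install_intercept args;

	/* Zero first: io_port covers only 2 of the union's 8 bytes. */
	memset(&args, 0, sizeof(args));
	args.access_type_mask = HV_INTERCEPT_ACCESS_MASK_READ |
				HV_INTERCEPT_ACCESS_MASK_WRITE;
	args.intercept_type = HV_INTERCEPT_TYPE_X64_IO_PORT;
	args.intercept_parameter.io_port = port;
	return args;
}
```

The result would then be passed to ioctl() with MSHV_INSTALL_INTERCEPT on a
partition fd, assuming one has already been obtained from
MSHV_CREATE_PARTITION.
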
 Documentation/virt/mshv/api.rst         |  9 ++++++
 arch/x86/include/uapi/asm/hyperv-tlfs.h | 43 +++++++++++++++++++++++++
 drivers/hv/hv_call.c                    | 38 ++++++++++++++++++++++
 drivers/hv/mshv.h                       |  3 ++
 drivers/hv/mshv_main.c                  | 20 ++++++++++++
 include/asm-generic/hyperv-tlfs.h       |  8 +++++
 include/uapi/linux/mshv.h               |  7 ++++
 7 files changed, 128 insertions(+)
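
hv_call_install_intercept() retries the hypercall after depositing a page
whenever the hypervisor reports insufficient memory. The loop shape can be
sketched in userspace against a fake hypervisor. Everything named fake_* and
install_with_retry() is hypothetical, and the status values are local
stand-ins for the ones in hyperv-tlfs.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-ins for the hypervisor status codes used by the loop. */
#define HV_STATUS_SUCCESS		0
#define HV_STATUS_INSUFFICIENT_MEMORY	11

/* Fake hypervisor: the call fails until enough pages were deposited. */
struct fake_hv {
	int pages_needed;
	int pages_deposited;
};

uint64_t fake_hypercall(struct fake_hv *hv)
{
	return hv->pages_deposited >= hv->pages_needed ?
		HV_STATUS_SUCCESS : HV_STATUS_INSUFFICIENT_MEMORY;
}

int fake_deposit_page(struct fake_hv *hv)
{
	hv->pages_deposited++;
	return 0;
}

/* Same control flow as the do/while loop in hv_call_install_intercept(). */
int install_with_retry(struct fake_hv *hv)
{
	uint64_t status;
	int ret;

	do {
		status = fake_hypercall(hv);
		if (status != HV_STATUS_INSUFFICIENT_MEMORY) {
			/* Any other status ends the loop, success or not. */
			ret = status == HV_STATUS_SUCCESS ? 0 : -EIO;
			break;
		}
		/* Deposit one page and retry; a deposit error ends the loop. */
		ret = fake_deposit_page(hv);
	} while (!ret);

	return ret;
}
```

The loop only terminates on a status other than insufficient memory or on a
failed deposit, so the number of retries is bounded by how many pages the
hypervisor asks for.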

diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
index 9deddcd7de54..f0094258d834 100644
--- a/Documentation/virt/mshv/api.rst
+++ b/Documentation/virt/mshv/api.rst
@@ -120,4 +120,13 @@ This ioctl will fail on any vp that's already running (not suspended).
 
 Information about the intercept is returned in the hv_message struct.
 
+3.7 MSHV_INSTALL_INTERCEPT
+--------------------------
+:Type: partition ioctl
+:Parameters: struct mshv_install_intercept
+:Returns: 0 on success
+
+Enable and configure different types of intercepts. Intercepts are events in a
+guest partition that will suspend the guest vp and send a message to the root
+partition (returned from MSHV_RUN_VP).
 
diff --git a/arch/x86/include/uapi/asm/hyperv-tlfs.h b/arch/x86/include/uapi/asm/hyperv-tlfs.h
index 4ffa7e1cd185..442c4bb4113e 100644
--- a/arch/x86/include/uapi/asm/hyperv-tlfs.h
+++ b/arch/x86/include/uapi/asm/hyperv-tlfs.h
@@ -948,4 +948,47 @@ struct hv_x64_apic_eoi_message {
 	__u32 interrupt_vector;
 } __packed;
 
+enum hv_intercept_type {
+	HV_INTERCEPT_TYPE_X64_IO_PORT			= 0x00000000,
+	HV_INTERCEPT_TYPE_X64_MSR			= 0x00000001,
+	HV_INTERCEPT_TYPE_X64_CPUID			= 0x00000002,
+	HV_INTERCEPT_TYPE_EXCEPTION			= 0x00000003,
+	HV_INTERCEPT_TYPE_REGISTER			= 0x00000004,
+	HV_INTERCEPT_TYPE_MMIO				= 0x00000005,
+	HV_INTERCEPT_TYPE_X64_GLOBAL_CPUID		= 0x00000006,
+	HV_INTERCEPT_TYPE_X64_APIC_SMI			= 0x00000007,
+	HV_INTERCEPT_TYPE_HYPERCALL			= 0x00000008,
+	HV_INTERCEPT_TYPE_X64_APIC_INIT_SIPI		= 0x00000009,
+	HV_INTERCEPT_MC_UPDATE_PATCH_LEVEL_MSR_READ	= 0x0000000A,
+	HV_INTERCEPT_TYPE_X64_APIC_WRITE		= 0x0000000B,
+	HV_INTERCEPT_TYPE_MAX,
+	HV_INTERCEPT_TYPE_INVALID			= 0xFFFFFFFF,
+};
+
+union hv_intercept_parameters {
+	__u64 as_uint64;
+
+	/* hv_intercept_type_x64_io_port */
+	__u16 io_port;
+
+	/* hv_intercept_type_x64_cpuid */
+	__u32 cpuid_index;
+
+	/* hv_intercept_type_x64_apic_write */
+	__u32 apic_write_mask;
+
+	/* hv_intercept_type_exception */
+	__u16 exception_vector;
+
+	/* N.B. Other intercept types do not have any parameters. */
+};
+
+/* Access types for the install intercept hypercall parameter */
+#define HV_INTERCEPT_ACCESS_MASK_NONE		0x00
+#define HV_INTERCEPT_ACCESS_MASK_READ		0x01
+#define HV_INTERCEPT_ACCESS_MASK_WRITE		0x02
+#define HV_INTERCEPT_ACCESS_MASK_EXECUTE	0x04
+
+
+
 #endif
diff --git a/drivers/hv/hv_call.c b/drivers/hv/hv_call.c
index 37dcd6c636a7..ec71b5a08a76 100644
--- a/drivers/hv/hv_call.c
+++ b/drivers/hv/hv_call.c
@@ -394,3 +394,41 @@ int hv_call_set_vp_registers(
 	return hv_status_to_errno(status);
 }
 
+int hv_call_install_intercept(
+		u64 partition_id,
+		u32 access_type,
+		enum hv_intercept_type intercept_type,
+		union hv_intercept_parameters intercept_parameter)
+{
+	struct hv_install_intercept *input;
+	unsigned long flags;
+	u64 status;
+	int ret;
+
+	do {
+		local_irq_save(flags);
+		input = (struct hv_install_intercept *)(*this_cpu_ptr(
+					hyperv_pcpu_input_arg));
+		input->partition_id = partition_id;
+		input->access_type = access_type;
+		input->intercept_type = intercept_type;
+		input->intercept_parameter = intercept_parameter;
+		status = hv_do_hypercall(
+				HVCALL_INSTALL_INTERCEPT, input, NULL);
+
+		local_irq_restore(flags);
+		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
+			if (!hv_result_success(status))
+				pr_err("%s: %s\n", __func__,
+				       hv_status_to_string(status));
+			ret = hv_status_to_errno(status);
+			break;
+		}
+
+		ret = hv_call_deposit_pages(NUMA_NO_NODE, partition_id, 1);
+
+	} while (!ret);
+
+	return ret;
+}
+
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
index 014352a11190..541c83a36767 100644
--- a/drivers/hv/mshv.h
+++ b/drivers/hv/mshv.h
@@ -64,5 +64,8 @@ int hv_call_set_vp_registers(
 		u64 partition_id,
 		u16 count,
 		struct hv_register_assoc *registers);
+int hv_call_install_intercept(u64 partition_id, u32 access_type,
+		enum hv_intercept_type intercept_type,
+		union hv_intercept_parameters intercept_parameter);
 
 #endif /* _MSHV_H */
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index 390ccf893dd1..911dfc61e24c 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -567,6 +567,22 @@ mshv_partition_ioctl_unmap_memory(struct mshv_partition *partition,
 	return 0;
 }
 
+static long
+mshv_partition_ioctl_install_intercept(struct mshv_partition *partition,
+				       void __user *user_args)
+{
+	struct mshv_install_intercept args;
+
+	if (copy_from_user(&args, user_args, sizeof(args)))
+		return -EFAULT;
+
+	return hv_call_install_intercept(
+			partition->id,
+			args.access_type_mask,
+			args.intercept_type,
+			args.intercept_parameter);
+}
+
 static long
 mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
@@ -589,6 +605,10 @@ mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 		ret = mshv_partition_ioctl_create_vp(partition,
 							(void __user *)arg);
 		break;
+	case MSHV_INSTALL_INTERCEPT:
+		ret = mshv_partition_ioctl_install_intercept(partition,
+							(void __user *)arg);
+		break;
 	default:
 		ret = -ENOTTY;
 	}
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index ace8fca88f66..4453ba4d3293 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -152,6 +152,7 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_WITHDRAW_MEMORY			0x0049
 #define HVCALL_MAP_GPA_PAGES			0x004b
 #define HVCALL_UNMAP_GPA_PAGES			0x004c
+#define HVCALL_INSTALL_INTERCEPT		0x004d
 #define HVCALL_CREATE_VP			0x004e
 #define HVCALL_GET_VP_REGISTERS			0x0050
 #define HVCALL_SET_VP_REGISTERS			0x0051
@@ -813,4 +814,11 @@ struct hv_unmap_gpa_pages {
 	u32 padding;
 } __packed;
 
+struct hv_install_intercept {
+	u64 partition_id;
+	u32 access_type; /* mask */
+	u32 intercept_type;
+	union hv_intercept_parameters intercept_parameter;
+} __packed;
+
 #endif
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index 229abac9502f..8574a4e62715 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -41,6 +41,12 @@ struct mshv_vp_registers {
 	struct hv_register_assoc *regs;
 };
 
+struct mshv_install_intercept {
+	__u32 access_type_mask;
+	enum hv_intercept_type intercept_type;
+	union hv_intercept_parameters intercept_parameter;
+};
+
 #define MSHV_IOCTL 0xB8
 
 /* mshv device */
@@ -51,6 +57,7 @@ struct mshv_vp_registers {
 #define MSHV_MAP_GUEST_MEMORY	_IOW(MSHV_IOCTL, 0x02, struct mshv_user_mem_region)
 #define MSHV_UNMAP_GUEST_MEMORY	_IOW(MSHV_IOCTL, 0x03, struct mshv_user_mem_region)
 #define MSHV_CREATE_VP		_IOW(MSHV_IOCTL, 0x04, struct mshv_create_vp)
+#define MSHV_INSTALL_INTERCEPT	_IOW(MSHV_IOCTL, 0x08, struct mshv_install_intercept)
 
 /* vp device */
 #define MSHV_GET_VP_REGISTERS   _IOWR(MSHV_IOCTL, 0x05, struct mshv_vp_registers)
-- 
2.23.4



* [PATCH v3 14/19] drivers/hv: assert interrupt ioctl
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (12 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 13/19] drivers/hv: install intercept ioctl Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 15/19] drivers/hv: get and set vp state ioctls Nuno Das Neves
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Introduce ioctl for asserting an interrupt on a given APIC within a
guest partition.

Co-developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
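Illustration only: a minimal userspace sketch of how a VMM might fill the
MSHV_ASSERT_INTERRUPT argument. The types are local mirrors of the uapi
definitions added in this patch, so the snippet builds without the patched
headers. The helper name mshv_fill_fixed_interrupt() is hypothetical, and
treating logical_dest_mode == 0 as "physical destination" is an assumption
based on the usual APIC convention.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Local mirrors of the uapi types from this patch (assumption: real code
 * would include <asm/hyperv-tlfs.h> and <linux/mshv.h> instead).
 */
enum { HV_X64_INTERRUPT_TYPE_FIXED = 0x0000 };

union hv_interrupt_control {
	struct {
		uint32_t interrupt_type;	/* enum hv_interrupt_type */
		uint32_t level_triggered : 1;
		uint32_t logical_dest_mode : 1;
		uint32_t rsvd : 30;
	} __attribute__((packed));
	uint64_t as_uint64;
};

struct mshv_assert_interrupt {
	union hv_interrupt_control control;
	uint64_t dest_addr;	/* APIC id of the target vcpu */
	uint32_t vector;
};

_Static_assert(sizeof(union hv_interrupt_control) == 8,
	       "interrupt control must be 64 bits wide");

/* Hypothetical helper: fixed interrupt aimed at a single APIC id. */
static struct mshv_assert_interrupt
mshv_fill_fixed_interrupt(uint64_t apic_id, uint32_t vector, int level)
{
	struct mshv_assert_interrupt req;

	assert(vector <= 0xff);		/* x86 interrupt vectors are 8 bits */

	memset(&req, 0, sizeof(req));	/* keeps the reserved bits zero */
	req.control.interrupt_type = HV_X64_INTERRUPT_TYPE_FIXED;
	req.control.level_triggered = level ? 1 : 0;
	req.control.logical_dest_mode = 0;	/* assumed: physical destination */
	req.dest_addr = apic_id;
	req.vector = vector;
	return req;
}
```

The filled structure would then be passed as
ioctl(partition_fd, MSHV_ASSERT_INTERRUPT, &req) on a partition created with
HV_PARTITION_CREATION_FLAG_LAPIC_ENABLED.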
 Documentation/virt/mshv/api.rst         | 11 ++++++++++
 arch/x86/include/asm/hyperv-tlfs.h      | 14 ------------
 arch/x86/include/uapi/asm/hyperv-tlfs.h | 22 +++++++++++++++++++
 drivers/hv/hv_call.c                    | 29 +++++++++++++++++++++++++
 drivers/hv/mshv.h                       |  5 +++++
 drivers/hv/mshv_main.c                  | 20 +++++++++++++++++
 include/asm-generic/hyperv-tlfs.h       | 11 ++++++++++
 include/uapi/linux/mshv.h               |  7 ++++++
 8 files changed, 105 insertions(+), 14 deletions(-)

diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
index f0094258d834..76f98485cd93 100644
--- a/Documentation/virt/mshv/api.rst
+++ b/Documentation/virt/mshv/api.rst
@@ -130,3 +130,14 @@ Enable and configure different types of intercepts. Intercepts are events in a
 guest partition that will suspend the guest vp and send a message to the root
 partition (returned from MSHV_RUN_VP).
 
+3.8 MSHV_ASSERT_INTERRUPT
+--------------------------
+:Type: partition ioctl
+:Parameters: struct mshv_assert_interrupt
+:Returns: 0 on success
+
+Assert interrupts in partitions that use Microsoft Hypervisor's internal
+emulated LAPIC. This must be enabled on partition creation with the flag:
+HV_PARTITION_CREATION_FLAG_LAPIC_ENABLED
+
+
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 2b6f7dca79e6..871f5d014ae0 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -546,20 +546,6 @@ struct hv_partition_assist_pg {
 	u32 tlb_lock_count;
 };
 
-enum hv_interrupt_type {
-	HV_X64_INTERRUPT_TYPE_FIXED             = 0x0000,
-	HV_X64_INTERRUPT_TYPE_LOWESTPRIORITY    = 0x0001,
-	HV_X64_INTERRUPT_TYPE_SMI               = 0x0002,
-	HV_X64_INTERRUPT_TYPE_REMOTEREAD        = 0x0003,
-	HV_X64_INTERRUPT_TYPE_NMI               = 0x0004,
-	HV_X64_INTERRUPT_TYPE_INIT              = 0x0005,
-	HV_X64_INTERRUPT_TYPE_SIPI              = 0x0006,
-	HV_X64_INTERRUPT_TYPE_EXTINT            = 0x0007,
-	HV_X64_INTERRUPT_TYPE_LOCALINT0         = 0x0008,
-	HV_X64_INTERRUPT_TYPE_LOCALINT1         = 0x0009,
-	HV_X64_INTERRUPT_TYPE_MAXIMUM           = 0x000A,
-};
-
 #include <asm-generic/hyperv-tlfs.h>
 
 #endif
diff --git a/arch/x86/include/uapi/asm/hyperv-tlfs.h b/arch/x86/include/uapi/asm/hyperv-tlfs.h
index 442c4bb4113e..e234297521a3 100644
--- a/arch/x86/include/uapi/asm/hyperv-tlfs.h
+++ b/arch/x86/include/uapi/asm/hyperv-tlfs.h
@@ -989,6 +989,28 @@ union hv_intercept_parameters {
 #define HV_INTERCEPT_ACCESS_MASK_WRITE		0x02
 #define HV_INTERCEPT_ACCESS_MASK_EXECUTE	0x04
 
+enum hv_interrupt_type {
+	HV_X64_INTERRUPT_TYPE_FIXED             = 0x0000,
+	HV_X64_INTERRUPT_TYPE_LOWESTPRIORITY    = 0x0001,
+	HV_X64_INTERRUPT_TYPE_SMI               = 0x0002,
+	HV_X64_INTERRUPT_TYPE_REMOTEREAD        = 0x0003,
+	HV_X64_INTERRUPT_TYPE_NMI               = 0x0004,
+	HV_X64_INTERRUPT_TYPE_INIT              = 0x0005,
+	HV_X64_INTERRUPT_TYPE_SIPI              = 0x0006,
+	HV_X64_INTERRUPT_TYPE_EXTINT            = 0x0007,
+	HV_X64_INTERRUPT_TYPE_LOCALINT0         = 0x0008,
+	HV_X64_INTERRUPT_TYPE_LOCALINT1         = 0x0009,
+	HV_X64_INTERRUPT_TYPE_MAXIMUM           = 0x000A
+};
 
+union hv_interrupt_control {
+	struct {
+		__u32 interrupt_type; /* enum hv_interrupt_type */
+		__u32 level_triggered : 1;
+		__u32 logical_dest_mode : 1;
+		__u32 rsvd : 30;
+	} __packed;
+	__u64 as_uint64;
+};
 
 #endif
diff --git a/drivers/hv/hv_call.c b/drivers/hv/hv_call.c
index ec71b5a08a76..72e93d13d8ee 100644
--- a/drivers/hv/hv_call.c
+++ b/drivers/hv/hv_call.c
@@ -432,3 +432,32 @@ int hv_call_install_intercept(
 	return ret;
 }
 
+int hv_call_assert_virtual_interrupt(
+		u64 partition_id,
+		u32 vector,
+		u64 dest_addr,
+		union hv_interrupt_control control)
+{
+	struct hv_assert_virtual_interrupt *input;
+	unsigned long flags;
+	u64 status;
+
+	local_irq_save(flags);
+	input = (struct hv_assert_virtual_interrupt *)(*this_cpu_ptr(
+			hyperv_pcpu_input_arg));
+	memset(input, 0, sizeof(*input));
+	input->partition_id = partition_id;
+	input->vector = vector;
+	input->dest_addr = dest_addr;
+	input->control = control;
+	status = hv_do_hypercall(HVCALL_ASSERT_VIRTUAL_INTERRUPT, input, NULL);
+	local_irq_restore(flags);
+
+	if (!hv_result_success(status)) {
+		pr_err("%s: %s\n", __func__, hv_status_to_string(status));
+		return hv_status_to_errno(status);
+	}
+
+	return 0;
+}
+
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
index 541c83a36767..c0a0ccb3626a 100644
--- a/drivers/hv/mshv.h
+++ b/drivers/hv/mshv.h
@@ -67,5 +67,10 @@ int hv_call_set_vp_registers(
 int hv_call_install_intercept(u64 partition_id, u32 access_type,
 		enum hv_intercept_type intercept_type,
 		union hv_intercept_parameters intercept_parameter);
+int hv_call_assert_virtual_interrupt(
+		u64 partition_id,
+		u32 vector,
+		u64 dest_addr,
+		union hv_interrupt_control control);
 
 #endif /* _MSHV_H */
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index 911dfc61e24c..ee41b59cc922 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -583,6 +583,22 @@ mshv_partition_ioctl_install_intercept(struct mshv_partition *partition,
 			args.intercept_parameter);
 }
 
+static long
+mshv_partition_ioctl_assert_interrupt(struct mshv_partition *partition,
+				      void __user *user_args)
+{
+	struct mshv_assert_interrupt args;
+
+	if (copy_from_user(&args, user_args, sizeof(args)))
+		return -EFAULT;
+
+	return hv_call_assert_virtual_interrupt(
+			partition->id,
+			args.vector,
+			args.dest_addr,
+			args.control);
+}
+
 static long
 mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
@@ -609,6 +625,10 @@ mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 		ret = mshv_partition_ioctl_install_intercept(partition,
 							(void __user *)arg);
 		break;
+	case MSHV_ASSERT_INTERRUPT:
+		ret = mshv_partition_ioctl_assert_interrupt(partition,
+							(void __user *)arg);
+		break;
 	default:
 		ret = -ENOTTY;
 	}
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 4453ba4d3293..d1cc5dbc78b5 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -165,6 +165,7 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_MAP_DEVICE_INTERRUPT		0x007c
 #define HVCALL_UNMAP_DEVICE_INTERRUPT		0x007d
 #define HVCALL_RETARGET_INTERRUPT		0x007e
+#define HVCALL_ASSERT_VIRTUAL_INTERRUPT		0x0094
 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af
 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0
 
@@ -821,4 +822,14 @@ struct hv_install_intercept {
 	union hv_intercept_parameters intercept_parameter;
 } __packed;
 
+struct hv_assert_virtual_interrupt {
+	u64 partition_id;
+	union hv_interrupt_control control;
+	u64 dest_addr; /* cpu's apic id */
+	u32 vector;
+	u8 target_vtl;
+	u8 rsvd_z0;
+	u16 rsvd_z1;
+} __packed;
+
 #endif
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index 8574a4e62715..f65248a1ee89 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -47,6 +47,12 @@ struct mshv_install_intercept {
 	union hv_intercept_parameters intercept_parameter;
 };
 
+struct mshv_assert_interrupt {
+	union hv_interrupt_control control;
+	__u64 dest_addr;
+	__u32 vector;
+};
+
 #define MSHV_IOCTL 0xB8
 
 /* mshv device */
@@ -58,6 +64,7 @@ struct mshv_install_intercept {
 #define MSHV_UNMAP_GUEST_MEMORY	_IOW(MSHV_IOCTL, 0x03, struct mshv_user_mem_region)
 #define MSHV_CREATE_VP		_IOW(MSHV_IOCTL, 0x04, struct mshv_create_vp)
 #define MSHV_INSTALL_INTERCEPT	_IOW(MSHV_IOCTL, 0x08, struct mshv_install_intercept)
+#define MSHV_ASSERT_INTERRUPT	_IOW(MSHV_IOCTL, 0x09, struct mshv_assert_interrupt)
 
 /* vp device */
 #define MSHV_GET_VP_REGISTERS   _IOWR(MSHV_IOCTL, 0x05, struct mshv_vp_registers)
-- 
2.23.4



* [PATCH v3 15/19] drivers/hv: get and set vp state ioctls
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (13 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 14/19] drivers/hv: assert interrupt ioctl Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 16/19] drivers/hv: mmap vp register page Nuno Das Neves
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Introduce ioctls for getting and setting a guest vcpu's emulated LAPIC
state and xsave data.

Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
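Illustration only: a small sketch of the control-word arithmetic that
hv_call_set_vp_state() performs for inline (non-PFN) data. The call code is
taken from this patch. HV_HYPERCALL_VARHEAD_OFFSET is restated with the value
the kernel headers already define (17). The two function names are
hypothetical, and the 4096-byte bound is a loose stand-in for the driver's
"header plus data must fit in one hypercall page" check.

```c
#include <assert.h>
#include <stdint.h>

#define HVCALL_SET_VP_STATE		0x00e4	/* from this patch */
#define HV_HYPERCALL_VARHEAD_OFFSET	17	/* existing kernel definition */

/*
 * Variable header size in 8-byte units for inline data:
 * round num_bytes up to a multiple of 8, then divide by 8.
 */
static inline uint16_t varhead_size_for_bytes(uint32_t num_bytes)
{
	assert(num_bytes <= 4096);	/* loose bound: one hypercall page */
	return (uint16_t)((num_bytes + 7) >> 3);
}

/* Control word: call code in the low bits, variable header size at bit 17. */
static inline uint64_t set_vp_state_control(uint16_t varhead_sz)
{
	return (uint64_t)HVCALL_SET_VP_STATE |
	       ((uint64_t)varhead_sz << HV_HYPERCALL_VARHEAD_OFFSET);
}
```

For the LAPIC state added here (44 32-bit fields, 176 bytes) this gives a
variable header of 22 units and a control word of 0x2c00e4.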
 Documentation/virt/mshv/api.rst         |   8 ++
 arch/x86/include/uapi/asm/hyperv-tlfs.h |  59 ++++++++++
 drivers/hv/hv_call.c                    | 138 +++++++++++++++++++++++-
 drivers/hv/mshv.h                       |  25 +++++
 drivers/hv/mshv_main.c                  | 122 +++++++++++++++++++++
 include/asm-generic/hyperv-tlfs.h       |  40 +++++++
 include/uapi/asm-generic/hyperv-tlfs.h  |  28 +++++
 include/uapi/linux/mshv.h               |  13 +++
 8 files changed, 432 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
index 76f98485cd93..1613ac6e9428 100644
--- a/Documentation/virt/mshv/api.rst
+++ b/Documentation/virt/mshv/api.rst
@@ -140,4 +140,12 @@ Assert interrupts in partitions that use Microsoft Hypervisor's internal
 emulated LAPIC. This must be enabled on partition creation with the flag:
 HV_PARTITION_CREATION_FLAG_LAPIC_ENABLED
 
+3.9 MSHV_GET_VP_STATE and MSHV_SET_VP_STATE
+-------------------------------------------
+:Type: vp ioctl
+:Parameters: struct mshv_vp_state
+:Returns: 0 on success
+
+Get/set various vp state. Currently these can be used to get and set
+emulated LAPIC state, and xsave data.
 
diff --git a/arch/x86/include/uapi/asm/hyperv-tlfs.h b/arch/x86/include/uapi/asm/hyperv-tlfs.h
index e234297521a3..46806227e869 100644
--- a/arch/x86/include/uapi/asm/hyperv-tlfs.h
+++ b/arch/x86/include/uapi/asm/hyperv-tlfs.h
@@ -1013,4 +1013,63 @@ union hv_interrupt_control {
 	__u64 as_uint64;
 };
 
+struct hv_local_interrupt_controller_state {
+	__u32 apic_id;
+	__u32 apic_version;
+	__u32 apic_ldr;
+	__u32 apic_dfr;
+	__u32 apic_spurious;
+	__u32 apic_isr[8];
+	__u32 apic_tmr[8];
+	__u32 apic_irr[8];
+	__u32 apic_esr;
+	__u32 apic_icr_high;
+	__u32 apic_icr_low;
+	__u32 apic_lvt_timer;
+	__u32 apic_lvt_thermal;
+	__u32 apic_lvt_perfmon;
+	__u32 apic_lvt_lint0;
+	__u32 apic_lvt_lint1;
+	__u32 apic_lvt_error;
+	__u32 apic_lvt_cmci;
+	__u32 apic_error_status;
+	__u32 apic_initial_count;
+	__u32 apic_counter_value;
+	__u32 apic_divide_configuration;
+	__u32 apic_remote_read;
+} __packed;
+
+#define HV_XSAVE_DATA_NO_XMM_REGISTERS 1
+
+union hv_x64_xsave_xfem_register {
+	__u64 as_uint64;
+	struct {
+		__u32 low_uint32;
+		__u32 high_uint32;
+	} __packed;
+	struct {
+		__u64 legacy_x87: 1;
+		__u64 legacy_sse: 1;
+		__u64 avx: 1;
+		__u64 mpx_bndreg: 1;
+		__u64 mpx_bndcsr: 1;
+		__u64 avx_512_op_mask: 1;
+		__u64 avx_512_zmmhi: 1;
+		__u64 avx_512_zmm16_31: 1;
+		__u64 rsvd8_9: 2;
+		__u64 pasid: 1;
+		__u64 cet_u: 1;
+		__u64 cet_s: 1;
+		__u64 rsvd13_16: 4;
+		__u64 xtile_cfg: 1;
+		__u64 xtile_data: 1;
+		__u64 rsvd19_63: 45;
+	} __packed;
+};
+
+struct hv_vp_state_data_xsave {
+	__u64 flags;
+	union hv_x64_xsave_xfem_register states;
+} __packed;
+
 #endif
diff --git a/drivers/hv/hv_call.c b/drivers/hv/hv_call.c
index 72e93d13d8ee..c358a2b51ba1 100644
--- a/drivers/hv/hv_call.c
+++ b/drivers/hv/hv_call.c
@@ -426,7 +426,6 @@ int hv_call_install_intercept(
 		}
 
 		ret = hv_call_deposit_pages(NUMA_NO_NODE, partition_id, 1);
-
 	} while (!ret);
 
 	return ret;
@@ -461,3 +460,140 @@ int hv_call_assert_virtual_interrupt(
 	return 0;
 }
 
+int hv_call_get_vp_state(
+		u32 vp_index,
+		u64 partition_id,
+		enum hv_get_set_vp_state_type type,
+		struct hv_vp_state_data_xsave xsave,
+		/* Choose between pages and ret_output */
+		u64 page_count,
+		struct page **pages,
+		union hv_get_vp_state_out *ret_output)
+{
+	struct hv_get_vp_state_in *input;
+	union hv_get_vp_state_out *output;
+	u64 status;
+	int i;
+	u64 control;
+	unsigned long flags;
+	int ret = 0;
+
+	if (page_count > HV_GET_VP_STATE_BATCH_SIZE)
+		return -EINVAL;
+
+	if (!page_count && !ret_output)
+		return -EINVAL;
+
+	do {
+		local_irq_save(flags);
+		input = (struct hv_get_vp_state_in *)
+				(*this_cpu_ptr(hyperv_pcpu_input_arg));
+		output = (union hv_get_vp_state_out *)
+				(*this_cpu_ptr(hyperv_pcpu_output_arg));
+		memset(input, 0, sizeof(*input));
+		memset(output, 0, sizeof(*output));
+
+		input->partition_id = partition_id;
+		input->vp_index = vp_index;
+		input->state_data.type = type;
+		memcpy(&input->state_data.xsave, &xsave, sizeof(xsave));
+		for (i = 0; i < page_count; i++)
+			input->output_data_pfns[i] = page_to_pfn(pages[i]);
+
+		control = (HVCALL_GET_VP_STATE) |
+			  (page_count << HV_HYPERCALL_VARHEAD_OFFSET);
+
+		status = hv_do_hypercall(control, input, output);
+
+		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
+			if (!hv_result_success(status))
+				pr_err("%s: %s\n", __func__,
+				       hv_status_to_string(status));
+			else if (ret_output)
+				memcpy(ret_output, output, sizeof(*output));
+
+			local_irq_restore(flags);
+			ret = hv_status_to_errno(status);
+			break;
+		}
+		local_irq_restore(flags);
+
+		ret = hv_call_deposit_pages(NUMA_NO_NODE,
+					    partition_id, 1);
+	} while (!ret);
+
+	return ret;
+}
+
+int hv_call_set_vp_state(
+		u32 vp_index,
+		u64 partition_id,
+		enum hv_get_set_vp_state_type type,
+		struct hv_vp_state_data_xsave xsave,
+		/* Choose between pages and bytes */
+		u64 page_count,
+		struct page **pages,
+		u32 num_bytes,
+		u8 *bytes)
+{
+	struct hv_set_vp_state_in *input;
+	u64 status;
+	int i;
+	u64 control;
+	unsigned long flags;
+	int ret = 0;
+	u16 varhead_sz;
+
+	if (page_count > HV_SET_VP_STATE_BATCH_SIZE)
+		return -EINVAL;
+	if (sizeof(*input) + num_bytes > HV_HYP_PAGE_SIZE)
+		return -EINVAL;
+
+	if (num_bytes)
+		/* round up to 8 and divide by 8 */
+		varhead_sz = (num_bytes + 7) >> 3;
+	else if (page_count)
+		varhead_sz =  page_count;
+	else
+		return -EINVAL;
+
+	do {
+		local_irq_save(flags);
+		input = (struct hv_set_vp_state_in *)
+				(*this_cpu_ptr(hyperv_pcpu_input_arg));
+		memset(input, 0, sizeof(*input));
+
+		input->partition_id = partition_id;
+		input->vp_index = vp_index;
+		input->state_data.type = type;
+		memcpy(&input->state_data.xsave, &xsave, sizeof(xsave));
+		if (num_bytes) {
+			memcpy((u8 *)input->data, bytes, num_bytes);
+		} else {
+			for (i = 0; i < page_count; i++)
+				input->data[i].pfns = page_to_pfn(pages[i]);
+		}
+
+		control = (HVCALL_SET_VP_STATE) |
+			  (varhead_sz << HV_HYPERCALL_VARHEAD_OFFSET);
+
+		status = hv_do_hypercall(control, input, NULL);
+
+		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
+			if (!hv_result_success(status))
+				pr_err("%s: %s\n", __func__,
+				       hv_status_to_string(status));
+
+			local_irq_restore(flags);
+			ret = hv_status_to_errno(status);
+			break;
+		}
+		local_irq_restore(flags);
+
+		ret = hv_call_deposit_pages(NUMA_NO_NODE,
+					    partition_id, 1);
+	} while (!ret);
+
+	return ret;
+}
+
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
index c0a0ccb3626a..c8f3919a5cdc 100644
--- a/drivers/hv/mshv.h
+++ b/drivers/hv/mshv.h
@@ -26,6 +26,12 @@
 #define HV_SET_REGISTER_BATCH_SIZE	\
 	((HV_HYP_PAGE_SIZE - sizeof(struct hv_set_vp_registers)) \
 		/ sizeof(struct hv_register_assoc))
+#define HV_GET_VP_STATE_BATCH_SIZE	\
+	((HV_HYP_PAGE_SIZE - sizeof(struct hv_get_vp_state_in)) \
+		/ sizeof(u64))
+#define HV_SET_VP_STATE_BATCH_SIZE	\
+	((HV_HYP_PAGE_SIZE - sizeof(struct hv_set_vp_state_in)) \
+		/ sizeof(u64))
 
 extern struct mshv mshv;
 
@@ -72,5 +78,24 @@ int hv_call_assert_virtual_interrupt(
 		u32 vector,
 		u64 dest_addr,
 		union hv_interrupt_control control);
+int hv_call_get_vp_state(
+		u32 vp_index,
+		u64 partition_id,
+		enum hv_get_set_vp_state_type type,
+		struct hv_vp_state_data_xsave xsave,
+		/* Choose between pages and ret_output */
+		u64 page_count,
+		struct page **pages,
+		union hv_get_vp_state_out *ret_output);
+int hv_call_set_vp_state(
+		u32 vp_index,
+		u64 partition_id,
+		enum hv_get_set_vp_state_type type,
+		struct hv_vp_state_data_xsave xsave,
+		/* Choose between pages and bytes */
+		u64 page_count,
+		struct page **pages,
+		u32 num_bytes,
+		u8 *bytes);
 
 #endif /* _MSHV_H */
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index ee41b59cc922..cef77f53d7c7 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -288,6 +288,122 @@ mshv_vp_ioctl_set_regs(struct mshv_vp *vp, void __user *user_args)
 	return ret;
 }
 
+static long
+mshv_vp_ioctl_get_set_state_pfn(struct mshv_vp *vp,
+				struct mshv_vp_state *args,
+				bool is_set)
+{
+	u64 page_count, remaining;
+	int completed;
+	struct page **pages;
+	long ret;
+	unsigned long u_buf;
+
+	/* Buffer must be page aligned */
+	if (!PAGE_ALIGNED(args->buf_size) ||
+	    !PAGE_ALIGNED(args->buf.bytes))
+		return -EINVAL;
+
+	if (!access_ok(args->buf.bytes, args->buf_size))
+		return -EFAULT;
+
+	/* Pin user pages so hypervisor can copy directly to them */
+	page_count = args->buf_size >> HV_HYP_PAGE_SHIFT;
+	pages = kcalloc(page_count, sizeof(struct page *), GFP_KERNEL);
+	if (!pages)
+		return -ENOMEM;
+
+	remaining = page_count;
+	u_buf = (unsigned long)args->buf.bytes;
+	while (remaining) {
+		completed = pin_user_pages_fast(
+				u_buf,
+				remaining,
+				FOLL_WRITE,
+				&pages[page_count - remaining]);
+		if (completed < 0) {
+			pr_err("%s: failed to pin user pages error %i\n",
+			       __func__, completed);
+			ret = completed;
+			goto unpin_pages;
+		}
+		remaining -= completed;
+		u_buf += completed * HV_HYP_PAGE_SIZE;
+	}
+
+	if (is_set)
+		ret = hv_call_set_vp_state(vp->index,
+					   vp->partition->id,
+					   args->type, args->xsave,
+					   page_count, pages,
+					   0, NULL);
+	else
+		ret = hv_call_get_vp_state(vp->index,
+					   vp->partition->id,
+					   args->type, args->xsave,
+					   page_count, pages,
+					   NULL);
+
+unpin_pages:
+	unpin_user_pages(pages, page_count - remaining);
+	kfree(pages);
+	return ret;
+}
+
+static long
+mshv_vp_ioctl_get_set_state(struct mshv_vp *vp, void __user *user_args, bool is_set)
+{
+	struct mshv_vp_state args;
+	long ret = 0;
+	union hv_get_vp_state_out vp_state;
+
+	if (copy_from_user(&args, user_args, sizeof(args)))
+		return -EFAULT;
+
+	/* For now just support these */
+	if (args.type != HV_GET_SET_VP_STATE_LOCAL_INTERRUPT_CONTROLLER_STATE &&
+	    args.type != HV_GET_SET_VP_STATE_XSAVE)
+		return -EINVAL;
+
+	/* If we need to pin pfns, delegate to helper */
+	if (args.type & HV_GET_SET_VP_STATE_TYPE_PFN)
+		return mshv_vp_ioctl_get_set_state_pfn(vp, &args, is_set);
+
+	if (args.buf_size < sizeof(vp_state))
+		return -EINVAL;
+
+	if (is_set) {
+		if (copy_from_user(
+				&vp_state,
+				args.buf.lapic,
+				sizeof(vp_state)))
+			return -EFAULT;
+
+		return hv_call_set_vp_state(vp->index,
+					    vp->partition->id,
+					    args.type, args.xsave,
+					    0, NULL,
+					    sizeof(vp_state),
+					    (u8 *)&vp_state);
+	}
+
+	ret = hv_call_get_vp_state(vp->index,
+				   vp->partition->id,
+				   args.type, args.xsave,
+				   0, NULL,
+				   &vp_state);
+
+	if (ret)
+		return ret;
+
+	if (copy_to_user(args.buf.lapic,
+			 &vp_state.interrupt_controller_state,
+			 sizeof(vp_state.interrupt_controller_state)))
+		return -EFAULT;
+
+	return 0;
+}
+
 static long
 mshv_vp_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
@@ -307,6 +423,12 @@ mshv_vp_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 	case MSHV_SET_VP_REGISTERS:
 		r = mshv_vp_ioctl_set_regs(vp, (void __user *)arg);
 		break;
+	case MSHV_GET_VP_STATE:
+		r = mshv_vp_ioctl_get_set_state(vp, (void __user *)arg, false);
+		break;
+	case MSHV_SET_VP_STATE:
+		r = mshv_vp_ioctl_get_set_state(vp, (void __user *)arg, true);
+		break;
 	default:
 		r = -ENOTTY;
 		break;
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index d1cc5dbc78b5..55a957436813 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -168,6 +168,9 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_ASSERT_VIRTUAL_INTERRUPT		0x0094
 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af
 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0
+#define HVCALL_MAP_VP_STATE_PAGE			0x00e1
+#define HVCALL_GET_VP_STATE				0x00e3
+#define HVCALL_SET_VP_STATE				0x00e4
 
 /* Extended hypercalls */
 #define HV_EXT_CALL_QUERY_CAPABILITIES		0x8001
@@ -832,4 +835,41 @@ struct hv_assert_virtual_interrupt {
 	u16 rsvd_z1;
 } __packed;
 
+struct hv_vp_state_data {
+	u32 type;
+	u32 rsvd;
+	struct hv_vp_state_data_xsave xsave;
+} __packed;
+
+struct hv_get_vp_state_in {
+	u64 partition_id;
+	u32 vp_index;
+	u8 input_vtl;
+	u8 rsvd0;
+	u16 rsvd1;
+	struct hv_vp_state_data state_data;
+	u64 output_data_pfns[];
+} __packed;
+
+union hv_get_vp_state_out {
+	struct hv_local_interrupt_controller_state interrupt_controller_state;
+	/* Not supported yet */
+	/* struct hv_synthetic_timers_state synthetic_timers_state; */
+} __packed;
+
+union hv_input_set_vp_state_data {
+	u64 pfns;
+	u8 bytes;
+} __packed;
+
+struct hv_set_vp_state_in {
+	u64 partition_id;
+	u32 vp_index;
+	u8 input_vtl;
+	u8 rsvd0;
+	u16 rsvd1;
+	struct hv_vp_state_data state_data;
+	union hv_input_set_vp_state_data data[];
+} __packed;
+
 #endif
diff --git a/include/uapi/asm-generic/hyperv-tlfs.h b/include/uapi/asm-generic/hyperv-tlfs.h
index 4ecb29fe1a0e..f4d8e9d148c3 100644
--- a/include/uapi/asm-generic/hyperv-tlfs.h
+++ b/include/uapi/asm-generic/hyperv-tlfs.h
@@ -103,4 +103,32 @@ struct hv_register_assoc {
 	union hv_register_value value;
 } __packed;
 
+/*
+ * For getting and setting VP state, there are two options based on the state type:
+ *
+ *     1.) Data that is accessed by PFNs in the input hypercall page. This is used
+ *         for state which may not fit into the hypercall pages.
+ *     2.) Data that is accessed directly in the input/output hypercall pages.
+ *         This is used for state that will always fit into the hypercall pages.
+ *
+ * In the future this could be dynamic based on the size if needed.
+ *
+ * Note these hypercalls have an 8-byte aligned variable header size as per the tlfs
+ */
+
+#define HV_GET_SET_VP_STATE_TYPE_PFN	BIT(31)
+
+enum hv_get_set_vp_state_type {
+	HV_GET_SET_VP_STATE_LOCAL_INTERRUPT_CONTROLLER_STATE = 0,
+
+	HV_GET_SET_VP_STATE_XSAVE		= 1 | HV_GET_SET_VP_STATE_TYPE_PFN,
+	/* Synthetic message page */
+	HV_GET_SET_VP_STATE_SIM_PAGE		= 2 | HV_GET_SET_VP_STATE_TYPE_PFN,
+	/* Synthetic interrupt event flags page. */
+	HV_GET_SET_VP_STATE_SIEF_PAGE		= 3 | HV_GET_SET_VP_STATE_TYPE_PFN,
+
+	/* Synthetic timers. */
+	HV_GET_SET_VP_STATE_SYNTHETIC_TIMERS	= 4,
+};
+
 #endif
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index f65248a1ee89..73c24478e87e 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -53,6 +53,17 @@ struct mshv_assert_interrupt {
 	__u32 vector;
 };
 
+struct mshv_vp_state {
+	enum hv_get_set_vp_state_type type;
+	struct hv_vp_state_data_xsave xsave; /* only for xsave request */
+
+	__u64 buf_size; /* If xsave, must be page-aligned */
+	union {
+		struct hv_local_interrupt_controller_state *lapic;
+		__u8 *bytes; /* Xsave data. must be page-aligned */
+	} buf;
+};
+
 #define MSHV_IOCTL 0xB8
 
 /* mshv device */
@@ -70,5 +81,7 @@ struct mshv_assert_interrupt {
 #define MSHV_GET_VP_REGISTERS   _IOWR(MSHV_IOCTL, 0x05, struct mshv_vp_registers)
 #define MSHV_SET_VP_REGISTERS   _IOW(MSHV_IOCTL, 0x06, struct mshv_vp_registers)
 #define MSHV_RUN_VP		_IOR(MSHV_IOCTL, 0x07, struct hv_message)
+#define MSHV_GET_VP_STATE	_IOWR(MSHV_IOCTL, 0x0A, struct mshv_vp_state)
+#define MSHV_SET_VP_STATE	_IOWR(MSHV_IOCTL, 0x0B, struct mshv_vp_state)
 
 #endif
-- 
2.23.4



* [PATCH v3 16/19] drivers/hv: mmap vp register page
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (14 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 15/19] drivers/hv: get and set vp state ioctls Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 17/19] drivers/hv: get and set partition property ioctls Nuno Das Neves
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Introduce mmap interface for a virtual processor, exposing a page for
setting and getting common registers while the VP is suspended.

This provides a more performant and convenient way to get and set these
registers in the context of a vmm's run-loop.

Co-developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
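Illustration only: a sketch of how a VMM run-loop might use the mapped
register page to skip an emulated instruction. The struct mirrors only the
leading fields of struct hv_vp_register_page from this patch (the real layout
continues with xmm, segment and read-only control registers). The helper name
is hypothetical, and two things are assumptions: that each
HV_X64_REGISTER_CLASS_* value is a bit position in 'dirty', and that a page
with isvalid == 0 must not be modified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Dirty-mask bit positions from this patch. */
#define HV_X64_REGISTER_CLASS_GENERAL	0
#define HV_X64_REGISTER_CLASS_IP	1

/* Mirror of the leading fields of struct hv_vp_register_page (truncated). */
struct vp_register_page_head {
	uint16_t version;
	uint8_t isvalid;
	uint8_t rsvdz;
	uint32_t dirty;
	uint64_t gp_registers[16];	/* rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi, r8..r15 */
	uint64_t rip;
	uint64_t rflags;
} __attribute__((packed));

_Static_assert(offsetof(struct vp_register_page_head, rip) == 136,
	       "rip follows the 8-byte header and 16 general registers");

/*
 * Hypothetical helper: advance rip past an emulated instruction and mark
 * the instruction-pointer class dirty. Returns 0 on success, -1 if the
 * page is not marked valid.
 */
static inline int vp_page_advance_rip(struct vp_register_page_head *page,
				      uint64_t insn_len)
{
	assert(insn_len <= 15);	/* x86 instructions are at most 15 bytes */

	if (!page->isvalid)
		return -1;

	page->rip += insn_len;
	page->dirty |= 1u << HV_X64_REGISTER_CLASS_IP;
	return 0;
}
```

The page pointer would come from mmap() on the vp fd with offset
MSHV_VP_MMAP_REGISTERS_OFFSET, while the vp is suspended.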
 Documentation/virt/mshv/api.rst         | 11 ++++
 arch/x86/include/uapi/asm/hyperv-tlfs.h | 74 +++++++++++++++++++++++++
 drivers/hv/hv_call.c                    | 41 ++++++++++++++
 drivers/hv/mshv.h                       |  4 ++
 drivers/hv/mshv_main.c                  | 44 +++++++++++++++
 include/asm-generic/hyperv-tlfs.h       | 10 ++++
 include/linux/mshv.h                    |  1 +
 include/uapi/asm-generic/hyperv-tlfs.h  |  5 ++
 include/uapi/linux/mshv.h               | 12 ++++
 9 files changed, 202 insertions(+)

diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
index 1613ac6e9428..bf3c060bd418 100644
--- a/Documentation/virt/mshv/api.rst
+++ b/Documentation/virt/mshv/api.rst
@@ -149,3 +149,14 @@ HV_PARTITION_CREATION_FLAG_LAPIC_ENABLED
 Get/set various vp state. Currently these can be used to get and set
 emulated LAPIC state, and xsave data.
 
+3.10 mmap(vp)
+-------------
+:Type: vp mmap
+:Parameters: offset should be MSHV_VP_MMAP_REGISTERS_OFFSET
+:Returns: address of the mapped page on success, MAP_FAILED on error
+
+Maps a page into userspace that can be used to get and set common registers
+while the vp is suspended.
+The page is laid out in struct hv_vp_register_page in asm/hyperv-tlfs.h.
+
+
diff --git a/arch/x86/include/uapi/asm/hyperv-tlfs.h b/arch/x86/include/uapi/asm/hyperv-tlfs.h
index 46806227e869..5430f3c98934 100644
--- a/arch/x86/include/uapi/asm/hyperv-tlfs.h
+++ b/arch/x86/include/uapi/asm/hyperv-tlfs.h
@@ -1072,4 +1072,78 @@ struct hv_vp_state_data_xsave {
 	union hv_x64_xsave_xfem_register states;
 } __packed;
 
+/* Bits for dirty mask of hv_vp_register_page */
+#define HV_X64_REGISTER_CLASS_GENERAL	0
+#define HV_X64_REGISTER_CLASS_IP	1
+#define HV_X64_REGISTER_CLASS_XMM	2
+#define HV_X64_REGISTER_CLASS_SEGMENT	3
+#define HV_X64_REGISTER_CLASS_FLAGS	4
+
+#define HV_VP_REGISTER_PAGE_VERSION_1	1u
+
+struct hv_vp_register_page {
+	__u16 version;
+	__u8 isvalid;
+	__u8 rsvdz;
+	__u32 dirty;
+	union {
+		struct {
+			__u64 rax;
+			__u64 rcx;
+			__u64 rdx;
+			__u64 rbx;
+			__u64 rsp;
+			__u64 rbp;
+			__u64 rsi;
+			__u64 rdi;
+			__u64 r8;
+			__u64 r9;
+			__u64 r10;
+			__u64 r11;
+			__u64 r12;
+			__u64 r13;
+			__u64 r14;
+			__u64 r15;
+		} __packed;
+
+		__u64 gp_registers[16];
+	};
+	__u64 rip;
+	__u64 rflags;
+	union {
+		struct {
+			struct hv_u128 xmm0;
+			struct hv_u128 xmm1;
+			struct hv_u128 xmm2;
+			struct hv_u128 xmm3;
+			struct hv_u128 xmm4;
+			struct hv_u128 xmm5;
+		} __packed;
+
+		struct hv_u128 xmm_registers[6];
+	};
+	union {
+		struct {
+			struct hv_x64_segment_register es;
+			struct hv_x64_segment_register cs;
+			struct hv_x64_segment_register ss;
+			struct hv_x64_segment_register ds;
+			struct hv_x64_segment_register fs;
+			struct hv_x64_segment_register gs;
+		} __packed;
+
+		struct hv_x64_segment_register segment_registers[6];
+	};
+	/* read only */
+	__u64 cr0;
+	__u64 cr3;
+	__u64 cr4;
+	__u64 cr8;
+	__u64 efer;
+	__u64 dr7;
+	union hv_x64_pending_interruption_register pending_interruption;
+	union hv_x64_interrupt_state_register interrupt_state;
+	__u64 instruction_emulation_hints;
+} __packed;
+
 #endif
diff --git a/drivers/hv/hv_call.c b/drivers/hv/hv_call.c
index c358a2b51ba1..eb98183ce8ef 100644
--- a/drivers/hv/hv_call.c
+++ b/drivers/hv/hv_call.c
@@ -597,3 +597,44 @@ int hv_call_set_vp_state(
 	return ret;
 }
 
+int hv_call_map_vp_state_page(
+		u32 vp_index,
+		u64 partition_id,
+		struct page **state_page)
+{
+	struct hv_map_vp_state_page_in *input;
+	struct hv_map_vp_state_page_out *output;
+	u64 status;
+	int ret;
+	unsigned long flags;
+
+	do {
+		local_irq_save(flags);
+		input = (struct hv_map_vp_state_page_in *)(*this_cpu_ptr(
+			hyperv_pcpu_input_arg));
+		output = (struct hv_map_vp_state_page_out *)(*this_cpu_ptr(
+			hyperv_pcpu_output_arg));
+
+		input->partition_id = partition_id;
+		input->vp_index = vp_index;
+		input->type = HV_VP_STATE_PAGE_REGISTERS;
+		status = hv_do_hypercall(HVCALL_MAP_VP_STATE_PAGE,
+						   input, output);
+
+		if (hv_result(status) != HV_STATUS_INSUFFICIENT_MEMORY) {
+			if (hv_result_success(status))
+				*state_page = pfn_to_page(output->map_location);
+			else
+				pr_err("%s: %s\n", __func__,
+				       hv_status_to_string(status));
+			local_irq_restore(flags);
+			ret = hv_status_to_errno(status);
+			break;
+		}
+		local_irq_restore(flags);
+
+		ret = hv_call_deposit_pages(NUMA_NO_NODE, partition_id, 1);
+	} while (!ret);
+
+	return ret;
+}
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
index c8f3919a5cdc..a9215581be6b 100644
--- a/drivers/hv/mshv.h
+++ b/drivers/hv/mshv.h
@@ -97,5 +97,9 @@ int hv_call_set_vp_state(
 		struct page **pages,
 		u32 num_bytes,
 		u8 *bytes);
+int hv_call_map_vp_state_page(
+		u32 vp_index,
+		u64 partition_id,
+		struct page **state_page);
 
 #endif /* _MSHV_H */
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index cef77f53d7c7..a30119043737 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -37,11 +37,18 @@ static long mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned
 static int mshv_dev_open(struct inode *inode, struct file *filp);
 static int mshv_dev_release(struct inode *inode, struct file *filp);
 static long mshv_dev_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg);
+static int mshv_vp_mmap(struct file *file, struct vm_area_struct *vma);
+static vm_fault_t mshv_vp_fault(struct vm_fault *vmf);
+
+static const struct vm_operations_struct mshv_vp_vm_ops = {
+	.fault = mshv_vp_fault,
+};
 
 static const struct file_operations mshv_vp_fops = {
 	.release = mshv_vp_release,
 	.unlocked_ioctl = mshv_vp_ioctl,
 	.llseek = noop_llseek,
+	.mmap = mshv_vp_mmap,
 };
 
 static const struct file_operations mshv_partition_fops = {
@@ -438,6 +445,43 @@ mshv_vp_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 	return r;
 }
 
+static vm_fault_t mshv_vp_fault(struct vm_fault *vmf)
+{
+	struct mshv_vp *vp = vmf->vma->vm_file->private_data;
+
+	vmf->page = vp->register_page;
+	get_page(vp->register_page);
+
+	return 0;
+}
+
+static int mshv_vp_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	int ret;
+	struct mshv_vp *vp = file->private_data;
+
+	if (vma->vm_pgoff != MSHV_VP_MMAP_REGISTERS_OFFSET)
+		return -EINVAL;
+
+	if (mutex_lock_killable(&vp->mutex))
+		return -EINTR;
+
+	if (!vp->register_page) {
+		ret = hv_call_map_vp_state_page(vp->index,
+						vp->partition->id,
+						&vp->register_page);
+		if (ret) {
+			mutex_unlock(&vp->mutex);
+			return ret;
+		}
+	}
+
+	mutex_unlock(&vp->mutex);
+
+	vma->vm_ops = &mshv_vp_vm_ops;
+	return 0;
+}
+
 static int
 mshv_vp_release(struct inode *inode, struct file *filp)
 {
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 55a957436813..f8f44008c013 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -872,4 +872,14 @@ struct hv_set_vp_state_in {
 	union hv_input_set_vp_state_data data[];
 } __packed;
 
+struct hv_map_vp_state_page_in {
+	u64 partition_id;
+	u32 vp_index;
+	u32 type; /* enum hv_vp_state_page_type */
+} __packed;
+
+struct hv_map_vp_state_page_out {
+	u64 map_location; /* page number */
+} __packed;
+
 #endif
diff --git a/include/linux/mshv.h b/include/linux/mshv.h
index 3933d80294f1..33f4d0cfee11 100644
--- a/include/linux/mshv.h
+++ b/include/linux/mshv.h
@@ -20,6 +20,7 @@ struct mshv_vp {
 	u32 index;
 	struct mshv_partition *partition;
 	struct mutex mutex;
+	struct page *register_page;
 	struct {
 		struct semaphore sem;
 		struct task_struct *task;
diff --git a/include/uapi/asm-generic/hyperv-tlfs.h b/include/uapi/asm-generic/hyperv-tlfs.h
index f4d8e9d148c3..a1bc77e463dd 100644
--- a/include/uapi/asm-generic/hyperv-tlfs.h
+++ b/include/uapi/asm-generic/hyperv-tlfs.h
@@ -131,4 +131,9 @@ enum hv_get_set_vp_state_type {
 	HV_GET_SET_VP_STATE_SYNTHETIC_TIMERS	= 4,
 };
 
+enum hv_vp_state_page_type {
+	HV_VP_STATE_PAGE_REGISTERS = 0,
+	HV_VP_STATE_PAGE_COUNT
+};
+
 #endif
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index 73c24478e87e..718a3617e1f1 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -14,6 +14,8 @@
 
 #define MSHV_CAP_CORE_API_STABLE    0x0
 
+#define MSHV_VP_MMAP_REGISTERS_OFFSET (HV_VP_STATE_PAGE_REGISTERS * 0x1000)
+
 struct mshv_create_partition {
 	__u64 flags;
 	struct hv_partition_creation_properties partition_creation_properties;
@@ -84,4 +86,14 @@ struct mshv_vp_state {
 #define MSHV_GET_VP_STATE	_IOWR(MSHV_IOCTL, 0x0A, struct mshv_vp_state)
 #define MSHV_SET_VP_STATE	_IOWR(MSHV_IOCTL, 0x0B, struct mshv_vp_state)
 
+/* register page mapping example:
+ * struct hv_vp_register_page *regs = mmap(NULL,
+ *					   4096,
+ *					   PROT_READ | PROT_WRITE,
+ *					   MAP_SHARED,
+ *					   vp_fd,
+ *					   MSHV_VP_MMAP_REGISTERS_OFFSET);
+ * munmap(regs, 4096);
+ */
+
 #endif
-- 
2.23.4
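
Note on the register page mapping above: a minimal userspace sketch of the
mmap() call described in the uapi comment. The sketch_* names and the locally
defined offset and size are assumptions made for illustration; real code would
take MSHV_VP_MMAP_REGISTERS_OFFSET from the uapi header added by this series.

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/*
 * Assumption: mirrors MSHV_VP_MMAP_REGISTERS_OFFSET from the patch, which is
 * HV_VP_STATE_PAGE_REGISTERS (0) * 0x1000, so the byte offset is 0.
 */
#define SKETCH_VP_MMAP_REGISTERS_OFFSET ((off_t)0)
#define SKETCH_REGISTER_PAGE_SIZE 4096UL

/*
 * Map the register page of a vp file descriptor.
 * Returns NULL on failure; errno is left as set by mmap().
 */
void *sketch_map_register_page(int vp_fd)
{
	void *regs = mmap(NULL, SKETCH_REGISTER_PAGE_SIZE,
			  PROT_READ | PROT_WRITE, MAP_SHARED,
			  vp_fd, SKETCH_VP_MMAP_REGISTERS_OFFSET);

	return regs == MAP_FAILED ? NULL : regs;
}

/* Drop the mapping created by sketch_map_register_page(). */
int sketch_unmap_register_page(void *regs)
{
	return munmap(regs, SKETCH_REGISTER_PAGE_SIZE);
}
```

A caller would pass the fd returned by MSHV_CREATE_VP and treat the returned
pointer as a struct hv_vp_register_page while the vp is suspended.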


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 17/19] drivers/hv: get and set partition property ioctls
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (15 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 16/19] drivers/hv: mmap vp register page Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 18/19] drivers/hv: Add enlightenment bits to create partition ioctl Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 19/19] drivers/hv: Translate GVA to GPA Nuno Das Neves
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Introduce ioctls for getting and setting properties of guest partitions.

Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
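
Usage note: a hedged sketch of how userspace might call the new get ioctl.
The sketch_* names are hypothetical and the struct and ioctl numbers are local
mirrors of the uapi definitions in this patch, with the implicit padding after
property_code written out explicitly (an assumption that holds for the usual
x86-64 ABI). The grouping helper reflects only how the enum values in this
patch are numbered, not a documented rule.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Assumption: local mirror of MSHV_IOCTL from include/uapi/linux/mshv.h. */
#define SKETCH_MSHV_IOCTL 0xB8

/* Local mirror of struct mshv_partition_property (16 bytes on x86-64). */
struct sketch_partition_property {
	uint32_t property_code;  /* enum hv_partition_property_code */
	uint32_t padding;        /* implicit in the original struct */
	uint64_t property_value;
};

#define SKETCH_SET_PARTITION_PROPERTY \
	_IOW(SKETCH_MSHV_IOCTL, 0xC, struct sketch_partition_property)
#define SKETCH_GET_PARTITION_PROPERTY \
	_IOWR(SKETCH_MSHV_IOCTL, 0xD, struct sketch_partition_property)

/*
 * In the enum added by this patch, the upper 16 bits of a property code
 * match its comment group (1 = privilege, 2 = scheduling, 5 = resource...).
 */
unsigned int sketch_property_group(uint32_t code)
{
	return code >> 16;
}

/*
 * Read one partition property through a partition fd.
 * Returns 0 on success or a negative errno value on failure.
 */
int sketch_get_partition_property(int partition_fd, uint32_t code,
				  uint64_t *value)
{
	struct sketch_partition_property args = { .property_code = code };

	if (ioctl(partition_fd, SKETCH_GET_PARTITION_PROPERTY, &args) < 0)
		return -errno;

	*value = args.property_value;
	return 0;
}
```

The set path would look the same, filling property_value before calling
ioctl() with the set request and skipping the copy back.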
 Documentation/virt/mshv/api.rst        |  8 ++++
 drivers/hv/hv_call.c                   | 58 +++++++++++++++++++++++++
 drivers/hv/mshv.h                      |  8 ++++
 drivers/hv/mshv_main.c                 | 47 ++++++++++++++++++++
 include/asm-generic/hyperv-tlfs.h      | 19 +++++++++
 include/uapi/asm-generic/hyperv-tlfs.h | 59 ++++++++++++++++++++++++++
 include/uapi/linux/mshv.h              |  9 ++++
 7 files changed, 208 insertions(+)

diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
index bf3c060bd418..e660e0e6865e 100644
--- a/Documentation/virt/mshv/api.rst
+++ b/Documentation/virt/mshv/api.rst
@@ -159,4 +159,12 @@ Maps a page into userspace that can be used to get and set common registers
 while the vp is suspended.
 The page is laid out in struct hv_vp_register_page in asm/hyperv-tlfs.h.
 
+3.11 MSHV_SET_PARTITION_PROPERTY and MSHV_GET_PARTITION_PROPERTY
+----------------------------------------------------------------
+:Type: partition ioctl
+:Parameters: struct mshv_partition_property
+:Returns: 0 on success
+
+Can be used to get/set various properties of a partition.
+
 
diff --git a/drivers/hv/hv_call.c b/drivers/hv/hv_call.c
index eb98183ce8ef..776095de9679 100644
--- a/drivers/hv/hv_call.c
+++ b/drivers/hv/hv_call.c
@@ -638,3 +638,61 @@ int hv_call_map_vp_state_page(
 
 	return ret;
 }
+
+int hv_call_get_partition_property(
+		u64 partition_id,
+		u64 property_code,
+		u64 *property_value)
+{
+	u64 status;
+	unsigned long flags;
+	struct hv_get_partition_property_in *input;
+	struct hv_get_partition_property_out *output;
+
+	local_irq_save(flags);
+	input = (struct hv_get_partition_property_in *)(*this_cpu_ptr(
+			hyperv_pcpu_input_arg));
+	output = (struct hv_get_partition_property_out *)(*this_cpu_ptr(
+			hyperv_pcpu_output_arg));
+	memset(input, 0, sizeof(*input));
+	input->partition_id = partition_id;
+	input->property_code = property_code;
+	status = hv_do_hypercall(HVCALL_GET_PARTITION_PROPERTY, input,
+			output);
+
+	if (!hv_result_success(status)) {
+		pr_err("%s: %s\n", __func__, hv_status_to_string(status));
+		local_irq_restore(flags);
+		return hv_status_to_errno(status);
+	}
+	*property_value = output->property_value;
+
+	local_irq_restore(flags);
+
+	return 0;
+}
+
+int hv_call_set_partition_property(
+		u64 partition_id,
+		u64 property_code,
+		u64 property_value)
+{
+	u64 status;
+	unsigned long flags;
+	struct hv_set_partition_property *input;
+
+	local_irq_save(flags);
+	input = (struct hv_set_partition_property *)(*this_cpu_ptr(
+			hyperv_pcpu_input_arg));
+	memset(input, 0, sizeof(*input));
+	input->partition_id = partition_id;
+	input->property_code = property_code;
+	input->property_value = property_value;
+	status = hv_do_hypercall(HVCALL_SET_PARTITION_PROPERTY, input, NULL);
+	local_irq_restore(flags);
+
+	if (!hv_result_success(status))
+		pr_err("%s: %s\n", __func__, hv_status_to_string(status));
+
+	return hv_status_to_errno(status);
+}
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
index a9215581be6b..8230368b4257 100644
--- a/drivers/hv/mshv.h
+++ b/drivers/hv/mshv.h
@@ -101,5 +101,13 @@ int hv_call_map_vp_state_page(
 		u32 vp_index,
 		u64 partition_id,
 		struct page **state_page);
+int hv_call_get_partition_property(
+		u64 partition_id,
+		u64 property_code,
+		u64 *property_value);
+int hv_call_set_partition_property(
+		u64 partition_id,
+		u64 property_code,
+		u64 property_value);
 
 #endif /* _MSHV_H */
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index a30119043737..d65bcd8567a4 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -576,6 +576,45 @@ mshv_partition_ioctl_create_vp(struct mshv_partition *partition,
 	return ret;
 }
 
+static long
+mshv_partition_ioctl_get_property(struct mshv_partition *partition,
+				  void __user *user_args)
+{
+	struct mshv_partition_property args;
+	long ret;
+
+	if (copy_from_user(&args, user_args, sizeof(args)))
+		return -EFAULT;
+
+	ret = hv_call_get_partition_property(
+					partition->id,
+					args.property_code,
+					&args.property_value);
+
+	if (ret)
+		return ret;
+
+	if (copy_to_user(user_args, &args, sizeof(args)))
+		return -EFAULT;
+
+	return 0;
+}
+
+static long
+mshv_partition_ioctl_set_property(struct mshv_partition *partition,
+				  void __user *user_args)
+{
+	struct mshv_partition_property args;
+
+	if (copy_from_user(&args, user_args, sizeof(args)))
+		return -EFAULT;
+
+	return hv_call_set_partition_property(
+			partition->id,
+			args.property_code,
+			args.property_value);
+}
+
 static long
 mshv_partition_ioctl_map_memory(struct mshv_partition *partition,
 				struct mshv_user_mem_region __user *user_mem)
@@ -795,6 +834,14 @@ mshv_partition_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 		ret = mshv_partition_ioctl_assert_interrupt(partition,
 							(void __user *)arg);
 		break;
+	case MSHV_GET_PARTITION_PROPERTY:
+		ret = mshv_partition_ioctl_get_property(partition,
+							(void __user *)arg);
+		break;
+	case MSHV_SET_PARTITION_PROPERTY:
+		ret = mshv_partition_ioctl_set_property(partition,
+							(void __user *)arg);
+		break;
 	default:
 		ret = -ENOTTY;
 	}
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index f8f44008c013..2c0dfd0b8763 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -147,6 +147,8 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_INITIALIZE_PARTITION		0x0041
 #define HVCALL_FINALIZE_PARTITION		0x0042
 #define HVCALL_DELETE_PARTITION			0x0043
+#define HVCALL_GET_PARTITION_PROPERTY		0x0044
+#define HVCALL_SET_PARTITION_PROPERTY		0x0045
 #define HVCALL_GET_PARTITION_ID			0x0046
 #define HVCALL_DEPOSIT_MEMORY			0x0048
 #define HVCALL_WITHDRAW_MEMORY			0x0049
@@ -882,4 +884,21 @@ struct hv_map_vp_state_page_out {
 	u64 map_location; /* page number */
 } __packed;
 
+struct hv_get_partition_property_in {
+	u64 partition_id;
+	u32 property_code; /* enum hv_partition_property_code */
+	u32 padding;
+} __packed;
+
+struct hv_get_partition_property_out {
+	u64 property_value;
+} __packed;
+
+struct hv_set_partition_property {
+	u64 partition_id;
+	u32 property_code; /* enum hv_partition_property_code */
+	u32 padding;
+	u64 property_value;
+} __packed;
+
 #endif
diff --git a/include/uapi/asm-generic/hyperv-tlfs.h b/include/uapi/asm-generic/hyperv-tlfs.h
index a1bc77e463dd..1e572d38234a 100644
--- a/include/uapi/asm-generic/hyperv-tlfs.h
+++ b/include/uapi/asm-generic/hyperv-tlfs.h
@@ -136,4 +136,63 @@ enum hv_vp_state_page_type {
 	HV_VP_STATE_PAGE_COUNT
 };
 
+enum hv_partition_property_code {
+	/* Privilege properties */
+	HV_PARTITION_PROPERTY_PRIVILEGE_FLAGS				= 0x00010000,
+
+	/* Scheduling properties */
+	HV_PARTITION_PROPERTY_SUSPEND					= 0x00020000,
+	HV_PARTITION_PROPERTY_CPU_RESERVE				= 0x00020001,
+	HV_PARTITION_PROPERTY_CPU_CAP					= 0x00020002,
+	HV_PARTITION_PROPERTY_CPU_WEIGHT				= 0x00020003,
+	HV_PARTITION_PROPERTY_CPU_GROUP_ID				= 0x00020004,
+
+	/* Time properties */
+	HV_PARTITION_PROPERTY_TIME_FREEZE				= 0x00030003,
+
+	/* Debugging properties */
+	HV_PARTITION_PROPERTY_DEBUG_CHANNEL_ID				= 0x00040000,
+
+	/* Resource properties */
+	HV_PARTITION_PROPERTY_VIRTUAL_TLB_PAGE_COUNT			= 0x00050000,
+	HV_PARTITION_PROPERTY_VSM_CONFIG				= 0x00050001,
+	HV_PARTITION_PROPERTY_ZERO_MEMORY_ON_RESET			= 0x00050002,
+	HV_PARTITION_PROPERTY_PROCESSORS_PER_SOCKET			= 0x00050003,
+	HV_PARTITION_PROPERTY_NESTED_TLB_SIZE				= 0x00050004,
+	HV_PARTITION_PROPERTY_GPA_PAGE_ACCESS_TRACKING			= 0x00050005,
+	HV_PARTITION_PROPERTY_VSM_PERMISSIONS_DIRTY_SINCE_LAST_QUERY	= 0x00050006,
+	HV_PARTITION_PROPERTY_SGX_LAUNCH_CONTROL_CONFIG			= 0x00050007,
+	HV_PARTITION_PROPERTY_DEFAULT_SGX_LAUNCH_CONTROL0		= 0x00050008,
+	HV_PARTITION_PROPERTY_DEFAULT_SGX_LAUNCH_CONTROL1		= 0x00050009,
+	HV_PARTITION_PROPERTY_DEFAULT_SGX_LAUNCH_CONTROL2		= 0x0005000a,
+	HV_PARTITION_PROPERTY_DEFAULT_SGX_LAUNCH_CONTROL3		= 0x0005000b,
+	HV_PARTITION_PROPERTY_ISOLATION_STATE				= 0x0005000c,
+	HV_PARTITION_PROPERTY_ISOLATION_CONTROL				= 0x0005000d,
+	HV_PARTITION_PROPERTY_RDT_L3_COS_INDEX				= 0x0005000e,
+	HV_PARTITION_PROPERTY_RDT_RMID					= 0x0005000f,
+	HV_PARTITION_PROPERTY_IMPLEMENTED_PHYSICAL_ADDRESS_BITS		= 0x00050010,
+	HV_PARTITION_PROPERTY_NON_ARCHITECTURAL_CORE_SHARING		= 0x00050011,
+	HV_PARTITION_PROPERTY_HYPERCALL_DOORBELL_PAGE			= 0x00050012,
+
+	/* Compatibility properties */
+	HV_PARTITION_PROPERTY_PROCESSOR_VENDOR				= 0x00060000,
+	HV_PARTITION_PROPERTY_PROCESSOR_FEATURES_DEPRECATED		= 0x00060001,
+	HV_PARTITION_PROPERTY_PROCESSOR_XSAVE_FEATURES			= 0x00060002,
+	HV_PARTITION_PROPERTY_PROCESSOR_CL_FLUSH_SIZE			= 0x00060003,
+	HV_PARTITION_PROPERTY_ENLIGHTENMENT_MODIFICATIONS		= 0x00060004,
+	HV_PARTITION_PROPERTY_COMPATIBILITY_VERSION			= 0x00060005,
+	HV_PARTITION_PROPERTY_PHYSICAL_ADDRESS_WIDTH			= 0x00060006,
+	HV_PARTITION_PROPERTY_XSAVE_STATES				= 0x00060007,
+	HV_PARTITION_PROPERTY_MAX_XSAVE_DATA_SIZE			= 0x00060008,
+	HV_PARTITION_PROPERTY_PROCESSOR_CLOCK_FREQUENCY			= 0x00060009,
+	HV_PARTITION_PROPERTY_PROCESSOR_FEATURES0			= 0x0006000a,
+	HV_PARTITION_PROPERTY_PROCESSOR_FEATURES1			= 0x0006000b,
+
+	/* Guest software properties */
+	HV_PARTITION_PROPERTY_GUEST_OS_ID				= 0x00070000,
+
+	/* Nested virtualization properties */
+	HV_PARTITION_PROPERTY_PROCESSOR_VIRTUALIZATION_FEATURES		= 0x00080000,
+};
+
 #endif
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index 718a3617e1f1..1a6c22db4978 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -66,6 +66,11 @@ struct mshv_vp_state {
 	} buf;
 };
 
+struct mshv_partition_property {
+	enum hv_partition_property_code property_code;
+	__u64 property_value;
+};
+
 #define MSHV_IOCTL 0xB8
 
 /* mshv device */
@@ -78,6 +83,10 @@ struct mshv_vp_state {
 #define MSHV_CREATE_VP		_IOW(MSHV_IOCTL, 0x04, struct mshv_create_vp)
 #define MSHV_INSTALL_INTERCEPT	_IOW(MSHV_IOCTL, 0x08, struct mshv_install_intercept)
 #define MSHV_ASSERT_INTERRUPT	_IOW(MSHV_IOCTL, 0x09, struct mshv_assert_interrupt)
+#define MSHV_SET_PARTITION_PROPERTY \
+				_IOW(MSHV_IOCTL, 0xC, struct mshv_partition_property)
+#define MSHV_GET_PARTITION_PROPERTY \
+				_IOWR(MSHV_IOCTL, 0xD, struct mshv_partition_property)
 
 /* vp device */
 #define MSHV_GET_VP_REGISTERS   _IOWR(MSHV_IOCTL, 0x05, struct mshv_vp_registers)
-- 
2.23.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 18/19] drivers/hv: Add enlightenment bits to create partition ioctl
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (16 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 17/19] drivers/hv: get and set partition property ioctls Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  2021-09-28 18:31 ` [PATCH v3 19/19] drivers/hv: Translate GVA to GPA Nuno Das Neves
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

Add a union hv_partition_synthetic_processor_features mask to the
MSHV_CREATE_PARTITION ioctl arguments. Userspace can use it to enable
hypervisor enlightenments for exo partitions.

Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
---
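
Illustration: a small sketch of building the 64-bit value that ends up in
synthetic_processor_features.as_uint64[0]. The bit positions are derived by
counting the bitfields in the union below and assume the compiler allocates
bitfields starting from the least significant bit (as GCC does on x86). The
sketch_* names and the chosen feature set are hypothetical.

```c
#include <stdint.h>

/* Bit positions counted from the bitfield order in the union (assumption). */
enum sketch_synth_feature_bit {
	SKETCH_HYPERVISOR_PRESENT    = 0,
	SKETCH_HV1                   = 1,
	SKETCH_ACCESS_SYNIC_REGS     = 4,
	SKETCH_ACCESS_HYPERCALL_REGS = 7,
	SKETCH_ACCESS_VP_INDEX       = 8,
};

/* Turn one bit position into a mask; positions 31..63 are reserved. */
uint64_t sketch_feature_bit(enum sketch_synth_feature_bit bit)
{
	return UINT64_C(1) << bit;
}

/* An example feature set: hypervisor CPUID leaves, SynIC, hypercall MSRs. */
uint64_t sketch_minimal_features(void)
{
	return sketch_feature_bit(SKETCH_HYPERVISOR_PRESENT) |
	       sketch_feature_bit(SKETCH_HV1) |
	       sketch_feature_bit(SKETCH_ACCESS_SYNIC_REGS) |
	       sketch_feature_bit(SKETCH_ACCESS_HYPERCALL_REGS) |
	       sketch_feature_bit(SKETCH_ACCESS_VP_INDEX);
}
```

Code that includes the real uapi header would more naturally set the named
bitfields directly; the explicit shifts only make the layout visible.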
 Documentation/virt/mshv/api.rst         |   3 +
 arch/x86/include/uapi/asm/hyperv-tlfs.h | 125 ++++++++++++++++++++++++
 drivers/hv/mshv_main.c                  |   7 ++
 include/uapi/asm-generic/hyperv-tlfs.h  |   1 +
 include/uapi/linux/mshv.h               |   1 +
 5 files changed, 137 insertions(+)

diff --git a/Documentation/virt/mshv/api.rst b/Documentation/virt/mshv/api.rst
index e660e0e6865e..56a6edfcfe29 100644
--- a/Documentation/virt/mshv/api.rst
+++ b/Documentation/virt/mshv/api.rst
@@ -167,4 +167,7 @@ The page is laid out in struct hv_vp_register_page in asm/hyperv-tlfs.h.
 
 Can be used to get/set various properties of a partition.
 
+Some properties can only be set at partition creation. For these, there are
+parameters in MSHV_CREATE_PARTITION.
+
 
diff --git a/arch/x86/include/uapi/asm/hyperv-tlfs.h b/arch/x86/include/uapi/asm/hyperv-tlfs.h
index 5430f3c98934..4447ef5362e9 100644
--- a/arch/x86/include/uapi/asm/hyperv-tlfs.h
+++ b/arch/x86/include/uapi/asm/hyperv-tlfs.h
@@ -1146,4 +1146,129 @@ struct hv_vp_register_page {
 	__u64 instruction_emulation_hints;
 } __packed;
 
+#define HV_PARTITION_SYNTHETIC_PROCESSOR_FEATURES_BANKS 1
+
+union hv_partition_synthetic_processor_features {
+	__u64 as_uint64[HV_PARTITION_SYNTHETIC_PROCESSOR_FEATURES_BANKS];
+
+	struct {
+		/* Report a hypervisor is present. CPUID leaves
+		 * 0x40000000 and 0x40000001 are supported.
+		 */
+		__u64 hypervisor_present:1;
+
+		/*
+		 * Features associated with HV#1:
+		 */
+
+		/* Report support for Hv1 (CPUID leaves 0x40000000 - 0x40000006). */
+		__u64 hv1:1;
+
+		/* Access to HV_X64_MSR_VP_RUNTIME.
+		 * Corresponds to access_vp_run_time_reg privilege.
+		 */
+		__u64 access_vp_run_time_reg:1;
+
+		/* Access to HV_X64_MSR_TIME_REF_COUNT.
+		 * Corresponds to access_partition_reference_counter privilege.
+		 */
+		__u64 access_partition_reference_counter:1;
+
+		/* Access to SINT-related registers (HV_X64_MSR_SCONTROL through
+		 * HV_X64_MSR_EOM and HV_X64_MSR_SINT0 through HV_X64_MSR_SINT15).
+		 * Corresponds to access_synic_regs privilege.
+		 */
+		__u64 access_synic_regs:1;
+
+		/* Access to synthetic timers and associated MSRs
+		 * (HV_X64_MSR_STIMER0_CONFIG through HV_X64_MSR_STIMER3_COUNT).
+		 * Corresponds to access_synthetic_timer_regs privilege.
+		 */
+		__u64 access_synthetic_timer_regs:1;
+
+		/* Access to APIC MSRs (HV_X64_MSR_EOI, HV_X64_MSR_ICR and HV_X64_MSR_TPR)
+		 * as well as the VP assist page.
+		 * Corresponds to access_intr_ctrl_regs privilege.
+		 */
+		__u64 access_intr_ctrl_regs:1;
+
+		/* Access to registers associated with hypercalls (HV_X64_MSR_GUEST_OS_ID
+		 * and HV_X64_MSR_HYPERCALL).
+		 * Corresponds to access_hypercall_msrs privilege.
+		 */
+		__u64 access_hypercall_regs:1;
+
+		/* VP index can be queried. Corresponds to access_vp_index privilege. */
+		__u64 access_vp_index:1;
+
+		/* Access to the reference TSC. Corresponds to access_partition_reference_tsc
+		 * privilege.
+		 */
+		__u64 access_partition_reference_tsc:1;
+
+		/* Partition has access to the guest idle reg. Corresponds to
+		 * access_guest_idle_reg privilege.
+		 */
+		__u64 access_guest_idle_reg:1;
+
+		/* Partition has access to frequency regs. Corresponds to access_frequency_regs
+		 * privilege.
+		 */
+		__u64 access_frequency_regs:1;
+
+		__u64 reserved_z12:1; /* Reserved for access_reenlightenment_controls. */
+		__u64 reserved_z13:1; /* Reserved for access_root_scheduler_reg. */
+		__u64 reserved_z14:1; /* Reserved for access_tsc_invariant_controls. */
+
+		/* Extended GVA ranges for HvCallFlushVirtualAddressList hypercall.
+		 * Corresponds to privilege.
+		 */
+		__u64 enable_extended_gva_ranges_for_flush_virtual_address_list:1;
+
+		__u64 reserved_z16:1; /* Reserved for access_vsm. */
+		__u64 reserved_z17:1; /* Reserved for access_vp_registers. */
+
+		/* Use fast hypercall output. Corresponds to privilege. */
+		__u64 fast_hypercall_output:1;
+
+		__u64 reserved_z19:1; /* Reserved for enable_extended_hypercalls. */
+
+		/*
+		 * HvStartVirtualProcessor can be used to start virtual processors.
+		 * Corresponds to privilege.
+		 */
+		__u64 start_virtual_processor:1;
+
+		__u64 reserved_z21:1; /* Reserved for Isolation. */
+
+		/* Synthetic timers in direct mode. */
+		__u64 direct_synthetic_timers:1;
+
+		__u64 reserved_z23:1; /* Reserved for synthetic time unhalted timer */
+
+		/* Use extended processor masks. */
+		__u64 extended_processor_masks:1;
+
+		/* HvCallFlushVirtualAddressSpace / HvCallFlushVirtualAddressList are supported. */
+		__u64 tb_flush_hypercalls:1;
+
+		/* HvCallSendSyntheticClusterIpi is supported. */
+		__u64 synthetic_cluster_ipi:1;
+
+		/* HvCallNotifyLongSpinWait is supported. */
+		__u64 notify_long_spin_wait:1;
+
+		/* HvCallQueryNumaDistance is supported. */
+		__u64 query_numa_distance:1;
+
+		/* HvCallSignalEvent is supported. Corresponds to privilege. */
+		__u64 signal_events:1;
+
+		/* HvCallRetargetDeviceInterrupt is supported. */
+		__u64 retarget_device_interrupt:1;
+
+		__u64 reserved:33;
+	} __packed;
+};
+
 #endif
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index d65bcd8567a4..766ba7d5d168 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -1000,6 +1000,13 @@ mshv_ioctl_create_partition(void __user *user_arg)
 	if (ret)
 		goto put_fd;
 
+	ret = hv_call_set_partition_property(
+				partition->id,
+				HV_PARTITION_PROPERTY_SYNTHETIC_PROC_FEATURES,
+				args.synthetic_processor_features.as_uint64[0]);
+	if (ret)
+		goto delete_partition;
+
 	ret = hv_call_initialize_partition(partition->id);
 	if (ret)
 		goto delete_partition;
diff --git a/include/uapi/asm-generic/hyperv-tlfs.h b/include/uapi/asm-generic/hyperv-tlfs.h
index 1e572d38234a..5d8d5e89f432 100644
--- a/include/uapi/asm-generic/hyperv-tlfs.h
+++ b/include/uapi/asm-generic/hyperv-tlfs.h
@@ -139,6 +139,7 @@ enum hv_vp_state_page_type {
 enum hv_partition_property_code {
 	/* Privilege properties */
 	HV_PARTITION_PROPERTY_PRIVILEGE_FLAGS				= 0x00010000,
+	HV_PARTITION_PROPERTY_SYNTHETIC_PROC_FEATURES			= 0x00010001,
 
 	/* Scheduling properties */
 	HV_PARTITION_PROPERTY_SUSPEND					= 0x00020000,
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index 1a6c22db4978..ec8281712430 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -19,6 +19,7 @@
 struct mshv_create_partition {
 	__u64 flags;
 	struct hv_partition_creation_properties partition_creation_properties;
+	union hv_partition_synthetic_processor_features synthetic_processor_features;
 };
 
 /*
-- 
2.23.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v3 19/19] drivers/hv: Translate GVA to GPA
  2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
                   ` (17 preceding siblings ...)
  2021-09-28 18:31 ` [PATCH v3 18/19] drivers/hv: Add enlightenment bits to create partition ioctl Nuno Das Neves
@ 2021-09-28 18:31 ` Nuno Das Neves
  18 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-09-28 18:31 UTC (permalink / raw)
  To: linux-hyperv, linux-kernel
  Cc: virtualization, mikelley, viremana, sunilmut, wei.liu, vkuznets,
	ligrassi, kys, sthemmin, anbelski

From: Wei Liu <wei.liu@kernel.org>

Introduce an ioctl for translating a Guest Virtual Address (GVA) to a Guest
Physical Address (GPA).

Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
---
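
Worked example: the hypercall takes and returns page numbers, so the final GPA
is the returned page shifted back up plus the in-page offset of the original
GVA. The sketch below restates that arithmetic and decodes the result union
with explicit shifts. The sketch_* names are hypothetical, the page shift of
12 is taken from HV_HYP_PAGE_SHIFT, and the field positions assume a
little-endian layout with bitfields allocated from the least significant bit.

```c
#include <stdint.h>

#define SKETCH_HV_HYP_PAGE_SHIFT 12
#define SKETCH_HV_HYP_PAGE_OFFSET_MASK \
	((UINT64_C(1) << SKETCH_HV_HYP_PAGE_SHIFT) - 1)

/* Same arithmetic as the end of hv_call_translate_virtual_address(). */
uint64_t sketch_compose_gpa(uint64_t gpa_page, uint64_t gva)
{
	return (gpa_page << SKETCH_HV_HYP_PAGE_SHIFT) +
	       (gva & SKETCH_HV_HYP_PAGE_OFFSET_MASK);
}

/* result_code occupies the low 32 bits of hv_translate_gva_result. */
uint32_t sketch_result_code(uint64_t as_uint64)
{
	return (uint32_t)as_uint64;
}

/* cache_type is the 8-bit field that follows result_code (bits 32..39). */
uint8_t sketch_cache_type(uint64_t as_uint64)
{
	return (uint8_t)((as_uint64 >> 32) & 0xff);
}

/* overlay_page is the single bit after cache_type (bit 40). */
int sketch_overlay_page(uint64_t as_uint64)
{
	return (int)((as_uint64 >> 40) & 1);
}
```

For instance, a returned gpa_page of 0x1234 and a gva of 0xdeadbeef combine
into GPA 0x1234eef, since only the low 12 bits of the GVA are kept.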
 drivers/hv/hv_call.c                   | 44 ++++++++++++++++++++++++++
 drivers/hv/mshv.h                      |  7 ++++
 drivers/hv/mshv_main.c                 | 34 ++++++++++++++++++++
 include/asm-generic/hyperv-tlfs.h      | 14 ++++++++
 include/uapi/asm-generic/hyperv-tlfs.h | 43 +++++++++++++++++++++++++
 include/uapi/linux/mshv.h              |  8 +++++
 6 files changed, 150 insertions(+)

diff --git a/drivers/hv/hv_call.c b/drivers/hv/hv_call.c
index 776095de9679..0900e7377826 100644
--- a/drivers/hv/hv_call.c
+++ b/drivers/hv/hv_call.c
@@ -10,6 +10,7 @@
 
 #include <linux/kernel.h>
 #include <linux/mm.h>
+#include <linux/hyperv.h>
 #include <asm/mshyperv.h>
 
 #include "mshv.h"
@@ -696,3 +697,46 @@ int hv_call_set_partition_property(
 
 	return hv_status_to_errno(status);
 }
+
+int hv_call_translate_virtual_address(
+		u32 vp_index,
+		u64 partition_id,
+		u64 flags,
+		u64 gva,
+		u64 *gpa,
+		union hv_translate_gva_result *result)
+{
+	u64 status;
+	unsigned long irq_flags;
+	struct hv_translate_virtual_address_in *input;
+	struct hv_translate_virtual_address_out *output;
+
+	local_irq_save(irq_flags);
+
+	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
+	output = *this_cpu_ptr(hyperv_pcpu_output_arg);
+
+	memset(input, 0, sizeof(*input));
+	memset(output, 0, sizeof(*output));
+
+	input->partition_id = partition_id;
+	input->vp_index = vp_index;
+	input->control_flags = flags;
+	input->gva_page = gva >> HV_HYP_PAGE_SHIFT;
+
+	status = hv_do_hypercall(HVCALL_TRANSLATE_VIRTUAL_ADDRESS, input, output);
+
+	if (!hv_result_success(status)) {
+		pr_err("%s: %s\n", __func__, hv_status_to_string(status));
+		goto out;
+	}
+
+	*result = output->translation_result;
+	*gpa = (output->gpa_page << HV_HYP_PAGE_SHIFT) + offset_in_hvpage(gva);
+
+out:
+	local_irq_restore(irq_flags);
+
+	return hv_status_to_errno(status);
+}
+
diff --git a/drivers/hv/mshv.h b/drivers/hv/mshv.h
index 8230368b4257..1a8c94edb9c5 100644
--- a/drivers/hv/mshv.h
+++ b/drivers/hv/mshv.h
@@ -109,5 +109,12 @@ int hv_call_set_partition_property(
 		u64 partition_id,
 		u64 property_code,
 		u64 property_value);
+int hv_call_translate_virtual_address(
+		u32 vp_index,
+		u64 partition_id,
+		u64 flags,
+		u64 gva,
+		u64 *gpa,
+		union hv_translate_gva_result *result);
 
 #endif /* _MSHV_H */
diff --git a/drivers/hv/mshv_main.c b/drivers/hv/mshv_main.c
index 766ba7d5d168..26426d03d521 100644
--- a/drivers/hv/mshv_main.c
+++ b/drivers/hv/mshv_main.c
@@ -411,6 +411,37 @@ mshv_vp_ioctl_get_set_state(struct mshv_vp *vp, void __user *user_args, bool is_
 	return 0;
 }
 
+static long
+mshv_vp_ioctl_translate_gva(struct mshv_vp *vp, void __user *user_args)
+{
+	long ret;
+	struct mshv_translate_gva args;
+	u64 gpa;
+	union hv_translate_gva_result result;
+
+	if (copy_from_user(&args, user_args, sizeof(args)))
+		return -EFAULT;
+
+	ret = hv_call_translate_virtual_address(
+			vp->index,
+			vp->partition->id,
+			args.flags,
+			args.gva,
+			&gpa,
+			&result);
+
+	if (ret)
+		return ret;
+
+	if (copy_to_user(args.result, &result, sizeof(*args.result)))
+		return -EFAULT;
+
+	if (copy_to_user(args.gpa, &gpa, sizeof(*args.gpa)))
+		return -EFAULT;
+
+	return 0;
+}
+
 static long
 mshv_vp_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 {
@@ -436,6 +467,9 @@ mshv_vp_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
 	case MSHV_SET_VP_STATE:
 		r = mshv_vp_ioctl_get_set_state(vp, (void __user *)arg, true);
 		break;
+	case MSHV_TRANSLATE_GVA:
+		r = mshv_vp_ioctl_translate_gva(vp, (void __user *)arg);
+		break;
 	default:
 		r = -ENOTTY;
 		break;
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 2c0dfd0b8763..2e520e7d765d 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -158,6 +158,7 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_CREATE_VP			0x004e
 #define HVCALL_GET_VP_REGISTERS			0x0050
 #define HVCALL_SET_VP_REGISTERS			0x0051
+#define HVCALL_TRANSLATE_VIRTUAL_ADDRESS	0x0052
 #define HVCALL_POST_MESSAGE			0x005c
 #define HVCALL_SIGNAL_EVENT			0x005d
 #define HVCALL_POST_DEBUG_DATA			0x0069
@@ -901,4 +902,17 @@ struct hv_set_partition_property {
 	u64 property_value;
 } __packed;
 
+struct hv_translate_virtual_address_in {
+	u64 partition_id;
+	u32 vp_index;
+	u32 padding;
+	u64 control_flags;
+	u64 gva_page;
+} __packed;
+
+struct hv_translate_virtual_address_out {
+	union hv_translate_gva_result translation_result;
+	u64 gpa_page;
+} __packed;
+
 #endif
diff --git a/include/uapi/asm-generic/hyperv-tlfs.h b/include/uapi/asm-generic/hyperv-tlfs.h
index 5d8d5e89f432..95020e3a67ba 100644
--- a/include/uapi/asm-generic/hyperv-tlfs.h
+++ b/include/uapi/asm-generic/hyperv-tlfs.h
@@ -196,4 +196,47 @@ enum hv_partition_property_code {
 	HV_PARTITION_PROPERTY_PROCESSOR_VIRTUALIZATION_FEATURES		= 0x00080000,
 };
 
+enum hv_translate_gva_result_code {
+	HV_TRANSLATE_GVA_SUCCESS			= 0,
+
+	/* Translation failures. */
+	HV_TRANSLATE_GVA_PAGE_NOT_PRESENT		= 1,
+	HV_TRANSLATE_GVA_PRIVILEGE_VIOLATION		= 2,
+	HV_TRANSLATE_GVA_INVALIDE_PAGE_TABLE_FLAGS	= 3,
+
+	/* GPA access failures. */
+	HV_TRANSLATE_GVA_GPA_UNMAPPED			= 4,
+	HV_TRANSLATE_GVA_GPA_NO_READ_ACCESS		= 5,
+	HV_TRANSLATE_GVA_GPA_NO_WRITE_ACCESS		= 6,
+	HV_TRANSLATE_GVA_GPA_ILLEGAL_OVERLAY_ACCESS	= 7,
+
+	/*
+	 * Intercept for memory access by either
+	 *  - a higher VTL
+	 *  - a nested hypervisor (due to a violation of the nested page table)
+	 */
+	HV_TRANSLATE_GVA_INTERCEPT			= 8,
+
+	HV_TRANSLATE_GVA_GPA_UNACCEPTED			= 9,
+};
+
+union hv_translate_gva_result {
+	__u64 as_uint64;
+	struct {
+		__u32 result_code; /* enum hv_translate_gva_result_code */
+		__u32 cache_type : 8;
+		__u32 overlay_page : 1;
+		__u32 reserved : 23;
+	} __packed;
+};
+
+/* hv_translate_gva flags */
+#define HV_TRANSLATE_GVA_VALIDATE_READ		0x0001
+#define HV_TRANSLATE_GVA_VALIDATE_WRITE		0x0002
+#define HV_TRANSLATE_GVA_VALIDATE_EXECUTE	0x0004
+#define HV_TRANSLATE_GVA_PRIVILEGE_EXCEMP	0x0008
+#define HV_TRANSLATE_GVA_SET_PAGE_TABLE_BITS	0x0010
+#define HV_TRANSLATE_GVA_TLB_FLUSH_INHIBIT	0x0020
+#define HV_TRANSLATE_GVA_CONTROL_MASK		0x003f
+
 #endif
diff --git a/include/uapi/linux/mshv.h b/include/uapi/linux/mshv.h
index ec8281712430..0c46ce77cbb3 100644
--- a/include/uapi/linux/mshv.h
+++ b/include/uapi/linux/mshv.h
@@ -72,6 +72,13 @@ struct mshv_partition_property {
 	__u64 property_value;
 };
 
+struct mshv_translate_gva {
+	__u64 gva;
+	__u64 flags;
+	union hv_translate_gva_result *result;
+	__u64 *gpa;
+};
+
 #define MSHV_IOCTL 0xB8
 
 /* mshv device */
@@ -95,6 +102,7 @@ struct mshv_partition_property {
 #define MSHV_RUN_VP		_IOR(MSHV_IOCTL, 0x07, struct hv_message)
 #define MSHV_GET_VP_STATE	_IOWR(MSHV_IOCTL, 0x0A, struct mshv_vp_state)
 #define MSHV_SET_VP_STATE	_IOWR(MSHV_IOCTL, 0x0B, struct mshv_vp_state)
+#define MSHV_TRANSLATE_GVA	_IOWR(MSHV_IOCTL, 0x0E, struct mshv_translate_gva)
 
 /* register page mapping example:
  * struct hv_vp_register_page *regs = mmap(NULL,
-- 
2.23.4


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 08/19] drivers/hv: map and unmap guest memory
  2021-09-28 18:31 ` [PATCH v3 08/19] drivers/hv: map and unmap guest memory Nuno Das Neves
@ 2021-09-28 21:27     ` Olaf Hering
  0 siblings, 0 replies; 25+ messages in thread
From: Olaf Hering @ 2021-09-28 21:27 UTC (permalink / raw)
  To: Nuno Das Neves
  Cc: linux-hyperv, linux-kernel, virtualization, mikelley, viremana,
	sunilmut, wei.liu, vkuznets, ligrassi, kys, sthemmin, anbelski

[-- Attachment #1: Type: text/plain, Size: 308 bytes --]

On Tue, 28 Sep 2021 11:31:04 -0700,
Nuno Das Neves <nunodasneves@linux.microsoft.com> wrote:

> +++ b/include/asm-generic/hyperv-tlfs.h
> -#define HV_HYP_PAGE_SHIFT      12
> +#define HV_HYP_PAGE_SHIFT		12

I think in case this change is really required, it should be in a separate patch.


Olaf

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3 11/19] drivers/hv: set up synic pages for intercept messages
  2021-09-28 18:31 ` [PATCH v3 11/19] drivers/hv: set up synic pages for intercept messages Nuno Das Neves
@ 2021-09-28 21:38   ` Olaf Hering
  0 siblings, 0 replies; 25+ messages in thread
From: Olaf Hering @ 2021-09-28 21:38 UTC (permalink / raw)
  To: Nuno Das Neves
  Cc: linux-hyperv, linux-kernel, virtualization, mikelley, viremana,
	sunilmut, wei.liu, vkuznets, ligrassi, kys, sthemmin, anbelski

On Tue, 28 Sep 2021 11:31:07 -0700,
Nuno Das Neves <nunodasneves@linux.microsoft.com> wrote:

> +++ b/include/asm-generic/hyperv-tlfs.h
> -/* Define synthetic interrupt controller message constants. */

I think this code movement could be done in a separate patch.
This will reduce conflicts during backporting.

Olaf


* Re: [PATCH v3 08/19] drivers/hv: map and unmap guest memory
  2021-09-28 21:27     ` Olaf Hering
  (?)
@ 2021-09-30 14:17     ` Wei Liu
  2021-10-04 17:08       ` Nuno Das Neves
  -1 siblings, 1 reply; 25+ messages in thread
From: Wei Liu @ 2021-09-30 14:17 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Nuno Das Neves, linux-hyperv, linux-kernel, virtualization,
	mikelley, viremana, sunilmut, wei.liu, vkuznets, ligrassi, kys,
	sthemmin, anbelski

On Tue, Sep 28, 2021 at 11:27:02PM +0200, Olaf Hering wrote:
> On Tue, 28 Sep 2021 11:31:04 -0700,
> Nuno Das Neves <nunodasneves@linux.microsoft.com> wrote:
> 
> > +++ b/include/asm-generic/hyperv-tlfs.h
> > -#define HV_HYP_PAGE_SHIFT      12
> > +#define HV_HYP_PAGE_SHIFT		12
> 
> I think in case this change is really required, it should be in a separate patch.

I don't think this hunk should be in this patch. It only changes
whitespace.

Wei.

> 
> 
> Olaf




* Re: [PATCH v3 08/19] drivers/hv: map and unmap guest memory
  2021-09-30 14:17     ` Wei Liu
@ 2021-10-04 17:08       ` Nuno Das Neves
  0 siblings, 0 replies; 25+ messages in thread
From: Nuno Das Neves @ 2021-10-04 17:08 UTC (permalink / raw)
  To: Wei Liu, Olaf Hering
  Cc: linux-hyperv, linux-kernel, virtualization, mikelley, viremana,
	sunilmut, vkuznets, ligrassi, kys, sthemmin, anbelski



On 9/30/2021 7:17 AM, Wei Liu wrote:
> On Tue, Sep 28, 2021 at 11:27:02PM +0200, Olaf Hering wrote:
>> On Tue, 28 Sep 2021 11:31:04 -0700,
>> Nuno Das Neves <nunodasneves@linux.microsoft.com> wrote:
>>
>>> +++ b/include/asm-generic/hyperv-tlfs.h
>>> -#define HV_HYP_PAGE_SHIFT      12
>>> +#define HV_HYP_PAGE_SHIFT		12
>>
>> I think in case this change is really required, it should be in a separate patch.
> 
> I don't think this hunk should be in this patch. It only changes
> whitespace.
> 

Thanks, good point. I think I'll remove this hunk from the series altogether.

> Wei.
> 
>>
>>
>> Olaf
> 


Thread overview: 25+ messages
2021-09-28 18:30 [PATCH v3 00/19] Microsoft Hypervisor root partition ioctl interface Nuno Das Neves
2021-09-28 18:30 ` [PATCH v3 01/19] x86/hyperv: convert hyperv statuses to linux error codes Nuno Das Neves
2021-09-28 18:30 ` [PATCH v3 02/19] x86/hyperv: convert hyperv statuses to strings Nuno Das Neves
2021-09-28 18:30 ` [PATCH v3 03/19] drivers/hv: minimal mshv module (/dev/mshv/) Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 04/19] drivers/hv: check extension ioctl Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 05/19] drivers/hv: create partition ioctl Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 06/19] drivers/hv: create, initialize, finalize, delete partition hypercalls Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 07/19] drivers/hv: withdraw memory hypercall Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 08/19] drivers/hv: map and unmap guest memory Nuno Das Neves
2021-09-28 21:27   ` Olaf Hering
2021-09-28 21:27     ` Olaf Hering
2021-09-30 14:17     ` Wei Liu
2021-10-04 17:08       ` Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 09/19] drivers/hv: create vcpu ioctl Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 10/19] drivers/hv: get and set vcpu registers ioctls Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 11/19] drivers/hv: set up synic pages for intercept messages Nuno Das Neves
2021-09-28 21:38   ` Olaf Hering
2021-09-28 18:31 ` [PATCH v3 12/19] drivers/hv: run vp ioctl and isr Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 13/19] drivers/hv: install intercept ioctl Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 14/19] drivers/hv: assert interrupt ioctl Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 15/19] drivers/hv: get and set vp state ioctls Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 16/19] drivers/hv: mmap vp register page Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 17/19] drivers/hv: get and set partition property ioctls Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 18/19] drivers/hv: Add enlightenment bits to create partition ioctl Nuno Das Neves
2021-09-28 18:31 ` [PATCH v3 19/19] drivers/hv: Translate GVA to GPA Nuno Das Neves
