linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor
@ 2020-11-05 16:57 Wei Liu
  2020-11-05 16:57 ` [PATCH v2 01/17] asm-generic/hyperv: change HV_CPU_POWER_MANAGEMENT to HV_CPU_MANAGEMENT Wei Liu
                   ` (16 more replies)
  0 siblings, 17 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:57 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu

Hi all

Here we propose this patch series to make Linux run as the root partition [0]
on Microsoft Hypervisor [1]. There will be a subsequent patch series to provide a
device node (/dev/mshv) such that userspace programs can create and run virtual
machines. We've also ported Cloud Hypervisor [3] over and have been able to
boot a Linux guest with Virtio devices since late July.

Being an RFC sereis, this implements only the absolutely necessary
components to get things running.  I will break down this series a bit.

A large portion of this series consists of patches that augment hyperv-tlfs.h.
They should be rather uncontroversial and can be applied right away.

A few key things other than the changes to hyperv-tlfs.h:

1. Linux needs to setup existing Hyper-V facilities differently.
2. Linux needs to make a few hypercalls to bring up APs.
3. Interrupts are remapped by IOMMU, which is controlled by the hypervisor.
   Linux needs to make hypercalls to map and unmap interrupts. This is
   done by introducing a new MSI irqdomain and new irqchips.

This series is now based on 5.10-rc1. And thanks to tglx's overhaul of
the MSI code, our implementation of the MSI irq domain is shorter.

Comments and suggestions are welcome.

Thanks,
Wei.

Changes since v1:
1. Simplify MSI IRQ domain implementation.
2. Address Vitaly's comments.

Wei Liu (17):
  asm-generic/hyperv: change HV_CPU_POWER_MANAGEMENT to
    HV_CPU_MANAGEMENT
  x86/hyperv: detect if Linux is the root partition
  Drivers: hv: vmbus: skip VMBus initialization if Linux is root
  iommu/hyperv: don't setup IRQ remapping when running as root
  clocksource/hyperv: use MSR-based access if running as root
  x86/hyperv: allocate output arg pages if required
  x86/hyperv: extract partition ID from Microsoft Hypervisor if
    necessary
  x86/hyperv: handling hypercall page setup for root
  x86/hyperv: provide a bunch of helper functions
  x86/hyperv: implement and use hv_smp_prepare_cpus
  asm-generic/hyperv: update hv_msi_entry
  asm-generic/hyperv: update hv_interrupt_entry
  asm-generic/hyperv: introduce hv_device_id and auxiliary structures
  asm-generic/hyperv: import data structures for mapping device
    interrupts
  x86/hyperv: implement an MSI domain for root partition
  x86/ioapic: export a few functions and data structures via io_apic.h
  x86/hyperv: handle IO-APIC when running as root

 arch/x86/hyperv/Makefile            |   2 +-
 arch/x86/hyperv/hv_init.c           | 118 +++++-
 arch/x86/hyperv/hv_proc.c           | 217 +++++++++++
 arch/x86/hyperv/irqdomain.c         | 556 ++++++++++++++++++++++++++++
 arch/x86/include/asm/hyperv-tlfs.h  |  23 ++
 arch/x86/include/asm/io_apic.h      |  21 ++
 arch/x86/include/asm/mshyperv.h     |  13 +-
 arch/x86/kernel/apic/io_apic.c      |  28 +-
 arch/x86/kernel/cpu/mshyperv.c      |  43 +++
 drivers/clocksource/hyperv_timer.c  |   3 +
 drivers/hv/vmbus_drv.c              |   3 +
 drivers/iommu/hyperv-iommu.c        |   3 +-
 drivers/pci/controller/pci-hyperv.c |   2 +-
 include/asm-generic/hyperv-tlfs.h   | 254 ++++++++++++-
 14 files changed, 1250 insertions(+), 36 deletions(-)
 create mode 100644 arch/x86/hyperv/hv_proc.c
 create mode 100644 arch/x86/hyperv/irqdomain.c

-- 
2.20.1


^ permalink raw reply	[flat|nested] 58+ messages in thread

* [PATCH v2 01/17] asm-generic/hyperv: change HV_CPU_POWER_MANAGEMENT to HV_CPU_MANAGEMENT
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
@ 2020-11-05 16:57 ` Wei Liu
  2020-11-12 15:14   ` Vitaly Kuznetsov
  2020-11-05 16:57 ` [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition Wei Liu
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:57 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Arnd Bergmann, open list:GENERIC INCLUDE/ASM HEADER FILES

This makes the name match Hyper-V TLFS.

Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 include/asm-generic/hyperv-tlfs.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index e73a11850055..e6903589a82a 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -88,7 +88,7 @@
 #define HV_CONNECT_PORT				BIT(7)
 #define HV_ACCESS_STATS				BIT(8)
 #define HV_DEBUGGING				BIT(11)
-#define HV_CPU_POWER_MANAGEMENT			BIT(12)
+#define HV_CPU_MANAGEMENT			BIT(12)
 
 
 /*
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
  2020-11-05 16:57 ` [PATCH v2 01/17] asm-generic/hyperv: change HV_CPU_POWER_MANAGEMENT to HV_CPU_MANAGEMENT Wei Liu
@ 2020-11-05 16:57 ` Wei Liu
  2020-11-05 19:16   ` kernel test robot
                     ` (2 more replies)
  2020-11-05 16:58 ` [PATCH v2 03/17] Drivers: hv: vmbus: skip VMBus initialization if Linux is root Wei Liu
                   ` (14 subsequent siblings)
  16 siblings, 3 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:57 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

For now we can use the privilege flag to check. Stash the value to be
used later.

Put in a bunch of defines for future use when we want to have more
fine-grained detection.

Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 arch/x86/hyperv/hv_init.c          |  4 ++++
 arch/x86/include/asm/hyperv-tlfs.h | 10 ++++++++++
 arch/x86/include/asm/mshyperv.h    |  2 ++
 arch/x86/kernel/cpu/mshyperv.c     | 16 ++++++++++++++++
 4 files changed, 32 insertions(+)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index e04d90af4c27..533fe9e887f2 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -26,6 +26,10 @@
 #include <linux/syscore_ops.h>
 #include <clocksource/hyperv_timer.h>
 
+/* Is Linux running as the root partition? */
+bool hv_root_partition;
+EXPORT_SYMBOL_GPL(hv_root_partition);
+
 void *hv_hypercall_pg;
 EXPORT_SYMBOL_GPL(hv_hypercall_pg);
 
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 0ed20e8bba9e..41b628b9fb15 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -21,6 +21,7 @@
 #define HYPERV_CPUID_FEATURES			0x40000003
 #define HYPERV_CPUID_ENLIGHTMENT_INFO		0x40000004
 #define HYPERV_CPUID_IMPLEMENT_LIMITS		0x40000005
+#define HYPERV_CPUID_CPU_MANAGEMENT_FEATURES	0x40000007
 #define HYPERV_CPUID_NESTED_FEATURES		0x4000000A
 
 #define HYPERV_HYPERVISOR_PRESENT_BIT		0x80000000
@@ -103,6 +104,15 @@
 /* Recommend using enlightened VMCS */
 #define HV_X64_ENLIGHTENED_VMCS_RECOMMENDED		BIT(14)
 
+/*
+ * CPU management features identification.
+ * These are HYPERV_CPUID_CPU_MANAGEMENT_FEATURES.EAX bits.
+ */
+#define HV_X64_START_LOGICAL_PROCESSOR			BIT(0)
+#define HV_X64_CREATE_ROOT_VIRTUAL_PROCESSOR		BIT(1)
+#define HV_X64_PERFORMANCE_COUNTER_SYNC			BIT(2)
+#define HV_X64_RESERVED_IDENTITY_BIT			BIT(31)
+
 /*
  * Virtual processor will never share a physical core with another virtual
  * processor, except for virtual processors that are reported as sibling SMT
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index ffc289992d1b..ac2b0d110f03 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -237,6 +237,8 @@ int hyperv_fill_flush_guest_mapping_list(
 		struct hv_guest_mapping_flush_list *flush,
 		u64 start_gfn, u64 end_gfn);
 
+extern bool hv_root_partition;
+
 #ifdef CONFIG_X86_64
 void hv_apic_init(void);
 void __init hv_init_spinlocks(void);
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 05ef1f4550cb..f7633e1e4c82 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -237,6 +237,22 @@ static void __init ms_hyperv_init_platform(void)
 	pr_debug("Hyper-V: max %u virtual processors, %u logical processors\n",
 		 ms_hyperv.max_vp_index, ms_hyperv.max_lp_index);
 
+	/*
+	 * Check CPU management privilege.
+	 *
+	 * To mirror what Windows does we should extract CPU management
+	 * features and use the ReservedIdentityBit to detect if Linux is the
+	 * root partition. But that requires negotiating CPU management
+	 * interface (a process to be finalized).
+	 *
+	 * For now, use the privilege flag as the indicator for running as
+	 * root.
+	 */
+	if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_CPU_MANAGEMENT) {
+		hv_root_partition = true;
+		pr_info("Hyper-V: running as root partition\n");
+	}
+
 	/*
 	 * Extract host information.
 	 */
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 03/17] Drivers: hv: vmbus: skip VMBus initialization if Linux is root
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
  2020-11-05 16:57 ` [PATCH v2 01/17] asm-generic/hyperv: change HV_CPU_POWER_MANAGEMENT to HV_CPU_MANAGEMENT Wei Liu
  2020-11-05 16:57 ` [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-12 15:24   ` Vitaly Kuznetsov
  2020-11-05 16:58 ` [PATCH v2 04/17] iommu/hyperv: don't setup IRQ remapping when running as root Wei Liu
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger

There is no VMBus and the other infrastructures initialized in
hv_acpi_init when Linux is running as the root partition.

Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 drivers/hv/vmbus_drv.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 4fad3e6745e5..37c4d3a28309 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2612,6 +2612,9 @@ static int __init hv_acpi_init(void)
 	if (!hv_is_hyperv_initialized())
 		return -ENODEV;
 
+	if (hv_root_partition)
+		return -ENODEV;
+
 	init_completion(&probe_event);
 
 	/*
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 04/17] iommu/hyperv: don't setup IRQ remapping when running as root
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (2 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 03/17] Drivers: hv: vmbus: skip VMBus initialization if Linux is root Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-12 15:27   ` Vitaly Kuznetsov
  2020-11-05 16:58 ` [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if " Wei Liu
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Joerg Roedel, K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Joerg Roedel, open list:IOMMU DRIVERS

The IOMMU code needs more work. We're sure for now the IRQ remapping
hooks are not applicable when Linux is the root.

Signed-off-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Joerg Roedel <jroedel@suse.de>
---
 drivers/iommu/hyperv-iommu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/hyperv-iommu.c b/drivers/iommu/hyperv-iommu.c
index e09e2d734c57..8d3ce3add57d 100644
--- a/drivers/iommu/hyperv-iommu.c
+++ b/drivers/iommu/hyperv-iommu.c
@@ -20,6 +20,7 @@
 #include <asm/io_apic.h>
 #include <asm/irq_remapping.h>
 #include <asm/hypervisor.h>
+#include <asm/mshyperv.h>
 
 #include "irq_remapping.h"
 
@@ -143,7 +144,7 @@ static int __init hyperv_prepare_irq_remapping(void)
 	int i;
 
 	if (!hypervisor_is_type(X86_HYPER_MS_HYPERV) ||
-	    !x2apic_supported())
+	    !x2apic_supported() || hv_root_partition)
 		return -ENODEV;
 
 	fn = irq_domain_alloc_named_id_fwnode("HYPERV-IR", 0);
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if running as root
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (3 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 04/17] iommu/hyperv: don't setup IRQ remapping when running as root Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-12  9:56   ` Daniel Lezcano
  2020-11-12 15:30   ` Vitaly Kuznetsov
  2020-11-05 16:58 ` [PATCH v2 06/17] x86/hyperv: allocate output arg pages if required Wei Liu
                   ` (11 subsequent siblings)
  16 siblings, 2 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Daniel Lezcano, Thomas Gleixner

Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 drivers/clocksource/hyperv_timer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
index ba04cb381cd3..269a691bd2c4 100644
--- a/drivers/clocksource/hyperv_timer.c
+++ b/drivers/clocksource/hyperv_timer.c
@@ -426,6 +426,9 @@ static bool __init hv_init_tsc_clocksource(void)
 	if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
 		return false;
 
+	if (hv_root_partition)
+		return false;
+
 	hv_read_reference_counter = read_hv_clock_tsc;
 	phys_addr = virt_to_phys(hv_get_tsc_page());
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 06/17] x86/hyperv: allocate output arg pages if required
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (4 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if " Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-12 15:35   ` Vitaly Kuznetsov
  2020-11-05 16:58 ` [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary Wei Liu
                   ` (10 subsequent siblings)
  16 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

When Linux runs as the root partition, it will need to make hypercalls
which return data from the hypervisor.

Allocate pages for storing results when Linux runs as the root
partition.

Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
v2: Address Vitaly's comments
---
 arch/x86/hyperv/hv_init.c       | 35 ++++++++++++++++++++++++++++-----
 arch/x86/include/asm/mshyperv.h |  1 +
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 533fe9e887f2..7a2e37f025b0 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -45,6 +45,9 @@ EXPORT_SYMBOL_GPL(hv_vp_assist_page);
 void  __percpu **hyperv_pcpu_input_arg;
 EXPORT_SYMBOL_GPL(hyperv_pcpu_input_arg);
 
+void  __percpu **hyperv_pcpu_output_arg;
+EXPORT_SYMBOL_GPL(hyperv_pcpu_output_arg);
+
 u32 hv_max_vp_index;
 EXPORT_SYMBOL_GPL(hv_max_vp_index);
 
@@ -77,12 +80,19 @@ static int hv_cpu_init(unsigned int cpu)
 	void **input_arg;
 	struct page *pg;
 
-	input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
 	/* hv_cpu_init() can be called with IRQs disabled from hv_resume() */
-	pg = alloc_page(irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL);
+	pg = alloc_pages(irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL, hv_root_partition ? 1 : 0);
 	if (unlikely(!pg))
 		return -ENOMEM;
+
+	input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
 	*input_arg = page_address(pg);
+	if (hv_root_partition) {
+		void **output_arg;
+
+		output_arg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
+		*output_arg = page_address(pg + 1);
+	}
 
 	hv_get_vp_index(msr_vp_index);
 
@@ -209,14 +219,23 @@ static int hv_cpu_die(unsigned int cpu)
 	unsigned int new_cpu;
 	unsigned long flags;
 	void **input_arg;
-	void *input_pg = NULL;
+	void *pg;
 
 	local_irq_save(flags);
 	input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
-	input_pg = *input_arg;
+	pg = *input_arg;
 	*input_arg = NULL;
+
+	if (hv_root_partition) {
+		void **output_arg;
+
+		output_arg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
+		*output_arg = NULL;
+	}
+
 	local_irq_restore(flags);
-	free_page((unsigned long)input_pg);
+
+	free_page((unsigned long)pg);
 
 	if (hv_vp_assist_page && hv_vp_assist_page[cpu])
 		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
@@ -350,6 +369,12 @@ void __init hyperv_init(void)
 
 	BUG_ON(hyperv_pcpu_input_arg == NULL);
 
+	/* Allocate the per-CPU state for output arg for root */
+	if (hv_root_partition) {
+		hyperv_pcpu_output_arg = alloc_percpu(void *);
+		BUG_ON(hyperv_pcpu_output_arg == NULL);
+	}
+
 	/* Allocate percpu VP index */
 	hv_vp_index = kmalloc_array(num_possible_cpus(), sizeof(*hv_vp_index),
 				    GFP_KERNEL);
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index ac2b0d110f03..62d9390f1ddf 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -76,6 +76,7 @@ static inline void hv_disable_stimer0_percpu_irq(int irq) {}
 #if IS_ENABLED(CONFIG_HYPERV)
 extern void *hv_hypercall_pg;
 extern void  __percpu  **hyperv_pcpu_input_arg;
+extern void  __percpu  **hyperv_pcpu_output_arg;
 
 static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 {
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (5 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 06/17] x86/hyperv: allocate output arg pages if required Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-05 20:07   ` kernel test robot
                     ` (2 more replies)
  2020-11-05 16:58 ` [PATCH v2 08/17] x86/hyperv: handling hypercall page setup for root Wei Liu
                   ` (9 subsequent siblings)
  16 siblings, 3 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Arnd Bergmann,
	open list:GENERIC INCLUDE/ASM HEADER FILES

We will need the partition ID for executing some hypercalls later.

Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 arch/x86/hyperv/hv_init.c         | 26 ++++++++++++++++++++++++++
 arch/x86/include/asm/mshyperv.h   |  2 ++
 include/asm-generic/hyperv-tlfs.h |  6 ++++++
 3 files changed, 34 insertions(+)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 7a2e37f025b0..73b0fb851f76 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -30,6 +30,9 @@
 bool hv_root_partition;
 EXPORT_SYMBOL_GPL(hv_root_partition);
 
+u64 hv_current_partition_id;
+EXPORT_SYMBOL_GPL(hv_current_partition_id);
+
 void *hv_hypercall_pg;
 EXPORT_SYMBOL_GPL(hv_hypercall_pg);
 
@@ -335,6 +338,26 @@ static struct syscore_ops hv_syscore_ops = {
 	.resume		= hv_resume,
 };
 
+void __init hv_get_partition_id(void)
+{
+	struct hv_get_partition_id *output_page;
+	u16 status;
+	unsigned long flags;
+
+	local_irq_save(flags);
+	output_page = *this_cpu_ptr(hyperv_pcpu_output_arg);
+	status = hv_do_hypercall(HVCALL_GET_PARTITION_ID, NULL, output_page) &
+		HV_HYPERCALL_RESULT_MASK;
+	if (status != HV_STATUS_SUCCESS)
+		pr_err("Failed to get partition ID: %d\n", status);
+	else
+		hv_current_partition_id = output_page->partition_id;
+	local_irq_restore(flags);
+
+	/* No point in proceeding if this failed */
+	BUG_ON(status != HV_STATUS_SUCCESS);
+}
+
 /*
  * This function is to be invoked early in the boot sequence after the
  * hypervisor has been detected.
@@ -430,6 +453,9 @@ void __init hyperv_init(void)
 
 	register_syscore_ops(&hv_syscore_ops);
 
+	if (hv_root_partition)
+		hv_get_partition_id();
+
 	return;
 
 remove_cpuhp_state:
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 62d9390f1ddf..67f5d35a73d3 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -78,6 +78,8 @@ extern void *hv_hypercall_pg;
 extern void  __percpu  **hyperv_pcpu_input_arg;
 extern void  __percpu  **hyperv_pcpu_output_arg;
 
+extern u64 hv_current_partition_id;
+
 static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 {
 	u64 input_address = input ? virt_to_phys(input) : 0;
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index e6903589a82a..87b1a79b19eb 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -141,6 +141,7 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX	0x0013
 #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX	0x0014
 #define HVCALL_SEND_IPI_EX			0x0015
+#define HVCALL_GET_PARTITION_ID			0x0046
 #define HVCALL_GET_VP_REGISTERS			0x0050
 #define HVCALL_SET_VP_REGISTERS			0x0051
 #define HVCALL_POST_MESSAGE			0x005c
@@ -407,6 +408,11 @@ struct hv_tlb_flush_ex {
 	u64 gva_list[];
 } __packed;
 
+/* HvGetPartitionId hypercall (output only) */
+struct hv_get_partition_id {
+	u64 partition_id;
+} __packed;
+
 /* HvRetargetDeviceInterrupt hypercall */
 union hv_msi_entry {
 	u64 as_uint64;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 08/17] x86/hyperv: handling hypercall page setup for root
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (6 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-12 15:51   ` Vitaly Kuznetsov
  2020-11-05 16:58 ` [PATCH v2 09/17] x86/hyperv: provide a bunch of helper functions Wei Liu
                   ` (8 subsequent siblings)
  16 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

When Linux is running as the root partition, the hypercall page will
have already been setup by Hyper-V. Copy the content over to the
allocated page.

The suspend, resume and cleanup paths remain untouched because they are
not supported in this setup yet.

Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 arch/x86/hyperv/hv_init.c | 32 ++++++++++++++++++++++++++++++--
 1 file changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 73b0fb851f76..9fcaf741be99 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -25,6 +25,7 @@
 #include <linux/cpuhotplug.h>
 #include <linux/syscore_ops.h>
 #include <clocksource/hyperv_timer.h>
+#include <linux/highmem.h>
 
 /* Is Linux running as the root partition? */
 bool hv_root_partition;
@@ -438,8 +439,35 @@ void __init hyperv_init(void)
 
 	rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
 	hypercall_msr.enable = 1;
-	hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
-	wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
+
+	if (hv_root_partition) {
+		struct page *pg;
+		void *src, *dst;
+
+		/*
+		 * For the root partition, the hypervisor will set up its
+		 * hypercall page. The hypervisor guarantees it will not show
+		 * up in the root's address space. The root can't change the
+		 * location of the hypercall page.
+		 *
+		 * Order is important here. We must enable the hypercall page
+		 * so it is populated with code, then copy the code to an
+		 * executable page.
+		 */
+		wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
+
+		pg = vmalloc_to_page(hv_hypercall_pg);
+		dst = kmap(pg);
+		src = memremap(hypercall_msr.guest_physical_address << PAGE_SHIFT, PAGE_SIZE,
+				MEMREMAP_WB);
+		BUG_ON(!(src && dst));
+		memcpy(dst, src, PAGE_SIZE);
+		memunmap(src);
+		kunmap(pg);
+	} else {
+		hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
+		wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
+	}
 
 	/*
 	 * Ignore any errors in setting up stimer clockevents
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 09/17] x86/hyperv: provide a bunch of helper functions
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (7 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 08/17] x86/hyperv: handling hypercall page setup for root Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-12 15:57   ` Vitaly Kuznetsov
  2020-11-05 16:58 ` [PATCH v2 10/17] x86/hyperv: implement and use hv_smp_prepare_cpus Wei Liu
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Arnd Bergmann,
	open list:GENERIC INCLUDE/ASM HEADER FILES

They are used to deposit pages into Microsoft Hypervisor and bring up
logical and virtual processors.

Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
v2:
1. Adapt to hypervisor side changes
2. Address Vitaly's comments
---
 arch/x86/hyperv/Makefile          |   2 +-
 arch/x86/hyperv/hv_proc.c         | 217 ++++++++++++++++++++++++++++++
 arch/x86/include/asm/mshyperv.h   |   4 +
 include/asm-generic/hyperv-tlfs.h |  67 +++++++++
 4 files changed, 289 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/hyperv/hv_proc.c

diff --git a/arch/x86/hyperv/Makefile b/arch/x86/hyperv/Makefile
index 89b1f74d3225..565358020921 100644
--- a/arch/x86/hyperv/Makefile
+++ b/arch/x86/hyperv/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-y			:= hv_init.o mmu.o nested.o
-obj-$(CONFIG_X86_64)	+= hv_apic.o
+obj-$(CONFIG_X86_64)	+= hv_apic.o hv_proc.o
 
 ifdef CONFIG_X86_64
 obj-$(CONFIG_PARAVIRT_SPINLOCKS)	+= hv_spinlock.o
diff --git a/arch/x86/hyperv/hv_proc.c b/arch/x86/hyperv/hv_proc.c
new file mode 100644
index 000000000000..0fd972c9129a
--- /dev/null
+++ b/arch/x86/hyperv/hv_proc.c
@@ -0,0 +1,217 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/types.h>
+#include <linux/version.h>
+#include <linux/vmalloc.h>
+#include <linux/mm.h>
+#include <linux/clockchips.h>
+#include <linux/acpi.h>
+#include <linux/hyperv.h>
+#include <linux/slab.h>
+#include <linux/cpuhotplug.h>
+#include <linux/minmax.h>
+#include <asm/hypervisor.h>
+#include <asm/mshyperv.h>
+#include <asm/apic.h>
+
+#include <asm/trace/hyperv.h>
+
+#define HV_DEPOSIT_MAX_ORDER (8)
+#define HV_DEPOSIT_MAX (1 << HV_DEPOSIT_MAX_ORDER)
+
+/*
+ * Deposits exact number of pages
+ * Must be called with interrupts enabled
+ * Max 256 pages
+ */
+int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
+{
+	struct page **pages;
+	int *counts;
+	int num_allocations;
+	int i, j, page_count;
+	int order;
+	int desired_order;
+	u16 status;
+	int ret;
+	u64 base_pfn;
+	struct hv_deposit_memory *input_page;
+	unsigned long flags;
+
+	if (num_pages > HV_DEPOSIT_MAX)
+		return -E2BIG;
+	if (!num_pages)
+		return 0;
+
+	/* One buffer for page pointers and counts */
+	pages = page_address(alloc_page(GFP_KERNEL));
+	if (!pages)
+		return -ENOMEM;
+
+	counts = kcalloc(HV_DEPOSIT_MAX, sizeof(int), GFP_KERNEL);
+	if (!counts) {
+		free_page((unsigned long)pages);
+		return -ENOMEM;
+	}
+
+	/* Allocate all the pages before disabling interrupts */
+	num_allocations = 0;
+	i = 0;
+	order = HV_DEPOSIT_MAX_ORDER;
+
+	while (num_pages) {
+		/* Find highest order we can actually allocate */
+		desired_order = 31 - __builtin_clz(num_pages);
+		order = min(desired_order, order);
+		do {
+			pages[i] = alloc_pages_node(node, GFP_KERNEL, order);
+			if (!pages[i]) {
+				if (!order) {
+					ret = -ENOMEM;
+					goto err_free_allocations;
+				}
+				--order;
+			}
+		} while (!pages[i]);
+
+		split_page(pages[i], order);
+		counts[i] = 1 << order;
+		num_pages -= counts[i];
+		i++;
+		num_allocations++;
+	}
+
+	local_irq_save(flags);
+
+	input_page = *this_cpu_ptr(hyperv_pcpu_input_arg);
+
+	input_page->partition_id = partition_id;
+
+	/* Populate gpa_page_list - these will fit on the input page */
+	for (i = 0, page_count = 0; i < num_allocations; ++i) {
+		base_pfn = page_to_pfn(pages[i]);
+		for (j = 0; j < counts[i]; ++j, ++page_count)
+			input_page->gpa_page_list[page_count] = base_pfn + j;
+	}
+	status = hv_do_rep_hypercall(HVCALL_DEPOSIT_MEMORY,
+				     page_count, 0, input_page,
+				     NULL) & HV_HYPERCALL_RESULT_MASK;
+	local_irq_restore(flags);
+
+	if (status != HV_STATUS_SUCCESS) {
+		pr_err("Failed to deposit pages: %d\n", status);
+		ret = status;
+		goto err_free_allocations;
+	}
+
+	ret = 0;
+	goto free_buf;
+
+err_free_allocations:
+	for (i = 0; i < num_allocations; ++i) {
+		base_pfn = page_to_pfn(pages[i]);
+		for (j = 0; j < counts[i]; ++j)
+			__free_page(pfn_to_page(base_pfn + j));
+	}
+
+free_buf:
+	free_page((unsigned long)pages);
+	kfree(counts);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(hv_call_deposit_pages);
+
+int hv_call_add_logical_proc(int node, u32 lp_index, u32 apic_id)
+{
+	struct hv_add_logical_processor_in *input;
+	struct hv_add_logical_processor_out *output;
+	int status;
+	unsigned long flags;
+	int ret = 0;
+
+	/*
+	 * When adding a logical processor, the hypervisor may return
+	 * HV_STATUS_INSUFFICIENT_MEMORY. When that happens, we deposit more
+	 * pages and retry.
+	 */
+	do {
+		local_irq_save(flags);
+
+		input = *this_cpu_ptr(hyperv_pcpu_input_arg);
+		/* We don't do anything with the output right now */
+		output = *this_cpu_ptr(hyperv_pcpu_output_arg);
+
+		input->lp_index = lp_index;
+		input->apic_id = apic_id;
+		input->flags = 0;
+		input->proximity_domain_info.domain_id = node_to_pxm(node);
+		input->proximity_domain_info.flags.reserved = 0;
+		input->proximity_domain_info.flags.proximity_info_valid = 1;
+		input->proximity_domain_info.flags.proximity_preferred = 1;
+		status = hv_do_hypercall(HVCALL_ADD_LOGICAL_PROCESSOR,
+					 input, output);
+		local_irq_restore(flags);
+
+		if (status != HV_STATUS_INSUFFICIENT_MEMORY) {
+			if (status != HV_STATUS_SUCCESS) {
+				pr_err("%s: cpu %u apic ID %u, %d\n", __func__,
+				       lp_index, apic_id, status);
+				ret = status;
+			}
+			break;
+		}
+		ret = hv_call_deposit_pages(node, hv_current_partition_id, 1);
+	} while (!ret);
+
+	return ret;
+}
+
+int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
+{
+	struct hv_create_vp *input;
+	u16 status;
+	unsigned long irq_flags;
+	int ret = 0;
+
+	/* Root VPs don't seem to need pages deposited */
+	if (partition_id != hv_current_partition_id) {
+		ret = hv_call_deposit_pages(node, partition_id, 90);
+		if (ret)
+			return ret;
+	}
+
+	do {
+		local_irq_save(irq_flags);
+
+		input = *this_cpu_ptr(hyperv_pcpu_input_arg);
+
+		input->partition_id = partition_id;
+		input->vp_index = vp_index;
+		input->flags = flags;
+		input->subnode_type = HvSubnodeAny;
+		if (node != NUMA_NO_NODE) {
+			input->proximity_domain_info.domain_id = node_to_pxm(node);
+			input->proximity_domain_info.flags.reserved = 0;
+			input->proximity_domain_info.flags.proximity_info_valid = 1;
+			input->proximity_domain_info.flags.proximity_preferred = 1;
+		} else {
+			input->proximity_domain_info.as_uint64 = 0;
+		}
+		status = hv_do_hypercall(HVCALL_CREATE_VP, input, NULL);
+		local_irq_restore(irq_flags);
+
+		if (status != HV_STATUS_INSUFFICIENT_MEMORY) {
+			if (status != HV_STATUS_SUCCESS) {
+				pr_err("%s: vcpu %u, lp %u, %d\n", __func__,
+				       vp_index, flags, status);
+				ret = status;
+			}
+			break;
+		}
+		ret = hv_call_deposit_pages(node, partition_id, 1);
+
+	} while (!ret);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(hv_call_create_vp);
+
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 67f5d35a73d3..4e590a167160 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -80,6 +80,10 @@ extern void  __percpu  **hyperv_pcpu_output_arg;
 
 extern u64 hv_current_partition_id;
 
+int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages);
+int hv_call_add_logical_proc(int node, u32 lp_index, u32 acpi_id);
+int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags);
+
 static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 {
 	u64 input_address = input ? virt_to_phys(input) : 0;
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 87b1a79b19eb..b6c74e1a5524 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -142,6 +142,8 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX	0x0014
 #define HVCALL_SEND_IPI_EX			0x0015
 #define HVCALL_GET_PARTITION_ID			0x0046
+#define HVCALL_DEPOSIT_MEMORY			0x0048
+#define HVCALL_CREATE_VP			0x004e
 #define HVCALL_GET_VP_REGISTERS			0x0050
 #define HVCALL_SET_VP_REGISTERS			0x0051
 #define HVCALL_POST_MESSAGE			0x005c
@@ -149,6 +151,7 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_POST_DEBUG_DATA			0x0069
 #define HVCALL_RETRIEVE_DEBUG_DATA		0x006a
 #define HVCALL_RESET_DEBUG_SESSION		0x006b
+#define HVCALL_ADD_LOGICAL_PROCESSOR		0x0076
 #define HVCALL_RETARGET_INTERRUPT		0x007e
 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af
 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0
@@ -413,6 +416,70 @@ struct hv_get_partition_id {
 	u64 partition_id;
 } __packed;
 
+/* HvDepositMemory hypercall */
+struct hv_deposit_memory {
+	u64 partition_id;
+	u64 gpa_page_list[];
+} __packed;
+
+struct hv_proximity_domain_flags {
+	u32 proximity_preferred : 1;
+	u32 reserved : 30;
+	u32 proximity_info_valid : 1;
+};
+
+/* Not a union in windows but useful for zeroing */
+union hv_proximity_domain_info {
+	struct {
+		u32 domain_id;
+		struct hv_proximity_domain_flags flags;
+	};
+	u64 as_uint64;
+};
+
+struct hv_lp_startup_status {
+	u64 hv_status;
+	u64 substatus1;
+	u64 substatus2;
+	u64 substatus3;
+	u64 substatus4;
+	u64 substatus5;
+	u64 substatus6;
+};
+
+/* HvAddLogicalProcessor hypercall */
+struct hv_add_logical_processor_in {
+	u32 lp_index;
+	u32 apic_id;
+	union hv_proximity_domain_info proximity_domain_info;
+	u64 flags;
+};
+
+struct hv_add_logical_processor_out {
+	struct hv_lp_startup_status startup_status;
+};
+
+enum HV_SUBNODE_TYPE
+{
+    HvSubnodeAny = 0,
+    HvSubnodeSocket,
+    HvSubnodeAmdNode,
+    HvSubnodeL3,
+    HvSubnodeCount,
+    HvSubnodeInvalid = -1
+};
+
+/* HvCreateVp hypercall */
+struct hv_create_vp {
+	u64 partition_id;
+	u32 vp_index;
+	u8 padding[3];
+	u8 subnode_type;
+	u64 subnode_id;
+	union hv_proximity_domain_info proximity_domain_info;
+	u64 flags;
+};
+
 /* HvRetargetDeviceInterrupt hypercall */
 union hv_msi_entry {
 	u64 as_uint64;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 10/17] x86/hyperv: implement and use hv_smp_prepare_cpus
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (8 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 09/17] x86/hyperv: provide a bunch of helper functions Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-12 16:44   ` Vitaly Kuznetsov
  2020-11-05 16:58 ` [PATCH v2 11/17] asm-generic/hyperv: update hv_msi_entry Wei Liu
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

Microsoft Hypervisor requires the root partition to make a few
hypercalls to setup application processors before they can be used.

Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
CPU hotplug and unplug is not yet supported in this setup, so those
paths remain untouched.
---
 arch/x86/kernel/cpu/mshyperv.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index f7633e1e4c82..4795e54550e6 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -31,6 +31,7 @@
 #include <asm/reboot.h>
 #include <asm/nmi.h>
 #include <clocksource/hyperv_timer.h>
+#include <asm/numa.h>
 
 struct ms_hyperv_info ms_hyperv;
 EXPORT_SYMBOL_GPL(ms_hyperv);
@@ -208,6 +209,30 @@ static void __init hv_smp_prepare_boot_cpu(void)
 	hv_init_spinlocks();
 #endif
 }
+
+static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
+{
+#if defined(CONFIG_X86_64)
+	int i;
+	int ret;
+
+	native_smp_prepare_cpus(max_cpus);
+
+	for_each_present_cpu(i) {
+		if (i == 0)
+			continue;
+		ret = hv_call_add_logical_proc(numa_cpu_node(i), i, cpu_physical_id(i));
+		BUG_ON(ret);
+	}
+
+	for_each_present_cpu(i) {
+		if (i == 0)
+			continue;
+		ret = hv_call_create_vp(numa_cpu_node(i), hv_current_partition_id, i, i);
+		BUG_ON(ret);
+	}
+#endif
+}
 #endif
 
 static void __init ms_hyperv_init_platform(void)
@@ -364,6 +389,8 @@ static void __init ms_hyperv_init_platform(void)
 
 # ifdef CONFIG_SMP
 	smp_ops.smp_prepare_boot_cpu = hv_smp_prepare_boot_cpu;
+	if (hv_root_partition)
+		smp_ops.smp_prepare_cpus = hv_smp_prepare_cpus;
 # endif
 
 	/*
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 11/17] asm-generic/hyperv: update hv_msi_entry
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (9 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 10/17] x86/hyperv: implement and use hv_smp_prepare_cpus Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-05 16:58 ` [PATCH v2 12/17] asm-generic/hyperv: update hv_interrupt_entry Wei Liu
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Arnd Bergmann,
	open list:GENERIC INCLUDE/ASM HEADER FILES

We will soon need to access fields inside the MSI address and MSI data
fields. Introduce hv_msi_address_register and hv_msi_data_register.

Fix up one user of hv_msi_entry in mshyperv.h.

No functional change expected.

Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 arch/x86/include/asm/mshyperv.h   |  4 ++--
 include/asm-generic/hyperv-tlfs.h | 28 ++++++++++++++++++++++++++--
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 4e590a167160..cbee72550a12 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -257,8 +257,8 @@ static inline void hv_apic_init(void) {}
 static inline void hv_set_msi_entry_from_desc(union hv_msi_entry *msi_entry,
 					      struct msi_desc *msi_desc)
 {
-	msi_entry->address = msi_desc->msg.address_lo;
-	msi_entry->data = msi_desc->msg.data;
+	msi_entry->address.as_uint32 = msi_desc->msg.address_lo;
+	msi_entry->data.as_uint32 = msi_desc->msg.data;
 }
 
 #else /* CONFIG_HYPERV */
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index b6c74e1a5524..86dff6846278 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -480,12 +480,36 @@ struct hv_create_vp {
 	u64 flags;
 };
 
+union hv_msi_address_register {
+	u32 as_uint32;
+	struct {
+		u32 reserved1:2;
+		u32 destination_mode:1;
+		u32 redirection_hint:1;
+		u32 reserved2:8;
+		u32 destination_id:8;
+		u32 msi_base:12;
+	};
+} __packed;
+
+union hv_msi_data_register {
+	u32 as_uint32;
+	struct {
+		u32 vector:8;
+		u32 delivery_mode:3;
+		u32 reserved1:3;
+		u32 level_assert:1;
+		u32 trigger_mode:1;
+		u32 reserved2:16;
+	};
+} __packed;
+
 /* HvRetargetDeviceInterrupt hypercall */
 union hv_msi_entry {
 	u64 as_uint64;
 	struct {
-		u32 address;
-		u32 data;
+		union hv_msi_address_register address;
+		union hv_msi_data_register data;
 	} __packed;
 };
 
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 12/17] asm-generic/hyperv: update hv_interrupt_entry
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (10 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 11/17] asm-generic/hyperv: update hv_msi_entry Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-05 16:58 ` [PATCH v2 13/17] asm-generic/hyperv: introduce hv_device_id and auxiliary structures Wei Liu
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Rob Herring, K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Lorenzo Pieralisi, Bjorn Helgaas, Arnd Bergmann,
	open list:PCI NATIVE HOST BRIDGE AND ENDPOINT DRIVERS,
	open list:GENERIC INCLUDE/ASM HEADER FILES

We will soon use the same structure to handle IO-APIC interrupts as
well. Introduce an enum to identify the source and a data structure for
IO-APIC RTE.

While at it, update pci-hyperv.c to use the enum.

No functional change.

Signed-off-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Rob Herring <robh@kernel.org>
---
 drivers/pci/controller/pci-hyperv.c |  2 +-
 include/asm-generic/hyperv-tlfs.h   | 36 +++++++++++++++++++++++++++--
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 03ed5cb1c4b2..59edc0bf00fe 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1216,7 +1216,7 @@ static void hv_irq_unmask(struct irq_data *data)
 	params = &hbus->retarget_msi_interrupt_params;
 	memset(params, 0, sizeof(*params));
 	params->partition_id = HV_PARTITION_ID_SELF;
-	params->int_entry.source = 1; /* MSI(-X) */
+	params->int_entry.source = HV_INTERRUPT_SOURCE_MSI;
 	hv_set_msi_entry_from_desc(&params->int_entry.msi_entry, msi_desc);
 	params->device_id = (hbus->hdev->dev_instance.b[5] << 24) |
 			   (hbus->hdev->dev_instance.b[4] << 16) |
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 86dff6846278..b6895d4d9ad0 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -480,6 +480,11 @@ struct hv_create_vp {
 	u64 flags;
 };
 
+enum hv_interrupt_source {
+	HV_INTERRUPT_SOURCE_MSI = 1, /* MSI and MSI-X */
+	HV_INTERRUPT_SOURCE_IOAPIC,
+};
+
 union hv_msi_address_register {
 	u32 as_uint32;
 	struct {
@@ -513,10 +518,37 @@ union hv_msi_entry {
 	} __packed;
 };
 
+union hv_ioapic_rte {
+	u64 as_uint64;
+
+	struct {
+		u32 vector:8;
+		u32 delivery_mode:3;
+		u32 destination_mode:1;
+		u32 delivery_status:1;
+		u32 interrupt_polarity:1;
+		u32 remote_irr:1;
+		u32 trigger_mode:1;
+		u32 interrupt_mask:1;
+		u32 reserved1:15;
+
+		u32 reserved2:24;
+		u32 destination_id:8;
+	};
+
+	struct {
+		u32 low_uint32;
+		u32 high_uint32;
+	};
+} __packed;
+
 struct hv_interrupt_entry {
-	u32 source;			/* 1 for MSI(-X) */
+	u32 source;
 	u32 reserved1;
-	union hv_msi_entry msi_entry;
+	union {
+		union hv_msi_entry msi_entry;
+		union hv_ioapic_rte ioapic_rte;
+	};
 } __packed;
 
 /*
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 13/17] asm-generic/hyperv: introduce hv_device_id and auxiliary structures
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (11 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 12/17] asm-generic/hyperv: update hv_interrupt_entry Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-05 16:58 ` [PATCH v2 14/17] asm-generic/hyperv: import data structures for mapping device interrupts Wei Liu
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Arnd Bergmann, open list:GENERIC INCLUDE/ASM HEADER FILES

We will need to identify the device we want Microsoft Hypervisor to
manipulate.  Introduce the data structures for that purpose.

They will be used in a later patch.

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 include/asm-generic/hyperv-tlfs.h | 79 +++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index b6895d4d9ad0..09223e41172c 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -623,4 +623,83 @@ struct hv_set_vp_registers_input {
 	} element[];
 } __packed;
 
+enum hv_device_type {
+	HV_DEVICE_TYPE_LOGICAL = 0,
+	HV_DEVICE_TYPE_PCI = 1,
+	HV_DEVICE_TYPE_IOAPIC = 2,
+	HV_DEVICE_TYPE_ACPI = 3,
+};
+
+typedef u16 hv_pci_rid;
+typedef u16 hv_pci_segment;
+typedef u64 hv_logical_device_id;
+union hv_pci_bdf {
+	u16 as_uint16;
+
+	struct {
+		u8 function:3;
+		u8 device:5;
+		u8 bus;
+	};
+} __packed;
+
+union hv_pci_bus_range {
+	u16 as_uint16;
+
+	struct {
+		u8 subordinate_bus;
+		u8 secondary_bus;
+	};
+} __packed;
+
+union hv_device_id {
+	u64 as_uint64;
+
+	struct {
+		u64 :62;
+		u64 device_type:2;
+	};
+
+	/* HV_DEVICE_TYPE_LOGICAL */
+	struct {
+		u64 id:62;
+		u64 device_type:2;
+	} logical;
+
+	/* HV_DEVICE_TYPE_PCI */
+	struct {
+		union {
+			hv_pci_rid rid;
+			union hv_pci_bdf bdf;
+		};
+
+		hv_pci_segment segment;
+		union hv_pci_bus_range shadow_bus_range;
+
+		u16 phantom_function_bits:2;
+		u16 source_shadow:1;
+
+		u16 rsvdz0:11;
+		u16 device_type:2;
+	} pci;
+
+	/* HV_DEVICE_TYPE_IOAPIC */
+	struct {
+		u8 ioapic_id;
+		u8 rsvdz0;
+		u16 rsvdz1;
+		u16 rsvdz2;
+
+		u16 rsvdz3:14;
+		u16 device_type:2;
+	} ioapic;
+
+	/* HV_DEVICE_TYPE_ACPI */
+	struct {
+		u32 input_mapping_base;
+		u32 input_mapping_count:30;
+		u32 device_type:2;
+	} acpi;
+} __packed;
+
 #endif
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 14/17] asm-generic/hyperv: import data structures for mapping device interrupts
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (12 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 13/17] asm-generic/hyperv: introduce hv_device_id and auxiliary structures Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-05 16:58 ` [PATCH v2 15/17] x86/hyperv: implement an MSI domain for root partition Wei Liu
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Arnd Bergmann,
	open list:GENERIC INCLUDE/ASM HEADER FILES

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 arch/x86/include/asm/hyperv-tlfs.h | 13 +++++++++++
 include/asm-generic/hyperv-tlfs.h  | 36 ++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index 41b628b9fb15..592c75e51e0f 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -526,6 +526,19 @@ struct hv_partition_assist_pg {
 	u32 tlb_lock_count;
 };
 
+enum hv_interrupt_type {
+	HV_X64_INTERRUPT_TYPE_FIXED             = 0x0000,
+	HV_X64_INTERRUPT_TYPE_LOWESTPRIORITY    = 0x0001,
+	HV_X64_INTERRUPT_TYPE_SMI               = 0x0002,
+	HV_X64_INTERRUPT_TYPE_REMOTEREAD        = 0x0003,
+	HV_X64_INTERRUPT_TYPE_NMI               = 0x0004,
+	HV_X64_INTERRUPT_TYPE_INIT              = 0x0005,
+	HV_X64_INTERRUPT_TYPE_SIPI              = 0x0006,
+	HV_X64_INTERRUPT_TYPE_EXTINT            = 0x0007,
+	HV_X64_INTERRUPT_TYPE_LOCALINT0         = 0x0008,
+	HV_X64_INTERRUPT_TYPE_LOCALINT1         = 0x0009,
+	HV_X64_INTERRUPT_TYPE_MAXIMUM           = 0x000A,
+};
 
 #include <asm-generic/hyperv-tlfs.h>
 
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index 09223e41172c..dd385c6a71b5 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -152,6 +152,8 @@ struct ms_hyperv_tsc_page {
 #define HVCALL_RETRIEVE_DEBUG_DATA		0x006a
 #define HVCALL_RESET_DEBUG_SESSION		0x006b
 #define HVCALL_ADD_LOGICAL_PROCESSOR		0x0076
+#define HVCALL_MAP_DEVICE_INTERRUPT		0x007c
+#define HVCALL_UNMAP_DEVICE_INTERRUPT		0x007d
 #define HVCALL_RETARGET_INTERRUPT		0x007e
 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af
 #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0
@@ -702,4 +704,38 @@ union hv_device_id {
 	} acpi;
 } __packed;
 
+enum hv_interrupt_trigger_mode {
+	HV_INTERRUPT_TRIGGER_MODE_EDGE = 0,
+	HV_INTERRUPT_TRIGGER_MODE_LEVEL = 1,
+};
+
+struct hv_device_interrupt_descriptor {
+	u32 interrupt_type;
+	u32 trigger_mode;
+	u32 vector_count;
+	u32 reserved;
+	struct hv_device_interrupt_target target;
+} __packed;
+
+struct hv_input_map_device_interrupt {
+	u64 partition_id;
+	u64 device_id;
+	u64 flags;
+	struct hv_interrupt_entry logical_interrupt_entry;
+	struct hv_device_interrupt_descriptor interrupt_descriptor;
+} __packed;
+
+struct hv_output_map_device_interrupt {
+	struct hv_interrupt_entry interrupt_entry;
+} __packed;
+
+struct hv_input_unmap_device_interrupt {
+	u64 partition_id;
+	u64 device_id;
+	struct hv_interrupt_entry interrupt_entry;
+} __packed;
+
+#define HV_SOURCE_SHADOW_NONE               0x0
+#define HV_SOURCE_SHADOW_BRIDGE_BUS_RANGE   0x1
+
 #endif
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 15/17] x86/hyperv: implement an MSI domain for root partition
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (13 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 14/17] asm-generic/hyperv: import data structures for mapping device interrupts Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-12 13:50   ` Wei Liu
  2020-11-05 16:58 ` [PATCH v2 16/17] x86/ioapic: export a few functions and data structures via io_apic.h Wei Liu
  2020-11-05 16:58 ` [PATCH v2 17/17] x86/hyperv: handle IO-APIC when running as root Wei Liu
  16 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

When Linux runs as the root partition on Microsoft Hypervisor, its
interrupts are remapped.  Linux will need to explicitly map and unmap
interrupts for hardware.

Implement an MSI domain to issue the correct hypercalls. And initialize
this irqdomain as the default MSI irq domain.

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
v2:
This patch is simplified due to upstream changes.
---
 arch/x86/hyperv/Makefile    |   2 +-
 arch/x86/hyperv/hv_init.c   |  10 +-
 arch/x86/hyperv/irqdomain.c | 330 ++++++++++++++++++++++++++++++++++++
 3 files changed, 340 insertions(+), 2 deletions(-)
 create mode 100644 arch/x86/hyperv/irqdomain.c

diff --git a/arch/x86/hyperv/Makefile b/arch/x86/hyperv/Makefile
index 565358020921..2ebcf3969121 100644
--- a/arch/x86/hyperv/Makefile
+++ b/arch/x86/hyperv/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-y			:= hv_init.o mmu.o nested.o
-obj-$(CONFIG_X86_64)	+= hv_apic.o hv_proc.o
+obj-$(CONFIG_X86_64)	+= hv_apic.o hv_proc.o irqdomain.o
 
 ifdef CONFIG_X86_64
 obj-$(CONFIG_PARAVIRT_SPINLOCKS)	+= hv_spinlock.o
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 9fcaf741be99..a46b817b5b2a 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -359,6 +359,8 @@ void __init hv_get_partition_id(void)
 	BUG_ON(status != HV_STATUS_SUCCESS);
 }
 
+extern struct irq_domain *hv_create_pci_msi_domain(void);
+
 /*
  * This function is to be invoked early in the boot sequence after the
  * hypervisor has been detected.
@@ -481,8 +483,14 @@ void __init hyperv_init(void)
 
 	register_syscore_ops(&hv_syscore_ops);
 
-	if (hv_root_partition)
+	/*
+	 * If we're running as root, we want to create our own PCI MSI domain.
+	 * We can't set this in hv_pci_init because that would be too late.
+	 */
+	if (hv_root_partition) {
+		x86_init.irqs.create_pci_msi_domain = hv_create_pci_msi_domain;
 		hv_get_partition_id();
+	}
 
 	return;
 
diff --git a/arch/x86/hyperv/irqdomain.c b/arch/x86/hyperv/irqdomain.c
new file mode 100644
index 000000000000..80109e3cbf8f
--- /dev/null
+++ b/arch/x86/hyperv/irqdomain.c
@@ -0,0 +1,330 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// Irqdomain for Linux to run as the root partition on Microsoft Hypervisor.
+//
+// Authors:
+//   Sunil Muthuswamy <sunilmut@microsoft.com>
+//   Wei Liu <wei.liu@kernel.org>
+
+#include <linux/pci.h>
+#include <linux/irq.h>
+#include <asm/mshyperv.h>
+
+struct rid_data {
+	struct pci_dev *bridge;
+	u32 rid;
+};
+
+static int get_rid_cb(struct pci_dev *pdev, u16 alias, void *data)
+{
+	struct rid_data *rd = data;
+	u8 bus = PCI_BUS_NUM(rd->rid);
+
+	if (pdev->bus->number != bus || PCI_BUS_NUM(alias) != bus) {
+		rd->bridge = pdev;
+		rd->rid = alias;
+	}
+
+	return 0;
+}
+
+static union hv_device_id hv_build_pci_dev_id(struct pci_dev *dev)
+{
+	union hv_device_id dev_id;
+	struct rid_data data = {
+		.bridge = NULL,
+		.rid = PCI_DEVID(dev->bus->number, dev->devfn)
+	};
+
+	pci_for_each_dma_alias(dev, get_rid_cb, &data);
+
+	dev_id.as_uint64 = 0;
+	dev_id.device_type = HV_DEVICE_TYPE_PCI;
+	dev_id.pci.segment = pci_domain_nr(dev->bus);
+
+	dev_id.pci.bdf.bus = PCI_BUS_NUM(data.rid);
+	dev_id.pci.bdf.device = PCI_SLOT(data.rid);
+	dev_id.pci.bdf.function = PCI_FUNC(data.rid);
+	dev_id.pci.source_shadow = HV_SOURCE_SHADOW_NONE;
+
+	if (data.bridge) {
+		int pos;
+
+		/*
+		 * Microsoft Hypervisor requires a bus range when the bridge is
+		 * running in PCI-X mode.
+		 *
+		 * To distinguish conventional vs PCI-X bridge, we can check
+		 * the bridge's PCI-X Secondary Status Register, Secondary Bus
+		 * Mode and Frequency bits. See PCI Express to PCI/PCI-X Bridge
+		 * Specification Revision 1.0 5.2.2.1.3.
+		 *
+		 * Value zero means it is in conventional mode, otherwise it is
+		 * in PCI-X mode.
+		 */
+
+		pos = pci_find_capability(data.bridge, PCI_CAP_ID_PCIX);
+		if (pos) {
+			u16 status;
+
+			pci_read_config_word(data.bridge, pos +
+					PCI_X_BRIDGE_SSTATUS, &status);
+
+			if (status & PCI_X_SSTATUS_FREQ) {
+				/* Non-zero, PCI-X mode */
+				u8 sec_bus, sub_bus;
+
+				dev_id.pci.source_shadow = HV_SOURCE_SHADOW_BRIDGE_BUS_RANGE;
+
+				pci_read_config_byte(data.bridge, PCI_SECONDARY_BUS, &sec_bus);
+				dev_id.pci.shadow_bus_range.secondary_bus = sec_bus;
+				pci_read_config_byte(data.bridge, PCI_SUBORDINATE_BUS, &sub_bus);
+				dev_id.pci.shadow_bus_range.subordinate_bus = sub_bus;
+			}
+		}
+	}
+
+	return dev_id;
+}
+
+static int hv_map_msi_interrupt(struct pci_dev *dev, int vcpu, int vector,
+				struct hv_interrupt_entry *entry)
+{
+	struct hv_input_map_device_interrupt *input;
+	struct hv_output_map_device_interrupt *output;
+	struct hv_device_interrupt_descriptor *intr_desc;
+	unsigned long flags;
+	u16 status;
+
+	local_irq_save(flags);
+
+	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
+	output = *this_cpu_ptr(hyperv_pcpu_output_arg);
+
+	intr_desc = &input->interrupt_descriptor;
+	memset(input, 0, sizeof(*input));
+	input->partition_id = hv_current_partition_id;
+	input->device_id = hv_build_pci_dev_id(dev).as_uint64;
+	intr_desc->interrupt_type = HV_X64_INTERRUPT_TYPE_FIXED;
+	intr_desc->trigger_mode = HV_INTERRUPT_TRIGGER_MODE_EDGE;
+	intr_desc->vector_count = 1;
+	intr_desc->target.vector = vector;
+	__set_bit(vcpu, (unsigned long*)&intr_desc->target.vp_mask);
+
+	status = hv_do_rep_hypercall(HVCALL_MAP_DEVICE_INTERRUPT, 0, 0, input, output) &
+			 HV_HYPERCALL_RESULT_MASK;
+	*entry = output->interrupt_entry;
+
+	local_irq_restore(flags);
+
+	if (status != HV_STATUS_SUCCESS)
+		pr_err("%s: hypercall failed, status %d\n", __func__, status);
+
+	return status;
+}
+
+static inline void entry_to_msi_msg(struct hv_interrupt_entry *entry, struct msi_msg *msg)
+{
+	/* High address is always 0 */
+	msg->address_hi = 0;
+	msg->address_lo = entry->msi_entry.address.as_uint32;
+	msg->data = entry->msi_entry.data.as_uint32;
+}
+
+static int hv_unmap_msi_interrupt(struct pci_dev *dev, struct hv_interrupt_entry *old_entry);
+static void hv_irq_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
+{
+	struct msi_desc *msidesc;
+	struct pci_dev *dev;
+	struct hv_interrupt_entry out_entry, *stored_entry;
+	struct irq_cfg *cfg = irqd_cfg(data);
+	struct cpumask *affinity;
+	int cpu, vcpu;
+	u16 status;
+
+	msidesc = irq_data_get_msi_desc(data);
+	dev = msi_desc_to_pci_dev(msidesc);
+
+	if (!cfg) {
+		pr_debug("%s: cfg is NULL", __func__);
+		return;
+	}
+
+	affinity = irq_data_get_effective_affinity_mask(data);
+	cpu = cpumask_first_and(affinity, cpu_online_mask);
+	vcpu = hv_cpu_number_to_vp_number(cpu);
+
+	if (data->chip_data) {
+		/*
+		 * This interrupt is already mapped. Let's unmap first.
+		 *
+		 * We don't use retarget interrupt hypercalls here because
+		 * Microsoft Hypervisor doens't allow root to change the vector
+		 * or specify VPs outside of the set that is initially used
+		 * during mapping.
+		 */
+		stored_entry = data->chip_data;
+		data->chip_data = NULL;
+
+		status = hv_unmap_msi_interrupt(dev, stored_entry);
+
+		kfree(stored_entry);
+
+		if (status != HV_STATUS_SUCCESS) {
+			pr_debug("%s: failed to unmap, status %d", __func__, status);
+			return;
+		}
+	}
+
+	stored_entry = kzalloc(sizeof(*stored_entry), GFP_ATOMIC);
+	if (!stored_entry) {
+		pr_debug("%s: failed to allocate chip data\n", __func__);
+		return;
+	}
+
+	status = hv_map_msi_interrupt(dev, vcpu, cfg->vector, &out_entry);
+	if (status != HV_STATUS_SUCCESS) {
+		kfree(stored_entry);
+		return;
+	}
+
+	*stored_entry = out_entry;
+	data->chip_data = stored_entry;
+	entry_to_msi_msg(&out_entry, msg);
+
+	return;
+}
+
+static int hv_unmap_interrupt(u64 id, struct hv_interrupt_entry *old_entry)
+{
+	unsigned long flags;
+	struct hv_input_unmap_device_interrupt *input;
+	struct hv_interrupt_entry *intr_entry;
+	u16 status;
+
+	local_irq_save(flags);
+	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
+
+	memset(input, 0, sizeof(*input));
+	intr_entry = &input->interrupt_entry;
+	input->partition_id = hv_current_partition_id;
+	input->device_id = id;
+	*intr_entry = *old_entry;
+
+	status = hv_do_rep_hypercall(HVCALL_UNMAP_DEVICE_INTERRUPT, 0, 0, input, NULL) &
+			 HV_HYPERCALL_RESULT_MASK;
+	local_irq_restore(flags);
+
+	return status;
+}
+
+static int hv_unmap_msi_interrupt(struct pci_dev *dev, struct hv_interrupt_entry *old_entry)
+{
+	return hv_unmap_interrupt(hv_build_pci_dev_id(dev).as_uint64, old_entry)
+		& HV_HYPERCALL_RESULT_MASK;
+}
+
+static void hv_teardown_msi_irq_common(struct pci_dev *dev, struct msi_desc *msidesc, int irq)
+{
+	u16 status;
+	struct hv_interrupt_entry old_entry;
+	struct irq_desc *desc;
+	struct irq_data *data;
+	struct msi_msg msg;
+
+	desc = irq_to_desc(irq);
+	if (!desc) {
+		pr_debug("%s: no irq desc\n", __func__);
+		return;
+	}
+
+	data = &desc->irq_data;
+	if (!data) {
+		pr_debug("%s: no irq data\n", __func__);
+		return;
+	}
+
+	if (!data->chip_data) {
+		pr_debug("%s: no chip data\n!", __func__);
+		return;
+	}
+
+	old_entry = *(struct hv_interrupt_entry *)data->chip_data;
+	entry_to_msi_msg(&old_entry, &msg);
+
+	kfree(data->chip_data);
+	data->chip_data = NULL;
+
+	status = hv_unmap_msi_interrupt(dev, &old_entry);
+
+	if (status != HV_STATUS_SUCCESS) {
+		pr_err("%s: hypercall failed, status %d\n", __func__, status);
+		return;
+	}
+}
+
+static void hv_msi_domain_free_irqs(struct irq_domain *domain, struct device *dev)
+{
+	int i;
+	struct msi_desc *entry;
+	struct pci_dev *pdev;
+
+	if (WARN_ON_ONCE(!dev_is_pci(dev)))
+		return;
+
+	pdev = to_pci_dev(dev);
+
+	for_each_pci_msi_entry(entry, pdev) {
+		if (entry->irq) {
+			for (i = 0; i < entry->nvec_used; i++) {
+				hv_teardown_msi_irq_common(pdev, entry, entry->irq + i);
+				irq_domain_free_irqs(entry->irq + i, 1);
+			}
+		}
+	}
+}
+
+/*
+ * IRQ Chip for MSI PCI/PCI-X/PCI-Express Devices,
+ * which implement the MSI or MSI-X Capability Structure.
+ */
+static struct irq_chip hv_pci_msi_controller = {
+	.name			= "HV-PCI-MSI",
+	.irq_unmask		= pci_msi_unmask_irq,
+	.irq_mask		= pci_msi_mask_irq,
+	.irq_ack		= irq_chip_ack_parent,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_compose_msi_msg	= hv_irq_compose_msi_msg,
+	.irq_set_affinity	= msi_domain_set_affinity,
+	.flags			= IRQCHIP_SKIP_SET_WAKE,
+};
+
+static struct msi_domain_ops pci_msi_domain_ops = {
+	.domain_free_irqs	= hv_msi_domain_free_irqs,
+	.msi_prepare		= pci_msi_prepare,
+};
+
+static struct msi_domain_info hv_pci_msi_domain_info = {
+	.flags		= MSI_FLAG_USE_DEF_DOM_OPS | MSI_FLAG_USE_DEF_CHIP_OPS |
+			  MSI_FLAG_PCI_MSIX,
+	.ops		= &pci_msi_domain_ops,
+	.chip		= &hv_pci_msi_controller,
+	.handler	= handle_edge_irq,
+	.handler_name	= "edge",
+};
+
+struct irq_domain * __init hv_create_pci_msi_domain(void)
+{
+	struct irq_domain *d = NULL;
+	struct fwnode_handle *fn;
+
+	fn = irq_domain_alloc_named_fwnode("HV-PCI-MSI");
+	if (fn)
+		d = pci_msi_create_irq_domain(fn, &hv_pci_msi_domain_info, x86_vector_domain);
+
+	/* No point in going further if we can't get an irq domain */
+	BUG_ON(!d);
+
+	return d;
+}
+
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 16/17] x86/ioapic: export a few functions and data structures via io_apic.h
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (14 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 15/17] x86/hyperv: implement an MSI domain for root partition Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-12 11:39   ` Wei Liu
  2020-11-05 16:58 ` [PATCH v2 17/17] x86/hyperv: handle IO-APIC when running as root Wei Liu
  16 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Joerg Roedel, Bjorn Helgaas, Jon Derrick,
	YueHaibing, Gustavo A. R. Silva

We are about to implement an irqchip for IO-APIC when Linux runs as root
on Microsoft Hypervisor. At the same time we would like to reuse
existing code as much as possible.

Move mp_chip_data to io_apic.h and make a few helper functions
non-static.

No functional change.

Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 arch/x86/include/asm/io_apic.h | 21 +++++++++++++++++++++
 arch/x86/kernel/apic/io_apic.c | 28 +++++++++-------------------
 2 files changed, 30 insertions(+), 19 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index a1a26f6d3aa4..1375983a6028 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -152,6 +152,15 @@ extern unsigned long io_apic_irqs;
 #define io_apic_assign_pci_irqs \
 	(mp_irq_entries && !skip_ioapic_setup && io_apic_irqs)
 
+struct mp_chip_data {
+	struct list_head irq_2_pin;
+	struct IO_APIC_route_entry entry;
+	int trigger;
+	int polarity;
+	u32 count;
+	bool isa_irq;
+};
+
 struct irq_cfg;
 extern void ioapic_insert_resources(void);
 extern int arch_early_ioapic_init(void);
@@ -195,6 +204,18 @@ extern void clear_IO_APIC(void);
 extern void restore_boot_irq_mode(void);
 extern int IO_APIC_get_PCI_irq_vector(int bus, int devfn, int pin);
 extern void print_IO_APICs(void);
+
+struct irq_data;
+extern struct IO_APIC_route_entry ioapic_read_entry(int apic, int pin);
+extern void ioapic_write_entry(int apic, int pin, struct IO_APIC_route_entry e);
+extern void mask_ioapic_irq(struct irq_data *irq_data);
+extern void unmask_ioapic_irq(struct irq_data *irq_data);
+extern int ioapic_set_affinity(struct irq_data *irq_data, const struct cpumask *mask, bool force);
+extern struct irq_domain *mp_ioapic_irqdomain(int ioapic);
+enum irqchip_irq_state;
+extern int ioapic_irq_get_chip_state(struct irq_data *irqd,
+				enum irqchip_irq_state which,
+				bool *state);
 #else  /* !CONFIG_X86_IO_APIC */
 
 #define IO_APIC_IRQ(x)		0
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 7b3c7e0d4a09..23047f98b5e4 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -88,15 +88,6 @@ struct irq_pin_list {
 	int apic, pin;
 };
 
-struct mp_chip_data {
-	struct list_head irq_2_pin;
-	struct IO_APIC_route_entry entry;
-	int trigger;
-	int polarity;
-	u32 count;
-	bool isa_irq;
-};
-
 struct mp_ioapic_gsi {
 	u32 gsi_base;
 	u32 gsi_end;
@@ -154,7 +145,7 @@ static inline bool mp_is_legacy_irq(int irq)
 	return irq >= 0 && irq < nr_legacy_irqs();
 }
 
-static inline struct irq_domain *mp_ioapic_irqdomain(int ioapic)
+struct irq_domain *mp_ioapic_irqdomain(int ioapic)
 {
 	return ioapics[ioapic].irqdomain;
 }
@@ -301,7 +292,7 @@ static struct IO_APIC_route_entry __ioapic_read_entry(int apic, int pin)
 	return eu.entry;
 }
 
-static struct IO_APIC_route_entry ioapic_read_entry(int apic, int pin)
+struct IO_APIC_route_entry ioapic_read_entry(int apic, int pin)
 {
 	union entry_union eu;
 	unsigned long flags;
@@ -328,7 +319,7 @@ static void __ioapic_write_entry(int apic, int pin, struct IO_APIC_route_entry e
 	io_apic_write(apic, 0x10 + 2*pin, eu.w1);
 }
 
-static void ioapic_write_entry(int apic, int pin, struct IO_APIC_route_entry e)
+void ioapic_write_entry(int apic, int pin, struct IO_APIC_route_entry e)
 {
 	unsigned long flags;
 
@@ -453,7 +444,7 @@ static void io_apic_sync(struct irq_pin_list *entry)
 	readl(&io_apic->data);
 }
 
-static void mask_ioapic_irq(struct irq_data *irq_data)
+void mask_ioapic_irq(struct irq_data *irq_data)
 {
 	struct mp_chip_data *data = irq_data->chip_data;
 	unsigned long flags;
@@ -468,7 +459,7 @@ static void __unmask_ioapic(struct mp_chip_data *data)
 	io_apic_modify_irq(data, ~IO_APIC_REDIR_MASKED, 0, NULL);
 }
 
-static void unmask_ioapic_irq(struct irq_data *irq_data)
+void unmask_ioapic_irq(struct irq_data *irq_data)
 {
 	struct mp_chip_data *data = irq_data->chip_data;
 	unsigned long flags;
@@ -1868,8 +1859,7 @@ static void ioapic_configure_entry(struct irq_data *irqd)
 		__ioapic_write_entry(entry->apic, entry->pin, mpd->entry);
 }
 
-static int ioapic_set_affinity(struct irq_data *irq_data,
-			       const struct cpumask *mask, bool force)
+int ioapic_set_affinity(struct irq_data *irq_data, const struct cpumask *mask, bool force)
 {
 	struct irq_data *parent = irq_data->parent_data;
 	unsigned long flags;
@@ -1898,9 +1888,9 @@ static int ioapic_set_affinity(struct irq_data *irq_data,
  *
  * Verify that the corresponding Remote-IRR bits are clear.
  */
-static int ioapic_irq_get_chip_state(struct irq_data *irqd,
-				   enum irqchip_irq_state which,
-				   bool *state)
+int ioapic_irq_get_chip_state(struct irq_data *irqd,
+				enum irqchip_irq_state which,
+				bool *state)
 {
 	struct mp_chip_data *mcd = irqd->chip_data;
 	struct IO_APIC_route_entry rentry;
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH v2 17/17] x86/hyperv: handle IO-APIC when running as root
  2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
                   ` (15 preceding siblings ...)
  2020-11-05 16:58 ` [PATCH v2 16/17] x86/ioapic: export a few functions and data structures via io_apic.h Wei Liu
@ 2020-11-05 16:58 ` Wei Liu
  2020-11-12 16:56   ` Vitaly Kuznetsov
  16 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-05 16:58 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

Just like MSI/MSI-X, IO-APIC interrupts are remapped by Microsoft
Hypervisor when Linux runs as the root partition. Implement an IRQ chip
to handle mapping and unmapping of IO-APIC interrupts.

Use custom functions for mapping and unmapping ACPI GSIs. They will
issue Microsoft Hypervisor specific hypercalls on top of the native
routines.

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
---
 arch/x86/hyperv/hv_init.c   |  13 +++
 arch/x86/hyperv/irqdomain.c | 226 ++++++++++++++++++++++++++++++++++++
 2 files changed, 239 insertions(+)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index a46b817b5b2a..2c2189832da7 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -267,10 +267,23 @@ static int hv_cpu_die(unsigned int cpu)
 	return 0;
 }
 
+int hv_acpi_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity);
+void hv_acpi_unregister_gsi(u32 gsi);
+
+extern int (*native_acpi_register_gsi)(struct device *dev, u32 gsi, int trigger, int polarity);
+extern void (*native_acpi_unregister_gsi)(u32 gsi);
+
 static int __init hv_pci_init(void)
 {
 	int gen2vm = efi_enabled(EFI_BOOT);
 
+	if (hv_root_partition) {
+		native_acpi_register_gsi = __acpi_register_gsi;
+		native_acpi_unregister_gsi = __acpi_unregister_gsi;
+		__acpi_register_gsi = hv_acpi_register_gsi;
+		__acpi_unregister_gsi = hv_acpi_unregister_gsi;
+	}
+
 	/*
 	 * For Generation-2 VM, we exit from pci_arch_init() by returning 0.
 	 * The purpose is to suppress the harmless warning:
diff --git a/arch/x86/hyperv/irqdomain.c b/arch/x86/hyperv/irqdomain.c
index 80109e3cbf8f..ae9a728589a4 100644
--- a/arch/x86/hyperv/irqdomain.c
+++ b/arch/x86/hyperv/irqdomain.c
@@ -9,6 +9,8 @@
 #include <linux/pci.h>
 #include <linux/irq.h>
 #include <asm/mshyperv.h>
+#include <asm/apic.h>
+#include <asm/io_apic.h>
 
 struct rid_data {
 	struct pci_dev *bridge;
@@ -328,3 +330,227 @@ struct irq_domain * __init hv_create_pci_msi_domain(void)
 	return d;
 }
 
+/* Copied from io_apic.c */
+union entry_union {
+	struct { u32 w1, w2; };
+	struct IO_APIC_route_entry entry;
+};
+
+static int hv_unmap_ioapic_interrupt(int gsi)
+{
+	union hv_device_id device_id;
+	int ioapic, ioapic_id;
+	u8 ioapic_pin;
+	struct IO_APIC_route_entry ire;
+	union entry_union eu;
+	struct hv_interrupt_entry entry;
+
+	ioapic = mp_find_ioapic(gsi);
+	ioapic_pin = mp_find_ioapic_pin(ioapic, gsi);
+	ioapic_id = mpc_ioapic_id(ioapic);
+	ire = ioapic_read_entry(ioapic, ioapic_pin);
+
+	eu.entry = ire;
+
+	/*
+	 * Polarity may have been set by us, but Hyper-V expects the exact same
+	 * entry. See the mapping routine.
+	 */
+	eu.entry.polarity = 0;
+
+	memset(&entry, 0, sizeof(entry));
+	entry.source = HV_INTERRUPT_SOURCE_IOAPIC;
+	entry.ioapic_rte.low_uint32 = eu.w1;
+	entry.ioapic_rte.high_uint32 = eu.w2;
+
+	device_id.as_uint64 = 0;
+	device_id.device_type = HV_DEVICE_TYPE_IOAPIC;
+	device_id.ioapic.ioapic_id = (u8)ioapic_id;
+
+	return hv_unmap_interrupt(device_id.as_uint64, &entry) & HV_HYPERCALL_RESULT_MASK;
+}
+
+static int hv_map_ioapic_interrupt(int ioapic_id, int trigger, int vcpu, int vector,
+		struct hv_interrupt_entry *out_entry)
+{
+	unsigned long flags;
+	struct hv_input_map_device_interrupt *input;
+	struct hv_output_map_device_interrupt *output;
+	union hv_device_id device_id;
+	struct hv_device_interrupt_descriptor *intr_desc;
+	u16 status;
+
+	device_id.as_uint64 = 0;
+	device_id.device_type = HV_DEVICE_TYPE_IOAPIC;
+	device_id.ioapic.ioapic_id = (u8)ioapic_id;
+
+	local_irq_save(flags);
+	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
+	output = *this_cpu_ptr(hyperv_pcpu_output_arg);
+	memset(input, 0, sizeof(*input));
+	intr_desc = &input->interrupt_descriptor;
+	input->partition_id = hv_current_partition_id;
+	input->device_id = device_id.as_uint64;
+	intr_desc->interrupt_type = HV_X64_INTERRUPT_TYPE_FIXED;
+	intr_desc->target.vector = vector;
+	intr_desc->vector_count = 1;
+
+	if (trigger)
+		intr_desc->trigger_mode = HV_INTERRUPT_TRIGGER_MODE_LEVEL;
+	else
+		intr_desc->trigger_mode = HV_INTERRUPT_TRIGGER_MODE_EDGE;
+
+	__set_bit(vcpu, (unsigned long *)&intr_desc->target.vp_mask);
+
+	status = hv_do_rep_hypercall(HVCALL_MAP_DEVICE_INTERRUPT, 0, 0, input, output) &
+			 HV_HYPERCALL_RESULT_MASK;
+	local_irq_restore(flags);
+
+	*out_entry = output->interrupt_entry;
+
+	return status;
+}
+
+static unsigned int hv_ioapic_startup_irq(struct irq_data *data)
+{
+	u16 status;
+	struct IO_APIC_route_entry ire;
+	u32 vector;
+	struct irq_cfg *cfg;
+	int ioapic;
+	u8 ioapic_pin;
+	int ioapic_id;
+	int gsi;
+	union entry_union eu;
+	struct cpumask *affinity;
+	int cpu, vcpu;
+	struct hv_interrupt_entry entry;
+	struct mp_chip_data *mp_data = data->chip_data;
+
+	gsi = data->irq;
+	cfg = irqd_cfg(data);
+	affinity = irq_data_get_effective_affinity_mask(data);
+	cpu = cpumask_first_and(affinity, cpu_online_mask);
+	vcpu = hv_cpu_number_to_vp_number(cpu);
+
+	vector = cfg->vector;
+
+	ioapic = mp_find_ioapic(gsi);
+	ioapic_pin = mp_find_ioapic_pin(ioapic, gsi);
+	ioapic_id = mpc_ioapic_id(ioapic);
+	ire = ioapic_read_entry(ioapic, ioapic_pin);
+
+	/*
+	 * Always try unmapping. We do not have visibility into which whether
+	 * an IO-APIC has been mapped or not. We can't use chip_data because it
+	 * already points to mp_data.
+	 *
+	 * We don't use retarget interrupt hypercalls here because Hyper-V
+	 * doens't allow root to change the vector or specify VPs outside of
+	 * the set that is initially used during mapping.
+	 */
+	status = hv_unmap_ioapic_interrupt(gsi);
+
+	if (!(status == HV_STATUS_SUCCESS || status == HV_STATUS_INVALID_PARAMETER)) {
+		pr_debug("%s: unexpected unmap status %d\n", __func__, status);
+		return -1;
+	}
+
+	status = hv_map_ioapic_interrupt(ioapic_id, ire.trigger, vcpu, vector, &entry);
+
+	if (status != HV_STATUS_SUCCESS) {
+		pr_err("%s: map hypercall failed, status %d\n", __func__, status);
+		return -1;
+	}
+
+	/* Update the entry in mp_chip_data. It is used in other places. */
+	mp_data->entry = *(struct IO_APIC_route_entry *)&entry.ioapic_rte;
+
+	/* Sync polarity -- Hyper-V's returned polarity is always 0... */
+	mp_data->entry.polarity = ire.polarity;
+
+	eu.w1 = entry.ioapic_rte.low_uint32;
+	eu.w2 = entry.ioapic_rte.high_uint32;
+	ioapic_write_entry(ioapic, ioapic_pin, eu.entry);
+
+	return 0;
+}
+
+static void hv_ioapic_mask_irq(struct irq_data *data)
+{
+	mask_ioapic_irq(data);
+}
+
+static void hv_ioapic_unmask_irq(struct irq_data *data)
+{
+	unmask_ioapic_irq(data);
+}
+
+static int hv_ioapic_set_affinity(struct irq_data *data,
+			       const struct cpumask *mask, bool force)
+{
+	/*
+	 * We only update the affinity mask here. Programming the hardware is
+	 * done in irq_startup.
+	 */
+	return ioapic_set_affinity(data, mask, force);
+}
+
+void hv_ioapic_ack_level(struct irq_data *irq_data)
+{
+	/*
+	 * Per email exchange with Hyper-V team, all is needed is write to
+	 * LAPIC's EOI register. They don't support directed EOI to IO-APIC.
+	 * Hyper-V handles it for us.
+	 */
+	apic_ack_irq(irq_data);
+}
+
+struct irq_chip hv_ioapic_chip __read_mostly = {
+	.name			= "HV-IO-APIC",
+	.irq_startup		= hv_ioapic_startup_irq,
+	.irq_mask		= hv_ioapic_mask_irq,
+	.irq_unmask		= hv_ioapic_unmask_irq,
+	.irq_ack		= irq_chip_ack_parent,
+	.irq_eoi		= hv_ioapic_ack_level,
+	.irq_set_affinity	= hv_ioapic_set_affinity,
+	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_get_irqchip_state	= ioapic_irq_get_chip_state,
+	.flags			= IRQCHIP_SKIP_SET_WAKE,
+};
+
+
+int (*native_acpi_register_gsi)(struct device *dev, u32 gsi, int trigger, int polarity);
+void (*native_acpi_unregister_gsi)(u32 gsi);
+
+int hv_acpi_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity)
+{
+	int irq = gsi;
+
+#ifdef CONFIG_X86_IO_APIC
+	irq = native_acpi_register_gsi(dev, gsi, trigger, polarity);
+	if (irq < 0) {
+		pr_err("native_acpi_register_gsi failed %d\n", irq);
+		return irq;
+	}
+
+	if (trigger) {
+		irq_set_status_flags(irq, IRQ_LEVEL);
+		irq_set_chip_and_handler_name(irq, &hv_ioapic_chip,
+			handle_fasteoi_irq, "ioapic-fasteoi");
+	} else {
+		irq_clear_status_flags(irq, IRQ_LEVEL);
+		irq_set_chip_and_handler_name(irq, &hv_ioapic_chip,
+			handle_edge_irq, "ioapic-edge");
+	}
+#endif
+	return irq;
+}
+
+void hv_acpi_unregister_gsi(u32 gsi)
+{
+#ifdef CONFIG_X86_IO_APIC
+	(void)hv_unmap_ioapic_interrupt(gsi);
+	native_acpi_unregister_gsi(gsi);
+#endif
+}
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition
  2020-11-05 16:57 ` [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition Wei Liu
@ 2020-11-05 19:16   ` kernel test robot
  2020-11-12 11:42     ` Wei Liu
  2020-11-05 20:23   ` kernel test robot
  2020-11-12 15:16   ` Vitaly Kuznetsov
  2 siblings, 1 reply; 58+ messages in thread
From: kernel test robot @ 2020-11-05 19:16 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: kbuild-all, virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang

[-- Attachment #1: Type: text/plain, Size: 4886 bytes --]

Hi Wei,

I love your patch! Yet something to improve:

[auto build test ERROR on tip/x86/core]
[also build test ERROR on asm-generic/master iommu/next tip/timers/core pci/next linus/master v5.10-rc2 next-20201105]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Wei-Liu/Introducing-Linux-root-partition-support-for-Microsoft-Hypervisor/20201106-010058
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 238c91115cd05c71447ea071624a4c9fe661f970
config: x86_64-randconfig-m001-20201104 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/3984ce0be67e74b8945288f1751a91615459f75e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Wei-Liu/Introducing-Linux-root-partition-support-for-Microsoft-Hypervisor/20201106-010058
        git checkout 3984ce0be67e74b8945288f1751a91615459f75e
        # save the attached .config to linux build tree
        make W=1 ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   arch/x86/kernel/cpu/mshyperv.c: In function 'ms_hyperv_init_platform':
>> arch/x86/kernel/cpu/mshyperv.c:247:3: error: 'hv_root_partition' undeclared (first use in this function); did you mean 'blk_drop_partitions'?
     247 |   hv_root_partition = true;
         |   ^~~~~~~~~~~~~~~~~
         |   blk_drop_partitions
   arch/x86/kernel/cpu/mshyperv.c:247:3: note: each undeclared identifier is reported only once for each function it appears in

vim +247 arch/x86/kernel/cpu/mshyperv.c

   218	
   219		/*
   220		 * Extract the features and hints
   221		 */
   222		ms_hyperv.features = cpuid_eax(HYPERV_CPUID_FEATURES);
   223		ms_hyperv.misc_features = cpuid_edx(HYPERV_CPUID_FEATURES);
   224		ms_hyperv.hints    = cpuid_eax(HYPERV_CPUID_ENLIGHTMENT_INFO);
   225	
   226		pr_info("Hyper-V: features 0x%x, hints 0x%x, misc 0x%x\n",
   227			ms_hyperv.features, ms_hyperv.hints, ms_hyperv.misc_features);
   228	
   229		ms_hyperv.max_vp_index = cpuid_eax(HYPERV_CPUID_IMPLEMENT_LIMITS);
   230		ms_hyperv.max_lp_index = cpuid_ebx(HYPERV_CPUID_IMPLEMENT_LIMITS);
   231	
   232		pr_debug("Hyper-V: max %u virtual processors, %u logical processors\n",
   233			 ms_hyperv.max_vp_index, ms_hyperv.max_lp_index);
   234	
   235		/*
   236		 * Check CPU management privilege.
   237		 *
   238		 * To mirror what Windows does we should extract CPU management
   239		 * features and use the ReservedIdentityBit to detect if Linux is the
   240		 * root partition. But that requires negotiating CPU management
   241		 * interface (a process to be finalized).
   242		 *
   243		 * For now, use the privilege flag as the indicator for running as
   244		 * root.
   245		 */
   246		if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_CPU_MANAGEMENT) {
 > 247			hv_root_partition = true;
   248			pr_info("Hyper-V: running as root partition\n");
   249		}
   250	
   251		/*
   252		 * Extract host information.
   253		 */
   254		if (cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS) >=
   255		    HYPERV_CPUID_VERSION) {
   256			hv_host_info_eax = cpuid_eax(HYPERV_CPUID_VERSION);
   257			hv_host_info_ebx = cpuid_ebx(HYPERV_CPUID_VERSION);
   258			hv_host_info_ecx = cpuid_ecx(HYPERV_CPUID_VERSION);
   259			hv_host_info_edx = cpuid_edx(HYPERV_CPUID_VERSION);
   260	
   261			pr_info("Hyper-V Host Build:%d-%d.%d-%d-%d.%d\n",
   262				hv_host_info_eax, hv_host_info_ebx >> 16,
   263				hv_host_info_ebx & 0xFFFF, hv_host_info_ecx,
   264				hv_host_info_edx >> 24, hv_host_info_edx & 0xFFFFFF);
   265		}
   266	
   267		if (ms_hyperv.features & HV_X64_ACCESS_FREQUENCY_MSRS &&
   268		    ms_hyperv.misc_features & HV_FEATURE_FREQUENCY_MSRS_AVAILABLE) {
   269			x86_platform.calibrate_tsc = hv_get_tsc_khz;
   270			x86_platform.calibrate_cpu = hv_get_tsc_khz;
   271		}
   272	
   273		if (ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED) {
   274			ms_hyperv.nested_features =
   275				cpuid_eax(HYPERV_CPUID_NESTED_FEATURES);
   276		}
   277	
   278		/*
   279		 * Hyper-V expects to get crash register data or kmsg when
   280		 * crash enlightment is available and system crashes. Set
   281		 * crash_kexec_post_notifiers to be true to make sure that
   282		 * calling crash enlightment interface before running kdump
   283		 * kernel.
   284		 */
   285		if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)
   286			crash_kexec_post_notifiers = true;
   287	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 30462 bytes --]

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary
  2020-11-05 16:58 ` [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary Wei Liu
@ 2020-11-05 20:07   ` kernel test robot
  2020-11-12 11:54     ` Wei Liu
  2020-11-05 20:20   ` kernel test robot
  2020-11-12 15:44   ` Vitaly Kuznetsov
  2 siblings, 1 reply; 58+ messages in thread
From: kernel test robot @ 2020-11-05 20:07 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: kbuild-all, virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan

[-- Attachment #1: Type: text/plain, Size: 2468 bytes --]

Hi Wei,

I love your patch! Perhaps something to improve:

[auto build test WARNING on tip/x86/core]
[also build test WARNING on asm-generic/master iommu/next tip/timers/core pci/next linus/master v5.10-rc2 next-20201105]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Wei-Liu/Introducing-Linux-root-partition-support-for-Microsoft-Hypervisor/20201106-010058
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 238c91115cd05c71447ea071624a4c9fe661f970
config: i386-randconfig-r002-20201104 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/83c03b4e30e729a77688b8c0ffeffa2a555dcce7
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Wei-Liu/Introducing-Linux-root-partition-support-for-Microsoft-Hypervisor/20201106-010058
        git checkout 83c03b4e30e729a77688b8c0ffeffa2a555dcce7
        # save the attached .config to linux build tree
        make W=1 ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/x86/hyperv/hv_init.c:341:13: warning: no previous prototype for 'hv_get_partition_id' [-Wmissing-prototypes]
     341 | void __init hv_get_partition_id(void)
         |             ^~~~~~~~~~~~~~~~~~~

vim +/hv_get_partition_id +341 arch/x86/hyperv/hv_init.c

   340	
 > 341	void __init hv_get_partition_id(void)
   342	{
   343		struct hv_get_partition_id *output_page;
   344		u16 status;
   345		unsigned long flags;
   346	
   347		local_irq_save(flags);
   348		output_page = *this_cpu_ptr(hyperv_pcpu_output_arg);
   349		status = hv_do_hypercall(HVCALL_GET_PARTITION_ID, NULL, output_page) &
   350			HV_HYPERCALL_RESULT_MASK;
   351		if (status != HV_STATUS_SUCCESS)
   352			pr_err("Failed to get partition ID: %d\n", status);
   353		else
   354			hv_current_partition_id = output_page->partition_id;
   355		local_irq_restore(flags);
   356	
   357		/* No point in proceeding if this failed */
   358		BUG_ON(status != HV_STATUS_SUCCESS);
   359	}
   360	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 34684 bytes --]

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary
  2020-11-05 16:58 ` [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary Wei Liu
  2020-11-05 20:07   ` kernel test robot
@ 2020-11-05 20:20   ` kernel test robot
  2020-11-12 15:44   ` Vitaly Kuznetsov
  2 siblings, 0 replies; 58+ messages in thread
From: kernel test robot @ 2020-11-05 20:20 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: kbuild-all, virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan

[-- Attachment #1: Type: text/plain, Size: 2583 bytes --]

Hi Wei,

I love your patch! Perhaps something to improve:

[auto build test WARNING on tip/x86/core]
[also build test WARNING on asm-generic/master iommu/next tip/timers/core pci/next linus/master v5.10-rc2 next-20201105]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Wei-Liu/Introducing-Linux-root-partition-support-for-Microsoft-Hypervisor/20201106-010058
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 238c91115cd05c71447ea071624a4c9fe661f970
config: x86_64-randconfig-s021-20201105 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.3-76-gf680124b-dirty
        # https://github.com/0day-ci/linux/commit/83c03b4e30e729a77688b8c0ffeffa2a555dcce7
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Wei-Liu/Introducing-Linux-root-partition-support-for-Microsoft-Hypervisor/20201106-010058
        git checkout 83c03b4e30e729a77688b8c0ffeffa2a555dcce7
        # save the attached .config to linux build tree
        make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

>> arch/x86/hyperv/hv_init.c:341:13: warning: no previous prototype for 'hv_get_partition_id' [-Wmissing-prototypes]
     341 | void __init hv_get_partition_id(void)
         |             ^~~~~~~~~~~~~~~~~~~

vim +/hv_get_partition_id +341 arch/x86/hyperv/hv_init.c

   340	
 > 341	void __init hv_get_partition_id(void)
   342	{
   343		struct hv_get_partition_id *output_page;
   344		u16 status;
   345		unsigned long flags;
   346	
   347		local_irq_save(flags);
   348		output_page = *this_cpu_ptr(hyperv_pcpu_output_arg);
   349		status = hv_do_hypercall(HVCALL_GET_PARTITION_ID, NULL, output_page) &
   350			HV_HYPERCALL_RESULT_MASK;
   351		if (status != HV_STATUS_SUCCESS)
   352			pr_err("Failed to get partition ID: %d\n", status);
   353		else
   354			hv_current_partition_id = output_page->partition_id;
   355		local_irq_restore(flags);
   356	
   357		/* No point in proceeding if this failed */
   358		BUG_ON(status != HV_STATUS_SUCCESS);
   359	}
   360	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 40607 bytes --]

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition
  2020-11-05 16:57 ` [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition Wei Liu
  2020-11-05 19:16   ` kernel test robot
@ 2020-11-05 20:23   ` kernel test robot
  2020-11-12 15:16   ` Vitaly Kuznetsov
  2 siblings, 0 replies; 58+ messages in thread
From: kernel test robot @ 2020-11-05 20:23 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: kbuild-all, clang-built-linux, virtualization, Linux Kernel List,
	Michael Kelley, Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves,
	Wei Liu, K. Y. Srinivasan, Haiyang Zhang

[-- Attachment #1: Type: text/plain, Size: 5016 bytes --]

Hi Wei,

I love your patch! Yet something to improve:

[auto build test ERROR on tip/x86/core]
[also build test ERROR on asm-generic/master iommu/next tip/timers/core pci/next linus/master v5.10-rc2 next-20201105]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Wei-Liu/Introducing-Linux-root-partition-support-for-Microsoft-Hypervisor/20201106-010058
base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 238c91115cd05c71447ea071624a4c9fe661f970
config: x86_64-randconfig-a004-20201104 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 09ec07827b1128504457a93dee80b2ceee1af600)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/3984ce0be67e74b8945288f1751a91615459f75e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Wei-Liu/Introducing-Linux-root-partition-support-for-Microsoft-Hypervisor/20201106-010058
        git checkout 3984ce0be67e74b8945288f1751a91615459f75e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> arch/x86/kernel/cpu/mshyperv.c:247:3: error: use of undeclared identifier 'hv_root_partition'
                   hv_root_partition = true;
                   ^
   1 error generated.

vim +/hv_root_partition +247 arch/x86/kernel/cpu/mshyperv.c

   218	
   219		/*
   220		 * Extract the features and hints
   221		 */
   222		ms_hyperv.features = cpuid_eax(HYPERV_CPUID_FEATURES);
   223		ms_hyperv.misc_features = cpuid_edx(HYPERV_CPUID_FEATURES);
   224		ms_hyperv.hints    = cpuid_eax(HYPERV_CPUID_ENLIGHTMENT_INFO);
   225	
   226		pr_info("Hyper-V: features 0x%x, hints 0x%x, misc 0x%x\n",
   227			ms_hyperv.features, ms_hyperv.hints, ms_hyperv.misc_features);
   228	
   229		ms_hyperv.max_vp_index = cpuid_eax(HYPERV_CPUID_IMPLEMENT_LIMITS);
   230		ms_hyperv.max_lp_index = cpuid_ebx(HYPERV_CPUID_IMPLEMENT_LIMITS);
   231	
   232		pr_debug("Hyper-V: max %u virtual processors, %u logical processors\n",
   233			 ms_hyperv.max_vp_index, ms_hyperv.max_lp_index);
   234	
   235		/*
   236		 * Check CPU management privilege.
   237		 *
   238		 * To mirror what Windows does we should extract CPU management
   239		 * features and use the ReservedIdentityBit to detect if Linux is the
   240		 * root partition. But that requires negotiating CPU management
   241		 * interface (a process to be finalized).
   242		 *
   243		 * For now, use the privilege flag as the indicator for running as
   244		 * root.
   245		 */
   246		if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_CPU_MANAGEMENT) {
 > 247			hv_root_partition = true;
   248			pr_info("Hyper-V: running as root partition\n");
   249		}
   250	
   251		/*
   252		 * Extract host information.
   253		 */
   254		if (cpuid_eax(HYPERV_CPUID_VENDOR_AND_MAX_FUNCTIONS) >=
   255		    HYPERV_CPUID_VERSION) {
   256			hv_host_info_eax = cpuid_eax(HYPERV_CPUID_VERSION);
   257			hv_host_info_ebx = cpuid_ebx(HYPERV_CPUID_VERSION);
   258			hv_host_info_ecx = cpuid_ecx(HYPERV_CPUID_VERSION);
   259			hv_host_info_edx = cpuid_edx(HYPERV_CPUID_VERSION);
   260	
   261			pr_info("Hyper-V Host Build:%d-%d.%d-%d-%d.%d\n",
   262				hv_host_info_eax, hv_host_info_ebx >> 16,
   263				hv_host_info_ebx & 0xFFFF, hv_host_info_ecx,
   264				hv_host_info_edx >> 24, hv_host_info_edx & 0xFFFFFF);
   265		}
   266	
   267		if (ms_hyperv.features & HV_X64_ACCESS_FREQUENCY_MSRS &&
   268		    ms_hyperv.misc_features & HV_FEATURE_FREQUENCY_MSRS_AVAILABLE) {
   269			x86_platform.calibrate_tsc = hv_get_tsc_khz;
   270			x86_platform.calibrate_cpu = hv_get_tsc_khz;
   271		}
   272	
   273		if (ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED) {
   274			ms_hyperv.nested_features =
   275				cpuid_eax(HYPERV_CPUID_NESTED_FEATURES);
   276		}
   277	
   278		/*
   279		 * Hyper-V expects to get crash register data or kmsg when
   280		 * crash enlightment is available and system crashes. Set
   281		 * crash_kexec_post_notifiers to be true to make sure that
   282		 * calling crash enlightment interface before running kdump
   283		 * kernel.
   284		 */
   285		if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)
   286			crash_kexec_post_notifiers = true;
   287	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 36937 bytes --]

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if running as root
  2020-11-05 16:58 ` [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if " Wei Liu
@ 2020-11-12  9:56   ` Daniel Lezcano
  2020-11-12 11:24     ` Wei Liu
  2020-11-12 15:30   ` Vitaly Kuznetsov
  1 sibling, 1 reply; 58+ messages in thread
From: Daniel Lezcano @ 2020-11-12  9:56 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner

On 05/11/2020 17:58, Wei Liu wrote:
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---

I would like to apply this patch but the changelog is too short (one line).

Please add a small paragraph (no need to resend just answer here, I will
amend the log myself.

>  drivers/clocksource/hyperv_timer.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index ba04cb381cd3..269a691bd2c4 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -426,6 +426,9 @@ static bool __init hv_init_tsc_clocksource(void)
>  	if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
>  		return false;
>  
> +	if (hv_root_partition)
> +		return false;
> +
>  	hv_read_reference_counter = read_hv_clock_tsc;
>  	phys_addr = virt_to_phys(hv_get_tsc_page());
>  
> 


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if running as root
  2020-11-12  9:56   ` Daniel Lezcano
@ 2020-11-12 11:24     ` Wei Liu
  2020-11-12 11:40       ` Daniel Lezcano
  0 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-12 11:24 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Thomas Gleixner

On Thu, Nov 12, 2020 at 10:56:17AM +0100, Daniel Lezcano wrote:
> On 05/11/2020 17:58, Wei Liu wrote:
> > Signed-off-by: Wei Liu <wei.liu@kernel.org>
> > ---
> 
> I would like to apply this patch but the changelog is too short (one line).
> 
> Please add a small paragraph (no need to resend just answer here, I will
> amend the log myself.

Please don't apply this to your tree. It is dependent on a previous
patch. I expect this series to go through the hyperv tree.

I will add in this small paragraph in a later version:

    When Linux runs as the root partition, the setup required for TSC page
    is different. Luckily Linux also has access to the MSR based
    clocksource. We can just disable the TSC page clocksource if Linux is
    the root partition.

If you're happy with the description, I would love to have an ack from
you. I will funnel the patch via the hyperv tree.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 16/17] x86/ioapic: export a few functions and data structures via io_apic.h
  2020-11-05 16:58 ` [PATCH v2 16/17] x86/ioapic: export a few functions and data structures via io_apic.h Wei Liu
@ 2020-11-12 11:39   ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-12 11:39 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Joerg Roedel, Bjorn Helgaas, Jon Derrick,
	YueHaibing, Gustavo A. R. Silva

On Thu, Nov 05, 2020 at 04:58:13PM +0000, Wei Liu wrote:
> We are about to implement an irqchip for IO-APIC when Linux runs as root
> on Microsoft Hypervisor. At the same time we would like to reuse
> existing code as much as possible.
> 
> Move mp_chip_data to io_apic.h and make a few helper functions
> non-static.
> 
> No functional change.
> 
> Signed-off-by: Wei Liu <wei.liu@kernel.org>

x86 maintainers, any comment on this patch?

This is the only patch in this series that's not MSHV specific.

Wei.

> ---
>  arch/x86/include/asm/io_apic.h | 21 +++++++++++++++++++++
>  arch/x86/kernel/apic/io_apic.c | 28 +++++++++-------------------
>  2 files changed, 30 insertions(+), 19 deletions(-)
> 
> diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
> index a1a26f6d3aa4..1375983a6028 100644
> --- a/arch/x86/include/asm/io_apic.h
> +++ b/arch/x86/include/asm/io_apic.h
> @@ -152,6 +152,15 @@ extern unsigned long io_apic_irqs;
>  #define io_apic_assign_pci_irqs \
>  	(mp_irq_entries && !skip_ioapic_setup && io_apic_irqs)
>  
> +struct mp_chip_data {
> +	struct list_head irq_2_pin;
> +	struct IO_APIC_route_entry entry;
> +	int trigger;
> +	int polarity;
> +	u32 count;
> +	bool isa_irq;
> +};
> +
>  struct irq_cfg;
>  extern void ioapic_insert_resources(void);
>  extern int arch_early_ioapic_init(void);
> @@ -195,6 +204,18 @@ extern void clear_IO_APIC(void);
>  extern void restore_boot_irq_mode(void);
>  extern int IO_APIC_get_PCI_irq_vector(int bus, int devfn, int pin);
>  extern void print_IO_APICs(void);
> +
> +struct irq_data;
> +extern struct IO_APIC_route_entry ioapic_read_entry(int apic, int pin);
> +extern void ioapic_write_entry(int apic, int pin, struct IO_APIC_route_entry e);
> +extern void mask_ioapic_irq(struct irq_data *irq_data);
> +extern void unmask_ioapic_irq(struct irq_data *irq_data);
> +extern int ioapic_set_affinity(struct irq_data *irq_data, const struct cpumask *mask, bool force);
> +extern struct irq_domain *mp_ioapic_irqdomain(int ioapic);
> +enum irqchip_irq_state;
> +extern int ioapic_irq_get_chip_state(struct irq_data *irqd,
> +				enum irqchip_irq_state which,
> +				bool *state);
>  #else  /* !CONFIG_X86_IO_APIC */
>  
>  #define IO_APIC_IRQ(x)		0
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 7b3c7e0d4a09..23047f98b5e4 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -88,15 +88,6 @@ struct irq_pin_list {
>  	int apic, pin;
>  };
>  
> -struct mp_chip_data {
> -	struct list_head irq_2_pin;
> -	struct IO_APIC_route_entry entry;
> -	int trigger;
> -	int polarity;
> -	u32 count;
> -	bool isa_irq;
> -};
> -
>  struct mp_ioapic_gsi {
>  	u32 gsi_base;
>  	u32 gsi_end;
> @@ -154,7 +145,7 @@ static inline bool mp_is_legacy_irq(int irq)
>  	return irq >= 0 && irq < nr_legacy_irqs();
>  }
>  
> -static inline struct irq_domain *mp_ioapic_irqdomain(int ioapic)
> +struct irq_domain *mp_ioapic_irqdomain(int ioapic)
>  {
>  	return ioapics[ioapic].irqdomain;
>  }
> @@ -301,7 +292,7 @@ static struct IO_APIC_route_entry __ioapic_read_entry(int apic, int pin)
>  	return eu.entry;
>  }
>  
> -static struct IO_APIC_route_entry ioapic_read_entry(int apic, int pin)
> +struct IO_APIC_route_entry ioapic_read_entry(int apic, int pin)
>  {
>  	union entry_union eu;
>  	unsigned long flags;
> @@ -328,7 +319,7 @@ static void __ioapic_write_entry(int apic, int pin, struct IO_APIC_route_entry e
>  	io_apic_write(apic, 0x10 + 2*pin, eu.w1);
>  }
>  
> -static void ioapic_write_entry(int apic, int pin, struct IO_APIC_route_entry e)
> +void ioapic_write_entry(int apic, int pin, struct IO_APIC_route_entry e)
>  {
>  	unsigned long flags;
>  
> @@ -453,7 +444,7 @@ static void io_apic_sync(struct irq_pin_list *entry)
>  	readl(&io_apic->data);
>  }
>  
> -static void mask_ioapic_irq(struct irq_data *irq_data)
> +void mask_ioapic_irq(struct irq_data *irq_data)
>  {
>  	struct mp_chip_data *data = irq_data->chip_data;
>  	unsigned long flags;
> @@ -468,7 +459,7 @@ static void __unmask_ioapic(struct mp_chip_data *data)
>  	io_apic_modify_irq(data, ~IO_APIC_REDIR_MASKED, 0, NULL);
>  }
>  
> -static void unmask_ioapic_irq(struct irq_data *irq_data)
> +void unmask_ioapic_irq(struct irq_data *irq_data)
>  {
>  	struct mp_chip_data *data = irq_data->chip_data;
>  	unsigned long flags;
> @@ -1868,8 +1859,7 @@ static void ioapic_configure_entry(struct irq_data *irqd)
>  		__ioapic_write_entry(entry->apic, entry->pin, mpd->entry);
>  }
>  
> -static int ioapic_set_affinity(struct irq_data *irq_data,
> -			       const struct cpumask *mask, bool force)
> +int ioapic_set_affinity(struct irq_data *irq_data, const struct cpumask *mask, bool force)
>  {
>  	struct irq_data *parent = irq_data->parent_data;
>  	unsigned long flags;
> @@ -1898,9 +1888,9 @@ static int ioapic_set_affinity(struct irq_data *irq_data,
>   *
>   * Verify that the corresponding Remote-IRR bits are clear.
>   */
> -static int ioapic_irq_get_chip_state(struct irq_data *irqd,
> -				   enum irqchip_irq_state which,
> -				   bool *state)
> +int ioapic_irq_get_chip_state(struct irq_data *irqd,
> +				enum irqchip_irq_state which,
> +				bool *state)
>  {
>  	struct mp_chip_data *mcd = irqd->chip_data;
>  	struct IO_APIC_route_entry rentry;
> -- 
> 2.20.1
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if running as root
  2020-11-12 11:24     ` Wei Liu
@ 2020-11-12 11:40       ` Daniel Lezcano
  2020-11-12 11:42         ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Daniel Lezcano @ 2020-11-12 11:40 UTC (permalink / raw)
  To: Wei Liu
  Cc: Linux on Hyper-V List, virtualization, Linux Kernel List,
	Michael Kelley, Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner

On 12/11/2020 12:24, Wei Liu wrote:
> On Thu, Nov 12, 2020 at 10:56:17AM +0100, Daniel Lezcano wrote:
>> On 05/11/2020 17:58, Wei Liu wrote:
>>> Signed-off-by: Wei Liu <wei.liu@kernel.org>

Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

>>> ---
>>
>> I would like to apply this patch but the changelog is too short (one line).
>>
>> Please add a small paragraph (no need to resend just answer here, I will
>> amend the log myself.
> 
> Please don't apply this to your tree. It is dependent on a previous
> patch. I expect this series to go through the hyperv tree.
> 
> I will add in this small paragraph in a later version:
> 
>     When Linux runs as the root partition, the setup required for TSC page
>     is different. Luckily Linux also has access to the MSR based
>     clocksource. We can just disable the TSC page clocksource if Linux is
>     the root partition.
> 
> If you're happy with the description, I would love to have an ack from
> you. I will funnel the patch via the hyperv tree.
> 
> Wei.
> 


-- 
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition
  2020-11-05 19:16   ` kernel test robot
@ 2020-11-12 11:42     ` Wei Liu
  2020-11-12 11:46       ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-12 11:42 UTC (permalink / raw)
  To: kernel test robot
  Cc: Wei Liu, Linux on Hyper-V List, kbuild-all, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, K. Y. Srinivasan,
	Haiyang Zhang

On Fri, Nov 06, 2020 at 03:16:07AM +0800, kernel test robot wrote:
> Hi Wei,
> 
> I love your patch! Yet something to improve:
> 
> [auto build test ERROR on tip/x86/core]
> [also build test ERROR on asm-generic/master iommu/next tip/timers/core pci/next linus/master v5.10-rc2 next-20201105]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
> 

This report is incorrect.

The bot seems to have only picked up this one patch but not the whole
series. While the patch can apply cleanly to all those trees, it has a
dependency on an earlier patch in this series.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if running as root
  2020-11-12 11:40       ` Daniel Lezcano
@ 2020-11-12 11:42         ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-12 11:42 UTC (permalink / raw)
  To: Daniel Lezcano
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Thomas Gleixner

On Thu, Nov 12, 2020 at 12:40:53PM +0100, Daniel Lezcano wrote:
> On 12/11/2020 12:24, Wei Liu wrote:
> > On Thu, Nov 12, 2020 at 10:56:17AM +0100, Daniel Lezcano wrote:
> >> On 05/11/2020 17:58, Wei Liu wrote:
> >>> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> 
> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>

Thanks Daniel.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition
  2020-11-12 11:42     ` Wei Liu
@ 2020-11-12 11:46       ` Wei Liu
  2020-11-12 12:22         ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-12 11:46 UTC (permalink / raw)
  To: kernel test robot
  Cc: Wei Liu, Linux on Hyper-V List, kbuild-all, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, K. Y. Srinivasan,
	Haiyang Zhang

On Thu, Nov 12, 2020 at 11:42:15AM +0000, Wei Liu wrote:
> On Fri, Nov 06, 2020 at 03:16:07AM +0800, kernel test robot wrote:
> > Hi Wei,
> > 
> > I love your patch! Yet something to improve:
> > 
> > [auto build test ERROR on tip/x86/core]
> > [also build test ERROR on asm-generic/master iommu/next tip/timers/core pci/next linus/master v5.10-rc2 next-20201105]
> > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > And when submitting patch, we suggest to use '--base' as documented in
> > https://git-scm.com/docs/git-format-patch]
> > 
> 
> This report is incorrect.
> 
> The bot seems to have only picked up this one patch but not the whole
> series. While the patch can apply cleanly to all those trees, it has a
> dependency on an earlier patch in this series.

I misread this report and I'm confused now. Let me fetch the config and
try locally first.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary
  2020-11-05 20:07   ` kernel test robot
@ 2020-11-12 11:54     ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-12 11:54 UTC (permalink / raw)
  To: kernel test robot
  Cc: Wei Liu, Linux on Hyper-V List, kbuild-all, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, Lillian Grassin-Drake,
	K. Y. Srinivasan

On Fri, Nov 06, 2020 at 04:07:33AM +0800, kernel test robot wrote:
> Hi Wei,
> 
> I love your patch! Perhaps something to improve:
> 
> [auto build test WARNING on tip/x86/core]
> [also build test WARNING on asm-generic/master iommu/next tip/timers/core pci/next linus/master v5.10-rc2 next-20201105]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
> 
> url:    https://github.com/0day-ci/linux/commits/Wei-Liu/Introducing-Linux-root-partition-support-for-Microsoft-Hypervisor/20201106-010058
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git 238c91115cd05c71447ea071624a4c9fe661f970
> config: i386-randconfig-r002-20201104 (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
> reproduce (this is a W=1 build):
>         # https://github.com/0day-ci/linux/commit/83c03b4e30e729a77688b8c0ffeffa2a555dcce7
>         git remote add linux-review https://github.com/0day-ci/linux
>         git fetch --no-tags linux-review Wei-Liu/Introducing-Linux-root-partition-support-for-Microsoft-Hypervisor/20201106-010058
>         git checkout 83c03b4e30e729a77688b8c0ffeffa2a555dcce7
>         # save the attached .config to linux build tree
>         make W=1 ARCH=i386 
> 
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
> 
> All warnings (new ones prefixed by >>):
> 
> >> arch/x86/hyperv/hv_init.c:341:13: warning: no previous prototype for 'hv_get_partition_id' [-Wmissing-prototypes]
>      341 | void __init hv_get_partition_id(void)
>          |             ^~~~~~~~~~~~~~~~~~~

This function can be made static since the only user is in the same
file.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition
  2020-11-12 11:46       ` Wei Liu
@ 2020-11-12 12:22         ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-12 12:22 UTC (permalink / raw)
  To: kernel test robot
  Cc: Wei Liu, Linux on Hyper-V List, kbuild-all, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, K. Y. Srinivasan,
	Haiyang Zhang

On Thu, Nov 12, 2020 at 11:46:41AM +0000, Wei Liu wrote:
> On Thu, Nov 12, 2020 at 11:42:15AM +0000, Wei Liu wrote:
> > On Fri, Nov 06, 2020 at 03:16:07AM +0800, kernel test robot wrote:
> > > Hi Wei,
> > > 
> > > I love your patch! Yet something to improve:
> > > 
> > > [auto build test ERROR on tip/x86/core]
> > > [also build test ERROR on asm-generic/master iommu/next tip/timers/core pci/next linus/master v5.10-rc2 next-20201105]
> > > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > > And when submitting patch, we suggest to use '--base' as documented in
> > > https://git-scm.com/docs/git-format-patch]
> > > 
> > 
> > This report is incorrect.
> > 
> > The bot seems to have only picked up this one patch but not the whole
> > series. While the patch can apply cleanly to all those trees, it has a
> > dependency on an earlier patch in this series.
> 
> I misread this report and I'm confused now. Let me fetch the config and
> try locally first.

The attached config file has

  # CONFIG_HYPERV is not set
.

The reported issue has been fixed by moving hv_root_partition to
mshyperv.c.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 15/17] x86/hyperv: implement an MSI domain for root partition
  2020-11-05 16:58 ` [PATCH v2 15/17] x86/hyperv: implement an MSI domain for root partition Wei Liu
@ 2020-11-12 13:50   ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-12 13:50 UTC (permalink / raw)
  To: Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

On Thu, Nov 05, 2020 at 04:58:12PM +0000, Wei Liu wrote:
> When Linux runs as the root partition on Microsoft Hypervisor, its
> interrupts are remapped.  Linux will need to explicitly map and unmap
> interrupts for hardware.
> 
> Implement an MSI domain to issue the correct hypercalls. And initialize
> this irqdomain as the default MSI irq domain.
> 
> Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---
> v2:
> This patch is simplified due to upstream changes.
> ---
>  arch/x86/hyperv/Makefile    |   2 +-
>  arch/x86/hyperv/hv_init.c   |  10 +-
>  arch/x86/hyperv/irqdomain.c | 330 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 340 insertions(+), 2 deletions(-)
>  create mode 100644 arch/x86/hyperv/irqdomain.c
> 
> diff --git a/arch/x86/hyperv/Makefile b/arch/x86/hyperv/Makefile
> index 565358020921..2ebcf3969121 100644
> --- a/arch/x86/hyperv/Makefile
> +++ b/arch/x86/hyperv/Makefile
> @@ -1,6 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  obj-y			:= hv_init.o mmu.o nested.o
> -obj-$(CONFIG_X86_64)	+= hv_apic.o hv_proc.o
> +obj-$(CONFIG_X86_64)	+= hv_apic.o hv_proc.o irqdomain.o

I need to move irqdomain.o to obj-y because 32bit build also needs it in
hv_init.o.

After this change, 32bit build of the kernel builds and works as expected.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 01/17] asm-generic/hyperv: change HV_CPU_POWER_MANAGEMENT to HV_CPU_MANAGEMENT
  2020-11-05 16:57 ` [PATCH v2 01/17] asm-generic/hyperv: change HV_CPU_POWER_MANAGEMENT to HV_CPU_MANAGEMENT Wei Liu
@ 2020-11-12 15:14   ` Vitaly Kuznetsov
  0 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-12 15:14 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Arnd Bergmann, open list:GENERIC INCLUDE/ASM HEADER FILES

Wei Liu <wei.liu@kernel.org> writes:

> This makes the name match Hyper-V TLFS.
>
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---
>  include/asm-generic/hyperv-tlfs.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
> index e73a11850055..e6903589a82a 100644
> --- a/include/asm-generic/hyperv-tlfs.h
> +++ b/include/asm-generic/hyperv-tlfs.h
> @@ -88,7 +88,7 @@
>  #define HV_CONNECT_PORT				BIT(7)
>  #define HV_ACCESS_STATS				BIT(8)
>  #define HV_DEBUGGING				BIT(11)
> -#define HV_CPU_POWER_MANAGEMENT			BIT(12)
> +#define HV_CPU_MANAGEMENT			BIT(12)
>  
>  
>  /*

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition
  2020-11-05 16:57 ` [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition Wei Liu
  2020-11-05 19:16   ` kernel test robot
  2020-11-05 20:23   ` kernel test robot
@ 2020-11-12 15:16   ` Vitaly Kuznetsov
  2020-11-12 15:51     ` Wei Liu
  2 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-12 15:16 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

Wei Liu <wei.liu@kernel.org> writes:

> For now we can use the privilege flag to check. Stash the value to be
> used later.
>
> Put in a bunch of defines for future use when we want to have more
> fine-grained detection.
>
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---
>  arch/x86/hyperv/hv_init.c          |  4 ++++
>  arch/x86/include/asm/hyperv-tlfs.h | 10 ++++++++++
>  arch/x86/include/asm/mshyperv.h    |  2 ++
>  arch/x86/kernel/cpu/mshyperv.c     | 16 ++++++++++++++++
>  4 files changed, 32 insertions(+)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index e04d90af4c27..533fe9e887f2 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -26,6 +26,10 @@
>  #include <linux/syscore_ops.h>
>  #include <clocksource/hyperv_timer.h>
>  
> +/* Is Linux running as the root partition? */
> +bool hv_root_partition;
> +EXPORT_SYMBOL_GPL(hv_root_partition);

(Nitpick and rather a personal preference): I'd prefer
'hv_partition_is_root' for a boolean.

> +
>  void *hv_hypercall_pg;
>  EXPORT_SYMBOL_GPL(hv_hypercall_pg);
>  
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index 0ed20e8bba9e..41b628b9fb15 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -21,6 +21,7 @@
>  #define HYPERV_CPUID_FEATURES			0x40000003
>  #define HYPERV_CPUID_ENLIGHTMENT_INFO		0x40000004
>  #define HYPERV_CPUID_IMPLEMENT_LIMITS		0x40000005
> +#define HYPERV_CPUID_CPU_MANAGEMENT_FEATURES	0x40000007
>  #define HYPERV_CPUID_NESTED_FEATURES		0x4000000A
>  
>  #define HYPERV_HYPERVISOR_PRESENT_BIT		0x80000000
> @@ -103,6 +104,15 @@
>  /* Recommend using enlightened VMCS */
>  #define HV_X64_ENLIGHTENED_VMCS_RECOMMENDED		BIT(14)
>  
> +/*
> + * CPU management features identification.
> + * These are HYPERV_CPUID_CPU_MANAGEMENT_FEATURES.EAX bits.
> + */
> +#define HV_X64_START_LOGICAL_PROCESSOR			BIT(0)
> +#define HV_X64_CREATE_ROOT_VIRTUAL_PROCESSOR		BIT(1)
> +#define HV_X64_PERFORMANCE_COUNTER_SYNC			BIT(2)
> +#define HV_X64_RESERVED_IDENTITY_BIT			BIT(31)
> +
>  /*
>   * Virtual processor will never share a physical core with another virtual
>   * processor, except for virtual processors that are reported as sibling SMT
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index ffc289992d1b..ac2b0d110f03 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -237,6 +237,8 @@ int hyperv_fill_flush_guest_mapping_list(
>  		struct hv_guest_mapping_flush_list *flush,
>  		u64 start_gfn, u64 end_gfn);
>  
> +extern bool hv_root_partition;

Eventually this is not going to be an x86 only thing I believe?

> +
>  #ifdef CONFIG_X86_64
>  void hv_apic_init(void);
>  void __init hv_init_spinlocks(void);
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 05ef1f4550cb..f7633e1e4c82 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -237,6 +237,22 @@ static void __init ms_hyperv_init_platform(void)
>  	pr_debug("Hyper-V: max %u virtual processors, %u logical processors\n",
>  		 ms_hyperv.max_vp_index, ms_hyperv.max_lp_index);
>  
> +	/*
> +	 * Check CPU management privilege.
> +	 *
> +	 * To mirror what Windows does we should extract CPU management
> +	 * features and use the ReservedIdentityBit to detect if Linux is the
> +	 * root partition. But that requires negotiating CPU management
> +	 * interface (a process to be finalized).
> +	 *
> +	 * For now, use the privilege flag as the indicator for running as
> +	 * root.
> +	 */
> +	if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_CPU_MANAGEMENT) {

We may want to cache cpuid_ebx(HYPERV_CPUID_FEATURES) somewhere but we
already had a discussion regading naming for these caches and decided to
wait until TLFS for ARM is out so we don't need to rename again.

> +		hv_root_partition = true;
> +		pr_info("Hyper-V: running as root partition\n");
> +	}
> +
>  	/*
>  	 * Extract host information.
>  	 */

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 03/17] Drivers: hv: vmbus: skip VMBus initialization if Linux is root
  2020-11-05 16:58 ` [PATCH v2 03/17] Drivers: hv: vmbus: skip VMBus initialization if Linux is root Wei Liu
@ 2020-11-12 15:24   ` Vitaly Kuznetsov
  2020-11-13 14:51     ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-12 15:24 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger

Wei Liu <wei.liu@kernel.org> writes:

> There is no VMBus and the other infrastructures initialized in
> hv_acpi_init when Linux is running as the root partition.
>
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---
>  drivers/hv/vmbus_drv.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 4fad3e6745e5..37c4d3a28309 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2612,6 +2612,9 @@ static int __init hv_acpi_init(void)
>  	if (!hv_is_hyperv_initialized())
>  		return -ENODEV;
>  
> +	if (hv_root_partition)
> +		return -ENODEV;
> +

Nit: any particular reason why we need to return an error from here? I'd
suggest we 'return 0;' if it doesn't break anything (we're still running
on Hyper-V, it's just a coincedence that there's nothing to do here,
eventually we may get some devices/handlers I guess. Also, there's going
to be server-side Vmbus eventually, we may as well initialize it here.

>  	init_completion(&probe_event);
>  
>  	/*

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 04/17] iommu/hyperv: don't setup IRQ remapping when running as root
  2020-11-05 16:58 ` [PATCH v2 04/17] iommu/hyperv: don't setup IRQ remapping when running as root Wei Liu
@ 2020-11-12 15:27   ` Vitaly Kuznetsov
  2020-11-13 14:53     ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-12 15:27 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Joerg Roedel, K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Joerg Roedel, open list:IOMMU DRIVERS

Wei Liu <wei.liu@kernel.org> writes:

> The IOMMU code needs more work. We're sure for now the IRQ remapping
> hooks are not applicable when Linux is the root.

Super-nitpick: I would suggest we always say 'root partition' as 'root'
has a 'slightly different' meaning in Linux and this commit message may
sound confusing to an unprepared reader.

>
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> Acked-by: Joerg Roedel <jroedel@suse.de>
> ---
>  drivers/iommu/hyperv-iommu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/hyperv-iommu.c b/drivers/iommu/hyperv-iommu.c
> index e09e2d734c57..8d3ce3add57d 100644
> --- a/drivers/iommu/hyperv-iommu.c
> +++ b/drivers/iommu/hyperv-iommu.c
> @@ -20,6 +20,7 @@
>  #include <asm/io_apic.h>
>  #include <asm/irq_remapping.h>
>  #include <asm/hypervisor.h>
> +#include <asm/mshyperv.h>
>  
>  #include "irq_remapping.h"
>  
> @@ -143,7 +144,7 @@ static int __init hyperv_prepare_irq_remapping(void)
>  	int i;
>  
>  	if (!hypervisor_is_type(X86_HYPER_MS_HYPERV) ||
> -	    !x2apic_supported())
> +	    !x2apic_supported() || hv_root_partition)
>  		return -ENODEV;
>  
>  	fn = irq_domain_alloc_named_id_fwnode("HYPERV-IR", 0);

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if running as root
  2020-11-05 16:58 ` [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if " Wei Liu
  2020-11-12  9:56   ` Daniel Lezcano
@ 2020-11-12 15:30   ` Vitaly Kuznetsov
  2020-11-12 15:52     ` Wei Liu
  1 sibling, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-12 15:30 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Daniel Lezcano, Thomas Gleixner

Wei Liu <wei.liu@kernel.org> writes:

> Signed-off-by: Wei Liu <wei.liu@kernel.org>

In the missing commit message I'd like to see why we don't use 'TSC
page' clocksource for the root partition. My guess would be that it's
not available and actually we're supposed to use raw TSC value (because
root partitions never migrate) but please spell it out.

> ---
>  drivers/clocksource/hyperv_timer.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/clocksource/hyperv_timer.c b/drivers/clocksource/hyperv_timer.c
> index ba04cb381cd3..269a691bd2c4 100644
> --- a/drivers/clocksource/hyperv_timer.c
> +++ b/drivers/clocksource/hyperv_timer.c
> @@ -426,6 +426,9 @@ static bool __init hv_init_tsc_clocksource(void)
>  	if (!(ms_hyperv.features & HV_MSR_REFERENCE_TSC_AVAILABLE))
>  		return false;
>  
> +	if (hv_root_partition)
> +		return false;
> +
>  	hv_read_reference_counter = read_hv_clock_tsc;
>  	phys_addr = virt_to_phys(hv_get_tsc_page());

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 06/17] x86/hyperv: allocate output arg pages if required
  2020-11-05 16:58 ` [PATCH v2 06/17] x86/hyperv: allocate output arg pages if required Wei Liu
@ 2020-11-12 15:35   ` Vitaly Kuznetsov
  2020-11-13 15:05     ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-12 15:35 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

Wei Liu <wei.liu@kernel.org> writes:

> When Linux runs as the root partition, it will need to make hypercalls
> which return data from the hypervisor.
>
> Allocate pages for storing results when Linux runs as the root
> partition.
>
> Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
> Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---
> v2: Address Vitaly's comments
> ---
>  arch/x86/hyperv/hv_init.c       | 35 ++++++++++++++++++++++++++++-----
>  arch/x86/include/asm/mshyperv.h |  1 +
>  2 files changed, 31 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 533fe9e887f2..7a2e37f025b0 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -45,6 +45,9 @@ EXPORT_SYMBOL_GPL(hv_vp_assist_page);
>  void  __percpu **hyperv_pcpu_input_arg;
>  EXPORT_SYMBOL_GPL(hyperv_pcpu_input_arg);
>  
> +void  __percpu **hyperv_pcpu_output_arg;
> +EXPORT_SYMBOL_GPL(hyperv_pcpu_output_arg);
> +
>  u32 hv_max_vp_index;
>  EXPORT_SYMBOL_GPL(hv_max_vp_index);
>  
> @@ -77,12 +80,19 @@ static int hv_cpu_init(unsigned int cpu)
>  	void **input_arg;
>  	struct page *pg;
>  
> -	input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
>  	/* hv_cpu_init() can be called with IRQs disabled from hv_resume() */
> -	pg = alloc_page(irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL);
> +	pg = alloc_pages(irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL, hv_root_partition ? 1 : 0);
>  	if (unlikely(!pg))
>  		return -ENOMEM;
> +
> +	input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
>  	*input_arg = page_address(pg);
> +	if (hv_root_partition) {
> +		void **output_arg;
> +
> +		output_arg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
> +		*output_arg = page_address(pg + 1);
> +	}
>  
>  	hv_get_vp_index(msr_vp_index);
>  
> @@ -209,14 +219,23 @@ static int hv_cpu_die(unsigned int cpu)
>  	unsigned int new_cpu;
>  	unsigned long flags;
>  	void **input_arg;
> -	void *input_pg = NULL;
> +	void *pg;
>  
>  	local_irq_save(flags);
>  	input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
> -	input_pg = *input_arg;
> +	pg = *input_arg;
>  	*input_arg = NULL;
> +
> +	if (hv_root_partition) {
> +		void **output_arg;
> +
> +		output_arg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
> +		*output_arg = NULL;
> +	}
> +
>  	local_irq_restore(flags);
> -	free_page((unsigned long)input_pg);
> +
> +	free_page((unsigned long)pg);
>  

Hm, but in case we've allocated output_arg, don't we need to do
	free_pages((unsigned long)pg, 1);

instead?

>  	if (hv_vp_assist_page && hv_vp_assist_page[cpu])
>  		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
> @@ -350,6 +369,12 @@ void __init hyperv_init(void)
>  
>  	BUG_ON(hyperv_pcpu_input_arg == NULL);
>  
> +	/* Allocate the per-CPU state for output arg for root */
> +	if (hv_root_partition) {
> +		hyperv_pcpu_output_arg = alloc_percpu(void *);
> +		BUG_ON(hyperv_pcpu_output_arg == NULL);
> +	}
> +
>  	/* Allocate percpu VP index */
>  	hv_vp_index = kmalloc_array(num_possible_cpus(), sizeof(*hv_vp_index),
>  				    GFP_KERNEL);
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index ac2b0d110f03..62d9390f1ddf 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -76,6 +76,7 @@ static inline void hv_disable_stimer0_percpu_irq(int irq) {}
>  #if IS_ENABLED(CONFIG_HYPERV)
>  extern void *hv_hypercall_pg;
>  extern void  __percpu  **hyperv_pcpu_input_arg;
> +extern void  __percpu  **hyperv_pcpu_output_arg;
>  
>  static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
>  {

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary
  2020-11-05 16:58 ` [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary Wei Liu
  2020-11-05 20:07   ` kernel test robot
  2020-11-05 20:20   ` kernel test robot
@ 2020-11-12 15:44   ` Vitaly Kuznetsov
  2020-11-13 15:21     ` Wei Liu
  2 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-12 15:44 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Arnd Bergmann,
	open list:GENERIC INCLUDE/ASM HEADER FILES

Wei Liu <wei.liu@kernel.org> writes:

> We will need the partition ID for executing some hypercalls later.
>
> Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
> Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---
>  arch/x86/hyperv/hv_init.c         | 26 ++++++++++++++++++++++++++
>  arch/x86/include/asm/mshyperv.h   |  2 ++
>  include/asm-generic/hyperv-tlfs.h |  6 ++++++
>  3 files changed, 34 insertions(+)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 7a2e37f025b0..73b0fb851f76 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -30,6 +30,9 @@
>  bool hv_root_partition;
>  EXPORT_SYMBOL_GPL(hv_root_partition);
>  
> +u64 hv_current_partition_id;
> +EXPORT_SYMBOL_GPL(hv_current_partition_id);
> +
>  void *hv_hypercall_pg;
>  EXPORT_SYMBOL_GPL(hv_hypercall_pg);
>  
> @@ -335,6 +338,26 @@ static struct syscore_ops hv_syscore_ops = {
>  	.resume		= hv_resume,
>  };
>  
> +void __init hv_get_partition_id(void)
> +{
> +	struct hv_get_partition_id *output_page;
> +	u16 status;
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	output_page = *this_cpu_ptr(hyperv_pcpu_output_arg);
> +	status = hv_do_hypercall(HVCALL_GET_PARTITION_ID, NULL, output_page) &
> +		HV_HYPERCALL_RESULT_MASK;
> +	if (status != HV_STATUS_SUCCESS)
> +		pr_err("Failed to get partition ID: %d\n", status);
> +	else
> +		hv_current_partition_id = output_page->partition_id;

Nit: I'd suggest we simplify this to:

	if (status != HV_STATUS_SUCCESS) {
		pr_err("Failed to get partition ID: %d\n", status);
		BUG();
	}
	hv_current_partition_id = output_page->partition_id;

and drop BUG_ON() below;

> +	local_irq_restore(flags);
> +
> +	/* No point in proceeding if this failed */
> +	BUG_ON(status != HV_STATUS_SUCCESS);
> +}
> +
>  /*
>   * This function is to be invoked early in the boot sequence after the
>   * hypervisor has been detected.
> @@ -430,6 +453,9 @@ void __init hyperv_init(void)
>  
>  	register_syscore_ops(&hv_syscore_ops);
>  
> +	if (hv_root_partition)
> +		hv_get_partition_id();
> +


We don't seem to check that the partition has AccessPartitionId
privilege. While I guess that root partitions always have it, I'd
suggest we write this as:

	if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_ACCESS_PARTITION_ID)
		hv_get_partition_id();

	BUG_ON(hv_root_partition && !hv_current_partition_id);

for correctness. Also, we need to make sure '0' is not a valid partition
id and use e.g. -1 otherwise.

>  	return;
>  
>  remove_cpuhp_state:
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 62d9390f1ddf..67f5d35a73d3 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -78,6 +78,8 @@ extern void *hv_hypercall_pg;
>  extern void  __percpu  **hyperv_pcpu_input_arg;
>  extern void  __percpu  **hyperv_pcpu_output_arg;
>  
> +extern u64 hv_current_partition_id;
> +
>  static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
>  {
>  	u64 input_address = input ? virt_to_phys(input) : 0;
> diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
> index e6903589a82a..87b1a79b19eb 100644
> --- a/include/asm-generic/hyperv-tlfs.h
> +++ b/include/asm-generic/hyperv-tlfs.h
> @@ -141,6 +141,7 @@ struct ms_hyperv_tsc_page {
>  #define HVCALL_FLUSH_VIRTUAL_ADDRESS_SPACE_EX	0x0013
>  #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX	0x0014
>  #define HVCALL_SEND_IPI_EX			0x0015
> +#define HVCALL_GET_PARTITION_ID			0x0046
>  #define HVCALL_GET_VP_REGISTERS			0x0050
>  #define HVCALL_SET_VP_REGISTERS			0x0051
>  #define HVCALL_POST_MESSAGE			0x005c
> @@ -407,6 +408,11 @@ struct hv_tlb_flush_ex {
>  	u64 gva_list[];
>  } __packed;
>  
> +/* HvGetPartitionId hypercall (output only) */
> +struct hv_get_partition_id {
> +	u64 partition_id;
> +} __packed;
> +
>  /* HvRetargetDeviceInterrupt hypercall */
>  union hv_msi_entry {
>  	u64 as_uint64;

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 08/17] x86/hyperv: handling hypercall page setup for root
  2020-11-05 16:58 ` [PATCH v2 08/17] x86/hyperv: handling hypercall page setup for root Wei Liu
@ 2020-11-12 15:51   ` Vitaly Kuznetsov
  2020-11-13 15:33     ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-12 15:51 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

Wei Liu <wei.liu@kernel.org> writes:

> When Linux is running as the root partition, the hypercall page will
> have already been setup by Hyper-V. Copy the content over to the
> allocated page.
>
> The suspend, resume and cleanup paths remain untouched because they are
> not supported in this setup yet.

What about adding BUG_ONs there then?

>
> Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
> Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
> Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Co-Developed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---
>  arch/x86/hyperv/hv_init.c | 32 ++++++++++++++++++++++++++++++--
>  1 file changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 73b0fb851f76..9fcaf741be99 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -25,6 +25,7 @@
>  #include <linux/cpuhotplug.h>
>  #include <linux/syscore_ops.h>
>  #include <clocksource/hyperv_timer.h>
> +#include <linux/highmem.h>
>  
>  /* Is Linux running as the root partition? */
>  bool hv_root_partition;
> @@ -438,8 +439,35 @@ void __init hyperv_init(void)
>  
>  	rdmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
>  	hypercall_msr.enable = 1;
> -	hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
> -	wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> +
> +	if (hv_root_partition) {
> +		struct page *pg;
> +		void *src, *dst;
> +
> +		/*
> +		 * For the root partition, the hypervisor will set up its
> +		 * hypercall page. The hypervisor guarantees it will not show
> +		 * up in the root's address space. The root can't change the
> +		 * location of the hypercall page.
> +		 *
> +		 * Order is important here. We must enable the hypercall page
> +		 * so it is populated with code, then copy the code to an
> +		 * executable page.
> +		 */
> +		wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> +
> +		pg = vmalloc_to_page(hv_hypercall_pg);
> +		dst = kmap(pg);
> +		src = memremap(hypercall_msr.guest_physical_address << PAGE_SHIFT, PAGE_SIZE,
> +				MEMREMAP_WB);
> +		BUG_ON(!(src && dst));
> +		memcpy(dst, src, PAGE_SIZE);

Super-nit: while on x86 PAGE_SIZE always matches HV_HYP_PAGE_SIZE, would
it be more accurate to use the later here?

> +		memunmap(src);
> +		kunmap(pg);
> +	} else {
> +		hypercall_msr.guest_physical_address = vmalloc_to_pfn(hv_hypercall_pg);
> +		wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> +	}
>  
>  	/*
>  	 * Ignore any errors in setting up stimer clockevents

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition
  2020-11-12 15:16   ` Vitaly Kuznetsov
@ 2020-11-12 15:51     ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-12 15:51 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

On Thu, Nov 12, 2020 at 04:16:30PM +0100, Vitaly Kuznetsov wrote:
[...]
> >  /*
> >   * Virtual processor will never share a physical core with another virtual
> >   * processor, except for virtual processors that are reported as sibling SMT
> > diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> > index ffc289992d1b..ac2b0d110f03 100644
> > --- a/arch/x86/include/asm/mshyperv.h
> > +++ b/arch/x86/include/asm/mshyperv.h
> > @@ -237,6 +237,8 @@ int hyperv_fill_flush_guest_mapping_list(
> >  		struct hv_guest_mapping_flush_list *flush,
> >  		u64 start_gfn, u64 end_gfn);
> >  
> > +extern bool hv_root_partition;
> 
> Eventually this is not going to be an x86 only thing I believe?

I hope so. :-)

> 
> > +
> >  #ifdef CONFIG_X86_64
> >  void hv_apic_init(void);
> >  void __init hv_init_spinlocks(void);
> > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > index 05ef1f4550cb..f7633e1e4c82 100644
> > --- a/arch/x86/kernel/cpu/mshyperv.c
> > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > @@ -237,6 +237,22 @@ static void __init ms_hyperv_init_platform(void)
> >  	pr_debug("Hyper-V: max %u virtual processors, %u logical processors\n",
> >  		 ms_hyperv.max_vp_index, ms_hyperv.max_lp_index);
> >  
> > +	/*
> > +	 * Check CPU management privilege.
> > +	 *
> > +	 * To mirror what Windows does we should extract CPU management
> > +	 * features and use the ReservedIdentityBit to detect if Linux is the
> > +	 * root partition. But that requires negotiating CPU management
> > +	 * interface (a process to be finalized).
> > +	 *
> > +	 * For now, use the privilege flag as the indicator for running as
> > +	 * root.
> > +	 */
> > +	if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_CPU_MANAGEMENT) {
> 
> We may want to cache cpuid_ebx(HYPERV_CPUID_FEATURES) somewhere but we
> already had a discussion regading naming for these caches and decided to
> wait until TLFS for ARM is out so we don't need to rename again.

Exactly.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if running as root
  2020-11-12 15:30   ` Vitaly Kuznetsov
@ 2020-11-12 15:52     ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-12 15:52 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Daniel Lezcano,
	Thomas Gleixner

On Thu, Nov 12, 2020 at 04:30:57PM +0100, Vitaly Kuznetsov wrote:
> Wei Liu <wei.liu@kernel.org> writes:
> 
> > Signed-off-by: Wei Liu <wei.liu@kernel.org>
> 
> In the missing commit message I'd like to see why we don't use 'TSC
> page' clocksource for the root partition. My guess would be that it's
> not available and actually we're supposed to use raw TSC value (because
> root partitions never migrate) but please spell it out.

See my exchange with Daniel.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 09/17] x86/hyperv: provide a bunch of helper functions
  2020-11-05 16:58 ` [PATCH v2 09/17] x86/hyperv: provide a bunch of helper functions Wei Liu
@ 2020-11-12 15:57   ` Vitaly Kuznetsov
  2020-11-13 15:51     ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-12 15:57 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Arnd Bergmann,
	open list:GENERIC INCLUDE/ASM HEADER FILES

Wei Liu <wei.liu@kernel.org> writes:

> They are used to deposit pages into Microsoft Hypervisor and bring up
> logical and virtual processors.
>
> Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
> Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
> Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Co-Developed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com>
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---
> v2:
> 1. Adapt to hypervisor side changes
> 2. Address Vitaly's comments
> ---
>  arch/x86/hyperv/Makefile          |   2 +-
>  arch/x86/hyperv/hv_proc.c         | 217 ++++++++++++++++++++++++++++++
>  arch/x86/include/asm/mshyperv.h   |   4 +
>  include/asm-generic/hyperv-tlfs.h |  67 +++++++++
>  4 files changed, 289 insertions(+), 1 deletion(-)
>  create mode 100644 arch/x86/hyperv/hv_proc.c
>
> diff --git a/arch/x86/hyperv/Makefile b/arch/x86/hyperv/Makefile
> index 89b1f74d3225..565358020921 100644
> --- a/arch/x86/hyperv/Makefile
> +++ b/arch/x86/hyperv/Makefile
> @@ -1,6 +1,6 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  obj-y			:= hv_init.o mmu.o nested.o
> -obj-$(CONFIG_X86_64)	+= hv_apic.o
> +obj-$(CONFIG_X86_64)	+= hv_apic.o hv_proc.o
>  
>  ifdef CONFIG_X86_64
>  obj-$(CONFIG_PARAVIRT_SPINLOCKS)	+= hv_spinlock.o
> diff --git a/arch/x86/hyperv/hv_proc.c b/arch/x86/hyperv/hv_proc.c
> new file mode 100644
> index 000000000000..0fd972c9129a
> --- /dev/null
> +++ b/arch/x86/hyperv/hv_proc.c
> @@ -0,0 +1,217 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/types.h>
> +#include <linux/version.h>
> +#include <linux/vmalloc.h>
> +#include <linux/mm.h>
> +#include <linux/clockchips.h>
> +#include <linux/acpi.h>
> +#include <linux/hyperv.h>
> +#include <linux/slab.h>
> +#include <linux/cpuhotplug.h>
> +#include <linux/minmax.h>
> +#include <asm/hypervisor.h>
> +#include <asm/mshyperv.h>
> +#include <asm/apic.h>
> +
> +#include <asm/trace/hyperv.h>
> +
> +#define HV_DEPOSIT_MAX_ORDER (8)
> +#define HV_DEPOSIT_MAX (1 << HV_DEPOSIT_MAX_ORDER)
> +
> +/*
> + * Deposits exact number of pages
> + * Must be called with interrupts enabled
> + * Max 256 pages
> + */
> +int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages)
> +{
> +	struct page **pages;
> +	int *counts;
> +	int num_allocations;
> +	int i, j, page_count;
> +	int order;
> +	int desired_order;
> +	u16 status;
> +	int ret;
> +	u64 base_pfn;
> +	struct hv_deposit_memory *input_page;
> +	unsigned long flags;
> +
> +	if (num_pages > HV_DEPOSIT_MAX)
> +		return -E2BIG;
> +	if (!num_pages)
> +		return 0;
> +
> +	/* One buffer for page pointers and counts */
> +	pages = page_address(alloc_page(GFP_KERNEL));
> +	if (!pages)
> +		return -ENOMEM;
> +
> +	counts = kcalloc(HV_DEPOSIT_MAX, sizeof(int), GFP_KERNEL);
> +	if (!counts) {
> +		free_page((unsigned long)pages);
> +		return -ENOMEM;
> +	}
> +
> +	/* Allocate all the pages before disabling interrupts */
> +	num_allocations = 0;
> +	i = 0;
> +	order = HV_DEPOSIT_MAX_ORDER;
> +
> +	while (num_pages) {
> +		/* Find highest order we can actually allocate */
> +		desired_order = 31 - __builtin_clz(num_pages);
> +		order = min(desired_order, order);
> +		do {
> +			pages[i] = alloc_pages_node(node, GFP_KERNEL, order);
> +			if (!pages[i]) {
> +				if (!order) {
> +					ret = -ENOMEM;
> +					goto err_free_allocations;
> +				}
> +				--order;
> +			}
> +		} while (!pages[i]);
> +
> +		split_page(pages[i], order);
> +		counts[i] = 1 << order;
> +		num_pages -= counts[i];
> +		i++;
> +		num_allocations++;
> +	}
> +
> +	local_irq_save(flags);
> +
> +	input_page = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +
> +	input_page->partition_id = partition_id;
> +
> +	/* Populate gpa_page_list - these will fit on the input page */
> +	for (i = 0, page_count = 0; i < num_allocations; ++i) {
> +		base_pfn = page_to_pfn(pages[i]);
> +		for (j = 0; j < counts[i]; ++j, ++page_count)
> +			input_page->gpa_page_list[page_count] = base_pfn + j;
> +	}
> +	status = hv_do_rep_hypercall(HVCALL_DEPOSIT_MEMORY,
> +				     page_count, 0, input_page,
> +				     NULL) & HV_HYPERCALL_RESULT_MASK;
> +	local_irq_restore(flags);
> +
> +	if (status != HV_STATUS_SUCCESS) {
> +		pr_err("Failed to deposit pages: %d\n", status);
> +		ret = status;
> +		goto err_free_allocations;
> +	}
> +
> +	ret = 0;
> +	goto free_buf;
> +
> +err_free_allocations:
> +	for (i = 0; i < num_allocations; ++i) {
> +		base_pfn = page_to_pfn(pages[i]);
> +		for (j = 0; j < counts[i]; ++j)
> +			__free_page(pfn_to_page(base_pfn + j));
> +	}
> +
> +free_buf:
> +	free_page((unsigned long)pages);
> +	kfree(counts);
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(hv_call_deposit_pages);
> +
> +int hv_call_add_logical_proc(int node, u32 lp_index, u32 apic_id)
> +{
> +	struct hv_add_logical_processor_in *input;
> +	struct hv_add_logical_processor_out *output;
> +	int status;
> +	unsigned long flags;
> +	int ret = 0;
> +
> +	/*
> +	 * When adding a logical processor, the hypervisor may return
> +	 * HV_STATUS_INSUFFICIENT_MEMORY. When that happens, we deposit more
> +	 * pages and retry.
> +	 */
> +	do {
> +		local_irq_save(flags);
> +
> +		input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +		/* We don't do anything with the output right now */
> +		output = *this_cpu_ptr(hyperv_pcpu_output_arg);
> +
> +		input->lp_index = lp_index;
> +		input->apic_id = apic_id;
> +		input->flags = 0;
> +		input->proximity_domain_info.domain_id = node_to_pxm(node);
> +		input->proximity_domain_info.flags.reserved = 0;
> +		input->proximity_domain_info.flags.proximity_info_valid = 1;
> +		input->proximity_domain_info.flags.proximity_preferred = 1;
> +		status = hv_do_hypercall(HVCALL_ADD_LOGICAL_PROCESSOR,
> +					 input, output);
> +		local_irq_restore(flags);
> +
> +		if (status != HV_STATUS_INSUFFICIENT_MEMORY) {
> +			if (status != HV_STATUS_SUCCESS) {
> +				pr_err("%s: cpu %u apic ID %u, %d\n", __func__,
> +				       lp_index, apic_id, status);
> +				ret = status;
> +			}
> +			break;
> +		}
> +		ret = hv_call_deposit_pages(node, hv_current_partition_id, 1);
> +	} while (!ret);
> +
> +	return ret;
> +}
> +
> +int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags)
> +{
> +	struct hv_create_vp *input;
> +	u16 status;
> +	unsigned long irq_flags;
> +	int ret = 0;
> +
> +	/* Root VPs don't seem to need pages deposited */
> +	if (partition_id != hv_current_partition_id) {
> +		ret = hv_call_deposit_pages(node, partition_id, 90);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	do {
> +		local_irq_save(irq_flags);
> +
> +		input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +
> +		input->partition_id = partition_id;
> +		input->vp_index = vp_index;
> +		input->flags = flags;
> +		input->subnode_type = HvSubnodeAny;
> +		if (node != NUMA_NO_NODE) {
> +			input->proximity_domain_info.domain_id = node_to_pxm(node);
> +			input->proximity_domain_info.flags.reserved = 0;
> +			input->proximity_domain_info.flags.proximity_info_valid = 1;
> +			input->proximity_domain_info.flags.proximity_preferred = 1;
> +		} else {
> +			input->proximity_domain_info.as_uint64 = 0;
> +		}
> +		status = hv_do_hypercall(HVCALL_CREATE_VP, input, NULL);
> +		local_irq_restore(irq_flags);
> +
> +		if (status != HV_STATUS_INSUFFICIENT_MEMORY) {
> +			if (status != HV_STATUS_SUCCESS) {
> +				pr_err("%s: vcpu %u, lp %u, %d\n", __func__,
> +				       vp_index, flags, status);
> +				ret = status;
> +			}
> +			break;
> +		}
> +		ret = hv_call_deposit_pages(node, partition_id, 1);
> +
> +	} while (!ret);
> +
> +	return ret;
> +}
> +EXPORT_SYMBOL_GPL(hv_call_create_vp);
> +
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 67f5d35a73d3..4e590a167160 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -80,6 +80,10 @@ extern void  __percpu  **hyperv_pcpu_output_arg;
>  
>  extern u64 hv_current_partition_id;
>  
> +int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages);
> +int hv_call_add_logical_proc(int node, u32 lp_index, u32 acpi_id);
> +int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags);

You seem to be only doing EXPORT_SYMBOL_GPL() for
hv_call_deposit_pages() and hv_call_create_vp() but not for
hv_call_add_logical_proc() - is this intended? Also, I don't see
hv_call_create_vp()/hv_call_add_logical_proc() usage outside of
arch/x86/kernel/cpu/mshyperv.c so maybe we don't need to export them at all?

> +
>  static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
>  {
>  	u64 input_address = input ? virt_to_phys(input) : 0;
> diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
> index 87b1a79b19eb..b6c74e1a5524 100644
> --- a/include/asm-generic/hyperv-tlfs.h
> +++ b/include/asm-generic/hyperv-tlfs.h
> @@ -142,6 +142,8 @@ struct ms_hyperv_tsc_page {
>  #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX	0x0014
>  #define HVCALL_SEND_IPI_EX			0x0015
>  #define HVCALL_GET_PARTITION_ID			0x0046
> +#define HVCALL_DEPOSIT_MEMORY			0x0048
> +#define HVCALL_CREATE_VP			0x004e
>  #define HVCALL_GET_VP_REGISTERS			0x0050
>  #define HVCALL_SET_VP_REGISTERS			0x0051
>  #define HVCALL_POST_MESSAGE			0x005c
> @@ -149,6 +151,7 @@ struct ms_hyperv_tsc_page {
>  #define HVCALL_POST_DEBUG_DATA			0x0069
>  #define HVCALL_RETRIEVE_DEBUG_DATA		0x006a
>  #define HVCALL_RESET_DEBUG_SESSION		0x006b
> +#define HVCALL_ADD_LOGICAL_PROCESSOR		0x0076
>  #define HVCALL_RETARGET_INTERRUPT		0x007e
>  #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af
>  #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0
> @@ -413,6 +416,70 @@ struct hv_get_partition_id {
>  	u64 partition_id;
>  } __packed;
>  
> +/* HvDepositMemory hypercall */
> +struct hv_deposit_memory {
> +	u64 partition_id;
> +	u64 gpa_page_list[];
> +} __packed;
> +
> +struct hv_proximity_domain_flags {
> +	u32 proximity_preferred : 1;
> +	u32 reserved : 30;
> +	u32 proximity_info_valid : 1;
> +};
> +
> +/* Not a union in windows but useful for zeroing */
> +union hv_proximity_domain_info {
> +	struct {
> +		u32 domain_id;
> +		struct hv_proximity_domain_flags flags;
> +	};
> +	u64 as_uint64;
> +};
> +
> +struct hv_lp_startup_status {
> +	u64 hv_status;
> +	u64 substatus1;
> +	u64 substatus2;
> +	u64 substatus3;
> +	u64 substatus4;
> +	u64 substatus5;
> +	u64 substatus6;
> +};
> +
> +/* HvAddLogicalProcessor hypercall */
> +struct hv_add_logical_processor_in {
> +	u32 lp_index;
> +	u32 apic_id;
> +	union hv_proximity_domain_info proximity_domain_info;
> +	u64 flags;
> +};
> +
> +struct hv_add_logical_processor_out {
> +	struct hv_lp_startup_status startup_status;
> +};
> +
> +enum HV_SUBNODE_TYPE
> +{
> +    HvSubnodeAny = 0,
> +    HvSubnodeSocket,
> +    HvSubnodeAmdNode,
> +    HvSubnodeL3,
> +    HvSubnodeCount,
> +    HvSubnodeInvalid = -1
> +};
> +
> +/* HvCreateVp hypercall */
> +struct hv_create_vp {
> +	u64 partition_id;
> +	u32 vp_index;
> +	u8 padding[3];
> +	u8 subnode_type;
> +	u64 subnode_id;
> +	union hv_proximity_domain_info proximity_domain_info;
> +	u64 flags;
> +};

You add '__packed' to 'struct hv_deposit_memory' but not to 'struct
hv_create_vp'/'struct hv_add_logical_processor_in', this looks
inconsistent.

> +
>  /* HvRetargetDeviceInterrupt hypercall */
>  union hv_msi_entry {
>  	u64 as_uint64;

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 10/17] x86/hyperv: implement and use hv_smp_prepare_cpus
  2020-11-05 16:58 ` [PATCH v2 10/17] x86/hyperv: implement and use hv_smp_prepare_cpus Wei Liu
@ 2020-11-12 16:44   ` Vitaly Kuznetsov
  2020-11-13 15:56     ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-12 16:44 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	Lillian Grassin-Drake, K. Y. Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

Wei Liu <wei.liu@kernel.org> writes:

> Microsoft Hypervisor requires the root partition to make a few
> hypercalls to setup application processors before they can be used.
>
> Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
> Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
> Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---
> CPU hotplug and unplug is not yet supported in this setup, so those
> paths remain untouched.
> ---
>  arch/x86/kernel/cpu/mshyperv.c | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
>
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index f7633e1e4c82..4795e54550e6 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -31,6 +31,7 @@
>  #include <asm/reboot.h>
>  #include <asm/nmi.h>
>  #include <clocksource/hyperv_timer.h>
> +#include <asm/numa.h>
>  
>  struct ms_hyperv_info ms_hyperv;
>  EXPORT_SYMBOL_GPL(ms_hyperv);
> @@ -208,6 +209,30 @@ static void __init hv_smp_prepare_boot_cpu(void)
>  	hv_init_spinlocks();
>  #endif
>  }
> +
> +static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
> +{
> +#if defined(CONFIG_X86_64)

'#ifdef CONFIG_X86_64' is equally good as you can't compile x86_64
support as a module :-)

> +	int i;
> +	int ret;
> +
> +	native_smp_prepare_cpus(max_cpus);
> +

So hypotetically, if hv_root_partition is true but 'ifdef CONFIG_X86_64'
is false, we won't even be doing native_smp_prepare_cpus()? This doesn't
sound right. Either move it outside of #ifdef or put the #ifdef around
'smp_ops.smp_prepare_cpus' assignment too.

> +	for_each_present_cpu(i) {
> +		if (i == 0)
> +			continue;
> +		ret = hv_call_add_logical_proc(numa_cpu_node(i), i, cpu_physical_id(i));
> +		BUG_ON(ret);
> +	}
> +
> +	for_each_present_cpu(i) {
> +		if (i == 0)
> +			continue;
> +		ret = hv_call_create_vp(numa_cpu_node(i), hv_current_partition_id, i, i);
> +		BUG_ON(ret);
> +	}
> +#endif
> +}
>  #endif
>  
>  static void __init ms_hyperv_init_platform(void)
> @@ -364,6 +389,8 @@ static void __init ms_hyperv_init_platform(void)
>  
>  # ifdef CONFIG_SMP
>  	smp_ops.smp_prepare_boot_cpu = hv_smp_prepare_boot_cpu;
> +	if (hv_root_partition)
> +		smp_ops.smp_prepare_cpus = hv_smp_prepare_cpus;
>  # endif
>  
>  	/*

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 17/17] x86/hyperv: handle IO-APIC when running as root
  2020-11-05 16:58 ` [PATCH v2 17/17] x86/hyperv: handle IO-APIC when running as root Wei Liu
@ 2020-11-12 16:56   ` Vitaly Kuznetsov
  2020-11-13 16:01     ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-12 16:56 UTC (permalink / raw)
  To: Wei Liu, Linux on Hyper-V List
  Cc: virtualization, Linux Kernel List, Michael Kelley,
	Vineeth Pillai, Sunil Muthuswamy, Nuno Das Neves, Wei Liu,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

Wei Liu <wei.liu@kernel.org> writes:

> Just like MSI/MSI-X, IO-APIC interrupts are remapped by Microsoft
> Hypervisor when Linux runs as the root partition. Implement an IRQ chip
> to handle mapping and unmapping of IO-APIC interrupts.
>
> Use custom functions for mapping and unmapping ACPI GSIs. They will
> issue Microsoft Hypervisor specific hypercalls on top of the native
> routines.
>
> Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> Signed-off-by: Wei Liu <wei.liu@kernel.org>
> ---
>  arch/x86/hyperv/hv_init.c   |  13 +++
>  arch/x86/hyperv/irqdomain.c | 226 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 239 insertions(+)
>
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index a46b817b5b2a..2c2189832da7 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -267,10 +267,23 @@ static int hv_cpu_die(unsigned int cpu)
>  	return 0;
>  }
>  
> +int hv_acpi_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity);
> +void hv_acpi_unregister_gsi(u32 gsi);
> +
> +extern int (*native_acpi_register_gsi)(struct device *dev, u32 gsi, int trigger, int polarity);
> +extern void (*native_acpi_unregister_gsi)(u32 gsi);
> +
>  static int __init hv_pci_init(void)
>  {
>  	int gen2vm = efi_enabled(EFI_BOOT);
>  
> +	if (hv_root_partition) {
> +		native_acpi_register_gsi = __acpi_register_gsi;
> +		native_acpi_unregister_gsi = __acpi_unregister_gsi;
> +		__acpi_register_gsi = hv_acpi_register_gsi;
> +		__acpi_unregister_gsi = hv_acpi_unregister_gsi;
> +	}
> +
>  	/*
>  	 * For Generation-2 VM, we exit from pci_arch_init() by returning 0.
>  	 * The purpose is to suppress the harmless warning:
> diff --git a/arch/x86/hyperv/irqdomain.c b/arch/x86/hyperv/irqdomain.c
> index 80109e3cbf8f..ae9a728589a4 100644
> --- a/arch/x86/hyperv/irqdomain.c
> +++ b/arch/x86/hyperv/irqdomain.c
> @@ -9,6 +9,8 @@
>  #include <linux/pci.h>
>  #include <linux/irq.h>
>  #include <asm/mshyperv.h>
> +#include <asm/apic.h>
> +#include <asm/io_apic.h>
>  
>  struct rid_data {
>  	struct pci_dev *bridge;
> @@ -328,3 +330,227 @@ struct irq_domain * __init hv_create_pci_msi_domain(void)
>  	return d;
>  }
>  
> +/* Copied from io_apic.c */
> +union entry_union {
> +	struct { u32 w1, w2; };
> +	struct IO_APIC_route_entry entry;
> +};
> +
> +static int hv_unmap_ioapic_interrupt(int gsi)
> +{
> +	union hv_device_id device_id;
> +	int ioapic, ioapic_id;
> +	u8 ioapic_pin;
> +	struct IO_APIC_route_entry ire;
> +	union entry_union eu;
> +	struct hv_interrupt_entry entry;
> +
> +	ioapic = mp_find_ioapic(gsi);
> +	ioapic_pin = mp_find_ioapic_pin(ioapic, gsi);
> +	ioapic_id = mpc_ioapic_id(ioapic);
> +	ire = ioapic_read_entry(ioapic, ioapic_pin);
> +
> +	eu.entry = ire;
> +
> +	/*
> +	 * Polarity may have been set by us, but Hyper-V expects the exact same
> +	 * entry. See the mapping routine.
> +	 */
> +	eu.entry.polarity = 0;
> +
> +	memset(&entry, 0, sizeof(entry));
> +	entry.source = HV_INTERRUPT_SOURCE_IOAPIC;
> +	entry.ioapic_rte.low_uint32 = eu.w1;
> +	entry.ioapic_rte.high_uint32 = eu.w2;
> +
> +	device_id.as_uint64 = 0;
> +	device_id.device_type = HV_DEVICE_TYPE_IOAPIC;
> +	device_id.ioapic.ioapic_id = (u8)ioapic_id;
> +
> +	return hv_unmap_interrupt(device_id.as_uint64, &entry) & HV_HYPERCALL_RESULT_MASK;
> +}
> +
> +static int hv_map_ioapic_interrupt(int ioapic_id, int trigger, int vcpu, int vector,
> +		struct hv_interrupt_entry *out_entry)
> +{
> +	unsigned long flags;
> +	struct hv_input_map_device_interrupt *input;
> +	struct hv_output_map_device_interrupt *output;
> +	union hv_device_id device_id;
> +	struct hv_device_interrupt_descriptor *intr_desc;
> +	u16 status;
> +
> +	device_id.as_uint64 = 0;
> +	device_id.device_type = HV_DEVICE_TYPE_IOAPIC;
> +	device_id.ioapic.ioapic_id = (u8)ioapic_id;
> +
> +	local_irq_save(flags);
> +	input = *this_cpu_ptr(hyperv_pcpu_input_arg);
> +	output = *this_cpu_ptr(hyperv_pcpu_output_arg);
> +	memset(input, 0, sizeof(*input));
> +	intr_desc = &input->interrupt_descriptor;
> +	input->partition_id = hv_current_partition_id;
> +	input->device_id = device_id.as_uint64;
> +	intr_desc->interrupt_type = HV_X64_INTERRUPT_TYPE_FIXED;
> +	intr_desc->target.vector = vector;
> +	intr_desc->vector_count = 1;
> +
> +	if (trigger)
> +		intr_desc->trigger_mode = HV_INTERRUPT_TRIGGER_MODE_LEVEL;
> +	else
> +		intr_desc->trigger_mode = HV_INTERRUPT_TRIGGER_MODE_EDGE;
> +
> +	__set_bit(vcpu, (unsigned long *)&intr_desc->target.vp_mask);
> +
> +	status = hv_do_rep_hypercall(HVCALL_MAP_DEVICE_INTERRUPT, 0, 0, input, output) &
> +			 HV_HYPERCALL_RESULT_MASK;
> +	local_irq_restore(flags);
> +
> +	*out_entry = output->interrupt_entry;
> +
> +	return status;
> +}
> +
> +static unsigned int hv_ioapic_startup_irq(struct irq_data *data)
> +{
> +	u16 status;
> +	struct IO_APIC_route_entry ire;
> +	u32 vector;
> +	struct irq_cfg *cfg;
> +	int ioapic;
> +	u8 ioapic_pin;
> +	int ioapic_id;
> +	int gsi;
> +	union entry_union eu;
> +	struct cpumask *affinity;
> +	int cpu, vcpu;
> +	struct hv_interrupt_entry entry;
> +	struct mp_chip_data *mp_data = data->chip_data;
> +
> +	gsi = data->irq;
> +	cfg = irqd_cfg(data);
> +	affinity = irq_data_get_effective_affinity_mask(data);
> +	cpu = cpumask_first_and(affinity, cpu_online_mask);
> +	vcpu = hv_cpu_number_to_vp_number(cpu);
> +
> +	vector = cfg->vector;
> +
> +	ioapic = mp_find_ioapic(gsi);
> +	ioapic_pin = mp_find_ioapic_pin(ioapic, gsi);
> +	ioapic_id = mpc_ioapic_id(ioapic);
> +	ire = ioapic_read_entry(ioapic, ioapic_pin);
> +
> +	/*
> +	 * Always try unmapping. We do not have visibility into which whether
> +	 * an IO-APIC has been mapped or not. We can't use chip_data because it
> +	 * already points to mp_data.
> +	 *
> +	 * We don't use retarget interrupt hypercalls here because Hyper-V
> +	 * doens't allow root to change the vector or specify VPs outside of
> +	 * the set that is initially used during mapping.
> +	 */
> +	status = hv_unmap_ioapic_interrupt(gsi);
> +
> +	if (!(status == HV_STATUS_SUCCESS || status == HV_STATUS_INVALID_PARAMETER)) {
> +		pr_debug("%s: unexpected unmap status %d\n", __func__, status);
> +		return -1;

Nit: the function returns 'unsigned int' but I see other 'irq_startup'
routines return negative values too, however, they tend to returd
'-ESOMETHING' so maybe -EFAULT here?

> +	}
> +
> +	status = hv_map_ioapic_interrupt(ioapic_id, ire.trigger, vcpu, vector, &entry);
> +
> +	if (status != HV_STATUS_SUCCESS) {
> +		pr_err("%s: map hypercall failed, status %d\n", __func__, status);
> +		return -1;

and here.

> +	}
> +
> +	/* Update the entry in mp_chip_data. It is used in other places. */
> +	mp_data->entry = *(struct IO_APIC_route_entry *)&entry.ioapic_rte;
> +
> +	/* Sync polarity -- Hyper-V's returned polarity is always 0... */
> +	mp_data->entry.polarity = ire.polarity;
> +
> +	eu.w1 = entry.ioapic_rte.low_uint32;
> +	eu.w2 = entry.ioapic_rte.high_uint32;
> +	ioapic_write_entry(ioapic, ioapic_pin, eu.entry);
> +
> +	return 0;
> +}
> +
> +static void hv_ioapic_mask_irq(struct irq_data *data)
> +{
> +	mask_ioapic_irq(data);
> +}
> +
> +static void hv_ioapic_unmask_irq(struct irq_data *data)
> +{
> +	unmask_ioapic_irq(data);
> +}
> +
> +static int hv_ioapic_set_affinity(struct irq_data *data,
> +			       const struct cpumask *mask, bool force)
> +{
> +	/*
> +	 * We only update the affinity mask here. Programming the hardware is
> +	 * done in irq_startup.
> +	 */
> +	return ioapic_set_affinity(data, mask, force);
> +}
> +
> +void hv_ioapic_ack_level(struct irq_data *irq_data)
> +{
> +	/*
> +	 * Per email exchange with Hyper-V team, all is needed is write to
> +	 * LAPIC's EOI register. They don't support directed EOI to IO-APIC.
> +	 * Hyper-V handles it for us.
> +	 */
> +	apic_ack_irq(irq_data);
> +}
> +
> +struct irq_chip hv_ioapic_chip __read_mostly = {
> +	.name			= "HV-IO-APIC",
> +	.irq_startup		= hv_ioapic_startup_irq,
> +	.irq_mask		= hv_ioapic_mask_irq,
> +	.irq_unmask		= hv_ioapic_unmask_irq,
> +	.irq_ack		= irq_chip_ack_parent,
> +	.irq_eoi		= hv_ioapic_ack_level,
> +	.irq_set_affinity	= hv_ioapic_set_affinity,
> +	.irq_retrigger		= irq_chip_retrigger_hierarchy,
> +	.irq_get_irqchip_state	= ioapic_irq_get_chip_state,
> +	.flags			= IRQCHIP_SKIP_SET_WAKE,
> +};
> +
> +
> +int (*native_acpi_register_gsi)(struct device *dev, u32 gsi, int trigger, int polarity);
> +void (*native_acpi_unregister_gsi)(u32 gsi);
> +
> +int hv_acpi_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity)
> +{
> +	int irq = gsi;
> +
> +#ifdef CONFIG_X86_IO_APIC
> +	irq = native_acpi_register_gsi(dev, gsi, trigger, polarity);
> +	if (irq < 0) {
> +		pr_err("native_acpi_register_gsi failed %d\n", irq);
> +		return irq;
> +	}
> +
> +	if (trigger) {
> +		irq_set_status_flags(irq, IRQ_LEVEL);
> +		irq_set_chip_and_handler_name(irq, &hv_ioapic_chip,
> +			handle_fasteoi_irq, "ioapic-fasteoi");
> +	} else {
> +		irq_clear_status_flags(irq, IRQ_LEVEL);
> +		irq_set_chip_and_handler_name(irq, &hv_ioapic_chip,
> +			handle_edge_irq, "ioapic-edge");
> +	}
> +#endif
> +	return irq;
> +}
> +
> +void hv_acpi_unregister_gsi(u32 gsi)
> +{
> +#ifdef CONFIG_X86_IO_APIC
> +	(void)hv_unmap_ioapic_interrupt(gsi);
> +	native_acpi_unregister_gsi(gsi);
> +#endif
> +}

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 03/17] Drivers: hv: vmbus: skip VMBus initialization if Linux is root
  2020-11-12 15:24   ` Vitaly Kuznetsov
@ 2020-11-13 14:51     ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-13 14:51 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger

On Thu, Nov 12, 2020 at 04:24:38PM +0100, Vitaly Kuznetsov wrote:
> Wei Liu <wei.liu@kernel.org> writes:
> 
> > There is no VMBus and the other infrastructures initialized in
> > hv_acpi_init when Linux is running as the root partition.
> >
> > Signed-off-by: Wei Liu <wei.liu@kernel.org>
> > ---
> >  drivers/hv/vmbus_drv.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> > index 4fad3e6745e5..37c4d3a28309 100644
> > --- a/drivers/hv/vmbus_drv.c
> > +++ b/drivers/hv/vmbus_drv.c
> > @@ -2612,6 +2612,9 @@ static int __init hv_acpi_init(void)
> >  	if (!hv_is_hyperv_initialized())
> >  		return -ENODEV;
> >  
> > +	if (hv_root_partition)
> > +		return -ENODEV;
> > +
> 
> Nit: any particular reason why we need to return an error from here? I'd
> suggest we 'return 0;' if it doesn't break anything (we're still running
> on Hyper-V, it's just a coincedence that there's nothing to do here,
> eventually we may get some devices/handlers I guess. Also, there's going
> to be server-side Vmbus eventually, we may as well initialize it here.

Returning 0 should be fine. It is not likely to make any practical
difference at this stage.

Not sure what you mean by server-side Vmbus. If you mean Vmbus on the
host, yes, there will be. The initialization is, again, a bit different
there.  You will see why when my colleague post /dev/mshv code. The long
term goal is to refactor the Vmbus initialization code to work in all
scenarios.  We are not there yet though.

Wei.

> 
> >  	init_completion(&probe_event);
> >  
> >  	/*
> 
> -- 
> Vitaly
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 04/17] iommu/hyperv: don't setup IRQ remapping when running as root
  2020-11-12 15:27   ` Vitaly Kuznetsov
@ 2020-11-13 14:53     ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-13 14:53 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, Joerg Roedel, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Joerg Roedel,
	open list:IOMMU DRIVERS

On Thu, Nov 12, 2020 at 04:27:14PM +0100, Vitaly Kuznetsov wrote:
> Wei Liu <wei.liu@kernel.org> writes:
> 
> > The IOMMU code needs more work. We're sure for now the IRQ remapping
> > hooks are not applicable when Linux is the root.
> 
> Super-nitpick: I would suggest we always say 'root partition' as 'root'
> has a 'slightly different' meaning in Linux and this commit message may
> sound confusing to an unprepared reader.

Fixed.

> 
> >
> > Signed-off-by: Wei Liu <wei.liu@kernel.org>
> > Acked-by: Joerg Roedel <jroedel@suse.de>
> > ---
> >  drivers/iommu/hyperv-iommu.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/hyperv-iommu.c b/drivers/iommu/hyperv-iommu.c
> > index e09e2d734c57..8d3ce3add57d 100644
> > --- a/drivers/iommu/hyperv-iommu.c
> > +++ b/drivers/iommu/hyperv-iommu.c
> > @@ -20,6 +20,7 @@
> >  #include <asm/io_apic.h>
> >  #include <asm/irq_remapping.h>
> >  #include <asm/hypervisor.h>
> > +#include <asm/mshyperv.h>
> >  
> >  #include "irq_remapping.h"
> >  
> > @@ -143,7 +144,7 @@ static int __init hyperv_prepare_irq_remapping(void)
> >  	int i;
> >  
> >  	if (!hypervisor_is_type(X86_HYPER_MS_HYPERV) ||
> > -	    !x2apic_supported())
> > +	    !x2apic_supported() || hv_root_partition)
> >  		return -ENODEV;
> >  
> >  	fn = irq_domain_alloc_named_id_fwnode("HYPERV-IR", 0);
> 
> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

Thanks.

Wei.

> 
> -- 
> Vitaly
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 06/17] x86/hyperv: allocate output arg pages if required
  2020-11-12 15:35   ` Vitaly Kuznetsov
@ 2020-11-13 15:05     ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-13 15:05 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, Lillian Grassin-Drake,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

On Thu, Nov 12, 2020 at 04:35:48PM +0100, Vitaly Kuznetsov wrote:
> Wei Liu <wei.liu@kernel.org> writes:
[...]
> > @@ -209,14 +219,23 @@ static int hv_cpu_die(unsigned int cpu)
> >  	unsigned int new_cpu;
> >  	unsigned long flags;
> >  	void **input_arg;
> > -	void *input_pg = NULL;
> > +	void *pg;
> >  
> >  	local_irq_save(flags);
> >  	input_arg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg);
> > -	input_pg = *input_arg;
> > +	pg = *input_arg;
> >  	*input_arg = NULL;
> > +
> > +	if (hv_root_partition) {
> > +		void **output_arg;
> > +
> > +		output_arg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg);
> > +		*output_arg = NULL;
> > +	}
> > +
> >  	local_irq_restore(flags);
> > -	free_page((unsigned long)input_pg);
> > +
> > +	free_page((unsigned long)pg);
> >  
> 
> Hm, but in case we've allocated output_arg, don't we need to do
> 	free_pages((unsigned long)pg, 1);
> 
> instead?

Indeed. This has been fixed with:

    free_pages((unsigned long)pg, hv_root_partition ? 1 : 0);

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary
  2020-11-12 15:44   ` Vitaly Kuznetsov
@ 2020-11-13 15:21     ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-13 15:21 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, Lillian Grassin-Drake,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Arnd Bergmann,
	open list:GENERIC INCLUDE/ASM HEADER FILES

On Thu, Nov 12, 2020 at 04:44:39PM +0100, Vitaly Kuznetsov wrote:
[...]
> > +void __init hv_get_partition_id(void)
> > +{
> > +	struct hv_get_partition_id *output_page;
> > +	u16 status;
> > +	unsigned long flags;
> > +
> > +	local_irq_save(flags);
> > +	output_page = *this_cpu_ptr(hyperv_pcpu_output_arg);
> > +	status = hv_do_hypercall(HVCALL_GET_PARTITION_ID, NULL, output_page) &
> > +		HV_HYPERCALL_RESULT_MASK;
> > +	if (status != HV_STATUS_SUCCESS)
> > +		pr_err("Failed to get partition ID: %d\n", status);
> > +	else
> > +		hv_current_partition_id = output_page->partition_id;
> 
> Nit: I'd suggest we simplify this to:
> 
> 	if (status != HV_STATUS_SUCCESS) {
> 		pr_err("Failed to get partition ID: %d\n", status);
> 		BUG();
> 	}
> 	hv_current_partition_id = output_page->partition_id;
> 
> and drop BUG_ON() below;
> 
> > +	local_irq_restore(flags);
> > +
> > +	/* No point in proceeding if this failed */
> > +	BUG_ON(status != HV_STATUS_SUCCESS);
> > +}
> > +
> >  /*
> >   * This function is to be invoked early in the boot sequence after the
> >   * hypervisor has been detected.
> > @@ -430,6 +453,9 @@ void __init hyperv_init(void)
> >  
> >  	register_syscore_ops(&hv_syscore_ops);
> >  
> > +	if (hv_root_partition)
> > +		hv_get_partition_id();
> > +
> 
> 
> We don't seem to check that the partition has AccessPartitionId
> privilege. While I guess that root partitions always have it, I'd
> suggest we write this as:
> 

Yes. Root should always have that permission. That's my understanding.

> 	if (cpuid_ebx(HYPERV_CPUID_FEATURES) & HV_ACCESS_PARTITION_ID)
> 		hv_get_partition_id();
> 
> 	BUG_ON(hv_root_partition && !hv_current_partition_id);
> 
> for correctness. Also, we need to make sure '0' is not a valid partition
> id and use e.g. -1 otherwise.
> 

I've changed both places.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 08/17] x86/hyperv: handling hypercall page setup for root
  2020-11-12 15:51   ` Vitaly Kuznetsov
@ 2020-11-13 15:33     ` Wei Liu
  2020-11-13 16:09       ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-13 15:33 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, Lillian Grassin-Drake,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

On Thu, Nov 12, 2020 at 04:51:09PM +0100, Vitaly Kuznetsov wrote:
> Wei Liu <wei.liu@kernel.org> writes:
> 
> > When Linux is running as the root partition, the hypercall page will
> > have already been setup by Hyper-V. Copy the content over to the
> > allocated page.
> >
> > The suspend, resume and cleanup paths remain untouched because they are
> > not supported in this setup yet.
> 
> What about adding BUG_ONs there then?

I generally avoid cluttering code if I'm sure it definitely does not
work.

In any case, adding BUG_ONs is not the right answer. Both hv_suspend and
hv_resume can return an error code. I would rather just do

   if (hv_root_partition)
       return -EPERM;

in both places.

And also make hv_is_hibernation_supported return false when Linux is the
root partition.

> > +
> > +	if (hv_root_partition) {
> > +		struct page *pg;
> > +		void *src, *dst;
> > +
> > +		/*
> > +		 * For the root partition, the hypervisor will set up its
> > +		 * hypercall page. The hypervisor guarantees it will not show
> > +		 * up in the root's address space. The root can't change the
> > +		 * location of the hypercall page.
> > +		 *
> > +		 * Order is important here. We must enable the hypercall page
> > +		 * so it is populated with code, then copy the code to an
> > +		 * executable page.
> > +		 */
> > +		wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
> > +
> > +		pg = vmalloc_to_page(hv_hypercall_pg);
> > +		dst = kmap(pg);
> > +		src = memremap(hypercall_msr.guest_physical_address << PAGE_SHIFT, PAGE_SIZE,
> > +				MEMREMAP_WB);
> > +		BUG_ON(!(src && dst));
> > +		memcpy(dst, src, PAGE_SIZE);
> 
> Super-nit: while on x86 PAGE_SIZE always matches HV_HYP_PAGE_SIZE, would
> it be more accurate to use the later here?

Sure. That can be done.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 09/17] x86/hyperv: provide a bunch of helper functions
  2020-11-12 15:57   ` Vitaly Kuznetsov
@ 2020-11-13 15:51     ` Wei Liu
  2020-11-13 16:13       ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-13 15:51 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, Lillian Grassin-Drake,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Arnd Bergmann,
	open list:GENERIC INCLUDE/ASM HEADER FILES

On Thu, Nov 12, 2020 at 04:57:46PM +0100, Vitaly Kuznetsov wrote:
> Wei Liu <wei.liu@kernel.org> writes:
[...]
> > diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> > index 67f5d35a73d3..4e590a167160 100644
> > --- a/arch/x86/include/asm/mshyperv.h
> > +++ b/arch/x86/include/asm/mshyperv.h
> > @@ -80,6 +80,10 @@ extern void  __percpu  **hyperv_pcpu_output_arg;
> >  
> >  extern u64 hv_current_partition_id;
> >  
> > +int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages);
> > +int hv_call_add_logical_proc(int node, u32 lp_index, u32 acpi_id);
> > +int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags);
> 
> You seem to be only doing EXPORT_SYMBOL_GPL() for
> hv_call_deposit_pages() and hv_call_create_vp() but not for
> hv_call_add_logical_proc() - is this intended? Also, I don't see
> hv_call_create_vp()/hv_call_add_logical_proc() usage outside of
> arch/x86/kernel/cpu/mshyperv.c so maybe we don't need to export them at all?
> 

hv_call_deposit_pages and hv_call_create_vp will be needed by /dev/mshv,
which can be built as a module.

hv_call_add_logical_proc is only needed by mshyperv.c and not exported
in the first place.

This code is fine.

> > +
> >  static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
> >  {
> >  	u64 input_address = input ? virt_to_phys(input) : 0;
> > diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
> > index 87b1a79b19eb..b6c74e1a5524 100644
> > --- a/include/asm-generic/hyperv-tlfs.h
> > +++ b/include/asm-generic/hyperv-tlfs.h
> > @@ -142,6 +142,8 @@ struct ms_hyperv_tsc_page {
> >  #define HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX	0x0014
> >  #define HVCALL_SEND_IPI_EX			0x0015
> >  #define HVCALL_GET_PARTITION_ID			0x0046
> > +#define HVCALL_DEPOSIT_MEMORY			0x0048
> > +#define HVCALL_CREATE_VP			0x004e
> >  #define HVCALL_GET_VP_REGISTERS			0x0050
> >  #define HVCALL_SET_VP_REGISTERS			0x0051
> >  #define HVCALL_POST_MESSAGE			0x005c
> > @@ -149,6 +151,7 @@ struct ms_hyperv_tsc_page {
> >  #define HVCALL_POST_DEBUG_DATA			0x0069
> >  #define HVCALL_RETRIEVE_DEBUG_DATA		0x006a
> >  #define HVCALL_RESET_DEBUG_SESSION		0x006b
> > +#define HVCALL_ADD_LOGICAL_PROCESSOR		0x0076
> >  #define HVCALL_RETARGET_INTERRUPT		0x007e
> >  #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_SPACE 0x00af
> >  #define HVCALL_FLUSH_GUEST_PHYSICAL_ADDRESS_LIST 0x00b0
> > @@ -413,6 +416,70 @@ struct hv_get_partition_id {
> >  	u64 partition_id;
> >  } __packed;
> >  
> > +/* HvDepositMemory hypercall */
> > +struct hv_deposit_memory {
> > +	u64 partition_id;
> > +	u64 gpa_page_list[];
> > +} __packed;
> > +
> > +struct hv_proximity_domain_flags {
> > +	u32 proximity_preferred : 1;
> > +	u32 reserved : 30;
> > +	u32 proximity_info_valid : 1;
> > +};
> > +
> > +/* Not a union in windows but useful for zeroing */
> > +union hv_proximity_domain_info {
> > +	struct {
> > +		u32 domain_id;
> > +		struct hv_proximity_domain_flags flags;
> > +	};
> > +	u64 as_uint64;
> > +};
> > +
> > +struct hv_lp_startup_status {
> > +	u64 hv_status;
> > +	u64 substatus1;
> > +	u64 substatus2;
> > +	u64 substatus3;
> > +	u64 substatus4;
> > +	u64 substatus5;
> > +	u64 substatus6;
> > +};
> > +
> > +/* HvAddLogicalProcessor hypercall */
> > +struct hv_add_logical_processor_in {
> > +	u32 lp_index;
> > +	u32 apic_id;
> > +	union hv_proximity_domain_info proximity_domain_info;
> > +	u64 flags;
> > +};
> > +
> > +struct hv_add_logical_processor_out {
> > +	struct hv_lp_startup_status startup_status;
> > +};
> > +
> > +enum HV_SUBNODE_TYPE
> > +{
> > +    HvSubnodeAny = 0,
> > +    HvSubnodeSocket,
> > +    HvSubnodeAmdNode,
> > +    HvSubnodeL3,
> > +    HvSubnodeCount,
> > +    HvSubnodeInvalid = -1
> > +};
> > +
> > +/* HvCreateVp hypercall */
> > +struct hv_create_vp {
> > +	u64 partition_id;
> > +	u32 vp_index;
> > +	u8 padding[3];
> > +	u8 subnode_type;
> > +	u64 subnode_id;
> > +	union hv_proximity_domain_info proximity_domain_info;
> > +	u64 flags;
> > +};
> 
> You add '__packed' to 'struct hv_deposit_memory' but not to 'struct
> hv_create_vp'/'struct hv_add_logical_processor_in', this looks
> inconsistent.

Fixed.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 10/17] x86/hyperv: implement and use hv_smp_prepare_cpus
  2020-11-12 16:44   ` Vitaly Kuznetsov
@ 2020-11-13 15:56     ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-13 15:56 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, Lillian Grassin-Drake,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

On Thu, Nov 12, 2020 at 05:44:48PM +0100, Vitaly Kuznetsov wrote:
> Wei Liu <wei.liu@kernel.org> writes:
> 
> > Microsoft Hypervisor requires the root partition to make a few
> > hypercalls to setup application processors before they can be used.
> >
> > Signed-off-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
> > Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> > Co-Developed-by: Lillian Grassin-Drake <ligrassi@microsoft.com>
> > Co-Developed-by: Sunil Muthuswamy <sunilmut@microsoft.com>
> > Signed-off-by: Wei Liu <wei.liu@kernel.org>
> > ---
> > CPU hotplug and unplug is not yet supported in this setup, so those
> > paths remain untouched.
> > ---
> >  arch/x86/kernel/cpu/mshyperv.c | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> >
> > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> > index f7633e1e4c82..4795e54550e6 100644
> > --- a/arch/x86/kernel/cpu/mshyperv.c
> > +++ b/arch/x86/kernel/cpu/mshyperv.c
> > @@ -31,6 +31,7 @@
> >  #include <asm/reboot.h>
> >  #include <asm/nmi.h>
> >  #include <clocksource/hyperv_timer.h>
> > +#include <asm/numa.h>
> >  
> >  struct ms_hyperv_info ms_hyperv;
> >  EXPORT_SYMBOL_GPL(ms_hyperv);
> > @@ -208,6 +209,30 @@ static void __init hv_smp_prepare_boot_cpu(void)
> >  	hv_init_spinlocks();
> >  #endif
> >  }
> > +
> > +static void __init hv_smp_prepare_cpus(unsigned int max_cpus)
> > +{
> > +#if defined(CONFIG_X86_64)
> 
> '#ifdef CONFIG_X86_64' is equally good as you can't compile x86_64
> support as a module :-)
> 
> > +	int i;
> > +	int ret;
> > +
> > +	native_smp_prepare_cpus(max_cpus);
> > +
> 
> So hypotetically, if hv_root_partition is true but 'ifdef CONFIG_X86_64'
> is false, we won't even be doing native_smp_prepare_cpus()? This doesn't
> sound right. Either move it outside of #ifdef or put the #ifdef around
> 'smp_ops.smp_prepare_cpus' assignment too.
> 

Fixed. Thanks.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 17/17] x86/hyperv: handle IO-APIC when running as root
  2020-11-12 16:56   ` Vitaly Kuznetsov
@ 2020-11-13 16:01     ` Wei Liu
  2020-11-13 16:04       ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-13 16:01 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

On Thu, Nov 12, 2020 at 05:56:41PM +0100, Vitaly Kuznetsov wrote:
[...]
> > +static unsigned int hv_ioapic_startup_irq(struct irq_data *data)
> > +{
> > +	u16 status;
> > +	struct IO_APIC_route_entry ire;
> > +	u32 vector;
> > +	struct irq_cfg *cfg;
> > +	int ioapic;
> > +	u8 ioapic_pin;
> > +	int ioapic_id;
> > +	int gsi;
> > +	union entry_union eu;
> > +	struct cpumask *affinity;
> > +	int cpu, vcpu;
> > +	struct hv_interrupt_entry entry;
> > +	struct mp_chip_data *mp_data = data->chip_data;
> > +
> > +	gsi = data->irq;
> > +	cfg = irqd_cfg(data);
> > +	affinity = irq_data_get_effective_affinity_mask(data);
> > +	cpu = cpumask_first_and(affinity, cpu_online_mask);
> > +	vcpu = hv_cpu_number_to_vp_number(cpu);
> > +
> > +	vector = cfg->vector;
> > +
> > +	ioapic = mp_find_ioapic(gsi);
> > +	ioapic_pin = mp_find_ioapic_pin(ioapic, gsi);
> > +	ioapic_id = mpc_ioapic_id(ioapic);
> > +	ire = ioapic_read_entry(ioapic, ioapic_pin);
> > +
> > +	/*
> > +	 * Always try unmapping. We do not have visibility into which whether
> > +	 * an IO-APIC has been mapped or not. We can't use chip_data because it
> > +	 * already points to mp_data.
> > +	 *
> > +	 * We don't use retarget interrupt hypercalls here because Hyper-V
> > +	 * doens't allow root to change the vector or specify VPs outside of
> > +	 * the set that is initially used during mapping.
> > +	 */
> > +	status = hv_unmap_ioapic_interrupt(gsi);
> > +
> > +	if (!(status == HV_STATUS_SUCCESS || status == HV_STATUS_INVALID_PARAMETER)) {
> > +		pr_debug("%s: unexpected unmap status %d\n", __func__, status);
> > +		return -1;
> 
> Nit: the function returns 'unsigned int' but I see other 'irq_startup'
> routines return negative values too, however, they tend to returd
> '-ESOMETHING' so maybe -EFAULT here?
> 

The return type should've been int instead. That's what the function
signature in struct irq_chip looks like.

> > +	}
> > +
> > +	status = hv_map_ioapic_interrupt(ioapic_id, ire.trigger, vcpu, vector, &entry);
> > +
> > +	if (status != HV_STATUS_SUCCESS) {
> > +		pr_err("%s: map hypercall failed, status %d\n", __func__, status);
> > +		return -1;
> 
> and here.
> 

-EINVAL would be more appropriate in both cases.

Wei.

> > +	}
> > +
> > +	/* Update the entry in mp_chip_data. It is used in other places. */
> > +	mp_data->entry = *(struct IO_APIC_route_entry *)&entry.ioapic_rte;
> > +
> > +	/* Sync polarity -- Hyper-V's returned polarity is always 0... */
> > +	mp_data->entry.polarity = ire.polarity;
> > +
> > +	eu.w1 = entry.ioapic_rte.low_uint32;
> > +	eu.w2 = entry.ioapic_rte.high_uint32;
> > +	ioapic_write_entry(ioapic, ioapic_pin, eu.entry);
> > +
> > +	return 0;
> > +}
> > +
> > +static void hv_ioapic_mask_irq(struct irq_data *data)
> > +{
> > +	mask_ioapic_irq(data);
> > +}
> > +
> > +static void hv_ioapic_unmask_irq(struct irq_data *data)
> > +{
> > +	unmask_ioapic_irq(data);
> > +}
> > +
> > +static int hv_ioapic_set_affinity(struct irq_data *data,
> > +			       const struct cpumask *mask, bool force)
> > +{
> > +	/*
> > +	 * We only update the affinity mask here. Programming the hardware is
> > +	 * done in irq_startup.
> > +	 */
> > +	return ioapic_set_affinity(data, mask, force);
> > +}
> > +
> > +void hv_ioapic_ack_level(struct irq_data *irq_data)
> > +{
> > +	/*
> > +	 * Per email exchange with Hyper-V team, all is needed is write to
> > +	 * LAPIC's EOI register. They don't support directed EOI to IO-APIC.
> > +	 * Hyper-V handles it for us.
> > +	 */
> > +	apic_ack_irq(irq_data);
> > +}
> > +
> > +struct irq_chip hv_ioapic_chip __read_mostly = {
> > +	.name			= "HV-IO-APIC",
> > +	.irq_startup		= hv_ioapic_startup_irq,
> > +	.irq_mask		= hv_ioapic_mask_irq,
> > +	.irq_unmask		= hv_ioapic_unmask_irq,
> > +	.irq_ack		= irq_chip_ack_parent,
> > +	.irq_eoi		= hv_ioapic_ack_level,
> > +	.irq_set_affinity	= hv_ioapic_set_affinity,
> > +	.irq_retrigger		= irq_chip_retrigger_hierarchy,
> > +	.irq_get_irqchip_state	= ioapic_irq_get_chip_state,
> > +	.flags			= IRQCHIP_SKIP_SET_WAKE,
> > +};
> > +
> > +
> > +int (*native_acpi_register_gsi)(struct device *dev, u32 gsi, int trigger, int polarity);
> > +void (*native_acpi_unregister_gsi)(u32 gsi);
> > +
> > +int hv_acpi_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity)
> > +{
> > +	int irq = gsi;
> > +
> > +#ifdef CONFIG_X86_IO_APIC
> > +	irq = native_acpi_register_gsi(dev, gsi, trigger, polarity);
> > +	if (irq < 0) {
> > +		pr_err("native_acpi_register_gsi failed %d\n", irq);
> > +		return irq;
> > +	}
> > +
> > +	if (trigger) {
> > +		irq_set_status_flags(irq, IRQ_LEVEL);
> > +		irq_set_chip_and_handler_name(irq, &hv_ioapic_chip,
> > +			handle_fasteoi_irq, "ioapic-fasteoi");
> > +	} else {
> > +		irq_clear_status_flags(irq, IRQ_LEVEL);
> > +		irq_set_chip_and_handler_name(irq, &hv_ioapic_chip,
> > +			handle_edge_irq, "ioapic-edge");
> > +	}
> > +#endif
> > +	return irq;
> > +}
> > +
> > +void hv_acpi_unregister_gsi(u32 gsi)
> > +{
> > +#ifdef CONFIG_X86_IO_APIC
> > +	(void)hv_unmap_ioapic_interrupt(gsi);
> > +	native_acpi_unregister_gsi(gsi);
> > +#endif
> > +}
> 
> -- 
> Vitaly
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 17/17] x86/hyperv: handle IO-APIC when running as root
  2020-11-13 16:01     ` Wei Liu
@ 2020-11-13 16:04       ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-13 16:04 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

On Fri, Nov 13, 2020 at 04:01:58PM +0000, Wei Liu wrote:
> On Thu, Nov 12, 2020 at 05:56:41PM +0100, Vitaly Kuznetsov wrote:
> [...]
> > > +static unsigned int hv_ioapic_startup_irq(struct irq_data *data)
> > > +{
> > > +	u16 status;
> > > +	struct IO_APIC_route_entry ire;
> > > +	u32 vector;
> > > +	struct irq_cfg *cfg;
> > > +	int ioapic;
> > > +	u8 ioapic_pin;
> > > +	int ioapic_id;
> > > +	int gsi;
> > > +	union entry_union eu;
> > > +	struct cpumask *affinity;
> > > +	int cpu, vcpu;
> > > +	struct hv_interrupt_entry entry;
> > > +	struct mp_chip_data *mp_data = data->chip_data;
> > > +
> > > +	gsi = data->irq;
> > > +	cfg = irqd_cfg(data);
> > > +	affinity = irq_data_get_effective_affinity_mask(data);
> > > +	cpu = cpumask_first_and(affinity, cpu_online_mask);
> > > +	vcpu = hv_cpu_number_to_vp_number(cpu);
> > > +
> > > +	vector = cfg->vector;
> > > +
> > > +	ioapic = mp_find_ioapic(gsi);
> > > +	ioapic_pin = mp_find_ioapic_pin(ioapic, gsi);
> > > +	ioapic_id = mpc_ioapic_id(ioapic);
> > > +	ire = ioapic_read_entry(ioapic, ioapic_pin);
> > > +
> > > +	/*
> > > +	 * Always try unmapping. We do not have visibility into which whether
> > > +	 * an IO-APIC has been mapped or not. We can't use chip_data because it
> > > +	 * already points to mp_data.
> > > +	 *
> > > +	 * We don't use retarget interrupt hypercalls here because Hyper-V
> > > +	 * doens't allow root to change the vector or specify VPs outside of
> > > +	 * the set that is initially used during mapping.
> > > +	 */
> > > +	status = hv_unmap_ioapic_interrupt(gsi);
> > > +
> > > +	if (!(status == HV_STATUS_SUCCESS || status == HV_STATUS_INVALID_PARAMETER)) {
> > > +		pr_debug("%s: unexpected unmap status %d\n", __func__, status);
> > > +		return -1;
> > 
> > Nit: the function returns 'unsigned int' but I see other 'irq_startup'
> > routines return negative values too, however, they tend to returd
> > '-ESOMETHING' so maybe -EFAULT here?
> > 
> 
> The return type should've been int instead. That's what the function
> signature in struct irq_chip looks like.

Actually it is unsigned int. Oh well.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 08/17] x86/hyperv: handling hypercall page setup for root
  2020-11-13 15:33     ` Wei Liu
@ 2020-11-13 16:09       ` Wei Liu
  2020-11-13 16:16         ` Vitaly Kuznetsov
  0 siblings, 1 reply; 58+ messages in thread
From: Wei Liu @ 2020-11-13 16:09 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, Lillian Grassin-Drake,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

On Fri, Nov 13, 2020 at 03:33:33PM +0000, Wei Liu wrote:
> On Thu, Nov 12, 2020 at 04:51:09PM +0100, Vitaly Kuznetsov wrote:
> > Wei Liu <wei.liu@kernel.org> writes:
> > 
> > > When Linux is running as the root partition, the hypercall page will
> > > have already been setup by Hyper-V. Copy the content over to the
> > > allocated page.
> > >
> > > The suspend, resume and cleanup paths remain untouched because they are
> > > not supported in this setup yet.
> > 
> > What about adding BUG_ONs there then?
> 
> I generally avoid cluttering code if I'm sure it definitely does not
> work.
> 
> In any case, adding BUG_ONs is not the right answer. Both hv_suspend and
> hv_resume can return an error code. I would rather just do
> 
>    if (hv_root_partition)
>        return -EPERM;
> 
> in both places.

Correction: hv_resume is void, so I won't add that code snippet. But we
should still be fine because hv_suspend will have already failed in the
first place.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 09/17] x86/hyperv: provide a bunch of helper functions
  2020-11-13 15:51     ` Wei Liu
@ 2020-11-13 16:13       ` Vitaly Kuznetsov
  2020-11-16 11:41         ` Wei Liu
  0 siblings, 1 reply; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-13 16:13 UTC (permalink / raw)
  To: Wei Liu
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, Lillian Grassin-Drake,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Arnd Bergmann,
	open list:GENERIC INCLUDE/ASM HEADER FILES

Wei Liu <wei.liu@kernel.org> writes:

> On Thu, Nov 12, 2020 at 04:57:46PM +0100, Vitaly Kuznetsov wrote:
>> Wei Liu <wei.liu@kernel.org> writes:
> [...]
>> > diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
>> > index 67f5d35a73d3..4e590a167160 100644
>> > --- a/arch/x86/include/asm/mshyperv.h
>> > +++ b/arch/x86/include/asm/mshyperv.h
>> > @@ -80,6 +80,10 @@ extern void  __percpu  **hyperv_pcpu_output_arg;
>> >  
>> >  extern u64 hv_current_partition_id;
>> >  
>> > +int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages);
>> > +int hv_call_add_logical_proc(int node, u32 lp_index, u32 acpi_id);
>> > +int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags);
>> 
>> You seem to be only doing EXPORT_SYMBOL_GPL() for
>> hv_call_deposit_pages() and hv_call_create_vp() but not for
>> hv_call_add_logical_proc() - is this intended? Also, I don't see
>> hv_call_create_vp()/hv_call_add_logical_proc() usage outside of
>> arch/x86/kernel/cpu/mshyperv.c so maybe we don't need to export them at all?
>> 
>
> hv_call_deposit_pages and hv_call_create_vp will be needed by /dev/mshv,
> which can be built as a module.
>

I'd suggest to move EXPORT_SYMBOL_GPL() to the series adding '/dev/mshv'
then. Dangling exported symbols with no users tend to be removed. No
strong opinion, just a suggestion.

> hv_call_add_logical_proc is only needed by mshyperv.c and not exported
> in the first place.
>
> This code is fine.

Thanks for the confirmation!

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 08/17] x86/hyperv: handling hypercall page setup for root
  2020-11-13 16:09       ` Wei Liu
@ 2020-11-13 16:16         ` Vitaly Kuznetsov
  0 siblings, 0 replies; 58+ messages in thread
From: Vitaly Kuznetsov @ 2020-11-13 16:16 UTC (permalink / raw)
  To: Wei Liu
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, Lillian Grassin-Drake,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin

Wei Liu <wei.liu@kernel.org> writes:

> On Fri, Nov 13, 2020 at 03:33:33PM +0000, Wei Liu wrote:
>> On Thu, Nov 12, 2020 at 04:51:09PM +0100, Vitaly Kuznetsov wrote:
>> > Wei Liu <wei.liu@kernel.org> writes:
>> > 
>> > > When Linux is running as the root partition, the hypercall page will
>> > > have already been setup by Hyper-V. Copy the content over to the
>> > > allocated page.
>> > >
>> > > The suspend, resume and cleanup paths remain untouched because they are
>> > > not supported in this setup yet.
>> > 
>> > What about adding BUG_ONs there then?
>> 
>> I generally avoid cluttering code if I'm sure it definitely does not
>> work.
>> 
>> In any case, adding BUG_ONs is not the right answer. Both hv_suspend and
>> hv_resume can return an error code. I would rather just do
>> 
>>    if (hv_root_partition)
>>        return -EPERM;
>> 
>> in both places.
>
> Correction: hv_resume is void, so I won't add that code snippet. But we
> should still be fine because hv_suspend will have already failed in the
> first place.
>

Works for me. I just very much prefer to get reports like "system
doesn't go to sleep" instead of "something crashes when I put my system
to sleep")

-- 
Vitaly


^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH v2 09/17] x86/hyperv: provide a bunch of helper functions
  2020-11-13 16:13       ` Vitaly Kuznetsov
@ 2020-11-16 11:41         ` Wei Liu
  0 siblings, 0 replies; 58+ messages in thread
From: Wei Liu @ 2020-11-16 11:41 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Wei Liu, Linux on Hyper-V List, virtualization,
	Linux Kernel List, Michael Kelley, Vineeth Pillai,
	Sunil Muthuswamy, Nuno Das Neves, Lillian Grassin-Drake,
	K. Y. Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT),
	H. Peter Anvin, Arnd Bergmann,
	open list:GENERIC INCLUDE/ASM HEADER FILES

On Fri, Nov 13, 2020 at 05:13:44PM +0100, Vitaly Kuznetsov wrote:
> Wei Liu <wei.liu@kernel.org> writes:
> 
> > On Thu, Nov 12, 2020 at 04:57:46PM +0100, Vitaly Kuznetsov wrote:
> >> Wei Liu <wei.liu@kernel.org> writes:
> > [...]
> >> > diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> >> > index 67f5d35a73d3..4e590a167160 100644
> >> > --- a/arch/x86/include/asm/mshyperv.h
> >> > +++ b/arch/x86/include/asm/mshyperv.h
> >> > @@ -80,6 +80,10 @@ extern void  __percpu  **hyperv_pcpu_output_arg;
> >> >  
> >> >  extern u64 hv_current_partition_id;
> >> >  
> >> > +int hv_call_deposit_pages(int node, u64 partition_id, u32 num_pages);
> >> > +int hv_call_add_logical_proc(int node, u32 lp_index, u32 acpi_id);
> >> > +int hv_call_create_vp(int node, u64 partition_id, u32 vp_index, u32 flags);
> >> 
> >> You seem to be only doing EXPORT_SYMBOL_GPL() for
> >> hv_call_deposit_pages() and hv_call_create_vp() but not for
> >> hv_call_add_logical_proc() - is this intended? Also, I don't see
> >> hv_call_create_vp()/hv_call_add_logical_proc() usage outside of
> >> arch/x86/kernel/cpu/mshyperv.c so maybe we don't need to export them at all?
> >> 
> >
> > hv_call_deposit_pages and hv_call_create_vp will be needed by /dev/mshv,
> > which can be built as a module.
> >
> 
> I'd suggest to move EXPORT_SYMBOL_GPL() to the series adding '/dev/mshv'
> then. Dangling exported symbols with no users tend to be removed. No
> strong opinion, just a suggestion.

Sure. This is now done in my v3.

Wei.

^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread, other threads:[~2020-11-16 12:38 UTC | newest]

Thread overview: 58+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-05 16:57 [PATCH v2 00/17] Introducing Linux root partition support for Microsoft Hypervisor Wei Liu
2020-11-05 16:57 ` [PATCH v2 01/17] asm-generic/hyperv: change HV_CPU_POWER_MANAGEMENT to HV_CPU_MANAGEMENT Wei Liu
2020-11-12 15:14   ` Vitaly Kuznetsov
2020-11-05 16:57 ` [PATCH v2 02/17] x86/hyperv: detect if Linux is the root partition Wei Liu
2020-11-05 19:16   ` kernel test robot
2020-11-12 11:42     ` Wei Liu
2020-11-12 11:46       ` Wei Liu
2020-11-12 12:22         ` Wei Liu
2020-11-05 20:23   ` kernel test robot
2020-11-12 15:16   ` Vitaly Kuznetsov
2020-11-12 15:51     ` Wei Liu
2020-11-05 16:58 ` [PATCH v2 03/17] Drivers: hv: vmbus: skip VMBus initialization if Linux is root Wei Liu
2020-11-12 15:24   ` Vitaly Kuznetsov
2020-11-13 14:51     ` Wei Liu
2020-11-05 16:58 ` [PATCH v2 04/17] iommu/hyperv: don't setup IRQ remapping when running as root Wei Liu
2020-11-12 15:27   ` Vitaly Kuznetsov
2020-11-13 14:53     ` Wei Liu
2020-11-05 16:58 ` [PATCH v2 05/17] clocksource/hyperv: use MSR-based access if " Wei Liu
2020-11-12  9:56   ` Daniel Lezcano
2020-11-12 11:24     ` Wei Liu
2020-11-12 11:40       ` Daniel Lezcano
2020-11-12 11:42         ` Wei Liu
2020-11-12 15:30   ` Vitaly Kuznetsov
2020-11-12 15:52     ` Wei Liu
2020-11-05 16:58 ` [PATCH v2 06/17] x86/hyperv: allocate output arg pages if required Wei Liu
2020-11-12 15:35   ` Vitaly Kuznetsov
2020-11-13 15:05     ` Wei Liu
2020-11-05 16:58 ` [PATCH v2 07/17] x86/hyperv: extract partition ID from Microsoft Hypervisor if necessary Wei Liu
2020-11-05 20:07   ` kernel test robot
2020-11-12 11:54     ` Wei Liu
2020-11-05 20:20   ` kernel test robot
2020-11-12 15:44   ` Vitaly Kuznetsov
2020-11-13 15:21     ` Wei Liu
2020-11-05 16:58 ` [PATCH v2 08/17] x86/hyperv: handling hypercall page setup for root Wei Liu
2020-11-12 15:51   ` Vitaly Kuznetsov
2020-11-13 15:33     ` Wei Liu
2020-11-13 16:09       ` Wei Liu
2020-11-13 16:16         ` Vitaly Kuznetsov
2020-11-05 16:58 ` [PATCH v2 09/17] x86/hyperv: provide a bunch of helper functions Wei Liu
2020-11-12 15:57   ` Vitaly Kuznetsov
2020-11-13 15:51     ` Wei Liu
2020-11-13 16:13       ` Vitaly Kuznetsov
2020-11-16 11:41         ` Wei Liu
2020-11-05 16:58 ` [PATCH v2 10/17] x86/hyperv: implement and use hv_smp_prepare_cpus Wei Liu
2020-11-12 16:44   ` Vitaly Kuznetsov
2020-11-13 15:56     ` Wei Liu
2020-11-05 16:58 ` [PATCH v2 11/17] asm-generic/hyperv: update hv_msi_entry Wei Liu
2020-11-05 16:58 ` [PATCH v2 12/17] asm-generic/hyperv: update hv_interrupt_entry Wei Liu
2020-11-05 16:58 ` [PATCH v2 13/17] asm-generic/hyperv: introduce hv_device_id and auxiliary structures Wei Liu
2020-11-05 16:58 ` [PATCH v2 14/17] asm-generic/hyperv: import data structures for mapping device interrupts Wei Liu
2020-11-05 16:58 ` [PATCH v2 15/17] x86/hyperv: implement an MSI domain for root partition Wei Liu
2020-11-12 13:50   ` Wei Liu
2020-11-05 16:58 ` [PATCH v2 16/17] x86/ioapic: export a few functions and data structures via io_apic.h Wei Liu
2020-11-12 11:39   ` Wei Liu
2020-11-05 16:58 ` [PATCH v2 17/17] x86/hyperv: handle IO-APIC when running as root Wei Liu
2020-11-12 16:56   ` Vitaly Kuznetsov
2020-11-13 16:01     ` Wei Liu
2020-11-13 16:04       ` Wei Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).