linux-kernel.vger.kernel.org archive mirror
* [PATCH v2] hyperv: root partition faults writing to VP ASSIST MSR PAGE
@ 2021-07-21 18:03 Praveen Kumar
  2021-07-22  5:53 ` Michael Kelley
  2021-07-22 10:27 ` Wei Liu
  0 siblings, 2 replies; 5+ messages in thread
From: Praveen Kumar @ 2021-07-21 18:03 UTC
  To: linux-hyperv, linux-kernel
  Cc: kys, haiyangz, sthemmin, wei.liu, decui, tglx, mingo, bp, x86,
	hpa, viremana, sunilmut, nunodasneves

For the root partition, the VP assist pages are pre-determined by the
hypervisor, and the root kernel is not allowed to move them to
different locations. The current implementation nevertheless writes
to the VP assist page MSR on the root partition, which produces the
stack trace below.

[ 2.778197] unchecked MSR access error: WRMSR to 0x40000073 (tried to
write 0x0000000145ac5001) at rIP: 0xffffffff810c1084
(native_write_msr+0x4/0x30)
[ 2.784867] Call Trace:
[ 2.791507] hv_cpu_init+0xf1/0x1c0
[ 2.798144] ? hyperv_report_panic+0xd0/0xd0
[ 2.804806] cpuhp_invoke_callback+0x11a/0x440
[ 2.811465] ? hv_resume+0x90/0x90
[ 2.818137] cpuhp_issue_call+0x126/0x130
[ 2.824782] __cpuhp_setup_state_cpuslocked+0x102/0x2b0
[ 2.831427] ? hyperv_report_panic+0xd0/0xd0
[ 2.838075] ? hyperv_report_panic+0xd0/0xd0
[ 2.844723] ? hv_resume+0x90/0x90
[ 2.851375] __cpuhp_setup_state+0x3d/0x90
[ 2.858030] hyperv_init+0x14e/0x410
[ 2.864689] ? enable_IR_x2apic+0x190/0x1a0
[ 2.871349] apic_intr_mode_init+0x8b/0x100
[ 2.878017] x86_late_time_init+0x20/0x30
[ 2.884675] start_kernel+0x459/0x4fb
[ 2.891329] secondary_startup_64_no_verify+0xb0/0xbb
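
For reference, the value in the failed write can be decoded by hand. The
sketch below is a userspace illustration, not kernel code. It assumes the
address shift is 12 and the enable flag is bit 0, which is how the non-root
path builds the value (page frame number shifted by
HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT, OR-ed with
HV_X64_MSR_VP_ASSIST_PAGE_ENABLE). The macro and function names are made up
for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration; the kernel headers hold the real ones. */
#define VP_ASSIST_ENABLE      0x1ULL
#define VP_ASSIST_ADDR_SHIFT  12

/* Build the MSR value as the non-root path does: PFN shifted up, enable bit set. */
static uint64_t vp_assist_msr_encode(uint64_t pfn)
{
	/* The PFN must fit in the bits above the shift. */
	assert(pfn < (1ULL << (64 - VP_ASSIST_ADDR_SHIFT)));
	return (pfn << VP_ASSIST_ADDR_SHIFT) | VP_ASSIST_ENABLE;
}

/* Report whether the enable bit is set in an MSR value. */
static int vp_assist_msr_enabled(uint64_t val)
{
	return (val & VP_ASSIST_ENABLE) != 0;
}
```

With these definitions, vp_assist_msr_encode(0x145ac5) yields 0x145ac5001,
the value shown in the trace: page frame 0x145ac5 with the enable bit set.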

Since, the hypervisor already provides the VP assist page for root
partition, we need to memremaps the memory from hypervisor for root
kernel to use. The mapping is done in hv_cpu_init during bringup and
is unmapped in hv_cpu_die during teardown.

Signed-off-by: Praveen Kumar <kumarpraveen@linux.microsoft.com>
---
 arch/x86/hyperv/hv_init.c | 53 ++++++++++++++++++++++++++-------------
 1 file changed, 36 insertions(+), 17 deletions(-)

changelog:
v1: initial patch
v2: commit message changes, removal of HV_MSR_APIC_ACCESS_AVAILABLE
    check and addition of null check before reading the VP assist MSR
    for root partition

---
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 6f247e7e07eb..ffd3d3b37235 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -55,26 +55,41 @@ static int hv_cpu_init(unsigned int cpu)
 		return 0;
 
 	/*
-	 * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section
-	 * 5.2.1 "GPA Overlay Pages"). Here it must be zeroed out to make sure
-	 * we always write the EOI MSR in hv_apic_eoi_write() *after* the
-	 * EOI optimization is disabled in hv_cpu_die(), otherwise a CPU may
-	 * not be stopped in the case of CPU offlining and the VM will hang.
+	 * For Root partition we need to map the hypervisor VP ASSIST PAGE
+	 * instead of allocating a new page.
 	 */
-	if (!*hvp) {
-		*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
-	}
+	if (hv_root_partition) {
+		union hv_x64_msr_hypercall_contents hypercall_msr;
+
+		rdmsrl(HV_X64_MSR_VP_ASSIST_PAGE, hypercall_msr.as_uint64);
+		/* remapping to root partition address space */
+		if (!*hvp)
+			*hvp = memremap(hypercall_msr.guest_physical_address <<
+					HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT,
+					PAGE_SIZE, MEMREMAP_WB);
+		WARN_ON(!(*hvp));
+	} else {
+		/*
+		 * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's
+		 * Section 5.2.1 "GPA Overlay Pages"). Here it must be zeroed
+		 * out to make sure we always write the EOI MSR in
+		 * hv_apic_eoi_write() *after* the EOI optimization is disabled
+		 * in hv_cpu_die(), otherwise a CPU may not be stopped in the
+		 * case of CPU offlining and the VM will hang.
+		 */
+		if (!*hvp)
+			*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
 
-	if (*hvp) {
-		u64 val;
+		if (*hvp) {
+			u64 val;
 
-		val = vmalloc_to_pfn(*hvp);
-		val = (val << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) |
-			HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
+			val = vmalloc_to_pfn(*hvp);
+			val = (val << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) |
+				HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
 
-		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, val);
+			wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, val);
+		}
 	}
-
 	return 0;
 }
 
@@ -170,8 +185,12 @@ static int hv_cpu_die(unsigned int cpu)
 
 	hv_common_cpu_die(cpu);
 
-	if (hv_vp_assist_page && hv_vp_assist_page[cpu])
-		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
+	if (hv_vp_assist_page && hv_vp_assist_page[cpu]) {
+		if (hv_root_partition)
+			memunmap(hv_vp_assist_page[cpu]);
+		else
+			wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
+	}
 
 	if (hv_reenlightenment_cb == NULL)
 		return 0;
-- 
2.25.1



* RE: [PATCH v2] hyperv: root partition faults writing to VP ASSIST MSR PAGE
  2021-07-21 18:03 [PATCH v2] hyperv: root partition faults writing to VP ASSIST MSR PAGE Praveen Kumar
@ 2021-07-22  5:53 ` Michael Kelley
  2021-07-22 10:27 ` Wei Liu
  1 sibling, 0 replies; 5+ messages in thread
From: Michael Kelley @ 2021-07-22  5:53 UTC
  To: Praveen Kumar, linux-hyperv, linux-kernel
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, wei.liu,
	Dexuan Cui, tglx, mingo, bp, x86, hpa, viremana,
	Sunil Muthuswamy, nunodasneves

From: Praveen Kumar <kumarpraveen@linux.microsoft.com> Sent: Wednesday, July 21, 2021 11:03 AM
> 
> For Root partition the VP assist pages are pre-determined by the
> hypervisor. The Root kernel is not allowed to change them to
> different locations. And thus, we are getting below stack as in
> current implementation Root is trying to perform write to specific
> MSR.
> 
> [ 2.778197] unchecked MSR access error: WRMSR to 0x40000073 (tried to
> write 0x0000000145ac5001) at rIP: 0xffffffff810c1084
> (native_write_msr+0x4/0x30)
> [ 2.784867] Call Trace:
> [ 2.791507] hv_cpu_init+0xf1/0x1c0
> [ 2.798144] ? hyperv_report_panic+0xd0/0xd0
> [ 2.804806] cpuhp_invoke_callback+0x11a/0x440
> [ 2.811465] ? hv_resume+0x90/0x90
> [ 2.818137] cpuhp_issue_call+0x126/0x130
> [ 2.824782] __cpuhp_setup_state_cpuslocked+0x102/0x2b0
> [ 2.831427] ? hyperv_report_panic+0xd0/0xd0
> [ 2.838075] ? hyperv_report_panic+0xd0/0xd0
> [ 2.844723] ? hv_resume+0x90/0x90
> [ 2.851375] __cpuhp_setup_state+0x3d/0x90
> [ 2.858030] hyperv_init+0x14e/0x410
> [ 2.864689] ? enable_IR_x2apic+0x190/0x1a0
> [ 2.871349] apic_intr_mode_init+0x8b/0x100
> [ 2.878017] x86_late_time_init+0x20/0x30
> [ 2.884675] start_kernel+0x459/0x4fb
> [ 2.891329] secondary_startup_64_no_verify+0xb0/0xbb
> 
> Since, the hypervisor already provides the VP assist page for root
> partition, we need to memremaps the memory from hypervisor for root

s/memremaps/memremap/

> kernel to use. The mapping is done in hv_cpu_init during bringup and
> is unmaped in hv_cpu_die during teardown.
> 
> Signed-off-by: Praveen Kumar <kumarpraveen@linux.microsoft.com>
> ---
>  arch/x86/hyperv/hv_init.c | 53 ++++++++++++++++++++++++++-------------
>  1 file changed, 36 insertions(+), 17 deletions(-)
> 
> changelog:
> v1: initial patch
> v2: commit message changes, removal of HV_MSR_APIC_ACCESS_AVAILABLE
>     check and addition of null check before reading the VP assist MSR
>     for root partition
> 
> ---
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 6f247e7e07eb..ffd3d3b37235 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -55,26 +55,41 @@ static int hv_cpu_init(unsigned int cpu)
>  		return 0;
> 
>  	/*
> -	 * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section
> -	 * 5.2.1 "GPA Overlay Pages"). Here it must be zeroed out to make sure
> -	 * we always write the EOI MSR in hv_apic_eoi_write() *after* the
> -	 * EOI optimization is disabled in hv_cpu_die(), otherwise a CPU may
> -	 * not be stopped in the case of CPU offlining and the VM will hang.
> +	 * For Root partition we need to map the hypervisor VP ASSIST PAGE
> +	 * instead of allocating a new page.
>  	 */
> -	if (!*hvp) {
> -		*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
> -	}
> +	if (hv_root_partition) {
> +		union hv_x64_msr_hypercall_contents hypercall_msr;

This isn't the correct variable type to use here.  Union
hv_x64_msr_hypercall_contents is specifically for HV_X64_MSR_HYPERCALL.
Its layout also happens to match HV_X64_MSR_VP_ASSIST_PAGE, but the
layouts of the two MSRs could diverge in the future.  Instead of using this
union, I would suggest just reading into a u64 and then masking as needed.
The code in the non-root-partition branch of the 'if' statement similarly
open-codes the shifting/masking needed to construct the value to write.

Or you could define another union specifically for the VP Assist page MSR.
I'm OK with either approach.

Michael
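
Both options can be sketched in plain C. This is a userspace illustration
rather than the kernel change itself. The union name vp_assist_msr_contents,
its field names, and the local macros are hypothetical, and the layout
(enable in bit 0, 11 reserved bits, page frame number in the upper 52 bits)
is an assumption derived from the shift and enable constants used in the
patch. Bit-field ordering is implementation-defined in C; the layout below
is what GCC and Clang produce on little-endian x86-64.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values for illustration. */
#define VP_ASSIST_ADDR_SHIFT  12
#define VP_ASSIST_ADDR_MASK   (~((1ULL << VP_ASSIST_ADDR_SHIFT) - 1))

/* Option 1: read into a plain 64-bit value and mask off the low control bits. */
static uint64_t vp_assist_phys_from_u64(uint64_t msr)
{
	return msr & VP_ASSIST_ADDR_MASK;
}

/* Option 2: a union dedicated to this MSR (hypothetical name and fields). */
union vp_assist_msr_contents {
	uint64_t as_uint64;
	struct {
		uint64_t enable   : 1;
		uint64_t reserved : 11;
		uint64_t pfn      : 52;
	} bits;
};

static uint64_t vp_assist_phys_from_union(uint64_t msr)
{
	union vp_assist_msr_contents c = { .as_uint64 = msr };

	/* The bit-fields must overlay the raw value exactly. */
	assert(sizeof(c) == sizeof(uint64_t));
	/* Widen before shifting so the result is not limited to the field width. */
	return (uint64_t)c.bits.pfn << VP_ASSIST_ADDR_SHIFT;
}
```

For the value in the trace, 0x145ac5001, both helpers return the physical
address 0x145ac5000. The mask form matches the open-coded style of the
non-root branch; the union form keeps the layout in one named type.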

> +
> +		rdmsrl(HV_X64_MSR_VP_ASSIST_PAGE, hypercall_msr.as_uint64);
> +		/* remapping to root partition address space */
> +		if (!*hvp)
> +			*hvp = memremap(hypercall_msr.guest_physical_address <<
> +					HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT,
> +					PAGE_SIZE, MEMREMAP_WB);
> +		WARN_ON(!(*hvp));
> +	} else {
> +		/*
> +		 * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's
> +		 * Section 5.2.1 "GPA Overlay Pages"). Here it must be zeroed
> +		 * out to make sure we always write the EOI MSR in
> +		 * hv_apic_eoi_write() *after* theEOI optimization is disabled
> +		 * in hv_cpu_die(), otherwise a CPU may not be stopped in the
> +		 * case of CPU offlining and the VM will hang.
> +		 */
> +		if (!*hvp)
> +			*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
> 
> -	if (*hvp) {
> -		u64 val;
> +		if (*hvp) {
> +			u64 val;
> 
> -		val = vmalloc_to_pfn(*hvp);
> -		val = (val << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) |
> -			HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
> +			val = vmalloc_to_pfn(*hvp);
> +			val = (val << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) |
> +				HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
> 
> -		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, val);
> +			wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, val);
> +		}
>  	}
> -
>  	return 0;
>  }
> 
> @@ -170,8 +185,12 @@ static int hv_cpu_die(unsigned int cpu)
> 
>  	hv_common_cpu_die(cpu);
> 
> -	if (hv_vp_assist_page && hv_vp_assist_page[cpu])
> -		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
> +	if (hv_vp_assist_page && hv_vp_assist_page[cpu]) {
> +		if (hv_root_partition)
> +			memunmap(hv_vp_assist_page[cpu]);
> +		else
> +			wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
> +	}
> 
>  	if (hv_reenlightenment_cb == NULL)
>  		return 0;
> --
> 2.25.1



* Re: [PATCH v2] hyperv: root partition faults writing to VP ASSIST MSR PAGE
  2021-07-21 18:03 [PATCH v2] hyperv: root partition faults writing to VP ASSIST MSR PAGE Praveen Kumar
  2021-07-22  5:53 ` Michael Kelley
@ 2021-07-22 10:27 ` Wei Liu
  2021-07-22 16:15   ` Praveen Kumar
  1 sibling, 1 reply; 5+ messages in thread
From: Wei Liu @ 2021-07-22 10:27 UTC
  To: Praveen Kumar
  Cc: linux-hyperv, linux-kernel, kys, haiyangz, sthemmin, wei.liu,
	decui, tglx, mingo, bp, x86, hpa, viremana, sunilmut,
	nunodasneves

On Wed, Jul 21, 2021 at 11:33:02PM +0530, Praveen Kumar wrote:
> For Root partition the VP assist pages are pre-determined by the
> hypervisor. The Root kernel is not allowed to change them to
> different locations. And thus, we are getting below stack as in
> current implementation Root is trying to perform write to specific
> MSR.
> 
> [ 2.778197] unchecked MSR access error: WRMSR to 0x40000073 (tried to
> write 0x0000000145ac5001) at rIP: 0xffffffff810c1084
> (native_write_msr+0x4/0x30)
> [ 2.784867] Call Trace:
> [ 2.791507] hv_cpu_init+0xf1/0x1c0
> [ 2.798144] ? hyperv_report_panic+0xd0/0xd0
> [ 2.804806] cpuhp_invoke_callback+0x11a/0x440
> [ 2.811465] ? hv_resume+0x90/0x90
> [ 2.818137] cpuhp_issue_call+0x126/0x130
> [ 2.824782] __cpuhp_setup_state_cpuslocked+0x102/0x2b0
> [ 2.831427] ? hyperv_report_panic+0xd0/0xd0
> [ 2.838075] ? hyperv_report_panic+0xd0/0xd0
> [ 2.844723] ? hv_resume+0x90/0x90
> [ 2.851375] __cpuhp_setup_state+0x3d/0x90
> [ 2.858030] hyperv_init+0x14e/0x410
> [ 2.864689] ? enable_IR_x2apic+0x190/0x1a0
> [ 2.871349] apic_intr_mode_init+0x8b/0x100
> [ 2.878017] x86_late_time_init+0x20/0x30
> [ 2.884675] start_kernel+0x459/0x4fb
> [ 2.891329] secondary_startup_64_no_verify+0xb0/0xbb
> 
> Since, the hypervisor already provides the VP assist page for root
> partition, we need to memremaps the memory from hypervisor for root
> kernel to use. The mapping is done in hv_cpu_init during bringup and
> is unmaped in hv_cpu_die during teardown.
> 
> Signed-off-by: Praveen Kumar <kumarpraveen@linux.microsoft.com>
> ---
>  arch/x86/hyperv/hv_init.c | 53 ++++++++++++++++++++++++++-------------
>  1 file changed, 36 insertions(+), 17 deletions(-)
> 
> changelog:
> v1: initial patch
> v2: commit message changes, removal of HV_MSR_APIC_ACCESS_AVAILABLE
>     check and addition of null check before reading the VP assist MSR
>     for root partition
> 
> ---
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 6f247e7e07eb..ffd3d3b37235 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -55,26 +55,41 @@ static int hv_cpu_init(unsigned int cpu)
>  		return 0;
>  
>  	/*
> -	 * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section
> -	 * 5.2.1 "GPA Overlay Pages"). Here it must be zeroed out to make sure
> -	 * we always write the EOI MSR in hv_apic_eoi_write() *after* the
> -	 * EOI optimization is disabled in hv_cpu_die(), otherwise a CPU may
> -	 * not be stopped in the case of CPU offlining and the VM will hang.
> +	 * For Root partition we need to map the hypervisor VP ASSIST PAGE
> +	 * instead of allocating a new page.
>  	 */
> -	if (!*hvp) {
> -		*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
> -	}
> +	if (hv_root_partition) {
> +		union hv_x64_msr_hypercall_contents hypercall_msr;
> +
> +		rdmsrl(HV_X64_MSR_VP_ASSIST_PAGE, hypercall_msr.as_uint64);
> +		/* remapping to root partition address space */
> +		if (!*hvp)
> +			*hvp = memremap(hypercall_msr.guest_physical_address <<
> +					HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT,
> +					PAGE_SIZE, MEMREMAP_WB);
> +		WARN_ON(!(*hvp));
> +	} else {
> +		/*
> +		 * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's
> +		 * Section 5.2.1 "GPA Overlay Pages"). Here it must be zeroed
> +		 * out to make sure we always write the EOI MSR in
> +		 * hv_apic_eoi_write() *after* theEOI optimization is disabled
> +		 * in hv_cpu_die(), otherwise a CPU may not be stopped in the
> +		 * case of CPU offlining and the VM will hang.
> +		 */
> +		if (!*hvp)
> +			*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
>  
> -	if (*hvp) {
> -		u64 val;
> +		if (*hvp) {
> +			u64 val;
>  
> -		val = vmalloc_to_pfn(*hvp);
> -		val = (val << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) |
> -			HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
> +			val = vmalloc_to_pfn(*hvp);
> +			val = (val << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) |
> +				HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
>  
> -		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, val);
> +			wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, val);
> +		}
>  	}
> -
>  	return 0;
>  }
>  
> @@ -170,8 +185,12 @@ static int hv_cpu_die(unsigned int cpu)
>  
>  	hv_common_cpu_die(cpu);
>  
> -	if (hv_vp_assist_page && hv_vp_assist_page[cpu])
> -		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
> +	if (hv_vp_assist_page && hv_vp_assist_page[cpu]) {
> +		if (hv_root_partition)
> +			memunmap(hv_vp_assist_page[cpu]);

Thinking about this a bit more: the NULL check for *hvp in hv_cpu_init in
the original code is perhaps there because the code opts not to free the
page when disabling the VP assist page. When the CPU is brought back
online, it does not want to allocate another page, but to reuse the one
that's already allocated.

So, since you followed my suggestion to add a similar check, you need
to reset hv_vp_assist_page[cpu] to NULL here. Alternatively, the check
for *hvp can be dropped for the root path. Either way, the difference
between root and non-root should be documented.

Wei.
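
The stale-pointer problem and the first of the two fixes can be modelled
outside the kernel. In the sketch below, fake_memremap() and fake_memunmap()
are stand-ins built on calloc()/free(), and vp_assist_slot, root_cpu_init()
and root_cpu_die() are hypothetical names. Only the control flow is meant to
mirror the root path of the patch; it is not the code that was merged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NR_CPUS 4

/* Per-CPU slot, standing in for hv_vp_assist_page[cpu]. */
static void *vp_assist_slot[NR_CPUS];

/* Stand-ins for memremap()/memunmap(); they only allocate and free. */
static void *fake_memremap(void)
{
	return calloc(1, 4096);
}

static void fake_memunmap(void *p)
{
	free(p);
}

/* Bring-up: map only when no mapping is cached, as in the patch. */
static int root_cpu_init(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	if (!vp_assist_slot[cpu])
		vp_assist_slot[cpu] = fake_memremap();
	return vp_assist_slot[cpu] ? 0 : -1;
}

/* Teardown: unmap, then clear the slot so the next bring-up remaps. */
static void root_cpu_die(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	if (vp_assist_slot[cpu]) {
		fake_memunmap(vp_assist_slot[cpu]);
		vp_assist_slot[cpu] = NULL;
	}
}
```

Without the assignment to NULL in root_cpu_die(), a second root_cpu_init()
on the same CPU would skip the remap and leave the slot pointing at memory
that has already been unmapped.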

> +		else
> +			wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
> +	}
>  
>  	if (hv_reenlightenment_cb == NULL)
>  		return 0;
> -- 
> 2.25.1
> 


* Re: [PATCH v2] hyperv: root partition faults writing to VP ASSIST MSR PAGE
  2021-07-22 10:27 ` Wei Liu
@ 2021-07-22 16:15   ` Praveen Kumar
  2021-07-24 15:43     ` Wei Liu
  0 siblings, 1 reply; 5+ messages in thread
From: Praveen Kumar @ 2021-07-22 16:15 UTC
  To: Wei Liu
  Cc: linux-hyperv, linux-kernel, kys, haiyangz, sthemmin, decui, tglx,
	mingo, bp, x86, hpa, viremana, sunilmut, nunodasneves

On 22-07-2021 15:57, Wei Liu wrote:
> On Wed, Jul 21, 2021 at 11:33:02PM +0530, Praveen Kumar wrote:
>> For Root partition the VP assist pages are pre-determined by the
>> hypervisor. The Root kernel is not allowed to change them to
>> different locations. And thus, we are getting below stack as in
>> current implementation Root is trying to perform write to specific
>> MSR.
>>
>> [ 2.778197] unchecked MSR access error: WRMSR to 0x40000073 (tried to
>> write 0x0000000145ac5001) at rIP: 0xffffffff810c1084
>> (native_write_msr+0x4/0x30)
>> [ 2.784867] Call Trace:
>> [ 2.791507] hv_cpu_init+0xf1/0x1c0
>> [ 2.798144] ? hyperv_report_panic+0xd0/0xd0
>> [ 2.804806] cpuhp_invoke_callback+0x11a/0x440
>> [ 2.811465] ? hv_resume+0x90/0x90
>> [ 2.818137] cpuhp_issue_call+0x126/0x130
>> [ 2.824782] __cpuhp_setup_state_cpuslocked+0x102/0x2b0
>> [ 2.831427] ? hyperv_report_panic+0xd0/0xd0
>> [ 2.838075] ? hyperv_report_panic+0xd0/0xd0
>> [ 2.844723] ? hv_resume+0x90/0x90
>> [ 2.851375] __cpuhp_setup_state+0x3d/0x90
>> [ 2.858030] hyperv_init+0x14e/0x410
>> [ 2.864689] ? enable_IR_x2apic+0x190/0x1a0
>> [ 2.871349] apic_intr_mode_init+0x8b/0x100
>> [ 2.878017] x86_late_time_init+0x20/0x30
>> [ 2.884675] start_kernel+0x459/0x4fb
>> [ 2.891329] secondary_startup_64_no_verify+0xb0/0xbb
>>
>> Since, the hypervisor already provides the VP assist page for root
>> partition, we need to memremaps the memory from hypervisor for root
>> kernel to use. The mapping is done in hv_cpu_init during bringup and
>> is unmaped in hv_cpu_die during teardown.
>>
>> Signed-off-by: Praveen Kumar <kumarpraveen@linux.microsoft.com>
>> ---
>>  arch/x86/hyperv/hv_init.c | 53 ++++++++++++++++++++++++++-------------
>>  1 file changed, 36 insertions(+), 17 deletions(-)
>>
>> changelog:
>> v1: initial patch
>> v2: commit message changes, removal of HV_MSR_APIC_ACCESS_AVAILABLE
>>     check and addition of null check before reading the VP assist MSR
>>     for root partition
>>
>> ---
>> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
>> index 6f247e7e07eb..ffd3d3b37235 100644
>> --- a/arch/x86/hyperv/hv_init.c
>> +++ b/arch/x86/hyperv/hv_init.c
>> @@ -55,26 +55,41 @@ static int hv_cpu_init(unsigned int cpu)
>>  		return 0;
>>  
>>  	/*
>> -	 * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section
>> -	 * 5.2.1 "GPA Overlay Pages"). Here it must be zeroed out to make sure
>> -	 * we always write the EOI MSR in hv_apic_eoi_write() *after* the
>> -	 * EOI optimization is disabled in hv_cpu_die(), otherwise a CPU may
>> -	 * not be stopped in the case of CPU offlining and the VM will hang.
>> +	 * For Root partition we need to map the hypervisor VP ASSIST PAGE
>> +	 * instead of allocating a new page.
>>  	 */
>> -	if (!*hvp) {
>> -		*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
>> -	}
>> +	if (hv_root_partition) {
>> +		union hv_x64_msr_hypercall_contents hypercall_msr;
>> +
>> +		rdmsrl(HV_X64_MSR_VP_ASSIST_PAGE, hypercall_msr.as_uint64);
>> +		/* remapping to root partition address space */
>> +		if (!*hvp)
>> +			*hvp = memremap(hypercall_msr.guest_physical_address <<
>> +					HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT,
>> +					PAGE_SIZE, MEMREMAP_WB);
>> +		WARN_ON(!(*hvp));
>> +	} else {
>> +		/*
>> +		 * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's
>> +		 * Section 5.2.1 "GPA Overlay Pages"). Here it must be zeroed
>> +		 * out to make sure we always write the EOI MSR in
>> +		 * hv_apic_eoi_write() *after* theEOI optimization is disabled
>> +		 * in hv_cpu_die(), otherwise a CPU may not be stopped in the
>> +		 * case of CPU offlining and the VM will hang.
>> +		 */
>> +		if (!*hvp)
>> +			*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
>>  
>> -	if (*hvp) {
>> -		u64 val;
>> +		if (*hvp) {
>> +			u64 val;
>>  
>> -		val = vmalloc_to_pfn(*hvp);
>> -		val = (val << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) |
>> -			HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
>> +			val = vmalloc_to_pfn(*hvp);
>> +			val = (val << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) |
>> +				HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
>>  
>> -		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, val);
>> +			wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, val);
>> +		}
>>  	}
>> -
>>  	return 0;
>>  }
>>  
>> @@ -170,8 +185,12 @@ static int hv_cpu_die(unsigned int cpu)
>>  
>>  	hv_common_cpu_die(cpu);
>>  
>> -	if (hv_vp_assist_page && hv_vp_assist_page[cpu])
>> -		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
>> +	if (hv_vp_assist_page && hv_vp_assist_page[cpu]) {
>> +		if (hv_root_partition)
>> +			memunmap(hv_vp_assist_page[cpu]);
> 
> I think about this a bit more, the NULL check for *hvp in hv_cpu_init in
> the original code is perhaps due to the code has opted to not free the
> page when disabling the VP assist page. When the CPU is brought back
> online, it does not want to allocate another page, but to use the one
> that's already allocated.
> 
> So, since you listened to my suggestion to add a similar check, you need
> to reset hv_vp_assist_page to NULL here. Alternatively the check for
> *hvp can be dropped for the root path. Either way, the difference
> between root and non-root should be documented.
> 

I will set it to NULL after memunmap as you suggested, so that we don't
end up reusing the old/cached value. Before doing that: is there any
use case where the hypervisor can allocate or change the VP assist page
in a way that impacts root kernel execution?

> Wei.
> 
>> +		else
>> +			wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);
>> +	}
>>  
>>  	if (hv_reenlightenment_cb == NULL)
>>  		return 0;
>> -- 
>> 2.25.1
>>

Regards,

~Praveen.


* Re: [PATCH v2] hyperv: root partition faults writing to VP ASSIST MSR PAGE
  2021-07-22 16:15   ` Praveen Kumar
@ 2021-07-24 15:43     ` Wei Liu
  0 siblings, 0 replies; 5+ messages in thread
From: Wei Liu @ 2021-07-24 15:43 UTC
  To: Praveen Kumar
  Cc: Wei Liu, linux-hyperv, linux-kernel, kys, haiyangz, sthemmin,
	decui, tglx, mingo, bp, x86, hpa, viremana, sunilmut,
	nunodasneves

On Thu, Jul 22, 2021 at 09:45:36PM +0530, Praveen Kumar wrote:
> > 
> > I think about this a bit more, the NULL check for *hvp in hv_cpu_init in
> > the original code is perhaps due to the code has opted to not free the
> > page when disabling the VP assist page. When the CPU is brought back
> > online, it does not want to allocate another page, but to use the one
> > that's already allocated.
> > 
> > So, since you listened to my suggestion to add a similar check, you need
> > to reset hv_vp_assist_page to NULL here. Alternatively the check for
> > *hvp can be dropped for the root path. Either way, the difference
> > between root and non-root should be documented.
> > 
> 
> I would make it as NULL post memunmap as you suggested, so that we
> don't end up reusing the old/cached value. Before doing that, is there
> any use-case where hypervisor can allocate or change the VP assist
> page that can impact root kernel execution ?

I don't think a sane hypervisor will change the page in such a way.

Wei.
