linux-hyperv.vger.kernel.org archive mirror
From: Michael Kelley <mikelley@microsoft.com>
To: Praveen Kumar <kumarpraveen@linux.microsoft.com>,
	"linux-hyperv@vger.kernel.org" <linux-hyperv@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Cc: KY Srinivasan <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	"wei.liu@kernel.org" <wei.liu@kernel.org>,
	Dexuan Cui <decui@microsoft.com>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>, "x86@kernel.org" <x86@kernel.org>,
	"hpa@zytor.com" <hpa@zytor.com>,
	"viremana@linux.microsoft.com" <viremana@linux.microsoft.com>,
	Sunil Muthuswamy <sunilmut@microsoft.com>,
	"nunodasneves@linux.microsoft.com"
	<nunodasneves@linux.microsoft.com>
Subject: RE: [PATCH v3] hyperv: root partition faults writing to VP ASSIST MSR PAGE
Date: Tue, 27 Jul 2021 16:35:22 +0000
Message-ID: <MWHPR21MB1593C1A51C6E812B0DA82ED5D7E99@MWHPR21MB1593.namprd21.prod.outlook.com>
In-Reply-To: <20210727104044.28078-1-kumarpraveen@linux.microsoft.com>

From: Praveen Kumar <kumarpraveen@linux.microsoft.com> Sent: Tuesday, July 27, 2021 3:41 AM
> 
> For Root partition the VP assist pages are pre-determined by the
> hypervisor. The Root kernel is not allowed to change them to
> different locations. And thus, we are getting below stack as in
> current implementation Root is trying to perform write to specific
> MSR.
> 
> [ 2.778197] unchecked MSR access error: WRMSR to 0x40000073 (tried to
> write 0x0000000145ac5001) at rIP: 0xffffffff810c1084
> (native_write_msr+0x4/0x30)
> [ 2.784867] Call Trace:
> [ 2.791507] hv_cpu_init+0xf1/0x1c0
> [ 2.798144] ? hyperv_report_panic+0xd0/0xd0
> [ 2.804806] cpuhp_invoke_callback+0x11a/0x440
> [ 2.811465] ? hv_resume+0x90/0x90
> [ 2.818137] cpuhp_issue_call+0x126/0x130
> [ 2.824782] __cpuhp_setup_state_cpuslocked+0x102/0x2b0
> [ 2.831427] ? hyperv_report_panic+0xd0/0xd0
> [ 2.838075] ? hyperv_report_panic+0xd0/0xd0
> [ 2.844723] ? hv_resume+0x90/0x90
> [ 2.851375] __cpuhp_setup_state+0x3d/0x90
> [ 2.858030] hyperv_init+0x14e/0x410
> [ 2.864689] ? enable_IR_x2apic+0x190/0x1a0
> [ 2.871349] apic_intr_mode_init+0x8b/0x100
> [ 2.878017] x86_late_time_init+0x20/0x30
> [ 2.884675] start_kernel+0x459/0x4fb
> [ 2.891329] secondary_startup_64_no_verify+0xb0/0xbb
> 
> Since, the hypervisor already provides the VP assist page for root
> partition, we need to memremap the memory from hypervisor for root
> kernel to use. The mapping is done in hv_cpu_init during bringup and
> is unmapped in hv_cpu_die during teardown.
> 
> Signed-off-by: Praveen Kumar <kumarpraveen@linux.microsoft.com>
> ---
>  arch/x86/hyperv/hv_init.c          | 61 +++++++++++++++++++++---------
>  arch/x86/include/asm/hyperv-tlfs.h |  9 +++++
>  2 files changed, 53 insertions(+), 17 deletions(-)
> 
> changelog:
> v1: initial patch
> v2: commit message changes, removal of HV_MSR_APIC_ACCESS_AVAILABLE
>     check and addition of null check before reading the VP assist MSR
>     for root partition
> v3: added new data structure to handle VP ASSIST MSR page and done
>     handling in hv_cpu_init and hv_cpu_die
> 
> ---
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 6f247e7e07eb..b859e42b4943 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -44,6 +44,7 @@ EXPORT_SYMBOL_GPL(hv_vp_assist_page);
> 
>  static int hv_cpu_init(unsigned int cpu)
>  {
> +	union hv_vp_assist_msr_contents msr;
>  	struct hv_vp_assist_page **hvp = &hv_vp_assist_page[smp_processor_id()];
>  	int ret;
> 
> @@ -54,27 +55,41 @@ static int hv_cpu_init(unsigned int cpu)
>  	if (!hv_vp_assist_page)
>  		return 0;
> 
> -	/*
> -	 * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's Section
> -	 * 5.2.1 "GPA Overlay Pages"). Here it must be zeroed out to make sure
> -	 * we always write the EOI MSR in hv_apic_eoi_write() *after* the
> -	 * EOI optimization is disabled in hv_cpu_die(), otherwise a CPU may
> -	 * not be stopped in the case of CPU offlining and the VM will hang.
> -	 */
> -	if (!*hvp) {
> -		*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
> +	if (hv_root_partition) {
> +		/*
> +		 * For Root partition we get the hypervisor provided VP ASSIST
> +		 * PAGE, instead of allocating a new page.
> +		 */
> +		rdmsrl(HV_X64_MSR_VP_ASSIST_PAGE, msr.as_uint64);
> +
> +		/* remapping to root partition address space */
> +		if (!*hvp)
> +			*hvp = memremap(msr.guest_physical_address <<
> +					HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT,
> +					PAGE_SIZE, MEMREMAP_WB);
> +	} else {
> +		/*
> +		 * The VP ASSIST PAGE is an "overlay" page (see Hyper-V TLFS's
> +		 * Section 5.2.1 "GPA Overlay Pages"). Here it must be zeroed
> +		 * out to make sure we always write the EOI MSR in
> +		 * hv_apic_eoi_write() *after* the EOI optimization is disabled
> +		 * in hv_cpu_die(), otherwise a CPU may not be stopped in the
> +		 * case of CPU offlining and the VM will hang.
> +		 */
> +		if (!*hvp)
> +			*hvp = __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO);
> +
>  	}

The tests here could be reversed to eliminate some duplication.  For example:

	if (!*hvp) {
		if (hv_root_partition) {
			rdmsrl(....);
			*hvp = memremap( .....);
		} else {
			*hvp = __vmalloc(....);
		}
	}
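
To show the shape of that restructuring in something that compiles,
here is a minimal user-space sketch. It is not kernel code: the helpers
map_hypervisor_page() and alloc_zeroed_page() are hypothetical stand-ins
for the memremap() and __vmalloc() calls in the patch, and
get_assist_page() is an illustrative name only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096

/* Hypothetical stand-in for the page the hypervisor provides to root. */
static char hypervisor_page[MODEL_PAGE_SIZE];

/* Models memremap() of the hypervisor-provided page (assumption). */
static void *map_hypervisor_page(void)
{
	return hypervisor_page;
}

/* Models __vmalloc(PAGE_SIZE, GFP_KERNEL | __GFP_ZERO) (assumption). */
static void *alloc_zeroed_page(void)
{
	return calloc(1, MODEL_PAGE_SIZE);
}

/*
 * One outer "is the slot empty?" test, with the root/guest decision
 * nested inside it, so the NULL check is written only once.
 */
static void *get_assist_page(void **slot, bool root_partition)
{
	assert(slot != NULL);

	if (!*slot) {
		if (root_partition)
			*slot = map_hypervisor_page();
		else
			*slot = alloc_zeroed_page();
	}
	return *slot;
}
```

A slot that is already populated is returned unchanged on either path,
which matches the behavior of the two separate NULL checks in the patch.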


> 
> -	if (*hvp) {
> -		u64 val;
> +	WARN_ON(!(*hvp));
> 
> -		val = vmalloc_to_pfn(*hvp);
> -		val = (val << HV_X64_MSR_VP_ASSIST_PAGE_ADDRESS_SHIFT) |
> -			HV_X64_MSR_VP_ASSIST_PAGE_ENABLE;
> +	if (*hvp) {
> +		if (!hv_root_partition)
> +			msr.guest_physical_address = vmalloc_to_pfn(*hvp);
> 
> -		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, val);
> +		msr.enable = 1;
> +		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, msr.as_uint64);

This version has a substantive difference compared with previous versions
in that the "enable" bit is being set and written back to the MSR even when
running in the root partition.  Is that intentional?
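
To make the question concrete, here is a small user-space model of what
the root-partition path in this version ends up writing. The per-CPU
array model_msr[] and the function root_cpu_init_model() are
hypothetical; the bit position of "enable" (bit 0) is taken from the
union added in this patch. Whether the hypervisor expects or tolerates
this write from root is exactly what I'm asking, so the sketch only
shows the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

#define VP_ASSIST_ENABLE UINT64_C(1)	/* bit 0 in the patch's union */
#define MODEL_NR_CPUS    4		/* arbitrary size for the model */

/* Hypothetical stand-in for the per-CPU VP assist MSR. */
static uint64_t model_msr[MODEL_NR_CPUS];

/*
 * Models the root path of hv_cpu_init() in v3: read the MSR, set
 * "enable", write the result back. The page number bits are untouched.
 */
static uint64_t root_cpu_init_model(unsigned int cpu)
{
	uint64_t val;

	assert(cpu < MODEL_NR_CPUS);

	val = model_msr[cpu];		/* rdmsrl() in the patch */
	val |= VP_ASSIST_ENABLE;	/* msr.enable = 1 */
	model_msr[cpu] = val;		/* wrmsrl() in the patch */
	return val;
}
```

So if the hypervisor hands root an MSR value with the page number
already populated, v3 writes that same page number back with bit 0 set,
where earlier versions did no write at all for root.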

>  	}
> -
>  	return 0;
>  }
> 
> @@ -170,9 +185,21 @@ static int hv_cpu_die(unsigned int cpu)
> 
>  	hv_common_cpu_die(cpu);
> 
> -	if (hv_vp_assist_page && hv_vp_assist_page[cpu])
> +	if (hv_vp_assist_page && hv_vp_assist_page[cpu]) {
>  		wrmsrl(HV_X64_MSR_VP_ASSIST_PAGE, 0);

This will set the guest_physical_address in the MSR to zero,
even in the root partition case.  Is that OK?  It seems inconsistent
with hv_cpu_init() where the existing guest_physical_address
in the MSR is carefully preserved for the root partition case.
Or is the intent here simply to clear the "enable" flag?
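
The two possible intents differ only in what gets written. A minimal
user-space sketch of the difference follows; the enum and function names
are hypothetical, and I'm not claiming either policy is what the
hypervisor requires for the root partition.

```c
#include <assert.h>
#include <stdint.h>

#define VP_ASSIST_ENABLE UINT64_C(1)	/* bit 0 in the patch's union */

/* Hypothetical names for the two teardown behaviors being compared. */
enum teardown_policy {
	TEARDOWN_WRITE_ZERO,	/* what wrmsrl(..., 0) does today */
	TEARDOWN_CLEAR_ENABLE,	/* keep the page number, drop "enable" */
};

/* Returns the value that would be written to the MSR on CPU teardown. */
static uint64_t teardown_value(uint64_t current, enum teardown_policy policy)
{
	switch (policy) {
	case TEARDOWN_WRITE_ZERO:
		return 0;
	case TEARDOWN_CLEAR_ENABLE:
		return current & ~VP_ASSIST_ENABLE;
	}
	assert(!"unknown teardown policy");
	return current;
}
```

With TEARDOWN_WRITE_ZERO the page number is lost along with the enable
bit; with TEARDOWN_CLEAR_ENABLE a later rdmsrl() in hv_cpu_init() would
still see the original page number.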

> 
> +		if (hv_root_partition) {
> +			/*
> +			 * For Root partition the VP ASSIST page is mapped to
> +			 * hypervisor provided page, and thus, we unmap the
> +			 * page here and nullify it, so that in future we have
> +			 * correct page address mapped in hv_cpu_init
> +			 */
> +			memunmap(hv_vp_assist_page[cpu]);
> +			hv_vp_assist_page[cpu] = NULL;
> +		}
> +	}
> +
>  	if (hv_reenlightenment_cb == NULL)
>  		return 0;
> 
> diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
> index f1366ce609e3..2e4e87046aa7 100644
> --- a/arch/x86/include/asm/hyperv-tlfs.h
> +++ b/arch/x86/include/asm/hyperv-tlfs.h
> @@ -288,6 +288,15 @@ union hv_x64_msr_hypercall_contents {
>  	} __packed;
>  };
> 
> +union hv_vp_assist_msr_contents {
> +	u64 as_uint64;
> +	struct {
> +		u64 enable:1;
> +		u64 reserved:11;
> +		u64 guest_physical_address:52;

This field really should be named "guest_physical_page", as
it is a page number, not an address.  You've matched the
field names used in hv_x64_msr_hypercall_contents, which
is good for consistency, except that the field name is
wrong in hv_x64_msr_hypercall_contents. :-(   I think
the Hyper-V TLFS originally called it a "physical address", but
the TLFS has since been fixed to describe it as a page number.
I'd suggest getting this one named correctly; fixing the field
name in hv_x64_msr_hypercall_contents is a separate cleanup
that doesn't need to be done now.
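
As an illustration of the page number vs. address distinction, here is a
user-space sketch that uses masks and shifts rather than bitfields. The
shift of 12 follows from the 1 + 11 low bits in the union above; the
function names are hypothetical, and the sample value is the one from
the WRMSR error in the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define VP_ASSIST_ENABLE     UINT64_C(1)	/* bit 0 */
#define VP_ASSIST_PAGE_SHIFT 12			/* 1 enable + 11 reserved bits */

/* Builds an MSR value from a page number (not an address). */
static uint64_t vp_assist_msr_encode(uint64_t page_number, int enable)
{
	/* The page number must fit in the 52-bit field. */
	assert(page_number < (UINT64_C(1) << 52));

	return (page_number << VP_ASSIST_PAGE_SHIFT) |
	       (enable ? VP_ASSIST_ENABLE : 0);
}

/* What the 52-bit field actually holds: a guest physical page number. */
static uint64_t vp_assist_msr_page_number(uint64_t msr)
{
	return msr >> VP_ASSIST_PAGE_SHIFT;
}

/* The guest physical address is the page number shifted back up. */
static uint64_t vp_assist_msr_page_address(uint64_t msr)
{
	return vp_assist_msr_page_number(msr) << VP_ASSIST_PAGE_SHIFT;
}
```

For the value 0x0000000145ac5001, the field holds page number 0x145ac5
and the corresponding address is 0x145ac5000, which is why the memremap()
call in the patch has to shift the field before using it.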

> +	} __packed;
> +};
> +
>  struct hv_reenlightenment_control {
>  	__u64 vector:8;
>  	__u64 reserved1:8;
> --
> 2.25.1

