linux-hyperv.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] x86/hyperv: make vapic support x2apic mode
@ 2019-10-02 10:19 Roman Kagan
  2019-10-03 10:54 ` Vitaly Kuznetsov
  0 siblings, 1 reply; 7+ messages in thread
From: Roman Kagan @ 2019-10-02 10:19 UTC (permalink / raw)
  To: Michael Kelley, Lan Tianyu, Joerg Roedel, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Sasha Levin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, linux-hyperv,
	linux-kernel
  Cc: kvm

Now that there's Hyper-V IOMMU driver, Linux can switch to x2apic mode
when supported by the vcpus.

However, the apic access functions for Hyper-V enlightened apic assume
xapic mode only.

As a result, Linux fails to bring up secondary cpus when run as a guest
in QEMU/KVM with both hv_apic and x2apic enabled.

I didn't manage to make my instance of Hyper-V expose x2apic to the
guest; nor does Hyper-V spec document the expected behavior.  However,
a Windows guest running in QEMU/KVM with hv_apic and x2apic and a big
number of vcpus (so that it turns on x2apic mode) does use enlightened
apic MSRs passing unshifted 32bit destination id and falls back to the
regular x2apic MSRs for less frequently used apic fields.

So implement the same behavior, by replacing enlightened apic access
functions (only those where it makes a difference) with their
x2apic-aware versions when x2apic is in use.

Fixes: 29217a474683 ("iommu/hyper-v: Add Hyper-V stub IOMMU driver")
Fixes: 6b48cb5f8347 ("X86/Hyper-V: Enlighten APIC access")
Cc: stable@vger.kernel.org
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
---
v1 -> v2:
- add ifdefs to handle !CONFIG_X86_X2APIC

 arch/x86/hyperv/hv_apic.c | 54 ++++++++++++++++++++++++++++++++++++---
 1 file changed, 51 insertions(+), 3 deletions(-)

diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index 5c056b8aebef..eb1434ae9e46 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -84,6 +84,44 @@ static void hv_apic_write(u32 reg, u32 val)
 	}
 }
 
+#ifdef CONFIG_X86_X2APIC
+static void hv_x2apic_icr_write(u32 low, u32 id)
+{
+	wrmsr(HV_X64_MSR_ICR, low, id);
+}
+
+static u32 hv_x2apic_read(u32 reg)
+{
+	u32 reg_val, hi;
+
+	switch (reg) {
+	case APIC_EOI:
+		rdmsr(HV_X64_MSR_EOI, reg_val, hi);
+		return reg_val;
+	case APIC_TASKPRI:
+		rdmsr(HV_X64_MSR_TPR, reg_val, hi);
+		return reg_val;
+
+	default:
+		return native_apic_msr_read(reg);
+	}
+}
+
+static void hv_x2apic_write(u32 reg, u32 val)
+{
+	switch (reg) {
+	case APIC_EOI:
+		wrmsr(HV_X64_MSR_EOI, val, 0);
+		break;
+	case APIC_TASKPRI:
+		wrmsr(HV_X64_MSR_TPR, val, 0);
+		break;
+	default:
+		native_apic_msr_write(reg, val);
+	}
+}
+#endif /* CONFIG_X86_X2APIC */
+
 static void hv_apic_eoi_write(u32 reg, u32 val)
 {
 	struct hv_vp_assist_page *hvp = hv_vp_assist_page[smp_processor_id()];
@@ -262,9 +300,19 @@ void __init hv_apic_init(void)
 	if (ms_hyperv.hints & HV_X64_APIC_ACCESS_RECOMMENDED) {
 		pr_info("Hyper-V: Using MSR based APIC access\n");
 		apic_set_eoi_write(hv_apic_eoi_write);
-		apic->read      = hv_apic_read;
-		apic->write     = hv_apic_write;
-		apic->icr_write = hv_apic_icr_write;
+#ifdef CONFIG_X86_X2APIC
+		if (x2apic_enabled()) {
+			apic->read      = hv_x2apic_read;
+			apic->write     = hv_x2apic_write;
+			apic->icr_write = hv_x2apic_icr_write;
+		} else {
+#endif
+			apic->read      = hv_apic_read;
+			apic->write     = hv_apic_write;
+			apic->icr_write = hv_apic_icr_write;
+#ifdef CONFIG_X86_X2APIC
+		}
+#endif
 		apic->icr_read  = hv_apic_icr_read;
 	}
 }
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] x86/hyperv: make vapic support x2apic mode
  2019-10-02 10:19 [PATCH v2] x86/hyperv: make vapic support x2apic mode Roman Kagan
@ 2019-10-03 10:54 ` Vitaly Kuznetsov
  2019-10-03 12:52   ` Roman Kagan
  0 siblings, 1 reply; 7+ messages in thread
From: Vitaly Kuznetsov @ 2019-10-03 10:54 UTC (permalink / raw)
  To: Roman Kagan
  Cc: kvm, Michael Kelley, Lan Tianyu, Joerg Roedel, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Sasha Levin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, linux-hyperv,
	linux-kernel

Roman Kagan <rkagan@virtuozzo.com> writes:

> Now that there's Hyper-V IOMMU driver, Linux can switch to x2apic mode
> when supported by the vcpus.
>
> However, the apic access functions for Hyper-V enlightened apic assume
> xapic mode only.
>
> As a result, Linux fails to bring up secondary cpus when run as a guest
> in QEMU/KVM with both hv_apic and x2apic enabled.
>
> I didn't manage to make my instance of Hyper-V expose x2apic to the
> guest; nor does Hyper-V spec document the expected behavior.  However,
> a Windows guest running in QEMU/KVM with hv_apic and x2apic and a big
> number of vcpus (so that it turns on x2apic mode) does use enlightened
> apic MSRs passing unshifted 32bit destination id and falls back to the
> regular x2apic MSRs for less frequently used apic fields.
>
> So implement the same behavior, by replacing enlightened apic access
> functions (only those where it makes a difference) with their
> x2apic-aware versions when x2apic is in use.
>
> Fixes: 29217a474683 ("iommu/hyper-v: Add Hyper-V stub IOMMU driver")
> Fixes: 6b48cb5f8347 ("X86/Hyper-V: Enlighten APIC access")
> Cc: stable@vger.kernel.org
> Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> ---
> v1 -> v2:
> - add ifdefs to handle !CONFIG_X86_X2APIC
>
>  arch/x86/hyperv/hv_apic.c | 54 ++++++++++++++++++++++++++++++++++++---
>  1 file changed, 51 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
> index 5c056b8aebef..eb1434ae9e46 100644
> --- a/arch/x86/hyperv/hv_apic.c
> +++ b/arch/x86/hyperv/hv_apic.c
> @@ -84,6 +84,44 @@ static void hv_apic_write(u32 reg, u32 val)
>  	}
>  }
>  
> +#ifdef CONFIG_X86_X2APIC
> +static void hv_x2apic_icr_write(u32 low, u32 id)
> +{
> +	wrmsr(HV_X64_MSR_ICR, low, id);
> +}

AFAIU you're trying to mirror native_x2apic_icr_write() here but this is
different from what hv_apic_icr_write() does
(SET_APIC_DEST_FIELD(id)). Is it actually correct? (I think you've
tested this and it is but) Michael, could you please shed some light
here?

> +
> +static u32 hv_x2apic_read(u32 reg)
> +{
> +	u32 reg_val, hi;
> +
> +	switch (reg) {
> +	case APIC_EOI:
> +		rdmsr(HV_X64_MSR_EOI, reg_val, hi);
> +		return reg_val;
> +	case APIC_TASKPRI:
> +		rdmsr(HV_X64_MSR_TPR, reg_val, hi);
> +		return reg_val;
> +
> +	default:
> +		return native_apic_msr_read(reg);
> +	}
> +}
> +
> +static void hv_x2apic_write(u32 reg, u32 val)
> +{
> +	switch (reg) {
> +	case APIC_EOI:
> +		wrmsr(HV_X64_MSR_EOI, val, 0);
> +		break;
> +	case APIC_TASKPRI:
> +		wrmsr(HV_X64_MSR_TPR, val, 0);
> +		break;
> +	default:
> +		native_apic_msr_write(reg, val);
> +	}
> +}
> +#endif /* CONFIG_X86_X2APIC */
> +
>  static void hv_apic_eoi_write(u32 reg, u32 val)
>  {
>  	struct hv_vp_assist_page *hvp = hv_vp_assist_page[smp_processor_id()];
> @@ -262,9 +300,19 @@ void __init hv_apic_init(void)
>  	if (ms_hyperv.hints & HV_X64_APIC_ACCESS_RECOMMENDED) {
>  		pr_info("Hyper-V: Using MSR based APIC access\n");
>  		apic_set_eoi_write(hv_apic_eoi_write);
> -		apic->read      = hv_apic_read;
> -		apic->write     = hv_apic_write;
> -		apic->icr_write = hv_apic_icr_write;
> +#ifdef CONFIG_X86_X2APIC
> +		if (x2apic_enabled()) {
> +			apic->read      = hv_x2apic_read;
> +			apic->write     = hv_x2apic_write;
> +			apic->icr_write = hv_x2apic_icr_write;
> +		} else {
> +#endif
> +			apic->read      = hv_apic_read;
> +			apic->write     = hv_apic_write;
> +			apic->icr_write = hv_apic_icr_write;

(just wondering): Is it always safe to assume that we cannot switch
between apic_flat/x2apic in runtime? Moreover, the only difference
between hv_apic_read/hv_apic_write and hv_x2apic_read/hv_x2apic_write is
native_apic_mem_{read,write} -> native_apic_msr_{read,write}. Would it
make sense to move if (x2apic_enabled()) and merge these functions?

> +#ifdef CONFIG_X86_X2APIC
> +		}
> +#endif
>  		apic->icr_read  = hv_apic_icr_read;
>  	}
>  }

-- 
Vitaly

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] x86/hyperv: make vapic support x2apic mode
  2019-10-03 10:54 ` Vitaly Kuznetsov
@ 2019-10-03 12:52   ` Roman Kagan
  2019-10-04  3:01     ` Michael Kelley
  0 siblings, 1 reply; 7+ messages in thread
From: Roman Kagan @ 2019-10-03 12:52 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: kvm, Michael Kelley, Lan Tianyu, Joerg Roedel, K. Y. Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Sasha Levin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, linux-hyperv,
	linux-kernel

On Thu, Oct 03, 2019 at 12:54:03PM +0200, Vitaly Kuznetsov wrote:
> Roman Kagan <rkagan@virtuozzo.com> writes:
> 
> > Now that there's Hyper-V IOMMU driver, Linux can switch to x2apic mode
> > when supported by the vcpus.
> >
> > However, the apic access functions for Hyper-V enlightened apic assume
> > xapic mode only.
> >
> > As a result, Linux fails to bring up secondary cpus when run as a guest
> > in QEMU/KVM with both hv_apic and x2apic enabled.
> >
> > I didn't manage to make my instance of Hyper-V expose x2apic to the
> > guest; nor does Hyper-V spec document the expected behavior.  However,
> > a Windows guest running in QEMU/KVM with hv_apic and x2apic and a big
> > number of vcpus (so that it turns on x2apic mode) does use enlightened
> > apic MSRs passing unshifted 32bit destination id and falls back to the
> > regular x2apic MSRs for less frequently used apic fields.
> >
> > So implement the same behavior, by replacing enlightened apic access
> > functions (only those where it makes a difference) with their
> > x2apic-aware versions when x2apic is in use.
> >
> > Fixes: 29217a474683 ("iommu/hyper-v: Add Hyper-V stub IOMMU driver")
> > Fixes: 6b48cb5f8347 ("X86/Hyper-V: Enlighten APIC access")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
> > ---
> > v1 -> v2:
> > - add ifdefs to handle !CONFIG_X86_X2APIC
> >
> >  arch/x86/hyperv/hv_apic.c | 54 ++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 51 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
> > index 5c056b8aebef..eb1434ae9e46 100644
> > --- a/arch/x86/hyperv/hv_apic.c
> > +++ b/arch/x86/hyperv/hv_apic.c
> > @@ -84,6 +84,44 @@ static void hv_apic_write(u32 reg, u32 val)
> >  	}
> >  }
> >  
> > +#ifdef CONFIG_X86_X2APIC
> > +static void hv_x2apic_icr_write(u32 low, u32 id)
> > +{
> > +	wrmsr(HV_X64_MSR_ICR, low, id);
> > +}
> 
> AFAIU you're trying to mirror native_x2apic_icr_write() here but this is
> different from what hv_apic_icr_write() does
> (SET_APIC_DEST_FIELD(id)).

Right.  In xapic mode the ICR2 aka the high 4 bytes of ICR is programmed
with the destination id in the highest byte; in x2apic mode the whole
ICR2 is set to the 32bit destination id.

> Is it actually correct? (I think you've tested this and it is but)

As I wrote in the commit log, I haven't tested it in the sense that I
ran a Linux guest in a Hyper-V VM exposing x2apic to the guest, because
I didn't manage to configure it to do so.  OTOH I did run a Windows
guest in QEMU/KVM with hv_apic and x2apic enabled and saw it write
destination ids unshifted to the ICR2 part of ICR, so I assume it's
correct.

> Michael, could you please shed some light here?

Would be appreciated, indeed.

> > +static u32 hv_x2apic_read(u32 reg)
> > +{
> > +	u32 reg_val, hi;
> > +
> > +	switch (reg) {
> > +	case APIC_EOI:
> > +		rdmsr(HV_X64_MSR_EOI, reg_val, hi);
> > +		return reg_val;
> > +	case APIC_TASKPRI:
> > +		rdmsr(HV_X64_MSR_TPR, reg_val, hi);
> > +		return reg_val;
> > +
> > +	default:
> > +		return native_apic_msr_read(reg);
> > +	}
> > +}
> > +
> > +static void hv_x2apic_write(u32 reg, u32 val)
> > +{
> > +	switch (reg) {
> > +	case APIC_EOI:
> > +		wrmsr(HV_X64_MSR_EOI, val, 0);
> > +		break;
> > +	case APIC_TASKPRI:
> > +		wrmsr(HV_X64_MSR_TPR, val, 0);
> > +		break;
> > +	default:
> > +		native_apic_msr_write(reg, val);
> > +	}
> > +}
> > +#endif /* CONFIG_X86_X2APIC */
> > +
> >  static void hv_apic_eoi_write(u32 reg, u32 val)
> >  {
> >  	struct hv_vp_assist_page *hvp = hv_vp_assist_page[smp_processor_id()];
> > @@ -262,9 +300,19 @@ void __init hv_apic_init(void)
> >  	if (ms_hyperv.hints & HV_X64_APIC_ACCESS_RECOMMENDED) {
> >  		pr_info("Hyper-V: Using MSR based APIC access\n");
> >  		apic_set_eoi_write(hv_apic_eoi_write);
> > -		apic->read      = hv_apic_read;
> > -		apic->write     = hv_apic_write;
> > -		apic->icr_write = hv_apic_icr_write;
> > +#ifdef CONFIG_X86_X2APIC
> > +		if (x2apic_enabled()) {
> > +			apic->read      = hv_x2apic_read;
> > +			apic->write     = hv_x2apic_write;
> > +			apic->icr_write = hv_x2apic_icr_write;
> > +		} else {
> > +#endif
> > +			apic->read      = hv_apic_read;
> > +			apic->write     = hv_apic_write;
> > +			apic->icr_write = hv_apic_icr_write;
> 
> (just wondering): Is it always safe to assume that we cannot switch
> between apic_flat/x2apic in runtime?

I guess so.  All apic choices are made early at __init, obviously before
the secondary CPUs are brought up, and aren't reconsidered afterwards.

> Moreover, the only difference
> between hv_apic_read/hv_apic_write and hv_x2apic_read/hv_x2apic_write is
> native_apic_mem_{read,write} -> native_apic_msr_{read,write}. Would it
> make sense to move if (x2apic_enabled()) and merge these functions?

x2apic_enabled() is too heavy for that: it reads MSR_IA32_APICBASE.  One
could probably use a read-mostly global variable instead but I'm not
sure it would make the code better.

Thanks,
Roman.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH v2] x86/hyperv: make vapic support x2apic mode
  2019-10-03 12:52   ` Roman Kagan
@ 2019-10-04  3:01     ` Michael Kelley
  2019-10-04  9:18       ` Roman Kagan
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Kelley @ 2019-10-04  3:01 UTC (permalink / raw)
  To: Roman Kagan, vkuznets
  Cc: kvm, Tianyu Lan, Joerg Roedel, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Sasha Levin, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, H. Peter Anvin, x86, linux-hyperv, linux-kernel

From: Roman Kagan <rkagan@virtuozzo.com> Sent: Thursday, October 3, 2019 5:53 AM
> >
> > AFAIU you're trying to mirror native_x2apic_icr_write() here but this is
> > different from what hv_apic_icr_write() does
> > (SET_APIC_DEST_FIELD(id)).
> 
> Right.  In xapic mode the ICR2 aka the high 4 bytes of ICR is programmed
> with the destination id in the highest byte; in x2apic mode the whole
> ICR2 is set to the 32bit destination id.
> 
> > Is it actually correct? (I think you've tested this and it is but)
> 
> As I wrote in the commit log, I haven't tested it in the sense that I
> ran a Linux guest in a Hyper-V VM exposing x2apic to the guest, because
> I didn't manage to configure it to do so.  OTOH I did run a Windows
> guest in QEMU/KVM with hv_apic and x2apic enabled and saw it write
> destination ids unshifted to the ICR2 part of ICR, so I assume it's
> correct.
> 
> > Michael, could you please shed some light here?
> 
> Would be appreciated, indeed.
> 

The newest version of Hyper-V provides an x2apic in a guest VM when the
number of vCPUs in the VM is > 240.  This version of Hyper-V is beginning
to be deployed in Azure to enable the M416v2 VM size, but the functionality
is not yet available for the on-premises version of Hyper-V.  However, I can
test this configuration internally with the above patch -- give me a few days.

An additional complication is that when running on Intel processors that offer
vAPIC functionality, the Hyper-V "hints" value does *not* recommend using the
MSR-based APIC accesses.  In this case, memory-mapped access to the x2apic
registers is faster than the synthetic MSRs.  I've already looked at a VM that has
the x2apic, and indeed that is the case, so the above code wouldn't run
anyway.  But I can temporarily code around that for testing purposes and see
if everything works.

Michael

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] x86/hyperv: make vapic support x2apic mode
  2019-10-04  3:01     ` Michael Kelley
@ 2019-10-04  9:18       ` Roman Kagan
  2019-10-04 22:33         ` Michael Kelley
  0 siblings, 1 reply; 7+ messages in thread
From: Roman Kagan @ 2019-10-04  9:18 UTC (permalink / raw)
  To: Michael Kelley
  Cc: vkuznets, kvm, Tianyu Lan, Joerg Roedel, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Sasha Levin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, linux-hyperv,
	linux-kernel

On Fri, Oct 04, 2019 at 03:01:51AM +0000, Michael Kelley wrote:
> From: Roman Kagan <rkagan@virtuozzo.com> Sent: Thursday, October 3, 2019 5:53 AM
> > >
> > > AFAIU you're trying to mirror native_x2apic_icr_write() here but this is
> > > different from what hv_apic_icr_write() does
> > > (SET_APIC_DEST_FIELD(id)).
> > 
> > Right.  In xapic mode the ICR2 aka the high 4 bytes of ICR is programmed
> > with the destination id in the highest byte; in x2apic mode the whole
> > ICR2 is set to the 32bit destination id.
> > 
> > > Is it actually correct? (I think you've tested this and it is but)
> > 
> > As I wrote in the commit log, I haven't tested it in the sense that I
> > ran a Linux guest in a Hyper-V VM exposing x2apic to the guest, because
> > I didn't manage to configure it to do so.  OTOH I did run a Windows
> > guest in QEMU/KVM with hv_apic and x2apic enabled and saw it write
> > destination ids unshifted to the ICR2 part of ICR, so I assume it's
> > correct.
> > 
> > > Michael, could you please shed some light here?
> > 
> > Would be appreciated, indeed.
> > 
> 
> The newest version of Hyper-V provides an x2apic in a guest VM when the
> number of vCPUs in the VM is > 240.  This version of Hyper-V is beginning
> to be deployed in Azure to enable the M416v2 VM size, but the functionality
> is not yet available for the on-premises version of Hyper-V.  However, I can
> test this configuration internally with the above patch -- give me a few days.
> 
> An additional complication is that when running on Intel processors that offer
> vAPIC functionality, the Hyper-V "hints" value does *not* recommend using the
> MSR-based APIC accesses.  In this case, memory-mapped access to the x2apic
> registers is faster than the synthetic MSRs.

I guess you mean "using regular x2apic MSRs compared to the synthetic
MSRs".  Indeed they do essentially the same thing, and there's no reason
for one set of MSRs to be significantly faster than the other.  However,
hv_apic_eoi_write makes use of "apic assists" aka lazy EOI which is
certainly a win, and I'm not sure if it works without hv_apic.

> I've already looked at a VM that has
> the x2apic, and indeed that is the case, so the above code wouldn't run
> anyway.  But I can temporarily code around that for testing purposes and see
> if everything works.

Thanks!
Roman.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH v2] x86/hyperv: make vapic support x2apic mode
  2019-10-04  9:18       ` Roman Kagan
@ 2019-10-04 22:33         ` Michael Kelley
  2019-10-08 20:54           ` Michael Kelley
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Kelley @ 2019-10-04 22:33 UTC (permalink / raw)
  To: Roman Kagan
  Cc: vkuznets, kvm, Tianyu Lan, Joerg Roedel, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Sasha Levin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, linux-hyperv,
	linux-kernel

From: Roman Kagan <rkagan@virtuozzo.com> Sent: Friday, October 4, 2019 2:19 AM
> 
> On Fri, Oct 04, 2019 at 03:01:51AM +0000, Michael Kelley wrote:
> > From: Roman Kagan <rkagan@virtuozzo.com> Sent: Thursday, October 3, 2019 5:53 AM
> > > >
> > > > AFAIU you're trying to mirror native_x2apic_icr_write() here but this is
> > > > different from what hv_apic_icr_write() does
> > > > (SET_APIC_DEST_FIELD(id)).
> > >
> > > Right.  In xapic mode the ICR2 aka the high 4 bytes of ICR is programmed
> > > with the destination id in the highest byte; in x2apic mode the whole
> > > ICR2 is set to the 32bit destination id.
> > >
> > > > Is it actually correct? (I think you've tested this and it is but)
> > >
> > > As I wrote in the commit log, I haven't tested it in the sense that I
> > > ran a Linux guest in a Hyper-V VM exposing x2apic to the guest, because
> > > I didn't manage to configure it to do so.  OTOH I did run a Windows
> > > guest in QEMU/KVM with hv_apic and x2apic enabled and saw it write
> > > destination ids unshifted to the ICR2 part of ICR, so I assume it's
> > > correct.
> > >
> > > > Michael, could you please shed some light here?
> > >
> > > Would be appreciated, indeed.
> > >
> >
> > The newest version of Hyper-V provides an x2apic in a guest VM when the
> > number of vCPUs in the VM is > 240.  This version of Hyper-V is beginning
> > to be deployed in Azure to enable the M416v2 VM size, but the functionality
> > is not yet available for the on-premises version of Hyper-V.  However, I can
> > test this configuration internally with the above patch -- give me a few days.
> >
> > An additional complication is that when running on Intel processors that offer
> > vAPIC functionality, the Hyper-V "hints" value does *not* recommend using the
> > MSR-based APIC accesses.  In this case, memory-mapped access to the x2apic
> > registers is faster than the synthetic MSRs.
> 
> I guess you mean "using regular x2apic MSRs compared to the synthetic
> MSRs".  

Yes, of course you are correct.

> Indeed they do essentially the same thing, and there's no reason
> for one set of MSRs to be significantly faster than the other.  However,
> hv_apic_eoi_write makes use of "apic assists" aka lazy EOI which is
> certainly a win, and I'm not sure if it works without hv_apic.
> 

I've checked with the Hyper-V people and the presence of vAPIC makes
a difference.  If vAPIC is present in the hardware:
1) Hyper-V does not set the HV_X64_APIC_ACCESS_RECOMMENDED flag
2) The architectural MSRs should be used instead of the Hyper-V
    synthetic MSRs, as they are significantly faster.  The architectural
    MSRs do not cause a VMEXIT because they are handled entirely by
    the vAPIC microcode in the CPU.  The synthetic MSRs do cause a VMEXIT.
3) The lazy EOI functionality should not be used

If vAPIC is not present in the hardware:
1) Hyper-V will set HV_X64_APIC_ACCESS_RECOMMENDED
2) Either set of MSRs has about the same performance, but we
    should use the synthetic MSRs.
3) The lazy EOI functionality has some value and should be used

The same will apply to the AMD AVIC in some Hyper-V updates that
are coming soon.

So I think your code makes sense given the above information.  By
Monday I'll try to test it on a Hyper-V guest VM with x2APIC.

Michael

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [PATCH v2] x86/hyperv: make vapic support x2apic mode
  2019-10-04 22:33         ` Michael Kelley
@ 2019-10-08 20:54           ` Michael Kelley
  0 siblings, 0 replies; 7+ messages in thread
From: Michael Kelley @ 2019-10-08 20:54 UTC (permalink / raw)
  To: Roman Kagan
  Cc: vkuznets, kvm, Tianyu Lan, Joerg Roedel, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Sasha Levin, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, x86, linux-hyperv,
	linux-kernel

From: Michael Kelley <mikelley@microsoft.com> Sent: Friday, October 4, 2019 3:33 PM
> 
> From: Roman Kagan <rkagan@virtuozzo.com> Sent: Friday, October 4, 2019 2:19 AM
> >
> > On Fri, Oct 04, 2019 at 03:01:51AM +0000, Michael Kelley wrote:
> > > From: Roman Kagan <rkagan@virtuozzo.com> Sent: Thursday, October 3, 2019 5:53 AM
> > > > >
> > > > > AFAIU you're trying to mirror native_x2apic_icr_write() here but this is
> > > > > different from what hv_apic_icr_write() does
> > > > > (SET_APIC_DEST_FIELD(id)).
> > > >
> > > > Right.  In xapic mode the ICR2 aka the high 4 bytes of ICR is programmed
> > > > with the destination id in the highest byte; in x2apic mode the whole
> > > > ICR2 is set to the 32bit destination id.
> > > >
> > > > > Is it actually correct? (I think you've tested this and it is but)
> > > >
> > > > As I wrote in the commit log, I haven't tested it in the sense that I
> > > > ran a Linux guest in a Hyper-V VM exposing x2apic to the guest, because
> > > > I didn't manage to configure it to do so.  OTOH I did run a Windows
> > > > guest in QEMU/KVM with hv_apic and x2apic enabled and saw it write
> > > > destination ids unshifted to the ICR2 part of ICR, so I assume it's
> > > > correct.
> > > >
> > > > > Michael, could you please shed some light here?
> > > >
> > > > Would be appreciated, indeed.
> > > >
> > >
> > > The newest version of Hyper-V provides an x2apic in a guest VM when the
> > > number of vCPUs in the VM is > 240.  This version of Hyper-V is beginning
> > > to be deployed in Azure to enable the M416v2 VM size, but the functionality
> > > is not yet available for the on-premises version of Hyper-V.  However, I can
> > > test this configuration internally with the above patch -- give me a few days.
> > >
> > > An additional complication is that when running on Intel processors that offer
> > > vAPIC functionality, the Hyper-V "hints" value does *not* recommend using the
> > > MSR-based APIC accesses.  In this case, memory-mapped access to the x2apic
> > > registers is faster than the synthetic MSRs.
> >
> > I guess you mean "using regular x2apic MSRs compared to the synthetic
> > MSRs".
> 
> Yes, of course you are correct.
> 
> > Indeed they do essentially the same thing, and there's no reason
> > for one set of MSRs to be significantly faster than the other.  However,
> > hv_apic_eoi_write makes use of "apic assists" aka lazy EOI which is
> > certainly a win, and I'm not sure if it works without hv_apic.
> >
> 
> I've checked with the Hyper-V people and the presence of vAPIC makes
> a difference.  If vAPIC is present in the hardware:
> 1) Hyper-V does not set the HV_X64_APIC_ACCESS_RECOMMENDED flag
> 2) The architectural MSRs should be used instead of the Hyper-V
>     synthetic MSRs, as they are significantly faster.  The architectural
>     MSRs do not cause a VMEXIT because they are handled entirely by
>     the vAPIC microcode in the CPU.  The synthetic MSRs do cause a VMEXIT.
> 3) The lazy EOI functionality should not be used
> 
> If vAPIC is not present in the hardware:
> 1) Hyper-V will set HV_X64_APIC_ACCESS_RECOMMENDED
> 2) Either set of MSRs has about the same performance, but we
>     should use the synthetic MSRs.
> 3) The lazy EOI functionality has some value and should be used
> 
> The same will apply to the AMD AVIC in some Hyper-V updates that
> are coming soon.
> 
> So I think your code makes sense given the above information.  By
> Monday I'll try to test it on a Hyper-V guest VM with x2APIC.
> 

I've smoke tested your code with a Hyper-V guest VM with x2APIC
and 1024 vCPUs and HV_X64_APIC_ACCESS_RECOMMENDED
enabled.  The new x2apic functions you have added appear to work.
No issues were seen.

However, based on further discussion with the Hyper-V team, the
architectural MSRs and the synthetic MSRs really are interchangeable
with an x2apic.  There's no perf difference like there is with the
memory-mapped registers in the classic APIC.  So your new x2apic
functions aren't really needed -- all that's needed is to skip plugging
in the hv_apic functions when an x2apic is present.  The native x2apic
functions will work just fine.  Note that even with x2apic,
hv_apic_eoi_write() should still be used to take advantage of the
lazy EOI functionality.  It's OK to use the synthetic EOI MSR with
x2apic for this case, so again no additional code is needed.

I quickly changed the code to do the above so that the architectural
MSRs are used, and those changes successfully smoke test on the
same 1024 vCPU VM with no problems.  I tested with vAPIC enabled
and with vAPIC disabled, and all looks good.

Michael

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-10-08 20:54 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-02 10:19 [PATCH v2] x86/hyperv: make vapic support x2apic mode Roman Kagan
2019-10-03 10:54 ` Vitaly Kuznetsov
2019-10-03 12:52   ` Roman Kagan
2019-10-04  3:01     ` Michael Kelley
2019-10-04  9:18       ` Roman Kagan
2019-10-04 22:33         ` Michael Kelley
2019-10-08 20:54           ` Michael Kelley

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).