Message ID | 20210116223136.13892-1-decui@microsoft.com |
---|---|
State | Accepted |
Commit | fff7b5e6ee63c5d20406a131b260c619cdd24fd1 |
Headers | show |
Series |
|
Related | show |
From: Dexuan Cui <decui@microsoft.com> Sent: Saturday, January 16, 2021 2:32 PM > > With commit 4df4cb9e99f8, the Hyper-V direct-mode STIMER is actually > initialized before LAPIC is initialized: see > > apic_intr_mode_init() > > x86_platform.apic_post_init() > hyperv_init() > hv_stimer_alloc() > > apic_bsp_setup() > setup_local_APIC() > > setup_local_APIC() temporarily disables LAPIC, initializes it and > re-eanble it. The direct-mode STIMER depends on LAPIC, and when it's > registered, it can be programmed immediately and the timer can fire > very soon: > > hv_stimer_init > clockevents_config_and_register > clockevents_register_device > tick_check_new_device > tick_setup_device > tick_setup_periodic(), tick_setup_oneshot() > clockevents_program_event > > When the timer fires in the hypervisor, if the LAPIC is in the > disabled state, new versions of Hyper-V ignore the event and don't inject > the timer interrupt into the VM, and hence the VM hangs when it boots. > > Note: when the VM starts/reboots, the LAPIC is pre-enabled by the > firmware, so the window of LAPIC being temporarily disabled is pretty > small, and the issue can only happen once out of 100~200 reboots for > a 40-vCPU VM on one dev host, and on another host the issue doesn't > reproduce after 2000 reboots. > > The issue is more noticeable for kdump/kexec, because the LAPIC is > disabled by the first kernel, and stays disabled until the kdump/kexec > kernel enables it. This is especially an issue to a Generation-2 VM > (for which Hyper-V doesn't emulate the PIT timer) when CONFIG_HZ=1000 > (rather than CONFIG_HZ=250) is used. > > Fix the issue by moving hv_stimer_alloc() to a later place where the > LAPIC timer is initialized. > > Fixes: 4df4cb9e99f8 ("x86/hyperv: Initialize clockevents earlier in CPU onlining") > Signed-off-by: Dexuan Cui <decui@microsoft.com> > --- > arch/x86/hyperv/hv_init.c | 29 ++++++++++++++++++++++++++--- > 1 file changed, 26 insertions(+), 3 deletions(-) > Reviewed-by: Michael Kelley <mikelley@microsoft.com>
On Sat, Jan 16, 2021 at 10:48:04PM +0000, Michael Kelley wrote: > From: Dexuan Cui <decui@microsoft.com> Sent: Saturday, January 16, 2021 2:32 PM > > > > Fixes: 4df4cb9e99f8 ("x86/hyperv: Initialize clockevents earlier in CPU onlining") > > Signed-off-by: Dexuan Cui <decui@microsoft.com> > > --- > > arch/x86/hyperv/hv_init.c | 29 ++++++++++++++++++++++++++--- > > 1 file changed, 26 insertions(+), 3 deletions(-) > > > > Reviewed-by: Michael Kelley <mikelley@microsoft.com> Applied to hyperv-fixes. Thanks. >
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index 4638a52d8eae..6375967a8244 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -315,6 +315,25 @@ static struct syscore_ops hv_syscore_ops = { .resume = hv_resume, }; +static void (* __initdata old_setup_percpu_clockev)(void); + +static void __init hv_stimer_setup_percpu_clockev(void) +{ + /* + * Ignore any errors in setting up stimer clockevents + * as we can run with the LAPIC timer as a fallback. + */ + (void)hv_stimer_alloc(); + + /* + * Still register the LAPIC timer, because the direct-mode STIMER is + * not supported by old versions of Hyper-V. This also allows users + * to switch to LAPIC timer via /sys, if they want to. + */ + if (old_setup_percpu_clockev) + old_setup_percpu_clockev(); +} + /* * This function is to be invoked early in the boot sequence after the * hypervisor has been detected. @@ -393,10 +412,14 @@ void __init hyperv_init(void) wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64); /* - * Ignore any errors in setting up stimer clockevents - * as we can run with the LAPIC timer as a fallback. + * hyperv_init() is called before LAPIC is initialized: see + * apic_intr_mode_init() -> x86_platform.apic_post_init() and + * apic_bsp_setup() -> setup_local_APIC(). The direct-mode STIMER + * depends on LAPIC, so hv_stimer_alloc() should be called from + * x86_init.timers.setup_percpu_clockev. */ - (void)hv_stimer_alloc(); + old_setup_percpu_clockev = x86_init.timers.setup_percpu_clockev; + x86_init.timers.setup_percpu_clockev = hv_stimer_setup_percpu_clockev; hv_apic_init();
With commit 4df4cb9e99f8, the Hyper-V direct-mode STIMER is actually initialized before LAPIC is initialized: see apic_intr_mode_init() x86_platform.apic_post_init() hyperv_init() hv_stimer_alloc() apic_bsp_setup() setup_local_APIC() setup_local_APIC() temporarily disables LAPIC, initializes it and re-eanble it. The direct-mode STIMER depends on LAPIC, and when it's registered, it can be programmed immediately and the timer can fire very soon: hv_stimer_init clockevents_config_and_register clockevents_register_device tick_check_new_device tick_setup_device tick_setup_periodic(), tick_setup_oneshot() clockevents_program_event When the timer fires in the hypervisor, if the LAPIC is in the disabled state, new versions of Hyper-V ignore the event and don't inject the timer interrupt into the VM, and hence the VM hangs when it boots. Note: when the VM starts/reboots, the LAPIC is pre-enabled by the firmware, so the window of LAPIC being temporarily disabled is pretty small, and the issue can only happen once out of 100~200 reboots for a 40-vCPU VM on one dev host, and on another host the issue doesn't reproduce after 2000 reboots. The issue is more noticeable for kdump/kexec, because the LAPIC is disabled by the first kernel, and stays disabled until the kdump/kexec kernel enables it. This is especially an issue to a Generation-2 VM (for which Hyper-V doesn't emulate the PIT timer) when CONFIG_HZ=1000 (rather than CONFIG_HZ=250) is used. Fix the issue by moving hv_stimer_alloc() to a later place where the LAPIC timer is initialized. Fixes: 4df4cb9e99f8 ("x86/hyperv: Initialize clockevents earlier in CPU onlining") Signed-off-by: Dexuan Cui <decui@microsoft.com> --- arch/x86/hyperv/hv_init.c | 29 ++++++++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-)