All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Dexuan Cui <decui@microsoft.com>,
	Michael Kelley <mikelley@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.10 03/43] x86/hyperv: Initialize clockevents after LAPIC is initialized
Date: Fri, 22 Jan 2021 15:12:19 +0100	[thread overview]
Message-ID: <20210122135735.780499208@linuxfoundation.org> (raw)
In-Reply-To: <20210122135735.652681690@linuxfoundation.org>

From: Dexuan Cui <decui@microsoft.com>

[ Upstream commit fff7b5e6ee63c5d20406a131b260c619cdd24fd1 ]

With commit 4df4cb9e99f8, the Hyper-V direct-mode STIMER is actually
initialized before LAPIC is initialized: see

  apic_intr_mode_init()

    x86_platform.apic_post_init()
      hyperv_init()
        hv_stimer_alloc()

    apic_bsp_setup()
      setup_local_APIC()

setup_local_APIC() temporarily disables LAPIC, initializes it and
re-eanble it.  The direct-mode STIMER depends on LAPIC, and when it's
registered, it can be programmed immediately and the timer can fire
very soon:

  hv_stimer_init
    clockevents_config_and_register
      clockevents_register_device
        tick_check_new_device
          tick_setup_device
            tick_setup_periodic(), tick_setup_oneshot()
              clockevents_program_event

When the timer fires in the hypervisor, if the LAPIC is in the
disabled state, new versions of Hyper-V ignore the event and don't inject
the timer interrupt into the VM, and hence the VM hangs when it boots.

Note: when the VM starts/reboots, the LAPIC is pre-enabled by the
firmware, so the window of LAPIC being temporarily disabled is pretty
small, and the issue can only happen once out of 100~200 reboots for
a 40-vCPU VM on one dev host, and on another host the issue doesn't
reproduce after 2000 reboots.

The issue is more noticeable for kdump/kexec, because the LAPIC is
disabled by the first kernel, and stays disabled until the kdump/kexec
kernel enables it. This is especially an issue to a Generation-2 VM
(for which Hyper-V doesn't emulate the PIT timer) when CONFIG_HZ=1000
(rather than CONFIG_HZ=250) is used.

Fix the issue by moving hv_stimer_alloc() to a later place where the
LAPIC timer is initialized.

Fixes: 4df4cb9e99f8 ("x86/hyperv: Initialize clockevents earlier in CPU onlining")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by:  Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210116223136.13892-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/hyperv/hv_init.c |   29 ++++++++++++++++++++++++++---
 1 file changed, 26 insertions(+), 3 deletions(-)

--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -312,6 +312,25 @@ static struct syscore_ops hv_syscore_ops
 	.resume		= hv_resume,
 };
 
+static void (* __initdata old_setup_percpu_clockev)(void);
+
+static void __init hv_stimer_setup_percpu_clockev(void)
+{
+	/*
+	 * Ignore any errors in setting up stimer clockevents
+	 * as we can run with the LAPIC timer as a fallback.
+	 */
+	(void)hv_stimer_alloc();
+
+	/*
+	 * Still register the LAPIC timer, because the direct-mode STIMER is
+	 * not supported by old versions of Hyper-V. This also allows users
+	 * to switch to LAPIC timer via /sys, if they want to.
+	 */
+	if (old_setup_percpu_clockev)
+		old_setup_percpu_clockev();
+}
+
 /*
  * This function is to be invoked early in the boot sequence after the
  * hypervisor has been detected.
@@ -390,10 +409,14 @@ void __init hyperv_init(void)
 	wrmsrl(HV_X64_MSR_HYPERCALL, hypercall_msr.as_uint64);
 
 	/*
-	 * Ignore any errors in setting up stimer clockevents
-	 * as we can run with the LAPIC timer as a fallback.
+	 * hyperv_init() is called before LAPIC is initialized: see
+	 * apic_intr_mode_init() -> x86_platform.apic_post_init() and
+	 * apic_bsp_setup() -> setup_local_APIC(). The direct-mode STIMER
+	 * depends on LAPIC, so hv_stimer_alloc() should be called from
+	 * x86_init.timers.setup_percpu_clockev.
 	 */
-	(void)hv_stimer_alloc();
+	old_setup_percpu_clockev = x86_init.timers.setup_percpu_clockev;
+	x86_init.timers.setup_percpu_clockev = hv_stimer_setup_percpu_clockev;
 
 	hv_apic_init();
 



  parent reply	other threads:[~2021-01-22 14:57 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-22 14:12 [PATCH 5.10 00/43] 5.10.10-rc1 review Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 01/43] Revert "kconfig: remove kvmconfig and xenconfig shorthands" Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 02/43] bpf: Fix selftest compilation on clang 11 Greg Kroah-Hartman
2021-01-22 14:12 ` Greg Kroah-Hartman [this message]
2021-01-22 14:12 ` [PATCH 5.10 04/43] drm/amdgpu/display: drop DCN support for aarch64 Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 05/43] bpf: Fix signed_{sub,add32}_overflows type handling Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 06/43] X.509: Fix crash caused by NULL pointer Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 07/43] nfsd4: readdirplus shouldnt return parent of export Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 08/43] bpf: Dont leak memory in bpf getsockopt when optlen == 0 Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 09/43] bpf: Support PTR_TO_MEM{,_OR_NULL} register spilling Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 10/43] bpf: Fix helper bpf_map_peek_elem_proto pointing to wrong callback Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 11/43] net: ipa: modem: add missing SET_NETDEV_DEV() for proper sysfs links Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 12/43] net: fix use-after-free when UDP GRO with shared fraglist Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 13/43] udp: Prevent reuseport_select_sock from reading uninitialized socks Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 14/43] netxen_nic: fix MSI/MSI-x interrupts Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 15/43] net: ipv6: Validate GSO SKB before finish IPv6 processing Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 16/43] tipc: fix NULL deref in tipc_link_xmit() Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 17/43] mlxsw: core: Add validation of transceiver temperature thresholds Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 18/43] mlxsw: core: Increase critical threshold for ASIC thermal zone Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 19/43] net: mvpp2: Remove Pause and Asym_Pause support Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 20/43] rndis_host: set proper input size for OID_GEN_PHYSICAL_MEDIUM request Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 21/43] esp: avoid unneeded kmap_atomic call Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 22/43] net: dcb: Validate netlink message in DCB handler Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 23/43] net: dcb: Accept RTM_GETDCB messages carrying set-like DCB commands Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 24/43] rxrpc: Call state should be read with READ_ONCE() under some circumstances Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 25/43] i40e: fix potential NULL pointer dereferencing Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 26/43] net: stmmac: Fixed mtu channged by cache aligned Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 27/43] net: sit: unregister_netdevice on newlinks error path Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 28/43] net: stmmac: fix taprio schedule configuration Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 29/43] net: stmmac: fix taprio configuration when base_time is in the past Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 30/43] net: avoid 32 x truesize under-estimation for tiny skbs Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 31/43] dt-bindings: net: renesas,etheravb: RZ/G2H needs tx-internal-delay-ps Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 32/43] net: phy: smsc: fix clk error handling Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 33/43] net: dsa: clear devlink port type before unregistering slave netdevs Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 34/43] rxrpc: Fix handling of an unsupported token type in rxrpc_read() Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 35/43] net: stmmac: use __napi_schedule() for PREEMPT_RT Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 36/43] can: mcp251xfd: mcp251xfd_handle_rxif_one(): fix wrong NULL pointer check Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 37/43] drm/panel: otm8009a: allow using non-continuous dsi clock Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 38/43] mac80211: do not drop tx nulldata packets on encrypted links Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 39/43] mac80211: check if atf has been disabled in __ieee80211_schedule_txq Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 40/43] net: dsa: unbind all switches from tree when DSA master unbinds Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 41/43] cxgb4/chtls: Fix tid stuck due to wrong update of qid Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 42/43] spi: fsl: Fix driver breakage when SPI_CS_HIGH is not set in spi->mode Greg Kroah-Hartman
2021-01-22 14:12 ` [PATCH 5.10 43/43] spi: cadence: cache reference clock rate during probe Greg Kroah-Hartman
2021-01-23  0:24 ` [PATCH 5.10 00/43] 5.10.10-rc1 review Shuah Khan
2021-01-23 15:06   ` Greg Kroah-Hartman
2021-01-23  5:44 ` Naresh Kamboju
2021-01-23  7:20   ` Naresh Kamboju
2021-01-23 15:06     ` Greg Kroah-Hartman
2021-01-23  9:52 ` Pavel Machek
2021-01-23 15:06   ` Greg Kroah-Hartman
2021-01-23  9:59 ` Jon Hunter
2021-01-23 15:19   ` Greg Kroah-Hartman
2021-01-23 14:36 ` Guenter Roeck
2021-01-23 15:07   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210122135735.780499208@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=decui@microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mikelley@microsoft.com \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.