From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Dexuan Cui <decui@microsoft.com>,
	Michael Kelley <mikelley@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>, Sasha Levin <sashal@kernel.org>,
	linux-hyperv@vger.kernel.org
Subject: [PATCH AUTOSEL 5.10 08/45] x86/hyperv: Fix kexec panic/hang issues
Date: Tue, 19 Jan 2021 20:25:25 -0500
Message-ID: <20210120012602.769683-8-sashal@kernel.org>
In-Reply-To: <20210120012602.769683-1-sashal@kernel.org>

From: Dexuan Cui <decui@microsoft.com>

[ Upstream commit dfe94d4086e40e92b1926bddcefa629b791e9b28 ]

Currently the kexec kernel can panic or hang for two reasons:

1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
old VP Assist Pages when the kexec kernel runs. The same issue was fixed
for hibernation in commit 421f090c819d ("x86/hyperv: Suspend/resume the
VP assist page for hibernation"). Now fix it for kexec.

2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
can still try to access the hypercall page and cause a panic. The
workaround "hv_hypercall_pg = NULL;" in hyperv_cleanup() is unreliable.
Move hyperv_cleanup() so that it runs after the other CPUs have been
stopped.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20201222065541.24312-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
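Illustrative note, not part of the commit: the ordering argument in the
changelog can be sketched as a small user-space model. Everything below is
hypothetical (struct model and the model_*() helpers do not exist in the
kernel); it only assumes what the changelog states, namely that disabling
the hypercall page is safe only once a single CPU remains active, and that
the VP Assist Pages must be torn down before the kexec kernel runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct model {
	int  active_cpus;       /* CPUs that can still issue hypercalls */
	bool hypercall_page_on; /* stands in for the enabled hypercall page */
	bool vp_assist_on;      /* stands in for registered VP Assist Pages */
	int  unsafe_events;     /* page disabled while >1 CPU was active */
};

/* Models hv_cpu_die() running on every CPU via cpuhp_remove_state(). */
static void model_cpu_die_all(struct model *m)
{
	m->vp_assist_on = false;
}

/* Models native_machine_shutdown() stopping the other CPUs. */
static void model_stop_other_cpus(struct model *m)
{
	m->active_cpus = 1;
}

/* Models hyperv_cleanup(); records a hazard if other CPUs still run. */
static void model_cleanup(struct model *m)
{
	if (m->active_cpus > 1)
		m->unsafe_events++;
	m->hypercall_page_on = false;
}

/* Old order: cleanup ran from the kexec handler, before CPUs stopped. */
static struct model model_old_order(int ncpus)
{
	struct model m = { ncpus, true, true, 0 };

	assert(ncpus >= 1);
	model_cleanup(&m);
	model_stop_other_cpus(&m);
	return m;
}

/* New order: tear down VP assist, stop CPUs, then disable the page. */
static struct model model_new_order(int ncpus)
{
	struct model m = { ncpus, true, true, 0 };

	assert(ncpus >= 1);
	model_cpu_die_all(&m);
	model_stop_other_cpus(&m);
	model_cleanup(&m);
	return m;
}
```

Calling model_old_order(4) and model_new_order(4) and comparing the
unsafe_events and vp_assist_on fields shows the difference between the two
orderings; the model says nothing about the real hypervisor behaviour.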
 arch/x86/hyperv/hv_init.c       |  4 ++++
 arch/x86/include/asm/mshyperv.h |  2 ++
 arch/x86/kernel/cpu/mshyperv.c  | 18 ++++++++++++++++++
 drivers/hv/vmbus_drv.c          |  2 --
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index e04d90af4c27c..4638a52d8eaea 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -16,6 +16,7 @@
 #include <asm/hyperv-tlfs.h>
 #include <asm/mshyperv.h>
 #include <asm/idtentry.h>
+#include <linux/kexec.h>
 #include <linux/version.h>
 #include <linux/vmalloc.h>
 #include <linux/mm.h>
@@ -26,6 +27,8 @@
 #include <linux/syscore_ops.h>
 #include <clocksource/hyperv_timer.h>
 
+int hyperv_init_cpuhp;
+
 void *hv_hypercall_pg;
 EXPORT_SYMBOL_GPL(hv_hypercall_pg);
 
@@ -401,6 +404,7 @@ void __init hyperv_init(void)
 
 	register_syscore_ops(&hv_syscore_ops);
 
+	hyperv_init_cpuhp = cpuhp;
 	return;
 
 remove_cpuhp_state:
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index ffc289992d1b0..30f76b9668579 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -74,6 +74,8 @@ static inline void hv_disable_stimer0_percpu_irq(int irq) {}
 
 
 #if IS_ENABLED(CONFIG_HYPERV)
+extern int hyperv_init_cpuhp;
+
 extern void *hv_hypercall_pg;
 extern void  __percpu  **hyperv_pcpu_input_arg;
 
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 05ef1f4550cbd..6cc50ab07bded 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -135,14 +135,32 @@ static void hv_machine_shutdown(void)
 {
 	if (kexec_in_progress && hv_kexec_handler)
 		hv_kexec_handler();
+
+	/*
+	 * Call hv_cpu_die() on all the CPUs, otherwise later the hypervisor
+	 * corrupts the old VP Assist Pages and can crash the kexec kernel.
+	 */
+	if (kexec_in_progress && hyperv_init_cpuhp > 0)
+		cpuhp_remove_state(hyperv_init_cpuhp);
+
+	/* The function calls stop_other_cpus(). */
 	native_machine_shutdown();
+
+	/* Disable the hypercall page when there is only 1 active CPU. */
+	if (kexec_in_progress)
+		hyperv_cleanup();
 }
 
 static void hv_machine_crash_shutdown(struct pt_regs *regs)
 {
 	if (hv_crash_handler)
 		hv_crash_handler(regs);
+
+	/* The function calls crash_smp_send_stop(). */
 	native_machine_crash_shutdown(regs);
+
+	/* Disable the hypercall page when there is only 1 active CPU. */
+	hyperv_cleanup();
 }
 #endif /* CONFIG_KEXEC_CORE */
 #endif /* CONFIG_HYPERV */
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 4fad3e6745e53..a5a402e776c77 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2542,7 +2542,6 @@ static void hv_kexec_handler(void)
 	/* Make sure conn_state is set as hv_synic_cleanup checks for it */
 	mb();
 	cpuhp_remove_state(hyperv_cpuhp_online);
-	hyperv_cleanup();
 };
 
 static void hv_crash_handler(struct pt_regs *regs)
@@ -2558,7 +2557,6 @@ static void hv_crash_handler(struct pt_regs *regs)
 	cpu = smp_processor_id();
 	hv_stimer_cleanup(cpu);
 	hv_synic_disable_regs(cpu);
-	hyperv_cleanup();
 };
 
 static int hv_synic_suspend(void)
-- 
2.27.0

