From: Dexuan Cui <decui@microsoft.com>
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
x86@kernel.org, hpa@zytor.com, linux-hyperv@vger.kernel.org,
mikelley@microsoft.com, wei.liu@kernel.org, vkuznets@redhat.com,
jwiesner@suse.com, ohering@suse.com
Cc: linux-kernel@vger.kernel.org, sthemmin@microsoft.com,
haiyangz@microsoft.com, kys@microsoft.com,
Dexuan Cui <decui@microsoft.com>
Subject: [PATCH] x86/hyperv: Fix kexec panic/hang issues
Date: Sun, 20 Dec 2020 14:30:14 -0800 [thread overview]
Message-ID: <20201220223014.14602-1-decui@microsoft.com> (raw)
Currently the kexec kernel can panic or hang due to 2 causes:
1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
VP Assist Pages when the kexec kernel runs. We ever fixed the same issue
for hibernation in
commit 421f090c819d ("x86/hyperv: Suspend/resume the VP assist page for hibernation")
and now let's fix it for kexec.
2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
can still try to access the hypercall page and cause panic. The workaround
"hv_hypercall_pg = NULL;" in hyperv_cleanup() can't work reliably.
Move hyperv_cleanup() to the right place.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
arch/x86/hyperv/hv_init.c | 4 ++++
arch/x86/include/asm/mshyperv.h | 2 ++
arch/x86/kernel/cpu/mshyperv.c | 18 ++++++++++++++++++
drivers/hv/vmbus_drv.c | 2 --
4 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index e04d90af4c27..4638a52d8eae 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -16,6 +16,7 @@
#include <asm/hyperv-tlfs.h>
#include <asm/mshyperv.h>
#include <asm/idtentry.h>
+#include <linux/kexec.h>
#include <linux/version.h>
#include <linux/vmalloc.h>
#include <linux/mm.h>
@@ -26,6 +27,8 @@
#include <linux/syscore_ops.h>
#include <clocksource/hyperv_timer.h>
+int hyperv_init_cpuhp;
+
void *hv_hypercall_pg;
EXPORT_SYMBOL_GPL(hv_hypercall_pg);
@@ -401,6 +404,7 @@ void __init hyperv_init(void)
register_syscore_ops(&hv_syscore_ops);
+ hyperv_init_cpuhp = cpuhp;
return;
remove_cpuhp_state:
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index ffc289992d1b..30f76b966857 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -74,6 +74,8 @@ static inline void hv_disable_stimer0_percpu_irq(int irq) {}
#if IS_ENABLED(CONFIG_HYPERV)
+extern int hyperv_init_cpuhp;
+
extern void *hv_hypercall_pg;
extern void __percpu **hyperv_pcpu_input_arg;
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index f628e3dc150f..43b54bef5448 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -135,14 +135,32 @@ static void hv_machine_shutdown(void)
{
if (kexec_in_progress && hv_kexec_handler)
hv_kexec_handler();
+
+ /*
+ * Call hv_cpu_die() on all the CPUs, otherwise later the hypervisor
+ * corrupts the old VP Assist Pages and can crash the kexec kernel.
+ */
+ if (kexec_in_progress && hyperv_init_cpuhp > 0)
+ cpuhp_remove_state(hyperv_init_cpuhp);
+
+ /* The function calls stop_other_cpus(). */
native_machine_shutdown();
+
+ /* Disable the hypercall page when there is only 1 active CPU. */
+ if (kexec_in_progress)
+ hyperv_cleanup();
}
static void hv_machine_crash_shutdown(struct pt_regs *regs)
{
if (hv_crash_handler)
hv_crash_handler(regs);
+
+ /* The function calls crash_smp_send_stop(). */
native_machine_crash_shutdown(regs);
+
+ /* Disable the hypercall page when there is only 1 active CPU. */
+ hyperv_cleanup();
}
#endif /* CONFIG_KEXEC_CORE */
#endif /* CONFIG_HYPERV */
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 502f8cd95f6d..d491fdcee61f 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2550,7 +2550,6 @@ static void hv_kexec_handler(void)
/* Make sure conn_state is set as hv_synic_cleanup checks for it */
mb();
cpuhp_remove_state(hyperv_cpuhp_online);
- hyperv_cleanup();
};
static void hv_crash_handler(struct pt_regs *regs)
@@ -2566,7 +2565,6 @@ static void hv_crash_handler(struct pt_regs *regs)
cpu = smp_processor_id();
hv_stimer_cleanup(cpu);
hv_synic_disable_regs(cpu);
- hyperv_cleanup();
};
static int hv_synic_suspend(void)
--
2.19.1
next reply other threads:[~2020-12-20 22:31 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-20 22:30 Dexuan Cui [this message]
2020-12-21 23:33 ` [PATCH] x86/hyperv: Fix kexec panic/hang issues Michael Kelley
2020-12-22 1:03 ` Dexuan Cui
2020-12-22 3:36 ` Michael Kelley
2020-12-22 6:41 ` Dexuan Cui
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20201220223014.14602-1-decui@microsoft.com \
--to=decui@microsoft.com \
--cc=bp@alien8.de \
--cc=haiyangz@microsoft.com \
--cc=hpa@zytor.com \
--cc=jwiesner@suse.com \
--cc=kys@microsoft.com \
--cc=linux-hyperv@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mikelley@microsoft.com \
--cc=mingo@redhat.com \
--cc=ohering@suse.com \
--cc=sthemmin@microsoft.com \
--cc=tglx@linutronix.de \
--cc=vkuznets@redhat.com \
--cc=wei.liu@kernel.org \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).