All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/hyperv: Fix kexec panic/hang issues
@ 2020-12-20 22:30 Dexuan Cui
  2020-12-21 23:33 ` Michael Kelley
  0 siblings, 1 reply; 5+ messages in thread
From: Dexuan Cui @ 2020-12-20 22:30 UTC (permalink / raw)
  To: tglx, mingo, bp, x86, hpa, linux-hyperv, mikelley, wei.liu,
	vkuznets, jwiesner, ohering
  Cc: linux-kernel, sthemmin, haiyangz, kys, Dexuan Cui

Currently the kexec kernel can panic or hang due to 2 causes:

1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
VP Assist Pages when the kexec kernel runs. We ever fixed the same issue
for hibernation in
commit 421f090c819d ("x86/hyperv: Suspend/resume the VP assist page for hibernation")
and now let's fix it for kexec.

2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
can still try to access the hypercall page and cause panic. The workaround
"hv_hypercall_pg = NULL;" in hyperv_cleanup() can't work reliably.
Move hyperv_cleanup() to the right place.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 arch/x86/hyperv/hv_init.c       |  4 ++++
 arch/x86/include/asm/mshyperv.h |  2 ++
 arch/x86/kernel/cpu/mshyperv.c  | 18 ++++++++++++++++++
 drivers/hv/vmbus_drv.c          |  2 --
 4 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index e04d90af4c27..4638a52d8eae 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -16,6 +16,7 @@
 #include <asm/hyperv-tlfs.h>
 #include <asm/mshyperv.h>
 #include <asm/idtentry.h>
+#include <linux/kexec.h>
 #include <linux/version.h>
 #include <linux/vmalloc.h>
 #include <linux/mm.h>
@@ -26,6 +27,8 @@
 #include <linux/syscore_ops.h>
 #include <clocksource/hyperv_timer.h>
 
+int hyperv_init_cpuhp;
+
 void *hv_hypercall_pg;
 EXPORT_SYMBOL_GPL(hv_hypercall_pg);
 
@@ -401,6 +404,7 @@ void __init hyperv_init(void)
 
 	register_syscore_ops(&hv_syscore_ops);
 
+	hyperv_init_cpuhp = cpuhp;
 	return;
 
 remove_cpuhp_state:
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index ffc289992d1b..30f76b966857 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -74,6 +74,8 @@ static inline void hv_disable_stimer0_percpu_irq(int irq) {}
 
 
 #if IS_ENABLED(CONFIG_HYPERV)
+extern int hyperv_init_cpuhp;
+
 extern void *hv_hypercall_pg;
 extern void  __percpu  **hyperv_pcpu_input_arg;
 
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index f628e3dc150f..43b54bef5448 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -135,14 +135,32 @@ static void hv_machine_shutdown(void)
 {
 	if (kexec_in_progress && hv_kexec_handler)
 		hv_kexec_handler();
+
+	/*
+	 * Call hv_cpu_die() on all the CPUs, otherwise later the hypervisor
+	 * corrupts the old VP Assist Pages and can crash the kexec kernel.
+	 */
+	if (kexec_in_progress && hyperv_init_cpuhp > 0)
+		cpuhp_remove_state(hyperv_init_cpuhp);
+
+	/* The function calls stop_other_cpus(). */
 	native_machine_shutdown();
+
+	/* Disable the hypercall page when there is only 1 active CPU. */
+	if (kexec_in_progress)
+		hyperv_cleanup();
 }
 
 static void hv_machine_crash_shutdown(struct pt_regs *regs)
 {
 	if (hv_crash_handler)
 		hv_crash_handler(regs);
+
+	/* The function calls crash_smp_send_stop(). */
 	native_machine_crash_shutdown(regs);
+
+	/* Disable the hypercall page when there is only 1 active CPU. */
+	hyperv_cleanup();
 }
 #endif /* CONFIG_KEXEC_CORE */
 #endif /* CONFIG_HYPERV */
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 502f8cd95f6d..d491fdcee61f 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -2550,7 +2550,6 @@ static void hv_kexec_handler(void)
 	/* Make sure conn_state is set as hv_synic_cleanup checks for it */
 	mb();
 	cpuhp_remove_state(hyperv_cpuhp_online);
-	hyperv_cleanup();
 };
 
 static void hv_crash_handler(struct pt_regs *regs)
@@ -2566,7 +2565,6 @@ static void hv_crash_handler(struct pt_regs *regs)
 	cpu = smp_processor_id();
 	hv_stimer_cleanup(cpu);
 	hv_synic_disable_regs(cpu);
-	hyperv_cleanup();
 };
 
 static int hv_synic_suspend(void)
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* RE: [PATCH] x86/hyperv: Fix kexec panic/hang issues
  2020-12-20 22:30 [PATCH] x86/hyperv: Fix kexec panic/hang issues Dexuan Cui
@ 2020-12-21 23:33 ` Michael Kelley
  2020-12-22  1:03   ` Dexuan Cui
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Kelley @ 2020-12-21 23:33 UTC (permalink / raw)
  To: Dexuan Cui, tglx, mingo, bp, x86, hpa, linux-hyperv, wei.liu,
	vkuznets, jwiesner, ohering
  Cc: linux-kernel, Stephen Hemminger, Haiyang Zhang, KY Srinivasan

From: Dexuan Cui <decui@microsoft.com> Sent: Sunday, December 20, 2020 2:30 PM
> 
> Currently the kexec kernel can panic or hang due to 2 causes:
> 
> 1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
> VP Assist Pages when the kexec kernel runs. We ever fixed the same issue

Spurious word "ever".  And avoid first person "we".  Perhaps:

     The same issue is fixed for hibernation in commit ..... .  Now fix it for kexec.

> for hibernation in
> commit 421f090c819d ("x86/hyperv: Suspend/resume the VP assist page for hibernation")
> and now let's fix it for kexec.

Is the VP Assist Page getting cleared anywhere on the panic path?  We can
only do it for the CPU that runs panic(), but I don't think it is getting cleared
even for that CPU.   It is cleared only in hv_cpu_die(), and that's not called on
the panic path.

> 
> 2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
> are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
> between hv_kexec_handler() and native_machine_shutdown(), the other CPUs
> can still try to access the hypercall page and cause panic. The workaround
> "hv_hypercall_pg = NULL;" in hyperv_cleanup() can't work reliably.

I would note that the comment in hv_suspend() is also incorrect on this
point.  Setting hv_hypercall_pg to NULL does not cause subsequent
hypercalls to fail safely.  The fast hypercalls don't test for it, and even if they
did test like hv_do_hypercall(), the test just creates a race condition.

> Move hyperv_cleanup() to the right place.
> 
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> ---
>  arch/x86/hyperv/hv_init.c       |  4 ++++
>  arch/x86/include/asm/mshyperv.h |  2 ++
>  arch/x86/kernel/cpu/mshyperv.c  | 18 ++++++++++++++++++
>  drivers/hv/vmbus_drv.c          |  2 --
>  4 files changed, 24 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index e04d90af4c27..4638a52d8eae 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -16,6 +16,7 @@
>  #include <asm/hyperv-tlfs.h>
>  #include <asm/mshyperv.h>
>  #include <asm/idtentry.h>
> +#include <linux/kexec.h>
>  #include <linux/version.h>
>  #include <linux/vmalloc.h>
>  #include <linux/mm.h>
> @@ -26,6 +27,8 @@
>  #include <linux/syscore_ops.h>
>  #include <clocksource/hyperv_timer.h>
> 
> +int hyperv_init_cpuhp;
> +
>  void *hv_hypercall_pg;
>  EXPORT_SYMBOL_GPL(hv_hypercall_pg);
> 
> @@ -401,6 +404,7 @@ void __init hyperv_init(void)
> 
>  	register_syscore_ops(&hv_syscore_ops);
> 
> +	hyperv_init_cpuhp = cpuhp;
>  	return;
> 
>  remove_cpuhp_state:
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index ffc289992d1b..30f76b966857 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -74,6 +74,8 @@ static inline void hv_disable_stimer0_percpu_irq(int irq) {}
> 
> 
>  #if IS_ENABLED(CONFIG_HYPERV)
> +extern int hyperv_init_cpuhp;
> +
>  extern void *hv_hypercall_pg;
>  extern void  __percpu  **hyperv_pcpu_input_arg;
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index f628e3dc150f..43b54bef5448 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -135,14 +135,32 @@ static void hv_machine_shutdown(void)
>  {
>  	if (kexec_in_progress && hv_kexec_handler)
>  		hv_kexec_handler();
> +
> +	/*
> +	 * Call hv_cpu_die() on all the CPUs, otherwise later the hypervisor
> +	 * corrupts the old VP Assist Pages and can crash the kexec kernel.
> +	 */
> +	if (kexec_in_progress && hyperv_init_cpuhp > 0)
> +		cpuhp_remove_state(hyperv_init_cpuhp);
> +
> +	/* The function calls stop_other_cpus(). */
>  	native_machine_shutdown();
> +
> +	/* Disable the hypercall page when there is only 1 active CPU. */
> +	if (kexec_in_progress)
> +		hyperv_cleanup();
>  }
> 
>  static void hv_machine_crash_shutdown(struct pt_regs *regs)
>  {
>  	if (hv_crash_handler)
>  		hv_crash_handler(regs);
> +
> +	/* The function calls crash_smp_send_stop(). */

Actually, crash_smp_send_stop() or smp_send_stop() has already been
called earlier by panic(), so there's already only a single CPU running at
this point.  crash_smp_send_stop() is called again in
native_machine_crash_shutdown(), but it has a flag to prevent it from
running again.

>  	native_machine_crash_shutdown(regs);
> +
> +	/* Disable the hypercall page when there is only 1 active CPU. */
> +	hyperv_cleanup();

Moving the call to hyperv_cleanup() in the panic path is OK, and it makes
the panic and kexec() paths more similar, but I don't think it is necessary.
As noted above, the other CPUs have already been stopped, so they shouldn't
be executing any hypercalls.  

>  }
>  #endif /* CONFIG_KEXEC_CORE */
>  #endif /* CONFIG_HYPERV */
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 502f8cd95f6d..d491fdcee61f 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -2550,7 +2550,6 @@ static void hv_kexec_handler(void)
>  	/* Make sure conn_state is set as hv_synic_cleanup checks for it */
>  	mb();
>  	cpuhp_remove_state(hyperv_cpuhp_online);
> -	hyperv_cleanup();
>  };
> 
>  static void hv_crash_handler(struct pt_regs *regs)
> @@ -2566,7 +2565,6 @@ static void hv_crash_handler(struct pt_regs *regs)
>  	cpu = smp_processor_id();
>  	hv_stimer_cleanup(cpu);
>  	hv_synic_disable_regs(cpu);
> -	hyperv_cleanup();
>  };
> 
>  static int hv_synic_suspend(void)
> --
> 2.19.1

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [PATCH] x86/hyperv: Fix kexec panic/hang issues
  2020-12-21 23:33 ` Michael Kelley
@ 2020-12-22  1:03   ` Dexuan Cui
  2020-12-22  3:36     ` Michael Kelley
  0 siblings, 1 reply; 5+ messages in thread
From: Dexuan Cui @ 2020-12-22  1:03 UTC (permalink / raw)
  To: Michael Kelley, tglx, mingo, bp, x86, hpa, linux-hyperv, wei.liu,
	vkuznets, jwiesner, ohering
  Cc: linux-kernel, Stephen Hemminger, Haiyang Zhang, KY Srinivasan

> From: Michael Kelley
> Sent: Monday, December 21, 2020 3:33 PM
> From: Dexuan Cui
> Sent: Sunday, December 20, 2020 2:30 PM
> >
> > Currently the kexec kernel can panic or hang due to 2 causes:
> >
> > 1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
> > VP Assist Pages when the kexec kernel runs. We ever fixed the same issue
> 
> Spurious word "ever".  And avoid first person "we".  Perhaps:
> 
> The same issue is fixed for hibernation in commit ..... .  Now fix it for
> kexec.

Thanks! Will use this in v2.

> > for hibernation in
> > commit 421f090c819d ("x86/hyperv: Suspend/resume the VP assist page for
> hibernation")
> > and now let's fix it for kexec.
> 
> Is the VP Assist Page getting cleared anywhere on the panic path?  We can

It's not.

> only do it for the CPU that runs panic(), but I don't think it is getting cleared
> even for that CPU.   It is cleared only in hv_cpu_die(), and that's not called on
> the panic path.

IMO kdump is different from the non-kdump kexec in that the kdump kernel
runs without depending on the memory used by the first kernel, so it looks 
unnecessary to clear the first kernel's VP Assist Page (and the hypercallpage).
According to my test, the second kernel can re-enble the VP Asist Page and 
the hypercall page using different GPAs, without disabling the old pages first.
Of course, in the future Hyper-V may require the guest to disable the pages first
before trying to re-enabling them, so I agree we'd better clear the pages in the
first kernell like this:

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 4638a52d8eae..8022f51c9c05 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -202,7 +202,7 @@ void clear_hv_tscchange_cb(void)
 }
 EXPORT_SYMBOL_GPL(clear_hv_tscchange_cb);

-static int hv_cpu_die(unsigned int cpu)
+int hv_cpu_die(unsigned int cpu)
 {
        struct hv_reenlightenment_control re_ctrl;
        unsigned int new_cpu;
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 30f76b966857..d090e781d216 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -76,6 +76,8 @@ static inline void hv_disable_stimer0_percpu_irq(int irq) {}
 #if IS_ENABLED(CONFIG_HYPERV)
 extern int hyperv_init_cpuhp;

+int hv_cpu_die(unsigned int cpu);
+
 extern void *hv_hypercall_pg;
 extern void  __percpu  **hyperv_pcpu_input_arg;

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 43b54bef5448..e54f8262bfe0 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -156,6 +156,9 @@ static void hv_machine_crash_shutdown(struct pt_regs *regs)
        if (hv_crash_handler)
                hv_crash_handler(regs);

+       /* Only call this on the faulting CPU. */
+       hv_cpu_die(raw_smp_processor_id());
+
        /* The function calls crash_smp_send_stop(). */
        native_machine_crash_shutdown(regs);

> > 2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
> > are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
> > between hv_kexec_handler() and native_machine_shutdown(), the other
> CPUs
> > can still try to access the hypercall page and cause panic. The workaround
> > "hv_hypercall_pg = NULL;" in hyperv_cleanup() can't work reliably.
> 
> I would note that the comment in hv_suspend() is also incorrect on this
> point.  Setting hv_hypercall_pg to NULL does not cause subsequent
> hypercalls to fail safely.  The fast hypercalls don't test for it, and even if they
> did test like hv_do_hypercall(), the test just creates a race condition.

The comment in hv_suspend() should be correct because hv_suspend()
is only called during hibernation from the syscore_ops path where only
one CPU is active, e.g. for the suspend operation, it's called from
state_store
  hibernate
    hibernation_snapshot
      create_image
        suspend_disable_secondary_cpus 
        syscore_suspend
          hv_suspend

It's similar for the resume operation:
resume_store
  software_resume
    load_image_and_restore
      hibernation_restore
        resume_target_kernel
          hibernate_resume_nonboot_cpu_disable
          syscore_suspend
            hv_suspend
 
> >  static void hv_machine_crash_shutdown(struct pt_regs *regs)
> >  {
> >  	if (hv_crash_handler)
> >  		hv_crash_handler(regs);
> > +
> > +	/* The function calls crash_smp_send_stop(). */
> 
> Actually, crash_smp_send_stop() or smp_send_stop() has already been
> called earlier by panic(),

This is true only when the Hyper-V host supports the feature
HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE. On an old Hyper-V host 
without the feature, ms_hyperv_init_platform() doesn't set
crash_kexec_post_notifiers, so crash_kexec_post_notifiers keeps its
initial value "false", and panic() calls smp_send_stop() *after*
__crash_kexec() (which calls machine_crash_shutdown() ->
hv_machine_crash_shutdown()).

>  so there's already only a single CPU running at
> this point.  crash_smp_send_stop() is called again in
> native_machine_crash_shutdown(), but it has a flag to prevent it from
> running again.
> 
> >  	native_machine_crash_shutdown(regs);
> > +
> > +	/* Disable the hypercall page when there is only 1 active CPU. */
> > +	hyperv_cleanup();
> 
> Moving the call to hyperv_cleanup() in the panic path is OK, and it makes
> the panic and kexec() paths more similar, but I don't think it is necessary.
> As noted above, the other CPUs have already been stopped, so they shouldn't
> be executing any hypercalls.

As I explained above, it's necessary for old Hyper-V hosts. :-)

Thanks,
-- Dexuan


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* RE: [PATCH] x86/hyperv: Fix kexec panic/hang issues
  2020-12-22  1:03   ` Dexuan Cui
@ 2020-12-22  3:36     ` Michael Kelley
  2020-12-22  6:41       ` Dexuan Cui
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Kelley @ 2020-12-22  3:36 UTC (permalink / raw)
  To: Dexuan Cui, tglx, mingo, bp, x86, hpa, linux-hyperv, wei.liu,
	vkuznets, jwiesner, ohering
  Cc: linux-kernel, Stephen Hemminger, Haiyang Zhang, KY Srinivasan

From: Dexuan Cui <decui@microsoft.com>  Sent: Monday, December 21, 2020 5:04 PM
> 
> > From: Michael Kelley
> > Sent: Monday, December 21, 2020 3:33 PM
> > From: Dexuan Cui
> > Sent: Sunday, December 20, 2020 2:30 PM
> > >
> > > Currently the kexec kernel can panic or hang due to 2 causes:
> > >
> > > 1) hv_cpu_die() is not called upon kexec, so the hypervisor corrupts the
> > > VP Assist Pages when the kexec kernel runs. We ever fixed the same issue
> >
> > Spurious word "ever".  And avoid first person "we".  Perhaps:
> >
> > The same issue is fixed for hibernation in commit ..... .  Now fix it for
> > kexec.
> 
> Thanks! Will use this in v2.
> 
> > > for hibernation in
> > > commit 421f090c819d ("x86/hyperv: Suspend/resume the VP assist page for
> > hibernation")
> > > and now let's fix it for kexec.
> >
> > Is the VP Assist Page getting cleared anywhere on the panic path?  We can
> 
> It's not.
> 
> > only do it for the CPU that runs panic(), but I don't think it is getting cleared
> > even for that CPU.   It is cleared only in hv_cpu_die(), and that's not called on
> > the panic path.
> 
> IMO kdump is different from the non-kdump kexec in that the kdump kernel
> runs without depending on the memory used by the first kernel, so it looks
> unnecessary to clear the first kernel's VP Assist Page (and the hypercallpage).
> According to my test, the second kernel can re-enble the VP Asist Page and
> the hypercall page using different GPAs, without disabling the old pages first.

Ah yes.  You are right.  The kdump kernel must be using a disjoint area of
physical memory,  so not clearing these per-CPU overlay pages shouldn't
put the kdump kernel at risk.

> Of course, in the future Hyper-V may require the guest to disable the pages first
> before trying to re-enabling them, so I agree we'd better clear the pages in the
> first kernell like this:
> 
> diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
> index 4638a52d8eae..8022f51c9c05 100644
> --- a/arch/x86/hyperv/hv_init.c
> +++ b/arch/x86/hyperv/hv_init.c
> @@ -202,7 +202,7 @@ void clear_hv_tscchange_cb(void)
>  }
>  EXPORT_SYMBOL_GPL(clear_hv_tscchange_cb);
> 
> -static int hv_cpu_die(unsigned int cpu)
> +int hv_cpu_die(unsigned int cpu)
>  {
>         struct hv_reenlightenment_control re_ctrl;
>         unsigned int new_cpu;
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index 30f76b966857..d090e781d216 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -76,6 +76,8 @@ static inline void hv_disable_stimer0_percpu_irq(int irq) {}
>  #if IS_ENABLED(CONFIG_HYPERV)
>  extern int hyperv_init_cpuhp;
> 
> +int hv_cpu_die(unsigned int cpu);
> +
>  extern void *hv_hypercall_pg;
>  extern void  __percpu  **hyperv_pcpu_input_arg;
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index 43b54bef5448..e54f8262bfe0 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -156,6 +156,9 @@ static void hv_machine_crash_shutdown(struct pt_regs *regs)
>         if (hv_crash_handler)
>                 hv_crash_handler(regs);
> 
> +       /* Only call this on the faulting CPU. */
> +       hv_cpu_die(raw_smp_processor_id());
> +
>         /* The function calls crash_smp_send_stop(). */
>         native_machine_crash_shutdown(regs);

Since we don't *need* to do this, I think it might be less risky to just leave
the code "as is".   But I'm OK either way.

> 
> > > 2) hyperv_cleanup() is called too early. In the kexec path, the other CPUs
> > > are stopped in hv_machine_shutdown() -> native_machine_shutdown(), so
> > > between hv_kexec_handler() and native_machine_shutdown(), the other
> > CPUs
> > > can still try to access the hypercall page and cause panic. The workaround
> > > "hv_hypercall_pg = NULL;" in hyperv_cleanup() can't work reliably.
> >
> > I would note that the comment in hv_suspend() is also incorrect on this
> > point.  Setting hv_hypercall_pg to NULL does not cause subsequent
> > hypercalls to fail safely.  The fast hypercalls don't test for it, and even if they
> > did test like hv_do_hypercall(), the test just creates a race condition.
> 
> The comment in hv_suspend() should be correct because hv_suspend()
> is only called during hibernation from the syscore_ops path where only
> one CPU is active, e.g. for the suspend operation, it's called from
> state_store
>   hibernate
>     hibernation_snapshot
>       create_image
>         suspend_disable_secondary_cpus
>         syscore_suspend
>           hv_suspend
> 
> It's similar for the resume operation:
> resume_store
>   software_resume
>     load_image_and_restore
>       hibernation_restore
>         resume_target_kernel
>           hibernate_resume_nonboot_cpu_disable
>           syscore_suspend
>             hv_suspend

I agree the hv_suspend() code is correct.  I read the second sentence of
the comment as being a more general statement that hypercalls could be
cleanly stopped by setting hv_hypercall_pg to NULL, which isn't true.

> 
> > >  static void hv_machine_crash_shutdown(struct pt_regs *regs)
> > >  {
> > >  	if (hv_crash_handler)
> > >  		hv_crash_handler(regs);
> > > +
> > > +	/* The function calls crash_smp_send_stop(). */
> >
> > Actually, crash_smp_send_stop() or smp_send_stop() has already been
> > called earlier by panic(),
> 
> This is true only when the Hyper-V host supports the feature
> HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE. On an old Hyper-V host
> without the feature, ms_hyperv_init_platform() doesn't set
> crash_kexec_post_notifiers, so crash_kexec_post_notifiers keeps its
> initial value "false", and panic() calls smp_send_stop() *after*
> __crash_kexec() (which calls machine_crash_shutdown() ->
> hv_machine_crash_shutdown()).

OK, I see your point.

> 
> >  so there's already only a single CPU running at
> > this point.  crash_smp_send_stop() is called again in
> > native_machine_crash_shutdown(), but it has a flag to prevent it from
> > running again.
> >
> > >  	native_machine_crash_shutdown(regs);
> > > +
> > > +	/* Disable the hypercall page when there is only 1 active CPU. */
> > > +	hyperv_cleanup();
> >
> > Moving the call to hyperv_cleanup() in the panic path is OK, and it makes
> > the panic and kexec() paths more similar, but I don't think it is necessary.
> > As noted above, the other CPUs have already been stopped, so they shouldn't
> > be executing any hypercalls.
> 
> As I explained above, it's necessary for old Hyper-V hosts. :-)

Agreed.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [PATCH] x86/hyperv: Fix kexec panic/hang issues
  2020-12-22  3:36     ` Michael Kelley
@ 2020-12-22  6:41       ` Dexuan Cui
  0 siblings, 0 replies; 5+ messages in thread
From: Dexuan Cui @ 2020-12-22  6:41 UTC (permalink / raw)
  To: Michael Kelley, tglx, mingo, bp, x86, hpa, linux-hyperv, wei.liu,
	vkuznets, jwiesner, ohering
  Cc: linux-kernel, Stephen Hemminger, Haiyang Zhang, KY Srinivasan

> From: Michael Kelley <mikelley@microsoft.com>
> Sent: Monday, December 21, 2020 7:36 PM
> ...
> Since we don't *need* to do this, I think it might be less risky to just leave
> the code "as is".   But I'm OK either way.

Ok, then I'll leave it as is in v2.

Thanks,
-- Dexuan

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-12-22  6:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-20 22:30 [PATCH] x86/hyperv: Fix kexec panic/hang issues Dexuan Cui
2020-12-21 23:33 ` Michael Kelley
2020-12-22  1:03   ` Dexuan Cui
2020-12-22  3:36     ` Michael Kelley
2020-12-22  6:41       ` Dexuan Cui

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.