linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] x86/Hyper-V: Panic code path fixes
@ 2020-03-17 13:25 ltykernel
  2020-03-17 13:25 ` [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback ltykernel
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: ltykernel @ 2020-03-17 13:25 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, liuwe, tglx, mingo, bp, hpa, x86,
	michael.h.kelley
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

From: Tianyu Lan <Tianyu.Lan@microsoft.com>

This patchset fixes some issues in the Hyper-V panic code path.
Patch 1 resolves issue that panic system still responses network
packets.
Patch 2-3 resolves crash enlightenment issues.
Patch 4 is to set crash_kexec_post_notifiers to true for Hyper-V
VM in order to report crash data or kmsg to host before running
kdump kernel.

Tianyu Lan (4):
  x86/Hyper-V: Unload vmbus channel in hv panic callback
  x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump
  x86/Hyper-V: Trigger crash enlightenment only once during  system
    crash.
  x86/Hyper-V: Report crash register data or ksmg before running crash
    kernel

 arch/x86/kernel/cpu/mshyperv.c | 10 ++++++++++
 drivers/hv/channel_mgmt.c      |  5 +++++
 drivers/hv/vmbus_drv.c         | 35 +++++++++++++++++++++++++----------
 3 files changed, 40 insertions(+), 10 deletions(-)

-- 
2.14.5


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback
  2020-03-17 13:25 [PATCH 0/4] x86/Hyper-V: Panic code path fixes ltykernel
@ 2020-03-17 13:25 ` ltykernel
  2020-03-17 17:35   ` Wei Liu
  2020-03-18 15:58   ` Vitaly Kuznetsov
  2020-03-17 13:25 ` [PATCH 2/4] x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump ltykernel
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 22+ messages in thread
From: ltykernel @ 2020-03-17 13:25 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, liuwe, tglx, mingo, bp, hpa, x86,
	michael.h.kelley
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

From: Tianyu Lan <Tianyu.Lan@microsoft.com>

Customer reported Hyper-V VM still responded network traffic
ack packets after kernel panic with kernel parameter "panic=0”.
This becauses vmbus driver interrupt handler still works
on the panic cpu after kernel panic. Panic cpu falls into
infinite loop of panic() with interrupt enabled at that point.
Vmbus driver can still handle network traffic.

This confuses remote service that the panic system is still
alive when it gets ack packets. Unload vmbus channel in hv panic
callback and fix it.

vmbus_initiate_unload() maybe double called during panic process
(e.g, hyperv_panic_event() and hv_crash_handler()). So check
and set connection state in vmbus_initiate_unload() to resolve
reenter issue.

Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
---
 drivers/hv/channel_mgmt.c |  5 +++++
 drivers/hv/vmbus_drv.c    | 17 +++++++++--------
 2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 0370364169c4..893493f2b420 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -839,6 +839,9 @@ void vmbus_initiate_unload(bool crash)
 {
 	struct vmbus_channel_message_header hdr;
 
+	if (vmbus_connection.conn_state == DISCONNECTED)
+		return;
+
 	/* Pre-Win2012R2 hosts don't support reconnect */
 	if (vmbus_proto_version < VERSION_WIN8_1)
 		return;
@@ -857,6 +860,8 @@ void vmbus_initiate_unload(bool crash)
 		wait_for_completion(&vmbus_connection.unload_event);
 	else
 		vmbus_wait_for_unload();
+
+	vmbus_connection.conn_state = DISCONNECTED;
 }
 
 static void check_ready_for_resume_event(void)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 029378c27421..b56b9fb9bd90 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -53,9 +53,12 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned long val,
 {
 	struct pt_regs *regs;
 
-	regs = current_pt_regs();
+	vmbus_initiate_unload(true);
 
-	hyperv_report_panic(regs, val);
+	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
+		regs = current_pt_regs();
+		hyperv_report_panic(regs, val);
+	}
 	return NOTIFY_DONE;
 }
 
@@ -1391,10 +1394,12 @@ static int vmbus_bus_init(void)
 		}
 
 		register_die_notifier(&hyperv_die_block);
-		atomic_notifier_chain_register(&panic_notifier_list,
-					       &hyperv_panic_block);
 	}
 
+	/* Vmbus channel is unloaded in panic callback when panic happens.*/
+	atomic_notifier_chain_register(&panic_notifier_list,
+			       &hyperv_panic_block);
+
 	vmbus_request_offers();
 
 	return 0;
@@ -2204,8 +2209,6 @@ static int vmbus_bus_suspend(struct device *dev)
 
 	vmbus_initiate_unload(false);
 
-	vmbus_connection.conn_state = DISCONNECTED;
-
 	/* Reset the event for the next resume. */
 	reinit_completion(&vmbus_connection.ready_for_resume_event);
 
@@ -2289,7 +2292,6 @@ static void hv_kexec_handler(void)
 {
 	hv_stimer_global_cleanup();
 	vmbus_initiate_unload(false);
-	vmbus_connection.conn_state = DISCONNECTED;
 	/* Make sure conn_state is set as hv_synic_cleanup checks for it */
 	mb();
 	cpuhp_remove_state(hyperv_cpuhp_online);
@@ -2306,7 +2308,6 @@ static void hv_crash_handler(struct pt_regs *regs)
 	 * doing the cleanup for current CPU only. This should be sufficient
 	 * for kdump.
 	 */
-	vmbus_connection.conn_state = DISCONNECTED;
 	cpu = smp_processor_id();
 	hv_stimer_cleanup(cpu);
 	hv_synic_disable_regs(cpu);
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/4] x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump
  2020-03-17 13:25 [PATCH 0/4] x86/Hyper-V: Panic code path fixes ltykernel
  2020-03-17 13:25 ` [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback ltykernel
@ 2020-03-17 13:25 ` ltykernel
  2020-03-17 17:36   ` Wei Liu
  2020-03-19  0:38   ` Michael Kelley
  2020-03-17 13:25 ` [PATCH 3/4] x86/Hyper-V: Trigger crash enlightenment only once during system crash ltykernel
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 22+ messages in thread
From: ltykernel @ 2020-03-17 13:25 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, liuwe, tglx, mingo, bp, hpa, x86,
	michael.h.kelley
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

From: Tianyu Lan <Tianyu.Lan@microsoft.com>

If fail to register kmsg dump on Hyper-V platform, hv_panic_page
will not be used anywhere. So free and reset it.

Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
---
 drivers/hv/vmbus_drv.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b56b9fb9bd90..b043efea092a 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1385,9 +1385,13 @@ static int vmbus_bus_init(void)
 			hv_panic_page = (void *)hv_alloc_hyperv_zeroed_page();
 			if (hv_panic_page) {
 				ret = kmsg_dump_register(&hv_kmsg_dumper);
-				if (ret)
+				if (ret) {
 					pr_err("Hyper-V: kmsg dump register "
 						"error 0x%x\n", ret);
+					hv_free_hyperv_page(
+					    (unsigned long)hv_panic_page);
+					hv_panic_page = NULL;
+				}
 			} else
 				pr_err("Hyper-V: panic message page memory "
 					"allocation failed");
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/4] x86/Hyper-V: Trigger crash enlightenment only once during system crash.
  2020-03-17 13:25 [PATCH 0/4] x86/Hyper-V: Panic code path fixes ltykernel
  2020-03-17 13:25 ` [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback ltykernel
  2020-03-17 13:25 ` [PATCH 2/4] x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump ltykernel
@ 2020-03-17 13:25 ` ltykernel
  2020-03-19  0:45   ` Michael Kelley
  2020-03-17 13:25 ` [PATCH 4/4] x86/Hyper-V: Report crash register data or ksmg before running crash kernel ltykernel
  2020-03-19  0:57 ` [PATCH 0/4] x86/Hyper-V: Panic code path fixes Michael Kelley
  4 siblings, 1 reply; 22+ messages in thread
From: ltykernel @ 2020-03-17 13:25 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, liuwe, tglx, mingo, bp, hpa, x86,
	michael.h.kelley
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

From: Tianyu Lan <Tianyu.Lan@microsoft.com>

Hyper-V expects guest only triggers crash enlightenment.
The second crash notify will be ignored by Hyper-V.

Current code may trigger crash enlightenment during system
panic twice.
1) The enlightenment is triggered in hyperv_panic/die_event()
via hyperv_report_panic().
2) hv_kmsg_dump() reports kmsg to host via hyperv_report_panic_msg().

Fix it. If kmsg dump is registered successfully, just report
kmsg via hyperv_report_panic_msg() and not report register values
via hyperv_report_panic().

Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
---
 drivers/hv/vmbus_drv.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b043efea092a..1787d6246251 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -55,7 +55,12 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned long val,
 
 	vmbus_initiate_unload(true);
 
-	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
+	/*
+	 * Crash notify only can be triggered once. If crash notify
+	 * message is available, just report kmsg to crash buffer.
+	 */
+	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE
+	    && !hv_panic_page) {
 		regs = current_pt_regs();
 		hyperv_report_panic(regs, val);
 	}
@@ -68,7 +73,12 @@ static int hyperv_die_event(struct notifier_block *nb, unsigned long val,
 	struct die_args *die = (struct die_args *)args;
 	struct pt_regs *regs = die->regs;
 
-	hyperv_report_panic(regs, val);
+	/*
+	 * Crash notify only can be triggered once. If crash notify
+	 * message is available, just report kmsg to crash buffer.
+	 */
+	if (!hv_panic_page)
+		hyperv_report_panic(regs, val);
 	return NOTIFY_DONE;
 }
 
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 4/4] x86/Hyper-V: Report crash register data or ksmg before  running crash kernel
  2020-03-17 13:25 [PATCH 0/4] x86/Hyper-V: Panic code path fixes ltykernel
                   ` (2 preceding siblings ...)
  2020-03-17 13:25 ` [PATCH 3/4] x86/Hyper-V: Trigger crash enlightenment only once during system crash ltykernel
@ 2020-03-17 13:25 ` ltykernel
  2020-03-19  0:51   ` Michael Kelley
  2020-03-19  0:57 ` [PATCH 0/4] x86/Hyper-V: Panic code path fixes Michael Kelley
  4 siblings, 1 reply; 22+ messages in thread
From: ltykernel @ 2020-03-17 13:25 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, liuwe, tglx, mingo, bp, hpa, x86,
	michael.h.kelley
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

From: Tianyu Lan <Tianyu.Lan@microsoft.com>

Hyper-V expects to get crash register data or kmsg via crash
enlightenment when guest crash happens. crash_kexec_post_notifiers
is default to be false and crash kernel runs before calling hv
panic callback and dumping kmsg. In this case, Hyper-V doesn't
get crash register data or kmsg from guest. Set crash_kexec_post
_notifiers to be true for Hyper-V VM and fix it.

Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
---
 arch/x86/kernel/cpu/mshyperv.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index caa032ce3fe3..5e296a7e6036 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -263,6 +263,16 @@ static void __init ms_hyperv_init_platform(void)
 			cpuid_eax(HYPERV_CPUID_NESTED_FEATURES);
 	}
 
+	/*
+	 * Hyper-V expects to get crash register data or kmsg when
+	 * crash enlightment is available and system crashes. Set
+	 * crash_kexec_post_notifiers to be true to make sure that
+	 * calling crash enlightment interface before running kdump
+	 * kernel.
+	 */
+	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)
+		crash_kexec_post_notifiers = true;
+
 #ifdef CONFIG_X86_LOCAL_APIC
 	if (ms_hyperv.features & HV_X64_ACCESS_FREQUENCY_MSRS &&
 	    ms_hyperv.misc_features & HV_FEATURE_FREQUENCY_MSRS_AVAILABLE) {
-- 
2.14.5


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback
  2020-03-17 13:25 ` [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback ltykernel
@ 2020-03-17 17:35   ` Wei Liu
  2020-03-19  8:24     ` Tianyu Lan
  2020-03-18 15:58   ` Vitaly Kuznetsov
  1 sibling, 1 reply; 22+ messages in thread
From: Wei Liu @ 2020-03-17 17:35 UTC (permalink / raw)
  To: ltykernel
  Cc: kys, haiyangz, sthemmin, liuwe, tglx, mingo, bp, hpa, x86,
	michael.h.kelley, Tianyu Lan, linux-hyperv, linux-kernel,
	vkuznets, Wei Liu

On Tue, Mar 17, 2020 at 06:25:20AM -0700, ltykernel@gmail.com wrote:
> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
> 
> Customer reported Hyper-V VM still responded network traffic
> ack packets after kernel panic with kernel parameter "panic=0”.
> This becauses vmbus driver interrupt handler still works
> on the panic cpu after kernel panic. Panic cpu falls into
> infinite loop of panic() with interrupt enabled at that point.
> Vmbus driver can still handle network traffic.
> 
> This confuses remote service that the panic system is still
> alive when it gets ack packets. Unload vmbus channel in hv panic
> callback and fix it.
> 
> vmbus_initiate_unload() maybe double called during panic process
> (e.g, hyperv_panic_event() and hv_crash_handler()). So check
> and set connection state in vmbus_initiate_unload() to resolve
> reenter issue.
> 
> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
> ---
>  drivers/hv/channel_mgmt.c |  5 +++++
>  drivers/hv/vmbus_drv.c    | 17 +++++++++--------
>  2 files changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 0370364169c4..893493f2b420 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -839,6 +839,9 @@ void vmbus_initiate_unload(bool crash)
>  {
>  	struct vmbus_channel_message_header hdr;
>  
> +	if (vmbus_connection.conn_state == DISCONNECTED)
> +		return;
> +
>  	/* Pre-Win2012R2 hosts don't support reconnect */
>  	if (vmbus_proto_version < VERSION_WIN8_1)
>  		return;
> @@ -857,6 +860,8 @@ void vmbus_initiate_unload(bool crash)
>  		wait_for_completion(&vmbus_connection.unload_event);
>  	else
>  		vmbus_wait_for_unload();
> +
> +	vmbus_connection.conn_state = DISCONNECTED;

This is only set at the end of the function.  I don't see how this solve
the re-entrant issue with the check at the beginning. Do I miss anything
here?

Maybe this function should check and set the state to
DISCONNECTING/DISCONNECTED at the beginning of this function?

Wei.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/4] x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump
  2020-03-17 13:25 ` [PATCH 2/4] x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump ltykernel
@ 2020-03-17 17:36   ` Wei Liu
  2020-03-19  8:12     ` Tianyu Lan
  2020-03-19  0:38   ` Michael Kelley
  1 sibling, 1 reply; 22+ messages in thread
From: Wei Liu @ 2020-03-17 17:36 UTC (permalink / raw)
  To: ltykernel
  Cc: kys, haiyangz, sthemmin, liuwe, tglx, mingo, bp, hpa, x86,
	michael.h.kelley, Tianyu Lan, linux-hyperv, linux-kernel,
	vkuznets, Wei Liu

On Tue, Mar 17, 2020 at 06:25:21AM -0700, ltykernel@gmail.com wrote:
> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
> 
> If fail to register kmsg dump on Hyper-V platform, hv_panic_page
> will not be used anywhere. So free and reset it.
> 
> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
> ---
>  drivers/hv/vmbus_drv.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index b56b9fb9bd90..b043efea092a 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -1385,9 +1385,13 @@ static int vmbus_bus_init(void)
>  			hv_panic_page = (void *)hv_alloc_hyperv_zeroed_page();
>  			if (hv_panic_page) {
>  				ret = kmsg_dump_register(&hv_kmsg_dumper);
> -				if (ret)
> +				if (ret) {
>  					pr_err("Hyper-V: kmsg dump register "
>  						"error 0x%x\n", ret);
> +					hv_free_hyperv_page(
> +					    (unsigned long)hv_panic_page);
> +					hv_panic_page = NULL;
> +				}

While this modification looks correct to me, there is a call to free
hv_panic_page in the err_alloc path. That makes the error handling a bit
confusing here.

I think you can just remove that function call in err_alloc path.

Wei.

>  			} else
>  				pr_err("Hyper-V: panic message page memory "
>  					"allocation failed");
> -- 
> 2.14.5
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback
  2020-03-17 13:25 ` [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback ltykernel
  2020-03-17 17:35   ` Wei Liu
@ 2020-03-18 15:58   ` Vitaly Kuznetsov
  2020-03-19  0:33     ` Michael Kelley
  2020-03-19  8:03     ` Tianyu Lan
  1 sibling, 2 replies; 22+ messages in thread
From: Vitaly Kuznetsov @ 2020-03-18 15:58 UTC (permalink / raw)
  To: ltykernel
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, kys, haiyangz, sthemmin,
	liuwe, tglx, mingo, bp, hpa, x86, michael.h.kelley

ltykernel@gmail.com writes:

> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
>
> Customer reported Hyper-V VM still responded network traffic
> ack packets after kernel panic with kernel parameter "panic=0”.
> This becauses vmbus driver interrupt handler still works
> on the panic cpu after kernel panic. Panic cpu falls into
> infinite loop of panic() with interrupt enabled at that point.
> Vmbus driver can still handle network traffic.
>
> This confuses remote service that the panic system is still
> alive when it gets ack packets. Unload vmbus channel in hv panic
> callback and fix it.
>
> vmbus_initiate_unload() maybe double called during panic process
> (e.g, hyperv_panic_event() and hv_crash_handler()). So check
> and set connection state in vmbus_initiate_unload() to resolve
> reenter issue.
>
> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
> ---
>  drivers/hv/channel_mgmt.c |  5 +++++
>  drivers/hv/vmbus_drv.c    | 17 +++++++++--------
>  2 files changed, 14 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 0370364169c4..893493f2b420 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -839,6 +839,9 @@ void vmbus_initiate_unload(bool crash)
>  {
>  	struct vmbus_channel_message_header hdr;
>  
> +	if (vmbus_connection.conn_state == DISCONNECTED)
> +		return;
> +

To make this less racy, can we do something like

	if (xchg(&vmbus_connection.conn_state, DISCONNECTED) == DISCONNECTED)
		return;

?

>  	/* Pre-Win2012R2 hosts don't support reconnect */
>  	if (vmbus_proto_version < VERSION_WIN8_1)
>  		return;
> @@ -857,6 +860,8 @@ void vmbus_initiate_unload(bool crash)
>  		wait_for_completion(&vmbus_connection.unload_event);
>  	else
>  		vmbus_wait_for_unload();
> +
> +	vmbus_connection.conn_state = DISCONNECTED;
>  }
>  
>  static void check_ready_for_resume_event(void)
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index 029378c27421..b56b9fb9bd90 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -53,9 +53,12 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned long val,
>  {
>  	struct pt_regs *regs;
>  
> -	regs = current_pt_regs();
> +	vmbus_initiate_unload(true);
>  
> -	hyperv_report_panic(regs, val);
> +	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {

With Michael's effors to make code in drivers/hv arch agnostic, I think
we need a better, arch-neutral way.

> +		regs = current_pt_regs();
> +		hyperv_report_panic(regs, val);
> +	}
>  	return NOTIFY_DONE;
>  }
>  
> @@ -1391,10 +1394,12 @@ static int vmbus_bus_init(void)
>  		}
>  
>  		register_die_notifier(&hyperv_die_block);
> -		atomic_notifier_chain_register(&panic_notifier_list,
> -					       &hyperv_panic_block);
>  	}
>  
> +	/* Vmbus channel is unloaded in panic callback when panic happens.*/
> +	atomic_notifier_chain_register(&panic_notifier_list,
> +			       &hyperv_panic_block);
> +
>  	vmbus_request_offers();
>  
>  	return 0;
> @@ -2204,8 +2209,6 @@ static int vmbus_bus_suspend(struct device *dev)
>  
>  	vmbus_initiate_unload(false);
>  
> -	vmbus_connection.conn_state = DISCONNECTED;
> -
>  	/* Reset the event for the next resume. */
>  	reinit_completion(&vmbus_connection.ready_for_resume_event);
>  
> @@ -2289,7 +2292,6 @@ static void hv_kexec_handler(void)
>  {
>  	hv_stimer_global_cleanup();
>  	vmbus_initiate_unload(false);
> -	vmbus_connection.conn_state = DISCONNECTED;
>  	/* Make sure conn_state is set as hv_synic_cleanup checks for it */
>  	mb();
>  	cpuhp_remove_state(hyperv_cpuhp_online);
> @@ -2306,7 +2308,6 @@ static void hv_crash_handler(struct pt_regs *regs)
>  	 * doing the cleanup for current CPU only. This should be sufficient
>  	 * for kdump.
>  	 */
> -	vmbus_connection.conn_state = DISCONNECTED;
>  	cpu = smp_processor_id();
>  	hv_stimer_cleanup(cpu);
>  	hv_synic_disable_regs(cpu);

-- 
Vitaly


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback
  2020-03-18 15:58   ` Vitaly Kuznetsov
@ 2020-03-19  0:33     ` Michael Kelley
  2020-03-19  8:03       ` Vitaly Kuznetsov
  2020-03-19  8:03     ` Tianyu Lan
  1 sibling, 1 reply; 22+ messages in thread
From: Michael Kelley @ 2020-03-19  0:33 UTC (permalink / raw)
  To: vkuznets, ltykernel
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Wei Liu, tglx, mingo, bp, hpa,
	x86

From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Wednesday, March 18, 2020 8:58 AM
> 
> ltykernel@gmail.com writes:
> 
> > From: Tianyu Lan <Tianyu.Lan@microsoft.com>
> >
> > Customer reported Hyper-V VM still responded network traffic
> > ack packets after kernel panic with kernel parameter "panic=0”.
> > This becauses vmbus driver interrupt handler still works
> > on the panic cpu after kernel panic. Panic cpu falls into
> > infinite loop of panic() with interrupt enabled at that point.
> > Vmbus driver can still handle network traffic.
> >
> > This confuses remote service that the panic system is still
> > alive when it gets ack packets. Unload vmbus channel in hv panic
> > callback and fix it.
> >
> > vmbus_initiate_unload() maybe double called during panic process
> > (e.g, hyperv_panic_event() and hv_crash_handler()). So check
> > and set connection state in vmbus_initiate_unload() to resolve
> > reenter issue.

Let me suggest a revised version of the commit message:

When kdump is not configured, a Hyper-V VM might still respond to
network traffic after a kernel panic when kernel parameter panic=0.
The panic CPU goes into an infinite loop with interrupts enabled,
and the VMbus driver interrupt handler still works because the 
VMbus connection is unloaded only in the kdump path.  The network
responses make the other end of the connection think the VM is
still functional even though it has panic'ed, which could affect any
failover actions that should be taken.

Fix this by unloading the VMbus connection during the panic process.
vmbus_initiate_unload() could then be called twice (e.g., by
hyperv_panic_event() and hv_crash_handler(), so reset the connection
state in vmbus_initiate_unload() to ensure the unload is done only
once.

> >
> > Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
> > ---
> >  drivers/hv/channel_mgmt.c |  5 +++++
> >  drivers/hv/vmbus_drv.c    | 17 +++++++++--------
> >  2 files changed, 14 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > index 0370364169c4..893493f2b420 100644
> > --- a/drivers/hv/channel_mgmt.c
> > +++ b/drivers/hv/channel_mgmt.c
> > @@ -839,6 +839,9 @@ void vmbus_initiate_unload(bool crash)
> >  {
> >  	struct vmbus_channel_message_header hdr;
> >
> > +	if (vmbus_connection.conn_state == DISCONNECTED)
> > +		return;
> > +
> 
> To make this less racy, can we do something like
> 
> 	if (xchg(&vmbus_connection.conn_state, DISCONNECTED) == DISCONNECTED)
> 		return;
> 
> ?

I was trying to decide if there can actually be a race.  The panic() and die()
functions both ensure that only a single CPU can execute in those paths at any
one time, though maybe panic() and die() could be running concurrently.
And vmbus_initiate_unload() can also be called in the hibernation path in
vmbus_bus_suspend(), so there could be a race.  Doing the xchg() makes
sense.

> 
> >  	/* Pre-Win2012R2 hosts don't support reconnect */
> >  	if (vmbus_proto_version < VERSION_WIN8_1)
> >  		return;
> > @@ -857,6 +860,8 @@ void vmbus_initiate_unload(bool crash)
> >  		wait_for_completion(&vmbus_connection.unload_event);
> >  	else
> >  		vmbus_wait_for_unload();
> > +
> > +	vmbus_connection.conn_state = DISCONNECTED;
> >  }
> >
> >  static void check_ready_for_resume_event(void)
> > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> > index 029378c27421..b56b9fb9bd90 100644
> > --- a/drivers/hv/vmbus_drv.c
> > +++ b/drivers/hv/vmbus_drv.c
> > @@ -53,9 +53,12 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned
> long val,
> >  {
> >  	struct pt_regs *regs;
> >
> > -	regs = current_pt_regs();
> > +	vmbus_initiate_unload(true);
> >
> > -	hyperv_report_panic(regs, val);
> > +	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
> 
> With Michael's effors to make code in drivers/hv arch agnostic, I think
> we need a better, arch-neutral way.

Vitaly -- could you elaborate on what part is not arch-neutral?  I don't see
a problem.  ms_hyperv and the misc_features field exist for both the x86
and ARM64 code branches.  It turns out the particular bit for
GUEST_CRASH_MSR_AVAILABLE is different on the two architectures, but
the compiler will do the right thing.

> 
> > +		regs = current_pt_regs();
> > +		hyperv_report_panic(regs, val);
> > +	}
> >  	return NOTIFY_DONE;
> >  }
> >
> > @@ -1391,10 +1394,12 @@ static int vmbus_bus_init(void)
> >  		}
> >
> >  		register_die_notifier(&hyperv_die_block);
> > -		atomic_notifier_chain_register(&panic_notifier_list,
> > -					       &hyperv_panic_block);
> >  	}
> >
> > +	/* Vmbus channel is unloaded in panic callback when panic happens.*/
> > +	atomic_notifier_chain_register(&panic_notifier_list,
> > +			       &hyperv_panic_block);
> > +
> >  	vmbus_request_offers();
> >
> >  	return 0;
> > @@ -2204,8 +2209,6 @@ static int vmbus_bus_suspend(struct device *dev)
> >
> >  	vmbus_initiate_unload(false);
> >
> > -	vmbus_connection.conn_state = DISCONNECTED;
> > -
> >  	/* Reset the event for the next resume. */
> >  	reinit_completion(&vmbus_connection.ready_for_resume_event);
> >
> > @@ -2289,7 +2292,6 @@ static void hv_kexec_handler(void)
> >  {
> >  	hv_stimer_global_cleanup();
> >  	vmbus_initiate_unload(false);
> > -	vmbus_connection.conn_state = DISCONNECTED;
> >  	/* Make sure conn_state is set as hv_synic_cleanup checks for it */
> >  	mb();
> >  	cpuhp_remove_state(hyperv_cpuhp_online);
> > @@ -2306,7 +2308,6 @@ static void hv_crash_handler(struct pt_regs *regs)
> >  	 * doing the cleanup for current CPU only. This should be sufficient
> >  	 * for kdump.
> >  	 */
> > -	vmbus_connection.conn_state = DISCONNECTED;
> >  	cpu = smp_processor_id();
> >  	hv_stimer_cleanup(cpu);
> >  	hv_synic_disable_regs(cpu);
> 
> --
> Vitaly


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 2/4] x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump
  2020-03-17 13:25 ` [PATCH 2/4] x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump ltykernel
  2020-03-17 17:36   ` Wei Liu
@ 2020-03-19  0:38   ` Michael Kelley
  1 sibling, 0 replies; 22+ messages in thread
From: Michael Kelley @ 2020-03-19  0:38 UTC (permalink / raw)
  To: ltykernel, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Wei Liu, tglx, mingo, bp, hpa, x86
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

From: ltykernel@gmail.com <ltykernel@gmail.com> Sent: Tuesday, March 17, 2020 6:25 AM
> 
> If fail to register kmsg dump on Hyper-V platform, hv_panic_page
> will not be used anywhere. So free and reset it.

Slight commit message wording cleanup:

If kmsg_dump_register() fails, hv_panic_page will not be used
anywhere.  So free and reset it.

> 
> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
> ---
>  drivers/hv/vmbus_drv.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index b56b9fb9bd90..b043efea092a 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -1385,9 +1385,13 @@ static int vmbus_bus_init(void)
>  			hv_panic_page = (void *)hv_alloc_hyperv_zeroed_page();
>  			if (hv_panic_page) {
>  				ret = kmsg_dump_register(&hv_kmsg_dumper);
> -				if (ret)
> +				if (ret) {
>  					pr_err("Hyper-V: kmsg dump register "
>  						"error 0x%x\n", ret);
> +					hv_free_hyperv_page(
> +					    (unsigned long)hv_panic_page);
> +					hv_panic_page = NULL;
> +				}
>  			} else
>  				pr_err("Hyper-V: panic message page memory "
>  					"allocation failed");
> --
> 2.14.5


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 3/4] x86/Hyper-V: Trigger crash enlightenment only once during system crash.
  2020-03-17 13:25 ` [PATCH 3/4] x86/Hyper-V: Trigger crash enlightenment only once during system crash ltykernel
@ 2020-03-19  0:45   ` Michael Kelley
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kelley @ 2020-03-19  0:45 UTC (permalink / raw)
  To: ltykernel, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Wei Liu, tglx, mingo, bp, hpa, x86
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

From: ltykernel@gmail.com <ltykernel@gmail.com>  Sent: Tuesday, March 17, 2020 6:25 AM
> 
> Hyper-V expects guest only triggers crash enlightenment.
> The second crash notify will be ignored by Hyper-V.
> 
> Current code may trigger crash enlightenment during system
> panic twice.
> 1) The enlightenment is triggered in hyperv_panic/die_event()
> via hyperv_report_panic().
> 2) hv_kmsg_dump() reports kmsg to host via hyperv_report_panic_msg().
> 
> Fix it. If kmsg dump is registered successfully, just report
> kmsg via hyperv_report_panic_msg() and not report register values
> via hyperv_report_panic().

Suggested wording improvements:

When a guest VM panics, Hyper-V should be notified only once via the
crash synthetic MSRs.  Current Linux code might write these crash MSRs
twice during a system panic:
1) hyperv_panic/die_event() calling hyperv_report_panic()
2) hv_kmsg_dump() calling hyperv_report_panic_msg()

Fix this by not calling hyperv_report_panic() if a kmsg dump has been
successfully registered.  The notification will happen later via
hyperv_report_panic_msg().

> 
> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
> ---
>  drivers/hv/vmbus_drv.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> index b043efea092a..1787d6246251 100644
> --- a/drivers/hv/vmbus_drv.c
> +++ b/drivers/hv/vmbus_drv.c
> @@ -55,7 +55,12 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned
> long val,
> 
>  	vmbus_initiate_unload(true);
> 
> -	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
> +	/*
> +	 * Crash notify only can be triggered once. If crash notify
> +	 * message is available, just report kmsg to crash buffer.
> +	 */
> +	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE
> +	    && !hv_panic_page) {
>  		regs = current_pt_regs();
>  		hyperv_report_panic(regs, val);
>  	}
> @@ -68,7 +73,12 @@ static int hyperv_die_event(struct notifier_block *nb, unsigned long
> val,
>  	struct die_args *die = (struct die_args *)args;
>  	struct pt_regs *regs = die->regs;
> 
> -	hyperv_report_panic(regs, val);
> +	/*
> +	 * Crash notify only can be triggered once. If crash notify
> +	 * message is available, just report kmsg to crash buffer.
> +	 */
> +	if (!hv_panic_page)
> +		hyperv_report_panic(regs, val);
>  	return NOTIFY_DONE;
>  }
> 
> --
> 2.14.5


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 4/4] x86/Hyper-V: Report crash register data or ksmg before  running crash kernel
  2020-03-17 13:25 ` [PATCH 4/4] x86/Hyper-V: Report crash register data or ksmg before running crash kernel ltykernel
@ 2020-03-19  0:51   ` Michael Kelley
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kelley @ 2020-03-19  0:51 UTC (permalink / raw)
  To: ltykernel, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Wei Liu, tglx, mingo, bp, hpa, x86
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

From: ltykernel@gmail.com <ltykernel@gmail.com>  Sent: Tuesday, March 17, 2020 6:25 AM
> 
> Hyper-V expects to get crash register data or kmsg via crash
> enlightenment when guest crash happens. crash_kexec_post_notifiers
> is default to be false and crash kernel runs before calling hv
> panic callback and dumping kmsg. In this case, Hyper-V doesn't
> get crash register data or kmsg from guest. Set crash_kexec_post
> _notifiers to be true for Hyper-V VM and fix it.

Suggested wording tweaks:

We want to notify Hyper-V when a Linux guest VM crash occurs, so
there is a record of the crash even when kdump is enabled.   But
crash_kexec_post_notifiers defaults to "false", so the kdump kernel
runs before the notifiers and Hyper-V never gets notified.  Fix this by
always setting crash_kexec_post_notifiers to be true for Hyper-V VMs.

> 
> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
> ---
>  arch/x86/kernel/cpu/mshyperv.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index caa032ce3fe3..5e296a7e6036 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -263,6 +263,16 @@ static void __init ms_hyperv_init_platform(void)
>  			cpuid_eax(HYPERV_CPUID_NESTED_FEATURES);
>  	}
> 
> +	/*
> +	 * Hyper-V expects to get crash register data or kmsg when
> +	 * crash enlightment is available and system crashes. Set
> +	 * crash_kexec_post_notifiers to be true to make sure that
> +	 * calling crash enlightment interface before running kdump
> +	 * kernel.
> +	 */
> +	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)
> +		crash_kexec_post_notifiers = true;
> +
>  #ifdef CONFIG_X86_LOCAL_APIC
>  	if (ms_hyperv.features & HV_X64_ACCESS_FREQUENCY_MSRS &&
>  	    ms_hyperv.misc_features & HV_FEATURE_FREQUENCY_MSRS_AVAILABLE) {
> --
> 2.14.5


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 0/4] x86/Hyper-V: Panic code path fixes
  2020-03-17 13:25 [PATCH 0/4] x86/Hyper-V: Panic code path fixes ltykernel
                   ` (3 preceding siblings ...)
  2020-03-17 13:25 ` [PATCH 4/4] x86/Hyper-V: Report crash register data or ksmg before running crash kernel ltykernel
@ 2020-03-19  0:57 ` Michael Kelley
  2020-03-19 14:08   ` Tianyu Lan
  4 siblings, 1 reply; 22+ messages in thread
From: Michael Kelley @ 2020-03-19  0:57 UTC (permalink / raw)
  To: ltykernel, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Wei Liu, tglx, mingo, bp, hpa, x86
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

From: ltykernel@gmail.com <ltykernel@gmail.com> Sent: Tuesday, March 17, 2020 6:25 AM
> 
> This patchset fixes some issues in the Hyper-V panic code path.
> Patch 1 resolves issue that panic system still responses network
> packets.
> Patch 2-3 resolves crash enlightenment issues.
> Patch 4 is to set crash_kexec_post_notifiers to true for Hyper-V
> VM in order to report crash data or kmsg to host before running
> kdump kernel.

I still see an issue that isn't addressed by these patches.  The VMbus
driver registers a "die notifier" and a "panic notifier".   But die() will
eventually call panic() if panic_on_oops is set (which I think it typically
is).  If the CRASH_NOTIFY_MSG option is *not* enabled, then
hyperv_report_panic() could get called by the die notifier, and then
again by the panic notifier.

Do we even need the "die notifier"?  If it was removed, there would
not be any notification to Hyper-V via the die() path unless panic_on_oops
is set, which I think is actually the correct behavior.  I'm not
completely clear on what is supposed to happen in general to the
Linux kernel if panic_on_oops is not set. Does it try to continue to run?
If so, then we should not be notifying Hyper-V if panic_on_oops is not
set, and removing the die notifier is the right thing to do.

Michael

> 
> Tianyu Lan (4):
>   x86/Hyper-V: Unload vmbus channel in hv panic callback
>   x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump
>   x86/Hyper-V: Trigger crash enlightenment only once during  system
>     crash.
>   x86/Hyper-V: Report crash register data or ksmg before running crash
>     kernel
> 
>  arch/x86/kernel/cpu/mshyperv.c | 10 ++++++++++
>  drivers/hv/channel_mgmt.c      |  5 +++++
>  drivers/hv/vmbus_drv.c         | 35 +++++++++++++++++++++++++----------
>  3 files changed, 40 insertions(+), 10 deletions(-)
> 
> --
> 2.14.5


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback
  2020-03-18 15:58   ` Vitaly Kuznetsov
  2020-03-19  0:33     ` Michael Kelley
@ 2020-03-19  8:03     ` Tianyu Lan
  1 sibling, 0 replies; 22+ messages in thread
From: Tianyu Lan @ 2020-03-19  8:03 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, kys, haiyangz, sthemmin,
	liuwe, tglx, mingo, bp, hpa, x86, michael.h.kelley

Hi Vitaly:
      Thanks for your review.

On 3/18/2020 11:58 PM, Vitaly Kuznetsov wrote:
> ltykernel@gmail.com writes:
> 
>> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
>>
>> Customer reported Hyper-V VM still responded network traffic
>> ack packets after kernel panic with kernel parameter "panic=0”.
>> This becauses vmbus driver interrupt handler still works
>> on the panic cpu after kernel panic. Panic cpu falls into
>> infinite loop of panic() with interrupt enabled at that point.
>> Vmbus driver can still handle network traffic.
>>
>> This confuses remote service that the panic system is still
>> alive when it gets ack packets. Unload vmbus channel in hv panic
>> callback and fix it.
>>
>> vmbus_initiate_unload() maybe double called during panic process
>> (e.g, hyperv_panic_event() and hv_crash_handler()). So check
>> and set connection state in vmbus_initiate_unload() to resolve
>> reenter issue.
>>
>> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
>> ---
>>   drivers/hv/channel_mgmt.c |  5 +++++
>>   drivers/hv/vmbus_drv.c    | 17 +++++++++--------
>>   2 files changed, 14 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
>> index 0370364169c4..893493f2b420 100644
>> --- a/drivers/hv/channel_mgmt.c
>> +++ b/drivers/hv/channel_mgmt.c
>> @@ -839,6 +839,9 @@ void vmbus_initiate_unload(bool crash)
>>   {
>>   	struct vmbus_channel_message_header hdr;
>>   
>> +	if (vmbus_connection.conn_state == DISCONNECTED)
>> +		return;
>> +
> 
> To make this less racy, can we do something like
> 
> 	if (xchg(&vmbus_connection.conn_state, DISCONNECTED) == DISCONNECTED)
> 		return;
> 
> ?

Agree. Will update in the next version.

> 
>>   	/* Pre-Win2012R2 hosts don't support reconnect */
>>   	if (vmbus_proto_version < VERSION_WIN8_1)
>>   		return;
>> @@ -857,6 +860,8 @@ void vmbus_initiate_unload(bool crash)
>>   		wait_for_completion(&vmbus_connection.unload_event);
>>   	else
>>   		vmbus_wait_for_unload();
>> +
>> +	vmbus_connection.conn_state = DISCONNECTED;
>>   }
>>   
>>   static void check_ready_for_resume_event(void)
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
>> index 029378c27421..b56b9fb9bd90 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -53,9 +53,12 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned long val,
>>   {
>>   	struct pt_regs *regs;
>>   
>> -	regs = current_pt_regs();
>> +	vmbus_initiate_unload(true);
>>   
>> -	hyperv_report_panic(regs, val);
>> +	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
> 
> With Michael's effors to make code in drivers/hv arch agnostic, I think
> we need a better, arch-neutral way.
> 
>> +		regs = current_pt_regs();
>> +		hyperv_report_panic(regs, val);
>> +	}
>>   	return NOTIFY_DONE;
>>   }
>>   
>> @@ -1391,10 +1394,12 @@ static int vmbus_bus_init(void)
>>   		}
>>   
>>   		register_die_notifier(&hyperv_die_block);
>> -		atomic_notifier_chain_register(&panic_notifier_list,
>> -					       &hyperv_panic_block);
>>   	}
>>   
>> +	/* Vmbus channel is unloaded in panic callback when panic happens.*/
>> +	atomic_notifier_chain_register(&panic_notifier_list,
>> +			       &hyperv_panic_block);
>> +
>>   	vmbus_request_offers();
>>   
>>   	return 0;
>> @@ -2204,8 +2209,6 @@ static int vmbus_bus_suspend(struct device *dev)
>>   
>>   	vmbus_initiate_unload(false);
>>   
>> -	vmbus_connection.conn_state = DISCONNECTED;
>> -
>>   	/* Reset the event for the next resume. */
>>   	reinit_completion(&vmbus_connection.ready_for_resume_event);
>>   
>> @@ -2289,7 +2292,6 @@ static void hv_kexec_handler(void)
>>   {
>>   	hv_stimer_global_cleanup();
>>   	vmbus_initiate_unload(false);
>> -	vmbus_connection.conn_state = DISCONNECTED;
>>   	/* Make sure conn_state is set as hv_synic_cleanup checks for it */
>>   	mb();
>>   	cpuhp_remove_state(hyperv_cpuhp_online);
>> @@ -2306,7 +2308,6 @@ static void hv_crash_handler(struct pt_regs *regs)
>>   	 * doing the cleanup for current CPU only. This should be sufficient
>>   	 * for kdump.
>>   	 */
>> -	vmbus_connection.conn_state = DISCONNECTED;
>>   	cpu = smp_processor_id();
>>   	hv_stimer_cleanup(cpu);
>>   	hv_synic_disable_regs(cpu);
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback
  2020-03-19  0:33     ` Michael Kelley
@ 2020-03-19  8:03       ` Vitaly Kuznetsov
  2020-03-19 15:06         ` Michael Kelley
  0 siblings, 1 reply; 22+ messages in thread
From: Vitaly Kuznetsov @ 2020-03-19  8:03 UTC (permalink / raw)
  To: Michael Kelley, ltykernel
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Wei Liu, tglx, mingo, bp, hpa,
	x86

Michael Kelley <mikelley@microsoft.com> writes:

>> > --- a/drivers/hv/vmbus_drv.c
>> > +++ b/drivers/hv/vmbus_drv.c
>> > @@ -53,9 +53,12 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned
>> long val,
>> >  {
>> >  	struct pt_regs *regs;
>> >
>> > -	regs = current_pt_regs();
>> > +	vmbus_initiate_unload(true);
>> >
>> > -	hyperv_report_panic(regs, val);
>> > +	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
>> 
>> With Michael's effors to make code in drivers/hv arch agnostic, I think
>> we need a better, arch-neutral way.
>
> Vitaly -- could you elaborate on what part is not arch-neutral?  I don't see
> a problem.  ms_hyperv and the misc_features field exist for both the x86
> and ARM64 code branches.  It turns out the particular bit for
> GUEST_CRASH_MSR_AVAILABLE is different on the two architectures, but
> the compiler will do the right thing.
>

Ah, apologies, missed the fact that we also call it
'HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE' - I have to admit I was confused
by the 'MSR' part. We can probably rename this to something like
HV_FEATURE_GUEST_CRASH_REGS_AVAILABLE - but not as part of the series.

-- 
Vitaly


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/4] x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump
  2020-03-17 17:36   ` Wei Liu
@ 2020-03-19  8:12     ` Tianyu Lan
  0 siblings, 0 replies; 22+ messages in thread
From: Tianyu Lan @ 2020-03-19  8:12 UTC (permalink / raw)
  To: Wei Liu
  Cc: kys, haiyangz, sthemmin, liuwe, tglx, mingo, bp, hpa, x86,
	michael.h.kelley, Tianyu Lan, linux-hyperv, linux-kernel,
	vkuznets

Hi Wei:
	Thanks for your review.

On 3/18/2020 1:36 AM, Wei Liu wrote:
> On Tue, Mar 17, 2020 at 06:25:21AM -0700, ltykernel@gmail.com wrote:
>> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
>>
>> If fail to register kmsg dump on Hyper-V platform, hv_panic_page
>> will not be used anywhere. So free and reset it.
>>
>> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
>> ---
>>   drivers/hv/vmbus_drv.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
>> index b56b9fb9bd90..b043efea092a 100644
>> --- a/drivers/hv/vmbus_drv.c
>> +++ b/drivers/hv/vmbus_drv.c
>> @@ -1385,9 +1385,13 @@ static int vmbus_bus_init(void)
>>   			hv_panic_page = (void *)hv_alloc_hyperv_zeroed_page();
>>   			if (hv_panic_page) {
>>   				ret = kmsg_dump_register(&hv_kmsg_dumper);
>> -				if (ret)
>> +				if (ret) {
>>   					pr_err("Hyper-V: kmsg dump register "
>>   						"error 0x%x\n", ret);
>> +					hv_free_hyperv_page(
>> +					    (unsigned long)hv_panic_page);
>> +					hv_panic_page = NULL;
>> +				}
> 
> While this modification looks correct to me, there is a call to free
> hv_panic_page in the err_alloc path. That makes the error handling a bit
> confusing here.
> 
> I think you can just remove that function call in err_alloc path.

OK. Will update in the next version.

> 
> Wei.
> 
>>   			} else
>>   				pr_err("Hyper-V: panic message page memory "
>>   					"allocation failed");
>> -- 
>> 2.14.5
>>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback
  2020-03-17 17:35   ` Wei Liu
@ 2020-03-19  8:24     ` Tianyu Lan
  0 siblings, 0 replies; 22+ messages in thread
From: Tianyu Lan @ 2020-03-19  8:24 UTC (permalink / raw)
  To: Wei Liu
  Cc: kys, haiyangz, sthemmin, liuwe, tglx, mingo, bp, hpa, x86,
	michael.h.kelley, Tianyu Lan, linux-hyperv, linux-kernel,
	vkuznets

On 3/18/2020 1:35 AM, Wei Liu wrote:
> On Tue, Mar 17, 2020 at 06:25:20AM -0700, ltykernel@gmail.com wrote:
>> From: Tianyu Lan <Tianyu.Lan@microsoft.com>
>>
>> Customer reported Hyper-V VM still responded network traffic
>> ack packets after kernel panic with kernel parameter "panic=0”.
>> This becauses vmbus driver interrupt handler still works
>> on the panic cpu after kernel panic. Panic cpu falls into
>> infinite loop of panic() with interrupt enabled at that point.
>> Vmbus driver can still handle network traffic.
>>
>> This confuses remote service that the panic system is still
>> alive when it gets ack packets. Unload vmbus channel in hv panic
>> callback and fix it.
>>
>> vmbus_initiate_unload() maybe double called during panic process
>> (e.g, hyperv_panic_event() and hv_crash_handler()). So check
>> and set connection state in vmbus_initiate_unload() to resolve
>> reenter issue.
>>
>> Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
>> ---
>>   drivers/hv/channel_mgmt.c |  5 +++++
>>   drivers/hv/vmbus_drv.c    | 17 +++++++++--------
>>   2 files changed, 14 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
>> index 0370364169c4..893493f2b420 100644
>> --- a/drivers/hv/channel_mgmt.c
>> +++ b/drivers/hv/channel_mgmt.c
>> @@ -839,6 +839,9 @@ void vmbus_initiate_unload(bool crash)
>>   {
>>   	struct vmbus_channel_message_header hdr;
>>   
>> +	if (vmbus_connection.conn_state == DISCONNECTED)
>> +		return;
>> +
>>   	/* Pre-Win2012R2 hosts don't support reconnect */
>>   	if (vmbus_proto_version < VERSION_WIN8_1)
>>   		return;
>> @@ -857,6 +860,8 @@ void vmbus_initiate_unload(bool crash)
>>   		wait_for_completion(&vmbus_connection.unload_event);
>>   	else
>>   		vmbus_wait_for_unload();
>> +
>> +	vmbus_connection.conn_state = DISCONNECTED;
> 
> This is only set at the end of the function.  I don't see how this solve
> the re-entrant issue with the check at the beginning. Do I miss anything
> here?
> 

For this issue, vmbus_initiate_unload() maybe called on the panic vcpu
twice and so just split check and set conn_state.

> Maybe this function should check and set the state to
> DISCONNECTING/DISCONNECTED at the beginning of this function?
> 
Yes, Vitaly also gave suggestion to use "xchg" to check and set
conn_state. Will update in the next version.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] x86/Hyper-V: Panic code path fixes
  2020-03-19  0:57 ` [PATCH 0/4] x86/Hyper-V: Panic code path fixes Michael Kelley
@ 2020-03-19 14:08   ` Tianyu Lan
  2020-03-19 15:14     ` Michael Kelley
  0 siblings, 1 reply; 22+ messages in thread
From: Tianyu Lan @ 2020-03-19 14:08 UTC (permalink / raw)
  To: Michael Kelley, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Wei Liu, tglx, mingo, bp, hpa, x86
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

Hi Michael:
      Thanks for your review.

On 3/19/2020 8:57 AM, Michael Kelley wrote:
> From: ltykernel@gmail.com <ltykernel@gmail.com> Sent: Tuesday, March 17, 2020 6:25 AM
>>
>> This patchset fixes some issues in the Hyper-V panic code path.
>> Patch 1 resolves issue that panic system still responses network
>> packets.
>> Patch 2-3 resolves crash enlightenment issues.
>> Patch 4 is to set crash_kexec_post_notifiers to true for Hyper-V
>> VM in order to report crash data or kmsg to host before running
>> kdump kernel.
> 
> I still see an issue that isn't addressed by these patches.  The VMbus
> driver registers a "die notifier" and a "panic notifier".   But die() will
> eventually call panic() if panic_on_oops is set (which I think it typically
> is).  If the CRASH_NOTIFY_MSG option is *not* enabled, then
> hyperv_report_panic() could get called by the die notifier, and then
> again by the panic notifier.
> 
> Do we even need the "die notifier"?  If it was removed, there would
> not be any notification to Hyper-V via the die() path unless panic_on_oops
> is set, which I think is actually the correct behavior.  I'm not
> completely clear on what is supposed to happen in general to the
> Linux kernel if panic_on_oops is not set. Does it try to continue to run?
> If so, then we should not be notifying Hyper-V if panic_on_oops is not
> set, and removing the die notifier is the right thing to do.
> 

hyperv_report_panic() has re-enter check inside and so kernel only 
reports crash register data once during die(). From comment in the
hyperv_report_panic(), register value reported in die chain is more
exact than value in panic chain. The register value in die chain is
passed by die() caller. Register value reported in panic chain
is collected in the hyperv_panic_event().


If panic_on_oops is not set, the task should be killed and kernel
still runs. In this case, we may not trigger crash enlightenment.



> Michael
> 
>>
>> Tianyu Lan (4):
>>    x86/Hyper-V: Unload vmbus channel in hv panic callback
>>    x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump
>>    x86/Hyper-V: Trigger crash enlightenment only once during  system
>>      crash.
>>    x86/Hyper-V: Report crash register data or ksmg before running crash
>>      kernel
>>
>>   arch/x86/kernel/cpu/mshyperv.c | 10 ++++++++++
>>   drivers/hv/channel_mgmt.c      |  5 +++++
>>   drivers/hv/vmbus_drv.c         | 35 +++++++++++++++++++++++++----------
>>   3 files changed, 40 insertions(+), 10 deletions(-)
>>
>> --
>> 2.14.5
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback
  2020-03-19  8:03       ` Vitaly Kuznetsov
@ 2020-03-19 15:06         ` Michael Kelley
  0 siblings, 0 replies; 22+ messages in thread
From: Michael Kelley @ 2020-03-19 15:06 UTC (permalink / raw)
  To: vkuznets, ltykernel
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, KY Srinivasan,
	Haiyang Zhang, Stephen Hemminger, Wei Liu, tglx, mingo, bp, hpa,
	x86

From: Vitaly Kuznetsov <vkuznets@redhat.com>  Sent: Thursday, March 19, 2020 1:04 AM
> 
> Michael Kelley <mikelley@microsoft.com> writes:
> 
> >> > --- a/drivers/hv/vmbus_drv.c
> >> > +++ b/drivers/hv/vmbus_drv.c
> >> > @@ -53,9 +53,12 @@ static int hyperv_panic_event(struct notifier_block *nb,
> unsigned
> >> long val,
> >> >  {
> >> >  	struct pt_regs *regs;
> >> >
> >> > -	regs = current_pt_regs();
> >> > +	vmbus_initiate_unload(true);
> >> >
> >> > -	hyperv_report_panic(regs, val);
> >> > +	if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
> >>
> >> With Michael's effors to make code in drivers/hv arch agnostic, I think
> >> we need a better, arch-neutral way.
> >
> > Vitaly -- could you elaborate on what part is not arch-neutral?  I don't see
> > a problem.  ms_hyperv and the misc_features field exist for both the x86
> > and ARM64 code branches.  It turns out the particular bit for
> > GUEST_CRASH_MSR_AVAILABLE is different on the two architectures, but
> > the compiler will do the right thing.
> >
> 
> Ah, apologies, missed the fact that we also call it
> 'HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE' - I have to admit I was confused
> by the 'MSR' part. We can probably rename this to something like
> HV_FEATURE_GUEST_CRASH_REGS_AVAILABLE - but not as part of the series.
> 

Good point.  Agreed.

Michael

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 0/4] x86/Hyper-V: Panic code path fixes
  2020-03-19 14:08   ` Tianyu Lan
@ 2020-03-19 15:14     ` Michael Kelley
  2020-03-19 16:07       ` Michael Kelley
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Kelley @ 2020-03-19 15:14 UTC (permalink / raw)
  To: Tianyu Lan, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Wei Liu, tglx, mingo, bp, hpa, x86
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

From: Tianyu Lan <Tianyu.Lan@microsoft.com> Sent: Thursday, March 19, 2020 7:08 AM
> >>
> >> This patchset fixes some issues in the Hyper-V panic code path.
> >> Patch 1 resolves issue that panic system still responses network
> >> packets.
> >> Patch 2-3 resolves crash enlightenment issues.
> >> Patch 4 is to set crash_kexec_post_notifiers to true for Hyper-V
> >> VM in order to report crash data or kmsg to host before running
> >> kdump kernel.
> >
> > I still see an issue that isn't addressed by these patches.  The VMbus
> > driver registers a "die notifier" and a "panic notifier".   But die() will
> > eventually call panic() if panic_on_oops is set (which I think it typically
> > is).  If the CRASH_NOTIFY_MSG option is *not* enabled, then
> > hyperv_report_panic() could get called by the die notifier, and then
> > again by the panic notifier.
> >
> > Do we even need the "die notifier"?  If it was removed, there would
> > not be any notification to Hyper-V via the die() path unless panic_on_oops
> > is set, which I think is actually the correct behavior.  I'm not
> > completely clear on what is supposed to happen in general to the
> > Linux kernel if panic_on_oops is not set. Does it try to continue to run?
> > If so, then we should not be notifying Hyper-V if panic_on_oops is not
> > set, and removing the die notifier is the right thing to do.
> >
> 
> hyperv_report_panic() has re-enter check inside and so kernel only
> reports crash register data once during die().

Ah, yes, you are right.

> From comment in the
> hyperv_report_panic(), register value reported in die chain is more
> exact than value in panic chain. The register value in die chain is
> passed by die() caller. Register value reported in panic chain
> is collected in the hyperv_panic_event(). 
> 
> If panic_on_oops is not set, the task should be killed and kernel
> still runs. In this case, we may not trigger crash enlightenment.

I'm not completely clear on your last statement.   It seems like there
is still a problem in that die() will call hyperv_report_panic() even if
panic_on_oops is not set.  We will have reported a panic to Hyper-V
even though the VM did not stop running.

Michael

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH 0/4] x86/Hyper-V: Panic code path fixes
  2020-03-19 15:14     ` Michael Kelley
@ 2020-03-19 16:07       ` Michael Kelley
  2020-03-20  2:21         ` Tianyu Lan
  0 siblings, 1 reply; 22+ messages in thread
From: Michael Kelley @ 2020-03-19 16:07 UTC (permalink / raw)
  To: Michael Kelley, Tianyu Lan, KY Srinivasan, Haiyang Zhang,
	Stephen Hemminger, Wei Liu, tglx, mingo, bp, hpa, x86
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

From: Michael Kelley <mikelley@microsoft.com> Sent: Thursday, March 19, 2020 8:15 AM
> 
> From: Tianyu Lan <Tianyu.Lan@microsoft.com> Sent: Thursday, March 19, 2020 7:08 AM
> > >>
> > >> This patchset fixes some issues in the Hyper-V panic code path.
> > >> Patch 1 resolves issue that panic system still responses network
> > >> packets.
> > >> Patch 2-3 resolves crash enlightenment issues.
> > >> Patch 4 is to set crash_kexec_post_notifiers to true for Hyper-V
> > >> VM in order to report crash data or kmsg to host before running
> > >> kdump kernel.
> > >
> > > I still see an issue that isn't addressed by these patches.  The VMbus
> > > driver registers a "die notifier" and a "panic notifier".   But die() will
> > > eventually call panic() if panic_on_oops is set (which I think it typically
> > > is).  If the CRASH_NOTIFY_MSG option is *not* enabled, then
> > > hyperv_report_panic() could get called by the die notifier, and then
> > > again by the panic notifier.
> > >
> > > Do we even need the "die notifier"?  If it was removed, there would
> > > not be any notification to Hyper-V via the die() path unless panic_on_oops
> > > is set, which I think is actually the correct behavior.  I'm not
> > > completely clear on what is supposed to happen in general to the
> > > Linux kernel if panic_on_oops is not set. Does it try to continue to run?
> > > If so, then we should not be notifying Hyper-V if panic_on_oops is not
> > > set, and removing the die notifier is the right thing to do.
> > >
> >
> > hyperv_report_panic() has re-enter check inside and so kernel only
> > reports crash register data once during die().
> 
> Ah, yes, you are right.
> 
> > From comment in the
> > hyperv_report_panic(), register value reported in die chain is more
> > exact than value in panic chain. The register value in die chain is
> > passed by die() caller. Register value reported in panic chain
> > is collected in the hyperv_panic_event().
> >
> > If panic_on_oops is not set, the task should be killed and kernel
> > still runs. In this case, we may not trigger crash enlightenment.
> 
> I'm not completely clear on your last statement.   It seems like there
> is still a problem in that die() will call hyperv_report_panic() even if
> panic_on_oops is not set.  We will have reported a panic to Hyper-V
> even though the VM did not stop running.
> 
> Michael

There's one more issue to consider.  hv_kmsg_dump() skips calling
hyperv_report_panic_msg() if sysctl_record_panic_msg has been cleared
by a sysctl command.  (This sysctl option gives a customer the ability to 
increase privacy by not having the VM's dmesg contents sent to Hyper-V.)
In this case, the earlier hyperv_report_panic() call should be used.  Otherwise,
there would not be any notification to Hyper-V about the panic.

Michael

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 0/4] x86/Hyper-V: Panic code path fixes
  2020-03-19 16:07       ` Michael Kelley
@ 2020-03-20  2:21         ` Tianyu Lan
  0 siblings, 0 replies; 22+ messages in thread
From: Tianyu Lan @ 2020-03-20  2:21 UTC (permalink / raw)
  To: Michael Kelley, KY Srinivasan, Haiyang Zhang, Stephen Hemminger,
	Wei Liu, tglx, mingo, bp, hpa, x86
  Cc: Tianyu Lan, linux-hyperv, linux-kernel, vkuznets

On 3/20/2020 12:07 AM, Michael Kelley wrote:
> From: Michael Kelley <mikelley@microsoft.com> Sent: Thursday, March 19, 2020 8:15 AM
>>
>> From: Tianyu Lan <Tianyu.Lan@microsoft.com> Sent: Thursday, March 19, 2020 7:08 AM
>>>>>
>>>>> This patchset fixes some issues in the Hyper-V panic code path.
>>>>> Patch 1 resolves issue that panic system still responses network
>>>>> packets.
>>>>> Patch 2-3 resolves crash enlightenment issues.
>>>>> Patch 4 is to set crash_kexec_post_notifiers to true for Hyper-V
>>>>> VM in order to report crash data or kmsg to host before running
>>>>> kdump kernel.
>>>>
>>>> I still see an issue that isn't addressed by these patches.  The VMbus
>>>> driver registers a "die notifier" and a "panic notifier".   But die() will
>>>> eventually call panic() if panic_on_oops is set (which I think it typically
>>>> is).  If the CRASH_NOTIFY_MSG option is *not* enabled, then
>>>> hyperv_report_panic() could get called by the die notifier, and then
>>>> again by the panic notifier.
>>>>
>>>> Do we even need the "die notifier"?  If it was removed, there would
>>>> not be any notification to Hyper-V via the die() path unless panic_on_oops
>>>> is set, which I think is actually the correct behavior.  I'm not
>>>> completely clear on what is supposed to happen in general to the
>>>> Linux kernel if panic_on_oops is not set. Does it try to continue to run?
>>>> If so, then we should not be notifying Hyper-V if panic_on_oops is not
>>>> set, and removing the die notifier is the right thing to do.
>>>>
>>>
>>> hyperv_report_panic() has re-enter check inside and so kernel only
>>> reports crash register data once during die().
>>
>> Ah, yes, you are right.
>>
>>>  From comment in the
>>> hyperv_report_panic(), register value reported in die chain is more
>>> exact than value in panic chain. The register value in die chain is
>>> passed by die() caller. Register value reported in panic chain
>>> is collected in the hyperv_panic_event().
>>>
>>> If panic_on_oops is not set, the task should be killed and kernel
>>> still runs. In this case, we may not trigger crash enlightenment.
>>
>> I'm not completely clear on your last statement.   It seems like there
>> is still a problem in that die() will call hyperv_report_panic() even if
>> panic_on_oops is not set.  We will have reported a panic to Hyper-V
>> even though the VM did not stop running.
>
Yes, the die callback is still necessary and we should skip report
if panic_on_oops isn't set.

> 
> There's one more issue to consider.  hv_kmsg_dump() skips calling
> hyperv_report_panic_msg() if sysctl_record_panic_msg has been cleared
> by a sysctl command.  (This sysctl option gives a customer the ability to
> increase privacy by not having the VM's dmesg contents sent to Hyper-V.)
> In this case, the earlier hyperv_report_panic() call should be used.  Otherwise,
> there would not be any notification to Hyper-V about the panic.
> 

Nice catch. I will fix this in the next version.

Thanks.



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2020-03-20  2:21 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-17 13:25 [PATCH 0/4] x86/Hyper-V: Panic code path fixes ltykernel
2020-03-17 13:25 ` [PATCH 0/4] x86/Hyper-V: Unload vmbus channel in hv panic callback ltykernel
2020-03-17 17:35   ` Wei Liu
2020-03-19  8:24     ` Tianyu Lan
2020-03-18 15:58   ` Vitaly Kuznetsov
2020-03-19  0:33     ` Michael Kelley
2020-03-19  8:03       ` Vitaly Kuznetsov
2020-03-19 15:06         ` Michael Kelley
2020-03-19  8:03     ` Tianyu Lan
2020-03-17 13:25 ` [PATCH 2/4] x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump ltykernel
2020-03-17 17:36   ` Wei Liu
2020-03-19  8:12     ` Tianyu Lan
2020-03-19  0:38   ` Michael Kelley
2020-03-17 13:25 ` [PATCH 3/4] x86/Hyper-V: Trigger crash enlightenment only once during system crash ltykernel
2020-03-19  0:45   ` Michael Kelley
2020-03-17 13:25 ` [PATCH 4/4] x86/Hyper-V: Report crash register data or ksmg before running crash kernel ltykernel
2020-03-19  0:51   ` Michael Kelley
2020-03-19  0:57 ` [PATCH 0/4] x86/Hyper-V: Panic code path fixes Michael Kelley
2020-03-19 14:08   ` Tianyu Lan
2020-03-19 15:14     ` Michael Kelley
2020-03-19 16:07       ` Michael Kelley
2020-03-20  2:21         ` Tianyu Lan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).