Linux-HyperV Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2 1/1] Drivers: hv: vmbus: Increase wait time for VMbus unload
@ 2021-04-20  4:48 Michael Kelley
  2021-04-20  9:31 ` ** POTENTIAL FRAUD ALERT - RED HAT ** " Vitaly Kuznetsov
  0 siblings, 1 reply; 5+ messages in thread
From: Michael Kelley @ 2021-04-20  4:48 UTC (permalink / raw)
  To: kys, haiyangz, sthemmin, wei.liu, linux-kernel, linux-hyperv,
	vkuznets, decui
  Cc: mikelley

When running in Azure, disks may be connected to a Linux VM with
read/write caching enabled. If a VM panics and issues a VMbus
UNLOAD request to Hyper-V, the response is delayed until all dirty
data in the disk cache is flushed.  In extreme cases, this flushing
can take 10's of seconds, depending on the disk speed and the amount
of dirty data. If kdump is configured for the VM, the current 10 second
timeout in vmbus_wait_for_unload() may be exceeded, and the UNLOAD
complete message may arrive well after the kdump kernel is already
running, causing problems.  Note that no problem occurs if kdump is
not enabled because Hyper-V waits for the cache flush before doing
a reboot through the BIOS/UEFI code.

Fix this problem by increasing the timeout in vmbus_wait_for_unload()
to 100 seconds. Also output periodic messages so that if anyone is
watching the serial console, they won't think the VM is completely
hung.

Fixes: 911e1987efc8 ("Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
---

Changed in v2: Fixed silly error in the argument to mdelay()

---
 drivers/hv/channel_mgmt.c | 30 +++++++++++++++++++++++++-----
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index f3cf4af..ef4685c 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -755,6 +755,12 @@ static void init_vp_index(struct vmbus_channel *channel)
 	free_cpumask_var(available_mask);
 }
 
+#define UNLOAD_DELAY_UNIT_MS	10		/* 10 milliseconds */
+#define UNLOAD_WAIT_MS		(100*1000)	/* 100 seconds */
+#define UNLOAD_WAIT_LOOPS	(UNLOAD_WAIT_MS/UNLOAD_DELAY_UNIT_MS)
+#define UNLOAD_MSG_MS		(5*1000)	/* Every 5 seconds */
+#define UNLOAD_MSG_LOOPS	(UNLOAD_MSG_MS/UNLOAD_DELAY_UNIT_MS)
+
 static void vmbus_wait_for_unload(void)
 {
 	int cpu;
@@ -772,12 +778,17 @@ static void vmbus_wait_for_unload(void)
 	 * vmbus_connection.unload_event. If not, the last thing we can do is
 	 * read message pages for all CPUs directly.
 	 *
-	 * Wait no more than 10 seconds so that the panic path can't get
-	 * hung forever in case the response message isn't seen.
+	 * Wait up to 100 seconds since an Azure host must writeback any dirty
+	 * data in its disk cache before the VMbus UNLOAD request will
+	 * complete. This flushing has been empirically observed to take up
+	 * to 50 seconds in cases with a lot of dirty data, so allow additional
+	 * leeway and for inaccuracies in mdelay(). But eventually time out so
+	 * that the panic path can't get hung forever in case the response
+	 * message isn't seen.
 	 */
-	for (i = 0; i < 1000; i++) {
+	for (i = 1; i <= UNLOAD_WAIT_LOOPS; i++) {
 		if (completion_done(&vmbus_connection.unload_event))
-			break;
+			goto completed;
 
 		for_each_online_cpu(cpu) {
 			struct hv_per_cpu_context *hv_cpu
@@ -800,9 +811,18 @@ static void vmbus_wait_for_unload(void)
 			vmbus_signal_eom(msg, message_type);
 		}
 
-		mdelay(10);
+		/*
+		 * Give a notice periodically so someone watching the
+		 * serial output won't think it is completely hung.
+		 */
+		if (!(i % UNLOAD_MSG_LOOPS))
+			pr_notice("Waiting for VMBus UNLOAD to complete\n");
+
+		mdelay(UNLOAD_DELAY_UNIT_MS);
 	}
+	pr_err("Continuing even though VMBus UNLOAD did not complete\n");
 
+completed:
 	/*
 	 * We're crashing and already got the UNLOAD_RESPONSE, cleanup all
 	 * maybe-pending messages on all CPUs to be able to receive new
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: ** POTENTIAL FRAUD ALERT - RED HAT ** [PATCH v2 1/1] Drivers: hv: vmbus: Increase wait time for VMbus unload
  2021-04-20  4:48 [PATCH v2 1/1] Drivers: hv: vmbus: Increase wait time for VMbus unload Michael Kelley
@ 2021-04-20  9:31 ` Vitaly Kuznetsov
  2021-04-20 19:46   ` Wei Liu
  2021-04-20 21:23   ` Michael Kelley
  0 siblings, 2 replies; 5+ messages in thread
From: Vitaly Kuznetsov @ 2021-04-20  9:31 UTC (permalink / raw)
  To: Michael Kelley
  Cc: mikelley, kys, haiyangz, sthemmin, wei.liu, linux-kernel,
	linux-hyperv, decui

Michael Kelley <mikelley@microsoft.com> writes:

> When running in Azure, disks may be connected to a Linux VM with
> read/write caching enabled. If a VM panics and issues a VMbus
> UNLOAD request to Hyper-V, the response is delayed until all dirty
> data in the disk cache is flushed.  In extreme cases, this flushing
> can take 10's of seconds, depending on the disk speed and the amount
> of dirty data. If kdump is configured for the VM, the current 10 second
> timeout in vmbus_wait_for_unload() may be exceeded, and the UNLOAD
> complete message may arrive well after the kdump kernel is already
> running, causing problems.  Note that no problem occurs if kdump is
> not enabled because Hyper-V waits for the cache flush before doing
> a reboot through the BIOS/UEFI code.
>
> Fix this problem by increasing the timeout in vmbus_wait_for_unload()
> to 100 seconds. Also output periodic messages so that if anyone is
> watching the serial console, they won't think the VM is completely
> hung.
>
> Fixes: 911e1987efc8 ("Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload")
> Signed-off-by: Michael Kelley <mikelley@microsoft.com>
> ---
>
> Changed in v2: Fixed silly error in the argument to mdelay()
>
> ---
>  drivers/hv/channel_mgmt.c | 30 +++++++++++++++++++++++++-----
>  1 file changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index f3cf4af..ef4685c 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -755,6 +755,12 @@ static void init_vp_index(struct vmbus_channel *channel)
>  	free_cpumask_var(available_mask);
>  }
>  
> +#define UNLOAD_DELAY_UNIT_MS	10		/* 10 milliseconds */
> +#define UNLOAD_WAIT_MS		(100*1000)	/* 100 seconds */
> +#define UNLOAD_WAIT_LOOPS	(UNLOAD_WAIT_MS/UNLOAD_DELAY_UNIT_MS)
> +#define UNLOAD_MSG_MS		(5*1000)	/* Every 5 seconds */
> +#define UNLOAD_MSG_LOOPS	(UNLOAD_MSG_MS/UNLOAD_DELAY_UNIT_MS)
> +
>  static void vmbus_wait_for_unload(void)
>  {
>  	int cpu;
> @@ -772,12 +778,17 @@ static void vmbus_wait_for_unload(void)
>  	 * vmbus_connection.unload_event. If not, the last thing we can do is
>  	 * read message pages for all CPUs directly.
>  	 *
> -	 * Wait no more than 10 seconds so that the panic path can't get
> -	 * hung forever in case the response message isn't seen.
> +	 * Wait up to 100 seconds since an Azure host must writeback any dirty
> +	 * data in its disk cache before the VMbus UNLOAD request will
> +	 * complete. This flushing has been empirically observed to take up
> +	 * to 50 seconds in cases with a lot of dirty data, so allow additional
> +	 * leeway and for inaccuracies in mdelay(). But eventually time out so
> +	 * that the panic path can't get hung forever in case the response
> +	 * message isn't seen.

I vaguely remember debugging cases when CHANNELMSG_UNLOAD_RESPONSE never
arrives, it was kind of pointless to proceed to kexec as attempts to
reconnect Vmbus devices were failing (no devices were offered after
CHANNELMSG_REQUESTOFFERS AFAIR). Would it maybe make sense to just do
emergency reboot instead of proceeding to kexec when this happens? Just
wondering.

>  	 */
> -	for (i = 0; i < 1000; i++) {
> +	for (i = 1; i <= UNLOAD_WAIT_LOOPS; i++) {
>  		if (completion_done(&vmbus_connection.unload_event))
> -			break;
> +			goto completed;
>  
>  		for_each_online_cpu(cpu) {
>  			struct hv_per_cpu_context *hv_cpu
> @@ -800,9 +811,18 @@ static void vmbus_wait_for_unload(void)
>  			vmbus_signal_eom(msg, message_type);
>  		}
>  
> -		mdelay(10);
> +		/*
> +		 * Give a notice periodically so someone watching the
> +		 * serial output won't think it is completely hung.
> +		 */
> +		if (!(i % UNLOAD_MSG_LOOPS))
> +			pr_notice("Waiting for VMBus UNLOAD to complete\n");
> +
> +		mdelay(UNLOAD_DELAY_UNIT_MS);
>  	}
> +	pr_err("Continuing even though VMBus UNLOAD did not complete\n");
>  
> +completed:
>  	/*
>  	 * We're crashing and already got the UNLOAD_RESPONSE, cleanup all
>  	 * maybe-pending messages on all CPUs to be able to receive new

This is definitely an improvement,

Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>

-- 
Vitaly


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: ** POTENTIAL FRAUD ALERT - RED HAT ** [PATCH v2 1/1] Drivers: hv: vmbus: Increase wait time for VMbus unload
  2021-04-20  9:31 ` ** POTENTIAL FRAUD ALERT - RED HAT ** " Vitaly Kuznetsov
@ 2021-04-20 19:46   ` Wei Liu
  2021-04-20 21:23   ` Michael Kelley
  1 sibling, 0 replies; 5+ messages in thread
From: Wei Liu @ 2021-04-20 19:46 UTC (permalink / raw)
  To: Vitaly Kuznetsov
  Cc: Michael Kelley, kys, haiyangz, sthemmin, wei.liu, linux-kernel,
	linux-hyperv, decui

On Tue, Apr 20, 2021 at 11:31:54AM +0200, Vitaly Kuznetsov wrote:
> Michael Kelley <mikelley@microsoft.com> writes:
> 
> > When running in Azure, disks may be connected to a Linux VM with
> > read/write caching enabled. If a VM panics and issues a VMbus
> > UNLOAD request to Hyper-V, the response is delayed until all dirty
> > data in the disk cache is flushed.  In extreme cases, this flushing
> > can take 10's of seconds, depending on the disk speed and the amount
> > of dirty data. If kdump is configured for the VM, the current 10 second
> > timeout in vmbus_wait_for_unload() may be exceeded, and the UNLOAD
> > complete message may arrive well after the kdump kernel is already
> > running, causing problems.  Note that no problem occurs if kdump is
> > not enabled because Hyper-V waits for the cache flush before doing
> > a reboot through the BIOS/UEFI code.
> >
> > Fix this problem by increasing the timeout in vmbus_wait_for_unload()
> > to 100 seconds. Also output periodic messages so that if anyone is
> > watching the serial console, they won't think the VM is completely
> > hung.
> >
> > Fixes: 911e1987efc8 ("Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload")
> > Signed-off-by: Michael Kelley <mikelley@microsoft.com>

Applied to hyperv-next. Thanks.

> > ---
[...]
> >  
> > +#define UNLOAD_DELAY_UNIT_MS	10		/* 10 milliseconds */
> > +#define UNLOAD_WAIT_MS		(100*1000)	/* 100 seconds */
> > +#define UNLOAD_WAIT_LOOPS	(UNLOAD_WAIT_MS/UNLOAD_DELAY_UNIT_MS)
> > +#define UNLOAD_MSG_MS		(5*1000)	/* Every 5 seconds */
> > +#define UNLOAD_MSG_LOOPS	(UNLOAD_MSG_MS/UNLOAD_DELAY_UNIT_MS)
> > +
> >  static void vmbus_wait_for_unload(void)
> >  {
> >  	int cpu;
> > @@ -772,12 +778,17 @@ static void vmbus_wait_for_unload(void)
> >  	 * vmbus_connection.unload_event. If not, the last thing we can do is
> >  	 * read message pages for all CPUs directly.
> >  	 *
> > -	 * Wait no more than 10 seconds so that the panic path can't get
> > -	 * hung forever in case the response message isn't seen.
> > +	 * Wait up to 100 seconds since an Azure host must writeback any dirty
> > +	 * data in its disk cache before the VMbus UNLOAD request will
> > +	 * complete. This flushing has been empirically observed to take up
> > +	 * to 50 seconds in cases with a lot of dirty data, so allow additional
> > +	 * leeway and for inaccuracies in mdelay(). But eventually time out so
> > +	 * that the panic path can't get hung forever in case the response
> > +	 * message isn't seen.
> 
> I vaguely remember debugging cases when CHANNELMSG_UNLOAD_RESPONSE never
> arrives, it was kind of pointless to proceed to kexec as attempts to
> reconnect Vmbus devices were failing (no devices were offered after
> CHANNELMSG_REQUESTOFFERS AFAIR). Would it maybe make sense to just do
> emergency reboot instead of proceeding to kexec when this happens? Just
> wondering.
> 

Please submit a follow-up patch if necessary.

Wei.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: ** POTENTIAL FRAUD ALERT - RED HAT ** [PATCH v2 1/1] Drivers: hv: vmbus: Increase wait time for VMbus unload
  2021-04-20  9:31 ` ** POTENTIAL FRAUD ALERT - RED HAT ** " Vitaly Kuznetsov
  2021-04-20 19:46   ` Wei Liu
@ 2021-04-20 21:23   ` Michael Kelley
  2021-04-21  8:24     ` Vitaly Kuznetsov
  1 sibling, 1 reply; 5+ messages in thread
From: Michael Kelley @ 2021-04-20 21:23 UTC (permalink / raw)
  To: vkuznets
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, wei.liu,
	linux-kernel, linux-hyperv, Dexuan Cui

From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Tuesday, April 20, 2021 2:32 AM
> 
> Michael Kelley <mikelley@microsoft.com> writes:
> 
> > When running in Azure, disks may be connected to a Linux VM with
> > read/write caching enabled. If a VM panics and issues a VMbus
> > UNLOAD request to Hyper-V, the response is delayed until all dirty
> > data in the disk cache is flushed.  In extreme cases, this flushing
> > can take 10's of seconds, depending on the disk speed and the amount
> > of dirty data. If kdump is configured for the VM, the current 10 second
> > timeout in vmbus_wait_for_unload() may be exceeded, and the UNLOAD
> > complete message may arrive well after the kdump kernel is already
> > running, causing problems.  Note that no problem occurs if kdump is
> > not enabled because Hyper-V waits for the cache flush before doing
> > a reboot through the BIOS/UEFI code.
> >
> > Fix this problem by increasing the timeout in vmbus_wait_for_unload()
> > to 100 seconds. Also output periodic messages so that if anyone is
> > watching the serial console, they won't think the VM is completely
> > hung.
> >
> > Fixes: 911e1987efc8 ("Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload")
> > Signed-off-by: Michael Kelley <mikelley@microsoft.com>
> > ---
> >
> > Changed in v2: Fixed silly error in the argument to mdelay()
> >
> > ---
> >  drivers/hv/channel_mgmt.c | 30 +++++++++++++++++++++++++-----
> >  1 file changed, 25 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> > index f3cf4af..ef4685c 100644
> > --- a/drivers/hv/channel_mgmt.c
> > +++ b/drivers/hv/channel_mgmt.c
> > @@ -755,6 +755,12 @@ static void init_vp_index(struct vmbus_channel *channel)
> >  	free_cpumask_var(available_mask);
> >  }
> >
> > +#define UNLOAD_DELAY_UNIT_MS	10		/* 10 milliseconds */
> > +#define UNLOAD_WAIT_MS		(100*1000)	/* 100 seconds */
> > +#define UNLOAD_WAIT_LOOPS	(UNLOAD_WAIT_MS/UNLOAD_DELAY_UNIT_MS)
> > +#define UNLOAD_MSG_MS		(5*1000)	/* Every 5 seconds */
> > +#define UNLOAD_MSG_LOOPS	(UNLOAD_MSG_MS/UNLOAD_DELAY_UNIT_MS)
> > +
> >  static void vmbus_wait_for_unload(void)
> >  {
> >  	int cpu;
> > @@ -772,12 +778,17 @@ static void vmbus_wait_for_unload(void)
> >  	 * vmbus_connection.unload_event. If not, the last thing we can do is
> >  	 * read message pages for all CPUs directly.
> >  	 *
> > -	 * Wait no more than 10 seconds so that the panic path can't get
> > -	 * hung forever in case the response message isn't seen.
> > +	 * Wait up to 100 seconds since an Azure host must writeback any dirty
> > +	 * data in its disk cache before the VMbus UNLOAD request will
> > +	 * complete. This flushing has been empirically observed to take up
> > +	 * to 50 seconds in cases with a lot of dirty data, so allow additional
> > +	 * leeway and for inaccuracies in mdelay(). But eventually time out so
> > +	 * that the panic path can't get hung forever in case the response
> > +	 * message isn't seen.
> 
> I vaguely remember debugging cases when CHANNELMSG_UNLOAD_RESPONSE never
> arrives, it was kind of pointless to proceed to kexec as attempts to
> reconnect Vmbus devices were failing (no devices were offered after
> CHANNELMSG_REQUESTOFFERS AFAIR). Would it maybe make sense to just do
> emergency reboot instead of proceeding to kexec when this happens? Just
> wondering.
> 

Yes, I think there have been (and maybe still are) situations where we don't
ever get the UNLOAD response.  But there have been bugs fixed in Hyper-V
that I think make that less likely.  There's also an unfixed (and maybe not fixable)
problem when not operating in STIMER Direct Mode, where an old-style
timer message can block the UNLOAD response message.  But as the world
moves forward to later kernel versions that use STIMER Direct Mode, that
also becomes less likely.   So my inclination is to let execution continue on
the normal execution path, even if the UNLOAD response message isn't
received.  Maybe we just didn't wait quite long enough (even at 100 seconds).
It's a judgment call, and it's not clear to me that doing an emergency reboot
is really any better.

As background work for this patch, we also discovered another bug in Hyper-V.
If the kdump kernel runs and does a VMbus INITIATE_CONTACT while the
UNLOAD is still in progress, the Hyper-V code is supposed to wait for the UNLOAD
to complete, and then commence the VMbus version negotiation.  But it
doesn't do that -- it finally sends the UNLOAD response, but never does the
version negotiation, so the kdump kernel hangs forever.  The Hyper-V team
plans to fix this, and hopefully we'll get a patch deployed in Azure, which
will eliminate one more scenario where the kdump kernel doesn't succeed.

Michael

> >  	 */
> > -	for (i = 0; i < 1000; i++) {
> > +	for (i = 1; i <= UNLOAD_WAIT_LOOPS; i++) {
> >  		if (completion_done(&vmbus_connection.unload_event))
> > -			break;
> > +			goto completed;
> >
> >  		for_each_online_cpu(cpu) {
> >  			struct hv_per_cpu_context *hv_cpu
> > @@ -800,9 +811,18 @@ static void vmbus_wait_for_unload(void)
> >  			vmbus_signal_eom(msg, message_type);
> >  		}
> >
> > -		mdelay(10);
> > +		/*
> > +		 * Give a notice periodically so someone watching the
> > +		 * serial output won't think it is completely hung.
> > +		 */
> > +		if (!(i % UNLOAD_MSG_LOOPS))
> > +			pr_notice("Waiting for VMBus UNLOAD to complete\n");
> > +
> > +		mdelay(UNLOAD_DELAY_UNIT_MS);
> >  	}
> > +	pr_err("Continuing even though VMBus UNLOAD did not complete\n");
> >
> > +completed:
> >  	/*
> >  	 * We're crashing and already got the UNLOAD_RESPONSE, cleanup all
> >  	 * maybe-pending messages on all CPUs to be able to receive new
> 
> This is definitely an improvement,
> 
> Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> 
> --
> Vitaly


^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: ** POTENTIAL FRAUD ALERT - RED HAT ** [PATCH v2 1/1] Drivers: hv: vmbus: Increase wait time for VMbus unload
  2021-04-20 21:23   ` Michael Kelley
@ 2021-04-21  8:24     ` Vitaly Kuznetsov
  0 siblings, 0 replies; 5+ messages in thread
From: Vitaly Kuznetsov @ 2021-04-21  8:24 UTC (permalink / raw)
  To: Michael Kelley
  Cc: KY Srinivasan, Haiyang Zhang, Stephen Hemminger, wei.liu,
	linux-kernel, linux-hyperv, Dexuan Cui

Michael Kelley <mikelley@microsoft.com> writes:

> From: Vitaly Kuznetsov <vkuznets@redhat.com> Sent: Tuesday, April 20, 2021 2:32 AM
>> 
>> Michael Kelley <mikelley@microsoft.com> writes:
>> 
>> > When running in Azure, disks may be connected to a Linux VM with
>> > read/write caching enabled. If a VM panics and issues a VMbus
>> > UNLOAD request to Hyper-V, the response is delayed until all dirty
>> > data in the disk cache is flushed.  In extreme cases, this flushing
>> > can take 10's of seconds, depending on the disk speed and the amount
>> > of dirty data. If kdump is configured for the VM, the current 10 second
>> > timeout in vmbus_wait_for_unload() may be exceeded, and the UNLOAD
>> > complete message may arrive well after the kdump kernel is already
>> > running, causing problems.  Note that no problem occurs if kdump is
>> > not enabled because Hyper-V waits for the cache flush before doing
>> > a reboot through the BIOS/UEFI code.
>> >
>> > Fix this problem by increasing the timeout in vmbus_wait_for_unload()
>> > to 100 seconds. Also output periodic messages so that if anyone is
>> > watching the serial console, they won't think the VM is completely
>> > hung.
>> >
>> > Fixes: 911e1987efc8 ("Drivers: hv: vmbus: Add timeout to vmbus_wait_for_unload")
>> > Signed-off-by: Michael Kelley <mikelley@microsoft.com>
>> > ---
>> >
>> > Changed in v2: Fixed silly error in the argument to mdelay()
>> >
>> > ---
>> >  drivers/hv/channel_mgmt.c | 30 +++++++++++++++++++++++++-----
>> >  1 file changed, 25 insertions(+), 5 deletions(-)
>> >
>> > diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
>> > index f3cf4af..ef4685c 100644
>> > --- a/drivers/hv/channel_mgmt.c
>> > +++ b/drivers/hv/channel_mgmt.c
>> > @@ -755,6 +755,12 @@ static void init_vp_index(struct vmbus_channel *channel)
>> >  	free_cpumask_var(available_mask);
>> >  }
>> >
>> > +#define UNLOAD_DELAY_UNIT_MS	10		/* 10 milliseconds */
>> > +#define UNLOAD_WAIT_MS		(100*1000)	/* 100 seconds */
>> > +#define UNLOAD_WAIT_LOOPS	(UNLOAD_WAIT_MS/UNLOAD_DELAY_UNIT_MS)
>> > +#define UNLOAD_MSG_MS		(5*1000)	/* Every 5 seconds */
>> > +#define UNLOAD_MSG_LOOPS	(UNLOAD_MSG_MS/UNLOAD_DELAY_UNIT_MS)
>> > +
>> >  static void vmbus_wait_for_unload(void)
>> >  {
>> >  	int cpu;
>> > @@ -772,12 +778,17 @@ static void vmbus_wait_for_unload(void)
>> >  	 * vmbus_connection.unload_event. If not, the last thing we can do is
>> >  	 * read message pages for all CPUs directly.
>> >  	 *
>> > -	 * Wait no more than 10 seconds so that the panic path can't get
>> > -	 * hung forever in case the response message isn't seen.
>> > +	 * Wait up to 100 seconds since an Azure host must writeback any dirty
>> > +	 * data in its disk cache before the VMbus UNLOAD request will
>> > +	 * complete. This flushing has been empirically observed to take up
>> > +	 * to 50 seconds in cases with a lot of dirty data, so allow additional
>> > +	 * leeway and for inaccuracies in mdelay(). But eventually time out so
>> > +	 * that the panic path can't get hung forever in case the response
>> > +	 * message isn't seen.
>> 
>> I vaguely remember debugging cases when CHANNELMSG_UNLOAD_RESPONSE never
>> arrives, it was kind of pointless to proceed to kexec as attempts to
>> reconnect Vmbus devices were failing (no devices were offered after
>> CHANNELMSG_REQUESTOFFERS AFAIR). Would it maybe make sense to just do
>> emergency reboot instead of proceeding to kexec when this happens? Just
>> wondering.
>> 
>
> Yes, I think there have been (and maybe still are) situations where we don't
> ever get the UNLOAD response.  But there have been bugs fixed in Hyper-V
> that I think make that less likely.  There's also an unfixed (and maybe not fixable)
> problem when not operating in STIMER Direct Mode, where an old-style
> timer message can block the UNLOAD response message.  But as the world
> moves forward to later kernel versions that use STIMER Direct Mode, that
> also becomes less likely.   So my inclination is to let execution continue on
> the normal execution path, even if the UNLOAD response message isn't
> received.  Maybe we just didn't wait quite long enough (even at 100 seconds).
> It's a judgment call, and it's not clear to me that doing an emergency reboot
> is really any better.
>
> As background work for this patch, we also discovered another bug in Hyper-V.
> If the kdump kernel runs and does a VMbus INITIATE_CONTACT while the
> UNLOAD is still in progress, the Hyper-V code is supposed to wait for the UNLOAD
> to complete, and then commence the VMbus version negotiation.  But it
> doesn't do that -- it finally sends the UNLOAD response, but never does the
> version negotiation, so the kdump kernel hangs forever.  The Hyper-V team
> plans to fix this, and hopefully we'll get a patch deployed in Azure, which
> will eliminate one more scenario where the kdump kernel doesn't succeed.
>

Ah, ok, if bugs in Hyper-V/Azure are being fixed then it seems
reasonable to keep the current logic (proceeding to kexec even when we
didn't receive UNLOAD). Thanks for the additional info!

-- 
Vitaly


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, back to index

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-20  4:48 [PATCH v2 1/1] Drivers: hv: vmbus: Increase wait time for VMbus unload Michael Kelley
2021-04-20  9:31 ` ** POTENTIAL FRAUD ALERT - RED HAT ** " Vitaly Kuznetsov
2021-04-20 19:46   ` Wei Liu
2021-04-20 21:23   ` Michael Kelley
2021-04-21  8:24     ` Vitaly Kuznetsov

Linux-HyperV Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-hyperv/0 linux-hyperv/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-hyperv linux-hyperv/ https://lore.kernel.org/linux-hyperv \
		linux-hyperv@vger.kernel.org
	public-inbox-index linux-hyperv

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-hyperv


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git