* [PATCH] [linux-4.4.y only] HV: properly delay KVP packets when negotiation is in progress
@ 2018-10-12  2:52 Dexuan Cui
  2018-10-16 13:54 ` 'gregkh@linuxfoundation.org'
  0 siblings, 1 reply; 4+ messages in thread
From: Dexuan Cui @ 2018-10-12  2:52 UTC (permalink / raw)
  To: 'gregkh@linuxfoundation.org',
	'stable@vger.kernel.org',
	Wang Jian, Long Li
  Cc: KY Srinivasan, Stephen Hemminger, Haiyang Zhang, Josh Poulson,
	Michael Kelley (EOSG)


The host may send multiple negotiation packets
(due to timeout) before the KVP user-mode daemon
is connected. KVP user-mode daemon is connected.
We need to defer processing those packets
until the daemon is negotiated and connected.
It's okay for guest to respond
to all negotiation packets.

In addition, the host may send multiple staged
KVP requests as soon as negotiation is done.
We need to properly process those packets using one
tasklet for exclusive access to ring buffer.

This patch is based on the work of
Nick Meier <Nick.Meier@microsoft.com>.

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

The above is the original changelog of
a3ade8cc474d ("HV: properly delay KVP packets when negotiation is in progress").

Here I re-worked the original patch because the mainline version
can't work for the linux-4.4.y branch, on which channel->callback_event
doesn't exist yet. In the mainline, channel->callback_event was added by:
631e63a9f346 ("vmbus: change to per channel tasklet"). Here we don't want
to backport it to v4.4, as it requires extra supporting changes and fixes,
which are unnecessary as to the KVP bug we're trying to resolve.

NOTE: before this patch is used, we should cherry-pick the other related
3 patches from the mainline first:

2d0c3b5 ("Drivers: hv: utils: Invoke the poll function after handshake")
b9830d1 ("Drivers: hv: util: Pass the channel information during the init call")
4dbfc2e ("Drivers: hv: kvp: fix IP Failover")

And, actually, it would be better if we could cherry-pick more fixes from the
mainline first (the 3 patches above are also included in this 27-patch list):

01 b003596 Drivers: hv: utils: use memdup_user in hvt_op_write
02 2d0c3b5 Drivers: hv: utils: Invoke the poll function after handshake
03 1f75338 Drivers: hv: utils: fix memory leak on on_msg() failure
04 a72f3a4 Drivers: hv: utils: rename outmsg_lock
05 a150256 Drivers: hv: utils: introduce HVUTIL_TRANSPORT_DESTROY mode
06 9420098 Drivers: hv: utils: fix crash when device is removed from host side
07 77b744a Drivers: hv: utils: fix hvt_op_poll() return value on transport destroy
08 b9830d1 Drivers: hv: util: Pass the channel information during the init call
09 e66853b Drivers: hv: utils: Remove util transport handler from list if registration fails
10 4dbfc2e Drivers: hv: kvp: fix IP Failover
11 e0fa3e5 Drivers: hv: utils: fix a race on userspace daemons registration
12 497af84 Drivers: hv: utils: Continue to poll VSS channel after handling requests.
13 db886e4 Drivers: hv: utils: Check VSS daemon is listening before a hot backup
14 abeda47 Drivers: hv: utils: Rename version definitions to reflect protocol version.
15 2e338f7 Drivers: hv: utils: Use TimeSync samples to adjust the clock after boot.
16 8e1d260 Drivers: hv: utils: Support TimeSync version 4.0 protocol samples.
17 3ba1eb1 Drivers: hv: hv_util: Avoid dynamic allocation in time synch
18 3da0401b Drivers: hv: utils: Fix the mapping between host version and protocol to use
19 23d2cc0 Drivers: hv: vss: Improve log messages.
20 b357fd3 Drivers: hv: vss: Operation timeouts should match host expectation
21 1724462 hv_util: switch to using timespec64
22 a165645 Drivers: hv: vmbus: Use all supported IC versions to negotiate
23 1274a69 Drivers: hv: Log the negotiated IC versions.
24 bb6a4db Drivers: hv: util: Fix a typo
25 e9c18ae Drivers: hv: util: move waiting for release to hv_utils_transport itself
26 bdc1dd4 vmbus: fix spelling errors
27 ddce54b Drivers: hv: kvp: Use MAX_ADAPTER_ID_SIZE for translating adapter id

This is to say, we're requesting a backport of 4 patches or 28 patches.
If 28 patches seem too many, we hope at least the 4 patches can be backported.

The patches can be applied cleanly to the latest v4.4 branch (currently it's
v4.4.160).

The background of this backport request: Wang Jian recently reported
some KVP issues (https://github.com/LIS/lis-next/issues/593),
e.g. the /var/lib/hyperv/.kvp_pool_* files can not be updated, and sometimes,
if the hv_kvp_daemon doesn't start in time, the host may not be able to query
the VM's IP address via KVP.

Wang Jian tested both the 4 patches and the 28 patches; either set fixes
the issues.

Reported-by: Wang Jian <jianjian.wang1@gmail.com>
Tested-by: Wang Jian <jianjian.wang1@gmail.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
 drivers/hv/hv_kvp.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c
index f3d3d75ac913e..e4fbc17bbe190 100644
--- a/drivers/hv/hv_kvp.c
+++ b/drivers/hv/hv_kvp.c
@@ -627,21 +627,22 @@ void hv_kvp_onchannelcallback(void *context)
 		     NEGO_IN_PROGRESS,
 		     NEGO_FINISHED} host_negotiatied = NEGO_NOT_STARTED;
 
-	if (host_negotiatied == NEGO_NOT_STARTED &&
-	    kvp_transaction.state < HVUTIL_READY) {
+	if (kvp_transaction.state < HVUTIL_READY) {
 		/*
 		 * If userspace daemon is not connected and host is asking
 		 * us to negotiate we need to delay to not lose messages.
 		 * This is important for Failover IP setting.
 		 */
-		host_negotiatied = NEGO_IN_PROGRESS;
-		schedule_delayed_work(&kvp_host_handshake_work,
+		if (host_negotiatied == NEGO_NOT_STARTED) {
+			host_negotiatied = NEGO_IN_PROGRESS;
+			schedule_delayed_work(&kvp_host_handshake_work,
 				      HV_UTIL_NEGO_TIMEOUT * HZ);
+		}
 		return;
 	}
 	if (kvp_transaction.state > HVUTIL_READY)
 		return;
-
+recheck:
 	vmbus_recvpacket(channel, recv_buffer, PAGE_SIZE * 4, &recvlen,
 			 &requestid);
 
@@ -704,6 +705,8 @@ void hv_kvp_onchannelcallback(void *context)
 				       VM_PKT_DATA_INBAND, 0);
 
 		host_negotiatied = NEGO_FINISHED;
+
+		goto recheck;
 	}
 
 }
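
For illustration only (this is not part of the patch): the control flow
that hv_kvp_onchannelcallback() has after the diff above can be sketched
as a small user-space simulation. All sim_* names, the SIM_* states and
the array standing in for the ring buffer are hypothetical; only the
NEGO_* names come from the driver. The for (;;) loop plays the role of
the "goto recheck".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the ordering mirrors "state < HVUTIL_READY". */
enum sim_state { SIM_DEVICE_INIT, SIM_READY, SIM_HOSTMSG_RECEIVED };
enum sim_pkt   { PKT_NEGOTIATE, PKT_KVP_REQUEST };
enum sim_nego  { NEGO_NOT_STARTED, NEGO_IN_PROGRESS, NEGO_FINISHED };

struct sim {
	enum sim_state state;      /* models kvp_transaction.state */
	enum sim_nego nego;        /* models host_negotiatied */
	const enum sim_pkt *ring;  /* packets queued by the host */
	size_t ring_len;
	size_t ring_pos;           /* index of the next unread packet */
	int handshake_scheduled;   /* times the delayed work was scheduled */
	int nego_replies;          /* negotiation responses sent */
	int requests_started;      /* KVP requests handed to the daemon */
};

/* Simulated channel callback, following the patched control flow. */
static void sim_callback(struct sim *s)
{
	assert(s != NULL);

	if (s->state < SIM_READY) {
		/* Daemon not connected yet: arm the handshake work only
		 * once and leave every packet in the ring. */
		if (s->nego == NEGO_NOT_STARTED) {
			s->nego = NEGO_IN_PROGRESS;
			s->handshake_scheduled++;
		}
		return;
	}
	if (s->state > SIM_READY)
		return; /* a transaction is already in flight */

	for (;;) { /* "recheck:" */
		enum sim_pkt pkt;

		if (s->ring_pos == s->ring_len)
			return; /* nothing received */
		pkt = s->ring[s->ring_pos++];

		if (pkt == PKT_KVP_REQUEST) {
			/* Hand the request over and stop reading. */
			s->state = SIM_HOSTMSG_RECEIVED;
			s->requests_started++;
			return;
		}

		/* Negotiation packet: reply, then read the next packet
		 * instead of returning ("goto recheck"). */
		s->nego_replies++;
		s->nego = NEGO_FINISHED;
	}
}
```

In this model, calling sim_callback() repeatedly while the state is
SIM_DEVICE_INIT schedules the handshake once and consumes nothing. Once
the state is SIM_READY, a single call answers all queued negotiation
packets and stops at the first KVP request.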


* Re: [PATCH] [linux-4.4.y only] HV: properly delay KVP packets when negotiation is in progress
  2018-10-12  2:52 [PATCH] [linux-4.4.y only] HV: properly delay KVP packets when negotiation is in progress Dexuan Cui
@ 2018-10-16 13:54 ` 'gregkh@linuxfoundation.org'
  2018-10-16 22:39   ` Dexuan Cui
  0 siblings, 1 reply; 4+ messages in thread
From: 'gregkh@linuxfoundation.org' @ 2018-10-16 13:54 UTC (permalink / raw)
  To: Dexuan Cui
  Cc: 'stable@vger.kernel.org',
	Wang Jian, Long Li, KY Srinivasan, Stephen Hemminger,
	Haiyang Zhang, Josh Poulson, Michael Kelley (EOSG)

On Fri, Oct 12, 2018 at 02:52:46AM +0000, Dexuan Cui wrote:
> 
> The host may send multiple negotiation packets
> (due to timeout) before the KVP user-mode daemon
> is connected. KVP user-mode daemon is connected.
> We need to defer processing those packets
> until the daemon is negotiated and connected.
> It's okay for guest to respond
> to all negotiation packets.
> 
> In addition, the host may send multiple staged
> KVP requests as soon as negotiation is done.
> We need to properly process those packets using one
> tasklet for exclusive access to ring buffer.
> 
> This patch is based on the work of
> Nick Meier <Nick.Meier@microsoft.com>.
> 
> Signed-off-by: Long Li <longli@microsoft.com>
> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> The above is the original changelog of
> a3ade8cc474d ("HV: properly delay KVP packets when negotiation is in progress").
> 
> Here I re-worked the original patch because the mainline version
> can't work for the linux-4.4.y branch, on which channel->callback_event
> doesn't exist yet. In the mainline, channel->callback_event was added by:
> 631e63a9f346 ("vmbus: change to per channel tasklet"). Here we don't want
> to backport it to v4.4, as it requires extra supporting changes and fixes,
> which are unnecessary as to the KVP bug we're trying to resolve.
> 
> NOTE: before this patch is used, we should cherry-pick the other related
> 3 patches from the mainline first:
> 
> 2d0c3b5 ("Drivers: hv: utils: Invoke the poll function after handshake")
> b9830d1 ("Drivers: hv: util: Pass the channel information during the init call")
> 4dbfc2e ("Drivers: hv: kvp: fix IP Failover")
> 
> And, actually, it would be better if we could cherry-pick more fixes from the
> mainline first (the 3 patches above are also included in this 27-patch list):
> 
> 01 b003596 Drivers: hv: utils: use memdup_user in hvt_op_write
> 02 2d0c3b5 Drivers: hv: utils: Invoke the poll function after handshake
> 03 1f75338 Drivers: hv: utils: fix memory leak on on_msg() failure
> 04 a72f3a4 Drivers: hv: utils: rename outmsg_lock
> 05 a150256 Drivers: hv: utils: introduce HVUTIL_TRANSPORT_DESTROY mode
> 06 9420098 Drivers: hv: utils: fix crash when device is removed from host side
> 07 77b744a Drivers: hv: utils: fix hvt_op_poll() return value on transport destroy
> 08 b9830d1 Drivers: hv: util: Pass the channel information during the init call
> 09 e66853b Drivers: hv: utils: Remove util transport handler from list if registration fails
> 10 4dbfc2e Drivers: hv: kvp: fix IP Failover
> 11 e0fa3e5 Drivers: hv: utils: fix a race on userspace daemons registration
> 12 497af84 Drivers: hv: utils: Continue to poll VSS channel after handling requests.
> 13 db886e4 Drivers: hv: utils: Check VSS daemon is listening before a hot backup
> 14 abeda47 Drivers: hv: utils: Rename version definitions to reflect protocol version.
> 15 2e338f7 Drivers: hv: utils: Use TimeSync samples to adjust the clock after boot.
> 16 8e1d260 Drivers: hv: utils: Support TimeSync version 4.0 protocol samples.
> 17 3ba1eb1 Drivers: hv: hv_util: Avoid dynamic allocation in time synch
> 18 3da0401b Drivers: hv: utils: Fix the mapping between host version and protocol to use
> 19 23d2cc0 Drivers: hv: vss: Improve log messages.
> 20 b357fd3 Drivers: hv: vss: Operation timeouts should match host expectation
> 21 1724462 hv_util: switch to using timespec64
> 22 a165645 Drivers: hv: vmbus: Use all supported IC versions to negotiate
> 23 1274a69 Drivers: hv: Log the negotiated IC versions.
> 24 bb6a4db Drivers: hv: util: Fix a typo
> 25 e9c18ae Drivers: hv: util: move waiting for release to hv_utils_transport itself
> 26 bdc1dd4 vmbus: fix spelling errors
> 27 ddce54b Drivers: hv: kvp: Use MAX_ADAPTER_ID_SIZE for translating adapter id
> 
> This is to say, we're requesting a backport of 4 patches or 28 patches.
> If 28 patches seem too many, we hope at least the 4 patches can be backported.

28 seems odd, there's lots of things in there that you do not need.

So 4 is good, can you send all 4 as a patch series, properly backported
and tested with this patch as the last one?

> The patches can be applied cleanly to the latest v4.4 branch (currently it's
> v4.4.160).

But, I really want to know why people are still trying to use the 4.4
kernel right now for a "general purpose" system.  They should be using
4.9 at the very least by now, 4.4 is not a good idea at all.  Why can
you not just move your users to 4.9 instead of a newer 4.4 kernel?  It
should be the exact same, right?

thanks,

greg k-h


* RE: [PATCH] [linux-4.4.y only] HV: properly delay KVP packets when negotiation is in progress
  2018-10-16 13:54 ` 'gregkh@linuxfoundation.org'
@ 2018-10-16 22:39   ` Dexuan Cui
  2018-10-17  0:34     ` Wang Jian
  0 siblings, 1 reply; 4+ messages in thread
From: Dexuan Cui @ 2018-10-16 22:39 UTC (permalink / raw)
  To: 'gregkh@linuxfoundation.org'
  Cc: 'stable@vger.kernel.org',
	Wang Jian, Long Li, KY Srinivasan, Stephen Hemminger,
	Haiyang Zhang, Josh Poulson, Michael Kelley (EOSG)

> From: 'gregkh@linuxfoundation.org' <gregkh@linuxfoundation.org>
> Sent: Tuesday, October 16, 2018 06:55
> > ...
> > This is to say, we're requesting a backport of 4 patches or 28 patches.
> > If 28 patches seem too many, we hope at least the 4 patches can be
> backported.
> 
> 28 seems odd, there's lots of things in there that you do not need.
Yes, some of the 28 patches are completely unnecessary for a "stable" kernel,
but some are fixes for other known issues. Backporting only the minimal
set of patches doesn't work because of merge conflicts, so I generated
the 28-patch list, which can be applied cleanly in order.

> So 4 is good, can you send all 4 as a patch series, properly backported
> and tested with this patch as the last one?
I'm OK with only backporting the 4 patches for this particular issue
reported by Wang Jian. Maybe we can backport more fixes in future
if people report new KVP issues against the 4.4 kernel.

So I'm going to send all the 4 patches as a patch series. Wang Jian
has tested them.

> But, I really want to know why people are still trying to use the 4.4
> kernel right now for a "general purpose" system.  They should be using
> 4.9 at the very least by now, 4.4 is not a good idea at all.  Why can
> you not just move your users to 4.9 instead of a newer 4.4 kernel?  It
> should be the exact same, right?
> greg k-h

We definitely encourage users to use new kernels like 4.9 and 4.1x, but it
looks like some users have to use their customized 4.4 kernels for
reasons I don't know (believe it or not, besides Wang Jian, I have made
the same private backport twice for two companies since July).

And Ubuntu 16.04.5 LTS (http://releases.ubuntu.com/16.04/), which is
based on v4.4, also has the same KVP bug:
http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/Makefile?h=Ubuntu-4.4.0-137.163
And I did receive a bug report from an Ubuntu user last week.

Ubuntu 16.04 will reach End-of-Life in April 2021, which is still 2.5 years
from now. So I hope that after the 4 patches are merged into the upstream
4.4.y branch, the Ubuntu guys will notice them and pick them up.

Thanks,
-- Dexuan


* Re: [PATCH] [linux-4.4.y only] HV: properly delay KVP packets when negotiation is in progress
  2018-10-16 22:39   ` Dexuan Cui
@ 2018-10-17  0:34     ` Wang Jian
  0 siblings, 0 replies; 4+ messages in thread
From: Wang Jian @ 2018-10-17  0:34 UTC (permalink / raw)
  To: decui
  Cc: gregkh, stable, Long Li, kys, sthemmin, Haiyang Zhang, jopoulso,
	Michael.H.Kelley

> But, I really want to know why people are still trying to use the 4.4
> kernel right now for a "general purpose" system.  They should be using
> 4.9 at the very least by now, 4.4 is not a good idea at all.  Why can
> you not just move your users to 4.9 instead of a newer 4.4 kernel?  It
> should be the exact same, right?
> greg k-h

Sorry about this.
You may not believe this, but we are only now upgrading to the 4.4 kernel
from 3.2. There is nothing I can do about this....
But certainly, we are not a completely "general purpose" Linux.

On Wed, Oct 17, 2018 at 6:40 AM Dexuan Cui <decui@microsoft.com> wrote:
>
> > From: 'gregkh@linuxfoundation.org' <gregkh@linuxfoundation.org>
> > Sent: Tuesday, October 16, 2018 06:55
> > > ...
> > > This is to say, we're requesting a backport of 4 patches or 28 patches.
> > > If 28 patches seem too many, we hope at least the 4 patches can be
> > backported.
> >
> > 28 seems odd, there's lots of things in there that you do not need.
> Yes, some of the 28 patches are completely unnecessary for a "stable" kernel,
> but some are fixes for other known issues. Backporting only the minimal
> set of patches doesn't work because of merge conflicts, so I generated
> the 28-patch list, which can be applied cleanly in order.
>
> > So 4 is good, can you send all 4 as a patch series, properly backported
> > and tested with this patch as the last one?
> I'm OK with only backporting the 4 patches for this particular issue
> reported by Wang Jian. Maybe we can backport more fixes in future
> if people report new KVP issues against the 4.4 kernel.
>
> So I'm going to send all the 4 patches as a patch series. Wang Jian
> has tested them.
>
> > But, I really want to know why people are still trying to use the 4.4
> > kernel right now for a "general purpose" system.  They should be using
> > 4.9 at the very least by now, 4.4 is not a good idea at all.  Why can
> > you not just move your users to 4.9 instead of a newer 4.4 kernel?  It
> > should be the exact same, right?
> > greg k-h
>
> We definitely encourage users to use new kernels like 4.9 and 4.1x, but it
> looks like some users have to use their customized 4.4 kernels for
> reasons I don't know (believe it or not, besides Wang Jian, I have made
> the same private backport twice for two companies since July).
>
> And Ubuntu 16.04.5 LTS (http://releases.ubuntu.com/16.04/), which is
> based on v4.4, also has the same KVP bug:
> http://kernel.ubuntu.com/git/ubuntu/ubuntu-xenial.git/tree/Makefile?h=Ubuntu-4.4.0-137.163
> And I did receive a bug report from an Ubuntu user last week.
>
> Ubuntu 16.04 will reach End-of-Life in April 2021, which is still 2.5 years
> from now. So I hope that after the 4 patches are merged into the upstream
> 4.4.y branch, the Ubuntu guys will notice them and pick them up.
>
> Thanks,
> -- Dexuan
>


-- 
Regards,
Wang Jian

