[PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
@ 2016-10-19 13:53 Vitaly Kuznetsov
  2016-10-20  8:36 ` Stephen Hemminger
  2016-10-21 15:28 ` David Miller
  0 siblings, 2 replies; 9+ messages in thread
From: Vitaly Kuznetsov @ 2016-10-19 13:53 UTC (permalink / raw)
  To: netdev
  Cc: Stephen Hemminger, devel, linux-kernel, K. Y. Srinivasan, Haiyang Zhang

Fix in commit 880988348270 ("hv_netvsc: set nvdev link after populating
chn_table") turns out to be incomplete. A crash in
netvsc_get_next_send_section() is observed on mtu change when the device
is under load. The race I identified is: if we get to netvsc_send() after
we set net_device_ctx->nvdev link in netvsc_device_add() but before we
finish netvsc_connect_vsp()->netvsc_init_buf() send_section_map is not
allocated and we crash. Unfortunately we can't set net_device_ctx->nvdev
link after the netvsc_init_buf() call as during the negotiation we need
to receive packets and on the receive path we check for it. It would
probably be possible to split nvdev into a pair of nvdev_in and nvdev_out
links and check them accordingly in get_outbound_net_device()/
get_inbound_net_device() but this looks like an overkill.

Check that send_section_map is allocated in netvsc_send().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/net/hyperv/netvsc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c
index 720b5fa..e2bfaac 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -888,6 +888,13 @@ int netvsc_send(struct hv_device *device,
 	if (!net_device)
 		return -ENODEV;

+	/* We may race with netvsc_connect_vsp()/netvsc_init_buf() and get
+	 * here before the negotiation with the host is finished and
+	 * send_section_map may not be allocated yet.
+	 */
+	if (!net_device->send_section_map)
+		return -EAGAIN;
+
 	out_channel = net_device->chn_table[q_idx];

 	packet->send_buf_index = NETVSC_INVALID_INDEX;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* RE: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
  2016-10-19 13:53 [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf() Vitaly Kuznetsov
@ 2016-10-20  8:36 ` Stephen Hemminger
  2016-10-20  8:51   ` Vitaly Kuznetsov
  2016-10-21 15:28 ` David Miller
  1 sibling, 1 reply; 9+ messages in thread
From: Stephen Hemminger @ 2016-10-20  8:36 UTC (permalink / raw)
  To: Vitaly Kuznetsov, netdev
  Cc: devel, linux-kernel, KY Srinivasan, Haiyang Zhang

Do we need ACCESS_ONCE() here to avoid check/use issues?

-----Original Message-----
From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com] 
Sent: Wednesday, October 19, 2016 2:53 PM
To: netdev@vger.kernel.org
Cc: Stephen Hemminger <sthemmin@microsoft.com>; devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>
Subject: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()

Fix in commit 880988348270 ("hv_netvsc: set nvdev link after populating
chn_table") turns out to be incomplete. A crash in
netvsc_get_next_send_section() is observed on mtu change when the device is under load. The race I identified is: if we get to netvsc_send() after we set net_device_ctx->nvdev link in netvsc_device_add() but before we finish netvsc_connect_vsp()->netvsc_init_buf() send_section_map is not allocated and we crash. Unfortunately we can't set net_device_ctx->nvdev link after the netvsc_init_buf() call as during the negotiation we need to receive packets and on the receive path we check for it. It would probably be possible to split nvdev into a pair of nvdev_in and nvdev_out links and check them accordingly in get_outbound_net_device()/
get_inbound_net_device() but this looks like an overkill.

Check that send_section_map is allocated in netvsc_send().

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
 drivers/net/hyperv/netvsc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 720b5fa..e2bfaac 100644
--- a/drivers/net/hyperv/netvsc.c
+++ b/drivers/net/hyperv/netvsc.c
@@ -888,6 +888,13 @@ int netvsc_send(struct hv_device *device,
 	if (!net_device)
 		return -ENODEV;

+	/* We may race with netvsc_connect_vsp()/netvsc_init_buf() and get
+	 * here before the negotiation with the host is finished and
+	 * send_section_map may not be allocated yet.
+	 */
+	if (!net_device->send_section_map)
+		return -EAGAIN;
+
 	out_channel = net_device->chn_table[q_idx];

 	packet->send_buf_index = NETVSC_INVALID_INDEX;
--
2.7.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
  2016-10-20  8:36 ` Stephen Hemminger
@ 2016-10-20  8:51   ` Vitaly Kuznetsov
  2016-10-20 18:03     ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Vitaly Kuznetsov @ 2016-10-20  8:51 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: netdev, devel, linux-kernel, KY Srinivasan, Haiyang Zhang

Stephen Hemminger <sthemmin@microsoft.com> writes:

> Do we need ACCESS_ONCE() here to avoid check/use issues?
>

I think we don't: this is the only place in the function where we read
the variable so we'll get normal read. We're not trying to syncronize
with netvsc_init_buf() as that would require locking, if we read stale
NULL value after it was already updated on a different CPU we're fine,
we'll just return -EAGAIN.

> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:vkuznets@redhat.com] 
> Sent: Wednesday, October 19, 2016 2:53 PM
> To: netdev@vger.kernel.org
> Cc: Stephen Hemminger <sthemmin@microsoft.com>; devel@linuxdriverproject.org; linux-kernel@vger.kernel.org; KY Srinivasan <kys@microsoft.com>; Haiyang Zhang <haiyangz@microsoft.com>
> Subject: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
>
> Fix in commit 880988348270 ("hv_netvsc: set nvdev link after populating
> chn_table") turns out to be incomplete. A crash in
> netvsc_get_next_send_section() is observed on mtu change when the device is under load. The race I identified is: if we get to netvsc_send() after we set net_device_ctx->nvdev link in netvsc_device_add() but before we finish netvsc_connect_vsp()->netvsc_init_buf() send_section_map is not allocated and we crash. Unfortunately we can't set net_device_ctx->nvdev link after the netvsc_init_buf() call as during the negotiation we need to receive packets and on the receive path we check for it. It would probably be possible to split nvdev into a pair of nvdev_in and nvdev_out links and check them accordingly in get_outbound_net_device()/
> get_inbound_net_device() but this looks like an overkill.
>
> Check that send_section_map is allocated in netvsc_send().
>
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
> ---
>  drivers/net/hyperv/netvsc.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 720b5fa..e2bfaac 100644
> --- a/drivers/net/hyperv/netvsc.c
> +++ b/drivers/net/hyperv/netvsc.c
> @@ -888,6 +888,13 @@ int netvsc_send(struct hv_device *device,
>  	if (!net_device)
>  		return -ENODEV;
>
> +	/* We may race with netvsc_connect_vsp()/netvsc_init_buf() and get
> +	 * here before the negotiation with the host is finished and
> +	 * send_section_map may not be allocated yet.
> +	 */
> +	if (!net_device->send_section_map)
> +		return -EAGAIN;
> +
>  	out_channel = net_device->chn_table[q_idx];
>
>  	packet->send_buf_index = NETVSC_INVALID_INDEX;
> --
> 2.7.4

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
  2016-10-20  8:51   ` Vitaly Kuznetsov
@ 2016-10-20 18:03     ` David Miller
  2016-10-21 11:15       ` Vitaly Kuznetsov
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2016-10-20 18:03 UTC (permalink / raw)
  To: vkuznets; +Cc: sthemmin, netdev, devel, linux-kernel, kys, haiyangz

From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Thu, 20 Oct 2016 10:51:04 +0200

> Stephen Hemminger <sthemmin@microsoft.com> writes:
> 
>> Do we need ACCESS_ONCE() here to avoid check/use issues?
>>
> 
> I think we don't: this is the only place in the function where we read
> the variable so we'll get normal read. We're not trying to syncronize
> with netvsc_init_buf() as that would require locking, if we read stale
> NULL value after it was already updated on a different CPU we're fine,
> we'll just return -EAGAIN.

The concern is if we race with netvsc_destroy_buf() and this pointer
becomes NULL after the test you are adding.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
  2016-10-20 18:03     ` David Miller
@ 2016-10-21 11:15       ` Vitaly Kuznetsov
  2016-10-21 14:52         ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Vitaly Kuznetsov @ 2016-10-21 11:15 UTC (permalink / raw)
  To: David Miller; +Cc: sthemmin, netdev, devel, linux-kernel, kys, haiyangz

David Miller <davem@davemloft.net> writes:

> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> Date: Thu, 20 Oct 2016 10:51:04 +0200
>
>> Stephen Hemminger <sthemmin@microsoft.com> writes:
>> 
>>> Do we need ACCESS_ONCE() here to avoid check/use issues?
>>>
>> 
>> I think we don't: this is the only place in the function where we read
>> the variable so we'll get normal read. We're not trying to syncronize
>> with netvsc_init_buf() as that would require locking, if we read stale
>> NULL value after it was already updated on a different CPU we're fine,
>> we'll just return -EAGAIN.
>
> The concern is if we race with netvsc_destroy_buf() and this pointer
> becomes NULL after the test you are adding.

Thanks, this is interesting.

We may reach to netvsc_destroy_buf() by 3 pathes:

1) cleanup path in netvsc_init_buf(). It is never called with
send_section_map being not NULL so it seems we're safe.

2) from netvsc_remove() when the device is being removed. As far as I
understand we can't be on the transmit path after we call
unregister_netdev() so we're safe.

3) from netvsc_change_mtu() and netvsc_set_channels(). These pathes are
specific to netvsc driver as basically we need to remove the device and
add it back to change mtu/number of channels. The underligning 'struct
net_device' stays but the rest is being removed and added back. On both
pathes we first call netvsc_close() which does netif_tx_disable() and as
far as I understand (I may be wrong) this guarantees that we won't be in
netvsc_send().

So *I think* that we can't ever free send_section_map while in
netvsc_send() and we can't even get to netvsc_send() after it is freed
but I may have missed something.

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
  2016-10-21 11:15       ` Vitaly Kuznetsov
@ 2016-10-21 14:52         ` David Miller
  2016-10-21 15:17           ` Vitaly Kuznetsov
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2016-10-21 14:52 UTC (permalink / raw)
  To: vkuznets; +Cc: sthemmin, netdev, devel, linux-kernel, kys, haiyangz

From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Fri, 21 Oct 2016 13:15:53 +0200

> David Miller <davem@davemloft.net> writes:
> 
>> From: Vitaly Kuznetsov <vkuznets@redhat.com>
>> Date: Thu, 20 Oct 2016 10:51:04 +0200
>>
>>> Stephen Hemminger <sthemmin@microsoft.com> writes:
>>> 
>>>> Do we need ACCESS_ONCE() here to avoid check/use issues?
>>>>
>>> 
>>> I think we don't: this is the only place in the function where we read
>>> the variable so we'll get normal read. We're not trying to syncronize
>>> with netvsc_init_buf() as that would require locking, if we read stale
>>> NULL value after it was already updated on a different CPU we're fine,
>>> we'll just return -EAGAIN.
>>
>> The concern is if we race with netvsc_destroy_buf() and this pointer
>> becomes NULL after the test you are adding.
> 
> Thanks, this is interesting.
> 
> We may reach to netvsc_destroy_buf() by 3 pathes:
> 
> 1) cleanup path in netvsc_init_buf(). It is never called with
> send_section_map being not NULL so it seems we're safe.
> 
> 2) from netvsc_remove() when the device is being removed. As far as I
> understand we can't be on the transmit path after we call
> unregister_netdev() so we're safe.
> 
> 3) from netvsc_change_mtu() and netvsc_set_channels(). These pathes are
> specific to netvsc driver as basically we need to remove the device and
> add it back to change mtu/number of channels. The underligning 'struct
> net_device' stays but the rest is being removed and added back. On both
> pathes we first call netvsc_close() which does netif_tx_disable() and as
> far as I understand (I may be wrong) this guarantees that we won't be in
> netvsc_send().
> 
> So *I think* that we can't ever free send_section_map while in
> netvsc_send() and we can't even get to netvsc_send() after it is freed
> but I may have missed something.

Ok you may be right.

Can't the device be taken down by asynchronous events as well?  For example
if the peer end of the interface in the other guest disappears.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
  2016-10-21 14:52         ` David Miller
@ 2016-10-21 15:17           ` Vitaly Kuznetsov
  2016-10-21 15:27             ` David Miller
  0 siblings, 1 reply; 9+ messages in thread
From: Vitaly Kuznetsov @ 2016-10-21 15:17 UTC (permalink / raw)
  To: David Miller; +Cc: sthemmin, netdev, devel, linux-kernel, kys, haiyangz

David Miller <davem@davemloft.net> writes:

> From: Vitaly Kuznetsov <vkuznets@redhat.com>
> Date: Fri, 21 Oct 2016 13:15:53 +0200
>
>> David Miller <davem@davemloft.net> writes:
>> 
>>> From: Vitaly Kuznetsov <vkuznets@redhat.com>
>>> Date: Thu, 20 Oct 2016 10:51:04 +0200
>>>
>>>> Stephen Hemminger <sthemmin@microsoft.com> writes:
>>>> 
>>>>> Do we need ACCESS_ONCE() here to avoid check/use issues?
>>>>>
>>>> 
>>>> I think we don't: this is the only place in the function where we read
>>>> the variable so we'll get normal read. We're not trying to syncronize
>>>> with netvsc_init_buf() as that would require locking, if we read stale
>>>> NULL value after it was already updated on a different CPU we're fine,
>>>> we'll just return -EAGAIN.
>>>
>>> The concern is if we race with netvsc_destroy_buf() and this pointer
>>> becomes NULL after the test you are adding.
>> 
>> Thanks, this is interesting.
>> 
>> We may reach to netvsc_destroy_buf() by 3 pathes:
>> 
>> 1) cleanup path in netvsc_init_buf(). It is never called with
>> send_section_map being not NULL so it seems we're safe.
>> 
>> 2) from netvsc_remove() when the device is being removed. As far as I
>> understand we can't be on the transmit path after we call
>> unregister_netdev() so we're safe.
>> 
>> 3) from netvsc_change_mtu() and netvsc_set_channels(). These pathes are
>> specific to netvsc driver as basically we need to remove the device and
>> add it back to change mtu/number of channels. The underligning 'struct
>> net_device' stays but the rest is being removed and added back. On both
>> pathes we first call netvsc_close() which does netif_tx_disable() and as
>> far as I understand (I may be wrong) this guarantees that we won't be in
>> netvsc_send().
>> 
>> So *I think* that we can't ever free send_section_map while in
>> netvsc_send() and we can't even get to netvsc_send() after it is freed
>> but I may have missed something.
>
> Ok you may be right.
>
> Can't the device be taken down by asynchronous events as well?  For example
> if the peer end of the interface in the other guest disappears.

The device may be hot removed from host's side. In this case we follow
the following call chain:

... -> vmbus_device_unregister() -> ... -> vmbus_remove() -> netvsc_remove()

 and netvsc_remove() does netif_tx_disable(); unregister_netdev();
before calling rndis_filter_device_remove() leading to netvsc_destroy_buf().

So it seems we can't be in netvsc_send() when the device is
disappearing.

-- 
  Vitaly

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
  2016-10-21 15:17           ` Vitaly Kuznetsov
@ 2016-10-21 15:27             ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2016-10-21 15:27 UTC (permalink / raw)
  To: vkuznets; +Cc: sthemmin, netdev, devel, linux-kernel, kys, haiyangz

From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Fri, 21 Oct 2016 17:17:18 +0200

> David Miller <davem@davemloft.net> writes:
> 
>> From: Vitaly Kuznetsov <vkuznets@redhat.com>
>> Date: Fri, 21 Oct 2016 13:15:53 +0200
>>
>>> David Miller <davem@davemloft.net> writes:
>>> 
>>>> From: Vitaly Kuznetsov <vkuznets@redhat.com>
>>>> Date: Thu, 20 Oct 2016 10:51:04 +0200
>>>>
>>>>> Stephen Hemminger <sthemmin@microsoft.com> writes:
>>>>> 
>>>>>> Do we need ACCESS_ONCE() here to avoid check/use issues?
>>>>>>
>>>>> 
>>>>> I think we don't: this is the only place in the function where we read
>>>>> the variable so we'll get normal read. We're not trying to syncronize
>>>>> with netvsc_init_buf() as that would require locking, if we read stale
>>>>> NULL value after it was already updated on a different CPU we're fine,
>>>>> we'll just return -EAGAIN.
>>>>
>>>> The concern is if we race with netvsc_destroy_buf() and this pointer
>>>> becomes NULL after the test you are adding.
>>> 
>>> Thanks, this is interesting.
>>> 
>>> We may reach to netvsc_destroy_buf() by 3 pathes:
>>> 
>>> 1) cleanup path in netvsc_init_buf(). It is never called with
>>> send_section_map being not NULL so it seems we're safe.
>>> 
>>> 2) from netvsc_remove() when the device is being removed. As far as I
>>> understand we can't be on the transmit path after we call
>>> unregister_netdev() so we're safe.
>>> 
>>> 3) from netvsc_change_mtu() and netvsc_set_channels(). These pathes are
>>> specific to netvsc driver as basically we need to remove the device and
>>> add it back to change mtu/number of channels. The underligning 'struct
>>> net_device' stays but the rest is being removed and added back. On both
>>> pathes we first call netvsc_close() which does netif_tx_disable() and as
>>> far as I understand (I may be wrong) this guarantees that we won't be in
>>> netvsc_send().
>>> 
>>> So *I think* that we can't ever free send_section_map while in
>>> netvsc_send() and we can't even get to netvsc_send() after it is freed
>>> but I may have missed something.
>>
>> Ok you may be right.
>>
>> Can't the device be taken down by asynchronous events as well?  For example
>> if the peer end of the interface in the other guest disappears.
> 
> The device may be hot removed from host's side. In this case we follow
> the following call chain:
> 
> ... -> vmbus_device_unregister() -> ... -> vmbus_remove() -> netvsc_remove()
> 
>  and netvsc_remove() does netif_tx_disable(); unregister_netdev();
> before calling rndis_filter_device_remove() leading to netvsc_destroy_buf().
> 
> So it seems we can't be in netvsc_send() when the device is
> disappearing.

Ok, it all looks good then.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf()
  2016-10-19 13:53 [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf() Vitaly Kuznetsov
  2016-10-20  8:36 ` Stephen Hemminger
@ 2016-10-21 15:28 ` David Miller
  1 sibling, 0 replies; 9+ messages in thread
From: David Miller @ 2016-10-21 15:28 UTC (permalink / raw)
  To: vkuznets; +Cc: netdev, sthemmin, devel, linux-kernel, kys, haiyangz

From: Vitaly Kuznetsov <vkuznets@redhat.com>
Date: Wed, 19 Oct 2016 15:53:01 +0200

> Fix in commit 880988348270 ("hv_netvsc: set nvdev link after populating
> chn_table") turns out to be incomplete. A crash in
> netvsc_get_next_send_section() is observed on mtu change when the device
> is under load. The race I identified is: if we get to netvsc_send() after
> we set net_device_ctx->nvdev link in netvsc_device_add() but before we
> finish netvsc_connect_vsp()->netvsc_init_buf() send_section_map is not
> allocated and we crash. Unfortunately we can't set net_device_ctx->nvdev
> link after the netvsc_init_buf() call as during the negotiation we need
> to receive packets and on the receive path we check for it. It would
> probably be possible to split nvdev into a pair of nvdev_in and nvdev_out
> links and check them accordingly in get_outbound_net_device()/
> get_inbound_net_device() but this looks like an overkill.
> 
> Check that send_section_map is allocated in netvsc_send().
> 
> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>

Applied, thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2016-10-21 15:28 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-19 13:53 [PATCH net-next] hv_netvsc: fix a race between netvsc_send() and netvsc_init_buf() Vitaly Kuznetsov
2016-10-20  8:36 ` Stephen Hemminger
2016-10-20  8:51   ` Vitaly Kuznetsov
2016-10-20 18:03     ` David Miller
2016-10-21 11:15       ` Vitaly Kuznetsov
2016-10-21 14:52         ` David Miller
2016-10-21 15:17           ` Vitaly Kuznetsov
2016-10-21 15:27             ` David Miller
2016-10-21 15:28 ` David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).