* [PATCH net-next] vxlan: Fix vs->vni_list locking.
@ 2013-07-10 22:04 Pravin B Shelar
2013-07-10 22:58 ` Stephen Hemminger
0 siblings, 1 reply; 13+ messages in thread
From: Pravin B Shelar @ 2013-07-10 22:04 UTC (permalink / raw)
To: netdev; +Cc: stephen, Pravin B Shelar
Use rtnl lock to protect vs->vni_list updates.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
---
drivers/net/vxlan.c | 13 +++++++++----
1 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 227b54a..deca481 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -89,7 +89,7 @@ struct vxlan_sock {
struct work_struct del_work;
atomic_t refcnt;
struct socket *sock;
- struct hlist_head vni_list[VNI_HASH_SIZE];
+ struct hlist_head vni_list[VNI_HASH_SIZE]; /* Protected by RTNL. */
};
/* per-network namespace private data for this module */
@@ -122,7 +122,7 @@ struct vxlan_fdb {
/* Pseudo network device */
struct vxlan_dev {
- struct hlist_node hlist; /* vni hash table */
+ struct hlist_node hlist; /* vni hash table (vni_list) */
struct list_head next; /* vxlan's per namespace list */
struct vxlan_sock *vn_sock; /* listening socket */
struct net_device *dev;
@@ -1644,16 +1644,21 @@ static void vxlan_sock_work(struct work_struct *work)
if (ovs) {
atomic_inc(&ovs->refcnt);
vxlan->vn_sock = ovs;
- hlist_add_head_rcu(&vxlan->hlist, vni_head(ovs, vni));
spin_unlock(&vn->sock_lock);
+ rtnl_lock();
+ hlist_add_head_rcu(&vxlan->hlist, vni_head(ovs, vni));
+ rtnl_unlock();
+
sk_release_kernel(nvs->sock->sk);
kfree(nvs);
} else {
vxlan->vn_sock = nvs;
hlist_add_head_rcu(&nvs->hlist, vs_head(net, port));
- hlist_add_head_rcu(&vxlan->hlist, vni_head(nvs, vni));
spin_unlock(&vn->sock_lock);
+ rtnl_lock();
+ hlist_add_head_rcu(&vxlan->hlist, vni_head(nvs, vni));
+ rtnl_unlock();
}
out:
dev_put(dev);
--
1.7.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH net-next] vxlan: Fix vs->vni_list locking.
2013-07-10 22:04 [PATCH net-next] vxlan: Fix vs->vni_list locking Pravin B Shelar
@ 2013-07-10 22:58 ` Stephen Hemminger
2013-07-10 23:08 ` Pravin Shelar
0 siblings, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2013-07-10 22:58 UTC (permalink / raw)
To: Pravin B Shelar; +Cc: netdev
On Wed, 10 Jul 2013 15:04:44 -0700
Pravin B Shelar <pshelar@nicira.com> wrote:
> Use rtnl lock to protect vs->vni_list updates.
>
> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
> ---
I don't think this is necessary. I intentionally changed the
locking when socket management was moved to a work queue.
The vxlan_net socket lock is held there already, and finer grain.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next] vxlan: Fix vs->vni_list locking.
2013-07-10 22:58 ` Stephen Hemminger
@ 2013-07-10 23:08 ` Pravin Shelar
2013-07-10 23:25 ` Stephen Hemminger
2013-07-13 17:18 ` [PATCH net] vxlan: add necessary locking on device removal Stephen Hemminger
0 siblings, 2 replies; 13+ messages in thread
From: Pravin Shelar @ 2013-07-10 23:08 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
On Wed, Jul 10, 2013 at 3:58 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Wed, 10 Jul 2013 15:04:44 -0700
> Pravin B Shelar <pshelar@nicira.com> wrote:
>
>> Use rtnl lock to protect vs->vni_list updates.
>>
>> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
>> ---
>
> I don't think this is necessary. I intentionally changed the
> locking when socket management was moved to a work queue.
> The vxlan_net socket lock is held there already, and finer grain.
what abt vxlan_dellink()?
it is deleting vxlan-dev from hash table without lock.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next] vxlan: Fix vs->vni_list locking.
2013-07-10 23:08 ` Pravin Shelar
@ 2013-07-10 23:25 ` Stephen Hemminger
2013-07-11 2:55 ` David Miller
2013-07-13 17:18 ` [PATCH net] vxlan: add necessary locking on device removal Stephen Hemminger
1 sibling, 1 reply; 13+ messages in thread
From: Stephen Hemminger @ 2013-07-10 23:25 UTC (permalink / raw)
To: Pravin Shelar; +Cc: netdev
On Wed, 10 Jul 2013 16:08:38 -0700
Pravin Shelar <pshelar@nicira.com> wrote:
> On Wed, Jul 10, 2013 at 3:58 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> > On Wed, 10 Jul 2013 15:04:44 -0700
> > Pravin B Shelar <pshelar@nicira.com> wrote:
> >
> >> Use rtnl lock to protect vs->vni_list updates.
> >>
> >> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
> >> ---
> >
> > I don't think this is necessary. I intentionally changed the
> > locking when socket management was moved to a work queue.
> > The vxlan_net socket lock is held there already, and finer grain.
>
> what abt vxlan_dellink()?
> it is deleting vxlan-dev from hash table without lock.
Ok, then this should fix it.
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -1768,8 +1768,12 @@ static int vxlan_newlink(struct net *net
static void vxlan_dellink(struct net_device *dev, struct list_head *head)
{
struct vxlan_dev *vxlan = netdev_priv(dev);
+ struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
+ spin_lock(&vn->sock_lock);
hlist_del_rcu(&vxlan->hlist);
+ spin_unlock(&vn->sock_lock);
+
list_del(&vxlan->next);
unregister_netdevice_queue(dev, head);
}
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net-next] vxlan: Fix vs->vni_list locking.
2013-07-10 23:25 ` Stephen Hemminger
@ 2013-07-11 2:55 ` David Miller
0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2013-07-11 2:55 UTC (permalink / raw)
To: stephen; +Cc: pshelar, netdev
From: Stephen Hemminger <stephen@networkplumber.org>
Date: Wed, 10 Jul 2013 16:25:09 -0700
> On Wed, 10 Jul 2013 16:08:38 -0700
> Pravin Shelar <pshelar@nicira.com> wrote:
>
>> On Wed, Jul 10, 2013 at 3:58 PM, Stephen Hemminger
>> <stephen@networkplumber.org> wrote:
>> > On Wed, 10 Jul 2013 15:04:44 -0700
>> > Pravin B Shelar <pshelar@nicira.com> wrote:
>> >
>> >> Use rtnl lock to protect vs->vni_list updates.
>> >>
>> >> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
>> >> ---
>> >
>> > I don't think this is necessary. I intentionally changed the
>> > locking when socket management was moved to a work queue.
>> > The vxlan_net socket lock is held there already, and finer grain.
>>
>> what abt vxlan_dellink()?
>> it is deleting vxlan-dev from hash table without lock.
>
> Ok, then this should fix it.
Looks good, one minor thing:
> {
> struct vxlan_dev *vxlan = netdev_priv(dev);
> + struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
Please declare 'vn' on the first line rather than the second.
A formal submission with that fixup and a signoff, and I'll apply
this.
Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH net] vxlan: add necessary locking on device removal
2013-07-10 23:08 ` Pravin Shelar
2013-07-10 23:25 ` Stephen Hemminger
@ 2013-07-13 17:18 ` Stephen Hemminger
2013-07-13 19:21 ` Pravin Shelar
2013-07-17 19:51 ` David Miller
1 sibling, 2 replies; 13+ messages in thread
From: Stephen Hemminger @ 2013-07-13 17:18 UTC (permalink / raw)
To: Pravin Shelar; +Cc: netdev
The socket management is now done in workqueue (outside of RTNL)
and protected by vn->sock_lock. There were two possible bugs, first
the vxlan device was removed from the VNI hash table per socket without
holding lock. And there was a race when device is created and the workqueue
could run after deletion.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
--- a/drivers/net/vxlan.c 2013-07-08 16:31:50.080744429 -0700
+++ b/drivers/net/vxlan.c 2013-07-10 20:15:47.337653899 -0700
@@ -1767,9 +1767,15 @@ static int vxlan_newlink(struct net *net
static void vxlan_dellink(struct net_device *dev, struct list_head *head)
{
+ struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
struct vxlan_dev *vxlan = netdev_priv(dev);
+ flush_workqueue(vxlan_wq);
+
+ spin_lock(&vn->sock_lock);
hlist_del_rcu(&vxlan->hlist);
+ spin_unlock(&vn->sock_lock);
+
list_del(&vxlan->next);
unregister_netdevice_queue(dev, head);
}
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] vxlan: add necessary locking on device removal
2013-07-13 17:18 ` [PATCH net] vxlan: add necessary locking on device removal Stephen Hemminger
@ 2013-07-13 19:21 ` Pravin Shelar
2013-07-16 18:28 ` David Miller
2013-07-17 19:51 ` David Miller
1 sibling, 1 reply; 13+ messages in thread
From: Pravin Shelar @ 2013-07-13 19:21 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev
On Sat, Jul 13, 2013 at 10:18 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> The socket management is now done in workqueue (outside of RTNL)
> and protected by vn->sock_lock. There were two possible bugs, first
> the vxlan device was removed from the VNI hash table per socket without
> holding lock. And there was a race when device is created and the workqueue
> could run after deletion.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>
> --- a/drivers/net/vxlan.c 2013-07-08 16:31:50.080744429 -0700
> +++ b/drivers/net/vxlan.c 2013-07-10 20:15:47.337653899 -0700
> @@ -1767,9 +1767,15 @@ static int vxlan_newlink(struct net *net
>
> static void vxlan_dellink(struct net_device *dev, struct list_head *head)
> {
> + struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
> struct vxlan_dev *vxlan = netdev_priv(dev);
>
> + flush_workqueue(vxlan_wq);
> +
Doesn't this create dependency on sock_work thread while holding RTNL.
If so it can result in deadlock.
> + spin_lock(&vn->sock_lock);
> hlist_del_rcu(&vxlan->hlist);
> + spin_unlock(&vn->sock_lock);
> +
> list_del(&vxlan->next);
> unregister_netdevice_queue(dev, head);
> }
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] vxlan: add necessary locking on device removal
2013-07-13 19:21 ` Pravin Shelar
@ 2013-07-16 18:28 ` David Miller
2013-07-16 20:29 ` Pravin Shelar
0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2013-07-16 18:28 UTC (permalink / raw)
To: pshelar; +Cc: stephen, netdev
From: Pravin Shelar <pshelar@nicira.com>
Date: Sat, 13 Jul 2013 12:21:52 -0700
> On Sat, Jul 13, 2013 at 10:18 AM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>> The socket management is now done in workqueue (outside of RTNL)
>> and protected by vn->sock_lock. There were two possible bugs, first
>> the vxlan device was removed from the VNI hash table per socket without
>> holding lock. And there was a race when device is created and the workqueue
>> could run after deletion.
>>
>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>>
>> --- a/drivers/net/vxlan.c 2013-07-08 16:31:50.080744429 -0700
>> +++ b/drivers/net/vxlan.c 2013-07-10 20:15:47.337653899 -0700
>> @@ -1767,9 +1767,15 @@ static int vxlan_newlink(struct net *net
>>
>> static void vxlan_dellink(struct net_device *dev, struct list_head *head)
>> {
>> + struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
>> struct vxlan_dev *vxlan = netdev_priv(dev);
>>
>> + flush_workqueue(vxlan_wq);
>> +
> Doesn't this create dependency on sock_work thread while holding RTNL.
> If so it can result in deadlock.
What exact deadlock do you perceive? I don't see any code path in the
sock_work handler (vxlan_sock_work) which takes the RTNL mutex.
So we should be able to safely flush any pending sock_work jobs from
vxlan_dellink(). The fact that vxlan_dellink() runs with the RTNL
mutex shouldn't cause any issues.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] vxlan: add necessary locking on device removal
2013-07-16 18:28 ` David Miller
@ 2013-07-16 20:29 ` Pravin Shelar
2013-07-17 6:06 ` David Miller
0 siblings, 1 reply; 13+ messages in thread
From: Pravin Shelar @ 2013-07-16 20:29 UTC (permalink / raw)
To: David Miller; +Cc: stephen, netdev
On Tue, Jul 16, 2013 at 11:28 AM, David Miller <davem@davemloft.net> wrote:
> From: Pravin Shelar <pshelar@nicira.com>
> Date: Sat, 13 Jul 2013 12:21:52 -0700
>
>> On Sat, Jul 13, 2013 at 10:18 AM, Stephen Hemminger
>> <stephen@networkplumber.org> wrote:
>>> The socket management is now done in workqueue (outside of RTNL)
>>> and protected by vn->sock_lock. There were two possible bugs, first
>>> the vxlan device was removed from the VNI hash table per socket without
>>> holding lock. And there was a race when device is created and the workqueue
>>> could run after deletion.
>>>
>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>>>
>>> --- a/drivers/net/vxlan.c 2013-07-08 16:31:50.080744429 -0700
>>> +++ b/drivers/net/vxlan.c 2013-07-10 20:15:47.337653899 -0700
>>> @@ -1767,9 +1767,15 @@ static int vxlan_newlink(struct net *net
>>>
>>> static void vxlan_dellink(struct net_device *dev, struct list_head *head)
>>> {
>>> + struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
>>> struct vxlan_dev *vxlan = netdev_priv(dev);
>>>
>>> + flush_workqueue(vxlan_wq);
>>> +
>> Doesn't this create dependency on sock_work thread while holding RTNL.
>> If so it can result in deadlock.
>
> What exact deadlock do you perceive? I don't see any code path in the
> sock_work handler (vxlan_sock_work) which takes the RTNL mutex.
>
> So we should be able to safely flush any pending sock_work jobs from
> vxlan_dellink(). The fact that vxlan_dellink() runs with the RTNL
> mutex shouldn't cause any issues.
commit 1c51a9159ddefa51 (vxlan: fix race caused by dropping
rtnl_unlock) moved sock-create to sock_work workq. Thats why I thought
sock_work takes RTNL. but if it is not taking RTNL in any case, they
why vxlan sock create is deferred to sock_work?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] vxlan: add necessary locking on device removal
2013-07-16 20:29 ` Pravin Shelar
@ 2013-07-17 6:06 ` David Miller
2013-07-17 15:41 ` Pravin Shelar
0 siblings, 1 reply; 13+ messages in thread
From: David Miller @ 2013-07-17 6:06 UTC (permalink / raw)
To: pshelar; +Cc: stephen, netdev
From: Pravin Shelar <pshelar@nicira.com>
Date: Tue, 16 Jul 2013 13:29:08 -0700
> commit 1c51a9159ddefa51 (vxlan: fix race caused by dropping
> rtnl_unlock) moved sock-create to sock_work workq. Thats why I thought
> sock_work takes RTNL. but if it is not taking RTNL in any case, they
> why vxlan sock create is deferred to sock_work?
That commit is handling the fact that the RTNL mutex is NOT held
during the create operation in question.
It defers to a workqueue so that the new ->sock_lock can be taken
in the appropriate context.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] vxlan: add necessary locking on device removal
2013-07-17 6:06 ` David Miller
@ 2013-07-17 15:41 ` Pravin Shelar
2013-07-17 19:09 ` David Miller
0 siblings, 1 reply; 13+ messages in thread
From: Pravin Shelar @ 2013-07-17 15:41 UTC (permalink / raw)
To: David Miller; +Cc: stephen, netdev
On Tue, Jul 16, 2013 at 11:06 PM, David Miller <davem@davemloft.net> wrote:
> From: Pravin Shelar <pshelar@nicira.com>
> Date: Tue, 16 Jul 2013 13:29:08 -0700
>
>> commit 1c51a9159ddefa51 (vxlan: fix race caused by dropping
>> rtnl_unlock) moved sock-create to sock_work workq. Thats why I thought
>> sock_work takes RTNL. but if it is not taking RTNL in any case, they
>> why vxlan sock create is deferred to sock_work?
>
> That commit is handling the fact that the RTNL mutex is NOT held
> during the create operation in question.
>
> It defers to a workqueue so that the new ->sock_lock can be taken
> in the appropriate context.
ok, Thanks for explanation. I do not see any problem with the patch.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] vxlan: add necessary locking on device removal
2013-07-17 15:41 ` Pravin Shelar
@ 2013-07-17 19:09 ` David Miller
0 siblings, 0 replies; 13+ messages in thread
From: David Miller @ 2013-07-17 19:09 UTC (permalink / raw)
To: pshelar; +Cc: stephen, netdev
From: Pravin Shelar <pshelar@nicira.com>
Date: Wed, 17 Jul 2013 08:41:33 -0700
> On Tue, Jul 16, 2013 at 11:06 PM, David Miller <davem@davemloft.net> wrote:
>> From: Pravin Shelar <pshelar@nicira.com>
>> Date: Tue, 16 Jul 2013 13:29:08 -0700
>>
>>> commit 1c51a9159ddefa51 (vxlan: fix race caused by dropping
>>> rtnl_unlock) moved sock-create to sock_work workq. Thats why I thought
>>> sock_work takes RTNL. but if it is not taking RTNL in any case, they
>>> why vxlan sock create is deferred to sock_work?
>>
>> That commit is handling the fact that the RTNL mutex is NOT held
>> during the create operation in question.
>>
>> It defers to a workqueue so that the new ->sock_lock can be taken
>> in the appropriate context.
>
> ok, Thanks for explanation. I do not see any problem with the patch.
Great, I'm going to apply Stephen's patch then.
Thanks.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH net] vxlan: add necessary locking on device removal
2013-07-13 17:18 ` [PATCH net] vxlan: add necessary locking on device removal Stephen Hemminger
2013-07-13 19:21 ` Pravin Shelar
@ 2013-07-17 19:51 ` David Miller
1 sibling, 0 replies; 13+ messages in thread
From: David Miller @ 2013-07-17 19:51 UTC (permalink / raw)
To: stephen; +Cc: pshelar, netdev
From: Stephen Hemminger <stephen@networkplumber.org>
Date: Sat, 13 Jul 2013 10:18:18 -0700
> The socket management is now done in workqueue (outside of RTNL)
> and protected by vn->sock_lock. There were two possible bugs, first
> the vxlan device was removed from the VNI hash table per socket without
> holding lock. And there was a race when device is created and the workqueue
> could run after deletion.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Applied.
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2013-07-17 19:51 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-07-10 22:04 [PATCH net-next] vxlan: Fix vs->vni_list locking Pravin B Shelar
2013-07-10 22:58 ` Stephen Hemminger
2013-07-10 23:08 ` Pravin Shelar
2013-07-10 23:25 ` Stephen Hemminger
2013-07-11 2:55 ` David Miller
2013-07-13 17:18 ` [PATCH net] vxlan: add necessary locking on device removal Stephen Hemminger
2013-07-13 19:21 ` Pravin Shelar
2013-07-16 18:28 ` David Miller
2013-07-16 20:29 ` Pravin Shelar
2013-07-17 6:06 ` David Miller
2013-07-17 15:41 ` Pravin Shelar
2013-07-17 19:09 ` David Miller
2013-07-17 19:51 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).