All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next 0/2] vxlan updates
@ 2014-01-13 17:41 Daniel Borkmann
  2014-01-13 17:41 ` [PATCH net-next 1/2] net: vxlan: when lower dev unregisters remove vxlan dev as well Daniel Borkmann
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Daniel Borkmann @ 2014-01-13 17:41 UTC (permalink / raw)
  To: davem; +Cc: netdev

Did the split into two patches upon request from Cong Wang.

Changelog:

 v1->v2:
  - Removed BUG_ON as it's not needed.
 v2->v3:
  - Removed dev->reg_state check for netns.
 v3->v4:
  - Removed list_del(), we seem to do it in some places and
    in some others not; we agreed it's not really necessary.
  - Split patch into 2 patches, notifier part and module
    unload cleanup part.

Daniel Borkmann (2):
  net: vxlan: when lower dev unregisters remove vxlan dev as well
  net: vxlan: properly cleanup devs on module unload

 drivers/net/vxlan.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 51 insertions(+), 7 deletions(-)

-- 
1.7.11.7

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH net-next 1/2] net: vxlan: when lower dev unregisters remove vxlan dev as well
  2014-01-13 17:41 [PATCH net-next 0/2] vxlan updates Daniel Borkmann
@ 2014-01-13 17:41 ` Daniel Borkmann
  2014-01-14  2:22   ` Stephen Hemminger
  2014-01-13 17:41 ` [PATCH net-next 2/2] net: vxlan: properly cleanup devs on module unload Daniel Borkmann
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 7+ messages in thread
From: Daniel Borkmann @ 2014-01-13 17:41 UTC (permalink / raw)
  To: davem; +Cc: netdev

We can create a vxlan device with an explicit underlying carrier.
In that case, when the carrier link is being deleted from the
system (e.g. due to module unload) we should also clean up all
created vxlan devices on top of it since otherwise we're in an
inconsistent state in vxlan device. In that case, the user needs
to remove all such devices, while in case of other virtual devs
that sit on top of physical ones, it is usually the case that
these devices do unregister automatically as well and do not
leave the burden on the user.

This work is not necessary when vxlan device was not created with
a real underlying device, as connections can resume in that case
when driver is plugged again. But at least for the other cases,
we should go ahead and do the cleanup on removal.

We don't register the notifier during vxlan_newlink() here since
I consider this event rather rare, and therefore we should not
bloat vxlan's core structure unecessary. Also, we can simply make
use of unregister_netdevice_many() to batch that. fdb is flushed
upon ndo_stop().

E.g. `ip -d link show vxlan13` after carrier removal before
this patch:

5: vxlan13: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default
    link/ether 1e:47:da:6d:4d:99 brd ff:ff:ff:ff:ff:ff promiscuity 0
    vxlan id 13 group 239.0.0.10 dev 2 port 32768 61000 ageing 300
                                 ^^^^^
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
---
 drivers/net/vxlan.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 46 insertions(+), 2 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 481f85d..81c553f 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2656,6 +2656,44 @@ static struct rtnl_link_ops vxlan_link_ops __read_mostly = {
 	.fill_info	= vxlan_fill_info,
 };
 
+static void vxlan_handle_lowerdev_unregister(struct vxlan_net *vn,
+					     struct net_device *dev)
+{
+	struct vxlan_dev *vxlan, *next;
+	LIST_HEAD(list_kill);
+
+	list_for_each_entry_safe(vxlan, next, &vn->vxlan_list, next) {
+		struct vxlan_rdst *dst = &vxlan->default_dst;
+
+		/* In case we created vxlan device with carrier
+		 * and we loose the carrier due to module unload
+		 * we also need to remove vxlan device. In other
+		 * cases, it's not necessary and remote_ifindex
+		 * is 0 here, so no matches.
+		 */
+		if (dst->remote_ifindex == dev->ifindex)
+			vxlan_dellink(vxlan->dev, &list_kill);
+	}
+
+	unregister_netdevice_many(&list_kill);
+}
+
+static int vxlan_lowerdev_event(struct notifier_block *unused,
+				unsigned long event, void *ptr)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+	struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
+
+	if (event == NETDEV_UNREGISTER)
+		vxlan_handle_lowerdev_unregister(vn, dev);
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block vxlan_notifier_block __read_mostly = {
+	.notifier_call = vxlan_lowerdev_event,
+};
+
 static __net_init int vxlan_init_net(struct net *net)
 {
 	struct vxlan_net *vn = net_generic(net, vxlan_net_id);
@@ -2704,12 +2742,17 @@ static int __init vxlan_init_module(void)
 	if (rc)
 		goto out1;
 
-	rc = rtnl_link_register(&vxlan_link_ops);
+	rc = register_netdevice_notifier(&vxlan_notifier_block);
 	if (rc)
 		goto out2;
 
-	return 0;
+	rc = rtnl_link_register(&vxlan_link_ops);
+	if (rc)
+		goto out3;
 
+	return 0;
+out3:
+	unregister_netdevice_notifier(&vxlan_notifier_block);
 out2:
 	unregister_pernet_device(&vxlan_net_ops);
 out1:
@@ -2721,6 +2764,7 @@ late_initcall(vxlan_init_module);
 static void __exit vxlan_cleanup_module(void)
 {
 	rtnl_link_unregister(&vxlan_link_ops);
+	unregister_netdevice_notifier(&vxlan_notifier_block);
 	destroy_workqueue(vxlan_wq);
 	unregister_pernet_device(&vxlan_net_ops);
 	rcu_barrier();
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH net-next 2/2] net: vxlan: properly cleanup devs on module unload
  2014-01-13 17:41 [PATCH net-next 0/2] vxlan updates Daniel Borkmann
  2014-01-13 17:41 ` [PATCH net-next 1/2] net: vxlan: when lower dev unregisters remove vxlan dev as well Daniel Borkmann
@ 2014-01-13 17:41 ` Daniel Borkmann
  2014-01-13 21:59 ` [PATCH net-next 0/2] vxlan updates Cong Wang
  2014-01-15  7:39 ` David Miller
  3 siblings, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2014-01-13 17:41 UTC (permalink / raw)
  To: davem; +Cc: netdev

We should use vxlan_dellink() handler in vxlan_exit_net(), since
i) we're not in fast-path and we should be consistent in dismantle
just as we would remove a device through rtnl ops, and more
importantly, ii) in case future code will kfree() memory in
vxlan_dellink(), we would leak it right here unnoticed. Therefore,
do not only half of the cleanup work, but make it properly.

Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
---
 drivers/net/vxlan.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 81c553f..d2acb6b 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2711,13 +2711,13 @@ static __net_init int vxlan_init_net(struct net *net)
 static __net_exit void vxlan_exit_net(struct net *net)
 {
 	struct vxlan_net *vn = net_generic(net, vxlan_net_id);
-	struct vxlan_dev *vxlan;
-	LIST_HEAD(list);
+	struct vxlan_dev *vxlan, *next;
+	LIST_HEAD(list_kill);
 
 	rtnl_lock();
-	list_for_each_entry(vxlan, &vn->vxlan_list, next)
-		unregister_netdevice_queue(vxlan->dev, &list);
-	unregister_netdevice_many(&list);
+	list_for_each_entry_safe(vxlan, next, &vn->vxlan_list, next)
+		vxlan_dellink(vxlan->dev, &list_kill);
+	unregister_netdevice_many(&list_kill);
 	rtnl_unlock();
 }
 
-- 
1.7.11.7

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next 0/2] vxlan updates
  2014-01-13 17:41 [PATCH net-next 0/2] vxlan updates Daniel Borkmann
  2014-01-13 17:41 ` [PATCH net-next 1/2] net: vxlan: when lower dev unregisters remove vxlan dev as well Daniel Borkmann
  2014-01-13 17:41 ` [PATCH net-next 2/2] net: vxlan: properly cleanup devs on module unload Daniel Borkmann
@ 2014-01-13 21:59 ` Cong Wang
  2014-01-15  7:39 ` David Miller
  3 siblings, 0 replies; 7+ messages in thread
From: Cong Wang @ 2014-01-13 21:59 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: David Miller, netdev

On Mon, Jan 13, 2014 at 9:41 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
> Did the split into two patches upon request from Cong Wang.
>
> Changelog:
>
>  v1->v2:
>   - Removed BUG_ON as it's not needed.
>  v2->v3:
>   - Removed dev->reg_state check for netns.
>  v3->v4:
>   - Removed list_del(), we seem to do it in some places and
>     in some others not; we agreed it's not really necessary.
>   - Split patch into 2 patches, notifier part and module
>     unload cleanup part.


Looks good to my eyes, so

Reviewed-by: Cong Wang <cwang@twopensource.com>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next 1/2] net: vxlan: when lower dev unregisters remove vxlan dev as well
  2014-01-13 17:41 ` [PATCH net-next 1/2] net: vxlan: when lower dev unregisters remove vxlan dev as well Daniel Borkmann
@ 2014-01-14  2:22   ` Stephen Hemminger
  2014-01-14 14:02     ` Daniel Borkmann
  0 siblings, 1 reply; 7+ messages in thread
From: Stephen Hemminger @ 2014-01-14  2:22 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: davem, netdev

On Mon, 13 Jan 2014 18:41:19 +0100
Daniel Borkmann <dborkman@redhat.com> wrote:

> We can create a vxlan device with an explicit underlying carrier.
> In that case, when the carrier link is being deleted from the
> system (e.g. due to module unload) we should also clean up all
> created vxlan devices on top of it since otherwise we're in an
> inconsistent state in vxlan device. In that case, the user needs
> to remove all such devices, while in case of other virtual devs
> that sit on top of physical ones, it is usually the case that
> these devices do unregister automatically as well and do not
> leave the burden on the user.
> 
> This work is not necessary when vxlan device was not created with
> a real underlying device, as connections can resume in that case
> when driver is plugged again. But at least for the other cases,
> we should go ahead and do the cleanup on removal.
> 
> We don't register the notifier during vxlan_newlink() here since
> I consider this event rather rare, and therefore we should not
> bloat vxlan's core structure unecessary. Also, we can simply make
> use of unregister_netdevice_many() to batch that. fdb is flushed
> upon ndo_stop().
> 
> E.g. `ip -d link show vxlan13` after carrier removal before
> this patch:
> 
> 5: vxlan13: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default
>     link/ether 1e:47:da:6d:4d:99 brd ff:ff:ff:ff:ff:ff promiscuity 0
>     vxlan id 13 group 239.0.0.10 dev 2 port 32768 61000 ageing 300
>                                  ^^^^^
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>

Since vxlan is running over UDP socket. I wonder if this could be
done better by implementing something equivalent to SO_BINDTODEVICE.

What happens to a user land application which has a UDP socket
and has done SO_BINDTODEVICE and device is removed? Is there an asynchronous
error, can the application recover? Why can't vxlan use the same mechanism?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next 1/2] net: vxlan: when lower dev unregisters remove vxlan dev as well
  2014-01-14  2:22   ` Stephen Hemminger
@ 2014-01-14 14:02     ` Daniel Borkmann
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Borkmann @ 2014-01-14 14:02 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: davem, netdev

On 01/14/2014 03:22 AM, Stephen Hemminger wrote:
> On Mon, 13 Jan 2014 18:41:19 +0100
> Daniel Borkmann <dborkman@redhat.com> wrote:
>
>> We can create a vxlan device with an explicit underlying carrier.
>> In that case, when the carrier link is being deleted from the
>> system (e.g. due to module unload) we should also clean up all
>> created vxlan devices on top of it since otherwise we're in an
>> inconsistent state in vxlan device. In that case, the user needs
>> to remove all such devices, while in case of other virtual devs
>> that sit on top of physical ones, it is usually the case that
>> these devices do unregister automatically as well and do not
>> leave the burden on the user.
>>
>> This work is not necessary when vxlan device was not created with
>> a real underlying device, as connections can resume in that case
>> when driver is plugged again. But at least for the other cases,
>> we should go ahead and do the cleanup on removal.
>>
>> We don't register the notifier during vxlan_newlink() here since
>> I consider this event rather rare, and therefore we should not
>> bloat vxlan's core structure unecessary. Also, we can simply make
>> use of unregister_netdevice_many() to batch that. fdb is flushed
>> upon ndo_stop().
>>
>> E.g. `ip -d link show vxlan13` after carrier removal before
>> this patch:
>>
>> 5: vxlan13: <BROADCAST,MULTICAST> mtu 1450 qdisc noop state DOWN mode DEFAULT group default
>>      link/ether 1e:47:da:6d:4d:99 brd ff:ff:ff:ff:ff:ff promiscuity 0
>>      vxlan id 13 group 239.0.0.10 dev 2 port 32768 61000 ageing 300
>>                                   ^^^^^
>> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
>
> Since vxlan is running over UDP socket. I wonder if this could be
> done better by implementing something equivalent to SO_BINDTODEVICE.
>
> What happens to a user land application which has a UDP socket
> and has done SO_BINDTODEVICE and device is removed? Is there an asynchronous
> error, can the application recover? Why can't vxlan use the same mechanism?

Interesting point. What seems to happen with UDP sockets and SO_BINDTODEVICE
in case the device was present during the setsockopt(2), and module was
unloaded at time of sendto(2)/recvfrom(2), that at least senders give an
error of ENODEV in such cases while receivers seem not to notice as far as
I can tell. When we reload the module and device appears again with the
same name, then sendto(2), still fails with ENODEV, since we, of course,
work with indexes, that is, sk->sk_bound_dev_if. The only chance user space
would have is to try to redo the setsockopt(2) with SO_BINDTODEVICE on the
same device _name_, but different new index now, in the hope that this would
be the same underlying NIC. It seems, however not recommended to do so [1].
Anyway, so I'm not sure how useful this would be, I guess just doing what
this patch here does should be appropriate to do.

   [1] http://ftp.riken.go.jp/Linux/kernel/v2.0/patch-html/patch-2.0.31/linux_Documentation_networking_so_bindtodevice.txt.html

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net-next 0/2] vxlan updates
  2014-01-13 17:41 [PATCH net-next 0/2] vxlan updates Daniel Borkmann
                   ` (2 preceding siblings ...)
  2014-01-13 21:59 ` [PATCH net-next 0/2] vxlan updates Cong Wang
@ 2014-01-15  7:39 ` David Miller
  3 siblings, 0 replies; 7+ messages in thread
From: David Miller @ 2014-01-15  7:39 UTC (permalink / raw)
  To: dborkman; +Cc: netdev

From: Daniel Borkmann <dborkman@redhat.com>
Date: Mon, 13 Jan 2014 18:41:18 +0100

> Did the split into two patches upon request from Cong Wang.
> 
> Changelog:
> 
>  v1->v2:
>   - Removed BUG_ON as it's not needed.
>  v2->v3:
>   - Removed dev->reg_state check for netns.
>  v3->v4:
>   - Removed list_del(), we seem to do it in some places and
>     in some others not; we agreed it's not really necessary.
>   - Split patch into 2 patches, notifier part and module
>     unload cleanup part.

Series applied, thanks Daniel.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-01-15  7:39 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-13 17:41 [PATCH net-next 0/2] vxlan updates Daniel Borkmann
2014-01-13 17:41 ` [PATCH net-next 1/2] net: vxlan: when lower dev unregisters remove vxlan dev as well Daniel Borkmann
2014-01-14  2:22   ` Stephen Hemminger
2014-01-14 14:02     ` Daniel Borkmann
2014-01-13 17:41 ` [PATCH net-next 2/2] net: vxlan: properly cleanup devs on module unload Daniel Borkmann
2014-01-13 21:59 ` [PATCH net-next 0/2] vxlan updates Cong Wang
2014-01-15  7:39 ` David Miller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.