All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] vxlan: do not destroy fdb if register_netdevice() is failed
@ 2019-06-20 11:51 Taehee Yoo
  2019-06-23 18:07 ` David Miller
  0 siblings, 1 reply; 7+ messages in thread
From: Taehee Yoo @ 2019-06-20 11:51 UTC (permalink / raw)
  To: davem, netdev; +Cc: ap420073

__vxlan_dev_create() destroys FDB using specific pointer which indicates
a fdb when error occurs.
But that pointer should not be used when register_netdevice() fails because
register_netdevice() internally destroys fdb when error occurs.

In order to avoid un-registered dev's notification, fdb destroying routine
checks dev's register status before notification.

Test command
    ip link add bonding_masters type vxlan id 0 group 239.1.1.1 \
	    dev enp0s9 dstport 4789

Splat looks like:
[  130.396714] kasan: CONFIG_KASAN_INLINE enabled
[  130.397649] kasan: GPF could be caused by NULL-ptr deref or user memory access
[  130.398939] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  130.399829] CPU: 0 PID: 991 Comm: ip Not tainted 5.2.0-rc3+ #41
[  130.401581] RIP: 0010:vxlan_fdb_destroy+0x120/0x220 [vxlan]
[  130.402280] Code: df 48 8b 2b 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 06 01 00 00 4c 8b 63 08 48 b8 00 00 00 00d
[  130.404578] RSP: 0018:ffff8880cfac7080 EFLAGS: 00010a02
[  130.405235] RAX: dffffc0000000000 RBX: ffff8880d0613348 RCX: 0000000000000000
[  130.406172] RDX: 1bd5a00000000040 RSI: ffff8880d0613348 RDI: ffff8880d0613350
[  130.407056] RBP: 0000000000000000 R08: fffffbfff4378005 R09: 0000000000000000
[  130.408011] R10: 00000000ffffffef R11: 0000000000000000 R12: dead000000000200
[  130.408921] R13: ffff8880cfac71d8 R14: ffff8880b5d8cda0 R15: ffff8880b5d8cda0
[  130.409811] FS:  00007f9ef157e0c0(0000) GS:ffff8880da400000(0000) knlGS:0000000000000000
[  130.410805] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  130.411515] CR2: 0000560fe8118d54 CR3: 00000000bc684006 CR4: 00000000000606f0
[  130.412385] Call Trace:
[  130.412708]  __vxlan_dev_create+0x3a9/0x7d0 [vxlan]
[  130.413314]  ? vxlan_changelink+0x780/0x780 [vxlan]
[  130.413919]  ? rcu_read_unlock+0x60/0x60 [vxlan]
[  130.414497]  ? __kasan_kmalloc.constprop.3+0xa0/0xd0
[  130.415112]  vxlan_newlink+0x99/0xf0 [vxlan]
[  130.415640]  ? __vxlan_dev_create+0x7d0/0x7d0 [vxlan]
[  130.416270]  ? __netlink_ns_capable+0xc3/0xf0
[  130.416806]  __rtnl_newlink+0xb9f/0x11b0
[ ... ]

Fixes: 0241b836732f ("vxlan: fix default fdb entry netlink notify ordering during netdev create")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
---
 drivers/net/vxlan.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 4c9bc29fe3d5..0bc07e3232c4 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -861,7 +861,7 @@ static void vxlan_fdb_destroy(struct vxlan_dev *vxlan, struct vxlan_fdb *f,
 	netdev_dbg(vxlan->dev, "delete %pM\n", f->eth_addr);
 
 	--vxlan->addrcnt;
-	if (do_notify)
+	if (do_notify && vxlan->dev->reg_state >= NETREG_REGISTERED)
 		list_for_each_entry(rd, &f->remotes, list)
 			vxlan_fdb_notify(vxlan, f, rd, RTM_DELNEIGH,
 					 swdev_notify, NULL);
@@ -3542,7 +3542,6 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
 	struct vxlan_net *vn = net_generic(net, vxlan_net_id);
 	struct vxlan_dev *vxlan = netdev_priv(dev);
 	struct vxlan_fdb *f = NULL;
-	bool unregister = false;
 	int err;
 
 	err = vxlan_dev_configure(net, dev, conf, false, extack);
@@ -3567,8 +3566,7 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
 
 	err = register_netdevice(dev);
 	if (err)
-		goto errout;
-	unregister = true;
+		return err;
 
 	err = rtnl_configure_link(dev, NULL);
 	if (err)
@@ -3592,8 +3590,7 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev,
 	 */
 	if (f)
 		vxlan_fdb_destroy(vxlan, f, false, false);
-	if (unregister)
-		unregister_netdevice(dev);
+	unregister_netdevice(dev);
 	return err;
 }
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH net] vxlan: do not destroy fdb if register_netdevice() is failed
  2019-06-20 11:51 [PATCH net] vxlan: do not destroy fdb if register_netdevice() is failed Taehee Yoo
@ 2019-06-23 18:07 ` David Miller
  2019-06-24  2:18   ` Taehee Yoo
  0 siblings, 1 reply; 7+ messages in thread
From: David Miller @ 2019-06-23 18:07 UTC (permalink / raw)
  To: ap420073; +Cc: netdev

From: Taehee Yoo <ap420073@gmail.com>
Date: Thu, 20 Jun 2019 20:51:08 +0900

> __vxlan_dev_create() destroys FDB using specific pointer which indicates
> a fdb when error occurs.
> But that pointer should not be used when register_netdevice() fails because
> register_netdevice() internally destroys fdb when error occurs.
> 
> In order to avoid un-registered dev's notification, fdb destroying routine
> checks dev's register status before notification.

Simply pass do_notify as false in this failure code path of __vxlan_dev_create(),
thank you.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] vxlan: do not destroy fdb if register_netdevice() is failed
  2019-06-23 18:07 ` David Miller
@ 2019-06-24  2:18   ` Taehee Yoo
  2019-06-25  4:12     ` Roopa Prabhu
  0 siblings, 1 reply; 7+ messages in thread
From: Taehee Yoo @ 2019-06-24  2:18 UTC (permalink / raw)
  To: David Miller; +Cc: Netdev

On Mon, 24 Jun 2019 at 03:07, David Miller <davem@davemloft.net> wrote:
>

Hi David,

Thank you for the review!

> From: Taehee Yoo <ap420073@gmail.com>
> Date: Thu, 20 Jun 2019 20:51:08 +0900
>
> > __vxlan_dev_create() destroys FDB using specific pointer which indicates
> > a fdb when error occurs.
> > But that pointer should not be used when register_netdevice() fails because
> > register_netdevice() internally destroys fdb when error occurs.
> >
> > In order to avoid un-registered dev's notification, fdb destroying routine
> > checks dev's register status before notification.
>
> Simply pass do_notify as false in this failure code path of __vxlan_dev_create(),
> thank you.

Failure path of __vxlan_dev_create() can't handle do_notify in that case
because if register_netdevice() fails it internally calls
->ndo_uninit() which is
vxlan_uninit().
vxlan_uninit() internally calls vxlan_fdb_delete_default() and it callls
vxlan_fdb_destroy().
do_notify of vxlan_fdb_destroy() in vxlan_fdb_delete_default() is always true.
So, failure path of __vxlan_dev_create() doesn't have any opportunity to
handle do_notify.

Thank you!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] vxlan: do not destroy fdb if register_netdevice() is failed
  2019-06-24  2:18   ` Taehee Yoo
@ 2019-06-25  4:12     ` Roopa Prabhu
  2019-06-25 16:08       ` Taehee Yoo
  0 siblings, 1 reply; 7+ messages in thread
From: Roopa Prabhu @ 2019-06-25  4:12 UTC (permalink / raw)
  To: Taehee Yoo; +Cc: David Miller, Netdev

On Sun, Jun 23, 2019 at 7:18 PM Taehee Yoo <ap420073@gmail.com> wrote:
>
> On Mon, 24 Jun 2019 at 03:07, David Miller <davem@davemloft.net> wrote:
> >
>
> Hi David,
>
> Thank you for the review!
>
> > From: Taehee Yoo <ap420073@gmail.com>
> > Date: Thu, 20 Jun 2019 20:51:08 +0900
> >
> > > __vxlan_dev_create() destroys FDB using specific pointer which indicates
> > > a fdb when error occurs.
> > > But that pointer should not be used when register_netdevice() fails because
> > > register_netdevice() internally destroys fdb when error occurs.
> > >
> > > In order to avoid un-registered dev's notification, fdb destroying routine
> > > checks dev's register status before notification.
> >
> > Simply pass do_notify as false in this failure code path of __vxlan_dev_create(),
> > thank you.
>
> Failure path of __vxlan_dev_create() can't handle do_notify in that case
> because if register_netdevice() fails it internally calls
> ->ndo_uninit() which is
> vxlan_uninit().
> vxlan_uninit() internally calls vxlan_fdb_delete_default() and it callls
> vxlan_fdb_destroy().
> do_notify of vxlan_fdb_destroy() in vxlan_fdb_delete_default() is always true.
> So, failure path of __vxlan_dev_create() doesn't have any opportunity to
> handle do_notify.


I don't see register_netdevice calling ndo_uninit in case of all
errors. In the case where it does not,
does your patch leak the fdb entry ?.

Wondering if we should just use vxlan_fdb_delete_default with a notify
flag to delete the entry if exists.
Will that help ?

There is another commit that touched this code path:
commit 6db9246871394b3a136cd52001a0763676563840

Author: Petr Machata <petrm@mellanox.com>
Date:   Tue Dec 18 13:16:00 2018 +0000
    vxlan: Fix error path in __vxlan_dev_create()

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] vxlan: do not destroy fdb if register_netdevice() is failed
  2019-06-25  4:12     ` Roopa Prabhu
@ 2019-06-25 16:08       ` Taehee Yoo
  2019-06-26  5:03         ` Roopa Prabhu
  0 siblings, 1 reply; 7+ messages in thread
From: Taehee Yoo @ 2019-06-25 16:08 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: David Miller, Netdev

On Tue, 25 Jun 2019 at 13:12, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>

Hi Roopa,

Thank you for the review!

> On Sun, Jun 23, 2019 at 7:18 PM Taehee Yoo <ap420073@gmail.com> wrote:
> >
> > On Mon, 24 Jun 2019 at 03:07, David Miller <davem@davemloft.net> wrote:
> > >
> >
> > Hi David,
> >
> > Thank you for the review!
> >
> > > From: Taehee Yoo <ap420073@gmail.com>
> > > Date: Thu, 20 Jun 2019 20:51:08 +0900
> > >
> > > > __vxlan_dev_create() destroys FDB using specific pointer which indicates
> > > > a fdb when error occurs.
> > > > But that pointer should not be used when register_netdevice() fails because
> > > > register_netdevice() internally destroys fdb when error occurs.
> > > >
> > > > In order to avoid un-registered dev's notification, fdb destroying routine
> > > > checks dev's register status before notification.
> > >
> > > Simply pass do_notify as false in this failure code path of __vxlan_dev_create(),
> > > thank you.
> >
> > Failure path of __vxlan_dev_create() can't handle do_notify in that case
> > because if register_netdevice() fails it internally calls
> > ->ndo_uninit() which is
> > vxlan_uninit().
> > vxlan_uninit() internally calls vxlan_fdb_delete_default() and it callls
> > vxlan_fdb_destroy().
> > do_notify of vxlan_fdb_destroy() in vxlan_fdb_delete_default() is always true.
> > So, failure path of __vxlan_dev_create() doesn't have any opportunity to
> > handle do_notify.
>
>
> I don't see register_netdevice calling ndo_uninit in case of all
> errors. In the case where it does not,
> does your patch leak the fdb entry ?.
>
> Wondering if we should just use vxlan_fdb_delete_default with a notify
> flag to delete the entry if exists.
> Will that help ?
>
> There is another commit that touched this code path:
> commit 6db9246871394b3a136cd52001a0763676563840
>
> Author: Petr Machata <petrm@mellanox.com>
> Date:   Tue Dec 18 13:16:00 2018 +0000
>     vxlan: Fix error path in __vxlan_dev_create()

I have checked up failure path of register_netdevice().
Yes, this patch leaks fdb entry.
There are 3 failure cases in the register_netdevice().
A. error occurs before calling ->ndo_init().
it doesn't call ->ndo_uninit().
B. error occurs after calling ->ndo_init().
it calls ->ndo_uninit() and dev->reg_state is NETREG_UNINITIALIZED.
C. error occurs after registering netdev. it calls rollback_registered().
rollback_registered() internally calls ->ndo_uninit()
and dev->reg_state is NETREG_UNREGISTERING.

A panic due to these problem could be fixed by using
vxlan_fdb_delete_default() with notify flag.
But notification problem could not be fixed clearly
because of the case C.

I don't have clear solution for the case C.
Please let me know, if you have any good idea for fixing the case C.

Thank you!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] vxlan: do not destroy fdb if register_netdevice() is failed
  2019-06-25 16:08       ` Taehee Yoo
@ 2019-06-26  5:03         ` Roopa Prabhu
  2019-06-26 14:53           ` Taehee Yoo
  0 siblings, 1 reply; 7+ messages in thread
From: Roopa Prabhu @ 2019-06-26  5:03 UTC (permalink / raw)
  To: Taehee Yoo; +Cc: David Miller, Netdev

On Tue, Jun 25, 2019 at 9:08 AM Taehee Yoo <ap420073@gmail.com> wrote:
>
> On Tue, 25 Jun 2019 at 13:12, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
> >
>
> Hi Roopa,
>
> Thank you for the review!
>
> > On Sun, Jun 23, 2019 at 7:18 PM Taehee Yoo <ap420073@gmail.com> wrote:
> > >
> > > On Mon, 24 Jun 2019 at 03:07, David Miller <davem@davemloft.net> wrote:
> > > >
> > >
> > > Hi David,
> > >
> > > Thank you for the review!
> > >
> > > > From: Taehee Yoo <ap420073@gmail.com>
> > > > Date: Thu, 20 Jun 2019 20:51:08 +0900
> > > >
> > > > > __vxlan_dev_create() destroys FDB using specific pointer which indicates
> > > > > a fdb when error occurs.
> > > > > But that pointer should not be used when register_netdevice() fails because
> > > > > register_netdevice() internally destroys fdb when error occurs.
> > > > >
> > > > > In order to avoid un-registered dev's notification, fdb destroying routine
> > > > > checks dev's register status before notification.
> > > >
> > > > Simply pass do_notify as false in this failure code path of __vxlan_dev_create(),
> > > > thank you.
> > >
> > > Failure path of __vxlan_dev_create() can't handle do_notify in that case
> > > because if register_netdevice() fails it internally calls
> > > ->ndo_uninit() which is
> > > vxlan_uninit().
> > > vxlan_uninit() internally calls vxlan_fdb_delete_default() and it callls
> > > vxlan_fdb_destroy().
> > > do_notify of vxlan_fdb_destroy() in vxlan_fdb_delete_default() is always true.
> > > So, failure path of __vxlan_dev_create() doesn't have any opportunity to
> > > handle do_notify.
> >
> >
> > I don't see register_netdevice calling ndo_uninit in case of all
> > errors. In the case where it does not,
> > does your patch leak the fdb entry ?.
> >
> > Wondering if we should just use vxlan_fdb_delete_default with a notify
> > flag to delete the entry if exists.
> > Will that help ?
> >
> > There is another commit that touched this code path:
> > commit 6db9246871394b3a136cd52001a0763676563840
> >
> > Author: Petr Machata <petrm@mellanox.com>
> > Date:   Tue Dec 18 13:16:00 2018 +0000
> >     vxlan: Fix error path in __vxlan_dev_create()
>
> I have checked up failure path of register_netdevice().
> Yes, this patch leaks fdb entry.
> There are 3 failure cases in the register_netdevice().
> A. error occurs before calling ->ndo_init().
> it doesn't call ->ndo_uninit().
> B. error occurs after calling ->ndo_init().
> it calls ->ndo_uninit() and dev->reg_state is NETREG_UNINITIALIZED.
> C. error occurs after registering netdev. it calls rollback_registered().
> rollback_registered() internally calls ->ndo_uninit()
> and dev->reg_state is NETREG_UNREGISTERING.
>
> A panic due to these problem could be fixed by using
> vxlan_fdb_delete_default() with notify flag.
> But notification problem could not be fixed clearly
> because of the case C.

yes, you are right. The notification issue still remains.

>
> I don't have clear solution for the case C.
> Please let me know, if you have any good idea for fixing the case C.

One option is a variant of fdb create. alloc the fdb  but don't assign
it to the vxlan dev.
__vxlan_dev_create
             create fdb entry
             register_netdevice
             rtnl_configure_link
             link fdb to vxlan
             fdb notify


Yet another option is moving fdb create after register_netdevice
__vxlan_dev_create
             register_netdevice
             rtnl_configure_link
             create fdb entry
             fdb notify
But if fdb create fails, user-space will see , NEWLINK + DELLINK when
creating a vxlan device and that seems weird.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH net] vxlan: do not destroy fdb if register_netdevice() is failed
  2019-06-26  5:03         ` Roopa Prabhu
@ 2019-06-26 14:53           ` Taehee Yoo
  0 siblings, 0 replies; 7+ messages in thread
From: Taehee Yoo @ 2019-06-26 14:53 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: David Miller, Netdev

On Wed, 26 Jun 2019 at 14:03, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>
> On Tue, Jun 25, 2019 at 9:08 AM Taehee Yoo <ap420073@gmail.com> wrote:
> >
> > On Tue, 25 Jun 2019 at 13:12, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
> > >
> >
> > Hi Roopa,
> >
> > Thank you for the review!
> >
> > > On Sun, Jun 23, 2019 at 7:18 PM Taehee Yoo <ap420073@gmail.com> wrote:
> > > >
> > > > On Mon, 24 Jun 2019 at 03:07, David Miller <davem@davemloft.net> wrote:
> > > > >
> > > >
> > > > Hi David,
> > > >
> > > > Thank you for the review!
> > > >
> > > > > From: Taehee Yoo <ap420073@gmail.com>
> > > > > Date: Thu, 20 Jun 2019 20:51:08 +0900
> > > > >
> > > > > > __vxlan_dev_create() destroys FDB using specific pointer which indicates
> > > > > > a fdb when error occurs.
> > > > > > But that pointer should not be used when register_netdevice() fails because
> > > > > > register_netdevice() internally destroys fdb when error occurs.
> > > > > >
> > > > > > In order to avoid un-registered dev's notification, fdb destroying routine
> > > > > > checks dev's register status before notification.
> > > > >
> > > > > Simply pass do_notify as false in this failure code path of __vxlan_dev_create(),
> > > > > thank you.
> > > >
> > > > Failure path of __vxlan_dev_create() can't handle do_notify in that case
> > > > because if register_netdevice() fails it internally calls
> > > > ->ndo_uninit() which is
> > > > vxlan_uninit().
> > > > vxlan_uninit() internally calls vxlan_fdb_delete_default() and it callls
> > > > vxlan_fdb_destroy().
> > > > do_notify of vxlan_fdb_destroy() in vxlan_fdb_delete_default() is always true.
> > > > So, failure path of __vxlan_dev_create() doesn't have any opportunity to
> > > > handle do_notify.
> > >
> > >
> > > I don't see register_netdevice calling ndo_uninit in case of all
> > > errors. In the case where it does not,
> > > does your patch leak the fdb entry ?.
> > >
> > > Wondering if we should just use vxlan_fdb_delete_default with a notify
> > > flag to delete the entry if exists.
> > > Will that help ?
> > >
> > > There is another commit that touched this code path:
> > > commit 6db9246871394b3a136cd52001a0763676563840
> > >
> > > Author: Petr Machata <petrm@mellanox.com>
> > > Date:   Tue Dec 18 13:16:00 2018 +0000
> > >     vxlan: Fix error path in __vxlan_dev_create()
> >
> > I have checked up failure path of register_netdevice().
> > Yes, this patch leaks fdb entry.
> > There are 3 failure cases in the register_netdevice().
> > A. error occurs before calling ->ndo_init().
> > it doesn't call ->ndo_uninit().
> > B. error occurs after calling ->ndo_init().
> > it calls ->ndo_uninit() and dev->reg_state is NETREG_UNINITIALIZED.
> > C. error occurs after registering netdev. it calls rollback_registered().
> > rollback_registered() internally calls ->ndo_uninit()
> > and dev->reg_state is NETREG_UNREGISTERING.
> >
> > A panic due to these problem could be fixed by using
> > vxlan_fdb_delete_default() with notify flag.
> > But notification problem could not be fixed clearly
> > because of the case C.
>
> yes, you are right. The notification issue still remains.
>
> >
> > I don't have clear solution for the case C.
> > Please let me know, if you have any good idea for fixing the case C.
>
> One option is a variant of fdb create. alloc the fdb  but don't assign
> it to the vxlan dev.
> __vxlan_dev_create
>              create fdb entry
>              register_netdevice
>              rtnl_configure_link
>              link fdb to vxlan
>              fdb notify
>
>
> Yet another option is moving fdb create after register_netdevice
> __vxlan_dev_create
>              register_netdevice
>              rtnl_configure_link
>              create fdb entry
>              fdb notify
> But if fdb create fails, user-space will see , NEWLINK + DELLINK when
> creating a vxlan device and that seems weird.

Thank you for suggesting ideas!

I think first one is very nice.
It's simple and clear.
So, I have been testing first one.
I will send a new patch after more testing.

Thank you again!

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-06-26 14:53 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-20 11:51 [PATCH net] vxlan: do not destroy fdb if register_netdevice() is failed Taehee Yoo
2019-06-23 18:07 ` David Miller
2019-06-24  2:18   ` Taehee Yoo
2019-06-25  4:12     ` Roopa Prabhu
2019-06-25 16:08       ` Taehee Yoo
2019-06-26  5:03         ` Roopa Prabhu
2019-06-26 14:53           ` Taehee Yoo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.