All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 12:49 ` idosch
  0 siblings, 0 replies; 22+ messages in thread
From: idosch @ 2017-04-07 12:49 UTC (permalink / raw)
  To: netdev
  Cc: davem, stephen, bridge, mlxsw, peter, cera, Ido Schimmel,
	Nikolay Aleksandrov

From: Ido Schimmel <idosch@mellanox.com>

Peter reported a kernel oops when executing the following command:

$ ip link add name test type bridge vlan_default_pvid 1

[13634.939408] BUG: unable to handle kernel NULL pointer dereference at
0000000000000190
[13634.939436] IP: __vlan_add+0x73/0x5f0
[...]
[13634.939783] Call Trace:
[13634.939791]  ? pcpu_next_unpop+0x3b/0x50
[13634.939801]  ? pcpu_alloc+0x3d2/0x680
[13634.939810]  ? br_vlan_add+0x135/0x1b0
[13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
[13634.939834]  ? br_changelink+0x120/0x4e0
[13634.939844]  ? br_dev_newlink+0x50/0x70
[13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
[13634.939864]  ? rtnl_newlink+0x176/0x8a0
[13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
[13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
[13634.939896]  ? lookup_fast+0x52/0x370
[13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
[13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
[13634.939925]  ? rtnetlink_rcv+0x24/0x30
[13634.939934]  ? netlink_unicast+0x177/0x220
[13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
[13634.939954]  ? _copy_from_user+0x39/0x40
[13634.939964]  ? sock_sendmsg+0x30/0x40
[13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
[13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
[13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
[13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
[13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
[13634.940809]  ? __sys_sendmsg+0x51/0x90
[13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad

The problem is that the bridge's VLAN group is created after setting the
default PVID, when registering the netdevice and executing its
ndo_init().

Fix this by changing the order of both operations, so that
br_changelink() is only processed after the netdevice is registered,
when the VLAN group is already initialized.

The changelink() call is done on a best-effort basis since unregistering
the netdevice upon failure won't perform a proper cleanup due to a
missing ndo_uninit(), which I'll try to add for net-next.

Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Peter V. Saveliev <peter@svinota.eu>
---
Please consider this for 4.4.y, 4.9.y and 4.10.y as well.
---
 net/bridge/br_netlink.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index a8f6acd..cf06c1a 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1165,11 +1165,15 @@ static int br_dev_newlink(struct net *src_net, struct net_device *dev,
 		spin_unlock_bh(&br->lock);
 	}
 
-	err = br_changelink(dev, tb, data);
+	err = register_netdevice(dev);
 	if (err)
 		return err;
 
-	return register_netdevice(dev);
+	err = br_changelink(dev, tb, data);
+	if (err)
+		netdev_err(dev, "Failed to configure bridge device\n");
+
+	return 0;
 }
 
 static size_t br_get_size(const struct net_device *brdev)
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [Bridge] [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 12:49 ` idosch
  0 siblings, 0 replies; 22+ messages in thread
From: idosch @ 2017-04-07 12:49 UTC (permalink / raw)
  To: netdev
  Cc: Ido Schimmel, cera, mlxsw, Nikolay Aleksandrov, peter, bridge, davem

From: Ido Schimmel <idosch@mellanox.com>

Peter reported a kernel oops when executing the following command:

$ ip link add name test type bridge vlan_default_pvid 1

[13634.939408] BUG: unable to handle kernel NULL pointer dereference at
0000000000000190
[13634.939436] IP: __vlan_add+0x73/0x5f0
[...]
[13634.939783] Call Trace:
[13634.939791]  ? pcpu_next_unpop+0x3b/0x50
[13634.939801]  ? pcpu_alloc+0x3d2/0x680
[13634.939810]  ? br_vlan_add+0x135/0x1b0
[13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
[13634.939834]  ? br_changelink+0x120/0x4e0
[13634.939844]  ? br_dev_newlink+0x50/0x70
[13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
[13634.939864]  ? rtnl_newlink+0x176/0x8a0
[13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
[13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
[13634.939896]  ? lookup_fast+0x52/0x370
[13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
[13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
[13634.939925]  ? rtnetlink_rcv+0x24/0x30
[13634.939934]  ? netlink_unicast+0x177/0x220
[13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
[13634.939954]  ? _copy_from_user+0x39/0x40
[13634.939964]  ? sock_sendmsg+0x30/0x40
[13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
[13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
[13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
[13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
[13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
[13634.940809]  ? __sys_sendmsg+0x51/0x90
[13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad

The problem is that the bridge's VLAN group is created after setting the
default PVID, when registering the netdevice and executing its
ndo_init().

Fix this by changing the order of both operations, so that
br_changelink() is only processed after the netdevice is registered,
when the VLAN group is already initialized.

The changelink() call is done on a best-effort basis since unregistering
the netdevice upon failure won't perform a proper cleanup due to a
missing ndo_uninit(), which I'll try to add for net-next.

Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Peter V. Saveliev <peter@svinota.eu>
---
Please consider this for 4.4.y, 4.9.y and 4.10.y as well.
---
 net/bridge/br_netlink.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index a8f6acd..cf06c1a 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1165,11 +1165,15 @@ static int br_dev_newlink(struct net *src_net, struct net_device *dev,
 		spin_unlock_bh(&br->lock);
 	}
 
-	err = br_changelink(dev, tb, data);
+	err = register_netdevice(dev);
 	if (err)
 		return err;
 
-	return register_netdevice(dev);
+	err = br_changelink(dev, tb, data);
+	if (err)
+		netdev_err(dev, "Failed to configure bridge device\n");
+
+	return 0;
 }
 
 static size_t br_get_size(const struct net_device *brdev)
-- 
2.9.3


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH net] bridge: netlink: register netdevice before executing changelink
  2017-04-07 12:49 ` [Bridge] " idosch
@ 2017-04-07 14:10   ` Stephen Hemminger
  -1 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 14:10 UTC (permalink / raw)
  To: idosch; +Cc: netdev, davem, bridge, mlxsw, peter, cera, Nikolay Aleksandrov

On Fri, 7 Apr 2017 15:49:15 +0300
<idosch@mellanox.com> wrote:

> From: Ido Schimmel <idosch@mellanox.com>
> 
> Peter reported a kernel oops when executing the following command:
> 
> $ ip link add name test type bridge vlan_default_pvid 1
> 
> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000190
> [13634.939436] IP: __vlan_add+0x73/0x5f0
> [...]
> [13634.939783] Call Trace:
> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
> [13634.939810]  ? br_vlan_add+0x135/0x1b0
> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
> [13634.939834]  ? br_changelink+0x120/0x4e0
> [13634.939844]  ? br_dev_newlink+0x50/0x70
> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
> [13634.939896]  ? lookup_fast+0x52/0x370
> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
> [13634.939934]  ? netlink_unicast+0x177/0x220
> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
> [13634.939954]  ? _copy_from_user+0x39/0x40
> [13634.939964]  ? sock_sendmsg+0x30/0x40
> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
> [13634.940809]  ? __sys_sendmsg+0x51/0x90
> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
> 
> The problem is that the bridge's VLAN group is created after setting the
> default PVID, when registering the netdevice and executing its
> ndo_init().
> 
> Fix this by changing the order of both operations, so that
> br_changelink() is only processed after the netdevice is registered,
> when the VLAN group is already initialized.
> 
> The changelink() call is done on a best-effort basis since unregistering
> the netdevice upon failure won't perform a proper cleanup due to a
> missing ndo_uninit(), which I'll try to add for net-next.
> 
> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> Reported-by: Peter V. Saveliev <peter@svinota.eu>
> ---
> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.

Although this patch fixes the OOPS it breaks all the error handling
of br_changelink. If bad attributes are passed to newlink, you leave a
broken partially configured bridge device. The code needs to cleanup
and return the correct errno.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Bridge] [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 14:10   ` Stephen Hemminger
  0 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 14:10 UTC (permalink / raw)
  To: idosch; +Cc: cera, mlxsw, Nikolay Aleksandrov, netdev, peter, bridge, davem

On Fri, 7 Apr 2017 15:49:15 +0300
<idosch@mellanox.com> wrote:

> From: Ido Schimmel <idosch@mellanox.com>
> 
> Peter reported a kernel oops when executing the following command:
> 
> $ ip link add name test type bridge vlan_default_pvid 1
> 
> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000190
> [13634.939436] IP: __vlan_add+0x73/0x5f0
> [...]
> [13634.939783] Call Trace:
> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
> [13634.939810]  ? br_vlan_add+0x135/0x1b0
> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
> [13634.939834]  ? br_changelink+0x120/0x4e0
> [13634.939844]  ? br_dev_newlink+0x50/0x70
> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
> [13634.939896]  ? lookup_fast+0x52/0x370
> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
> [13634.939934]  ? netlink_unicast+0x177/0x220
> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
> [13634.939954]  ? _copy_from_user+0x39/0x40
> [13634.939964]  ? sock_sendmsg+0x30/0x40
> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
> [13634.940809]  ? __sys_sendmsg+0x51/0x90
> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
> 
> The problem is that the bridge's VLAN group is created after setting the
> default PVID, when registering the netdevice and executing its
> ndo_init().
> 
> Fix this by changing the order of both operations, so that
> br_changelink() is only processed after the netdevice is registered,
> when the VLAN group is already initialized.
> 
> The changelink() call is done on a best-effort basis since unregistering
> the netdevice upon failure won't perform a proper cleanup due to a
> missing ndo_uninit(), which I'll try to add for net-next.
> 
> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> Reported-by: Peter V. Saveliev <peter@svinota.eu>
> ---
> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.

Although this patch fixes the OOPS it breaks all the error handling
of br_changelink. If bad attributes are passed to newlink, you leave a
broken partially configured bridge device. The code needs to cleanup
and return the correct errno.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net] bridge: netlink: register netdevice before executing changelink
  2017-04-07 14:10   ` [Bridge] " Stephen Hemminger
@ 2017-04-07 14:19     ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 22+ messages in thread
From: Nikolay Aleksandrov via Bridge @ 2017-04-07 14:19 UTC (permalink / raw)
  To: Stephen Hemminger, idosch; +Cc: cera, mlxsw, netdev, peter, bridge, davem

On 07/04/17 17:10, Stephen Hemminger wrote:
> On Fri, 7 Apr 2017 15:49:15 +0300
> <idosch@mellanox.com> wrote:
> 
>> From: Ido Schimmel <idosch@mellanox.com>
>>
>> Peter reported a kernel oops when executing the following command:
>>
>> $ ip link add name test type bridge vlan_default_pvid 1
>>
>> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
>> 0000000000000190
>> [13634.939436] IP: __vlan_add+0x73/0x5f0
>> [...]
>> [13634.939783] Call Trace:
>> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
>> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
>> [13634.939810]  ? br_vlan_add+0x135/0x1b0
>> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
>> [13634.939834]  ? br_changelink+0x120/0x4e0
>> [13634.939844]  ? br_dev_newlink+0x50/0x70
>> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
>> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
>> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
>> [13634.939896]  ? lookup_fast+0x52/0x370
>> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
>> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
>> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
>> [13634.939934]  ? netlink_unicast+0x177/0x220
>> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
>> [13634.939954]  ? _copy_from_user+0x39/0x40
>> [13634.939964]  ? sock_sendmsg+0x30/0x40
>> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
>> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
>> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
>> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
>> [13634.940809]  ? __sys_sendmsg+0x51/0x90
>> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
>>
>> The problem is that the bridge's VLAN group is created after setting the
>> default PVID, when registering the netdevice and executing its
>> ndo_init().
>>
>> Fix this by changing the order of both operations, so that
>> br_changelink() is only processed after the netdevice is registered,
>> when the VLAN group is already initialized.
>>
>> The changelink() call is done on a best-effort basis since unregistering
>> the netdevice upon failure won't perform a proper cleanup due to a
>> missing ndo_uninit(), which I'll try to add for net-next.
>>
>> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
>> Reported-by: Peter V. Saveliev <peter@svinota.eu>
>> ---
>> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.
> 
> Although this patch fixes the OOPS it breaks all the error handling
> of br_changelink. If bad attributes are passed to newlink, you leave a
> broken partially configured bridge device. The code needs to cleanup
> and return the correct errno.
> 

The cleanup would require adding ndo_uninit() and a much bigger churn
which doesn't seem okay for -net, it will be targetted at net-next.
The bridge can always be reconfigured as all of the options can be set
during runtime, so anything can be fixed, thus the best-effort changelink.

If it is not desirable for -net then maybe we should just revert the
patch there altogether and make it again correctly with cleanup and so
on in net-next.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Bridge] [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 14:19     ` Nikolay Aleksandrov
  0 siblings, 0 replies; 22+ messages in thread
From: Nikolay Aleksandrov @ 2017-04-07 14:19 UTC (permalink / raw)
  To: Stephen Hemminger, idosch; +Cc: cera, mlxsw, netdev, peter, bridge, davem

On 07/04/17 17:10, Stephen Hemminger wrote:
> On Fri, 7 Apr 2017 15:49:15 +0300
> <idosch@mellanox.com> wrote:
> 
>> From: Ido Schimmel <idosch@mellanox.com>
>>
>> Peter reported a kernel oops when executing the following command:
>>
>> $ ip link add name test type bridge vlan_default_pvid 1
>>
>> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
>> 0000000000000190
>> [13634.939436] IP: __vlan_add+0x73/0x5f0
>> [...]
>> [13634.939783] Call Trace:
>> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
>> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
>> [13634.939810]  ? br_vlan_add+0x135/0x1b0
>> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
>> [13634.939834]  ? br_changelink+0x120/0x4e0
>> [13634.939844]  ? br_dev_newlink+0x50/0x70
>> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
>> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
>> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
>> [13634.939896]  ? lookup_fast+0x52/0x370
>> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
>> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
>> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
>> [13634.939934]  ? netlink_unicast+0x177/0x220
>> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
>> [13634.939954]  ? _copy_from_user+0x39/0x40
>> [13634.939964]  ? sock_sendmsg+0x30/0x40
>> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
>> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
>> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
>> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
>> [13634.940809]  ? __sys_sendmsg+0x51/0x90
>> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
>>
>> The problem is that the bridge's VLAN group is created after setting the
>> default PVID, when registering the netdevice and executing its
>> ndo_init().
>>
>> Fix this by changing the order of both operations, so that
>> br_changelink() is only processed after the netdevice is registered,
>> when the VLAN group is already initialized.
>>
>> The changelink() call is done on a best-effort basis since unregistering
>> the netdevice upon failure won't perform a proper cleanup due to a
>> missing ndo_uninit(), which I'll try to add for net-next.
>>
>> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
>> Reported-by: Peter V. Saveliev <peter@svinota.eu>
>> ---
>> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.
> 
> Although this patch fixes the OOPS it breaks all the error handling
> of br_changelink. If bad attributes are passed to newlink, you leave a
> broken partially configured bridge device. The code needs to cleanup
> and return the correct errno.
> 

The cleanup would require adding ndo_uninit() and a much bigger churn
which doesn't seem okay for -net, it will be targetted at net-next.
The bridge can always be reconfigured as all of the options can be set
during runtime, so anything can be fixed, thus the best-effort changelink.

If it is not desirable for -net then maybe we should just revert the
patch there altogether and make it again correctly with cleanup and so
on in net-next.




^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net] bridge: netlink: register netdevice before executing changelink
  2017-04-07 14:19     ` [Bridge] " Nikolay Aleksandrov
@ 2017-04-07 15:22       ` Stephen Hemminger
  -1 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 15:22 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: idosch, netdev, davem, bridge, mlxsw, peter, cera

On Fri, 7 Apr 2017 17:19:48 +0300
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:

> On 07/04/17 17:10, Stephen Hemminger wrote:
> > On Fri, 7 Apr 2017 15:49:15 +0300
> > <idosch@mellanox.com> wrote:
> >   
> >> From: Ido Schimmel <idosch@mellanox.com>
> >>
> >> Peter reported a kernel oops when executing the following command:
> >>
> >> $ ip link add name test type bridge vlan_default_pvid 1
> >>
> >> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
> >> 0000000000000190
> >> [13634.939436] IP: __vlan_add+0x73/0x5f0
> >> [...]
> >> [13634.939783] Call Trace:
> >> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
> >> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
> >> [13634.939810]  ? br_vlan_add+0x135/0x1b0
> >> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
> >> [13634.939834]  ? br_changelink+0x120/0x4e0
> >> [13634.939844]  ? br_dev_newlink+0x50/0x70
> >> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
> >> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
> >> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
> >> [13634.939896]  ? lookup_fast+0x52/0x370
> >> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
> >> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
> >> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
> >> [13634.939934]  ? netlink_unicast+0x177/0x220
> >> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
> >> [13634.939954]  ? _copy_from_user+0x39/0x40
> >> [13634.939964]  ? sock_sendmsg+0x30/0x40
> >> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
> >> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
> >> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
> >> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
> >> [13634.940809]  ? __sys_sendmsg+0x51/0x90
> >> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
> >>
> >> The problem is that the bridge's VLAN group is created after setting the
> >> default PVID, when registering the netdevice and executing its
> >> ndo_init().
> >>
> >> Fix this by changing the order of both operations, so that
> >> br_changelink() is only processed after the netdevice is registered,
> >> when the VLAN group is already initialized.
> >>
> >> The changelink() call is done on a best-effort basis since unregistering
> >> the netdevice upon failure won't perform a proper cleanup due to a
> >> missing ndo_uninit(), which I'll try to add for net-next.
> >>
> >> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
> >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> >> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> >> Reported-by: Peter V. Saveliev <peter@svinota.eu>
> >> ---
> >> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.  
> > 
> > Although this patch fixes the OOPS it breaks all the error handling
> > of br_changelink. If bad attributes are passed to newlink, you leave a
> > broken partially configured bridge device. The code needs to cleanup
> > and return the correct errno.
> >   
> 
> The cleanup would require adding ndo_uninit() and a much bigger churn
> which doesn't seem okay for -net, it will be targetted at net-next.
> The bridge can always be reconfigured as all of the options can be set
> during runtime, so anything can be fixed, thus the best-effort changelink.
> 
> If it is not desirable for -net then maybe we should just revert the
> patch there altogether and make it again correctly with cleanup and so
> on in net-next.
> 
> 
> 

Why not just add pointer validation in the PVID attribute parsing.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Bridge] [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 15:22       ` Stephen Hemminger
  0 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 15:22 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: cera, mlxsw, netdev, peter, bridge, idosch, davem

On Fri, 7 Apr 2017 17:19:48 +0300
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:

> On 07/04/17 17:10, Stephen Hemminger wrote:
> > On Fri, 7 Apr 2017 15:49:15 +0300
> > <idosch@mellanox.com> wrote:
> >   
> >> From: Ido Schimmel <idosch@mellanox.com>
> >>
> >> Peter reported a kernel oops when executing the following command:
> >>
> >> $ ip link add name test type bridge vlan_default_pvid 1
> >>
> >> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
> >> 0000000000000190
> >> [13634.939436] IP: __vlan_add+0x73/0x5f0
> >> [...]
> >> [13634.939783] Call Trace:
> >> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
> >> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
> >> [13634.939810]  ? br_vlan_add+0x135/0x1b0
> >> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
> >> [13634.939834]  ? br_changelink+0x120/0x4e0
> >> [13634.939844]  ? br_dev_newlink+0x50/0x70
> >> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
> >> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
> >> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
> >> [13634.939896]  ? lookup_fast+0x52/0x370
> >> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
> >> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
> >> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
> >> [13634.939934]  ? netlink_unicast+0x177/0x220
> >> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
> >> [13634.939954]  ? _copy_from_user+0x39/0x40
> >> [13634.939964]  ? sock_sendmsg+0x30/0x40
> >> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
> >> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
> >> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
> >> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
> >> [13634.940809]  ? __sys_sendmsg+0x51/0x90
> >> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
> >>
> >> The problem is that the bridge's VLAN group is created after setting the
> >> default PVID, when registering the netdevice and executing its
> >> ndo_init().
> >>
> >> Fix this by changing the order of both operations, so that
> >> br_changelink() is only processed after the netdevice is registered,
> >> when the VLAN group is already initialized.
> >>
> >> The changelink() call is done on a best-effort basis since unregistering
> >> the netdevice upon failure won't perform a proper cleanup due to a
> >> missing ndo_uninit(), which I'll try to add for net-next.
> >>
> >> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
> >> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> >> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> >> Reported-by: Peter V. Saveliev <peter@svinota.eu>
> >> ---
> >> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.  
> > 
> > Although this patch fixes the OOPS it breaks all the error handling
> > of br_changelink. If bad attributes are passed to newlink, you leave a
> > broken partially configured bridge device. The code needs to cleanup
> > and return the correct errno.
> >   
> 
> The cleanup would require adding ndo_uninit() and a much bigger churn
> which doesn't seem okay for -net, it will be targetted at net-next.
> The bridge can always be reconfigured as all of the options can be set
> during runtime, so anything can be fixed, thus the best-effort changelink.
> 
> If it is not desirable for -net then maybe we should just revert the
> patch there altogether and make it again correctly with cleanup and so
> on in net-next.
> 
> 
> 

Why not just add pointer validation in the PVID attribute parsing.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net] bridge: netlink: register netdevice before executing changelink
  2017-04-07 15:22       ` [Bridge] " Stephen Hemminger
@ 2017-04-07 15:27         ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 22+ messages in thread
From: Nikolay Aleksandrov @ 2017-04-07 15:27 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: idosch, netdev, davem, bridge, mlxsw, peter, cera

On 07/04/17 18:22, Stephen Hemminger wrote:
> On Fri, 7 Apr 2017 17:19:48 +0300
> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> 
>> On 07/04/17 17:10, Stephen Hemminger wrote:
>>> On Fri, 7 Apr 2017 15:49:15 +0300
>>> <idosch@mellanox.com> wrote:
>>>   
>>>> From: Ido Schimmel <idosch@mellanox.com>
>>>>
>>>> Peter reported a kernel oops when executing the following command:
>>>>
>>>> $ ip link add name test type bridge vlan_default_pvid 1
>>>>
>>>> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
>>>> 0000000000000190
>>>> [13634.939436] IP: __vlan_add+0x73/0x5f0
>>>> [...]
>>>> [13634.939783] Call Trace:
>>>> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
>>>> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
>>>> [13634.939810]  ? br_vlan_add+0x135/0x1b0
>>>> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
>>>> [13634.939834]  ? br_changelink+0x120/0x4e0
>>>> [13634.939844]  ? br_dev_newlink+0x50/0x70
>>>> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
>>>> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
>>>> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>>>> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
>>>> [13634.939896]  ? lookup_fast+0x52/0x370
>>>> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
>>>> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
>>>> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
>>>> [13634.939934]  ? netlink_unicast+0x177/0x220
>>>> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
>>>> [13634.939954]  ? _copy_from_user+0x39/0x40
>>>> [13634.939964]  ? sock_sendmsg+0x30/0x40
>>>> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
>>>> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
>>>> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>>>> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
>>>> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
>>>> [13634.940809]  ? __sys_sendmsg+0x51/0x90
>>>> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
>>>>
>>>> The problem is that the bridge's VLAN group is created after setting the
>>>> default PVID, when registering the netdevice and executing its
>>>> ndo_init().
>>>>
>>>> Fix this by changing the order of both operations, so that
>>>> br_changelink() is only processed after the netdevice is registered,
>>>> when the VLAN group is already initialized.
>>>>
>>>> The changelink() call is done on a best-effort basis since unregistering
>>>> the netdevice upon failure won't perform a proper cleanup due to a
>>>> missing ndo_uninit(), which I'll try to add for net-next.
>>>>
>>>> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
>>>> Reported-by: Peter V. Saveliev <peter@svinota.eu>
>>>> ---
>>>> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.  
>>>
>>> Although this patch fixes the OOPS it breaks all the error handling
>>> of br_changelink. If bad attributes are passed to newlink, you leave a
>>> broken partially configured bridge device. The code needs to cleanup
>>> and return the correct errno.
>>>   
>>
>> The cleanup would require adding ndo_uninit() and a much bigger churn
>> which doesn't seem okay for -net, it will be targetted at net-next.
>> The bridge can always be reconfigured as all of the options can be set
>> during runtime, so anything can be fixed, thus the best-effort changelink.
>>
>> If it is not desirable for -net then maybe we should just revert the
>> patch there altogether and make it again correctly with cleanup and so
>> on in net-next.
>>
>>
>>
> 
> Why not just add pointer validation in the PVID attribute parsing.
> 

We cannot have the changelink() before the register  for many reasons,
first the vlan config will not be applied so all of the vlan options
won't get set even though they're passed, then the changelink() can
cause more trouble via the STP calls (f.e. br_stp_start) which can use
br->dev->ifindex (= 0 at that point), also can use br->dev->name (still
haven't passed validation and is uninitialized, i.e.
dev_get_valid_name() hasn't been called and we can have format
specifiers in it), multicast code also has some codepaths that will
cause various timers to get started...

Moving changelink() after the register is much safer.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Bridge] [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 15:27         ` Nikolay Aleksandrov
  0 siblings, 0 replies; 22+ messages in thread
From: Nikolay Aleksandrov @ 2017-04-07 15:27 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: cera, mlxsw, netdev, peter, bridge, idosch, davem

On 07/04/17 18:22, Stephen Hemminger wrote:
> On Fri, 7 Apr 2017 17:19:48 +0300
> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> 
>> On 07/04/17 17:10, Stephen Hemminger wrote:
>>> On Fri, 7 Apr 2017 15:49:15 +0300
>>> <idosch@mellanox.com> wrote:
>>>   
>>>> From: Ido Schimmel <idosch@mellanox.com>
>>>>
>>>> Peter reported a kernel oops when executing the following command:
>>>>
>>>> $ ip link add name test type bridge vlan_default_pvid 1
>>>>
>>>> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
>>>> 0000000000000190
>>>> [13634.939436] IP: __vlan_add+0x73/0x5f0
>>>> [...]
>>>> [13634.939783] Call Trace:
>>>> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
>>>> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
>>>> [13634.939810]  ? br_vlan_add+0x135/0x1b0
>>>> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
>>>> [13634.939834]  ? br_changelink+0x120/0x4e0
>>>> [13634.939844]  ? br_dev_newlink+0x50/0x70
>>>> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
>>>> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
>>>> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>>>> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
>>>> [13634.939896]  ? lookup_fast+0x52/0x370
>>>> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
>>>> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
>>>> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
>>>> [13634.939934]  ? netlink_unicast+0x177/0x220
>>>> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
>>>> [13634.939954]  ? _copy_from_user+0x39/0x40
>>>> [13634.939964]  ? sock_sendmsg+0x30/0x40
>>>> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
>>>> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
>>>> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>>>> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
>>>> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
>>>> [13634.940809]  ? __sys_sendmsg+0x51/0x90
>>>> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
>>>>
>>>> The problem is that the bridge's VLAN group is created after setting the
>>>> default PVID, when registering the netdevice and executing its
>>>> ndo_init().
>>>>
>>>> Fix this by changing the order of both operations, so that
>>>> br_changelink() is only processed after the netdevice is registered,
>>>> when the VLAN group is already initialized.
>>>>
>>>> The changelink() call is done on a best-effort basis since unregistering
>>>> the netdevice upon failure won't perform a proper cleanup due to a
>>>> missing ndo_uninit(), which I'll try to add for net-next.
>>>>
>>>> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
>>>> Reported-by: Peter V. Saveliev <peter@svinota.eu>
>>>> ---
>>>> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.  
>>>
>>> Although this patch fixes the OOPS it breaks all the error handling
>>> of br_changelink. If bad attributes are passed to newlink, you leave a
>>> broken partially configured bridge device. The code needs to cleanup
>>> and return the correct errno.
>>>   
>>
>> The cleanup would require adding ndo_uninit() and a much bigger churn
>> which doesn't seem okay for -net, it will be targetted at net-next.
>> The bridge can always be reconfigured as all of the options can be set
>> during runtime, so anything can be fixed, thus the best-effort changelink.
>>
>> If it is not desirable for -net then maybe we should just revert the
>> patch there altogether and make it again correctly with cleanup and so
>> on in net-next.
>>
>>
>>
> 
> Why not just add pointer validation in the PVID attribute parsing.
> 

We cannot have the changelink() before the register  for many reasons,
first the vlan config will not be applied so all of the vlan options
won't get set even though they're passed, then the changelink() can
cause more trouble via the STP calls (f.e. br_stp_start) which can use
br->dev->ifindex (= 0 at that point), also can use br->dev->name (still
haven't passed validation and is uninitialized, i.e.
dev_get_valid_name() hasn't been called and we can have format
specifiers in it), multicast code also has some codepaths that will
cause various timers to get started...

Moving changelink() after the register is much safer.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net] bridge: netlink: register netdevice before executing changelink
  2017-04-07 15:27         ` [Bridge] " Nikolay Aleksandrov
@ 2017-04-07 15:36           ` Stephen Hemminger
  -1 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 15:36 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: idosch, netdev, davem, bridge, mlxsw, peter, cera

On Fri, 7 Apr 2017 18:27:37 +0300
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:

> On 07/04/17 18:22, Stephen Hemminger wrote:
> > On Fri, 7 Apr 2017 17:19:48 +0300
> > Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> >   
> >> On 07/04/17 17:10, Stephen Hemminger wrote:  
> >>> On Fri, 7 Apr 2017 15:49:15 +0300
> >>> <idosch@mellanox.com> wrote:
> >>>     
> >>>> From: Ido Schimmel <idosch@mellanox.com>
> >>>>
> >>>> Peter reported a kernel oops when executing the following command:
> >>>>
> >>>> $ ip link add name test type bridge vlan_default_pvid 1
> >>>>
> >>>> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
> >>>> 0000000000000190
> >>>> [13634.939436] IP: __vlan_add+0x73/0x5f0
> >>>> [...]
> >>>> [13634.939783] Call Trace:
> >>>> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
> >>>> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
> >>>> [13634.939810]  ? br_vlan_add+0x135/0x1b0
> >>>> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
> >>>> [13634.939834]  ? br_changelink+0x120/0x4e0
> >>>> [13634.939844]  ? br_dev_newlink+0x50/0x70
> >>>> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
> >>>> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
> >>>> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >>>> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
> >>>> [13634.939896]  ? lookup_fast+0x52/0x370
> >>>> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
> >>>> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
> >>>> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
> >>>> [13634.939934]  ? netlink_unicast+0x177/0x220
> >>>> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
> >>>> [13634.939954]  ? _copy_from_user+0x39/0x40
> >>>> [13634.939964]  ? sock_sendmsg+0x30/0x40
> >>>> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
> >>>> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
> >>>> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >>>> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
> >>>> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
> >>>> [13634.940809]  ? __sys_sendmsg+0x51/0x90
> >>>> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
> >>>>
> >>>> The problem is that the bridge's VLAN group is created after setting the
> >>>> default PVID, when registering the netdevice and executing its
> >>>> ndo_init().
> >>>>
> >>>> Fix this by changing the order of both operations, so that
> >>>> br_changelink() is only processed after the netdevice is registered,
> >>>> when the VLAN group is already initialized.
> >>>>
> >>>> The changelink() call is done on a best-effort basis since unregistering
> >>>> the netdevice upon failure won't perform a proper cleanup due to a
> >>>> missing ndo_uninit(), which I'll try to add for net-next.
> >>>>
> >>>> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
> >>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> >>>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> >>>> Reported-by: Peter V. Saveliev <peter@svinota.eu>
> >>>> ---
> >>>> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.    
> >>>
> >>> Although this patch fixes the OOPS it breaks all the error handling
> >>> of br_changelink. If bad attributes are passed to newlink, you leave a
> >>> broken partially configured bridge device. The code needs to cleanup
> >>> and return the correct errno.
> >>>     
> >>
> >> The cleanup would require adding ndo_uninit() and a much bigger churn
> >> which doesn't seem okay for -net, it will be targetted at net-next.
> >> The bridge can always be reconfigured as all of the options can be set
> >> during runtime, so anything can be fixed, thus the best-effort changelink.
> >>
> >> If it is not desirable for -net then maybe we should just revert the
> >> patch there altogether and make it again correctly with cleanup and so
> >> on in net-next.
> >>
> >>
> >>  
> > 
> > Why not just add pointer validation in the PVID attribute parsing.
> >   
> 
> We cannot have the changelink() before the register  for many reasons,
> first the vlan config will not be applied so all of the vlan options
> won't get set even though they're passed, then the changelink() can
> cause more trouble via the STP calls (f.e. br_stp_start) which can use
> br->dev->ifindex (= 0 at that point), also can use br->dev->name (still
> haven't passed validation and is uninitialized, i.e.
> dev_get_valid_name() hasn't been called and we can have format
> specifiers in it), multicast code also has some codepaths that will
> cause various timers to get started...
> 
> Moving changelink() after the register is much safer.
> 

Then just fix error handling. Why not call unregister?

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index a8f6acd23e30..6d92d3198e8b 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1165,11 +1165,15 @@ static int br_dev_newlink(struct net *src_net, struct net_device *dev,
 		spin_unlock_bh(&br->lock);
 	}
 
-	err = br_changelink(dev, tb, data);
+	err = register_netdevice(dev);
 	if (err)
 		return err;
 
-	return register_netdevice(dev);
+	err = br_changelink(dev, tb, data);
+	if (err)
+		unregister_netdevice(dev);
+
+	return err;
 }
 
 static size_t br_get_size(const struct net_device *brdev)
-- 
2.11.0

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [Bridge] [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 15:36           ` Stephen Hemminger
  0 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 15:36 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: cera, mlxsw, netdev, peter, bridge, idosch, davem

On Fri, 7 Apr 2017 18:27:37 +0300
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:

> On 07/04/17 18:22, Stephen Hemminger wrote:
> > On Fri, 7 Apr 2017 17:19:48 +0300
> > Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> >   
> >> On 07/04/17 17:10, Stephen Hemminger wrote:  
> >>> On Fri, 7 Apr 2017 15:49:15 +0300
> >>> <idosch@mellanox.com> wrote:
> >>>     
> >>>> From: Ido Schimmel <idosch@mellanox.com>
> >>>>
> >>>> Peter reported a kernel oops when executing the following command:
> >>>>
> >>>> $ ip link add name test type bridge vlan_default_pvid 1
> >>>>
> >>>> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
> >>>> 0000000000000190
> >>>> [13634.939436] IP: __vlan_add+0x73/0x5f0
> >>>> [...]
> >>>> [13634.939783] Call Trace:
> >>>> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
> >>>> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
> >>>> [13634.939810]  ? br_vlan_add+0x135/0x1b0
> >>>> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
> >>>> [13634.939834]  ? br_changelink+0x120/0x4e0
> >>>> [13634.939844]  ? br_dev_newlink+0x50/0x70
> >>>> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
> >>>> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
> >>>> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >>>> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
> >>>> [13634.939896]  ? lookup_fast+0x52/0x370
> >>>> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
> >>>> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
> >>>> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
> >>>> [13634.939934]  ? netlink_unicast+0x177/0x220
> >>>> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
> >>>> [13634.939954]  ? _copy_from_user+0x39/0x40
> >>>> [13634.939964]  ? sock_sendmsg+0x30/0x40
> >>>> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
> >>>> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
> >>>> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >>>> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
> >>>> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
> >>>> [13634.940809]  ? __sys_sendmsg+0x51/0x90
> >>>> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
> >>>>
> >>>> The problem is that the bridge's VLAN group is created after setting the
> >>>> default PVID, when registering the netdevice and executing its
> >>>> ndo_init().
> >>>>
> >>>> Fix this by changing the order of both operations, so that
> >>>> br_changelink() is only processed after the netdevice is registered,
> >>>> when the VLAN group is already initialized.
> >>>>
> >>>> The changelink() call is done on a best-effort basis since unregistering
> >>>> the netdevice upon failure won't perform a proper cleanup due to a
> >>>> missing ndo_uninit(), which I'll try to add for net-next.
> >>>>
> >>>> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
> >>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> >>>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> >>>> Reported-by: Peter V. Saveliev <peter@svinota.eu>
> >>>> ---
> >>>> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.    
> >>>
> >>> Although this patch fixes the OOPS it breaks all the error handling
> >>> of br_changelink. If bad attributes are passed to newlink, you leave a
> >>> broken partially configured bridge device. The code needs to cleanup
> >>> and return the correct errno.
> >>>     
> >>
> >> The cleanup would require adding ndo_uninit() and a much bigger churn
> >> which doesn't seem okay for -net, it will be targetted at net-next.
> >> The bridge can always be reconfigured as all of the options can be set
> >> during runtime, so anything can be fixed, thus the best-effort changelink.
> >>
> >> If it is not desirable for -net then maybe we should just revert the
> >> patch there altogether and make it again correctly with cleanup and so
> >> on in net-next.
> >>
> >>
> >>  
> > 
> > Why not just add pointer validation in the PVID attribute parsing.
> >   
> 
> We cannot have the changelink() before the register  for many reasons,
> first the vlan config will not be applied so all of the vlan options
> won't get set even though they're passed, then the changelink() can
> cause more trouble via the STP calls (f.e. br_stp_start) which can use
> br->dev->ifindex (= 0 at that point), also can use br->dev->name (still
> haven't passed validation and is uninitialized, i.e.
> dev_get_valid_name() hasn't been called and we can have format
> specifiers in it), multicast code also has some codepaths that will
> cause various timers to get started...
> 
> Moving changelink() after the register is much safer.
> 

Then just fix error handling. Why not call unregister?

diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index a8f6acd23e30..6d92d3198e8b 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1165,11 +1165,15 @@ static int br_dev_newlink(struct net *src_net, struct net_device *dev,
 		spin_unlock_bh(&br->lock);
 	}
 
-	err = br_changelink(dev, tb, data);
+	err = register_netdevice(dev);
 	if (err)
 		return err;
 
-	return register_netdevice(dev);
+	err = br_changelink(dev, tb, data);
+	if (err)
+		unregister_netdevice(dev);
+
+	return err;
 }
 
 static size_t br_get_size(const struct net_device *brdev)
-- 
2.11.0



^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH net] bridge: netlink: register netdevice before executing changelink
  2017-04-07 15:36           ` [Bridge] " Stephen Hemminger
@ 2017-04-07 15:43             ` Nikolay Aleksandrov
  -1 siblings, 0 replies; 22+ messages in thread
From: Nikolay Aleksandrov via Bridge @ 2017-04-07 15:43 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: cera, mlxsw, netdev, peter, bridge, idosch, davem

On 07/04/17 18:36, Stephen Hemminger wrote:
> On Fri, 7 Apr 2017 18:27:37 +0300
> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> 
>> On 07/04/17 18:22, Stephen Hemminger wrote:
>>> On Fri, 7 Apr 2017 17:19:48 +0300
>>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>>>   
>>>> On 07/04/17 17:10, Stephen Hemminger wrote:  
>>>>> On Fri, 7 Apr 2017 15:49:15 +0300
>>>>> <idosch@mellanox.com> wrote:
>>>>>     
>>>>>> From: Ido Schimmel <idosch@mellanox.com>
>>>>>>
>>>>>> Peter reported a kernel oops when executing the following command:
>>>>>>
>>>>>> $ ip link add name test type bridge vlan_default_pvid 1
>>>>>>
>>>>>> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
>>>>>> 0000000000000190
>>>>>> [13634.939436] IP: __vlan_add+0x73/0x5f0
>>>>>> [...]
>>>>>> [13634.939783] Call Trace:
>>>>>> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
>>>>>> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
>>>>>> [13634.939810]  ? br_vlan_add+0x135/0x1b0
>>>>>> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
>>>>>> [13634.939834]  ? br_changelink+0x120/0x4e0
>>>>>> [13634.939844]  ? br_dev_newlink+0x50/0x70
>>>>>> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
>>>>>> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
>>>>>> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>>>>>> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
>>>>>> [13634.939896]  ? lookup_fast+0x52/0x370
>>>>>> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
>>>>>> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
>>>>>> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
>>>>>> [13634.939934]  ? netlink_unicast+0x177/0x220
>>>>>> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
>>>>>> [13634.939954]  ? _copy_from_user+0x39/0x40
>>>>>> [13634.939964]  ? sock_sendmsg+0x30/0x40
>>>>>> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
>>>>>> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
>>>>>> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>>>>>> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
>>>>>> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
>>>>>> [13634.940809]  ? __sys_sendmsg+0x51/0x90
>>>>>> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
>>>>>>
>>>>>> The problem is that the bridge's VLAN group is created after setting the
>>>>>> default PVID, when registering the netdevice and executing its
>>>>>> ndo_init().
>>>>>>
>>>>>> Fix this by changing the order of both operations, so that
>>>>>> br_changelink() is only processed after the netdevice is registered,
>>>>>> when the VLAN group is already initialized.
>>>>>>
>>>>>> The changelink() call is done on a best-effort basis since unregistering
>>>>>> the netdevice upon failure won't perform a proper cleanup due to a
>>>>>> missing ndo_uninit(), which I'll try to add for net-next.
>>>>>>
>>>>>> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
>>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>>>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
>>>>>> Reported-by: Peter V. Saveliev <peter@svinota.eu>
>>>>>> ---
>>>>>> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.    
>>>>>
>>>>> Although this patch fixes the OOPS it breaks all the error handling
>>>>> of br_changelink. If bad attributes are passed to newlink, you leave a
>>>>> broken partially configured bridge device. The code needs to cleanup
>>>>> and return the correct errno.
>>>>>     
>>>>
>>>> The cleanup would require adding ndo_uninit() and a much bigger churn
>>>> which doesn't seem okay for -net, it will be targetted at net-next.
>>>> The bridge can always be reconfigured as all of the options can be set
>>>> during runtime, so anything can be fixed, thus the best-effort changelink.
>>>>
>>>> If it is not desirable for -net then maybe we should just revert the
>>>> patch there altogether and make it again correctly with cleanup and so
>>>> on in net-next.
>>>>
>>>>
>>>>  
>>>
>>> Why not just add pointer validation in the PVID attribute parsing.
>>>   
>>
>> We cannot have the changelink() before the register  for many reasons,
>> first the vlan config will not be applied so all of the vlan options
>> won't get set even though they're passed, then the changelink() can
>> cause more trouble via the STP calls (f.e. br_stp_start) which can use
>> br->dev->ifindex (= 0 at that point), also can use br->dev->name (still
>> haven't passed validation and is uninitialized, i.e.
>> dev_get_valid_name() hasn't been called and we can have format
>> specifiers in it), multicast code also has some codepaths that will
>> cause various timers to get started...
>>
>> Moving changelink() after the register is much safer.
>>
> 
> Then just fix error handling. Why not call unregister?
> 

Right, because there's no ndo_uninit() and this will not cleanup the
bridge properly. That's the plan for net-next, reorg the code and add
ndo_uninit() for that reason.

>From Ido's commit message above:
"The changelink() call is done on a best-effort basis since
unregistering the netdevice upon failure won't perform a proper cleanup
due to a missing ndo_uninit(), which I'll try to add for net-next."

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Bridge] [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 15:43             ` Nikolay Aleksandrov
  0 siblings, 0 replies; 22+ messages in thread
From: Nikolay Aleksandrov @ 2017-04-07 15:43 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: cera, mlxsw, netdev, peter, bridge, idosch, davem

On 07/04/17 18:36, Stephen Hemminger wrote:
> On Fri, 7 Apr 2017 18:27:37 +0300
> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> 
>> On 07/04/17 18:22, Stephen Hemminger wrote:
>>> On Fri, 7 Apr 2017 17:19:48 +0300
>>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>>>   
>>>> On 07/04/17 17:10, Stephen Hemminger wrote:  
>>>>> On Fri, 7 Apr 2017 15:49:15 +0300
>>>>> <idosch@mellanox.com> wrote:
>>>>>     
>>>>>> From: Ido Schimmel <idosch@mellanox.com>
>>>>>>
>>>>>> Peter reported a kernel oops when executing the following command:
>>>>>>
>>>>>> $ ip link add name test type bridge vlan_default_pvid 1
>>>>>>
>>>>>> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
>>>>>> 0000000000000190
>>>>>> [13634.939436] IP: __vlan_add+0x73/0x5f0
>>>>>> [...]
>>>>>> [13634.939783] Call Trace:
>>>>>> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
>>>>>> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
>>>>>> [13634.939810]  ? br_vlan_add+0x135/0x1b0
>>>>>> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
>>>>>> [13634.939834]  ? br_changelink+0x120/0x4e0
>>>>>> [13634.939844]  ? br_dev_newlink+0x50/0x70
>>>>>> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
>>>>>> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
>>>>>> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>>>>>> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
>>>>>> [13634.939896]  ? lookup_fast+0x52/0x370
>>>>>> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
>>>>>> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
>>>>>> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
>>>>>> [13634.939934]  ? netlink_unicast+0x177/0x220
>>>>>> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
>>>>>> [13634.939954]  ? _copy_from_user+0x39/0x40
>>>>>> [13634.939964]  ? sock_sendmsg+0x30/0x40
>>>>>> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
>>>>>> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
>>>>>> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
>>>>>> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
>>>>>> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
>>>>>> [13634.940809]  ? __sys_sendmsg+0x51/0x90
>>>>>> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
>>>>>>
>>>>>> The problem is that the bridge's VLAN group is created after setting the
>>>>>> default PVID, when registering the netdevice and executing its
>>>>>> ndo_init().
>>>>>>
>>>>>> Fix this by changing the order of both operations, so that
>>>>>> br_changelink() is only processed after the netdevice is registered,
>>>>>> when the VLAN group is already initialized.
>>>>>>
>>>>>> The changelink() call is done on a best-effort basis since unregistering
>>>>>> the netdevice upon failure won't perform a proper cleanup due to a
>>>>>> missing ndo_uninit(), which I'll try to add for net-next.
>>>>>>
>>>>>> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
>>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>>>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
>>>>>> Reported-by: Peter V. Saveliev <peter@svinota.eu>
>>>>>> ---
>>>>>> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.    
>>>>>
>>>>> Although this patch fixes the OOPS it breaks all the error handling
>>>>> of br_changelink. If bad attributes are passed to newlink, you leave a
>>>>> broken partially configured bridge device. The code needs to cleanup
>>>>> and return the correct errno.
>>>>>     
>>>>
>>>> The cleanup would require adding ndo_uninit() and a much bigger churn
>>>> which doesn't seem okay for -net, it will be targetted at net-next.
>>>> The bridge can always be reconfigured as all of the options can be set
>>>> during runtime, so anything can be fixed, thus the best-effort changelink.
>>>>
>>>> If it is not desirable for -net then maybe we should just revert the
>>>> patch there altogether and make it again correctly with cleanup and so
>>>> on in net-next.
>>>>
>>>>
>>>>  
>>>
>>> Why not just add pointer validation in the PVID attribute parsing.
>>>   
>>
>> We cannot have the changelink() before the register  for many reasons,
>> first the vlan config will not be applied so all of the vlan options
>> won't get set even though they're passed, then the changelink() can
>> cause more trouble via the STP calls (f.e. br_stp_start) which can use
>> br->dev->ifindex (= 0 at that point), also can use br->dev->name (still
>> haven't passed validation and is uninitialized, i.e.
>> dev_get_valid_name() hasn't been called and we can have format
>> specifiers in it), multicast code also has some codepaths that will
>> cause various timers to get started...
>>
>> Moving changelink() after the register is much safer.
>>
> 
> Then just fix error handling. Why not call unregister?
> 

Right, because there's no ndo_uninit() and this will not cleanup the
bridge properly. That's the plan for net-next, reorg the code and add
ndo_uninit() for that reason.

From Ido's commit message above:
"The changelink() call is done on a best-effort basis since
unregistering the netdevice upon failure won't perform a proper cleanup
due to a missing ndo_uninit(), which I'll try to add for net-next."





^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net] bridge: netlink: register netdevice before executing changelink
  2017-04-07 15:43             ` [Bridge] " Nikolay Aleksandrov
@ 2017-04-07 17:37               ` Stephen Hemminger
  -1 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 17:37 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: cera, mlxsw, netdev, peter, bridge, idosch, davem

On Fri, 7 Apr 2017 18:43:06 +0300
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:

> On 07/04/17 18:36, Stephen Hemminger wrote:
> > On Fri, 7 Apr 2017 18:27:37 +0300
> > Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> >   
> >> On 07/04/17 18:22, Stephen Hemminger wrote:  
> >>> On Fri, 7 Apr 2017 17:19:48 +0300
> >>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> >>>     
> >>>> On 07/04/17 17:10, Stephen Hemminger wrote:    
> >>>>> On Fri, 7 Apr 2017 15:49:15 +0300
> >>>>> <idosch@mellanox.com> wrote:
> >>>>>       
> >>>>>> From: Ido Schimmel <idosch@mellanox.com>
> >>>>>>
> >>>>>> Peter reported a kernel oops when executing the following command:
> >>>>>>
> >>>>>> $ ip link add name test type bridge vlan_default_pvid 1
> >>>>>>
> >>>>>> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
> >>>>>> 0000000000000190
> >>>>>> [13634.939436] IP: __vlan_add+0x73/0x5f0
> >>>>>> [...]
> >>>>>> [13634.939783] Call Trace:
> >>>>>> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
> >>>>>> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
> >>>>>> [13634.939810]  ? br_vlan_add+0x135/0x1b0
> >>>>>> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
> >>>>>> [13634.939834]  ? br_changelink+0x120/0x4e0
> >>>>>> [13634.939844]  ? br_dev_newlink+0x50/0x70
> >>>>>> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
> >>>>>> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
> >>>>>> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >>>>>> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
> >>>>>> [13634.939896]  ? lookup_fast+0x52/0x370
> >>>>>> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
> >>>>>> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
> >>>>>> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
> >>>>>> [13634.939934]  ? netlink_unicast+0x177/0x220
> >>>>>> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
> >>>>>> [13634.939954]  ? _copy_from_user+0x39/0x40
> >>>>>> [13634.939964]  ? sock_sendmsg+0x30/0x40
> >>>>>> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
> >>>>>> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
> >>>>>> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >>>>>> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
> >>>>>> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
> >>>>>> [13634.940809]  ? __sys_sendmsg+0x51/0x90
> >>>>>> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
> >>>>>>
> >>>>>> The problem is that the bridge's VLAN group is created after setting the
> >>>>>> default PVID, when registering the netdevice and executing its
> >>>>>> ndo_init().
> >>>>>>
> >>>>>> Fix this by changing the order of both operations, so that
> >>>>>> br_changelink() is only processed after the netdevice is registered,
> >>>>>> when the VLAN group is already initialized.
> >>>>>>
> >>>>>> The changelink() call is done on a best-effort basis since unregistering
> >>>>>> the netdevice upon failure won't perform a proper cleanup due to a
> >>>>>> missing ndo_uninit(), which I'll try to add for net-next.
> >>>>>>
> >>>>>> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
> >>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> >>>>>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> >>>>>> Reported-by: Peter V. Saveliev <peter@svinota.eu>
> >>>>>> ---
> >>>>>> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.      
> >>>>>
> >>>>> Although this patch fixes the OOPS it breaks all the error handling
> >>>>> of br_changelink. If bad attributes are passed to newlink, you leave a
> >>>>> broken partially configured bridge device. The code needs to cleanup
> >>>>> and return the correct errno.
> >>>>>       
> >>>>
> >>>> The cleanup would require adding ndo_uninit() and a much bigger churn
> >>>> which doesn't seem okay for -net, it will be targetted at net-next.
> >>>> The bridge can always be reconfigured as all of the options can be set
> >>>> during runtime, so anything can be fixed, thus the best-effort changelink.
> >>>>
> >>>> If it is not desirable for -net then maybe we should just revert the
> >>>> patch there altogether and make it again correctly with cleanup and so
> >>>> on in net-next.
> >>>>
> >>>>
> >>>>    
> >>>
> >>> Why not just add pointer validation in the PVID attribute parsing.
> >>>     
> >>
> >> We cannot have the changelink() before the register  for many reasons,
> >> first the vlan config will not be applied so all of the vlan options
> >> won't get set even though they're passed, then the changelink() can
> >> cause more trouble via the STP calls (f.e. br_stp_start) which can use
> >> br->dev->ifindex (= 0 at that point), also can use br->dev->name (still
> >> haven't passed validation and is uninitialized, i.e.
> >> dev_get_valid_name() hasn't been called and we can have format
> >> specifiers in it), multicast code also has some codepaths that will
> >> cause various timers to get started...
> >>
> >> Moving changelink() after the register is much safer.
> >>  
> > 
> > Then just fix error handling. Why not call unregister?
> >   
> 
> Right, because there's no ndo_uninit() and this will not cleanup the
> bridge properly. That's the plan for net-next, reorg the code and add
> ndo_uninit() for that reason.
> 
> From Ido's commit message above:
> "The changelink() call is done on a best-effort basis since
> unregistering the netdevice upon failure won't perform a proper cleanup
> due to a missing ndo_uninit(), which I'll try to add for net-next."

Fix it right now. The patch you propose is too half baked.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Bridge] [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 17:37               ` Stephen Hemminger
  0 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 17:37 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: cera, mlxsw, netdev, peter, bridge, idosch, davem

On Fri, 7 Apr 2017 18:43:06 +0300
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:

> On 07/04/17 18:36, Stephen Hemminger wrote:
> > On Fri, 7 Apr 2017 18:27:37 +0300
> > Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> >   
> >> On 07/04/17 18:22, Stephen Hemminger wrote:  
> >>> On Fri, 7 Apr 2017 17:19:48 +0300
> >>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
> >>>     
> >>>> On 07/04/17 17:10, Stephen Hemminger wrote:    
> >>>>> On Fri, 7 Apr 2017 15:49:15 +0300
> >>>>> <idosch@mellanox.com> wrote:
> >>>>>       
> >>>>>> From: Ido Schimmel <idosch@mellanox.com>
> >>>>>>
> >>>>>> Peter reported a kernel oops when executing the following command:
> >>>>>>
> >>>>>> $ ip link add name test type bridge vlan_default_pvid 1
> >>>>>>
> >>>>>> [13634.939408] BUG: unable to handle kernel NULL pointer dereference at
> >>>>>> 0000000000000190
> >>>>>> [13634.939436] IP: __vlan_add+0x73/0x5f0
> >>>>>> [...]
> >>>>>> [13634.939783] Call Trace:
> >>>>>> [13634.939791]  ? pcpu_next_unpop+0x3b/0x50
> >>>>>> [13634.939801]  ? pcpu_alloc+0x3d2/0x680
> >>>>>> [13634.939810]  ? br_vlan_add+0x135/0x1b0
> >>>>>> [13634.939820]  ? __br_vlan_set_default_pvid.part.28+0x204/0x2b0
> >>>>>> [13634.939834]  ? br_changelink+0x120/0x4e0
> >>>>>> [13634.939844]  ? br_dev_newlink+0x50/0x70
> >>>>>> [13634.939854]  ? rtnl_newlink+0x5f5/0x8a0
> >>>>>> [13634.939864]  ? rtnl_newlink+0x176/0x8a0
> >>>>>> [13634.939874]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >>>>>> [13634.939886]  ? rtnetlink_rcv_msg+0xe1/0x220
> >>>>>> [13634.939896]  ? lookup_fast+0x52/0x370
> >>>>>> [13634.939905]  ? rtnl_newlink+0x8a0/0x8a0
> >>>>>> [13634.939915]  ? netlink_rcv_skb+0xa1/0xc0
> >>>>>> [13634.939925]  ? rtnetlink_rcv+0x24/0x30
> >>>>>> [13634.939934]  ? netlink_unicast+0x177/0x220
> >>>>>> [13634.939944]  ? netlink_sendmsg+0x2fe/0x3b0
> >>>>>> [13634.939954]  ? _copy_from_user+0x39/0x40
> >>>>>> [13634.939964]  ? sock_sendmsg+0x30/0x40
> >>>>>> [13634.940159]  ? ___sys_sendmsg+0x29d/0x2b0
> >>>>>> [13634.940326]  ? __alloc_pages_nodemask+0xdf/0x230
> >>>>>> [13634.940478]  ? mem_cgroup_commit_charge+0x7c/0x4e0
> >>>>>> [13634.940592]  ? mem_cgroup_try_charge+0x76/0x1a0
> >>>>>> [13634.940701]  ? __handle_mm_fault+0xdb9/0x10b0
> >>>>>> [13634.940809]  ? __sys_sendmsg+0x51/0x90
> >>>>>> [13634.940917]  ? entry_SYSCALL_64_fastpath+0x1e/0xad
> >>>>>>
> >>>>>> The problem is that the bridge's VLAN group is created after setting the
> >>>>>> default PVID, when registering the netdevice and executing its
> >>>>>> ndo_init().
> >>>>>>
> >>>>>> Fix this by changing the order of both operations, so that
> >>>>>> br_changelink() is only processed after the netdevice is registered,
> >>>>>> when the VLAN group is already initialized.
> >>>>>>
> >>>>>> The changelink() call is done on a best-effort basis since unregistering
> >>>>>> the netdevice upon failure won't perform a proper cleanup due to a
> >>>>>> missing ndo_uninit(), which I'll try to add for net-next.
> >>>>>>
> >>>>>> Fixes: b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")
> >>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> >>>>>> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> >>>>>> Reported-by: Peter V. Saveliev <peter@svinota.eu>
> >>>>>> ---
> >>>>>> Please consider this for 4.4.y, 4.9.y and 4.10.y as well.      
> >>>>>
> >>>>> Although this patch fixes the OOPS it breaks all the error handling
> >>>>> of br_changelink. If bad attributes are passed to newlink, you leave a
> >>>>> broken partially configured bridge device. The code needs to cleanup
> >>>>> and return the correct errno.
> >>>>>       
> >>>>
> >>>> The cleanup would require adding ndo_uninit() and a much bigger churn
> >>>> which doesn't seem okay for -net, it will be targetted at net-next.
> >>>> The bridge can always be reconfigured as all of the options can be set
> >>>> during runtime, so anything can be fixed, thus the best-effort changelink.
> >>>>
> >>>> If it is not desirable for -net then maybe we should just revert the
> >>>> patch there altogether and make it again correctly with cleanup and so
> >>>> on in net-next.
> >>>>
> >>>>
> >>>>    
> >>>
> >>> Why not just add pointer validation in the PVID attribute parsing.
> >>>     
> >>
> >> We cannot have the changelink() before the register  for many reasons,
> >> first the vlan config will not be applied so all of the vlan options
> >> won't get set even though they're passed, then the changelink() can
> >> cause more trouble via the STP calls (f.e. br_stp_start) which can use
> >> br->dev->ifindex (= 0 at that point), also can use br->dev->name (still
> >> haven't passed validation and is uninitialized, i.e.
> >> dev_get_valid_name() hasn't been called and we can have format
> >> specifiers in it), multicast code also has some codepaths that will
> >> cause various timers to get started...
> >>
> >> Moving changelink() after the register is much safer.
> >>  
> > 
> > Then just fix error handling. Why not call unregister?
> >   
> 
> Right, because there's no ndo_uninit() and this will not cleanup the
> bridge properly. That's the plan for net-next, reorg the code and add
> ndo_uninit() for that reason.
> 
> From Ido's commit message above:
> "The changelink() call is done on a best-effort basis since
> unregistering the netdevice upon failure won't perform a proper cleanup
> due to a missing ndo_uninit(), which I'll try to add for net-next."

Fix it right now. The patch you propose is too half baked.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net] bridge: netlink: register netdevice before executing changelink
  2017-04-07 17:37               ` [Bridge] " Stephen Hemminger
@ 2017-04-07 18:16                 ` Ido Schimmel
  -1 siblings, 0 replies; 22+ messages in thread
From: Ido Schimmel @ 2017-04-07 18:16 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: cera, mlxsw, Nikolay Aleksandrov, netdev, peter, bridge, idosch, davem

On Fri, Apr 07, 2017 at 01:37:53PM -0400, Stephen Hemminger wrote:
> Fix it right now. The patch you propose is too half baked.

Given this patch needs to be back ported to three different stable
kernels I tried to keep the amount of changes to a minimum while fixing
the bug.

I'll work on the ndo_uninit() patch and submit it. Then people can
decide which course of action they're more comfortable with.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Bridge] [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 18:16                 ` Ido Schimmel
  0 siblings, 0 replies; 22+ messages in thread
From: Ido Schimmel @ 2017-04-07 18:16 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: cera, mlxsw, Nikolay Aleksandrov, netdev, peter, bridge, idosch, davem

On Fri, Apr 07, 2017 at 01:37:53PM -0400, Stephen Hemminger wrote:
> Fix it right now. The patch you propose is too half baked.

Given this patch needs to be back ported to three different stable
kernels I tried to keep the amount of changes to a minimum while fixing
the bug.

I'll work on the ndo_uninit() patch and submit it. Then people can
decide which course of action they're more comfortable with.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net] bridge: netlink: register netdevice before executing changelink
  2017-04-07 18:16                 ` [Bridge] " Ido Schimmel
@ 2017-04-07 18:44                   ` Stephen Hemminger
  -1 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 18:44 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Nikolay Aleksandrov, idosch, netdev, davem, bridge, mlxsw, peter, cera

On Fri, 7 Apr 2017 21:16:49 +0300
Ido Schimmel <idosch@idosch.org> wrote:

> On Fri, Apr 07, 2017 at 01:37:53PM -0400, Stephen Hemminger wrote:
> > Fix it right now. The patch you propose is too half baked.  
> 
> Given this patch needs to be back ported to three different stable
> kernels I tried to keep the amount of changes to a minimum while fixing
> the bug.
> 
> I'll work on the ndo_uninit() patch and submit it. Then people can
> decide which course of action they're more comfortable with.

I don't want enterprise distros to ship without error reporting.
At a minimum just return correct errno and leave bridge in stuck/broken state.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Bridge] [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 18:44                   ` Stephen Hemminger
  0 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 18:44 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: cera, mlxsw, Nikolay Aleksandrov, netdev, peter, bridge, idosch, davem

On Fri, 7 Apr 2017 21:16:49 +0300
Ido Schimmel <idosch@idosch.org> wrote:

> On Fri, Apr 07, 2017 at 01:37:53PM -0400, Stephen Hemminger wrote:
> > Fix it right now. The patch you propose is too half baked.  
> 
> Given this patch needs to be back ported to three different stable
> kernels I tried to keep the amount of changes to a minimum while fixing
> the bug.
> 
> I'll work on the ndo_uninit() patch and submit it. Then people can
> decide which course of action they're more comfortable with.

I don't want enterprise distros to ship without error reporting.
At a minimum just return correct errno and leave bridge in stuck/broken state.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH net] bridge: netlink: register netdevice before executing changelink
  2017-04-07 18:16                 ` [Bridge] " Ido Schimmel
@ 2017-04-07 18:45                   ` Stephen Hemminger
  -1 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 18:45 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Nikolay Aleksandrov, idosch, netdev, davem, bridge, mlxsw, peter, cera

On Fri, 7 Apr 2017 21:16:49 +0300
Ido Schimmel <idosch@idosch.org> wrote:

> On Fri, Apr 07, 2017 at 01:37:53PM -0400, Stephen Hemminger wrote:
> > Fix it right now. The patch you propose is too half baked.  
> 
> Given this patch needs to be back ported to three different stable
> kernels I tried to keep the amount of changes to a minimum while fixing
> the bug.
> 
> I'll work on the ndo_uninit() patch and submit it. Then people can
> decide which course of action they're more comfortable with.

Other option for stable is to revert b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Bridge] [PATCH net] bridge: netlink: register netdevice before executing changelink
@ 2017-04-07 18:45                   ` Stephen Hemminger
  0 siblings, 0 replies; 22+ messages in thread
From: Stephen Hemminger @ 2017-04-07 18:45 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: cera, mlxsw, Nikolay Aleksandrov, netdev, peter, bridge, idosch, davem

On Fri, 7 Apr 2017 21:16:49 +0300
Ido Schimmel <idosch@idosch.org> wrote:

> On Fri, Apr 07, 2017 at 01:37:53PM -0400, Stephen Hemminger wrote:
> > Fix it right now. The patch you propose is too half baked.  
> 
> Given this patch needs to be back ported to three different stable
> kernels I tried to keep the amount of changes to a minimum while fixing
> the bug.
> 
> I'll work on the ndo_uninit() patch and submit it. Then people can
> decide which course of action they're more comfortable with.

Other option for stable is to revert b6677449dff6 ("bridge: netlink: call br_changelink() during br_dev_newlink()")

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-04-07 18:45 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-04-07 12:49 [PATCH net] bridge: netlink: register netdevice before executing changelink idosch
2017-04-07 12:49 ` [Bridge] " idosch
2017-04-07 14:10 ` Stephen Hemminger
2017-04-07 14:10   ` [Bridge] " Stephen Hemminger
2017-04-07 14:19   ` Nikolay Aleksandrov via Bridge
2017-04-07 14:19     ` [Bridge] " Nikolay Aleksandrov
2017-04-07 15:22     ` Stephen Hemminger
2017-04-07 15:22       ` [Bridge] " Stephen Hemminger
2017-04-07 15:27       ` Nikolay Aleksandrov
2017-04-07 15:27         ` [Bridge] " Nikolay Aleksandrov
2017-04-07 15:36         ` Stephen Hemminger
2017-04-07 15:36           ` [Bridge] " Stephen Hemminger
2017-04-07 15:43           ` Nikolay Aleksandrov via Bridge
2017-04-07 15:43             ` [Bridge] " Nikolay Aleksandrov
2017-04-07 17:37             ` Stephen Hemminger
2017-04-07 17:37               ` [Bridge] " Stephen Hemminger
2017-04-07 18:16               ` Ido Schimmel
2017-04-07 18:16                 ` [Bridge] " Ido Schimmel
2017-04-07 18:44                 ` Stephen Hemminger
2017-04-07 18:44                   ` [Bridge] " Stephen Hemminger
2017-04-07 18:45                 ` Stephen Hemminger
2017-04-07 18:45                   ` [Bridge] " Stephen Hemminger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.