* [BUG] [FIX] net: dsa: oops in br_vlan_enabled
@ 2019-02-16 16:22 Frank Wunderlich
2019-02-17 17:13 ` Florian Fainelli
0 siblings, 1 reply; 6+ messages in thread
From: Frank Wunderlich @ 2019-02-16 16:22 UTC (permalink / raw)
To: netdev
Hi,
i've found an oops in 4.19.23/10, seems to be fixed anyhow in 5.0 (also works in 4.14.101)
root@bpi-r2:~# ip link add link lan0 name lan0.5 type vlan id 5
root@bpi-r2:~# ip addr add 192.168.5.200/24 brd 192.168.5.255 dev lan0.5
root@bpi-r2:~# ip link set dev lan0 up
root@bpi-r2:~# ip link set dev lan0.5 up
12: lan0.5@lan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff
inet 192.168.5.200/24 brd 192.168.5.255 scope global lan0.5
valid_lft forever preferred_lft forever
root@bpi-r2:~# brctl addbr bridge_name
root@bpi-r2:~# brctl addif bridge_name lan0.5
[ 352.057128] bridge_name: port 1(lan0.5) entered blocking state
[ 352.063065] bridge_name: port 1(lan0.5) entered disabled state
[ 352.069181] device lan0.5 entered promiscuous mode
[ 352.074018] device lan0 entered promiscuous mode
[ 352.078906] Unable to handle kernel NULL pointer dereference at virtual address 00000558
...
[ 352.493085] [<bf0fde88>] (br_vlan_enabled [bridge]) from [<bf12c234>] (dsa_port_vlan_add+0x60/0xbc [dsa_core])
[ 352.503050] [<bf12c234>] (dsa_port_vlan_add [dsa_core]) from [<bf12cb64>] (dsa_slave_port_obj_add+0x4c/0x50 [dsa_core])
[ 352.513776] [<bf12cb64>] (dsa_slave_port_obj_add [dsa_core]) from [<c0b4e2d4>] (__switchdev_port_obj_add+0x50/0xc4)
[ 352.524138] [<c0b4e2d4>] (__switchdev_port_obj_add) from [<c0b4e324>] (__switchdev_port_obj_add+0xa0/0xc4)
[ 352.533721] [<c0b4e324>] (__switchdev_port_obj_add) from [<c0b4e3a8>] (switchdev_port_obj_add_now+0x60/0x130)
[ 352.543562] [<c0b4e3a8>] (switchdev_port_obj_add_now) from [<c0b4e7e4>] (switchdev_port_obj_add+0x44/0x190)
[ 352.553284] [<c0b4e7e4>] (switchdev_port_obj_add) from [<bf1013d0>] (br_switchdev_port_vlan_add+0x60/0x7c [bridge])
[ 352.563733] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from [<bf0ff250>] (__vlan_add+0xb0/0x620 [bridge])
[ 352.574007] [<bf0ff250>] (__vlan_add [bridge]) from [<bf0ffd04>] (nbp_vlan_add+0xc4/0x150 [bridge])
[ 352.583073] [<bf0ffd04>] (nbp_vlan_add [bridge]) from [<bf0ffec4>] (nbp_vlan_init+0x134/0x164 [bridge])
[ 352.592482] [<bf0ffec4>] (nbp_vlan_init [bridge]) from [<bf0edd4c>] (br_add_if+0x40c/0x5fc [bridge])
[ 352.601632] [<bf0edd4c>] (br_add_if [bridge]) from [<bf0eeb14>] (add_del_if+0x6c/0x80 [bridge])
[ 352.610351] [<bf0eeb14>] (add_del_if [bridge]) from [<bf0ef5b0>] (br_dev_ioctl+0x7c/0x9c [bridge])
[ 352.619290] [<bf0ef5b0>] (br_dev_ioctl [bridge]) from [<c09583d4>] (dev_ifsioc+0x184/0x324)
[ 352.627582] [<c09583d4>] (dev_ifsioc) from [<c09589e8>] (dev_ioctl+0x32c/0x5cc)
[ 352.634837] [<c09589e8>] (dev_ioctl) from [<c090913c>] (sock_ioctl+0x3bc/0x580)
since my 4.19.23 kernel is modified a bit i tried with 4.19.10 without my net modifications and it is still reproducable with steps above (create a vlan on dsa-user-port and then use it in a bridge)
i fixed it with these changes:
diff --git a/net/dsa/port.c b/net/dsa/port.c
index ed0595459df1..962887752ae8 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -255,8 +255,9 @@ int dsa_port_vlan_add(struct dsa_port *dp,
if (netif_is_bridge_master(vlan->obj.orig_dev))
return -EOPNOTSUPP;
- if (br_vlan_enabled(dp->bridge_dev))
- return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_ADD, &info);
+ printk(KERN_ALERT "DEBUG: Passed %s %d 0x%x \n",__FUNCTION__,__LINE__,(unsigned int)dp->bridge_dev);
+ if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev))
+ return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info);
return 0;
}
@@ -273,7 +274,7 @@ int dsa_port_vlan_del(struct dsa_port *dp,
if (netif_is_bridge_master(vlan->obj.orig_dev))
return -EOPNOTSUPP;
- if (br_vlan_enabled(dp->bridge_dev))
+ if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev))
return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info);
return 0;
i've found in a Patch from florian/vivien: https://www.mail-archive.com/netdev@vger.kernel.org/msg281415.html
Strange that 5.0-rc1 does not crash,because these 2 code-sections are unchanged: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/dsa/port.c#n255 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/bridge/br_vlan.c#n788
maybe you know why only 4.19 is affected...
regards Frank
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [BUG] [FIX] net: dsa: oops in br_vlan_enabled
2019-02-16 16:22 [BUG] [FIX] net: dsa: oops in br_vlan_enabled Frank Wunderlich
@ 2019-02-17 17:13 ` Florian Fainelli
2019-02-17 23:20 ` Florian Fainelli
0 siblings, 1 reply; 6+ messages in thread
From: Florian Fainelli @ 2019-02-17 17:13 UTC (permalink / raw)
To: Frank Wunderlich, netdev, Andrew Lunn, Vivien Didelot
Hi Frank,
On 2/16/2019 8:22 AM, Frank Wunderlich wrote:
> Hi,
>
> i've found an oops in 4.19.23/10, seems to be fixed anyhow in 5.0 (also works in 4.14.101)
>
> root@bpi-r2:~# ip link add link lan0 name lan0.5 type vlan id 5
> root@bpi-r2:~# ip addr add 192.168.5.200/24 brd 192.168.5.255 dev lan0.5
> root@bpi-r2:~# ip link set dev lan0 up
> root@bpi-r2:~# ip link set dev lan0.5 up
So that these steps don't involve a bridge, and because we don't (yet)
implment ndo_vlan_rx_{add,kill}_vid() netdevice_ops, there is no VLAN
programming, it's all software
>
> 12: lan0.5@lan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
> link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff
> inet 192.168.5.200/24 brd 192.168.5.255 scope global lan0.5
> valid_lft forever preferred_lft forever
>
> root@bpi-r2:~# brctl addbr bridge_name
> root@bpi-r2:~# brctl addif bridge_name lan0.5
Unless you changed the bridge to have VLAN filtering/awareness with:
echo 1 > /sys/class/net/bridge_name/bridge/vlan_filtering
There will be no VLAN configuration/objects pushed to DSA since commit
2ea7a679ca2abd251c1ec03f20508619707e1749 ("net: dsa: Don't add vlans
when vlan filtering is disabled") so I am not sure how you got into that
situation because prior to pushing a VLAN object, the port must be part
of a bridge, so the steps typically look like:
- port_bridge_join which assigns dp->bridge_dev
- port_vlan_add
Does your 4.19.23 kernel somehow change how VLAN objects are pushed down
the switch driver?
> [ 352.057128] bridge_name: port 1(lan0.5) entered blocking state
> [ 352.063065] bridge_name: port 1(lan0.5) entered disabled state
> [ 352.069181] device lan0.5 entered promiscuous mode
> [ 352.074018] device lan0 entered promiscuous mode
> [ 352.078906] Unable to handle kernel NULL pointer dereference at virtual address 00000558
> ...
> [ 352.493085] [<bf0fde88>] (br_vlan_enabled [bridge]) from [<bf12c234>] (dsa_port_vlan_add+0x60/0xbc [dsa_core])
> [ 352.503050] [<bf12c234>] (dsa_port_vlan_add [dsa_core]) from [<bf12cb64>] (dsa_slave_port_obj_add+0x4c/0x50 [dsa_core])
> [ 352.513776] [<bf12cb64>] (dsa_slave_port_obj_add [dsa_core]) from [<c0b4e2d4>] (__switchdev_port_obj_add+0x50/0xc4)
> [ 352.524138] [<c0b4e2d4>] (__switchdev_port_obj_add) from [<c0b4e324>] (__switchdev_port_obj_add+0xa0/0xc4)
> [ 352.533721] [<c0b4e324>] (__switchdev_port_obj_add) from [<c0b4e3a8>] (switchdev_port_obj_add_now+0x60/0x130)
> [ 352.543562] [<c0b4e3a8>] (switchdev_port_obj_add_now) from [<c0b4e7e4>] (switchdev_port_obj_add+0x44/0x190)
> [ 352.553284] [<c0b4e7e4>] (switchdev_port_obj_add) from [<bf1013d0>] (br_switchdev_port_vlan_add+0x60/0x7c [bridge])
> [ 352.563733] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from [<bf0ff250>] (__vlan_add+0xb0/0x620 [bridge])
> [ 352.574007] [<bf0ff250>] (__vlan_add [bridge]) from [<bf0ffd04>] (nbp_vlan_add+0xc4/0x150 [bridge])
> [ 352.583073] [<bf0ffd04>] (nbp_vlan_add [bridge]) from [<bf0ffec4>] (nbp_vlan_init+0x134/0x164 [bridge])
> [ 352.592482] [<bf0ffec4>] (nbp_vlan_init [bridge]) from [<bf0edd4c>] (br_add_if+0x40c/0x5fc [bridge])
> [ 352.601632] [<bf0edd4c>] (br_add_if [bridge]) from [<bf0eeb14>] (add_del_if+0x6c/0x80 [bridge])
> [ 352.610351] [<bf0eeb14>] (add_del_if [bridge]) from [<bf0ef5b0>] (br_dev_ioctl+0x7c/0x9c [bridge])
> [ 352.619290] [<bf0ef5b0>] (br_dev_ioctl [bridge]) from [<c09583d4>] (dev_ifsioc+0x184/0x324)
> [ 352.627582] [<c09583d4>] (dev_ifsioc) from [<c09589e8>] (dev_ioctl+0x32c/0x5cc)
> [ 352.634837] [<c09589e8>] (dev_ioctl) from [<c090913c>] (sock_ioctl+0x3bc/0x580)
>
>
> since my 4.19.23 kernel is modified a bit i tried with 4.19.10 without my net modifications and it is still reproducable with steps above (create a vlan on dsa-user-port and then use it in a bridge)
>
> i fixed it with these changes:
>
> diff --git a/net/dsa/port.c b/net/dsa/port.c
> index ed0595459df1..962887752ae8 100644
> --- a/net/dsa/port.c
> +++ b/net/dsa/port.c
> @@ -255,8 +255,9 @@ int dsa_port_vlan_add(struct dsa_port *dp,
> if (netif_is_bridge_master(vlan->obj.orig_dev))
> return -EOPNOTSUPP;
>
> - if (br_vlan_enabled(dp->bridge_dev))
> - return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_ADD, &info);
> + printk(KERN_ALERT "DEBUG: Passed %s %d 0x%x \n",__FUNCTION__,__LINE__,(unsigned int)dp->bridge_dev);
> + if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev))
> + return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info);
>
> return 0;
> }
> @@ -273,7 +274,7 @@ int dsa_port_vlan_del(struct dsa_port *dp,
> if (netif_is_bridge_master(vlan->obj.orig_dev))
> return -EOPNOTSUPP;
>
> - if (br_vlan_enabled(dp->bridge_dev))
> + if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev))
> return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info);
>
> return 0;
>
> i've found in a Patch from florian/vivien: https://www.mail-archive.com/netdev@vger.kernel.org/msg281415.html
>
> Strange that 5.0-rc1 does not crash,because these 2 code-sections are unchanged: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/dsa/port.c#n255 https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/bridge/br_vlan.c#n788
>
> maybe you know why only 4.19 is affected...
>
> regards Frank
>
--
Florian
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [BUG] [FIX] net: dsa: oops in br_vlan_enabled
2019-02-17 17:13 ` Florian Fainelli
@ 2019-02-17 23:20 ` Florian Fainelli
2019-02-18 17:11 ` Aw: " Frank Wunderlich
0 siblings, 1 reply; 6+ messages in thread
From: Florian Fainelli @ 2019-02-17 23:20 UTC (permalink / raw)
To: Frank Wunderlich, netdev, Andrew Lunn, Vivien Didelot
[-- Attachment #1: Type: text/plain, Size: 5020 bytes --]
On 2/17/2019 9:13 AM, Florian Fainelli wrote:
> Hi Frank,
>
> On 2/16/2019 8:22 AM, Frank Wunderlich wrote:
>> Hi,
>>
>> i've found an oops in 4.19.23/10, seems to be fixed anyhow in 5.0 (also works in 4.14.101)
>>
>> root@bpi-r2:~# ip link add link lan0 name lan0.5 type vlan id 5
>> root@bpi-r2:~# ip addr add 192.168.5.200/24 brd 192.168.5.255 dev lan0.5
>> root@bpi-r2:~# ip link set dev lan0 up
>> root@bpi-r2:~# ip link set dev lan0.5 up
>
> So that these steps don't involve a bridge, and because we don't (yet)
> implment ndo_vlan_rx_{add,kill}_vid() netdevice_ops, there is no VLAN
> programming, it's all software
>
>>
>> 12: lan0.5@lan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
>> link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff
>> inet 192.168.5.200/24 brd 192.168.5.255 scope global lan0.5
>> valid_lft forever preferred_lft forever
>>
>> root@bpi-r2:~# brctl addbr bridge_name
>> root@bpi-r2:~# brctl addif bridge_name lan0.5
>
> Unless you changed the bridge to have VLAN filtering/awareness with:
>
> echo 1 > /sys/class/net/bridge_name/bridge/vlan_filtering
>
> There will be no VLAN configuration/objects pushed to DSA since commit
> 2ea7a679ca2abd251c1ec03f20508619707e1749 ("net: dsa: Don't add vlans
> when vlan filtering is disabled") so I am not sure how you got into that
> situation because prior to pushing a VLAN object, the port must be part
> of a bridge, so the steps typically look like:
>
> - port_bridge_join which assigns dp->bridge_dev
> - port_vlan_add
>
> Does your 4.19.23 kernel somehow change how VLAN objects are pushed down
> the switch driver?
>
>> [ 352.057128] bridge_name: port 1(lan0.5) entered blocking state
>> [ 352.063065] bridge_name: port 1(lan0.5) entered disabled state
>> [ 352.069181] device lan0.5 entered promiscuous mode
>> [ 352.074018] device lan0 entered promiscuous mode
>> [ 352.078906] Unable to handle kernel NULL pointer dereference at virtual address 00000558
>> ...
>> [ 352.493085] [<bf0fde88>] (br_vlan_enabled [bridge]) from [<bf12c234>] (dsa_port_vlan_add+0x60/0xbc [dsa_core])
>> [ 352.503050] [<bf12c234>] (dsa_port_vlan_add [dsa_core]) from [<bf12cb64>] (dsa_slave_port_obj_add+0x4c/0x50 [dsa_core])
>> [ 352.513776] [<bf12cb64>] (dsa_slave_port_obj_add [dsa_core]) from [<c0b4e2d4>] (__switchdev_port_obj_add+0x50/0xc4)
>> [ 352.524138] [<c0b4e2d4>] (__switchdev_port_obj_add) from [<c0b4e324>] (__switchdev_port_obj_add+0xa0/0xc4)
>> [ 352.533721] [<c0b4e324>] (__switchdev_port_obj_add) from [<c0b4e3a8>] (switchdev_port_obj_add_now+0x60/0x130)
>> [ 352.543562] [<c0b4e3a8>] (switchdev_port_obj_add_now) from [<c0b4e7e4>] (switchdev_port_obj_add+0x44/0x190)
>> [ 352.553284] [<c0b4e7e4>] (switchdev_port_obj_add) from [<bf1013d0>] (br_switchdev_port_vlan_add+0x60/0x7c [bridge])
>> [ 352.563733] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from [<bf0ff250>] (__vlan_add+0xb0/0x620 [bridge])
>> [ 352.574007] [<bf0ff250>] (__vlan_add [bridge]) from [<bf0ffd04>] (nbp_vlan_add+0xc4/0x150 [bridge])
>> [ 352.583073] [<bf0ffd04>] (nbp_vlan_add [bridge]) from [<bf0ffec4>] (nbp_vlan_init+0x134/0x164 [bridge])
>> [ 352.592482] [<bf0ffec4>] (nbp_vlan_init [bridge]) from [<bf0edd4c>] (br_add_if+0x40c/0x5fc [bridge])
>> [ 352.601632] [<bf0edd4c>] (br_add_if [bridge]) from [<bf0eeb14>] (add_del_if+0x6c/0x80 [bridge])
>> [ 352.610351] [<bf0eeb14>] (add_del_if [bridge]) from [<bf0ef5b0>] (br_dev_ioctl+0x7c/0x9c [bridge])
>> [ 352.619290] [<bf0ef5b0>] (br_dev_ioctl [bridge]) from [<c09583d4>] (dev_ifsioc+0x184/0x324)
>> [ 352.627582] [<c09583d4>] (dev_ifsioc) from [<c09589e8>] (dev_ioctl+0x32c/0x5cc)
>> [ 352.634837] [<c09589e8>] (dev_ioctl) from [<c090913c>] (sock_ioctl+0x3bc/0x580)
>>
>>
>> since my 4.19.23 kernel is modified a bit i tried with 4.19.10 without my net modifications and it is still reproducable with steps above (create a vlan on dsa-user-port and then use it in a bridge)
>>
>> i fixed it with these changes:
Your fix is undoing the commit from Andrew I just referenced, so it is
definitively not the right fix because it will push VLAN objects down to
a non-VLAN aware bridge, not that this is really a problem, but it
should not be happening anyway.
The problem appears to be the following though: you are enslaving a VLAN
device, which does not have switchdev_ops, so we recurse into the lower
device, which is the DSA network device, which does have switchdev_ops
defined. Once we are there we check the orig_dev against being a bridge
master network device, but we are not checking that it's a VLAN device
so we end-up assuming it is a DSA network device, we de-reference
garbage by checking br_vlan_enabled(dp->bridge_dev) because there is
simply no such data structure associated with the VLAN device.
The attached patch should help.
This is no longer a problem in newer kernels because the switchdev
operations use a notifier which checks the target network device to be DSA.
--
Florian
[-- Attachment #2: 0001-net-dsa-Prevent-oops-while-enslaving-non-DSA-devices.patch --]
[-- Type: text/plain, Size: 2684 bytes --]
From c0c60a1d1dc451a51136867b51df6a4ec34e336b Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Sun, 17 Feb 2019 15:16:22 -0800
Subject: [PATCH] net: dsa: Prevent oops while enslaving non-DSA devices
DSA currently does not check that the target network device of a switchdev operation is actually a DSA slave network device
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
net/dsa/slave.c | 34 ++++++++++++++++++++++++++++------
1 file changed, 28 insertions(+), 6 deletions(-)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 1c45c1d6d241..3ceef299b030 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -278,9 +278,14 @@ static int dsa_slave_port_attr_set(struct net_device *dev,
const struct switchdev_attr *attr,
struct switchdev_trans *trans)
{
- struct dsa_port *dp = dsa_slave_to_port(dev);
+ struct dsa_port *dp;
int ret;
+ if (!dsa_slave_dev_check(dev))
+ return -EOPNOTSUPP;
+
+ dp = dsa_slave_to_port(dev);
+
switch (attr->id) {
case SWITCHDEV_ATTR_ID_PORT_STP_STATE:
ret = dsa_port_set_state(dp, attr->u.stp_state, trans);
@@ -304,9 +309,14 @@ static int dsa_slave_port_obj_add(struct net_device *dev,
const struct switchdev_obj *obj,
struct switchdev_trans *trans)
{
- struct dsa_port *dp = dsa_slave_to_port(dev);
+ struct dsa_port *dp;
int err;
+ if (!dsa_slave_dev_check(dev))
+ return -EOPNOTSUPP;
+
+ dp = dsa_slave_to_port(dev);
+
/* For the prepare phase, ensure the full set of changes is feasable in
* one go in order to signal a failure properly. If an operation is not
* supported, return -EOPNOTSUPP.
@@ -338,9 +348,14 @@ static int dsa_slave_port_obj_add(struct net_device *dev,
static int dsa_slave_port_obj_del(struct net_device *dev,
const struct switchdev_obj *obj)
{
- struct dsa_port *dp = dsa_slave_to_port(dev);
+ struct dsa_port *dp;
int err;
+ if (!dsa_slave_dev_check(dev))
+ return -EOPNOTSUPP;
+
+ dp = dsa_slave_to_port(dev);
+
switch (obj->id) {
case SWITCHDEV_OBJ_ID_PORT_MDB:
err = dsa_port_mdb_del(dp, SWITCHDEV_OBJ_PORT_MDB(obj));
@@ -365,9 +380,16 @@ static int dsa_slave_port_obj_del(struct net_device *dev,
static int dsa_slave_port_attr_get(struct net_device *dev,
struct switchdev_attr *attr)
{
- struct dsa_port *dp = dsa_slave_to_port(dev);
- struct dsa_switch *ds = dp->ds;
- struct dsa_switch_tree *dst = ds->dst;
+ struct dsa_switch_tree *dst;
+ struct dsa_switch *ds;
+ struct dsa_port *dp;
+
+ if (!dsa_slave_dev_check(dev))
+ return -EOPNOTSUPP;
+
+ dp = dsa_slave_to_port(dev);
+ ds = dp->ds;
+ dst = ds->dst;
switch (attr->id) {
case SWITCHDEV_ATTR_ID_PORT_PARENT_ID:
--
2.17.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Aw: Re: [BUG] [FIX] net: dsa: oops in br_vlan_enabled
2019-02-17 23:20 ` Florian Fainelli
@ 2019-02-18 17:11 ` Frank Wunderlich
2019-02-18 19:09 ` Florian Fainelli
0 siblings, 1 reply; 6+ messages in thread
From: Frank Wunderlich @ 2019-02-18 17:11 UTC (permalink / raw)
To: Florian Fainelli; +Cc: netdev, Andrew Lunn, Vivien Didelot
Hi,
tried your Patch but crashes the same way:
[ 107.416972] [<bf0fde88>] (br_vlan_enabled [bridge]) from [<bf14e234>] (dsa_po
rt_vlan_add+0x60/0xbc [dsa_core])
[ 107.426939] [<bf14e234>] (dsa_port_vlan_add [dsa_core]) from [<bf14eba4>] (ds
a_slave_port_obj_add+0x64/0x68 [dsa_core])
[ 107.437666] [<bf14eba4>] (dsa_slave_port_obj_add [dsa_core]) from [<c0b4e684>
] (__switchdev_port_obj_add+0x50/0xc4)
[ 107.448029] [<c0b4e684>] (__switchdev_port_obj_add) from [<c0b4e6d4>] (__swit
chdev_port_obj_add+0xa0/0xc4)
[ 107.457612] [<c0b4e6d4>] (__switchdev_port_obj_add) from [<c0b4e758>] (switch
dev_port_obj_add_now+0x60/0x130)
[ 107.467453] [<c0b4e758>] (switchdev_port_obj_add_now) from [<c0b4eb94>] (swit
chdev_port_obj_add+0x44/0x190)
[ 107.477181] [<c0b4eb94>] (switchdev_port_obj_add) from [<bf1013d0>] (br_switc
hdev_port_vlan_add+0x60/0x7c [bridge])
[ 107.487644] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from [<bf0ff25
0>] (__vlan_add+0xb0/0x620 [bridge])
regards Frank
> Gesendet: Montag, 18. Februar 2019 um 00:20 Uhr
> Von: "Florian Fainelli" <f.fainelli@gmail.com>
> An: "Frank Wunderlich" <frank-w@public-files.de>, netdev@vger.kernel.org, "Andrew Lunn" <andrew@lunn.ch>, "Vivien Didelot" <vivien.didelot@gmail.com>
> Betreff: Re: [BUG] [FIX] net: dsa: oops in br_vlan_enabled
>
>
>
> On 2/17/2019 9:13 AM, Florian Fainelli wrote:
> > Hi Frank,
> >
> > On 2/16/2019 8:22 AM, Frank Wunderlich wrote:
> >> Hi,
> >>
> >> i've found an oops in 4.19.23/10, seems to be fixed anyhow in 5.0 (also works in 4.14.101)
> >>
> >> root@bpi-r2:~# ip link add link lan0 name lan0.5 type vlan id 5
> >> root@bpi-r2:~# ip addr add 192.168.5.200/24 brd 192.168.5.255 dev lan0.5
> >> root@bpi-r2:~# ip link set dev lan0 up
> >> root@bpi-r2:~# ip link set dev lan0.5 up
> >
> > So that these steps don't involve a bridge, and because we don't (yet)
> > implment ndo_vlan_rx_{add,kill}_vid() netdevice_ops, there is no VLAN
> > programming, it's all software
> >
> >>
> >> 12: lan0.5@lan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
> >> link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff
> >> inet 192.168.5.200/24 brd 192.168.5.255 scope global lan0.5
> >> valid_lft forever preferred_lft forever
> >>
> >> root@bpi-r2:~# brctl addbr bridge_name
> >> root@bpi-r2:~# brctl addif bridge_name lan0.5
> >
> > Unless you changed the bridge to have VLAN filtering/awareness with:
> >
> > echo 1 > /sys/class/net/bridge_name/bridge/vlan_filtering
> >
> > There will be no VLAN configuration/objects pushed to DSA since commit
> > 2ea7a679ca2abd251c1ec03f20508619707e1749 ("net: dsa: Don't add vlans
> > when vlan filtering is disabled") so I am not sure how you got into that
> > situation because prior to pushing a VLAN object, the port must be part
> > of a bridge, so the steps typically look like:
> >
> > - port_bridge_join which assigns dp->bridge_dev
> > - port_vlan_add
> >
> > Does your 4.19.23 kernel somehow change how VLAN objects are pushed down
> > the switch driver?
> >
> >> [ 352.057128] bridge_name: port 1(lan0.5) entered blocking state
> >> [ 352.063065] bridge_name: port 1(lan0.5) entered disabled state
> >> [ 352.069181] device lan0.5 entered promiscuous mode
> >> [ 352.074018] device lan0 entered promiscuous mode
> >> [ 352.078906] Unable to handle kernel NULL pointer dereference at virtual address 00000558
> >> ...
> >> [ 352.493085] [<bf0fde88>] (br_vlan_enabled [bridge]) from [<bf12c234>] (dsa_port_vlan_add+0x60/0xbc [dsa_core])
> >> [ 352.503050] [<bf12c234>] (dsa_port_vlan_add [dsa_core]) from [<bf12cb64>] (dsa_slave_port_obj_add+0x4c/0x50 [dsa_core])
> >> [ 352.513776] [<bf12cb64>] (dsa_slave_port_obj_add [dsa_core]) from [<c0b4e2d4>] (__switchdev_port_obj_add+0x50/0xc4)
> >> [ 352.524138] [<c0b4e2d4>] (__switchdev_port_obj_add) from [<c0b4e324>] (__switchdev_port_obj_add+0xa0/0xc4)
> >> [ 352.533721] [<c0b4e324>] (__switchdev_port_obj_add) from [<c0b4e3a8>] (switchdev_port_obj_add_now+0x60/0x130)
> >> [ 352.543562] [<c0b4e3a8>] (switchdev_port_obj_add_now) from [<c0b4e7e4>] (switchdev_port_obj_add+0x44/0x190)
> >> [ 352.553284] [<c0b4e7e4>] (switchdev_port_obj_add) from [<bf1013d0>] (br_switchdev_port_vlan_add+0x60/0x7c [bridge])
> >> [ 352.563733] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from [<bf0ff250>] (__vlan_add+0xb0/0x620 [bridge])
> >> [ 352.574007] [<bf0ff250>] (__vlan_add [bridge]) from [<bf0ffd04>] (nbp_vlan_add+0xc4/0x150 [bridge])
> >> [ 352.583073] [<bf0ffd04>] (nbp_vlan_add [bridge]) from [<bf0ffec4>] (nbp_vlan_init+0x134/0x164 [bridge])
> >> [ 352.592482] [<bf0ffec4>] (nbp_vlan_init [bridge]) from [<bf0edd4c>] (br_add_if+0x40c/0x5fc [bridge])
> >> [ 352.601632] [<bf0edd4c>] (br_add_if [bridge]) from [<bf0eeb14>] (add_del_if+0x6c/0x80 [bridge])
> >> [ 352.610351] [<bf0eeb14>] (add_del_if [bridge]) from [<bf0ef5b0>] (br_dev_ioctl+0x7c/0x9c [bridge])
> >> [ 352.619290] [<bf0ef5b0>] (br_dev_ioctl [bridge]) from [<c09583d4>] (dev_ifsioc+0x184/0x324)
> >> [ 352.627582] [<c09583d4>] (dev_ifsioc) from [<c09589e8>] (dev_ioctl+0x32c/0x5cc)
> >> [ 352.634837] [<c09589e8>] (dev_ioctl) from [<c090913c>] (sock_ioctl+0x3bc/0x580)
> >>
> >>
> >> since my 4.19.23 kernel is modified a bit i tried with 4.19.10 without my net modifications and it is still reproducable with steps above (create a vlan on dsa-user-port and then use it in a bridge)
> >>
> >> i fixed it with these changes:
>
> Your fix is undoing the commit from Andrew I just referenced, so it is
> definitively not the right fix because it will push VLAN objects down to
> a non-VLAN aware bridge, not that this is really a problem, but it
> should not be happening anyway.
>
> The problem appears to be the following though: you are enslaving a VLAN
> device, which does not have switchdev_ops, so we recurse into the lower
> device, which is the DSA network device, which does have switchdev_ops
> defined. Once we are there we check the orig_dev against being a bridge
> master network device, but we are not checking that it's a VLAN device
> so we end-up assuming it is a DSA network device, we de-reference
> garbage by checking br_vlan_enabled(dp->bridge_dev) because there is
> simply no such data structure associated with the VLAN device.
>
> The attached patch should help.
>
> This is no longer a problem in newer kernels because the switchdev
> operations use a notifier which checks the target network device to be DSA.
> --
> Florian
>
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [BUG] [FIX] net: dsa: oops in br_vlan_enabled
2019-02-18 17:11 ` Aw: " Frank Wunderlich
@ 2019-02-18 19:09 ` Florian Fainelli
0 siblings, 0 replies; 6+ messages in thread
From: Florian Fainelli @ 2019-02-18 19:09 UTC (permalink / raw)
To: Frank Wunderlich; +Cc: netdev, Andrew Lunn, Vivien Didelot
On February 18, 2019 9:11:27 AM PST, Frank Wunderlich <frank-w@public-files.de> wrote:
>Hi,
>
>tried your Patch but crashes the same way:
Yes the lower_dev targeted by switched does point to the DSA slave network device, but we really don't have dp->bridge_dev assigned since the physical DSA port was not enslaved in the bridge. Will follow up with a correct version late today.
>
>[ 107.416972] [<bf0fde88>] (br_vlan_enabled [bridge]) from
>[<bf14e234>] (dsa_po
>rt_vlan_add+0x60/0xbc [dsa_core])
>
>[ 107.426939] [<bf14e234>] (dsa_port_vlan_add [dsa_core]) from
>[<bf14eba4>] (ds
>a_slave_port_obj_add+0x64/0x68 [dsa_core])
>
>[ 107.437666] [<bf14eba4>] (dsa_slave_port_obj_add [dsa_core]) from
>[<c0b4e684>
>] (__switchdev_port_obj_add+0x50/0xc4)
>
>[ 107.448029] [<c0b4e684>] (__switchdev_port_obj_add) from
>[<c0b4e6d4>] (__swit
>chdev_port_obj_add+0xa0/0xc4)
>
>[ 107.457612] [<c0b4e6d4>] (__switchdev_port_obj_add) from
>[<c0b4e758>] (switch
>dev_port_obj_add_now+0x60/0x130)
>
>[ 107.467453] [<c0b4e758>] (switchdev_port_obj_add_now) from
>[<c0b4eb94>] (swit
>chdev_port_obj_add+0x44/0x190)
>
>[ 107.477181] [<c0b4eb94>] (switchdev_port_obj_add) from [<bf1013d0>]
>(br_switc
>hdev_port_vlan_add+0x60/0x7c [bridge])
>
>[ 107.487644] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge]) from
>[<bf0ff25
>0>] (__vlan_add+0xb0/0x620 [bridge])
>
>regards Frank
>
>
>> Gesendet: Montag, 18. Februar 2019 um 00:20 Uhr
>> Von: "Florian Fainelli" <f.fainelli@gmail.com>
>> An: "Frank Wunderlich" <frank-w@public-files.de>,
>netdev@vger.kernel.org, "Andrew Lunn" <andrew@lunn.ch>, "Vivien
>Didelot" <vivien.didelot@gmail.com>
>> Betreff: Re: [BUG] [FIX] net: dsa: oops in br_vlan_enabled
>>
>>
>>
>> On 2/17/2019 9:13 AM, Florian Fainelli wrote:
>> > Hi Frank,
>> >
>> > On 2/16/2019 8:22 AM, Frank Wunderlich wrote:
>> >> Hi,
>> >>
>> >> i've found an oops in 4.19.23/10, seems to be fixed anyhow in 5.0
>(also works in 4.14.101)
>> >>
>> >> root@bpi-r2:~# ip link add link lan0 name lan0.5 type vlan id 5
>> >> root@bpi-r2:~# ip addr add 192.168.5.200/24 brd 192.168.5.255 dev
>lan0.5
>> >> root@bpi-r2:~# ip link set dev lan0 up
>> >> root@bpi-r2:~# ip link set dev lan0.5 up
>> >
>> > So that these steps don't involve a bridge, and because we don't
>(yet)
>> > implment ndo_vlan_rx_{add,kill}_vid() netdevice_ops, there is no
>VLAN
>> > programming, it's all software
>> >
>> >>
>> >> 12: lan0.5@lan0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500
>qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
>> >> link/ether 02:02:02:02:02:02 brd ff:ff:ff:ff:ff:ff
>> >> inet 192.168.5.200/24 brd 192.168.5.255 scope global lan0.5
>> >> valid_lft forever preferred_lft forever
>> >>
>> >> root@bpi-r2:~# brctl addbr bridge_name
>> >> root@bpi-r2:~# brctl addif bridge_name lan0.5
>> >
>> > Unless you changed the bridge to have VLAN filtering/awareness
>with:
>> >
>> > echo 1 > /sys/class/net/bridge_name/bridge/vlan_filtering
>> >
>> > There will be no VLAN configuration/objects pushed to DSA since
>commit
>> > 2ea7a679ca2abd251c1ec03f20508619707e1749 ("net: dsa: Don't add
>vlans
>> > when vlan filtering is disabled") so I am not sure how you got into
>that
>> > situation because prior to pushing a VLAN object, the port must be
>part
>> > of a bridge, so the steps typically look like:
>> >
>> > - port_bridge_join which assigns dp->bridge_dev
>> > - port_vlan_add
>> >
>> > Does your 4.19.23 kernel somehow change how VLAN objects are pushed
>down
>> > the switch driver?
>> >
>> >> [ 352.057128] bridge_name: port 1(lan0.5) entered blocking state
>> >> [ 352.063065] bridge_name: port 1(lan0.5) entered disabled state
>> >> [ 352.069181] device lan0.5 entered promiscuous mode
>> >> [ 352.074018] device lan0 entered promiscuous mode
>> >> [ 352.078906] Unable to handle kernel NULL pointer dereference at
>virtual address 00000558
>> >> ...
>> >> [ 352.493085] [<bf0fde88>] (br_vlan_enabled [bridge]) from
>[<bf12c234>] (dsa_port_vlan_add+0x60/0xbc [dsa_core])
>> >> [ 352.503050] [<bf12c234>] (dsa_port_vlan_add [dsa_core]) from
>[<bf12cb64>] (dsa_slave_port_obj_add+0x4c/0x50 [dsa_core])
>> >> [ 352.513776] [<bf12cb64>] (dsa_slave_port_obj_add [dsa_core])
>from [<c0b4e2d4>] (__switchdev_port_obj_add+0x50/0xc4)
>> >> [ 352.524138] [<c0b4e2d4>] (__switchdev_port_obj_add) from
>[<c0b4e324>] (__switchdev_port_obj_add+0xa0/0xc4)
>> >> [ 352.533721] [<c0b4e324>] (__switchdev_port_obj_add) from
>[<c0b4e3a8>] (switchdev_port_obj_add_now+0x60/0x130)
>> >> [ 352.543562] [<c0b4e3a8>] (switchdev_port_obj_add_now) from
>[<c0b4e7e4>] (switchdev_port_obj_add+0x44/0x190)
>> >> [ 352.553284] [<c0b4e7e4>] (switchdev_port_obj_add) from
>[<bf1013d0>] (br_switchdev_port_vlan_add+0x60/0x7c [bridge])
>> >> [ 352.563733] [<bf1013d0>] (br_switchdev_port_vlan_add [bridge])
>from [<bf0ff250>] (__vlan_add+0xb0/0x620 [bridge])
>> >> [ 352.574007] [<bf0ff250>] (__vlan_add [bridge]) from
>[<bf0ffd04>] (nbp_vlan_add+0xc4/0x150 [bridge])
>> >> [ 352.583073] [<bf0ffd04>] (nbp_vlan_add [bridge]) from
>[<bf0ffec4>] (nbp_vlan_init+0x134/0x164 [bridge])
>> >> [ 352.592482] [<bf0ffec4>] (nbp_vlan_init [bridge]) from
>[<bf0edd4c>] (br_add_if+0x40c/0x5fc [bridge])
>> >> [ 352.601632] [<bf0edd4c>] (br_add_if [bridge]) from [<bf0eeb14>]
>(add_del_if+0x6c/0x80 [bridge])
>> >> [ 352.610351] [<bf0eeb14>] (add_del_if [bridge]) from
>[<bf0ef5b0>] (br_dev_ioctl+0x7c/0x9c [bridge])
>> >> [ 352.619290] [<bf0ef5b0>] (br_dev_ioctl [bridge]) from
>[<c09583d4>] (dev_ifsioc+0x184/0x324)
>> >> [ 352.627582] [<c09583d4>] (dev_ifsioc) from [<c09589e8>]
>(dev_ioctl+0x32c/0x5cc)
>> >> [ 352.634837] [<c09589e8>] (dev_ioctl) from [<c090913c>]
>(sock_ioctl+0x3bc/0x580)
>> >>
>> >>
>> >> since my 4.19.23 kernel is modified a bit i tried with 4.19.10
>without my net modifications and it is still reproducable with steps
>above (create a vlan on dsa-user-port and then use it in a bridge)
>> >>
>> >> i fixed it with these changes:
>>
>> Your fix is undoing the commit from Andrew I just referenced, so it
>is
>> definitively not the right fix because it will push VLAN objects down
>to
>> a non-VLAN aware bridge, not that this is really a problem, but it
>> should not be happening anyway.
>>
>> The problem appears to be the following though: you are enslaving a
>VLAN
>> device, which does not have switchdev_ops, so we recurse into the
>lower
>> device, which is the DSA network device, which does have
>switchdev_ops
>> defined. Once we are there we check the orig_dev against being a
>bridge
>> master network device, but we are not checking that it's a VLAN
>device
>> so we end-up assuming it is a DSA network device, we de-reference
>> garbage by checking br_vlan_enabled(dp->bridge_dev) because there is
>> simply no such data structure associated with the VLAN device.
>>
>> The attached patch should help.
>>
>> This is no longer a problem in newer kernels because the switchdev
>> operations use a notifier which checks the target network device to
>be DSA.
>> --
>> Florian
>>
--
Florian
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [BUG] [FIX] net: dsa: oops in br_vlan_enabled
2019-02-17 19:05 Frank Wunderlich
@ 2019-02-17 22:38 ` Florian Fainelli
0 siblings, 0 replies; 6+ messages in thread
From: Florian Fainelli @ 2019-02-17 22:38 UTC (permalink / raw)
To: Frank Wunderlich, netdev, andrew, Vivien Didelot
On 2/17/2019 11:05 AM, Frank Wunderlich wrote:
> Hi Florian
>
> a user from Bananapi-forum has reported the oops and i had reproduced this and patched out this oops.
> That means not that vlan in bridge works, only no crash.
>
> my kernel only conatins the second gmac dsa-patches for mt7530/bpi-r2 i've posted at the end of last year, but the oops also occur without them.
>
> have not changed bridge vlan-configuration by
>
> echo 1 > /sys/class/net/bridge_name/bridge/vlan_filtering
>
> maybe that vlan-objects/configuration is not passed through, but there should happen no oops.
>
> kernel-source is here: https://github.com/frank-w/BPI-R2-4.14/tree/4.19-main
>
> ps: please take me in CC so i can answer directly
You were in the To: of my previous reply, let's move it back there since
it has all context.
--
Florian
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-02-18 19:09 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-16 16:22 [BUG] [FIX] net: dsa: oops in br_vlan_enabled Frank Wunderlich
2019-02-17 17:13 ` Florian Fainelli
2019-02-17 23:20 ` Florian Fainelli
2019-02-18 17:11 ` Aw: " Frank Wunderlich
2019-02-18 19:09 ` Florian Fainelli
2019-02-17 19:05 Frank Wunderlich
2019-02-17 22:38 ` Florian Fainelli
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).