netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
@ 2021-08-01 23:17 Vladimir Oltean
  2021-08-02  7:42 ` Nikolay Aleksandrov
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: Vladimir Oltean @ 2021-08-01 23:17 UTC (permalink / raw)
  To: netdev, Jakub Kicinski, David S. Miller
  Cc: Jiri Pirko, Ido Schimmel, Roopa Prabhu, Nikolay Aleksandrov,
	bridge, syzkaller-bugs, syzbot+9ba1174359adba5a5b7c

Currently it is possible to add broken extern_learn FDB entries to the
bridge in two ways:

1. Entries pointing towards the bridge device that are not local/permanent:

ip link add br0 type bridge
bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static

2. Entries pointing towards the bridge device or towards a port that
are marked as local/permanent, however the bridge does not process the
'permanent' bit in any way, therefore they are recorded as though they
aren't permanent:

ip link add br0 type bridge
bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent

Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the
same as entries towards the bridge"), these incorrect FDB entries can
even trigger NULL pointer dereferences inside the kernel.

This is because that commit made the assumption that all FDB entries
that are not local/permanent have a valid destination port. For context,
local / permanent FDB entries either have fdb->dst == NULL, and these
point towards the bridge device and are therefore local and not to be
used for forwarding, or have fdb->dst == a net_bridge_port structure
(but are to be treated in the same way, i.e. not for forwarding).

That assumption _is_ correct as long as things are working correctly in
the bridge driver, i.e. we cannot logically have fdb->dst == NULL under
any circumstance for FDB entries that are not local. However, the
extern_learn code path where FDB entries are managed by a user space
controller show that it is possible for the bridge kernel driver to
misinterpret the NUD flags of an entry transmitted by user space, and
end up having fdb->dst == NULL while not being a local entry. This is
invalid and should be rejected.

Before, the two commands listed above both crashed the kernel in this
check from br_switchdev_fdb_notify:

	struct net_device *dev = info.is_local ? br->dev : dst->dev;

info.is_local == false, dst == NULL.

After this patch, the invalid entry added by the first command is
rejected:

ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static; ip link del br0
Error: bridge: FDB entry towards bridge must be permanent.

and the valid entry added by the second command is properly treated as a
local address and does not crash br_switchdev_fdb_notify anymore:

ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent; ip link del br0

Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space")
Reported-by: syzbot+9ba1174359adba5a5b7c@syzkaller.appspotmail.com
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
 net/bridge/br.c         |  3 ++-
 net/bridge/br_fdb.c     | 30 ++++++++++++++++++++++++------
 net/bridge/br_private.h |  2 +-
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/net/bridge/br.c b/net/bridge/br.c
index ef743f94254d..bbab9984f24e 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,
 	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
 		fdb_info = ptr;
 		err = br_fdb_external_learn_add(br, p, fdb_info->addr,
-						fdb_info->vid, false);
+						fdb_info->vid,
+						fdb_info->is_local, false);
 		if (err) {
 			err = notifier_from_errno(err);
 			break;
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index a16191dcaed1..835cec1e5a03 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1019,7 +1019,8 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source,
 
 static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
 			struct net_bridge_port *p, const unsigned char *addr,
-			u16 nlh_flags, u16 vid, struct nlattr *nfea_tb[])
+			u16 nlh_flags, u16 vid, struct nlattr *nfea_tb[],
+			struct netlink_ext_ack *extack)
 {
 	int err = 0;
 
@@ -1038,7 +1039,15 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
 		rcu_read_unlock();
 		local_bh_enable();
 	} else if (ndm->ndm_flags & NTF_EXT_LEARNED) {
-		err = br_fdb_external_learn_add(br, p, addr, vid, true);
+		if (!p && !(ndm->ndm_state & NUD_PERMANENT)) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "FDB entry towards bridge must be permanent");
+			return -EINVAL;
+		}
+
+		err = br_fdb_external_learn_add(br, p, addr, vid,
+						ndm->ndm_state & NUD_PERMANENT,
+						true);
 	} else {
 		spin_lock_bh(&br->hash_lock);
 		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
@@ -1110,9 +1119,11 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 		}
 
 		/* VID was specified, so use it. */
-		err = __br_fdb_add(ndm, br, p, addr, nlh_flags, vid, nfea_tb);
+		err = __br_fdb_add(ndm, br, p, addr, nlh_flags, vid, nfea_tb,
+				   extack);
 	} else {
-		err = __br_fdb_add(ndm, br, p, addr, nlh_flags, 0, nfea_tb);
+		err = __br_fdb_add(ndm, br, p, addr, nlh_flags, 0, nfea_tb,
+				   extack);
 		if (err || !vg || !vg->num_vlans)
 			goto out;
 
@@ -1124,7 +1135,7 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
 			if (!br_vlan_should_use(v))
 				continue;
 			err = __br_fdb_add(ndm, br, p, addr, nlh_flags, v->vid,
-					   nfea_tb);
+					   nfea_tb, extack);
 			if (err)
 				goto out;
 		}
@@ -1264,7 +1275,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
 }
 
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-			      const unsigned char *addr, u16 vid,
+			      const unsigned char *addr, u16 vid, bool is_local,
 			      bool swdev_notify)
 {
 	struct net_bridge_fdb_entry *fdb;
@@ -1281,6 +1292,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 
 		if (swdev_notify)
 			flags |= BIT(BR_FDB_ADDED_BY_USER);
+
+		if (is_local)
+			flags |= BIT(BR_FDB_LOCAL);
+
 		fdb = fdb_create(br, p, addr, vid, flags);
 		if (!fdb) {
 			err = -ENOMEM;
@@ -1307,6 +1322,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 		if (swdev_notify)
 			set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
 
+		if (is_local)
+			set_bit(BR_FDB_LOCAL, &fdb->flags);
+
 		if (modified)
 			fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify);
 	}
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 2b48b204205e..aa64d8d63ca3 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -711,7 +711,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev,
 int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
 void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-			      const unsigned char *addr, u16 vid,
+			      const unsigned char *addr, u16 vid, bool is_local,
 			      bool swdev_notify);
 int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-01 23:17 [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry Vladimir Oltean
@ 2021-08-02  7:42 ` Nikolay Aleksandrov
  2021-08-02  9:20   ` Vladimir Oltean
  2021-08-02 11:25 ` Nikolay Aleksandrov
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 23+ messages in thread
From: Nikolay Aleksandrov @ 2021-08-02  7:42 UTC (permalink / raw)
  To: Vladimir Oltean, netdev, Jakub Kicinski, David S. Miller
  Cc: Jiri Pirko, Ido Schimmel, Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On 02/08/2021 02:17, Vladimir Oltean wrote:
> Currently it is possible to add broken extern_learn FDB entries to the
> bridge in two ways:
> 
> 1. Entries pointing towards the bridge device that are not local/permanent:
> 
> ip link add br0 type bridge
> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static
> 
> 2. Entries pointing towards the bridge device or towards a port that
> are marked as local/permanent, however the bridge does not process the
> 'permanent' bit in any way, therefore they are recorded as though they
> aren't permanent:
> 
> ip link add br0 type bridge
> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent
> 
> Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the
> same as entries towards the bridge"), these incorrect FDB entries can
> even trigger NULL pointer dereferences inside the kernel.
> 
> This is because that commit made the assumption that all FDB entries
> that are not local/permanent have a valid destination port. For context,
> local / permanent FDB entries either have fdb->dst == NULL, and these
> point towards the bridge device and are therefore local and not to be
> used for forwarding, or have fdb->dst == a net_bridge_port structure
> (but are to be treated in the same way, i.e. not for forwarding).
> 
> That assumption _is_ correct as long as things are working correctly in
> the bridge driver, i.e. we cannot logically have fdb->dst == NULL under
> any circumstance for FDB entries that are not local. However, the
> extern_learn code path where FDB entries are managed by a user space
> controller show that it is possible for the bridge kernel driver to
> misinterpret the NUD flags of an entry transmitted by user space, and
> end up having fdb->dst == NULL while not being a local entry. This is
> invalid and should be rejected.
> 
> Before, the two commands listed above both crashed the kernel in this
> check from br_switchdev_fdb_notify:
> 

Not before 52e4bec15546 though, the check used to be:
struct net_device *dev = dst ? dst->dev : br->dev;

which wouldn't crash. So the fixes tag below is incorrect, you could
add a weird extern learn entry, but it wouldn't crash the kernel.

> 	struct net_device *dev = info.is_local ? br->dev : dst->dev;
> 
> info.is_local == false, dst == NULL.
> 
> After this patch, the invalid entry added by the first command is
> rejected:
> 
> ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static; ip link del br0
> Error: bridge: FDB entry towards bridge must be permanent.
> 
> and the valid entry added by the second command is properly treated as a
> local address and does not crash br_switchdev_fdb_notify anymore:
> 
> ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent; ip link del br0
> 
> Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space")

Please change fixes tag to 52e4bec15546.

> Reported-by: syzbot+9ba1174359adba5a5b7c@syzkaller.appspotmail.com
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---
[snip]

Thanks,
 Nik

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-02  7:42 ` Nikolay Aleksandrov
@ 2021-08-02  9:20   ` Vladimir Oltean
  2021-08-02  9:42     ` Nikolay Aleksandrov
  0 siblings, 1 reply; 23+ messages in thread
From: Vladimir Oltean @ 2021-08-02  9:20 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Ido Schimmel, Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On Mon, Aug 02, 2021 at 10:42:41AM +0300, Nikolay Aleksandrov wrote:
> On 02/08/2021 02:17, Vladimir Oltean wrote:
> > Currently it is possible to add broken extern_learn FDB entries to the
> > bridge in two ways:
> >
> > 1. Entries pointing towards the bridge device that are not local/permanent:
> >
> > ip link add br0 type bridge
> > bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static
> >
> > 2. Entries pointing towards the bridge device or towards a port that
> > are marked as local/permanent, however the bridge does not process the
> > 'permanent' bit in any way, therefore they are recorded as though they
> > aren't permanent:
> >
> > ip link add br0 type bridge
> > bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent
> >
> > Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the
> > same as entries towards the bridge"), these incorrect FDB entries can
> > even trigger NULL pointer dereferences inside the kernel.
> >
> > This is because that commit made the assumption that all FDB entries
> > that are not local/permanent have a valid destination port. For context,
> > local / permanent FDB entries either have fdb->dst == NULL, and these
> > point towards the bridge device and are therefore local and not to be
> > used for forwarding, or have fdb->dst == a net_bridge_port structure
> > (but are to be treated in the same way, i.e. not for forwarding).
> >
> > That assumption _is_ correct as long as things are working correctly in
> > the bridge driver, i.e. we cannot logically have fdb->dst == NULL under
> > any circumstance for FDB entries that are not local. However, the
> > extern_learn code path where FDB entries are managed by a user space
> > controller show that it is possible for the bridge kernel driver to
> > misinterpret the NUD flags of an entry transmitted by user space, and
> > end up having fdb->dst == NULL while not being a local entry. This is
> > invalid and should be rejected.
> >
> > Before, the two commands listed above both crashed the kernel in this
> > check from br_switchdev_fdb_notify:
> >
>
> Not before 52e4bec15546 though, the check used to be:
> struct net_device *dev = dst ? dst->dev : br->dev;

"Before", as in "before this patch, on net-next/linux-next".

> which wouldn't crash. So the fixes tag below is incorrect, you could
> add a weird extern learn entry, but it wouldn't crash the kernel.

:)

Is our only criterion whether a patch is buggy or not that it causes a
NULL pointer dereference inside the kernel?

I thought I'd mention the interaction with the net-next work for the
sake of being thorough, and because this is how the syzcaller caught it
by coincidence, but "kernel does not treat an FDB entry with the
'permanent' flag as permanent" is enough of a reason to submit this as a
bug fix for the commit that I pointed to. Granted, I don't have any use
case with extern_learn, so probably your user space programs simply
don't add permanent FDB entries, but as this is the kernel UAPI, it
should nevertheless do whatever the user space is allowed to say. For a
permanent FDB entry, that behavior is to stop forwarding for that MAC
DA, and that behavior obviously was not taking place even before any
change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-02  9:20   ` Vladimir Oltean
@ 2021-08-02  9:42     ` Nikolay Aleksandrov
  2021-08-02 10:52       ` Vladimir Oltean
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolay Aleksandrov @ 2021-08-02  9:42 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Ido Schimmel, Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On 02/08/2021 12:20, Vladimir Oltean wrote:
> On Mon, Aug 02, 2021 at 10:42:41AM +0300, Nikolay Aleksandrov wrote:
>> On 02/08/2021 02:17, Vladimir Oltean wrote:
>>> Currently it is possible to add broken extern_learn FDB entries to the
>>> bridge in two ways:
>>>
>>> 1. Entries pointing towards the bridge device that are not local/permanent:
>>>
>>> ip link add br0 type bridge
>>> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static
>>>
>>> 2. Entries pointing towards the bridge device or towards a port that
>>> are marked as local/permanent, however the bridge does not process the
>>> 'permanent' bit in any way, therefore they are recorded as though they
>>> aren't permanent:
>>>
>>> ip link add br0 type bridge
>>> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent
>>>
>>> Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the
>>> same as entries towards the bridge"), these incorrect FDB entries can
>>> even trigger NULL pointer dereferences inside the kernel.
>>>
>>> This is because that commit made the assumption that all FDB entries
>>> that are not local/permanent have a valid destination port. For context,
>>> local / permanent FDB entries either have fdb->dst == NULL, and these
>>> point towards the bridge device and are therefore local and not to be
>>> used for forwarding, or have fdb->dst == a net_bridge_port structure
>>> (but are to be treated in the same way, i.e. not for forwarding).
>>>
>>> That assumption _is_ correct as long as things are working correctly in
>>> the bridge driver, i.e. we cannot logically have fdb->dst == NULL under
>>> any circumstance for FDB entries that are not local. However, the
>>> extern_learn code path where FDB entries are managed by a user space
>>> controller show that it is possible for the bridge kernel driver to
>>> misinterpret the NUD flags of an entry transmitted by user space, and
>>> end up having fdb->dst == NULL while not being a local entry. This is
>>> invalid and should be rejected.
>>>
>>> Before, the two commands listed above both crashed the kernel in this
>>> check from br_switchdev_fdb_notify:
>>>
>>
>> Not before 52e4bec15546 though, the check used to be:
>> struct net_device *dev = dst ? dst->dev : br->dev;
> 
> "Before", as in "before this patch, on net-next/linux-next".
> 

We still need that check, more below.

>> which wouldn't crash. So the fixes tag below is incorrect, you could
>> add a weird extern learn entry, but it wouldn't crash the kernel.
> 
> :)
> 
> Is our only criterion whether a patch is buggy or not that it causes a
> NULL pointer dereference inside the kernel?
> 
> I thought I'd mention the interaction with the net-next work for the
> sake of being thorough, and because this is how the syzcaller caught it
> by coincidence, but "kernel does not treat an FDB entry with the
> 'permanent' flag as permanent" is enough of a reason to submit this as a

Not exactly right, you may add it as permanent but it doesn't get "permanent" flag set.
The actual bug is that it points to the bridge device, e.g. null dst without the flag.

> bug fix for the commit that I pointed to. Granted, I don't have any use
> case with extern_learn, so probably your user space programs simply
> don't add permanent FDB entries, but as this is the kernel UAPI, it
> should nevertheless do whatever the user space is allowed to say. For a
> permanent FDB entry, that behavior is to stop forwarding for that MAC
> DA, and that behavior obviously was not taking place even before any
> change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n.
> 

Actually I believe there is still a bug in 52e4bec15546 even with this fix.
The flag can change after the dst has been read in br_switchdev_fdb_notify()
so in theory you could still do a null pointer dereference. fdb_notify()
can be called from a few places without locking. The code shouldn't dereference
the dst based on the flag.

I'm okay with this change due to the null dst without permanent flag fix, but
it doesn't fully fix the null pointer dereference.




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-02  9:42     ` Nikolay Aleksandrov
@ 2021-08-02 10:52       ` Vladimir Oltean
  2021-08-02 11:02         ` Nikolay Aleksandrov
  0 siblings, 1 reply; 23+ messages in thread
From: Vladimir Oltean @ 2021-08-02 10:52 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Ido Schimmel, Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On Mon, Aug 02, 2021 at 12:42:17PM +0300, Nikolay Aleksandrov wrote:
> >>> Before, the two commands listed above both crashed the kernel in this
> >>> check from br_switchdev_fdb_notify:
> >>>
> >>
> >> Not before 52e4bec15546 though, the check used to be:
> >> struct net_device *dev = dst ? dst->dev : br->dev;
> >
> > "Before", as in "before this patch, on net-next/linux-next".
> >
>
> We still need that check, more below.
>
> >> which wouldn't crash. So the fixes tag below is incorrect, you could
> >> add a weird extern learn entry, but it wouldn't crash the kernel.
> >
> > :)
> >
> > Is our only criterion whether a patch is buggy or not that it causes a
> > NULL pointer dereference inside the kernel?
> >
> > I thought I'd mention the interaction with the net-next work for the
> > sake of being thorough, and because this is how the syzcaller caught it
> > by coincidence, but "kernel does not treat an FDB entry with the
> > 'permanent' flag as permanent" is enough of a reason to submit this as a
>
> Not exactly right, you may add it as permanent but it doesn't get "permanent" flag set.

And that is the bug I am addressing here, no?

> The actual bug is that it points to the bridge device, e.g. null dst without the flag.
>
> > bug fix for the commit that I pointed to. Granted, I don't have any use
> > case with extern_learn, so probably your user space programs simply
> > don't add permanent FDB entries, but as this is the kernel UAPI, it
> > should nevertheless do whatever the user space is allowed to say. For a
> > permanent FDB entry, that behavior is to stop forwarding for that MAC
> > DA, and that behavior obviously was not taking place even before any
> > change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n.
> >
>
> Actually I believe there is still a bug in 52e4bec15546 even with this fix.
> The flag can change after the dst has been read in br_switchdev_fdb_notify()
> so in theory you could still do a null pointer dereference. fdb_notify()
> can be called from a few places without locking. The code shouldn't dereference
> the dst based on the flag.

Are you thinking of a specific code path that triggers a race between
(a) a writer side doing WRITE_ONCE(fdb->dst, NULL) and then
    set_bit(BR_FDB_LOCAL, &fdb->flags), exactly in this order, and
(b) a reader side catching that fdb exactly in between the above 2
    statements, through fdb_notify or otherwise (br_fdb_replay)?

Because I don't see any.

Plus, I am a bit nervous about protecting against theoretical/unproven
races in a way that masks real bugs, as we would be doing if I add an
extra check in br_fdb_replay_one and br_switchdev_fdb_notify against the
case where an entry has fdb->dst == NULL but not BR_FDB_LOCAL.

>
> I'm okay with this change due to the null dst without permanent flag fix, but
> it doesn't fully fix the null pointer dereference.

So is there any change that I should make to this patch?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-02 10:52       ` Vladimir Oltean
@ 2021-08-02 11:02         ` Nikolay Aleksandrov
  2021-08-02 11:20           ` Vladimir Oltean
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolay Aleksandrov @ 2021-08-02 11:02 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Ido Schimmel, Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On 02/08/2021 13:52, Vladimir Oltean wrote:
> On Mon, Aug 02, 2021 at 12:42:17PM +0300, Nikolay Aleksandrov wrote:
>>>>> Before, the two commands listed above both crashed the kernel in this
>>>>> check from br_switchdev_fdb_notify:
>>>>>
>>>>
>>>> Not before 52e4bec15546 though, the check used to be:
>>>> struct net_device *dev = dst ? dst->dev : br->dev;
>>>
>>> "Before", as in "before this patch, on net-next/linux-next".
>>>
>>
>> We still need that check, more below.
>>
>>>> which wouldn't crash. So the fixes tag below is incorrect, you could
>>>> add a weird extern learn entry, but it wouldn't crash the kernel.
>>>
>>> :)
>>>
>>> Is our only criterion whether a patch is buggy or not that it causes a
>>> NULL pointer dereference inside the kernel?
>>>
>>> I thought I'd mention the interaction with the net-next work for the
>>> sake of being thorough, and because this is how the syzcaller caught it
>>> by coincidence, but "kernel does not treat an FDB entry with the
>>> 'permanent' flag as permanent" is enough of a reason to submit this as a
>>
>> Not exactly right, you may add it as permanent but it doesn't get "permanent" flag set.
> 
> And that is the bug I am addressing here, no?
> 

Obviously I pointed to the statement above, I don't get the question here.

>> The actual bug is that it points to the bridge device, e.g. null dst without the flag.
>>
>>> bug fix for the commit that I pointed to. Granted, I don't have any use
>>> case with extern_learn, so probably your user space programs simply
>>> don't add permanent FDB entries, but as this is the kernel UAPI, it
>>> should nevertheless do whatever the user space is allowed to say. For a
>>> permanent FDB entry, that behavior is to stop forwarding for that MAC
>>> DA, and that behavior obviously was not taking place even before any
>>> change in br_switchdev_fdb_notify(), or even with CONFIG_NET_SWITCHDEV=n.
>>>
>>
>> Actually I believe there is still a bug in 52e4bec15546 even with this fix.
>> The flag can change after the dst has been read in br_switchdev_fdb_notify()
>> so in theory you could still do a null pointer dereference. fdb_notify()
>> can be called from a few places without locking. The code shouldn't dereference
>> the dst based on the flag.
> 
> Are you thinking of a specific code path that triggers a race between
> (a) a writer side doing WRITE_ONCE(fdb->dst, NULL) and then
>     set_bit(BR_FDB_LOCAL, &fdb->flags), exactly in this order, and

Visible order is not guaranteed, there are no barriers neither at writer nor reader
sides, especially when used without locking. So we cannot make any assumptions
about the order visibility of these writes.

> (b) a reader side catching that fdb exactly in between the above 2
>     statements, through fdb_notify or otherwise (br_fdb_replay)?
> 
> Because I don't see any.
> 
> Plus, I am a bit nervous about protecting against theoretical/unproven
> races in a way that masks real bugs, as we would be doing if I add an
> extra check in br_fdb_replay_one and br_switchdev_fdb_notify against the
> case where an entry has fdb->dst == NULL but not BR_FDB_LOCAL.
> 

The bits are _not_ visible atomically with the setting of ->dst. It is obvious
you must not dereference anything based on them, they are only indications when used
outside of locked regions and code must be able to deal with inconsistencies as that
is implied by the way they're used. It is a clear and obvious bug dereferencing based
on a bit that can change in parallel without any memory ordering guarantees.

You are not "masking" anything, but fixing what is currently buggy use of fdb bits.
As I already said - this doesn't fix the null deref bug completely, in fact it fixes a different
inconsistency, before at worst you'd get blackholed traffic for such entries now
you get a null pointer dereference.

>>
>> I'm okay with this change due to the null dst without permanent flag fix, but
>> it doesn't fully fix the null pointer dereference.
> 
> So is there any change that I should make to this patch?
> 

No, this patch is fine as-is.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-02 11:02         ` Nikolay Aleksandrov
@ 2021-08-02 11:20           ` Vladimir Oltean
  0 siblings, 0 replies; 23+ messages in thread
From: Vladimir Oltean @ 2021-08-02 11:20 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Ido Schimmel, Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On Mon, Aug 02, 2021 at 02:02:36PM +0300, Nikolay Aleksandrov wrote:
> >> Actually I believe there is still a bug in 52e4bec15546 even with this fix.
> >> The flag can change after the dst has been read in br_switchdev_fdb_notify()
> >> so in theory you could still do a null pointer dereference. fdb_notify()
> >> can be called from a few places without locking. The code shouldn't dereference
> >> the dst based on the flag.
> >
> > Are you thinking of a specific code path that triggers a race between
> > (a) a writer side doing WRITE_ONCE(fdb->dst, NULL) and then
> >     set_bit(BR_FDB_LOCAL, &fdb->flags), exactly in this order, and
>
> Visible order is not guaranteed, there are no barriers neither at writer nor reader
> sides, especially when used without locking. So we cannot make any assumptions
> about the order visibility of these writes.
>
> > (b) a reader side catching that fdb exactly in between the above 2
> >     statements, through fdb_notify or otherwise (br_fdb_replay)?
> >
> > Because I don't see any.
> >
> > Plus, I am a bit nervous about protecting against theoretical/unproven
> > races in a way that masks real bugs, as we would be doing if I add an
> > extra check in br_fdb_replay_one and br_switchdev_fdb_notify against the
> > case where an entry has fdb->dst == NULL but not BR_FDB_LOCAL.
> >
>
> The bits are _not_ visible atomically with the setting of ->dst. It is obvious
> you must not dereference anything based on them, they are only indications when used
> outside of locked regions and code must be able to deal with inconsistencies as that
> is implied by the way they're used. It is a clear and obvious bug dereferencing based
> on a bit that can change in parallel without any memory ordering guarantees.

Ok, I will send a separate patch for that.

> You are not "masking" anything, but fixing what is currently buggy use of fdb bits.

I am "masking" in the sense that the bug I am fixing here was not
obvious to me until it triggered a NPD. That would stop happening with
the patch I'm about to send, but maybe there are still bridge UAPI
functions that do not validate the 'permanent' flag from FDB entries.

> As I already said - this doesn't fix the null deref bug completely, in fact it fixes a different
> inconsistency, before at worst you'd get blackholed traffic for such entries now
> you get a null pointer dereference.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-01 23:17 [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry Vladimir Oltean
  2021-08-02  7:42 ` Nikolay Aleksandrov
@ 2021-08-02 11:25 ` Nikolay Aleksandrov
  2021-08-02 22:10 ` patchwork-bot+netdevbpf
  2021-08-09 12:16 ` Ido Schimmel
  3 siblings, 0 replies; 23+ messages in thread
From: Nikolay Aleksandrov @ 2021-08-02 11:25 UTC (permalink / raw)
  To: Vladimir Oltean, netdev, Jakub Kicinski, David S. Miller
  Cc: Jiri Pirko, Ido Schimmel, Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On 02/08/2021 02:17, Vladimir Oltean wrote:
> Currently it is possible to add broken extern_learn FDB entries to the
> bridge in two ways:
> 
> 1. Entries pointing towards the bridge device that are not local/permanent:
> 
> ip link add br0 type bridge
> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static
> 
> 2. Entries pointing towards the bridge device or towards a port that
> are marked as local/permanent, however the bridge does not process the
> 'permanent' bit in any way, therefore they are recorded as though they
> aren't permanent:
> 
> ip link add br0 type bridge
> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent
> 
> Since commit 52e4bec15546 ("net: bridge: switchdev: treat local FDBs the
> same as entries towards the bridge"), these incorrect FDB entries can
> even trigger NULL pointer dereferences inside the kernel.
> 
> This is because that commit made the assumption that all FDB entries
> that are not local/permanent have a valid destination port. For context,
> local / permanent FDB entries either have fdb->dst == NULL, and these
> point towards the bridge device and are therefore local and not to be
> used for forwarding, or have fdb->dst == a net_bridge_port structure
> (but are to be treated in the same way, i.e. not for forwarding).
> 
> That assumption _is_ correct as long as things are working correctly in
> the bridge driver, i.e. we cannot logically have fdb->dst == NULL under
> any circumstance for FDB entries that are not local. However, the
> extern_learn code path where FDB entries are managed by a user space
> controller show that it is possible for the bridge kernel driver to
> misinterpret the NUD flags of an entry transmitted by user space, and
> end up having fdb->dst == NULL while not being a local entry. This is
> invalid and should be rejected.
> 
> Before, the two commands listed above both crashed the kernel in this
> check from br_switchdev_fdb_notify:
> 
> 	struct net_device *dev = info.is_local ? br->dev : dst->dev;
> 
> info.is_local == false, dst == NULL.
> 
> After this patch, the invalid entry added by the first command is
> rejected:
> 
> ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static; ip link del br0
> Error: bridge: FDB entry towards bridge must be permanent.
> 
> and the valid entry added by the second command is properly treated as a
> local address and does not crash br_switchdev_fdb_notify anymore:
> 
> ip link add br0 type bridge && bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn permanent; ip link del br0
> 
> Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space")
> Reported-by: syzbot+9ba1174359adba5a5b7c@syzkaller.appspotmail.com
> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
> ---

Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-01 23:17 [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry Vladimir Oltean
  2021-08-02  7:42 ` Nikolay Aleksandrov
  2021-08-02 11:25 ` Nikolay Aleksandrov
@ 2021-08-02 22:10 ` patchwork-bot+netdevbpf
  2021-08-09 12:16 ` Ido Schimmel
  3 siblings, 0 replies; 23+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-08-02 22:10 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, kuba, davem, jiri, idosch, roopa, nikolay, bridge,
	syzkaller-bugs, syzbot+9ba1174359adba5a5b7c

Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Mon,  2 Aug 2021 02:17:30 +0300 you wrote:
> Currently it is possible to add broken extern_learn FDB entries to the
> bridge in two ways:
> 
> 1. Entries pointing towards the bridge device that are not local/permanent:
> 
> ip link add br0 type bridge
> bridge fdb add 00:01:02:03:04:05 dev br0 self extern_learn static
> 
> [...]

Here is the summary with links:
  - [net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
    https://git.kernel.org/netdev/net/c/0541a6293298

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-01 23:17 [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry Vladimir Oltean
                   ` (2 preceding siblings ...)
  2021-08-02 22:10 ` patchwork-bot+netdevbpf
@ 2021-08-09 12:16 ` Ido Schimmel
  2021-08-09 12:32   ` Vladimir Oltean
                     ` (2 more replies)
  3 siblings, 3 replies; 23+ messages in thread
From: Ido Schimmel @ 2021-08-09 12:16 UTC (permalink / raw)
  To: Vladimir Oltean, nikolay
  Cc: netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Roopa Prabhu, Nikolay Aleksandrov, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On Mon, Aug 02, 2021 at 02:17:30AM +0300, Vladimir Oltean wrote:
> diff --git a/net/bridge/br.c b/net/bridge/br.c
> index ef743f94254d..bbab9984f24e 100644
> --- a/net/bridge/br.c
> +++ b/net/bridge/br.c
> @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,
>  	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
>  		fdb_info = ptr;
>  		err = br_fdb_external_learn_add(br, p, fdb_info->addr,
> -						fdb_info->vid, false);
> +						fdb_info->vid,
> +						fdb_info->is_local, false);

When 'is_local' was added in commit 2c4eca3ef716 ("net: bridge:
switchdev: include local flag in FDB notifications") it was not
initialized in all the call sites that emit
'SWITCHDEV_FDB_ADD_TO_BRIDGE' notification, so it can contain garbage.

>  		if (err) {
>  			err = notifier_from_errno(err);
>  			break;

[...]

> @@ -1281,6 +1292,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>  
>  		if (swdev_notify)
>  			flags |= BIT(BR_FDB_ADDED_BY_USER);
> +
> +		if (is_local)
> +			flags |= BIT(BR_FDB_LOCAL);

I have at least once selftest where I forgot the 'static' keyword:

bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1

This patch breaks the test when run against both the kernel and hardware
data paths. I don't mind patching these tests, but we might get more
reports in the future.

Nik, what do you think?

> +
>  		fdb = fdb_create(br, p, addr, vid, flags);
>  		if (!fdb) {
>  			err = -ENOMEM;
> @@ -1307,6 +1322,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>  		if (swdev_notify)
>  			set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
>  
> +		if (is_local)
> +			set_bit(BR_FDB_LOCAL, &fdb->flags);
> +
>  		if (modified)
>  			fdb_notify(br, fdb, RTM_NEWNEIGH, swdev_notify);
>  	}
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index 2b48b204205e..aa64d8d63ca3 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -711,7 +711,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev,
>  int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
>  void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
>  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
> -			      const unsigned char *addr, u16 vid,
> +			      const unsigned char *addr, u16 vid, bool is_local,
>  			      bool swdev_notify);
>  int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
>  			      const unsigned char *addr, u16 vid,
> -- 
> 2.25.1
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-09 12:16 ` Ido Schimmel
@ 2021-08-09 12:32   ` Vladimir Oltean
  2021-08-09 15:33   ` Nikolay Aleksandrov
  2021-08-09 16:05   ` [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry Vladimir Oltean
  2 siblings, 0 replies; 23+ messages in thread
From: Vladimir Oltean @ 2021-08-09 12:32 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: nikolay, netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:
> On Mon, Aug 02, 2021 at 02:17:30AM +0300, Vladimir Oltean wrote:
> > diff --git a/net/bridge/br.c b/net/bridge/br.c
> > index ef743f94254d..bbab9984f24e 100644
> > --- a/net/bridge/br.c
> > +++ b/net/bridge/br.c
> > @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,
> >  	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
> >  		fdb_info = ptr;
> >  		err = br_fdb_external_learn_add(br, p, fdb_info->addr,
> > -						fdb_info->vid, false);
> > +						fdb_info->vid,
> > +						fdb_info->is_local, false);
> 
> When 'is_local' was added in commit 2c4eca3ef716 ("net: bridge:
> switchdev: include local flag in FDB notifications") it was not
> initialized in all the call sites that emit
> 'SWITCHDEV_FDB_ADD_TO_BRIDGE' notification, so it can contain garbage.

Thanks for the report, I'll send a patch which adds a:

	memset(&info, 0, sizeof(info));

to all SWITCHDEV_FDB_*_TO_BRIDGE call sites.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-09 12:16 ` Ido Schimmel
  2021-08-09 12:32   ` Vladimir Oltean
@ 2021-08-09 15:33   ` Nikolay Aleksandrov
  2021-08-10  6:40     ` Ido Schimmel
  2021-08-09 16:05   ` [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry Vladimir Oltean
  2 siblings, 1 reply; 23+ messages in thread
From: Nikolay Aleksandrov @ 2021-08-09 15:33 UTC (permalink / raw)
  To: Ido Schimmel, Vladimir Oltean
  Cc: netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On 09/08/2021 15:16, Ido Schimmel wrote:
> On Mon, Aug 02, 2021 at 02:17:30AM +0300, Vladimir Oltean wrote:
>> diff --git a/net/bridge/br.c b/net/bridge/br.c
>> index ef743f94254d..bbab9984f24e 100644
>> --- a/net/bridge/br.c
>> +++ b/net/bridge/br.c
>> @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused,
>>  	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
>>  		fdb_info = ptr;
>>  		err = br_fdb_external_learn_add(br, p, fdb_info->addr,
>> -						fdb_info->vid, false);
>> +						fdb_info->vid,
>> +						fdb_info->is_local, false);
> 
> When 'is_local' was added in commit 2c4eca3ef716 ("net: bridge:
> switchdev: include local flag in FDB notifications") it was not
> initialized in all the call sites that emit
> 'SWITCHDEV_FDB_ADD_TO_BRIDGE' notification, so it can contain garbage.
> 

nice catch

>>  		if (err) {
>>  			err = notifier_from_errno(err);
>>  			break;
> 
> [...]
> 
>> @@ -1281,6 +1292,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>>  
>>  		if (swdev_notify)
>>  			flags |= BIT(BR_FDB_ADDED_BY_USER);
>> +
>> +		if (is_local)
>> +			flags |= BIT(BR_FDB_LOCAL);
> 
> I have at least once selftest where I forgot the 'static' keyword:
> 
> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1
> 
> This patch breaks the test when run against both the kernel and hardware
> data paths. I don't mind patching these tests, but we might get more
> reports in the future.
> 
> Nik, what do you think?
> 

Ahh, that's unfortunate. The patch's assumption is correct that we must not have fdb->dst == NULL
and the dst to be non-local (i.e. without BR_FDB_LOCAL). Since all solutions break user-space in
a different way and since this patch also already broke it by the check for !p && !NUD_PERMANENT
in __br_fdb_add() which was allowed before that, I think the best course of action is to ignore
NUD_PERMANENT in __br_fdb_add() for extern_learn case and always set BR_FDB_LOCAL in br_fdb_external_learn_add()
when !p. That would allow all prior calls to work and would remove the dst==NULL without BR_FDB_LOCAL
issue. Honestly, I doubt anyone is using extern_learn with bridge device entries, but we cannot assume
anything since this is already a part of the uAPI and we must allow it. Basically we silently
fix the BR_FDB_LOCAL problem so old user syntax and code can continue working.
It is a hack, but I don't see another solution which doesn't break user-space in some way.
Handling NUD_PERMANENT only with !p is equivalent, unfortunately we shouldn't keep the error
since that can break someone who was adding such entries without NUD_PERMANENT flag, but
we can force it in kernel, that should make such scripts succeed. Traffic used to be blackholed
for such entries and now it will be received locally, that will be the only difference.

TBH, I want to keep that error so middle ground would be to handle NUD_PERMANENT only
when used with !p and keep it. :) WDYT ?

Solution which forces BR_FDB_LOCAL for !p calls (completely untested):
diff --git a/net/bridge/br.c b/net/bridge/br.c
index c8ae823aa8e7..d3a32c6813e0 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -166,8 +166,7 @@ static int br_switchdev_event(struct notifier_block *unused,
        case SWITCHDEV_FDB_ADD_TO_BRIDGE:
                fdb_info = ptr;
                err = br_fdb_external_learn_add(br, p, fdb_info->addr,
-                                               fdb_info->vid,
-                                               fdb_info->is_local, false);
+                                               fdb_info->vid, false);
                if (err) {
                        err = notifier_from_errno(err);
                        break;
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index b8e22057f680..4e3b1b66f132 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1255,15 +1255,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
                rcu_read_unlock();
                local_bh_enable();
        } else if (ndm->ndm_flags & NTF_EXT_LEARNED) {
-               if (!p && !(ndm->ndm_state & NUD_PERMANENT)) {
-                       NL_SET_ERR_MSG_MOD(extack,
-                                          "FDB entry towards bridge must be permanent");
-                       return -EINVAL;
-               }
-
-               err = br_fdb_external_learn_add(br, p, addr, vid,
-                                               ndm->ndm_state & NUD_PERMANENT,
-                                               true);
+               err = br_fdb_external_learn_add(br, p, addr, vid, true);
        } else {
                spin_lock_bh(&br->hash_lock);
                err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
@@ -1491,7 +1483,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
 }
 
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-                             const unsigned char *addr, u16 vid, bool is_local,
+                             const unsigned char *addr, u16 vid,
                              bool swdev_notify)
 {
        struct net_bridge_fdb_entry *fdb;
@@ -1509,7 +1501,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
                if (swdev_notify)
                        flags |= BIT(BR_FDB_ADDED_BY_USER);
 
-               if (is_local)
+               if (!p)
                        flags |= BIT(BR_FDB_LOCAL);
 
                fdb = fdb_create(br, p, addr, vid, flags);
@@ -1538,7 +1530,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
                if (swdev_notify)
                        set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
 
-               if (is_local)
+               if (!p)
                        set_bit(BR_FDB_LOCAL, &fdb->flags);
 
                if (modified)
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 86969d1bd036..907e5742b392 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -778,7 +778,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev,
 int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
 void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-                             const unsigned char *addr, u16 vid, bool is_local,
+                             const unsigned char *addr, u16 vid,
                              bool swdev_notify);
 int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
                              const unsigned char *addr, u16 vid,



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-09 12:16 ` Ido Schimmel
  2021-08-09 12:32   ` Vladimir Oltean
  2021-08-09 15:33   ` Nikolay Aleksandrov
@ 2021-08-09 16:05   ` Vladimir Oltean
  2021-08-10  6:46     ` Ido Schimmel
  2 siblings, 1 reply; 23+ messages in thread
From: Vladimir Oltean @ 2021-08-09 16:05 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: nikolay, netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:
> I have at least once selftest where I forgot the 'static' keyword:
>
> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1
>
> This patch breaks the test when run against both the kernel and hardware
> data paths. I don't mind patching these tests, but we might get more
> reports in the future.

Is it the 'static' keyword that you forgot, or 'dynamic'? The
tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest
looks to me like it's testing the behavior of an FDB entry which should
roam, and which without the extern_learn flag would be ageable.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-09 15:33   ` Nikolay Aleksandrov
@ 2021-08-10  6:40     ` Ido Schimmel
  2021-08-10  7:21       ` [PATCH net] net: bridge: fix flags interpretation for extern learn fdb entries Nikolay Aleksandrov
  0 siblings, 1 reply; 23+ messages in thread
From: Ido Schimmel @ 2021-08-10  6:40 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Vladimir Oltean, netdev, Jakub Kicinski, David S. Miller,
	Jiri Pirko, Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On Mon, Aug 09, 2021 at 06:33:30PM +0300, Nikolay Aleksandrov wrote:
> TBH, I want to keep that error so middle ground would be to handle NUD_PERMANENT only
> when used with !p and keep it. :) WDYT ?

Yes, works for me

> 
> Solution which forces BR_FDB_LOCAL for !p calls (completely untested):

Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>

> diff --git a/net/bridge/br.c b/net/bridge/br.c
> index c8ae823aa8e7..d3a32c6813e0 100644
> --- a/net/bridge/br.c
> +++ b/net/bridge/br.c
> @@ -166,8 +166,7 @@ static int br_switchdev_event(struct notifier_block *unused,
>         case SWITCHDEV_FDB_ADD_TO_BRIDGE:
>                 fdb_info = ptr;
>                 err = br_fdb_external_learn_add(br, p, fdb_info->addr,
> -                                               fdb_info->vid,
> -                                               fdb_info->is_local, false);
> +                                               fdb_info->vid, false);
>                 if (err) {
>                         err = notifier_from_errno(err);
>                         break;
> diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
> index b8e22057f680..4e3b1b66f132 100644
> --- a/net/bridge/br_fdb.c
> +++ b/net/bridge/br_fdb.c
> @@ -1255,15 +1255,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
>                 rcu_read_unlock();
>                 local_bh_enable();
>         } else if (ndm->ndm_flags & NTF_EXT_LEARNED) {
> -               if (!p && !(ndm->ndm_state & NUD_PERMANENT)) {
> -                       NL_SET_ERR_MSG_MOD(extack,
> -                                          "FDB entry towards bridge must be permanent");
> -                       return -EINVAL;
> -               }
> -
> -               err = br_fdb_external_learn_add(br, p, addr, vid,
> -                                               ndm->ndm_state & NUD_PERMANENT,
> -                                               true);
> +               err = br_fdb_external_learn_add(br, p, addr, vid, true);
>         } else {
>                 spin_lock_bh(&br->hash_lock);
>                 err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
> @@ -1491,7 +1483,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
>  }
>  
>  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
> -                             const unsigned char *addr, u16 vid, bool is_local,
> +                             const unsigned char *addr, u16 vid,
>                               bool swdev_notify)
>  {
>         struct net_bridge_fdb_entry *fdb;
> @@ -1509,7 +1501,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>                 if (swdev_notify)
>                         flags |= BIT(BR_FDB_ADDED_BY_USER);
>  
> -               if (is_local)
> +               if (!p)
>                         flags |= BIT(BR_FDB_LOCAL);
>  
>                 fdb = fdb_create(br, p, addr, vid, flags);
> @@ -1538,7 +1530,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
>                 if (swdev_notify)
>                         set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
>  
> -               if (is_local)
> +               if (!p)
>                         set_bit(BR_FDB_LOCAL, &fdb->flags);
>  
>                 if (modified)
> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
> index 86969d1bd036..907e5742b392 100644
> --- a/net/bridge/br_private.h
> +++ b/net/bridge/br_private.h
> @@ -778,7 +778,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev,
>  int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
>  void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
>  int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
> -                             const unsigned char *addr, u16 vid, bool is_local,
> +                             const unsigned char *addr, u16 vid,
>                               bool swdev_notify);
>  int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
>                               const unsigned char *addr, u16 vid,
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-09 16:05   ` [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry Vladimir Oltean
@ 2021-08-10  6:46     ` Ido Schimmel
  2021-08-10 10:09       ` Vladimir Oltean
  0 siblings, 1 reply; 23+ messages in thread
From: Ido Schimmel @ 2021-08-10  6:46 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: nikolay, netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote:
> On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:
> > I have at least once selftest where I forgot the 'static' keyword:
> >
> > bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1
> >
> > This patch breaks the test when run against both the kernel and hardware
> > data paths. I don't mind patching these tests, but we might get more
> > reports in the future.
> 
> Is it the 'static' keyword that you forgot, or 'dynamic'? The
> tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest
> looks to me like it's testing the behavior of an FDB entry which should
> roam, and which without the extern_learn flag would be ageable.

static - no aging, no roaming
dynamic - aging, roaming
extern_learn - no aging, roaming

So these combinations do not make any sense and the kernel will ignore
static/dynamic when extern_learn is specified. It's needed to work
around iproute2 behavior of "assume permanent"

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net] net: bridge: fix flags interpretation for extern learn fdb entries
  2021-08-10  6:40     ` Ido Schimmel
@ 2021-08-10  7:21       ` Nikolay Aleksandrov
  2021-08-10 11:00         ` [PATCH net v2] " Nikolay Aleksandrov
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolay Aleksandrov @ 2021-08-10  7:21 UTC (permalink / raw)
  To: netdev; +Cc: roopa, bridge, vladimir.oltean, Nikolay Aleksandrov, Ido Schimmel

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Ignore fdb flags when adding port extern learn entries and always set
BR_FDB_LOCAL flag when adding bridge extern learn entries. This is
closest to the behaviour we had before and avoids breaking any use cases
which were allowed.

This patch fixes iproute2 calls which assume NUD_PERMANENT and were
allowed before, example:
$ bridge fdb add 00:11:22:33:44:55 dev swp1 extern_learn

Extern learn entries are allowed to roam, but do not expire, so static
or dynamic flags make no sense for them.

Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space")
Fixes: 0541a6293298 ("net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry")
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
As discussed I decided to keep the error for !p and !NUD_PERMANENT case.

 net/bridge/br.c         |  3 +--
 net/bridge/br_fdb.c     | 11 ++++-------
 net/bridge/br_private.h |  2 +-
 3 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/net/bridge/br.c b/net/bridge/br.c
index bbab9984f24e..ef743f94254d 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -166,8 +166,7 @@ static int br_switchdev_event(struct notifier_block *unused,
 	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
 		fdb_info = ptr;
 		err = br_fdb_external_learn_add(br, p, fdb_info->addr,
-						fdb_info->vid,
-						fdb_info->is_local, false);
+						fdb_info->vid, false);
 		if (err) {
 			err = notifier_from_errno(err);
 			break;
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 835cec1e5a03..5dee30966ed3 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1044,10 +1044,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
 					   "FDB entry towards bridge must be permanent");
 			return -EINVAL;
 		}
-
-		err = br_fdb_external_learn_add(br, p, addr, vid,
-						ndm->ndm_state & NUD_PERMANENT,
-						true);
+		err = br_fdb_external_learn_add(br, p, addr, vid, true);
 	} else {
 		spin_lock_bh(&br->hash_lock);
 		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
@@ -1275,7 +1272,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
 }
 
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-			      const unsigned char *addr, u16 vid, bool is_local,
+			      const unsigned char *addr, u16 vid,
 			      bool swdev_notify)
 {
 	struct net_bridge_fdb_entry *fdb;
@@ -1293,7 +1290,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 		if (swdev_notify)
 			flags |= BIT(BR_FDB_ADDED_BY_USER);
 
-		if (is_local)
+		if (!p)
 			flags |= BIT(BR_FDB_LOCAL);
 
 		fdb = fdb_create(br, p, addr, vid, flags);
@@ -1322,7 +1319,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 		if (swdev_notify)
 			set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
 
-		if (is_local)
+		if (!p)
 			set_bit(BR_FDB_LOCAL, &fdb->flags);
 
 		if (modified)
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index aa64d8d63ca3..2b48b204205e 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -711,7 +711,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev,
 int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
 void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-			      const unsigned char *addr, u16 vid, bool is_local,
+			      const unsigned char *addr, u16 vid,
 			      bool swdev_notify);
 int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-10  6:46     ` Ido Schimmel
@ 2021-08-10 10:09       ` Vladimir Oltean
  2021-08-10 10:15         ` Nikolay Aleksandrov
  0 siblings, 1 reply; 23+ messages in thread
From: Vladimir Oltean @ 2021-08-10 10:09 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: nikolay, netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote:
> On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote:
> > On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:
> > > I have at least once selftest where I forgot the 'static' keyword:
> > >
> > > bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1
> > >
> > > This patch breaks the test when run against both the kernel and hardware
> > > data paths. I don't mind patching these tests, but we might get more
> > > reports in the future.
> > 
> > Is it the 'static' keyword that you forgot, or 'dynamic'? The
> > tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest
> > looks to me like it's testing the behavior of an FDB entry which should
> > roam, and which without the extern_learn flag would be ageable.
> 
> static - no aging, no roaming
> dynamic - aging, roaming
> extern_learn - no aging, roaming
> 
> So these combinations do not make any sense and the kernel will ignore
> static/dynamic when extern_learn is specified. It's needed to work
> around iproute2 behavior of "assume permanent"

Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP
are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn'
belongs to the same class of bridge neighbor attributes as 'static'/'dynamic',
and that it is invalid to have the full degree of freedom. If it isn't,
shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state?
If it's too late to validate, shouldn't we at least document somewhere
that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED?
It is user API after all, easter eggs like this aren't very enjoyable.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-10 10:09       ` Vladimir Oltean
@ 2021-08-10 10:15         ` Nikolay Aleksandrov
  2021-08-10 10:38           ` Vladimir Oltean
  0 siblings, 1 reply; 23+ messages in thread
From: Nikolay Aleksandrov @ 2021-08-10 10:15 UTC (permalink / raw)
  To: Vladimir Oltean, Ido Schimmel
  Cc: netdev, Jakub Kicinski, David S. Miller, Jiri Pirko,
	Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On 10/08/2021 13:09, Vladimir Oltean wrote:
> On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote:
>> On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote:
>>> On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:
>>>> I have at least once selftest where I forgot the 'static' keyword:
>>>>
>>>> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1
>>>>
>>>> This patch breaks the test when run against both the kernel and hardware
>>>> data paths. I don't mind patching these tests, but we might get more
>>>> reports in the future.
>>>
>>> Is it the 'static' keyword that you forgot, or 'dynamic'? The
>>> tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest
>>> looks to me like it's testing the behavior of an FDB entry which should
>>> roam, and which without the extern_learn flag would be ageable.
>>
>> static - no aging, no roaming
>> dynamic - aging, roaming
>> extern_learn - no aging, roaming
>>
>> So these combinations do not make any sense and the kernel will ignore
>> static/dynamic when extern_learn is specified. It's needed to work
>> around iproute2 behavior of "assume permanent"
> 
> Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP
> are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn'
> belongs to the same class of bridge neighbor attributes as 'static'/'dynamic',
> and that it is invalid to have the full degree of freedom. If it isn't,
> shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state?
> If it's too late to validate, shouldn't we at least document somewhere
> that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED?
> It is user API after all, easter eggs like this aren't very enjoyable.
> 

It's too late unfortunately, ignoring other flags in that case has been the standard
behaviour for a long time (it has never made sense to specify flags for extern_learn
entries). I'll send a separate patch that adds a comment to document it or if you have
another thing in mind feel free to send a patch.

Thanks,
 Nik


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-10 10:15         ` Nikolay Aleksandrov
@ 2021-08-10 10:38           ` Vladimir Oltean
  2021-08-10 10:43             ` Nikolay Aleksandrov
  0 siblings, 1 reply; 23+ messages in thread
From: Vladimir Oltean @ 2021-08-10 10:38 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Ido Schimmel, netdev, Jakub Kicinski, David S. Miller,
	Jiri Pirko, Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On Tue, Aug 10, 2021 at 01:15:32PM +0300, Nikolay Aleksandrov wrote:
> On 10/08/2021 13:09, Vladimir Oltean wrote:
> > On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote:
> >> On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote:
> >>> On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:
> >>>> I have at least once selftest where I forgot the 'static' keyword:
> >>>>
> >>>> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1
> >>>>
> >>>> This patch breaks the test when run against both the kernel and hardware
> >>>> data paths. I don't mind patching these tests, but we might get more
> >>>> reports in the future.
> >>>
> >>> Is it the 'static' keyword that you forgot, or 'dynamic'? The
> >>> tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest
> >>> looks to me like it's testing the behavior of an FDB entry which should
> >>> roam, and which without the extern_learn flag would be ageable.
> >>
> >> static - no aging, no roaming
> >> dynamic - aging, roaming
> >> extern_learn - no aging, roaming
> >>
> >> So these combinations do not make any sense and the kernel will ignore
> >> static/dynamic when extern_learn is specified. It's needed to work
> >> around iproute2 behavior of "assume permanent"
> > 
> > Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP
> > are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn'
> > belongs to the same class of bridge neighbor attributes as 'static'/'dynamic',
> > and that it is invalid to have the full degree of freedom. If it isn't,
> > shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state?
> > If it's too late to validate, shouldn't we at least document somewhere
> > that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED?
> > It is user API after all, easter eggs like this aren't very enjoyable.
> > 
> 
> It's too late unfortunately, ignoring other flags in that case has been the standard
> behaviour for a long time (it has never made sense to specify flags for extern_learn
> entries). I'll send a separate patch that adds a comment to document it or if you have
> another thing in mind feel free to send a patch.

No, I don't have anything else in mind, but since the topic is the same
as the "net: bridge: fix flags interpretation for extern learn fdb entries"
patch you already sent, you could as well just send a v2 for that and
add an extra phrase in a comment somewhere near a NTF_EXT_LEARNED uapi
definition, or perhaps extend this comment right here:

/* NUD_NOARP & NUD_PERMANENT are pseudostates, they never change
   and make no address resolution or NUD.
   NUD_PERMANENT also cannot be deleted by garbage collectors.
 */

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry
  2021-08-10 10:38           ` Vladimir Oltean
@ 2021-08-10 10:43             ` Nikolay Aleksandrov
  0 siblings, 0 replies; 23+ messages in thread
From: Nikolay Aleksandrov @ 2021-08-10 10:43 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Ido Schimmel, netdev, Jakub Kicinski, David S. Miller,
	Jiri Pirko, Roopa Prabhu, bridge, syzkaller-bugs,
	syzbot+9ba1174359adba5a5b7c

On 10/08/2021 13:38, Vladimir Oltean wrote:
> On Tue, Aug 10, 2021 at 01:15:32PM +0300, Nikolay Aleksandrov wrote:
>> On 10/08/2021 13:09, Vladimir Oltean wrote:
>>> On Tue, Aug 10, 2021 at 09:46:34AM +0300, Ido Schimmel wrote:
>>>> On Mon, Aug 09, 2021 at 04:05:22PM +0000, Vladimir Oltean wrote:
>>>>> On Mon, Aug 09, 2021 at 03:16:40PM +0300, Ido Schimmel wrote:
>>>>>> I have at least once selftest where I forgot the 'static' keyword:
>>>>>>
>>>>>> bridge fdb add de:ad:be:ef:13:37 dev $swp1 master extern_learn vlan 1
>>>>>>
>>>>>> This patch breaks the test when run against both the kernel and hardware
>>>>>> data paths. I don't mind patching these tests, but we might get more
>>>>>> reports in the future.
>>>>>
>>>>> Is it the 'static' keyword that you forgot, or 'dynamic'? The
>>>>> tools/testing/selftests/net/forwarding/bridge_vlan_aware.sh selftest
>>>>> looks to me like it's testing the behavior of an FDB entry which should
>>>>> roam, and which without the extern_learn flag would be ageable.
>>>>
>>>> static - no aging, no roaming
>>>> dynamic - aging, roaming
>>>> extern_learn - no aging, roaming
>>>>
>>>> So these combinations do not make any sense and the kernel will ignore
>>>> static/dynamic when extern_learn is specified. It's needed to work
>>>> around iproute2 behavior of "assume permanent"
>>>
>>> Since NTF_EXT_LEARNED is part of ndm->ndm_flags and NUD_REACHABLE/NUD_NOARP
>>> are part of ndm->ndm_state, it is not at all clear to me that 'extern_learn'
>>> belongs to the same class of bridge neighbor attributes as 'static'/'dynamic',
>>> and that it is invalid to have the full degree of freedom. If it isn't,
>>> shouldn't the kernel validate that, instead of just ignoring the ndm->ndm_state?
>>> If it's too late to validate, shouldn't we at least document somewhere
>>> that the ndm_state is ignored in the presence of ndm_flags & NTF_EXT_LEARNED?
>>> It is user API after all, easter eggs like this aren't very enjoyable.
>>>
>>
>> It's too late unfortunately, ignoring other flags in that case has been the standard
>> behaviour for a long time (it has never made sense to specify flags for extern_learn
>> entries). I'll send a separate patch that adds a comment to document it or if you have
>> another thing in mind feel free to send a patch.
> 
> No, I don't have anything else in mind, but since the topic is the same
> as the "net: bridge: fix flags interpretation for extern learn fdb entries"
> patch you already sent, you could as well just send a v2 for that and
> add an extra phrase in a comment somewhere near a NTF_EXT_LEARNED uapi
> definition, or perhaps extend this comment right here:
> 
> /* NUD_NOARP & NUD_PERMANENT are pseudostates, they never change
>    and make no address resolution or NUD.
>    NUD_PERMANENT also cannot be deleted by garbage collectors.
>  */
> 

sure, I was going to send it for net-next, but I might as well do it in -net.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH net v2] net: bridge: fix flags interpretation for extern learn fdb entries
  2021-08-10  7:21       ` [PATCH net] net: bridge: fix flags interpretation for extern learn fdb entries Nikolay Aleksandrov
@ 2021-08-10 11:00         ` Nikolay Aleksandrov
  2021-08-10 13:50           ` Vladimir Oltean
  2021-08-10 20:20           ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 23+ messages in thread
From: Nikolay Aleksandrov @ 2021-08-10 11:00 UTC (permalink / raw)
  To: netdev; +Cc: roopa, vladimir.oltean, bridge, Nikolay Aleksandrov, Ido Schimmel

From: Nikolay Aleksandrov <nikolay@nvidia.com>

Ignore fdb flags when adding port extern learn entries and always set
BR_FDB_LOCAL flag when adding bridge extern learn entries. This is
closest to the behaviour we had before and avoids breaking any use cases
which were allowed.

This patch fixes iproute2 calls which assume NUD_PERMANENT and were
allowed before, example:
$ bridge fdb add 00:11:22:33:44:55 dev swp1 extern_learn

Extern learn entries are allowed to roam, but do not expire, so static
or dynamic flags make no sense for them.

Also add a comment for future reference.

Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space")
Fixes: 0541a6293298 ("net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry")
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
---
v2: add a comment as suggested, no functional changes

 include/uapi/linux/neighbour.h |  7 +++++--
 net/bridge/br.c                |  3 +--
 net/bridge/br_fdb.c            | 11 ++++-------
 net/bridge/br_private.h        |  2 +-
 4 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h
index dc8b72201f6c..00a60695fa53 100644
--- a/include/uapi/linux/neighbour.h
+++ b/include/uapi/linux/neighbour.h
@@ -66,8 +66,11 @@ enum {
 #define NUD_NONE	0x00
 
 /* NUD_NOARP & NUD_PERMANENT are pseudostates, they never change
-   and make no address resolution or NUD.
-   NUD_PERMANENT also cannot be deleted by garbage collectors.
+ * and make no address resolution or NUD.
+ * NUD_PERMANENT also cannot be deleted by garbage collectors.
+ * When NTF_EXT_LEARNED is set for a bridge fdb entry the different cache entry
+ * states don't make sense and thus are ignored. Such entries don't age and
+ * can roam.
  */
 
 struct nda_cacheinfo {
diff --git a/net/bridge/br.c b/net/bridge/br.c
index bbab9984f24e..ef743f94254d 100644
--- a/net/bridge/br.c
+++ b/net/bridge/br.c
@@ -166,8 +166,7 @@ static int br_switchdev_event(struct notifier_block *unused,
 	case SWITCHDEV_FDB_ADD_TO_BRIDGE:
 		fdb_info = ptr;
 		err = br_fdb_external_learn_add(br, p, fdb_info->addr,
-						fdb_info->vid,
-						fdb_info->is_local, false);
+						fdb_info->vid, false);
 		if (err) {
 			err = notifier_from_errno(err);
 			break;
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index 835cec1e5a03..5dee30966ed3 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -1044,10 +1044,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br,
 					   "FDB entry towards bridge must be permanent");
 			return -EINVAL;
 		}
-
-		err = br_fdb_external_learn_add(br, p, addr, vid,
-						ndm->ndm_state & NUD_PERMANENT,
-						true);
+		err = br_fdb_external_learn_add(br, p, addr, vid, true);
 	} else {
 		spin_lock_bh(&br->hash_lock);
 		err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);
@@ -1275,7 +1272,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
 }
 
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-			      const unsigned char *addr, u16 vid, bool is_local,
+			      const unsigned char *addr, u16 vid,
 			      bool swdev_notify)
 {
 	struct net_bridge_fdb_entry *fdb;
@@ -1293,7 +1290,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 		if (swdev_notify)
 			flags |= BIT(BR_FDB_ADDED_BY_USER);
 
-		if (is_local)
+		if (!p)
 			flags |= BIT(BR_FDB_LOCAL);
 
 		fdb = fdb_create(br, p, addr, vid, flags);
@@ -1322,7 +1319,7 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
 		if (swdev_notify)
 			set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
 
-		if (is_local)
+		if (!p)
 			set_bit(BR_FDB_LOCAL, &fdb->flags);
 
 		if (modified)
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index aa64d8d63ca3..2b48b204205e 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -711,7 +711,7 @@ int br_fdb_get(struct sk_buff *skb, struct nlattr *tb[], struct net_device *dev,
 int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p);
 void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p);
 int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p,
-			      const unsigned char *addr, u16 vid, bool is_local,
+			      const unsigned char *addr, u16 vid,
 			      bool swdev_notify);
 int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p,
 			      const unsigned char *addr, u16 vid,
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH net v2] net: bridge: fix flags interpretation for extern learn fdb entries
  2021-08-10 11:00         ` [PATCH net v2] " Nikolay Aleksandrov
@ 2021-08-10 13:50           ` Vladimir Oltean
  2021-08-10 20:20           ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 23+ messages in thread
From: Vladimir Oltean @ 2021-08-10 13:50 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, roopa, bridge, Nikolay Aleksandrov, Ido Schimmel

On Tue, Aug 10, 2021 at 02:00:10PM +0300, Nikolay Aleksandrov wrote:
> From: Nikolay Aleksandrov <nikolay@nvidia.com>
> 
> Ignore fdb flags when adding port extern learn entries and always set
> BR_FDB_LOCAL flag when adding bridge extern learn entries. This is
> closest to the behaviour we had before and avoids breaking any use cases
> which were allowed.
> 
> This patch fixes iproute2 calls which assume NUD_PERMANENT and were
> allowed before, example:
> $ bridge fdb add 00:11:22:33:44:55 dev swp1 extern_learn
> 
> Extern learn entries are allowed to roam, but do not expire, so static
> or dynamic flags make no sense for them.
> 
> Also add a comment for future reference.
> 
> Fixes: eb100e0e24a2 ("net: bridge: allow to add externally learned entries from user-space")
> Fixes: 0541a6293298 ("net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry")
> Reviewed-by: Ido Schimmel <idosch@nvidia.com>
> Tested-by: Ido Schimmel <idosch@nvidia.com>
> Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
> ---

Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH net v2] net: bridge: fix flags interpretation for extern learn fdb entries
  2021-08-10 11:00         ` [PATCH net v2] " Nikolay Aleksandrov
  2021-08-10 13:50           ` Vladimir Oltean
@ 2021-08-10 20:20           ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 23+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-08-10 20:20 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, roopa, vladimir.oltean, bridge, nikolay, idosch

Hello:

This patch was applied to netdev/net.git (refs/heads/master):

On Tue, 10 Aug 2021 14:00:10 +0300 you wrote:
> From: Nikolay Aleksandrov <nikolay@nvidia.com>
> 
> Ignore fdb flags when adding port extern learn entries and always set
> BR_FDB_LOCAL flag when adding bridge extern learn entries. This is
> closest to the behaviour we had before and avoids breaking any use cases
> which were allowed.
> 
> [...]

Here is the summary with links:
  - [net,v2] net: bridge: fix flags interpretation for extern learn fdb entries
    https://git.kernel.org/netdev/net/c/45a687879b31

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2021-08-10 20:20 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-01 23:17 [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry Vladimir Oltean
2021-08-02  7:42 ` Nikolay Aleksandrov
2021-08-02  9:20   ` Vladimir Oltean
2021-08-02  9:42     ` Nikolay Aleksandrov
2021-08-02 10:52       ` Vladimir Oltean
2021-08-02 11:02         ` Nikolay Aleksandrov
2021-08-02 11:20           ` Vladimir Oltean
2021-08-02 11:25 ` Nikolay Aleksandrov
2021-08-02 22:10 ` patchwork-bot+netdevbpf
2021-08-09 12:16 ` Ido Schimmel
2021-08-09 12:32   ` Vladimir Oltean
2021-08-09 15:33   ` Nikolay Aleksandrov
2021-08-10  6:40     ` Ido Schimmel
2021-08-10  7:21       ` [PATCH net] net: bridge: fix flags interpretation for extern learn fdb entries Nikolay Aleksandrov
2021-08-10 11:00         ` [PATCH net v2] " Nikolay Aleksandrov
2021-08-10 13:50           ` Vladimir Oltean
2021-08-10 20:20           ` patchwork-bot+netdevbpf
2021-08-09 16:05   ` [PATCH net] net: bridge: validate the NUD_PERMANENT bit when adding an extern_learn FDB entry Vladimir Oltean
2021-08-10  6:46     ` Ido Schimmel
2021-08-10 10:09       ` Vladimir Oltean
2021-08-10 10:15         ` Nikolay Aleksandrov
2021-08-10 10:38           ` Vladimir Oltean
2021-08-10 10:43             ` Nikolay Aleksandrov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).