All of lore.kernel.org
 help / color / mirror / Atom feed
* What is the correct way to install an L2 multicast route into a bridge?
@ 2020-07-08  9:04 Vladimir Oltean
  2020-07-08  9:16 ` Nikolay Aleksandrov
  0 siblings, 1 reply; 9+ messages in thread
From: Vladimir Oltean @ 2020-07-08  9:04 UTC (permalink / raw)
  To: nikolay, roopa, netdev

Hi,

I am confused after reading man/man8/bridge.8. I have a bridge br0 with
4 interfaces (eth0 -> eth3), and I would like to install a rule such
that the non-IP multicast address of 09:00:70:00:00:00 is only forwarded
towards 3 of those ports, instead of being flooded.
The manual says that 'bridge mdb' is only for IP multicast, and implies
that 'bridge fdb append' (NLM_F_APPEND) is only used by vxlan. So, what
is the correct user interface for what I am trying to do?

Thank you,
-Vladimir

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: What is the correct way to install an L2 multicast route into a bridge?
  2020-07-08  9:04 What is the correct way to install an L2 multicast route into a bridge? Vladimir Oltean
@ 2020-07-08  9:16 ` Nikolay Aleksandrov
  2020-07-08  9:42   ` Vladimir Oltean
  0 siblings, 1 reply; 9+ messages in thread
From: Nikolay Aleksandrov @ 2020-07-08  9:16 UTC (permalink / raw)
  To: Vladimir Oltean, roopa, netdev

On 08/07/2020 12:04, Vladimir Oltean wrote:
> Hi,
> 
> I am confused after reading man/man8/bridge.8. I have a bridge br0 with
> 4 interfaces (eth0 -> eth3), and I would like to install a rule such
> that the non-IP multicast address of 09:00:70:00:00:00 is only forwarded
> towards 3 of those ports, instead of being flooded.
> The manual says that 'bridge mdb' is only for IP multicast, and implies
> that 'bridge fdb append' (NLM_F_APPEND) is only used by vxlan. So, what
> is the correct user interface for what I am trying to do?
> 
> Thank you,
> -Vladimir
> 

Hi Vladimir,
The bridge currently doesn't support L2 multicast routes. The MDB interface needs to be extended
for such support. Soon I'll post patches that move it to a new, entirely netlink attribute-
based, structure so it can be extended easily for that, too. My change is motivated mainly by SSM
but it will help with implementing this feature as well.
In case you need it sooner, patches are always welcome! :)

Current MDB fixed-size structure can also be used for implementing L2 mcast routes, but it would
require some workarounds. 

Cheers,
 Nik



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: What is the correct way to install an L2 multicast route into a bridge?
  2020-07-08  9:16 ` Nikolay Aleksandrov
@ 2020-07-08  9:42   ` Vladimir Oltean
  2020-07-08 11:07     ` Nikolay Aleksandrov
  0 siblings, 1 reply; 9+ messages in thread
From: Vladimir Oltean @ 2020-07-08  9:42 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: roopa, netdev

On Wed, Jul 08, 2020 at 12:16:27PM +0300, Nikolay Aleksandrov wrote:
> On 08/07/2020 12:04, Vladimir Oltean wrote:
> > Hi,
> > 
> > I am confused after reading man/man8/bridge.8. I have a bridge br0 with
> > 4 interfaces (eth0 -> eth3), and I would like to install a rule such
> > that the non-IP multicast address of 09:00:70:00:00:00 is only forwarded
> > towards 3 of those ports, instead of being flooded.
> > The manual says that 'bridge mdb' is only for IP multicast, and implies
> > that 'bridge fdb append' (NLM_F_APPEND) is only used by vxlan. So, what
> > is the correct user interface for what I am trying to do?
> > 
> > Thank you,
> > -Vladimir
> > 
> 
> Hi Vladimir,
> The bridge currently doesn't support L2 multicast routes. The MDB interface needs to be extended
> for such support. Soon I'll post patches that move it to a new, entirely netlink attribute-
> based, structure so it can be extended easily for that, too. My change is motivated mainly by SSM
> but it will help with implementing this feature as well.
> In case you need it sooner, patches are always welcome! :)
> 
> Current MDB fixed-size structure can also be used for implementing L2 mcast routes, but it would
> require some workarounds. 
> 
> Cheers,
>  Nik
> 
> 

Thanks, Nikolay.
Isn't mdb_modify() already netlink-based? I think you're talking about
some changes to 'struct br_mdb_entry' which would be necessary. What
changes would be needed, do you know (both in the 'workaround' case as
well as in 'fully netlink')?

-Vladimir

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: What is the correct way to install an L2 multicast route into a bridge?
  2020-07-08  9:42   ` Vladimir Oltean
@ 2020-07-08 11:07     ` Nikolay Aleksandrov
  2020-07-08 11:17       ` Nikolay Aleksandrov
  0 siblings, 1 reply; 9+ messages in thread
From: Nikolay Aleksandrov @ 2020-07-08 11:07 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: roopa, netdev

On 08/07/2020 12:42, Vladimir Oltean wrote:
> On Wed, Jul 08, 2020 at 12:16:27PM +0300, Nikolay Aleksandrov wrote:
>> On 08/07/2020 12:04, Vladimir Oltean wrote:
>>> Hi,
>>>
>>> I am confused after reading man/man8/bridge.8. I have a bridge br0 with
>>> 4 interfaces (eth0 -> eth3), and I would like to install a rule such
>>> that the non-IP multicast address of 09:00:70:00:00:00 is only forwarded
>>> towards 3 of those ports, instead of being flooded.
>>> The manual says that 'bridge mdb' is only for IP multicast, and implies
>>> that 'bridge fdb append' (NLM_F_APPEND) is only used by vxlan. So, what
>>> is the correct user interface for what I am trying to do?
>>>
>>> Thank you,
>>> -Vladimir
>>>
>>
>> Hi Vladimir,
>> The bridge currently doesn't support L2 multicast routes. The MDB interface needs to be extended
>> for such support. Soon I'll post patches that move it to a new, entirely netlink attribute-
>> based, structure so it can be extended easily for that, too. My change is motivated mainly by SSM
>> but it will help with implementing this feature as well.
>> In case you need it sooner, patches are always welcome! :)
>>
>> Current MDB fixed-size structure can also be used for implementing L2 mcast routes, but it would
>> require some workarounds. 
>>
>> Cheers,
>>  Nik
>>
>>
> 
> Thanks, Nikolay.
> Isn't mdb_modify() already netlink-based? I think you're talking about
> some changes to 'struct br_mdb_entry' which would be necessary. What
> changes would be needed, do you know (both in the 'workaround' case as
> well as in 'fully netlink')?
> 
> -Vladimir
> 

That is netlink-based, but the uAPI (used also for add/del/dump) uses a fixed-size struct
which is very inconvenient and hard to extend. I plan to add MDBv2 which uses separate
netlink attributes and can be easily extended as we plan to add some new features and will
need that flexibility. It will use a new container attribute for the notifications as well.

In the workaround case IIRC you'd have to add a new protocol type to denote the L2 routes, and
re-work the lookup logic to include L2 in non-IP case. You'd have to edit the multicast fast-path,
and everything else that assumes the frame has to be IP/IPv6. I'm sure I'm missing some details as
last I did this was over an year ago where I made a quick and dirty hack that implemented it with proto = 0
to denote an L2 entry just as a proof of concept.
Also you would have to make sure all of that is compatible with current user-space code. For example
iproute2/bridge/mdb.c considers that proto can be only IPv4 or IPv6 if it's not v4, i.e. it will
print the new L2 entries as :: IPv6 entries until it's fixed.

Obviously some of the items for the workaround case are valid in all cases for L2 routes (e.g. fast-path/lookup edit).
But I think it's not that hard to implement without affecting the fast path much or even at all.

Cheers,
 Nik




^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: What is the correct way to install an L2 multicast route into a bridge?
  2020-07-08 11:07     ` Nikolay Aleksandrov
@ 2020-07-08 11:17       ` Nikolay Aleksandrov
  2020-07-08 12:55         ` Nikolay Aleksandrov
  0 siblings, 1 reply; 9+ messages in thread
From: Nikolay Aleksandrov @ 2020-07-08 11:17 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: roopa, netdev

On 08/07/2020 14:07, Nikolay Aleksandrov wrote:
> On 08/07/2020 12:42, Vladimir Oltean wrote:
>> On Wed, Jul 08, 2020 at 12:16:27PM +0300, Nikolay Aleksandrov wrote:
>>> On 08/07/2020 12:04, Vladimir Oltean wrote:
>>>> Hi,
>>>>
>>>> I am confused after reading man/man8/bridge.8. I have a bridge br0 with
>>>> 4 interfaces (eth0 -> eth3), and I would like to install a rule such
>>>> that the non-IP multicast address of 09:00:70:00:00:00 is only forwarded
>>>> towards 3 of those ports, instead of being flooded.
>>>> The manual says that 'bridge mdb' is only for IP multicast, and implies
>>>> that 'bridge fdb append' (NLM_F_APPEND) is only used by vxlan. So, what
>>>> is the correct user interface for what I am trying to do?
>>>>
>>>> Thank you,
>>>> -Vladimir
>>>>
>>>
>>> Hi Vladimir,
>>> The bridge currently doesn't support L2 multicast routes. The MDB interface needs to be extended
>>> for such support. Soon I'll post patches that move it to a new, entirely netlink attribute-
>>> based, structure so it can be extended easily for that, too. My change is motivated mainly by SSM
>>> but it will help with implementing this feature as well.
>>> In case you need it sooner, patches are always welcome! :)
>>>
>>> Current MDB fixed-size structure can also be used for implementing L2 mcast routes, but it would
>>> require some workarounds. 
>>>
>>> Cheers,
>>>  Nik
>>>
>>>
>>
>> Thanks, Nikolay.
>> Isn't mdb_modify() already netlink-based? I think you're talking about
>> some changes to 'struct br_mdb_entry' which would be necessary. What
>> changes would be needed, do you know (both in the 'workaround' case as
>> well as in 'fully netlink')?
>>
>> -Vladimir
>>
> 
> That is netlink-based, but the uAPI (used also for add/del/dump) uses a fixed-size struct
> which is very inconvenient and hard to extend. I plan to add MDBv2 which uses separate
> netlink attributes and can be easily extended as we plan to add some new features and will
> need that flexibility. It will use a new container attribute for the notifications as well.
> 
> In the workaround case IIRC you'd have to add a new protocol type to denote the L2 routes, and

Actually drop the whole /workaround/ comment altogether. It can be implemented fairly straight-forward
even with the struct we got now. You don't need any new attributes.
I just had forgotten the details and spoke too quickly. :)

> re-work the lookup logic to include L2 in non-IP case. You'd have to edit the multicast fast-path,
> and everything else that assumes the frame has to be IP/IPv6. I'm sure I'm missing some details as
> last I did this was over an year ago where I made a quick and dirty hack that implemented it with proto = 0
> to denote an L2 entry just as a proof of concept.
> Also you would have to make sure all of that is compatible with current user-space code. For example
> iproute2/bridge/mdb.c considers that proto can be only IPv4 or IPv6 if it's not v4, i.e. it will
> print the new L2 entries as :: IPv6 entries until it's fixed.
> 
> Obviously some of the items for the workaround case are valid in all cases for L2 routes (e.g. fast-path/lookup edit).
> But I think it's not that hard to implement without affecting the fast path much or even at all.
> 
> Cheers,
>  Nik
> 
> 
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: What is the correct way to install an L2 multicast route into a bridge?
  2020-07-08 11:17       ` Nikolay Aleksandrov
@ 2020-07-08 12:55         ` Nikolay Aleksandrov
  2020-07-08 13:10           ` Vladimir Oltean
                             ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Nikolay Aleksandrov @ 2020-07-08 12:55 UTC (permalink / raw)
  To: Vladimir Oltean; +Cc: roopa, netdev

[-- Attachment #1: Type: text/plain, Size: 2474 bytes --]

On 08/07/2020 14:17, Nikolay Aleksandrov wrote:
> On 08/07/2020 14:07, Nikolay Aleksandrov wrote:
>> On 08/07/2020 12:42, Vladimir Oltean wrote:
>>> On Wed, Jul 08, 2020 at 12:16:27PM +0300, Nikolay Aleksandrov wrote:
>>>> On 08/07/2020 12:04, Vladimir Oltean wrote:
[snip]
>>>>
>>>
>>> Thanks, Nikolay.
>>> Isn't mdb_modify() already netlink-based? I think you're talking about
>>> some changes to 'struct br_mdb_entry' which would be necessary. What
>>> changes would be needed, do you know (both in the 'workaround' case as
>>> well as in 'fully netlink')?
>>>
>>> -Vladimir
>>>
>>
>> That is netlink-based, but the uAPI (used also for add/del/dump) uses a fixed-size struct
>> which is very inconvenient and hard to extend. I plan to add MDBv2 which uses separate
>> netlink attributes and can be easily extended as we plan to add some new features and will
>> need that flexibility. It will use a new container attribute for the notifications as well.
>>
>> In the workaround case IIRC you'd have to add a new protocol type to denote the L2 routes, and
> 
> Actually drop the whole /workaround/ comment altogether. It can be implemented fairly straight-forward
> even with the struct we got now. You don't need any new attributes.
> I just had forgotten the details and spoke too quickly. :)
> 
>> re-work the lookup logic to include L2 in non-IP case. You'd have to edit the multicast fast-path,
>> and everything else that assumes the frame has to be IP/IPv6. I'm sure I'm missing some details as
>> last I did this was over an year ago where I made a quick and dirty hack that implemented it with proto = 0
>> to denote an L2 entry just as a proof of concept.
>> Also you would have to make sure all of that is compatible with current user-space code. For example
>> iproute2/bridge/mdb.c considers that proto can be only IPv4 or IPv6 if it's not v4, i.e. it will
>> print the new L2 entries as :: IPv6 entries until it's fixed.
>>
>> Obviously some of the items for the workaround case are valid in all cases for L2 routes (e.g. fast-path/lookup edit).
>> But I think it's not that hard to implement without affecting the fast path much or even at all.
>>
>> Cheers,
>>  Nik
>>

I found the patch and rebased it against net-next. I want to stress that it is unfinished and
barely tested, it was just a hack to enable L2 entries and forwarding.
If you're interested and find it useful please feel free to take it over as
I don't have time right now.

Thanks,
 Nik



[-- Attachment #2: 0001-net-bridge-multicast-add-support-for-L2-entries.patch --]
[-- Type: text/x-patch, Size: 9150 bytes --]

From 87ad93d5953932eb14739572743d049de1ba8b1d Mon Sep 17 00:00:00 2001
From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Wed, 8 Jul 2020 15:49:26 +0300
Subject: [RFC] net: bridge: multicast: add support for L2 entries

Unfinished.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
 include/linux/if_bridge.h      |  1 +
 include/uapi/linux/if_bridge.h |  2 ++
 net/bridge/br_device.c         |  2 +-
 net/bridge/br_input.c          |  2 +-
 net/bridge/br_mdb.c            | 35 ++++++++++++++++++++++++++--------
 net/bridge/br_multicast.c      | 12 +++++++++---
 net/bridge/br_private.h        |  7 +++++--
 7 files changed, 46 insertions(+), 15 deletions(-)

diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h
index b3a8d3054af0..4c95eff22c8e 100644
--- a/include/linux/if_bridge.h
+++ b/include/linux/if_bridge.h
@@ -20,6 +20,7 @@ struct br_ip {
 		struct in6_addr ip6;
 #endif
 	} u;
+	__u8 mac_addr[ETH_ALEN];
 	__be16		proto;
 	__u16           vid;
 };
diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index c114c1c2bd53..73e129370662 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -437,10 +437,12 @@ struct br_mdb_entry {
 	__u8 state;
 #define MDB_FLAGS_OFFLOAD	(1 << 0)
 #define MDB_FLAGS_FAST_LEAVE	(1 << 1)
+#define MDB_FLAGS_L2		(1 << 2)
 	__u8 flags;
 	__u16 vid;
 	struct {
 		union {
+			__u8 mac_addr[ETH_ALEN];
 			__be32	ip4;
 			struct in6_addr ip6;
 		} u;
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 8c7b78f8bc23..b32421f484f3 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -91,7 +91,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 
 		mdst = br_mdb_get(br, skb, vid);
 		if ((mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) &&
-		    br_multicast_querier_exists(br, eth_hdr(skb)))
+		    br_multicast_querier_exists(br, eth_hdr(skb), mdst))
 			br_multicast_flood(mdst, skb, false, true);
 		else
 			br_flood(br, skb, BR_PKT_MULTICAST, false, true);
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 59a318b9f646..d31b5c18c6a1 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -134,7 +134,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 	case BR_PKT_MULTICAST:
 		mdst = br_mdb_get(br, skb, vid);
 		if ((mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) &&
-		    br_multicast_querier_exists(br, eth_hdr(skb))) {
+		    br_multicast_querier_exists(br, eth_hdr(skb), mdst)) {
 			if ((mdst && mdst->host_joined) ||
 			    br_multicast_is_router(br)) {
 				local_rcv = true;
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index da5ed4cf9233..c80218a5e6fa 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -62,6 +62,8 @@ static void __mdb_entry_fill_flags(struct br_mdb_entry *e, unsigned char flags)
 		e->flags |= MDB_FLAGS_OFFLOAD;
 	if (flags & MDB_PG_FLAGS_FAST_LEAVE)
 		e->flags |= MDB_FLAGS_FAST_LEAVE;
+	if (flags & MDB_PG_FLAGS_L2)
+		e->flags |= MDB_FLAGS_L2;
 }
 
 static void __mdb_entry_to_br_ip(struct br_mdb_entry *entry, struct br_ip *ip)
@@ -72,9 +74,11 @@ static void __mdb_entry_to_br_ip(struct br_mdb_entry *entry, struct br_ip *ip)
 	if (ip->proto == htons(ETH_P_IP))
 		ip->u.ip4 = entry->addr.u.ip4;
 #if IS_ENABLED(CONFIG_IPV6)
-	else
+	else if (ip->proto == htons(ETH_P_IPV6))
 		ip->u.ip6 = entry->addr.u.ip6;
 #endif
+	else
+		memcpy(ip->mac_addr, entry->addr.u.mac_addr, ETH_ALEN);
 }
 
 static int __mdb_fill_info(struct sk_buff *skb,
@@ -103,9 +107,12 @@ static int __mdb_fill_info(struct sk_buff *skb,
 	if (mp->addr.proto == htons(ETH_P_IP))
 		e.addr.u.ip4 = mp->addr.u.ip4;
 #if IS_ENABLED(CONFIG_IPV6)
-	if (mp->addr.proto == htons(ETH_P_IPV6))
+	else if (mp->addr.proto == htons(ETH_P_IPV6))
 		e.addr.u.ip6 = mp->addr.u.ip6;
 #endif
+	else
+		memcpy(e.addr.u.mac_addr, mp->addr.mac_addr, ETH_ALEN);
+
 	e.addr.proto = mp->addr.proto;
 	nest_ent = nla_nest_start_noflag(skb,
 					 MDBA_MDB_ENTRY_INFO);
@@ -399,9 +406,11 @@ static void __br_mdb_notify(struct net_device *dev, struct net_bridge_port *p,
 	if (entry->addr.proto == htons(ETH_P_IP))
 		ip_eth_mc_map(entry->addr.u.ip4, mdb.addr);
 #if IS_ENABLED(CONFIG_IPV6)
-	else
+	else if (entry->addr.proto == htons(ETH_P_IPV6))
 		ipv6_eth_mc_map(&entry->addr.u.ip6, mdb.addr);
 #endif
+	else
+		memcpy(mdb.addr, entry->addr.u.mac_addr, ETH_ALEN);
 
 	mdb.obj.orig_dev = port_dev;
 	if (p && port_dev && type == RTM_NEWMDB) {
@@ -447,6 +456,7 @@ void br_mdb_notify(struct net_device *dev, struct net_bridge_port *port,
 		entry.ifindex = port->dev->ifindex;
 	else
 		entry.ifindex = dev->ifindex;
+	memcpy(entry.addr.u.mac_addr, group->mac_addr, ETH_ALEN);
 	entry.addr.proto = group->proto;
 	entry.addr.u.ip4 = group->u.ip4;
 #if IS_ENABLED(CONFIG_IPV6)
@@ -529,18 +539,24 @@ static bool is_valid_mdb_entry(struct br_mdb_entry *entry)
 	if (entry->ifindex == 0)
 		return false;
 
-	if (entry->addr.proto == htons(ETH_P_IP)) {
+	switch (entry->addr.proto) {
+	case htons(ETH_P_IP):
 		if (!ipv4_is_multicast(entry->addr.u.ip4))
 			return false;
 		if (ipv4_is_local_multicast(entry->addr.u.ip4))
 			return false;
+		break;
 #if IS_ENABLED(CONFIG_IPV6)
-	} else if (entry->addr.proto == htons(ETH_P_IPV6)) {
+	case htons(ETH_P_IPV6):
 		if (ipv6_addr_is_ll_all_nodes(&entry->addr.u.ip6))
 			return false;
+		break;
 #endif
-	} else
-		return false;
+	default:
+		if (entry->addr.proto != 0)
+			return false;
+	}
+
 	if (entry->state != MDB_PERMANENT && entry->state != MDB_TEMPORARY)
 		return false;
 	if (entry->vid >= VLAN_VID_MASK)
@@ -616,6 +632,9 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port,
 			return err;
 	}
 
+	if (mp->l2 && state != MDB_PERMANENT)
+		return -EINVAL;
+
 	/* host join */
 	if (!port) {
 		/* don't allow any flags for host-joined groups */
@@ -642,7 +661,7 @@ static int br_mdb_add_group(struct net_bridge *br, struct net_bridge_port *port,
 	if (unlikely(!p))
 		return -ENOMEM;
 	rcu_assign_pointer(*pp, p);
-	if (state == MDB_TEMPORARY)
+	if (state == MDB_TEMPORARY && !mp->l2)
 		mod_timer(&p->timer, now + br->multicast_membership_interval);
 
 	return 0;
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 83490bf73a13..79af5b656df1 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -133,7 +133,8 @@ struct net_bridge_mdb_entry *br_mdb_get(struct net_bridge *br,
 		break;
 #endif
 	default:
-		return NULL;
+		ip.proto = 0;
+		memcpy(&ip.mac_addr, eth_hdr(skb)->h_dest, ETH_ALEN);
 	}
 
 	return br_mdb_ip_get_rcu(br, &ip);
@@ -457,6 +458,7 @@ struct net_bridge_mdb_entry *br_multicast_new_group(struct net_bridge *br,
 
 	mp->br = br;
 	mp->addr = *group;
+	mp->l2 = !!(group->proto == 0);
 	timer_setup(&mp->timer, br_multicast_group_expired, 0);
 	err = rhashtable_lookup_insert_fast(&br->mdb_hash_tbl, &mp->rhnode,
 					    br_mdb_rht_params);
@@ -486,6 +488,8 @@ struct net_bridge_port_group *br_multicast_new_port_group(
 	p->addr = *group;
 	p->port = port;
 	p->flags = flags;
+	if (group->proto == htons(0))
+		p->flags |= MDB_PG_FLAGS_L2;
 	rcu_assign_pointer(p->next, next);
 	hlist_add_head(&p->mglist, &port->mglist);
 	timer_setup(&p->timer, br_multicast_port_group_expired, 0);
@@ -519,7 +523,9 @@ void br_multicast_host_join(struct net_bridge_mdb_entry *mp, bool notify)
 			br_mdb_notify(mp->br->dev, NULL, &mp->addr,
 				      RTM_NEWMDB, 0);
 	}
-	mod_timer(&mp->timer, jiffies + mp->br->multicast_membership_interval);
+	if (!mp->l2)
+		mod_timer(&mp->timer,
+			  jiffies + mp->br->multicast_membership_interval);
 }
 
 void br_multicast_host_leave(struct net_bridge_mdb_entry *mp, bool notify)
@@ -2252,7 +2258,7 @@ bool br_multicast_has_querier_anywhere(struct net_device *dev, int proto)
 	memset(&eth, 0, sizeof(eth));
 	eth.h_proto = htons(proto);
 
-	ret = br_multicast_querier_exists(br, &eth);
+	ret = br_multicast_querier_exists(br, &eth, NULL);
 
 unlock:
 	rcu_read_unlock();
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 65d2c163a24a..ea0a13d9fbea 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -213,6 +213,7 @@ struct net_bridge_fdb_entry {
 #define MDB_PG_FLAGS_PERMANENT	BIT(0)
 #define MDB_PG_FLAGS_OFFLOAD	BIT(1)
 #define MDB_PG_FLAGS_FAST_LEAVE	BIT(2)
+#define MDB_PG_FLAGS_L2		BIT(3)
 
 struct net_bridge_port_group {
 	struct net_bridge_port		*port;
@@ -233,6 +234,7 @@ struct net_bridge_mdb_entry {
 	struct timer_list		timer;
 	struct br_ip			addr;
 	bool				host_joined;
+	bool				l2;
 	struct hlist_node		mdb_node;
 };
 
@@ -816,7 +818,8 @@ __br_multicast_querier_exists(struct net_bridge *br,
 }
 
 static inline bool br_multicast_querier_exists(struct net_bridge *br,
-					       struct ethhdr *eth)
+					       struct ethhdr *eth,
+					       const struct net_bridge_mdb_entry *mdb)
 {
 	switch (eth->h_proto) {
 	case (htons(ETH_P_IP)):
@@ -828,7 +831,7 @@ static inline bool br_multicast_querier_exists(struct net_bridge *br,
 			&br->ip6_other_query, true);
 #endif
 	default:
-		return false;
+		return !!(mdb && mdb->l2);
 	}
 }
 
-- 
2.25.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: What is the correct way to install an L2 multicast route into a bridge?
  2020-07-08 12:55         ` Nikolay Aleksandrov
@ 2020-07-08 13:10           ` Vladimir Oltean
  2020-07-08 19:48           ` [RFC] net: bridge: multicast: add support for L2 entries kernel test robot
  2020-07-09  1:10           ` kernel test robot
  2 siblings, 0 replies; 9+ messages in thread
From: Vladimir Oltean @ 2020-07-08 13:10 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: roopa, netdev

On Wed, Jul 08, 2020 at 03:55:23PM +0300, Nikolay Aleksandrov wrote:
> On 08/07/2020 14:17, Nikolay Aleksandrov wrote:
> > On 08/07/2020 14:07, Nikolay Aleksandrov wrote:
> >> On 08/07/2020 12:42, Vladimir Oltean wrote:
> >>> On Wed, Jul 08, 2020 at 12:16:27PM +0300, Nikolay Aleksandrov wrote:
> >>>> On 08/07/2020 12:04, Vladimir Oltean wrote:
> [snip]
> >>>>
> >>>
> >>> Thanks, Nikolay.
> >>> Isn't mdb_modify() already netlink-based? I think you're talking about
> >>> some changes to 'struct br_mdb_entry' which would be necessary. What
> >>> changes would be needed, do you know (both in the 'workaround' case as
> >>> well as in 'fully netlink')?
> >>>
> >>> -Vladimir
> >>>
> >>
> >> That is netlink-based, but the uAPI (used also for add/del/dump) uses a fixed-size struct
> >> which is very inconvenient and hard to extend. I plan to add MDBv2 which uses separate
> >> netlink attributes and can be easily extended as we plan to add some new features and will
> >> need that flexibility. It will use a new container attribute for the notifications as well.
> >>
> >> In the workaround case IIRC you'd have to add a new protocol type to denote the L2 routes, and
> > 
> > Actually drop the whole /workaround/ comment altogether. It can be implemented fairly straight-forward
> > even with the struct we got now. You don't need any new attributes.
> > I just had forgotten the details and spoke too quickly. :)
> > 
> >> re-work the lookup logic to include L2 in non-IP case. You'd have to edit the multicast fast-path,
> >> and everything else that assumes the frame has to be IP/IPv6. I'm sure I'm missing some details as
> >> last I did this was over an year ago where I made a quick and dirty hack that implemented it with proto = 0
> >> to denote an L2 entry just as a proof of concept.
> >> Also you would have to make sure all of that is compatible with current user-space code. For example
> >> iproute2/bridge/mdb.c considers that proto can be only IPv4 or IPv6 if it's not v4, i.e. it will
> >> print the new L2 entries as :: IPv6 entries until it's fixed.
> >>
> >> Obviously some of the items for the workaround case are valid in all cases for L2 routes (e.g. fast-path/lookup edit).
> >> But I think it's not that hard to implement without affecting the fast path much or even at all.
> >>
> >> Cheers,
> >>  Nik
> >>
> 
> I found the patch and rebased it against net-next. I want to stress that it is unfinished and
> barely tested, it was just a hack to enable L2 entries and forwarding.
> If you're interested and find it useful please feel free to take it over as
> I don't have time right now.
> 
> Thanks,
>  Nik
> 
> 

Thanks! I'll give it a try, and I'll submit it if I get it to work
properly and see no regressions with IP multicast.

-Vladimir

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] net: bridge: multicast: add support for L2 entries
  2020-07-08 12:55         ` Nikolay Aleksandrov
  2020-07-08 13:10           ` Vladimir Oltean
@ 2020-07-08 19:48           ` kernel test robot
  2020-07-09  1:10           ` kernel test robot
  2 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2020-07-08 19:48 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5499 bytes --]

Hi Nikolay,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on linus/master]
[also build test ERROR on v5.8-rc4 next-20200708]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use  as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Nikolay-Aleksandrov/net-bridge-multicast-add-support-for-L2-entries/20200708-205657
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git dcde237b9b0eb1d19306e6f48c0a4e058907619f
config: parisc-randconfig-r004-20200709 (attached as .config)
compiler: hppa-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=parisc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   net/bridge/br_device.c: In function 'br_dev_xmit':
>> net/bridge/br_device.c:94:7: error: too many arguments to function 'br_multicast_querier_exists'
      94 |       br_multicast_querier_exists(br, eth_hdr(skb), mdst))
         |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from net/bridge/br_device.c:19:
   net/bridge/br_private.h:897:20: note: declared here
     897 | static inline bool br_multicast_querier_exists(struct net_bridge *br,
         |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
--
   net/bridge/br_input.c: In function 'br_handle_frame_finish':
>> net/bridge/br_input.c:137:7: error: too many arguments to function 'br_multicast_querier_exists'
     137 |       br_multicast_querier_exists(br, eth_hdr(skb), mdst)) {
         |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~
   In file included from net/bridge/br_input.c:23:
   net/bridge/br_private.h:897:20: note: declared here
     897 | static inline bool br_multicast_querier_exists(struct net_bridge *br,
         |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~

vim +/br_multicast_querier_exists +94 net/bridge/br_device.c

    26	
    27	/* net device transmit always called with BH disabled */
    28	netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
    29	{
    30		struct net_bridge *br = netdev_priv(dev);
    31		struct net_bridge_fdb_entry *dst;
    32		struct net_bridge_mdb_entry *mdst;
    33		struct pcpu_sw_netstats *brstats = this_cpu_ptr(br->stats);
    34		const struct nf_br_ops *nf_ops;
    35		u8 state = BR_STATE_FORWARDING;
    36		const unsigned char *dest;
    37		u16 vid = 0;
    38	
    39		rcu_read_lock();
    40		nf_ops = rcu_dereference(nf_br_ops);
    41		if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
    42			rcu_read_unlock();
    43			return NETDEV_TX_OK;
    44		}
    45	
    46		u64_stats_update_begin(&brstats->syncp);
    47		brstats->tx_packets++;
    48		brstats->tx_bytes += skb->len;
    49		u64_stats_update_end(&brstats->syncp);
    50	
    51		br_switchdev_frame_unmark(skb);
    52		BR_INPUT_SKB_CB(skb)->brdev = dev;
    53		BR_INPUT_SKB_CB(skb)->frag_max_size = 0;
    54	
    55		skb_reset_mac_header(skb);
    56		skb_pull(skb, ETH_HLEN);
    57	
    58		if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid, &state))
    59			goto out;
    60	
    61		if (IS_ENABLED(CONFIG_INET) &&
    62		    (eth_hdr(skb)->h_proto == htons(ETH_P_ARP) ||
    63		     eth_hdr(skb)->h_proto == htons(ETH_P_RARP)) &&
    64		    br_opt_get(br, BROPT_NEIGH_SUPPRESS_ENABLED)) {
    65			br_do_proxy_suppress_arp(skb, br, vid, NULL);
    66		} else if (IS_ENABLED(CONFIG_IPV6) &&
    67			   skb->protocol == htons(ETH_P_IPV6) &&
    68			   br_opt_get(br, BROPT_NEIGH_SUPPRESS_ENABLED) &&
    69			   pskb_may_pull(skb, sizeof(struct ipv6hdr) +
    70					 sizeof(struct nd_msg)) &&
    71			   ipv6_hdr(skb)->nexthdr == IPPROTO_ICMPV6) {
    72				struct nd_msg *msg, _msg;
    73	
    74				msg = br_is_nd_neigh_msg(skb, &_msg);
    75				if (msg)
    76					br_do_suppress_nd(skb, br, vid, NULL, msg);
    77		}
    78	
    79		dest = eth_hdr(skb)->h_dest;
    80		if (is_broadcast_ether_addr(dest)) {
    81			br_flood(br, skb, BR_PKT_BROADCAST, false, true);
    82		} else if (is_multicast_ether_addr(dest)) {
    83			if (unlikely(netpoll_tx_running(dev))) {
    84				br_flood(br, skb, BR_PKT_MULTICAST, false, true);
    85				goto out;
    86			}
    87			if (br_multicast_rcv(br, NULL, skb, vid)) {
    88				kfree_skb(skb);
    89				goto out;
    90			}
    91	
    92			mdst = br_mdb_get(br, skb, vid);
    93			if ((mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) &&
  > 94			    br_multicast_querier_exists(br, eth_hdr(skb), mdst))
    95				br_multicast_flood(mdst, skb, false, true);
    96			else
    97				br_flood(br, skb, BR_PKT_MULTICAST, false, true);
    98		} else if ((dst = br_fdb_find_rcu(br, dest, vid)) != NULL) {
    99			br_forward(dst->dst, skb, false, true);
   100		} else {
   101			br_flood(br, skb, BR_PKT_UNICAST, false, true);
   102		}
   103	out:
   104		rcu_read_unlock();
   105		return NETDEV_TX_OK;
   106	}
   107	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 25316 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC] net: bridge: multicast: add support for L2 entries
  2020-07-08 12:55         ` Nikolay Aleksandrov
  2020-07-08 13:10           ` Vladimir Oltean
  2020-07-08 19:48           ` [RFC] net: bridge: multicast: add support for L2 entries kernel test robot
@ 2020-07-09  1:10           ` kernel test robot
  2 siblings, 0 replies; 9+ messages in thread
From: kernel test robot @ 2020-07-09  1:10 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5514 bytes --]

Hi Nikolay,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on linus/master]
[also build test ERROR on v5.8-rc4 next-20200708]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use  as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Nikolay-Aleksandrov/net-bridge-multicast-add-support-for-L2-entries/20200708-205657
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git dcde237b9b0eb1d19306e6f48c0a4e058907619f
config: x86_64-randconfig-a016-20200708 (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 02946de3802d3bc65bc9f2eb9b8d4969b5a7add8)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

>> net/bridge/br_device.c:94:53: error: too many arguments to function call, expected 2, have 3
                       br_multicast_querier_exists(br, eth_hdr(skb), mdst))
                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~                   ^~~~
   net/bridge/br_private.h:897:20: note: 'br_multicast_querier_exists' declared here
   static inline bool br_multicast_querier_exists(struct net_bridge *br,
                      ^
   1 error generated.
--
>> net/bridge/br_input.c:137:53: error: too many arguments to function call, expected 2, have 3
                       br_multicast_querier_exists(br, eth_hdr(skb), mdst)) {
                       ~~~~~~~~~~~~~~~~~~~~~~~~~~~                   ^~~~
   net/bridge/br_private.h:897:20: note: 'br_multicast_querier_exists' declared here
   static inline bool br_multicast_querier_exists(struct net_bridge *br,
                      ^
   1 error generated.

vim +94 net/bridge/br_device.c

    26	
    27	/* net device transmit always called with BH disabled */
    28	netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
    29	{
    30		struct net_bridge *br = netdev_priv(dev);
    31		struct net_bridge_fdb_entry *dst;
    32		struct net_bridge_mdb_entry *mdst;
    33		struct pcpu_sw_netstats *brstats = this_cpu_ptr(br->stats);
    34		const struct nf_br_ops *nf_ops;
    35		u8 state = BR_STATE_FORWARDING;
    36		const unsigned char *dest;
    37		u16 vid = 0;
    38	
    39		rcu_read_lock();
    40		nf_ops = rcu_dereference(nf_br_ops);
    41		if (nf_ops && nf_ops->br_dev_xmit_hook(skb)) {
    42			rcu_read_unlock();
    43			return NETDEV_TX_OK;
    44		}
    45	
    46		u64_stats_update_begin(&brstats->syncp);
    47		brstats->tx_packets++;
    48		brstats->tx_bytes += skb->len;
    49		u64_stats_update_end(&brstats->syncp);
    50	
    51		br_switchdev_frame_unmark(skb);
    52		BR_INPUT_SKB_CB(skb)->brdev = dev;
    53		BR_INPUT_SKB_CB(skb)->frag_max_size = 0;
    54	
    55		skb_reset_mac_header(skb);
    56		skb_pull(skb, ETH_HLEN);
    57	
    58		if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid, &state))
    59			goto out;
    60	
    61		if (IS_ENABLED(CONFIG_INET) &&
    62		    (eth_hdr(skb)->h_proto == htons(ETH_P_ARP) ||
    63		     eth_hdr(skb)->h_proto == htons(ETH_P_RARP)) &&
    64		    br_opt_get(br, BROPT_NEIGH_SUPPRESS_ENABLED)) {
    65			br_do_proxy_suppress_arp(skb, br, vid, NULL);
    66		} else if (IS_ENABLED(CONFIG_IPV6) &&
    67			   skb->protocol == htons(ETH_P_IPV6) &&
    68			   br_opt_get(br, BROPT_NEIGH_SUPPRESS_ENABLED) &&
    69			   pskb_may_pull(skb, sizeof(struct ipv6hdr) +
    70					 sizeof(struct nd_msg)) &&
    71			   ipv6_hdr(skb)->nexthdr == IPPROTO_ICMPV6) {
    72				struct nd_msg *msg, _msg;
    73	
    74				msg = br_is_nd_neigh_msg(skb, &_msg);
    75				if (msg)
    76					br_do_suppress_nd(skb, br, vid, NULL, msg);
    77		}
    78	
    79		dest = eth_hdr(skb)->h_dest;
    80		if (is_broadcast_ether_addr(dest)) {
    81			br_flood(br, skb, BR_PKT_BROADCAST, false, true);
    82		} else if (is_multicast_ether_addr(dest)) {
    83			if (unlikely(netpoll_tx_running(dev))) {
    84				br_flood(br, skb, BR_PKT_MULTICAST, false, true);
    85				goto out;
    86			}
    87			if (br_multicast_rcv(br, NULL, skb, vid)) {
    88				kfree_skb(skb);
    89				goto out;
    90			}
    91	
    92			mdst = br_mdb_get(br, skb, vid);
    93			if ((mdst || BR_INPUT_SKB_CB_MROUTERS_ONLY(skb)) &&
  > 94			    br_multicast_querier_exists(br, eth_hdr(skb), mdst))
    95				br_multicast_flood(mdst, skb, false, true);
    96			else
    97				br_flood(br, skb, BR_PKT_MULTICAST, false, true);
    98		} else if ((dst = br_fdb_find_rcu(br, dest, vid)) != NULL) {
    99			br_forward(dst->dst, skb, false, true);
   100		} else {
   101			br_flood(br, skb, BR_PKT_UNICAST, false, true);
   102		}
   103	out:
   104		rcu_read_unlock();
   105		return NETDEV_TX_OK;
   106	}
   107	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 34775 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-07-09  1:10 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-08  9:04 What is the correct way to install an L2 multicast route into a bridge? Vladimir Oltean
2020-07-08  9:16 ` Nikolay Aleksandrov
2020-07-08  9:42   ` Vladimir Oltean
2020-07-08 11:07     ` Nikolay Aleksandrov
2020-07-08 11:17       ` Nikolay Aleksandrov
2020-07-08 12:55         ` Nikolay Aleksandrov
2020-07-08 13:10           ` Vladimir Oltean
2020-07-08 19:48           ` [RFC] net: bridge: multicast: add support for L2 entries kernel test robot
2020-07-09  1:10           ` kernel test robot

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.