netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ido Schimmel <idosch@idosch.org>
To: "Linus Lüssing" <linus.luessing@c0d3.blue>
Cc: Florian Fainelli <f.fainelli@gmail.com>,
	Jiri Pirko <jiri@resnulli.us>,
	nikolay@cumulusnetworks.com,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	bridge@lists.linux-foundation.org,
	Russell King - ARM Linux admin <linux@armlinux.org.uk>,
	"davem@davemloft.net" <davem@davemloft.net>,
	Ido Schimmel <idosch@mellanox.com>,
	Vivien Didelot <vivien.didelot@gmail.com>,
	b.a.t.m.a.n@lists.open-mesh.org
Subject: Re: [RFC net-next] net: dsa: add support for MC_DISABLED attribute
Date: Sun, 7 Jul 2019 12:07:47 +0300	[thread overview]
Message-ID: <20190707090747.GA5516@splinter> (raw)
In-Reply-To: <20190702231308.GA2414@otheros>

On Wed, Jul 03, 2019 at 01:13:08AM +0200, Linus Lüssing wrote:
> Hi Ido,
> 
> > Do you differentiate between IPv4 and IPv6 in batman-adv?
> 
> For most things, yes: The querier state is kept separately for
> IPv4 and IPv6. And we do have something like a "router node"
> flag to signalize that a node needs all multicast traffic, which
> is split into IPv4 and IPv6.
> 
> The "MDB" equivalent in batman-adv (multicast entries in our "TT",
> translation table) are on MAC address base right now, not on an IP
> address base yet, so that sounds similar to what you do in mlxsw?

Yes.

> Regarding querier state, we periodically ask the
> bridge via "br_multicast_has_querier_anywhere(dev, ETH_P_IP)"
> and "br_multicast_has_querier_anywhere(dev, ETH_P_IPV6)".
> 
> (Something more event based with handler functions would probably
> be nicer, but this was the easier thing to start with.)

Thanks for the reference. Will check the code. I believe that we will
add switchdev notifications for querier state change, so it might be
useful for you as well.

> > 1. All the IPv6 MDB entries need to be removed from the device. At least
> > in mlxsw, we do not have a way to ignore only IPv6 entries. From the
> > device's perspective, an MDB entry is just a multicast DMAC with a
> > bitmap of ports packets should be replicated to.
> 
> Ah, I see, yes. We had a similar "issue". Initially we just always
> added any multicast entry into our translation table offered by
> the IP stack or bridge, no matter what a querier state or "router
> node" state said. Which would have led to a node receiving two
> copies of a multicast packet if it were both a querier or router
> and were also having a listener announced via IGMP/MLD.
> 
> So there we also just recently changed that, to filter out
> IPv6 multicast TT entries if this node were also announcing itself as
> an MLD querier or IPv6 router. And same, independently for
> IPv4/IGMP.

This is actually not a problem with mlxsw. The ports a packet should be
replicated to are represented using a bitmap. It does not matter if we
set the bit because it has an MDB entry or because it is an mrouter
port. And obviously it does not matter if we set it twice :)

> > 2. We need to split the flood tables used for IPv4 and IPv6 unregistered
> > multicast packets. For IPv4, packets should only be flooded to mrouter
> > ports whereas for IPv6 packets should be flooded to all the member
> > ports.
> 
> This one I do not fully understand yet. Why don't you apply the
> "flood to all ports" (in the no IGMP querier present case)
> for IPv4, too?
> 
> Sure, for IPv4 nothing "essential" will break, as IPv4 unicast
> does not rely on multicast (contrary to IPv6, due to NDP, as you
> mentioned). But still there would be potential multicast packet loss
> for a 239.x.x.x listener on the local link, for instance, wouldn't
> there?
> 
> 
> If I'm not mistaken, we do not apply differing behaviour for IPv4
> vs. IPv6 in the bridge either and would flood on all ports for IPv4
> with no querier present, too.

I think I was not clear, so I will explain again. I was referring to a
situation where IPv4 has a querier, but IPv6 does not. In this case, the
bridge will flood IPv4 unregistered multicast packets to mrouter ports
only. On the other hand, IPv6 unregistered multicast packets will be
flooded to all the ports. Based on my reading of the code, this is
controlled by 'mcast_hit' in br_handle_frame_finish().

In mlxsw, each packet type (e.g., unknown unicast, IPvX unregistered
multicast) is associated with a flood table (basically a huge matrix,
where row corresponds to VLAN and column corresponds to a port). If we
are to handle the case where IPv4 unregistered multicast packets need to
be flooded to mrouter ports only, whereas IPv6 unregistered multicast
packets need to be flooded to all the ports, then we need to use a
separate flood table for each.

Alternatively, we can use the same flood table, but flood to all the
ports if IPv4 or IPv6 querier is missing. I do not think anything will
break, it is just very efficient. This seems to be allowed by RFC 4541
(Section 2.1.2):

"If a switch receives an unregistered packet, it must forward that
packet on all ports to which an IGMP router is attached. A switch may
default to forwarding unregistered packets on all ports."

> Regards, Linus

Linus, thanks a lot for the great feedback. I really appreciate it!

  parent reply	other threads:[~2019-07-07  9:08 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-20 23:56 [RFC net-next] net: dsa: add support for MC_DISABLED attribute Vivien Didelot
2019-06-21  2:24 ` Florian Fainelli
2019-06-21 21:29   ` Vivien Didelot
2019-06-21 22:09     ` Russell King - ARM Linux admin
2019-06-23  7:09     ` Ido Schimmel
2019-06-23  7:26       ` Russell King - ARM Linux admin
2019-06-23  7:44         ` Ido Schimmel
2019-06-29 16:29           ` Ido Schimmel
2019-06-30 16:56             ` Linus Lüssing
2019-07-02 14:27               ` Nikolay Aleksandrov
2019-07-02 17:11               ` Ido Schimmel
     [not found]                 ` <20190702231308.GA2414@otheros>
2019-07-07  9:07                   ` Ido Schimmel [this message]
2019-07-05 16:01       ` Vivien Didelot
2019-07-07 10:28         ` Ido Schimmel
2019-06-23  6:48   ` Ido Schimmel
2019-06-29 15:31     ` Ido Schimmel
2019-06-29 23:06       ` Andrew Lunn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190707090747.GA5516@splinter \
    --to=idosch@idosch.org \
    --cc=b.a.t.m.a.n@lists.open-mesh.org \
    --cc=bridge@lists.linux-foundation.org \
    --cc=davem@davemloft.net \
    --cc=f.fainelli@gmail.com \
    --cc=idosch@mellanox.com \
    --cc=jiri@resnulli.us \
    --cc=linus.luessing@c0d3.blue \
    --cc=linux@armlinux.org.uk \
    --cc=netdev@vger.kernel.org \
    --cc=nikolay@cumulusnetworks.com \
    --cc=vivien.didelot@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).