* packets trickling out of STP-blocked ports
@ 2021-12-30 23:07 Colin Foster
  2021-12-31  0:28 ` Vladimir Oltean
  2021-12-31 10:27 ` Alexandre Belloni
  0 siblings, 2 replies; 8+ messages in thread
From: Colin Foster @ 2021-12-30 23:07 UTC (permalink / raw)
  To: netdev; +Cc: Vladimir Oltean, Alexandre Belloni, Horatiu Vultur

Hi all,

I'm not sure who to include in this email, so I'm starting with this
list.

As is probably obvious to those on this list, I'm testing a VSC7512 dev
board controlled via SPI. The patches are still out-of-tree, but I
figured I'd report these findings, since they seem real.

In my setup, port 0 of the 7512 is tied to a BeagleBone Black. Port 1 is
tied to my development PC. Ports 2 and 3 are tied together to test STP.

I run the commands:

ip link set eth0 up
for p in swp1 swp2 swp3; do ip link set dev "$p" up; done
ip link add name br0 type bridge stp_state 1
for p in swp1 swp2 swp3; do ip link set dev "$p" master br0; done
ip addr add 10.100.3.1/16 dev br0
ip link set dev br0 up

After running this, the STP blocks swp3, and swp1/2 are forwarding.

Periodically I see messages saying that swp2 is receiving packets with
own address as source address.

I can confirm via ethtool that TX packets are increasing on swp3. I
believe I captured the event via tshark. A 4 minute capture showed three
non-STP packets on swp2. All three of these packets are ICMPv6 Router
Solicitation packets.
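
For anyone wanting to repeat the observation, here is a rough sketch of the two checks described above. The function name `watch_blocked_port` and the `RUN` dry-run switch are made up for illustration, and the port names are simply the ones from this setup. By default the commands are only printed; run the script as `RUN= sh script.sh` (as root) to execute them.

```sh
#!/bin/sh
# Dry run by default: commands are printed, not executed.
RUN=${RUN-echo}

watch_blocked_port() {
    blocked=$1   # port that STP has put in the blocking state (swp3 here)
    peer=$2      # port cabled to it (swp2 here)
    # Driver statistics for the blocked port; the TX counters should
    # stay flat if nothing egresses.
    $RUN ethtool -S "$blocked"
    # Capture on the peer for four minutes, leaving out STP BPDUs.
    $RUN tshark -i "$peer" -a duration:240 -f "not stp"
}

watch_blocked_port swp3 swp2
```

Comparing the `ethtool -S` output before and after the capture shows whether the TX counters moved during the same window.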

I would expect no packets at all to egress swp3. Is this an issue that
is unique to me and my in-development configuration? Or is this an issue
with all Ocelot / Felix devices?

If this is an Ocelot thing, I can try to come up with a different test 
setup to capture more data... printing the packet when it is received,
capturing the traffic externally, capturing eth0 traffic to see if it is
coming from the kernel or being hardware-forwarded...

(side note - if there's a place where a parser for Ocelot NPI traffic is
hidden, that might eventually save me a lot of debugging in Lua)


To give an idea of how frequently this happens: my system has currently
been up for 3700 seconds. Eight "own address as source address" events
have happened at 66, 96, 156, 279, 509, 996, 1897, and 3699 seconds.
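
The spacing of those events is easier to see as gaps between consecutive timestamps. A small sketch (the helper name `gaps` is hypothetical) computes them:

```sh
#!/bin/sh
# Print the gap, in seconds, between consecutive timestamps.
gaps() {
    prev=""
    out=""
    for t in "$@"; do
        if [ -n "$prev" ]; then
            # Append the difference to the space-separated result.
            out="${out:+$out }$((t - prev))"
        fi
        prev=$t
    done
    printf '%s\n' "$out"
}

gaps 66 96 156 279 509 996 1897 3699
# prints: 30 60 123 230 487 901 1802
```

Each gap is roughly twice the previous one. That would be consistent with a sender that backs off exponentially, which as far as I know is what Linux does for Router Solicitations (RFC 7559), but that reading is an assumption and not something measured here.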

Thanks, 

Colin Foster


* Re: packets trickling out of STP-blocked ports
  2021-12-30 23:07 packets trickling out of STP-blocked ports Colin Foster
@ 2021-12-31  0:28 ` Vladimir Oltean
  2022-01-01 16:04   ` Vladimir Oltean
  2021-12-31 10:27 ` Alexandre Belloni
  1 sibling, 1 reply; 8+ messages in thread
From: Vladimir Oltean @ 2021-12-31  0:28 UTC (permalink / raw)
  To: Colin Foster; +Cc: netdev, Alexandre Belloni, Horatiu Vultur

Hi Colin,

On Thu, Dec 30, 2021 at 03:07:40PM -0800, Colin Foster wrote:
> After running this, the STP blocks swp3, and swp1/2 are forwarding.
>
> Periodically I see messages saying that swp2 is receiving packets with
> own address as source address.
>
> I can confirm that via ethtool that TX packets are increasing on swp3. I
> believe I captured the event via tshark. A 4 minute capture showed three
> non-STP packets on swp2. All three of these packets are ICMPv6 Router
> Solicitation packets.
>
> I would expect no packets at all to egress swp3. Is this an issue that
> is unique to me and my in-development configuration? Or is this an issue
> with all Ocelot / Felix devices?

I don't remember noticing these (or maybe I did and I forgot), but
reasoning about it, it's a pretty logical consequence of some of the
design decisions that were made.

One would think that when a network interface is under a bridge, it is
unavailable for direct IP termination by itself - you do the IP
termination through the br0 interface. But that isn't really enforced
anywhere - it's just that the bridge breaks IP termination by default on
its individual member ports by stealing all their traffic with its RX handler.
That RX handler can be taught what to steal and what not to steal using
netfilter ebtables rules. With some carefully designed rules, you could
still have some IP termination through the individual bridge ports.
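
As a rough illustration of such a rule (not something tested on Ocelot), the classic ebtables `broute` table can tell the bridge to leave certain frames alone. In that table the `DROP` target means "route this frame instead of bridging it". The function name, the `RUN` dry-run switch and the choice of port and EtherType are assumptions for the example, and ebtables builds based on nftables may not provide the `broute` table at all.

```sh
#!/bin/sh
# Dry run by default: the command is printed, not executed.
RUN=${RUN-echo}

route_instead_of_bridge() {
    port=$1        # bridge port whose traffic should not be stolen
    ethertype=$2   # EtherType to exempt, e.g. 0x0800 for IPv4
    # In the broute table, DROP hands the frame to the port's own
    # IP stack instead of the bridge.
    $RUN ebtables -t broute -A BROUTING -i "$port" -p "$ethertype" -j DROP
}

route_instead_of_bridge swp3 0x0800
```

Run with `RUN= sh script.sh` as root to actually install the rule.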

Hardware isn't carved out according to your expectation that no packets
should egress a blocked port, either. Switches in general, and Ocelot in
particular, have a way to send "control" packets that bypass the
analyzer block and STP state (the bridging service, basically) and are
sent towards a precise set of destination ports. This is done by setting
the BYPASS bit from the injection frame header. Currently, Linux sends
"control" packets to the switch all the time, and that is fine, because
although those packets have the ability to go where they don't belong,
the OS (the bridge driver) is supposed to know that, and just not send
packets there. As a side note, there was some work to allow switch
drivers to send "data" packets to the switch, and these correspond to
traffic that originates from a bridge device, but I am just mentioning
this to clarify that it is irrelevant for the purpose of the discussion here.

Even considering an Intel card with no bridging offload at all, if you
put it in the same situation (eth0 under br0, and eth0 is blocked), you
can still put an IP address on eth0 and ping away just fine (you won't
get back the reply as mentioned above, but that's separate really).
Nobody will prevent packets from eth0 from being sent, since the bridge
driver code path isn't invoked on TX unless the socket is bound to br0.

The key point is that the direct xmit data path through swp3, as well as
the data path br0 -> swp3, both exist, in hardware and in software. And
while in hardware they're a bit more clearly separated (in IEEE 802.1Q
there's even a block diagram to clarify that both exist), in software
they're entangled in a bit of a mess, and there are parts of the network
stack and of user space that aren't aware that swp3 is under a bridge,
so IPv6 Router Solicitation messages being sent through swp3 shouldn't
be much of a surprise.



With that out of the way.

Traditionally, DSA has made a design decision that all switch ports
inherit the single MAC address of the DSA master. IOW, if you have 1 DSA
master and 4 switch ports, you have 5 interfaces in the system with the
same MAC address. It was like this for a long time, and relatively
recently, Xiaofei Shen added the ability for individual DSA interfaces
to have their own MAC address stored in the device tree.

As an argument in favor of the status quo, Florian explained that:

| By default, DSA switch need to come up in a configuration where all
| ports (except CPU/management) must be strictly separate from every other
| port such that we can achieve what a standalone Ethernet NIC would do.
| This works because all ports are isolated from one another, so there is
| no cross talk and so having the same MAC address (the one from the CPU)
| on the DSA slave network devices just works, each port is a separate
| broadcast domain.
| 
| Once you start bridging one or more ports, the bridge root port will have
| a MAC address, most likely the one of the CPU/management Ethernet MAC, but
| similarly, this is not an issue and that's exactly how a software bridge
| would work as well.

https://patchwork.kernel.org/project/linux-arm-msm/patch/20190222125815.12866-1-vkoul@kernel.org/

Although yes, that does make some level of sense, it kind of omits the
fact that two DSA ports can be used for communication in loopback too
(either through a direct cable, or through an externally switched network).
In that case, having a MAC SA != MAC DA in the Ethernet packets is kind
of important (I found that out while trying to compose some selftests
for DSA).


If my intuition is correct, you are using the default configuration
where all DSA interfaces have the MAC address inherited from the DSA
master. As a corollary, swp2 and swp3 have the same MAC address.

swp3 is a bridged port, and a blocked port at that, but not all parts of
the network stack know that. So from time to time, you get these IPv6
Router Solicitation messages. They could be anything else, in fact.

swp2 is a bridged port, and in the forwarding state. So packets it
receives are eligible for learning.

When br0 receives a packet via swp2 that originated from swp3, it just
complains: "hey, learning the route for this packet's MAC SA to go
towards swp2 would mean that I would no longer terminate packets with
this MAC DA locally, which is kinda weird, since that MAC address is
also marked as non-forwarded." Which is fair.


So IMHO, this behavior is neither good nor bad, it is just the way it is,
nothing to worry about if that's what concerns you. To prove or disprove
what I said you could try to configure individual MAC addresses and see
whether that fixes the problem.
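
A minimal sketch of that experiment follows. The MAC addresses are placeholders from the locally administered range, and the helper name and `RUN` dry-run switch are invented for the example. The port is taken down around the change on the assumption that the driver may refuse a live address change, and doing so on a bridged port will make STP reconverge.

```sh
#!/bin/sh
# Dry run by default: commands are printed, not executed.
RUN=${RUN-echo}

set_port_mac() {
    # $1 = interface name, $2 = new MAC address
    $RUN ip link set dev "$1" down
    $RUN ip link set dev "$1" address "$2"
    $RUN ip link set dev "$1" up
}

# Placeholder addresses; pick anything unique on the segment.
set_port_mac swp2 02:00:00:00:00:02
set_port_mac swp3 02:00:00:00:00:03
```

Note that the follow-up later in this thread concludes the warning shows up even with unique addresses, so this is useful mainly as a way to confirm that.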

> (side note - if there's a place where a parser for Ocelot NPI traffic is
> hidden, that might eventually save me a lot of debugging in Lua)

Nope, there isn't, although it would certainly be great if you could
teach tcpdump about it, similar to what Vivien has done for Marvell:
https://github.com/the-tcpdump-group/tcpdump/blob/master/print-dsa.c

I've wanted to do that for a long time, but I've had lots of other
priorities, and it's tricky for various reasons (there isn't exactly a
single on-the-wire format, but it depends on whether you configure the
NPI port to have no prefix, a short prefix or a long prefix; this
configuration is independent for the RX and TX directions; currently we
use short prefix on RX and TX, but in older kernels we used to use no
prefix on TX, and long prefix on RX on some older kernels, all while the
tagging protocol was still "ocelot"; I'm not sure whether the presence
or absence of a prefix, and what kind, can be deduced by looking at the
packet alone).


* Re: packets trickling out of STP-blocked ports
  2021-12-30 23:07 packets trickling out of STP-blocked ports Colin Foster
  2021-12-31  0:28 ` Vladimir Oltean
@ 2021-12-31 10:27 ` Alexandre Belloni
  2021-12-31 15:06   ` Colin Foster
  1 sibling, 1 reply; 8+ messages in thread
From: Alexandre Belloni @ 2021-12-31 10:27 UTC (permalink / raw)
  To: Colin Foster; +Cc: netdev, Vladimir Oltean, Horatiu Vultur

Hi,

On 30/12/2021 15:07:40-0800, Colin Foster wrote:
> Hi all,
> 
> I'm not sure who all to include in this email, but I'm starting with
> this list to start.
> 
> Probably obvious to those in this email list, I'm testing a VSC7512 dev
> board controlled via SPI. The patches are still out-of-tree, but I
> figured I'll report these findings, since they seem real.
> 
> My setup is port 0 of the 7512 is tied to a Beaglebone Black. Port 1 is
> tied to my development PC. Ports 2 and 3 are tied together to test STP.
> 
> I run the commands:
> 
> ip link set eth0 up
> ip link set swp[1-3] up
> ip link add name br0 type bridge stp_state 1
> ip link set dev swp[1-3] master br0
> ip addr add 10.100.3.1/16 dev br0
> ip link set dev br0 up
> 
> After running this, the STP blocks swp3, and swp1/2 are forwarding.
> 
> Periodically I see messages saying that swp2 is receiving packets with
> own address as source address.
> 
> I can confirm that via ethtool that TX packets are increasing on swp3. I
> believe I captured the event via tshark. A 4 minute capture showed three
> non-STP packets on swp2. All three of these packets are ICMPv6 Router
> Solicitation packets. 
> 
> I would expect no packets at all to egress swp3. Is this an issue that
> is unique to me and my in-development configuration? Or is this an issue
> with all Ocelot / Felix devices?
> 
> If this is an Ocelot thing, I can try to come up with a different test 
> setup to capture more data... printing the packet when it is received,
> capturing the traffic externally, capturing eth0 traffic to see if it is
> coming from the kernel or being hardware-forwarded...
> 
> (side note - if there's a place where a parser for Ocelot NPI traffic is
> hidden, that might eventually save me a lot of debugging in Lua)
> 
> 
> An idea of how frequently this happens - my system has been currently up
> for 3700 seconds. Eight "own address as source address" events have
> happened at 66, 96, 156, 279, 509, 996, 1897, and 3699 seconds. 
> 

This is something I solved back in 2017. I can exactly remember how, you
can try:

sysctl -w net.ipv6.conf.swp3.autoconf=0


-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: packets trickling out of STP-blocked ports
  2021-12-31 10:27 ` Alexandre Belloni
@ 2021-12-31 15:06   ` Colin Foster
  2021-12-31 15:17     ` Alexandre Belloni
  0 siblings, 1 reply; 8+ messages in thread
From: Colin Foster @ 2021-12-31 15:06 UTC (permalink / raw)
  To: Alexandre Belloni; +Cc: netdev, Vladimir Oltean, Horatiu Vultur

Hi Alexandre

On Fri, Dec 31, 2021 at 11:27:16AM +0100, Alexandre Belloni wrote:
> Hi,
> 
> On 30/12/2021 15:07:40-0800, Colin Foster wrote:
> > Hi all,
> > 
> > An idea of how frequently this happens - my system has been currently up
> > for 3700 seconds. Eight "own address as source address" events have
> > happened at 66, 96, 156, 279, 509, 996, 1897, and 3699 seconds. 
> > 
> 
> This is something I solved back in 2017. I can exactly remember how, you
> can try:
> 
> sysctl -w net.ipv6.conf.swp3.autoconf=0

That sounds very promising! Sorry you had to fix my system config, but
glad that this all makes perfect sense. 

> 
> 
> -- 
> Alexandre Belloni, co-owner and COO, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


* Re: packets trickling out of STP-blocked ports
  2021-12-31 15:06   ` Colin Foster
@ 2021-12-31 15:17     ` Alexandre Belloni
  2021-12-31 15:31       ` Andrew Lunn
  2021-12-31 15:53       ` Colin Foster
  0 siblings, 2 replies; 8+ messages in thread
From: Alexandre Belloni @ 2021-12-31 15:17 UTC (permalink / raw)
  To: Colin Foster; +Cc: netdev, Vladimir Oltean, Horatiu Vultur

On 31/12/2021 07:06:51-0800, Colin Foster wrote:
> Hi Alexandre
> 
> On Fri, Dec 31, 2021 at 11:27:16AM +0100, Alexandre Belloni wrote:
> > Hi,
> > 
> > On 30/12/2021 15:07:40-0800, Colin Foster wrote:
> > > Hi all,
> > > 
> > > An idea of how frequently this happens - my system has been currently up
> > > for 3700 seconds. Eight "own address as source address" events have
> > > happened at 66, 96, 156, 279, 509, 996, 1897, and 3699 seconds. 
> > > 
> > 
> > This is something I solved back in 2017. I can exactly remember how, you

Sorry, I meant "I can't exactly" ;)

> > can try:
> > 
> > sysctl -w net.ipv6.conf.swp3.autoconf=0
> 
> That sounds very promising! Sorry you had to fix my system config, but
> glad that this all makes perfect sense. 
> 

Let me know if this works ;) The bottom line being that you should
probably disable ipv6 autoconf on individual interfaces and then enable
it on the bridge.
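
Spelled out for the setup in this thread, that suggestion might look like the sketch below. The helper name and the `RUN` dry-run switch are made up; the interface names are the ones from the original report.

```sh
#!/bin/sh
# Dry run by default: commands are printed, not executed.
RUN=${RUN-echo}

disable_port_autoconf() {
    # Turn off IPv6 autoconf on every bridge port given as an argument.
    for port in "$@"; do
        $RUN sysctl -w "net.ipv6.conf.${port}.autoconf=0"
    done
}

disable_port_autoconf swp1 swp2 swp3
# Keep autoconf on the bridge itself, where IP termination happens.
$RUN sysctl -w net.ipv6.conf.br0.autoconf=1
```

Run with `RUN= sh script.sh` as root to apply the settings for real.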


-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


* Re: packets trickling out of STP-blocked ports
  2021-12-31 15:17     ` Alexandre Belloni
@ 2021-12-31 15:31       ` Andrew Lunn
  2021-12-31 15:53       ` Colin Foster
  1 sibling, 0 replies; 8+ messages in thread
From: Andrew Lunn @ 2021-12-31 15:31 UTC (permalink / raw)
  To: Alexandre Belloni; +Cc: Colin Foster, netdev, Vladimir Oltean, Horatiu Vultur

> > > sysctl -w net.ipv6.conf.swp3.autoconf=0
> > 
> > That sounds very promising! Sorry you had to fix my system config, but
> > glad that this all makes perfect sense. 
> > 

Hi Alexandre

> 
> Let me know if this works ;) The bottom line being that you should
> probably disable ipv6 autoconf on individual interfaces and then enable
> it on the bridge.

Does this also stop the interface getting a link local IPv6 address
based on its MAC address?

e.g. my wifi interface has MAC address b8:ae:ed:78:ef:9d and gets an
IPv6 address

inet6 fe80::baae:edff:fe78:ef9d/64 scope link 

It will also perform duplicate address detection, DAD, when the
interface is brought up. That is probably hard to see with tcpdump on
the host, since it happens very quickly, but a link peer should see
the packets.
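
For reference, the mapping in the example above is the modified EUI-64 scheme: flip the universal/local bit of the first octet and insert ff:fe in the middle. A small sketch (the function name is made up, and it assumes the interface uses EUI-64 address generation rather than one of the other modes):

```sh
#!/bin/sh
# Derive the EUI-64 based IPv6 link-local address from a MAC address.
mac_to_link_local() {
    # Split the MAC into its six octets.
    IFS=: read -r o1 o2 o3 o4 o5 o6 <<EOF
$1
EOF
    # Flip the universal/local bit (0x02) of the first octet.
    o1=$(printf '%02x' $(( 0x$o1 ^ 0x02 )))
    # Insert ff:fe in the middle and print as four 16-bit groups;
    # %x drops leading zeros within each group.
    printf 'fe80::%x:%x:%x:%x\n' \
        "0x${o1}${o2}" "0x${o3}ff" "0xfe${o4}" "0x${o5}${o6}"
}

mac_to_link_local b8:ae:ed:78:ef:9d
# prints: fe80::baae:edff:fe78:ef9d
```

The output is not fully canonical for every input, since runs of all-zero groups are not compressed further, but it is a valid textual form.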

    Andrew


* Re: packets trickling out of STP-blocked ports
  2021-12-31 15:17     ` Alexandre Belloni
  2021-12-31 15:31       ` Andrew Lunn
@ 2021-12-31 15:53       ` Colin Foster
  1 sibling, 0 replies; 8+ messages in thread
From: Colin Foster @ 2021-12-31 15:53 UTC (permalink / raw)
  To: Alexandre Belloni; +Cc: netdev, Vladimir Oltean, Horatiu Vultur

On Fri, Dec 31, 2021 at 04:17:44PM +0100, Alexandre Belloni wrote:
> On 31/12/2021 07:06:51-0800, Colin Foster wrote:
> > Hi Alexandre
> > 
> > On Fri, Dec 31, 2021 at 11:27:16AM +0100, Alexandre Belloni wrote:
> > > Hi,
> > > 
> > > On 30/12/2021 15:07:40-0800, Colin Foster wrote:
> > > > Hi all,
> > > > 
> > > > An idea of how frequently this happens - my system has been currently up
> > > > for 3700 seconds. Eight "own address as source address" events have
> > > > happened at 66, 96, 156, 279, 509, 996, 1897, and 3699 seconds. 
> > > > 
> > > 
> > > This is something I solved back in 2017. I can exactly remember how, you
> 
> Sorry, I meant "I can't exactly" ;)
> 
> > > can try:
> > > 
> > > sysctl -w net.ipv6.conf.swp3.autoconf=0
> > 
> > That sounds very promising! Sorry you had to fix my system config, but
> > glad that this all makes perfect sense. 
> > 
> 
> Let me know if this works ;) The bottom line being that you should
> probably disable ipv6 autoconf on individual interfaces and then enable
> it on the bridge.

Just gave it a shot. No luck.

But poking around sysctl there's
net.ipv6.conf.swp3.router_solicitation{s,_delay,_interval,_max_interval}

As Andrew hints at, there might be some unintended consequences. It
seems that writing -1 to net.ipv6.conf.swp3.router_solicitation_delay
"fixed it." I don't know how that'll affect an IPv6 network in
production.
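
For completeness, a sketch of how those knobs could be inspected, plus one alternative that might be less surprising than a negative delay. The assumption here, based on my reading of the kernel's ip-sysctl documentation, is that `router_solicitations=0` means "send none"; I have not verified it on this hardware. The helper names and the `RUN` dry-run switch are invented for the example.

```sh
#!/bin/sh
# Dry run by default: commands are printed, not executed.
RUN=${RUN-echo}

show_rs_knobs() {
    # Print the four Router Solicitation sysctls for interface $1.
    for knob in router_solicitations router_solicitation_delay \
                router_solicitation_interval router_solicitation_max_interval; do
        $RUN sysctl "net.ipv6.conf.$1.$knob"
    done
}

stop_router_solicitations() {
    # Assumption: a count of 0 makes the kernel send no solicitations.
    $RUN sysctl -w "net.ipv6.conf.$1.router_solicitations=0"
}

show_rs_knobs swp3
stop_router_solicitations swp3
```

As with autoconf, this only silences one kind of traffic on the port; anything else sent directly through swp3 would still egress.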

> 
> 
> -- 
> Alexandre Belloni, co-owner and COO, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


* Re: packets trickling out of STP-blocked ports
  2021-12-31  0:28 ` Vladimir Oltean
@ 2022-01-01 16:04   ` Vladimir Oltean
  0 siblings, 0 replies; 8+ messages in thread
From: Vladimir Oltean @ 2022-01-01 16:04 UTC (permalink / raw)
  To: Colin Foster; +Cc: netdev, Alexandre Belloni, Horatiu Vultur

On Fri, Dec 31, 2021 at 02:28:23AM +0200, Vladimir Oltean wrote:
> Traditionally, DSA has made a design decision that all switch ports
> inherit the single MAC address of the DSA master. IOW, if you have 1 DSA
> master and 4 switch ports, you have 5 interfaces in the system with the
> same MAC address. It was like this for a long time, and relatively
> recently, Xiaofei Shen added the ability for individual DSA interfaces
> to have their own MAC address stored in the device tree.

I thought a bit more in the back of my head and I need to make a
correction to what I said. It doesn't matter that DSA interfaces have
the same MAC address, because swp2 and swp3 are both bridge ports, and
therefore, their MAC addresses are both local FDB entries, even if
unique. So the bridge would still complain that it receives packets with
a MAC SA equal to a local FDB entry.

ip link add veth0 type veth peer name veth1
ip link add br0 type bridge
ip link set veth0 master br0
[   84.987666] br0: port 1(veth0) entered blocking state
[   84.992857] br0: port 1(veth0) entered disabled state
[   84.998172] device veth0 entered promiscuous mode
ip link set veth1 master br0
[   87.083140] br0: port 2(veth1) entered blocking state
[   87.088280] br0: port 2(veth1) entered disabled state
[   87.093625] device veth1 entered promiscuous mode
ip link set br0 type bridge stp_state 1
ip link set br0 up
ip link set veth0 up
ip link set veth1 up
[  116.758260] IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
[  116.764899] br0: port 2(veth1) entered blocking state
[  116.771353] br0: port 2(veth1) entered listening state
[  116.778272] IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
[  116.785703] br0: port 1(veth0) entered blocking state
[  116.790776] br0: port 1(veth0) entered listening state
[  117.112892] br0: port 2(veth1) entered blocking state
[  132.312686] br0: port 1(veth0) entered learning state
[  133.740183] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[  145.752889] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[  147.672675] br0: port 1(veth0) entered forwarding state
[  147.677978] br0: topology change detected, propagating
[  147.683388] IPv6: ADDRCONF(NETDEV_CHANGE): br0: link becomes ready
[  149.741219] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[  176.472805] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[  181.742095] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[  238.552814] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[  245.742659] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
bridge link
6: veth1@veth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state blocking priority 32 cost 2
7: veth0@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master br0 state forwarding priority 32 cost 2

Looking at br_fdb_update(), the print is pretty harmless - the bridge is
smart enough to not actually relearn the local FDB entry towards an
external port. No FDB update is being done. The print is also
rate-limited. So it's just that - a warning. I am not sure whether it's
worth disabling IPv6 Router Solicitations, given that, as mentioned,
basically any other traffic sent through the plain bridge port will
trigger this. I consider it a non-problem.
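
To see the local FDB entries in question, the bridge's forwarding database can be dumped; the entries flagged `permanent` are the local ones that the warning refers to. A tiny sketch (the helper name and `RUN` dry-run switch are made up for illustration):

```sh
#!/bin/sh
# Dry run by default: the command is printed, not executed.
RUN=${RUN-echo}

show_local_fdb() {
    # Dump the FDB of bridge $1; look for the "permanent" entries,
    # one per bridge port MAC address.
    $RUN bridge fdb show br "$1"
}

show_local_fdb br0
```

Run with `RUN= sh script.sh` to execute it; the address from the warning (c2:fc:33:30:4c:bf in the log above) should appear among the permanent entries.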

ip addr add 192.168.100.1/24 dev veth1
ping 192.168.100.2
PING 192.168.100.2 (192.168.100.2) 56(84) bytes of data.
[ 1434.033119] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[ 1435.112977] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[ 1436.152991] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[ 1437.193428] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
From 192.168.100.1 icmp_seq=1 Destination Host Unreachable
From 192.168.100.1 icmp_seq=2 Destination Host Unreachable
From 192.168.100.1 icmp_seq=3 Destination Host Unreachable
[ 1438.232784] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[ 1438.260075] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[ 1439.272769] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
[ 1440.312904] br0: received packet on veth0 with own address as source address (addr:c2:fc:33:30:4c:bf, vlan:0)
From 192.168.100.1 icmp_seq=4 Destination Host Unreachable
^C
--- 192.168.100.2 ping statistics ---
7 packets transmitted, 0 received, +4 errors, 100% packet loss, time 6280ms
pipe 4
