netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* What to do when a bridge port gets its pvid deleted?
@ 2019-06-25 20:49 Vladimir Oltean
  2019-06-25 21:05 ` Vladimir Oltean
  2019-06-28 12:30 ` Ido Schimmel
  0 siblings, 2 replies; 10+ messages in thread
From: Vladimir Oltean @ 2019-06-25 20:49 UTC (permalink / raw)
  To: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot, stephen

Hi,

A number of DSA drivers (BCM53XX, Microchip KSZ94XX, Mediatek MT7530
at the very least), as well as Mellanox Spectrum (I didn't look at all
the pure switchdev drivers) try to restore the pvid to a default value
on .port_vlan_del.
Sure, the port stops receiving traffic when its pvid is a VLAN ID that
is not installed in its hw filter, but as far as the bridge core is
concerned, this is to be expected:

# bridge vlan add dev swp2 vid 100 pvid untagged
# bridge vlan
port    vlan ids
swp5     1 PVID Egress Untagged

swp2     1 Egress Untagged
         100 PVID Egress Untagged

swp3     1 PVID Egress Untagged

swp4     1 PVID Egress Untagged

br0      1 PVID Egress Untagged
# ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.682 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.299 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.251 ms
64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=0.324 ms
64 bytes from 10.0.0.1: icmp_seq=5 ttl=64 time=0.257 ms
^C
--- 10.0.0.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4188ms
rtt min/avg/max/mdev = 0.251/0.362/0.682/0.163 ms
# bridge vlan del dev swp2 vid 100
# bridge vlan
port    vlan ids
swp5     1 PVID Egress Untagged

swp2     1 Egress Untagged

swp3     1 PVID Egress Untagged

swp4     1 PVID Egress Untagged

br0      1 PVID Egress Untagged

# ping 10.0.0.1
PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
^C
--- 10.0.0.1 ping statistics ---
8 packets transmitted, 0 received, 100% packet loss, time 7267ms

What is the consensus here? Is there a reason why the bridge driver
doesn't take care of this? Do switchdev drivers have to restore the
pvid to always be operational, even if their state becomes
inconsistent with the upper dev? Is it just 'nice to have'? What if
VID 1 isn't in the hw filter either (perfectly legal)?

Thanks!
-Vladimir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: What to do when a bridge port gets its pvid deleted?
  2019-06-25 20:49 What to do when a bridge port gets its pvid deleted? Vladimir Oltean
@ 2019-06-25 21:05 ` Vladimir Oltean
  2019-06-28 12:30 ` Ido Schimmel
  1 sibling, 0 replies; 10+ messages in thread
From: Vladimir Oltean @ 2019-06-25 21:05 UTC (permalink / raw)
  To: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot, stephen

On Tue, 25 Jun 2019 at 23:49, Vladimir Oltean <olteanv@gmail.com> wrote:
>
> Hi,
>
> A number of DSA drivers (BCM53XX, Microchip KSZ94XX, Mediatek MT7530
> at the very least), as well as Mellanox Spectrum (I didn't look at all
> the pure switchdev drivers) try to restore the pvid to a default value
> on .port_vlan_del.
> Sure, the port stops receiving traffic when its pvid is a VLAN ID that
> is not installed in its hw filter, but as far as the bridge core is
> concerned, this is to be expected:
>
> # bridge vlan add dev swp2 vid 100 pvid untagged
> # bridge vlan
> port    vlan ids
> swp5     1 PVID Egress Untagged
>
> swp2     1 Egress Untagged
>          100 PVID Egress Untagged
>
> swp3     1 PVID Egress Untagged
>
> swp4     1 PVID Egress Untagged
>
> br0      1 PVID Egress Untagged
> # ping 10.0.0.1
> PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
> 64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.682 ms
> 64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.299 ms
> 64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.251 ms
> 64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=0.324 ms
> 64 bytes from 10.0.0.1: icmp_seq=5 ttl=64 time=0.257 ms
> ^C
> --- 10.0.0.1 ping statistics ---
> 5 packets transmitted, 5 received, 0% packet loss, time 4188ms
> rtt min/avg/max/mdev = 0.251/0.362/0.682/0.163 ms
> # bridge vlan del dev swp2 vid 100
> # bridge vlan
> port    vlan ids
> swp5     1 PVID Egress Untagged
>
> swp2     1 Egress Untagged
>
> swp3     1 PVID Egress Untagged
>
> swp4     1 PVID Egress Untagged
>
> br0      1 PVID Egress Untagged
>
> # ping 10.0.0.1
> PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
> ^C
> --- 10.0.0.1 ping statistics ---
> 8 packets transmitted, 0 received, 100% packet loss, time 7267ms
>
> What is the consensus here? Is there a reason why the bridge driver
> doesn't take care of this? Do switchdev drivers have to restore the
> pvid to always be operational, even if their state becomes
> inconsistent with the upper dev? Is it just 'nice to have'? What if
> VID 1 isn't in the hw filter either (perfectly legal)?
>
> Thanks!
> -Vladimir

Or rather, let me put it differently.
I am not only asking to figure out whether I should put this extra
logic or not in a switchdev driver - I can understand the "you don't
want it, don't put it" concept.
But let's say that for various reasons, MSTP is not supported in the
Linux kernel and I'm trying to create an ad-hoc MSTP setup where I
break the loop manually in the pvid. Is there any good reason why a
switchdev driver would oppose fierce resistance to me trying to do
that? Again, this is perfectly valid - the switch will no longer
receive untagged traffic but it will continue to forward frames in
other VLANs.

-Vladimir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: What to do when a bridge port gets its pvid deleted?
  2019-06-25 20:49 What to do when a bridge port gets its pvid deleted? Vladimir Oltean
  2019-06-25 21:05 ` Vladimir Oltean
@ 2019-06-28 12:30 ` Ido Schimmel
  2019-06-28 12:37   ` Vladimir Oltean
  1 sibling, 1 reply; 10+ messages in thread
From: Ido Schimmel @ 2019-06-28 12:30 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot, stephen

On Tue, Jun 25, 2019 at 11:49:29PM +0300, Vladimir Oltean wrote:
> A number of DSA drivers (BCM53XX, Microchip KSZ94XX, Mediatek MT7530
> at the very least), as well as Mellanox Spectrum (I didn't look at all
> the pure switchdev drivers) try to restore the pvid to a default value
> on .port_vlan_del.

I don't know about DSA drivers, but that's not what mlxsw is doing. If
the VLAN that is configured as PVID is deleted from the bridge port, the
driver instructs the port to discard untagged and prio-tagged packets.
This is consistent with the bridge driver's behavior.

We do have a flow the "restores" the PVID, but that's when a port is
unlinked from its bridge master. The PVID we set is 4095 which cannot be
configured by the 8021q / bridge driver. This is due to the way the
underlying hardware works. Even if a port is not bridged and used purely
for routing, packets still do L2 lookup first which sends them directly
to the router block. If PVID is not configured, untagged packets could
not be routed. Obviously, at egress we strip this VLAN.

> Sure, the port stops receiving traffic when its pvid is a VLAN ID that
> is not installed in its hw filter, but as far as the bridge core is
> concerned, this is to be expected:
> 
> # bridge vlan add dev swp2 vid 100 pvid untagged
> # bridge vlan
> port    vlan ids
> swp5     1 PVID Egress Untagged
> 
> swp2     1 Egress Untagged
>          100 PVID Egress Untagged
> 
> swp3     1 PVID Egress Untagged
> 
> swp4     1 PVID Egress Untagged
> 
> br0      1 PVID Egress Untagged
> # ping 10.0.0.1
> PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
> 64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.682 ms
> 64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.299 ms
> 64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.251 ms
> 64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=0.324 ms
> 64 bytes from 10.0.0.1: icmp_seq=5 ttl=64 time=0.257 ms
> ^C
> --- 10.0.0.1 ping statistics ---
> 5 packets transmitted, 5 received, 0% packet loss, time 4188ms
> rtt min/avg/max/mdev = 0.251/0.362/0.682/0.163 ms
> # bridge vlan del dev swp2 vid 100
> # bridge vlan
> port    vlan ids
> swp5     1 PVID Egress Untagged
> 
> swp2     1 Egress Untagged
> 
> swp3     1 PVID Egress Untagged
> 
> swp4     1 PVID Egress Untagged
> 
> br0      1 PVID Egress Untagged
> 
> # ping 10.0.0.1
> PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
> ^C
> --- 10.0.0.1 ping statistics ---
> 8 packets transmitted, 0 received, 100% packet loss, time 7267ms
> 
> What is the consensus here? Is there a reason why the bridge driver
> doesn't take care of this? 

Take care of what? :) Always maintaining a PVID on the bridge port? It's
completely OK not to have a PVID.

> Do switchdev drivers have to restore the pvid to always be
> operational, even if their state becomes inconsistent with the upper
> dev? Is it just 'nice to have'? What if VID 1 isn't in the hw filter
> either (perfectly legal)?

Are you saying that DSA drivers always maintain a PVID on the bridge
port and allow untagged traffic to ingress regardless of the bridge
driver's configuration? If so, I think this needs to be fixed.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: What to do when a bridge port gets its pvid deleted?
  2019-06-28 12:30 ` Ido Schimmel
@ 2019-06-28 12:37   ` Vladimir Oltean
  2019-06-28 16:45     ` Florian Fainelli
  0 siblings, 1 reply; 10+ messages in thread
From: Vladimir Oltean @ 2019-06-28 12:37 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: netdev, Andrew Lunn, Florian Fainelli, Vivien Didelot, stephen

On Fri, 28 Jun 2019 at 15:30, Ido Schimmel <idosch@idosch.org> wrote:
>
> On Tue, Jun 25, 2019 at 11:49:29PM +0300, Vladimir Oltean wrote:
> > A number of DSA drivers (BCM53XX, Microchip KSZ94XX, Mediatek MT7530
> > at the very least), as well as Mellanox Spectrum (I didn't look at all
> > the pure switchdev drivers) try to restore the pvid to a default value
> > on .port_vlan_del.
>
> I don't know about DSA drivers, but that's not what mlxsw is doing. If
> the VLAN that is configured as PVID is deleted from the bridge port, the
> driver instructs the port to discard untagged and prio-tagged packets.
> This is consistent with the bridge driver's behavior.
>
> We do have a flow the "restores" the PVID, but that's when a port is
> unlinked from its bridge master. The PVID we set is 4095 which cannot be
> configured by the 8021q / bridge driver. This is due to the way the
> underlying hardware works. Even if a port is not bridged and used purely
> for routing, packets still do L2 lookup first which sends them directly
> to the router block. If PVID is not configured, untagged packets could
> not be routed. Obviously, at egress we strip this VLAN.
>
> > Sure, the port stops receiving traffic when its pvid is a VLAN ID that
> > is not installed in its hw filter, but as far as the bridge core is
> > concerned, this is to be expected:
> >
> > # bridge vlan add dev swp2 vid 100 pvid untagged
> > # bridge vlan
> > port    vlan ids
> > swp5     1 PVID Egress Untagged
> >
> > swp2     1 Egress Untagged
> >          100 PVID Egress Untagged
> >
> > swp3     1 PVID Egress Untagged
> >
> > swp4     1 PVID Egress Untagged
> >
> > br0      1 PVID Egress Untagged
> > # ping 10.0.0.1
> > PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
> > 64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.682 ms
> > 64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.299 ms
> > 64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.251 ms
> > 64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=0.324 ms
> > 64 bytes from 10.0.0.1: icmp_seq=5 ttl=64 time=0.257 ms
> > ^C
> > --- 10.0.0.1 ping statistics ---
> > 5 packets transmitted, 5 received, 0% packet loss, time 4188ms
> > rtt min/avg/max/mdev = 0.251/0.362/0.682/0.163 ms
> > # bridge vlan del dev swp2 vid 100
> > # bridge vlan
> > port    vlan ids
> > swp5     1 PVID Egress Untagged
> >
> > swp2     1 Egress Untagged
> >
> > swp3     1 PVID Egress Untagged
> >
> > swp4     1 PVID Egress Untagged
> >
> > br0      1 PVID Egress Untagged
> >
> > # ping 10.0.0.1
> > PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
> > ^C
> > --- 10.0.0.1 ping statistics ---
> > 8 packets transmitted, 0 received, 100% packet loss, time 7267ms
> >
> > What is the consensus here? Is there a reason why the bridge driver
> > doesn't take care of this?
>
> Take care of what? :) Always maintaining a PVID on the bridge port? It's
> completely OK not to have a PVID.
>

Yes, I didn't think it through during the first email. I came to the
same conclusion in the second one.

> > Do switchdev drivers have to restore the pvid to always be
> > operational, even if their state becomes inconsistent with the upper
> > dev? Is it just 'nice to have'? What if VID 1 isn't in the hw filter
> > either (perfectly legal)?
>
> Are you saying that DSA drivers always maintain a PVID on the bridge
> port and allow untagged traffic to ingress regardless of the bridge
> driver's configuration? If so, I think this needs to be fixed.

Well, not at the DSA core level.
But for Microchip:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/microchip/ksz9477.c#n576
For Broadcom:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/b53/b53_common.c#n1376
For Mediatek:
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/mt7530.c#n1196

There might be others as well.

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: What to do when a bridge port gets its pvid deleted?
  2019-06-28 12:37   ` Vladimir Oltean
@ 2019-06-28 16:45     ` Florian Fainelli
  2019-08-19 17:15       ` Vladimir Oltean
  0 siblings, 1 reply; 10+ messages in thread
From: Florian Fainelli @ 2019-06-28 16:45 UTC (permalink / raw)
  To: Vladimir Oltean, Ido Schimmel
  Cc: netdev, Andrew Lunn, Vivien Didelot, stephen

On 6/28/19 5:37 AM, Vladimir Oltean wrote:
> On Fri, 28 Jun 2019 at 15:30, Ido Schimmel <idosch@idosch.org> wrote:
>>
>> On Tue, Jun 25, 2019 at 11:49:29PM +0300, Vladimir Oltean wrote:
>>> A number of DSA drivers (BCM53XX, Microchip KSZ94XX, Mediatek MT7530
>>> at the very least), as well as Mellanox Spectrum (I didn't look at all
>>> the pure switchdev drivers) try to restore the pvid to a default value
>>> on .port_vlan_del.
>>
>> I don't know about DSA drivers, but that's not what mlxsw is doing. If
>> the VLAN that is configured as PVID is deleted from the bridge port, the
>> driver instructs the port to discard untagged and prio-tagged packets.
>> This is consistent with the bridge driver's behavior.
>>
>> We do have a flow the "restores" the PVID, but that's when a port is
>> unlinked from its bridge master. The PVID we set is 4095 which cannot be
>> configured by the 8021q / bridge driver. This is due to the way the
>> underlying hardware works. Even if a port is not bridged and used purely
>> for routing, packets still do L2 lookup first which sends them directly
>> to the router block. If PVID is not configured, untagged packets could
>> not be routed. Obviously, at egress we strip this VLAN.
>>
>>> Sure, the port stops receiving traffic when its pvid is a VLAN ID that
>>> is not installed in its hw filter, but as far as the bridge core is
>>> concerned, this is to be expected:
>>>
>>> # bridge vlan add dev swp2 vid 100 pvid untagged
>>> # bridge vlan
>>> port    vlan ids
>>> swp5     1 PVID Egress Untagged
>>>
>>> swp2     1 Egress Untagged
>>>          100 PVID Egress Untagged
>>>
>>> swp3     1 PVID Egress Untagged
>>>
>>> swp4     1 PVID Egress Untagged
>>>
>>> br0      1 PVID Egress Untagged
>>> # ping 10.0.0.1
>>> PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
>>> 64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.682 ms
>>> 64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.299 ms
>>> 64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.251 ms
>>> 64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=0.324 ms
>>> 64 bytes from 10.0.0.1: icmp_seq=5 ttl=64 time=0.257 ms
>>> ^C
>>> --- 10.0.0.1 ping statistics ---
>>> 5 packets transmitted, 5 received, 0% packet loss, time 4188ms
>>> rtt min/avg/max/mdev = 0.251/0.362/0.682/0.163 ms
>>> # bridge vlan del dev swp2 vid 100
>>> # bridge vlan
>>> port    vlan ids
>>> swp5     1 PVID Egress Untagged
>>>
>>> swp2     1 Egress Untagged
>>>
>>> swp3     1 PVID Egress Untagged
>>>
>>> swp4     1 PVID Egress Untagged
>>>
>>> br0      1 PVID Egress Untagged
>>>
>>> # ping 10.0.0.1
>>> PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
>>> ^C
>>> --- 10.0.0.1 ping statistics ---
>>> 8 packets transmitted, 0 received, 100% packet loss, time 7267ms
>>>
>>> What is the consensus here? Is there a reason why the bridge driver
>>> doesn't take care of this?
>>
>> Take care of what? :) Always maintaining a PVID on the bridge port? It's
>> completely OK not to have a PVID.
>>
> 
> Yes, I didn't think it through during the first email. I came to the
> same conclusion in the second one.
> 
>>> Do switchdev drivers have to restore the pvid to always be
>>> operational, even if their state becomes inconsistent with the upper
>>> dev? Is it just 'nice to have'? What if VID 1 isn't in the hw filter
>>> either (perfectly legal)?
>>
>> Are you saying that DSA drivers always maintain a PVID on the bridge
>> port and allow untagged traffic to ingress regardless of the bridge
>> driver's configuration? If so, I think this needs to be fixed.
> 
> Well, not at the DSA core level.
> But for Microchip:
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/microchip/ksz9477.c#n576
> For Broadcom:
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/b53/b53_common.c#n1376
> For Mediatek:
> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/mt7530.c#n1196
> 
> There might be others as well.

That sounds bogus indeed, and I bet that the two other drivers just
followed the b53 driver there. I will have to test this again and come
up with a patch eventually.

When the port leaves the bridge we do bring it back into a default PVID
(which is different than the bridge's default PVID) so that part should
be okay.
-- 
Florian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: What to do when a bridge port gets its pvid deleted?
  2019-06-28 16:45     ` Florian Fainelli
@ 2019-08-19 17:15       ` Vladimir Oltean
  2019-08-19 20:15         ` Ido Schimmel
  0 siblings, 1 reply; 10+ messages in thread
From: Vladimir Oltean @ 2019-08-19 17:15 UTC (permalink / raw)
  To: Florian Fainelli, Ido Schimmel, roopa, nikolay
  Cc: netdev, Andrew Lunn, Vivien Didelot, stephen

On 6/28/19 7:45 PM, Florian Fainelli wrote:
> On 6/28/19 5:37 AM, Vladimir Oltean wrote:
>> On Fri, 28 Jun 2019 at 15:30, Ido Schimmel <idosch@idosch.org> wrote:
>>>
>>> On Tue, Jun 25, 2019 at 11:49:29PM +0300, Vladimir Oltean wrote:
>>>> A number of DSA drivers (BCM53XX, Microchip KSZ94XX, Mediatek MT7530
>>>> at the very least), as well as Mellanox Spectrum (I didn't look at all
>>>> the pure switchdev drivers) try to restore the pvid to a default value
>>>> on .port_vlan_del.
>>>
>>> I don't know about DSA drivers, but that's not what mlxsw is doing. If
>>> the VLAN that is configured as PVID is deleted from the bridge port, the
>>> driver instructs the port to discard untagged and prio-tagged packets.
>>> This is consistent with the bridge driver's behavior.
>>>
>>> We do have a flow the "restores" the PVID, but that's when a port is
>>> unlinked from its bridge master. The PVID we set is 4095 which cannot be
>>> configured by the 8021q / bridge driver. This is due to the way the
>>> underlying hardware works. Even if a port is not bridged and used purely
>>> for routing, packets still do L2 lookup first which sends them directly
>>> to the router block. If PVID is not configured, untagged packets could
>>> not be routed. Obviously, at egress we strip this VLAN.
>>>
>>>> Sure, the port stops receiving traffic when its pvid is a VLAN ID that
>>>> is not installed in its hw filter, but as far as the bridge core is
>>>> concerned, this is to be expected:
>>>>
>>>> # bridge vlan add dev swp2 vid 100 pvid untagged
>>>> # bridge vlan
>>>> port    vlan ids
>>>> swp5     1 PVID Egress Untagged
>>>>
>>>> swp2     1 Egress Untagged
>>>>           100 PVID Egress Untagged
>>>>
>>>> swp3     1 PVID Egress Untagged
>>>>
>>>> swp4     1 PVID Egress Untagged
>>>>
>>>> br0      1 PVID Egress Untagged
>>>> # ping 10.0.0.1
>>>> PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
>>>> 64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.682 ms
>>>> 64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.299 ms
>>>> 64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.251 ms
>>>> 64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=0.324 ms
>>>> 64 bytes from 10.0.0.1: icmp_seq=5 ttl=64 time=0.257 ms
>>>> ^C
>>>> --- 10.0.0.1 ping statistics ---
>>>> 5 packets transmitted, 5 received, 0% packet loss, time 4188ms
>>>> rtt min/avg/max/mdev = 0.251/0.362/0.682/0.163 ms
>>>> # bridge vlan del dev swp2 vid 100
>>>> # bridge vlan
>>>> port    vlan ids
>>>> swp5     1 PVID Egress Untagged
>>>>
>>>> swp2     1 Egress Untagged
>>>>
>>>> swp3     1 PVID Egress Untagged
>>>>
>>>> swp4     1 PVID Egress Untagged
>>>>
>>>> br0      1 PVID Egress Untagged
>>>>
>>>> # ping 10.0.0.1
>>>> PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
>>>> ^C
>>>> --- 10.0.0.1 ping statistics ---
>>>> 8 packets transmitted, 0 received, 100% packet loss, time 7267ms
>>>>
>>>> What is the consensus here? Is there a reason why the bridge driver
>>>> doesn't take care of this?
>>>
>>> Take care of what? :) Always maintaining a PVID on the bridge port? It's
>>> completely OK not to have a PVID.
>>>
>>
>> Yes, I didn't think it through during the first email. I came to the
>> same conclusion in the second one.
>>
>>>> Do switchdev drivers have to restore the pvid to always be
>>>> operational, even if their state becomes inconsistent with the upper
>>>> dev? Is it just 'nice to have'? What if VID 1 isn't in the hw filter
>>>> either (perfectly legal)?
>>>
>>> Are you saying that DSA drivers always maintain a PVID on the bridge
>>> port and allow untagged traffic to ingress regardless of the bridge
>>> driver's configuration? If so, I think this needs to be fixed.
>>
>> Well, not at the DSA core level.
>> But for Microchip:
>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/microchip/ksz9477.c#n576
>> For Broadcom:
>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/b53/b53_common.c#n1376
>> For Mediatek:
>> https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/mt7530.c#n1196
>>
>> There might be others as well.
> 
> That sounds bogus indeed, and I bet that the two other drivers just
> followed the b53 driver there. I will have to test this again and come
> up with a patch eventually.
> 
> When the port leaves the bridge we do bring it back into a default PVID
> (which is different than the bridge's default PVID) so that part should
> be okay.
> 

Adding a few more networking people.
So my flow is something like this:
- Boot a board with a DSA switch
- Bring all interfaces up
- Enslave all interfaces to br0
- Enable vlan_filtering on br0

What VIDs should be installed into the ports' hw filters, and what 
should the pvid be at this point?
Should the switch ports pass any traffic?
At this point, 'bridge vlan' shows a confusing:
port    vlan ids
eth0     1 PVID Egress Untagged

swp5     1 PVID Egress Untagged

swp2     1 PVID Egress Untagged

swp3     1 PVID Egress Untagged

swp4     1 PVID Egress Untagged

br0      1 PVID Egress Untagged
for all ports, but the .port_vlan_add callback is nowhere to be found.

Whose responsibility is it for the switch to pass traffic without any 
further 'bridge vlan' command? What is the mechanism through which this 
should work?

What if I do:
sudo bridge vlan add vid 100 dev swp2 pvid untagged
echo 0 | sudo tee /sys/class/net/br0/bridge/vlan_filtering
echo 1 | sudo tee /sys/class/net/br0/bridge/vlan_filtering
What pvid should there be on swp2 now?
'bridge vlan' shows:
port    vlan ids
eth0     1 PVID Egress Untagged

swp5     1 PVID Egress Untagged

swp2     1 Egress Untagged
          100 PVID Egress Untagged

swp3     1 PVID Egress Untagged

swp4     1 PVID Egress Untagged

br0      1 PVID Egress Untagged
If the 'bridge vlan' output is correct, whose responsibility is it to 
restore this pvid?

More context: the sja1105 driver is somewhat similar to the mlxsw in 
that VLAN awareness cannot be truly disabled. Arid details aside, in 
both cases, achieving "VLAN-unaware"-like behavior involves manipulating 
the pvid in both cases. But it appears that the bridge core does expect:
(1) that the driver performs a default VLAN initialization which matches 
its own, without them ever communicating. But because switchdev/DSA 
drivers start off in standalone mode, vlan_filtering=0 comes first, 
hence the non-standard pvid. Through what mechanism is the 
bridge-expected pvid supposed to get restored upon flipping vlan_filtering?
(2) that toggling VLAN filtering off and on has no other state upon the 
underlying driver than enabling and disabling VLAN awareness. The VLAN 
hw filter table is assumed to be unchanged. Is this a correct assumption?

Thanks,
-Vladimir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: What to do when a bridge port gets its pvid deleted?
  2019-08-19 17:15       ` Vladimir Oltean
@ 2019-08-19 20:15         ` Ido Schimmel
  2019-08-19 21:10           ` Vladimir Oltean
  0 siblings, 1 reply; 10+ messages in thread
From: Ido Schimmel @ 2019-08-19 20:15 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Florian Fainelli, roopa, nikolay, netdev, Andrew Lunn,
	Vivien Didelot, stephen

On Mon, Aug 19, 2019 at 08:15:03PM +0300, Vladimir Oltean wrote:
> On 6/28/19 7:45 PM, Florian Fainelli wrote:
> > On 6/28/19 5:37 AM, Vladimir Oltean wrote:
> > > On Fri, 28 Jun 2019 at 15:30, Ido Schimmel <idosch@idosch.org> wrote:
> > > > 
> > > > On Tue, Jun 25, 2019 at 11:49:29PM +0300, Vladimir Oltean wrote:
> > > > > A number of DSA drivers (BCM53XX, Microchip KSZ94XX, Mediatek MT7530
> > > > > at the very least), as well as Mellanox Spectrum (I didn't look at all
> > > > > the pure switchdev drivers) try to restore the pvid to a default value
> > > > > on .port_vlan_del.
> > > > 
> > > > I don't know about DSA drivers, but that's not what mlxsw is doing. If
> > > > the VLAN that is configured as PVID is deleted from the bridge port, the
> > > > driver instructs the port to discard untagged and prio-tagged packets.
> > > > This is consistent with the bridge driver's behavior.
> > > > 
> > > > We do have a flow the "restores" the PVID, but that's when a port is
> > > > unlinked from its bridge master. The PVID we set is 4095 which cannot be
> > > > configured by the 8021q / bridge driver. This is due to the way the
> > > > underlying hardware works. Even if a port is not bridged and used purely
> > > > for routing, packets still do L2 lookup first which sends them directly
> > > > to the router block. If PVID is not configured, untagged packets could
> > > > not be routed. Obviously, at egress we strip this VLAN.
> > > > 
> > > > > Sure, the port stops receiving traffic when its pvid is a VLAN ID that
> > > > > is not installed in its hw filter, but as far as the bridge core is
> > > > > concerned, this is to be expected:
> > > > > 
> > > > > # bridge vlan add dev swp2 vid 100 pvid untagged
> > > > > # bridge vlan
> > > > > port    vlan ids
> > > > > swp5     1 PVID Egress Untagged
> > > > > 
> > > > > swp2     1 Egress Untagged
> > > > >           100 PVID Egress Untagged
> > > > > 
> > > > > swp3     1 PVID Egress Untagged
> > > > > 
> > > > > swp4     1 PVID Egress Untagged
> > > > > 
> > > > > br0      1 PVID Egress Untagged
> > > > > # ping 10.0.0.1
> > > > > PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
> > > > > 64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.682 ms
> > > > > 64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.299 ms
> > > > > 64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.251 ms
> > > > > 64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=0.324 ms
> > > > > 64 bytes from 10.0.0.1: icmp_seq=5 ttl=64 time=0.257 ms
> > > > > ^C
> > > > > --- 10.0.0.1 ping statistics ---
> > > > > 5 packets transmitted, 5 received, 0% packet loss, time 4188ms
> > > > > rtt min/avg/max/mdev = 0.251/0.362/0.682/0.163 ms
> > > > > # bridge vlan del dev swp2 vid 100
> > > > > # bridge vlan
> > > > > port    vlan ids
> > > > > swp5     1 PVID Egress Untagged
> > > > > 
> > > > > swp2     1 Egress Untagged
> > > > > 
> > > > > swp3     1 PVID Egress Untagged
> > > > > 
> > > > > swp4     1 PVID Egress Untagged
> > > > > 
> > > > > br0      1 PVID Egress Untagged
> > > > > 
> > > > > # ping 10.0.0.1
> > > > > PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
> > > > > ^C
> > > > > --- 10.0.0.1 ping statistics ---
> > > > > 8 packets transmitted, 0 received, 100% packet loss, time 7267ms
> > > > > 
> > > > > What is the consensus here? Is there a reason why the bridge driver
> > > > > doesn't take care of this?
> > > > 
> > > > Take care of what? :) Always maintaining a PVID on the bridge port? It's
> > > > completely OK not to have a PVID.
> > > > 
> > > 
> > > Yes, I didn't think it through during the first email. I came to the
> > > same conclusion in the second one.
> > > 
> > > > > Do switchdev drivers have to restore the pvid to always be
> > > > > operational, even if their state becomes inconsistent with the upper
> > > > > dev? Is it just 'nice to have'? What if VID 1 isn't in the hw filter
> > > > > either (perfectly legal)?
> > > > 
> > > > Are you saying that DSA drivers always maintain a PVID on the bridge
> > > > port and allow untagged traffic to ingress regardless of the bridge
> > > > driver's configuration? If so, I think this needs to be fixed.
> > > 
> > > Well, not at the DSA core level.
> > > But for Microchip:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/microchip/ksz9477.c#n576
> > > For Broadcom:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/b53/b53_common.c#n1376
> > > For Mediatek:
> > > https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/mt7530.c#n1196
> > > 
> > > There might be others as well.
> > 
> > That sounds bogus indeed, and I bet that the two other drivers just
> > followed the b53 driver there. I will have to test this again and come
> > up with a patch eventually.
> > 
> > When the port leaves the bridge we do bring it back into a default PVID
> > (which is different than the bridge's default PVID) so that part should
> > be okay.
> > 
> 
> Adding a few more networking people.
> So my flow is something like this:
> - Boot a board with a DSA switch
> - Bring all interfaces up
> - Enslave all interfaces to br0
> - Enable vlan_filtering on br0
> 
> What VIDs should be installed into the ports' hw filters, and what should
> the pvid be at this point?
> Should the switch ports pass any traffic?
> At this point, 'bridge vlan' shows a confusing:
> port    vlan ids
> eth0     1 PVID Egress Untagged
> 
> swp5     1 PVID Egress Untagged
> 
> swp2     1 PVID Egress Untagged
> 
> swp3     1 PVID Egress Untagged
> 
> swp4     1 PVID Egress Untagged
> 
> br0      1 PVID Egress Untagged
> for all ports, but the .port_vlan_add callback is nowhere to be found.

The bridge adds a PVID on the port when it is enslaved to the bridge.
The configuration only takes effect when VLAN filtering is enabled. I'm
looking at dsa_port_vlan_add() and it seems that it does not propagate
the VLAN call when VLAN filtering is disabled. This explains why you
never see the callback.

I assume that if you configure the bridge with VLAN filtering enabled
and then enslave a port, then everything works fine.

mlxsw avoids the situation by forbidding the toggling of VLAN filtering
on the bridge when its ports are enslaved.

> 
> Whose responsibility is it for the switch to pass traffic without any
> further 'bridge vlan' command? What is the mechanism through which this
> should work?
> 
> What if I do:
> sudo bridge vlan add vid 100 dev swp2 pvid untagged
> echo 0 | sudo tee /sys/class/net/br0/bridge/vlan_filtering
> echo 1 | sudo tee /sys/class/net/br0/bridge/vlan_filtering
> What pvid should there be on swp2 now?
> 'bridge vlan' shows:
> port    vlan ids
> eth0     1 PVID Egress Untagged
> 
> swp5     1 PVID Egress Untagged
> 
> swp2     1 Egress Untagged
>          100 PVID Egress Untagged
> 
> swp3     1 PVID Egress Untagged
> 
> swp4     1 PVID Egress Untagged
> 
> br0      1 PVID Egress Untagged
> If the 'bridge vlan' output is correct, whose responsibility is it to
> restore this pvid?

I suggest to follow mlxsw and avoid this mess. You can support both VLAN
filtering enable / disable without supporting dynamically toggling the
option.

> 
> More context: the sja1105 driver is somewhat similar to the mlxsw in that
> VLAN awareness cannot be truly disabled. Arid details aside, in both cases,
> achieving "VLAN-unaware"-like behavior involves manipulating the pvid in
> both cases. But it appears that the bridge core does expect:
> (1) that the driver performs a default VLAN initialization which matches its
> own, without them ever communicating. But because switchdev/DSA drivers
> start off in standalone mode, vlan_filtering=0 comes first, hence the
> non-standard pvid. Through what mechanism is the bridge-expected pvid
> supposed to get restored upon flipping vlan_filtering?
> (2) that toggling VLAN filtering off and on has no other state upon the
> underlying driver than enabling and disabling VLAN awareness. The VLAN hw
> filter table is assumed to be unchanged. Is this a correct assumption?
> 
> Thanks,
> -Vladimir

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: What to do when a bridge port gets its pvid deleted?
  2019-08-19 20:15         ` Ido Schimmel
@ 2019-08-19 21:10           ` Vladimir Oltean
  2019-08-19 23:01             ` Nikolay Aleksandrov
  0 siblings, 1 reply; 10+ messages in thread
From: Vladimir Oltean @ 2019-08-19 21:10 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Florian Fainelli, Roopa Prabhu, nikolay, netdev, Andrew Lunn,
	Vivien Didelot, stephen

On Mon, 19 Aug 2019 at 23:15, Ido Schimmel <idosch@idosch.org> wrote:
>
> On Mon, Aug 19, 2019 at 08:15:03PM +0300, Vladimir Oltean wrote:
> > On 6/28/19 7:45 PM, Florian Fainelli wrote:
> > > On 6/28/19 5:37 AM, Vladimir Oltean wrote:
> > > > On Fri, 28 Jun 2019 at 15:30, Ido Schimmel <idosch@idosch.org> wrote:
> > > > >
> > > > > On Tue, Jun 25, 2019 at 11:49:29PM +0300, Vladimir Oltean wrote:
> > > > > > A number of DSA drivers (BCM53XX, Microchip KSZ94XX, Mediatek MT7530
> > > > > > at the very least), as well as Mellanox Spectrum (I didn't look at all
> > > > > > the pure switchdev drivers) try to restore the pvid to a default value
> > > > > > on .port_vlan_del.
> > > > >
> > > > > I don't know about DSA drivers, but that's not what mlxsw is doing. If
> > > > > the VLAN that is configured as PVID is deleted from the bridge port, the
> > > > > driver instructs the port to discard untagged and prio-tagged packets.
> > > > > This is consistent with the bridge driver's behavior.
> > > > >
> > > > > We do have a flow the "restores" the PVID, but that's when a port is
> > > > > unlinked from its bridge master. The PVID we set is 4095 which cannot be
> > > > > configured by the 8021q / bridge driver. This is due to the way the
> > > > > underlying hardware works. Even if a port is not bridged and used purely
> > > > > for routing, packets still do L2 lookup first which sends them directly
> > > > > to the router block. If PVID is not configured, untagged packets could
> > > > > not be routed. Obviously, at egress we strip this VLAN.
> > > > >
> > > > > > Sure, the port stops receiving traffic when its pvid is a VLAN ID that
> > > > > > is not installed in its hw filter, but as far as the bridge core is
> > > > > > concerned, this is to be expected:
> > > > > >
> > > > > > # bridge vlan add dev swp2 vid 100 pvid untagged
> > > > > > # bridge vlan
> > > > > > port    vlan ids
> > > > > > swp5     1 PVID Egress Untagged
> > > > > >
> > > > > > swp2     1 Egress Untagged
> > > > > >           100 PVID Egress Untagged
> > > > > >
> > > > > > swp3     1 PVID Egress Untagged
> > > > > >
> > > > > > swp4     1 PVID Egress Untagged
> > > > > >
> > > > > > br0      1 PVID Egress Untagged
> > > > > > # ping 10.0.0.1
> > > > > > PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
> > > > > > 64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.682 ms
> > > > > > 64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.299 ms
> > > > > > 64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=0.251 ms
> > > > > > 64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=0.324 ms
> > > > > > 64 bytes from 10.0.0.1: icmp_seq=5 ttl=64 time=0.257 ms
> > > > > > ^C
> > > > > > --- 10.0.0.1 ping statistics ---
> > > > > > 5 packets transmitted, 5 received, 0% packet loss, time 4188ms
> > > > > > rtt min/avg/max/mdev = 0.251/0.362/0.682/0.163 ms
> > > > > > # bridge vlan del dev swp2 vid 100
> > > > > > # bridge vlan
> > > > > > port    vlan ids
> > > > > > swp5     1 PVID Egress Untagged
> > > > > >
> > > > > > swp2     1 Egress Untagged
> > > > > >
> > > > > > swp3     1 PVID Egress Untagged
> > > > > >
> > > > > > swp4     1 PVID Egress Untagged
> > > > > >
> > > > > > br0      1 PVID Egress Untagged
> > > > > >
> > > > > > # ping 10.0.0.1
> > > > > > PING 10.0.0.1 (10.0.0.1) 56(84) bytes of data.
> > > > > > ^C
> > > > > > --- 10.0.0.1 ping statistics ---
> > > > > > 8 packets transmitted, 0 received, 100% packet loss, time 7267ms
> > > > > >
> > > > > > What is the consensus here? Is there a reason why the bridge driver
> > > > > > doesn't take care of this?
> > > > >
> > > > > Take care of what? :) Always maintaining a PVID on the bridge port? It's
> > > > > completely OK not to have a PVID.
> > > > >
> > > >
> > > > Yes, I didn't think it through during the first email. I came to the
> > > > same conclusion in the second one.
> > > >
> > > > > > Do switchdev drivers have to restore the pvid to always be
> > > > > > operational, even if their state becomes inconsistent with the upper
> > > > > > dev? Is it just 'nice to have'? What if VID 1 isn't in the hw filter
> > > > > > either (perfectly legal)?
> > > > >
> > > > > Are you saying that DSA drivers always maintain a PVID on the bridge
> > > > > port and allow untagged traffic to ingress regardless of the bridge
> > > > > driver's configuration? If so, I think this needs to be fixed.
> > > >
> > > > Well, not at the DSA core level.
> > > > But for Microchip:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/microchip/ksz9477.c#n576
> > > > For Broadcom:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/b53/b53_common.c#n1376
> > > > For Mediatek:
> > > > https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/tree/drivers/net/dsa/mt7530.c#n1196
> > > >
> > > > There might be others as well.
> > >
> > > That sounds bogus indeed, and I bet that the two other drivers just
> > > followed the b53 driver there. I will have to test this again and come
> > > up with a patch eventually.
> > >
> > > When the port leaves the bridge we do bring it back into a default PVID
> > > (which is different than the bridge's default PVID) so that part should
> > > be okay.
> > >
> >
> > Adding a few more networking people.
> > So my flow is something like this:
> > - Boot a board with a DSA switch
> > - Bring all interfaces up
> > - Enslave all interfaces to br0
> > - Enable vlan_filtering on br0
> >
> > What VIDs should be installed into the ports' hw filters, and what should
> > the pvid be at this point?
> > Should the switch ports pass any traffic?
> > At this point, 'bridge vlan' shows a confusing:
> > port    vlan ids
> > eth0     1 PVID Egress Untagged
> >
> > swp5     1 PVID Egress Untagged
> >
> > swp2     1 PVID Egress Untagged
> >
> > swp3     1 PVID Egress Untagged
> >
> > swp4     1 PVID Egress Untagged
> >
> > br0      1 PVID Egress Untagged
> > for all ports, but the .port_vlan_add callback is nowhere to be found.
>
> The bridge adds a PVID on the port when it is enslaved to the bridge.
> The configuration only takes effect when VLAN filtering is enabled. I'm
> looking at dsa_port_vlan_add() and it seems that it does not propagate
> the VLAN call when VLAN filtering is disabled. This explains why you
> never see the callback.
>

Aha! The offending commit is this:

commit 2ea7a679ca2abd251c1ec03f20508619707e1749
Author: Andrew Lunn <andrew@lunn.ch>
Date:   Tue Nov 7 00:04:24 2017 +0100

    net: dsa: Don't add vlans when vlan filtering is disabled

    The software bridge can be build with vlan filtering support
    included. However, by default it is turned off. In its turned off
    state, it still passes VLANs via switchev, even though they are not to
    be used. Don't pass these VLANs to the hardware. Only do so when vlan
    filtering is enabled.

    This fixes at least one corner case. There are still issues in other
    corners, such as when vlan_filtering is later enabled.

    Signed-off-by: Andrew Lunn <andrew@lunn.ch>
    Signed-off-by: David S. Miller <davem@davemloft.net>

It's good to know that it's there (like you said, it explains some
things) but I can't exactly say that removing it helps in any way.
In fact, removing it only overwrites the dsa_8021q VLANs with 1 during
bridge_join, while not actually doing anything upon a vlan_filtering
toggle.
So the sja1105 driver is in a way shielded by DSA from the bridge, for
the better.
It still appears to be true that the bridge doesn't think it's
necessary to notify through SWITCHDEV_OBJ_ID_PORT_VLAN again. So my
best bet is to restore the pvid on my own.
However I've already stumbled upon an apparent bug while trying to do
that. Does this look off? If it doesn't, I'll submit it as a patch:

commit 788f03991aa576fc0b4b26ca330af0f412c55582
Author: Vladimir Oltean <olteanv@gmail.com>
Date:   Mon Aug 19 22:57:00 2019 +0300

    net: bridge: Keep the BRIDGE_VLAN_INFO_PVID flag in net_bridge_vlan

    Currently this simplified code snippet fails:

            br_vlan_get_pvid(netdev, &pvid);
            br_vlan_get_info(netdev, pvid, &vinfo);
            ASSERT(!(vinfo.flags & BRIDGE_VLAN_INFO_PVID));

    It is intuitive that the pvid of a netdevice should have the
    BRIDGE_VLAN_INFO_PVID flag set.

    However I can't seem to pinpoint a commit where this behavior was
    introduced. It seems like it's been like that since forever.

    Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 021cc9f66804..f49b2758bcab 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -68,10 +68,13 @@ static bool __vlan_add_flags(struct
net_bridge_vlan *v, u16 flags)
     else
         vg = nbp_vlan_group(v->port);

-    if (flags & BRIDGE_VLAN_INFO_PVID)
+    if (flags & BRIDGE_VLAN_INFO_PVID) {
         ret = __vlan_add_pvid(vg, v->vid);
-    else
+        v->flags |= BRIDGE_VLAN_INFO_PVID;
+    } else {
         ret = __vlan_delete_pvid(vg, v->vid);
+        v->flags &= ~BRIDGE_VLAN_INFO_PVID;
+    }

     if (flags & BRIDGE_VLAN_INFO_UNTAGGED)
         v->flags |= BRIDGE_VLAN_INFO_UNTAGGED;


> I assume that if you configure the bridge with VLAN filtering enabled
> and then enslave a port, then everything works fine.
>
> mlxsw avoids the situation by forbidding the toggling of VLAN filtering
> on the bridge when its ports are enslaved.
>

I can certainly understand where this comes from. However a simpleton
might object that this:

ip link add name br0 type bridge vlan_filtering 1
ip link set dev swp2 master br0

should behave the same as this:

ip link add name br0 type bridge
ip link set dev swp2 master br0
echo 1 > /sys/class/net/br0/bridge/vlan_filtering

I can't disagree with said simpleton.

> >
> > Whose responsibility is it for the switch to pass traffic without any
> > further 'bridge vlan' command? What is the mechanism through which this
> > should work?
> >
> > What if I do:
> > sudo bridge vlan add vid 100 dev swp2 pvid untagged
> > echo 0 | sudo tee /sys/class/net/br0/bridge/vlan_filtering
> > echo 1 | sudo tee /sys/class/net/br0/bridge/vlan_filtering
> > What pvid should there be on swp2 now?
> > 'bridge vlan' shows:
> > port    vlan ids
> > eth0     1 PVID Egress Untagged
> >
> > swp5     1 PVID Egress Untagged
> >
> > swp2     1 Egress Untagged
> >          100 PVID Egress Untagged
> >
> > swp3     1 PVID Egress Untagged
> >
> > swp4     1 PVID Egress Untagged
> >
> > br0      1 PVID Egress Untagged
> > If the 'bridge vlan' output is correct, whose responsibility is it to
> > restore this pvid?
>
> I suggest to follow mlxsw and avoid this mess. You can support both VLAN
> filtering enable / disable without supporting dynamically toggling the
> option.
>
> >
> > More context: the sja1105 driver is somewhat similar to the mlxsw in that
> > VLAN awareness cannot be truly disabled. Arid details aside, in both cases,
> > achieving "VLAN-unaware"-like behavior involves manipulating the pvid in
> > both cases. But it appears that the bridge core does expect:
> > (1) that the driver performs a default VLAN initialization which matches its
> > own, without them ever communicating. But because switchdev/DSA drivers
> > start off in standalone mode, vlan_filtering=0 comes first, hence the
> > non-standard pvid. Through what mechanism is the bridge-expected pvid
> > supposed to get restored upon flipping vlan_filtering?
> > (2) that toggling VLAN filtering off and on has no other state upon the
> > underlying driver than enabling and disabling VLAN awareness. The VLAN hw
> > filter table is assumed to be unchanged. Is this a correct assumption?
> >
> > Thanks,
> > -Vladimir

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: What to do when a bridge port gets its pvid deleted?
  2019-08-19 21:10           ` Vladimir Oltean
@ 2019-08-19 23:01             ` Nikolay Aleksandrov
  2019-08-19 23:12               ` Nikolay Aleksandrov
  0 siblings, 1 reply; 10+ messages in thread
From: Nikolay Aleksandrov @ 2019-08-19 23:01 UTC (permalink / raw)
  To: Vladimir Oltean, Ido Schimmel
  Cc: Florian Fainelli, Roopa Prabhu, netdev, Andrew Lunn,
	Vivien Didelot, stephen

On 8/20/19 12:10 AM, Vladimir Oltean wrote:
[snip]
> It's good to know that it's there (like you said, it explains some
> things) but I can't exactly say that removing it helps in any way.
> In fact, removing it only overwrites the dsa_8021q VLANs with 1 during
> bridge_join, while not actually doing anything upon a vlan_filtering
> toggle.
> So the sja1105 driver is in a way shielded by DSA from the bridge, for
> the better.
> It still appears to be true that the bridge doesn't think it's
> necessary to notify through SWITCHDEV_OBJ_ID_PORT_VLAN again. So my
> best bet is to restore the pvid on my own.
> However I've already stumbled upon an apparent bug while trying to do
> that. Does this look off? If it doesn't, I'll submit it as a patch:
> 
> commit 788f03991aa576fc0b4b26ca330af0f412c55582
> Author: Vladimir Oltean <olteanv@gmail.com>
> Date:   Mon Aug 19 22:57:00 2019 +0300
> 
>     net: bridge: Keep the BRIDGE_VLAN_INFO_PVID flag in net_bridge_vlan
> 
>     Currently this simplified code snippet fails:
> 
>             br_vlan_get_pvid(netdev, &pvid);
>             br_vlan_get_info(netdev, pvid, &vinfo);
>             ASSERT(!(vinfo.flags & BRIDGE_VLAN_INFO_PVID));
> 
>     It is intuitive that the pvid of a netdevice should have the
>     BRIDGE_VLAN_INFO_PVID flag set.
> 
>     However I can't seem to pinpoint a commit where this behavior was
>     introduced. It seems like it's been like that since forever.
> 
>     Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
> 
> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
> index 021cc9f66804..f49b2758bcab 100644
> --- a/net/bridge/br_vlan.c
> +++ b/net/bridge/br_vlan.c
> @@ -68,10 +68,13 @@ static bool __vlan_add_flags(struct
> net_bridge_vlan *v, u16 flags)
>      else
>          vg = nbp_vlan_group(v->port);
> 
> -    if (flags & BRIDGE_VLAN_INFO_PVID)
> +    if (flags & BRIDGE_VLAN_INFO_PVID) {
>          ret = __vlan_add_pvid(vg, v->vid);
> -    else
> +        v->flags |= BRIDGE_VLAN_INFO_PVID;
> +    } else {
>          ret = __vlan_delete_pvid(vg, v->vid);
> +        v->flags &= ~BRIDGE_VLAN_INFO_PVID;
> +    }
> 
>      if (flags & BRIDGE_VLAN_INFO_UNTAGGED)
>          v->flags |= BRIDGE_VLAN_INFO_UNTAGGED;
> 

Hi Vladimir,
I know it looks logical to have it, but there are a few reasons why we don't
do it, most importantly because we need to have only one visible pvid at any single
time, even if it's stale - it must be just one. Right now that rule will not
be violated  by your change, but people will try using this flag and could see
two pvids simultaneously. You can see that the pvid code is even using memory
barriers to propagate the new value faster and everywhere the pvid is read only once.
That is the reason the flag is set dynamically when dumping entries, too.
A second (weaker) argument against would be given the above we don't want
another way to do the same thing, specifically if it can provide us with
two pvids (e.g. if walking the vlan list) or if it can provide us with a pvid
different from the one set in the vg.
If you do need that flag, you can change br_vlan_get_info() to set the flag like
the other places do - if the vid matches the current vg pvid, or you could change
the whole pvid handling code to this new scheme as long as it can guarantee a single
pvid when walking the vlan list and fast pvid retrieval in the fast-path.

Cheers,
 Nik
 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: What to do when a bridge port gets its pvid deleted?
  2019-08-19 23:01             ` Nikolay Aleksandrov
@ 2019-08-19 23:12               ` Nikolay Aleksandrov
  0 siblings, 0 replies; 10+ messages in thread
From: Nikolay Aleksandrov @ 2019-08-19 23:12 UTC (permalink / raw)
  To: Vladimir Oltean, Ido Schimmel
  Cc: Florian Fainelli, Roopa Prabhu, netdev, Andrew Lunn,
	Vivien Didelot, stephen

On 8/20/19 2:01 AM, Nikolay Aleksandrov wrote:
> On 8/20/19 12:10 AM, Vladimir Oltean wrote:
> [snip]
>> It's good to know that it's there (like you said, it explains some
>> things) but I can't exactly say that removing it helps in any way.
>> In fact, removing it only overwrites the dsa_8021q VLANs with 1 during
>> bridge_join, while not actually doing anything upon a vlan_filtering
>> toggle.
>> So the sja1105 driver is in a way shielded by DSA from the bridge, for
>> the better.
>> It still appears to be true that the bridge doesn't think it's
>> necessary to notify through SWITCHDEV_OBJ_ID_PORT_VLAN again. So my
>> best bet is to restore the pvid on my own.
>> However I've already stumbled upon an apparent bug while trying to do
>> that. Does this look off? If it doesn't, I'll submit it as a patch:
>>
>> commit 788f03991aa576fc0b4b26ca330af0f412c55582
>> Author: Vladimir Oltean <olteanv@gmail.com>
>> Date:   Mon Aug 19 22:57:00 2019 +0300
>>
>>     net: bridge: Keep the BRIDGE_VLAN_INFO_PVID flag in net_bridge_vlan
>>
>>     Currently this simplified code snippet fails:
>>
>>             br_vlan_get_pvid(netdev, &pvid);
>>             br_vlan_get_info(netdev, pvid, &vinfo);
>>             ASSERT(!(vinfo.flags & BRIDGE_VLAN_INFO_PVID));
>>
>>     It is intuitive that the pvid of a netdevice should have the
>>     BRIDGE_VLAN_INFO_PVID flag set.
>>
>>     However I can't seem to pinpoint a commit where this behavior was
>>     introduced. It seems like it's been like that since forever.
>>
>>     Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
>>
>> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
>> index 021cc9f66804..f49b2758bcab 100644
>> --- a/net/bridge/br_vlan.c
>> +++ b/net/bridge/br_vlan.c
>> @@ -68,10 +68,13 @@ static bool __vlan_add_flags(struct
>> net_bridge_vlan *v, u16 flags)
>>      else
>>          vg = nbp_vlan_group(v->port);
>>
>> -    if (flags & BRIDGE_VLAN_INFO_PVID)
>> +    if (flags & BRIDGE_VLAN_INFO_PVID) {
>>          ret = __vlan_add_pvid(vg, v->vid);
>> -    else
>> +        v->flags |= BRIDGE_VLAN_INFO_PVID;
>> +    } else {
>>          ret = __vlan_delete_pvid(vg, v->vid);
>> +        v->flags &= ~BRIDGE_VLAN_INFO_PVID;
>> +    }
>>
>>      if (flags & BRIDGE_VLAN_INFO_UNTAGGED)
>>          v->flags |= BRIDGE_VLAN_INFO_UNTAGGED;
>>
> 
> Hi Vladimir,
> I know it looks logical to have it, but there are a few reasons why we don't
> do it, most importantly because we need to have only one visible pvid at any single
> time, even if it's stale - it must be just one. Right now that rule will not
> be violated  by your change, but people will try using this flag and could see

Obviously, I'm talking about RCU pvid/vlan use cases similar to the dumps.
The locked cases are fine.

> two pvids simultaneously. You can see that the pvid code is even using memory
> barriers to propagate the new value faster and everywhere the pvid is read only once.
> That is the reason the flag is set dynamically when dumping entries, too.
> A second (weaker) argument against would be given the above we don't want
> another way to do the same thing, specifically if it can provide us with

i.e. I would like to avoid explaining why this shouldn't be relied upon without locking

> two pvids (e.g. if walking the vlan list) or if it can provide us with a pvid
> different from the one set in the vg.
> If you do need that flag, you can change br_vlan_get_info() to set the flag like
> the other places do - if the vid matches the current vg pvid, or you could change
> the whole pvid handling code to this new scheme as long as it can guarantee a single
> pvid when walking the vlan list and fast pvid retrieval in the fast-path.
> 
> Cheers,
>  Nik
>  
> 


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-08-19 23:12 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-25 20:49 What to do when a bridge port gets its pvid deleted? Vladimir Oltean
2019-06-25 21:05 ` Vladimir Oltean
2019-06-28 12:30 ` Ido Schimmel
2019-06-28 12:37   ` Vladimir Oltean
2019-06-28 16:45     ` Florian Fainelli
2019-08-19 17:15       ` Vladimir Oltean
2019-08-19 20:15         ` Ido Schimmel
2019-08-19 21:10           ` Vladimir Oltean
2019-08-19 23:01             ` Nikolay Aleksandrov
2019-08-19 23:12               ` Nikolay Aleksandrov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).