On Tue Oct 06 2020, Vladimir Oltean wrote:
> On Tue, Oct 06, 2020 at 08:09:39AM +0200, Kurt Kanzenbach wrote:
>> On Sun Oct 04 2020, Vladimir Oltean wrote:
>> > I don't think this works.
>> >
>> > ip link add br0 type bridge vlan_filtering 1
>> > ip link set swp0 master br0
>> > bridge vlan add dev swp0 vid 100
>> > ip link set br0 type bridge vlan_filtering 0
>> > bridge vlan del dev swp0 vid 100
>> > ip link set br0 type bridge vlan_filtering 1
>> >
>> > The expectation would be that swp0 blocks vid 100 now, but with your
>> > scheme it doesn't (it is not unapplied, and not unqueued either, because
>> > it was never queued in the first place).
>> 
>> Yes, that's correct. So, I think we have to queue not only the addition
>> of VLANs, but rather the "action" itself such as add or del. And then
>> apply all pending actions whenever vlan_filtering is set.
>
> Please remind me why you have to queue a VLAN addition/removal and can't
> do it straight away? Is it because of private VID 2 and 3, which need to
> be deleted first then re-added from the bridge VLAN group?

It's because of the private VLANs 2 and 3, which shouldn't be tampered
with, isn't it? You said:

> If you need caching of VLANs installed by the bridge and/or by the 8021q
> module, then you can add those to a list, and restore them in the
> .port_vlan_filtering callback by yourself. You can look at how sja1105
> does that.
[...]
> If your driver makes private use of VLAN tags beyond what the upper
> layers ask for, then it should keep track of them.

That's what I did.
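
To make the "queue the action, not only the addition" idea from above
concrete, here is a minimal user-space sketch. It is not the driver
code: struct port_model, hw_apply(), port_vlan_request() and
port_set_vlan_filtering() are made-up names, hw_member[] merely stands
in for the hardware VLAN table, and the re-installation of the private
VLANs when filtering is turned off is left out on purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VID     4096
#define MAX_PENDING 64

enum vlan_action { VLAN_ADD, VLAN_DEL };

struct pending_vlan {
	enum vlan_action action;
	unsigned short vid;
};

/* Hypothetical per-port state; hw_member[] stands in for the hardware table. */
struct port_model {
	bool vlan_filtering;
	bool hw_member[MAX_VID];
	struct pending_vlan queue[MAX_PENDING];
	size_t num_pending;
};

/* Write one add/del straight into the (modelled) hardware table. */
static void hw_apply(struct port_model *p, enum vlan_action action,
		     unsigned short vid)
{
	assert(vid < MAX_VID);
	p->hw_member[vid] = (action == VLAN_ADD);
}

/* Bridge asks for an add or del: apply now if filtering, else queue it. */
static int port_vlan_request(struct port_model *p, enum vlan_action action,
			     unsigned short vid)
{
	if (vid >= MAX_VID)
		return -1;

	if (p->vlan_filtering) {
		hw_apply(p, action, vid);
		return 0;
	}

	if (p->num_pending == MAX_PENDING)
		return -1;

	p->queue[p->num_pending].action = action;
	p->queue[p->num_pending].vid = vid;
	p->num_pending++;
	return 0;
}

/* Enabling filtering replays all queued actions in order, then clears them. */
static void port_set_vlan_filtering(struct port_model *p, bool enable)
{
	size_t i;

	p->vlan_filtering = enable;
	if (!enable)
		return;

	for (i = 0; i < p->num_pending; i++)
		hw_apply(p, p->queue[i].action, p->queue[i].vid);
	p->num_pending = 0;
}
```

With the sequence from your example (add vid 100 while filtering, turn
filtering off, delete vid 100, turn filtering on again), the delete is
queued and replayed, so vid 100 is gone from the table at the end.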

At the end of the day the driver needs to implement port separation
somehow. Otherwise it doesn't match the DSA model, right? Again, there
is no port forwarding matrix, which would make things easy. It has to be
solved in software.

If the private VLAN stuff isn't working because of all the different
corner cases, then what's the alternative?

>
>> >> +static int hellcreek_port_bridge_join(struct dsa_switch *ds, int port,
>> >> +				      struct net_device *br)
>> >> +{
>> >> +	struct hellcreek *hellcreek = ds->priv;
>> >> +	int i;
>> >> +
>> >> +	dev_dbg(hellcreek->dev, "Port %d joins a bridge\n", port);
>> >> +
>> >> +	/* Configure port's vid to all other ports as egress untagged */
>> >> +	for (i = 0; i < ds->num_ports; ++i) {
>> >> +		if (!dsa_is_user_port(ds, i))
>> >> +			continue;
>> >> +
>> >> +		if (i == port)
>> >> +			continue;
>> >> +
>> >> +		hellcreek_apply_vlan(hellcreek, i, port, false, true);
>> >> +	}
>> >
>> > I think this is buggy when joining a VLAN filtering bridge. Your ports
>> > will pass frames with VID=2 with no problem, even without the user
>> > specifying 'bridge vlan add dev swp0 vid 2', and that's an issue. My
>> > understanding is that VLANs 1, 2, 3 stop having any sort of special
>> > meaning when the upper bridge has vlan_filtering=1.
>> 
>> Yes, that understanding is correct. So, what happens is when a port is
>> joining a VLAN filtering bridge is:
>> 
>> |root@tsn:~# ip link add name br0 type bridge
>> |root@tsn:~# ip link set dev br0 type bridge vlan_filtering 1
>> |root@tsn:~# ip link set dev lan0 master br0
>> |[  209.375055] br0: port 1(lan0) entered blocking state
>> |[  209.380073] br0: port 1(lan0) entered disabled state
>> |[  209.385340] hellcreek ff240000.switch: Port 2 joins a bridge
>> |[  209.391584] hellcreek ff240000.switch: Apply VLAN: port=3 vid=2 pvid=0 untagged=1
>> |[  209.399439] device lan0 entered promiscuous mode
>> |[  209.404043] device eth0 entered promiscuous mode
>> |[  209.409204] hellcreek ff240000.switch: Enable VLAN filtering on port 2
>> |[  209.415716] hellcreek ff240000.switch: Unapply VLAN: port=2 vid=2
>> |[  209.421840] hellcreek ff240000.switch: Unapply VLAN: port=0 vid=2
>
> Now I understand even less. If the entire purpose of
> hellcreek_setup_vlan_membership is to isolate lan0 from lan1

Yes.

> , then why do you even bother to install vid 2 to port=3 (lan1) when
> joining a bridge, be it vlan_filtering or not?

So, that traffic is actually switched between the ports.

> In bridged mode, they don't need a unique pvid, it only complicates
> the implementation. They can have the pvid from the bridge VLAN group.

Meaning rely on the fact that VLAN 1 is programmed automatically? Maybe
just unapply the private VLAN in bridge_join()?

>
>> |[  209.428170] hellcreek ff240000.switch: Apply queued VLANs: port2
>> |[  209.434158] hellcreek ff240000.switch: Apply VLAN: port=2 vid=0 pvid=0 untagged=0
>> |[  209.441649] hellcreek ff240000.switch: Clear queued VLANs: port2
>> |[  209.447920] hellcreek ff240000.switch: Apply queued VLANs: port0
>> |[  209.453910] hellcreek ff240000.switch: Apply VLAN: port=0 vid=0 pvid=0 untagged=0
>> |[  209.461402] hellcreek ff240000.switch: Clear queued VLANs: port0
>> |[  209.467620] hellcreek ff240000.switch: VLAN prepare for port 2
>> |[  209.473476] hellcreek ff240000.switch: VLAN prepare for port 0
>> |[  209.479534] hellcreek ff240000.switch: Add VLANs (1 -- 1) on port 2, untagged, PVID
>> |[  209.487164] hellcreek ff240000.switch: Apply VLAN: port=2 vid=1 pvid=1 untagged=1
>> |[  209.494659] hellcreek ff240000.switch: Add VLANs (1 -- 1) on port 0, untagged, no PVID
>> |[  209.502794] hellcreek ff240000.switch: Apply VLAN: port=0 vid=1 pvid=0 untagged=1
>> |root@tsn:~# bridge vlan show
>
> This is by no means a good indicator for anything. It shows the bridge
> VLAN groups, not the hardware database.
>
>> |port    vlan ids
>> |lan0     1 PVID Egress Untagged
>> |
>> |br0      1 PVID Egress Untagged
>> 
>> ... which looks correct to me. The VLAN 2 is unapplied as expected. Or?
>
> Ok, it gets applied in .port_bridge_join and unapplied in .port_vlan_filtering,
> which is a convoluted way of doing nothing.
>
>> >
>> > And how do you deal with the case where swp1 and swp2 are bridged and
>> > have the VLAN 3 installed via 'bridge vlan', but swp3 isn't bridged?
>> > Will swp1/swp2 communicate with swp3? If yes, that's a problem.
>> 
>> There is no swp3. Currently there are only two ports and either they are
>> bridged or not.
>
> So this answers my question of whether the tunnel port is a user port or
> not, ok.
>
> How about other hardware revisions? Is this going to be a 2-port switch
> forever?

At the moment, yes. It's meant to be used for switched endpoints.
Devices with more ports may come in the future.

> Your solution will indeed work for 2 ports (as long as you
> address the other feedback from v5 w.r.t. declaring the ports as "always
> filtering" and rejecting invalid 8021q uppers, which I don't see
> here),

I've checked that property with ethtool and it's set to the value you
suggested. And yes, the same VLAN on top of single ports will break
separation with the current solution.

> but it will not scale for 3 ports, due to the fact that the bridge can
> install a VLAN on a lan2 port, without knowing that it is in fact the
> private pvid of lan1 or lan0.

Yes, that's also a limitation of the VLAN approach.
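
The collision itself could at least be detected. A rough sketch of such
a check, under assumptions: struct port_private_vid and
vid_is_foreign_private() are hypothetical names, and the table of
private VIDs is whatever the driver reserved per user port. This only
shows the lookup, not how a driver would hook it up or report the
error.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a user port and the VID reserved for it. */
struct port_private_vid {
	int port;
	unsigned short vid;
};

/* Returns true if 'vid' is the private VID of a port other than 'port'. */
static bool vid_is_foreign_private(const struct port_private_vid *tbl,
				   size_t n, int port, unsigned short vid)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (tbl[i].port != port && tbl[i].vid == vid)
			return true;
	}
	return false;
}
```

That doesn't remove the limitation, it only turns silent leakage into a
rejected request.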

>
>> >> +static int __hellcreek_fdb_del(struct hellcreek *hellcreek,
>> >> +			       const struct hellcreek_fdb_entry *entry)
>> >> +{
>> >> +	dev_dbg(hellcreek->dev, "Delete FDB entry: MAC=%pM!\n", entry->mac);
>> >> +
>> >
>> > Do these dev_dbg statements bring much value at all, even to you?
>> 
>> Yes, they do. See the log snippet above.
>> 
>
> If you want to dump the hardware database you can look at the devlink
> regions that Andrew added very recently. Much more reliable than
> following the order of operations in the log.

I saw the patches and they're really useful. However, I won't implement
any new features in this driver until that port separation problem is
sorted out.

>
>> >> +static const struct hellcreek_platform_data de1soc_r1_pdata = {
>> >> +	.num_ports	 = 4,
>> >> +	.is_100_mbits	 = 1,
>> >> +	.qbv_support	 = 1,
>> >> +	.qbv_on_cpu_port = 1,
>> >
>> > Why does this matter?
>> 
>> Because Qbv on the CPU port is a feature and not all switch variants
>> have that. It will matter as soon as TAPRIO is implemented.
>
> How do you plan to install a tc-taprio qdisc on the CPU port?

That's an issue to be sorted out.

Thanks,
Kurt