* Is it ok for switch TCAMs to depend on the bridge state?
@ 2021-11-02 11:03 Vladimir Oltean
  2021-11-03 16:38 ` Jakub Kicinski
  2021-11-07 11:50 ` Ido Schimmel
  0 siblings, 2 replies; 6+ messages in thread
From: Vladimir Oltean @ 2021-11-02 11:03 UTC (permalink / raw)
  To: netdev
  Cc: Jakub Kicinski, Nikolay Aleksandrov, Jiri Pirko, Ido Schimmel,
	Andrew Lunn, Florian Fainelli

Hi,

I've been reviewing a patch set which offloads to hardware some
tc-flower filters with some TSN-specific actions (ingress policing).
The keys of those offloaded tc-flower filters are not arbitrary, they
are the destination MAC address and VLAN ID of the frames, which is
relevant because these TSN policers are actually coupled with the
bridging service in hardware. So the premise of that patch set was that
the user would first need to add static FDB entries to the bridge with
the same key as the tc-flower key, before the tc-flower filters would be
accepted for offloading.

Naturally, with the current bridge/switchdev design where drivers cannot
actually NACK the removal of a bridge FDB entry, that is quite fragile,
because if the user would then proceed to delete the FDB entry, the
tc-flower filter would stop working and the user would wonder why.
So that patch set has stalled, currently.
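
To make the dependency concrete, here is a minimal userspace sketch in C.
It is not kernel code and not the code from the patch set; every name in
it (l2_key, fdb_table, flower_offload_allowed, ...) is hypothetical. It
only models the rule described above: a filter is accepted when a static
FDB entry with the same {DMAC, VID} key exists, while deleting that
entry can never be refused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_FDB 16

/* Hypothetical key shared by a static FDB entry and a tc-flower filter. */
struct l2_key {
	uint8_t dmac[6];
	uint16_t vid;
};

/* Toy stand-in for the bridge's static FDB. */
struct fdb_table {
	struct l2_key entries[MAX_FDB];
	int count;
};

static bool key_equal(const struct l2_key *a, const struct l2_key *b)
{
	return a->vid == b->vid && memcmp(a->dmac, b->dmac, 6) == 0;
}

static bool fdb_contains(const struct fdb_table *fdb, const struct l2_key *key)
{
	for (int i = 0; i < fdb->count; i++)
		if (key_equal(&fdb->entries[i], key))
			return true;
	return false;
}

/* Add a static entry; returns false when the table is full. */
static bool fdb_add(struct fdb_table *fdb, const struct l2_key *key)
{
	assert(key->vid < 4096);
	if (fdb->count == MAX_FDB)
		return false;
	fdb->entries[fdb->count++] = *key;
	return true;
}

/* Deletion always succeeds: the driver has no way to NACK it. */
static void fdb_del(struct fdb_table *fdb, const struct l2_key *key)
{
	for (int i = 0; i < fdb->count; i++) {
		if (key_equal(&fdb->entries[i], key)) {
			/* Overwrite the hole with the last entry. */
			fdb->entries[i] = fdb->entries[--fdb->count];
			return;
		}
	}
}

/* The filter is offloadable only while a matching FDB entry exists. */
static bool flower_offload_allowed(const struct fdb_table *fdb,
				   const struct l2_key *filter_key)
{
	return fdb_contains(fdb, filter_key);
}
```

Calling fdb_del() on the matching key after the filter was accepted makes
flower_offload_allowed() return false again, with nothing telling the
user that the already installed filter has lost its precondition.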

But I was thinking, the above case is not the only one where features
offloaded through tc-flower might depend on the state of the bridging
service (and therefore on stuff configured in the Linux bridge and
offloaded through switchdev). Another example I can find is where there
are some tc-flower filters that involve VLANs (either in the key or in
the action portion). Generally, switches have the notion of a classified
VLAN, aka the VID used for internal processing of a packet. This may or
may not be equal to the VID from the 802.1Q header. For example, if a
port is VLAN-unaware or standalone, the classified VLAN is pretty much
guaranteed to not be equal to the VID from the 802.1Q header, if that
exists at all.
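
As a rough illustration of what "classified VLAN" means here, the
following userspace C sketch computes it from a toy port state. The
structures and the internal_vid field are assumptions made for the
example (real hardware differs in how it picks the VID for VLAN-unaware
ports); the PVID fallback for untagged and priority-tagged frames
follows usual 802.1Q behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port state; field names are illustrative only. */
struct port_state {
	bool vlan_aware;       /* under a bridge with vlan_filtering=1 */
	uint16_t pvid;         /* VID for untagged traffic when VLAN-aware */
	uint16_t internal_vid; /* assumed driver-private VID otherwise */
};

/* Hypothetical view of the 802.1Q header of a received frame. */
struct frame {
	bool tagged;
	uint16_t tag_vid;
};

static uint16_t classified_vid(const struct port_state *p,
			       const struct frame *f)
{
	assert(p->pvid < 4096 && p->internal_vid < 4096);

	/* VLAN-unaware or standalone: the 802.1Q header is ignored. */
	if (!p->vlan_aware)
		return p->internal_vid;

	/* VLAN-aware and tagged with a non-zero VID: use the header. */
	if (f->tagged && f->tag_vid != 0)
		return f->tag_vid;

	/* Untagged or priority-tagged: fall back to the port's PVID. */
	return p->pvid;
}
```

With this model, a frame tagged with VID 100 is classified to 100 on a
VLAN-aware port but to internal_vid on a VLAN-unaware one, which is why
a TCAM action keyed on the classified VLAN changes meaning when the
bridge configuration changes.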

Also, I don't know whether this is the case in general or not, but the
hardware I'm working with has TCAM actions that operate on the
classified VLAN, not on the VID from the 802.1Q header. Therefore, this
again is tightly coupled with the bridging service.

What is currently done to support things like VLAN rewriting using the
tc-vlan action is to require vlan_filtering to be set to 1 at the time
when the tc-flower rule is added, and then dynamic changes to the
vlan_filtering property are denied.

But the driver still cannot veto the removal of the port from the
bridge, or the deletion of the bridge itself. So this is still very
fragile, and there are cases where we could end up with broken
offloading for non-obvious reasons.
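
The gap can be shown with a small userspace model in C. This is only a
sketch: struct vlan_port, the function names and the chosen error codes
are hypothetical and do not correspond to any driver. It accepts a rule
only with vlan_filtering=1, refuses later changes to vlan_filtering, and
still ends up with a broken rule once the port leaves the bridge.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-port bookkeeping kept by a driver. */
struct vlan_port {
	bool bridged;
	bool vlan_filtering;
	int vlan_rules;   /* offloaded rules relying on the classified VLAN */
	int broken_rules; /* rules still installed without their precondition */
};

/* Accept a VLAN-rewriting rule only on a VLAN-aware bridge port. */
static int rule_add(struct vlan_port *p)
{
	assert(p->vlan_rules >= 0);
	if (!p->bridged || !p->vlan_filtering)
		return -EOPNOTSUPP;
	p->vlan_rules++;
	return 0;
}

/* Changing vlan_filtering is denied while dependent rules exist. */
static int set_vlan_filtering(struct vlan_port *p, bool on)
{
	if (p->vlan_rules > 0 && on != p->vlan_filtering)
		return -EBUSY;
	p->vlan_filtering = on;
	return 0;
}

/* Leaving the bridge cannot be vetoed, so it returns nothing. */
static void bridge_leave(struct vlan_port *p)
{
	p->bridged = false;
	p->vlan_filtering = false;
	p->broken_rules = p->vlan_rules;
}
```

The asymmetry is visible in the signatures: set_vlan_filtering() can
return an error, whereas bridge_leave() has no way to report one and can
only record that the rules are now broken.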

I don't have a clear picture in my mind about what is wrong. Viewed from
a distance, one might argue that the TCAM should be completely separate
from the bridging service, but I'm not sure that this can be achieved in
the aforementioned case of VLAN rewriting on ingress and on egress: it
would seem more natural for these features to operate on the classified
VLAN (which, again, depends on VLAN awareness being turned on).
Alternatively, one might argue that the deletion of a bridge interface
should be vetoed, and so should the removal of a port from a bridge.
But that is quite complicated, and doesn't answer questions such as
"what should you do when you reboot".
Alternatively, one might say that letting the user remove TCAM
dependencies from the bridging service is fine, but the driver should
have a way to also unoffload the tc-flower keys as long as the
requirements are not satisfied. I think this is also difficult to
implement.

I haven't copied any of the directly interested parties because I would
like to hear some neutral opinions first. Thanks for reading.


* Re: Is it ok for switch TCAMs to depend on the bridge state?
  2021-11-02 11:03 Is it ok for switch TCAMs to depend on the bridge state? Vladimir Oltean
@ 2021-11-03 16:38 ` Jakub Kicinski
  2021-11-07 11:50 ` Ido Schimmel
  1 sibling, 0 replies; 6+ messages in thread
From: Jakub Kicinski @ 2021-11-03 16:38 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Nikolay Aleksandrov, Jiri Pirko, Ido Schimmel,
	Andrew Lunn, Florian Fainelli

On Tue, 2 Nov 2021 11:03:53 +0000 Vladimir Oltean wrote:
> I don't have a clear picture in my mind about what is wrong. An airplane
> viewer might argue that the TCAM should be completely separate from the
> bridging service, but I'm not completely sure that this can be achieved
> in the aforementioned case with VLAN rewriting on ingress and on egress,
> it would seem more natural for these features to operate on the
> classified VLAN (which again, depends on VLAN awareness being turned on).
> Alternatively, one might argue that the deletion of a bridge interface
> should be vetoed, and so should the removal of a port from a bridge.
> But that is quite complicated, and doesn't answer questions such as
> "what should you do when you reboot".
> Alternatively, one might say that letting the user remove TCAM
> dependencies from the bridging service is fine, but the driver should
> have a way to also unoffload the tc-flower keys as long as the
> requirements are not satisfied. I think this is also difficult to
> implement.

Some random thoughts which may be completely nonsensical.

I thought we did have a way of indicating that flower rules are no
longer offloaded because tunnel rules need neigh to be resolved,
but looking at the code it seems we only report some semblance of
offload status as part of stats.

For port removal maybe we can add a callback just for vetoing in case
the operation originates from user space?


* Re: Is it ok for switch TCAMs to depend on the bridge state?
  2021-11-02 11:03 Is it ok for switch TCAMs to depend on the bridge state? Vladimir Oltean
  2021-11-03 16:38 ` Jakub Kicinski
@ 2021-11-07 11:50 ` Ido Schimmel
  2021-11-11 11:52   ` Vladimir Oltean
  1 sibling, 1 reply; 6+ messages in thread
From: Ido Schimmel @ 2021-11-07 11:50 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: netdev, Jakub Kicinski, Nikolay Aleksandrov, Jiri Pirko,
	Ido Schimmel, Andrew Lunn, Florian Fainelli

On Tue, Nov 02, 2021 at 11:03:53AM +0000, Vladimir Oltean wrote:
> I've been reviewing a patch set which offloads to hardware some
> tc-flower filters with some TSN-specific actions (ingress policing).
> The keys of those offloaded tc-flower filters are not arbitrary, they
> are the destination MAC address and VLAN ID of the frames, which is
> relevant because these TSN policers are actually coupled with the
> bridging service in hardware. So the premise of that patch set was that
> the user would first need to add static FDB entries to the bridge with
> the same key as the tc-flower key, before the tc-flower filters would be
> accepted for offloading.

[...]

> I don't have a clear picture in my mind about what is wrong. An airplane
> viewer might argue that the TCAM should be completely separate from the
> bridging service, but I'm not completely sure that this can be achieved
> in the aforementioned case with VLAN rewriting on ingress and on egress,
> it would seem more natural for these features to operate on the
> classified VLAN (which again, depends on VLAN awareness being turned on).
> Alternatively, one might argue that the deletion of a bridge interface
> should be vetoed, and so should the removal of a port from a bridge.
> But that is quite complicated, and doesn't answer questions such as
> "what should you do when you reboot".
> Alternatively, one might say that letting the user remove TCAM
> dependencies from the bridging service is fine, but the driver should
> have a way to also unoffload the tc-flower keys as long as the
> requirements are not satisfied. I think this is also difficult to
> implement.

Regarding the question in the subject ("Is it ok for switch TCAMs to
depend on the bridge state?"), I believe the answer is yes because there
is no way to avoid it and effectively it is already happening.

To add to your examples and Jakub's, this is also how "ERSPAN" works in
mlxsw. User space installs some flower filter with a mirror action
towards a gretap netdev, but the HW does not do the forwarding towards
the destination. Instead, it relies on the SW to tell it which headers
(i.e., Eth, IP, GRE) to put on the mirrored packet and tell it from
which port the packet should egress. When we have a bridge in the
forwarding path, it means that the offload state of the filter is
affected by FDB updates. As was discussed in the past, we are missing
the ability to notify user space when the offload state of the filter
changes.

Regarding the particular example of TSN policers: I'm not familiar with
the subject, but from your mail I get the impression that the dependency
between them and the bridge is a quirk of the hardware you are working
with and that in general the two are not related. If so, in order to
make the user experience somewhat better, you might consider vetoing the
addition of the flower filter or at least emit a warning via extack when
the port is not enslaved to a bridge. Regarding the FDB entries, instead
of requiring user space to understand that it needs to install those
entries in order to make the filter work, you can notify them from the
driver to the bridge via SWITCHDEV_FDB_ADD_TO_BRIDGE.
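
The veto part could look roughly like the userspace C sketch below.
struct extack here is a hypothetical stand-in for the kernel's netlink
extended ack (a real driver would typically set the message with a
helper such as NL_SET_ERR_MSG_MOD), and policer_filter_add, struct
policer_port and the error code are assumptions for illustration. The
SWITCHDEV_FDB_ADD_TO_BRIDGE notification is not modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's netlink extended ack. */
struct extack {
	const char *msg;
};

/* Hypothetical port state: only bridge membership matters here. */
struct policer_port {
	bool bridged;
};

/* Refuse the filter, with an explanation, on a standalone port. */
static int policer_filter_add(const struct policer_port *p,
			      struct extack *extack)
{
	assert(p != NULL);
	if (!p->bridged) {
		if (extack)
			extack->msg = "Port must be under a bridge to offload this policer";
		return -EOPNOTSUPP;
	}
	return 0;
}
```

The point of the message is that user space learns why the filter was
rejected instead of seeing a bare error code.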


* Re: Is it ok for switch TCAMs to depend on the bridge state?
  2021-11-07 11:50 ` Ido Schimmel
@ 2021-11-11 11:52   ` Vladimir Oltean
  2021-11-11 13:46     ` Ido Schimmel
  0 siblings, 1 reply; 6+ messages in thread
From: Vladimir Oltean @ 2021-11-11 11:52 UTC (permalink / raw)
  To: Ido Schimmel, Jakub Kicinski
  Cc: netdev, Nikolay Aleksandrov, Jiri Pirko, Ido Schimmel,
	Andrew Lunn, Florian Fainelli

On Sun, Nov 07, 2021 at 01:50:36PM +0200, Ido Schimmel wrote:
> On Tue, Nov 02, 2021 at 11:03:53AM +0000, Vladimir Oltean wrote:
> > I've been reviewing a patch set which offloads to hardware some
> > tc-flower filters with some TSN-specific actions (ingress policing).
> > The keys of those offloaded tc-flower filters are not arbitrary, they
> > are the destination MAC address and VLAN ID of the frames, which is
> > relevant because these TSN policers are actually coupled with the
> > bridging service in hardware. So the premise of that patch set was that
> > the user would first need to add static FDB entries to the bridge with
> > the same key as the tc-flower key, before the tc-flower filters would be
> > accepted for offloading.
> 
> [...]
> 
> > I don't have a clear picture in my mind about what is wrong. An airplane
> > viewer might argue that the TCAM should be completely separate from the
> > bridging service, but I'm not completely sure that this can be achieved
> > in the aforementioned case with VLAN rewriting on ingress and on egress,
> > it would seem more natural for these features to operate on the
> > classified VLAN (which again, depends on VLAN awareness being turned on).
> > Alternatively, one might argue that the deletion of a bridge interface
> > should be vetoed, and so should the removal of a port from a bridge.
> > But that is quite complicated, and doesn't answer questions such as
> > "what should you do when you reboot".
> > Alternatively, one might say that letting the user remove TCAM
> > dependencies from the bridging service is fine, but the driver should
> > have a way to also unoffload the tc-flower keys as long as the
> > requirements are not satisfied. I think this is also difficult to
> > implement.
> 
> Regarding the question in the subject ("Is it ok for switch TCAMs to
> depend on the bridge state?"), I believe the answer is yes because there
> is no way to avoid it and effectively it is already happening.
> 
> To add to your examples and Jakub's, this is also how "ERSPAN" works in
> mlxsw. User space installs some flower filter with a mirror action
> towards a gretap netdev, but the HW does not do the forwarding towards
> the destination.

I don't understand this part. By "forwarding" you mean "mirroring" here,
and the "destination" is the gretap interface which is offloaded?

> Instead, it relies on the SW to tell it which headers
> (i.e., Eth, IP, GRE) to put on the mirrored packet and tell it from
> which port the packet should egress. When we have a bridge in the
> forwarding path, it means that the offload state of the filter is
> affected by FDB updates.

Here you're saying that the gretap interface has as its local IP address
the IP address of a bridge interface that is offloaded by mlxsw, and
that the precise egress port is determined by the bridge's FDB? But
since you don't support bridging with foreign interfaces, why would the
mirred rule ever become unoffloaded?

I'm afraid that I don't understand this case very well.

> As was discussed in the past, we are missing
> the ability to notify user space when the offload state of the filter
> changes.
> 
> Regarding the particular example of TSN policers. I'm not familiar with
> the subject, but from your mail I get the impression that the dependency
> between them and the bridge is a quirk of the hardware you are working
> with and that in general the two are not related. If so, in order to
> make the user experience somewhat better, you might consider vetoing the
> addition of the flower filter or at least emit a warning via extack when
> the port is not enslaved to a bridge. Regarding the FDB entries, instead
> of requiring user space to understand that it needs to install those
> entries in order to make the filter work, you can notify them from the
> driver to the bridge via SWITCHDEV_FDB_ADD_TO_BRIDGE.


* Re: Is it ok for switch TCAMs to depend on the bridge state?
  2021-11-11 11:52   ` Vladimir Oltean
@ 2021-11-11 13:46     ` Ido Schimmel
  2021-11-11 14:12       ` Vladimir Oltean
  0 siblings, 1 reply; 6+ messages in thread
From: Ido Schimmel @ 2021-11-11 13:46 UTC (permalink / raw)
  To: Vladimir Oltean
  Cc: Jakub Kicinski, netdev, Nikolay Aleksandrov, Jiri Pirko,
	Ido Schimmel, Andrew Lunn, Florian Fainelli

On Thu, Nov 11, 2021 at 11:52:55AM +0000, Vladimir Oltean wrote:
> On Sun, Nov 07, 2021 at 01:50:36PM +0200, Ido Schimmel wrote:
> > On Tue, Nov 02, 2021 at 11:03:53AM +0000, Vladimir Oltean wrote:
> > > I've been reviewing a patch set which offloads to hardware some
> > > tc-flower filters with some TSN-specific actions (ingress policing).
> > > The keys of those offloaded tc-flower filters are not arbitrary, they
> > > are the destination MAC address and VLAN ID of the frames, which is
> > > relevant because these TSN policers are actually coupled with the
> > > bridging service in hardware. So the premise of that patch set was that
> > > the user would first need to add static FDB entries to the bridge with
> > > the same key as the tc-flower key, before the tc-flower filters would be
> > > accepted for offloading.
> > 
> > [...]
> > 
> > > I don't have a clear picture in my mind about what is wrong. An airplane
> > > viewer might argue that the TCAM should be completely separate from the
> > > bridging service, but I'm not completely sure that this can be achieved
> > > in the aforementioned case with VLAN rewriting on ingress and on egress,
> > > it would seem more natural for these features to operate on the
> > > classified VLAN (which again, depends on VLAN awareness being turned on).
> > > Alternatively, one might argue that the deletion of a bridge interface
> > > should be vetoed, and so should the removal of a port from a bridge.
> > > But that is quite complicated, and doesn't answer questions such as
> > > "what should you do when you reboot".
> > > Alternatively, one might say that letting the user remove TCAM
> > > dependencies from the bridging service is fine, but the driver should
> > > have a way to also unoffload the tc-flower keys as long as the
> > > requirements are not satisfied. I think this is also difficult to
> > > implement.
> > 
> > Regarding the question in the subject ("Is it ok for switch TCAMs to
> > depend on the bridge state?"), I believe the answer is yes because there
> > is no way to avoid it and effectively it is already happening.
> > 
> > To add to your examples and Jakub's, this is also how "ERSPAN" works in
> > mlxsw. User space installs some flower filter with a mirror action
> > towards a gretap netdev, but the HW does not do the forwarding towards
> > the destination.
> 
> I don't understand this part. By "forwarding" you mean "mirroring" here,

Yes

> and the "destination" is the gretap interface which is offloaded?

No. See more below

> 
> > Instead, it relies on the SW to tell it which headers
> > (i.e., Eth, IP, GRE) to put on the mirrored packet and tell it from
> > which port the packet should egress. When we have a bridge in the
> > forwarding path, it means that the offload state of the filter is
> > affected by FDB updates.
> 
> Here you're saying that the gretap interface whose local IP address is
> the IP address of a bridge interface that is offloaded by mlxsw, and the
> precise egress port is determined by the bridge's FDB? But since you
> don't support bridging with foreign interfaces, why would the mirred
> rule ever become unoffloaded?
> 
> I'm afraid that I don't understand this case very well.

In software, when you mirror to a gretap via act_mirred, the packet is
cloned and transmitted through the gretap netdev. This netdev will then
put a GRE header on the packet, specifying that the next protocol is
Ethernet. It will then put an IP header on the packet with the
configured source and destination IPs and route the packet towards its
destination.

It is possible that routing will determine that the encapsulated packet
should be transmitted via a bridge. In that case, the packet will also
undergo an FDB lookup in the bridge before the egress port is determined.

In hardware, we don't have a representation for the gretap device.
Instead, the hardware is kept very simple and requires the driver to
tell it:

a. Via which port to mirror the packet
b. Which headers to encapsulate the packet with

So the "offload-ability" of the filter is conditioned on software being
able to determine the correct path, which can change with time following
FDB/routes/etc updates.
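
A toy model of that resolution step, as a userspace C sketch. This is
not mlxsw code; the types and names (route_result, bridge_fdb,
mirror_resolve_port) are hypothetical and the route lookup is reduced to
a precomputed result. It only shows how the egress port, and with it the
offload state, follows from the route and from the bridge FDB.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NO_PORT (-1)
#define MAX_FDB 8

enum dev_kind { DEV_NONE, DEV_PORT, DEV_BRIDGE };

/* Hypothetical outcome of routing the encapsulated packet. */
struct route_result {
	enum dev_kind kind;
	int port;               /* valid when kind == DEV_PORT */
	uint8_t nexthop_mac[6]; /* FDB key when kind == DEV_BRIDGE */
};

struct fdb_entry {
	uint8_t mac[6];
	int port;
};

struct bridge_fdb {
	struct fdb_entry entries[MAX_FDB];
	int count;
};

static int fdb_lookup(const struct bridge_fdb *fdb, const uint8_t mac[6])
{
	assert(fdb->count >= 0 && fdb->count <= MAX_FDB);
	for (int i = 0; i < fdb->count; i++)
		if (memcmp(fdb->entries[i].mac, mac, 6) == 0)
			return fdb->entries[i].port;
	return NO_PORT;
}

/* Egress port for the mirror, or NO_PORT if it cannot be offloaded. */
static int mirror_resolve_port(const struct route_result *rt,
			       const struct bridge_fdb *fdb)
{
	switch (rt->kind) {
	case DEV_PORT:
		return rt->port;
	case DEV_BRIDGE:
		/* The bridge's FDB decides which port is used. */
		return fdb_lookup(fdb, rt->nexthop_mac);
	default:
		return NO_PORT;
	}
}
```

Re-running mirror_resolve_port() after every FDB or route change is what
makes the offload state vary over time: the same filter resolves to a
port while the FDB entry exists and to NO_PORT once it is gone.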


* Re: Is it ok for switch TCAMs to depend on the bridge state?
  2021-11-11 13:46     ` Ido Schimmel
@ 2021-11-11 14:12       ` Vladimir Oltean
  0 siblings, 0 replies; 6+ messages in thread
From: Vladimir Oltean @ 2021-11-11 14:12 UTC (permalink / raw)
  To: Ido Schimmel
  Cc: Jakub Kicinski, netdev, Nikolay Aleksandrov, Jiri Pirko,
	Ido Schimmel, Andrew Lunn, Florian Fainelli

On Thu, Nov 11, 2021 at 03:46:35PM +0200, Ido Schimmel wrote:
> On Thu, Nov 11, 2021 at 11:52:55AM +0000, Vladimir Oltean wrote:
> > On Sun, Nov 07, 2021 at 01:50:36PM +0200, Ido Schimmel wrote:
> > > On Tue, Nov 02, 2021 at 11:03:53AM +0000, Vladimir Oltean wrote:
> > > > I've been reviewing a patch set which offloads to hardware some
> > > > tc-flower filters with some TSN-specific actions (ingress policing).
> > > > The keys of those offloaded tc-flower filters are not arbitrary, they
> > > > are the destination MAC address and VLAN ID of the frames, which is
> > > > relevant because these TSN policers are actually coupled with the
> > > > bridging service in hardware. So the premise of that patch set was that
> > > > the user would first need to add static FDB entries to the bridge with
> > > > the same key as the tc-flower key, before the tc-flower filters would be
> > > > accepted for offloading.
> > > 
> > > [...]
> > > 
> > > > I don't have a clear picture in my mind about what is wrong. An airplane
> > > > viewer might argue that the TCAM should be completely separate from the
> > > > bridging service, but I'm not completely sure that this can be achieved
> > > > in the aforementioned case with VLAN rewriting on ingress and on egress,
> > > > it would seem more natural for these features to operate on the
> > > > classified VLAN (which again, depends on VLAN awareness being turned on).
> > > > Alternatively, one might argue that the deletion of a bridge interface
> > > > should be vetoed, and so should the removal of a port from a bridge.
> > > > But that is quite complicated, and doesn't answer questions such as
> > > > "what should you do when you reboot".
> > > > Alternatively, one might say that letting the user remove TCAM
> > > > dependencies from the bridging service is fine, but the driver should
> > > > have a way to also unoffload the tc-flower keys as long as the
> > > > requirements are not satisfied. I think this is also difficult to
> > > > implement.
> > > 
> > > Regarding the question in the subject ("Is it ok for switch TCAMs to
> > > depend on the bridge state?"), I believe the answer is yes because there
> > > is no way to avoid it and effectively it is already happening.
> > > 
> > > To add to your examples and Jakub's, this is also how "ERSPAN" works in
> > > mlxsw. User space installs some flower filter with a mirror action
> > > towards a gretap netdev, but the HW does not do the forwarding towards
> > > the destination.
> > 
> > I don't understand this part. By "forwarding" you mean "mirroring" here,
> 
> Yes
> 
> > and the "destination" is the gretap interface which is offloaded?
> 
> No. See more below
> 
> > 
> > > Instead, it relies on the SW to tell it which headers
> > > (i.e., Eth, IP, GRE) to put on the mirrored packet and tell it from
> > > which port the packet should egress. When we have a bridge in the
> > > forwarding path, it means that the offload state of the filter is
> > > affected by FDB updates.
> > 
> > Here you're saying that the gretap interface whose local IP address is
> > the IP address of a bridge interface that is offloaded by mlxsw, and the
> > precise egress port is determined by the bridge's FDB? But since you
> > don't support bridging with foreign interfaces, why would the mirred
> > rule ever become unoffloaded?
> > 
> > I'm afraid that I don't understand this case very well.
> 
> In software, when you mirror to a gretap via act_mirred, the packet is
> cloned and transmitted through the gretap netdev. This netdev will then
> put a GRE header on the packet, specifying that the next protocol is
> Ethernet. It will then put an IP header on the packet with the
> configured source and destination IPs and route the packet towards its
> destination.
> 
> It is possible that routing will determine that the encapsulated packet
> should be transmitted via a bridge. In which case, the packet will also
> do an FDB lookup in the bridge before determining the egress port.
> 
> In hardware, we don't have a representation for the gretap device.
> Instead, the hardware is kept very simple and requires the driver to
> tell it:
> 
> a. Via which port to mirror the packet
> b. Which headers to encapsulate the packet with
> 
> So the "offload-ability" of the filter is conditioned on software being
> able to determine the correct path, which can change with time following
> FDB/routes/etc updates.

Understood now. So it depends upon a lot more things than just the
bridge state, also IP routes. I thought you were giving an example
related strictly to the bridge. Now it makes more sense. Thanks.

