netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] sfc: remove udp_tnl_has_port
@ 2020-06-30 22:50 Jakub Kicinski
  2020-07-01 14:32 ` Edward Cree
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2020-06-30 22:50 UTC (permalink / raw)
  To: davem; +Cc: netdev, ecree, mhabets, linux-net-drivers, Jakub Kicinski

Nothing seems to have ever been calling this.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
 drivers/net/ethernet/sfc/ef10.c       | 23 -----------------------
 drivers/net/ethernet/sfc/net_driver.h |  2 --
 2 files changed, 25 deletions(-)

diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 5faf2f0e4d62..bd4f5eb5b098 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -3907,28 +3907,6 @@ static int efx_ef10_udp_tnl_add_port(struct efx_nic *efx,
 	return rc;
 }
 
-/* Called under the TX lock with the TX queue running, hence no-one can be
- * in the middle of updating the UDP tunnels table.  However, they could
- * have tried and failed the MCDI, in which case they'll have set the dirty
- * flag before dropping their locks.
- */
-static bool efx_ef10_udp_tnl_has_port(struct efx_nic *efx, __be16 port)
-{
-	struct efx_ef10_nic_data *nic_data = efx->nic_data;
-
-	if (!(nic_data->datapath_caps &
-	      (1 << MC_CMD_GET_CAPABILITIES_OUT_VXLAN_NVGRE_LBN)))
-		return false;
-
-	if (nic_data->udp_tunnels_dirty)
-		/* SW table may not match HW state, so just assume we can't
-		 * use any UDP tunnel offloads.
-		 */
-		return false;
-
-	return __efx_ef10_udp_tnl_lookup_port(efx, port) != NULL;
-}
-
 static int efx_ef10_udp_tnl_del_port(struct efx_nic *efx,
 				     struct efx_udp_tunnel tnl)
 {
@@ -4216,7 +4194,6 @@ const struct efx_nic_type efx_hunt_a0_nic_type = {
 	.vlan_rx_kill_vid = efx_ef10_vlan_rx_kill_vid,
 	.udp_tnl_push_ports = efx_ef10_udp_tnl_push_ports,
 	.udp_tnl_add_port = efx_ef10_udp_tnl_add_port,
-	.udp_tnl_has_port = efx_ef10_udp_tnl_has_port,
 	.udp_tnl_del_port = efx_ef10_udp_tnl_del_port,
 #ifdef CONFIG_SFC_SRIOV
 	.sriov_configure = efx_ef10_sriov_configure,
diff --git a/drivers/net/ethernet/sfc/net_driver.h b/drivers/net/ethernet/sfc/net_driver.h
index e0b84b2e3bd2..69d58f2b638c 100644
--- a/drivers/net/ethernet/sfc/net_driver.h
+++ b/drivers/net/ethernet/sfc/net_driver.h
@@ -1298,7 +1298,6 @@ struct efx_udp_tunnel {
  *	If %NULL, then device does not support any TSO version.
  * @udp_tnl_push_ports: Push the list of UDP tunnel ports to the NIC if required.
  * @udp_tnl_add_port: Add a UDP tunnel port
- * @udp_tnl_has_port: Check if a port has been added as UDP tunnel
  * @udp_tnl_del_port: Remove a UDP tunnel port
  * @print_additional_fwver: Dump NIC-specific additional FW version info
  * @revision: Hardware architecture revision
@@ -1472,7 +1471,6 @@ struct efx_nic_type {
 	u32 (*tso_versions)(struct efx_nic *efx);
 	int (*udp_tnl_push_ports)(struct efx_nic *efx);
 	int (*udp_tnl_add_port)(struct efx_nic *efx, struct efx_udp_tunnel tnl);
-	bool (*udp_tnl_has_port)(struct efx_nic *efx, __be16 port);
 	int (*udp_tnl_del_port)(struct efx_nic *efx, struct efx_udp_tunnel tnl);
 	size_t (*print_additional_fwver)(struct efx_nic *efx, char *buf,
 					 size_t len);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] sfc: remove udp_tnl_has_port
  2020-06-30 22:50 [PATCH net-next] sfc: remove udp_tnl_has_port Jakub Kicinski
@ 2020-07-01 14:32 ` Edward Cree
  2020-07-01 18:43   ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Edward Cree @ 2020-07-01 14:32 UTC (permalink / raw)
  To: Jakub Kicinski, davem; +Cc: netdev, mhabets, linux-net-drivers

On 30/06/2020 23:50, Jakub Kicinski wrote:
> Nothing seems to have ever been calling this.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
That was intended to be used by encap offloads (TX csum and TSO), which
 we only recently realised we hadn't upstreamed the rest of; the
 udp_tnl_has_port method would be called from our ndo_features_check().
I'll try to get to upstreaming that support after ef100 is in, hopefully
 within this cycle, but if you don't want this dead code lying around in
 the meantime then have an
Acked-by: Edward Cree <ecree@solarflare.com>
 and I can revert it when I add the code that calls it.
(And don't worry, ef100 doesn't use ugly port-based offloads; it does
 proper CHECKSUM_PARTIAL and GSO_PARTIAL, so it won't have this stuff.)

-ed

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] sfc: remove udp_tnl_has_port
  2020-07-01 14:32 ` Edward Cree
@ 2020-07-01 18:43   ` Jakub Kicinski
  2020-07-01 22:02     ` Edward Cree
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Kicinski @ 2020-07-01 18:43 UTC (permalink / raw)
  To: Edward Cree; +Cc: davem, netdev, mhabets, linux-net-drivers

On Wed, 1 Jul 2020 15:32:03 +0100 Edward Cree wrote:
> On 30/06/2020 23:50, Jakub Kicinski wrote:
> > Nothing seems to have ever been calling this.
> >
> > Signed-off-by: Jakub Kicinski <kuba@kernel.org>  
> That was intended to be used by encap offloads (TX csum and TSO), which
>  we only recently realised we hadn't upstreamed the rest of; the
>  udp_tnl_has_port method would be called from our ndo_features_check().
> I'll try to get to upstreaming that support after ef100 is in, hopefully
>  within this cycle, but if you don't want this dead code lying around in
>  the meantime then have an
> Acked-by: Edward Cree <ecree@solarflare.com>
>  and I can revert it when I add the code that calls it.
> (And don't worry, ef100 doesn't use ugly port-based offloads; it does
>  proper CHECKSUM_PARTIAL and GSO_PARTIAL, so it won't have this stuff.)

There's a number of drivers which try to match the UDP ports. That
seems fragile to me. Is it actually required for HW to operate
correctly? 

Aren't the ports per ns in the kernel? There's no guarantee that some
other netns won't send a TSO skb and whatever other UDP encap.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] sfc: remove udp_tnl_has_port
  2020-07-01 18:43   ` Jakub Kicinski
@ 2020-07-01 22:02     ` Edward Cree
  2020-07-01 23:11       ` Jakub Kicinski
  0 siblings, 1 reply; 5+ messages in thread
From: Edward Cree @ 2020-07-01 22:02 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: davem, netdev, mhabets, linux-net-drivers

On 01/07/2020 19:43, Jakub Kicinski wrote:
> There's a number of drivers which try to match the UDP ports. That
> seems fragile to me. Is it actually required for HW to operate
> correctly? 
For EF10 hardware, yes, because the hardware parses the packet headers
 to find for itself the offsets at which to make the various TSO edits;
 thus its parser needs to know which UDP ports correspond to VXLAN or
 GENEVE.  If a GSO skb arrives at the driver with skb->encapsulation
 set but on a UDP port that's not known to the hardware, the driver
 will have to reject it in ndo_features_check() or 'manually' fall back
 to software segmentation from the transmit path.
EF10 also makes use of encap parsing on receive, for CHECKSUM_UNNECESSARY
 offload (with CSUM_LEVEL) as well as RSS and filtering on inner headers
 (although there is currently no driver support for inner-frame RXNFC, as
 ethtool's API doesn't cover it).
> Aren't the ports per ns in the kernel? There's no guarantee that some
> other netns won't send a TSO skb and whatever other UDP encap.
That is indeed one of the flaws with port-based tunnel offloads; in
 theory the UDP port's scope is only the 3-tuple of the socket used by
 the tunnel device, so never mind netns, it would be logically valid to
 use the same port for different encap protocols on different IP
 addresses on the same network interface.
AFAICT udp_tunnel_notify_add_rx_port() gets a netns from the sock and
 then calls the ndo for every netdev in that ns.  So in a setup like
 that, the ndo would get called twice for the same port (without any IP
 address information other than sa_family being passed to the driver),
 the driver would ignore the second one (print a netif_dbg and return
 EEXIST, which the caller ignores), and any TSO skbs trying to use the
 second one would be parsed by the hardware with the wrong encap type
 and probably go out garbled on the wire.  I think at the time everyone
 took the position that "this is a really unlikely setup and if anybody
 really wants to do that they'll just have to turn off encap TSO".

So ndo_udp_tunnel_add is a fundamentally broken interface that people
 shouldn't design new hardware to support but it's close enough that it
 seems reasonable to use it to get _some_ encap TSO mileage out of the
 port-based hardware that already exists.  Agree/disagree/other?

-ed

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net-next] sfc: remove udp_tnl_has_port
  2020-07-01 22:02     ` Edward Cree
@ 2020-07-01 23:11       ` Jakub Kicinski
  0 siblings, 0 replies; 5+ messages in thread
From: Jakub Kicinski @ 2020-07-01 23:11 UTC (permalink / raw)
  To: Edward Cree; +Cc: davem, netdev, mhabets, linux-net-drivers

On Wed, 1 Jul 2020 23:02:09 +0100 Edward Cree wrote:
> On 01/07/2020 19:43, Jakub Kicinski wrote:
> > There's a number of drivers which try to match the UDP ports. That
> > seems fragile to me. Is it actually required for HW to operate
> > correctly?   
> For EF10 hardware, yes, because the hardware parses the packet headers
>  to find for itself the offsets at which to make the various TSO edits;
>  thus its parser needs to know which UDP ports correspond to VXLAN or
>  GENEVE.  If a GSO skb arrives at the driver with skb->encapsulation
>  set but on a UDP port that's not known to the hardware, the driver
>  will have to reject it in ndo_features_check() or 'manually' fall back
>  to software segmentation from the transmit path.

I see. I'm asking because I'm working on a rewrite of udp tunnel-
-related callbacks. I'll keep the ef10's table checking, then.

We can drop this patch if you plan to upstream the support for TX
side offloads soon.

> EF10 also makes use of encap parsing on receive, for CHECKSUM_UNNECESSARY
>  offload (with CSUM_LEVEL) as well as RSS and filtering on inner headers
>  (although there is currently no driver support for inner-frame RXNFC, as
>  ethtool's API doesn't cover it).
> > Aren't the ports per ns in the kernel? There's no guarantee that some
> > other netns won't send a TSO skb and whatever other UDP encap.  
> That is indeed one of the flaws with port-based tunnel offloads; in
>  theory the UDP port's scope is only the 3-tuple of the socket used by
>  the tunnel device, so never mind netns, it would be logically valid to
>  use the same port for different encap protocols on different IP
>  addresses on the same network interface.
> AFAICT udp_tunnel_notify_add_rx_port() gets a netns from the sock and
>  then calls the ndo for every netdev in that ns.  So in a setup like
>  that, the ndo would get called twice for the same port (without any IP
>  address information other than sa_family being passed to the driver),
>  the driver would ignore the second one (print a netif_dbg and return
>  EEXIST, which the caller ignores), and any TSO skbs trying to use the
>  second one would be parsed by the hardware with the wrong encap type
>  and probably go out garbled on the wire.  I think at the time everyone
>  took the position that "this is a really unlikely setup and if anybody
>  really wants to do that they'll just have to turn off encap TSO".
> 
> So ndo_udp_tunnel_add is a fundamentally broken interface that people
>  shouldn't design new hardware to support but it's close enough that it
>  seems reasonable to use it to get _some_ encap TSO mileage out of the
>  port-based hardware that already exists.  Agree/disagree/other?

The port offload interface is just a hint for RX side offloads which
can't cause harm. It's the use of this hint as a hard fact for TX
offloads which is incorrect.

If NIC thinks the inner csum is invalid because the packet was in fact
not carrying encapsulated frames - it should just pass it up the stack.
We don't trust NICs to tell us checksums are wrong.

RSS is also relatively harmless if gone wrong. Most NICs actually
default to not computing RSS using inner headers, to stay on the safe
side.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-07-01 23:11 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-30 22:50 [PATCH net-next] sfc: remove udp_tnl_has_port Jakub Kicinski
2020-07-01 14:32 ` Edward Cree
2020-07-01 18:43   ` Jakub Kicinski
2020-07-01 22:02     ` Edward Cree
2020-07-01 23:11       ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).