netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v2] vxlan: calculate correct header length for GPE
@ 2023-07-20  9:05 Jiri Benc
  2023-07-21 12:46 ` Simon Horman
  2023-07-24  8:40 ` patchwork-bot+netdevbpf
  0 siblings, 2 replies; 3+ messages in thread
From: Jiri Benc @ 2023-07-20  9:05 UTC (permalink / raw)
  To: netdev
  Cc: Jesse Brandeburg, Tony Nguyen, Jakub Kicinski, Paolo Abeni, Eyal Birger

VXLAN-GPE does not add an extra inner Ethernet header. Take that into
account when calculating header length.

This causes problems in skb_tunnel_check_pmtu, where incorrect PMTU is
cached.

In the collect_md mode (which is the only mode that VXLAN-GPE
supports), there's no magic auto-setting of the tunnel interface MTU.
It can't be, since the destination and thus the underlying interface
may be different for each packet.

So, the administrator is responsible for setting the correct tunnel
interface MTU. Apparently, the administrators are capable enough to
calculate that the maximum MTU for VXLAN-GPE is (their_lower_MTU - 36).
They set the tunnel interface MTU to 1464. If you run a TCP stream over
such interface, it's then segmented according to the MTU 1464, i.e.
producing 1514 bytes frames. Which is okay, this still fits the lower
MTU.

However, skb_tunnel_check_pmtu (called from vxlan_xmit_one) uses 50 as
the header size and thus incorrectly calculates the frame size to be
1528. This leads to ICMP too big message being generated (locally),
PMTU of 1450 to be cached and the TCP stream to be resegmented.

The fix is to use the correct actual header size, especially for
skb_tunnel_check_pmtu calculation.

Fixes: e1e5314de08ba ("vxlan: implement GPE")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
v2: more verbose patch description
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |  2 +-
 drivers/net/vxlan/vxlan_core.c                | 23 ++++++++-----------
 include/net/vxlan.h                           | 13 +++++++----
 3 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 1726297f2e0d..8eb9839a3ca6 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -8479,7 +8479,7 @@ static void ixgbe_atr(struct ixgbe_ring *ring,
 		struct ixgbe_adapter *adapter = q_vector->adapter;
 
 		if (unlikely(skb_tail_pointer(skb) < hdr.network +
-			     VXLAN_HEADROOM))
+			     vxlan_headroom(0)))
 			return;
 
 		/* verify the port is recognized as VXLAN */
diff --git a/drivers/net/vxlan/vxlan_core.c b/drivers/net/vxlan/vxlan_core.c
index 78744549c1b3..42be8a26c171 100644
--- a/drivers/net/vxlan/vxlan_core.c
+++ b/drivers/net/vxlan/vxlan_core.c
@@ -2516,7 +2516,7 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		}
 
 		ndst = &rt->dst;
-		err = skb_tunnel_check_pmtu(skb, ndst, VXLAN_HEADROOM,
+		err = skb_tunnel_check_pmtu(skb, ndst, vxlan_headroom(flags & VXLAN_F_GPE),
 					    netif_is_any_bridge_port(dev));
 		if (err < 0) {
 			goto tx_error;
@@ -2577,7 +2577,8 @@ void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 				goto out_unlock;
 		}
 
-		err = skb_tunnel_check_pmtu(skb, ndst, VXLAN6_HEADROOM,
+		err = skb_tunnel_check_pmtu(skb, ndst,
+					    vxlan_headroom((flags & VXLAN_F_GPE) | VXLAN_F_IPV6),
 					    netif_is_any_bridge_port(dev));
 		if (err < 0) {
 			goto tx_error;
@@ -2989,14 +2990,12 @@ static int vxlan_change_mtu(struct net_device *dev, int new_mtu)
 	struct vxlan_rdst *dst = &vxlan->default_dst;
 	struct net_device *lowerdev = __dev_get_by_index(vxlan->net,
 							 dst->remote_ifindex);
-	bool use_ipv6 = !!(vxlan->cfg.flags & VXLAN_F_IPV6);
 
 	/* This check is different than dev->max_mtu, because it looks at
 	 * the lowerdev->mtu, rather than the static dev->max_mtu
 	 */
 	if (lowerdev) {
-		int max_mtu = lowerdev->mtu -
-			      (use_ipv6 ? VXLAN6_HEADROOM : VXLAN_HEADROOM);
+		int max_mtu = lowerdev->mtu - vxlan_headroom(vxlan->cfg.flags);
 		if (new_mtu > max_mtu)
 			return -EINVAL;
 	}
@@ -3644,11 +3643,11 @@ static void vxlan_config_apply(struct net_device *dev,
 	struct vxlan_dev *vxlan = netdev_priv(dev);
 	struct vxlan_rdst *dst = &vxlan->default_dst;
 	unsigned short needed_headroom = ETH_HLEN;
-	bool use_ipv6 = !!(conf->flags & VXLAN_F_IPV6);
 	int max_mtu = ETH_MAX_MTU;
+	u32 flags = conf->flags;
 
 	if (!changelink) {
-		if (conf->flags & VXLAN_F_GPE)
+		if (flags & VXLAN_F_GPE)
 			vxlan_raw_setup(dev);
 		else
 			vxlan_ether_setup(dev);
@@ -3673,8 +3672,7 @@ static void vxlan_config_apply(struct net_device *dev,
 
 		dev->needed_tailroom = lowerdev->needed_tailroom;
 
-		max_mtu = lowerdev->mtu - (use_ipv6 ? VXLAN6_HEADROOM :
-					   VXLAN_HEADROOM);
+		max_mtu = lowerdev->mtu - vxlan_headroom(flags);
 		if (max_mtu < ETH_MIN_MTU)
 			max_mtu = ETH_MIN_MTU;
 
@@ -3685,10 +3683,9 @@ static void vxlan_config_apply(struct net_device *dev,
 	if (dev->mtu > max_mtu)
 		dev->mtu = max_mtu;
 
-	if (use_ipv6 || conf->flags & VXLAN_F_COLLECT_METADATA)
-		needed_headroom += VXLAN6_HEADROOM;
-	else
-		needed_headroom += VXLAN_HEADROOM;
+	if (flags & VXLAN_F_COLLECT_METADATA)
+		flags |= VXLAN_F_IPV6;
+	needed_headroom += vxlan_headroom(flags);
 	dev->needed_headroom = needed_headroom;
 
 	memcpy(&vxlan->cfg, conf, sizeof(*conf));
diff --git a/include/net/vxlan.h b/include/net/vxlan.h
index 0be91ca78d3a..1648240c9668 100644
--- a/include/net/vxlan.h
+++ b/include/net/vxlan.h
@@ -386,10 +386,15 @@ static inline netdev_features_t vxlan_features_check(struct sk_buff *skb,
 	return features;
 }
 
-/* IP header + UDP + VXLAN + Ethernet header */
-#define VXLAN_HEADROOM (20 + 8 + 8 + 14)
-/* IPv6 header + UDP + VXLAN + Ethernet header */
-#define VXLAN6_HEADROOM (40 + 8 + 8 + 14)
+static inline int vxlan_headroom(u32 flags)
+{
+	/* VXLAN:     IP4/6 header + UDP + VXLAN + Ethernet header */
+	/* VXLAN-GPE: IP4/6 header + UDP + VXLAN */
+	return (flags & VXLAN_F_IPV6 ? sizeof(struct ipv6hdr) :
+				       sizeof(struct iphdr)) +
+	       sizeof(struct udphdr) + sizeof(struct vxlanhdr) +
+	       (flags & VXLAN_F_GPE ? 0 : ETH_HLEN);
+}
 
 static inline struct vxlanhdr *vxlan_hdr(struct sk_buff *skb)
 {
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH net v2] vxlan: calculate correct header length for GPE
  2023-07-20  9:05 [PATCH net v2] vxlan: calculate correct header length for GPE Jiri Benc
@ 2023-07-21 12:46 ` Simon Horman
  2023-07-24  8:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: Simon Horman @ 2023-07-21 12:46 UTC (permalink / raw)
  To: Jiri Benc
  Cc: netdev, Jesse Brandeburg, Tony Nguyen, Jakub Kicinski,
	Paolo Abeni, Eyal Birger

On Thu, Jul 20, 2023 at 11:05:56AM +0200, Jiri Benc wrote:
> VXLAN-GPE does not add an extra inner Ethernet header. Take that into
> account when calculating header length.
> 
> This causes problems in skb_tunnel_check_pmtu, where incorrect PMTU is
> cached.
> 
> In the collect_md mode (which is the only mode that VXLAN-GPE
> supports), there's no magic auto-setting of the tunnel interface MTU.
> It can't be, since the destination and thus the underlying interface
> may be different for each packet.
> 
> So, the administrator is responsible for setting the correct tunnel
> interface MTU. Apparently, the administrators are capable enough to
> calculate that the maximum MTU for VXLAN-GPE is (their_lower_MTU - 36).
> They set the tunnel interface MTU to 1464. If you run a TCP stream over
> such interface, it's then segmented according to the MTU 1464, i.e.
> producing 1514 bytes frames. Which is okay, this still fits the lower
> MTU.
> 
> However, skb_tunnel_check_pmtu (called from vxlan_xmit_one) uses 50 as
> the header size and thus incorrectly calculates the frame size to be
> 1528. This leads to ICMP too big message being generated (locally),
> PMTU of 1450 to be cached and the TCP stream to be resegmented.
> 
> The fix is to use the correct actual header size, especially for
> skb_tunnel_check_pmtu calculation.
> 
> Fixes: e1e5314de08ba ("vxlan: implement GPE")
> Signed-off-by: Jiri Benc <jbenc@redhat.com>
> ---
> v2: more verbose patch description

Reviewed-by: Simon Horman <simon.horman@corigine.com>


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH net v2] vxlan: calculate correct header length for GPE
  2023-07-20  9:05 [PATCH net v2] vxlan: calculate correct header length for GPE Jiri Benc
  2023-07-21 12:46 ` Simon Horman
@ 2023-07-24  8:40 ` patchwork-bot+netdevbpf
  1 sibling, 0 replies; 3+ messages in thread
From: patchwork-bot+netdevbpf @ 2023-07-24  8:40 UTC (permalink / raw)
  To: Jiri Benc
  Cc: netdev, jesse.brandeburg, anthony.l.nguyen, kuba, pabeni, eyal.birger

Hello:

This patch was applied to netdev/net.git (main)
by David S. Miller <davem@davemloft.net>:

On Thu, 20 Jul 2023 11:05:56 +0200 you wrote:
> VXLAN-GPE does not add an extra inner Ethernet header. Take that into
> account when calculating header length.
> 
> This causes problems in skb_tunnel_check_pmtu, where incorrect PMTU is
> cached.
> 
> In the collect_md mode (which is the only mode that VXLAN-GPE
> supports), there's no magic auto-setting of the tunnel interface MTU.
> It can't be, since the destination and thus the underlying interface
> may be different for each packet.
> 
> [...]

Here is the summary with links:
  - [net,v2] vxlan: calculate correct header length for GPE
    https://git.kernel.org/netdev/net/c/94d166c5318c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-07-24  8:40 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-20  9:05 [PATCH net v2] vxlan: calculate correct header length for GPE Jiri Benc
2023-07-21 12:46 ` Simon Horman
2023-07-24  8:40 ` patchwork-bot+netdevbpf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).